The present invention relates to the field of video encoding and decoding, and in particular to a rotation-based multiple description video encoding and decoding method, apparatus and system.
In recent years, with the development of the Internet and the spread of all kinds of wireless terminals, multimedia transmission over error-prone networks has attracted more and more attention. The current network is a so-called “best effort” network, in which channel disturbance, network congestion, routing delay and similar problems exist. These problems result in data errors and packet loss. In addition, wireless channels, with their random bit errors and consecutive burst errors, further degrade the transmission environment. All these problems can cause decoding of the whole bit stream, or at least part of it, to fail. Video coding generally employs the H.264/AVC standard or an MPEG standard, in which the loss of one packet affects the following packets because of motion estimation and compensation. Hence, these problems have become the bottleneck of multimedia transmission.
Multiple description coding (MDC) is an effective scheme for solving the above problems. MDC assumes that there is more than one independent channel between the source and the receiver. If the probability that one channel fails is p, then the probability that all n channels fail is p^n. By generating, for the same source signal, n equally important descriptions that can be decoded independently, MDC can reconstruct the signal with acceptable quality when some descriptions are lost. Meanwhile, the more descriptions are received, the better the quality of the reconstructed signal. For convenience, decoding from a single description is called side decoding, while decoding when all the descriptions are received is called central decoding. Because decoding can be completed with only part of the information, MDC is widely used in audio coding, image coding, video coding, distributed storage systems and other low-delay coding systems. Unlike layered coding, with its base layer and enhancement layers, MDC makes no distinction between descriptions: all of them are equally important. Hence, it is well suited to the current Internet, which offers no priority protection. Compared with forward error correction (FEC) and automatic repeat request (ARQ), MDC can meet real-time requirements.
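The channel-failure argument above can be illustrated with a minimal sketch. It assumes that the n channels fail independently with the same probability p; the function names are hypothetical and are introduced only for this illustration.

```python
def all_channels_fail_probability(p: float, n: int) -> float:
    """Probability that all n independent channels fail, each with probability p."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    if n < 1:
        raise ValueError("n must be at least 1")
    return p ** n


def at_least_one_description_received(p: float, n: int) -> float:
    """Probability that at least one of the n descriptions arrives."""
    return 1.0 - all_channels_fail_probability(p, n)


if __name__ == "__main__":
    # Compare one channel with two independent channels at the same loss rate.
    for n in (1, 2):
        print(n, all_channels_fail_probability(0.1, n))
```

Under these assumptions, adding a second channel with a 10% failure rate lowers the probability of losing everything from 0.1 to about 0.01.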
As is known, conventional video coding technology uses the temporal correlation between neighboring frames to improve the compression of the video data. Hence, almost all video encoding systems employ motion estimation and motion compensation. However, this results in mismatch in MDC. Mismatch means that the reference frames (blocks or pixels) used at the decoding side and at the encoding side differ because of packet loss. The simplest way to control the mismatch is to run an independent prediction loop for each description. The usual approach is first to subsample the video sequence into even and odd frames. The subsampled sequences are then encoded and decoded independently, each with its own prediction loop. When only one description is received, the received subsampled frames are interpolated to generate the lost description. When both descriptions are received, each description is decoded and the two are combined to reconstruct the final signal. Spatial subsampling can be used in a similar way. This kind of MDC scheme is easy to implement and can be applied in any standard video coding system. However, its redundancy adjustment is not flexible. In addition, because of the subsampling, the correlation between pixels or between frames is lower than before, so the compression of the residual signal is less efficient. Vaishampayan uses joint quantization in the two prediction loops to avoid the mismatch; however, the compression efficiency is lower because of the coarse quantization. Another kind of mismatch control method encodes the mismatch signal itself, that is, it encodes the mismatch between the central description and the side description and distributes this information to the two descriptions. The mismatch can be controlled in this way, but the structure is too complex and the redundancy is too high compared with other methods. Moreover, when both descriptions are received, the mismatch information is useless.
Another kind of scheme uses the redundant slices of H.264. By optimizing the quantization steps of the original slices and the redundant slices, this kind of scheme remains compatible with the H.264 standard, and its performance is also very good.
All of the above schemes either try to exploit the redundancy existing in the video, as in spatial subsampling, or try to insert redundancy, as in the scheme based on redundant slices. When only one description is received, the reconstructed quality is worse than that of the corresponding single description coding. In addition, the standard encoder must be modified most of the time to support MDC. Hence, the complexity of the MDC scheme increases and the scheme is not standard compatible.
The object of the present invention is to provide a rotation-based multiple description video coding and decoding method, apparatus and system that reduce the complexity of MDC and improve its performance.
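The even/odd temporal subsampling scheme described in this background discussion can be sketched as follows. This is only an illustration of the prior-art idea: frames are modeled as flat lists of pixel values, the function names are hypothetical, and the use of plain averaging of temporal neighbours as the interpolation is an assumption, since the text only says that the frames are "interpolated".

```python
from typing import List, Tuple

Frame = List[float]  # one frame as a flat list of pixel values (illustrative model)


def split_even_odd_frames(frames: List[Frame]) -> Tuple[List[Frame], List[Frame]]:
    """Form two descriptions: even-indexed frames and odd-indexed frames."""
    return frames[0::2], frames[1::2]


def interpolate_odd_from_even(even_frames: List[Frame]) -> List[Frame]:
    """Estimate the lost odd frames from the received even frames.

    Each estimate is the average of an even frame and its successor; the
    last even frame is simply repeated when it has no successor.
    """
    estimates = []
    for i, frame in enumerate(even_frames):
        nxt = even_frames[i + 1] if i + 1 < len(even_frames) else frame
        estimates.append([(a + b) / 2 for a, b in zip(frame, nxt)])
    return estimates


if __name__ == "__main__":
    sequence = [[0.0], [1.0], [2.0], [3.0]]
    even, odd = split_even_odd_frames(sequence)
    print(even, odd, interpolate_odd_from_even(even))
```

For a sequence with an odd number of frames the function returns one estimate more than there are odd frames; a caller would drop the last estimate in that case.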
Thus, according to a first aspect of the present invention, there is provided a multiple description video coding method based on rotation, characterized in that said method includes the following steps:
Preferably, said method further includes the following sub-steps:
According to a second aspect of the present invention, there is provided a multiple description video decoding method based on rotation, characterized in that said method includes:
if the two packets carrying the same image content in the two descriptions are not both lost, using the reconstructed pixels of the received packet to replace those of the lost one, in which the pixels displayed at the decoding side are the average values of the two descriptions, and both descriptions use their original corresponding reference images for decoding; or
if both packets carrying the same image content in the two descriptions are lost, applying the default concealment technique of H.264 to reconstruct the pixels at the decoding side, in which the displayed pixels are still the averages of the two descriptions, and the reference image is changed to the average reconstructed image for both descriptions.
Preferably, at the decoding side, each of descriptions 1 and 2 is still decoded as in claim 4; for descriptions 3 and 4, the decoded residual is added to descriptions 1 and 2 respectively to complete the reconstruction.
According to a third aspect of the present invention, there is provided a multiple description video coding apparatus based on rotation, characterized in that the apparatus includes the following modules:
Preferably, the apparatus includes the following modules:
According to a fourth aspect of the present invention, there is provided a multiple description video decoding apparatus based on rotation, characterized in that the apparatus includes the following modules:
Preferably, the apparatus further includes the following modules:
According to a fifth aspect of the present invention, there is provided a multiple description video decoding system based on rotation, characterized in that the system includes the coding apparatus and the decoding apparatus.
Because the present invention uses an MDC algorithm based on a symmetric transform, each macroblock of each frame in each of the two descriptions uses its own, different reference for prediction, so the generated residuals differ. Hence, the system according to the present invention is simple and efficient.
The present invention and its advantages can be easily understood from the following description together with the appended figures. The figures are included to provide further understanding of the present invention and form part of this description. They serve to illustrate the invention, and the present invention is not limited to these figures alone.
In the following, embodiments of the present invention will be described with
To make the above objects, features and advantages more apparent and easier to understand, the present invention is further explained below with reference to the figures.
This rotation-based multiple description encoding system tries to make each macroblock in each description use a different reference macroblock, so that the residual generated for each description is different. A rotation by 180 degrees is just one example; other transforms such as flipping and mirroring can also be employed. When both descriptions are received, the average of their values in the pixel domain is used as the central reconstruction. If the residuals of the two descriptions are close to uncorrelated or negatively correlated, the central reconstruction achieves a higher gain over the side reconstructions. The theoretical model is as follows: let f be the original frame, and let $\hat{f}_1$ and $\hat{f}_2$ be the reconstructed frames of the two descriptions respectively. Let P denote the prediction part, Q the quantization part, and e the quantization error. Then the two multiple descriptions for the invention are as follows. That is, for each macroblock of the current frame f, the two descriptions use different reference frames or reference macroblocks, which results in different residuals and finally generates different quantization errors $e_1(n)$ and $e_2(n)$.
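The transforms named above can be sketched on a frame stored as a list of pixel rows. This is a minimal illustration, not the encoder itself: the function names are hypothetical, and reading "flip" as a vertical flip and "mirror" as a horizontal mirror is an assumption.

```python
from typing import List, Tuple

Frame = List[List[int]]  # rows of pixel values (illustrative model)


def rotate_180(frame: Frame) -> Frame:
    """Rotate by 180 degrees: reverse the row order, then reverse each row."""
    return [row[::-1] for row in frame[::-1]]


def flip_vertical(frame: Frame) -> Frame:
    """Assumed meaning of 'flip': reverse the order of the rows."""
    return [row[:] for row in frame[::-1]]


def mirror_horizontal(frame: Frame) -> Frame:
    """Assumed meaning of 'mirror': reverse the pixels within each row."""
    return [row[::-1] for row in frame]


def make_description_inputs(frame: Frame) -> Tuple[Frame, Frame]:
    """Return the encoder inputs: the original frame and its rotated copy."""
    return frame, rotate_180(frame)


if __name__ == "__main__":
    original = [[1, 2], [3, 4]]
    print(make_description_inputs(original))
```

Applying `rotate_180` twice returns the original frame, so the decoder can use the same function to bring the second description back to the original orientation before averaging.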
The corresponding central reconstruction is:
$$\hat{f}(n) = 0.5\left(\hat{f}_1(n) + \hat{f}_2(n)\right) = f(n) - 0.5\left(e_1(n) + e_2(n)\right)$$
The closer $e_1(n)$ and $e_2(n)$ are to being uncorrelated or negatively correlated, the smaller the error of $\hat{f}(n)$.
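The effect of the correlation can be made explicit with a short derivation. It relies on assumptions that are not stated in the text above: the two quantization errors have zero mean, the same variance $\sigma^2$, and correlation coefficient $\rho$ (these symbols are introduced only for this illustration).

```latex
\begin{aligned}
\hat{f}(n) - f(n) &= -\tfrac{1}{2}\bigl(e_1(n) + e_2(n)\bigr), \\
\mathrm{E}\Bigl[\bigl(\hat{f}(n) - f(n)\bigr)^2\Bigr]
  &= \tfrac{1}{4}\bigl(\sigma^2 + \sigma^2 + 2\rho\,\sigma^2\bigr)
   = \tfrac{1}{2}\,\sigma^2\,(1 + \rho).
\end{aligned}
```

Under these assumptions the side distortion is $\sigma^2$, so the central distortion equals the side distortion for $\rho = 1$, is half of it for $\rho = 0$, and approaches zero as $\rho$ approaches $-1$.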
When some of the data packets of one description are lost, the system uses the corresponding packets of the other description to replace them. Since the two descriptions use the same bit rate and the same encoding method, their distortions have the same mean and variance. Hence, the mismatch error caused by the replacement is reduced significantly.
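The replacement rule, together with the two loss cases given for the decoding method earlier in this description, can be sketched per packet as follows. This is a sketch under stated assumptions: packets are modeled as flat lists of reconstructed pixels, the pixels of the rotated description are assumed to have been rotated back already, and copying the previously displayed pixels is only a hypothetical stand-in for the default H.264 concealment. All names are illustrative.

```python
from typing import List, Optional, Tuple

Pixels = List[float]  # reconstructed pixels of one packet (illustrative model)


def conceal_and_display(
    rec1: Optional[Pixels],
    rec2: Optional[Pixels],
    previous_display: Pixels,
) -> Tuple[Pixels, Pixels, Pixels, bool]:
    """Return (rec1, rec2, displayed pixels, both_lost) for one packet position.

    rec1 / rec2 are None when the packet of that description was lost.
    """
    both_lost = rec1 is None and rec2 is None
    if both_lost:
        # Stand-in for the decoder's default concealment (assumption):
        # reuse the co-located pixels of the previously displayed frame.
        rec1 = list(previous_display)
        rec2 = list(previous_display)
    elif rec1 is None:
        rec1 = list(rec2)  # replace the lost packet with the received one
    elif rec2 is None:
        rec2 = list(rec1)
    # The displayed pixels are always the average of the two descriptions.
    display = [(a + b) / 2 for a, b in zip(rec1, rec2)]
    return rec1, rec2, display, both_lost


if __name__ == "__main__":
    print(conceal_and_display([10.0, 20.0], None, [0.0, 0.0]))
```

The returned `both_lost` flag marks the case in which, according to the description, both decoders switch their reference image to the averaged reconstruction; in the other case each description keeps its own reference.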
There are two situations for the decoding end. As shown in
A multiple description video encoding method that includes the following step:
The invention also provides a multiple description video decoding method that includes the following step:
The present invention also provides a multiple description encoding apparatus based on rotation, which includes the following modules:
The present invention also provides a multiple description decoding apparatus based on rotation, which includes the following modules:
To further tune the redundancy, the residual between the original frame and the average value is calculated and encoded with H.264. The resulting packets are then subsampled in an even/odd manner to form the second part of each description. The new encoding system is shown in
With a fixed total bit rate for each of the two descriptions, when the channel condition is good, the probability that both descriptions are received is higher, so more bits should be allocated to the residual signal, and the whole system tends to provide good central performance. When the channel condition is unstable, only one description is received most of the time, so fewer bits should be assigned to the residual. In the extreme case where both channels are reliable, the whole system reduces to encoding the sequence with H.264 and sending the packets alternately over the two channels; that is, only description 3 and description 4 are kept. Conversely, when the channel conditions are poor, the whole system contains only description 1 and description 2, with more redundancy to protect the data.
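The residual formation and the even/odd packet split described above can be sketched as follows. The sketch omits the H.264 encoding of the residual, models frames as flat pixel lists, and assumes that even-indexed packets go to description 3 and odd-indexed packets to description 4; the text does not fix this assignment, and the function names are hypothetical.

```python
from typing import List, Tuple, TypeVar

T = TypeVar("T")


def central_residual(
    original: List[float], rec1: List[float], rec2: List[float]
) -> List[float]:
    """Residual between the original frame and the average of both descriptions."""
    return [o - (a + b) / 2 for o, a, b in zip(original, rec1, rec2)]


def split_even_odd(packets: List[T]) -> Tuple[List[T], List[T]]:
    """Distribute the residual packets: even indices to description 3, odd to 4."""
    return packets[0::2], packets[1::2]


if __name__ == "__main__":
    residual = central_residual([100.0, 50.0], [98.0, 52.0], [101.0, 49.0])
    print(residual)
    print(split_even_odd(["pkt0", "pkt1", "pkt2", "pkt3", "pkt4"]))
```

When both residual streams arrive, the decoder can add the decoded residual back to the averaged reconstruction; when only one arrives, every second packet position is refined.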
Preferably, a multiple description video encoding method includes the following steps:
Preferably, a multiple description video encoding apparatus includes:
Preferably, in a multiple description video decoding apparatus, each of description 1 and description 2 is still decoded in the original way; for description 3 and description 4, the decoded residual is added to the decoded descriptions 1 and 2 respectively to complete the reconstruction.
The present invention also provides a rotation-based multiple description video decoding system, which includes the above encoding apparatus and decoding apparatus.
This example adopts the H.264 JM software to generate the encoded bit stream. Packets are organized with a fixed number of macroblocks. For simplicity, one row of macroblocks of the frame can be taken as one packet. In this way, the packets containing the same video content in the normally encoded video packets and in the inversely encoded rotated video packets can be found easily. The GOP structure of H.264 is IPPP, that is, only the first frame is an I frame and all the others are P frames. The test video sequence is the Foreman sequence in CIF format.
As mentioned above, the present invention has been described in detail with examples. However, other embodiments that do not depart from the spirit and essence of the present invention may be obvious to those skilled in the art. Hence, any such modified embodiments should also fall within the protection of the present invention.
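The packet matching mentioned in this example can be sketched as an index mapping. The sketch assumes 16x16 macroblocks, one macroblock row per packet, packets numbered from top to bottom in each stream, and a 180-degree rotation; the function names are hypothetical. A CIF frame is 352x288 pixels, which gives 18 macroblock rows.

```python
MB_SIZE = 16  # macroblock height in pixels for H.264


def macroblock_rows(frame_height: int) -> int:
    """Number of macroblock rows, i.e. packets per frame in this example."""
    return frame_height // MB_SIZE


def matching_packet_in_rotated_stream(row_index: int, frame_height: int) -> int:
    """Index of the packet in the rotated stream that covers the same content.

    After a 180-degree rotation the top row of the original frame becomes
    the bottom row of the rotated frame, so the row order is reversed.
    """
    rows = macroblock_rows(frame_height)
    if not 0 <= row_index < rows:
        raise IndexError("row_index outside the frame")
    return rows - 1 - row_index


if __name__ == "__main__":
    cif_height = 288
    print(macroblock_rows(cif_height))
    print(matching_packet_in_rotated_stream(0, cif_height))
```

Within a matched packet the macroblocks also appear in reverse order, which the decoder undoes when it rotates the second description back.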
Number | Date | Country | Kind |
---|---|---|---|
201110444699.7 | Dec 2011 | CN | national |
Filing Document | Filing Date | Country | Kind | 371c Date |
---|---|---|---|---|
PCT/CN2012/084216 | 11/7/2012 | WO | 00 | 6/27/2014 |