The present invention relates to video coding, and more particularly to an improved system for splitting and combining multiple description video streams.
With the advent of digital networks such as the Internet, there has been a demand for the ability to provide multimedia communication in real time over such networks. However, such multimedia communications, compared to analog communication systems, have been hampered by the limited bandwidth provided by the digital networks. To adapt multimedia communications to such hardware environments, much effort has been made to develop video compression techniques that improve multimedia throughput under limited bandwidth conditions using predictive coded video streams. These efforts have led to the emergence of several international standards, such as the MPEG-2 and MPEG-4 standards issued by the Moving Picture Experts Group (MPEG) of the ISO and the H.26L and H.263 standards issued by the Video Coding Experts Group (VCEG) of the ITU. These standards achieve a high compression ratio by exploiting temporal and spatial correlations in real image sequences, using motion-compensated prediction and transform coding.
More recently, diversity techniques using Multiple Description Coding (MDC) have been employed to increase the robustness of communication systems and storage devices. Examples of systems enhanced by diversity techniques include packet networks, wireless systems using multi-path and Doppler diversity, and Redundant Arrays of Inexpensive Disks (RAIDs).
Present diversity techniques using MDC have worked best in systems where the diversity issues are known at the source of the communication. In such instances, MDC is used to break the data to be communicated into separate pathways, each being separately coded by the source. One such form of MDC is based on splitting (
With changes in the way information is delivered between wireless platforms and high-speed digital connections, the need to implement diversity techniques at intermediate points in communication pathways is increasing. As the ways in which hardware pathways are configured multiply, a need has arisen for greater management of large multimedia data during communication. Presently, gateways that connect high bandwidth channels to a plurality of low bandwidth stations apply diversity techniques using MDC by transcoding all of the data. However, such solutions increase the overhead experienced at the gateway and may increase the transmission time, both of which are undesirable. Thus, a need exists for a way to obtain the advantages of diversity techniques during transmission while minimizing the overhead imposed upon communication hardware.
The present invention utilizes a data relationship between B-frame motion vectors and P-frame motion vectors to simplify merging and dividing of multiple descriptions at gateways by avoiding the need to decompress and re-compress at least one of the multiple descriptions.
One aspect of the invention includes a data stream in which motion vectors of succeeding frames correspond to motion vectors of neighboring frames.
In one embodiment a gateway intermediate in the transmission of a data stream utilizes a method of managing multiple descriptions using the motion vector relationships to generate or merge multiple descriptions.
Other objects and advantages of the invention will become apparent from the following detailed description taken in connection with the accompanying drawings, in which
With reference to the figures for purposes of illustration, the present invention relates to a system for implementing multi-channel transmission in a communications pathway of predictive scalable coding schemes. The present invention is presently described in connection with a communication system (
The invention is implemented upon the realization that a stream of multimedia data compressed using predictive coding may be split into multiple descriptions for multiple transmission pathways without the need to decompress and re-compress the data for multiple pathways. Predictive coding techniques of the type suitable for this purpose include MPEG standards MPEG-1, MPEG-2 and MPEG-4 as well as ITU standards H.261, H.262, H.263 and H.26L. With reference to the MPEG standard description for purposes of illustration, a movie or video data stream is made up of a sequence of frames that, when displayed in sequential order, produce the visual effect of animation. Predictive coding reduces the amount of data to be transmitted by transmitting only information that relates to differences between sequential frames. Under the MPEG standard, predictive coding of frames is based on an I-frame (Intra-coded frame) that contains all the information needed to ‘re-build’ a frame of video. It should be noted that I-frame-only encoded video does not utilize predictive coding techniques, as every frame of the file is independent and requires no other frame information. Predictive coding permits greater compression factors by removing the redundancy from one frame to the next, in other words by sending a set of instructions to create the next frame from the current one. Such frames are called P-frames (Predicted frames). However, a drawback of using only I- and P-frame predictive encoding is that data can only be taken from the previous picture. Moving objects can reveal a background that is unknown in previous pictures but visible in later pictures. B-frames (Bi-directional frames) can be created from preceding and/or later I- or P-frames. An I-frame together with the series of successive B- and P-frames up to the next I-frame is called a GOP (Group of Pictures). An example of a GOP for broadcasting has the structure IBBPBBPBBPBB and is referred to as IPB-GOP.
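The frame dependencies described above can be illustrated with a minimal Python sketch. It assumes frames are listed in display order, that each P-frame is predicted from the nearest preceding I- or P-frame, and that each B-frame may use the nearest preceding and following I- or P-frames. The function name `gop_references` is hypothetical and is used only for illustration.

```python
from typing import Dict, Optional, Tuple


def gop_references(pattern: str) -> Dict[int, Tuple[Optional[int], ...]]:
    """Map each frame index in a GOP pattern to the indices of its reference frames.

    Assumption: frames are in display order; I- and P-frames act as anchors.
    A reference of None means the anchor lies outside this GOP
    (for example, the I-frame of the next GOP).
    """
    anchors = [i for i, t in enumerate(pattern) if t in "IP"]
    refs: Dict[int, Tuple[Optional[int], ...]] = {}
    for i, frame_type in enumerate(pattern):
        prev_anchor = max((a for a in anchors if a < i), default=None)
        next_anchor = min((a for a in anchors if a > i), default=None)
        if frame_type == "I":
            refs[i] = ()                          # intra-coded: no references
        elif frame_type == "P":
            refs[i] = (prev_anchor,)              # predicted from previous anchor
        elif frame_type == "B":
            refs[i] = (prev_anchor, next_anchor)  # bi-directional prediction
        else:
            raise ValueError(f"unknown frame type: {frame_type!r}")
    return refs


refs = gop_references("IBBPBBPBBPBB")
for index, frame_refs in refs.items():
    print(index, "IBBPBBPBBPBB"[index], frame_refs)
```

In this sketch the two trailing B-frames have no following anchor inside the GOP, which reflects that they would depend on the I-frame of the next GOP.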
One method of sending multimedia data through two or more pathways uses Multiple Description Coding (MDC). MDC has been shown to be an effective technique for robust communication over wireless systems using multi-path and Doppler diversity and Redundant Arrays of Inexpensive Disks (RAIDs), and also over the Internet. Currently, if an MPEG, H.26L, or other predictive coded video stream is transmitted through the Internet and then needs to be split at the gateway into two multiple description video streams that better fit the channel characteristics of the down-link (e.g. wireless systems using multi-path) while preserving the same coding format as before, the video data is fully decoded and re-encoded. The present invention, however, covers a system that allows the gateway to easily split a data stream into multiple descriptions without expensive full transcoding while still allowing for more resilient transmission. As will be described below, these savings in time and format are accomplished by coding the hierarchy of motion vectors in a particular format. The particular coding format is based on the observation that the motion vectors (MVs) for the B-frames are not very different from part of the MVs used for P-frames.
Normally, independent MVs are computed for B-frames. However (
1. Splitting a Data Stream into Two Pathways
With reference to
$$\hat{k}(P) = k_f(B) - k_b(B); \qquad d(P) = k(P) - \hat{k}(P)$$
assuming that in this example there was only 1 B-frame in the initial bitstream between two consecutive P-frames. Note also that this is just an example and analogous equations can be derived if a different number of B-frames are present between 2 consecutive P-frames. In an alternate embodiment, the refinements “d” can be computed at the server and sent in a separate stream through the Internet.
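The relationship in the equation above can be sketched in Python for a single motion vector. This is a minimal illustration, not the patented implementation: it assumes one B-frame between two consecutive P-frames, treats kf(B) and kb(B) as the forward and backward motion vectors of that B-frame, and uses integer displacements. The names `MotionVector`, `predict_p_mv`, `refinement`, and `reconstruct_p_mv` are hypothetical.

```python
from typing import NamedTuple


class MotionVector(NamedTuple):
    """A 2-D displacement (hypothetical type used only for this sketch)."""
    dx: int
    dy: int


def predict_p_mv(kf_b: MotionVector, kb_b: MotionVector) -> MotionVector:
    """Estimate of the P-frame MV: k_hat(P) = kf(B) - kb(B)."""
    return MotionVector(kf_b.dx - kb_b.dx, kf_b.dy - kb_b.dy)


def refinement(k_p: MotionVector, kf_b: MotionVector, kb_b: MotionVector) -> MotionVector:
    """Refinement d(P) = k(P) - k_hat(P)."""
    k_hat = predict_p_mv(kf_b, kb_b)
    return MotionVector(k_p.dx - k_hat.dx, k_p.dy - k_hat.dy)


def reconstruct_p_mv(kf_b: MotionVector, kb_b: MotionVector, d_p: MotionVector) -> MotionVector:
    """Recover k(P) from the B-frame MVs and the refinement: k(P) = k_hat(P) + d(P)."""
    k_hat = predict_p_mv(kf_b, kb_b)
    return MotionVector(k_hat.dx + d_p.dx, k_hat.dy + d_p.dy)


# Illustrative values only; they are not taken from any real bitstream.
kf_b = MotionVector(3, 1)    # forward MV of the B-frame
kb_b = MotionVector(-2, -1)  # backward MV of the B-frame
k_p = MotionVector(6, 2)     # actual MV of the P-frame

d_p = refinement(k_p, kf_b, kb_b)
print(predict_p_mv(kf_b, kb_b), d_p, reconstruct_p_mv(kf_b, kb_b, d_p))
```

Because the refinement is defined as the difference between the actual and the estimated vector, adding it back to the estimate recovers the P-frame vector exactly, so only the typically small refinement needs to accompany the B-frame vectors.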
2. Merging a Data Stream from Two Pathways
With reference to
where M is the number of newly created B-frames between two consecutive available P-frames. Note also that this is just an example and analogous equations can be derived if a different number of B-frames are created between 2 consecutive P-frames. In an alternate embodiment, the refinements “d” can be computed at the server and sent in a separate stream through the Internet together with the second MD.
It will be appreciated by those skilled in the art that the proposed method can be employed for any predictive coding scheme using motion estimation, such as MPEG-1, MPEG-2, MPEG-4, H.263 and H.26L.
It will further be appreciated by those skilled in the art that another advantage of this method is that error recovery and concealment can be performed more easily. This is because the redundant description of the MVs can be used to determine the MVs for the lost frame.
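One way this concealment could look is sketched below in Python. It assumes a single B-frame between two consecutive P-frames, motion vector fields stored as aligned per-block lists, and lost P-frame vectors marked as `None`; a lost vector is then estimated as kf(B) − kb(B) with no refinement applied. The function name `conceal_p_field` and the data layout are hypothetical.

```python
from typing import List, Optional, Sequence, Tuple

MV = Tuple[int, int]  # (dx, dy) displacement; hypothetical layout for this sketch


def conceal_p_field(
    p_field: Sequence[Optional[MV]],
    b_forward: Sequence[MV],
    b_backward: Sequence[MV],
) -> List[MV]:
    """Fill in lost P-frame MVs (None entries) using the B-frame MVs.

    Assumption: a lost vector is approximated by kf(B) - kb(B),
    i.e. the estimate is used without any refinement term.
    """
    if not (len(p_field) == len(b_forward) == len(b_backward)):
        raise ValueError("motion vector fields must have the same length")
    concealed: List[MV] = []
    for k_p, k_f, k_b in zip(p_field, b_forward, b_backward):
        if k_p is None:
            # Vector was lost: estimate it from the redundant description.
            k_p = (k_f[0] - k_b[0], k_f[1] - k_b[1])
        concealed.append(k_p)
    return concealed


# Illustrative values only: the second block's P-frame MV was lost.
p_field = [(4, 0), None, (2, -2)]
b_forward = [(2, 0), (1, 1), (1, -1)]
b_backward = [(-2, 0), (-1, -2), (-1, 1)]
print(conceal_p_field(p_field, b_forward, b_backward))
```

Vectors that arrived intact are passed through unchanged; only the missing entries are replaced by the estimate.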
Finally, those skilled in the art will appreciate that this method can be employed for robust, multi-channel transmission of “predictive” scalable coding schemes, such as Fine Granularity Scalability (FGS). This method can be used without modifications to the MPEG-4 standard and thus can be easily employed.
Uses in Gateway Processing:
With reference to
Currently, if an MPEG, H.26L, or other predictive coded video stream is transmitted through the Internet and then needs to be split at the gateway into two multiple description video streams that better fit the channel characteristics of the down-link (e.g. wireless systems using multi-path) while preserving the same coding format as before, the video data is fully decoded and re-encoded.
By establishing a relationship between the B-frames' MVs and the P-frames' MVs as described above, the present process allows the gateway to easily split an MPEG, H.26L, or other predictive coded video stream into two multiple description video streams, or to merge two such multiple description video streams into a single stream, in each case preserving the original coding format and without fully decoding and re-encoding the stream. It will be appreciated that the proposed mechanism can considerably reduce the computational complexity at the gateway.
While the present invention has been described in connection with what are presently considered to be the most practical and preferred embodiments, it is to be understood that the invention is not to be limited to the disclosed embodiments, but to the contrary, is intended to cover various modifications and equivalent arrangements included within the spirit of the invention, which are set forth in the appended claims, and which scope is to be accorded the broadest interpretation so as to encompass all such modifications and equivalent structures.
| Filing Document | Filing Date | Country | Kind | 371c Date |
|---|---|---|---|---|
| PCT/IB03/05949 | 12/11/2003 | WO | | 6/15/2005 |

| Number | Date | Country |
|---|---|---|
| 60434056 | Dec 2002 | US |