Described below are a method for video-coding a series of digitized pictures, a method for transmitting the pictures, and a method for decoding the coded pictures. Also described below are a corresponding transmitter for transmitting the coded pictures and a corresponding receiver for receiving and decoding the transmitted coded pictures.
A multiplicity of methods exist for the video coding of digitized pictures. Some of these methods are defined in corresponding standards, e.g. the standard H.264/MPEG-4 AVC. In known video-coding methods, the digitized pictures are arranged into groups of pictures (GOP=group of pictures), within which the individual pictures are coded. In order to ensure efficient coding, only a selection of pictures, the so-called intrapictures, is completely intracoded, i.e. coded independently of the other pictures of the series. The remaining pictures are the subject of a prediction, in which movement vectors are specified for a relevant picture, the movement vectors describing the displacement of picture blocks relative to a reference picture. In this way, a predicted picture is determined, and the prediction error between the original picture and the predicted picture is coded and transferred with the movement vectors. In a group of pictures, the pictures that have been coded using a prediction are called interpictures, because they are coded relative to one or more reference pictures.
Coded video contents can be transferred using broadcast channels, for example, as a result of which any users can receive the corresponding coded contents. In this context, the related art discloses the Multimedia Broadcast Multicast Service (MBMS), which will be used in the future to transfer coded video contents via mobile radio networks. When transferring via broadcast channels, the problem arises that a systematic delay occurs when a corresponding user terminal is used to connect to a broadcast channel. This delay occurs inter alia because a Random Access Point must be found within the coded video stream, from which point the video decoder receiving the video data stream can process the video data stream. This type of delay is called Video Tune-in Delay. In this case, the Random Access Points are the above-described intrapictures, which are coded while disregarding other pictures. Because only some of the pictures are intrapictures, there is consequently a delay when connecting to a broadcast channel until a corresponding intrapicture is received.
When transferring coded video contents, use is often made of error correction methods, in particular Forward Error Correction (FEC), this being sufficiently well known from the related art. In the case of such error protection methods, provision is made for transferring redundancy packets, by which error correction for video pictures can be performed in the event of an invalid transfer, in addition to data packets containing video pictures. When error correction methods are used, it is necessary to wait a certain time, until sufficient video data and redundancy data is received, in order to carry out the error correction. This results in a further delay, which is also called Initial Delay.
With reference to
In the known prediction structure according to
I0 P1 P2 P3 P4 P5 P6 N7 FEC.
In this context, “FEC” is understood to mean error protection data which can be used for reconstructing invalid data of the GOP.
According to the related art, the pictures can also be transferred in a modified transfer order, which is the reverse order of the known transfer order and is therefore as follows:
N7 P6 P5 P4 P3 P2 P1 I0 FEC.
As a result of this modified transfer order, when connecting into a group of pictures GOP, it is possible to decode at least the pictures received at the end, because these pictures require only a small amount or even none of the information from other pictures. As in the known transfer order, the redundancy data FEC is likewise transmitted at the end when using the modified transfer order.
Dong Tian, Vinod Kumar M V, Miska Hannuksela, Stephan Wenger, Moncef Gabbouj, “Improved H.264/AVC Video Broadcast/Multicast”, in Proceedings of SPIE Visual Communications and Image Processing 2005 (VCIP 2005), Beijing, China, July 2005, further proposes a prediction structure which is modified relative to that in
Tian et al. additionally disclose a further prediction structure in the form of so-called Multiple Reference Frames, the prediction structure being shown in
The prediction structure according to
I0 P2 N1 P4 N3 P6 N5 N8 N7 FEC1 FEC2.
In this context, the redundancy information is divided into the two redundancy blocks FEC1 and FEC2. The first redundancy block FEC1 protects the pictures I0, P2, P4, P6 and N8, while the second redundancy block FEC2 protects the pictures N1, N3, N5 and N7.
The prediction structures in
FEC2 N1 N3 N5 N7 FEC1 N8 P6 P4 P2 I0.
The pictures are arranged into subsequences in descending order of the resolution levels here, such that the pictures belonging to the highest resolution level, specifically N1, N3, N5 and N7, are transferred first and the pictures belonging to the next lower resolution level, specifically the pictures N8, P6, P4 and P2, are transferred next. Finally, the intrapicture I0 is transferred at the end of the transfer order. In addition, the redundancy blocks of the corresponding resolution level are always arranged at the beginning of the subsequence of pictures belonging to the relevant resolution level.
As a result of the above-modified transfer order, when connecting into a GOP at the beginning of the GOP, e.g. within the subsequence of the pictures N1, N3, N5 and N7, display of the pictures is in particular still possible with limited resolution because the pictures of the lower resolution are transferred later and do not require information from the preceding pictures. However, the above prediction structures according to
The related art also discloses the prediction structure which is shown in
The method addresses the problem of ensuring smooth playback of the video pictures with minimal delay when a receiving device connects to a channel that is transferring the video pictures.
The method provides for groups of pictures to be formed, wherein a relevant group of pictures includes a plurality of temporally consecutive pictures in an original temporal order. In this context, the original temporal order corresponds to the actual temporal course of the scenarios that are represented in the video stream.
In the method, each group of pictures is coded, i.e. by forming a prediction structure in which one or more pictures of the group of pictures are specified as intrapictures which are intracoded in each case, and the other pictures of the group of pictures are specified as interpictures which are predicted from at least one reference picture of the group of pictures and are intercoded relative to the at least one reference picture. According to the method, the prediction structure is configured such that:
i) each intrapicture is a reference picture, from which are predicted at least one picture which is temporally earlier than the intrapicture in the group of pictures, and at least one picture which is temporally later than the intrapicture in the group of pictures;
ii) the interpictures include a plurality of non-referenced pictures, from which no pictures of the series are predicted.
A transfer sequence having a temporal transfer order is then formed from the coded pictures of the group of pictures, wherein at least some of the coded non-referenced pictures are the first pictures of the transfer order. In this context, transfer order is understood to mean the order in which the pictures are subsequently to be transferred after the coding.
By virtue of non-referenced pictures being situated at the beginning of the series of pictures, it is often possible to render this group of pictures in reduced resolution when connecting into a group of pictures, because those pictures which are not required for decoding other pictures are transferred at the beginning of the group of pictures. Furthermore, smooth playback of the pictures becomes possible because the intrapicture is not arranged at the boundary of the series of pictures, and at least one temporally earlier and one temporally later picture are predicted from the intrapicture.
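The effect of the transfer order on connecting into a group of pictures can be illustrated with a minimal sketch. The prediction structure in `REFERENCES`, the picture names and the function `decodable_after_tune_in` are hypothetical and serve only as an illustration; it is assumed that a picture can be decoded if it was received and all of its reference pictures can be decoded.

```python
from typing import Dict, List, Sequence

# Hypothetical prediction structure for a group of seven pictures
# (an assumption for illustration): picture -> pictures it is predicted from.
REFERENCES: Dict[str, List[str]] = {
    "I3": [],
    "P1": ["I3"], "P5": ["I3"],
    "N0": ["P1"], "N2": ["P1"], "N4": ["P5"], "N6": ["P5"],
}


def decodable_after_tune_in(
    transfer_order: Sequence[str],
    missed: int,
    references: Dict[str, List[str]] = REFERENCES,
) -> List[str]:
    """Return the decodable pictures, in temporal order, when the receiver
    misses the first `missed` pictures of the transfer order."""
    received = set(transfer_order[missed:])

    def is_decodable(picture: str) -> bool:
        # A picture needs to be received together with its whole reference chain.
        return picture in received and all(
            is_decodable(ref) for ref in references[picture]
        )

    # The number after the type letter is the temporal position of the picture.
    return sorted(
        (p for p in received if is_decodable(p)), key=lambda p: int(p[1:])
    )


# Non-referenced pictures first, intrapicture last.
proposed = ["N0", "N2", "N4", "N6", "P1", "P5", "I3"]
# Intrapicture first, for comparison.
intra_first = ["I3", "P1", "P5", "N0", "N2", "N4", "N6"]

print(decodable_after_tune_in(proposed, missed=2))
print(decodable_after_tune_in(intra_first, missed=1))
```

Under these assumptions, missing the first two pictures of the proposed order still leaves five decodable pictures, whereas missing only the intrapicture of the intra-first order leaves none.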
In an embodiment, the coded intrapicture (or intrapictures) is arranged as the last picture (or pictures) of the transfer order. Consequently, even when connecting into a group of pictures at a late time point, it is still possible to render at least the intracoded picture of the group of pictures.
In a further embodiment of the method, all coded non-referenced pictures are arranged as the first pictures at the beginning of the transfer order. In a variant, provision is further made for an essentially central arrangement of the intrapicture. If there is an uneven number of pictures in the group of pictures, this involves using the central picture of the group of pictures as the intrapicture, and if there is an even number of pictures in the group of pictures, the intrapicture is located at that position—in the group of pictures—which corresponds to the result of the division of the number of pictures of the group of pictures by two, or to this result plus one.
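The rule for the essentially central arrangement of the intrapicture can be written out as a short sketch. The function name `intra_positions` and the use of 1-based positions are assumptions made for illustration only.

```python
from typing import List


def intra_positions(gop_size: int) -> List[int]:
    """Return the admissible 1-based positions of an essentially central
    intrapicture in a group of `gop_size` pictures."""
    if gop_size < 1:
        raise ValueError("a group of pictures needs at least one picture")
    if gop_size % 2 == 1:
        # Uneven number of pictures: the central picture.
        return [(gop_size + 1) // 2]
    # Even number of pictures: number of pictures divided by two,
    # or this result plus one.
    half = gop_size // 2
    return [half, half + 1]


print(intra_positions(7))
print(intra_positions(8))
```

If the pictures of a group are numbered starting from 0, position 4 in a group of seven pictures corresponds to picture 3, and position 5 in a group of eight pictures corresponds to picture 4.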
In a further embodiment, the groups of pictures include as interpictures not only non-referenced pictures, but also those pictures from which one or more pictures of the group of pictures are predicted. In the transfer order, these coded reference pictures may be arranged between the at least several coded non-referenced pictures and the coded intrapicture or intrapictures. In this way, a hierarchy of the pictures is effected, reflecting the importance of the corresponding pictures in the decoding. The more important a picture in the context of decoding, the later it is arranged in the transfer order.
In a further embodiment, redundancy data is generated in each case for the groups of pictures for the purpose of error protection when transferring the group of pictures concerned, wherein the redundancy data is inserted into the transfer order when the transfer sequence is generated. In this context, it is advantageous for at least part of the redundancy data in the transfer order to be arranged before the first pictures because, when connecting into a group of pictures, the actual picture information then follows at a later time point than it would if the redundancy information was situated at the end of the group of pictures.
In a further embodiment, a relevant group of pictures can be scaled into a plurality of resolution levels, wherein the lowest resolution level includes only the coded intrapicture or intrapictures, and each higher resolution level is characterized by a number of coded pictures which are added at the higher resolution level in comparison with the next lower resolution level. An advantageous combination of the method with scalable video coding is achieved in this way. According to the method, the coded pictures in the transfer sequence may be arranged into subsequences, these being assigned a resolution level in each case, wherein a relevant subsequence includes the coded pictures which, in comparison with the next lower resolution level, are added at the resolution level that is assigned to the relevant subsequence, wherein the subsequences in the transfer order are arranged in descending order of the resolution levels. This ensures that the highest possible temporal resolution of the pictures is maintained when connecting into a group of pictures.
In a further embodiment, separate redundancy data is generated in each case for at least some of the subsequences, the data being arranged in each case in front of the corresponding subsequence in the transfer order. As a result, it is possible to achieve a flexible specification of the error protection according to resolution level by virtue of the separate redundancy data featuring at least partially different degrees of error protection, wherein the degree of error protection for the redundancy data of a subsequence may decrease as the resolution level of the subsequence increases.
In a further embodiment, regular temporal scalability is ensured in that the resolution levels are characterized by a factor, such that all resolution levels except for the lowest include a number of pictures which can be divided by the factor without a remainder.
In a further embodiment of the method, the prediction structure is specified in such a way that at least one non-referenced picture is assigned a predetermined number of pictures, the non-referenced picture being predicted from that picture, among the predetermined number of pictures, which was generated from the smallest number of predictions. Consequently, for the purpose of predicting a picture, a picture is always used which was derived from the fewest possible preceding prediction steps. This results in increased error resilience, since the error propagation is lower in the event of an invalid transfer. In this context, the predetermined number of pictures may be the two reference pictures which are situated temporally closest to the non-referenced picture in the series of pictures, i.e. the two temporally closest pictures which are not non-referenced pictures.
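The selection of the reference picture for a non-referenced picture can be sketched as follows. The function `pick_reference`, the `depth` mapping and the tie-breaking rule (the temporally nearer picture wins if the prediction counts are equal) are assumptions for illustration; the sketch covers the variant in which the predetermined pictures are the two temporally closest reference pictures.

```python
from typing import Dict


def pick_reference(target: int, depth: Dict[int, int]) -> int:
    """Choose the reference picture for the non-referenced picture at
    temporal position `target`.

    `depth` maps the temporal position of every reference picture to the
    number of prediction steps it was generated from (0 for an intrapicture).
    """
    # The two reference pictures situated temporally closest to the target.
    closest = sorted(depth, key=lambda pos: (abs(pos - target), pos))[:2]
    # Among them, the one generated from the smallest number of predictions;
    # equal counts are resolved in favour of the temporally nearer picture
    # (an assumption of this sketch).
    return min(closest, key=lambda pos: (depth[pos], abs(pos - target)))


# Hypothetical structure: intrapicture at position 3, interpictures at
# positions 1 and 5, each predicted directly from the intrapicture.
depth = {3: 0, 1: 1, 5: 1}
print(pick_reference(2, depth))
print(pick_reference(4, depth))
```

In this hypothetical structure, the pictures at positions 2 and 4 are both predicted from the intrapicture at position 3, so that the loss of one of the interpictures at positions 1 and 5 does not disrupt them.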
In a further embodiment, at least some interpictures are predicted in each case from a plurality of other pictures, wherein a relevant interpicture of the at least some interpictures is divided into a multiplicity of blocks and, for each block, an individual picture from which the block is predicted is specified from the plurality of other pictures. The method is thus combined with the prediction using Multiple Reference Frames as mentioned in the introduction.
In addition to the above-described method for video coding, a method is herein described for transmitting a series of digitized pictures, wherein the series of digitized pictures is coded in accordance with the method and the pictures are then transmitted in the temporal transfer order of the transfer sequence. In this context, the transmission may take place via a broadcast service on one or more broadcast channels.
In addition to the above-described method for video coding, a method is herein described for decoding a series of digitized pictures which were coded and transmitted using the method. In the decoding method, the transfer sequences of the coded pictures of the groups of pictures of the series are received. The coded pictures of each transfer sequence are then decoded depending on the prediction structure being used. Finally, the decoded pictures of each transfer sequence are read out in the original temporal order of the group of pictures, thereby recreating the original video stream.
Also described herein is a corresponding transmitter for transmitting a series of digitized pictures, wherein the transmitter performs the coding method described herein and subsequently transmits the coded pictures in accordance with any variant of the method.
Also described below is a receiver for receiving and decoding a series of digitized pictures that was transmitted using the method, the receiver being configured in such a way that it performs the above-described decoding method.
These and other aspects and advantages will become more apparent and more readily appreciated from the following description of the exemplary embodiments, taken in conjunction with the accompanying drawings of which:
Reference will now be made in detail to the preferred embodiments, examples of which are illustrated in the accompanying drawings, wherein like reference numerals refer to like elements throughout.
FEC2 N0 N2 N4 N6 FEC1 P1 P5 I3.
The redundancy block FEC2 protects the non-referenced pictures here, and the redundancy block FEC1 protects the intrapicture and the pictures P1 and P5 which are used for predicting the non-referenced pictures.
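The formation of such a transfer sequence can be sketched as follows. It is assumed for illustration that the coded pictures have already been sorted into protection groups, with the group of the most important pictures first; the function `build_transfer_sequence` and the naming of the redundancy blocks as `FEC1`, `FEC2`, ... are hypothetical.

```python
from typing import List, Sequence


def build_transfer_sequence(groups: Sequence[Sequence[str]]) -> List[str]:
    """Form a transfer sequence from protection groups.

    `groups[0]` holds the most important pictures (it ends with the
    intrapicture), `groups[-1]` the non-referenced pictures. The groups are
    emitted in descending order, each preceded by its redundancy block.
    """
    sequence: List[str] = []
    for number in range(len(groups), 0, -1):
        sequence.append(f"FEC{number}")      # redundancy block leads its group
        sequence.extend(groups[number - 1])  # pictures protected by that block
    return sequence


two_groups = [["P1", "P5", "I3"], ["N0", "N2", "N4", "N6"]]
print(" ".join(build_transfer_sequence(two_groups)))
```

With the two groups shown, the sketch yields the transfer order FEC2 N0 N2 N4 N6 FEC1 P1 P5 I3 given above.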
Because the pictures are not decoded in the original order of the series of pictures in the receiver, the pictures must be stored in a so-called playout buffer on the receiver side for subsequent display. In this case, the intrapicture I3 must be stored first, after it has been decoded. After the subsequent decoding of the interpicture P1, I3 and P1 remain in the memory. During the subsequent decoding of the non-referenced picture N0, this picture is likewise stored in the playout buffer and, after completion of the decoding, is read out for display and deleted from the buffer. Next, a series of contents is shown, rendering the contents of the playout buffer after each decoding of a picture. The contents of the buffer at the relevant time points are grouped together in parentheses, wherein the picture located at the right-hand end of a set of parentheses is the picture which was decoded at the relevant time point. Furthermore, an underscore indicates which picture is read out and deleted from the buffer after the decoding at the relevant time point. The following model, indicating the series of contents, is used in relation to the description of the further embodiments. The series of contents of the playout buffer for the series of pictures as per
(I3) (I3 P1) (I3 P1 _N0_) (I3 _P1_ N2) (I3 _N2_ P5) (_I3_ P5 N4) (P5 _N4_ N6) (_P5_ N6) (_N6_).
This means that a playout buffer of three decoded pictures must be provided for the embodiment as per
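The series of contents of the playout buffer can be reproduced with a small simulation. This is a sketch under stated assumptions: the decoding order is read off the series of contents above, one picture is decoded per time point, and a picture is read out as soon as it is the next picture in the original temporal order and is present in the buffer. The function `simulate_playout` is hypothetical.

```python
from typing import List, Optional, Sequence, Tuple

State = Tuple[Tuple[str, ...], Optional[str]]


def simulate_playout(
    decode_order: Sequence[str], display_order: Sequence[str]
) -> List[State]:
    """Return, per time point, the buffer contents after decoding and the
    picture that is read out and deleted at that time point (or None)."""
    buffer: List[str] = []
    to_decode = list(decode_order)
    pending = list(display_order)
    states: List[State] = []
    while pending:
        if to_decode:
            buffer.append(to_decode.pop(0))  # newly decoded picture on the right
        snapshot = tuple(buffer)
        read_out: Optional[str] = None
        if pending[0] in buffer:
            read_out = pending.pop(0)        # next picture in temporal order
            buffer.remove(read_out)
        elif not to_decode:
            raise ValueError("a picture to be displayed is never decoded")
        states.append((snapshot, read_out))
    return states


states = simulate_playout(
    decode_order=["I3", "P1", "N0", "N2", "P5", "N4", "N6"],
    display_order=["N0", "P1", "N2", "I3", "N4", "P5", "N6"],
)
for contents, read_out in states:
    print(contents, "read out:", read_out)
print("required buffer size:", max(len(contents) for contents, _ in states))
```

Under these assumptions, the largest number of pictures held at any time point gives the required size of the playout buffer.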
In the embodiment above, the first redundancy block FEC1 protects the pictures I3, P1 and P5, and the second redundancy block FEC2 protects the pictures N0, N2, N4 and N6. Because the latter pictures are not used for the prediction of other pictures, the protection for these pictures may be weaker. The error protection FEC2 can optionally be omitted completely, in which case only the reference pictures I3, P1 and P5 are protected. This results in Unequal Error Protection (UEP). By contrast, both error protection blocks FEC1 and FEC2 are combined into one error protection block FEC in the case of Equal Error Protection (EEP). Assuming that a picture is lost during the transfer (also assuming an equal distribution in the loss of pictures), this results in an expected value E of disrupted pictures as follows:
E=1/7·(4·1+2·3+1·7)=2.43.
E=1/7·(4·1+2·2+1·7)=2.14.
Consequently, the error susceptibility is reduced in the embodiment as per
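The two expected values can be checked with a short calculation. The parent mappings `LONG_PATHS` and `SHORT_PATHS` are hypothetical prediction structures chosen so that they reproduce the two sums above; it is assumed for illustration that each interpicture is predicted from exactly one reference picture and that every picture is lost with equal probability.

```python
from fractions import Fraction
from typing import Dict, Optional

# Hypothetical structures: picture -> its single reference picture.
LONG_PATHS: Dict[str, Optional[str]] = {
    "I3": None, "P1": "I3", "P5": "I3",
    "N0": "P1", "N2": "P1", "N4": "P5", "N6": "P5",
}
SHORT_PATHS: Dict[str, Optional[str]] = {
    "I3": None, "P1": "I3", "P5": "I3",
    "N0": "P1", "N2": "I3", "N4": "I3", "N6": "P5",
}


def expected_disrupted(parent: Dict[str, Optional[str]]) -> Fraction:
    """Expected number of disrupted pictures if exactly one picture is lost."""

    def depends_on(picture: Optional[str], lost: str) -> bool:
        # Walk up the prediction path; the lost picture disrupts itself too.
        while picture is not None:
            if picture == lost:
                return True
            picture = parent[picture]
        return False

    total = sum(
        sum(depends_on(picture, lost) for picture in parent) for lost in parent
    )
    return Fraction(total, len(parent))


print(round(float(expected_disrupted(LONG_PATHS)), 2))
print(round(float(expected_disrupted(SHORT_PATHS)), 2))
```

In the second structure, the loss of an interpicture disrupts only two pictures instead of three, which accounts for the lower expected value.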
In this context, the transfer order in the embodiment as per
FEC2 N0 N2 N4 N6 FEC1 P1 P5 I3.
In this case, the series of contents of the playout buffer in the receiver is as follows:
(I3) (I3 P1) (I3 P1 _N0_) (I3 _P1_ N2) (I3 _N2_ N4) (_I3_ N4 P5) (_N4_ P5 N6) (_P5_ N6) (_N6_).
FEC2 N1 N3 N5 N7 FEC1 N0 P2 P6 I4.
In this case, the series of contents of the playout buffer in the receiver is as follows:
(I4) (I4 P2) (I4 P2 _N0_) (I4 P2 _N1_) (I4 _P2_ N3) (I4 _N3_ N5) (_I4_ N5 P6) (_N5_ P6 N7) (_P6_ N7) (_N7_).
In this context, the first redundancy block FEC1 protects the pictures I4, P2, N0 and P6, while the second redundancy block FEC2 protects the pictures N1, N3, N5 and N7. Because the latter pictures are not used for prediction by other pictures, the protection for these pictures is weaker. This produces an Unequal Error Protection. In the case of Equal Error Protection, the two error protection blocks FEC1 and FEC2 can be combined into one error protection block FEC.
FEC3 N1 N3 N5 N7 FEC2 P2 P6 FEC1 N0 I4.
In this case, the series of contents of the playout buffer is as follows:
(I4) (I4 _N0_) (I4 P2) (I4 P2 _N1_) (I4 _P2_ N3) (I4 _N3_ N5) (_I4_ N5 P6) (_N5_ P6 N7) (_P6_ N7) (_N7_).
Unequal Error Protection can also be achieved in this variant. In this case, the redundancy block FEC1 protects the pictures N0 and I4, FEC2 protects the pictures P2 and P6, and FEC3 protects the pictures N1, N3, N5 and N7.
By a small modification to the prediction structure as per
According to the method, the following transfer order is generated for
FEC3 N1 N3 N5 N7 N9 N11 N13 N15 FEC2 N2 N6 N10 P14 FEC1 P0 P4 P12 I8.
In this case, the series of contents of the playout buffer is as follows:
For
FEC3 N1 N3 N5 N7 FEC2 P2 P6 FEC1 P0 I4.
In this case, the series of contents of the playout buffer is as follows:
(I4) (I4 P0) (I4 _P0_ P2) (I4 P2 _N1_) (I4 _P2_ N3) (I4 _N3_ N5) (_I4_ N5 P6) (_N5_ P6 N7) (_P6_ N7) (_N7_).
A plurality of advantages are derived from the above-described variants. Smoother playback of the pictures is permitted when connecting to a broadcast channel. Furthermore, as a result of the even (e.g. dyadic) temporal scalability, it becomes possible to support a plurality of scalability levels. If e.g. the error protection for non-referenced pictures is inadequate for decoding these correctly, it is possible to display just the remaining video stream using half the temporal resolution (half of the picture refresh rate). In the case of non-regular temporal scalability, the pictures would be displayed at irregular time intervals, which is perceived as disruptive. If applicable, it is also possible to define two different service classes, one class relating to the full temporal resolution and the other to the reduced temporal resolution. A further advantage of the above variants featuring shortened prediction paths is an increase in the error resilience of the transfer.
i) each intrapicture is a reference picture, from which are predicted at least one picture which is temporally earlier than the intrapicture in the group of pictures, and at least one picture which is temporally later than the intrapicture in the group of pictures;
ii) the interpictures include a plurality of non-referenced pictures, from which no pictures of the series are predicted.
The transmitter additionally includes a transmitter or transmission means 4 for transmitting the coded pictures, the transmission means being configured such that a transfer sequence having a temporal transfer order is formed from the coded pictures of each group of pictures, and the coded pictures are transmitted in the transfer order, wherein at least some of the coded non-referenced pictures are the first pictures of the transfer order.
The pictures are transferred from the transmitter 1 via a transfer link 5, e.g., via one or more broadcast channels. These broadcast channels can be received by a receiver 6, and the data stream which is coded therein can be read out by the receiver 6. For this purpose, the receiver 6 includes a receiver or receiving means 7 for receiving the transfer sequences of the coded pictures of the groups of pictures of the video stream, a decoder or decoding means 8 for decoding the pictures of each transfer sequence depending on the prediction structure, and a reader or reading means 9 for reading out the decoded pictures of each transfer sequence in the original temporal order of the group of pictures.
The system also includes permanent or removable storage, such as magnetic and optical discs, RAM, ROM, etc. on which the process and data structures of the present invention can be stored and distributed. The processes can also be distributed via, for example, downloading over a network such as the Internet. The system can output the results to a display device, printer, readily accessible memory or another computer on a network.
A description has been provided with particular reference to preferred embodiments thereof and examples, but it will be understood that variations and modifications can be effected within the spirit and scope of the claims which may include the phrase “at least one of A, B and C” as an alternative expression that means one or more of A, B and C may be used, contrary to the holding in Superguide v. DIRECTV, 358 F3d 870, 69 USPQ2d 1865 (Fed. Cir. 2004).
Number | Date | Country | Kind |
---|---|---|---|
10 2006 057 983.6 | Dec 2006 | DE | national |
This is the U.S. national stage of International Application No. PCT/EP2007/060957, filed Oct. 15, 2007 and claims the benefit thereof. The International Application claims the benefit of German Application No. 10 2006 057 983.6 filed on Dec. 8, 2006, both applications are incorporated by reference herein in their entirety.
Filing Document | Filing Date | Country | Kind | 371c Date |
---|---|---|---|---|
PCT/EP2007/060957 | 10/15/2007 | WO | 00 | 1/20/2011 |