The invention relates to a method for coding a sequence of digitized images with a plurality of image blocks as well as a corresponding decoding method. The invention also relates to corresponding coding and decoding devices.
Current video coding standards (for example, see document [1], below) allow the coding of image sequences in which the macro image blocks used for movement estimation are updated by means of an intra-coding mode. As a result, errors are not propagated through the image sequence. Updating by means of intra-coding modes can be carried out at regular intervals or based on predetermined criteria. In other video coding methods, inter-coding modes that refer back to several previously coded reference images can be used. However, there are no mechanisms that allow efficient video coding with inter-coding modes and intra-updating modes over error-prone networks.
The publication “Proc. Intl. Conf. On Image Processing ICIP, Lausanne, vol.1, 16.09.1996, pp.763-766 (Lio et al)” describes an intra-update method for video coding via channels prone to errors. This method analyses the specific sensitivity of macro blocks for channel errors and obtains a specification for the intra-update modes.
The publication “Proc. IEEE ICASSP, San Francisco, vol. 5, 23.3.1992, pp. 545-548 (Haskell et al)” describes several possible methods for resynchronizing movement-compensated videos that are adversely affected by ATM cell loss.
Accordingly, a system and method are needed that allow efficient video coding with inter-coding modes and intra-updating modes over error-prone networks. As will be discussed below, a method is disclosed for coding a sequence of digitized images that uses a plurality of intra-coding and inter-coding modes as well as a plurality of reference images, to ensure a reliable reconstruction of the digitized images in error-prone networks.
One exemplary embodiment codes a sequence of digitized images with a plurality of image blocks in error-prone networks, wherein the macro blocks in a section of the image are coded in a first intra-coding mode depending on predetermined criteria. Furthermore, the macro blocks in a section of the image are coded in a second intra-coding mode or in an inter-coding mode, wherein, in the inter-coding mode, movement vectors for the macro blocks are selected from the set of accessible reference images. The selection from the set of accessible reference images is limited in such a way that only image areas that were not subjected to the first intra-coding mode at a later stage are referenced. This prevents the inter-coding mode from referring to reference image areas that are subjected, at least partially, to such an intra-coding mode. If the coding in the first intra-coding mode is carried out for reasons of error robustness, in order to avoid the propagation of errors in the case of incorrect transmissions, this ensures that the coding is not based on image areas that were transmitted incorrectly. Therefore, coding that is both efficient and error robust is provided in error-prone networks.
Under the exemplary embodiment discussed above, the coding is carried out in a first intra-coding mode at regular time intervals. Alternatively, the coding in the first intra-coding mode can be repeated at random time intervals.
Under another exemplary embodiment, the coding is carried out in a second intra-coding mode or in an inter-coding mode for reasons of coding efficiency. For reasons of coding efficiency, an intra-coding mode is particularly taken into consideration if an object in the image sequence only appears temporarily in some images.
Under yet another exemplary embodiment, the following steps are carried out to limit the reference images for coding a macro block. For each inter-coding mode from the set of possible inter-coding modes and for each reference image from the set of accessible reference images, optimized movement vectors are selected from the set of possible movement vectors by rate distortion optimized movement compensation. From the complete set of possible combinations of inter-coding modes and reference images, a limited set is created by removing the combinations that reference image areas coded in a later image in the first intra-coding mode. From the limited set and a set of intra-coding modes, the best combination is determined on the basis of rate distortion criteria. If the image block was coded with an intra-coding mode in the preceding step, an additional step establishes whether the image block was intra-coded on the basis of error robustness criteria (first intra-coding mode) or on the basis of the rate distortion optimization (second intra-coding mode). In this way, an optimum coding mode can be determined for the macro blocks to be coded. The use of rate distortion criteria is described in greater detail in documents [3] and [4] below.
The rate distortion criteria used to determine the best combination depend on the error rate to be expected when transmitting the coded images. To determine these criteria, the distortion of the pixel values of the images is calculated. The distortion of the pixel values preferably comprises the sum of the squared differences between the pixel values before coding and the correspondingly decoded pixel values. Because the distortion in the decoder is usually not known when coding, the distortion is estimated in a particularly preferred embodiment.
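As a minimal sketch of the distortion measure just described, the sum of squared differences between an original block and its decoded counterpart could be computed as follows. The function name `ssd` and the representation of a block as a flat list of pixel values are assumptions made for illustration only.

```python
def ssd(original, decoded):
    """Sum of squared differences between two blocks of pixel values.

    Both blocks are assumed to be flat sequences of equal length.
    """
    if len(original) != len(decoded):
        raise ValueError("blocks must contain the same number of pixels")
    return sum((s - s_hat) ** 2 for s, s_hat in zip(original, decoded))


# Example: three pixels, two of which changed during coding and decoding.
print(ssd([10, 20, 30], [12, 20, 27]))  # 4 + 0 + 9 = 13
```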
In addition to the above-described coding method, a corresponding method is also disclosed for decoding a sequence of digitized images in error-prone networks in which case the method is embodied in such a way that a sequence of digitized images coded with the coding method described above is decoded. Under an example discussed in detail below, an error concealment is used for decoding.
A device for coding a sequence of digitized images in error-prone networks is also disclosed, in which case the device is embodied in such a way that the coding method described above can be carried out. The invention also includes a corresponding device for decoding digitized images in error-prone networks in which case the device is embodied in such a way that the decoding method described above can be carried out.
The invention and its wide variety of potential embodiments will be more readily understood through the following detailed description, with reference to the accompanying drawing in which:
The image sequence shown in
The image sequence is transmitted via an Internet test pattern that is described in document [2]. In this case, the image sequence is transmitted in data packets wherein a data packet consists of two rows of image blocks. In the text below, image blocks are referred to as macro image blocks whose shifting in the case of the inter-coding mode is determined by means of movement vectors. The coding method, by means of which the image sequence shown in
The section of the image sequence shows the images in references No. 9 to No. 12 (
The above-described image disturbances can be ascribed to the fact that for the coding used in the sequence of
In order to avoid the above-described disturbances to the greatest possible extent, the coding method according to the invention limits the reference images to the effect that for the inter-coding mode only such reference image blocks are used that are not subjected to any intra-updating mode after the reference image has been coded. The results of the method are shown in
Exemplary embodiments of the method according to the invention are described in greater detail below. In one embodiment, for each macro block coding mode m from the set of possible inter-coding modes $M_p$ and for each reference image r from the set of accessible reference images $R$, an optimum movement vector $v(m, r)$ is selected from the set of movement vectors $V(m)$ for the movement compensation. The selection takes place according to rate distortion criteria. Mathematically, the rate distortion criteria are expressed as follows:
$$v(m, r) = \arg\min_{v \in V(m)} \left\{ D_{DFD}(m, r, v) + \lambda_{motion} R_{motion}(m, r, v) \right\}, \quad (1)$$
in which case $D_{DFD}(m, r, v)$ is the distortion after the movement compensation and $R_{motion}(m, r, v)$ is the number of bits needed for coding the specific movement vector. The function $D_{DFD}(m, r, v) + \lambda_{motion} R_{motion}(m, r, v)$ is a so-called Lagrange cost function that contains the Lagrange multiplier $\lambda_{motion}$. This function is minimized, whereby movement vectors are determined that are optimum with regard to the distortion and the number of bits required for the movement vector. Therefore, as a first result, optimized movement vectors $v(m, r)$ are obtained for each reference image r and for each macro block coding mode m.
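The minimization of the Lagrange cost function described above can be illustrated with a short sketch. It assumes that the distortion and the bit cost of every candidate movement vector are already available as lookups; the function name `best_motion_vector` and the toy numbers are hypothetical and not taken from the source.

```python
def best_motion_vector(candidates, distortion, rate, lam_motion):
    """Return the candidate v minimizing distortion(v) + lam_motion * rate(v)."""
    return min(candidates, key=lambda v: distortion(v) + lam_motion * rate(v))


# Toy data (hypothetical): distortion and bit cost per candidate vector
# for one fixed coding mode m and one fixed reference image r.
D_DFD = {(0, 0): 100, (1, 0): 60, (2, 1): 50}
R_MOTION = {(0, 0): 1, (1, 0): 4, (2, 1): 12}

v = best_motion_vector(D_DFD.keys(), D_DFD.get, R_MOTION.get, lam_motion=5)
print(v)  # costs are 105, 80 and 110, so (1, 0) is selected
```

With a multiplier of 0 the search degenerates to pure distortion minimization and picks (2, 1). A large multiplier favors the cheapest vector, (0, 0).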
In a next step, the set of candidate combinations is limited by removing, from the set consisting of the inter-coding modes $M_p$ and the reference images $R$, those combinations in which referencing takes place from image areas that are subjected to an intra-updating mode at a later stage, for example, for reasons of error robustness. In this way, a set $O_p$ of permitted values m and r is obtained for the movement vectors, as follows:
Op={(m,r)∈{Mp,R}|smin β(v(m,r),f,k)≧r}, (2)
in which case
If the number of the last permitted reference image is greater than or equal to the number of the reference image r, the combination (m, r) has a reference image within the set of reference images limited by the method according to the invention. If the number of the last permitted reference image is less than the number of the reference image r, the corresponding combination (m, r) is rejected.
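The acceptance test for a combination (m, r) can be sketched as follows. Because the exact definition of β is not reproduced here, the sketch assumes that β is the smallest entry of the vector f among the macro blocks that the optimized movement vector refers to, and that f holds, per macro block, the largest reference image number that may still point at it. The names `last_permitted_reference`, `restrict_combinations` and `covered`, as well as the mode labels, are hypothetical.

```python
def last_permitted_reference(covered_blocks, f):
    """Assumed form of beta: the most restrictive entry of f among the
    macro blocks that the movement vector points into."""
    return min(f[k] for k in covered_blocks)


def restrict_combinations(combinations, covered, f):
    """Keep only the (m, r) whose last permitted reference number is >= r.

    combinations: iterable of (mode, reference_number) pairs
    covered: maps each pair to the macro block indices referenced by its
             optimized movement vector (hypothetical input)
    f: per-macro-block limit on the reference image number (assumption)
    """
    return [
        (m, r)
        for (m, r) in combinations
        if last_permitted_reference(covered[(m, r)], f) >= r
    ]


f = [3, 1, 5]
covered = {
    ("16x16", 1): [0, 1],
    ("16x16", 2): [0, 1],
    ("8x8", 2): [0, 2],
    ("8x8", 4): [2],
}
allowed = restrict_combinations(covered.keys(), covered, f)
print(allowed)  # ("16x16", 2) is rejected because block 1 only permits r <= 1
```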
The limited set $O_p$ of reference images and inter-coding modes m resulting from the previous step is combined with a set of intra-coding modes $M_I$ that can be used in the method according to the invention, and the optimized coding mode $o(k)$ is then determined from the set union $O = \{M_I, O_p\}$ for each macro block k by means of the rate distortion criteria. If this macro block must be intra-coded, for example, because of regular or random intra-updating modes, then the set $O$ is limited to intra-modes only, that is, $O = M_I$. In this case it is of course not necessary to determine $O_p$. Mathematically, the rate distortion criteria can again be formulated as the minimization of a Lagrange cost function:
$$o(k) = \arg\min_{o \in O} \left\{ D(o) + \lambda R(o) \right\}, \quad (3)$$
in which case $R(o)$ is the number of bits needed to code the image block in the coding mode o, $D(o)$ represents the distortion for this coding mode, and $\lambda$ is the Lagrange multiplier of this cost function.
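The mode decision over the combined set of intra-coding modes and permitted inter-coding combinations might look like the following sketch. The function name `choose_mode`, the multiplier name `lam`, the mode labels and the cost tables are assumptions for illustration. The `forced_intra` switch models the case in which the set is limited to intra-modes only.

```python
def choose_mode(intra_modes, inter_options, distortion, rate, lam,
                forced_intra=False):
    """Return the option o minimizing distortion(o) + lam * rate(o).

    When forced_intra is True, only the intra-coding modes are considered.
    """
    candidates = list(intra_modes)
    if not forced_intra:
        candidates += list(inter_options)
    return min(candidates, key=lambda o: distortion(o) + lam * rate(o))


# Toy costs (hypothetical) for one macro block.
D = {"intra": 20, ("16x16", 1): 30, ("8x8", 2): 25}
R = {"intra": 90, ("16x16", 1): 20, ("8x8", 2): 40}
inter = [("16x16", 1), ("8x8", 2)]

free_choice = choose_mode(["intra"], inter, D.get, R.get, lam=1.0)
forced = choose_mode(["intra"], inter, D.get, R.get, lam=1.0, forced_intra=True)
print(free_choice)  # costs are 110, 50 and 65, so ("16x16", 1) wins
print(forced)       # only "intra" remains as a candidate
```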
If there is a regular or random intra-updating mode in the example above, the distortion is the sum of the squared differences between the original image block and the image block obtained after decoding. If the intra-updating mode is to be carried out on the basis of the error-optimized, channel-adaptive coding described further below, the distortion is given as the expected value of the distortion in the decoder.
In a following step, it remains to be established whether an intra-coded image block was intra-coded for reasons of error robustness, in order to avoid the propagation of errors, or for reasons of coding efficiency. Intra-coding for reasons of coding efficiency arises in particular if an object appears only temporarily in the image sequence. For an intra-coding mode chosen for reasons of coding efficiency, a reference image limitation is not desired. In order to determine the reason for the intra-coding mode, a rate distortion optimization is again performed according to equation (3), but using the complete set $O = \{M_I, O_p\}$ and, as the distortion measure, the sum of the squared differences between the original image block and the image block obtained after decoding. The result of this optimization is designated as $\hat{o}(k)$. Subsequently, an error robustness flag $e_k$ is set, in which case $e_k = \delta_{o(k) \neq \hat{o}(k)}$ and $\delta_{\text{condition}}$ is the Kronecker symbol, which is 1 if the condition is met and 0 otherwise. Therefore, the intra-coding mode was carried out for reasons of error robustness if the flag is set to 1.
Once all the image blocks of an image have been processed, the vector f is updated for all the entries $f_k$ for which the error robustness flag $e_k$ is set to 1. As a result, a reference image limitation is avoided for intra-coding modes that were selected for reasons of coding efficiency, and thus the appearance and disappearance of objects can be handled efficiently by coding with the aid of a number of reference images.
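The flag computation and the subsequent update of the vector f can be sketched as below. The text does not spell out which value is written into an entry of f, so the sketch takes it as a parameter; using the number of the image just coded is only one possible assumption. The function names and mode labels are hypothetical.

```python
def robustness_flag(o_k, o_hat_k):
    """Return 1 if the chosen mode differs from the efficiency-only optimum."""
    return 1 if o_k != o_hat_k else 0


def update_f(f, flags, new_value):
    """Return a copy of f with new_value written where the flag is 1.

    What is stored in an entry of f is an assumption here; new_value could
    be, for example, the number of the image that has just been coded.
    """
    return [new_value if e == 1 else f_k for f_k, e in zip(f, flags)]


chosen = ["intra", ("16x16", 1), "intra"]          # o(k), hypothetical modes
efficiency = [("8x8", 2), ("16x16", 1), "intra"]   # o_hat(k)
flags = [robustness_flag(o, o_hat) for o, o_hat in zip(chosen, efficiency)]
print(flags)                          # [1, 0, 0]
print(update_f([4, 4, 4], flags, 7))  # [7, 4, 4]
```

The third block is intra-coded in both optimizations, so its flag stays 0 and it imposes no reference image limitation.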
Another exemplary embodiment is described below in which a channel-adaptive reference image is selected on the basis of rate distortion criteria. For this example, the distortion $D(o)$ in the decoder has to be estimated. This distortion can be estimated using any of the methods described in documents [5], [6] and [7] below. Another possibility is to incorporate the random channel behavior C when the distortion is estimated. After an image n has been transmitted, the channel behavior C is given by a binary sequence from $\{0, 1\}^{p(n)}$, in which case $p(n)$ is the number of packets needed to transmit the images 1 to n. In this sequence, a 0 designates a correctly received packet whereas a 1 indicates a lost packet. The random variable that describes the binary sequence up to image n is designated as $C_{p(n)}$. The pixel distortion in the decoder depends on the pixel value reconstructed in the decoder, which is designated as $\hat{s}_i$ and is unknown to the encoder carrying out the coding. The reconstructed pixel value depends on the channel behavior C and the selected coding mode o, i.e. $\hat{s}_i = \hat{s}_i(C_{p(n)}, o)$. The distortion is estimated as the sum of the expected values of the squared pixel distortions $d_i(o)$ of all the macro blocks i, in which case it is assumed that the channel behavior $C_{p(n)}$ is known to the encoder. The pixel distortion $d_i(o)$ for the macro block i is as follows:
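$$d_i(o) = E_{C_{p(n-1)}}\left\{ \left| s_i - \hat{s}_i(C_{p(n-1)}, o) \right|^2 \right\}, \quad (4)$$
in which case $s_i$ is the original pixel value and $E_{C_{p(n-1)}}\{\cdot\}$ represents the expected value of the squared difference between the original pixel value and the reconstructed pixel value, averaged over the channel $C_{p(n-1)}$.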
In order to calculate the expected value, the following method can be used. It is assumed that T copies of the random variable “channel behavior” are available in the encoder. These copies are designated as $C_{p(n)}^{(t)}$, with t = 1, . . . , T. It is also assumed that all the random variables $C_{p(n)}^{(t)}$ are independent and identically distributed. Therefore, according to the strong law of large numbers, the following holds as T → ∞:
$$\frac{1}{T} \sum_{t=1}^{T} \left| s_i - \hat{s}_i(C_{p(n-1)}^{(t)}, o) \right|^2 \longrightarrow d_i(o). \quad (5)$$
Therefore, with the expression on the left side, the expected value $d_i(o)$ can be estimated and, in a next step, the expected distortion $D_i(o)$ calculated. The reconstruction of the pixel values depends on the channel behavior $C_{p(n-1)}^{(t)}$ as well as on the error concealment in the decoder. By means of the last-mentioned formula, the magnitude of the distortion in the decoder can be estimated in the encoder.
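The averaging over T channel copies can be illustrated with a small Monte Carlo sketch for a single pixel and a fixed coding mode. The independent packet loss model, the loss probability, and the toy decoder in `reconstruct` are assumptions introduced for illustration. A real encoder would have to replay the decoder, including its error concealment, for each channel copy.

```python
import random


def sample_channels(num_packets, loss_probability, copies, seed=0):
    """Draw independent binary loss sequences (1 = lost, 0 = received)."""
    rng = random.Random(seed)
    return [
        [1 if rng.random() < loss_probability else 0 for _ in range(num_packets)]
        for _ in range(copies)
    ]


def estimate_pixel_distortion(original, reconstruct, channels):
    """Average squared error over the channel copies (estimate of d_i(o))."""
    return sum((original - reconstruct(c)) ** 2 for c in channels) / len(channels)


# Hypothetical decoder model: the pixel is decoded as 100 if its packet
# (the last one in the sequence) arrives; otherwise concealment yields 90.
def reconstruct(channel):
    return 100 if channel[-1] == 0 else 90


fixed = estimate_pixel_distortion(100, reconstruct, [[0], [1], [0], [0]])
print(fixed)  # one loss in four copies: (0 + 100 + 0 + 0) / 4 = 25.0

channels = sample_channels(num_packets=3, loss_probability=0.2, copies=1000)
sampled = estimate_pixel_distortion(100, reconstruct, channels)
print(sampled)  # approaches 0.2 * 100 = 20 as the number of copies grows
```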
In addition, although the invention is described in connection with digitized images, it should be readily apparent that the invention may be practiced with any type of still or moving digital image format. It is also understood that the process portions and segments described in the embodiments above can be substituted with equivalent processes to perform the disclosed methods and processes. Accordingly, the invention is not limited by the foregoing description or drawings, but is only limited by the scope of the appended claims.
Number | Date | Country | Kind
---|---|---|---
102 02 500.2 | Jan 2002 | DE | national

Filing Document | Filing Date | Country | Kind
---|---|---|---
PCT/DE03/00176 | 1/23/2003 | WO |