The present invention concerns a video coding method and device. It applies in particular to the compression of a video signal with a view to its transmission or storage.
The invention relates in particular to the problem of the correction of losses due to video transmission errors on a network. The context of the invention can be more easily understood from
After the packets are received and put in the buffer 106 and the bitstream 108 is reconstituted 107, the latter may therefore contain errors. These errors are detected and located on decoding 109 and an error concealment, or correction, module 110 is responsible for correcting them. The error concealment module 110 may be located in the decoder, before the decoder or even before the reconstitution of the bitstream. In the present case, the error concealment module 110 is located at the decoder. This error concealment module 110 uses the non-erroneous information (spatial and/or temporal) available at the decoder to reconstruct the damaged regions. After concealment, the reconstructed video 111 is available for being displayed or stored. The quality of the reconstructed video 111 depends greatly on the reconstruction methods used in the error concealment module 110.
It should be noted that the error concealment module is often imperfect and only partially corrects the lost region of the video, since it relies on spatial and/or temporal interpolation methods. It should also be noted that the bitstream is composed of two types of picture: so-called ‘INTRA’ pictures, which are encoded independently of the other pictures, and so-called ‘INTER’ pictures, which are, to simplify, composed of a vector field allowing the prediction of this picture from the picture previously encoded/decoded and a prediction error describing the difference between the predicted picture and the real picture. Given the temporal dependence between two INTER pictures, a reconstruction error on a region located in a picture may have an impact on the successive pictures that depend on the first.
A known way of stopping the error propagation is to regularly insert INTRA blocks in an encoded picture. These INTRA blocks are coded independently of the previous picture. Because of this, they can stop the propagation of a transmission error poorly corrected by the error concealment module 110. However, the coding of these INTRA blocks is less effective and their number must be limited in order to avoid having to transmit excessively large pictures. In the same way, certain regions of the picture may be better protected by using other protection modes, for example correction or retransmission codes. However, these protection codes require a higher transmission rate.
There is known, from the document EP 1583369, an INTRA refreshing method whose decision depends on a measurement made on the motion vectors of the blocks around the lost region. This method assumes that the error correction algorithm is based on the estimation of the motion vectors of the lost region according to the motion vectors in the vicinity of this region. The error correction will be all the less effective, the lower the correlation between the adjacent vectors. Considering this hypothesis, INTRA refreshing is favored for blocks where the correlation between the motion vectors in the vicinity is low.
This method has many drawbacks. It requires the detection of a sensitive region according to the type of error correction performed on decoding (in this patent, it is presupposed that the error correction technique uses the motion vectors adjacent to the lost region). In addition, this method causes the detection of so-called ‘overlap’ regions, that is to say regions in which a background element is progressively masked by a foreground element, regions that have only very little visual impact in the event of loss. Detecting these regions in the context of increased protection therefore proves to be of low effectiveness.
The present invention aims to remedy these drawbacks.
To this end, the present invention relates, according to a first aspect, to a video coding method that comprises:
It will be observed that a region of the picture can consist of a block or a set of blocks, for example in the form of a slice used during the formation of packets. By virtue of these provisions, blocks for which a transmission error would have more repercussions on the quality of the picture obtained after decoding can be coded differently from the other blocks. Moreover, contrary to the prior art, the method that is the object of the present invention makes it possible not to detect the overlap regions and the homogeneous motion regions which have only a little visual impact in the event of loss. The method that is the object of the present invention therefore detects better the blocks liable to have a high visual impact in the event of loss. The invention thus makes it possible to detect the regions of the picture to be coded where there is a significant risk of degradation occurring taking account of the motion information available to the encoder. Once detected, these regions can be better protected in order to avoid propagation of this degradation.
According to particular characteristics, the characteristic relating to the second block is a motion vector, referred to as the ‘second’, associated with the second block.
By virtue of these provisions, in order to determine the blocks presenting a risk of error propagation, the motion vector of a predicted block is compared with the motion vector of its predictor block in the picture currently being processed. It should be noted that, contrary to the prior art, the blocks considered are not necessarily adjacent.
According to particular characteristics, during the selection step, the first block is selected if the first and second vectors differ by a distance greater than a predetermined value.
This is because, if the values of these motion vectors are different, this indicates the presence of a break in motion and the block then risks being a source of error propagation.
According to particular characteristics, during the selection step, a first block is selected if at least one absolute value of the differences between the components of the first and second vectors is greater than the said predetermined value.
By virtue of these provisions, the determination of the distance is particularly rapid.
According to particular characteristics, during the selection step, a first block is selected according to a number called the ‘state’, which is a function of the state of the block having, in a previous picture, the position of the second block, and incremented according to the second motion vector.
By virtue of these provisions, the risk of unnecessary detections is reduced by awaiting the confirmation, on several successive pictures, of the existence of this risk.
According to particular characteristics, the second block vector is determined as being the previously determined vector associated with the block of the picture whose surface common with the second block is the greatest.
These provisions resolve the case where the first vector projects the first block onto several blocks of the picture.
According to particular characteristics, the characteristic relating to the second block is its position and, during the selection step, the first block is selected when the second block is, at least partially, outside the picture.
In this way propagations of errors relating to extrapolated blocks situated outside the picture are avoided.
According to particular characteristics, during the selection step, a first block is selected according to a number, called the ‘state’, which is a function of the state of the block having, in a previous picture, the position of the first block, and incremented according to the second motion vector.
By virtue of these provisions, the risk of unnecessary detections is reduced by awaiting the confirmation, on several successive pictures, of the existence of this risk.
According to particular characteristics, during the step of coding each selected block, each selected block is encoded, in a manner more robust to transmission errors than the non-selected blocks.
It should be noted that a more robust coding can consist of an INTRA coding, an addition of error correction codes or a multiple transmission of the coded block.
According to particular characteristics, the second coding mode is an INTRA coding mode.
According to particular characteristics, during the selection step the selection of adjacent blocks is prevented.
According to particular characteristics, the second coding mode is an INTER coding mode associated with a level of error correction codes greater than the level used in the first coding mode.
According to particular characteristics, during the selection step, only a number of blocks lower than a predetermined limit value are selected, according to a criterion that is a function of the first and second vectors.
This embodiment is particularly adapted to coding without notification of loss of packets by the decoder.
According to particular characteristics, during the coding step, a block selected with the second mode is coded only if said block is indicated as, at least partially, lost by a decoder.
Each of these provisions makes it possible to reduce the number of blocks coded with the second coding mode, which are generally more expensive in terms of resources or quantity of information remaining after coding.
According to a second aspect, the present invention relates to a video coding method, that comprises:
According to a third aspect, the present invention relates to a video coding device that comprises:
According to a fourth aspect, the present invention relates to a video coding device that comprises:
According to a fifth aspect, the present invention relates to a computer program that can be loaded into a computer system, said program containing instructions for implementing the coding method as succinctly disclosed above.
According to a sixth aspect, the present invention relates to an information medium that can be read by a computer or a microprocessor, removable or not, storing instructions of a computer program, characterized in that it allows the implementation of the coding method as succinctly disclosed above.
The advantages, aims and characteristics of the method according to the second aspect of the present invention, of these devices, of this program and of this information medium being similar to those of the coding method that is the object of the first aspect of the present invention, as succinctly disclosed above, they are not repeated here.
Other advantages, aims and characteristics of the present invention will emerge from the following description given, for an explanatory and in no way limiting purpose, with regard to the accompanying drawings, in which:
Throughout the description, the term ‘error correction’ corresponds to the technique known by the term ‘error concealment’ aimed at restoring a damaged picture block rather than the use of error correction codes integrated in a packet of transmitted data.
As shown in
The device 210 comprises a communication interface 212 connected to a network 213 able to transmit digital data to be processed or conversely to transmit data processed by the device. The device 210 also comprises a storage means 208 such as for example a hard disk. It also comprises a reader 209 for a disk 205. This disk 205 may be a diskette, a CD-ROM or a DVD-ROM for example. The disk 205, like the disk 208, can contain data processed according to the invention as well as the program or programs implementing the invention which, once read by the device 210, will be stored on the hard disk 208. According to a variant, the program enabling the device to implement the invention can be stored in read-only memory 202 (referred to as ROM in the drawing). In a second variant, the program can be received so as to be stored in an identical fashion to that described previously by means of the communication network 213.
This same device has a screen 204 for displaying the data to be processed or serving as an interface with the user, who can thus parameterize certain processing modes, by means of the keyboard 214 or any other means (a mouse for example).
The central unit 200 (referred to as CPU in the drawing) executes the instructions relating to the implementation of the invention, instructions stored in the read-only memory 202 or in the other storage elements. For example, the central unit performs the steps illustrated in
In more general terms, an information storage means that can be read by a computer or a microprocessor, integrated or not into the device, possibly removable, stores a program implementing the video coding method according to the invention.
The communication bus 201 affords communication between the various elements included in the microcomputer 210 or connected to it. The representation of the bus 201 is not limiting and in particular the central unit 200 is able to communicate instructions to any element of the microcomputer 210 directly or by means of another element of the microcomputer 210.
If blocks Bi(t+1) 303 of the following picture are predicted from this region, the error is propagated on these blocks Bi(t+1). This propagation can then extend over a major part 306 of the picture to the nth following picture.
As disclosed with regard to
However, if the corrected region is situated at the boundary F 401 between the homogeneous regions in terms of motions H1 400 and H2 402, the reconstruction error 405 is propagated and risks growing along with the decoded pictures 406. This phenomenon is explained by three reasons:
In summary, we have a boundary region in terms of motion with:
These conditions explain the propagation phenomenon as illustrated on the picture t+n.
The missing regions are therefore corrected by the error correction module 505. The correction at 506 not being perfect, there exists an error that can be propagated to the following pictures depending on the current optical flow, that is to say the vector field. The schema 508 describes the configuration of the vector field prior to the creation of an error propagation source. At a time t, the vector field can be segmented into two regions: the fixed region 509, the motion vectors of which are zero, and the movable region 513, here the visible part of the vehicle 506, the motion vectors of which, issuing from the video encoding, are opposed to the actual movement direction. At the boundary between the vehicle 506 and the wall 511, certain blocks corresponding to the new textures in the ‘overlap’ region of the vehicle 506 are predicted on a part of the wall 511, by means of the motion vectors issuing from the video encoding 510. It will be recalled that an overlap region is a region in which the motion relating to a foreground element, here the vehicle, and a background element, here the wall, causes the appearance of the background element. The wall 511 being fixed, the motion vectors are zero for the blocks that describe it. A reconstruction error on the ‘vehicle’ region 513 therefore remains present at each picture until the next INTRA refreshing. Consequently, as this ‘zero’ movement region serves, over time, as a reference for other blocks, this error is propagated during the decoding of the following pictures of the video in the form of a drag in the direction of the movement of the vehicle.
Through the two examples described in
The same phenomenon could be observed for motions coming from the edges of the picture. In this case, there exists a boundary between a fixed region (the outside of the picture) and a motion region (inside the picture). The blocks inside are predicted on the external blocks ('padding' region used for the motion estimation).
In general terms, the regions of the picture having the characteristics of a vector field leading to the generation of artifacts are mainly the ‘overlap’ regions. An ‘overlap’ region is generally situated at the boundary between two homogeneous motion regions. At this point, new textures appear at each picture.
In the embodiment of the present invention described with regard to the figures, the detection of the potential sources of the generation of artifacts is based on an analysis of motion vectors of each block. This analysis consists of comparing the value of the motion vector of the current block with that of the block pointed to by the current vector. These sources are detected when the video is encoded, which makes it possible to limit any visual impact in the case of loss (for example by putting some of these blocks detected in INTRA mode) or to better protect these regions, as disclosed with regard to
The sources of error propagation are detected in two main steps:
It should be noted that the vector pointed to is the vector associated with the block pointed to by the first vector. Unlike a traditional decoding scheme, the first vector points to a block of the same picture.
Thus the non-zero value of the motion vector associated with the current block is the first condition necessary for detection. This is because, if a block has a zero associated vector, this means that it is not predicted from any adjoining block. Consequently it will not be responsible for any error propagation. It is considered that the block Bi 601 has a non-zero estimated motion vector, denoted 602 in the figure.
The comparison step consists of detecting a motion break at the vector field obtained on encoding of the current picture. For each block selected, including the block Bi 601, a motion vector is associated. Although this vector is deemed to point to the previous picture in order to predict the block, it is considered here that this vector points in the current picture. The block Bi 601 is therefore used to point to a region 608 of the current picture. It is considered that this region 608 also has a motion vector. If the value of this motion vector is different from that of the block Bi 601, this indicates the presence of a break in motion. The block Bi 601 then risks being a source of error propagation.
For the comparison made, it is considered that a vector V1 is different from a vector V2 if a distance, in the mathematical meaning of the term, between the two ends of the vectors is greater than a predetermined value. In one embodiment, the distance considered is the maximum of the absolute values of the differences between the X and Y components, that is to say the greater of abs(V1x−V2x) and abs(V1y−V2y). This maximum is thus compared with a predetermined limit value, or threshold, S whose determination is presented below.
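The comparison described above can be sketched as follows. This is only an illustration: the function names and the representation of a vector as an (x, y) pair are assumptions, not part of the disclosure.

```python
def vector_distance(v1, v2):
    """Maximum of the absolute differences between the X and Y components."""
    return max(abs(v1[0] - v2[0]), abs(v1[1] - v2[1]))


def vectors_differ(v1, v2, threshold):
    """Two vectors are deemed different when their distance exceeds the threshold S."""
    return vector_distance(v1, v2) > threshold
```

Using a maximum of component differences avoids any multiplication or square root, which is consistent with the remark that the determination of the distance is particularly rapid.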
The motion vector associated with the region 608 is determined, in the preferred embodiment of the method that is the object of the present invention, in the following manner, which constitutes a variant embodiment.
The majority of the time, this region 608 covers four blocks B00, B01, B10 and B11. The vector of the one from amongst these blocks whose surface common with the region 608 is greatest is chosen. In our example, among the common surfaces A0, A1, A2 and A3 we have A1>A0>A3>A2. It is therefore the block B01, associated with the motion vector 609, that is pointed to by the vector associated with the block Bi 601. The comparison of the vectors is therefore made between the vector 609 associated with the chosen block B01 and the vector 602 associated with Bi.
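A possible way of choosing the block with the greatest common surface is sketched below. It assumes square blocks of side `size` aligned on a regular grid, integer pixel coordinates, and grid blocks identified by a hypothetical (column, row) index; none of these details is imposed by the description.

```python
def overlap_areas(x, y, size):
    """Return {(col, row): area} for the grid blocks overlapped by the
    size x size region whose top-left corner is at pixel (x, y)."""
    areas = {}
    col0, row0 = x // size, y // size
    for row in (row0, row0 + 1):
        for col in (col0, col0 + 1):
            # Width and height of the intersection with grid block (col, row).
            w = min(x + size, (col + 1) * size) - max(x, col * size)
            h = min(y + size, (row + 1) * size) - max(y, row * size)
            if w > 0 and h > 0:
                areas[(col, row)] = w * h
    return areas


def pointed_block(x, y, size):
    """Grid block sharing the largest surface with the pointed region."""
    areas = overlap_areas(x, y, size)
    return max(areas, key=areas.get)
```

When two blocks share exactly the same surface, this sketch keeps the first one encountered; the description does not specify a tie-breaking rule.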
In other variants, a calculation is made of a motion vector associated with the region 608 by the sum of the coordinates of the vectors associated with the overlapped blocks weighted by the respective standardized areas, as disclosed below.
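The weighted variant can be illustrated as follows. The sketch assumes that the areas are normalized by their sum (which equals the surface of one block when the region lies entirely inside the picture) and that blocks are identified by arbitrary hashable labels; both points are assumptions made for the example.

```python
def weighted_vector(areas, vectors):
    """Area-weighted mean of the vectors of the overlapped blocks.

    areas:   {block: common surface with the pointed region}
    vectors: {block: (vx, vy)}
    """
    total = sum(areas.values())
    vx = sum(a * vectors[b][0] for b, a in areas.items()) / total
    vy = sum(a * vectors[b][1] for b, a in areas.items()) / total
    return (vx, vy)
```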
It should also be noted that, when part of the region 608 is situated outside the picture, then the block 601 is automatically selected as the error source block.
The two-step algorithm described above makes it possible to detect the most important blocks. At the comparison step, two improvements can be made, to reduce the number of ‘false detections’. It should be noted that a false detection is a block whose transmission error causes a weaker visual artifact than a true detection.
The first improvement concerns an integration of the state of each block on a series of pictures. To this end, a state ‘Et’ is defined for each block for a picture at time t. When the block has been detected as a probable error propagation source, the state Et is incremented by the value 1. Otherwise the state Et is equal to 0. Consequently this state Et represents, for a given block, the number of successive pictures for which the block has been detected as a probable propagation source. An integration of the states of each block coupled with a thresholding process, that is to say of comparison with a threshold value, makes it possible to eliminate ‘false detections’. A block is selected if the state of the block exceeds a certain threshold N. This is because, if continuity of motion between two pictures is assumed, it is possible to consider that the configuration of the vector field of the propagation region (that is to say at the boundaries between two homogeneous regions) is constant from one picture to another.
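The integration of the state Et over successive pictures can be sketched as below. The function names are hypothetical, and the sequence of detections is supplied directly rather than computed from a vector field.

```python
def update_state(previous_state, detected):
    """Increment the state when the block is detected, otherwise reset it to 0."""
    return previous_state + 1 if detected else 0


def integrate(detections, n):
    """Return, picture by picture, (state, selected) for one block,
    where the block is selected once its state exceeds the threshold N."""
    state = 0
    history = []
    for detected in detections:
        state = update_state(state, detected)
        history.append((state, state > n))
    return history
```

With N=2, a block must be detected on three successive pictures before being selected, so an isolated detection is discarded.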
The second improvement, more effective but requiring slightly greater calculation times, concerns the monitoring of the states for movable propagation sources. It is disclosed with regard to
In the example in
If another configuration is considered such that the vehicle 706 is placed in front of the wall 711, as in
In order to be able to detect a mobile propagation source, a temporal monitoring of the states of each block is established according to the vector field.
It should be noted that, in variants, a combination of the two improvements detailed above is implemented. For example, at each iteration, the state issuing from the method that is the object of the first improvement is added to the state issuing from the method that is the object of the second improvement in order to form the value of the state to which an incrementation applies if the block is an error propagation source. This makes it possible to limit the risk of not selecting an overlap region.
It should be noted, with regard to the description of
a formula in which Ai represents the surface of the block i and A represents the surface of a block (the surface of the block B01 for example). Once this initial state is propagated, the current state is calculated, as described with regard to
During steps 1006 to 1009, it is determined whether the current block, also referred to hereinafter as the ‘first block’, must be selected, according to:
During step 1006, it is determined whether the motion vectors of the first and second blocks are, between them, at a distance greater than a predetermined distance and/or whether the second block is, at least partially, outside the picture.
Thus the first vector MV, associated with the current block, is compared with the second vector MVp, the vector pointed to. Let S be the difference threshold. As described previously, the distance between the vectors MV and MVp is determined from the differences between their components in x and y by the formula max(abs(MVx−MVpx), abs(MVy−MVpy)). If this value is strictly greater than a limit value S, for example S=2 pixels, it is considered that the two vectors are different.
When the difference between the vectors is less than or equal to the threshold S, the block is inside a homogeneous region in terms of motion and is not to be considered as a propagation source.
The blocks situated on the edges of the picture are processed in a special manner. This is because, when the region pointed to associated with the current block, that is to say the picture of the current block by the translation whose vector is the motion vector of the current block, is at least partially situated outside the picture, the test of difference between associated vector and vector of the block pointed to and the state monitoring using the second blocks becoming, for the following iteration, first blocks, cannot be applied. In this case, the fact that the region pointed to is situated at least partially outside the picture is a sufficient condition for defining this first block as a probable source of propagation. In a variant, it is the fact that the region pointed to is situated mainly outside the picture, in terms of surface area similar to what was disclosed with regard to
It should be stated here that, in coders of the MPEG type, each picture is augmented on its edges by an external region filled, for example, with pixels of a predetermined constant value. Nevertheless, other methods of filling the edge region can be applied. It is considered, by hypothesis, that any extrapolated block belonging to the edge region external to the picture has a zero associated motion vector.
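The test on the position of the region pointed to can be sketched as follows. The sketch assumes that the region pointed to is obtained by adding the motion vector to the top-left corner of the current block, that blocks are square of side `size`, and that coordinates are in integer pixels; these conventions are illustrative only.

```python
def region_outside_picture(x, y, mv, size, width, height):
    """True when the region pointed to by the block at (x, y) with motion
    vector mv lies at least partially outside a width x height picture."""
    px, py = x + mv[0], y + mv[1]
    return px < 0 or py < 0 or px + size > width or py + size > height
```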
In each case where the block is not considered to be a probable source of propagation, the state counter of the block ‘BC’ is reset to zero, during step 1004. Otherwise the state monitoring process described previously is performed during a step 1007 and the state counter is incremented during a step 1008. If the value of the counter is greater than the limit value N, which is determined during a step 1009, the block is considered to be a source of propagation or ‘very important’ block, during a step 1010. Otherwise the following block is passed to during a step 1011, if such remains. When the last block of the picture has been processed, the coding of the picture is carried out, during a step 1012, by coding at least one non-selected block with a first coding mode and coding at least one selected block with a second coding mode different from the first coding mode, as disclosed above.
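The per-picture selection loop can be summarized by the following sketch. It is a simplification under stated assumptions: the block pointed to is supplied by a hypothetical `pointed` mapping (with `None` meaning that the region pointed to is at least partially outside the picture), the state monitoring of step 1007 is omitted, and the coding step itself is not shown.

```python
def select_blocks(vectors, pointed, counters, s, n):
    """Return the set of blocks selected for the current picture.

    vectors:  {block: (vx, vy)} motion vectors of the current picture
    pointed:  {block: block pointed to, or None if outside the picture}
    counters: {block: state counter}, updated in place across pictures
    s, n:     vector difference threshold S and state threshold N
    """
    selected = set()
    for block, mv in vectors.items():
        probable = False
        if mv != (0, 0):  # a zero vector cannot be a propagation source
            target = pointed[block]
            if target is None:
                probable = True
            else:
                mvp = vectors[target]
                probable = max(abs(mv[0] - mvp[0]), abs(mv[1] - mvp[1])) > s
        if probable:
            counters[block] = counters.get(block, 0) + 1
            if counters[block] > n:
                selected.add(block)
        else:
            counters[block] = 0
    return selected
```

The `counters` dictionary is kept by the caller between pictures, so that a block is only selected after the detection has been confirmed on several successive pictures.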
Preferentially, during the step 1012 of coding each selected block, each selected block is encoded in a manner more robust to transmission errors than the non-selected blocks.
According to the embodiment, the second coding mode is an INTRA coding mode or an INTER coding mode associated with an error correcting code rate higher than the rate used in the first coding mode.
Following the coding step 1012, the process stops during a step 1013, until the following picture is ready to be processed by once again implementing the steps illustrated in
In a variant, during step 1010, the selection of adjacent blocks is prevented with regard to coding in INTRA mode. Thus only a subset of the adjacent blocks are able to be selected if coded in INTRA mode. For example, this subset can correspond to the selection, in staggered fashion, of one block out of two, as on a checkerboard. In other variants, in order to limit the number of blocks coded in INTRA mode, a random drawing of the blocks in this subset is carried out, for example in order to select one block out of two which is then coded in INTRA mode. This is because coding all the blocks in the same region in INTRA on several pictures can greatly diminish the effectiveness of the error correction methods based on the use of motion vectors. It is observed that the blocks eliminated by the prohibition on selecting adjacent blocks can, according to variants, be coded with a redundancy rate greater than that used for the blocks that had not initially been selected or be considered to be blocks that were not initially selected.
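The staggered, checkerboard-like selection can be illustrated as follows, assuming that each block is identified by a hypothetical (column, row) grid index and that the `parity` parameter chooses which half of the checkerboard is retained.

```python
def checkerboard_filter(selected, parity=0):
    """Keep only the selected blocks lying on one colour of the checkerboard,
    so that no two retained blocks share an edge."""
    return {(col, row) for (col, row) in selected if (col + row) % 2 == parity}
```

The blocks discarded by this filter can then be handled according to either variant mentioned above, for instance by coding them with a higher redundancy rate.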
The algorithm for detecting the error propagation sources described in previous figures makes it possible to detect regions liable to generate artifacts in the event of loss and poor reconstruction of this region. Coupled with an error resilience method such as INTRA refreshing or an increase in protection against errors on the packets concerned, it makes it possible to reduce or stop the extent of the propagation. Two applications are proposed below:
In the first application, a coding without notification of packet loss is carried out. At the coding 101 (
In the second application, a coding is carried out with notification of packet loss. On coding, the detection of a source of major errors is applied for all the pictures. When a region is detected, the INTRA refreshing is applied to this region only if associated packet losses are indicated to the coder and the number M of blocks able to be coded in INTRA mode so permits. In this case, during the coding step 1012, a block selected with the second coding mode is coded only if said block is indicated as at least partially lost by a decoder. This application makes it possible to reduce the number of blocks coded in INTRA in the case of a very limited value of M. As before, it will however be ensured that not all the blocks in INTRA mode are covered.
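The second application can be sketched as below. The way in which blocks are prioritized when more than M candidates remain is not taken from the description: this sketch simply keeps the first M candidates in the order supplied, which is an assumption.

```python
def blocks_to_refresh(selected, lost, m):
    """Selected blocks that the decoder reported as lost, limited to M blocks.

    selected: ordered list of blocks detected as propagation sources
    lost:     set of blocks indicated as at least partially lost
    m:        maximum number of blocks that may be coded in INTRA mode
    """
    candidates = [block for block in selected if block in lost]
    return candidates[:m]
```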
Number | Date | Country | Kind |
---|---|---|---|
0754620 | Apr 2007 | FR | national |

Filing Document | Filing Date | Country | Kind | 371c Date |
---|---|---|---|---|
PCT/IB08/01242 | 4/17/2008 | WO | 00 | 10/20/2009 |