The present disclosure relates to a video predictive encoding method, device and program, and a video predictive decoding method, device and program, and more particularly, to a description in a buffer for reference pictures to be used in inter-frame predictive encoding.
Compression coding technologies are used for efficient transmission and storage of video data. The techniques defined in MPEG-1 to 4 and ITU (International Telecommunication Union) H.261 to H.264 are commonly used for video data.
Using encoding techniques, a picture which is used as an encoding target is divided into a plurality of blocks and then an encoding process and a decoding process are carried out on a block basis. Predictive encoding methods as described below are used in order to improve encoding efficiency. In intra-frame predictive encoding, a predicted signal is generated using a previously-reproduced neighboring picture signal (a reconstructed signal reconstructed from picture data compressed in the past) present in the same frame as a target block, and then a residual signal obtained by subtracting the predicted signal from a signal of the target block is encoded. In inter-frame predictive encoding, a displacement of signal is searched for with reference to a previously-reproduced picture signal present in a frame different from a target block, a predicted signal is generated with compensation for the displacement, and a residual signal obtained by subtracting the predicted signal from the signal of the target block is encoded. The previously-reproduced picture used for reference for the motion search and compensation is referred to as a reference picture.
In inter-frame predictive encoding, such as, for example, in H.264, the predicted signal for the target block is selected by performing the motion search with reference to a plurality of reference pictures having been encoded and then reproduced in the past, and defining a picture signal with the smallest error as an optimum predicted signal. A difference is calculated between the pixel signal of the target block and this optimum predicted signal, and the difference is then subjected to a discrete cosine transform, quantization, and entropy encoding. At substantially the same time, information about the reference picture from which the optimum predicted signal for the target block is derived (which will be referred to as “reference index”) and information about the region of the reference picture from which the optimum predicted signal is derived (which will be referred to as “motion vector”) are also encoded. In H.264, for example, reproduced pictures are stored as four to five reference pictures in a frame memory or reproduced picture buffer (or decoded picture buffer, which can also be referred to as “DPB”).
A general method for management of a plurality of reference pictures is a technique of releasing, from the buffer, a region occupied by the oldest reference picture (i.e., a picture having been stored in the buffer for the longest time) out of a plurality of reproduced pictures, and storing a reproduced picture having been decoded last, as a reference picture. On the other hand, a reference picture management method, such as the example method described in Rickard Sjoberg, Jonatan Samuelsson, “Absolute signaling of reference pictures,” Joint Collaborative Team on Video Coding, JCTVC-F493, Torino, 2011 may be used to flexibly prepare optimum reference pictures for a target picture, in order to enhance efficiency of inter-frame prediction.
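The general method described above, in which the oldest reproduced picture is released to make room for the most recently decoded one, can be sketched as follows. This is only a minimal illustration: the capacity of four pictures and the integer picture labels (given in decoding order) are assumptions made for the example.

```python
from collections import deque

def store_reproduced_picture(buffer, picture_id):
    """Store the picture decoded last; a full deque drops its oldest entry."""
    buffer.append(picture_id)
    return buffer

# Hypothetical capacity of four reference pictures (assumption).
dpb = deque(maxlen=4)
for picture_id in [0, 1, 2, 3, 4, 5]:   # hypothetical labels in decoding order
    store_reproduced_picture(dpb, picture_id)

print(list(dpb))  # the two oldest pictures have been released
```

Because the eviction order is fixed, this approach cannot keep an older picture that later pictures still need, which is the limitation the flexible management method addresses.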
Buffer description information to describe a plurality of reference pictures to be stored in the buffer can be added to encoded data of each target picture, and can then be encoded, such as in an example described by Rickard Sjoberg, Jonatan Samuelsson, “Absolute signaling of reference pictures,” Joint Collaborative Team on Video Coding, JCTVC-F493, Torino, 2011. Identifiers of the reference pictures necessary for processing (encoding or decoding) of the target picture and subsequent pictures can be described in this buffer description information. In an encoding device or a decoding device, the buffer can be managed so that designated reproduced pictures are stored in the buffer (frame memory), in accordance with the buffer description information. On the other hand, any reproduced picture not designated can be deleted from the buffer.
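As a rough sketch of buffer management driven by buffer description information, the function below keeps only the designated reproduced pictures and deletes the rest. The use of POC numbers as picture identifiers, the dictionary representing the frame memory, and the function name are assumptions chosen for illustration.

```python
def apply_buffer_description(frame_memory, designated_ids):
    """Keep designated pictures; return (kept pictures, deleted identifiers)."""
    kept = {pid: pic for pid, pic in frame_memory.items() if pid in designated_ids}
    deleted = sorted(set(frame_memory) - set(kept))
    return kept, deleted

# Hypothetical buffer contents keyed by an identifier (here, a POC number).
frame_memory = {18: "pic18", 20: "pic20", 22: "pic22", 24: "pic24"}
kept, deleted = apply_buffer_description(frame_memory, {20, 22, 24})

print(sorted(kept))  # identifiers still held in the buffer
print(deleted)       # identifiers not designated and therefore deleted
```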
The buffer description information about each target picture may be sent by being added to the header of compressed data of each target picture, or pieces of buffer description information about a plurality of target pictures may be sent together as part of PPS (picture parameter set) information carrying parameters of the decoding process applied in common.
In video encoding and decoding, reference can be made to an identical picture by a plurality of target pictures. In other words, the same reference picture can be used multiple times (repeatedly). It is seen from
In the buffer description information based on an example of conventional technology, however, ΔPOCk,j is independently determined in each BD[k], and for this reason, even for the same reference picture, ΔPOCk,j thereof is described in each BD[k]; therefore, the same information must be repeatedly transmitted and received, in spite of it being the same as previously transmitted and received information. This will be explained using the example of
A video predictive coding system includes a video predictive encoding device comprising: input means which implements input of a plurality of pictures constituting a video sequence; encoding means which conducts predictive coding of a target picture to generate compressed picture data, using, as reference pictures, a plurality of pictures which have been encoded and then decoded and reproduced in the past; reconstruction means which decodes the compressed picture data to reconstruct a reproduced picture; picture storage means which stores at least one aforementioned reproduced picture as a reference picture to be used for encoding of a subsequent picture; and buffer management means which controls the picture storage means, wherein, prior to processing of the target picture, the buffer management means controls the picture storage means on the basis of buffer description information BD[k] relating to a plurality of reference pictures to be used in predictive encoding of the target picture and, at substantially the same time, the buffer management means encodes the buffer description information BD[k], with reference to buffer description information BD[m] for another picture different from the target picture, and thereafter adds the encoded data thereof to the compressed picture data.
Furthermore, the video predictive coding system includes a video predictive decoding device comprising: input means which implements input of compressed picture data for each of a plurality of pictures constituting a video sequence, the compressed picture data containing data resulting from predictive coding using a plurality of reference pictures, which have been decoded and reproduced in the past, and encoded data of buffer description information BD[k] related to the plurality of reference pictures; reconstruction means which decodes the compressed picture data to reconstruct a reproduced picture; picture storage means which stores at least one aforementioned reproduced picture as a reference picture to be used for decoding of a subsequent picture; and buffer management means which controls the picture storage means, wherein, prior to reconstruction of the reproduced picture, the buffer management means decodes the encoded data of the buffer description information BD[k] for the reproduced picture, with reference to buffer description information BD[m] for another picture different from the reproduced picture, and then controls the picture storage means on the basis of the decoded buffer description information BD[k].
The encoding and decoding methods of the buffer description information according to the video predictive coding system make use of the property of repeatedly using the same reference picture in the predictive encoding and decoding processes for a plurality of pictures, so as to use the correlation between pieces of buffer description information BD[k] used for different pictures, in order to reduce redundant information, thereby achieving the effect of efficient encoding of the buffer description information. In addition, the information specific to each reference picture (dependence information) can be the same as that of the referenced picture and therefore the information can be inherited as it is, thereby achieving the advantage of no need for encoding and decoding it again.
Other systems, methods, features and advantages will be, or will become, apparent to one with skill in the art upon examination of the following figures and detailed description. It is intended that all such additional systems, methods, features and advantages be included within this description, be within the scope of the disclosure, and be protected by the following claims.
The video predictive coding system may be better understood with reference to the following drawings and description. The components in the figures are not necessarily to scale, emphasis instead being placed upon illustrating the principles of the system. Moreover, in the figures, like reference numerals designate corresponding parts throughout the different views.
Embodiments of the video predictive coding system will be described below using
An example of the operation of the video predictive encoding device 100 will be described. A video signal consisting of a plurality of pictures can be fed to the input terminal 101. A picture of an encoding target is divided into a plurality of regions by the block division unit 102. In an embodiment, the target picture is divided into blocks each consisting of 8×8 pixels, but it may be divided into blocks of any size or shape other than the foregoing in other embodiments. A predicted signal is then generated for a region as a target of an encoding process (which will be referred to hereinafter as a target block). The embodiment can employ two types of prediction methods, the inter-frame prediction and the intra-frame prediction.
In an example of the inter-frame prediction, reproduced pictures having been encoded and thereafter reconstructed in the past are used as reference pictures, and motion information that provides the predicted signal with the smallest difference from the target block is determined from the reference pictures. Depending upon the situation, it is also allowable to subdivide the target block into sub-regions and determine an inter-frame prediction method for each of the sub-regions. In this case, an efficient division method for the entire target block and the motion information of each sub-region can be determined from among various division methods. In an embodiment, the operation is carried out in the predicted signal generation unit 103, the target block is fed via line L102, and the reference pictures are fed via L104. The reference pictures to be used herein are a plurality of pictures which have been encoded and reconstructed in the past. The details of the reconstruction and encoding are, for example, similar to the method of H.264, which is an example of conventional technology. The motion information and sub-region division method determined as described above are fed via line L112 to the entropy encoding unit 111 to be encoded thereby and then the encoded data is output from the output terminal 112. Information (reference index) indicative of the reference picture, from among the plurality of reference pictures, from which the predicted signal is derived is also sent via line L112 to the entropy encoding unit 111. In an embodiment, three to six reproduced pictures are stored in the frame memory 104 to be used as reference pictures. The predicted signal generation unit 103 derives reference picture signals from the frame memory 104, based on the reference pictures and motion information, which correspond to the sub-region division method and each sub-region, and generates the predicted signal.
The inter-frame predicted signal generated in this manner is fed via line L103 to the subtraction unit 105.
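The motion search over a plurality of reference pictures can be illustrated with a small exhaustive block-matching sketch. The sum of absolute differences (SAD) as the error measure, the square search window, and all function names are assumptions for illustration; the description only requires that the predicted signal with the smallest difference be found.

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(a - b)
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))

def get_block(frame, top, left, size):
    """Cut a size x size block out of a frame stored as a list of rows."""
    return [row[left:left + size] for row in frame[top:top + size]]

def motion_search(target_block, references, top, left, size, search_range):
    """Return (error, reference index, (dy, dx)) with the smallest SAD."""
    best = None
    for ref_idx, ref in enumerate(references):
        height, width = len(ref), len(ref[0])
        for dy in range(-search_range, search_range + 1):
            for dx in range(-search_range, search_range + 1):
                y, x = top + dy, left + dx
                if y < 0 or x < 0 or y + size > height or x + size > width:
                    continue  # candidate block falls outside the picture
                err = sad(target_block, get_block(ref, y, x, size))
                if best is None or err < best[0]:
                    best = (err, ref_idx, (dy, dx))
    return best

# Two hypothetical 6x6 reference pictures; only the second holds the pattern.
ref0 = [[0] * 6 for _ in range(6)]
ref1 = [[0] * 6 for _ in range(6)]
ref1[3][2], ref1[3][3], ref1[4][2], ref1[4][3] = 9, 8, 7, 6
target = [[9, 8], [7, 6]]   # target block located at row 2, column 2

result = motion_search(target, [ref0, ref1], top=2, left=2, size=2, search_range=1)
print(result)  # (error, reference index, motion vector)
```

The second and third elements of the result correspond to the reference index and the motion vector that are sent to the entropy encoding unit.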
In the intra-frame prediction, an intra-frame predicted signal is generated using previously-reproduced pixel values spatially adjacent to the target block. Specifically, the predicted signal generation unit 103 derives previously-reproduced pixel signals in the same frame as the target block from the frame memory 104 and extrapolates these signals to generate the intra-frame predicted signal. The information about the method of extrapolation is fed via line L112 to the entropy encoding unit 111 to be encoded thereby and then the encoded data is output from the output terminal 112. The intra-frame predicted signal generated in this manner is fed to the subtraction unit 105. The method of generating the intra-frame predicted signal in the predicted signal generation unit 103 can be, for example, similar to the method of H.264, which is an example of conventional technology. The predicted signal with the smallest difference is selected from the inter-frame predicted signal and the intra-frame predicted signal obtained as described above, and the selected predicted signal is fed to the subtraction unit 105.
The subtraction unit 105 subtracts the predicted signal (fed via line L103) from the signal of the target block (fed via line L102) to generate a residual signal. This residual signal is transformed by a discrete cosine transform by the transform unit 106 and the resulting transform coefficients are quantized by the quantization unit 107. Finally, the entropy encoding unit 111 encodes the quantized transform coefficients and the encoded data is output along with the information about the prediction method from the output terminal 112.
For the intra-frame prediction or the inter-frame prediction of the subsequent target block, the compressed signal of the target block is subjected to inverse processing to be reconstructed. For example, the quantized transform coefficients are inversely quantized by the inverse quantization unit 108 and then transformed by an inverse discrete cosine transform by the inverse transform unit 109, to reconstruct a residual signal. The addition unit 110 adds the reconstructed residual signal to the predicted signal fed via line L103 to reproduce a signal of the target block and the reproduced signal is stored in the frame memory 104. The present embodiment employs the transform unit 106 and the inverse transform unit 109, but it is also possible to use other transform processing instead of these transform units. In some situations, the transform unit 106 and the inverse transform unit 109 may be omitted.
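The path from residual generation to local reconstruction can be sketched as below. The sketch is a simplification under stated assumptions: the transform is omitted (which the description permits in some situations), entropy encoding is left out, the block is a flat list of pixel values, and the uniform quantization step of 4 is hypothetical.

```python
def encode_block(target, predicted, step):
    """Subtract the predicted signal and quantize the residual."""
    return [round((t - p) / step) for t, p in zip(target, predicted)]

def reconstruct_block(levels, predicted, step):
    """Inverse-quantize the levels and add the predicted signal back."""
    return [p + q * step for q, p in zip(levels, predicted)]

target = [104, 99, 87, 121]       # hypothetical pixel values of the target block
predicted = [100, 100, 90, 110]   # hypothetical predicted signal
step = 4                          # hypothetical quantization step

levels = encode_block(target, predicted, step)
reproduced = reconstruct_block(levels, predicted, step)

print(levels)      # quantized residual passed on for entropy encoding
print(reproduced)  # reproduced signal stored in the frame memory
```

The encoder stores the reproduced signal rather than the original so that its reference pictures match those the decoder can rebuild from the same quantized data.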
The frame memory 104 is a finite storage and cannot hold all reproduced pictures. Accordingly, only reproduced pictures to be used in encoding of subsequent pictures are stored in the frame memory 104. A unit to control this frame memory 104 is the buffer management unit 114. Input data which is received through an input terminal 113 includes: information indicative of an output order of each picture (POC, picture output count), dependence information (dependency ID) related to D_IDk,j which is indicative of dependence on the picture in predictive encoding of other pictures, and a type of encoding of the picture (intra-frame predictive encoding or inter-frame predictive encoding); and the buffer management unit 114 operates based on this information. Buffer description information generated by the buffer management unit 114 and the POC information of each picture are fed via line L114 to the entropy encoding unit 111 to be encoded thereby, and the encoded data is output together with the compressed picture data. The processing method of the buffer management unit 114 will be described later.
Next, a video predictive decoding method of the predictive video coding system will be described.
Concerning the video predictive decoding device 200 configured as described above, an example of the operation thereof will be described below. Compressed data resulting from compression encoding by the aforementioned method is input through the input terminal 201. This compressed data contains the residual signal resulting from predictive encoding of each target block obtained by division of a picture into a plurality of blocks, and the information related to the generation of the predicted signal. The information related to the generation of the predicted signal includes the information about block division (size of block), the motion information, and the aforementioned POC information in the case of the inter-frame prediction, and includes the information about the extrapolation method from previously-reproduced surrounding pixels in the case of the intra-frame prediction. The compressed data also contains the buffer description information for control of the frame memory 207.
The data analysis unit 202 extracts the residual signal of the target block, the information related to the generation of the predicted signal, the quantization parameter, and the POC information of the picture from the compressed data. The residual signal of the target block is inversely quantized on the basis of the quantization parameter (fed via line L202) by the inverse quantization unit 203. The result is transformed by the inverse transform unit 204 using an inverse discrete cosine transform.
Next, the information related to the generation of the predicted signal is fed via line L206b to the predicted signal generation unit 208. The predicted signal generation unit 208 accesses the frame memory 207, based on the information related to the generation of the predicted signal, to derive a reference signal from a plurality of reference pictures to generate a predicted signal. This predicted signal is fed via line L208 to the addition unit 205, the addition unit 205 adds this predicted signal to the reconstructed residual signal to reproduce a target block signal, and the signal is output via line L205 and simultaneously stored into the frame memory 207.
Reproduced pictures to be used for decoding and reproduction of the subsequent picture are stored in the frame memory 207. The buffer management unit 209 controls the frame memory 207. The buffer management unit 209 operates based on the buffer description information and the picture encoding type fed via line L206a. A control method of the buffer management unit 209 according to embodiments of the predictive video coding system will be described later.
Next, example operations of the buffer management unit (114 in
In
Subsequently, information about the reference pictures to be used (831, 832, . . . ) is described. In the present embodiment {ΔPOC0,i, D_ID0,i} is described as the information about the reference pictures. The index i represents the i-th component of BD[0]. ΔPOC0,i is a difference value between a POC number of the i-th reference picture and a POC number of the target picture that uses BD[0], and D_ID0,i is dependence information of the i-th reference picture.
The information about BD[k] except for BD[0] is predictively encoded with reference to the buffer information BD[m] appearing before it (step 360). The present embodiment employs m=k−1, but reference can be made to any BD[m] as long as m<k. The information contained in BD[k] where k>0 is exemplified by 822 and 824 in
The buffer description (BD[k], k>0) shown in
(a) At least some of the reference pictures described in BD[k] are those already described in BD[m].
(b) N pictures which are newly encoded or decoded in addition to those in (a) (above) are described as “additional reference pictures” in BD[k]. The number N herein is an integer of not less than 0.
Furthermore, more preferred modes satisfy the following conditions.
(c) m=(k−1); that is, the immediately previous BD in the buffer description information is used for the prediction.
(d) The number of additional reference pictures described in above (b) is only one (N=1). This one additional reference picture is preferably a picture generated in the process using BD[m].
The above-described conditions will be described using the example of
The information about the reference pictures used for encoding or decoding/reproduction of the target picture (1610) with POC=32 is encoded as BD[0] using the syntax of 820 in
The information about the reference pictures described in rows 1611 to 1617 in
The predictive encoding method of buffer description information will be described. Let BD[k] be the buffer description information as a target and BD[m] be the buffer description information for the prediction of BD[k]. Furthermore, let POCcurrent be the POC number of the target picture using the information of BD[k] and POCprevious be the POC number of the target picture using the information of BD[m]. In addition, let POCk,i be the POC number of the i-th reference picture of BD[k] and POCm,j be the POC number of the j-th reference picture of BD[m]. In this case the difference values ΔPOCk,i and ΔPOCm,j are given as follows.
ΔPOCk,i=POCk,i−POCcurrent (1)
ΔPOCm,j=POCm,j−POCprevious (2)
ΔPOCk,i is encoded using ΔPOCm,j as a predictive value. For example, the following relation holds.
ΔPOCk,i−ΔPOCm,j=(POCk,i−POCcurrent)−(POCm,j−POCprevious)=(POCk,i−POCm,j)+(POCprevious−POCcurrent)=(POCk,i−POCm,j)+ΔBDk (3)
When the aforementioned condition (a) is satisfied, POCm,j is in BD[m] and, therefore, an identifier (or index) of the ΔPOCm,j that makes (POCk,i−POCm,j) zero is encoded. In the present embodiment, the identifier Δidxk,i defined below is used.
Δidxk,i=offsetk,i−offsetk,i−1 (4)
In this case, offsetk,i=j−i and offsetk,−1=0. Since ΔBDk defined in above formula (3) is constant irrespective of the values of (i, j), it is only necessary to describe ΔBDk defined below, once in BD[k].
ΔBDk=POCprevious−POCcurrent (5)
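Formulas (1), (2), (3) and (5) can be checked with a short numerical sketch. The POC numbers below (a previous target picture with POC 32, a current target picture with POC 28, and a shared reference picture with POC 22) are hypothetical values picked for illustration.

```python
def delta_poc(poc_reference, poc_target):
    """Formulas (1) and (2): POC of the reference minus POC of the target."""
    return poc_reference - poc_target

poc_previous = 32   # hypothetical target picture that uses BD[m]
poc_current = 28    # hypothetical target picture that uses BD[k]
poc_shared = 22     # hypothetical reference picture listed in both

delta_m = delta_poc(poc_shared, poc_previous)   # formula (2)
delta_k = delta_poc(poc_shared, poc_current)    # formula (1)
delta_bd = poc_previous - poc_current           # formula (5)

# Formula (3) with POCk,i == POCm,j: the difference reduces to delta_bd.
print(delta_k - delta_m, delta_bd)

# If the previous target picture itself becomes a reference picture,
# its delta relative to the current target picture equals delta_bd.
print(delta_poc(poc_previous, poc_current))
```

Since the difference is the same for every shared reference picture, a single ΔBDk per BD[k] suffices, and only the position of the matching component in BD[m] needs to be signaled.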
On the other hand, there is a situation where ΔPOCm,j to make (POCk,i−POCm,j) zero, is absent in BD[m]. For example, the component POC1,2=32 (cell 1620) in
As for the dependence information D_IDk,i which each reference picture has, if the reference picture exists in BD[m] used for the prediction, there is no need for encoding thereof because the dependence information D_IDk,i is equal to D_IDm,j. On the other hand, if the reference picture does not exist in the BD[m] which is used for the prediction, the dependence information D_IDk,i is encoded.
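The encoding side of this scheme can be sketched as follows. The representation of each buffer description as a list of (ΔPOC, D_ID) pairs, the function name, and the sample values are hypothetical. The sketch also assumes that an entry absent from BD[m] is signaled by an out-of-range index j equal to the number of components of BD[m], and that such an entry has ΔPOCk,i equal to ΔBDk, so its ΔPOC is not transmitted.

```python
def encode_bd(bd_k, bd_m, delta_bd):
    """Return one (delta_idx, d_id or None) symbol per component of BD[k]."""
    deltas_m = [delta for delta, _ in bd_m]
    symbols = []
    prev_offset = 0                       # offset(k, -1) = 0
    for i, (delta_k, d_id) in enumerate(bd_k):
        wanted = delta_k - delta_bd       # formula (3) with POCk,i == POCm,j
        if wanted in deltas_m:
            j = deltas_m.index(wanted)
            explicit_d_id = None          # inherited from BD[m], not encoded
        else:
            j = len(bd_m)                 # assumption: out-of-range j marks "absent"
            explicit_d_id = d_id          # dependence information is encoded
        offset = j - i
        symbols.append((offset - prev_offset, explicit_d_id))   # formula (4)
        prev_offset = offset
    return symbols

# Hypothetical BD[m] for a target with POC 32 and BD[k] for a target with POC 28.
bd_m = [(-14, 1), (-12, 2), (-10, 1), (-8, 3)]
bd_k = [(-6, 1), (-4, 3), (4, 2)]
symbols = encode_bd(bd_k, bd_m, delta_bd=4)
print(symbols)
```

In this example only the first Δidx is nonzero, because the shared reference pictures appear in the same order in both buffer descriptions.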
The contents (syntaxes) of 822, 824 in
j=i+Δidxk,i+offsetk,i−1, where offsetk,−1=0 (6)
Using this index j, it is determined in step 750 whether ΔPOCm,j as a reference value of ΔPOCk,i of a decoding target is present in BD[m]. If j<the number (#ΔPOCm) of components of BD[m], ΔPOCm,j is present; if j≥(#ΔPOCm), ΔPOCm,j is absent. When it is determined in step 750 that it is present, the processing proceeds to step 760 to determine the value of ΔPOCk,i. The dependence information D_IDk,i is simply a copy of the dependence information D_IDm,j associated with ΔPOCm,j, so there is no need for encoding of the dependence information D_IDk,i. When it is determined in step 750 that it is absent, the processing proceeds to step 765, in which the dependence information D_IDk,i is decoded; then, in step 770, ΔBDk is substituted for the value of ΔPOCk,i. The above processing is repeatedly carried out up to the last component of BD[k].
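The decoding loop described above can be sketched as follows. The (ΔPOC, D_ID) pair representation, the symbol format (Δidx plus an optional explicitly coded D_ID), and the sample values are hypothetical. Step 760 is assumed to compute ΔPOCk,i as ΔPOCm,j+ΔBDk, which follows from formula (3) when both entries refer to the same picture.

```python
def decode_bd(symbols, bd_m, delta_bd):
    """Rebuild BD[k] as (delta_poc, d_id) pairs from decoded symbols and BD[m]."""
    bd_k = []
    prev_offset = 0                               # offset(k, -1) = 0
    for i, (delta_idx, explicit_d_id) in enumerate(symbols):
        j = i + delta_idx + prev_offset           # formula (6)
        if j < len(bd_m):                         # step 750: present in BD[m]
            delta_m, d_id_m = bd_m[j]
            bd_k.append((delta_m + delta_bd, d_id_m))   # step 760, D_ID copied
        else:                                     # absent from BD[m]
            bd_k.append((delta_bd, explicit_d_id))      # steps 765 and 770
        prev_offset = j - i                       # offset(k, i) = j - i
    return bd_k

# Hypothetical decoded symbols and reference buffer description BD[m].
bd_m = [(-14, 1), (-12, 2), (-10, 1), (-8, 3)]
symbols = [(2, None), (0, None), (0, 2)]
decoded = decode_bd(symbols, bd_m, delta_bd=4)
print(decoded)
```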
As described above, the encoding and decoding methods of buffer description information make use of the property of repetitive use of reference pictures and make use of the correlation between pieces of buffer description information BD[k] used for different pictures, to compact or eliminate redundant information, thereby achieving the efficient encoding of buffer description information.
As shown in the example of
In the example of
In order to solve this problem, the aforementioned condition (c) is relieved so as to allow free selection of BD[m] and, in turn, an index m to identify BD[m] used for the prediction is encoded. In that case, when the buffer description information in row 914 is used as BD[m] for the prediction of the buffer description information BD[k] in row 913,
As another method, it is also possible to adopt a method of encoding the POC number ΔPOCk,i in aforementioned formula (1) as it is, for an additional reference picture absent in BD[m] used for the prediction, or, to adopt a method of encoding a difference between ΔPOCk,i and ΔBDk as IBDRk,i.
IBDRk,i=ΔPOCk,i−ΔBDk (7)
When the above formula (7) is expanded, it is equal to (POCk,i−POCprevious).
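Formula (7) and its expansion can be checked numerically. The POC numbers are hypothetical, and the additional reference picture (POC 30) is deliberately chosen so that it is not the previous target picture.

```python
def encode_ibdr(delta_poc_k, delta_bd):
    """Formula (7): IBDR = delta POC of the reference minus delta BD."""
    return delta_poc_k - delta_bd

def decode_ibdr(ibdr, delta_bd):
    """Inverse of formula (7): recover the delta POC from IBDR."""
    return ibdr + delta_bd

poc_previous, poc_current = 32, 28      # hypothetical target pictures
poc_additional = 30                     # hypothetical additional reference picture

delta_bd = poc_previous - poc_current           # formula (5)
delta_poc_k = poc_additional - poc_current      # formula (1)
ibdr = encode_ibdr(delta_poc_k, delta_bd)

print(ibdr, poc_additional - poc_previous)   # both sides of the expansion
print(decode_ibdr(ibdr, delta_bd))           # recovered delta POC
```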
The buffer description information shown in
Using this index j, it is determined in step 1150 whether ΔPOCm,j as a reference value of ΔPOCk,i of a decoding target is present in BD[m]. In this example, if j<the number (#ΔPOCm) of components of BD[m], ΔPOCm,j is present; if j≥(#ΔPOCm), ΔPOCm,j is absent. When it is determined in step 1150 that it is present, the processing proceeds to step 1160 to determine the value of ΔPOCk,i. The dependence information D_IDk,i can be simply a copy of the dependence information D_IDm,j associated with ΔPOCm,j. When it is determined in step 1150 that it is absent, the processing proceeds to step 1165, in which IBDRk,i and the dependence information D_IDk,i are decoded; the value of ΔPOCk,i is then calculated in step 1170. The foregoing processing is repeatedly carried out up to the last component of BD[k].
As described above, the encoding and decoding methods of buffer description information according to the predictive video coding system make use of the property of repetitive use of reference pictures and make use of the correlation between pieces of buffer description information BD[k] used for different pictures, so as to compact redundant information, thereby enabling the efficient encoding of buffer description information. In addition, there is the effect of efficient encoding even in the case where cross reference to buffer description information is freely made.
The example encoding processes of
Since the values of Δidxk,i all are zero as seen in rows 512, 513, 514, and 517 in
In the above embodiments, the POC number of each reference picture described in the buffer description information is converted into ΔPOCk,i and then the buffer description information is encoded and decoded, but the method may be applied to the POC number itself. For example, when the POC number in the buffer description information BD[k] as a target is present in BD[m] used for the prediction, Δidxk,i indicating the POC number is encoded. When the desired POC number is absent in BD[m], ΔPOCk,i obtained by the aforementioned formula (1) is encoded as IBDRk,i. Formula (7) may be used instead of the aforementioned formula (1). In this case the process of block 360 in
In the above embodiments, when bdk,i represents the i-th component of the buffer description BD[k] as a target and bdm,j a component of BD[m] used for the prediction, Δidxk,i can be considered to be a relative position (index or address) of bdm,j from bdk,i. For example, supposing that bdk,i and bdm,j are information storage places, their POC numbers may be stored in the information storage places or values of ΔPOC may be stored therein. In this case, Δidxk,i is treated as a relative position between the information storage places (provided that their contents include the POC numbers used in common). In other words, the buffer description is a description of the positional relationship between the information storage place for storage of the buffer information of the target picture and the information storage place for storage of the buffer information as a reference for the target picture, and provides a switching method for reproduction methods of the contents of bdk,i by comparing the position (j) of the designated information storage place with the number (#ΔPOCm or #POCm) of information storage places containing their contents.
Another embodiment as described below is also applicable to the encoding and decoding methods of buffer description information of the predictive video coding system. The present embodiment is based on the aforementioned conditions (c) and (d), similar to the embodiment shown in
Under these conditions, in encoding the information of the buffer description BD[k] as a target, the present embodiment determines, for each ΔPOCm,j in BD[m] used for the prediction, whether a component ΔPOCk,i of BD[k] sharing an identical reference picture with it (i.e., POCm,j=POCk,i) is “present or not.” Whereas the aforementioned embodiment employed the “relative position Δidxk,i,” the present embodiment employs a flag simply indicative of “present or not.” This flag is described as ibd_flagk,j herein. When the flag ibd_flagk,j indicates “present,” the j-th picture already stored in the buffer is continuously used as a reference picture. On the other hand, when the flag ibd_flagk,j indicates “not,” another designated picture is stored as a new reference picture (additional reference picture) into the buffer.
Under the conditions (c) and (d), the number of components of BD[k] is at most one larger than the number of components of BD[m]; i.e., the relation of #ΔPOCk≤#ΔPOCm+1 is always met, and therefore there is no need for transmission of #ΔPOCk. For this reason, the present embodiment can further reduce the bit count.
Next, steps 2240 to 2265 check each of the components of BD[m], as many as the number of ΔPOCm. When the condition of step 2245 is satisfied, the processing proceeds to step 2250; otherwise, the processing proceeds to step 2260. The condition of step 2245 is given by formula (3) and corresponds to the case of (POCk,i=POCm,j). Step 2250 encodes ibd_flagk,j of 1 to indicate that the condition is met, or “present,” and at substantially the same time increments the counter i of BD[k]. On the other hand, step 2260 encodes ibd_flagk,j of 0 to indicate that the condition is “not” met. Step 2265 increments the counter j in order to check the next component of BD[m].
When the condition of step 2240 is not satisfied, i.e., when the check is completed for all the components of BD[m], the processing proceeds to step 2270. This step is to compare the number of ΔPOCk with the counter i of buffer description information BD[k] as a target. Since the counter i of BD[k] starts counting from 0, its maximum is (the number of ΔPOCk−1). If the condition of (i=the number of ΔPOCk) is satisfied in step 2270, the counter i exceeds the number of components of BD[k] and ibd_flagk,j is set to 0 to be encoded, followed by end of processing. On the other hand, if the condition of (i=the number of ΔPOCk) is not satisfied in step 2270, it is meant thereby that an additional reference picture absent in BD[m] is stored into the buffer. For encoding information about it, step 2290 is to encode ibd_flagk,j of 1 and step 2295 is to encode the dependence information D_IDk,i of the additional reference picture. Since the value of ΔPOCk,i of the additional reference picture is ΔBDk as described with
The information contained in BD[k] in the case of k>0 is exemplified by 2422 and 2424 in
Step 2345 is to judge the counter j of BD[m]. Before the counter j reaches the number of ΔPOCm, whether ΔPOCk,i is to be reconstructed using ΔPOCm,j is determined, based on the value of ibd_flagk,j (1 or 0) (step 2350). When the value of ibd_flagk,j is 1, step 2355 is carried out to add ΔBDk to ΔPOCm,j to generate ΔPOCk,i. In this case, ΔPOCk,i and ΔPOCm,j share the same reference picture (POCm,j=POCk,i), and therefore the dependence information D_IDk,i can be simply a copy of the dependence information D_IDm,j related to ΔPOCm,j. Next, the counter i of BD[k] is given an increment and then a determination on the next component of BD[m] is made.
After the check is completed up to the last component of BD[m] (or when step 2345 results in NO), the value of the last ibd_flagk,j is checked (step 2370). When ibd_flagk,j=0, there is no additional reference picture, and the flow goes to step 2390 described below without any further processing. On the other hand, when ibd_flagk,j=1, there is an additional reference picture (which is absent from BD[m]), and step 2375 is carried out to reconstruct the dependence information D_IDk,i. Step 2380 uses ΔBDk as the POC number of the additional reference picture (because the condition (d) is applied). Furthermore, the counter i of BD[k] is incremented. Finally, the value counted by the counter i is stored as the number of components of BD[k] (step 2390). This number is used for generation of each component of BD[k+1] (in step 2310).
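The encoding and decoding flows described above can be illustrated with a minimal Python sketch. It is not the implementation of the embodiments: the names `Entry`, `encode_bd`, and `decode_bd` are hypothetical, the encoded data is modeled as a plain list of symbols rather than an entropy-coded bit stream, and the match test of step 2245 is assumed to be ΔPOCk,i = ΔPOCm,j + ΔBDk, inferred from the reconstruction in step 2355. The sketch further assumes that the components of BD[k] shared with BD[m] appear in the same order as in BD[m] and that at most one additional reference picture, with ΔPOC equal to ΔBDk, comes last.

```python
from dataclasses import dataclass
from typing import List, Tuple

Symbol = Tuple[str, int]  # (symbol name, value); stands in for the bit stream


@dataclass
class Entry:
    delta_poc: int  # ΔPOC of one reference picture
    d_id: int       # dependence information D_ID


def encode_bd(bd_k: List[Entry], bd_m: List[Entry], delta_bd: int) -> List[Symbol]:
    """Encode BD[k] by prediction from BD[m] (steps 2240 to 2295)."""
    symbols: List[Symbol] = []
    i = 0  # counter i of BD[k]
    for ref in bd_m:  # counter j of BD[m]
        # Assumed form of the step 2245 test: same reference picture.
        if i < len(bd_k) and bd_k[i].delta_poc == ref.delta_poc + delta_bd:
            symbols.append(("ibd_flag", 1))  # step 2250
            i += 1
        else:
            symbols.append(("ibd_flag", 0))  # step 2260
    if i == len(bd_k):  # step 2270: no additional reference picture
        symbols.append(("ibd_flag", 0))
        return symbols
    # The sketch supports exactly one additional picture whose ΔPOC is ΔBDk.
    if i + 1 != len(bd_k) or bd_k[i].delta_poc != delta_bd:
        raise ValueError("BD[k] cannot be predicted from BD[m] in this sketch")
    symbols.append(("ibd_flag", 1))        # step 2290
    symbols.append(("D_ID", bd_k[i].d_id))  # step 2295
    return symbols


def decode_bd(symbols: List[Symbol], bd_m: List[Entry], delta_bd: int) -> List[Entry]:
    """Reconstruct BD[k] from the symbols and BD[m] (steps 2345 to 2390)."""
    it = iter(symbols)
    bd_k: List[Entry] = []
    for ref in bd_m:
        _, flag = next(it)  # step 2350
        if flag == 1:       # step 2355: add ΔBDk and copy D_ID
            bd_k.append(Entry(ref.delta_poc + delta_bd, ref.d_id))
    _, last_flag = next(it)  # step 2370
    if last_flag == 1:
        _, d_id = next(it)                  # step 2375
        bd_k.append(Entry(delta_bd, d_id))  # step 2380
    return bd_k  # len(bd_k) is the number stored in step 2390


# Illustrative values only; they are not taken from the embodiments.
example_bd_m = [Entry(-4, 1), Entry(-8, 2), Entry(-12, 3)]
example_bd_k = [Entry(-6, 1), Entry(-14, 3), Entry(-2, 0)]
example_symbols = encode_bd(example_bd_k, example_bd_m, -2)
```

In this sketch, `decode_bd(example_symbols, example_bd_m, -2)` returns a list equal to `example_bd_k`, so only the flags and the D_ID of the additional reference picture need to be carried in the encoded data.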
The example processing methods of
In the above example, the values of ibd_flagk,j are expressed by one bit (1 or 0), but they may be expressed by two or more bits. In that case, the additional bit or bits may be used to determine whether the other information (D_IDk,i, IBDRk,i, or other information) is explicitly encoded.
Furthermore, the additional bit may be used to indicate an application range of the reference pictures associated with ΔPOCk,i (i.e., the reference pictures having the POC numbers of POCk,i given in formula (1)). Specifically, when ibd_flagk,j is “1,” ΔPOCk,i is reconstructed using ΔPOCm,j and, at substantially the same time, the reference picture associated with ΔPOCk,i is applied to the current processing target picture (current picture) and to one or more future pictures subsequent thereto. When ibd_flagk,j is “01,” ΔPOCk,i is reconstructed using ΔPOCm,j and, at substantially the same time, the reference picture associated with ΔPOCk,i is not applied to the current picture but is applied only to one or more future pictures subsequent thereto. Furthermore, when ibd_flagk,j is “00,” ΔPOCm,j is not used for reconstruction of ΔPOCk,i.
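The three flag values “1,” “01,” and “00” form a prefix code, so a decoder can read them one bit at a time. The following Python sketch shows one possible way to parse them, assuming the bits arrive as an iterator of integers; the function name `read_ibd_flag` and the returned pair are hypothetical and serve only as an illustration.

```python
from typing import Iterator, Tuple


def read_ibd_flag(bits: Iterator[int]) -> Tuple[bool, bool]:
    """Read one ibd_flag and return (reuse, applies_to_current).

    reuse: ΔPOCk,i is reconstructed using ΔPOCm,j.
    applies_to_current: the reference picture is applied to the current
    picture as well as to future pictures.
    """
    if next(bits) == 1:
        return True, True    # "1": current picture and future pictures
    if next(bits) == 1:
        return True, False   # "01": future pictures only
    return False, False      # "00": ΔPOCm,j is not used


def read_all_flags(bits: Iterator[int], count: int):
    """Read `count` consecutive flags from the same bit iterator."""
    return [read_ibd_flag(bits) for _ in range(count)]
```

Because no flag value is a prefix of another, the flags can be concatenated without separators and still be parsed unambiguously.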
In the above embodiments, the processing is carried out for ΔPOCk,i described in the buffer description information, but the processing may instead be carried out for the POC number itself of each reference picture.
The buffer description information was described in all the above embodiments. Since the buffer description information also describes a plurality of reference pictures used for encoding and decoding of the target picture, the foregoing embodiments may also be used as methods for management of reference picture lists.
The above embodiments explained the cases where the buffer description information was encoded together as part of the PPS information, but they are also applicable to cases where the buffer description information is described in the header of each individual target picture. For example, they are also applicable to a configuration wherein the information of row 510 in
A video predictive encoding program for causing a computer to function as the foregoing video predictive encoding device 100 can be provided as stored in a recording medium. Similarly, a video predictive decoding program for causing a computer to function as the foregoing video predictive decoding device 200 can be provided as stored in a recording medium. Examples of such recording media include flexible disks, CD-ROMs, DVDs, ROMs, and semiconductor memories.
As shown in
100: video predictive encoding device; 101: input terminal; 102: block division unit; 103: predicted signal generation unit; 104: frame memory (or buffer, DPB); 105: subtraction unit; 106: transform unit; 107: quantization unit; 108: inverse quantization unit; 109: inverse transform unit; 110: addition unit; 111: entropy encoding unit; 112: output terminal; 114: buffer management unit; 200: video predictive decoding device; 201: input terminal; 202: data analysis unit; 203: inverse quantization unit; 204: inverse transform unit; 205: addition unit; 206: output terminal; 207: frame memory; 208: predicted signal generation unit; 209: buffer management unit.
Number | Date | Country | Kind |
---|---|---|---|
2011-228758 | Oct 2011 | JP | national |
2011-240334 | Nov 2011 | JP | national |
This application is a continuation of U.S. application Ser. No. 15/251,876, filed Aug. 30, 2016, which is a continuation of U.S. application Ser. No. 14/255,728, filed Apr. 17, 2014, which is a continuation of PCT/JP2012/073090, filed Sep. 10, 2012, which claims the benefit of the filing date pursuant to 35 U.S.C. § 119(e) of JP2011-228758, filed Oct. 18, 2011 and JP2011-240334, filed Nov. 1, 2011, all of which are incorporated herein by reference.
Number | Date | Country | |
---|---|---|---|
20170208337 A1 | Jul 2017 | US |
Number | Date | Country | |
---|---|---|---|
Parent | 15251876 | Aug 2016 | US |
Child | 15473849 | US | |
Parent | 14255728 | Apr 2014 | US |
Child | 15251876 | US | |
Parent | PCT/JP2012/073090 | Sep 2012 | US |
Child | 14255728 | US |