The present disclosure relates to a video encoding/decoding apparatus and method.
The statements in this section merely provide background information related to the present disclosure and may not necessarily constitute prior art.
A video typically may include a series of pictures (or frames or images) each of which is divided into predetermined areas, such as macroblocks. The macroblock is the standard unit of video encoding and decoding. Macroblocks may be classified into intra macroblocks and inter macroblocks depending on the encoding method. The intra macroblock means a macroblock encoded through an intra prediction coding method that is an intra frame prediction coding. The intra prediction coding is adapted to generate a predicted block by predicting a pixel of a current block using pixels of reconstructed blocks that underwent previous encoding and decoding within a current picture where the current encoding is performed and then encode a differential value between the predicted block and the current block. The inter macroblock means a macroblock encoded through an inter prediction or inter frame prediction coding. The inter prediction coding is adapted to generate the predicted block by predicting the current block in the current picture through referencing one or more past (previous) pictures or future (subsequent) pictures and then encode the differential value of the predicted block from the current block. Here, the picture that is referenced in encoding or decoding the current picture (or current frame or current image) is called a reference picture (or reference frame or reference image).
Referring to
$M_{\text{cost}} = \text{Distortion} + \lambda \cdot \text{Rate}$    (Equation 1)
In Equation 1, Distortion is the sum of absolute values of pixel differences between the current block and the block indicated by the motion vector, Rate is a predicted value of the number of bits generated when encoding the predicted motion vector, and λ is a Lagrange multiplier.
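As a minimal sketch of Equation 1, the cost of one candidate block can be computed as below. The function names `sad` and `motion_cost`, the 2×2 sample blocks, and the values chosen for the rate and λ are illustrative assumptions, not values taken from the disclosure.

```python
def sad(current, candidate):
    """Sum of absolute differences between two equally sized blocks (lists of rows)."""
    return sum(
        abs(c - r)
        for cur_row, ref_row in zip(current, candidate)
        for c, r in zip(cur_row, ref_row)
    )


def motion_cost(current, candidate, rate_bits, lam):
    """Equation 1: distortion plus the Lagrange multiplier times the estimated rate."""
    return sad(current, candidate) + lam * rate_bits


# Hypothetical 2x2 blocks and parameters, for illustration only.
current = [[10, 12], [14, 16]]
candidate = [[11, 12], [13, 18]]
cost = motion_cost(current, candidate, rate_bits=5, lam=2.0)
print(cost)
```

A motion search would evaluate this cost for each candidate position and keep the candidate with the smallest value.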
The process of encoding the predicted motion vector is as follows. A prediction motion vector (PMV) is first calculated from the blocks adjacent to the current block, and a differential vector between the PMV and the motion vector of the current block is then calculated.
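The two calculations can be sketched as follows. The text does not say how the PMV is derived from the adjacent blocks; the component-wise median of the left, top, and top-right neighbors used here is the H.264 convention and is an assumption for this sketch. The function names and sample vectors are likewise illustrative.

```python
def median_pmv(left, top, top_right):
    """Component-wise median of three neighboring motion vectors (assumed predictor)."""
    xs = sorted(v[0] for v in (left, top, top_right))
    ys = sorted(v[1] for v in (left, top, top_right))
    return (xs[1], ys[1])


def differential_mv(mv, pmv):
    """Differential vector: motion vector of the current block minus the PMV."""
    return (mv[0] - pmv[0], mv[1] - pmv[1])


# Hypothetical vectors, expressed in quarter-pixel units.
pmv = median_pmv((4, -2), (6, 0), (5, 3))
dmv = differential_mv((7, 1), pmv)
print(pmv, dmv)
```

Only the differential vector is written to the bitstream; the decoder recomputes the same PMV and adds the decoded difference back.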
When predicting the motion vector, motion estimation may be performed in units of integer pixels. However, for more accurate motion estimation, motion estimation may be performed in units of ½ pixels or ¼ pixels (i.e., non-integer pixels). This is because an image does not move only in units of integer pixels, but can also move in units of ½ pixels or ¼ pixels. Therefore, if motion estimation is performed only in units of integer pixels, coding efficiency is lowered for images that move in units of ½ pixels or ¼ pixels.
Considering this fact, JM reference software, which is an existing video codec, predicts motion vectors in units of integer pixels, ½ pixels, and finally ¼ pixels, and compresses signals by using the motion vector of the resolution having the highest coding efficiency for the block currently being coded. In addition, KTA reference software can detect motion more accurately by predicting a motion vector in units ranging from integer pixels to ⅛ pixels. However, a reference image does not have ½ pixel or ¼ pixel values, but only integer pixel values. Therefore, ½ pixel or ¼ pixel values are produced using the given integer pixel values.
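As a hedged illustration of producing fractional values from integer pixel values, the sketch below follows the H.264 luma convention that JM implements: a half-sample is obtained with a six-tap filter with taps (1, −5, 20, 20, −5, 1) and a quarter-sample by rounded averaging of two neighboring samples. The function names and the sample row are illustrative, and 8-bit samples are assumed.

```python
def half_pel(p):
    """Half-sample between p[2] and p[3], from six consecutive integer samples."""
    taps = (1, -5, 20, 20, -5, 1)
    acc = sum(t * s for t, s in zip(taps, p))
    # Round, normalize by 32, and clip to the assumed 8-bit sample range.
    return min(255, max(0, (acc + 16) >> 5))


def quarter_pel(a, b):
    """Quarter-sample as the rounded average of two neighboring samples."""
    return (a + b + 1) >> 1


row = [10, 20, 30, 40, 50, 60]  # hypothetical integer-position samples
h = half_pel(row)               # half-sample between 30 and 40
q = quarter_pel(row[2], h)      # quarter-sample between 30 and the half-sample
print(h, q)
```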
As for the method for producing ½ pixel and ¼ pixel values in JM reference software, ½ pixel values are generated by using six integer pixel values around the ½ pixel, as shown in
A differential motion vector encoding method can be performed through tables of
As can be seen from
The prediction of the motion vector with high resolution has an advantage in that it can find such a reference block that has high correlation with the currently coded block. However, the inventors have noted that the compression efficiency may be lowered due to the use of variable-length codeword considering vectors of all resolutions encompassing values of motion vectors from low to high resolutions. For example, assuming a specific frame permits encoding with the use of motion vectors exclusively in units of integer pixels or ½ pixels when the variable length codebook is used to have all resolutions considered from the integer pixel unit to ⅛ pixel unit, the codewords for ¼ and ⅛ pixels are not used and lengthen the variable-length codewords of frequently used integer pixels and ½ pixel coded vectors. As a result, the coding efficiency may be lowered. In some contrary cases, due to the characteristics of the internal pixel values of certain frames, compression efficiency can be increased when using the variable-length codewords considering motion vectors of all resolutions from the integer to ⅛ pixel units.
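The effect on codeword length can be made concrete with a small calculation. The sketch assumes the signed Exp-Golomb code that H.264 uses for motion vector differences in its CAVLC mode, and assumes that a displacement is coded as an integer number of steps of the codebook's finest unit; `se_bits` and `dmv_bits` are illustrative names.

```python
from fractions import Fraction


def se_bits(v):
    """Length in bits of the signed Exp-Golomb codeword for integer v."""
    code_num = 2 * v - 1 if v > 0 else -2 * v
    return 2 * (code_num + 1).bit_length() - 1


def dmv_bits(offset_pels, unit):
    """Bits needed for a displacement when the codebook counts in steps of `unit`."""
    steps = offset_pels / unit
    if steps.denominator != 1:
        raise ValueError("offset is not representable at this resolution")
    return se_bits(int(steps))


one_pel = Fraction(1)
for unit in (Fraction(1), Fraction(1, 2), Fraction(1, 4), Fraction(1, 8)):
    print(unit, dmv_bits(one_pel, unit))
```

Under these assumptions a displacement of one integer pixel costs 3 bits in an integer-pixel codebook, 5 bits in a ½ pixel codebook, and 9 bits in a ⅛ pixel codebook, which is the lengthening described above.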
In some embodiments, a video encoding/decoding apparatus comprises a video encoder and a video decoder. The video encoder is configured to set up motion vector resolutions differentiated by search areas centered on a prediction motion vector of a current block, perform a motion estimation with a resolution corresponding to each of the search areas to generate a motion vector, and encode a differential motion vector between the generated motion vector and the prediction motion vector. The video decoder is configured to extract the differential motion vector from a bitstream, and decode the extracted differential motion vector with a resolution corresponding to the search area, among the search areas, to which the differential motion vector belongs.
In some embodiments, a differential motion vector encoding method comprises setting up motion vector resolutions differentiated by the search areas centered on a prediction motion vector of a current block, performing motion estimation with the resolution corresponding to each of the search areas to generate a motion vector, calculating a differential motion vector between the generated motion vector and the prediction motion vector, and encoding the calculated differential motion vector.
In some embodiments, a differential motion vector decoding method comprises dividing search areas in accordance with threshold values, setting up motion vector resolutions differentiated by the search areas, extracting a differential motion vector from a bitstream, and decoding the extracted differential motion vector with the resolution corresponding to the search area, among the search areas, to which the differential motion vector belongs.
Some embodiments of the present disclosure provide differential motion vector encoding/decoding apparatus and method, in which motion vectors are predicted with resolutions differentiated by search areas, and a differential motion vector is adaptively encoded/decoded with a corresponding resolution, thereby increasing compression and/or reconstruction efficiency.
Hereinafter, at least one embodiment of the present disclosure will be described in detail with reference to the accompanying drawings. In the following description, like reference numerals designate like elements although the reference numerals are shown in different drawings. Further, in the following description of the present embodiments, a detailed description of known functions and/or configurations incorporated herein will be omitted for the purpose of clarity and for brevity.
Additionally, in describing various components of the present disclosure, terms like first, second, A, B, (a), and (b) are used solely for the purpose of differentiating one component from another, but one of ordinary skill would understand the terms do not imply or suggest the substances, order or sequence of the components. If a component is described as ‘connected’, ‘coupled’, or ‘linked’ to another component, one of ordinary skill would understand the components are not necessarily directly ‘connected’, ‘coupled’, or ‘linked’ but may also be indirectly ‘connected’, ‘coupled’, or ‘linked’ via at least one additional third component.
Hereinafter, a video encoding apparatus and/or a video decoding apparatus in accordance with some embodiments described below may be user terminals such as a personal computer (PC), a notebook computer, a tablet, a personal digital assistant (PDA), a game console, a portable multimedia player (PMP), a PlayStation Portable (PSP), a wireless communication terminal, a smart phone, a TV, a media player, and the like. A video encoding apparatus and/or a video decoding apparatus according to one or more embodiments may correspond to server terminals such as an application server, a service server and the like. A video encoding apparatus and/or a video decoding apparatus according to one or more embodiments may correspond to various apparatuses each including (a) a communication apparatus such as a communication modem and the like for performing communication with various types of devices or a wired/wireless communication network, (b) a memory for storing various types of programs and data for encoding or decoding a video, or performing an inter or intra prediction for the encoding or decoding, and (c) a microprocessor and the like for executing the program to perform an operation and control. According to one or more embodiments, the memory comprises a computer-readable recording/storage medium such as a random access memory (RAM), a read only memory (ROM), a flash memory, an optical disk, a magnetic disk, a solid-state disk, and the like. According to one or more embodiments, the microprocessor is programmed for performing one or more of the operations and/or functionality described herein. According to one or more embodiments, the microprocessor is implemented, in whole or in part, by specifically configured hardware (e.g., by one or more application specific integrated circuits or ASIC(s)).
Further, a video encoded into a bitstream by the video encoding apparatus may be transmitted in real time or non-real-time to the video decoding apparatus through wired/wireless communication networks such as the Internet, wireless short range or personal area network (WPAN), wireless local area network (WLAN), WiBro (wireless broadband, aka WiMax) network, mobile communication network and the like or through various communication interfaces such as a cable, a universal serial bus (USB) and the like. According to one or more embodiments, the bit stream is decoded in the video decoding apparatus and reconstructed and reproduced as the video. According to one or more embodiments, the bit stream is stored in a computer-readable recording/storage medium.
The technology described herein is not limited in its application to the motion vector prediction units (for example, macroblocks, 16×16, 16×8, 8×16, 8×8, 4×8, 8×4, 4×4) used in the existing H.264 standard or KTA reference software, and the size of motion vector estimation blocks also is not limited. In addition, the technology of the present disclosure can also be used when the motion vector prediction unit has a square shape, a rectangular shape, a triangular shape, or any of various other shapes.
The resolution setting unit 710 sets up motion vector resolutions differentiated by search areas centered on a prediction motion vector (PMV) of a current block. In the existing technique for encoding the differential motion vector, the motion vectors having the same resolution are used in all areas centered on the prediction motion vector. However, the differential motion vector encoding apparatus 700 according to at least one embodiment of the present disclosure is configured to estimate motion vectors having different resolutions in different search areas centered on the prediction motion vector, as opposed to the existing differential motion vector encoding method. For this purpose, the resolution setting unit 710 may set up resolutions by search areas such that the motion vector resolution is lowered as the distance from the search area to the prediction motion vector increases, or may set up resolutions by search areas such that the motion vector resolution is increased as the distance from the search area to the prediction motion vector of the current block increases. Alternatively, the present disclosure is not limited thereto, and available resolutions can be variously set up according to the distance from the search area to the prediction motion vector of the current block. In addition, different motion vector resolutions can be set differently in different directions centered on the prediction motion vector of the current block.
In at least one embodiment, the resolution setting unit 710 can calculate threshold values of respective search areas by using the current image and the reference image. The present disclosure does not limit the method for calculating the threshold values, and can generate a table (codebook) for encoding the differential motion vector by using one or more threshold values predetermined for the corresponding one or more search areas.
The motion estimation unit 720 generates the motion vector by performing motion estimation with the resolutions set correspondingly to the respective search areas by the resolution setting unit 710.
The differential motion vector calculator 730 calculates the differential motion vector between the motion vector generated by the motion estimation unit 720 and the prediction motion vector.
The differential motion vector encoder 740 encodes the differential motion vector, which is calculated by the differential motion vector calculator 730, with the resolution corresponding to the motion vector generated by the motion estimation unit 720, in a bitstream.
The threshold value encoder 750 encodes the threshold values of the respective search areas with the highest resolution of the corresponding search area and transmits the encoded values on a bitstream to the decoder. In some embodiments, the outputs from the threshold value encoder 750 and the differential motion vector encoder 740 are included in a single bitstream. In at least one embodiment, instead of notifying the decoder of the threshold values in the respective search areas through the threshold value encoder 750, the resolution setting unit 710 may be configured to set up motion vector resolutions differentiated by search areas according to threshold values prearranged with the decoder.
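One way to read "encoding the threshold values with the highest resolution" is to express each threshold as an integer count of the finest step in use before it is entropy coded. The sketch below shows only that conversion; the function names and the threshold lists are hypothetical, and the entropy coding of the resulting integers is left open, as in the disclosure.

```python
from fractions import Fraction


def thresholds_to_units(thresholds, finest):
    """Express each threshold (in pixels) as an integer number of `finest` steps."""
    units = []
    for t in thresholds:
        q = t / finest
        if q.denominator != 1:
            raise ValueError("threshold is not a multiple of the finest resolution")
        units.append(int(q))
    return units


def units_to_thresholds(units, finest):
    """Decoder-side inverse: recover the thresholds in pixels."""
    return [u * finest for u in units]


# Hypothetical thresholds for a configuration whose finest resolution is 1/8 pixel.
eighth = Fraction(1, 8)
sent = thresholds_to_units([Fraction(2, 8), Fraction(1), Fraction(2)], eighth)
print(sent, units_to_thresholds(sent, eighth))
```

If only up to ¼ pixel resolution were used, the same thresholds would be counted in ¼ pixel steps instead, giving smaller integers to transmit.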
Such divided search areas respectively have motion vectors having different resolutions. For example, an area A has a motion vector encoding resolution of up to ⅛ pel (i.e., ⅛ pixel unit); area B up to ¼ pel (i.e., ¼ pixel unit); area C up to ½ pel (i.e., ½ pixel unit); and area D encodes the motion vector with integer motion vector resolutions.
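The mapping from a displacement to its area and finest resolution can be sketched in one dimension as follows. The area names and resolutions follow the example above; the threshold of 2/8 pixel for area A is the value the description mentions for that area, while the thresholds for areas B and C, the inclusive boundaries, and the function names are assumptions made for illustration.

```python
from fractions import Fraction

# (area name, outer threshold in pixels, finest step in pixels).
# Thresholds for B and C are hypothetical.
AREAS = [
    ("A", Fraction(2, 8), Fraction(1, 8)),
    ("B", Fraction(1), Fraction(1, 4)),
    ("C", Fraction(2), Fraction(1, 2)),
]
OUTERMOST = ("D", Fraction(1))  # integer-pixel resolution beyond the last threshold


def area_for_offset(d):
    """Return (area name, finest step) for a displacement d from the PMV."""
    for name, threshold, step in AREAS:
        if abs(d) <= threshold:
            return name, step
    return OUTERMOST


def is_representable(d):
    """True if d lies on the resolution grid of the area it falls in."""
    _, step = area_for_offset(d)
    return d % step == 0


print(area_for_offset(Fraction(-3, 4)), is_representable(Fraction(3, 8)))
```

In this sketch a displacement of 3/8 pixel falls in area B, whose grid is ¼ pixel, so it is not a valid motion vector position and needs no codeword.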
When the areas (i.e., search areas) are set as above, the threshold value encoder 750 encodes the threshold values of the respective search areas in order to notify the set areas to the decoder, the threshold values for use being encoded appropriately on the corresponding maximum resolution. For example, in case of using only up to ¼ pixel resolution, the threshold value encoder 750 encodes the threshold values in units of ¼ pixels before transmitting the same to the decoder. Since up to ⅛ pixel resolution is used in the above-described example, the threshold value encoder 750 encodes the threshold values in units of ⅛ pixels and transmits the same to the decoder. The present disclosure does not limit the method for transmitting the threshold values. If the codebook of
Although at least one embodiment of the present disclosure exemplifies using the exponential Golomb code to encode the differential motion vector into the bit string, the present disclosure is not limited thereto and other coding methods can be used.
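For reference, an order-0 exponential Golomb code can be sketched as below. The mapping of signed values to code numbers follows the H.264 signed convention (positive values to odd code numbers, zero and negative values to even ones); whether the disclosure's codebook uses this exact mapping is an assumption, and the function names are illustrative.

```python
def signed_to_code_num(v):
    """Map a signed integer to a non-negative code number (H.264 se(v) ordering)."""
    return 2 * v - 1 if v > 0 else -2 * v


def exp_golomb_encode(code_num):
    """Order-0 Exp-Golomb: (n - 1) zeros followed by the n-bit binary of code_num + 1."""
    bits = bin(code_num + 1)[2:]
    return "0" * (len(bits) - 1) + bits


def exp_golomb_decode(bits):
    """Decode one codeword from the front of a bit string; return (code_num, rest)."""
    zeros = 0
    while bits[zeros] == "0":
        zeros += 1
    end = 2 * zeros + 1
    return int(bits[zeros:end], 2) - 1, bits[end:]


for v in (0, 1, -1, 2):
    print(v, exp_golomb_encode(signed_to_code_num(v)))
```

Small code numbers receive short codewords, so the order in which displacements are assigned code numbers determines which motion vectors are cheap to send.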
The example of
Comparing
Although the foregoing description relates to the case where a longer distance between the search area and the prediction motion vector of the current block yields fewer available resolutions, a longer distance between the search area and the prediction motion vector of the current block may yield more available resolutions in at least one other embodiment.
When the codebook for the differential motion vector is generated using the example of
In addition, various types of available resolutions distributed by search areas centered on the prediction motion vector of the current block can be arbitrarily set, regardless of distances.
As in the above-described examples, various threshold value settings for the respective areas (i.e., search areas) can be used, and there may be a variety of combinations of motion vector resolutions and threshold values used in the respective areas. The respective threshold values for respective search areas may be encoded by the threshold value encoder 750 before transmission to the decoder, or the transmission of the threshold values may be omitted in such a manner that the encoder and decoder use prearranged threshold values for respective search areas. Information about the combination of the motion vector resolutions and the threshold values used in the respective areas (i.e., search areas) can also be prearranged between the transmitter (e.g., encoder) and the receiver (e.g., decoder). Alternatively, the information about the combination of the resolutions and threshold values may be encoded in the encoder before transmission.
In addition, the search areas may be differently set with respect to x-axis and y-axis as shown in
The threshold value decoder 2110 extracts threshold values of respective search areas from a bitstream received from the encoder, and decodes the extracted threshold values. The threshold values used herein are threshold values of the respective search areas set by the encoding apparatus 700 according to at least one embodiment of the present disclosure, and are encoded with the highest resolution among motion vector resolutions available in the respective areas. For example, with respect to area A in
The resolution setting unit 2120 sets motion vector resolutions differentiated by search areas, based on the respective threshold values decoded by the threshold value decoder 2110. That is, the resolution setting unit 2120 can recognize motion vector resolutions available in the respective search areas set by the differential motion vector encoding apparatus 700, based on the respective decoded threshold values. For example, in a case where the threshold values 2/8 and − 2/8 for the area A of
The differential motion vector decoder 2130 extracts the differential motion vector from the bitstream received from the encoder, and decodes the differential motion vector with the resolutions corresponding to the area where the differential motion vector belongs among the respective areas. In this case, the differential motion vector decoder 2130 can generate the codebook of
The resolution setting unit 2210 may set up motion vector resolutions differentiated by search areas according to threshold values prearranged with the encoder. For example, the resolution setting unit 2210 may prearrange with the encoder to equally set up the respective search areas and the available motion vector resolutions as shown in
The differential motion vector decoder 2220 extracts the differential motion vector from the bitstream, and decodes the differential motion vector with the resolutions corresponding to the area where the differential motion vector belongs among the respective areas.
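A decoder that shares the area layout with the encoder can rebuild the same table of valid displacements and look up each received code number in it. The sketch below is one possible construction under stated assumptions: one dimension only, hypothetical thresholds (apart from 2/8 pixel for the innermost area), inclusive boundaries, and code numbers assigned in order of increasing magnitude with the positive value first. None of these choices is specified by the disclosure, and the function names are illustrative.

```python
from fractions import Fraction

# (outer threshold in pixels, finest step in pixels); mostly hypothetical values.
AREAS = [
    (Fraction(2, 8), Fraction(1, 8)),
    (Fraction(1), Fraction(1, 4)),
    (Fraction(2), Fraction(1, 2)),
    (Fraction(4), Fraction(1)),
]


def build_codebook(areas):
    """List of valid displacements; the list index is the code number."""
    positives = []
    inner = Fraction(0)
    for outer, step in areas:
        pos = step
        while pos <= outer:
            if pos > inner:  # skip positions that belong to an inner area
                positives.append(pos)
            pos += step
        inner = outer
    table = [Fraction(0)]
    for p in sorted(positives):
        table.extend((p, -p))
    return table


def encode_dmv(dmv, table):
    """Encoder side: code number of a displacement."""
    return table.index(dmv)


def decode_dmv(code_num, table):
    """Decoder side: displacement for a received code number."""
    return table[code_num]


table = build_codebook(AREAS)
print(len(table), encode_dmv(Fraction(1), table), decode_dmv(2, table))
```

With this layout a displacement of one pixel has code number 9, whereas a uniform ⅛ pixel table with the same ordering would give it code number 15, so fewer bits are spent on it.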
Referring to
The resolution setting unit 710 can calculate the threshold values of the respective search areas by using the current image and the reference image. The present disclosure does not limit the method for calculating the threshold values, and can generate a table (codebook) for encoding the differential motion vector by using the threshold values of the determined search area.
The threshold value encoder 750 encodes the threshold values of the respective search areas with the highest resolution of the corresponding search area and transmits a bitstream to the decoder (S2320). In at least one embodiment, when it is necessary to transmit the threshold values, the threshold value encoder 750 encodes the threshold value(s) and inserts the encoded threshold value(s) between a slice header and a coding unit block (MB data) before transmission as shown in
Instead of notifying the decoder of the threshold values of the respective search areas through the threshold value encoder 750, the resolution setting unit 710 may be configured to set up motion vector resolutions differentiated by search areas according to threshold values representing search area ranges prearranged with the decoder. In this case, the encoding of the threshold values may be omitted.
The motion estimation unit 720 generates the motion vector by performing motion estimation with the resolutions corresponding to the respective search areas set by the resolution setting unit 710 (S2330).
The differential motion vector calculator 730 calculates the differential motion vector between the motion vector generated by the motion estimation unit 720 and the prediction motion vector (S2340).
The differential motion vector encoder 740 encodes the differential motion vector, which is calculated by the differential motion vector calculator 730, with the resolution corresponding to the motion vector generated by the motion estimation unit 720 (S2350).
Referring to
The resolution setting unit 2120 sets up motion vector resolutions differentiated by search areas based on the respective threshold values decoded by the threshold value decoder 2110 (S2720). That is, the resolution setting unit 2120 can recognize motion vector resolutions available in the respective search areas set by the differential motion vector encoding apparatus 700, based on the respective decoded threshold values.
The differential motion vector decoder 2130 extracts the differential motion vector from the bitstream, and decodes the differential motion vector with the resolution corresponding to the search area where the differential motion vector belongs among the respective search areas (S2730). In at least one embodiment, the differential motion vector decoder 2130 can generate the codebook of
Referring to
The differential motion vector decoder 2220 extracts the differential motion vector from the bitstream, and decodes the differential motion vector with the resolution corresponding to the search area where the differential motion vector belongs among the respective search areas (S2820).
Next, in a case where a video is compressed and decoded using a plurality of reference images, a method for using threshold values will be described. For decoding the current image, information is read from the slice header, the threshold value(s) is read, and data of the coding unit block is read. In this case, the decoded threshold value(s) is used for the respective reference images so as to decode the current frame through the motion compensation.
According to the present disclosure as described above, motion vectors are predicted with resolutions differentiated by search areas, and a differential motion vector is adaptively encoded/decoded with a corresponding resolution, increasing compression and reconstruction efficiency in the case of using variable length codebooks.
In the description above, although all of the components of the embodiments of the present disclosure may have been explained as assembled or operatively connected as a unit, one of ordinary skill would understand the present disclosure is not limited to such embodiments. Rather, within some embodiments of the objective scope of the present disclosure, the respective components are selectively and operatively combined in any number of ways. Each of the components is capable of being implemented alone in hardware or combined in part or as a whole and implemented in a computer program having program modules residing in computer readable media and causing a processor or microprocessor to execute functions of the hardware equivalents. The computer program is stored in non-transitory computer readable media, which in operation realizes at least one embodiment of the present disclosure. The computer readable media include, but are not limited to, magnetic recording media and optical recording media, in some embodiments.
In addition, one of ordinary skill would understand terms like ‘include’, ‘comprise’, and ‘have’ to be interpreted by default as inclusive or open-ended rather than exclusive or closed-ended unless expressly defined to the contrary. All terms, whether technical, scientific or otherwise, agree with the meanings as understood by a person skilled in the art unless defined to the contrary.
Although exemplary embodiments of the present disclosure have been described for illustrative purposes, those skilled in the art will appreciate that various modifications, additions and substitutions are possible, without departing from the various characteristics of the disclosure. Therefore, exemplary embodiments of the present disclosure have been described for the sake of brevity and clarity. Accordingly, one of ordinary skill would understand the scope of the disclosure is not to be limited by the embodiments explicitly described above.
Number | Date | Country | Kind |
---|---|---|---|
10-2010-0101439 | Oct 2010 | KR | national |
The instant application is the US national phase of PCT/KR2011/007736, filed Oct. 18, 2011, which claims priority to Korean Patent Application No. 10-2010-0101439, filed on Oct. 18, 2010. The above-listed applications are hereby incorporated by reference in their entirety.
Filing Document | Filing Date | Country | Kind | 371c Date |
---|---|---|---|---|
PCT/KR11/07736 | 10/18/2011 | WO | 00 | 4/17/2013 |