The embodiments discussed herein relate to decoding methods, decoders, and decoding apparatuses.
The dynamic images of digital broadcasting and DVD video that are viewed are made up of approximately 30 digital images per second. Hence, transmitting such dynamic images on a broadcasting wave, or storing such dynamic images on a storage medium such as the DVD, without processing the dynamic images, is difficult from the point of view of the limited frequency band of the broadcasting wave and the limited capacity of the storage medium. For this reason, in actual applications, the dynamic image is subjected to some kind of compression process. Generally, the compression process is in conformance with rules prescribed by a standardization organization, taking into account the public interest and the market popularity of the application of the compression process. The popularly used compression techniques include the MPEG-2 prescribed by the International Organization for Standardization/International Electrotechnical Commission (ISO/IEC), and the newer compression technique called H.264/Advanced Video Coding (H.264/AVC), which achieves a compression rate two times that of the Moving Picture Experts Group 2 (MPEG-2). The H.264/AVC is anticipated as the next-generation compression technique to be employed in Digital Terrestrial Television (DTTV or DTT) broadcasting for mobile equipment, and in reproducing apparatuses such as High-Definition DVD (HD-DVD) players/recorders and Blu-ray players/recorders. The dynamic image compression process such as the MPEG-2 and the H.264/AVC is based on the concept of detecting, from the images forming the dynamic image, regions that have a strong correlation, that is, regions that have similar picture patterns, in order to exclude redundant information. Each image is divided (or segmented) into rectangular regions (or areas) which are called macro blocks and are the processing units of the compression.
With respect to each macro block, a search is made to find a rectangular region, called a reference image, which has a picture pattern similar to that of the macro block and is close to the macro block in terms of time. A spatial difference (or error) between the positions of the macro block and the reference image that is found is regarded as motion vector data, and the residual data between the images of the macro block and the reference image is regarded as coefficient data. The motion vector data and the coefficient data that are obtained are compressed and encoded.
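The search described above can be sketched as a full block search over a small window. This is a minimal illustration only, not an actual encoder's search strategy; the function name, the ±4 pixel window, and the sum-of-absolute-differences (SAD) cost are assumptions for the sketch.

```python
import numpy as np

def find_motion_vector(block, ref_frame, top, left, search=4):
    """Full search over a +/-`search` pixel window around (top, left).

    Returns the (dy, dx) displacement minimizing the sum of absolute
    differences (SAD), plus the residual between the block and the
    best-matching reference region.
    """
    h, w = block.shape
    best, best_sad = (0, 0), None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            y, x = top + dy, left + dx
            if y < 0 or x < 0 or y + h > ref_frame.shape[0] or x + w > ref_frame.shape[1]:
                continue  # candidate region falls outside the reference frame
            cand = ref_frame[y:y + h, x:x + w]
            sad = np.abs(block.astype(int) - cand.astype(int)).sum()
            if best_sad is None or sad < best_sad:
                best_sad, best = sad, (dy, dx)
    dy, dx = best
    match = ref_frame[top + dy:top + dy + h, left + dx:left + dx + w]
    residual = block.astype(int) - match.astype(int)
    return best, residual
```

The displacement returned corresponds to the motion vector data and the residual to the coefficient data (before transform and quantization) described above.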
The apparatus which receives the digital broadcasting and displays the dynamic image, or reproduces the video data from the DVD, includes a dynamic image decoding apparatus which decodes and expands the compressed and encoded data. The dynamic image decoding apparatus generates a predicted image by referring to an image having a similar picture pattern based on the motion vector data, and performs a motion compensation process to add the predicted image and the residual image. In the case of the H.264/AVC, it may be possible to define the motion vector data in processing units of rectangular regions smaller than those of the MPEG-2, and the processing load on the dynamic image decoding apparatus is larger than that of the MPEG-2, as reported in Impress Standard Textbook Series “H.264/AVC Textbook”, Impress Japan Incorporated, Net Business Company (Publisher), pp. 113-115, Aug. 11, 2004.
For example, with respect to the rectangular region (macro block) having a size of 16×16 pixels=256 pixels as the processing unit, the MPEG-2 requires the luminance values of 17×9 pixels to be read 2 times, that is, 306 pixels to be read as the reference image at the maximum. On the other hand, with respect to the rectangular region (macro block) having a size of 16×16 pixels=256 pixels as the processing unit, the H.264/AVC requires the luminance values of 9×9 pixels to be read 16 times, that is, 1296 pixels to be read as the reference image at the maximum. This means that in a worst case scenario, the H.264/AVC requires 4 times or more data to be read when compared to the MPEG-2.
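The worst-case read counts above follow from simple arithmetic; the partition sizes are as stated in the text, and the comments merely restate them:

```python
# MPEG-2: for one 16x16 macro block, a 17x9 luminance region is read
# 2 times in the worst case.
mpeg2_worst = 17 * 9 * 2
# H.264/AVC: for one 16x16 macro block, a 9x9 luminance region is read
# 16 times (once per 4x4 partition) in the worst case.
h264_worst = 9 * 9 * 16

assert mpeg2_worst == 306
assert h264_worst == 1296
print(f"H.264/AVC worst case reads {h264_worst / mpeg2_worst:.2f}x the MPEG-2 data")
```

The ratio is about 4.24, which is the "4 times or more" figure quoted above.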
The control part 7 controls the operations of the encoded data decoding part 1, the coefficient data processing part 4, the motion vector data processing part 5, and the motion compensating part 6, based on synchronizing signals SYNC which will be described later.
The coefficient data processing part 4 includes a coefficient data interpreting part 41, an inverse quantization part 42, and an inverse frequency conversion part 43. The coefficient data interpreting part 41 converts a macro block attribute in accordance with a compression rule, such as interpreting the order of the coefficient data within the macro block, into a data format handled by hardware. The coefficient data output from the coefficient data interpreting part 41 has been subjected to a quantization at the time of the compression, and thus, is subjected to an inverse quantization process in the inverse quantization part 42. In addition, because the image compression data has been subjected to a spatial and frequency conversion in accordance with the compression rule, the image compression data is then subjected to an inverse frequency conversion process in the inverse frequency conversion part 43, in order to output the residual image that is obtained by subtracting the predicted image from the original image. The residual image includes an error component generated by the compression process such as the spatial and frequency conversion and quantization, and this error component appears as a distortion in the decoded image.
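As a rough model of the inverse quantization and inverse frequency conversion stages described above, the sketch below uses a uniform quantizer step and a generic orthonormal 2-D inverse DCT. This is not the exact MPEG-2 or H.264/AVC integer transform; the 4x4 block size, the function names, and the single scalar step size are illustrative assumptions.

```python
import numpy as np

def dct_basis(n=4):
    # Orthonormal DCT-II basis matrix: row k holds the k-th cosine basis vector.
    k = np.arange(n).reshape(-1, 1)
    i = np.arange(n).reshape(1, -1)
    c = np.cos(np.pi * k * (2 * i + 1) / (2 * n)) * np.sqrt(2.0 / n)
    c[0] /= np.sqrt(2.0)  # DC row scaling so that c @ c.T == identity
    return c

def inverse_quantize(levels, qstep):
    # Scale the transmitted quantization levels back to coefficient magnitudes.
    return levels * qstep

def inverse_frequency_conversion(coeffs):
    # 2-D inverse transform: residual = C^T @ coeffs @ C
    c = dct_basis(coeffs.shape[0])
    return c.T @ coeffs @ c
```

With a step size of 1 the two stages exactly invert the encoder-side forward transform (`coeffs = C @ block @ C.T`); with larger step sizes the rounding of the levels produces the error component mentioned above.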
On the other hand, the motion vector data processing part 5 includes a motion vector data interpreting part 51 and a predicted image generating part 53. The motion vector data interpreting part 51 converts the motion vector data into a motion vector which indicates the reference image, in accordance with the compression rule. The predicted image generating part 53 reads the reference image from the image memory 8 using the interpreted motion vector, and generates and outputs the predicted image based on the compression rule.
The motion compensating part 6 adds the residual image output from the coefficient data processing part 4 and the predicted image output from the motion vector data processing part 5 to generate a decoded image, and stores the decoded image in the image memory 8.
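The addition performed by the motion compensating part reduces to a saturating add of the two images. This is a minimal sketch; the 8-bit sample range is an assumption.

```python
import numpy as np

def motion_compensate(residual, predicted):
    """Add the signed residual image to the predicted image and clip the
    result to the valid 8-bit sample range, yielding the decoded image."""
    decoded = predicted.astype(np.int16) + residual.astype(np.int16)
    return np.clip(decoded, 0, 255).astype(np.uint8)
```

The widening to 16 bits before the add avoids wrap-around, and the clip models the saturation a hardware adder would apply before the decoded image is written back to the image memory.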
When performing the above described processes by hardware in the conventional decoder, a pipeline process is formed for each macro block, and the synchronizing signals SYNC that achieve synchronization for each macro block are output to the control part 7 from each of the encoded data decoding part 1, the coefficient data processing part 4, the motion vector data processing part 5 and the motion compensating part 6. In the decoder which synchronizes to each macro block, the reference image read performance is easy to estimate if the reference to the predicted image is simple as in the case of the MPEG-2, and a stable decoding performance may be obtained without stalling the pipeline system. However, in the case of the H.264/AVC in which the reading of the predicted image varies depending on the number of divisions of the macro blocks and the like, the reference image read performance greatly deteriorates if the number of divisions of the macro blocks is large, and the performance of the decoder as a whole deteriorates due to stalling of the pipeline system. In order to avoid such stalling of the pipeline system, it may be possible to construct the decoder by estimating the required memory performance for a worst case scenario of the reading of the reference image. However, the decoder in this case would require a high-speed memory which is several times faster than that required in the case of the MPEG-2, and consequently, the ease of design of the decoder will deteriorate, and the cost of the decoder will increase.
Next, a description will be given of a timing at which the deterioration of the performance is generated in the conventional dynamic image decoder, by referring to
The decoder illustrated in
The applicants are aware of Japanese Laid-Open Patent Publication No. 8-214307, and of the Impress Standard Textbook Series “H.264/AVC Textbook”, Impress Japan Incorporated, Net Business Company (Publisher), pp. 113-115, Aug. 11, 2004 referred to above.
Therefore, in the conventional decoder, if the processing time required to read the reference image with respect to the macro block becomes long and a delay is introduced in the predicted image generating process, the start of the motion compensation process with respect to the macro block must wait. Thus, a delay is introduced when making the transition to the next macro block process at each pipeline stage, and the performance of the decoder as a whole deteriorates.
According to an aspect of the embodiments, a decoding method which decodes dynamic image compression data based on motion compensation for divided rectangular regions of an image and decompresses the dynamic image compression data into an image that is stored in an image memory, includes decoding the dynamic image compression data and outputting coefficient data and motion vector data; storing the coefficient data in a coefficient data storage part; storing the motion vector data in a motion vector data storage part; generating a rectangular residual image by performing an inverse quantization process and an inverse frequency conversion process based on the coefficient data read from the coefficient data storage part; generating a rectangular predicted image by reading a reference image from the image memory based on the motion vector data read from the motion vector data storage part; generating a decoded image by adding the residual image and the predicted image, and storing the decoded image in the image memory; controlling processing timings of the storing of the coefficient data to the coefficient data storage part and the storing of the motion vector data to the motion vector data storage part; and storing predicted images with respect to at least two or more rectangular regions in a predicted image buffer when generating the rectangular predicted image by reading the reference image from the image memory, and notifying by a predicted image ready signal that the predicted image has been generated, wherein the processing timings are controlled in response to the predicted image ready signal.
The object and advantages of the invention will be realized and attained by means of the elements and combinations particularly pointed out in the claims.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory and are not restrictive of the invention, as claimed.
Preferred embodiments will be described with reference to the accompanying drawings.
According to one aspect, an image is divided (or segmented) into a plurality of rectangular regions (or areas), and dynamic image compression data based on motion compensation are decoded and expanded into an image that is stored in an image memory. An encoded data decoding part decodes the compression data and outputs coefficient data and motion vector data. A coefficient data storage part stores the coefficient data, and a motion vector data storage part stores the motion vector data. A coefficient data processing part generates a rectangular residual image by performing an inverse quantization process and an inverse frequency conversion process based on the coefficient data read from the coefficient data storage part, and the motion vector data processing part generates a rectangular predicted image by reading a reference image from the image memory based on the motion vector data read from the motion vector data storage part. A motion compensating part generates a decoded image by adding the residual image generated by the coefficient data processing part and the predicted image generated by the motion vector data processing part, and stores the decoded image in the image memory. A control part controls operation timings of the coefficient data processing part and the motion vector data processing part.
The motion vector data processing part includes a predicted image buffer configured to store the predicted image with respect to at least two or more rectangular regions, and a predicted image generation notifying part configured to notify the control part by a predicted image ready signal that the predicted image has been generated. Accordingly, the control part controls the operation timings described above in response to the predicted image ready signal. The predicted image generation notifying part may be configured to notify the control part by a predicted image ready signal that the predicted image has been generated with respect to at least two or more rectangular regions.
Accordingly, it may be possible to provide a decoding method, a decoder, and a decoding apparatus which may reduce the delay introduced for each macro block process even when a delay occurs in the predicted image generating process, thereby preventing the decoding performance from deteriorating.
A description will now be given of a decoding method, a decoder and a decoding apparatus in each embodiment, by referring to
Compressed Audio Visual (AV) data are converted by the front end processing part 11 and the demultiplexer part 12 into encoded video data and encoded audio data having formats suited for the decoding performed by the corresponding decoders 13 and 14. The encoded video data are decoded by the video decoder 13 and displayed on a monitor 17 via the video output system 15. On the other hand, the encoded audio data are decoded by the audio decoder 14 and output from a speaker 18 via the audio output system 16.
The decoding apparatus 10 is implemented in an apparatus having a video reproducing function, such as a video player/recorder and a video camera. The basic structure of the decoding apparatus 10 itself is known, but the decoder 13 according to one aspect has a structure or features which will be described below, unlike the conventional decoder.
The decoder 13 expands (or decodes) a video stream (or encoded video data) of the dynamic image in conformance with a dynamic image compression technique which performs an inter-frame prediction, typified by standards such as the MPEG-2, MPEG-4 and H.264.
The encoded data decoding part 61 interprets the encoded data for each macro block, and classifies the encoded data into the coefficient data and the motion vector data, without synchronizing to the macro block processes of the coefficient data processing part 64 and the motion vector data processing part 65. The encoded data decoding part 61 stores the coefficient data into the coefficient data storage part 62 and stores the motion vector data into the motion vector data storage part 63. In this embodiment, the motion vector data storage part 63 may store at least two or more motion vector data. The coefficient data storage part 62 and the motion vector data storage part 63 may be formed by separate storage parts or may be formed by different storage regions (or areas) of a single (that is, the same) storage part.
The control part 67 controls the operation timings of the encoded data decoding part 61, the coefficient data processing part 64, the motion vector data processing part 65, and the motion compensating part 66.
The coefficient data processing part 64 includes a coefficient data interpreting part 641, an inverse quantization part 642, and an inverse frequency conversion part 643. The coefficient data interpreting part 641 converts a macro block attribute in accordance with a compression rule, such as interpreting the order of the coefficient data within the macro block, into a data format handled by hardware. The coefficient data output from the coefficient data interpreting part 641 has been subjected to a quantization at the time of the compression, and thus, is subjected to an inverse quantization process in the inverse quantization part 642. In addition, because the image compression data has been subjected to a spatial and frequency conversion in accordance with the compression rule, the image compression data is then subjected to an inverse frequency conversion process in the inverse frequency conversion part 643, in order to output the residual image that is obtained by subtracting the predicted image from the original image. The residual image includes an error component generated by the compression process such as the spatial and frequency conversion and quantization, and this error component appears as a distortion in the decoded image.
On the other hand, the motion vector data processing part 65 includes a motion vector data interpreting part 651, a predicted image generating part 653, a predicted image buffer 654, and a predicted image generation notifying part 655. The motion vector data processing part 65 reads the motion vector data from the motion vector data storage part 63 and interprets the motion vector data. In addition, the motion vector data processing part 65 generates the predicted image by reading from the image memory 68 the reference image indicated by the interpreted motion vector, and stores the predicted image in the predicted image buffer 654, which may store the predicted images of at least two or more macro blocks. Further, if the data to be processed is stored in the motion vector data storage part 63 and the predicted image buffer 654 has sufficient vacant space, the motion vector data processing part 65 processes the motion vector data of the next macro block to generate the predicted image and stores the predicted image in the predicted image buffer 654, without achieving synchronization in units of macro blocks.
Every time the predicted image is stored in the predicted image buffer 654, the motion vector data processing part 65 outputs a predicted image ready signal to the control part 67 from the predicted image generation notifying part 655. The control part 67 confirms the receipt of the predicted image ready signal with respect to the macro block requiring the motion compensation, and controls the start of the motion compensation operation of the motion compensation part 66 for the macro block corresponding to the predicted image ready signal. Hence, the motion vector data processing part 65 averages the performance between the macro blocks that may be processed at a high speed, such as the macro block not requiring the motion compensation and the macro block having a simple block division, and the macro blocks that are processed at a low speed, such as the macro block requiring the motion compensation and having a complex block division. By averaging the performance among the macro blocks in this manner, the motion vector data processing part 65 may suppress the deterioration of the performance of the decoder 13 as a whole caused by the low-speed reading of the reference image.
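The decoupling described above can be sketched with a bounded buffer and a ready counter. The class and method names are hypothetical Python stand-ins; the actual predicted image buffer 654 and predicted image generation notifying part 655 are hardware blocks.

```python
from collections import deque

class PredictedImageBuffer:
    """Bounded FIFO holding predicted images for two or more macro blocks,
    with a ready count standing in for the predicted image ready signal."""

    def __init__(self, capacity=2):
        self.capacity = capacity
        self.images = deque()
        self.ready_signals = 0  # outstanding predicted-image-ready notifications

    def has_space(self):
        return len(self.images) < self.capacity

    def store(self, predicted_image):
        # Predicted image generating side: may run ahead while space remains,
        # without synchronizing in units of macro blocks.
        if not self.has_space():
            raise BufferError("buffer full; predicted image generation must wait")
        self.images.append(predicted_image)
        self.ready_signals += 1  # notify the control part

    def take(self):
        # Control side: start motion compensation only after a ready signal.
        if self.ready_signals == 0:
            raise BufferError("no predicted image ready; motion compensation must wait")
        self.ready_signals -= 1
        return self.images.popleft()
```

Because the generator may store predicted images ahead of consumption, a slow reference image read for one macro block can overlap with the motion compensation of an earlier, already-buffered macro block, which is the averaging effect described above.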
The motion compensation part 66 generates a decoded image by adding the residual image output from the coefficient data processing part 64 and the predicted image output from the motion vector data processing part 65, and stores the decoded image in the image memory 68.
The motion vector data processing part 65 does not perform a timing synchronization with respect to the coefficient data processing part 64 for each macro block. Accordingly, as illustrated in
As indicated by X1 in
The effects of averaging the reference image read performance caused by the block division were described in conjunction with
Furthermore, this second embodiment provides the residual image buffer 644 in a stage following the inverse frequency conversion part 643 within the coefficient data processing part 64. For this reason, even in the case of the macro block number 6 illustrated in
If the intra-macro blocks are consecutive as in the case of the macro block numbers 0 to 3 illustrated in
In addition, in this second embodiment, the residual image buffer 644 stores one or more residual images in units of the rectangular regions generated by the inverse frequency conversion process of the inverse frequency conversion part 643 within the coefficient data processing part 64, and the residual image generation notifying part 645 notifies by the residual image ready signal that the residual image has been generated. For this reason, the control part 67 controls the operation timings of the encoded data decoding part 61, the coefficient data processing part 64, the motion vector data processing part 65 and the motion compensation part 66, based on the predicted image ready signal and the residual image ready signal from the residual image generation notifying part 645.
A dynamic image decoding apparatus according to one aspect may conceal the inconsistencies in the load of reading the reference images stored in an external image memory. Hence, it may be possible to provide a dynamic image decoding apparatus having a stable decoding performance.
Aforementioned embodiments may be applied to a decoder and a decoding apparatus for decoding compressed and encoded dynamic image data when receiving digital broadcasting or reproducing video data from Digital Versatile Disks (DVDs).
According to any one of the aforementioned embodiments, the delay introduced for each macro block process may be reduced even when a delay occurs in the predicted image generating process, thereby preventing the decoding performance from deteriorating.
Although the embodiments are numbered with, for example, “first,” “second,” or “third,” the ordinal numbers do not imply priorities of the embodiments. Many other variations and modifications will be apparent to those skilled in the art.
All examples and conditional language recited herein are intended for pedagogical purposes to aid the reader in understanding the invention and the concepts contributed by the inventor to furthering the art, and are to be construed as being without limitation to such specifically recited examples and conditions, nor does the organization of such examples in the specification relate to a showing of the superiority and inferiority of the invention. Although the embodiments have been described in detail, it should be understood that various changes, substitutions, and alterations could be made hereto without departing from the spirit and scope of the invention.
This application is a continuation application filed under 35 U.S.C. 111(a) claiming the benefit under 35 U.S.C. 120 and 365(c) of a PCT International Application No. PCT/JP2007/055621 filed on Mar. 20, 2007, in the Japanese Patent Office, the disclosure of which is hereby incorporated by reference.
Relation | Number | Date | Country
---|---|---|---
Parent | PCT/JP2007/055621 | Mar. 2007 | US
Child | 12/561,670 | | US