The present invention relates to a video coding apparatus and a video coding method and, in particular, to technologies suitably used for a recording and reproducing apparatus for storing and reproducing video data and an image capture apparatus.
In recent years, advances in digital signal processing technologies have enabled large volumes of digital information, such as moving images, still images, audio and the like, to undergo highly efficient coding, to be stored in small storage media and to be transmitted through communication media. Applying such technologies, development is underway on video coding apparatuses which convert video from television broadcasts and video cameras into compressed data (a bit stream). Among the various video coding methods for moving images, H.264 (also known as MPEG-4 Part 10/AVC) is attracting particular attention.
H.264 adopts arithmetic coding called Context-based Adaptive Binary Arithmetic Coding (hereinafter referred to as CABAC) and variable length coding called Context-based Adaptive Variable Length Coding (hereinafter referred to as CAVLC).
Focusing attention on the above described CABAC and CAVLC, JP-A-2004-135251 has been proposed as a prior example. The invention disclosed in JP-A-2004-135251, “an Image Information Coding Method and an Image Information Decoding Method”, intends to restrict the amount of data input to and output from CABAC and thereby to secure the processing time of the decoder.
Specifically, the invention of JP-A-2004-135251 separately includes counters for the number of binary data input to the CABAC coder and counters for the number of bit data to be output. It discloses a configuration in which, in the case where any one of those counters exceeds a preset threshold value, a restraint monitor outputs a signal indicating that the coded data is invalid, and the recoding process is carried out.
The image data input from the video input unit 901 is divided into image blocks each including 16 pixels in each of the vertical and horizontal directions and is converted into coefficient series on an image block basis by the conversion unit 902. The conversion unit 902 carries out orthogonal transform processing, motion prediction processing and the like to thereby reduce the visual redundancy of the image block.
The coefficient series having undergone orthogonal transform in the conversion unit 902 are supplied to the quantization unit 903 and quantized with predetermined quantization parameters. In accordance with the size of the quantization parameters in the quantization unit 903, the information amount of the coefficient series is reduced and, in exchange, coding deterioration occurs. The quantized coefficient series are supplied to the entropy coding unit 904.
The entropy coding unit 904 carries out data compression processing by converting the input coefficient series into more efficient code series based on the occurrence frequency of the symbols constituting the coefficient series. The code series generated by entropy coding are output from the stream output unit 905 as a bit stream.
The data amount and the image quality of the bit stream are in a trade-off relation governed by the quantization parameters in the quantization unit 903. Accordingly, the determination of the quantization parameters significantly affects the performance of the video coding apparatus.
In addition, that determination must be made appropriately so that the data amount of the generated bit stream fulfills the buffer model at decoding. The buffer model is specified in the ISO/IEC 13818-2 standard and the ITU-T H.264 standard. The conventional apparatus carries out control as follows.
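As an illustration of what fulfilling a buffer model means, the following is a minimal sketch in Python of a simplified decoder buffer check. It is not the normative model of either standard: the function name, the parameter names, the constant fill rate per picture interval and the capping of the fullness at the buffer size are all assumptions made for illustration.

```python
def buffer_underflows(picture_bits, bits_per_interval, buffer_size, initial_fullness):
    """Return True if any picture needs more bits than the decoder buffer holds.

    Simplified, hypothetical model: the buffer receives `bits_per_interval`
    bits between pictures, and each picture is removed from it all at once.
    """
    fullness = initial_fullness
    for bits in picture_bits:
        if bits > fullness:
            # The picture has not fully arrived when it must be decoded.
            return True
        # Remove the picture, then refill for one interval, capped at capacity.
        fullness = min(fullness - bits + bits_per_interval, buffer_size)
    return False
```

In this sketch a picture that is too large for the current fullness breaks the model, while smaller pictures never do, which is the property the quantization control has to maintain.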
The data amount detection unit 906 detects the data amount output from the entropy coding unit 904 and supplies the quantization control unit 907 with the detected value. The quantization control unit 907 integrates the generated data amount of each picture coded so far and calculates the target code amount of the picture to be coded subsequently so that the buffer model is fulfilled.
While the generated code amount of each image block constituting the picture is monitored, the quantization parameters are sequentially determined and supplied to the quantization unit 903 so that the code amount generated from the picture to be coded approaches the target code amount. That is, each time a single picture undergoes coding, the data amount generated from the picture is detected and the detection result is fed back for use in coding the subsequent picture.
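The picture-by-picture feedback described above can be sketched as follows in Python. This is a minimal sketch under stated assumptions, not the control law of the conventional apparatus: a constant bit budget per picture stands in for the buffer model, and the function and parameter names are hypothetical.

```python
def next_picture_target(bits_per_picture, generated_so_far):
    """Target code amount for the next picture under a constant per-picture budget.

    `generated_so_far` lists the detected code amount of every picture coded
    so far. The target is chosen so that the running total returns to the
    cumulative budget after the next picture (an assumed policy).
    """
    budget_including_next = bits_per_picture * (len(generated_so_far) + 1)
    target = budget_including_next - sum(generated_so_far)
    # A negative target is meaningless, so clamp at zero.
    return max(target, 0)
```

For example, with a budget of 1000 bits per picture and detected amounts of 1200 and 900 bits, the third picture receives a target of 900 bits. The sketch depends on the detected amount of each picture being available before the next picture is coded.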
However, in the case where the binary arithmetic coding method (CABAC) of the video coding method “H.264”, which has recently been attracting attention, is used in the entropy coding unit 904, such a process becomes extremely difficult.
In the case of adopting the “CABAC” in H.264, the entropy coding unit 904 in
In that case, the arithmetic coding unit 911 carries out processing on a binary symbol basis and therefore requires far more process cycles than a conventional entropy coding unit does. The reason is that conventional entropy coding carries out processing on the basis of each coefficient supplied from the quantization unit 903, whereas in CABAC one coefficient corresponds to a plurality of binary symbols. Therefore, in the worst case, several times to 10 times as many processing cycles as in the conventional case are required.
Thus, the processing cycles of the arithmetic coding unit 911 occasionally become extremely long, and consequently the entropy coding processing time for one picture becomes extremely long. As a result, detecting the amount of data resulting from entropy coding on a picture basis and feeding it back as described above is hardly realizable.
One conceivable coding apparatus solves that problem by increasing the process speed with a circuit configuration using a higher processing clock. However, such a configuration gives rise to the problem of sacrificing circuit size and electricity consumption.
Another conceivable coding apparatus solves the problem by using the variable length coding CAVLC, which is a coding method equivalent to conventional entropy coding. However, CAVLC is lower than CABAC in coding efficiency and gives rise to the problem of degrading the image quality. Accordingly, even the invention of “an Image Information Coding Method and Image Information Decoding Method” proposed in the above described JP-A-2004-135251 cannot solve the problems described above.
In view of the above described problems, an object of the present invention is to provide a video coding apparatus and a video coding method that can enhance coding efficiency without excessively increasing the circuit size and electricity consumption and can generate a bit stream appropriately fulfilling a buffer model.
According to an aspect of the present invention, a video coding apparatus comprises:
a quantization unit configured to quantize input video data;
a binarization unit configured to convert video data quantized by the quantization unit into binary symbol series;
an arithmetic coding unit configured to subject the binary symbol series converted by the binarization unit to arithmetic coding; and
a quantization control unit configured to estimate a generated data amount based on the output of the binarization unit to control quantization, and detect an actual generated data amount from the output of the arithmetic coding unit to thereby correct the estimated generated data amount.
According to another aspect of the present invention, a video coding method comprises the steps of:
quantizing input video data;
converting the quantized video data into binary symbol series;
subjecting the binary symbol series to arithmetic coding; and
controlling to estimate a generated data amount on the basis of a result of the binarization to control quantization, and to detect an actual generated data amount from the result of the arithmetic coding to thereby correct the estimated generated data amount.
Further features and aspects of the present invention will become apparent from the following detailed description of exemplary embodiments with reference to the attached drawings.
Numerous embodiments, features and aspects of the present invention will be described with reference to the drawings.
Exemplary embodiments of the present invention will be described with reference to the drawings below.
A video coding apparatus 100 in the present exemplary embodiment carries out highly efficient coding processing in the following procedure based on the H.264 (AVC) specification. The video data input from the video input unit 101 is divided by the conversion unit 102 into image blocks each including 16 pixels in each of the vertical and horizontal directions and is converted into coefficient series on an image block basis. The conversion unit 102 carries out orthogonal transform processing and prediction processing (intra-prediction and inter-prediction) to thereby reduce the visual redundancy of the information in the image block.
The coefficient series having undergone orthogonal transform in the conversion unit 102 are supplied to the quantization unit 103. The quantization unit 103 quantizes the input coefficient series with the quantization parameters supplied from the quantization control unit 109.
In accordance with the size of the quantization parameters in the quantization unit 103, the information amount of the coefficient series is reduced and, in exchange, coding deterioration occurs. The quantized coefficient series are supplied to the binarization unit 104.
The binarization unit 104 converts the respective coefficients in the input coefficient series into binary symbols of “0” and “1”. The binary symbol series including a plurality of binary symbols are supplied to the arithmetic coding unit 105, where they are converted into arithmetic code series and thereby subjected to data compression.
Here, the binarization unit 104 can convert the respective coefficients one to one into binary symbol series in accordance with a table. Therefore, the processing here can be executed “on the coefficient basis”.
On the other hand, the arithmetic coding unit 105 needs to process the binary symbols one by one (“on the binary symbol basis”). The code series generated by arithmetic coding are output from the stream output unit 106 as a bit stream.
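To illustrate why one coefficient can correspond to many binary symbols, the following is a minimal sketch in Python of a binarization with a truncated-unary prefix and a 0th-order Exp-Golomb suffix, in the style of the scheme H.264 applies to coefficient magnitudes. The function names, the cutoff value of 14 and the omission of sign symbols and context modeling are assumptions for illustration and are not taken from this description.

```python
def binarize_level(value, cutoff=14):
    """Return the binary symbol string for a non-negative coefficient magnitude.

    Hypothetical helper: unary prefix up to `cutoff`, then a 0th-order
    Exp-Golomb suffix for the part of the value at or above the cutoff.
    """
    if value < 0:
        raise ValueError("value must be non-negative")
    if value < cutoff:
        return "1" * value + "0"
    bins = "1" * cutoff
    rest, k = value - cutoff, 0
    # Exp-Golomb escape: a unary run selects the bit width of the remainder.
    while rest >= (1 << k):
        bins += "1"
        rest -= 1 << k
        k += 1
    bins += "0"
    for shift in range(k - 1, -1, -1):
        bins += str((rest >> shift) & 1)
    return bins


def count_bins(coefficients):
    """Number of binary symbols for a coefficient series (sign symbols ignored)."""
    return sum(len(binarize_level(abs(c))) for c in coefficients)
```

In this sketch a magnitude of 0 yields one binary symbol while a magnitude of 15 yields 17, so the number of symbols handed to the arithmetic coder varies widely from coefficient to coefficient, whereas `count_bins` needs only one lookup per coefficient.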
The data amount and the image quality of the bit stream are in a trade-off relation governed by the quantization parameters in the quantization unit 103. Accordingly, the processing of the quantization control unit 109, which determines the quantization parameters, significantly affects the performance of the present video coding apparatus.
Moreover, that determination must be controlled appropriately so that the data amount of the generated bit stream fulfills the predetermined buffer model at decoding. Therefore, the quantization control unit 109 in the present video coding apparatus determines the quantization parameters by using both the first data amount detection unit 107 and the second data amount detection unit 108, and controls quantization by supplying the quantization unit 103 with the determined quantization parameters.
First, the first data amount detection unit 107 detects the data amount output from the binarization unit 104, that is, the data amount of the binary symbol series, and supplies the quantization control unit 109 with the detection value (first detection value). The data amount of the binary symbol series is detected on the coefficient basis in correspondence with the processing of the binarization unit 104. The quantization control unit 109 integrates the data amount of the binary symbol series generated for each picture coded so far and calculates, as the target value of the picture to be coded subsequently, a data amount of the binary symbol series such that the buffer model is fulfilled.
While the quantization control unit 109 monitors the generated code amount for each image block constituting the picture, it sequentially determines the quantization parameters and supplies them to the quantization unit 103 so that the data amount of the binary symbol series generated from the picture to be coded approaches the target value.
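The block-by-block determination of the quantization parameters can be sketched as follows in Python. This is a minimal sketch, assuming a simple one-step adjustment rule that this description does not specify; the function name, the parameters and the step size are hypothetical. The range of 0 to 51 is the quantization parameter range of H.264 for 8-bit video.

```python
QP_MIN, QP_MAX = 0, 51  # H.264 quantization parameter range for 8-bit video


def update_qp(qp, bins_so_far, blocks_done, total_blocks, target_bins, step=1):
    """Adjust the quantization parameter for the next image block.

    Compares the binary symbols generated so far in the picture with the
    share of the picture target expected after `blocks_done` blocks
    (assumed to be a uniform share per block).
    """
    expected = target_bins * blocks_done / total_blocks
    if bins_so_far > expected:
        qp += step   # too much data: quantize more coarsely
    elif bins_so_far < expected:
        qp -= step   # too little data: quantize more finely
    return min(max(qp, QP_MIN), QP_MAX)
```

For example, with a picture target of 5000 binary symbols over 100 blocks, 600 symbols after 10 blocks exceeds the expected 500, so a parameter of 26 is raised to 27.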
Here, the detection value used for quantization control, that is, the data amount of the binary symbol series, indicates the data amount at an intermediate stage of entropy coding. Therefore, it indicates a value several tens of percent larger than the finally generated data amount. That is, the data amount of the bit stream after the arithmetic coding of the subsequent stage will be less than the detection value and the target value in the state of the binary symbol series. From the point of view of fulfilling a buffer model, however, too large an amount of data would exceed the buffer capacity and cause the buffer model to fail, whereas a smaller amount of data remains appropriate for the buffer model.
In addition, the processing of the quantization control unit 109 is carried out in accordance with the output of the first data amount detection unit 107, which is updated on the coefficient basis. Therefore, the process cycle can proceed on the coefficient basis. Consequently, while a high-performance coding system is adopted, the data amount can be controlled as smoothly as in a conventional system, in synchronization with the unit of the picture, the image block or the slice including a set of image blocks.
Incidentally, when the generated data amount is controlled in accordance with the binary symbol series as described above, only data in an amount less than the generated data amount assumed by the buffer model is generated. Therefore, although the buffer model does not fail, there remains room for improvement from the point of view of coding efficiency.
Therefore, the video coding apparatus 100 uses the second data amount detection unit 108 to estimate the data amount accurately and thereby realize coding with good efficiency. The processing procedure thereof will be described below.
The above described quantization control unit 109 obtains the second detection value from the second data amount detection unit 108 together with the first detection value obtained from the first data amount detection unit 107. The second detection value is updated on the binary symbol basis. In the video coding apparatus 100, one coefficient may become an extremely long binary symbol series or an extremely short binary symbol series. Therefore, the relation between the coefficients and the detected generated data amount is not one to one, and the second detection value is out of synchronization with the units of coding processing such as the picture, the image block and the slice.
In the present exemplary embodiment, the quantization control unit 109 determines the target value of the provisional data amount as well as the quantization parameters by using the first detection value output from the first data amount detection unit 107 as described above. Moreover, in order to control the data amount in synchronization with the picture, the image block and the slice, the quantization control unit 109 calculates the generated data amount corresponding to each picture on the basis of the second detection value output from the second data amount detection unit 108. In accordance with the information on the generated data amount corresponding to the picture, the quantization control unit 109 corrects the generated data amount asynchronously while successively updating the quantization parameters.
In the step S0, it is determined whether or not the process is to be finished. When the process is to be finished (YES in the step S0), the present flow comes to an end. On the other hand, when the process is not to be finished (NO in the step S0), the flow goes forward to the step S1.
In the step S1, the quantization control unit 109 calculates the provisional generated data amount of the pictures coded so far on the basis of the result of integrating the first detection value from the first data amount detection unit 107.
In the step S2, the quantization control unit 109 determines the target value of the data amount of the subsequent picture on the basis of the provisional generated data amount of the pictures so far calculated in the step S1.
In the step S3, the quantization control unit 109 sequentially determines quantization parameters of each image block within the picture on the basis of the target value of the subsequent picture determined in the step S2.
In the step S4, the quantization control unit 109 obtains the second detection value from the above described second data amount detection unit 108 and determines whether or not the generated data amount corresponding to the picture has been calculated.
If the determination in the step S4 shows that the amount has not yet reached the amount corresponding to the picture (NO in the step S4), the flow returns to the step S0. In the case where the amount has reached the amount corresponding to the picture (YES in the step S4), the flow goes forward to the step S5.
In the step S5, the generated data amount corresponding to the picture, calculated based on the above described second detection value, and the above described provisional generated data amount are compared for mutually corresponding pictures, and the provisional generated data amount is thereby corrected. That is, the target value and the quantization parameters, which are based on the provisional generated data amount, are updated to more appropriate values to adjust the generated data amount. Thereafter, the flow returns to the step S0 and is repeated until the finishing conditions are fulfilled.
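The steps S1, S2, S4 and S5 can be sketched as follows in Python. This is a minimal sketch under stated assumptions rather than the implementation of the quantization control unit 109: the class and method names are hypothetical, a constant bit budget per picture stands in for the buffer model, and the block-level determination of the step S3 is omitted.

```python
class QuantizationControl:
    """Provisional estimate from binary symbol counts, corrected asynchronously."""

    def __init__(self, bits_per_picture):
        self.bits_per_picture = bits_per_picture
        self.provisional = {}   # picture index -> binary symbol count (first detection value)
        self.correction = 0     # accumulated excess of provisional over actual amounts

    def on_binarized(self, picture, bin_count):
        """S1: record the provisional generated data amount of a picture."""
        self.provisional[picture] = bin_count

    def estimated_total(self):
        """Provisional total, reduced by every correction confirmed so far."""
        return sum(self.provisional.values()) - self.correction

    def next_target(self):
        """S2: target for the subsequent picture under the assumed budget."""
        budget = self.bits_per_picture * (len(self.provisional) + 1)
        return max(budget - self.estimated_total(), 0)

    def on_arithmetic_coded(self, picture, actual_bits):
        """S4 and S5: once a picture's actual amount is known, correct the estimate."""
        self.correction += self.provisional[picture] - actual_bits
```

For example, after a picture with 1200 binary symbols the target under a budget of 1000 per picture is 800; once the arithmetic coder reports 900 actual bits for that picture, the target rises to 1100 without the control loop ever having waited for the arithmetic coder.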
On the other hand, the development of the generated data amount based on the second detection value indicated in the trace 302 indicates that the process in the arithmetic coding unit 105 is out of synchronization with the pixel number and therefore the data amount of the picture is calculated in a fluctuating time interval. In
In the state prior to correction, the provisional generated data amount is calculated to be more than the correct generated data amount. Therefore, buffer control is stricter than originally required. The generated data amount is always smaller than the amount allowed by the specification, so that the image quality performance cannot be sufficiently exploited. However, as described above, every time the difference d is determined, the detected data amount is corrected to the correct value. Thereby, more preferable image quality performance can be realized with appropriate buffer control.
Here, at the time point of the marker 404 corresponding to the sixth picture, for example, the decoder buffer amount shifts to the state of the marker 403 based on the above described difference d, which enables control that takes advantage of the sufficient remaining buffer amount. Thus, while the buffer model is controlled with the provisional generated data amount, correction can be implemented so as to follow it, and preferable control can be carried out. In the description of the present embodiment, an example of carrying out correction with the difference between the first detection value and the second detection value is described. However, correction based on the proportion between them can give rise to a similar effect and falls within the aspect of the present invention.
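The two correction variants can be contrasted with a minimal sketch in Python. The function names and the way the proportion is applied to the provisional amounts of pictures not yet confirmed are assumptions for illustration; this description states only that either the difference or the proportion may be used.

```python
def correct_by_difference(provisional_total, provisional_picture, actual_picture):
    """Subtract the difference d found for one confirmed picture from the total."""
    d = provisional_picture - actual_picture
    return provisional_total - d


def correct_by_proportion(provisional_pending, provisional_picture, actual_picture):
    """Scale the provisional amount of pending pictures by the observed ratio.

    Assumes that pictures not yet confirmed compress by the same ratio as the
    confirmed one, which is a hypothetical policy.
    """
    ratio = actual_picture / provisional_picture
    return provisional_pending * ratio
```

With a confirmed picture of 1200 provisional and 900 actual bits, the difference variant lowers a provisional total of 5000 to 4700, while the proportion variant scales a pending provisional amount of 2400 to 1800.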
With such a configuration, while the provisional generated data amount is being estimated in accordance with an output of the binarization unit of a short processing cycle to carry out quantization control sequentially, the correct generated data amount is detected from the output of the arithmetic coding unit so as to follow the quantization control, to correct the estimated amount appropriately. Consequently, without increasing the circuit size and the electricity consumption, a bit stream fulfilling the buffer model appropriately can be generated.
Other Exemplary Embodiments According to an Aspect of the Present Invention
The respective units constituting the video coding apparatus in the exemplary embodiment of the present invention described above and the respective steps of the video coding method can be realized by the operation of programs stored in the RAM, the ROM and the like of a computer. Those programs and the computer readable storage medium storing those programs are included in the present invention.
In addition, the present invention can be embodied as, for example, a system, an apparatus, a method, a program or a storage medium and, specifically, may be applied to a system configured by a plurality of apparatuses or to an apparatus consisting of a single device.
Here, the present invention supplies the system or the apparatus directly or remotely with the program of the software for realizing the function of the embodiment (the program corresponding with the flow chart illustrated in
Accordingly, in order to realize the function processing of the present invention with a computer, the program code itself installed in the above described computer is also for realizing the present invention. That is, the present invention includes the computer program itself for realizing the function processing of the present invention.
In that case, as long as the function of the program is provided, the program may take any form, such as object code, a program executed by an interpreter or script data supplied to the OS.
The storage medium for supplying the program is, for example, a floppy (registered trademark) disk, a hard disk, an optical disk, a magneto-optical disk, an MO, a CD-ROM, a CD-R, a CD-RW and the like. In addition, there are a magnetic tape, a nonvolatile memory card, a ROM, a DVD (DVD-ROM, DVD-R) and the like.
Alternatively, the program can be supplied by connecting to a home page on the Internet with a browser of a client computer and downloading, from that home page to a storage medium such as a hard disk, the computer program of the present invention itself or a compressed file containing an automatic install function.
In addition, the program can also be supplied by dividing the program code constituting the program of the present invention into a plurality of files and downloading the respective files from different websites. That is, a WWW server which allows a plurality of users to download the program files for realizing the function processing of the present invention with a computer is also included in the present invention.
In addition, the program of the present invention may be encrypted, stored in a storage medium such as a CD-ROM and distributed to users, and a user who has cleared predetermined conditions may be allowed to download key information for decrypting the program from a website through the Internet. The downloaded key information can then be used to execute the encrypted program and install it in a computer, thereby realizing the functions.
In addition, a computer executes the read program and thereby the functions of the embodiment described above are realized. Otherwise, based on the instruction of that program, the OS and the like in operation on a computer execute a part or the whole of the actual processing and thereby the functions of the embodiment described above can be realized.
Moreover, the program read from a storage medium may be written in a memory provided in a function expansion board inserted into a computer or in a function expansion unit connected to a computer. Thereafter, based on the instructions of that program, a CPU and the like provided in the function expansion board or function expansion unit execute a part or the whole of the actual processing, and thereby the functions of the embodiment described above are realized.
While the present invention has been described with reference to the exemplary embodiments, it is to be understood that the invention is not limited to the disclosed exemplary embodiments. The scope of the following claims is to be accorded the broadest interpretation so as to encompass all such modifications and equivalent structures and functions.
This application claims the benefit of Japanese Patent Application No. 2006-176636 filed on Jun. 27, 2006, which is hereby incorporated by reference herein.
Number | Date | Country | Kind |
---|---|---|---|
2006-176636 | Jun 2006 | JP | national |

Filing Document | Filing Date | Country | Kind | 371c Date |
---|---|---|---|---|
PCT/JP2007/061400 | 5/30/2007 | WO | 00 | 10/21/2008 |