This invention relates to a technique for the compression encoding of a moving picture and, more particularly, to a moving picture compression encoding method, apparatus and program for performing encoding upon selecting a mode of intra prediction (intraframe prediction) based upon evaluation values of a plurality of types.
In H.264/MPEG-4 Part 10 (ISO/IEC 14496-10) (referred to as “H.264” below) (see Non-Patent Document 1), intra prediction involves 4×4 and 16×16 blocks with regard to the luminance component, and there are prediction modes of nine types and four types, respectively. Further, there are four types of prediction modes with regard to an 8×8 block for the color-difference component.
In contribution JVT-I1049d0.doc (Non-Patent Document 2) at a meeting of the JVT (Joint Video Team) performing H.264 standardization work, the following have been proposed as evaluation measures for selecting a prediction mode in intra prediction:
(a) SAD (Sum of Absolute Differences), which generates a cost value of difference information indicative of a difference between a moving picture signal that is input to a moving picture compression encoding system and a prediction signal that is output from a prediction signal generating system; and
(b) SATD (Sum of Absolute Transformed Differences), which subjects this difference information to a Hadamard transform and generates cost values for all modes.
In the H.264 reference software (Joint Model, referred to as “JM” below) that has been developed as part of the standardization activities of the JVT, the SAD and the SATD are employed in selecting an intra prediction mode. However, the cost values used when a prediction mode is selected are all found based upon only one of the two measures, either the SAD or the SATD.
Techniques for selecting a prediction scheme and block size are known in the art (e.g., see Patent Documents 1 and 2 below). Patent Document 1 discloses a moving picture predictive encoding scheme in which prediction errors based upon a plurality of prediction methods that differ for respective ones of blocks of various block sizes are obtained, a prediction method that is suited to the block concerned and a prediction error are selected adaptively by evaluating each prediction error by first evaluating means, and a block size that is optimum for encoding is selected adaptively by second evaluating means for every part of a moving picture by evaluating the prediction error regarding each block size which has been obtained by the first evaluation means. Further, Patent Document 2 discloses a moving picture predictive encoding scheme adapted to transmit block-size information and prediction-scheme type to a receiving side, detect changeless portions and portions of sudden change based upon results of DCT processing, and increase or decrease block size in accordance with each portion detected to thereby enable the selection of block sizes optimum for changeless portions and for suddenly changing portions of a picture. Further, Patent Document 3 discloses a moving picture predictive encoding scheme in which the prediction method is changed over adaptively in conformity with the local quality of a picture, as a result of which the overall transmission efficiency is improved. 
Furthermore, Patent Document 4 discloses an arrangement having first encoding means for obtaining an approximate plane for every image data block and encoding information that specifies the approximate plane as well as a difference value between the approximate plane and a block; second encoding means for selecting a prediction scheme from a plurality of prediction schemes block by block, and performing encoding by the scheme selected; and means for selecting either the first encoding means or the second encoding means block by block.
In the prior art described in Patent Documents 1 to 3, however, only SAD is used as the measure for evaluating the intra prediction scheme. As a consequence, encoding efficiency declines markedly in comparison with a case where intra prediction is performed based upon SATD.
[Patent Document 1]
Japanese Patent No. 2608909
[Patent Document 2]
Japanese Patent No. 2702139
[Patent Document 3]
Japanese Patent No. 2716703
[Patent Document 4]
Japanese Patent Kokai Publication No. JP-A-9-9265
[Non-Patent Document 1]
H.264/MPEG-4 Part 10 (ISO/IEC 14496-10) Internet <URL:http://www.itu.int/rec/recommendation.asp?type=item&lang=e&parent=T-REC-H.264-200305-I>
[Non-Patent Document 2]
JVT-1049d0.doc Internet
<ftp://standards.polycom.com/2003_09_SanDiego>
The following problems arise with the moving picture compression encoding schemes of the prior art set forth above:
The first problem is that encoding efficiency is poor when a mode is selected based solely upon difference information (SAD) between an input signal and prediction signal with regard to an intra prediction scheme.
The reason for this is as follows: Data that has been encoded by a moving picture compression encoding scheme has undergone a frequency conversion with regard to the difference information between the input signal and the prediction signal. Difference information alone, therefore, exhibits poor accuracy as a standard for evaluating encoding efficiency.
The second problem is that in a case where a frequency conversion is applied to the difference information and a cost is calculated from the result in order to improve encoding efficiency, a very large amount of computation is needed to select the optimum mode.
The reason for this is that since a plurality of prediction modes exist for each of a plurality of block sizes in order to perform intra prediction, it is necessary that values obtained by frequency-converting the difference information be calculated with respect to all modes in a case where the optimum prediction mode is selected.
Accordingly, it is an object of the present invention to provide a moving picture compression encoding apparatus, method and program whereby the encoding efficiency of an intra predicting (intraframe predicting) unit can be improved.
Another object of the present invention is to provide a moving picture compression encoding apparatus, method and program having an intra predicting unit that is capable of high-speed processing.
According to a first aspect of the present invention, the above and other objects are attained by providing a moving picture compression encoding apparatus in which an intra predicting unit performs a multiple-stage search for a prediction mode, the apparatus comprising means for calculating first evaluation values based upon difference information indicative of differences between a moving picture signal that is input to the moving picture compression encoding apparatus and prediction signals that have been generated by the intra predicting unit; means for preliminarily selecting a plurality of prediction modes based upon the first evaluation values; means for calculating second evaluation values based upon difference information of the plurality of prediction modes preliminarily selected; and means for selecting one prediction mode from the plurality of preliminarily selected prediction modes based upon the second evaluation values.
According to another aspect of the present invention, the foregoing object is attained by providing a moving picture compression encoding apparatus for a case where a plurality of block sizes exist, comprising: means for calculating first evaluation values based upon difference information indicative of differences between a moving picture signal that is input to the moving picture compression encoding apparatus and prediction signals that have been generated by an intra predicting unit; means for preliminarily selecting a plurality of prediction modes based upon the first evaluation values; means for calculating second evaluation values based upon difference information of the prediction modes preliminarily selected; means for selecting one prediction mode from the preliminarily selected prediction modes based upon the second evaluation values; and means for selecting an optimum block size from the second evaluation values.
According to another aspect of the present invention, the foregoing object is attained by providing a moving picture compression encoding method, comprising the steps of: calculating first evaluation values based upon difference information indicative of differences between an input moving picture signal and prediction signals that have been generated by an intra predicting unit; preliminarily selecting a plurality of prediction modes based upon the first evaluation values; calculating second evaluation values based upon difference information of the prediction modes preliminarily selected; and selecting one prediction mode from the preliminarily selected prediction modes based upon the second evaluation values.
According to another aspect of the present invention, the foregoing object is attained by providing a program for causing a computer constituting a moving picture compression encoding apparatus to execute the following processing: processing for calculating first evaluation values based upon difference information indicative of differences between an input moving picture signal and prediction signals that have been generated by an intra predicting unit; processing for preliminarily selecting a plurality of prediction modes based upon the first evaluation values; processing for calculating second evaluation values based upon difference information of the prediction modes preliminarily selected; and processing for selecting one prediction mode from the preliminarily selected prediction modes based upon the second evaluation values.
The meritorious effects of the present invention are summarized as follows.
The present invention is such that when a mode is selected, encoding efficiency can be improved greatly and processing speeded up in comparison with a method of selecting a mode based solely upon difference information (SAD).
In accordance with the present invention, it is unnecessary to frequency-convert difference information with regard to all modes when a mode is selected by an intra predicting unit. As a result, the amount of computation can be reduced greatly and a high performance approaching that of SATD can be provided in terms of efficiency.
Still other features and advantages of the present invention will become readily apparent to those skilled in this art from the following detailed description in conjunction with the accompanying drawings wherein only the preferred embodiments of the invention are shown and described, simply by way of illustration of the best mode contemplated of carrying out this invention. As will be realized, the invention is capable of other and different embodiments, and its several details are capable of modifications in various obvious respects, all without departing from the invention. Accordingly, the drawing and description are to be regarded as illustrative in nature, and not as restrictive.
A preferred mode of practicing the present invention will be described in detail with reference to the accompanying drawings.
An intra prediction apparatus according to this mode of practicing the present invention includes a prediction signal generator (103) for generating prediction signals, the generator receiving a reconstructed signal (13) that is part of an image signal of an adjacent block for which encoding ended before the encoding of a block to undergo intra prediction; a prediction mode decision unit (108) having a cost calculator (104) for generating cost values based upon difference information indicative of differences between the prediction signals generated by the prediction signal generator (103) and a moving picture signal that is input to the moving picture compression encoding apparatus, a preliminary selector (105) for selecting and outputting to a cost calculator (106) at least two prediction modes (e.g., the two or more modes having the smallest cost values), based upon the cost values of the difference information, from among all modes of an intra predicting unit, the cost calculator (106) for frequency-converting difference information and generating new cost values with regard to the prediction modes that have been output from the preliminary selector (105), and a prediction mode selector (107) for selecting the optimum prediction mode based upon the cost values obtained after the frequency conversion; and a prediction signal call unit (109) for reading in a prediction signal, which corresponds to the prediction mode (115) that has been output from the prediction mode decision unit (108), from a prediction signal memory (102) and outputting this signal as a result of intra prediction.
In this mode of practicing the present invention, if a plurality of intra predicting units are provided and a plurality of block sizes exist, then means are provided for changing a prediction mode discriminating method and selecting a prediction mode by the intra predicting units for every block size, and selecting the block size. Alternatively, means are provided for selecting the prediction mode by the intra predicting units for every block size using any one among a plurality of prediction mode discriminating methods, and selecting the block size.
Embodiments of the present invention will now be described.
As shown in
The prediction signal generator 103 includes a plurality of filters 101 that receive the reconstructed signal, and a plurality of prediction signal memories 102 that receive prediction signals 110 from respective ones of the filters 101.
The prediction mode decision unit 108 includes a cost calculator 104, the preliminary selector 105, a cost calculator 106 and a prediction mode selector 107.
The operation of the prediction signal generator 103 will be described first. In
With regard to the intra prediction, this is performed, as set forth in Non-Patent Document 1, using 4 or 16 pixels situated at the right end in a block to the left of the particular block, 4 or 16 pixels situated at the lower end in a block above or to the upper right of the particular block, and 1 pixel at the lower-right end of a block to the upper left of the particular block.
The following prediction modes are available in intra prediction:
The prediction signal memories 102 are devices for storing the prediction signals 110 that are output from the filters 101. The prediction signals 110 are output to the cost calculator 104 of the prediction mode decision unit 108 and to the prediction signal call unit 109.
The reconstructed signal 13 is part of an image signal of an adjacent block for which encoding ended before the encoding of the particular block to undergo intra prediction.
The prediction signals 110 are signals generated using the filters 101 that differ depending upon the prediction mode, or are signals having the same value as the reconstructed signal 13.
The operation of the prediction mode decision unit 108 will be described next.
The cost calculator 104 calculates difference information 112 between the prediction signals 110 and moving picture signal 12 and outputs cost values 111 to the preliminary selector 105. The cost calculator 104 calculates the following as the difference information (block difference) 112, by way of example:
Diff(i,j)=Original(i,j)−Prediction(i,j) (1)
where Prediction(i,j) is the prediction signal 110 and Original(i,j) is the moving picture signal 12.
The cost calculator 104 outputs SAD (Sum of Absolute Differences) [see Equation (2) below], which is the sum of absolute values of the difference data Diff(i,j), as the cost value.
SAD = Σ_{i,j} |Diff(i,j)| (2)
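Equations (1) and (2) can be illustrated with a minimal sketch. The function names `block_difference` and `sad`, the list-of-rows block representation, and the sample pixel values are assumptions made for illustration only and do not appear in the original description.

```python
def block_difference(original, prediction):
    """Equation (1): Diff(i,j) = Original(i,j) - Prediction(i,j)."""
    return [[o - p for o, p in zip(orig_row, pred_row)]
            for orig_row, pred_row in zip(original, prediction)]


def sad(diff):
    """Equation (2): sum of the absolute values of Diff(i,j)."""
    return sum(abs(value) for row in diff for value in row)


# Hypothetical 4x4 block of the moving picture signal and one prediction signal.
original = [[10, 12, 14, 16] for _ in range(4)]
prediction = [[11, 12, 13, 16] for _ in range(4)]

diff = block_difference(original, prediction)  # every row is [-1, 0, 1, 0]
cost = sad(diff)                               # 2 per row, 4 rows
```

In this example each row contributes an absolute difference of 2, so the SAD cost of the block is 8.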
The preliminary selector 105 selects at least two prediction modes in ascending order of the cost values 111, i.e., the modes whose difference information yields the smallest cost values, and outputs the selected modes 113 to the cost calculator 106.
The cost calculator 106 applies a frequency conversion to the difference information 112 corresponding to the selected modes 113 that have been output from the preliminary selector 105, calculates cost values 114 from the result and outputs the cost values 114 to the prediction mode selector 107. The cost calculator 106 calculates SATD (Sum of Absolute Transformed Differences) [see Equation (3) below] as the cost value 114 of the particular block, where SATD is the sum of absolute values of DiffT(i,j) obtained by subjecting the difference data Diff(i,j) to a frequency conversion by, e.g., a Hadamard transform.
SATD = (Σ_{i,j} |DiffT(i,j)|) / 2 (3)
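Equation (3) can be sketched as follows for a 4×4 block. The description only says that the frequency conversion is, e.g., a Hadamard transform; the particular 4×4 matrix `H4`, the two-dimensional form H·Diff·Hᵀ, and all function names below are assumptions chosen for illustration.

```python
# One common 4x4 Hadamard matrix (assumed; the description does not fix it).
H4 = [[1,  1,  1,  1],
      [1,  1, -1, -1],
      [1, -1, -1,  1],
      [1, -1,  1, -1]]


def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(m):
    return [list(col) for col in zip(*m)]


def hadamard_4x4(diff):
    """DiffT = H4 * Diff * H4^T for a 4x4 difference block."""
    return matmul(matmul(H4, diff), transpose(H4))


def satd(diff):
    """Equation (3): half the sum of absolute transformed differences.

    For integer 4x4 input all 16 coefficients share the parity of the sum of
    the inputs, so the sum is even and integer halving is exact.
    """
    diff_t = hadamard_4x4(diff)
    return sum(abs(value) for row in diff_t for value in row) // 2


flat_diff = [[1] * 4 for _ in range(4)]        # constant error of 1
spike_diff = [[0] * 4 for _ in range(4)]
spike_diff[0][0] = 12                          # one isolated error of 12
```

A constant error concentrates in a single transform coefficient (SATD of 8 here), whereas an isolated spike spreads over all sixteen coefficients (SATD of 96 here), even though the spike has the smaller SAD (12 versus 16).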
The prediction mode selector 107 selects the optimum prediction mode 115 based upon the cost values 114 that have been output by the cost calculator 106 and outputs the prediction mode 115 and this cost value 114 as the results of intra prediction.
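The flow through the cost calculator 104, the preliminary selector 105, the cost calculator 106 and the prediction mode selector 107 can be sketched as below. This is a simplified illustration under stated assumptions: the function `two_stage_select`, the parameter `num_candidates`, the mode labels `mode_a` to `mode_c`, the Hadamard matrix `H4` and the sample blocks are all hypothetical, and only 4×4 blocks are handled.

```python
H4 = [[1, 1, 1, 1], [1, 1, -1, -1], [1, -1, -1, 1], [1, -1, 1, -1]]  # assumed


def block_difference(original, prediction):
    return [[o - p for o, p in zip(orow, prow)]
            for orow, prow in zip(original, prediction)]


def sad(diff):
    return sum(abs(v) for row in diff for v in row)


def satd(diff):
    hd = [[sum(H4[k][i] * diff[i][j] for i in range(4)) for j in range(4)]
          for k in range(4)]                       # H4 * Diff
    coeffs = [[sum(hd[k][j] * H4[l][j] for j in range(4)) for l in range(4)]
              for k in range(4)]                   # (H4 * Diff) * H4^T
    return sum(abs(v) for row in coeffs for v in row) // 2


def two_stage_select(original, predictions, num_candidates=2):
    """Return (mode, SATD cost) using SAD preselection followed by SATD."""
    diffs = {m: block_difference(original, p) for m, p in predictions.items()}
    # Stage 1: SAD cost for every mode (role of cost calculator 104).
    sad_costs = {m: sad(d) for m, d in diffs.items()}
    # Preliminary selection: keep the modes with the smallest SAD (105).
    candidates = sorted(sad_costs, key=sad_costs.get)[:num_candidates]
    # Stage 2: SATD only for the preselected modes (106).
    satd_costs = {m: satd(diffs[m]) for m in candidates}
    # Final selection: smallest SATD among the candidates (107).
    best = min(satd_costs, key=satd_costs.get)
    return best, satd_costs[best]


original = [[20] * 4 for _ in range(4)]
pred_a = [[19] * 4 for _ in range(4)]              # constant error of 1
pred_b = [[20] * 4 for _ in range(4)]
pred_b[0][0] = 8                                   # single error of 12
pred_c = [[18] * 4 for _ in range(4)]              # constant error of 2
predictions = {"mode_a": pred_a, "mode_b": pred_b, "mode_c": pred_c}

result = two_stage_select(original, predictions)
```

With these sample blocks the SAD costs are 16, 12 and 32, so `mode_b` and `mode_a` are preselected; their SATD costs are 96 and 8, so `mode_a` is selected with a cost of 8. A SAD-only selection would have chosen `mode_b`.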
The moving picture signal 12 is the input image signal of the moving picture compression encoding apparatus, which is indicated at reference number 14.
The prediction signal call unit 109 reads in the prediction signal, which corresponds to the prediction mode 115 output from the prediction mode decision unit 108, from the prediction signal memory 102 and outputs this signal as the result of intra prediction. Alternatively, if intra prediction has been selected by the switch 11 in
This embodiment has been described with regard to a two-stage search for a prediction mode. However, this can be expanded to three stages or more by the following means:
A second embodiment will now be described in detail with reference to
The block-size decision unit 202 selects a block size 207 having the smallest cost value 206 that is output from the intra predictors 201_1 to 201_N and outputs the block size 207 and cost value 206 to the prediction-signal/prediction-mode memory 203.
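The decision made by the block-size decision unit 202 amounts to taking the minimum over the cost values reported per block size. The sketch below assumes that each intra predictor reports one (cost, mode) pair and that the costs are directly comparable across block sizes (for example, accumulated over the same picture area); the function name, the dictionary layout and the sample numbers are hypothetical.

```python
def decide_block_size(results):
    """Pick the block size whose intra predictor reported the smallest cost.

    results maps a block-size label to a (cost, prediction_mode) pair.
    """
    best_size = min(results, key=lambda size: results[size][0])
    cost, mode = results[best_size]
    return best_size, mode, cost


# Hypothetical outputs of two intra predictors for the same picture area.
results = {
    "4x4": (120, "mode_a"),
    "16x16": (95, "mode_c"),
}
decision = decide_block_size(results)
```

Here the 16×16 predictor reports the smaller cost, so its block size, prediction mode and cost value are passed on.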
For every block size, the prediction-signal/prediction-mode memory 203 stores prediction signals 204 and prediction modes 205 that are output from the intra predictors 201_1 to 201_N and outputs the prediction signal 204 and prediction mode 205, which correspond to the block size 207 that is output from the block-size decision unit 202, together with the cost value 206 and block size 207 as the results of intra prediction. As each of the intra predictors 201_1 to 201_N has the structure (the prediction signal generator 103, prediction mode decision unit 108 and prediction signal call unit 109) shown in
A third embodiment of the present invention will now be described in detail with reference to
The block-size selecting section includes intra predictors 301, 302, and 303 for which the prediction mode discriminating methods differ for each of the block sizes, a block-size decision unit 304 and a prediction-signal/prediction-mode memory 305. The structure and operation of the intra predictors 301, 302, and 303 will be described.
The intra predictors 301, 302, and 303 employ intra predicting units (see
As shown in
The cost calculator 404 calculates cost values 410 [e.g., the SAD in Equation (2) cited above] from difference information 411 indicative of differences between the moving picture signal 12, which is input to the moving picture compression encoding apparatus 14, and prediction signals generated by the prediction signal generator 403, and outputs the cost values to the prediction mode selector 405.
The prediction mode selector 405 selects the optimum prediction mode 412 from the cost values 410 and inputs this prediction mode and the difference information thereof to the cost calculator 406.
The cost calculator 406 applies a frequency conversion to the difference information 411 of the prediction mode 412, calculates a cost value 413 and outputs the prediction mode 412 and cost value 413 [e.g., SATD in Equation (3) cited above] as the results of intra prediction.
Further, in the arrangement shown in
The difference calculator 424 calculates difference information 430 [e.g., see Equation (1) cited above] indicative of differences between the moving picture signal 12, which is input to the moving picture compression encoding apparatus 14, and the outputs (prediction signals) from the prediction signal generator 423, and outputs the information to the cost calculator 425.
The cost calculator 425 applies a frequency conversion [e.g., DiffT(i,j) obtained by a Hadamard transform] to the difference information 430 [Diff(i,j)] of all prediction modes, calculates cost values 431 [e.g., SATD in Equation (3) cited above] and outputs these to the prediction mode selector 426.
The prediction mode selector 426 selects the optimum prediction mode 432 based upon the cost values 431 that have been output from the cost calculator 425 and outputs the optimum prediction mode 432 and cost value 431 as the results of intra prediction. Structural components and operations other than those set forth above are identical with those of the second embodiment and need not be described again.
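The two alternative intra predictor structures described for this embodiment can be contrasted in a short sketch: one selects the mode by SAD alone and computes SATD only for the selected mode (units 404 to 406), the other computes SATD for every mode (units 424 to 426). The function names, mode labels, the Hadamard matrix `H4` and the sample blocks are assumptions for illustration, and only 4×4 blocks are handled.

```python
H4 = [[1, 1, 1, 1], [1, 1, -1, -1], [1, -1, -1, 1], [1, -1, 1, -1]]  # assumed


def block_difference(original, prediction):
    return [[o - p for o, p in zip(orow, prow)]
            for orow, prow in zip(original, prediction)]


def sad(diff):
    return sum(abs(v) for row in diff for v in row)


def satd(diff):
    hd = [[sum(H4[k][i] * diff[i][j] for i in range(4)) for j in range(4)]
          for k in range(4)]
    coeffs = [[sum(hd[k][j] * H4[l][j] for j in range(4)) for l in range(4)]
              for k in range(4)]
    return sum(abs(v) for row in coeffs for v in row) // 2


def sad_select_then_satd(original, predictions):
    """Select the mode by SAD, then report the SATD cost of that one mode."""
    diffs = {m: block_difference(original, p) for m, p in predictions.items()}
    best = min(diffs, key=lambda m: sad(diffs[m]))
    return best, satd(diffs[best])


def full_satd_select(original, predictions):
    """Compute SATD for all modes and select the smallest."""
    costs = {m: satd(block_difference(original, p))
             for m, p in predictions.items()}
    best = min(costs, key=costs.get)
    return best, costs[best]


original = [[20] * 4 for _ in range(4)]
pred_a = [[19] * 4 for _ in range(4)]
pred_b = [[20] * 4 for _ in range(4)]
pred_b[0][0] = 8
pred_c = [[18] * 4 for _ in range(4)]
predictions = {"mode_a": pred_a, "mode_b": pred_b, "mode_c": pred_c}

by_sad = sad_select_then_satd(original, predictions)
by_satd = full_satd_select(original, predictions)
```

In both structures the reported cost value is an SATD, which keeps the values comparable when the block-size decision unit compares predictors that use different selection methods. With the sample blocks, the first structure reports `mode_b` with a cost of 96 and the second reports `mode_a` with a cost of 8.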
Next, a fourth embodiment will be described with reference to
The section for selecting block size includes intra predictors 501, 502, a block-size decision unit 503 and a prediction-signal/prediction-mode memory 504.
The structure of the intra predictors 501, 502 will be described. The intra predictors 501, 502 can employ any of the structures shown in
A fifth embodiment of the present invention will be described next.
The structure of the intra predictors 601 and 602 will be described. Here the intra predictors 601 and 602 employ the intra predicting unit 1 of
Next, a sixth embodiment of the present invention will be described with reference to
The structure of the intra predictors 701, 702, 703, and 704 will be described. Two of the intra predictors 701, 702, 703, and 704, for which the block sizes differ from one another, are constructed from
Thus, the present invention is such that when mode selection is performed, encoding efficiency can be improved greatly in comparison with a method of selecting a mode based solely upon difference information (SAD).
In accordance with the present invention, it is unnecessary to frequency-convert difference information with regard to all modes when a mode is selected by an intra predicting unit. As a result, the amount of computation can be reduced greatly and a high performance approaching that of SATD can be provided in terms of efficiency.
These items of data illustrate the second and fifth embodiments, the method (SAD) of selecting mode based solely upon difference information and the method (SATD) of selecting mode by applying a frequency conversion to difference information with regard to all modes.
Among the embodiments of the present invention, the second embodiment makes two preliminary selections for both 4×4 and 16×16 blocks, and the fifth embodiment makes four preliminary selections for a 4×4 block and two selections for a 16×16 block.
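The effect of these settings on the number of frequency-converted cost evaluations can be tallied with a small sketch. The mode counts (nine for 4×4, four for 16×16) are taken from the background description; the function name and dictionary keys are hypothetical, and the tally counts mode evaluations per block only, not arithmetic operations. The SAD cost is still computed for every mode in the first stage.

```python
# Luminance intra prediction modes per block size, as stated in the background.
MODES_PER_BLOCK_SIZE = {"4x4": 9, "16x16": 4}


def transformed_cost_evaluations(preselected):
    """Modes per block that undergo frequency conversion after preselection."""
    return {size: min(preselected[size], total)
            for size, total in MODES_PER_BLOCK_SIZE.items()}


# Selecting by SATD over all modes transforms every mode.
all_modes = dict(MODES_PER_BLOCK_SIZE)
# Two preliminary selections for both block sizes.
two_and_two = transformed_cost_evaluations({"4x4": 2, "16x16": 2})
# Four preliminary selections for 4x4 and two for 16x16.
four_and_two = transformed_cost_evaluations({"4x4": 4, "16x16": 2})
```

Per block, the frequency conversion is therefore applied to 2 instead of 9 modes (4×4) and 2 instead of 4 modes (16×16) in the first configuration, and to 4 instead of 9 and 2 instead of 4 in the second.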
It will be understood from
The second embodiment provides an encoding efficiency substantially equivalent to that obtained with SATD.
It will be understood from
As many apparently widely different embodiments of the present invention can be made without departing from the spirit and scope thereof, it is to be understood that the invention is not limited to the specific embodiments thereof except as defined in the appended claims.
It should be noted that other objects, features and aspects of the present invention will become apparent in the entire disclosure and that modifications may be made without departing from the gist and scope of the present invention as disclosed herein and claimed as appended herewith.
Also it should be noted that any combination of the disclosed and/or claimed elements, matters and/or items may fall under the modifications aforementioned.
Number | Date | Country | Kind |
---|---|---|---|
2004-371112 | Dec 2004 | JP | national |