This application is based upon and claims the benefit of priority from Japanese Patent Application No. 2007-214974, filed Aug. 21, 2007, the entire contents of which are incorporated herein by reference.
1. Field
One embodiment of the present invention relates to an encoding technique for moving images, which is suitably applicable to information processing apparatuses such as personal computers.
2. Description of the Related Art
In recent years, personal computers equipped with software encoders, which encode moving images by software, have come into widespread use. Further, attention has recently been given to H.264/Advanced Video Coding (AVC) as a next-generation moving-image compression encoding technique. H.264/AVC is a compression encoding technique with higher efficiency than conventional compression encoding techniques such as MPEG2 and MPEG4. Therefore, encoding processing compliant with H.264/AVC requires a larger processing amount than those conventional techniques. Thus, various proposals have been made to reduce the processing amount of moving-image encoding (for example, refer to Jpn. Pat. Appln. KOKAI Pub. No. 2006-332986).
Encoding processing compliant with H.264/AVC has a large processing amount in the determination of a prediction mode for each macroblock. In particular, in High Profile (HP), when the block size of the prediction mode is 8×8 pixels or more in inter-prediction, the more suitable of a discrete cosine transform (DCT) with a block size of 4×4 pixels and a DCT with a block size of 8×8 pixels can be selected. Therefore, the processing amount required for prediction mode determination of the inter-prediction increases in proportion to the number of prediction mode candidates (since the number of candidates is effectively twice the number of selectable prediction modes). Thus, a mechanism for efficiently performing prediction mode determination while suppressing deterioration in image quality is strongly desired.
A general architecture that implements the various features of the invention will now be described with reference to the drawings. The drawings and the associated descriptions are provided to illustrate embodiments of the invention and not to limit the scope of the invention.
Various embodiments according to the invention will be described hereinafter with reference to the accompanying drawings. In general, according to one embodiment of the invention, an information processing apparatus which encodes a moving image signal includes an inter-prediction mode determination unit which determines, for each macroblock, a combination of an inter-prediction mode used by an inter-prediction unit and a DCT used by a transformation unit from among a plurality of inter-prediction modes and a plurality of DCTs. The inter-prediction mode determination unit includes a first selection unit which selects a predetermined number of DCTs from the DCTs, for at least one specific inter-prediction mode among the inter-prediction modes, and a second selection unit which selects a combination of one inter-prediction mode and one DCT from the inter-prediction modes and the predetermined number of DCTs selected by the first selection unit.
As illustrated in
The CPU 11 is a processor which controls operation of the units in the computer. The CPU 11 executes an operating system (OS) 100 and various application programs, including utilities, which operate under the control of the OS 100 and are loaded from the HDD 18 into the main memory 13. The application programs include a video encoder application 200. The video encoder application 200 is software to encode moving images, and operates as a software encoder compliant with the H.264/AVC standard. Further, the CPU 11 also executes a BIOS stored in the BIOS-ROM 17. The BIOS is a program for controlling various hardware.
The north bridge 12 is a bridge device which connects a local bus of the CPU 11 and the south bridge 16. The north bridge 12 has a function of executing communications with the graphics controller 14 through a bus, and includes a memory controller to control access to the main memory 13. The graphics controller 14 is a display controller which controls the LCD 15 that is used as a display monitor of the computer. The graphics controller 14 generates, from image data written in the VRAM 14A, a display signal to be transmitted to the LCD 15.
The south bridge 16 is a controller which controls devices on a PCI bus and an LPC bus. Further, the south bridge 16 is directly connected with the BIOS-ROM 17, the HDD 18, the HD DVD 19, and the sound controller 20, and has a function of controlling the connected devices. The sound controller 20 is a sound source controller which controls the speaker 21.
The EC/KBC 22 is a one-chip microcomputer obtained by integrating an embedded controller for electric power control with a keyboard controller for controlling the keyboard 23 and the touch pad 24. The EC/KBC 22 controls supply of electric power from the battery 26 or an external AC power source to the units in the computer, in cooperation with the power supply circuit 25. The network controller 27 is a communication apparatus which executes communications with external networks such as the Internet.
Next, with reference to
Encoding processing performed by the video encoder application 200 is compliant with the H.264/AVC standard. As illustrated in
The video encoder application 200 encodes pictures input through the input unit 201 in units of macroblocks of, for example, 16×16 pixels. The prediction mode determination unit 210 selects either an intraframe prediction encoding mode (intra-prediction mode) or a motion compensation interframe prediction encoding mode (inter-prediction mode) for each macroblock.
As illustrated in
Each of the intra-prediction mode and the inter-prediction mode has a plurality of prediction mode candidates which can be selected for each macroblock. First, each of the intra-prediction mode determination unit 2101 and the inter-prediction mode determination unit 2102 selects the most cost-effective (optimum and well-balanced between distortion and code amount) prediction mode candidate from its prediction mode candidates. Then, the intra/inter-prediction mode determination unit 2103 compares the two prediction mode candidates selected by the determination units 2101 and 2102, and selects the more cost-effective of the two, that is, one of the intra-prediction mode and the inter-prediction mode.
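The two-level selection described above can be sketched as follows. This is a minimal illustration, not the encoder's actual implementation: the description only states that the cost balances distortion and code amount, so the Lagrangian form `distortion + lam * bits`, the `Candidate` type, and all names and numeric values below are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str          # hypothetical label for a prediction mode candidate
    distortion: float  # assumed distortion measure (e.g. a sum of squared differences)
    bits: float        # assumed estimate of the code amount


def cost(c: Candidate, lam: float) -> float:
    # Assumed cost form: the source only says the cost balances
    # distortion and code amount; J = D + lam * R is one common choice.
    return c.distortion + lam * c.bits


def best(candidates, lam):
    # Pick the candidate with the lowest cost.
    return min(candidates, key=lambda c: cost(c, lam))


def choose_mode(intra_candidates, inter_candidates, lam):
    best_intra = best(intra_candidates, lam)     # role of determination unit 2101
    best_inter = best(inter_candidates, lam)     # role of determination unit 2102
    return best([best_intra, best_inter], lam)   # role of determination unit 2103


intra = [Candidate("intra_a", 100, 10), Candidate("intra_b", 80, 30)]
inter = [Candidate("inter_a", 60, 40), Candidate("inter_b", 90, 5)]
winner = choose_mode(intra, inter, lam=1.0)
```

With these made-up values, changing `lam` shifts the balance between distortion and code amount, which is why the winning candidate can differ between settings.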
In the meantime, in high profile of the H.264/AVC standard, when the block size of the prediction mode is 8×8 pixels or more in inter-prediction, it is possible to select an optimum block size from DCT with a block size of 4×4 pixels and DCT with a block size of 8×8 pixels. Therefore, supposing that there are m types of prediction modes in one standard and n types of DCTs are selectable, there are substantially m×n prediction mode candidates only for the inter-prediction, as illustrated in
In the software encoder whose functional configuration is illustrated in
On the other hand, in the inter-prediction encoding mode, first, the motion detecting unit 208 estimates motion from an encoded picture stored in the frame memory 207, and then the inter-prediction unit 209 generates a motion compensating interframe prediction signal s3 corresponding to a picture to be encoded, in a predetermined form and unit. Thereafter, the DCT quantizing unit 202 performs orthogonal transformation and quantization for a prediction error signal s4 obtained by subtracting the motion compensating interframe prediction signal s3 from the picture to be encoded. Then, the entropy encoding unit 203 performs entropy encoding for inter-prediction mode information and the quantized orthogonal transformation coefficient, and thereby encoding of the picture is performed.
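The prediction error signal s4 described above is the element-wise difference between the picture to be encoded and the prediction signal s3. The sketch below shows only that subtraction and its inverse on a small block represented as nested lists; the function names and the sample values are hypothetical, and the orthogonal transformation, quantization, and entropy encoding steps are deliberately omitted.

```python
def prediction_error(block, prediction):
    """Subtract the prediction from the block to be encoded, element by element."""
    return [
        [c - p for c, p in zip(row_c, row_p)]
        for row_c, row_p in zip(block, prediction)
    ]


def reconstruct(prediction, error):
    """Add the prediction error back onto the prediction."""
    return [
        [p + e for p, e in zip(row_p, row_e)]
        for row_p, row_e in zip(prediction, error)
    ]


block = [[10, 12], [14, 16]]      # hypothetical pixel values to be encoded
prediction = [[9, 12], [15, 13]]  # hypothetical prediction signal (s3)
error = prediction_error(block, prediction)  # corresponds to s4
```

In this sketch the reconstruction is exact because nothing is quantized. In an actual encoder the error passes through quantization, so the reconstructed block generally only approximates the original.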
The inverse quantizing/inverse DCT unit 204 performs inverse quantization and inverse orthogonal transformation for a quantization coefficient of a picture subjected to orthogonal transformation and quantization. The deblocking filter 206 performs deblocking filtering for reducing block noises.
An optimum DCT block size tends to be the same between different prediction block sizes with high probability. Therefore, even when DCT (DCT block size) is selected first to narrow down DCTs for each of which an optimum prediction mode (prediction block size) is selected, it can be said that a combination of a truly optimum prediction mode and DCT can be selected with comparatively high probability. In view of this tendency, the inter-prediction mode determination unit 2102 of the computer performs DCT determination under conditions of a specific prediction mode, and selects optimum DCT candidates (“A” of
This method will be explained below with a more specific example. In high profile of the H.264/AVC standard, there are four types of prediction modes having different block sizes of 16×16 pixels, 16×8 pixels, 8×16 pixels, and 8×8 pixels as the processing unit, as illustrated in
On the other hand, the inter-prediction mode determination unit 2102 performs evaluation of the two DCTs of 4×4 pixels and 8×8 pixels only in 1 (M of
Next, DCT evaluation is performed in the other prediction modes of 16×8 pixels, 8×16 pixels, and 8×8 pixels, only for the 1 optimum DCT candidate obtained by the prediction mode of 16×16 pixels (the number of candidates is 3(m−M)×1(N): since the original image is used for DCT determination in the case illustrated in
Specifically, in this case (in which one optimum DCT candidate is selected), the inter-prediction mode determination unit 2102 achieves reduction in the number of prediction mode candidates to “2+3=5” (that is, the number of candidates is reduced by 3). Since there is high probability that the optimum DCT is the same between the different prediction modes of 16×16 pixels, 16×8 pixels, 8×16 pixels, and 8×8 pixels, a truly optimum combination can be selected with high probability, and deterioration in image quality hardly occurs.
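The candidate counts in this example can be checked with a short calculation. The sketch below uses the symbols from the description (m prediction modes, n DCTs, M specific prediction modes, N selected DCT candidates); the function names are hypothetical.

```python
def full_search_count(m: int, n: int) -> int:
    # Every prediction mode is evaluated with every DCT.
    return m * n


def reduced_count(m: int, n: int, M: int, N: int) -> int:
    # All n DCTs are evaluated for the M specific prediction modes,
    # then only the N selected DCTs for the remaining (m - M) modes.
    return M * n + (m - M) * N


full = full_search_count(m=4, n=2)           # 4 block sizes x 2 DCTs
reduced = reduced_count(m=4, n=2, M=1, N=1)  # 2 + 3
saved = full - reduced
```

For m = 4, n = 2, M = 1, and N = 1 this gives 8 candidates for the full search and 2 + 3 = 5 for the reduced search, a saving of 3 evaluations per macroblock.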
In addition, the inter-prediction mode determination unit 2102 of the computer performs control so that one prediction mode candidate is used for selection of the DCT in B pictures, for which prediction succeeds with high probability, and so that the number of prediction mode candidates is increased to 2 in P pictures. Specifically, the inter-prediction mode determination unit 2102 determines the number of prediction mode candidates used for selection of the DCT in accordance with the type of the picture, and thereby reduces the processing amount required for prediction mode determination while adaptively suppressing deterioration in image quality.
First, the inter-prediction mode determination unit 2102 determines whether the picture to be encoded is a P picture or a B picture (Block A1), and determines the number of prediction mode candidates used for selection of DCT, based on the determination result (Block A2). For example, if the picture is a P picture, the number of prediction mode candidates is set to 2. If the picture is a B picture, the number of prediction mode candidates is set to 1.
Next, the inter-prediction mode determination unit 2102 calculates the cost of each of all the DCTs for the specific prediction mode candidate(s) of the number determined in Block A2 (Block A3). Based on a result of the cost calculation, the inter-prediction mode determination unit 2102 selects optimum DCT candidates of the predetermined candidate number (Block A4).
Then, the inter-prediction mode determination unit 2102 calculates the cost of each prediction mode candidate other than the specific prediction mode candidate(s), only for the selected DCTs (Block A5), and determines the optimum prediction mode and the optimum DCT from among all the calculated costs, including the costs for the specific prediction mode candidate(s) already calculated in Block A3 (Block A6).
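The flow of Blocks A1 to A6 can be sketched as follows. This is an illustrative sketch under stated assumptions rather than the actual implementation: the function and variable names are hypothetical, the specific prediction modes are assumed to be the first entries of the mode list, the DCTs are ranked by their lowest cost over the specific modes (the description does not state how they are ranked when two specific modes are used), and the cost table holds made-up values.

```python
def select_inter_mode_and_dct(picture_type, modes, dcts, cost_fn, num_dct_candidates=1):
    # Blocks A1/A2: choose how many specific prediction modes to use,
    # 2 for P pictures and 1 for B pictures.
    num_specific = 2 if picture_type == "P" else 1
    specific = modes[:num_specific]   # assumption: specific modes come first
    others = modes[num_specific:]

    # Block A3: cost of every DCT for the specific prediction mode(s).
    costs = {(mode, dct): cost_fn(mode, dct) for mode in specific for dct in dcts}

    # Block A4: keep the best DCT candidates (assumed ranking rule:
    # lowest cost reached over the specific modes).
    ranked = sorted(dcts, key=lambda d: min(costs[(mode, d)] for mode in specific))
    selected = ranked[:num_dct_candidates]

    # Block A5: evaluate the remaining modes only with the selected DCTs.
    for mode in others:
        for dct in selected:
            costs[(mode, dct)] = cost_fn(mode, dct)

    # Block A6: pick the best pair among all costs computed so far.
    best_mode, best_dct = min(costs, key=costs.get)
    return best_mode, best_dct, len(costs)


MODES = ["16x16", "16x8", "8x16", "8x8"]
DCTS = ["4x4", "8x8"]
EXAMPLE_COSTS = {  # made-up cost values for illustration only
    ("16x16", "4x4"): 50, ("16x16", "8x8"): 40,
    ("16x8", "4x4"): 38, ("16x8", "8x8"): 35,
    ("8x16", "4x4"): 47, ("8x16", "8x8"): 45,
    ("8x8", "4x4"): 44, ("8x8", "8x8"): 42,
}


def example_cost(mode, dct):
    return EXAMPLE_COSTS[(mode, dct)]


result_b = select_inter_mode_and_dct("B", MODES, DCTS, example_cost)
result_p = select_inter_mode_and_dct("P", MODES, DCTS, example_cost)
```

The third element of each result is the number of cost evaluations performed, which makes the trade-off visible: with one selected DCT, a B picture needs 5 evaluations and a P picture needs 6, compared with 8 for an exhaustive search.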
As described above, according to the computer of the present invention, it is possible to efficiently perform prediction mode determination of inter-prediction while suppressing deterioration in image quality or the like.
While certain embodiments of the inventions have been described, these embodiments have been presented by way of example only, and are not intended to limit the scope of the inventions. Indeed, the novel methods and systems described herein may be embodied in a variety of other forms; furthermore, various omissions, substitutions and changes in the form of the methods and systems described herein may be made without departing from the spirit of the inventions. The accompanying claims and their equivalents are intended to cover such forms or modifications as would fall within the scope and spirit of the inventions.
Number | Date | Country | Kind |
---|---|---|---|
2007-214974 | Aug 2007 | JP | national |
Number | Name | Date | Kind |
---|---|---|---|
6724818 | Frojdh et al. | Apr 2004 | B1 |
7280597 | Zhang et al. | Oct 2007 | B2 |
7643559 | Kato et al. | Jan 2010 | B2 |
20020186765 | Morley et al. | Dec 2002 | A1 |
20040086042 | Kim et al. | May 2004 | A1 |
20050249291 | Gordon et al. | Nov 2005 | A1 |
20050276331 | Lee et al. | Dec 2005 | A1 |
20060215763 | Morimoto et al. | Sep 2006 | A1 |
Number | Date | Country |
---|---|---|
2003-319394 | Nov 2003 | JP |
2005-151017 | Jun 2005 | JP |
2006-148419 | Jun 2006 | JP |
2006-332986 | Dec 2006 | JP |
2007-201558 | Aug 2007 | JP |
2008-205627 | Sep 2008 | JP |
2008-219205 | Sep 2008 | JP |
Entry |
---|
Japanese Office Action dated Jun. 14, 2011. |
Number | Date | Country | |
---|---|---|---|
20090285285 A1 | Nov 2009 | US |