The present disclosure relates to a motion estimation device that estimates motion for blocks included in a picture.
For motion estimation, a multi-frame memory has been used that stores a plurality of reference pictures. A multi-frame memory is implemented in two separate parts: an external memory provided outside a motion estimation device and an internal memory provided inside the motion estimation device to be accessed during block matching (see Japanese Patent Publication No. H05-260461, for example).
In many video encoding schemes standardized by international standardization groups such as the Moving Picture Experts Group (MPEG) and the International Telecommunications Union (ITU), each pixel included in an image is separated into a luminance component and color-difference components, exploiting characteristics of human vision, before encoding is performed. In motion estimation, in particular, only the luminance component is generally used, to reduce the memory space required for storage of reference pictures and the amount of computation required for the motion estimation processing (see WO 1998/042135, for example).
In a motion estimation device that performs motion estimation using only the luminance component as described above, luminance pixels in a region of a reference picture to be searched are first transferred from an external memory to an internal memory via an external connection bus and stored in the internal memory. During motion estimation, the reference picture stored in the internal memory is read and used.
Motion compensation is then performed, in which, for the luminance component, data stored in the internal memory is used. However, for the color-difference components, data must be read from the external memory via the external connection bus. This increases the data transfer amount of the external connection bus.
The above configuration poses no problem when a small-size image, like a standard definition (SD) image, is to be encoded. Even encoding of a large-size image, like a high definition (HD) image, poses no problem as long as the number of images per unit time, or the picture rate, is comparatively low, because the required data transfer rate does not exceed the maximum transfer amount per unit time allowed for the external connection bus.
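The scale of the problem can be illustrated with rough arithmetic. The sketch below is not part of this disclosure; the picture sizes, the 30 and 60 pictures/s rates, the 4:2:0 format, and the 8-bit samples are all assumed figures chosen only to show how the color-difference transfer load grows:

```python
def chroma_transfer_bytes_per_sec(width, height, fps):
    """Bytes per second of 4:2:0 color-difference data read over the
    external connection bus, assuming 8-bit samples and that each
    picture's chroma area is read exactly once (both simplifying
    assumptions; block-wise motion compensation can re-read data)."""
    # In 4:2:0 sampling, the two chroma planes together hold half as
    # many samples as the luma plane.
    return width * height // 2 * fps

sd = chroma_transfer_bytes_per_sec(720, 480, 30)     # SD at 30 pictures/s
hd = chroma_transfer_bytes_per_sec(1920, 1080, 60)   # HD at 60 pictures/s
# hd is exactly 12 times sd: the chroma traffic alone grows by an
# order of magnitude when moving from SD/30 to HD/60.
```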
Reference is also to be made to “Advanced video coding for generic audiovisual services” (§8.4 Inter prediction process), ITU-T Recommendation H.264, 11/2007, pp. 143-169.
Since consumer camcorders (or video cameras) capable of recording HD images have been becoming widespread in recent years, it is expected that the size of images recorded may become larger and the picture rate may become higher in the near future.
One method to address the above problem is to increase the transfer amount per unit time allowed for the external connection bus, by replacing the external memory with a higher-performance one or by increasing the number of memories placed externally. Replacing the external memory with a higher-performance one, however, requires an expensive memory, and the resulting increase in the operating speed of the bus increases the power consumption of the entire product including the motion estimation device. Increasing the number of external memories likewise increases both the cost and the power consumption.
As another method, the number of reference pictures may be reduced, or the range within which motion can be searched may be narrowed. However, loosening these requirements will significantly degrade the quality of the recorded image.
It is an objective of the present invention to achieve encoding of larger-size images and encoding of images at a higher picture rate while preventing or reducing increase in cost and degradation in image quality.
The motion estimation device of an embodiment of the present invention is a motion estimation device configured to estimate motion for blocks included in an input picture using a reference picture. The device includes: an internal reference memory configured to store the reference picture transferred from outside the motion estimation device; a motion estimator configured to estimate motion information for a target block, which is a block of the input picture for which motion is to be estimated, using pixel data of the reference picture stored in the internal reference memory; a motion compensator configured to perform motion compensation for the target block using the motion information estimated by the motion estimator; and a reference memory manager configured to control the internal reference memory. The reference memory manager is configured to control the internal reference memory to store a luminance reference picture and a color-difference reference picture as the reference picture.
With the above configuration, the frequency of data transfer from outside can be reduced, and so can the amount of data transferred from outside per unit time. Thus, encoding at a higher picture rate can be achieved.
According to the present invention, since transfer of color-difference reference pictures from an external memory can be reduced, it is possible to achieve encoding of images at a higher picture rate and encoding of larger-size images while preventing or reducing increase in the cost of external and internal memories, increase in the power consumption of the device, and degradation in image quality.
An embodiment of the present invention will be described hereinafter with reference to the drawings.
An external memory 38 is coupled to the motion estimation device 10 via an external connection bus. The external memory 38, which is a high-capacity memory such as a synchronous dynamic random access memory (SDRAM), is used as a multi-frame memory for storing a plurality of reference pictures for motion estimation that are required for inter-frame prediction. The reference memory manager 12 controls data transfer from the external memory 38 to the internal reference memory 14: it controls transfer of pixel data in a region used for motion compensated prediction from the external memory 38 to the internal reference memory 14.
An input picture to be encoded is input into the motion estimator 16 as a video signal VIN. The motion estimator 16 also receives a reference picture for motion estimation read from the external memory 38 via the internal reference memory 14, and determines a motion vector MV and a reference frame number RFN for each of blocks included in the input picture. The reference frame number RFN is a number for identifying a reference picture, among the plurality of pictures, referred to in encoding of the input picture. The motion vector MV is temporarily stored in the motion vector memory 24 and then output as a neighboring motion vector PVM. The motion vector predictor 26 predicts a predicted motion vector PDM with reference to the neighboring motion vector PVM.
The subtractor 28 subtracts the predicted motion vector PDM from the motion vector MV and outputs the resulting difference as a motion vector predicted difference DMV. The internal reference memory 14 outputs the pixels indicated by the reference frame number RFN and the motion vector MV as motion compensation reference pixels MCP. The motion compensator 18 generates reference pixels with fractional pixel precision based on the motion vector MV and the motion compensation reference pixels MCP, and outputs them as reference frame pixels MCP2. The subtractor 34 subtracts the reference frame pixels MCP2 from the input picture and outputs the resulting difference as a frame prediction error DP.
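The arithmetic performed by the motion vector predictor 26 and the subtractor 28 can be sketched as follows. The component-wise median rule is an illustrative assumption (a common choice in standards such as H.264); the disclosure states only that the predictor refers to neighboring motion vectors:

```python
def predict_mv(neighbors):
    """Component-wise median of the neighboring motion vectors (PVM),
    one plausible realization of the motion vector predictor 26."""
    xs = sorted(v[0] for v in neighbors)
    ys = sorted(v[1] for v in neighbors)
    mid = len(neighbors) // 2
    return (xs[mid], ys[mid])

def mv_predicted_difference(mv, pdm):
    """DMV = MV - PDM, the value output by the subtractor 28."""
    return (mv[0] - pdm[0], mv[1] - pdm[1])

pdm = predict_mv([(2, -1), (4, 0), (3, 3)])   # -> (3, 0)
dmv = mv_predicted_difference((5, 2), pdm)    # -> (2, 2)
```

Only the difference DMV is entropy-coded, which is why a good predictor keeps the coded motion information small.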
The encoder 32 performs discrete cosine transform (DCT) and quantization on the frame prediction error DP, then performs variable-length encoding on the quantized DCT coefficients, the motion vector predicted difference DMV, and the reference frame number RFN, and outputs the resulting encoded image data as an image stream STR. The encoder 32 also decodes the encoded frame prediction error and outputs the result as a decoded frame prediction error RDP. The decoded frame prediction error RDP is equal to the frame prediction error DP with an encoding error added thereto, and corresponds to the inter-frame prediction error obtained by decoding the image stream STR.
The adder 36 adds the decoded frame prediction error RDP to the reference frame pixels MCP2, and the added result is stored in the external multi-frame memory 38 as a decoded frame RP. Note however that, for effective use of the space of the external memory 38 and the internal reference memory 14, if a frame stored in these memories is unnecessary, the area in which the frame has been stored is released, and the decoded frame RP for such an unnecessary frame is not stored.
The cache memory 42 stores pixels transferred from the external memory 38 and outputs the pixels to the reference local memory 48 and the bus switch 44. The reference local memory 48 is referred to when the motion estimator 16 actually performs pixel search. This hierarchical configuration is for reducing the pixel transfer amount from the external memory 38. More specifically, the access frequency to the external memory 38 is kept low by storing a large quantity of reference pictures into the cache memory 42 at a time, and reference picture data is transferred from the cache memory 42 to the reference local memory 48 whenever necessary during execution of motion estimation in which the access frequency to reference pictures is high.
The bus switch 44 switches between two input paths to the reference local memory 46, where color-difference (chroma) reference pictures are stored. More specifically, under the instruction of the reference memory manager 12, the bus switch 44 transfers a color-difference reference picture stored in the external memory 38 to the reference local memory 46 either directly or after it has been stored in the cache memory 42.
When the bus switch 44 selects the direct path from the external memory 38, the following operation is performed. That is, under the instruction of the reference memory manager 12, first, luminance reference pictures (reference pictures with respect to luminance) required by the motion estimator 16 for search are transferred from the external memory 38 to the cache memory 42 and stored. Reference picture data, among the reference pictures stored in the cache memory 42, required by the motion estimator 16 for search is transferred to the reference local memory 48, to allow the motion estimator 16 to perform search with reference to the transferred data. At this time, no color-difference reference picture is stored in the cache memory 42. Thereafter, color-difference reference picture data at a position indicated by motion information (motion vector) obtained as a result of the search by the motion estimator 16 is directly transferred from the external memory 38 to the reference local memory 46, to allow the motion compensator 52 to perform motion compensation of the color-difference components with reference to the transferred data.
Conversely, when the bus switch 44 selects the path from the cache memory 42, the following operation is performed. That is, under the instruction of the reference memory manager 12, first, luminance reference pictures required by the motion estimator 16 for search are transferred from the external memory 38 to the cache memory 42 and stored. At this time, the corresponding color-difference reference pictures at the same position as the luminance reference pictures are also transferred to the cache memory 42 and stored. Luminance reference picture data, among the reference pictures stored in the cache memory 42, required by the motion estimator 16 for search is then transferred to the reference local memory 48, to allow the motion estimator 16 to perform search with reference to the transferred data. Thereafter, also under the instruction of the reference memory manager 12, color-difference reference picture data at a position indicated by motion information obtained as a result of the search by the motion estimator 16 is transferred from the cache memory 42 to the reference local memory 46, to allow the motion compensator 52 to perform motion compensation for the color-difference components with reference to the transferred data.
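The two sequences above differ only in where the color-difference data is staged. A condensed sketch of one block's worth of transfers follows, with each memory modeled as a plain dictionary (an assumption made purely for illustration):

```python
def stage_reference_data(transfer_flag, region, external, cache,
                         local_48, local_46, mv_region):
    """Model of one search-and-compensation round.  `region` is the
    search window and `mv_region` the chroma area indicated by the
    estimated motion vector (both hypothetical keys into the
    dict-modeled memories)."""
    # Luminance reference data is always staged in the cache 42 ...
    cache[("luma", region)] = external[("luma", region)]
    if transfer_flag == 1:
        # ... and chroma is staged alongside it on the cached path.
        cache[("chroma", region)] = external[("chroma", region)]
    # Luma goes to the local memory 48 for search by the estimator 16.
    local_48[region] = cache[("luma", region)]
    # The bus switch 44 picks the chroma source for the local memory 46.
    src = cache if transfer_flag == 1 else external
    local_46[mv_region] = src[("chroma", mv_region)]
```

With `transfer_flag == 0` the chroma data never touches the cache; with `transfer_flag == 1` it is served from the cache instead of the external bus.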
Logical mapping will be described hereinafter for the sake of simplicity. Actually, since a reference region comprised of a combination of rectangular regions as shown in
Reference picture data corresponding to a reference region RFA in
The region REL, which will be unnecessary in subsequent motion estimation, is reserved as a next-time acquisition region NXT, to be utilized as a reference picture region for the next target block. The remaining region is not used as a reference picture for the current target block, but is mapped as a reserve storage region SUB to be used as a reference picture for a target block in subsequent motion estimation.
In other words, the reference pictures in the cache memory 42 are mapped in a first-in, first-out (FIFO) manner. The next-time release region REL is reserved as the next-time acquisition region NXT immediately after being released. The reference region RFA is transferred to the reference local memory 48 to be referred to by the motion estimator 16. By repeating this operation, motion estimation of the entire picture can be performed. The management of the cache memory 42 as described above is performed by the reference memory manager 12.
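The FIFO mapping can be sketched as follows; the region labels are abstract stand-ins for the rectangular areas of the cache memory 42:

```python
from collections import deque

class FifoReferenceCache:
    """Minimal model of the FIFO mapping of the cache memory 42: when
    the search window slides, the oldest region (REL) is released and
    its space is immediately reused as the next acquisition region
    (NXT)."""
    def __init__(self, capacity):
        self.regions = deque(maxlen=capacity)  # oldest region on the left

    def slide(self, next_region):
        """Acquire `next_region`; return the released region, if any."""
        released = None
        if len(self.regions) == self.regions.maxlen:
            released = self.regions[0]
        self.regions.append(next_region)  # a full deque drops the oldest
        return released

cache = FifoReferenceCache(3)
for r in ["R0", "R1", "R2"]:
    cache.slide(r)                  # fill the cache; nothing released yet
assert cache.slide("R3") == "R0"    # R0 is released, its slot holds R3
```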
When the bus switch 44 in
When the bus switch 44 in
The reference memory manager 12 in
For example, the reference memory manager 12 determines the transfer flag based on the picture structure for control of the bus switch 44. Specifically, when the input picture is a frame picture where lines are sequentially arranged vertically, the bus switch 44 selects the path from the cache memory 42. When the input picture is a field picture where a first field having only odd-numbered lines sequentially arranged vertically and a second field having only even-numbered lines sequentially arranged vertically are displayed alternately, the bus switch 44 selects the direct path from the external memory 38.
The reference memory manager 12 may determine the transfer flag based on the rate of the input picture (the number of pictures displayed or encoded per second) for control of the bus switch 44. Specifically, the bus switch 44 may select the path from the cache memory 42 when the rate of the input picture is equal to or higher than a predetermined value, and select the direct path from the external memory 38 when the rate is lower than the predetermined value.
Alternatively, the reference memory manager 12 may determine the transfer flag based on the size of the input picture, e.g., (the number of pixels per line)×(the number of lines), for control of the bus switch 44. Specifically, the bus switch 44 may select the path from the cache memory 42 when the size of the input picture is equal to or larger than a predetermined value, and select the direct path from the external memory 38 when the size is smaller than the predetermined value.
Otherwise, the reference memory manager 12 may determine the transfer flag based on the color-difference signal format of the input picture (the ratio of the number of pixels of the luminance component to the numbers of pixels of the two color-difference components in the picture, which is 4:2:0, 4:2:2, or 4:4:4) or on the number of bits with which one pixel is expressed, for control of the bus switch 44. For example, the bus switch 44 may select the path from the cache memory 42 when the color-difference format is 4:4:4, and otherwise select the direct path from the external memory 38.
The reference memory manager 12 may otherwise determine the transfer flag based on a combination of the formats of the input picture described above, for control of the bus switch 44.
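The four criteria above, and one possible way of combining them, can be sketched as follows. The threshold values and the any-of combination rule are illustrative assumptions; the disclosure leaves the exact combination to the implementation. A flag value of 1 selects the path from the cache memory 42, and 0 selects the direct path from the external memory 38:

```python
def flag_from_structure(picture_structure):
    """Frame pictures take the path from the cache 42 (flag 1);
    field pictures take the direct path (flag 0)."""
    return 1 if picture_structure == "frame" else 0

def flag_from_rate(picture_rate, threshold):
    """Cache chroma when the picture rate reaches the threshold."""
    return 1 if picture_rate >= threshold else 0

def flag_from_size(width, height, threshold):
    """Cache chroma when (pixels per line) x (lines) reaches the threshold."""
    return 1 if width * height >= threshold else 0

def flag_from_chroma_format(chroma_format):
    """Cache chroma only for the 4:4:4 format, as in the example above."""
    return 1 if chroma_format == "4:4:4" else 0

def decide_transfer_flag(*criteria):
    # One possible combination (an assumption): cache the chroma data
    # whenever any single criterion asks for it.
    return 1 if any(criteria) else 0
```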
The AV processing apparatus 140 includes the video encoding device 100 of
The stream I/O section 122, coupled to the bus 132, receives and outputs audio and video stream data ESTR. The video encoding device 100 encodes an image, and the video decoding section 102 decodes an image. The audio encoding section 104 encodes voice, and the audio decoding section 106 decodes voice. The memory I/O section 124 is an interface for input/output of data signals from/to a memory 138. The memory 138, including the external memory 38 in
The video processing section 114 performs pre- and post-processing for a video signal. The video I/O section 112 outputs a video data signal that has been processed by the video processing section 114, or just has passed through the video processing section 114 without being processed, to the outside as a video signal EVS. The video I/O section 112 also receives a video signal EVS from the outside.
The audio processing section 118 performs pre- and post-processing for an audio signal. The audio I/O section 116 outputs an audio data signal that has been processed by the audio processing section 118, or just has passed through the audio processing section 118 without being processed, to the outside as an audio signal EAS. The audio I/O section 116 also receives an audio signal EAS from the outside. The AV control section 126 controls the entirety of the AV processing apparatus 140. The bus 132 transfers data such as stream data and audio/video decoded data.
Only encoding operation will be described hereinafter with reference to
Likewise, the audio processing section 118 performs processing, such as filtering and feature extraction for encoding, on the audio signal EAS input into the audio I/O section 116, and stores the result in the memory 138 via the memory I/O section 124 as original audio data. Thereafter, the original audio data is transferred from the memory 138 to the audio encoding section 104 via the memory I/O section 124. The audio encoding section 104 encodes the original audio data and stores the resulting audio stream in the memory 138.
Finally, the stream I/O section 122 integrates the video stream, the audio stream, and other stream information into one stream and outputs the integrated stream as stream data ESTR. The stream data ESTR is written to a recording medium such as an optical disk or a hard disk.
The reference memory manager 12 in
Alternatively, the reference memory manager 12 in
Otherwise, the reference memory manager 12 in
The reference memory manager 12 may otherwise control the bus switch 44 based on a combination of the formats of the input picture, the encoding bit rate of the stream, and the transfer bandwidth of the memory.
The reference memory manager 12 may determine the transfer flag based on the recording mode set outside the motion estimation device 10. For example, the video I/O section 112 may directly set the recording mode for the reference memory manager 12, or the AV control section 126 may extract estimation information from the video I/O section 112 and set the recording mode based on the extracted information. Alternatively, the stream I/O section 122 or the memory I/O section 124 may set the recording mode. It is otherwise possible to use a recording mode determined in advance in a program for controlling the system executed by the system control section 128 that controls the entirety of the AV processing apparatus 140. The system control section 128 may otherwise extract information estimated by the video I/O section 112, the stream I/O section 122, or the memory I/O section 124 and set the recording mode based on the extracted information. Otherwise, the recording mode may be set by the external control section 136.
The transfer flag may be set directly, not via the recording mode, in the reference memory manager 12. Thus, any block of the apparatus coupled to the video encoding device 100 having the reference memory manager 12 can set the transfer flag directly or indirectly, to control the bus switch 44.
The video encoding control section 211 operates the reference memory manager 212, to transfer reference pictures referred to by the motion estimator 216 and the motion compensator 218 from the external memory 38 to the internal reference memory 14. Once reference picture data is stored in the internal reference memory 14, the video encoding control section 211 operates the motion estimator 216, to search the reference pictures.
When the motion estimator 216 estimates motion, the video encoding control section 211 operates the reference memory manager 212 again, to transfer reference picture data required by the motion compensator 218, thereby to allow the motion compensator 218 to perform motion compensation.
Once the motion compensator 218 completes the motion compensation, the video encoding control section 211 operates the encoder 232. The encoder 232 encodes the difference between a predicted image generated by the motion compensator 218 and the input image, and outputs the result as the stream STR, as well as outputting difference image data. To store a reference picture to be required for the next inter-frame prediction in the external memory 38, a reconstructed image is generated from the difference image data and the predicted image generated by the motion compensator 218, and transferred to the external memory 38.
In step S16 of transfer determination, the video encoding controller 211 determines whether the converted recording condition exceeds the maximum performance that the video encoding device 200 can deliver by the conventional procedure. If it determines that the required performance does not exceed the maximum performance, the video encoding controller 211 sets the transfer flag to 0 in step S18 of transfer flag setting. If it determines that the required performance exceeds the maximum performance, the video encoding controller 211 sets the transfer flag to 1 in step S20 of transfer flag setting. In this way, the transfer flag is set based on the set recording mode. The reference memory manager 212 stores the transfer flag in its register, for example.
From the information of the transfer flag set in the step S18 or S20, the procedure of transfer of reference pictures is determined in step S22 of transfer mode determination. Specifically, the process proceeds to step S24 of reference picture transfer if the transfer flag is 0, or to step S26 of reference picture transfer if the transfer flag is 1, whereby the reference memory manager 212 changes what to store in the cache memory 42.
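Steps S16 through S26 amount to a two-way dispatch on the transfer flag. A compact sketch follows, with abstract numbers standing in for the converted recording condition and the device's maximum performance (an assumption for illustration):

```python
def choose_transfer_mode(required_perf, max_perf):
    """Derive the transfer flag from the recording condition and decide
    what the cache memory 42 will hold (model of steps S16 through S26)."""
    transfer_flag = 1 if required_perf > max_perf else 0   # S16, S18, S20
    if transfer_flag == 0:
        contents = ("luma",)             # S24: only luminance is cached
    else:
        contents = ("luma", "chroma")    # S26: chroma is cached as well
    return transfer_flag, contents
```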
In the step S24, the reference memory manager 212 maps the cache memory 42 as shown in
In step S28 of reference picture transfer, the video encoding controller 211 transfers reference picture data required for search by the motion estimator 16 from the cache memory 42 to the reference local memory 48. Once the reference picture data required for search by the motion estimator 16 is stored in the reference local memory 48, the video encoding controller 211 allows the motion estimator 16 to perform search in step S30 of motion estimation.
Once search by the motion estimator 16 is completed, the reference picture transfer procedure is determined again from the information of the transfer flag set in the step S18 or S20 in step S32 of transfer mode determination. Specifically, the process proceeds to step S34 of reference picture transfer if the transfer flag is 0, or to step S36 of reference picture transfer if the transfer flag is 1.
In the step S34, the reference memory manager 212 operates the bus switch 44 so that color-difference reference picture data, among reference picture data required for motion compensation determined from motion information estimated in the step S30, is transferred from the external memory 38 to the reference local memory 46. In the step S36, the reference memory manager 212 operates the bus switch 44 so that color-difference reference picture data, among reference picture data required for motion compensation, is transferred from the cache memory 42 to the reference local memory 46.
In step S38 of motion compensation, the motion compensator 218 performs motion compensation using the motion information estimated in the step S30 and the luminance reference picture data stored in the reference local memory 48. Although the step S38 of motion compensation is described as being performed after the step S34 or S36 of reference picture transfer, the step S38 may be performed before the step S34 or S36, or simultaneously with the step S34 or S36. Once reference picture data required for motion compensation is stored in the reference local memory 46 in the step S34 or S36, the motion compensator 218 performs motion compensation using the motion information estimated in the step S30 and the color-difference reference picture data stored in the reference local memory 46, in step S40 of motion compensation.
Once the motion compensator 218 generates a predicted image in the steps S38 and S40, the encoder 232 encodes the difference between the input image and the predicted image, and outputs the result as the stream STR, as well as decoding the encoded data to generate a difference image for generation of a reconstructed image, in step S42 of encoding/stream output. The reconstructed image is generated using the predicted image generated in the steps S38 and S40 with the adder 36, and transferred to the external memory 38 as a reference picture used at the next inter-frame prediction.
The series of processing described above is repeated from the point immediately after the step S12 until termination of the recording is determined in step S44 of recording termination. If termination of the recording is not determined in the step S44, whether the maximum processing performance of the video encoding device 200 has been exceeded is monitored in step S46 of recording status monitoring, and then the process returns to the point immediately after the step S12. In the step S14, the recording mode is converted to a recording condition including the monitor information obtained in the step S46. If termination of the recording is determined in the step S44, the video encoding processing is terminated.
The video encoding controller 211 that controls the series of processing described above may be a processor operating by executing a program or a sequencer comprised of a combinational circuit and a sequential circuit. Although the video encoding controller 211 has been described above as being a dedicated controller incorporated in the video encoding device 200, the function of the video encoding controller 211 may be undertaken by the system control section 128 and the external control section 136 in
Variations of mapping of the cache memory 42 will be described hereinafter using the video encoding device 100 of
In this variation, to allow the motion estimator 16 to refer to a plurality of reference pictures during inter-frame prediction of a unit block to be encoded, the cache memory 42 is mapped as shown in
When the bus switch 44 in
Conversely, when the bus switch 44 in
In the above mapping, the total volume of reference pictures mapped to address areas Ref(1) to Ref(n) in the area ALR in
More specifically, when the bus switch 44 is controlled statically based on the recording mode, the number of pictures that can be referred to by the motion estimator 16 is changed as shown in
If the reference memory manager 12 controls the bus switch 44 to invariably transfer color-difference reference picture data from the cache memory 42 to the reference local memory 46, for example, the cache memory 42 will be utilized most efficiently by being mapped so that the ratio of the area ALR to the area ACR is 2:1 when the color-difference signal format of the input picture is 4:2:0, 1:1 when it is 4:2:2, and 1:2 when it is 4:4:4.
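These ratios follow directly from the sample counts of the color-difference formats: there are two chroma planes, each subsampled horizontally and/or vertically relative to the luma plane. A sketch of the derivation:

```python
from fractions import Fraction

def cache_area_ratio(chroma_format):
    """ALR:ACR ratio (luma area : chroma area) that uses the cache
    memory 42 most efficiently when chroma is always served from it.
    The ratio equals luma samples : total chroma samples per picture."""
    # Horizontal and vertical subsampling factors of each chroma plane.
    sub_h, sub_v = {"4:2:0": (2, 2), "4:2:2": (2, 1), "4:4:4": (1, 1)}[chroma_format]
    chroma_per_luma = Fraction(2, sub_h * sub_v)  # two chroma planes
    r = 1 / chroma_per_luma                       # luma : chroma
    return (r.numerator, r.denominator)
```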
While the number of reference pictures is m (m<n) in the case of
When color-difference reference picture data is directly transferred from the external memory 38 to the reference local memory 46 via the bus switch 44, the motion estimator 16 searches a reference region having a height h, for example, for each of n reference pictures as shown in
In the case of controlling the bus switch 44 based on the picture structure, or whether the input picture is a frame picture or a field picture, the reference region may be narrowed when the input picture is a frame picture, and widened when it is a field picture, whereby the cache memory 42 can be utilized efficiently. In the case of controlling the bus switch 44 based on the rate of processing of pictures, the reference region may be narrowed when the picture processing rate is low, and widened when it is high, whereby the cache memory 42 can be utilized efficiently.
Since the number of color-difference reference pictures in the cache memory 42 is smaller than the number of luminance reference pictures, it is necessary to control the bus switch 44 dynamically depending on whether a reference picture that needs to be transferred to the reference local memory 46 is stored in the cache memory 42. Such control is performed entirely by the reference memory manager 12, as follows.
The reference memory manager 12 first transfers color-difference reference picture data, as well as luminance reference picture data, from the external memory 38 to the cache memory 42. Thereafter, when the motion estimator 16 uses, as the reference picture optimal for encoding, a luminance reference picture whose corresponding color-difference reference picture has been transferred to the cache memory 42, the reference memory manager 12 controls the bus switch 44 to transfer the color-difference reference picture from the cache memory 42 to the reference local memory 46.
Conversely, when the motion estimator 16 uses, as the reference picture optimal for encoding, a luminance reference picture whose corresponding color-difference reference picture has not been transferred to the cache memory 42, the reference memory manager 12 controls the bus switch 44 to transfer the color-difference reference picture directly from the external memory 38 to the reference local memory 46.
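This dynamic control reduces to a cache-hit test on the chosen reference picture. A sketch follows, with the memories modeled as dictionaries keyed by reference picture number (an illustrative assumption):

```python
def fetch_chroma_reference(ref_number, cached_chroma_refs,
                           external_memory, cache_memory):
    """Dynamic control of the bus switch 44 described above: serve the
    color-difference reference from the cache 42 when it was staged
    alongside its luminance counterpart, and otherwise fall back to
    the direct path from the external memory 38."""
    if ref_number in cached_chroma_refs:
        return cache_memory[ref_number]    # path from the cache memory 42
    return external_memory[ref_number]     # direct path from the memory 38
```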
In the case of storing only luminance reference picture data in the cache memory as in
The functional blocks in
The external memory 38 in
Although the LSI was mentioned in the above description, it may be replaced with an IC, a system on a chip, a super LSI, or an ultra LSI depending on the degree of integration. The technique of circuit integration is not limited to the LSI technology; a dedicated circuit or a general-purpose processor may be used to achieve circuit integration. A field programmable gate array (FPGA) that can be programmed after LSI fabrication, or a reconfigurable processor in which connection and setting of circuit cells inside an LSI can be reconfigured, may be used. If a circuit integration technology replacing the LSI technology appears with progress of semiconductor technology or development of a derivative technology in the future, the functional blocks may naturally be integrated using such a technology. One possibility of such a technology is the application of biotechnology.
As described above, in the embodiment of the present invention, data transfer from the external memory can be reduced. Thus, the present invention is useful for a motion estimation device, etc.
Many features and advantages of the present invention are apparent from the above description, and thus it is intended that all of such features and advantages of the present invention are covered by the scope of the appended claims. Since many changes and modifications may be easily made by those skilled in the art, the present invention should not be limited to exactly the same configurations and operations as those illustrated and described. It is therefore to be understood that all appropriate modifications and equivalents fall within the scope of the invention.
Number | Date | Country | Kind
--- | --- | --- | ---
2008-212959 | Aug 2008 | JP | national
This is a continuation of PCT International Application PCT/JP2009/004034 filed on Aug. 21, 2009, which claims priority to Japanese Patent Application No. 2008-212959 filed on Aug. 21, 2008. The disclosures of these applications including the specifications, the drawings, and the claims are hereby incorporated by reference in their entirety.
Relation | Number | Date | Country
--- | --- | --- | ---
Parent | PCT/JP2009/004034 | Aug 2009 | US
Child | 13017639 | | US