This disclosure relates generally to the field of video technology, and more particularly to reuse of a search region in motion estimation of multiple target frames.
An encoder may perform motion estimation and encoding of a macroblock of a frame. The encoder may use a two-window approach, in which a search window of a prior frame and an additional search window of a later frame are used to perform motion estimation of the macroblock. The encoder may require additional bandwidth and additional memory to perform motion estimation using the two-window approach. As a result, the size of each search window may have to be reduced. In addition, a loss of quality in the encoding may occur, resulting in a lower-quality transmission or recording.
This summary is provided to comply with 37 C.F.R. §1.73, which requires a summary of the invention briefly indicating the nature and substance of the invention. It is submitted with the understanding that it will not be used to limit the scope or meaning of the claims.
Several methods and a system to reuse a search region in motion estimation of multiple target frames are disclosed.
In an exemplary embodiment, a method includes acquiring a search region of a reference frame. The method also includes maintaining the search region in a memory. The method also includes performing motion estimation of a macroblock of a target frame in a direction using a processor and the search region. In addition, the method includes reusing the search region maintained in the memory to perform motion estimation of an additional macroblock of an additional target frame in an additional direction.
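For illustration only, the following minimal Python sketch shows the flow of this method; the array shapes, function names, and region dimensions are assumptions and are not part of the disclosure.

```python
import numpy as np

def acquire_search_region(reference, cx, cy, range_x=144, range_y=72, mb=16):
    """Copy the search region surrounding a macroblock position (cx, cy)."""
    y0, y1 = max(cy - range_y, 0), min(cy + range_y + mb, reference.shape[0])
    x0, x1 = max(cx - range_x, 0), min(cx + range_x + mb, reference.shape[1])
    return reference[y0:y1, x0:x1].copy()  # maintained in "internal memory"

# A synthetic 8-bit luma reference frame stands in for real video data.
reference_frame = np.random.randint(0, 256, (288, 352), dtype=np.uint8)

# Acquire the region once; the same array is then reused for motion
# estimation of macroblocks of different target frames in both directions.
search_region = acquire_search_region(reference_frame, 176, 144)
```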
In an exemplary embodiment, a system includes a motion estimation module to acquire a search region of a reference frame. In addition, the system includes a memory to maintain the search region. The system also includes a processor to perform motion estimation of a macroblock of a target frame in a direction using the search region. The processor further reuses the search region maintained in the memory to perform motion estimation of an additional macroblock of an additional target frame in an additional direction.
In an exemplary embodiment, a method includes acquiring a search region of a reference frame. The method further includes maintaining the search region in a memory. The method also includes determining a motion estimation predictor using the reference frame. Motion estimation of a macroblock of a target frame utilizes the motion estimation predictor. In the embodiment, the reference frame and the target frame are adjacent frames.
In the embodiment, the method includes performing motion estimation of a macroblock of a target frame in a direction using a processor and the search region. At least two of the macroblock, an additional macroblock, and a separate macroblock are collocated. The method also includes reusing the search region maintained in the memory to perform motion estimation of the additional macroblock of an additional target frame in an additional direction. The method further includes reusing the search region maintained in the memory to perform motion estimation of the separate macroblock of a separate target frame in a separate direction. In the embodiment, the direction of motion estimation, the additional direction of motion estimation, and the separate direction of motion estimation are each forward. In addition, the target frame and the additional target frame are each B-frames, and the separate target frame is a P-frame.
The method further includes acquiring previously determined motion estimation data of the target frame. The previously determined motion estimation data is generated by performing motion estimation in an alternate direction on the macroblock of the target frame using an alternate search region of an alternate reference frame. The method also includes selecting at least one of a forward mode, a backward mode, and a bipredictive mode as a preferred motion estimation method of the target frame. The method further includes performing a real time encoding of the macroblock of the target frame. The real time encoding is performed after the alternate search region is maintained in the memory and reused to perform motion estimation of multiple frames.
The methods, systems, and apparatuses disclosed herein may be implemented by any means for achieving various aspects, and may be executed in the form of a machine-readable medium embodying a set of instructions that, when executed by a machine, cause the machine to perform any of the operations disclosed herein. Other features will be apparent from the accompanying drawings and from the detailed description that follows.
Example embodiments are illustrated by way of example and not limitation in the figures of the accompanying drawings, in which like references indicate similar elements.
Other features of the present embodiments will be apparent from the accompanying Drawings and from the Detailed Description that follows.
Several methods and a system to reuse a search region in motion estimation of multiple target frames are disclosed.
Compression of video data may involve intra frame coding using an I-frame, predictive frame coding using a P-frame, or bipredictive frame coding using a B-frame. The P-frame may be P1 112 or P2 118. The B-frame may be one of B11 108, B12 110, B21 114, or B22 116. The I-frame may be coded by itself. A macroblock 122N of a P-frame may be predicted using motion estimation from a recently coded frame. A macroblock 122A-D of a B-frame may be bipredictively coded using a combination of data from two previously coded frames. P-frames may be coded in temporal order, which may be the order in which the frames occurred. B-frames may be coded using a frame that occurred temporally later and a frame that occurred temporally earlier.
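As a compact, illustrative restatement of these frame types (an assumed Python representation, not part of the disclosure):

```python
from enum import Enum

class FrameType(Enum):
    I = "intra"          # coded by itself, without reference data
    P = "predictive"     # predicted from one recently coded frame
    B = "bipredictive"   # combines data from two previously coded frames

# Number of previously coded reference frames each type draws on.
references_needed = {FrameType.I: 0, FrameType.P: 1, FrameType.B: 2}
```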
Video frames that occur within a threshold time period or a threshold number of frames may use the same reference data to perform motion estimation. The threshold time period or the threshold number of frames may be affected by the video coding standard, available memory, processor limitations, and any applicable time limits with respect to completing motion estimation and encoding. Different B-frames may use the same reference data to perform either forward or reverse motion estimation, depending on whether the reference frame 124 occurred later in time or earlier in time than a particular target B-frame.
Motion estimation and motion compensation may be delinked at a frame level from other encoding processes to allow one or more pixels of a search region to be reused. The other encoding processes may include a transform module, an entropy coder module, an intraprediction module, and a quantization module. The transform module may perform a Fourier transform operation or an inverse Fourier transform operation. The intraprediction module may perform a prediction operation with respect to an intra coded frame. The quantization module may perform quantization.
A reference frame may be stored in external memory, which may be a non-volatile or volatile memory, or any other storage medium. A search region 120 of the reference frame may be transferred from the external memory and maintained in internal memory 104. The search region 120 may be reused to provide reference data, which may be used to perform motion estimation. Maintaining and reusing the search region 120 in the internal memory 104 may reduce the bandwidth needed to repeatedly transfer the same search region 120 data between external memory and the internal memory 104. The reduction in bandwidth may allow a larger search region 120 to be used in motion estimation, which may result in an increased quality of video compression.
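A rough illustration of the saving, under assumed numbers (16x16 macroblocks, 8-bit luma, and five passes over one window; none of these figures are from the disclosure):

```python
# One luma search window for a +/-144 x +/-72 search range around a
# 16x16 macroblock.
window_bytes = (2 * 144 + 16) * (2 * 72 + 16)  # 304 * 160 = 48,640 bytes

passes = 5  # e.g., B11, B12, B21, B22, and P2 all reusing the same window
without_reuse = passes * window_bytes          # window transferred per pass
with_reuse = window_bytes                      # window transferred once
print(without_reuse // 1024, "KB vs", with_reuse // 1024, "KB")  # 237 vs 47
```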
In an example embodiment, a standard group of pictures ("GOP") structure may be used with respect to a series of video frames. The structure may be a "PBBP" sequence, in which a P-frame is followed by one or more B-frames, which may be followed by another P-frame. In an embodiment, the frames may be designated as follows: P0, B11 108, B12 110, P1 112, B21 114, B22 116, P2 118, B31, B32, P3, etc. This order may be a display order of the frames. The frames may be adjacent to each other or separated by any number of frames.
In the embodiment, reference data from P1 112 may be used with respect to motion estimation in five frames: B11 108, B12 110, B21 114, B22 116, and P2 118. The frame P1 112 may serve as a backward reference with respect to the frames B11 108 and B12 110. The frame P1 112 may serve as a forward reference with respect to B21 114, B22 116, and P2 118.
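The reuse pattern of this embodiment can be summarized as data (illustrative only):

```python
# P1 112 is the single resident reference for five target frames.
reference_role_of_P1 = {
    "B11": "backward reference",  # P1 occurs later in time than B11
    "B12": "backward reference",  # P1 occurs later in time than B12
    "B21": "forward reference",   # P1 occurs earlier in time than B21
    "B22": "forward reference",   # P1 occurs earlier in time than B22
    "P2":  "forward reference",   # P1 occurs earlier in time than P2
}
```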
In an embodiment, a search region 120 of a reference frame 124 is acquired. The search region 120 may include data of P1 112. The data may be transferred from external memory into the internal memory 104 using a windowing approach. The data may then be maintained in the internal memory 104 to perform backward and forward motion estimation of macroblocks of multiple frames. The motion estimation may be performed using a processor and the search region 120. This motion estimation may be pipelined with other video encoding operations such that real time motion estimation and encoding can be performed. The motion estimation process may be performed using the motion estimation module 100.
Temporal predictors that improve motion estimation efficiency may be selected from frames within a temporal threshold time period or a temporal threshold frame distance. The temporal threshold time period or the temporal threshold frame distance may vary depending on video standards, available memory, and hardware limitations. In the embodiment, the temporal predictors may be selected based on an adjacent frame, which may improve a motion estimation operation. Reusing a search region 120 to perform motion estimation with respect to multiple frames in a sequence including B-frames may allow multiple frames to gain the benefit of temporal predictors obtained from frames within the temporal threshold time period or temporal threshold frame distance. In particular, each frame in the sequence including B-frames may be able to obtain a predictor from an adjacent frame, which may improve motion estimation accuracy and a rate of convergence.
In an embodiment, given a macroblock in P1 112, a backward motion vector 134B of a collocated macroblock 122B in the target frame 126A, B12 110, may be determined using the search region 120. The macroblocks 122A-N may include the collocated macroblock. An additional backward motion vector 134A of an additional collocated macroblock 122A in an additional target frame 128A, B11 108, may then be determined by reusing the search region 120. A forward motion vector 134C of a target frame 126B, B21 114, may be determined by reusing the search region 120. An additional forward motion vector 134D of an additional target frame 128B, B22 116, may then be determined by reusing the search region 120. In addition, a separate forward motion vector 134N of a separate target frame 130, P2 118, may then be determined by reusing the search region 120.
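A minimal sketch of how each of these motion vectors might be determined against the cached region is shown below; the full-search strategy, the sum-of-absolute-differences metric, and the search radius are assumptions for illustration, not the claimed method.

```python
import numpy as np

def sad(a, b):
    """Sum of absolute differences between two equal-sized blocks."""
    return int(np.abs(a.astype(np.int32) - b.astype(np.int32)).sum())

def estimate_motion(mb, region, predictor, radius=8):
    """Full search around a predictor offset inside the cached region.

    Offsets (dx, dy) locate a candidate block's top-left corner within
    the region; a real encoder would convert the winning offset into a
    motion vector relative to the collocated macroblock position.
    """
    h, w = mb.shape
    best_cost, best_off = None, predictor
    for dy in range(predictor[1] - radius, predictor[1] + radius + 1):
        for dx in range(predictor[0] - radius, predictor[0] + radius + 1):
            if 0 <= dy <= region.shape[0] - h and 0 <= dx <= region.shape[1] - w:
                cost = sad(mb, region[dy:dy + h, dx:dx + w])
                if best_cost is None or cost < best_cost:
                    best_cost, best_off = cost, (dx, dy)
    return best_off

# The cached region serves every call; only the macroblock and predictor
# change as motion estimation moves from frame to frame.
region = np.random.randint(0, 256, (160, 304), dtype=np.uint8)
macroblock = np.random.randint(0, 256, (16, 16), dtype=np.uint8)
print(estimate_motion(macroblock, region, predictor=(144, 72)))
```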
The direction of motion estimation performed to obtain each motion vector may be either forward or backward. The target frame 126A-B and the additional target frame 128A-B may each include a B-frame. The separate target frame 130 may include a P-frame. In an embodiment, the direction of motion estimation performed on the target frame 126A and the additional target frame 128A may be backward. In an additional embodiment, the direction of motion estimation performed on the target frame 126B, the additional target frame 128B, and the separate target frame 130 may be forward.
In the embodiment, motion estimation may be performed sequentially in either a forward or a backward direction. Alternatively, motion estimation may be performed in any other order that uses a prior motion estimation result as a predictor with respect to motion estimation of another frame in a sequence that includes B-frames. Using an adjacent frame, or a frame within a threshold time period or threshold number of frames, may allow a motion estimation process to track a series of motion vectors through temporally related frames. Tracking the series of motion vectors may improve reuse of the search region 120 of the reference frame 124.
In an additional embodiment, backward motion estimation of a macroblock of B12 110 may be performed by obtaining a motion estimation predictor 132B of the frame P1 112 and searching in the reference frame 124, P1 112. A forward prediction and a forward motion vector of the macroblock may be fetched from external memory to internal memory 104 to perform a bipredictive motion estimation operation. The sum of absolute differences may be used to select a preferred motion estimation mode of a B-frame from among a forward motion estimation result, a backward motion estimation result, and a combined motion estimation result. The combined motion estimation result may be an output of a bipredictive motion estimation mode. A chosen predicted reference may be stored in external memory.
In the embodiment, backward motion estimation of a collocated macroblock of B11 108 may be performed using a motion estimation predictor 132A from B12 110. A forward prediction and a forward motion vector of the macroblock may be fetched from external memory to internal memory 104 to perform a bipredictive motion estimation operation. The sum of absolute differences may again be used to select a preferred motion estimation mode from among the forward, backward, and combined motion estimation results, and a chosen predicted reference may be stored in external memory.
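The sum-of-absolute-differences comparison described in the two preceding embodiments might look like the following sketch; averaging the two predictions to form the bipredictive candidate is an assumption for illustration.

```python
import numpy as np

def sad(a, b):
    return int(np.abs(a.astype(np.int32) - b.astype(np.int32)).sum())

def choose_mode(mb, forward_pred, backward_pred):
    """Select the motion estimation mode with the lowest SAD."""
    # Bipredictive candidate: rounded average of the two predictions.
    combined = ((forward_pred.astype(np.uint16) +
                 backward_pred.astype(np.uint16) + 1) // 2).astype(np.uint8)
    costs = {
        "forward": sad(mb, forward_pred),
        "backward": sad(mb, backward_pred),
        "bipredictive": sad(mb, combined),
    }
    return min(costs, key=costs.get)
```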
In the embodiment, forward motion estimation of a collocated macroblock of B21 114 may be performed using a motion estimation predictor 132C from P1 112 and a search in the reference frame P1 112. A forward prediction and a forward motion vector may be stored in external memory. In the embodiment, forward motion estimation of a collocated macroblock of B22 116 may then be performed using a motion estimation predictor 132D from B21 114. The resulting forward motion vector and forward prediction may then be stored in external memory. Forward motion estimation of P2 118 may be performed using a motion estimation predictor 132N from B22 116 and a search in P1 112.
In an embodiment, the described methods and system to reuse the search region 120 may be performed using 195 KB of internal memory and 975 MBPS of bandwidth between external memory and internal memory. The search range in a forward or a backward direction may be +/-144 pixels horizontally and +/-72 pixels vertically.
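As a back-of-envelope check of these figures (the macroblock size, bit depth, and 4:2:0 chroma format are assumptions, not statements from the disclosure):

```python
# One full search window for a 16x16 macroblock at 8 bits per sample.
luma_bytes = (2 * 144 + 16) * (2 * 72 + 16)  # 48,640 bytes
chroma_bytes = 2 * (luma_bytes // 4)         # 4:2:0: two quarter-size planes
print((luma_bytes + chroma_bytes) / 1024)    # ~71 KB per window
# Two resident windows (forward and backward references) plus working
# buffers would plausibly fit within the stated 195 KB of internal memory.
```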
In an embodiment, when encoding a B-frame based group of pictures using a "PBBP" sequence with a 960 megabit per second (MBPS) external memory transfer budget, reuse of a search region 120 in motion estimation may increase the supportable vertical search range from +/-24 pixels to +/-72 pixels with respect to a B-frame. The vertical search range of a P-frame may be increased from +/-64 pixels to +/-72 pixels. The increase in vertical search range may improve the quality of video compression at substantially equivalent external memory traffic relative to another method.
In an embodiment, using a time period allotted to displaying three frames to perform five motion estimation operations allows the motion estimation operations to be better balanced. In other embodiments, any number of time periods may be used to complete any corresponding number of motion estimation operations to perform real time motion estimation and encoding. Motion estimation operations may be balanced to provide greater time periods to particular motion estimation operations, given that other operations may not use all of an allotted time period.
In an embodiment, motion estimation may be performed in a sequence that allows predictors from completed motion estimation operations of adjacent frames to be used to perform motion estimation of additional frames. The order of motion estimation operations may begin with the frames closest to the reference frame and progress in either a forward or a backward direction away from the reference frame. In other embodiments, other frames may be used as a source of a predictor with respect to a motion estimation operation.
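Restated as data, using the frame names of the earlier embodiment (illustrative only):

```python
# Motion estimation order radiates outward from the reference frame P1,
# so each pass can seed its search with a neighbor's completed result.
backward_chain = ["B12", "B11"]        # closest frame first, moving earlier
forward_chain = ["B21", "B22", "P2"]   # closest frame first, moving later

predictor_source = {
    "B12": "P1",   # predictor taken from the reference frame itself
    "B11": "B12",  # then from the adjacent, already-estimated frame
    "B21": "P1",
    "B22": "B21",
    "P2":  "B22",
}
```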
In another embodiment, encoding may be performed in which P-frames and intra coded frames are encoded in their display order, and other frames are encoded in between the P-frames and intra coded frames. In the embodiment, B-frames are processed in their display order, but their encoding is time delayed so that the B-frames are encoded after a P-frame that follows them in the display frame order 252.
In the example embodiment, the display order of frames may be I0 258 at time T0, followed by B11 208, B12 210, P1 212, B21 214, B22 216, P2 218, B31 260, B32 262, P3 264, B41 266, B42 268, P4 270, B51 272, and B52 274. The frame I0 258 may be an intra frame that is coded between T0+100 ms and T0+133 ms. Between T0+133 ms and T0+166 ms, motion estimation in the forward direction may be performed with respect to a collocated macroblock of the frames B11 208 and B12 210. Between T0+166 ms and T0+200 ms, motion estimation and encoding may be performed with respect to the frame P1 212.
In the embodiment, between T0+200 ms and T0+300 ms, motion estimation is performed in the backward direction with respect to the frames B12 210 and B11 208, in that order. Motion estimation is performed in the forward direction with respect to the frames B21 214, B22 216, and P2 218, in that order. Encoding is performed with respect to P2 218.
In the embodiment, between T0+300 ms and T0+400 ms, motion estimation is performed in the backward direction with respect to the frames B22 216 and B21 214, in that order. Motion estimation is performed in the forward direction with respect to the frames B31 260, B32 262, and P3 264, in that order. Encoding is performed with respect to B11 208, B12 210, and P3 264.
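For reference, the example timeline above can be tabulated as follows (the ~33 ms display period implied by the disclosed intervals corresponds to an assumed rate of roughly 30 frames per second):

```python
# Each entry pairs an interval relative to T0 with the operations the
# embodiment performs in it ("ME" = motion estimation).
schedule = [
    ("T0+100..T0+133 ms", ["code I0"]),
    ("T0+133..T0+166 ms", ["ME forward B11", "ME forward B12"]),
    ("T0+166..T0+200 ms", ["ME and encode P1"]),
    ("T0+200..T0+300 ms", ["ME backward B12", "ME backward B11",
                           "ME forward B21", "ME forward B22",
                           "ME forward P2", "encode P2"]),
    ("T0+300..T0+400 ms", ["ME backward B22", "ME backward B21",
                           "ME forward B31", "ME forward B32",
                           "ME forward P3", "encode B11", "encode B12",
                           "encode P3"]),
]
for interval, ops in schedule:
    print(interval, "->", ", ".join(ops))
```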
The diagrammatic system view 300 may indicate a personal computer and/or a data processing system in which one or more operations disclosed herein are performed. The processor 302 may be a microprocessor, a state machine, an application-specific integrated circuit, a field programmable gate array, etc. (e.g., an Intel® Pentium® processor). The main memory 304 may be a dynamic random access memory and/or a primary memory of a computer system.
The static memory 306 may be a hard drive, a flash drive, and/or other memory associated with the data processing system. The bus 308 may be an interconnection between various circuits and/or structures of the data processing system. The video display 310 may provide a graphical representation of information on the data processing system. The alpha-numeric input device 312 may be a keypad, a keyboard, and/or any other text input device (e.g., a special device to aid the physically handicapped).
The cursor control device 314 may be a pointing device such as a mouse. The drive unit 316 may be a hard drive, a storage system, and/or another longer-term storage subsystem. The signal generation device 318 may be a BIOS and/or a functional operating system of the data processing system. The network interface device 320 may be a device that performs interface functions such as code conversion, protocol conversion, and/or buffering used for communication to and from the network 326. The machine readable medium 322 may store instructions by which any of the methods disclosed herein may be performed. The instructions 324 may provide source code and/or data code to the processor 302 to enable any one or more operations disclosed herein.
Although the present embodiments have been described with reference to specific example embodiments, it will be evident that various modifications and changes may be made to these embodiments without departing from the broader spirit and scope of the various embodiments. For example, the various devices, modules, analyzers, generators, etc. described herein may be enabled and operated using hardware circuitry such as CMOS-based logic circuitry, firmware, software, or any combination of hardware, firmware, and software, which may be embodied in a machine-readable medium. For example, the various electrical structures and methods may be embodied using transistors, logic gates, and electrical circuits. Examples of electrical circuits include application-specific integrated circuit (ASIC) circuitry and digital signal processor (DSP) circuitry.
Particularly, the motion estimation module 100, the encoder module 106, the quantization module, the intraprediction module, and the transform module may be enabled using software and/or using transistors, logic gates, and electrical circuits (e.g., application-specific integrated circuit (ASIC) circuitry) such as a motion estimation circuit, an encoding circuit, a quantization circuit, an intraprediction circuit, a transform circuit, and other circuits.
In addition, it will be appreciated that the various operations, processes, and methods disclosed herein may be embodied in a machine-readable medium and/or a machine accessible medium compatible with a data processing system (e.g., a computer system), and may be performed in any order (e.g., including using means to achieve the various operations). Accordingly, the specification and drawings are to be regarded in an illustrative rather than a restrictive sense.