The present invention relates to a method and apparatus for producing a video stream.
It is known that, when capturing a sequence of images of a scene which are to be combined into a video stream, motion of an object within the field of view can cause problems with the perceived quality of the video stream.
Referring to
Referring to
Even in an acquisition device with variable aperture, closing down the aperture rather than speeding up exposure time to solve motion problems would cause problems with depth of field.
Separately, it is known to use neutral density filters to reduce the amount of light entering the lens, so increasing exposure time and introducing blur into video frames. However, good quality neutral density filters are expensive, they cannot be fitted to every kind of camera, and they are not designed to be used by casual users.
ReelSmart Motion Blur (http://www.revisionfx.com/products/rsmb/overview/) introduces motion blur into images which have been acquired with a relatively short exposure time, but this does not address problems with scenes which require a high dynamic range and so leads to problems with underexposure or saturation.
High Dynamic Range (HDR) images are typically generated by acquiring multiple component images of a scene, each with different exposure levels, and then later, merging the component images into a single HDR image. This is a useful way of synthesizing an image of a scene comprising very dark and very bright regions.
HDRx from Red.com, Inc. (http://www.red.com/learn/red-101/hdrx-high-dynamic-range-video) includes a “Magic Motion” facility which interprets different motion blur characteristics from a pair of frame sequences, each simultaneously acquired with different exposure times, and then blends them together under user-control.
According to a first aspect, there is provided a method of processing a stream of images according to claim 1.
This aspect provides a method for automatically combining sequences of frames acquired with different exposure times into a video stream with high dynamic range and with natural motion blur.
Embodiments of this aspect capture a pair of frame sequences including relatively lighter, longer exposure time (LET) frames which tend to be susceptible to motion blur and relatively darker, sharper, shorter exposure time (SET) frames. Motion is evaluated for each frame and synthetic motion blur is added to the SET frames before the SET and LET frames are combined to produce a high dynamic range (HDR) video sequence with natural motion blur.
According to a second aspect, there is provided a method of producing a video stream according to claim 11.
This aspect provides an effective mechanism for applying motion blur to images within a sequence of images.
Further aspects provide respective apparatus for performing the above methods.
In some embodiments motion is estimated between pairs of SET image frames of a given scene.
Various embodiments of the invention will now be described, by way of example, with reference to the accompanying drawings, in which:
Referring now to
Image frames are acquired by an image processing pipeline (IPP) 12 from an image sensor (not shown) and written to memory, typically at a frame rate from 24 fps up to 60 fps or even 240 fps. In the embodiment, for each frame, a pair of images is acquired, one for a short exposure time (SET) and one for a relatively longer exposure time (LET). The exact exposure times can be chosen as a function of the scene illumination and may vary across a sequence, but the difference between the SET and the LET means that a moving object will be captured sharply in one whereas it will be subject to motion blur in the other. Typically, the SET and LET image frames are acquired by reading image sensor values after the SET and, without clearing sensor values, continuing to expose the image sensor until the LET, at which time LET image values are read. In the example, SET image values are written to a frame buffer 22-A and LET image values are written to a frame buffer 22-B. As explained, SET image frames tend to be dark but sharp, whereas LET image frames tend to be bright and naturally blurred.
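The read-without-clearing acquisition described above can be illustrated with a toy one-dimensional sensor. This is only a sketch under stated assumptions: the function name `expose`, the unit-charge-per-millisecond model and the point-like moving object are all hypothetical and are not taken from the embodiment.

```python
def expose(width, speed_px_per_ms, set_ms, let_ms):
    """Simulate one SET/LET pair on a 1-D sensor (hypothetical toy model).

    A point object starts at pixel 0 and deposits one unit of charge per
    millisecond on the pixel it currently covers. The SET frame is read after
    set_ms without clearing the sensor; exposure then continues until let_ms.
    """
    charge = [0] * width
    set_frame = None
    for t in range(let_ms):
        pos = int(t * speed_px_per_ms)
        if 0 <= pos < width:
            charge[pos] += 1
        if t + 1 == set_ms:
            set_frame = list(charge)  # non-destructive read at the SET
    let_frame = list(charge)          # final read at the LET
    return set_frame, let_frame


set_frame, let_frame = expose(width=10, speed_px_per_ms=1, set_ms=2, let_ms=8)
set_lit = sum(1 for v in set_frame if v)  # pixels touched in the SET frame
let_lit = sum(1 for v in let_frame if v)  # pixels touched in the LET frame
```

In this toy model the SET frame collects less charge over fewer pixels (darker, sharper), while the LET frame contains the SET exposure plus a longer smear (brighter, blurred). Because both frames start at the same instant, they are not offset in time as successively captured frames would be.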
It will be appreciated that if SET and LET image frames were instead to be captured successively, artefacts such as ghosting might tend to appear in the frames of the final video stream.
In this embodiment, the apparatus is arranged to determine a motion vector 26 between successive SET image frames. It will nonetheless be appreciated that the motion vector need not be determined between immediately temporally adjacent SET image frames, but the frames used should nonetheless be acquired relatively close to one another. Also, the motion vector for a given SET image could be determined across more than two images. One suitable technique for determining a motion vector is disclosed in WO2014/146983. Using this technique, a down sampler (DS) 14 acquires information for successive SET image frames from the IPP 12. (This can also be acquired from the frame buffer 22-A if there is no direct connection to the IPP 12.) The down sampler 14 may for example be a Gaussian down-sampler of the type provided by Fujitsu. The down-sampled image information is fed to an integral image (II) generator (GEN) 16 which writes the II for an image frame to memory 24. Calculation of an integral image is well known and was originally disclosed by Viola, P. and Jones, M. in “Rapid Object Detection using a Boosted Cascade of Simple Features”, Computer Vision and Pattern Recognition, 2001, Volume 1. As will be appreciated, only an intensity version of the original SET image is required to provide an integral image. This could be a grey scale version of the image, or it could be any single plane of a multi-plane image format, for example, RGB, LAB, YCC etc.
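For readers unfamiliar with integral images, the following minimal sketch shows how one can be computed from a single intensity plane and then used to sum any rectangle with four lookups. The function names and the zero-padded layout (one extra row and column) are illustrative assumptions; the generator 16 of the embodiment may organise its output differently.

```python
def integral_image(gray):
    """Return the integral image of a 2-D list of intensities.

    ii[y][x] holds the sum of gray[0..y-1][0..x-1], so the result has one
    extra row and column of zeros (a common convention, assumed here).
    """
    h, w = len(gray), len(gray[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += gray[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def box_sum(ii, x0, y0, x1, y1):
    """Sum of intensities over the rectangle [x0, x1) x [y0, y1)."""
    return ii[y1][x1] - ii[y0][x1] - ii[y1][x0] + ii[y0][x0]


frame = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
ii = integral_image(frame)
total = box_sum(ii, 0, 0, 3, 3)   # whole frame
corner = box_sum(ii, 1, 1, 3, 3)  # bottom-right 2x2 block: 5 + 6 + 8 + 9
```

The constant-time rectangle sum is what makes integral images attractive for correlating image regions at several scales.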
A hierarchical registration engine (HRE) 18 such as disclosed in WO2014/146983 reads integral image information for a pair of SET frames from memory 24 and by correlating the two images, generates a displacement map 26 for the image pair. An exemplary displacement map is shown in detail in
In the approach disclosed in WO2014/146983, a reduced integral image (RII) is also obtained for every second down-sampled SET image frame and the motion vector 26 is determined using an II for one image and an RII for an adjacent image frame.
In the embodiment of
Referring to
Referring back to
The HDR frames can then be processed by a conventional type video encoder 21 to produce a video stream which can be displayed or stored on the acquisition device or transmitted to a remote device (not shown) in a conventional fashion.
It will be appreciated, however, that the motion vector for each pixel of an image produced by the HRE module 18 can vary in angle and magnitude, and so it could be extremely processor intensive to generate a unique blurring kernel for each SET frame pixel to take into account the motion vector for the pixel. On the other hand, if a per-pixel blurring kernel were not generated, for example if kernels were only generated for blocks of pixels, then blurring artefacts could be visible within the final HDR image frame.
In an embodiment of the present invention, the blurring module 13 generates four blurred versions of each SET frame: B0, B90, B180, B270, each blurred by a set amount B in a respective cardinal direction 0°, 90°, 180° and 270°. As indicated in
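As a rough illustration of producing the four fixed-direction blurred versions, the sketch below smears a frame by B pixels along each cardinal direction using a one-sided box kernel. The kernel shape, the clamped edge handling, the function names and the convention that image rows grow downward (so 90° corresponds to decreasing row index) are all assumptions made for this example; the blurring module 13 is not limited to them.

```python
def directional_blur(img, dx, dy, b):
    """Smear img by b pixels along (dx, dy) with a one-sided box kernel.

    Each output pixel averages itself and the b pixels 'upstream' of it, so
    content appears dragged along the direction of motion. Edges are clamped.
    """
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for k in range(b + 1):
                sx = min(max(x - k * dx, 0), w - 1)
                sy = min(max(y - k * dy, 0), h - 1)
                acc += img[sy][sx]
            out[y][x] = acc / (b + 1)
    return out


def four_blurs(img, b):
    """Return B0, B90, B180 and B270 (rows grow downward, so 90 deg is dy = -1)."""
    return {
        "B0": directional_blur(img, 1, 0, b),
        "B90": directional_blur(img, 0, -1, b),
        "B180": directional_blur(img, -1, 0, b),
        "B270": directional_blur(img, 0, 1, b),
    }


# A single bright pixel in the middle of a 5x5 frame, blurred with B = 2.
img = [[0.0] * 5 for _ in range(5)]
img[2][2] = 3.0
blurs = four_blurs(img, 2)
```

With this input, B0 spreads the bright pixel to its right along the middle row and B180 spreads it to the left, while B90 and B270 spread it up and down the middle column. Only four whole-frame blurs are computed per SET frame, regardless of how the per-pixel motion varies.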
The original image from buffer 22-A, the four blurred images B0, B90, B180, B270 and the motion vector MV for an SET frame are fed to the blending module 17 to produce a blurred version of the SET frame.
Referring now to
Calculation of the blending parameters is based on the observation that the motion vector at any given pixel MV is a linear combination of motion in the four cardinal directions:
MV=α*B0+β*B90+γ*B180+δ*B270
It will be appreciated that there is no advantage to adding blur components from diametrically opposing blurred images, so: α*γ=0; β*δ=0. Thus, for a MV value in each quadrant, two of the blending parameters will be zero valued as in the case of β and γ in
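One plausible way to obtain the blending parameters from a per-pixel motion vector is sketched below. It assumes, purely for illustration, that each parameter is the size of the motion component in that direction expressed as a fraction of the fixed blur B and clipped to the range 0 to 1, with the y axis pointing up. The function name `blend_params` is hypothetical and this mapping is not stated in the text above.

```python
def blend_params(mv_x, mv_y, b):
    """Map a motion vector to (alpha, beta, gamma, delta).

    Assumption: each weight is the motion component along that cardinal
    direction divided by the fixed blur b, clipped to [0, 1]. The y axis
    points up here, so positive mv_y means motion towards 90 degrees.
    """
    def frac(v):
        return min(max(v, 0.0) / b, 1.0)

    alpha = frac(mv_x)    # weight for B0
    beta = frac(mv_y)     # weight for B90
    gamma = frac(-mv_x)   # weight for B180
    delta = frac(-mv_y)   # weight for B270
    return alpha, beta, gamma, delta


# Motion to the right and downwards, with B = 8 pixels.
alpha, beta, gamma, delta = blend_params(4.0, -2.0, 8)
```

Because a component cannot be positive and negative at once, this mapping automatically satisfies α*γ=0 and β*δ=0, leaving at most two non-zero parameters per pixel.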
An equal blend of the original image and the blurred images gives a pixel value for the blurred image as follows:
Blurred=¼(α*B0+(1−α)*Orig+β*B90+(1−β)*Orig+γ*B180+(1−γ) *Orig+δ*B270+(1−δ)*Orig)
This in turn simplifies to:
Blurred=¼(α*B0+β*B90+γ*B180+δ*B270)+(1−¼(α+β+γ+δ))*Orig
Nonetheless, it will be appreciated that in variants of this approach, the blurred image can be more weighted towards either the original SET image or towards the blurred images; or indeed towards a specific one of the blurred images.
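The equal blend can be written per pixel as in the following sketch, which evaluates the four-term expression for Blurred directly. The function name and the sample pixel values are hypothetical; a weighted variant would simply replace the ¼ factors with other coefficients.

```python
def blend_pixel(orig, b0, b90, b180, b270, alpha, beta, gamma, delta):
    """Equal blend of the original pixel and the four blurred pixels.

    Each blurred value is mixed with the original according to its own
    blending parameter, and the four mixes are then averaged.
    """
    pairs = ((alpha, b0), (beta, b90), (gamma, b180), (delta, b270))
    return 0.25 * sum(w * blurred + (1.0 - w) * orig for w, blurred in pairs)


# Hypothetical pixel values: motion mostly towards 0 deg, partly towards 270 deg.
value = blend_pixel(orig=100.0, b0=60.0, b90=0.0, b180=0.0, b270=80.0,
                    alpha=0.5, beta=0.0, gamma=0.0, delta=0.25)

# With all parameters at zero (no motion) the original pixel is returned.
static = blend_pixel(100.0, 60.0, 0.0, 0.0, 80.0, 0.0, 0.0, 0.0, 0.0)
```

Blurred images whose parameter is zero contribute nothing of their own, so their pixel values do not affect the result.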
Also, in other variants different combinations of blurred images can be employed at angles other than at the cardinal directions.
It will be appreciated that this technique works well as long as the actual motion blur for a LET image is less than the fixed blur B for the images B0 to B270. Actual motion blur depends on several factors:
Note that if B is chosen too large, the approximation used in such an embodiment might provide undesirable artefacts in the blurred SET image frame. The value for B can be determined experimentally, but it does not have to remain static for a complete video sequence.
In some embodiments, several predefined modes can be available, for example: High Speed Motion, Normal Scene, Static Scene. The blurring module 13 could choose between each of these three (or more) pre-sets by determining a maximum magnitude of the motion vector 26 from time to time, for example, every 2 seconds. Each of the three pre-sets would have an associated blur B size, tuned for the image sensor and based at least on its resolution. In other embodiments, B could vary in proportion to the difference between LET and SET for a corresponding pair of image frames.
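Selecting a pre-set from the maximum motion vector magnitude could look like the sketch below. The threshold values, the blur sizes and the function names are invented for illustration only; in practice they would be tuned for the image sensor and its resolution, and the selection would be re-run from time to time.

```python
import math

# Hypothetical pre-sets: (name, upper limit on peak motion in pixels, blur size B).
PRESETS = [
    ("Static Scene", 2.0, 4),
    ("Normal Scene", 12.0, 16),
    ("High Speed Motion", math.inf, 48),
]


def max_magnitude(displacement_map):
    """Largest motion vector magnitude in a 2-D map of (dx, dy) tuples."""
    return max(math.hypot(dx, dy) for row in displacement_map for dx, dy in row)


def choose_preset(displacement_map):
    """Return (name, B) for the first pre-set whose limit covers the peak motion."""
    peak = max_magnitude(displacement_map)
    for name, limit, blur in PRESETS:
        if peak <= limit:
            return name, blur


# Example map with a peak displacement of 5 pixels (the 3-4-5 vector).
motion = [
    [(0, 0), (3, 4)],
    [(1, 1), (0, 2)],
]
mode, blur_size = choose_preset(motion)
```

The last pre-set uses an infinite limit so that some mode is always returned.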
It will be appreciated that while the above embodiments have been described in terms of a pair of SET and LET image streams, in other embodiments of the invention, more than two image streams may be employed, so providing, for example, a nominally exposed image stream; an underexposed image stream; and an overexposed image stream. While this may require differential blurring of both the underexposed and nominally exposed image streams so that they match the content of the overexposed image stream, it may provide for better quality HDR image frames. For both two-stream and multiple-stream embodiments, the adjustment to exposure time can be determined as disclosed in U.S. Patent Application No. 62/147,464 filed 14 Apr. 2015 and entitled “Image Acquisition Method and Apparatus”.
Number | Name | Date | Kind |
---|---|---|---|
20090303343 | Drimbarean | Dec 2009 | A1 |
20170034429 | Huysegems | Feb 2017 | A1 |
Number | Date | Country |
---|---|---|
2933999 | Oct 2015 | EP |
WO2014146983 | Sep 2014 | WO |
Entry |
---|
High Dynamic Range Video with HDRX—http://www.red.com/learn/red-101/hrdx-high-dynamic-range-video; p. 1-7. |
“Reel Smart Motion Blur in Action” http://revisionfx.com/products/rsmb/overview/; p. 1-3. |
“Reel Smart Motion Blur” http://www.revisionfx.com/products/rsmb/overview/; 2015, p. 1-2. |
“ReelSmart Motion Blur in Action” http://www.revisionfx.com:80/products/rsmbi; 2016, p. 1-2. |
“High Dynamic Range Video with HDRX” http://www.red.com/learn/red-101/hdrx-high-dynamic-range-video; 2016, p. 1-8. |
“High Dynamic Range Video with HDRX” http://www.red.com/learn/red-101/hdrx-high-dynamic-range-video; 2015, p. 1-8. |
Number | Date | Country | |
---|---|---|---|
20160323596 A1 | Nov 2016 | US |
Number | Date | Country | |
---|---|---|---|
62155310 | Apr 2015 | US |