The present disclosure is related generally to digital camera focusing and, more particularly, to a system and method for sensing scene movement to initiate focusing of a camera.
The introduction of the consumer-level film camera changed the way we saw our world, mesmerizing the public with life-like images and opening up an era of increasingly visual information. However, as imaging technologies continued to improve, the advent of inexpensive digital cameras would eventually render traditional film cameras obsolete, along with the sepia tones and grainy pictures of yesteryear. Moreover, the digital camera offered the one thing that had eluded the film camera: spontaneity and instant gratification. Pictures could be taken, erased, saved, instantly viewed or printed, and otherwise utilized without delay.
The quality of digital image technology has now improved to the point that very few users miss the film camera. Indeed, most cell phones, smart phones, tablets, and other portable electronic devices include a built-in digital camera. Nonetheless, despite the unquestioned dominance of digital imaging today, one requirement remains unchanged from the days of yore: the requirement to focus the camera. Today's digital cameras often provide an autofocus function that automatically places a scene in focus. However, when the scene suddenly changes, the autofocus function must collect enough frames of data to refocus the scene. The autofocus function therefore waits for the scene to stabilize, introducing a delay of 300 ms or more and producing a poor user experience in dynamic environments.
It will be appreciated that this Background section represents the observations of the inventors, which are provided simply as a research guide to the reader. As such, nothing in this Background section is intended to represent, or to fully describe, prior art.
While the appended claims set forth the features of the present techniques with particularity, these techniques, together with their objects and advantages, may be best understood from the following detailed description taken in conjunction with the accompanying drawings of which:
Turning to the drawings, wherein like reference numerals refer to like elements, techniques of the present disclosure are illustrated as being implemented in a suitable environment. The following description is based on embodiments of the disclosed principles and should not be taken as limiting the claims with regard to alternative embodiments that are not explicitly described herein.
Before providing a detailed discussion of the figures, a brief overview will be given to guide the reader. In the disclosed examples, only a single frame is needed to detect scene movement and to start the autofocus routine, meaning that the delay until the initiation of focusing, when needed, is only 60 ms rather than the traditional 300 ms. In this regard, the disclosed examples process each frame using a graphics processing unit (GPU) of the device to accelerate the focus decision and improve the preview and video experience. This can be viewed colloquially as a continuous rather than intermittent autofocus function. A GPU is a specialized chip, board, or module that is designed specifically for efficient manipulation of computer graphics. In particular, a GPU embodies a more parallel structure than general-purpose CPUs, allowing more efficient processing of large blocks of data.
In an embodiment, the GPU calculates a pixel-based frame difference and estimates scene complexity at a camera frame rate to detect scene stability in real time (at each new frame). In addition to providing a speed advantage over CPU-based systems that wait for multiple frames, this also provides a lower complexity than techniques that rely on per-block motion vectors estimated during compression, e.g., techniques used in video processing.
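By way of illustration only, the pixel-based frame difference may be sketched as follows. The sketch runs on a CPU with plain Python lists rather than on a GPU, and the function name mean_abs_difference, the use of an absolute difference, and the sample pixel values are assumptions made for the example rather than details of the disclosure.

```python
def mean_abs_difference(prev_frame, curr_frame):
    """Mean absolute per-pixel difference between two equally sized
    grayscale frames, each given as a list of rows of pixel values."""
    total = 0
    count = 0
    for prev_row, curr_row in zip(prev_frame, curr_frame):
        for prev_pixel, curr_pixel in zip(prev_row, curr_row):
            total += abs(curr_pixel - prev_pixel)
            count += 1
    return total / count


# Two hypothetical 2x3 frames; only two pixels change between them.
prev_frame = [[10, 10, 10], [10, 10, 10]]
curr_frame = [[10, 12, 10], [10, 10, 16]]

print(mean_abs_difference(prev_frame, curr_frame))  # (2 + 6) / 6
```

Because each pixel contributes independently to the sum, a computation of this kind maps naturally onto the parallel structure of a GPU.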
At a basic level, certain of the disclosed embodiments simulate image jitter to derive a frame-specific threshold level for judging an inter-frame difference (from the previous frame to the current frame). In this way, more highly detailed scenes are assigned a higher movement threshold, and thus the system will provide a similarly rapid autofocus response for both high detail and low detail scenes.
Turning now to a more detailed discussion in conjunction with the attached figures, the schematic diagram of
The processor 130 can be any of a microprocessor, microcomputer, application-specific integrated circuit, or the like. For example, the processor 130 can be implemented by one or more microprocessors or controllers from any desired family or manufacturer. Similarly, the memory 140 may reside on the same integrated circuit as the processor 130. Additionally or alternatively, the memory 140 may be accessed via a network, e.g., via cloud-based storage. The memory 140 may include a random access memory (e.g., Synchronous Dynamic Random Access Memory (SDRAM), Dynamic Random Access Memory (DRAM), RAMBUS Dynamic Random Access Memory (RDRAM) and/or any other type of random access memory device). Additionally or alternatively, the memory 140 may include a read only memory (e.g., a hard drive, flash memory and/or any other desired type of memory device).
The information that is stored by the memory 140 can include code associated with one or more operating systems and/or applications as well as informational data, e.g., program parameters, process data, etc. The operating system and applications are typically implemented via executable instructions stored in a non-transitory computer readable medium (e.g., memory 140) to control basic functions of the electronic device 110. Such functions may include, for example, interaction among various internal components, control of the camera 120 and/or the component interface 170, and storage and retrieval of applications and data to and from the memory 140.
The device 110 may also include a component interface 170 to provide a direct connection to auxiliary components or accessories and a power supply 180, such as a battery, for providing power to the device components. In an embodiment, all or some of the internal components communicate with one another by way of one or more internal communication links 190, such as an internal bus.
Further with respect to the applications, these typically utilize the operating system to provide more specific functionality, such as file system service and handling of protected and unprotected data stored in the memory 140. Although many applications may govern standard or required functionality of the user device 110, in many cases applications govern optional or specialized functionality, which can be provided, in some cases, by third party vendors unrelated to the device manufacturer.
Finally, with respect to informational data, e.g., program parameters and process data, this non-executable information can be referenced, manipulated, or written by the operating system or an application. Such informational data can include, for example, data that is preprogrammed into the device during manufacture, data that is created by the device, or any of a variety of types of information that is uploaded to, downloaded from, or otherwise accessed at servers or other devices with which the device is in communication during its ongoing operation.
In an embodiment, the device 110 is programmed such that the processor 130 and memory 140 interact with the other components of the device to perform a variety of functions. The processor 130 may include or implement various modules and execute programs for initiating different activities such as launching an application, transferring data, and toggling through various graphical user interface objects (e.g., toggling through various icons that are linked to executable applications).
Within the context of prior autofocus systems,
An improved decision architecture in keeping with the disclosed principles is shown in
Meanwhile, the current frame 301 is also provided as input to a jitter simulator 305, which outputs a jitter difference 306. The operation of the jitter simulator 305 will be described in greater detail below in reference to
As noted above, the jitter simulator 305 produces a jitter difference 306 for use in evaluating the current frame 301. In an embodiment, the jitter simulator 305 processes the current frame 301 to simulate or predict the effect of jitter. An exemplary jitter simulator 305 is shown schematically in
No particular treatment of the pixel locations vacated by the shift is required, and the pixel values pushed out of frame by the shift may also be ignored. However, in an alternative embodiment, each vacated pixel location is populated by a copy of the pixel value that was shifted across it. In the context of the above example, this results in a smearing or copying of a portion of the frame left side values and a portion of the frame bottom values. Alternatively, the frame 401 may be looped, with the pixel values that are pushed out-of-frame being reintroduced in the opposite side or corner of the frame to populate the vacated pixel locations.
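The three treatments of vacated pixel locations described above may be sketched as follows. This is a minimal illustration under stated assumptions: the function name shift_frame, the mode names "zero", "replicate", and "wrap", the choice of zero as the fill value when vacated locations are ignored, and the one-pixel up-and-right shift are all hypothetical and are not taken from the disclosure.

```python
def shift_frame(frame, dx, dy, mode="zero"):
    """Shift a frame (list of rows) by dx columns to the right and dy rows
    downward; negative values shift left or upward.

    mode "zero":      vacated locations are filled with 0 (i.e., ignored)
    mode "replicate": vacated locations copy the nearest edge pixel
    mode "wrap":      pixels pushed out of frame re-enter on the opposite side
    """
    if mode not in ("zero", "replicate", "wrap"):
        raise ValueError("unknown mode: " + mode)
    rows, cols = len(frame), len(frame[0])
    shifted = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            src_r, src_c = r - dy, c - dx  # where this output pixel comes from
            if mode == "wrap":
                value = frame[src_r % rows][src_c % cols]
            elif mode == "replicate":
                clamped_r = min(max(src_r, 0), rows - 1)
                clamped_c = min(max(src_c, 0), cols - 1)
                value = frame[clamped_r][clamped_c]
            elif 0 <= src_r < rows and 0 <= src_c < cols:
                value = frame[src_r][src_c]
            else:
                value = 0
            out_row.append(value)
        shifted.append(out_row)
    return shifted


frame = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]

# Shift one pixel up and to the right, vacating the left column and bottom row.
print(shift_frame(frame, dx=1, dy=-1, mode="zero"))       # [[0, 4, 5], [0, 7, 8], [0, 0, 0]]
print(shift_frame(frame, dx=1, dy=-1, mode="replicate"))  # [[4, 4, 5], [7, 7, 8], [7, 7, 8]]
print(shift_frame(frame, dx=1, dy=-1, mode="wrap"))       # [[6, 4, 5], [9, 7, 8], [3, 1, 2]]
```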
The diagonally shifted array 403 is then differenced from the frame 401 at comparator 404 to produce a jitter difference 405, which is then provided to the comparator 307. The jitter difference 306, 405 provides a predictive measure regarding the likely results of scene movement without actually requiring scene movement. Thus, for example, a scene with many details and clean edges will result in a higher value jitter difference than will a scene with fewer details and fewer clean edges. This effect can be seen in principle in
In an embodiment, mean pixel value is the measure of merit for each frame. In this embodiment, a jitter score is calculated as the mean pixel value of the difference between the current frame and the previous frame, minus the mean pixel value of the jitter difference 502. As can be seen, the jitter difference 502 is significantly populated due to the movement of the many clean edges, which will lead to a high jitter score.
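One literal reading of the calculation above may be sketched as follows. The helper names, the use of absolute differences, the wrap-around shift, and the striped and flat test frames are assumptions made for the example and are not part of the disclosure.

```python
def mean_abs_difference(frame_a, frame_b):
    """Mean absolute per-pixel difference between two equally sized frames."""
    diffs = [abs(a - b)
             for row_a, row_b in zip(frame_a, frame_b)
             for a, b in zip(row_a, row_b)]
    return sum(diffs) / len(diffs)


def wrap_shift(frame, dx, dy):
    """Shift dx columns right and dy rows down, looping pixels around the frame."""
    rows, cols = len(frame), len(frame[0])
    return [[frame[(r - dy) % rows][(c - dx) % cols] for c in range(cols)]
            for r in range(rows)]


def jitter_score(prev_frame, curr_frame, dx=1, dy=-1):
    """Mean inter-frame difference minus mean jitter difference."""
    inter_frame = mean_abs_difference(prev_frame, curr_frame)
    jitter = mean_abs_difference(curr_frame, wrap_shift(curr_frame, dx, dy))
    return inter_frame - jitter


# Hypothetical 4x4 scenes: a flat one and a striped one with many clean edges.
# Each scene brightens by 10 between the previous and the current frame.
flat_prev = [[40] * 4 for _ in range(4)]
flat_curr = [[50] * 4 for _ in range(4)]
striped_prev = [[0, 100, 0, 100] for _ in range(4)]
striped_curr = [[10, 110, 10, 110] for _ in range(4)]

flat_score = jitter_score(flat_prev, flat_curr)
striped_score = jitter_score(striped_prev, striped_curr)
print(flat_score)     # 10.0
print(striped_score)  # -90.0
```

In this sketch the striped scene yields a large jitter difference, so the same inter-frame change produces a lower score than it does for the flat scene.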
In a sense, the jitter difference and jitter score can be seen as a prediction of how much effect a small scene movement would have on the inter-frame difference. By traditional measures, a small movement in a complicated scene would register as a larger movement than the same small movement in a less complicated scene. In traditional systems, this results in constant refocusing on complicated scenes and an inability to settle or stabilize focus in such environments. Conversely, the same traditional systems may underestimate the amount of movement in simpler scenes, leading to a failure to refocus when focusing is otherwise needed.
Against this backdrop, the disclosed principles provide a scene-specific reference against which to measure the significance of observed movement between a first frame and a second frame. In other words, the movement threshold for complex scenes will be greater than the movement threshold for less complicated scenes. This allows the autofocus function to provide the same experience regardless of whether the scene is high contrast or low contrast.
While the disclosed principles may be applied in a variety of ways, an exemplary decision process 700 is shown in the flowchart of
At stage 701 of the process 700, a current frame corresponding to a scene is captured, e.g., by the camera 115, it being understood that a prior frame corresponding essentially to the same scene has been previously stored during a prior iteration of the process 700. The current frame is differenced with the stored prior frame at stage 702 to yield a difference signal (e.g., difference signal 304).
The current frame is also shifted by a predetermined amount, e.g., a predetermined number of pixels, in a predetermined direction at stage 703 to produce a jitter simulation. It will be appreciated that the exact direction and exact amount of the shift are not critical. Moreover, although the shift is predetermined, there may be multiple such predetermined shifts that vary in direction and amount. For example, where three predetermined shifts are defined, they may be applied randomly, cyclically, or otherwise.
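The selection among multiple predetermined shifts may be sketched as follows. The three shift values, expressed as (columns, rows) pairs, and the function names are hypothetical and chosen only for illustration.

```python
import itertools
import random

# Three hypothetical predetermined shifts as (dx, dy) pixel offsets.
PREDETERMINED_SHIFTS = [(1, -1), (2, 0), (0, 1)]


def cyclic_shifts(shifts):
    """Return an iterator that applies the shifts in order, repeating forever."""
    return itertools.cycle(shifts)


def random_shift(shifts, rng):
    """Pick one of the predetermined shifts at random."""
    return rng.choice(shifts)


picker = cyclic_shifts(PREDETERMINED_SHIFTS)
first_four = [next(picker) for _ in range(4)]
print(first_four)  # [(1, -1), (2, 0), (0, 1), (1, -1)]

rng = random.Random(0)  # seeded only so that the example is repeatable
print(random_shift(PREDETERMINED_SHIFTS, rng))
```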
At stage 704, the jitter simulation is differenced from the current frame to provide a jitter difference, which is in turn differenced at stage 705 from the difference signal to produce a movement signal. If the difference signal exceeds the jitter difference, then the movement signal is positive, whereas if the jitter difference exceeds the difference signal, then the movement signal is negative. At stage 706, it is determined whether the movement signal is positive or negative. If it is determined at stage 706 that the movement signal is positive, then an autofocus operation is requested at stage 707, while if the movement signal is negative, then the process flows to stage 708 and an autofocus operation is not requested. From either of stages 707 and 708, the process 700 returns to stage 701.
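Stages 701 through 708 may be sketched as a loop over a sequence of frames, as follows. This is an illustrative sketch only: the function names, the use of mean absolute differences as the scalar signals, the wrap-around shift, the treatment of a movement signal of exactly zero as not positive, and the sample frames are assumptions rather than details of the disclosure.

```python
def mean_abs_difference(frame_a, frame_b):
    """Mean absolute per-pixel difference between two equally sized frames."""
    diffs = [abs(a - b)
             for row_a, row_b in zip(frame_a, frame_b)
             for a, b in zip(row_a, row_b)]
    return sum(diffs) / len(diffs)


def wrap_shift(frame, dx, dy):
    """Shift dx columns right and dy rows down, looping pixels around the frame."""
    rows, cols = len(frame), len(frame[0])
    return [[frame[(r - dy) % rows][(c - dx) % cols] for c in range(cols)]
            for r in range(rows)]


def autofocus_requests(frames, dx=1, dy=-1):
    """Return one decision per frame after the first: True if autofocus
    would be requested for that frame, False otherwise."""
    decisions = []
    prior = frames[0]                      # stored during a prior iteration
    for current in frames[1:]:             # stage 701: capture current frame
        difference = mean_abs_difference(prior, current)            # stage 702
        jittered = wrap_shift(current, dx, dy)                      # stage 703
        jitter_difference = mean_abs_difference(current, jittered)  # stage 704
        movement = difference - jitter_difference                   # stage 705
        decisions.append(movement > 0)     # stages 706-708 (zero: no request)
        prior = current                    # store for the next iteration
    return decisions


# Hypothetical 4x4 sequences with the same changes: +5, then +40.
flat = [[[v] * 4 for _ in range(4)] for v in (20, 25, 65)]
textured = [[[base + 10 * c for c in range(4)] for _ in range(4)]
            for base in (0, 5, 45)]

flat_decisions = autofocus_requests(flat)
textured_decisions = autofocus_requests(textured)
print(flat_decisions)      # [True, True]
print(textured_decisions)  # [False, True]
```

In this example the small change triggers a request only for the flat scene, while the large change triggers a request for both, reflecting the scene-specific threshold.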
In an embodiment, the magnitude of the positive movement signal is further used to determine autofocus behavior at finer granularity than a simple binary decision. In particular, in this embodiment, if the movement signal is positive and relatively small, then a small focus adjustment is attempted. Conversely, if the signal is positive and relatively large, then a larger focus adjustment may be attempted.
In this way, small focus adjustments, e.g., using a continuous autofocus algorithm, may be used to provide a better user experience when possible without causing slow focus adjustment when large adjustments are needed. Similarly, larger focus adjustments, e.g., using an exhaustive autofocus algorithm, may speed the focusing task when needed, e.g., to focus from a close object to a distant object. By making a calculated decision on when small or large adjustments are needed, the system can deliver an improved user experience and better focus performance.
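The magnitude-based selection described above may be sketched as follows. The enumeration FocusAction, the function select_focus_action, and the cutoff value large_threshold are hypothetical; the disclosure does not specify where the boundary between a relatively small and a relatively large movement signal lies.

```python
from enum import Enum


class FocusAction(Enum):
    NONE = "none"              # movement signal not positive: no request
    CONTINUOUS = "continuous"  # small adjustment, e.g., continuous autofocus
    EXHAUSTIVE = "exhaustive"  # large adjustment, e.g., exhaustive autofocus


def select_focus_action(movement_signal, large_threshold=20.0):
    """Map a movement signal to a focus action.

    large_threshold is an assumed, tunable cutoff separating relatively
    small positive signals from relatively large ones.
    """
    if movement_signal <= 0:
        return FocusAction.NONE
    if movement_signal < large_threshold:
        return FocusAction.CONTINUOUS
    return FocusAction.EXHAUSTIVE


for signal in (-10.0, 5.0, 25.0):
    print(signal, select_focus_action(signal).value)
```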
It will be appreciated that the disclosed principles provide a means, though not a requirement, for improving camera autofocus response and stability. However, in view of the many possible embodiments to which the principles of the present discussion may be applied, it should be recognized that the embodiments described herein with respect to the drawing figures are meant to be illustrative only and should not be taken as limiting the scope of the claims. Therefore, the techniques as described herein contemplate all such embodiments as may come within the scope of the following claims and equivalents thereof.
The present disclosure is a non-provisional application of co-pending and commonly assigned U.S. Provisional Application No. 61/846,680, filed on 16 Jul. 2013, from which benefits under 35 USC 119 are hereby claimed and the contents of which are incorporated herein by reference.
Number | Date | Country
---|---|---
61846680 | Jul 2013 | US