Image shake is an issue that degrades performance in digital cameras. Image shake often results from movement by the user of the camera, or from vibrations transmitted through a mounting such as a tripod or bracket. Another source of image shake is motion of the object to be imaged. As sensors become smaller, even while the number of pixels in the image sensor increases, image shake becomes a larger issue.
One approach to avoid image shake is to build in a gyroscopic mount for a sensor array. Thus, the sensor array is kept still even when surrounding parts of a camera are in motion. However, this is relatively costly. Also, compensating for camera motion does not reduce adverse effects of movement of the object to be imaged.
While one may expect that a camera is always moving somewhat, the motion of a camera manifesting as image shake may vary during the process of taking a picture. Similarly, the motion of an object can vary, for example when a basketball player jumps or executes an abrupt transient motion. Thus, it may be useful to provide a method and system which takes advantage of times of low motion. Additionally, a low-cost solution for minimizing the effects of image shake can be useful.
The present invention is illustrated in an exemplary manner by the accompanying drawings. The drawings should be understood as exemplary rather than limiting, as the scope of the invention is defined by the claims.
A system, method and apparatus are provided for image anti-shake in digital still cameras. The specific embodiments described in this document represent exemplary instances of the present invention, and are illustrative in nature rather than restrictive.
In one embodiment, a method of capturing an image in a digital camera is presented. The method includes calculating a sharpness value based on an image input. In the embodiment, calculating the sharpness value comprises determining a high frequency value related to the image input. The method also includes predicting the quality of a next image based on the sharpness value. The method further includes deciding whether to capture a next image input data responsive to the prediction.
In another embodiment, a digital camera is presented. The camera includes a processor. The camera also includes media for image storage coupled to the processor. The camera further includes an image sensor coupled to the processor. Also, the camera includes an image quality detector for detecting image motion. The image quality detector includes a sharpness detector based on high pass filtering of image data from the digital image sensor. Moreover, the camera includes a predictor of next image motion coupled to the quality detector. Furthermore, the camera includes a decision maker coupled to the predictor. The predictor and decision maker are to evaluate output of the quality detector and capture an image from the digital image sensor in the media, responsive to the output of the quality detector. The quality detector, the predictor and the decision maker can be implemented by the processor in some embodiments.
In another embodiment, an apparatus is presented. The apparatus includes means for calculating a sharpness value related to a current image input. The apparatus also includes means for estimating next image quality depending on the sharpness value. The apparatus further includes means for capturing a next image input data frame from the image input. The means for capturing operates responsive to the means for estimating next image quality.
A method and apparatus as described and illustrated can improve image quality in digital still cameras. The method and apparatus depend on evaluating motion characteristics of image input data, and then capturing the next image sensed. Thus, when a process determines that criteria for relatively stable pictures are met, the process can then capture the next image with the expectation that the next image will be relatively stable. This allows for rapid evaluation, without the need to store multiple images. It also reduces cost associated with expensive components such as movable image sensors and lenses, for example.
In the following description, for purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the invention. It will be apparent, however, to one skilled in the art, that the invention can be practiced without these specific details. In other instances, structures and devices are shown in block diagram form in order to avoid obscuring the invention.
Reference in the specification to “one embodiment” or “an embodiment” means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment of the invention. The appearances of the phrase “in one embodiment” in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments.
Various embodiments may be further understood from the Figures.
It will be apparent to those skilled in the art that referring to a line of sensors in the image sensor as a row or column is somewhat arbitrary. Although the term row often references a line of pixels parallel to the top and bottom of a digital camera, the invention does not depend on any specific designation, and a columnwise raster scan can also be performed within the scope and spirit of the invention.
Also, various other methods of partitioning the image sensor and outputting data from the image sensor are useful in some embodiments. For example an artificial frame consisting of selected submatrices of image sensor pixel data can be repeatedly scanned in a selected order. Also, a sensor comprising a pixel addressable image buffer is operable to implement aspects of the invention.
It is known that the quality of digital images is adversely affected by image shake (also referred to herein as image motion) while the image is exposed. Unintentional camera motion causes image shake and results in decreased sharpness. However, image motion can also arise from motion of the subject or object being imaged. Relative changes in the sharpness of successive image frames depend on motion characteristics. Generally, the sharpness of an image is greater when there is less motion. It has been discovered that image motion can be effectively predicted based on relative change in the sharpness of subframes. The invention includes an efficient camera anti-shake method that includes evaluating relative sharpness changes based on the subframes, and predicting the sharpness of a next image based on the relative changes. In another aspect of the invention, the camera anti-shake method is operable to reduce the adverse effects of subject motion on image quality. In still another aspect of the invention, an anti-shake digital camera using the anti-shake method is disclosed. The digital camera is easily implemented with modest hardware resources, resulting in relatively low cost.
An embodiment of a method and system for improving image quality in digital cameras according to the present invention is explained further with reference to
In the embodiment, data input from the image sensor 110 is moved into a buffer 150 of machine readable media. The buffer is coupled to the image quality detector 120. In many embodiments, the quality detector 120 comprises a sharpness detector. The system 100 also includes a motion predictor 130 and a decision maker 140. Instructions and data in the computer readable media are operable by the processor to implement the sharpness detector 120, the predictor 130, and the decision maker 140. However in various other embodiments, functionality of the detector, predictor or decision maker, in whole or in part, can be implemented with control circuits. In the embodiment, the media comprises line buffers operable for storing a number of lines of image data corresponding to rows of the pixel data input from the image sensor 110. While lines of image data in the embodiment are rows of data, lines which are columns of pixel data can equivalently be used in various embodiments. Also, detectors comprising non-rectangular pixel configurations and/or other methods operable to receive an image input from an image sensor and store the data in the machine readable media are operable to practice the invention.
In an embodiment of the system shown in
It has been found that subdividing the image frame into subframes often improves sensitivity for detecting motion. In an embodiment shown in
In many embodiments, the frame image acquisition rate is 15 or 30 frames per second, responsive to a number of design constraints and industry standards. The corresponding image frame exposure times will be somewhat less than 1/15 s or 1/30 s since the frame rate usually includes a blanking period between exposures.
In many embodiments, the successively numbered lines of an image frame are exposed sequentially. As the frame rate decreases, an image frame must be divided into more subframes in order to effectively sample relative motion of the image. On the other hand, if the number of subframes is too high, the reduced height of each subframe can introduce an artificial sensitivity to motion in the vertical direction. Hence it is seen that there is a tradeoff between the image acquisition rate and the number of image subframes. It has been found that using at least four equal horizontal subframes provides effective motion sensing in embodiments having VGA (640×480) or XGA image frame resolution and standard 15 s−1 and 30 s−1 frame rates.
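As a minimal sketch of the partitioning described above, the following Python function splits the rows of a frame into equal horizontal subframes. The function name `subframe_row_ranges` and the requirement that the frame height divide evenly are assumptions made for illustration only.

```python
def subframe_row_ranges(frame_height, num_subframes=4):
    """Return (start_row, end_row) pairs, end exclusive, one per horizontal band."""
    if frame_height % num_subframes != 0:
        # Assumption for this sketch: only evenly divisible heights are handled.
        raise ValueError("frame height must divide evenly into subframes")
    band = frame_height // num_subframes
    return [(i * band, (i + 1) * band) for i in range(num_subframes)]

# A VGA frame (640x480) split into four equal horizontal subframes.
vga_bands = subframe_row_ranges(480)
```

For a VGA frame this gives four bands of 120 rows each. Because the lines are exposed sequentially, each band samples a different time slice of the frame exposure.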
In one embodiment illustrated in
In the embodiment, a sharpness detector comprises high pass filtering. A suitable high pass filter comprises convolving the example 5×5 high pass filter matrix in
where w(m,n) is an element of the high pass filter matrix shown in
The sharpness detector embodiment forms a subframe sharpness value for the ith subframe in the jth frame comprised of the subframe average high frequency portion given by the following relationship:
where W is the number of pixels along the width of the subframe (which is a row of the frame in this embodiment), and H is the number of pixels in a column of the subframe (which is a subframe portion of the full frame column height).
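The sharpness computation can be sketched in Python as follows. This is an illustration under stated assumptions, not the relationship of the embodiment: the 5×5 matrix used here (centre weight 24, all other weights −1) is a hypothetical high pass filter standing in for the matrix of the figure, and averaging the absolute value of g(x,y) over the filtered pixels is an assumed form of the "subframe average high frequency portion". The names `KERNEL`, `high_pass`, and `subframe_sharpness` are likewise illustrative.

```python
# Hypothetical 5x5 high pass kernel (NOT the matrix from the figure):
# centre weight 24, every other weight -1, so the weights sum to zero.
KERNEL = [[-1] * 5 for _ in range(5)]
KERNEL[2][2] = 24

def high_pass(pixels, x, y, kernel=KERNEL):
    """Evaluate one filter term g(x, y) centred on column x, row y."""
    total = 0
    for n in range(-2, 3):          # row offset
        for m in range(-2, 3):      # column offset
            total += kernel[n + 2][m + 2] * pixels[y + n][x + m]
    return total

def subframe_sharpness(pixels, kernel=KERNEL):
    """Average |g(x, y)| over pixels whose 5x5 neighbourhood fits in the subframe."""
    height, width = len(pixels), len(pixels[0])
    terms = [abs(high_pass(pixels, x, y, kernel))
             for y in range(2, height - 2)
             for x in range(2, width - 2)]
    return sum(terms) / len(terms)

flat = [[10] * 8 for _ in range(6)]                              # no detail
stripes = [[(x % 2) * 100 for x in range(8)] for _ in range(6)]  # fine detail
```

With these inputs the uniform patch has zero sharpness and the striped patch has a large positive value, which is the behaviour expected of a high frequency measure.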
The predictor 130 estimates the expected image motion of a next frame based on the current subframe sharpness value of the current frame and previous subframe sharpness values of previous image frames. An illustrative predictor method in an embodiment that divides an image frame into four subframes operates as follows. The difference between the ith subframe sharpness value in the current jth frame and the corresponding subframe sharpness value, determined for the previous (j−1)th frame is computed according to:
Dij=sij−si,j−1,
where i is the current subframe number (i=1,2,3,4). Note that subframes of different frames are said to be corresponding subframes if and only if they have the same subframe number. Also, Dmax, the maximum absolute value of the previous consecutive subframe sharpness value differences Dij following the capture command, is found according to:
Dmax=max(|Dij|),
where the subscript i ranges over subframe numbers (i=1,2,3,4) and the subscript j ranges over image frame numbers from the first image frame following the capture command to the current image frame.
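The two relationships above can be sketched in Python as follows. The function names and the sample sharpness values are hypothetical and are chosen only to make the arithmetic easy to follow.

```python
def sharpness_differences(current, previous):
    """D_ij = s_ij - s_i,j-1 for each pair of corresponding subframes."""
    return [c - p for c, p in zip(current, previous)]

def update_dmax(dmax, differences):
    """Running maximum of |D_ij| over all differences seen so far."""
    return max([dmax] + [abs(d) for d in differences])

# Hypothetical subframe sharpness values s_ij for three successive frames.
frames = [[100, 102, 101, 99], [90, 80, 85, 95], [96, 100, 97, 99]]

dmax = 0.0
history = []
for previous, current in zip(frames, frames[1:]):
    d = sharpness_differences(current, previous)
    dmax = update_dmax(dmax, d)
    history.append(d)
```

Only the running value of Dmax and the most recent sharpness values need to be retained between frames, which is consistent with a small register budget.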
The predictor 130 estimates an image motion of the next image frame based on evaluating the two propositions:
Proposition 1: D4j>0 and D4j<k1*Dmax and D4j>D3j
Proposition 2: D4j<0 and |D4j|<k2*Dmax and D4j>D3j
where k1 and k2 are selected factors. If at least one of these two propositions is true, a prediction that next frame image motion will be less than the current image frame motion is output to the decision maker 140. Otherwise a prediction that the next frame image motion will be equal to or greater than the current image frame motion is output to the decision maker. In practice, selecting a constant of ⅔ for k1 and a constant of ½ for k2 has been found to be quite effective. Other selections of k1 and k2 are operable, although it has been found that selecting values of k1 and k2 less than 1 is preferable.
The most recent subframe image data of a frame are considered to provide more accurate estimates of future motion than older data. Therefore Proposition 1 and Proposition 2 in the embodiment are based on D4j since the data of subframe 4 are the last and most recent portion of image frame data input from the image sensor. However other methods of estimating motion based on the sharpness of the image frame data can be selected in various embodiments, and other predictors of future image motion can be used within the scope and spirit of the present invention.
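Proposition 1 and Proposition 2 can be expressed compactly in Python as sketched below. The function name `predicts_less_motion` and its argument names are illustrative assumptions; the default factors are the k1=⅔ and k2=½ values mentioned above.

```python
def predicts_less_motion(d3, d4, dmax, k1=2 / 3, k2=1 / 2):
    """Return True when Proposition 1 or Proposition 2 holds for the current frame."""
    # Proposition 1: sharpness still improving, but by less than k1 * Dmax.
    proposition_1 = d4 > 0 and d4 < k1 * dmax and d4 > d3
    # Proposition 2: sharpness deteriorating, but by less than k2 * Dmax.
    proposition_2 = d4 < 0 and abs(d4) < k2 * dmax and d4 > d3
    return proposition_1 or proposition_2
```

Both propositions share the condition D4j>D3j, so a frame whose last subframe difference is smaller than the one before it never yields a prediction of reduced motion.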
In an embodiment of the method and apparatus illustrated in
However other metrics of the frame sharpness based on other relationships are also operable, depending on the embodiment.
The decision maker 140 decides whether or not to capture a next frame from the image sensor 110 and save it in an image memory 115. It will make a decision to capture when all of the following three propositions are true: 1) a capture command was received, 2) the predictor predicts that the next frame will have increased sharpness, and 3) Sj>Sc (i.e. the current frame is sharper than the first image frame received after the capture command).
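The three-part test can be sketched in Python as follows. The function and argument names are assumptions for illustration, and the sketch covers only the three propositions stated above, not any limit on elapsed time or frame count.

```python
def decide_capture(capture_commanded, improvement_predicted, s_current, s_reference):
    """Capture the next frame only when all three propositions are true.

    s_current is the frame sharpness Sj of the current frame; s_reference is Sc,
    the sharpness of the first frame received after the capture command.
    """
    return capture_commanded and improvement_predicted and s_current > s_reference
```

Requiring Sj>Sc prevents the decision maker from capturing a frame that is predicted to improve but is still expected to be worse than the frame that was available when the capture command arrived.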
In practice, image capture must be completed within a limited time after a capture command to be acceptable. It is possible that malfunction or unusual image conditions could lead to an unacceptable delay before the predictor predicts increased sharpness or Sj>Sc. To prevent unacceptable delay, the decision maker 140 in
Alternatively, where the decision maker 140 does not decide to capture a next image, another subframe of pixels is read from the image sensor 110 and processed by the quality detector 120 and the predictor 130. Of course the scope and spirit of the present invention includes embodiments comprising other methods and/or algorithms for deciding whether to capture a next image based on an output of the predictor and the number of frames and/or time elapsed after a capture command.
As merely one example, another embodiment comprising the method of
Next, a frame is acquired at box 430, subframe by subframe 442 as sharpness values of the subframes, sharpness value differences, and Dmax are evaluated and stored in registers 445. In some embodiments, the sharpness values are evaluated according to the relationships set forth above. However, within the scope and spirit of the invention, various other methods and measures of sharpness can be used for evaluating the quality of an image, including other measures depending on filtering a high frequency portion of the image.
Each of the blocks 430 and 435 of
After a frame is read and evaluated at 430, another frame is acquired and evaluated until a capture command is received 450. After the capture command is received, the predictor estimates the quality of the next frame 455 and evaluates capture criteria based on the estimated quality 460. If the capture criteria based on the estimated quality are met, the decision maker commands the capture of a next frame into an image memory 480 and the process ends 490. However, if the capture criteria based on the estimated quality are not met 460, the frame counter is incremented 461 and the decision maker tests whether the frame count has reached the predetermined maximum number “Max” 470. If the frame counter has reached Max, the decision maker commands the capture of a next frame into an image memory 480 and the process ends at block 490. Otherwise at block 435 another frame is acquired and evaluated.
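The control flow after the capture command can be sketched in Python as follows. This is a simplified model under stated assumptions: the per-frame outcome of the capture criteria is supplied as a list of booleans rather than computed from sensor data, and the names `run_capture_loop`, `criteria_per_frame`, and `max_frames` are hypothetical.

```python
def run_capture_loop(criteria_per_frame, max_frames):
    """Return (frames_evaluated, reason) for the decision to capture the next frame.

    criteria_per_frame holds one boolean per frame evaluated after the capture
    command: True when the capture criteria based on estimated quality are met.
    """
    frame_count = 0
    for frames_evaluated, criteria_met in enumerate(criteria_per_frame, start=1):
        if criteria_met:
            return frames_evaluated, "criteria met"
        frame_count += 1                      # frame counter incremented
        if frame_count >= max_frames:         # counter has reached "Max"
            return frames_evaluated, "timeout"
    raise ValueError("no capture decision before the frame data ran out")

early = run_capture_loop([False, False, True, False], max_frames=10)
forced = run_capture_loop([False] * 5, max_frames=3)
```

In the first call the criteria are met on the third evaluated frame; in the second call the frame counter reaches the maximum first, so capture is forced.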
Another aspect of the invention is that the method and system are operable in a relatively small amount of memory. A buffer memory for storing pixel data to evaluate terms according to the methods of
In an embodiment, the computational operations to evaluate the subframe sums sij include accumulating the convolution terms g(x,y) as taught above. In the embodiment the terms are evaluated using the convolution matrix coefficients (shown in
The pixel data in “window” 540 of the line buffers 510 and the short five column register buffer 530 are sufficient for evaluating one term of the convolution sum. The evaluation is by multiplying the elements of the 5×5 mask w(m,n) of
After evaluating g(x,y), the window of the line buffers and the most recently received pixel data are advanced to the right by one column to the position 550 as shown in the lower portion of
The next pixel datum received from the image sensor is then read into the rightmost column of buffer register 530. The next term, g(x+1,y) can then be evaluated as described for g(x,y) above. The process of evaluating a term of the convolution sum, shifting the pixel data positions, and advancing the window right one column is repeated until all of the terms in the row y of sij have been accumulated. The window position is then restarted at the left and terms of the next row, row y+1, are evaluated and accumulated in the same way. The sum sij is complete when all rows of subframe i have been processed.
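The streaming evaluation described above can be sketched in Python as follows, assuming four stored lines plus a five-pixel register for the incoming line. The kernel is a hypothetical stand-in for the matrix of the figure, the name `streamed_terms` is illustrative, and Python lists and deques stand in for the hardware line buffers and register buffer.

```python
from collections import deque

# Hypothetical 5x5 high pass kernel (an assumption, not the matrix of the figure).
KERNEL = [[-1] * 5 for _ in range(5)]
KERNEL[2][2] = 24

def streamed_terms(pixel_rows, kernel=KERNEL):
    """Yield (x, y, g) terms using four stored lines and a five-pixel register."""
    line_buffers = deque(maxlen=4)        # the four most recent complete rows
    for row_index, row in enumerate(pixel_rows):
        register = deque(maxlen=5)        # newest five pixels of the incoming row
        incoming = []
        for col_index, pixel in enumerate(row):
            register.append(pixel)
            incoming.append(pixel)
            if len(line_buffers) == 4 and len(register) == 5:
                left = col_index - 4      # leftmost column of the 5x5 window
                window = [buf[left:left + 5] for buf in line_buffers]
                window.append(list(register))
                g = sum(kernel[r][c] * window[r][c]
                        for r in range(5) for c in range(5))
                # The window is centred two rows up and two columns left
                # of the pixel that was just received.
                yield col_index - 2, row_index - 2, g
        line_buffers.append(incoming)     # the oldest stored line drops out

demo = list(streamed_terms([[10] * 6 for _ in range(5)]))
```

Each term is produced as soon as its last pixel arrives, so no full frame needs to be stored; the sketch never holds more than four complete lines and five pixels of the incoming line in its window.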
However, evaluating g(x,y) in the two rows or two columns bordering the edge of a subframe requires data beyond the perimeter of the subframe (formally, the convolution sum for [x,y] requires data from two adjacent rows and columns in each direction). In various embodiments, these edge values can be estimated by standard techniques for a boundary (e.g. extrapolation, mirroring, assuming the value of the first available row, truncation, etc.). In the instant embodiment, filtering is limited to the reduced subframe [W−2×H−2] so that physical row and column data from the image sensor are available to evaluate the terms. Of course the numbering of the indices and constants for boundary values are adjusted accordingly using standard techniques. Also, it will be understood that describing the movement of data in terms of “left” and “right,” “above” and “below,” or as rows and columns, is only by way of explanation. In various embodiments these terms may be interchanged, or other terms may be used. While these terms are convenient for referencing the logical data structure, physical storage locations of the media are often mapped in various ways, depending on the application.
It is seen that a line buffer for a relatively small number of lines and a small number of storage registers are sufficient for implementing the predictor 130 and decision maker 140. In an embodiment comprising four subframes, four sharpness values sij characterize motion in the jth frame. The predictor in the embodiment depends on Dmax,D3j,D4j,k1, and k2. The decision maker 140 depends on Sj,Sc, predictor output, and the selected maximum number of frames. Hence the predictor and decision maker can be implemented using about 15 register cells for storing constants and values characteristic of the motion. Of course, depending on the application, other filtering methods and/or different filters may be used within the scope and spirit of the invention, including filter convolution matrices that are larger or smaller than the illustrative 5×5 matrix. In one embodiment according to the illustrative example, the number of line buffers operable to evaluate image sharpness (four line buffers in the embodiment of
The camera also includes an image signal processor (ISP) 680 for processing pixel data 610 from a register 675 of the media, and transforming the pixel data into another form, such as compressed Joint Photographic Experts Group form (JPEG), for capturing and storing in a picture storage media 695. The image sensor data is independently pipelined by control circuits from the image sensor into the line buffers 670, the register 675 and the ISP.
In one embodiment, the control program in media 640 directs operation and data flow between the program modules of the quality detector, predictor and decision maker according to the method of
While the quality detector, predictor and decision maker are implemented by a processor operable to perform program code in the embodiment, in other embodiments some or all of these functions are implemented using control circuitry. Also, although the mirror registers 660 and quality detector in 630 of the embodiment detect image quality based on 5×5 matrix high pass filtering of the image data, in various other embodiments a quality detector is implemented based on wavelet transforms, or various other methods adapted to detect sharpness, depending on the application.
As shown in
After the capture command is received (N=20 in this example), there is increasing motion, which is manifested by the negative sharpness value differences Dij found in subframes 21 through 24. Responsive to the increased motion, the sharpness of the next complete frame 786 after receiving the capture command, frame 6 at N=24, is diminished relative to frame 5. The sharpness of this first image frame following the capture instruction, referenced by the decision maker as Sc=S6 in the formulae above, has a value S6≈−127. Motion continues to increase during the next 3 subframes, 25-27, as evidenced by negative Dij (Dij<0). At subframe 28 there is a small improvement (less motion, Dij>0), but the increased motion during subframes 25-27 outweighs the relatively small improvement at subframe 28. Hence at subframe 28 the metric of motion Sc+1 (S7), which is comprised of contributions from the four subframes N=25-28, has worsened relative to S6 (Sc), decreasing from S6≈−127 to S7≈−318 (Sc+1). Since Sc+1≈−318<Sc≈−127, the decision maker does not capture.
At subframe 29 the level of motion is relatively stable (Dij=0) and starting at subframe 30 the level of motion significantly diminishes. The subframe sharpness continuously improves (Dij>0) from subframe 30 through subframe 32. This results in substantially increased sharpness of frame 8 (S8=Sc+2) 788, which reaches a local maximum of about 240. Although S8>Sc at this frame, the rate of sharpness improvement between the last successive subframes of frame 8 has decreased (D4j<D3j, j=8). Therefore the next image motion is not predicted to improve (since neither Proposition 1 nor Proposition 2 is true) and the decision maker does not decide to capture the next image frame.
Starting at subframe 34, 790, motion increases again (Dij<0), resulting in renewed deterioration of the frame sharpness. At frame 9, 792, the rate of sharpness improvement between the last two successive subframes of a frame has decreased again, i.e. there is an increased rate of sharpness deterioration (D4j<D3j<0, j=9). Therefore the decision maker does not decide to capture a next frame. However after frame 9 (subframe 36) the rate of deterioration eases, as evidenced by Dij increasing monotonically from subframe 36 of frame 10 (793) through subframe 40 of frame 10 (794).
When frame 10 is evaluated (subframe 40 at 794), the predictor predicts that next image frame motion will be less than the current image frame motion because Proposition 2 is true. This is apparent from the following considerations. First, D4j>D3j (j=10), as required by the last term of both Proposition 1 and Proposition 2. Next, it is seen that D4j<0 at frame 10 (subframe 40). Therefore Proposition 1 is false, and it remains to evaluate the second term of Proposition 2, |D4j|<k2*Dmax (k2=½), to decide whether next image motion is predicted to be less than current image frame motion. Dmax is first determined by selecting the maximum value of |Dij| found in the interval that begins after the capture command at subframe 20 and ends at subframe 40. It is seen that |Dij| reaches its maximum value in this interval at subframe 31 (787), where Dij≈240. At frame 10 (subframe 40), D4j is about −18 units. Therefore the inequality |D4,10|≈18<k2*Dmax≈½*240=120 is satisfied at frame 10. Hence, at frame 10, Proposition 2 is true and the predictor predicts that the next image motion will be less than the current image frame motion.
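The arithmetic of this evaluation can be checked with a short Python sketch. The values of Dmax and D4j are the approximate figures quoted above; the value used for D3j is hypothetical, since the text states only that D4j>D3j at frame 10.

```python
k1 = 2 / 3
k2 = 1 / 2
d_max = 240   # approximate maximum |Dij| after the capture command (subframe 31)
d4 = -18      # approximate D4j at frame 10 (subframe 40)
d3 = -30      # hypothetical value; the text states only that D4j > D3j

threshold = k2 * d_max                                   # 0.5 * 240
proposition_1 = d4 > 0 and d4 < k1 * d_max and d4 > d3   # fails: D4j is negative
proposition_2 = d4 < 0 and abs(d4) < threshold and d4 > d3
less_motion_predicted = proposition_1 or proposition_2
```

With these numbers the threshold is 120, Proposition 1 is false because D4j is negative, and Proposition 2 is true because 18 is less than 120 and D4j exceeds D3j.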
Furthermore, at frame 10 (subframe 40), decision maker criteria to capture the next image frame are met because: 1) a capture command was received, 2) increasing sharpness is predicted by the predictor, and 3) the frame sharpness of frame 10 is greater than Sc: S10≈−54>Sc=S6≈−127. Following the decision to capture at frame 10, the next frame 796, frame 11 at subframe 44, is captured and stored in the image memory. Computed values of Dij and the sharpness of some successive subframes following the illustrative captured frame are also included
The computer system 800 includes a processor 810, which can be a conventional microprocessor such as an Intel Pentium microprocessor, an IBM PowerPC microprocessor, a Texas Instruments digital signal processor, or some combination of various types of processors, depending on the embodiment. Memory 840 is coupled to the processor 810 by a bus 870. Memory 840 can be dynamic random access memory (DRAM) and can also include static RAM (SRAM), flash memory, magnetic memory (MRAM) and other types, depending on the application. The bus 870 couples the processor 810 to the memory 840, also to non-volatile storage 850, to display controller 830, and to the input/output (I/O) controller 860. In some embodiments, various combinations of these components are integrated in a single integrated circuit or in a combination of integrated circuits that are combined into a single package. Note that the display controller 830 and I/O controller 860 are often integrated together, and the display may also provide input.
The display controller 830 controls, in the conventional manner, a display on a display device 835, which typically is a liquid crystal display (LCD) or similar flat-panel, small form factor display. The input/output devices 855 can include a keyboard, or stylus and touch-screen, and may sometimes be extended to include disk drives, printers, a scanner, and other input and output devices, including a mouse or other pointing device, such as when a camera is connected to some form of docking station or personal computer. The display controller 830 and the I/O controller 860 can be implemented with conventional well known technology. A digital image input device 865 can be a digital camera comprising an embodiment of the invention which is coupled to an I/O controller 860 or through a separate coupling in order to allow images to be input into the device 800.
The non-volatile storage 850 is often a FLASH memory or read-only memory, or some combination of the two. A magnetic hard disk, an optical disk, or another form of storage for large amounts of data may also be used in some embodiments, though the form factors for such devices typically preclude installation as a permanent component of the device 800. Rather, a mass storage device on another computer is typically used in conjunction with the more limited storage of the device 800. Some of this data is often written, by a direct memory access process, into memory 840 during execution of software in the device 800. One of skill in the art will immediately recognize that the terms “machine-readable medium” or “computer-readable medium” include any type of storage device that is accessible by the processor 810 and also encompasses a carrier wave that encodes a data signal.
The device 800 is one example of many possible devices which have different architectures. For example, devices based on an Intel microprocessor often have multiple buses, one of which can be an input/output (I/O) bus for the peripherals and one that directly connects the processor 810 and the memory 840 (often referred to as a memory bus). The buses are connected together through bridge components that perform any necessary translation due to differing bus protocols.
In addition, the device 800 is controlled by operating system software which may include a file management system, such as a disk operating system, which is part of the operating system software. One example of an operating system with its associated file management system software is the family of operating systems known as Windows CE® from Microsoft Corporation of Redmond, Wash., and their associated file management systems. Another example of an operating system with its associated file management system software is the Palm® operating system and its associated file management system. However, it is common for digital cameras to have much less developed file management software and associated user interfaces. The file management system is typically stored in the non-volatile storage 850 and causes the processor 810 to execute the various acts required by the operating system to input and output data and to store data in memory, including storing files on the non-volatile storage 850. Other operating systems may be provided by makers of devices, and those operating systems typically will have device-specific features which are not part of similar operating systems on similar devices. Similarly, WinCE® or Palm® operating systems may be adapted to specific devices for specific device capabilities.
Device 800 may be integrated onto a single chip or set of chips in some embodiments, and typically is fitted into a small form factor for use as a personal device. Thus, it is not uncommon for a processor, bus, onboard memory, and display-I/O controllers to all be integrated onto a single chip. Alternatively, functions may be split into several chips with point-to-point interconnection, causing the bus to be logically apparent but not physically obvious from inspection of either the actual device or related schematics.
Some portions of the detailed description are presented in terms of algorithms and symbolic representations of operations on data bits within a computer memory. These algorithmic descriptions and representations are the means used by those skilled in the data processing arts to most effectively convey the substance of their work to others skilled in the art. An algorithm is here, and generally, conceived to be a self-consistent sequence of operations leading to a desired result. The operations are those requiring physical manipulations of physical quantities. Usually, though not necessarily, these quantities take the form of electrical or magnetic signals capable of being stored, transferred, combined, compared, and otherwise manipulated. It has proven convenient at times, principally for reasons of common usage, to refer to these signals as bits, values, elements, symbols, characters, terms, numbers, or the like.
It should be borne in mind, however, that all of these and similar terms are to be associated with the appropriate physical quantities and are merely convenient labels applied to these quantities. Unless specifically stated otherwise as apparent from the following discussion, it is appreciated that throughout the description, discussions utilizing terms such as “processing” or “computing” or “calculating” or “determining” or “displaying” or “evaluating” or the like, refer to the action and processes of a computer system, or similar electronic computing device, that manipulates and transforms data represented as physical (electronic) quantities within the computer system's registers and memories into other data similarly represented as physical quantities within the computer system memories or registers or other such information storage, transmission or display devices. Thus the apparatus may be embodied in a medium.
One skilled in the art will appreciate that although specific examples and embodiments of the system and methods have been described for purposes of illustration, various modifications can be made without deviating from the spirit and scope of the present invention. For example, embodiments of the present invention may be applied to many different types of image acquisition systems, imaging devices, databases, application programs and other systems. Moreover, features of one embodiment may be incorporated into other embodiments, even where those features are not described together in a single embodiment within the present document. Accordingly, the invention is described by the appended claims.
The present application is related to U.S. Provisional Patent Application Ser. No. 60/763,516 filed on Jan. 30, 2006, priority to which is claimed.
Number | Date | Country
---|---|---
60/763,516 | Jan 2006 | US