1. Field
Embodiments of the present invention generally relate to the field of controlling the generation of images with a camera, especially those built into portable electronic devices. One object is to substantially reduce blur in images and to encourage generation of images of improved quality for improved results from post-processing and pre-processing particularly for image recognition and optical character recognition (OCR) applications.
2. Related Art
There are many electronic devices that include a built-in camera, including many mobile devices such as laptops, tablet computers, netbooks, smartphones, mobile phones, personal digital assistants (PDAs) and so on. Almost all of these devices possess a camera with an auto-focusing function. A camera's auto-focus mechanism is the system that allows the camera to focus on a certain subject. Through this mechanism, when a user presses the shoot button that triggers capture of an image, a sharp image results, that is, an image with a defined location of focus.
However, many of these cameras and devices still do not adequately provide auto-focusing when a photograph is taken, and a user is thereby allowed to take excessively blurry photographs. For example, these devices do not perform auto-focusing, or do not adequately check it, at the moment of taking the photo (shot). That means that a shot is performed directly by triggering a shot button, regardless of whether auto-focusing was achieved or not. Consequently, to receive a sharp image, a user must wait for the auto-focusing of the device to complete before triggering the shot.
There are other sources of blur in images taken with currently popular portable devices. Some of the popular camera-enabled devices employ complementary metal oxide semiconductor (CMOS) active pixel sensors to capture images. Often, these are small sensors and have a tendency to record blurry images unless exposed to relatively strong light.
The lack of the above-described function—auto-focusing at the moment of taking a shot—is evident in such devices as iOS-based portable electronic devices, netbooks, laptops, tablets, etc. Currently, these devices do not support auto-focusing at the moment of photography. Consequently, these devices require a user to wait for the camera to select a focus or require user assistance to focus the camera of the device on a subject before making, triggering or shooting a sufficiently sharp image (photos with minimal blur). Taking images in this manner does not guarantee sharp images because a user's hand can shake in the moment of capturing an image or photograph. Blurry images taken in this manner are generally unusable for the purpose of subsequent OCR or text recognition. There is substantial opportunity to improve the photography related to portable electronic devices. Non-blurred, sharper images are needed from portable electronic devices including from those devices that do not employ auto-focusing at the moment of taking a shot.
There are many factors that can cause blur in photographic images including relative motion between an imaging system or camera and an object or scene of interest. An example of a motion-blurred image is illustrated in
In one embodiment, the invention provides a method that includes instructions for a device, an operating system, firmware or software application that guarantee the best opportunity for a camera to receive or capture an image with a very good or acceptable level of quality. Photography is allowed or performed only after successfully completing a check of sufficient stabilization and focusing of the camera. A result of this invention is an image with sufficiently clear text for subsequent, accurate recognition of the text.
In the following description, for purposes of explanation, numerous specific details are set forth in order to provide an understanding of the invention. It will be apparent, however, to one skilled in the art that the invention can be practiced without these specific details. In other instances, structures and devices are shown only in block diagram form in order to avoid obscuring the invention.
Reference in this specification to “one embodiment” or “an embodiment” means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment of the invention. The appearances of the phrase “in one embodiment” in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. Moreover, various features are described which may be exhibited by some embodiments and not by others. Similarly, various requirements are described which may be requirements for some embodiments but not other embodiments.
Advantageously, the present invention discloses methods and devices that facilitate reduction in the number of blurred images being taken. These methods are effective for a variety of devices including those that do and do not support or implement auto-focusing at the moment of taking a shot.
There are many causes of blur in images. One of the principal causes is motion of an image capturing device, such as when moving a device at the instant of capturing images in relatively moderate to low levels of light for example. This type of blur is referred to as motion blur. This type of blur is almost always undesirable and detrimental to subsequent processing or consumption.
Referring now to
The electronic device 102 may comprise a general purpose computer embodied in different configurations such as a mobile phone, smartphone, cell phone, digital camera, laptop computer or any other gadget having a screen and a camera, or access to an image or image-generating device or component. A camera or scanner allows converting information represented on—for example—paper into a digital form.
In contrast,
Referring now to
Next, a user chooses a subject of interest and directs the camera for taking or capturing an image. For example, a viewfinder may be directed to a portion of text or page of a document. Then, at step 302, a user presses or actuates a button that ordinarily triggers capture of an electronic image. An exemplary button 106 is shown in
Returning to
The starting of a camera application 302 starts the disclosed method.
After that, a user chooses the subject of interest and directs the camera toward it. For example, the subject of the photo may be text or something else. Then, at step 304, the user actuates a camera button 106 (virtual or real) or taps the touch screen 104 to capture an image. Following this trigger, devices without a function of auto-focusing at the moment of triggering immediately capture a photo. Taken in such a manner, an image may be of very poor quality. That is why, without implementation of the invention, a user must personally wait for the moment when auto-focusing has engaged and press the button only after sufficient focusing has occurred in order to receive a sharp image. In contrast, implementation of the disclosed invention helps avoid these shortcomings. According to the invention, the following steps are performed.
If, at the time that a capture button of a camera is pressed (304), the camera is already focused (the image is in focus) at 306, then photography is performed at step 318 and the device receives a sharp image at step 320.
Otherwise, if the camera is not properly focused, an accelerometer or other sensor is engaged. At step 308, the system tracks the sensor and, based on the readings of the accelerometer (sensor), seeks a moment when the electronic device is stabilized. The stabilized state means that there is little or no shaking of the device. The sensor provides feedback to the device and/or camera.
The feedback includes a signal that the device and/or camera is likely experiencing motion and there is a substantial likelihood of motion blur if an image is captured at that time or instant. The sensor or sensor system allows the device to wait, based on readings of the sensor (e.g., accelerometer, gyrometer), for the next moment when the electronic device and/or camera is sufficiently stabilized. The stabilized point means, for example, that there is relatively little shaking of the device and/or camera.
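One possible way to derive such a stabilization signal from accelerometer readings is sketched below. It is a minimal illustration only, not the disclosed implementation: the class name StabilityDetector, the window length, and the threshold value are hypothetical, and the sketch assumes that low variation in the magnitude of recent acceleration samples is an acceptable proxy for a stabilized device.

```python
import math
from collections import deque


class StabilityDetector:
    """Hypothetical helper: flags a device as stabilized when the
    magnitude of recent accelerometer samples varies very little."""

    def __init__(self, window=10, threshold=0.05):
        self.window = window        # number of recent samples examined (assumed)
        self.threshold = threshold  # max allowed standard deviation (assumed units: g)
        self.samples = deque(maxlen=window)

    def add_reading(self, ax, ay, az):
        # Store only the magnitude of the acceleration vector.
        self.samples.append(math.sqrt(ax * ax + ay * ay + az * az))

    def is_stable(self):
        # Not enough history yet: treat the device as not stabilized.
        if len(self.samples) < self.window:
            return False
        mean = sum(self.samples) / len(self.samples)
        variance = sum((m - mean) ** 2 for m in self.samples) / len(self.samples)
        return math.sqrt(variance) <= self.threshold
```

A device at rest reports a nearly constant magnitude (about 1 g from gravity), so the standard deviation stays near zero; a shaking device produces a wider spread and fails the check. A gyroscope reading could be handled the same way.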
In one implementation, substantially simultaneously with the start of sensor tracking, a timer starts. The timer keeps track of the time during which the device and/or camera seeks a moment of sufficient stabilization. If a predetermined time limit for stabilization is exceeded, the process of photography stops. The user must again engage or trigger the device to take a photograph. In a preferred implementation, the device or camera provides a mechanism to override the stabilization checking.
From the point of view of a user, the user activates the button, and the device waits for the first available window or moment of opportunity to capture a focused image. An exemplary scenario is illustrative. A user riding as a passenger in a vehicle pulls out her mobile phone and desires to take a picture of a sign posted alongside the road while the car is moving. At this time, the mobile phone is moving around in the hands of the user, and the car is experiencing some ordinary turbulence as it advances on the road. The user activates a camera application on the mobile phone. During this time, the mobile phone activates or powers up the camera and related circuitry. The user points the mobile phone out the window of the vehicle. An image immediately captured may be blurry. Thus, the device waits, and a timer starts. Over the next few seconds, if the mobile phone (camera), in the vehicle and in the control of the passenger (user), reaches a sufficiently stable state, and there is sufficient incident light, the sensor in the mobile phone communicates that the camera is free to take a photograph. Assuming that the vehicle is moving slowly enough, an image captured at this instant is likely to be sufficiently in focus.
In the case of failure, a user must press the shoot button or tap the touch screen to run this process again. The limits of the timer may be specified in advance by the user. For example, the device or camera may wait for one or more stable opportunities within 5 seconds, or 10 seconds. This function is useful in conditions where there is steady or unpredictable shaking, for example in a subway. In some such cases, it may be impossible to take a sharp image of the text because of excessive movement or instability of the electronic device or camera, and the time limit prevents the device from waiting indefinitely.
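The timed wait described above can be sketched as a simple polling loop. This is an illustrative sketch under stated assumptions: the function name wait_for_stable, the polling interval, and the override flag are hypothetical, and read_is_stable stands for whatever stabilization check the device provides.

```python
import time


def wait_for_stable(read_is_stable, timeout_s=5.0, poll_s=0.02, override=False):
    """Poll a stabilization check until it succeeds or the time limit expires.

    read_is_stable: callable returning True when the device is stabilized
                    (assumed to be supplied by the sensor layer).
    timeout_s:      time limit, e.g. 5 or 10 seconds as chosen by the user.
    override:       hypothetical flag that skips the stabilization check.
    Returns True if capture may proceed, False if the time limit was exceeded.
    """
    if override:
        return True
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if read_is_stable():
            return True
        time.sleep(poll_s)  # avoid busy-waiting between sensor polls
    return False
```

When the function returns False, the process of photography stops and the user must trigger the device again, consistent with the failure case described above.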
In one embodiment, the system implements a plurality of thresholds for levels of noise corresponding to shaking or movement of the electronic device or camera.
Thresholds for noise based on sensor readings may be specified in advance according to different types of subjects captured in images. For example, the acceptable level of noise for a text-based image destined for recognition must be much lower than the level for a picture with no text elements. The level of noise is important for subsequent recognition of a textual image (a text-based image, or an image that includes text) because it affects the accuracy of recognition of the corresponding text. Blurred textual images require much more computational resources from the electronic device to be adequately recognized. Also, the degree of blur may be so high that recognition is not possible at all. Consequently, textual images (images that include text) must be acquired with the smallest possible level of blur. In contrast, some images, even some that include text, may have some level of blur if the best quality image is not required for subsequent processing (e.g., printing, sharing via social media, archiving) of the particular images. The level of noise for each kind of image may be specified in advance by a user in one or more settings of the system, or may be programmatically obtained by training the device. One way in which the level of acceptable noise may be specified is to allow a user to select a quality of picture that is desired. For example, if a user desires to take landscape photographs of mountains, the user selects a “non-text” option. In another example, a user desires to take a series of pictures of receipts for submission of the information (text) to a finance system. In this example, the user would select a “text-based picture” option. By doing so, the device is programmed to detect a sufficiently stable moment in which to take photographs that are suitable for OCR.
In another embodiment, readings or recordings of a light sensor of an electronic device also may be applied for acquiring images of good or sufficient quality. Generally, the quality of each image increases with the level of light. The light sensor helps to set optimal values of brightness and contrast for a given level of illuminance. The light sensor also allows the electronic device or camera to determine thresholds of illumination for subsequent acquisition of images. For example, the values of thresholds may be specified in such a manner that the electronic device allows triggering of the camera to capture images only when there is a sufficiently high level (amount) of light.
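The selection of a noise threshold by subject type may be illustrated as follows. This is a sketch only: the names SubjectType, NOISE_THRESHOLDS and is_stable_enough are hypothetical, and the numeric values are placeholders chosen solely to show that the text threshold is stricter than the non-text threshold.

```python
from enum import Enum


class SubjectType(Enum):
    TEXT = "text-based picture"  # images destined for OCR
    NON_TEXT = "non-text"        # e.g. landscape photographs


# Placeholder values (assumed): maximum tolerated sensor noise per subject type.
# Text destined for recognition tolerates far less shaking than other subjects.
NOISE_THRESHOLDS = {
    SubjectType.TEXT: 0.02,
    SubjectType.NON_TEXT: 0.10,
}


def is_stable_enough(noise_level, subject):
    """Return True if the measured noise is acceptable for the chosen subject."""
    return noise_level <= NOISE_THRESHOLDS[subject]
```

With these placeholder values, a measured noise level of 0.05 would permit a landscape photograph but would defer capture of a receipt until the device is steadier.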
The readings of two or more sensors (e.g., light sensor and accelerometer) may be combined for sending feedback to the electronic device for eventual triggering of the camera for taking shots (capturing images).
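A minimal sketch of combining two sensor readings into a single trigger decision is given below. The names SensorSnapshot and may_trigger and the default limits are hypothetical; the sketch assumes that a shake level and an illuminance value in lux are already available from the respective sensors.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SensorSnapshot:
    shake_level: float      # noise derived from the accelerometer (assumed scale)
    illuminance_lux: float  # reading of the light sensor


def may_trigger(snapshot, max_shake=0.02, min_lux=100.0):
    """Allow capture only when the device is steady AND the scene is bright enough.

    The default limits are illustrative placeholders, not values from the disclosure.
    """
    steady = snapshot.shake_level <= max_shake
    bright = snapshot.illuminance_lux >= min_lux
    return steady and bright
```

Requiring both conditions reflects the observation that small sensors tend to record blurry images in weak light even when the device itself is steady.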
With reference again to
After the focusing at step 312, the system checks whether the device is stabilized at step 314. If the device is not stabilized, the system returns to step 308. If the electronic device is stabilized, the system checks whether the camera is focused at step 316.
If the camera is focused, photography/taking of an image 318 is performed automatically or programmatically. Otherwise, the system returns to the step of focusing 312. Thus, if shaking starts during the process of focusing, the system returns to the step of tracking accelerometer readings and waits for a moment when the electronic device with the camera is stabilized.
The above-described invention helps to identify opportunities to take a photograph (such as of text, for example) without substantial blur by excluding such human factors as shaking of a user's hand in typical circumstances related to photography. Otherwise, if the circumstances are not suitable for photography, shooting is not enabled by functionality consistent with that described herein.
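The control flow of steps 306 through 318 may be sketched as a small state machine. This is an assumption-laden illustration, not the disclosed implementation: capture_flow, FakeSensor and FakeCamera are hypothetical names, the camera and sensor objects are stand-ins for device interfaces, and max_steps is a simple iteration budget used here in place of the timer.

```python
def capture_flow(camera, sensor, max_steps=1000):
    """Return a captured image, or None if the step budget is exhausted."""
    if camera.is_focused():                 # step 306: already in focus
        return camera.capture()             # step 318: capture immediately
    state = "WAIT_STABLE"
    for _ in range(max_steps):              # budget stands in for the timer
        if state == "WAIT_STABLE":          # step 308: track the sensor
            if sensor.is_stable():
                state = "FOCUS"
        else:                               # step 312: focusing
            camera.focus()
            if not sensor.is_stable():      # step 314: shaking resumed
                state = "WAIT_STABLE"
            elif camera.is_focused():       # step 316: focus achieved
                return camera.capture()     # step 318
            # otherwise remain in FOCUS and refocus on the next pass
    return None


class FakeSensor:
    """Stand-in sensor that replays scripted readings, then reports stable."""

    def __init__(self, readings):
        self._readings = list(readings)

    def is_stable(self):
        return self._readings.pop(0) if self._readings else True


class FakeCamera:
    """Stand-in camera that becomes focused after a set number of attempts."""

    def __init__(self, focus_attempts_needed):
        self._needed = focus_attempts_needed
        self.attempts = 0

    def is_focused(self):
        return self.attempts >= self._needed

    def focus(self):
        self.attempts += 1

    def capture(self):
        return "image"
```

For example, capture_flow(FakeCamera(2), FakeSensor([False, True, False, True])) waits through one unstable reading, loses stability once during focusing, returns to tracking, and captures after the second focusing attempt.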
The hardware 600 also typically receives a number of inputs and provides a number of outputs for communicating information externally. For interface with a user or operator, the hardware 600 usually includes one or more user input devices 606 (e.g., a keyboard, a mouse, imaging device, scanner, etc.) and one or more output devices 608 (e.g., a Liquid Crystal Display (LCD) panel, a sound playback device (speaker)). To embody the present invention, the hardware 600 must include at least one touch screen device (for example, a touch screen), an interactive whiteboard or any other device which allows the user to interact with a computer by touching areas on the screen. A keyboard is not required in embodiments of the present invention.
For additional storage, the hardware 600 may also include one or more mass storage devices 610, e.g., a floppy or other removable disk drive, a hard disk drive, a Direct Access Storage Device (DASD), an optical drive (e.g. a Compact Disk (CD) drive, a Digital Versatile Disk (DVD) drive, etc.) and/or a tape drive, among others. Furthermore, the hardware 600 may include an interface with one or more networks 612 (e.g., a local area network (LAN), a wide area network (WAN), a wireless network, and/or the Internet among others) to permit the communication of information with other computers coupled to the networks. It should be appreciated that the hardware 600 typically includes suitable analog and/or digital interfaces between the processor 602 and each of the components 604, 606, 608, and 612 as is well known in the art.
The hardware 600 operates under the control of an operating system 614, and executes various computer software applications 616, components, programs, objects, modules, etc. to implement the techniques described above. In particular, the computer software applications will include the client dictionary application and also other installed applications for displaying text and/or text image content, such as a word processor, dedicated e-book reader, etc., in the case of the client user device 102. Moreover, various applications, components, programs, objects, etc., collectively indicated by reference 616 in
In general, the routines executed to implement the embodiments of the invention may be implemented as part of an operating system or a specific application, component, program, object, module or sequence of instructions referred to as “computer programs.” The computer programs typically comprise one or more instructions set at various times in various memory and storage devices in a computer, and that, when read and executed by one or more processors in a computer, cause the computer to perform operations necessary to execute elements involving the various aspects of the invention. Moreover, while the invention has been described in the context of fully functioning computers and computer systems, those skilled in the art will appreciate that the various embodiments of the invention are capable of being distributed as a program product in a variety of forms, and that the invention applies equally regardless of the particular type of computer-readable media used to actually effect the distribution. Examples of computer-readable media include but are not limited to recordable type media such as volatile and non-volatile memory devices, floppy and other removable disks, hard disk drives, optical disks (e.g., Compact Disk Read-Only Memories (CD-ROMs), Digital Versatile Disks (DVDs), etc.), and flash memory, among others. Another type of distribution may be implemented as Internet downloads.
While certain exemplary embodiments have been described and shown in the accompanying drawings, it is to be understood that such embodiments are merely illustrative and not restrictive of the broad invention and that this invention is not limited to the specific constructions and arrangements shown and described, since various other modifications may occur to those ordinarily skilled in the art upon studying this disclosure. In an area of technology such as this, where growth is fast and further advancements are not easily foreseen, the disclosed embodiments may be readily modifiable in arrangement and detail as facilitated by enabling technological advancements without departing from the principles of the present disclosure.