Pursuant to 35 U.S.C. §371, this application is the United States National Stage Application of International Patent Application No. PCT/CN2013/088145, filed on Nov. 29, 2013, entitled “CONTROLLING A CAMERA WITH FACE DETECTION”, which is incorporated by reference herein in its entirety.
The present disclosure relates generally to techniques for controlling an electronic device. More specifically, the present disclosure relates to techniques for controlling a camera using face detection.
Many electronic devices, such as smartphones and tablet computers, have integrated cameras that allow users to capture digital pictures. To capture a picture, the user can enter a camera mode that enables the user to use the device like a camera. During camera mode, the display of the device is used as a sort of viewfinder that shows the image that will be captured. If the display is a touchscreen, the display can also serve as an input device. For example, the user may be able to press a virtual button on the display to capture a picture. Even prior to capturing the picture, the user may have the ability to press regions of the screen to change the focus target, disable the flash, or change the zoom, for example. If the device user is holding the device chassis in one hand and using the other hand to press buttons on the display, the stability of the device's camera may be compromised, resulting in extra movement of the chassis and additional blur in the captured picture. Furthermore, if the device user has only one hand free, the user may not be able to make the desired camera adjustments or capture an image.
The same numbers are used throughout the disclosure and the figures to reference like components and features. Numbers in the 100 series refer to features originally found in
The present disclosure is related to techniques for using facial detection to control a camera in a device with a front-facing camera and a rear-facing camera. The front-facing camera points in the same direction as the display so that it can be used, for example, to capture the user's face for video conferencing. The rear-facing camera points in the opposite direction from the front-facing camera and the display. The rear-facing camera is the camera that would most often be used for capturing digital pictures or digital video to be stored to the user's device.
The front-facing camera can be used to detect the face of the camera user. Changes in the image of the user's face can be used to control the rear-facing camera, which may be pointed at a scene that the user is trying to capture. For example, the user may move the camera away from or toward their face. The front-facing camera can detect the change in the apparent size of the user's face and use that information to control some feature of the rear-facing camera, such as changing the zoom factor. As used herein, the term “gesture” refers to any user action that can cause a change in the image stream of the front-facing camera. Gestures that may be detected and used for controlling the rear-facing camera include, but are not limited to, head movements, eye movements, blinking, winking, shoulder shrugging, sticking out the tongue, smiling, frowning, and movements of the camera that cause changes in the front-facing image, such as movements away from or toward the user's face or tilting the camera, which cause the size or position of the face in the image captured by the front-facing camera to change. Any control feature of the rear-facing camera can be paired with any gesture. The control features include, but are not limited to, zoom, focus depth, focus position, shutter speed, image capture, mode changing, and toggling various features, such as toggling the flash, toggling between picture mode and video mode, enabling a burst mode, etc.
With the present techniques, the front-facing camera can be used to receive a stream of images of the user's face at the same time that the rear-facing camera is used to capture a picture (such as a landscape or a group of people). The stream of images received by the front-facing camera can include one or more frames of image data. The device uses the image stream from the front-facing camera to derive the user's gestures, such as facial expressions, head movements, camera movements, and the like. Based on the various gestures, the device initiates actions that would otherwise have to be done using fingers on the touchscreen or pressing buttons on the side of the computing device.
The techniques described herein enable a user to keep both hands on the chassis of the device while controlling the device's camera, which can increase stability and reduce image blur. The techniques described herein may also enable the user to fully operate the camera with a single hand, which may be useful when the user only has one hand free for both carrying the device and controlling the camera. When using one hand to carry and control the camera, either hand may be used while still being able to access all of the camera's control features.
The device 100 can also include memory 106. The memory 106 can include volatile memory, non-volatile memory, or any combination thereof. The memory 106 may be used to store user data, such as image and video data, as well as instructions executable by the processor, such as an operating system, applications, and other programming code. Examples of non-volatile memory include, but are not limited to, electrically erasable programmable read only memory (EEPROM), read only memory (ROM), flash memory, and other solid state memory types such as memristor memory. Examples of volatile memory include, but are not limited to, static random access memory (SRAM), and dynamic random access memory (DRAM).
In some embodiments, the device 100 includes a graphics processing unit (GPU) 108 in addition to the processor 102. The GPU 108 may be configured to perform any number of graphics operations within the device 100. For example, the GPU 108 may be configured to render or manipulate graphics images, graphics frames, videos, or the like, to be displayed to a user of the device 100. In some embodiments, the GPU 108 includes a number of graphics engines (not shown), wherein each graphics engine is configured to perform specific graphics tasks, or to execute specific types of workloads. In some embodiments, the device 100 does not include a GPU 108, and the graphics processing operations are performed by the processor 102.
The processor 102 may be connected through the bus 104 to various input devices 110, which may include, for example, buttons, a touchscreen, and any other suitable controls. The input devices 110 may be used to control the device 100. For example, many of the camera control features may be accessed through the input devices 110. The processor 102 may also be linked through the bus 104 to a display 112. The display 112 may include a display screen that is a built-in component of the device 100. The display 112 may be integrated with a touchscreen input device.
The device 100 also includes two or more cameras, including a front-facing camera 114 and a rear-facing camera 116. The front-facing camera 114 is the camera that is pointed in the direction of the user when the user is using the device 100 in the usual manner, in other words, facing the display 112 and the input devices 110. The rear-facing camera 116 is the camera that points outward from the back of the device 100, in other words, the side opposite of the display 112 and input devices 110. The rear-facing camera 116 may be a high-resolution camera and would typically be the camera that would be used for taking pictures and recording video. Each of the cameras 114 and 116 may include any suitable type of image sensors that can capture an image for storage to a computer-readable memory device. For example, the cameras may include charge-coupled device (CCD) image sensors, complementary metal-oxide-semiconductor (CMOS) image sensors, system on chip (SOC) image sensors, image sensors with photosensitive thin film transistors, or any combination thereof. In some examples, the rear-facing camera includes more than one image sensor and corresponding lens or set of lenses. For example, the rear-facing camera could be a 3D camera used to capture three-dimensional images.
One or more of the cameras 114 and 116 may include a zoom controller 118, which controls the zoom factor of the camera. In some examples, the zoom factor may be adjusted by moving a lens of the camera, and the zoom controller 118 may include or be coupled to a motor for moving one of the lenses. In some examples, the rear-facing camera 116 includes a zoom controller 118, but the front-facing camera 114 does not. One or more of the cameras 114 and 116 may also include an exposure controller 120 that controls the exposure or sensitivity of the camera's image sensor to incoming light. For example, if the camera includes a mechanical shutter, the exposure controller 120 can include circuitry for controlling the shutter speed and the shutter aperture. The exposure controller 120 can control the brightness of a captured image, the depth of field of the captured image, and other effects. The device 100 may also include other controllers not shown in
The device 100 also includes a camera control module 122, which includes logic for controlling the front camera and rear camera. The camera control module can control the cameras by sending commands to the zoom controller and the exposure controller, for example. The camera control module 122 is implemented as hardware or a combination of hardware and software. For example, as shown in
The camera control module 122 enables a user to control the rear-facing camera 116 by various gestures that are received by the front-facing camera 114. When the device is in camera mode, a stream of image data may be received by the rear-facing camera 116, processed, and displayed on the display 112 for viewing by the user. Meanwhile, the front-facing camera 114 can be receiving a stream of image data that is processed to identify gestures that indicate a particular camera control that the user desires, such as zooming in or out, changing the focus, changing the light exposure, or toggling a particular feature such as a flash. Various examples will be described further below. The block diagram of
The method 200 may begin at block 202, wherein the device enters camera mode. During camera mode, the device is receiving image data and can receive camera control instructions from the user, such as an instruction to capture an image. Also during camera mode, the device's display may be serving as a viewfinder and displaying an image stream from the rear-facing camera. If the device is a tablet computer, for example, camera mode may be activated by the user by selecting a camera application. If the device is a digital camera, camera mode may be activated when the device is turned on. During “camera mode,” the user will be able to capture images or video. Furthermore, “capturing” refers to acquiring an image or video so that it can be stored for later retrieval and viewing. In other words, an image is captured when the user takes a picture as opposed to simply viewing an image stream on the display. Similarly, video is captured when the user instructs the camera to record.
At block 204, an image stream is received from the rear-facing camera and sent to the display. As explained above in relation to
At block 206, an image stream is received from the front-facing camera. As described above in relation to
At block 208, the image stream from the front-facing camera is processed to identify gestures. Gestures can be identified based on changes in the front-facing image stream. For example, gestures can be identified by comparing the current image to the baseline image. Movements of the user's head, or shoulders, or specific features of the user's face, can be detected by comparing the current image and the baseline image. Additionally, movement of the camera relative to the user will cause image-wide changes that can also be detected. For example, tilting the device upward will cause all of the image features to shift downward in the image. Moving the device away from the user will cause the user's face to appear smaller. Some of the detected changes will correspond with specific gestures that the camera control module is configured to recognize as camera control gestures. Gestures that the camera control module 122 can be configured to recognize include head movements, facial movements, movements of the shoulders, or image-wide movements that indicate movement of the device. Examples of head movements that may correspond with specific gestures include tilting the head up, down, or to the side, rotating the head to the left or right, or shaking the head, among others. Examples of facial movements that may be identified as gestures include blinking, winking, opening or closing the mouth, smiling, frowning, and sticking out the tongue, among others. Examples of image-wide gestures include tilting the device upward, downward, to the left, or to the right, moving the device closer or farther away, and shaking the device, among others. Other examples of gestures include shrugging the shoulders or one shoulder. In some embodiments, gestures can be identified by a device other than a camera, such as a motion detector, a sensor, a distance measuring device, and the like.
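The baseline comparison described above can be illustrated with a minimal sketch. It assumes that some face detector (not specified in this disclosure) has already reduced each frame to a face bounding box; the `FaceBox` type, the gesture names, and the threshold values are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class FaceBox:
    """Hypothetical face bounding box in pixels (origin at the top-left)."""
    x: float       # left edge
    y: float       # top edge
    width: float
    height: float

    @property
    def center(self) -> Tuple[float, float]:
        return (self.x + self.width / 2, self.y + self.height / 2)


def classify_device_gesture(baseline: FaceBox, current: FaceBox,
                            scale_threshold: float = 0.15,
                            shift_threshold: float = 0.10) -> Optional[str]:
    """Compare the current face box with the baseline and name the gesture.

    The thresholds are illustrative fractions, not values from the disclosure.
    """
    # Apparent size change: a smaller face implies the device moved away.
    scale = current.width / baseline.width
    if scale < 1 - scale_threshold:
        return "device_moved_away"
    if scale > 1 + scale_threshold:
        return "device_moved_closer"

    # Vertical shift of the face, normalised by the baseline face height.
    dy = (current.center[1] - baseline.center[1]) / baseline.height
    if dy > shift_threshold:
        # Image y grows downward, so features shifted down: device tilted up.
        return "device_tilted_up"
    if dy < -shift_threshold:
        return "device_tilted_down"
    return None  # no recognised gesture
```

A practical implementation would likely smooth the measurements over several frames before classifying, so that detector jitter is not mistaken for a gesture.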
At block 210, the gesture identified at block 208 can be used to identify a camera control feature applicable to the rear-facing camera. Any camera control feature can be controlled by a corresponding gesture. The possible camera control features that may be controlled by gestures include, but are not limited to, zoom factor, focus distance, depth of field, focus target, picture brightness, exposure target, flash on or off, and the toggling of various camera modes, such as red eye reduction, burst mode, camera mode versus video mode, and modes that affect shutter aperture and shutter speed, among several others.
The correlation between the specific gestures and their corresponding control features can be pre-programmed in the device. For example, the device may include a lookup table that correlates each gesture with its corresponding control feature. For example, if a wink is identified as a gesture, the gesture may be assigned a particular identifier and used to retrieve a corresponding control feature from the lookup table. In the present example, the wink gesture may correspond with a control feature that is a command to capture an image. The device can also include a customizer utility that enables the user to specify which gestures are paired with which control features. In this way, the correlation between the gestures and the camera control features can vary from user to user. For example, a camera that is owned by two people may have two separate sets of correlations. For one user, a wink could trigger one camera control action, and for another user, a wink could trigger a different camera control action. This may be useful for cases where one user struggles with a particular gesture.
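One way to picture the lookup table and the customizer utility is sketched below. The enum members, the `GestureMap` class, and the user names are hypothetical; the disclosure does not prescribe a data structure, so this is only one plausible arrangement in which per-user entries override pre-programmed defaults.

```python
from enum import Enum
from typing import Dict, Optional


class Gesture(Enum):
    WINK = "wink"
    SMILE = "smile"
    FROWN = "frown"
    OPEN_MOUTH = "open_mouth"


class ControlFeature(Enum):
    CAPTURE_IMAGE = "capture_image"
    ZOOM_IN = "zoom_in"
    ZOOM_OUT = "zoom_out"
    TOGGLE_FLASH = "toggle_flash"


# Pre-programmed correlations (illustrative pairings taken from the text).
DEFAULT_TABLE: Dict[Gesture, ControlFeature] = {
    Gesture.WINK: ControlFeature.CAPTURE_IMAGE,
    Gesture.SMILE: ControlFeature.ZOOM_IN,
    Gesture.FROWN: ControlFeature.ZOOM_OUT,
    Gesture.OPEN_MOUTH: ControlFeature.TOGGLE_FLASH,
}


class GestureMap:
    """Lookup table with optional per-user overrides (hypothetical design)."""

    def __init__(self, defaults: Dict[Gesture, ControlFeature]) -> None:
        self._defaults = dict(defaults)
        self._per_user: Dict[str, Dict[Gesture, ControlFeature]] = {}

    def customize(self, user: str, gesture: Gesture,
                  feature: ControlFeature) -> None:
        """Record a user-specific pairing, as a customizer utility might."""
        self._per_user.setdefault(user, {})[gesture] = feature

    def lookup(self, gesture: Gesture,
               user: Optional[str] = None) -> Optional[ControlFeature]:
        """Return the user's pairing if one exists, else the default."""
        overrides = self._per_user.get(user, {}) if user is not None else {}
        if gesture in overrides:
            return overrides[gesture]
        return self._defaults.get(gesture)
```

With this arrangement, two users of the same device can share the defaults while one of them remaps a gesture that is difficult for them to perform.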
Various gestures will be complementary, meaning that they have an opposite gesture, and are therefore suitable for being paired with control features that are also complementary. For example, winking the left eye is complementary to winking the right eye. Similarly, shrugging the left shoulder is complementary to shrugging the right shoulder. Smiling is complementary to frowning. Such complementary gestures can be paired with complementary control features. For example, a smile gesture may be paired with the control feature of zooming in, while a frown gesture may be paired with the control feature of zooming out. The gesture of tilting the head to the left can be paired with the control feature of moving the focus target to the left, and the gesture of tilting the head to the right can be paired with the control feature of moving the focus target to the right.
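As a rough illustration of complementary pairing, the sketch below pairs a gesture with a control feature and, when both have a known opposite, pairs the opposites as well. The string identifiers and the helper `pair_with_complement` are assumptions made for this example only.

```python
from typing import Dict

# Hypothetical identifiers for opposite gestures named in the text.
GESTURE_COMPLEMENT: Dict[str, str] = {
    "smile": "frown",
    "frown": "smile",
    "wink_left": "wink_right",
    "wink_right": "wink_left",
    "tilt_head_left": "tilt_head_right",
    "tilt_head_right": "tilt_head_left",
}

# Hypothetical identifiers for opposite control features.
FEATURE_COMPLEMENT: Dict[str, str] = {
    "zoom_in": "zoom_out",
    "zoom_out": "zoom_in",
    "focus_left": "focus_right",
    "focus_right": "focus_left",
}


def pair_with_complement(gesture: str, feature: str) -> Dict[str, str]:
    """Pair a gesture with a feature, and their opposites when both exist."""
    pairing = {gesture: feature}
    opposite_gesture = GESTURE_COMPLEMENT.get(gesture)
    opposite_feature = FEATURE_COMPLEMENT.get(feature)
    if opposite_gesture is not None and opposite_feature is not None:
        pairing[opposite_gesture] = opposite_feature
    return pairing
```

For instance, pairing a smile with zooming in also yields the frown and zoom-out pairing, while a feature with no opposite, such as image capture, produces a single entry.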
Various gestures may be considered to be directional gestures and are therefore suitable for being paired with control features that are also directional. For example, upward, downward, leftward, and rightward tilts of the head can be paired with the corresponding upward, downward, leftward, and rightward movements of the focus target or exposure target.
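The directional pairing can be sketched as a small function that moves a focus target in response to head tilts. The normalized coordinate system (0.0 to 1.0 on each axis, with y increasing downward), the gesture names, and the step size are all assumptions for illustration.

```python
from typing import Dict, Tuple

# Hypothetical unit steps per directional gesture; y grows downward.
DIRECTION_STEPS: Dict[str, Tuple[float, float]] = {
    "tilt_head_up": (0.0, -1.0),
    "tilt_head_down": (0.0, 1.0),
    "tilt_head_left": (-1.0, 0.0),
    "tilt_head_right": (1.0, 0.0),
}


def _clamp(value: float) -> float:
    """Keep a normalized coordinate inside the displayed picture."""
    return min(1.0, max(0.0, value))


def move_focus_target(target: Tuple[float, float], gesture: str,
                      step: float = 0.1) -> Tuple[float, float]:
    """Shift the focus target one step in the gesture's direction."""
    dx, dy = DIRECTION_STEPS.get(gesture, (0.0, 0.0))
    return (_clamp(target[0] + dx * step), _clamp(target[1] + dy * step))
```

The same function could serve an exposure target, since both are positions within the displayed picture; unrecognized gestures leave the target unchanged.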
Some gestures may be considered to be activation gestures, which are gestures that activate a particular result, such as image capture (i.e., taking a picture), toggling a feature on and off, or advancing through a list of available features. For example, the gesture of sticking out the tongue, winking, or blinking may be paired with the control feature of taking a picture. If blinking is used as a gesture, the gesture may be identified by a rapid succession of blinks to distinguish the gesture from a normal blinking of the eyes. The gesture of a quick left-right shaking of the head may be paired with the control feature of activating or deactivating a particular mode such as toggling between a camera mode and video mode, deleting a picture, or exiting camera or video mode, among others. The gesture of opening or closing the mouth may be paired with the control feature of toggling the camera flash on or off.
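Distinguishing a deliberate blink gesture from normal blinking could be done by counting blinks within a short time window, as in the sketch below. The function name, the required count of three blinks, and the one-second window are illustrative assumptions rather than values taken from the disclosure.

```python
from typing import Sequence


def is_blink_gesture(blink_times: Sequence[float],
                     required_blinks: int = 3,
                     window_seconds: float = 1.0) -> bool:
    """Return True if `required_blinks` blinks occur within `window_seconds`.

    `blink_times` holds timestamps (in seconds) of detected blinks.
    """
    times = sorted(blink_times)
    # Slide over every run of `required_blinks` consecutive blinks.
    for i in range(len(times) - required_blinks + 1):
        if times[i + required_blinks - 1] - times[i] <= window_seconds:
            return True
    return False
```

Isolated blinks spaced several seconds apart never satisfy the window, so ordinary blinking would not trigger the paired control feature.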
Any gesture can be paired with any control feature depending on the design considerations of a particular implementation. Various examples of some of the possible pairings are described herein. However, an exhaustive list of every possible embodiment is impractical, and many additional pairings between gestures and control features will occur to a person of ordinary skill in the art given the benefit of the present disclosure.
At block 212, the identified control feature is activated for the rear-facing camera. For example, if the control feature is a command to zoom in, the camera control module 122 can send a command to the zoom controller 118 to zoom in. If the control feature is a command to increase the brightness of the picture, the camera control module 122 can send a command to the exposure controller 120 to increase the picture brightness. If the control feature is a command to change the exposure target or focus target to another part of the displayed picture, the camera control module 122 can send a command to change the position of the exposure target or focus target. The above process may be repeated for as long as the device remains in camera mode.
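The activation step can be sketched as a dispatch from control feature to controller command. The classes below are simplified stand-ins for the zoom controller 118, the exposure controller 120, and the camera control module 122; their method names, step sizes, and limits are hypothetical.

```python
from typing import Callable, Dict


class ZoomController:
    """Stand-in for zoom controller 118 (limits are illustrative)."""

    def __init__(self, factor: float = 1.0, step: float = 0.5,
                 min_factor: float = 1.0, max_factor: float = 8.0) -> None:
        self.factor = factor
        self._step = step
        self._min = min_factor
        self._max = max_factor

    def zoom_in(self) -> None:
        self.factor = min(self._max, self.factor + self._step)

    def zoom_out(self) -> None:
        self.factor = max(self._min, self.factor - self._step)


class ExposureController:
    """Stand-in for exposure controller 120."""

    def __init__(self) -> None:
        self.brightness = 0

    def increase_brightness(self) -> None:
        self.brightness += 1


class CameraControlModule:
    """Stand-in for camera control module 122: routes features to commands."""

    def __init__(self, zoom: ZoomController,
                 exposure: ExposureController) -> None:
        self._handlers: Dict[str, Callable[[], None]] = {
            "zoom_in": zoom.zoom_in,
            "zoom_out": zoom.zoom_out,
            "increase_brightness": exposure.increase_brightness,
        }

    def activate(self, feature: str) -> bool:
        """Send the command for `feature`; return False if it is unknown."""
        handler = self._handlers.get(feature)
        if handler is None:
            return False
        handler()
        return True
```

In camera mode, a loop would repeatedly identify a gesture, look up its control feature, and call `activate` with the result until the device leaves camera mode.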
When the user moves the device 100 closer, as shown in
In the present description and claims, the terms “coupled” and “connected,” along with their derivatives, may be used. It should be understood that these terms are not intended as synonyms for each other. Rather, in particular embodiments, “connected” may be used to indicate that two or more elements are in direct physical or electrical contact with each other. “Coupled” may mean that two or more elements are in direct physical or electrical contact. However, “coupled” may also mean that two or more elements are not in direct contact with each other, but yet still co-operate or interact with each other.
Some embodiments may be implemented in one or a combination of hardware, firmware, and software. Some embodiments may also be implemented as instructions stored on a machine-readable medium, which may be read and executed by a computing platform to perform the operations described herein. A machine-readable medium may include any mechanism for storing or transmitting information in a form readable by a machine, e.g., a computer. For example, a machine-readable medium may include read only memory (ROM); random access memory (RAM); magnetic disk storage media; optical storage media; flash memory devices; or electrical, optical, acoustical or other form of propagated signals, e.g., carrier waves, infrared signals, digital signals, or the interfaces that transmit and/or receive signals, among others.
An embodiment is an implementation or example. Reference in the specification to “an embodiment,” “one embodiment,” “some embodiments,” “various embodiments,” or “other embodiments” means that a particular feature, structure, or characteristic described in connection with the embodiments is included in at least some embodiments, but not necessarily all embodiments, of the present techniques. The various appearances of “an embodiment,” “one embodiment,” or “some embodiments” are not necessarily all referring to the same embodiments. Elements or aspects from an embodiment can be combined with elements or aspects of another embodiment.
Not all components, features, structures, characteristics, etc. described and illustrated herein need be included in a particular embodiment or embodiments. If the specification states a component, feature, structure, or characteristic “may”, “might”, “can” or “could” be included, for example, that particular component, feature, structure, or characteristic is not required to be included. If the specification or claim refers to “a” or “an” element, that does not mean there is only one of the element. If the specification or claims refer to “an additional” element, that does not preclude there being more than one of the additional element.
It is to be noted that, although some embodiments have been described in reference to particular implementations, other implementations are possible according to some embodiments. Additionally, the arrangement and/or order of circuit elements or other features illustrated in the drawings and/or described herein need not be arranged in the particular way illustrated and described. Many other arrangements are possible according to some embodiments.
In each system shown in a figure, the elements in some cases may each have a same reference number or a different reference number to suggest that the elements represented could be different and/or similar. However, an element may be flexible enough to have different implementations and work with some or all of the systems shown or described herein. The various elements shown in the figures may be the same or different. Which one is referred to as a first element and which is called a second element is arbitrary.
It is to be understood that specifics in the aforementioned examples may be used anywhere in one or more embodiments. For instance, all optional features of the computing device described above may also be implemented with respect to either of the methods described herein or a computer-readable medium. Furthermore, although flow diagrams and/or state diagrams may have been used herein to describe embodiments, the present techniques are not limited to those diagrams or to corresponding descriptions herein. For example, flow need not move through each illustrated box or state or in exactly the same order as illustrated and described herein.
The present techniques are not restricted to the particular details listed herein. Indeed, those skilled in the art having the benefit of this disclosure will appreciate that many other variations from the foregoing description and drawings may be made within the scope of the present techniques. Accordingly, it is the following claims including any amendments thereto that define the scope of the present techniques.
An example of an electronic device in accordance with the present techniques is provided. The electronic device includes a first camera, a second camera, and a camera control module. The first camera is to receive a first image stream. The second camera is to receive a second image stream that includes a face of a user of the electronic device. The camera control module is to identify a gesture based on a change in the second image stream, identify a camera control feature of the first camera based on the gesture, and activate the camera control feature for the first camera.
In some examples, the camera control module is to establish a base-line image from the second image stream and compare the base-line image to a current image of the second image stream to identify the gesture. In some examples, the gesture is a movement of the electronic device away from the face of the user, and the camera control feature is a command to zoom in. In some examples, the gesture is a wink, and the camera control feature is the activation of an image capture. The gesture can be a complement of a second gesture and the camera control feature can be a complement of a second camera control feature controlled by the second gesture.
The gesture can also be a directional gesture and the camera control feature can be a directional camera control feature. For example, the gesture can be a tilt of the head or a rotation of the head, and the camera control feature can be a command to move a focus target or a command to move an exposure target.
An example of an electronic device in accordance with the present techniques is provided. The electronic device includes logic to receive a first image stream that includes a face of a user of the electronic device. The electronic device includes logic to identify a gesture based on a change in the first image stream. The electronic device also includes logic to identify a camera control feature of a second camera based on the gesture and activate the camera control feature for the second camera.
In some examples, the electronic device includes logic to establish a base-line image from the first image stream and compare the base-line image to a current image of the first image stream to identify the gesture. In some examples, the gesture is a movement of the electronic device away from the face of the user, and the camera control feature is a command to zoom in. In some examples, the gesture is a wink, and the camera control feature is the activation of an image capture. The gesture can be a complement of a second gesture and the camera control feature can be a complement of a second camera control feature controlled by the second gesture.
The gesture can also be a directional gesture and the camera control feature can be a directional camera control feature. For example, the gesture can be a tilt of the head or a rotation of the head, and the camera control feature can be a command to move a focus target or a command to move an exposure target.
An example of a tangible, non-transitory, computer-readable medium in accordance with the present techniques is provided. The computer-readable medium includes instructions to direct a processor to receive a first image stream from a first camera. The instructions direct the processor to receive a second image stream from a second camera, wherein the second image stream includes a face of a user of the electronic device. The instructions also direct the processor to identify a gesture based on a change in the second image stream, identify a camera control feature of the first camera based on the gesture, and activate the camera control feature for the first camera.
In some examples, the instructions direct the processor to establish a base-line image from the second image stream and compare the base-line image to a current image of the second image stream to identify the gesture. In some examples, the gesture is a movement of the second camera away from the face of the user, and the camera control feature is a command to zoom in. In some examples, the gesture is a wink, and the camera control feature is the activation of an image capture. The gesture can be a complement of a second gesture and the camera control feature can be a complement of a second camera control feature controlled by the second gesture.
The gesture can also be a directional gesture and the camera control feature can be a directional camera control feature. For example, the gesture can be a tilt of the head or a rotation of the head, and the camera control feature can be a command to move a focus target or a command to move an exposure target.
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/CN2013/088145 | 11/29/2013 | WO | 00 |
Publishing Document | Publishing Date | Country | Kind |
---|---|---|---|
WO2015/077978 | 6/4/2015 | WO | A |
Number | Date | Country | |
---|---|---|---|
20160156838 A1 | Jun 2016 | US |