The embodiments described herein relate generally to the field of electronic devices. More particularly, the embodiments describe techniques for using content of an image captured by a camera associated with the electronic device to modify a current operating state of the electronic device.
Electronic devices can include an image capture system such as a camera. The camera can be configured to capture both still images, such as a snapshot, and moving images that can be processed to form video. Recently, it has become popular to reduce the size and weight of electronic devices such that they become highly portable in nature. Such highly portable electronic devices can include a front facing camera configured to capture images. However, most of the information associated with the captured images generally goes unused, since only a small fraction of the total image data captured is retained. Therefore, most camera systems associated with small form factor electronic devices have a very low average utilization factor for the resources, both hardware and computational, dedicated to the camera system. This wasteful use of resources is particularly problematic in highly portable small form factor electronic devices where available space is limited.
Therefore, an electronic device that provides an efficient method, system, and apparatus for a multipurpose image capture system is desired.
A method for displaying visual content by an electronic device having a plurality of operational components at least one of which is a front facing image capture device and at least another is a front facing display device arranged to display visual content is described. The method can be carried out by capturing an image by the front facing image capture device, the image including image content, determining an orientation of a human face associated with at least some of the captured image content, and aligning the orientation of at least some of the visual content presented at the display in real time in accordance with the orientation of the human face.
In one embodiment, the electronic device includes a rear facing camera arranged to capture a rear facing image. The rear facing image can be presented as at least some of the visual content presented at the display in real time in accordance with the orientation of the human face.
In yet other embodiments, at least some of the visual content can be presented aligned with the direction of gravity independent of the orientation of the electronic device. In other embodiments, a preferred orientation can be used to process captured images in accordance with the preferred orientation.
A personal media device includes at least a front facing display configured to present visual content, a front facing camera, the front facing camera configured to capture image data, a facial orientation module, the facial orientation module arranged to generate a facial orientation vector based in part upon the captured image data, the facial orientation vector corresponding to a facial orientation of a current user viewing the front facing display, a rear facing image capture device arranged to capture a rear facing image, and a processor coupled to the front facing display, the front facing camera, and the rear facing camera, the processor configured to display in real time at least some of the visual content presented by the display in accordance with the orientation of the human face.
A non-transitory computer readable medium is described for storing computer code executed by a processor in a personal media device having at least a front facing image capture device and a front facing display device arranged to display visual content for modifying a current operational state of the personal media device. The computer readable medium includes at least computer code for capturing an image by the image capture device, the image including image content, and computer code for modifying the current operational state of the personal media device in accordance with the captured image content.
A method can be performed by capturing image data by a front facing image capture device associated with a personal media device, the personal media device having a front facing display configured for displaying visual content at an orientation, presenting the visual content at the display device at the orientation, determining if a human face is included in the captured image data, determining a facial orientation vector associated with the human face, and using the facial orientation vector to modify the presenting of the visual content by the display.
Other apparatuses, methods, features and advantages of the described embodiments will be or will become apparent to one with skill in the art upon examination of the following figures and detailed description. It is intended that all such additional apparatuses, methods, features and advantages be included within this description, be within the scope of the described embodiments, and be protected by the accompanying claims.
The described embodiments and the advantages thereof can best be understood by reference to the following description taken in conjunction with the accompanying drawings.
In the following detailed description, numerous specific details are set forth to provide a thorough understanding of the concepts underlying the described embodiments. It will be apparent, however, to one skilled in the art that the described embodiments can be practiced without some or all of these specific details. In other instances, well known process steps have not been described in detail in order to avoid unnecessarily obscuring the underlying concepts.
Aspects of the described embodiments relate to operation of an electronic device. In particular, the electronic device can include an image capture device that is capable of capturing images. In a particular embodiment, the image capture device can be a front facing image capture device. In this way, when the electronic device is portable and held by a user, the front facing image capture device can capture image data at least some of which is associated with the user. The images can take the form of still images such as a snapshot, or moving images that can be processed into video of any format. In one embodiment, the image capture device can operate in what can be referred to as a background operation mode, by which it is meant that the image capture device can initiate an image capture process as part of the background operations carried out by the operating system of the electronic device. In this way, the image capture device can receive information in the form of image data without direct action by or knowledge of a user.
In some embodiments, captured image data can be used to control or at least influence operations of the electronic device. For example, when the electronic device can capture images of a human face, motions of the human face (nodding, shaking side to side, facial expressions) can be used to modify operations of the electronic device.
In a particularly useful embodiment, image data received at the image capture device can be processed in such a way as to alter a current operating state of the electronic device. In those embodiments where the electronic device includes a front facing display for presenting visual content, the manner in which the visual content is presented can be altered by the image data received at the image capture device and processed by the electronic device. The presentation of the visual content can be altered in any number of ways. For example, an orientation (i.e., landscape or portrait) of the presented visual content can be altered based upon image data received at the image capture device and processed by the electronic device. This is particularly advantageous when the captured image data includes image data corresponding to a human face associated with a user of the electronic device. By utilizing well known facial recognition techniques, the electronic device can process that portion of the captured image data determined to correspond to the human face. In one aspect, the facial data can be used to determine an orientation of the human face relative to the orientation of the presented visual content. In other words, the electronic device can determine whether the orientation of the presented visual content is substantially the same as, or different from, the orientation of the human face. In one implementation, the electronic device can alter the current orientation of the presented visual content to more closely align with that of the human face.
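As an illustrative sketch of the alignment just described, the angle of the detected face can be snapped to the nearest of the four canonical display orientations. The function name and the quarter-turn mapping are assumptions for illustration, not details of the described embodiments:

```python
import math

def nearest_display_orientation(face_angle_rad):
    """Map a face roll angle (radians of deviation from "up") to the
    nearest of the four canonical display orientations, expressed as
    degrees of rotation of the presented visual content.

    Hypothetical helper: the embodiments do not mandate a specific
    mapping; this simply snaps to 90-degree increments.
    """
    # Normalize to (-pi, pi], then snap to the closest quarter turn.
    angle = math.atan2(math.sin(face_angle_rad), math.cos(face_angle_rad))
    quarter_turns = round(angle / (math.pi / 2)) % 4
    return quarter_turns * 90  # 0, 90, 180, or 270 degrees
```

Under this sketch, a face tilted about 80 degrees (a user reclining on one side) snaps to a 90-degree rotation of the presented visual content, while an upright face leaves the content unrotated.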
It should be noted that the orienting feature of the described embodiments is independent of any inertial or gravimetric techniques used in the prior art to ascertain the orientation of the user. Typical accelerometer or other inertial or gravitational based sensors rely solely upon the sensed orientation of the electronic device with respect to the direction of gravity, and not upon the orientation of the user. Accordingly, the prior art orientation sensing techniques merely presume the orientation of the user, whereas the techniques of the described embodiments determine the actual orientation of the user. Furthermore, the orienting feature can operate in real time such that the orientation of the presented visual content can track, in real time, any changes in the orientation of the user of the electronic device. For example, if a user wishes to recline and view the presented visual content, the electronic device can determine that the user is currently exhibiting a reclining orientation and adjust the orientation of the presented visual content accordingly. Therefore, unlike the prior art orienting techniques that presume the orientation of the user based solely upon the physical orientation of the electronic device, the described embodiments can use the actual orientation of the user to provide a much improved user experience.
Furthermore, the electronic device can also process the captured image data to determine if the human face is a recognized or an unrecognized human face. In this way, the electronic device can provide a layer of security based upon whether or not a current user of the electronic device is not only recognized but recognized as an approved user of the electronic device. This is particularly useful in those situations where the electronic device has been lost or stolen. Any attempt to actually use the electronic device by anyone other than the owner (or other authorized user) can result in specific actions being taken designed to thwart any unauthorized use of the electronic device. As part of the facial recognition options made available by the electronic device, the electronic device can be operated in what can be referred to as a learning mode in which the electronic device can be trained to recognize a particular human face. In this way, the electronic device can learn that more than one user can be considered authorized. This ability of learning to recognize the particular human face can be a useful tool in those situations where an owner of the electronic device desires that only a specific group of humans be authorized to operate the electronic device. Using the facial recognition feature, the authorization process can run in the background without the knowledge of, or intervention by, the user of the electronic device. In this way, a seamless transition from one authorized user to another authorized user can be carried out without the need of manually inputting information such as a password, pass phrase, etc.
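The learning mode and background authorization just described might be sketched as follows. The `FaceAuthorizer` class name, the cosine-similarity comparison, and the 0.9 match threshold are all illustrative assumptions; a real device would derive feature vectors from a full facial recognition pipeline:

```python
import math

class FaceAuthorizer:
    """Minimal sketch of the learning mode described above.

    Faces are represented here as plain numeric feature vectors; the
    representation, threshold, and class name are assumptions for
    illustration only.
    """

    def __init__(self, match_threshold=0.9):
        self.match_threshold = match_threshold
        self.enrolled = []  # feature vectors of authorized faces

    def learn(self, features):
        # Learning mode: train the device to recognize this face.
        self.enrolled.append(list(features))

    def is_authorized(self, features):
        # Compare against every enrolled face using cosine similarity,
        # so multiple users can be considered authorized.
        for ref in self.enrolled:
            dot = sum(a * b for a, b in zip(ref, features))
            norm = (math.sqrt(sum(a * a for a in ref)) *
                    math.sqrt(sum(b * b for b in features)))
            if norm and dot / norm >= self.match_threshold:
                return True
        return False
```

Because the check needs no password or pass phrase, it can run in the background and hand off seamlessly between enrolled users, as the paragraph above describes.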
In addition to providing orientation and security services, power saving features can also be provided. In particular, if the processing of the captured image data indicates that it is likely that a human face is not present, then the image capture device can continue to capture image data for processing by the electronic device. After a preset length of time with no human face detected, a presumption can be made that the electronic device is currently not being used. Once it has been determined that it is likely that the electronic device is not currently being used, most components in the electronic device can be put into a standby mode in order to preserve power. In some cases, the image capture device can occasionally “wake up” to capture image data in order to determine if a user has come into view and take appropriate actions. For example, the sleeping electronic device can wake up when a recognized face that is also authorized comes into view of the image capture device. In this way, the electronic device will wake up when an authorized user comes into view and looks at the electronic device.
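The sleep/wake behavior described in this paragraph can be sketched as a simple state transition. The state names, the 60-second idle limit, and the function signature are illustrative assumptions:

```python
def next_power_state(state, face_present, face_authorized,
                     idle_seconds, idle_limit=60):
    """Sketch of the power-saving behavior described above.

    Returns the next operating state ("active" or "standby") and the
    updated idle timer. The labels and the idle limit are assumptions
    for illustration.
    """
    if state == "active":
        if face_present:
            return "active", 0          # user in view: reset the timer
        if idle_seconds >= idle_limit:
            return "standby", 0         # no face for too long: sleep
        return "active", idle_seconds   # keep counting
    # In standby, the camera occasionally wakes to sample an image;
    # only a recognized, authorized face wakes the whole device.
    if face_present and face_authorized:
        return "active", 0
    return "standby", 0
```

Note that only the image capture device needs to remain intermittently active in standby; the rest of the components stay powered down until an authorized face comes into view.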
In the described embodiments, the electronic device can take many forms. The electronic device can, for example, take the form of a portable media device (PMD) arranged to monitor, process, present and manage image data captured by an image capture device. The PMD can pertain to a portable media device such as an iPod™, a personal communication device along the lines of the iPhone™, or portable computing platform such as a tablet computer that includes the iPad™, all of which are manufactured by Apple Inc. of Cupertino, Calif. More specifically, the image capture device can take the form of at least a front facing camera configured to capture image data that can be processed in any number of ways.
These and other embodiments are discussed below with reference to the accompanying figures.
PMD 100 can be highly portable in nature and as such can be powered by one or more rechargeable and/or replaceable batteries such that PMD 100 can be carried about while traveling, working, exercising, and so forth. In this way, PMD 100 can provide services such as playing music, playing games or video, recording video or taking pictures, placing and receiving telephone calls, communicating with other communication devices, controlling other devices (e.g., via remote control and/or Bluetooth functionality), and so forth. In addition, PMD 100 can be sized such that it fits relatively easily into a pocket or a hand. While certain embodiments of the present invention are described with respect to a portable electronic device, it should be noted that the presently disclosed techniques can be applicable to a wide array of other, less portable, electronic devices and systems that are configured to render graphical data, such as a desktop computer.
PMD 100 can include an enclosure or housing 102, display 104 for presenting at least visual content and front facing image capture device 106 having lens 108. For the remainder of this discussion, image capture device 106 takes the form of camera 106 capable of capturing both still and moving images for conversion to video. Enclosure 102 can be formed from plastic, metal, composite materials, or other suitable materials, or any combination thereof. Enclosure 102 can protect the interior components of PMD 100 from physical damage, and can also shield the interior components from electromagnetic interference (EMI).
Display 104 can be a liquid crystal display (LCD), a light emitting diode (LED) based display, an organic light emitting diode (OLED) based display, or some other suitable display. In accordance with certain embodiments of the present invention, display 104 can display a user interface and various other images, such as those captured by front facing camera 106, or logos, avatars, photos, album art, and the like. Display 104 can include a touch screen through which a user can interact with the user interface. The display can also include various function and/or system indicators to provide feedback to a user, such as power status, call status, memory status, or the like. These indicators can be incorporated into the user interface presented by display 104.
Front facing camera 106 can capture video images via lens 108 adapted to collect and focus external light used for forming viewable video images on display 104. While camera 106 and lens 108 are shown to be disposed on a top portion of enclosure 102, it should be appreciated that in other embodiments such elements can be disposed on a bottom, side, or back portions of PMD 100. In other embodiments, camera 106 and lens 108 can be located on a moveable or rotatable element which is coupled to enclosure 102. Still further, camera 106 can be detachable from enclosure 102. Still further, multiple cameras can be included in the same enclosure 102.
As discussed in detail below, PMD 100 can include image acquiring, processing and generating elements adapted to store and execute image calibration schemes used for adjusting various image parameters, such as color response, image white balance (WB), IR filtering (e.g., wavelength cut-off), and so forth. Accordingly, such calibration methods are directed to reducing camera-to-camera variations, such as those resulting from the manufacturing of camera 106 and its components, as well as those resulting from the entire manufacturing process of PMD 100. Consequently, such calibration schemes are adapted to ensure that media players, such as PMD 100, render images consistently and in a manner conforming to the user's liking and expectations. In this manner, media players incorporating digital camera modules, i.e., camera 106, can operate uniformly with minimal variations, thus preserving and enhancing product performance and product-line uniformity.
Camera lens 209 can be a standard-type lens having a certain focal length. Lens 209 can be part of an aperture adapted to gather and focus the acquired light onto the image sensor 210. The image sensor 210, which can include a charge coupled device (CCD), a complementary metal oxide semiconductor (CMOS) device and/or other silicon based electro-optical sensors, photonic and/or optical fiber based devices, and the like, is adapted to convert the focused light into electrical signals which can be digitally processed to form images. Further, IR filter 212 can be made from an acrylic or Plexiglas-based material for providing wavelength cut-offs in desirable ranges, such as between 630-700 nm. IR filter 212 can also be made from Schott glass for providing high quality and long pass filtering. Camera 106 can incorporate additional optical and electro-optical devices, such as lenses, polarizers, filters, and so forth, adapted to bolster the focusing of the light and enhance its conversion into reliable electrical signals. Facial orientation module 208 can be used to determine an orientation of a user when the user is facing PMD 100 using at least image data captured by camera 106. In some cases, facial orientation module 208 can use, in addition to image data captured by camera 106, inertial data provided by an accelerometer or other inertial sensors.
Processor 214 can provide processing capabilities to execute and implement operating system platforms, programs, algorithms and any other functions. For example, the processor 214 can execute on-the-fly algorithms adapted for generating and processing images acquired via camera system 201. Specifically, as discussed below, processor 214 is adapted to calibrate and adjust image parameters, such as color response, white balance (WB), etc., for implementing and augmenting the calibration data provided by the EPROM 206. In so doing, processor 214 is adapted to further reduce module-to-module variations, such as those associated with the manufacturing of camera system 201. Those skilled in the art will appreciate that processor 214 is adapted to execute and support additional routines associated with standard on-the-fly functionalities, such as auto-focus, video play back, image filtering and other firmware related operations. Such functionalities can be invoked and called upon by processor 214 using data files stored by memory 216, having storage capacity for storing audio and/or video files, as well as the above mentioned firmware files. Further, processor 214 can be coupled to external computing or communication devices connectable to the Internet, intranet or other web-based networks for uploading and/or downloading files, such as video and image files, via the port 220.
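One simple instance of the on-the-fly calibration adjustments described above is applying per-channel white balance gains to raw pixel data. The function and the gain values are illustrative assumptions; in practice the gains would come from the calibration data provided by EPROM 206:

```python
def apply_white_balance(pixels, gains):
    """Apply per-channel white balance gains to raw RGB pixel data.

    A minimal sketch of a per-module calibration adjustment; the
    signature and gain values are assumptions for illustration.
    Corrected values are clamped to the 8-bit range.
    """
    corrected = []
    for r, g, b in pixels:
        corrected.append((
            min(255, round(r * gains[0])),
            min(255, round(g * gains[1])),
            min(255, round(b * gains[2])),
        ))
    return corrected
```

A camera module whose sensor responds slightly cool, for example, might be calibrated with gains that boost the red channel and damp the blue channel, reducing module-to-module variation.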
Once PMD 100 has determined that human face 306 is within an appropriate distance from PMD 100 and does represent a user of PMD 100, facial orientation module 208 can process image data from camera 106 captured during an image capture event. An image capture event is defined as the actions taken by PMD 100 and camera 106 to capture a single image, the single image being suitable for viewing as a snapshot or as a single frame in a video. The processing can include locating facial landmarks such as ears 308, chin 310, eyes 312, and so on. It is desirable to have more than one set of facial landmarks available since it is likely that one or more facial landmarks may not be viewable to camera 106, for example, when user 304 has hair long enough to obscure ears 308, or when user 304 has a beard that hides chin 310, and so on. In this way, facial orientation module 208 can process captured image data 305 from camera 106 and determine facial orientation vector 302. In particular, facial orientation vector 302 can be based upon angular deviation θ from the direction of gravity “g” representing “down”. For example, when user 304 is reclining, facial orientation vector 302 can take on a value θ of about π/2, or 90°. Therefore, in most situations, facial orientation vector 302 can range over about ±π/2 (indicating right declination or left declination). However, it is not out of the question that facial orientation vector 302 can take on a value θ of about π when user 304 is upside down. Therefore, in the most generalized situation, facial orientation vector 302 can range over −π ≤ θ ≤ π.
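One simple way to derive facial orientation vector 302 from located landmarks is to measure the tilt of the line joining the two eyes. This eye-line heuristic and the image coordinate convention are illustrative assumptions, not a required implementation of facial orientation module 208:

```python
import math

def facial_orientation_angle(left_eye, right_eye):
    """Estimate the facial orientation angle theta (radians) from the
    pixel coordinates of the two detected eyes.

    Landmarks are (x, y) with y increasing downward, as is
    conventional for image data. An illustrative heuristic only.
    """
    dx = right_eye[0] - left_eye[0]
    dy = right_eye[1] - left_eye[1]
    # An upright face has a level eye line (theta ~ 0); a reclining
    # user tilts the eye line toward vertical (theta ~ +/- pi/2).
    theta = math.atan2(dy, dx)
    return theta  # lies in [-pi, pi], matching the range above
```

Because the result already lies in the range −π to π, it maps directly onto the generalized range of facial orientation vector 302 given above, with ±π corresponding to an upside-down face.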
As shown in
It should also be noted that in some embodiments, PMD 100 can process the images captured by front facing camera 106 and/or rear facing camera 112 in a pre-determined manner. For example, in some embodiments, a preferred orientation can be provided to PMD 100, and a processor incorporated into PMD 100 can perform image processing on the captured image data based upon the preferred orientation only. In this way, captured images (in the form of a snapshot, video, etc.) can be processed in a manner consistent with the preferred orientation regardless of the real time orientation of PMD 100 when the image was captured.
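Processing captured images in accordance with a preferred orientation can be sketched as rotating each captured frame by a fixed number of quarter turns. This pure-Python rotation over nested lists is an illustrative assumption; a real pipeline would rotate in the imaging hardware or with an image library:

```python
def rotate_to_preferred(image, quarter_turns):
    """Rotate a captured image (a list of pixel rows) by a number of
    clockwise quarter turns, so stored images always come out in the
    preferred orientation regardless of device orientation at capture.

    Illustrative sketch only; name and representation are assumptions.
    """
    for _ in range(quarter_turns % 4):
        # One clockwise quarter turn: reversed rows become columns.
        image = [list(row) for row in zip(*image[::-1])]
    return image
```

A 2x3 frame captured with the device sideways, for instance, comes out as a 3x2 frame after one quarter turn, consistent with the preferred orientation.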
As shown in
In addition to or in conjunction with facial orientation mode, PMD 100 can operate in a manner in which image data 305 captured by camera 106 can be used to modify a current operating state of PMD 100. For example,
In some embodiments, the PMD can include, in place of or in addition to the front facing image capture device, a rear image capture device that can take the form of a rear facing camera. The rear facing camera can capture rear facing images at 908.
It should be noted that it is contemplated that process 1000 is operable when the PMD is being actively used. Therefore, as part of step 1004, if at least a portion of the human face is determined to be present in the captured image data, in some embodiments, a threshold value can be applied to determine if the human face is actually viewing the visual content presented by the display. In one embodiment, the threshold value can be represented as the number of display elements (referred to as pixels) associated with the human face. Since a bona fide user is expected to occupy a substantial portion of the available image space, the human face should represent a substantial part of the captured image data. In this way, if the threshold value is not exceeded, then it is reasonable to presume that there is no user actively viewing the presented visual content and that the detected human face is likely not a user but more likely a casual observer or passerby. Therefore, if at 1004 it is determined that there is no human face included in the captured image data, then a timer is incremented at 1006 and at 1008 a determination is made if the timer has elapsed. In this way, when no human face corresponding to the user has been detected for at least the elapsed period of time, then it can be reasonably presumed that the PMD is not being used, as there is a good chance that the presented visual content is not being actively viewed. In this case, when the timer has elapsed at 1008, most of the operational components of the PMD can be de-activated at 1010 in order to conserve power or extend the operational lifetime of the components, even when the PMD is receiving external power. It should be noted that the image capture device can remain operational for subsequent processing. On the other hand, if the timer has not elapsed, then control can be passed back to 1002 where additional image data can be captured for further processing.
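The loop of steps 1002 through 1010 described above can be sketched as a single step function. The 15% face-area threshold and the returned action labels are illustrative assumptions:

```python
def step_process(face_pixels, total_pixels, timer, timer_limit,
                 face_threshold=0.15):
    """One pass of the detection loop described above (steps 1004-1010).

    Returns (action, timer): "recognize" when a face large enough to
    be an active viewer is present, "deactivate" when the no-face
    timer elapses, otherwise "capture" to loop back for more image
    data. Threshold and labels are assumptions for illustration.
    """
    if face_pixels / total_pixels >= face_threshold:
        return "recognize", 0           # step 1012: face found, reset timer
    timer += 1                          # step 1006: increment timer
    if timer >= timer_limit:
        return "deactivate", timer      # step 1010: conserve power
    return "capture", timer             # back to step 1002
```

A small face (a passerby) fails the area threshold and is treated the same as no face at all, which is the behavior the threshold is meant to capture.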
Returning back to 1004, when it has been determined that at least a portion of the human face has been detected in the visual field of the image capture device (i.e., the number of pixels associated with the human face is greater than the threshold value), then at 1012 a determination is made if the detected human face is a recognized human face or an unrecognized human face. By recognized it is meant that the detected human face has at least a number of facial characteristics that taken together match or at least correlate to facial characteristics stored in a local database corresponding to a known individual. If the detected human face is recognized, then at 1014, an operation of the PMD can be modified based upon the recognized human face. The modification can include executing a pre-defined set of operations such as opening email, opening text messages, and so forth.
If, however, the detected human face is not recognized, then at 1016 an identification request can be generated. The identification request can be as simple as posting a notice to enter a name, password, pass phrase, and so forth. If at 1018 it is determined that a proper ID has not been received, then the PMD can be disabled or at least locked at 1020. The locking or disabling can provide a layer of security by providing secure facial recognition procedure. However, if the proper ID has been received then control is passed to 1014 for additional processing.
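The security flow of steps 1012 through 1020 can be sketched as follows; the function name and the returned action labels are illustrative assumptions:

```python
def handle_detected_face(recognized, proper_id_received):
    """Sketch of the security flow above (steps 1012-1020): a
    recognized face modifies device operation directly; an
    unrecognized face triggers an identification request, and a
    failed ID locks or disables the device.

    Return values are illustrative labels, not defined operations.
    """
    if recognized:
        return "modify_operation"       # step 1014: e.g., open email
    # Step 1016: unrecognized face, request name/password/pass phrase.
    if proper_id_received:
        return "modify_operation"       # proper ID: proceed to 1014
    return "lock_device"                # step 1020: thwart unauthorized use
```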
The media player 1350 also includes a user input device 1358 that allows a user of the media player 1350 to interact with the media player 1350. For example, the user input device 1358 can take a variety of forms, such as a button, keypad, dial, touch screen, audio input interface, video/image capture input interface, input in the form of sensor data, etc. Still further, the media player 1350 includes a display 1360 (screen display) that can be controlled by the processor 1352 to display information to the user. A data bus 1366 can facilitate data transfer between at least the file system 1354, the cache 1356, the processor 1352, and the CODEC 1363.
In one embodiment, the media player 1350 serves to store a plurality of media items (e.g., songs, podcasts, etc.) in the file system 1354. When a user desires to have the media player play a particular media item, a list of available media items is displayed on the display 1360. Then, using the user input device 1358, a user can select one of the available media items. The processor 1352, upon receiving a selection of a particular media item, supplies the media data (e.g., audio file) for the particular media item to a coder/decoder (CODEC) 1363. The CODEC 1363 then produces analog output signals for a speaker 1364. The speaker 1364 can be a speaker internal to the media player 1350 or external to the media player 1350. For example, headphones or earphones that connect to the media player 1350 would be considered an external speaker.
The media player 1350 also includes a network/bus interface 1361 that couples to a data link 1362. The data link 1362 allows the media player 1350 to couple to a host computer or to accessory devices. The data link 1362 can be provided over a wired connection or a wireless connection. In the case of a wireless connection, the network/bus interface 1361 can include a wireless transceiver. The media items (media assets) can pertain to one or more different types of media content. In one embodiment, the media items are audio tracks (e.g., songs, audio books, and podcasts). In another embodiment, the media items are images (e.g., photos). However, in other embodiments, the media items can be any combination of audio, graphical or visual content.
The various aspects, embodiments, implementations or features of the described embodiments can be used separately or in any combination. Various aspects of the described embodiments can be implemented by software, hardware or a combination of hardware and software. The described embodiments can also be embodied as computer readable code on a non-transitory computer readable medium. The computer readable medium is defined as any data storage device that can store data which can thereafter be read by a computer system. Examples of the computer readable medium include read-only memory, random-access memory, CD-ROMs, DVDs, magnetic tape, and optical data storage devices. The computer readable medium can also be distributed over network-coupled computer systems so that the computer readable code is stored and executed in a distributed fashion.
The foregoing description, for purposes of explanation, used specific nomenclature to provide a thorough understanding of the described embodiments. However, it will be apparent to one skilled in the art that the specific details are not required in order to practice the described embodiments. Thus, the foregoing descriptions of the specific embodiments described herein are presented for purposes of illustration and description. They are not intended to be exhaustive or to limit the embodiments to the precise forms disclosed. It will be apparent to one of ordinary skill in the art that many modifications and variations are possible in view of the above teachings.