One of the biggest problems with cameras in consumer electronic devices is the time between the user wanting to capture an image (e.g., photo or video) and the time at which the image is actually captured. Techniques for automatically focusing cameras help to relieve the burden on the user of having to manually focus the camera. However, autofocus algorithms can take time to perform. Also, the algorithm may mistakenly focus the camera on the wrong object.
One technique for autofocus is for the camera to sweep through a range of focal distances, collecting image data at each of a number of distances. The image data is then analyzed using image processing to determine which image provided the best focus. The camera then takes a picture at this best focal distance. A problem with such a technique is the time that it takes the camera to sweep through the different focal distances.
Another technique is to select an object in the field of view of the camera. The camera can then be automatically focused for that object. Some cameras can detect faces and automatically focus on a face. However, it can be difficult to know which object the camera should focus on, as it can be difficult to know which object the user wishes to photograph. For example, there may be a person in the foreground and a tree in the background. If the camera system incorrectly assumes that the user desires to take a picture of the person in the foreground, then the tree would be out of focus. Of course, the camera can be re-focused on the tree, but this takes additional time. If the user was attempting to take a picture of a bird in the tree, the bird may have flown away by the time the camera is focused.
Methods and systems for automatically focusing a camera are disclosed. Techniques include tracking the gaze of a user's eyes to determine a location at which the user is focusing. A camera lens may then be focused on that location. This allows for fast focusing of the camera.
One embodiment includes a method for automatically focusing a camera including the following. An eye gaze of a user is tracked using an eye tracking system. A vector that corresponds to a direction in which an eye of the user is gazing at a point in time is determined based on the eye tracking. The direction is in a field of view of a camera. A distance is determined based on the vector and a location of a lens of the camera. The lens is automatically focused based on the distance.
One embodiment includes a system comprising a camera having a lens and logic coupled to the camera. The logic is configured to perform the following. The logic is configured to determine a first vector that corresponds to a first direction in which a first eye of a user is gazing at a point in time. The logic is configured to determine a second vector that corresponds to a second direction in which a second eye of the user is gazing at the point in time. The logic is configured to determine a location of an intersection of the first vector and the second vector. The logic is configured to determine a distance between the location of intersection and a location of the lens. The logic is configured to focus the lens based on the distance.
One embodiment includes a method for automatically focusing a camera including the following. A user's eyes are tracked using an eye tracking system. A plurality of first vectors that each correspond to a first direction in which a first eye of the user is gazing at different points in time are determined based on the eye tracking. A plurality of second vectors that each correspond to a second direction in which a second eye of the user is gazing at corresponding ones of the different points in time are determined based on the eye tracking. A plurality of intersections of the first vectors and the second vectors for each of the different points in time are determined. A depth map is generated based on locations of the plurality of intersections. A lens of a camera is automatically focused based on the depth map.
This summary is provided to introduce a selection of concepts in a simplified form that are further described below in the description. This summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter.
Methods and systems for automatically focusing a camera are disclosed. In one embodiment, the system tracks an eye gaze of two eyes to determine a point at which the user is focusing. In one embodiment, this location is determined as the intersection of two vectors, each corresponding to the direction in which one of the eyes is gazing. Then, a camera lens may be focused at that point. In one embodiment, the system tracks an eye gaze of the user, accesses a depth image having depth values, and determines a point in the depth image that corresponds to a gaze vector. This point could be an object that the user is gazing at. From the depth values and a known position of the camera, the system is able to determine a distance from the camera to the object. The term “gaze” refers to a user looking in some direction for some minimum time. There is no set minimum time, as this is a parameter that can be adjusted.
In
In one embodiment, steps of process 200 are performed by a processor that executes computer executable instructions. Process 200 could also be performed by other logic such as an Application Specific Integrated Circuit (ASIC). Some steps could be performed by a processor, while others are performed in hardware.
Step 202 is to track an eye gaze of a user using an eye tracking system.
In step 204, one or more vectors are determined that correspond to a direction in which an eye (or eyes) of the user is gazing at a point in time, based on tracking the eye gaze. The direction is in a field of view of a camera that is to be focused.
In step 206, a focusing distance is determined based on the vector(s) and a location of a lens of the camera. In one embodiment, an intersection of two eye vectors is used to determine the distance. In one embodiment, the distance can be determined by accessing a depth image, knowing a physical relationship between the camera and the depth image, and determining some point in the depth image based on at least one eye tracking vector.
In step 208, the camera lens is focused based on the focusing distance.
In one embodiment, two eye vectors are used in the process of
Steps 222 and 224, in general, determine vectors that correspond to the directions in which the user's right and left eyes are gazing. As noted, gazing refers to the user looking in some direction for some defined time. The time can be any length. Steps 222 and 224 may be performed in response to determining that the user's gaze has been fixed for the defined time. For example, an eye tracking system can continuously monitor the user's eyes, such that each time the user's gaze is fixed for some minimum time, an eye vector is determined for each eye.
In step 222, a first vector is determined that corresponds to a first direction in which a first eye of a user is gazing at a point in time. More precisely, the user is gazing in this direction for some time period, but for the sake of discussion this time period includes a reference point in time.
In step 224, a second vector is determined that corresponds to a second direction in which a second eye of the user is gazing at the point in time.
Steps 222 and 224 may be performed by the eye tracking of the HMD. Thus, the first and second vectors can be determined based on the eye tracking of step 202. Steps 222 and 224 can be performed at any time. In one embodiment, they are performed in response to the system receiving a request to focus the camera lens. This could be a request to take a photograph (e.g., still image) or a request to capture video (e.g., moving images). However, these steps 222-224 could be performed without any request to focus the camera. Thus, the location at which the user is gazing can already be determined prior to a request to focus the camera 113.
In step 226, a location of an intersection of the first vector and the second vector is determined. This location may provide a distance between the user and the point at which the user is gazing. Typically this location is somewhere in the field of view of the camera 113. If it is determined that the gaze point is not in the field of view of the camera 113, the gaze point could be disregarded.
A point of intersection of the two vectors is also shown. Sometimes the first and second vectors will not precisely intersect at a 3D point. This may be due to limitations in the ability to precisely track the eye gaze, or perhaps a characteristic of the way in which the user is gazing. As one example, the two vectors may intersect as depicted in
In such a case, the system could define the location of intersection based on the crossing when considering only the z-x coordinates. Any difference in y-coordinates might be averaged, as one example. Thus, as defined herein, the term “location of an intersection” or the like, when used to refer to the two eye vectors, does not require that the two vectors share the exact same point in 3D space. In other words, the location of intersection could be determined based on two of the three coordinates. The third coordinate may still be considered (e.g., by averaging) when defining the location of intersection. Other techniques could be used to determine and define the location of intersection.
In one embodiment, the location of intersection is defined as a point in a 3D coordinate system. This could be any 3D coordinate system having an origin anywhere. The 3D coordinate system could be Cartesian (e.g., x, y, z), polar, etc. The origin could be fixed in the environment in which the user and camera are located or could be fixed with respect to some point that may move in the environment. For example, the origin could be some point on an HMD, the user, a camera, etc.
In step 228, a distance (e.g., D1 in
In one embodiment, the relative location of the camera lens 213 to the person's eyes 140 is used in order to make the calculation. In one embodiment, there is some common coordinate system between the user's eyes 140 and the camera 113. The device 2 knows the location of the camera 113 and the user's eyes 140 in this common coordinate system, such that D1 can be accurately determined.
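The calculation described above can be illustrated with a short example. The following is a minimal sketch, not the specific implementation of device 2: it estimates the gaze point as the midpoint of the shortest segment between two (possibly skew) gaze rays and then computes the distance D1 from that point to the camera lens. The ray origins, ray directions, lens position, and function names are made-up values, assumed to be expressed in a single common coordinate system.

```python
import numpy as np

def closest_point_between_rays(o1, d1, o2, d2):
    """Midpoint of the shortest segment between two (possibly skew) gaze rays."""
    d1 = d1 / np.linalg.norm(d1)
    d2 = d2 / np.linalg.norm(d2)
    n = np.cross(d1, d2)
    denom = np.dot(n, n)
    if denom < 1e-12:                 # gaze rays are (nearly) parallel
        return None
    t1 = np.dot(np.cross(o2 - o1, d2), n) / denom
    t2 = np.dot(np.cross(o2 - o1, d1), n) / denom
    return 0.5 * ((o1 + t1 * d1) + (o2 + t2 * d2))

# Sample values only: eye positions ~64 mm apart, camera lens slightly above the eyes.
left_origin,  left_dir  = np.array([-0.032, 0.0, 0.0]), np.array([ 0.05, 0.0, 1.0])
right_origin, right_dir = np.array([ 0.032, 0.0, 0.0]), np.array([-0.05, 0.0, 1.0])
lens_position = np.array([0.0, 0.05, 0.0])

gaze_point = closest_point_between_rays(left_origin, left_dir, right_origin, right_dir)
d1_distance = np.linalg.norm(gaze_point - lens_position)   # the distance D1
print(f"gaze point: {gaze_point}, focusing distance: {d1_distance:.2f} m")
```

With these sample rays, the gaze point lands roughly 0.64 m in front of the user, so the lens would be focused at about that distance.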
After step 228, step 210 from
In one embodiment, the lens is focused based on at least one vector from eye tracking and depth values from a depth image.
In step 244, at least one vector is determined based on the eye tracking (of, for example, step 202).
In step 246, the system determines a focusing distance for the camera based on depth values in the depth image and the vector. In one embodiment, the system generates a 3D model of the environment from the depth image. This 3D model could be from the point of view of any coordinate system. Suitable coordinate transformations may be applied if the vector or the location of the camera to be focused is expressed in another coordinate system. The 3D model could be a point-cloud model, but that is not a requirement. The system may determine an intersection between the vector and the 3D model, as one way of determining an object that the user is focused on. Other techniques could be used.
The system knows the location of the camera relative to the position of a depth camera used to capture the depth image, in one embodiment. Thus, if the system determines an object associated with the depth image that corresponds to the vector (e.g., an object that the vector intersects), and the system has a 3D coordinate for the object, the system can determine the distance from the camera to the object. This distance may be used for the focusing distance.
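The following is a rough sketch of one way step 246 could be carried out, under assumed data formats: the depth image is back-projected into a point cloud, the cloud point nearest the gaze ray is treated as the object the user is focused on, and the focusing distance is that point's distance from the camera to be focused. The intrinsics, threshold, and function names are illustrative assumptions, and all quantities are assumed to already share one coordinate system.

```python
import numpy as np

def depth_to_points(depth, fx, fy, cx, cy):
    """Back-project an HxW depth image (meters) into 3D points in the depth-camera frame."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return np.stack([x, y, depth], axis=-1).reshape(-1, 3)

def focusing_distance(points, gaze_origin, gaze_dir, camera_position, max_ray_offset=0.05):
    """Distance from the camera to the point-cloud point that the gaze ray 'hits'."""
    d = gaze_dir / np.linalg.norm(gaze_dir)
    rel = points - gaze_origin
    t = rel @ d                                   # signed distance along the gaze ray
    perp = rel - np.outer(np.clip(t, 0.0, None), d)
    offset = np.linalg.norm(perp, axis=1)         # distance of each point from the ray
    candidates = np.where(offset < max_ray_offset)[0]
    if candidates.size == 0:
        return None                               # gaze ray misses the depth data
    hit = points[candidates[np.argmin(t[candidates])]]   # nearest hit along the ray
    return float(np.linalg.norm(hit - camera_position))
```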
One possible application of auto-focusing is used in conjunction with a near-eye see through display having a front facing camera and one or more sensors for tracking eye gaze. A near-eye see through display may be implemented as a head mounted display (HMD). Although embodiments are not limited to an HMD, an example HMD will be discussed as one possible use case.
Head-mounted display (HMD) devices can be used in various applications, including military, aviation, medicine, video gaming, entertainment, sports, and so forth. See-through HMD devices allow the user to observe the physical world, while optical elements add light from one or more small micro-displays into the user's visual path, to provide an augmented reality image.
See-through HMD devices can use optical elements such as mirrors, prisms, and holographic lenses to add light from one or two small micro-displays into a user's visual path. The light provides holographic images to the user's eyes via see-through lenses.
The HMD device can be worn on the head of a user so that the user can see through a display and thereby see a real-world scene which includes an image which is not generated by the HMD device. The HMD device 2 can be self-contained so that all of its components are carried by, e.g., physically supported by, the frame 115. Optionally, one or more components of the HMD device are not carried by the frame. For example, one or more components which are not carried by the frame can be physically attached by a wire to a component carried by the frame. Further, one or more components which are not carried by the frame can be in wireless communication with a component carried by the frame, and not physically attached by a wire or otherwise to a component carried by the frame. The one or more components which are not carried by the frame can be carried by the user, in one approach, such as on the wrist. The processing unit 4 could be connected to a component in the frame via a wire or via a wireless link. The term “HMD device” can encompass both on-frame and off-frame components.
The processing unit 4 includes much of the computing power used to operate HMD device 2. The processor may execute instructions stored on a processor readable storage device for performing the processes described herein. In one embodiment, the processing unit 4 communicates wirelessly (e.g., using Wi-Fi®, BLUETOOTH®, infrared (e.g., IrDA® or INFRARED DATA ASSOCIATION® standard), or other wireless communication means) to one or more hub computing systems.
A portion of the frame of HMD device 2 surrounds a display that includes one or more lenses. To show the components of HMD device 2, a portion of the frame surrounding the display is not depicted. The display includes a light guide optical element 112, opacity filter 114, see-through lens 116 and see-through lens 118. In one embodiment, opacity filter 114 is behind and aligned with see-through lens 116, light guide optical element 112 is behind and aligned with opacity filter 114, and see-through lens 118 is behind and aligned with light guide optical element 112. See-through lenses 116 and 118 are standard lenses used in eye glasses and can be made to any prescription (including no prescription). In one embodiment, see-through lenses 116 and 118 can be replaced by a variable prescription lens. In some embodiments, HMD device 2 will include only one see-through lens or no see-through lenses. In another alternative, a prescription lens can go inside light guide optical element 112. Opacity filter 114 filters out natural light (either on a per pixel basis or uniformly) to enhance the contrast of the augmented reality imagery. Light guide optical element 112 channels artificial light to the eye.
Mounted to or inside temple 102 is an image source, which (in one embodiment) includes microdisplay 120 for projecting an augmented reality image and lens 122 for directing images from microdisplay 120 into light guide optical element 112. In one embodiment, lens 122 is a collimating lens. An augmented reality emitter can include microdisplay 120, one or more optical components such as the lens 122 and light guide 112, and associated electronics such as a driver. Such an augmented reality emitter is associated with the HMD device, and emits light to a user's eye, where the light represents augmented reality still or video images.
Control circuits 136 provide various electronics that support the other components of HMD device 2. More details of control circuits 136 are provided below with respect to
Microdisplay 120 projects an image through lens 122. Different image generation technologies can be used. For example, with a transmissive projection technology, the light source is modulated by optically active material, and backlit with white light. These technologies are usually implemented using LCD type displays with powerful backlights and high optical energy densities. With a reflective technology, external light is reflected and modulated by an optically active material. The illumination is forward lit by either a white source or RGB source, depending on the technology. Digital light processing (DLP), liquid crystal on silicon (LCOS) and MIRASOL® (a display technology from QUALCOMM®, INC.) are all examples of reflective technologies which are efficient, as most energy is reflected away from the modulated structure. With an emissive technology, light is generated by the display. For example, a PicoP™ display engine (available from MICROVISION, INC.) emits a laser signal with a micro mirror steering either onto a tiny screen that acts as a transmissive element or beamed directly into the eye.
Light guide optical element 112 transmits light from microdisplay 120 to the eye 140 of the user wearing the HMD device 2. Light guide optical element 112 also allows light from in front of the HMD device 2 to be transmitted through light guide optical element 112 to eye 140, as depicted by arrow 142, thereby allowing the user to have an actual direct view of the space in front of HMD device 2, in addition to receiving an augmented reality image from microdisplay 120. Thus, the walls of light guide optical element 112 are see-through. Light guide optical element 112 includes a first reflecting surface 124 (e.g., a mirror or other surface). Light from microdisplay 120 passes through lens 122 and is incident on reflecting surface 124. The reflecting surface 124 reflects the incident light from the microdisplay 120 such that the light is trapped inside a planar substrate comprising light guide optical element 112 by internal reflection. After several reflections off the surfaces of the substrate, the trapped light waves reach an array of selectively reflecting surfaces, including example surface 126.
Reflecting surfaces 126 couple the light waves incident upon those reflecting surfaces out of the substrate into the eye 140 of the user. As different light rays will travel and bounce off the inside of the substrate at different angles, the different rays will hit the various reflecting surfaces 126 at different angles. Therefore, different light rays will be reflected out of the substrate by different ones of the reflecting surfaces. The selection of which light rays will be reflected out of the substrate by which surface 126 is engineered by selecting an appropriate angle of the surfaces 126. More details of a light guide optical element can be found in U.S. Patent Application Publication 2008/0285140, published on Nov. 20, 2008, incorporated herein by reference in its entirety. In one embodiment, each eye will have its own light guide optical element 112. When the HMD device has two light guide optical elements, each eye can have its own microdisplay 120 that can display the same image in both eyes or different images in the two eyes. In another embodiment, there can be one light guide optical element which reflects light into both eyes.
Opacity filter 114, which is aligned with light guide optical element 112, selectively blocks natural light, either uniformly or on a per-pixel basis, from passing through light guide optical element 112. In one embodiment, the opacity filter can be a see-through LCD panel, electrochromic film, or similar device. A see-through LCD panel can be obtained by removing various layers of substrate, backlight and diffusers from a conventional LCD. The LCD panel can include one or more light-transmissive LCD chips which allow light to pass through the liquid crystal. Such chips are used in LCD projectors, for instance.
Opacity filter 114 can include a dense grid of pixels, where the light transmissivity of each pixel is individually controllable between minimum and maximum transmissivities. A transmissivity can be set for each pixel by the opacity filter control circuit 224, described below. More details of an opacity filter are provided in U.S. patent application Ser. No. 12/887,426, “Opacity Filter For See-Through Mounted Display,” filed on Sep. 21, 2010, incorporated herein by reference in its entirety.
In one embodiment, the display and the opacity filter are rendered simultaneously and are calibrated to a user's precise position in space to compensate for angle-offset issues. Eye tracking (e.g., using eye tracking camera 134) can be employed to compute the correct image offset at the extremities of the viewing field. Eye tracking can also be used to provide data for focusing the front facing camera 113, or another camera. The eye tracking camera 134 and other logic to compute eye vectors are considered to be an eye tracking system, in one embodiment.
In the example of
In one example, a visible light camera also commonly referred to as an RGB camera may be the sensor, and an example of an optical element or light directing element is a visible light reflecting mirror which is partially transmissive and partially reflective. The visible light camera provides image data of the pupil of the user's eye, while IR photodetectors 162 capture glints which are reflections in the IR portion of the spectrum. If a visible light camera is used, reflections of virtual images may appear in the eye data captured by the camera. An image filtering technique may be used to remove the virtual image reflections if desired. An IR camera is not sensitive to the virtual image reflections on the eye.
In one embodiment, the at least one sensor 134 is an IR camera or a position sensitive detector (PSD) to which IR radiation may be directed. For example, a hot reflecting surface may transmit visible light but reflect IR radiation. The IR radiation reflected from the eye may be from incident radiation of the illuminators 153, other IR illuminators (not shown), or from ambient IR radiation reflected off the eye. In some examples, sensor 134 may be a combination of an RGB and an IR camera, and the optical light directing elements may include a visible light reflecting or diverting element and an IR radiation reflecting or diverting element. In some examples, a camera may be small, e.g., 2 millimeters (mm) by 2 mm. An example of such a camera sensor is the Omnivision OV7727. In other examples, the camera may be small enough, e.g., the Omnivision OV7727, that the image sensor or camera 134 may be centered on the optical axis or other location of the display optical system 14. For example, the camera 134 may be embedded within a lens of the system 14. Additionally, an image filtering technique may be applied to blend the camera into a user field of view to lessen any distraction to the user.
In the example of
As mentioned above, in some embodiments which calculate a cornea center as part of determining a gaze vector, two glints, and therefore two illuminators, will suffice. However, other embodiments may use additional glints in determining a pupil position and hence a gaze vector. As eye data representing the glints is repeatedly captured, for example at 30 frames a second or greater, data for one glint may be blocked by an eyelid or even an eyelash, but data may be gathered from a glint generated by another illuminator.
Note that some of the components of
In another approach, two or more cameras with a known spacing between them are used as a depth camera to also obtain depth data for objects in a room, indicating the distance from the cameras/HMD device to the object.
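As a brief illustration of that stereo approach, with made-up numbers and assuming rectified cameras, the depth of an object follows directly from the disparity between its matched pixels and the known spacing between the cameras:

```python
# Depth from a stereo pair with a known baseline: Z = f * B / disparity.
focal_length_px = 1000.0   # focal length in pixels (assumed)
baseline_m = 0.10          # known spacing between the two cameras, in meters
disparity_px = 25.0        # horizontal pixel shift of the same object between views

depth_m = focal_length_px * baseline_m / disparity_px
print(f"distance to object: {depth_m:.1f} m")   # -> 4.0 m
```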
Display out interface 328 and display in interface 330 communicate with band interface 332, which is an interface to processing unit 4 when the processing unit is attached to the frame of the HMD device by a wire, or communicates with it by a wireless link, and is worn on the user's wrist on a wrist band. This approach reduces the weight of the frame-carried components of the HMD device. In other approaches, as mentioned, the processing unit can be carried by the frame and a band interface is not used.
Power management circuit 302 includes voltage regulator 334, eye tracking illumination driver 336, audio DAC and amplifier 338, microphone preamplifier and audio ADC 340, biological sensor interface 342 and clock generator 345. Voltage regulator 334 receives power from processing unit 4 via band interface 332 and provides that power to the other components of HMD device 2. Eye tracking illumination driver 336 provides the infrared (IR) light source for eye tracking illumination 134A, as described above. Audio DAC and amplifier 338 provides audio information to earphones 130. Microphone preamplifier and audio ADC 340 provides an interface for microphone 110. Biological sensor interface 342 is an interface for biological sensor 138. Power management unit 302 also provides power to and receives data back from three-axis magnetometer 132A, three-axis gyroscope 132B and three-axis accelerometer 132C.
In one embodiment, wireless communication component 446 can include a Wi-Fi® enabled communication device, BLUETOOTH® communication device, infrared communication device, etc. The wireless communication component 446 is a wireless communication interface which, in one implementation, receives data in synchronism with the content displayed by the audiovisual device 16. Further, augmented reality images may be displayed in response to the received data. In one approach, such data is received from the hub computing system 12.
The USB port can be used to dock the processing unit 4 to hub computing device 12 to load data or software onto processing unit 4, as well as charge processing unit 4. In one embodiment, CPU 420 and GPU 422 are the main workhorses for determining where, when and how to insert images into the view of the user. More details are provided below.
Power management circuit 406 includes clock generator 460, analog to digital converter 462, battery charger 464, voltage regulator 466, HMD power source 476, and biological sensor interface 472 in communication with biological sensor 474. Analog to digital converter 462 is connected to a charging jack 470 for receiving an AC supply and creating a DC supply for the system. Voltage regulator 466 is in communication with battery 468 for supplying power to the system. Battery charger 464 is used to charge battery 468 (via voltage regulator 466) upon receiving power from charging jack 470. HMD power source 476 provides power to the HMD device 2.
The calculations that determine where, how and when to insert an image may be performed by the HMD device 2.
In one embodiment, the system generates a depth map of locations at which the user gazed. Then, the camera 113 is focused based on one or more of the locations in the depth map.
In step 602, a depth map of locations gazed at by the user is constructed. In one embodiment, the locations are determined by tracking eye gaze. When a user moves their eyes, they may tend to hold their gaze on objects that are more interesting. The system can take note when the user gazes for some minimum time. The amount of time is a parameter that can be adjusted. For example, the system can take note when the user holds their gaze for 1 second, some pre-defined time that is less than one second, a few seconds, or some other time period.
In one embodiment, the depth map includes a 3D coordinate for each location at which the user gazed. As noted, gazing is defined as the user looking at a location for some defined time.
The depth map can be generated by the processes of
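One way such a depth map of gazed locations might be represented is sketched below. The structure, field names, and the 0.5-second minimum dwell time are illustrative assumptions only; the later sketches in this description reuse this structure.

```python
import time
from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class GazeLocation:
    point: Tuple[float, float, float]   # 3D coordinate of the gazed location
    timestamp: float                    # when the gaze was recorded
    dwell_time: float                   # how long the gaze was held (seconds)

@dataclass
class GazeDepthMap:
    min_dwell: float = 0.5                            # adjustable gaze threshold
    locations: List[GazeLocation] = field(default_factory=list)

    def record(self, point, dwell_time):
        """Store a location only if the gaze was held for the minimum time."""
        if dwell_time >= self.min_dwell:
            self.locations.append(GazeLocation(point, time.time(), dwell_time))
```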
In step 604, a point or location at which to focus the camera 113 is selected. This point could be one of the locations at which the user gazed. However, the point is not required to be one of the locations. For example, if the user looked at two different locations (at two different distances from the camera 113), the point could be somewhere between the two locations.
Numerous ways to select the point are discussed herein. Some are based on automatically selecting some location without the guidance of the depth map. For example, a camera 113 may be able to detect faces, such that a face is selected to focus upon. Then, the depth map may be consulted to help supplement that technique. Some embodiments select the point based on how long the user spent gazing at the various locations. Some embodiments select the point based on when the user gazed at the various locations.
In step 606, the camera 113 is focused based on the selected location.
If the system determines that the camera is to be focused (step 710=yes), then control passes to step 712. The determination of when to focus the camera can be made in a variety of ways. In one embodiment, the system more or less continuously focuses the camera 113. For example, each time that the system stores a new location (e.g., adds a new location to the depth map), the system can focus the camera 113. In one embodiment, the system waits for input to be instructed to focus the camera 113. For example, the user 13 may provide input that a picture or video is to be captured by the camera 113.
In step 712, one or more of the stored locations (e.g., locations from the depth map) are selected. These locations will be used to determine how to focus the camera 113. As one example, an assumption is made that the user desires to focus the camera 113 on the last location at which they gazed. The amount of time the user spent gazing can be used as a factor to select the location. In some cases, more than one location is selected. It may be that the user 13 has recently looked at several objects that they desire to include in the captured image. Other examples are discussed below.
In step 714, a focus location is determined based on the one or more locations. In one embodiment, rather than determining a focus location, a metric for focusing the camera 113 is determined. An example of a metric is the average distance between the camera 113 and two or more locations. Further details are discussed below.
In step 716, the camera lens is focused based on the distance between the lens 213 (or some other camera element) and the focus location. It is not an absolute requirement that a focus location be determined. That is, it is not required to determine a single 3D coordinate to focus on. Rather, the system might determine the distance to several locations and focus the camera based on an average of these distances.
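A minimal sketch of steps 712-716 along those lines is shown below, reusing the GazeLocation records sketched earlier. It selects the most recently gazed locations and returns the average of their distances from the lens as the focusing metric; the selection count and the function name are assumptions, not a prescribed implementation.

```python
import numpy as np

def average_focus_distance(locations, lens_position, num_recent=3):
    """Average lens-to-location distance over the most recently gazed locations."""
    recent = sorted(locations, key=lambda loc: loc.timestamp)[-num_recent:]
    if not recent:
        return None
    distances = [np.linalg.norm(np.asarray(loc.point) - np.asarray(lens_position))
                 for loc in recent]
    return float(np.mean(distances))
```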
As discussed in
In step 804, a prediction of the location of the face is accessed from the depth map of locations gazed at by the user. In one embodiment, step 804 is achieved by assuming that the user last looked at the face. Therefore, the last location in the depth map is accessed as the location to focus upon, in one embodiment. As noted above, this can be a 3D coordinate. In one embodiment, step 804 is achieved by assuming that the user intends to photograph an object that the user spent the most time gazing at recently. Another assumption could be made, such as assuming that the closest location that the user recently gazed at corresponds to the face. Any combination of these factors, or others, may be used.
In step 806, the camera 113 is focused on the location in the depth map that is predicted to be the face. Step 806 may be achieved by determining the distance between the camera 113 and the location that was accessed from the depth map. Since this camera 113 only needs to be focused once, the image can be captured without the need for focusing at many distances. Note that steps 804-806 are one implementation of steps 712-716 of the process of
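The prediction in step 804 could be implemented with simple heuristics like the sketch below, which again reuses the GazeLocation records sketched earlier. The same heuristics apply to the center-of-FOV and manual-select variations described next; the strategy names are illustrative.

```python
import numpy as np

def predict_target_location(locations, lens_position, strategy="last_gazed"):
    """Guess which stored gaze location corresponds to the detected target."""
    if not locations:
        return None
    if strategy == "last_gazed":        # assume the user last looked at the target
        return max(locations, key=lambda loc: loc.timestamp)
    if strategy == "longest_dwell":     # assume the target got the most attention
        return max(locations, key=lambda loc: loc.dwell_time)
    if strategy == "closest":           # assume the target is the nearest location
        return min(locations,
                   key=lambda loc: np.linalg.norm(
                       np.asarray(loc.point) - np.asarray(lens_position)))
    raise ValueError(f"unknown strategy: {strategy}")
```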
One variation of the process of
In step 814, an estimate or prediction of the location of the center of the FOV is accessed from the depth map of locations gazed at by the user. In one embodiment, step 814 is achieved by assuming that the user last looked at something that is at the location of an object in the center of the FOV. Therefore, the last location in the depth map is accessed as the location to focus upon, in one embodiment. As noted above, this can be a 3D coordinate. In one embodiment, step 814 is achieved by assuming that the user recently spent more time looking at an object in the center of the FOV than other points. In one embodiment, step 814 is achieved by assuming that an object in the center of the FOV is the closest location that the user recently gazed at. Any combination of these factors, or others, may be used.
In step 816, the camera 113 is focused on the center of the FOV based on eye tracking data. Step 816 may be achieved by determining the distance between the camera 113 and the location that was accessed from the depth map. Since this camera 113 only needs to be focused once, the image can be captured without the need for focusing at many distances. Note that steps 814-816 are one implementation of steps 712-716 of the process of
One variation of the process of
In step 824, a location in the depth map that is estimated or predicted to be the manual select point is accessed. In one embodiment, step 824 is achieved by assuming that the user last looked at the manual select point. Therefore, the last location in the depth map is accessed as the location to focus upon, in one embodiment. As noted above, this can be a 3D coordinate. In one embodiment, step 824 is achieved by assuming that the user recently spent more time looking at the manual select point than other points. In one embodiment, step 824 is achieved by assuming that the manual select point is the closest location that the user recently gazed at.
In step 826, the camera 113 is focused on the manual select point based on eye tracking data. Step 826 may be achieved by determining the distance between the camera 113 and the location that was accessed from the depth map. Since this camera 113 only needs to be focused once, the image can be captured without the need for focusing at many distances. Note that steps 824-826 are one implementation of steps 712-716 of the process of
One variation of the process of
In step 904, the camera 113 is focused on the last location that the user gazed at, or other location selected in step 902.
In step 912, two or more locations are selected from the depth map. These locations can be selected using a variety of factors discussed herein including, but not limited to, time spent gazing at the locations, distance of the location from the user, and time since the user gazed at the location.
In step 914, a point is calculated based on the two or more locations. This point is calculated to provide the best focus to capture an object at all of the locations, in one embodiment. In one embodiment, the system calculates a metric from the two or more locations. The metric is used in step 916 to focus the camera 113. The metric might be the average distance from the lens 213, as one example. The metric might be a location that is based on the two or more locations, such as a central point.
In step 916, the camera 113 is focused based on the metric that was calculated in step 914. This can allow the camera 113 to be focused to capture two or more locations, which could be different distances from the camera 113.
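A sketch of the calculation in steps 914-916 follows, under the same assumptions as the earlier sketches: the metric is either the distance to the central point of the selected locations or the average of their individual distances from the lens. The mode names are illustrative.

```python
import numpy as np

def multi_location_focus_distance(points, lens_position, mode="central_point"):
    """Single focusing distance derived from two or more selected 3D locations."""
    pts = np.asarray(points, dtype=float)
    lens = np.asarray(lens_position, dtype=float)
    if mode == "central_point":        # focus on the centroid of the locations
        return float(np.linalg.norm(pts.mean(axis=0) - lens))
    if mode == "average_distance":     # average of the individual distances
        return float(np.mean(np.linalg.norm(pts - lens, axis=1)))
    raise ValueError(f"unknown mode: {mode}")
```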
As noted above, some embodiments focus the camera 113 based on the amount of time that the user spent gazing at various locations.
Various techniques for auto-focusing a camera 113 described herein can be combined. Some combinations have already been mentioned, but other combinations are possible.
The foregoing detailed description of the technology herein has been presented for purposes of illustration and description. It is not intended to be exhaustive or to limit the technology to the precise form disclosed. Many modifications and variations are possible in light of the above teaching. The described embodiments were chosen to best explain the principles of the technology and its practical application to thereby enable others skilled in the art to best utilize the technology in various embodiments and with various modifications as are suited to the particular use contemplated. It is intended that the scope of the technology be defined by the claims appended hereto.