This application claims the benefit of Korean Patent Application No. 10-2010-0057396, filed on Jun. 17, 2010, which is hereby incorporated by reference in its entirety into this application.
1. Technical Field
The present invention relates generally to an apparatus and method for inputting coordinates using eye tracking, and, more particularly, to an apparatus and method for inputting coordinates for a gaze-based interaction system, which are capable of finding a point, which is being viewed by a user, using an image of the user's eye.
2. Description of the Related Art
Eye tracking technology and gaze direction extraction technology are topics that have been actively researched so as to implement a new user input method in the Human-Computer Interaction (HCI) field. Such technologies have been developed and commercialized to enable physically impaired persons, who cannot freely move their bodily parts, such as their hands or feet, to use devices such as computers.
Eye tracking technology and gaze direction extraction technology are also used in various data mining fields, for example, to investigate how the gaze trajectories of users depend on the arrangement of advertisements or text, by tracking the locations viewed not only by physically impaired persons but also by general users.
The most important part of eye tracking is the tracking of the pupil. Thus far, various methods for tracking the pupil have been used.
For example, these methods include a method using the fact that light is reflected from the cornea, a method using the phenomenon which occurs when light passes through various layers of the eye having different refractive indices, an electrooculography (EOG) method using electrodes placed around the eye, a search coil method using a contact lens, and a method using the phenomenon where the brightness of the pupil varies depending on the location of a light source.
Furthermore, when pupil tracking is used in practice, two methods are employed: first, a method of extracting a gaze direction by analyzing the relationship between the head and the eye, based on information about the movement of the head extracted using a magnetic sensor and on the locations of points obtained by tracking the eyeball (the iris or the pupil) using a camera, in order to compensate for the movement of the head; and second, a method of estimating a gaze direction based on the variation in input light depending on the gaze direction, by using a device for receiving light reflected from a projector and the eye.
Accordingly, the present invention has been made keeping in mind the above problems occurring in the prior art, and an object of the present invention is to provide an apparatus and method for inputting coordinates, which are configured to photograph images of the user's pupil and the user's front using at least two cameras, track a gaze direction depending on the movement of the location of the pupil in a user's visible region and then convert the results of the tracking into spatial coordinates, so that it is possible to track a location which is being viewed by a user regardless of the movement of the user's head.
In order to accomplish the above object, the present invention provides an apparatus for inputting coordinates using eye tracking, including a pupil tracking unit for tracking movement of a user's pupil based on a first image photographed by a first camera; a display tracking unit for tracking a region of a display device located in a second image photographed by a second camera; and a spatial coordinate conversion unit for mapping the tracked movement of the pupil to the region of the display device in the second image, and then converting location information, acquired based on the mapped movement of the pupil, into spatial coordinates corresponding to the region of the display device in the second image.
The first camera may be fixed onto a head mount worn on the user's head, and may be disposed so that a lens of the first camera is oriented toward the user's eye.
The first camera may be an infrared camera including a band pass filter having a wavelength range of 1300 nm or 1900 nm.
The second camera may be fixed onto the head mount worn on the user's head beside the first camera, and may be disposed so that a lens of the second camera is oriented toward the user's gaze direction.
The second camera may photograph the second image depending on the user's gaze direction at a location which is varied by movement of the user's head.
The pupil tracking unit may track the location of the center of the pupil based on the first image photographed by the first camera.
The spatial coordinate conversion unit may calibrate the location of the center of the pupil in the space of the second image.
The spatial coordinate conversion unit may convert the location of the center of the pupil into spatial coordinates corresponding to the region of the display device in the second image based on the ratio between the region of the display device and the location of the center of the pupil.
The display tracking unit may track the locations of one or more markers, attached to the display device, in the second image.
Additionally, in order to accomplish the above object, the present invention provides a method of inputting coordinates using eye tracking, including tracking the movement of a user's pupil based on a first image photographed by a first camera; tracking a region of a display device located in a second image photographed by a second camera; and mapping the tracked movement of the pupil to the region of the display device in the second image, and then converting location information, acquired based on the mapped movement of the pupil, into spatial coordinates corresponding to the region of the display device in the second image.
The first camera may be fixed onto a head mount worn on the user's head, and may be disposed so that a lens of the first camera is oriented toward the user's eye.
The first camera may be an infrared camera including a band pass filter having a wavelength range of 1300 nm or 1900 nm.
The second camera may be fixed onto the head mount worn on the user's head beside the first camera, and may be disposed so that a lens of the second camera is oriented toward the user's gaze direction.
The second camera may photograph the second image depending on the user's gaze direction at a location which is varied by movement of the user's head.
The tracking movement of a user's pupil may track the location of the center of the pupil based on the first image photographed by the first camera.
The mapping may include calibrating the location of the center of the pupil in the space of the second image.
The converting may convert the location of the center of the pupil into spatial coordinates corresponding to the region of the display device in the second image based on the ratio between the region of the display device and the location of the center of the pupil.
The tracking a region of a display device may include tracking the locations of one or more markers, attached to the display device, in the second image.
The above and other objects, features and advantages of the present invention will be more clearly understood from the following detailed description taken in conjunction with the accompanying drawings, in which:
Reference now should be made to the drawings, in which the same reference numerals are used throughout the different drawings to designate the same or similar components.
Embodiments of the present invention will be described below with reference to the accompanying drawings.
In general, methods using cameras in eye tracking may be classified into two types. The first type of method is to place cameras around a user's eye in head-mounted form, and the second type of method is to place cameras on a monitor side and photograph a user's eye over a long distance.
Although the method of capturing a user's eye over a long distance has the advantage that the user wears nothing on his or her body, it has several disadvantages: the movement of the user's head is limited, accuracy is reduced because the calculation of the relative locations of the monitor, the head and the eye is complicated, and the resolution of the camera must be sufficiently high. Furthermore, because the camera is attached to a particular monitor, the method is disadvantageous in that the camera and various additional devices must be moved, in a calibrated state, to another monitor in order to apply the method to that monitor.
Accordingly, in the present invention, the method using head-mounted type cameras is used to track a user's gaze direction.
As shown in
Here, at least one camera photographs an image of the user's eye, and at least one other camera photographs an image of the user's front view. For convenience, the former is referred to as a first camera 110, and the latter is referred to as a second camera 120.
The first camera 110 is fixed onto the head mount 50, and the lens of the first camera 110 is fixed and disposed so that it is oriented toward the user's eye when the head mount 50 is worn on the user's head 10. That is, the first camera 110 fixedly photographs an image of the user's eye even if a gaze direction is changed by the movement of the user's head 10.
Here, although it is preferred that the first camera 110 be an infrared camera provided with a band pass filter for a wavelength range of 1300 nm or 1900 nm, it is not limited thereto.
Capturing the eye using infrared light can prevent illumination from being reflected from the pupil, and it also makes it easy to directly track the pupil rather than the limbus, because the method does not rely on surrounding light.
Moreover, the first camera 110 photographs an image of the eye in a wavelength range of 1300 nm or 1900 nm, so that it is possible to track the movement of the pupil outdoors. A detailed description thereof will now be given with reference to
The second camera 120 is fixed onto the head mount 50 beside the first camera 110, and the lens of the second camera 120 is fixed and disposed so that it is oriented in a direction opposite to the direction of the user's eye, that is, in the user's gaze direction, when the head mount 50 is worn on the user's head 10. That is, when the gaze direction is changed by the movement of the user's head 10, the second camera 120 photographs a frontal image of the visible region in the changed gaze direction toward which the user's eye is oriented.
Here, although the second camera 120 may be an infrared camera provided with a band pass filter having a wavelength range of 1300 nm or 1900 nm, like the first camera 110, it is not limited thereto.
In greater detail, the second camera 120 photographs a display device 200 which is located in front of the user. In this case, markers 250 are attached to the display device 200 located in front of the user to enable the location, shape and the like of the display device 200 to be detected. It will be apparent that the markers 250 may instead be embedded inside the display device 200. Here, infrared light emitting devices, for example, Light-Emitting Diodes (LEDs), may be used as the markers 250.
Although the markers 250 are attached to the four corners of the display device 200, the markers 250 are not limited to a specific shape or number because they are used to detect the location, shape and the like of the display device 200.
Referring to
As shown in
For the first camera 110 and the second camera 120, reference is made to the descriptions of
Meanwhile, the pupil tracking unit 130 tracks the movement of the user's pupil in images of the user's eye (hereinafter referred to as the “first images”) photographed by the first camera 110. In greater detail, the pupil tracking unit 130 tracks the center location of the pupil based on the first images photographed by the first camera 110.
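The tracking of the center location of the pupil can be illustrated with a minimal sketch. It assumes, as a simplification not stated in the embodiment, that the pupil appears as the darkest region of a grayscale first image, so that the centroid of the pixels below a threshold approximates the pupil center. The function name `pupil_center`, the threshold value and the synthetic image are hypothetical.

```python
from typing import List, Optional, Tuple

Image = List[List[int]]  # grayscale rows, 0 (dark) to 255 (bright)


def pupil_center(image: Image, dark_threshold: int = 50) -> Optional[Tuple[float, float]]:
    """Return the (x, y) centroid of pixels darker than dark_threshold, or None.

    Assumption: the pupil is the only dark region in the first image.
    """
    sum_x = sum_y = count = 0
    for y, row in enumerate(image):
        for x, value in enumerate(row):
            if value < dark_threshold:
                sum_x += x
                sum_y += y
                count += 1
    if count == 0:
        return None  # no pupil-like pixels found, e.g. during a blink
    return (sum_x / count, sum_y / count)


# Synthetic 8x8 "first image": bright background with a dark 2x2 pupil.
first_image = [[200] * 8 for _ in range(8)]
for yy in (3, 4):
    for xx in (5, 6):
        first_image[yy][xx] = 10

center = pupil_center(first_image)
print(center)
```

A practical tracker would typically add noise filtering and ellipse fitting; the centroid is used here only because it is the simplest estimate of a center location.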
The display tracking unit 140 tracks the region of the display device 200 which is located in images of the user's front (hereinafter referred to as the “second images”) photographed by the second camera 120. Here, the display tracking unit 140 tracks the region of the display device 200 by tracking the locations of the markers 250, attached to the display device 200, in the second images.
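As a hedged illustration of tracking the region of the display device through its markers, the sketch below assumes that infrared LED markers show up as small bright blobs in a grayscale second image. It groups bright pixels into blobs with a flood fill, takes each blob's centroid as a marker location, and labels four markers as corners. The function names, the threshold and the synthetic image are hypothetical, and the corner-labeling rule is one simple choice among many.

```python
from typing import Dict, List, Tuple

Image = List[List[int]]  # grayscale rows, 0 (dark) to 255 (bright)
Point = Tuple[float, float]


def find_marker_centroids(image: Image, bright_threshold: int = 200) -> List[Point]:
    """Return the centroid of every 4-connected blob of bright pixels."""
    height, width = len(image), len(image[0])
    seen = [[False] * width for _ in range(height)]
    centroids: List[Point] = []
    for y0 in range(height):
        for x0 in range(width):
            if seen[y0][x0] or image[y0][x0] < bright_threshold:
                continue
            # Flood-fill one blob and accumulate its pixel coordinates.
            stack = [(x0, y0)]
            seen[y0][x0] = True
            sum_x = sum_y = count = 0
            while stack:
                x, y = stack.pop()
                sum_x += x
                sum_y += y
                count += 1
                for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                    if (0 <= nx < width and 0 <= ny < height
                            and not seen[ny][nx]
                            and image[ny][nx] >= bright_threshold):
                        seen[ny][nx] = True
                        stack.append((nx, ny))
            centroids.append((sum_x / count, sum_y / count))
    return centroids


def order_corners(points: List[Point]) -> Dict[str, Point]:
    """Label four points as corners (image y axis pointing down)."""
    if len(points) != 4:
        raise ValueError("expected exactly four markers")
    return {
        "top_left": min(points, key=lambda p: p[0] + p[1]),
        "bottom_right": max(points, key=lambda p: p[0] + p[1]),
        "top_right": max(points, key=lambda p: p[0] - p[1]),
        "bottom_left": min(points, key=lambda p: p[0] - p[1]),
    }


# Synthetic 10x8 "second image" with one bright pixel per marker.
second_image = [[0] * 10 for _ in range(8)]
for mx, my in ((1, 1), (8, 1), (8, 6), (1, 6)):
    second_image[my][mx] = 255

corners = order_corners(find_marker_centroids(second_image))
print(corners)
```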
The spatial coordinate conversion unit 160 maps the movement of the pupil, tracked in the first images, to the region of the display device 200 in the second images.
Furthermore, the spatial coordinate conversion unit 160 converts location information, acquired based on the mapped movement of the pupil, into spatial coordinates corresponding to the region of the display device 200 in the second images.
Here, the spatial coordinate conversion unit 160 performs conversion into spatial coordinates corresponding to the region of the display device 200 in the second image based on the ratio between the region of the display device 200 and the location of the center of the pupil.
Here, the spatial coordinate conversion unit 160 performs calibration in the space of the second image based on the location of the center of the pupil. The spatial coordinate conversion unit 160 performs calibration in advance.
Calibration is the process of creating a function fc(x), which is used to calculate the location in the second image toward which the center of the pupil, acquired from the first image, is oriented. Here, fc(x) does not convert the coordinates of the center of the pupil, acquired from the first image, into coordinates on the display device 200, but converts them into coordinates in the second image.
Furthermore, since fc(x) is not a fixed function but may vary depending on the location of the pupil based on a first image and depending on a second image, the equation of fc(x) is not mentioned in the embodiment of the present invention.
Accordingly, the spatial coordinate conversion unit 160 enables location information, acquired based on the movement of the pupil, to be converted into spatial coordinates corresponding to the region of the display device 200 in the second image by applying the location of the center of the pupil, acquired from the first image, and the locations of the markers 250, acquired from the second image, to the calibrated fc(x).
The storage unit 170 stores the first and second images photographed by the first camera 110 and the second camera 120. The storage unit 170 further stores information about the location of the center of the pupil tracked by the pupil tracking unit 130 and information about the location of the region of the display device 200 tracked by the display tracking unit 140. Moreover, the storage unit 170 stores the function fc(x) created by the calibration of the spatial coordinate conversion unit 160 and the spatial coordinate values obtained by the function fc(x).
The spatial coordinate output unit 180 outputs the coordinate information, obtained by the spatial coordinate conversion unit 160, to a control device which is connected to the apparatus 100 for inputting coordinates according to the present invention.
In the graph of
Furthermore, in
As shown in
Accordingly, in the present invention, the locations of the pupil and the markers 250 are tracked using infrared light in wavelength ranges near 1300 nm and 1900 nm. In this case, not only can more robust images be acquired under solar light, but power consumption can also be reduced.
In
In order to perform calibration, the user views the markers 250 attached to the display device 200, with his or her head 10 being fixed as much as possible. It is preferable to fill the second image with the display device 200 if possible.
According to the present invention, it is not strictly necessary for the user to keep his or her head 10 fixed while viewing the markers 250, or for the display device 200 to fill the second image; however, it is preferable to fill the second image with the display device 200 so as to increase accuracy.
The pupil tracking unit 130 stores the location of the center of the pupil in the storage unit 170 when the user views each of the markers 250.
For example, with regard to the pupil tracking unit 130, when the user views the markers 250 attached to the display device 200 of
Once the coordinates of the four corners are known, the spatial coordinate conversion unit 160 creates function fc(x), which can calculate the portion of the second image which is being viewed by the user, using various methods, even if the user views a location other than the markers 250.
Since the second camera 120 is affixed onto the user's head 10, the second image photographed by the second camera 120 is varied by the movement of the user's head 10.
Here, fc(x) indicates the portion of the second image, varied by the movement of the user's head 10, which is being viewed by the user. That is, fc(x) is a spatial coordinate conversion function which has the coordinates of the center of the pupil of the user, acquired from the first image, as input and has specific coordinates of the second image as output.
Once fc(x) has been determined as described above, it is unnecessary for the spatial coordinate conversion unit 160 to obtain it again, as long as the locations of the first and second cameras 110 and 120 or the characteristics of the cameras (focal length or the like) do not change.
Although in the embodiment of the present invention, the process of obtaining fc(x) using the markers 250 attached to the display device 200 has been described, any method can be used to obtain fc(x) because the ultimate objective is to obtain fc(x). That is, even when fc(x) is obtained using the three points of a triangle, the operation of the present invention can track the portion of the display device 200 which is being viewed by the user.
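Because the form of fc(x) is left open, the following is only one possible sketch of obtaining it from three calibration points. It assumes, purely for illustration, that fc(x) can be approximated by an affine map from pupil-center coordinates in the first image to coordinates in the second image, which three non-collinear correspondences determine exactly. The name `calibrate_affine` and all sample coordinates are hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

Point = Tuple[float, float]


def _det3(m: Sequence[Sequence[float]]) -> float:
    """Determinant of a 3x3 matrix."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def _solve3(rows: Sequence[Sequence[float]], rhs: Sequence[float]) -> List[float]:
    """Solve a 3x3 linear system with Cramer's rule."""
    d = _det3(rows)
    if abs(d) < 1e-12:
        raise ValueError("calibration points are collinear")
    result = []
    for col in range(3):
        replaced = [list(r) for r in rows]
        for i in range(3):
            replaced[i][col] = rhs[i]
        result.append(_det3(replaced) / d)
    return result


def calibrate_affine(pupil_pts: Sequence[Point],
                     scene_pts: Sequence[Point]) -> Callable[[Point], Point]:
    """Build fc from three (pupil center, second-image point) pairs.

    Assumption: fc is affine, i.e. x = ax*u + bx*v + cx and y = ay*u + by*v + cy.
    """
    rows = [[u, v, 1.0] for u, v in pupil_pts]
    ax, bx, cx = _solve3(rows, [x for x, _ in scene_pts])
    ay, by, cy = _solve3(rows, [y for _, y in scene_pts])

    def fc(p: Point) -> Point:
        u, v = p
        return (ax * u + bx * v + cx, ay * u + by * v + cy)

    return fc


# Pupil centers recorded while the user viewed three known points,
# and where those points appeared in the second image.
pupil_pts = [(10.0, 10.0), (20.0, 10.0), (10.0, 18.0)]
scene_pts = [(100.0, 50.0), (300.0, 50.0), (100.0, 210.0)]

fc = calibrate_affine(pupil_pts, scene_pts)
gaze = fc((15.0, 14.0))
print(gaze)
```

With four or more calibration points, a least-squares or projective fit could be substituted for the exact three-point solution without changing how fc is used afterwards.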
First,
Here, “a,” “b,” “c,” and “d” denote the locations of the markers 250, and “P” denotes the location of the center of the pupil.
Furthermore, a rectangle that connects “a,” “b,” “c,” and “d” corresponds to the region of the display device 200.
Accordingly, the spatial coordinate conversion unit 160 estimates the portion of the actual display device 200 that is being viewed by the user by calculating the ratio between the rectangle abcd and P.
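The ratio calculation for the rectangular case can be sketched as follows. The sketch assumes that the rectangle abcd is axis-aligned in the second image, so that its extent can be taken from the minimum and maximum marker coordinates, and it scales the resulting ratios by a display size in pixels. The function name and the 1920 x 1080 display size are hypothetical.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float]


def rect_ratio_to_display(markers: Sequence[Point], p: Point,
                          display_width: float, display_height: float) -> Point:
    """Map P to display coordinates by its ratio inside the marker rectangle.

    Assumption: the four markers form an axis-aligned rectangle in the
    second image, with the image y axis pointing down.
    """
    xs = [m[0] for m in markers]
    ys = [m[1] for m in markers]
    left, right = min(xs), max(xs)
    top, bottom = min(ys), max(ys)
    if right == left or bottom == top:
        raise ValueError("marker rectangle has zero width or height")
    ratio_x = (p[0] - left) / (right - left)
    ratio_y = (p[1] - top) / (bottom - top)
    return (ratio_x * display_width, ratio_y * display_height)


# Marker locations a, b, c, d and gaze point P, all in second-image pixels.
markers = [(100.0, 50.0), (300.0, 50.0), (300.0, 210.0), (100.0, 210.0)]
P = (150.0, 90.0)

display_point = rect_ratio_to_display(markers, P, 1920, 1080)
print(display_point)
```

Because only ratios are used, the result does not depend on how large the display device appears in the second image.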
In the embodiment of
Meanwhile, depending on the photographing angle of the second camera 120, the region of the display device 200 in the second image may not be a rectangle. In this case, a method of calculating the location of “P” will now be described with reference to
In
Here, a vanishing point can be found from a, b, c and d, point (M2, M3) at which a rectilinear line passing through the vanishing point and P meets
Here, it is assumed that the coordinates of M1, M2, M3 and M4 are M1(x2, y2), M2(x3, y3), M3(x5, y5) and M4(x6, y6).
Accordingly, when the display device 200 is planar, the location coordinates (Xp, Yp) of P can be obtained using the following Equation 1:
Here, Equation 1 is based on fc(x).
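The following sketch is not Equation 1. It illustrates a standard alternative for the non-rectangular case, namely a planar projective (homography) mapping that sends the quadrilateral abcd in the second image to the unit square, from which display coordinates follow by scaling. It assumes that the display device is planar and, hypothetically, that a, b, c and d are the top-left, top-right, bottom-right and bottom-left corners in that order. All function names, coordinates and the display size are illustrative.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float]


def square_to_quad(quad: Sequence[Point]) -> Tuple[float, ...]:
    """Coefficients (a, b, c, d, e, f, g, h) of the projective map sending
    (0,0), (1,0), (1,1), (0,1) to the four quad corners, in that order:
    x = (a*u + b*v + c) / (g*u + h*v + 1), y = (d*u + e*v + f) / (g*u + h*v + 1).
    """
    (x0, y0), (x1, y1), (x2, y2), (x3, y3) = quad
    dx1, dy1 = x1 - x2, y1 - y2
    dx2, dy2 = x3 - x2, y3 - y2
    sx, sy = x0 - x1 + x2 - x3, y0 - y1 + y2 - y3
    den = dx1 * dy2 - dy1 * dx2
    if abs(den) < 1e-12:
        raise ValueError("degenerate quadrilateral")
    g = (sx * dy2 - sy * dx2) / den
    h = (dx1 * sy - dy1 * sx) / den
    return (x1 - x0 + g * x1, x3 - x0 + h * x3, x0,
            y1 - y0 + g * y1, y3 - y0 + h * y3, y0, g, h)


def unit_to_quad(quad: Sequence[Point], uv: Point) -> Point:
    """Forward map: unit-square coordinates to second-image coordinates."""
    a, b, c, d, e, f, g, h = square_to_quad(quad)
    u, v = uv
    w = g * u + h * v + 1.0
    return ((a * u + b * v + c) / w, (d * u + e * v + f) / w)


def quad_to_unit(quad: Sequence[Point], p: Point) -> Point:
    """Inverse map: second-image point P to unit-square coordinates."""
    a, b, c, d, e, f, g, h = square_to_quad(quad)
    x, y = p
    # Adjugate of [[a, b, c], [d, e, f], [g, h, 1]]; the overall scale cancels.
    w = (d * h - e * g) * x + (b * g - a * h) * y + (a * e - b * d)
    u = ((e - f * h) * x + (c * h - b) * y + (b * f - c * e)) / w
    v = ((f * g - d) * x + (a - c * g) * y + (c * d - a * f)) / w
    return (u, v)


# Display region seen at an angle: a trapezoid in the second image.
abcd = [(100.0, 50.0), (300.0, 80.0), (300.0, 200.0), (100.0, 230.0)]
P = (220.0, 140.0)  # intersection of the diagonals of abcd

u, v = quad_to_unit(abcd, P)
display_point = (u * 1920, v * 1080)
print(display_point)
```

In this example P lies where the diagonals of abcd cross, so it maps to the center of the display even though it is not the midpoint of the trapezoid in image coordinates.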
Referring to
Thereafter, the spatial coordinate conversion unit 160 maps the results of the tracking of the location of the pupil to the visible screen region at step S340, and converts the mapped location of the pupil into spatial coordinates at step S350.
Of course, the spatial coordinate conversion unit 160, prior to the performance of steps S340 and S350, creates a function by performing calibration on the location of the pupil in the visible screen region. At this time, the spatial coordinate conversion unit 160 converts the location of the pupil, mapped to the visible screen region, into spatial coordinates using the created function.
Finally, the spatial coordinate output unit 180 outputs spatial coordinate information obtained at step S350, thereby inputting coordinates based on the tracking of the gaze direction at step S360.
If the location of the pupil has changed at step S370, steps S310 to S360 are repeated until the input of coordinates is terminated.
Although in the embodiment of the present invention, the process of determining the portion of a display device in a visible screen region that is being viewed by a user has been described, it will be apparent that the present invention may be applied to any object to which markers have been attached, such as a poster or a signboard, in addition to the display device.
The present invention is advantageous in that a gaze direction depending on the movement of the location of the pupil in a user's visible region is tracked based on images of the user's pupil and the user's front photographed using at least two cameras and then the results of the tracking are transformed into spatial coordinates, so that it is possible to track a location which is being viewed by a user regardless of the movement of the user's head or the resolution of the screen.
Furthermore, the present invention is advantageous in that once calibration has been performed, it is unnecessary to perform calibration again even when a display device in a visible screen region changes.
Furthermore, the present invention is advantageous in that power consumption can be reduced compared to the case where a light source reflected from the eye is photographed by a camera, because the markers, that is, light sources, attached to or embedded in a display device are directly photographed by a camera, and in that robust detection can be achieved outdoors because a wavelength range in which solar light does not easily reach the Earth's surface is utilized.
Although the preferred embodiments of the present invention have been disclosed for illustrative purposes, those skilled in the art will appreciate that various modifications, additions and substitutions are possible, without departing from the scope and spirit of the invention as disclosed in the accompanying claims.
Number | Date | Country | Kind |
---|---|---|---|
10-2010-0057396 | Jun 2010 | KR | national |