This application relates generally to detecting whether a user intends to provide an input to a system and in particular to detecting a user's intent to provide input to an image of a control.
The appearance of electro-mechanical and other types of switches, buttons, knobs, and controls tends to degrade with repeated use. In addition, the appearance of physical switches and controls is generally fixed so that modifying the language or an iconic image on a physical switch or control requires replacement of the control. Moreover, it is sometimes desired to hide controls when they are not needed. Generally, this is not possible with physical controls without introducing additional structure, such as a movable panel.
Physical switches, buttons, knobs, and controls are useful for detecting whether a user intends to provide an input to a system. There are problems with known techniques for detecting whether a user intends to provide an input where a physical control is not present. One problem is that known techniques are expensive. There is also a problem of “false positives.” A technique may infer that some activity indicates that a user intends to provide an input, but the activity may also sometimes be consistent with a lack of user intent to provide input. When the technique detects the activity and infers intent to provide input, but the user, in fact, does not intend to provide input, the technique provides a false positive. The problem of false positives tends to become more common when known techniques are employed in an inexpensive fashion. However, low cost is important. Accordingly, there is a need for low-cost, robust methods and apparatus for detecting user input provided to a projected user interface.
One aspect is directed to an apparatus for determining whether a user intends to provide an input using an image of a control appearing on a surface. The apparatus may include a unit and a first camera to capture two or more images of the surface. The unit may determine whether first, second, and third conditions are true. The first condition is that a particular number of pixels classified as skin color are present within one cell of the two or more images, the cell having a location substantially coinciding with the image of the control. The second condition is that the pixels classified as skin color persist for at least a particular time period. The third condition is that the pixels classified as skin color have a first shape. The unit may provide a signal indicative of an intent of a user to provide an input if each of the first, second, and third conditions is true.
In one embodiment, the apparatus may include a projector to project one or more user controls onto the surface. In addition, the camera and the surface may be in a fixed spatial relationship with one another.
In one embodiment, the apparatus may include a second camera to capture two or more images of the surface, the second camera being spaced apart from the first camera. The unit may determine whether a fourth condition is true, the fourth condition being that the pixels classified as skin color and having the first shape are within a first distance from the surface. The unit may provide a signal indicative of an intent of a user to provide an input if each of the first, second, third, and fourth conditions is true. In one alternative embodiment, the unit may provide a signal indicative of an intent of a user to provide an input if a majority of the first, second, third, and fourth conditions are true.
In one embodiment, the unit may determine whether a fifth condition is true, the fifth condition being that a particular number of pixels classified as finger-nail color are present within the cell of the two or more images, and that a first count of the pixels classified as finger-nail color at a first time is greater than a second count of the pixels classified as finger-nail color at a second time. The unit may provide a signal indicative of an intent of a user to provide an input if each of the first, second, third, fourth, and fifth conditions is true.
In one embodiment, the unit may determine whether a sixth condition is true, the sixth condition being that a first position at a first time of the pixels classified as skin color is different from a second position at a second time of the pixels classified as skin color. The unit may provide a signal indicative of an intent of a user to provide an input if each of the first, second, third, fourth, fifth, and sixth conditions is true. In one alternative embodiment, the unit may provide a signal indicative of an intent of a user to provide an input if a majority of the first, second, third, fourth, fifth, and sixth conditions are true.
Embodiments are also directed to methods for determining whether a user intends to provide an input using an image of a control appearing on a surface.
While embodiments may be described generally below, it will be appreciated that the principles and concepts described in this specification may be implemented in a wide variety of contexts, including controls for games, computers, kiosks, vending machines, or other types of machines found in a home, office, or factory. In particular, principles and concepts described in this specification are applicable to home appliances, such as those found in the kitchen or laundry, and to entertainment devices found in the home, such as games and audio/video entertainment.
The projector 100 and surface 104 may be installed at particular locations so that the projector and surface are in a fixed spatial relationship with one another. The projection area 106 may have fixed dimensions and may appear at a fixed location on the surface 104. As one example, the projection area 106 may be 40×30 cm. In addition, the digital image that is input to the projector 100 and used by the projector to create a projected image may have fixed horizontal and vertical dimensions. As one example, the input image may be 800×600 pixels, and points “a” and “d” in
As one example of a projected control and projection area,
Turning now to
Still referring to
In one embodiment, the control unit 502 provides a signal to the processing unit 504 indicative of a user's intent to provide input using a projected control. The control unit 502 and processing unit 504 may be coupled with one another via wireless transceivers 510, 512. Alternatively, the units 502, 504 may be coupled by any other desired means, e.g., wire or optical fiber. The processing unit 504 may cause an action in an appliance in response to receipt of a signal indicative of a user's intent to provide input using a projected control.
Projecting an image onto a surface at an angle results in distortion of the image's dimensions known as a “keystone effect.” In embodiments where the projector 100 is positioned to one side of the surface 104, the projected image may be warped prior to projection in order to prevent or minimize keystone distortion. In addition, capturing an image from a planar surface at an angle results in distortion of the image's dimensions similar to the “keystone effect.” The captured image may be inverse-warped to prevent or minimize inverse-keystone distortion prior to determining the physical locations of objects in the captured image.
A search for activity, e.g., operations 602 and 608, may include examining each pixel within a cell and classifying each pixel as either a skin-colored pixel or a non-skin-colored pixel. In one embodiment, pixels of frame 402 are in an RGB color space and a pixel may be classified according to the following known test. Specifically, a pixel may be classified as skin-colored if the following conditions are true:
R>95,G>40,B>20, and (1)
(max{R,G,B}−min{R,G,B})>15, and (2)
|R−G|>15, and (3)
R>G,R>B; (4)
or
R>220,G>210,B>170, and (1)
|R−G|≦15, and (2)
R>B,G>B. (3)
Pixels not satisfying the above conditions are classified as non-skin-colored. The pixel classifying test may classify a region of the finger occupied by a finger nail as non-skin-colored; however, this is not essential. In alternative embodiments, alternative tests for classifying pixels into skin-colored and non-skin-colored pixels may be used. Alternative tests may operate on pixels in color spaces other than RGB, e.g., YUV. As each pixel within a cell is examined, a count of skin-colored pixels may be generated. If the number of pixels within a cell that are classified as skin-colored exceeds a particular threshold, then it may be tentatively concluded that the user's finger is within the boundaries of the cell and the user intends to provide input using the projected control. As one example, the threshold may be 2400 pixels for a 60×60 pixel cell, i.e., two-thirds of the 3600 pixels in the cell. In one alternative, a search for activity, e.g., operations 602 and 608, may include any known edge detection method. If one or more edges between a non-skin-colored region and a skin-colored region are detected within a cell, it may be tentatively concluded that the user's finger is within the boundaries of the cell.
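As an illustrative sketch only, the RGB test and the per-cell count described above might be expressed in Python as follows. The function names and the representation of a cell as rows of 8-bit (R, G, B) tuples are assumptions made for illustration and are not part of the described embodiments.

```python
def is_skin_rgb(r, g, b):
    """Return True if an 8-bit RGB pixel satisfies either set of conditions."""
    # First set of conditions (1)-(4).
    first = (
        r > 95 and g > 40 and b > 20
        and (max(r, g, b) - min(r, g, b)) > 15
        and abs(r - g) > 15
        and r > g and r > b
    )
    # Second set of conditions (1)-(3).
    second = (
        r > 220 and g > 210 and b > 170
        and abs(r - g) <= 15
        and r > b and g > b
    )
    return first or second


def count_skin_pixels(cell):
    """Count skin-colored pixels in a cell given as rows of (R, G, B) tuples."""
    return sum(1 for row in cell for (r, g, b) in row if is_skin_rgb(r, g, b))


def activity_in_cell(cell, threshold=2400):
    """Tentatively report activity when the count exceeds the threshold.

    The default of 2400 is the example threshold for a 60x60 pixel cell.
    """
    return count_skin_pixels(cell) > threshold
```

In this sketch a count equal to the threshold does not qualify, because the text describes the count as exceeding the threshold.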
The valid time operation 618 determines if activity is present for a time period exceeding a threshold time interval. For example, if a sufficient number of skin-color pixels are present for a time period of 0.5 second or longer, then the operation returns a confirming result. On the other hand, if a sufficient number of skin-color pixels is present but for less than the time threshold, then the operation returns an invalid result and it may be concluded that activity is not present.
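One possible way to sketch the valid time test in Python is shown below. The function name, the representation of the input as time-ordered (timestamp, active) pairs, and the choice to reset the timer whenever activity lapses are assumptions for illustration; the 0.5 second default is the example value given above.

```python
def activity_persists(samples, min_duration=0.5):
    """Return True if activity is continuously present for min_duration seconds.

    samples: time-ordered list of (timestamp_seconds, active) pairs, where
    active is True when the cell held enough skin-colored pixels in that frame.
    """
    run_start = None  # timestamp at which the current run of activity began
    for timestamp, active in samples:
        if not active:
            run_start = None  # assumption: any gap restarts the timer
            continue
        if run_start is None:
            run_start = timestamp
        if timestamp - run_start >= min_duration:
            return True
    return False
```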
The valid shape operation 620 may include determining if the shape of the skin-colored pixel region matches a valid shape. If the detected shape matches a valid shape, then the operation returns a confirming result. On the other hand, if the detected shape fails to match a valid shape, then the operation returns an invalid result and it may be concluded that activity is not present.
Referring again to
The elevation of object 1100 above a projection surface 104 is “Y” and the elevation of the cameras 400 and 900 is “H.” The distance between the cameras 108 and 110 is “L.” While
To summarize,
Referring now to
The operation 1302 may include examining each pixel within a cell and classifying each pixel as either a skin-colored pixel or a non-skin-colored pixel. In one embodiment, pixels of frame 402 may be received from camera 102 in a YUV color space or the received frame may be converted to a YUV color space, and a pixel may be classified as skin-colored according to the following test: If
102<U<128,
102<V<128,
115<U+128<145,
150<V+128<170, and
100<Y<200,
the pixel may be considered skin color. The area 1206 of image 1200 shows an area where pixels may be classified as skin-colored pixels. The operation 1302 is not limited to the skin-color test set forth above. In other embodiments, any suitable alternative skin-color test may be used.
The operation 1304 may include examining each pixel within a cell and classifying each pixel as either a finger-nail-colored pixel or a non-finger-nail-colored pixel. In one embodiment, pixels of frame 402 may be classified according to the following known test: If
102<U<128,
102<V<128,
105<U+128<145,
158<V+128<170,
100<Y<200,
the pixel may be considered nail color. The area 1208 of image 1202 shows an area where pixels may be classified as finger-nail-colored pixels. The operation 1304 is not limited to the nail-color test set forth above. In other embodiments, any suitable alternative nail-color test may be used. One advantage of using a nail-color test is that the variation in color of fingernails among humans of various races is believed to be smaller than the variation in color of skin.
The operation 1306 may include comparing the number of nail-colored pixels with a minimum threshold. This operation is useful for detecting situations where a condition is present that may interfere with a test described below. For example, if the user is wearing nail polish, the counted number of nail-colored pixels is not likely to satisfy the minimum threshold.
The operation 1308 may include comparing a count of the number of nail-colored pixels for a current frame with a corresponding count from a previous frame. The previous frame may be any frame within a particular preceding time period, e.g., ½ second. The count of fingernail-colored pixels may change as a result of the finger being pressed against the surface 104. As shown in image 1204, an area 1210 of the fingernail turns white when the finger is pressed against the surface 104 due to blood being forced out of part of the tissue under and adjacent to the fingernail. While the area 1210 may take a variety of shapes and sizes depending on the particular user, the area 1210 will be captured by the camera 400 as an area of generally white pixels not satisfying either of the classification tests for skin-colored or finger-nail-colored pixels. Accordingly, when the finger is pressed against the surface 104, the count of finger-nail-colored pixels will be lower than at times when the finger is not pressed against the surface, e.g., the count of finger-nail-colored pixels in image 1202 will be greater than the count of finger-nail-colored pixels in image 1204. The operation 1310 may include determining whether a difference between a count of a current frame and a previous frame is greater than a threshold. As one example, a pixel difference threshold may be 30 pixels. In one alternative embodiment, the operation 1308 may determine if the number of white pixels in the fingernail region or in the cell exceeds a threshold. The presence of white pixels is due, as mentioned, to a portion of the fingernail turning white when the finger is pressed against a surface.
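The count comparison of operations 1306, 1308, and 1310 can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function name is hypothetical, the minimum nail-pixel threshold of 50 is an invented placeholder (no value is given in this specification), the minimum is applied to the earlier frame's count, and a result of None stands for the case where too few nail-colored pixels are available. The 30-pixel difference is the example value given above.

```python
def press_from_nail_counts(prev_count, curr_count,
                           min_nail_pixels=50, diff_threshold=30):
    """Compare nail-colored pixel counts from a previous and a current frame.

    Returns True if the count dropped by more than diff_threshold pixels,
    False if it did not, and None if the earlier count is too small to be
    usable (for example, because the user wears nail polish).
    """
    # Operation 1306 (assumed here to apply to the earlier, unpressed frame).
    if prev_count < min_nail_pixels:
        return None
    # Operations 1308 and 1310: pressing whitens part of the nail, so the
    # count of nail-colored pixels is expected to fall.
    return (prev_count - curr_count) > diff_threshold
```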
A user may wear fingernail polish, which may interfere with the classification of particular pixels as nail colored and the comparing of counts of nail-colored pixels. Generally speaking, the hands and fingers of all users have some degree of tremor, tremble, or involuntary shaking movement, i.e., a user's outstretched hand will generally have at least some tremble, the degree of tremble depending on the particular individual. However, when the hand or finger is placed against an object, the tremor or tremble generally stops. The operations 1312 and 1314 may be employed where it is difficult or impossible to compare counts of nail-colored pixels. The operation 1312 evaluates a region of interest 1400 comprised of a matrix of skin-colored and non-skin-colored pixels.
In one embodiment, the operations 1312 and 1314 may be performed in addition to operations 1308 and 1310 to provide additional confirmation or confidence. For example, the operation 1312 may be performed subsequent to operation 1310 as shown by the dashed line in
In one embodiment, two or more validation tests may be performed and the results combined to determine whether a tentative conclusion that a user intends to provide input using a projected control should be confirmed.
If activity is detected within one cell, operations 1506, 1508, 1510, 1512, and 1514 may be invoked. Each of the operations 1506-1514 may independently determine whether a tentative conclusion should be confirmed and may return an indication of its determination. The operations 1506, 1508, and 1510 correspond respectively with the operations 618, 620, and 622 described above. In addition, the operation 1512 corresponds with the operations 1302, 1304, 1306, 1308, and 1310. Further, the operation 1514 corresponds with the operations 1302, 1304, 1306, 1312, and 1314. A decision operation 1516 receives the confirming/non-confirming indications of each of the operations 1506-1514. In one embodiment, each indication is given one vote and a majority of votes determines whether to confirm the tentative conclusion that the user intends to use the projected control. In one embodiment, the operation 1512 may return an “abstaining” indication if the operation is unable to detect a sufficient number of fingernail-colored pixels. In alternative embodiments, the operation 1516 may include a decision based on a weighted polling of validation tests. The method 1300 provides the advantage that a group of tests will always outperform most of the individual tests. A further advantage of the method 1300 is that each of the tests is relatively inexpensive to implement.
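The polling performed by the decision operation 1516 might be sketched in Python as shown below. The function name, the encoding of indications as True (confirming), False (non-confirming), and None (abstaining), and the treatment of a tie as non-confirming are assumptions for illustration. Passing weights gives one possible form of the weighted polling mentioned above.

```python
def confirm_intent(indications, weights=None):
    """Combine validation-test indications by simple or weighted majority.

    indications: sequence of True, False, or None (abstain), one per test.
    weights: optional sequence of non-negative numbers, one per test;
             when omitted, each test receives one vote.
    """
    indications = list(indications)
    if weights is None:
        weights = [1] * len(indications)
    # Abstaining tests contribute to neither side.
    confirming = sum(w for v, w in zip(indications, weights) if v is True)
    rejecting = sum(w for v, w in zip(indications, weights) if v is False)
    # Assumption: a tie does not confirm the tentative conclusion.
    return confirming > rejecting
```

For example, with five tests returning three confirming and two non-confirming indications, the sketch confirms the tentative conclusion; if one of the confirming tests abstains instead, the result is a tie and the conclusion is not confirmed.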
The use of projected controls includes advantages such as the appearance of the control not being degraded (e.g., worn down) with repeated physical contact, the appearance of the control being readily modifiable, and the ability to hide the control when it is not needed. While embodiments have been described in terms of detecting a user's intent to provide input to a projected user interface, it is not essential that the image of the control be a projected image. In one embodiment, the one or more projected controls described in the various embodiments above may be replaced with a non-projected image on the surface 104, such as a painted image, an engraved image, an image applied as a decal, label, or sticker, or other non-projected image.
The methods and their variations described above may be implemented in hardware, software, or in a combination of hardware and software. Software for implementing all or part of any method described above may be stored in any suitable memory for execution by a control unit 502 or processing unit 504.
It should be understood that the embodiments described above may employ various computer-implemented operations involving data stored in computer systems. These operations are those requiring physical manipulation of physical quantities. Usually, though not necessarily, these quantities take the form of electrical or magnetic signals capable of being stored, transferred, combined, compared, and otherwise manipulated. Further, the manipulations performed may be referred to in terms, such as producing, identifying, determining, or comparing.
Any of the operations described in this specification that form part of the embodiments are useful machine operations. As described above, some embodiments relate to a device or an apparatus specially constructed for performing these operations. It should be appreciated, however, that the embodiments may be employed in a general purpose computer selectively activated or configured by a computer program stored in the computer. In particular, various general purpose computer systems may be used with computer programs written in accordance with the teachings herein. Accordingly, it should be understood that the embodiments may also be embodied as computer readable code on a computer readable medium.
A computer readable medium is any data storage device that can store data which can be thereafter read by a computer system. Examples of the computer readable medium include, among other things, flash drives, floppy disks, memory cards, hard drives, RAMs, ROMs, EPROMs, compact disks, and magnetic tapes. In one embodiment, any method described above may be stored as a program of instructions on a computer readable medium.
Although the present invention has been fully described by way of the embodiments described in this specification with reference to the accompanying drawings, various changes and modifications will be apparent to those having skill in this field. Therefore, unless these changes and modifications depart from the scope of the present invention, they should be construed as being included in this specification.
The present application claims the benefit under 35 USC Section 119(e) of U.S. Provisional Patent Application Ser. No. 61/325,088, filed Apr. 16, 2010, entitled “Projected User Interface.” The present application is based on and claims priority from this provisional application, the disclosure of which is hereby expressly incorporated herein by reference in its entirety.
Number | Date | Country
---|---|---
61325088 | Apr 2010 | US