A graphical user interface (“GUI”) allows users to interact with electronic devices based on images rather than text commands. For example, a GUI can represent information and/or actions available to users through graphical icons and visual indicators. Such a representation is more intuitive and easier to operate than text-based interfaces, typed command labels, or text navigation.
To interact with GUIs, users typically utilize mice, touchscreens, touchpads, joysticks, and/or other human-machine interfaces (“HMIs”). However, such HMIs may not be suitable for certain applications. For example, mice may lack sufficient mobility for use with smart phones or tablet computers. Instead, touchscreens are typically used for such handheld devices. However, touchscreens may not allow precise cursor control because of limited operating surface area and/or touchscreen resolution. Various hands-free techniques have also been developed to interact with GUIs without HMIs. Example hands-free techniques include voice recognition and camera-based head tracking. These conventional hands-free techniques, however, can be difficult to use and limited in functionality when compared to HMIs.
Various embodiments of electronic systems, devices, and associated methods of user input recognition are described below. The term “gesture” as used herein generally refers to a representation or expression based on a position, an orientation, and/or a movement trajectory of a finger, a hand, other parts of a user, and/or an object associated therewith. For example, a gesture can include a user's finger holding a generally static position (e.g., a canted position) relative to a reference point or plane. In another example, a gesture can include a user's finger moving toward or away from a reference point or plane over a period of time. In further examples, a gesture can include a combination of static and dynamic representations and/or expressions. A person skilled in the relevant art will also understand that the technology may have additional embodiments, and that the technology may be practiced without several of the details of the embodiments described below with reference to
In the illustrated embodiment, the finger 105 is shown as an index finger on a left hand of the user 101. In other embodiments, the finger 105 can also be any other suitable finger on either the left or right hand of the user 101. Even though the electronic system 100 is described below as being configured to monitor only the finger 105 for user input, in further embodiments, the electronic system 100 can also be configured to monitor two, three, or any suitable number of fingers on the left hand and/or right hand of the user 101 for user input. In yet further embodiments, the electronic system 100 can also be configured to monitor at least one object (e.g., an input device 102 in
The detector 104 can be configured to acquire images of and/or otherwise detect a current position of the finger 105 of the user 101. In the following description, a camera (e.g., Webcam C500 provided by Logitech of Fremont, Calif.) is used as an example of the detector 104. In other embodiments, the detector 104 can also include an IR camera, laser detector, radio frequency (“RF”) receiver, ultrasonic transducer, radar detector, and/or other suitable types of radio, image, and/or sound capturing component. Even though only one detector 104 is shown in
The output device 106 can be configured to provide textual, graphical, sound, and/or other suitable types of feedback or display to the user 101. For example, as shown in
The controller 118 can include a processor 120 coupled to a memory 122 and an input/output interface 124. The processor 120 can include a microprocessor (e.g., an A5 processor provided by Apple, Inc. of Cupertino, Calif.), a field-programmable gate array, and/or other suitable logic processing component. The memory 122 can include a volatile and/or nonvolatile computer readable medium (e.g., ROM, RAM, magnetic disk storage media, optical storage media, flash memory devices, EEPROM, and/or other suitable non-transitory storage media) configured to store data received from, as well as instructions for, the processor 120. The input/output interface 124 can include a driver for interfacing with a camera, display, touch screen, keyboard, track ball, gauge or dial, and/or other suitable types of input/output devices.
In certain embodiments, the controller 118 can be operatively coupled to the other components of the electronic system 100 via a hardwire communication link (e.g., a USB link, an Ethernet link, an RS232 link, etc.). In other embodiments, the controller 118 can be operatively coupled to the other components of the electronic system 100 via a wireless connection (e.g., a WIFI link, a Bluetooth link, etc.). In further embodiments, the controller 118 can be configured as an application specific integrated circuit, system-on-chip circuit, programmable logic controller, and/or other suitable computing framework.
In certain embodiments, the detector 104, the output device 106, and the controller 118 can be configured as a desktop computer, a laptop computer, a tablet computer, a smart phone, an electronic whiteboard, and/or other suitable types of electronic devices. In other embodiments, the output device 106 may be at least a part of a television set. The detector 104 and/or the controller 118 may be integrated into or separate from the television set. In further embodiments, the controller 118 and the detector 104 may be configured as a unitary component (e.g., a game console, a camera unit, or a projector unit), and the output device 106 may include a television screen, a projected screen, and/or other suitable displays. In further embodiments, the detector 104, the output device 106, and/or the controller 118 may be independent from one another or may have other suitable configurations.
Embodiments of the electronic system 100 can allow the user 101 to operate in a touch-free fashion by, for example, positioning, orienting, moving, and/or otherwise gesturing with the finger 105. For example, the electronic system 100 can monitor a position, orientation, movement, and/or other gesture of the finger 105 and correlate the monitored gesture with a computing command, move instruction, and/or other suitable types of instruction. Techniques for determining a position, orientation, movement, and/or other gestures of the finger 105 can include monitoring and identifying a shape, color, and/or other suitable characteristics of the finger 105, as described in U.S. patent application Ser. Nos. 08/203,603 and 08/468,358, the disclosures of which are incorporated herein in their entirety.
In one operating mode, the user 101 can issue a move instruction by producing a movement of the finger 105 between a start position 107a and an end position 107b as indicated by an arrow 107. In response, the electronic system 100 detects the produced movement of the finger 105 via the detector 104, and then generates a move instruction by mapping the start and end positions 107a and 107b to the output device 106. The electronic system 100 then executes the move instruction by, for example, moving the computer cursor 108 from a first position 109a to a second position 109b corresponding to the start and end positions 107a and 107b of the finger 105.
In another operating mode, the user 101 can also issue a computing command to the electronic system 100. In the example above, after the user 101 has moved the computer cursor 108 to at least partially overlap the mail 111, the user 101 can then produce a gesture to signal an open command. An example gesture for an open command can include moving the finger 105 toward the detector 104 in a continuous motion and returning immediately to approximately the original position. Other example gestures are described in U.S. patent application Ser. No. 13/363,569, the disclosure of which is incorporated herein in its entirety. The electronic system 100 then detects and interprets the movement of the finger 105 as corresponding to an open command before executing the open command to open the mail 111. Details of a process suitable for operations of the electronic system 100 are described below with reference to
Even though the electronic system 100 in
In certain embodiments, the input device 102 can include at least one marker 103 (only one is shown in
In other embodiments, the marker 103 can include a non-powered (i.e., passive) component. For example, the marker 103 can include a reflective material that produces the signal 110 by reflecting at least a portion of the illumination 114 from the optional illumination source 112. The reflective material can include aluminum foils, mirrors, and/or other suitable materials with sufficient reflectivity. In further embodiments, the input device 102 may include a combination of powered and passive components. In any of the foregoing embodiments, one or more markers 103 may be configured to emit the signal 110 with a generally circular, triangular, rectangular, and/or other suitable pattern. In yet further embodiments, the marker 103 may be omitted.
The electronic system 100 with the input device 102 can operate in a generally similar fashion to that described above with reference to
When implementing several embodiments of user input recognition discussed above, the inventors discovered that one difficulty of monitoring and recognizing gestures of the finger 105 is distinguishing between natural shaking and intended movements or gestures of the finger 105. Without being bound by theory, it is believed that human hands (and fingers) exhibit certain amounts of natural tremor, shakiness, or unsteadiness (collectively referred to herein as “jitter”) when held in air. The inventors have recognized that the natural shakiness may mislead, confuse, and/or otherwise affect gesture recognition of the finger 105. In response, several embodiments of the electronic system 100 are configured to identify and/or remove natural shakiness of the finger 105 (or the hand of the user 101) from intended movements or gestures, as discussed in more detail below with reference to
The inventors have also discovered that distinguishing gestures corresponding to move instructions from those corresponding to computing commands is useful for providing a good user experience. For instance, in the example shown in
In operation, the input module 132 can accept data input 150 (e.g., images from the detector 104 in
The process module 136 analyzes the data input 150 from the input module 132 and/or other data sources, and the output module 138 generates output signals 152 based on the analyzed data input 150. The processor 120 may include the display module 140 for displaying, printing, or downloading the data input 150, the output signals 152, and/or other information via the output device 106 (
The sensing module 160 is configured to receive the data input 150 and identify the finger 105 (
The calculation module 166 may include routines configured to perform various types of calculations to facilitate operation of other modules. For example, the calculation module 166 can include a sampling routine configured to sample the data input 150 at regular time intervals along preset directions. In certain embodiments, the sampling routine can include linear or non-linear interpolation, extrapolation, and/or other suitable subroutines configured to generate a set of data, images, or frames from the detector 104 (
The calculation module 166 can also include a modeling routine configured to determine a position and/or orientation of the finger 105 and/or the input device 102 relative to the detector 104. In certain embodiments, the modeling routine can include subroutines configured to determine and/or calculate parameters of the processed image. For example, the modeling routine may include subroutines to determine an angle of the finger 105 relative to a reference plane. In another example, the modeling routine may also include subroutines that calculate a quantity of markers 103 in the processed image and/or a distance between individual pairs of the markers 103.
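By way of a non-limiting illustration only, the modeling subroutines described above might be sketched in Python as follows. The function names, the tuple-based point representation, and the default plane normal are assumptions made for this example rather than part of any described embodiment.

```python
import math
from itertools import combinations

def finger_angle_to_plane(fingertip, knuckle, plane_normal=(0.0, 0.0, 1.0)):
    """Angle (in degrees) between the finger direction and a reference plane."""
    # Finger direction vector from the knuckle toward the fingertip.
    d = tuple(f - k for f, k in zip(fingertip, knuckle))
    dot = sum(a * b for a, b in zip(d, plane_normal))
    d_len = math.sqrt(sum(a * a for a in d))
    n_len = math.sqrt(sum(a * a for a in plane_normal))
    # The angle between a line and a plane is arcsin(|d . n| / (|d| |n|)).
    return math.degrees(math.asin(abs(dot) / (d_len * n_len)))

def marker_distances(markers):
    """Distance between each pair of detected markers (e.g., markers 103)."""
    return {
        (i, j): math.dist(markers[i], markers[j])
        for i, j in combinations(range(len(markers)), 2)
    }
```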
In another example, the calculation module 166 can also include a trajectory routine configured to form a temporal trajectory of the finger 105 and/or the input device 102. As used herein, the term “temporal trajectory” generally refers to a spatial trajectory of a subject of interest (e.g., the finger 105 or the input device 102) over time. In one embodiment, the calculation module 166 is configured to calculate a vector representing a movement of the finger 105 and/or the input device 102 from a first position/orientation at a first time point to a second position/orientation at a second time point. In another embodiment, the calculation module 166 is configured to calculate a vector array or plot a trajectory of the finger 105 and/or the input device 102 based on multiple positions/orientations at various time points.
In other embodiments, the calculation module 166 can include linear regression, polynomial regression, interpolation, extrapolation, and/or other suitable subroutines to derive a formula, a linear fit, and/or other suitable representation of movements of the finger 105 and/or the input device 102. In yet other embodiments, the calculation module 166 can include routines to compute a travel distance, travel direction, velocity profile, and/or other suitable characteristics of the temporal trajectory. In further embodiments, the calculation module 166 can also include counters, timers, and/or other suitable routines to facilitate operation of other modules, as discussed in more detail below with reference to
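As a further non-limiting sketch, assuming the temporal trajectory is available as timestamped two-dimensional samples, a trajectory routine along the lines described above could compute a travel distance, an overall travel direction, and a velocity profile as follows; the sample format and function name are illustrative assumptions.

```python
import math

def trajectory_characteristics(samples):
    """samples: list of (t, x, y) tuples ordered by time, with at least two entries."""
    segments = list(zip(samples, samples[1:]))
    # Total path length traveled by the finger or input device.
    travel_distance = sum(
        math.hypot(x2 - x1, y2 - y1) for (_, x1, y1), (_, x2, y2) in segments
    )
    # Velocity profile: speed of each segment (segment length over elapsed time).
    velocity_profile = [
        math.hypot(x2 - x1, y2 - y1) / (t2 - t1)
        for (t1, x1, y1), (t2, x2, y2) in segments
    ]
    # Overall travel direction from the first sample to the last, in degrees.
    (_, x0, y0), (_, xn, yn) = samples[0], samples[-1]
    travel_direction = math.degrees(math.atan2(yn - y0, xn - x0))
    return travel_distance, travel_direction, velocity_profile
```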
The analysis module 162 can be configured to analyze the calculated temporal trajectory of the finger 105 and/or the input device 102 to determine a corresponding user gesture. In certain embodiments, the analysis module 162 analyzes characteristics of the calculated temporal trajectory and compares the characteristics to the gesture database 142. For example, in one embodiment, the analysis module 162 can compare a travel distance, travel direction, velocity profile, and/or other suitable characteristics of the temporal trajectory to known actions or gestures in the gesture database 142. If a match is found, the analysis module 162 is configured to indicate the particular gesture identified.
The analysis module 162 can also be configured to correlate the identified gesture to a control instruction based on the gesture map 144. For example, if the identified user action is a lateral move from left to right, the analysis module 162 may correlate the gesture to a move instruction for a lateral cursor shift from left to right, as shown in
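For illustration only, the gesture database 142 and gesture map 144 can be pictured as lookup tables, as in the hedged Python sketch below; the particular gesture labels and instruction names are assumptions chosen for the example.

```python
# Illustrative gesture map correlating identified gestures to control
# instructions. The entries are examples only, not the actual gesture map 144.
GESTURE_MAP = {
    "lateral_move_left_to_right": ("move", "shift_cursor_right"),
    "push_toward_detector_and_return": ("command", "open"),
}

def correlate(identified_gesture):
    """Return an (instruction type, instruction) pair for an identified gesture."""
    return GESTURE_MAP.get(identified_gesture, ("none", None))
```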
The control module 164 may be configured to control the operation of the electronic system 100 (
Referring to
Another stage 204 of the process 200 can include monitoring a position, orientation, movement, or gesture of the finger 105 relative to the virtual frame. For example, the detector 104 can detect, acquire, and/or record positions of the finger 105 relative to the virtual frame over time. The detected positions of the finger 105 may then be used to form a temporal trajectory. The controller 118 can then compare the formed temporal trajectory with known actions or gestures in the gesture database 142 (
The process 200 can include a decision stage 206 to determine if the gesture of the finger 105 corresponds to a computing command. If the gesture corresponds to a computing command, in one embodiment, the process 200 includes inserting the computing command into a buffer (e.g., a queue, stack, and/or other suitable types of data structure) awaiting execution by the processor 120 of the controller 118 at stage 208. In another embodiment, the process 200 can also include modifying a previously inserted computing command and/or move instruction in the buffer at stage 208. For example, a previously inserted move instruction may be deleted from the buffer before being executed. Subsequently, a computing command is inserted into the buffer. The process 200 then includes executing commands in the buffer after a certain amount of delay at stage 210. In one embodiment, the delay is about 0.1 seconds. In other embodiments, the delay can be about 10 milliseconds, about 20 milliseconds, about 50 milliseconds, about 0.5 seconds, and/or other suitable amount of delay.
Several embodiments of the process 200 can thus at least ameliorate the difficulty of distinguishing between gestures for move instructions and those for computing commands. For example, when a movement of the finger 105 is first detected, the movement may be insufficient (e.g., short travel distance, low speed, etc.) to be recognized as a computing command. Thus, move instructions may be inserted into the buffer based on the detected movement. After a certain period of time (e.g., 0.5 seconds), the movement of the finger 105 is sufficient to be recognized as a gesture corresponding to a computing command. In response, the process 200 includes deleting the previously inserted move instruction and inserting the computing command instead. As such, the computer cursor 108 may be maintained generally stationary when the user 101 issues a computing command after moving the computer cursor 108 to a desired location.
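The buffering behavior of stages 208 and 210 might be rendered roughly as follows in Python, using the example delay of about 0.1 seconds mentioned above; the instruction tuples and function names are hypothetical and serve only to illustrate how a pending move instruction could be replaced by a computing command before execution.

```python
import time
from collections import deque

DELAY_S = 0.1          # example delay before executing buffered instructions
buffer = deque()       # each entry: (timestamp, kind, payload)

def insert_move(payload):
    """Queue a move instruction derived from the monitored gesture."""
    buffer.append((time.monotonic(), "move", payload))

def insert_command(payload):
    """Queue a computing command, dropping any still-pending move instructions
    so the cursor stays put while the user issues the command."""
    pending = [entry for entry in buffer if entry[1] != "move"]
    buffer.clear()
    buffer.extend(pending)
    buffer.append((time.monotonic(), "command", payload))

def execute_due(execute):
    """Execute buffered entries older than DELAY_S via the supplied callback."""
    now = time.monotonic()
    while buffer and now - buffer[0][0] >= DELAY_S:
        _, kind, payload = buffer.popleft()
        execute(kind, payload)
```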
If the gesture does not correspond to a computing command, the process 200 includes detecting jittering at stage 214 to determine if at least a portion of the monitored temporal trajectory of the finger 105 corresponds to natural shakiness of a human hand. In certain embodiments, detecting jittering can include analyzing the monitored temporal trajectory of the finger 105 for an established direction. In other embodiments, detecting jitter can include analyzing a travel distance, a travel speed, other suitable characteristics of the temporal trajectory, and/or combinations thereof. Several embodiments of detecting jitter by analyzing the monitored temporal trajectory for an established direction are described in more detail below with reference to
The process 200 then includes another decision stage 216 to determine if jittering is detected. If jittering is detected, the process 200 includes adjusting the virtual frame to counteract (e.g., at least reduce or even cancel) the impact of the detected jittering at stage 218. For example, the virtual frame may be adjusted based on the amount, direction, and/or other suitable characteristics of the monitored temporal trajectory of the finger 105. In one embodiment, a center of the virtual frame is shifted by an amount that is generally equal to an amount of detected jittering along generally the same direction. In other embodiments, the virtual frame may be tilted, scaled, rotated, and/or may have other suitable adjustments.
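A minimal sketch of one such adjustment, assuming the virtual frame is represented by its center point and the detected jitter is summarized as a displacement vector, is shown below; shifting the frame center along with the jitter leaves the finger's mapped position unchanged.

```python
def adjust_frame_center(frame_center, jitter_vector):
    """Shift the virtual frame center by the detected jitter displacement.

    frame_center and jitter_vector are (x, y) tuples. Moving the frame by the
    same amount and in the same direction as the jitter counteracts its effect
    on the cursor position mapped from the frame.
    """
    cx, cy = frame_center
    jx, jy = jitter_vector
    return (cx + jx, cy + jy)
```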
The process 200 can also include detecting slight motions of the finger 105 at stage 220. The inventors have recognized that the user 101 may utilize slight motions of the finger 105 for finely adjusting and/or controlling a position of the computer cursor 108. Unfortunately, such slight motions may have characteristics generally similar to those of jittering. As a result, the electronic system 100 may misconstrue such slight motions as jittering.
Several embodiments of the process 200 can recognize such slight motions to allow fine control of cursor position on the output device 106. As used herein, the term “slight motion” generally refers to a motion having a travel distance, directional change, and/or other motion characteristics generally similar to jittering of a user's hand. In certain embodiments, recognizing slight motions may include performing linear regressions on the temporal trajectory of the finger 105 and determining a slope of the regressed fit, as discussed in more detail below with reference to
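One possible, non-limiting reading of the regression-based test is sketched below in Python: a least-squares line is fit to recent finger positions, and samples that align well with the fitted line (small residual scatter) are treated as an intentional slight motion rather than jitter. The residual criterion and threshold value are assumptions added for this example.

```python
def fit_slope(points):
    """Least-squares slope and mean squared residual of (x, y) samples."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx if sxx else 0.0
    residual = sum(
        (y - (mean_y + slope * (x - mean_x))) ** 2 for x, y in points
    ) / n
    return slope, residual

def looks_like_slight_motion(points, max_residual=2.0):
    """Assumed test: well-aligned samples suggest a deliberate slight motion."""
    _, residual = fit_slope(points)
    return residual <= max_residual
```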
The process 200 then includes generating a move instruction at stage 222 if no jittering is detected or a slight motion is determined. Generating the move instruction can include computing a computer cursor position based on the temporal trajectory of the finger 105 and mapping the computed cursor position to the output device 106. The process 200 then proceeds to inserting the generated move instruction into the buffer at stage 208.
The process 200 then includes a decision stage 212 to determine if the process 200 should continue. In one embodiment, the process is continued if further movement of the finger 105 and/or the input device 102 is detected. In other embodiments, the process 200 may be continued based on other suitable criteria. If the process is continued, the process reverts to monitoring finger gesture at stage 204; otherwise, the process ends.
Even though the process 200 is shown in
Based on the detected position of the finger 105, the process 202 can include defining a virtual frame at stage 226. In one embodiment, the virtual frame includes an x-y plane (or a plane generally parallel thereto) in an x-y-z coordinate system based on a fingertip position of the finger 105. For example, the virtual frame can be a rectangular plane generally parallel to the output device 106 and have a center that generally coincides with the detected position of the finger 105. The virtual frame can have a size generally corresponding to a movement range along the x-, y-, and z-axes of the finger 105. In other embodiments, the virtual frame may have other suitable locations and/or orientations. An example virtual frame is discussed in more detail below with reference to
The process 202 then includes mapping the virtual frame to the output device 106 at stage 228. In one embodiment, the virtual frame is mapped to the output device 106 based on a display size of the output device 106 (e.g., in number of pixels). As a result, each finger position in the virtual frame has a corresponding position on the output device 106. In other embodiments, the virtual frame may be mapped to the output device 106 in other suitable fashions. The process 202 then returns with the initiated virtual frame.
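As a hedged illustration of stages 226 and 228, the sketch below centers a rectangular virtual frame on the detected fingertip and maps finger positions inside the frame to display pixels; the frame and display dimensions are example values, not parameters of any described embodiment.

```python
def define_virtual_frame(fingertip_xy, frame_width=320.0, frame_height=240.0):
    """Rectangular virtual frame centered on the detected fingertip position."""
    cx, cy = fingertip_xy
    return {"cx": cx, "cy": cy, "w": frame_width, "h": frame_height}

def map_to_display(frame, finger_xy, display_width=1920, display_height=1080):
    """Map a finger position inside the virtual frame to a display position."""
    fx, fy = finger_xy
    # Normalize to [0, 1] relative to the frame, then scale to display pixels.
    u = (fx - (frame["cx"] - frame["w"] / 2)) / frame["w"]
    v = (fy - (frame["cy"] - frame["h"] / 2)) / frame["h"]
    u = min(max(u, 0.0), 1.0)
    v = min(max(v, 0.0), 1.0)
    return (u * display_width, v * display_height)
```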
As shown in
The process 214 then includes acquiring a section and labeling the acquired section as jitter at stage 234. In one embodiment, acquiring a section includes detecting a position of the finger 105 relative to the virtual frame and calculating a vector based on the detected position and a previous position with respect to time. In other embodiments, acquiring a section may include retrieving at least two positions of the finger 105 from the memory 122 (
The process 214 then includes a decision stage 236 to determine if the section count has a value that is greater than zero. If the section count currently has a value of zero, the process 214 includes another decision stage 238 to determine if the section length of the acquired section is greater than the length threshold D. If the section length is greater than the length threshold D, the process 214 includes incrementing the section count at stage 240 before the process returns. The section count may be incremented by one or any other suitable integer. If the section length is not greater than the length threshold D, the process returns without incrementing the section count.
If the section count has a current value that is greater than zero, the process 214 then includes calculating a direction change of the current section at stage 242. In one embodiment, calculating a direction change includes calculating an angle change between a direction of the current section and that defined by prior positions of the finger 105. An example angle change is schematically shown in
The process 214 then includes a decision stage 244 to determine if the section length is greater than the length threshold D and the calculated direction change is lower than an angle change threshold A. If not, the process 214 includes resetting the section count, for example, to zero and optionally indicating that the plurality of spatial positions of the user's finger or the object associated with the user's finger corresponds to natural shakiness at stage 250. If so, the process 214 includes another decision stage 246 to determine if the section count has a current value greater than a count threshold N. The count threshold N may be predetermined to correspond to a minimum number of sections that indicate an intentional movement of the finger 105. In one embodiment, the count threshold N is three. In other embodiments, the count threshold N can be 1, 2, 4, or any other suitable integer value.
If the section count has a current value greater than the count threshold N, in one embodiment, the process 214 includes labeling the current section as not jitter at stage 248. In other embodiments, the process 214 may also label at least some or all of the previous sections in the section count as not jitter at stage 248. The process then returns. If the section count has a current value not greater than the count threshold N, the process 214 includes proceeding to incrementing the section count at stage 240 before the process returns.
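For illustration only, the section-counting logic of stages 234 through 250 might be approximated by the Python sketch below, which tracks a single finger in two dimensions; the threshold values D, A, and N, the angle arithmetic, and the function signature are assumptions made for the example.

```python
import math

LENGTH_THRESHOLD_D = 5.0   # example minimum section length
ANGLE_THRESHOLD_A = 30.0   # example maximum direction change, in degrees
COUNT_THRESHOLD_N = 3      # example minimum number of consistent sections

def classify_section(prev_pos, curr_pos, prev_dir, section_count):
    """Classify one trajectory section as jitter or not.

    Returns (is_jitter, new_direction, new_section_count). prev_dir may be
    None until a direction has been established (section_count > 0).
    """
    dx, dy = curr_pos[0] - prev_pos[0], curr_pos[1] - prev_pos[1]
    length = math.hypot(dx, dy)
    direction = math.degrees(math.atan2(dy, dx))

    if section_count == 0:
        # No established direction yet: a long enough section starts the count.
        if length > LENGTH_THRESHOLD_D:
            return True, direction, section_count + 1
        return True, prev_dir, section_count

    # Direction change relative to the previously established direction.
    change = abs((direction - prev_dir + 180.0) % 360.0 - 180.0)
    if length > LENGTH_THRESHOLD_D and change < ANGLE_THRESHOLD_A:
        if section_count > COUNT_THRESHOLD_N:
            # Enough consistent sections: treat the motion as intentional.
            return False, direction, section_count
        return True, direction, section_count + 1

    # Direction not maintained: reset the count and treat as natural shakiness.
    return True, direction, 0
```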
The virtual frame 114 also includes first, second, third, and fourth peripheral frames 119a, 119b, 119c, and 119d shown in
In the example shown in
From the foregoing, it will be appreciated that specific embodiments of the disclosure have been described herein for purposes of illustration, but that various modifications may be made without deviating from the disclosure. In addition, many of the elements of one embodiment may be combined with other embodiments in addition to or in lieu of the elements of the other embodiments. Accordingly, the technology is not limited except as by the appended claims.