Embodiments of the invention generally relate to the field of electronic devices and, more particularly, to a method and apparatus for touch sensor gesture recognition for operation of mobile devices.
Mobile devices, including cellular phones, smart phones, mobile Internet devices (MIDs), handheld computers, personal digital assistants (PDAs), and other similar devices, provide a wide variety of applications for various purposes, including business and personal use.
A mobile device requires one or more input mechanisms to allow a user to input instructions and responses for such applications. As mobile devices become smaller yet more full-featured, a reduced number of user input devices (such as switches, buttons, trackballs, dials, touch sensors, and touch screens) are used to perform an increasing number of application functions.
However, conventional input devices are limited in their ability to accurately reflect the variety of inputs that are possible with complex mobile devices. Conventional device inputs may respond inaccurately or inflexibly to inputs of users, thereby reducing the usefulness and user friendliness of mobile devices.
Embodiments of the invention are illustrated by way of example, and not by way of limitation, in the figures of the accompanying drawings in which like reference numerals refer to similar elements.
Embodiments of the invention are generally directed to touch sensor gesture recognition for operation of mobile devices.
As used herein:
“Mobile device” means a mobile electronic device or system including a cellular phone, smart phone, mobile Internet device (MID), handheld computer, personal digital assistant (PDA), and other similar devices.
“Touch sensor” means a sensor that is configured to provide input signals that are generated by the physical touch of a user, including a sensor that detects contact by a thumb or other finger of a user of a device or system.
In some embodiments, a mobile device includes a touch sensor for the input of signals. In some embodiments, the touch sensor includes a plurality of sensor elements. In some embodiments, a method, apparatus, or system provides for:
(1) A zoned touch sensor for multiple, simultaneous user interface modes.
(2) Selection of a gesture identification algorithm based on an application.
(3) Neural network optical calibration of a touch sensor.
In some embodiments, a mobile device includes an instrumented surface designed for manipulation via a finger of a mobile user. In some embodiments, the mobile device includes a sensor on a side of a device that may especially be accessible by a thumb (or other finger) of a mobile device user. In some embodiments, the surface of a sensor may be designed in any shape. In some embodiments, the sensor is constructed as an oblong intersection of a saddle shape. In some embodiments, the touch sensor is relatively small in comparison with the thumb used to engage the touch sensor.
In some embodiments, instrumentation for a sensor is accomplished via the use of capacitance sensors and/or optical or other types of sensors embedded beneath the surface of the device input element. In some embodiments, these sensors are arranged in one of a number of possible patterns in order to increase overall sensitivity and signal accuracy, but may also be arranged to increase sensitivity to different operations or features (including, for example, motion at an edge of the sensor area, small motions, or particular gestures). Many different sensor arrangements for a capacitive sensor are possible, including, but not limited to, the sensor arrangements illustrated in the accompanying figures.
In some embodiments, sensors include a controlling integrated circuit that is interfaced with the sensor and designed to connect to a computer processor, such as a general-purpose processor, via a bus, such as a standard interface bus. In some embodiments, sub-processors are variously connected to a computer processor responsible for collecting sensor input data, where the computer processor may be a primary CPU or a secondary microcontroller, depending on the application. In some embodiments, sensor data may pass through multiple sub-processors before the data reaches the processor that is responsible for handling all sensor input.
In some embodiments, in a system in which data is handled by a primary CPU 114, the sensor data may be acquired by a system or kernel process that handles data input before handing the raw data to another system or kernel process that handles the data interpretation and fusion. In a microcontroller or sub-processor based system, this can either be a dedicated process or timeshared with other functions.
The mobile device may further include, for example, one or more transmitters and receivers 106 for the wireless transmission and reception of data, as well as one or more antennas 104 for such data transmission and reception; a memory 118 for the storage of data; a user interface 120, including a graphical user interface (GUI), for communications between the mobile device 100 and a user of the device; a display circuit or controller 122 for providing a visual display to a user of the mobile device 100; and a location circuit or element, including a global positioning system (GPS) circuit or element 124.
In some embodiments, raw data is time tagged as it enters the device or system with sufficient precision so that the raw data can be correlated with data from another sensor, and so that any jitter in the sensor circuit or acquisition system can be accounted for in the processing algorithm. Each set of raw data may also have a pre-processing algorithm that accounts for characteristic noise or sensor layout features prior to the general algorithm.
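As a rough illustration of this acquisition step, the following Python sketch tags each raw sensor frame with a monotonic timestamp and applies a simple per-sensor baseline correction; the `read_fn` driver call, the `RawSample` structure, and the baseline subtraction are illustrative assumptions rather than details taken from the disclosure.

```python
import time
from collections import namedtuple

# Hypothetical raw sample: sensor identifier, timestamp, and raw readings.
RawSample = namedtuple("RawSample", ["sensor_id", "timestamp_ns", "values"])

def acquire_sample(sensor_id, read_fn):
    """Read one frame from a sensor and tag it with a high-resolution timestamp.

    `read_fn` stands in for whatever driver call returns the raw capacitance
    (or optical) values for the named sensor; it is a placeholder here.
    """
    values = read_fn()
    # A monotonic clock avoids jumps from wall-clock adjustments, so samples
    # from different sensors can be correlated and jitter can be estimated.
    return RawSample(sensor_id, time.monotonic_ns(), values)

def preprocess(sample, baseline):
    """Per-sensor pre-processing: subtract a characteristic noise baseline."""
    corrected = [v - b for v, b in zip(sample.values, baseline)]
    return sample._replace(values=corrected)
```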
In some embodiments, a processing algorithm then processes the data from each sensor set individually and (if more than one sensor type is present) fuses the data in order to generate contact, position information, and relative motion. In some embodiments, relative motion output may be processed through a ballistics/acceleration curve to give the user fine control of motion when the user is moving the pointer slowly. In some embodiments, a separate processing algorithm uses the calculated contact and position information along with the raw data in order to recognize gestures. In some embodiments, gestures that the device or system may recognize include, but are not limited to: finger taps of various duration, swipes in various directions, and circles (clockwise or counter-clockwise). In some embodiments, a device or system includes one or more switches built into a sensor element or module together with the motion sensor, where the sensed position of the switches may be directly used as clicks in control operation of the mobile device or system.
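The ballistics/acceleration step mentioned above might look like the following sketch, which scales relative-motion deltas by a speed-dependent gain so that slow movements retain fine control; the specific curve shape and constants are assumptions for illustration only.

```python
def apply_ballistics(dx, dy, gain=1.0, accel=0.02, exponent=1.4):
    """Scale a raw relative-motion step through a simple acceleration curve.

    Slow movements pass through nearly unchanged (fine pointer control),
    while faster movements are amplified. The curve and constants here are
    illustrative, not taken from the original disclosure.
    """
    speed = (dx * dx + dy * dy) ** 0.5
    scale = gain * (1.0 + accel * speed ** exponent)
    return dx * scale, dy * scale

# Example: a slow 1-unit step stays close to 1 unit, a fast 20-unit step
# is amplified noticeably.
print(apply_ballistics(1, 0))    # ~ (1.02, 0.0)
print(apply_ballistics(20, 0))   # ~ (46.5, 0.0)
```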
In some embodiments, the output of processing algorithms and any auxiliary data is available for usage within a mobile device or system for operation of user interface logic. In some embodiments, the data may be handled through any standard interface protocol, where example protocols include a UDP (User Datagram Protocol) socket, a Unix™ socket, D-Bus (Desktop Bus), and a UNIX /dev/input device.
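For instance, a minimal sketch of publishing a recognized gesture to the user interface logic over a UDP socket might look as follows; the JSON event format, host, and port are hypothetical, as the disclosure only names the transport options listed above.

```python
import json
import socket

def publish_gesture(event, host="127.0.0.1", port=5005):
    """Send a recognized gesture event to the UI logic over a UDP socket.

    The event structure and port number are illustrative assumptions.
    """
    payload = json.dumps(event).encode("utf-8")
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, (host, port))

publish_gesture({"gesture": "swipe", "direction": "up", "duration_ms": 120})
```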
In this illustration, a first touch sensor 200 may include a plurality of oval capacitive sensors 202 (twelve in sensor 200) in a particular pattern, together with a centrally placed optical sensor 206. A second sensor 210 may include similar oval capacitive sensors 212 with no optical sensor in the center region 214 of the sensor 210.
In this illustration, a third touch sensor 220 may include a plurality of diamond-shaped capacitive sensors 222 in a particular pattern, together with a centrally placed optical sensor 226. A fourth sensor 230 may include similar diamond-shaped capacitive sensors 232 with no optical sensor in the center region 234 of the sensor 230.
In this illustration, a fifth touch sensor 240 may include a plurality of capacitive sensors 242 separated by horizontal and vertical boundaries 241, together with a centrally placed optical sensor 246. A sixth sensor 250 may include similar capacitive sensors 252 as the fifth sensor with no optical sensor in the center region 254 of the sensor 250.
In this illustration, a seventh touch sensor 260 may include a plurality of vertically aligned oval capacitive sensors 262, together with a centrally placed optical sensor 266. An eighth sensor 270 may include similar oval capacitive sensors 272 with no optical sensor in the center region 276 of the sensor 270.
Zoned Touch Sensor for Multiple, Simultaneous User Interface Modes
In some embodiments, a device or system divides the touch sensing area of a touch sensor on a mobile device into multiple discrete zones and assigns distinct functions to inputs received in each of the zones. In some embodiments, the number, location, extent and assigned functionality of the zones may be configured by the application designer or reconfigured by the user as desired. In some embodiments, the division of the touch sensor into discrete zones allows the single touch sensor to emulate the functionality of multiple separate input devices. In some embodiments, the division may be provided for a particular application or portion of an application, while other applications may be subject to no division of the touch sensor or to a different division of the touch sensor.
In one exemplary embodiment, a touch sensor is divided into a top zone, a middle zone, and a bottom zone, and the inputs in each zone are assigned to control different functional aspects of, for example, a dual-camera zoom system. In this example, inputs (such as taps by a finger of a user on the touch sensor) within the top zone toggle the system between automatic and manual focus; inputs within the middle zone (such as taps on the touch sensor) operate the camera, initiating image capture; and inputs within the bottom zone operate the zoom function. For example, an upward movement in the bottom zone could zoom inward and a downward movement in the bottom zone could zoom outward. In other embodiments, a touch sensor may be divided into any number of zones for different functions of an application.
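A minimal sketch of such a zone table for the dual-camera example, assuming a vertical strip sensor with normalized coordinates, might look like the following; the zone boundaries, handler names, and neutral gaps between zones are illustrative assumptions.

```python
# Hypothetical zone table for the dual-camera example above: each zone is a
# vertical band of the sensor (normalized 0.0 at top to 1.0 at bottom) mapped
# to a handler name. Boundaries and names are illustrative.
ZONES = [
    ("top",    0.00, 0.30, "toggle_focus_mode"),  # auto/manual focus toggle
    ("middle", 0.35, 0.65, "capture_image"),      # shutter
    ("bottom", 0.70, 1.00, "zoom"),               # zoom with vertical motion
]

def zone_for(y):
    """Return the zone name containing normalized position y, or None if the
    touch falls in a neutral (dead) band between zones."""
    for name, lo, hi, _handler in ZONES:
        if lo <= y <= hi:
            return name
    return None
```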
In some embodiments, continuous, moving contacts with the touch sensor (for example, gestures such as swipes along the touch sensor) that cross from one zone to another, such as crossing between zone 1 410 and zone 2 420, or between zone 2 420 and zone 3 430, may be handled in one of several ways. In a first approach, a mobile device may operate such that any gesture commencing in one region and finishing in another is ignored. In a second approach, a mobile device may be operated such that any gesture commencing in one region and finishing in another region is divided into two separate gestures, one in each zone, with each of the two gestures interpreted as appropriate for each zone. In addition, the existence of a “neutral” region (a dead space in the touch sensor) between adjacent zones in a touch sensor of a mobile device may be utilized to reduce the likelihood that a user will unintentionally commence a gesture in one region and finish the gesture in another.
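Reusing `zone_for` from the sketch above, the two approaches to cross-zone strokes could be prototyped roughly as follows; the point format and the segmentation logic are assumptions for illustration.

```python
def interpret_stroke(points):
    """Handle a continuous stroke that may cross zone boundaries.

    `points` is a list of (x, y) samples in normalized coordinates. If the
    stroke starts and ends in the same zone it is passed through unchanged;
    otherwise it is either ignored (first approach) or split into one
    sub-gesture per zone (second approach, shown here).
    """
    start_zone = zone_for(points[0][1])
    end_zone = zone_for(points[-1][1])
    if start_zone == end_zone:
        return [(start_zone, points)]
    # First approach: ignore gestures that span zones.
    # return []
    # Second approach: split the stroke, yielding one sub-gesture per zone.
    segments = []
    for x, y in points:
        z = zone_for(y)
        if z is None:
            continue  # sample fell in a neutral band between zones
        if segments and segments[-1][0] == z:
            segments[-1][1].append((x, y))
        else:
            segments.append((z, [(x, y)]))
    return segments
```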
If the touch sensor of a mobile device has not been divided into zones 506, then the mobile device may operate to interpret gestures in the same manner for all portions of the touch sensor 508. If the touch sensor is divided into zones 506, then the mobile device may interpret detected gestures according to the zone within which the gesture is detected 510.
Selection of Gesture Identification Algorithm Based on Application
In some embodiments, a mobile device provides for selecting a gesture recognition algorithm with characteristics that are suited for a particular application.
Mobile devices having user interfaces incorporating a touch sensor may have numerous techniques available for processing the contact, location, and movement information detected by the touch sensor to identify gestures corresponding to actions to be taken within the controlled application. Selection of a single technique for gesture recognition requires analysis of tradeoffs because each technique may have certain strengths and weaknesses, and certain techniques thus may be better at identifying some gestures than others. Correspondingly, the applications running on a mobile device may vary in their need for robust, precise, and accurate identification of particular gestures. For example, a particular application may require extremely accurate identification of a panning gesture, but be highly tolerant of a missed tapping gesture.
In some embodiments, a system operating on a mobile device selects a gesture recognition algorithm from among a set of available gesture recognition algorithms for a particular application. In some embodiments, the mobile device makes such selection on a real-time basis in the operation of the mobile device.
In some embodiments, a system on a mobile device selects a gesture algorithm based on the nature of the current application. In some embodiments, the system on a mobile device operates on the premise that each application operating on the mobile device (for example, a contact list, a picture viewer, a desktop, or other application) may be characterized by one or more “dominant” actions (where the dominant actions may be, for example, the most statistically frequent actions, or the most consequential actions), where each such dominant action is invoked by a particular gesture. In some embodiments, a system on a mobile device selects a particular gesture algorithm in order to identify the corresponding gestures robustly, precisely, and accurately.
In an example, for a contact list application, the dominant actions may be scrolling and selection, where such actions may be invoked by swiping and tapping gestures on the touch sensor of a mobile device. In some embodiments, when the contact list application is the active application for a mobile device, the system or mobile device invokes a gesture identification algorithm that can effectively identify both swiping and tapping gestures. In this example, the chosen gesture identification algorithm may be less effective at identifying other gestures, such as corner-to-corner box selection and “lasso” selection, that are not dominant gestures for the application. In some embodiments, if a picture viewer is the active application, a system or mobile device invokes a gesture identification algorithm that can effectively identify two-point separation and two-point rotation gestures corresponding to zooming and rotating actions, where such gestures are dominant gestures of the picture viewer application.
In some embodiments, a system or mobile device may select a gesture identification algorithm based on one or more specific single actions anticipated within a particular application. In an example, upon loading a contact list application, a system or mobile device may first invoke a gesture algorithm that most effectively identifies swiping gestures corresponding to a scrolling action, on the assumption that a user will first scroll the list to find a contact of interest. Further in this example, after scrolling has, for example, ceased for a certain period of time, the system or mobile device may invoke a gesture identification algorithm that most effectively identifies tapping gestures corresponding to a selection action, on the assumption that once the user has scrolled the list to a desired location, the user will select a particular contact of interest.
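One way such selection might be organized is a registry keyed by application (and, optionally, by anticipated action), as in the sketch below; the application names, recognizer names, and fallback order are hypothetical and are not specified by the disclosure.

```python
# Hypothetical registry mapping each application (or application state) to
# the gesture identification algorithm best suited to its dominant actions.
GESTURE_ALGORITHMS = {
    ("contact_list", "scrolling"): "SwipeOptimizedRecognizer",
    ("contact_list", "selecting"): "TapOptimizedRecognizer",
    ("picture_viewer", None):      "TwoPointRecognizer",  # zoom / rotate
}

def select_algorithm(application, state=None):
    """Pick a recognizer for the active application, falling back to a
    state-independent entry and then to a general-purpose default."""
    return (GESTURE_ALGORITHMS.get((application, state))
            or GESTURE_ALGORITHMS.get((application, None))
            or "GeneralPurposeRecognizer")

# Example: switch recognizers when scrolling goes idle in the contact list.
current = select_algorithm("contact_list", "scrolling")
# ... after scrolling has ceased for a certain period of time ...
current = select_algorithm("contact_list", "selecting")
```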
In this illustration, a first application 602 has one or more dominant actions 604, where such dominant actions are better handled by the first algorithm 620. Further, a second application 606 has one or more dominant actions 608, where such dominant actions are better handled by the second algorithm 622. A third application 610 may include multiple functions or subparts, where the dominant actions of the functions or subparts may differ. For example, a first function 612 has one or more dominant actions 614, where such dominant actions are better handled by the third algorithm 624 and a second function 616 has one or more dominant actions 618, where such dominant actions are better handled by the second algorithm 622.
As illustrated by the accompanying flow diagram, in some embodiments, if a gesture is detected 710, then the mobile device operates to identify the gesture using the currently chosen gesture identification algorithm 712 and thereby determine the intended action of the user of the mobile device 714. The mobile device may then implement the intended action in the context of the current application or function 716.
Neural Network Optical Calibration of Capacitive Thumb Sensor
In some embodiments, a system or mobile device provides for calibration of a touch sensor, where the calibration includes a neural network optical calibration of the touch sensor.
Many capacitive touch sensing surfaces operate based on “centroid” algorithms, which compute a weighted average of the capacitive sensor pad positions in space, with each pad's position weighted by a quantity derived from the instantaneous capacitance reported by that pad. In such algorithms, the resulting quantity for a touch sensor operated with a user's thumb (or other finger) is a capacitive “barycenter” for the thumb, which may either be treated as the absolute position of the thumb or differentiated to provide relative motion information as would a mouse.
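A minimal sketch of such a centroid computation, assuming baseline-corrected per-pad readings and known pad coordinates, is shown below; the argument names and the zero-total guard are illustrative.

```python
def capacitive_barycenter(readings, pad_positions):
    """Weighted average ("centroid") of pad positions, weighted by a quantity
    derived from each pad's instantaneous capacitance.

    `readings` is a list of per-pad values (e.g., baseline-corrected counts)
    and `pad_positions` the corresponding (x, y) pad locations.
    """
    total = sum(readings)
    if total == 0:
        return None  # no contact detected
    x = sum(r * px for r, (px, _py) in zip(readings, pad_positions)) / total
    y = sum(r * py for r, (_px, py) in zip(readings, pad_positions)) / total
    return (x, y)
```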
For a sensor operated by a user's thumb (or other finger), however, the biomechanics of the thumb may lead to an apparent mismatch between the user's expectation of pointer motion and the measured barycenter for such motion. In particular, as the thumb is extended through its full motion in a gesture on a capacitive touch sensor, the tip of the thumb generally lifts away from the surface of the capacitive sensors. In a centroid-based capacitive sensor algorithm, this yields an apparent (proximal) shift in the calculated position of the thumb while the user generally expects that the calculated position will continue to track the distal extension of the thumb. Thus, instead of tracking the user's perceived position of the finger tip, the centroid algorithm will “roll-back” along the proximodistal axis (the axis running from the tip of the thumb to the basal joint joining the thumb to the hand).
Additionally, the small size of the touch sensor relative to the thumb presents additional challenges. In a thumb sensor consisting of a physically small array of capacitive elements, many of the elements are similarly affected by the thumb at any given thumb position.
Collectively, these two phenomena make it exceedingly challenging to construct a mapping from capacitive sensor readings to calculated thumb positions that matches the user's expectations. In practice, traditional approaches, including hand-formulated functions with adjustable parameters and use of a non-linear optimizer (for example, the Levenberg-Marquardt algorithm) are generally unsuccessful.
In some embodiments, a system or apparatus provides an effective technique for generating a mapping between capacitive touch sensor measurements and calculated thumb positions.
In some embodiments, a system or apparatus uses an optical calibration instrument to determine actual thumb (or other finger) positions. In some embodiments, the actual thumb positions and the contemporaneous capacitive sensor data are provided to an artificial neural network (ANN) during a training procedure. An ANN in general is a mathematical or computational model to simulate the structure and/or functional aspects of biological neural networks, such as a system of programs and data structures that approximates the operation of the human brain. In some embodiments, a resulting ANN provides a mapping between the capacitive sensor data from the touch sensor and the actual thumb positions (which may be two-dimensional (2D, which may be expressed as a position in x-y coordinates) or three-dimensional (3D, which may be expressed as x-y-z coordinates), depending on the interface requirements of the device software) in performing gestures. In some embodiments, a mobile device may use the resulting mapping between capacitive sensor data and actual thumb positions during subsequent operation of the capacitive thumb sensor.
In some embodiments, an optical calibration instrument may be a 3D calibration rig or system, such as a system similar to those commonly used by computer vision scientists to obtain precise measurements of physical objects. The uncertainties in the measurements provided by such a rig or system are presumably small, with the ANN training procedure being resilient to any remaining noise in the training data. However, embodiments are not limited to any particular optical calibration system.
In some embodiments, the inputs to the ANN may be raw capacitive touch sensor data. In some embodiments, the inputs to the ANN may alternatively include historical sensor data quantities derived from past measurements of the capacitive touch sensors. In some embodiments, the training procedure for the ANN implements a nonparametric regression, that is, the training procedure for the ANN does not merely determine parameters within a predetermined functional form but determines the functional form itself.
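A rough sketch of the training step, using scikit-learn's MLPRegressor as one possible nonparametric regressor, might look like the following; the disclosure does not name a library, network topology, or file format, so all of those details are assumptions here.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

# X: capacitive sensor frames recorded during calibration, one row per sample
#    (raw pad readings, optionally concatenated with a few past frames).
# Y: contemporaneous thumb positions measured by the optical calibration rig,
#    as (x, y) or (x, y, z) coordinates. Filenames are hypothetical.
X = np.load("capacitive_frames.npy")   # shape: (n_samples, n_pads)
Y = np.load("optical_positions.npy")   # shape: (n_samples, 2) or (n_samples, 3)

# A small multilayer perceptron learns the mapping from sensor readings to
# thumb position; it determines the functional form from the data rather than
# fitting parameters of a hand-formulated function.
ann = MLPRegressor(hidden_layer_sizes=(64, 32), max_iter=2000)
ann.fit(X, Y)

# At run time, the trained network maps a new sensor frame to an estimated
# thumb position for use by the pointer / gesture logic.
frame = X[:1]
estimated_position = ann.predict(frame)[0]
```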
In some embodiments, an ANN may be utilized to provide improved performance in comparison with manually generated mappings for “pointing” operations, such as cursor control. An ANN is generally adept at interpreting touch sensor measurements that would be difficult or impossible for a programmer to anticipate and handle within handwritten code. An ANN-based approach can successfully develop mappings for a wide variety of arrangements of capacitive sensor pads on a sensor surface. In particular, ANNs may operate to readily accept measurements from larger electrodes (as compared to the size of the thumb) arrayed in an irregular shape (such as a non-grid arrangement), thereby extracting improved (over handwritten code) position estimates from potentially ambiguous capacitive measurements. In some embodiments, the ANN training procedure and operation may also be extended to other sensor configurations, including sensor fusion approaches, such as hybrid capacitive and optical sensors.
In some embodiments, the device 1100 further comprises a random access memory (RAM) or other dynamic storage device or element as a main memory 1115 for storing information and instructions to be executed by the processors 1110. Main memory 1115 also may be used for storing data for data streams or sub-streams. RAM memory includes dynamic random access memory (DRAM), which requires refreshing of memory contents, and static random access memory (SRAM), which does not require refreshing contents, but at increased cost. DRAM memory may include synchronous dynamic random access memory (SDRAM), which includes a clock signal to control signals, and extended data-out dynamic random access memory (EDO DRAM). In some embodiments, memory of the system may include certain registers or other special purpose memory. The device 1100 also may comprise a read only memory (ROM) 1125 or other static storage device for storing static information and instructions for the processors 1110. The device 1100 may include one or more non-volatile memory elements 1130 for the storage of certain elements.
Data storage 1120 may also be coupled to the interconnect 1105 of the device 1100 for storing information and instructions. The data storage 1120 may include a magnetic disk, an optical disc and its corresponding drive, or other memory device. Such elements may be combined together or may be separate components, and utilize parts of other elements of the device 1100.
The device 1100 may also be coupled via the interconnect 1105 to an output display 1140. In some embodiments, the display 1140 may include a liquid crystal display (LCD) or any other display technology, for displaying information or content to a user. In some environments, the display 1140 may include a touch-screen that is also utilized as at least a part of an input device. In some environments, the display 1140 may be or may include an audio device, such as a speaker for providing audio information.
One or more transmitters or receivers 1145 may also be coupled to the interconnect 1105. In some embodiments, the device 1100 may include one or more ports 1150 for the reception or transmission of data. The device 1100 may further include one or more antennas 1155 for the reception of data via radio signals.
The device 1100 may also comprise a power device or system 1160, which may comprise a power supply, a battery, a solar cell, a fuel cell, or other system or device for providing or generating power. The power provided by the power device or system 1160 may be distributed as required to elements of the device 1100.
In some embodiments, the device 1100 includes a touch sensor 1170. In some embodiments, the touch sensor 1170 includes a plurality of capacitive sensor pads 1172. In some embodiments, the touch sensor 1170 may further include another sensor or sensors, such as an optical sensor 1174.
In the description above, for the purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the present invention. It will be apparent, however, to one skilled in the art that the present invention may be practiced without some of these specific details. In other instances, well-known structures and devices are shown in block diagram form. There may be intermediate structure between illustrated components. The components described or illustrated herein may have additional inputs or outputs which are not illustrated or described.
Various embodiments may include various processes. These processes may be performed by hardware components or may be embodied in computer program or machine-executable instructions, which may be used to cause a general-purpose or special-purpose processor or logic circuits programmed with the instructions to perform the processes. Alternatively, the processes may be performed by a combination of hardware and software.
Portions of various embodiments may be provided as a computer program product, which may include a computer-readable medium having stored thereon computer program instructions, which may be used to program a computer (or other electronic devices) for execution by one or more processors to perform a process according to certain embodiments. The computer-readable medium may include, but is not limited to, floppy diskettes, optical disks, compact disk read-only memory (CD-ROM), magneto-optical disks, read-only memory (ROM), random access memory (RAM), erasable programmable read-only memory (EPROM), electrically-erasable programmable read-only memory (EEPROM), magnetic or optical cards, flash memory, or other type of computer-readable medium suitable for storing electronic instructions. Moreover, embodiments may also be downloaded as a computer program product, wherein the program may be transferred from a remote computer to a requesting computer.
Many of the methods are described in their most basic form, but processes can be added to or deleted from any of the methods and information can be added or subtracted from any of the described messages without departing from the basic scope of the present invention. It will be apparent to those skilled in the art that many further modifications and adaptations can be made. The particular embodiments are not provided to limit the invention but to illustrate it. The scope of the embodiments of the present invention is not to be determined by the specific examples provided above but only by the claims below.
If it is said that an element “A” is coupled to or with element “B,” element A may be directly coupled to element B or be indirectly coupled through, for example, element C. When the specification or claims state that a component, feature, structure, process, or characteristic A “causes” a component, feature, structure, process, or characteristic B, it means that “A” is at least a partial cause of “B” but that there may also be at least one other component, feature, structure, process, or characteristic that assists in causing “B.” If the specification indicates that a component, feature, structure, process, or characteristic “may”, “might”, or “could” be included, that particular component, feature, structure, process, or characteristic is not required to be included. If the specification or claim refers to “a” or “an” element, this does not mean there is only one of the described elements.
An embodiment is an implementation or example of the present invention. Reference in the specification to “an embodiment,” “one embodiment,” “some embodiments,” or “other embodiments” means that a particular feature, structure, or characteristic described in connection with the embodiments is included in at least some embodiments, but not necessarily all embodiments. The various appearances of “an embodiment,” “one embodiment,” or “some embodiments” are not necessarily all referring to the same embodiments. It should be appreciated that in the foregoing description of exemplary embodiments of the present invention, various features are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the disclosure and aiding in the understanding of one or more of the various inventive aspects. This method of disclosure, however, is not to be interpreted as reflecting an intention that the claimed invention requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in less than all features of a single foregoing disclosed embodiment. Thus, the claims are hereby expressly incorporated into this description, with each claim standing on its own as a separate embodiment of this invention.
This application is a continuation of U.S. application Ser. No. 13/992,699, filed on Jun. 7, 2013, which is further a U.S. National Phase application under 35 U.S.C. §371 from International Application No. PCT/US2010/061802, filed on Dec. 22, 2010, which applications are incorporated herein by reference in their entirety.
| | Number | Date | Country |
| --- | --- | --- | --- |
| Parent | 13992699 | Jun 2013 | US |
| Child | 15352474 | | US |