1. Field of the Invention
This invention relates generally to interacting with electronic devices via a touch-sensitive surface.
2. Description of the Related Art
Many touch pads and touch screens today support only a small set of gestures. For example, one finger is typically used to manipulate a cursor or to scroll the display. Another example is using two fingers in a pinching manner to zoom in and out of content, such as a photograph or map. However, this is a gross simplification of what fingers and hands are capable of doing. Fingers are diverse appendages, both in their motor capabilities and their anatomical composition. Furthermore, fingers and hands can also be used to manipulate tools, in addition to making gestures themselves.
Thus, there is a need for better utilization of the capabilities of fingers and hands to control interactions with electronic devices.
The present invention allows users to interact with touch-sensitive surfaces in a manner that distinguishes different touch types. For example, the same touch events performed by a finger pad, a finger nail, a knuckle or different types of instruments may result in the execution of different actions on the electronic device.
In one approach, a user interacts with an electronic device via a touch-sensitive surface, such as a touch pad or a touch screen, using his finger(s) or an instrument. A touch event trigger indicates an occurrence of a touch event between the user and the touch-sensitive surface. Touch data and vibro-acoustic data produced by the physical touch event are used to determine the touch type for the touch event. However, the touch event trigger may take some time to generate due to, for example, sensing latency and filtering. Further, the event trigger may take some time to propagate in the device due to, for example, software processing, hysteresis, and overhead from processing a low level event (e.g., an interrupt) up through the operating system to end user applications. Because there will always be some amount of latency, the vibro-acoustic data from the touch impact will always have occurred prior to receipt of the touch event trigger.
On most mobile electronic devices, the distinguishing components of the vibro-acoustic signal (i.e., those which are most useful for classification) occur in the first 10 ms of a touch impact. For current mobile electronics, the touch event trigger is typically received on the order of tens of milliseconds after the physical touch contact. Therefore, if vibro-acoustic data is captured only upon receipt of a touch event trigger, the most important part of the vibro-acoustic signal will have already occurred and will be lost (i.e., never captured). This precludes reliable touch type classification for many platforms.
In one approach, vibro-acoustic data is continuously captured and buffered, for example, with a circular buffer. After receipt of the touch event trigger, an appropriate window (based on device latency) of vibro-acoustic data (which can include times prior to receipt of the touch event trigger or even prior to the physical touch event) is then accessed from the buffer. For example, a 10 ms window beginning 30 ms prior to receipt of the touch event trigger (i.e., from −30 ms to −20 ms) can be accessed. Additionally, the system can wait after the receipt of a touch event trigger for a predefined length of time before extracting a window of vibro-acoustic data. For example, the system can wait 20 ms after receipt of a touch event trigger, and then extract from the buffer the prior 100 ms of data.
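As a concrete illustration of this buffering approach, the following Python sketch shows one way continuous capture and windowed retrieval could be implemented. It is a minimal sketch, not the implementation described above: the sample rate, buffer length, and the classify() function are illustrative assumptions.

    import collections
    import time

    SAMPLE_RATE_HZ = 11025   # assumed vibro-acoustic sampling rate
    BUFFER_SECONDS = 1.0     # keep roughly the last second of samples

    class VibroAcousticBuffer:
        """Continuously buffers (timestamp, sample) pairs in a circular buffer."""

        def __init__(self):
            size = int(SAMPLE_RATE_HZ * BUFFER_SECONDS)
            self._buf = collections.deque(maxlen=size)  # old samples fall off the end

        def push(self, sample):
            # Called by the sensor driver for every new vibro-acoustic sample.
            self._buf.append((time.monotonic(), sample))

        def window(self, start_offset_s, duration_s):
            """Return samples in [now + start_offset_s, now + start_offset_s + duration_s).
            A negative offset reaches back before receipt of the touch event trigger."""
            t0 = time.monotonic() + start_offset_s
            t1 = t0 + duration_s
            return [s for (t, s) in self._buf if t0 <= t < t1]

    buf = VibroAcousticBuffer()

    def on_touch_event_trigger():
        # Example from the text: a 10 ms window beginning 30 ms prior to
        # receipt of the touch event trigger (i.e., from -30 ms to -20 ms).
        samples = buf.window(start_offset_s=-0.030, duration_s=0.010)
        classify(samples)  # hypothetical downstream classifier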
In an alternate approach, the occurrence of the touch event is predicted beforehand. For example, the touch-sensitive surface may sense proximity of a finger before actual contact (e.g., using hover sensing capabilities of capacitive screens, diffuse illumination optical screens, and other technologies). This prediction is then used to trigger capture of vibro-acoustic data or to initiate vibro-acoustic data capturing and buffering. If the predicted touch event does not occur, capturing and/or buffering can cease, waiting for another predicted touch.
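A rough sketch of this predictive gating, reusing the ring buffer above, follows; the hover callbacks are hypothetical placeholders for whatever proximity events a given platform provides:

    capturing = False

    def on_hover_detected():   # hypothetical proximity event: finger approaching
        global capturing
        capturing = True       # start capturing and buffering vibro-acoustic data

    def on_hover_lost():       # the predicted touch did not occur
        global capturing
        capturing = False      # cease capture and wait for the next prediction

    def on_sample(sample):
        if capturing:
            buf.push(sample)   # buffer only while a touch is predicted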
In another aspect, the touch type for the touch event determines subsequent actions. An action is taken on the electronic device in response to the touch event and to the touch type. That is, the same touch event can result in the execution of one action for one touch type and a different action for a different touch type.
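For example, a dispatch keyed on both the touch event and the classified touch type might look like the following sketch; the event names and actions are hypothetical:

    # Hypothetical mapping: the same touch event maps to different actions
    # depending on the classified touch type.
    ACTIONS = {
        ("single_tap", "finger_pad"): "select_item",
        ("single_tap", "finger_nail"): "open_context_menu",
        ("single_tap", "knuckle"): "take_screenshot",
    }

    def on_classified_touch(touch_event, touch_type):
        action = ACTIONS.get((touch_event, touch_type))
        if action is not None:
            perform(action)  # hypothetical action executor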
Other aspects of the invention include methods, devices, systems, components and applications related to the approaches described above.
The invention has other advantages and features which will be more readily apparent from the following detailed description of the invention and the appended claims, when taken in conjunction with the accompanying drawings, in which:
The figures depict embodiments of the present invention for purposes of illustration only. One skilled in the art will readily recognize from the following discussion that alternative embodiments of the structures and methods illustrated herein may be employed without departing from the principles of the invention described herein.
The figures and the following description relate to preferred embodiments by way of illustration only. It should be noted that from the following discussion, alternative embodiments of the structures and methods disclosed herein will be readily recognized as viable alternatives that may be employed without departing from the principles of what is claimed.
In a common architecture, the data storage 106 includes a machine-readable medium which stores the main body of instructions 124 (e.g., software). The instructions 124 may also reside, completely or at least partially, within the memory 104 or within the processor 102 (e.g., within a processor's cache memory) during execution. The memory 104 and the processor 102 also constitute machine-readable media.
In this example, the different components communicate using a common bus, although other communication mechanisms could be used. As one example, the processor 102 could act as a hub with direct access or control over each of the other components.
The device 100 may be a server computer, a client computer, a personal computer (PC), or any device capable of executing instructions 124 (sequential or otherwise) that specify actions to be taken by that device. Further, while only a single device is illustrated, the term “device” shall also be taken to include any collection of devices that individually or jointly execute instructions 124 to perform any one or more of the methodologies discussed herein. The same is true for each of the individual components. For example, the processor 102 may be a multicore processor, or multiple processors working in a coordinated fashion. It may also be or include a central processing unit (CPU), a graphics processing unit (GPU), a network processing unit (NPU), a digital signal processor (DSP), one or more application specific integrated circuits (ASICs), or combinations of the foregoing. The memory 104 and data storage 106 may be dedicated to individual processors, shared by many processors, or a single processor may be served by many memories and data storage.
As one example, the device 100 could be a self-contained mobile device, such as a cell phone or tablet computer with a touch screen. In that case, the touch screen serves as both the touch-sensitive surface 110 and the display 120. As another example, the device 100 could be implemented in a distributed fashion over a network. The processor 102 could be part of a cloud-based offering (e.g., renting processor time from a cloud offering), the data storage 106 could be network attached storage or other distributed or shared data storage, and the memory 104 could similarly be distributed or shared. The touch-sensitive surface 110 and display 120 could be user I/O devices to allow the user to interact with the different networked components.
Returning to FIG. 1, touch events also physically cause vibrations or acoustic signals. Touching the surface may cause acoustic signals (such as the sound of a fingernail or finger pad contacting glass) and/or may cause vibrations in the underlying structure of the electronic device, e.g., chassis, enclosure, electronics boards (e.g., PCBs). The sensor circuitry 112 includes sensors 112B to detect the vibro-acoustic signal. The vibro-acoustic sensors may be arranged at a rear side of the touch-sensitive surface so that the vibro-acoustic signal caused by the physical touch event can be captured. They could also be mounted in any number of locations inside the device, including but not limited to the chassis, touch screen, main board, printed circuit board, display panel, and enclosure. Examples of vibro-acoustic sensors include impact sensors, vibration sensors, accelerometers, strain gauges, and acoustic sensors such as a condenser microphone, a piezoelectric microphone, a MEMS microphone, and the like. Additional sensor types include piezo bender elements, piezo film, accelerometers (e.g., linear variable differential transformer (LVDT), potentiometric, variable reluctance, piezoelectric, piezoresistive, capacitive, servo (force balance), MEMS), displacement sensors, velocity sensors, vibration sensors, gyroscopes, proximity sensors, electric microphones, hydrophones, condenser microphones, electret condenser microphones, dynamic microphones, ribbon microphones, carbon microphones, piezoelectric microphones, fiber optic microphones, laser microphones, liquid microphones, and MEMS microphones. Many touch screen computing devices today already have microphones and accelerometers built in (e.g., for voice and input sensing). These can be utilized without the need for additional sensors, or can work in concert with specialized sensors.
Whatever the underlying principle of operation, touches on the touch-sensitive surface will result in signals—both touch signals and vibro-acoustic signals. However, these raw signals typically are not directly useable in a digital computing environment. For example, the signals may be analog in nature. The sensor circuitry 112A-B typically provides an intermediate stage to process and/or condition these signals so that they are suitable for use in a digital computing environment. As shown in FIG. 1, the touch sensor circuitry 112A produces touch data and the vibro-acoustic sensor circuitry 112B produces vibro-acoustic data.
The touch sensor circuitry 112A also produces a touch event trigger, which indicates the occurrence of a touch event. Touch event triggers could appear in different forms.
For example, the touch event trigger might be an interrupt from a processor controlling the touch sensing system. Alternately, the touch event trigger could be a change in a polled status of the touchscreen controller. It could also be implemented as a modification of a device file (e.g., "/dev/input/event6") on the file system, or as a message posted to a driver work queue. As a final example, the touch event trigger could be implemented as an onTouchDown() event in a graphical user interface program.
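As a concrete illustration of the device-file form, the sketch below blocks on a Linux input device and treats a BTN_TOUCH key-down event as the touch event trigger. This is a sketch under stated assumptions: the 24-byte input_event layout shown is for 64-bit Linux, and the device path is only an example.

    import struct

    EVENT_FORMAT = "qqHHi"  # timeval (sec, usec), type, code, value (64-bit Linux)
    EVENT_SIZE = struct.calcsize(EVENT_FORMAT)
    EV_KEY, BTN_TOUCH = 0x01, 0x14a

    def wait_for_touch_trigger(path="/dev/input/event6"):
        """Block until a touch-down event appears on the given input device."""
        with open(path, "rb") as f:
            while True:
                sec, usec, etype, code, value = struct.unpack(
                    EVENT_FORMAT, f.read(EVENT_SIZE))
                if etype == EV_KEY and code == BTN_TOUCH and value == 1:
                    return sec + usec / 1e6  # kernel timestamp of the trigger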
However, the generation and receipt of the touch event trigger may be delayed due to latency in touch sensor circuitry 112A. Thus, if vibro-acoustic sensor circuitry 112B were to wait until it received the touch event trigger and then turn on, it may miss the beginning portion of the vibro-acoustic data.
In many cases, the delay Δt can be very significant. It could be longer than the entire signal window. For example, typical delays Δt for current devices are 20 ms, 35 ms, 50 ms, or possibly longer, while the desired vibro-acoustic signal window 210B may be only the first 5 ms. In these cases, waiting for the touch event trigger 212 may miss the entire vibro-acoustic signal. Other times, the delay Δt can be short and the window long, for example, a 10 ms delay with a 100 ms window.
In some cases, useful vibro-acoustic data can persist after the receipt of the touch event trigger 212. In this case, a small waiting period can be used before accessing the vibro-acoustic buffer, which can then contain periods both before and after the touch event trigger.
The vibro-acoustic data for this time window are then accessed from buffer 310.
Touch types can be defined according to different criteria. For example, different touch types can be defined depending on the number of contacts. A “uni-touch” occurs when the touch event is defined by interaction with a single portion of a single finger (or instrument), although the interaction could occur over time. Examples of uni-touch include a simple touch (e.g., a single tap), touch-and-drag, and double-touch (e.g., a double-tap—two taps in quick succession). In multi-touch, the touch event is defined by combinations of different fingers or finger parts. For example, a touch event where two fingers are simultaneously touching is a multi-touch. Another example would be when different parts of the same finger are used, either simultaneously or over time.
Touch types can also be classified according to which part of the finger or instrument touches. For example, touch by the finger pad, finger nail or knuckle could be considered different touch types. The finger pad is the fleshy part around the tip of the finger. It includes both the fleshy tip and the fleshy region from the tip to the first joint. The knuckle refers to any of the finger joints. The term “finger” is also meant to include the thumb. It should be understood that the finger itself is not required to be used for touching; similar touches may be produced in other ways. For example, the “finger pad” touch type is really a class of touch events that have similar characteristics as those produced by a finger pad touching the touch-sensitive surface, but the actual touching object may be a man-made instrument or a gloved hand or covered finger, so long as the touching characteristics are similar enough to a finger pad so as to fall within the class.
The touch type is determined in part by a classification of vibro-acoustic signals from the touch event. When an object strikes a certain material, vibro-acoustic waves propagate outward through the material or along the surface of the material. Typically, touch-sensitive surface 110 uses rigid materials, such as plastic or glass, which both quickly distribute and faithfully preserve the signal. As such, when respective finger parts touch or contact the surface of the touch-sensitive surface 110, vibro-acoustic responses are produced. The vibro-acoustic characteristics of the respective finger parts are unique, mirroring their unique anatomical compositions.
A feature extraction module 556 then generates various features from the accessed data. These features can include time domain and/or frequency domain representations of the vibro-acoustic signal (or its filtered versions), as well as first, second, and higher order derivatives thereof. These features can also include down-sampled versions of the time and frequency domain data (e.g., bucketed into vectors of ten), providing different aliasing. Additional features can be further derived from the time domain and/or frequency domain representations and their derivatives, including average, standard deviation, standard deviation (normalized by overall amplitude), range, variance, skewness, kurtosis, sum, absolute sum, root mean square (rms), crest factor, dispersion, entropy, power sum, center of mass (centroid), coefficient of variation, cross correlation (e.g., sliding dot product), zero-crossings, seasonality (i.e., cyclic variation), and DC bias. Additional features based on frequency domain representations and their derivatives include power in different bands of the frequency domain representation (e.g., power in linear bins or octaves) and ratios of the power in different bands (e.g., ratio of power in octave 1 to power in octave 4).
Features could also include template match scores for a set of known exemplar signals using any of the following methods: convolution, inverse filter matching technique, sum-squared difference (SSD), dynamic time warping, and elastic matching.
Spectral centroid, spectral density, spherical harmonics, total average spectral energy, spectral rolloff, spectral flatness, band energy ratio (e.g., for every octave), and log spectral band ratios (e.g., for every pair of octaves, and every pair of thirds) are features that can be derived from frequency domain representations.
Additional vibro-acoustic features include linear prediction-based cepstral coefficients (LPCC), perceptual linear prediction (PLP) cepstral coefficients, cepstrum coefficients, mel-frequency cepstral coefficients (MFCC), and frequency phases (e.g., as generated by an FFT). The above features can be computed on the entire window of vibro-acoustic data, but could also be computed for sub-regions (e.g., around the peak of the waveform, at the end of the waveform). Further, the above vibro-acoustic features can be combined to form hybrid features, for example a ratio (e.g., zero-crossings/spectral centroid) or a difference (e.g., zero-crossings minus spectral centroid).
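To make a few of these features concrete, the sketch below computes zero-crossings, root mean square, spectral centroid, and one hybrid ratio for a window of vibro-acoustic samples. It uses NumPy and covers only a small, illustrative subset of the features listed above.

    import numpy as np

    def extract_features(window):
        """Compute a small subset of the vibro-acoustic features described above."""
        x = np.asarray(window, dtype=float)

        # Time domain features
        rms = np.sqrt(np.mean(x ** 2))
        zero_crossings = int(np.sum(np.signbit(x[1:]) != np.signbit(x[:-1])))

        # Frequency domain features
        spectrum = np.abs(np.fft.rfft(x))
        freqs = np.fft.rfftfreq(len(x))  # normalized frequency bins
        centroid = np.sum(freqs * spectrum) / np.sum(spectrum)

        return {
            "rms": rms,
            "zero_crossings": zero_crossings,
            "spectral_centroid": centroid,
            # Hybrid feature: ratio of two base features
            "zc_over_centroid": zero_crossings / centroid if centroid else 0.0,
        }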
The feature extraction module 556 can also generate features from the touch data. Examples include location of the touch (2D, or 3D in the case of curved glass or other non-planar geometry), size and shape of the touch (some touch technologies provide an ellipse of the touch with major and minor axes, eccentricity, and/or ratio of major and minor axes), orientation of the touch, surface area of the touch (e.g., in square mm or pixels), number of touches, pressure of the touch (available on some touch systems), and shear of the touch. "Shear stress," also called "tangential force," arises from a force vector perpendicular to the surface normal of a touch screen, i.e., parallel to the touch screen surface. This is in contrast to normal stress (what is commonly called pressure), which arises from a force vector parallel to the surface normal. Some features depend on the type of touch-sensitive surface. For example, capacitance of touch, swept frequency capacitance of touch, and swept frequency impedance of touch may be available for (swept frequency) capacitive touch screens. Derivatives of the above quantities can also be computed as features. The derivatives may be computed over a short period of time, for example, touch velocity and pressure velocity. Another possible feature is an image of the hand pose (as imaged by, e.g., an optical sensor, a diffuse-illuminated surface with camera, or near-range capacitive sensing).
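As a simple illustration of a derivative feature, touch velocity can be estimated from two successive touch samples; the (x, y, t) sample format is an assumption about what the touch system reports:

    import math

    def touch_velocity(p0, p1):
        """Estimate touch velocity from two (x, y, t) samples; units are whatever
        the touch system reports (e.g., pixels and seconds)."""
        (x0, y0, t0), (x1, y1, t1) = p0, p1
        dt = t1 - t0
        if dt <= 0:
            return 0.0
        return math.hypot(x1 - x0, y1 - y0) / dt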
The classification module 558 classifies the touch using extracted features from the vibro-acoustic signal as well as possibly other non-vibro-acoustic features, including touch features. In one exemplary embodiment, the classification module 558 is implemented with a support vector machine (SVM) for feature classification. The SVM is a supervised learning model with associated learning algorithms that analyze data and recognize patterns, used for classification and regression analysis. To aid classification, the user can provide supplemental training samples to the vibro-acoustic classifier. Other techniques appropriate for the classification module 558 include basic heuristics, decision trees, random forest, naive Bayes, elastic matching, dynamic time warping, template matching, k-means clustering, K-nearest neighbors algorithm, neural network, multilayer perceptron, multinomial logistic regression, Gaussian mixture models, and AdaBoost.
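A minimal sketch of the SVM-based classification, using scikit-learn as one assumed implementation (no particular library is required by the approach described above):

    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.svm import SVC

    # X: rows of feature vectors (vibro-acoustic plus touch features);
    # y: touch type labels such as "finger_pad", "finger_nail", "knuckle".
    def train_touch_classifier(X, y):
        clf = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
        clf.fit(X, y)
        return clf

    # Supplemental training samples provided by the user can be appended
    # to X and y and the classifier retrained.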
Returning to the operation of device 100, an action is then taken on the electronic device based on the touch event and the classified touch type.
This approach allows the same touch event to control more than one action. This can be desirable for various reasons. First, it increases the number of available actions for a given set of touch events. For example, if touch types are not distinguished, then a single tap can be used for only one purpose, because a single tap by a finger pad, a single tap by a finger nail and a single tap by an instrument cannot be distinguished. However, if all three of these touch types can be distinguished, then a single tap can be used for three different purposes, depending on the touch type.
Conversely, for a given number of actions, this approach can reduce the number of user inputs needed to reach that action. Continuing the above example, if three actions are desired, then by distinguishing touch types the user will be able to initiate each action with a single motion—a single tap. If touch types are not distinguished, then more complex motions or a deeper interface decision tree may be required. For example, without different touch types, the user might be required to first make a single tap to bring up a menu of the three choices. He would then make a second touch to choose from the menu.
Although the detailed description contains many specifics, these should not be construed as limiting the scope of the invention but merely as illustrating different examples and aspects of the invention. It should be appreciated that the scope of the invention includes other embodiments not discussed in detail above. Various other modifications, changes and variations which will be apparent to those skilled in the art may be made in the arrangement, operation and details of the method and apparatus of the present invention disclosed herein without departing from the spirit and scope of the invention as defined in the appended claims. Therefore, the scope of the invention should be determined by the appended claims and their legal equivalents.
The term “module” is not meant to be limited to a specific physical form. Depending on the specific application, modules can be implemented as hardware, firmware, software, and/or combinations of these. Furthermore, different modules can share common components or even be implemented by the same components. There may or may not be a clear boundary between different modules.
Depending on the form of the modules, the "coupling" between modules may also take different forms. Modules implemented as dedicated circuitry can be coupled to each other by hardwiring or by accessing a common register or memory location, for example. Software "coupling" can occur in any number of ways to pass information between software components (or between software and hardware, if that is the case). The term "coupling" is meant to include all of these and is not meant to be limited to a hardwired permanent connection between two components. In addition, there may be intervening elements. For example, when two elements are described as being coupled to each other, this does not imply that the elements are directly coupled to each other nor does it preclude the use of other elements between the two.