The present invention relates to methods and devices for classifying the activity of a user, especially using sound information.
With the rapid development of mobile terminals such as mobile phones, more and more functionalities are incorporated inside the terminal. One feature is to detect motion of the terminal and thereby the motion and activity of the user.
Activity recognition, i.e. classifying how a user is moving, e.g. sitting, running, walking, riding in a car etc., is currently done mainly using accelerometer sensors and in some cases location sensors or video. Activity recognition in handsets is a problem since it may consume a lot of power and also has limited accuracy. The invention addresses this by using body microphones to capture the sound of vibrations transported through the user's body, improving accuracy and/or reducing power consumption. It improves accuracy compared to using only an accelerometer or microphones recording external (non-body) sounds.
The present invention provides a solution to the aforementioned problem by using body-attached microphones to capture the sound of vibrations transported through the user's body to improve accuracy and/or reduce power consumption.
Thus, the invention relates to a method for classifying an activity of an object, the method comprising: receiving a sound signal from a sensor, determining a type of sound based on the sound signal, and determining the activity based on the type of sound. The sound signal corresponds to vibrations from the object. According to one embodiment the sound receiver is a microphone attached to a person and facing the skin of the person. The sensor further comprises a motion detector. The method further comprises comparing the sound signal with a number of sound signals stored in a memory, which includes a plurality of sound types and a plurality of attributes associated with each sound type. Each attribute comprises a predefined value and each sound type is associated with each attribute. Each sound type is associated with each attribute in accordance with Bayes' rule, such that a conditional probability of each sound type is defined for an occurrence of each attribute. The attributes may consist of one or several of: histogram features, linear predictive coding, cepstral coefficients, short-time Fourier transform, timbre, zero-crossing rate, short-time energy, root-mean-square energy, high/low feature value ratio, spectrum centroid, spectrum spread, or spectral roll-off frequency.
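A few of the attributes listed above can be illustrated with a minimal sketch. The function names, the frame representation (a list of samples) and the naive discrete Fourier transform are assumptions made for illustration only; the description does not prescribe how the attributes are computed.

```python
import cmath
import math
from typing import Sequence


def zero_crossing_rate(frame: Sequence[float]) -> float:
    """Fraction of adjacent sample pairs whose signs differ."""
    if len(frame) < 2:
        return 0.0
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)


def short_time_energy(frame: Sequence[float]) -> float:
    """Sum of squared samples over the frame."""
    return sum(x * x for x in frame)


def rms_energy(frame: Sequence[float]) -> float:
    """Root-mean-square amplitude of the frame."""
    return math.sqrt(short_time_energy(frame) / len(frame))


def spectrum_centroid(frame: Sequence[float], sample_rate: float) -> float:
    """Magnitude-weighted mean frequency, using a naive O(n^2) DFT."""
    n = len(frame)
    magnitudes = []
    for k in range(n // 2 + 1):
        coefficient = sum(
            frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)
        )
        magnitudes.append(abs(coefficient))
    total = sum(magnitudes)
    if total == 0:
        return 0.0
    return sum(k * sample_rate / n * m for k, m in enumerate(magnitudes)) / total
```

In practice a fast Fourier transform would replace the naive transform; it is written out here only so the sketch needs nothing beyond the standard library.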
The invention also relates to a device for classifying an activity of a person, the device comprising: a receiver for receiving a sound signal from a sensor, and a controller, characterised in that the controller is configured to process the sound signal and determine a type of sound based on the sound signal, and determine the activity based on the type of sound. The sound signals are received from one or several microphones attached to the person. The microphones are arranged facing the skin of the person. The device may further receive motion data from one or several motion detectors. The controller is further configured to compare the sound signal with a number of sound signals stored in a memory, which includes a plurality of sound types and a plurality of attributes associated with each sound type, each attribute comprising a predefined value and each sound type being associated with each attribute in accordance with Bayes' rule, such that a conditional probability of each sound type is defined for an occurrence of each attribute.
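The association of sound types with attributes by Bayes' rule can be sketched as follows. This is a hedged illustration, not the claimed implementation: the sound-type names, attribute names and probabilities are hypothetical, and the attributes are assumed to be conditionally independent given the sound type (a naive Bayes simplification that the description does not state).

```python
from typing import Dict, Iterable


def sound_type_posterior(
    priors: Dict[str, float],
    likelihoods: Dict[str, Dict[str, float]],
    observed_attributes: Iterable[str],
    floor: float = 1e-6,
) -> Dict[str, float]:
    """Return P(sound type | observed attributes) via Bayes' rule.

    priors[s] is P(s); likelihoods[s][a] is P(a | s). Attributes missing
    from the table get a small floor probability instead of zero.
    """
    observed = list(observed_attributes)
    scores = {}
    for sound_type, prior in priors.items():
        score = prior
        for attribute in observed:
            score *= likelihoods[sound_type].get(attribute, floor)
        scores[sound_type] = score
    evidence = sum(scores.values())  # P(observed attributes)
    return {sound_type: s / evidence for sound_type, s in scores.items()}


# Hypothetical stored table: two sound types, two binary attributes.
PRIORS = {"footstep": 0.5, "heartbeat": 0.5}
LIKELIHOODS = {
    "footstep": {"high_zero_crossing_rate": 0.8, "high_energy": 0.6},
    "heartbeat": {"high_zero_crossing_rate": 0.2, "high_energy": 0.4},
}

posterior = sound_type_posterior(PRIORS, LIKELIHOODS, ["high_zero_crossing_rate"])
```

With these illustrative numbers, observing a high zero-crossing rate makes "footstep" four times as likely as "heartbeat", since the priors are equal and the likelihoods are 0.8 and 0.2.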
The invention also relates to a mobile communication terminal comprising a device as mentioned above.
Reference is made to the attached drawings, wherein elements having the same reference number designation may represent like elements throughout.
The following detailed description refers to the accompanying drawings. The same reference numbers in different drawings may identify the same or similar elements. The term “image,” as used herein, may refer to a digital or an analog representation of visual information (e.g., a picture, a video, a photograph, animations, etc.).
The term “audio,” as used herein, may refer to a digital or an analog representation of audio information (e.g., a recorded voice, a song, an audio book, etc.).
Also, the following detailed description does not limit the invention. Instead, the scope of the invention is defined by the appended claims and equivalents.
The basic idea of the invention is to record sound waves transported internally through the user's own body. This makes it suitable for recognizing activities that do not generate distinct external sounds, e.g. walking or running. It also makes the recording less susceptible to ambient noise and thus provides higher accuracy.
The microphone(s) can be placed on the body of a user, e.g. using a holder. The microphones may be provided facing the body and in direct contact with the skin. The activity classification itself can be done in a sensor and then communicated to the terminal to be used in applications. Alternatively, the sound type detection may be carried out in the sensor as lower-level feature detection, the result of which is then communicated to the terminal where the actual activity classification is done.
The audio and accelerometer data are preprocessed to extract features and then fed to the classifier, which can be an assembly of classifiers, which then generates a classification. The specific classification method used, e.g. Bayesian, neural networks etc., is an implementation detail.
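An assembly of classifiers fed with extracted features might be combined by majority vote, as in the sketch below. The voting scheme, the feature names and the threshold values are all assumptions chosen for illustration; the description leaves the classification method open.

```python
from collections import Counter
from typing import Callable, Dict, Sequence

Features = Dict[str, float]
Classifier = Callable[[Features], str]


def classify_with_assembly(
    features: Features, classifiers: Sequence[Classifier]
) -> str:
    """Return the label chosen by most classifiers (first seen wins ties)."""
    votes = Counter(classifier(features) for classifier in classifiers)
    return votes.most_common(1)[0][0]


# Hypothetical member classifiers, each a simple threshold rule.
def by_step_rate(features: Features) -> str:
    return "running" if features["step_rate_hz"] > 2.5 else "walking"


def by_acceleration(features: Features) -> str:
    return "running" if features["accel_variance"] > 2.0 else "walking"


def by_sound_energy(features: Features) -> str:
    return "running" if features["sound_rms"] > 0.3 else "walking"


ASSEMBLY = [by_step_rate, by_acceleration, by_sound_energy]

# Features as they might come out of preprocessing (illustrative values).
label = classify_with_assembly(
    {"step_rate_hz": 2.8, "accel_variance": 3.0, "sound_rms": 0.1}, ASSEMBLY
)
```

Here two of the three member classifiers vote "running", so that label is returned even though the sound-energy rule disagrees.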
Processor 120 may include any type of processor or microprocessor that interprets and executes instructions. Processor 120 may also include logic that is able to decode media, such as audio files, etc., and generate output to, for example, a speaker, a display, etc. Memory 130 may include a random access memory (RAM) or another dynamic storage device that stores information and instructions for execution by processor 120. Memory 130 may also be used to store temporary variables or other intermediate information during execution of instructions by processor 120.
ROM 140 may include a conventional ROM device and/or another static storage device that stores static information and instructions for processor 120. Storage device 150 may include a flash memory (e.g., an electrically erasable programmable read only memory (EEPROM)) device for storing information and instructions.
Input device 160 may include one or more conventional mechanisms that permit a user to input information to the arrangement 100, such as a keyboard, a keypad, a directional pad, a mouse, a pen, voice recognition, a touch-screen and/or biometric mechanisms, etc. Output device 170 may include one or more conventional mechanisms that output information to the user, including a display, a printer, one or more speakers, etc. Communication interface 180 may include any transceiver-like mechanism that enables arrangement 100 to communicate with other devices and/or systems. For example, communication interface 180 may include a modem or an Ethernet interface to a LAN. Alternatively, or additionally, communication interface 180 may include other mechanisms for communicating via a network, such as a wireless network. For example, communication interface 180 may include a radio frequency (RF) transmitter and receiver and one or more antennas for transmitting and receiving RF data.
Arrangement 100, consistent with the invention, provides a platform through which audible information and motion information may be interpreted as activity information. Arrangement 100 may also display information associated with the activity to the user of arrangement 100 in a graphical format, or provide it to a third-party system. According to an exemplary implementation, arrangement 100 may perform various processes in response to processor 120 executing sequences of instructions contained in memory 130. Such instructions may be read into memory 130 from another computer-readable medium, such as storage device 150, or from a separate device via communication interface 180. It should be understood that a computer-readable medium may include one or more memory devices or carrier waves. Execution of the sequences of instructions contained in memory 130 causes processor 120 to perform the acts that will be described hereafter. In alternative embodiments, hard-wired circuitry may be used in place of or in combination with software instructions to implement aspects consistent with the invention. Thus, the invention is not limited to any specific combination of hardware circuitry and software.
The mobile terminal 210 may comprise an arrangement according to
The housing 221 may be provided on an attachment portion 225, such as a strap or band. The attachment portion 225 allows the sensor portion to be attached to a body part of the user. The attachment portion may comprise a VELCRO fastening band, or any other type of fastening, which in one embodiment may allow the user to attach the sensor 220 to a body part, such as a wrist, ankle, chest etc. The sensor may also be integrated in or attached to a watch, clothing, socks, gloves, etc.
The microphone 222, in one embodiment facing the skin of the user, records sound waves transported internally through the user's own body, which allows recognizing activities that do not generate distinct external sounds, e.g. body activities such as running or walking. It also makes the recording less susceptible to ambient noise and thus provides higher accuracy.
The motion sensor 223, such as an accelerometer, gyroscope etc., allows detecting movement of the user.
In one embodiment, the sensor 220 may record only sound, i.e. it may comprise only a microphone or, when no motion is detected, use only the microphone. In one embodiment both the microphone and the motion sensor are MEMS (microelectromechanical systems) devices.
The controller 224 receives signals from the microphone 222 and motion sensor 223 and, depending on the configuration, may process the signals or transmit them to the mobile terminal. The controller 224 may include any type of processor or microprocessor that interprets and executes instructions. The controller may also include logic that is able to decode media, such as audio files, etc., and generate output to, for example, a speaker, a display, etc. The controller may also include onboard memory for storing information and instructions for execution by the controller.
The transceiver 225, which may include an antenna (not shown), may use wireless communication, including radio signals such as Bluetooth or Wi-Fi, or IR, or wired communication, mainly to transmit signals to the terminal 210 (or other devices).
With reference now to
A more accurate classification may be obtained using the signal from the motion detector 223. Different motions, e.g. walking, running, dancing etc. have different movement characteristics.
The sensor 220 may also be provided with other detectors, e.g. a pulse meter, heartbeat meter, temperature meter, etc.
When the type of sound is determined (2), the activity classification, irrespective of where (sensor, terminal, network) it is carried out, may comprise comparing the sound type data (and motion data and other relevant data) with stored data in a database, or using Bayesian or neural network methods to classify (3) the activity. The classification may be carried out in the sensor, or the data may be provided to the mobile terminal or a network device for classification.
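The database comparison described above could, for example, take the form of a lookup over stored records that pair a sound type with a motion value. The record layout, the field names, the acceleration values and the nearest-match rule in this sketch are hypothetical and serve only to illustrate the idea.

```python
from typing import List, NamedTuple, Optional


class StoredRecord(NamedTuple):
    sound_type: str
    mean_acceleration: float  # hypothetical motion summary value
    activity: str


def classify_activity(
    sound_type: str, mean_acceleration: float, records: List[StoredRecord]
) -> Optional[str]:
    """Pick the stored record with the same sound type and closest motion value.

    Returns None when no stored record matches the detected sound type.
    """
    candidates = [r for r in records if r.sound_type == sound_type]
    if not candidates:
        return None
    best = min(
        candidates,
        key=lambda r: abs(r.mean_acceleration - mean_acceleration),
    )
    return best.activity


# Illustrative stored data; a real database would be built from recordings.
RECORDS = [
    StoredRecord("footstep", 1.2, "walking"),
    StoredRecord("footstep", 2.8, "running"),
    StoredRecord("engine_hum", 1.0, "riding in a car"),
]
```

A call such as `classify_activity("footstep", 2.9, RECORDS)` selects the record whose motion value is nearest, so the same sound type can map to different activities depending on the motion data.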
In one example, the user may have two sensors, as in
It should be noted that the word “comprising” does not exclude the presence of other elements or steps than those listed and the words “a” or “an” preceding an element do not exclude the presence of a plurality of such elements. It should further be noted that any reference signs do not limit the scope of the claims, that the invention may be implemented at least in part by means of both hardware and software, and that several “means”, “units” or “devices” may be represented by the same item of hardware.
A “device,” as the term is used herein, is to be broadly interpreted to include a radiotelephone having the ability to receive and process sound and other data. The device may also be a sound recorder; a global positioning system (GPS) receiver; a personal communications system (PCS) terminal that may combine a cellular radiotelephone with data processing; a personal digital assistant (PDA); a laptop; a camera (e.g., video and/or still image camera) having communication ability; and any other computation or communication device capable of transceiving, such as a personal computer, a home entertainment system, a television, etc.
The various embodiments of the present invention described herein are described in the general context of method steps or processes, which may be implemented in one embodiment by a computer program product, embodied in a computer-readable medium, including computer-executable instructions, such as program code, executed by computers in networked environments. A computer-readable medium may include removable and non-removable storage devices including, but not limited to, Read Only Memory (ROM), Random Access Memory (RAM), compact discs (CDs), digital versatile discs (DVDs), etc. Generally, program modules may include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. Computer-executable instructions, associated data structures, and program modules represent examples of program code for executing steps of the methods disclosed herein. The particular sequence of such executable instructions or associated data structures represents examples of corresponding acts for implementing the functions described in such steps or processes.
Software and web implementations of various embodiments of the present invention can be accomplished with standard programming techniques with rule-based logic and other logic to accomplish various database searching steps or processes, correlation steps or processes, comparison steps or processes and decision steps or processes. It should be noted that the words “component” and “module,” as used herein and in the following claims, are intended to encompass implementations using one or more lines of software code, and/or hardware implementations, and/or equipment for receiving manual inputs.
The foregoing description of embodiments of the present invention has been presented for purposes of illustration and description. The foregoing description is not intended to be exhaustive or to limit embodiments of the present invention to the precise form disclosed, and modifications and variations are possible in light of the above teachings or may be acquired from practice of various embodiments of the present invention. The embodiments discussed herein were chosen and described in order to explain the principles and the nature of various embodiments of the present invention and its practical application to enable one skilled in the art to utilize the present invention in various embodiments and with various modifications as are suited to the particular use contemplated. The features of the embodiments described herein may be combined in all possible combinations of methods, apparatus, modules, systems, and computer program products.
Other solutions, uses, objectives, and functions within the scope of the invention as claimed in the patent claims below should be apparent to the person skilled in the art.
Number | Date | Country | Kind |
---|---|---|---|
12158835.4 | Mar 2012 | EP | regional |
Number | Date | Country | |
---|---|---|---|
61609984 | Mar 2012 | US |