This application is the National Stage under 35 U.S.C. 371 of International Application No. PCT/EP2010/051761, filed Feb. 12, 2010, which claims priority to French Patent Application No. 0950916, filed Feb. 13, 2009, and French Patent Application No. 0950919, filed Feb. 13, 2009, the contents of which are incorporated herein by reference.
1. Field of the Invention
Various embodiments of the invention relate to the field of the interpretation of musical gestures or gestures acting on or as musical instruments. In particular, preferred embodiments relate to a device and a method for processing signals representative of the movements of a music player using an instrument or beating an accompanying rhythm.
2. Description of the Prior Art
Gaming or learning devices and methods have been developed to enable a musical instrument player using an object which simulates said instrument to play a score thereon, where appropriate coupled with the scores of other instruments. The instruments whose interpretation is simulated may be a guitar, a piano, a saxophone, a drum, etc. In such devices, the notes of the score are generated from the actions of the player. Such devices and methods may use buttons which make it possible to trigger the notes, where appropriate by combining said buttons. Certain devices such as the WII™ Music also combine recognition of certain gestures by the musician with presses on the buttons to play the score. Since the WII™ Music motion sensor is an optical sensor which requires a fixed reference, its measurements are both rudimentary and conditioned by the position of the player relative to that reference, which considerably limits the interpretation possibilities. A satisfactory musical rendition in fact requires a high degree of accuracy in capturing those movements of the player which are genuinely intended to actuate the instrument.
Such a rendition is not achieved by the prior art devices, such as that of U.S. Pat. No. 5,663,514.
Embodiments of the present invention respond to these limitations of the prior art by using the measurements of motion sensors on at least two axes, together with a processing of their measurements which provides this accuracy and thus allows for a satisfactory musical rendition.
To this end, the various embodiments of the present invention disclose a device for interpreting gestures of a user comprising at least one input module for measurements comprising at least one motion capture assembly on at least a first and a second axis, a module for processing signals sampled at the output of the input module and an output module capable of playing back the musical meaning of said gestures, the signal processing module comprising a submodule for analyzing and interpreting gestures comprising a filtering function, a function for detecting meaningful gestures by comparison of the variation between two successive values in the sample of at least one of the signals originating from at least the first axis of the set of sensors with at least a first selected threshold value and a function for confirming the detection of a meaningful gesture, wherein said function for confirming the detection of a meaningful gesture is capable of comparing at least one of the signals originating from at least the second axis of the set of sensors with at least a second selected threshold value.
Advantageously, the filtering function can be executed by at least one pair of two successive low-pass recursive filters capable of receiving as input at least one of the signals output from the module.
Advantageously, the function for detecting meaningful gestures can be capable of identifying changes of sign between two successive values in the sample of the difference between at least one output from the first filter of at least one of the pairs of filters at the current value and at least one output from the second filter of the same pair of filters for the same signal at the preceding value.
Advantageously, the submodule for analyzing and interpreting gestures can also comprise a function for measuring the velocity of the gesture detected at the output of the detection confirmation function.
Advantageously, the function for measuring velocity can be capable of computing the travel (Max-Min) between two detected meaningful gestures.
Advantageously, the second filter can be capable of operating at a cut-off frequency less than that of the first filter.
Advantageously, the input module can comprise at least a first sensor of accelerometer type and a second sensor chosen from the group of sensors of magnetometer and rate gyro types.
Advantageously, the function for detecting meaningful gestures can be capable of receiving as input at least one output from the second recursive filter of one of the pairs of filters applied to at least one of the signals from the first sensor.
Advantageously, the function for confirming the detection of a meaningful gesture can be capable of receiving as input at least one output from the second recursive filter of one of the pairs of filters applied to at least one of the signals from the second sensor.
Advantageously, the threshold selected for the function for confirming the detection of a meaningful gesture can be of the order of 5/1000 as a relative value of the filtered signal.
Advantageously, the input module can receive the signals from at least two sensors positioned on two independent parts of the body of the user, a first sensor supplying, via one of the pairs of recursive filters, a signal as input for the function for detecting meaningful gestures and a second sensor supplying, via one of the pairs of recursive filters, a signal as input for the function for measuring the velocity of the gesture detected at the output of the function for confirming the detection of a meaningful gesture.
Advantageously, the signal processing module can comprise an input submodule for prerecorded multimedia contents.
Advantageously, the input submodule for multimedia contents can comprise a function for partitioning said multimedia contents into time windows that can be used to perform a second confirmation of detection of the detected meaningful gestures.
Advantageously, the input module can be capable of transmitting to the processing module a signal representative of the position of the user in a plane substantially orthogonal to the direction of the detected meaningful gesture to perform a second confirmation thereof.
Advantageously, the output module can comprise a submodule for playing back a prerecorded file of signals to be played back and in that the processing module comprises a submodule for controlling the timing of said prerecorded signals, said playback submodule being able to be programmed to determine the times at which strikes controlling the runrate of the file are expected, and in that said timing control submodule is capable of computing, for a certain number of control strikes, a relative corrected speed factor of preprogrammed strikes in the playback submodule and strikes actually entered in the timing control submodule and a relative intensity factor of the velocities of said strikes actually entered and expected then of adjusting the runrate of said timing control submodule to adjust said corrected speed factor on the subsequent strikes to a selected value and the intensity of the signals output from said playback submodule according to said relative intensity factor of the velocities.
Advantageously, the velocity of the entered strike can be computed on the basis of the deviation of the signal output from the second sensor.
Advantageously, the input module can also comprise a submodule capable of interpreting gestures of the user whose output is used by the timing control submodule to control a characteristic of the audio output selected from the group consisting of vibrato and tremolo.
Advantageously, the playback submodule can comprise a function for placing tags in the file of prerecorded signals to be played back at times at which strikes controlling the runrate of the file are expected, said tags being generated automatically according to the rate of the prerecorded signals and being able to be shifted by a MIDI interface.
Advantageously, the value selected in the timing control submodule to adjust the running speed of the playback submodule can be equal to a value selected from a set of computed values of which one of the limits is computed by application of a corrected speed factor CSF equal to the ratio of the time interval between the next tag and the preceding tag minus the time interval between the current strike and the preceding strike to the time interval between the current strike and the preceding strike and whose other values are computed by linear interpolation between the current value and the value corresponding to that of the limit used for the application of the speed factor CSF.
Advantageously, the value selected in the timing control submodule to adjust the running speed of the playback submodule can be equal to the value corresponding to that of the limit used for the application of the corrected speed factor.
Various embodiments also disclose a method for interpreting meaningful gestures of a user comprising at least one step for inputting measurements originating from at least one motion capture assembly along at least a first and a second axis, a step for processing signals sampled at the output of the input step and an output step capable of playing back the musical meaning of said gestures, the signal processing step comprising a substep for analyzing and interpreting gestures comprising at least one filtering step, a function for detecting meaningful gestures by comparison of the variation between two successive values in the sample of at least one of the signals originating from at least the first axis of the set of sensors with at least a first selected threshold value and a function for confirming the detection of a meaningful gesture, wherein said function for confirming the detection of a meaningful gesture is capable of comparing at least one of the signals originating from at least the second axis of the set of sensors with at least a second selected threshold value.
Advantageously, the output step can comprise a substep for playing back a prerecorded file of signals to be played back and in that the processing step comprises a substep for controlling the timing of said prerecorded signals, said playback substep being capable of determining the times at which strikes controlling the runrate of the file are expected, and said timing control substep being capable of computing, for a certain number of control strikes, a relative corrected speed factor of preprogrammed strikes in the playback substep and of strikes actually entered during the timing control substep and a relative intensity factor of the velocities of said strikes actually entered and expected then of adjusting the runrate of said prerecorded file to adjust said corrected speed factor on the subsequent strikes to a selected value and the intensity of the signals output from the playback step according to said relative intensity factor of the velocities.
Another advantage of certain embodiments of the invention is that they use inexpensive microsensors (accelerometers and magnetometers or rate gyros). They can be used to play with the hands and/or beat time with the feet. They do not require a lengthy learning phase and can be used by a number of players. They can be used with a large number of movements and instruments. They can also be used without an object simulating any instrument.
Furthermore, embodiment devices and methods of the invention can be used to control the runrate and the playback volume of an mp3 or wav audio file while ensuring a satisfactory musical rendition. In addition, certain embodiments make it possible to control the running of prerecorded audio files intuitively. New algorithms for controlling the running can also be incorporated easily in embodiment devices.
FIGS. 8a and 8b represent two cases of control of the running of an audio file in which, respectively, the strike speed is higher and lower than that at which the audio track runs.
A MotionPod includes a triaxial accelerometer, a triaxial magnetometer, a preprocessing capability for conditioning the signals from the sensors, a radiofrequency transmission module for transmitting said signals to the processing module itself, and a battery. This motion sensor is called "3A3M" (three accelerometer axes and three magnetometer axes). The accelerometers and magnetometers are market-standard microsensors with a small footprint, low consumption and low cost, for example a three-channel accelerometer from the company Kionix™ (KXPA4 3628) and Honeywell™ magnetometers of HMC1041Z type (1 vertical channel) and HMC1042L type (2 horizontal channels). There are other suppliers: Memsic™ or Asahi Kasei™ for the magnetometers and STM™, Freescale™ or Analog Devices™ for the accelerometers, to cite only a few. In the MotionPod, for the 6 signal channels, there is only an analog filtering; then, after analog-digital conversion (12-bit), the raw signals are transmitted by a radiofrequency protocol in the Bluetooth™ band (2.4 GHz) optimized for low consumption in this type of application. The data therefore arrive raw at a controller, which can receive the data from a set of sensors. The data are read by the controller and made available to the software. The sampling rate can be adjusted: by default, it is set to 200 Hz, but higher values (up to 3000 Hz, or even higher) can be considered, allowing for greater accuracy in the detection of impacts, for example. The radiofrequency protocol of the MotionPod ensures that the data are made available to the controller with a controlled delay, which in this case must not exceed 10 ms (at 200 Hz), an important constraint for music.
An accelerometer of the above type makes it possible to measure the longitudinal displacements on its three axes and, by transformation, angular displacements (except around the direction of the Earth's gravitational field) and orientations according to a three-dimensional Cartesian reference frame. A set of magnetometers of the above type makes it possible to measure the orientation of the sensor to which it is fixed relative to the Earth's magnetic field, and therefore relative to the three axes of the reference frame (except around the direction of the Earth's magnetic field). The 3A3M combination supplies complementary and smooth movement information.
In fact, in an embodiment of the invention, only the information relating to a single axis (the vertical Z axis, or one of the other two axes) is used. It is therefore possible in principle to use only a monoaxial sensor of each of the types, when two types of sensors (accelerometer and magnetometer, or accelerometer and rate gyro) are used. In practice, given the inexpensive availability of 3A3M sensor modules incorporating transmission and processing functions for the six channels, it is this approach which is preferred.
Other motion sensors can be used, for example a combination of accelerometer and of rate gyro (so-called “3A3G” sensors) or even just one triaxial rate gyro, as explained below in the description as a commentary to other figures.
When a number of sets of motion sensors are used, the remote controller of the MotionPod (at the input of the processing module 20, 210) synthesizes the signals from the sets of sensors. A trade-off has to be found between the number of sensors, the sampling frequency of the sensors and the energy autonomy of the sets of sensors. Hereinafter in the description, "output signal from the accelerometer" or "from the magnetometer", in the singular, will be used without differentiation to designate the outputs of the controller, whether the input data originate from a single 3A3M sensor module or from a set of 3A3M modules synthesized in the controller.
The AirMouse comprises two sensors of rate gyro type, each with one rotation axis. The rate gyros used are Epson brand, reference XV3500. Their axes are orthogonal and deliver pitch angles (rotation about an axis parallel to the horizontal axis of a plane situated facing the user of the AirMouse) and yaw angles (rotation about an axis parallel to the vertical axis of that same plane). The instantaneous pitch and yaw speeds measured by the two rate gyro axes are transmitted by a radiofrequency protocol to a controller of the input module (10) and converted by said controller into the movement of a cursor on a screen situated facing the user. In an embodiment application, it is possible to use either one of the signals controlling the cursor (in Z or in Y), or both, or a direct measurement signal output from one of the rate gyro axes.
The functionalities and the architecture of the processing module 20 will be described in conjunction with the appended figures.
An output module 30 plays back the sounds produced by the combination of prerecorded contents and the capture of the musical gestures produced by the player via the input module 10. It may be a simple loudspeaker or a synthesizer.
The functional architecture of an embodiment device is described in the appended figures.
The module 20 processes the signals received from the input module 10 in a module for analyzing and interpreting gestures 210 whose outputs are supplied to a module for computing control data for the musical content 230. A prerecorded multimedia content is also supplied by a module 220 to the module 230.
To correctly specify the algorithm for analyzing and interpreting the musical body language implemented in the module 210, it is desirable to take into account the specifics of said body language. In particular, playing a 5-minute piece of music while beating a medium-fast tempo of 120 bpm (beats per minute) translates into 600 beats performed by the user. Now, in a musical context, a single error is reflected in a sensory break or a loss of interest in the device: in a false-alarm situation, the system detects nonexistent beats, and in a nondetection situation, the playing of the piece is interrupted. Moreover, in a situation of musical interpretation by beating time, the user adopts a body language which, on the one hand, is specific to him and, on the other hand, allows for a certain variability within that specific body language. Furthermore, physiological motor phenomena specific to human beings, which are themselves dependent on the beating speed, are superimposed on this variability (there is a quasi-sinusoidal mode at high speed, but with strong bounces at slow speed).
These observations can lead to a number of consequences:
Furthermore, the behavior of the user can depend directly on his interaction with the content that he is interpreting. It is therefore desirable to provide an in-situ method, that is to say, placing the human system in an action/perception loop including all the aspects involved (content, brain and cognitive processes, body language, actuators, sensors, etc.).
To meet these specifications, the general processing principle implemented in the module 210 can have the following two characteristics:
The module 220 is used to insert prerecorded contents of MIDI (Musical Instrument Digital Interface) type coming from an electronic musical instrument, audio coming from a drive (MP3—MPEG (Moving Picture Experts Group) 1/2 Layer 3, WAV—WAVeform audio format, WMA—Windows Media Audio, etc.), multimedia, images, video, etc., via an appropriate interface. The outputs from the module 220 are supplied concurrently to the module 210 (to enable the reactions of the music player to be taken into account) and to the module 230, to be then played back as output from the processing device.
The module 230 makes it possible to synthesize the musical gestures interpreted by the module 210 and the prerecorded contents output from the module 220. The simplest mode is to play a fragment, for example MP3-coded or from a MIDI file (or even from a video file), each time a strike is detected by the module 210, which will then search sequentially for the fragments in the module 220. This mode allows for numerous interesting applications. The arrangement is much more flexible and powerful when the module 220 incorporates a method such as the one we have disclosed in application No. FR07/55244, entitled "Computer-assisted music interpretation system", whose holder is the inventor of the present application. The device disclosed in that application comprises two memories, one containing musical data defining all the musical events forming the piece of music to be interpreted, the other containing the sequence of actions used to play back the stored musical events, together with means for establishing the musical output by comparing the data stored in the two memories. In this case, the user has complete control over what he wants to play and when, and over what is left to the initiative of the machine (for example, an accompaniment).
The processing operations comprise, first of all, a low-pass filtering of the outputs of the sensors of the two modalities (accelerometer and magnetometer), whose detailed operation is explained below. The equation of the first recursive filter is:
Output(z(n))=0.3*Input(z(n−1))+0.7*Output(z(n−1))
In which, for each of the modalities:
z is the reading of the modality on the axis used;
n is the index of the current sample;
n−1 is the index of the preceding sample.
The processing then includes a low-pass filtering of the two modalities with a cut-off frequency less than that of the first filter. This lower cut-off frequency results from choosing a coefficient for the second filter which is less than the coefficient of the first filter. In the example above, in which the coefficient of the first filter is 0.3, the coefficient of the second filter may be set to 0.1. The equation of the second filter is then (with the same notations as above):
Output(z(n))=0.1*Input(z(n−1))+0.9*Output(z(n−1))
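By way of illustration, the two recursive filter equations above can be transcribed directly into code. The sketch below, in Python, applies the cascade to a stream of samples; the function and variable names, and the hypothetical sample values, are illustrative assumptions rather than part of the disclosed device:

```python
def recursive_lowpass(prev_output: float, prev_input: float, coeff: float) -> float:
    """One step of the recursive filter: Output(n) = c*Input(n-1) + (1-c)*Output(n-1)."""
    return coeff * prev_input + (1.0 - coeff) * prev_output

# Hypothetical z-axis accelerometer readings, for illustration only.
samples = [0.0, 0.2, 0.5, 0.4, 0.1, -0.3, -0.5]

af1 = af2 = 0.0            # states of the first (AF1) and second (AF2) filters
prev_raw = prev_af1 = 0.0  # inputs of each stage at the preceding sample
for z in samples:
    af1 = recursive_lowpass(af1, prev_raw, 0.3)   # first filter, coefficient 0.3
    af2 = recursive_lowpass(af2, prev_af1, 0.1)   # second filter, coefficient 0.1
    prev_raw, prev_af1 = z, af1
```

The same pair of filters, applied to the magnetometer signal, yields the outputs denoted BF1 and BF2 below.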
Then, the processing includes a detection of a zero in the derivative of the signal output from the accelerometer, confirmed by the measurement of the signal output from the magnetometer.
The following notations are used:
AF1(n) is the output of the first filter applied to the accelerometer signal in the sample n;
AF2(n) is the output of the second filter applied to the accelerometer signal in the sample n;
BF1(n) and BF2(n) are, likewise, the outputs of the first and second filters applied to the magnetometer signal.
Then, the following equation can be used to compute a filtered derivative of the signal from the accelerometer in the sample n:
FDA(n)=AF1(n)−AF2(n−1)
A negative sign for the product FDA(n)*FDA(n−1) indicates a zero in the derivative of the filtered signal from the accelerometer and therefore detects a strike.
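This sign test can be written compactly; the sketch below is an illustrative Python rendering under the notations above (the function name and state handling are assumptions):

```python
def detect_strike(af1_n: float, af2_prev: float, fda_prev: float):
    """Return (strike_detected, fda_n), with FDA(n) = AF1(n) - AF2(n-1).

    A negative product FDA(n)*FDA(n-1) signals a sign change, i.e. a zero
    of the filtered derivative and hence a candidate strike."""
    fda_n = af1_n - af2_prev
    return fda_n * fda_prev < 0.0, fda_n
```

On the first sample, fda_prev can be initialized to zero, in which case the product is zero and no strike is reported until two valid values of FDA are available.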
For each of these zeros of the derivative of the filtered signal from the accelerometer, the processing module checks the intensity of the deviation of the other modality at the filtered output of the magnetometer. If this value is too low, the strike is considered not to be a primary strike but a secondary or tertiary strike, and is discarded. The threshold making it possible to discard the non-primary strikes depends on the expected amplitude of the deviation of the magnetometer; typically, this value will be of the order of 5/1000 in the applications envisaged. This part of the processing therefore makes it possible to eliminate the meaningless strikes.
Finally, for all the primary strikes detected, the processing module computes a strike velocity (or volume) signal by using the deviation of the signal filtered at the output of the magnetometer.
The value DELTAB(n), which can be considered to be the pre-filtered signal of the centered magnetometer, is then introduced in the sample n; it is computed as follows:
DELTAB(n)=BF1(n)−BF2(n)
The minimum and maximum values of DELTAB(n) are stored between two detected primary strikes. An acceptable value VEL(n) of the velocity of a primary strike detected in a sample n is then given by the following equation:
VEL(n)=Max{DELTAB(n),DELTAB(p)}−Min{DELTAB(n),DELTAB(p)}
In which p is the index of the sample in which the preceding primary strike was detected. The velocity is therefore the travel (Max−Min difference) of the centered signal between two detected primary strikes, characteristic of musically meaningful gestures.
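Gathering the confirmation threshold and the velocity computation, a possible implementation is sketched below in Python; the class structure, the exact threshold handling and the names are assumptions made for illustration, with only the formulas taken from the description above:

```python
THRESHOLD = 0.005  # "of the order of 5/1000" as a relative value of the filtered signal

class VelocityTracker:
    """Tracks DELTAB = BF1 - BF2 between primary strikes and derives the velocity."""

    def __init__(self):
        self.deltab_min = float("inf")
        self.deltab_max = float("-inf")

    def update(self, bf1: float, bf2: float) -> None:
        """Called on every sample: store the extrema of the centered signal."""
        deltab = bf1 - bf2
        self.deltab_min = min(self.deltab_min, deltab)
        self.deltab_max = max(self.deltab_max, deltab)

    def confirm_strike(self):
        """Called on each candidate strike: return VEL = Max - Min if the
        magnetometer deviation confirms a primary strike, else None."""
        travel = self.deltab_max - self.deltab_min
        if travel < THRESHOLD:
            return None  # secondary or tertiary strike: discarded, window kept
        self.deltab_min, self.deltab_max = float("inf"), float("-inf")
        return travel    # usable as the strike velocity (volume)
```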
This part of the processing is illustrated by the appended figures.
An adaptive processing is thus performed, because the processing of the magnetic modality includes a centering of the signal: the signal's own slow variations are subtracted from it (see the formula above). Thus, for example, if the user turns by 60° to his right, the magnetic signals received will be shifted, but the corresponding offset will be removed by the subtraction concerned, retaining only the rapid variations due to the musical rhythm.
This processing according to embodiments of the invention makes it possible to interpret, without a single error, pieces lasting a few minutes, with a fine control of both playing speed and volume, both when the sensors are placed on the hand of the player and when they are situated on the foot of a player who beats time with his foot. The embodiment devices can be used as such, that is to say without any calibration, even of the magnetometers (the device in fact works only on signals stripped of their continuous components). It may, however, be advantageous to perform a calibration at the start of play, a calibration which may also be renewed on each strike. It is then desirable to run in parallel the filtering designed to dispense with the slow variations and this calibration on each strike. In this case, it is no longer necessary to filter using the second filter: the calibration ensures that, in an "approximate" position known to the user (at the moment of the strike), the magnetometer supplies a reference datum. In a way, the data are realigned by these calibrations, whereas they were previously realigned by the second filtering. It is also possible to combine the second filtering and the calibration.
Moreover, these processing operations as a whole can provide:
The AirMouse or the GyroMouse from Movea (player 120b) can also be used as the motion capture device.
The processing performed in the module 210 is comparable to the processing described above, except that a single sensor datum is used, which can, as a first approximation, be considered to lie physically mid-way between the accelerometer datum and the magnetometer datum (which supplies absolute angles). The rate gyro is in this case used for both detections. For the detection of the primary strike, the processing is comparable to that of the accelerometer above, except that the second filtering is not necessary, because a first filtering is already performed in the AirMouse or the GyroMouse; the two filterings may, however, be added together.
In this case, crossings between the derivative of the signal obtained from the AirMouse and this same signal low-pass filtered recursively are detected.
The detection of the power of the gesture is also based on a measurement of the travel between two successive detected primary strikes.
This velocity computation gives usable results, but it is less effective than the approach with two modalities. Because the measurements from the rate gyro are intermediate in nature between those from an accelerometer and those from a magnetometer, said rate gyro is sufficient for both detections, but it is also less effective than the dedicated modalities. This solution provides a trade-off which is not optimal but which may offer other opportunities. On the one hand, the AirMouse is more accessible, at least for the time being, to the general public and is therefore of interest from this point of view, even if it does not offer the fine level of control of the bimodality solution; in a way, the AirMouse lies between the Wii Music and a sensor providing two motion capture modes. On the other hand, the mouse buttons provide additional controls in order, for example, to change a sound, to switch to the next piece, or to operate the pedal of a sampled piano.
The various embodiments of the invention can be enhanced by the variants explained below.
One variant embodiment uses two sensor modules in each of the player's hands, one of the modules being dedicated to detecting primary strikes and the other to measuring the velocity.
It is also possible to exploit the other axes of the sensors to determine heading information which makes it possible to introduce a pan control and thus improve the centering, to make the detections completely independent of the positioning of the player.
Another variant embodiment that makes it possible to improve the robustness involves exploiting the knowledge of the current musical content. Time windows, deduced from the current content, are then introduced, in which a strike detected as primary is not taken into account because it is inconsistent with said current content. In practice, this consistency check can exploit a measurement of the current playing speed of the person (the time between the last two strikes) and compare it to the time elapsing between the two fragments contained in the module 220. If these two measurements differ excessively (for example by more than 25%), an acceleration (or a deceleration) is registered which seems excessive relative to what is being played, and it is deduced that there has been a false detection. Such a strike is devoid of musical sense and is therefore purely and simply disregarded (it does not trigger any multimedia fragment). Conversely, a nondetection can be overcome simply, the paced elements of the piece being played on the basis of the last two detected strikes.
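A possible form of this consistency test is sketched below in Python; the 25% tolerance comes from the example above, while the function signature and names are illustrative assumptions:

```python
def strike_is_consistent(strike_interval_s: float,
                         fragment_interval_s: float,
                         tolerance: float = 0.25) -> bool:
    """True if the user's current playing speed (time between the last two
    strikes) is consistent with the time between the two corresponding
    fragments of the prerecorded content; False flags a false detection."""
    if fragment_interval_s <= 0.0:
        return True  # no usable reference interval: accept the strike
    relative_gap = abs(strike_interval_s - fragment_interval_s) / fragment_interval_s
    return relative_gap <= tolerance
```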
The characteristics of the module 720, for the input of the signals to be played back, of the module 730 for controlling the timing, and of the audio output module 740 are described later. The motion sensors of MotionPod or AirMouse type described above are, in the embodiment described here, used to control the runrate of a prerecorded audio file. The module for analyzing and interpreting gestures 712, adapted to this embodiment, supplies signals that can be directly exploited by the timing control processor 730. The signals on one axis of the accelerometer and of the magnetometer of the MotionPod are combined according to the method described above.
The processing operations advantageously comprise, first of all, a double low-pass filtering of the outputs of the sensors of the two modalities (accelerometer and magnetometer), which has already been described above.
Then, the processing includes the detection of a zero in the derivative of the signal output from the accelerometer, together with the measurement of the signal output from the magnetometer, according to the modalities explained above.
The modalities enabling the embodiment device to control the running of an mp3, wav or similar type file are explained below.
A prerecorded music file 720 with one of the standard formats (MP3, WAV, WMA, etc.) is taken from a storage unit by a drive. This file has associated with it another file including time marks, or “tags”, at predetermined instants; for example, the table below indicates nine tags at the instants in milliseconds which are indicated alongside the index of the tag after the comma:
The tags can advantageously be placed at the beats of the same index in the piece that is being played. There is, however, no limitation on the number of tags. There are a number of possible techniques for placing tags in a prerecorded piece of music:
The module 720 for the input of prerecorded signals to be played back can process different types of audio files, in the MP3, WAV, WMA formats. The file may also contain multimedia content other than a simple sound recording. This may be, for example, video content, with or without sound tracks, which can be marked with tags and whose running can be controlled by the input module 710.
The timing control processor 730 handles the synchronization between the signals received from the input module 710 and the prerecorded piece of music 720, in a manner explained below.
The audio output 740 plays back the prerecorded piece of music originating from the module 720 with the rhythm variations introduced by the commands from the input module 710, as interpreted by the timing control processor 730. Any sound playback device can be used, notably headphones and loudspeakers.
On the first strike identified by the motion sensor 711, the audio player of the module 720 starts playing the prerecorded piece of music at a given pace. This pace may, for example, be indicated by a number of preliminary small strikes. Each time the timing control processor receives a strike signal, the current playing speed of the user is computed. This may, for example, be expressed as the speed factor SF(n) computed as the ratio of the time interval between two successive tags T, n and n+1, of the prerecorded piece to the time interval between two successive strikes H, n and n+1, of the user:
SF(n)=[T(n+1)−T(n)]/[H(n+1)−H(n)]
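The formula lends itself to a direct transcription; the short Python sketch below assumes that T and H are lists of tag times and strike times in milliseconds (the list convention and names are assumptions):

```python
def speed_factor(T: list, H: list, n: int) -> float:
    """SF(n) = [T(n+1) - T(n)] / [H(n+1) - H(n)]."""
    return (T[n + 1] - T[n]) / (H[n + 1] - H[n])

# Example: tags 500 ms apart, strikes 400 ms apart -> SF = 1.25,
# i.e. the user beats faster than the recorded tempo.
assert speed_factor([0, 500, 1000], [0, 400, 800], 0) == 1.25
```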
In the case of FIG. 8a, the strike speed is higher than that at which the audio track runs.
In the case of FIG. 8b, the strike speed is lower than that at which the audio track runs.
Three positions of the tags at the instant n+2 (in the timescale of the audio file) before change of player speed are indicated in the figure.
CSF is the ratio of the time interval from the strike n+1 to the tag n+2 to the time interval from the strike n+1 to the strike n+2. Its computation formula is as follows:
CSF={[T(n+2)−T(n)]−[H(n+1)−H(n)]}/[H(n+1)−H(n)]
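Under the same conventions as the previous sketch, the corrected speed factor can be transcribed as follows (again an illustrative Python rendering, not the patent's own code):

```python
def corrected_speed_factor(T: list, H: list, n: int) -> float:
    """CSF = {[T(n+2) - T(n)] - [H(n+1) - H(n)]} / [H(n+1) - H(n)]."""
    strike_interval = H[n + 1] - H[n]
    return ((T[n + 2] - T[n]) - strike_interval) / strike_interval
```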
It is possible to enhance the musical rendition by smoothing the profile of the tempo of the player. For this, instead of adjusting the running speed of the playback device as indicated above, it is possible to compute a linear variation between the target value and the starting value over a relatively short duration, for example 50 ms, and change the running speed through these different intermediate values. The longer this adjustment time becomes, the smoother the transition will be. This allows for a better rendition, notably when many notes are played by the playback device between two strikes. However, the smoothing is obviously done to the detriment of the dynamic of the musical response.
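The smoothing enhancement can be sketched as a short linear ramp between the current speed and the target speed; the 50 ms duration comes from the example above, while the step size, the function names and the hypothetical set_playback_speed setter are assumptions:

```python
def speed_ramp(current: float, target: float,
               ramp_ms: float = 50.0, step_ms: float = 5.0):
    """Yield intermediate playback-speed values from current to target,
    spread linearly over ramp_ms, instead of jumping in one step."""
    steps = max(1, int(ramp_ms / step_ms))
    for i in range(1, steps + 1):
        yield current + (target - current) * i / steps

# Usage (set_playback_speed is a hypothetical player API):
# for s in speed_ramp(1.0, 1.25):
#     set_playback_speed(s)
```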
Another enhancement, applicable to the embodiment comprising one or more motion sensors, consists in measuring the strike energy, or velocity, of the player to control the audio output volume. The manner in which the velocity is measured is indicated above in the description.
This part of the processing performed by the module 712 for analyzing and interpreting gestures is represented in the appended figures.
For all the primary strikes detected, the processing module computes a strike velocity (or volume) signal by using the deviation of the signal filtered at the output of the magnetometer.
Using the same notations as above, the value DELTAB(n), which can be considered to be the pre-filtered signal of the centered magnetometer, is computed as follows:
DELTAB(n)=BF1(n)−BF2(n)
The minimum and maximum values of DELTAB(n) are stored between two detected primary strikes. An acceptable value VEL(n) of the velocity of a primary strike detected in a sample n is then given by the following equation:
VEL(n)=Max{DELTAB(n),DELTAB(p)}−Min{DELTAB(n),DELTAB(p)}
In which p is the index of the sample in which the preceding primary strike was detected. The velocity is therefore the travel (Max−Min difference) of the centered signal between two detected primary strikes, characteristic of musically meaningful gestures.
It is also possible to envisage, in this embodiment comprising a number of motion sensors, using other gestures to control other musical parameters such as the spatial origin of the sound (or panning), vibrato or tremolo. For example, a sensor in a hand will make it possible to detect the strike while another sensor held in the other hand will make it possible to detect the spatial origin of the sound or the tremolo. Rotations of the hand may also be taken into account: when the palm of the hand is horizontal, a value of the spatial origin of the sound or of the tremolo is obtained; when the palm is vertical, another value of the same parameter is obtained; in both cases, the movements of the hand in space provide the detection of the strikes.
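As an illustration of the palm-orientation control described above, the sketch below maps the gravity component measured on one accelerometer axis of the second sensor to a pan value; the axis choice, the scaling to [-1, 1] and the assumption that the hand rotates slowly enough for the accelerometer to read mainly gravity are assumptions, not part of the disclosure:

```python
import math

G = 9.81  # gravitational acceleration, m/s^2

def pan_from_palm(accel_lateral: float) -> float:
    """Map the low-pass filtered acceleration on the hand's lateral axis
    to a pan value in [-1, 1]:
      palm horizontal -> reading near 0    -> pan 0 (center);
      palm vertical   -> reading near +/-G -> pan +/-1 (full left/right)."""
    ratio = max(-1.0, min(1.0, accel_lateral / G))
    return math.asin(ratio) / (math.pi / 2.0)
```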
In the case where a MIDI keyboard is used, the controllers conventionally used may also be used in this embodiment of the invention to control the spatial origin of the sounds, tremolo or vibrato.
Various embodiments of the invention may advantageously be implemented by processing the strikes through a MAX/MSP program.
The display in the figure shows the waveform associated with the audio piece loaded in the system. There is a conventional part making it possible to listen to the original piece.
Bottom left there is a part which is represented in a detail figure.
In the column on the right, the acceleration/slowing down coefficient SF is computed by comparison between the duration that exists between two consecutive markers, on the one hand in the original piece and on the other hand in the actual playing of the user. The formula for computing this speed factor is given above in the description.
In the central column, a timeout is set that makes it possible to stop the running of the audio if the user has not performed any more strikes for a time dependent on the current musical content.
The left-hand column contains the core of the control system. It relies on a time compression/expansion algorithm. The difficulty lies in transforming a "discrete" control, therefore one occurring at consecutive instants, into an even modulation of the speed. By default, the listening suffers on the one hand from total interruptions of the sound (when the player slows down), and on the other hand from clicks and sudden jumps when he speeds up. These defects, which make such an approach unrealistic because of a musically unusable audio output, are resolved in the embodiment implementation developed. It includes:
The examples described above are given as a way of illustrating embodiments of the invention. They in no way limit the scope of the invention which is defined by the following claims.
References Cited

U.S. Patent Documents

Number | Name | Date | Kind
5166463 | Weber | Nov 1992 | A |
5170002 | Suzuki et al. | Dec 1992 | A |
5648627 | Usa | Jul 1997 | A |
5663514 | Usa | Sep 1997 | A |
5684259 | Horii | Nov 1997 | A |
5746640 | Meadows | May 1998 | A |
5808219 | Usa | Sep 1998 | A |
5819206 | Horton et al. | Oct 1998 | A |
6005181 | Adams et al. | Dec 1999 | A |
6011212 | Rigopulos et al. | Jan 2000 | A |
6066794 | Longo | May 2000 | A |
6088017 | Tremblay et al. | Jul 2000 | A |
6150947 | Shima | Nov 2000 | A |
RE37654 | Longo | Apr 2002 | E |
6388183 | Leh | May 2002 | B1 |
7474197 | Choi et al. | Jan 2009 | B2 |
7489979 | Rosenberg | Feb 2009 | B2 |
8222507 | Salazar et al. | Jul 2012 | B1 |
20010015123 | Nishitani et al. | Aug 2001 | A1 |
20020026866 | Nishitani et al. | Mar 2002 | A1 |
20020088335 | Nishitani et al. | Jul 2002 | A1 |
20020166439 | Nishitani et al. | Nov 2002 | A1 |
20030188627 | Longo | Oct 2003 | A1 |
20030196542 | Harrison, Jr. | Oct 2003 | A1 |
20040000225 | Nishitani et al. | Jan 2004 | A1 |
20040046736 | Pryor et al. | Mar 2004 | A1 |
20060028446 | Liberty et al. | Feb 2006 | A1 |
20060142082 | Chiang et al. | Jun 2006 | A1 |
20070021208 | Mao et al. | Jan 2007 | A1 |
20070039450 | Ohshima et al. | Feb 2007 | A1 |
20070113726 | Oliver et al. | May 2007 | A1 |
20070118241 | Rosenberg | May 2007 | A1 |
20070175321 | Baum et al. | Aug 2007 | A1 |
20090211432 | Casillas et al. | Aug 2009 | A1 |
20120059494 | David | Mar 2012 | A1 |
20120062718 | David | Mar 2012 | A1 |
20120103168 | Yamanouchi | May 2012 | A1 |
20120174736 | Wang et al. | Jul 2012 | A1 |
20120260789 | Ur et al. | Oct 2012 | A1 |
20130032023 | Pulley et al. | Feb 2013 | A1 |
20130118339 | Lee et al. | May 2013 | A1 |
Foreign Patent Documents

Number | Date | Country
0 747 851 | Dec 1996 | EP |
1 837 858 | Sep 2007 | EP |
1 850 318 | Oct 2007 | EP |
Other Publications
Business Wire, "Movea unveils the latest in motion-sensing technology for consumer products", Jan. 8, 2009.
Stephen Totilo, "One on one with Shigeru Miyamoto: from WII Music to Bowser to . . . Motion Plus?", Oct. 27, 2008.
International Search Report and Written Opinion issued in International Application No. PCT/EP2010/051761, dated Nov. 19, 2010.