The present invention relates to a method and a device for humanizing music sequences. In particular, it relates to humanizing drum sequences.
Large parts of existing music are characterized by a sequence of stressed and unstressed beats (often called “strong” and “weak”). Beats divide the time axis of a piece of music or a musical sequence by impulses or pulses. The beat is intimately tied to the meter (metre) of the music as it designates that level of the meter (metre) that is particularly important, e.g. for the perceived tempo of the music.
A well-known instrument for determining the beat of a musical sequence is a metronome. A metronome is any device that produces a regulated audible and/or visual pulse, usually used to establish a steady beat, or tempo, measured in beats-per-minute (BPM) for the performance of musical compositions. Ideally, the pulses are equidistant.
However, humans performing music will never exactly match the beat given by a metronome. Instead, music performed by humans will always exhibit a certain amount of fluctuation compared with the steady beat of a metronome. Machine-generated music, on the other hand, such as an artificial drum sequence, has no difficulty in always keeping the exact beat, as synthesizers and computers are equipped with ultra-precise clocking mechanisms.
But machine-generated music, an artificial drum sequence in particular, is often recognizable precisely because of this perfection and is frequently devalued by audiences due to a perceived lack of human touch. The same holds true for music performed by humans which is recorded and then undergoes some kind of analogue or digital editing. Post-processing is a standard procedure in contemporary music production, e.g. for the purpose of enhancing human-performed music that has shortcomings due to a lack of performing skills, inadequate instruments, etc. Here also, even music originally performed by humans may acquire an undesired artificial touch.
Therefore, there exists a desire to generate or modify music on a machine that sounds more natural.
It is therefore an object of the present invention to provide a method and a device for generating or modifying music sequences having a more human touch.
This object is achieved according to the invention by a method and a device according to the independent claims. Advantageous embodiments are defined in the dependent claims.
The term sound to which the claims refer is defined herein as a subsequence of a music sequence. In some embodiments, a sound may correspond to a note or a beat played by an instrument. In other embodiments, it may be a sound sample and more particularly a loop, i.e. a sample of music for continuous repetition. Each sound has a temporal occurrence t within the music sequence.
Preliminary results of empirical experiments carried out by the inventors strongly indicate that a rhythm comprising a natural random fluctuation as generated according to the invention sounds much better, or more natural, to listeners than the same rhythm comprising a fluctuation due to white noise with the same standard deviation, regardless of whether the white noise is Gaussian or uniformly distributed.
These and further aspects and advantages of the present invention will become more apparent when studying the following detailed description of the invention, in connection with the attached drawings.
The beats or clicks of the metronome occur on times t1, t2 and t3 and constitute a regular sequence of the form
$$t_n = t_0 + nT, \qquad (1)$$
wherein tn is the temporal occurrence or time of the n-th beat, t0 is the time of the initial beat and T denotes the time between metronome clicks.
The human drummer's beats occur on times t′1, t′2 and t′3 and constitute an irregular sequence. The offsets o_n between the metronome clicks and the drummer's beats may be calculated as
$$o_n = t_n - t'_n. \qquad (2)$$
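As a minimal sketch of equations (1) and (2), the following Python snippet builds the regular metronome grid and computes the offsets of a short performance against it. The function names and the sample beat times are hypothetical and chosen only for illustration; they are not measurements from the inventors.

```python
def metronome_times(t0, period, count):
    """Return the click times t_n = t0 + n*T for n = 0 .. count-1 (equation 1)."""
    return [t0 + n * period for n in range(count)]


def beat_offsets(metronome, performed):
    """Return o_n = t_n - t'_n for each metronome click and performed beat (equation 2)."""
    if len(metronome) != len(performed):
        raise ValueError("both sequences must contain the same number of beats")
    return [t - t_played for t, t_played in zip(metronome, performed)]


# 120 BPM corresponds to T = 0.5 s between clicks.
clicks = metronome_times(t0=0.0, period=0.5, count=4)

# Made-up performed beat times in seconds (illustrative only).
played = [0.01, 0.49, 1.02, 1.50]

print(beat_offsets(clicks, played))
```

With the sign convention of equation (2), a beat played late yields a negative offset and a beat played early yields a positive one.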
Alternatively, the above definitions may also be generalized in order to track deviations of a sequence from a given metric pattern instead of from a metronome. In other words, instead of taking regular distances T for the metronome clicks, a more complex metronome signal can be generated wherein the distances between clicks are not equal but are distributed according to a more complex pattern. In particular, the pattern may correspond to a particular rhythm.
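One possible reading of such a generalized metronome signal is a click sequence whose spacing cycles through a fixed list of intervals. The sketch below assumes exactly that; the function name and the interval values are hypothetical.

```python
from itertools import cycle, islice


def pattern_click_times(t0, intervals, count):
    """Return `count` click times starting at t0, spaced by a repeating interval pattern."""
    if not intervals:
        raise ValueError("intervals must not be empty")
    times = []
    t = t0
    for gap in islice(cycle(intervals), count):
        times.append(t)   # record the current click
        t += gap          # advance by the next interval of the pattern
    return times


# A long click followed by two short ones (values in seconds, illustrative only).
print(pattern_click_times(t0=0.0, intervals=[0.5, 0.25, 0.25], count=5))
```

Offsets can then be computed against these reference times in the same way as against a regular metronome.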
Now, according to empirical investigations of the inventors, the offsets of human drum sequences may be described by Gaussian-distributed 1/f^α noise, where f is a frequency and α is a shape parameter of the spectrum.
With regard to the invention, in particular with respect to human drumming, the parameter α may be estimated empirically by comparing the beat sequence generated by a human drum player (or several of them) with a metronome. More particularly, the temporal differences between the human and the artificial beats correspond to the offsets oi of
Experiments carried out by the inventors, using their own recordings as well as recordings of drummers provided by professional recording studios, revealed that the exponent α appears to be largely independent of the drummer. The parameter α also clearly appears to be greater than zero. In general, it appears to be smaller than 2.0; for drumming, it has been determined to be smaller than 1.5 in general. However, the offsets of different human drummers may differ in standard deviation and mean.
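The text does not state which estimator the inventors used for α. One common approach, shown here purely as an assumption, is to compute a periodogram of the offset series and fit a straight line to log(power) versus log(frequency); the negative slope is then an estimate of α. The sketch uses a naive discrete Fourier transform from the Python standard library, and all function names are hypothetical.

```python
import cmath
import math


def periodogram(series):
    """Return (frequencies, power) at the positive non-zero DFT frequencies.

    Frequencies are expressed in cycles per beat.
    """
    n = len(series)
    mean = sum(series) / n
    centered = [x - mean for x in series]
    freqs, power = [], []
    for k in range(1, n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * m / n)
                    for m, x in enumerate(centered))
        freqs.append(k / n)
        power.append(abs(coeff) ** 2 / n)
    return freqs, power


def fit_alpha(freqs, power):
    """Least-squares fit of log(power) = c - alpha * log(freq); returns alpha."""
    points = [(math.log(f), math.log(p)) for f, p in zip(freqs, power) if p > 0.0]
    if len(points) < 2:
        raise ValueError("need at least two non-zero spectral points")
    mean_x = sum(x for x, _ in points) / len(points)
    mean_y = sum(y for _, y in points) / len(points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    return -sxy / sxx


def estimate_alpha(offsets):
    """Estimate the spectral exponent alpha of an offset series."""
    freqs, power = periodogram(offsets)
    return fit_alpha(freqs, power)
```

A raw periodogram is a noisy estimate, so short recordings give only a rough value of α; longer offset series or averaging over several recordings would tighten it.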
For the empirical analysis, drums have been chosen because the distinction between accentuation and errors is easiest when analyzing sequences that contain time-periodic structures, such as drum sequences. However, in principle, the methods according to the invention may also be applied to other instruments played by humans. For example, for a piano player playing a song on the piano, it is to be expected that, after removal of accentuation, the relevant noise obeys the same 1/f^α law as discussed above with respect to drums.
Based on these empirically determined facts and figures, a method and a device for humanizing music, in particular drum sequences, may now be described as follows.
In step 310, the method is initialized. In particular, the algorithm may be set to the first time t0 (i=0).
In step 320, a random offset oi is generated for the present sound or note at time ti.
In step 330, the random offset oi is added to the time ti in order to obtain a modified time t′i. Hereby, it is understood that the offset oi may also be negative.
In step 340, the present sound si is output at the modified time t′i. The outputting step may comprise playing the sound in an audio device. It may also comprise storing the sound on a medium at the modified time t′i for later playing.
In step 350, the procedure loops back to step 320 in order to repeat the procedure for the remaining sounds.
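Steps 310 to 350 can be sketched as a simple loop. This is an illustration under stated assumptions, not the inventors' implementation: sounds are represented as (label, time) pairs, "outputting" is modeled as collecting the retimed sounds in a list, and the offsets come from a caller-supplied function. The names `humanize` and `next_offset` and the fixed offset values are hypothetical.

```python
def humanize(sounds, next_offset):
    """Shift each sound by a freshly drawn offset.

    sounds:      list of (label, time_in_seconds) pairs
    next_offset: zero-argument callable returning an offset in seconds
    """
    humanized = []
    for label, t in sounds:                   # steps 310 and 350: start at the first sound, loop over the rest
        o = next_offset()                     # step 320: obtain an offset for the present sound
        t_modified = t + o                    # step 330: the offset may be negative
        humanized.append((label, t_modified)) # step 340: "output" by storing the retimed sound
    return humanized


sounds = [("kick", 0.0), ("snare", 0.5), ("kick", 1.0)]

# Fixed placeholder offsets so the example is reproducible; in the method
# described here the offsets would come from a 1/f^alpha noise source.
fixed_offsets = iter([0.01, -0.02, 0.0])
result = humanize(sounds, lambda: next(fixed_offsets))
print(result)
```

Passing the offset source as a callable keeps the retiming loop independent of how the offsets are generated.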
According to the invention, the random offsets are generated such that their power spectral density S(f) obeys the law
$$S(f) \propto \frac{1}{f^{\alpha}}.$$
The parameter α may be set according to the empirical estimates obtained as described in relation to
Again, it is assumed that the music sequence (S) comprises a multitude of sounds (s1 . . . sn) occurring on times (t1, . . . , tn). According to one embodiment of the invention, the device may comprise means 410 for generating, for each time (ti) a random offset (oi).
The device may further comprise means 420 for adding the random offset (oi) to the time (ti) in order to obtain a modified time (ti+oi).
Finally, the device may also comprise means 430 for outputting a humanized music sequence (S′) wherein each sound (si) occurs on the modified time (ti+oi). The humanized music sequence (S′) may be output, e.g. stored to a machine-readable medium, such as a CD (compact disc) or DVD or output to an equalizer, amplifier and/or loudspeaker.
According to the invention, the power spectral density S(f) of the random offsets has the form
$$S(f) \propto \frac{1}{f^{\alpha}},$$
wherein 0<α<2. Generators for 1/f^α noise, or colored noise (for α=1 also called ‘pink’ noise), are commercially available.
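Where no dedicated generator is at hand, an approximately 1/f^α distributed offset series can also be produced in software. The sketch below uses a fractional-differencing filter applied to Gaussian white noise, a technique chosen here for illustration and not named in the text. The function names, the rescaling to a target standard deviation, and the example values (α = 1, 10 ms standard deviation) are all assumptions.

```python
import math
import random


def filter_coefficients(count, alpha):
    """Impulse response of (1 - z^-1)^(-alpha/2): h_0 = 1, h_k = h_{k-1} * (k - 1 + alpha/2) / k."""
    h = [1.0]
    for k in range(1, count):
        h.append(h[-1] * (k - 1 + alpha / 2.0) / k)
    return h


def power_law_offsets(count, alpha, sigma, seed=None):
    """Return `count` zero-mean Gaussian offsets with an approximately 1/f^alpha spectrum.

    sigma is the standard deviation of the returned series (same unit as the beat times).
    """
    if count < 2:
        raise ValueError("count must be at least 2")
    rng = random.Random(seed)
    h = filter_coefficients(count, alpha)
    white = [rng.gauss(0.0, 1.0) for _ in range(count)]
    # Convolve the white noise with the filter (O(count^2), fine for short sequences).
    colored = [sum(h[k] * white[n - k] for k in range(n + 1)) for n in range(count)]
    # Remove the mean and rescale to the requested standard deviation.
    mean = sum(colored) / count
    centered = [c - mean for c in colored]
    std = math.sqrt(sum(c * c for c in centered) / count)
    return [sigma * c / std for c in centered]


# Illustrative values only: 64 beats, alpha = 1, 10 ms standard deviation.
offsets = power_law_offsets(count=64, alpha=1.0, sigma=0.01, seed=1)
```

For α = 0 all coefficients after the first vanish and the output is plain white noise; for α = 2 every coefficient equals 1 and the output is a random walk. Values in between interpolate between these two cases.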
The deviation of human drum sequences from a given metronome may be well described by Gaussian-distributed 1/f^α noise, wherein the exponent α is distinct from 0. In principle, the results also apply to other instruments played by humans. In conclusion, the method and device for humanizing musical sequences may very well be applied in the field of electronic music as well as for post-processing real recordings. In other words, 1/f^α noise is the natural choice for humanizing a given music sequence.
This application is related to and claims priority from U.S. Provisional Patent Application No. 60/960,410, titled “Method and device for humanizing musical sequences,” filed Sep. 28, 2007, the entire contents of which are incorporated herein for all purposes.
Number | Date | Country
---|---|---
60/960,410 | Sep 2007 | US