The invention is based on the detection and matching of tempo and phase in pieces of music, especially for the realisation of an interactive music player which, amongst other advantages, allows several synchronised pieces of music to be played back to form a complete new work. In this context, digital music data are obtained, according to one advantageous embodiment, by playing back several pieces of music at the same time on a standard CD-ROM drive in real-time.
In present-day dance culture, which is characterised by modern, electronic music, the technical demands on the disc jockey (DJ) have increased to a considerable extent. Sorting the pieces of music to be played to form a complete work with its own characteristic curve of emotional excitement (referred to as a set or a mix) is one of the standard tasks required of a DJ. In this context, it is important to be able to match the individual pieces of music with reference to their tempo and phase, in other words, the position of the beats in the time grid (referred to in English as “beat matching”), in such a manner that the pieces of music merge in a unified manner at the transition points without interrupting the rhythm.
This requirement presents the technical problem of tempo and phase matching of two pieces of music and/or audio tracks in real-time. Accordingly, it would be desirable if the tempo and phase of two pieces of music and/or audio tracks could be matched automatically in real-time, in order to release the DJ from this technical aspect of mixing, and/or to create a mix automatically or semi-automatically, without the assistance of a technically skilled DJ.
So far, this problem has only been addressed in an incomplete manner. For example, software players are available for the MP3 format (a standard format for compressed digital audio data), which can realise pure, real-time tempo detection and matching. However, phase detection must still be carried out manually on the basis of the listening and matching skills of the DJ. This demands a considerable amount of the DJ's attention, which would otherwise be available for more artistic aspects such as compiling the music etc.
Hardware effects equipment for processing audio information, which can indeed realise real-time tempo and phase detection, is also already known. However, this equipment cannot match the tempo and phase of the audio material if the data have only been supplied in analogue form. The equipment can only provide a visual display of the relative phase shift of the two audio tracks.
However, no devices are currently known which utilise tempo information to calculate loops (short audio segments, which can be played back repeatedly) and loop lengths. With the previously used playback equipment, these are either cut and loaded in advance (software MP3 player) or set and matched manually (hardware CD player).
Accordingly, one object of the present invention is to create the possibility for automatic tempo and phase matching of two pieces of music and/or audio tracks in real-time with the greatest possible accuracy.
One substantial technical problem here is the accuracy of tempo and phase measurement, which declines as the time available for measurement is reduced. The primary problem is therefore to establish the tempo and phase in real-time, as, for example, in the case of live mixing.
According to the present invention this object is achieved with a method for detecting the tempo and phase of a piece of music available in digital format comprising the following procedural stages:
A successive approximation to the ideal value is therefore implemented in a control circuit.
In this context, it has proved favourable, if rhythm-relevant beat information is obtained through the band-pass filtering of the underlying digital audio data in various frequency ranges.
This is particularly successful if rhythm intervals in the audio data are transformed, if necessary by raising their frequency by a power of two, into a pre-defined frequency octave, where they provide time intervals for establishing the tempo. Further relevant intervals can be obtained if the rhythm intervals are grouped, especially in pairs or groups of three, by addition of their time values, before the frequency transformation.
According to one advantageous embodiment, the quantity of data obtained which refers to time intervals in the rhythm-relevant beat information is investigated for accumulation points. The tempo approximation is then based on the information regarding the accumulation maximum.
According to one further, advantageous embodiment of the method according to the present invention, the phase of the reference oscillator for establishing the approximate phase of the piece of music is selected in such a manner that the maximum agreement is achieved between the rhythm-relevant beat-information in the digital audio data and the zero passes of the reference oscillator.
Furthermore, it has proved favourable if a successive correction of the established tempo and phase of the piece of music is carried out at regular intervals in such short time intervals that resulting correction movements and/or correction shifts remain below the threshold of audibility.
Since all the successive corrections of the established tempo and phase in the piece of music are accumulated over time, further corrections can be made on this basis with constantly increasing accuracy.
Instead of implementing successive corrections of this kind continuously, corrections may alternatively be implemented until the volume of errors falls below a tolerable error threshold. In this context, an error threshold of less than 0.1% is suitable for the tempo established.
If the corrections are always exclusively either negative or positive over a predetermined period, a new approximation of tempo and phase with subsequent, successive corrections is carried out to ensure that any possible tempo changes in the piece of music are matched.
In addition to the automatic detection of tempo and phase in pieces of music, as described above, the specified object also requires a matching of tempo and phase in the pieces of music.
This problem is resolved, in that, after an initial approximation of the tempo and phase of the pieces of music, these results and the matching are successively improved on the basis of feedback to the playback rate of the piece of music.
According to the invention, this is achieved with a method for synchronising at least two pieces of music available in digital format with the following procedural steps:
In this context, it has proved advantageous if the playback rate and the playback phase of the other piece of music are matched on the basis of a possible phase shift of the reference oscillator allocated to this other piece of music relative to the reference oscillator allocated to the first piece of music. For this purpose, the resulting systematic phase shift is evaluated, and the frequency of the reference oscillator allocated to the other piece of music is regulated in proportion to the phase shift established.
A successive approximation to the ideal value is therefore carried out in a control circuit, in which the tempo and phase information are fed back into the control unit for the playback speed of the audio material.
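By way of illustration only, the regulation of one reference oscillator in proportion to its phase shift relative to another can be sketched as a simple proportional control loop. The function names, the gain value, the time step and the representation of phase as a fraction of one beat are assumptions made for this sketch; they are not taken from the description.

```python
def wrapped_phase_diff(master: float, follower: float) -> float:
    """Phase shift master - follower in beats, wrapped into [-0.5, 0.5)."""
    return (master - follower + 0.5) % 1.0 - 0.5


def simulate_sync(master_hz: float, follower_hz: float, offset_beats: float,
                  gain: float = 1.0, dt: float = 0.01, steps: int = 1000) -> float:
    """Run two beat oscillators and return the remaining phase shift.

    The follower frequency is regulated in proportion to the measured
    phase shift (gain, dt and steps are assumed values).
    """
    phase_master, phase_follower = offset_beats % 1.0, 0.0
    for _ in range(steps):
        shift = wrapped_phase_diff(phase_master, phase_follower)
        regulated_hz = follower_hz + gain * shift  # proportional regulation
        phase_master = (phase_master + master_hz * dt) % 1.0
        phase_follower = (phase_follower + regulated_hz * dt) % 1.0
    return wrapped_phase_diff(phase_master, phase_follower)
```

With identical tempi, an initial shift of a quarter beat decays towards zero. With a small tempo mismatch, a purely proportional loop of this kind settles at a constant residual shift equal to the frequency difference divided by the gain, which is one reason why the tempo estimate itself is also refined.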
Various devices for various storage media such as vinyl discs, CDs or cassettes are currently used for playing back pre-recorded music. These formats were not developed to allow interventions during the playback process, wherein the music can be processed in a creative manner. However, this possibility is not only desirable; it is already practised by the disc jockeys mentioned in the introduction in spite of the limitations encountered. Vinyl discs are preferred, because manual influence on the playback rate and playback position can most readily be achieved in this context.
Nowadays, however, digital formats such as audio CD and MP3 are predominantly used for storing music. The present invention allows the possibility of creative processing of music, as described above, in the context of any digital format required.
With the method according to the invention as described above, it is possible to produce a mix in a fully automatic manner from a collection of pieces of music, wherein the pieces of music are placed in sequence with the correct tempo and phase.
This is achieved with a music player, wherein at least two pieces of music available in digital format can be synchronised in real-time as explained above.
Particularly effective results are obtained with a music player wherein, in each case starting from a current playback position of the piece of music, rhythm-relevant beat information for a predetermined past time is used as the basis for establishing the tempo.
As a result of the automatic tempo detection, the content of a music data source, e.g. a CD, can be played back, at the request of the listener, as a homogeneous mix providing a tempo-dependent sequence, which the listener can select.
The invention therefore also comprises a music player of this kind, wherein the synchronised pieces of music can be sorted and played back automatically to form a complete work with unified rhythm.
To implement targeted interventions, it is important to have a graphic representation of the music, which allows the identification of the current playback position as well as a given period in the future and in the past. For this purpose, it is conventional to present the amplitude-envelope-curve of the sound-wave form over a period of several seconds before and after the playback position. The display moves in real-time at the rate at which the music is played.
In this context, it is essential to have as much helpful information in the graphic representation as possible, in order to make the interventions in a targeted manner. It would also be desirable to be able to intervene in the playback procedure in an ergonomic manner, comparable to the “scratching” frequently practised by DJs with vinyl disc players, holding the turntable and moving it forwards and backwards during playback.
To resolve this problem, the present invention proposes an interactive music player, which provides
According to one advantageous embodiment, this interactive player is additionally fitted with:
In this context, it has proved advantageous if a means for ramp smoothing is provided for smoothing a stepped sequence of time-limited playback-position-data, by means of which a ramp with constant gradient can be resolved with every predetermined playback position message, which, within a predetermined time interval, moves the smoothed signal from its previous value to the value of the playback position message. Alternatively, or additionally, a linear, digital low-pass filter, especially a second-order resonance filter, can be used for smoothing a stepped sequence of predetermined time-limited playback-position-data.
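The ramp smoothing described above may be pictured with the following minimal sketch. The class name, the method names and the choice of counting the ramp duration in update ticks are assumptions made for illustration.

```python
class RampSmoother:
    """Moves a smoothed value to each new target over a fixed number of ticks."""

    def __init__(self, initial: float, ramp_ticks: int):
        if ramp_ticks < 1:
            raise ValueError("ramp_ticks must be at least 1")
        self.value = float(initial)
        self.ramp_ticks = ramp_ticks
        self._target = float(initial)
        self._step = 0.0
        self._remaining = 0

    def set_target(self, target: float) -> None:
        """Called for every playback position message: start a new ramp."""
        self._target = float(target)
        self._step = (self._target - self.value) / self.ramp_ticks
        self._remaining = self.ramp_ticks

    def tick(self) -> float:
        """Advance the ramp by one tick with constant gradient."""
        if self._remaining > 0:
            self._remaining -= 1
            if self._remaining == 0:
                self.value = self._target  # land exactly on the target
            else:
                self.value += self._step
        return self.value
```

A new message arriving in the middle of a ramp simply starts a fresh ramp from the current smoothed value, so the output never jumps.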
To avoid jumps in playback when switching between operating modes, the position reached in the previous mode is used as the starting position in the new mode.
To avoid abrupt changes in the playback rate when switching between operating modes, the current playback rate reached in the previous mode is moved by a smoothing function, especially a ramp-smoothing function or a linear, digital low-pass filter, to a playback rate corresponding to the playback rate in the new operating mode.
When playing back with very strongly and quickly changing playback rates, a playback which most authentically resembles “scratching” on a vinyl disc player can be achieved with a further advantageous embodiment of the interactive music player according to the invention which uses a scratch-audio-filter for an audio signal, wherein the audio signal is subjected to pre-emphasis filtering (pre-distortion) and stored in a buffer memory, from which it can be read out at a variable tempo in dependence on the relevant playback rate, after which it is subjected to de-emphasis filtering (reverse-distortion) before playing back.
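The signal chain named above (pre-emphasis, buffer, variable-rate read-out, de-emphasis) can be sketched as follows. The description does not specify the filters, so the first-order filters, the coefficient 0.9 and the linear interpolation used here are assumptions chosen only to make the order of the stages concrete.

```python
import math


def pre_emphasis(samples, a=0.9):
    """First-order pre-emphasis: y[n] = x[n] - a * x[n-1] (assumed filter)."""
    out, prev = [], 0.0
    for s in samples:
        out.append(s - a * prev)
        prev = s
    return out


def de_emphasis(samples, a=0.9):
    """Inverse filter: y[n] = x[n] + a * y[n-1]."""
    out, prev = [], 0.0
    for s in samples:
        prev = s + a * prev
        out.append(prev)
    return out


def read_variable_rate(buffer, rate, n_out, start=0.0):
    """Read n_out samples from the buffer at a variable rate.

    rate 1.0 is normal speed, 0.5 half speed, negative values play backwards.
    Linear interpolation is an assumption of this sketch.
    """
    out, pos = [], float(start)
    for _ in range(n_out):
        i = math.floor(pos)
        if i < 0 or i + 1 >= len(buffer):
            break  # ran off the buffered region
        frac = pos - i
        out.append((1.0 - frac) * buffer[i] + frac * buffer[i + 1])
        pos += rate
    return out
```

At rate 1.0 the two filters cancel and the original signal is recovered; at other rates the de-emphasis acts on the resampled signal, which is what changes the tonal character of the read-out.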
The length of one or more beats can be established on the basis of the tempo information with sufficient accuracy to set the length of a loop at the touch of a button, so that the loop can be played without “clicks” at the tempo of the original audio track. According to a further advantageous embodiment of an interactive music player of this kind, which establishes tempo information in the manner described according to the invention, it is possible, on the basis of the tempo information established for one or more of the synchronised pieces of music, to define the length of a playback loop in the relevant piece of music extending over one or more beats of this piece of music and to play back the loop in a beat-synchronised manner in real-time.
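The relationship between tempo and loop length is simple arithmetic, as the sketch below shows. The function name and the default sampling rate of 44100 Hz (that of an audio CD) are assumptions for illustration.

```python
def loop_length_samples(bpm: float, beats: int, sample_rate: int = 44100) -> int:
    """Length in samples of a loop spanning the given number of beats."""
    if bpm <= 0 or beats <= 0:
        raise ValueError("bpm and beats must be positive")
    seconds_per_beat = 60.0 / bpm
    return round(beats * seconds_per_beat * sample_rate)
```

At 120 bpm a four-beat loop lasts 2 seconds; a tempo error of 0.1% would shift the loop end by about 2 ms per pass, which indicates why an accurate tempo value is needed for loops of this kind.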
In this context, the phase information can be used, once again at the touch of a button, to place jump marks, or so-called cue-points within the track, or to place entire loops accurately on a starting beat. An advantageous interactive music player can therefore be further developed in that, for one or more of the synchronised pieces of music and with reference to the established phase information from the relevant piece of music, beat-synchronised jump marks can be defined in real-time and can be moved within this piece of music by whole number multiples of beats. Such cue-points and loops can also be moved by whole number multiples of beats within the track. Both procedures are carried out in real-time, during the playback of the audio track.
Furthermore, the information obtained about the tempo and phase of an audio track allows so-called tempo-synchronised effects to be controlled. In this context, the audio signal is manipulated to match its own rhythm, which allows rhythmically effective, real-time sound changes. In particular, the tempo information can be used to cut loops from the audio material in real-time with a length synchronised to the beat.
A further advantageous interactive music player is characterised in that each audio-data stream played back can be manipulated in real-time by signal processing means, in particular, by means of filter equipment and/or audio effects.
When mixing several pieces of music, the audio sources from sound media are conventionally played back on several playback devices, for example, vinyl-disc players or CD players and then mixed via a mixing desk. With this procedure, audio recording is restricted to recording the final results. When using computer systems with audio interfaces and appropriate audio-processing software, such as audio sequencers or so-called sample processing programs for manipulating digital audio information, interactive interventions by the user are not possible during playback.
If the mixing procedure is to be reproduced or if mixing is to be continued at a later time accurately from a predetermined position within a piece of music, it would be desirable to play back not only the final result.
This object is achieved according to the invention with an interactive music player, which is further developed so that real-time interventions, especially interventions from a mixing procedure with several pieces of music and/or additional signal processing, can be stored over the time sequence as digital control information.
Since mixing procedures with pieces of music and/or interactive interventions into pieces of music using audio-signal processing media can be stored as a complete new work independently from the digital audio information in the piece of music, in the form of digital control information, especially for the purpose of reproduction, the processes of interactive mixing and interactive effect processing can be recorded and played back at any time.
According to a further advantageous embodiment of the invention, stored digital control information has a format which provides information for the identification of the processed pieces of music and a time sequence of playback positions and status information for the control elements of the music player allocated to each of these.
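One conceivable layout for such control information is sketched below. All class names, field names and the JSON serialisation are hypothetical; the description only requires that the pieces of music be identified and that playback positions and control element states be stored over time.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class ControlEvent:
    time_ms: int               # time within the mix at which the state applies
    track_id: str              # refers to an entry in MixRecording.tracks
    playback_position_ms: int  # playback position within that piece of music
    controls: dict             # status of the control elements, e.g. a fader


@dataclass
class MixRecording:
    tracks: dict                                # track_id -> identification data
    events: list = field(default_factory=list)  # time sequence of ControlEvent

    def to_json(self) -> str:
        """Serialise the control information; no audio data are included."""
        return json.dumps(asdict(self), sort_keys=True)
```

Because only identifiers, positions and control states are stored, a record of this kind remains small and is independent of the audio data themselves.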
One decisive advantage of this recording option and of the proposed format is the fact that a digital record of the mixing procedure can be implemented independently from the audio data in the pieces of music mixed; this therefore avoids the problems with reference to copyright associated with copying these audio data. The overall result can therefore be played back, processed, duplicated and transmitted independently at any time.
One particularly advantageous interactive music player can be realised with an appropriately programmed computer system fitted with audio interfaces. In this context, standard data storage media of the computer system are used for recording the control file. A particularly interesting transfer of recording files, which are generally not memory-intensive, can therefore also be realised, for example, via the Internet.
This poses the problem that often only one audio data source is available, for example, a CD player or, in the case of a computer system, a CD-ROM drive. In general, these and other playback devices have only a single reader unit at their disposal. However, to implement the function described above, in particular, the mixing of several pieces of music, the audio data from at least two pieces of music must be available at the same time. It would therefore be desirable if this could be achieved with one playback device with only one reader unit.
The invention resolves this problem with a method for providing in real-time digital audio data from at least two pieces of music from a data source with only one reader unit, provided the data source supplies the audio data at a reading rate faster than the playback rate, in that an appropriate buffer memory, especially a ring-buffer memory, is provided for each piece of music to be played back, and the faster reading rate is used to fill the relevant buffer memories with the relevant audio data in such a manner that audio data are always available chronologically before and after a current playback position in the relevant piece of music.
In this context, it has also proved advantageous to monitor the status of each buffer memory to determine whether adequate data are available and, if the level of data falls below a predetermined threshold value, to order a central instance, which is not coupled to the playback of the pieces of music, to provide the necessary audio data, wherein the central instance automatically requests the necessary regions of audio data from the data source and fills the relevant buffer memory with the data obtained. According to a further advantageous embodiment, data no longer needed are over-written during the filling of a buffer memory. Moreover, it has proved advantageous if the central instance sorts requests received in parallel into an order to be worked through sequentially.
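The interplay of per-track ring buffers, level monitoring and a central instance working through requests sequentially can be sketched as follows. The class names, the block-based addressing and the callable `read_block` standing in for the data source are assumptions of this sketch.

```python
from collections import deque


class TrackBuffer:
    """Ring buffer of audio blocks for one piece of music."""

    def __init__(self, track_id, capacity, low_water):
        self.track_id = track_id
        self.blocks = deque(maxlen=capacity)  # oldest block is overwritten
        self.low_water = low_water
        self.next_to_load = 0  # index of the next block to fetch
        self.next_to_play = 0  # index of the next block to play

    def ahead(self):
        """Number of buffered blocks after the current playback position."""
        return self.next_to_load - self.next_to_play

    def play_block(self):
        first_buffered = self.next_to_load - len(self.blocks)
        block = self.blocks[self.next_to_play - first_buffered]
        self.next_to_play += 1
        return block


class CentralLoader:
    """Central instance, decoupled from playback, that serves fill requests."""

    def __init__(self, read_block):
        self.read_block = read_block  # callable(track_id, block_index)
        self.queue = deque()          # requests are worked through in order

    def request(self, buf, count):
        self.queue.append((buf, count))

    def service(self):
        while self.queue:
            buf, count = self.queue.popleft()
            for _ in range(count):
                buf.blocks.append(self.read_block(buf.track_id, buf.next_to_load))
                buf.next_to_load += 1


def monitor(buffers, loader, fill=4):
    """Order a refill for every buffer whose level is below its threshold."""
    for buf in buffers:
        if buf.ahead() < buf.low_water:
            loader.request(buf, fill)
```

In this sketch the refill size must be chosen small enough relative to the buffer capacity that blocks not yet played are never overwritten; blocks already played remain available behind the playback position until they are displaced.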
This method is particularly suitable in conjunction with a CD-ROM drive and presents an innovative and advantageous method of reading from such drives in a manner referred to by a person skilled in the art as CD-grabbing. In a further advantageous, interactive music player, a CD-ROM drive operated according to the method described above can be used as the data source for pieces of music.
Since the invention described above can be realised in a particularly advantageous manner with an appropriately programmed computer system, the measures according to the invention can also be realised in the form of a computer software product, which can be loaded directly into the internal memory of a digital computer and comprises software sections, with which the measures according to the invention can be implemented, when the software product is run on a computer.
In this context, the invention also allows the provision of a data medium, especially a compact disc, with
In this context, it is particularly advantageous if the digital control information in the second data region represents mixing procedures with pieces of music and/or interactive interventions into pieces of music with audio signal processing media as a new complete work of the digital audio information from pieces of music in the first data region.
Furthermore, it has proved favourable if the stored digital control information in the second data region has a format, which provides the information for identifying the processed pieces of music in the first data region as well as the relevant time sequence of playback positions and status information for the control elements in the music player allocated to each piece of music.
It is also advantageously possible to arrange on a data medium of this kind, a computer software product, which can be loaded directly into the internal memory of a digital computer and provides software sections, which allow this digital computer to function as a music player, in particular, a music player as described above, which, on the basis of the control data in the second data region of the data medium, which refer to audio data in the first data region of the data medium, can play back a complete work represented by the control data when the software product is run on the computer.
Since the interactive music player combines audio playback, signal analysis and signal transformation by means of effects and loops, it is possible, for the first time, not only to realise the real-time detection of the tempo and phase of the audio track but at the same time also to achieve automatic matching of tempo and phase.
The analysis additionally provides necessary output data for the control of tempo-synchronised effects and loops.
The advantages include, amongst others, the possibility of automating the so-called beat-matching process achieved in this context, a basic requirement for DJ mixing which cannot be readily learned, and which claims a considerable amount of the DJ's attention at every transition between two pieces of music. Furthermore, the entire mixing procedure can be automated.
Further advantages and details of the invention are provided with reference to the following description of advantageous exemplary embodiments in conjunction with the drawings. In outline, the drawings are as follows:
The following description is intended to represent a possible realisation of the approximate tempo and phase detection and tempo and phase matching according to the invention.
The first stage of the procedure is an initial approximation of the tempo of the piece of music. This is implemented via a statistical evaluation of the time interval between the so-called beat-events. One method for obtaining rhythm-relevant events from the audio material is to use a narrow band-pass filter for audio signals in various frequency ranges. To establish the tempo in real-time, only beat events from the preceding few seconds are used for the subsequent calculations in each case. Accordingly, 8 to 16 events correspond approximately to 4 to 8 seconds.
In view of the quantised structure of music (16th note grid), it is possible to include not only quarter note beat intervals in the tempo calculation; other intervals (16th, 8th, ½ and whole notes) can be transformed, by means of octaving (that is, raising their frequency by a power of two), into a pre-defined frequency octave (e.g. 90-160 bpm = beats per minute) and thereby supply tempo-relevant information. Errors in octaving (e.g. of triplet intervals) are not relevant for the subsequent statistical evaluation because of their relative rarity.
In order to register triplets and/or shuffled rhythms (individual notes displaced slightly from the 16th note grid), the time intervals obtained at the first point are additionally grouped into pairs and groups of three by addition of the time values before they are octaved. The rhythmic structure between beats is calculated from the time intervals using this method.
The quantity of data obtained in this manner is investigated for accumulation points. In general, depending on the octaving and grouping procedure, three accumulation maxima occur, of which the values are in a rational relationship to one another (2/3, 5/4, 4/5 or 3/2). If it is not sufficiently clear from the strength of one of the maxima that this indicates the actual tempo of the piece of music, the correct maximum can be established from the rational relationships between the maxima.
A reference oscillator is used for approximation of the phase. This oscillates at the tempo previously established. Its phase is advantageously selected to achieve the best agreement between beat-events in the audio material and zero passes of the oscillator.
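One possible way of selecting the oscillator phase for best agreement with the beat events is to take the circular mean of the event times modulo the beat period. This technique is an assumption of the sketch and is not stated in the description; the function name is likewise hypothetical.

```python
import math


def estimate_phase_ms(event_times_ms, period_ms):
    """Offset (0 <= offset < period) of the beat grid that best fits the events.

    Each event time is mapped to an angle on a circle of one beat period;
    the direction of the summed unit vectors gives the dominant phase.
    """
    s = sum(math.sin(2.0 * math.pi * t / period_ms) for t in event_times_ms)
    c = sum(math.cos(2.0 * math.pi * t / period_ms) for t in event_times_ms)
    angle = math.atan2(s, c)
    return (angle / (2.0 * math.pi) * period_ms) % period_ms
```

Events that do not fall on the beat grid pull the estimate only slightly, provided most events lie on the grid.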
Following this, a successive improvement of the approximated tempo and phase is implemented. As a result of the natural inaccuracy of the initial tempo approximation, the phase of the reference oscillator is initially shifted relative to the audio track after a few seconds. This systematic phase shift provides information about the amount by which the tempo of the reference oscillator must be changed. A correction of the tempo and phase is advantageously carried out at regular intervals, in order to remain below the threshold of audibility of the shifts and correction movements.
All of the phase corrections implemented from the time of the approximate phase correlation are accumulated over time, so that the calculation of the tempo and the phase is based on a constantly increasing time interval. As a result, the tempo and phase values become increasingly accurate and lose the error associated with approximate real-time measurements mentioned above. After a short time (approximately 1 minute), the error in the tempo value obtained by this method falls below 0.1%, a measure of accuracy which is a prerequisite for calculating loop lengths.
The drawing according to
Two streams of audio events Ei with a value 1 are provided as the input; these correspond to the peaks in the frequency bands F1 at 150 Hz and F2 at 4000 Hz or 9000 Hz. These two event streams are initially processed separately, being filtered through appropriate band-pass filters with threshold frequency F1 and F2 in each case.
If an event follows the preceding event within 50 ms, the second event is ignored. A time of 50 ms corresponds to the duration of a 16th note at 300 bpm, and is therefore considerably shorter than the duration of the shortest interval in which the pieces of music are generally located.
From the stream of filtered events Ei, a stream consisting of the simple time intervals Ti between the events is now calculated in the relevant processing units BD1 and BD2.
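The suppression of events following within 50 ms and the formation of simple time intervals can be sketched as follows. The function names are hypothetical, and the sketch assumes that "the preceding event" means the most recent event that was not itself ignored.

```python
def debounce_events(times_ms, min_gap_ms=50.0):
    """Drop every event that follows the last accepted event within min_gap_ms."""
    kept = []
    for t in times_ms:
        if kept and t - kept[-1] < min_gap_ms:
            continue  # second event of a close pair is ignored
        kept.append(t)
    return kept


def simple_intervals(times_ms):
    """Time intervals Ti between successive events."""
    return [later - earlier for earlier, later in zip(times_ms, times_ms[1:])]
```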
Two further streams of bandwidth-limited time intervals are additionally formed, in identical processing units BPM_C1 and BPM_C2 in each case, from the stream of simple time intervals T1i: namely, the stream T2i of sums of two successive time intervals and the stream T3i of sums of three successive time intervals. The groups formed in this context may also overlap. Accordingly, from the stream t1, t2, t3, t4, t5, t6 . . . the following two streams are additionally produced:
T2i:(t1+t2),(t2+t3),(t3+t4),(t4+t5),(t5+t6), . . .
and
T3i:(t1+t2+t3),(t2+t3+t4),(t3+t4+t5),(t4+t5+t6) . . .
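The two grouped streams are overlapping sliding-window sums, which can be written compactly as in the following sketch (the function name is hypothetical):

```python
def grouped_sums(intervals, group_size):
    """Sums of every run of group_size successive intervals (windows overlap)."""
    last_start = len(intervals) - group_size
    return [sum(intervals[i:i + group_size]) for i in range(last_start + 1)]
```

For the intervals 1 to 6, a group size of 2 yields five sums and a group size of 3 yields four, matching the pattern of T2i and T3i shown above.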
The three streams T1i, T2i, T3i are now time-octaved in appropriate processing units OKT. The time-octaving OKT is implemented in such a manner that the individual time intervals of each stream are doubled until they lie within a predetermined interval BPM_REF. Three data streams T1io, T2io, T3io are obtained in this manner. The upper limit of the interval is calculated from the lower bpm threshold according to the formula:
$t_{hi}\,[\mathrm{ms}] = 60000 / \mathrm{bpm}_{low}$.
The lower threshold of the interval is approximately $0.5 \cdot t_{hi}$.
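A minimal sketch of the time-octaving follows. The function names and the default lower threshold of 90 bpm are assumptions, the lower limit is taken as exactly half the upper limit, and the halving of intervals that are too long is an addition of this sketch (the text above mentions only doubling).

```python
def octave_interval(interval_ms: float, bpm_low: float = 90.0) -> float:
    """Scale an interval by powers of two into the range (t_hi / 2, t_hi]."""
    if interval_ms <= 0:
        raise ValueError("interval must be positive")
    t_hi = 60000.0 / bpm_low  # upper limit in ms, from the lower bpm threshold
    t_lo = 0.5 * t_hi         # lower limit, assumed to be exactly half
    t = float(interval_ms)
    while t <= t_lo:
        t *= 2.0              # doubling, as described
    while t > t_hi:
        t /= 2.0              # halving: assumption for overlong intervals
    return t


def interval_to_bpm(interval_ms: float) -> float:
    """Convert a beat interval in milliseconds into beats per minute."""
    return 60000.0 / interval_ms
```

A 16th-note interval of 125 ms and a whole-bar interval of 2000 ms are both mapped to 500 ms in this way, corresponding to 120 bpm.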
The consistency of each of the three streams obtained in this manner is now checked, in further processing units CHK, for the two frequency bands F1, F2. This determines whether a certain number of successive, time-octaved interval values lie within a predetermined error threshold in each case. In particular, this check may be carried out, with the following values:
For T1i, the last 4 relevant events t11o, t12o, t13o, t14o are checked to determine whether the following applies:
$(t_{11o} - t_{12o})^2 + (t_{11o} - t_{13o})^2 + (t_{11o} - t_{14o})^2 < 20$   a)
If this is the case, the value t11o will be obtained as a valid time interval.
For T2i, the last 4 relevant events t21o, t22o, t23o, t24o are checked to determine whether the following applies:
$(t_{21o} - t_{22o})^2 + (t_{21o} - t_{23o})^2 + (t_{21o} - t_{24o})^2 < 20$   b)
If this is the case, the value t21o will be obtained as a valid time interval.
For T3i, the last 4 relevant events t31o, t32o, t33o, t34o are checked to determine whether the following applies:
$(t_{31o} - t_{32o})^2 + (t_{31o} - t_{33o})^2 + (t_{31o} - t_{34o})^2 < 20$   c)
If this is the case, the value t31o will be obtained as a valid time interval.
In this context, consistency test a) takes priority over b), and b) takes priority over c). Accordingly, if a value is obtained for a), then b) and c) will not be investigated. If no value is obtained for a), then b) will be investigated and so on. However, if a consistent value is not found for a), or for b) or for c), then the sum of the last 4 non-octaved individual intervals (t1+t2+t3+t4) will be obtained.
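The prioritised consistency test can be sketched as follows. The function names are hypothetical, and the sketch assumes that the first of the four values in each test is the most recent one and that the threshold 20 applies to intervals measured in milliseconds.

```python
def is_consistent(newest_first, threshold=20.0):
    """Sum of squared deviations of three earlier values from the newest one."""
    ref = newest_first[0]
    return sum((ref - t) ** 2 for t in newest_first[1:4]) < threshold


def pick_valid_interval(t1o, t2o, t3o, raw_intervals):
    """Return a valid time interval according to the priority a), b), c).

    t1o, t2o, t3o are the time-octaved streams (oldest value first);
    raw_intervals is the non-octaved stream of simple intervals.
    """
    for stream in (t1o, t2o, t3o):
        if len(stream) >= 4:
            newest_first = stream[-1:-5:-1]  # last four values, newest first
            if is_consistent(newest_first):
                return newest_first[0]
    # No consistent stream: fall back to the sum of the last 4 raw intervals.
    return sum(raw_intervals[-4:])
```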
The stream of values for consistent time intervals obtained in this manner from the three streams is again octaved in a downstream processing unit OKT into the predetermined time interval BPM_REF. Following this, the octaved time interval is converted into a BPM value.
As a result, two streams BPM1 and BPM2 of bpm values are now available—one for each of two frequency ranges F1 and F2. In one prototype, the streams are retrieved with a fixed frequency of 5 Hz, and the last eight events from each of the two streams are used for statistical evaluation. At this point, a variable (event-controlled) sampling rate can also be used, wherein more than merely the last 8 events can be used, for example, 16 or 32 events.
These last 8, 16 or 32 events from each frequency band F1, F2 are combined and examined for accumulation maxima N in a downstream processing unit STAT. In the prototype version, an error interval of 1.5 bpm is used, that is, provided events differ from one another by less than 1.5 bpm, they are regarded as associated and are added together in the weighting. In this context, the processing unit STAT determines the BPM values at which accumulations occur and how many events are to be attributed to the relevant accumulation points. The most heavily weighted accumulation point can be regarded as the local BPM measurement and provides the desired tempo value A.
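The search for accumulation points can be illustrated with a simple one-dimensional grouping of sorted bpm values. The description does not specify how the unit STAT forms its groups, so the chaining of neighbouring values used here, and the function name, are assumptions.

```python
def accumulation_points(bpm_values, tolerance=1.5):
    """Group bpm values and return (mean bpm, event count), heaviest first.

    Sorted values are chained into one group while each value differs from
    its predecessor by less than the tolerance.
    """
    clusters = []
    for value in sorted(bpm_values):
        if clusters and value - clusters[-1][-1] < tolerance:
            clusters[-1].append(value)
        else:
            clusters.append([value])
    ranked = sorted(clusters, key=len, reverse=True)
    return [(sum(c) / len(c), len(c)) for c in ranked]
```

The first entry of the result corresponds to the most heavily weighted accumulation point; the second entry is the candidate considered when a second maximum is taken into account.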
In an initial further development of this method, in addition to the local BPM measurement, a global measurement is carried out, by expanding the number of events used to 64, 128 etc. With alternating rhythm patterns, in which the tempo only comes through clearly on every fourth beat, an event number of at least 128 may frequently be necessary. A measurement of this kind is more reliable, but also requires more time.
A further decisive improvement can be achieved with the following measure:
Not only the first but also the second accumulation maximum is taken into consideration. This second maximum almost always occurs as a result of triplets and may even be stronger than the first maximum. The tempo of the triplets, however, has a clearly defined relationship to the tempo of the quarter notes, so that it can be established from the relationship between the tempi of the first two maxima, which accumulation maximum should be attributed to the quarter notes and which to the triplets.
If T2=2/3*T1, then T2 is the tempo
If T2=4/3*T1, then T2 is the tempo
If T2=2/5*T1, then T2 is the tempo
If T2=4/5*T1, then T2 is the tempo
If T2=3/2*T1, then T1 is the tempo
If T2=3/4*T1, then T1 is the tempo
If T2=5/2*T1, then T1 is the tempo
If T2=5/4*T1, then T1 is the tempo
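The eight rules above amount to a lookup on the ratio T2/T1. A sketch follows; the relative tolerance of 2% and the fallback to T1 when no rule matches are assumptions, as is the function name.

```python
# Ratio T2 / T1 -> which maximum carries the tempo (rules as listed above).
RATIO_RULES = [
    (2 / 3, "T2"), (4 / 3, "T2"), (2 / 5, "T2"), (4 / 5, "T2"),
    (3 / 2, "T1"), (3 / 4, "T1"), (5 / 2, "T1"), (5 / 4, "T1"),
]


def resolve_tempo(t1_bpm, t2_bpm, rel_tol=0.02):
    """Decide which of the two accumulation maxima is the quarter-note tempo."""
    ratio = t2_bpm / t1_bpm
    for expected, winner in RATIO_RULES:
        if abs(ratio - expected) <= rel_tol * expected:
            return t1_bpm if winner == "T1" else t2_bpm
    return t1_bpm  # assumption: keep the first maximum if no rule applies
```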
A phase value P is approximated with reference to one of the two filtered, simple time intervals Ti between the events, preferably with reference to those values which are filtered with the lower frequency F1. These are used for the rough approximation of the frequency of the reference oscillator.
The drawing according to
Initially, the reference oscillator and/or the reference clock MCLK is started in an initial stage 1 with the rough phase values P and tempo values A derived from the beat detection, which is approximately equivalent to a reset of the control circuit shown in
If a “critical” deviation, for example of more than 30 ms, is systematically exceeded (+) in several successive events, the reference clock MCLK is (re)matched to the audio signal in a further processing stage 3 by means of a short-term tempo change
A(i+1)=A(i)+q or
A(i+1)=A(i)−q
relative to the deviation, wherein q represents a lowering or raising of the tempo. Otherwise (−), the tempo is held constant.
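The correction in stage 3 may be sketched as follows. The step size q, the number of successive events required and the sign convention (a positive deviation meaning that the audio beat arrives later than the clock tick) are assumptions chosen for illustration only.

```python
def correct_tempo(tempo, deviations_ms, q=0.1, critical_ms=30.0, run_length=3):
    """Return (new_tempo, correction) after inspecting recent deviations.

    deviations_ms: signed deviations between detected beats and the reference
    clock, most recent last. Positive is assumed to mean the audio beat
    arrived later than the clock tick, so the clock is slowed down.
    """
    recent = deviations_ms[-run_length:]
    if len(recent) < run_length:
        return tempo, 0.0
    if all(d > critical_ms for d in recent):
        return tempo - q, -q   # A(i+1) = A(i) - q
    if all(d < -critical_ms for d in recent):
        return tempo + q, +q   # A(i+1) = A(i) + q
    return tempo, 0.0          # otherwise the tempo is held constant
```

The returned correction value is the quantity that a later stage could accumulate.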
During the further sequence, in a subsequent stage 4, a summation is carried out of all correction events from stage 3 and of the time elapsed since the last “reset” in the internal memories (not shown). At approximately every 5th to 10th event of an approximately accurate synchronization (difference between the audio data and the reference clock MCLK approximately below 5 ms), the tempo value is re-calculated in a further stage 5 on the basis of the previous tempo value, the correction events accumulated up to this time and the time elapsed since the last reset, as follows.
With
Furthermore, tests are carried out to check whether the corrections in stage 3 are consistently negative or positive over a certain period of time. If this is the case, there is probably a tempo change in the audio material, which cannot be corrected by the above procedure; this status is identified and on reaching the next approximately perfect synchronisation event (stage 5), the time and the correction memory are deleted in stage 6, in order to reset the starting point in phase and tempo. After this “reset”, the procedure begins again to optimise the tempo starting at stage 2.
A synchronisation of a second piece of music now takes place by matching its tempo and phase. The matching of the second piece of music takes place indirectly via the reference oscillator. After the approximation of tempo and phase in the piece of music as described above, these values are successively matched to the reference oscillator according to the above procedure, only this time the playback phase and playback rate of the track are themselves changed. The original tempo of the track can readily be calculated back from the required change in its playback rate by comparison with the original playback rate.
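The back-calculation of the original tempo can be illustrated with a one-line sketch, assuming the playback rate is expressed as a ratio relative to the original rate. The function and parameter names are illustrative.

```python
def original_tempo(matched_bpm, playback_rate):
    """Tempo of the unmodified track.

    matched_bpm: tempo at which the track currently plays (that of the
    reference oscillator once matched).
    playback_rate: current playback rate divided by the original rate
    (1.0 means unchanged).
    """
    return matched_bpm / playback_rate
```

A track that has to be played 5% faster to reach 126 bpm therefore has an original tempo of 120 bpm.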
The following paragraphs discuss the possibility already described above for playing back several pieces of music at the same time on a standard CD-ROM drive or another data source with only one reader unit. In this context, the present invention creates the possibility, essential for synchronising a second piece of music, of providing two or more pieces of music with a unit of this kind in real-time.
The prior art, in this context, is the playing back of an audio title from a CD-ROM by means of a computer (so-called “grabbing”), which is comparable with playing back a piece of music on a conventional CD player.
Just like audio CD players, CD-ROM drives have only one reader unit, and can therefore only read the audio data at one position at any given time.
To resolve this problem, a parallel thread, which is not coupled to the audio output, is produced to act as a so-called Scheduler, which, in the background, receives requests for the pieces of music to be played back and loads the necessary audio data as required.
The concept of multi-threading is understood to mean the capability of a software program to implement various functions of an application simultaneously. Accordingly, several programs are not run in parallel on the digital computer (multitasking), but, within one program, various functions are implemented at the same time from the perspective of the user. In this context, a thread represents the smallest unit of executable program code, to which one part of the operating system (the thread scheduler) allocates computer time according to a given priority. Coordination of the individual threads is carried out by means of synchronisation mechanisms, or so-called locks, which ensure that the individual threads work together in an orderly manner. The reader unit, in this context the laser of the CD-ROM drive, is operated in multiplex mode, so that it can provide the necessary data in real-time by means of buffer memory strategies and a higher reading rate.
The essential technical obstacle here is that, like audio CD players, CD-ROM drives have only one reader unit available. It is therefore only possible to supply the data for one track at any given time.
This problem is resolved in that for every track to be played back, an adequately dimensioned buffer is introduced, and the higher reading rate of the CD-ROM drive is used to read out the data for the buffer. This measure fits seamlessly into the environment of the music player described. For the user, the playback of CD tracks is transparent; it occurs exactly as if the data were present in a digital format on a computer hard disk. As a result of the digital read-out from the CD, it is possible to send the audio data through signal processing means such as filters or audio effects. Amongst other factors, this allows reverse playback, pitching (changing the rate and level of pitch), beat detection and filtering of normal audio CDs.
The drawing according to
This central instance, referred to below as the Scheduler S, is not coupled to the actual playback of the audio tracks TR1 . . . TRn. It runs in its own thread and sorts the requests received, sometimes in parallel, from various tracks into an order which is to be worked through sequentially. The scheduler S then sends the requests for an excerpt from a track to the CD-ROM drive CD-ROM. This reads the requested sectors from a data medium with the corresponding digital audio data. The scheduler S then fills the corresponding buffer P1 . . . Pn with the data received; data which are no longer required are over-written.
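The role of the Scheduler S may be sketched with a worker thread and a request queue. The class and method names, the first-in-first-out ordering and the stand-in for the drive are assumptions for illustration; for brevity the sketch appends to each buffer, whereas the arrangement described over-writes data which are no longer required.

```python
import queue
import threading


class Scheduler:
    """Serialises read requests from several tracks onto a single reader."""

    def __init__(self, read_sectors):
        self._read_sectors = read_sectors  # callable(track_id, start, count) -> bytes
        self._requests = queue.Queue()     # FIFO ordering is an assumption
        self._lock = threading.Lock()
        self.buffers = {}                  # track_id -> bytearray, one per track
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def request(self, track_id, start, count):
        """Called by a playback thread that needs more data for its track."""
        self._requests.put((track_id, start, count))

    def _run(self):
        while True:
            item = self._requests.get()
            if item is None:               # sentinel: stop the scheduler
                break
            track_id, start, count = item
            data = self._read_sectors(track_id, start, count)
            with self._lock:
                self.buffers.setdefault(track_id, bytearray()).extend(data)

    def stop(self):
        """Finish all pending requests, then end the worker thread."""
        self._requests.put(None)
        self._thread.join()


def fake_drive(track_id, start, count):
    """Stand-in for the CD-ROM drive: returns `count` dummy 4-byte sectors."""
    return bytes([track_id]) * (count * 4)


scheduler = Scheduler(fake_drive)
scheduler.request(1, start=0, count=2)   # request from track 1
scheduler.request(2, start=0, count=3)   # request from track 2
scheduler.request(1, start=2, count=1)
scheduler.stop()
```

Because only the worker thread calls the drive, the single reader unit never receives two requests at once, however many tracks are playing.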
Various storage media such as vinyl discs, compact discs or cassettes are conventionally used to play back pre-recorded music on appropriate devices. These formats were not developed to allow intervention into the playback process allowing the music to be processed in a creative manner. However, this possibility is desirable and is, indeed, currently practised by the DJs mentioned in the introduction in spite of the limitations encountered. In this context, vinyl discs are preferred because the playback rate and position can most readily be influenced by hand.
Nowadays, however, digital formats such as audio CD and MP3 are predominantly used for storing music. MP3 represents a compression procedure for digital audio data according to the MPEG standard (MPEG 1 Layer 3). The procedure is asymmetrical, that is, coding is very much more complex than decoding. Furthermore, it is a lossy procedure. The present invention allows the above-named creative processing of music in any digital format using an appropriately interactive music player, which utilises the new possibilities created by the measures according to the invention as described above.
In order to make targeted interventions, it is important to have a graphic representation of the music, in which the current playback position can be identified as well as a certain period in the future and in the past. For this purpose, an amplitude-envelope-curve of the sound-wave form over a period of several seconds before and after the playback position is conventionally displayed. The display moves in real-time at the rate at which the music is played.
In principle, the maximum amount of helpful information in the graphic display is desirable in order to allow targeted intervention. Moreover, it is desirable if interventions in the playback procedure can be made in the most ergonomic manner possible, in a manner comparable with so-called “scratching” on vinyl discs, which is understood to mean the holding and moving forwards or backwards of the turn-table during playback.
In the case of the interactive music player created by the invention, musically relevant points in time, especially beats, can be extracted from the audio signal with the beat-detector functions explained above (
A hardware control element R1 is also provided, e.g. a button, in particular a mouse button, which allows switching between two operating modes:
a) the music is played back freely at constant tempo
b) the playback position and rate are directly influenced by the user.
Mode a) corresponds to a vinyl disc, which is not touched and which rotates at the same rate as the turn-table. By contrast, mode b) corresponds to a vinyl disc, which is manually held and pushed backwards and forwards.
In one advantageous embodiment of an interactive music player, the playback rate in mode a) is further influenced by the automatic control for synchronising the beat of the music played back with another beat (cf.
Moreover, a further hardware control element R2 is provided. This is used in mode b) to influence the position of the disc, so to speak, and may be a continuous controller or also the computer mouse.
The drawing according to
The position data established with this further control element R2 generally have a limited time resolution, i.e. a message indicating the current position is sent only at regular or irregular intervals. However, the playback position of the stored audio signal is supposed to change uniformly with a time resolution which corresponds to the audio sampling rate. Accordingly, the invention uses a smoothing function at this position, which produces a high-resolution, uniformly changing signal from the stepped signal defined by the control element R2.
In this context, one method is to initiate a ramp with constant gradient for every position message defined, which, within a defined time, moves the smoothed signal from its old value to the value of the position message. Another possibility is to send the stepped wave form into a linear, digital low-pass filter LP, of which the output represents the desired, smoothed signal. A 2-pole resonance filter is particularly well suited for this purpose. A combination (series connection) of the two smoothing procedures is also possible and advantageous, and this allows the following advantageous signal processing chain:
Defined stepped signal->ramp smoothing->low-pass filter->exact playback position
or
Defined stepped signal->low-pass filter->ramp smoothing->exact playback position.
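The first of the two chains (ramp smoothing followed by the low-pass filter) can be sketched as follows. The ramp length, cut-off frequency, resonance value q and sample rate are illustrative assumptions, and the low-pass uses a commonly published second-order design rather than any filter specified in the text.

```python
import math


def ramp_smooth(steps, samples_per_step):
    """Linearly ramp from each position message to the next.

    steps: position values as delivered by the control element.
    Returns one value per output sample.
    """
    out = []
    current = steps[0]
    for target in steps:
        for k in range(1, samples_per_step + 1):
            out.append(current + (target - current) * k / samples_per_step)
        current = target
    return out


def lowpass_2pole(signal, cutoff_hz, sample_rate, q=0.707):
    """Second-order (2-pole) low-pass in direct form I, unity gain at DC."""
    w0 = 2.0 * math.pi * cutoff_hz / sample_rate
    alpha = math.sin(w0) / (2.0 * q)
    cos_w0 = math.cos(w0)
    a0 = 1.0 + alpha
    b0 = (1.0 - cos_w0) / 2.0 / a0
    b1 = (1.0 - cos_w0) / a0
    b2 = b0
    a1 = -2.0 * cos_w0 / a0
    a2 = (1.0 - alpha) / a0
    x1 = x2 = y1 = y2 = signal[0]  # start settled at the first value
    out = []
    for x in signal:
        y = b0 * x + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2
        out.append(y)
        x2, x1 = x1, x
        y2, y1 = y1, y
    return out


steps = [0.0, 1.0, 1.0, 3.0]                 # stepped position messages
ramped = ramp_smooth(steps, samples_per_step=100)
position = lowpass_2pole(ramped, cutoff_hz=20.0, sample_rate=1000.0)
```

The ramp removes the jumps in position, and the low-pass then rounds off the remaining corners in the rate.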
The block circuit diagram according to
The position must not jump when the user switches from one mode into the other (equivalent to holding and releasing the turn-table). For this reason, the proposed interactive music player adopts the position reached in the preceding mode as the starting position in the new mode. Similarly, the playback rate (first derivative of the position) must not change abruptly. Accordingly, the current rate is also adopted and moved by means of a smoothing function, as described above, to the rate which corresponds to the new mode. According to
During “scratching” with vinyl discs, that is to say, playback with strongly and rapidly changing playback rate, the sound-wave form changes in a characteristic manner, because of the properties of the recording method conventionally used for vinyl discs. When producing a press-master for the vinyl disc in the recording studio, the sound signal is passed through a pre-emphasis filter (pre-distortion filter) according to the RIAA standard, which raises the high frequencies (the so-called “cutting characteristic”). Every piece of equipment used for playing back vinyl discs contains a corresponding de-emphasis filter (reverse-distortion filter), which reverses the effect so that approximately the original signal is obtained.
Now, if the playback rate is not the same as the recording rate, which occurs, for example, during “scratching”, then all the frequency components of the signal on the vinyl disc are correspondingly shifted and therefore attenuated differently by the de-emphasis filter. The characteristic sound is produced as a result.
According to one further advantageous embodiment of an interactive music player according to the invention with a set-up corresponding to
A second-order digital IIR filter, i.e. with two favourably selected pole positions and two favourably selected zero positions, is advantageously used for the pre-emphasis and de-emphasis filter PEF and DEF, which should have the same frequency response as specified in the RIAA standard. If the pole positions of one filter are the same as the zero positions of the other filter, the effects of the two filters will cancel one another out, as desired, if the audio signal is played back at the original rate. In all other cases, the named filters produce the characteristic sound effect associated with “scratching”. Of course, the scratching-audio filter described can also be used in conjunction with any other type of music playback device with a “scratching” function.
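The pole/zero relationship between PEF and DEF can be demonstrated with a short sketch. The coefficients below are illustrative only and do not reproduce the RIAA curve; the change of playback rate between the two filters, which would break the cancellation, is omitted.

```python
def biquad(signal, zeros_poly, poles_poly):
    """Direct form I second-order IIR filter.

    zeros_poly = (b1, b2) and poles_poly = (a1, a2) describe
    H(z) = (1 + b1 z^-1 + b2 z^-2) / (1 + a1 z^-1 + a2 z^-2).
    """
    b1, b2 = zeros_poly
    a1, a2 = poles_poly
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x in signal:
        y = x + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2
        out.append(y)
        x2, x1 = x1, x
        y2, y1 = y1, y
    return out


# Illustrative coefficients only; these are NOT the RIAA curve.
ZEROS = (-1.4, 0.45)   # zeros at z = 0.9 and z = 0.5
POLES = (-0.3, 0.02)   # poles at z = 0.2 and z = 0.1


def pre_emphasis(signal):
    """PEF: raises the high frequencies."""
    return biquad(signal, ZEROS, POLES)


def de_emphasis(signal):
    """DEF: poles and zeros of the PEF swapped."""
    return biquad(signal, POLES, ZEROS)


audio = [0.0, 1.0, 0.5, -0.25, 0.75, -1.0, 0.2, 0.0]
restored = de_emphasis(pre_emphasis(audio))
```

At the original rate, `restored` matches `audio` up to rounding error, because each pole of one filter is cancelled by a zero of the other.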
In combination with the suggested CD-grabbing procedure, it is also advantageous if one and the same title can be loaded twice into the interactive music player to be mixed and/or “re-mixed” with itself via the automix procedure or allowed to run as a long, one-song-mix, without ever losing the beat. In this manner, very short pieces of music can be prolonged as required by the DJ.
Moreover, the tempo of a mix can be gradually raised or lowered via a targeted frequency change of the master clock MCLK (the reference oscillator from
As already mentioned, when several pieces of music are mixed conventionally, the audio sources from sound media are played back on several playback devices and mixed via a mixing desk. With this procedure, an audio recording is restricted to recording the final result. It is therefore not possible to reproduce the mixing procedure or, at a later time, to start exactly at a predetermined position within a piece of music.
The present invention achieves precisely this goal by proposing a file format for digital control information, which provides the possibility of recording and accurately reproducing from audio sources the process of interactive mixing together with any processing effects. This is especially possible with a music player as described above.
The recording is subdivided into a description of the audio sources used and a time sequence of control information for the mixing procedure and additional effect processing.
Only the information about the actual mixing procedure and the original audio sources is required in order to reproduce the results of the mixing procedure. The actual digital audio data are provided externally. This avoids procedures involving the copying of protected pieces of music which can be problematic under copyright law. Accordingly, by storing digital control data, which relate to playback position, synchronisation information, real-time interventions using audio-signal-processing etc., mixing procedures for several audio pieces representing a mix of audio sources together with any effect processing used, can be realised as a new complete work with a comparatively long playback duration.
This provides the advantage, that a description of the processing of the audio sources is relatively short by comparison with the audio data from the mixing procedure, and the mixing procedure can be edited and re-started at any desired position. Moreover, existing audio pieces can be played back in various compilations or as longer, interconnected interpretations.
With existing sound media and music players, it has not so far been possible to record and reproduce the interaction with the user, because the known playback equipment does not provide the technical conditions required to control this accurately enough. This has only become possible as a result of the present invention, wherein several digital audio sources can be reproduced and their playback positions established and controlled. As a result, the entire procedure can be processed digitally, and the corresponding control data can be stored in a file. These digital control data are preferably stored with a resolution which corresponds to the sampling rate of the processed digital audio data.
The recording is essentially subdivided into two parts:
The list of audio sources used contains, for example:
Amongst other data, the control information stores the following:
The following paragraphs describe one possible example for administering the list of audio pieces in an instance in the XML format. In this context, XML is an abbreviation for Extensible Markup Language. This is a meta-language for describing structured documents, which is used, among other things, in the World Wide Web. By contrast with HTML (Hypertext Markup Language), it is possible for the author of an XML document to define within the document itself certain extensions of XML in the document-type-definition-part of the document and also to use these within the same document.
<?xml version="1.0" encoding="ISO-8859-1"?>
<MJL VERSION="version description">
  <HEAD PROGRAM="program name" COMPANY="company name"/>
  <MIX TITLE="title of the mix">
    <LOCATION FILE="marking of the control information file" PATH="storage location for control information file"/>
    <COMMENT> comments and remarks on the mix </COMMENT>
  </MIX>
  <PLAYLIST>
    <ENTRY TITLE="title entry 1" ARTIST="name of author" ID="identification of title">
      <LOCATION FILE="identification of audio source" PATH="memory location of audio source" VOLUME="storage medium of the file"/>
      <ALBUM TITLE="name of the associated album" TRACK="identification of the track on the album"/>
      <INFO PLAYTIME="playback time in seconds" GENRE_ID="code for musical genre"/>
      <TEMPO BPM="playback tempo in BPM" BPM_QUALITY="quality of tempo value from the analysis"/>
      <CUE POINT1="position of the first cue point" . . . POINTn="position of the nth cue point"/>
      <FADE TIME="fade time" MODE="fade mode"/>
      <COMMENT> comments and remarks on the audio piece
        <IMAGE FILE="code for an image file as additional commentary option"/>
        <REFERENCE URL="code for further information on the audio source"/>
      </COMMENT>
    </ENTRY>
    <ENTRY . . . >
    </ENTRY>
  </PLAYLIST>
</MJL>
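A list of this kind could be read back with a standard XML parser, as in the following sketch. The element and attribute names follow the listing above; the nesting (MIX closed before PLAYLIST) and all attribute values are assumptions made up for the example.

```python
import xml.etree.ElementTree as ET

# A small, well-formed instance with made-up values.
MJL_EXAMPLE = b"""<?xml version="1.0" encoding="ISO-8859-1"?>
<MJL VERSION="1.0">
  <HEAD PROGRAM="example player" COMPANY="example company"/>
  <MIX TITLE="Example mix">
    <LOCATION FILE="example.ctl" PATH="/mixes"/>
    <COMMENT>made-up mix</COMMENT>
  </MIX>
  <PLAYLIST>
    <ENTRY TITLE="First title" ARTIST="Artist A" ID="1">
      <LOCATION FILE="track01" PATH="/cd" VOLUME="CD 1"/>
      <TEMPO BPM="120.0" BPM_QUALITY="90"/>
    </ENTRY>
    <ENTRY TITLE="Second title" ARTIST="Artist B" ID="2">
      <LOCATION FILE="track02" PATH="/cd" VOLUME="CD 1"/>
      <TEMPO BPM="126.5" BPM_QUALITY="75"/>
    </ENTRY>
  </PLAYLIST>
</MJL>
"""

root = ET.fromstring(MJL_EXAMPLE)
mix_title = root.find("MIX").get("TITLE")
control_file = root.find("MIX/LOCATION").get("FILE")
playlist = [
    {
        "title": entry.get("TITLE"),
        "artist": entry.get("ARTIST"),
        "file": entry.find("LOCATION").get("FILE"),
        "bpm": float(entry.find("TEMPO").get("BPM")),
    }
    for entry in root.find("PLAYLIST").findall("ENTRY")
]
```

The resulting `playlist` holds one dictionary per audio source, while `control_file` names the separately stored control information.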
The control information data, referenced through the list of audio pieces, are preferably stored in binary format. The basic structure of the stored control information in a file can be described, by way of example, as follows:
[Number of control blocks N]
For [number of control blocks N] is repeated {
  [time difference since the last control block in milliseconds]
  [number of control points M]
  For [number of control points M] is repeated {
    [identification of controller]
    [controller channel]
    [new value of the controller]
  }
}
[identification of controller] defines a value which identifies a control element (e.g. volume, rate, position) of the interactive music player. Several sub-channels [controller channel], e.g. number of playback module, may be allocated to control elements of this kind. An unambiguous control point M is addressed with [identification of controller], [controller channel].
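The block structure described above can be written and read back as in the following sketch. The field widths (unsigned 32-bit little-endian integers and a 32-bit floating-point controller value) and the controller identifiers are assumptions; the text fixes none of them.

```python
import io
import struct

# Assumed field widths: every count, time difference and identifier is an
# unsigned 32-bit little-endian integer; the controller value is a 32-bit float.
U32 = struct.Struct("<I")
POINT = struct.Struct("<IIf")  # controller id, controller channel, new value


def write_control(stream, blocks):
    """blocks: list of (delta_ms, [(controller_id, channel, value), ...])."""
    stream.write(U32.pack(len(blocks)))
    for delta_ms, points in blocks:
        stream.write(U32.pack(delta_ms))
        stream.write(U32.pack(len(points)))
        for controller_id, channel, value in points:
            stream.write(POINT.pack(controller_id, channel, value))


def read_control(stream):
    """Inverse of write_control; returns the same nested list structure."""
    (n_blocks,) = U32.unpack(stream.read(U32.size))
    blocks = []
    for _ in range(n_blocks):
        (delta_ms,) = U32.unpack(stream.read(U32.size))
        (n_points,) = U32.unpack(stream.read(U32.size))
        points = [POINT.unpack(stream.read(POINT.size)) for _ in range(n_points)]
        blocks.append((delta_ms, points))
    return blocks


VOLUME, RATE = 1, 2  # hypothetical controller identifiers
recorded = [(0, [(VOLUME, 0, 0.5)]), (250, [(VOLUME, 1, 0.75), (RATE, 0, 1.0)])]
buffer = io.BytesIO()
write_control(buffer, recorded)
buffer.seek(0)
replayed = read_control(buffer)
```

Two control blocks with three control points in total occupy 56 bytes under these assumptions, which illustrates how compact the record is compared with the audio data itself.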
As a result, a digital record of the mixing procedure is produced, which can be stored, reproduced non-destructively with reference to the audio material, duplicated and transmitted, e.g. over the Internet.
One advantageous embodiment with reference to such control files is a data medium D, as shown in
However, the invention can be realised in a particularly advantageous manner on an appropriately programmed digital computer with appropriate audio interfaces, in that a software program executes the procedural stages of the computer system (e.g. the playback and/or mix application PRG_DATA) presented above. In combination with the advantageous CD-grabbing methods implemented on a standard CD-ROM drive, the data medium described then allows the full functionality of the invention.
Provided the known prior art permits, all of the features mentioned in the above description and shown in the diagrams should be regarded as components of the invention either in their own right or in combination.
The above description of preferred embodiments according to the invention is provided for the purpose of illustration. These exemplary embodiments are not exhaustive. Moreover, the invention is not restricted to the form exactly as indicated, indeed, numerous modifications and changes are possible within the technical doctrine indicated above. One preferred embodiment has been selected and described in order to illustrate the basic details and practical applications of the invention, thereby allowing a person skilled in the art to realise the invention. A number of preferred embodiments and further modifications may be considered in specialist areas of application.
Number | Date | Country | Kind |
---|---|---|---|
101 01 473 | Jan 2001 | DE | national |
This application is a continuation of U.S. patent application Ser. No. 10/251,000 filed Jul. 8, 2003, which is a national phase entry of PCT/EP02/00074 filed Jan. 7, 2002, which claims priority to DE 101 01 473.2 filed Jan. 13, 2001, all of which are incorporated by reference.
Number | Date | Country |
---|---|---|
20100011941 A1 | Jan 2010 | US |
Relation | Number | Country |
---|---|---|
Parent | 10251000 | US |
Child | 12565766 | US |