The present invention relates to an improved waveform data analysis method and waveform data analysis apparatus suitable for automatic performances, particularly automatic accompaniments, executed by personal computers, electronic musical instruments, amusement equipment, etc. as well as a computer program to be used for the waveform data analysis.
Heretofore, there has been known the technique of recording tones of a natural musical instrument or the like of certain lengths and then automatically reproducing the thus-recorded tones repetitively at a rate corresponding to a set tempo. With this technique, often used in an automatic accompaniment of rhythm tones or the like, it is necessary to expand or compress the original waveforms in accordance with a set tempo, in order to avoid reproduction pitch variations. For example, to this end, original waveform data, obtained by stereophonic recording of tones of a natural musical instrument or the like, are divided at rise portions (for convenience of description, hereinafter referred to as “waveform data control points”) of an envelope into a plurality of sections (for convenience of description, hereinafter referred to as “original sections”). When an automatic rhythm accompaniment or the like is to be performed using such original waveform data, the original waveform data may be reproduced repetitively as they are without being subjected to particular processing, as long as the waveform data are reproduced at the same tempo as when they were recorded (i.e., as long as the reproducing tempo of the waveform data is the same as the original recording tempo).
When, however, the reproducing tempo is to be faster than the original recording tempo, it is necessary to shorten the individual original sections to be reproduced; for this purpose, respective end segments of the original sections may be cut off at a given ratio. For example, if the original recording tempo is “100” and the reproducing tempo is “125”, the end segments of the individual original sections may each be cut off by 20% to allow only the remaining waveform data to be reproduced. On the other hand, when the reproducing tempo is to be slower than the original recording tempo, there arises a problem. Namely, if reproduction start timing of the individual original sections is simply delayed in accordance with the desired reproducing tempo, silent segments are produced in gaps between the successive original sections, which tend to become offensive to the ears. Thus, it has been conventional to add, to each of the gaps between the original sections, a necessary length of the waveform data of the immediately-preceding original section. At that time, the initial amplitude value of a portion of the waveform data to be added to fill in the gap is set to coincide with the amplitude value of a portion of the waveform data immediately preceding the to-be-added portion.
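The arithmetic of the conventional approach can be illustrated with a minimal sketch. The function names `cut_ratio` and `gap_ratio` are hypothetical and are introduced here only for illustration; the sketch assumes that every original section is scaled by the same tempo ratio.

```python
def cut_ratio(recording_tempo, reproducing_tempo):
    """Fraction of each original section to cut off at its end
    when the reproducing tempo is faster than the recording tempo."""
    if reproducing_tempo <= recording_tempo:
        return 0.0  # nothing needs to be cut at the same or a slower tempo
    return 1.0 - recording_tempo / reproducing_tempo


def gap_ratio(recording_tempo, reproducing_tempo):
    """Length of the gap to be filled after each original section,
    as a fraction of the section length, when the tempo is slower."""
    if reproducing_tempo >= recording_tempo:
        return 0.0  # no gap arises at the same or a faster tempo
    return recording_tempo / reproducing_tempo - 1.0


# The example from the text: recording tempo 100, reproducing tempo 125.
print(cut_ratio(100, 125))
```

With a recording tempo of “100” and a reproducing tempo of “125”, the sketch yields the 20% cut-off mentioned above.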
However, the above-mentioned conventional technique has not been satisfactory in that the waveform data control points, i.e. dividing positions for waveform control, cannot necessarily be set at appropriate positions. Namely, although the conventional technique is arranged to set the waveform data control point at a position of the envelope where the amplitude exceeds a predetermined threshold value, sometimes no waveform data control point is set automatically even at a position that can be identified as a rise portion through the human auditory sense, because the peak does not reach the threshold value. In such a case, a plurality of beats may be undesirably included in a single original section, and the tempo compression/expansion cannot be executed properly between these beats. Conversely, at a position where there is a great envelope variation, a plurality of the waveform data control points are sometimes set at the position even though the position is identified as a single beat through the human auditory sense, so that unnatural compression/expansion tends to be executed.
Further, where the original recording tempo and reproducing tempo are different from each other, and if the reproduction start timing of the individual sections is controlled simply in accordance with a ratio between the original recording tempo and the reproducing tempo, a sense of “tardiness” or “heaviness” would be undesirably produced, particularly in a waveform having slow rise portions. A specific example of the “tardiness” or “heaviness” will be explained later in relation to
Further, it has been known to use the following devices or software in order to compress or expand an original waveform in accordance with a set tempo.
(1) Sampler: The sampler samples an analog waveform and converts it into digital waveform data. There have been known two major types of samplers, one type for recording a single tone waveform and the other type for recording a phrase waveform made up of a plurality of tones; the other type is commonly known as a “phrase sampler”.
(2) Slicer: The slicer allocates serial note numbers to divided waveform data starting with the leading waveform data, and generates and stores automatic performance information composed of the allocated note numbers and timing (dividing positions), i.e. generates and stores sequence data for driving the divided waveform data. Original waveform data (waveform data before the division) can be reproduced by executing an automatic performance on the basis of the automatic performance information at the original recording tempo while triggering the divided waveform data in response to reproduced note-on events. If the tempo is changed, the timing of the note-on events varies in accordance with the changed tempo so that the waveform data as a whole are expanded or compressed in a time-axial direction.
(3) Sequencer: The sequencer reproduces the above-mentioned sequence data. However, in addition to reproducing the sequence data merely as they are, the sequencer can reproduce the sequence data at any desired tempo by increasing or decreasing the reproducing speed as appropriate. Note that the reproduction timing of the individual unit waveform data can of course be independently varied as desired in advance by previously editing the sequence data. As the sequence data are supplied to a suitable tone generator, tone waveforms are reproduced on the basis of the reproduced sequence data.
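The interplay of the slicer and the sequencer described above can be sketched as follows. This is only an illustrative sketch: the function names, the representation of sequence data as (note number, beat position) pairs, and the starting note number are assumptions and are not taken from any particular device.

```python
def build_sequence(dividing_positions, recording_tempo, first_note=0):
    """Slicer side: allocate serial note numbers to the divided waveform
    data and store each dividing position as a beat position.

    dividing_positions: positions in seconds at the recording tempo.
    recording_tempo: beats per minute at which the waveform was recorded.
    """
    beats_per_second = recording_tempo / 60.0
    return [(first_note + i, pos * beats_per_second)
            for i, pos in enumerate(dividing_positions)]


def note_on_times(sequence, tempo):
    """Sequencer side: convert beat positions into note-on times in
    seconds for an arbitrary reproducing tempo."""
    seconds_per_beat = 60.0 / tempo
    return [(note, beat * seconds_per_beat) for note, beat in sequence]


sequence = build_sequence([0.0, 0.5, 1.0], recording_tempo=120)
print(note_on_times(sequence, tempo=60))  # note-on events spread out in time
```

Reproducing the sequence at the recording tempo returns the original dividing positions, whereas halving the tempo doubles the interval between note-on events, which corresponds to the expansion in the time-axial direction described above.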
Let's now consider a case where a new track is added to given original sequence data and additional sequence data obtained by the above-mentioned technique are written into the new track so as to create ensemble (synthesized) sequence data. Because the original sequence data and additional sequence data are recorded independently of each other, merely writing the additional sequence data into the new track would result in undesired timing differences between the original sequence data and the additional sequence data. Thus, there is a need to deliberately adjust the timing of the original sequence data and additional sequence data to accurately coincide with each other, which is a very complex and troublesome task. Particularly, in a case where the tempo of the original sequence data is varied at a halfway position of the sequence data, for example, it is necessary to perform the timing adjustment here and there in a music piece in question. Further, because the conventional technique is arranged to divide the waveform data only on the basis of dividing positions obtained through analysis of an envelope of the waveform data, there is a possibility of the dividing positions being detected erroneously or the waveform data being divided at musically inappropriate positions.
In view of the foregoing, it is an object of the present invention to provide a waveform data analysis method and waveform data analysis apparatus which can determine optimal dividing positions (waveform data control points) of waveform data, as well as a computer program to be used for the waveform data analysis.
It is another object of the present invention to provide a waveform data analysis method and waveform data analysis apparatus which can record waveform data in association with automatic performance data, as well as a computer program to be used for such waveform data analysis.
According to one aspect of the present invention, there is provided a waveform data analysis method which comprises: a step of performing a filter process for removing components of a predetermined frequency band from original waveform data; and a step of determining dividing positions of the original waveform data on the basis of envelope levels of the waveform data having been subjected to the filter process. Such arrangements can appropriately remove components of a predetermined frequency band, such as sustainable components of vocal sounds, bass tones or the like in the original waveform data, that would impede detection of optimal dividing positions of the waveform data, and thereby permit appropriate envelope level analysis and hence determination of optimal dividing positions. As in the above-discussed conventionally-known technique, the thus-determined dividing positions may be used, for example, as waveform data control points when the original waveform data are to be compressed or expanded with a view to variably controlling a reproducing performance tempo without changing a pitch feeling of the original waveform data.
According to another aspect of the present invention, there is provided a waveform data analysis method which comprises: a step of performing a filter process for removing components of a predetermined frequency band from original waveform data; a step of detecting an envelope of the waveform data having been subjected to the filter process; and a step of determining dividing positions of the original waveform data on the basis of differentiation of the detected envelope. Such arrangements also permit determination of optimal dividing positions of the waveform data.
As one example, the waveform data analysis method may further comprise an amplitude conversion step of reducing an amplitude level difference in the detected envelope, and the step of determining dividing positions may determine the dividing positions of the original waveform data on the basis of differentiation of the envelope having been processed by the amplitude conversion step. As another example, the step of determining dividing positions may include a step of detecting peak levels corresponding to the determined dividing positions. In such a case, the method may further comprise a step of setting a time difference (Td) between a reproduction start time point of the original waveform data and a start time point of a given dividing position of the original waveform data as
Td=n(Ts+Tt)−Tt
where Ts represents an original time difference between a reproduction start position of the original waveform data and a start position of the given dividing position, Tt represents an original time difference between the given dividing position and a peak position where a peak level corresponding to the given dividing position occurs, and n represents an expansion/compression ratio of a reproducing tempo at which the original waveform data are to be reproduced.
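As a minimal numerical sketch of the above expression, the following function evaluates Td for given values of Ts, Tt and n. The function name and the sample values are hypothetical. One way to read the expression is that the peak position, originally at Ts+Tt, is moved to n(Ts+Tt), while the rising time Tt preceding the peak is left unscaled.

```python
def dividing_position_start(ts, tt, n):
    """Time difference Td between the reproduction start time point and
    the start time point of a given dividing position.

    ts: original time from the reproduction start to the dividing position
    tt: original time from the dividing position to its peak position
    n:  expansion/compression ratio of the reproducing tempo
    """
    return n * (ts + tt) - tt


# With n = 1 the dividing position stays where it was recorded.
print(dividing_position_start(ts=1.0, tt=0.1, n=1.0))
# With n = 2 the peak moves from 1.1 to 2.2, and the start precedes it by tt.
print(dividing_position_start(ts=1.0, tt=0.1, n=2.0))
```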
According to still another aspect of the present invention, there is provided a waveform data analysis method which comprises: a step of determining presumed beat positions in original waveform data; a step of detecting rise positions in the original waveform data within predetermined ranges corresponding to the determined presumed beat positions; and a step of extracting any one of the detected rise positions as a dividing position of the original waveform data. Such arrangements also permit determination of optimal dividing positions of the waveform data.
According to still another aspect of the present invention, there is provided a waveform data analysis method which comprises: a step of detecting rise positions in original waveform data; and a step of selecting one rise position from among one or more rise positions detected within a predetermined range of the original waveform data and extracting the selected rise position as a dividing position of the original waveform data. Such arrangements also permit determination of optimal dividing positions of the waveform data.
According to still another aspect of the present invention, there is provided a waveform data analysis method which comprises: a step of reproducing automatic performance information; a step of storing waveform data in parallel with reproduction of the automatic performance information; and a step of storing synchronization control data indicative of relationship in processing timing between the automatic performance information and the waveform data, in correspondence with storage of the waveform data. With such arrangements, the waveform data can be stored in association with the automatic performance information. Further, because the synchronization control data can be obtained in association with the automatic performance information, it is possible to properly and readily determine dividing positions of the waveform data that should function as waveform data control points.
According to still another aspect of the present invention, there is provided a waveform data processing method which comprises: a step of dividing original waveform data into a plurality of sections; and a step of adding waveform data of an additional section to an end of a selected one of the sections divided from the original waveform data by said step of dividing, the waveform data of the additional section attenuating, with passage of time, from an initial value equal to an envelope level at the end of the selected section.
According to still another aspect of the present invention, there is provided a waveform data processing method which comprises: a step of dividing original waveform data into a plurality of sections; a step of, in correspondence with the sections divided from the original waveform data by said step of dividing, previously generating and storing waveform data of additional sections to be added to individual ones of the divided sections; a step of, when a reproducing tempo is faster than a predetermined standard, using the original waveform data of the individual divided sections to reproduce a waveform without using the waveform data of the additional sections; and a step of, when the reproducing tempo is slower than the predetermined standard, reproducing a waveform by adding the waveform data of corresponding ones of the additional sections to the divided sections to follow the waveform data of the divided sections.
The present invention may be constructed and implemented not only as the method invention as discussed above but also as an apparatus invention. Also, the present invention may be arranged and implemented as a software program for execution by a processor such as a computer or DSP, as well as a storage medium storing such a program. Further, the processor used in the present invention may comprise a dedicated processor with dedicated logic built in hardware, not to mention a computer or other general-purpose type processor capable of running a desired software program.
While the embodiments to be described herein represent the preferred form of the present invention, it is to be understood that various modifications will occur to those skilled in the art without departing from the spirit of the invention. The scope of the present invention is therefore to be determined solely by the appended claims.
For better understanding of the objects and other features of the present invention, its embodiments will be described in greater detail hereinbelow with reference to the accompanying drawings, in which:
1. Hardware Setup of First Embodiment:
The following paragraphs describe an exemplary general hardware setup of a waveform editing system in accordance with a first embodiment of the present invention, with reference to
The waveform editing system includes a communication interface (I/F) 2 that communicates waveform data and various other data via an external network such as the Internet, an input operation unit 4 including a keyboard and mouse, a performance operator unit 6 including pad operators for simulating a keyboard and percussion instrument and the like, and a display device 8 that visually displays various information to a user.
The waveform editing system also includes a CPU 10 that controls various other components in the waveform editing system via a bus 16 on the basis of later-described programs, a ROM 12 having stored therein an initial loader program and the like, and a RAM 14 to and from which various data are written and read via the CPU 10. Reference numeral 18 represents a drive device that writes and reads data to and from a storage medium 20 such as a CD-ROM or MO (Magneto-Optical) disk. The waveform editing system further includes a waveform input interface (I/F) 22 that samples an analog waveform input from an external waveform source, converts the input analog waveform into digital waveform data and outputs the digital waveform data via the bus 16. Reference numeral 24 represents a hard disk storing an operating system of the general-purpose computer, the later-described waveform-editing application program, waveform data and the like. Reference numeral 26 represents a waveform output interface that converts digital waveform data, supplied via the bus 16, into an analog waveform so that the converted analog waveform is audibly reproduced or sounded via a sound system 28.
2. Behavior of First Embodiment:
The following paragraphs describe behavior of the first embodiment. Upon powering-on of the personal computer, the initial loader program stored in the ROM 12 is executed so that the operating system is started up. Once a predetermined operation is performed by the user while the operating system is running, the waveform-editing application program is started.
2.1. Acquisition of Original Waveform Data:
Once predetermined operation is performed by the user while the waveform-editing application program is ON, original waveform data to be processed are loaded into the RAM 14 or hard disk 24 via the waveform input interface 22. Note that the original waveform data to be processed may be acquired via the communication interface 2 or storage medium 20.
2.2. To-be-Reproduced-Waveform-Data Generation Processing:
2.2.1. Trimming Process (steps SP2 and SP4 of
Once predetermined operation is performed by the user after the acquisition of the original waveform data, a to-be-reproduced-waveform-data generation processing routine shown in
2.2.2. Parameter Setting Process (step SP6):
After step SP4, the to-be-reproduced-waveform-data generation processing routine moves on to step SP6, where the user designates various parameters for detecting waveform data control points. Examples of the various parameters to be designated by the user include the following.
(1) Waveform Type: The waveform type parameter specifies, for example, a desired type of waveform, such as a percussion-type or sustainable-type waveform, and is classified into a plurality of variations suited to original waveform data of various musical instruments. Default values of thresholds and various other parameters are set on the basis of the waveform type.
(2) Number of Measures: This number-of-measures parameter specifies, by one of natural numbers ranging, for example, from “1” to “8”, a desired number of measures to be included in the waveform data after having been subjected to the silent segment trimming.
(3) Musical Time: This musical time parameter specifies musical time of the waveform data, e.g. one of 1/4 (one-four time)-8/4 (eight-four time), 1/8 (one-eight time)-16/8 (sixteen-eight time) and 1/16 (one-sixteen time)-16/16 (sixteen-sixteen time). Because the total time length of the waveform data is already known, one specific tempo can be uniquely set once the number of measures and musical time have been set.
(4) Resolution: This resolution parameter specifies particular resolution with which each measure of the waveform data is to be examined in order to detect control points of downbeats and upbeats. For example, where the musical time is “4/4” (four-four time), any one of resolution values “¼”, “¼(+3)”, “⅛”, “⅛(+3)”, “1/16”, “1/16(+3)” and “1/32” can be designated; here, “(+3)” means division into triplets. In this case, designation of the “¼” resolution causes the waveform data to be examined every quarter (¼) timing of each measure (i.e., at points dividing each measure into four equal portions, or at each quarter-note timing), designation of the “¼(+3)” resolution causes the waveform data to be examined every 1/12 timing of each measure, designation of the “⅛” resolution causes the waveform data to be examined every ⅛ timing of each measure (i.e., at points dividing each measure into eight equal portions, or at each eighth-note timing), designation of the “⅛(+3)” resolution causes the waveform data to be examined every 1/24 timing of each measure, designation of the “1/16” resolution causes the waveform data to be examined every 1/16 timing of each measure (i.e., at each sixteenth-note timing), designation of the “1/16(+3)” resolution causes the waveform data to be examined every 1/48 timing of each measure, and designation of the “1/32” resolution causes the waveform data to be examined every 1/32 timing of each measure.
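How the number of measures, the musical time and the resolution translate into concrete values can be sketched as follows. This is a minimal sketch under stated assumptions: the function names are hypothetical, the tempo is assumed to be counted in quarter notes per minute, and the resolution is written as an ASCII string such as "1/8(+3)" for a four-four measure.

```python
def tempo_bpm(total_seconds, measures, beats_per_measure, beat_unit):
    """Tempo implied by the total length of the trimmed waveform data.

    beats_per_measure / beat_unit is the musical time, e.g. 4 and 4 for
    four-four time. The result is in quarter notes per minute (assumed).
    """
    quarter_notes = measures * beats_per_measure * 4.0 / beat_unit
    return 60.0 * quarter_notes / total_seconds


def divisions_per_measure(resolution):
    """Number of equally spaced examination points in a four-four measure.

    "(+3)" denotes division into triplets, so "1/8(+3)" gives 24.
    """
    triplet = resolution.endswith("(+3)")
    base = resolution.replace("(+3)", "")
    denominator = int(base.split("/")[1])
    return denominator * 3 if triplet else denominator


def grid_positions(measure_seconds, resolution):
    """Examination points, in seconds from the head of one measure."""
    n = divisions_per_measure(resolution)
    return [measure_seconds * k / n for k in range(n)]


print(tempo_bpm(4.0, measures=2, beats_per_measure=4, beat_unit=4))
print(divisions_per_measure("1/16(+3)"))
print(grid_positions(2.0, "1/4"))
```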
Of the above-mentioned various parameters, the waveform type can be selected in the following manner. Namely, a percussion-type waveform selection button 80 and sustainable-type waveform selection button 82 are displayed on the display device 8 as illustrated in
2.2.3 Unnecessary-band Removing Filter Process (step SP8):
The waveform data, having been subjected to the silent segment trimming, contain various frequency components, which include components of an unnecessary frequency band that become an obstacle to the detection of the waveform data control points. Therefore, at next step SP8, a filter process is carried out for removing components of such an unnecessary frequency band from the waveform data. Contents of the filter process are generally classified into two major types: a band cut filter process; and a high-pass filter process, and it is preferable that the contents of the filter process be determined in accordance with the designated waveform type. That is, either one or both of the band cut filter process and high-pass filter process is carried out, and parameters for use in the filter process are also determined in accordance with the designated waveform type.
The following paragraphs describe an exemplary manner in which the parameters for use in the filter process are set. Of the waveform data, components having pitches, such as those of melody data, have a high possibility of becoming an obstacle in the waveform data control point detection. Analysis of a variety of music pieces has revealed that many of such components, i.e. components of a sustainable portion of vocal sounds, bass tones or the like, appear in an “80 Hz-8 kHz” frequency band and particularly in a “100 Hz-300 Hz” frequency band. For this reason, the band cut filter process is performed in such a manner as to attenuate the “80 Hz-8 kHz” frequency band and, particularly, the “100 Hz-300 Hz” frequency band. Note that because components of an attack portion (consonant, attack noise and the like) of vocal sounds, bass tones or the like spread widely across frequency bands other than the “80 Hz-8 kHz” frequency band, the waveform data control points can still be detected even after the filter process has been performed.
In a band performance or the like, high-frequency tones of a cymbal or the like are performed and recorded regularly. In such a case, it is more advantageous to carry out the high-pass filter process for extracting only the regular high-frequency components. Because neither the band cut filter process nor the high-pass filter process requires steep filter characteristics, it is, in practice, only necessary that a first-order filter be used in the high-pass filter process and a second-order filter be used in the band cut filter process. As an exemplary result of the unnecessary-band removing filter process, a waveform after having been subjected to the silent segment trimming are shown in
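A first-order high-pass filter of the kind mentioned above can be sketched as follows. This is only an illustrative sketch: the specification does not give the filter coefficients, so the common discrete-time RC form is assumed here, and the function name, sampling rate and cutoff frequency are hypothetical. The second-order band cut filter is not shown.

```python
import math


def high_pass_first_order(samples, sample_rate, cutoff_hz):
    """First-order high-pass filter (discrete RC form, assumed here).

    y[n] = a * (y[n-1] + x[n] - x[n-1]), with a = RC / (RC + dt).
    """
    if not samples:
        return []
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    dt = 1.0 / sample_rate
    a = rc / (rc + dt)
    out = [samples[0]]
    for n in range(1, len(samples)):
        out.append(a * (out[-1] + samples[n] - samples[n - 1]))
    return out


# A constant (zero-frequency) input is removed after the initial transient,
# while a rapidly alternating input passes almost unchanged.
dc = high_pass_first_order([1.0] * 1000, 44100, 1000.0)
fast = high_pass_first_order([(-1.0) ** n for n in range(1000)], 44100, 1000.0)
print(abs(dc[-1]) < 1e-6, abs(fast[-1]) > 0.9)
```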
2.2.4. Determination of Default Waveform Data Control Point (steps SP10 and SP12):
As noted earlier, the waveform data control points are reference positions to be used for editing waveform data. First, an exemplary manner in which waveform data control points are set will be described with reference to
Referring back to
In the “analysis mode”, the waveform data control points are determined on the basis of analyzed results of the waveform data. Specifically, rise start positions, peak positions, etc. of a tone volume envelope are detected, and the waveform data control points are set on the basis of the detected results. The default waveform data control points having been determined in the above-mentioned manner are displayed on the display device 8 along with the waveform data, as illustratively shown in
(1) Down-Sampling Process (step SP102):
At step SP102 of
(2) Absolute Value Acquisition Process (step SP104):
Respective absolute values of the down-sampled waveform data are obtained at step SP104. In
(3) Envelope Follower Process (SP106):
At next step SP106 of
In the illustrated example of
Further, a switch 64 operates to select the first multiplier 66 when the difference signal d is “0” or over but select the second multiplier 68 when the difference signal d is below “0”, so as to supply the difference signal d to the selected multiplier 66 or 68. Adder 70 adds an output signal from the selected multiplier 66 or 68 and the envelope level of the last sampling cycle, and outputs the addition result as an envelope level of the current sampling cycle. In
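The absolute value acquisition and the envelope follower described above can be sketched in software as follows. This is a minimal sketch under stated assumptions: the difference signal d is assumed to be the current input level minus the envelope level of the last sampling cycle, and the two multiplier coefficients (`rise_coef` and `fall_coef`) are hypothetical values chosen only for illustration.

```python
def envelope_follower(samples, rise_coef=0.5, fall_coef=0.01):
    """Follow the tone volume envelope of down-sampled waveform data.

    For each sample, the difference d between the absolute input level and
    the previous envelope level is scaled by one of two coefficients,
    selected by the sign of d, and added to the previous envelope level.
    """
    envelope = 0.0
    out = []
    for x in samples:
        level = abs(x)            # absolute value acquisition
        d = level - envelope      # difference signal d (assumed definition)
        coef = rise_coef if d >= 0 else fall_coef  # role of the switch
        envelope = envelope + coef * d             # role of the adder
        out.append(envelope)
    return out


print(envelope_follower([1.0, -1.0, 1.0, 0.0, 0.0]))
```

With a larger coefficient for d of “0” or over than for d below “0”, the envelope rises quickly on an attack and decays slowly afterwards; whether the embodiment uses coefficients in this relationship is an assumption of the sketch.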
(4) Compressor Process (step SP108):
Then, at step SP108 of
(5) Edge-Detecting Filter Process (step SP110):
At step SP110 of
(6) Edge-Start-Position/Peak-Position Detection Process (step SP112):
At step SP112 of
Output signal level shown in part (b) of
In the output signal shown in part (b), the time point at which the signal level is lowered to “−M” is the edge start position, and the time over which the “−M” level continues is equal to a rising time Tt from the edge start position to the peak position. Further, the peak level of the output signal is equal to the peak level in the result of the edge-detecting filter process (part (a) of
(7) Downbeat Extraction Process (step SP114):
At next step SP114 of
Then, in the downbeat extraction process, each edge information having the peak level greater than a predetermined first threshold value Th1 is extracted from among the edge information with the respective peak positions located within the detecting windows. However, where a plurality of pieces of the edge information are present in a same detecting window, only one of the pieces of the edge information which has the greatest peak level is extracted.
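The selection rule described above can be sketched as follows. The representation of the edge information as (peak position, peak level) pairs and of the detecting windows as (start, end) pairs is an assumption made for illustration, as is the function name.

```python
def extract_edges(edges, windows, threshold):
    """Extract at most one piece of edge information per detecting window.

    edges: list of (peak_position, peak_level) pairs
    windows: list of (start, end) pairs of the detecting windows
    threshold: peak level that an edge must exceed to be extracted
    """
    extracted = []
    for start, end in windows:
        candidates = [e for e in edges
                      if start <= e[0] <= end and e[1] > threshold]
        if candidates:
            # Where several edges share a window, keep the greatest peak.
            extracted.append(max(candidates, key=lambda e: e[1]))
    return extracted


edges = [(10, 0.9), (12, 0.5), (50, 0.1), (95, 0.7)]
windows = [(5, 20), (45, 60), (90, 110)]
print(extract_edges(edges, windows, threshold=0.3))
```

In this example the second window yields nothing because its only edge does not exceed the threshold, and the first window yields only the edge with the greater peak level.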
(8) Upbeat Extraction Process (step SP116):
At next step SP116 of
In this upbeat extraction process too, the detecting windows are set in such a manner that ⅓ of the detecting window ({fraction (1/18)} of the width of the corresponding presumed downbeat section) is located before the reference position and the remaining ⅔ ({fraction (2/18)} of the width of the corresponding presumed downbeat section) is located behind the reference position.
Then, in the upbeat extraction process, each edge information having the peak level greater than a predetermined second threshold value Th2 is extracted from among the edge information with the respective peak positions located within the detecting windows. However, where a plurality of pieces of the edge information are present in a same detecting window, only one of the pieces of the edge information which has the greatest peak level is extracted.
Here, the second threshold value Th2 is set to be about “⅕” of the above-mentioned first threshold value Th1, and the above-mentioned threshold value Th is set to be smaller than the threshold value Th2.
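The geometry of the detecting window for the upbeat extraction and the relationship between the two threshold values can be sketched as follows. The function names are hypothetical, and the width of the presumed downbeat section is simply passed in as a parameter.

```python
def upbeat_detecting_window(reference, section_width):
    """Detecting window around a reference position.

    1/18 of the presumed downbeat section width lies before the reference
    position and 2/18 lies behind it, so the window spans 3/18 in total.
    """
    before = section_width / 18.0
    behind = 2.0 * section_width / 18.0
    return (reference - before, reference + behind)


def second_threshold(th1):
    """Second threshold value Th2, set to about 1/5 of Th1."""
    return th1 / 5.0


print(upbeat_detecting_window(100.0, 18.0))
print(second_threshold(1.0))
```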
(9) Compulsory Control-point Setting Process (step SP118):
At step SP118 of
2.2.5. Control-point Editing Process (step SP14):
At step SP14 of
2.2.6. Determination of Smoothed Waveform Data for Inserting Sections 1i-12i:
As stated above, the sections of the waveform data divided at the start and end positions and control points are called here “original sections”. After the control points have been determined as shown in
Referring back to
In the default state, the user selects the waveform data illustrated in
Sustainable-Type Tone
In the case of a sustainable-type tone, a smooth connection from the original section nr to the inserting section (n+1)r is assured because these original section nr and inserting section (n+1)r are originally successive sections. Although another section than the inserting section (n+1)r can be used as an inserting section, a waveform data control point may sometimes be set in a portion of the sustainable tone that has no attack phase (i.e., halfway through the sustainable-type waveform)(see step SP118), and such a situation should also be taken into consideration. Namely, in case the original section nr and inserting section are out of phase, there would be produced noise offensive to the ears. Thus, a need arises for adjusting the phases of the two sections to coincide with each other, which makes the processing more complicated. However, if the section (n+1)r immediately following the original section nr is used as an inserting section as in the above-described example, then waveform data of the inserting section can be created on the basis of more stable waveform data.
Let's now suppose a case where a given waveform data control point of the sustainable-type tone is immediately followed by an attack portion of a next tone. Generally, in such a case, the next tone differs in pitch from the preceding tone. Although a pitch difference between the original section and the corresponding inserting section is essentially undesirable, experimental results have revealed that such a pitch difference does not become so noticeable. This is probably attributable to the fact that the envelope level of the inserting section is controlled to attenuate smoothly from the preceding original section. That is, variations in color (timbre) and pitch at an attack phase of a newly produced tone generally tend to become noticeable, but, if the color or pitch changes halfway through an attenuating waveform, the color or pitch change will not become so noticeable because of a strong impression of the preceding attack portion.
Percussion-Type Tone
In the case of a percussion-type tone, noticeable noise is often not produced in a connection from the original section nr to the corresponding inserting section ni because the percussion-type tone inherently has a lot of noise-like components. However, if the original section nr, next original section (n+1)r or the like is used as it is as the inserting section ni, then attack noise in a leading portion of the waveform may become more or less offensive to the ears. Such an inconvenience can be eliminated by using, as data of the inserting section ni, waveform data obtained by inverting the waveform data of the original section nr on the time axis. The original section nr and the inserting section ni can be interconnected even more smoothly if connecting portions of the two sections are interconnected in a cross-fading fashion. Note that reading out the inverted waveform data up to the end of the data will reproduce attack noise in an end portion of the inverted waveform data, which may become slightly offensive to the ears. In such a case, the waveform data may be read out by turning back (further inverting, on the time axis) the data at a halfway point (e.g., a point corresponding to about a ⅔ length from the beginning) of the inverted waveform data.
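The time-axis inversion with turning back at a halfway point can be sketched as follows. This is an illustrative sketch only: the function name and the integer parameters expressing the turning point are hypothetical, and the cross-fading of the connecting portions is not shown.

```python
def percussion_inserting_data(section, turn_num=2, turn_den=3):
    """Create inserting-section data from an original section.

    The section is inverted on the time axis and read out up to a halfway
    point (about 2/3 of its length by default); reading then turns back
    toward the beginning, so the attack at the end of the inverted data
    is never reproduced.
    """
    inverted = section[::-1]
    turn = (len(inverted) * turn_num) // turn_den
    forward = inverted[:turn]
    backward = forward[-2::-1]  # read back without repeating the turning sample
    return forward + backward


# Samples 1 and 2 stand for the attack portion at the head of the section.
print(percussion_inserting_data([1, 2, 3, 4, 5, 6]))
```

In the example, the attack samples at the head of the original section do not appear in the resulting data.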
The inserting sections are not necessarily limited to the above-described default inserting sections and inserting sections optimal to the human auditory sense may be selected, because the user is allowed to designate a desired mode for creating an inserting section per original section. Further, when waveform data of the inserting sections have been selected, levels in individual portions of the waveform data are divided by envelope levels of the waveform data, at step SP16. In this way, the waveform data of the inserting sections are converted into waveform data of flat envelopes.
2.2.7. Envelope Impartment to Inserting Sections 1i-12i:
At next step SP18 of
dr = (L1/L2)^{1/T}
Then, the initial envelope level value of the inserting section ni corresponding to the original section nr is set to the above-mentioned value “L2”, and an envelope of the inserting section ni is determined such that the attenuation rate dr is maintained. Specifically, if a start time position t of the inserting section ni is set to “0” (t=0), the envelope level of each portion in the inserting section ni can be determined by L2/dr^t. Thus, as shown in
However, in the event that the simple determination mode has been selected, it is conceivable that the envelope level takes a greatest value at the end of the original section nr. In such a case, the envelope level of the inserting section ni is limited to the level at the end of the original section nr. Specifically, where the attenuation rate dr determined by the above mathematical expression is smaller than “1”, the attenuation rate dr to be used for determining envelope levels of the inserting section may be compulsorily set to “1”, or a lower limit value “Dr_min” greater than “1” may be predetermined as regards the attenuation rate dr so as to control the attenuation rate dr to always exceed the lower limit value “Dr_min”.
Once the envelopes of the individual inserting sections have been determined in the above-described manner, the smoothed waveform data of the inserting sections 1i-12i are multiplied by the thus-determined envelopes. As a consequence, the waveform data of the individual inserting sections are caused to assume the respective determined envelopes.
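The envelope determination and impartment described above may be illustrated by the following sketch. It assumes that L1 and L2 are two envelope levels of the original section taken a time T apart, with L2 being the later level at which the inserting section starts, and that t and T are counted in the same unit (e.g., samples). The function names and the dr_min parameter are hypothetical; dr_min stands in for the compulsory value “1” or the lower limit value “Dr_min” mentioned above.

```python
def inserting_envelope(l1, l2, t_span, length, dr_min=1.0):
    """Return envelope levels L2 / dr**t for t = 0 .. length-1.

    dr = (L1 / L2) ** (1 / T); dr is kept at or above dr_min so that the
    inserted envelope never rises above the level at the section boundary.
    """
    dr = (l1 / l2) ** (1.0 / t_span)
    dr = max(dr, dr_min)
    return [l2 / dr ** t for t in range(length)]


def impart_envelope(flat_samples, envelope):
    """Multiply flattened (constant-envelope) samples by the envelope."""
    return [s * e for s, e in zip(flat_samples, envelope)]
```

For example, with L1 = 1.0, L2 = 0.25 and T = 2, the attenuation rate dr becomes 2, so the envelope of the inserting section halves at every time step starting from 0.25.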
At next step SP20 of
2.3. Reproducing Tempo Setting/Variation Process:
In the above-described processing, the desired musical time and number of measures have been set for the recorded waveform data, and the maximum clock count maxcount has been set in accordance with the thus-set musical time and number of measures. Because the absolute time length of the waveform data is already known, the tempo with which the waveform data were originally recorded (i.e., original recording tempo of the waveform data) can be calculated by reverse arithmetic operations.
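The reverse arithmetic operation mentioned above may be sketched as follows, on the assumption that the number of beats per measure follows from the set musical time (e.g., four beats for 4/4 time) and that the absolute time length is obtained from a sample count and a sampling rate. All names in the sketch are hypothetical.

```python
def original_tempo_bpm(num_samples, sample_rate_hz, beats_per_measure, num_measures):
    """Estimate the original recording tempo in beats per minute."""
    duration_s = num_samples / sample_rate_hz
    total_beats = beats_per_measure * num_measures
    return 60.0 * total_beats / duration_s
```

For instance, two measures of 4/4 time lasting four seconds (176,400 samples at 44.1 kHz) correspond to eight beats in four seconds, i.e. a tempo of 120.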
Regardless of the original recording tempo, the user is allowed to set or vary a reproducing tempo as desired prior to or in the course of the performance processing. If the thus-set reproducing tempo is equal to the original recording tempo, the clock counts determined earlier (see
Namely, in the case where the original recording tempo and reproducing tempo are different from each other, controlling the reproduction start timing of the individual sections merely in accordance with a ratio between the two tempos will result in a sense of “tardiness” or “heaviness” particularly in a waveform having a slow rise, as seen from
So, when a reproducing tempo has been set or varied in the instant embodiment, each of the predetermined counts of clocks representing the timing for reproducing the waveform data (i.e., waveform-data-reproduction-start triggering clock counts) shown in
2.4. Performance Process:
The following paragraphs describe a process for carrying out an automatic performance using the waveform data of the coupled sections 1t-12t, with reference to
At next step SP38 of
Thus, even when some other waveform data were being read out till just before the current time point, readout of new waveform data can be initiated at step SP40 in such a manner as to replace the other waveform data. If answered in the negative at step SP38, the operation of step SP40 is omitted, so that the routine goes to step SP42 without the currently read-out waveform data being replaced. At step SP42, the variable tcount is incremented by one, after which the instant routine is brought to an end.
In the instant performance process, the waveform data readout of the coupled section 1t is initiated as soon as the routine goes to step SP38 for the first time after initialization, to the “0” value, of the variable tcount. Thereafter, when the process goes to step SP38 after the variable tcount has reached a value “28”, the waveform data readout of the coupled section 1t is terminated, so that the waveform data readout of the next coupled section 2t is initiated. To provide a smooth connection between the successive coupled sections 1t and 2t, the waveform data of a portion of the coupled section 1t where the data readout is to be terminated and the waveform data of a portion of the next coupled section 2t where the data readout is to be initiated may be interconnected in a cross-fading fashion.
In the same manner, waveform data readout of the following coupled sections 3t-12t is initiated sequentially in accordance with increment in the value of the variable tcount. Then, once the variable tcount reaches a value “maxcount+1” as determined at step SP34, it is reset to “0” at step SP36. After that, operations similar to the above-described are repeated. The thus sequentially-readout waveform data are sequentially audibly reproduced via the waveform output interface 26 and sound system 28.
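The counting and triggering behavior of the performance process described above may be sketched as follows. The sketch assumes that the routine is invoked once per clock, that the trigger clock counts are held in a dictionary mapping a clock count to a coupled-section number, and that starting a readout simply replaces the section currently being read out. The function name and data layout are hypothetical, and the step labels in the comments reflect only one reading of the description.

```python
def performance_tick(state, triggers, maxcount):
    """Run one pass of the performance routine and return the active section.

    state    -- dict with 'tcount' (clock counter) and 'current' (section
                number being read out, or None)
    triggers -- dict mapping a trigger clock count to a coupled-section number
    """
    if state["tcount"] > maxcount:            # SP34: tcount reached maxcount+1
        state["tcount"] = 0                   # SP36: wrap around for repetition
    section = triggers.get(state["tcount"])   # SP38: is a readout due now?
    if section is not None:
        state["current"] = section            # SP40: replace current readout
    state["tcount"] += 1                      # SP42: advance the clock count
    return state["current"]


state = {"tcount": 0, "current": None}
triggers = {0: 1, 28: 2}   # section 1t at count 0, section 2t at count 28
history = [performance_tick(state, triggers, maxcount=31) for _ in range(33)]
```

With these illustrative values, the section 1t is read out for counts 0 to 27, the section 2t takes over at count 28, and the section 1t starts again once the counter wraps around.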
The following paragraphs describe a tone waveform actually generated through such operations, with reference to
Part (a) of
Further, part (c) of
The reproducing tempo and the waveform data to be reproduced can be varied in real time by the user via the input device 4. Similarly, the rate at which to read out the individual sections, i.e. the pitch shift amount, can be varied in real time by the user via the input device 4. Alternatively, the waveform data to be reproduced, reproducing tempo or pitch shift amount may be varied automatically. In this way, the waveform data can be reproductively sounded in any one of a variety of manners on the basis of user operation or a predetermined sequence.
3. Advantageous Results of First Embodiment:
Because the instant embodiment is arranged to detect waveform data control points of waveform data after having performed the filter process on the waveform data, it can extract optimal waveform data control points corresponding to characteristics of the individual waveform data. Further, in the instant embodiment, reproduction of each of the original sections can be initiated at the timing “n(Ts+Tt)−Tt” by modifying the waveform-data-generation-start triggering clock count in accordance with relationship between the original edge start time Ts and rising time Tt and the reproducing tempo (tempo expansion/compression ratio n), and thus it is possible to secure appropriate consistency between the tempo expansion/compression ratio n and beat timing actually felt by the human auditory sense.
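The effect of the timing “n(Ts+Tt)−Tt” may be checked with the following short sketch. It assumes that n scales time positions, that Ts and Tt are expressed in the same time unit, and that the rising time Tt of the reproduced section itself is unchanged by the tempo change; the function name is hypothetical.

```python
def modified_start_time(ts, tt, n):
    """Reproduction start time n*(Ts+Tt) - Tt for one original section."""
    return n * (ts + tt) - tt


ts, tt, n = 1.0, 0.25, 1.5
start = modified_start_time(ts, tt, n)   # 1.625
peak = start + tt                        # 1.875, equal to n * (ts + tt)
naive_peak = n * ts + tt                 # 1.75 when only the start is scaled
```

Under these assumptions, the peak of the section lands exactly at n times its original position Ts+Tt, whereas scaling only the start time by n places the peak elsewhere.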
4. Modifications of First Embodiment:
It should be appreciated that the present invention is not limited to the scope of the above-described first embodiment and various modifications of the first embodiment are also possible as set forth below.
(1) Whereas the embodiment has been described as implementing an inventive waveform editing system by an application program run on a personal computer, similar functions may be applied to various electronic musical instruments, mobile cellular telephones, amusement equipment and other tone generating apparatus. Further, the software used in the above-described embodiment may be distributed in storage media such as CD-ROMs and floppy disks, or via data communication paths.
(2) Whereas the embodiment has been described as previously generating waveform data of the inserting sections in corresponding relation to the original sections, the waveform data of the individual inserting sections may be generated during reproduction of the waveform data of the corresponding original sections, or immediately before initiation of the reproduction of the waveform data of the corresponding original sections. This modification can reduce the necessary storage capacity for storing the waveform data.
(3) The “number-of-measure” parameter in the embodiment has been described as designated using a natural number. This is because the length of the waveform data can hardly be other than natural number multiples of a measure in applications where the waveform data are reproduced repetitively. In other possible applications where the waveform data are reproduced in a so-called “one shot” fashion, however, it is not always necessary to perform the trimming operations such that the number of measures becomes a natural number multiple. In such a case, it is likely that the number of measures is not a natural number, and, therefore, arrangements may be made such that the number-of-measure parameter can be designated in a decimal. Alternatively, the number of beats in the whole of the trimmed waveform data may be designated in place of the number of measures.
Further, the trimming operations need not necessarily be performed on the beat-by-beat basis. To practice the present invention, it is only necessary that positions, on the original waveform data, of the beat detecting windows be specified (see FIGS. 10C and 12A); thus, the trimming operations may be performed in any desired manner, and positions of the beat detecting windows may be specified in any desired manner. For example, desired positions, on the original waveform data, of the beat detecting windows may be designated directly by the user, or automatically on the basis of timing indicated by metronome data; in the latter case, a metronome may be used during recording of the original waveform data.
(4) Further, the unnecessary-band removing process of step SP8 in the embodiment has been described as performing the high-pass and band cut filter processes. However, in the unnecessary-band removing process, any other process, such as an operation for attenuating low-frequency components or boosting high-frequency components, may be performed in place of or in addition to the high-pass or band cut filter process.
(5) Further, whereas the embodiment has been described as performing, at step SP110, the comb-filter-based filter process for detecting edge portions, the edge portions may be detected by any other suitable filter process that is arranged to generate values corresponding to envelope inclinations. For example, there may be performed a filter process for simply differentiating the envelope levels, and a low-pass filter process for processing the differentiated results.
(6) At step SP116 in the above-described embodiment, the reference positions of the upbeat detecting windows are set at a ½ position of each presumed downbeat section (namely, position dividing the presumed downbeat section into two equal portions), i.e. at a halfway position in between the reference positions for detecting downbeats. However, the upbeat detecting windows are not limited to such positions. Namely, a point halving an interval between successive downbeat edge start positions or peak positions actually extracted at step SP114 may be obtained so that the thus-determined position is set as the reference position of the upbeat detecting window.
(7) Step SP16 in the instant embodiment has been described as using, as the waveform data of the inserting section prior to envelope adjustment, the waveform data of the corresponding section as they are or after inversion. However, in the case of waveform data of a melody part or the like having particularly stable pitch components, an inserting section may be created by detecting a pitch of a latter portion of the original section and repeating a portion (partial waveform) of the original section on the basis of the detected pitch. In this way, it is possible to prevent instability specific to an attack portion from appearing in the inserting section.
Note that the partial waveform may be of either a fixed length (loop waveform) or a randomly variable length. In case no stable pitch has been detected in the latter portion of the original section, a proportion of the waveform data of the original section may be repeated. Further, in case no pitch has been detected at all, the whole of the waveform data (or inverted version of the waveform data) of the original section may be copied and set as the waveform data of the inserting section prior to envelope adjustment.
(8) Further, in the above-described embodiment, each of the inserting sections is created on the basis of the waveform data (or inverted version thereof) of the immediately preceding original section. In an alternative, each of the inserting sections may be created on the basis of the waveform data of the original section immediately following the inserting section; for example, the inserting section 1i may be created on the basis of the waveform data of the next original section 2r.
(9) Furthermore, whereas the embodiment has been described as classifying the waveform data into two major types, “percussion type” and “sustainable type”, the waveform data may be classified into three or more types.
(10) Furthermore, whereas the embodiment has been described as setting the downbeat detecting windows in accordance with the musical time setting and setting the upbeat detecting windows in accordance with the resolution setting, these detecting windows need not necessarily be set in accordance with the musical time or resolution. For example, the downbeat detecting windows and upbeat detecting windows may be designated independently of each other, or the downbeat detecting windows and upbeat detecting windows may be designated in accordance with the set musical time. In another alternative, the downbeat detecting windows and upbeat detecting windows may be designated on the basis of timing indicated by metronome data recorded during recording of the original waveform data as noted above.
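The alternative edge detection of modification (5) above, i.e. differentiating the envelope levels and then low-pass filtering the differentiated results, may be sketched as follows. The sketch assumes that the envelope is a list of level values and uses a simple causal moving average as the low-pass filter; the function name and the smooth_len parameter are hypothetical, and any other low-pass filter could be substituted.

```python
def envelope_slope(envelope, smooth_len=4):
    """Return smoothed first differences of an envelope.

    Large positive output values correspond to steep rises of the envelope,
    i.e. to candidate edge portions.
    """
    # First difference; the first sample has no predecessor, so use 0.
    diff = [0.0] + [b - a for a, b in zip(envelope, envelope[1:])]
    smoothed = []
    for i in range(len(diff)):
        window = diff[max(0, i - smooth_len + 1):i + 1]
        smoothed.append(sum(window) / smooth_len)
    return smoothed
```

The position of the greatest value in the returned list then indicates where the envelope inclination is steepest after smoothing.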
With the above-described arrangement that waveform data dividing positions of the waveform data are determined on the basis of the envelope levels of the waveform data having been subjected to the filter process, the present invention can efficiently extract effective rise positions as the waveform data dividing positions, using filter characteristics corresponding to a type of a music piece in question.
Further, by differentiating the envelope shape of the waveform data having been subjected to the filter process, the present invention can extract the effective rise positions with further increased efficiency. Furthermore, by further including the amplitude conversion process for reducing amplitude differences in the envelope waveforms, the present invention can accurately determine the waveform data dividing positions on the basis of the rising rates while reducing differences in the rising rates.
Furthermore, by the arrangement of determining the waveform data dividing positions and corresponding peak levels on the basis of results of the envelope shape differentiation, the present invention can determine the waveform data dividing positions by selecting rise positions on the basis of the peak levels. Moreover, with the arrangement that reproduction of the waveform data following the waveform data dividing position is initiated upon lapse of the time “n(Ts+Tt)−Tt”, the present invention can set a beat position or peak position, to be felt by the human auditory sense, at an optimal position in accordance with the expansion ratio n.
Furthermore, with the arrangement that a predetermined one of the rise positions of the original waveform data, belonging to a predetermined range, is extracted as a dividing position, the present invention can efficiently extract effective rise positions as the waveform data dividing positions. Also, by extracting the rise positions in correspondence with the presumed beat positions, the present invention can detect the rise positions in a stable manner while effectively preventing erroneous detection. Stated differently, positions that are located near originally expected dividing positions and also considered to be appropriate from a musical point of view can be set as the waveform data dividing positions.
Furthermore, with the arrangement that one of a plurality of rise positions belonging to a predetermined range is selected and extracted as a waveform data dividing position of the original waveform data, the present invention can efficiently eliminate any unnecessary rise positions. Moreover, with the arrangement that a rise position belonging to a predetermined range is extracted as a waveform data dividing position on condition that level values corresponding to the rise position exceed a predetermined first threshold value, the present invention can extract a rise position of a greater level value with higher priority. Furthermore, by re-extracting a waveform data dividing position on the basis of a second threshold value after the extraction based on the first threshold value, the present invention can extract a finer waveform data dividing position. As compared to a case where the lower threshold value is used from the beginning, the present invention can reduce erroneous detection more reliably. Furthermore, by the arrangement of extracting a waveform data dividing position by applying the first and second threshold values to first and second predetermined ranges, respectively, the present invention can extract a waveform data dividing position by use of optimal threshold values corresponding to points where variation in beat intensity is likely to occur.
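The two-stage extraction summarized above may be sketched as follows. The sketch assumes that candidate rise positions are given as (position, level) pairs, that each predetermined range is a window of fixed half width around a reference position, and that the candidate with the greatest level is the one selected within a window; these representations and all names are hypothetical.

```python
def pick_in_window(rises, center, half_width, threshold):
    """Return the strongest (position, level) rise in the window, or None."""
    candidates = [r for r in rises
                  if abs(r[0] - center) <= half_width and r[1] > threshold]
    return max(candidates, key=lambda r: r[1]) if candidates else None


def extract_dividing_positions(rises, first_windows, second_windows,
                               half_width, first_threshold, second_threshold):
    """Extract dividing positions with a first, then a second threshold."""
    chosen = []
    for center in first_windows:      # e.g. downbeat windows, higher threshold
        hit = pick_in_window(rises, center, half_width, first_threshold)
        if hit is not None:
            chosen.append(hit[0])
    for center in second_windows:     # e.g. upbeat windows, lower threshold
        hit = pick_in_window(rises, center, half_width, second_threshold)
        if hit is not None and hit[0] not in chosen:
            chosen.append(hit[0])
    return sorted(chosen)
```

Because the lower second threshold is applied only inside the second windows, weak rises lying outside those windows are never picked up, which is one way of reading the reduction in erroneous detection noted above.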
5. Second Embodiment:
D/A converter (DAC) 118 converts the mixed result of the mixer 116 into an analog signal that is then sounded via a sound system 28. Reference numeral 124 represents a MIDI interface that communicates MIDI signals with external MIDI equipment, and 126 represents another interface that communicates waveform data with external equipment. Timer 128 generates a timer interrupt signal every predetermined timing.
6. Behavior of Second Embodiment:
The following paragraphs describe behavior of the second embodiment. Upon powering-on of a personal computer constituting the waveform editing system, an initial loader program stored in a ROM 12 is executed so that an operating system is started up. Once predetermined operation is performed by the user while the operating system is ON, a waveform-editing application program is triggered. Then, once the user performs predetermined operation while the waveform-editing application program is ON, an automatic performance/waveform recording processing routine shown in
At step SP202 of
At next step SP204 of
At next step SP210, detection is made of operational events of the window elements currently set in the active state on the waveform recording control window 150. Then, at step SP212, the routine branches in accordance with a result of the operational event detection. If no operational event has been detected, the routine reverts to step SP210 to continue the operational event detection operation. If the automatic performance stop button 154 has been mouse-clicked on the control window 150, the reproduction of the automatic performance file is terminated, and the automatic performance/waveform recording processing routine is immediately brought to an end. If the waveform recording start button 156 has been mouse-clicked on the control window 150 as determined at step SP212, the routine moves on to step SP214, where recording of waveform data obtained sequentially via the microphone 6, A/D converter 108 and recording circuit 110 is initiated; namely, the waveform data are recorded sequentially onto the hard disk 112 via the A/D converter 108 and recording circuit 110. Also, at step SP214, the automatic performance stop button 154 and waveform recording stop button 158 are set in the active state while the other window elements are set in the inactive state.
Here, details of the waveform recording process are explained with reference to
In
As clear from the above-described contents of the operations up to step SP214, the automatic performance has already been initiated when the waveform recording is initiated. Thus, as shown in
After that, the waveform recording process and automatic performance process are continued synchronously with each other using any one of the following synchronization schemes.
(1) The sampling cycles of the waveform data are synchronized with the tempo clocks of the automatic performance. In this case, there is no need to store the later-described synchronization data.
(2) The synchronization data are recorded onto a predetermined track of the automatic performance data (waveform timing track 222 in the illustrated example of
The (2) scheme may have the following variations depending on the nature of the synchronization data to be recorded.
(2-1) Every predetermined number of tempo clocks, a unique sample number of the waveform data at the current time point is recorded.
(2-2) Unique sample number of the waveform data at each beat timing (e.g., at each timing indicated by the mark “↑” in
(2-3) Sequence position of every predetermined number (e.g., every 1,000 samples) of the waveform data is recorded.
It should be appreciated that the synchronization between the waveform recording process and the automatic performance process may be secured using any other suitable schemes than the above-mentioned.
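Variation (2-1) above, in which a sample number is recorded every predetermined number of tempo clocks, may be sketched as follows. In an actual recording the current sample number would be read from the recording circuit at each such clock; the sketch instead derives it from a constant tempo so that it can be run on its own. The function names, the clocks_per_beat parameter and the linear interpolation used for lookup are hypothetical.

```python
def sync_points(tempo_bpm, clocks_per_beat, sample_rate_hz,
                total_clocks, every_n_clocks):
    """Return (tempo clock, sample number) pairs as synchronization data."""
    samples_per_clock = sample_rate_hz * 60.0 / (tempo_bpm * clocks_per_beat)
    return [(clock, round(clock * samples_per_clock))
            for clock in range(0, total_clocks + 1, every_n_clocks)]


def sample_at_clock(points, clock):
    """Look up the sample position for a tempo clock by linear interpolation."""
    for (c0, s0), (c1, s1) in zip(points, points[1:]):
        if c0 <= clock <= c1:
            return s0 + (s1 - s0) * (clock - c0) / (c1 - c0)
    raise ValueError("clock outside the recorded synchronization data")


points = sync_points(tempo_bpm=120, clocks_per_beat=24, sample_rate_hz=48000,
                     total_clocks=48, every_n_clocks=24)
```

With the illustrative values shown, one tempo clock corresponds to 1,000 samples, so a synchronization entry is recorded every 24,000 samples.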
Referring back to
Irrespective of which one of steps SP220 and SP222 has been taken, the routine goes next to step SP224, where a determination is made whether both of the automatic performance and waveform recording processes have now been terminated. If at least one of the automatic performance and waveform recording processes is in progress, a negative (NO) determination is made, so that the routine reverts to step SP216 to continue the operational event detection. Then, if both of the automatic performance and waveform recording processes have been terminated, an affirmative (YES) determination is made once the routine moves to step SP224, so that the routine is brought to an end.
From the above-described contents of the operations, it can be seen that the original recorded waveform data (original waveform data) are recorded here, in association with the timing data of the automatic performance information, in accordance with the synchronization control data. However, because a series of the original waveform data are recorded as they are, the above-described operations cannot appropriately deal with increase or decrease in the automatic performance tempo. For appropriately dealing with increase or decrease in the automatic performance tempo, there arises a need for a process to divide the original waveform data into a plurality of original sections, as will be set forth below.
6.2. To-be-Reproduced-Waveform-Data Generation Processing:
Once predetermined operation is performed by the user after the acquisition of the original waveform data through the above-described waveform recording process, a to-be-reproduced-waveform-data generation processing routine, similar to that described above in relation to
Description already made about the behavior of the first embodiment with reference to
The second embodiment differs from the first embodiment in the manner of storing last data in the process for imparting envelopes to inserting sections 1i-12i (step SP20 of
“performance process” in the second embodiment slightly differs from the performance process in the first embodiment and thus will be described below with reference to
At step SP132 of
If answered in the affirmative at step SP134, the routine goes to step SP136 in order to perform an event operation corresponding to the event data. If the event data represents a note-on event, a new tone generating channel is assigned in the tone generator 122 in response to an instruction by the CPU 10, so that a tone signal is synthesized by the assigned tone generating channel. The thus-synthesized tone signal is sounded via the mixer 116, D/A converter 118 and sound system 28. If the event data represents a note-off event, a tone deadening (silencing) operation is performed by a designated tone generating channel.
In case the event timing of next event data has not yet been reached, a negative determination is made at step SP134, so that the routine goes to step SP138. At step SP138, it is determined whether predetermined timing has arrived for starting to read out the waveform data of any one of the coupled sections. As previously noted, the default readout start timing for the waveform data of the coupled sections is previously recorded in the waveform timing track 222. However, the readout start timing referred to at step SP138 is timing modified on the basis of a rising time Tt, i.e. timing corresponding to “n(Ts+Tt)−Tt” discussed above.
With an affirmative answer at step SP138, the routine goes to step SP140, where readout of the corresponding waveform data is initiated. Here, the waveform data readout rate is controlled in accordance with a value of a pitch shift amount as will be described later. If the pitch shift amount is “0”, the waveform data readout rate is set to the same rate as the data writing rate at which the waveform data were originally recorded (original data write rate). If the pitch shift amount is of a positive value, the waveform data readout rate is set to be higher than the original data write rate, while if the pitch shift amount is of a negative value, the waveform data readout rate is set to be lower than the original data write rate. As well known in the art, the pitch of the read-out waveform data becomes higher as the waveform data readout rate gets higher, but becomes lower as the waveform data readout rate gets lower.
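The relationship between the pitch shift amount and the waveform data readout rate may be sketched as follows. The description above does not state the unit of the pitch shift amount; the sketch assumes, purely for illustration, that it is given in equal-tempered semitones, so that a shift of 12 doubles the readout rate. The function name is hypothetical.

```python
def readout_rate(pitch_shift_semitones, original_rate_hz):
    """Return the readout rate for a pitch shift given in semitones.

    A shift of 0 keeps the original data write rate, positive values raise
    the rate (and hence the pitch), and negative values lower it.
    """
    return original_rate_hz * 2.0 ** (pitch_shift_semitones / 12.0)
```

Under this assumption, readout_rate(12, 44100) yields 88,200 samples per second (one octave up), while readout_rate(-12, 44100) yields 22,050 (one octave down).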
Thus, even when some other waveform data were being read out till just before the current time point, readout of new waveform data can be initiated at step SP140 in such a manner as to replace the other waveform data. If answered in the negative at step SP138, the operation of step SP140 is omitted, so that the instant routine is brought to an end without replacing the currently read-out waveform data.
According to the process arranged in the above-described manner, waveform data readout of the leading or first coupled section 1t is initiated immediately when the value of the variable tcount arrives at waveform data readout timing of the coupled section 1t. Then, once the value of the variable tcount arrives at waveform data readout timing of the second coupled section 2t, the waveform data readout of the first coupled section 1t is terminated, and waveform data readout of the second coupled section 2t is initiated. To provide a smooth connection between the successive coupled sections 1t and 2t, the waveform data of a portion of the first coupled section 1t where the waveform data readout is to be terminated and the waveform data of a portion of the second coupled section 2t where the waveform data readout is to be initiated may be interconnected in a cross-fading fashion. The waveform data sequentially read out with the foregoing operation are sounded sequentially via the reproduction circuit 114, mixer 116, D/A converter 118 and sound system 28. In the same manner, waveform data readout of the following coupled sections 3t-12t is initiated sequentially in accordance with increment in the value of the variable tcount. Tone waveforms actually generated through the foregoing operations are similar to those described earlier in relation to
7. Advantageous Results of Second Embodiment:
As described above, the second embodiment is arranged to record the synchronization data for the automatic performance information into the waveform timing track 222 as the original waveform data are recorded into the waveform data track 224. Thus, the second embodiment can secure synchronism between the original waveform data and to-be-reproduced waveform data generated on the basis of the original waveform data and the original automatic performance information. Further, in the instant embodiment, reproduction of each of the original sections can be initiated at the timing “n(Ts+Tt)−Tt” by modifying the waveform-data-generation-start triggering clock count in accordance with relationship between the original edge start time position Ts and rising time Tt and the reproducing tempo (tempo expansion/compression ratio n), and thus it is possible to secure appropriate consistency between the tempo expansion/compression ratio n and beat timing felt by the human auditory sense.
8. Modifications of Second Embodiment:
It should be appreciated that the present invention is not limited to the scope of the above-described embodiments and various modifications of the second embodiment are also possible as set forth below; note that the second embodiment may be modified in the same manner as described earlier in items (1)-(10) in relation to the first embodiment.
(11) Whereas the second embodiment has been described above as analyzing the envelopes of the entire waveform data, it may analyze portions of the waveform data near the reference positions, e.g., portions corresponding to the beat detecting windows.
(12) It should be understood that the process for determining dividing positions (waveform data control points) of the original waveform data is not limited to the one described above in relation to the second embodiment. For example, although the second embodiment has been described as determining the waveform data control points using the beat timing of the automatic performance information as the reference positions of the detecting windows, individual note-on timing (timing of notes in
(13) Furthermore, whereas the second embodiment has been described above as first recording the original waveform data in the waveform data track 224 and then determining the waveform data control points, the waveform data control points may be determined during recording of the original waveform data, provided that the CPU 10 has sufficiently high processing capability.
(14) Moreover, in the described second embodiment, the resolution with which to determine the waveform data control points is designated after the recording of the original waveform data. Alternatively, the resolution may be designated prior to the recording of the original waveform data.
The present invention arranged in the above-described manner can readily synchronize the waveform data with the automatic performance information because it records the waveform data while recording the synchronization data indicating relationship between the automatic performance information and the waveform data.
Further, by the arrangement of determining envelope levels of the waveform data and then determining waveform data dividing positions on the basis of the synchronization data and envelope levels, the present invention can efficiently extract effective rise positions as the waveform data dividing positions. Also, by extracting the rise positions in correspondence with the presumed beat positions, the present invention can detect the rise positions in a stable manner while effectively preventing erroneous detection. Stated differently, points that are located near originally-expected dividing positions and also considered to be appropriate from a musical point of view can be set as the waveform data dividing positions. Thus, the present invention can reliably prevent erroneous detection of the waveform data dividing positions and waveform data division at musically inappropriate positions.
Furthermore, with the arrangement that presumed beat positions are determined on the basis of beat timing or note-on or note-off timing of the automatic performance information, the present invention can detect the waveform data dividing points efficiently using information originally included in the automatic performance information.
Furthermore, with the arrangement that one of a plurality of rise positions belonging to a predetermined range is selected and extracted as a waveform data dividing position of the original waveform data, the present invention can efficiently eliminate any unnecessary rise positions.
The present invention relates to the subject matter of Japanese Patent Application Nos. 2001-008813, 2001-008814 and 2001-008815 filed Jan. 17, 2001, the disclosure of which is expressly incorporated herein by reference in its entirety.
Number | Date | Country | Kind |
---|---|---|---|
2001-008813 | Jan 2001 | JP | national |
2001-008814 | Jan 2001 | JP | national |
2001-008815 | Jan 2001 | JP | national |
Relation | Number | Date | Country |
---|---|---|---|
Parent | 10051973 | Jan 2002 | US |
Child | 11016294 | Dec 2004 | US |