The present invention relates to tone synthesis apparatus, methods and programs for generating waveforms of tones, voices or other desired sounds, for example, on the basis of readout of waveform data from a memory or the like while varying a timbre and rendition style (or articulation) of the tones, voices or other sounds. More particularly, the present invention relates to an improved tone synthesis apparatus, method and program which perform control to reduce a delay in tone generation (i.e., tone generation delay) etc. that may occur during, for example, a real-time performance.
In recent years, there has been known a tone waveform control technique called “SAEM” (Sound Articulation Element Modeling), which is intended for realistic reproduction and control of various rendition styles (various types of articulation) peculiar to natural musical instruments. Among examples of equipment using the SAEM technique is an apparatus disclosed in Japanese Patent Application Laid-open Publication No. HEI-11-167382 (hereinafter referred to as “patent literature 1”). The conventionally-known apparatus equipped with a tone generator using the SAEM technique, such as the one disclosed in patent literature 1, are arranged to generate a continuous tone waveform by time-serially combining a plurality of ones of rendition style modules prepared in advance for individual portions of tones, such as an attack-related rendition style module defining an attack waveform, release-related rendition style module defining a release waveform, body-related rendition style module defining a body waveform (intermediate waveform) constituting a steady portion of a tone and a joint waveform interconnecting tones. For example, the apparatus can generate a waveform of an entire tone by crossfade-synthesizing waveforms of individual portions of the tone using an attack-related rendition module for an attack portion, i.e. a rise portion, of the tone, one or more body-related rendition modules for a body portion, i.e. a steady portion, of the tone and a release-related rendition style module for a release portion, i.e. a fall portion, of the tone. Also, by using a joint-related rendition style module in place of the release-related rendition style module, the apparatus can also generate a series of waveforms of a plurality of successive tones (or tone portions) connected together by a desired rendition style. Note that, in this specification, the terms “tone waveform” are used to mean a waveform of a voice or any desired sound rather than being limited only to a waveform of a musical tone.
Further, there have been known apparatus which allow a human player to selectively designate in real time rendition styles to be used, among which is the one disclosed in Japanese Patent Application Laid-open Publication No. 2004-78095 (hereinafter referred to as “patent literature 2”).
In apparatus equipped with a tone generator capable of sequentially varying the tone color and rendition style (or articulation) while sequentially crossfade-synthesizing a plurality of waveforms on the basis of a tone synthesis technique as represented by the SAEM synthesis technique, such as those disclosed in patent literature 1 and patent literature 2 mentioned above, at least two tone generating channels are used for synthesis of a tone to additively synthesize waveforms allocated to the tone generating channels while frequently fading out and fading in output tone volumes of the individual tone generating channels, to thereby output a waveform of the entire tone. Example of such tone synthesis is outlined in
As seen in
Once the output volume of the first tone generating channel reaches 0% and the output volume of the second tone generating channel 100% (time point t2), synthesis of another tone waveform C (loop waveform) constituting the body portion is started in a fading-in manner, and simultaneously fade-out of the tone waveform B in the second tone generating channel is started. Then, once the output volume of the first tone generating channel reaches 100% and the output volume of the second tone generating channel 0% (time point t3), synthesis of still another tone waveform D (loop waveform) constituting the body portion is started in a fading-in manner, and simultaneously fade-out of the tone waveform C in the first tone generating channel is started. In this way, as long as the body portion lasts, the tone is synthesized while fade-in/fade-out is alternately repeated in the first and second tone generating channels with the tone waveform to be used sequentially switched from one to another. Once a note-off event is instructed (more specifically, once note-off even data is received) at time point t4 in response to performance operation by the human player, transition or shift to a non-loop release waveform by way of a steady tone waveform E (loop waveform) constituting part of the release waveform is started after completion of crossfade between the tone waveform C of the first tone generating channel and the tone waveform D of the second tone generating channel (i.e., at time point t5 later by Δt than time point t4 when the note-off instruction was given). In this way, the individual waveforms defined by the above-mentioned rendition style modules connected together can be smoothly connected together by crossfade synthesis between the loop waveforms, so that a continuous tone waveform can be formed as a whole.
In the conventionally-known apparatus equipped with a tone generator using the SAEM technique, as noted above, rendition style modules are allotted in advance to the time axis in response to real-time performance operation, selection instruction operation, etc. by the human player and in accordance with the respective start times of the rendition style modules, and cross-face waveform synthesis is performed between the thus-allotted rendition style modules to thereby generate a continuous tone waveform. Stated differently, the tone synthesis is carried out in accordance with previously-determined crossfade time lengths. However, if the crossfade time lengths are determined in advance, it is not possible to appropriately respond to, or deal with, sudden performance instructions, such as note-off operation during a real-time performance or note-on operation of a tone during generation of another tone. Namely, when a sudden performance instruction has been given, the conventionally-known apparatus shift to a release waveform (or joint waveform) only after crossfade synthesis having already been started at the time point when the performance instruction was given is completed, so that complete deadening of the previous tone would be delayed by an amount corresponding to the waiting time till the completion of the crossfade synthesis and thus start of generation of the next tone would be delayed by that amount.
In view of the foregoing, it is an object of the present invention to provide a tone synthesis apparatus, method and program which, in generating a continuous tone waveform by crossfade-synthesizing waveforms of various portions of one or more tones, such as attack, body and release or joint portions, can effectively reduce a tone generation delay that may occur when a sudden performance instruction is given.
In order to accomplish the above-mentioned object, the present invention provides an improved tone synthesis apparatus for outputting a continuous tone waveform by time-serially combining rendition style modules, defining rendition-style-related waveform characteristics for individual tone portions, and sequentially crossfade-synthesizing a plurality of waveforms in accordance with the combination of the rendition style modules by use of at least two channels, which comprises: an acquisition section that acquires performance information; a determination section that makes a determination, in accordance with the performance information acquired by the acquisition section, as to whether a crossfade characteristic should be changed or not; and a change section that, in accordance with a result of the determination by the determination section, automatically changes a crossfade characteristic of crossfade synthesis having already been started at a time point when the performance information was acquired by the acquisition section. In the present invention, n a time position of a succeeding one of rendition style modules to be time-serially combined in accordance with the acquired performance information is controlled by the change section automatically changing the crossfade characteristic of the crossfade synthesis having already been started at the time point when the performance information was acquired by the acquisition section.
In outputting a continuous tone waveform by time-serially combining rendition style modules, defining rendition-style-related waveform characteristics for individual tone portions, and sequentially crossfade-synthesizing a plurality of waveforms in accordance with the combination of the rendition style modules by use of at least two channels, the tone synthesis apparatus of the present invention determines, in accordance with performance information acquired by the acquisition section, as to whether a crossfade characteristic should be changed or not. Then, in accordance with the result of the determination, the crossfade characteristic of crossfade synthesis having already been started when the performance information was acquired is automatically changed. Because the crossfade characteristic is automatically changed during the course of the crossfade synthesis, the time length of the crossfade synthesis can be expanded or contracted as compared to the time length that had been previously set at the beginning of the crossfade synthesis, and thus, the time position of the succeeding one of the rendition style modules to be time-serially combined in accordance with the acquired performance information can be allotted to a time position displaced by an amount corresponding to the expanded or contracted time. In this way, control can be performed automatically, even during the course of the crossfade synthesis, to allow the crossfade synthesis to be completed earlier (or later), so that a waveform shift can be made over to the succeeding rendition style module earlier (or later), without a human player being conscious of the waveform shift.
Namely, the present invention is characterized in that, during the course of crossfade synthesis having already been started when a performance instruction was given, the crossfade characteristic of the crossfade synthesis is automatically changed. With such an arrangement, the time length of the crossfade synthesis can be expanded or contracted as compared to the time length that had been previously set at the beginning of the crossfade synthesis, so that a waveform shift can be effected earlier (or later), without a human player being conscious of the waveform shift.
The present invention may be constructed and implemented not only as the apparatus invention as discussed above but also as a method invention. Also, the present invention may be arranged and implemented as a software program for execution by a processor such as a computer or DSP, as well as a storage medium storing such a software program. Further, the processor used in the present invention may comprise a dedicated processor with dedicated logic built in hardware, not to mention a computer or other general-purpose type processor capable of running a desired software program.
The following will describe embodiments of the present invention, but it should be appreciated that the present invention is not limited to the described embodiments and various modifications of the invention are possible without departing from the basic principles. The scope of the present invention is therefore to be determined solely by the appended claims.
For better understanding of the objects and other features of the present invention, its preferred embodiments will be described hereinbelow in greater detail with reference to the accompanying drawings, in which:
In the electronic musical instrument of
The ROM 2 stores therein various programs for execution by the CPU 1 and various data. The RAM 3 is used as a working memory for temporarily storing various data generated as the CPU 1 executes predetermined programs, as a memory for storing a currently-executed program and data related to the currently-executed program, and for various other purposes. Predetermined address regions of the RAM 3 are allocated to various functions and used as various registers, flags, tables, memories, etc. The external storage device 4 is provided for storing various data, such as rendition style modules for generating tones corresponding to rendition styles specific to various musical instruments, and various control programs to be executed or referred to by the CPU 1. In a case where a particular control program is not prestored in the ROM 2, the control program may be prestored in the external storage device (e.g., hard disk device) 4, so that, by reading the control program from the external storage device 4 into the RAM 3, the CPU 1 is allowed to operate in exactly the same way as in the case where the particular control program is stored in the ROM 2. This arrangement greatly facilitates version upgrade of the control program, addition of a new control program, etc. The external storage device 4 may use any of various removable-type recording media other than the hard disk (HD), such as a flexible disk (FD), compact disk (CD-ROM or CD-RAM), magneto-optical disk (MO) and digital versatile disk (DVD); alternatively, the external storage device 4 may comprise a semiconductor memory. It should be appreciated that other data than the above-mentioned may be stored in the ROM 2, external storage device 4 and RAM 3.
The performance operator unit 5 is, for example, a keyboard including a plurality of keys operable to select pitches of tones to be generated and key switches corresponding to the keys. The performance operator unit 5 generates performance information for a tone performance; for example, the performance operator unit 5 generates, in response to ON/OFF operation by the user or human player, performance information (e.g., MIDI information), including event data, such as note-on and note-off event data, various control data, such as control change data, etc. It should be obvious that the performance operator unit 5 may be of any desired type other than the keyboard type, such as a neck-like device type having tone-pitch selecting strings provided thereon. The operator unit 6 includes various operators, such as setting switches operable to set tone pitches, colors, effects, etc. with which tones are to be performed, and rendition style switches operable by the human player to designate types (or contents) of rendition styles to be imparted to individual portions of tones. The panel operator unit 6 also include various other operators, such as a numeric keypad, character (text)-data entering keyboard and mouse. Note that the keyboard 5 may be used as input means, such as setting switches and rendition switches. The display device 7, which comprises a liquid crystal display (LCD), CRT (Cathode Ray Tube) and/or the like, visually displays a listing of prestored rendition style modules, contents of the individual rendition style modules, controlling states of the CPU 1, etc.
The tone generator 8, which is capable of simultaneously generating tone signals in a plurality of tone generation channels, receives performance information supplied via the communication bus 1D and generates tone signals by performing tone synthesis on the basis of the received performance information. Namely, as a rendition style module corresponding to the performance information is read out from the ROM 2 or external storage device 4, waveform data defined by the read-out rendition style module are delivered via the communication bus 1D to the tone generator 8 and stored in a buffer of the tone generator 8 as necessary. Then, the tone generator 8 outputs the buffered waveform data at a predetermined output sampling frequency. Tone signals generated by the tone generator 8 are subjected to predetermined digital processing performed by a not-shown effect circuit (e.g., DSP (Digital Signal Processor)) or the like, and the tone signals having been subjected to the digital processing are supplied to a sound system 8A, including an amplifier, speaker, etc., for audible reproduction or sounding.
The interface 9, which is, for example, a MIDI interface, communication interface, etc., is provided for communicating various MIDI information between the electronic musical instrument and external or other MIDI equipment (not shown). The MIDI interface functions to input performance information based on the MIDI standard (i.e., MIDI information) from the external MIDI equipment or the like to the electronic musical instrument, or output MIDI information from the electronic musical instrument to other MIDI equipment or the like. The other MIDI equipment may be of any type (or operating type), such as a keyboard type, guitar type, wind instrument type, percussion instrument type or gesture type, as long as it can generate MIDI information in response to operation by a user of the equipment. The MIDI interface may be a general-purpose interface rather than a dedicated MIDI interface, such as RS232-C, USB (Universal Serial Bus) or IEEE1394, in which case other data than MIDI data may be communicated at the same time. The communication interface, on the other hand, is connected to a wired or wireless communication network (not shown), such as a LAN, Internet, telephone line network, via which the communication interface is connected to an external server computer or the like. Thus, the communication interface functions to input various information, such as a control program and various information, such as MIDI information, from the server computer to the electronic musical instrument. Such a communication interface may be capable of both wired and wireless communication rather than just one of wired and wireless communication.
Now, with reference to
As conventionally known, the rendition style modules are prestored, in the ROM 2, external storage device 4, RAM 3 or the like, as a “rendition style table” where a variety of rendition style modules are compiled as a database. The rendition style modules each comprises original waveform data to be used for reproducing a waveform corresponding to any one of variety of rendition styles, and a group of related data. Each of the “rendition style modules” is a rendition style waveform unit that can be processed as a single data block in a rendition style waveform synthesis system; in other words, each of the “rendition style modules” is a rendition style waveform unit that can be processed as a single event. Broadly classified, the rendition style modules, as seen from
Such rendition style modules can be classified more finely into several rendition style types on the basis of characters of the individual rendition styles, in addition to the above-mentioned classification based on various portions of performance tones. For example, the rendition style modules may be classified into: “Bendup Attack” which is an attack-related rendition style module that causes a bendup to take place immediately after a rise of a tone; “Glissup Attack” which is an attack-related rendition style module that causes a glissup to take place immediately after a rise of a tone; “Vibrato Body” which is a body-related rendition style module representative of a vibrato-imparted portion of a tone between rise and fall portions of a tone; “Benddown Release” which is a release-related rendition style module that causes a benddown to take place immediately before a fall of a tone; “Glissdown Release” which is a release-related rendition style module that causes a benddown to take place immediately after a fall of a tone; “Gliss Joint” which is a joint-related rendition style module that interconnects two tones while effecting a glissup or glissdown; “Bend Joint” which is a joint-related rendition style module that interconnects two tones while effecting a bendup or benddown. The human player can select any desired one of such rendition style types by operating any of the above-mentioned rendition style switches; however, these rendition style types will not be described in this specification because they are already known in the art. Needless to say, the rendition style modules are classified per original tone generator, such as musical instrument type. Further, selection from among various rendition style types may be made by any other means than the rendition style switch.
In the instant embodiment of the present invention, each set of waveform data corresponding to one rendition style module is stored in a database as a data set of a plurality of waveform-constituting factors or elements, rather than being stored directly as the waveform data; each of the waveform-constituting elements will hereinafter be called a “vector”. As an example, vectors corresponding to one rendition style module may include the following. Note that “harmonic” and “nonharmonic” components are defined here by separating an original rendition style waveform into a sine wave, waveform having a harmonious component capable of being additively synthesized, and the remaining waveform component
The rendition style waveform data of the rendition style module may include one or more other types of vectors, such as a time vector indicative of a time-axial progression of the waveform, although not specifically described here.
For synthesis of a tone, waveforms or envelopes corresponding to various constituent elements of a rendition style waveform are constructed along a reproduction time axis of a performance tone by applying appropriate processing to these vector data to thereby modify the data values and arranging or allotting the thus-processed vector data on or to the time axis and then carrying out predetermined waveform synthesis processing on the basis of the vector data allotted to the time axis. For example, in order to produce a desired performance tone waveform, i.e. a desired rendition style waveform, exhibiting predetermined ultimate rendition style characteristics, a waveform segment of the harmonic component is produced by imparting a harmonic component's waveform shape vector with a pitch and time variation characteristic thereof corresponding to a harmonic component's pitch vector and with an amplitude and time variation characteristic thereof corresponding to a harmonic component's amplitude vector, and a waveform segment of the nonharmonic component is produced by imparting a nonharmonic component's waveform shape vector with an amplitude and time variation characteristic thereof corresponding to a nonharmonic component's amplitude vector. Then, the desired performance tone waveform can be produced by additively synthesizing the thus-produced harmonic component's waveform segment and nonharmonic component's waveform segment, so that the tone to be sounded ultimately can be generated. Such tone synthesis processing will not be described later because it is known in the art.
Each of the rendition style modules includes not only the aforementioned rendition style waveform data but also rendition style parameters. The rendition style parameters are parameters for controlling the time, level etc. of the waveform of the rendition style module in question. The rendition style parameters may include one or more kinds of parameters depending on the nature of the rendition style module. For example, the “Bendup Attack” rendition style module may include different kinds of rendition style parameters, such as an absolute tone pitch at the end of the bendup attack, initial bend depth value during the bendup attack, time length from the start to end of the bendup attack, tone volume immediately after the bendup attack and timewise expansion/contraction of a default curve during the bendup attack. These “rendition style parameters” may be prestored in memory, or may be entered by user's input operation. The existing rendition style parameters may be modified via user operation. Further, in a case where no rendition style parameter is given at the time of reproduction of a rendition style waveform, predetermined standard rendition style parameters may be automatically applied. Furthermore, suitable parameters may be automatically produced and applied during the course of processing.
The preceding paragraphs have set forth the case where each rendition style module has all of the waveform-constituting elements (waveform shape, pitch and amplitude) of the harmonic component and all of the waveform-constituting elements (waveform shape and amplitude) of the nonharmonic component, with a view to facilitating understanding of the description. However, the present invention is not so limited, and there may also be used rendition style modules each having only one of the waveform shape, pitch and amplitude elements of the harmonic component and only one of the waveform shape and/or amplitude elements of the nonharmonic component. For example, some rendition style module may have only one of the waveform shape (Timbre), pitch and amplitude elements of the harmonic component and waveform shape and amplitude elements of the nonharmonic component. Such an alternative is preferable in that a plurality of rendition style modules can be used in combination per component.
Now, a description will be given about a general picture of the tone synthesis processing carried out in the electronic musical instrument shown in
Performance reception section 100 performs a performance reception process for receiving in real time performance information (e.g., MIDI information) generated in response to operation by the human player. Namely, MIDI information, such as note-on, note-off and control change data, is output in real time from the performance operator unit 5, such as a keyboard, in response to operation, by the human player, of the performance operator unit 5. Further, rendition style switch output information, indicative of which one of the rendition style switches having rendition style types allocated thereto in advance has been depressed or released, is output in real time, as control change data of MIDI information, from the rendition style switch. The performance reception section 100 is constantly monitoring so as to receive in real time such MIDI information output in response to operation of the performance operator unit 5 or rendition style switch. When MIDI information has been received, the performance reception section 100 outputs the received MIDI information to a performance interpretation section 101.
The performance interpretation section (“player”) 101 performs performance interpretation processing on the basis of the received MIDI information. In the performance interpretation processing, the received MIDI information is analyzed to generate rendition style designation information (i.e., rendition style ID and rendition style parameters), and performance information imparted with the thus-generated rendition style designation information (i.e., rendition-style-imparted performance information) is output to a rendition style synthesis section 102. More specifically, portion-specific rendition style modules are determined which are to be imparted at necessary performance time points corresponding to rendition styles in a time-serial flow of the received MIDI information. The performance interpretation processing to be performed by the performance interpretation section 101 is shown in
Referring first to
Referring now to
In the above-described performance interpretation processing, the type of each rendition style which the rendition style synthesis section 102 is instructed to impart is determined in accordance with control change data, included in the MIDI information, output in response to operation of the corresponding rendition style switch. If no such control change data is included, a rendition style of a predetermined default type may be imparted.
Referring back to
At step S31, the rendition style table is searched on the basis of the input information, i.e. rendition-style-imparted performance information, to select vector data to be used, and data values of the selected vector data are modified on the basis of the rendition-style-imparted performance information. For example, At this step, there performed operations, such as selection of vector data to be used, instruction related to qualification of vector data as to how the pitch element and amplitude element are to be controlled, start time calculation as to at what times vector data are to be used. At next step S32, a determination is made as to whether or not an instruction has been given for imparting a joint-related rendition style or release-related rendition style. If an instruction has been given for imparting a joint-related rendition style or release-related rendition style (i.e., YES determination at step S32), the rendition style synthesis section 102 instructs the waveform synthesis section 103 to perform a later-described acceleration process of
Referring back to
At step S41, a determination is made as to whether the crossfade synthesis is currently under way. If the crossfade synthesis is currently under way (YES determination at step S41), the acceleration process goes to step S42, where it is further determined, on the basis of the start time previously specified by the rendition style synthesis section 102 (see step S31 of
Next, a description will be given, using a specific example, about the accelerated crossfade synthesis intended to promptly complete the currently-performed crossfade synthesis by the new crossfade completion time having been calculated in the aforementioned acceleration process.
As seen from
As seen from
Whereas the embodiment has been described above in relation to the case where tone waveforms to be crossfade-synthesized are loop waveform segments, non-loop waveform (also called “block waveform”) segments may be crossfade-synthesized.
Further, the crossfade characteristic of the crossfade synthesis is not limited to a linear characteristic and may be a non-linear characteristic. Furthermore, the control curve of the crossfade synthesis (i.e., crossfade curve) may be of any desired inclination. The human player may select a desired crossfade characteristic.
Furthermore, the acceleration (crossfade characteristic) of the crossfade synthesis need not necessarily use, or depend on, an absolute time, such as the above-mentioned crossfade completion time; alternatively, the acceleration may use, or depend on, any of a plurality of predetermined crossfade characteristics (i.e., rate dependency), or a combination of crossfade characteristics predetermined per rendition style module.
Furthermore, if, in the above-described acceleration process, next data has already been automatically prepared for the crossfade synthesis before an instruction regarding the next data is given by the rendition style synthesis section 102, then the already-prepared next data may be canceled. This approach is preferable in that it permits a smooth connection to the next data instructed by the rendition style synthesis section 102.
Furthermore, the acceleration time to be used to advance the crossfade synthesis completion time may be set by the user to any desired time, or a different acceleration time may be preset in accordance with the rendition styles to be crossfade-synthesized. If the crossfade synthesis completion time is set to be later than the preset time by increasing the length of the acceleration time, it is possible to retard a waveform shift by a corresponding time amount.
Furthermore, whereas the embodiment has been described as synthesizing a tone on the basis of MIDI information, such as note-on and note-off event information, given from the performance operator unit 5, the present invention may of course be arranged to synthesize a tone on the basis of, for example, music piece data generated based on a plurality of pieces of MIDI information of a music piece prestored in the external storage device 4 or the like in particular order of a performance. Namely, the rendition style impartment may be controlled by the user appropriately operating the rendition style switches to a music piece performance based on such music piece data, rather than operating the rendition style switches to a performance on the keyboard. Further, only MIDI information based on operation of the rendition style switches may be prestored so that the rendition style impartment is automatically controlled in accordance with the MIDI information, in which case the user is allowed to execute only a keyboard performance.
Number | Date | Country | Kind |
---|---|---|---|
2005-156560 | May 2005 | JP | national |
Number | Name | Date | Kind |
---|---|---|---|
5371315 | Hanzawa et al. | Dec 1994 | A |
5687240 | Yoshida et al. | Nov 1997 | A |
6150598 | Suzuki et al. | Nov 2000 | A |
6255576 | Suzuki et al. | Jul 2001 | B1 |
20020178006 | Suzuki et al. | Nov 2002 | A1 |
20040055449 | Akazawa et al. | Mar 2004 | A1 |
Number | Date | Country |
---|---|---|
0 907 160 | Apr 1999 | EP |
1 087 369 | Mar 2001 | EP |
Number | Date | Country | |
---|---|---|---|
20060272482 A1 | Dec 2006 | US |