This disclosure relates to a sound generation device and its control method, a program, and an electronic musical instrument.
In electronic musical instruments such as electronic keyboard devices, singing sounds are synthesized and generated in addition to electronic sounds that simulate the sounds of musical instruments. Such singing sounds (hereinafter referred to as synthesized singing sounds, as distinguished from actual singing) are produced by synthesizing a waveform that has a designated pitch while combining segments of speech according to characters, such as those of lyrics; in this way, a synthesized sound is produced as if the characters were being vocalized. Conventionally, a technology that generates synthesized singing sounds by combining characters with a musical score (sequence data, etc.) prepared in advance has been used; however, as described in Japanese Laid-Open Patent Application No. 2016-206496 and Japanese Laid-Open Patent Publication No. 2014-98801, a technology has also been developed that generates synthesized singing sounds in real time in response to performance operations on an electronic keyboard device.
When a conventional singing sound synthesizer automatically advances one character or one syllable at a time in response to the depression of a key of an electronic keyboard device, if a wrong key is struck or if there is a grace note, the position of the lyrics sometimes advances ahead of the performance. If the position of the lyrics gets ahead of the performance, the position of the lyrics and the performance do not match, resulting in an audibly unnatural synthesized singing sound.
Therefore, an object of this disclosure is to generate audibly natural synthesized singing sounds when singing sounds are vocalized in a real-time performance.
In order to realize the object described above, this disclosure provides a sound generation device comprising an electronic controller including at least one processor. The electronic controller is configured to execute a plurality of modules including a first acquisition module configured to acquire first lyrics data in which a plurality of characters to be vocalized are arranged in a time series and which include at least a first character and a second character that follows the first character, a second acquisition module configured to acquire a vocalization start instruction, and a control module configured to, in response to the second acquisition module acquiring the vocalization start instruction, output an instruction to generate an audio signal based on a first vocalization corresponding to the first character of the first lyrics data in response to the vocalization start instruction satisfying a first condition, and output an instruction to generate an audio signal based on a second vocalization corresponding to the second character of the first lyrics data in response to the vocalization start instruction not satisfying the first condition.
A karaoke system according to an embodiment of this disclosure will be described in detail below with reference to the drawings. The following embodiments are examples of embodiments of this disclosure, but the invention is not limited to these embodiments.
The karaoke system according to an embodiment of this disclosure has a function for generating natural synthesized singing sounds when the singing sounds are vocalized in a real-time performance, by specifying a target musical piece when karaoke is performed using an electronic musical instrument that can generate synthesized singing sounds.
The karaoke server 1000 is equipped with a storage device that stores, in association with song IDs, the music data required for providing karaoke on the karaoke device 1. The music data include data pertaining to karaoke songs, such as lead vocal data, chorus data, accompaniment data, and karaoke subtitle data. The lead vocal data indicate the main melody part of a song. The chorus data indicate secondary melody parts, such as harmony to the main melody. The accompaniment data indicate the accompaniment sounds of the song. The lead vocal data, chorus data, and accompaniment data can be expressed in MIDI format. The karaoke subtitle data are data for displaying lyrics on the display of the karaoke device 1.
The singing sound synthesis server 2000 is equipped with a storage device that stores, in association with the song IDs, setting data for setting the electronic musical instrument 3 in accordance with the song. The setting data include lyrics data corresponding to each part of the song to be sung corresponding to the song ID. The lyrics data corresponding to the lead vocal part are referred to as first lyrics data. The first lyrics data stored in the singing sound synthesis server 2000 can be the same as or different from the karaoke subtitle data stored in the karaoke server 1000. That is, the first lyrics data stored in the singing sound synthesis server 2000 are the same in that they also define the lyrics (characters) to be vocalized, but they are adjusted to a format that can easily be used by the electronic musical instrument 3. For example, the karaoke subtitle data stored in the karaoke server 1000 are character strings such as “ko,” “n,” “ni,” “chi,” and “ha.” In contrast, the first lyrics data stored in the singing sound synthesis server 2000 can be character strings that match the actual pronunciations “ko,” “n,” “ni,” “chi,” and “wa” for easy use by the electronic musical instrument 3. This format can include information for identifying cases in which two characters are sung with one sound, information for identifying breaks in phrases, and the like.
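For illustration only, the following is a minimal sketch, in Python, of one way such first lyrics data could be represented; the dictionary fields (“display,” “pronunciation,” “merge_with_next,” “phrase_end”) are assumptions introduced for this example and do not reflect the actual format used by the singing sound synthesis server 2000.

```python
# Hypothetical representation of first lyrics data (field names are assumptions).
first_lyrics_data = [
    {"display": "ko",  "pronunciation": "ko",  "merge_with_next": False, "phrase_end": False},
    {"display": "n",   "pronunciation": "n",   "merge_with_next": False, "phrase_end": False},
    {"display": "ni",  "pronunciation": "ni",  "merge_with_next": False, "phrase_end": False},
    {"display": "chi", "pronunciation": "chi", "merge_with_next": False, "phrase_end": False},
    # The subtitle character "ha" is stored with its actual pronunciation "wa".
    {"display": "ha",  "pronunciation": "wa",  "merge_with_next": False, "phrase_end": True},
]
```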
The karaoke device 1 includes an input terminal to which an audio signal is supplied and a speaker that outputs the audio signal as sound. The audio signal input to the input terminal can be supplied from the electronic musical instrument 3 or from a microphone.
The karaoke device 1 reproduces the audio signal from the accompaniment data of the music data received from the karaoke server 1000, and outputs the audio signal from the speaker as an accompaniment sound of the song. The sound corresponding to the audio signal supplied to the input terminal can be synthesized with the accompaniment sound and output.
The control terminal 2 is a remote controller that transmits user instructions (for example, song designation, volume, transpose, etc.) to the karaoke device 1. The control terminal 2 can also transmit user instructions (for example, setting of the lyrics, timbre, etc.) to the electronic musical instrument 3 via the karaoke device 1.
In the karaoke system, the control terminal 2 transmits an instruction for setting the musical piece set by the user to the karaoke device 1. Based on this instruction, the karaoke device 1 acquires the music data of the musical piece from the karaoke server 1000 and the first lyrics data from the singing sound synthesis server 2000. The karaoke device 1 transmits the first lyrics data to the electronic musical instrument 3. The first lyrics data are stored in the electronic musical instrument 3. Based on the user's instruction to start the performance of the musical piece, the karaoke device 1 reads the music data and outputs the accompaniment sound, etc., and the electronic musical instrument 3 reads the first lyrics data and outputs a synthesized singing sound in accordance with the user's performance operation.

Hardware Configuration of the Electronic Musical Instrument
The electronic musical instrument 3 is a device that generates an audio signal representing a synthesized singing sound in accordance with the contents of an instruction in response to an operation of a performance operation unit 321 (
The control unit 301 is an electronic controller that includes one or a plurality of processors. In this embodiment, the control unit 301 includes an arithmetic processing circuit, such as a CPU (Central Processing Unit). The term “electronic controller” as used herein refers to hardware that executes software programs. The control unit 301 causes the CPU to execute programs stored in the storage unit 303 to realize various functions in the electronic musical instrument 3. The functions realized in the electronic musical instrument 3 include, for example, a sound generation function for executing a sound generation process. The control unit 301 can be configured to comprise, instead of the CPU or in addition to the CPU, an MPU (Microprocessing Unit), a GPU (Graphics Processing Unit), an ASIC (Application Specific Integrated Circuit), an FPGA (Field Programmable Gate Array), a DSP (Digital Signal Processor), or a general-purpose computer. Moreover, the electronic controller can include a plurality of CPUs. In this embodiment, the control unit 301 further includes a DSP for generating audio signals with the sound generation function. The control unit 301 (electronic controller) is configured to execute a plurality of modules including at least a first acquisition module (lyrics data acquisition unit 31), a second acquisition module (vocalization start instruction acquisition unit 34), and a control module (vocalization control unit 32), as explained below.
The storage unit 303 is a storage device (memory (computer memory)) such as non-volatile memory. The storage unit 303 is one example of a non-transitory computer-readable medium. The storage unit 303 stores a program for realizing the sound generation function described above. The sound generation function will be described further below. The storage unit 303 also stores setting information used in generating audio signals representing the synthesized singing sound, segments of speech for generating the synthesized singing sound, and the like. Setting information includes, for example, timbre, as well as the first lyrics data received from the singing sound synthesis server 2000.
The operating unit 305 is a device (user operable input(s)) such as a switch, volume knob, etc., and outputs a signal to the control unit 301 in response to input operations. The display unit 307 is a display device (display), such as a liquid-crystal display or an organic EL display, which displays a screen based on control by the control unit 301. The operating unit 305 and the display unit 307 can be integrated to form a touch panel. The communication unit (communication device) 309 connects to the control terminal 2 by short-range wireless communication based on control by the control unit 301. The term “communication device” as used herein includes a receiver, a transmitter, a transceiver, and a transmitter-receiver capable of transmitting and/or receiving signals over a telephone line, over another communication wire, or wirelessly.
The performance operation unit 321 outputs performance signals to the control unit 301 in response to performance operations. The performance operation unit 321 includes a plurality of user-operable operators (for example, a plurality of keys) and a sensor that detects one or more operations of each operator. Performance signals include information indicating the position of the operated key (note number), information indicating that a key has been pressed (note on), information indicating that a key has been released (note off), the key depression speed (velocity), and the like. Specifically, when a key is pressed, a note on associated with the velocity and note number (also called a pitch instruction) is output as a performance signal indicating a vocalization start instruction, and when the key is released, a note off associated with the note number is output as a performance signal indicating a vocalization stop instruction. The control unit 301 uses these performance signals to generate audio signals. The interface 317 includes a terminal for outputting the generated audio signals.
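As a purely illustrative sketch (not the actual implementation of the control unit 301), the note-on/note-off semantics described above could be interpreted as follows; the message layout and field names are assumptions introduced for this example.

```python
def classify_performance_signal(message):
    """Interpret a MIDI-style performance signal (illustrative layout assumed)."""
    status, note_number, velocity = message
    if status == "note_on" and velocity > 0:
        # A note-on is treated as a vocalization start instruction carrying a
        # pitch instruction (note number) and a velocity value.
        return {"type": "start", "pitch": note_number, "velocity": velocity}
    # A note-off (or a note-on with velocity 0) is treated as a vocalization stop instruction.
    return {"type": "stop", "pitch": note_number}
```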
Here, an example of the first lyrics data stored in the storage unit 303 will be described with reference to
Hereinafter, each of the lyrics (characters) to be vocalized, that is, each unit of speech (a group into which the sounds are divided), may be referred to as a “syllable.” In the present embodiment, “character” in the lyrics data (including the second lyrics data described further below) is used synonymously with “syllable.”
As shown in
The sound generation process according to the embodiment of this disclosure will now be described with reference to
When the process is initiated by the user's instruction to play the musical piece, the control unit 301 obtains the first lyrics data from the storage unit 303 (Step S401). The control unit 301 then performs an initialization process (Step S402). In the present embodiment, initialization means that the control unit 301 sets the count value tc=0. The control unit 301 then sets the count value tc=tc+1, thereby incrementing the count value tc (Step S403). Next, the portion of the accompaniment data corresponding to the count value tc is read (Step S404).
The control unit 301 waits until the end of the reading of the accompaniment data, the input of a user instruction to stop the performance of the musical piece, or the reception of a performance signal is detected (Step S405: NO, Step S406: NO, Step S407: NO) and repeats the processing in Steps S403 and S404 until the above-described detection is made. This state is referred to as the standby state. As described above, the initial value of the count value tc is 0, which corresponds to the playback start timing of the musical piece. The control unit 301 increments the count value tc to measure a time based on the playback start timing of the musical piece.
When the accompaniment data have been read to the end in the standby state (Step S405: YES), the control unit 301 terminates the sound generation process. Likewise, if the user inputs an instruction to stop the performance of the musical piece in the standby state (Step S406: YES), the control unit 301 terminates the sound generation process.
If a performance signal is received from the performance operation unit 321 in the standby state (Step S407: YES), the control unit 301 executes an instruction process for generating an audio signal with the DSP (Step S408). The instruction process for generating an audio signal will be described in detail further below. When the instruction process for generating an audio signal is completed, the process again proceeds to Step S403, and the control unit 301 enters the standby state to repeat the processing of Steps S403 and S404.
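The following minimal Python sketch illustrates the flow of the standby loop described above (Steps S402 through S408); the helper callables passed in (read_accompaniment, stop_requested, poll_performance_signal, handle_instruction) are assumptions introduced for this example and are not part of the disclosure.

```python
# Minimal sketch of the standby loop (Steps S402-S408); helpers are hypothetical.
def sound_generation_loop(read_accompaniment, stop_requested,
                          poll_performance_signal, handle_instruction):
    tc = 0                                    # Step S402: initialization (tc = 0)
    while True:
        tc += 1                               # Step S403: increment the count value tc
        finished = read_accompaniment(tc)     # Step S404: read the portion for tc
        if finished:                          # Step S405: accompaniment read to the end
            break
        if stop_requested():                  # Step S406: user instruction to stop
            break
        signal = poll_performance_signal()    # Step S407: performance signal received?
        if signal is not None:
            handle_instruction(signal, tc)    # Step S408: instruction process
    # Leaving the loop terminates the sound generation process.
```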
When a performance signal is received from the performance operation unit 321, the instruction process for generating an audio signal is initiated. First, the control unit 301 sets the pitch based on the performance signal obtained from the performance operation unit 321 (Step S501). The control unit 301 determines whether the performance signal acquired from the performance operation unit 321 is a vocalization start instruction (Step S502).
If it is determined that the performance signal is a vocalization start instruction (Step S502: YES), the control unit 301 determines, by referring to the first lyrics data, whether the count value tc at the time that the vocalization start instruction was acquired is within the vocalization setting interval corresponding to any one of the characters (Step S503).
If it is determined that the time at which the vocalization start instruction was acquired is within the vocalization setting interval corresponding to one of the characters (Step S503: YES), the control unit 301 sets the character M(p) corresponding to said vocalization setting interval as the character to be vocalized (Step S504). The control unit 301 then outputs an instruction to the DSP to generate an audio signal based on the vocalization of the character M(p) at the set pitch (Step S509), terminates the instruction process, and proceeds to Step S403 shown in
If the control unit 301 determines that the time at which the vocalization start instruction was acquired is not within the vocalization setting interval for any character (Step S503: NO), the control unit 301 calculates a center time tm(q) between the vocalization stop time te(q) corresponding to the character M(q) immediately preceding the time of the vocalization start instruction and the vocalization start time ts(q+1) corresponding to the next character M(q+1) (Step S505). If it is assumed that the stop time te(q) is a “first time” and the start time ts(q+1) is a “second time,” the center time between the stop time te(q) and the start time ts(q+1) is called a “third time.” For example, if the count value tc is in the interval between the vocalization stop time te(1) of “ko” (character M(1)) and the vocalization start time ts(2) of “n” (character M(2)), the control unit 301 calculates the center time tm(1)=(te(1)+ts(2))/2. If the center time tm(q) between the immediately preceding vocalization stop time te(q) and the next vocalization start time ts(q+1) is calculated in advance, Step S505 can be omitted. The control unit 301 then determines whether the count value tc is before the center time tm(q) (Step S506). Here, determining whether the count value tc is before the center time tm(q) is one example of determining whether a “first condition” is satisfied.
If the count value tc is before the center time tm(q) (Step S506: YES), the control unit 301 sets the character M(q) corresponding to the setting interval before the center time tm(q) as the character to be vocalized (Step S507). The control unit 301 then outputs an instruction to the DSP to generate an audio signal based on the vocalization of the character M(q) at the set pitch (Step S509), terminates the instruction process, and proceeds to Step S403 shown in
If the count value tc is not before the center time tm(q) (Step S506: NO), the control unit 301 sets the character M(q+1) corresponding to the setting interval after the center time tm(q) as the character to be vocalized (Step S508). The control unit 301 then outputs an instruction to the DSP to generate an audio signal based on the vocalization of the character M(q+1) at the set pitch (Step S509), terminates the instruction process, and proceeds to Step S403 shown in
If it is determined that the performance signal acquired from the performance operation unit 321 is not a vocalization start instruction, that is, that it is a vocalization stop instruction (Step S502: NO), the control unit 301 outputs an instruction to the DSP to stop the generation of the audio signal generated based on the vocalization of the character M(q) at the set pitch (Step S510), terminates the instruction process, and proceeds to Step S403 shown in
In summary, the instruction process described above can be rephrased as follows. In the instruction process for generating an audio signal, the control unit 301 determines whether the vocalization start instruction satisfies the first condition. If the first condition is satisfied, the control unit 301 generates an audio signal based on the first vocalization corresponding to the first character, and if the first condition is not satisfied, generates an audio signal based on the second vocalization corresponding to the second character that follows the first character. In the present embodiment, the first condition is a condition in which the time at which the vocalization start instruction is acquired is before the center time between the stop time of the first character and the start time of the second character. To rephrase the instruction process further, the control unit 301 identifies the setting interval to which the acquisition time of the vocalization start instruction belongs, or the setting interval that is closest to the acquisition time, and generates an audio signal based on the vocalization of the character that corresponds to the identified setting interval.
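For illustration only, the instruction process of Steps S501 through S510 could be sketched in Python as follows; the lyrics entry fields (“char,” “ts,” “te”) and the helpers start_vocalization and stop_vocalization are assumptions introduced for this example, not the actual implementation of the control unit 301.

```python
# Minimal sketch of the instruction process (Steps S501-S510); names are assumptions.
def instruction_process(signal, tc, lyrics, start_vocalization, stop_vocalization):
    pitch = signal["pitch"]                              # Step S501: set the pitch
    if signal["type"] != "start":                        # Step S502: NO (stop instruction)
        stop_vocalization(pitch)                         # Step S510: stop the audio signal
        return
    # Step S503: is tc within the vocalization setting interval of some character?
    for entry in lyrics:
        if entry["ts"] <= tc <= entry["te"]:
            start_vocalization(entry["char"], pitch)     # Steps S504, S509
            return
    # tc lies between two setting intervals; find the immediately preceding character M(q).
    preceding = [i for i, e in enumerate(lyrics) if e["te"] < tc]
    if not preceding:                                    # boundary case: before the first character
        start_vocalization(lyrics[0]["char"], pitch)
        return
    q = preceding[-1]
    if q + 1 >= len(lyrics):                             # boundary case: after the last character
        start_vocalization(lyrics[q]["char"], pitch)
        return
    tm = (lyrics[q]["te"] + lyrics[q + 1]["ts"]) / 2     # Step S505: center time tm(q)
    if tc < tm:                                          # Step S506: first condition
        start_vocalization(lyrics[q]["char"], pitch)     # Steps S507, S509
    else:
        start_vocalization(lyrics[q + 1]["char"], pitch) # Steps S508, S509
```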
In this manner, through this sequential processing, a synthesized singing sound is generated in which the characters of the song lyrics, identified as the accompaniment sound progresses during playback of the accompaniment data, are sequentially vocalized at the timing and pitch corresponding to the performance operation. An audio signal representing the synthesized singing sound is then output to the karaoke device 1.
A specific example of the sound generation process shown in
First, a case in which the count value tc (acquisition time) when the vocalization start instruction was acquired is within the setting interval ts(1)-te(1) of the vocalization will be explained with reference to
Next, a case will be explained in which, in the standby state of the sound generation process, a performance signal including a vocalization stop instruction associated with the pitch “G4” is received from the performance operation unit 321. In this case, the control unit 301 executes the instruction process (Step S408) and sets the pitch “G4” based on the performance signal (Step S501). The control unit 301 determines that the performance signal is a vocalization stop instruction (Step S502: NO) and outputs an instruction to the DSP to stop the generation of the audio signal based on the vocalization of the character “ko” at the set pitch “G4” (Step S510). In
A case in which the count value tc at the time the vocalization start instruction was acquired is between the setting interval ts(1)-te(1) and the setting interval ts(2)-te(2), and is close to the setting interval ts(1)-te(1), will now be explained with reference to
A case in which the count value tc when the vocalization start instruction shown in
The electronic musical instrument 3 includes a lyrics data acquisition unit 31 (first acquisition module), a vocalization control unit 32 (control module), a signal generating unit 33 (signal generating module), and a vocalization start instruction acquisition unit 34 (second acquisition module) as functional blocks that realize the sound generation function, etc., for generating the synthesized singing sound. The functions of these functional blocks are realized by the cooperation of the control unit 301, the storage unit 303, and a timer (not shown). It is not essential that the functional blocks of this disclosure include the signal generating unit 33.
The lyrics data acquisition unit 31 acquires the first lyrics data corresponding to the song ID from the singing sound synthesis server 2000 via the karaoke device 1. The vocalization control unit 32 primarily executes the instruction process shown in
The signal generating unit 33 corresponds to the aforementioned DSP and starts or stops the generation of audio signals based on the instruction received from the vocalization control unit 32. The audio signal generated by the signal generating unit 33 is output to the outside via the interface 317.
In the present embodiment, a sound generation process that is somewhat different from the sound generation process described in the first embodiment will be described with reference to
In the present embodiment, in the first lyrics data shown in
In the flowchart shown in
When a performance signal is received from the performance operation unit 321, the instruction process for generating audio signals is started. First, the control unit 301 sets the pitch based on the performance signal acquired from the performance operation unit 321 (Step S521). The control unit 301 determines whether the performance signal acquired from the performance operation unit 321 is a vocalization start instruction (Step S522).
If it is determined that the performance signal is a vocalization start instruction (Step S522: YES), the control unit 301 determines whether the time at which the vocalization start instruction was acquired satisfies tc−ts≤tth or whether M(i)=M(1) (Step S523). Here, tc−ts is the elapsed time from the time ts of the last acquisition of a vocalization start instruction to the present time, and tth is a prescribed time interval. If either tc−ts≤tth or M(i)=M(1) is satisfied (Step S523: YES), the control unit 301 outputs an instruction to the DSP to generate an audio signal for the character M(i) (Step S526). If M(i)=M(1) is satisfied, i.e., if it is the first vocalization, the control unit 301 sets the character “ko” as the character to be vocalized, and if tc−ts≤tth is satisfied, the control unit 301 sets the same character as the one set for the immediately preceding vocalization as the character to be vocalized. The control unit 301 then sets the time ts to the current count value tc (Step S527), terminates the instruction process, and proceeds to Step S403 shown in
If neither tc−ts≤tth nor M(i)=M(1) is satisfied (Step S523: NO), the control unit 301 determines whether the volume acquired in the vocalization start instruction is lower than a prescribed volume (Step S524). If the volume acquired in the vocalization start instruction is lower than the prescribed volume (Step S524: YES), the control unit 301 executes Steps S526 and S527, then terminates the instruction process and proceeds to Step S403 shown in
If the volume is higher than or equal to the prescribed volume (Step S524: NO), the control unit 301 increments the character count value i (Step S525) and then executes Steps S526 and S527 for the next character.
In the present embodiment, the first condition is satisfied when either tc−ts≤tth or M(i)=M(1) holds. The first condition is also satisfied when the volume is lower than the prescribed volume, even if neither tc−ts≤tth nor M(i)=M(1) holds.
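As a minimal sketch of this first-condition check (Steps S523 and S524), assuming tth = 100 ms, a hypothetical prescribed volume value, and a zero-based character index i (i = 0 corresponds to M(1)), the decision could be written as follows; the function and parameter names are illustrative only.

```python
def keeps_current_character(tc, ts, i, volume, tth=100, prescribed_volume=32):
    """Return True when the current character should be kept (first condition)."""
    # Step S523: the first vocalization, or a start instruction arriving within
    # tth of the previous one, keeps (continues) the current character.
    if i == 0 or (tc - ts) <= tth:
        return True
    # Step S524: a start instruction quieter than the prescribed volume
    # (e.g., a grace note) also keeps the current character.
    return volume < prescribed_volume
```

When this check returns False, the character count value i is incremented (Step S525) so that the next character of the first lyrics data is vocalized.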
In this manner, by the sequential processing shown in
A specific example of the sound generation process shown in
When the sound generation process is started, the control unit 301 acquires the first lyrics data (Step S401) and executes the initialization process (Step S402). In the initialization process, the control unit 301 sets the character M(i)=M(1), tc=0, and ts=0. A case is assumed in which, in the standby state of the sound generation process, the control unit 301 receives a performance signal including a vocalization start instruction associated with the pitch “G4” from the performance operation unit 321 (Step S407: YES). In this case, the control unit 301 executes the instruction process (Step S408) and sets the pitch “G4” based on the performance signal (Step S521). The control unit 301 determines that the performance signal is a vocalization start instruction (Step S522: YES) and determines whether or not either tc−ts≤tth or M(i)=M(1) is satisfied (Step S523). The control unit 301 determines that M(i)=M(1) is satisfied (Step S523: YES). Since the character M(1) is “ko,” the control unit 301 outputs an instruction to the DSP to generate an audio signal based on the vocalization of the character “ko” at the pitch “G4” (Step S526). The control unit 301 sets the time ts to the count value tc (Step S527), terminates the instruction process, and proceeds to Step S403 shown in
Next, a case is assumed in which, in the standby state of the sound generation process, the control unit 301 receives a performance signal including a vocalization stop instruction associated with the pitch “G4” from the performance operation unit 321. In this case, the control unit 301 executes the instruction process (Step S408) and sets the pitch “G4” based on the performance signal (Step S521). When it is determined that the performance signal is a vocalization stop instruction (Step S522: NO), the control unit 301 outputs an instruction to stop the generation of the audio signal based on the vocalization of the character “ko” at the set pitch “G4” (Step S510), terminates the instruction process, and proceeds to Step S403 shown in
Next, a case is assumed in which, in the standby state of the sound generation process, the control unit 301 receives a performance signal including a vocalization start instruction associated with the pitch “A5” from the performance operation unit 321. In this case, the control unit 301 executes the instruction process (Step S408) and sets the pitch “A5” based on the performance signal (Step S521). The control unit 301 then determines that the performance signal is a vocalization start instruction (Step S522: YES) and determines whether or not either tc−ts≤tth or M(i)=M(1) is satisfied (Step S523). The prescribed interval tth is in the range of 10 ms-100 ms, for example, and is 100 ms in the present embodiment. When tc−ts exceeds 100 ms, it is determined that tc−ts≤tth is not satisfied. Here, since tc−ts is longer than the prescribed interval tth, the control unit 301 determines that neither tc−ts≤tth nor M(i)=M(1) is satisfied (Step S523: NO) and determines whether the volume is lower than the prescribed volume (Step S524). When it is determined that the volume is higher than or equal to the prescribed volume (Step S524: NO), the control unit 301 sets the character count value i=i+1 (Step S525). Here, the character M(2) following the character M(1) is set. Since the character M(2) is “n,” the control unit 301 outputs an instruction to the DSP to generate an audio signal based on the vocalization of the character “n” at the pitch “A5” (Step S526). The control unit 301 sets the time ts to the count value tc (Step S527), terminates the instruction process, and proceeds to Step S403 shown in
Next, a case is assumed in which, in the standby state of the sound generation process, a performance signal including a vocalization start instruction associated with the pitch “B5” is received from the performance operation unit 321. In this case, the control unit 301 executes the instruction process (Step S408) and sets the pitch “B5” based on the performance signal (Step S521). The control unit 301 determines that the performance signal is a vocalization start instruction (Step S522: YES) and determines whether or not either tc−ts≤tth or M(i)=M(1) is satisfied (Step S523). Here, since tc−ts is shorter than the prescribed interval tth, it is determined that tc−ts≤tth is satisfied (Step S523: YES), and an instruction to generate an audio signal based on the vocalization of the character “n” at the set pitch “B5” is output (Step S526). Here, the control unit 301 actually outputs an instruction to generate an audio signal that continues the vocalization of the immediately preceding character “n.” Therefore, in order to continue the vocalization of the character “n,” an audio signal based on the vocalization of “-,” which is a prolonged sound, is generated at the pitch “B5.” The control unit 301 sets the time ts to the count value tc (Step S527), terminates the instruction process, and proceeds to Step S403 shown in
In this manner, in the sound generation process according to the present embodiment, if the time interval from the immediately preceding vocalization start instruction to the next vocalization start instruction is shorter than the prescribed time interval, the characters of the first lyrics data can be prevented from advancing.
In other words, if the time interval from the immediately preceding vocalization start instruction to the next vocalization start instruction is shorter than the prescribed time interval, the second vocalization start instruction satisfies the first condition. In this case, the control unit 301 outputs an instruction to generate an audio signal that continues the first vocalization corresponding to the first vocalization start instruction. For example, “-,” which is a prolonged sound at the pitch “B5,” is assigned to the syllable note of the interval ton(3)-toff(3).
The embodiments of this disclosure were described above, but the embodiments of this disclosure can be modified in various forms, as described in the following. In addition, the embodiment described above and the modified examples that will now be described can be applied in combination with each other.
Here, the first lyrics data stored in the storage unit 303 will be described with reference to
When it is determined that the vocalization start instruction precedes the center time tfm(1), the control unit 301 outputs an instruction to the DSP to generate an audio signal based on the vocalization corresponding to the first (beginning) character of the first phrase. When it is determined that the vocalization start instruction follows the center time tfm(1), the control unit 301 can output an instruction to the DSP to generate an audio signal based on the vocalization corresponding to the first (beginning) character of the second phrase.
When it is determined that the vocalization start instruction follows the center time tfm(1), the control unit 301 also determines whether the vocalization start instruction follows the start time tfs(2) of the second phrase. If it is determined that the vocalization start instruction follows the start time tfs(2) of the second phrase, the control unit 301 outputs an instruction to the DSP to generate an audio signal based on the vocalization corresponding to, from among the characters corresponding to the vocalizations of the second phrase, the character that has not yet been vocalized. Specifically, as shown in
On the other hand, if it is determined that the vocalization start instruction precedes the start time tfs(2) of the second phrase, the control unit 301 generates an audio signal based on the vocalization corresponding to the first (beginning) character of the characters of the second phrase. Specifically, as shown in
In the modified example (1), the first condition is a condition in which the time at which the vocalization start instruction is acquired precedes the center time between the stop time of the first phrase and the start time of the second phrase. In addition, the second condition is a condition in which the time at which the vocalization start instruction was acquired follows the start time tfs(2) of the second vocalization. In other words, the second condition described above is satisfied when the acquisition time of the vocalization start instruction follows the start time of the second vocalization as defined in the first lyrics data.
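As an illustrative sketch of this phrase-level decision, under the assumption that tfe1 and tfs2 denote the stop time of the first phrase and the start time of the second phrase and that next_unsung is the index of the first not-yet-vocalized character of the second phrase (all names introduced for this example only), the selection could be expressed as follows.

```python
# Hedged sketch of the phrase-based selection in modified example (1).
def select_phrase_character(tc, tfe1, tfs2, phrase1, phrase2, next_unsung):
    tfm1 = (tfe1 + tfs2) / 2              # center time tfm(1) between the phrases
    if tc < tfm1:                         # first condition: before the center time
        return phrase1[0]                 # beginning character of the first phrase
    if tc >= tfs2:                        # second condition: after the second phrase starts
        return phrase2[next_unsung]       # not-yet-vocalized character of the second phrase
    return phrase2[0]                     # otherwise the beginning character of the second phrase
```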
As shown in
The vocalization interval defined in the first lyrics data as shown in
Moreover, the control can be as follows. In the first lyrics data, the control unit 301 identifies a setting interval to which the acquisition time of the vocalization start instruction belongs, or a setting interval that is closest to the acquisition time. If the second lyrics data include a setting interval that temporally coincides with the identified setting interval, the control unit 301 then generates an audio signal based on the vocalization corresponding to the character that corresponds to the temporally coincident setting interval in the second lyrics data. That is, if the setting interval corresponding to the acquisition time of the vocalization start instruction is present in both the first lyrics data and the second lyrics data, the vocalization of the second lyrics data is prioritized. Such a process can also be applied when the second lyrics data correspond to the first lyrics data only in some time regions. If the chorus part is also used, the third time described above can be shifted forward or backward with respect to the center time between the stop time te(q) and the start time ts(q+1).
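A minimal sketch of this prioritization, assuming each lyrics entry carries “ts”/“te” setting-interval values and that a simple interval-overlap test stands in for the “temporally coincides” determination (both assumptions for this example only):

```python
def select_with_chorus_priority(identified_interval, first_char, second_lyrics):
    """Prefer the second lyrics data (chorus part) when its interval overlaps."""
    for entry in second_lyrics:
        overlaps = (entry["ts"] < identified_interval["te"]
                    and identified_interval["ts"] < entry["te"])
        if overlaps:
            return entry["char"]          # vocalize the character of the second lyrics data
    return first_char                     # otherwise use the character of the first lyrics data
```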
The electronic musical instrument 3A has a plurality of sound holes in the body of the instrument, a plurality of operation keys 311 that change the open/closed states of the sound holes, and the breath sensor 312. When a performer operates the plurality of operation keys 311, the open/closed states of the sound holes are changed, thereby outputting sound with a prescribed tone. A mouthpiece is attached to the body of the instrument, and the breath sensor 312 is provided inside the instrument body in the vicinity of the mouthpiece. The breath sensor 312 is a pressure sensor that detects the blowing pressure of the breath blown in by the user (performer) through the mouthpiece. The breath sensor 312 detects the presence or absence of blown-in breath as well as, at least while the electronic musical instrument 3A is being played, the intensity and speed (momentum) of the blowing pressure. The volume of the vocalization is determined in accordance with the magnitude of the pressure detected by the breath sensor 312. In the present modified example, the magnitude of the pressure detected by the breath sensor 312 is treated as volume information. If the breath sensor 312 detects a pressure of the prescribed magnitude or greater, this is treated as a vocalization start instruction; if the detected pressure is less than the prescribed magnitude, it is not treated as a vocalization start instruction.
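A minimal sketch, assuming a hypothetical prescribed pressure threshold and a simple linear scaling (neither of which is specified in this disclosure), of how the breath sensor output could be turned into a vocalization start instruction with volume information:

```python
def interpret_breath_pressure(pressure, prescribed_pressure=0.2, max_pressure=1.0):
    """Treat sufficient breath pressure as a vocalization start instruction."""
    if pressure < prescribed_pressure:
        return None                                        # not a vocalization start instruction
    volume = min(127, int(127 * pressure / max_pressure))  # pressure magnitude used as volume
    return {"vocalization_start": True, "volume": volume}
```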
In the electronic wind instrument, as described in
This disclosure was described above based on preferred embodiments, but this disclosure is not limited to the above-described embodiments, and includes various embodiments that do not deviate from the scope of the invention. Some of the above-described embodiments can be appropriately combined.
The performance signal can be acquired from the outside via communication. Therefore, it is not essential to provide the performance operation unit 321, and it is not essential for the sound generation device to have the function and form of a musical instrument.
A storage medium that stores a control program represented by software for achieving this disclosure can be read into the present device to achieve the same effects as this disclosure. In that case, the program code read from the storage medium realizes the novel functions of this disclosure, so the non-transitory, computer-readable storage medium that stores the program code constitutes this disclosure. In addition, the program code can be supplied via a transmission medium or the like, in which case the program code itself constitutes this disclosure. The storage medium in these cases can be, in addition to a ROM, a floppy disk, a hard disk, an optical disc, a magneto-optical disk, a CD-ROM, a CD-R, a magnetic tape, a non-volatile memory card, or the like. The non-transitory, computer-readable storage medium also includes storage media that retain the program for a set period of time, such as volatile memory (for example, DRAM (Dynamic Random Access Memory)) inside a computer system that functions as a server or a client when the program is transmitted via a network such as the Internet or via a communication line such as a telephone line.
By this disclosure, it is possible to generate natural synthesized singing sounds when singing sounds are generated in a real-time performance.
This application is a continuation application of International Application No. PCT/JP2021/046585, filed on Dec. 16, 2021, which claims priority to Japanese Patent Application No. 2021-037651 filed in Japan on Mar. 9, 2021. The entire disclosures of International Application No. PCT/JP2021/046585 and Japanese Patent Application No. 2021-037651 are hereby incorporated herein by reference.