The present invention relates to a system and a method for synthesizing music and voice, and a service system and a service method using the same.
Generally, in a conventional music mail service, a user selects music to be transmitted to a receiver and sends only the selected music to the receiver. However, this simple music transfer does not satisfy the sender's various needs.
An object of the present invention is to provide a system and a method capable of providing a music mail that carries the sender's voice, so that the receiver can easily understand the music mail from the sender without loss of clarity, in a manner similar to multimedia such as disk jockey broadcasting.
Another object of the present invention is to provide a system and a method for controlling the volume level of synthesized music, with various synthesizing effects, based on the user's voice.
According to the present invention, a system for synthesizing voice and music includes: a receiver for receiving a user's voice; a database for storing various music sources; and a synthesizing means for controlling the volume of the music stored in the database and for synthesizing the controlled music and the voice according to detection of a voice silent part in the voice inputted from the receiver.
The system and method according to the present invention are capable of letting a listener feel the maximum synthesizing effect when the voice and the music are mixed.
Also, the system and method according to the present invention are capable of synthesizing the voice and music with various effects without volume control by a professional synthesizer operator.
According to one aspect of the present invention, there is provided a system for synthesizing voice into music comprising: a receiver for receiving the voice from a user; a database for storing a plurality of music data; and a synthesizing means for controlling a volume of the music according to a silent part of the voice and for synthesizing the received voice into the volume-controlled music.
According to another aspect of the present invention, there is provided a system for synthesizing voice into music comprising: a receiver for receiving the voice from a user; a database for storing a plurality of music data; and a synthesizing means for separating the received voice into a plurality of voice elements according to a silent part of the voice and synthesizing the separated voice elements into the music.
According to a further aspect of the present invention, there is provided a system for synthesizing voice into music comprising: a receiver for receiving the voice from a user; a database for storing individually separated music elements which form the music; and a synthesizing means for synthesizing the received voice into the separated music elements.
According to a still further aspect of the present invention, there is provided a system for synthesizing voice into music comprising: a receiver for receiving the voice from a user; a database for storing individually separated music elements which form the music; and a synthesizing means for separating the received voice into a plurality of voice elements according to a silent part of the voice and synthesizing the separated voice elements and the separated music elements.
According to a still further aspect of the present invention, there is provided a method for synthesizing voice into music comprising the steps of: a) receiving the voice from a user; b) detecting a silent part of the received voice; c) controlling a volume of the music according to the detected silent part; d) synthesizing the volume-controlled music and the received voice; and e) transmitting the synthesized music and voice.
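Steps b) to d) of this method can be sketched as follows. This is a minimal illustration only, not the implementation of the invention: the amplitude threshold, the minimum silent length, the reduced music gain and the function names `detect_silent_parts` and `mix_voice_into_music` are assumptions introduced for the example, and the samples are taken to be plain lists of floating-point values. Receiving (step a) and transmitting (step e) are outside the sketch.

```python
def detect_silent_parts(voice, threshold=0.02, min_len=3):
    """Return (start, end) index pairs, end exclusive, where the voice
    amplitude stays below `threshold` for at least `min_len` samples.
    Both parameters are illustrative assumptions."""
    parts = []
    start = None
    for i, sample in enumerate(voice):
        if abs(sample) < threshold:
            if start is None:
                start = i  # a quiet run begins here
        else:
            if start is not None and i - start >= min_len:
                parts.append((start, i))
            start = None
    # A quiet run may extend to the end of the recording.
    if start is not None and len(voice) - start >= min_len:
        parts.append((start, len(voice)))
    return parts


def mix_voice_into_music(voice, music, silent_parts, ducked_gain=0.3):
    """Keep the music at full volume during silent parts, lower it to
    `ducked_gain` (assumed value) while the voice is active, then add
    the voice to the volume-controlled music."""
    n = min(len(voice), len(music))
    silent = [False] * n
    for start, end in silent_parts:
        for i in range(start, min(end, n)):
            silent[i] = True
    mixed = []
    for i in range(n):
        gain = 1.0 if silent[i] else ducked_gain
        mixed.append(voice[i] + gain * music[i])
    return mixed
```

In this sketch the gain switches abruptly at the edges of each silent part; a gradual down or up control would replace the two-level gain with a ramp.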
According to a still further aspect of the present invention, there is provided a method for synthesizing voice into music, comprising the steps of: a) receiving the voice from a user; b) detecting a silent part of the received voice; and c) according to the detected silent part, synthesizing the received voice into a plurality of music elements which form the music.
According to a still further aspect of the present invention, there is provided a method for synthesizing voice into music, comprising the steps of: a) receiving the voice from a user; b) detecting a silent part of the received voice; c) separating the received voice into a plurality of voice elements according to the detected silent part; and d) synthesizing the separated voice elements into the music.
According to a still further aspect of the present invention, there is provided a method for synthesizing voice into music, comprising the steps of: a) receiving the voice from a user; b) detecting a silent part of the received voice; c) separating the received voice into a plurality of voice elements according to the detected silent part; and d) synthesizing the separated voice elements into a plurality of music elements which form the music.
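The separating step c) can be illustrated with a short sketch. It assumes that the silent parts have already been detected and are given as (start, end) sample-index pairs with an exclusive end; the function name `split_voice` and this data layout are assumptions made for the example, not part of the specification.

```python
def split_voice(voice, silent_parts):
    """Cut the voice at each detected silent part and return the
    remaining non-silent stretches as separate voice elements.
    `silent_parts` holds (start, end) index pairs, end exclusive."""
    elements = []
    cursor = 0  # first sample not yet assigned to an element
    for start, end in sorted(silent_parts):
        if start > cursor:
            elements.append(voice[cursor:start])
        cursor = max(cursor, end)
    if cursor < len(voice):
        elements.append(voice[cursor:])
    return elements
```

Each returned element can then be synthesized into the music, or into a separate music element, independently of the others.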
Hereinafter, the preferred embodiments of the present invention will be described in detail with reference to the accompanying drawings.
As illustrated in
The receiving and transmitting unit (10) is coupled to the Internet, a mobile communication network, or a telecommunication network. It receives the user's voice and transmits a synthesized sound of the music and voice to a specific recipient.
A synthesis unit (20) synthesizes the received voice and the music selected by the user. Here, the synthesis does not mean a mere integration. As illustrated in
Before synthesizing the voice and the music, the synthesis unit (20) can separate the voice into a plurality of voice elements according to the voice silent parts. For instance, the voice can be separated into the plurality of voice elements at any voice silent part that lasts more than 1 second. Also, the whole length of the voice can be divided at a voice silent part. For instance, when the entire input voice lasts 30 seconds, the voice can be divided into two voice elements, a front and a rear voice element, at a voice silent part near the 15-second point of the input voice. At this time, when one of the front or rear voice elements has a blank (voice silent part) which is longer than the reference duration, the length of the blank can be reduced as illustrated in
During the communication, various noises can be produced and inputted. To remove such noises and obtain a clear voice source, a white noise that is present throughout the entire voice input, such as a circuit noise, can be removed, or frequencies other than the voice frequencies can be filtered off.
A database (30) stores many musical data. As illustrated in
Referring to
Referring to
As described above, the embodiment of the present invention explains only the case where the music is played in the background, but the voice can also be played with no background music.
The synthesis can be reserved as the user desires and the result sent to the designated recipient on a specified date, and the synthesis can be applied to a coloring, feeling, bell sound, or e-mail service. When provided through the web, the service of the present invention can offer basic comments, replay of the synthesized music and voice, and repeated recording of the voice and music.
On the other hand, the music referred to in the present invention includes pop songs, classical music, natural sounds, original soundtracks, and all other recorded sounds.
The present invention is described mainly as a server-based service, but it can also be provided through a client-based program. In that case, the music can be obtained from servers holding music contents, or be made or purchased by the user.
A control unit (100) performs a general control function in synthesis of the voice and music.
A filtering unit (160) samples the analog voice and converts the sampled analog voice signals to digital signals. The Fourier transform is applied to the converted signals so that the time-based data is converted into frequency-based data, and high or low frequencies that humans cannot produce are blocked so that only the human voice is inputted. The same result as this digital processing can also be obtained through analog filtering. That is, the filtering unit (160) removes the white noise, such as a circuit noise or a peripheral noise, that comes in regularly, so that only the pure voice required to be synthesized into the music is inputted. For example, in a space where fans are turning, a fan noise can be detected even though no voices are heard. In this case, a difference between a real voice input part and a noise input part can be detected and the white noise can be removed by using this difference. A first input signal (s) for a period of time T and a second input signal (s+S) for a period of time T+t can be used to remove the white noise (s) that comes in regularly. Also, the filtering unit can be used to remove a peak noise. When a loud sound (a big signal that exceeds the regular amplitude) abruptly comes in on the time axis, such a loud sound can be removed by filtering off the corresponding peaks in the filtering unit.
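The frequency-domain blocking and the peak removal performed by the filtering unit can be sketched as below. This is an illustration under stated assumptions: a naive discrete Fourier transform is used so that the example needs only the standard library (a real system would use a fast transform), the pass band limits are supplied by the caller because the specification does not give cut-off frequencies, peaks are simply clamped to a limit, and the function names `band_limit` and `clip_peaks` are hypothetical.

```python
import cmath
import math


def band_limit(samples, sample_rate, low_hz, high_hz):
    """Transform to the frequency domain, zero every component outside
    [low_hz, high_hz], and transform back to the time domain."""
    n = len(samples)
    # Forward transform: time-based data to frequency-based data.
    spectrum = [
        sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]
    for k in range(n):
        # Bins above n/2 mirror the lower bins (negative frequencies).
        freq = min(k, n - k) * sample_rate / n
        if not (low_hz <= freq <= high_hz):
            spectrum[k] = 0
    # Inverse transform; the imaginary part is rounding residue only.
    return [
        (sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n).real
        for t in range(n)
    ]


def clip_peaks(samples, limit):
    """Clamp abrupt loud samples to +/- `limit` (assumed threshold)."""
    return [max(-limit, min(limit, s)) for s in samples]
```

For example, with a one-second signal sampled at 64 Hz that mixes a 4 Hz and a 20 Hz sine wave, `band_limit(signal, 64, 2, 8)` keeps the 4 Hz component and removes the 20 Hz one.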
A voice separating unit (140) separates the entire voice data into a plurality of voice elements according to the whole time frame of the input voice and a voice silent part from a voice silent control unit (130). For example, when a voice is inputted shown in
When a length of the input voice signal is shorter than a predetermined length, the voice silent control unit (130) can recognize it as a voice silent part which is not inputted by the user. In determining the voice silent part, a certain length of the voice silent part should be recognized as a blank, as well as existence of the signal. According to the length of a voice silent input, the blank should be detected. The voice silent control unit (130) aids the separation of input voice. That is, as shown in
A storage unit (120) stores the voice input, the separated voice, the background music and the synthesized file.
A synthesis unit (150) synthesizes the stored voice and music through digital signal processing under the control of the control unit (100). The volumes of the synthesized voice and music are controlled. A volume level that is lower or higher than the average level is amplified or reduced, respectively, to aid listening. The beginning of the music can be left at its original volume or can be faded in. Also, the music can be faded out at the end. A down control is used at the beginning of each voice element and an up control is used at the end of the voice element to recover the original volume setting. Fast-forward, fast-rewind and rewind functions can be provided for convenience.
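The fade-in, fade-out and down/up controls described above can be expressed as a gain envelope applied to the music. The following is a minimal sketch assuming linear ramps and a single voice element; the ramp length, fade length, lowered gain value and all function names are illustrative assumptions, and `ramp` and `fade_len` must be greater than zero.

```python
def fade_gain(i, total, fade_len):
    """Fade in over the first `fade_len` samples and fade out over the
    last `fade_len` samples of a piece with `total` samples."""
    fade_in = min(1.0, i / fade_len)
    fade_out = min(1.0, (total - 1 - i) / fade_len)
    return min(fade_in, fade_out)


def duck_gain(i, voice_start, voice_end, ramp, low_gain):
    """Down control before the voice element starts, lowered gain while
    it plays, up control after it ends to restore the original volume."""
    if i < voice_start - ramp or i >= voice_end + ramp:
        return 1.0
    if i < voice_start:
        progress = (i - (voice_start - ramp)) / ramp  # 0.0 up to 1.0
        return 1.0 + (low_gain - 1.0) * progress
    if i < voice_end:
        return low_gain
    progress = (i - voice_end) / ramp
    return low_gain + (1.0 - low_gain) * progress


def music_envelope(total, voice_start, voice_end, ramp=2, low_gain=0.25, fade_len=4):
    """Combine both controls into one gain value per music sample."""
    return [
        fade_gain(i, total, fade_len) * duck_gain(i, voice_start, voice_end, ramp, low_gain)
        for i in range(total)
    ]


def apply_envelope(music, envelope):
    """Scale each music sample by its gain."""
    return [m * g for m, g in zip(music, envelope)]
```

The voice samples would then be added to the output of `apply_envelope` between `voice_start` and `voice_end`.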
When the length of the stored voice exceeds that of the music, the same music can be repeated or other music can be mixed in the background.
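Repeating the same music until it covers the voice can be sketched in a few lines. The function name `loop_music` and the use of sample lists are assumptions for the example; mixing in a different piece of music instead is not covered here.

```python
def loop_music(music, target_len):
    """Repeat `music` as often as needed and trim it to `target_len`
    samples, so that it is at least as long as the stored voice."""
    if not music:
        raise ValueError("music must contain at least one sample")
    repeats = -(-target_len // len(music))  # ceiling division
    return (music * repeats)[:target_len]
```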
Hereinafter, an embodiment of the present invention, in which the voice input is separated into two voice elements and the two voice elements are synthesized with two music sources, will be described referring to the
When the user stores his voice as shown in
As shown in
Referring to
On the other hand, when the length of the music is shorter than that of the voice element such that the music ends at point T4, other music data should be subsequently synthesized. At this time, the starting part of the second music is overlapped with the ending part of the first music so that there is no noticeable volume variation. As illustrated at part E of
At point T5 where the synthesis of the second voice element is terminated, the music volume is up-controlled. Thereafter, the music is faded out from down point 3′ after a predetermined time or from down point 4′ after the lapse of the predetermined time.
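The overlap between the ending part of the first music and the starting part of the second music, described above for the case where the first music ends early, can be sketched as a linear crossfade. The linear weighting, the overlap length and the function name `crossfade` are assumptions for illustration; the specification only requires that the overlap avoid a noticeable volume variation.

```python
def crossfade(first, second, overlap):
    """Join two pieces of music so that the last `overlap` samples of
    `first` blend into the first `overlap` samples of `second`."""
    if overlap <= 0 or overlap > min(len(first), len(second)):
        raise ValueError("overlap must fit inside both pieces")
    head = first[:-overlap]   # part of the first music played alone
    tail = second[overlap:]   # part of the second music played alone
    blended = []
    for i in range(overlap):
        weight = (i + 1) / (overlap + 1)  # share given to the second music
        a = first[len(first) - overlap + i]
        b = second[i]
        blended.append((1.0 - weight) * a + weight * b)
    return head + blended + tail
```

Because the two weights always sum to one, two pieces of equal loudness keep a constant level across the overlap.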
At step 200, when a user is coupled to a communication network (a mobile communication network, a wired communication network or the Internet), an identification procedure for the user is performed. If the user requests a synthesis service, the process goes to step 220; otherwise, it goes to step 211 to execute other predetermined procedures.
At step 220, the user inputs his or her voice via the coupled communication network. The voice can be inputted through a handheld phone, a wired telephone, or a microphone installed in a computer. As set forth above, the user can directly divide the voice input into several elements according to information from a service provider, or a server can divide the entire voice into a plurality of voice elements with reference to the length of the voice and a silent part. It is also possible to use only one voice element in the synthesis. At step 230, the divided voice elements are synthesized by the synthesis unit (20) using the above-mentioned down points and the introduction, bridge and ending elements of the music. At step 240, the user confirms the required service and billing for the service is executed. For example, if the synthesized sound is a voice message, information about the transmission time of the message and its receiver may be inputted to the server. At step 250, the corresponding message is transmitted to the receiver and a confirmation of the transfer is sent to the user. In the case where the synthesized sound is a voice message, the service provider can call the receiver at the time reserved by the user and transmit an information message, for instance, “This is a DJ mail message from 1234 to 5678.”
When the synthesized sound is a bell sound or a coloring (music that is heard by a caller), it can be set up in the user's phone or at the telephone exchange, or it can be downloaded to the phone via a bell sound download function. The set-up information is sent to the user in a short message.
As is apparent from the above, the synthesis according to the present invention gives the user the maximum mixing effect by adaptively synthesizing the voice and music. This mixing is carried out with automatic volume control in the synthesizer.
Number | Date | Country | Kind |
---|---|---|---|
10-2005-0004609 | Jan 2005 | KR | national |
10-2006-0002103 | Jan 2006 | KR | national |
Filing Document | Filing Date | Country | Kind | 371c Date |
---|---|---|---|---|
PCT/KR06/00170 | 1/17/2006 | WO | 00 | 7/18/2007 |