The present application is based on, and claims priority from, Taiwan Patent Application Ser. No. 111100798, filed Jul. 1, 2022, the disclosure of which is hereby incorporated by reference herein in its entirety.
The present invention relates to noise reduction and, more specifically, to a method of noise reduction for an intelligent network communication.
Conventional background noise cancellation technologies are mostly used in telephone communication or headphones. Their main purpose is to prevent background or ambient noise from degrading communication quality or the sound quality of headphones. At present, most of the background noise cancellation technologies used by intelligent devices based on voice interaction are derived from the existing technologies of traditional telephone communication. These technologies include spectral subtraction, Wiener filtering and adaptive noise cancellation.
Spectral subtraction estimates the mean noise amplitude from the non-speech segments and subtracts it from the amplitude of the speech segments, thereby eliminating the noise. The method performs poorly on non-stationary noise: the noise elimination easily distorts the speech, which lowers the speech recognition rate.
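The spectral subtraction described above can be illustrated with a minimal sketch. It assumes the signal has already been split into frames and converted to magnitude spectra; the function names `estimate_noise_magnitude` and `spectral_subtract`, the `floor` parameter, and the sample values are hypothetical and are not part of the conventional method as cited.

```python
def estimate_noise_magnitude(noise_frames):
    """Average the magnitude spectra of non-speech frames, bin by bin."""
    n_frames = len(noise_frames)
    n_bins = len(noise_frames[0])
    return [sum(frame[k] for frame in noise_frames) / n_frames
            for k in range(n_bins)]


def spectral_subtract(speech_frame, noise_mean, floor=0.0):
    """Subtract the noise estimate from one speech frame, clamping at `floor`."""
    return [max(mag - noise, floor)
            for mag, noise in zip(speech_frame, noise_mean)]


# Hypothetical magnitude spectra (three frequency bins per frame).
noise_frames = [[1.0, 2.0, 1.0], [3.0, 2.0, 1.0]]  # non-speech segments
speech_frame = [10.0, 3.0, 0.5]                     # one speech segment

noise_mean = estimate_noise_magnitude(noise_frames)
cleaned = spectral_subtract(speech_frame, noise_mean)
```

Because the noise estimate is a fixed average, any bin whose noise changes after the estimate was taken is either under- or over-subtracted, which is the source of the distortion noted above.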
Wiener filtering uses the transfer function of a Wiener filter to convolve the mean noise amplitude with the amplitude of the speech segments, so as to obtain the amplitude information of the noise-eliminated signal. Wiener filtering does not cause serious speech distortion and can effectively suppress noise that is stationary or varies only within a small range. However, the method estimates the mean noise by calculating the statistical average of the noise power spectrum during silent periods. This estimate rests on the premise that the noise power spectrum does not change much before and after the speech is produced. Therefore, for non-stationary noise with large variations, the method cannot achieve high noise reduction performance.
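One common frequency-domain form of the Wiener filter can be sketched as follows. This is an illustrative assumption rather than the exact formulation referred to above: the per-bin gain is taken as the estimated speech power divided by the noisy power, with the noise power assumed to come from averaging silent periods. The function name `wiener_gain` and the sample values are hypothetical.

```python
def wiener_gain(noisy_power, noise_power):
    """Per-bin gain: estimated speech power over noisy power, within [0, 1]."""
    gains = []
    for p_y, p_n in zip(noisy_power, noise_power):
        p_s = max(p_y - p_n, 0.0)  # estimated clean speech power
        gains.append(p_s / p_y if p_y > 0.0 else 0.0)
    return gains


# Hypothetical power spectra (three frequency bins).
noisy_power = [4.0, 1.0, 0.0]  # power of the noisy speech frame
noise_power = [1.0, 2.0, 1.0]  # average noise power from silent periods

gains = wiener_gain(noisy_power, noise_power)
```

The gain only stays accurate while `noise_power` still describes the current noise, which matches the limitation stated above for noise with large variations.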
Another ambient noise cancellation method commonly used in smart devices is adaptive noise cancellation with a directional microphone. This method uses an omnidirectional microphone to collect the ambient noise and a directional microphone to collect the user's voice, and then performs adaptive noise cancellation on the two signals to obtain a pure voice signal.
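A minimal sketch of two-microphone adaptive noise cancellation is given below, assuming a least-mean-squares (LMS) update, which is one widely used adaptive algorithm; the text above does not name a specific one. Here `primary` stands for the directional-microphone signal and `reference` for the omnidirectional-microphone signal; the function name, `taps`, and `mu` are hypothetical.

```python
import math


def lms_cancel(primary, reference, taps=4, mu=0.05):
    """Subtract an adaptively filtered copy of `reference` from `primary`."""
    weights = [0.0] * taps
    history = [0.0] * taps  # most recent reference samples, newest first
    output = []
    for d, x in zip(primary, reference):
        history = [x] + history[:-1]
        noise_estimate = sum(w * h for w, h in zip(weights, history))
        error = d - noise_estimate  # noise-reduced output sample
        weights = [w + 2.0 * mu * error * h
                   for w, h in zip(weights, history)]
        output.append(error)
    return output


# Hypothetical case with the speaker silent: the primary microphone picks up
# only a scaled copy of the ambient noise seen by the reference microphone.
reference = [math.sin(0.3 * n) for n in range(2000)]
primary = [0.5 * x for x in reference]
residual = lms_cancel(primary, reference)
```

In this silent case the residual shrinks toward zero as the weights adapt; when speech is present, the same loop leaves the speech in `error` because the speech is not correlated with the reference noise.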
In addition, remote video conferencing has become increasingly popular. In one-way or multi-party meetings, a common problem is that the volume of the sound sources varies from place to place, which results in poor quality of the sound output at the main meeting venue. Often, the other places can only adjust their own volume to match that of the main meeting venue. This not only prolongs the setup time, but also prevents the meeting from proceeding smoothly. Moreover, in most video conferences, the receiver often receives an echo, which not only interferes with the sender, but also affects the audio message of the receiver. Echo is the most common noise, and it is strongest in small rooms. The present invention has been developed in order to achieve good suppression of echo and noise.
Based on the above, noise reduction for intelligent network communication has become an important task in many fields. For example, a database of the voice characteristics, models or features of the conference participants is established to improve the quality and effect of sound reception, so as to achieve the purpose of the present invention.
The purpose of the present invention is to provide a video conference system with anti-echo function for improving the audio quality and effect of the video conference.
According to one aspect of the present invention, reverse phase noise may be created in the transceiver devices of the conference participants when the voice pauses. Based on the principle of destructive interference, this sound interval method can effectively filter out the background noise. It should be understood that the reverse phase noise can completely offset (counterbalance) the noise of the noise source, or can partially offset it.
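The destructive interference mentioned above can be illustrated with a short sketch. It assumes time-aligned digital samples; the function names `reverse_phase` and `mix`, the `gain` parameter, and the sample values are hypothetical. A gain of 1.0 models a complete offset and a smaller gain models a partial offset.

```python
def reverse_phase(noise, gain=1.0):
    """Return the noise with inverted phase, optionally scaled by `gain`."""
    return [-gain * s for s in noise]


def mix(a, b):
    """Sum two equally long signals sample by sample."""
    return [x + y for x, y in zip(a, b)]


noise = [0.2, -0.5, 0.3]  # hypothetical ambient noise samples

full_offset = mix(noise, reverse_phase(noise))            # all zeros
partial_offset = mix(noise, reverse_phase(noise, 0.5))    # half amplitude
```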
According to another aspect of the present invention, a method of noise reduction for an intelligent network communication is provided, which comprises the following steps. First, a first local sound message is received by a voice receiver of a communication device at a transmitting end, wherein the first local sound message includes a voice emitted by a speaker. Then, voice characteristics of the speaker are captured by a voice recognizer. Next, a second local sound message is received by the voice receiver, wherein the second local sound message includes the voice of the speaker. In the following step, the second local sound message is compared with the voice characteristics of the speaker by a control device. Finally, all signals in the second local sound message except those matching the voice characteristics of the speaker are filtered out by a voice filter to obtain an original voice emitted by the speaker.
According to one aspect of the present invention, the voice characteristics of the speaker are stored in a voice database, and the voice characteristics of the speaker comprise voice frequency, timbre and accent. After the filtering process is finished, a voice signal from the speaker is transmitted to a second communication device at a receiving end through a wireless transmission device and/or a network transmission device, and the voice signal from the speaker is produced in the second communication device at the receiving end.
According to another aspect of the present invention, a method of noise reduction for an intelligent network communication is provided, which comprises the following steps. First, a first local sound message is received by a voice receiver of a first communication device at a transmitting end, wherein the first local sound message includes a voice emitted by a speaker. Then, the first local sound message is transmitted by a wireless transmission device and/or a network transmission device to a second communication device at a receiving end. Next, voice characteristics of the speaker are captured by a voice recognizer of the second communication device at the receiving end. Subsequently, a second local sound message is received by the second communication device, wherein the second local sound message includes the voice of the speaker. In the following step, the second local sound message is compared with the voice characteristics of the speaker by a control device of the second communication device. Finally, all signals in the second local sound message except those matching the voice characteristics of the speaker are filtered out by a voice filter of the second communication device to obtain an original voice emitted by the speaker.
According to yet another aspect of the present invention, a method of noise reduction for an intelligent network communication is provided, which comprises the following steps. First, a local ambient noise is received by a voice receiver of a communication device at a transmitting end. Then, a waveform of the ambient noise received through the voice receiver is identified by a voice recognizer. Next, an energy level of the ambient noise is determined by a control device to obtain a sound interval. Subsequently, after the sound interval is obtained, a local sound message is received by the voice receiver of the communication device at the transmitting end. Finally, the waveform signal of the ambient noise is filtered out by a voice filter to obtain an original sound emitted by the speaker.
According to an aspect of the present invention, after the filtering process is finished, a voice signal from the speaker is transmitted to a second communication device at a receiving end through a wireless transmission device and/or a network transmission device, and the voice signal from the speaker is produced in the second communication device at the receiving end.
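The step of determining an energy level to obtain a sound interval can be sketched as follows. The sketch assumes that a sound interval is a run of frames whose mean energy stays below a threshold, that is, a pause in which only ambient noise is present; this interpretation, the function names, `frame_len`, and `threshold` are assumptions made for illustration.

```python
def frame_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(s * s for s in frame) / len(frame)


def find_sound_intervals(samples, frame_len, threshold):
    """Return (start, end) sample indices of runs of low-energy frames."""
    intervals = []
    start = None
    n_frames = len(samples) // frame_len
    for i in range(n_frames):
        frame = samples[i * frame_len:(i + 1) * frame_len]
        quiet = frame_energy(frame) < threshold
        if quiet and start is None:
            start = i * frame_len           # a quiet run begins
        elif not quiet and start is not None:
            intervals.append((start, i * frame_len))  # the quiet run ends
            start = None
    if start is not None:
        intervals.append((start, n_frames * frame_len))
    return intervals


# Hypothetical signal: loud, then a pause with low-level noise, then loud.
samples = [1.0] * 4 + [0.01] * 8 + [1.0] * 4
intervals = find_sound_intervals(samples, frame_len=4, threshold=0.1)
```

The samples inside each returned interval could then serve as the ambient noise waveform to be identified and filtered in the later steps.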
According to another aspect of the present invention, a computer program/algorithm is used to determine based on a voice database whether there is a corresponding or similar voice characteristic of the speaker recognized by the voice recognizer.
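The database lookup described above can be sketched as a similarity search. The sketch assumes that each speaker's voice characteristics are stored as a numeric feature vector and that cosine similarity with a fixed threshold decides whether a stored entry is "corresponding or similar"; the function names, the threshold value, and the database contents are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two feature vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0.0 else 0.0


def find_speaker(features, voice_database, threshold=0.9):
    """Return the best-matching speaker name, or None if nothing is similar enough."""
    best_name, best_score = None, threshold
    for name, stored in voice_database.items():
        score = cosine_similarity(features, stored)
        if score >= best_score:
            best_name, best_score = name, score
    return best_name


# Hypothetical voice database with two enrolled speakers.
voice_database = {"alice": [1.0, 0.0, 0.0], "bob": [0.0, 1.0, 0.0]}

match = find_speaker([0.9, 0.1, 0.0], voice_database)     # similar to "alice"
no_match = find_speaker([0.0, 0.0, 1.0], voice_database)  # no similar entry
```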
The components, characteristics and advantages of the present invention may be understood by the detailed descriptions of the preferred embodiments outlined in the specification and the drawings attached:
Some preferred embodiments of the present invention will now be described in greater detail. However, it should be recognized that the preferred embodiments of the present invention are provided for illustration rather than limiting the present invention. In addition, the present invention can be practiced in a wide range of other embodiments besides those explicitly described, and the scope of the present invention is not expressly limited except as specified in the accompanying claims.
As shown in
The voice recognizer 104 is used to recognize the features of sound and audio. As shown in
Please refer to
As described above, the voice recognizer 104 of the present invention is used for audio classification and recognizes the voice characteristics of the speakers, including voice frequency, timbre, accent, and other voice models or characteristics. First, the speaker's voice signal is input and the audio characteristics are extracted by a feature extraction method. Then, the parameters of the audio characteristics are normalized to serve as the inputs of the audio classification processing. The recognition system is trained with these known inputs, and the audio characteristics of the speakers are obtained after the training.
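The normalization step can be illustrated with a short sketch. It assumes z-score normalization (zero mean and unit standard deviation per feature), which is one common choice; the text above does not specify the normalization scheme. The function name and the feature values are hypothetical.

```python
import math


def normalize_features(rows):
    """Scale each feature column to zero mean and unit standard deviation."""
    n = len(rows)
    dims = len(rows[0])
    means = [sum(r[d] for r in rows) / n for d in range(dims)]
    stds = []
    for d in range(dims):
        var = sum((r[d] - means[d]) ** 2 for r in rows) / n
        stds.append(math.sqrt(var) or 1.0)  # constant columns stay at zero
    return [[(r[d] - means[d]) / stds[d] for d in range(dims)] for r in rows]


# Hypothetical extracted features: [voice frequency in Hz, a constant feature].
features = [[100.0, 1.0], [200.0, 1.0], [300.0, 1.0]]
normalized = normalize_features(features)
```

Normalizing puts features with very different ranges on a comparable scale before they are passed to the classifier for training.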
As shown in
Through the processing of the control device 102, the voice characteristics of the speakers are stored in a voice database 106. The voice database 106 stores the preset voice characteristics of the speakers. When the conference is initiated, the communication device 100 at the transmitting end receives a second local sound message including the voice of the speakers. Through the processing of the control device 102, the second local sound message is compared with the voice characteristics of the speakers from the voice database 106. In order to transmit the speaker's original voice cleanly, it is necessary to remove or reduce ambient noise and echo. The voice filter 108 filters out all signals in the second local sound message except the signal matching the speaker's voice characteristics, so as to obtain the original voice emitted by the speaker. For example, the voice filter 108 is a Kalman filter, which uses the speaker's voice model and the ambient noise model to filter the noise (ambient noise and echo) from the local audio signal, so as to provide the filtered signal to the receiver's communication devices 100, 100a, 100b, . . . , 100c. After the voice characteristics are captured by the voice recognizer 104 at the transmitting end and the noise is filtered by the voice filter 108, the original voice signal of the speaker is transmitted, wirelessly or by wire, to the communication device of the receiver through the wireless transmission device 112 and/or the network transmission device 120. Then, in the receiver's communication device, through the conversion of the analog-to-digital converter 122, the original voice emitted by the speaker is output from the speaker 116. As another example, the speaker voice model stored in the voice database 106 can be received from a remote server or remote device through the wireless transmission device 112 and/or the network transmission device 120. The voice database 106 may also be stored in the storage device 114.
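As a rough illustration of Kalman filtering, the sketch below uses a scalar filter with a random-walk signal model. It is a simplified stand-in, not the configuration of the voice filter 108: `process_var` loosely plays the role of the speaker's voice model (how quickly the clean signal may change) and `noise_var` the role of the ambient noise model. All names and values are hypothetical.

```python
def kalman_filter(measurements, process_var, noise_var,
                  initial=0.0, initial_var=1.0):
    """Scalar Kalman filter assuming a random-walk model of the clean signal."""
    estimate, variance = initial, initial_var
    filtered = []
    for z in measurements:
        variance += process_var                  # predict: uncertainty grows
        gain = variance / (variance + noise_var)
        estimate += gain * (z - estimate)        # update with the measurement
        variance *= (1.0 - gain)
        filtered.append(estimate)
    return filtered


# Hypothetical noisy observations of a constant level of 1.0.
measurements = [1.5, 0.5] * 100
filtered = kalman_filter(measurements, process_var=0.01, noise_var=1.0)
```

A larger `noise_var` relative to `process_var` makes the filter trust each new measurement less, so more of the fluctuation is treated as noise and smoothed away.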
As shown in
The voice characteristics of the speaker (the speaker's voice model) in the voice database 106 can be received through the wireless transmission device 112 and/or the network transmission device 120. In one example, the voice characteristics of the speaker (the speaker's voice model) are set in the application (APP) 118 and transmitted externally to the wireless transmission device 112 and/or the network transmission device 120 through a wireless or wired network. The voice database 106 is integrated into the APP 118. For example, the wireless networks include various wireless specifications such as Bluetooth, WLAN or Wi-Fi. In one embodiment, the voice recognition APP on the communication device 100 turns the noise elimination function on or off to achieve the best noise elimination effect.
As shown in
As shown in
As shown in
Finally, in step 516, the voice signal from the speaker is produced in the communication device at the receiving end. Through the conversion of the analog-to-digital converter 122, the original voice emitted by the speaker is output through the speaker 116. For example, the analog-to-digital converter 122 may be built into or external to the control device 102.
The communication devices 100, 100a, 100b, . . . , 100c are configured to communicate with external devices, which may be external computing devices, computing systems, mobile devices (smart phones, tablets, smart watches), or other types of electronic devices.
The external devices include a computing core, a user interface, an Internet interface, a wireless communication transceiver and a storage device. The user interface includes one or more input devices (e.g., keyboard, touch screen, voice input device), one or more audio output devices (e.g., speaker) and/or one or more visual output devices (e.g., video graphics display, touch screen). The Internet interface includes one or more networking devices (e.g., wireless local area network (WLAN) devices, wired LAN devices, wireless wide area network (WWAN) devices). The storage device includes a flash memory device, one or more hard disk drives, one or more solid-state storage devices and/or cloud storage devices.
The computing core includes processors and other computing core components. Other computing core components include video graphics processors, memory controllers, main memory (e.g., RAM), one or more input/output (I/O) device interface modules, input/output (I/O) interfaces, input/output (I/O) controllers, peripheral device interfaces, one or more USB interface modules, one or more network interface modules, one or more memory interface modules, and/or one or more peripheral device interface modules.
The external device processes the data transmitted by the wireless transmission device 112 and/or the network transmission device 120 to produce various results.
As will be understood by persons skilled in the art, the foregoing preferred embodiment of the present invention illustrates the present invention rather than limiting the present invention. Having described the invention in connection with a preferred embodiment, modifications will be suggested to those skilled in the art. Thus, the invention is not to be limited to this embodiment, but rather the invention is intended to cover various modifications and similar arrangements included within the spirit and scope of the appended claims, the scope of which should be accorded the broadest interpretation, thereby encompassing all such modifications and similar structures. While the preferred embodiment of the invention has been illustrated and described, it will be appreciated that various changes can be made without departing from the spirit and scope of the invention.
Number | Date | Country | Kind |
---|---|---|---|
111100798 | Jan 2022 | TW | national |