This application is based upon and claims the benefit of priority of the prior Japanese Patent Application No. 2018-216027, filed on Nov. 16, 2018, the entire contents of which are incorporated herein by reference.
The embodiments discussed herein are related to a non-transitory computer-readable storage medium storing a noise suppression program, a noise suppression method, and a noise suppression apparatus.
Voice recognition is used in wide-ranging fields represented by voice-to-text conversion, that is, so-called dictation, as well as voice assistants, voice translation, and the like mounted on smartphones and smart speakers. For example, in a case where voice recognition is performed, noise included in the input sound may be a factor in a decrease in the recognition rate in some cases.
An example of a technology for suppressing such noise includes the following noise removal system. In this noise removal system, a short-time spectrum is calculated by performing Fourier transform of a frame signal cut out from an input signal. A noise spectrum is estimated from the short-time spectrum in a silence segment in the noise removal system. In the noise removal system, after a start point of voice is detected, a value obtained by multiplying the noise spectrum estimated in the last silence segment by a spectrum subtraction coefficient is subtracted from the short-time spectrum to perform noise removal.
Examples of the related art include Japanese Laid-open Patent Publication Nos. 2015-170988, 2015-177447, and 8-221092.
According to an aspect of the embodiments, a noise suppression method performed by a computer includes: obtaining input sound; detecting a cycle of power change in a non-voice segment included in the input sound; calculating a correction amount that periodically changes and is applied to a voice segment included in the input sound based on the cycle; and correcting power in at least the voice segment based on the correction amount.
The object and advantages of the invention will be realized and attained by means of the elements and combinations particularly pointed out in the claims.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory and are not restrictive of the invention.
However, according to the above-mentioned technology, it is difficult to suppress periodic noise where power periodically changes.
According to one aspect, the present disclosure aims at providing a noise suppression program with which periodic noise included in input sound may be suppressed, a noise suppression method, and a noise suppression apparatus.
Hereinafter, a noise suppression program, a noise suppression method, and a noise suppression apparatus according to the present application are described with reference to the accompanying drawings. The embodiments discussed herein are not intended to limit the technology of the disclosure. The embodiments may be appropriately combined within a range where processing contents do not conflict.
[Stationary Noise and Periodic Noise]
“Stationary noise” where a power level does not change may be superimposed over voice in some cases. For example, sounds such as the rotating sound of a fan or motor and the hum noise of a machine are exemplified as the stationary noise. “Periodic noise” where power periodically changes may be superimposed over the voice in addition to the above-mentioned stationary noise. For example, noise having a cycle longer than the frame length into which a voice signal is divided, such as the operating sound of an air conditioner, corresponds to the periodic noise.
Points of similarity and difference between the stationary noise and the periodic noise will be described.
As represented by the graphic representation in
[One Aspect of Problem]
As described in the above-mentioned background art section, the above-mentioned periodic noise is not assumed in the above-mentioned noise removal system, and suppression as a countermeasure therefor is also not assumed. For this reason, in the above-mentioned noise removal system, noise is estimated under an assumption that noise power remains the same in a voice segment. Therefore, in a case where an input signal includes the above-mentioned periodic noise, an error occurs between the noise whose power is estimated to remain the same and the periodic noise whose power periodically changes. In a case where the error occurs in the noise estimation as described above, distortion may occur in the voice due to excessive suppression, or residue of the noise may occur due to insufficient suppression in some cases.
As represented by the broken line in
As represented by the broken line in
In the above-mentioned noise removal system, since the distortion of the output sound occurs due to the estimation error of the periodic noise as described above, the periodic noise superimposed over the input sound is not suppressed in some cases.
[One Aspect of Approach to Solve the Problem]
In view of the above, the noise suppression apparatus 10 according to the present embodiment does not adopt an approach for estimating the noise based on the assumption that the noise power remains the same in the voice segment. That is, for example, the noise suppression apparatus 10 according to the present embodiment estimates the periodic noise in the voice segment based on a cycle of power change in the noise segment before the voice segment is detected from the input sound and suppresses the periodic noise included in the input sound.
As illustrated in
Therefore, it is possible to suppress the periodic noise included in the input sound in accordance with the noise suppression apparatus 10 according to the present embodiment.
For example, it is possible to suppress the periodic noise included in the input sound illustrated in
In
[One Example of Functional Configuration]
As illustrated in FIG. 1, the noise suppression apparatus 10 includes an obtaining unit 11, a transform unit 12A, an inverse transform unit 12B, a voice segment detection unit 13, and a power calculation unit 14. The noise suppression apparatus 10 further includes a stationary noise estimation unit 15, a periodic noise determination unit 16, a periodic noise estimation unit 17, a gain calculation unit 18, and a suppression unit 19. The noise suppression apparatus 10 may also include various functional units included in related-art computers in addition to the functional units illustrated in
The functional units corresponding to the respective blocks illustrated in
The example in which the above-mentioned noise suppression program is executed has been described merely as one aspect, but the configuration is not limited to this. For example, the above-mentioned noise suppression program may be executed as package software in which functions corresponding to services such as voice recognition, voice recognition AI assistant, and voice translation are packaged.
Although the CPU and the MPU are exemplified as one example of the processor, the functional units described above may be implemented by any processor regardless of whether the processor is a general-purpose type or a specific type. In addition, the functional units described above may be implemented by a hard-wired logic circuit such as an application specific integrated circuit (ASIC) or a field-programmable gate array (FPGA).
The obtaining unit 11 is a processing unit configured to obtain input sound.
The obtaining unit 11 obtains, as the input sound, a signal transformed from an acoustic wave by a microphone that is not illustrated, merely as one example. A source from which the obtaining unit 11 obtains the input sound may be arbitrary and is not limited to the microphone. For example, the obtaining unit 11 may also obtain the input sound by reading out the input sound from an auxiliary storage device such as a hard disc or an optical disc that accumulates sound data, or from a removable medium such as a memory card or a Universal Serial Bus (USB) memory. In addition, the obtaining unit 11 may also obtain stream data of voice received from an external apparatus via a network as the input sound.
The transform unit 12A is a processing unit configured to transform a frame of the input sound from a time domain into a frequency domain.
As one embodiment, each time the obtaining unit 11 obtains a frame of the input sound, the transform unit 12A applies Fourier transform, represented by fast Fourier transform (FFT), to the frame of the input sound to obtain FFT coefficients having an increment of a predetermined frequency. When a sampling frequency of the input sound is 16 kHz, merely as one example, a frame length used for the FFT analysis may be set to approximately 512 samples.
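For reference, the following is a minimal sketch, not part of the embodiment itself, of how such framing and FFT analysis might be implemented in Python with NumPy; the 16 kHz sampling frequency and the 512-sample frame length follow the example above, while the function name and the non-overlapping framing are assumptions for illustration.

import numpy as np

SAMPLE_RATE = 16000   # Hz, per the example above
FRAME_LEN = 512       # samples per frame used for the FFT analysis

def frames_to_spectra(signal):
    # Split the input sound into non-overlapping frames and FFT each frame.
    n_frames = len(signal) // FRAME_LEN
    spectra = []
    for f in range(n_frames):
        frame = signal[f * FRAME_LEN:(f + 1) * FRAME_LEN]
        # rfft yields coefficients at increments of SAMPLE_RATE / FRAME_LEN Hz
        spectra.append(np.fft.rfft(frame))
    return np.array(spectra)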
The voice segment detection unit 13 is a processing unit configured to detect the voice segment.
As one aspect, the voice segment detection unit 13 may detect the voice segment by manually accepting a specification of a period via a user interface that is not illustrated. This user interface may be realized by hardware such as a physical switch or realized by software via display of a touch panel or the like. For example, the frame of the input sound input from the obtaining unit 11 during a period in which a press operation of a button continues may be identified as the voice segment. In addition, the frame of the input sound input from the obtaining unit 11 during a period in which the press operation is performed at start and end timings of the voice segment may be identified as the voice segment.
As another aspect, the voice segment detection unit 13 may also estimate the voice segment from the input sound. For example, the voice segment detection unit 13 may detect start and end of the voice segment based on an amplitude and zero-crossing of the waveform of the input sound, or may calculate a voice likelihood and a non-voice likelihood in accordance with a Gaussian mixture model (GMM) for each frame of the input sound and detect the voice segment from a ratio of these likelihoods. The voice segment may also be detected by using the technology of Japanese Laid-open Patent Publication No. 8-221092 or the like.
In accordance with the detection of these voice segments, each frame of the input sound is labeled as the voice segment or the non-voice segment. Hereinafter, the non-voice segment, that is, a segment identified as not being the voice segment in the temporal waveform of the input sound, may be referred to as a “noise segment” in some cases.
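For reference, the following is a minimal sketch of an amplitude (energy) based frame labeling, one of the approaches mentioned above; the energy threshold is a hypothetical value, and the zero-crossing and GMM likelihood-ratio variants are not shown.

import numpy as np

ENERGY_THRESHOLD = 1.0e-3   # hypothetical value; tuned in practice

def is_voice_frame(frame):
    # Label a frame as voice when its mean squared amplitude exceeds the threshold.
    energy = float(np.mean(np.asarray(frame, dtype=np.float64) ** 2))
    return energy > ENERGY_THRESHOLD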
The power calculation unit 14 is a processing unit configured to calculate power of the frame of the input sound.
As one embodiment, the power calculation unit 14 calculates the power of the frame, for each frequency band, based on the frequency analysis result of the frame on which the transform unit 12A has executed the FFT. For example, when a current frame is set as “f” and a frequency band is set as “b”, the power calculation unit 14 may calculate the power I2[f, b] in the current frame f by calculating a squared sum of the real number parts and the imaginary number parts of the FFT coefficients included in the frequency band b. A band width of the frequency band may be set to approximately 100 Hz merely as an example.
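A minimal sketch of this per-band power calculation is shown below; the 100 Hz band width and the 16 kHz/512-sample FFT layout follow the examples above, while the band-to-coefficient mapping is an assumption for illustration.

import numpy as np

BAND_WIDTH_HZ = 100.0               # example band width from the text
FREQ_STEP_HZ = 16000.0 / 512        # frequency increment of the FFT coefficients

def band_power(spectrum, b):
    # I2[f, b]: squared sum of real and imaginary parts of the coefficients in band b.
    lo = int(b * BAND_WIDTH_HZ / FREQ_STEP_HZ)
    hi = int((b + 1) * BAND_WIDTH_HZ / FREQ_STEP_HZ)
    coeffs = spectrum[lo:hi]
    return float(np.sum(coeffs.real ** 2 + coeffs.imag ** 2))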
The stationary noise estimation unit 15 is a processing unit configured to estimate stationary noise of the input sound.
As one embodiment, the stationary noise estimation unit 15 may estimate the stationary noise in the frame of the input sound from the power in the noise segment for each frequency band. For example, the stationary noise estimation unit 15 may calculate the power Nt2[f, b] of the stationary noise in the frequency band b in the current frame f in accordance with the following expression (1) and the following expression (2). At this time, in a case where the current frame is the “noise segment”, the stationary noise estimation unit 15 calculates the power Nt2[f, b] of the stationary noise in accordance with the following expression (1). On the other hand, in a case where the current frame is the “voice segment”, the stationary noise estimation unit 15 calculates the power Nt2[f, b] of the stationary noise in accordance with the following expression (2).
Nt2[f, b]=a×Nt2[f−1, b]+(1−a)I2[f, b] (1)
Nt2[f, b]=Nt2[f−1, b] (2)
“Nt2[f−1, b]” in the above-mentioned expression (1) denotes the stationary noise in a frame f−1 corresponding to a frame one frame before the current frame f. “a” in the above-mentioned expression (1) denotes a coefficient used for absorbing a sudden change of the stationary noise.
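A minimal sketch of the update of expressions (1) and (2) is shown below; the value of the smoothing coefficient a is an assumption, since the text only states that it absorbs a sudden change of the stationary noise.

A = 0.9   # smoothing coefficient "a"; assumed value

def update_stationary_noise(nt2_prev, i2, is_noise_segment):
    # Return Nt2[f, b] given Nt2[f-1, b] and the current power I2[f, b].
    if is_noise_segment:
        return A * nt2_prev + (1.0 - A) * i2   # expression (1)
    return nt2_prev                            # expression (2): hold the last estimate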
The periodic noise determination unit 16 is a processing unit configured to determine whether or not the periodic noise is included in the frame of the input sound. In a case where the frame of the input sound is the voice segment, there is a possibility that both the voice and the periodic noise are superimposed over the input sound. For this reason, it may be difficult to determine the presence or absence of the periodic noise in the frame belonging to the voice segment as compared with the frame belonging to the noise segment in some cases. From the above-mentioned aspect, the determination result regarding the presence or absence of the periodic noise which has been determined in the noise segment immediately before the voice segment is inherited in the frame belonging to the voice segment.
The inverse transform unit 16A is a processing unit configured to perform an inverse transform of a frequency analysis result of the frame of the input sound for each frequency band from the frequency domain into the time domain.
As one embodiment, the inverse transform unit 16A applies inverse fast Fourier transform (IFFT) to the FFT coefficients in the frequency band b and obtains a signal of the component corresponding to the frequency band b among the signals in the frame f of the input sound. The temporal waveform of the signal obtained for each frequency band b as described above is saved in a work area that is not illustrated, not only for the current frame f but also as far back as a predetermined period. For example, from an aspect in which periodic noise having a cycle of approximately 1 Hz falls within a range of detectability, the signal having a time length of one second back from the current frame f is accumulated for each frequency band b.
The envelope extraction unit 16B is a processing unit configured to extract an envelope.
As one embodiment, the envelope extraction unit 16B executes the following processing for each frequency band b.
The transform unit 16C is a processing unit configured to transform the envelope from the time domain into the frequency domain for each frequency band.
As one embodiment, the transform unit 16C executes the following processing for each frequency band b. For example, the transform unit 16C applies a high-pass filter or the like to the temporal waveform of the envelope in the frequency band b which has been extracted by the envelope extraction unit 16B.
The determination unit 16D is a processing unit configured to determine whether or not a frequency component exceeding a predetermined threshold exists. The determination unit 16D corresponds to an example of a detection unit.
As one embodiment, the determination unit 16D executes the following processing for each frequency band b. For example, the determination unit 16D determines whether or not power measured at a peak in the frequency analysis result of the temporal waveform of the envelope in the frequency band b obtained by the transform unit 16C exceeds the predetermined threshold.
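For reference, the following is a minimal sketch of the per-band determination performed by the inverse transform unit 16A, the envelope extraction unit 16B, the transform unit 16C, and the determination unit 16D; the Hilbert-transform envelope, the mean subtraction standing in for the high-pass filter, and the threshold value are assumptions for illustration.

import numpy as np
from scipy.signal import hilbert

THRESHOLD_TH = 1.0e3   # hypothetical value of the threshold th

def has_periodic_noise(band_signal):
    # band_signal: about one second of the band-limited waveform obtained by the IFFT.
    envelope = np.abs(hilbert(band_signal))       # temporal envelope of the band
    envelope = envelope - np.mean(envelope)       # stand-in for the high-pass filter
    env_power = np.abs(np.fft.rfft(envelope)) ** 2
    peak_bin = int(np.argmax(env_power[1:])) + 1  # skip the DC bin
    return env_power[peak_bin] > THRESHOLD_TH, peak_bin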
With reference to
The phase calculation unit 17A is a processing unit configured to calculate a phase of the periodic noise.
As one embodiment, the phase calculation unit 17A operates in a case where the frame of the input sound is the noise segment. For example, the phase calculation unit 17A assigns the FFT coefficient corresponding to the frequency at which it is determined that the power exceeds the threshold th in the power spectrum of the envelope, among the frequencies included in the frequency band b in the frame f where the determination unit 16D determines that the periodic noise exists, to the following expression (3). Accordingly, phase[f, b] is calculated. This phase[f, b] is calculated in a range between 0 and 2π [rad] by the following expression (3). The thus calculated phase[f, b] in the frequency band b is saved in the work area not illustrated, from the latest frame back to a predetermined N frames, among the frames belonging to the noise segment.
phase[f, b]=arctan(real[f, b]/imag[f, b]) (3)
The power calculation unit 17B is a processing unit configured to calculate the power of the periodic noise.
As one embodiment, the power calculation unit 17B operates in a case where the frame of the input sound is the noise segment. For example, the power calculation unit 17B assigns the FFT coefficient corresponding to the frequency at which it is determined that the power exceeds the threshold th in the power spectrum of the envelope, among the frequencies included in the frequency band b in the frame f where the determination unit 16D determines that the periodic noise exists, to the following expression (4). Accordingly, power[f, b] is calculated. The thus calculated power[f, b] in the frequency band b is saved in the work area not illustrated, from the latest frame back to the predetermined N frames, among the frames belonging to the noise segment.
power[f, b]=(real[f, b]×real[f, b])+(imag[f, b]×imag[f, b]) (4)
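A minimal sketch of expressions (3) and (4) is shown below; the use of arctan2 with the same argument order as expression (3), wrapped into the range of 0 to 2π, is an assumption made so that the phase covers a full cycle.

import numpy as np

def periodic_noise_phase_power(env_coeff):
    # env_coeff: complex envelope FFT coefficient at the frequency exceeding the threshold.
    real, imag = env_coeff.real, env_coeff.imag
    phase = np.arctan2(real, imag) % (2.0 * np.pi)   # expression (3), mapped to [0, 2*pi)
    power = real * real + imag * imag                # expression (4)
    return phase, power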
The correction unit 17C is a processing unit configured to correct the phase of the periodic noise.
As one embodiment, the correction unit 17C operates in a case where the frame of the input sound is the voice segment. For example, the correction unit 17C corrects a phase in the noise segment immediately before the voice segment into a phase of the periodic noise in the current frame f by a linear prediction.
More specifically, for example, the phases are saved in the work area not illustrated from the latest frame back to the predetermined N frames among the frames belonging to the noise segment. For example, when the frames from one to N frames before the voice segment are set as the noise segment, phase[f−1, b] to phase[f−N+1, b] are saved in the work area. When the above-mentioned phases in the immediately preceding noise segment are assigned to the following expression (5), the correction unit 17C calculates the phase of the periodic noise in the frequency band b in the current frame f. In the following expression (5), the phases of the two frames in the immediately preceding noise segment are used, but the correction may also be performed by using the phases of N frames. For example, it is possible to change the interval of frames over which a difference is calculated at the second term in accordance with the number of frames that have passed from the frame in the immediately preceding noise segment to the current frame f in the voice segment.
phase[f, b]=phase[f−1, b]+(phase[f−1, b]−phase[f−2, b]) (5)
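A minimal sketch of the prediction of expression (5) is shown below; extending the per-frame phase difference over the number of frames elapsed in the voice segment, and wrapping the result to the range of 0 to 2π, are assumptions based on the description above.

import numpy as np

def predict_phase(phase_hist, frames_into_voice):
    # phase_hist: phases saved for the immediately preceding noise segment,
    # phase_hist[-1] being the newest; frames_into_voice >= 1.
    step = phase_hist[-1] - phase_hist[-2]                 # per-frame phase difference
    predicted = phase_hist[-1] + step * frames_into_voice  # expression (5), extended over elapsed frames
    return predicted % (2.0 * np.pi)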
The combining unit 17D is a processing unit configured to combine the phase and the power of the periodic noise with each other.
As one embodiment, the combining unit 17D calculates a real number component preal[f, b] of the estimated periodic noise in accordance with the following expression (6) and also calculates an imaginary number component pimag[f, b] of the estimated periodic noise in accordance with the following expression (7). At this time, in a case where the current frame f is the noise segment, the phase[f, b] calculated by the phase calculation unit 17A in the current frame f and the power[f, b] calculated by the power calculation unit 17B in the current frame f are used. On the other hand, in a case where the current frame f is the voice segment, the phase corrected by the correction unit 17C from the phase of the immediately preceding noise segment is used as the phase[f, b] in the current frame f, and also the power of the frame in the immediately preceding noise segment which is saved in the work area is used as the power[f, b] in the current frame. The combining unit 17D calculates power Ns2 of the estimated periodic noise in accordance with the following expression (8) from the real number component preal[f, b] of the estimated periodic noise and the imaginary number component pimag[f, b] of the estimated periodic noise.
preal[f, b]=√power[f, b]×cos(phase[f, b]) (6)
pimag[f, b]=√power[f, b]×sin(phase[f, b]) (7)
Ns2=IFFT(preal[f, b], pimag[f, b]) (8)
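A minimal sketch of expressions (6) to (8) is shown below; placing the combined coefficient at the peak frequency of an otherwise empty envelope spectrum and reading the current value off the inverse-transformed envelope is one possible interpretation of expression (8), not a confirmed implementation.

import numpy as np

def estimated_periodic_noise_power(phase, power, peak_bin, env_len, sample_index):
    preal = np.sqrt(power) * np.cos(phase)             # expression (6)
    pimag = np.sqrt(power) * np.sin(phase)             # expression (7)
    env_spectrum = np.zeros(env_len // 2 + 1, dtype=complex)
    env_spectrum[peak_bin] = preal + 1j * pimag        # single peak of the envelope spectrum
    envelope = np.fft.irfft(env_spectrum, n=env_len)   # expression (8): inverse transform
    return envelope[sample_index]                      # Ns2 read at the current position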
The gain calculation unit 18 is a processing unit configured to calculate a gain of the frame of the input sound.
Various methods are used for calculating the gain and suppressing noise, but a case is exemplified hereinafter where a method called the spectrum subtraction method is used, merely as an example. For example, when the power of the input sound is set as I2[f, b], the power of the voice included in the input sound is set as S2[f, b], and the noise included in the input sound is set as N2[f, b], it is assumed that the following expression (9) is established. It is further assumed that the input sound is multiplied by gain[f, b] represented in the following expression (10). Under these assumptions, gain[f, b] may be obtained from the following expression (11). In a case where the periodic noise is included in the frequency band b, the periodic noise Ns[f, b] is applied to N[f, b], and in a case where the periodic noise is not included in the frequency band b, the stationary noise Nt[f, b] is applied to N[f, b]. An example has been described in which only the periodic noise Ns[f, b] is used in a case where the periodic noise is included in the frequency band b merely as an example, but weighting addition may also be performed between the stationary noise Nt[f, b] and the periodic noise Ns[f, b] in accordance with the magnitude of the cycle of the periodic noise. “sqrt{}” in the following expression (11) denotes a square root.
I2[f, b]=S2[f, b]+N2[f, b] (9)
S[f, b]=gain[f, b]×I[f, b] (10)
gain[f, b]=sqrt{(I2[f, b]−N2[f, b])/I2[f, b]} (11)
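A minimal sketch of the gain calculation of expression (11), including the switch between the periodic noise and the stationary noise, is shown below; the floor that keeps the gain non-negative is an assumption, since the text does not state how negative differences are handled.

import numpy as np

GAIN_FLOOR = 0.0   # assumed lower bound so the square root stays real

def compute_gain(i2, nt2, ns2, periodic_noise_present):
    # Return gain[f, b] for one frame f and frequency band b.
    n2 = ns2 if periodic_noise_present else nt2
    ratio = max((i2 - n2) / i2, GAIN_FLOOR)   # expression (11)
    return np.sqrt(ratio)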
The suppression unit 19 is a processing unit configured to suppress noise. The suppression unit 19 corresponds to one example of the correction unit.
As one embodiment, the suppression unit 19 multiplies the FFT coefficient in the frequency band b in the frame f of the input sound by the gain gain[f, b] calculated by the gain calculation unit 18 in accordance with the following expression (12) to calculate output sound O[f, b].
O[f, b]=gain[f, b]×I[f, b] (12)
The inverse transform unit 12B is a processing unit configured to perform an inverse transform of the frequency analysis result for each frequency band after the gain multiplication from the frequency domain into the time domain.
As one embodiment, the inverse transform unit 12B applies IFFT to the FFT coefficient of the output sound in each frequency band b where the input sound I[f, b] is multiplied by the gain gain[f, b] for each frequency band b by the suppression unit 19. As a result, the temporal waveform of the output sound is obtained in which the voice is emphasized due to the noise suppression.
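A minimal sketch of the suppression of expression (12) and the inverse transform by the inverse transform unit 12B is shown below; the band-to-coefficient layout and the 512-sample frame length are assumptions following the earlier examples.

import numpy as np

FRAME_LEN = 512   # frame length assumed from the earlier example

def suppress_and_resynthesize(spectrum, band_gains, band_edges):
    # spectrum: rfft coefficients of one frame; band_gains[b] is gain[f, b];
    # band_edges[b] gives the (lo, hi) coefficient indices of band b (assumed layout).
    out_spectrum = spectrum.copy()
    for b, (lo, hi) in enumerate(band_edges):
        out_spectrum[lo:hi] *= band_gains[b]        # expression (12): O = gain x I
    return np.fft.irfft(out_spectrum, n=FRAME_LEN)  # frame of the output sound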
[Processing Sequence]
The transform unit 12A applies Fourier transform represented by FFT to the frame of the input sound obtained in step S101 (step S103). The FFT coefficients having the increment of the predetermined frequency are obtained by this processing in step S103.
Thereafter, the processing from step S104 to step S110 illustrated in
For example, in step S104, a squared sum of the real number parts and the imaginary number parts of the FFT coefficients included in the frequency band b set as the processing target in the frame of the input sound obtained in step S101 is calculated to obtain the power I2[f, b] in the current frame f (step S104).
The periodic noise determination unit 16 subsequently executes the “periodic noise determination processing” for determining whether or not the periodic noise is included in the frame of the input sound (step S105). A detail of processing contents of this “periodic noise determination processing” will be illustrated in
In a case where the frame of the input sound is the “noise segment” (step S301 No), the inverse transform unit 16A applies IFFT to the FFT coefficients in the frequency band b (step S302). A signal of the component corresponding to the frequency band b among the signals in the frame f of the input sound is obtained by this processing in step S302.
The envelope extraction unit 16B subsequently extracts an envelope of a curved line group included in the temporal waveform of the signal in a past predetermined period, for example, one second, in the frequency band b (step S303). The transform unit 16C applies FFT to the temporal waveform of the envelope extracted in step S303 (step S304). With this configuration, the FFT coefficients having the increment of the predetermined frequency are obtained as the frequency analysis result of the temporal waveform of the envelope.
Thereafter, the determination unit 16D determines whether or not the power measured at the peak in the power spectrum obtained from the FFT coefficients of the envelope in the frequency band b obtained in step S304 exceeds a predetermined threshold, for example, the threshold th illustrated in
On the other hand, in a case where the frame of the input sound is the “voice segment” (step S301 Yes), the determination unit 16D refers to the determination result regarding the presence or absence of the periodic noise which is determined in the noise segment immediately before the voice segment (step S306), and the processing is ended.
With reference to the flowchart in
A detail of processing contents of this “periodic noise estimation processing” will be illustrated in
In a case where the frame of the input sound is the “noise segment” (step S501 No), the phase calculation unit 17A assigns the FFT coefficient corresponding to the frequency at which it is determined that the power exceeds the threshold in the power spectrum of the envelope among the frequencies included in the frequency band b in the frame f where it is determined that the periodic noise exists to the above-mentioned expression (3) to calculate the phase [f, b] (step S502).
The power calculation unit 17B subsequently assigns the FFT coefficient corresponding to the frequency at which it is determined that the power exceeds the threshold in the power spectrum of the envelope among the frequencies included in the frequency band b in the frame f where it is determined that the periodic noise exists to the above-mentioned expression (4) to calculate the power [f, b] (step S503).
The combining unit 17D calculates the power Ns2 of the estimated periodic noise based on the phase and the power calculated in step S502 and step S503 (step S504), and the processing is ended.
On the other hand, in a case where the frame of the input sound is the “voice segment” (step S501 Yes), the correction unit 17C corrects the phase of the noise segment immediately before the voice segment into the phase of the periodic noise in the current frame f by the linear prediction (step S505).
The combining unit 17D calculates the power Ns2 of the estimated periodic noise based on the phase corrected from the phase in the immediately preceding noise segment in step S505 and the power of the frame in the immediately preceding noise segment which has been saved in the work area (step S504), and the processing is ended.
With reference to the flowchart in
The gain calculation unit 18 subsequently switches between using the periodic noise Ns[f, b] and the stationary noise Nt[f, b] as N[f, b], depending on whether or not the periodic noise is included in the frequency band b, and calculates the gain gain[f, b] by which the input sound is multiplied in accordance with the above-mentioned expression (11) (step S109).
Thereafter, the suppression unit 19 multiplies the FFT coefficient in the frequency band b in the frame f of the input sound by the gain gain[f, b] calculated in step S109 in accordance with the above-mentioned expression (12) to calculate the output sound O[f, b] (step S110).
After the processing from step S104 to step S110 is executed with respect to all the frequency bands b, the inverse transform unit 12B applies IFFT to the FFT coefficients of the output sound in each frequency band b, where the input sound I[f, b] has been multiplied by the gain gain[f, b] for each frequency band b (step S111), and the processing is ended.
In accordance with this processing in step S111, the temporal waveform of the output sound is obtained in which the voice is emphasized due to the noise suppression.
[One Aspect of Effects]
As described above, the noise suppression apparatus 10 according to the present embodiment estimates the periodic noise in the voice segment based on the cycle of the power change in the noise segment before the voice segment is detected from the input sound and suppresses the periodic noise included in the input sound.
At this time, in the noise suppression apparatus 10 according to the present embodiment, the cycle of the power change in the noise segment before the voice segment is detected from the input sound is used for the estimation of the periodic noise in the voice segment. For this reason, in the noise suppression apparatus 10 according to the present embodiment, the power of the estimated noise is not fixed to remain the same as in the above-mentioned noise removal system. In the noise suppression apparatus 10 according to the present embodiment, the periodic noise having a correlation with the cycle of the power change in the immediately preceding noise segment is estimated. In this manner, in the noise suppression apparatus 10 according to the present embodiment, it is possible to estimate the periodic noise that is difficult to estimate in the above-mentioned noise removal system.
Therefore, in accordance with the noise suppression apparatus 10, it is possible to suppress the periodic noise included in the input sound.
Heretofore, the embodiments of the device of the present disclosure have been described, but it is to be understood that embodiments of the present disclosure may be made in various ways other than the above-mentioned embodiments. Therefore, other embodiments included in the present disclosure are described below.
The above-mentioned noise suppression function described in the first embodiment may be incorporated in various devices such as a mobile terminal apparatus represented by a smart phone, a wearable terminal, a smart speaker, and a communication robot. In this case, input sound input to a microphone included in the device is obtained to execute the processing illustrated in
According to the above-mentioned first embodiment, the example has been described in which the correction based on the linear prediction by the correction unit 17C is executed at the time of the transition from the noise segment to the voice segment, but the correction based on the linear prediction by the correction unit 17C may also be executed in the case of the transition from the voice segment to the noise segment.
According to the above-mentioned first embodiment, the example in which the periodic noise is generated in a stationary manner, for example, the example in which the periodic noise is generated in the entire sampling time has been described from the viewpoint of the example of
According to the above-mentioned first embodiment, from an aspect in which the signal of the input sound is processed in real time, the example has been illustrated in which the noise in the voice segment is suppressed based on the periodic noise estimated from the cycle of the power change in the noise segment immediately before the voice segment, but the configuration is not limited to this. For example, the signal of the input sound does not necessarily have to be processed in real time. In this case, it is possible not only to suppress the periodic noise generated after the frame where the periodic noise has been detected by the processing in step S106 but also to suppress the periodic noise generated before that frame. For example, in a case where the periodic noise is generated in the midcourse of the voice segment, such as, for example, a case where the periodic noise is detected at and after sample 8000 of the temporal waveform illustrated in
[Distribution and Integration]
The respective components of the respective devices illustrated in the drawings do not necessarily have to be physically configured as illustrated in the drawings. Specific forms of the distribution and integration of the devices are not limited to the illustrated forms, and all or a portion thereof may be distributed and integrated in any units in either a functional or physical manner depending on various conditions such as a load and a usage state. For example, the obtaining unit 11, the transform unit 12A, the inverse transform unit 12B, the voice segment detection unit 13, the power calculation unit 14, the stationary noise estimation unit 15, the periodic noise determination unit 16, the periodic noise estimation unit 17, the gain calculation unit 18, or the suppression unit 19 may be coupled via a network as external devices of the noise suppression apparatus 10. Different devices may respectively include the obtaining unit 11, the transform unit 12A, the inverse transform unit 12B, the voice segment detection unit 13, the power calculation unit 14, the stationary noise estimation unit 15, the periodic noise determination unit 16, the periodic noise estimation unit 17, the gain calculation unit 18, or the suppression unit 19 and be coupled via the network to cooperate with each other, whereby the above-mentioned functions of the noise suppression apparatus 10 may also be realized.
[Noise Suppression Program]
The various types of processing described in the above-mentioned embodiments may be implemented by executing a program prepared in advance in a computer such as a personal computer or a work station. Hereinafter, with reference to
As illustrated in
Under the above-mentioned environment, the CPU 150 reads out the noise suppression program 170a from the HDD 170 to be loaded into the RAM 180. As a result, as illustrated in
The noise suppression program 170a does not necessarily have to be initially stored in the HDD 170 or the ROM 160. For example, the noise suppression program 170a is stored in “portable physical media” such as a flexible disk called an FD, a CD-ROM, a DVD disk, a magneto-optical disk, and an IC card, which will be inserted into the computer 100. The computer 100 may obtain the noise suppression program 170a from these portable physical media and execute the program 170a. The noise suppression program 170a may be stored in another computer or server apparatus coupled to the computer 100 via a public line, the Internet, a LAN, a WAN, or the like, and the computer 100 may obtain the noise suppression program 170a from these and execute the noise suppression program 170a.
All examples and conditional language provided herein are intended for the pedagogical purposes of aiding the reader in understanding the invention and the concepts contributed by the inventor to further the art, and are not to be construed as limitations to such specifically recited examples and conditions, nor does the organization of such examples in the specification relate to a showing of the superiority and inferiority of the invention. Although one or more embodiments of the present invention have been described in detail, it should be understood that the various changes, substitutions, and alterations could be made hereto without departing from the spirit and scope of the invention.