This application is a National Stage Entry of PCT/JP2013/079701 filed on Nov. 1, 2013, which claims priority from Japanese Patent Application 2012-259218 filed on Nov. 27, 2012, the contents of all of which are incorporated herein by reference, in their entirety.
The present invention relates to a signal processing technique of controlling the phase component of a signal.
As examples of a technique of performing signal processing by controlling the phase component of a signal, patent literature 1 and non-patent literature 1 disclose noise suppression techniques which pay attention to a phase spectrum. In the techniques described in patent literature 1 and non-patent literature 1, an amplitude spectrum pertaining to noise is suppressed, and at the same time, the phase spectrum is shifted by a random value of up to π/4. The techniques described in patent literature 1 and non-patent literature 1 implement, by shifting the phase spectrum at random, suppression of noise which cannot be suppressed by only attenuation of the noise spectrum.
However, as in patent literature 1 and non-patent literature 1, to shift the phase spectrum at random, it is necessary to generate a random number. As a result, a calculation amount for generating a random number is added.
The present invention makes it possible to provide a signal processing technique that solves the above-described problem.
One aspect of the present invention provides an apparatus characterized by comprising:
a transformer that transforms a mixed signal, in which a first signal and a second signal coexist, into a phase component for each frequency and one of an amplitude component and a power component for each frequency;
a change amount generator that generates a change amount of the phase component at a predetermined frequency by using a series of data with a cross-correlation weaker than that of the phase components and randomness lower than that of random numbers;
a phase controller that controls the phase component by using the change amount provided from the change amount generator; and
an inverse transformer that generates an enhanced signal by using the phase component having undergone control processing by the phase controller.
Another aspect of the present invention provides a method characterized by comprising:
transforming a mixed signal, in which a first signal and a second signal coexist, into a phase component for each frequency and one of an amplitude component and a power component for each frequency;
generating a change amount of the phase component at a predetermined frequency by using a series of data with a cross-correlation weaker than that of the phase components and randomness lower than that of random numbers;
controlling the phase component by using the change amount generated in the generating the change amount; and
generating an enhanced signal by using the phase component having undergone control processing in the controlling.
Still another aspect of the present invention provides a program for causing a computer to execute a method, characterized by comprising:
transforming a mixed signal, in which a first signal and a second signal coexist, into a phase component for each frequency and one of an amplitude component and a power component for each frequency;
generating a change amount of the phase component at a predetermined frequency by using a series of data with a cross-correlation weaker than that of the phase components and randomness lower than that of random numbers;
controlling the phase component by using the change amount generated in the generating the change amount; and
generating an enhanced signal by using the phase component having undergone control processing in the controlling.
According to the present invention, it is possible to provide a signal processing technique of controlling the phase component of an input signal without generating any random number.
Preferred embodiments of the present invention will now be described in detail with reference to the drawings. It should be noted that the relative arrangement of the components, the numerical expressions and numerical values set forth in these embodiments do not limit the scope of the present invention unless it is specifically stated otherwise.
The transformer 101 transforms a mixed signal 110, in which the first and second signals coexist, into a phase component 120 for each frequency and an amplitude component or power component 130 for each frequency.
The change amount generator 103 generates a change amount of a phase component at a predetermined frequency by using a series of data with a cross-correlation weaker than that of the phase components 120 and randomness lower than that of random numbers. The phase controller 102 controls the phase component 120 by using the change amount provided from the change amount generator 103. The inverse transformer 104 generates an enhanced signal 170 using a phase component 140 having undergone control processing by the phase controller 102.
With the above arrangement, it is possible to control the phase component 120 using a series of data with a cross-correlation weaker than that of the phase components 120 and randomness lower than that of random numbers, thereby efficiently implementing suppression of noise which cannot be suppressed by only attenuation of the amplitude spectrum.
<<Overall Arrangement>>
A noise suppression apparatus 200 according to the second embodiment of the present invention will be described with reference to
A deteriorated signal (a signal in which a desired signal and noise coexist) is supplied as a series of sample values to an input terminal 206. When a deteriorated signal is supplied to the input terminal 206, a transformer 201 performs transform such as Fourier transform for the supplied deteriorated signal, and divides the resultant signal into a plurality of frequency components. The transformer 201 independently processes the plurality of frequency components for each frequency. The following description pays attention to a specific frequency component. The transformer 201 supplies a deteriorated signal amplitude spectrum (amplitude component) 230 of the plurality of frequency components to an inverse transformer 204. The transformer 201 supplies a phase spectrum (phase component) 220 of the plurality of frequency components to a phase controller 202 and a change amount generator 203. Note that although the transformer 201 supplies the deteriorated signal amplitude spectrum 230 to the inverse transformer 204, the present invention is not limited to this. The transformer 201 may supply a power spectrum corresponding to the square of the deteriorated signal amplitude spectrum 230 to the inverse transformer 204.
The change amount generator 203 generates a change amount by using the deteriorated signal phase spectrum 220 received from the transformer 201, and supplies the change amount to the phase controller 202. The “change amount” of the phase is a concept including the “rotation amount” and “replacement amount” of the phase, and indicates the control amount of the phase. The phase controller 202 reduces the phase correlation by changing the deteriorated signal phase spectrum 220 supplied from the transformer 201 by using the change amount supplied from the change amount generator 203, and supplies the resultant data as an enhanced signal phase spectrum 240 to the inverse transformer 204. The inverse transformer 204 performs inverse transform by composing the enhanced signal phase spectrum 240 supplied from the phase controller 202 and the deteriorated signal amplitude spectrum 230 supplied from the transformer 201, and supplies the result of inverse transform as an enhanced signal 270 to an output terminal 207.
<<Arrangement of Transformer>>
The windowing unit 302 may partially superimpose (overlap) and window two successive frames. Assuming that the overlapping length is 50% of the frame length, the windowing unit 302 outputs the left-hand side of equation (2) below obtained for t=0, 1, . . . , K/2−1.
For a real signal, the windowing unit 302 may use a symmetric window function. The window function is designed so that, when the phase controller 202 performs no control operation, the input signal of the transformer 201 and the output signal of the inverse transformer 204 coincide with each other except for calculation errors. This means w(t)+w(t+K/2)=1.
The following description will continue by exemplifying a case in which two successive frames are made to overlap each other by 50% and windowed. For example, the windowing unit 302 may use, as w(t), a Hanning window given by:
In addition, various window functions such as a Hamming window and triangle window are known. The windowed output is supplied to the Fourier transformer 303, and transformed into a deteriorated signal spectrum Yn(k). The deteriorated signal spectrum Yn(k) is separated into a phase and amplitude. A deteriorated signal phase spectrum arg Yn(k) is supplied to the phase controller 202 and change amount generator 203, and a deteriorated signal amplitude spectrum |Yn(k)| is supplied to the inverse transformer 204. As described above, the power spectrum may be used instead of the amplitude spectrum.
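The windowing and transform steps can be illustrated with a minimal sketch. It assumes a standard periodic Hann window, which may differ in form from the window of equation (3), and an illustrative frame length `K = 8`; the function name `transform` is hypothetical. It checks the condition w(t)+w(t+K/2)=1 and separates the spectrum into amplitude and phase.

```python
import numpy as np

K = 8  # frame length (illustrative value, not taken from the text)
t = np.arange(K)
# Periodic Hann window (assumed form); it satisfies w(t) + w(t + K/2) = 1.
w = 0.5 - 0.5 * np.cos(2.0 * np.pi * t / K)

def transform(frame, window):
    """Window one frame and split its spectrum into amplitude and phase."""
    spectrum = np.fft.fft(window * frame)
    return np.abs(spectrum), np.angle(spectrum)

rng = np.random.default_rng(0)
frame = rng.standard_normal(K)          # stand-in for deteriorated samples
amplitude, phase = transform(frame, w)  # |Yn(k)| and arg Yn(k)
power = amplitude ** 2                  # power spectrum alternative
```

The phase returned by `np.angle` lies within ±π, which matches the value range the text later mentions in connection with phase unwrapping.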
<<Arrangement of Inverse Transformer>>
Xn(k)=|Xn(k)|·arg Xn(k) (4)
The inverse Fourier transformer 401 performs inverse Fourier transform for the obtained enhanced signal. The enhanced signal having undergone inverse Fourier transform is supplied to the windowing unit 402 as a series xn(t) (t=0, 1, . . . , K−1) of time domain sample values in which one frame includes K samples, and multiplied by the window function w(t). A signal obtained by performing windowing for the input signal xn(t) (t=0, 1, . . . , K/2−1) of the nth frame by w(t) is given by the left-hand side of:
The windowing unit 402 may partially superimpose (overlap) and window two successive frames. Assuming that the overlapping length is 50% of the frame length, the windowing unit 402 outputs and transmits, to the frame composition unit 403, the left-hand side of the following equation for t=0, 1, . . . , K/2−1.
The frame composition unit 403 extracts outputs of two adjacent frames from the windowing unit 402 for every K/2 samples, superimposes them, and obtains an output signal (the left-hand side of equation (7) below) for t=0, 1, . . . , K−1.
x̂n(t)=
The frame composition unit 403 transmits the obtained output signal to the output terminal 207.
In
After a plurality of frequency components obtained by the transformer 201 are integrated, the change amount generator 203 may generate a change amount and the phase controller 202 may control the phase. At this time, it is possible to obtain higher sound quality by integrating a larger number of frequency components from a low-frequency domain where the discrimination ability of auditory properties is high toward a high-frequency domain where the discrimination ability is poor so that the bandwidth after integration becomes wide. When phase control is executed after integrating a plurality of frequency components, the number of frequency components to which phase control is applied decreases, and the total amount of calculation can be reduced.
<<Operation of Change Amount Generator 203>>
The change amount generator 203 is supplied with the deteriorated signal phase spectrum 220 from the transformer 201, and generates a change amount to reduce the phase correlation. Since the deteriorated signal phase spectrum 220 supplied from the transformer 201 is represented by arg Yn(k) (0≤k<K), the change amount generator 203 can obtain an enhanced signal phase spectrum arg Xn(k) for which correlation has been reduced, as given by:
$$\arg X_n(k) = (-1)^{k}\,\arg Y_n(k) \tag{8}$$
This corresponds to alternately inverting the signs of the phases. Instead of alternately inverting the signs, inversion may be performed for every arbitrary integer smaller than K, as a matter of course.
The change amount generator 203 obtains a rotation amount Δarg Yn(k) as a change amount necessary for phase control indicated by equation (8), as given by:
$$\Delta\arg Y_n(k) = \{(-1)^{k} - 1\}\,\arg Y_n(k) \tag{9}$$
That is, the change amount generator 203 generates the rotation amount Δarg Yn(k) indicated by equation (9) as a change amount. Also, it is possible to use:
Δarg Yn(k)=arg Yn(mod [k+K/2−1,K]) (10)
where mod [k, K] represents a remainder obtained by dividing k by K. The rotation amount Δarg Yn(k) at this time corresponds to a phase obtained by shifting the original phase by K/2 samples. It is apparent that the shift amount is not limited to K/2, and may be an arbitrary integer.
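The two change amounts of equations (9) and (10) can be sketched as follows. This is only an illustration; the function names are hypothetical, and the phase spectrum is assumed to be held in a one-dimensional NumPy array indexed by k.

```python
import numpy as np

def sign_inversion_change(phase):
    """Equation (9): change amount that alternately inverts the phase signs."""
    k = np.arange(len(phase))
    return ((-1.0) ** k - 1.0) * phase

def shifted_phase_change(phase):
    """Equation (10): change amount taken from the phase at mod[k + K/2 - 1, K]."""
    K = len(phase)
    k = np.arange(K)
    return phase[(k + K // 2 - 1) % K]

phase = np.array([0.3, -1.2, 2.0, 0.7])
# Adding the change amount of equation (9) reproduces equation (8).
inverted = phase + sign_inversion_change(phase)
shifted = shifted_phase_change(phase)
```

Neither function draws a random number; both reuse the phase samples themselves, which is the point of the approach.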
Alternatively, a phase at a position symmetrical to the position of the original phase with respect to K/2 is set as the rotation amount Δarg Yn(k). This uses:
Δarg Yn(k)=arg Yn(mod [K−k+1,K]) (11)
Furthermore, it is possible to generate a change amount by combining these two kinds of processes, that is, sign inversion and addition of the shifted phase. That is,
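$$\Delta\arg Y_n(k) = \{(-1)^{k} - 1\}\,\arg Y_n(\operatorname{mod}[k+K/2-1,\,K]) \tag{12}$$
or
$$\Delta\arg Y_n(k) = \{(-1)^{\operatorname{mod}(k+N/2-1,\,N)} - 1\}\,\arg Y_n(\operatorname{mod}[k+K/2-1,\,K]) \tag{13}$$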
As for the shift addition, the shift amount K/2 can be changed. For example, if the frame number n at that time is set as the shift amount, the shift amount automatically changes with time. Similarly, in equations (12) and (13), equation (11) may be combined instead of equation (10).
Furthermore, constant multiplication can be combined with the selective sign inversion of the phase and shift addition processing. For example, combining constant multiplication with equation (10) yields:
Δarg Yn(k)=k·arg Yn(mod [k+K/2−1,K]) (14)
This is an example of performing constant multiplication for a term to undergo shift addition by k corresponding to the position of the term.
Furthermore, it is possible to replace a plurality of phase samples. For example, k (0≤k<K/2) can be alternately applied with:
Δarg Yn(k)=−arg Yn(k)+arg Yn(mod [K−k+1,K])
Δarg Yn(mod [K−k+1,K])=−arg Yn(mod [K−k+1,K])+arg Yn(k) (15)
An arbitrary integer smaller than K may be used, instead of 1.
Selective sign inversion of the phase, shift addition, constant multiplication, and replacement have been described above. These processes can be selectively applied in accordance with the value of arg Yn(k). For example, it is possible to apply the above-described processes only when the value of arg Yn(k) takes a positive value. Exemplifying the processing indicated by equation (10) yields:
$$\Delta\arg Y_n(k) = \frac{\operatorname{sgn}(\arg Y_n(k)) + 1}{2}\,\arg Y_n(\operatorname{mod}[k+K/2-1,\,K]) \tag{16}$$
where sgn(⋅) represents an operator for extracting a sign. A fraction on the right-hand side becomes 1 only when the phase takes a positive value, and becomes zero otherwise. It is therefore possible to selectively apply the processes in accordance with the value of arg Yn(k). The correlation elimination processes using the change amount are different in the degree of correlation elimination and the necessary calculation amount. In actual application, in consideration of the degree of correlation elimination and the necessary calculation amount, appropriate processing is selected and used or the processes are used in combination.
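The selective application described above can be sketched as follows. The gate is an assumed reading of the fraction the text describes: it equals 1 where the phase is positive and 0 otherwise, which coincides with (sgn(x)+1)/2 for nonzero x. The function name is hypothetical.

```python
import numpy as np

def selective_shift_change(phase):
    """Apply the shifted-phase change amount only where the phase is positive."""
    K = len(phase)
    k = np.arange(K)
    # Gate: 1 for positive phase, 0 otherwise (assumed form of the fraction).
    gate = np.where(phase > 0, 1.0, 0.0)
    return gate * phase[(k + K // 2 - 1) % K]

phase = np.array([0.3, -1.2, 2.0, -0.7])
change = selective_shift_change(phase)
```

Frequencies whose phase is not positive receive a zero change amount and therefore pass through the phase controller unchanged.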
As another correlation elimination method, there is provided a method of obtaining the correlation of the phase samples arg Yn(k) and eliminating the obtained correlation. For example, consider a case in which arg Yn(k) is represented by linear combination of N−1 adjacent samples. This establishes:
Alternatively, paying attention to the correlation in the opposite direction can yield:
Note that δL(k) and δR(k) represent uncorrelated components (components with no correlation).
Modifying arg Yn(k) using the relationship yields:
In the above equations, it is not necessary to use all nonzero values aj. By using some values aj, it is possible to reduce the calculation amount.
Although the correlation elimination effect decreases, it is possible to minimize a decrease in effect by selectively using large values aj. As an example, by using only the largest value aj, phase correlation elimination is performed based on:
$$\Delta\arg Y_n(k) = -a_{j_{\max}}\,\arg Y_n(j_{\max}) \tag{21}$$
where jmax represents the value of j for which the correlation coefficient aj takes its largest value. Compared with correlation elimination using N samples, this reduces the calculation amount necessary for correlation elimination.
The coefficient aj in the above linear correlation equations is known as a linear prediction coefficient (LPC) in voice encoding. The LPC can be obtained at high speed by using the Levinson-Durbin recursion. The LPC can also be obtained by using a coefficient update algorithm for an adaptive filter, typified by the normalized LMS algorithm, driven by the difference (error) between the original phase sample value and the prediction result.
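Correlation elimination by linear prediction can be illustrated with a minimal sketch under stated assumptions. Ordinary least squares is swapped in here for the Levinson-Durbin recursion and the normalized LMS algorithm named in the text, and each phase sample is predicted only from the `order` samples that precede it. The function name, the `order` parameter, and the test signal are hypothetical.

```python
import numpy as np

def lp_change_amount(phase, order=2):
    """Change amount that subtracts a least-squares linear prediction."""
    phase = np.asarray(phase, dtype=float)
    K = len(phase)
    # Row for index k holds [phase[k-1], ..., phase[k-order]].
    rows = np.array([phase[k - order:k][::-1] for k in range(order, K)])
    target = phase[order:]
    coeffs, *_ = np.linalg.lstsq(rows, target, rcond=None)
    delta = np.zeros(K)
    delta[order:] = -rows @ coeffs  # negative of the predicted (correlated) part
    return delta, coeffs

rng = np.random.default_rng(1)
phase = 0.1 * np.cumsum(rng.standard_normal(64))  # strongly correlated samples
delta, coeffs = lp_change_amount(phase, order=2)
residual = phase + delta  # what remains after removing the prediction
```

Because the least-squares fit can always choose zero coefficients, the energy of `residual` over the predicted indices never exceeds that of the original samples. Keeping only the largest coefficient, as in equation (21), would trade part of this reduction for less computation.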
The correlation may be eliminated by assuming linear combination of Kj−1 samples (Kj<K), instead of linear combination of K−1 adjacent samples. By decreasing the number of samples assumed for linear combination in this way, it is possible to reduce the calculation amount necessary for correlation elimination.
A case in which arg Yn(k) is represented by linear combination of K−1 adjacent samples has been exemplified. Similarly, a case in which arg Yn(k) is represented by nonlinear combination of K−1 samples is possible. That is, this establishes:
$$\arg Y_n(k) = f_{NL}\bigl[\arg Y_n(j)\bigr]\Big|_{0\le j<K,\ j\ne k} + \delta(k) \tag{22}$$
where fNL[⋅] represents a nonlinear function, and δ(k) represents an uncorrelated component. In this case, the change amount used for correlation elimination can be obtained by:
$$\Delta\arg Y_n(k) = -f_{NL}\bigl[\arg Y_n(j)\bigr]\Big|_{0\le j<K,\ j\ne k} \tag{23}$$
Correlation elimination using the nonlinear function can sufficiently eliminate the correlation when data have a nonlinear correlation.
The nonlinear function can be generally approximated by a polynomial. When approximating the nonlinear function fNL[⋅] by a polynomial of arg Yn(j), the kinds of arg Yn(j) are limited, and its order can also be limited. If, for example, only arg Yn(k), arg Yn(k+1), and the squares of them are used, fNL[⋅] is approximated by only the four kinds of terms including arg Yn(k), arg Yn(k+1), and the squares of them. Approximation of the nonlinear function can reduce the calculation amount necessary for correlation elimination.
<<Operation of Phase Controller 202>>
The phase controller 202 obtains the enhanced signal phase spectrum 240 arg Xn(k) by adding the change amount Δarg Yn(k) supplied from the change amount generator 203 to the deteriorated signal phase spectrum 220 supplied from the transformer 201, and supplies the obtained enhanced signal phase spectrum 240 to the inverse transformer 204. That is, this executes:
arg Xn(k)=arg Yn(k)+Δarg Yn(k) (24)
Alternatively, the phase controller 202 can obtain the enhanced signal phase spectrum 240 arg Xn(k) by replacing the deteriorated signal phase spectrum 220 supplied from the transformer 201 with the change amount Δarg Yn(k) supplied from the change amount generator 203, instead of adding the change amount to the deteriorated signal phase spectrum 220, and supply the enhanced signal phase spectrum 240 to the inverse transformer 204. In this case, the change amount serves as the replacement amount of the phase, and the phase controller 202 executes:
arg Xn(k)=arg Yn(k)−arg Yn(k)+Δarg Yn(k) (25)
Note that although replacement is implemented in this example by subtracting the deteriorated signal phase spectrum itself and adding the change amount, replacement may be implemented by simply overwriting the phase data with the replacement amount.
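The two operating modes of the phase controller, rotation by equation (24) and replacement by equation (25), can be sketched as follows. The function name and the `mode` parameter are hypothetical.

```python
import numpy as np

def control_phase(phase, change, mode="rotate"):
    """Return the enhanced phase spectrum from a phase and a change amount."""
    phase = np.asarray(phase, dtype=float)
    change = np.asarray(change, dtype=float)
    if mode == "rotate":
        return phase + change          # equation (24)
    if mode == "replace":
        return phase - phase + change  # equation (25); equals the change amount
    raise ValueError("mode must be 'rotate' or 'replace'")

phase = np.array([0.3, -1.2, 2.0, 0.7])
change = np.array([0.1, 0.2, -0.3, 0.4])
rotated = control_phase(phase, change, "rotate")
replaced = control_phase(phase, change, "replace")
```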
As described above, the shape of the deteriorated signal phase spectrum 220 is changed when the phase controller 202 changes the value of arg Yn(k) by using the change amount Δarg Yn(k) generated by the change amount generator 203. The change of the shape weakens the correlation of the deteriorated signal phase spectrum 220, thereby weakening the feature of the input signal.
Note that it is also possible to apply phase unwrapping prior to the phase processing described above. This is because the deteriorated signal phase spectrum 220 has a range of ±π as a value range. That is, phase unwrapping is performed not to limit the value range to the range of ±π. Performing phase unwrapping makes it possible to obtain the correlation indicated by equation (15), (16), or (20) at high accuracy. Various methods can be applied for phase unwrapping, as described in B. Rad and T. Virtanen, “Phase spectrum prediction of audio signals,” Proc. ISCCSP2012, CD-ROM, May 2012 (non-patent literature 2) and S. T. Kaplan and T. J. Ulrych, “Phase Unwrapping: A review of methods and a novel technique,” Proc. 2007 CSPG CSEG Conv. pp. 534-537, May 2007 (non-patent literature 3).
<<Overall Arrangement>>
A noise suppression apparatus 500 according to the third embodiment of the present invention will be described with reference to
<<Arrangement of Change Amount Generator 503>>
A transformer 201 supplies a deteriorated signal phase spectrum 220 to the phase controller 202, and the change amount generator 503 supplies a phase rotation amount to the phase controller 202. The phase controller 202 rotates (shifts) the deteriorated signal phase spectrum 220 by the rotation amount supplied from the change amount generator 503, and supplies the rotation result as an enhanced signal phase spectrum 240 to an inverse transformer 204.
<<Change Amount Calculation 1 Using Amplitude>>
For example, the amplitude analyzer 602 sets, as a rotation amount, a product obtained by multiplying the deteriorated signal amplitude spectrum held in the amplitude holding unit 601 by π. Alternatively, even if the deteriorated signal amplitude spectrum held in the amplitude holding unit 601 is collected in the frequency direction or time axis direction, and directly set as a rotation amount, the same effects are obtained. The phase controller 202 changes (rotates or replaces) the deteriorated signal phase spectrum at each frequency by using the change amount generated by the change amount generator 503 based on the deteriorated signal amplitude spectrum. Under the control of the phase controller 202, the shape of the deteriorated signal phase spectrum 220 changes. The change of the shape can weaken the feature of noise.
<<Change Amount Calculation 2 Using Amplitude>>
The amplitude analyzer 602 may supply, as a rotation amount, the result of normalizing the deteriorated signal amplitude spectrum 230 held in the amplitude holding unit 601 to the phase controller 202. In this case, the amplitude analyzer 602 first obtains the average of the deteriorated signal amplitude spectra 230 (K positive values). The amplitude analyzer 602 then multiplies, by π, the quotient obtained by dividing the deteriorated signal amplitude spectrum 230 by the obtained average value, and sets the product as a rotation amount. Note that similar effects are obtained even if the quotient is directly set as a rotation amount without being multiplied by π. Since the variance can be made larger than in a case in which no normalization is performed, the correlation elimination effect for a rotated phase can be enhanced. The average can also be obtained after excluding a value (outlier) extremely different from the remaining values. This eliminates the adverse effect of the outlier, thereby yielding a more effective rotation amount.
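Normalization by the average can be sketched as follows. The text does not specify how an outlier is detected, so the rule used here (discard values above `outlier_factor` times the median when computing the average) is purely an assumption for illustration, as are the function and parameter names.

```python
import numpy as np

def mean_normalized_rotation(amplitude, outlier_factor=None):
    """Rotation amount: pi times the amplitude divided by its average."""
    a = np.asarray(amplitude, dtype=float)
    if outlier_factor is None:
        reference = a
    else:
        # Hypothetical outlier rule (outlier_factor assumed >= 1).
        reference = a[a <= outlier_factor * np.median(a)]
    return np.pi * a / reference.mean()

plain = mean_normalized_rotation([1.0, 1.0, 2.0])
robust = mean_normalized_rotation([1.0, 1.0, 1.0, 100.0], outlier_factor=3.0)
```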
<<Change Amount Calculation 3 Using Amplitude>>
The amplitude analyzer 602 can normalize the distribution of the deteriorated signal amplitude spectra 230, and then set a rotation amount. First, the amplitude analyzer 602 obtains a maximum value |Xn(k)|max and a minimum value |Xn(k)|min of the deteriorated signal amplitude spectra 230 (K positive values). The amplitude analyzer 602 subtracts the minimum value from the deteriorated signal amplitude spectrum, and divides the subtraction result by the difference between the maximum value and the minimum value. A product obtained by multiplying the obtained quotient by π is set as a rotation amount. That is, a rotation amount Δarg Yn(k) is obtained by:
$$\Delta\arg Y_n(k) = \pi\,\frac{|X_n(k)| - |X_n(k)|_{\min}}{|X_n(k)|_{\max} - |X_n(k)|_{\min}} \tag{26}$$
By obtaining the rotation amounts in this way, the rotation amounts are distributed between 0 and π. It is, therefore, possible to enhance the correlation elimination effect for a rotated phase. Note that similar effects are obtained even if the quotient is directly set as a rotation amount without being multiplied by π.
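The maximum/minimum normalization described in words above can be sketched as follows. The function name is hypothetical, and returning zeros for a completely flat spectrum is an assumption, since the text does not cover that case.

```python
import numpy as np

def minmax_rotation(amplitude):
    """Rotation amount in [0, pi] from a min-max normalized amplitude spectrum."""
    a = np.asarray(amplitude, dtype=float)
    lo, hi = a.min(), a.max()
    if hi == lo:
        # Flat spectrum: no spread to normalize (assumed handling).
        return np.zeros_like(a)
    return np.pi * (a - lo) / (hi - lo)

rotation = minmax_rotation([1.0, 3.0, 2.0])
```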
<<Change Amount Calculation 4>>
A change amount generator 503 can normalize the distribution of the deteriorated signal amplitude spectra by an envelope, and set the normalization result as a rotation amount. As for the envelope, for example, a regression curve of the deteriorated signal amplitude spectrum is obtained based on N samples, and each sample is divided by the value of the regression curve. The regression curve may be obtained by using some of the N samples, or can be obtained by excluding an outlier. By excluding an outlier, it is possible to eliminate the adverse effect of the outlier, thereby obtaining a more effective rotation amount. The thus obtained quotients are distributed centered on 1.
By applying normalization of the maximum value and minimum value described using equation (26) to the quotient, the rotation amount Δarg Yn(k) is obtained by:
where |X̃n(k)| represents the deteriorated signal amplitude spectrum normalized by the envelope. By obtaining the rotation amounts in this way, the rotation amounts are uniformly distributed between −π and π, thereby enhancing the correlation elimination effect. Note that similar effects are obtained even if the quotient is directly set as a rotation amount without being multiplied by π.
The fourth embodiment of the present invention will be described with reference to
As shown in
A drop of the output level by phase rotation in correlation elimination will be explained with reference to
Signals when no phase rotation is performed will be described with reference to
A windowing unit 302 performs windowing for the signals of the divided frames. The third signal from the top, which is sectioned by a dotted line, is a signal after windowing. In
After that, a Fourier transformer 303 transforms a signal into one in the frequency domain. However, the signal in the frequency domain is not shown in
A windowing unit 402 executes windowing again for an enhanced signal output from the inverse Fourier transformer 401 of the inverse transformer 204.
On the other hand, signals when the phase is rotated will be explained with reference to
The drop of the output signal level caused by phase rotation can also be explained by vector composition in the frequency domain by replacing addition in the time domain with addition in the frequency domain.
The relationship between x1 and x2 is given by:
First, a transform equation from a time domain signal into a frequency domain signal, and an inverse transform equation will be described. Based on Fourier transform of a time domain signal x[n], a frequency domain signal X[k] is given by:
where k represents the discrete frequency, and L represents the frame length.
Inverse transform to return the frequency domain signal X[k] into the time domain signal x[n] is given by:
Based on this, transform of the time domain signals x1[n] and x2[m] into frequency domain signals X1[k] and X2[k], respectively, is given by:
Based on equation (31), inverse transform to return the frequency domain signals X1[k] and X2[k] into the time domain signals x1[n] and x2[m], respectively, is given by:
The inverse Fourier transformer 401 transforms a frequency domain signal into a time domain signal by inverse Fourier transform. After that, the frame composition unit 403 performs overlap addition of enhanced sounds of preceding and current frames.
For example, at the overlapping ratio of 50% in the illustrated example, the frame composition unit 403 adds adjacent frames in a section of the discrete time m=L/2 to L−1. The addition section m=L/2 to L−1 will be considered.
Substituting equations (34) and (35) into time domain addition yields:
Furthermore, substituting equations (32) and (33) into the frequency domain signals X1[k] and X2[k] in equation (36) yields:
Equation (37) is expanded into:
Total sum calculation included in each term of equation (38) will be considered. Introducing an arbitrary integer g establishes:
An inverse Fourier transform of a delta function δ[g] is given by:
The delta function δ[g] is given by:
Based on equation (40), equation (39) can be rewritten into:
From the relation of equation (42), equation (38) can be rewritten into:
Hence, equation (38) is rewritten into:
A case in which phase rotation is performed for the frequency domain signal X2[k] will be considered. A time domain signal at this time is as shown in
When the phase spectrum of X2[k] is rotated by ϕ[k], inverse transform is given by:
Substituting equation (45) into equation (36) yields:
Equation (46) is expanded into:
Assuming that the overlapping ratio is 50%, the overlapping section n=L/2 to L−1 will be considered. In the overlapping section, equation (47) can be expanded into:
In this case, the parenthesized term (given by expression (49) below) in each term is vector composition.
$$1 + e^{j\phi[k]} \tag{49}$$
When attention is paid to a specific frequency k, a frequency domain signal can be represented as shown in
When no phase rotation is performed, that is, ϕ[k]=0, a frequency domain signal is as shown in
The absolute value of expression (49) is given by:
$$\left|1 + e^{j\phi[k]}\right| = \sqrt{2 + 2\cos\phi[k]} \tag{50}$$
The absolute value of expression (49) is maximized when ϕ[k]=0, where it equals 2. That is, executing phase rotation decreases the magnitude of the output signal.
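The level drop caused by vector composition can be checked numerically with a short sketch. The function name is hypothetical; the unit-magnitude vectors follow expression (49).

```python
import numpy as np

def composite_magnitude(phi):
    """Magnitude of 1 + exp(j*phi), the sum of two unit vectors."""
    return np.abs(1.0 + np.exp(1j * np.asarray(phi, dtype=float)))

phi = np.linspace(-np.pi, np.pi, 9)
magnitude = composite_magnitude(phi)
closed_form = np.sqrt(2.0 + 2.0 * np.cos(phi))
```

The magnitude reaches 2 only at ϕ=0 and falls to 0 at ϕ=±π, so any nonzero rotation between overlapping frames lowers the composite level.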
To correct the drop of the output signal level, the correction amount calculator 881 decides the amplitude correction amount of the enhanced signal amplitude spectrum.
A correction amount calculation method will be explained in detail. To simplify a problem, attention is paid to variations of the magnitude caused by phase rotation, and respective frequency components are assumed to have been normalized to unit vectors.
First, a case in which no phase rotation is performed will be considered. A composite vector when the phase is the same between successive frames is represented by a vector S shown in
On the other hand, if phase rotation is performed by a normal random number, a composite vector when the phase difference between successive frames is ϕ is represented by a vector S′ shown in
$$|S'| = \sqrt{\{1+\cos\phi\}^{2} + \{\sin\phi\}^{2}} = \sqrt{2 + 2\cos\phi} \tag{52}$$
An expected value E(|S′|²) is given by:
$$E(|S'|^{2}) = E(2 + 2\cos\phi) = E(2) + E(2\cos\phi) \tag{53}$$
For a normal random number, the rate of occurrence of ϕ is decided by a normal distribution. To obtain an expected power value when phase rotation is performed by a normal random number, therefore, it is necessary to perform weighting based on the rate of occurrence of ϕ.
More specifically, a weighting function f(ϕ) based on the rate of occurrence of ϕ is introduced. The weighting function f(ϕ) weights cos(ϕ). Furthermore, normalization using the integral value of the weighting function f(ϕ) can provide an expected power value.
By introducing the weighting function f(ϕ) and its integral value into equation (53) which expresses an expected output power value based on a uniform normal random number, an expected output power value E(|S′|²) when phase rotation is performed using a normal random number is given by:
Expressing the weighting function f(ϕ) by a normal distribution yields:
where σ represents the variance, and μ represents the average.
For example, for a standardized normal distribution having the average μ=0 and the variance σ=1, the weighting function f(ϕ) is given by:
Substituting equation (56) into equation (54) yields:
Then, numerically calculating the second term on the right-hand side of equation (57) establishes:
$$E(|S'|^{2}) = 2\{1 + 0.609\} = 3.218 \tag{58}$$
The ratio of E(|S′|²) to the value E(|S|²) obtained when no phase rotation is performed is given by:
$$E(|S'|^{2}) / E(|S|^{2}) = 3.218 / 4 = 0.805 \tag{59}$$
When rotating the phase with a normal random number of the standardized normal distribution, the correction amount calculator 881 sets sqrt(1/0.805) as the correction coefficient, and transmits it to the amplitude correction unit 882. The phase controller 202 may perform phase rotation for all or some frequencies. The amplitude controller 708 performs amplitude correction for only frequencies having undergone phase rotation. Therefore, the correction coefficient for frequencies that do not undergo phase rotation is set to 1.0. Only the correction coefficient for frequencies having undergone phase rotation takes the value derived here.
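The numerical step leading to equations (58) and (59) can be checked with a short calculation. The following Python sketch is illustrative only and is not part of the disclosed apparatus. It assumes that the integrals are taken over the range from −π to π (the limits are not stated in the text; this range reproduces the value 0.609), it treats σ as the variance as stated above, and it uses Simpson's rule as one possible numerical method. The function names normal_pdf, simpson, and correction_coefficient are hypothetical.

```python
import math


def normal_pdf(phi, mean=0.0, variance=1.0):
    """Normal-distribution weighting function f(phi); sigma is treated as the variance."""
    return math.exp(-(phi - mean) ** 2 / (2.0 * variance)) / math.sqrt(2.0 * math.pi * variance)


def simpson(func, lo, hi, steps=2000):
    """Composite Simpson's rule; steps must be even."""
    h = (hi - lo) / steps
    total = func(lo) + func(hi)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * func(lo + i * h)
    return total * h / 3.0


def correction_coefficient(mean=0.0, variance=1.0, lo=-math.pi, hi=math.pi):
    """Return (E[cos phi], power ratio, correction coefficient).

    The integration range [lo, hi] is an assumption, not taken from the text.
    """
    weighted = simpson(lambda p: normal_pdf(p, mean, variance) * math.cos(p), lo, hi)
    norm = simpson(lambda p: normal_pdf(p, mean, variance), lo, hi)
    expected_cos = weighted / norm              # normalized weighting of cos(phi)
    expected_power = 2.0 + 2.0 * expected_cos   # E(|S'|^2)
    ratio = expected_power / 4.0                # relative to |S|^2 = 4 without rotation
    return expected_cos, ratio, math.sqrt(1.0 / ratio)


expected_cos, ratio, coeff = correction_coefficient()
print(expected_cos, ratio, coeff)
```

Under these assumptions the computed expectation of cos ϕ and the power ratio agree with 0.609 and 0.805 to three decimal places. Passing a different average and variance gives the coefficient for a normal distribution fitted to the statistics of the change amount.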
Although not all the phase rotation characteristics can be completely expressed by a normal distribution, it is possible to apply the above-described correction amount calculation method by approximation using the normal distribution. To do this, it is necessary to collect statistics based on the value of a change amount generated by the change amount generator 203 and its appearance frequency, and obtain the average μ and variance σ of the normal distribution indicated by equation (55). Subsequently, calculation from equation (55) to equation (59) is performed to obtain the square root of the reciprocal of the calculation result as a correction coefficient.
As described above, the noise suppression apparatus 700 according to this embodiment can cause the amplitude controller 708 to eliminate the influence of control of the phase spectrum on an output signal level. Thus, the noise suppression apparatus 700 can obtain a high-quality enhanced signal.
A noise suppression apparatus 1500 according to the fifth embodiment of the present invention will be described with reference to
According to this embodiment, it is possible to efficiently suppress noise derived from a phase by rotating or replacing a deteriorated signal phase spectrum using a deteriorated signal amplitude spectrum or a value obtained from it, and suppress a decrease in output level caused by phase control by controlling an amplitude.
A noise suppression apparatus 1600 according to the sixth embodiment of the present invention will be described with reference to
The change amount limiter 1601 limits the rotation amount generated by the change amount generator 203 to a given range. That is, the change amount limiter 1601 limits the distribution of ϕ to an arbitrary range within 0 to 2π, for example, to a range from 0 to π/2. As a result, the features of the deteriorated signal phase spectrum remain in the enhanced signal phase spectrum to some extent. Compared with a case in which the phase is rotated completely at random, this reduces the influence on a target sound and therefore the distortion of the target sound.
According to this embodiment of the present invention, in addition to the effects of the second embodiment, deterioration of a target sound can be reduced by limiting a phase rotation amount.
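The limiting of the rotation amount can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the disclosed implementation: the text does not specify how the change amount limiter 1601 maps a rotation amount into the limited range, so linear scaling of a value wrapped into 0 to 2π is assumed here (clipping would be another option). The function name limit_change_amount and its parameters low and high are hypothetical.

```python
import math


def limit_change_amount(phi, low=0.0, high=math.pi / 2):
    """Map a rotation amount phi into [low, high].

    Assumption: phi is first wrapped into [0, 2*pi) and then scaled linearly.
    """
    wrapped = phi % (2.0 * math.pi)  # wrapped rotation amount in [0, 2*pi)
    return low + (high - low) * wrapped / (2.0 * math.pi)


# Example: rotation amounts spread over a full turn end up within 0 to pi/2.
limited = [limit_change_amount(p) for p in (0.0, math.pi / 2, math.pi, 3.0 * math.pi / 2)]
print(limited)
```

With the default range, a rotation amount of π is mapped to π/4, so the limited distribution keeps the ordering of the original change amounts while occupying only a quarter of the full turn.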
The seventh embodiment of the present invention will be described with reference to
The phase component delay unit 1781 holds an enhanced signal phase spectrum supplied from the phase controller 202 for one frame, and supplies it to the correction amount calculator 1782.
The correction amount calculator 1782 calculates an amplitude correction amount based on the enhanced signal phase spectrum of an immediately preceding frame supplied from the phase component delay unit 1781 and the current enhanced signal phase spectrum supplied from the phase controller 202, and then transmits the calculated amplitude correction amount to the amplitude correction unit 882.
According to this embodiment, in addition to the effects of the second embodiment, it is possible to correct an output level even if the expected value of the output level cannot be derived mathematically based on a phase change amount.
The correction amount calculator 1782 obtains the magnitude of a composite vector at each frequency from the enhanced signal phase spectra of the preceding and current frames, and decides a correction coefficient based on the magnitude. Letting α be the phase of the preceding frame and β be the phase of the current frame, the magnitude |S′| of the composite vector is given by:
$$|S'| = \sqrt{(\cos\alpha+\cos\beta)^2 + (\sin\alpha+\sin\beta)^2} = \sqrt{2 + 2\sin\alpha\sin\beta + 2\cos\alpha\cos\beta} \qquad (60)$$
A magnitude |S| of a composite vector when the phases of successive frames coincide with each other is |S|=2, which has already been derived by equation (51). Therefore, the amplitude correction amount is given by:
$$|S|/|S'| = 2/\sqrt{2 + 2\sin\alpha\sin\beta + 2\cos\alpha\cos\beta} \qquad (61)$$
In this embodiment, this value is supplied to the amplitude controller 1708 to correct the enhanced signal amplitude spectrum, thereby canceling a drop of the output level. In this embodiment, the components and operations except for the phase rotation unit are the same as those in the second embodiment, and a description thereof will be omitted.
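The per-frequency calculation of equations (60) and (61) can be sketched in Python as follows. This is an illustrative example, not the disclosed implementation. The function names amplitude_correction and correction_per_frequency are hypothetical, and the lower bound floor, which avoids division by zero when the two phases differ by exactly π, is an added assumption that does not appear in the text.

```python
import math


def amplitude_correction(alpha, beta, floor=1e-6):
    """Correction amount |S|/|S'| of equation (61) for one frequency.

    alpha: phase of the preceding frame, beta: phase of the current frame.
    floor is a hypothetical guard against a vanishing composite vector.
    """
    squared = 2.0 + 2.0 * math.sin(alpha) * math.sin(beta) + 2.0 * math.cos(alpha) * math.cos(beta)
    magnitude = math.sqrt(max(0.0, squared))  # |S'| of equation (60)
    return 2.0 / max(magnitude, floor)        # |S| = 2 when the phases coincide


def correction_per_frequency(previous_phases, current_phases):
    """Apply equation (61) to every frequency of two successive frames."""
    return [amplitude_correction(a, b) for a, b in zip(previous_phases, current_phases)]


print(correction_per_frequency([0.0, 0.3, 1.0], [0.0, 0.3 + math.pi / 2, 1.5]))
```

When the phases of successive frames coincide, the correction amount is 1, and it grows as the phase difference approaches π, which is where the composite vector becomes shortest.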
The eighth embodiment of the present invention will be explained with reference to
An invention according to this embodiment is different from
The input/output ratio calculator 1881 obtains the level ratio from the deteriorated signal and the time domain signal of the enhanced signal.
A level ratio R of an enhanced signal $x_n(t)$ of the nth frame to a deteriorated signal $y_n(t)$ of the nth frame is given by:
where t represents the sample time, and L represents the frame length of Fourier transform.
The correction amount calculator 1882 obtains an amplitude correction amount G from the ratio value R and the number of frequency components having undergone phase rotation. When a transformer divides the time domain signal into N frequency components and phase rotation is performed for M phase spectra, the amplitude correction amount G is given by:
The amplitude correction unit 882 executes amplitude correction for only frequencies having undergone phase rotation based on phase rotation presence/absence information transmitted from a change amount generator 203. In this embodiment, the components and operations except for the input/output ratio calculator 1881 and correction amount calculator 1882 are the same as those in the fourth embodiment, and a description thereof will be omitted.
According to this embodiment of the present invention, a correction coefficient is obtained from a time domain signal, so the output level can be corrected regardless of a phase replacement amount decision method.
The ninth embodiment of the present invention will be described with reference to
The averaging processor 1981 receives a deteriorated signal from an input terminal 206, performs averaging processing, and then supplies an average value to the input/output ratio calculator 1881. The averaging processor 1981 receives an enhanced signal from an inverse transformer 204, performs averaging processing, and then supplies an average value to the input/output ratio calculator 1881. The input/output ratio calculator 1881 receives the average values of the deteriorated signal and enhanced signal from the averaging processor 1981, and calculates the level ratio.
The averaging processor 1981 averages the levels of the deteriorated signal and enhanced signal over an arbitrary time length. More specifically, the averaging processor 1981 averages the levels of the deteriorated signal and enhanced signal using a moving average, leaky integration, or the like.
According to this embodiment of the present invention, in addition to the effects of the eighth embodiment, since an averaged level is used, variations of the correction amount can be suppressed to improve the quality of an output signal.
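The averaging followed by the level ratio calculation can be sketched as follows. This Python example is illustrative only. It assumes that the level of a frame is its mean absolute sample value and that leaky integration takes the first-order form shown below; neither definition is given in the text. The names frame_level, leaky_average, smoothed_level_ratios, and the parameter leak are hypothetical.

```python
def frame_level(frame):
    """Level of one frame; mean absolute value is an assumed definition."""
    return sum(abs(sample) for sample in frame) / len(frame)


def leaky_average(previous, current, leak=0.9):
    """First-order leaky integration of a level value (leak is hypothetical)."""
    return leak * previous + (1.0 - leak) * current


def smoothed_level_ratios(deteriorated_frames, enhanced_frames, leak=0.9):
    """Ratio of averaged enhanced level to averaged deteriorated level per frame."""
    ratios = []
    avg_in = avg_out = None
    for y, x in zip(deteriorated_frames, enhanced_frames):
        level_in, level_out = frame_level(y), frame_level(x)
        # The first frame initializes the averages; later frames are smoothed.
        avg_in = level_in if avg_in is None else leaky_average(avg_in, level_in, leak)
        avg_out = level_out if avg_out is None else leaky_average(avg_out, level_out, leak)
        ratios.append(avg_out / avg_in if avg_in > 0.0 else 1.0)
    return ratios


deteriorated = [[1.0, -1.0, 1.0, -1.0]] * 5
enhanced = [[0.5, -0.5, 0.5, -0.5]] * 5
print(smoothed_level_ratios(deteriorated, enhanced))
```

A leak value closer to 1 averages over a longer time and therefore suppresses frame-to-frame variations of the ratio more strongly, at the cost of slower tracking of level changes.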
The 10th embodiment of the present invention will be described with reference to
A deteriorated signal 210 supplied to an input terminal 206 is supplied to a transformer 201 and the amplitude controller 2008. The transformer 201 supplies a deteriorated signal amplitude spectrum 230 to the amplitude component delay unit 2011 and inverse transformer 2013. The transformer 201 also supplies a deteriorated signal phase spectrum 220 to a phase controller 202 and a change amount generator 203. The phase controller 202 controls the deteriorated signal phase spectrum 220 supplied from the transformer 201 by using a change amount generated by the change amount generator 203, and supplies it as an enhanced signal phase spectrum to the inverse transformer 2013 and phase component delay unit 2012. Also, the change amount generator 203 transmits the presence/absence of phase rotation at each frequency to the amplitude controller 2008.
By using the deteriorated signal amplitude spectrum 230 supplied from the transformer 201 and the enhanced signal phase spectrum supplied from the phase controller 202, the inverse transformer 2013 transmits, to the amplitude controller 2008, a signal whose level has dropped due to phase rotation.
The amplitude component delay unit 2011 delays the deteriorated signal amplitude spectrum 230 supplied from the transformer 201, and supplies it to the amplitude controller 2008.
The phase component delay unit 2012 delays the enhanced signal phase spectrum supplied from the phase controller 202, and supplies it to an inverse transformer 204. The amplitude controller 2008 generates a corrected amplitude spectrum 250 based on the deteriorated signal amplitude spectrum supplied from the amplitude component delay unit 2011 by using the output of the inverse transformer 2013 and the deteriorated signal 210.
The inverse transformer 204 performs inverse transform by composing an enhanced signal phase spectrum 240 supplied from the phase controller 202 via the phase component delay unit 2012 and the corrected amplitude spectrum 250 supplied from the amplitude controller 2008, and supplies the inverse transform result as an enhanced signal to an output terminal 207.
The phase controller 202 controls the deteriorated signal phase spectrum 220, and the inverse transformer 2013 transforms the controlled phase spectrum, together with the deteriorated signal amplitude spectrum 230, into a time domain signal. By using this signal and the deteriorated signal 210, the amplitude controller 2008 obtains the amount of a variation in level caused by phase rotation.
The variation amount arises only from the rotation processing by the phase controller 202. The amplitude controller 2008 can therefore accurately grasp a variation in level caused by phase rotation. Although the amplitude controller 2008 executes amplitude correction using the level ratio, the obtained level ratio is one for an immediately preceding frame. Thus, the amplitude component delay unit 2011 and phase component delay unit 2012 are introduced, and the amplitude controller 2008 performs amplitude correction for the frequency component of an immediately preceding frame.
The correction amount calculator 2182 receives phase rotation presence/absence information at each frequency from the change amount generator 203, and calculates an amplitude correction amount. An amplitude correction unit 882 corrects the amplitude spectrum at each frequency based on the amplitude correction amount, and supplies it to the inverse transformer 204.
In addition to the effects of the eighth embodiment, the noise suppression apparatus 2000 according to this embodiment can avoid a delay of the input/output ratio that is inevitable in the eighth and ninth embodiments, thereby correcting the output level more accurately.
The 11th embodiment of the present invention will be described with reference to
When the overlapping ratio is 0%, no amplitude correction is necessary. When the overlapping ratio is 50%, the amplitude correction amount is G. Thus, by using the ratio between a frame length L and an overlapping length Q, the amplitude correction amount is given by:
where G′ represents the amplitude correction amount when correction based on the overlapping ratio is performed.
For example, the fact that Q=L/2 holds for the overlapping ratio of 50% yields:
The fact that Q=L/4 holds for the overlapping ratio of 25% yields:
The amplitude controller 708 corrects a correction coefficient transmitted from a phase controller 202 based on equation (64), and corrects an enhanced signal amplitude spectrum. In this embodiment, the components and operations except for the frame overlapping controller 2208 are the same as those in the fourth embodiment, and a description thereof will be omitted.
In addition to the effects of the fourth embodiment, the noise suppression apparatus 2200 according to this embodiment can freely set the frame overlapping ratio.
Although the aforementioned first to 11th embodiments have described the noise suppression apparatuses having different features, a noise suppression apparatus having a combination of these features is also incorporated in the scope of the present invention.
The present invention can be applied to a system including plural devices or a single apparatus. The present invention can be applied to a case in which a software signal processing program for implementing the functions of the embodiments is supplied to the system or apparatus directly or from a remote site. Therefore, the program installed in a computer to implement the functions of the present invention by the computer, a medium storing the program, or a WWW server from which the program is downloaded is also incorporated in the scope of the present invention.
The CPU 2302 controls the operation of the computer 2300 by loading the signal processing program. That is, the CPU 2302 executes the signal processing program stored in the memory 2304, and transforms a mixed signal, in which the first and second signals coexist, into a phase component for each frequency and an amplitude component or power component for each frequency (step S2311). Then, the CPU 2302 generates a change amount of a phase component at a predetermined frequency by using a series of data with a cross-correlation weaker than that of the phase components and randomness lower than that of random numbers (step S2312). The CPU 2302 controls the phase component in accordance with the generated change amount (step S2313). The CPU 2302 generates an enhanced signal by using the phase component having undergone the control processing in step S2313 (step S2314).
Accordingly, the same effects as those of the first embodiment can be obtained. Note that the same applies to the second to 11th embodiments. The system implemented when the CPU executes the signal processing program for implementing the functions of the embodiments is also incorporated in the scope of the present invention.
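The flow of steps S2311 to S2314 can be sketched for a single frame as follows. This Python example is a simplified illustration under stated assumptions, not the signal processing program itself. A naive discrete Fourier transform stands in for the transformer, no windowing or frame overlap is applied, and the change amount is derived from the amplitude spectrum by taking its fractional part and scaling it to 0 to 2π; that mapping is purely hypothetical and is only one conceivable way of obtaining a data series that is not a random number. The names dft, idft, change_amounts, and process_frame are likewise hypothetical.

```python
import cmath
import math


def dft(x):
    """Naive discrete Fourier transform of a sequence."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)) for k in range(n)]


def idft(spectrum):
    """Naive inverse discrete Fourier transform."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]


def change_amounts(amplitude):
    """Hypothetical mapping from amplitude values to rotation amounts in [0, 2*pi)."""
    return [2.0 * math.pi * (a % 1.0) for a in amplitude]


def process_frame(frame):
    n = len(frame)
    spectrum = dft(frame)                        # step S2311: transform
    amplitude = [abs(c) for c in spectrum]
    phase = [cmath.phase(c) for c in spectrum]
    delta = change_amounts(amplitude)            # step S2312: generate change amounts
    new_phase = list(phase)
    for k in range(1, (n + 1) // 2):             # step S2313: control the phase
        new_phase[k] = phase[k] + delta[k]
        new_phase[n - k] = -new_phase[k]         # keep conjugate symmetry (real output)
    rebuilt = [cmath.rect(a, p) for a, p in zip(amplitude, new_phase)]
    return [v.real for v in idft(rebuilt)]       # step S2314: generate enhanced signal


frame = [0.0, 1.0, 0.5, -0.3, 0.8, -1.0, 0.2, 0.4]
enhanced_frame = process_frame(frame)
print(enhanced_frame)
```

In this sketch only the phase is changed, so the amplitude spectrum of the output frame matches that of the input frame, while the waveform itself differs.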
While the present invention has been described with reference to exemplary embodiments, it is to be understood that the invention is not limited to the disclosed exemplary embodiments. The scope of the following claims is to be accorded the broadest interpretation so as to encompass all such modifications and equivalent structures and functions.
This application claims the benefit of Japanese Patent Application No. 2012-259218 filed on Nov. 27, 2012, which is hereby incorporated by reference herein in its entirety.
Number | Date | Country | Kind |
---|---|---|---|
2012-259218 | Nov 2012 | JP | national |
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/JP2013/079701 | 11/1/2013 | WO | 00 |
Publishing Document | Publishing Date | Country | Kind |
---|---|---|---|
WO2014/083999 | 6/5/2014 | WO | A |
Number | Name | Date | Kind |
---|---|---|---|
6952460 | Van Wechel | Oct 2005 | B1 |
20040228422 | Silveira | Nov 2004 | A1 |
20050129139 | Jones | Jun 2005 | A1 |
20060256978 | Balan | Nov 2006 | A1 |
20070014428 | Kountchev | Jan 2007 | A1 |
20080247569 | Kondo | Oct 2008 | A1 |
20080319739 | Mehrotra | Dec 2008 | A1 |
20120177156 | Hauske | Jul 2012 | A1 |
20130336133 | Carbonelli | Dec 2013 | A1 |
Number | Date | Country |
---|---|---|
10-149198 | Jun 1998 | JP |
2002-023800 | Jan 2002 | JP |
2004-272292 | Sep 2004 | JP |
2008-257049 | Oct 2008 | JP |
2007029536 | Mar 2007 | WO |
2012070671 | May 2012 | WO |
2012114628 | Aug 2012 | WO |
Entry |
---|
Japanese Office Action for JP Application No. 2014-550098 dated Nov. 28, 2017. |
International Search Report for PCT Application No. PCT/JP2013/079701, dated Feb. 4, 2014. |
Akihiko Sugiyama, “Single-Channel Impact-Noise Suppression with No Auxiliary Information for Its Detection,” Proc. IEEE Workshop on Appl. of Sig.Proc. to Audio and Acoustics(WASPAA), pp. 127-130, Oct. 2007. |
B. Rad and T. Virtanen, “Phase spectrum prediction of audio signals,” Proc. ISCCSP2012, CD-ROM, May 2012. |
S. T. Kaplan and T. J. Ulrych, “Phase Unwrapping: A review of methods and a novel technique,” Proc. 2007 CSPG CSEG Conv. pp. 534-537, May 2007. |
Number | Date | Country | |
---|---|---|---|
20150295741 A1 | Oct 2015 | US |