This application claims the priority of Korean Patent Application Nos. 10-2009-0127541 filed on Dec. 18, 2009 and 10-2010-0104197 filed on Oct. 25, 2010, in the Korean Intellectual Property Office, the disclosures of which are incorporated herein by reference.
1. Field of the Invention
The present invention relates to a signal processing technique and, more particularly, to a blind signal separating method for separating respective signals from multi-channel multi-path mixed signals, and an apparatus for performing the same.
2. Description of the Related Art
In general, a plurality of signal sources in a multi-channel multi-path environment reach respective sensors via various paths and are mixed in the respective sensors. Among the various paths from the locations of the signal sources to the sensors, a direct path involves a time delay corresponding to relative locations of the signal sources and sensors.
An independent component analysis (ICA) technique, using the fact that signal sources are statistically independent, estimates the propagation paths of the signal sources from multi-channel signals and separates the signal sources, without any information regarding the signal sources provided in advance.
Also, a frequency domain ICA technique is a method in which ICA is applied at each frequency. In this case, because the ICA is applied separately at each frequency, the separated signals are permuted across frequencies, and one of the methods for solving this permutation problem is to utilize direction information of the signals.
The method for separating a blind signal using the frequency domain ICA and the permutation phenomenon will now be described in detail. First, when an nth (n=1, . . . , N) signal source is sn(t) and an impulse response from the nth signal source to an mth (m=1, . . . , M) sensor is hmn, the mixed signal xm collected from the mth sensor can be represented by Equation 1 shown below:
xm(t)=Σn hmn*sn(t), summed over n=1, . . . , N [Equation 1]
In Equation 1, * indicates convolution, and the impulse response hmn is a mixture filter administering the process of mixing the signal sources by the convolution. Signal processing is performed in a frequency domain, so mixed signals in a time domain are multiplied by a window function and then converted into signals of the frequency domain through short-time Fourier Transform.
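For illustration only, the convolutive mixing described for Equation 1 may be sketched in Python as follows. The function names `convolve` and `mix` and the toy impulse responses are assumptions introduced for this example and do not appear in the original disclosure.

```python
def convolve(h, s):
    """Full linear convolution of impulse response h with signal s."""
    out = [0.0] * (len(h) + len(s) - 1)
    for i, hv in enumerate(h):
        for j, sv in enumerate(s):
            out[i + j] += hv * sv
    return out


def mix(h, sources):
    """Convolutive mixture: x_m = sum over n of (h[m][n] convolved with s_n).

    h[m][n] is the impulse response from source n to sensor m.
    All impulse responses share one length, as do all sources.
    """
    mixed = []
    for h_m in h:
        parts = [convolve(h_mn, s_n) for h_mn, s_n in zip(h_m, sources)]
        mixed.append([sum(vals) for vals in zip(*parts)])
    return mixed


# Hypothetical toy data: 2 sources, 2 sensors, 2-tap impulse responses.
sources = [[1.0, 0.0, 0.0], [0.0, 2.0, 0.0]]
h = [
    [[1.0, 0.5], [0.3, 0.0]],  # paths into sensor 1
    [[0.0, 0.4], [1.0, 0.2]],  # paths into sensor 2
]
x = mix(h, sources)
```

Each sensor signal in `x` is the sum of every source filtered by its own path, which is the situation the separation method has to undo.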
The mixed signals in the frequency domain can be represented by Equation 2 shown below:
xm(f,t)=Σn Hmn(f)sn(f,t), summed over n=1, . . . , N [Equation 2]
In Equation 2, f indicates a frequency index, t indicates a time index, and xm(f,t), Hmn(f), and sn(f,t) are obtained by Fourier-transforming xm, hmn, and sn, respectively. In general, the impulse response hmn changes over time, but hereinafter, it is assumed that the impulse response hmn is time-invariant for the sake of brevity.
When the signal sources and the mixed signals are defined as s(f,t)=[s1(f,t), . . . , sN(f,t)]T and x(f,t)=[x1(f,t), . . . , xM(f,t)]T in a vector form, the mixed signals can be represented by Equation 3 shown below:
x(f,t)=H(f)s(f,t) [Equation 3]
In the frequency domain, ICA with respect to a complex value (CICA: Complex-valued ICA) is separately applied at each frequency to calculate a separation filter W(f). Applicable CICA methods include FastICA (E. Bingham et al., “A fast fixed-point algorithm for independent component analysis of complex-valued signals,” International Journal of Neural Systems, vol. 10, no. 1, pp. 1-8, 2000), Infomax (M. S. Pedersen et al., “A survey of convolutive blind source separation methods,” in Multichannel Speech Processing Handbook, Jacob Benesty and Arden Huang, Eds, Springer, 2007), and the like.
The separated signals with respect to the mixed signals are calculated as represented by Equation 4 shown below:
y(f,t)=W(f)x(f,t) [Equation 4]
Because ICA is independently applied to each frequency and the statistical independence of signals is not affected by the order of the signals or by changes in their amplitude, the resulting separation filters are ordered arbitrarily at each frequency and have arbitrary scales. These ambiguities will be referred to as permutation and scaling ambiguities. Here, the scaling ambiguity can be solved by a minimum distortion principle.
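The two ambiguities can be illustrated with a small numerical sketch for a single frequency. The matrices H, P, and D below are hypothetical values chosen for the example, and the rescaling step uses the commonly cited form of the minimum distortion principle, W(f) ← diag(W−1(f))W(f), which is assumed here rather than taken from the present description.

```python
def inv2(a):
    """Inverse of a 2x2 (complex) matrix given as nested lists."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det],
            [-a[1][0] / det, a[0][0] / det]]


def matmul(a, b):
    """Product of two matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


# Hypothetical mixing matrix H(f) at one frequency (2 sensors, 2 sources).
H = [[1.0 + 0.0j, 0.6 - 0.2j],
     [0.5 + 0.3j, 1.0 + 0.0j]]

# An ideal separator is inv(H), but ICA can only find it up to a row
# permutation P and a diagonal scaling D: W = P D inv(H).
P = [[0, 1], [1, 0]]
D = [[2.0j, 0], [0, -0.5]]
W = matmul(P, matmul(D, inv2(H)))

# W H = P D: each output still contains exactly one source,
# but in swapped order and with an arbitrary complex gain.
WH = matmul(W, H)

# Minimum distortion principle (assumed form): rescale row i of W
# by the i-th diagonal entry of inv(W).
A = inv2(W)
W_mdp = [[A[i][i] * W[i][j] for j in range(2)] for i in range(2)]
```

After the rescaling, the product of `W_mdp` and `H` no longer depends on `D`, while the swapped order remains; this remaining permutation is the problem addressed in the rest of the description.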
Also, various methods for solving the permutation problem of the frequency domain ICA have been proposed, and among the methods, a method of solving the permutation by using direction information of a separation filter is advantageous in that it can be employed irrespective of a type of signals and provides excellent performance.
When a far-field model, which disregards a signal echo and considers only a direct path because the distance between a sensor and a signal source is sufficiently long, is taken into account, the relationship between the direction of the signal and the mixture filter can be represented by Equation 5 shown below:
In Equation 5, λm indicates an attenuation of a direct path, v indicates a propagation speed of a signal, and dm and θn indicate the position of an mth sensor and a direction angle of an nth signal source based on the front side of the sensor when the position of a reference sensor m′ is set to be 0. The ratio of the direct path can be represented by Equation 6 shown below:
In Equation 6, τmn indicates a relative delay time taken for the nth signal source to reach the mth sensor based on the reference sensor m′. The phase of this ratio has a value ranging from −π to π, so when the frequency is f≥1/(2|τmn|)≥v/(2dm), aliasing occurs, and at this time, the integer k has a nonzero value.
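As a rough numerical illustration of this aliasing condition, the following sketch computes the frequency v/(2d) and the wrap count k for a given delay. The 20 cm spacing, the speed of 343 m/s, and the convention that the unwrapped phase equals 2πfτ are assumptions made for the example.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s; assumed propagation speed v


def aliasing_free_limit(sensor_spacing_m, v=SPEED_OF_SOUND):
    """Frequency v / (2 d) below which no direction causes phase wrapping."""
    return v / (2.0 * sensor_spacing_m)


def wrap_phase(phase):
    """Map a phase in radians into the interval [-pi, pi)."""
    return (phase + math.pi) % (2.0 * math.pi) - math.pi


def wrap_count(f_hz, delay_s):
    """Integer k such that 2*pi*f*delay + 2*pi*k lies in [-pi, pi)."""
    true_phase = 2.0 * math.pi * f_hz * delay_s
    return round((wrap_phase(true_phase) - true_phase) / (2.0 * math.pi))


spacing = 0.20                          # hypothetical 20 cm sensor spacing
f_limit = aliasing_free_limit(spacing)  # 857.5 Hz for these values
delay = spacing / SPEED_OF_SOUND        # largest possible delay for this pair
k_low = wrap_count(500.0, delay)        # below the limit: k is 0
k_high = wrap_count(3000.0, delay)      # above the limit: k is nonzero
```

For these values the observed phase at 3 kHz differs from the unwrapped phase by two full turns, so a delay can no longer be read directly from a single phase value.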
As for respective streams of the separation filter W(f) obtained from the results of ICA, spectral nulls are positioned on a spatial spectrum in the direction of the signal sources in order to remove the remaining signals other than one signal. In this sense, the separation filter has information regarding the direction of the signal sources, which is mathematically equivalent to a null-beamformer.
Meanwhile, the separation filter is the inverse of the mixture filter, so A(f)=W−1(f), obtained by taking the inverse of the separation filter, is equal to the mixture filter H(f) except for the permutation and scaling. Thus, based on these characteristics, a method of estimating direction information of a signal source from A(f) and sorting the columns of A(f) such that they have the same direction information as the estimated direction information has been proposed. Here, as the scheme of sorting the columns of A(f), the inverse of the separation filter, a k-means clustering scheme is applied. However, when spatial aliasing occurs due to a wide frequency band of a signal or due to a large spacing between sensors, the integer k takes a nonzero value, so a one-to-one correspondence is not maintained between the direction information and the phase information (or time delay information), and the method cannot be employed.
To overcome this shortcoming, a method of modeling the mixture filter as a direct path model having a time delay and an attenuation factor and clustering the columns of A(f) by using the model has been proposed. A k-means clustering scheme is also applied to this method. However, as the k-means clustering scheme does not utilize statistical characteristics, its performance may be degraded in an environment in which an echo is large or background noise is present. In addition, in order to accurately normalize a phase, the approximate size of a sensor array must be known and information regarding the disposition of sensors, or the like, is required.
Another method for solving the permutation problem directly uses the phase of a separation filter, rather than taking the inverse of the separation filter. However, because this method utilizes W(f), which forms a spectral null with respect to a signal source, it cannot be applied to a case in which there are three or more signal sources. Also, because this method does not consider statistical characteristics, its performance may be degraded in an environment with excessive echo, and information regarding the size and disposition of a sensor array is required.
An aspect of the present invention provides a method for separating a blind signal capable of solving permutation of a separation filter without advance information regarding a sensor array and thus improving the separation performance.
Another aspect of the present invention provides an apparatus for separating a blind signal through the method for separating a blind signal.
According to an aspect of the present invention, there is provided a method for separating a blind signal, including: converting mixed signals of a time domain collected by using a plurality of sensors into mixed signals of a frequency domain; calculating a separation filter from the mixed signals which have been converted into those of the frequency domain; calculating an inverse filter of the separation filter; calculating the difference in phase between the respective sensors from the calculated inverse filter; permutation-sorting the separation filter by using the calculated phase difference; and separating the mixed signals of the frequency domain by using the permutation-sorted separation filter.
In the calculating of the difference in phase between the sensors from the calculated inverse filter, a certain sensor among the plurality of sensors may be set as a reference sensor, and the difference between the phase of each row of the matrix of the inverse filter and the phase of the row corresponding to the reference sensor may be calculated.
The permutation-sorting of the separation filter may include: estimating a time delay parameter based on the calculated phase difference; calculating permutation-sorting based on the estimated time delay parameter; and permutation-sorting the separation filter by using the calculated permutation-sorting.
In the estimating of the time delay parameter, θ which maximizes a cost function of Equation of
(where Nφ(m,l,n,k,f)≡N(φmO
In the calculating of the permutation-sorting, a permutation-sorting that maximizes a posterior probability of a permutation combination of each frequency may be calculated by using
In the permutation-sorting of the separation filter, the whole frequency band may be divided into a low frequency band and a high frequency band based on a predetermined particular frequency, and then the permutation-sorting may be performed.
According to another aspect of the present invention, there is provided an apparatus for separating a blind signal, including: a sensor unit configured to include a plurality of sensors each collecting a mixed signal; a DFT unit converting mixed signals of a time domain provided from the sensors into mixed signals of a frequency domain; an independent component analyzing unit calculating a separation filter from the mixed signals which have been converted into those of the frequency domain; a permutation-sorting unit calculating an inverse filter of the separation filter, calculating a phase difference between sensors from the calculated inverse filter, and permutation-sorting the separation filter by using the calculated phase difference; and a signal separating unit separating the mixed signals of the frequency domain by using the permutation-sorted separation filter.
The above and other aspects, features and other advantages of the present invention will be more clearly understood from the following detailed description taken in conjunction with the accompanying drawings, in which:
FIGS. 4a, 4b and 4c are views illustrating an environment for evaluating the method for separating a blind signal according to an exemplary embodiment of the present invention;
The present invention will now be described more fully hereinafter with reference to the accompanying drawings, in which preferred embodiments of the invention are shown.
However, it should be understood that the following exemplifying description of the invention is not meant to restrict the invention to specific forms of the present invention but rather the present invention is meant to cover all modifications, similarities and alternatives which are included in the spirit and scope of the present invention. The terms used in the present application are merely used to describe particular embodiments, and are not intended to limit the present invention. Unless otherwise defined, all terms used herein, including technical or scientific terms, have the same meanings as those generally understood by those with ordinary knowledge in the field of art to which the present invention belongs. Such terms as those defined in a generally used dictionary are to be interpreted to have the meanings equal to the contextual meanings in the relevant field of art, and are not to be interpreted to have ideal or excessively formal meanings unless clearly defined in the present application.
Embodiments of the present invention will be described below in detail with reference to the accompanying drawings, in which components that are the same or correspond to each other are given the same reference numeral regardless of the figure number, and redundant explanations are omitted.
With reference to
Next, the blind signal separating apparatus converts the collected mixed signals xm of a time domain into signals xm(f,t) of a frequency domain through short-time Fourier transform (step 103). Here, the mixed signals xm of the time domain are multiplied by a window function and then converted into the signals of the frequency domain. As the window function, a Hamming window may be used. Here, f indicates a frequency index, and t indicates a time index.
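A minimal sketch of the conversion of step 103 is given below, assuming a naive DFT and a short frame so that the example stays small. The function names, the frame length of 16, and the test sinusoid are assumptions made for illustration; a practical implementation would use an FFT routine.

```python
import cmath
import math


def hamming(length):
    """Hamming window: 0.54 - 0.46 * cos(2*pi*n / (length - 1))."""
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * n / (length - 1))
            for n in range(length)]


def dft(frame):
    """Naive O(N^2) discrete Fourier transform of one frame."""
    n_pts = len(frame)
    return [sum(frame[n] * cmath.exp(-2j * math.pi * f * n / n_pts)
                for n in range(n_pts))
            for f in range(n_pts)]


def stft(signal, frame_len, hop):
    """Return X[t][f]: windowed DFT of each frame (time index t, bin f)."""
    window = hamming(frame_len)
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        chunk = signal[start:start + frame_len]
        frames.append(dft([w * x for w, x in zip(window, chunk)]))
    return frames


# Hypothetical toy input: 32 samples of a sinusoid with a period of 8 samples.
x_m = [math.sin(2.0 * math.pi * n / 8.0) for n in range(32)]
X_m = stft(x_m, frame_len=16, hop=8)  # hop of half a frame: 50% overlap
```

The result is indexed as `X_m[t][f]`, so the ICA of the next step would operate on the sequence over t for each fixed f.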
The blind signal separating apparatus independently and separately processes the mixed signals xm(f,t), which have been converted into those of the frequency domain, in each frequency f by using an independent component analysis (ICA) to calculate a separation filter matrix W(f) (step 105). Here, the separation filter matrix W(f) is in a randomly permuted state, so a permutation-sorting process is required.
For the permutation-sorting, first, the blind signal separating apparatus calculates an inverse matrix (or an inverse filter) A(f)=W−1(f) of the separation filter matrix W(f) (step 107).
Thereafter, the blind signal separating apparatus performs permutation-sorting by using direction information of the inverse matrix A(f) of the separation filter.
To this end, first, the blind signal separating apparatus calculates a phase difference matrix Φ(f) from the inverse matrix (or an inverse filter matrix) A(f) of the separation filter (step 109). Here, the blind signal separating apparatus may use a Gaussian mixture model with respect to the phase difference.
In detail, the blind signal separating apparatus calculates the difference in phase between the mth row of the inverse filter A(f) of the separation filter and the reference row m′ as represented by Equation 7 shown below:
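The phase-difference computation described here may be sketched as follows. The function name, the example matrix, and the choice of taking the phase of the ratio A[m][n]/A[ref][n] (which yields a wrapped value) are assumptions made for this illustration.

```python
import cmath


def phase_difference(A, ref=0):
    """Phase of each entry of A relative to the reference row.

    A[m][n] is the inverse-filter entry for sensor m and separated
    output n. The result is a wrapped phase because cmath.phase
    returns values between -pi and pi.
    """
    return [[cmath.phase(A[m][n] / A[ref][n]) for n in range(len(A[0]))]
            for m in range(len(A))]


# Hypothetical 3-sensor, 2-source inverse filter at one frequency.
A = [
    [1.0 + 0.0j, 2.0 + 0.0j],
    [0.0 + 1.0j, 1.0 - 1.0j],
    [-1.0 + 0.0j, 0.0 - 3.0j],
]
phi = phase_difference(A)
```

Because each column is divided by its own reference entry, a complex gain applied to a whole column of A cancels, so the phase differences are unaffected by the scaling ambiguity, apart from wrapping at ±π.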
When there is an echo and noise, the phase difference φmn(f) may be modeled as a random variable of Gaussian probability distribution having a mean of 2πfτmn and a variance of σmn2.
In Equation 8, a constant k has a nonzero integer value when there is aliasing. The integer value is determined by (f,τmn) and may be set within a limited range from −K to K. Here, K may be determined to be different in each frequency according to the disposition and size of the sensor array. When the size of the sensor array is not accurately known, a sufficiently large value may be set.
The probability distribution of φmn(f) with respect to every available k value can be represented by Equation 9 shown below:
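One way to picture a distribution over every available k is a sum of Gaussians whose means are shifted by multiples of 2π. The sketch below assumes equal weight for every k and a mean of 2πfτ+2πk; the exact form and weighting of Equation 9 are not reproduced here, and all numerical values are hypothetical.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a normal distribution N(x; mean, var)."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def wrapped_phase_density(phi, f_hz, tau, var, k_max):
    """Sum of Gaussians centred at 2*pi*f*tau + 2*pi*k for k = -K..K.

    Assumed form: equal weight for every k.
    """
    base = 2.0 * math.pi * f_hz * tau
    return sum(gaussian_pdf(phi, base + 2.0 * math.pi * k, var)
               for k in range(-k_max, k_max + 1))


# Hypothetical values: a 0.5 ms delay observed at 3 kHz (1.5 cycles).
tau, var = 0.5e-3, 0.1
observed = math.pi  # the unwrapped phase 3*pi is observed near +/- pi
p_near = wrapped_phase_density(observed, 3000.0, tau, var, k_max=2)
p_far = wrapped_phase_density(0.0, 3000.0, tau, var, k_max=2)
p_no_wrap = wrapped_phase_density(observed, 3000.0, tau, var, k_max=0)
```

With K=0 the wrapped observation is assigned an almost zero density, whereas allowing nonzero k explains it well; this is why the range of k must cover the aliasing that can occur at the frequency in question.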
In order to solve the permutation problem, the set of permutations that can be generated is defined as O={O1, . . . , Ol, . . . , OP}. Here, P=N!. Also, in order to solve the permutation problem by using an expectation maximization scheme, a latent variable zfl is defined as follows.
(1) When the inverse matrix A(f) of the separation filter corresponds to the permutation Ol in the frequency f, zfl has a value of 1.
(2) When
When a reference sensor is set to be m′=1 for simplifying the formula, the phase difference may be expressed as a matrix as represented by Equation 10 shown below:
When it is assumed that the respective phase differences are statistically independent from each other, the probability distribution of the observed phase difference, given that it corresponds to the permutation Ol, can be expressed as represented by Equation 11 shown below:
In Equation 11, Ol(n) is the nth element of the lth permutation Ol. Also, the sum over m accounts for the phase differences of all the sensors with respect to the reference sensor.
From the foregoing model, the probability of Φ(f) can be represented by Equation 12 by averaging all the permutations.
Also, the blind signal separating apparatus estimates a time delay parameter in order to solve the permutation from the calculated phase difference as described above (step 111).
The process of estimating a parameter will now be described in detail with reference to
Next, the blind signal separating apparatus defines a parameter to be estimated for the low frequency band as θ={τmn,σmn2,ψl}, and initializes the parameter θ={τmn,σmn2,ψl} with respect to the low frequency band with a suitable value (step 111-3). Here, it is defined as Nφ(m,l,n,k,f)≡N(φmO
In estimating the parameter θ, in a state in which a previous parameter θold is given, θ, which maximizes a cost function as represented in Equation 13, is estimated by using an expectation-maximization (EM) technique.
To this end, first, when the parameter is given, the blind signal separating apparatus calculates posterior probability βfl of the permutation in the frequency f as represented by Equation 14 shown below (step 111-5).
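A hedged sketch of the posterior computation of step 111-5 follows. It assumes that the likelihood under each permutation is a product, over sensors and sources, of sums of Gaussians with means 2πfτmn+2πk, and that the posterior is that likelihood weighted by the prior ψl and normalized over all permutations. The function names, argument layout, and numerical values are hypothetical, and the exact form of Equation 14 is not reproduced here.

```python
import itertools
import math


def log_wrapped_density(phi, mean, var, k_max):
    """Log of a sum of Gaussians centred at mean + 2*pi*k, k = -K..K."""
    terms = [-0.5 * (phi - mean - 2.0 * math.pi * k) ** 2 / var
             for k in range(-k_max, k_max + 1)]
    peak = max(terms)
    log_sum = peak + math.log(sum(math.exp(t - peak) for t in terms))
    return log_sum - 0.5 * math.log(2.0 * math.pi * var)


def permutation_posterior(phi, f_hz, tau, var, psi, k_max=2):
    """Return (perms, beta), where beta[l] is the posterior of perms[l].

    phi[m][j]: observed phase difference for sensor m, separated output j.
    tau[m][n], var[m][n]: delay and variance of source n at sensor m.
    psi[l]: prior of permutation l. Under permutation perm, source n is
    assumed to appear in output perm[n].
    """
    n_src = len(phi[0])
    perms = list(itertools.permutations(range(n_src)))
    log_post = []
    for l, perm in enumerate(perms):
        total = math.log(psi[l])
        for m in range(len(phi)):
            for n in range(n_src):
                mean = 2.0 * math.pi * f_hz * tau[m][n]
                total += log_wrapped_density(phi[m][perm[n]], mean,
                                             var[m][n], k_max)
        log_post.append(total)
    peak = max(log_post)  # subtract the peak for numerical stability
    weights = [math.exp(v - peak) for v in log_post]
    norm = sum(weights)
    return perms, [w / norm for w in weights]


# Hypothetical case: one non-reference sensor, two sources, outputs swapped.
tau = [[0.2e-3, -0.1e-3]]
var = [[0.05, 0.05]]
phi = [[-0.60, 1.30]]
perms, beta = permutation_posterior(phi, 1000.0, tau, var, psi=[0.5, 0.5])
```

In this toy case nearly all of the posterior mass falls on the swapped permutation `(1, 0)`, since the observed phases match the model only when the two outputs are exchanged.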
And then, an auxiliary function is defined as follows.
Thereafter, the parameter θ, which maximizes Equation 15, is calculated as represented by Equation 16 to Equation 18 shown below (step 111-7).
The estimated value with respect to ψl expressed in Equation 18 can be calculated by optimizing Equation 1 such that it satisfies the condition of
Also, in Equation 18, F indicates the total number of discrete frequencies.
In Equation 16, γ′fl is expressed as shown in Equation 19 below, and in Equation 17, γ″fl is expressed as shown in Equation 20 below:
Thereafter, the blind signal separating apparatus calculates a likelihood ratio function Q(θ|θold) by using Equation 15 (step 111-9).
Then, the blind signal separating apparatus determines whether or not the parameter estimation has converged based on the previously calculated results of the likelihood ratio function (step 111-11). When it is determined that the parameter estimation has not converged, the blind signal separating apparatus returns to step 111-3 and repeatedly performs steps 111-3 to 111-11.
When it is determined that the parameter estimation has sufficiently converged in step 111-11, the blind signal separating apparatus initializes the parameter θ={τmn,σmn2,ψl} with respect to the high frequency band with a proper value (step 111-13).
Thereafter, the blind signal separating apparatus performs steps 111-15 to 111-21 in the same manner as steps 111-5 to 111-11 performed at the low frequency band, to estimate a parameter with respect to the high frequency band.
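The iterate-until-converged control flow can be sketched generically as below. The helper `run_em` and the toy problem (estimating the two means of a one-dimensional Gaussian mixture with unit variances) are assumptions used only to make the loop runnable; they stand in for, and are not, the phase-difference model of the present description.

```python
import math


def run_em(e_step, m_step, objective, theta, tol=1e-6, max_iter=200):
    """Alternate E and M steps until the objective stops changing."""
    previous = None
    for iteration in range(1, max_iter + 1):
        beta = e_step(theta)               # posterior given current parameters
        theta = m_step(beta)               # parameters that maximize the bound
        current = objective(theta, beta)   # value used for the convergence test
        if previous is not None and abs(current - previous) < tol:
            return theta, iteration
        previous = current
    return theta, max_iter


# Toy stand-in problem: two unit-variance Gaussians with unknown means.
data = [-2.1, -1.9, -2.0, 1.9, 2.0, 2.1]


def e_step(means):
    resp = []
    for x in data:
        w = [math.exp(-0.5 * (x - mu) ** 2) for mu in means]
        total = sum(w)
        resp.append([v / total for v in w])
    return resp


def m_step(resp):
    return [sum(r[j] * x for r, x in zip(resp, data)) / sum(r[j] for r in resp)
            for j in range(2)]


def log_likelihood(means, _resp):
    return sum(math.log(sum(0.5 * math.exp(-0.5 * (x - mu) ** 2)
                            / math.sqrt(2.0 * math.pi) for mu in means))
               for x in data)


means, n_iter = run_em(e_step, m_step, log_likelihood, [-1.0, 1.0])
```

The same skeleton could be run once for the low frequency band and once for the high frequency band, each with its own initial parameter value.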
When the estimation of the parameter is completed according to the process illustrated in
A posterior probability of the permutation given the phase difference can be represented, by the Bayes rule, by Equation 22 shown below:
A desired permutation-sorting can be determined as represented by Equation 23 shown below, from Equation 22, such that the posterior probability is maximized.
Thereafter, the blind signal separating apparatus performs permutation-sorting on the separation filter W(f) by using the permutation-sorting of Equation 23 (step 115), and then separates the mixed signals by using the separation filter of which permutation-sorting has been solved (step 117).
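Selecting the most probable permutation and reordering the separation filter at one frequency might look like the following sketch. The posterior values, the matrices, and the convention that source n appears in output perm[n] are assumptions made for the example.

```python
def best_permutation(perms, beta):
    """Permutation with the largest posterior probability."""
    best_index = max(range(len(beta)), key=beta.__getitem__)
    return perms[best_index]


def sort_separation_filter(W, perm):
    """Reorder the rows of W so that row n extracts source n."""
    return [W[perm[n]] for n in range(len(perm))]


def matmul(a, b):
    """Product of two matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


# Hypothetical mixing matrix and a row-swapped version of its inverse.
H = [[1.0, 0.5],
     [0.25, 1.0]]
W = [[-0.25 / 0.875, 1.0 / 0.875],   # row 0 currently extracts source 1
     [1.0 / 0.875, -0.5 / 0.875]]    # row 1 currently extracts source 0

perms = [(0, 1), (1, 0)]
beta = [0.03, 0.97]                  # hypothetical posteriors at this frequency
perm = best_permutation(perms, beta)
W_sorted = sort_separation_filter(W, perm)
y_gain = matmul(W_sorted, H)         # close to the identity after sorting
```

Repeating this at every frequency gives a separation filter whose rows refer to the same source across the whole band, which is what allows the separated spectra to be recombined into time-domain signals.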
And then, the blind signal separating apparatus outputs and stores the separated signals (step 119).
With reference to
The sensor unit 310 may include a plurality of microphones (sensors) configured in the form of an array, and each of the sensors collects mixed signals xm (m=1, . . . , M) of multiple paths. Here, the mixed signals xm collected through the sensor unit 310 may be provided to the DFT unit 320 and, simultaneously, stored in the storage unit 370.
The DFT unit 320 receives the mixed signals xm of the time domain from the sensor unit 310 and performs discrete Fourier transform on the received mixed signals xm to convert them into signals xm(f,t) of the frequency domain. Here, the DFT unit 320 may multiply the collected mixed signals xm of the time domain by a window function and then convert them into the signals xm(f,t) of the frequency domain through the short-time Fourier transform.
The independent component analyzing unit 330 receives the mixed signals xm(f,t) which have been converted into those of the frequency domain, from the DFT unit 320 and performs independent component analysis (ICA) on the received signals to calculate a separation filter matrix W(f) with respect to each frequency f.
The permutation-sorting unit 340 calculates an inverse matrix A(f) of the separation filter matrix W(f) provided from the independent component analyzing unit 330, calculates a phase difference matrix from the inverse matrix A(f), calculates permutation-sorting by estimating a time delay parameter from the phase difference, and then sorts the permutation of the separation filter matrix W(f).
Here, the permutation-sorting unit 340 may perform the steps 107 to 115 in
The signal separating unit 350 separates the mixed signals by using the separation filter, whose permutation has been sorted, provided from the permutation-sorting unit 340.
The IFFT unit 360 performs IFFT on the separated signals of the frequency domain provided from the signal separating unit 350 to convert them into signals of the time domain.
The storage unit 370 stores the signals which have been converted into those of the time domain.
The DFT unit 320, the independent component analyzing unit 330, the permutation-sorting unit 340, the signal separating unit 350, and the IFFT (Inverse Fast Fourier Transform) unit 360 may be implemented in the form of a software program which can be read from an information processing device such as a computer, or the like, and executed, or may be implemented in the form of hardware, such as specifically devised ASIC (Application Specific Integrated Circuits), a digital signal processor, or the like, or a combination of hardware and software.
For example, when the blind signal separating apparatus as illustrated in
FIGS. 4a, 4b and 4c are views illustrating an environment for evaluating the method for separating a blind signal according to an exemplary embodiment of the present invention.
As shown in
Mixed signals were acquired by measuring an impulse response in an actual laboratory space as shown in
The mixed signals collected by the sensors (microphones) were segmented with a Hamming window having a length of 2048 samples and a 50% overlap, and then converted into signals of the frequency domain through FFT (Fast Fourier Transform).
In the performance evaluation experiment, the performances of various combinations of microphone-voice signals were compared. The separation performance was expressed by an SIR (Signal-to-Interference Ratio) and an SDR (Signal-to-Distortion Ratio). Here, the separation performance was calculated by using BSS EVAL MATLAB Toolbox (R. Gribonval, C. Fevotte, and E. Vincent, BSS EVAL Toolbox User Guide, Revision 2.0, IRISA Technical Report 1706, April 2005).
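For reference, the two measures may be sketched as below, following the usual BSS EVAL definitions in which an estimated source is assumed to have already been decomposed into a target part and interference, noise, and artifact error terms. The decomposition itself is not shown, and the sample values are hypothetical.

```python
import math


def energy(signal):
    """Sum of squared samples."""
    return sum(v * v for v in signal)


def sir_db(s_target, e_interf):
    """Signal-to-Interference Ratio in dB."""
    return 10.0 * math.log10(energy(s_target) / energy(e_interf))


def sdr_db(s_target, e_interf, e_noise, e_artif):
    """Signal-to-Distortion Ratio in dB (all error terms combined)."""
    distortion = [a + b + c for a, b, c in zip(e_interf, e_noise, e_artif)]
    return 10.0 * math.log10(energy(s_target) / energy(distortion))


# Hypothetical decomposition of one estimated source.
s_target = [1.0, -1.0, 1.0, -1.0]
e_interf = [0.1, 0.1, -0.1, -0.1]
e_noise = [0.0, 0.0, 0.0, 0.0]
e_artif = [0.1, -0.1, -0.1, 0.1]

sir = sir_db(s_target, e_interf)
sdr = sdr_db(s_target, e_interf, e_noise, e_artif)
```

Because the SDR counts every error term while the SIR counts only the leakage from other sources, the SDR of a given estimate cannot exceed its SIR when the error terms are uncorrelated, as in this toy case.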
In the performance evaluation experiment, the low frequency and high frequency bands were classified based on 1562.5 Hz (discrete frequency index f=200).
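The relation between the discrete index 200 and 1562.5 Hz can be checked with a short calculation. The sampling rate of 16 kHz is an assumption inferred from these two numbers and the 2048-sample window, and assigning the boundary bin to the high band is likewise an arbitrary choice for the example.

```python
def bin_to_hz(index, sample_rate_hz, fft_len):
    """Centre frequency of a DFT bin."""
    return index * sample_rate_hz / fft_len


def split_bands(n_bins, split_index):
    """Bin indices of the low band and the high band."""
    low = list(range(0, split_index))
    high = list(range(split_index, n_bins))
    return low, high


SAMPLE_RATE_HZ = 16000   # assumed; not stated in the text above
FFT_LEN = 2048           # window length used in the experiment

split_hz = bin_to_hz(200, SAMPLE_RATE_HZ, FFT_LEN)
low_bins, high_bins = split_bands(FFT_LEN // 2 + 1, 200)
```

Under these assumptions the low band covers 200 bins and the high band the remaining 825 bins up to the Nyquist frequency.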
In order to use a proper initial value required for a sufficient convergence in a parameter estimation process, τmn was initialized as shown in
FIGS. 5a, 5b, and 5c show the phase difference results before and after the permutation-sorting performed on the m=2nd, 3rd, and 4th rows when the first row (m′=1) of A(f) was set as a reference sensor, in the case of four signal sources (six ones in case of
The results illustrated in
The conventional Sawada method does not use statistical characteristics, so when an initial estimated value at the low frequency band is not precise, or when the phase patterns of the high frequency band are complicated, clustering in the vicinity of the intersection points of the phase patterns fails. This kind of error tends to be carried into the final results as it is, without being corrected. This problem may be reduced to a degree by setting the reference sensor as a central sensor of the sensor array, but in this case, information regarding the disposition of the sensor array is required.
In comparison, the blind signal separating method according to an exemplary embodiment of the present invention provides substantially the same separation performance without requiring information regarding the size and disposition of the sensor array. Also, when the information regarding the sensor array, such as the disposition of the sensors, or the like, is available, the time delay can be converted into the direction of signal sources by using Equation 6 and Equation 7. Thus, the direction of the signal sources can be estimated through the blind signal separating method according to an exemplary embodiment of the present invention.
As described above, in the blind signal separating method according to an exemplary embodiment of the present invention, because all of the information regarding the mixed signals collected from all the sensors is effectively used, the performance of signal separation can be improved, the selection of the reference sensor does not substantially affect the separation performance, and a uniform signal separation performance can be obtained without advance information regarding the disposition of the sensors and signal sources. In addition, when the information regarding the disposition of the sensors is acquired, the direction of the signal sources can be accurately calculated.
As set forth above, according to exemplary embodiments of the invention, because the permutation problem of a separation filter is solved by using statistical characteristics, an excellent separation performance can be provided even in an environment in which there is excessive echo, without advance information regarding the size of a sensor array or the disposition of sensors. Also, a time delay calculated by using the method according to an exemplary embodiment of the present invention can be utilized for estimating the direction of a signal source by using the information regarding the sensor disposition.
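As an illustration of converting an estimated time delay into a direction, the sketch below assumes a far-field model in which the delay at a sensor located at distance d from the reference sensor is τ=d·sinθ/v, with θ measured from the front (broadside) direction. This sign and angle convention, the speed of 343 m/s, and the function names are assumptions and may differ from the equations of the description.

```python
import math


def delay_to_direction_deg(tau_s, sensor_pos_m, v=343.0):
    """Direction angle in degrees for a delay tau at sensor position d.

    Assumed model: tau = d * sin(theta) / v.
    """
    ratio = v * tau_s / sensor_pos_m
    ratio = max(-1.0, min(1.0, ratio))  # guard against estimation noise
    return math.degrees(math.asin(ratio))


def direction_to_delay(theta_deg, sensor_pos_m, v=343.0):
    """Inverse mapping, used here only to build a test value."""
    return sensor_pos_m * math.sin(math.radians(theta_deg)) / v


# Hypothetical example: sensor 10 cm from the reference, source at 30 degrees.
tau = direction_to_delay(30.0, 0.10)
theta = delay_to_direction_deg(tau, 0.10)
```

The clamp to the interval from −1 to 1 keeps the arcsine defined when an estimated delay slightly exceeds the largest delay that the sensor spacing physically allows.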
While the present invention has been shown and described in connection with the exemplary embodiments, it will be apparent to those skilled in the art that modifications and variations can be made without departing from the spirit and scope of the invention as defined by the appended claims.
Number | Date | Country | Kind |
---|---|---|---|
10-2009-0127541 | Dec 2009 | KR | national |
10-2010-0104197 | Oct 2010 | KR | national |
Number | Name | Date | Kind |
---|---|---|---|
7496482 | Araki et al. | Feb 2009 | B2 |
7647209 | Sawada et al. | Jan 2010 | B2 |
20030046038 | Deligne et al. | Mar 2003 | A1 |
20050203981 | Sawada et al. | Sep 2005 | A1 |
20060058983 | Araki et al. | Mar 2006 | A1 |
20070025556 | Hiekata | Feb 2007 | A1 |
20080215651 | Sawada et al. | Sep 2008 | A1 |
20090022403 | Takamori et al. | Jan 2009 | A1 |
20090136057 | Taenzer | May 2009 | A1 |
20090310444 | Hiroe | Dec 2009 | A1 |
20110149719 | Nam | Jun 2011 | A1 |
Entry |
---|
Sawada et al. “Grouping Separated Frequency Components by Estimating Propagation Model Parameters in Frequency-Domain Blind Source Separation,” in IEEE Transactions on Audio, Speech, and Language Processing, vol. 15, No. 5, Jul. 2007, pp. 1592-1604. |
Knapp et al. “The generalized correlation method for estimation of time delay,” IEEE Trans. Acoust., Speech, Signal Process., vol. ASSP-24, No. 4, pp. 320-327, Aug. 1976. |
Anthony J. Bell, Terrence J. Sejnowski, “An Information-Maximization Approach to Blind Separation and Blind Deconvolution”, Neural Computation, vol. 7, pp. 1129-1159, 1995, Massachusetts Institute of Technology. |
Hiroshi Sawada, Shoko Araki, Ryo Mukai, Shoji Makino, “Solving the Permutation Problem of Frequency-Domain BSS When Spatial Aliasing Occurs With Wide Sensor Spacing”, ICASSP, pp. 77-80, 2006, IEEE. |
Number | Date | Country | |
---|---|---|---|
20110149719 A1 | Jun 2011 | US |