Claims
- 1. A method comprising:
recording a number of input sound source signals by a number of sound input devices, the number of sound input devices at least equal to the number of input sound source signals, to generate a number of sound input device signals at least equal to the number of input sound source signals, the number of input sound source signals including a target input sound source signal and acoustical factor signals; and,
applying a number of reconstruction filters to the number of sound input device signals according to a convolutional mixing independent component analysis (ICA) to generate at least one reconstructed input sound source signal separating the target input sound source signal from the number of sound input device signals without permutation, the number of reconstruction filters taking into account a priori knowledge regarding the target input sound source signal, one of the at least one reconstructed input sound source signal corresponding to the target input sound source signal.
- 2. The method of claim 1, wherein each of the number of sound input devices is a microphone.
- 3. The method of claim 1, wherein the target input sound source signal corresponds to human speech.
- 4. The method of claim 1, wherein the acoustical factor signals include reverberation.
- 5. The method of claim 1, wherein at least one of the input sound source signals exhibits correlation over time.
- 6. The method of claim 1, wherein the a priori knowledge regarding the target input sound source signal is an estimate of spectra of the target input sound source signal.
- 7. The method of claim 1, wherein the number of reconstruction filters is constructed based on a speech recognition system, such that the one of the at least one reconstructed input sound source signal corresponding to the target input sound source signal is matched against a plurality of words of a dictionary of the speech recognition system, a high probability match indicating that proper separation has occurred.
- 8. The method of claim 1, wherein the number of reconstruction filters is constructed based on a vector quantization (VQ) codebook of vectors, the vectors representing sound source patterns typical of the target input sound source signal, such that the one of the at least one reconstructed input sound source signal corresponding to the target input sound source signal is matched against the vectors of the VQ codebook, a high probability match indicating that proper separation has occurred.
- 9. The method of claim 8, wherein the vectors are linear predictive coding (LPC) vectors.
- 10. A machine-readable medium having instructions stored thereon for execution by a processor to perform the method of claim 1.
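The separation mechanics of claim 1 can be illustrated with a toy sketch. This is not the patent's actual blind-ICA construction: the convolutive mixing filters are assumed known here (in a real system the reconstruction filters would be estimated blindly), and all signal values and filter taps are illustrative. Two sound input devices record a target source plus an interfering source through short FIR channels, and per-frequency reconstruction filters recover the target without permutation:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4096
target = np.sign(rng.standard_normal(n))   # toy stand-in for a speech source
noise = rng.standard_normal(n)             # acoustical-factor (interference) source

# Convolutive mixing: each mic hears both sources through a short FIR
# channel h[(mic, source)] (tap values are illustrative, not from the patent).
h = {
    (0, 0): [1.0, 0.3], (0, 1): [0.5, 0.2],
    (1, 0): [0.4, 0.1], (1, 1): [1.0, 0.25],
}
mic0 = np.convolve(target, h[(0, 0)])[:n] + np.convolve(noise, h[(0, 1)])[:n]
mic1 = np.convolve(target, h[(1, 0)])[:n] + np.convolve(noise, h[(1, 1)])[:n]

# Frequency-domain reconstruction filters: invert the 2x2 mixing matrix
# per frequency bin (assumes the mixing is known; shown for mechanics only).
nfft = 8192
M0, M1 = np.fft.rfft(mic0, nfft), np.fft.rfft(mic1, nfft)
H = np.array([[np.fft.rfft(h[(i, j)], nfft) for j in (0, 1)] for i in (0, 1)])
det = H[0, 0] * H[1, 1] - H[0, 1] * H[1, 0]
T = (H[1, 1] * M0 - H[0, 1] * M1) / det    # target-source bin estimates
recon = np.fft.irfft(T, nfft)[:n]          # reconstructed target signal

corr = np.corrcoef(recon, target)[0, 1]
print(round(corr, 3))
```

With exact channel knowledge the reconstructed signal correlates almost perfectly with the target; the claims' contribution is estimating equivalent reconstruction filters blindly, using a priori knowledge of the target (spectra, a speech recognizer, or a VQ codebook) to resolve the permutation ambiguity.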
- 11. A method for constructing a number of reconstruction filters to separate a target input sound source signal from a number of sound input device signals without permutation according to a convolutional mixing independent component analysis (ICA), comprising:
determining a maximum a posteriori (MAP) estimate of the number of reconstruction filters by summing over a plurality of possible word strings within a dictionary of a hidden Markov model (HMM) speech recognition system;
employing the MAP estimate of the number of reconstruction filters within the HMM speech recognition system to generate at least one nonlinear equation representing the number of reconstruction filters; and,
solving the at least one nonlinear equation to generate the number of reconstruction filters.
- 12. The method of claim 11, wherein the MAP estimate of the number of reconstruction filters encapsulates a priori knowledge of the target input sound source signal, where the target input sound source signal corresponds to human speech.
- 13. A machine-readable medium having instructions stored thereon for execution by a processor to perform the method of claim 11.
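A minimal numerical sketch of the MAP idea in claim 11, under heavy simplifying assumptions: the "HMM dictionary" is reduced to two fixed Gaussian word templates, the reconstruction filter is a single scalar gain, and the MAP estimate is found by grid search rather than by solving the nonlinear equations the claim describes. All names and values (`words`, `true_gain`, the noise variance) are illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy stand-in for an HMM word dictionary: each "word" is a fixed
# feature template with a uniform prior (values are illustrative).
words = {"yes": np.array([1.0, 2.0, 1.0]), "no": np.array([-1.0, 0.5, 2.0])}
prior = {"yes": 0.5, "no": 0.5}

true_gain = 0.5                         # unknown mixing "filter" to recover
x = true_gain * words["yes"] + 0.01 * rng.standard_normal(3)

def log_lik(g):
    # p(x | g) = sum over word strings of p(x | w, g) p(w):
    # reconstruct with filter 1/g and score against every dictionary word.
    recon = x / g
    liks = [prior[w] * np.exp(-np.sum((recon - mu) ** 2) / (2 * 0.01))
            for w, mu in words.items()]
    return np.log(sum(liks) + 1e-300)   # floor avoids log(0) far from optimum

gains = np.linspace(0.1, 1.0, 91)
g_map = gains[np.argmax([log_lik(g) for g in gains])]
print(round(g_map, 2))
```

The grid maximizer lands at the true gain: summing the likelihood over all word hypotheses (rather than committing to one) is what lets the filter estimate encapsulate a priori knowledge that the separated output should look like dictionary speech.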
- 14. A method for constructing a number of reconstruction filters to separate a target input sound source signal from a number of sound input device signals without permutation according to a convolutional mixing independent component analysis (ICA), comprising:
determining a prediction error based on a vector quantization (VQ) codebook of vectors, the vectors representing sound patterns typical of the target input sound source signal, such that matching the vectors to a reconstructed signal is indicative of whether the reconstructed signal has been properly separated;
minimizing the prediction error to obtain an estimate of the number of reconstruction filters; and,
solving the prediction error as minimized to generate the number of reconstruction filters.
- 15. The method of claim 14, wherein the VQ codebook of vectors encapsulates a priori knowledge of the target input sound source signal as human speech patterns, where the target input sound source signal corresponds to human speech.
- 16. The method of claim 14, wherein the vectors are linear predictive coding (LPC) vectors, and the prediction error is a linear predictive coding (LPC) error.
- 17. The method of claim 14, wherein solving the prediction error as minimized to generate the number of reconstruction filters comprises using an expectation maximization (EM) approach.
- 18. The method of claim 17, wherein an E-step of the EM approach determines a best codeword within the VQ codebook of vectors.
- 19. The method of claim 17, wherein an M-step of the EM approach minimizes the prediction error.
- 20. A machine-readable medium having instructions stored thereon for execution by a processor to perform the method of claim 14.
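The EM structure of claims 17 through 19 can be sketched with a toy alternation. This is not the patent's actual formulation: the codebook entries are made-up two-dimensional patterns rather than LPC vectors, and the "reconstruction filter" is collapsed to a single scalar gain so the M-step has a closed form. The E-step picks the best codeword per frame; the M-step minimizes the resulting prediction error:

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy VQ codebook of target sound patterns (values are illustrative).
codebook = np.array([[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]])

true_gain = 2.0                                  # unknown mixing gain
frames = codebook[rng.integers(0, 3, size=200)]  # hidden source patterns
mics = true_gain * frames + 0.05 * rng.standard_normal((200, 2))

g = 1.0                                          # filter estimate
for _ in range(20):
    # E-step: best codeword for each reconstructed frame (claim 18).
    recon = mics / g
    idx = np.argmin(((recon[:, None, :] - codebook) ** 2).sum(-1), axis=1)
    matched = codebook[idx]
    # M-step: least-squares gain minimizing the prediction error
    # ||mics - g * matched||^2 (claim 19).
    g = (mics * matched).sum() / (matched ** 2).sum()
print(round(g, 2))
```

The alternation converges to the true gain in a few iterations; in the claimed method the same E/M split operates over FIR reconstruction filters and an LPC prediction error instead of a scalar gain and Euclidean distance.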
RELATED APPLICATIONS
[0001] This application claims the benefit of and priority to the previously filed provisional patent application entitled “Speech/Noise Separation Using Two Microphones and a Model of Speech Signals,” filed on Apr. 26, 2000, and assigned serial No. 60/199,782.
Provisional Applications (1)

| Number | Date | Country |
| --- | --- | --- |
| 60199782 | Apr 2000 | US |