The present application relates generally to audio processing and, more specifically, to systems and methods for restoring distorted speech components of a noise-suppressed audio signal.
Noise reduction is widely used in audio processing systems to suppress or cancel unwanted noise in audio signals used to transmit speech. However, after the noise cancellation and/or suppression, speech that is intertwined with the noise tends to be overly attenuated or eliminated altogether.
There are models of the brain that explain how sounds are restored using an internal representation that perceptually replaces the input via a feedback mechanism. One exemplary model, the convergence-divergence zone (CDZ) model, has been described in neuroscience and, among other things, attempts to explain the spectral completion and phonemic restoration phenomena found in human speech perception.
This summary is provided to introduce a selection of concepts in a simplified form that are further described below in the Detailed Description. This summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used as an aid in determining the scope of the claimed subject matter.
Systems and methods for restoring distorted speech components of an audio signal are provided. An example method includes determining distorted frequency regions and undistorted frequency regions in the audio signal. The distorted frequency regions include regions of the audio signal in which a speech distortion is present. The method includes performing one or more iterations using a model for refining predictions of the audio signal at the distorted frequency regions. The model can be configured to modify the audio signal.
In some embodiments, the audio signal includes a noise-suppressed audio signal obtained by at least one of noise reduction or noise cancellation of an acoustic signal including speech. The acoustic signal is attenuated or eliminated at the distorted frequency regions.
In some embodiments, the model used to refine predictions of the audio signal at the distorted frequency regions includes a deep neural network trained using spectral envelopes of clean audio signals or undamaged audio signals. The refined predictions can be used for restoring speech components in the distorted frequency regions.
In some embodiments, the audio signal at the distorted frequency regions is set to zero before the first iteration. Prior to performing each of the iterations, the audio signal at the undistorted frequency regions is restored to its values before the first iteration.
In some embodiments, the method further includes comparing the audio signal at the undistorted frequency regions before and after each of the iterations to determine discrepancies. In certain embodiments, the one or more iterations end when the discrepancies meet pre-determined criteria. The pre-determined criteria can be defined by lower and upper bounds on energies of the audio signal.
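For illustration only, the sketch below shows one way such an energy-bound stopping test could look; the function name, the boolean mask for the undistorted regions, and the tolerance value are assumptions made for this example rather than details of the embodiments.

```python
import numpy as np

def within_bounds(estimate, reference, undistorted_mask, tol=0.1):
    """Return True when the energy of the estimate at the undistorted bins
    stays within lower/upper bounds derived from the input (reference) energy."""
    est_energy = np.sum(np.abs(estimate[undistorted_mask]) ** 2)
    ref_energy = np.sum(np.abs(reference[undistorted_mask]) ** 2)
    lower, upper = (1.0 - tol) * ref_energy, (1.0 + tol) * ref_energy
    return lower <= est_energy <= upper
```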
According to another example embodiment of the present disclosure, the steps of the method for restoring distorted speech components of an audio signal are stored on a non-transitory machine-readable medium comprising instructions, which when implemented by one or more processors perform the recited steps.
Other example embodiments of the disclosure and aspects will become apparent from the following description taken in conjunction with the following drawings.
Embodiments are illustrated by way of example and not limitation in the figures of the accompanying drawings, in which like references indicate similar elements.
The technology disclosed herein relates to systems and methods for restoring distorted speech components of an audio signal. Embodiments of the present technology may be practiced with any audio device configured to receive and/or provide audio such as, but not limited to, cellular phones, wearables, phone handsets, headsets, and conferencing systems. It should be understood that while some embodiments of the present technology will be described in reference to operations of a cellular phone, the present technology may be practiced with any audio device.
Audio devices can include radio frequency (RF) receivers, transmitters, and transceivers, wired and/or wireless telecommunications and/or networking devices, amplifiers, audio and/or video players, encoders, decoders, speakers, inputs, outputs, storage devices, and user input devices. The audio devices may include input devices such as buttons, switches, keys, keyboards, trackballs, sliders, touchscreens, one or more microphones, gyroscopes, accelerometers, global positioning system (GPS) receivers, and the like. The audio devices may include output devices, such as LED indicators, video displays, touchscreens, speakers, and the like. In some embodiments, mobile devices include wearables and hand-held devices, such as wired and/or wireless remote controls, notebook computers, tablet computers, phablets, smart phones, personal digital assistants, media players, mobile telephones, and the like.
In various embodiments, the audio devices can be operated in stationary and portable environments. Stationary environments can include residential and commercial buildings or structures, and the like. For example, the stationary embodiments can include living rooms, bedrooms, home theaters, conference rooms, auditoriums, business premises, and the like. Portable environments can include moving vehicles, moving persons, other transportation means, and the like.
According to an example embodiment, a method for restoring distorted speech components of an audio signal includes determining distorted frequency regions and undistorted frequency regions in the audio signal. The distorted frequency regions include regions of the audio signal wherein speech distortion is present. The method includes performing one or more iterations using a model for refining predictions of the audio signal at the distorted frequency regions. The model can be configured to modify the audio signal.
The following describes an example environment 100 in which embodiments of the present technology can be practiced.
In some embodiments, the audio device 104 includes one or more acoustic sensors, for example microphones. In the example described here, the audio device 104 includes a primary microphone 106 and a secondary microphone 108.
Noise 110 is unwanted sound present in the environment 100 that can be detected by, for example, sensors such as the microphones 106 and 108. In stationary environments, noise sources can include street noise, ambient noise, sounds from a mobile device such as audio, speech from entities other than an intended speaker(s), and the like. Noise 110 may include reverberations and echoes. Mobile environments can encounter certain kinds of noise that arise from their operation and the environments in which they operate, for example, road, track, tire/wheel, fan, wiper blade, engine, exhaust, entertainment system, communications system, competing speaker, wind, rain, wave, other vehicle, and exterior noise. Acoustic signals detected by the microphones 106 and 108 can be used to separate desired speech from the noise 110.
In some embodiments, the audio device 104 is connected to a cloud-based computing resource 160 (also referred to as a computing cloud). In some embodiments, the computing cloud 160 includes one or more server farms/clusters comprising a collection of computer servers and is co-located with network switches and/or routers. The computing cloud 160 is operable to deliver one or more services over a network (e.g., the Internet, a mobile phone (cell phone) network, and the like). In certain embodiments, at least partial processing of the audio signal is performed remotely in the computing cloud 160. The audio device 104 is operable to send data, such as a recorded acoustic signal, to the computing cloud 160, to request computing services, and to receive the results of the computation.
In various embodiments, the receiver 200 can be configured to communicate with a network, such as the Internet, a Wide Area Network (WAN), a Local Area Network (LAN), a cellular network, and so forth, to receive an audio signal. The received audio signal is then forwarded to the audio processing system 210.
In various embodiments, the processor 202 includes hardware and/or software operable to execute instructions stored in a memory (not illustrated).
The audio processing system 210 can be configured to receive acoustic signals from an acoustic source via at least one microphone (e.g., the primary microphone 106 and the secondary microphone 108 in the examples described above) and to process them.
In various embodiments, where the microphones 106 and 108 are omni-directional microphones that are closely spaced (e.g., 1-2 cm apart), a beamforming technique can be used to simulate forward-facing and backward-facing directional microphone responses. A level difference can be obtained using the simulated forward-facing and backward-facing directional microphone responses. The level difference can be used to discriminate speech and noise in, for example, the time-frequency domain, which can be used in noise and/or echo reduction. In some embodiments, some microphones are used mainly to detect speech and other microphones are used mainly to detect noise. In various embodiments, some microphones are used to detect both noise and speech.
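As a rough illustration of this kind of processing (and not of the specific noise-reduction methods referenced below), the following sketch simulates forward-facing and backward-facing cardioid responses from two closely spaced omnidirectional microphones by delay-and-subtract and derives a level difference; the microphone spacing, sample rate, and function names are assumed values.

```python
import numpy as np

def cardioid_pair(primary, secondary, mic_distance_m=0.015, fs=16000, c=343.0):
    """Delay-and-subtract beamforming: simulate forward- and backward-facing
    cardioid responses from two closely spaced omnidirectional microphones."""
    delay = max(1, int(round(mic_distance_m / c * fs)))  # inter-mic travel time in samples
    fwd = primary[delay:] - secondary[:-delay]           # forward-facing response
    bwd = secondary[delay:] - primary[:-delay]           # backward-facing response
    return fwd, bwd

def level_difference_db(fwd, bwd, eps=1e-12):
    """Level difference (in dB) usable to discriminate speech from noise."""
    return 10.0 * np.log10((np.mean(fwd ** 2) + eps) / (np.mean(bwd ** 2) + eps))
```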
The noise reduction can be carried out by the audio processing system 210 based on inter-microphone level differences, level salience, pitch salience, signal type classification, speaker identification, and so forth. In various embodiments, noise reduction includes noise cancellation and/or noise suppression.
In some embodiments, the output device 206 is any device which provides an audio output to a listener (e.g., the acoustic source). For example, the output device 206 may comprise a speaker, a class-D output, an earpiece of a headset, or a handset on the audio device 104.
In some embodiments, the audio processing system 210 is operable to receive an audio signal including one or more time-domain input audio signals and to process the input audio signals using a frequency analysis module 310, a noise reduction module 320, a speech restoration module 330, and a reconstruction module 340.
In some embodiments, the frequency analysis module 310 is operable to receive the input audio signals. The frequency analysis module 310 generates frequency sub-bands from the time-domain input audio signals and outputs the frequency sub-band signals. In some embodiments, the frequency analysis module 310 is operable to calculate or determine speech components, for example a spectral envelope and excitations, of the received audio signal.
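The disclosure does not prescribe a particular filter bank; as one hedged illustration, the sketch below uses a short-time Fourier transform to form sub-band magnitudes and averages them over linearly spaced bands to obtain a coarse spectral envelope. The frame size, hop size, and band count are assumptions.

```python
import numpy as np
from scipy.signal import stft

def analyze(signal, fs=16000, n_fft=512, hop=256, n_bands=40):
    """Split a time-domain signal into sub-band magnitudes and a coarse envelope."""
    _, _, spec = stft(signal, fs=fs, nperseg=n_fft, noverlap=n_fft - hop)
    mag = np.abs(spec)                                        # (freq_bins, frames)
    edges = np.linspace(0, mag.shape[0], n_bands + 1, dtype=int)
    envelope = np.stack(
        [mag[edges[b]:edges[b + 1]].mean(axis=0) for b in range(n_bands)])
    return mag, envelope                                      # sub-band signals, spectral envelope
```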
In various embodiments, the noise reduction module 320 includes multiple modules and receives the audio signal from the frequency analysis module 310. The noise reduction module 320 is operable to perform noise reduction in the audio signal to produce a noise-suppressed signal. In some embodiments, the noise reduction includes subtractive noise cancellation or multiplicative noise suppression. By way of example and not limitation, noise reduction methods are described in U.S. patent application Ser. No. 12/215,980, entitled “System and Method for Providing Noise Suppression Utilizing Null Processing Noise Subtraction,” filed Jun. 30, 2008, and in U.S. patent application Ser. No. 11/699,732 (U.S. Pat. No. 8,194,880), entitled “System and Method for Utilizing Omni-Directional Microphones for Speech Enhancement,” filed Jan. 29, 2007, which are incorporated herein by reference in their entireties for the above purposes. The noise reduction module 320 provides a transformed, noise-suppressed signal to the speech restoration module 330. In the noise-suppressed signal, one or more speech components can be eliminated or excessively attenuated because the noise reduction modifies the spectrum of the audio signal.
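As a generic stand-in for the multiplicative suppression branch mentioned above (not an implementation of the incorporated references), the sketch below applies a Wiener-like gain per time-frequency cell; the gain floor hints at why speech that shares cells with strong noise can be over-attenuated, which is what the restoration stage addresses.

```python
import numpy as np

def suppress(noisy_mag, noise_mag, floor=0.05):
    """Multiplicative noise suppression: scale each time-frequency cell by a
    gain derived from an estimated signal-to-noise ratio."""
    snr = np.maximum(noisy_mag ** 2 - noise_mag ** 2, 0.0) / (noise_mag ** 2 + 1e-12)
    gain = snr / (1.0 + snr)                 # Wiener-like gain in [0, 1)
    return np.maximum(gain, floor) * noisy_mag
```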
In some embodiments, the speech restoration module 330 receives the noise-suppressed signal from the noise reduction module 320. The speech restoration module 330 is configured to restore damaged speech components in the noise-suppressed signal. In some embodiments, the speech restoration module 330 includes a deep neural network (DNN) 315 trained for restoration of speech components in damaged frequency regions. In certain embodiments, the DNN 315 is configured as an autoencoder.
In various embodiments, the DNN 315 is trained using machine learning. The DNN 315 is a feed-forward, artificial neural network having more than one layer of hidden units between its inputs and outputs. The DNN 315 may be trained by receiving input features of one or more frames of spectral envelopes of clean audio signals or undamaged audio signals. In the training process, the DNN 315 may extract learned higher-order spectro-temporal features of the clean or undamaged spectral envelopes. In various embodiments, the DNN 315, as trained using the spectral envelopes of clean or undamaged audio signals, is used in the speech restoration module 330 to refine predictions of the clean speech components that are particularly suitable for restoring speech components in the distorted frequency regions. By way of example and not limitation, exemplary methods concerning deep neural networks are also described in commonly assigned U.S. patent application Ser. No. 14/614,348, entitled “Noise-Robust Multi-Lingual Keyword Spotting with a Deep Neural Network Based Architecture,” filed Feb. 4, 2015, and U.S. patent application Ser. No. 14/745,176, entitled “Key Click Suppression,” filed Jun. 9, 2015, which are incorporated herein by reference in their entirety.
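One possible realization, assuming a feed-forward autoencoder built with PyTorch, is sketched below: the network is trained to reproduce stacked frames of clean spectral envelopes, so its hidden layers capture the learned spectro-temporal structure described above. The layer sizes, context length, and optimizer are illustrative assumptions, not details of the DNN 315.

```python
import torch
import torch.nn as nn

n_bands, context = 40, 5                       # envelope bands x consecutive frames
model = nn.Sequential(                         # more than one hidden layer, as described
    nn.Linear(n_bands * context, 256), nn.ReLU(),
    nn.Linear(256, 64), nn.ReLU(),             # bottleneck of learned spectro-temporal features
    nn.Linear(64, 256), nn.ReLU(),
    nn.Linear(256, n_bands * context),
)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

def train_step(clean_envelopes):               # tensor of shape (batch, n_bands * context)
    """One training step: the autoencoder learns to reproduce clean envelopes."""
    optimizer.zero_grad()
    loss = loss_fn(model(clean_envelopes), clean_envelopes)
    loss.backward()
    optimizer.step()
    return loss.item()
```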
During operation, the speech restoration module 330 can assign a zero value to the frequency regions of the noise-suppressed signal where a speech distortion is present (distorted regions). In this example, the noise-suppressed signal, with the distorted regions set to zero, is provided as input to the DNN 315, which generates initial predictions of the speech components in the distorted regions as an output signal 350.
In some embodiments, to improve the initial predictions, an iterative feedback mechanism is further applied. The output signal 350 is optionally fed back to the input of DNN 315 to receive a next iteration of the output signal, keeping the initial noise-suppressed signal at undistorted regions of the output signal. To prevent the system from diverging, the output at the undistorted regions may be compared to the input after each iteration, and upper and lower bounds may be applied to the estimated energy at undistorted frequency regions based on energies in the input audio signal. In various embodiments, several iterations are applied to improve the accuracy of the predictions until a level of accuracy desired for a particular application is met, e.g., having no further iterations in response to discrepancies of the audio signal at undistorted regions meeting pre-defined criteria for the particular application.
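A compact sketch of this feedback loop follows, assuming the trained model is available as a callable `dnn` that maps a spectral representation to a refined prediction of the same shape; the iteration count, tolerance, and exact discrepancy measure are application-specific assumptions.

```python
import numpy as np

def restore(noise_suppressed, distorted_mask, dnn, max_iters=5, tol=0.05):
    """Iteratively refine predictions at distorted bins while pinning the
    undistorted bins to the initial noise-suppressed values."""
    undistorted_mask = ~distorted_mask
    estimate = noise_suppressed.copy()
    estimate[distorted_mask] = 0.0               # zero the distorted regions first
    for _ in range(max_iters):
        prediction = dnn(estimate)               # refined prediction of clean speech
        # Compare output and input at undistorted regions to detect divergence.
        diff = np.sum((prediction[undistorted_mask] - noise_suppressed[undistorted_mask]) ** 2)
        ref = np.sum(noise_suppressed[undistorted_mask] ** 2) + 1e-12
        estimate = prediction
        # Restore the initial noise-suppressed values at the undistorted regions.
        estimate[undistorted_mask] = noise_suppressed[undistorted_mask]
        if diff / ref < tol:                     # discrepancy meets the criterion
            break
    return estimate
```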
In some embodiments, reconstruction module 340 is operable to receive a noise-suppressed signal with restored speech components from the speech restoration module 330 and to reconstruct the restored speech components into a single audio signal.
The method can commence, in block 402, with determining distorted frequency regions and undistorted frequency regions in the audio signal. The distorted frequency regions are regions in which a speech distortion is present due to, for example, noise reduction.
In block 404, method 400 includes performing one or more iterations using a model to refine predictions of the audio signal at the distorted frequency regions. The model can be configured to modify the audio signal. In some embodiments, the model includes a deep neural network trained with spectral envelopes of clean or undamaged signals. In certain embodiments, the predictions of the audio signal at the distorted frequency regions are set to zero before the first iteration. Prior to each of the iterations, the audio signal at the undistorted frequency regions is restored to its values before the first iteration.
In block 406, method 400 includes comparing the audio signal at the undistorted regions before and after each of the iterations to determine discrepancies.
In block 408, the iterations are stopped if the discrepancies meet pre-defined criteria.
Some example embodiments include speech dynamics. For speech dynamics, the audio processing system 210 can be provided with multiple consecutive audio signal frames and trained to output the same number of frames. The inclusion of speech dynamics in some embodiments enforces temporal smoothness and allows restoration of longer distortion regions.
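A small helper of the kind that could supply such multi-frame inputs is sketched below; the context length is an assumption, and windowed stacking is only one of several ways to present consecutive frames to the model.

```python
import numpy as np

def stack_frames(frames, context=5):
    """frames: array of shape (n_frames, n_features). Returns overlapping windows
    of `context` consecutive frames, flattened to (n_windows, context * n_features)."""
    windows = [frames[i:i + context].reshape(-1)
               for i in range(frames.shape[0] - context + 1)]
    return np.stack(windows)
```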
Various embodiments are used to provide improvements for a number of applications such as noise suppression, bandwidth extension, speech coding, and speech synthesis. Additionally, the methods and systems are amenable to sensor fusion such that, in some embodiments, the methods and systems can be extended to include other non-acoustic sensor information. Exemplary methods concerning sensor fusion are also described in commonly assigned U.S. patent application Ser. No. 14/548,207, entitled “Method for Modeling User Possession of Mobile Device for User Authentication Framework,” filed Nov. 19, 2014, and U.S. patent application Ser. No. 14/331,205, entitled “Selection of System Parameters Based on Non-Acoustic Sensor Information,” filed Jul. 14, 2014, which are incorporated herein by reference in their entirety.
Various methods for restoration of noise reduced speech are also described in commonly assigned U.S. patent application Ser. No. 13/751,907 (U.S. Pat. No. 8,615,394), entitled “Restoration of Noise Reduced Speech,” filed Jan. 28, 2013, which is incorporated herein by reference in its entirety.
The components of the exemplary computer system 500 include a processor unit 510, main memory 520, mass data storage 530, a portable storage device 540, user input devices 560, a graphics display system 570, and peripheral devices 580.
Mass data storage 530, which can be implemented with a magnetic disk drive, solid state drive, or an optical disk drive, is a non-volatile storage device for storing data and instructions for use by processor unit 510. Mass data storage 530 stores the system software for implementing embodiments of the present disclosure for purposes of loading that software into main memory 520.
Portable storage device 540 operates in conjunction with a portable non-volatile storage medium, such as a flash drive, floppy disk, compact disk, digital video disc, or Universal Serial Bus (USB) storage device, to input and output data and code to and from the computer system 500.
User input devices 560 can provide a portion of a user interface. User input devices 560 may include one or more microphones, an alphanumeric keypad, such as a keyboard, for inputting alphanumeric and other information, or a pointing device, such as a mouse, a trackball, stylus, or cursor direction keys. User input devices 560 can also include a touchscreen. Additionally, the computer system 500 can include one or more output devices.
Graphics display system 570 includes a liquid crystal display (LCD) or other suitable display device. Graphics display system 570 is configurable to receive textual and graphical information and to process the information for output to the display device.
Peripheral devices 580 may include any type of computer support device to add additional functionality to the computer system 500.
The components provided in the computer system 500 are those typically found in computer systems that may be suitable for use with embodiments of the present disclosure and are intended to represent a broad category of such computer components that are well known in the art.
The processing for various embodiments may be implemented in software that is cloud-based. In some embodiments, the computer system 500 is implemented as a cloud-based computing environment, such as a virtual machine operating within a computing cloud. In other embodiments, the computer system 500 may itself include a cloud-based computing environment, where the functionalities of the computer system 500 are executed in a distributed fashion. Thus, the computer system 500, when configured as a computing cloud, may include pluralities of computing devices in various forms, as will be described in greater detail below.
In general, a cloud-based computing environment is a resource that typically combines the computational power of a large grouping of processors (such as within web servers) and/or that combines the storage capacity of a large grouping of computer memories or storage devices. Systems that provide cloud-based resources may be utilized exclusively by their owners or such systems may be accessible to outside users who deploy applications within the computing infrastructure to obtain the benefit of large computational or storage resources.
The cloud may be formed, for example, by a network of web servers that comprise a plurality of computing devices, such as the computer system 500, with each server (or at least a plurality thereof) providing processor and/or storage resources. These servers may manage workloads provided by multiple users (e.g., cloud resource customers or other users). Typically, each user places workload demands upon the cloud that vary in real-time, sometimes dramatically. The nature and extent of these variations typically depends on the type of business associated with the user.
The present technology is described above with reference to example embodiments; however, other variations upon the example embodiments are also intended to be covered by the present disclosure.
The present application claims the benefit of U.S. Provisional Application No. 62/049,988, filed on Sep. 12, 2014. The subject matter of the aforementioned application is incorporated herein by reference for all purposes.
Number | Name | Date | Kind |
---|---|---|---|
4025724 | Davidson, Jr. et al. | May 1977 | A |
4137510 | Iwahara | Jan 1979 | A |
4802227 | Elko et al. | Jan 1989 | A |
4969203 | Herman | Nov 1990 | A |
5115404 | Lo et al. | May 1992 | A |
5204906 | Nohara et al. | Apr 1993 | A |
5224170 | Waite, Jr. | Jun 1993 | A |
5230022 | Sakata | Jul 1993 | A |
5289273 | Lang | Feb 1994 | A |
5400409 | Linhard | Mar 1995 | A |
5440751 | Santeler et al. | Aug 1995 | A |
5544346 | Mini et al. | Aug 1996 | A |
5555306 | Gerzon | Sep 1996 | A |
5583784 | Kapust et al. | Dec 1996 | A |
5598505 | Austin et al. | Jan 1997 | A |
5625697 | Bowen et al. | Apr 1997 | A |
5682463 | Allen et al. | Oct 1997 | A |
5715319 | Chu | Feb 1998 | A |
5734713 | Mauney et al. | Mar 1998 | A |
5774837 | Yeldener et al. | Jun 1998 | A |
5796850 | Shiono et al. | Aug 1998 | A |
5806025 | Vis et al. | Sep 1998 | A |
5819215 | Dobson et al. | Oct 1998 | A |
5937070 | Todter et al. | Aug 1999 | A |
5956674 | Smyth et al. | Sep 1999 | A |
5974379 | Hatanaka et al. | Oct 1999 | A |
5974380 | Smyth et al. | Oct 1999 | A |
5978567 | Rebane et al. | Nov 1999 | A |
5978759 | Tsushima | Nov 1999 | A |
5978824 | Ikeda | Nov 1999 | A |
5991385 | Dunn et al. | Nov 1999 | A |
6011853 | Koski et al. | Jan 2000 | A |
6035177 | Moses et al. | Mar 2000 | A |
6065883 | Herring et al. | May 2000 | A |
6084916 | Ott | Jul 2000 | A |
6104993 | Ashley | Aug 2000 | A |
6144937 | Ali | Nov 2000 | A |
6188769 | Jot et al. | Feb 2001 | B1 |
6202047 | Ephraim et al. | Mar 2001 | B1 |
6219408 | Kurth | Apr 2001 | B1 |
6226616 | You et al. | May 2001 | B1 |
6240386 | Thyssen et al. | May 2001 | B1 |
6263307 | Arslan et al. | Jul 2001 | B1 |
6281749 | Klayman et al. | Aug 2001 | B1 |
6327370 | Killion et al. | Dec 2001 | B1 |
6377637 | Berdugo | Apr 2002 | B1 |
6381284 | Strizhevskiy | Apr 2002 | B1 |
6381469 | Wojick | Apr 2002 | B1 |
6389142 | Hagen | May 2002 | B1 |
6421388 | Parizhsky et al. | Jul 2002 | B1 |
6477489 | Lockwood et al. | Nov 2002 | B1 |
6480610 | Fang | Nov 2002 | B1 |
6490556 | Graumann et al. | Dec 2002 | B1 |
6496795 | Malvar | Dec 2002 | B1 |
6504926 | Edelson et al. | Jan 2003 | B1 |
6584438 | Manjunath et al. | Jun 2003 | B1 |
6717991 | Gustafsson et al. | Apr 2004 | B1 |
6748095 | Goss | Jun 2004 | B1 |
6768979 | Menendez-Pidal et al. | Jul 2004 | B1 |
6772117 | Laurila et al. | Aug 2004 | B1 |
6810273 | Mattila et al. | Oct 2004 | B1 |
6862567 | Gao | Mar 2005 | B1 |
6873837 | Yoshioka et al. | Mar 2005 | B1 |
6882736 | Dickel et al. | Apr 2005 | B2 |
6907045 | Robinson et al. | Jun 2005 | B1 |
6931123 | Hughes | Aug 2005 | B1 |
6980528 | LeBlanc et al. | Dec 2005 | B1 |
7010134 | Jensen | Mar 2006 | B2 |
RE39080 | Johnston | Apr 2006 | E |
7035666 | Silberfenig et al. | Apr 2006 | B2 |
7054809 | Gao | May 2006 | B1 |
7058572 | Nemer | Jun 2006 | B1 |
7058574 | Taniguchi et al. | Jun 2006 | B2 |
7103176 | Rodriguez et al. | Sep 2006 | B2 |
7145710 | Holmes | Dec 2006 | B2 |
7190775 | Rambo | Mar 2007 | B2 |
7221622 | Matsuo et al. | May 2007 | B2 |
7245710 | Hughes | Jul 2007 | B1 |
7254242 | Ise et al. | Aug 2007 | B2 |
7283956 | Ashley et al. | Oct 2007 | B2 |
7366658 | Moogi et al. | Apr 2008 | B2 |
7383179 | Alves et al. | Jun 2008 | B2 |
7433907 | Nagai et al. | Oct 2008 | B2 |
7447631 | Truman et al. | Nov 2008 | B2 |
7472059 | Huang | Dec 2008 | B2 |
7548791 | Johnston | Jun 2009 | B1 |
7555434 | Nomura et al. | Jun 2009 | B2 |
7562140 | Clemm et al. | Jul 2009 | B2 |
7590250 | Ellis et al. | Sep 2009 | B2 |
7617099 | Yang et al. | Nov 2009 | B2 |
7617282 | Han | Nov 2009 | B2 |
7657427 | Jelinek | Feb 2010 | B2 |
7664495 | Bonner et al. | Feb 2010 | B1 |
7685132 | Hyman | Mar 2010 | B2 |
7773741 | LeBlanc et al. | Aug 2010 | B1 |
7791508 | Wegener | Sep 2010 | B2 |
7796978 | Jones et al. | Sep 2010 | B2 |
7899565 | Johnston | Mar 2011 | B1 |
7970123 | Beaucoup | Jun 2011 | B2 |
8032369 | Manjunath et al. | Oct 2011 | B2 |
8036767 | Soulodre | Oct 2011 | B2 |
8046219 | Zurek et al. | Oct 2011 | B2 |
8060363 | Ramo et al. | Nov 2011 | B2 |
8098844 | Elko | Jan 2012 | B2 |
8150065 | Solbach et al. | Apr 2012 | B2 |
8175291 | Chan et al. | May 2012 | B2 |
8189429 | Chen et al. | May 2012 | B2 |
8194880 | Avendano | Jun 2012 | B2 |
8194882 | Every et al. | Jun 2012 | B2 |
8195454 | Muesch | Jun 2012 | B2 |
8204253 | Solbach | Jun 2012 | B1 |
8229137 | Romesburg | Jul 2012 | B2 |
8233352 | Beaucoup | Jul 2012 | B2 |
8311817 | Murgia et al. | Nov 2012 | B2 |
8311840 | Giesbrecht | Nov 2012 | B2 |
8345890 | Avendano et al. | Jan 2013 | B2 |
8363823 | Santos | Jan 2013 | B1 |
8369973 | Risbo | Feb 2013 | B2 |
8467891 | Huang et al. | Jun 2013 | B2 |
8473287 | Every et al. | Jun 2013 | B2 |
8531286 | Friar et al. | Sep 2013 | B2 |
8606249 | Goodwin | Dec 2013 | B1 |
8615392 | Goodwin | Dec 2013 | B1 |
8615394 | Avendano et al. | Dec 2013 | B1 |
8639516 | Lindahl et al. | Jan 2014 | B2 |
8694310 | Taylor | Apr 2014 | B2 |
8705759 | Wolff et al. | Apr 2014 | B2 |
8744844 | Klein | Jun 2014 | B2 |
8750526 | Santos et al. | Jun 2014 | B1 |
8774423 | Solbach | Jul 2014 | B1 |
8798290 | Choi et al. | Aug 2014 | B1 |
8831937 | Murgia et al. | Sep 2014 | B2 |
8880396 | Laroche et al. | Nov 2014 | B1 |
8903721 | Cowan | Dec 2014 | B1 |
8908882 | Goodwin et al. | Dec 2014 | B2 |
8934641 | Avendano et al. | Jan 2015 | B2 |
8989401 | Ojanpera | Mar 2015 | B2 |
9007416 | Murgia et al. | Apr 2015 | B1 |
9094496 | Teutsch | Jul 2015 | B2 |
9185487 | Solbach | Nov 2015 | B2 |
9197974 | Clark et al. | Nov 2015 | B1 |
9210503 | Avendano et al. | Dec 2015 | B2 |
9247192 | Lee et al. | Jan 2016 | B2 |
9368110 | Hershey | Jun 2016 | B1 |
9558755 | Laroche | Jan 2017 | B1 |
20010041976 | Taniguchi et al. | Nov 2001 | A1 |
20020041678 | Basburg-Ertem et al. | Apr 2002 | A1 |
20020071342 | Marple et al. | Jun 2002 | A1 |
20020097884 | Cairns | Jul 2002 | A1 |
20020138263 | Deligne et al. | Sep 2002 | A1 |
20020160751 | Sun et al. | Oct 2002 | A1 |
20020177995 | Walker | Nov 2002 | A1 |
20030023430 | Wang et al. | Jan 2003 | A1 |
20030056220 | Thornton et al. | Mar 2003 | A1 |
20030093279 | Malah et al. | May 2003 | A1 |
20030099370 | Moore | May 2003 | A1 |
20030118200 | Beaucoup et al. | Jun 2003 | A1 |
20030147538 | Elko | Aug 2003 | A1 |
20030177006 | Ichikawa et al. | Sep 2003 | A1 |
20030179888 | Burnett et al. | Sep 2003 | A1 |
20030228019 | Eichler et al. | Dec 2003 | A1 |
20040066940 | Amir | Apr 2004 | A1 |
20040076190 | Goel et al. | Apr 2004 | A1 |
20040083110 | Wang | Apr 2004 | A1 |
20040102967 | Furuta et al. | May 2004 | A1 |
20040133421 | Burnett et al. | Jul 2004 | A1 |
20040145871 | Lee | Jul 2004 | A1 |
20040165736 | Hetherington et al. | Aug 2004 | A1 |
20040184882 | Cosgrove | Sep 2004 | A1 |
20050008169 | Muren et al. | Jan 2005 | A1 |
20050008179 | Quinn | Jan 2005 | A1 |
20050043959 | Stemerdink et al. | Feb 2005 | A1 |
20050080616 | Leung et al. | Apr 2005 | A1 |
20050096904 | Taniguchi et al. | May 2005 | A1 |
20050114123 | Lukac et al. | May 2005 | A1 |
20050143989 | Jelinek | Jun 2005 | A1 |
20050213739 | Rodman et al. | Sep 2005 | A1 |
20050240399 | Makinen | Oct 2005 | A1 |
20050249292 | Zhu | Nov 2005 | A1 |
20050261896 | Schuijers et al. | Nov 2005 | A1 |
20050267369 | Lazenby et al. | Dec 2005 | A1 |
20050276363 | Joublin et al. | Dec 2005 | A1 |
20050281410 | Grosvenor et al. | Dec 2005 | A1 |
20050283544 | Yee | Dec 2005 | A1 |
20060063560 | Herle | Mar 2006 | A1 |
20060092918 | Talalai | May 2006 | A1 |
20060100868 | Hetherington et al. | May 2006 | A1 |
20060122832 | Takiguchi et al. | Jun 2006 | A1 |
20060136203 | Ichikawa | Jun 2006 | A1 |
20060198542 | Benjelloun Touimi et al. | Sep 2006 | A1 |
20060206320 | Li | Sep 2006 | A1 |
20060224382 | Taneda | Oct 2006 | A1 |
20060242071 | Stebbings | Oct 2006 | A1 |
20060270468 | Hui et al. | Nov 2006 | A1 |
20060282263 | Vos et al. | Dec 2006 | A1 |
20060293882 | Giesbrecht et al. | Dec 2006 | A1 |
20070003097 | Langberg et al. | Jan 2007 | A1 |
20070005351 | Sathyendra et al. | Jan 2007 | A1 |
20070025562 | Zalewski et al. | Feb 2007 | A1 |
20070033020 | (Kelleher) Francois et al. | Feb 2007 | A1 |
20070033494 | Wenger et al. | Feb 2007 | A1 |
20070038440 | Sung et al. | Feb 2007 | A1 |
20070041589 | Patel et al. | Feb 2007 | A1 |
20070058822 | Ozawa | Mar 2007 | A1 |
20070064817 | Dunne et al. | Mar 2007 | A1 |
20070067166 | Pan et al. | Mar 2007 | A1 |
20070081075 | Canova et al. | Apr 2007 | A1 |
20070088544 | Acero et al. | Apr 2007 | A1 |
20070100612 | Ekstrand et al. | May 2007 | A1 |
20070127668 | Ahya et al. | Jun 2007 | A1 |
20070136056 | Moogi et al. | Jun 2007 | A1 |
20070136059 | Gadbois | Jun 2007 | A1 |
20070150268 | Acero et al. | Jun 2007 | A1 |
20070154031 | Avendano et al. | Jul 2007 | A1 |
20070185587 | Kondo | Aug 2007 | A1 |
20070198254 | Goto et al. | Aug 2007 | A1 |
20070237271 | Pessoa et al. | Oct 2007 | A1 |
20070244695 | Manjunath et al. | Oct 2007 | A1 |
20070253574 | Soulodre | Nov 2007 | A1 |
20070276656 | Solbach et al. | Nov 2007 | A1 |
20070282604 | Gartner et al. | Dec 2007 | A1 |
20070287490 | Green et al. | Dec 2007 | A1 |
20080019548 | Avendano | Jan 2008 | A1 |
20080069366 | Soulodre | Mar 2008 | A1 |
20080111734 | Fam et al. | May 2008 | A1 |
20080117901 | Klammer | May 2008 | A1 |
20080118082 | Seltzer et al. | May 2008 | A1 |
20080140396 | Grosse-Schulte et al. | Jun 2008 | A1 |
20080159507 | Virolainen et al. | Jul 2008 | A1 |
20080160977 | Ahmaniemi et al. | Jul 2008 | A1 |
20080187143 | Mak-Fan | Aug 2008 | A1 |
20080192955 | Merks | Aug 2008 | A1 |
20080192956 | Kazama | Aug 2008 | A1 |
20080195384 | Jabri et al. | Aug 2008 | A1 |
20080208575 | Laaksonen et al. | Aug 2008 | A1 |
20080212795 | Goodwin et al. | Sep 2008 | A1 |
20080233934 | Diethom | Sep 2008 | A1 |
20080247567 | Kjolerbakken et al. | Oct 2008 | A1 |
20080259731 | Happonen | Oct 2008 | A1 |
20080298571 | Kurtz et al. | Dec 2008 | A1 |
20080304677 | Abolfathi et al. | Dec 2008 | A1 |
20080310646 | Amada | Dec 2008 | A1 |
20080317259 | Zhang et al. | Dec 2008 | A1 |
20080317261 | Yoshida et al. | Dec 2008 | A1 |
20090012783 | Klein | Jan 2009 | A1 |
20090012784 | Murgia et al. | Jan 2009 | A1 |
20090018828 | Nakadai et al. | Jan 2009 | A1 |
20090034755 | Short et al. | Feb 2009 | A1 |
20090048824 | Amada | Feb 2009 | A1 |
20090060222 | Jeong et al. | Mar 2009 | A1 |
20090063143 | Schmidt et al. | Mar 2009 | A1 |
20090070118 | Den Brinker et al. | Mar 2009 | A1 |
20090086986 | Schmidt et al. | Apr 2009 | A1 |
20090089054 | Wang et al. | Apr 2009 | A1 |
20090106021 | Zurek et al. | Apr 2009 | A1 |
20090112579 | Li et al. | Apr 2009 | A1 |
20090116656 | Lee et al. | May 2009 | A1 |
20090119096 | Gerl et al. | May 2009 | A1 |
20090119099 | Lee et al. | May 2009 | A1 |
20090134829 | Baumann et al. | May 2009 | A1 |
20090141908 | Jeong et al. | Jun 2009 | A1 |
20090144053 | Tamura et al. | Jun 2009 | A1 |
20090144058 | Sorin | Jun 2009 | A1 |
20090147942 | Culter | Jun 2009 | A1 |
20090150149 | Culter et al. | Jun 2009 | A1 |
20090164905 | Ko | Jun 2009 | A1 |
20090192790 | Ei-Maleh et al. | Jul 2009 | A1 |
20090192791 | El-Maleh et al. | Jul 2009 | A1 |
20090204413 | Sintes et al. | Aug 2009 | A1 |
20090216526 | Schmidt et al. | Aug 2009 | A1 |
20090226005 | Acero et al. | Sep 2009 | A1 |
20090226010 | Schnell et al. | Sep 2009 | A1 |
20090228272 | Herbig et al. | Sep 2009 | A1 |
20090240497 | Usher et al. | Sep 2009 | A1 |
20090257609 | Gerkmann et al. | Oct 2009 | A1 |
20090262969 | Short et al. | Oct 2009 | A1 |
20090264114 | Virolainen et al. | Oct 2009 | A1 |
20090287481 | Paranjpe et al. | Nov 2009 | A1 |
20090292536 | Hetherington et al. | Nov 2009 | A1 |
20090303350 | Terada | Dec 2009 | A1 |
20090323655 | Cardona et al. | Dec 2009 | A1 |
20090323925 | Sweeney et al. | Dec 2009 | A1 |
20090323981 | Cutler | Dec 2009 | A1 |
20090323982 | Solbach et al. | Dec 2009 | A1 |
20100004929 | Baik | Jan 2010 | A1 |
20100017205 | Visser et al. | Jan 2010 | A1 |
20100033427 | Marks et al. | Feb 2010 | A1 |
20100036659 | Haulick et al. | Feb 2010 | A1 |
20100092007 | Sun | Apr 2010 | A1 |
20100094643 | Avendano et al. | Apr 2010 | A1 |
20100105447 | Sibbald et al. | Apr 2010 | A1 |
20100128123 | DiPoala | May 2010 | A1 |
20100130198 | Kannappan et al. | May 2010 | A1 |
20100211385 | Sehlstedt | Aug 2010 | A1 |
20100215184 | Buck et al. | Aug 2010 | A1 |
20100217837 | Ansari et al. | Aug 2010 | A1 |
20100228545 | Ito et al. | Sep 2010 | A1 |
20100245624 | Beaucoup | Sep 2010 | A1 |
20100278352 | Petit et al. | Nov 2010 | A1 |
20100280824 | Petit et al. | Nov 2010 | A1 |
20100296668 | Lee et al. | Nov 2010 | A1 |
20100303298 | Marks et al. | Dec 2010 | A1 |
20100315482 | Rosenfeld et al. | Dec 2010 | A1 |
20110038486 | Beaucoup | Feb 2011 | A1 |
20110038557 | Closset et al. | Feb 2011 | A1 |
20110044324 | Li et al. | Feb 2011 | A1 |
20110075857 | Aoyagi | Mar 2011 | A1 |
20110081024 | Soulodre | Apr 2011 | A1 |
20110081026 | Ramakrishnan et al. | Apr 2011 | A1 |
20110107367 | Georgis et al. | May 2011 | A1 |
20110129095 | Avendano et al. | Jun 2011 | A1 |
20110137646 | Ahgren et al. | Jun 2011 | A1 |
20110142257 | Goodwin et al. | Jun 2011 | A1 |
20110173006 | Nagel et al. | Jul 2011 | A1 |
20110173542 | Imes et al. | Jul 2011 | A1 |
20110182436 | Murgia et al. | Jul 2011 | A1 |
20110184732 | Godavarti | Jul 2011 | A1 |
20110184734 | Wang et al. | Jul 2011 | A1 |
20110191101 | Uhle et al. | Aug 2011 | A1 |
20110208520 | Lee | Aug 2011 | A1 |
20110224994 | Norvell et al. | Sep 2011 | A1 |
20110257965 | Hardwick | Oct 2011 | A1 |
20110257967 | Every et al. | Oct 2011 | A1 |
20110264449 | Sehlstedt | Oct 2011 | A1 |
20110280154 | Silverstrim et al. | Nov 2011 | A1 |
20110286605 | Furuta et al. | Nov 2011 | A1 |
20110300806 | Lindahl et al. | Dec 2011 | A1 |
20110305345 | Bouchard et al. | Dec 2011 | A1 |
20120027217 | Jun et al. | Feb 2012 | A1 |
20120050582 | Seshadri et al. | Mar 2012 | A1 |
20120062729 | Hart et al. | Mar 2012 | A1 |
20120116758 | Murgia et al. | May 2012 | A1 |
20120116769 | Malah | May 2012 | A1 |
20120123775 | Murgia et al. | May 2012 | A1 |
20120133728 | Lee | May 2012 | A1 |
20120182429 | Forutanpour et al. | Jul 2012 | A1 |
20120202485 | Mirbaha et al. | Aug 2012 | A1 |
20120209611 | Furuta et al. | Aug 2012 | A1 |
20120231778 | Chen et al. | Sep 2012 | A1 |
20120249785 | Sudo et al. | Oct 2012 | A1 |
20120250882 | Mohammad et al. | Oct 2012 | A1 |
20120257778 | Hall et al. | Oct 2012 | A1 |
20130034243 | Yermeche et al. | Feb 2013 | A1 |
20130051543 | McDysan et al. | Feb 2013 | A1 |
20130182857 | Namba et al. | Jul 2013 | A1 |
20130289988 | Fry | Oct 2013 | A1 |
20130289996 | Fry | Oct 2013 | A1 |
20130322461 | Poulsen | Dec 2013 | A1 |
20130332156 | Tackin et al. | Dec 2013 | A1 |
20130332171 | Avendano | Dec 2013 | A1 |
20130343549 | Vemireddy et al. | Dec 2013 | A1 |
20140003622 | Ikizyan et al. | Jan 2014 | A1 |
20140350926 | Schuster et al. | Nov 2014 | A1 |
20140379348 | Sung | Dec 2014 | A1 |
20150025881 | Carlos et al. | Jan 2015 | A1 |
20150078555 | Zhang et al. | Mar 2015 | A1 |
20150078606 | Zhang et al. | Mar 2015 | A1 |
20150208165 | Volk et al. | Jul 2015 | A1 |
20160037245 | Harrington | Feb 2016 | A1 |
20160061934 | Woodruff et al. | Mar 2016 | A1 |
20160078880 | Avendano | Mar 2016 | A1 |
20160093307 | Warren et al. | Mar 2016 | A1 |
20160094910 | Vallabhan et al. | Mar 2016 | A1 |
Number | Date | Country |
---|---|---|
105474311 | Apr 2016 | CN |
112014003337 | Mar 2016 | DE |
1081685 | Mar 2001 | EP |
1536660 | Jun 2005 | EP |
20080623 | Nov 2008 | FI |
20110428 | Dec 2011 | FI |
20125600 | Jun 2012 | FI |
123080 | Oct 2012 | FI |
H05172865 | Jul 1993 | JP |
H05300419 | Nov 1993 | JP |
H07336793 | Dec 1995 | JP |
2004053895 | Feb 2004 | JP |
2004531767 | Oct 2004 | JP |
2004533155 | Oct 2004 | JP |
2005148274 | Jun 2005 | JP |
2005518118 | Jun 2005 | JP |
2005309096 | Nov 2005 | JP |
2006515490 | May 2006 | JP |
2007201818 | Aug 2007 | JP |
2008518257 | May 2008 | JP |
2008542798 | Nov 2008 | JP |
2009037042 | Feb 2009 | JP |
2009538450 | Nov 2009 | JP |
2012514233 | Jun 2012 | JP |
5081903 | Sep 2012 | JP |
2013513306 | Apr 2013 | JP |
2013527479 | Jun 2013 | JP |
5718251 | Mar 2015 | JP |
5855571 | Dec 2015 | JP |
1020070068270 | Jun 2007 | KR |
101050379 | Dec 2008 | KR |
1020080109048 | Dec 2008 | KR |
1020090013221 | Feb 2009 | KR |
1020110111409 | Oct 2011 | KR |
1020120094892 | Aug 2012 | KR |
1020120101457 | Sep 2012 | KR |
101294634 | Aug 2013 | KR |
101610662 | Apr 2016 | KR |
519615 | Feb 2003 | TW |
200847133 | Dec 2008 | TW |
201113873 | Apr 2011 | TW |
201143475 | Dec 2011 | TW |
I421858 | Jan 2014 | TW |
201513099 | Apr 2015 | TW |
WO1984000634 | Feb 1984 | WO |
WO2002007061 | Jan 2002 | WO |
WO2002080362 | Oct 2002 | WO |
WO2002103676 | Dec 2002 | WO |
WO2003069499 | Aug 2003 | WO |
WO2004010415 | Jan 2004 | WO |
WO2005086138 | Sep 2005 | WO |
WO2007140003 | Dec 2007 | WO |
WO2008034221 | Mar 2008 | WO |
WO2010077361 | Jul 2010 | WO |
WO2011002489 | Jan 2011 | WO |
WO2011068901 | Jun 2011 | WO |
WO2012094422 | Jul 2012 | WO |
WO2013188562 | Dec 2013 | WO |
WO2015010129 | Jan 2015 | WO |
WO2016040885 | Mar 2016 | WO |
WO2016049566 | Mar 2016 | WO |
Entry |
---|
Non-Final Office Action, dated Aug. 5, 2008, U.S. Appl. No. 11/441,675, filed May 25, 2006. |
Non-Final Office Action, dated Jan. 21, 2009, U.S. Appl. No. 11/441,675, filed May 25, 2006. |
Final Office Action, dated Sep. 3, 2009, U.S. Appl. No. 11/441,675, filed May 25, 2006. |
Non-Final Office Action, dated May 10, 2011, U.S. Appl. No. 11/441,675, filed May 25, 2006. |
Final Office Action, dated Oct. 24, 2011, U.S. Appl. No. 11/441,675, filed May 25, 2006. |
Notice of Allowance, dated Feb. 13, 2012, U.S. Appl. No. 11/441,675, filed May 25, 2006. |
Non-Final Office Action, dated Dec. 6, 2011, U.S. Appl. No. 12/319,107, filed Dec. 31, 2008. |
Final Office Action, dated Apr. 16, 2012, U.S. Appl. No. 12/319,107, filed Dec. 31, 2008. |
Advisory Action, dated Jun. 28, 2012, U.S. Appl. No. 12/319,107, filed Dec. 31, 2008. |
Non-Final Office Action, dated Jan. 3, 2014, U.S. Appl. No. 12/319,107, filed Dec. 31, 2008. |
Notice of Allowance, dated Aug. 25, 2014, U.S. Appl. No. 12/319,107, filed Dec. 31, 2008. |
Non-Final Office Action, dated Dec. 10, 2012, U.S. Appl. No. 12/493,927, filed Jun. 29, 2009. |
Final Office Action, dated May 14, 2013, U.S. Appl. No. 12/493,927, filed Jun. 29, 2009. |
Non-Final Office Action, dated Jan. 9, 2014, U.S. Appl. No. 12/493,927, filed Jun. 29, 2009. |
Notice of Allowance, dated Aug. 20, 2014, U.S. Appl. No. 12/493,927, filed Jun. 29, 2009. |
Non-Final Office Action, dated Aug. 28, 2012, U.S. Appl. No. 12/860,515, filed Aug. 20, 2010. |
Final Office Action, dated Mar. 11, 2013, U.S. Appl. No. 12/860,515, filed Aug. 20, 2010. |
Non-Final Office Action, dated Aug. 28, 2013, U.S. Appl. No. 12/860,515, filed Aug. 20, 2010. |
Notice of Allowance, dated Jun. 18, 2014, U.S. Appl. No. 12/860,515, filed Aug. 20, 2010. |
Non-Final Office Action, dated Oct. 2, 2012, U.S. Appl. No. 12/906,009, filed Oct. 15, 2010. |
Non-Final Office Action, dated Jul. 2, 2013, U.S. Appl. No. 12/906,009, filed Oct. 15, 2010. |
Final Office Action, dated May 7, 2014, U.S. Appl. No. 12/906,009, filed Oct. 15, 2010. |
Non-Final Office Action, dated Apr. 21, 2015, U.S. Appl. No. 12/906,009, filed Oct. 15, 2010. |
Non-Final Office Action, dated Jul. 31, 2013, U.S. Appl. No. 13/009,732, filed Jan. 19, 2011. |
Final Office Action, dated Dec. 16, 2014, U.S. Appl. No. 13/009,732, filed Jan. 19, 2011. |
Non-Final Office Action, dated Apr. 24, 2013, U.S. Appl. No. 13/012,517, filed Jan. 24, 2011. |
Final Office Action, dated Dec. 3, 2013, U.S. Appl. No. 13/012,517, filed Jan. 24, 2011. |
Non-Final Office Action, dated Nov. 19, 2014, U.S. Appl. No. 13/012,517, filed Jan. 24, 2011. |
Final Office Action, dated Jun. 17, 2015, U.S. Appl. No. 13/012,517, filed Jan. 24, 2011. |
Non-Final Office Action, dated Feb. 21, 2012, U.S. Appl. No. 13/288,858, filed Nov. 3, 2011. |
Notice of Allowance, dated Sep. 10, 2012, U.S. Appl. No. 13/288,858, filed Nov. 3, 2011. |
Non-Final Office Action, dated Feb. 14, 2012, U.S. Appl. No. 13/295,981, filed Nov. 14, 2011. |
Final Office Action, dated Jul. 9, 2012, U.S. Appl. No. 13/295,981, filed Nov. 14, 2011. |
Final Office Action, dated Jul. 17, 2012, U.S. Appl. No. 13/295,981, filed Nov. 14, 2011. |
Advisory Action, dated Sep. 24, 2012, U.S. Appl. No. 13/295,981, filed Nov. 14, 2011. |
Notice of Allowance, dated May 9, 2014, U.S. Appl. No. 13/295,981, filed Nov. 14, 2011. |
Non-Final Office Action, dated Feb. 1, 2016, U.S. Appl. No. 14/335,850, filed Jul. 18, 2014. |
Office Action dated Jan. 30, 2015 in Finland Patent Application No. 20080623, filed May 24, 2007. |
Office Action dated Mar. 27, 2015 in Korean Patent Application No. 10-2011-7016591, filed Dec. 30, 2009. |
Notice of Allowance dated Aug. 13, 2015 in Finnish Patent Application 20080623, filed May 24, 2007. |
Office Action dated Oct. 15, 2015 in Korean Patent Application 10-2011-7016591. |
Notice of Allowance dated Jan. 14, 2016 in South Korean Patent Application No. 10-2011-7016591 filed Jul. 15, 2011. |
International Search Report & Written Opinion dated Feb. 12, 2016 in Patent Cooperation Treaty Application No. PCT/US2015/064523, filed Dec. 8, 2015. |
International Search Report & Written Opinion dated Feb. 11, 2016 in Patent Cooperation Treaty Application No. PCT/US2015/063519, filed Dec. 2, 2015. |
Klein, David, “Noise-Robust Multi-Lingual Keyword Spotting with a Deep Neural Network Based Architecture”, U.S. Appl. No. 14/614,348, filed Feb. 4, 2015. |
Vitus, Deborah Kathleen et al., “Method for Modeling User Possession of Mobile Device for User Authentication Framework”, U.S. Appl. No. 14/548,207, filed Nov. 19, 2014. |
Murgia, Carlo, “Selection of System Parameters Based on Non-Acoustic Sensor Information”, U.S. Appl. No. 14/331,205, filed Jul. 14, 2014. |
Goodwin, Michael M. et al., “Key Click Suppression”, U.S. Appl. No. 14/745,176, filed Jun. 19, 2015. |
Boll, Steven F. “Suppression of Acoustic Noise in Speech using Spectral Subtraction”, IEEE Transactions on Acoustics, Speech and Signal Processing, vol. ASSP-27, No. 2, Apr. 1979, pp. 113-120. |
“ENT 172.” Instructional Module. Prince George's Community College Department of Engineering Technology Accessed: Oct. 15, 2011. Subsection: “Polar and Rectangular Notation”. <http://academic.ppgcc.edu/ent/ent172_instr_mod.html>. |
Fulghum, D. P. et al., “LPC Voice Digitizer with Background Noise Suppression”, 1979 IEEE International Conference on Acoustics, Speech, and Signal Processing, pp. 220-223. |
Haykin, Simon et al., “Appendix A.2 Complex Numbers.” Signals and Systems. 2nd Ed. 2003. p. 764. |
Hohmann, V. “Frequency Analysis and Synthesis Using a Gammatone Filterbank”, ACTA Acustica United with Acustica, 2002, vol. 88, pp. 433-442. |
Martin, Rainer “Spectral Subtraction Based on Minimum Statistics”, in Proceedings Europe. Signal Processing Conf., 1994, pp. 1182-1185. |
Mitra, Sanjit K. Digital Signal Processing: a Computer-based Approach. 2nd Ed. 2001. pp. 131-133. |
Cosi, Piero et al., (1996), “Lyon's Auditory Model Inversion: a Tool for Sound Separation and Speech Enhancement,” Proceedings of ESCA Workshop on ‘The Auditory Basis of Speech Perception,’ Keele University, Keele (UK), Jul. 15-19, 1996, pp. 194-197. |
Rabiner, Lawrence R. et al., “Digital Processing of Speech Signals”, (Prentice-Hall Series in Signal Processing). Upper Saddle River, NJ: Prentice Hall, 1978. |
Schimmel, Steven et al., “Coherent Envelope Detection for Modulation Filtering of Speech,” 2005 IEEE International Conference on Acoustics, Speech, and Signal Processing, vol. 1, No. 7, pp. 221-224. |
Slaney, Malcom, et al., “Auditory Model Inversion for Sound Separation,” 1994 IEEE International Conference on Acoustics, Speech and Signal Processing, Apr. 19-22, vol. 2, pp. 77-80. |
Slaney, Malcom. “An Introduction to Auditory Model Inversion”, Interval Technical Report IRC 1994-014, http://coweb.ecn.purdue.edu/˜maclom/interval/1994-014/, Sep. 1994, accessed on Jul. 6, 2010. |
Solbach, Ludger “An Architecture for Robust Partial Tracking and Onset Localization in Single Channel Audio Signal Mixes”, Technical University Hamburg—Harburg, 1998. |
International Search Report and Written Opinion dated Sep. 16, 2008 in Patent Cooperation Treaty Application No. PCT/US2007/012628. |
International Search Report and Written Opinion dated May 20, 2010 in Patent Cooperation Treaty Application No. PCT/US2009/006754. |
Fast Cochlea Transform, US Trademark Reg. No. 2,875,755 (Aug. 17, 2004). |
3GPP2 “Enhanced Variable Rate Codec, Speech Service Options 3, 68, 70, and 73 for Wideband Spread Spectrum Digital Systems”, May 2009, pp. 1-308. |
3GPP2 “Selectable Mode Vocoder (SMV) Service Option for Wideband Spread Spectrum Communication Systems”, Jan. 2004, pp. 1-231. |
3GPP2 “Source-Controlled Variable-Rate Multimode Wideband Speech Codec (VMR-WB) Service Option 62 for Spread Spectrum Systems”, Jun. 11, 2004, pp. 1-164. |
3GPP “3GPP Specification 26.071 Mandatory Speech Codec Speech Processing Functions; AMR Speech Codec; General Description”, http://www.3gpp.org/ftp/Specs/html-info/26071.htm, accessed on Jan. 25, 2012. |
3GPP “3GPP Specification 26.094 Mandatory Speech Codec Speech Processing Functions; Adaptive Multi-Rate (AMR) Speech Codec; Voice Activity Detector (VAD)”, http://www.3gpp.org/ftp/Specs/html-info/26094.htm, accessed on Jan. 25, 2012. |
3GPP “3GPP Specification 26.171 Speech Codec Speech Processing Functions; Adaptive Multi-Rate—Wideband (AMR-WB) Speech Codec; General Description”, http://www.3gpp.org/ftp/Specs/html-info26171.htm, accessed on Jan. 25, 2012. |
3GPP “3GPP Specification 26.194 Speech Codec Speech Processing Functions; Adaptive Multi-Rate—Wideband (AMR-WB) Speech Codec; Voice Activity Detector (VAD)” http://www.3gpp.org/ftp/Specs/html-info26194.htm, accessed on Jan. 25, 2012. |
International Telecommunication Union “Coding of Speech at 8 kbit/s Using Conjugate-Structure Algebraic-code-excited Linear-prediction (CS-ACELP)”, Mar. 19, 1996, pp. 1-39. |
International Telecommunication Union “Coding of Speech at 8 kbit/s Using Conjugate Structure Algebraic-code-excited Linear-prediction (CS-ACELP) Annex B: A Silence Compression Scheme for G.729 Optimized for Terminals Conforming to Recommendation V.70”, Nov. 8, 1996, pp. 1-23. |
International Search Report and Written Opinion dated Aug. 19, 2010 in Patent Cooperation Treaty Application No. PCT/US2010/001786. |
Cisco, “Understanding How Digital T1 CAS (Robbed Bit Signaling) Works in IOS Gateways”, Jan. 17, 2007, http://www.cisco.com/image/gif/paws/22444/t1-cas-ios.pdf, accessed on Apr. 3, 2012. |
Jelinek et al., “Noise Reduction Method for Wideband Speech Coding” Proc. Eusipco, Vienna, Austria, Sep. 2004, pp. 1959-1962. |
Widjaja et al., “Application of Differential Microphone Array for IS-127 EVRC Rate Determination Algorithm”, Interspeech 2009, 10th Annual Conference of the International Speech Communication Association, Brighton, United Kingdom Sep. 6-10, 2009, pp. 1123-1126. |
Sugiyama et al., “Single-Microphone Noise Suppression for 3G Handsets Based on Weighted Noise Estimation” in Benesty et al., “Speech Enhancement”, 2005, pp. 115-133, Springer Berlin Heidelberg. |
Watts, “Real-Time, High-Resolution Simulation of the Auditory Pathway, with Application to Cell-Phone Noise Reduction” Proceedings of 2010 IEEE International Symposium on Circuits and Systems (ISCAS), May 30-Jun. 2, 2010, pp. 3821-3824. |
3GPP Minimum Performance Specification for the Enhanced Variable rate Codec, Speech Service Option 3 and 68 for Wideband Spread Spectrum Digital Systems, Jul. 2007, pp. 1-83. |
Ramakrishnan, 2000. Reconstruction of Incomplete Spectrograms for robust speech recognition. PHD thesis, Carnegie Mellon University, Pittsburgh, Pennsylvania. |
Kim et al., “Missing-Feature Reconstruction by Leveraging Temporal Spectral Correlation for Robust Speech Recognition in Background Noise Conditions, ”Audio, Speech, and Language Processing, IEEE Transactions on, vol. 18, No. 8 pp. 2111-2120, Nov. 2010. |
Cooke et al.,“Robust Automatic Speech Recognition with Missing and Unreliable Acoustic data,” Speech Commun., vol. 34, No. 3, pp. 267-285, 2001. |
Liu et al., “Efficient cepstral normalization for robust speech recognition.” Proceedings of the workshop on Human Language Technology. Association for Computational Linguistics, 1993. |
Yoshizawa et al., “Cepstral gain normalization for noise robust speech recognition.” Acoustics, Speech, and Signal Processing, 2004. Proceedings, (ICASSP04), IEEE International Conference on vol. 1 IEEE, 2004. |
Office Action dated Apr. 8, 2014 in Japan Patent Application 2011-544416, filed Dec. 30, 2009. |
Elhilali et al.,“A cocktail party with a cortical twist: How cortical mechanisms contribute to sound segregation.” J Acoust Soc Am. Dec. 2008; 124(6): 3751-3771). |
Jin et al., “HMM-Based Multipitch Tracking for Noisy and Reverberant Speech.” Jul. 2011. |
Kawahara, W., et al., “Tandem-Straight: A temporally stable power spectral representation for periodic signals and applications to interference-free spectrum, F0, and aperiodicity estimation.” IEEE ICASSP 2008. |
Lu et al. “A Robust Audio Classification and Segmentation Method.” Microsoft Research, 2001, pp. 203, 206, and 207. |
International Search Report & Written Opinion dated Nov. 12, 2014 in Patent Cooperation Treaty Application No. PCT/US2014/047458, filed Jul. 21, 2014. |
Krini, Mohamed et al., “Model-Based Speech Enhancement,” in Speech and Audio Processing in Adverse Environments; Signals and Communication Technology, edited by Hansler et al., 2008, Chapter 4, pp. 89-134. |
Office Action dated Dec. 9, 2014 in Japan Patent Application No. 2012-518521, filed Jun. 21, 2010. |
Office Action dated Dec. 10, 2014 in Taiwan Patent Application No. 099121290, filed Jun. 29, 2010. |
Purnhagen, Heiko, “Low Complexity Parametric Stereo Coding in MPEG-4,” Proc. Of the 7th Int. Conference on Digital Audio Effects (DAFx'04), Naples, Italy, Oct. 5-8, 2004. |
Chang, Chun-Ming et al., “Voltage-Mode Multifunction Filter with Single Input and Three Outputs Using Two Compound Current Conveyors” IEEE Transactions on Circuits and Systems—I: Fundamental Theory and Applications, vol. 46, No. 11, Nov. 1999. |
Nayebi et al., “Low delay FIR filter banks: design and evaluation” IEEE Transactions on Signal Processing, vol. 42, No. 1, pp. 24-31, Jan. 1994. |
Notice of Allowance dated Feb. 17, 2015 in Japan Patent Application No. 2011-544416, filed Dec. 30, 2009. |
International Search Report and Written Opinion dated Feb. 7, 2011 in Patent Cooperation Treaty Application No. PCT/US10/58600. |
International Search Report dated Dec. 20, 2013 in Patent Cooperation Treaty Application No. PCT/US2013/045462, filed Jun. 12, 2013. |
Office Action dated Aug. 26, 2014 in Japanese Application No. 2012-542167, filed Dec. 1, 2010. |
Office Action dated Oct. 31, 2014 in Finnish Patent Application No. 20125600, filed Jun. 1, 2012. |
Office Action dated Jul. 21, 2015 in Japanese Patent Application 2012-542167 filed Dec. 1, 2010. |
Office Action dated Sep. 29, 2015 in Finnish Patent Application 20125600, filed Dec. 1, 2010. |
Allowance dated Nov. 17, 2015 in Japanese Patent Application 2012-542167, filed Dec. 1, 2010. |
International Search Report & Written Opinion dated Dec. 14, 2015 in Patent Cooperation Treaty Application No. PCT/US2015/049816, filed Sep. 11, 2015. |
International Search Report & Written Opinion dated Dec. 22, 2015 in Patent Cooperation Treaty Application No. PCT/US2015/052433, filed Sep. 25, 2015. |
Number | Date | Country | |
---|---|---|---|
20160078880 A1 | Mar 2016 | US |
Number | Date | Country | |
---|---|---|---|
62049988 | Sep 2014 | US |