This application relates to acoustic activity detection (AAD) approaches and voice activity detection (VAD) approaches, and their interfacing with other types of electronic devices.
Voice activity detection (VAD) approaches are important components of speech recognition software and hardware. For example, recognition software constantly scans the audio signal of a microphone searching for voice activity, usually with a MIPS-intensive algorithm. Since the algorithm is constantly running, the power used in this voice detection approach is significant.
Microphones are also disposed in mobile device products such as cellular phones. These customer devices have a standardized interface. If the microphone is not compatible with this interface it cannot be used with the mobile device product.
Many mobile device products have speech recognition included with the mobile device. However, the power usage of the algorithms is taxing enough to the battery that the feature is often enabled only after the user presses a button or wakes up the device. In order to enable this feature at all times, the power consumption of the overall solution must be small enough to have minimal impact on the total battery life of the device. As mentioned, this has not occurred with existing devices.
Because of the above-mentioned problems, some user dissatisfaction with previous approaches has occurred.
For a more complete understanding of the disclosure, reference should be made to the following detailed description and accompanying drawings wherein:
Skilled artisans will appreciate that elements in the figures are illustrated for simplicity and clarity. It will further be appreciated that certain actions and/or steps may be described or depicted in a particular order of occurrence while those skilled in the art will understand that such specificity with respect to sequence is not actually required. It will also be understood that the terms and expressions used herein have the ordinary meaning as is accorded to such terms and expressions with respect to their corresponding respective areas of inquiry and study except where specific meanings have otherwise been set forth herein.
Approaches are described herein that integrate voice activity detection (VAD) or acoustic activity detection (AAD) approaches into microphones. At least some of the microphone components (e.g., VAD or AAD modules) are disposed at or on an application specific integrated circuit (ASIC) or other integrated device. The integration of components such as the VAD or AAD modules significantly reduces the power requirements of the system, thereby increasing user satisfaction with the system. An interface is also provided between the microphone and circuitry in an electronic device (e.g., cellular phone or personal computer) in which the microphone is disposed. The interface is standardized so that its configuration allows placement of the microphone in most if not all electronic devices (e.g., cellular phones). The microphone operates in multiple modes of operation including a lower power mode that still detects acoustic events such as voice signals.
In many of these embodiments, an external clock signal having a first frequency is received. An automatic determination is made for a division ratio based at least in part upon a second frequency of an internal clock, the second frequency being greater than the first frequency. A decimation factor is automatically determined based at least in part upon the first frequency of the external clock signal, the second frequency of the internal clock signal, and a predetermined desired sampling frequency. The division ratio is applied to the internal clock signal to reduce the second frequency to a reduced third frequency. The decimation factor is applied to the reduced third frequency to provide the predetermined desired sampling frequency. Data is clocked to a buffer using the predetermined desired sampling frequency.
In other aspects, the external clock signal is subsequently removed. In other examples, the predetermined desired sampling frequency comprises a frequency rate of approximately 16 kHz.
In others of these embodiments, an apparatus includes interface circuitry that has an input and an output, and the input is configured to receive an external clock signal having a first frequency. The apparatus also includes processing circuitry, and the processing circuitry is coupled to the interface circuitry and configured to automatically determine a division ratio based at least in part upon a second frequency of an internal clock, the second frequency being greater than the first frequency. The processing circuitry is further configured to automatically determine a decimation factor based at least in part upon the first frequency of the external clock signal, the second frequency of the internal clock signal, and a predetermined desired sampling frequency. The processing circuitry is further configured to apply the division ratio to the internal clock signal to reduce the second frequency to a reduced third frequency and to apply the decimation factor to the reduced third frequency to provide the predetermined desired sampling frequency. The processing circuitry is further configured to clock data to a buffer via the output using the predetermined desired sampling frequency.
Referring now to
The charge pump 101 provides a voltage to charge up and bias a diaphragm of the capacitive MEMS sensor 102. For some applications (e.g., when using a piezoelectric device as a sensor), the charge pump may be replaced with a power supply that may be external to the microphone. When a voice or other acoustic signal moves the diaphragm, the capacitance of the capacitive MEMS sensor 102 changes, and a voltage is created that becomes an electrical signal. In one aspect, the charge pump 101 and the MEMS sensor 102 are not disposed on the ASIC (but in other aspects, they may be disposed on the ASIC). It will be appreciated that the MEMS sensor 102 may alternatively be a piezoelectric sensor, a speaker, or any other type of sensing device or arrangement.
The clock detector 104 controls which clock goes to the sigma-delta modulator 106 and synchronizes the digital section of the ASIC. If an external clock is present, the clock detector 104 uses that clock; if no external clock signal is present, then the clock detector 104 uses an internal oscillator 103 for data timing/clocking purposes.
The sigma-delta modulator 106 converts the analog signal into a digital signal. The output of the sigma-delta modulator 106 is a one-bit serial stream, in one aspect. Alternatively, the sigma-delta modulator 106 may be any type of analog-to-digital converter.
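By way of illustration only, the one-bit serial stream can be pictured with a textbook first-order sigma-delta loop. The following is a minimal sketch and is not asserted to be the modulator used on the ASIC; the function name `sigma_delta_first_order` and the normalization of the input to the range -1.0 to 1.0 are assumptions made for the example.

```python
from typing import Iterable, List


def sigma_delta_first_order(samples: Iterable[float]) -> List[int]:
    """Convert samples in [-1.0, 1.0] to a one-bit stream (textbook first-order loop)."""
    integrator = 0.0
    feedback = 0.0  # previous output bit mapped to +1.0 / -1.0 (0.0 before the first bit)
    bits: List[int] = []
    for x in samples:
        # Accumulate the error between the input and the fed-back output.
        integrator += x - feedback
        # One-bit quantizer: emit 1 when the accumulated error is non-negative.
        bit = 1 if integrator >= 0.0 else 0
        feedback = 1.0 if bit else -1.0
        bits.append(bit)
    return bits
```

In this sketch the density of ones in the output tracks the input level, which is the property a downstream decimator relies upon when recovering multi-bit samples.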
The buffer 110 stores data and constitutes a running storage of past data. By the time acoustic activity is detected, this past additional data is stored in the buffer 110. In other words, the buffer 110 stores a history of past audio activity. When an audio event happens (e.g., a trigger word is detected), the control module 112 instructs the buffer 110 to spool out data from the buffer 110. In one example, the buffer 110 stores approximately the previous 180 ms of data generated prior to the activity detection. Once the activity has been detected, the microphone 100 transmits the buffered data to the host (e.g., electronic circuitry in a customer device such as a cellular phone).
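The running-history behavior described above may be sketched, in one non-limiting example, as a fixed-capacity ring buffer. The class name `HistoryBuffer`, its method names, and the assumption that the buffer holds 16 kHz samples are hypothetical choices made for illustration; only the approximately 180 ms history length comes from the description.

```python
from collections import deque
from typing import Deque, List


class HistoryBuffer:
    """Keeps only the most recent `history_ms` of samples (hypothetical helper)."""

    def __init__(self, sample_rate_hz: int = 16_000, history_ms: int = 180) -> None:
        # 16 kHz * 180 ms = 2880 samples; the 16 kHz rate is an assumption.
        self.capacity = sample_rate_hz * history_ms // 1000
        self._samples: Deque[int] = deque(maxlen=self.capacity)

    def push(self, sample: int) -> None:
        # When the deque is full, the oldest sample is discarded automatically.
        self._samples.append(sample)

    def spool_out(self) -> List[int]:
        # Called once activity is detected: return the history, oldest first,
        # and empty the buffer.
        data = list(self._samples)
        self._samples.clear()
        return data
```

A bounded deque is used here so that storage never grows beyond the history window, mirroring the fixed buffer size of a hardware implementation.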
The acoustic activity detection (AAD) module 108 detects acoustic activity. Various approaches can be used to detect such events as the occurrence of a trigger word, trigger phrase, specific noise or sound, and so forth. In one aspect, the module 108 monitors the incoming acoustic signals looking for a voice-like signature (or monitors for other appropriate characteristics or thresholds). Upon detection of acoustic activity that meets the trigger requirements, the microphone 100 transmits a pulse density modulation (PDM) stream to wake up the rest of the system chain to complete the full voice recognition process. Other types of data could also be used.
The control module 112 controls when the data is transmitted from the buffer. As discussed elsewhere herein, when activity has been detected by the AAD module 108, then the data is clocked out over an interface 119 that includes a VDD pin 120, a clock pin 122, a select pin 124, a data pin 126 and a ground pin 128. The pins 120-128 form the interface 119 that is recognizable and compatible in operation with various types of electronic circuits, for example, those types of circuits that are used in cellular phones. In one aspect, the microphone 100 uses the interface 119 to communicate with circuitry inside a cellular phone. Since the interface 119 is standardized as between cellular phones, the microphone 100 can be placed or disposed in any phone that utilizes the standard interface. The interface 119 seamlessly connects to compatible circuitry in the cellular phone. Other interfaces are possible with other pin outs. Different pins could also be used for interrupts.
In operation, the microphone 100 operates in a variety of different modes and several states that cover these modes. For instance, when a clock signal (with a frequency falling within a predetermined range) is supplied to the microphone 100, the microphone 100 is operated in a standard operating mode. If the frequency is not within that range, the microphone 100 is operated within a sensing mode. In the sensing mode, the internal oscillator 103 of the microphone 100 is being used and, upon detection of an acoustic event, data transmissions are aligned with the rising clock edge, where the clock is the internal clock.
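The mode selection just described can be sketched as follows. This is a simplified illustration only: the function name `select_mode` and the numeric bounds in `STANDARD_RANGE_HZ` are placeholders, since the description states only that the range is predetermined.

```python
from typing import Optional, Tuple

# Placeholder bounds; the description says only "a predetermined range".
STANDARD_RANGE_HZ: Tuple[float, float] = (1_000_000.0, 4_000_000.0)


def select_mode(
    external_clock_hz: Optional[float],
    standard_range_hz: Tuple[float, float] = STANDARD_RANGE_HZ,
) -> Tuple[str, str]:
    """Return (operating mode, clock source) for the supplied external clock."""
    low, high = standard_range_hz
    # A clock within the predetermined range selects the standard operating mode.
    if external_clock_hz is not None and low <= external_clock_hz <= high:
        return ("standard", "external")
    # Otherwise the microphone senses using its internal oscillator.
    return ("sensing", "internal")
```

Passing `None` models the case in which no external clock signal is present.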
Referring now to
In addition, the microphone 100 of
The low pass filter 140 removes higher-frequency components from the output of the charge pump. The reference 142 provides a voltage or other reference used by components within the system as a convenient reference value. The decimation/compression module 144 decimates or compresses the data before it is stored so as to minimize the buffer size. The decompression PDM module 146 decompresses the stored data for the control module. The pre-amplifier 148 brings the sensor output signal to a usable voltage level.
The components identified by the label 100 in
Referring now to
In sensing mode, the output of the microphone is tri-stated and an internal clock is applied to the sensing circuit. Once the AAD module triggers (e.g., sends a trigger signal indicating an acoustic event has occurred), the microphone transmits buffered PDM data on the microphone data pin (e.g., data pin 126) synchronized with the internal clock (e.g., a 512 kHz clock). This internal clock will be supplied to the select pin (e.g., select pin 124) as an output during this mode. In this mode, the data will be valid on the rising edge of the internally generated clock (output on the select pin). This operation assures compatibility with existing I2S-compatible hardware blocks. The clock pin (e.g., clock pin 122) and the data pin (e.g., data pin 126) will stop outputting data a set time after activity is no longer detected. The frequency for this mode is defined in the datasheet for the part in question. In other examples, the interface is compatible with the PDM protocol or the I2C protocol. Other examples are possible.
The operation of the microphone described above is shown in
For compatibility with DMIC-compliant interfaces in sensing mode, the clock pin (e.g., clock pin 122) can be driven to clock out the microphone data. The clock must meet the sensing mode requirements for frequency (e.g., 512 kHz). When an external clock signal is detected on the clock pin (e.g., clock pin 122), the data driven on the data pin (e.g., data pin 126) is synchronized with the external clock within two cycles, in one example. Other examples are possible. In this mode, the external clock is removed when activity is no longer detected so that the microphone returns to its lowest power mode. Activity detection in this mode may use the select pin (e.g., select pin 124) to determine if activity is no longer sensed. Other pins may also be used.
This operation is shown in
Referring now to
The state transition diagram of
The microphone off state 402 is where the microphone 400 is deactivated. The normal mode state 404 is the state during the normal operating mode when the external clock is being applied (where the external clock is within a predetermined range). The microphone sensing mode with external clock state 406 is when the mode is switching to the external clock as shown in
As mentioned, transitions between these states are based on and triggered by events. To take one example, if the microphone is operating in normal operating state 404 (e.g., at a clock rate higher than 512 kHz) and the control module detects that the clock on the clock pin is approximately 512 kHz, then control goes to the microphone sensing mode with external clock state 406. In the external clock state 406, when the control module then detects no clock on the clock pin, control goes to the microphone sensing mode internal clock state 408. When in the microphone sensing mode internal clock state 408, and an acoustic event is detected, control goes to the sensing mode with output state 410. When in the sensing mode with output state 410, a clock of greater than approximately 1 MHz may cause control to return to state 404. The clock may be less than 1 MHz (e.g., the same frequency as the internal oscillator) and is used to synchronize data being output from the microphone to an external processor. No acoustic activity for an OTP-programmed amount of time, on the other hand, causes control to return to state 406.
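The example transitions recited above can be captured in a small lookup table. The sketch below covers only those transitions; the event names are hypothetical labels introduced for illustration, and leaving the state unchanged for any unlisted state and event pair is an assumption rather than something the description specifies.

```python
from enum import Enum
from typing import Dict, Tuple


class State(Enum):
    OFF = 402
    NORMAL = 404
    SENSING_EXTERNAL_CLOCK = 406
    SENSING_INTERNAL_CLOCK = 408
    SENSING_WITH_OUTPUT = 410


class Event(Enum):
    # Hypothetical event labels for the conditions named in the text.
    CLOCK_NEAR_512_KHZ = "clock_near_512_khz"
    NO_CLOCK = "no_clock"
    ACOUSTIC_EVENT = "acoustic_event"
    CLOCK_ABOVE_1_MHZ = "clock_above_1_mhz"
    ACTIVITY_TIMEOUT = "activity_timeout"


TRANSITIONS: Dict[Tuple[State, Event], State] = {
    (State.NORMAL, Event.CLOCK_NEAR_512_KHZ): State.SENSING_EXTERNAL_CLOCK,
    (State.SENSING_EXTERNAL_CLOCK, Event.NO_CLOCK): State.SENSING_INTERNAL_CLOCK,
    (State.SENSING_INTERNAL_CLOCK, Event.ACOUSTIC_EVENT): State.SENSING_WITH_OUTPUT,
    (State.SENSING_WITH_OUTPUT, Event.CLOCK_ABOVE_1_MHZ): State.NORMAL,
    (State.SENSING_WITH_OUTPUT, Event.ACTIVITY_TIMEOUT): State.SENSING_EXTERNAL_CLOCK,
}


def next_state(state: State, event: Event) -> State:
    # Unlisted pairs leave the state unchanged (an assumption of this sketch).
    return TRANSITIONS.get((state, event), state)
```

Keeping the transitions in a table rather than in nested conditionals makes it straightforward to add the further events of the state diagram.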
It will be appreciated that the other events specified in
Referring now to
It will be appreciated that the clocking module 600 may be the clock detector module 104 of
The clock detect block 602 receives the external clock and calculates a division ratio 620 and a decimation factor 622 as described below. The internal clock 604 provides a high frequency signal while the external clock 610 provides a lower frequency signal. The programmable divider 606 reduces the frequency of the internal clock 604. The decimator 608 converts 1 bit PDM data to PCM data with a frequency determined by the decimation factor. The decimator 608 may include one or more filters.
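To illustrate the conversion of 1-bit PDM data to PCM data, the following sketch averages each block of PDM bits into one signed sample. It uses a simple boxcar (moving-sum) filter chosen for brevity; practical decimators typically use more elaborate filter stages, and nothing here is asserted to be the filter structure of the decimator 608. The function name `decimate_pdm` and the 16-bit full-scale value are assumptions.

```python
from typing import List, Sequence


def decimate_pdm(bits: Sequence[int], factor: int, full_scale: int = 32767) -> List[int]:
    """Reduce a 1-bit PDM stream to PCM, one output sample per `factor` input bits."""
    if factor <= 0:
        raise ValueError("decimation factor must be positive")
    pcm: List[int] = []
    # Only complete blocks are converted; a trailing partial block is ignored.
    for start in range(0, len(bits) - factor + 1, factor):
        ones = sum(bits[start:start + factor])
        # Map the density of ones (0..factor) onto the range -1.0..1.0.
        level = (2 * ones - factor) / factor
        pcm.append(round(level * full_scale))
    return pcm
```

With a decimation factor of 32, a PDM stream clocked at 512 kHz yields PCM samples at 16 kHz.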
The charge pump 614 provides voltage for the microphone 613. The microphone 613 may be a MEMS sensor, a piezoelectric sensor, or any other type of sensing device. The sigma delta converter 612 converts the analog signal from the microphone 613 into a digital signal for use by the decimator 608.
In one example of the operation of the clocking module 600, the internal clock 604 provides a 12.288 MHz internal clock signal. The clock detect block 602 in one aspect contains a counter that counts internal clock pulses. When a signal from the external clock 610 is applied to the clock detect block 602, the counter will count how many internal clock pulses occur within one external clock pulse. The internal clock 604 must have a higher frequency than the external clock 610. In this example, the external clock 610 is a 512 kHz clock and is applied to the external clock pin of the clocking module 600.
The clock detect block 602 now counts how many internal clock pulses there are within one external clock cycle. In this case, 12,288,000/512,000=24 clocks. Once it is confirmed that the divide down ratio is, in fact, 24, the programmable divider 606 is programmed with the number 24. At this point, the internal clock signal is now 512,000 Hz. This internal clock signal as modified by the programmable divider 606 will clock the decimator 608.
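The counting step can be modeled in software as shown below. This is an idealized sketch that assumes jitter-free clocks whose edges are aligned at time zero; the function name `count_internal_pulses` is hypothetical.

```python
def count_internal_pulses(internal_hz: int, external_hz: int) -> int:
    """Count internal clock edges that fall within one external clock period."""
    if external_hz <= 0 or internal_hz <= external_hz:
        raise ValueError("internal clock must be faster than the external clock")
    count = 0
    # Edge k of the internal clock occurs at k / internal_hz. It lies inside
    # the external period while k / internal_hz < 1 / external_hz, which is
    # the same as k * external_hz < internal_hz (integer arithmetic only).
    while count * external_hz < internal_hz:
        count += 1
    return count


division_ratio = count_internal_pulses(12_288_000, 512_000)
```

For the frequencies of this example, `division_ratio` evaluates to 24, the value programmed into the programmable divider 606.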
The desired output data rate (the predetermined desired sampling frequency) determines the next step. To take one example, 16 kHz data at 16 bits is needed to feed the next stage of the system at the buffer 616 (however, it will be appreciated that this could be any other frequency and bit length).
The clock detect block 602 takes the internal clock signal as modified by the programmable divider 606 and the predetermined desired sampling frequency to determine the decimation factor (ratio) 622 of the decimator 608. In one example, a 16,000 Hz sample rate is required, and the clock detect block 602 will divide 512,000 by 16,000 to get a decimation factor of 32.
The clock detect block 602 programs the decimator 608 with a 32× decimation factor (ratio) 622 and adjusts filters within the decimator 608 to provide data at a 16 kHz rate.
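The arithmetic of the foregoing example may be gathered into a single routine, sketched below. The names `ClockPlan` and `plan_clocking` are hypothetical, and the sketch handles only frequencies that divide evenly, which is an assumption made to keep the example short.

```python
from typing import NamedTuple


class ClockPlan(NamedTuple):
    division_ratio: int
    divided_clock_hz: int
    decimation_factor: int


def plan_clocking(internal_hz: int, external_hz: int, sample_rate_hz: int) -> ClockPlan:
    """Derive the division ratio and decimation factor from the three frequencies."""
    if internal_hz <= external_hz:
        raise ValueError("internal clock must be faster than the external clock")
    if internal_hz % external_hz:
        raise ValueError("this sketch handles integer division ratios only")
    # Division ratio: internal clock pulses per external clock cycle.
    division_ratio = internal_hz // external_hz
    # Frequency of the internal clock after the programmable divider.
    divided_clock_hz = internal_hz // division_ratio
    if divided_clock_hz % sample_rate_hz:
        raise ValueError("this sketch handles integer decimation factors only")
    # Decimation factor: divided clock rate over the desired sampling frequency.
    decimation_factor = divided_clock_hz // sample_rate_hz
    return ClockPlan(division_ratio, divided_clock_hz, decimation_factor)


plan = plan_clocking(12_288_000, 512_000, 16_000)
```

Here `plan` holds a division ratio of 24, a divided clock of 512,000 Hz, and a decimation factor of 32, matching the values given in the example above.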
Preferred embodiments of this invention are described herein, including the best mode known to the inventors for carrying out the invention. It should be understood that the illustrated embodiments are exemplary only, and should not be taken as limiting the scope of the invention.
This patent claims benefit under 35 U.S.C. §119 (e) to U.S. Provisional Application No. 61/901,832 entitled “Microphone and Corresponding Digital Interface” filed Nov. 8, 2013, the content of which is incorporated herein by reference in its entirety. This patent is a continuation-in-part of U.S. application Ser. No. 14/282,101 entitled “VAD Detection Microphone and Method of Operating the Same” filed May 20, 2014, which claims priority to U.S. Provisional Application No. 61/826,587 entitled “VAD Detection Microphone and Method of Operating the Same” filed May 23, 2013, the contents of both of which are incorporated herein by reference in their entireties.
Number | Date | Country |
---|---|---|
20150055803 A1 | Feb 2015 | US |
Number | Date | Country |
---|---|---|
61901832 | Nov 2013 | US |
61826587 | May 2013 | US |
Relation | Number | Date | Country |
---|---|---|---|
Parent | 14282101 | May 2014 | US |
Child | 14533690 | | US |