The present examples relate to processing audio and acoustic signals, and particularly to an earphone or earphones, ear-coupling devices, and mobile devices for enhancing voice communication and situation awareness in a user environment.
Earphones are increasingly used by professionals and consumers for voice communications and music listening. For both uses, the benefits include enhanced audio signal integrity, increased intelligibility of voice communications and voice messages, and improved music quality. One disadvantage of earphones is that the user is sonically impaired with respect to their environment. If the user, while listening to music, wishes to interact with people in their immediate physical environment, they are generally required to manually stop playback of audio content on the mobile device and remove the earphone(s). Such a manual action may be difficult to perform in a timely manner, resulting in missed sound or conversational information.
A need thus exists to enhance awareness of audible sounds in the environment with respect to the sonically impaired user.
The following description of at least one exemplary embodiment is merely illustrative in nature and is in no way intended to limit the embodiments herein, its application, or uses. Similar reference numerals and letters refer to similar items in the following figures, and thus once an item is defined in one figure, it may not be discussed for following figures.
Herein provided is a method and system for automatically detecting when an earphone user and a second nearby user are engaged in a voice conversation, and activating a sound pass-through system such that the nearby voice source is automatically heard by the user of the sound isolating earphone.
For such purpose, the earphone 100 includes a processor 103 (e.g., a digital signal processor, DSP) operatively coupled to one or more ambient sound microphones (101/102), an amplifier 104, and an ear canal receiver 105. A further illustration of the earphone 100, which in some embodiments includes other components providing additional features for sound isolation, ambient sound pass-through, audio mixing, and audio content reproduction and filtering, is shown and described in greater detail in
The earphone 100 may be worn by "the user" for enhancing situation awareness, which can include aspects of audibly blocking environmental sounds, and/or operating in conjunction with audio content delivery where the earphone is used for voice communication and/or music listening. The two ambient sound microphone (101/102) signals are directed to the processor 103, which can reside on the earphone 100 or on a mobile device as shown and described ahead. The processor 103 can also receive an incoming audio signal 107, e.g., a voice or music signal from a mobile device. Based on an analysis of the at least two ambient microphone (101/102) signals, at least one of the ambient microphone signals is mixed with the incoming audio signal 107 and directed to the ear canal receiver (ECR) 105 (i.e., a loudspeaker in the sound isolating earphone that audibly delivers the signal to the user's eardrum). The audio signal can be adjusted by way of the amplifier 104 to increase its signal-to-noise ratio prior to playout on the ECR 105. In some embodiments, the at least two ambient microphone signals are from ambient microphones (101/102) in the same earphone 100; in some other embodiments, one ambient microphone 101 is in a first earphone and the second ambient microphone 102 is in a second earphone (i.e., the second earphone is worn in the other ear of the earphone user). In yet some other embodiments, as further discussed with respect to
The method 180 can begin in a state where a user is wearing the earphone 100, listening to audio content 181 (AC), and in an environment filled with ambient sounds 182. As an example, the audio content 181 may be music or other media delivered to the earphone 100 by way of a communicatively coupled mobile device or other media source. The ambient sounds 182 may be environmental sounds or other sounds in proximity to the user (e.g., traffic, noise, silence, other people talking, etc.). The processor 103, through analysis of the ambient sounds 182 captured by the ambient sound microphones (ASMs) 101/102 and of the audio content 181, can automatically adjust the gains of these signals and mix them together to produce a combined audio signal 191 that is delivered to the ear canal.
As illustrated, the incoming audio signal 181 is adjusted by way of the first stage 185 with the incoming audio gain 183 to produce the modified audio signal 186. The incoming audio gain 183 regulates the level of the audio content (AC) that is delivered to the ear canal. The ambient microphone signal 182 is separately adjusted by way of the second stage 187 with the ambient sound pass-through gain 184 to produce the modified ambient signal 188. The ambient sound gain 184 regulates how much of the ambient sound from the external environment is passed on to the user's ear canal. The two gain coefficients for the AC 181 and ASM 182 signals are generated according to a "Close Voice Activity Detection" system described ahead. In some embodiments, the gains may be frequency dependent. The modified audio signal 186 and the modified ambient signal 188 are combined at the summer 190 and delivered as an output signal 191 to the ear canal receiver.
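By way of illustration, this two-stage mix can be expressed compactly as below; this is a minimal sketch assuming single-channel sample buffers and scalar (frequency-independent) gains, with variable names chosen to mirror the reference numerals rather than taken from the source:

```python
import numpy as np

def mix_to_ear_canal(audio_content, ambient_mic, g_ac, g_asm):
    """Two-stage gain mix of signal flow 181-191 (illustrative sketch)."""
    modified_audio = g_ac * audio_content      # first stage 185 -> signal 186
    modified_ambient = g_asm * ambient_mic     # second stage 187 -> signal 188
    return modified_audio + modified_ambient   # summer 190 -> output signal 191
```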
The earphone 100 monitors the level and frequency content of the audio content to which the user is listening, as well as the level and frequency content of the ambient sounds in the user's environment. Accordingly, each stage gain (e.g., first stage 185, second stage 187) can be adjusted dependent on frequency, or adjusted automatically. For example, the processor 103 can selectively filter (amplify/attenuate) audio content 181 and ambient sound 182 frequencies depending on similarity, context, and/or rating. Each stage gain is generated/adjusted according to "close voice activity detection"; that is, the gains 183/184 are individually adjusted depending on user/environment context for enhancing situation awareness, for instance, when a second person is talking in close proximity and directing conversation to the user. That is, with "close voice activity detection" enabled, the user wearing the earphone 100 and listening to audio content is made aware that a person is speaking to them.
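Where the gains are frequency dependent, one plausible realization (an assumption for illustration, not a detail given here) applies per-band gains in the frequency domain, for example ducking only the speech band of the audio content so that a talker passed through from the ambient microphones remains intelligible:

```python
import numpy as np

def apply_band_gains(frame, band_gains, sample_rate):
    """Apply frequency-dependent gains to one signal frame via FFT (illustrative).

    band_gains: iterable of (low_hz, high_hz, gain) triples; bands not
    listed pass through at unity gain.
    """
    spectrum = np.fft.rfft(frame)
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sample_rate)
    for low_hz, high_hz, gain in band_gains:
        spectrum[(freqs >= low_hz) & (freqs < high_hz)] *= gain
    return np.fft.irfft(spectrum, n=len(frame))

# e.g., attenuate the audio content in the 300-3400 Hz speech band
# ac_frame = apply_band_gains(ac_frame, [(300.0, 3400.0, 0.25)], 48000)
```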
In some embodiments, as represented by additional devices 150 and 160 in
The method 250 can start in a state 251 as depicted in
When user voice activity ceases, a user voice activity timer is started prior to step 255, at which time a front voice activity detector is invoked. The front voice activity detector, practiced by the processor 103, determines whether there is voice activity from a second individual "close" to the earphone user and at a determined relative direction to the earphone user, in order to assess whether that second individual is speaking and engaged in conversation with the user. In some embodiments, "close" is defined as being within approximately 2 meters, and the determined relative direction is defined as being within approximately plus/minus 45 degrees of the direction that the earphone user is facing (see also 157 in
If front voice activity from the second individual is detected at step 255, then the gain of the incoming audio signal is maintained (or decreased) and the ambient sound pass-through gain is maintained (or increased) at step 256. If, however, voice activity from a second close individual is NOT detected at step 255, the ambient sound pass-through gain is decreased at step 258 and the gain of the incoming audio signal is increased at step 259. The method then continues back to step 252 to monitor for user voice activity.
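The text specifies the distance (~2 meters) and angle (plus/minus 45 degrees) criteria but not the detector itself; below is a plausible sketch, assuming two ambient microphones with known spacing, a cross-correlation delay estimate for direction, and signal level as a crude proxy for proximity (all thresholds illustrative):

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s

def front_voice_detected(mic1, mic2, mic_spacing_m, sample_rate,
                         max_angle_deg=45.0, level_threshold=0.01):
    """Return True when a sufficiently loud source lies within the frontal cone."""
    if np.sqrt(np.mean(mic1 ** 2)) < level_threshold:
        return False                                 # too quiet to be a close talker
    corr = np.correlate(mic1, mic2, mode="full")     # inter-mic cross-correlation
    lag = np.argmax(corr) - (len(mic2) - 1)          # delay in samples
    sin_theta = np.clip((lag / sample_rate) * SPEED_OF_SOUND / mic_spacing_m,
                        -1.0, 1.0)                   # clip to physical range
    return abs(np.degrees(np.arcsin(sin_theta))) <= max_angle_deg
```

In practice such a detector would also gate on voice-like characteristics, consistent with the spectra, timing, and onset cues mentioned below.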
When the user ceases to speak, or if no voice activity is detected, the voice activity timer is referenced at step 257 to determine whether there was any recent user voice activity. For example, if no user voice is detected at step 252, there are two timer scenarios. In the first scenario, when user voice activity is not detected and the user voice activity timer is below a determined threshold (in some embodiments, approximately 10 seconds), it is determined at step 257 that there was recent user voice activity (or pending voice activity) and the method proceeds to step 255. The voice activity threshold can be based on voice characteristics (e.g., spectra, phase, timing, onsets, etc.) in addition to sound pressure level. In the second scenario, when user voice activity is not detected at step 252 and the voice activity timer is above the determined threshold, no recent user voice activity exists at step 257; the ambient sound pass-through gain is then decreased at step 258 and the gain of the incoming audio signal is increased at step 259. The method then continues back to step 252 to monitor for user voice activity.
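Putting steps 252 through 259 together, the decision loop can be sketched as follows; the user_vad and front_vad detector callables and the concrete gain values are assumptions for illustration, not values given by the source:

```python
import time
from dataclasses import dataclass

HANGOVER_S = 10.0  # the determined threshold (~10 seconds in some embodiments)

@dataclass
class Gains:
    audio_content: float = 1.0   # incoming audio gain 183
    pass_through: float = 0.1    # ambient sound pass-through gain 184

class CloseVoiceMixer:
    def __init__(self, user_vad, front_vad):
        self.user_vad = user_vad            # True while the user is talking
        self.front_vad = front_vad          # True on close frontal voice
        self.last_user_voice = float("-inf")

    def step(self, gains: Gains) -> None:
        now = time.monotonic()
        if self.user_vad():                              # step 252: user talking
            self.last_user_voice = now                   # (re)arm the timer
            gains.audio_content = 0.25                   # duck the audio content
            gains.pass_through = 1.0                     # open the pass-through
        elif now - self.last_user_voice < HANGOVER_S:    # step 257: recent talk
            if not self.front_vad():                     # step 255: no 2nd talker
                gains.pass_through = 0.1                 # step 258
                gains.audio_content = 1.0                # step 259
            # else step 256: maintain the conversational gains
        else:                                            # no recent user voice
            gains.pass_through = 0.1                     # step 258
            gains.audio_content = 1.0                    # step 259
```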
The earphone 100, by way of the method 250, can detect frontal and/or side voice activity from another individual in proximity to the user, and can adjust the mixing gain of the audio content and the ambient sound pass-through of the earphone based on a combined voice activity detection level together with combined voice activity time expirations of the user and the second individual. The method 250, as previously described for automatically enhancing situational awareness and improving two-way conversation, includes decreasing the audio content signal and increasing the ambient sound pass-through when voice activity of the user is detected; starting a voice activity timer when voice activity ceases; decreasing the ambient sound pass-through and increasing the audio content signal when frontal voice activity by the other individual is not detected; monitoring the voice activity timer for combined voice activity of the user and frontal voice activity from the other individual; and maintaining the ambient sound pass-through level and audio content signal level during the combined voice activity. The voice activity timer bridges gaps between voice activity of the user and frontal voice activity of the other individual, up to a time length that is a function of the combined voice activity.
Further note that additional devices worn by the user or otherwise working cooperatively with the earphones can use their additional microphones to assist in or further refine the lean-in functionality described above to automatically enhance situational awareness and improve multiparty communication. As shown in
The method 280 can begin in a state where a user has the earphone 100 on and is listening to music, and a second individual initiates conversation with the user. This scenario is also depicted in
The method 280 by way of the processor 103 also distinguishes between a first spoken voice of a user wearing the earphone and a second spoken voice of another individual in proximity to the user. It determines a direction and proximity of the individual with respect to the user wearing the earphone (See
The step of adjusting the ambient sound pass-through includes increasing the gain of the ambient sound from the ambient sound microphone delivered by the internal speaker when voice activity is detected above a threshold, and/or decreasing that gain when voice activity is detected below a threshold. The step of adjusting the mixing gain of an audio content signal includes decreasing a volume of the audio content signal delivered to the internal speaker when voice activity is detected above a threshold, and/or increasing the volume when voice activity is detected below a threshold. In one arrangement, the adjusting of the ambient sound pass-through increases the signal-to-noise ratio of the ambient sound with respect to background noise.
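A minimal sketch of these two adjustment rules, with the incremental step size and clamping range as illustrative assumptions (stepping the gains gradually rather than switching them also keeps the transition smooth):

```python
def adjust_gains(vad_level, threshold, g_pass, g_ac, step=0.1):
    """Return updated (pass-through, audio-content) gains for one frame."""
    if vad_level > threshold:
        g_pass = min(1.0, g_pass + step)   # raise ambient pass-through
        g_ac = max(0.0, g_ac - step)       # duck the audio content
    else:
        g_pass = max(0.0, g_pass - step)   # close the pass-through
        g_ac = min(1.0, g_ac + step)       # restore the audio content
    return g_pass, g_ac
```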
In the configuration shown, the first 141 and second 142 microphones are mechanically mounted to one side of the eyeglasses to provide audio signal streams. These can serve as ambient sound microphones analogous to those in the earphone 100. Again, the embodiment 140 can be configured for an individual side (left or right) or can include an additional pair of microphones on the second side in addition to the first. The eyeglasses 140 can also include one or more optical elements, for example, cameras 143 and 144 situated at the front or facing another direction for taking pictures. Similarly, the mobile device 150 (see
As previously noted in the description of the previous figures, the processor 103 performing the close proximity detection and audio mixing can be included, for example, within a digital signal processor or other software-programmable device within, or coupled to, the media device 150 or 160. As discussed above, components of the media device implement multiplexing and de-multiplexing of the separate audio signal streams to produce a composite signal.
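The multiplexing scheme itself is not detailed; one simple possibility, assumed here purely for illustration, is sample interleaving of equal-length microphone streams into a single composite buffer:

```python
import numpy as np

def multiplex(streams):
    """Interleave equal-length mono streams into one composite signal."""
    return np.stack(streams, axis=1).reshape(-1)

def demultiplex(composite, n_streams):
    """Recover the separate streams from the composite signal."""
    return [composite[i::n_streams] for i in range(n_streams)]
```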
With respect to the previous figures, the system 300 (see
It should also be noted that the computing devices shown can include any device having audio processing capability for collecting, mining, and processing audio signals, or signals within the audio bandwidth (10 Hz to 20 kHz). Computing devices may provide specific functions, such as heart rate monitoring (low frequency; 10-100 Hz) or pedometer capability (<20 Hz), to name a few. More advanced computing devices may provide multiple and/or more advanced audio processing functions, for instance, to continuously convey heart signals (low-frequency sounds) or other continuous biometric data (sensor signals). As an example, advanced "smart" functions and features similar to those provided on smartphones, smartwatches, optical head-mounted displays, or helmet-mounted displays can be included therein. Example functions of computing devices providing audio content may include, without being limited to, capturing images and/or video, displaying images and/or video, presenting audio signals, presenting text messages and/or emails, identifying voice commands from a user, browsing the web, etc. Aspects of voice control included herein are disclosed in U.S. patent application Ser. No. 13/134,222, filed on 19 Dec. 2013, entitled "Method and Device for Voice Operated Control", with a common author; the entire contents of that application, and of its priority reference parent applications, are hereby incorporated by reference in their entirety.
The earpiece includes an Ambient Sound Microphone (ASM) 420 to capture ambient sound, an Ear Canal Receiver (ECR) 414 to deliver audio to an ear canal 424, and an Ear Canal Microphone (ECM) 406 to capture and assess the sound exposure level within the ear canal 424. The earpiece can partially or fully occlude the ear canal 424 to provide various degrees of acoustic isolation. In at least one exemplary embodiment, the assembly is designed to be inserted into the user's ear canal 424 and to form an acoustic seal with the walls of the ear canal 424 at a location between the entrance to the ear canal 424 and the tympanic membrane (or eardrum). In general, such a seal is typically achieved by means of a soft and compliant housing of sealing unit 408. In the embodiments including such a seal, the earphone provides sound isolation from ambient sounds external to the ear canal of the user in which the earphone is inserted and from acoustic sounds internal to the ear canal of the user.
Sealing unit 408 is an acoustic barrier having a first side corresponding to the ear canal 424 and a second side corresponding to the ambient environment. In at least one exemplary embodiment, sealing unit 408 includes an ear canal microphone tube 410 and an ear canal receiver tube 412. Sealing unit 408 creates a closed cavity of approximately 5 cc between the first side of sealing unit 408 and the tympanic membrane in ear canal 424. As a result of this sealing, the ECR (speaker) 414 is able to generate a full-range bass response when reproducing sounds for the user. This seal also serves to significantly reduce the sound pressure level at the user's eardrum resulting from the sound field at the entrance to the ear canal 424. This seal is also a basis for the sound isolating performance of the electro-acoustic assembly.
In at least one exemplary embodiment and in broader context, the second side of sealing unit 408 corresponds to the earpiece, electronic housing unit 400, and ambient sound microphone 420 that is exposed to the ambient environment. Ambient sound microphone 420 receives ambient sound from the ambient environment around the user.
Electronic housing unit 400 houses system components such as a microprocessor 416, memory 404, battery 402, ECM 406, ASM 420, ECR 414, and user interface 422. Microprocessor 416 can be a logic circuit, a digital signal processor, a controller, or the like for performing calculations and operations for the earpiece. Microprocessor 416 is operatively coupled to memory 404, ECM 406, ASM 420, ECR 414, and user interface 422. A wire 418 provides an external connection to the earpiece. Battery 402 powers the circuits and transducers of the earpiece. Battery 402 can be a rechargeable or replaceable battery.
In at least one exemplary embodiment, electronic housing unit 400 is adjacent to sealing unit 408. Openings in electronic housing unit 400 receive ECM tube 410 and ECR tube 412 to respectively couple to ECM 406 and ECR 414. ECR tube 412 and ECM tube 410 acoustically couple signals to and from ear canal 424. For example, ECR 414 outputs an acoustic signal through ECR tube 412 and into ear canal 424, where it is received by the tympanic membrane of the user of the earpiece. Conversely, ECM 406 receives an acoustic signal present in ear canal 424 through ECM tube 410. All transducers shown can receive or transmit audio signals to the processor 416, which undertakes audio signal processing and provides a transceiver for audio via the wired (wire 418) or a wireless communication path. Again, this represents only some of the embodiments herein; other embodiments do not contemplate sealing of the ear canal, but rather partial occlusion or no occlusion.
As illustrated, the device 500 comprises a wired and/or wireless transceiver 552, a user interface (UI) display 554, a memory 556, a location unit 558, and a processor 560 for managing operations thereof. The media device 500 can be any intelligent processing platform with digital signal processing capabilities, an application processor, data storage, a display, an input modality such as a touch-screen or keypad, microphones, a speaker 566, Bluetooth, and a connection to the Internet via WAN, Wi-Fi, Ethernet, or USB. This encompasses custom hardware devices, smartphones, cell phones, mobile devices, iPad- and iPod-like devices, laptops, notebooks, tablets, or any other type of portable and mobile communication or computing device. Other devices or systems, such as a desktop, an automobile electronic dashboard, a computational monitor, or communications control equipment, are also contemplated herein for implementing the methods described. A power supply 562 provides energy for the electronic components.
In one embodiment where the media device 500 operates in a landline environment, the transceiver 552 can utilize common wire-line access technology to support POTS or VoIP services. In a wireless communications setting, the transceiver 552 can utilize common technologies to support singly or in combination any number of wireless access technologies including without limitation Bluetooth™, Wireless Fidelity (WiFi), Worldwide Interoperability for Microwave Access (WiMAX), Ultra Wide Band (UWB), software defined radio (SDR), and cellular access technologies such as CDMA-1X, W-CDMA/HSDPA, GSM/GPRS, EDGE, TDMA/EDGE, and EVDO. SDR can be utilized for accessing a public or private communication spectrum according to any number of communication protocols that can be dynamically downloaded over-the-air to the communication device. It should be noted also that next generation wireless access technologies can be applied to the present disclosure.
The power supply 562 can utilize common power management technologies such as power from USB, replaceable batteries, supply regulation technologies, and charging system technologies for supplying energy to the components of the communication device and to facilitate portable applications. In stationary applications, the power supply 562 can be modified so as to extract energy from a common wall outlet and thereby supply DC power to the components of the communication device 500.
The location unit 558 can utilize common technology such as a GPS (Global Positioning System) receiver that can intercept satellite signals and therefrom determine a location fix of the portable device 500. The processor 560 can utilize computing technologies such as a microprocessor and/or digital signal processor (DSP) with associated storage memory such as Flash, ROM, RAM, SRAM, DRAM, or other like technologies for controlling operations of the aforementioned components of the communication device.
This disclosure is intended to cover any and all adaptations or variations of various embodiments. Combinations of the above embodiments, and other embodiments not specifically described herein, will be apparent to those of skill in the art upon reviewing the above description.
These are but a few examples of embodiments and modifications that can be applied to the present disclosure without departing from the scope of the claims stated below. Accordingly, the reader is directed to the claims section for a fuller understanding of the breadth and scope of the present disclosure.
While the present embodiments have been described with reference to exemplary embodiments, it is to be understood that the possible embodiments are not limited to the disclosed exemplary embodiments. The scope of the following claims is to be accorded the broadest interpretation so as to encompass all modifications, equivalent structures and functions of the relevant exemplary embodiments. Thus, the description of the embodiments is merely exemplary in nature and, thus, variations that do not depart from the gist of the embodiments are intended to be within the scope of the exemplary embodiments of the present invention. Such variations are not to be regarded as a departure from the spirit and scope of the claimed embodiments.
This Application is a utility patent application that claims the priority benefit of U.S. Provisional Patent Application No. 61/778,737 filed on Mar. 13, 2013, the entire disclosure and content of which is incorporated herein by reference in its entirety.