Telephones, computers, and other electronic systems often have more than one audio input device. Each input device may or may not be paired with an audio output device such as a speaker. In the case of a telephone, the input and output devices may be paired in a handset, headset, or wireless headset.
Where multiple input devices are available, the system typically provides a mechanism to select which device to use. This may be a manual selection by the user or a predetermined selection based on a configuration choice.
While these selections may often be correct, they may also be incorrect. As an example, a telephone user who is wearing a wireless headset answers an incoming call by pressing a button on the telephone base unit out of habit. This action is configured to route the audio through the speaker and microphone on the base even though the wireless headset would provide superior sound quality.
Similarly, a computer equipped with a webcam may have an auxiliary microphone plugged into an input jack. While setting up for an online meeting, the user selects the auxiliary microphone as the input device. However, when the meeting starts, the user leaves that microphone lying on the table and speaks into the microphone adjacent to the camera attached to the computer screen.
The user's experience would be improved by a process that selects the input device providing the best sound quality. Selectively disabling input and output devices would also save power, especially where the devices, or perhaps the entire system, are battery powered.
This Summary is provided to introduce in a simplified form a selection of concepts that are further described below in the Detailed Description. This Summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter.
Various aspects of the subject matter disclosed herein are related to selecting one of several audio input devices such as microphones to be used by a system. The selection is based on superior relative performance as determined by comparing peak variations in sound level above the background sound level.
Other aspects relate to applying a threshold level to all peak variation values and considering only those which exceed the threshold value.
The approach described below may be implemented as a computer process, a computing system or as an article of manufacture such as a computer program product. The computer program product may be a computer storage medium readable by a computer system and encoding a computer program of instructions for executing a computer process. The computer program product may also be a propagated signal on a carrier readable by a computing system and encoding a computer program of instructions for executing a computer process.
A more complete appreciation of the above summary can be obtained by reference to the accompanying drawings, which are briefly summarized below, to the following detailed description of present embodiments, and to the appended claims.
This detailed description is made with reference to the accompanying drawings, which form a part hereof, and which show, by way of illustration, specific exemplary embodiments. These embodiments are described in sufficient detail to enable those skilled in the art to practice what is taught below, and it is to be understood that other embodiments may be utilized and that logical, mechanical, electrical, and other changes may be made without departing from the spirit or scope of the subject matter. The following detailed description is, therefore, not to be taken in a limiting sense, and its scope is defined only by the appended claims.
The concepts of the present invention pertain to automatically selecting an audio input device, and optionally an associated audio output device, by sampling and comparing the input on all available input devices to determine which one provides the best reception of the user. A first exemplary embodiment is a cellular telephone having a built-in microphone and a wireless Bluetooth® headset. A second exemplary embodiment is a hands-free telephone or intercom system in a building which has multiple microphones. A third exemplary embodiment is a computer-based video conferencing system having multiple microphones. The concepts are also applicable to other systems having more than one available audio input device.
Benefits of the system include an improved user experience, obtained by using the device with the best audio quality, and reduced power consumption and noise, obtained by deactivating devices that are not needed.
The concepts of the present disclosure apply in substantially the same manner to systems of the type shown in either
Upon system activation, all available microphones are activated 602. In an exemplary system such as illustrated in
During the sampling period, the amplitude of the audio input signal is sensed, accumulating data such as that illustrated graphically in
Dashed lines 302, 402, and 502 represent a threshold value used to evaluate the sample data. An exemplary embodiment uses the threshold as an additional criterion in selecting the input device. The model underlying the present disclosure is that where a microphone is capturing spoken audio from a user 100 in close proximity, that audio input will show significant power deviations above the background noise, similar to the data shown in
The threshold value may be a single fixed level, as illustrated, or may be an incremental value above the measured background noise. Both approaches give similar results where a single background value is used. Where separate background levels are used for each input device, separate thresholds, each set an incremental amount above that device's background level, may better identify the best device in situations such as a person speaking quietly in a quiet area. In that case the sample data might not reach a higher, fixed level.
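By way of illustration only, the difference between a fixed threshold and an incremental threshold can be sketched as follows. The function names, the -40 dBm fixed level, and the 10 dB margin are assumptions chosen for the example and are not values taken from this disclosure.

```python
def fixed_threshold(level_dbm=-40.0):
    """Return one absolute threshold shared by every input device (assumed value)."""
    return level_dbm


def incremental_threshold(background_dbm, margin_db=10.0):
    """Return a per-device threshold set a margin above that device's background."""
    return background_dbm + margin_db


def exceeds(peak_dbm, threshold_dbm):
    """True when the sampled peak rises above the threshold."""
    return peak_dbm > threshold_dbm


# A quiet talker in a quiet room: background at -70 dBm, speech peaks at -50 dBm.
quiet_peak, quiet_background = -50.0, -70.0
passes_fixed = exceeds(quiet_peak, fixed_threshold())
passes_incremental = exceeds(quiet_peak, incremental_threshold(quiet_background))
```

With these assumed numbers the quiet talker fails the fixed test but passes the incremental one, which is the situation the preceding paragraph describes.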
Referring again to
With the deviation values calculated, the input device having the greatest deviation above the background noise is selected 612. All other microphones are deactivated 614 and all future input is accepted from the selected microphone. If none of the sampled data meets all of the criteria, a preselected default microphone is used. If the data from more than one microphone satisfy all criteria and the values are within a preselected relative range of each other, those microphones are considered equal and a preconfigured rule is applied to select among them.
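The selection, default, and tie-handling logic described above may be sketched as shown below. This is a minimal sketch under stated assumptions: the function name, the 1 dB tie range, and the use of a priority list as the preconfigured rule are all hypothetical, and a deviation of None stands for a microphone that failed the criteria.

```python
def select_microphone(deviations_db, default, priority, tie_range_db=1.0):
    """Pick an input device from per-device deviations above background.

    deviations_db maps a device name to its deviation in dB, or to None
    when the device did not meet the selection criteria.
    """
    candidates = {name: dev for name, dev in deviations_db.items() if dev is not None}
    if not candidates:
        # Nothing met the criteria: fall back to the preselected default.
        return default
    best = max(candidates.values())
    # Devices within the tie range of the best value are treated as equal.
    tied = [name for name, dev in candidates.items() if best - dev <= tie_range_db]
    if len(tied) == 1:
        return tied[0]
    # Assumed tie-break rule: the device listed earliest in `priority` wins.
    return min(tied, key=priority.index)


order = ["headset", "handset", "base"]
clear_winner = select_microphone({"handset": 12.0, "headset": 20.0}, "base", order)
no_candidate = select_microphone({"handset": None, "headset": None}, "base", order)
near_tie = select_microphone({"handset": 19.5, "headset": 19.0, "base": 5.0}, "base", order)
```

Here the tie range is measured against the best candidate only; other definitions of "within range of each other" are equally consistent with the description.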
In an exemplary embodiment, dBm level is used as a simplification to represent the input signals. Thus the test on a single microphone A becomes:
dBm(A) > B_A + T_A
where dBm(A) is the peak input level, B_A is the background level, and T_A is the threshold level for microphone A. T_A is based on the standard deviation of the samples obtained from microphone A that were used to calculate B_A. If this test is satisfied, then microphone A is a candidate for selection. Its peak level is compared with those of all other microphones that also pass this test, and the microphone with the largest peak input is selected.
If one or more of the microphones, e.g., B, cannot be sampled, then a preselected background value W_B, which approximates white noise with no peaks, is used for B. This approach has more inherent error, so a larger threshold value T_A′ is used. In the above exemplary embodiment the test becomes:
If dBm(A) > W_A + T_A′, then select A.
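One possible reading of the two tests is sketched below for illustration. The sketch assumes that the background level is the mean of the background samples and that the threshold is a multiple k of their standard deviation; the disclosure states only that the threshold is based on the standard deviation. The constants, function names, and the value k = 2 are hypothetical.

```python
import statistics

ASSUMED_WHITE_NOISE_DBM = -65.0  # hypothetical preselected background W
ENLARGED_THRESHOLD_DB = 15.0     # hypothetical T', larger to absorb the extra error


def is_candidate(peak_dbm, background_samples_dbm=None, k=2.0):
    """Apply peak > B + T, or peak > W + T' when no background samples exist."""
    if not background_samples_dbm:
        return peak_dbm > ASSUMED_WHITE_NOISE_DBM + ENLARGED_THRESHOLD_DB
    background = statistics.mean(background_samples_dbm)       # B (assumed: mean)
    threshold = k * statistics.pstdev(background_samples_dbm)  # T (assumed: k * std dev)
    return peak_dbm > background + threshold


def select_by_peak(measurements, k=2.0):
    """Return the passing microphone with the largest peak, or None.

    measurements maps a microphone name to (peak_dbm, background_samples_or_None).
    """
    passing = {mic: peak for mic, (peak, bg) in measurements.items()
               if is_candidate(peak, bg, k)}
    return max(passing, key=passing.get) if passing else None


flat_noise = [-60, -59, -61, -60, -60, -59, -61, -60]
chosen = select_by_peak({"A": (-30.0, flat_noise), "B": (-45.0, None)})
nobody = select_by_peak({"A": (-59.0, flat_noise), "B": (-52.0, None)})
```

In the first call both microphones pass their respective tests and A is chosen for its larger peak; in the second call neither passes, so the caller would fall back to a default device.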
During the initial sampling period an exemplary embodiment will route audio output to all available output devices so that the user can hear the output no matter which device they are using. After the input device has been selected, an output device which has been predetermined to correspond to that input device will be selected and all other output devices deactivated.
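The output routing just described may be sketched as a simple lookup. The device names and the pairing table are hypothetical and serve only to illustrate the behavior before and after an input device is selected.

```python
def active_outputs(all_outputs, paired_output, selected_input=None):
    """Return the set of output devices that should be active.

    Before an input is selected (selected_input is None) every output
    stays on; afterwards only the output paired with that input remains.
    """
    if selected_input is None:
        return set(all_outputs)
    return {paired_output[selected_input]}


# Hypothetical device names for a desk telephone with a wireless headset.
outputs = ["base_speaker", "headset_earpiece"]
pairing = {"base_mic": "base_speaker", "headset_mic": "headset_earpiece"}

during_sampling = active_outputs(outputs, pairing)
after_selection = active_outputs(outputs, pairing, selected_input="headset_mic")
deactivated = set(outputs) - after_selection
```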
In the above exemplary embodiments sampling is performed during a short period at the initiation of a call. Another embodiment periodically samples the microphones while the system is not active. This allows the correct microphone to be known immediately at the start of the call or other system activation. In this context “active” is understood as the system being used for its intended purpose. While inactive, the system is still functional and capable of performing the necessary processing. Yet another embodiment periodically samples the input levels during the call or other use of the system. This allows for adapting to changes in the situation. For example, the user could start a call using speakerphone and then put on a wireless headset and walk away from the base unit. The system would detect that the headset has become a better source and switch to the headset, deactivating the speakerphone.
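The embodiment that resamples during a call may be illustrated with the following minimal sketch. It assumes that each sampling round yields one deviation value per device and that the system switches as soon as another device measures higher than the current one; the function name and the sample values are hypothetical, and the scheduling of the sampling rounds is not modeled.

```python
def track_best_input(initial, rounds):
    """Return the active input device after each periodic sampling round.

    rounds is a sequence of dicts mapping device name -> deviation in dB
    above background, one dict per sampling round.
    """
    current = initial
    history = [current]
    for deviations in rounds:
        best = max(deviations, key=deviations.get)
        # Switch only when another device measures strictly better.
        if deviations[best] > deviations.get(current, float("-inf")):
            current = best
        history.append(current)
    return history


# The user starts on speakerphone, then puts on a headset and walks away.
call = [
    {"speakerphone": 18.0, "headset": 3.0},
    {"speakerphone": 6.0, "headset": 21.0},
]
history = track_best_input("speakerphone", call)
```

With the assumed values the system stays on the speakerphone after the first round and switches to the headset after the second.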
Although the subject matter has been described in language specific to structural features and/or methodological acts, it is to be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described above. Rather, the specific features and acts described above are disclosed as example forms of implementing the claims. It will be understood by those skilled in the art that many changes in construction and widely differing embodiments and applications will suggest themselves without departing from the scope of the disclosed subject matter.