The present principles generally relate to audio systems, methods, and computer program products, and in particular to an audio system which is able to automatically and selectively adjust the sound level of one or more audio outputs of the audio system based on the determined origin and/or direction of travel of a detected human voice inside a location. Such an adjustment may be to decrease, mute, or even increase the selected sound level.
Audio systems are widely used in different locations such as at home, in a vehicle, or in a public theatre for projecting sound to an audience. They may be used as a part of, e.g., an entertainment system at home, or as a part of a radio, and/or a navigation system in a car.
US patent publication 2011/0218711, assigned to GM Global Technologies Operations, Inc. and filed in the names of Bhavna Mathur et al., describes an infotainment system for an automobile. The infotainment system includes a navigation system, an entertainment system, an audio output device, a control system, etc. The system further includes a human conversation recognizer that determines if a human conversation is being conducted. The control system of the infotainment system then lowers the output sound level of the audio output device in the event that a human conversation is being conducted.
The present inventors recognize, however, that existing audio systems such as the GM system described above do not provide effective and intelligent sound management and would need further improvements. For example, existing audio systems do not determine the location of the origin and/or the direction of the human voice or conversation, and do not selectively control one or more of the audio outputs typically found in today's multi-channel sound systems.
Accordingly, an exemplary apparatus is presented, comprising: a detector configured to detect an ambient noise in a location; one or more processors configured to determine from the detector whether the ambient noise includes a voice of a person in the location; and based on determining that the ambient noise includes the voice of the person in the location, further configured to determine an origin of the voice; and the one or more processors are further configured to enable an adjustment in a level of at least one sound output of a plurality of sound outputs of one or more audio output drivers, wherein the at least one sound output of the plurality of sound outputs being adjusted is projecting sound in a direction toward the determined origin of the voice.
In another exemplary embodiment, apparatus producing a sound output adjustment as described above may be configured to produce a sound output adjustment comprising one of decreasing and muting and increasing the sound output of the sound output projecting sound in the direction toward the determined origin of the voice.
In another exemplary embodiment, an exemplary apparatus is presented, comprising: a detector configured to detect an ambient noise in a location; one or more processors configured to determine from the detector whether the ambient noise includes a voice of a person in the location; and based on determining that the ambient noise includes the voice of the person in the location, further configured to determine an origin of the voice; and the one or more processors are further configured to enable a decrease in a level of at least one sound output of a plurality of sound outputs of one or more audio output drivers, wherein the at least one sound output of the plurality of sound outputs being decreased is projecting sound in a direction toward the determined origin of the voice.
In another exemplary embodiment, a method performed by an apparatus is presented, comprising: detecting, via a detector, an ambient noise in a location; determining from the detector, via one or more processors, whether the ambient noise includes a voice of a person in the location; if the ambient noise includes the voice of the person in the location based on the determining, determining an origin of the voice; and enabling an adjustment in a level of at least one sound output of a plurality of sound outputs of one or more audio output drivers of the apparatus, wherein the at least one sound output of the plurality of sound outputs being adjusted is projecting sound in a direction toward the determined origin of the voice.
In another exemplary embodiment, a method producing a sound output adjustment as described above comprises enabling an adjustment comprising one of decreasing and muting and increasing the sound output projecting sound in a direction toward the determined origin of the voice.
In another exemplary embodiment, a method performed by an apparatus is presented, comprising: detecting, via a detector, an ambient noise in a location; determining from the detector, via one or more processors, whether the ambient noise includes a voice of a person in the location; if the ambient noise includes the voice of the person in the location based on the determining, determining an origin of the voice; and enabling a decrease in a level of at least one sound output of a plurality of sound outputs of one or more audio output drivers of the apparatus, wherein the at least one sound output of the plurality of sound outputs being decreased is projecting sound in a direction toward the determined origin of the voice.
In another exemplary embodiment, a computer program product stored in a non-transitory computer-readable storage medium is presented, comprising computer-executable instructions for: detecting an ambient noise in a location; determining whether the ambient noise includes a voice of a person in the location; if the ambient noise includes the voice of the person in the location based on the determining, determining an origin of the voice; and enabling an adjustment in a level of at least one sound output of a plurality of sound outputs of one or more audio output drivers, wherein the at least one sound output of the plurality of sound outputs being adjusted is projecting sound in a direction toward the determined origin of the voice.
The above-mentioned and other features and advantages of the present principles, and the manner of attaining them, will become more apparent and the present invention will be better understood by reference to the following description of embodiments of the present principles taken in conjunction with the accompanying drawings, wherein:
The examples set out herein illustrate exemplary embodiments of the present principles. Such examples are not to be construed as limiting the scope of the present principles in any manner.
The present principles recognize that, e.g., human conversations in cars are often disturbed or interrupted by sounds from an audio system such as the sounds from the radio or the turn-by-turn navigation prompts from a GPS. Accordingly, the present inventors recognize that by using a detector comprising more than one microphone, the present principles may detect and determine the origin and/or the direction of a human conversation or voice in a location such as inside a car or in a home theater room. Therefore, the exemplary embodiments of the present principles may intelligently adjust one or more of the output audio levels of the multiple output channels of the audio system, in response to the detected voice. For example, a conversation between two individuals in the back seat of a car may result in the rear audio speakers being decreased in volume, while the front speakers may remain at the same audio levels. Accordingly, the present principles provide automatically adjustable and highly adaptive audio/sound systems and methods for people inside a car or in a room to more easily and clearly communicate with each other.
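The back-seat example above may be illustrated by the following minimal sketch. It is not the disclosed implementation: the zone names, the speaker labels other than LF, the mapping ZONE_SPEAKERS, and the 10 dB reduction are hypothetical values chosen only for illustration, and the sketch assumes that one voice level per zone has already been measured.

```python
# Hypothetical mapping from cabin zones to the speakers projecting into them.
ZONE_SPEAKERS = {"front": ["LF", "RF"], "rear": ["LR", "RR"]}


def loudest_zone(voice_level_by_zone):
    """Return the zone whose microphone reports the strongest voice level."""
    return max(voice_level_by_zone, key=voice_level_by_zone.get)


def adjust_levels(levels_db, voice_level_by_zone, reduction_db=10.0):
    """Lower only the speakers that project toward the zone of the voice.

    levels_db maps speaker label -> current output level in dB.
    reduction_db is an assumed, illustrative amount.
    """
    zone = loudest_zone(voice_level_by_zone)
    adjusted = dict(levels_db)  # leave the caller's dictionary untouched
    for speaker in ZONE_SPEAKERS[zone]:
        adjusted[speaker] = levels_db[speaker] - reduction_db
    return adjusted


current = {"LF": -20.0, "RF": -20.0, "LR": -20.0, "RR": -20.0}
# A conversation is picked up mainly by the rear microphone.
result = adjust_levels(current, {"front": 0.02, "rear": 0.30})
```

In this sketch the rear speakers are lowered while the front speakers keep their levels, mirroring the example in the preceding paragraph.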
Accordingly, the present description illustrates the present principles. It will thus be appreciated that those skilled in the art will be able to devise various arrangements that, although not explicitly described or shown herein, embody the present principles and are included within its spirit and scope.
All examples and conditional language recited herein are intended for pedagogical purposes to aid the reader in understanding the present principles and the concepts contributed by the inventors to furthering the art, and are to be construed as being without limitation to such specifically recited examples and conditions.
Moreover, all statements herein reciting principles, aspects, and embodiments of the present principles, as well as specific examples thereof, are intended to encompass both structural and functional equivalents thereof. Additionally, it is intended that such equivalents include both currently known equivalents as well as equivalents developed in the future, i.e., any elements developed that perform the same function, regardless of structure.
Thus, for example, it will be appreciated by those skilled in the art that the block diagrams presented herein represent conceptual views of illustrative circuitry embodying the present principles. Similarly, it will be appreciated that any flow charts, flow diagrams, state transition diagrams, pseudocode, and the like represent various processes which may be substantially represented in non-transitory computer readable media and so executed by one or more computers, and/or one or more processors, whether or not such computer(s) or processor(s) is/are explicitly shown.
The functions of the various elements shown in the figures may be provided through the use of dedicated hardware as well as hardware capable of executing software in association with appropriate software. When provided by a processor, the functions may be provided by a single dedicated processor, by a single shared processor, or by a plurality of individual processors, some of which may be shared. Moreover, explicit use of the term “processor” or “controller” should not be construed to refer exclusively to hardware capable of executing software, and may implicitly include, without limitation, digital signal processor (“DSP”) hardware, read-only memory (“ROM”) for storing software, random access memory (“RAM”), and non-volatile storage.
Other hardware, conventional and/or custom, may also be included. Similarly, any switches shown in the figures are conceptual only. Their function may be carried out through the operation of program logic, through dedicated logic, through the interaction of program control and dedicated logic, or even manually, the particular technique being selectable by the implementer as more specifically understood from the context.
In the claims hereof, any element expressed as a means for performing a specified function is intended to encompass any way of performing that function including, for example, a) a combination of circuit elements that performs that function or b) software in any form, including, therefore, firmware, microcode or the like, combined with appropriate circuitry for executing that software to perform the function. The present principles as defined by such claims reside in the fact that the functionalities provided by the various recited means are combined and brought together in the manner which the claims call for. It is thus regarded that any means that can provide those functionalities are equivalent to those shown herein.
Reference in the specification to “one embodiment”, “an embodiment”, “an exemplary embodiment” of the present principles, as well as other variations thereof, means that a particular feature, structure, characteristic, and so forth described in connection with the embodiment is included in at least one embodiment of the present principles. Thus, the appearances of the phrase “in one embodiment”, “in an embodiment”, “in an exemplary embodiment”, as well as any other variations, appearing in various places throughout the specification are not necessarily all referring to the same embodiment.
It is to be appreciated that the use of any of the following “/”, “and/or”, and “at least one of”, for example, in the cases of “A/B”, “A and/or B” and “at least one of A and B”, is intended to encompass the selection of the first listed option (A) only, or the selection of the second listed option (B) only, or the selection of both options (A and B). As a further example, in the cases of “A, B, and/or C” and “at least one of A, B, and C”, such phrasing is intended to encompass the selection of the first listed option (A) only, or the selection of the second listed option (B) only, or the selection of the third listed option (C) only, or the selection of the first and the second listed options (A and B) only, or the selection of the first and third listed options (A and C) only, or the selection of the second and third listed options (B and C) only, or the selection of all three options (A and B and C). This may be extended, as readily apparent by one of ordinary skill in this and related arts, for as many items listed.
The user interface device 120 in
The exemplary audio system 100 shown in
In addition, processor 110 shown in
Also, the exemplary audio output drivers 131-1 to 131-5 in
According to the present principles, an exemplary detector 150 is provided and is configured to detect ambient sound in an exemplary location 105 as shown in
In another exemplary embodiment, the plurality of microphones 150-1 to 150-5 may be directional microphones which have directionality of detection in order to determine where a sound is coming from. Therefore, according to an exemplary aspect of the present principles, microphones 150-1 to 150-5 are able to provide detected sounds as inputs to processor 110 for further processing in order to determine whether a voice or a conversation is detected, as well as to determine the location of the origin and the direction of travel of such a detected voice or conversation. Accordingly, by employing multiple microphones in different parts of a location as shown in
According to another exemplary aspect of the present principles, the processor 110 performs the analysis of the detected sound samples inputted from the detector 150 described above in order to determine whether the ambient noise detected by the detector 150 includes a voice of a person or a conversation of people in the location. In another non-limiting embodiment, an exemplary DSP in processor 110 may be employed to make such a determination as is well known in the art. For example, in order to determine whether human speech is present, known speech detection techniques may be used.
These techniques in speech processing may involve first detecting whether sound is present in the range of the frequencies of a typical speech using a bandpass filter or filtering. The potentially detected voice may be further processed by speech recognition types of applications that provide different compromises between latency, sensitivity, accuracy and computational cost. Voice activity detection is usually language independent. Some algorithms also provide further analyses, for example, of whether the speech is voiced, unvoiced or sustained. Therefore, by employing a known voice detection algorithm, processor 110 is able to provide the determination that a voice of a person is present in the location 105.
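The first step described above, checking whether sound is present in the frequency range of typical speech, may be sketched as follows. This is an illustrative sketch only and not the detection algorithm of the present principles: the 300 Hz to 3400 Hz band, the thresholds min_ratio and min_rms, and all function names are assumptions, and a plain discrete Fourier transform stands in for the bandpass filter.

```python
import cmath
import math

SPEECH_BAND_HZ = (300.0, 3400.0)  # assumed band; the text does not specify one


def band_energy_ratio(frame, sample_rate, band=SPEECH_BAND_HZ):
    """Fraction of the frame's spectral energy that lies inside `band`."""
    n = len(frame)
    in_band = 0.0
    total = 0.0
    for k in range(1, n // 2):  # skip DC, stop below the Nyquist bin
        freq = k * sample_rate / n
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(frame))
        power = abs(coeff) ** 2
        total += power
        if band[0] <= freq <= band[1]:
            in_band += power
    return in_band / total if total > 0.0 else 0.0


def may_contain_speech(frame, sample_rate, min_ratio=0.6, min_rms=0.01):
    """Crude first-stage check: loud enough and mostly in the speech band."""
    rms = math.sqrt(sum(x * x for x in frame) / len(frame))
    if rms < min_rms:
        return False
    return band_energy_ratio(frame, sample_rate) >= min_ratio


def tone(freq_hz, sample_rate=8000, n=256, amplitude=0.5):
    """Helper for the example: a pure sine tone of n samples."""
    return [amplitude * math.sin(2 * math.pi * freq_hz * i / sample_rate)
            for i in range(n)]


in_band_flag = may_contain_speech(tone(1000.0), 8000)   # tone inside the band
out_band_flag = may_contain_speech(tone(125.0), 8000)   # low-frequency rumble
```

A frame that passes such a check would only be a candidate; it would then be handed to a fuller voice detection stage of the kind mentioned above.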
Furthermore, processor 110 may first filter out the intended output sound from the audio source 115 in order to better analyze and determine whether the ambient sound detected by detector 150 contains a voice of a person and that the detected voice is not from the source material. As is well known in the art, such filtering may be accomplished using an echo canceller or an echo cancelling function implemented, e.g., by the DSP of processor 110. Echo cancellation involves first recognizing the originally transmitted signal that appears at the output. Once the echo is recognized, it can be removed by subtracting it from the received signal. Accordingly, the echo canceller or function also receives information from the detector 150. The originally transmitted signal is then removed from the signal received from the microphones 150-1 to 150-5 by the echo canceller or cancelling function performed by processor 110.
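The subtraction described above may be sketched with a normalized least-mean-squares (NLMS) adaptive filter, which is one commonly used way to estimate the echo; the text does not state which echo cancelling technique is used, so NLMS is an assumption here, as are the function name, the number of taps, and the step size.

```python
import random


def nlms_echo_cancel(mic, reference, taps=8, step=0.5, eps=1e-8):
    """Subtract an adaptive estimate of `reference` (the signal sent to the
    speakers) from `mic` (the signal picked up by a microphone).

    Returns the residual, i.e. what remains after the estimated echo is
    removed. All parameter values are illustrative assumptions.
    """
    weights = [0.0] * taps   # current estimate of the echo path
    history = [0.0] * taps   # most recent reference samples, newest first
    residual = []
    for m, r in zip(mic, reference):
        history = [r] + history[:-1]
        echo_estimate = sum(w * h for w, h in zip(weights, history))
        error = m - echo_estimate            # microphone minus estimated echo
        norm = sum(h * h for h in history) + eps
        weights = [w + step * error * h / norm
                   for w, h in zip(weights, history)]
        residual.append(error)
    return residual


def power(samples):
    """Mean squared value of a list of samples."""
    return sum(x * x for x in samples) / len(samples)


# Example: the microphone hears only a delayed, attenuated copy of the output.
rng = random.Random(0)
reference = [rng.uniform(-1.0, 1.0) for _ in range(2000)]
mic = [0.0, 0.0] + [0.6 * r for r in reference[:-2]]
residual = nlms_echo_cancel(mic, reference)
```

Once the filter has adapted, the residual contains little of the source material, so any voice remaining in it can be attributed to a person in the location rather than to the audio source.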
In another exemplary embodiment as shown in
Therefore, according to the present principles, once processor 110 has determined that a voice or a conversation is present, and also has determined the location of the origin and/or the direction of travel of the voice as described above, the processor 110 is able to automatically adjust the sound level of one or more audio output drivers 131-1 to 131-5 of the audio system 100 shown in
In another exemplary embodiment according to the present principles, however, the at least one sound output of the plurality of sound outputs projecting sound in the direction toward the determined origin of the voice, e.g., LF audio output driver 131-1, may be intentionally increased in order to make sure that the person 160-1 speaking does not miss the sound being outputted. This is especially useful and important when the sound is, e.g., a GPS directional instruction such as a turn instruction and/or an emergency announcement such as an amber alert, a tornado or tsunami warning, etc.
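The choice between decreasing and increasing the output directed toward the speaking person may be sketched as a simple policy. The content labels, the set PRIORITY_KINDS, and the 6 dB step are hypothetical and serve only to illustrate the idea; the text does not specify how the kind of sound is identified.

```python
# Hypothetical labels for sounds the speaking person should not miss.
PRIORITY_KINDS = {"navigation_prompt", "emergency_alert"}


def adjustment_toward_voice_db(content_kind, step_db=6.0):
    """Return the level change, in dB, for the output aimed at the voice.

    Positive values raise the level (priority content), negative values
    lower it (ordinary content such as music).
    """
    if content_kind in PRIORITY_KINDS:
        return step_db
    return -step_db


music_change = adjustment_toward_voice_db("music")
alert_change = adjustment_toward_voice_db("emergency_alert")
```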
As shown in
The exemplary process 200 in
In addition,
Similar to what has already been described in connection with
According to another exemplary embodiment of the present principles, the one or more processors of the exemplary audio system 300 of
In another embodiment, the direction of the travel of the voice 365 may be determined. As illustrated in
According to another aspect of the present principles, the adjustment to the one or more sound outputs of the exemplary audio system 300 in
Control of the sound output based on distance between a source of sound output, e.g., one of the speakers, and the origin of a voice may be combined with control based on the level of sound output from each speaker. For example, first and second speakers may be located at respective first and second distances from an origin of a voice. If the sound level adjustment comprises, for example, a decrease in sound level, both speakers are producing sound at the same or similar levels directed toward the origin of the voice, and the first distance is greater than the second distance, then a level of sound reduction at the first speaker responsive to detecting a voice may be less than a level of sound reduction at the second speaker. Another exemplary embodiment may comprise first and second speakers producing sound directed toward an origin of a voice and located at respective first and second distances from the origin of the voice, wherein the first distance is greater than the second distance, and adjusting, e.g., decreasing or muting or increasing, sound produced by the second speaker while leaving the sound output from the first speaker unchanged based on the relative levels of sound output by each speaker, e.g., a level of sound output from the first speaker being less than a first value or a first threshold level and/or a level of sound output from the second speaker being greater than a second value or second threshold level.
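The distance-dependent reduction described above may be sketched as follows. The linear rule, the 12 dB maximum reduction, and the 3 m cutoff are assumptions made for illustration; the text only requires that the more distant speaker may be reduced less than the nearer one.

```python
def reduction_db(distance_m, max_reduction_db=12.0, cutoff_m=3.0):
    """Reduction applied to a speaker at `distance_m` from the voice origin.

    The reduction shrinks linearly with distance and reaches zero at
    `cutoff_m`. Both default values are hypothetical.
    """
    if distance_m >= cutoff_m:
        return 0.0
    nearness = 1.0 - max(distance_m, 0.0) / cutoff_m
    return max_reduction_db * nearness


first_speaker = reduction_db(2.0)    # farther from the voice
second_speaker = reduction_db(0.5)   # closer to the voice
```

With these assumed values the first, more distant speaker is reduced by less than the second, nearer speaker, and a speaker beyond the cutoff is left unchanged.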
Accordingly, the present principles provide exemplary audio systems, methods and computer program products which are able to automatically and intelligently adjust, such as, e.g., decrease, mute, or even increase the sound level or levels of the one or more audio outputs of an audio system based on the determined origin and/or direction of travel of a detected human voice inside a location.
While several embodiments have been described and illustrated herein, those of ordinary skill in the art will readily envision a variety of other means and/or structures for performing the functions and/or obtaining the results and/or one or more of the advantages described herein, and each of such variations and/or modifications is deemed to be within the scope of the present embodiments. More generally, those skilled in the art will readily appreciate that all parameters, dimensions, materials, and configurations described herein are meant to be exemplary and that the actual parameters, dimensions, materials, and/or configurations will depend upon the specific application or applications for which the teachings herein is/are used. Those skilled in the art will recognize, or be able to ascertain using no more than routine experimentation, many equivalents to the specific embodiments described herein. It is, therefore, to be understood that the foregoing embodiments are presented by way of example only and that, within the scope of the appended claims and equivalents thereof, the embodiments disclosed may be practiced otherwise than as specifically described and claimed. The present embodiments are directed to each individual feature, system, article, material and/or method described herein. In addition, any combination of two or more such features, systems, articles, materials and/or methods, if such features, systems, articles, materials and/or methods are not mutually inconsistent, is included within the scope of the present embodiment.
Number | Date | Country | Kind |
---|---|---|---|
16306382.9 | Oct 2016 | EP | regional |