Users are capable of interacting with their information handling devices (“devices”), for example laptop and/or personal computers, tablet devices, smart phones, smart speakers, and the like, through a variety of different input means. For example, one popular way of interacting with a device is through voice input. More particularly, a user's device may contain one or more audio capture devices (e.g., microphones, etc.) that are capable of capturing user-provided audio. Once the audio is captured, it may thereafter be processed and utilized in a downstream process.
In summary, one aspect provides a method, comprising: capturing, using at least one sensor associated with an information handling device, environmental data in a user's location; determining, from the environmental data, an optimal microphone mode for audible input capture in the user's location; and adjusting, based on the determining, at least one microphone setting to activate the optimal microphone mode.
Another aspect provides an information handling device, comprising: at least one sensor; a processor; a memory device that stores instructions executable by the processor to: capture environmental data in a user's location; determine, from the environmental data, an optimal microphone mode for audible input capture in the user's location; and adjust, based on the determining, at least one microphone setting to activate the optimal microphone mode.
A further aspect provides a product, comprising: a storage device that stores code, the code being executable by a processor and comprising: code that captures environmental data in a user's location; code that determines, from the environmental data, an optimal microphone mode for audible input capture in the user's location; and code that adjusts, based on the determining, at least one microphone setting to activate the optimal microphone mode.
The foregoing is a summary and thus may contain simplifications, generalizations, and omissions of detail; consequently, those skilled in the art will appreciate that the summary is illustrative only and is not intended to be in any way limiting.
For a better understanding of the embodiments, together with other and further features and advantages thereof, reference is made to the following description, taken in conjunction with the accompanying drawings. The scope of the invention will be pointed out in the appended claims.
It will be readily understood that the components of the embodiments, as generally described and illustrated in the figures herein, may be arranged and designed in a wide variety of different configurations in addition to the described example embodiments. Thus, the following more detailed description of the example embodiments, as represented in the figures, is not intended to limit the scope of the embodiments, as claimed, but is merely representative of example embodiments.
Reference throughout this specification to “one embodiment” or “an embodiment” (or the like) means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment. Thus, the appearances of the phrases “in one embodiment” or “in an embodiment” or the like in various places throughout this specification are not necessarily all referring to the same embodiment.
Furthermore, the described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments. In the following description, numerous specific details are provided to give a thorough understanding of embodiments. One skilled in the relevant art will recognize, however, that the various embodiments can be practiced without one or more of the specific details, or with other methods, components, materials, et cetera. In other instances, well known structures, materials, or operations are not shown or described in detail to avoid obfuscation.
Users audibly interact with their devices in a variety of different locations. Each of these locations likely has a particular audio input context that is different from that of the other locations. More particularly, depending upon the ambient noise in the user's location, the quality of audio captured by microphones associated with the user's device may be affected. For example, a user interacting with their device in a loud environment (e.g., a restaurant, a noisy room, etc.) may find that extraneous noise is captured along with their intended user inputs.
Existing solutions enable users to manually adjust the microphone settings to narrow or expand the microphone pickup beam. For example, users situated in a loud environment may desire a narrower beam so that the ambient noise may be excluded, thereby allowing only their voice input to be registered. Alternatively, as another example, users engaged in a conference call may desire a wider beam to allow for all of the voice inputs provided by the conference attendees in the room to be adequately detected.
Currently, however, these adjustments may only be accomplished by navigating through a variety of different settings menus. For users that know that the foregoing beamforming adjustments are possible, such a requirement may be burdensome and time-consuming, especially if they do not know exactly where these settings are located. Alternatively, users that do not know that the foregoing beamforming adjustments are possible may continue to operate in a sub-optimal manner, which may affect the quality of their audio inputs.
Accordingly, an embodiment provides a method for adjusting a microphone setting to better accommodate the contextual audible input situation. In an embodiment, environmental data in a user's location may be captured by at least one sensor of a device. The environmental data may include ambient audio data, gaze direction data, occupancy data, individual position data, and the like. An embodiment may then determine, from the environmental data, an optimal microphone mode for capturing audio input while in the user's location. The optimal microphone mode may be achieved by narrowing or widening the pickup area of the microphone(s). Thereafter, an embodiment may automatically adjust, without additional user input, a microphone setting on the device to achieve the optimal microphone mode. Such a method may negate the conventional need for users to navigate through confusing microphone settings to improve the quality of audio input in varying situations.
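By way of illustration only, the capture-determine-adjust flow described above may be sketched in Python as follows. All class, function, and parameter names (e.g., MicMode, on_environment_sample, mic.apply_mode) are hypothetical, and the thresholds shown are invented placeholders rather than claimed values:

```python
# Illustrative sketch only; names and thresholds are hypothetical.
from enum import Enum
from typing import Optional

class MicMode(Enum):
    USER_ONLY = "user_only"          # narrow pickup beam
    NORMAL = "normal"                # medium-range beam (default)
    MULTIPLE_VOICES = "multi_voice"  # wide pickup beam

def determine_optimal_mode(env: dict) -> Optional[MicMode]:
    """Toy stand-in for the determination step; see the later sketches."""
    if env.get("ambient_db", 0) > 70:        # loud environment
        return MicMode.USER_ONLY
    if env.get("gazes_at_device", 0) >= 3:   # conference-like setting
        return MicMode.MULTIPLE_VOICES
    return MicMode.NORMAL

def on_environment_sample(env: dict, mic) -> None:
    """Adjust the microphone automatically, without additional user input."""
    mode = determine_optimal_mode(env)
    if mode is not None and mode != getattr(mic, "mode", None):
        mic.apply_mode(mode)  # hypothetical driver call
```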
The illustrated example embodiments will be best understood by reference to the figures. The following description is intended only by way of example, and simply illustrates certain example embodiments.
While various other circuits, circuitry or components may be utilized in information handling devices, with regard to smart phone and/or tablet circuitry 100, an example illustrated in FIG. 1 includes a system on a chip design found, for example, in tablet or other mobile computing platforms. Software and processor(s) are combined in a single chip 110, and essentially all of the peripheral devices 120 may attach to that single chip.
There are power management chip(s) 130, e.g., a battery management unit, BMU, which manage power as supplied, for example, via a rechargeable battery 140, which may be recharged by a connection to a power source (not shown). In at least one design, a single chip, such as 110, is used to supply BIOS-like functionality and DRAM memory.
System 100 typically includes one or more of a WWAN transceiver 150 and a WLAN transceiver 160 for connecting to various networks, such as telecommunications networks and wireless Internet devices, e.g., access points. Additionally, devices 120 are commonly included, e.g., an image sensor such as a camera, audio capture device such as a microphone, etc. System 100 often includes one or more touch screens 170 for data input and display/rendering. System 100 also typically includes various memory devices, for example flash memory 180 and SDRAM 190.
The example of FIG. 2 includes a so-called chipset 210 (a group of integrated circuits, or chips, that work together, chipsets) with an architecture that may vary depending on manufacturer.
In FIG. 2, the chipset 210 includes a core and memory control group and an I/O controller hub that exchange information (for example, data, signals, commands, etc.).
In FIG. 2, the I/O controller hub includes a variety of interfaces for connected devices, for example, audio capture devices such as microphones, as well as SPI Flash 266, in which the BIOS 268 and boot code 290 may be stored.
The system, upon power on, may be configured to execute boot code 290 for the BIOS 268, as stored within the SPI Flash 266, and thereafter processes data under the control of one or more operating systems and application software (for example, stored in system memory 240). An operating system may be stored in any of a variety of locations and accessed, for example, according to instructions of the BIOS 268. As described herein, a device may include fewer or more features than shown in the system of FIG. 2.
Information handling device circuitry, as for example outlined in FIG. 1 or FIG. 2, may be used in devices such as tablets, smart phones, personal computer devices generally, and/or electronic devices that are capable of capturing audible user input.
Referring now to FIG. 3, an embodiment may automatically adjust one or more microphone settings based upon captured environmental data. At 301, an embodiment may capture, using at least one sensor associated with a device, environmental data in a user's location.
In an embodiment, environmental data may correspond to one or more different data types. For example, the environmental data may correspond to image or video data captured by a camera sensor (e.g., an image or video showing the number of individuals in a space, the direction the individuals are facing, etc.). As another example, the environmental data may correspond to audio data captured by a microphone or microphone array (e.g., the volume of ambient noise in a space, the determined direction from which audio was provided, etc.). In yet another embodiment, the environmental data may correspond to distance data obtained from a Time-of-Flight (ToF) sensor (e.g., distance data from an individual in the room to the device, etc.). In the context of this application, ToF sensors may refer to virtually any device or process, used alone or in combination, capable of distance detection, such as radar, Light Detection and Ranging (“Lidar”), laser, ultrasound, and the like. In yet another embodiment, Global Positioning System (GPS) data may be utilized to determine the nature or identity of the user's location (e.g., an embodiment may determine that the user is at home, work, a particular restaurant, monument, building, etc.).
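Purely for illustration, the several environmental data types described above might be aggregated into a single record along the following lines; the field names and units are assumptions, not requirements of any embodiment:

```python
# Hypothetical aggregation of one snapshot of the sensor data types above.
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

@dataclass
class EnvironmentalData:
    ambient_db: Optional[float] = None           # microphone: ambient volume
    audio_directions: List[float] = field(default_factory=list)  # degrees
    individuals_in_frame: Optional[int] = None   # camera: head count
    gazes_at_device: Optional[int] = None        # camera: gaze detection
    distances_m: List[float] = field(default_factory=list)       # ToF sensor
    gps_fix: Optional[Tuple[float, float]] = None  # (latitude, longitude)
```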
At 302, an embodiment may determine, from the captured environmental data, an optimal microphone mode for audio input capture in the user's location. In the context of this application, the optimal microphone mode may correspond to a mode in which the quality of the captured audio input is above a predetermined threshold quality standard for the user's contextual situation (e.g., the best microphone mode when the user alone is providing input to a device, when a plurality of users are providing input to a device, when the user is in a loud or quiet environment, when the user is in a particular location, etc.). The following determination techniques may be used alone or in combination to determine the optimal microphone mode.
In an embodiment, the determination of the optimal microphone mode may be facilitated by identifying that the user's location is associated with a high noise location. From this identification, an embodiment may conclude that the optimal microphone mode corresponds to a user only mode. In the context of this application, the user only mode may correspond to a mode where substantially only a single user's inputs are captured by the device. More particularly, the microphone(s) of the device in the user only mode may be configured to have a narrow audio pickup beam that may capture audio within a limited range in front of the microphone and ignore all other ambient sound.
In an embodiment, the identification that the user's location is associated with a high noise location may be accomplished, for example, by simply using the microphone(s) to determine whether the volume of captured audio in the user's location is above a predetermined volume threshold. Additionally or alternatively, other sensors may also be utilized to further confirm that the user's location is associated with a high noise location. For example, one or more camera sensors may be utilized to capture one or more images of other individuals in the user's location and predict, responsive to identifying that there are more than a threshold number of individuals in the image (e.g., 10, 15, etc.), that the user's location is associated with a high noise location (i.e., from the implication that more individuals in a space create more noise). In yet another embodiment, a GPS sensor unit may receive location data for the user's location and an embodiment may predict, from the location data, that the user's location is associated with a high noise location. For example, an embodiment may identify that the user is located at a sports stadium and may conclude from this identification that the user is likely in a high noise environment.
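A minimal sketch of this high-noise identification, assuming invented thresholds and a simple any-signal-suffices rule, might look like the following; none of the names or values are claimed specifics:

```python
AMBIENT_DB_THRESHOLD = 70.0   # assumed predetermined volume threshold
CROWD_SIZE_THRESHOLD = 10     # assumed threshold number of individuals
NOISY_PLACE_TYPES = {"stadium", "restaurant", "bar"}  # assumed GPS categories

def is_high_noise_location(ambient_db=None, individuals_in_frame=None,
                           gps_place_type=None) -> bool:
    """Return True if any available signal indicates a high noise location."""
    if ambient_db is not None and ambient_db > AMBIENT_DB_THRESHOLD:
        return True
    if (individuals_in_frame is not None
            and individuals_in_frame > CROWD_SIZE_THRESHOLD):
        return True  # more individuals in a space implies more noise
    return gps_place_type in NOISY_PLACE_TYPES
```

In this sketch, a True result would map to the narrow-beam user only mode described above.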
In an embodiment, the determination of the optimal microphone mode may be facilitated by identifying that the user's location is associated with a conference setting. From this identification, an embodiment may conclude that the optimal microphone mode corresponds to a multiple voices mode. In the context of this application, the multiple voices mode may correspond to a mode where audio inputs provided by a multitude of different users, potentially located in disparate positions around the user's location, can be appropriately captured by the device. More particularly, the microphone(s) of the device in the multiple voices mode may be configured to have a wider audio pickup beam that may capture audio from more positions in the user's location.
In an embodiment, the identification that the user's location is associated with a conference setting may be accomplished by identifying that a predetermined number of individuals are present in the user's location. For example, one or more camera sensors may first be utilized to capture one or more images of the individuals in the user's location. An embodiment may then determine whether a predetermined number of individuals are present in the image, wherein the predetermined number provides an implication to the system that a conference is on-going. Additionally or alternatively, an embodiment may utilize one or more image analysis techniques known in the art to determine the identities of the individuals in the captured image. These identities may be compared with available context data (e.g., calendar data, communication data, social media data, etc.) to determine that one or more identified individuals in the images were scheduled to participate in a conference meeting. Additionally or alternatively, an embodiment may utilize one or more microphones to determine whether a predetermined number of audio streams are directed at the device. More particularly, an embodiment may leverage one or more localization techniques to determine the direction from which the captured audio is provided. In the same vein, one or more ToF sensors may be utilized alone, or in combination with the audio localization techniques, to determine the distance between the audio-providing individuals and the device, whereby a closer distance may indicate an explicit intention to interact with the device (e.g., as part of a conference, etc.). In another embodiment, gaze detection techniques may also be utilized, alone or in combination with the foregoing, to determine whether a predetermined number of user gazes (e.g., 3, 5, 10, etc.) are directed at the device, or a particular portion of the device. In an embodiment, detection of the predetermined number of gazes may be reflective of a conference setting.
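The several conference-detection signals described above might be combined, again purely as an illustrative sketch with invented thresholds and names, as follows:

```python
CONFERENCE_HEAD_COUNT = 3   # assumed predetermined number of individuals
CONFERENCE_GAZE_COUNT = 3   # assumed predetermined number of gazes
NEAR_DEVICE_METERS = 2.5    # assumed distance implying intent to interact

def is_conference_setting(individuals_in_frame=0, gazes_at_device=0,
                          audio_directions=(), speaker_distances_m=()) -> bool:
    """Combine the camera, gaze, audio-localization, and ToF signals above."""
    if individuals_in_frame >= CONFERENCE_HEAD_COUNT:
        return True
    if gazes_at_device >= CONFERENCE_GAZE_COUNT:
        return True
    if len(audio_directions) >= CONFERENCE_HEAD_COUNT:
        return True  # several distinct audio streams directed at the device
    # Closer speakers imply an explicit intention to interact with the device.
    near = sum(1 for d in speaker_distances_m if d <= NEAR_DEVICE_METERS)
    return near >= CONFERENCE_HEAD_COUNT
```

A True result would map to the wide-beam multiple voices mode described above.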
In an embodiment, the determination of the optimal microphone mode may be facilitated by identifying that the user's location is associated with a standard setting. From this identification, an embodiment may conclude that the optimal microphone mode corresponds to a normal mode. In the context of this application, the normal mode may correspond to a mode where the microphones are configured to capture audio in a medium-range beam. More particularly, a device in the normal mode may adequately capture voice input provided by users located proximate to the microphone (e.g., in a standard location directly in front of the display screen of the device, etc.) as well as voice input originating a moderate distance away from the microphone. In an embodiment, the normal mode may be the default mode for the device that is reverted to each time the device is reset, each time the device is powered on, each time the user logs into their profile, etc.
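The revert-to-default behavior described above could be wired as simply as the following sketch, where mic.apply_mode is again a hypothetical driver call:

```python
DEFAULT_MODE = "normal"  # medium-range beam

def on_device_reset(mic):
    """Revert to the default mode on reset, power-on, or profile login."""
    mic.apply_mode(DEFAULT_MODE)  # hypothetical driver call
```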
Responsive to not determining, at 302, an optimal microphone mode, an embodiment may, at 303, take no additional action. Additionally or alternatively, an embodiment may maintain an existing microphone mode and capture audio input using that mode. Conversely, responsive to determining, at 302, the optimal microphone mode, an embodiment may, at 304, adjust one or more microphone settings to institute the determined optimal microphone mode.
In an embodiment, institution of the optimal microphone mode may be conducted dynamically, without additional user input, by the system. More particularly, an embodiment may have access to knowledge (e.g., stored in an accessible database, etc.) regarding which settings need to be adjusted to accommodate each microphone mode. Responsive to determining a particular optimal microphone mode, the system of the embodiments may dynamically activate or adjust the microphone settings to achieve that mode.
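For example, a table-driven sketch of such an accessible settings database might look like the following; the setting names and values are invented for illustration and are not claimed specifics:

```python
# Hypothetical mapping from microphone mode to the settings that institute it.
MODE_SETTINGS = {
    "user_only":       {"beam_width_deg": 30,  "noise_suppression": "high"},
    "normal":          {"beam_width_deg": 90,  "noise_suppression": "medium"},
    "multiple_voices": {"beam_width_deg": 180, "noise_suppression": "low"},
}

def institute_mode(mic, mode: str) -> None:
    """Look up and apply every stored setting for the determined mode."""
    for setting, value in MODE_SETTINGS[mode].items():
        mic.set(setting, value)  # hypothetical driver call
```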
The various embodiments described herein thus represent a technical improvement to conventional methods of adjusting the microphone settings on a user's device to accommodate the user's contextual input situation. Using the techniques described herein, an embodiment may capture environmental data associated with the user's location. An embodiment may then determine, from the environmental data, a microphone mode that is optimized for audio capture in the user's location and thereafter dynamically adjust one or more microphone settings to achieve the optimal microphone mode. Such a method may negate the conventional need for a user to constantly adjust their microphone settings based upon the location that they are in.
As will be appreciated by one skilled in the art, various aspects may be embodied as a system, method or device program product. Accordingly, aspects may take the form of an entirely hardware embodiment or an embodiment including software that may all generally be referred to herein as a “circuit,” “module” or “system.” Furthermore, aspects may take the form of a device program product embodied in one or more device readable medium(s) having device readable program code embodied therewith.
It should be noted that the various functions described herein may be implemented using instructions stored on a device readable storage medium such as a non-signal storage device that are executed by a processor. A storage device may be, for example, a system, apparatus, or device (e.g., an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device) or any suitable combination of the foregoing. More specific examples of a storage device/medium include the following: a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or Flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a storage device is not a signal and “non-transitory” includes all media except signal media.
Program code embodied on a storage medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, et cetera, or any suitable combination of the foregoing.
Program code for carrying out operations may be written in any combination of one or more programming languages. The program code may execute entirely on a single device, partly on a single device, as a stand-alone software package, partly on a single device and partly on another device, or entirely on the other device. In some cases, the devices may be connected through any type of connection or network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made through other devices (for example, through the Internet using an Internet Service Provider), through wireless connections, e.g., near-field communication, or through a hard wire connection, such as over a USB connection.
Example embodiments are described herein with reference to the figures, which illustrate example methods, devices and program products according to various example embodiments. It will be understood that the actions and functionality may be implemented at least in part by program instructions. These program instructions may be provided to a processor of a device, a special purpose information handling device, or other programmable data processing device to produce a machine, such that the instructions, which execute via a processor of the device, implement the functions/acts specified.
It is worth noting that while specific blocks are used in the figures, and a particular ordering of blocks has been illustrated, these are non-limiting examples. In certain contexts, two or more blocks may be combined, a block may be split into two or more blocks, or certain blocks may be re-ordered or re-organized as appropriate, as the explicit illustrated examples are used only for descriptive purposes and are not to be construed as limiting.
As used herein, the singular “a” and “an” may be construed as including the plural “one or more” unless clearly indicated otherwise.
This disclosure has been presented for purposes of illustration and description but is not intended to be exhaustive or limiting. Many modifications and variations will be apparent to those of ordinary skill in the art. The example embodiments were chosen and described in order to explain principles and practical application, and to enable others of ordinary skill in the art to understand the disclosure for various embodiments with various modifications as are suited to the particular use contemplated.
Thus, although illustrative example embodiments have been described herein with reference to the accompanying figures, it is to be understood that this description is not limiting and that various other changes and modifications may be effected therein by one skilled in the art without departing from the scope or spirit of the disclosure.