This application generally relates to a speaker system. In particular, this application relates to a speaker system comprising at least one steerable speaker array and methods for implementing and controlling the same.
Loudspeaker, or sound reproduction, systems comprising a plurality of speakers are commonly found in office spaces or conferencing environments; public spaces, including theaters, entertainment venues, and transportation hubs; homes; automobiles; and other listening environments. The number, size, quality, arrangement, and type of the speakers can affect sound quality and listening experience. However, most listening environments can only accommodate a certain number, size, type, and/or arrangement of speakers due to spatial and/or aesthetic limitations, limits on expense and/or computational complexity, and other constraints. For example, massive speaker systems with larger cone sizes may be suitable for concert halls and other music applications requiring a high fidelity, full-range response, e.g., 20 Hz to 20 kHz, but typically are not preferred for office spaces and conferencing environments. Rather, such environments often include speakers that are aesthetically designed to minimize the visual impact of the speaker system and acoustically designed to provide increased intelligibility and other preferred characteristics for voice applications.
One existing type of loudspeaker system is the line array comprising a linear arrangement of transducers with predetermined spacing or distances between the transducers. Typically, the transducers are arranged in a planar array and located on a front plate of a single housing or mounting frame with all of the transducers facing forward, or away from the front plate. A common line array is the “column speaker,” which consists of a long line of closely spaced identical transducers or drivers placed in an upright, forward-facing position. Line arrays provide the ability to steer the sound beams output by the individual speakers towards a given listener using appropriate beamforming techniques (e.g., signal processing). For example, the transducers of an upright column speaker can provide a controlled degree of directionality in the vertical plane. The directivity of a line array depends on several, somewhat conflicting properties. Longer lines of drivers permit greater directional control at lower frequencies, while closer spacing between drivers permits greater directional control at higher frequencies. Also, as frequency decreases, beamwidth increases, causing beam focus to decrease. A two-dimensional speaker array comprised of several individual line arrays arranged in rows and columns may be capable of providing control in all directions. However, such systems are difficult to design and expensive to implement due at least in part to the large number of drivers required to provide directivity across all frequencies.
Accordingly, there is an opportunity for systems that address these concerns. More particularly, there is an opportunity for systems including a speaker array that is unobtrusive, easy to install into an existing environment, and allows for adjustment of the speaker array, including steering discrete lobes to desired listeners or other locations.
The invention is intended to solve the above-noted problems by providing systems and methods that are designed to, among other things, provide: (1) a steerable speaker array comprising a concentric, nested configuration of transducers that achieves improved directivity over the voice frequency range and an optimal main to side lobe ratio over a prescribed steering angle range; and (2) enhanced audio features by utilizing the steerable speaker array in combination with a steerable microphone or microphone array, such as, for example, acoustic echo cancellation, crosstalk minimization, voice-lift, dynamic noise masking, and spatialized audio streams.
According to one aspect, a speaker array is provided. The speaker array comprises a plurality of drivers arranged in a concentric, nested configuration formed by arranging the drivers in a plurality of concentric groups and placing the groups at different radial distances from a central point of the configuration. Each group is formed by a subset of the plurality of drivers being positioned at predetermined intervals from each other along a perimeter of the group. The groups are rotationally offset from each other relative to a central axis of the array that passes through the central point. The different radial distances are configured such that the concentric groups are harmonically nested.
According to another aspect, a method, performed by one or more processors to generate a beamformed audio output using an audio system comprising a speaker array having a plurality of drivers, is provided. The method comprises receiving one or more input audio signals from an audio source coupled to the audio system; generating a separate audio output signal for each driver of the speaker array based on at least one of the input audio signals, the drivers being arranged in a plurality of concentric groups positioned at different radial distances relative to a central point to form a concentric, nested configuration; and providing the audio output signals to the corresponding drivers to produce a beamformed audio output. The generating comprises, for each driver: obtaining one or more filter values and at least one delay value associated with the driver, at least one of the one or more filter values being assigned to the driver based on the concentric group in which the driver is located, applying the at least one filter value to one or more filters to produce a filtered output signal for the driver, providing the filtered output signal to a delay element associated with the driver, applying the at least one delay value to the delay element to produce a delayed output signal for the driver, and providing the delayed output signal to a power amplifier in order to amplify the signal by a predetermined gain amount.
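By way of illustration only, and without limitation, the per-driver generating steps recited above (filter, then delay, then gain) may be sketched as follows. The sketch assumes a finite impulse response filter, an integer-sample delay, and a scalar gain; the function names, the dictionary keys, and the use of NumPy are hypothetical choices made for this example and are not taken from the described embodiments.

```python
import numpy as np

def process_driver(x, fir_taps, delay_samples, gain):
    """Filter, delay, then amplify the signal for one driver (illustrative)."""
    filtered = np.convolve(x, fir_taps)  # filter stage (FIR is an assumption)
    # Delay element: prepend zeros to shift the signal by whole samples.
    delayed = np.concatenate([np.zeros(delay_samples), filtered])
    return gain * delayed  # stands in for the power amplifier gain

def beamform(x, drivers, group_filters):
    """Return one output signal per driver.

    drivers: list of dicts with hypothetical keys 'group', 'delay', 'gain'.
    group_filters: maps a concentric-group label to the FIR taps shared by
    every driver in that group.
    """
    return [
        process_driver(x, group_filters[d["group"]], d["delay"], d["gain"])
        for d in drivers
    ]
```

In this sketch the filter values are looked up by concentric group, while the delay and gain are held per driver. Because each driver may have a different delay, the returned signals may differ in length.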
According to another aspect, an audio system is provided. The audio system comprises a first speaker array comprising a plurality of drivers arranged in a plurality of concentric groups positioned at different radial distances from a central point to form a concentric, nested configuration, each group being formed by a subset of the plurality of drivers being positioned at predetermined intervals from each other along a perimeter of the group. The audio system further comprises a beamforming system coupled to the first speaker array and configured to: receive one or more input audio signals from an audio source, generate a separate audio output signal for each driver of the first speaker array based on at least one of the input audio signals, and provide the audio output signals to the corresponding drivers to produce a beamformed audio output.
According to yet another aspect, a speaker system is provided. The speaker system comprises a planar speaker array disposed in a substantially flat housing and comprising a plurality of drivers arranged in a two-dimensional configuration, the speaker array having an aperture size of less than 60 centimeters and being configured to simultaneously form a plurality of dynamically steerable lobes directed towards multiple locations. The speaker system further comprises a beamforming system coupled to the speaker array and configured to digitally process one or more input audio signals, generate a corresponding audio output signal for each driver, and direct each output signal towards a designated one of the multiple locations.
These and other embodiments, and various permutations and aspects, will become apparent and be more fully understood from the following detailed description and accompanying drawings, which set forth illustrative embodiments that are indicative of the various ways in which the principles of the invention may be employed.
The description that follows describes, illustrates and exemplifies one or more particular embodiments of the invention in accordance with its principles. This description is not provided to limit the invention to the embodiments described herein, but rather to explain and teach the principles of the invention in such a way to enable one of ordinary skill in the art to understand these principles and, with that understanding, be able to apply them to practice not only the embodiments described herein, but also other embodiments that may come to mind in accordance with these principles. The scope of the invention is intended to cover all such embodiments that may fall within the scope of the appended claims, either literally or under the doctrine of equivalents.
It should be noted that in the description and drawings, like or substantially similar elements may be labeled with the same reference numerals. However, sometimes these elements may be labeled with differing numbers, such as, for example, in cases where such labeling facilitates a clearer description. Additionally, the drawings set forth herein are not necessarily drawn to scale, and in some instances proportions may have been exaggerated to more clearly depict certain features. Such labeling and drawing practices do not necessarily implicate an underlying substantive purpose. As stated above, the specification is intended to be taken as a whole and interpreted in accordance with the principles of the invention as taught herein and understood by one of ordinary skill in the art.
With respect to the exemplary systems, components and architecture described and illustrated herein, it should also be understood that the embodiments may be embodied by, or employed in, numerous configurations and components, including one or more systems, hardware, software, or firmware configurations or components, or any combination thereof, as understood by one of ordinary skill in the art. Accordingly, while the drawings illustrate exemplary systems including components for one or more of the embodiments contemplated herein, it should be understood that with respect to each embodiment, one or more components may not be present or necessary in the system.
Systems and methods are provided herein for a speaker system that includes a plurality of electroacoustic transducers or drivers selectively arranged to form a high-performing planar array capable of presenting audio source material in a narrowly directed, dynamically steerable sound beam and simultaneously presenting different source materials to different locations using individually steerable beams. The drivers are arranged in a harmonically nested and geometrically optimized configuration to allow for polar pattern formation capable of generating highly spatially-controlled and steerable beams with an optimal directivity index.
In embodiments, the array configuration is achieved by arranging the drivers in a plurality of concentrically-positioned groups (e.g., rings or other formations), which enables the speaker array to have equivalent beamwidth performance for any given look angle in a three-dimensional (e.g., X-Y-Z) space. As a result, the speaker array described herein can provide a more consistent output and better directivity than existing arrays with linear, rectangular, or square constellations. Further, each concentric group within the configuration of drivers is rotationally offset from every other group in order to avoid radial and axial symmetry. This enables the speaker array described herein to minimize side lobe growth or provide a maximal main-to-side-lobe ratio, unlike existing speaker arrays with co-linearly positioned speaker elements. The offset configuration can also tolerate further beam steering, which allows the speaker array to cover a wider listening area. Moreover, the speaker array configuration described herein can be harmonically nested to optimize beamwidth over a given set of distinct frequency bands (e.g., across the voice frequency range).
The sounds produced by the speaker array 100 can be directed towards one or more listeners (e.g., human listeners) within a room (e.g., conference room), or other location, using beamforming techniques, as described herein. In some embodiments, the speaker array 100 may be configured to simultaneously produce multiple audio outputs based on different audio signals received from a plurality of audio sources, with each audio output being directed to a different location or listener.
As shown in
As shown, the drivers 102 can be coupled to, or included on, a support 104 for securing and supporting the drivers 102. The drivers 102 may be embedded into the support 104 or otherwise mechanically attached thereto (e.g., suspended from wires attached to the support 104). In the illustrated embodiment, all of the drivers 102 are positioned on the same surface or side of the support 104 (e.g., a front or top face). In other embodiments, at least some of the drivers 102 may be arranged on a first side or surface of the support 104, while the rest of the drivers 102 are arranged on the opposite side or surface of the support 104. In some embodiments, the drivers 102 may be distributed across multiple supports or surfaces.
The support 104 may be any suitable planar surface, including, for example, a flat plate, a frame, a printed circuit board, a substrate, etc., and may have any suitable size or shape, including, for example a square, as shown in
In the illustrated embodiment, the speaker array 100 is encased in a housing 106 configured to protect and structurally support the drivers 102 and support 104. The housing 106 may include a sound-permeable front face made of fabric, film, wire mesh, or other suitable material, and an enclosed rear face made of metal, plastic, or other suitable material. A depth of the housing 106 may be selected to accommodate the acoustical cavity required by each of the drivers 102, as described herein. While the illustrated embodiment shows a substantially flat, square housing 106 and support 104, other sizes and shapes are also contemplated, including, for example, domed shapes, spherical shapes, parabolic shapes, oval or circular shapes, or other types of polygons (e.g., rectangle, triangle, pentagon, etc.).
In some embodiments, the housing 106 is configured for attachment to a ceiling so that the speaker array 100 faces down towards or over the listeners in a room or other environment. For example, the speaker array 100 may be placed over a conference table and may be used to reproduce an audio signal representing speech or spoken words received from a remote audio source associated with the conferencing environment. As another example, the speaker array 100 may be placed in an open office environment, above a cluster of cubicles or other suitable location. In a preferred embodiment, the housing 106 may be flush mounted to the ceiling or other surface to gain certain acoustic benefits, such as, for example, infinite baffling.
In one embodiment, a size and shape of the housing 106 may be configured to substantially match that of a standard ceiling tile, so that the speaker array 100 can be attached to a drop ceiling (or a secondary ceiling hung below a main, structural ceiling) in place of, or adjacent to, one of the ceiling tiles that make up the drop ceiling. For example, the housing 106 may be square-shaped, and each side of the housing 106 may have a length of about 60 cm, or about 24 inches, depending on whether the drop ceiling is according to European specifications or U.S. specifications. In one embodiment, an overall aperture size of the speaker array 100 may be less than 60 centimeters (or less than 24 inches), in order to fit within the housing 106.
The speaker array 100 can be further configured for optimal performance at a certain height, or range of heights, above a floor of the environment, for example, in accordance with standard ceiling heights (e.g., eight to ten feet high), or any other appropriate height range (e.g., ceiling to table height). In other embodiments, the speaker array 100 is configured for attachment to a vertical wall for directing audio towards the listeners from one side of the environment.
As shown in
In embodiments, the central driver 102a can be used as a reference point for creating axial symmetry in the array 100, and the concentric, nested configuration can be formed by arranging the remaining drivers 102b in concentric groups 108, 110, 112, 114 around the central driver 102a. Each group contains a different subset or collection of the drivers 102b. During operation, two or more groups of drivers 102b and/or the central driver 102a may be selected to work together and form a “sub-nest” configured to produce a desired speaker output, such as, for example, high directivity and steerability in a given frequency band. The number of sub-nests that may be formed using the drivers 102 can vary depending on the beamforming techniques used, the covered frequency bands, the total number of drivers 102 in the array 100, the total number of groups of drivers 102, etc.
As shown, the groups 108, 110, 112, 114 are positioned at progressively larger radial distances from the central point (0,0) of the array 100 in order to cover progressively lower frequency octaves and create a harmonically nested configuration. For example, as shown in
Within each of the groups 108-114, the individual drivers 102b may be evenly spaced apart, or positioned at predetermined intervals, along a circumference, or perimeter, of the group. The exact distance between neighboring drivers 102b (e.g., center to center) within a given group may vary depending on an overall size (e.g., radius) of the group, the size of each driver 102, the shape of the groups, and the number of drivers 102b included in the group, as will be appreciated. For example, in
In the illustrated example, the speaker array 100 comprises a total of fifty identical drivers 102, each driver 102 having a 20 millimeter (mm) diameter. The first driver 102a is placed in the central reference point, while the remaining forty-nine drivers 102b are arranged in the four concentric groups 108, 110, 112, 114 with progressively increasing radial distances to create the nested configuration. The increased driver density created by concentrically grouping or clustering the drivers 102 in this manner can minimize side lobes and improve directivity, thereby enabling the speaker array 100 to accommodate a wider range of audio frequencies with varying beamwidth control. The exact number of drivers 102b included in each group 108-114 and the total number of drivers 102 included in the speaker array 100 may depend on a number of considerations, including, for example, a size of the individual drivers 102, the configuration of the harmonic nests, a desired density for the drivers in the array, a preset operating frequency range of the array 100 and other desired performance standards, and constraints on physical space (e.g., due to a limit on the overall dimensions of the housing 106) and/or processing power (e.g., number of processors, number of outputs per processor, processing speeds, etc.). For example, in one embodiment, only forty-eight of the fifty drivers 102 are active because of hardware limitations. In other embodiments, the speaker array 100 may include more than fifty drivers 102, for example, by adding a fifth concentric group outside outermost group 114 to better accommodate lower frequencies.
In some embodiments, the geometry and harmonic nesting of the drivers 102 included in the center of the array 100, namely cluster 118 formed by central driver 102a and the drivers 102b of groups 108 and 110, may be configured to further extend a low frequency output of the speaker array 100 (or operate in low frequency bands) without requiring a larger overall size for the array. For example, as shown in
In some embodiments, the number of drivers 102b in each group can be configured to maximize a main-to-side-lobe ratio of the speaker array 100 and thereby produce an improved beamwidth with a near constant frequency response across all frequencies within the preset range. For example, the main-to-side-lobe ratio may be maximized by including an odd number of drivers 102b in the first group 108 and by including a multiple of the odd number in each of the other groups 110, 112, and 114. In one embodiment, the odd number is selected from a group of prime numbers in order to further avoid axial alignment between the drivers 102 and mitigate the side lobe effects across different octaves within the overall operating range of the speaker array (for example and without limitation, 100 Hz to 10 kHz). For example, in
The exact diameter or circumference of each group 108, 110, 112, 114, and/or the radial distance between each group and the central point (0,0), can vary depending on the desired frequency range of the speaker array 100 and a desired sensitivity or overall sound pressure for the drivers 102b in that group, as well as a size of each individual driver 102. In some embodiments, a diameter or size of each group may define the lowest frequency at which the drivers 102b within that group can optimally operate without interference or other negative effects (e.g., due to grating lobes). For example, a radial distance of the outermost group 114 may be selected to enable optimal operation at the lowest frequencies in the predetermined operating range, while a radial distance of the innermost group 108 may be selected to enable optimal operation at the highest frequencies in the predetermined range, and the remaining ring diameters or radial distances can be determined by subdividing the remaining frequency range.
In embodiments, the total number of driver groups included in the speaker array 100 can also determine the optimal frequency or operating range of the array 100. For example, the speaker array 100 may be configured to operate in a wider range of frequencies by increasing the number of groups to more than four. In other embodiments, the speaker array 100 may have fewer than the four groups shown in
In a preferred embodiment, the radial distance of each group 108, 110, 112, 114 is twice the radial distance of the smaller group nested immediately inside that group in accordance with the harmonic nesting approach. For example, in
In embodiments, each of the groups 108-114 may be at least slightly rotated relative to central axis 116 (e.g., the x-axis), which passes through the center point (0,0) of the array (e.g., the central speaker 102a), in order to optimize the directivity of the speaker array 100. For example, the rotational offset can be configured to eliminate undesired interference that can occur when more than two drivers 102 are aligned. In some embodiments, the groups 108-114 can be rotationally offset from each other, for example, by rotating each group a different number of degrees relative to the central axis 116, so that no more than two of the drivers 102 are axially aligned, or co-linear. In some embodiments, the number of degrees for the offset is an integer greater than one, or a multiple of that integer, and is selected to further avoid alignment and minimize co-linearity. For example, in the illustrated embodiment, each of the groups is rotationally offset from the x-axis 116 by 17 degrees or a multiple thereof. In particular, the first group 108 is offset by 17 degrees, the second group 110 is offset by 34 degrees, the third group 112 is offset by 51 degrees, and the fourth group 114 is offset by 68 degrees. In other embodiments, the rotational offset may be more arbitrarily implemented, if at all, and/or other methods may be utilized to optimize the overall directivity of the speaker array. Regardless of the method, rotationally offsetting the drivers 102 can configure the speaker array 100 to constrain sensitivity to the main lobes, thereby maximizing main lobe response and reducing side lobe response.
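By way of illustration only, the following sketch generates driver coordinates for a concentric, nested layout in which each group has twice the radial distance of the group nested immediately inside it and each group is rotated by a further multiple of 17 degrees from the x-axis, as described above. The innermost radius (0.03 m) and the per-group driver counts (7, 14, 14, 14) are assumptions chosen for this example so that the total, with the central driver, is fifty; they are not values stated in this description. The function names are likewise hypothetical.

```python
import math

def ring_positions(radius, count, offset_deg):
    """Evenly space `count` drivers on a circle rotated by `offset_deg`."""
    offset = math.radians(offset_deg)
    return [
        (radius * math.cos(offset + 2 * math.pi * k / count),
         radius * math.sin(offset + 2 * math.pi * k / count))
        for k in range(count)
    ]

def build_array(inner_radius_m, counts, offset_step_deg=17.0):
    """Central driver plus concentric groups with doubling radii."""
    positions = [(0.0, 0.0)]  # central driver at the reference point (0,0)
    for i, count in enumerate(counts):
        radius = inner_radius_m * 2 ** i    # harmonic nesting: radius doubles
        offset = offset_step_deg * (i + 1)  # 17, 34, 51, 68 degrees
        positions += ring_positions(radius, count, offset)
    return positions

# Assumed values, for illustration only.
layout = build_array(inner_radius_m=0.03, counts=[7, 14, 14, 14])
```

With these assumed values the outermost group has a radius of 0.24 m, giving a group diameter of 0.48 m between driver centers, which is under 60 centimeters.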
As will be appreciated,
In some embodiments, the plurality of drivers 102 may be arranged in concentric rings around a central point, but without a driver positioned at the central point (e.g., without the central driver 102a). In other embodiments, only a portion of the drivers 102 may be arranged in concentric rings, and the remaining portion of the drivers 102 may be positioned at various points outside of, or in between, the discrete rings, at random locations on the support 104, in line arrays at the top, bottom and/or sides of the concentric rings, or in any other suitable arrangement. In some embodiments, the drivers 102 may be non-identical transducers. For example, some of the drivers 102 may be smaller (e.g., tweeters), while others may be larger (e.g., woofers), to help accommodate a wider range of frequencies.
The speaker array 202 can be comprised of a plurality of speaker elements or drivers arranged in a harmonically nested, concentric configuration, or other geometrically optimized configuration in accordance with the techniques described herein. In embodiments, the speaker array 202 may be substantially similar to the speaker array 100 shown in
Various components of the speaker system 200 may be implemented using software executable by one or more computers, such as a computing device with a processor and memory, and/or by hardware (e.g., discrete logic circuits, application specific integrated circuits (ASIC), programmable gate arrays (PGA), field programmable gate arrays (FPGA), digital signal processors (DSP), microprocessor, etc.). For example, some or all components of the beamforming system 204 may be implemented using discrete circuitry devices and/or using one or more processors (e.g., audio processor and/or digital signal processor) (not shown) executing program code stored in a memory (not shown), the program code being configured to carry out one or more processes or operations described herein, such as, for example, method 400 shown in
The single cable 206 can be configured to transport audio signals, data signals, and power between the beamforming system 204 and the speaker array 202. Though not shown, each of the beamforming system 204 and the speaker array 202 may include an external port for receiving either end of the cable 206. In embodiments, the external ports may be Ethernet ports configured to provide power, control, and audio connectivity to the components of the speaker system 200. In such embodiments, the single cable 206 may be an Ethernet cable (e.g., CAT5, CAT6, etc.) configured to be electrically coupled to the Ethernet port. In other embodiments, the speaker system 200 includes one or more other types of external ports (e.g., Universal Serial Bus (USB), mini-USB, PS/2, HDMI, VGA, serial, etc.), and the single cable 206 is configured for coupling to said other port.
The content transported via the cable 206 to and/or from the speaker array 202 may be provided by various components of the beamforming system 204. For example, electrical power may be supplied by a power source 208 (e.g., battery, wall outlet, etc.) configured to send power to the speaker array 202. The power source 208 may be an external power supply that is electrically coupled to the beamforming system 204, or an internal power source included in the beamforming system 204 and/or speaker system 200. In a preferred embodiment, the power signal is delivered through the cable 206 using Power Over Ethernet (PoE) technology (e.g., PoE++). As an example, the power source 208 may be configured to supply up to 100 watts of power (e.g., Level 4 PoE), and the cable 206 may be configured (e.g., by including at least four twisted pairs of wires) to deliver at least 75 watts to the speaker array 202.
The audio data may be provided by an audio processing system 210 of the beamforming system 204 for transmission to the speaker array 202 over the cable 206. The audio processing system 210 can be configured to receive audio signals from one or more audio sources (not shown) coupled to the speaker system 200 and perform prescribed beamforming techniques to steer and focus sound beams to be output by the speaker array 202, for example, as described with respect to
The data signals transported over the cable 206 may include control information received from a user interface 212 of the beamforming system 204 for transmission to the speaker array 202, information provided by the audio processing system 210 for transmission to the speaker array 202, and/or information transmitted by the speaker array 202 to the beamforming system 204. As an example, the control information may include adjustments to parameters of the speaker array 202, such as, e.g., directionality, steering, gain, noise suppression, pattern forming, muting, frequency response, etc. In some embodiments, a user of the speaker system 200 may use the user interface 212 to enter control information designed to steer discrete lobes of the speaker array 202 to a particular angle, direction or location (e.g., using point and steer techniques) and/or change a shape and/or size of the lobes (e.g., using magnitude shading, lobe stretching, and/or other lobe shaping techniques).
In some cases, the user interface 212 includes a control panel coupled to a control device or processor of the beamforming system 204, the control panel including one or more switches, dimmer knobs, buttons, and the like. In other cases, the user interface 212 may be implemented using a software application executed by a processor of the beamforming system 204 and/or a mobile or web application executed by a processor of a remote device communicatively coupled to the beamforming system 204 via a wired or wireless communication network. In such cases, the user interface 212 may include a graphical layout for enabling the user to change filter values, delay values, beamwidth, and other controllable parameters of the audio processing system 210 using graphical sliders and buttons and/or other types of graphical inputs. The remote device may be a smartphone or other mobile phone, laptop computer, tablet computer, desktop computer, or other computing device configured to enable remote user control of the audio processing system 210 and/or speaker array 202. In some embodiments, the beamforming system 204 includes a wireless communication device (not shown) (e.g., a radio frequency (RF) transmitter and/or receiver) for facilitating wireless communication with the remote device (e.g., by transmitting and/or receiving RF signals).
Though
In embodiments, beamformer 304 comprises a filter system 306 and a plurality of delay elements 308 configured to apply pattern forming, steering, and/or other beamforming techniques to individually control the output of each speaker element 302. To help streamline these processes, sub-nests can be formed among the speaker elements 302 so as to cover specific frequency bands. For example, each sub-nest may include a collection of two or more concentric groups of speaker elements 302, a concentric group of elements plus the speaker element positioned at the center of the speaker array, a concentric group by itself, or a combination thereof. In some cases, a given speaker element 302 or group of elements may be used in more than one sub-nest. The exact number of speaker elements 302 or groups included in a given sub-nest may depend on the frequency band assigned to that sub-nest and/or an expected performance for that sub-nest.
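As one non-limiting illustration of how delay values for the delay elements might be derived for steering, the sketch below computes per-driver delays for a simple delay-and-sum approach, in which drivers closer to the target location are delayed so that the sound from every driver arrives at the target at the same time. Delay-and-sum is a swapped-in textbook technique used here for clarity and is not asserted to be the technique of the described embodiments. The function name, the 48 kHz sample rate, and the speed-of-sound constant are assumptions made for the example.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, approximate value for air at room temperature

def steering_delays(driver_xy, target_xyz, sample_rate=48000):
    """Per-driver delay, in samples, so all wavefronts reach the target together.

    driver_xy: (x, y) driver positions in meters on the array plane (z = 0).
    target_xyz: (x, y, z) target location in meters, same coordinate frame.
    """
    dists = [math.dist((x, y, 0.0), target_xyz) for x, y in driver_xy]
    farthest = max(dists)
    # The farthest driver gets zero delay; nearer drivers wait for it.
    return [round((farthest - d) / SPEED_OF_SOUND * sample_rate) for d in dists]
```

For a target directly on the central axis, the central driver is the nearest and therefore receives the largest delay, while drivers in the outermost group receive the smallest.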
In embodiments, beamformer 304 is implemented using one or more audio processors configured to process the input audio signal(s), for example, using filter system 306 and delay elements 308. Each processor (not shown) may comprise a digital signal processor and/or other suitable hardware (e.g., microprocessor, dedicated integrated circuit, field programmable gate array (FPGA), etc.). In one embodiment, beamformer 304 is implemented using two audio processors having 24 outputs each. In such cases, beamformer 304 can be configured to provide up to 48 outputs and therefore can be connected to up to 48 speaker elements or drivers 302. As will be appreciated, more or fewer processors may be used so that beamformer 304 can accommodate a larger or smaller number of drivers in the speaker array.
Various components of beamformer 304, and/or the overall audio processing system 300, may be implemented using software executable by one or more computers, such as a computing device with a processor and memory, and/or by hardware (e.g., discrete logic circuits, application specific integrated circuits (ASIC), programmable gate arrays (PGA), field programmable gate arrays (FPGA), digital signal processors (DSP), microprocessors, etc.). For example, filter systems 306 and/or delay elements 308 may be implemented using discrete circuitry devices and/or using one or more data processors executing program code stored in a memory, the program code being configured to carry out one or more processes or operations described herein, such as, for example, all or portions of method 400 shown in
As shown, audio processing system 300 also includes a plurality of amplifiers 310 coupled between the beamformer 304 and the plurality of speaker elements 302, such that each output of the beamformer 304 is coupled to a respective one of the amplifiers 310, and each amplifier 310 is coupled to a respective one of the speaker elements 302. During operation, a magnitude of each individual audio signal, a_n, generated by the beamformer 304 for a given speaker element n is amplified by a predetermined amount of gain, or gain factor (e.g., 0.5, 1, 2, etc.), before being provided to the corresponding speaker element n. In some embodiments, the gain factor for each amplifier 310 may be selected to ensure a uniform output from the speaker elements 302, i.e., outputs matched in magnitude. As will be appreciated, the exact number of amplifiers 310 included in the audio processing system 300 can depend on the number of speaker elements 302 included in the speaker array. In embodiments, the amplifiers 310 may be class D amplifiers or switching amplifiers, another type of electric amplifier, or any other suitable amplifier.
If the input audio signals are analog signals, the audio processing system 300 may further include an analog-to-digital converter 312 for converting the analog audio signal into a digital audio signal before it reaches the beamformer 304 for digital signal processing. In such cases, the individual audio signals a_n may be digital audio signals that, for example, conform to the Dante standard or another digital audio standard. The audio processing system 300 may also include a digital-to-analog converter 314 for converting each individual audio signal a_n back into an analog audio signal prior to amplification by the respective amplifier 310.
In some embodiments, the audio processing system 300 can further include a database 316 configured to store information used by the beamformer 304 to generate individual audio signals a_1 through a_n. The information may include filter coefficients and/or weights for configuring the filter system 306 and/or specific time delay values or coefficients (e.g., z^-k) for configuring the delay elements 308. The database 316 may store this information in a look up table or other suitable format. As an example, the table may list different filter coefficients and/or weights, as well as time delay values, for each of the speaker elements 302 and/or for each sub-nest or group of speaker elements (e.g., groups 108-114 in
In embodiments, the filter system 306 may be configured to apply crossover filtering to the input audio signal to generate an appropriate audio output signal for each speaker element 302. The crossover filtering may include applying various filters to the input audio signal in order to isolate the signal into different or discrete frequency bands. For example, referring back to
As shown, the filter system 306 includes a plurality of filter banks 318, each filter bank 318 comprising a preselected combination of filters for implementing crossover filtering to generate a desired audio output. In embodiments, the filter banks 318 may be configured to set a constant beamwidth for the audio output of the speaker array across a wide range of frequencies. The individual filters may be configured as bandpass filters, low pass filters, high pass filters, or any other suitable type of filter for optimally isolating a particular frequency band of the input audio signal. The cutoff frequencies for each individual filter may be selected based on the specific frequency response characteristics of the corresponding sub-nest and/or speaker element, including, for example, location of frequency nulls, a desired frequency response for the speaker array, etc. The filter system 306 may include digital filters and/or analog filters. In some embodiments, the filter system 306 includes one or more finite impulse response (FIR) filters and/or infinite impulse response (IIR) filters.
In some embodiments, the filter system 306 includes a separate filter bank 318 for each sub-nest of the speaker array, with N being the total number of sub-nests, and each filter bank 318 includes a separate filter for each speaker element 302 included in the corresponding sub-nest. In such cases, the exact number of filter banks 318, and the number of filters included therein, can depend on the number of sub-nests, as well as the number of speaker elements 302 included in each sub-nest. For example, in one embodiment, the speaker elements 302 may be configured as, or collected into, three different sub-nests to cover three different frequency bands and so, the filter system 306 may include three filter banks 318, one for each sub-nest. In another example embodiment, the speaker elements 302 may be configured to operate in four different sub-nests, so the filter system 306 includes at least four filter banks 318.
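As a rough illustration of how one filter bank per sub-nest could isolate frequency bands, the following Python sketch splits an input signal into complementary bands whose sum reproduces the input. It is a minimal sketch under stated assumptions, not the filter design of the described system: the first-order one-pole filters, the crossover frequencies, and the names `one_pole_lowpass` and `split_into_bands` are all hypothetical, and a practical filter bank 318 would more likely use higher-order FIR or IIR filters.

```python
import math


def one_pole_lowpass(signal, cutoff_hz, sample_rate_hz):
    """First-order IIR low-pass: y[n] = y[n-1] + alpha * (x[n] - y[n-1])."""
    alpha = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / sample_rate_hz)
    out, y = [], 0.0
    for x in signal:
        y += alpha * (x - y)
        out.append(y)
    return out


def split_into_bands(signal, crossover_hz, sample_rate_hz):
    """Split a signal into len(crossover_hz) + 1 bands, lowest band first.

    crossover_hz must be in ascending order (assumed values, one boundary
    between each pair of adjacent sub-nests). Each band is the low-pass of
    what remains after removing the bands below it, so the bands always
    sum back to the original signal.
    """
    bands = []
    residual = list(signal)
    for cutoff in crossover_hz:
        low = one_pole_lowpass(residual, cutoff, sample_rate_hz)
        bands.append(low)
        residual = [r - l for r, l in zip(residual, low)]
    bands.append(residual)  # everything above the last crossover
    return bands
```

With three sub-nests, for example, `split_into_bands(x, [500.0, 4000.0], 48000)` would return three band signals, each of which could then be routed to the speaker elements of the corresponding sub-nest.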
In still other embodiments, the filter system 306 can include a separate filter bank 318 for each of the speaker elements 302 or a separate filter bank 318 for each group of elements (e.g., groups 108, 110, 112, 114 in
The filter system 306 may further include additional elements not shown in
As shown, each individual audio signal a_n output by the filter system 306 is provided to a respective one of the delay elements 308 before exiting the beamformer 304. Each delay element 308 can be individually associated with a respective one of the speaker elements 302 and can be configured to apply an appropriate amount of time delay (e.g., z^-1) to the filtered output a_n received at its input. In embodiments, the delay value for a given speaker element 302 can be retrieved from the database 316 or programmatically generated (e.g., using software instructions executed by a processor), similar to the filter coefficients and/or weights used for the filter system 306. For example, each speaker element 302 may be assigned a respective amount of delay (or delay value), and such pairings may be stored in the database 316. The exact amount of delay applied in association with each speaker element 302 can vary depending on, for example, a desired polar pattern, a desired steering angle and/or shape of the main lobe, and/or other beamforming aspects.
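One way delay values could be programmatically generated for a desired steering angle is the classic far-field delay-and-sum rule, sketched below in Python. This is an illustrative sketch only: the function name `steering_delays`, the coordinate convention, the 343 m/s speed of sound, and the 48 kHz sample rate are assumptions and do not come from the described system.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumed value for air at roughly 20 degrees C


def steering_delays(positions_m, azimuth_deg, elevation_deg,
                    sample_rate_hz=48000):
    """Per-element delays, in samples, that tilt a planar array's main lobe.

    positions_m   -- (x, y) element coordinates in the array plane, in meters
    elevation_deg -- tilt away from the array normal (0 means straight ahead)
    azimuth_deg   -- direction of the tilt within the array plane
    """
    theta = math.radians(elevation_deg)
    phi = math.radians(azimuth_deg)
    # In-plane components of the unit vector pointing at the far-field target.
    ux = math.sin(theta) * math.cos(phi)
    uy = math.sin(theta) * math.sin(phi)
    # An element lying farther along the steering direction is closer to the
    # target, so it must fire later for all wavefronts to arrive together.
    raw_s = [(x * ux + y * uy) / SPEED_OF_SOUND_M_S for x, y in positions_m]
    earliest = min(raw_s)
    # Shift so that the smallest delay is zero (delays must be causal).
    return [(t - earliest) * sample_rate_hz for t in raw_s]
```

For two elements 10 cm apart steered 30 degrees off the normal, the second element is delayed by about 7 samples at 48 kHz. Fractional results would need rounding or fractional-delay filtering before being applied as z^-k delays.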
In some embodiments, the audio processing system 300 also includes one or more microphones 320 for detecting sound in a given environment and converting the sound into an audio signal for the purpose of implementing acoustic echo cancellation (AEC), voice lift, and other audio processing techniques designed to improve the performance of the speaker array. In some embodiments, the one or more microphones 320 may be arranged inside the speaker enclosure (such as, e.g., housing 106 of
The method 400 begins at step 402 with receiving one or more input audio signals from an audio source. The input audio signals may be received at one or more processors, such as, e.g., beamformer 304 shown in
At step 404, the one or more processors generate a separate audio output signal for each driver included in the speaker array based on at least one of the one or more input audio signals, as well as a desired beamforming result and characteristics related to the driver's position in the speaker array, including, for example, the particular group in which the driver is located. The audio output may be generated using crossover filtering, delay and sum processing, weight and sum processing, and/or other beamforming techniques for manipulating magnitude, phase, and delay values for each individual driver in order to steer the main lobe towards a desired location or listener and maintain a constant beamwidth across a wide range of frequencies. In embodiments, generating an audio output signal for each driver at step 404 can include obtaining one or more filter values and at least one delay value associated with the driver. At least one of the one or more filter values may be assigned to the driver based on the concentric group in which the driver is located. For example, in some embodiments, the groups of drivers may be combined to form two or more sub-nests for audio processing purposes, and all drivers belonging to a particular sub-nest can be assigned at least one common filter value. On the other hand, the time delay value may be specific to each driver. The filter values and delay values may be retrieved from a database (e.g., database 316 in
The generating process at step 404 can also include applying the at least one filter value to one or more filters (e.g., filter system 306 in
Step 406 involves providing the generated audio output signals to the corresponding drivers of the speaker array in order to produce a beamformed audio output. In embodiments, the audio output signals are transmitted to the speaker array over a single cable configured to transport audio, data, and power. The method 400 may end after completion of step 406.
Polar plots 600-614 shown in
More specifically,
Each speaker array 1102 can include a plurality of speaker elements or drivers arranged in a planar configuration. For example, the speaker elements may be arranged in a harmonically nested, concentric configuration (e.g., as shown in
The beamforming system 1104 can be in communication with the individual speaker elements of each speaker array 1102 and can be configured to beamform or otherwise process input audio signals and generate a corresponding audio output signal for each speaker element of each speaker array 1102. In this manner, the speaker array(s) 1102 can be configured to simultaneously produce a plurality of individual audio outputs using various speaker elements, or combinations of speaker elements, and direct each audio output towards a designated location or listener. In embodiments, the beamforming system 1104 may be substantially similar to the beamforming system 204, as shown in
As shown in
The microphone 1120 can include any suitable type of microphone transducer or element capable of detecting sound in a given environment and converting the sound into an audio signal for implementing acoustic echo cancellation (AEC), voice lift, crosstalk minimization, dynamic lobe steering, and other audio processing techniques designed to improve performance of the speaker array(s) 1102. In embodiments, the microphone 1120 can be substantially similar to the microphone 320 shown in
As shown in
In embodiments, the audio system 1100 can be configured to provide adaptive or dynamic steering control for each speaker array 1102 and each microphone array 1120. For example, the steerable speaker array 1102 may be capable of individually steering each audio output or beam towards a desired location. Likewise, the microphone array 1120 may be capable of individually steering each audio pick-up lobe or beam towards a desired target. The adaptive steering control may be achieved using appropriate beamforming techniques performed by the beamforming system 1104 for each of the microphones and speakers.
In some embodiments, the audio system 1100 can be configured to apply the dynamic steering capabilities of the at least one microphone 1120 and one or more speaker arrays 1102 towards functionalities or aspects that are in addition to delivering audio outputs to specific listeners, or configured to enhance the same. In particular, the audio system 1100 may be configured to allow each component of the system 1100 (e.g., each microphone and speaker) to be mutually aware of the physical location and steering status of all other components in the system 1100 relative to each other. This mutual awareness, together with other information related to the human sources and receivers in the room, allows the audio system 1100 to make active decisions about steering locations, magnitude variability, and signal delay, which enables, for example, source reinforcement and coherence. Additional details and examples are provided below.
In some embodiments, the audio system 1100 may be used to determine room behavior, or measure the room impulse response, by using the microphone array 1120 to calculate an impulse response for the speaker arrays 1102. Appropriate audio processing techniques may be used to measure the impulse response of each speaker array 1102 and may include a frequency-dependent response or an audible response. According to some techniques, an adaptive filter may be assigned to each speaker array 1102, and the filtered outputs may be combined to obtain the overall room response.
As an example, the microphone array 1120 of the audio system 1100 may be used to calculate specific room characteristics, namely RT60, speaker to microphone transfer function, and impulse response. In some embodiments, each of these values may be determined using well-known techniques. The ability to automatically measure these metrics and use them to condition the response of both the microphone array 1120 and the speaker arrays 1102, as well as the accompanying additional functionalities outlined herein, can provide information about the room or environment, and the audio system's interaction with that environment, that may better inform the technologies described below.
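As one example of a well-known technique for the RT60 metric, the Python sketch below applies Schroeder backward integration to a measured impulse response and extrapolates the decay slope to 60 dB. The choice of this technique, the fit range of -5 dB to -25 dB, and the function name `rt60_from_impulse_response` are assumptions made for illustration; the source does not specify which method is used.

```python
import math


def rt60_from_impulse_response(ir, sample_rate_hz,
                               fit_start_db=-5.0, fit_end_db=-25.0):
    """Estimate RT60 (seconds) from an impulse response.

    Builds the Schroeder energy decay curve by integrating squared samples
    from the end backwards, fits a straight line to the decay between
    fit_start_db and fit_end_db, and extrapolates that slope to -60 dB.
    """
    energy = [s * s for s in ir]
    decay = [0.0] * len(energy)
    running = 0.0
    for i in range(len(energy) - 1, -1, -1):
        running += energy[i]
        decay[i] = running
    total = decay[0]

    points = []  # (time in seconds, decay level in dB) inside the fit range
    for i, value in enumerate(decay):
        if value <= 0.0:
            break
        level_db = 10.0 * math.log10(value / total)
        if fit_end_db <= level_db <= fit_start_db:
            points.append((i / sample_rate_hz, level_db))
    if len(points) < 2:
        raise ValueError("impulse response too short for the fit range")

    # Ordinary least-squares slope of level (dB) versus time (s).
    mean_t = sum(t for t, _ in points) / len(points)
    mean_l = sum(l for _, l in points) / len(points)
    slope = (sum((t - mean_t) * (l - mean_l) for t, l in points)
             / sum((t - mean_t) ** 2 for t, _ in points))
    return -60.0 / slope
```

In practice the impulse response would come from the microphone array's capture of a test signal, and measurement noise would limit how far down the decay curve the fit can reliably extend.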
In some embodiments, the microphone array 1120 of the audio system 1100 may be used to calculate each speaker array's time of flight (TOF), or the time it takes audio output by a given speaker array 1102 to propagate through air over a known distance (e.g., the distance between the speaker array 1102 and the microphone array 1120). The time of flight calculations can be used to control gain parameters for the speaker arrays 1102, for example, in order to avoid feedback. As an example, this measurement can be made by sending a predetermined test signal to the speaker array 1102 using any synchronous digital communication technique, while simultaneously initiating detection of the test signal audio at the microphone array 1120 also under test, using any synchronous digital communication technique (such as, for example, but not limited to, Dante). Once the signal is detected, an appropriately processed time difference between when the speaker array 1102 issued the signal and when it was detected by the microphone array 1120 will indicate the time of flight and thus, can be used to calculate the actual distance separating the two devices.
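The time-of-flight measurement described above can be sketched in Python as a cross-correlation between the known test signal and the microphone capture. This is a minimal sketch that assumes playback and capture start on the same sample clock (as a synchronous transport would allow), that device and processing latency have already been removed, and that sound travels at 343 m/s; the function names are hypothetical.

```python
def estimate_delay_samples(reference, recorded):
    """Return the lag, in samples, where `recorded` best matches `reference`.

    Uses a brute-force cross-correlation over every lag at which the
    reference fits entirely inside the recording.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(len(recorded) - len(reference) + 1):
        score = sum(r * recorded[lag + i] for i, r in enumerate(reference))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def time_of_flight(reference, recorded, sample_rate_hz,
                   speed_of_sound_m_s=343.0):
    """Return (time of flight in seconds, implied distance in meters)."""
    lag = estimate_delay_samples(reference, recorded)
    tof_s = lag / sample_rate_hz
    return tof_s, tof_s * speed_of_sound_m_s
```

A test signal with a sharp autocorrelation peak (for example, a pseudo-random sequence or a sweep) makes the correlation maximum unambiguous. A lag of 140 samples at 48 kHz corresponds to roughly 2.9 ms, or about 1 m of separation.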
In some embodiments, the audio system 1100 may be used to optimize acoustic echo cancellation and minimize crosstalk by taking advantage of the fact that the microphone array 1120 and the speaker arrays 1102 are aware of each other. For example, an appropriate test signal may be applied to a given speaker array 1102 to excite the acoustic response of the room. The audio system 1100 can use the response detected from said test signal to initially tune echo cancellation algorithms for one or more microphones to minimize echoes generated by the room in response to the speaker array output. The audio system 1100 can also use the detected information to tune a response of the microphone array 1120 to minimize pickup from the spatial coordinates of the speaker array 1102 relative to the microphone array 1120.
In some embodiments, the steerable microphone array 1120 and steerable speaker array 1102 of the audio system 1100 may be used for adaptive voice-lift optimization. For example, null-steering techniques may be used to mutually exclude the output of one speaker array 1102 from that of another speaker array 1102. Also, null generation techniques may be used to mask non-speech audio detected by the microphone array 1120.
Voice lift is a technique for increasing speech intelligibility in large meeting rooms through subtle audio reinforcement. Incorporating voice lift techniques into the beamforming microphone array 1120 and speaker arrays 1102 of the audio system 1100 can provide a number of benefits. For example, the gain before feedback can be optimized by including the position of the active microphone in the steering decisions being made by the active speakers. When the system 1100 is aware of where the sound is coming from (i.e., the location of the talker or other audio source), the rest of the system 1100 can react intelligently by reinforcing the areas that are far from the audio source, while limiting reinforcement near the audio source. As another example, when the speakers and microphones are aware of each other (e.g., via time of flight), intelligent delays can be applied to the speaker outputs relative to the audio source for voice lift purposes, so as to synchronize the direct transmission with the reinforced transmission. This would limit the amount of phase or time of flight errors in the reinforcement, which leads to a more natural and transparent experience.
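The delay needed to synchronize the direct and reinforced transmissions can be approximated from geometry alone, as in the Python sketch below. It assumes that the talker, speaker, and listener positions are known in a common coordinate frame, that sound travels at 343 m/s, and that electronic processing latency is negligible; the function name `voice_lift_delay_s` is hypothetical.

```python
import math


def voice_lift_delay_s(talker_xyz, speaker_xyz, listener_xyz,
                       speed_of_sound_m_s=343.0):
    """Delay (seconds) to apply to a speaker output for voice lift.

    The reinforced sound from the speaker should reach the listener no
    earlier than the talker's direct sound, so the speaker output is held
    back by the difference between the two propagation times.
    """
    direct_s = math.dist(talker_xyz, listener_xyz) / speed_of_sound_m_s
    reinforced_s = math.dist(speaker_xyz, listener_xyz) / speed_of_sound_m_s
    # A negative difference means the speaker is already "late"; no delay
    # can fix that, so clamp at zero.
    return max(0.0, direct_s - reinforced_s)
```

For a listener 6.86 m from the talker and 3.43 m below a ceiling speaker, the speaker output would be delayed by about 10 ms. A deployed system might add a few extra milliseconds so that the direct sound arrives first and localization stays on the talker; that offset is a tuning choice not covered here.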
In some embodiments, the audio system 1100 may also be used for acoustic localization of multiple audio sources. For example, as people speak, their locations may change, thus requiring the audio system 1100 to redirect speaker audio to optimize system performance. The presence of a set of microphones with known inter-microphone distances allows for the calculation of talker location estimation relative to the microphones. Using that information and its knowledge of the location of the microphone array 1120 relative to the speaker array 1102, the audio system 1100 can simultaneously optimize speaker playback and microphone pickup directions. In some cases, the audio system 1100 may further include one or more technologies for tracking audio sources as they move about the room or environment, such as, for example, one or more infrared devices, a camera, and/or thermal imaging technology.
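To illustrate how known inter-microphone distances support talker location estimation, the Python sketch below converts a time difference of arrival (TDOA) between two microphones into an arrival angle. It assumes a far-field (plane wave) source, a single microphone pair, and a speed of sound of 343 m/s; the function name `arrival_angle_deg` is hypothetical, and a full localizer would combine several pairs to resolve position rather than a single angle.

```python
import math


def arrival_angle_deg(tdoa_s, mic_spacing_m, speed_of_sound_m_s=343.0):
    """Angle of arrival, in degrees from broadside, for one microphone pair.

    tdoa_s        -- arrival time at the second microphone minus the first
    mic_spacing_m -- known distance between the two microphones
    """
    ratio = speed_of_sound_m_s * tdoa_s / mic_spacing_m
    # Measurement noise can push the ratio slightly outside [-1, 1].
    ratio = max(-1.0, min(1.0, ratio))
    return math.degrees(math.asin(ratio))
```

For microphones 10 cm apart, a TDOA of about 0.146 ms corresponds to a source roughly 30 degrees off broadside. The estimated direction could then be used to steer both the pick-up lobe and the speaker beams.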
Another exemplary use for the audio system 1100 may be wall mapping to determine an audio envelope of the room or other environment and generate spatial awareness of the audio sources therein. For example, the audio system 1100 may determine intra-system awareness (e.g., where the speaker arrays 1102 are located in the room) by using the microphone array 1120 to calculate time of arrival (TOA), distance between two points, and other information pertinent to establishing the spatial relationship between a given pair of speaker arrays 1102. The audio system 1100 may combine the wall mapping knowledge with this intra-system awareness to automatically control certain parameters or features of the speaker arrays 1102. For example, the audio system 1100 may use the information to automatically adjust gain parameters, lobe characteristics, and/or other features of the speaker arrays 1102 in order to avoid feedback and other undesirable effects.
In some embodiments, wall mapping can be performed by issuing a pulse to a single speaker array 1102 and processing the response by a set of microphones of known geometry, such as, e.g., microphone array 1120. Room reflections can be estimated, and in most cases, a basic room geometry can be estimated based thereon. Knowing the room geometry allows the audio system 1100 to accommodate an estimated room response. The intra-system awareness can be accomplished via any digital communication technique, whether wired or wireless (such as, e.g., Dante). Alternatively, audio steganography may be used to embed the information in an audio signal output by the speaker array 1102 and received by a given microphone, or inserted into the audio signal detected by a given microphone. Additionally, AES3 digital audio signal technology or ultrasound technology may be used to perform the information exchange between a given pair of microphones.
When used in an open-office environment, or other large, open area, the audio system 1100 may be used to increase or improve a privacy index of the individuals in the environment 1200 through dynamic noise-masking. For example, a person occupying one cubicle may be able to mask a private conversation from the occupants of surrounding cubicles by configuring the speaker array 1102 to direct frequency-tuned noise towards each of the other occupants (e.g., as an individual audio output steered towards each occupant).
Privacy index (PI) is outlined as part of ASTM E1130 and is determined by the ability of nearby listeners to discern and intelligibly understand the content of a conversation. An alternate metric that is used in the architectural acoustics community is the Speech Intelligibility Index (SII), outlined in ANSI S3.5. According to some embodiments, the audio system 1100 may have the following capabilities in an open office environment. The speaker array 1102 may be capable of directing masking noise to areas of the environment that are not being used for a given teleconference. This masking noise can hinder the intelligibility of the teleconference audio or speech for outside listeners. Such functionality may be initiated as part of each teleconference, or may be a persistent feature of a well-defined area, wherein the audio system 1100 is configured to ensure minimal interference to that area from talkers detected in other areas, or limit transmission of audio from those other areas to the well-defined area. The dynamic steering ability of the microphone array 1120 and speaker arrays 1102 may also be used to actively mask surrounding sounds that are naturally transmitted to a given area, for example, using active noise suppression techniques.
In some embodiments, the audio system 1100 can be configured to share information between its components using ultrasonic or steganographic-type techniques that embed data or control information within the wireless audio signal. For example, information about gain levels, equalization levels, talker identification, filter coefficients, system level warnings (e.g., low battery), and other functional tasks or tests could be conveyed between components of the audio system 1100 using such wireless techniques, instead of using the network, as is conventional. This may reduce bandwidth consumption on the network and increase the speed with which information can be conveyed. Also, by embedding the data into the audio signal, the audio signal can be sent in real-time. That is, the audio signal need not be delayed to accommodate data signals, as is conventional.
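As a simple illustration of a steganographic-type technique, the Python sketch below hides control bytes in the least significant bits of integer PCM samples. This is only one possible approach, chosen here for brevity: it assumes a bit-exact digital transport (least-significant-bit data would not survive acoustic playback, lossy coding, or resampling), and the function names `embed_bytes` and `extract_bytes` are hypothetical.

```python
def embed_bytes(samples, payload):
    """Return a copy of integer PCM `samples` carrying `payload` in its LSBs.

    One payload bit is stored per sample, most significant bit of each byte
    first, so every sample changes by at most one quantization step.
    """
    bits = [(byte >> shift) & 1
            for byte in payload
            for shift in range(7, -1, -1)]
    if len(bits) > len(samples):
        raise ValueError("payload does not fit in the audio block")
    out = list(samples)
    for i, bit in enumerate(bits):
        out[i] = (out[i] & ~1) | bit  # clear the LSB, then set it to the bit
    return out


def extract_bytes(samples, num_bytes):
    """Recover `num_bytes` bytes previously written by embed_bytes."""
    data = bytearray()
    for b in range(num_bytes):
        value = 0
        for i in range(8):
            value = (value << 1) | (samples[b * 8 + i] & 1)
        data.append(value)
    return bytes(data)
```

A short message such as a gain setting could be embedded in one audio block by the sender and read back by the receiver without delaying the audio itself. An acoustic or ultrasonic link would instead need a modulation scheme that is robust to the channel.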
In some embodiments, the speaker arrays 1102 may be distributed around the environment 1200 so that each speaker array 1102 covers a predetermined portion of the environment 1200. In addition, the placement of each speaker 1102 and microphone 1120 may be selected relative to each other, or so that there is sufficient distance between adjoining devices. In some cases, the microphone 1120 may be directed away from the speaker arrays 1102 to avoid unwanted acoustic interference. The locations of the speaker arrays 1102 and microphone array(s) 1120 may also be selected depending on expected positioning of the listeners in the environment 1200 and/or the type of environment 1200. For example, in a conference room, the speaker arrays 1102 may be centered above a large conference table and may be used during a conference call to reproduce an audio signal representing speech or spoken words received from a remote audio source associated with the conference call. As another example, in an open office environment, the speaker arrays 1102 may be positioned above the clusters of cubicles, so that each cubicle receives audio from at least one of the speaker arrays 1102.
In some embodiments, the speaker arrays 1102 and the microphone array 1120 can be configured for attachment to a vertical wall or horizontal surface, such as, e.g., a table-top. In other embodiments, the speaker arrays 1102 and microphone array 1120 can be configured for attachment to the ceiling 1230, with a front face of each device facing down towards the environment 1200. For example, each speaker array 1102 and/or microphone array 1120 may include a housing with a back surface that is configured for flush-mount attachment to the ceiling 1230, similar to the housing 106 shown in
In some embodiments, the ceiling 1230 can be a suspended ceiling, or drop-ceiling, comprising a plurality of ceiling tiles arranged in a grid-like fashion, as shown in
As shown in
In embodiments, the ability to wirelessly link the components of the audio system 1100 through a distributed network that enables metadata transfer among said components allows for full transparency of the audio, DSP, and control parameters that are developed and exchanged through the use of the audio system 1100. Moreover, the ability to manage this metadata sharing through protocols, such as, for example, DECT, encrypted Wi-Fi, RF, NFC, Bluetooth, or any number of other wireless or wired protocols, allows for each piece of the system 1100 to be equally aware of the system 1100 as a whole. This awareness, in turn, allows the individual system components to behave in a system-wide consistent manner, as each component uses the same dataset for decision-making purposes.
Any process descriptions or blocks in figures should be understood as representing modules, segments, or portions of code which include one or more executable instructions for implementing specific logical functions or steps in the process, and alternate implementations are included within the scope of the embodiments of the invention in which functions may be executed out of order from that shown or discussed, including substantially concurrently or in reverse order, depending on the functionality involved, as would be understood by those having ordinary skill in the art.
This disclosure is intended to explain how to fashion and use various embodiments in accordance with the technology rather than to limit the true, intended, and fair scope and spirit thereof. The foregoing description is not intended to be exhaustive or to be limited to the precise forms disclosed. Modifications or variations are possible in light of the above teachings. The embodiment(s) were chosen and described to provide the best illustration of the principle of the described technology and its practical application, and to enable one of ordinary skill in the art to utilize the technology in various embodiments and with various modifications as are suited to the particular use contemplated. All such modifications and variations are within the scope of the embodiments as determined by the appended claims, as may be amended during the pendency of this application for patent, and all equivalents thereof, when interpreted in accordance with the breadth to which they are fairly, legally and equitably entitled.
This application is a continuation of U.S. patent application Ser. No. 16/882,110, filed on May 22, 2020, which claims priority to U.S. Provisional Patent Application No. 62/960,502, filed on Jan. 13, 2020, and U.S. Provisional Patent Application No. 62/851,819, filed on May 23, 2019, each of which is fully incorporated herein by reference.
Number | Date | Country
---|---|---
62960502 | Jan 2020 | US
62851819 | May 2019 | US

Relation | Number | Date | Country
---|---|---|---
Parent | 16882110 | May 2020 | US
Child | 17814029 | | US