STEERABLE SPEAKER ARRAY, SYSTEM, AND METHOD FOR THE SAME

TECHNICAL FIELD

This application generally relates to a speaker system. In particular, this application relates to a speaker system comprising at least one steerable speaker array and methods for implementing and controlling the same.

BACKGROUND

Loudspeaker, or sound reproduction, systems comprising a plurality of speakers are commonly found in office spaces or conferencing environments, public spaces, including theaters, entertainment venues, and transportation hubs, homes, automobiles, and other listening environments. The number, size, quality, arrangement, and type of the speakers can affect sound quality and listening experience. However, most listening environments can only accommodate a certain number, size, type, and/or arrangement of speakers due to spatial and/or aesthetic limitations, limits on expense and/or computational complexity, and other constraints. For example, massive speaker systems with larger cone sizes may be suitable for concert halls and other music applications requiring a high fidelity, full-range response, e.g., 20 Hz to 20 kHz, but typically, are not preferred for office spaces and conferencing environments. Rather, such environments often include speakers that are aesthetically designed to minimize the visual impact of the speaker system and acoustically designed to provide increased intelligibility and other preferred characteristics for voice applications.

One existing type of loudspeaker system is the line array comprising a linear arrangement of transducers with predetermined spacing or distances between the transducers. Typically, the transducers are arranged in a planar array and located on a front plate of a single housing or mounting frame with all of the transducers facing forward, or away from the front plate. A common line array is the “column speaker,” which consists of a long line of closely spaced identical transducers or drivers placed in an upright, forward-facing position. Line arrays provide the ability to steer the sound beams output by the individual speakers towards a given listener using appropriate beamforming techniques (e.g., signal processing). For example, the transducers of an upright column speaker can provide a controlled degree of directionality in the vertical plane. The directivity of a line array depends on several, somewhat conflicting properties. Longer lines of drivers permit greater directional control at lower frequencies, while closer spacing between drivers permits greater directional control at higher frequencies. Also, as frequency decreases, beam width increases, causing beam focus to decrease. A two-dimensional speaker array comprised of several individual line arrays arranged in rows and columns may be capable of providing control in all directions. However, such systems are difficult to design and expensive to implement due at least in part to the large number of drivers required to provide directivity across all frequencies.

Accordingly, there is an opportunity for systems that address these concerns. More particularly, there is an opportunity for systems including a speaker array that is unobtrusive, easy to install into an existing environment, and allows for adjustment of the speaker array, including steering discrete lobes to desired listeners or other locations.

SUMMARY

The invention is intended to solve the above-noted problems by providing systems and methods that are designed to, among other things, provide: (1) a steerable speaker array comprising a concentric, nested configuration of transducers that achieves improved directivity over the voice frequency range and an optimal main to side lobe ratio over a prescribed steering angle range; and (2) enhanced audio features by utilizing the steerable speaker array in combination with a steerable microphone or microphone array, such as, for example, acoustic echo cancellation, crosstalk minimization, voice-lift, dynamic noise masking, and spatialized audio streams.

According to one aspect, a speaker array is provided. The speaker array comprises a plurality of drivers arranged in a concentric, nested configuration formed by arranging the drivers in a plurality of concentric groups and placing the groups at different radial distances from a central point of the configuration. Each group is formed by a subset of the plurality of drivers being positioned at predetermined intervals from each other along a perimeter of the group. The groups are rotationally offset from each other relative to a central axis of the array that passes through the central point. The different radial distances are configured such that the concentric groups are harmonically nested.

According to another aspect, a method, performed by one or more processors to generate a beamformed audio output using an audio system comprising a speaker array having a plurality of drivers, is provided. The method comprises receiving one or more input audio signals from an audio source coupled to the audio system; generating a separate audio output signal for each driver of the speaker array based on at least one of the input audio signals, the drivers being arranged in a plurality of concentric groups positioned at different radial distances relative to a central point to form a concentric, nested configuration; and providing the audio output signals to the corresponding drivers to produce a beamformed audio output. The generating comprises, for each driver: obtaining one or more filter values and at least one delay value associated with the driver, at least one of the one or more filter values being assigned to the driver based on the concentric group in which the driver is located, applying the at least one filter value to one or more filters to produce a filtered output signal for the driver, providing the filtered output signal to a delay element associated with the driver, applying the at least one delay value to the delay element to produce a delayed output signal for the driver, and providing the delayed output signal to a power amplifier in order to amplify the signal by a predetermined gain amount.
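
By way of illustration only, the per-driver sequence recited above (filter, then delay, then amplification) may be sketched as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: the helper name `process_driver`, the use of FIR filter taps, the whole-sample delay, and all numeric values are hypothetical and are not taken from this description.

```python
import numpy as np

def process_driver(signal, fir_taps, delay_samples, gain):
    """Filter, delay, then scale the feed for one driver (hypothetical helper)."""
    # Filter stage: FIR filtering is an assumption; the text only says "filters".
    filtered = np.convolve(signal, fir_taps)[: len(signal)]
    # Delay element: prepend zeros, keep the original block length.
    delayed = np.concatenate([np.zeros(delay_samples), filtered])[: len(signal)]
    # Stand-in for the power amplifier's predetermined gain.
    return gain * delayed

# Illustrative filter values keyed by concentric group index (assumed values).
group_taps = {0: np.array([1.0]), 1: np.array([0.5, 0.5])}

x = np.array([1.0, 0.0, 0.0, 0.0, 0.0])  # unit impulse as a test input
y = process_driver(x, group_taps[1], delay_samples=2, gain=2.0)
print(y)
```

In this sketch the filter values are looked up by group, while the delay is supplied per driver, mirroring the assignment described in the method.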

According to another aspect, an audio system is provided. The audio system comprises a first speaker array comprising a plurality of drivers arranged in a plurality of concentric groups positioned at different radial distances from a central point to form a concentric, nested configuration, each group being formed by a subset of the plurality of drivers being positioned at predetermined intervals from each other along a perimeter of the group. The audio system further comprises a beamforming system coupled to the first speaker array and configured to: receive one or more input audio signals from an audio source, generate a separate audio output signal for each driver of the first speaker array based on at least one of the input audio signals, and provide the audio output signals to the corresponding drivers to produce a beamformed audio output.

According to yet another aspect, a speaker system is provided. The speaker system comprises a planar speaker array disposed in a substantially flat housing and comprising a plurality of drivers arranged in a two-dimensional configuration, the speaker array having an aperture size of less than 60 centimeters and being configured to simultaneously form a plurality of dynamically steerable lobes directed towards multiple locations. The speaker system further comprises a beamforming system coupled to the speaker array and configured to digitally process one or more input audio signals, generate a corresponding audio output signal for each driver, and direct each output signal towards a designated one of the multiple locations.

These and other embodiments, and various permutations and aspects, will become apparent and be more fully understood from the following detailed description and accompanying drawings, which set forth illustrative embodiments that are indicative of the various ways in which the principles of the invention may be employed.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a schematic diagram illustrating an exemplary speaker array in accordance with certain embodiments.

FIG. 2 is a block diagram depicting an exemplary speaker system in accordance with certain embodiments.

FIG. 3 is a block diagram depicting an exemplary audio processing system of the speaker system shown in FIG. 2, in accordance with certain embodiments.

FIG. 4 is a flowchart illustrating an exemplary method of generating a beamformed audio output using the speaker system of FIG. 2, in accordance with one or more embodiments.

FIG. 5 is a response plot showing select frequency responses of the speaker array of FIG. 1 in accordance with certain embodiments.

FIGS. 6A and 6B and FIGS. 7A and 7B are polar plots showing select polar responses of the speaker array of FIG. 1 in accordance with certain embodiments.

FIGS. 8-10 are diagrams of exemplary use cases for the speaker array of FIG. 1, in accordance with embodiments.

FIG. 11 is a block diagram depicting an exemplary audio system in accordance with certain embodiments.

FIG. 12 is a schematic diagram illustrating an exemplary implementation of the audio system of FIG. 11 in a drop ceiling, in accordance with certain embodiments.

DETAILED DESCRIPTION

The description that follows describes, illustrates and exemplifies one or more particular embodiments of the invention in accordance with its principles. This description is not provided to limit the invention to the embodiments described herein, but rather to explain and teach the principles of the invention in such a way to enable one of ordinary skill in the art to understand these principles and, with that understanding, be able to apply them to practice not only the embodiments described herein, but also other embodiments that may come to mind in accordance with these principles. The scope of the invention is intended to cover all such embodiments that may fall within the scope of the appended claims, either literally or under the doctrine of equivalents.

It should be noted that in the description and drawings, like or substantially similar elements may be labeled with the same reference numerals. However, sometimes these elements may be labeled with differing numbers, such as, for example, in cases where such labeling facilitates a clearer description. Additionally, the drawings set forth herein are not necessarily drawn to scale, and in some instances proportions may have been exaggerated to more clearly depict certain features. Such labeling and drawing practices do not necessarily implicate an underlying substantive purpose. As stated above, the specification is intended to be taken as a whole and interpreted in accordance with the principles of the invention as taught herein and understood by one of ordinary skill in the art.

With respect to the exemplary systems, components and architecture described and illustrated herein, it should also be understood that the embodiments may be embodied by, or employed in, numerous configurations and components, including one or more systems, hardware, software, or firmware configurations or components, or any combination thereof, as understood by one of ordinary skill in the art. Accordingly, while the drawings illustrate exemplary systems including components for one or more of the embodiments contemplated herein, it should be understood that with respect to each embodiment, one or more components may not be present or necessary in the system.

Systems and methods are provided herein for a speaker system that includes a plurality of electroacoustic transducers or drivers selectively arranged to form a high-performing planar array capable of presenting audio source material in a narrowly directed, dynamically steerable sound beam and simultaneously presenting different source materials to different locations using individually steerable beams. The drivers are arranged in a harmonically nested and geometrically optimized configuration to allow for polar pattern formation capable of generating highly spatially-controlled and steerable beams with an optimal directivity index.

In embodiments, the array configuration is achieved by arranging the drivers in a plurality of concentrically-positioned groups (e.g., rings or other formations), which enables the speaker array to have equivalent beam width performance for any given look angle in a three-dimensional (e.g., X-Y-Z) space. As a result, the speaker array described herein can provide a more consistent output and better directivity than existing arrays with linear, rectangular, or square constellations. Further, each concentric group within the configuration of drivers is rotationally offset from every other group in order to avoid radial and axial symmetry. This enables the speaker array described herein to minimize side lobe growth or provide a maximal main-to-side-lobe ratio, unlike existing speaker arrays with co-linearly positioned speaker elements. The offset configuration can also tolerate further beam steering, which allows the speaker array to cover a wider listening area. Moreover, the speaker array configuration described herein can be harmonically nested to optimize beam width over a given set of distinct frequency bands (e.g., across the voice frequency range).

FIG. 1 illustrates an exemplary speaker array 100 comprising a plurality of individually steerable speakers 102 (also referred to herein as “drivers”) arranged in a two-dimensional configuration, in accordance with embodiments. Each of the speakers 102 may be an electroacoustic transducer or any other type of driver configured to convert an electrical audio signal into a corresponding sound including, for example, dynamic drivers, piezoelectric transducers, planar magnetic drivers, electrostatic transducers, MEMS drivers, compression drivers, etc. The sound output by the speaker array 100 may represent any type of input audio signal including, for example, live or real-time audio spoken by human speakers, pre-recorded audio files reproduced by an audio player, streaming audio received from a remote audio source using a network connection, etc. In some cases, the input audio signal can be a digital audio signal, and the digital audio signals may conform to the Dante standard for transmitting audio over Ethernet or another standard. In other cases, the input audio signal may be an analog audio signal, and the speaker array 100 may be coupled to components, such as analog to digital converters, processors, and/or other components, to process the analog audio signals and ultimately generate one or more digital audio output signals (e.g., as shown in FIG. 3).

The sounds produced by the speaker array 100 can be directed towards one or more listeners (e.g., human listeners) within a room (e.g., conference room), or other location, using beamforming techniques, as described herein. In some embodiments, the speaker array 100 may be configured to simultaneously produce multiple audio outputs based on different audio signals received from a plurality of audio sources, with each audio output being directed to a different location or listener.
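
As a non-limiting illustration of how sound may be directed towards a location, the following sketch computes per-driver time delays for a simple delay-and-sum approach. Delay-and-sum is assumed here purely for illustration; this description does not limit the beamforming techniques to that approach. The function name `steering_delays`, the speed-of-sound constant, the three-driver layout, and the listener position are all hypothetical.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value

def steering_delays(driver_xy, target_xyz):
    """Per-driver delays (seconds) so that all wavefronts reach the target together."""
    # The planar array is assumed to lie in the plane z = 0.
    drivers = np.column_stack([driver_xy, np.zeros(len(driver_xy))])
    dist = np.linalg.norm(drivers - np.asarray(target_xyz), axis=1)
    # The farthest driver gets zero delay; nearer drivers wait for it.
    return (dist.max() - dist) / SPEED_OF_SOUND

# Hypothetical three-driver layout (meters) and a listener 2 m from the
# array plane, 1 m off the array's central axis.
xy = np.array([[0.0, 0.0], [0.2, 0.0], [-0.2, 0.0]])
delays = steering_delays(xy, (1.0, 0.0, 2.0))
print(delays)
```

Under these assumptions, the driver nearest the listener receives the largest delay, so that its output arrives in phase with the output of the farthest driver.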

As shown in FIG. 1, the drivers 102 are all arranged in a single plane and are forward-facing, or have a front face pointed towards the room or environment in which the speaker array 100 is installed. Each of the drivers 102 has a separate enclosed volume extending away from the front face of the driver 102. The enclosed volume forms a cylindrical cavity that, at least in part, determines a depth of the operating space required for the speaker array 100. For example, in one embodiment, each of the drivers 102 has an enclosure volume of 25 cubic centimeters (cc), which forms a cylindrical cavity of a known height behind the driver 102. This height may define a minimum depth for the speaker array 100, or a housing comprising the speaker array 100. In some embodiments, a back or rear face of the speaker array 100 may look like a honeycomb due to the independent cavities of the drivers 102 extending up and away from the front face of the array 100 and being arranged in close proximity to each other.

As shown, the drivers 102 can be coupled to, or included on, a support 104 for securing and supporting the drivers 102. The drivers 102 may be embedded into the support 104 or otherwise mechanically attached thereto (e.g., suspended from wires attached to the support 104). In the illustrated embodiment, all of the drivers 102 are positioned on the same surface or side of the support 104 (e.g., a front or top face). In other embodiments, at least some of the drivers 102 may be arranged on a first side or surface of the support 104, while the rest of the drivers 102 are arranged on the opposite side or surface of the support 104. In some embodiments, the drivers 102 may be distributed across multiple supports or surfaces.

The support 104 may be any suitable planar surface, including, for example, a flat plate, a frame, a printed circuit board, a substrate, etc., and may have any suitable size or shape, including, for example a square, as shown in FIG. 1, a rectangle, a circle, a hexagon, etc. In other embodiments, the support 104 may be a curved or domed surface having, for example, a concave or convex shape. In still other embodiments, each of the drivers 102 may be individually positioned, or suspended, in the environment without connection to a common support or housing. In such cases, the drivers 102 may be wirelessly connected to an audio processing system to receive audio output signals and may form a distributed network of speakers.

In the illustrated embodiment, the speaker array 100 is encased in a housing 106 configured to protect and structurally support the drivers 102 and support 104. The housing 106 may include a sound-permeable front face made of fabric, film, wire mesh, or other suitable material, and an enclosed rear face made of metal, plastic, or other suitable material. A depth of the housing 106 may be selected to accommodate the acoustical cavity required by each of the drivers 102, as described herein. While the illustrated embodiment shows a substantially flat, square housing 106 and support 104, other sizes and shapes are also contemplated, including, for example, domed shapes, spherical shapes, parabolic shapes, oval or circular shapes, or other types of polygons (e.g., rectangle, triangle, pentagon, etc.).

In some embodiments, the housing 106 is configured for attachment to a ceiling so that the speaker array 100 faces down towards or over the listeners in a room or other environment. For example, the speaker array 100 may be placed over a conference table and may be used to reproduce an audio signal representing speech or spoken words received from a remote audio source associated with the conferencing environment. As another example, the speaker array 100 may be placed in an open office environment, above a cluster of cubicles or other suitable location. In a preferred embodiment, the housing 106 may be flush mounted to the ceiling or other surface to gain certain acoustic benefits, such as, for example, infinite baffling.

In one embodiment, a size and shape of the housing 106 may be configured to substantially match that of a standard ceiling tile, so that the speaker array 100 can be attached to a drop ceiling (or a secondary ceiling hung below a main, structural ceiling) in place of, or adjacent to, one of the ceiling tiles that make up the drop ceiling. For example, the housing 106 may be square-shaped, and each side of the housing 106 may have a length of about 60 cm, or about 24 inches, depending on whether the drop ceiling is according to European specifications or U.S. specifications. In one embodiment, an overall aperture size of the speaker array 100 may be less than 60 centimeters (or less than 24 inches), in order to fit within the housing 106.

The speaker array 100 can be further configured for optimal performance at a certain height, or range of heights, above a floor of the environment, for example, in accordance with standard ceiling heights (e.g., eight to ten feet high), or any other appropriate height range (e.g., ceiling to table height). In other embodiments, the speaker array 100 is configured for attachment to a vertical wall for directing audio towards the listeners from one side of the environment.

As shown in FIG. 1, the plurality of drivers 102 includes a central driver 102a positioned at a central point (0,0) of the support 104 and a remaining set of the drivers 102b arranged in a concentric, nested configuration surrounding the central driver 102a, thus forming a two-dimensional array. Due, at least in part, to the geometry of this concentric, nested configuration, the speaker array 100 can achieve a constant beam width over a preset audio frequency range (e.g., the voice frequencies), improved directional sensitivity across the preset range, and maximal main-to-side-lobe ratio over a prescribed steering angle range, enabling the speaker array 100 to more precisely direct sound towards selected locations or listeners. Moreover, as compared to a linear array, the two-dimensional design of the speaker array 100 described herein requires fewer drivers 102 to achieve the same directional performance, thus reducing the overall size and weight of the array 100.

In embodiments, the central driver 102a can be used as a reference point for creating axial symmetry in the array 100, and the concentric, nested configuration can be formed by arranging the remaining drivers 102b in concentric groups 108, 110, 112, 114 around the central driver 102a. Each group contains a different subset or collection of the drivers 102b. During operation, two or more groups of drivers 102b and/or the central driver 102a may be selected to work together and form a “sub-nest” configured to produce a desired speaker output, such as, for example, high directivity and steerability in a given frequency band. The number of sub-nests that may be formed using the drivers 102 can vary depending on the beamforming techniques used, the covered frequency bands, the total number of drivers 102 in the array 100, the total number of groups of drivers 102, etc.

As shown, the groups 108, 110, 112, 114 are positioned at progressively larger radial distances from the central point (0,0) of the array 100 in order to cover progressively lower frequency octaves and create a harmonically nested configuration. For example, as shown in FIG. 1, the first group 108 is immediately adjacent to the central driver 102a and is nested within the second group 110, while the second group 110 is nested within the third group 112, and the third group 112 is nested within the fourth group 114. In addition, the radial distances of the groups 108-114 may double in size with each nesting in accordance with harmonic nesting techniques. For example, the radial distance of the second group 110 is double the radial distance of the first group 108, the radial distance of the third group 112 is double that of the second group 110, etc. As shown, in some embodiments, the concentric groups 108-114 may be circular in shape and may form rings of different sizes. For example, in FIG. 1, a circle has been drawn through each group of drivers 102b for ease of explanation and illustration. Other shapes for the groups of drivers 102b are also contemplated, including, for example, oval or other oblong shapes, rectangular or square shapes, triangles or other polygon shapes, etc.

Within each of the groups 108-114, the individual drivers 102b may be evenly spaced apart, or positioned at predetermined intervals, along a circumference, or perimeter, of the group. The exact distance between neighboring drivers 102b (e.g., center to center) within a given group may vary depending on an overall size (e.g., radius) of the group, the size of each driver 102, the shape of the groups, and the number of drivers 102b included in the group, as will be appreciated. For example, in FIG. 1, the drivers 102b in groups 108 and 110 are adjacent or nearly adjacent to each other because those two groups have smaller diameters, while groups 112 and 114 have larger diameters and therefore, larger spaces between their respective drivers 102b.

In the illustrated example, the speaker array 100 comprises a total of fifty identical drivers 102, each driver 102 having a 20 millimeter (mm) diameter. The first driver 102a is placed in the central reference point, while the remaining forty-nine drivers 102b are arranged in the four concentric groups 108, 110, 112, 114 with progressively increasing radial distances to create the nested configuration. The increased driver density created by concentrically grouping or clustering the drivers 102 in this manner can minimize side lobes and improve directivity, thereby enabling the speaker array 100 to accommodate a wider range of audio frequencies with varying beam width control. The exact number of drivers 102b included in each group 108-114 and the total number of drivers 102 included in the speaker array 100 may depend on a number of considerations, including, for example, a size of the individual drivers 102, the configuration of the harmonic nests, a desired density for the drivers in the array, a preset operating frequency range of the array 100 and other desired performance standards, and constraints on physical space (e.g., due to a limit on the overall dimensions of the housing 106) and/or processing power (e.g., number of processors, number of outputs per processor, processing speeds, etc.). For example, in one embodiment, only forty-eight of the fifty drivers 102 are active because of hardware limitations. In other embodiments, the speaker array 100 may include more than fifty drivers 102, for example, by adding a fifth concentric group outside outermost group 114 to better accommodate lower frequencies.

In some embodiments, the geometry and harmonic nesting of the drivers 102 included in the center of the array 100, namely cluster 118 formed by central driver 102a and the drivers 102b of groups 108 and 110, may be configured to further extend a low frequency output of the speaker array 100 (or operate in low frequency bands) without requiring a larger overall size for the array. For example, as shown in FIG. 1, the drivers 102b of the first group 108 are adjacent to each other and in close proximity to the central driver 102a. Likewise, the drivers 102b of the second group 110 are also adjacent to each other and in close proximity to the first group 108. During operation, the drivers 102 forming the cluster 118 may effectively operate as one larger speaker with an aperture size roughly equivalent to a total width of the cluster 118. In embodiments, the speaker array 100 can combine the cluster 118 of drivers 102 with the drivers 102b in the outer groups 112 and/or 114 to provide better low frequency sensitivity (or operation) than that of each individual driver 102. For example, in embodiments where each driver 102 has a 20 mm aperture size, an effective aperture size of the central cluster 118 may be about four inches. In such cases, the speaker array 100 can be configured to provide a low frequency sensitivity of about 100 Hz, which is much lower than that of a single driver 102 (e.g., 400 Hz).
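
The "about four inches" figure can be checked with simple arithmetic, using the 51 mm radial centerline described herein for the second group 110 of FIG. 1 and the 20 mm driver diameter. The following is a rough sketch only; whether the "total width" of the cluster 118 is measured between driver centerlines or between outer driver edges is not stated, so both readings are computed, and the variable names are hypothetical.

```python
MM_PER_INCH = 25.4

GROUP_110_RADIUS_MM = 51.0   # radial centerline of the second group 110 (FIG. 1)
DRIVER_DIAMETER_MM = 20.0    # diameter of each driver 102

# Reading 1 (assumption): width measured between driver centerlines.
centerline_width_mm = 2 * GROUP_110_RADIUS_MM
# Reading 2 (assumption): width measured between outer driver edges.
outer_edge_width_mm = centerline_width_mm + DRIVER_DIAMETER_MM

centerline_width_in = centerline_width_mm / MM_PER_INCH
outer_edge_width_in = outer_edge_width_mm / MM_PER_INCH

print(f"centerline width: {centerline_width_mm:.0f} mm = {centerline_width_in:.2f} in")
print(f"outer-edge width: {outer_edge_width_mm:.0f} mm = {outer_edge_width_in:.2f} in")
```

The centerline reading gives a width of 102 mm, which is within a few hundredths of an inch of four inches.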

In some embodiments, the number of drivers 102b in each group can be configured to maximize a main-to-side-lobe ratio of the speaker array 100 and thereby, produce an improved beam width with a near constant frequency response across all frequencies within the preset range. For example, the main-to-side-lobe ratio may be maximized by including an odd number of drivers 102b in the first group 108 and by including a multiple of the odd number in each of the other groups 110, 112, and 114. In one embodiment, the odd number is selected from a group of prime numbers in order to further avoid axial alignment between the drivers 102 and mitigate the side lobe effects across different octaves within the overall operating range of the speaker array (for example and without limitation, 100 Hz to 10 kHz). For example, in FIG. 1, the number of drivers 102b included in the first group 108 is seven, and the number of drivers 102b in each of the other groups 110, 112, 114 is a multiple of seven, namely fourteen. In some embodiments, the number of drivers 102b included in each group may be selected to create a repeating pattern that can be easily extended to cover more audio frequencies by adding one or more concentric groups, or easily reduced to cover fewer frequencies by removing one or more groups. In other embodiments, the number of drivers 102b in the first group 108 may be any integer greater than one and the number of drivers 102b in each of the other groups 110, 112, 114 may be a multiple of that number.
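
For ease of reference, the driver counts of the illustrated example can be tallied as follows. This short sketch simply checks that the per-group counts given for FIG. 1 satisfy the prime-number and multiple conditions described above and sum, together with the central driver 102a, to the fifty drivers of the illustrated array; the helper `is_prime` and the variable names are hypothetical.

```python
def is_prime(n):
    """Trial-division primality test, adequate for small counts."""
    return n > 1 and all(n % d for d in range(2, int(n ** 0.5) + 1))

CENTRAL_DRIVERS = 1  # central driver 102a
group_counts = {108: 7, 110: 14, 112: 14, 114: 14}  # drivers 102b per group (FIG. 1)

first_count = group_counts[108]
total_drivers = CENTRAL_DRIVERS + sum(group_counts.values())
all_multiples = all(count % first_count == 0 for count in group_counts.values())

print("first group count is odd:", first_count % 2 == 1)
print("first group count is prime:", is_prime(first_count))
print("other groups are multiples of it:", all_multiples)
print("total drivers:", total_drivers)
```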

The exact diameter or circumference of each group 108, 110, 112, 114, and/or the radial distance between each group and the central point (0,0), can vary depending on the desired frequency range of the speaker array 100 and a desired sensitivity or overall sound pressure for the drivers 102b in that group, as well as a size of each individual driver 102. In some embodiments, a diameter or size of each group may define the lowest frequency at which the drivers 102b within that group can optimally operate without interference or other negative effects (e.g., due to grating lobes). For example, a radial distance of the outermost group 114 may be selected to enable optimal operation at the lowest frequencies in the predetermined operating range, while a radial distance of the innermost group 108 may be selected to enable optimal operation at the highest frequencies in the predetermined range, and the remaining ring diameters or radial distances can be determined by subdividing the remaining frequency range.

In embodiments, the total number of driver groups included in the speaker array 100 can also determine the optimal frequency or operating range of the array 100. For example, the speaker array 100 may be configured to operate in a wider range of frequencies by increasing the number of groups to more than four. In other embodiments, the speaker array 100 may have fewer than the four groups shown in FIG. 1 (e.g., three groups).

In a preferred embodiment, the radial distance of each group 108, 110, 112, 114 is twice the radial distance of the smaller group nested immediately inside that group in accordance with the harmonic nesting approach. For example, in FIG. 1, the first group 108 is positioned on a radial centerline of 25.5 millimeters (mm) from the central point (0,0), the second group 110 is positioned on a radial centerline of 51 mm from the central point (i.e. twice the radial distance of the first group 108), the third group 112 is positioned on a radial centerline of 102 mm from the central point (i.e. twice the radial distance of the second group 110), and the fourth group 114 is positioned on a radial centerline of 204 mm from the central point (i.e. twice the radial distance of the third group 112).
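
The effect of these radii on driver spacing can be estimated from the geometry of a regular polygon: for a group of N evenly spaced drivers on a circle of radius r, the center-to-center distance between neighbors is 2·r·sin(π/N). The following sketch applies that formula to the radii above and the per-group driver counts of FIG. 1 (seven, fourteen, fourteen, and fourteen), assuming perfectly even spacing and the 20 mm driver diameter; the function and variable names are hypothetical.

```python
import math

DRIVER_DIAMETER_MM = 20.0

# group reference numeral -> (radial centerline in mm, number of drivers 102b)
rings = {108: (25.5, 7), 110: (51.0, 14), 112: (102.0, 14), 114: (204.0, 14)}

def chord_spacing(radius_mm, count):
    """Center-to-center distance between neighbors evenly spaced on a circle."""
    return 2.0 * radius_mm * math.sin(math.pi / count)

spacing = {}
edge_gap = {}
for group, (radius, count) in rings.items():
    spacing[group] = chord_spacing(radius, count)
    edge_gap[group] = spacing[group] - DRIVER_DIAMETER_MM  # clearance between driver edges
    print(f"group {group}: spacing {spacing[group]:.1f} mm, edge gap {edge_gap[group]:.1f} mm")
```

Under these assumptions, neighboring drivers in groups 108 and 110 are separated by only a few millimeters at their edges, consistent with those drivers being adjacent or nearly adjacent, while the gaps in groups 112 and 114 are substantially larger.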

In embodiments, each of the groups 108-114 may be at least slightly rotated relative to central axis 116 (e.g., the x-axis), which passes through the center point (0,0) of the array (e.g., the central speaker 102a), in order to optimize the directivity of the speaker array 100. For example, the rotational offset can be configured to eliminate undesired interference that can occur when more than two drivers 102 are aligned. In some embodiments, the groups 108-114 can be rotationally offset from each other, for example, by rotating each group a different number of degrees relative to the central axis 116, so that no more than two of the drivers 102 are axially aligned, or co-linear. In some embodiments, the number of degrees for the offset is an integer greater than one, or a multiple of that integer, and is selected to further avoid alignment and minimize co-linearity. For example, in the illustrated embodiment, each of the groups is rotationally offset from the x-axis 116 by 17 degrees or a multiple thereof. In particular, the first group 108 is offset by 17 degrees, the second group 110 is offset by 34 degrees, the third group 112 is offset by 51 degrees, and the fourth group 114 is offset by 68 degrees. In other embodiments, the rotational offset may be more arbitrarily implemented, if at all, and/or other methods may be utilized to optimize the overall directivity of the speaker array. Regardless of the method, rotationally offsetting the drivers 102 can configure the speaker array 100 to constrain sensitivity to the main lobes, thereby maximizing main lobe response and reducing side lobe response.
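
To make the illustrated geometry concrete, the following sketch generates driver coordinates from the radial centerlines, per-group driver counts, and rotational offsets given for FIG. 1. It is an illustration under stated assumptions rather than a definitive layout: drivers are assumed to be evenly spaced within each group, the first driver of each group is assumed to sit at the group's offset angle measured from the x-axis 116, and the separation check reflects only one possible reading of "axially aligned" (two drivers from different groups sharing the same polar angle). All function and variable names are hypothetical.

```python
import math

DRIVER_DIAMETER_MM = 20.0

# (group reference numeral, radius in mm, driver count, rotational offset in degrees)
GROUPS = [(108, 25.5, 7, 17.0), (110, 51.0, 14, 34.0),
          (112, 102.0, 14, 51.0), (114, 204.0, 14, 68.0)]

def driver_layout():
    """Return (group, polar angle in degrees, x in mm, y in mm) for every driver."""
    drivers = [(0, 0.0, 0.0, 0.0)]  # central driver 102a at (0,0); group 0 is a placeholder
    for group, radius, count, offset in GROUPS:
        for k in range(count):
            angle = (offset + 360.0 * k / count) % 360.0
            rad = math.radians(angle)
            drivers.append((group, angle, radius * math.cos(rad), radius * math.sin(rad)))
    return drivers

def min_cross_group_separation(drivers):
    """Smallest polar-angle difference (degrees) between drivers of different groups."""
    ring = [(g, a) for g, a, _, _ in drivers if g != 0]
    best = 360.0
    for i, (g1, a1) in enumerate(ring):
        for g2, a2 in ring[i + 1:]:
            if g1 != g2:
                diff = abs(a1 - a2) % 360.0
                best = min(best, diff, 360.0 - diff)
    return best

layout = driver_layout()
separation = min_cross_group_separation(layout)
# Overall aperture: outermost centerline diameter plus one driver diameter.
aperture_mm = 2 * GROUPS[-1][1] + DRIVER_DIAMETER_MM

print("drivers:", len(layout))
print("aperture (mm):", aperture_mm)
print("no shared polar angle across groups:", separation > 0.0)
```

Under these assumptions, the layout contains fifty drivers, no two drivers in different groups share a polar angle, and the overall aperture remains below the 60 centimeter figure described herein for the housing 106.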

As will be appreciated, FIG. 1 only shows an exemplary embodiment of the speaker array 100 and other configurations are contemplated in accordance with the principles disclosed herein. For example, while a specific number of drivers 102 and groups 108-114 are shown in the illustrated embodiment, other numbers and combinations of speaker elements are also contemplated, including adding more drivers and/or groups to help accommodate a wider frequency range (e.g., lower and/or higher frequencies). For example, by increasing the number of drivers 102b in each ring and/or the number of rings, a driver density across the array is also increased, which can help further minimize grating lobes and thereby, produce an improved beam width with a near constant frequency response across all frequencies within the preset range.

In some embodiments, the plurality of drivers 102 may be arranged in concentric rings around a central point, but without a driver positioned at the central point (e.g., without the central driver 102a). In other embodiments, only a portion of the drivers 102 may be arranged in concentric rings, and the remaining portion of the drivers 102 may be positioned at various points outside of, or in between, the discrete rings, at random locations on the support 104, in line arrays at the top, bottom and/or sides of the concentric rings, or in any other suitable arrangement. In some embodiments, the drivers 102 may be non-identical transducers. For example, some of the drivers 102 may be smaller (e.g., tweeters), while others may be larger (e.g., woofers), to help accommodate a wider range of frequencies.

FIG. 2 illustrates an exemplary speaker system 200 comprising a speaker array 202 and a beamforming system 204 electrically coupled to the speaker array 202 using a single cable 206, in accordance with embodiments. The speaker system 200 (also referred to herein as an “audio system”) can be configured to direct audio source material (e.g., input audio signal(s)) in a narrow, directed beam that is dynamically steerable and highly spatially controlled. In some embodiments, the speaker system 200 is configured to simultaneously output multiple streams, corresponding to different audio source materials, to multiple locations or listeners. The speaker system 200 may be used in open office environments, conference rooms, or other environments. In some embodiments, the speaker system 200 further includes one or more microphones to provide improved performance, including minimization of crosstalk and acoustic echo cancellation (AEC) through higher source receiver isolation, as well as spatialized and multi-lingual content streams, and for use in voice-lift applications.

The speaker array 202 can be comprised of a plurality of speaker elements or drivers arranged in a harmonically nested, concentric configuration, or other geometrically optimized configuration in accordance with the techniques described herein. In embodiments, the speaker array 202 may be substantially similar to the speaker array 100 shown in FIG. 1. The beamforming system 204 can be in communication with the individual speaker elements of the speaker array 202 and can be configured to beamform or otherwise process input audio signals and generate a corresponding audio output signal for each speaker element of the speaker array 202. In embodiments, the speaker array 202 can be configured to simultaneously produce a plurality of individual audio outputs using various speakers, or combinations of speakers, and direct each audio output towards a designated location or listener, as described with respect to FIG. 3.

Various components of the speaker system 200 may be implemented using software executable by one or more computers, such as a computing device with a processor and memory, and/or by hardware (e.g., discrete logic circuits, application specific integrated circuits (ASIC), programmable gate arrays (PGA), field programmable gate arrays (FPGA), digital signal processors (DSP), microprocessors, etc.). For example, some or all components of the beamforming system 204 may be implemented using discrete circuitry devices and/or using one or more processors (e.g., audio processor and/or digital signal processor) (not shown) executing program code stored in a memory (not shown), the program code being configured to carry out one or more processes or operations described herein, such as, for example, method 400 shown in FIG. 4. Thus, in embodiments, the system 200 may include one or more processors, memory devices, computing devices, and/or other hardware components not shown in FIG. 2. In one embodiment, the system 200 includes at least two separate processors, one for consolidating and formatting all of the speaker elements and another for implementing digital signal processing (DSP) functionality. In other embodiments, the system 200 may perform all functionality using one processor.

The single cable 206 can be configured to transport audio signals, data signals, and power between the beamforming system 204 and the speaker array 202. Though not shown, each of the beamforming system 204 and the speaker array 202 may include an external port for receiving either end of the cable 206. In embodiments, the external ports may be Ethernet ports configured to provide power, control, and audio connectivity to the components of the speaker system 200. In such embodiments, the single cable 206 may be an Ethernet cable (e.g., CAT5, CAT6, etc.) configured to be electrically coupled to the Ethernet port. In other embodiments, the speaker system 200 includes one or more other types of external ports (e.g., Universal Serial Bus (USB), mini-USB, PS/2, HDMI, VGA, serial, etc.), and the single cable 206 is configured for coupling to said other port.

The content transported via the cable 206 to and/or from the speaker array 202 may be provided by various components of the beamforming system 204. For example, electrical power may be supplied by a power source 208 (e.g., battery, wall outlet, etc.) configured to send power to the speaker array 202. The power source 208 may be an external power supply that is electrically coupled to the beamforming system 204, or an internal power source included in the beamforming system 204 and/or speaker system 200. In a preferred embodiment, the power signal is delivered through the cable 206 using Power Over Ethernet (PoE) technology (e.g., PoE++). As an example, the power source 208 may be configured to supply up to 100 watts of power (e.g., Level 4 PoE), and the cable 206 may be configured (e.g., by including at least four twisted pairs of wires) to deliver at least 75 watts to the speaker array 202.

The audio data may be provided by an audio processing system 210 of the beamforming system 204 for transmission to the speaker array 202 over the cable 206. The audio processing system 210 can be configured to receive audio signals from one or more audio sources (not shown) coupled to the speaker system 200 and perform prescribed beamforming techniques to steer and focus sound beams to be output by the speaker array 202, for example, as described with respect to FIG. 3. The audio processing system 210 may include one or more audio recorders, audio mixers, amplifiers, audio processors, bridge devices, and/or other audio components for processing electrical audio signals. In some embodiments, the audio processing system 210 can be configured to receive audio over multiple input channels and combine the received audios into one or more output channels. In some embodiments, the audio processing system 210 can be configured to direct different audio sources to different listeners of the speaker array 202. For example, in a conference room with listeners that speak different languages, the audio processing system 210 can be configured to provide each listener with a separate sound beam containing audio in the respective language of that listener.

The data signals transported over the cable 206 may include control information received from a user interface 212 of the beamforming system 204 for transmission to the speaker array 202, information provided by the audio processing system 210 for transmission to the speaker array 202, and/or information transmitted by the speaker array 202 to the beamforming system 204. As an example, the control information may include adjustments to parameters of the speaker array 202, such as, e.g., directionality, steering, gain, noise suppression, pattern forming, muting, frequency response, etc. In some embodiments, a user of the speaker system 200 may use the user interface 212 to enter control information designed to steer discrete lobes of the speaker array 202 to a particular angle, direction or location (e.g., using point and steer techniques) and/or change a shape and/or size of the lobes (e.g., using magnitude shading, lobe stretching, and/or other lobe shaping techniques).

In some cases, the user interface 212 includes a control panel coupled to a control device or processor of the beamforming system 204, the control panel including one or more switches, dimmer knobs, buttons, and the like. In other cases, the user interface 212 may be implemented using a software application executed by a processor of the beamforming system 204 and/or a mobile or web application executed by a processor of a remote device communicatively coupled to the beamforming system 204 via a wired or wireless communication network. In such cases, the user interface 212 may include a graphical layout for enabling the user to change filter values, delay values, beam width, and other controllable parameters of the audio processing system 210 using graphical sliders and buttons and/or other types of graphical inputs. The remote device may be a smartphone or other mobile phone, laptop computer, tablet computer, desktop computer, or other computing device configured to enable remote user control of the audio processing system 210 and/or speaker array 202. In some embodiments, the beamforming system 204 includes a wireless communication device (not shown) (e.g., a radio frequency (RF) transmitter and/or receiver) for facilitating wireless communication with the remote device (e.g., by transmitting and/or receiving RF signals).

Though FIG. 2 shows one speaker array 202, other embodiments may include multiple speaker arrays 202, or an array of the speaker arrays 202. In such cases, a separate cable 206 may be used to couple each array 202 to the beamforming system 204 (for example, as shown in FIG. 11 and described herein), and the audio processing system 210 may be configured to handle beamforming and other audio processing for all of the arrays 202. As an example, in some cases, two speaker arrays 202 may be placed side-by-side within one area or room. In other cases, four speaker arrays 202 may be placed respectively in the four corners of a space or room.

FIG. 3 illustrates an exemplary audio processing system 300 for processing input audio signals to generate individual beamformed audio outputs for each of a plurality of highly steerable, highly controllable speaker elements 302, in accordance with embodiments. In particular, the audio processing system 300 includes a beamformer 304 configured to receive one or more audio input signals and generate a separate beamformed audio signal, a_n, for each of n speaker elements 302. In embodiments, the audio processing system 300 may be the same as, or similar to, the audio processing system 210 shown in FIG. 2, and the speaker elements 302 may be the same as, or similar to, the speaker elements of the speaker array 202 in FIG. 2 and/or the drivers 102 shown in FIG. 1. For example, the audio processing system 300 may be configured to individually control and/or steer each of the fifty drivers 102 included in the speaker array 100 shown in FIG. 1.

In embodiments, beamformer 304 comprises a filter system 306 and a plurality of delay elements 308 configured to apply pattern forming, steering, and/or other beamforming techniques to individually control the output of each speaker element 302. To help streamline these processes, sub-nests can be formed among the speaker elements 302 so as to cover specific frequency bands. For example, each sub-nest may include a collection of two or more concentric groups of speaker elements 302, a concentric group of elements plus the speaker element positioned at the center of the speaker array, a concentric group by itself, or a combination thereof. In some cases, a given speaker element 302 or group of elements may be used in more than one sub-nest. The exact number of speaker elements 302 or groups included in a given sub-nest may depend on the frequency band assigned to that sub-nest and/or an expected performance for that sub-nest.

In embodiments, beamformer 304 is implemented using one or more audio processors configured to process the input audio signal(s), for example, using filter system 306 and delay elements 308. Each processor (not shown) may comprise a digital signal processor and/or other suitable hardware (e.g., microprocessor, dedicated integrated circuit, field programmable gate array (FPGA), etc.). In one embodiment, beamformer 304 is implemented using two audio processors having 24 outputs each. In such cases, beamformer 304 can be configured to provide up to 48 outputs and therefore, can be connected to up to 48 speaker elements or drivers 302. As will be appreciated, more or fewer processors may be used so that beamformer 304 can accommodate a larger or smaller number of drivers in the speaker array.

Various components of beamformer 304, and/or the overall audio processing system 300, may be implemented using software executable by one or more computers, such as a computing device with a processor and memory, and/or by hardware (e.g., discrete logic circuits, application specific integrated circuits (ASIC), programmable gate arrays (PGA), field programmable gate arrays (FPGA), digital signal processors (DSP), microprocessors, etc.). For example, filter systems 306 and/or delay elements 308 may be implemented using discrete circuitry devices and/or using one or more data processors executing program code stored in a memory, the program code being configured to carry out one or more processes or operations described herein, such as, for example, all or portions of method 400 shown in FIG. 4. In some embodiments, audio processing system 300 may include additional processors, memory devices, computing devices, and/or other hardware components not shown in FIG. 3.

As shown, audio processing system 300 also includes a plurality of amplifiers 310 coupled between the beamformer 304 and the plurality of speaker elements 302, such that each output of the beamformer 304 is coupled to a respective one of the amplifiers 310, and each amplifier 310 is coupled to a respective one of the speaker elements 302. During operation, a magnitude of each individual audio signal, a_n, generated by the beamformer 304 for a given speaker element n is amplified by a predetermined amount of gain, or gain factor (e.g., 0.5, 1, 2, etc.), before being provided to the corresponding speaker element n. In some embodiments, the gain factor for each amplifier 310 may be selected to ensure a uniform output from the speaker elements 302, i.e. matching in magnitude. As will be appreciated, the exact number of amplifiers 310 included in the audio processing system 300 can depend on the number of speaker elements 302 included in the speaker array. In embodiments, the amplifiers 310 may be class D amplifiers or switching amplifiers, another type of electric amplifier, or any other suitable amplifier.

If the input audio signals are analog signals, the audio processing system 300 may further include an analog-to-digital converter 312 for converting the analog audio signal into a digital audio signal before it reaches the beamformer 304 for digital signal processing. In such cases, the individual audio signals a_n may be digital audio signals that, for example, conform to the Dante standard or another digital audio standard. The audio processing system 300 may also include a digital-to-analog converter 314 for converting each individual audio signal a_n back into an analog audio signal prior to amplification by the respective amplifier 310.

In some embodiments, the audio processing system 300 can further include a database 316 configured to store information used by the beamformer 304 to generate individual audio signals a₁ through a_n. The information may include filter coefficients and/or weights for configuring the filter system 306 and/or specific time delay values or coefficients (e.g., z^−k) for configuring the delay elements 308. The database 316 may store this information in a look up table or other suitable format. As an example, the table may list different filter coefficients and/or weights, as well as time delay values, for each of the speaker elements 302 and/or for each sub-nest or group of speaker elements (e.g., groups 108-114 in FIG. 1). In other embodiments, such information is programmatically generated by a processor of the audio processing system 300 and provided to the beamformer 304 as needed, to generate the individual audio signals a₁ through a_n.

In embodiments, the filter system 306 may be configured to apply crossover filtering to the input audio signal to generate an appropriate audio output signal for each speaker element 302. The crossover filtering may include applying various filters to the input audio signal in order to isolate the signal into different or discrete frequency bands. For example, referring back to FIG. 1, there is an inverse relationship between the radial distance of each group 108-114 of drivers in the speaker array 100 and the frequency band(s) that can be optimally covered by that group. Specifically, larger apertures have a narrower low frequency beam width, and smaller apertures have more control at high frequencies. In embodiments, crossover filtering can be applied to stitch together an ideal frequency response for the speaker array 100 across a full range of operating frequencies, with better performance than that of a line array or other speaker array configurations.

As shown, the filter system 306 includes a plurality of filter banks 318, each filter bank 318 comprising a preselected combination of filters for implementing crossover filtering to generate a desired audio output. In embodiments, the filter banks 318 may be configured to set a constant beam width for the audio output of the speaker array across a wide range of frequencies. The individual filters may be configured as bandpass filters, low pass filters, high pass filters, or any other suitable type of filter for optimally isolating a particular frequency band of the input audio signal. The cutoff frequencies for each individual filter may be selected based on the specific frequency response characteristics of the corresponding sub-nest and/or speaker element, including, for example, location of frequency nulls, a desired frequency response for the speaker array, etc. The filter system 306 may include digital filters and/or analog filters. In some embodiments, the filter system 306 includes one or more finite impulse response (FIR) filters and/or infinite impulse response (IIR) filters.
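As one hedged illustration of crossover filtering with FIR filters, the following Python sketch splits a signal into a low band and a high band. It is not taken from the embodiments: the windowed-sinc design, the Hamming window, the complementary high-pass construction, the tap count, and all function names are assumptions chosen for simplicity. The complementary construction has the convenient property that the two bands sum back to a delayed copy of the input.

```python
import math


def lowpass_fir(cutoff_hz, sample_rate_hz, num_taps=101):
    """Windowed-sinc low-pass FIR (Hamming window) with unity DC gain."""
    if num_taps < 3 or num_taps % 2 == 0:
        raise ValueError("num_taps must be odd and at least 3")
    mid = (num_taps - 1) // 2
    fc = cutoff_hz / sample_rate_hz  # normalized cutoff in cycles per sample
    taps = []
    for i in range(num_taps):
        n = i - mid
        if n == 0:
            ideal = 2.0 * fc
        else:
            ideal = math.sin(2.0 * math.pi * fc * n) / (math.pi * n)
        window = 0.54 - 0.46 * math.cos(2.0 * math.pi * i / (num_taps - 1))
        taps.append(ideal * window)
    total = sum(taps)
    return [t / total for t in taps]


def complementary_highpass(lowpass_taps):
    """High-pass taps equal to a centered unit impulse minus the low-pass."""
    taps = [-t for t in lowpass_taps]
    taps[(len(taps) - 1) // 2] += 1.0
    return taps


def convolve(signal, taps):
    """Full linear convolution of a signal with FIR taps."""
    out = [0.0] * (len(signal) + len(taps) - 1)
    for i, s in enumerate(signal):
        for j, t in enumerate(taps):
            out[i + j] += s * t
    return out


def two_band_crossover(signal, cutoff_hz, sample_rate_hz, num_taps=101):
    """Return (low_band, high_band) for a hypothetical two-way crossover."""
    lp = lowpass_fir(cutoff_hz, sample_rate_hz, num_taps)
    hp = complementary_highpass(lp)
    return convolve(signal, lp), convolve(signal, hp)
```

For example, the low band could be routed to a sub-nest with a larger aperture and the high band to a sub-nest with a smaller aperture, with additional bands obtained by cascading the same split.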

In some embodiments, the filter system 306 includes a separate filter bank 318 for each sub-nest of the speaker array, with N being the total number of sub-nests, and each filter bank 318 includes a separate filter for each speaker element 302 included in the corresponding sub-nest. In such cases, the exact number of filter banks 318, and the number of filters included therein, can depend on the number of sub-nests, as well as the number of speaker elements 302 included in each sub-nest. For example, in one embodiment, the speaker elements 302 may be configured as, or collected into, three different sub-nests to cover three different frequency bands and so, the filter system 306 may include three filter banks 318, one for each sub-nest. In another example embodiment, the speaker elements 302 may be configured to operate in four different sub-nests, so the filter system 306 includes at least four filter banks 318.

In still other embodiments, the filter system 306 can include a separate filter bank 318 for each of the speaker elements 302 or a separate filter bank 318 for each group of elements (e.g., groups 108, 110, 112, 114 in FIG. 1). In the latter case, for example, referring back to the speaker array 100 shown in FIG. 1, each of the groups 108, 110, 112, and 114 may be assigned a separate filter bank A, B, C, and D, respectively, from the filter system 306. Filter bank A may include at least seven individual filters, A₁ through A₇, one for each of the seven drivers 102b included in group 108, filter bank B may include at least fourteen individual filters, B₁ through B₁₄, one for each of the fourteen drivers included in group 110, and so on. In some embodiments, filter bank A may also include an eighth filter A₈ for covering the central driver 102a.

The filter system 306 may further include additional elements not shown in FIG. 3, such as, for example, one or more summation elements for combining two or more filtered outputs in order to generate the individual audio signal a_n for speaker element n. In some embodiments, the filtered outputs for select speaker elements 302, groups, and/or sub-nests may be combined or summed together to create a desired polar pattern, or to steer a main lobe of the speaker array towards a desired angular direction, or azimuth and elevation, such as, e.g., 30 degrees, 45 degrees, etc. In some embodiments, appropriate filter coefficients or weights may be retrieved from database 316 and applied to the audio signals generated for each sub-nest and/or speaker element 302 to create different polar patterns and/or steer the lobes to a desired direction.

As shown, each individual audio signal a_n output by the filter system 306 is provided to a respective one of the delay elements 308 before exiting the beamformer 304. Each delay element 308 can be individually associated with a respective one of the speaker elements 302 and can be configured to apply an appropriate amount of time delay (e.g., z⁻¹) to the filtered output a_n received at its input. In embodiments, the delay value for a given speaker element 302 can be retrieved from the database 316 or programmatically generated (e.g., using software instructions executed by a processor), similar to the filter coefficients and/or weights used for the filter system 306. For example, each speaker element 302 may be assigned a respective amount of delay (or delay value), and such pairings may be stored in the database 316. The exact amount of delay applied in association with each speaker element 302 can vary depending on, for example, a desired polar pattern, a desired steering angle and/or shape of the main lobe, and/or other beamforming aspects.
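As a minimal sketch of how per-driver delay values might be programmatically generated for a desired steering angle, the following Python example uses conventional far-field delay-and-sum geometry for a planar array. The speed of sound, the angle convention (steering angle measured from the broadside direction, azimuth measured in the plane of the array), the rounding to whole samples, and the function names are assumptions of this sketch rather than details of the embodiments.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # ASSUMED value, roughly air at room temperature


def steering_delays(positions_m, steer_deg, azimuth_deg=0.0):
    """Return per-driver delays in seconds for a far-field steered beam.

    positions_m: (x, y) driver positions in meters in the array plane.
    steer_deg:   angle away from broadside (0 means no steering).
    azimuth_deg: direction of the tilt within the array plane.
    """
    theta = math.radians(steer_deg)
    phi = math.radians(azimuth_deg)
    ux = math.sin(theta) * math.cos(phi)
    uy = math.sin(theta) * math.sin(phi)
    # Distance of each driver along the steering direction. Drivers lying
    # further along that direction must emit later for wavefronts to align.
    projections = [x * ux + y * uy for x, y in positions_m]
    earliest = min(projections)
    return [(p - earliest) / SPEED_OF_SOUND_M_S for p in projections]


def to_samples(delays_s, sample_rate_hz):
    """Round delays to whole samples (a simplification; no fractional delay)."""
    return [round(d * sample_rate_hz) for d in delays_s]


# Two hypothetical drivers 0.2 m apart, steered 30 degrees toward +x.
pair = [(-0.1, 0.0), (0.1, 0.0)]
delays = steering_delays(pair, 30.0)
delay_samples = to_samples(delays, 48000.0)
```

Values produced this way could be stored in a look up table keyed by speaker element, or regenerated whenever the steering angle changes.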

In some embodiments, the audio processing system 300 also includes one or more microphones 320 for detecting sound in a given environment and converting the sound into an audio signal for the purpose of implementing acoustic echo cancellation (AEC), voice lift, and other audio processing techniques designed to improve the performance of the speaker array. In some embodiments, the one or more microphones 320 may be arranged inside the speaker enclosure (such as, e.g., housing 106 of FIG. 1). In other embodiments, the one or more microphones 320 may be physically separate from the speaker elements 302, but communicatively coupled to the audio processing system 300 and positioned in the same room or location. The microphone(s) 320 may include any suitable type of microphone element, such as, e.g., a micro-electrical mechanical system (MEMS) transducer, condenser microphone, dynamic transducer, piezoelectric microphone, etc. In some embodiments, the microphone 320 is a standalone microphone array, for example, as shown in FIG. 12 and described below.

FIG. 4 illustrates an exemplary method 400 of generating a beamformed audio output for a speaker array comprising a plurality of speaker elements or drivers arranged in a concentric, nested configuration (e.g., as shown in FIG. 1), in accordance with embodiments. All or portions of the method 400 may be performed by one or more processors and/or other processing devices (e.g., analog to digital converters, encryption chips, etc.) within or external to the speaker array (such as, e.g., speaker array 202 shown in FIG. 2). In addition, one or more other types of components (e.g., memory, input and/or output devices, transmitters, receivers, buffers, drivers, discrete components, logic circuits, etc.) may also be utilized in conjunction with the processors and/or other processing components to perform any, some, or all of the steps of the method 400. For example, program code stored in a memory of the audio processing system 300 shown in FIG. 3 may be executed by the beamformer 304 to carry out one or more operations of the method 400. Each audio output signal generated by the audio processing system 300 may be provided to a respective one of the drivers included in the speaker array (e.g., speaker elements 302 shown in FIG. 3 or drivers 102 shown in FIG. 1). The drivers can be arranged in a plurality of concentric groups positioned at different radial distances to form a nested configuration (e.g., groups 108-114 in FIG. 1).

The method 400 begins at step 402 with receiving one or more input audio signals from an audio source. The input audio signals may be received at one or more processors, such as, e.g., beamformer 304 shown in FIG. 3. In some embodiments, step 402 may include receiving at least two different input audio signals over at least two different channels. In such cases, the method 400 may be configured to simultaneously process or beamform the at least two signals and generate at least two audio outputs directed to at least two different locations or listeners using the same speaker array. For example, certain steps of the method 400 may be performed multiple times, in parallel, in order to generate the two or more outputs. In other embodiments, step 402 may include combining input audio signals received over different channels to create one input audio signal for the beamformer 304.

At step 404, the one or more processors generate a separate audio output signal for each driver included in the speaker array based on at least one of the one or more input audio signals, as well as a desired beamforming result and characteristics related to the driver's position in the speaker array, including, for example, the particular group in which the driver is located. The audio output may be generated using crossover filtering, delay and sum processing, weight and sum processing, and/or other beamforming techniques for manipulating magnitude, phase, and delay values for each individual driver in order to steer the main lobe towards a desired location or listener and maintain a constant beam width across a wide range of frequencies. In embodiments, generating an audio output signal for each driver at step 404 can include obtaining one or more filter values and at least one delay value associated with the driver. At least one of the one or more filter values may be assigned to the driver based on the concentric group in which the driver is located. For example, in some embodiments, the groups of drivers may be combined to form two or more sub-nests for audio processing purposes, and all drivers belonging to a particular sub-nest can be assigned at least one common filter value. On the other hand, the time delay value may be specific to each driver. The filter values and delay values may be retrieved from a database (e.g., database 316 in FIG. 3) or generated by the one or more processors, as described herein.

The generating process at step 404 can also include applying the at least one filter value to one or more filters (e.g., filter system 306 in FIG. 3) to produce a filtered output signal for the respective driver, providing the filtered output signal to a delay element (e.g., delay element 308 in FIG. 3) associated with the driver, and applying the at least one delay value to the delay element to produce a delayed output signal for that driver. In some embodiments, the generating step can further include providing the delayed output signal to a power amplifier (e.g., amplifier 310 in FIG. 3) in order to amplify the signal by a predetermined gain amount. In some cases, the predetermined gain amount may be selected based on the driver coupled to the amplifier. In other cases, the gain amount can be determined or set by the processor during step 404 in order to ensure uniform outputs across all speaker elements.
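The filter, delay, and gain sequence of step 404 can be sketched, under stated assumptions, as the following Python example. The look up table layout, the dictionary keys, the use of FIR taps as the filter values, whole-sample delays, and the example numbers are all hypothetical and are shown only to make the order of operations concrete.

```python
def fir_filter(signal, taps):
    """Causal FIR filter; the output has the same length as the input."""
    out = []
    for n in range(len(signal)):
        acc = 0.0
        for k, t in enumerate(taps):
            if n - k >= 0:
                acc += t * signal[n - k]
        out.append(acc)
    return out


def delay(signal, num_samples):
    """Delay by a whole number of samples, keeping the original length."""
    if num_samples <= 0:
        return list(signal)
    return ([0.0] * num_samples + list(signal))[:len(signal)]


def render_driver_outputs(input_signal, table):
    """Apply filter, then delay, then gain for every driver in the table."""
    outputs = {}
    for driver_id, entry in table.items():
        filtered = fir_filter(input_signal, entry["taps"])
        delayed = delay(filtered, entry["delay_samples"])
        outputs[driver_id] = [entry["gain"] * s for s in delayed]
    return outputs


# HYPOTHETICAL look up table: one entry per driver.
table = {
    0: {"taps": [1.0], "delay_samples": 0, "gain": 1.0},
    1: {"taps": [0.5, 0.5], "delay_samples": 2, "gain": 2.0},
}
outputs = render_driver_outputs([1.0, 0.0, 0.0, 0.0, 0.0], table)
```

With the unit impulse used here, each driver's output is simply its filter taps, shifted by its delay and scaled by its gain.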

Step 406 involves providing the generated audio output signals to the corresponding drivers of the speaker array in order to produce a beamformed audio output. In embodiments, the audio output signals are transmitted to the speaker array over a single cable configured to transport audio, data, and power. The method 400 may end after completion of step 406.

FIG. 5 is a diagram 500 of exemplary anechoic frequency responses of the full speaker array 100 shown in FIG. 1, measured at a distance of two meters from the speaker array in accordance with embodiments. A first response plot 502 corresponds to the frequency response of the full speaker array 100 from a broadside direction, or without any lobe steering. As shown, the response plot 502 is substantially flat for most of the voice frequency range (e.g., 300 Hz to 3.4 kHz), with the frequency response dropping off at very low frequencies (e.g., a 3 decibel (dB) down point around 400 Hz) and very high frequencies (e.g., above 7000 Hz). A second response plot 504 corresponds to the frequency response of the full speaker array 100 when the main lobe is steered thirty degrees to the right relative to a plane of the array, and still at a distance of 2 meters. As shown, the second response plot 504 is substantially consistent with or similar to the first response plot 502. That is, like plot 502, the second response plot 504 is substantially flat for most of the voice frequency range, except for drop offs at the same very low and very high frequencies. Thus, FIG. 5 illustrates that the speaker array 100 is capable of maintaining a constant frequency response across a wide range of frequencies even after steering.

FIGS. 6A and 6B and FIGS. 7A and 7B are diagrams of exemplary polar responses of the speaker array 100 shown in FIG. 1, measured at a distance of two meters from the speaker array, in accordance with embodiments. Each polar response or pattern represents the directionality of the speaker array 100 for a given frequency at different angles about a central axis of the array. As will be appreciated, while the polar plots in FIGS. 6-7 show the polar responses of a single lobe at selected frequencies, the speaker array 100 is capable of creating multiple simultaneous lobes in multiple directions, each with equivalent, or at least substantially similar, polar response.
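For readers who wish to explore how a polar response of this kind can be estimated, the following Python sketch computes a normalized far-field response for an array of ideal point sources. It is a simplified model and not the measurement procedure used for the figures: it assumes omnidirectional drivers, pure delay steering, a fixed speed of sound, and observation and steering angles that both lie in the plane containing the x-axis and the array normal. All function names are hypothetical.

```python
import cmath
import math

SPEED_OF_SOUND_M_S = 343.0  # ASSUMED value


def polar_response(positions_m, freq_hz, obs_deg, steer_deg=0.0):
    """Normalized magnitude (0 to 1) at obs_deg for a beam steered to steer_deg.

    Angles are measured from broadside. Only the x-coordinate of each
    driver matters for this cut.
    """
    k = 2.0 * math.pi * freq_hz / SPEED_OF_SOUND_M_S  # wavenumber, rad/m
    diff = math.sin(math.radians(obs_deg)) - math.sin(math.radians(steer_deg))
    total = sum(cmath.exp(1j * k * x * diff) for x, _y in positions_m)
    return abs(total) / len(positions_m)


def to_db(magnitude, floor=1e-12):
    """Convert a linear magnitude to decibels, clamped to avoid log(0)."""
    return 20.0 * math.log10(max(magnitude, floor))


# Two hypothetical drivers spaced half a wavelength apart at 1715 Hz.
pair = [(-0.05, 0.0), (0.05, 0.0)]
on_axis = polar_response(pair, 1715.0, 0.0)
end_fire = polar_response(pair, 1715.0, 90.0)
```

In this model the response is always at its maximum of 1 in the steered direction, and sweeping the observation angle yields a polar pattern from which main lobe width and side lobe levels can be read.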

Polar plots 600-614 shown in FIGS. 6A and 6B provide the polar responses of the speaker array 100 from a broadside direction at frequencies of 350 Hz, 950 Hz, 1250 Hz, 2000 Hz, 3000 Hz, 4000 Hz, 6000 Hz, and 7000 Hz, respectively. Polar plots 700-714 shown in FIGS. 7A and 7B provide the polar responses of the speaker array 100 when steered thirty degrees to the right relative to a plane of the array 100, for the same set of frequencies, respectively. As demonstrated by the polar patterns in FIGS. 6A and 6B, the speaker array 100 can form a main lobe, or directional sound beam, with minimal side lobes at each of the indicated frequencies, when broadside or without any steering. And as demonstrated by the polar patterns in FIGS. 7A and 7B, when steered 30 degrees to the right, the speaker array 100 still forms a main lobe with minimal side lobes at each of the indicated frequencies. Thus, FIGS. 6-7 show that the speaker array 100 is capable of being steered at least 30 degrees to the right without sacrificing the main to side lobe ratio across a wide range of frequencies.

FIGS. 6-7 also show that the speaker array 100 exhibits higher directivity, or narrower beam widths, at higher frequencies, for example, as shown by polar plots 612 and 614 representing 6000 and 7000 Hz, respectively, and somewhat lower directivity at the lower frequencies, with the lowest frequency, 350 Hz, having the largest beam width, as shown by polar plots 600 and 700. Still, FIGS. 6-7 show that the side lobes are formed at least 12 decibels (dB) below the main lobe. Thus, the speaker array 100 provides a high overall directivity index across the voice frequency range with a high level of side lobe rejection and an optimal main-to-side-lobe ratio (e.g., 12 dB) over a prescribed steering angle range.

FIGS. 8-10 illustrate various exemplary applications or use cases of the speaker array 100 shown in FIG. 1 being used to dynamically steer localized sound and create spatialized audio, in accordance with embodiments. In each example, the speaker array 100 is configured to generate multiple lobes (or localized sound beams) with specific sizes, shapes, and/or steering directions based on audio output signals received from, for example, beamforming system 204 shown in FIG. 2. The beamforming system 204 may generate the audio output signal(s) by applying beamforming techniques to one or more input audio signals, as described herein. For example, the beamforming techniques can be configured to manipulate magnitude, phase, and/or delay characteristics of the input audio signal(s) to dynamically direct or steer each sound beam towards a specific location. The beamforming techniques can also be configured to apply a shaping function (e.g., using magnitude shading) for stretching the beam along a selected axis.
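
As a minimal sketch of the delay-based steering described above, per-element delays for a simple delay-and-sum approach can be computed from the element positions and the desired beam direction. The sketch assumes a far-field target, elements lying in a single plane, and a speed of sound of 343 m/s; the function name, angle conventions, and example geometry are hypothetical and are not taken from the beamforming system 204.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value


def steering_delays(positions, azimuth_deg, elevation_deg):
    """Return non-negative per-element delays (seconds) for a planar array.

    positions:     list of (x, y) element coordinates in meters, in the array plane.
    azimuth_deg:   direction of the tilt within the array plane, from the +x axis.
    elevation_deg: tilt away from broadside (0 means straight out of the plane).
    """
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    # In-plane components of the unit vector pointing toward the target.
    ux = math.sin(el) * math.cos(az)
    uy = math.sin(el) * math.sin(az)
    # Elements that sit farther along the steering direction are closer to a
    # far-field target, so they must fire later for the wavefronts to align.
    projections = [x * ux + y * uy for x, y in positions]
    earliest = min(projections)
    return [(p - earliest) / SPEED_OF_SOUND for p in projections]
```

For example, `steering_delays([(-0.1, 0.0), (0.0, 0.0), (0.1, 0.0)], 0.0, 30.0)` leaves the leftmost element undelayed and delays the others progressively, while an elevation of zero (broadside) yields all-zero delays.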

More specifically, FIG. 8 depicts an exemplary environment 800 in which the speaker array 100 is disposed above a table 802 having a number of human listeners (not shown) situated around or adjacent to the table 802. The environment 800 also includes an open microphone 804 positioned at one end of the table 802 to implement acoustic echo cancellation (AEC) and/or voice lift applications. In the illustrated example, the speaker array 100 has been configured to direct audio outputs, demonstrated by lobes 806, 808, and 810, towards three discrete listeners or locations positioned adjacent to each other along one side of the table 802, while also steering the lobes 806, 808, 810 away from the open microphone 804 to improve AEC functionality. In the case of voice-lift applications, for example, in a conferencing environment, the microphone 804 may be used to capture sound produced by one or more human speakers positioned adjacent to or near the microphone 804, and the steerable lobes of the speaker array 100 may be used to direct the captured sound towards listeners that are outside of an audible range of the human speaker(s) and/or are further away from the microphone 804.

FIG. 9 depicts an exemplary environment 900 in which the speaker array 100 is disposed in an oddly or irregularly shaped room 902. In such cases, the speaker array 100 can be configured to direct multiple sound beams or lobes towards the various segments or corners of the room 902 so as to minimize room reflections. For example, as shown in FIG. 9, a first set of lobes 904 may be generally directed towards a first irregularly shaped segment or alcove of the room 902, but the lobes 904 themselves may be steered away from each other to minimize reflections. This lobe configuration may be repeated for each segment of the room 902, so that each lobe 904 is steered away from the other lobes 904 and towards a unique or different direction, as shown in FIG. 9.

FIG. 10 depicts an exemplary environment 1000 in which the speaker array 100 is configured to produce various lobe shapes to accommodate different scenarios. In the illustrated example, lobe 1002 has a rounded, nearly circular shape that provides a wider beam, while lobes 1004 and 1006 have elongated, oval shapes that provide a narrower, more directed beam. Other shapes are also contemplated. Lobe shaping may be managed using magnitude shading and/or other beamforming techniques, including, for example, through selection of appropriate filter weights for the filter system 306 shown in FIG. 3 and appropriate delay coefficients for the delay elements 308, also shown in FIG. 3.
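
The effect of magnitude shading on lobe width can be illustrated with a simplified model. The sketch below assumes a single line of uniformly spaced, omnidirectional elements rather than the concentric layout of FIG. 1, and uses a raised-cosine (Hamming-style) taper as one possible shading function; the function names and the 343 m/s speed of sound are assumptions made for illustration only.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed


def taper_weights(num_elements, pedestal=0.54):
    """Raised-cosine magnitude shading; pedestal=1.0 gives uniform weights."""
    if num_elements == 1:
        return [1.0]
    return [
        pedestal - (1.0 - pedestal) * math.cos(2.0 * math.pi * n / (num_elements - 1))
        for n in range(num_elements)
    ]


def beam_response(weights, spacing_m, freq_hz, look_deg, steer_deg=0.0):
    """Normalized far-field magnitude (0..1) of a uniformly spaced line of elements.

    look_deg is the observation angle and steer_deg the steering angle, both
    measured from broadside.
    """
    k = 2.0 * math.pi * freq_hz / SPEED_OF_SOUND  # wavenumber
    shift = math.sin(math.radians(look_deg)) - math.sin(math.radians(steer_deg))
    total = sum(
        w * cmath.exp(1j * k * n * spacing_m * shift) for n, w in enumerate(weights)
    )
    return abs(total) / sum(weights)
```

With eight elements at half-wavelength spacing, the uniform weighting has its first null where the sine of the look angle is 0.25, whereas the shaded weighting still radiates appreciably at that angle. In this simplified model, heavier shading therefore gives a wider lobe and lighter shading a narrower one.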

FIG. 11 illustrates an exemplary audio system 1100 (or “eco-system”) comprising one or more planar speaker arrays 1102, a beamforming system 1104, and at least one microphone 1120, in accordance with embodiments. The audio system 1100 can be configured to output audio signals received from an audio source 1124 in one or more narrow, directed beams that are dynamically steerable and highly spatially controlled, similar to the steerable speaker system 200 shown in FIG. 2 and described herein. Through the use of microphone(s) 1120 and appropriate audio processing techniques, the audio system 1100 can also provide improved audio performance, such as, for example, crosstalk minimization and acoustic echo cancellation (AEC) through higher source receiver isolation, spatialized audio streams, and voice-lift applications. In some embodiments, the audio system 1100 can be configured to simultaneously output multiple streams corresponding to different audio source materials (e.g., multi-lingual content streams) to multiple locations or listeners. The audio system 1100 may be used in open office environments, conference rooms, museums, performance stages, airports, and other large-scale environments with multiple potential listeners.

Each speaker array 1102 can include a plurality of speaker elements or drivers arranged in a planar configuration. For example, the speaker elements may be arranged in a harmonically nested, concentric configuration (e.g., as shown in FIG. 1) or other geometrically optimized configuration in accordance with the techniques described herein. In embodiments, each planar speaker array 1102 may be substantially similar to the steerable speaker array 202, as shown in FIG. 2 and described herein, and/or the speaker array 100, as shown in FIG. 1 and described herein.

The beamforming system 1104 can be in communication with the individual speaker elements of each speaker array 1102 and can be configured to beamform or otherwise process input audio signals and generate a corresponding audio output signal for each speaker element of each speaker array 1102. In this manner, the speaker array(s) 1102 can be configured to simultaneously produce a plurality of individual audio outputs using various speaker elements, or combinations of speaker elements, and direct each audio output towards a designated location or listener. In embodiments, the beamforming system 1104 may be substantially similar to the beamforming system 204, as shown in FIG. 2 and described herein, and may include an audio processing system that is substantially similar to the audio processing system 300, as shown in FIG. 3 and described herein.

As shown in FIG. 11, the audio system 1100 may include any number of speaker arrays 1102, and each speaker array 1102 may be coupled to the beamforming system 1104 via a single cable 1106. The cable 1106 can be configured to transport one or more of data signals, audio signals, and power between the beamforming system 1104 and the speaker array 1102 coupled thereto, with a preferred embodiment transporting all three (i.e., data (or control), audio, and power). In embodiments, each single cable 1106 can be substantially similar to the cable 206, as shown in FIG. 2 and described herein. For example, like the cable 206, the cables 1106 may be Ethernet cables (e.g., CAT5, CAT6, etc.) configured to be electrically coupled to respective Ethernet ports included in each of the speaker arrays 1102 and in the beamforming system 1104. In such cases, the power signal may be delivered through the cables 1106 using Power over Ethernet (PoE) technology, as described herein. Other types of cables and corresponding external ports are also contemplated, as also described herein. The power source supplying the power signal may be housed in the beamforming system 1104 (e.g., as shown in FIG. 2) or may be coupled to the beamforming system 1104 to provide power thereto.

The microphone 1120 can include any suitable type of microphone transducer or element capable of detecting sound in a given environment and converting the sound into an audio signal for implementing acoustic echo cancellation (AEC), voice lift, crosstalk minimization, dynamic lobe steering, and other audio processing techniques designed to improve performance of the speaker array(s) 1102. In embodiments, the microphone 1120 can be substantially similar to the microphone 320 shown in FIG. 3. The microphone 1120 can be communicatively coupled to the beamforming system 1104 using a single cable 1122 that is similar to the single cable 1106. For example, the cable 1122 may be configured to transport power, data signals, and/or audio signals between the beamforming system 1104 and the microphone array 1120. The audio signal output generated by the microphone 1120 may be digital or analog. If analog, the microphone 1120 may include one or more components, such as, e.g., analog to digital converters, processors, etc., for processing the analog audio signals and converting them into digital audio signals. The digital audio signals may conform to the Dante standard for transmitting audio over Ethernet, for example, or other network standard.

As shown in FIG. 11, the microphone 1120 can be a standalone microphone array. According to embodiments, the microphone array 1120 can include a plurality of microphone elements arranged in a planar configuration. In a preferred embodiment, the microphone elements of the microphone array 1120 are MEMS (micro-electrical mechanical system) transducers, though other types of microphone transducers are also contemplated. The beamforming system 1104 can be configured to combine the audio signals captured by each of the microphone elements in the microphone array 1120 and generate an audio output signal for the microphone array 1120 with a desired directional polar pattern. In some embodiments, the beamforming system 1104 can be configured to steer the output of the microphone array 1120 towards a desired angle or location, similar to the speaker array 1102. Non-limiting examples of beamforming or audio processing techniques that can be used to steer or direct the output of the microphone array in a desired direction may be found in, for example, the following commonly-owned U.S. patent applications: U.S. Patent Application No. 62/855,187, entitled “Auto Focus, Auto Focus within Regions, and Auto Placement of Beamformed Microphone Lobes;” U.S. Patent Application No. 62/821,800, entitled “Auto Focus and Placement of Beamformed Microphone Lobes;” and U.S. patent application Ser. No. 16/409,239, entitled “Pattern-Forming Microphone Array,” the entire contents of each being incorporated by reference herein.

In embodiments, the audio system 1100 can be configured to provide adaptive or dynamic steering control for each speaker array 1102 and each microphone array 1120. For example, the steerable speaker array 1102 may be capable of individually steering each audio output or beam towards a desired location. Likewise, the microphone array 1120 may be capable of individually steering each audio pick-up lobe or beam towards a desired target. The adaptive steering control may be achieved using appropriate beamforming techniques performed by the beamforming system 1104 for each of the microphones and speakers.

In some embodiments, the audio system 1100 can be configured to apply the dynamic steering capabilities of the at least one microphone 1120 and one or more speaker arrays 1102 towards functionalities or aspects that are in addition to delivering audio outputs to specific listeners, or configured to enhance the same. In particular, the audio system 1100 may be configured to allow each component of the system 1100 (e.g., each microphone and speaker) to be mutually aware of the physical location and steering status of all other components in the system 1100 relative to each other. This mutual awareness, as well as other information related to the human source/receivers in the room, allows the audio system 1100 to make active decisions related to steering locations, as well as magnitude variability and signal delay, which allows for source reinforcement and coherence, for example. Additional details and examples are provided below.

Room Response

In some embodiments, the audio system 1100 may be used to determine room behavior, or measure the room impulse response, by using the microphone array 1120 to calculate an impulse response for the speaker arrays 1102. Appropriate audio processing techniques may be used to measure the impulse response of each speaker array 1102 and may include a frequency-dependent response or an audible response. According to some techniques, an adaptive filter may be assigned to each speaker array 1102, and the filtered outputs may be combined to obtain the overall room response.
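
One way an adaptive filter could estimate the response between a speaker array and a microphone is normalized least-mean-squares (NLMS) system identification, sketched below. The disclosure does not specify the adaptive algorithm, so NLMS, the step size, the tap count, and the function names are all assumptions; `simulate_room` is a hypothetical stand-in for the real acoustic path and exists only so the estimator can be exercised.

```python
def simulate_room(excitation, impulse_response):
    """Hypothetical stand-in for the acoustic path: plain FIR convolution."""
    out = []
    for n in range(len(excitation)):
        acc = 0.0
        for k, h in enumerate(impulse_response):
            if n - k >= 0:
                acc += h * excitation[n - k]
        out.append(acc)
    return out


def nlms_identify(excitation, observed, num_taps, step=0.5, eps=1e-8):
    """Estimate an FIR impulse response from a known excitation and its capture."""
    weights = [0.0] * num_taps
    history = [0.0] * num_taps  # most recent excitation samples, newest first
    for x, d in zip(excitation, observed):
        history = [x] + history[:-1]
        estimate = sum(w * h for w, h in zip(weights, history))
        error = d - estimate
        power = sum(h * h for h in history) + eps  # normalizes the update size
        weights = [w + step * error * h / power for w, h in zip(weights, history)]
    return weights
```

Running one such estimator per speaker array, each driven by the excitation sent to that array, would give one estimated impulse response per array, which is the quantity the paragraph above proposes combining into an overall room response.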

As an example, the microphone array 1120 of the audio system 1100 may be used to calculate specific room characteristics, namely RT60, speaker to microphone transfer function, and impulse response. In some embodiments, each of these values may be determined using well-known techniques. The ability to automatically measure these metrics and use them to condition the response of both the microphone array 1120 and the speaker arrays 1102, as well as the accompanying additional functionalities outlined herein, can provide information about the room or environment, and the audio system's interaction with that environment, that may better inform the technologies described below.
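
As one example of such a well-known technique, RT60 can be estimated from a measured impulse response using Schroeder backward integration followed by a straight-line fit to the decay curve. The sketch below fits the -5 dB to -25 dB portion of the decay and extrapolates to 60 dB; that fitting range and the function name are assumptions chosen for illustration, not parameters taken from this disclosure.

```python
import math


def rt60_from_impulse_response(impulse_response, sample_rate_hz):
    """Estimate RT60 (seconds) from an impulse response via Schroeder integration."""
    energy = [s * s for s in impulse_response]
    # Backward-integrate the energy to obtain the energy decay curve.
    decay = [0.0] * len(energy)
    running = 0.0
    for i in range(len(energy) - 1, -1, -1):
        running += energy[i]
        decay[i] = running
    if not decay or decay[0] <= 0.0:
        raise ValueError("impulse response contains no energy")
    # Collect the portion of the curve between -5 dB and -25 dB.
    points = []
    for i, e in enumerate(decay):
        if e <= 0.0:
            break
        level_db = 10.0 * math.log10(e / decay[0])
        if -25.0 <= level_db <= -5.0:
            points.append((i / sample_rate_hz, level_db))
    if len(points) < 2:
        raise ValueError("impulse response too short to fit a decay slope")
    # Least-squares slope of level (dB) against time (s).
    mean_t = sum(t for t, _ in points) / len(points)
    mean_l = sum(l for _, l in points) / len(points)
    slope = sum((t - mean_t) * (l - mean_l) for t, l in points) / sum(
        (t - mean_t) ** 2 for t, _ in points
    )
    return -60.0 / slope
```

A measured impulse response would normally be band-filtered and trimmed of its noise floor before this step; both are omitted here to keep the sketch short.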

Time of Flight

In some embodiments, the microphone array 1120 of the audio system 1100 may be used to calculate each speaker array's time of flight (TOF), or the time it takes audio output by a given speaker array 1102 to propagate through air over a known distance (e.g., the distance between the speaker array 1102 and the microphone array 1120). The time of flight calculations can be used to control gain parameters for the speaker arrays 1102, for example, in order to avoid feedback. As an example, this measurement can be made by sending a predetermined test signal to the speaker array 1102 using any synchronous digital communication technique, while simultaneously initiating detection of the test signal audio at the microphone array 1120 also under test, using any synchronous digital communication technique (such as, for example, but not limited to, Dante). Once the signal is detected, an appropriately processed time difference between when the speaker array 1102 issued the signal and when it was detected by the microphone array 1120 will indicate the time of flight and thus, can be used to calculate the actual distance separating the two devices.
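
The measurement above can be sketched as a cross-correlation between the test signal and the microphone capture. The sketch assumes that playback and capture start on the same sample clock (as the synchronous link described above would provide), that device and processing latency has already been removed, and that sound travels at 343 m/s; the function names are hypothetical.

```python
SPEED_OF_SOUND = 343.0  # m/s, assumed


def estimate_time_of_flight(test_signal, captured, sample_rate_hz):
    """Return the delay (seconds) at which the test signal best matches the capture."""
    best_lag = 0
    best_score = float("-inf")
    max_lag = len(captured) - len(test_signal)
    for lag in range(max_lag + 1):
        # Correlate the test signal against the capture at this candidate lag.
        score = sum(s * captured[lag + i] for i, s in enumerate(test_signal))
        if score > best_score:
            best_score = score
            best_lag = lag
    return best_lag / sample_rate_hz


def distance_from_time_of_flight(time_of_flight_s):
    """Convert a time of flight into the distance separating the two devices."""
    return time_of_flight_s * SPEED_OF_SOUND
```

At a 48 kHz sample rate, a detected lag of 140 samples corresponds to roughly 2.9 ms, or about one meter of separation. In practice a broadband test signal such as a sweep or pseudo-random sequence gives a sharper correlation peak than a tone.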

AEC

In some embodiments, the audio system 1100 may be used to optimize acoustic echo cancellation and minimize crosstalk by taking advantage of the fact that the microphone array 1120 and the speaker arrays 1102 are aware of each other. For example, an appropriate test signal may be applied to a given speaker array 1102 to excite the acoustic response of the room. The audio system 1100 can use the response detected from said test signal to initially tune echo cancelation algorithms for one or more microphones to minimize echoes generated by the room in response to the speaker array output. The audio system 1100 can also use the detected information to tune a response of the microphone array 1120 to minimize pickup from the spatial coordinates of the speaker array 1102 relative to the microphone array 1120.

Voice-Lift

In some embodiments, the steerable microphone array 1120 and steerable speaker array 1102 of the audio system 1100 may be used for adaptive voice-lift optimization. For example, null-steering techniques may be used to mutually exclude the output of one speaker array 1102 from that of another speaker array 1102. Also, null generation techniques may be used to mask non-speech audio detected by the microphone array 1120.

Voice lift is a technique for increasing speech intelligibility in large meeting rooms through subtle audio reinforcement. Incorporating voice lift techniques into the beamforming microphone array 1120 and speaker arrays 1102 of the audio system 1100 can provide a number of benefits. For example, the gain before feedback can be optimized by including the position of the active microphone in the steering decisions being made by the active speakers. When the system 1100 is aware of where the sound is coming from (i.e., the location of the talker or other audio source), the rest of the system 1100 can react intelligently by reinforcing the areas that are far from the audio source, while limiting reinforcement near the audio source. As another example, when the speakers and microphones are aware of each other (e.g., via time of flight), intelligent delays can be applied to the speaker outputs relative to the audio source for voice lift purposes, so as to synchronize the direct transmission with the reinforced transmission. This would limit the amount of phase or time of flight errors in the reinforcement, which leads to a more natural and transparent experience.
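
The delay alignment described above reduces to simple arithmetic once the relevant positions are known. The sketch below assumes three-dimensional coordinates in meters for the talker, the reinforcing speaker array, and the listener, a speed of sound of 343 m/s, and negligible processing latency; the function name is hypothetical.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed


def reinforcement_delay(talker_pos, speaker_pos, listener_pos):
    """Delay (seconds) to apply to the speaker output so that reinforced sound
    reaches the listener no earlier than the talker's direct sound."""
    direct_s = math.dist(talker_pos, listener_pos) / SPEED_OF_SOUND
    reinforced_s = math.dist(speaker_pos, listener_pos) / SPEED_OF_SOUND
    # If the speaker is already farther from the listener than the talker is,
    # no added delay can align the two paths, so the result is clamped at zero.
    return max(0.0, direct_s - reinforced_s)
```

For a listener 6.86 m from the talker and 3.43 m below a ceiling speaker array, the direct path takes about 20 ms and the reinforced path about 10 ms, so the speaker output would be delayed by about 10 ms.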

Localization

In some embodiments, the audio system 1100 may also be used for acoustic localization of multiple audio sources. For example, as people speak, their locations may change, thus requiring the audio system 1100 to redirect speaker audio to optimize system performance. The presence of a set of microphones with known inter-microphone distances allows for the calculation of talker location estimation relative to the microphones. Using that information and its knowledge of the location of the microphone array 1120 relative to the speaker array 1102, the audio system 1100 can simultaneously optimize speaker playback and microphone pickup directions. In some cases, the audio system 1100 may further include one or more technologies for tracking audio sources as they move about the room or environment, such as, for example, one or more infrared devices, a camera, and/or thermal imaging technology.
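
As a minimal illustration of how a known inter-microphone distance supports talker location estimation, the arrival angle of a far-field source can be derived from the time difference of arrival (TDOA) at a pair of microphones. The far-field assumption, the sign convention, the 343 m/s speed of sound, and the function name are assumptions made for this sketch; a practical system would combine many microphone pairs.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed


def arrival_angle_deg(tdoa_s, mic_spacing_m):
    """Angle of a far-field source relative to broadside of a two-microphone pair.

    tdoa_s is the arrival time at the second microphone minus the arrival time
    at the first; a positive value means the source is nearer the first one.
    """
    ratio = SPEED_OF_SOUND * tdoa_s / mic_spacing_m
    # Measurement noise can push the ratio slightly outside [-1, 1].
    ratio = max(-1.0, min(1.0, ratio))
    return math.degrees(math.asin(ratio))
```

A zero time difference places the source at broadside, and a difference equal to the full spacing divided by the speed of sound places it on the axis joining the two microphones.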

Wall Mapping

Another exemplary use for the audio system 1100 may be wall mapping to determine an audio envelope of the room or other environment and generate spatial awareness of the audio sources therein. For example, the audio system 1100 may determine intra-system awareness (e.g., where the speaker arrays 1102 are located in the room) by using the microphone array 1120 to calculate time of arrival (TOA), distance between two points, and other information pertinent to establishing the spatial relationship between a given pair of speaker arrays 1102. The audio system 1100 may combine the wall mapping knowledge with this intra-system awareness to automatically control certain parameters or features of the speaker arrays 1102. For example, the audio system 1100 may use the information to automatically adjust gain parameters, lobe characteristics, and/or other features of the speaker arrays 1102 in order to avoid feedback and other undesirable effects.

In some embodiments, wall mapping can be performed by issuing a pulse to a single speaker array 1102 and processing the response by a set of microphones of known geometry, such as, e.g., microphone array 1120. Room reflections can be estimated, and in most cases, a basic room geometry can be estimated based thereon. Knowing the room geometry allows the audio system 1100 to accommodate an estimated room response. The intra-system awareness can be accomplished via any digital communication technique, whether wired or wireless (such as, e.g., Dante). Alternatively, audio steganography may be used to embed the information in an audio signal output by the speaker array 1102 and received by a given microphone, or inserted into the audio signal detected by a given microphone. Additionally, AES3 digital audio signal technology or ultrasound technology may be used to perform the information exchange between a given pair of microphones.

Privacy Index

When used in an open-office environment, or other large, open area, the audio system 1100 may be used to increase or improve a privacy index of the individuals in the environment 1200 through dynamic noise-masking. For example, a person occupying one cubicle may be able to mask a private conversation from the occupants of surrounding cubicles by configuring the speaker array 1102 to direct frequency-tuned noise towards each of the other occupants (e.g., as an individual audio output steered towards each occupant).

Privacy index (PI) is outlined as part of ASTM E1130 and is determined by the ability of nearby listeners to discern and intelligibly understand the content of a conversation. An alternate metric that is used in the architectural acoustics community is Speech Intelligibility Index (SII) outlined in ANSI S3.5. According to some embodiments, the audio system 1100 may have the following capabilities in an open office environment. The speaker array 1102 may be capable of directing masking noise to areas of the environment that are not being used for a given teleconference. This masking noise can hinder the intelligibility of the teleconference audio or speech for outside listeners. Such functionality may be initiated as part of each teleconference, or may be a persistent feature of a well-defined area, wherein the audio system 1100 is configured to ensure minimal interference to that area from talkers detected in other areas, or limit transmission of audio from those other areas to the well-defined area. The dynamic steering ability of the microphone array 1120 and speaker arrays 1102 may also be used to actively mask surrounding sounds that are naturally transmitted to a given area, for example, using active noise suppression techniques.

Wireless Signals

In some embodiments, the audio system 1100 can be configured to share information between its components using ultrasonic or steganographic-type techniques that embed data or control information within the wireless audio signal. For example, information about gain levels, equalization levels, talker identification, filter coefficients, system level warnings (e.g., low battery), and other functional tasks or tests could be conveyed between components of the audio system 1100 using such wireless techniques, instead of using the network, as is conventional. This may reduce bandwidth consumption on the network and increase the speed with which information can be conveyed. Also, by embedding the data into the audio signal, the audio signal can be sent in real-time. That is, the audio signal need not be delayed to accommodate data signals, as is conventional.
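
To make the idea of embedding control data within an audio signal concrete, the sketch below uses least-significant-bit coding over integer PCM samples, one of the simplest steganographic schemes. This disclosure does not specify an embedding method, and this one survives only a bit-exact digital path (it would not survive acoustic transmission through air), so it should be read as a toy illustration; the function names are hypothetical.

```python
def embed_bytes(samples, payload):
    """Hide payload bytes in the least significant bits of integer PCM samples."""
    bits = [(byte >> (7 - i)) & 1 for byte in payload for i in range(8)]
    if len(bits) > len(samples):
        raise ValueError("payload does not fit in the supplied audio")
    out = list(samples)
    for i, bit in enumerate(bits):
        # Clear the lowest bit, then set it to the payload bit.
        out[i] = (out[i] & ~1) | bit
    return out


def extract_bytes(samples, num_bytes):
    """Recover num_bytes of payload from the least significant bits."""
    payload = bytearray()
    for b in range(num_bytes):
        value = 0
        for i in range(8):
            value = (value << 1) | (samples[b * 8 + i] & 1)
        payload.append(value)
    return bytes(payload)
```

Each carrier sample changes by at most one quantization step, so the audio itself is effectively unaltered while, for example, a short gain or status message rides along with it.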

FIG. 12 illustrates an exemplary implementation of the audio system 1100 as a distributed system in an environment 1200. The environment 1200 may be a conference room, a meeting hall, an open-office environment, or other large space with a ceiling 1230. As shown, the audio system 1100 may include multiple speaker arrays 1102 and at least one microphone array 1120 positioned at various locations throughout the environment 1200 in order to provide appropriate coverage and audio performance. Though FIG. 12 shows two speaker arrays 1102 and one microphone array 1120, it should be appreciated that additional speaker arrays and/or additional microphone arrays may be included in the audio system 1100, for example, to cover a larger listening area.

In some embodiments, the speaker arrays 1102 may be distributed around the environment 1200 so that each speaker array 1102 covers a predetermined portion of the environment 1200. In addition, the placement of each speaker 1102 and microphone 1120 may be selected relative to each other, or so that there is sufficient distance between adjoining devices. In some cases, the microphone 1120 may be directed away from the speaker arrays 1102 to avoid unwanted acoustic interference. The locations of the speaker arrays 1102 and microphone array(s) 1120 may also be selected depending on expected positioning of the listeners in the environment 1200 and/or the type of environment 1200. For example, in a conference room, the speaker arrays 1102 may be centered above a large conference table and may be used during a conference call to reproduce an audio signal representing speech or spoken words received from a remote audio source associated with the conference call. As another example, in an open office environment, the speaker arrays 1102 may be positioned above the clusters of cubicles, so that each cubicle receives audio from at least one of the speaker arrays 1102.

In some embodiments, the speaker arrays 1102 and the microphone array 1120 can be configured for attachment to a vertical wall or horizontal surface, such as, e.g., a table-top. In other embodiments, the speaker arrays 1102 and microphone array 1120 can be configured for attachment to the ceiling 1230, with a front face of each device facing down towards the environment 1200. For example, each speaker array 1102 and/or microphone array 1120 may include a housing with a back surface that is configured for flush-mount attachment to the ceiling 1230, similar to the housing 106 shown in FIG. 1 and described herein.

In some embodiments, the ceiling 1230 can be a suspended ceiling, or drop-ceiling, comprising a plurality of ceiling tiles arranged in a grid-like fashion, as shown in FIG. 12. In such cases, the speaker arrays 1102 and the microphone array(s) 1120 can be configured (e.g., sized and shaped) for attachment to the drop-ceiling 1230, either in place of a given ceiling tile or to the ceiling tile itself. For example, a size and shape of a housing for each speaker array 1102 and microphone array 1120 may be selected to substantially match the size and shape of a standard ceiling tile (e.g., 60 cm by 60 cm, or 24 in by 24 in), and such housings may be configured for attachment to a frame of the drop-ceiling 1230 in the place of a standard ceiling tile. A non-limiting example of a ceiling array microphone may be found in commonly-owned U.S. Pat. No. 9,565,493, the entire contents of which are incorporated by reference herein.

Wireless/Distributed System

As shown in FIG. 11, the components of the audio system 1100 may be coupled to the beamforming system 1104 via one or more cables 1106 or 1122. In some embodiments, the audio system 1100 may be configured as a distributed system. For example, the microphone array 1120 and speaker arrays 1102 may be in wireless communication with the beamforming system 1104, for example, using a Near Field Communication (NFC) network, or other types of wireless technology (e.g., conductive, inductive, magnetic, etc.). In such cases, power may still be delivered over the cables 1106 and 1122, but audio and/or data signals may be delivered wirelessly from one device to the other using any suitable communication protocol.

In embodiments, the ability to wirelessly link the components of the audio system 1100 through a distributed network that enables metadata transfer among said components allows for full transparency of the audio, DSP, and control parameters that are developed and exchanged through the use of the audio system 1100. Moreover, the ability to manage this metadata sharing through protocols, such as, for example, DECT, encrypted Wi-Fi, RF, NFC, Bluetooth, or any number of other wireless or wired protocols, allows for each piece of the system 1100 to be equally aware of the system 1100 as a whole. This awareness, in turn, allows the individual system components to behave in a system-wide consistent manner, as each component uses the same dataset for decision-making purposes.

Any process descriptions or blocks in figures should be understood as representing modules, segments, or portions of code which include one or more executable instructions for implementing specific logical functions or steps in the process, and alternate implementations are included within the scope of the embodiments of the invention in which functions may be executed out of order from that shown or discussed, including substantially concurrently or in reverse order, depending on the functionality involved, as would be understood by those having ordinary skill in the art.

This disclosure is intended to explain how to fashion and use various embodiments in accordance with the technology rather than to limit the true, intended, and fair scope and spirit thereof. The foregoing description is not intended to be exhaustive or to be limited to the precise forms disclosed. Modifications or variations are possible in light of the above teachings. The embodiment(s) were chosen and described to provide the best illustration of the principle of the described technology and its practical application, and to enable one of ordinary skill in the art to utilize the technology in various embodiments and with various modifications as are suited to the particular use contemplated. All such modifications and variations are within the scope of the embodiments as determined by the appended claims, as may be amended during the pendency of this application for patent, and all equivalents thereof, when interpreted in accordance with the breadth to which they are fairly, legally and equitably entitled.

	Number	Date	Country
	62960502	Jan 2020	US
	62851819	May 2019	US
