The present disclosure relates to a method for a multi-channel speaker system and to the multi-channel speaker system, and specifically to a method for automatically detecting speaker positions and automatically assigning channels in an arbitrarily placed multi-channel speaker system, as well as to the multi-channel speaker system itself.
Multi-channel speaker systems are becoming increasingly popular as an option for modern integrated home entertainment systems. They are commonly used to provide immersive audio experiences for movies and for multi-channel audio reproduction such as Dolby Atmos Music.
With the advancement of wireless technologies, companies have launched their own wireless audio ecosystems that allow users to link a number of speakers together to form a mesh network. Common configurations are four portable speakers operated as a 4.0 channel system, or a soundbar with two portable speakers as a true surround setup such as a 5.1 or 7.1 channel system.
For users' convenience and room tidiness, the linked speakers in the ecosystem usually rely on wireless transmission of audio signals, so no external wires need to connect them to each other. While this removes unnecessary wires, it requires an extra speaker position identification step during the setup process.
To detect speaker position and thus correctly assign the source channel to the corresponding speaker in various rooms and setups, most multi-channel speaker systems provide acoustic calibration for the system.
Normally the calibration is performed by in-situ measurements using the speakers and a microphone. Some calibration methods require an external microphone; for example, some multi-speaker systems require an additional device with microphones to perform calibration. The frequency response of each speaker is adjusted after the calibration, but there is no automatic correction of the speaker assignment. For example, some multi-speaker systems ask the user to manually assign the speaker positions before calibration. In this case, failing to assign the correct channel sequence leads to a reversed sound image even after calibration.
Other calibration methods use internal microphones, which is friendlier to the user, but there is still no automatic speaker assignment correction. Taking a system of four separate speakers as an example, the calibration method uses the microphones in each speaker to detect whether the left and right speakers are reversed, or whether the left surround and right surround speakers are reversed, respectively, but if both are reversed, the detection algorithm of the calibration method cannot react.
Therefore, it is necessary to provide a robust technology for performing automatic speaker assignment, which can not only avoid inconvenience to the user but also avoid potentially assigning the wrong channels to the speakers in the multi-channel speaker system.
According to one aspect of the disclosure, a method for a multi-channel speaker system is provided, wherein the multi-channel speaker system includes N speakers, N≥2. The method may comprise obtaining N! permutations of channel sequence for the N speakers; determining, for each permutation, a voting score that represents the matching degree between the channel sequence indicated in a permutation and a correct channel assignment sequence of the N speakers; selecting the permutation with the highest voting score; and assigning input source channels to the N speakers in the order of the channel sequence indicated in the selected permutation.
According to another aspect of the present disclosure, a multi-channel speaker system is provided. The system may comprise N speakers and a processor. The processor may be configured to obtain N! permutations of channel sequence for the N speakers; determine, for each permutation, a voting score that represents the matching degree between the channel sequence indicated in the permutation and a correct channel assignment sequence of the N speakers; select the permutation with the highest voting score; and assign input source channels to the N speakers in the order of the channel sequence indicated in the selected permutation.
According to yet another aspect of the present disclosure, a non-transitory computer-readable storage medium comprising computer-executable instructions is provided, wherein the instructions, when executed by a computer, cause the computer to perform the method disclosed herein.
To facilitate understanding, identical reference numerals have been used, where possible, to designate identical elements that are common to the figures. It is contemplated that elements disclosed in one embodiment may be beneficially utilized on other embodiments without specific recitation. The drawings referred to here should not be understood as being drawn to scale unless specifically noted. Also, the drawings are often simplified, and details or components are omitted for clarity of presentation and explanation. The drawings and discussion serve to explain principles discussed below, where like designations denote like elements.
Examples will be provided below for illustration. The descriptions of the various examples will be presented for purposes of illustration but are not intended to be exhaustive or limited to the embodiments disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the described embodiments.
As mentioned above, during the initial setup stage of the multi-channel speaker system, it is inconvenient for the user to confirm the channel assignment of the speakers and to manually swap the speakers or change their relative positions. This disclosure provides a novel method and system that may automatically perform speaker assignment, which not only avoids inconvenience to the user but also ensures that the correct channels are assigned to the speakers in the multi-channel speaker system. The method and system utilize a permutation-sequence-based algorithm in combination with a joint voting method to provide the best estimate of the speaker placement. In addition, while estimating the channel assignment, an acoustic calibration may be performed automatically. Thus, the impact on the user experience of both the channel assignment and the acoustic calibration during the initial setup stage is minimized. The novel approach will be explained in detail referring to
A multi-channel speaker system may include N speakers, such as wireless speakers, wherein N may be greater than or equal to 2. Each speaker in the speaker system may include at least two internal microphones. For the sake of clarity,
For example,
where Tleft and Tright are the occurrence times of the peaks of the impulse responses of the left and right microphones, respectively. For the remaining speakers and microphones in the speaker system, the time differences can be obtained in the same manner. If the system consists of N speakers, there will be an N×N matrix of time differences. The ith row of the matrix corresponds to the ith speaker playing the signal, and the jth column corresponds to the microphones of the jth speaker recording it. In this example, a 5×5 matrix of time differences will be obtained.
After the time differences for each speaker are estimated as described above, the direction of the sound source can be calculated for each speaker; more specifically, the angles of the incoming sounds can be calculated. In theory, there are two models: the near-field model and the far-field model. In a common use case, the distance between the two microphones in one speaker is small (ranging from 5 cm to 40 cm), while the distance between speakers in the multi-channel speaker system is usually much larger (ranging from 1 m to 10 m), so the far-field model is used for simplicity in the following description.
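As a minimal sketch of how the time-difference matrix could be assembled, the following Python assumes that equation (1) takes the form Tdiff = Tleft − Tright (the sign convention is an assumption) and that an impulse response has already been measured for every pair of playing speaker and recording microphone. The function names and the nested-list layout `ir_left[i][j]` are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence


def peak_time(impulse_response: Sequence[float], sample_rate: float) -> float:
    """Return the time in seconds of the largest-magnitude sample."""
    peak_index = max(range(len(impulse_response)),
                     key=lambda k: abs(impulse_response[k]))
    return peak_index / sample_rate


def time_difference_matrix(ir_left: Sequence[Sequence[Sequence[float]]],
                           ir_right: Sequence[Sequence[Sequence[float]]],
                           sample_rate: float) -> List[List[float]]:
    """Build the N x N matrix of time differences.

    ir_left[i][j] and ir_right[i][j] hold the impulse responses measured at
    the left and right microphone of speaker j while speaker i is playing.
    Assumed convention: T_diff = T_left - T_right.
    """
    n = len(ir_left)
    return [
        [peak_time(ir_left[i][j], sample_rate)
         - peak_time(ir_right[i][j], sample_rate)
         for j in range(n)]
        for i in range(n)
    ]
```

Row i of the result then corresponds to speaker i playing and column j to speaker j recording, matching the matrix layout described above.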
where c is the sound speed, Tdiff is the time difference of two impulse responses of the speaker, which can be calculated according to equation (1). If the system consists of N speakers, there will be an N×N matrix of estimated angles indicating the sound source directions for all the speakers.
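The far-field estimate can be sketched as follows. Equation (2) is not reproduced in this text, so the sketch assumes the common relation sin(θ) = c·Tdiff / dMic, with θ measured from the broadside of the microphone pair and Tdiff = Tleft − Tright; an arccos form would apply if the angle were measured from the microphone axis instead. The function names, the speed-of-sound value, and the geometry helper used to compare against the exact near-field path difference are illustrative assumptions.

```python
import math
from typing import List, Sequence

SPEED_OF_SOUND = 343.0  # m/s; assumed value for air at roughly 20 degrees C


def far_field_angle(t_diff: float, d_mic: float,
                    c: float = SPEED_OF_SOUND) -> float:
    """Angle of arrival in degrees, measured from the array broadside.

    Assumed far-field relation: sin(theta) = c * T_diff / d_mic.
    The ratio is clamped to [-1, 1] to tolerate noisy peak estimates.
    """
    ratio = max(-1.0, min(1.0, c * t_diff / d_mic))
    return math.degrees(math.asin(ratio))


def exact_path_difference(distance: float, theta_deg: float,
                          d_mic: float) -> float:
    """Near-field path-length difference (left minus right) in metres.

    The microphones sit at (-d_mic/2, 0) and (+d_mic/2, 0). The source is a
    point at range `distance` from the array centre, `theta_deg` off
    broadside, positive towards the right microphone.
    """
    theta = math.radians(theta_deg)
    sx, sy = distance * math.sin(theta), distance * math.cos(theta)
    left = math.hypot(sx + d_mic / 2, sy)
    right = math.hypot(sx - d_mic / 2, sy)
    return left - right


def angle_matrix(t_diff_matrix: Sequence[Sequence[float]], d_mic: float,
                 c: float = SPEED_OF_SOUND) -> List[List[float]]:
    """Apply the far-field estimate to every entry of the N x N matrix."""
    return [[far_field_angle(t, d_mic, c) for t in row]
            for row in t_diff_matrix]
```

Feeding `exact_path_difference(3.0, 30.0, 0.2) / SPEED_OF_SOUND` into `far_field_angle` recovers an angle close to 30 degrees, whereas the same source at a range of 0.2 m gives a visibly larger error, which is the reason the far-field model is reasonable when the speaker spacing is much larger than the microphone spacing.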
If the system consists of N speakers, there are N! permutations of the channel sequence that can be assigned to the speakers. To correctly assign the channels to the speakers, this disclosure proposes a joint voting method that robustly determines the correct assignment sequence. According to one or more embodiments, a voting score or rank is calculated for each permutation; the voting score or rank may represent the matching degree between the channel sequence indicated in the permutation and the correct channel assignment sequence of the N speakers. For example, the higher the voting score or rank, the better the matching degree. The permutation with the highest voting score is then selected, and the input source channels are assigned to the N speakers in the order of the channel sequence indicated in the selected permutation.
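The exhaustive search over N! permutations can be sketched as below. The selection loop follows the description above; the voting rule itself is a toy stand-in, since the direction conditions used by the disclosure are not reproduced in this text. The toy rule assumes a three-speaker front stage in which all speakers face the same way, a negative angle means "sound arrives from the left", and one point is awarded for every playing/recording pair whose measured side agrees with the nominal layout. `NOMINAL_RANK`, `make_sign_vote`, and `select_channel_assignment` are hypothetical names.

```python
from itertools import permutations
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical nominal left-to-right rank of each channel in a front stage.
NOMINAL_RANK = {"L": -1, "C": 0, "R": 1}


def make_sign_vote(angles: List[List[float]]) -> Callable[[Sequence[str]], int]:
    """Toy voting rule (an assumption, not the disclosure's condition).

    angles[i][j] is the estimated direction, at speaker j, of sound played
    by speaker i; a negative value means the sound arrives from the left.
    """
    def vote(order: Sequence[str]) -> int:
        score = 0
        for i, ch_i in enumerate(order):
            for j, ch_j in enumerate(order):
                if i == j:
                    continue
                expected_left = NOMINAL_RANK[ch_i] < NOMINAL_RANK[ch_j]
                measured_left = angles[i][j] < 0
                score += int(expected_left == measured_left)
        return score
    return vote


def select_channel_assignment(
        speakers: Sequence[str], channels: Sequence[str],
        vote: Callable[[Sequence[str]], float]) -> Tuple[Dict[str, str], float]:
    """Score all N! channel sequences and keep the highest-scoring one."""
    best_order, best_score = None, float("-inf")
    for order in permutations(channels):
        score = vote(order)
        if score > best_score:
            best_order, best_score = order, score
    return dict(zip(speakers, best_order)), best_score


# Example: speakers A, B, C physically stand on the right, left, and centre.
angles = [[0.0, 60.0, 45.0],
          [-60.0, 0.0, -45.0],
          [-45.0, 45.0, 0.0]]
assignment, score = select_channel_assignment(
    ["A", "B", "C"], ["L", "C", "R"], make_sign_vote(angles))
```

Because every permutation is scored, a layout in which several speakers are swapped at once is handled the same way as a single swap. The cost grows as N!, which stays small for typical home systems (120 permutations for five speakers).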
Next, a joint voting method in combination with the permutations sequence-based algorithm will be described in detail in reference to
As an example,
Assume that the direction condition for the case in which speaker A 102 is recording is given by the condition below (Eq. 3), in the configuration shown in the example of
The discussion above takes into account only the positions and channel sequence of the speakers. In practice, it can be combined with frequency response calibration, which also makes use of the sweep signal. For example, the frequency response of speaker A, FRA, and its target frequency response, FRtargetA, are described below, respectively,
wherein FFT is the Fast Fourier Transform and |*| is the absolute value operator. hA denotes the impulse responses between the microphones and transducers of speaker A in the user's environment, which are discussed above, for example, discussed in reference to
The calibration filter can be obtained by,
where f(⋅) is a function that converts the frequency response into the calibration filter, for example, a function that yields a finite impulse response (FIR) filter or an infinite impulse response (IIR) filter. The calibration filter is then inserted into and applied to the original audio pipeline. It can be understood that the frequency response calibration discussed above may be applied to all the speakers in the multi-channel speaker system.
The method discussed above may be realized by a processor included in the speaker system. The processor may be any technically feasible hardware unit configured to process data and execute software applications, including, without limitation, a central processing unit (CPU), a microcontroller unit (MCU), an application specific integrated circuit (ASIC), a digital signal processor (DSP) chip, and so forth.
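A rough sketch of the calibration step is given below. It assumes that the frequency responses are the magnitudes of the transformed impulse responses, that the correction is the per-bin ratio of target to measured magnitude, and that the conversion function is a simple frequency-sampling design of a linear-phase FIR filter. All three are assumptions made for illustration, as are the function names, the target impulse response argument `h_target`, and the 12 dB gain cap. A direct DFT is used only to keep the example dependency-free; an FFT routine would normally be used.

```python
import cmath
import math
from typing import List, Sequence


def magnitude_response(h: Sequence[float], n_fft: int) -> List[float]:
    """|DFT(h)| at bins 0..n_fft/2, with h zero-padded to n_fft samples."""
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * n / n_fft)
                for n, x in enumerate(h)))
        for k in range(n_fft // 2 + 1)
    ]


def correction_gains(h_measured: Sequence[float], h_target: Sequence[float],
                     n_fft: int = 64, max_gain_db: float = 12.0) -> List[float]:
    """Per-bin gain FR_target / FR_measured, capped at max_gain_db (assumed)."""
    fr = magnitude_response(h_measured, n_fft)
    fr_target = magnitude_response(h_target, n_fft)
    cap = 10.0 ** (max_gain_db / 20.0)
    # A small floor avoids division by zero at spectral nulls.
    return [min(t / max(m, 1e-12), cap) for m, t in zip(fr, fr_target)]


def linear_phase_fir(gains: Sequence[float], n_fft: int) -> List[float]:
    """One possible conversion function: frequency-sampling FIR design.

    `gains` holds n_fft/2 + 1 magnitudes (n_fft must be even). The spectrum
    is mirrored, inverse-transformed as a zero-phase response, and rotated
    by n_fft/2 samples so that the filter is causal with linear phase.
    """
    full = list(gains) + list(gains[-2:0:-1])
    zero_phase = [
        sum(g * math.cos(2 * math.pi * k * n / n_fft)
            for k, g in enumerate(full)) / n_fft
        for n in range(n_fft)
    ]
    half = n_fft // 2
    return zero_phase[half:] + zero_phase[:half]
```

When the measured and target responses are identical, the gains are all one and the resulting filter reduces to a pure delay of n_fft/2 samples, which is a convenient sanity check before inserting the filter into the audio pipeline.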
In this disclosure, a new solution is provided to correctly and automatically arrange input source channels to the speakers in a multi-channel speaker system. In addition, while performing estimation of channel assignment, an acoustic calibration may be automatically performed. Thus, at the initial setup stage of the speaker system, especially for both channel assignment and acoustic calibration during the initial setup stage, the impact on the user experience will be minimized.
wherein Tdiff is the time difference of arrival of the at least two internal microphones included in each speaker, dMic is a distance between the at least two internal microphones, and c is a sound speed.
The descriptions of the various embodiments have been presented for purposes of illustration but are not intended to be exhaustive or limited to the embodiments disclosed. The terminology used herein was chosen to best explain the principles of the embodiments, the practical application or technical improvement over technologies found in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein.
In the preceding, reference is made to embodiments presented in this disclosure. However, the scope of the present disclosure is not limited to specific described embodiments. Instead, any combination of the preceding features and elements, whether related to different embodiments or not, is contemplated to implement and practice contemplated embodiments. Furthermore, although embodiments disclosed herein may achieve advantages over other possible solutions or over the prior art, whether or not a particular advantage is achieved by a given embodiment is not limiting of the scope of the present disclosure. Thus, the preceding aspects, features, embodiments, and advantages are merely illustrative and are not considered elements or limitations of the appended claims except where explicitly recited in a claim(s).
Aspects of the present disclosure may take the form of an entirely hardware embodiment, an entirely software embodiment (including firmware, resident software, micro-code, etc.) or an embodiment combining software and hardware aspects that may all generally be referred to herein as a “circuit,” “module”, “unit” or “system.”
The present disclosure may be a system, a method, and/or a computer program product. The computer program product may include a computer readable storage medium (or media) having computer readable program instructions thereon for causing a processor to carry out aspects of the present disclosure.
The computer readable storage medium can be a tangible device that can retain and store instructions for use by an instruction execution device. The computer readable storage medium may be, for example, but is not limited to, an electronic storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing. A non-exhaustive list of more specific examples of the computer readable storage medium includes the following: a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or Flash memory), a static random access memory (SRAM), a portable compact disc read-only memory (CD-ROM), a digital versatile disk (DVD), a memory stick, a floppy disk, a mechanically encoded device such as punch-cards or raised structures in a groove having instructions recorded thereon, and any suitable combination of the foregoing. A computer readable storage medium, as used herein, is not to be construed as being transitory signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through a waveguide or other transmission media (e.g., light pulses passing through a fiber-optic cable), or electrical signals transmitted through a wire.
Computer readable program instructions described herein can be downloaded to respective calculating/processing devices from a computer readable storage medium or to an external computer or external storage device via a network, for example, the Internet, a local area network, a wide area network and/or a wireless network. The network may comprise copper transmission cables, optical transmission fibers, wireless transmission, routers, firewalls, switches, gateway computers and/or edge servers.
Aspects of the present disclosure are described herein with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the disclosure. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer readable program instructions.
These computer readable program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.
The flowchart and block diagrams in the drawings illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of instructions, which comprises one or more executable instructions for implementing the specified logical function(s). In some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems that perform the specified functions or acts or carry out combinations of special purpose hardware and computer instructions.
While the foregoing is directed to embodiments of the present disclosure, other and further embodiments of the disclosure may be devised without departing from the basic scope thereof, and the scope thereof is determined by the claims that follow.
This application is a continuation of International Application No. PCT/CN22/070062, filed on Jan. 4, 2022, the disclosure of which is incorporated by reference herein.
Relation | Number | Date | Country
---|---|---|---
Parent | PCT/CN22/70062 | Jan 2022 | WO
Child | 18764438 | | US