The present disclosure relates to processing audio in individual sound zones and more particularly to processing audio in low latency individual sound zones.
Common loudspeaker arrangements in an interior of a vehicle include a plurality of loudspeakers distributed throughout the interior. For example, broadband loudspeakers, midrange loudspeakers, and tweeters may be disposed in various places in the vehicle, including the vehicle headrest, to provide audio sources at various listening positions in the vehicle. A surround sound system may also be utilized which may include woofers and may also include midrange loudspeakers and tweeters.
Individual sound zones (ISZs) integrate the personal audio needs of each listener in the vehicle while managing the audio outputs of the overall vehicle cabin. It is desirable to provide sufficient separation between zones, particularly in relation to the low frequency region while generating audio in the vehicle cabin utilizing the vehicle surround system.
However, when delivering a full range of audio content (20 Hz-20 kHz) to an individual sound zone, all the speakers in the vehicle are used to generate seat-based audio content. Unfortunately, the distance between the head of each listener and the surround speakers creates acoustical delays of no less than 3 ms. Further, the requirement to generate low frequency audio content means that a digital signal processing (DSP) system uses FIR filters with many taps. This adds significant load to the DSP and results in the DSP having to create a large buffer to limit the processing load. The tradeoff for decreasing processing power is an increase in buffer size and thereby an increase in latency, which typically exceeds 20 ms and may sometimes reach 100 ms. Latency of this magnitude reduces the quality of the audio for a listener in the ISZ.
There is a need for low latency in individual sound zones to reduce processing requirements and reduce costs for implementing ISZ.
A system and method are provided for achieving both zonal separation and sound quality by way of a hybrid processing system. A first audio signal is filtered with a high pass (HP) infinite impulse response (IIR) filter and equalizing IIR filters. The first audio signal is also filtered with a band pass (BP) IIR filter to extract audio content within a voice frequency band (300 Hz-3400 Hz). The first extracted audio signal is down-sampled and filtered by a set of crosstalk cancellation (CTC) finite impulse response (FIR) filters.
A second audio signal is filtered with an HP IIR filter and equalizing IIR filters. The second audio signal is also filtered with a BP IIR filter to extract audio content within the voice frequency band. The second extracted audio signal is down-sampled and filtered by a set of CTC FIR filters.
The first and second CTC FIR filtered signals are up-sampled respectively. The first up-sampled audio signal is recombined with the equalized first audio input signal. A set of IIR filters equalizes the recombined first audio input signal and outputs S1(jω) to the speaker in a first Individual Sound Zone associated with the first audio signal.
The second up-sampled audio signal is recombined with the equalized second audio signal. A set of IIR filters equalizes the recombined second audio signal and outputs to the speaker in a second ISZ.
Each ISZ in the listening space utilizes only headrest speaker(s) within a predetermined distance of each ISZ and does not involve any other speakers in the listening space. The audio signals are processed by a processing module connected to the loudspeakers in a headrest of the ISZ. The processing module is configured to generate audio for the headrest loudspeakers 122 that is greater than or equal to 300 Hz, reducing the processing load and limiting latency to 5 msec.
Elements and steps in the figures are illustrated for simplicity and clarity and have not necessarily been rendered according to any sequence. For example, steps that may be performed concurrently or in different order are illustrated in the figures to help to improve understanding of embodiments of the present disclosure.
While various aspects of the present disclosure are described with reference to
The vehicle 102 has an audio system 120. One or more loudspeakers 122 are disposed in a headrest 124 at each listening position in the vehicle cabin 104 as part of a vehicle audio system. Current technology for ISZ focuses on generating full-bandwidth audio content at each zone. This approach requires utilizing both proximity (headrest) speakers and surround speakers, such as door woofers and subwoofers positioned throughout the listening space. However, generating full-bandwidth audio content requires a large FIR filter matrix that typically adds significant load to processing. As the filters require a considerable number of taps for each zone speaker, e.g., 4096, the digital signal processor system must create a large buffer (1024, for example) to limit processing load. The trade-off is an increased latency, which is undesirable. The latency may exceed 20 msec and in some cases, may reach 100 msec.
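The link between buffer size and latency described above can be illustrated with a short calculation. The following is only a minimal sketch and not part of the disclosed system: the function name `block_latency_ms` is hypothetical, the buffer size of 1024 is assumed to be counted in samples, and the 48 kHz sample rate is an assumption for this paragraph.

```python
def block_latency_ms(buffer_samples: int, sample_rate_hz: float) -> float:
    """Time needed to fill one processing buffer, in milliseconds."""
    return 1000.0 * buffer_samples / sample_rate_hz


# Buffer size of 1024 taken from the text; 48 kHz is an assumed sample rate.
print(f"{block_latency_ms(1024, 48_000):.1f} ms")  # prints "21.3 ms"
```

Under these assumptions a single buffer already accounts for roughly 21 msec, which is consistent with the statement that latency may exceed 20 msec. A real system may add further input and output buffering on top of this figure.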
Focusing on ISZ implementation for 300 Hz and above, which covers the range of voice applications, allows only headrest speakers to be utilized. Surround woofers and other loudspeakers in the listening environment are not utilized. This approach reduces the acoustical delay and requires far less FIR filtering (fewer than 512 taps for each zone-speaker at a sample rate of 48 kHz, and even fewer taps at a lower sample rate). The result is a low-cost ISZ system with a reduced channel count and a reduced FIR filter length for each channel. This significantly reduces the processing requirement, expressed in millions of instructions per second (MIPS), which, in turn, allows for a much shorter buffer size, thereby significantly decreasing the overall latency. In practice, overall latency is reduced to within a range of 2-5 msec.
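The effect of filter length and sample rate on processing load can be estimated with simple arithmetic. The sketch below is illustrative only: it assumes a direct-form FIR filter costing one multiply-accumulate (MAC) per tap per sample, the function names are hypothetical, and the 8 kHz rate is an assumed example of "a lower sample rate" rather than a value given in the disclosure.

```python
import math


def fir_macs_per_second(taps: int, sample_rate_hz: int) -> int:
    """MAC operations per second for one direct-form FIR filter."""
    return taps * sample_rate_hz


def taps_for_same_duration(taps: int, old_rate_hz: int, new_rate_hz: int) -> int:
    """Taps needed at a new sample rate to span the same impulse-response time."""
    return math.ceil(taps * new_rate_hz / old_rate_hz)


full_band = fir_macs_per_second(4096, 48_000)     # 4096 taps, as in the text
voice_band = fir_macs_per_second(512, 48_000)     # 512 taps at 48 kHz
reduced_taps = taps_for_same_duration(512, 48_000, 8_000)  # assumed 8 kHz rate
down_sampled = fir_macs_per_second(reduced_taps, 8_000)

print(full_band, voice_band, reduced_taps, down_sampled)
```

Under these assumptions the per-filter load falls from 196,608,000 MACs per second to 24,576,000 at 512 taps, and to 688,000 when the same impulse-response duration is covered by 86 taps at 8 kHz.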
Each ISZ has at least one loudspeaker 122 disposed within a predetermined distance of the listener 202, 206. The at least one loudspeaker for the ISZ is positioned within inches of the head of the listener 202, 206. According to the inventive subject matter, to create a bright zone 204 for the listener 202, audio content is limited to 300 Hz or above. The result is that the ISZ 112 is dedicated to audio in the range of 300 Hz and above. The loudspeaker 122 in the dark zone 208 cancels the loudspeaker signal from the bright zone 204. Consequently, at each listening position in the listening environment, the listener in that ISZ hears only audio at 300 Hz and above.
A second audio signal X2 is filtered with an HP IIR filter 314 and equalizing IIR filters 316. The second audio signal is also filtered with a BP IIR filter 318 to extract audio content within the voice frequency band. The second extracted audio signal is down-sampled 320 and filtered by a set of CTC FIR filters 312.
The first and second CTC FIR filtered signals are up-sampled 322, 324 respectively. The first up-sampled audio signal is recombined 326 with the equalized first audio input signal. A set of IIR filters 328 equalizes the recombined first audio input signal and outputs S1(jω) to the speaker 122 in the ISZ associated with the first audio signal X1, which is the first ISZ 112 in this example.
The second up-sampled audio signal is recombined 330 with the equalized second audio signal. A set of IIR filters 332 equalizes the recombined second audio signal and outputs to the speaker 122 in another ISZ, which is the second ISZ 114 in this example.
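The HP and BP IIR stages in the signal flow above could, for example, be realized as second-order sections that process one sample at a time and therefore need no block buffer. The following is a minimal sketch under stated assumptions, not the disclosed filter design: the coefficients follow the widely used Audio EQ Cookbook formulas, and the class name `Biquad`, the 3400 Hz high-pass cutoff, the 1010 Hz band-pass center and the Q values are all hypothetical choices made for illustration.

```python
import math


class Biquad:
    """Direct-form I second-order IIR section (one sample in, one sample out)."""

    def __init__(self, b, a):
        # Normalize so that the leading denominator coefficient is 1.
        self.b = [c / a[0] for c in b]
        self.a = [c / a[0] for c in a]
        self.x1 = self.x2 = self.y1 = self.y2 = 0.0

    def process(self, x):
        y = (self.b[0] * x + self.b[1] * self.x1 + self.b[2] * self.x2
             - self.a[1] * self.y1 - self.a[2] * self.y2)
        self.x2, self.x1 = self.x1, x
        self.y2, self.y1 = self.y1, y
        return y


def high_pass(cutoff_hz, sample_rate_hz, q=0.7071):
    """High-pass section; passes content above the assumed cutoff."""
    w0 = 2.0 * math.pi * cutoff_hz / sample_rate_hz
    alpha, c = math.sin(w0) / (2.0 * q), math.cos(w0)
    return Biquad([(1 + c) / 2, -(1 + c), (1 + c) / 2],
                  [1 + alpha, -2 * c, 1 - alpha])


def band_pass(center_hz, sample_rate_hz, q):
    """Band-pass section with unity gain at the center frequency."""
    w0 = 2.0 * math.pi * center_hz / sample_rate_hz
    alpha, c = math.sin(w0) / (2.0 * q), math.cos(w0)
    return Biquad([alpha, 0.0, -alpha], [1 + alpha, -2 * c, 1 - alpha])


# Assumed example values: 48 kHz rate, voice band roughly centered at 1010 Hz.
hp = high_pass(3400.0, 48_000.0)
bp = band_pass(1010.0, 48_000.0, q=0.33)
```

A single second-order band-pass section is only a rough approximation of a 300 Hz-3400 Hz band; a practical design would likely cascade several sections.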
Signal S̃2(jω) is supplied to the loudspeaker 122 in the second ISZ 114 and is calculated as:
The loudspeakers 122 radiate signals S̃1(jω) and S̃2(jω) as acoustic signals that propagate to the first and second ISZs 112, 114. The sound signals that are present at the first and second ISZs 112, 114 are Z1(jω) and Z2(jω), wherein:
and
In equations (3) and (4), transfer functions H11(jω), H21(jω), H12(jω), and H22(jω) denote the room impulse response (RIR) in the frequency domain, i.e., the transfer function from loudspeakers 122 to the respective sound zones, for example the first ISZ 112 and the second ISZ 114.
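The role of the transfer functions in forming a bright zone and a dark zone can be illustrated at a single frequency. The sketch below rests on assumptions that are not confirmed by the text: the zone signals are taken to be the linear superposition Z1 = H11·S1 + H21·S2 and Z2 = H12·S1 + H22·S2, and the loudspeaker signals are obtained by plain inversion of that 2x2 matrix, which is a common textbook approach to crosstalk cancellation and not necessarily the design used for the CTC FIR filters 312. The function names and numeric values are hypothetical.

```python
def speaker_signals(x1, x2, h11, h21, h12, h22):
    """Loudspeaker signals that cancel crosstalk at one frequency bin.

    Assumes Z1 = H11*S1 + H21*S2 and Z2 = H12*S1 + H22*S2, and solves
    for S1, S2 such that Z1 = X1 and Z2 = X2 (matrix inversion).
    """
    det = h11 * h22 - h21 * h12
    if abs(det) < 1e-12:
        raise ValueError("transfer matrix is singular at this frequency")
    s1 = (h22 * x1 - h21 * x2) / det
    s2 = (h11 * x2 - h12 * x1) / det
    return s1, s2


def zone_signals(s1, s2, h11, h21, h12, h22):
    """Sound signals arriving at the two zones under the assumed model."""
    return h11 * s1 + h21 * s2, h12 * s1 + h22 * s2


# Hypothetical complex transfer-function values at one frequency.
h = dict(h11=1.0 + 0.0j, h21=0.3 - 0.1j, h12=0.25 + 0.05j, h22=0.9 + 0.0j)

# Content intended only for the first zone: X1 = 1, X2 = 0.
s1, s2 = speaker_signals(1.0, 0.0, **h)
z1, z2 = zone_signals(s1, s2, **h)
print(abs(z1), abs(z2))  # close to 1 in the bright zone, close to 0 in the dark zone
```

In this model the second loudspeaker radiates a signal that cancels the leakage from the first loudspeaker at the second zone, which corresponds to the dark zone behavior described for the loudspeakers 122.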
Because the front passenger listener in the second ISZ 114 does not necessarily want to hear the conversation that the driver is having in the first ISZ 112, signal processing modifies the signal played back by the loudspeakers 122 to establish a bright zone within a predetermined range of the first ISZ 112, and to establish a dark zone outside of a predetermined range of the first ISZ 112. Each ISZ in the listening space utilizes only headrest speaker(s) within a predetermined distance of each ISZ and does not involve any other speakers in the listening space. The audio signals are processed by a processing module connected to the loudspeakers in a headrest of the ISZ. The processing module is configured to generate audio for the headrest loudspeakers 122 that is greater than or equal to 300 Hz, reducing the processing load and limiting latency to 5 msec.
The processor may be configured to execute computer readable instructions stored in memory. The processor may be single core or multi-core, and the programs executed by the processor may be configured for parallel or distributed processing. The processor may be any technically feasible hardware unit configured to carry out processing functions and execute software applications, including but not limited to, a central processing unit (CPU), a microcontroller unit (MCU), an application specific integrated circuit (ASIC), a digital signal processor (DSP) chip, a field-programmable gate array (FPGA), a graphic board, etc.
The method receives 502, at an audio system in a listening space having a plurality of ISZs, first and second audio signals to be played back by at least a respective loudspeaker in each of the plurality of ISZs. In the present example, the audio system receives a first audio signal to be played back in the first ISZ and a second audio signal to be played back in the second ISZ. The loudspeaker in each ISZ is positioned close to, that is, within inches of, a listener's ear.
The method extracts 504, from the first audio signal, audio content above a predetermined frequency, hereinafter high frequency audio. The predetermined frequency lies above the voice frequency band, which is typically between 300 Hz and 3400 Hz. The method filters and equalizes 506 the high frequency content of the first audio signal.
The method extracts 508, from the first audio signal, audio content within the voice frequency band. The method down-samples 510 the audio content within the voice frequency band of the first audio signal.
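The down-sampling step can be sketched as integer decimation of the band-limited voice content. This is an illustrative sketch only: the function names are hypothetical, the 48 kHz input rate and the factor of 6 are assumed example values, and it is assumed that the preceding band-pass extraction has already removed content above the voice band so that no separate anti-aliasing filter is shown.

```python
def alias_free(sample_rate_hz: float, factor: int, band_edge_hz: float) -> bool:
    """True if the Nyquist frequency after decimation stays above the band edge."""
    return (sample_rate_hz / factor) / 2.0 > band_edge_hz


def decimate(samples, factor: int):
    """Keep every `factor`-th sample of an already band-limited signal."""
    return samples[::factor]


# Assumed example: 48 kHz input, voice band edge at 3400 Hz.
print(alias_free(48_000, 6, 3400))   # True: 8 kHz rate, 4000 Hz Nyquist
print(alias_free(48_000, 8, 3400))   # False: 6 kHz rate, 3000 Hz Nyquist
print(decimate(list(range(12)), 6))  # [0, 6]
```

Because the voice band ends at 3400 Hz, a reduced rate of about 8 kHz would still represent it, which is what allows the CTC FIR filters to run with fewer taps per unit of time.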
Simultaneously, the method extracts 512, from the second audio signal, audio content above the predetermined frequency. The method filters and equalizes 514 the high frequency content of the second audio signal. The method extracts 516, from the second audio signal, audio content within the voice frequency band, and down-samples 518 the audio content within the voice frequency band of the second audio signal.
The method applies 520 a set of crosstalk cancellation (CTC) filters to the down-sampled audio content of the first audio signal and the down-sampled audio content of the second audio signal. After applying 520 the CTC filters, the audio content within the voice frequency band for the first ISZ is up-sampled 522 and recombined 524 with the high frequency audio content of the first audio signal. The method equalizes 526 and outputs 528 an audio signal to be played back at the loudspeaker in the first ISZ.
After applying 520 the CTC filters, the audio content within the voice frequency band for the second ISZ is up-sampled 530 and recombined 532 with the high frequency audio content of the second audio signal. The method equalizes 534 and outputs 536 an audio signal to be played back at the loudspeaker in the second ISZ.
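The steps 504 through 536 can be summarized as a two-channel processing sketch. This is a minimal illustration under stated assumptions, not the disclosed implementation: the HP and BP stages are passed in as hypothetical callables, the CTC stage is modeled as a 2x2 set of FIR tap lists with an assumed index convention (c21 maps the second input to the first output), the up-sampler simply repeats samples in place of a proper interpolation filter, and all function names are invented for the example.

```python
def fir(taps, x):
    """Causal FIR filter: y[n] = sum over k of taps[k] * x[n - k]."""
    return [sum(t * x[n - k] for k, t in enumerate(taps) if n - k >= 0)
            for n in range(len(x))]


def hold_upsample(x, factor):
    """Crude up-sampler that repeats each sample `factor` times."""
    return [v for v in x for _ in range(factor)]


def add(a, b):
    """Sample-wise sum, truncated to the shorter input."""
    return [p + q for p, q in zip(a, b)]


def process_two_zones(x1, x2, hp, bp, ctc, factor, eq=lambda x: x):
    # Steps 504/506 and 512/514: high frequency path for each signal.
    high1, high2 = hp(x1), hp(x2)
    # Steps 508/510 and 516/518: voice band extraction and down-sampling.
    voice1, voice2 = bp(x1)[::factor], bp(x2)[::factor]
    # Step 520: 2x2 CTC FIR filtering at the reduced rate.
    c11, c21, c12, c22 = ctc
    s1 = add(fir(c11, voice1), fir(c21, voice2))
    s2 = add(fir(c12, voice1), fir(c22, voice2))
    # Steps 522-528 and 530-536: up-sample, recombine, equalize, output.
    out1 = eq(add(high1, hold_upsample(s1, factor)))
    out2 = eq(add(high2, hold_upsample(s2, factor)))
    return out1, out2


# Toy usage with trivial stages so the data flow is easy to follow.
out1, out2 = process_two_zones(
    [1.0, 2.0, 3.0, 4.0], [2.0, 2.0, 2.0, 2.0],
    hp=lambda x: [0.0] * len(x),   # placeholder: no high frequency content
    bp=lambda x: list(x),          # placeholder: pass-through voice band
    ctc=([1.0], [-0.5], [0.0], [1.0]),
    factor=1,
)
print(out1, out2)
```

In this toy configuration the first output is the first input minus half of the second input, which shows how the cross terms of the CTC stage mix the two down-sampled signals before recombination.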
In the foregoing specification, the present disclosure has been described with reference to specific exemplary embodiments. The specification and figures are illustrative, rather than restrictive, and modifications are intended to be included within the scope of the present disclosure. Accordingly, the scope of the present disclosure should be determined by the claims and their legal equivalents rather than by merely the examples described.
For example, the steps recited in any method or process claims may be executed in any order, may be executed repeatedly, and are not limited to the specific order presented in the claims. Additionally, the components and/or elements recited in any apparatus claims may be assembled or otherwise operationally configured in a variety of permutations and are accordingly not limited to the specific configuration recited in the claims. Any method or process described may be carried out by executing instructions with one or more devices, such as a processor or controller, memory (including non-transitory), sensors, network interfaces, antennas, switches, and actuators, to name a few examples.
Benefits, other advantages, and solutions to problems have been described above regarding embodiments; however, any benefit, advantage, solution to problem or any element that may cause any particular benefit, advantage, or solution to occur or to become more pronounced are not to be construed as critical, required, or essential features or components of any or all the claims.
The terms “comprise”, “comprises”, “comprising”, “having”, “including”, “includes” or any variation thereof, are intended to reference a non-exclusive inclusion, such that a process, method, article, composition, or apparatus that comprises a list of elements does not include only those elements recited but may also include other elements not expressly listed or inherent to such process, method, article, composition, or apparatus. Other combinations and/or modifications of the above-described structures, arrangements, applications, proportions, elements, materials, or components used in the practice of the present disclosure, in addition to those not specifically recited, may be varied, or otherwise particularly adapted to specific environments, manufacturing specifications, design parameters or other operating requirements without departing from the general principles of the same.
This application claims priority to provisional application 63/435,483 filed Dec. 27, 2022, in the United States, the disclosure of which is incorporated herein by reference in its entirety.
Number | Date | Country
---|---|---
63/435,483 | Dec. 2022 | US