This disclosure relates to audio reception and playback, and more particularly to systems for and techniques of enhancing the fidelity and perceived sound field spread of inexpensive speakers typically incorporated into audio and video reception and playback devices such as televisions and computers.
Consumers are typically more sensitive to the quality of visual displays than they are to sound quality. In order to keep cost to a minimum, it is common for consumer electronics manufacturers of audio and video systems, such as televisions and computers, to install small, inexpensive speakers in the systems. These speakers typically exhibit poor fidelity and perceived sound field spread. Consumers seeking to overcome those sound problems typically buy and add high-end speakers as the audio and video reception and playback systems are usually configured so that additional speakers can be connected to the audio and video reception and playback systems for improving the quality of the audio portion of any programming.
The present disclosure describes implementations of and architectures for implementing a multi-band (e.g., three-band) audio compression algorithm with advanced surround processing. Embodiments of the present disclosure can accordingly improve the fidelity and perceived sound field spread of inexpensive, cabinet mounted, stereo speakers such as those that might be found in televisions, wireless speaker systems and sound bars. Embodiments of the present disclosure can improve inexpensive, cabinet mounted, stereo speakers by providing, e.g., (i) an Advanced Surround algorithm that adds depth and height to the left/right/center sound field images, (ii) a Soft Clip algorithm to minimize the perceived artifacts caused by compressor overshoot, (iii) configurable crossover filter order adjustment to allow better isolation between bands, (iv) a compressor maximum gain adjustment to reduce overshoot and minimize noise boost, and/or (v) a center gain adjustment to emphasize the perception of the center image (dialog) in high ambient sound situations.
It may be desirable to have different configurations of these architectures and/or algorithms, depending upon the type of audio source material. For example, while watching an action movie, a listener may be interested in a strong audio surround effect. Embodiments of the present disclosure can accordingly provide enhanced audio surround effect(s). As another example, when listening to music, a listener may be less interested in a surround effect and more interested in high fidelity, a concert hall effect, or increased bass. A listener of a sporting event may be interested in hearing the announcer clearly over the crowd noise and public address system while still trying to maintain the ambiance of a stadium environment. The improvements and configurability of the architectures and algorithms of the present disclosure can thus provide the implementation of multiple audio enhancement modes to suit different types of audio material and the listener's taste.
These, as well as other components, steps, features, objects, benefits, and advantages, will now become clear from a review of the following detailed description of illustrative embodiments, the accompanying drawings, and the claims.
The drawings are of illustrative embodiments. They do not illustrate all embodiments. Other embodiments may be used in addition or instead. Details that may be apparent or unnecessary may be omitted to save space or for more effective illustration. Some embodiments may be practiced with additional components or steps and/or without all of the components or steps that are illustrated. When the same numeral appears in different drawings, it refers to the same or like components or steps.
Illustrative embodiments are now described. Other embodiments may be used in addition or instead. Details that may be apparent or unnecessary may be omitted to save space or for a more effective presentation. Some embodiments may be practiced with additional components or steps and/or without all of the components or steps that are described and/or with the component order changed.
When the volume control is positioned as shown in
In exemplary embodiments, a preferred configuration involves configuring the DPP, Crossover Network1, Compressor1 and Compressor2 as a dynamic volume control (DVC) and also includes the EQ, Crossover Network2, Compressor3, and HPF configured for compressor-based bass enhancement. Examples of suitable EQs used with a crossover network and a compressor for dynamic volume control include, but are not limited to, those disclosed in co-owned U.S. Pat. No. 9,380,385 filed 14 Mar. 2014 and entitled “Compressor Based Dynamic Bass Enhancement with EQ,” the entire content of which is incorporated herein by reference. Examples of suitable DPPs used with crossover networks and compressors for dynamic volume control include, but are not limited to, those disclosed in co-owned U.S. Pat. No. 8,315,411 filed 16 Nov. 2009 and entitled “Dynamic Volume Control and Multi-Spatial Processing Protection,” the entire content of which is incorporated herein by reference. Another configuration, described in the present disclosure, uses Advanced Surround for a concert hall effect. Still another configuration utilizes the DPP Target Sum/Difference ratio, DPP center gain and Advanced Surround to create a sports listening mode effect. Yet another configuration uses Compressor2 and Compressor3 together to create an improved bass enhancement effect.
The DPP 200 functions to limit the difference to sum ratio ((L−R)/(L+R)) based upon the Threshold setting. It should be noted that adjusting the Center gain collapses the sound field proportionally into the center image while boosting the sum channel, drawing the listener's attention to the center image, which is typically the program dialogue. A detailed description of this function is provided in co-owned U.S. Pat. No. 8,315,411 filed 16 Nov. 2009 and entitled “Dynamic Volume Control and Multi-Spatial Processing Protection,” the entire content of which is incorporated herein by reference.
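The effect of the Center gain on the sum channel can be illustrated with a minimal sketch. It assumes, purely for illustration, that the Center gain is a plain scalar applied to the sum (L+R) channel before left and right are rebuilt; the function name `apply_center_gain` and the 0.5 scaling of the sum/difference encoding are hypothetical and are not taken from the referenced patent.

```python
def apply_center_gain(left, right, center_gain_db):
    """Scale the sum (center) channel and rebuild left/right samples.

    Assumption: the Center gain acts as a simple multiplier on L+R.
    """
    g = 10.0 ** (center_gain_db / 20.0)   # dB to linear
    s = 0.5 * (left + right) * g          # sum channel (center image), scaled
    d = 0.5 * (left - right)              # difference channel (ambience), untouched
    return s + d, s - d                   # decode back to left and right
```

With 0 dB of Center gain the samples pass through unchanged; with a positive gain the center content grows relative to the difference content, which is one way to read "collapsing" the sound field toward the center image.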
Referring to the DPP system 200 shown in
In the illustrated embodiment of
Each of these signals is then run through a respective signal level detector 228 and 230. An RMS level detector can be used, although any type of level detector (such as the ones described above) can be used instead. Also, the processing can all be performed in the log domain to increase efficiency by passing the detected levels through the log domain processing blocks 232 and 234.
The outputs of the blocks 232 and 234 are applied to the signal summer wherein the processed SUM signal is subtracted from the processed DIF signal. Subtracting one signal from the other in the log domain is the same as providing a signal that is the ratio of the processed DIF signal to the processed SUM signal in the linear domain. Once the L+R and L−R signal levels are calculated, where the L−R signal level may have been equalized prior to level detection to increase the mid-range frequencies, these two signal levels are compared by the comparator 238 to a preset threshold 240. The ratio between the two signals ((L−R)/(L+R)) is compared to a threshold ratio by comparator 238 in order to determine the recommended L−R signal gain adjustment. A limiter stage 242 may be used to limit the amount and direction of gain applied to the L−R signal. The illustrated embodiment limits the gain at 0 dB, hence only allowing attenuation of the L−R signal, although in some applications, there may be a desire to amplify the L−R signal. An averaging stage 244 averages, with a relatively long time constant, the output of the limiter stage 242 so as to prevent the DPP system from tracking brief transient audio events. After conversion back to the linear domain by linear domain block 246, the level of the L−R signal is correspondingly adjusted by the signal multiplier 248 to achieve that target ratio.
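The log-domain comparison and 0 dB limiting described above can be sketched as follows. This is an illustrative reading only: the function names, the `eps` floor that avoids taking the logarithm of zero, and the one-pole form of the averaging stage are assumptions, not details of the DPP system 200.

```python
import math

def dpp_difference_gain_db(sum_level, dif_level, threshold_db, eps=1e-12):
    """Recommended gain (dB) for the L-R signal, limited to attenuation only."""
    sum_db = 20.0 * math.log10(max(sum_level, eps))
    dif_db = 20.0 * math.log10(max(dif_level, eps))
    ratio_db = dif_db - sum_db            # log-domain subtraction = (L-R)/(L+R) ratio
    gain_db = threshold_db - ratio_db     # gain that would bring the ratio to the threshold
    return min(gain_db, 0.0)              # limiter: never amplify the difference signal

def smooth_gain_db(previous_db, new_db, coeff):
    """One-pole average; a small coeff gives a long time constant (assumed form)."""
    return previous_db + coeff * (new_db - previous_db)

def db_to_linear(gain_db):
    """Convert the smoothed gain back to a linear multiplier for the L-R signal."""
    return 10.0 ** (gain_db / 20.0)
```

For example, when the sum and difference levels are equal (a 0 dB ratio) and the threshold is −6 dB, the recommended gain is −6 dB; when the difference level is already 20 dB below the sum level, the limiter leaves the L−R signal untouched.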
Target level signal 316 is subtracted from the output of log conversion block 312 by signal summer 326 so as to provide the REF signal to the signal averaging AVG block 314, a comparator 328 and a second comparator 330. The REF signal represents the volume level of the input signal relative to the desired listening threshold. The REF signal can also be thought of as the instantaneous (prior to attack/release processing) ideal gain recommendation. The output of the signal averaging block 314 is the AVG signal, which is a signal that is a function of the average of the REF signal. The AVG signal is applied to the signal summer 332 where it is added to the attack threshold signal 318. In a similar manner (not shown) the AVG signal is summed with a release threshold. The AVG signal is also applied to the signal summer 334 where it is added to the gate threshold signal 320. The output of signal summer 332 is applied to attack threshold comparator 328 where it is compared to the REF signal, while the output of signal summer 334 is applied to gate threshold comparator 330 where it is compared to the REF signal. The AVG signal is also multiplied by the ratio signal 322 by the signal multiplier 336. The output of comparator 328 is applied to the attack/release selection block 338, which in turn provides either an Att (attack) signal, or a Rel (release) signal to the signal averaging block 314, dependent on and responsive to the status of the mute hold signal 324. The output of the release threshold AVG summer (not shown) is also compared to the REF signal and is applied to the attack/release selection block. The comparator 330 provides an output to the HOLD input of signal averaging block 314.
Finally, the signal multiplier 336 provides an output to a log-to-linear signal converter 340, which in turn provides an output which is applied to each of the signal multipliers 342 and 344, wherein it respectively scales the left and right signal provided at the corresponding inputs 302 and 304 so as to provide the output modified left and right signals Lo and Ro.
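The path from the AVG signal to the scaled left and right outputs can be sketched as below. The sketch assumes that the ratio signal is a negative slope of the form −(1 − 1/R) for a compression ratio of R:1, which is a common textbook convention and not something stated in this disclosure; the function names are likewise hypothetical.

```python
def compressor_gain_linear(avg_db, compression_ratio):
    """Turn the averaged level (dB relative to target) into a linear gain.

    Assumption: ratio signal = -(1 - 1/R), so R -> infinity pins the output
    to the target level and R = 1 applies no gain change at all.
    """
    slope = -(1.0 - 1.0 / compression_ratio)
    gain_db = avg_db * slope              # multiplier stage (log domain)
    return 10.0 ** (gain_db / 20.0)       # log-to-linear conversion

def scale_stereo(left, right, gain):
    """Apply the same linear gain to both channels, preserving the stereo image."""
    return left * gain, right * gain
```

Under this assumed convention, a level 12 dB above target with a 2:1 ratio yields −6 dB of gain, and a 1:1 ratio leaves the signal unchanged.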
With continued reference to
A target output level represented by the target level signal 316 is subtracted from the sensed level at the output of the log conversion block 312 to determine the difference between the actual and desired sound level. This difference, which represents the level of the input signal relative to the target level signal 316, is known as the reference (REF) signal. The target level signal can be a user input, such as a simple knob or other pre-set setting, so as to control the level of sound desired. This threshold can be fixed or it can be changed as a function of the input signal level to better position the compression relative to the input dynamic range. Once the REF signal is obtained, it is provided as an input to the averaging block 314, attack threshold comparator 328 and gate threshold comparator 330. The output of attack threshold comparator 328 is applied to the attack/release select block 338, which in turn can receive a signal (e.g., a MuteHold signal 324) from a program change detector.
The gate threshold signal 320 when added to the current average AVG represents the lowest value REF is able to achieve before left and right gain adjustment (342 and 344) are frozen. The gate threshold comparator 330 receives the instantaneous signal level (REF) signal and determines if the sound level represented by REF drops below the given aforementioned threshold. If the instantaneous signal level (REF) is more than the amount of the gate threshold below the averaged signal level (AVG) appearing at the output of block 314, the gain applied to the signal in the signal path is held constant until the signal level rises above the threshold. The intent is to keep the system 300 from applying increased gain to very low level input signals such as noise. In an infinite hold system, the gain can be constant forever until the signal level rises. In a leaky hold system, the gain can be increased at a gradual pace (much slower than the release time). In one embodiment, this gate hold threshold is adjustable, while in another embodiment the threshold set by gate threshold 334 is fixed. A detailed description of a similar suitable compressor architecture is provided in co-owned U.S. Pat. No. 8,315,411, which is incorporated by reference herein in its entirety.
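The gate hold behavior can be sketched as a single update step of the averaged level. This is a simplified illustration under stated assumptions: the gate threshold is expressed as a negative dB offset added to AVG, the averaging is a one-pole filter whose coefficient `coeff` stands in for the selected attack or release rate, and the leaky hold is modeled as a slow downward drift of AVG (which corresponds to slowly rising gain). The name `update_average` and all parameters are hypothetical.

```python
def update_average(avg_db, ref_db, gate_threshold_db, coeff, leak_db_per_step=0.0):
    """Advance AVG by one step, freezing it while REF sits below the gate.

    gate_threshold_db is negative (e.g. -30.0): the gate trips when REF falls
    more than that amount below the current AVG.
    """
    if ref_db < avg_db + gate_threshold_db:
        # Gate tripped: infinite hold when leak_db_per_step is 0.0,
        # leaky hold otherwise (AVG drifts down, so gain creeps up slowly).
        return avg_db - leak_db_per_step
    # Normal tracking toward the instantaneous level.
    return avg_db + coeff * (ref_db - avg_db)
```

With AVG at −10 dB and a −30 dB gate threshold, a REF of −60 dB leaves AVG frozen, while a REF of −20 dB is tracked normally.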
The architecture 300 preferably (but not necessarily) has an adjustable maximum limit to the gain applied to the L and R channel. By limiting the maximum gain, one can minimize the effects of compressor overshoot when the source material transitions from very quiet to very loud such as when a television program transitions to a loud commercial. Additionally, a maximum gain limit allows one to minimize the noise boost that can occur when the audio is quiet. This is especially important for analog input sources or older program material that has a high noise floor.
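The benefit of a maximum gain limit can be shown with simple arithmetic. The numbers below (a requested gain of +30 dB, a cap of +12 dB, a loud passage arriving at −6 dB) are illustrative assumptions, as are the function names.

```python
def limit_gain_db(requested_gain_db, max_gain_db):
    """Clamp the compressor gain (dB) to an adjustable maximum."""
    return min(requested_gain_db, max_gain_db)

def transient_output_db(input_db, held_gain_db):
    """Output level at the instant the input jumps, before the gain has adapted."""
    return input_db + held_gain_db

# During a quiet passage the compressor would like +30 dB; the cap holds it at +12 dB.
held_gain_db = limit_gain_db(30.0, 12.0)
# A loud commercial arrives at -6 dB before the compressor can react.
overshoot_capped_db = transient_output_db(-6.0, held_gain_db)
overshoot_uncapped_db = transient_output_db(-6.0, 30.0)
```

In this example the momentary overshoot is 18 dB lower with the cap in place, and noise during the quiet passage is boosted by at most 12 dB rather than 30 dB.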
In exemplary embodiments, the DPP, Crossover Network1, Compressor1 and Compressor2 components can be configured as a volume control with multi-spatial processing protection similar to that described in U.S. Pat. No. 8,315,411. Examples of suitable compressor blocks (or subsystems) include, but are not limited to, those disclosed in co-owned U.S. Pat. No. 8,315,411.
The Volume Control setting is provided to Compressor1 and Compressor2 (dashed line on
A diagram of an exemplary embodiment of an Advanced Surround architecture/system 400 is shown in
Referring to
This architecture allows the summation of a scaled amount of sum and difference reflection/reverb with prior art processing before the entire signal is combined back with the left and right channels. The prior art algorithm is good at spreading the perceived sound field, for two stereo speakers, in the horizontal direction. The addition of reflection/reverb modelling, to the prior art, as shown in
One preferred embodiment of Compressor3 involves Volume Control feedback. The Volume Control setting can be provided, as feedback, to Compressor3 (dashed line on
The preferred instantiation of the Soft Clipper is a hard limiter followed by a smoothing polynomial. Suitable smoothing polynomials include, but are not limited to, the type described in the paper Esqueda, F., et al., 23rd European Signal Processing Conference, “Aliasing reduction in soft-clipping algorithms,” EUSIPCO 2015 (Dec. 22, 2015): 2014-2018, a copy of which is submitted with and incorporated into this application; one such suitable polynomial, y = (3x/2)(1 − x²/3), where y is the clipper output, is utilized in a preferred static soft clipping instantiation. Other smoothing polynomials and methods may be used, e.g., other methods based on the ideal band limited ramp function (BLAMP), or the polyBLAMP polynomial approximation method, etc. A hard clipper alone can produce a harsh audio artifact during compressor overshoot. A true limiter may be computationally intensive and require significant processor bandwidth and memory. A soft clipper represents a good compromise that minimizes the perceived audio artifacts for brief audio excursions above full scale.
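A static soft clipper of this form can be sketched in a few lines. The sketch assumes samples are normalized so that full scale is ±1.0; the function name `soft_clip` is hypothetical.

```python
def soft_clip(x):
    """Hard limit to [-1, 1], then smooth with y = (3x/2)(1 - x^2/3)."""
    x = max(-1.0, min(1.0, x))               # hard limiter stage
    return 1.5 * x * (1.0 - (x * x) / 3.0)   # smoothing polynomial stage
```

The polynomial expands to y = 1.5x − 0.5x³, whose slope 1.5(1 − x²) falls to zero at |x| = 1, so the curve meets the ±1.0 limit without a corner. Note that small signals see a gain of about 1.5 with this polynomial, which a surrounding gain stage may need to account for.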
This configurable multi-compressor (e.g., three-compressor) system can be utilized to enhance the listener experience for different types of program material. For example, it can be configured in a music mode with an emphasis on bass. It can be configured in a concert hall mode with emphasis on echo and reverberation. It can also be configured in a live sporting event mode that emphasizes the announcer's voice while maintaining the ambience of a stadium environment. There are many other possible configurations such as HiFi, News and Theater modes, etc.
Exemplary embodiments are described below with respect to features and components described above and shown in the drawings:
An example of a bass-emphasized music mode will now be described. It is assumed that the system utilizes an inexpensive set of speakers which have a low-end frequency response that extends to, e.g., about 250 Hz. This mode utilizes two compressors (Compressor2 and Compressor3) and EQ to enhance bass.
DPP: Limit the (L−R)/(L+R) ratio to 0 dB.
Compressor1: Configured separately to limit the level in the mid and high bands. In this example those are signals above, e.g., 250 Hz. The high band (>250 Hz) Above Threshold Ratio (compression ratio) is set to 1000:1 to provide true limiting at the Target Level. The Target Level is determined, while monitoring the speaker output, with the EQ configured and with the TV at full volume to determine the maximum allowable signal. The Target Level will increase proportionally, via internal feedback, as the TV volume is decreased. The Max Gain and Below Threshold Ratio setting (1.2:1) will allow some mid and high band boost to occur when the input level, in conjunction with the TV volume control setting, indicates more energy will be tolerated. In other words, the high band compressor will allow more mid and high frequency energy to pass as the volume control is lowered, since that energy will be attenuated by the volume control prior to being present at the speaker terminals.
Compressor2: Configured to limit (or boost) the level in the low band relative to a target level setting. In this case the low band could be 250 Hz and below. Crossover Network1 is configured at 250 Hz. The filter order is set to 4th to optimize the separation of the two bands (<250 Hz and >250 Hz). The Target Level sets the limit, for this band, in dB full scale. The Target Level is set, while monitoring the speaker output, with the Volume Control at full volume and with the EQ configured with any desired static boost <250 Hz. Setting the Target Level in this manner allows the maximum amount of energy <250 Hz allowable (before distortion occurs) to reach the speaker terminals at full volume. At lower volume settings the Volume Control feedback will allow more bass signal to pass. This configuration allows the system to always pass as much bass signal as possible, without distortion, while utilizing EQ to provide a static boost to the low band. The Maximum Compressor Gain could be set to a low value (2-3 dB) to allow a small amount of additional dynamic boost at low bass input levels. Above and below threshold compression ratios are set relatively high (16:1).
Advanced Surround: Configured with a moderate to small amount of sound field spread (Width) with a Delay, Delay Feedback and Delay Gain configuration that is dominated by early reflections giving the subtle feeling of 3D sound without sacrificing clarity.
EQ: Configured to flatten the speaker frequency response in mid to high bands and to boost the response in the low band. This creates good overall tonal balance while providing the desired amount of bass boost.
Compressor3: Configured to limit very low frequency signals (<<250 Hz) that are not passable by the speakers at high, or even moderate, output levels. This lower low band is set in Crossover Network2. Continuing the example, assume it is set to 100 Hz. The Target Level can be set to a level (lower than the Compressor2 Target Level) that will allow these very low frequency signals to still pass (at limited levels) and even boost them, via the Max Gain and Below Threshold Ratio parameters, if the input signal level and volume control setting will allow. The Target Level is set when the TV is at full volume, while monitoring the speaker output, to determine the maximum allowable signal but will then be increased proportionally, via internal feedback, as the TV volume is decreased. In other words, the low-low band compressor will allow more energy to pass as the volume control is lowered since it will be attenuated by the volume control prior to being present at the speaker terminals. The HPF is preferably configured to remove those extremely low frequencies that absolutely cannot be reproduced by the speaker.
Soft Clip: Configured to limit signals above 0 dB full scale.
By dividing the speaker low band into two bands, the configuration described above allows lower than typical frequencies to be passed by the speakers. In prior art, a HPF would typically be used to remove the lower-low band frequencies from the audio signal. This new compressor configuration allows them to be passed if conditions (low input level, low volume control setting) merit. All these parameter settings are calibrated for a given set of speakers mounted in a specific enclosure.
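The Volume Control feedback used by the compressors in this mode, where the Target Level rises as the volume is lowered, can be sketched as below. The sketch assumes the feedback is a one-for-one dB offset and adds a hypothetical ceiling at 0 dB full scale; neither detail is specified in this disclosure, and the names are illustrative.

```python
def effective_target_db(target_at_full_volume_db, volume_attenuation_db, ceiling_db=0.0):
    """Raise the compressor Target Level by the attenuation the volume control applies.

    volume_attenuation_db is 0.0 at full volume and grows as the volume is lowered.
    The ceiling is an assumption that keeps the target at or below full scale.
    """
    raised_db = target_at_full_volume_db + volume_attenuation_db
    return min(raised_db, ceiling_db)
```

For a Target Level calibrated to −12 dB at full volume, lowering the volume by 6 dB would raise the effective target to −6 dB, so the same level reaches the speaker terminals after the volume control.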
Concert Hall Mode: A concert hall mode can be created, for the example speaker, by the following configuration.
DPP: Same as bass-emphasized music mode.
Compressor1: Same as bass-emphasized music mode.
Compressor2: Same as bass-emphasized music mode.
Advanced Surround: Increase the Delay Time and Delay Feedback Coefficient for both the L+R and L−R channels so that the overall impulse response extends well into the reverberation region.
EQ: Same as bass-emphasized music mode.
Compressor3: Same as bass-emphasized music mode.
Soft Clip: Same as bass-emphasized music mode.
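How the Delay Time, Delay Feedback Coefficient and Delay Gain shape the impulse response can be illustrated with a generic feedback delay line. This is a textbook structure used here as a stand-in; the actual Advanced Surround topology is defined by the drawings, and the function name and parameters are hypothetical.

```python
def feedback_delay(signal, delay_samples, feedback, delay_gain):
    """Mix the input with a recirculating delayed copy of itself.

    Each pass around the loop is scaled by `feedback`, so echoes arrive every
    `delay_samples` samples and decay geometrically.
    """
    buffer = [0.0] * delay_samples        # circular delay buffer
    index = 0
    output = []
    for x in signal:
        delayed = buffer[index]                   # sample written delay_samples ago
        buffer[index] = x + feedback * delayed    # recirculate into the loop
        index = (index + 1) % delay_samples
        output.append(x + delay_gain * delayed)   # dry signal plus scaled echoes
    return output

impulse_response = feedback_delay([1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0], 2, 0.5, 1.0)
# impulse_response == [1.0, 0.0, 1.0, 0.0, 0.5, 0.0, 0.25]
```

A longer delay and a feedback coefficient closer to 1 stretch the tail of echoes, which is consistent with pushing the response from early reflections toward reverberation; setting the delay gain to 0 removes the echoes from the output entirely.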
Broadcast Sports Mode: A broadcast sports mode can be created by the following configuration:
DPP: Limit the (L−R)/(L+R) ratio to −6 dB. This reduces the ambient audio (crowd noise, public address announcer). Increase the Center Gain to emphasize the broadcast announcer's voice. This gives the announcer's voice more perceived clarity without sacrificing the overall bandwidth of the audio signal. Prior art implementations have implemented a bandpass filter to pass voice frequencies while attenuating signals outside of the voice range.
Compressor1: Configure, similar to bass-emphasized music mode, to limit the audio output so as to not overdrive the speakers at frequencies >100 Hz at full volume. Crossover Network1 is configured at 100 Hz.
Compressor2: Disable by setting the Above Threshold Compression Ratio and Below Threshold Compression Ratio to 1:1.
Advanced Surround: Configure L−R Delay loop (Delay and Delay Feedback Coefficient) to generate an impulse in the reverberation region. Disable L+R delay loop by setting the sum delay gain to 0. While the L−R channel is reduced by DPP, the reverberation on the remaining difference signal retains the enveloping feel of stadium crowd noise. Disabling the L+R delay loop maintains the broadcast announcer's vocal clarity.
EQ: Configure to compensate for speaker frequency response and to provide bass boost.
Compressor3: Configured to improve the speaker's bass response by limiting (or boosting) the level in the low band relative to a target level setting. In this case the low band would be 250 Hz and below. Crossover Network2 is configured at 250 Hz. The Target Level is set with the Volume Control at full volume and with the EQ fully configured with any desired boost <250 Hz. Setting the Target Level in this manner allows the maximum amount of energy <250 Hz allowable (before distortion occurs) to reach the speaker terminals at full volume. At lower volume settings the Volume Control feedback will allow more bass signal to pass. This configuration allows the system to always pass as much bass signal as possible, without distortion, while utilizing EQ to boost the low band. The HPF is configured to remove the low frequencies that cannot be reproduced by the speaker in this configuration.
Soft Clip: Configured to limit signals above 0 dB full scale.
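The mode configurations above lend themselves to a small table of presets. The sketch below captures only values stated in the text for the bass-emphasized music and broadcast sports modes; the `ModePreset` type, its field names, and the helper function are hypothetical, and parameters that the text leaves to calibration (Target Levels, delay settings, EQ curves) are deliberately omitted.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ModePreset:
    dpp_ratio_limit_db: float      # limit on the (L-R)/(L+R) ratio
    crossover1_hz: float           # Crossover Network1 frequency
    crossover2_hz: float           # Crossover Network2 frequency
    comp2_above_ratio: float       # Compressor2 above-threshold ratio (R:1)
    comp2_below_ratio: float       # Compressor2 below-threshold ratio (R:1)

BASS_MUSIC = ModePreset(
    dpp_ratio_limit_db=0.0,
    crossover1_hz=250.0,
    crossover2_hz=100.0,
    comp2_above_ratio=16.0,
    comp2_below_ratio=16.0,
)

BROADCAST_SPORTS = ModePreset(
    dpp_ratio_limit_db=-6.0,
    crossover1_hz=100.0,
    crossover2_hz=250.0,
    comp2_above_ratio=1.0,   # 1:1 ratios disable Compressor2
    comp2_below_ratio=1.0,
)

def compressor2_enabled(preset):
    """Compressor2 is treated as disabled when both ratios are 1:1."""
    return not (preset.comp2_above_ratio == 1.0 and preset.comp2_below_ratio == 1.0)
```

Keeping the presets as immutable records makes it straightforward to add further modes, such as the concert hall mode, once their calibrated values are known for a given speaker and enclosure.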
The components, steps, features, objects, benefits, and advantages that have been discussed are merely illustrative. None of them, or the discussions relating to them, are intended to limit the scope of protection in any way. Numerous other embodiments are also contemplated. These include embodiments that have fewer, additional, and/or different components, steps, features, objects, benefits, and/or advantages. These also include embodiments in which the components and/or steps are arranged and/or ordered differently.
For example, in bass-emphasized music mode the roles of Compressor2 and Compressor3 could be reversed. Compressor2 could compress the lower low band and Compressor3 could compress the upper low band. Additionally, the HPF could be located after the summer. The Volume Control could be positioned before Crossover Network2 eliminating the need for Volume Control feedback.
Unless otherwise stated, all measurements, values, ratings, positions, magnitudes, sizes, and other specifications, including frequencies, ratios, and dB values, that are set forth in this specification, including in the claims that follow, are approximate and/or provided as examples, and are not necessarily exact or invariable. They (the values described) are intended to have a reasonable range that is consistent with the functions to which they relate and with what is customary in the art to which they pertain.
All articles, patents, patent applications, and other publications that have been cited in this disclosure are incorporated herein by reference.
The phrase “means for” when used in a claim is intended to and should be interpreted to embrace the corresponding structures and materials that have been described and their equivalents. Similarly, the phrase “step for” when used in a claim is intended to and should be interpreted to embrace the corresponding acts that have been described and their equivalents. The absence of these phrases from a claim means that the claim is not intended to and should not be interpreted to be limited to these corresponding structures, materials, or acts, or to their equivalents.
The scope of protection is limited solely by the claims that now follow. That scope is intended and should be interpreted to be as broad as is consistent with the ordinary meaning of the language that is used in the claims when interpreted in light of this specification and the prosecution history that follows, except where specific meanings have been set forth, and to encompass all structural and functional equivalents.
Relational terms such as “first” and “second” and the like may be used solely to distinguish one entity or action from another, without necessarily requiring or implying any actual relationship or order between them. The terms “comprises,” “comprising,” and any other variation thereof when used in connection with a list of elements in the specification or claims are intended to indicate that the list is not exclusive and that other elements may be included. Similarly, an element preceded by an “a” or an “an” does not, without further constraints, preclude the existence of additional elements of the identical type.
None of the claims are intended to embrace subject matter that fails to satisfy the requirement of Sections 101, 102, or 103 of the Patent Act, nor should they be interpreted in such a way. Any unintended coverage of such subject matter is hereby disclaimed. Except as just stated in this paragraph, nothing that has been stated or illustrated is intended or should be interpreted to cause a dedication of any component, step, feature, object, benefit, advantage, or equivalent to the public, regardless of whether it is or is not recited in the claims.
The abstract is provided to help the reader quickly ascertain the nature of the technical disclosure. It is submitted with the understanding that it will not be used to interpret or limit the scope or meaning of the claims. In addition, various features in the foregoing detailed description are grouped together in various embodiments to streamline the disclosure. This method of disclosure should not be interpreted as requiring claimed embodiments to require more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive subject matter lies in less than all features of a single disclosed embodiment. Thus, the following claims are hereby incorporated into the detailed description, with each claim standing on its own as separately claimed subject matter.
This application is based upon and claims priority to U.S. provisional patent application 62/442,195 entitled “Three Band Compressor with Advanced Surround Processing,” filed 4 Jan. 2017, attorney docket number 056233-0609. The entire content of this noted provisional application is incorporated herein by reference.
Number | Date | Country
---|---|---
62442195 | Jan 2017 | US