One complaint voiced by many television viewers is the change in volume they endure during commercial breaks and when switching between channels. Similar volume extremes may also occur with other devices, such as portable audio players, A/V receivers, personal computers, and vehicle audio systems. One solution to this problem is automatic gain control (AGC). A typical AGC reacts to volume changes by cutting an audio signal at high amplitude and boosting it at low amplitude, no matter where in the frequency range the loudness spike occurs.
When the AGC kicks in, unwanted changes and unnatural artifacts can often be heard in the form of pumping and breathing fluctuations. Pumping fluctuations can be the result of bass tones disappearing when the loudness suddenly increases, like during a loud action sequence. Breathing fluctuations can happen when low level hiss is boosted during quiet passages. Unfortunately, this brute force method of handling volume changes does not take into account how humans actually perceive change in volume.
In certain embodiments, a method of adjusting a loudness of an audio signal includes receiving an electronic audio signal and using one or more processors to process at least one channel of the audio signal to determine a loudness of a portion of the audio signal. This processing may include processing the channel with a plurality of approximation filters that can approximate a plurality of auditory filters that further approximate a human hearing system. In addition, the method may include computing at least one gain based at least in part on the determined loudness to cause a loudness of the audio signal to remain substantially constant for a period of time. Moreover, the method may include applying the gain to the electronic audio signal.
In various embodiments, a method of adjusting a loudness of an audio signal includes receiving an electronic audio signal having two or more channels of audio and selecting a channel of the two or more audio channels. The selecting may include determining a dominant channel of the two or more audio channels and selecting the dominant channel. The method may further include using one or more processors to process the selected channel to determine a loudness of a portion of the audio signal and computing at least one gain based at least in part on the determined loudness. Additionally, the method may include applying the at least one gain to the electronic audio signal.
In certain implementations, a system for adjusting a loudness of an audio signal includes a pre-processing module that can receive an electronic audio signal having one or more channels of audio and select at least one of the channels of audio. The system may further include a loudness analysis module having one or more processors that can compute a loudness of the at least one selected channel. The system may further include a gain control module that can compute at least one gain based at least in part on the loudness. The gain computation may include calculating a gain for the at least one selected channel of the audio signal based at least partly on the estimated loudness and applying the gain to each channel of the audio signal.
In certain embodiments, a method of distinguishing background sounds from other sounds may include receiving an electronic audio signal having two or more channels of audio, selecting a portion of the electronic audio signal, analyzing a phase between each channel of the selected portion of the electronic audio signal to determine a number of samples that have a corresponding phase, and comparing the number of samples to a threshold to determine whether the selected portion of the electronic audio signal corresponds to background noise.
In certain embodiments, a system for adjusting a loudness of an audio signal may include an audio signal having one or more channels of audio, a loudness module having one or more processors that can compute a loudness of the audio signal, where the computation includes processing the audio signal with a plurality of infinite impulse response (IIR) filters, where each of the IIR filters is a band-pass filter, and where the IIR filters can approximate a human hearing system. The system may further include a gain module that can compute a gain based at least in part on the computed loudness.
For purposes of summarizing the disclosure, certain aspects, advantages and novel features of the inventions have been described herein. It is to be understood that not necessarily all such advantages may be achieved in accordance with any particular embodiment of the inventions disclosed herein. Thus, the inventions disclosed herein may be embodied or carried out in a manner that achieves or optimizes one advantage or group of advantages as taught herein without necessarily achieving other advantages as may be taught or suggested herein.
Throughout the drawings, reference numbers may be re-used to indicate correspondence between referenced elements. The drawings are provided to illustrate embodiments of the inventions described herein and not to limit the scope thereof.
Some volume control systems attempt to take loudness into account in determining how to vary gain. Loudness can be an attribute of the auditory system that can allow for classification of sounds on a scale from quiet to loud. Loudness can be measured in a unit called the “phon.” When listening to different types of audio material, listeners use the subjective quantity of loudness to categorize the intensity of the various sounds presented to their ears and to generate a listening sensation. Perceived loudness may vary with frequency, unlike sound pressure levels measured in decibels (dB). Volume control systems that model the human ear based on loudness often use complex, high-order filters to model the human hearing system. These systems can consume significant computing resources, which can limit their capability to function in certain devices, such as televisions and car audio systems.
This disclosure describes certain systems and methods for adjusting the perceived loudness of audio signals. In certain embodiments, an estimated loudness of an audio signal is determined using one or more processing-efficient techniques. These techniques may include using lower-order filters that approximate the filter banks modeling the human ear, decimating the audio signal to reduce the number of audio samples processed, processing fewer than all audio channels, and smoothing gain coefficients rather than smoothing an output signal. Advantageously, in certain embodiments, applying one or more of these techniques can enable lower-capability processors, such as may be found in many electronic devices, to dynamically adjust the loudness of audio signals.
Referring to
As shown in
A pre-process module 120 of the loudness adjustment system 110 receives the audio input signal 102. The pre-process module 120 may include hardware and/or software for gathering energy information from each channel of the audio input signal 102. Using the energy information in one embodiment, the pre-process module 120 can determine at least one dominant channel to be analyzed for loudness by a loudness analysis module 130. More generally, the pre-process module 120 may select a subset of the channels of the audio input signal 102 for loudness analysis. By using fewer than all of the channels to determine loudness, in certain embodiments, the pre-process module 120 can reduce computing resources used to determine loudness.
The loudness analysis module 130 can include hardware and/or software for estimating loudness based on the channel or channels selected by the pre-process module 120. The loudness analysis module 130 can compare the estimated loudness of the selected channel with a reference loudness level. If the estimated loudness differs from the reference loudness level, the loudness analysis module 130 can output the level difference between the estimated loudness and the reference level. As will be described below, a gain control module 140 can use this level difference to adjust a gain applied to the audio input signal 102.
In certain embodiments, the loudness analysis module 130 uses a nonlinear multiband model of the human hearing system to analyze the loudness characteristics of the audio input signal 102. This model can simulate the filter bank behavior of the human peripheral auditory system. As such, the model can account for loudness, which may be a subjective measure of sound intensity, by estimating the loudness of the audio input signal 102.
The human auditory system behaves as if it contained a bank of band-pass filters that have continuously overlapping center frequencies. An example of such a bank 100B of band-pass filters 160 is shown in
Additionally, loudness can be measured for different individuals by utilizing one or more equal loudness curves, as described above. Example equal loudness curves 170 are shown in
The loudness analysis module 130 may also downsample, decimate, or otherwise reduce the number of samples it uses to process audio information. By decimating the audio input signal 102, for instance, the loudness analysis module 130 uses fewer samples to estimate loudness. Decimation may be performed in certain embodiments because the human hearing system may not be able to detect loudness changes at the same sample rate used to sample the audio input signal 102. Decimation or other sampling rate techniques can reduce computing resources used to compute loudness.
As described above, the loudness analysis module 130 compares the computed loudness with a reference loudness level and outputs the level difference to the gain control module 140. The reference loudness level can be a reference that is internal to the loudness adjustment system 110. For example, the reference level can be a full scale loudness (e.g., 0 dB), so that adjusting the loudness to this level preserves dynamic range. In another embodiment (not shown), the reference level can be a volume level set by a user, e.g., via a volume control.
The gain control module 140 can apply the level difference to the audio input signal 102 on a sample-by-sample basis via mixers 142a and 142b. In certain embodiments, the gain control module 140 smooths transitions between samples or blocks of samples to prevent jarring loudness transitions. As a result, the mixers 142 may output an audio signal that has a constant average loudness level or substantially constant average loudness level. Thus, in certain embodiments, the loudness adjustment system 110 can transform the audio input signal 102 into an audio signal that has a constant average loudness level or substantially constant average loudness level.
The outputs of the mixers 142 are provided to mixers 152a, 152b. These mixers 152 are controlled by a volume control 150. The volume control 150 may be operated by a user, for example. The mixers 152 apply a gain to the output of the mixers 142 according to a volume setting of the volume control 150. The mixers 152 then provide an audio output signal 162, which may be provided to one or more loudspeakers or to other modules for further processing.
Referring to
In addition, as will be described in greater detail below, the pre-processing can include examining noise characteristics of the channels. If a sample block includes primarily noise, for instance, little or no loudness processing may be applied to that sample block.
At a decimation block 210a, the dominant channel signal can be decimated by downsampling and/or filtering the dominant channel. At a loudness process block 212a, a loudness of the lower-rate signal can be estimated by using one or more filters that approximate auditory filters and one or more loudness curves. A level difference may be further determined between the estimated loudness level and a reference loudness level.
At a gain adjustment block 214a, a gain can be calculated based on the level difference. This gain may be applied to both channels of the stereo input signal 202a, rather than just the decimated channel. The gain calculation can include a smoothing function that smooths the calculated gain over a plurality of samples of the stereo input signal. A stereo output signal is provided at block 216a based on the applied gain.
In alternative embodiments, a dominant channel is not selected, but rather each channel is processed to determine a loudness for the channel. A different gain may be applied to each channel based on the computed loudness. In another alternative embodiment, decimation is not performed, and the loudness process 212a operates on a full rate input signal or dominant channel. Many other implementations and configurations may also be used.
Referring to
Advantageously, in certain embodiments, the left and right inputs are provided as one pair to the pre-processing block 204b, and the left and right surround inputs are provided to the pre-processing block 204c. Each of these blocks 204b, 204c can calculate signal energy and determine a dominant channel, which is provided to a decimation block 210b or 210c, respectively. The pre-process block 204d may also calculate signal energy of the center input, but in certain embodiments does not select a dominant channel. However, the signal energy for the center channel may be used later in the process 200B.
Each of the decimation blocks 210 can decimate a selected channel and provide the decimated channel to a loudness process block 212b, 212c, or 212d, respectively. Each of the loudness process blocks 212 can determine a difference between a loudness of the channel and a reference level and output a level difference to a gain adjustment block 214b. Both the decimation blocks 210 and the loudness process blocks 212 may have the same or similar features described above with respect to
In certain embodiments, the gain adjustment block 214b calculates a gain for each of the input channels based on the received level difference from the loudness process blocks 212. The gains may be different for each channel. In some implementations, it can be desirable to emphasize the center channel to increase listener perception of dialogue. However, the loudness processes 212 may generate gains that cause the various channels to drown out the center channel. To address this problem, the gain adjustment block 214b may generate a higher gain for the center channel than for the other channels. In one embodiment, the gain adjustment block 214b maintains a ratio between the gain for the center channel and gains for the other channels.
In alternative embodiments of the process 200B, a dominant channel is not selected, but all channels are processed to determine loudness, and a separate gain is applied to each channel. As another alternative, a dominant channel may be determined between the left and right channels but not between the left surround and right surround channels, or vice versa. In another alternative embodiment, decimation is not performed. In addition, the features shown in
In certain embodiments, the preprocess module 320 operates on sample blocks of the left and right signals 302, 304. For example, the preprocess module 320 may buffer a number of incoming samples into a predetermined sample block size and then process the sample block. The size of the sample blocks may be chosen arbitrarily. For instance, each sample block may include 256, 512, or 768 samples, or a different number of samples.
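The buffering described above might be sketched as follows. This is a minimal illustration only: the function name `sample_blocks`, the default block size of 512, and the choice to drop a trailing partial block are assumptions made for the example and are not taken from the disclosure.

```python
def sample_blocks(left, right, block_size=512):
    """Yield (left_block, right_block) pairs of exactly block_size samples.

    Hypothetical helper: buffers incoming stereo samples until a full
    sample block is available, then hands the block on for processing.
    """
    buf_l, buf_r = [], []
    for l, r in zip(left, right):
        buf_l.append(l)
        buf_r.append(r)
        if len(buf_l) == block_size:
            yield buf_l, buf_r
            buf_l, buf_r = [], []
    # Assumption: a trailing partial block is simply dropped in this sketch.


# Usage: 1100 incoming samples give two full 512-sample blocks.
blocks = list(sample_blocks(range(1100), range(1100), block_size=512))
```

A real implementation would likely keep the partial block buffered until more samples arrive rather than discard it.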
Currently available AGC systems usually do not discriminate between dialog and background noise such as effects. As such, background noise such as rain can potentially be amplified by these systems, resulting in background noise that may sound louder than it should relative to non-background noise. To address this problem, in certain embodiments sample blocks of the left and right signals 302, 304 are provided to a phase analysis module 322. The phase analysis module 322 may include hardware and/or software for using phase analysis to detect background noise and non-background noise portions of each sample block of the left and right signals 302, 304.
The phase analysis module 322 can base its analysis on the insight that voiced (or non-background) samples may be highly correlated whereas non-voiced samples tend to be decorrelated. What this means is that if one examines the left and right channels 302, 304 on a per sample basis, voiced samples tend to have the same phase on both channels 302, 304 at the same time. In other words, voiced samples tend to be in-phase on both channels 302, 304. Non-voiced samples, on the other hand, tend to have different phase at the same point in time, such that a sample on one channel may be positive while a corresponding sample on the other channel may be negative. Thus, a phase distribution of primarily voiced samples may be highly correlated, whereas a phase distribution of primarily non-voiced samples may be less correlated.
The phase analysis module 322 can perform a process to determine if a given sample block includes primarily voiced or non-voiced samples, based on the insights described above.
At decision block 404, it is determined whether a phase distribution exceeds a threshold. For example, it can be determined whether a combined total number of sample pairs that have the same phase is greater than a threshold number. If so, at block 406, the sample block is used for loudness processing because the sample block may include or substantially include a voiced signal. Otherwise, loudness processing is bypassed on the sample block at block 408. This is because the sample block may include or substantially include a non-voiced signal. A minimum gain may be applied to the sample block to deemphasize the background noise of the sample block.
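The threshold comparison of decision block 404 might be sketched as follows. The sketch treats two samples as having the same phase when they have the same sign, which follows the positive/negative description given earlier but is a simplification. The function name `is_voiced_block` and the threshold of half the block are hypothetical values chosen for illustration.

```python
def is_voiced_block(left, right, threshold_fraction=0.5):
    """Return True if the block appears to be primarily voiced (in-phase).

    Assumption: "same phase" is approximated by "same sign" for each
    left/right sample pair; threshold_fraction is a hypothetical tuning
    value, not one stated in the disclosure.
    """
    pairs = list(zip(left, right))
    same_phase = sum(1 for l, r in pairs if l * r > 0)
    return same_phase > threshold_fraction * len(pairs)


# Usage: a centered (in-phase) signal versus an anti-phase signal.
in_phase = is_voiced_block([0.2, -0.3, 0.4, -0.1], [0.1, -0.2, 0.5, -0.3])
anti_phase = is_voiced_block([0.2, -0.3, 0.4, -0.1], [-0.1, 0.2, -0.5, 0.3])
```

Raising or lowering `threshold_fraction` corresponds to applying the phase analysis more or less aggressively.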
In alternative embodiments, loudness processing is applied to non-voiced sample blocks as well as voiced sample blocks. However, a lower gain may still be applied to sample blocks that contain a substantial number of non-voiced samples. In addition, the thresholds described above may be adjusted to more or less aggressively apply the phase analysis.
The phase analysis processes described above can also potentially be used in other applications. For example, this phase analysis may be used with hard limiters or other classic gain adjustment systems, such as compressors. Noise reduction systems can potentially benefit from the use of such analysis. Pitch detection systems can also use this analysis.
Referring again to
The energy analysis module 324 may also compute the maximum or peak values of each channel of the sample block. The energy analysis module 324 may create a temporary buffer to hold this information. The temporary buffer may include the maximum value of the absolute value of the samples on each channel (L, R). The temporary buffer may also include a look-ahead delay line that the energy analysis module 324 populates with the maximum values of the samples of the next sample block. The look-ahead delay line will be described in greater detail below with respect to
Referring again to
An example process 500 that may be performed by the dominant channel module 326 is illustrated in
On the other hand, if the conditions of block 502 are not true, it is further determined at decision block 508 whether a mean square value of the right channel is greater than or equal to a mean square value of the left channel and whether a maximum value of the left channel is greater than a threshold value. If so, then the right channel is considered to be dominant at block 510, and the right channel may be provided for loudness processing at block 512.
If the conditions of block 508 are not true, then a mono signal may be present, and at decision block 514, it is determined whether a maximum value of the left signal is greater than a threshold. If so, then the left channel is provided for loudness processing. Otherwise, it is further determined at decision block 518 whether a maximum value of the right channel is greater than a threshold. If so, the right channel is provided for loudness processing at block 520. Otherwise, the sample block is passed through at block 522 and is not provided for loudness processing because the sample block may be considered to not have any audio or substantially any audio.
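The decision sequence of the process 500 might be sketched as follows. The conditions for blocks 508, 514, and 518 follow the text above. The condition used for block 502 is an assumption, taken here to mirror block 508 with the channels swapped. The function names and the threshold value are likewise hypothetical.

```python
def mean_square(block):
    return sum(s * s for s in block) / len(block)


def peak(block):
    return max(abs(s) for s in block)


def select_dominant(left, right, threshold=1e-4):
    """Return "left", "right", or None (pass the block through)."""
    ms_l, ms_r = mean_square(left), mean_square(right)
    pk_l, pk_r = peak(left), peak(right)
    # Block 502 (ASSUMED mirror of block 508): left energy dominates
    # and the right channel is not silent.
    if ms_l >= ms_r and pk_r > threshold:
        return "left"
    # Block 508: right energy dominates and the left channel is not silent.
    if ms_r >= ms_l and pk_l > threshold:
        return "right"
    # Blocks 514 and 518: possibly a mono signal on one channel only.
    if pk_l > threshold:
        return "left"
    if pk_r > threshold:
        return "right"
    # Block 522: no audio, or substantially no audio, in this block.
    return None


# Usage: a block with a louder left channel selects "left".
choice = select_dominant([0.5, -0.5], [0.1, -0.1])
```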
Referring again to
The decimation process in certain embodiments includes a decimation filter that may down-sample the dominant channel buffer described above with respect to
The decimation filter having the impulse response 600 shown is a length 33 finite impulse response (FIR) filter. This filter can be derived by windowing the causal ideal impulse response, as represented in equation (1):
h(n′)=w(n′)d(n′−LM) (1)
In equation (1), the ideal impulse response is given by:
The decimated sample is then given by:
Advantageously, in certain embodiments, each decimated sample block may be used for more computing resource-efficient loudness processing.
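A windowed-sinc decimation filter of the kind described above might be sketched as follows. The length of 33 taps comes from the text. The decimation factor of 4, the Hamming window, the low-pass cutoff at the new Nyquist frequency, and the unity-gain normalization are all assumptions made for this illustration, since the disclosure's window and ideal response are not reproduced here.

```python
import math


def design_decimation_fir(length=33, factor=4):
    """Window an ideal low-pass response, shifted to be causal."""
    center = (length - 1) // 2  # causal shift (16 for a length-33 filter)
    taps = []
    for n in range(length):
        m = n - center
        # Ideal low-pass response with cutoff pi/factor (assumed form).
        ideal = 1.0 / factor if m == 0 else math.sin(math.pi * m / factor) / (math.pi * m)
        # Hamming window (assumed; the window type is not stated).
        window = 0.54 - 0.46 * math.cos(2.0 * math.pi * n / (length - 1))
        taps.append(window * ideal)
    dc_gain = sum(taps)
    return [t / dc_gain for t in taps]  # normalize to unity gain at DC


def decimate(samples, taps, factor=4):
    """Filter, then keep every factor-th output sample."""
    out = []
    for start in range(0, len(samples), factor):
        acc = 0.0
        for k, h in enumerate(taps):
            idx = start - k
            if idx >= 0:  # samples before the block are treated as zero
                acc += h * samples[idx]
        out.append(acc)
    return out


# Usage: 400 full-rate samples become 100 lower-rate samples.
taps = design_decimation_fir()
low_rate = decimate([1.0] * 400, taps, factor=4)
```

Computing only every fourth filter output, rather than filtering at the full rate and discarding samples, is what saves work in this arrangement.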
In the depicted embodiment, a decimated input 702 is provided to approximation filters 710. The decimated input 702 may include a decimated sample block created by the decimation filter described above with respect to
Gammatone filters have been used to simulate the bank of band-pass filters of the human ear described above with respect to
$g(t) = a\,t^{n-1}\cos(2\pi f t + \phi)\,e^{-2\pi b t}$ (4)
In equation (4), a denotes amplitude, f denotes frequency, n is the order of the filter, b is the filter's bandwidth, and φ is the filter's phase.
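For illustration, the gammatone impulse response of equation (4) can be sampled directly. The function name, the sample rate, and the example parameter values below are assumptions chosen only to show the shape of the computation.

```python
import math


def gammatone_ir(a, f, n, b, phi, fs, num_samples):
    """Sample g(t) = a * t**(n-1) * cos(2*pi*f*t + phi) * exp(-2*pi*b*t)."""
    out = []
    for i in range(num_samples):
        t = i / fs  # time of sample i in seconds
        envelope = a * t ** (n - 1) * math.exp(-2.0 * math.pi * b * t)
        out.append(envelope * math.cos(2.0 * math.pi * f * t + phi))
    return out


# Usage: a fourth-order filter centered at 1 kHz with a 125 Hz bandwidth
# (hypothetical values), sampled at 16 kHz.
ir = gammatone_ir(a=1.0, f=1000.0, n=4, b=125.0, phi=0.0, fs=16000.0, num_samples=512)
```

Convolving an input with an impulse response this long for every band illustrates why a direct gammatone implementation can be costly on low-power devices.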
Gammatone filters can be processing-intensive filters and therefore may not be appropriate choices for electronic devices with low computing resources, such as some televisions. Thus, in certain embodiments, each of the filters 710 approximates a gammatone filter. The filters 710 may also have different center frequencies to simulate the bank of band-pass filters of the human ear. At least some of the filters 710 may be first-order approximations to a gammatone filter. Each first-order approximation may be derived in certain embodiments by a) using a first order Butterworth filter approximation matching a selected center frequency for each filter and by b) using a least squares fit to the frequency response of the initial Butterworth estimate. The filters 710 can each be implemented as an Infinite Impulse Response (IIR) filter to use processing resources more efficiently.
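As a rough illustration of a low-order IIR band-pass filter of the kind discussed above, the sketch below uses a generic second-order band-pass section built from the widely used Audio EQ Cookbook formulas. This is a stand-in and not the Butterworth and least-squares derivation described in the text; the center frequency, Q value, and function names are assumptions.

```python
import cmath
import math


def bandpass_coeffs(center_hz, q, fs):
    """Band-pass biquad with unity gain at the center frequency."""
    w0 = 2.0 * math.pi * center_hz / fs
    alpha = math.sin(w0) / (2.0 * q)
    a0 = 1.0 + alpha
    b = [alpha / a0, 0.0, -alpha / a0]
    a = [1.0, -2.0 * math.cos(w0) / a0, (1.0 - alpha) / a0]
    return b, a


def iir_filter(b, a, samples):
    """Direct-form I difference equation for one second-order section."""
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x in samples:
        y = b[0] * x + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
        x2, x1 = x1, x
        y2, y1 = y1, y
        out.append(y)
    return out


def magnitude(b, a, freq_hz, fs):
    """Magnitude of the filter's frequency response at freq_hz."""
    z = cmath.exp(-1j * 2.0 * math.pi * freq_hz / fs)  # z**-1 on the unit circle
    return abs((b[0] + b[1] * z + b[2] * z * z) / (a[0] + a[1] * z + a[2] * z * z))


# Usage: one band of a hypothetical filter bank, centered at 1 kHz.
b, a = bandpass_coeffs(center_hz=1000.0, q=2.0, fs=16000.0)
filtered = iir_filter(b, a, [1.0] + [0.0] * 63)
```

Each output sample costs only a handful of multiplications, which is the efficiency argument for IIR approximations over long impulse responses.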
Another input 704 is also provided to other approximation filters 720. In certain embodiments, the input 704 is a full-rate sample input, rather than a decimated input. The full-rate input 704 may be used for some frequency bands that human ears are more sensitive to, for higher frequency bands, or the like. In addition, the full-rate input 704 may be used to prevent certain frequency fold-back effects.
The full-rate input 704 is provided to the filters 720. Like the filters 710, the filters 720 can be band-pass filters that approximate gammatone filters. The filters 720 may be derived in a similar manner as described above for the filters 710. However, in certain embodiments, the filters 720 are second-order approximations to the gammatone filters. These filters 720 may also be IIR filters. Normalized frequency responses 810 for a set of example approximation filters are shown in a plot 800 of
In other embodiments, all of the filters operate on decimated inputs (or instead, full-rate inputs). In addition, the number of filters shown is one example, and this number can vary, with fewer filters resulting in possibly better performance with possibly reduced accuracy. The number of filters selected may also depend on the size of available speakers, with larger speakers using more filters.
The filters 710, 720 provide filtered samples to gain blocks 742a, 742b, respectively, which in turn provide the samples to loudness estimators 730a, 730b, respectively. Each of the loudness estimators may implement a loudness estimation process, such as the loudness estimation 900 depicted in
Referring to
Referring again to
Again turning to
$L_{band} = b\,l^{k}$ (8)
where l represents a weighted sample for a given band or an average of the samples, b and k represent constants that can be determined experimentally, and Lband represents loudness for that band.
At block 912, an estimated total loudness of a sample block is computed by summing the loudness values for each band. For example, the output of equation (8) for each band can be summed to obtain the estimated loudness for a sample block.
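The per-band computation and the summation of block 912 might be sketched as follows, reading equation (8) as a power law in which the band loudness equals b times l raised to the power k. The default values of `b` and `k` are placeholders, since the text states only that these constants can be determined experimentally.

```python
def band_loudness(l, b=1.0, k=0.3):
    """Power-law loudness for one band; l is assumed to be non-negative."""
    return b * l ** k


def total_loudness(band_levels, b=1.0, k=0.3):
    """Sum the per-band loudness values for one sample block."""
    return sum(band_loudness(l, b, k) for l in band_levels)


# Usage: three bands with hypothetical weighted levels.
estimate = total_loudness([0.25, 1.0, 0.04])
```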
Referring again to
The energy scaling block 770 weights the total estimated loudness by the energy or power of the block that was calculated by the energy analysis module 324 (
In equation (9), a is a constant that can vary based on a user-defined mode. In one embodiment, two loudness control modes can be applied, either light or normal. Light control may perform a less aggressive loudness adjustment. The value of the constant “a” may be lower, for example, for light loudness adjustment. Ltotal represents the total estimated loudness, and E represents the energy of the block. Level refers to a calculated loudness level 780 for this sample block. This overall loudness level can represent a scalar value that the gain of the signal should reach in order for the loudness of the sample block to follow an equal loudness curve used above (e.g., the 100-phon curve upon which the C-weighting curve was based).
Scaling the total estimated loudness by the energy E (or in other implementations, power) of the block is done in certain embodiments because some signals are below an audible threshold. If these signals were measured for loudness based on one of the loudness curves above, the signals might not be close enough to the loudness curve to calculate an accurate loudness. Thus, the total estimated loudness can be equalized with the energy of the block. Dividing the total estimated loudness by a small energy from a low-signal block can boost the overall level estimation.
At block 1102, a delta level is computed. The delta level can include a difference between the last gain coefficient (e.g., for a previous sample) and the overall level determined above with respect to
delta gain=(Level−Last Gain Coefficient)*g (10)
The constant g can effectively break down the difference between the level calculation and the last gain coefficient (LGC) into a smaller delta gain. In certain embodiments, when the gain control module 140 is first initialized, the LGC is set to 1.0 scalar (or 0 dB), as a reference level to which the overall loudness level is correlated or equalized. In certain embodiments, this is a full-scale reference level used to preserve dynamic range.
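Equation (10) and the per-sample accumulation of the delta gain might be sketched as follows. Setting `g` to the reciprocal of the block length, so that the gain reaches the computed level at the end of the block, is an assumption made for this example; the text says only that g breaks the difference into a smaller delta gain.

```python
def ramp_gains(level, last_gain, block_len, g=None):
    """Return one gain coefficient per sample, stepping by the delta gain."""
    if g is None:
        g = 1.0 / block_len  # hypothetical choice of the constant g
    delta_gain = (level - last_gain) * g  # equation (10)
    gains = []
    gain = last_gain
    for _ in range(block_len):
        gain += delta_gain  # each sample starts from the previous coefficient
        gains.append(gain)
    return gains


# Usage: ramp from the initial coefficient 1.0 down to a level of 0.5.
gains = ramp_gains(level=0.5, last_gain=1.0, block_len=4)
# gains is [0.875, 0.75, 0.625, 0.5]
```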
As will be described below, the delta gain can be applied incrementally to each sample of the sample block until a sample multiplied by its corresponding gain coefficient reaches a certain percentage of a scaled full scale value. Thus, the first sample of the sample block may have a gain that is LGC1+delta gain. For the next sample, there is a new LGC2 that is equal to LGC1+delta gain. Thus, the gain of the second sample may be LGC2+delta gain. In this way, the delta gain can be used to gradually transition from the gain coefficient(s) of a previous sample block to a new set of gain coefficients based on the dynamically changing loudness computation of
To prevent further abrupt changes, at block 1104, a look-ahead line is employed to check if the update of the gain coefficient by increments of the calculated delta gain will result in the corresponding sample's value exceeding a target limiter level. The look-ahead line may be the look-ahead line described above with respect to
At decision block 1106, it is determined whether the target limiter level will be exceeded. If so, a decay can be computed at block 1108 and the delta gain may be zeroed for that sample. The decay value can be computed by taking into consideration the index in the look-ahead line at which the corresponding sample's value would exceed the target limiter level. The index can be an array index or the like. This decay may be computed using the following:
where R denotes the target limiter level, G is the current gain coefficient, and index is the index into the look-ahead line where the sample's value would exceed the target limiter level. In one embodiment, the points at which a decay value is computed for the current sample block are also stored in a temporary buffer and are used later in the process 1100 to smooth the calculated gain coefficients around the detected decay points.
At block 1110, the gain coefficient for a current sample is updated by an amount equal to the current delta gain, as described above. This delta gain is either the delta gain that has been calculated using equation (10) or it is zero if a decay point has been detected. If a decay point has been detected, the gain coefficient is updated by means of the computed decay calculated by equation (11), for example, by adding the decay to the last gain coefficient. At block 1112, the gain coefficients are then smoothed. For example, the gain coefficients can be processed by a first order smoothing function.
The smoothing function may use the stored indices of the decay occurrences in the look-ahead line. The smoothing function can be applied either forward or backwards in the gain coefficients temporary buffer depending on where the decay points are in the buffer. By applying the smoothing function, neighboring gain coefficients can be adjusted so that there is a smoother gain transition around the decay points.
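A first-order smoothing pass over a buffer of gain coefficients, applied either forward or backward, might be sketched as follows. The one-pole form, the smoothing coefficient `alpha`, and the function name are assumptions; the disclosure does not specify the smoothing function beyond its order.

```python
def smooth_gains(gains, alpha=0.2, backward=False):
    """One-pole smoothing of gain coefficients, forward or backward."""
    seq = list(reversed(gains)) if backward else list(gains)
    if not seq:
        return []
    out = [seq[0]]
    for g in seq[1:]:
        # Move a fraction alpha of the way toward each new coefficient.
        out.append(out[-1] + alpha * (g - out[-1]))
    return list(reversed(out)) if backward else out


# Usage: soften an abrupt drop from 1.0 to 0.0 after a decay point.
forward = smooth_gains([1.0, 0.0, 0.0], alpha=0.5)    # [1.0, 0.5, 0.25]
backward = smooth_gains([0.0, 0.0, 1.0], alpha=0.5, backward=True)
```

Running the pass backward spreads the transition over the coefficients that precede a decay point, while running it forward spreads the transition over those that follow.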
The gain is applied for each sample at block 1114. As a result, in certain embodiments the process 1100 may output an audio signal that has a substantially constant average loudness level, with some possibly minor variations in loudness as the gain is smoothed between samples. The output signal may also track or substantially track a loudness curve that was selected above (e.g., the 100 phon curve or another curve). Thus, prior to volume processing by a user in one embodiment (see
Depending on the embodiment, certain acts, events, or functions of any of the algorithms described herein can be performed in a different sequence, may be added, merged, or left out altogether (e.g., not all described acts or events are necessary for the practice of the algorithm). Moreover, in certain embodiments, acts or events may be performed concurrently, e.g., through multi-threaded processing, interrupt processing, or multiple processors or processor cores, rather than sequentially.
The various illustrative logical blocks, modules, and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both. To clearly illustrate this interchangeability of hardware and software, various illustrative components, blocks, modules, and steps have been described above generally in terms of their functionality. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the overall system. The described functionality may be implemented in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the disclosure.
The various illustrative logical blocks and modules described in connection with the embodiments disclosed herein may be implemented or performed by a machine, such as a general purpose processor, a digital signal processor (DSP), an application specific integrated circuit (ASIC), a field programmable gate array (FPGA) or other programmable logic device, discrete gate or transistor logic, discrete hardware components, or any combination thereof designed to perform the functions described herein. A general purpose processor may be a microprocessor, but in the alternative, the processor may be a processor, controller, microcontroller, or state machine, combinations of the same, or the like. A processor may also be implemented as a combination of computing devices, e.g., a combination of a DSP and a microprocessor, a plurality of microprocessors, one or more microprocessors in conjunction with a DSP core, or any other such configuration.
The steps of a method or algorithm described in connection with the embodiments disclosed herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. A software module may reside in RAM, flash memory, ROM, EPROM, EEPROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art. An exemplary storage medium can be coupled to the processor such that the processor can read information from, and write information to, the storage medium. In the alternative, the storage medium may be integral to the processor. The processor and the storage medium may reside in an ASIC. The ASIC may reside in a user terminal. In the alternative, the processor and the storage medium may reside as discrete components in a user terminal.
Conditional language used herein, such as, among others, “can,” “could,” “might,” “may,” “e.g.,” and the like, unless specifically stated otherwise, or otherwise understood within the context as used, is generally intended to convey that certain embodiments include, while other embodiments do not include, certain features, elements and/or states. Thus, such conditional language is not generally intended to imply that features, elements and/or states are in any way required for one or more embodiments or that one or more embodiments necessarily include logic for deciding, with or without author input or prompting, whether these features, elements and/or states are included or are to be performed in any particular embodiment.
While the above detailed description has shown, described, and pointed out novel features as applied to various embodiments, it will be understood that various omissions, substitutions, and changes in the form and details of the devices or algorithms illustrated may be made without departing from the spirit of the disclosure. As will be recognized, certain embodiments of the inventions described herein may be embodied within a form that does not provide all of the features and benefits set forth herein, as some features may be used or practiced separately from others. The scope of certain inventions disclosed herein is indicated by the appended claims rather than by the foregoing description. All changes which come within the meaning and range of equivalency of the claims are to be embraced within their scope.
This application is a divisional of U.S. patent application Ser. No. 12/340,364, filed Dec. 19, 2008, which claims the benefit of priority under 35 U.S.C. §119(e) of U.S. Provisional Patent Application No. 61/016,270, filed on Dec. 21, 2007, and entitled “System for Adjusting Perceived Loudness of Audio Signals,” the disclosures of which are both hereby incorporated by reference in their entirety.
Number | Date | Country |
---|---|---|
20120250895 A1 | Oct 2012 | US |
Number | Date | Country |
---|---|---|
61016270 | Dec 2007 | US |
Relation | Number | Date | Country |
---|---|---|---|
Parent | 12340364 | Dec 2008 | US |
Child | 13526242 | | US |