Aspects of the present invention relate to digital signal processing of audio signals, and particularly to a digital audio workstation for processing audio tracks and audio mixing.
A digital audio workstation (DAW) is an electronic device or software application for recording, editing and producing audio files such as musical pieces, speech or sound effects. DAWs typically provide a user interface that allows the user to record, edit and mix multiple recordings and tracks into a mixed audio production. Modern computer-based DAWs support software plug-ins, each having its own functionality, which may expand the sound processing capabilities of the DAW. There are software plug-ins, for example, for equalization, limiting, compression, reverberation and echo. Software plug-ins may provide further audio sources within a DAW such as virtual instruments.
Various methods performable in a computer system are described herein for adjusting gains of an audio mix. An audio mix is provided including multiple audio tracks and respective gains. The audio tracks using the respective gains are individually analyzed to compute therefrom a first metric of frequency content. A user input specifies a desired second metric of the frequency content. Responsive to the user input, respective gains of the audio tracks are collectively and simultaneously adjusted to produce respective adjusted gains of the audio tracks. A second audio mix of the audio tracks with the respective adjusted gains, when played, has a third metric of frequency content different from the first metric of frequency content. The third metric is closer to the second metric than the first metric is. The audio tracks using the respective adjusted gains may be mixed into the second audio mix and the second audio mix may be played. The second metric may be responsive to a control parameter for collectively and simultaneously adjusting the respective gains of the audio tracks. The second metric of the second audio mix may be responsive to a single control parameter for collectively and simultaneously adjusting the respective gains of the audio tracks. A control may be provided on a user interface. The control may be configured for the collective and simultaneous adjustment of the respective gains of the audio tracks.
Analysis of the audio tracks may include providing a previously determined frequency-dependent audio filter. The audio filter may be applied respectively to the audio tracks and respective values of loudness may be measured. Adjustment of the respective gains of the audio tracks may be responsive to the measured values of loudness.
Analysis of the audio tracks may include providing multiple previously determined frequency-dependent audio filters including a first filter and a second filter. The first filter and said second filter may have different audio frequency responses. The first filter and the second filter may be applied respectively to the audio tracks and values of loudness measured respective to the first filter and the second filter. Adjustment of the respective gains of the audio tracks may be responsive to a difference between the values of loudness respective to the first filter and the second filter.
Adjustment of the respective gains of the audio tracks may include normalization thereby maintaining loudness of the second audio mix.
Various user interfaces are disclosed herein, including user interfaces for performing the methods disclosed herein in a computer system.
These, additional, and/or other aspects and/or advantages of the present invention are set forth in the detailed description which follows; possibly inferable from the detailed description; and/or learnable by practice of the present invention.
The invention is herein described, by way of example only, with reference to the accompanying drawings, wherein:
The foregoing and/or other aspects will become apparent from the following detailed description when considered in conjunction with the accompanying drawing figures.
Reference will now be made in detail to features of the present invention, examples of which are illustrated in the accompanying drawings, wherein like reference numerals refer to the like elements throughout. The features are described below to explain the present invention by referring to the figures.
By way of introduction, various embodiments of the present invention are directed to mixing multiple audio tracks into a playable audio file or mix which contains audio content of the tracks but with individual audio processing and individual gains. For example, a song may be recorded with multiple microphones not necessarily at the same time. There may be audio tracks for each instrument or voice; or multiple audio tracks from multiple respective microphones of the same instrument or voice. The audio tracks may be audio processed individually and mixed into an audio mix from all the audio sources intending to produce a pleasant sound experience when the audio mix is played.
In a live performance with multiple microphones, signals from the microphones may be mixed together and amplified to be heard by the audience. In this live scenario, among others, it is desirable to set levels of the audio tracks in a manner which achieves specific tonal or acoustic properties that are perceived by the audience.
Referring now to the drawings, reference is now made to
Different embodiments of the present invention may be configured generally to achieve a target such as more loudness (i.e. bass and treble) or an arbitrary tonal shape, frequency response or contour equalization (EQ) target.
Reference is now also made to
first metric. The third metric is more similar to or closer to the frequency content attribute, i.e. second metric specified by user input 25, than the first metric.
Reference is now also made to
Adjusted gains may be determined using a computation that utilizes two values A0 and A1 for each audio track, which relate to a control parameter, e.g. position of slider 15.
The first value A0 may correspond to the audio track's effect on a first setting of slider 15, e.g., the “dark” side, in the dark/bright usage example.
The second value A1 may correspond to the audio track's effect on a second setting of slider 15, e.g., the “bright” side, in the dark/bright usage example.
A first filter 31, e.g. a shelving filter, may be provided which attenuates a lower frequency band and/or boosts a higher frequency band. Loudness L0 of the audio track, in decibels, may be measured after filtering the audio track with shelving filter 31, which emphasizes the treble content of the audio track.
Similarly, a second filter 33 may be used which emphasizes the bass content of the audio track. Second filter 33 may be a shelving filter such as first filter 31 inverted. Loudness L1 of the audio track in decibels may be measured using second filter 33 similarly to first filter 31. Loudness measurements L0(i) and L1(i) upon applying respective filters 31 and 33 of audio tracks i are an example of a first metric (
Adjusted gains for audio track i may be calculated from the loudness measurements L0(i) and L1(i) as follows:
L0(i)=Loudness(i,A0)
L1(i)=Loudness(i,A1)
Combined logarithmic gain factor Φdb(i) in decibels for audio track i may be given by:
Φdb(i)=L0(i)−L1(i)
In general, combined audio track logarithmic gain factor Φdb in decibels may be given by:
f(A0)−f(A1) where f is a general measurable audio property.
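The measurement of L0(i), L1(i) and Φdb(i) may be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: a one-pole low-pass split stands in for shelving filters 31 and 33, an RMS level in decibels stands in for loudness, and the function names, the 1 kHz crossover and the 6 dB tilt are all hypothetical choices.

```python
import math

def one_pole_lowpass(x, cutoff_hz, sample_rate):
    """First-order low-pass filter (assumed stand-in for the low band)."""
    a = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / sample_rate)
    y, state = [], 0.0
    for s in x:
        state += a * (s - state)
        y.append(state)
    return y

def shelf(x, low_gain_db, high_gain_db, cutoff_hz=1000.0, sample_rate=48000):
    """Crude shelving-style filter: scale the low band and the remainder separately."""
    low = one_pole_lowpass(x, cutoff_hz, sample_rate)
    g_low = 10.0 ** (low_gain_db / 20.0)
    g_high = 10.0 ** (high_gain_db / 20.0)
    return [g_low * l + g_high * (s - l) for s, l in zip(x, low)]

def rms_db(x):
    """RMS level in decibels, used here as a simple proxy for loudness."""
    if not x:
        raise ValueError("empty signal")
    mean_square = sum(s * s for s in x) / len(x)
    return 10.0 * math.log10(max(mean_square, 1e-20))

def combined_gain_db(track, gain=1.0, tilt_db=6.0, sample_rate=48000):
    """Phi_db(i) = L0(i) - L1(i) for one track at its current gain."""
    scaled = [gain * s for s in track]
    # L0: level after the treble-emphasizing filter (role of filter 31)
    l0 = rms_db(shelf(scaled, -tilt_db, +tilt_db, sample_rate=sample_rate))
    # L1: level after the bass-emphasizing filter (role of filter 33)
    l1 = rms_db(shelf(scaled, +tilt_db, -tilt_db, sample_rate=sample_rate))
    return l0 - l1
```

With these stand-ins, a track dominated by low frequencies yields a negative Φdb and a track dominated by high frequencies yields a positive Φdb. Because both measurements are taken on the same scaled signal, the difference does not depend on the track's current gain.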
It may be convenient to normalize the audio tracks so that the overall loudness (or general measurable audio property) does not drift. Combined logarithmic gain Φdb(i) of each audio track i is then replaced by:
where μ is a mean combined logarithmic gain over audio tracks i and σ is a standard deviation of combined logarithmic gain Φdb(i) over audio tracks i.
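The replacement expression itself is not reproduced in the text above. One reading consistent with the definitions of μ and σ is a standard-score normalization, (Φdb(i)−μ)/σ. The sketch below assumes that reading, and also assumes a population standard deviation and a zero result when all tracks share the same factor; the function name is hypothetical.

```python
import statistics

def normalize_gain_factors_db(phi_db):
    """Assumed normalization: subtract the mean and divide by the standard deviation."""
    mu = statistics.fmean(phi_db)      # mean combined logarithmic gain over tracks
    sigma = statistics.pstdev(phi_db)  # population standard deviation (assumption)
    if sigma == 0.0:
        # All tracks have the same factor, so there is no relative tilt to apply.
        return [0.0 for _ in phi_db]
    return [(p - mu) / sigma for p in phi_db]
```

Subtracting the mean centers the factors, so boosts on some tracks are balanced by cuts on others, which is one way the overall level may be kept from drifting.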
Linear gain Φlin(i) for audio track i is given by:
Φlin(i)=10^(Φdb(i)/20)
Using linear gain Φlin, audio track i undergoes multiplicative gain compensation based on the position of slider 15, which may be designated as α, bounded between 0.0 and 1.0, where 0.5 is the neutral middle position, by way of example.
The adjusted gain of an audio track may be determined as a linear interpolation between respective gains at:
a first point of slider 15, e.g. α=0.0:
μ0=1.0/Φlin, and
a second point of slider 15, e.g. α=1.0:
μ1=1.0·Φlin.
Specifically, the adjusted gain, over α varying between 0.0 and 1.0, may be:
μ0·(1.0−α)+μ1·α
In general, adjusted gain=g(α, initial audio track gain) where g represents a general function.
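The conversion to linear gain and the interpolation over the slider position may be sketched as follows. The sketch takes the endpoint definitions at face value, so that α=0.0 selects μ0=1.0/Φlin and α=1.0 selects μ1=Φlin, and it assumes that the interpolated factor multiplies the track's initial gain. The function names are hypothetical.

```python
def db_to_linear(phi_db):
    """Convert a logarithmic gain factor in decibels to a linear gain."""
    return 10.0 ** (phi_db / 20.0)

def adjusted_gain(initial_gain, phi_db, alpha):
    """Interpolate between 1/phi_lin (alpha = 0.0) and phi_lin (alpha = 1.0)."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie between 0.0 and 1.0")
    phi_lin = db_to_linear(phi_db)
    mu0 = 1.0 / phi_lin  # gain factor at the first slider point
    mu1 = phi_lin        # gain factor at the second slider point
    factor = mu0 * (1.0 - alpha) + mu1 * alpha
    # Assumption: the factor is applied multiplicatively to the initial gain.
    return initial_gain * factor
```

Under this convention, a track with a positive Φdb is attenuated toward α=0.0 and boosted toward α=1.0, while a track with Φdb equal to zero keeps its initial gain at every slider position.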
Method 30 according to features of the present invention may be used to obtain loudness perception using two filters 31, 33 emphasizing high and low frequencies respectively. Given a mix including various initial gains for each audio track, the adjusted gains are determined according to embodiments of the present invention to provide relative levels between the audio tracks that cause the audio mix to become darker or brighter (i.e. more bass or more treble, respectively) dependent on the control parameter or slider 15 position.
Filters 31, 33 may be of any general form which determines the overall effect of slider 15. A filter may be targeted for a specific equalization contour so that moving slider 15 may cause frequency metrics of the audio mix to tend to meet the target in one position of slider 15 and tend to miss the target in another position of the slider 15.
Reference is now made to
The embodiments of the present invention may comprise a general-purpose or special-purpose computer system including various computer hardware components, which are discussed in greater detail herein. Embodiments within the scope of the present invention also include computer-readable media for carrying or having computer-executable instructions, computer-readable instructions, or data structures stored thereon. Such computer-readable media may be any available media, transitory and/or non-transitory which is accessible by a general-purpose or special-purpose computer system. By way of example, and not limitation, such computer-readable media can comprise physical storage media such as RAM, ROM, EPROM, flash disk, CD-ROM or other optical disk storage, magnetic disk storage or other magnetic or solid state storage devices, or any other media which can be used to carry or store desired program code means in the form of computer-executable instructions, computer-readable instructions, or data structures and which may be accessed by a general-purpose or special-purpose computer system.
The term “audio track” as used herein refers to an audio signal which may be mixed or combined with other audio tracks to produce a playable audio production.
The term “audio mix” or “mix” refers to the playable audio production after multiple audio tracks are combined with suitable gains.
The terms “audio track” and “channel” are used herein interchangeably.
The term “original” audio track refers to an audio track as recorded prior to digital signal processing.
The terms “perceived frequency content”, “tonal balance”, “energy profile”, “bright/dark” and “color” are used herein interchangeably and refer to the perception of frequency content of a mix of audio tracks.
The term “shelving filter” (also known as a shelf filter, shelf EQ or shelving EQ) refers to a filter which attenuates either the high end or the low end of an audio frequency spectrum, such as between 20 Hertz and 20 kiloHertz.
The term “loudness” as used herein is the subjective perception of sound pressure. Loudness levels are normally expressed as a value relative to a reference value or beginning value.
The terms “gain”, “amplitude” and “level”, although not technically the same, are interchangeable in the context of the present disclosure. Thus, adjusting a gain for an audio track results in an adjusted sound amplitude or an adjusted sound level of the audio track when played.
The term “collectively” as used herein refers to adjusting two or more gains with the same control mechanism or motion.
The term “independently” as used herein refers to adjusting two or more gains without a direct, e.g. proportional, dependence between the two or more adjustments.
The term “metric” as used herein in the context of a metric of frequency content refers to one or more measured or perceived attributes of an audio track or audio mix.
The indefinite articles “a” and “an” as used herein, such as in “an audio track” or “an amplitude”, have the meaning of “one or more”, that is, “one or more audio tracks” or “one or more amplitudes”.
All optional and preferred features and modifications of the described embodiments and dependent claims are usable in all aspects of the invention taught herein. Furthermore, the individual features of the dependent claims, as well as all optional and preferred features and modifications of the described embodiments are combinable and interchangeable with one another.
Although selected features of the present invention have been shown and described, it is to be understood that the present invention is not limited to the described features.
Number | Date | Country | |
---|---|---|---|
20210397409 A1 | Dec 2021 | US |