A computing device may include multiple user interface components. For example, the computing device may include a display to produce images viewable by a user. The computing device may include a mouse, a keyboard, a touchscreen, or the like to allow the user to provide input. The computing device may also include a speaker, a headphone jack for use with headphones or earbuds, or the like, to produce audio that can be heard by the user. The user may listen to various types of audio with the computing device, such as music, sound associated with a video, the voice of another person (e.g., a voice transmitted in real time over a network), or the like. The computing device may be a desktop computer, an all-in-one computer, a mobile device (e.g., a notebook, a tablet, a mobile phone, etc.), or the like, having an audio output device with a limited low frequency response.
A computing device may be small to reduce weight and size, which may make the computing device easier for a user to transport. The computing device may have audio output devices with limited capabilities. For example, the audio output devices may be small to fit within the computing device and to reduce the weight contributed by the audio output devices. However, small audio output devices may provide a poor frequency response at low frequencies. The electro-mechanical speaker drivers may be unable to move a large enough volume of air to reproduce low frequency tones at the level at which they exist in the original audio stream. Accordingly, the low frequency portions of an audio stream may be lost when the audio stream is played by the computing device, thereby limiting the bandwidth of the reproduced audio stream. Similarly, a user may listen to audio by connecting ear buds or headphones to the computing device, which may also have a limited ability to accurately reproduce low frequency portions of the original audio stream.
To compensate for the loss of low frequencies in the audio output device, the audio signal may be modified to create the perception that the low frequency content is present. In an example, harmonics of the low frequency signals may be added to the audio stream. The inclusion of the harmonics may create the perception in listeners that the fundamental frequency is present even though the audio output device is unable to produce the fundamental frequency. This is known as the missing fundamental effect in psycho-acoustics, where the human brain and hearing system operate to fill in the fundamental frequency when it is missing. This principle is exploited with naturally occurring harmonics in the US telephone system, which operates with a bandwidth between 300 Hertz and 3000 Hertz while still allowing listeners to discern male voices with a mean fundamental frequency of approximately 150 Hertz.
The harmonics may be produced artificially by applying non-linear processing to a low frequency portion of the audio stream. However, if the span of the low frequency portion is too broad, then the non-linear processing may create intermodulation distortion (IMD) that is added to the audio stream. IMD can take the form of third-order intermodulation products and beat notes. When the harmonics and IMD products are added to the audio stream, the intermodulation distortion may cause the resultant audio signal to have less clarity and sound “muddied”.
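By way of a small numerical illustration (not taken from the original disclosure), the sketch below passes two tones that share one broad low-frequency band through a square-law non-linearity. The output contains not only the second harmonics of each tone but also sum and difference (beat) components, i.e., the intermodulation distortion described above. The tone frequencies, sampling rate, and use of numpy are assumptions made only for illustration.

```python
# Two tones sharing one broad low-frequency band, passed through a square-law
# non-linearity. Besides the second harmonics (2*f1, 2*f2), the output contains
# beat components at f2 - f1 and f2 + f1 (intermodulation distortion).
import numpy as np

fs = 48000                                   # assumed sampling rate (Hz)
t = np.arange(fs) / fs                       # one second of samples
f1, f2 = 60.0, 95.0                          # two tones in one broad band (assumed)
x = np.sin(2 * np.pi * f1 * t) + np.sin(2 * np.pi * f2 * t)

spectrum = np.abs(np.fft.rfft(x ** 2))       # square-law non-linearity, then FFT
freqs = np.fft.rfftfreq(len(x), d=1 / fs)
strongest = np.sort(freqs[np.argsort(spectrum)[-5:]])
print(strongest)                             # ~[0, 35, 120, 155, 190] Hz
```

Restricting the non-linear processing to a single narrow sub-band, as described below, keeps such widely spaced intermodulation products out of the audio stream.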
Various examples described herein provide for systems, methods and computer-readable media for extending the perceived bandwidth of an audio output device with a limited low frequency capability. For the purpose of the present application, any device that converts an electronic representation of an audio stream to an audio signal perceptible by humans shall be referred to as an audio output device, including without limitation, speakers, ear buds, and headphones.
The term auditory filter, as used herein, refers to a bandpass filter that corresponds to a critical frequency band in the human hearing system. In audiology, a critical band is a band of frequencies within which two separate frequencies cannot be readily distinguished. In some examples, as described in greater detail below, arrays of sub-octave bandpass filters may be used to simulate an array of critical band filters.
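As one hedged sketch of such an array (the band edges, filter family, and order are illustrative assumptions rather than values from the disclosure), a bank of sub-octave bandpass filters covering a low-frequency region might be constructed as follows using scipy:

```python
# A bank of sub-octave Butterworth bandpass filters spanning a low-frequency
# region, used here as a rough stand-in for an array of critical-band filters.
import numpy as np
from scipy.signal import butter

def make_sub_octave_bank(f_low=40.0, f_high=320.0, bands_per_octave=3,
                         fs=48000, order=4):
    """Return a list of (f_lo, f_hi, sos) sub-band filters covering f_low..f_high."""
    n_bands = int(round(bands_per_octave * np.log2(f_high / f_low)))
    edges = f_low * 2.0 ** (np.arange(n_bands + 1) / bands_per_octave)
    bank = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        sos = butter(order, [lo, hi], btype='bandpass', fs=fs, output='sos')
        bank.append((lo, hi, sos))
    return bank

bank = make_sub_octave_bank()   # nine 1/3-octave bands between 40 Hz and 320 Hz
```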
The subsystem 100 may include a sub-band selection engine 103. As used herein, the term “engine” refers to hardware (e.g., a processor, such as an integrated circuit or other circuitry) or a combination of software (e.g., programming such as machine- or processor-executable instructions, commands, or code such as firmware, a device driver, programming, object code, etc.) and hardware. Hardware may include a hardware element with no software elements such as an application specific integrated circuit (ASIC), a Field Programmable Gate Array (FPGA), etc. A combination of hardware and software includes software hosted at hardware (e.g., a software module that is stored at a processor-readable memory such as random-access memory (RAM), a hard-disk or solid-state drive, resistive memory, or optical media such as a digital versatile disc (DVD), and/or executed or interpreted by a processor), or hardware and software hosted at hardware.
The sub-band selection engine 103 may select a dominant sub-band (or multiple sub-bands, in descending order of dominance, for multi-band perceptual bandwidth extension) in the audio stream based on the maximum power detected by the detector bank 102 over a selected time period comprising a frame of the audio stream.
The subsystem 100 may also include a first filter engine 104. In one example, the first filter engine 104 may synthesize a bandpass filter corresponding to the dominant sub-band selected by the sub-band selection engine 103.
The subsystem 100 may include a harmonic engine 105 coupled to the first filter engine 104. The harmonic engine 105 may include a non-linear device that generates harmonics of the dominant sub-band. Finally, the example subsystem 100 may include a second filter engine 106, coupled to the harmonic engine 105, to select a subset of the harmonics generated by the harmonic engine 105, where the selected subset of harmonics of the dominant sub-band can be used to create the perception of low frequency content in an audio stream as described in greater detail below.
The example system 200 may also include a detector bank 202 coupled to the filter bank 201, including power detectors, such as power detectors 1 to N corresponding to sub-band filters 1 to N. Each detector determines the power of the audio input stream in the detector's corresponding sub-band. In other examples, in lieu of power detection, the infinity norm (the maximum dB value) may be computed by first taking the fast Fourier transform (FFT) of the frame, computing the log-magnitude to obtain a dB value in each sub-band, and then selecting the sub-band with the largest dB value.
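A minimal sketch of the detector bank, assuming the hypothetical make_sub_octave_bank() helper above and illustrative numpy/scipy routines, might compute a per-sub-band RMS value for each frame, with the FFT/log-magnitude alternative shown alongside:

```python
# Per-sub-band detection for one frame: RMS power through each sub-band filter,
# plus the FFT/log-magnitude ("infinity norm") alternative described above.
import numpy as np
from scipy.signal import sosfilt

def sub_band_rms(frame, bank):
    """RMS value of the frame's content in each sub-band of the bank."""
    return [np.sqrt(np.mean(sosfilt(sos, frame) ** 2)) for _, _, sos in bank]

def sub_band_peak_db(frame, band_edges, fs=48000):
    """Largest log-magnitude FFT bin (in dB) falling inside each sub-band."""
    mags_db = 20.0 * np.log10(np.abs(np.fft.rfft(frame)) + 1e-12)
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / fs)
    peaks = []
    for lo, hi in band_edges:
        in_band = (freqs >= lo) & (freqs < hi)
        peaks.append(mags_db[in_band].max() if in_band.any() else -np.inf)
    return peaks
```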
The example system 200 may process frames of audio samples. In some examples, the frames of samples may be non-overlapping. In other examples, the frames of samples may be overlapping, such as by advancing the frame one sample at a time or by a fraction of a frame (e.g., ¾, ⅔, ½, ⅓, ¼, etc.). Non-overlapping frames may allow for faster processing, which may prevent audio from becoming noticeably unsynchronized with related video signals. Overlapping frames may track changes in dominant frequencies more smoothly. The frame size may be predetermined based on a sampling frequency, a lowest pitch to be detected (e.g., a lowest pitch that is audible to a human listener), or the like. The frame size may correspond to a predetermined multiple of the period of the lowest pitch to be perceived. The predetermined multiple may be, for example, 0.5, 1, 1.5, 2, 2.5, 3, 3.5, 4, 4.5, 5, etc. A higher multiple may increase accuracy but involve processing of a larger number of samples.
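As a short worked example of the frame-size choice (the sampling frequency, lowest pitch, and multiple below are assumptions):

```python
# Frame size as a predetermined multiple of the period of the lowest pitch.
fs = 48000             # assumed sampling frequency (Hz)
lowest_pitch = 40.0    # assumed lowest pitch to be perceived (Hz)
multiple = 2.0         # assumed predetermined multiple of that pitch's period
frame_size = int(round(multiple * fs / lowest_pitch))   # -> 2400 samples
```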
The example system 200 may include a sub-band selection engine 203. The sub-band selection engine 203 may select a dominant sub-band in an audio input stream based on the maximum signal power detected in the sub-bands. In one example, the sub-band selection engine computes the RMS (root mean square) value of the output of each sub-band filter over a frame, and then selects the maximum RMS value as the dominant sub-band in that frame. Because the system 200 processes multiple frames of audio samples, the dominant sub-band may change from frame to frame. In some examples, the sub-band selection engine 203 may include a smoothing filter to prevent large changes in the dominant sub-band between frames. For example, for non-overlapping frames or overlapping frames with large advances, the dominant frequency may change rapidly between frames, which may produce noticeable artifacts in the audio output. The smoothing filter may cause the dominant frequency to change gradually from one frame to the next. Accordingly, large frame advances can be used to improve processing performance without creating artifacts in the audio output.
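A hedged sketch of the selection and smoothing steps, reusing the hypothetical filter-bank and detector helpers above (the smoothing coefficient is an assumption), might look like:

```python
# Pick the sub-band with the largest per-frame RMS value, then smooth its
# center frequency across frames with a one-pole filter so the dominant band
# cannot jump abruptly between frames.
import numpy as np

def select_dominant_band(rms_values, bank):
    """Return (index, (f_lo, f_hi)) of the sub-band with maximum RMS."""
    idx = int(np.argmax(rms_values))
    lo, hi, _ = bank[idx]
    return idx, (lo, hi)

class CenterFrequencySmoother:
    """One-pole smoother for the dominant band's center frequency."""
    def __init__(self, alpha=0.2):          # smaller alpha -> smoother changes
        self.alpha = alpha
        self.state = None

    def update(self, f_center):
        if self.state is None:
            self.state = f_center
        else:
            self.state += self.alpha * (f_center - self.state)
        return self.state
```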
The example system 200 may include a first filter synthesis engine 204, coupled to the sub-band selection engine 203. The first filter synthesis engine 204 may be notified by the sub-band selection engine 203 of the dominant sub-band in the current frame. In turn, the first filter synthesis engine 204 synthesizes a first filter 205 based on the dominant sub-band in the current frame of the audio input stream. That is, the first filter 205 is synthesized to replicate the sub-band filter corresponding to the dominant sub-band. In one example, the first filter 205 may be a duplicate of the corresponding sub-band filter, or a variation corresponding to the critical band of an auditory filter. As used herein, the term “auditory filter” refers to any filter from a set of contiguous filters that can be used to model the response of the basilar membrane to sound. The basilar membrane, part of the human hearing system, is a pseudo-resonant structure that, like strings on an instrument, varies in width and stiffness. Unlike the parallel strings of a guitar, however, the basilar membrane is a single long structure that has different properties (width, stiffness, mass, damping, and the dimensions of the ducts it couples to) at different points along its length. The motion of the basilar membrane is generally described as a traveling wave. The parameters of the membrane at a given point along its length determine its characteristic frequency, the frequency at which it is most sensitive to sound vibrations. The basilar membrane is widest and least stiff at the apex of the cochlea, and narrowest and most stiff at the base. High-frequency sounds localize near the base of the cochlea (near the round and oval windows), while low-frequency sounds localize near the apex.
As used herein, the term “critical band” refers to the passband of a particular auditory filter. In an example, the first filter synthesis engine 204 may select a first filter 205 corresponding to an auditory filter with a center frequency closest to the center frequency of the dominant sub-band. The first filter synthesis engine 204 may synthesize the first filter 205 based on the corresponding auditory filter, may load predetermined filter coefficients for the selected first filter 205, or the like. In one example, the first filter 205 may be a minimum phase IIR or FIR filter.
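One possible sketch of the first-filter synthesis, under the assumption that the band edges of the selected dominant sub-band are simply reused for a bandpass design (the filter family and order are illustrative):

```python
# Reuse the band edges of the selected dominant sub-band to synthesize a
# bandpass filter that replicates the corresponding sub-band filter.
from scipy.signal import butter

def synthesize_first_filter(band_edges, fs=48000, order=4):
    lo, hi = band_edges
    return butter(order, [lo, hi], btype='bandpass', fs=fs, output='sos')
```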
In one example, the first filter 205 may pass frequencies in the dominant sub-band from the audio input stream, and attenuate or reject all other frequencies in the audio input stream. In one example, the first filter 205 may include an input buffer or delay to compensate for the filtering, detection, selection and synthesis processes described herein, which require a finite amount of processing time.
The example system 200 may also include a harmonic engine 206 to generate harmonics of the frequencies in the dominant sub-band, including both even and odd harmonics. For example, the harmonic engine 206 may apply non-linear processing to the filtered signal to generate the harmonics. The harmonics may include signals with frequencies that are integer multiples of the frequencies in the dominant sub-band. Because the first filter 205 removed frequency components other than those in the dominant sub-band, the harmonic engine 206 may produce less intermodulation distortion and fewer beat notes than if a wide band filter or no filter had been applied. The harmonic engine 206 may produce a signal that includes the dominant sub-band frequencies and the harmonics.
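A minimal sketch of such a non-linear harmonic generator is shown below; the memoryless polynomial waveshaper and its coefficients are assumptions, chosen so that the even-order term contributes even harmonics and the odd-order term contributes odd harmonics:

```python
# Memoryless polynomial waveshaper: the x**2 term contributes even harmonics
# and the x**3 term contributes odd harmonics of the band-limited input.
import numpy as np

def generate_harmonics(band_limited, a2=0.5, a3=0.3):   # coefficients assumed
    x = np.asarray(band_limited, dtype=float)
    y = x + a2 * x ** 2 + a3 * x ** 3
    return y - np.mean(y)        # remove the DC offset introduced by x**2
```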
The example system 200 may include a second filter synthesis engine 207. The second filter synthesis engine 207 may receive parameters related to the first filter 205 from the first filter synthesis engine 204, so that the second filter synthesis engine 207 can synthesize a second filter 208 to pass a subset of the harmonics. Frequencies in the dominant sub-band, and some of the lower-order harmonics, may be at frequencies that the audio output device cannot reproduce, so the second filter synthesis engine 207 may synthesize the second filter 208 to remove those frequencies. Also, higher-order harmonics above a predetermined upper frequency limit may have little effect in creating the perception of the dominant sub-band, so the second filter 208 may remove the higher-order harmonics as well. In some examples, the second filter 208 may keep some or all of the second harmonic, third harmonic, fourth harmonic, fifth harmonic, sixth harmonic, seventh harmonic, eighth harmonic, ninth harmonic, tenth harmonic, etc. The second filter 208 may output a signal that includes the subset of harmonics. In one example, the second filter 208 may include an input buffer or delay to compensate for signal processing delays associated with synthesizing the second filter 208. In one example, the second filter 208 may be a minimum phase IIR or FIR filter.
The second filter 208 may have a lower cutoff frequency and an upper cutoff frequency. As used herein, the term “cutoff frequency” refers to a frequency at which signals are attenuated by a particular amount (e.g., 3 dB, 6 dB, 10 dB, etc.). The second filter synthesis engine 207 may select the cutoff frequencies based on the first filter 205, which may have its own lower and upper cutoff frequencies. The lower cutoff frequency of the second filter 208 may be selected to be a first integer multiple of the lower cutoff frequency of the first filter 205, and the upper cutoff frequency of the second filter 208 may be selected to be a second integer multiple of the upper cutoff frequency of the first filter 205. The first and second integers may be different from each other. The first and second integers may be selected so that the lower cutoff frequency of the second filter 208 excludes harmonics below the capabilities of the audio output device and the upper cutoff frequency of the second filter 208 excludes harmonics that have little effect in creating the perception of the dominant sub-band. In one example, the first integer may be two, three, four, five, six, or the like, and the second integer may be three, four, five, six, seven, eight, nine, ten, or the like.
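As a hedged sketch of this cutoff selection (the integer multiples, filter family, and order below are illustrative assumptions):

```python
# Second-filter cutoffs as integer multiples of the first filter's cutoffs.
from scipy.signal import butter

def synthesize_second_filter(first_lo, first_hi, lower_mult=3, upper_mult=6,
                             fs=48000, order=4):
    lo = lower_mult * first_lo    # exclude harmonics the device cannot reproduce
    hi = upper_mult * first_hi    # exclude harmonics with little perceptual effect
    return butter(order, [lo, hi], btype='bandpass', fs=fs, output='sos')
```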
The system 200 may include a parametric filter engine 209. The parametric filter engine 209 may apply a gain to the subset of harmonics received from the second filter 208 by applying a parametric filter to the signal to shape the spectrum of the signal in order to maximize the psycho-acoustic perception of the missing fundamental frequencies. The parametric filter engine 209 may receive an indication of the gains to apply to different segments of the spectrum from a gain engine 210 and an indication of the lower and upper cutoff frequencies of the second filter 208 from the second filter synthesis engine 207. The parametric filter engine 209 may synthesize the parametric filter based on the gain and the cutoff frequencies of the second filter 208. In one example, without limitation, the parametric filter may be a biquad filter (i.e., a second-order IIR filter). In some examples, gain may be applied to the signal containing the subset of harmonics without using a parametric filter, e.g., using an amplifier to apply a uniform gain to the signal containing the subset of harmonics.
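One way to sketch the parametric filter is as a single peaking biquad built from the widely used RBJ “Audio EQ Cookbook” formulas and centered at the geometric mean of the second filter's cutoff frequencies; the center frequency, Q, and gain values are assumptions, not the disclosure's tuned parameters:

```python
# A single peaking biquad (second-order IIR) built from the RBJ Audio EQ
# Cookbook formulas, centered at the geometric mean of the second filter's
# cutoff frequencies, used to shape the spectrum of the harmonics.
import numpy as np
from scipy.signal import lfilter

def peaking_biquad(f0, gain_db, q, fs):
    """Return normalized (b, a) coefficients for a peaking EQ biquad."""
    A = 10.0 ** (gain_db / 40.0)
    w0 = 2.0 * np.pi * f0 / fs
    alpha = np.sin(w0) / (2.0 * q)
    b = np.array([1 + alpha * A, -2 * np.cos(w0), 1 - alpha * A])
    a = np.array([1 + alpha / A, -2 * np.cos(w0), 1 - alpha / A])
    return b / a[0], a / a[0]

def apply_parametric_gain(harmonics, lo, hi, gain_db=6.0, q=1.0, fs=48000):
    b, a = peaking_biquad(np.sqrt(lo * hi), gain_db, q, fs)   # geometric center
    return lfilter(b, a, harmonics)
```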
The example system 200 may include an insertion engine 211 to insert the amplified subset of harmonics from the parametric filter engine 209 into an audio stream comprising a modified version of the original audio input stream. For example, the original audio input stream may be passed through a high-pass filter 212 and a delay engine 213 before reaching the insertion engine 211.
For example, some or all of the engines, such as sub-band selection engine 203, first filter synthesis engine 204, harmonic engine 206, second filter synthesis engine 207, and parametric filter engine 209 may delay the amplified subset of harmonics relative to the audio input stream. Accordingly, the delay engine 213 may delay the filtered audio input stream to ensure it will be time-aligned with the amplified subset of the harmonics when the filtered audio input stream and the amplified subset of harmonics arrive at the insertion engine 211.
In one example, the insertion engine 211 combines the amplified subset of harmonics with the delayed and filtered audio input stream to create an audio output with harmonics. The amplified subset of harmonics may create the perception of the dominant low frequency components removed by the high-pass filter 212.
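A hedged sketch of the high-pass, delay, and insertion steps (the cutoff frequency, delay value, and filter parameters are assumptions) might be:

```python
# High-pass the original stream (high-pass filter 212), delay it to match the
# harmonics path (delay engine 213), and sum in the shaped harmonics
# (insertion engine 211).
import numpy as np
from scipy.signal import butter, sosfilt

def insert_harmonics(audio, shaped_harmonics, path_delay_samples,
                     hp_cutoff=160.0, fs=48000):
    sos = butter(4, hp_cutoff, btype='highpass', fs=fs, output='sos')
    upper = sosfilt(sos, audio)
    delayed = np.concatenate((np.zeros(path_delay_samples), upper))[:len(upper)]
    n = min(len(delayed), len(shaped_harmonics))
    return delayed[:n] + shaped_harmonics[:n]
```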
The example method 1600 may include determining a maximum power sub-band in a lower frequency portion of an audio stream (block 1602). For example, block 1602 may be performed by the example system 200 by separating the lower frequency portion of the audio stream into sub-bands using an auditory filter bank such as filter bank 201, measuring the RMS power in each sub-band with a bank of detectors such as detector bank 202 in example system 200, and identifying the maximum power sub-band using a sub-band selection engine such as sub-band selection engine 203 in example system 200.
The example method 1600 may include selecting the maximum power sub-band from the lower frequency portion of the audio stream (block 1604). For example, block 1604 may be performed by the example system 200 by using a filter synthesis engine, such as first filter synthesis engine 204 in example system 200 to synthesize a filter, such as first filter 205 in example system 200, and using first filter 205 to extract the maximum power sub-band frequencies from the audio stream.
The example method 1600 may also include generating harmonics of the maximum power sub-band frequencies (block 1606). For example, block 1606 may be performed by example system 200 by applying the maximum power sub-band frequencies from the first filter 205 to a harmonic engine, such as harmonic engine 206 in example system 200.
The example method 1600 may also include selecting a subset of the harmonics of the maximum power sub-band frequencies (block 1608). For example, block 1608 may be performed by example system 200 by using a filter synthesis engine, such as second filter synthesis engine 207 in example system 200, to synthesize a filter, such as second filter 208 in example system 200, to select the subset of harmonics, where the subset is selected to remove harmonics that are below the capabilities of the intended audio output device, and to remove harmonics that have little effect in creating the perception of the dominant sub-band frequencies.
The example method 1600 may also include selectively amplifying the subset of harmonics of the maximum power sub-band frequencies (block 1610). For example, block 1610 may be performed by example system 200 by using a parametric filter engine, such as parametric filter engine 209 in example system 200, to apply a parametric filter to the subset of harmonics, which may apply frequency-selective gain shaping to the subset of harmonics.
The example method 1600 may also include removing the lower frequency portion of the audio stream to isolate an upper frequency portion of the audio stream (block 1612). For example, block 1612 may be performed by example system 200 by using a high-pass filter, such as high-pass filter 212 to remove frequency components from the audio stream that cannot be reproduced by the intended audio output device.
The example method 1600 may also include delaying the upper frequency portion of the audio stream to time-align the upper frequency portion of the audio stream with the subset of harmonics (block 1614). For example, block 1614 may be performed by example system 200 by using a delay engine, such as delay engine 213 in example system 200, where delay engine 213 compensates for any signal processing delays associated with processing engines, such as sub-band selection engine 203, first filter synthesis engine 204, harmonic engine 206, second filter synthesis engine 207, and parametric filter engine 209, and the like.
Finally, the example method 1600 may also include combining the subset of harmonics of the maximum power sub-band frequencies with the upper frequency portion of the audio stream to create the perception of extended low-frequency content (block 1616). For example, block 1616 may be performed by example system 200 by using an insertion engine, such as insertion engine 211, to add the subset of harmonics of the maximum power sub-band frequencies to the filtered and time-aligned upper frequency portion of the audio stream.
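Tying the blocks together, a hedged end-to-end sketch of method 1600 for a single frame, composed from the illustrative helper functions sketched earlier (none of which are the disclosure's implementation), might look like:

```python
# End-to-end sketch of method 1600 for a single frame, composed from the
# illustrative helpers sketched above.
from scipy.signal import sosfilt

def process_frame(frame, bank, fs=48000, path_delay_samples=0):
    rms = sub_band_rms(frame, bank)                                   # block 1602
    _, (lo, hi) = select_dominant_band(rms, bank)
    first_sos = synthesize_first_filter((lo, hi), fs=fs)              # block 1604
    dominant = sosfilt(first_sos, frame)
    harmonics = generate_harmonics(dominant)                          # block 1606
    second_sos = synthesize_second_filter(lo, hi, fs=fs)              # block 1608
    subset = sosfilt(second_sos, harmonics)
    shaped = apply_parametric_gain(subset, 3 * lo, 6 * hi, fs=fs)     # block 1610
    return insert_harmonics(frame, shaped, path_delay_samples, fs=fs)  # 1612-1616
```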
The following example machine-readable instructions may be stored on a computer-readable medium and executed by a processor, such as the processor 1710.
The example instructions include instructions 1721 for determining a maximum power sub-band in a lower frequency portion of an audio stream. For example, instructions 1721 may cause the processor 1710 to separate the lower frequency portion of the audio stream into sub-bands using an auditory filter bank such as filter bank 201 in example system 200, measure the RMS power in each sub-band with a bank of detectors such as detector bank 202 in example system 200, and identify the maximum power sub-band using a sub-band selection engine such as sub-band selection engine 203 in example system 200.
The example instructions may also include instructions 1722 for selecting the maximum power sub-band from the lower frequency portion of the audio stream. For example, instructions 1722 may cause the processor 1710 to implement a filter synthesis engine, such as first filter synthesis engine 204 in example system 200 to synthesize a filter, such as first filter 205 in example system 200, and to use first filter 205 to extract the maximum power sub-band frequencies from the audio stream.
The example instructions may also include instructions 1723 for generating harmonics of the maximum power sub-band frequencies. For example, instructions 1723 may cause the processor 1710 to apply the maximum power sub-band frequencies from the first filter 205, to a harmonic engine, such as harmonic engine 206 in example system 200.
The example instructions may also include instructions 1724 for selecting a subset of the harmonics of the maximum power sub-band frequencies. For example, instructions 1724 may cause the processor 1710 to use a filter synthesis engine, such as second filter synthesis engine 207 in example system 200, to synthesize a filter, such as second filter 208 in example system 200, to select the subset of harmonics, where the subset is selected to remove harmonics that are below the capabilities of the intended audio output device, and to remove harmonics that have little effect in creating the perception of the dominant sub-band frequencies.
The example instructions may also include instructions 1725 for selectively amplifying the subset of harmonics of the maximum power sub-band frequencies. For example, the instructions 1725 may cause the processor 1710 to implement a parametric filter engine, such as parametric filter engine 209 in example system 200, that applies a parametric filter to the subset of harmonics, which may apply frequency-selective gain shaping to the subset of harmonics to enhance the perception of a missing fundamental frequency.
The example instructions may also include instructions 1726 for removing the lower frequency portion of the audio stream to isolate an upper frequency portion of the audio stream. For example, the instructions 1726 may cause the processor 1710 to implement a high-pass filter, such as high-pass filter 212 in example system 200, to remove frequency components from the audio stream that cannot be reproduced by the intended audio output device.
The example instructions may also include instructions 1727 for delaying the upper frequency portion of the audio stream for time-aligning the upper frequency portion with the subset of harmonics of the maximum power sub-band frequencies. For example, instructions 1727 may cause the processor 1710 to implement a delay engine, such as delay engine 213 in example system 200, where delay engine 213 compensates for any signal processing delays associated with processing engines, such as sub-band selection engine 203, first filter synthesis engine 204, harmonic engine 206, second filter synthesis engine 207, and parametric filter engine 209, and the like.
The example instructions may also include instructions 1728 for combining the subset of harmonics of the maximum power sub-band frequencies with the upper frequency portion of the audio stream. For example, instructions 1728 may cause the processor 1710 to implement an insertion engine, such as insertion engine 211 in example system 200, to add the subset of harmonics of the maximum power sub-band frequencies to the filtered and time-aligned upper frequency portion of the audio stream.
The foregoing description of various examples has been presented for purposes of illustration and description. The foregoing description is not intended to be exhaustive or limiting to the examples disclosed, and modifications and variations are possible in light of the above teachings or may be acquired from practice of various examples. The examples discussed herein were chosen and described in order to explain the principles and the nature of various examples of the present disclosure and its practical application to enable one skilled in the art to utilize the present disclosure in various examples and with various modifications as are suited to the particular use contemplated. The features of the examples described herein may be combined in all possible combinations of methods, apparatus, modules, systems, and computer program products.
It is also noted herein that while the above describes examples, these descriptions should not be viewed in a limiting sense. Rather, there are several variations and modifications which may be made without departing from the scope as defined in the appended claims.