This disclosure is directed to signal processing, and, more specifically, to a system for converting data samples at a first rate to data samples at a second rate.
In general, conventional rate converters include two major computation blocks, as illustrated in
The second computation block is the sample interpolator. The function of this block is to interpolate the input data sample and to generate the output data sample with the real value index. One problem with conventional rate converters is that they create aliasing from the resampling, and it is relatively difficult to produce an output signal having a high Signal-to-Noise Ratio (SNR) and having a relatively flat frequency response within standard frequency ranges for audio signals.
Embodiments of the invention address this and other limitations of the prior art.
Embodiments of the invention may be used to implement a rate converter that includes: 6 channels in a forward (audio) path, a 24-bit signal path per channel, and an end-to-end SNR of 110 dB, all within the 20 Hz to 20 KHz bandwidth. Embodiments may also be used to implement a rate converter having: 2 channels in a reverse path, such as for voice signals, a 16-bit signal path per channel, and an end-to-end SNR of 93 dB, all within the 20 Hz to 20 KHz bandwidth.
The rate converter may include sample rates such as 8, 11.025, 12, 16, 22.05, 24, 32, 44.1, 48, and 96 KHz.
Further, rate converters according to embodiments may include a gated clock in low-power mode to conserve power.
For comparison,
In general, with reference to
There are also several output signals. An output audio sample includes a single word with all of the audio channels output from the converter. It is time multiplexed for all channels. The output audio channel id tells which channel the sample belongs to. An output sample ready signal signifies that the next audio sample is ready. An input sample ack signifies that the converter is ready to accept the next signal from the audio source. Also, a register bank access signal may be output.
In general, with reference to
Both input and output sample rates may be jittery. The estimated sample rate ratio should be stable enough to yield a high SNR, and also should maintain a stable average value for the sample index. Ideally, for maximum performance input buffers should not overflow or underflow, and audio latency variation should be minimized.
In operation, an initial estimate of drate (decimation rate) is obtained either from a user setting or by measuring the input and output sample rates. The estimated value is stored in drate reg, which is a register that stores the decimation rate.
The tracking loop circuit includes one or more different “gears.” In one embodiment there are four gears. The difference between the gears is the value for the gain elements G1, G2 and G3. The gain values for each of the gear levels are programmed in registers and can be changed by firmware. The gear control decides when to move the gear up or down. The change is controlled by two factors. In some embodiments, the converter stays in each gear for a minimal amount of time before switching to another gear. Also in some embodiments, this minimal time doubles for each higher gear. In one embodiment, the rate converter of
Table 1 includes default values for tracking loop parameter registers according to embodiments of the invention.
The tracking loop of
In a particular implementation of embodiments of the invention, the index error is sampled at 24 MHz to obtain timing accuracy well below the sample period. Averaging over $2^{14}$ samples allows the entire 2nd order phase lock loop and the error-smoothing IIR to operate from a 1.5 KHz clock, which allows a very low current draw. The sample index buffer has 28 bits. With one input sample counting as $2^{23}$, that gives enough room for 16 samples. Since the input buffer holds only 16 samples more than the filter requires, this is enough to cover all possible cases. The drate register contains 41 bits, which is 13 more LSBs than the sample index buffer. The 13 LSBs may be a conservative number, but they cost very little in area and current. The error IIR buffer contains 45 bits. It adds 13 LSBs to the input from the index error average. Again, 13 LSBs may be a conservative number, but they cost very little in area and current since no multipliers are required. The operation frequency is 1.5 KHz. Dithering is applied when reducing drate from 41 bits to 28 bits before it is added to the sample index buffer.
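The bit widths and clock figures above can be checked with a short sketch. The constants come from the text; the dither itself is an assumption, since the text does not say which kind of dither is used. Here a uniform random value spanning the 13 discarded LSBs is added before truncation, and `dither_to_index` is a hypothetical name.

```python
import random

CLOCK_HZ = 24_000_000                    # index-error sampling clock (from the text)
AVG_LOG2 = 14                            # averaging over 2**14 samples
INDEX_BITS = 28                          # sample index buffer width
DRATE_BITS = 41                          # drate register width
EXTRA_LSBS = DRATE_BITS - INDEX_BITS     # 13 extra LSBs in drate

# Loop update rate after averaging: about 1465 Hz, i.e. roughly 1.5 KHz.
loop_rate_hz = CLOCK_HZ / (1 << AVG_LOG2)

def dither_to_index(drate, rng):
    """Reduce a 41-bit drate to 28 bits.

    Assumption: uniform dither over the discarded LSBs is added before
    truncation, so the truncation error averages out over time.
    """
    return (drate + rng.randrange(1 << EXTRA_LSBS)) >> EXTRA_LSBS
```

With this kind of dither, the long-run average of the 28-bit values tracks the full 41-bit drate, even though each individual value carries only 28 bits.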
Compared to the rate converter illustrated in
In one embodiment, the gear selection of the new rate converter depends only on time. For each gear level, it stays for a fixed amount of time. The stay time for gear level n+1 is twice the time for level n. The time to stay in each gear depends on the phase lock quality. Different initial conditions and sample conversion ratios decide how long to stay. A threshold-based decision is a good implementation method.
There is no 2nd order tracking loop. Instead, the drate output is dumped into the register four times during the operation of the rate converter.
The first time is at the beginning of rate conversion, when it is set to the initially estimated drate.
Each time the gear changes up, the current drate is dumped into a register.
The sample index buffer is not adjusted for every output sample, but instead is adjusted when the accumulated increase is above one input sample.
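The time-only gear schedule described above, in which each gear level stays twice as long as the previous one, can be sketched as follows. This is a minimal illustration rather than the circuit's actual control logic; `base_time`, `dwell_times`, and `gear_at` are hypothetical names, and the four-gear count is taken from the one embodiment mentioned earlier.

```python
NUM_GEARS = 4  # one embodiment in the text uses four gears

def dwell_times(base_time, num_gears=NUM_GEARS):
    """Stay time per gear; gear n+1 stays twice as long as gear n."""
    return [base_time * (2 ** n) for n in range(num_gears)]

def gear_at(elapsed, base_time, num_gears=NUM_GEARS):
    """Gear reached after `elapsed` time units when shifting up on time alone."""
    gear = 0
    # The top gear has no further gear to shift into, so skip its dwell time.
    for stay in dwell_times(base_time, num_gears)[:-1]:
        if elapsed < stay:
            break
        elapsed -= stay
        gear += 1
    return gear
```

For example, with `base_time = 10` the stay times are 10, 20, 40, and 80 units, so the top gear is reached after 70 units.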
Introducing the 2nd order tracking loop is the biggest change for performance compared to the converter of
The older design of
While the above description has been focused on rate tracking, embodiments of the invention additionally include a new design for the other major block of a rate converter, the sample interpolator.
Theoretical Background
In theory, the approach is to apply a continuous time filter with certain frequency response to a discretely sampled input and then discretely resample it at another sample rate.
The frequency response of such a filter should reject the alias generated by the discrete time input samples as much as possible while maintaining the audio passband as much as possible.
For implementation simplicity, embodiments of the invention use a filter length of 80 times the input sample period. For example, suppose the filter value is $f(t)$ for $-40T_s < t \le 40T_s$. Let the input samples be $s_n$. The output sample is then

$$s'_m = \sum_{k=-40}^{40} s_{\lfloor mr \rfloor + k} \cdot f\big((mr - \lfloor mr \rfloor + k) \cdot T_s\big)$$

where $r$ is the output sample to input sample ratio.
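As a rough illustration of the summation above, the sketch below evaluates one output sample in floating point. Several things here are assumptions and not part of the disclosure: the filter `f` is a stand-in Hann-windowed sinc, `r` is treated as the step in input-sample periods per output sample, time is measured in units of the input sample period, and the filter argument uses the common convolution sign convention (fractional position minus k).

```python
import math

HALF = 40  # filter half-length in input sample periods (80 in total, from the text)

def f(t):
    """Hypothetical stand-in filter: Hann-windowed sinc, t in input sample periods."""
    if abs(t) >= HALF:
        return 0.0
    sinc = 1.0 if t == 0 else math.sin(math.pi * t) / (math.pi * t)
    return sinc * 0.5 * (1.0 + math.cos(math.pi * t / HALF))

def resample_point(s, m, r):
    """Output sample m, interpolated from input samples s at position m*r."""
    pos = m * r
    base = math.floor(pos)      # integer part of the sample index
    frac = pos - base           # fractional part of the sample index
    acc = 0.0
    for k in range(-HALF, HALF + 1):
        n = base + k
        if 0 <= n < len(s):     # samples outside the buffer count as zero
            acc += s[n] * f(frac - k)
    return acc
```

For a low-frequency sine input, `resample_point` should land very close to the sine evaluated at the fractional position, which is a convenient sanity check on any candidate filter.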
It is conventionally difficult to obtain the values of the filter f. The filter can be pre-computed and stored, but the memory requirement is very large. Achieving 120 dB alias rejection would require more than a million stored entries, which is impractical. Instead, the whole 80-sample period may be broken into many pieces, with a polynomial for each piece approximating the continuous time filter. One embodiment includes a 3rd order polynomial over each of 640 time pieces. That selection keeps the alias more than 120 dB below the signal for the entire audio pass band.
The way to generate such polynomials is to first generate a vastly oversampled filter with the desired frequency response and length. Then, within each of the 640 time pieces, the points of the filter are fit to a 3rd order polynomial with minimum square error. The performance of such polynomials can be verified by computing 30 points in each time piece and examining the overall frequency response.
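The per-piece fit can be sketched as an ordinary least-squares cubic fit. The version below solves the 4x4 normal equations in pure Python. It is one illustrative way to carry out a minimum-square-error fit and is not necessarily how the coefficients were actually produced; `fit_cubic` is a hypothetical name.

```python
def fit_cubic(xs, ys):
    """Least-squares fit of c0 + c1*x + c2*x**2 + c3*x**3.

    Returns [c0, c1, c2, c3] for the sample points (xs, ys) of one time piece.
    """
    # Normal equations A c = b, with A[i][j] = sum(x**(i+j)), b[i] = sum(y * x**i).
    A = [[sum(x ** (i + j) for x in xs) for j in range(4)] for i in range(4)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(4)]
    # Gaussian elimination with partial pivoting.
    for col in range(4):
        piv = max(range(col, 4), key=lambda row: abs(A[row][col]))
        A[col], A[piv] = A[piv], A[col]
        b[col], b[piv] = b[piv], b[col]
        for row in range(col + 1, 4):
            factor = A[row][col] / A[col][col]
            for j in range(col, 4):
                A[row][j] -= factor * A[col][j]
            b[row] -= factor * b[col]
    # Back substitution.
    c = [0.0] * 4
    for i in range(3, -1, -1):
        c[i] = (b[i] - sum(A[i][j] * c[j] for j in range(i + 1, 4))) / A[i][i]
    return c
```

In use, `xs` would hold the local positions of the oversampled filter points inside one of the 640 pieces and `ys` the filter values at those points.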
From the graph, it can be seen that the passband ripple for 6× decimation is 0.1 dB, for 4× decimation is 0.01 dB and even lower for other filters.
For all other filters, the alias rejection varies between −130 dB and −125 dB. There are occasional strays going up to −122 dB.
In operation, one operation cycle is performed for each output sample. In one embodiment, each operation cycle goes through 80 samples, corresponding to the filter size. For each of the 80 samples, a coefficient is first generated, and then a multiply-and-add is performed over one input sample for each audio channel. When the whole computation is complete, the output sample is stored in the 2-sample FIFO buffer.
In the following example, let N be the number of audio channels.
Each operation cycle takes (N+3)*80 cycles.
Let k be the sample index between 0 and 79.
The coefficient ROM is read in the following cycles:
The coefficient buffer is updated during cycles k*(N+3)+1 to k*(N+3)+4. It takes input from coefficient ROM in cycle k*(N+3)+1 and takes input from multiplier accumulator in other cycles.
The input RAM is read during cycles k*(N+3)+5 to k*(N+3)+N+4. Note that the last cycle of input RAM access overlaps with the first cycle of coefficient ROM access for the next sample, but this does not cause a problem.
The output accumulator is reset to 0 at cycle 0 and is updated during cycles k*(N+3)+5 to k*(N+3)+N+4.
The multiplier selector takes LSB of sample index during cycles k*(N+3)+2 to k*(N+3)+4 and takes input RAM during cycles k*(N+3)+5 to k*(N+3)+N+4.
The adder selector takes coefficient buffer during cycles k*(N+3)+2 to k*(N+3)+4 and takes output accumulator during cycles k*(N+3)+5 to k*(N+3)+N+4.
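The cycle ranges listed above can be tabulated with a small helper. This is only a bookkeeping sketch of the stated ranges (`schedule` and `cycles_per_operation` are hypothetical names); it does not model the hardware itself.

```python
TAPS = 80  # input samples visited per operation cycle (the filter length)

def schedule(n_channels, k):
    """Inclusive cycle ranges for sample index k, as listed in the text."""
    base = k * (n_channels + 3)
    return {
        "coeff_buffer_update": (base + 1, base + 4),
        "input_ram_read": (base + 5, base + n_channels + 4),
        "accumulator_update": (base + 5, base + n_channels + 4),
    }

def cycles_per_operation(n_channels):
    """Total cycles in one operation cycle: (N+3)*80."""
    return (n_channels + 3) * TAPS
```

With N = 6, the input RAM read for k = 0 ends at cycle 10, which is also the first coefficient cycle for k = 1. This is the one-cycle overlap noted above.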
Let x be the LSB of the sample index. The operations performed are:
Cycle k*(N+3)+1 gets $c' = c_3$
Cycle k*(N+3)+2 gets $c' = c x + c_2 = c_3 x + c_2$
Cycle k*(N+3)+3 gets $c' = c x + c_1 = c_3 x^2 + c_2 x + c_1$
Cycle k*(N+3)+4 gets $c' = c x + c_0 = c_3 x^3 + c_2 x^2 + c_1 x + c_0$. That is the coefficient applied over all audio channels.
Cycle k*(N+3)+5+j, 0≤j<N, gets $b_j' = b_j + c \cdot a_{j,k}$. Here $b_j$ is the output accumulator for channel j and $a_{j,k}$ is the input sample k for channel j.
At the end of (N+3)*80 cycles, the output accumulators are dumped into the output buffer and are ready for output on the output strobe.
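The per-sample steps above amount to a Horner evaluation of the cubic followed by one multiply-accumulate per channel. The floating-point sketch below mirrors that order of operations. It ignores the fixed-point word widths and rounding of the actual datapath, and `horner` and `operation_cycle` are hypothetical names.

```python
def horner(c3, c2, c1, c0, x):
    """Evaluate c3*x**3 + c2*x**2 + c1*x + c0 with three multiply-adds."""
    c = c3              # first coefficient cycle: load c3
    c = c * x + c2      # second coefficient cycle
    c = c * x + c1      # third coefficient cycle
    c = c * x + c0      # fourth coefficient cycle
    return c

def operation_cycle(polys, x, samples):
    """One operation cycle.

    polys:   one (c3, c2, c1, c0) tuple per sample index k
    x:       fractional position within the time piece
    samples: samples[j][k] is input sample k of channel j
    Returns the list of output accumulators, one per channel.
    """
    acc = [0.0] * len(samples)          # accumulators reset at cycle 0
    for k, (c3, c2, c1, c0) in enumerate(polys):
        c = horner(c3, c2, c1, c0, x)   # one coefficient, shared by all channels
        for j, channel in enumerate(samples):
            acc[j] += c * channel[k]    # b_j' = b_j + c * a_{j,k}
    return acc
```

Because each coefficient is applied to every channel before the next one is generated, only a single coefficient value has to be held at any time.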
Implementation Details
In an example implementation, for each filter, there are 640 time intervals. Each contains a 3rd order polynomial with four coefficients. Therefore, 2560 words are needed for each filter coefficient ROM. However, symmetry reduces the coefficient ROM. Since $f(-t) = f(t)$, only half of the filter polynomials need to be stored. That is, 1280 words per filter coefficient ROM.
There is one input buffer and one output buffer for each audio channel. Each input or output buffer contains two 24 bit words. It serves as a FIFO to temporarily store the input or output sample before it is written to the input RAM or output to the next block.
The input RAM contains 96 words of 24 bit width for each audio channel. Of the 96 words, 80 are used for the anti-aliasing filter. The remaining 16 are used for possible sample jitter and input/output rate mismatch before the phase lock loop completely locks.
The address gen generates the access address for the input sample RAM and the coefficient ROM. Suppose the input buffer start address is i and the sample index is $a \cdot 2^{23} + b \cdot 2^{20} + c$ when an operation cycle starts. Then the kth sample index generated is (a+i+8+k) mod 96. The coefficient ROM address is 32·(a+k)+4·b+j, where j is the coefficient order of the polynomial.
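The address arithmetic above can be written out directly. The sketch below assumes the sample index is an unsigned integer in which b occupies the three bits just below bit 23 and c the low 20 bits, as the decomposition implies. The function names are hypothetical, and the folding of ROM addresses implied by the filter symmetry is not modeled.

```python
RAM_WORDS = 96  # input RAM depth per channel (from the text)

def split_index(index):
    """Split a sample index into (a, b, c) with index = a*2**23 + b*2**20 + c."""
    a = index >> 23
    b = (index >> 20) & 0x7
    c = index & ((1 << 20) - 1)
    return a, b, c

def ram_address(i, index, k):
    """Input RAM address of the kth sample: (a + i + 8 + k) mod 96."""
    a, _, _ = split_index(index)
    return (a + i + 8 + k) % RAM_WORDS

def rom_address(index, k, j):
    """Coefficient ROM address: 32*(a + k) + 4*b + j, with j the coefficient order."""
    a, b, _ = split_index(index)
    return 32 * (a + k) + 4 * b + j
```

The factor 32 matches eight time pieces per input sample period (640 pieces over 80 periods) with four coefficients per piece.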
The multiplier is a 24 bit by 24 bit signed multiplier. The adder is a 28 bit adder. The output accumulator is 28 bits too. Rounding is performed before storing to the output sample buffer.
Compared to the operation of the interpolator that operates in conjunction with the rate tracker of
The conventional multiplier is a pipelined multiplier that takes 12 cycles. The multiplier in the interpolator of
In the old interpolator, $x^2$ and $x^3$ must be computed because of the 12 cycle multiplier latency. The old interpolator computes the new coefficient as $c_3 x^3 + c_2 x^2 + c_1 x + c_0$, while the new interpolator computes it as $((c_3 x + c_2) x + c_1) x + c_0$, which does not require computing $x^2$ and $x^3$.
Each operation cycle of the old interpolator requires 92N+288 cycles and 80N+240 cycles in the interpolator of
The old interpolator only has two filters, one for full bandwidth and the other for half bandwidth. The new design has 6 filters, supporting 1×, 1.5×, 2×, 3×, 4×, and 6× decimation. That practically allows any rate to any rate conversion. Also, the old interpolator has 2560 words for each filter and the new interpolator only has 1280 words for each filter.
The old rate converter computes all 80 coefficients before applying them. Therefore, it needs an 80-word coefficient RAM. The new converter computes one coefficient and applies it to all of the channels before going to the next one. Therefore, it only needs one word to store the coefficient.
The impacts of the change include: the single cycle multiplier has about half the area and a third of the current compared to the pipelined multiplier; not requiring $x^2$ and $x^3$ saves area and current too; and a lower clock rate means less current.
More filters take more area, which is a tradeoff, but they make the design more flexible and able to handle all rate to all rate conversion. Also, the total is 7680 words for the new design compared to 5120 words for the old one.
Not requiring the coefficient RAM makes a big difference in area and current.
From inspection of
When the input rate is very close to an integer multiple of the output rate, the alias falls right on the signal. In these cases, the old and new coders have similar SNRs. That is because the alias falls almost exactly on the signal and cannot be distinguished.
Embodiments of the invention provide: up to 8 channels of audio with a sample rate <=48 KHz with a 48 MHz clock rate, and coverage of all rate to all rate conversion if the decimation rate is not more than 6×.
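The channel count and clock rate quoted above can be cross-checked against the per-operation cycle counts given earlier (80N+240 for the new interpolator, 92N+288 for the old one). The sketch assumes one operation cycle per output sample at the output sample rate. The function names are hypothetical.

```python
def new_cycles(n_channels):
    """Cycles per operation cycle in the new interpolator: 80N + 240."""
    return 80 * n_channels + 240

def old_cycles(n_channels):
    """Cycles per operation cycle in the old interpolator: 92N + 288."""
    return 92 * n_channels + 288

def cycle_budget(clock_hz, sample_rate_hz):
    """Clock cycles available per output sample."""
    return clock_hz // sample_rate_hz

budget = cycle_budget(48_000_000, 48_000)   # 1000 cycles per output sample
```

Under these formulas, 8 channels need 880 cycles in the new design, which fits within the 1000-cycle budget, whereas the old cycle count for 8 channels would be 1024.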
Embodiments of the invention may be incorporated into integrated circuits such as sound processing circuits, or other audio circuitry. In turn, the integrated circuits may be used in audio devices such as headphones, sound bars, audio docks, amplifiers, speakers, etc.
Also, although embodiments of the invention have been described using functional blocks, the blocks may be implemented in any physical embodiment, as is known in the art. For example, blocks may be implemented in application specific integrated circuits (ASICs), FPGAs or other programmable firmware, software running on a specialized processor, software running on a general purpose processor, or any combination of the above.
Having described and illustrated the principles of the invention with reference to illustrated embodiments, it will be recognized that the illustrated embodiments may be modified in arrangement and detail without departing from such principles, and may be combined in any desired manner. And although the foregoing discussion has focused on particular embodiments, other configurations are contemplated.
In particular, even though expressions such as “according to an embodiment of the invention” or the like are used herein, these phrases are meant to generally reference embodiment possibilities, and are not intended to limit the invention to particular embodiment configurations. As used herein, these terms may reference the same or different embodiments that are combinable into other embodiments.
Consequently, in view of the wide variety of permutations to the embodiments described herein, this detailed description and accompanying material is intended to be illustrative only, and should not be taken as limiting the scope of the invention.
This application is a continuation of co-pending U.S. Non-provisional patent application Ser. No. 15/786,500, filed Oct. 17, 2017, entitled “RATE CONVERTOR,” which is a continuation of U.S. Non-provisional patent application Ser. No. 14/857,681, filed Sep. 17, 2015 (which issued on Oct. 17, 2017 as U.S. Pat. No. 9,793,879), entitled “RATE CONVERTOR,” which claims benefit of U.S. Provisional Patent Application No. 62/051,599, filed Sep. 17, 2014, entitled “RATE CONVERTOR,” the disclosures of all of which are incorporated herein by reference in their entirety.
Number | Date | Country | |
---|---|---|---|
20190207588 A1 | Jul 2019 | US |
Number | Date | Country | |
---|---|---|---|
62051599 | Sep 2014 | US |
Relation | Number | Date | Country |
---|---|---|---|
Parent | 15786500 | Oct 2017 | US |
Child | 16296669 | | US |
Parent | 14857681 | Sep 2015 | US |
Child | 15786500 | | US |