Time-delay estimation (TDE) is used to extract the time difference of arrival between microphones in sound-source localization systems.
The standard approach to TDE uses the direct cross-correlation (DCC) function. When using the DCC function, for each TDE, time frames of the input signals are stored and all the points of the DCC function are calculated. The argument of the maximum of the DCC corresponds to the intersignal time delay between the signal inputs. With a sampling frequency above 50 kS/s, the storage of the frames and the arithmetic operations needed to calculate the DCC values are a roadblock to achieving the sub-μW implementation needed in mobile, wearable, or IoT applications.
The low complexity of adaptive time-delay estimation (ATDE) techniques makes them an attractive approach to TDE, but these techniques still require a high-resolution ADC after the sensors. Thus, they are also not suitable for many low-power applications.
Bio-inspired silicon cochleae estimate the time difference in input signals by translating audio stimuli into asynchronous events. However, the power consumption of bio-inspired silicon cochleae is in the μW range, which makes them unsuitable for certain low-power applications.
Accordingly, new mechanisms for TDE are desirable.
In accordance with some embodiments, circuits and methods for time-delay to digital converters are provided. In some embodiments, circuits for a time-delay to digital converter are provided, the circuits comprising: a first microphone having a first output; a second microphone having a second output; a first one-bit analog-to-digital converter having a first input coupled to the first output and having a third output; a second one-bit analog-to-digital converter having a second input coupled to the second output and having a fourth output; a first variable time delay having a third input coupled to the third output, a fourth input, and a fifth output, wherein the fourth input controls a delay amount of the first variable time delay; a second variable time delay having a fifth input coupled to the fourth output, a sixth input, and a sixth output, wherein the sixth input controls a delay amount of the second variable time delay; a first mixer having a seventh input coupled to the fifth output, an eighth input coupled to the sixth output, and a seventh output; a third variable time delay having a ninth input coupled to the sixth output, and an eighth output; a second mixer having a tenth input coupled to the fifth output, an eleventh input coupled to the eighth output, and a ninth output; a subtractor having a twelfth input coupled to the seventh output, a thirteenth input coupled to the ninth output, and a tenth output; an accumulator having a fourteenth input coupled to the tenth output, and an eleventh output; an attenuator having a fifteenth input coupled to the eleventh output, and a twelfth output; a differential splitter having a sixteenth input coupled to the twelfth output, a thirteenth output, and a fourteenth output; a first adder having a seventeenth input coupled to the thirteenth output, an eighteenth input coupled to an offset signal, and a fifteenth output coupled to the sixth input; and a second adder having a nineteenth input coupled to the fourteenth output, a twentieth input coupled to the offset signal, and a sixteenth output coupled to the fourth input.
In accordance with some embodiments, polarity-coincidence, adaptive time-delay estimation (PCC-ATDE), mixed-signal techniques are provided. In some embodiments, these techniques use 1-bit quantized signals and negative-feedback architectures to directly determine a time-delay between signals at analog inputs and convert the time-delay to a digital number.
In time-delay estimation (TDE), the sampling frequency FS of the data converters defines the delay resolution. The noise from approaching automobiles or other vehicles has dominant spectral components below 250 Hz. To support a −1 ms to +1 ms delay range with 8-bit resolution for such noise-like sources, the audio signal needs to be sampled at >50 kS/s, i.e., 100× the Nyquist rate, in some embodiments.
Consider the outputs of two microphones, M1(t) and M2(t), which capture the signal of the single source x(t) at different positions in space:
M1(t)=x(t)+n1(t) (1)
M2(t)=A.x(t−D)+n2(t) (2)
where D is the time delay that the algorithm needs to determine, n1(t) and n2(t) are random noise, and A is the gain (or attenuation) difference between the microphones. In prior mechanisms, the estimation of D requires computing the direct cross-correlation for many different T and then determining the argument of the peak:
D=argmax(DCCM1,M2)
There are multiple ways to compute DCCM1,M2; however, each is complex.
An alternative to using the DCC function for performing TDE is to use the polarity-coincidence correlation function (PCC):
It is known that
hence argmax(PCCM1,M2)=argmax(DCCNM1,M2). Note that the computation of PCC only needs one-bit signals, in contrast to DCC which requires multi-bit signals.
Regardless of how one obtains DCCM1,M2 or PCCM1,M2, their computation involves storing a large frame of both M1(t) and M2(t), calculating and storing all the points of the cross-correlation within the TDE range, and finally searching for the argument of the peak.
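The frame-based search described above can be sketched in a few lines. The following is a minimal illustration only, not the implementation of any embodiment: the source model (moving-average-filtered Gaussian noise), the values of D, A, N, and MAX_LAG, and the helper names are all assumptions chosen for the example, and delays are expressed in samples. For jointly Gaussian signals the polarity-coincidence correlation is a monotonic function of the normalized correlation (the classical arcsine law), so both searches are expected to return the same lag.

```python
import random

def sgn(v):
    """One-bit quantizer: +1 for non-negative samples, -1 otherwise."""
    return 1 if v >= 0 else -1

def low_pass_noise(n, taps, rng):
    """Moving average of white Gaussian noise (a crude band-limited source)."""
    white = [rng.gauss(0.0, 1.0) for _ in range(n + taps)]
    return [sum(white[i:i + taps]) / taps for i in range(n)]

def argmax_correlation(a, b, max_lag):
    """Return the lag in [-max_lag, max_lag] that maximizes sum(a[n] * b[n + lag])."""
    best_lag, best_val = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        val = sum(a[n] * b[n + lag] for n in range(max_lag, len(a) - max_lag))
        if val > best_val:
            best_lag, best_val = lag, val
    return best_lag

rng = random.Random(0)
D, A, N, MAX_LAG = 7, 0.5, 4000, 20   # hypothetical values; delay in samples
x = low_pass_noise(N + D, taps=8, rng=rng)
m1 = [x[n + D] + 0.02 * rng.gauss(0.0, 1.0) for n in range(N)]   # M1 = x + n1
m2 = [A * x[n] + 0.02 * rng.gauss(0.0, 1.0) for n in range(N)]   # M2 = A.x delayed by D, plus n2

d_dcc = argmax_correlation(m1, m2, MAX_LAG)                       # multi-bit samples
d_pcc = argmax_correlation([sgn(v) for v in m1],
                           [sgn(v) for v in m2], MAX_LAG)         # one-bit samples
print(d_dcc, d_pcc)
```

Both searches store a full frame of each signal and evaluate every lag in the range, which is the storage and arithmetic cost the text identifies.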
In accordance with some embodiments, polarity-coincidence-correlation, adaptive, time-delay estimation (PCC-ATDE) approaches are provided. The PCC-ATDE approaches use only two values of a polarity-coincidence correlation function to close a negative feedback loop that continuously tracks intersignal delay.
Attenuator 1/G 112 in
A practical problem for the architecture in
In accordance with some embodiments, this problem may be overcome using example circuit 300 of
The outputs of comparators 302 and 304 are then provided to variable-delay cells 306 and 308, respectively, which have delays of τvar1 and τvar2, respectively. The outputs of the variable delay cells produce signals M1d(t) and M2d(t):
M1d(t)=sgn(x(t−τvar1)+n1(t−τvar1)) (7)
M2d(t)=sgn(A.x(t−τvar2−D)+n2(t−τvar2)) (8)
As shown in
Referring back to
M2d(t) is further delayed by a fixed value, τfix, by fixed delay 312 and is then multiplied with M1d(t) by multiplier 314 to create VMIXER2. The average of VMIXER2 is PCCM1,M2(Δ+τfix). Note that the averages of the multiplier outputs, VMIXER1 and VMIXER2, do not need to be explicitly calculated.
The difference between VMIXER1 and VMIXER2 is then determined by subtractor 316. This will result in difference values of +2, 0, or −2.
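The three possible difference values follow from the fact that each multiplier sees only ±1 inputs. A short enumeration, with the function and argument names chosen here purely for illustration, confirms this:

```python
from itertools import product

def subtractor_output(m1d, m2d, m2d_fixed):
    """Difference of the two multiplier outputs for one-bit (+1/-1) inputs."""
    mixer1 = m1d * m2d          # M1d multiplied with M2d
    mixer2 = m1d * m2d_fixed    # M1d multiplied with the further-delayed M2d
    return mixer1 - mixer2

# Enumerate every combination of the three one-bit signals.
values = sorted({subtractor_output(a, b, c)
                 for a, b, c in product((-1, 1), repeat=3)})
print(values)  # [-2, 0, 2]
```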
Next, loop integrator 318 averages the difference values and attenuates the higher frequency components of these signals.
Attenuator 320 then receives the output of the integrator and produces an attenuated signal. The loop bandwidth, which is directly controlled by the attenuator, determines the effective length T of the averages calculated by the integrator.
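The signal path described above (one-bit signals, two multipliers, subtractor, integrator, attenuator) can be mimicked by a simplified discrete-time simulation. The sketch below is an assumption-laden illustration rather than a model of circuit 300: delays are whole samples, the two variable delays are collapsed into a single signed lag delta applied to the second signal, the source is moving-average-filtered Gaussian noise, and all parameter values and names (pcc_atde, GAIN, TAU_FIX, MAX_LAG) are hypothetical. Under these assumptions the loop settles where the two correlation values balance, which for a symmetric correlation peak is near D − τfix/2.

```python
import random

def sgn(v):
    """One-bit quantizer: +1 for non-negative samples, -1 otherwise."""
    return 1 if v >= 0 else -1

def low_pass_noise(n, taps, rng):
    """Moving average of white Gaussian noise (a crude band-limited source)."""
    white = [rng.gauss(0.0, 1.0) for _ in range(n + taps)]
    return [sum(white[i:i + taps]) / taps for i in range(n)]

def pcc_atde(m1, m2, tau_fix, gain, max_lag):
    """Track the lag between two +/-1 streams with an integrating feedback loop."""
    acc, delta = 0, 0
    history = []
    for n in range(max_lag, len(m1) - max_lag - tau_fix):
        mixer1 = m1[n] * m2[n + delta]              # averages to PCC[delta]
        mixer2 = m1[n] * m2[n + delta + tau_fix]    # averages to PCC[delta + tau_fix]
        acc -= mixer1 - mixer2                      # integrate the -2/0/+2 difference
        acc = max(-max_lag * gain, min(max_lag * gain, acc))  # bound the output range
        delta = round(acc / gain)                   # attenuate by 1/G
        history.append(delta)
    return history

rng = random.Random(1)
D, TAU_FIX, GAIN, MAX_LAG, N = 10, 2, 64, 16, 40000   # hypothetical, in samples
x = low_pass_noise(N + D, taps=16, rng=rng)
m1 = [sgn(x[n + D] + 0.02 * rng.gauss(0.0, 1.0)) for n in range(N)]
m2 = [sgn(0.5 * x[n] + 0.02 * rng.gauss(0.0, 1.0)) for n in range(N)]

history = pcc_atde(m1, m2, TAU_FIX, GAIN, MAX_LAG)
print(history[-1], history[-1] + TAU_FIX / 2)
```

No frame is stored and no correlation function is searched: each new sample pair contributes one increment to the accumulator. A larger GAIN lengthens the effective averaging and reduces output noise at the cost of slower tracking.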
Turning to
As shown in
As illustrated in
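Mixer1[n]−Mixer2[n]=FM1,M2[Δ[n]]+e[n]
where e[n] contains the high-frequency components, with average zero, and FM1,M2[Δ[n]] is the average of the difference, i.e., the difference between the two values of PCCM1,M2 whose averages the mixers produce.
Note that, assuming that PCCM1,M2 and τfix are static, FM1,M2 depends only on Δ[n].
e[n] does not introduce a DC error in Δ[n], but contributes as noise at the output. If one neglects e[n] and focuses on the low-frequency output of the loop, the result is a non-linear feedback loop that continuously increases or decreases Δ[n] to keep FM1,M2[Δ[n]] at zero.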
Based on (13), a delay-domain model can be used to predict the behavior of the PCC-ATDE loop. The PCC-ATDE operates across multiple-domains. Delay-domain models can assist in understanding how the PCC-ATDE operates. In the PCC-ATDE, the function FM
Since the only correlation between M1(t) and M2(t) comes from the source x(t), the polarity-coincidence correlation function between M1 and M2 can be approximated by the auto-polarity-coincidence correlation function of x(t) shifted by the intersignal delay D:
PCCM1,M2[Δ]≈PCCx,x[Δ−D] (14)
This approximation can be used to understand the contribution of the microphone delay D to the values of FM1,M2.
The approximation in (14) and Fx,x are illustrated in
Introducing Fx,x into (13), we now have a direct relation between the output of the loop Δ[n] and the intersignal delay D that was previously implicit in FM1,M2.
Using (16), the delay-domain model in
In order to maintain the negative feedback and guarantee the convergence to the correct time-delay estimation, Fx,x[Δ−D] has to have positive values for Δ>D and negative values for Δ<D. Since Fx,x[Δ−D] is defined as the difference of two consecutive values of PCCx,x[τ], see (15), the equivalent condition is that the derivative of PCCx,x[τ] is positive for positive τ and negative for negative τ.
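The sign condition can be illustrated with a small delay-domain iteration. Everything in the sketch below is assumed for illustration: the auto-PCC of the source is taken to be triangular with half-width tau_max, the error function f_xx is built from two taps placed symmetrically around Δ−D (the definition in (15) may differ), the update is a first-order accumulation scaled by 1/gain, and all values are in milliseconds. With an error function that is positive for Δ>D and negative for Δ<D, the iteration moves Δ toward D; where the assumed correlation is flat, the error is zero and the iteration no longer tracks.

```python
def pcc_xx(tau, tau_max):
    """Hypothetical auto-PCC of the source: triangular, peaking at tau = 0."""
    return max(0.0, 1.0 - abs(tau) / tau_max)

def f_xx(u, half_step, tau_max):
    """Assumed error function: positive for u > 0, negative for u < 0."""
    return pcc_xx(u - half_step, tau_max) - pcc_xx(u + half_step, tau_max)

def track(d, delta0=0.0, gain=4.0, half_step=0.05, tau_max=2.5, steps=500):
    """Iterate the delay-domain loop: delta moves against the sign of the error."""
    delta = delta0
    for _ in range(steps):
        delta -= f_xx(delta - d, half_step, tau_max) / gain
    return delta

print(track(0.7))               # starts at 0 ms, approaches D = 0.7 ms
print(track(-0.9, delta0=1.2))  # starts at 1.2 ms, approaches D = -0.9 ms
print(track(0.0, delta0=3.0))   # starts outside the assumed correlation width
```

In the third call the starting error exceeds the width of the assumed correlation, both taps read zero, and the output stays where it started.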
To define a range for the system in some embodiments, the difference between Δ[n] and D must always be less than τMAX:
|Δ[n]−D|<τMAX (17)
This can be an important design parameter for sound-source localization systems, where the maximum time delay between the input signals is limited by the spacing of the microphones. For example, if the microphones are separated by approximately 35 cm, the intersignal delay will always satisfy |D|<1 ms. Applying a boundary |Δ|<1.5 ms to the output then keeps |Δ−D|≤|Δ|+|D|<2.5 ms, which guarantees that the loop stays within the covered range for x(t) sources with bandwidth lower than 200 Hz that have τMAX>2.5 ms. Low-pass filters can be used before the PCC-ATDE to limit the bandwidth of x(t) in some embodiments.
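The numbers in this example can be checked with a short calculation. The speed of sound used below (343 m/s, roughly dry air at 20 °C) and the function names are assumptions made for the sketch; the delay bounds are the ones stated in the text.

```python
SPEED_OF_SOUND_M_PER_S = 343.0   # assumed value, roughly dry air at 20 degrees C

def max_intersignal_delay_ms(spacing_m, c=SPEED_OF_SOUND_M_PER_S):
    """Largest possible arrival-time difference for two microphones spacing_m apart."""
    return 1e3 * spacing_m / c

def stays_in_range(d_max_ms, delta_bound_ms, tau_max_ms):
    """True if the worst case |delta - D| (bounded by the sum) does not exceed tau_max."""
    return d_max_ms + delta_bound_ms <= tau_max_ms

print(round(max_intersignal_delay_ms(0.35), 2))  # 1.02
print(stays_in_range(1.0, 1.5, 2.5))             # True
```

Under the assumed speed of sound, a 35 cm spacing gives a maximum delay of about 1.02 ms, in line with the roughly 1 ms figure used above.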
As shown, this mechanism can be implemented on a chip that has four analog inputs 1002, 1004, 1006, and 1008 that are connected to a microphone array. One of the microphones provides the reference for the time-delay estimation at input 1002; the chip outputs the time-delay of the other three analog signals with relation to this reference microphone.
The inputs are provided to inputs of comparators 1010. Any suitable comparators can be used in some embodiments. For example, in some embodiments, each comparator can be implemented as a latched comparator.
The outputs of the comparators are then provided to PCC-ATDE core 1012. The core can be implemented in any suitable technology. For example, in some embodiments, the core can be implemented in 0.18 μm CMOS technology to take advantage of its low leakage current, while easily meeting speed and density requirements. The core of the PCC-ATDE can be implemented with sub-threshold CMOS logic in some embodiments.
As shown in
In some embodiments, the 300 mV signals from the PCC-ATDE core can be converted to 1.8V I/O levels with the sub-threshold level shifters 1028. Any suitable level shifters can be used in some embodiments. For example, the level shifters shown in
In some embodiments, the circuit of
Although the invention has been described and illustrated in the foregoing illustrative embodiments, it is understood that the present disclosure has been made only by way of example, and that numerous changes in the details of implementation of the invention can be made without departing from the spirit and scope of the invention, which is limited only by the claims that follow. Features of the disclosed embodiments can be combined and rearranged in various ways.
The application claims the benefit of U.S. Provisional Patent Application No. 62/831,173, filed Apr. 8, 2019, and U.S. Provisional Patent Application No. 62/654,041, filed Apr. 6, 2018, each of which is hereby incorporated by reference herein in its entirety.
Number | Date | Country
---|---|---
62831173 | Apr 2019 | US
62654041 | Apr 2018 | US