In many applications, data are sensed at various locations around a network. In such applications, the data must often be tagged with timing information. This timing information should be consistent across the network. For example, in exploring for oil, an area may be surveyed using sensors (accelerometers) disposed in a grid. Seismic shocks are induced by triggers such as explosives or vibrator trucks, and in response the sensors provide data indicative of soil content and condition. This data must be time-stamped so it can be synchronized across the network. Clocks at the sensors may be used to time-stamp the data.
The figures are not drawn to scale. They illustrate the disclosure by examples.
Illustrative examples and details are used in the drawings and in this description, but other configurations may exist and may suggest themselves. Parameters such as voltages, temperatures, dimensions, and component values are approximate. Terms of orientation such as up, down, top, and bottom are used only for convenience to indicate spatial relationships of components with respect to each other, and except as otherwise indicated, orientation with respect to external axes is not critical. For clarity, some known methods and structures have not been described in detail. Methods defined by the claims may comprise steps in addition to those listed, and except as indicated in the claims themselves the steps may be performed in another order than that given.
The systems and methods described herein may be implemented in various forms of hardware, software, firmware, special purpose processors, or a combination thereof. At least a portion thereof may be implemented as an application comprising program instructions that are tangibly embodied on one or more program storage devices such as hard disks, magnetic floppy disks, RAM, ROM, and CDROM, and executable by any device or machine comprising suitable architecture. Some or all of the instructions may be remotely stored; in one example, execution of remotely-accessed instructions may be referred to as cloud computing. Some of the constituent system components and process steps may be implemented in software, and therefore the connections between system modules or the logic flow of method steps may differ depending on the manner in which they are programmed.
In sensor networks such as the oil exploration networks described above, free-running clocks (that is, clocks not synchronized to a central timing source) are often used. These clocks may be crystal-controlled and typically are characterized by a high degree of short-term stability but slow drift over time. Time stamps provided by these clocks can be off by as much as several microseconds as a result of this drift and such other factors as temperature variations, limited timing resolution of IEEE 802.11 time sync functions, propagation delay of timing beacons, processing delays at the sensors, and error in measuring the elapsed time between arrival of a signal from a timing beacon and the data sampling time. These various errors may be random. Accordingly, uncertainty is associated with the time stamps—in other words the timing information is noisy.
A sequence of time stamps, viewed as a random process, has energy in most frequencies. For example, if time stamp errors are modeled as white noise, the power spectral density is constant. Energy in the underlying time stamps is concentrated mainly in low frequencies due to slow clock drift, and low-pass filters can remove much of the noise energy while preserving much of the desired signal energy. But inasmuch as the time stamps are of finite duration, their power spectral density also has high-frequency energy, and low-pass filtering therefore distorts the desired underlying time stamp information. On the other hand, if instead of low-pass filtering a non-linear filtering technique such as least squares is attempted, the low-frequency variation in the time stamps results in errors that grow larger with the length of the sequence of time stamps. There is a need for a way to reduce noise in data sequences such as time stamps without distorting or introducing errors into the data.
In some examples the non-linear technique comprises linear least-squares approximation, as will be described in detail presently. This least-squares approximation may be performed at intervals, and cubic spline interpolation used to fill in between the intervals.
In some examples, as will be explained presently, the linear technique comprises a fast Fourier transform or a linear low-pass time-invariant finite impulse response filter.
In the foregoing example, linear and non-linear estimating techniques interoperate to reduce noise in a noisy data sequence, overcoming the limitations of using either technique by itself.
In some examples the sensor unit 300 also includes other devices such as a processor 314, data storage 316, and a communication port 318.
The other sensor units 304, 306, 308 and 310 also include seismic sense elements and clocks, and they may include one or more other devices.
The system includes a control unit 320. In this example the control unit 320 includes a central processing unit (CPU) 322, memory 324, data storage 326, and a communication port 328. Peripheral devices such as a monitor 330, keyboard 332, and mouse 334 may be provided for a user.
The control unit 320 is in communication with the seismic sensor units as indicated by a communication symbol 336 to:
The control unit 320 may include machine instructions 338 that control one or more of the devices in the control unit and the seismic sensor units to carry out the tasks described above. The machine instructions 338 may reside in the data storage 326 or the memory 324. In some examples the machine instructions may be remotely located and communicated to the control unit 320 as needed, for example in a cloud computing environment.
In some examples the control unit 320 may also serve as a seismic sensor unit by addition of a seismic sense element, or its functions may be performed by one or more devices in the seismic sensor units.
In some examples the system includes one or more seismic triggers such as a trigger 340 coupled to the earth to induce seismic shocks. In some examples the seismic triggers comprise explosives or vibrator trucks. The triggers may be operated by the control unit 320 or asynchronously by a user.
In some examples the corrected sequence of time-stamped seismic data may be communicated to a user through the monitor 330 or through a printer (not shown), or may be stored in the data storage 326 for later reference, or may be communicated through the communication port 328 for use elsewhere.
As described above, non-linear and linear estimating techniques interoperate to provide a reduced-noise sequence of data indicative of a sequence of physical events. Also as described above, in one application the physical events comprise seismic events and the reduced-noise sequence comprises a corrected sequence of time-stamped seismic data indicative of those seismic events. The interoperation of the non-linear and linear estimating techniques will now be described in more detail. Although this description is written in the context of one specific application—reducing noise in time stamps in seismic data—the techniques may readily be adapted to reducing noise in other data sequences that represent time or other physical quantities.
Notation:
Assumptions:
To provide the sequence of approximated reduced-noise timestamps $\{\tilde{t}[m]\}_{m=1}^{N}$, let $\Delta_1$ and $\Delta_2$ be constants (to be explained presently) that may be optimized for a given problem. For example, choose $\Delta_1 = 5 \cdot 10^3$ and $\Delta_2 = 2.5 \cdot 10^3$. For $m = k\Delta_1$ with integer $k$, compute the linear least-squares approximation of $T[n]$ for $n = m-\Delta_2, m-\Delta_2+1, \ldots, m+\Delta_2$, and evaluate this approximation at $n = m$ to obtain $\tilde{t}[m]$. Then, to obtain $\tilde{t}[n]$ for $n \neq k\Delta_1$, interpolate the sequence $\{\tilde{t}[m]\}_{m=k\Delta_1}$.
This technique is depicted graphically in
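The approximation step described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a definitive implementation: the function names `approximate_timestamps` and `natural_cubic_spline` are hypothetical, indices are 0-based, the fitting window is clipped at the ends of the sequence, the last sample is added as an extra anchor so that the spline spans all the data, and a natural cubic spline is assumed because the description specifies only cubic spline interpolation.

```python
import numpy as np

def natural_cubic_spline(xk, yk, x):
    """Evaluate the natural cubic spline through knots (xk, yk) at points x."""
    xk, yk, x = (np.asarray(a, dtype=float) for a in (xk, yk, x))
    n = len(xk)
    h = np.diff(xk)
    # Solve for the second derivatives M at the knots (M[0] = M[-1] = 0).
    A = np.zeros((n, n))
    rhs = np.zeros(n)
    A[0, 0] = A[-1, -1] = 1.0
    for i in range(1, n - 1):
        A[i, i - 1], A[i, i], A[i, i + 1] = h[i - 1], 2.0 * (h[i - 1] + h[i]), h[i]
        rhs[i] = 6.0 * ((yk[i + 1] - yk[i]) / h[i] - (yk[i] - yk[i - 1]) / h[i - 1])
    M = np.linalg.solve(A, rhs)
    # Locate the segment containing each x and evaluate the cubic on it.
    i = np.clip(np.searchsorted(xk, x, side="right") - 1, 0, n - 2)
    a, b = xk[i + 1] - x, x - xk[i]
    return ((M[i] * a**3 + M[i + 1] * b**3) / (6.0 * h[i])
            + (yk[i] / h[i] - M[i] * h[i] / 6.0) * a
            + (yk[i + 1] / h[i] - M[i + 1] * h[i] / 6.0) * b)

def approximate_timestamps(T, delta1, delta2):
    """Fit a line to T around each anchor m = k*delta1, then spline between anchors."""
    T = np.asarray(T, dtype=float)
    N = len(T)
    # Assumption: anchors are 0-based multiples of delta1, plus the final sample.
    anchors = np.unique(np.append(np.arange(0, N, delta1), N - 1))
    fitted = np.empty(len(anchors))
    for j, m in enumerate(anchors):
        lo, hi = max(0, m - delta2), min(N, m + delta2 + 1)  # window clipped at edges
        idx = np.arange(lo, hi)
        slope, intercept = np.polyfit(idx - m, T[idx], 1)  # line centred on m
        fitted[j] = intercept  # value of the fitted line at n = m
    return natural_cubic_spline(anchors, fitted, np.arange(N))
```

With the example values from the description, the call would be `approximate_timestamps(T, 5000, 2500)`, so that adjacent fitting windows tile the sequence.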
After the approximation {tilde over (t)}[m] of the corrected time stamps has been determined as described above, the approximation is subtracted from the noisy sequence T[m] to get an estimate of the noise. A linear technique, for example a linear time-invariant (LTI) low-pass filter, is used to obtain a filtered sequence from the estimate of the noise, and then the filtered sequence is added back into the approximation of the corrected time stamps to obtain the corrected time stamp sequence:
$$\hat{t}[m] = h[m] * \left(T[m] - \tilde{t}[m]\right) + \tilde{t}[m] \tag{1}$$
where $*$ denotes convolution, $h[m]$ is the impulse response of the LTI low-pass filter, and $\hat{t}[m]$ is the corrected time stamp sequence.
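Equation (1) can be sketched in the time domain as follows. This is an illustrative sketch only: the helper names `lowpass_fir` and `correct_timestamps` are hypothetical, and the Hamming-windowed sinc design is an assumed stand-in for whatever LTI low-pass filter is actually chosen.

```python
import numpy as np

def lowpass_fir(num_taps, cutoff):
    """Hamming-windowed sinc low-pass FIR (assumed design).

    cutoff is in cycles per sample, 0 < cutoff < 0.5; num_taps should be odd
    so that the filter is symmetric about its centre tap.
    """
    k = np.arange(num_taps) - (num_taps - 1) / 2.0
    h = 2.0 * cutoff * np.sinc(2.0 * cutoff * k) * np.hamming(num_taps)
    return h / h.sum()  # normalise to unity gain at DC

def correct_timestamps(T, t_approx, h):
    """Apply equation (1): filter the noise estimate and add it back."""
    T = np.asarray(T, dtype=float)
    t_approx = np.asarray(t_approx, dtype=float)
    noise_estimate = T - t_approx                           # T[m] - approximation
    filtered = np.convolve(noise_estimate, h, mode="same")  # h * (noise estimate)
    return filtered + t_approx                              # corrected time stamps
```

Because `mode="same"` zero-pads at the ends, the first and last few samples of the output are less reliable than the interior samples.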
For convenience, in the following discussion of the LTI low-pass filter it is assumed that:
The sequence $\tilde{t}[m]$ may be considered as the sum of sequences $t[m]$ and $e[m]$, which for convenience are assumed to be independent. For a reference frequency $\Omega_0$ in the pass band of the filter, let
$$\frac{S_e(\Omega_0)}{S_t(\Omega_0)} = \epsilon_1$$
$$|H(\Omega_0)|^2 = 1 + \epsilon_2$$
$$\frac{S_z(\Omega_0)}{S_t(\Omega_0)} = \epsilon_3$$
where
where
Equation (2) is representative of the error that would result from applying only the linear technique. Error in equation (2) may be made small by making $\epsilon_2$ and $\epsilon_3$ of the same order. Error in equation (3) may be made small by making $(\epsilon_1 \epsilon_2)$ and $\epsilon_3$ of the same order. Therefore a large filter distortion $\epsilon_2$ can be tolerated provided $\epsilon_1$ is relatively small, that is, if $\tilde{t}[m]$ is a good approximation of $t[m]$. In other words, by using both the non-linear and linear techniques, a larger $\epsilon_2$ can be tolerated than would be the case if only the linear technique were used. Since $\epsilon_1$ may be much larger than $\epsilon_3$, as a general matter $\tilde{t}[m]$ should not be used as a direct estimate of $t[m]$; instead, the non-linear and linear techniques should both be used as in equation (1).
The sequence of approximated reduced-noise timestamps $\{\tilde{t}[m]\}_{m=1}^{N}$ does not have the strict error requirements of the final estimate $\{\hat{t}[m]\}_{m=1}^{N}$ but should have similar spectrum content because
Spurious high-frequency components should not be introduced.
The LTI low-pass filter should have a low cut-off frequency, which results in a filter with a long impulse response. Implementing such a filter in the time domain can be computationally intensive. This problem can be avoided by instead implementing the filter in the frequency domain using a fast Fourier transform (FFT).
An FFT operates on a periodic time-domain sequence. Applying an FFT to the estimate of the noise sequence that is obtained by subtracting the corrected-sequence approximation $\{\tilde{t}[m]\}_{m=1}^{N}$ from the original uncorrected sequence implicitly performs a periodic extension of the sequence, creating edge artifacts that translate into spurious high-frequency components. This may be minimized by multiplying the estimate of the noise sequence by a window function such as a raised cosine described as:
where:
For example, L may be chosen to be 72,000, corresponding with two hours of timestamps at a frequency of 10 Hz. Choosing a large value for L reduces distortion in the signal due to edge effects but requires discarding more data in the edges of the signal since the timestamps affected by the window are heavily distorted. L should be much smaller than N to minimize the amount of discarded data. Applying the N-point FFT yields
$$F_1[k] = \mathrm{FFT}_N\{(T - \tilde{t}) \cdot f_L\}[k], \quad 1 \le k \le N$$
where the raised dot, ·, indicates component-wise multiplication.
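The windowing and forward transform can be sketched as follows. The function names are hypothetical, and the exact window shape is an assumption: the sketch uses a window equal to 1 in the middle of the sequence with a raised-cosine taper over L samples at each edge. NumPy indexes the FFT bins from 0 to N-1 rather than from 1 to N.

```python
import numpy as np

def raised_cosine_window(N, L):
    """Window of length N with raised-cosine tapers over L samples at each edge.

    Assumed form: 0.5 * (1 - cos(pi * n / L)) on the rising edge, mirrored on
    the falling edge, and 1 in between.
    """
    if 2 * L > N:
        raise ValueError("taper length L must satisfy 2*L <= N")
    w = np.ones(N)
    ramp = 0.5 * (1.0 - np.cos(np.pi * np.arange(L) / L))  # rises from 0 towards 1
    w[:L] = ramp
    w[N - L:] = ramp[::-1]
    return w

def windowed_spectrum(T, t_approx, L):
    """Compute F1 = FFT_N{(T - t_approx) . f_L} for the noise estimate."""
    residual = np.asarray(T, dtype=float) - np.asarray(t_approx, dtype=float)
    window = raised_cosine_window(len(residual), L)
    return np.fft.fft(residual * window)  # component-wise product, then N-point FFT
```

The 2L samples under the tapers are attenuated by the window, which is why the corresponding corrected time stamps at the edges are discarded.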
Any desired low-pass filter with cut-off frequency $B > B_t$ may be used to perform the LTI filtering. $B_t$ is the bandwidth of the underlying clean timestamp sequence $\{t[m]\}$; in cases where $\{t[m]\}$ is not strictly band limited, $B_t$ is some bandwidth where $\{t[m]\}$ contains most of its energy. As shown in
$\beta$ specifies the roll-off factor as a fraction of the bandwidth $\tilde{B}$. These parameters $\tilde{B}$ and $\beta$ are chosen based on;
$$F_2[k] = H[k] \cdot F_1[k], \quad 1 \le k \le N$$
leading to the result
$$\hat{t}[m] = \mathrm{IFFT}_N\{F_2\}[m] + \tilde{t}[m]$$
where $\mathrm{IFFT}_N$ is the N-point inverse FFT.
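The frequency-domain filtering steps can be sketched end to end as follows. This is a sketch under assumptions rather than the definitive filter: the function names are hypothetical, the frequency response is assumed to be flat up to (1 - beta) times the bandwidth with a cosine roll-off reaching zero at (1 + beta) times the bandwidth, and the bandwidth is expressed in cycles per sample. The window is passed in as an array so that any window function may be used.

```python
import numpy as np

def raised_cosine_lowpass(N, bandwidth, beta):
    """Assumed frequency response H[k] with a raised-cosine roll-off."""
    f = np.abs(np.fft.fftfreq(N))            # bin frequencies, cycles per sample
    f_lo = (1.0 - beta) * bandwidth          # end of the flat pass band
    f_hi = (1.0 + beta) * bandwidth          # start of the stop band
    H = np.zeros(N)
    H[f <= f_lo] = 1.0
    roll = (f > f_lo) & (f < f_hi)
    H[roll] = 0.5 * (1.0 + np.cos(np.pi * (f[roll] - f_lo) / (f_hi - f_lo)))
    return H

def correct_timestamps_fft(T, t_approx, window, H):
    """Window the noise estimate, filter it with H, and add the result back."""
    T = np.asarray(T, dtype=float)
    t_approx = np.asarray(t_approx, dtype=float)
    F1 = np.fft.fft((T - t_approx) * window)  # windowed noise estimate
    F2 = H * F1                               # low-pass filtering per bin
    filtered = np.fft.ifft(F2).real           # H is symmetric, so the result is real
    return filtered + t_approx
```

For a quick check, passing `np.ones(N)` as the window disables tapering; in practice a tapered window would be supplied and the samples under the tapers discarded.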
In one example the following parameters give good results:
$B = 2.09 \cdot 10^{-4}$
$\Delta_1 = 5 \cdot 10^3$
$\Delta_2 = 2.5 \cdot 10^3$
$L = 72 \cdot 10^3$
$\beta = 0.5$
In some applications, low-pass filtering in the time domain by convolution, rather than in the frequency domain, may have lower computational complexity.
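The trade-off can be illustrated with a rough count of multiplications. This is a back-of-the-envelope model only: it assumes a radix-2 FFT costing about (N/2)·log2(N) complex multiplications per transform, ignores the difference in cost between real and complex multiplications, and uses hypothetical function names.

```python
import math

def direct_fir_multiplies(n_samples, n_taps):
    """Direct convolution: one multiplication per tap per output sample."""
    return n_samples * n_taps

def fft_filter_multiplies(n_samples):
    """Forward FFT plus inverse FFT plus one product per frequency bin."""
    per_transform = (n_samples / 2) * math.log2(n_samples)
    return 2 * per_transform + n_samples
```

Under this model, time-domain convolution is cheaper only when the number of taps is smaller than roughly log2(N) + 1, which is unlikely for a filter with a very low cut-off frequency and hence a long impulse response.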
Reducing noise in a sequence of data by combining linear and non-linear estimation provides significantly better performance, in the form of smaller estimation errors, than either approach used separately. Non-linear estimation gives a good approximation of the large variations in the signal components. Linear low-pass filtering then removes noise without adding much distortion, because the large variations in the signal have already been subtracted.