This application relates generally to distributed fiber optic sensing (DFOS) systems, methods, structures, and related technologies. More particularly, it pertains to spatiotemporal and spectral classification of acoustic signals for vehicle event detection over deployed fiber networks.
Distributed fiber optic sensing (DFOS) technologies including Distributed Acoustic Sensing (DAS), Distributed Vibration Sensing (DVS), and Distributed Temperature Sensing (DTS) are known to be quite useful for sensing acoustic events, vibrational events, and temperatures in a plethora of contemporary applications. It is further known that traffic incidents and accidents cause both traffic disruptions and loss of life.
It is therefore of significant societal importance to monitor the status of traffic on roadways to reduce the number of accidents and improve highway productivity. One such approach to highway improvement includes traffic incident detection as traffic incidents not only cause traffic congestion but also increase the probability of producing both primary and secondary accidents. Thus, it is desirable to provide efficient and accurate systems, methods, and structures for monitoring and detecting unusual driving behaviors early and reporting same in real time to avoid further incidents and accidents.
To mitigate traffic incidents and accidents, an increasing number of Sonic Alert Patterns (SNAP) have been deployed along roadways to enhance road safety. However, such SNAP deployments only alert drifting drivers; they do not report the drifting drivers to roadway operators.
An advance in the art is made according to aspects of the present disclosure directed to machine learning (ML) based DFOS systems, methods, and structures for SNAP event detection in real time. Our inventive systems, methods, and structures employ an intelligent SNAP informatic system including DFOS/Distributed Acoustic Sensing (DAS) and machine learning technologies that utilize SNAP vibration signals as an indicator. Without installation of additional sensors, our inventive systems, methods, and structures detect vibration signals along a length of an existing optical fiber through DAS.
In sharp contrast to the prior art, which generally employed DFOS waterfall data, our inventive systems, methods, and structures according to aspects of the present disclosure do not require preprocessing of raw data, resulting in much faster and more accurate informational derivation, as the rich time-frequency information in raw DFOS/DAS waveform data is preserved.
Systems, methods, and structures according to aspects of the present disclosure employ a deep learning module, a Temporal Relation Network (TRN), that accurately detects SNAP events from among the chaotic signals of normal traffic, making it reliable when applied to busy roads with dense traffic and vehicles of different speeds. According to further aspects of the present disclosure, our inventive systems, methods, and structures investigate intrinsic data structures such as transformations and relations along a temporal dimension. Along with Mel-frequency cepstral coefficients (MFCCs) features extracted from raw waveforms, our systems, methods, and structures advantageously outperform other systems and methods that employ different models, such as convolutional neural networks (CNNs), and different features, such as power spectral density (PSD) and the raw waveform. Moreover, the TRN employs temporal reasoning by explicitly learning changes of spectral intensity over locations and time.
The following merely illustrates the principles of this disclosure. It will thus be appreciated that those skilled in the art will be able to devise various arrangements which, although not explicitly described or shown herein, embody the principles of the disclosure and are included within its spirit and scope.
Furthermore, all examples and conditional language recited herein are intended to be only for pedagogical purposes to aid the reader in understanding the principles of the disclosure and the concepts contributed by the inventor(s) to furthering the art and are to be construed as being without limitation to such specifically recited examples and conditions.
Moreover, all statements herein reciting principles, aspects, and embodiments of the disclosure, as well as specific examples thereof, are intended to encompass both structural and functional equivalents thereof. Additionally, it is intended that such equivalents include both currently known equivalents as well as equivalents developed in the future, i.e., any elements developed that perform the same function, regardless of structure.
Thus, for example, it will be appreciated by those skilled in the art that any block diagrams herein represent conceptual views of illustrative circuitry embodying the principles of the disclosure.
Unless otherwise explicitly specified herein, the FIGS. comprising the drawing are not drawn to scale.
By way of some additional background, we note that distributed fiber optic sensing systems interconnect opto-electronic interrogators to an optical fiber (or cable), converting the fiber to an array of sensors distributed along the length of the fiber. In effect, the fiber becomes a sensor, while the interrogator generates/injects laser light energy into the fiber and senses/detects events along the fiber length.
As those skilled in the art will understand and appreciate, DFOS technology can be deployed to continuously monitor vehicle movement, human traffic, excavating activity, seismic activity, temperatures, structural integrity, liquid and gas leaks, and many other conditions and activities. It is used around the world to monitor power stations, telecom networks, railways, roads, bridges, international borders, critical infrastructure, terrestrial and subsea power and pipelines, and downhole applications in oil, gas, and enhanced geothermal electricity generation. Advantageously, distributed fiber optic sensing is not constrained by line of sight or remote power access and—depending on system configuration—can be deployed in continuous lengths exceeding 30 miles with sensing/detection at every point along its length. As such, cost per sensing point over great distances typically cannot be matched by competing technologies.
Distributed fiber optic sensing measures changes in “backscattering” of light occurring in an optical sensing fiber when the sensing fiber encounters environmental changes including vibration, strain, or temperature change events. As noted, the sensing fiber serves as a sensor over its entire length, delivering real-time information on physical/environmental surroundings and fiber integrity/security. Furthermore, distributed fiber optic sensing data pinpoints a precise location of events and conditions occurring at or near the sensing fiber.
A schematic diagram illustrating the generalized arrangement and operation of a distributed fiber optic sensing system that may advantageously include artificial intelligence/machine learning (AI/ML) analysis is shown illustratively in
As is known, contemporary interrogators are systems that generate an input signal to the optical sensing fiber and detect/analyze the reflected/backscattered and subsequently received signal(s). The received signals are analyzed, and an output is generated which is indicative of the environmental conditions encountered along the length of the fiber. The backscattered signal(s) so received may result from reflections in the fiber, such as Raman backscattering, Rayleigh backscattering, and Brillouin backscattering.
As will be appreciated, a contemporary DFOS system includes the interrogator that periodically generates optical pulses (or any coded signal) and injects them into an optical sensing fiber. The injected optical pulse signal is conveyed along the length of the optical fiber.
At locations along the length of the fiber, a small portion of the signal is backscattered/reflected and conveyed back to the interrogator, wherein it is received. The backscattered/reflected signal carries information the interrogator uses to detect events, such as a power level change that indicates, for example, a mechanical vibration.
The received backscattered signal is converted to the electrical domain and processed inside the interrogator. Based on the pulse injection time and the time the received signal is detected, the interrogator determines the location along the length of the optical sensing fiber from which the received signal is returning, and is thus able to sense the activity of each location along the length of the optical sensing fiber. Classification methods may be further used to detect and locate events or other environmental conditions, including acoustic and/or vibrational and/or thermal conditions, along the length of the optical sensing fiber.
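By way of illustration only, and not as part of the disclosed embodiments, the time-of-flight relationship described above may be sketched as follows; the function name is hypothetical and the group index value is an assumption for standard single-mode fiber:

```python
# Illustrative sketch: mapping the round-trip delay of a backscattered
# pulse to a position along the sensing fiber.
C_VACUUM = 299_792_458.0   # speed of light in vacuum, m/s
GROUP_INDEX = 1.468        # assumed group index of standard single-mode fiber

def backscatter_location_m(round_trip_delay_s: float) -> float:
    """Distance along the fiber at which the backscatter originated.

    The pulse travels to the scattering point and back, hence the
    factor of 2 in the denominator.
    """
    return C_VACUUM * round_trip_delay_s / (2.0 * GROUP_INDEX)
```

Under these assumptions, a round-trip delay of 10 microseconds corresponds to a scattering point roughly 1 km along the fiber.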
As we shall now further show and describe, systems, methods, and structures according to aspects of the present invention employ machine learning for SNAP event detection using waveform data collected from a DFOS system in real time. The algorithm employed learns distinctive patterns by comparing normal traffic signals with SNAP events.
According to aspects of the present disclosure and in sharp contrast to the prior art, our inventive systems, methods, and structures employ temporal relational reasoning techniques instead of pattern matching methods. Visually, the patterns generated by a normal driving signal and a SNAP event look very similar; however, they exhibit intrinsic patterns when analyzed along a temporal dimension, which can be captured by a Temporal Relation Network (TRN) module.
First, there exists order information in the actions of wheels passing over SNAP stripes, which distinguishes them from the actions of a wheel driving on the ground during normal driving. The change of spatial intensity over time manifests the driving speed, which is inherently related to the characteristic frequency of a SNAP vibration.
Second, heterogeneous factors such as driving speed, tire size, and wheelbase length can all affect the time axis of a SNAP signal. The TRN extracts frames from a long sequence of data and focuses on relationships between segments of waveform data, which can be more robust to variability along the time scale. Meanwhile, it also reduces the computational load compared to inference on the whole waveform data, supporting real-time inference with limited local computing resources.
Finally, in contrast to considering a spatial-temporal patch as a 1D “video”, we use Mel-frequency cepstral coefficients (MFCCs) features, which are a representation of the short-term power spectrum of an acoustic signal. This feature concisely describes the overall shape of a spectral envelope, which captures the characteristic vibration patterns in the frequency domain.
The coefficients output by the MFCCs of multiple locations form an “image”, and the time dimension forms a series, which is further processed to obtain the representation learned by the TRN module. Augmented with the spatial dimension, MFCCs provide a 3D representation of the short-term power spectrum of acoustic signals, which allows for joint reasoning between driving speed and vibration frequency.
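As an illustrative aside, the link between driving speed and the characteristic SNAP vibration frequency noted above is simply speed divided by stripe spacing; the function name and the spacing value below are assumptions for illustration, not values from the disclosure:

```python
# Illustrative only: the characteristic frequency of a SNAP vibration
# scales with vehicle speed divided by the stripe spacing, which is what
# ties the spatial slope of the signal (speed) to its spectral peak.
def snap_frequency_hz(speed_m_s: float, stripe_spacing_m: float) -> float:
    return speed_m_s / stripe_spacing_m
```

For example, under an assumed stripe spacing of 0.3 m, a vehicle at 30 m/s would excite a vibration near 100 Hz.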
As will become further apparent, our inventive systems, methods, and structures according to aspects of the present disclosure employ raw DAS waveform (2 kHz), while previous, prior-art approaches employ only 8 Hz waterfall data.
Our inventive systems, methods, and structures perform spatio-temporal and spectral classification, while previous, prior-art approaches employ convolutional neural networks, which only perform pattern-matching in spatial-temporal classification.
Finally, with TRN, our inventive model explicitly compares what is different between two time-frames, in both the location peak and the frequency peak of the vibration signal. It performs temporal reasoning between driving speed and vibrating frequency. The (visually similar) SNAP signal and driving signal are quite different in this regard.
Intuitively, the waveform itself forms a time series, as each patch represents locations along the length of the DFOS sensor fiber over time. However, in a real traffic scenario, the relations may look more similar between a SNAP patch and a normal traffic patch. To solve this, we extract an MFCCs feature for each fifo point and form a 2D “image” by combining multiple fifos.
With reference to the summarized model shown therein, the MFCCs features are first extracted from the input waveform patch, then transformed and subsampled as a sequence of features for the model input. The event classification model that follows includes a convolutional block for high-level feature extraction for each input in the sequence, and a TRN module to capture the temporal relations within the sequence. The final output provides the probability of each category into which the patch is classified.
Specifically, each MFCCs feature of a fifo is M×T, where T is time and M is the number of coefficients. For N fifos, this forms an N×M×T tensor. After concatenation and reshaping to T×M×N, it can be seen as a “video” with T frames and an M×N “image” size. To model the temporal relations between adjacent “frames”, we adopt a TRN module. A pairwise temporal relation can be defined as

T2(V)=hφ(Σi<j gθ(fi, fj))

where V={f1, f2, . . . , fT} is the sequence of “frame” features and gθ and hφ are trainable functions.
When events are complex and cannot be captured by single-scale relations, we can use the following function to accumulate relations at different scales:

MTN(V)=T2(V)+T3(V)+ . . . +TN(V)
Where Td captures temporal relationships between d ordered “frames”. All the relation functions are end-to-end trainable with the base CNN.
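A minimal numerical sketch of these relation functions follows; it is illustrative only, with simple fixed stand-ins for the trainable functions gθ and hφ, so that the summation structure, not the learning, is shown:

```python
import itertools

import numpy as np

# Illustrative stand-ins for the trainable MLPs g_theta and h_phi.
def g(frames):                      # relates one ordered tuple of frames
    return np.concatenate(frames)

def h(x):                           # pools the accumulated relation to a score
    return x.sum()

def relation_at_scale(frames, d):
    """T_d: apply g over all ordered d-tuples of frames, sum, then apply h."""
    pooled = sum(g(tup) for tup in itertools.combinations(frames, d))
    return h(pooled)

def multiscale_relation(frames, max_scale):
    """MT_N: accumulate T_d over scales d = 2 .. max_scale."""
    return sum(relation_at_scale(frames, d) for d in range(2, max_scale + 1))
```

Note that `itertools.combinations` yields tuples in index order, matching the ordered-frame constraint i < j in the pairwise relation above.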
As may be observed in
[Data collection] A target vehicle is driven in the field, with GPS and video camera recording the SNAP engagement or crossing events. Multiple rounds of data collection from multiple routes are preferred.
[Data annotation] The SNAP events are labeled by linking the DAS waveform to the GPS and video time stamps, forming the positive class. The DFOS/DAS waveforms collected from no-SNAP segments of the road serve as the negative class.
[Model training] During the training phase, following a conventional supervised training procedure, the TRN model is trained with label and patch pairs. When training the TRN model, we subsample the input for better efficiency. After obtaining the T×M×N transformed MFCCs feature, we first uniformly generate T′ segments and randomly sample one feature from each segment along the first axis, resulting in a T′×M×N tensor as model input. This largely accelerates the training procedure compared to using the entire sequence of features.
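The segment-wise subsampling described above may be sketched as follows; the function name and sizes are illustrative:

```python
import numpy as np

# Illustrative sketch of training-time subsampling: split the T frames into
# T' uniform segments and keep one randomly chosen frame per segment.
def subsample_frames(features: np.ndarray, num_segments: int,
                     rng: np.random.Generator) -> np.ndarray:
    """features: T x M x N tensor of transformed MFCCs; returns T' x M x N."""
    t = features.shape[0]
    bounds = np.linspace(0, t, num_segments + 1).astype(int)
    picks = [rng.integers(bounds[i], bounds[i + 1]) for i in range(num_segments)]
    return features[np.array(picks)]
```

Because each segment is a disjoint slice of the time axis, the sampled frames remain in temporal order, which preserves the order information the TRN reasons over.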
[Inference] During the inference phase, the input raw waveform data are converted to local patches by applying sliding windows with overlaps. The patches are then classified by the trained TRN model to determine if a SNAP event exists within the patch. The identified windows containing SNAP events are then merged into a single box via a box-fusion step. The timestamp, cable location, event type, and confidence score are provided as output.
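The windowing and box-fusion steps above may be sketched as follows; the classifier itself is omitted, and the function names, window, and stride values are illustrative:

```python
# Illustrative sketch of inference windowing and box fusion.
def sliding_windows(length, window, stride):
    """Start indices of overlapping windows covering [0, length)."""
    return [s for s in range(0, max(length - window, 0) + 1, stride)]

def fuse_boxes(intervals):
    """Merge overlapping (start, end) detections into single boxes."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged
```

In this sketch, windows flagged as containing a SNAP event would be passed as (start, end) intervals to `fuse_boxes`, so that several overlapping detections of one crossing report a single event.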
We collected data from multiple independent runs. During one of the independent runs, we collected data from 330 m to 600 m. We performed three groups of runs to cover multiple situations: (1) group 1, passing over the SNAP for the entire SNAP section, 20 runs; (2) group 2, passing over the SNAP only at two locations, i.e., 330 m and 600 m, 20 runs; (3) a few passing events between 2050 m and 2225 m, 2 runs. Each individual run was conducted at a different speed.
We then manually annotated the SNAP patches by cropping along the entire SNAP pattern. We then conducted 8 runs from 326 m to 4863 m, covering 7 segments of SNAP sections. As it is hard to distinguish SNAP patterns from normal traffic patterns on waveform data, we only sampled negative patches within non-SNAP regions from this data. We randomly sampled from all positive and negative patches, resulting in 4936 and 1235 patches in the training and test sets, respectively.
We compared our method with different baselines in terms of the model and feature(s) being used. We compared different features, including (1) waveform, the raw output from the DAS system; (2) PSD, the power spectral density vector of the waveform with frequency cropped between 75 Hz and 250 Hz; and (3) MFCCs, the MFCCs feature vector as described previously. We also compared the TRN model with conventional CNN models. The results are shown in
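The PSD baseline feature may be sketched as a periodogram of the waveform cropped to the 75-250 Hz band; the 2 kHz sampling rate matches the raw DAS waveform rate noted earlier, while the function name and normalization are illustrative assumptions:

```python
import numpy as np

FS = 2000.0                                   # raw DAS sampling rate, Hz

# Illustrative sketch of the PSD feature: a periodogram of the waveform,
# cropped to the 75-250 Hz band.
def psd_band_feature(waveform: np.ndarray, f_lo=75.0, f_hi=250.0) -> np.ndarray:
    freqs = np.fft.rfftfreq(waveform.size, d=1.0 / FS)
    psd = np.abs(np.fft.rfft(waveform)) ** 2 / (FS * waveform.size)
    band = (freqs >= f_lo) & (freqs <= f_hi)
    return psd[band]
```

As a sanity check, a synthetic 100 Hz tone sampled for one second produces a feature whose peak sits 25 bins above the 75 Hz band edge.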
At this point, while we have presented this disclosure using some specific examples, those skilled in the art will recognize that our teachings are not so limited. Accordingly, this disclosure should only be limited by the scope of the claims attached hereto.
This application claims the benefit of U.S. Provisional Patent Application Ser. No. 63/480,369 filed Jan. 18, 2023, the entire contents of which is incorporated by reference as if set forth at length herein.