This application relates to transport information prediction, and in particular to a transportation network speed forecasting method using deep capsule networks with nested LSTM models.
Transport prediction is an important topic in transportation research. It predicts future traffic congestion from historical traffic data. Transport prediction has become one of the most powerful tools for relieving traffic congestion, not only by providing commuters with better routing schemes but also by giving traffic planners key management insights. With the widespread installation of intelligent transportation systems (ITS) and global positioning systems (GPS) on buses, the cost of collecting data is much lower than with traditional data collection methods such as surveys and loop detectors. The resulting volume of data makes large-scale transport prediction feasible, and with it macroscopic traffic control based on analysis of the congestion data.
Road traffic is inherently dynamic, complex and unstable because of the complexity of transport networks, such as the coexistence of main roads, road intersections, expressways, etc. Moreover, the quality of the data captured by ITS varies greatly, even though the data volume is huge. The collected data is usually highly unstructured, heterogeneous in quality, and dynamic in time and space. These characteristics make it very challenging for conventional machine learning methods to extract valuable information from the data. To address these problems, recent years have seen a trend toward deep learning models for analyzing traffic data. With deep and well-tuned model structures, deep learning models show greater learning and generalization abilities than conventional machine learning methods. They can make much more accurate network-level predictions by mining the time-space evolution patterns of traffic from the collected big data.
However, deep learning models for traffic prediction have some limitations to date: (1) Deep learning models that construct a time series for each road segment and make predictions by mining the time evolution patterns with recurrent neural networks have low prediction accuracy, because they only consider value correlations across time for separate road segments. Traffic correlations across space are not considered in these models; (2) For convolutional deep learning models that represent traffic as images and learn time-space traffic relations through multiple convolution and pooling layers, the prediction accuracy is extremely unstable and depends on the order in which road segments are placed along one dimension of the time-space image; (3) Other deep learning models introduce coordinate systems into traffic networks, treat traffic evolution across time as frames of a video, and apply convolutional and recurrent networks to mine the time-space patterns of traffic. These models ignore the graph structure of traffic networks and treat overlapping road segments (such as a bridge and the roads under it) as one, so they cannot efficiently capture traffic flows on complex traffic networks with overlapping road structures. Moreover, the size of the squares in the coordinate system also has a great influence on the prediction accuracy of these models.
The Application
A transportation network forecasting method using deep capsule networks (CapsNet) with nested LSTM models (NLSTM) is proposed in this application to address the limitations of current practice and to efficiently mine the time-space patterns of traffic in complex traffic networks. Specifically, the model uses CapsNet to extract the spatial features of traffic networks and uses NLSTM to capture the hierarchical temporal dependencies in traffic sequence data. The CapsNet and NLSTM are connected sequentially to form the final model.
The model achieves its predictive power through the following steps.
1) Data Preprocessing.
First, a speed profile is set up for each road segment in three steps. The first step divides the traffic network into n road links. The second step discretizes the investigated time period into intervals. The time interval should be neither too long nor too short, so that the traffic evolution pattern over short time periods can be captured; a natural choice is around 2-4 minutes. The third step calculates the average travel speed of each link in each time interval. The average travel speed $V_a^t$ for link $a \in \{1, 2, \ldots, n\}$ at time t is given by
$$V_a^t = \frac{1}{k}\sum_{i=1}^{k} V_i^t$$
where k is the number of cars that travel through the road link during this time interval, and $V_i^t$ represents the average travel speed of car i.
Then, the mapping between the average speeds and the road links is established in GIS maps.
Finally, the geographical area of the road network is meshed into squares (grid coordinates). A value representing the average speed is assigned to each square, calculated as follows. For squares with no road links, the value is zero. For squares with at least one link, the value is the average speed of these links. By treating these average speeds as image pixels, images representing the traffic state of the network in every time interval are obtained. These images are the inputs of the proposed model. The model outputs are vectors containing the average speeds of all road links in the next time interval. Let (X, Y) represent the model inputs and outputs.
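The averaging step above can be sketched in a few lines of Python. This is a minimal illustration only: the record layout `(link_id, minute_of_day, vehicle_speed)` and the function name `average_link_speeds` are hypothetical and not part of the application.

```python
from collections import defaultdict


def average_link_speeds(records, interval_minutes=2):
    """Average the per-vehicle speeds for each (link, time interval) pair.

    records: iterable of (link_id, minute_of_day, vehicle_speed) tuples.
    This layout is an assumption made for the example.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for link_id, minute, speed in records:
        t = int(minute // interval_minutes)  # index of the time interval
        sums[(link_id, t)] += speed
        counts[(link_id, t)] += 1
    # Mean over the k vehicles observed on the link in the interval.
    return {key: sums[key] / counts[key] for key in sums}


records = [
    ("a1", 0.5, 30.0),
    ("a1", 1.5, 40.0),
    ("a1", 2.2, 20.0),
    ("a2", 0.1, 50.0),
]
speeds = average_link_speeds(records)
```

Pairs with no observed vehicle are simply absent from the result; how such gaps are filled is not specified in the text.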
2) Constructing CapsNet to Extract Spatial Features of Traffic Networks.
CapsNet first extracts a variety of local features of traffic speed through a primary layer. The local features are then integrated into high-level features (represented by vectors) by the final layers. The integrated features contain information not only about local time-space patterns between road links, but also about the high-level correlation between these local features. Thus, the integrated features represent traffic patterns of the whole network while encapsulating local patterns in high-level representations.
3) Constructing NLSTM to Capture the Hierarchical Temporal Dependencies in Traffic Sequence Data.
The inputs of the NLSTM are the output vectors of CapsNet. NLSTM transforms the traditional two-layer LSTM structure into two LSTM structures connected by a gate unit. NLSTM treats the input vectors as time-series in training.
4) Connecting CapsNet and NLSTM to Predict Traffic Speeds at Network Level.
The output vectors of the CapsNet model, which represent traffic patterns of the transport network, are fed into the NLSTM model as time series to learn temporal patterns across these abstract features. NLSTM then predicts future traffic states (i.e., traffic speeds) through a fully connected layer. In summary, the model predicts future traffic states by learning the historical traffic patterns represented as images (in step 1).
This application has the following advantages.
This application solves the problem that the spatial structure of road links in complex traffic networks cannot be handled efficiently by traditional statistical models and machine learning models. This application represents traffic states over time as images, and uses a CapsNet model and an NLSTM model to learn spatial and temporal traffic patterns, respectively. The proposed model has much higher prediction accuracy than traditional methods.
This application uses a more advanced deep learning structure called CapsNet. The CapsNet model is more powerful than CNN models in handling overlapping road structures and low data resolution. CapsNet uses vector neurons instead of scalar neurons, so that more comprehensive time-space features of traffic can be preserved, such as link location, length, direction and traffic speed.
This application changes the sequential layer structure of LSTM into internal and external structures and connects them with a gate unit, so that information can be passed between the internal and external memory units without the secondary screening process of a sequential structure. This characteristic makes the model more stable and efficient when dealing with long-term historical information.
Compared with traditional methods, this application makes predictions not only by mining time-space patterns of traffic, but also by targeting and analyzing complex road structures, such as overlaps between roads and bridges. This application fills a gap, since few practical methods have been proposed for traffic prediction on complex road structures. The tests show that the model is accurate and robust.
This application is a transportation network speed forecasting method using deep capsule networks with nested LSTM models. The implementation steps are as follows.
1. Data Preprocessing and Training Dataset Generation
The selected network (
The road network is segmented by grids with a size of 0.0001°×0.0001° (latitude and longitude). The value of each grid is determined from the speeds of the links using the following criteria: if no link passes through the grid area, the value is zero; if only one link passes through the grid area, the value is the speed of this link; if multiple links pass through the same grid area, the value is the average speed of all these links.
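The three grid criteria can be sketched as follows. The sketch assumes that the cells each link passes through have already been found by a GIS step, which is not shown; the names `rasterize_speeds`, `link_cells` and `link_speeds` are hypothetical.

```python
import numpy as np


def rasterize_speeds(link_cells, link_speeds, shape):
    """Build one single-channel image of grid speeds.

    link_cells:  {link_id: [(row, col), ...]} cells each link passes through
                 (assumed to come from a separate GIS intersection step).
    link_speeds: {link_id: average speed in the current time interval}.
    shape:       (rows, cols) of the grid.
    """
    total = np.zeros(shape)
    count = np.zeros(shape)
    for link_id, cells in link_cells.items():
        for row, col in set(cells):  # count each link once per cell
            total[row, col] += link_speeds[link_id]
            count[row, col] += 1
    image = np.zeros(shape)
    # Cells with no link stay at zero; the others get the mean link speed.
    np.divide(total, count, out=image, where=count > 0)
    return image


link_cells = {"a1": [(0, 0), (0, 1)], "a2": [(0, 1), (1, 1)]}
link_speeds = {"a1": 30.0, "a2": 50.0}
image = rasterize_speeds(link_cells, link_speeds, shape=(2, 2))
```

In this toy grid, cell (0, 1) is crossed by both links and receives their mean, while cell (1, 0) has no link and stays at zero.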
On the basis of the above process, each grid is taken as a pixel with one channel, whose value is the projected speed value. Sequences of images are generated as data samples, and the time interval in these sequences is 2 minutes. These images not only represent the traffic state but also contain the spatial structure of the road network and the relative topology among different links.
The model input is a two-dimensional vector containing the traffic states in the last 15 time intervals (i.e., 30 minutes). The model output is a vector containing the traffic states of all road links in the following 3 time intervals (i.e., 6 minutes). One training sample of the model is represented as $s = [(x_1, x_2, \ldots, x_{15}), (y_1, y_2, y_3)]$, where $\{x_i\}_{i=1}^{15}$ represents the traffic states observed in the last 15 time intervals and $(y_1, y_2, y_3)$ represents the traffic states in the 3 future time intervals. The implementation uses data from Jun. 1, 2015 to Jun. 30, 2015 as the training set, and data from Aug. 1, 2015 to Aug. 14, 2015 as the test set. Traffic data between 6:00 AM and 10:00 PM is used, so there are 481 samples every day.
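A sliding-window construction of such samples might look like the sketch below. The array layouts, the function name `make_samples`, and the choice to slide the window by one interval at a time are assumptions; the text does not state how windows are placed or how day boundaries are handled.

```python
import numpy as np


def make_samples(frames, link_speeds, n_in=15, n_out=3):
    """Pair n_in past images with the link speeds of the next n_out intervals.

    frames:      array (T, H, W), one traffic-state image per interval.
    link_speeds: array (T, n_links), per-link speeds per interval.
    Both layouts are assumptions made for this example.
    """
    xs, ys = [], []
    last_start = len(frames) - n_in - n_out
    for start in range(last_start + 1):
        xs.append(frames[start:start + n_in])
        ys.append(link_speeds[start + n_in:start + n_in + n_out])
    return np.stack(xs), np.stack(ys)


rng = np.random.default_rng(0)
frames = rng.random((20, 4, 5))      # 20 intervals of a toy 4x5 grid
link_speeds = rng.random((20, 7))    # 7 toy road links
X, Y = make_samples(frames, link_speeds)
```

With 2-minute intervals, 6:00 AM to 10:00 PM spans 960 / 2 = 480 intervals, or 481 time stamps if both endpoints are counted, which is one possible reading of the per-day figure quoted above.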
2. Constructing CapsNet to Extract Spatial Features of Traffic Networks.
CapsNet is a new type of NN structure. It replaces scalar neurons in the CNN with vector neurons, so that much more comprehensive traffic information can be kept, such as rotation angle, direction, and size of local features. In addition, CapsNet can retain all the extracted local features by replacing the pooling operation with a dynamic routing operation between capsule layers. Thus, CapsNet has greater learning ability than CNN because it keeps spatial relationships among road links.
CapsNet is composed of primary capsule layers (PrimaryCaps) and fully connected layers (TrafficCaps). The implementation of CapsNet is shown in
The output of each capsule is activated by a squashing function:
$$v_j = \frac{\|s_j\|^2}{1 + \|s_j\|^2}\,\frac{s_j}{\|s_j\|} \quad (1)$$
where $v_j$ is the output vector, and $s_j$ is the input vector. The squashing operation ensures that short vectors shrink to approximately zero length and long vectors shrink to a length slightly below 1. Thus, the length of the output vector of a capsule can represent the probability that the extracted local feature exists.
In the convolution layers, the value of each neuron is the activated weighted sum of the neurons in the preceding layer. The network is trained using back propagation. The structure of the CapsNet is described as follows.
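A short NumPy sketch illustrates the squashing behaviour described above. The function name `squash` and the small constant `eps`, added only to avoid division by zero, are assumptions of this example.

```python
import numpy as np


def squash(s, axis=-1, eps=1e-9):
    """Shrink short vectors toward zero and long vectors to just under 1."""
    sq_norm = np.sum(s * s, axis=axis, keepdims=True)
    scale = sq_norm / (1.0 + sq_norm)          # target length in [0, 1)
    return scale * s / np.sqrt(sq_norm + eps)  # keep the direction of s


short_vec = squash(np.array([0.01, 0.0]))
long_vec = squash(np.array([100.0, 0.0]))
short_len = np.linalg.norm(short_vec)
long_len = np.linalg.norm(long_vec)
```

Here `short_len` is close to zero and `long_len` is slightly below 1, which is what allows the vector length to be read as a probability.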
First, to obtain the spatial relationship between the local features of the network-level traffic state extracted by the primary layer and the advanced features, an affine transformation is performed by multiplying the local features by a weight matrix $W_{ij}$:
$$\hat{u}_{j|i} = W_{ij} u_i \quad (2)$$
where $u_i$ is the local feature extracted by primary capsule i, and $\hat{u}_{j|i}$ is the input vector associated with advanced capsule j.
Then, the input $s_j$ to advanced capsule j is the weighted sum over all input vectors $\hat{u}_{j|i}$ from the primary capsule layer:
$$s_j = \sum_i c_{ij} \hat{u}_{j|i} \quad (3)$$
where the weights $c_{ij}$ are the coupling coefficients, which are determined by an iterative dynamic routing algorithm. The essence of the dynamic routing algorithm is to find the subset of primary capsules that is highly correlated with each advanced capsule, that is, to determine the local features with a high probability of being associated with the high-level feature. This process represents the capability of the model to explore the spatial relationships among distant links. The dynamic routing algorithm is described as follows.
1) For each primary capsule i in the primary capsule layer, the coupling coefficients $c_{ij}$ with all the advanced capsules j are normalized to sum to 1 by using a SoftMax function:
$$c_{ij} = \frac{\exp(b_{ij})}{\sum_k \exp(b_{ik})}$$
where the routing logit $b_{ij}$ is the log prior probability that capsule i should be coupled to capsule j, and the output $c_{ij}$ represents the normalized probability that primary capsule i is associated with advanced capsule j. In the first iteration, the initial value of each routing logit $b_{ij}$ is set to zero, so that a primary capsule is equally likely to be accepted by every advanced capsule.
2) After the weights $c_{ij}$ are calculated for all the primary capsules, the input $s_j$ to each advanced capsule j is computed as the weighted sum in Equation (3).
3) The input vector to the advanced capsule layer is activated by the squashing function. The output is $v_j$.
4) The routing logit $b_{ij}$ is updated by the following rule:
$$b_{ij} = b_{ij} + \hat{u}_{j|i} \cdot v_j$$
The routing logit $b_{ij}$ is updated by using the dot product of the input to capsule j and its output. The dot product is large when two vectors are similar. Therefore, the corresponding routing logit increases when the input and output are similar; thus, the primary capsule is coupled to the advanced capsule with a similar output. This process represents the association of local features with the high-level feature.
5) Steps 1-4 are repeated to obtain the optimal routing weights. The dynamic routing algorithm is easy to optimize, and experiments show that the CapsNet model can be optimized by iterating three times on the training dataset.
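The routing steps listed above can be sketched in NumPy as follows. This is an illustrative sketch rather than the implementation used in the application: the function names, the tensor layout of `u_hat`, and the use of three routing iterations as the default are assumptions.

```python
import numpy as np


def squash(s, eps=1e-9):
    """Squashing activation applied along the last axis."""
    sq_norm = np.sum(s * s, axis=-1, keepdims=True)
    return (sq_norm / (1.0 + sq_norm)) * s / np.sqrt(sq_norm + eps)


def dynamic_routing(u_hat, n_iter=3):
    """Route primary capsules to advanced capsules.

    u_hat: array (n_primary, n_advanced, dim), where u_hat[i, j] is the
           transformed input from primary capsule i to advanced capsule j.
    Returns the advanced capsule outputs v and the coupling coefficients c
    that were used to compute them.
    """
    n_primary, n_advanced, _ = u_hat.shape
    b = np.zeros((n_primary, n_advanced))  # routing logits start at zero
    for _ in range(n_iter):
        # Step 1: softmax over advanced capsules for each primary capsule.
        e = np.exp(b - b.max(axis=1, keepdims=True))
        c = e / e.sum(axis=1, keepdims=True)
        # Step 2: weighted sum of the inputs to each advanced capsule.
        s = np.einsum("ij,ijd->jd", c, u_hat)
        # Step 3: squashing activation.
        v = squash(s)
        # Step 4: raise the logits where input and output agree.
        b = b + np.einsum("ijd,jd->ij", u_hat, v)
    return v, c


rng = np.random.default_rng(0)
u_hat = rng.normal(size=(6, 4, 8))  # 6 primary, 4 advanced, 8-dim vectors
v, c = dynamic_routing(u_hat)
```

Subtracting the row maximum before exponentiating does not change the softmax result; it is only a numerical-stability measure added for the example.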
3. Capture Temporal Relationship Between Traffic States Using NLSTM
The internal LSTM unit of the NLSTM is computed as follows:
$$\tilde{i}_t = \tilde{\sigma}_i(\tilde{x}_t \tilde{W}_{xi} + \tilde{h}_{t-1} \tilde{W}_{hi} + \tilde{b}_i)$$
$$\tilde{f}_t = \tilde{\sigma}_f(\tilde{x}_t \tilde{W}_{xf} + \tilde{h}_{t-1} \tilde{W}_{hf} + \tilde{b}_f)$$
$$\tilde{c}_t = \tilde{f}_t \odot \tilde{c}_{t-1} + \tilde{i}_t \odot \tilde{\sigma}_c(\tilde{x}_t \tilde{W}_{xc} + \tilde{h}_{t-1} \tilde{W}_{hc} + \tilde{b}_c)$$
$$\tilde{o}_t = \tilde{\sigma}_o(\tilde{x}_t \tilde{W}_{xo} + \tilde{h}_{t-1} \tilde{W}_{ho} + \tilde{b}_o)$$
$$\tilde{h}_t = \tilde{o}_t \odot \tilde{\sigma}_h(\tilde{c}_t)$$
where $\tilde{x}_t$ and $\tilde{h}_{t-1}$ are the inputs of the internal LSTM unit. They can be calculated as
$$\tilde{x}_t = i_t \odot \sigma_c(x_t W_{xc} + h_{t-1} W_{hc} + b_c)$$
$$\tilde{h}_{t-1} = f_t \odot c_{t-1}$$
where $\tilde{i}_t$, $\tilde{f}_t$, and $\tilde{o}_t$ are the states of the three gates; $\tilde{c}_t$ is the cell input state; $\tilde{W}_{xi}$, $\tilde{W}_{xf}$, $\tilde{W}_{xo}$, and $\tilde{W}_{xc}$ are the weight matrices that connect $\tilde{x}_t$ to the three gates and the cell input; $\tilde{W}_{hi}$, $\tilde{W}_{hf}$, $\tilde{W}_{ho}$, and $\tilde{W}_{hc}$ are the weight matrices that connect $\tilde{h}_{t-1}$ to the three gates and the cell input; $\tilde{b}_i$, $\tilde{b}_f$, $\tilde{b}_o$, and $\tilde{b}_c$ are the biases of the three gates and the cell input; $\sigma$ represents the sigmoid function; and $\odot$ represents the element-wise product of two vectors.
For the external LSTM unit, only the cell state update rule is changed to the output of the internal LSTM, i.e., ct={tilde over (h)}t.
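One time step of such a nested unit can be sketched in NumPy. Several choices here are assumptions rather than statements from the text: the external gates are taken to follow the standard LSTM gate equations, the cell and output activations are taken to be tanh, the external hidden state is taken to be the output gate applied to the activated cell state, and all names (`init_params`, `affine`, `nlstm_step`) and sizes are hypothetical.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def init_params(n_in, n_hidden, rng):
    """Input weights, recurrent weights and bias for transforms i, f, o, c."""
    return {
        g: (rng.normal(scale=0.1, size=(n_in, n_hidden)),
            rng.normal(scale=0.1, size=(n_hidden, n_hidden)),
            np.zeros(n_hidden))
        for g in "ifoc"
    }


def affine(params, gate, x, h):
    w_x, w_h, b = params[gate]
    return x @ w_x + h @ w_h + b


def nlstm_step(x, h_prev, c_prev, c_inner_prev, outer, inner):
    # External gates (assumed to be standard LSTM gates).
    i = sigmoid(affine(outer, "i", x, h_prev))
    f = sigmoid(affine(outer, "f", x, h_prev))
    o = sigmoid(affine(outer, "o", x, h_prev))
    # Inputs handed to the internal LSTM unit.
    x_in = i * np.tanh(affine(outer, "c", x, h_prev))
    h_in = f * c_prev
    # Internal LSTM unit.
    i_in = sigmoid(affine(inner, "i", x_in, h_in))
    f_in = sigmoid(affine(inner, "f", x_in, h_in))
    o_in = sigmoid(affine(inner, "o", x_in, h_in))
    c_inner = f_in * c_inner_prev + i_in * np.tanh(affine(inner, "c", x_in, h_in))
    h_inner = o_in * np.tanh(c_inner)
    # External cell state is the output of the internal unit.
    c = h_inner
    h = o * np.tanh(c)
    return h, c, c_inner


rng = np.random.default_rng(0)
n_features, n_hidden = 16, 8
outer = init_params(n_features, n_hidden, rng)
inner = init_params(n_hidden, n_hidden, rng)  # internal input has size n_hidden
h = c = c_inner = np.zeros(n_hidden)
for x in rng.normal(size=(15, n_features)):   # 15 time steps of toy features
    h, c, c_inner = nlstm_step(x, h, c, c_inner, outer, inner)
```

The loop over 15 steps mirrors the 15 input intervals used elsewhere in the text; in the full model the per-step inputs would be the CapsNet output vectors rather than random features.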
4. Combine Models to Predict Future Traffic State
The final model connects the CapsNet model and the NLSTM model sequentially, followed by a fully connected layer. The structure of the final model is as follows.
The deep learning model is implemented with the Keras framework and is trained on a server with 8 NVIDIA GeForce Titan X GPUs (12 GB RAM).
5. Evaluation Metrics and Model Comparison
When the testing dataset is fed into the trained model, traffic states for the next six minutes are predicted from the preceding 30 minutes of data. The MSE and MAPE are calculated as follows.
$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}(\hat{y}_i - y_i)^2$$
$$\mathrm{MAPE} = \frac{100\%}{n}\sum_{i=1}^{n}\left|\frac{\hat{y}_i - y_i}{y_i}\right|$$
where $\hat{y}_i$ is the predicted value, $y_i$ is the true value, and n is the number of predicted values. The prediction accuracy is demonstrated as follows.
The results show that the proposed model generates the lowest MSEs and MAPEs under all circumstances, suggesting that the proposed model can mine traffic patterns efficiently and is accurate and stable in traffic state prediction.
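For reference, the two error metrics named above can be computed as in the following sketch. It uses the standard definitions of MSE and MAPE (the latter expressed in percent), and the sample values are made up for illustration only; they are not results from the application.

```python
import numpy as np


def mse(y_true, y_pred):
    """Mean squared error between true and predicted speeds."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean((y_pred - y_true) ** 2))


def mape(y_true, y_pred):
    """Mean absolute percentage error in percent; y_true must be nonzero."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs((y_pred - y_true) / y_true)) * 100.0)


y_true = [40.0, 50.0, 20.0]   # illustrative speeds only
y_pred = [44.0, 45.0, 20.0]
mse_value = mse(y_true, y_pred)
mape_value = mape(y_true, y_pred)
```

Because MAPE divides by the true value, intervals in which a link has zero recorded speed would need to be excluded or handled separately; the text does not say how this case is treated.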
Number | Name | Date | Kind | Country
---|---|---|---|---
20200135017 | Ma | Apr 2020 | A1 | US