Transportation network speed forecasting method using deep capsule networks with nested LSTM models

Description

TECHNICAL FIELD

This application relates to traffic information prediction, and specifically to a transportation network speed forecasting method using deep capsule networks with nested LSTM models.

BACKGROUND

Traffic prediction is an important topic in transportation research. It predicts future traffic congestion from historical traffic data. It has become one of the most powerful tools for relieving congestion, both by giving commuters better routing options and by giving traffic planners key management insights. With the widespread installation of intelligent transportation systems (ITS) and global positioning systems (GPS) on buses, the cost of collecting data is much lower than with traditional collection methods such as surveys and loop detectors. The resulting volume of data makes large-scale traffic prediction feasible, and with it macro-level traffic control based on analysis of the congestion data.

Road traffic is inherently dynamic, complex and unstable because of the complexity of transport networks, such as the coexistence of main streams, road intersections, expressways, etc. Moreover, the quality of the data captured by ITS systems varies greatly, even though the data volume is huge. The collected data is usually highly unstructured, heterogeneous in quality, and dynamic in time and space. These characteristics make it very challenging for conventional machine learning methods to extract valuable information from the data. To address these problems, recent years have shown a trend toward deep learning models for traffic data analysis. Deep learning models show greater learning and generalization abilities than conventional machine learning methods by adopting deep and well-tuned model structures. They can make much more accurate network-level predictions by mining the time-space evolution patterns of traffic from the collected big data.

However, deep learning models for traffic prediction have some limitations to date: (1) Deep learning models that construct a time series for each road segment and make predictions by mining its time evolution patterns with recurrent neural networks have low prediction accuracy, because these models only consider value correlations across time for separate road segments. Traffic correlations across space are not considered in these models; (2) Convolutional deep learning models that represent traffic as images and learn time-space traffic relations through multiple convolution and pooling layers have extremely unstable prediction accuracy, which depends on the order in which road segments are placed along one dimension of the time-space image; (3) Other deep learning models introduce coordinate systems into traffic networks, treat the evolution of traffic over time as frames of a video, and apply convolutional and recurrent networks to mine the time-space patterns of traffic. These models ignore the graph structure of traffic networks and treat overlapping road segments (such as a bridge and the roads under it) as one, so they cannot efficiently capture traffic flows on complex traffic networks with overlapping road structures. Moreover, the size of the squares in the coordinate system also has a great influence on the prediction accuracy of these models.

The Application

A transportation network forecasting method using deep capsule networks (CapsNet) with nested LSTM models (NLSTM) is proposed in this application to address the limitations of current practice and to efficiently mine the time-space patterns of traffic in complex traffic networks. Specifically, the model uses CapsNet to extract the spatial features of traffic networks and utilizes NLSTM to capture the hierarchical temporal dependencies in traffic sequence data. The CapsNet and NLSTM are sequentially connected to form the final model.

DESCRIPTION

The model realizes its prediction power through the following steps.

1) Data Preprocessing.

First, setting up a speed profile for each road segment in three steps. The first step divides the traffic network into n road links. The second step discretizes the investigated time into intervals. The time interval should be neither too long nor too short, so that the traffic evolution pattern over short time periods can be captured. A natural choice of time interval is around 2-4 minutes. The third step calculates the average travel speed of each link in each time interval. The average travel speed V_at for link a∈(1, 2, . . . , n) at time t is given by

$V_{at} = \frac{\sum_{i = 1}^{k} V_{it}}{k}$

where k is the number of cars that travel through the road link in this time interval, and V_it represents the average travel speed of car i.
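The averaging step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout `(link_id, timestamp_s, speed)`, the function name `average_link_speeds`, and the 120-second interval are hypothetical choices, not part of the method as described.

```python
from collections import defaultdict

def average_link_speeds(records, interval_s=120):
    """Average speed per (link, interval) from hypothetical vehicle records.

    records: iterable of (link_id, timestamp_s, speed), one entry per car
    that travels through the link.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for link_id, timestamp_s, speed in records:
        t = int(timestamp_s // interval_s)   # index of the time interval
        sums[(link_id, t)] += speed          # numerator: sum of V_it
        counts[(link_id, t)] += 1            # denominator: k
    return {key: sums[key] / counts[key] for key in sums}

# Toy data: two cars on link 1 in interval 0, one in interval 1, one on link 2.
records = [(1, 10, 30.0), (1, 50, 40.0), (1, 130, 20.0), (2, 5, 60.0)]
speeds = average_link_speeds(records)
```

Link and interval pairs with no cars are simply absent from the result; how such gaps are filled is a separate decision.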

Then, establishing the mapping relationship between the average speed and road link in GIS maps.

Finally, the geographical area of the road network is meshed into a grid of squares. A value representing the average speed is assigned to each square as follows. For squares containing no road link, the value is zero. For squares containing at least one link, the value is the average speed of those links. Treating these values as image pixels yields images that represent the traffic state of the network in every time interval. These images are the inputs of the proposed model. The model outputs are vectors containing the average speeds of all road links in the next time interval. Let (X, Y) represent the model inputs and outputs.

2) Constructing CapsNet to Extract Spatial Features of Traffic Networks.

CapsNet first extracts a variety of local features of traffic speed through a primary layer. The local features are then integrated into high-level features (represented by vectors) by the final layers. The integrated features contain information not only about local time-space patterns between road links, but also about the high-level correlations between these local features. Thus, the integrated features represent traffic patterns of the whole network, while encapsulating local patterns into high-level representations.

3) Constructing NLSTM to Capture the Hierarchical Temporal Dependencies in Traffic Sequence Data.

The inputs of the NLSTM are the output vectors of CapsNet. NLSTM transforms the traditional two-layer LSTM structure into two LSTM structures connected by a gate unit. NLSTM treats the input vectors as time-series in training.

4) Connecting CapsNet and NLSTM to Predict Traffic Speeds at Network Level.

The output vectors of the CapsNet model, which represent traffic patterns of the transport network, are fed into the NLSTM model as time series to learn temporal patterns across these abstract features. The NLSTM makes predictions on future traffic states (i.e., traffic speeds) through a fully connected layer. In summary, the model predicts future traffic states by learning the historical traffic patterns represented as images (in step 1).

This application has the following advantages.

This application solves the problem that the spatial structure of road links in complex traffic networks cannot be handled efficiently by traditional statistical models and machine learning models. This application represents traffic states over time as images, and utilizes a CapsNet model and an NLSTM model to learn spatial and temporal traffic patterns, respectively. The proposed model has higher prediction accuracy than the traditional methods it is compared against.

This application uses a more advanced deep learning structure called CapsNet. The CapsNet model is more powerful than CNN models in handling overlapping road structures and low data resolution. CapsNet uses vector neurons instead of scalar neurons, so that more comprehensive time-space features of traffic, such as link location, length, direction and traffic speed, can be preserved.

This application restructures the sequential layers of LSTM into internal and external structures and connects them with a gate unit, so that information can be passed between the internal and external memory units without the secondary screening step that a sequential structure requires. This characteristic makes the model more stable and efficient when dealing with long-term historical information.

Compared with traditional methods, this application makes predictions not only by mining time-space patterns of traffic, but also by targeting and analyzing complex road structures, such as overlaps between roads and bridges. This application fills a gap in that few practical methods have been proposed to handle traffic prediction for complex road structures. The tests show that the model is accurate and robust.

DESCRIPTION OF DRAWINGS

FIG. 1 shows the research flow chart.

FIG. 2 shows the mapping process of traffic speeds and road links.

FIG. 3 shows the underlying grid system of transport networks.

FIG. 4 shows the structure of CapsNet.

FIG. 5 shows the structure of NLSTM.

IMPLEMENTATION STEPS

This application is a transportation network speed forecasting method using deep capsule networks with nested LSTM models. The implementation steps are as follows.

1. Data Preprocessing and Training Dataset Generation

The selected network (FIG. 2A) is a transport network in Beijing. The network covers an area of 2.42 square kilometers (1.64 km × 1.48 km) and contains 278 road links that are geographically close to each other. The average speed of each road link is calculated over 2-minute time intervals. The average speed is set to the free flow speed when no cars pass through the link. The average speeds are mapped to links in the transport network as shown in FIG. 2B.

The road network is segmented into grid cells with a size of 0.0001°×0.0001° (latitude and longitude). The value of each grid cell is determined from the speeds of the links using the following criteria: if no link passes through the grid cell, the value is zero; if only one link passes through the grid cell, the value is the speed of this link; if multiple links pass through the same grid cell, the value is the average speed of all those links.

On the basis of the above process, each grid cell is taken as a pixel with one channel, whose value is the projected speed value. Sequences of images are generated as data samples, and the time interval between consecutive images in these sequences is 2 minutes. These images not only represent the traffic state but also contain the spatial structure of the road network and the relative topology among different links.
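The three grid-value criteria can be sketched as follows. This is a minimal illustration, assuming the link-to-cell mapping has already been derived from the GIS data; the names `rasterize`, `cell_links`, and `link_speed` are hypothetical.

```python
import numpy as np

def rasterize(cell_links, link_speed, shape):
    """Build a one-channel traffic image from link speeds.

    cell_links: dict mapping (row, col) -> list of link ids crossing that cell
                (assumed to be precomputed from the road geometry).
    link_speed: dict mapping link id -> average speed in the current interval.
    """
    image = np.zeros(shape, dtype=float)      # cells with no link stay at zero
    for (row, col), links in cell_links.items():
        if links:
            # One link: its speed. Several links: the mean of their speeds.
            image[row, col] = np.mean([link_speed[a] for a in links])
    return image

cell_links = {(0, 0): [1], (0, 1): [1, 2], (1, 1): []}
link_speed = {1: 30.0, 2: 50.0}
image = rasterize(cell_links, link_speed, shape=(2, 2))
```

Calling this once per time interval gives the image sequence used as model input.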

The model input contains the traffic states of the last 15 time intervals (i.e., 30 minutes), each represented as a two-dimensional image. The model output is a vector containing the traffic states of all road links in the following 3 time intervals (i.e., 6 minutes). One training sample of the model is represented as s=[(x₁, x₂, . . . , x₁₅), (y₁, y₂, y₃)], where (x₁, x₂, . . . , x₁₅) represent the traffic states observed in the last 15 time intervals and (y₁, y₂, y₃) represent the traffic states in the 3 future time intervals. The implementation uses data from Jun. 1, 2015 to Jun. 30, 2015 as the training set, and data from Aug. 1, 2015 to Aug. 14, 2015 as the test set. Traffic data between 6:00 AM and 10:00 PM is used, so there are 481 samples every day.
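A sliding-window construction of such samples might look like the sketch below. It assumes the window advances by one interval at a time and does not handle day boundaries, neither of which is specified here; `make_samples` and the toy array sizes are hypothetical.

```python
import numpy as np

def make_samples(images, link_speeds, n_in=15, n_out=3):
    """Pair 15 past traffic images with the link speeds of the next 3 intervals.

    images:      array of shape (T, H, W), one traffic image per interval.
    link_speeds: array of shape (T, n_links), one speed vector per interval.
    """
    xs, ys = [], []
    for t in range(n_in, len(images) - n_out + 1):
        xs.append(images[t - n_in:t])         # x_1 ... x_15
        ys.append(link_speeds[t:t + n_out])   # y_1, y_2, y_3
    return np.stack(xs), np.stack(ys)

# Toy sizes for illustration only.
T, H, W, n_links = 40, 4, 5, 7
images = np.random.rand(T, H, W)
link_speeds = np.random.rand(T, n_links)
X, Y = make_samples(images, link_speeds)
```

With T intervals this yields T - 17 samples, so the 40 toy intervals above give 23.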

2. Constructing CapsNet to Extract Spatial Features of Traffic Networks.

CapsNet is a new type of neural network structure. It replaces the scalar neurons of a CNN with vector neurons, so that much more comprehensive traffic information can be kept, such as the rotation angle, direction, and size of local features. In addition, CapsNet can retain all the extracted local features by replacing the pooling operation with a dynamic routing operation between capsule layers. Thus, CapsNet has greater learning ability than CNN because it keeps the spatial relationships among road links.

CapsNet is composed of primary capsule layers (PrimaryCaps) and fully connected layers (TrafficCaps). The implementation of CapsNet is shown in FIG. 4. The model contains two convolution layers and one fully connected layer. The input image representing historical traffic states is first fed into the first convolution layer to learn local features between road links. Then, the PrimaryCaps layer learns more abstract features from these local features. The abstract features capture traffic patterns between links that are far away from each other, and they are vectors rather than scalars. Finally, a TrafficCaps layer combines all features and transforms them into predictions. The PrimaryCaps layer in this implementation uses a non-linear activation function called squashing, which is given by

$\begin{matrix} v_{j} = \frac{\| s_{j} \|^{2}}{1 + \| s_{j} \|^{2}} \frac{s_{j}}{\| s_{j} \|} & (1) \end{matrix}$

where v_j is the output vector, and s_j is the input vector. The squashing operation ensures that short vectors shrink to approximately zero length and long vectors shrink to a length slightly below 1. Thus, the length of the output vector of a capsule can represent the probability of the existence of the extracted local features.
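A NumPy sketch of the squashing function in Equation (1) shows the two limiting behaviours. The small constant `eps`, added to avoid division by zero, is an implementation assumption rather than part of the equation.

```python
import numpy as np

def squash(s, axis=-1, eps=1e-9):
    """Squashing non-linearity: keeps the direction of s, maps its length into [0, 1)."""
    sq_norm = np.sum(s ** 2, axis=axis, keepdims=True)   # ||s||^2
    scale = sq_norm / (1.0 + sq_norm)                    # ||s||^2 / (1 + ||s||^2)
    return scale * s / np.sqrt(sq_norm + eps)            # times the unit vector s / ||s||

short = squash(np.array([0.01, 0.0]))   # short vector: length shrinks toward 0
long = squash(np.array([30.0, 40.0]))   # long vector: length just below 1
```

Because the direction is preserved, only the length carries the existence probability.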

In the convolution layers, the value of each neuron is obtained by applying an activation function to the weighted sum of the neurons in the preceding layer. The network is trained using back propagation. The structure of the CapsNet is as follows.

First, to obtain the spatial relationship between the local features of network-level traffic state extracted by the primary layer and advanced features, an affine transformation is performed by multiplying the local features with a weight matrix W_ij.

$\begin{matrix} \hat{u}_{j|i} = W_{ij} u_{i} & (2) \end{matrix}$

where u_i is the local feature vector extracted by primary capsule i, and û_j|i is the input vector associated with advanced capsule j.

Then, the input s_j to an advanced capsule j is the weighted sum over all input vectors û_j|i from the primary capsule layer:

$\begin{matrix} s_{j} = \sum_{i} c_{ij} \hat{u}_{j|i} & (3) \end{matrix}$

where the weights c_ij are the coupling coefficients determined by an iterative dynamic routing algorithm. The essence of the dynamic routing algorithm is to find the subset of primary capsules that is highly correlated with each advanced capsule, that is, to determine the local features that have a high probability of being associated with the high-level feature. This process represents the capability of the model to explore the spatial relationships among distant links. The dynamic routing algorithm is described as follows.

1) For each primary capsule i in the primary capsule layer, the coupling coefficients c_ij with all the advanced capsules j are normalized to sum to 1 by a SoftMax function:

$\begin{matrix} c_{ij} = \frac{\exp (b_{ij})}{\sum_{k} \exp (b_{ik})} & (4) \end{matrix}$

where the routing logit b_ij is the log prior probability that capsule i should be coupled to capsule j, and the output c_ij represents the normalized probability that primary capsule i is associated with advanced capsule j. In the first iteration, the routing logit b_ij is initialized to zero, so that each primary capsule is equally likely to be accepted by every advanced capsule.

2) After all the weights c_ij are calculated for all the primary capsules, the input s_j to each advanced capsule j is computed as the weighted sum in Equation (3).

3) The input vector to the advanced capsule layer is activated by the squashing function. The output is v_j.

4) Updating b_ij on the basis of the following rule:

$b_{ij} = b_{ij} + \hat{u}_{j|i} \cdot v_{j}$

The routing logit b_ij is updated by using the dot product of the input to capsule j and its output. The dot product becomes large for similar vectors. Therefore, the corresponding routing logit increases when the input and output are similar; thus, the primary capsule is coupled to the advanced capsule with a similar output. This process represents the association of local features with the high-level feature.

5) Repeating Steps 1-4 to obtain the optimal routing weights. The dynamic routing algorithm is easy to optimize, and experiments on the training dataset show that three routing iterations are sufficient to optimize the CapsNet model.
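Steps 1-5 can be sketched in NumPy as below. The capsule counts, dimensions, and random weights are toy assumptions chosen for illustration, and the max-subtraction inside the softmax is a numerical-stability detail that does not change Equation (4).

```python
import numpy as np

def squash(s, axis=-1, eps=1e-9):
    sq_norm = np.sum(s ** 2, axis=axis, keepdims=True)
    return (sq_norm / (1.0 + sq_norm)) * s / np.sqrt(sq_norm + eps)

def dynamic_routing(u_hat, n_iter=3):
    """u_hat: array (n_primary, n_advanced, dim) holding the vectors u_hat_{j|i}."""
    n_primary, n_advanced, _ = u_hat.shape
    b = np.zeros((n_primary, n_advanced))            # routing logits start at zero
    for _ in range(n_iter):
        e = np.exp(b - b.max(axis=1, keepdims=True))
        c = e / e.sum(axis=1, keepdims=True)         # step 1, Eq. (4): softmax over j
        s = np.einsum("ij,ijd->jd", c, u_hat)        # step 2, Eq. (3): weighted sum
        v = squash(s)                                # step 3: squashing
        b = b + np.einsum("ijd,jd->ij", u_hat, v)    # step 4: agreement update
    return v, c

rng = np.random.default_rng(0)
u = rng.normal(size=(6, 8))                 # 6 primary capsules of dimension 8
W = rng.normal(size=(6, 3, 16, 8))          # W_ij for 3 advanced capsules of dimension 16
u_hat = np.einsum("ijkd,id->ijk", W, u)     # Eq. (2): u_hat_{j|i} = W_ij u_i
v, c = dynamic_routing(u_hat)
```

In a trained network only the matrices W_ij are learned by back propagation; the coupling coefficients are recomputed by routing for every input.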

3. Capture Temporal Relationship Between Traffic States Using NLSTM

FIG. 5 shows the structure of the NLSTM used in this application. The NLSTM contains an internal LSTM unit and an external LSTM unit. The model input is the learned abstract traffic patterns of the 30 minutes of historical traffic states, and the model output is the predicted traffic state in the near future (i.e., 6 minutes). The following equations model the internal LSTM unit.

$\tilde{I}_{t} = \tilde{\sigma}_{i}(\tilde{x}_{t} \tilde{W}_{xi} + \tilde{h}_{t-1} \tilde{W}_{hi} + \tilde{b}_{i})$
$\tilde{f}_{t} = \tilde{\sigma}_{f}(\tilde{x}_{t} \tilde{W}_{xf} + \tilde{h}_{t-1} \tilde{W}_{hf} + \tilde{b}_{f})$
$\tilde{c}_{t} = \tilde{f}_{t} \odot \tilde{c}_{t-1} + \tilde{I}_{t} \odot \tilde{\sigma}_{c}(\tilde{x}_{t} \tilde{W}_{xc} + \tilde{h}_{t-1} \tilde{W}_{hc} + \tilde{b}_{c})$
$\tilde{o}_{t} = \tilde{\sigma}_{o}(\tilde{x}_{t} \tilde{W}_{xo} + \tilde{h}_{t-1} \tilde{W}_{ho} + \tilde{b}_{o})$
$\tilde{h}_{t} = \tilde{o}_{t} \odot \tilde{\sigma}_{h}(\tilde{c}_{t})$

where $\tilde{x}_{t}$ and $\tilde{h}_{t-1}$ are the inputs of the internal LSTM unit. They can be calculated as

$\tilde{x}_{t} = I_{t} \odot \sigma_{c}(x_{t} W_{xc} + h_{t-1} W_{hc} + b_{c})$
$\tilde{h}_{t-1} = f_{t} \odot c_{t-1}$

where $\tilde{I}_{t}$, $\tilde{f}_{t}$, and $\tilde{o}_{t}$ are the three states of the gates; $\tilde{c}_{t}$ is the cell input state; $\tilde{W}_{xi}$, $\tilde{W}_{xf}$, $\tilde{W}_{xo}$, and $\tilde{W}_{xc}$ are the weight matrices that connect $\tilde{x}_{t}$ to the three gates and cell input; $\tilde{W}_{hi}$, $\tilde{W}_{hf}$, $\tilde{W}_{ho}$, and $\tilde{W}_{hc}$ are the weight matrices that connect $\tilde{h}_{t-1}$ to the three gates and cell input; $\tilde{b}_{i}$, $\tilde{b}_{f}$, $\tilde{b}_{o}$, and $\tilde{b}_{c}$ are the biases of the three gates and cell input; σ represents the sigmoid function; and ⊙ represents the element-wise (Hadamard) product of two vectors.

For the external LSTM unit, only the cell state update rule is changed: the cell state is set to the output of the internal LSTM, i.e., $c_{t} = \tilde{h}_{t}$.
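One time step of the nested structure can be sketched in NumPy as follows. Several details are assumptions made only for this illustration: the external gates are taken to have the standard LSTM form (they are not written out above), tanh is used for the cell-input and output activations σ_c and σ_h, and the sizes and random weights are toy values. The helper names `init_lstm`, `gate`, and `nlstm_step` are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def init_lstm(n_in, n_hid, rng):
    """One (W_x, W_h, b) triple per gate i, f, o and for the cell input c."""
    return {g: (rng.normal(scale=0.1, size=(n_in, n_hid)),
                rng.normal(scale=0.1, size=(n_hid, n_hid)),
                np.zeros(n_hid)) for g in "ifoc"}

def gate(params, name, x, h, act):
    W_x, W_h, b = params[name]
    return act(x @ W_x + h @ W_h + b)

def nlstm_step(x, h_prev, c_prev, c_in_prev, outer, inner):
    # External gates (standard LSTM form, assumed).
    i_t = gate(outer, "i", x, h_prev, sigmoid)
    f_t = gate(outer, "f", x, h_prev, sigmoid)
    o_t = gate(outer, "o", x, h_prev, sigmoid)
    # Inputs handed to the internal LSTM unit.
    x_in = i_t * gate(outer, "c", x, h_prev, np.tanh)
    h_in = f_t * c_prev
    # Internal LSTM unit with its own cell state c_in.
    i_in = gate(inner, "i", x_in, h_in, sigmoid)
    f_in = gate(inner, "f", x_in, h_in, sigmoid)
    o_in = gate(inner, "o", x_in, h_in, sigmoid)
    c_in = f_in * c_in_prev + i_in * gate(inner, "c", x_in, h_in, np.tanh)
    h_inner = o_in * np.tanh(c_in)
    # External cell state is the internal output; then the usual output gate.
    c_t = h_inner
    h_t = o_t * np.tanh(c_t)
    return h_t, c_t, c_in

rng = np.random.default_rng(0)
n_in, n_hid = 5, 4
outer = init_lstm(n_in, n_hid, rng)
inner = init_lstm(n_hid, n_hid, rng)    # internal unit sees vectors of size n_hid
h = c = c_in = np.zeros(n_hid)
for x in rng.normal(size=(15, n_in)):   # 15 time steps of toy feature vectors
    h, c, c_in = nlstm_step(x, h, c, c_in, outer, inner)
```

The internal cell state `c_in` is the additional memory that a single LSTM does not have.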

4. Combine Models to Predict Future Traffic State

The final model connects the CapsNet model and the NLSTM model sequentially and ends with a fully connected layer. The structure of the final model is as follows.

TABLE 1

Model structure of CapsNet + NLSTM

Name of layers	Parameters	Output	Parameter scale
Input	-	164 × 148 × 1	0
Convolution	Kernel size = 9 × 9; Channels = 128; Stride = 2	78 × 70 × 128	10,496
PrimaryCaps (Convolution)	Kernel size = 9 × 9; Channels = 128; Stride = 4	18 × 16 × 128	1,327,232
Reshape	Capsule dimension = 8	4,608 × 8	0
TrafficCaps (Fully connected)	Advanced capsule = 30; Capsule dimension = 16	30 × 16	17,694,720
(Flattened)	-	480	0
NLSTM	Hidden unit = 800	800	9,222,400
Dropout	0.2	800	0
Fully connected	-	278	222,678
Total parameters	-	-	28,477,526
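The output sizes and parameter counts in Table 1 can be reproduced with a few lines of arithmetic. The sketch assumes unpadded convolutions with one bias per channel, no biases in the TrafficCaps weight matrices, and an NLSTM counted as an external LSTM (480 inputs, 800 hidden units) plus an internal LSTM (800 inputs, 800 hidden units); these are assumptions that happen to match the table, not a statement of how the implementation was built.

```python
def conv_out(size, kernel, stride):
    return (size - kernel) // stride + 1             # no padding assumed

def conv_params(kernel, c_in, c_out):
    return kernel * kernel * c_in * c_out + c_out    # weights plus biases

def lstm_params(n_in, n_hid):
    return 4 * (n_in * n_hid + n_hid * n_hid + n_hid)   # 3 gates + cell input

h1, w1 = conv_out(164, 9, 2), conv_out(148, 9, 2)    # 78, 70
h2, w2 = conv_out(h1, 9, 4), conv_out(w1, 9, 4)      # 18, 16
n_primary = h2 * w2 * 128 // 8                       # 4,608 capsules of dimension 8

params = {
    "Convolution": conv_params(9, 1, 128),                     # 10,496
    "PrimaryCaps": conv_params(9, 128, 128),                   # 1,327,232
    "TrafficCaps": n_primary * 30 * 8 * 16,                    # 17,694,720
    "NLSTM": lstm_params(480, 800) + lstm_params(800, 800),    # 9,222,400
    "Fully connected": 800 * 278 + 278,                        # 222,678
}
total = sum(params.values())                                   # 28,477,526
```

The TrafficCaps layer dominates the total because every one of the 4,608 primary capsules has its own 8 × 16 matrix for each of the 30 advanced capsules.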

The deep learning model is implemented using the Keras framework and is trained on a server with 8 NVIDIA GeForce Titan X GPUs (12 GB RAM).

5. Evaluation Metrics and Model Comparison

By feeding the test dataset into the trained model, traffic states over the next six minutes can be predicted from the preceding 30 minutes of data. The mean squared error (MSE) and mean absolute percentage error (MAPE) are calculated as follows.

$MSE = \frac{1}{N} \sum_{i = 1}^{N} {({\hat{y}}_{i} - y_{i})}^{2}$

$MAPE = \frac{1}{N} \sum_{i = 1}^{N} \left| \frac{{\hat{y}}_{i} - y_{i}}{y_{i}} \right|$

where ŷ_i is the predicted value, y_i is the true value, and N is the number of predicted values. The prediction accuracy is demonstrated as follows.
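Both metrics are short to compute. The sketch below assumes the conventional MAPE, in which the absolute error is divided by the true value, and uses made-up speeds purely as a worked example.

```python
import numpy as np

def mse(y_true, y_pred):
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return np.mean((y_pred - y_true) ** 2)

def mape(y_true, y_pred):
    # Conventional form (assumed): absolute error relative to the true value.
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return np.mean(np.abs((y_pred - y_true) / y_true))

y_true = [40.0, 50.0]   # illustrative true speeds
y_pred = [44.0, 45.0]   # illustrative predictions
print(mse(y_true, y_pred))    # (16 + 25) / 2 = 20.5
print(mape(y_true, y_pred))   # (0.1 + 0.1) / 2, about 0.1
```

MAPE is undefined when a true value is zero, so links with zero true speed would need to be excluded or handled separately.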

TABLE 3

Comparison among different methods

Method	MSE (2 min)	MAPE (2 min)	MSE (4 min)	MAPE (4 min)	MSE (6 min)	MAPE (6 min)
LSTMs	41.67	0.2158	44.67	0.2255	48.11	0.2273
NLSTM	39.55	0.2067	44.49	0.2229	47.32	0.2246
DCNNs	42.94	0.2131	47.14	0.2367	51.38	0.2384
CapsNet	35.80	0.1891	42.53	0.2205	47.08	0.2308
CNN + LSTMs	36.57	0.2051	43.10	0.2181	45.90	0.2258
CapsNet + NLSTM	31.04	0.1757	39.29	0.2071	42.88	0.2183

The results show that the proposed model generates the lowest MSEs and MAPEs at all three time steps, suggesting that it mines traffic patterns efficiently and is accurate and stable in traffic state prediction.
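As a quick check of this comparison, the margin between the proposed model and the best competing method in each column of Table 3 can be computed directly from the reported values. The variable names are illustrative; the numbers are copied from the table.

```python
methods = {
    "LSTMs":           [41.67, 0.2158, 44.67, 0.2255, 48.11, 0.2273],
    "NLSTM":           [39.55, 0.2067, 44.49, 0.2229, 47.32, 0.2246],
    "DCNNs":           [42.94, 0.2131, 47.14, 0.2367, 51.38, 0.2384],
    "CapsNet":         [35.80, 0.1891, 42.53, 0.2205, 47.08, 0.2308],
    "CNN + LSTMs":     [36.57, 0.2051, 43.10, 0.2181, 45.90, 0.2258],
    "CapsNet + NLSTM": [31.04, 0.1757, 39.29, 0.2071, 42.88, 0.2183],
}
columns = ["MSE 2 min", "MAPE 2 min", "MSE 4 min",
           "MAPE 4 min", "MSE 6 min", "MAPE 6 min"]
proposed = "CapsNet + NLSTM"

reductions = {}
for k, column in enumerate(columns):
    # Best (lowest) value among the competing methods in this column.
    best_baseline = min(v[k] for name, v in methods.items() if name != proposed)
    reductions[column] = (best_baseline - methods[proposed][k]) / best_baseline

for column, r in reductions.items():
    print(f"{column}: {100 * r:.1f}% below the best competing method")
```

A positive value in every column confirms that the proposed model has the lowest error throughout.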

Claims

1. A road network status prediction method based on a capsule network and a nested long-short term memory neural network, comprising the following specific steps: Step 1. selecting a target road network, dividing same into n road sections, and dividing time at equal interval; Step 2. for a certain time interval t, calculating the average velocity of all vehicles passing through each road section within the time interval t; if no vehicle passes through a certain road section a within the time interval t, replacing the average velocity with the average velocity within the previous time interval; wherein the average velocity of the road section a within the time interval t is calculated as follows:
2. A road network status prediction method based on a capsule network and a nested long-short term memory neural network of claim 1, wherein in the step 1, the time interval is divided by taking the rule of capturing the change of the traffic status of the road network as much as possible within a short time as a principle.
3. A road network status prediction method based on a capsule network and a nested long-short term memory neural network of claim 1, wherein in the step 3, the spatial corresponding relationship means that the average velocity value of each road section is matched to the line segment in the spatial geographic area corresponding thereto, and velocities are represented by different gray scales.
4. A road network status prediction method based on a capsule network and a nested long-short term memory neural network of claim 1, wherein in the step 4, the value standards are as follows: for a mesh area without a road section, the value is zero; for a mesh area with only one road section, the value is the corresponding average velocity of the road section; and for a mesh area with more than two road sections, the value is the mean of corresponding average velocities of all road sections.
5. A road network status prediction method based on a capsule network and a nested long-short term memory neural network of claim 1, wherein the step 6 specifically includes the following steps: first, establishing a primary-level capsule layer according to the input sample set pictures, and extracting a plurality of groups of spatial local features of the traffic status of the road network implied in the pictures as low-level capsules; then, establishing a high-level capsule layer, integrating the local features in all the low-level capsules in a mode of full connection, further extracting a spatial relationship among all the local features to obtain a group of high-level capsules which characterize the global spatial relationship among the traffic status of the road network, and converting the group of high-level capsules into a group of spatial feature vectors to make preparation for subsequent model establishment.
6. A road network status prediction method based on a capsule network and a nested long-short term memory neural network of claim 1, wherein in the step 8, the sequential connection means that the spatial feature vector of each time interval output by the capsule network model is used as the input of the nested long-short term memory neural network model, a full connection layer is added at the end of the nested long-short term memory neural network model, building a complete deep learning framework, and combining a prediction model.

US Referenced Citations (1)

Number	Name	Date	Kind
20200135017	Ma	Apr 2020	A1

Related Publications (1)

	Number	Date	Country
	20200135017 A1	Apr 2020	US
