This patent application claims the priority of Chinese Patent Application No. 202210015815.1 filed with China National Intellectual Property Administration on Jan. 7, 2022 and entitled “MULTI-SCALE AGGREGATION PATTERN ANALYSIS METHOD FOR COMPLEX TRAFFIC NETWORK”, the disclosure of which is incorporated by reference herein in its entirety.
The present disclosure relates to the technical field of a highway traffic network, and in particular, to a multi-scale aggregation pattern analysis method for a complex traffic network.
The highway traffic network is an important infrastructure for serving the economy, the society and the public, and is the backbone of the comprehensive transportation system. Analyzing the complex traffic network in multiple scales, mining functional blocks of the network structure, and identifying distribution characteristics of the geographic spatial network are important aspects of analyzing the highway traffic network.
At present, there are few studies on multi-scale analysis of the complex traffic network. Yang Pan et al. proposed that scale is a basic feature of the objective world (Yang Pan. Multi-scale study on urban furniture color: with Hefei as an example [D]. Hefei University of Technology, 2019.), and that the multi-scale study is an important means of understanding the complex systems of the objective world. The highway traffic network is complex and systematic. In order to obtain scientific and reasonable design strategies and principles for the highway traffic network, it is desirable to analyze the road network from multiple levels and perspectives, and the multi-scale study provides a unique perspective for such road network analysis. Secondly, the existing studies on block aggregation characteristics of the highway traffic network are insufficient in data types. Zheng and Gao et al. studied the dynamic traffic information on the scale-free traffic network (Zheng J. F., Gao Z. Y., Zhao X. M., et al. Properties of transportation dynamics on scale-free networks [J]. Physica A, 2007, 373(none): 837-844.), and found that congestion behavior has an impact on the traffic network. Adding weight influence factors such as the dynamic traffic congestion degree when constructing a road network theoretical model can provide more comprehensive theoretical support for traffic decision-makers and service personnel. In addition, it is desirable to select a suitable clustering algorithm for the highway network aggregation pattern analysis. There are many algorithms for identifying modules in the complex network, such as a vertex clustering algorithm, a density-based algorithm, a random walk method, a circuit approximation method, and a spectral clustering algorithm. At present, the k-means clustering algorithm is mostly used to divide the aggregation blocks of the road network.
However, the road network data is typical high-dimensional data, and the k-means clustering algorithm performs poorly on high-dimensional data, which limits the reliability of the study results.
An objective of some embodiments of the present disclosure is to provide a multi-scale aggregation pattern analysis method for the complex traffic network, which can analyze the block characteristics of the highway traffic network, obtain a result of the block aggregation characteristics of the highway traffic network across administrative divisions, and further provide decision-making reference for traffic planning, design work and maintenance work.
To achieve the above objective, the present disclosure provides the following technical solutions.
A multi-scale aggregation pattern analysis method for complex traffic network, including:
if there is a road segment connection between a node i and a node j, eij=1; otherwise, eij=0;
F=(v1,v2, . . . ,vN)1×N (1-2)
where dij is a shortest path length between the node i and the node j in an initial traffic network structure, with a unit of km;
where A* is a transition matrix of the adjacency matrix A obtained from an original traffic network; α is a damping factor, generally α=0.85; N represents a number of nodes in the highway traffic network, and IN×N is a unit matrix of order N;
K=F+W+L+T=(k1,k2, . . . ,kj, . . . ,kN) (1-5)
where kj represents a j-th column of the weighting matrix K;
with (1−α)KN:
G*=αA*+(1−α)KN (1-6)
Further, in S3, the drawing of a two-dimensional decision diagram by two indicators, namely an order of critical nodes and a shortest path distance, to determine center points and a number k of clusters in a spectral clustering, the obtaining of a new weighting matrix which accords with an actual situation of the road network by incorporating a position weight matrix, a distance weight matrix, a road grade weight matrix and a dynamic traffic congestion degree weight matrix based on a similarity matrix of spectral clustering, and the carrying out of clustering analysis to obtain aggregation blocks of the road network, include:
γi=ρiδi,i∈IS (1-7)
where ρi represents an i-th component of the main eigenvector X1*, and is used to evaluate importance of the node i; δi represents a shortest path length between the node i and a more critical node, which is a critical node whose component value is greater than a predetermined threshold, that is, δi is the shortest of the path lengths between the node i and each critical node; IS represents a set of nodes in an area S; a sequence of comprehensive values {γi} (i=1, . . . ,N) is calculated, where γi=ρiδi, i∈IS represents a comprehensive value of the node i; as the value of γi increases, a possibility of the node i being a clustering center increases; therefore, the values {γi} are sorted in a descending order and a plurality of data points are extracted from the front of the sorted sequence as block centers, k nodes distributed on upper right of the decision diagram are selected as clustering centers, and k is a number of clusters;
where ∥xi−xj∥ represents a distance between two sample points, and a parameter σ defines a neighborhood width of a sample point; as a value of the σ increases, a similarity between the sample point and sample points farther away from it increases, and as the value of the σ decreases, that similarity decreases;
R=S+F+W+L+T (1-9)
Further, in step S2, the expression of A* is as follows:
when eij=1, it means that there is a road segment connection between the node i and the node j; when eij=0, it means the other case.
According to specific embodiments provided by the present disclosure, the present disclosure discloses the following technical effects. In the multi-scale aggregation pattern analysis method for the complex traffic network provided by the present disclosure, firstly, a road network theoretical model which adds influence factors of a position attribute weight, a geographical distance weight, a road grade weight and a dynamic time-phased traffic congestion degree weight is constructed, which provides a theoretical model basis for subsequent study; then an improved PageRank algorithm (APA, Adapted PageRank) is proposed to obtain the sorting of critical nodes in the highway traffic network, and the clustering center points and the number of clusters are determined by two indicators: the sorting of the critical nodes and the shortest path distance; further, an APA-spectral clustering algorithm is proposed, which can cross the limitation of administrative divisions, obtain division results of special common blocks in the highway traffic network, maintain connectivity between blocks, and improve the overall efficiency of the highway traffic network.
To describe the technical solutions in embodiments of the present disclosure or in the prior art more clearly, the accompanying drawings required in the embodiments are briefly described below. Apparently, the accompanying drawings in the following description show merely some embodiments of the present disclosure, and other drawings can be derived from these accompanying drawings by those of ordinary skill in the art without creative efforts.
The technical solutions in the embodiments of the present disclosure will be described below clearly and completely with reference to the accompanying drawings in the embodiments of the present disclosure. Apparently, the described embodiments are merely some rather than all of the embodiments of the present disclosure. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present disclosure without creative efforts shall fall within the protection scope of the present disclosure.
An objective of some embodiments of the present disclosure is to provide a multi-scale aggregation pattern analysis method for the complex traffic network, which can analyze the block characteristics of the highway traffic network, obtain a result of the block aggregation characteristics of the highway traffic network across administrative divisions, and further provide decision-making reference for traffic planning, design and maintenance works.
To make the above-mentioned objective, features, and advantages of the present disclosure clearer and more comprehensible, the present disclosure will be further described in detail below in conjunction with the accompanying drawings and specific embodiments.
As shown in
In S1, an adjacency matrix A, a position attribute matrix F, a distance weight matrix W, a road grade matrix L and a time-phased traffic congestion degree matrix T of the highway traffic network are calculated.
(1) The adjacency matrix A=(aij)N×N is a square matrix of order N, and the element aij on the i-th row and the j-th column is defined as follows:
If there is a road segment connection between a node i and a node j, then eij=1; otherwise, eij=0.
An example of part of the adjacency matrix A in this embodiment is shown in the following table (because the adjacency matrix of Langfang City, China is 16921 rows*16921 columns, it is not convenient to show here, so only part of the matrix A is provided):
Note: This example shows the connection of 4 nodes. If there is a road connection between node 1 and node 3, the value is 1; if there is no road connection between node 1 and node 2, the value is 0.
(2) The position attribute matrix F is constructed by using POI points to define a 500-meter buffer zone and determining road segments in the highway network covered by the buffer zone so as to set position attribute values of the road segments. The weights of the road segments in the buffer zone are set to 1, while the weights of other road segments of the highway network are set to 0.
Assume that vi=1 (1≤i≤N) indicates that the road segment i has the position attribute according to a studied problem or an actual situation of the traffic network, and vi=0 indicates other situations. The position attribute matrix F is constructed as follows:
F=(v1,v2, . . . ,vN)1×N (1-2)
Note: This example shows the position attribute weighting of the road network corresponding to four nodes. If there is a POI point on the road connecting node 2 and the node 3, the value is 1; if there is no POI point on the road connecting node 2 and node 3, the value is 0.
(3) In the distance weight matrix W=(wij)N×N, wij represents a reciprocal of a shortest path length between node i and node j:
wij=1/dij (1-3)
where dij is the shortest path length between node i and node j in an initial traffic network structure, with a unit of km.
Note: this example shows the distance weighting of the road network corresponding to four nodes. The reciprocal of the distance between node 1 and node 3 is calculated, and the weight obtained after unified normalization is 0.99967. There is no road connection between node 1 and node 2, that is, if the reciprocal of the distance value is close to 0, the weight is 0.
(4) In the road grade matrix L=(lij)N×N, the element lij refers to a weight of the grade of a road between node i and node j. The higher the road grade is, the higher the weight is. The specific weight settings are shown in Table 4.
Note: as shown in Table 5, the example shows the road grade weighting of the road network corresponding to the four nodes. If the road connecting node 1 and node 3 is a motorway, the value is 0.333; if there is no road connection between node 1 and node 2, the weight is 0.
(5) In the time-phased traffic congestion degree matrix T=(tij)N×N, the element tij represents the weight of the traffic congestion degree between node i and node j, and the higher the congestion degree is, the larger the road weight is. The specific weight settings are shown in Table 6.
Note: the example shown in Table 7 shows the traffic congestion degree weighting of the road network corresponding to the four nodes. For example, if the road congestion between node 1 and node 3 is clear, the value is 0.0875; if there is no road connection between node 1 and node 2, the value is 0.
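The construction of the adjacency matrix A and the distance weight matrix W described in S1 can be sketched as follows. This is only an illustrative sketch, not the implementation of the embodiment: the function names, the 4-node edge list and the segment lengths are hypothetical, node indices are 0-based, road segments are assumed to be undirected with positive lengths, and the "unified normalization" mentioned in the note on W is not reproduced because its form is not specified here.

```python
import heapq
import math


def adjacency_matrix(n, edges):
    """Return the N x N 0/1 adjacency matrix for undirected road segments."""
    a = [[0] * n for _ in range(n)]
    for i, j, _length_km in edges:
        a[i][j] = 1
        a[j][i] = 1
    return a


def shortest_path_lengths(n, edges, source):
    """Dijkstra: shortest path length (km) from source to every node."""
    neighbours = [[] for _ in range(n)]
    for i, j, length_km in edges:
        neighbours[i].append((j, length_km))
        neighbours[j].append((i, length_km))
    dist = [math.inf] * n
    dist[source] = 0.0
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in neighbours[u]:
            if d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return dist


def distance_weight_matrix(n, edges):
    """w_ij = 1 / d_ij; 0 on the diagonal and for unreachable pairs."""
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        dist = shortest_path_lengths(n, edges, i)
        for j in range(n):
            if i != j and math.isfinite(dist[j]):
                w[i][j] = 1.0 / dist[j]
    return w


# Hypothetical network: (node, node, segment length in km); node 1 is isolated.
edges = [(0, 2, 2.0), (2, 3, 4.0)]
A = adjacency_matrix(4, edges)
W = distance_weight_matrix(4, edges)
```

In this sketch a pair of nodes with no connecting path receives weight 0, which matches the behavior described in the note for unconnected nodes.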
In S2, based on an original PageRank algorithm, a weight influence factor of the road network is added to obtain an improved PageRank algorithm, so as to determine the order of critical nodes. S2 specifically includes the following steps.
(1) A Google matrix of PageRank, represented by G, is defined as:
G=αA*+((1−α)/N)IN×N (1-4)
where A* is a transition matrix of the adjacency matrix A obtained from the original traffic network; α is a damping factor, and generally α=0.85. N represents a number of nodes in the traffic network (intersection points of roads in the highway traffic network), and IN×N is a unit matrix of order N.
(2) A new weighting matrix K is defined by using the position attribute matrix F, the distance weight matrix W, the road grade matrix L and the time-phased traffic congestion degree matrix T:
K=F+W+L+T=(k1,k2, . . . ,kj, . . . ,kN) (1-5)
where kj represents the j-th column of the matrix K.
Note: the example shown in Table 8 shows the weighting of the road network corresponding to the four nodes. The weighted value of the road between node 1 and node 3 is 1.42017; the weighted value of the road between the node 1 and the node 2 is 0.
(3) Each column vector kj of the matrix K is normalized to obtain a standard matrix KN.
(4) A new matrix G* is constructed by replacing the second term of the Google matrix G with (1−α)KN:
G*=αA*+(1−α)KN (1-6)
According to the Perron-Frobenius theorem (referring to Golub G H, Loan C F V. Matrix computation [M]. Baltimore: The Johns Hopkins University Press, 1996, 728(94):208-209 for details), the main eigenvector X1*={g(1), g(2), . . . , g(N)} of G*, that is, the eigenvector corresponding to the eigenvalue λ=1, is calculated to obtain the grades of the critical nodes; g(1), g(2), . . . , g(N) represent the respective components of the main eigenvector X1*, and the value of a component indicates the importance degree of the corresponding node. The larger the value is, the more important the node is, which means that the node has a higher criticality grade.
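The calculation in S2 can be sketched as follows under several stated assumptions. The transition matrix A* is assumed to be the adjacency matrix A with each column scaled to sum to 1, and the normalization of K is assumed to scale each column kj to sum to 1 in the same way; neither form is confirmed by the text above. K is assumed to be available as an N×N matrix, the values used for it below are hypothetical, and the main eigenvector is obtained by power iteration, which converges only when G* is primitive (for example, when the off-diagonal entries of KN are all positive).

```python
def column_normalize(m):
    """Scale every column to sum to 1; an all-zero column becomes uniform."""
    n = len(m)
    out = [[0.0] * n for _ in range(n)]
    for j in range(n):
        total = sum(m[i][j] for i in range(n))
        for i in range(n):
            out[i][j] = m[i][j] / total if total > 0 else 1.0 / n
    return out


def adapted_google_matrix(a, k, alpha=0.85):
    """G* = alpha * A* + (1 - alpha) * K_N, following equation (1-6)."""
    n = len(a)
    a_star = column_normalize(a)  # assumed form of the transition matrix A*
    k_n = column_normalize(k)     # assumed form of the standard matrix K_N
    return [[alpha * a_star[i][j] + (1 - alpha) * k_n[i][j]
             for j in range(n)] for i in range(n)]


def main_eigenvector(g, tol=1e-12, max_iter=1000):
    """Power iteration for the eigenvector of g with eigenvalue 1."""
    n = len(g)
    x = [1.0 / n] * n
    for _ in range(max_iter):
        y = [sum(g[i][j] * x[j] for j in range(n)) for i in range(n)]
        total = sum(y)
        y = [v / total for v in y]  # keep the components summing to 1
        if max(abs(y[i] - x[i]) for i in range(n)) < tol:
            return y
        x = y
    return x


# Hypothetical 4-node star network whose hub is node 2 (0-based indices).
A = [[0, 0, 1, 0],
     [0, 0, 1, 0],
     [1, 1, 0, 1],
     [0, 0, 1, 0]]
# Hypothetical weighting matrix K with positive off-diagonal entries.
K = [[0.0, 0.5, 1.0, 0.5],
     [0.5, 0.0, 1.0, 0.5],
     [1.0, 1.0, 0.0, 1.0],
     [0.5, 0.5, 1.0, 0.0]]
rho = main_eigenvector(adapted_google_matrix(A, K))
ranking = sorted(range(4), key=lambda i: rho[i], reverse=True)
```

In this toy network the hub node receives the largest component, so it appears first in `ranking`; on a real road network the ranking depends on the actual weight matrices.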
In S3, a two-dimensional decision diagram is drawn by two indicators: the order of critical nodes and a shortest path distance. Spectral clustering centers and a number k of clusters are determined. Based on a similarity matrix of the spectral clustering algorithm, the position weight matrix, the distance weight matrix, the road grade weight matrix and the dynamic traffic congestion degree weight matrix are added to obtain a new weighting matrix which accords with the actual situation of the road network. Then the clustering analysis is carried out to obtain aggregation blocks of the road network. Specific steps are as follows.
(1) The two-dimensional decision diagram is constructed to select the clustering center points and determine the number of clusters.
The paper (Stanfill C, Waltz D. Toward memory-based reasoning [J]. Communications of the ACM, 1986, 29(12):1213-1228.) published by Alex Rodriguez and Alessandro Laio in Science points out that the clustering center has the following two attributes:
1) The clustering centers are important nodes surrounded by low-impact neighbors; and
2) The initial cluster centers are evenly distributed in the physical network, and the “distance” between the centers is relatively large.
The present disclosure adopts a method of two-dimensional decision diagram to select the clustering center points. The number of the clustering center points is the number of clusters. The clustering center points are determined by considering ρ and δ, where ρ is the horizontal axis and δ is the vertical axis:
γi=ρiδi,i∈IS (1-7)
where ρi represents the i-th component of the main eigenvector X1*, and is used to evaluate the importance of node i; δi represents the shortest path length between node i and a more critical node, a critical node being a node whose component value is greater than a predetermined threshold. To obtain δi, the path length between node i and each critical node is first calculated, and the shortest of these path lengths is taken as δi. IS represents a set of nodes in an area S. A sequence of comprehensive values {γi} (i=1, . . . ,N) is calculated, where γi=ρiδi (i∈IS) represents the comprehensive value of node i, and N represents the number of nodes. The larger the value of γi is, the more likely node i is to be a clustering center. Therefore, the values {γi} are sorted in a descending order, and several data points are then extracted from the front of the sorted sequence as block centers. The k nodes distributed on the upper right of the decision diagram are selected as the clustering centers, and k is the number of clusters.
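The selection of clustering centers by equation (1-7) can be sketched as follows. The function name, the threshold and the sample values are hypothetical. One convention is an assumption of this sketch rather than part of the described method: when a node has no other critical node to compare against, δi is set to the largest distance in the network.

```python
def select_cluster_centers(rho, dist, threshold, k):
    """Rank nodes by gamma_i = rho_i * delta_i and return the top k."""
    n = len(rho)
    # Critical nodes: component value above the predetermined threshold.
    critical = [j for j in range(n) if rho[j] > threshold]
    largest = max(max(row) for row in dist)
    gamma = []
    for i in range(n):
        # delta_i: shortest path length from node i to any other critical node.
        others = [dist[i][j] for j in critical if j != i]
        delta = min(others) if others else largest  # assumed fallback
        gamma.append(rho[i] * delta)
    order = sorted(range(n), key=lambda i: gamma[i], reverse=True)
    return order[:k], gamma


# Hypothetical example: five nodes on a straight road, positions in km.
positions = [0.0, 1.0, 2.0, 10.0, 11.0]
dist = [[abs(a - b) for b in positions] for a in positions]
rho = [0.10, 0.30, 0.10, 0.35, 0.15]
centers, gamma = select_cluster_centers(rho, dist, threshold=0.2, k=2)
```

Here nodes 1 and 3 are both important and far from each other, so they obtain the largest comprehensive values and are returned as the two centers.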
For example, the decision diagram is shown in
(2) A weighting matrix is constructed, and the road network is divided into aggregation blocks by the spectral clustering algorithm. The specific steps are as follows.
There are n sample points X={x1,x2, . . . ,xn} and the number k of clusters.
1) A similarity matrix S={sij|1≤i≤n, 1≤j≤n} of n*n is calculated as follows:
where ∥xi−xj∥ represents a distance between two sample points, and the parameter σ defines a neighborhood width of a sample point, that is, the larger σ is, the greater the similarity between a sample point and sample points farther away from it is, and vice versa.
Note: the example shown in Table 9 shows a similarity matrix consisting of four nodes. A similarity value between node 1 and node 3 is 0.021049927; a similarity value between node 1 and node 4 is 0.023497721.
2) Based on the similarity matrix S, combining with the position weight matrix F, the distance weight matrix W, the road grade weight matrix L and the dynamic traffic congestion degree weight matrix T, a new weighting matrix R which accords with the actual situation of the road network is obtained as follows:
R=S+F+W+L+T (1-9)
Note: the example shown in Table 10 shows part of the weighting matrix R. A weight between node 1 and node 3 is 1.441219927; a weight between node 1 and node 2 is 0.000117981.
3) A degree matrix D is calculated, where di=Σj rij (j=1, . . . ,n) is a sum of the elements in the i-th row of the weighting matrix R, and D is an n*n diagonal matrix composed of di.
4) A Laplacian matrix L=D^(−1/2)(D−R)D^(−1/2) is calculated, where R is the weighting matrix obtained in step 2).
5) Eigenvalues of the Laplacian matrix L are calculated and sorted from small to large.
6) According to the two-dimensional decision diagram, the center nodes are selected to obtain the number k of clusters.
7) The first k eigenvalues of the Laplacian matrix L are selected and the eigenvectors u1,u2, . . . ,uk of the first k eigenvalues are calculated.
8) The eigenvectors of the first k eigenvalues are used to form a matrix U={u1,u2, . . . , uk}, U∈R^(n×k).
9) yi∈R^k is deemed to be a vector of the i-th row of U, where i=1,2, . . . ,n.
10) For i=1,2, . . . ,n, yi∈R^k is sequentially unitized, so that ∥yi∥=1.
11) New sample points Y={y1, y2, . . . , yn} are grouped into clusters C1,C2, . . . ,Ck by using the k-means algorithm.
12) Clusters A1,A2, . . . ,Ak as a clustering result are obtained, where Ak={yi|yi∈Ck},i∈n.
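Steps 1) to 4) and step 10) above can be sketched as follows. The similarity is assumed to be a Gaussian kernel of the form exp(−∥xi−xj∥²/(2σ²)), which is one common choice and may differ from the expression used in the embodiment. The sample points, the matrix `road_weights` (standing in for the sum F+W+L+T) and all function names are hypothetical. The eigen-decomposition of steps 5) to 8) and the k-means grouping of step 11) are not shown; they would typically be delegated to a numerical library.

```python
import math


def gaussian_similarity(points, sigma):
    """Step 1): s_ij = exp(-||x_i - x_j||^2 / (2 * sigma^2)) (assumed form)."""
    n = len(points)
    s = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            d2 = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
            s[i][j] = math.exp(-d2 / (2.0 * sigma ** 2))
    return s


def add_matrices(*matrices):
    """Step 2): element-wise sum, as in R = S + F + W + L + T."""
    n = len(matrices[0])
    return [[sum(m[i][j] for m in matrices) for j in range(n)]
            for i in range(n)]


def normalized_laplacian(r):
    """Steps 3) and 4): D^(-1/2) (D - R) D^(-1/2) with d_i = sum_j r_ij."""
    n = len(r)
    d = [sum(row) for row in r]
    inv_sqrt = [1.0 / math.sqrt(x) if x > 0 else 0.0 for x in d]
    return [[(1.0 if i == j and d[i] > 0 else 0.0)
             - inv_sqrt[i] * r[i][j] * inv_sqrt[j]
             for j in range(n)] for i in range(n)]


def unitize_rows(u):
    """Step 10): scale every row y_i of U to unit Euclidean length."""
    out = []
    for row in u:
        norm = math.sqrt(sum(v * v for v in row))
        out.append([v / norm if norm > 0 else 0.0 for v in row])
    return out


# Hypothetical data: three sample points and one weighted road segment.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0)]
S = gaussian_similarity(points, sigma=1.0)
road_weights = [[0.0, 1.0, 0.0],
                [1.0, 0.0, 0.0],
                [0.0, 0.0, 0.0]]
R = add_matrices(S, road_weights)
L_sym = normalized_laplacian(R)
Y = unitize_rows([[3.0, 4.0], [0.0, 2.0]])
```

A quick sanity check on such a sketch is that the resulting Laplacian is symmetric and that the vector with components √di lies in its null space.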
In step S2, the expression of A* is as follows:
when eij=1, it means that there is a road segment connection between node i and node j; when eij=0, it means the other case.
In the embodiments of the present disclosure, the multi-scale aggregation pattern analysis is performed by using two complex traffic networks in Langfang City, China and Xiong'an New Area, China as examples.
The existing analysis of the complex traffic network mostly stays at a single scale. Dynamic influence factors are not considered in the study of block aggregation characteristics of the highway traffic network, and the road network aggregation block division method mostly adopts the k-means clustering algorithm, which is suitable for low-dimensional data. As a result, there is a certain gap between the analysis results and the real situation. Based on the current situation, the present disclosure performs a multi-scale aggregation pattern analysis on a complex traffic network, and improves the road network aggregation block division method. For the problem that the dynamic influencing factors were not considered in the previous road network theoretical model, a road network theoretical model which incorporates the influence factors of position attribute weights, geographical distance weights, road grade weights, and dynamic time-phased traffic congestion degree weights is constructed. For the problem that the k-means algorithm is not effective in processing high-dimensional road network data, an improved spectral clustering algorithm is proposed for the road network aggregation block division. For the defects of the spectral clustering algorithm itself, an improved PageRank (APA) algorithm is proposed to obtain the order of the critical nodes in the highway traffic network; then the spectral clustering centers and the number of clusters are determined by two indicators, namely the order of critical nodes and a shortest path distance; finally, the APA-spectral clustering algorithm is obtained, which can transcend the limitation of administrative division boundaries and obtain the division results of special common blocks in the highway traffic network; and the connectivity among blocks can improve the overall efficiency of the highway traffic network.
In this specification, some specific embodiments are used for illustration of the principles and implementations of the present disclosure. The description of the foregoing embodiments is used to help illustrate the method of the present disclosure and the core ideas thereof. In addition, those of ordinary skill in the art can make various modifications in terms of specific implementations and the scope of application in accordance with the ideas of the present disclosure. In conclusion, the content of this specification shall not be construed as limitations to the present disclosure.
Number | Date | Country | Kind |
---|---|---|---|
202210015815.1 | Jan 2022 | CN | national |
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/CN2023/070582 | 1/5/2023 | WO |