URBAN TRAFFIC VELOCITY ESTIMATION METHOD BASED ON MULTI-SOURCE CROWD SENSING DATA

Information

  • Patent Application
  • 20250102316
  • Publication Number
    20250102316
  • Date Filed
    May 11, 2023
  • Date Published
    March 27, 2025
Abstract
The present invention discloses an urban traffic velocity estimation method based on multi-source crowd sensing data. Using roadside pedestrian data and road navigation data collected by smart phones, the method obtains a final estimated velocity through the steps of missing data filling, self-view velocity aggregation and multi-view velocity fusion. This fine-grained large-scale urban traffic velocity estimation method can estimate velocity on all types of roads, including suburban road sections and paths, instead of focusing only on main roads in a city center. Because the method is data-driven, it requires no additional devices to be installed on roads, and is low in cost and widely applicable. Compared with the prior art, the urban traffic velocity estimation method is more practical, better grounded theoretically and more widely applicable, and is of great significance for improving traffic management and planning.
Description
TECHNICAL FIELD

The present invention relates to an urban traffic velocity estimation method, and in particular to a traffic velocity estimation method based on multi-source crowd sensing data.


BACKGROUND

Fine-grained large-scale urban traffic velocity estimation is of great significance to urban traffic management and improvement. Traditional coarse-grained traffic velocity estimation relies on a limited number of traffic sensors and can only calculate the velocity of road sections within a small range. Nowadays, mobile phones are increasingly used for navigation. When users use map or taxi APPs, service providers record GPS coordinates. This road mobile navigation data has become an important data source for traffic monitoring and sensing, and is widely used in traffic state estimation. However, the spatial coverage of mobile navigation data is uneven: more data is usually collected in hot spots and little or no data is collected in suburbs, so fine-grained traffic velocity estimation cannot be implemented from this data alone. In addition to the mobile navigation data obtained for navigation purposes, location-based services are involved when users use mobile applications such as Weibo and Meituan, and many pedestrians on the roadside use mobile phones while walking. When these pedestrians use mobile phone applications with location-based services, their phones incidentally scan the WIFI signals of nearby vehicles and report their current locations. The WIFI signals of vehicles can be filtered out of the WIFI lists reported by pedestrians, and the locations of the vehicles can be approximated by the locations of the pedestrians on the roadside. The data obtained in this way covers more roadside locations without deploying any additional device. Therefore, it is possible to fuse roadside pedestrian data and road mobile navigation data to obtain fine-grained large-scale urban traffic velocity estimation in a low-cost and accurate way.


SUMMARY

An object of the present invention is to propose an urban traffic velocity estimation method based on multi-source crowd sensing data to improve and standardize existing research and technologies. This method puts forward an overall data processing flow for the traffic velocity estimation method, which can promote urban traffic planning and management and has a practical value.


An object of the present invention is achieved by the following technical solution.


An urban traffic velocity estimation method based on multi-source crowd sensing data includes the following steps:

    • step 1, data set preprocessing: cleaning an original data set collected by smart phones to obtain roadside pedestrian data and road navigation data respectively;
    • step 2, average velocity calculation: calculating a current velocity X of each road section in different time periods by using the data set in step 1;
    • step 3, missing data filling: by using the current velocity X of each road section in different time periods obtained in step 2, in combination with a historical velocity H of each road section in different time periods, filling missing velocity data in X to obtain a filled velocity X̂;
    • step 4, self-view velocity aggregation: by using the filled velocity X̂ calculated in step 3, quantizing spatial dependences between different road sections according to the historical velocity H, and collecting useful neighbor information to obtain aggregated roadside pedestrian velocity data Vd and road mobile navigation velocity data Vw;
    • step 5, multi-view velocity fusion: according to the aggregated roadside pedestrian velocity data Vd and the road mobile navigation velocity data Vw, fusing multi-source velocity data by using a multi-layer perceptron (MLP) according to a determination whether a time stamp and current velocity data are filled data, to obtain a fusion velocity Ŷ, and finally correcting the fusion velocity according to a feature of a road type to obtain a fine-grained large-scale urban traffic estimated velocity Ŷ′.


Further, step 1 specifically includes: obtaining the road navigation data by filtering an APP usage list in original data, that is, personal position data reported when users use programs such as Gaode Map Navigation and Didi; and by filtering a scanned WIFI signal list in the original data, obtaining the roadside pedestrian data according to a determination whether there is a vehicle-mounted WIFI signal in the list, which means that when a user inadvertently scans the WIFI signal of a passing vehicle when using a mobile phone and reports a personal position, a roadside pedestrian position is approximately regarded as a driving vehicle position.
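The filtering described in step 1 can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the invention: the `Record` fields mirror the variables of the data set, while the app names in `NAVIGATION_APPS`, the SSID prefixes in `VEHICLE_WIFI_PREFIXES` and the function name `split_sources` are hypothetical, since the source does not say how navigation apps or vehicle-mounted WIFI signals are identified in the raw lists.

```python
from dataclasses import dataclass, field

# Hypothetical identifiers: the source names the apps but not their encoding.
NAVIGATION_APPS = {"Gaode Map", "Didi"}
# Hypothetical SSID prefixes standing in for vehicle-mounted WIFI detection.
VEHICLE_WIFI_PREFIXES = ("CAR-", "BUS-")


@dataclass
class Record:
    gid: str            # user ID
    timestamp: int      # user reporting time
    lon: float          # user longitude
    lat: float          # user latitude
    app_list: list = field(default_factory=list)
    wifi_list: list = field(default_factory=list)


def split_sources(records):
    """Return (road navigation records, roadside pedestrian records)."""
    seen = set()
    navigation, pedestrian = [], []
    for r in records:
        key = (r.gid, r.timestamp, r.lon, r.lat)
        if key in seen:          # cleaning: drop duplicate reports
            continue
        seen.add(key)
        if any(app in NAVIGATION_APPS for app in r.app_list):
            navigation.append(r)
        # Assumption: a record already used as navigation data is not reused.
        elif any(ssid.startswith(VEHICLE_WIFI_PREFIXES) for ssid in r.wifi_list):
            pedestrian.append(r)
    return navigation, pedestrian
```

Whether a record may belong to both data sets is not stated in the source; the `elif` above is one possible choice.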


Further, step 2 specifically includes: by using the data obtained by cleaning and filtering in step 1, projecting trajectory data into a road network by using a hidden Markov road network matching algorithm, so as to obtain the current velocity X of each road section in different time periods, where the hidden Markov road network matching algorithm is also called a hidden Markov model map matching algorithm, which is the known art.
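Once trajectories have been matched to the road network, building the velocity matrix X is a simple aggregation. The sketch below assumes the map matching has already been done by some existing hidden Markov model implementation and only shows the aggregation; the tuple layout `(section, period, metres, seconds)`, the function name `section_velocities`, the use of NaN for missing entries and the unit (m/s) are illustrative assumptions.

```python
import numpy as np


def section_velocities(traversals, n_sections, n_periods):
    """Average velocity per (road section, time period); NaN where no data.

    traversals: iterable of (section index, period index, metres, seconds),
    assumed to come from a map-matching step that is not shown here.
    """
    dist = np.zeros((n_sections, n_periods))
    dur = np.zeros((n_sections, n_periods))
    for section, period, metres, seconds in traversals:
        dist[section, period] += metres
        dur[section, period] += seconds
    X = np.full((n_sections, n_periods), np.nan)
    observed = dur > 0
    X[observed] = dist[observed] / dur[observed]  # metres per second
    return X
```

Entries left as NaN are exactly the missing units that step 3 fills.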


Further, step 3 specifically includes: firstly, introducing a mask matrix M to represent a missing unit of the velocity X:







$$M_{i,j} = \begin{cases} 1, & \text{if } X_{i,j} \text{ is available} \\ 0, & \text{if } X_{i,j} \text{ is missing} \end{cases}$$












    • secondly, establishing a historical velocity matrix H by using historical data to provide additional information to help fill in missing data, and introducing another mask matrix N to represent a missing unit of H:










$$N_{i,j} = \begin{cases} 1, & \text{if } H_{i,j} \text{ is available} \\ 0, & \text{if } H_{i,j} \text{ is missing} \end{cases}$$












    • defining a weighted matrix W to measure an importance of each item in the historical velocity matrix H, and then performing matrix decomposition by using H, W, M, N and X:














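$$\mathrm{Loss}_h(W) = \frac{1}{2}\left\| W \odot N \odot \left( H - UV^{T} \right) \right\|^{2} + \lambda \left\| U \right\|^{2} + \lambda \left\| V \right\|^{2}$$

$$U^{+}(W) = U - \alpha \frac{\partial\, \mathrm{Loss}_h(W)}{\partial U}$$

$$V^{+}(W) = V - \alpha \frac{\partial\, \mathrm{Loss}_h(W)}{\partial V}$$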













    • then constructing a loss by the matrix decomposition based on the updated U+ and V+:











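$$\mathrm{Loss}_t(W) = \frac{1}{2}\left\| M \odot \left( X - U^{+} \left( V^{+} \right)^{T} \right) \right\|^{2} + \lambda \left\| U^{+} \right\|^{2} + \lambda \left\| V^{+} \right\|^{2}$$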









    • then updating a weight W:










$$W^{+} = W - \beta \frac{\partial\, \mathrm{Loss}_t}{\partial W}$$










    • where, α and β are learning rate parameters; and

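    • after iterative updating, obtaining a learned weighted matrix W, so that X and H can be used simultaneously to estimate the missing data and obtain the filled velocity X̂:

$$\min_{U,V} \; \frac{1}{2}\left\| M \odot \left( X - UV^{T} \right) \right\|^{2} + \frac{1}{2}\left\| W \odot N \odot \left( H - UV^{T} \right) \right\|^{2} + \lambda \left\| U \right\|^{2} + \lambda \left\| V \right\|^{2}$$

$$\hat{X} = M \odot X + \left( 1 - M \right) \odot UV^{T}$$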










    • where, λ represents a penalty item parameter, and U and V represent two sub-matrices decomposed from an original matrix.
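The final filling step can be sketched in NumPy as follows. This is a minimal sketch under stated assumptions, not the procedure of the invention: it assumes the learned weighted matrix W is already available (the iterative update of W is omitted), that the first data term is taken on X under the mask M as in Loss_t, that the products with the masks are element-wise, and that missing entries are encoded as NaN. It uses plain gradient descent, and the names `fill_missing`, `rank`, `lam`, `lr` and `steps` are illustrative.

```python
import numpy as np


def fill_missing(X, H, W, rank=2, lam=0.01, lr=1e-3, steps=5000, seed=0):
    """Fill NaN entries of X with a low-rank estimate fitted to X and H."""
    M = (~np.isnan(X)).astype(float)   # 1 where X is available, else 0
    N = (~np.isnan(H)).astype(float)   # 1 where H is available, else 0
    X0, H0 = np.nan_to_num(X), np.nan_to_num(H)  # NaN -> 0 (masked out anyway)
    rng = np.random.default_rng(seed)
    U = 0.1 * rng.standard_normal((X.shape[0], rank))
    V = 0.1 * rng.standard_normal((X.shape[1], rank))
    wn2 = (W * N) ** 2                 # squared element-wise weight of the H term
    for _ in range(steps):
        P = U @ V.T
        # Negative gradient of both data terms with respect to P = U V^T.
        G = M * (X0 - P) + wn2 * (H0 - P)
        grad_U = -G @ V + 2 * lam * U
        grad_V = -G.T @ U + 2 * lam * V
        U, V = U - lr * grad_U, V - lr * grad_V
    # Keep observed velocities, use the low-rank estimate only where missing.
    return M * X0 + (1 - M) * (U @ V.T)
```

Observed entries of X pass through unchanged; only the missing units take values from the product of U and the transpose of V. The fixed learning rate is chosen for small test matrices and would need tuning, or an adaptive optimizer, on real data.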





Further, the step 4 specifically includes: capturing a spatial correlation between adjacent roads by using the self-view velocity aggregation, and aggregating information of neighbor road sections highly correlated to a central road section; firstly, calculating a spatial correlation ei,j between a road section i and a road section j according to the historical velocity matrix, and keeping highly correlated parts and ignoring irrelevant information:









$$e_{i,j} = \sum \left( H_{i,:} - H_{j,:} \right)^{2}, \quad j \in \text{the set of neighboring connected roads for } i$$

$$e_{i,j} = \begin{cases} +\infty, & e_{i,j} > \text{threshold} \\ e_{i,j}, & e_{i,j} \le \text{threshold} \end{cases}$$











then, calculating a fusion coefficient ai,j between the road sections according to the spatial correlation ei,j, and then obtaining the roadside pedestrian velocity data Vd and the road mobile navigation velocity data Vw after the self-view aggregation:








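$$a_{i,j} = \frac{\exp\left( -e_{i,j} / k \right)}{\sum_{j' \in N_i} \exp\left( -\frac{e_{i,j'}}{k} \right) + \varepsilon}$$

$$V_{i,:} = \left( 1 - \sum_{j \in N_i} \frac{a_{ij}}{2} \right) * \hat{X}_{i,:} + \sum_{j \in N_i} \frac{a_{ij}}{2} * \hat{X}_{j,:}$$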









    • where, ε represents a small constant that keeps the denominator from being zero, k represents a constant scaling value, Ni represents the set of road sections neighboring and connected to road section i, and Vi,: represents the i-th row, i.e., the feature representation of road section i, of either the roadside pedestrian velocity data Vd or the road mobile navigation velocity data Vw; the two are computed in the same way and are not described separately.
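The self-view aggregation of step 4 can be sketched as follows. This is an illustration under assumptions rather than a definitive implementation: it takes the correlation between two road sections to be the sum of squared differences of their rows in H, sets values above the threshold to infinity so that the corresponding neighbors receive zero weight, and assumes each neighbor enters the combination with half of its fusion coefficient. The function name `self_view_aggregate` and the dictionary `neighbors` are hypothetical.

```python
import numpy as np


def self_view_aggregate(X_hat, H, neighbors, threshold, k=1.0, eps=1e-8):
    """Blend each road section's filled velocities with correlated neighbors.

    neighbors: dict mapping a road section index to its connected sections.
    """
    V = np.empty_like(X_hat, dtype=float)
    for i in range(X_hat.shape[0]):
        nbrs = np.asarray(neighbors.get(i, []), dtype=int)
        # Dissimilarity of historical velocity rows (smaller = more alike).
        e = np.sum((H[i] - H[nbrs]) ** 2, axis=1)
        e = np.where(e > threshold, np.inf, e)   # ignore weakly related neighbors
        w = np.exp(-e / k)                       # exp(-inf) evaluates to 0
        a = w / (w.sum() + eps)                  # fusion coefficients
        half = a / 2.0                           # assumed per-neighbor weight
        V[i] = (1.0 - half.sum()) * X_hat[i] + half @ X_hat[nbrs]
    return V
```

When every neighbor of a road section exceeds the threshold, all coefficients are zero and the section keeps its own filled velocity. The same function would be applied separately to the pedestrian and navigation matrices to obtain Vd and Vw.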





Further, step 5 specifically includes: effectively fusing multi-source data by using multi-view velocity fusion, according to a determination whether a feature representing the time stamp is a filled data feature Fd and whether the current velocity data is a filled data feature Fw, then passing the features through an embedding layer and splicing the features, and according to the aggregated roadside pedestrian velocity data Vd and road navigation velocity data Vw obtained in step 4, obtaining the fusion velocity Ŷ through the multi-layer perceptron (MLP):






Z=Embedding(Concat(Fd,Fw))





Ŷ=MLP(Concat(Z,Vd,Vw))

    • finally, correcting the estimated velocity according to an external factor (i.e., the road type), easily obtaining a velocity distribution of each type of road according to the historical data, regarding the velocity distribution as a normal distribution, and correcting the velocity falling at a tail of the distribution to obtain a final estimated velocity Ŷ′.
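The road-type correction can be illustrated with a short sketch. The source states only that the velocity distribution of each road type is treated as normal and that velocities falling in a tail are corrected; clipping to the mean plus or minus z standard deviations, the value z = 3 and the function name `correct_by_road_type` are assumptions made here for illustration.

```python
import numpy as np


def correct_by_road_type(y_hat, road_type, hist_mean, hist_std, z=3.0):
    """Clip fused velocities to mean +/- z*std of their road type.

    hist_mean / hist_std: dicts keyed by road type, estimated from
    historical data (assumed to be computed elsewhere).
    """
    y_hat = np.asarray(y_hat, dtype=float)
    road_type = np.asarray(road_type)
    corrected = y_hat.copy()
    for rt in hist_mean:
        idx = road_type == rt
        lo = hist_mean[rt] - z * hist_std[rt]
        hi = hist_mean[rt] + z * hist_std[rt]
        corrected[idx] = np.clip(y_hat[idx], lo, hi)  # pull tail values back
    return corrected
```

Velocities inside the central range of their road type are left unchanged, so the correction affects only outlying estimates.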


Compared with the prior art, the present invention has the following innovative advantages and remarkable effects:

    • 1) the present invention integrates data from multiple sources by fusing roadside pedestrian data and road navigation data, implements velocity estimation with 100% coverage of all road sections of the road network, fills in more than 70% of missing data, greatly reduces the cost of velocity estimation, and requires no additional devices such as loop detectors or cameras, which cost tens of thousands of yuan;
    • 2) the present invention proposes a standardized processing flow for data, and the specific implementation of each step can be changed, so that the flexibility and expansibility are high.





BRIEF DESCRIPTION OF THE DRAWINGS


FIG. 1 is a flowchart of a fine-grained large-scale urban traffic velocity estimation method of the present invention;



FIG. 2 is a schematic diagram of the correlation of neighbor road sections in step 4 applied to an embodiment of the present invention, in which (1) represents road mobile navigation data, and (2) represents roadside pedestrian data; and



FIG. 3 is a schematic diagram of improving the accuracy by self-view velocity aggregation in step 4 applied to an embodiment of the present invention, in which (1) represents road mobile navigation data and (2) represents roadside pedestrian data.





DETAILED DESCRIPTION

Specific implementation method and working principles of the present invention will be described in detail below with reference to the attached drawings.


Embodiment

In this embodiment, user data acquired from a certain place and collected from Mar. 21, 2020 to Mar. 28, 2020 are processed, and a data collection process is anonymously protected. Specific variables included in a data set are shown in Table 1:









TABLE 1
Crowd sensing data set

Variable name    Variable description
Gid              User ID
Timestamp        User reporting time
Lon              User longitude
Lat              User latitude
APP list         User APP usage list
WIFI list        User scanned WIFI list










In this embodiment, an implementation data set for implementing fine-grained large-scale urban traffic velocity estimation is the above-mentioned user data in a certain place, and the detailed implementation steps are as follows:

    • Step 1, cleaning an original data set in Table 1, deleting duplicate records, etc., and obtaining road mobile navigation data by filtering an APP usage list in original data and screening navigation application programs such as Didi, Gaode Map, etc. The roadside pedestrian data is obtained by filtering the scanned WIFI list and screening the data containing vehicle WIFI signals.
    • Step 2, by using the data obtained by cleaning and filtering in step 1, projecting trajectory data into a road network by using a hidden Markov road network matching algorithm, so as to obtain the current velocity X of each road section in different time periods. Then, the example calculates the road coverages of the different data sets from 8:00 am to 8:30 am, as shown in Table 2. Each item in the table represents the coverage of one type of road section in one data set during that time period. The greater the coverage is, the less data is missing.









TABLE 2
Comparison of road coverages of different data sets from 8:00 am to 8:30 am

Road type            Road navigation data    Roadside pedestrian data
All road sections    80.14%                  81.52%
Main road            92.26%                  91.44%
Secondary road       87.65%                  88.24%
Tertiary road        79.76%                  83.49%
Other roads          70.00%                  70.79%









By comparison, it is found that the two data sets cover the road network differently: each is dominant on different road sections, and neither performs better on all of them. The road mobile navigation data is mainly concentrated on main roads, while the roadside pedestrian data is more evenly distributed.
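A coverage figure of this kind can be computed directly from a velocity matrix. The sketch below assumes that coverage means the share of road sections with at least one observation in the selected time window and that missing entries are NaN; the source does not give the exact definition, and the function name `coverage_by_type` is illustrative.

```python
import numpy as np


def coverage_by_type(X_window, road_type):
    """Share of road sections with at least one velocity in the window.

    X_window: (sections x periods) velocities for the window, NaN if missing.
    road_type: one label per road section.
    """
    road_type = np.asarray(road_type)
    covered = (~np.isnan(X_window)).any(axis=1)
    result = {"All road sections": float(covered.mean())}
    for rt in sorted(set(road_type.tolist())):
        result[rt] = float(covered[road_type == rt].mean())
    return result
```

Running it once per data set gives one column of a coverage comparison.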


Step 3, performing data filling on the current velocity X of each road section in different time periods obtained in step 2; in this example, the missing velocity data of the recorded road sections on Mar. 28, 2020 are estimated. In order to provide additional velocity pattern information, the road mobile navigation data and roadside pedestrian data from Mar. 21, 2020 to Mar. 27, 2020 are used as historical data. After calculation, the data of road mobile navigation and roadside pedestrian data on the same day are 74.5% and 76.8% respectively. Specifically, this embodiment adopts a learning rate α of 1e-4 and a learning rate β of 1e-4, with β1 being 0.9 and β2 being 0.999 in an Adam optimizer. In the example, ordinary matrix decomposition is first pre-trained for 10K steps to get a good initialization, and then the meta-learning process of the weighted matrix is run for 30K steps. Finally, the matrix decomposition process based on the weighted matrix is trained for 10K steps. The proposed method is compared with ordinary matrix decomposition, tensor decomposition, linear interpolation, GAIN and KNN filling methods, and the experimental results are shown in Table 3:









TABLE 3
Comparison of data filling effects of different methods

                                 Road mobile navigation data    Roadside pedestrian data
Model                            MAE      RMSE     MAPE         MAE      RMSE     MAPE
Original data                    8.538    11.242   28.134       9.223    12.043   30.990
Ordinary matrix decomposition    8.463    11.255   27.509       9.072    11.920   29.924
Linear interpolation             8.830    11.745   29.016       9.470    12.396   31.430
KNN filling                      8.543    11.312   27.757       9.282    12.138   30.698
GAIN                             8.941    12.078   29.108       10.315   13.875   33.094
Tensor decomposition             8.590    11.235   30.261       9.150    11.881   32.616
Example method                   8.816    10.828   26.690       8.812    11.521   29.014









In the table, MAE represents the mean absolute error, RMSE represents the root mean square error, and MAPE represents the mean absolute percentage error. The lower an error value is, the better the method is. As listed, the method proposed in this embodiment has the lowest RMSE and MAPE on both data sets and the lowest MAE on the roadside pedestrian data, while its MAE on the road mobile navigation data (8.816) is higher than that of ordinary matrix decomposition (8.463). Overall, the method has a good effect of filling missing data.
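For reference, the three error measures can be computed as in the sketch below. The function name `error_metrics` is illustrative, and reporting MAPE in percent is an assumption about how the table values are scaled.

```python
import numpy as np


def error_metrics(y_true, y_pred):
    """Return (MAE, RMSE, MAPE in percent) for two equally shaped arrays."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_pred - y_true
    mae = np.mean(np.abs(err))
    rmse = np.sqrt(np.mean(err ** 2))
    # MAPE is undefined where the true velocity is zero; such entries
    # are assumed to have been excluded beforehand.
    mape = 100.0 * np.mean(np.abs(err) / np.abs(y_true))
    return mae, rmse, mape
```

Because RMSE squares the errors, it penalizes a few large deviations more than MAE does, which is why a method can rank differently under the two measures.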

    • Step 4, capturing a spatial correlation between adjacent roads by using the self-view velocity aggregation, and aggregating information of neighbor road sections highly correlated to a central road section. First of all, this embodiment calculates a spatial correlation between road sections according to a historical velocity matrix, and keeps highly correlated parts and ignores irrelevant information:









$$e_{i,j} = \sum \left( H_{i,:} - H_{j,:} \right)^{2}, \quad j \in \text{the set of neighboring connected roads for } i$$

$$e_{i,j} = \begin{cases} +\infty, & e_{i,j} > \text{threshold} \\ e_{i,j}, & e_{i,j} \le \text{threshold} \end{cases}$$











As shown in FIG. 2, the closer two road sections are geographically, the more similar their velocities are; the farther apart they are, the greater the difference is.


Then, in this embodiment, a fusion coefficient ai,j between road sections is calculated according to the spatial correlation, and the velocity V after self-view aggregation is obtained:








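$$a_{i,j} = \frac{\exp\left( -e_{i,j} / k \right)}{\sum_{j' \in N_i} \exp\left( -\frac{e_{i,j'}}{k} \right) + \varepsilon}$$

$$V_{i,:} = \left( 1 - \sum_{j \in N_i} \frac{a_{ij}}{2} \right) * \hat{X}_{i,:} + \sum_{j \in N_i} \frac{a_{ij}}{2} * \hat{X}_{j,:}$$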










As shown in FIG. 3, the aggregation method proposed in this embodiment can improve the velocity estimation accuracy of intermediate sections by combining information of neighbor road sections well.

    • Step 5, based on the data subjected to self-view aggregation in the step 4, multi-view velocity fusion is adopted to effectively fuse multi-source data, and the estimated velocity is corrected according to an external factor, that is, the road type. The results are shown in Table 4:









TABLE 4
Comparison of data fusion effects of different methods

Model                     MAE      RMSE     MAPE
Weighted mean             7.896    10.459   25.706
Gradient iteration        7.635    10.197   25.340
Linear regression         7.467    10.094   25.901
Enhanced decision tree    7.345    9.740    25.779
Example method            7.320    9.837    23.877










It can be seen from the table that the method proposed by the present invention has the lowest MAE and MAPE of the compared models, while its RMSE (9.837) is slightly higher than that of the enhanced decision tree (9.740). Overall, the method has a good data fusion effect.


The above description covers only embodiments of the present invention. Although the present invention has been described with reference to preferred embodiments, it should be understood that the present invention is not limited to the disclosed embodiments. Using the methods and technical contents disclosed above, those skilled in the art can make many possible variations and modifications to the disclosed solution, or modify the embodiments into equivalent embodiments, without departing from the scope of the technical solution of the present invention. Therefore, any simple changes, equivalent variations and modifications made to the above embodiments according to the technical essence of the present invention, without departing from the content of the technical solution of the present invention, are within the scope of the technical solution of the present invention.

Claims
  • 1. An urban traffic velocity estimation method based on multi-source crowd sensing data, comprising the following steps: step 1, data set preprocessing: cleaning an original data set collected by smart phones to obtain roadside pedestrian data and road navigation data respectively; step 2, average velocity calculation: calculating a current velocity X of each road section in different time periods by using the data set in the step 1; step 3, missing data filling: by using the current velocity X of each road section in different time periods obtained in the step 2, in combination with a historical velocity H of each road section in different time periods, filling missing velocity data in X to obtain a filled velocity X̂; step 4, self-view velocity aggregation: by using the filled velocity X̂ calculated in the step 3, quantizing spatial dependences between different road sections according to the historical velocity H, and collecting useful neighbor information to obtain an aggregated roadside pedestrian velocity data Vd and road mobile navigation velocity data Vw; and step 5, multi-view velocity fusion: according to the aggregated roadside pedestrian velocity data Vd and the road mobile navigation velocity data Vw, fusing multi-source velocity data by using a multi-layer perceptron according to a determination whether a time stamp and current velocity data are filled data, to obtain a fusion velocity Ŷ, and finally correcting the fusion velocity according to a feature of a road type to obtain a fine-grained large-scale urban traffic estimated velocity Ŷ′.
  • 2. The urban traffic velocity estimation method based on multi-source crowd sensing data according to claim 1, wherein the step 1 specifically comprises: obtaining the road navigation data by filtering an APP usage list in original data; and by filtering a scanned WIFI signal list in the original data, obtaining the roadside pedestrian data according to a determination whether there is a vehicle-mounted WIFI signal in the list.
  • 3. The urban traffic velocity estimation method based on multi-source crowd sensing data according to claim 1, wherein the step 2 specifically comprises: by using the data obtained by cleaning and filtering in the step 1, projecting trajectory data into a road network by using a hidden Markov road network matching algorithm, so as to obtain the current velocity X of each road section in different time periods.
  • 4. The urban traffic velocity estimation method based on multi-source crowd sensing data according to claim 1, wherein the step 3 specifically comprises: firstly, introducing a mask matrix M to represent a missing unit of the velocity X:
  • 5. The urban traffic velocity estimation method based on multi-source crowd sensing data according to claim 1, wherein the step 4 specifically comprises: capturing a spatial correlation between adjacent roads by using the self-view velocity aggregation, and aggregating information of neighbor road sections highly correlated to a central road section; firstly, calculating a spatial correlation ei,j between a road section i and a road section j according to a historical velocity matrix, and keeping highly correlated parts and ignoring irrelevant information:
  • 6. The urban traffic velocity estimation method based on multi-source crowd sensing data according to claim 1, wherein the step 5 specifically comprises: fusing the multi-source velocity data by using multi-view velocities, according to a determination whether a feature representing the time stamp is a filled data feature Fd and whether the current velocity data is a filled data feature Fw, then passing the features through an embedding layer and splicing the features, and according to the aggregated roadside pedestrian velocity data Vd and road mobile navigation velocity data Vw obtained in step 4, obtaining the fusion velocity Ŷ through the multi-layer perceptron: Z=Embedding(Concat(Fd,Fw)); Ŷ=MLP(Concat(Z,Vd,Vw)); finally, correcting the estimated velocity according to the features of the road types, obtaining a velocity distribution of each type of road according to the historical data, regarding the velocity distribution as a normal distribution, and correcting the velocity falling at a tail of the distribution to obtain a final estimated velocity Ŷ′.
Priority Claims (1)
Number Date Country Kind
202310221863.0 Mar 2023 CN national
PCT Information
Filing Document Filing Date Country Kind
PCT/CN2023/093404 5/11/2023 WO