This application is the national phase entry of International Application No. PCT/CN2024/095641, filed on May 28, 2024, which is based upon and claims priority to Chinese Patent Application No. 202311313179.1, filed on Oct. 11, 2023, the entire contents of which are incorporated herein by reference.
This application relates to the field of intelligent driving technologies, and specifically to a Cro-IntentFormer-based method and system for predicting surrounding vehicle trajectories by integrating driving intentions.
With the increasing number of autonomous vehicles in China, human-machine co-driving has become a trend in future transportation developments. In complex and variable traffic environments, it is crucial for autonomous vehicles to understand complex driving scenarios and predict the future intentions and trajectories of surrounding traffic participants to ensure safety. This capability also lays the foundation for downstream trajectory planning tasks in autonomous driving.
Current trajectory prediction methods typically only consider the influence of traffic participants within a certain distance of the target vehicle on its future trajectory. They often focus more on the distance between the target vehicle and surrounding vehicles during interaction modeling, neglecting potential factors such as the driving intentions of the vehicles, which can significantly impact the strength of interactions. As a result, these methods do not adequately capture the interactive features between the target vehicle and its surroundings. Moreover, current approaches usually separate the prediction of vehicle driving intentions from trajectory prediction and do not effectively integrate driving intention prediction into the trajectory prediction tasks. Additionally, most existing trajectory prediction methods apply attention mechanisms to individual time frames rather than to time slices, thus overlooking the connections between adjacent trajectory segments.
The present disclosure provides a Cro-IntentFormer-based method and system for predicting surrounding vehicle trajectories by integrating driving intentions. This method explicitly considers the impact of driving intentions on future vehicle trajectories, incorporating both the distance between vehicles and the similarity of their behavioral intentions into the interaction modeling of vehicular relationships. It enables real-time prediction of surrounding vehicle trajectories during driving, offering strong support for safe navigation in complex and dynamic traffic environments, and providing a basis for downstream trajectory planning tasks.
A method for predicting surrounding vehicle trajectories based on Cro-IntentFormer and integrating vehicle driving intentions includes the following steps:
Further, the step S1 of preprocessing the vehicle trajectories includes:
Further, the constructing the physical relationship graph in the step S2 specifically includes:
Further, the specific steps in the step S2, which involve inputting the physical relationship graph G1 and the raw data into the spatio-temporal feature extraction module to obtain the spatio-temporal features of the trajectory, include:
Further, the computational process of the temporal information fusion network is as follows: dividing the historical state information h of each vehicle into time segments of length Lseg along each feature dimension:
Further, the specific steps in the step S3 are as follows:
Further, the specific steps in the step S4 for constructing the semantic relationship graph and obtaining the semantic features of the trajectory are:
Further, the specific steps in the step S5 are as follows:
A prediction system for the Cro-IntentFormer-based method of integrating vehicle driving intentions for predicting surrounding vehicle trajectories includes:
Furthermore, the prediction system also includes a hazard warning device that uses the future trajectories predicted by the surrounding vehicle trajectory prediction model to issue warnings for vehicles that may pose a collision risk with the ego vehicle's future path.
The present disclosure proposes a method and system for predicting surrounding vehicle trajectories based on Cro-IntentFormer, which incorporates vehicle driving intentions. The system utilizes the CrossFormer model to extract temporal features of vehicle trajectories. CrossFormer is a neural network model based on the attention mechanism that effectively captures dependencies across time segments and input feature dimensions, thus fully learning the information between adjacent trajectory segments. The system explicitly considers the impact of vehicle driving intentions on future vehicle trajectories, incorporating both the distance between vehicles and the similarity of their behavioral intentions into the interactive modeling of vehicular relationships, enhancing the model's interpretability and prediction accuracy, and providing strong support for safe driving in complex and variable traffic environments.
Advantages of the present disclosure include:
The present disclosure will be further described below with reference to the drawings and specific embodiments, although the scope of protection of the present disclosure is not limited thereto.
As shown in
The information collection and processing device includes onboard sensors, roadside sensors, and a data processing module. It is configured for real-time acquisition of position and speed information of the ego vehicle and surrounding vehicles, identifying trajectory information with timestamps and vehicle IDs, and performing standardization, cleaning, and preprocessing of the trajectory data. This includes removing outliers and duplicate data, filling missing values, and reducing noise. Afterwards, the trajectories are annotated with intentions based on the vehicle's heading angle and longitudinal/lateral speed at each time step. The behavioral intentions of vehicles include going straight a1, changing lanes to the left a2, and changing lanes to the right a3. Finally, the processed trajectory data is divided using a time window T to obtain raw data suitable for input into the surrounding vehicle trajectory prediction model.
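By way of illustration only, the following sketch shows one possible way to label intentions and divide a trajectory into windows of length T, roughly along the lines described above. The thresholds, the column layout, and the helper names are assumptions introduced for this example and are not specified by the present disclosure.

```python
import numpy as np

# Assumed thresholds for illustration only; the disclosure does not fix these values.
LAT_SPEED_TH = 0.2   # m/s, lateral speed beyond which a lane change is assumed
HEADING_TH = 0.02    # rad, heading-angle deviation supporting a lane-change label

def label_intention(lat_speed: float, heading: float) -> int:
    """Return 1 (going straight, a1), 2 (left lane change, a2) or 3 (right lane change, a3)."""
    if lat_speed > LAT_SPEED_TH and heading > HEADING_TH:
        return 2          # changing lanes to the left
    if lat_speed < -LAT_SPEED_TH and heading < -HEADING_TH:
        return 3          # changing lanes to the right
    return 1              # going straight

def split_windows(track: np.ndarray, T: int) -> list:
    """Split one vehicle's trajectory (time steps x features) into non-overlapping windows of length T."""
    return [track[i:i + T] for i in range(0, len(track) - T + 1, T)]

# Usage example with assumed columns [x, y, vx, vy, heading]
track = np.random.rand(100, 5)
intents = [label_intention(row[3], row[4]) for row in track]
windows = split_windows(track, T=30)
```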
The surrounding vehicle trajectory prediction device includes a spatio-temporal feature extraction module, an intent prediction module, a feature fusion module, and a decoder. During vehicle operation, it explicitly considers the impact of vehicle driving intentions on future vehicle trajectories, predicts the future trajectories of surrounding vehicles based on the raw data obtained from the information collection and processing device, and displays the prediction results on the vehicle's central control screen.
The hazard warning device, based on the future trajectories predicted by the surrounding vehicle trajectory prediction model, issues warnings for vehicles that may pose a collision risk with the ego vehicle's future path, providing reference for human drivers and serving as a basis for downstream trajectory planning tasks.
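A minimal sketch of such a collision-risk check is given below, assuming the ego vehicle's planned path and each predicted trajectory are sampled as (x, y) points at common time steps; the safety distance safe_dist is a hypothetical parameter, not a value taken from the present disclosure.

```python
import numpy as np

def collision_risk(ego_plan: np.ndarray, pred_traj: np.ndarray, safe_dist: float = 2.5) -> bool:
    """Flag a vehicle whose predicted trajectory comes within safe_dist of the ego plan
    at any common future time step (both arrays have shape T_f x 2)."""
    steps = min(len(ego_plan), len(pred_traj))
    gaps = np.linalg.norm(ego_plan[:steps] - pred_traj[:steps], axis=1)
    return bool((gaps < safe_dist).any())
```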
As shown in
The CrossFormer network, a neural network model based on the attention mechanism, effectively captures dependencies across time segments and input feature dimensions, fully learning the dependencies between adjacent trajectory segments. The present disclosure employs it as the trajectory temporal information fusion network. After using this network to learn the temporal dependencies of vehicle trajectories, attention operations are performed on the vehicle trajectories within each time segment to learn the spatial dependencies. The present disclosure thus extracts the spatio-temporal features of vehicle trajectories in a layered manner, improving the interpretability of the model.
Specifically, referring to the architectural diagram of the surrounding vehicle trajectory prediction model shown in
S2.1 Construction of the Physical Relationship Graph G1.
Select the vehicles observed at time t as the nodes Vi of the graph. Based on a preset physical distance threshold D, calculate the physical distance dij between vehicles at time t. If dij < D, it is considered that there is an edge eij between nodes i and j, and a physical adjacency matrix A1 is established based on the physical distances between vehicles. The physical relationship graph G1 = {V, E1} is then constructed based on the connectivity relationships between the nodes.
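The construction of G1 can be sketched as follows; the positions array and the helper name build_physical_graph are illustrative, and since the disclosure does not state whether self-loops are kept, they are simply excluded here.

```python
import numpy as np

def build_physical_graph(positions: np.ndarray, D: float):
    """positions: N x 2 array of (x, y) coordinates at time t.
    Returns the physical adjacency matrix A1 and the edge set E1."""
    diff = positions[:, None, :] - positions[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)          # pairwise distances d_ij
    A1 = (dist < D).astype(float)                 # edge e_ij if d_ij < D
    np.fill_diagonal(A1, 0.0)                     # no self-loops (illustrative choice)
    n = len(positions)
    E1 = [(i, j) for i in range(n) for j in range(i + 1, n) if A1[i, j]]
    return A1, E1
```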
S2.2 Inputting the Physical Relationship Graph G1 and Raw Data into the Spatio-Temporal Feature Extraction Module to Obtain the Spatio-Temporal Features of the Trajectory.
Sequentially input the raw data Ht = {h_t^1, h_t^2, . . . , h_t^n} at time t of the vehicles in the node set of the physical relationship graph G1 into the temporal information fusion network to learn the time-dependency relationships of each vehicle's own trajectory, and output the feature-extracted matrix B ∈ R^{N×L×d_model}. Here, h_t^i = {s_{t−T
The computational method for using the temporal information fusion network to learn the time dependency relationships of each vehicle's own trajectory is as follows:
Divide the historical state information h of each vehicle into time segments of length Lseg for each feature dimension:
Where, C is the number of features in the original vehicle trajectory data, and h_{i,c} represents the i-th time segment of length Lseg for feature c. Use learnable linear matrices E ∈ R^{d_model×L
Where, L is the total number of time segments after the historical time steps have been divided by Lseg.
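A compact PyTorch sketch of this segment embedding is given below; a shared nn.Linear plays the role of E, and the learnable position term is an assumption about its exact form.

```python
import torch
import torch.nn as nn

class SegmentEmbedding(nn.Module):
    """Embed each length-L_seg time segment of every feature dimension into d_model."""
    def __init__(self, l_seg: int, n_segments: int, n_features: int, d_model: int):
        super().__init__()
        self.l_seg = l_seg
        self.proj = nn.Linear(l_seg, d_model)                                   # plays the role of E
        self.pos = nn.Parameter(torch.zeros(n_segments, n_features, d_model))   # assumed learnable position term

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        # h: (T_h, C) history of one vehicle; T_h must be divisible by l_seg
        t_h, c = h.shape
        segs = h.reshape(t_h // self.l_seg, self.l_seg, c).permute(0, 2, 1)      # (L, C, L_seg)
        return self.proj(segs) + self.pos                                        # m: (L, C, d_model)
```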
Perform multi-head attention calculations and residual connections on the encoded feature vectors m along the time and feature dimensions to obtain the feature matrix m^{dim} ∈ R^{L×C×d_model}, which integrates both the time segments and the input feature dimensions:
m^{time}_{:,d} = LayerNorm(m_{:,d} + MSA_{time}(m_{:,d}, m_{:,d}, m_{:,d}))
m^{time} = LayerNorm(m^{time}_{:,d} + MLP(m^{time}_{:,d}))
m^{dim}_{i,:} = LayerNorm(m^{time}_{i,:} + MSA_{dim}(m^{time}_{i,:}, m^{time}_{i,:}, m^{time}_{i,:}))
m^{dim} = LayerNorm(m^{dim}_{i,:} + MLP(m^{dim}_{i,:}))
Where, MSA(Q, K, V) denotes the multi-head attention operation, LayerNorm denotes layer normalization, and MLP denotes a multi-layer perceptron; m_{i,:} ∈ R^{C×d_model} represents the feature matrix of all feature dimensions for time segment i, and m_{:,c} ∈ R^{L×d_model} represents the feature matrix of feature dimension c across all time segments.
Finally, perform an additive aggregation operation on the feature matrix m^{dim} ∈ R^{L×C×d_model} along the feature dimension to obtain the feature matrix B_i ∈ R^{L×d_model} that encapsulates the time-dependency relationships of vehicle i's trajectory.
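The two-stage attention and the final additive aggregation may be sketched as follows, assuming nn.MultiheadAttention for the MSA operations; the number of heads and the MLP width are illustrative choices.

```python
import torch
import torch.nn as nn

class TwoStageAttention(nn.Module):
    """Cross-time then cross-dimension attention over m (L, C, d_model), following the equations above."""
    def __init__(self, d_model: int, n_heads: int = 4):
        super().__init__()
        self.msa_time = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.msa_dim = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.mlp_time = nn.Sequential(nn.Linear(d_model, d_model), nn.GELU(), nn.Linear(d_model, d_model))
        self.mlp_dim = nn.Sequential(nn.Linear(d_model, d_model), nn.GELU(), nn.Linear(d_model, d_model))
        self.norm = nn.ModuleList([nn.LayerNorm(d_model) for _ in range(4)])

    def forward(self, m: torch.Tensor) -> torch.Tensor:
        # m: (L, C, d_model) segment embeddings of one vehicle
        # Stage 1: attention along the time-segment axis, one sequence per feature dimension
        x = m.permute(1, 0, 2)                                  # (C, L, d): batch over feature dimensions
        x = self.norm[0](x + self.msa_time(x, x, x)[0])         # m^time
        x = self.norm[1](x + self.mlp_time(x))
        # Stage 2: attention along the feature-dimension axis, one sequence per time segment
        y = x.permute(1, 0, 2)                                  # (L, C, d): batch over time segments
        y = self.norm[2](y + self.msa_dim(y, y, y)[0])          # m^dim
        y = self.norm[3](y + self.mlp_dim(y))
        return y.sum(dim=1)                                     # additive aggregation over features -> B_i (L, d)
```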
After obtaining the feature matrix B ∈ R^{N×L×d_model}, which integrates the time-dependency relationships of each vehicle's trajectory through the temporal information fusion network, adjacency relationships are re-established based on the physical distances between vehicles at the last time step within each time segment, to obtain a physical relationship graph Gtime={Gl
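The attention operation over vehicles within each time segment, which yields the spatio-temporal feature matrix Z1 used in step S3, is not fully reproduced here; the sketch below is one plausible realization that masks attention with the per-segment physical adjacency, an assumption rather than the disclosure's exact formulation.

```python
import torch
import torch.nn as nn

class SpatialAttention(nn.Module):
    """Attention over the N vehicles within each time segment, masked by the physical graph."""
    def __init__(self, d_model: int, n_heads: int = 4):
        super().__init__()
        self.msa = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm = nn.LayerNorm(d_model)

    def forward(self, B: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        # B: (N, L, d_model) temporal features; adj: (L, N, N) adjacency per time segment
        N, L, d = B.shape
        x = B.permute(1, 0, 2)                                   # (L, N, d): one vehicle set per segment
        mask = adj < 0.5                                         # True where no edge -> attention blocked
        mask = mask & ~torch.eye(N, dtype=torch.bool)            # always allow self-attention (assumption)
        mask = mask.repeat_interleave(self.msa.num_heads, dim=0) # (L*heads, N, N) as required by PyTorch
        out = self.msa(x, x, x, attn_mask=mask)[0]
        z = self.norm(x + out)
        return z.permute(1, 0, 2)                                # Z1: (N, L, d_model)
```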
S3: Input the spatio-temporal feature matrix into the intent prediction module to obtain the vehicle's predicted intent.
Aggregate the spatio-temporal feature matrix Z1 ∈ R^{N×L×d_model} of all nodes in the physical relationship graph G1 along the time dimension using additive operations. After the additive aggregation, pass the resulting spatio-temporal feature matrix through a fully connected network and normalize it using the Softmax function to obtain the predicted intent vector ωi = {atten1, atten2, atten3} for vehicle i. Here, atten1, atten2, and atten3 respectively represent the probabilities of the vehicle going straight, changing lanes to the left, and changing lanes to the right.
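A minimal sketch of this intent prediction head, assuming a single fully connected layer:

```python
import torch
import torch.nn as nn

class IntentHead(nn.Module):
    """Sum the spatio-temporal features over time, then apply a fully connected layer and Softmax."""
    def __init__(self, d_model: int, n_intents: int = 3):
        super().__init__()
        self.fc = nn.Linear(d_model, n_intents)

    def forward(self, z1: torch.Tensor) -> torch.Tensor:
        # z1: (N, L, d_model) -> omega: (N, 3) = probabilities of straight / left / right
        pooled = z1.sum(dim=1)                        # additive aggregation along the time dimension
        return torch.softmax(self.fc(pooled), dim=-1)
```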
Step S4.1 Construct the Semantic Relationship Graph: The present disclosure explicitly considers the impact of vehicle driving intentions on future vehicle trajectories, integrating both the distance between vehicles and the similarity of their behavioral intentions into the interactive modeling of vehicular relationships, and coupling the prediction of vehicle driving intentions with trajectory prediction. Select the vehicles observed at time t as the nodes Vi of the graph. Based on the predicted intent vector ωi for vehicle i, select the behavior with the highest probability as the future intent a of the vehicle, establish connections between the nodes of vehicles with the same intent, and construct the semantic relationship graph G2 = {V, E2} based on the connectivity relationships between nodes.
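The semantic adjacency can be sketched as follows; the helper name and the exclusion of self-loops are illustrative choices.

```python
import numpy as np

def build_semantic_graph(omega: np.ndarray):
    """omega: N x 3 predicted intent vectors.
    Connect vehicles sharing the most probable intent; returns adjacency A2 and edge set E2."""
    intents = omega.argmax(axis=1)                               # most probable behavior per vehicle
    A2 = (intents[:, None] == intents[None, :]).astype(float)
    np.fill_diagonal(A2, 0.0)
    n = len(intents)
    E2 = [(i, j) for i in range(n) for j in range(i + 1, n) if A2[i, j]]
    return A2, E2
```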
Step S4.2 Obtain the Semantic Features of the Trajectory: Input both the semantic relationship graph G2 and the raw data obtained in step S1 into the spatio-temporal feature extraction module to derive the semantic feature matrix Z2 ∈ R^{N×L×d_model} for all nodes in the semantic relationship graph G2.
S5.1 Fusion of Spatio-Temporal and Semantic Features: Obtain the importance of the spatio-temporal and semantic features for all vehicle nodes, denoted as w1 and w2, respectively:
Where, q represents a learnable semantic-level attention vector, and tanh represents the hyperbolic tangent activation function;
Normalize the importance w1 and w2 of the spatio-temporal and semantic features for all vehicle nodes to obtain the feature weights βi for the trajectory's spatio-temporal and semantic features:
Perform a weighted summation of the spatio-temporal and semantic features to obtain the feature matrix J, which integrates the spatio-temporal and semantic information of the trajectory:
J=β1Z1+β2Z2
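A sketch of this fusion step is shown below, assuming a HAN-style semantic-level attention in which each feature type is scored by q applied to a tanh-activated linear projection, averaged over nodes and time segments, and normalized with Softmax; the projection layer and its width are assumptions consistent with the learnable vector q mentioned above.

```python
import torch
import torch.nn as nn

class FeatureFusion(nn.Module):
    """Weight the spatio-temporal (Z1) and semantic (Z2) features and sum them into J."""
    def __init__(self, d_model: int, d_att: int = 64):
        super().__init__()
        self.proj = nn.Sequential(nn.Linear(d_model, d_att), nn.Tanh())  # tanh(W z + b), assumed form
        self.q = nn.Parameter(torch.randn(d_att))                        # learnable semantic-level attention vector

    def forward(self, z1: torch.Tensor, z2: torch.Tensor) -> torch.Tensor:
        # z1, z2: (N, L, d_model)
        w = torch.stack([(self.proj(z) @ self.q).mean() for z in (z1, z2)])  # importances w1, w2
        beta = torch.softmax(w, dim=0)                                       # feature weights beta1, beta2
        return beta[0] * z1 + beta[1] * z2                                   # J = beta1*Z1 + beta2*Z2
```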
S5.2 Decoding to Obtain the Predicted Trajectories of Vehicles Surrounding the Target Vehicle: Input the feature matrix J into the decoder to obtain the predicted trajectories F = {f1, f2, . . . , fn} for the vehicles surrounding the target vehicle, where fi = {(x_{t+1}^i, y_{t+1}^i), (x_{t+2}^i, y_{t+2}^i), . . . , (x_{t+T
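The decoder's internal structure is not detailed at this point, so the following is only a minimal MLP-based sketch mapping the fused features J to T_f future (x, y) points per vehicle; a recurrent or attention-based decoder could equally be substituted.

```python
import torch
import torch.nn as nn

class TrajectoryDecoder(nn.Module):
    """Map the fused feature matrix J (N, L, d_model) to future positions (N, T_f, 2)."""
    def __init__(self, d_model: int, n_segments: int, t_future: int):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(n_segments * d_model, 256), nn.ReLU(),
            nn.Linear(256, t_future * 2),
        )
        self.t_future = t_future

    def forward(self, J: torch.Tensor) -> torch.Tensor:
        flat = J.flatten(start_dim=1)                 # (N, L*d_model)
        out = self.mlp(flat)                          # (N, T_f*2)
        return out.view(-1, self.t_future, 2)         # predicted (x, y) for each future time step
```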
The descriptions provided above are merely specific explanations of feasible implementations for this application and do not limit the scope of this application. Any obvious improvements, substitutions, or variations that can be made by those skilled in the art without deviating from the substantive content of the present disclosure are within the scope of protection of the present disclosure.