The present disclosure relates to the field of autonomous driving technology, in particular to a conflict control method for shared driving, a storage medium, and an electronic device.
Both the autonomous driving system and the driver in shared driving are intelligent agents, and both make judgments and decisions based on their own understanding of the scene. Therefore, in shared driving, in addition to the human-machine steering conflict caused by the difference in preview actions at the control levels of the human and the machine, another major cause of the human-machine conflict is the disagreement between the decision-making levels of the human and the machine. That is, there is a difference between the target trajectories planned by the driver and by the autonomous driving system, which leads to the steering torque conflict.
It should be noted that the information disclosed in the above section is only for enhancement of understanding of the background of the present disclosure, and thus may contain information that does not form the prior art already known to those of ordinary skill in the art.
According to a first aspect of the present disclosure, a conflict control method for shared driving is provided. The conflict control method for shared driving includes: establishing, based on a driver's deterministic steering torque and a driver's stochastic steering torque, a game model for human-machine path tracking control corresponding to a human-machine interaction action; obtaining human-machine torque conflict information by solving the game model for human-machine path tracking control; determining a shared control strategy based on the human-machine torque conflict information; and controlling a vehicle based on the shared control strategy.
According to a second aspect of the present disclosure, a conflict control apparatus for shared driving is provided. The conflict control apparatus for shared driving includes: a modeling module configured to establish, based on a driver's deterministic steering torque and a driver's stochastic steering torque, a game model for human-machine path tracking control corresponding to a human-machine interaction action; a solving module configured to obtain human-machine torque conflict information by solving the game model for human-machine path tracking control; and an application module configured to determine a shared control strategy based on the human-machine torque conflict information, and control a vehicle based on the shared control strategy.
According to a third aspect of the present disclosure, a computer-readable storage medium is provided. The computer-readable storage medium stores a computer program, which when executed by a processor, causes the conflict control method for shared driving as described in the above embodiments to be implemented.
According to a fourth aspect of the present disclosure, an electronic device is provided. The electronic device includes: one or more processors; and a storage unit for storing one or more programs, which when executed by one or more processors, cause the one or more processors to implement the conflict control method for shared driving as described in the above embodiments.
It should be understood that the above general description and the following detailed description are only illustrative and explanatory, and do not limit the present disclosure.
The drawings, which are incorporated in and constitute a part of the specification, illustrate embodiments consistent with the present disclosure and together with the specification serve to explain the principles of the disclosure. Obviously, the drawings in the following descriptions are only some embodiments of the present disclosure, and for those of ordinary skill in the art, other drawings can also be obtained from these drawings without creative efforts, in which,
Example embodiments will now be described more fully with reference to the drawings. Example embodiments, however, can be embodied in various forms and should not be construed as limited to the examples set forth herein. Instead, these embodiments are provided so that the present disclosure will be thorough and complete, and will fully convey the concept of example embodiments to those skilled in the art.
In addition, the described features, structures, or characteristics can be combined in any suitable manner in one or more embodiments. In the following description, numerous specific details are provided in order to give a thorough understanding of the embodiments of the present disclosure. However, those skilled in the art will appreciate that the technical solutions of the present disclosure can be practiced without one or more of the specific details, or other methods, components, devices, steps, etc., can be employed. In other instances, well-known methods, devices, implementations, or operations have not been shown or described in detail to avoid obscuring aspects of the present disclosure.
The block diagrams shown in the drawings are functional entities and do not necessarily correspond to physically separate entities. These functional entities can be implemented in software, or in one or more hardware modules or integrated circuits, or in different networks and/or processor devices and/or microcontroller devices.
The flowcharts shown in the drawings are merely illustrative and do not necessarily include all contents and operations/steps, nor do they have to be executed in the order described. For example, some operations/steps can be decomposed, and some operations/steps can be combined or partially combined, thus an actual execution order may be changed according to actual situations.
In the human-machine dual-agent system, the driver and the autonomous driving system simultaneously manipulate the actuators and change the vehicle state, each to achieve its own goal. These redundant inputs will inevitably cause human-machine interaction conflicts, which seriously affect the safety, comfort, power performance, and fuel economy of the vehicle.
The game theory is an effective means of describing and understanding the interaction conflict between two parties in the multi-agent system, providing effective theoretical methods for quantitative modeling of the human-machine interaction, resolution of the human-machine conflict, and inference of the true intentions of the driver.
For the path tracking control problem in shared driving, both the autonomous driving system and the driver are intelligent agents, and both make judgments and decisions based on their own understanding of the scene. Therefore, in shared driving, in addition to the human-machine steering conflict caused by the difference in preview actions at the control levels of the human and the machine, another major cause of the human-machine conflict is the disagreement between the decision-making levels of the human and the machine. That is, there is a difference between the target trajectories planned by the driver and by the autonomous driving system, which leads to the steering torque conflict.
On the one hand, in the modeling of the human-machine interaction mechanism, especially under extreme vehicle conditions (such as human-machine collaborative emergency avoidance), it is difficult to directly apply linear dynamic models to describe the human-machine interaction action. Many nonlinear methods, such as nonlinear prediction methods, local linearization methods, and piecewise affine methods, have been applied to deal with model mismatch problems under extreme vehicle conditions. However, nonlinear prediction methods often have poor real-time performance due to their high computational complexity. Although local linearization or piecewise affine methods can ensure the real-time performance of the algorithm, these methods inevitably cause the control strategy to switch back and forth between different linearization intervals, resulting in uneven human-machine interaction results, and even a sliding-mode phenomenon of repeated switching near the boundaries of the linearization intervals. On the other hand, for multi-agent dynamics systems, the dynamic non-cooperative game theory can also be applied. However, current research on the game theory in shared driving systems is mainly limited to the design of shared control strategies, and there is no comprehensive theoretical description of the mapping relationship between the human-machine decision divergence and the control conflict.
The present disclosure studies and focuses on the human-machine interaction mechanism from decision divergence to control conflict. Due to the presence of both a certain steering resistance torque and an uncertain steering torque from the driver in the coupled human-machine steering dynamics system, the present disclosure proposes a new stochastic game theory framework that considers deterministic and stochastic steering torques, in which Nash and Stackelberg equilibria are employed under different information modes to fully describe such a mapping relationship, so as to build a theoretical bridge connecting the human-machine decision divergence and the control conflict, overcome the human-machine decision confusion problem in shared driving systems, and provide a theoretical basis for the design of shared control strategies.
Detailed explanations of technical solutions in embodiments of the present disclosure will be provided in the following.
In step S101, a game model for human-machine path tracking control corresponding to a human-machine interaction action is established based on a driver's deterministic steering torque and a driver's stochastic steering torque.
In step S102, human-machine torque conflict information is obtained by solving the game model for human-machine path tracking control.
In step S103, a shared control strategy is determined based on the human-machine torque conflict information, and a vehicle is controlled based on the shared control strategy.
In some embodiments of the present disclosure, the game model for human-machine path tracking control corresponding to the human-machine interaction action is established based on the driver's deterministic steering torque and the driver's stochastic steering torque, and the game model for human-machine path tracking control is solved to obtain the human-machine torque conflict information for vehicle control. On the one hand, the uncertain action of the driver can be incorporated into the human-machine path tracking control, making it more in line with practical scenario requirements and achieving more accurate control effects. On the other hand, the human-machine torque conflict information obtained from solving the game model is an accurate description of the human-machine interaction action from decision divergence to control conflict in the shared driving model, which can overcome the problem of human-machine decision confusion in the shared driving system, provide theoretical basis for the design of shared control strategies, and further optimize the results of the vehicle control.
More detailed explanations to the steps of the conflict control method for shared driving in the present disclosure will be provided in the following with reference to the drawings and embodiments.
In step S101, a game model for human-machine path tracking control corresponding to a human-machine interaction action is established based on a driver's deterministic steering torque and a driver's stochastic steering torque.
In some embodiments, the game model for human-machine path tracking control for solving the human-machine interaction action is first established. The game model for human-machine path tracking control can adopt a non-cooperative game theory framework.
Since the steering actions of the driver and the autonomous driving system in the process of shared driving are both aimed at minimizing the tracking error of their respective tracks, and their respective steering actions are generated according to the state feedback of the cooperative vehicle infrastructure system and the steering action of the other party, there are inevitably conflicts between the target trajectories of the respective decision-making layers of the human and the machine. The human-machine interaction process under such a condition can therefore be described through the non-cooperative game theory framework.
When the shared driving is studied using the game theory, the optimal control strategy based on the cost function is used to describe the trajectory tracking actions of the driver and the autonomous driving system. Since the sum of the cost functions of the driver and the autonomous driving system is non-zero, and since the optimal trajectory tracking control is usually described using the prediction time domain and the control time domain, the trajectory tracking problem of the shared driving can be abstracted as a multi-stage dynamic game problem with a non-zero sum. For the multi-stage dynamic game with the non-zero sum, the game problem can be divided into two categories, namely the closed-loop memoryless dynamic game problem and the open-loop dynamic game problem, based on the information mode of the game.
The counterpart of the closed-loop memoryless dynamic game is the open-loop dynamic game, in which each party's strategy depends only on the initial state rather than on the current state at each stage.
Therefore, in the step S101, the game model for human-machine path tracking control in two different (closed-loop and open-loop) information modes can be established separately.
In step S601, a first discrete state update equation for a dynamics system of a shared driving vehicle in a closed-loop information mode is established based on the driver's deterministic steering torque and the driver's stochastic steering torque.
In step S602, a path tracking augmentation system that includes a human-machine preview state is obtained by augmenting the first discrete state update equation through a human-machine preview dynamic process.
In step S603, a driver trajectory cost function and a driving system trajectory cost function are established based on the path tracking augmentation system, to obtain the game model for human-machine path tracking control.
In some embodiments, this section focuses on the modeling of the human-machine interaction. Therefore, it can be assumed that both the target trajectories of the human and the machine have a small tangent direction angle. Under the condition of a small heading angle, the state vector in the model can be simplified as xc=[θsw {dot over (θ)}sw {dot over (ψ)} Y ψ]T, where θsw is the steering wheel angle, {dot over (θ)}sw is the derivative of θsw with respect to time, Y is the global coordinate of the mass center of the vehicle, ψ is the heading angle of the vehicle, and {dot over (ψ)} is the derivative of ψ with respect to time.
In step S601, a first discrete state update equation for a dynamics system of a shared driving vehicle in a closed-loop information mode is established based on the driver's deterministic steering torque and the driver's stochastic steering torque.
The driver's deterministic steering torque is denoted as τh, the driver's stochastic steering torque is denoted as τhsto, and the steering torque of the driving system is denoted as τm. The continuous state space equation is established based on the state vector xc, as shown in formula (1):
In order to describe the shared driving problem as a multi-stage game, by discretizing the above continuous system at the system discretization time Ts, the dynamics system of the shared driving vehicle can be transformed into the following difference equation, namely the first discrete state update equation, as shown in formula (2):
In formula (2), τkh is the driver's deterministic steering torque at time instant k in the shared driving system, τkm is the steering torque from the driving system at time instant k, τhsto is the driver's stochastic steering torque, and τdisk is the steering torque related to the distance.
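The discretization step that converts the continuous steering dynamics into a difference equation can be illustrated with a minimal sketch. A forward-Euler rule, one common choice (the disclosure does not fix the discretization method), maps a continuous system with matrices Ac and Bc to Ad = I + Ts·Ac and Bd = Ts·Bc. The 2x2 matrices below are hypothetical placeholders, not the vehicle model of formula (1).

```python
# Minimal sketch: forward-Euler discretization of a continuous state-space
# model x_dot = Ac x + Bc u at sample time Ts. Forward Euler is an assumed
# method; the disclosure only states that the system is discretized at Ts.

def euler_discretize(Ac, Bc, Ts):
    """Return (Ad, Bd) with Ad = I + Ts*Ac and Bd = Ts*Bc (lists of lists)."""
    n = len(Ac)
    Ad = [[(1.0 if i == j else 0.0) + Ts * Ac[i][j] for j in range(n)]
          for i in range(n)]
    Bd = [[Ts * Bc[i][j] for j in range(len(Bc[0]))] for i in range(n)]
    return Ad, Bd

if __name__ == "__main__":
    Ac = [[0.0, 1.0], [-2.0, -0.5]]   # hypothetical toy dynamics
    Bc = [[0.0], [1.0]]
    Ad, Bd = euler_discretize(Ac, Bc, 0.01)
    print(Ad, Bd)
```

For a stiffer steering model, a zero-order-hold discretization (matrix exponential) would be the more accurate choice; the Euler form is used here only to keep the sketch self-contained.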
In step S602, a path tracking augmentation system that includes a human-machine preview state is obtained by augmenting the first discrete state update equation through a human-machine preview dynamic process.
By augmenting the human-machine shared vehicle dynamics system through the human-machine preview dynamic process Rx(k+1)=ArRsk+Brrskpre, the path tracking augmentation system that includes the human-machine preview state can be obtained, as shown in formula (4):
In formula (4), let ck=Nτdisk and θk=B1τkhsto, and let Rkpre denote the lateral displacement and the heading angle of the farthest preview point in the preview areas of the driver and the driving system. Since the preview information of the driver and the driving system in other areas is already contained in the augmented state xk, the farthest preview point information Rkpre can be omitted, and the path tracking augmentation system can be further simplified as shown in formula (5):
In step S603, a driver trajectory cost function and a driving system trajectory cost function are established based on the path tracking augmentation system, to obtain the game model for human-machine path tracking control.
In some embodiments, in the path tracking augmentation system for the human-machine decision divergence, a driver trajectory cost function J1 and a driving system trajectory cost function J2 with the step of nu for both the prediction time domain and the control time domain are designed, to obtain the game model for human-machine path tracking control, which can be described as formula (6):
Based on this, the formula (6) establishes the game model for human-machine path tracking control at the nuth stage through a linear quadratic method. The cost functions of both parties include the steering control input from the other party, to express the human-machine interaction characteristics.
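The cross-coupled structure of the two cost functions can be sketched as follows, with a scalar output trajectory and hypothetical weights wq, wu, and wcross standing in for the quadratic weighting matrices of formula (6); each party's cost penalizes its own tracking error and control effort while also containing the other party's steering input.

```python
# Minimal sketch: coupled quadratic trajectory costs for driver (J1) and
# driving system (J2). xs is the predicted output sequence, uh/um the two
# steering input sequences, rh/rm the two reference trajectories. The
# weights are hypothetical placeholders, not the patent's matrices.

def trajectory_costs(xs, uh, um, rh, rm, wq=1.0, wu=0.1, wcross=0.05):
    """Return (J1, J2): tracking error + own effort + other party's input."""
    J1 = (sum(wq * (x - r) ** 2 for x, r in zip(xs, rh))
          + sum(wu * u * u for u in uh)
          + sum(wcross * u * u for u in um))   # coupling to the machine input
    J2 = (sum(wq * (x - r) ** 2 for x, r in zip(xs, rm))
          + sum(wu * u * u for u in um)
          + sum(wcross * u * u for u in uh))   # coupling to the driver input
    return J1, J2
```

Because each cost contains the other party's input, neither J1 nor J2 can be minimized independently, which is exactly what makes the problem a non-zero-sum game.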
In step S801, a second discrete state update equation for a dynamics system of a shared driving vehicle in an open-loop information mode is established based on the driver's deterministic steering torque and the driver's stochastic steering torque.
In step S802, a prediction output vector in the prediction time domain is determined based on the second discrete state update equation, and a reference trajectory vector of the driver and a reference trajectory vector of the driving system are determined.
In step S803, a driver trajectory cost function and a driving system trajectory cost function are respectively established based on the prediction output vector, the reference trajectory vector of the driver, and the reference trajectory vector of the driving system, to obtain the game model for human-machine path tracking control.
In some embodiments, under the framework of the model prediction control, both the driver and the driving system estimate the vehicle trajectory within the prediction time domain np, and apply the steering control within the control time domain nu, to minimize the deviation between the vehicle trajectory and their respective decisions. Compared with the linear quadratic regulator method, the model prediction control more intuitively reflects, in the final interaction model, the target trajectories planned at the decision-making levels of the driver and the autonomous driving system. Meanwhile, according to the model prediction control algorithm, the state prediction vector in the cost function is established based on the current state of the system and the control input within the control time domain nu. Therefore, the control law is only related to the current initial state, which precisely conforms to the definition of the open-loop information mode.
In step S801, a second discrete state update equation for a dynamics system of a shared driving vehicle in an open-loop information mode is established based on the driver's deterministic steering torque and the driver's stochastic steering torque.
For the dynamics system of the shared driving vehicle, the uncertain action of the driver is incorporated into the system interference, which results in the second discrete state update equation as shown in formula (7):
In formula (7), τkh is the driver's deterministic steering torque in the shared driving system at time instant k, τkm is the steering torque from the driving system at time instant k, τhsto is the driver's stochastic steering torque, τdisk is the steering torque related to the distance, and the interference input is recorded as w′k=(τdisk−τhsto).
Based on the above second discrete state update equation, if it is assumed that the interference input w′k in the prediction time domain remains unchanged, the model output of the shared driving system in the next np steps can be expressed as:
In step S802, a prediction output vector in the prediction time domain is determined based on the second discrete state update equation, and a reference trajectory vector of the driver and a reference trajectory vector of the driving system are determined.
It is assumed that the prediction time domain and the control time domain of the model prediction algorithm for the shared driving system are both nu, then at time instant k, the model prediction output vector in the prediction time domain is defined as Ypk, as shown in formula (8), and the human-machine control input vectors are respectively Uhk and Umk, as shown in formulas (9) and (10):
Based on the second discrete state update equation (formula 7), the model output of the shared driving system in the next np steps, which is the prediction output vector, can be expressed as formula (11):
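Under the assumption stated above that the interference input is held constant over the horizon, the stacked prediction behind formulas (8) to (11) can be sketched on a scalar stand-in of the dynamics; a, b1, and b2 below are hypothetical scalar analogues of the system matrices, not the patent's parameters.

```python
# Minimal sketch: rolling out a scalar shared-steering model
#   x[k+1] = a*x[k] + b1*uh[k] + b2*um[k] + w
# over the horizon, with the interference w held constant as assumed in
# the text. The returned list plays the role of the prediction output
# vector Ypk built from the input sequences Uhk and Umk.

def predict(a, b1, b2, x0, uh, um, w):
    """Return predicted outputs [x1, ..., x_N] for input sequences uh, um."""
    xs, x = [], x0
    for uh_k, um_k in zip(uh, um):
        x = a * x + b1 * uh_k + b2 * um_k + w   # one-step state update
        xs.append(x)
    return xs
```

In the full system the same rollout is written in matrix form, so that the prediction output vector is affine in the stacked human and machine input vectors, which is what makes the cost in formula (13) quadratic in those inputs.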
At each time step k, the reference trajectory vectors of the driver and the autonomous driving system can be expressed as formula (12):
In step S803, a driver trajectory cost function and a driving system trajectory cost function are respectively established based on the prediction output vector, the reference trajectory vector of the driver, and the reference trajectory vector of the driving system, to obtain the game model for human-machine path tracking control.
The parameters in the game model for human-machine path tracking control are represented by the prediction output vector Ypk and the reference trajectory vectors Rhk and Rmk, and the driver trajectory cost function and the driving system trajectory cost function are obtained as shown in formula (13):
In step S102, human-machine torque conflict information is obtained by solving the game model for human-machine path tracking control.
The game models for human-machine path tracking control are established separately in two (closed-loop and open-loop) different information modes in the step S101, and the solving of the two game models are also different.
In some embodiments, for the non-cooperative game problem, since the mirror-symmetric anthropomorphic strategy is usually used in the shared driving system to design the controller of the autonomous driving system to improve the human-machine consistency, the driver and the autonomous driving system have an equal relationship in the game process. When the game participants are in a symmetric or equal relationship, the Nash equilibrium provides a reasonable theoretical solution to the non-cooperative game. In such a balanced state, neither party plays a dominant role in the decision-making process, and neither party can reduce its own trajectory tracking cost function value by unilaterally adjusting its own decisions.
In addition, there is another type of non-cooperative game problem, namely the master-slave game problem of the dual-agent system, in which the leader understands the follower's reaction to its decision and makes the decision first, while the follower observes the leader's decision before making the corresponding decision. Such hierarchical game outcome is called Stackelberg Equilibrium.
Since both the Nash equilibrium and the Stackelberg equilibrium can theoretically model the decision control mechanism of the shared driving system, two solutions, namely the corresponding Nash equilibrium solution and the corresponding Stackelberg equilibrium solution, can be obtained for each game model. Therefore, four quantitative models based on the non-cooperative game theory are planned to be used, and based on the open-loop information mode and the closed-loop information mode, the mapping relationship between human-machine decision divergence and steering torque conflict under the Nash equilibrium and the Stackelberg equilibrium will be analyzed in depth, in order to explore the optimal modeling method for the human-machine interaction mechanism.
In some embodiments of the present disclosure, the problem of solving Nash equilibrium in the closed-loop information mode can be described as the Hamiltonian function constrained optimization problem as shown in formula (14):
The closed form solution {τkh=τkh*(x0, xk), τkm=τkm*(x0, xk)} of the above constrained optimization problem forms a closed-loop Nash equilibrium solution.
The Nash equilibrium solution in the open-loop mode is only related to the initial state and is independent of the current state in each stage. Therefore, the open-loop Nash equilibrium solution {τkh=τkh*(x0), τkm=τkm*(x0)} also satisfies the closed-loop Nash equilibrium condition (14). However, it is clear that the solution set of the closed-loop Nash equilibrium problem does not merely include the open-loop Nash equilibrium solution. This non-uniqueness of the information leads to the problem of multiple solutions in the Nash equilibrium under the closed-loop information structure. Since the parameter θk of the driver's stochastic steering torque is considered in the system modeling herein, the non-uniqueness of the information in the game process is precisely eliminated. Therefore, the problem of multiple solutions can be avoided through the feedback Nash equilibrium, and the optimal solutions in each stage of the feedback Nash equilibrium satisfy the following conditions:
It can be seen that the feedback Nash equilibrium under the closed-loop condition conforms to the Bellman Optimality Principle, and the closed-loop Nash equilibrium solution (also referred to as feedback Nash equilibrium solution) of the path tracking control system under the human-machine decision divergence condition can be solved through the Stochastic Dynamic Programming (SDP) algorithm.
In step S901, recursive relationships for steering control value functions respectively corresponding to the driver and the driving system under a Nash equilibrium condition are determined using a stochastic dynamic programming algorithm.
In step S902, closed-loop Nash equilibrium solutions respectively corresponding to the driver and the driving system are calculated based on the first discrete state update equation and the recursive relationships, as the human-machine torque conflict information.
In some embodiments, at any time step k, it is assumed that the unmodeled interference ck of the shared driving system remains unchanged throughout the dynamic game of entire nu stages, and the Gaussian random distribution θk˜N(μ, σ) is used to describe the parameter related to the driver's stochastic steering torque, where μ and σ respectively represent the mean and the standard deviation of the Gaussian distribution.
Due to the presence of interference vectors ck and θk, solving the human-machine game problem based on the steering torque interaction becomes an affine quadratic problem. Therefore, in the process of dynamic programming, the steering control value functions of the driver and the driving system have the following affine quadratic form as shown in formula (15):
The steering control value functions in the above formula (15) represent the values of the value functions of the driver and the driving system, in the game process of their respective cost functions from the jth stage to the nuth stage, and the steering control value functions at the (k+j)th step and (k+j+1)th step satisfy the recursive relationships as shown in formula (16):
By substituting the first discrete state update equation xk+1=Axk+B1τkh+B2τkm+ck+θk (formula 4) into the steering control value functions (formula 15), the human-machine torque relationships satisfying the closed-loop Nash equilibrium relationship can be obtained as shown in formula (18):
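The backward recursion just described can be sketched on a scalar analogue of the two-player game. Since the stochastic term θk is additive and zero-mean, certainty equivalence leaves the equilibrium feedback gains unchanged (the noise only shifts the value functions by a constant), so only the deterministic recursion appears below. The dynamics, weights, and terminal weighting choice are hypothetical stand-ins, not the patent's model.

```python
# Minimal sketch: feedback (closed-loop) Nash gains for a scalar two-player
# linear-quadratic game x[k+1] = a*x + b1*u1 + b2*u2 (+ zero-mean noise),
# solved by backward recursion in the spirit of the stochastic dynamic
# programming step above. u_i = -k_i * x; value functions V_i = p_i * x^2.

def feedback_nash_gains(a, b1, b2, q1, r1, q2, r2, stages):
    p1, p2 = q1, q2                    # terminal value weights (an assumed choice)
    gains = []
    for _ in range(stages):
        # Coupled stationarity conditions for the two players:
        #   (r1 + p1*b1^2) k1 + p1*b1*b2 k2 = p1*b1*a
        #   p2*b1*b2 k1 + (r2 + p2*b2^2) k2 = p2*b2*a
        a11, a12, c1 = r1 + p1 * b1 * b1, p1 * b1 * b2, p1 * b1 * a
        a21, a22, c2 = p2 * b1 * b2, r2 + p2 * b2 * b2, p2 * b2 * a
        det = a11 * a22 - a12 * a21
        k1 = (c1 * a22 - a12 * c2) / det
        k2 = (a11 * c2 - c1 * a21) / det
        acl = a - b1 * k1 - b2 * k2    # closed-loop dynamics under both gains
        p1 = q1 + r1 * k1 * k1 + p1 * acl * acl   # value-function update
        p2 = q2 + r2 * k2 * k2 + p2 * acl * acl
        gains.append((k1, k2))
    gains.reverse()                    # gains[j] applies at stage j
    return gains
```

With symmetric parameters the two gains coincide, mirroring the equal-relationship assumption under which the Nash equilibrium was introduced above.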
In step S1001, recursive relationships for steering control value functions respectively corresponding to the driver and the driving system under a Stackelberg equilibrium condition are determined using a stochastic dynamic programming algorithm.
In step S1002, a driver reaction function is determined based on the recursive relationship for the steering control value function corresponding to the driving system.
In step S1003, an open-loop Stackelberg equilibrium solution corresponding to the driving system is calculated based on the first discrete state update equation, the driver reaction function, and the recursive relationship for the steering control value function corresponding to the driving system.
In step S1004, an open-loop Stackelberg equilibrium solution corresponding to the driver is calculated based on the open-loop Stackelberg equilibrium solution corresponding to the driving system, as the human-machine torque conflict information.
In some embodiments, different from the Nash equilibrium, in the Stackelberg equilibrium, there is a master-slave relationship between the driver and the autonomous driving system. At each stage of the game, the driving system, as the dominant party, first inputs the steering control input, and the driver observes this action and reacts accordingly.
Therefore, the closed-loop Stackelberg equilibrium solution can be solved through backward induction. Based on the stochastic dynamic programming method, the recursive relationship for the steering control value function corresponding to the driver is the same as in the Nash case, that is, Vk+j1S and Vk+j1N (see formula 16) are identical.
The recursive relationship for the steering control value function corresponding to the driving system in the closed-loop Stackelberg equilibrium solution (also referred to as feedback Stackelberg equilibrium solution) in the closed-loop information mode satisfies the condition shown in formula (19):
By substituting the first discrete state update equation xk+1=Axk+B1τkh+B2τkm+ck+θk (formula 4) and the recursive relationship Vk+j1S for the steering control value function corresponding to the driver into the recursive relationship for the steering control value function corresponding to the driving system (formula 19), the driver reaction function on the steering torque of the driving system can be calculated, as shown in formula (20):
By substituting the first discrete state update equation (formula 4) and the driver reaction function (formula 20) into the recursive relationship for the steering control value function corresponding to the driving system (formula 19), the open-loop Stackelberg equilibrium solution of the driving system under the Stackelberg equilibrium strategy is as shown in formula (21):
Since the driving system is the dominant party and the driver is the follower party, the open-loop Stackelberg equilibrium solution of the driving system can be used to obtain the corresponding Hamiltonian function constrained optimization problem for the driver, and the closed form solution can be obtained as the open-loop Stackelberg equilibrium solution of the driver.
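A single Stackelberg stage can be sketched on a scalar analogue: the driving system, as the leader, optimizes against the driver's reaction function, and the driver then responds to the leader's torque. The quadratic costs and parameters below are hypothetical stand-ins for the roles played by formulas (19) to (21), not the patent's model.

```python
# Minimal sketch: one Stackelberg stage with scalar dynamics
#   x_next = a*x + b1*u1 + b2*u2,
# follower (driver) cost  J1 = q1*x_next^2 + r1*u1^2,
# leader (driving system) cost J2 = q2*x_next^2 + r2*u2^2.

def stackelberg_stage(a, b1, b2, q1, r1, q2, r2, x):
    """Leader moves first, follower reacts. Returns (u2, u1)."""
    # Follower reaction function: u1 = alpha * (a*x + b2*u2)
    alpha = -q1 * b1 / (r1 + q1 * b1 * b1)
    beta = 1.0 + b1 * alpha            # shrinkage of the state by the reaction
    # Leader minimizes q2*(beta*(a*x + b2*u2))**2 + r2*u2**2 over u2
    u2 = -q2 * beta * beta * b2 * a * x / (r2 + q2 * beta * beta * b2 * b2)
    u1 = alpha * (a * x + b2 * u2)     # follower's observed response
    return u2, u1
```

The leader's optimization already contains the follower's reaction, which is the structural difference from the Nash case, where both stationarity conditions are solved simultaneously.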
In step S1101, the game model for human-machine path tracking control is solved to obtain a closed form solution corresponding to the model, and a relationship expression between human-machine steering control and a target trajectory is obtained based on the closed form solution.
In step S1102, open-loop Nash equilibrium solutions respectively corresponding to the driver and the driving system are obtained by solving the relationship expression using the convex iterative algorithm, as the human-machine torque conflict information.
In some embodiments, based on the definition of the open-loop Nash equilibrium, the convex iterative algorithm is used to solve for the Nash equilibrium of the system in the open-loop information mode.
For the unconstrained problem as shown in formula (13), the human-machine path tracking control law has the following closed form solutions, as shown in formula (22):
Herein, pinv( ) is the pseudo-inverse operator, pinv(A) = (AᵀA)⁻¹Aᵀ, and thus the relationship expression between the human-machine steering control and the target trajectory can be obtained, as shown in formula (23):
From formula (23), it can be seen that there is an interactive coupling between the two control laws. That is, each agent's control action depends not only on its own decision objective, system state, and disturbances, but also on the other's control strategy. Therefore, in order to decouple formula (23), the convex iterative algorithm is used to solve it, and an update equation is obtained as shown in formula (24):
The solving process based on the convex iterative algorithm can be summarized as follows: the initial values U_hk^(0) and U_mk^(0) for the iteration are first determined and substituted into the relationship expression (formula 23) to obtain the current optimal values U*_hk^(0) and U*_mk^(0), which are in turn substituted into the update equation (formula 24) to obtain the next iterates U_hk^(1) and U_mk^(1). This iterative cycle then repeats. As i approaches infinity, the update equation (formula 24) is transformed into:
Thus, the relationship expression (formula 23) is transformed into:
The equation for the Nash equilibrium solution of the human-machine decision control in the open-loop information mode is as shown in formula (27):
In the actual modeling process, the value of the first stage of the game is taken as the human-machine interaction result of the open-loop Nash equilibrium at time instant k, that is, the human-machine torque conflict information:
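The convex iterative procedure described above can be sketched with a toy scalar stand-in for formulas (23)-(24). The two quadratic costs below (each agent tracking its own target through a shared output with an effort penalty) are illustrative assumptions, not the patent's cost functions; the point is the structure: two coupled best responses, decoupled by a convex-combination update that converges to the open-loop Nash fixed point.

```python
# Convex iterative decoupling of a coupled pair of best responses.
# Illustrative scalar costs (assumptions): each agent minimizes
# (x + b1*uh + b2*um - target)^2 + w*u^2 for its own target.

def best_response_h(um, x=0.0, b1=1.0, b2=1.0, yh=1.0, wh=1.0):
    # argmin over uh of (x + b1*uh + b2*um - yh)^2 + wh*uh^2
    return b1 * (yh - x - b2 * um) / (b1 ** 2 + wh)

def best_response_m(uh, x=0.0, b1=1.0, b2=1.0, ym=-1.0, wm=1.0):
    # argmin over um of (x + b1*uh + b2*um - ym)^2 + wm*um^2
    return b2 * (ym - x - b1 * uh) / (b2 ** 2 + wm)

def solve_open_loop_nash(lam=0.5, iters=200):
    uh = um = 0.0
    for _ in range(iters):
        uh_star = best_response_h(um)       # current optimal values
        um_star = best_response_m(uh)
        # convex-combination update toward the current optimum
        uh = (1.0 - lam) * uh + lam * uh_star
        um = (1.0 - lam) * um + lam * um_star
    return uh, um

uh, um = solve_open_loop_nash()
```

For these parameters the coupled equations solve exactly to (uh, um) = (1, -1), and the iteration reaches that fixed point: at convergence each control is the best response to the other, which is precisely the open-loop Nash condition.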
In step S1201, the driving system trajectory cost function is converted into a driving system trajectory optimization function that considers the driver reaction function.
In step S1202, an open-loop Stackelberg equilibrium solution corresponding to the driving system is obtained by solving the driving system trajectory optimization function.
In step S1203, an open-loop Stackelberg equilibrium solution corresponding to the driver is calculated based on the open-loop Stackelberg equilibrium solution and the driver trajectory cost function, as the human-machine torque conflict information.
In some embodiments, the backward induction can also be used to obtain the open-loop Stackelberg equilibrium solution, in a manner similar to that for the closed-loop Stackelberg equilibrium solution. Substituting the predicted output vector Y_pk (formula 11) into the game model (formula 13) yields:
Therefore, by solving J1, the driver's reaction function to the steering torque of the driving system can be obtained, as shown in formula (30):
When this control strategy is adopted, the driving system incorporates the driver reaction function into its cost function, obtaining a new driving system trajectory cost function, as shown in formula (31):
Therefore, solving the Stackelberg equilibrium solution of the steering torque control of the driving system in the open-loop information mode is transformed into solving the unconstrained optimization problem as shown in formula (32):
By solving the above optimization problem, the open-loop Stackelberg equilibrium solution corresponding to the driving system can be obtained as shown in formula (33):
Based on the Stackelberg equilibrium solution U_mk^OS of the steering control of the driving system, the optimal reaction of the driver to the steering control of the autonomous driving system can be obtained by solving the optimization problem shown in formula (34):
The Stackelberg equilibrium solution of the steering control of the driver in the open-loop information mode can be obtained from the above equation, as shown in formula (35):
Similarly, the open-loop Stackelberg equilibrium solution of the first stage of the game is taken to describe the interaction result of human-machine steering torque at time k, as shown in formula (36):
Different from the open-loop Nash equilibrium, since the open-loop Stackelberg equilibrium is solved by backward induction, constraints can also be added when solving the autonomous driving system control law (formula 34), transforming it into a constrained optimization problem so that the controller can meet the expected performance. The significance of the open-loop Stackelberg equilibrium strategy provided in the present disclosure lies not only in proposing a theoretical model for the human-machine steering torque interaction, but also in enabling the flexible design of an interactive steering assistance controller that satisfies kinematic and dynamic safety constraints using this algorithm.
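The substitution structure of the open-loop Stackelberg derivation (driver reaction function, leader cost with the reaction substituted, then the follower's optimal reaction) can be sketched in the same toy scalar setting. The costs and parameters are illustrative assumptions, not formulas (30)-(35); only the order of the steps mirrors the text.

```python
# Open-loop Stackelberg sketch: the driving system (leader, um) substitutes
# the driver's reaction function into its own cost before optimizing,
# mirroring the roles of formulas (30)-(33). Scalar costs are assumptions:
# driver minimizes (x + b1*uh + b2*um - yh)^2 + wh*uh^2,
# leader minimizes (x + b1*uh + b2*um - ym)^2 + wm*um^2.

def stackelberg(x=0.0, b1=1.0, b2=1.0, yh=1.0, ym=-1.0, wh=1.0, wm=1.0):
    # Driver reaction function uh(um) = alpha + beta*um (cf. formula (30))
    alpha = b1 * (yh - x) / (b1 ** 2 + wh)
    beta = -b1 * b2 / (b1 ** 2 + wh)
    # Leader cost with the reaction substituted (cf. formula (31)):
    #   (d + e*um)^2 + wm*um^2,  d = x + b1*alpha - ym,  e = b1*beta + b2
    d = x + b1 * alpha - ym
    e = b1 * beta + b2
    um = -e * d / (e ** 2 + wm)          # leader's optimum (cf. formula (33))
    uh = alpha + beta * um               # follower's optimal reaction
    return uh, um

def leader_cost(uh, um, x=0.0, b1=1.0, b2=1.0, ym=-1.0, wm=1.0):
    return (x + b1 * uh + b2 * um - ym) ** 2 + wm * um ** 2

uh_s, um_s = stackelberg()
```

A useful sanity check on such a sketch: because the leader could always play its Nash action (to which the follower's reaction is the Nash response), the leader's Stackelberg cost is never worse than its Nash cost. Here the Nash pair of this game is (1, -1), and the Stackelberg solution (0.8, -0.6) indeed gives the leader a strictly lower cost.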
In step 103, a shared control strategy is determined based on the human-machine torque conflict information, and a vehicle is controlled based on the shared control strategy.
In some embodiments, the human-machine torque conflict information obtained in step 102 forms a theoretical basis for the relationship between the human-machine decision divergence and the control conflict. Based on this theoretical basis, a better shared control strategy can be designed for emergency lane changing conditions to guide the vehicle control.
In order to form the theoretical relationship between the human-machine decision divergence and the control conflict, the present disclosure models the mapping relationship between the human-machine decision divergence and the human-machine steering torque interaction based on four dynamic non-cooperative game theory frameworks.
On the one hand, the dynamics model for the shared driving system is augmented by the human-machine target trajectory, and then the human-machine interaction action in the feedback information mode is modeled using the linear quadratic regulator method.
Since the model establishes the mapping relationship between the human-machine decision divergence and the steering torque interaction, the presence of the steering resistance torque and the driver's uncertainty torque turns the human-machine decision divergence problem into an affine quadratic game problem. Therefore, according to the affine quadratic game algorithm based on stochastic dynamic programming provided by the present disclosure, the feedback Nash equilibrium solution and the feedback Stackelberg equilibrium solution that describe the interaction relationship between the human-machine decisions and the steering torques can be obtained.
On the other hand, the distributed model predictive control method is used to describe the multi-objective path tracking control problem between humans and vehicles under the open-loop condition. In order to solve the human-machine decision control model in the open-loop information mode, the model predictive control method is further used to solve the open-loop Nash equilibrium solution and the open-loop Stackelberg equilibrium solution that describe the mapping relationship between the human-machine decision and control.
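The feedback Nash equilibrium solution mentioned above can also be illustrated with a minimal scalar linear-quadratic sketch. The dynamics and weights are illustrative assumptions, not the patent's augmented shared-driving model; the sketch shows the coupled Riccati-like backward recursion in which, at every stage, each player's feedback gain is the best response to the other's, solved here in closed form.

```python
# Feedback Nash equilibrium for a scalar LQ game via a coupled
# Riccati-like backward recursion. Parameters are illustrative assumptions.

def feedback_nash(a, b1, b2, qh, rh, qm, rm, N):
    """Return stage-0 Nash feedback gains and value coefficients."""
    Ph, Pm = qh, qm                       # terminal value coefficients
    Kh = Km = 0.0
    for _ in range(N):
        ch = b1 * Ph / (rh + b1 ** 2 * Ph)
        cm = b2 * Pm / (rm + b2 ** 2 * Pm)
        # Simultaneous best responses Kh = -ch*(a + b2*Km) and
        # Km = -cm*(a + b1*Kh), solved as a 2x2 linear system:
        den = 1.0 - b1 * b2 * ch * cm
        Kh = -ch * a * (1.0 - b2 * cm) / den
        Km = -cm * a * (1.0 - b1 * ch) / den
        A_cl = a + b1 * Kh + b2 * Km      # closed-loop dynamics
        Ph = qh + rh * Kh ** 2 + Ph * A_cl ** 2
        Pm = qm + rm * Km ** 2 + Pm * A_cl ** 2
    return Kh, Km, Ph, Pm

Kh_n, Km_n, Ph_n, Pm_n = feedback_nash(0.9, 0.5, 0.5, 1.0, 1.0, 1.0, 1.0, 50)
```

With identical weights for both players the game is symmetric, so the Nash gains coincide and the closed loop is stable, which gives a quick consistency check on the recursion.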
The modeling module 1301 is configured to establish, based on a driver's deterministic steering torque and a driver's stochastic steering torque, a game model for human-machine path tracking control corresponding to a human-machine interaction action.
The solving module 1302 is configured to obtain human-machine torque conflict information by solving the game model for human-machine path tracking control.
The application module 1303 is configured to determine a shared control strategy based on the human-machine torque conflict information, and control a vehicle based on the shared control strategy.
According to one or more embodiments of the present disclosure, the modeling module 1301 includes a first modeling unit. The first modeling unit is configured to establish, based on the driver's deterministic steering torque and the driver's stochastic steering torque, a first discrete state update equation for a dynamics system of a shared driving vehicle in a closed-loop information mode; obtain a path tracking augmentation system that includes a human-machine preview state by augmenting the first discrete state update equation through a human-machine preview dynamic process; and establish, based on the path tracking augmentation system, a driver trajectory cost function and a driving system trajectory cost function, to obtain the game model for human-machine path tracking control.
According to one or more embodiments of the present disclosure, the modeling module 1301 includes a second modeling unit. The second modeling unit is configured to establish, based on the driver's deterministic steering torque and the driver's stochastic steering torque, a second discrete state update equation for a dynamics system of a shared driving vehicle in an open-loop information mode; determine a prediction output vector in a prediction time domain based on the second discrete state update equation; determine a driver reference trajectory vector and a driving system reference trajectory vector; and establish, based on the prediction output vector, the driver reference trajectory vector, and the driving system reference trajectory vector, a driver trajectory cost function and a driving system trajectory cost function, to obtain the game model for human-machine path tracking control.
According to one or more embodiments of the present disclosure, the solving module 1302 includes a first solving unit. The first solving unit is configured to determine, based on a stochastic dynamic programming algorithm, recursive relationships for steering control value functions respectively corresponding to a driver and a driving system under a Nash equilibrium condition; and calculate, based on the first discrete state update equation and the recursive relationships, closed-loop Nash equilibrium solutions respectively corresponding to the driver and the driving system, as the human-machine torque conflict information.
According to one or more embodiments of the present disclosure, the solving module 1302 includes a second solving unit. The second solving unit is configured to determine, based on a stochastic dynamic programming algorithm, recursive relationships for steering control value functions respectively corresponding to a driver and a driving system under a Stackelberg equilibrium condition; determine a driver reaction function based on the recursive relationship for the steering control value function corresponding to the driving system; calculate, based on the first discrete state update equation, the driver reaction function, and the recursive relationship for the steering control value function corresponding to the driving system, a closed-loop Stackelberg equilibrium solution corresponding to the driving system; and calculate, based on the closed-loop Stackelberg equilibrium solution corresponding to the driving system, a closed-loop Stackelberg equilibrium solution corresponding to the driver, as the human-machine torque conflict information.
According to one or more embodiments of the present disclosure, the solving module 1302 includes a third solving unit. The third solving unit is configured to obtain a closed form solution corresponding to the game model by solving the game model for human-machine path tracking control; obtain, based on the closed form solution corresponding to the game model, a relationship expression between human-machine steering control and a target trajectory; and obtain open-loop Nash equilibrium solutions respectively corresponding to a driver and a driving system by solving the relationship expression based on a convex iterative algorithm, as the human-machine torque conflict information.
According to one or more embodiments of the present disclosure, the solving module 1302 includes a fourth solving unit. The fourth solving unit is configured to convert the driving system trajectory cost function into a driving system trajectory optimization function that considers a driver reaction function; obtain an open-loop Stackelberg equilibrium solution corresponding to a driving system by solving the driving system trajectory optimization function; and calculate, based on the open-loop Stackelberg equilibrium solution and the driver trajectory cost function, an open-loop Stackelberg equilibrium solution corresponding to a driver, as the human-machine torque conflict information.
The specific details of each module in the above conflict control apparatus 1300 for shared driving have been described in detail in the corresponding conflict control methods for shared driving, which will not be repeated here.
It should be noted that although several modules or units of the apparatus for performing actions are mentioned in detail above, such division is not mandatory. According to the embodiments of the present disclosure, the features and functions of two or more modules or units described above can be embodied in one module or unit. Conversely, the features and functions of one module or unit described above can be further divided into and embodied by multiple modules or units.
In one or more embodiments of the present disclosure, a storage medium capable of implementing the above method is also provided.
In one or more embodiments of the present disclosure, an electronic device capable of implementing the above method is also provided.
It should be noted that the computer system 1500 of the electronic device shown in
As shown in
The following components are connected to the I/O interface 1505: an input part 1506 including a keyboard, a mouse, and the like; an output part 1507 including a cathode ray tube (CRT), a liquid crystal display (LCD), a loudspeaker, and the like; a storage part 1508 including a hard disk and the like; and a communication part 1509 including a network interface card such as a LAN card, a modem, and the like. The communication part 1509 performs communication processing via a network such as the Internet. A drive 1510 is also connected to the I/O interface 1505 as required. A removable medium 1511, such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory, is installed on the drive 1510 as required, so that a computer program read from it can be installed into the storage part 1508 as required.
According to embodiments of the present disclosure, the process described below with reference to a flowchart can be implemented as a computer software program. For example, embodiments of the present disclosure include a computer program product, which includes a computer program carried on a computer-readable medium, and the computer program includes program code for executing the method shown in the flowchart. In such embodiments, the computer program can be downloaded and installed from a network through the communication part 1509, and/or installed from the removable medium 1511. When the computer program is executed by the central processing unit (CPU) 1501, various functions defined in the systems of the present disclosure are executed.
It should be noted that the computer-readable medium shown in the present disclosure can be a computer-readable signal medium, a computer-readable storage medium, or any combination of the two. The computer-readable storage medium may be, for example, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples (a non-exhaustive list) of readable storage media include: electrical connections with one or more wires, portable disks, hard disks, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash), optical fiber, portable compact disc read-only memory (CD-ROM), optical storage devices, magnetic storage devices, or any suitable combination of the above. In the present disclosure, a computer-readable storage medium may be any tangible medium containing or storing a program, which may be used by or in combination with an instruction execution system, apparatus, or device. In the present disclosure, a computer-readable signal medium may include a data signal transmitted in baseband or as part of a carrier wave, in which computer-readable program code is carried. Such a transmitted data signal may take various forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium, which may send, propagate, or transmit a program for use by or in combination with an instruction execution system, apparatus, or device. The program code contained in the computer-readable medium can be transmitted via any suitable medium, including but not limited to wireless, wired, optical cable, RF, etc., or any suitable combination of the above.
The flowchart and block diagram in the accompanying drawings illustrate possible architectures, functions, and operations of the systems, methods, and computer program products according to various embodiments of the present disclosure. In this regard, each block in a flowchart or block diagram may represent a module, program segment, or part of code that contains one or more executable instructions for implementing a specified logical function. It should also be noted that in some alternative implementations, the functions marked in the blocks may occur in a different order from that marked in the drawings. For example, two consecutive blocks can actually be executed substantially in parallel, or sometimes in the reverse order, depending on the function involved. It should also be noted that each block in the block diagram or flowchart, and any combination of blocks in the block diagram or flowchart, can be implemented with a dedicated hardware-based system that performs a specified function or operation, or with a combination of dedicated hardware and computer instructions.
The units described in embodiments of the present disclosure can be implemented through software or hardware, and the described units can also be arranged in a processor. In some cases, the names of these units do not constitute a limitation on the units themselves.
On the other hand, the present disclosure also provides a computer-readable medium, which can be included in the electronic device described in the above embodiments, or can exist independently without being assembled into the electronic device. The computer-readable medium carries one or more programs which, when executed by an electronic device, cause the electronic device to realize the methods described in the above embodiments.
Based on the description of the above embodiments of the present disclosure, it is easy for those skilled in the art to understand that embodiments of the present disclosure can be implemented through software or by combining software with necessary hardware. Therefore, the technical solutions in embodiments of the present disclosure can be embodied in the form of a software product, which can be stored on a non-transitory storage medium (such as CD-ROM, USB flash drive, portable hard drive, etc.), or on a network. The software product includes several instructions to cause a computing device (such as a personal computer, a server, a touch terminal, or a network device) to implement the method according to the embodiments of the present disclosure.
After considering the specification and practicing the invention disclosed herein, those skilled in the art will readily conceive of other embodiments of the disclosure. This application is intended to cover any variants, uses, or adaptive changes of the disclosure that follow its general principles and include common general knowledge or customary technical means in the technical field not disclosed in the present disclosure.
It should be understood that the present disclosure is not limited to the precise structure already described above and shown in the drawings, and various modifications and changes can be made without departing from its scope. The scope of this disclosure is limited only by the appended claims.
| Number | Date | Country | Kind |
|---|---|---|---|
| 202210414179.X | Apr 2022 | CN | national |
This application is a U.S. national phase application of International Application No. PCT/CN2022/134000 filed on Nov. 24, 2022, which claims priority to Chinese Patent Application No. 202210414179.X, filed on Apr. 14, 2022 and entitled “CONFLICT CONTROL METHOD AND APPARATUS FOR SHARED DRIVING, AND STORAGE MEDIUM AND ELECTRONIC DEVICE”, the entire contents of which are incorporated herein by reference for all purposes.
| Filing Document | Filing Date | Country | Kind |
|---|---|---|---|
| PCT/CN2022/134000 | 11/24/2022 | WO |