COMPUTER-IMPLEMENTED METHOD FOR PREDICTING A BEHAVIOR OF AGENTS IN A DYNAMIC SYSTEM WITH A MULTIPLICITY OF INTERACTING AGENTS

Description

CROSS REFERENCE

The present application claims the benefit under 35 U.S.C. § 119 of German Patent Application No. DE 10 2022 204 723.0 filed on May 13, 2022, which is expressly incorporated herein by reference in its entirety.

FIELD

The present invention relates to a computer-implemented method for predicting a behavior of agents in a dynamic system with a multiplicity of interacting agents.

BACKGROUND INFORMATION

Possibilities of predicting behavior in such systems are described in Charlie Tang and Russ Salakhutdinov, “Multiple Futures Prediction,” 2019, NeurIPS and in Sergio Casas and Cole Gulino and Simon Suo and Katie Luo and Renjie Liao and Raquel Urtasun, “Implicit Latent Variable Model for Scene-Consistent Motion Forecasting,” 2020 ECCV.

SUMMARY

By the computer-implemented method and device according to the present invention, precise prediction of a behavior of agents is achieved at low cost with regard to the required computing resources.

According to an example embodiment of the present invention, a method for predicting the behavior of agents in a dynamic system with a multiplicity of interacting agents depending on the latent state thereof provides that for a plurality of components and for a plurality of time points up to a prediction time point, a value of a first moment of a first distribution, which models the latent state of the agents, is determined for each component, wherein a value of a second moment of the first distribution is determined, wherein an expected value for a first moment of a second distribution at the prediction time point is determined for each component depending on the value of the first moment of the first distribution at the prediction time point and depending on the value of the second moment of the first distribution at the prediction time point, wherein the second distribution models the behavior of the agents depending on the latent state thereof, wherein the expected value for the first moment of the second distribution defines a first moment of a third distribution, wherein a second moment of the third distribution is determined for each component, wherein a sum, in particular a sum weighted with at least one weight, of the third distributions of the component is determined, and wherein the prediction of the behavior is determined depending on the sum.

Preferably, according to an example embodiment of the present invention, it is provided that the value of the first moment of the first distribution is determined depending on a value of the first moment of the first distribution for a time point preceding the time point and on an expected value for a deterministic change of the first moment of the first distribution, and/or that the value of the second moment of the first distribution for the time point is determined depending on a value of the second moment of the first distribution for a time point preceding the time point and on a covariance of the deterministic change and on an expected value for a stochastic change of the second moment of the first distribution. This efficiently recursively determines the respective value.

The value of the second moment of the first distribution for the time point is preferably determined depending on the value of the second moment of the first distribution for the preceding time point and on the covariance of the deterministic change and on a covariance of the latent state at the preceding time point with the deterministic change and on a transpose of the covariance of the latent state at the preceding time point with the deterministic change and on the expected value for the stochastic change. This efficiently recursively determines the value.

Preferably, according to an example embodiment of the present invention, the expected value for the first moment of the second distribution is determined depending on the value of the first moment of the first distribution at the prediction time point. This efficiently determines the expected value.

Preferably, according to an example embodiment of the present invention, a covariance of the first moment of the second distribution is determined for each component depending on the value of the first moment of the second distribution at the prediction time point, wherein an expected value for the second moment of the second distribution at the prediction time point is determined for each component depending on a latent state at the prediction time point, wherein the second moment of the third distribution is determined for each component depending on the covariance of the first moment of the second distribution and on the expected value for the second moment of the second distribution at the prediction time point. The method can thus be performed particularly efficiently.

Preferably, according to an example embodiment of the present invention, a context variable is determined, which comprises an association, which associates at least one agent with another agent to be considered for predicting the behavior of this agent, and/or which characterizes a history of the dynamic system, wherein the first moment of the first distribution is determined depending on the context variable, and/or wherein the second moment of the first distribution is determined depending on the context variable, and/or wherein the expected value for the first moment is determined depending on the context variable, and/or wherein the first moment of the third distribution is determined depending on the context variable, and/or wherein the second moment of the third distribution is determined depending on the context variable, and/or wherein the at least one weight is determined for at least one component depending on the context variable. A neighborhood and/or a history of the agents is thereby considered.

According to an example embodiment of the present invention, the history is preferably determined depending on an observed behavior of the at least one agent, in particular a behavior which comprises the agent's position or movement, wherein the agent's position or movement is in particular acquired using a receiver for a satellite-based position determination system, or wherein at least one digital image is acquired, in particular using a sensor for digital images, preferably a camera, a LiDAR sensor, ultrasonic sensor, movement sensor, thermal imaging detector, and/or radar sensor, and the agent's position or movement is determined depending on at least one digital image, or wherein a signal is acquired using a speaker for receiving audible sound and the agent's position or movement is determined depending on the signal.

It may be provided that the context variable comprises a matrix, whose rows each represent one of the agents and whose columns each represent one of the agents, wherein at least one value, in particular a binary value, of an element of the matrix identified by a row and a column is determined and specifies whether or not the agent identified by the row is to be considered for the prediction for the agent identified by the column, or wherein at least one value, in particular a binary value, of an element of the matrix identified by a row and a column is determined and specifies whether or not the agent identified by the column is to be considered for the prediction for the agent identified by the row. The matrix represents a neighborhood to be considered. As a result, the calculation considers the most relevant agents in particular. As a result, the best possible prediction is calculated particularly efficiently.

Preferably, according to an example embodiment of the present invention, the first moment and the second moment of the first distribution are determined in iterations, wherein for a first one of the iterations for each component, a value of the first moment of the first distribution and a value of the second moment of the first distribution are determined, which depend on the context variable. The history is thereby considered particularly efficiently.

Preferably, according to an example embodiment of the present invention, it is provided that, for the prediction, latent states of an agent are modeled independently of one another and latent states of different agents are modeled independently of one another, or latent states of an agent are modeled independently of one another and corresponding elements of latent states of different agents are modeled dependently on one another, or different elements of a latent state of an agent are modeled dependently on one another and latent states of different agents are modeled independently of one another. This makes the calculation very efficient.

Preferably, according to an example embodiment of the present invention, at least one agent, in particular a computer-controlled machine, in particular a robot, preferably a vehicle, a household appliance, a driven machine, a manufacturing machine, a personal assistant, or an access control system is controlled depending on the prediction. This control is particularly robust.

The at least one agent may be an existing real object in the physical world.

According to an example embodiment of the present invention, the device comprises at least one processor and at least one memory, which are designed to perform the method. This device has advantages corresponding to those of the method.

According to an example embodiment of the present invention, a system comprises at least one agent, in particular a computer-controlled machine, in particular a robot, preferably a vehicle, a household appliance, a driven machine, a manufacturing machine, a personal assistant, or an access control system, wherein the agent or the system comprises the device, and wherein the device is designed to control the agent depending on the prediction. This system has advantages corresponding to those of the method.

According to an example embodiment of the present invention, a computer program comprises computer-readable instructions that, when executed by a computer, cause the method to run. This computer program has advantages corresponding to those of the method.

Further advantageous embodiments can be taken from the following description and the figures.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 shows a schematic representation of a device for predicting a behavior of agents in a dynamic system with a multiplicity of interacting agents, according to an example embodiment of the present invention.

FIG. 2 shows a behavior of agents in an exemplary dynamic system, according to an example embodiment of the present invention.

FIG. 3 shows a prediction of the behavior of the agents in the dynamic system, according to an example embodiment of the present invention.

FIG. 4 shows steps in a method for predicting, according to an example embodiment of the present invention.

FIGS. 5A-5D show examples of neural networks, according to an example embodiment of the present invention.

FIG. 6 shows a schematic representation of approximations of a covariance matrix, according to an example embodiment of the present invention.

DETAILED DESCRIPTION OF EXAMPLE EMBODIMENTS

FIG. 1 schematically shows a device 100 for predicting a behavior of agents 102 in a dynamic system 104 with a multiplicity of interacting agents 102. The dynamic system 104 in the example is a physical system, in particular a technical system. The agents 102 may be physical systems, in particular technical systems. The agents 102 may be existing real objects in the physical world. The device 100 comprises at least one processor 106 and at least one memory 108. The device 100 is designed to perform a method, described below, for predicting the behavior of the agents 102 in the dynamic system 104. The device 100 optionally comprises an interface 110. The agents 102 optionally comprise an interface 112. The device 100 and the agents 102 are optionally designed to communicate via their interfaces, for example in order to transmit information about a behavior of the agents 102 from the agents 102 to the device 100 or to send information about a prediction of the behavior from the device 100 to the agents 102.

A sensor system 114 may be provided that is designed to acquire information about the behavior of the agents 102 in the dynamic system 104. The sensor system 114 may be designed to measure a physical property of the agents 102. The agents 102 are optionally designed to provide information about their own behavior or about the behavior of other agents 102. For example, the information about their own behavior is acquired using the sensor system 114. For example, the sensor system 114 is arranged in one or more of the agents 102 and designed to acquire the information about the own behavior of the respective agent 102 and/or the behavior of the other agents 102. The sensor system 114 is, for example, designed to acquire a position or movement of the agents 102. The sensor system 114 may comprise a receiver for a satellite-based position determination system, e.g., a global positioning system, or a sensor for digital images, such as a camera, a LiDAR sensor, ultrasonic sensor, movement sensor, thermal imaging detector, and/or radar sensor. The sensor system 114 may comprise a speaker for receiving audible sound and for generating audio signals. It may be provided that the sensor system 114 is arranged in an infrastructure 116 in which the agents 102 can move and is at least intermittently connected via a communication link 118 to the interface 110 of the device 100. Instead of data from the sensor system 114, data may be provided that comprise information about the agents 102, in particular data structured in a graph.

The agents 102 are optionally designed to determine their own behavior depending on the prediction of the behavior of the other agents 102. For example, the agents 102 each comprise an actuator 120 that is designed to influence the behavior of the respective agent 102 depending on the prediction. It may also be provided that the device 100 is designed, instead of transmitting the prediction to the agents 102, to determine a control command for at least one agent 102 depending on the prediction and to transmit the control command to the agent(s) 102 to be controlled. In this case, the actuator 120 is designed to execute the control command. It may be provided that the device 100 is integrated in one or more of the agents 102.

Likewise provided is a computer program that contains instructions that, when executed by a computer, cause this method to run. For example, the at least one processor 106 executes the computer program.

FIG. 2 shows a behavior of agents 102 in an exemplary system 104. In the example, the behavior of the agents 102 is observed, wherein FIG. 2 shows trajectories on which the agents 102 have actually moved according to an observation of their behavior from a start time point of the observation to an end time point of the observation.

For example, the dynamic system 104 is a technical system in which the agents 102 are computer-controlled machines, e.g., robots, such as vehicles, household appliances, driven machines, manufacturing machines, personal assistants, or access control systems.

The dynamic system 104 may also be another system. For example, the dynamic system 104 is a molecular dynamics system in which the agents 102 are atoms or molecules whose movements are being predicted. For example, the dynamic system 104 is a game, such as a soccer, basketball, or American football game, in which the agents 102 are people or game equipment, e.g., a ball, whose movements are being predicted.

The dynamic system 104 in the example is a roundabout 202. In the example, the roundabout 202 has a first entry 204, a second entry 206, a third entry 208, and a fourth entry 210. In the example, the roundabout 202 has a first exit 212, a second exit 214, a third exit 216, and a fourth exit 218. The agents 102 in the example include vehicles. It may be provided that the agents 102 include pedestrians. From the start time point, a first vehicle moves on a first observed trajectory 220 from the first entry 204 in the roundabout 202 and, at the end time point, is located in the roundabout 202 in the area of the second exit 214. From the start time point, a second vehicle moves on a second observed trajectory 222 from the area of the second exit 214 in the roundabout 202, exits the roundabout 202 via the third exit 216 and, at the end time point, is located outside the roundabout 202. From the start time point, a third vehicle moves on a third observed trajectory 224 from the second entry 206 in the roundabout 202 and, at the end time point, is located in the roundabout 202 in the area of the third exit 216. From the start time point, a fourth vehicle moves on a fourth observed trajectory 226 from an area in the roundabout 202 between the second exit 214 and the second entry 206 in the roundabout 202 and, at the end time point, is located in the fourth exit 218. From the start time point, a fifth vehicle moves on a fifth observed trajectory 228 from an area in the roundabout 202 between the third entry 208 and the fourth exit 218 in the roundabout 202 and, at the end time point, is located in the first exit 212. From the start time point, a sixth vehicle moves on a sixth observed trajectory 230 in the area of the fourth entry 210 until the end time point.

FIG. 3 shows a prediction of the behavior of the agents 102 in the dynamic system 104 using the example of the roundabout 202.

From the start time point, the first vehicle moves on the first observed trajectory 220 until an observation end time point. In the example, the first vehicle does not move but is located in the first entry 204 until the observation end time point. A first predicted trajectory 320 between the observation end time point and a prediction end time point is determined for the first vehicle. According to the prediction, the first vehicle moves from the first entry 204 in the roundabout 202 and, at the end time point, is located in the roundabout 202 in the area of the second exit 214.

From the start time point until the observation end time point, the second vehicle moves on the second observed trajectory 222 to an area in the roundabout 202 between the second entry 206 and the third exit 216. This portion of the second observed trajectory 222 is shown with dashed lines in FIG. 2 and in FIG. 3. A second predicted trajectory 322 between the observation end time point and the prediction end time point is determined for the second vehicle. According to the prediction, the second vehicle moves from the area in the roundabout 202 between the second entry 206 and the third exit 216 in the roundabout 202, exits the roundabout 202 via the third exit 216 and, at the end time point, is located outside the roundabout 202.

From the start time point until the observation end time point, the third vehicle moves on the third observed trajectory 224. In the example, the third vehicle does not move but is located in the second entry 206 until the observation end time point. A third predicted trajectory 324 between the observation end time point and the prediction end time point is determined for the third vehicle. According to the prediction, the third vehicle moves from the second entry 206 in the roundabout 202 and, at the end time point, is located in the roundabout 202 in the area of the third exit 216.

From the start time point until the observation end time point, the fourth vehicle moves on the fourth observed trajectory 226 to an area in the roundabout 202 between the third exit 216 and the third entry 208. This portion of the fourth observed trajectory 226 is shown with dashed lines in FIG. 2 and in FIG. 3. A fourth predicted trajectory 326 between the observation end time point and the prediction end time point is determined for the fourth vehicle. According to the prediction, the fourth vehicle moves from an area in the roundabout 202 between the third exit 216 and the third entry 208 in the roundabout 202 and, at the end time point, is located in the fourth exit 218.

From the start time point until the observation end time point, the fifth vehicle moves on the fifth observed trajectory 228 to an area in the roundabout 202 between the fourth exit 218 and the fourth entry 210. This portion of the fifth observed trajectory 228 is shown with dashed lines in FIG. 2 and in FIG. 3. A fifth predicted trajectory 328 between the observation end time point and the prediction end time point is determined for the fifth vehicle. According to the prediction, the fifth vehicle moves from an area in the roundabout 202 between the fourth exit 218 and the fourth entry 210 in the roundabout 202 and, at the end time point, is located in the first exit 212.

From the start time point until the observation end time point, the sixth vehicle moves on the sixth observed trajectory 230. In the example, the sixth vehicle does not move but is located in the fourth entry 210 until the observation end time point. A sixth predicted trajectory 330 between the observation end time point and the prediction end time point is determined for the sixth vehicle. According to the prediction, the sixth vehicle moves in the area of the fourth entry 210 until the end time point.

The predictions, i.e., the predicted trajectories in the example, are approximated as a Gaussian mixture distribution. The moments of the Gaussian mixture distribution are determined using the method described below depending on a portion of the behavior respectively observed for the individual agents 102, i.e., in the example, the observed portion, shown in dashed lines, of the respective observed trajectory.

In the example, 95% confidence intervals are visualized for the prediction with respect to the other portion shown of the observed trajectories.
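As an illustration of how a 95% region could be derived for a two-dimensional Gaussian position prediction, the following minimal sketch computes the semi-axes of a confidence ellipse from a covariance matrix. The function name `confidence_ellipse`, the example covariance, and the choice of an elliptical region are assumptions made for illustration; the figures may use a different visualization.

```python
import numpy as np

def confidence_ellipse(cov, level=0.95):
    """Return semi-axis lengths and axis directions of the confidence
    ellipse of a 2-D Gaussian with covariance `cov`."""
    # For 2 degrees of freedom the chi-square quantile has the
    # closed form -2 * ln(1 - level).
    scale = -2.0 * np.log(1.0 - level)
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
    semi_axes = np.sqrt(scale * eigvals)
    return semi_axes, eigvecs

# Hypothetical position covariance of one predicted time point (in m^2).
cov = np.array([[4.0, 0.0],
                [0.0, 1.0]])
semi_axes, directions = confidence_ellipse(cov)
```

The columns of `directions` give the orientation of the ellipse axes, so the region can be drawn around the predicted mean position.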

The prediction of the trajectories, i.e., a time profile of positions of the agents 102, is an example. It may also be provided to determine the prediction for a distance between the agents 102, a velocity or an acceleration of the agents 102.

The behavior of the agents 102, in the example that of the vehicles, is observed for a specified time period. The prediction is determined depending on the behavior observed in the specified time period. In one example, at least one of the vehicles is an autonomous vehicle. The prediction represents a simulation of an environment of the at least one autonomous vehicle, wherein the at least one autonomous vehicle is controlled depending on the prediction.

The prediction is determined in the example by means of machine learning of a model, wherein the prediction is determined using the model.

This is described below for a latent variable $X=\{x_t\}_{t=0}^{T}$ with $x_t\in\mathbb{R}^{MD_x}$ and an observed variable $Y=\{y_t\}_{t=0}^{T}$ with $y_t\in\mathbb{R}^{MD_y}$, wherein $x_t$ comprises the latent states of a set of M agents 102, $x_t^m\in\mathbb{R}^{D_x}$ is a latent state of an agent m at a time point t, $D_y$ is the dimension of the observed state of one agent, and $y_t$ is a state of the agents 102 at the time point t, which is defined by

$x_0 \sim p(x_0 \mid I)$

$x_t = x_{t-1} + f(x_{t-1}, I) + L(x_{t-1}, I)\, w_{t-1}, \quad t = 1, \ldots, T$

$y_t \sim N(y_t \mid g(x_t), QQ^T(x_t))$

wherein

- $x_t$ is a latent state of the agents 102,
- $x_0$ is an initial value for the latent state of the agents 102 at the start time point t=0,
- $f(x_t, I): \mathbb{R}^{MD_x}\times\mathbb{R}^{D_I}\to\mathbb{R}^{MD_x}$ is a deterministic change in the latent state $x_t$ of the agents 102, which change is modeled in the example as a neural network parameterized with parameters $\theta_f$,
- $L(x_t, I): \mathbb{R}^{MD_x}\times\mathbb{R}^{D_I}\to\mathbb{R}^{MD_x\times MD_x}$ is a stochastic change in the latent state $x_t$ of the agents 102, which change is modeled in the example as a neural network parameterized with parameters $\theta_L$, wherein $\theta=\{\theta_f, \theta_L\}$ denotes these parameters,
- $I\in\mathbb{R}^{D_I}$ is a context variable that comprises an association N which associates each agent 102 with other agents 102 to be considered for predicting the behavior of this agent 102, and that comprises a history that characterizes the behavior for each agent 102,
- $w_t\in\mathbb{R}^{MD_x}$ is a random variable from a normal distribution with zero mean and identity covariance, through which a disturbance variable is introduced,
- $N(y_t \mid g(x_t), QQ^T(x_t))$ is a normal distribution whose mean value
- $g(x_t): \mathbb{R}^{MD_x}\to\mathbb{R}^{MD_y}$ is modeled by a non-linear neural network parameterized with parameters $\psi_g$, wherein the covariance thereof
- $QQ^T(x_t): \mathbb{R}^{MD_x}\to\mathbb{R}^{MD_y\times MD_y}$ is determined by a variable Q, which is assumed to be constant or is modeled by a non-linear neural network parameterized with parameters $\psi_Q$, wherein $\psi=\{\psi_g,\psi_Q\}$ denotes these parameters.
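The generative structure defined above can be made concrete with a short simulation. The following is a minimal sketch only: the functions `f`, `L`, and `g` are toy stand-ins for the neural networks, the sizes are hypothetical, the context variable I is omitted, and the initial state is drawn from a single Gaussian rather than from a mixture.

```python
import numpy as np

rng = np.random.default_rng(0)
M, D_x, D_y, T = 3, 2, 2, 5            # hypothetical sizes

def f(x):
    # Toy deterministic change (stand-in for the neural network f).
    return -0.1 * x

def L(x):
    # Toy diagonal stochastic change (stand-in for the neural network L).
    return 0.05 * np.eye(x.size)

def g(x):
    # Toy emission mean (stand-in for the neural network g); here D_y == D_x.
    return x

Q = 0.01 * np.eye(M * D_y)             # Q assumed constant

x = rng.normal(size=M * D_x)           # draw of x_0 (single Gaussian, an assumption)
ys = []
for t in range(1, T + 1):
    w = rng.normal(size=M * D_x)       # w_{t-1} with zero mean, identity covariance
    x = x + f(x) + L(x) @ w            # x_t = x_{t-1} + f + L w
    y = rng.multivariate_normal(g(x), Q @ Q.T)  # y_t ~ N(g(x_t), Q Q^T)
    ys.append(y)
ys = np.stack(ys)                      # shape (T, M * D_y)
```

Sampling in this way yields one possible future per run, whereas the method described here propagates moments deterministically instead of drawing samples.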

The variable $Y=\{y_t\}_{t=0}^{T}$ comprises the states of the dynamic system 104, in particular the states $y_t$ of the agents 102 at the time points t. In the example, the agents 102 are the vehicles and the variable Y comprises the observed portions of the trajectories. The variable $X=\{x_t\}_{t=0}^{T}$ comprises the latent states of the dynamic system 104. In the example, the variable X comprises the latent states $x_t$ of the agents 102 at the time points t. The latent states $x_t$ comprise further information for a reliable prediction of a respective future state $y_{t+1}$ of the agents 102. The latent state $x_t$ at the time point t comprises, for example, the accelerations or velocities of the vehicles at the time point t.

The prediction is determined below for a number M of agents 102 denoted hereinafter by m.

For them, the deterministic change is

$f (x_{t}, I) = [\begin{matrix} \overline{f} (x_{t}^{1}, x_{t}^{N_{1}}, I) \\ ⋮ \\ \overline{f} (x_{t}^{M}, x_{t}^{N_{M}}, I) \end{matrix}]$

and the stochastic change is

$L (x_{t}, I) = diag [\begin{matrix} \overline{L} (x_{t}^{1}, x_{t}^{N_{1}}, I) \\ ⋮ \\ \overline{L} (x_{t}^{M}, x_{t}^{N_{M}}, I) \end{matrix}]$

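wherein $\overline{f}(x_t^m, x_t^{N_m}, I): \mathbb{R}^{D_x}\times\mathbb{R}^{D_x}\times\mathbb{R}^{D_I}\to\mathbb{R}^{D_x}$ denotes an update to the deterministic change,

wherein $\overline{L}(x_t^m, x_t^{N_m}, I): \mathbb{R}^{D_x}\times\mathbb{R}^{D_x}\times\mathbb{R}^{D_I}\to\mathbb{R}^{D_x}$ denotes an update to the stochastic change,

wherein $x_t^{N_m}\in\mathbb{R}^{D_x}$ is a message for the agent m at the time point t, which message is determined as

$x_t^{N_m} = \mathrm{AGG}(x_t, \varepsilon)^m$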

wherein N_m={e^m,m′|e^m,m′=1}_m′=1^Mdenotes the first information item N for the agent m and an operation AGG(x_t,ε):R^MD^x×R^M×M→R^MD^xhas an output m for each agent, wherein the m-th agent is associated with the m-th output, wherein ε∈R^M×Mdenotes edges of a graph that define a relationship of the agents to one another. In the example, the relationship of the agents to one another is a binary value.

After t prediction steps, this model considers correlations between agents 102 that have a distance from one another of at most t steps. In one example, distance means how many edges have to be followed to get from an agent m to an agent m′. The distance may be infinite if an agent is not connected to any other agent.
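The notion of distance used here can be illustrated with a breadth-first search over the binary edges. The following is a minimal sketch; the function name `hop_distances` and the example graph are assumptions for illustration.

```python
from collections import deque

import numpy as np

def hop_distances(edges, source):
    """Number of edges that must be followed from `source` to each agent.

    Unreachable agents keep the distance infinity.
    """
    M = edges.shape[0]
    dist = np.full(M, np.inf)
    dist[source] = 0
    queue = deque([source])
    while queue:
        m = queue.popleft()
        for k in np.flatnonzero(edges[m]):
            if np.isinf(dist[k]):          # not visited yet
                dist[k] = dist[m] + 1
                queue.append(k)
    return dist

# Chain 0 - 1 - 2 plus an isolated agent 3 (hypothetical example).
edges = np.array([[0, 1, 0, 0],
                  [1, 0, 1, 0],
                  [0, 1, 0, 0],
                  [0, 0, 0, 0]])
dist = hop_distances(edges, source=0)
```

With this graph, agents 0 and 2 are two steps apart, so a correlation between them would only be considered from the second prediction step onward, and agent 3 is never reached.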

The prediction for a prediction time point T is a marginal probability $p(y_T \mid I)$, which is given as a nested integral

$p(y_T \mid I) = \int p(y_T \mid x_T)\, p(x_T \mid x_0, I)\, p(x_0 \mid I)\, dx_T\, dx_0$

with a probability $p(y_T \mid x_T)$, a kernel $p(x_T \mid x_0, I)$, and a Gaussian mixture model (GMM) $p(x_0 \mid I)$.

The kernel p(x_T|x₀,I) is approximated for each time step t by a normal distribution N(x_t|μ_t(I),Σ_t(I)) with a mean value μ_t(I) and a covariance Σ_t(I), wherein

$\mu_t(I) = \mu_{t-1}(I) + E[f(x_{t-1}, I)]$

$\Sigma_t(I) = \Sigma_{t-1}(I) + \mathrm{Cov}[f(x_{t-1}, I)] + \mathrm{Cov}[x_{t-1}, f(x_{t-1}, I)] + \mathrm{Cov}[x_{t-1}, f(x_{t-1}, I)]^T + E[LL^T(x_{t-1}, I)]$

wherein E denotes the expected value, and Cov denotes the covariance, and wherein $\mathrm{Cov}[x_{t-1}, f(x_{t-1}, I)]$ denotes the cross-covariance between the random vectors $x_{t-1}$ and $f(x_{t-1}, I)$ in its arguments.
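The recursion can be checked on a case in which all expectations are available in closed form. The following minimal sketch assumes a linear deterministic change f(x) = A x and a constant stochastic change L, which is a simplification of the neural networks used in the example; the names `propagate_moments` and `A` are hypothetical.

```python
import numpy as np

def propagate_moments(mu, Sigma, A, L):
    """One step of the mean and covariance recursion for f(x) = A x
    and a constant matrix L."""
    mean_f = A @ mu                 # E[f(x)]
    cov_f = A @ Sigma @ A.T         # Cov[f(x)]
    cross = Sigma @ A.T             # Cov[x, f(x)]
    mu_next = mu + mean_f
    Sigma_next = Sigma + cov_f + cross + cross.T + L @ L.T
    return mu_next, Sigma_next

mu0 = np.array([1.0, 0.0])
Sigma0 = np.array([[1.0, 0.2],
                   [0.2, 0.5]])
A = np.array([[-0.1, 0.05],
              [0.0, -0.2]])
L = 0.1 * np.eye(2)

mu1, Sigma1 = propagate_moments(mu0, Sigma0, A, L)

# Reference for the linear case: x_1 = (Id + A) x_0 + L w
F = np.eye(2) + A
Sigma1_exact = F @ Sigma0 @ F.T + L @ L.T
```

For this linear case the five terms of the covariance update reproduce the exact covariance `Sigma1_exact`, which shows why both the cross-covariance and its transpose appear in the update.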

The function ƒ(x,I) is implemented in the example as a neural network. The function L(x,I) is implemented in the example as a neural network. The function g(x) is implemented in the example as a neural network. The function Q(x) is implemented in the example as a neural network.

An expected value and a covariance for an output of the respective neural network are determined as described, for example, in Anqi Wu, Sebastian Nowozin, Edward Meeds, Richard E. Turner, Jose Miguel Hernandez-Lobato, and Alexander L. Gaunt: “Deterministic Variational Inference for Robust Bayesian Neural Networks,” in ICLR, 2019a (Anqi Wu).

The cross-covariance $\mathrm{Cov}[x_t, f(x_t, I)]$ is approximated, for example, by

$\mathrm{Cov}[x_t, f(x_t, I)] = \mathrm{Cov}[x_t]\, E[\nabla_{x_t} f(x_t, I)]$

wherein the expected value for the Jacobi matrix is approximated as in Andreas Look, Jan Peters, and Melih Kandemir: “Deterministic Inference of Neural Stochastic Differential Equations,” arXiv, abs/2006.08973, 2020, (Andreas Look).

$E [\nabla_{x_{t}} f (x_{t}, I)] \approx \prod_{l = 1}^{L} E [J_{t}^{l}]$

wherein $J_t^l$ is the Jacobi matrix in the layer l of the neural network at the time point t.
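The cross-covariance expression and the layer-wise expected Jacobi matrices can be illustrated on a small network with one hidden layer. The following minimal sketch assumes a ReLU activation, deterministic hypothetical weights `W1`, `b1`, `W2`, and a Gaussian input; it takes the gradient in the formula as the transpose of the Jacobi matrix with entries ∂f_i/∂x_j, which is an assumption about the convention. A Monte Carlo estimate serves only as a reference.

```python
import math

import numpy as np

def std_normal_cdf(z):
    # Standard normal CDF via the error function.
    return 0.5 * (1.0 + np.vectorize(math.erf)(z / math.sqrt(2.0)))

rng = np.random.default_rng(0)
W1 = np.array([[1.0, -0.5], [0.3, 0.8], [-0.7, 0.2]])   # hypothetical weights
b1 = np.array([0.1, -0.2, 0.0])
W2 = np.array([[0.5, -1.0, 0.4], [0.2, 0.3, -0.6]])

mu = np.array([0.2, -0.1])
Sigma = np.array([[0.5, 0.1],
                  [0.1, 0.3]])

# Moments of the hidden pre-activation h = W1 x + b1.
mu_h = W1 @ mu + b1
var_h = np.einsum("ij,jk,ik->i", W1, Sigma, W1)   # diagonal of W1 Sigma W1^T

# Expected layer Jacobi matrices: W1, diag(P(h > 0)), W2,
# multiplied in chain-rule order.
expected_jacobian = W2 @ np.diag(std_normal_cdf(mu_h / np.sqrt(var_h))) @ W1
cross_cov = Sigma @ expected_jacobian.T

# Monte Carlo reference for Cov[x, f(x)].
x = rng.multivariate_normal(mu, Sigma, size=200_000)
fx = np.maximum(x @ W1.T + b1, 0.0) @ W2.T
cross_cov_mc = (x - x.mean(0)).T @ (fx - fx.mean(0)) / (len(x) - 1)
```

With a single non-linear layer the product of expected layer Jacobi matrices is exact, so the two estimates agree up to sampling noise. With several non-linear layers the product is an approximation, because the layer Jacobi matrices are then no longer independent.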

FIG. 4 shows steps in a method for the prediction $p(y_T \mid I)$ of a behavior $y_t=\{y_t^m\}_{m=1}^{M}$ of agents m in the dynamic system 104 with a multiplicity M of interacting agents m.

The prediction $p(y_T \mid I)$ is determined depending on the latent state $x_t$ of the agents m. The method comprises two loops, an inner loop and an outer loop.

For a first one of the iterations, the initial latent state $x_0$ is taken from a Gaussian mixture model with V components v, which is defined by the normal distribution $N(x_0 \mid \mu_0(I), \Sigma_0(I))$.

The first moment $\mu_t$ and the second moment $\Sigma_t$ of the normal distribution $N(x_t \mid \mu_t(I), \Sigma_t(I))$ are determined in the inner loop in iterations. Initially, for each component v, a value of the first moment $\mu_{0,v}$ and a value of the second moment $\Sigma_{0,v}$ of the normal distribution $N(x_{0,v} \mid \mu_{0,v}(I), \Sigma_{0,v}(I))$ are determined. In the example, these values depend on the context variable I.

The values of the moments $\mu_{0,v}$ and $\Sigma_{0,v}$ are determined as a function of the context variable I by a further neural network. An example of this neural network is shown in FIG. 5A. It has 30 fully connected layers and a tanh activation, followed by a layer for the operation AGG, followed by 64 fully connected layers and a tanh activation, followed by a fully connected layer for the values of the first moment $\mu_{0,v}$ and by a further fully connected layer with exp activation.

The inner loop is calculated for a plurality V of components v and for a plurality of time points t up to a prediction time point T. The inner loop is calculated for the prediction time point T for the plurality V of components v.

The normal distribution $N(x_t \mid \mu_t(I), \Sigma_t(I))$ models the latent state $x_t$ of the agents m. The normal distribution $N(y_t \mid g(x_t), QQ^T(x_t))$ models the behavior $y_t$ of the agents m depending on the latent state $x_t$ thereof.

In the method, a normal distribution $N(a_{T,v}(I), B_{T,v}(I))$ models a behavior of individual components v.
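The way the per-component normal distributions combine into one prediction can be sketched as a weighted sum. The following minimal example evaluates the density of such a mixture and its overall mean and covariance; the function names, the weights, and the component moments are hypothetical values for illustration and are not taken from the described example.

```python
import numpy as np

def gaussian_pdf(y, mean, cov):
    # Density of a multivariate normal distribution at y.
    d = y - mean
    norm = np.sqrt((2.0 * np.pi) ** mean.size * np.linalg.det(cov))
    return np.exp(-0.5 * d @ np.linalg.solve(cov, d)) / norm

def mixture_pdf(y, weights, means, covs):
    # Weighted sum of the component densities.
    return sum(w * gaussian_pdf(y, a, B)
               for w, a, B in zip(weights, means, covs))

def mixture_moments(weights, means, covs):
    # Overall mean and covariance of the mixture.
    weights, means, covs = map(np.asarray, (weights, means, covs))
    mean = np.einsum("v,vi->i", weights, means)
    diffs = means - mean
    cov = (np.einsum("v,vij->ij", weights, covs)
           + np.einsum("v,vi,vj->ij", weights, diffs, diffs))
    return mean, cov

weights = [0.5, 0.5]                                   # one weight per component v
means = [np.array([0.0, 0.0]), np.array([2.0, 0.0])]   # a_{T,v}
covs = [np.eye(2), np.eye(2)]                          # B_{T,v}

mean, cov = mixture_moments(weights, means, covs)
density = mixture_pdf(np.array([1.0, 0.0]), weights, means, covs)
```

In this example the two components differ only along the first coordinate, so the mixture covariance is widened along that coordinate only, which is how several distinct possible futures show up in a single predictive distribution.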

In a step 402, the context variable I is specified. The context variable I comprises, in one example, the association $N^m$, which associates at least one agent m with another agent m to be considered for predicting the behavior of this agent m. The context variable I in the example is given.

The association $N^m$ in one example is a matrix whose rows each represent one of the agents m and whose columns each represent one of the agents m.

In the example, a value, in particular a binary value, is determined for each element of the matrix.

In one example, the value of an element identified by its row and its column in the matrix specifies whether or not the agent m identified by the row is to be considered for the prediction for the agent m identified by the column.

In one example, the value of an element identified by its row and its column in the matrix specifies whether or not the agent m identified by the column is to be considered for the prediction for the agent m identified by the row.

For example, the relationships of the agents m to one another are modeled using the graph, wherein the values ε of the edges are determined such that, in the graph, agents m′ neighboring an agent m are considered for the prediction thereof.

The context variable I comprises, in one example, the history of the dynamic system 104.

The context variable I in the example is used to determine the moments, expected values and weights, the argument of which comprises the context variable I.

The plurality V of components v is determined in the example with a neural network whose input variables comprise the history of the dynamic system 104 and the edges ε from the context variable I. In one example, the history of the system 104 is defined by the observed behavior of agents m, in particular the observed portion of the trajectories.

The edges in the example in the matrix N^m are binary values 0 or 1, which, for example, indicate with the value 1 that an edge exists between two nodes and are otherwise zero. The latent state x_t^m of an agent m at the time point t is represented by a node in the graph.

The trajectories are defined in one example by a temporal sequence of two-dimensional or three-dimensional geographic coordinates, which indicate a temporal sequence of positions of the vehicles.

Using the operation AGG in the example, the messages x_t^N^m are determined depending on the matrix N^m and the one-dimensional input variable. The messages x_t^N^m are concatenated with the one-dimensional input variable and mapped using the neural network onto the values of the first moment μ_0,v and the value of the second moment Σ_0,v.

For example, the neural network is a graph neural network. The latter is designed, for example, as described in Peter W. Battaglia, Jessica B. Hamrick, Victor Bapst, Alvaro Sanchez-Gonzalez, Vinicius Flores Zambaldi, Mateusz Malinowski, Andrea Tacchetti, David Raposo, Adam Santoro, Ryan Faulkner, Caglar Gulcehre, H. Francis Song, Andrew J. Ballard, Justin Gilmer, George E. Dahl, Ashish Vaswani, Kelsey R. Allen, Charles Nash, Victoria Langston, Chris Dyer, Nicolas Heess, Daan Wierstra, Pushmeet Kohli, Matthew Botvinick, Oriol Vinyals, Yujia Li, and Razvan Pascanu; “Relational inductive biases, deep learning, and graph networks;” arXiv, abs/1806.01261, 2018.

In a step 404, for the plurality V of components v and for the plurality of time points t=1, . . . , T, in iterations, until the prediction time point T, for each component v, a value of the first moment μ_t of the normal distribution N(x_t|μ_t(I),Σ_t(I)) is determined.

The value of the first moment μ_t is determined recursively in the example. This means that the value of the first moment μ_t at a time point t is determined depending on a value of the first moment μ_t−1 for a time point, e.g., t−1, preceding the time point t.

The following description is based on a tool which can be used to determine an expected value E[f(x)] of a function f(x), a covariance matrix Cov(f(x)) of the function f(x) and a cross-covariance matrix Cov(x,f(x)). The expected value E[f(x)] and the covariance matrix Cov(f(x)) are determined, for example, as described in Anqi Wu. The cross-covariance matrix Cov(x,f(x)) is determined, for example, as described in Andreas Look.

The tool requires that layers, the moments of which can be calculated at the output, are used in the neural network to determine the expected value E[f(x)], the covariance matrix Cov(f(x)) and the cross-covariance matrix Cov(x,f(x)). The operation AGG(x_t,ε) is used for this purpose.

The operation AGG(x_t,ε) is implemented, for example, as a mean value aggregation in the respective neural network, wherein, for a layer l of the neural network, the message x_t^l,N^m in the time step t for the agent m

$x_{t}^{l, N_{m}} = \frac{1}{|N_{m}|} \sum_{m' \in N_{m}} x_{t}^{l, m'}$

is determined.

For example, for a set of messages x_t^l,N, the Kronecker product ⊗ is used to determine, depending on the expected value E[x_t^l] for the messages from a layer l, the expected value

$E[x_{t}^{l, N}] = (A \otimes I_{D_{x,l}}) E[x_{t}^{l}]$

and the covariance

$\mathrm{Cov}[x_{t}^{l, N}] = (A \otimes I_{D_{x,l}}) \mathrm{Cov}[x_{t}^{l}] (A \otimes I_{D_{x,l}})^{T}$

wherein

- A∈R^M×M is an adjacency matrix with normalized rows that comprises the information ε regarding the edges in matrix form, and I_D_x,l is an identity matrix of dimension D_x,l×D_x,l. The Jacobi matrix is available as J_t^l=A⊗I_D_x,l.
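The propagation of the moments through the mean value aggregation can be sketched as follows. This is a minimal sketch, assuming that the states of the M agents are stacked agent by agent into one vector; the function names and the numerical values are hypothetical.

```python
import numpy as np

def normalized_adjacency(n):
    """Row-normalize a binary association matrix (rows without edges stay zero)."""
    n = np.asarray(n, dtype=float)
    deg = n.sum(axis=1, keepdims=True)
    return np.divide(n, deg, out=np.zeros_like(n), where=deg > 0)

def agg_moments(a, mean, cov, dim):
    """Propagate mean and covariance of the stacked agent states through AGG.

    mean has shape (M*dim,) and cov has shape (M*dim, M*dim).
    """
    j = np.kron(a, np.eye(dim))  # Jacobi matrix A (x) I
    return j @ mean, j @ cov @ j.T, j

# Hypothetical example: 3 agents with 2-dimensional messages.
N = np.array([[0, 1, 1], [1, 0, 0], [1, 1, 0]])
A = normalized_adjacency(N)
mean = np.arange(6, dtype=float)        # agent states [0, 1], [2, 3], [4, 5]
cov = np.diag(np.arange(1.0, 7.0))
agg_mean, agg_cov, J = agg_moments(A, mean, cov, dim=2)
```

In this example, the aggregated mean for the first agent is the average of the means of the second and third agents, and the aggregated covariance remains symmetric because it has the form J Cov J^T.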

The tool requires that, in a layer l of the neural network, the same affine transformation with the same weight matrix W^l and the same bias b^l is carried out for each agent m. In one example, the calculation takes place for all agents m together using a Kronecker product:

$E[x_{t}^{l+1}] = {\hat{W}}^{l} E[x_{t}^{l}] + {\hat{b}}^{l}$

$\mathrm{Cov}[x_{t}^{l+1}] = {\hat{W}}^{l} \mathrm{Cov}[x_{t}^{l}] ({\hat{W}}^{l})^{T}$

with

$x_{t}^{l} = [\begin{matrix} x_{t}^{l, 1} \\ x_{t}^{l, 2} \\ ⋮ \\ x_{t}^{l, M} \end{matrix}]$

${\hat{W}}^{l} = [\begin{matrix} W^{l} & 0 & \dots & 0 \\ 0 & W^{l} & \dots & 0 \\ ⋮ & ⋮ & ⋱ & ⋮ \\ 0 & 0 & \dots & W^{l} \end{matrix}]$

${\hat{b}}^{l} = [\begin{matrix} b^{l} \\ b^{l} \\ ⋮ \\ b^{l} \end{matrix}]$

wherein the Jacobi matrix is available as J_t^l=Ŵ^l.
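As a minimal sketch of the affine layer, the block-diagonal matrix can be built as a Kronecker product of an identity matrix with the shared weight matrix. The sketch uses the standard identity that a constant bias shifts the mean but leaves the covariance unchanged; the function name and the numerical values are hypothetical.

```python
import numpy as np

def affine_moments(w, b, mean, cov, num_agents):
    """Moments of the stacked states after one shared affine layer W x + b."""
    w_hat = np.kron(np.eye(num_agents), w)  # block-diagonal with W repeated
    b_hat = np.tile(b, num_agents)          # bias repeated for each agent
    new_mean = w_hat @ mean + b_hat
    new_cov = w_hat @ cov @ w_hat.T         # a constant bias does not change the covariance
    return new_mean, new_cov, w_hat

# Hypothetical example: 2 agents, layer mapping 2 to 3 dimensions.
W = np.array([[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]])
b = np.array([0.5, 0.0, -1.0])
mean = np.array([1.0, 2.0, 3.0, 4.0])
cov = np.eye(4)
new_mean, new_cov, W_hat = affine_moments(W, b, mean, cov, num_agents=2)
```

Because the stacked weight matrix is block-diagonal, agents that are uncorrelated at the input of the layer remain uncorrelated at its output.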

The value of the first moment μ_t is determined in the example depending on the expected value E[ƒ(x_t−1,I)] for the deterministic change ƒ(x_t−1,I) of the first moment μ_t.

In one example, the value of the first moment μ_t is determined as follows:

μ_t(I)=μ_t−1(I)+E[ƒ(x_t−1,I)]

In one example, the expected value E[ƒ(x_t−1,I)], i.e., the change in the first moment μ_t, is determined by means of the tool, as described in Anqi Wu, depending on the edges ε from the context variable I and the distribution of the latent state x_t−1 at the preceding time point.

Using the operation AGG in the example, the messages x_t^N^m are determined depending on the matrix N^m and the latent state x_t−1 at the preceding time point. The messages x_t^N^m are concatenated with the state x_t−1 at the preceding time point. The tool is used to determine the expected value E[ƒ(x_t−1, I)].

An example of a neural network ƒ(x_t−1, I) with a layer for the operation AGG, which is followed by 24 fully connected layers and a ReLu activation, which is followed by a fully connected layer and a ReLu activation, which is followed by another fully connected layer, is shown in FIG. 5b.

In a step 406, for the plurality V of components v and for the plurality of time points t=1, . . . , T until the prediction time point T, for each component v, a value of a second moment Σ_t of the normal distribution N(x_t|μ_t(I), Σ_t(I)) is determined.

The value of the second moment Σ_t is determined recursively in the example. This means that the value of the second moment Σ_t for the time point t is determined depending on a value of the second moment Σ_t−1 for a time point, e.g., t−1, preceding the time point t.

In one example, the value of the second moment Σ_t is determined depending on the covariance Cov[ƒ(x_t−1, I)] of the deterministic change ƒ(x_t−1, I) and on the expected value E[LL^T(x_t−1, I)] for the stochastic change L(x_t−1, I) of the second moment Σ_t.

It may be provided that the value of the second moment Σ_t for the time point t is determined depending on the value of the second moment Σ_t−1 for the preceding time point, e.g., t−1, and the covariance Cov[ƒ(x_t−1, I)] of the deterministic change ƒ(x_t−1, I) and the covariance Cov[x_t−1, ƒ(x_t−1, I)] of the latent state x_t−1 at the preceding time point, e.g., t−1, with the deterministic change ƒ(x_t−1, I) and the transpose of the covariance Cov[x_t−1, ƒ(x_t−1, I)] of the latent state x_t−1 at the preceding time point, e.g., t−1, with the deterministic change ƒ(x_t−1, I) and the expected value E[LL^T(x_t−1, I)] for the stochastic change L(x_t−1, I):

$\Sigma_{t}(I) = \Sigma_{t-1}(I) + \mathrm{Cov}[f(x_{t-1}, I)] + \mathrm{Cov}[x_{t-1}, f(x_{t-1}, I)] + \mathrm{Cov}[x_{t-1}, f(x_{t-1}, I)]^{T} + E[LL^{T}(x_{t-1}, I)]$

The expected value E[LL^T(x_t−1, I)], i.e., the change in the second moment Σ_t, in one example, is determined depending on edges ε from the context variable I and the latent state x_t−1 at the preceding time point. In the example, the tool is used to determine E[L] and Cov[L] and thus E[LL^T(x_t−1, I)]=Cov[L]+E[L]E[L]^T.
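The recursion for the first moment μ_t and the second moment Σ_t can be sketched as follows. This is a minimal sketch under simplifying assumptions that are not part of the description: the deterministic change is replaced by a linear map ƒ(x)=Fx and the stochastic change by a constant matrix L, so that all required moments are available in closed form instead of being determined by the tool for a neural network. The names F, L and the numerical values are hypothetical.

```python
import numpy as np

def moment_step(mu, sigma, f_mat, l_mat):
    """One recursion step for mu_t and Sigma_t with f(x) = F x and constant L."""
    e_f = f_mat @ mu                   # E[f(x)]
    cov_f = f_mat @ sigma @ f_mat.T    # Cov[f(x)]
    cov_xf = sigma @ f_mat.T           # Cov[x, f(x)]
    e_llt = l_mat @ l_mat.T            # E[L L^T] for a constant L
    mu_next = mu + e_f
    sigma_next = sigma + cov_f + cov_xf + cov_xf.T + e_llt
    return mu_next, sigma_next

def propagate(mu0, sigma0, f_mat, l_mat, num_steps):
    """Iterate the recursion for t = 1, ..., T."""
    mu, sigma = mu0, sigma0
    for _ in range(num_steps):
        mu, sigma = moment_step(mu, sigma, f_mat, l_mat)
    return mu, sigma

# Hypothetical example values.
F = np.array([[0.0, 0.1], [-0.1, 0.0]])
L = 0.05 * np.eye(2)
mu0 = np.array([1.0, 0.0])
sigma0 = 0.01 * np.eye(2)
mu_T, sigma_T = propagate(mu0, sigma0, F, L, num_steps=3)
```

For this linear case, one step is equivalent to Σ_t=(I+F)Σ_t−1(I+F)^T+LL^T, which makes the role of the two cross-covariance terms in the recursion visible.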

Using the operation AGG in the example, the messages x_t^N^m are determined depending on the matrix N^m and the latent state x_t−1 at the preceding time point. The messages x_t^N^m are concatenated with the state x_t−1 at the preceding time point and mapped onto the expected value E[LL^T(x_t−1, I)].

An example of a neural network L(x_t−1,I) with a layer for the operation AGG, which is followed by 24 fully connected layers and a ReLu activation, which is followed by a fully connected layer and a ReLu activation, is shown in FIG. 5c.

The inner loop includes steps 404 and 406.

In a step 408, for each component v, an expected value E[g(x_T,v)] for the first moment g(x_T,v) of the normal distribution N(y_t|g(x_t), QQ^T(x_t)) at the prediction time point T is determined. The distribution of y_T, in one example, is approximated by a Gaussian mixture model (GMM) y_T ˜ Σ_v π_v(I) N(y_T|a_T,v(I), B_T,v(I)).

In the example, for each component v, depending on the value of the first moment g(x_T,v) at the prediction time point T, a covariance Cov[g(x_T,v)] of the first moment g(x_T,v) is determined.

In a step 410, the expected value E[g(x_T,v)] for the first moment g(x_T,v) of the normal distribution N(y_t,v|g(x_t,v), QQ^T(x_t,v)) is determined depending on the value of the first moment μ_T,v at the prediction time point T.

In a step 412, a first moment a_T,v(I) of the normal distribution N(a_T,v(I),B_T,v(I)) is determined.

In the example, for each component v, depending on a latent state x_T,v at the prediction time point T, an expected value E[QQ^T(x_T,v)] for the second moment QQ^T(x_T,v) of the normal distribution N(y_t|g(x_t), QQ^T(x_t)) at the prediction time point T is determined by the tool, as described in Anqi Wu, as a function of x_t ˜ N(x_t|μ_t,v, Σ_t,v).

In the example, the expected value E[g(x_T,v)] defines the first moment a_T,v(I), e.g., by

$a_{T,v}(I) = E[g(x_{T,v})]$

The expected value E[g(x_T,v)] is determined in the example using the tool.

An example of a neural network g(x_t,v) with 24 fully connected layers and a ReLu activation, which is followed by a fully connected layer, is shown in FIG. 5d. For Q(x_t), a constant is assumed in the example, but more complex neural networks are also possible.

In a step 414, for each component v, a second moment B_T,v(I) of the normal distribution N(a_T,v(I),B_T,v(I)) is determined.

In the example, for each component v, depending on the covariance Cov[g(x_T,v)] of the first moment g(x_T,v) and on the expected value E[QQ^T(x_T,v)] for the second moment QQ^T(x_T,v) at the prediction time point T, the second moment B_T,v(I) of the normal distribution N(a_T,v(I),B_T,v(I)) is determined, e.g., by

$B_{T,v}(I) = \mathrm{Cov}[g(x_{T,v})] + E[QQ^{T}(x_{T,v})]$

The covariance Cov[g(x_T,v)] and the expected value E[QQ^T(x_T,v)] are determined in the example using the tool.
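The determination of the moments a_T,v(I) and B_T,v(I) of one component can be sketched as follows. As a simplifying assumption that is not part of the description, g is replaced by a linear map g(x)=Gx, and Q is assumed constant, so that the expected value and the covariance are available in closed form. The names and the numerical values are hypothetical.

```python
import numpy as np

def emission_moments(mu_T, sigma_T, g_mat, q_mat):
    """Moments a and B of one component for g(x) = G x and a constant Q."""
    a = g_mat @ mu_T                     # a = E[g(x_T)]
    cov_g = g_mat @ sigma_T @ g_mat.T    # Cov[g(x_T)]
    e_qqt = q_mat @ q_mat.T              # E[Q Q^T] for a constant Q
    return a, cov_g + e_qqt

# Hypothetical example: G reads out the first two of four latent elements.
G = np.array([[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]])
Q = 0.1 * np.eye(2)
mu_T = np.array([2.0, -1.0, 0.5, 0.3])
sigma_T = np.diag([0.4, 0.2, 0.1, 0.1])
a_T, B_T = emission_moments(mu_T, sigma_T, G, Q)
```

The second moment thus combines the uncertainty of the latent state, mapped through g, with the noise of the behavior itself.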

The outer loop includes steps 408 to 414.

In a step 416, a sum Σ_v=1^V π_v(I)N(a_T,v(I), B_T,v(I)) of the third distributions, i.e., of the normal distributions N(a_T,v(I), B_T,v(I)) of the components v, is determined, in particular a sum weighted by at least one weight π_v(I).

In a step 420, the prediction p(y_T|I) of the behavior y_T={y_T^m}_m=1^M is determined depending on the sum Σ_v=1^V π_v(I)N(a_T,v(I), B_T,v(I)), e.g.,

$p(y_{T} | I) = \sum_{v=1}^{V} \pi_{v}(I) N(a_{T,v}(I), B_{T,v}(I))$
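Evaluating the weighted sum of the normal distributions for an observed behavior can be sketched as follows. This is a minimal sketch with hypothetical function names and values; it works in the logarithmic domain with a log-sum-exp step, which is a common numerical choice and not prescribed by the description.

```python
import numpy as np

def gaussian_logpdf(y, mean, cov):
    """Log-density of a multivariate normal distribution N(y | mean, cov)."""
    diff = y - mean
    _, logdet = np.linalg.slogdet(cov)
    maha = diff @ np.linalg.solve(cov, diff)
    return -0.5 * (len(y) * np.log(2.0 * np.pi) + logdet + maha)

def mixture_logpdf(y, weights, means, covs):
    """log p(y | I) = log sum_v pi_v N(y | a_v, B_v), via log-sum-exp."""
    logs = np.array([np.log(w) + gaussian_logpdf(y, a, b)
                     for w, a, b in zip(weights, means, covs)])
    top = logs.max()
    return top + np.log(np.exp(logs - top).sum())

# Hypothetical two-component example in two dimensions.
weights = np.array([0.7, 0.3])
means = [np.array([0.0, 0.0]), np.array([2.0, 0.0])]
covs = [np.eye(2), 0.5 * np.eye(2)]
log_p = mixture_logpdf(np.array([0.0, 0.0]), weights, means, covs)
```

The negative of such a log-density, evaluated on observed trajectories, is the kind of quantity that a training can minimize.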

In a step 422, the prediction is output and/or at least one agent 102 is controlled depending on the prediction.

For example, the computer-controlled machine, the robot, the vehicle, the household appliance, the driven machine, the manufacturing machine, the personal assistant, or the access control system is controlled.

For example, the prediction for the molecular dynamics is determined and output. For example, the prediction for the movement during the game is determined and output.

The covariances for the different combinations of the latent states can be processed as a matrix of dimension MD_x×MD_x, which comprises blocks which are respectively defined by one of the covariances. In one example, it is provided that the matrix is approximated as a sparse matrix.

FIG. 6 shows a schematic representation of approximations of a covariance matrix for five agents A, B, C, D, E.

The latent state of an agent m comprises several elements in one example. For example, for the trajectories, the latent state comprises an element for a velocity of the agent m and an element for an acceleration of the agent m. The elements do not have to be physical quantities but may also relate to other aspects of a state of an agent.

In a first approximation of the matrix, only elements from the matrix located on the main diagonal of the matrix are used for the prediction, wherein other elements of the matrix are not considered. This means that the elements of the latent state of an agent m are modeled independently of one another and the latent states of different agents m are also modeled independently of one another. For example, the velocity of the agent m is modeled independently of its acceleration, and both the velocities of the different agents m and their accelerations are modeled independently of one another.

The main diagonal is shown in FIG. 6 as a solid diagonal line. In a second approximation, the elements of the latent state of an agent are modeled independently of one another and corresponding elements of the latent states of different agents are modeled dependently on one another. For example, the velocity and the acceleration of the same agent are modeled independently of one another, and the velocities of the different agents are modeled dependently on one another, and the accelerations of the different agents are modeled dependently on one another. This is represented in FIG. 6 by the solid line and the diagonal, dashed lines.

In a third approximation, different elements of the latent state of an agent are modeled dependently on one another and the latent states of different agents are modeled independently of one another. This is shown in FIG. 6 by the diagonal of the shaded blocks.
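The three approximations can be expressed as sparsity patterns of the covariance matrix. The following is a minimal sketch, assuming that the latent states are stacked agent by agent, so that each block of the matrix relates the elements of two agents; the function names are hypothetical.

```python
import numpy as np

def covariance_masks(num_agents, dim):
    """Boolean sparsity patterns for an (M*D) x (M*D) covariance matrix."""
    eye_m, eye_d = np.eye(num_agents), np.eye(dim)
    ones_m = np.ones((num_agents, num_agents))
    ones_d = np.ones((dim, dim))
    first = np.kron(eye_m, eye_d).astype(bool)    # main diagonal only
    second = np.kron(ones_m, eye_d).astype(bool)  # same element across agents
    third = np.kron(eye_m, ones_d).astype(bool)   # full block for each agent
    return first, second, third

def apply_mask(cov, mask):
    """Keep only the covariance entries selected by the pattern."""
    return np.where(mask, cov, 0.0)

# Five agents with two latent elements each, e.g., velocity and acceleration.
first, second, third = covariance_masks(num_agents=5, dim=2)
```

For five agents with two latent elements each, the full matrix has 100 entries, of which the first pattern keeps 10, the second 50 and the third 20.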

The parameters θ={θ_ƒ, θ_L} and ψ={ψ_g, ψ_Q} of the neural networks parameterized therewith are determined in the example in a training with a data set D={Y,I}, in the example the observed trajectories, by minimizing the expected negative logarithmic probability:

argmin_θ,ψ−log E[P(y_t|I)]

Claims

1. A computer-implemented method for predicting a behavior of agents in a dynamic system with a multiplicity of interacting agents depending on a latent state thereof, the method comprising the following steps: determining, for a plurality of components and for a plurality of time points up to a prediction time point, a value of a first moment of a first distribution, which models the latent state of the agents, for each component; determining a value of a second moment of the first distribution; determining an expected value for a first moment of a second distribution at a prediction time point, for each component depending on the value of the first moment of the first distribution at the prediction time point and depending on the value of the second moment of the first distribution at the prediction time point, wherein the second distribution models the behavior of the agents depending on the latent state thereof, wherein the expected value for the first moment of the second distribution defines a first moment of a third distribution; determining a second moment of the third distribution for each component; determining a sum, weighted with at least one weight, of the third distributions of the component; and determining the prediction of the behavior depending on the sum.
2. The method according to claim 1, wherein: the value of the first moment of the first distribution is determined depending on a value of the first moment of the first distribution for a time point preceding the time point and on an expected value for a deterministic change of the first moment of the first distribution, and/or the value of the second moment of the first distribution for the time point is determined depending on a value of the second moment of the first distribution for a time point preceding the time point and on a covariance of a deterministic change and on an expected value for a stochastic change of the second moment of the first distribution.
3. The method according to claim 2, wherein the value of the second moment of the first distribution for the time point is determined depending on the value of the second moment of the first distribution for the preceding time point and on the covariance of the deterministic change and on a covariance of the latent state at the preceding time point with the deterministic change and on a transpose of the covariance of the latent state at the preceding time point with the deterministic change and on the expected value for the stochastic change.
4. The method according to claim 1, wherein the expected value for the first moment of the second distribution is determined depending on the value of the first moment of the first distribution at the prediction time point.
5. The method according to claim 1, wherein a covariance of the first moment of the second distribution is determined for each component depending on the value of the first moment of the second distribution at the prediction time point, wherein an expected value for the second moment of the second distribution at the prediction time point is determined for each component depending on a latent state at the prediction time point, wherein the second moment of the third distribution is determined for each component depending on the covariance of the first moment of the second distribution and on the expected value for the second moment of the second distribution at the prediction time point.
6. The method according to claim 1, wherein a context variable is determined, which includes an association which associates at least one agent with another agent to be considered for predicting the behavior of this agent, and/or which characterizes a history of the dynamic system; and wherein: the first moment of the first distribution is determined depending on the context variable, and/or the second moment of the first distribution is determined depending on the context variable, and/or the expected value for the first moment is determined depending on the context variable, and/or the first moment of the third distribution is determined depending on the context variable, and/or the second moment of the third distribution is determined depending on the context variable, and/or the at least one weight is determined for at least one component depending on the context variable.
7. The method according to claim 6, wherein: the history is determined depending on an observed behavior of the at least one agent, the behavior which includes the agent's position or movement, and wherein: i) the agent's position or movement is acquired using a receiver for a satellite-based position determination system, or ii) at least one digital image is acquired using a sensor for digital images, and the agent's position or movement is determined depending on at least one digital image, or iii) a signal is acquired using a speaker for receiving audible sound and the agent's position or movement is determined depending on the signal.
8. The method according to claim 6, wherein the context variable includes a matrix, whose rows each represent one of the agents and whose columns each represent one of the agents, and wherein: i) at least one value of an element of the matrix identified by a row and a column is determined and specifies whether or not the agent identified by the row is to be considered for the prediction for the agent identified by the column, or ii) at least one value of an element of the matrix identified by a row and a column is determined and specifies whether or not the agent identified by the column is to be considered for the prediction for the agent identified by the row.
9. The method according to claim 6, wherein the first moment and the second moment of the first distribution is determined in iterations, wherein for a first one of the iterations for each component, a value of the first moment of the first distribution and a value of the second moment of the first distribution are determined, which depends on the context variable.
10. The method according to claim 1, wherein, for the prediction: i) latent states of each agent are modeled independently of one another and latent states of different agents are modeled independently of one another, or ii) latent states of each agent are modeled independently of one another and corresponding elements of latent states of different agents are modeled dependently on one another, or iii) different elements of a latent state of each agent are modeled dependently on one another and latent states of different agents are modeled independently of one another.
11. The method according to claim 1, wherein at least one agent is controlled depending on the prediction, the at least one agent including a computer-controlled machine, or a robot, or a vehicle, or a household appliance, or a driven machine, or a manufacturing machine, or a personal assistant, or an access control system.
12. The method according to claim 1, wherein the at least one agent is an existing real object in the physical world.
13. A device, comprising: at least one processor; and at least one memory; wherein the device is configured to predict a behavior of agents in a dynamic system with a multiplicity of interacting agents depending on a latent state thereof, the device configured to: determine, for a plurality of components and for a plurality of time points up to a prediction time point, a value of a first moment of a first distribution, which models the latent state of the agents, for each component; determine a value of a second moment of the first distribution; determine an expected value for a first moment of a second distribution at a prediction time point, for each component depending on the value of the first moment of the first distribution at the prediction time point and depending on the value of the second moment of the first distribution at the prediction time point, wherein the second distribution models the behavior of the agents depending on the latent state thereof, wherein the expected value for the first moment of the second distribution defines a first moment of a third distribution; determine a second moment of the third distribution for each component; determine a sum, weighted with at least one weight, of the third distributions of the component; and determine the prediction of the behavior depending on the sum.
14. A system, comprising: at least one agent including a computer-controlled machine, or a robot, or a vehicle, or a household appliance, or a driven machine, or a manufacturing machine, a personal assistant, or an access control system; and a device including: at least one processor; and at least one memory; wherein the device is configured to predict a behavior of agents in a dynamic system with a multiplicity of interacting agents depending on a latent state thereof, the device configured to: determine, for a plurality of components and for a plurality of time points up to a prediction time point, a value of a first moment of a first distribution, which models the latent state of the agents, for each component; determine a value of a second moment of the first distribution; determine an expected value for a first moment of a second distribution at a prediction time point, for each component depending on the value of the first moment of the first distribution at the prediction time point and depending on the value of the second moment of the first distribution at the prediction time point, wherein the second distribution models the behavior of the agents depending on the latent state thereof, wherein the expected value for the first moment of the second distribution defines a first moment of a third distribution; determine a second moment of the third distribution for each component; determine a sum, weighted with at least one weight, of the third distributions of the component; and determine the prediction of the behavior depending on the sum; wherein the device is configured to control the agent depending on the prediction.
15. A non-transitory computer-readable medium on which is stored a computer program including computer-readable instructions for predicting a behavior of agents in a dynamic system with a multiplicity of interacting agents depending on a latent state thereof, the instructions, when executed by a computer, causing the computer to perform the following steps: determining, for a plurality of components and for a plurality of time points up to a prediction time point, a value of a first moment of a first distribution, which models the latent state of the agents, for each component; determining a value of a second moment of the first distribution; determining an expected value for a first moment of a second distribution at a prediction time point, for each component depending on the value of the first moment of the first distribution at the prediction time point and depending on the value of the second moment of the first distribution at the prediction time point, wherein the second distribution models the behavior of the agents depending on the latent state thereof, wherein the expected value for the first moment of the second distribution defines a first moment of a third distribution; determining a second moment of the third distribution for each component; determining a sum, weighted with at least one weight, of the third distributions of the component; and determining the prediction of the behavior depending on the sum.

Priority Claims (1)

Number	Date	Country	Kind
10 2022 204 723.0	May 2022	DE	national
