1. Field of the Invention
The present invention relates to robotics, and particularly to a control system and method for multi-vehicle systems that uses nonlinear model predictive control to regulate navigation of multiple autonomous vehicles operating under automatic control.
2. Description of the Related Art
Researchers have addressed multi-vehicle control by implementing a potential fields formation control strategy, but they considered only a point-mass robot model. What is needed is to extend the potential fields formation control strategy so that one of the robots leads the others in an unknown environment while, at the same time, all the agents in the fleet keep their formation shape based on the potential fields.
Thus, a control system and method for multi-vehicle systems solving the aforementioned problems is desired.
The control system and method for multi-vehicle systems provides nonlinear model predictive control (NMPC) to regulate navigation of multiple autonomous vehicles (mobile robots) operating under automatic control. The system includes an NMPC controller and an NMPC algorithm. The NMPC controller includes an optimizer, a state predictor, and a state estimator. Data compression is accomplished using a neural networks approach.
These and other features of the present invention will become readily apparent upon further review of the following specification and drawings.
Similar reference characters denote corresponding features consistently throughout the attached drawings.
At the outset, it should be understood by one of ordinary skill in the art that embodiments of the present method can comprise software or firmware code executing on a computer, a microcontroller, a microprocessor, or a DSP processor; state machines implemented in application-specific or programmable logic; or numerous other forms without departing from the spirit and scope of the method described herein. The present method can be provided as a computer program, which includes a non-transitory machine-readable medium having stored thereon instructions that can be used to program a computer (or other electronic devices) to perform a process according to the method. The machine-readable medium can include, but is not limited to, floppy diskettes, optical disks, CD-ROMs, magneto-optical disks, ROMs, RAMs, EPROMs, EEPROMs, magnetic or optical cards, flash memory, or other types of media or machine-readable media suitable for storing electronic instructions.
The control system and method for multi-vehicle systems provides nonlinear model predictive control (NMPC) to regulate navigation of multiple autonomous vehicles (mobile robots) operating under automatic control. The system 100 includes an NMPC controller and an NMPC algorithm. The NMPC controller includes an optimizer, a state predictor, and a state estimator. Data compression is accomplished using a neural networks approach. As shown in
The multi-vehicles 110 have nonlinear discrete-time dynamics characterized by the relation:
x_{t+1} = f(x_t, u_t, w_t),   (1)
and the nonlinear output is:
y_t = h(x_t).   (2)
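As an illustrative sketch only, the maps f and h of eqns. (1)-(2) might take the form of a discretized unicycle model with an additive planar disturbance. The specific model, the sample period dt, and the choice of measured outputs below are assumptions for illustration, not the prescribed dynamics of the invention.

```python
import math

def f(x, u, w):
    """One step of hypothetical unicycle dynamics x_{t+1} = f(x_t, u_t, w_t).

    x = (px, py, theta); u = (v, omega); w = additive planar disturbance.
    The sample period dt is chosen for illustration only.
    """
    dt = 0.1
    px, py, theta = x
    v, omega = u
    wx, wy = w
    return (px + dt * (v * math.cos(theta) + wx),
            py + dt * (v * math.sin(theta) + wy),
            theta + dt * omega)

def h(x):
    """Output map y_t = h(x_t): here, position only (heading unmeasured)."""
    return (x[0], x[1])

# drive straight for one step from the origin with no disturbance
x1 = f((0.0, 0.0, 0.0), (1.0, 0.0), (0.0, 0.0))
y1 = h(x1)
```

The external input w enters non-additively in general; it is shown additive here only to keep the sketch minimal.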
Internal states xt, outputs yt, local control inputs ut and external inputs wt belong to the following constrained convex sets:
External input w will later be used to model the information communicated by other members of the team or by obstacles. In the current context of a single robotic vehicle (agent), we can use it to model any disturbance affecting the agent (e.g., wind gust, water current, turbulence) or information about an obstacle it has to avoid. The disturbance evolves according to the following nonlinear mapping:
w_{t+1} = g(w_t, \varphi_t),   (4)
where \varphi is an unknown input vector, possibly random. Since w_t is not additive, it can also be used to represent plant uncertainty. The actual state of the system is x_t, while the state predicted by a model at time t for the future time instant t+l is \tilde{x}_{t,t+l}. Assuming that the model of the system is not perfect, the nominal model actually used for state prediction is:
\tilde{x}_{t+1} = \tilde{f}(\tilde{x}_t, u_t, \tilde{w}_t).   (5)
Often, not all states are directly measurable, and even when they are, sensors may produce an output corrupted by noise, which leads to uncertainty. Therefore, the measured output is:
\tilde{y}_t = y_t + \xi_y.   (6)
Therefore, given the outputs measured by sensors, there is a need to estimate the states in a manner such that the effects of noise and uncertainty are mitigated. The assumption is that a state-estimation mechanism exists, such that the state is estimated with some bounded error \xi_x:
\tilde{x}_t = \tilde{x}_{t|t-1} + K_t(\tilde{y}_t - h(\tilde{x}_{t|t-1})),   (7)
where K_t is a time-varying nonlinear filter gain, which is assumed to be available, and \tilde{x}_{t|t-1} is the prior estimate. In the present method, the assumption is that this filter exists, such that:
\tilde{x}_t = x_t + \xi_x.   (8)
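The correction step of eqn. (7) can be sketched in scalar form as follows. The constant gain, the identity output map, and the numeric values are hypothetical; the disclosure only assumes that some such filter exists with bounded error.

```python
def estimate_state(x_prior, y_meas, h, K):
    """Correction step of eqn. (7):
        x_t = x_{t|t-1} + K_t * (y_t - h(x_{t|t-1})),
    written for a scalar state with a scalar gain K for illustration.
    """
    innovation = y_meas - h(x_prior)   # measurement residual
    return x_prior + K * innovation

# scalar example: identity output map, prior 1.0, measurement 2.0, gain 0.5
est = estimate_state(1.0, 2.0, lambda x: x, 0.5)
```

With a gain between 0 and 1, the estimate moves partway from the prior toward the measurement, which is the usual filtering trade-off between model trust and sensor trust.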
Moreover, assume the existence of another estimator for w, which produces the estimate {tilde over (w)}, such that:
\tilde{w}_t = w_t + \xi_w.   (9)
Without exact knowledge of the evolution of w_{t,t+N_p}, the nominal disturbance model used for prediction is:
\tilde{w}_{t+1} = \tilde{g}(\tilde{w}_t),   (10)
such that there is a bounded disturbance transition uncertainty due to disturbance model mismatch:
w_{t+1} = g(w_t, \varphi_t) + e_w.   (11)
Similarly, it is assumed that system model mismatch leads to system transition uncertainty, such that:
\tilde{f}(x_t, u_t, w_t) = f(x_t, u_t, w_t) + e_x.   (12)
Now, due to uncertainty, the constraint sets (3) for x and w will be larger than the constraint sets for \tilde{x} and \tilde{w}, such that:
Normally, NMPC is used for state regulation, i.e., it will usually steer the state to the origin or to an equilibrium state x_r = r, where r is a constant reference. This is generally true for the process industries. However, in mobile robotics, the control objective depends on the mission profile of the vehicle, as the target state may evolve over time rather than being constant. Trajectory tracking and path following are two fundamental control problems in mobile robotics. For tracking problems, the objective is to converge to a time-varying reference trajectory x_d(t) designed separately. In path following applications, on the other hand, the objective is to follow a reference path parameterized by geometric parameters rather than time. The path following problem can be reduced to a state-regulation task. Therefore, the control strategy of MPC is explained using the regulation problem as an example. Based on the control objective, let the vehicle have the finite-horizon optimization cost function given by:
where N_p and N_c are the prediction and control horizons. Cost function (14) consists of the transition cost h, the terminal cost h_f, and a robustness cost q (due to the effect of the external input). The control sequence u_{t,t+N_p} consists of two parts: the first N_c elements, which are free optimization variables, and the remaining elements, which are generated by the terminal control law.
The optimal control sequence that minimizes the finite horizon cost of eqn. (14) is:
subject to (1) the nominal state dynamics, eqn. (5); (2) the nominal disturbance dynamics, eqn. (10); (3) the control constraint, eqn. (3), and the tightened constraint sets, relation (13); and (4) the terminal state being constrained to an invariant terminal set X_f \subset \tilde{X}_{t+N_p}:
\tilde{x}_{t+l} \in X_f, \quad \forall l = N_c, \ldots, N_p.   (16)
The (in general suboptimal) sequence u_{t,t+N_p} yields the receding-horizon control law, in which only its first element is applied:
\Theta_t(\cdot) = u_t^0(\tilde{x}_t, \tilde{w}_t, N_p, N_c),   (17)
and the loop dynamic becomes:
x_{t+1} = f(x_t, \Theta_t(\cdot), w_t) = f_c(x_t, w_t),   (18)
with the closed-loop nonlinear map f_c(x, w). This process is repeated at every sampling instant, as illustrated in plots 200 of
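The receding-horizon loop of eqns. (17)-(18) can be sketched as follows. The toy scalar system, the fixed-point "optimizer", and the gain 0.5 are placeholders for the FHOCP solver; only the structure (apply the first element, shift the horizon, repeat) reflects the method.

```python
def nmpc_step(x_est, w_est, optimize):
    """Receding-horizon policy Theta_t of eqn. (17): solve the finite-horizon
    problem, then apply only the first element of the optimized sequence."""
    u_seq = optimize(x_est, w_est)   # full sequence u_{t, t+Np}
    return u_seq[0]                  # only u^0 closes the loop

def simulate(x0, w0, f, g, optimize, steps):
    """Closed loop x_{t+1} = f(x_t, Theta_t, w_t), eqn. (18), with the
    disturbance propagated by its nominal model w_{t+1} = g(w_t)."""
    x, w, traj = x0, w0, [x0]
    for _ in range(steps):
        u0 = nmpc_step(x, w, optimize)
        x = f(x, u0, w)
        w = g(w)
        traj.append(x)
    return traj

# toy scalar system x+ = x + u + w with a proportional stand-in "optimizer"
traj = simulate(4.0, 0.0,
                lambda x, u, w: x + u + w,   # plant
                lambda w: w,                 # static disturbance model
                lambda x, w: [-0.5 * x] * 3, # hypothetical FHOCP solution
                steps=3)
```

Even with this crude stand-in optimizer, the state contracts toward the origin at each resolve, which is the regulation behavior the FHOCP is designed to produce.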
The two classes of optimization problems solved in Algorithm 1 are offline and online. This overall algorithm consists of various ingredient algorithms, described below.
The terminal set X_f and the uncertainty ball \beta^n(c) are shown in the drawing. With actual constraints X and W, the tightened constraints are given by:
for l = 0, \ldots, N_p, where
Then, any (in general suboptimal) admissible control sequence u_{t,t+N_p} keeps the predicted state and disturbance within the tightened constraint sets for x and w.
Constraint tightening (19)-(20) is novel, as it is the first time that such a variety of uncertainty contributions has been considered simultaneously. Remarkably, the external input is not a constant or a random unknown, as is usually assumed, but is herein considered to evolve according to an uncertain nonlinear map. In addition, estimation errors and prediction errors are also considered. This leads to very general bounds on the prediction error, which can be specialized to specific cases (e.g., perfect measurement means \xi_x \to 0). Also worthy of note is the fact that we have not considered the model mismatch to be state-dependent, as this does not have an obvious practical application in mobile robotics. In fact, if the system is very nonlinear, one cannot expect the modeling error to decrease with state amplitude, as in many cases a larger state amplitude offers better model fidelity. Moreover, the FHOCP is recursively feasible.
Satisfying the constraints along the horizon depends on the future realization of the uncertainties, which are random. By assuming Lipschitz continuity of the nominal disturbance and state models, it is possible to compute bounds on the effect of the evolving uncertainties on the system. Since our system includes many possible sources of uncertainty, the bounds calculated are much more involved and comprehensive than those presented in the existing literature.
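The Lipschitz-based propagation just described can be sketched as a simple recursion. The exact coupling of the terms below (how the disturbance error feeds the state error) is an illustrative assumption; the disclosure derives a more comprehensive bound over all uncertainty sources.

```python
def prediction_error_bounds(Lf, Lw, e_x, e_w, xi_x, xi_w, Np):
    """Worst-case prediction error growth along the horizon under Lipschitz
    constants Lf (state model) and Lw (disturbance model), assumed recursion:
        rho_w[l+1] = Lw * rho_w[l] + e_w
        rho_x[l+1] = Lf * (rho_x[l] + rho_w[l]) + e_x
    starting from the estimation error bounds xi_x and xi_w."""
    rho_x, rho_w = [xi_x], [xi_w]
    for _ in range(Np):
        rho_w.append(Lw * rho_w[-1] + e_w)
        # state error at l+1 driven by state and disturbance errors at l
        rho_x.append(Lf * (rho_x[-1] + rho_w[-2]) + e_x)
    return rho_x, rho_w

rho_x, rho_w = prediction_error_bounds(Lf=1.0, Lw=1.0, e_x=0.1, e_w=0.05,
                                       xi_x=0.2, xi_w=0.1, Np=3)
```

The growing sequence rho_x is exactly what the tightened sets (19)-(20) subtract from the nominal constraints at each step l, so feasibility of the tightened problem implies feasibility of the true one.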
With respect to the convex optimal control problem (OCP (A)) for maximizing a terminal constraint set, the volume of the terminal constraint set X_f(a) = \{\tilde{x} : \tilde{x}^T Q_f \tilde{x} \le a\}, for a > 0, within the set M and with the associated cost functional, is maximized for the matrix variables
by solving:
The matrix \tilde{S} is a positive definite matrix. Additionally, if convergence with a given rate \hat{a} is required, then the OCP is subject to another condition:
Plots 500 of the drawings illustrate this condition for a positive definite matrix \tilde{S} \in R^{n \times n} such that -q(\tilde{x}, \tilde{w}) + \psi(\tilde{w}) \le \tilde{x}^T \tilde{S} \tilde{x}.
Most modern software packages select the initial iterate internally. For example, the SDPT3 semidefinite programming package can start with an infeasible starting point, as its algorithms try to achieve feasibility and optimality of the iterates simultaneously.
However, the performance of these algorithms is quite sensitive to the choice of the initial iterate. Reasonable initial guesses are often crucial, especially in non-convex optimization. It is advisable to choose an initial iterate that has at least the same order of magnitude as the optimal solution. Therefore, the present invention provides an initial guess of the optimization variables to warm-start Algorithm 3 by solving discrete-time algebraic Riccati equations (DARE) at each vertex point as follows:
Q_f^\infty = (Q - \tilde{S}) + A_v^T \big( Q_f^\infty - Q_f^\infty B_v (R + B_v^T Q_f^\infty B_v)^{-1} B_v^T Q_f^\infty \big) A_v,   (29)
where Q_f^\infty is the steady-state solution of the DARE at vertex v, and the corresponding vertex gain is:
K_v^\infty = (R + B_v^T Q_f^\infty B_v)^{-1} B_v^T Q_f^\infty A_v.
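The DARE warm start can be sketched in the scalar case, where the fixed-point iteration and the resulting gain are easy to verify by hand. The matrix case at each vertex (A_v, B_v) follows the same pattern; the plain fixed-point iteration below is an illustrative solver, not the one the invention prescribes.

```python
def dare_scalar(a, b, q, r, iters=200):
    """Fixed-point iteration of the scalar discrete-time algebraic Riccati
    equation  P = q + a*P*a - (a*P*b)**2 / (r + b*P*b),
    returning the steady-state cost P and the associated gain K.
    Used here only to illustrate warm-starting from a DARE solution."""
    P = q
    for _ in range(iters):
        P = q + a * P * a - (a * P * b) ** 2 / (r + b * P * b)
    K = (a * P * b) / (r + b * P * b)
    return P, K

# a = b = q = r = 1: the fixed point satisfies P**2 = 1 + P (golden ratio)
P, K = dare_scalar(a=1.0, b=1.0, q=1.0, r=1.0)
```

The solution P (and the gains at every vertex) gives the optimizer an initial iterate of the right order of magnitude, which is precisely the warm-start property discussed above.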
It should be understood that, given the ellipsoids \varepsilon_v^n defined by the matrices Q_f^\infty, which are solutions of the Riccati equations (29), the maximum-volume ellipsoid in the intersection of all \varepsilon_v^n is obtained by solving the following convex OCP (2) for some Lagrangian variables t_v:
Regarding determination of the 1-step controllable set to the terminal constraint set, the maximum allowable uncertainty is bounded by the minimum size of the 1-step controllable set to X_f, denoted by C_1(X_f). In particular, this bound on the uncertainty was shown to ensure recursive feasibility. Therefore, it is imperative to determine the minimum size of C_1(X_f), i.e., the minimum size of the 1-step controllability set to the terminal set X_f, defined as:
The relationship 700 between \tilde{X}_{t+N_p}, X_f, and C_1(X_f) is illustrated in the drawings.
OCP (3), min-max optimization of the one-step robust controllability set, is as follows. Given the target set X_f, the tightened constraints defined in (19)-(20), and the nominal constraints (3), let the boundary of X_f be discretized appropriately into
for i=1, . . . ,
\tilde{x}_f^i = \tilde{f}(\tilde{x}_{c_1}^i, u^i, \tilde{w}),
1 - \tilde{x}_{c_1}^{iT} Q_f^\infty \tilde{x}_{c_1}^i \le 0,
\tilde{x}_{c_1}^i \in \tilde{X}_N.
The boundary of C_1(X_f) is given as \partial(C_1(X_f)) = \{\tilde{x}_{c_1}^i\}, taken over all boundary discretization points i.
Plot 800 of
Algorithm 6 extends Algorithm 5 to solve for the entire feasibility region of the MPC algorithm. The robust one-step controllability set C_1(X_f) contains the target set X_f, i.e., X_f \subset C_1(X_f). The robust one-step controllability set C_1(X_f) to the terminal set X_f is contained in the one-step controllability set of the robust output feasible set X_{MPC}, i.e.:
C_1(X_f) \subset C_1(X_{MPC}).   (40)
The robust one-step controllability set C_1(X_f) to the terminal set X_f can be written as a finite union of polyhedra. The one-step controllable set operator can be used recursively to define the l-step controllable set C_l(X_f) as follows (for l \ge 2):
C_l(X_f) = C_1(C_{l-1}(X_f)).   (41)
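The recursion of eqn. (41) can be sketched for a scalar system with interval sets, where the one-step controllable set has a closed form. The dynamics x+ = a*x + b*u and the interval representation are illustrative assumptions; the disclosure works with unions of polyhedra.

```python
def one_step_controllable(target, u_bounds, a=1.0, b=1.0):
    """C_1 of an interval target for scalar dynamics x+ = a*x + b*u with
    u in [u_lo, u_hi]: all x from which some admissible u lands in target."""
    lo, hi = target
    u_lo, u_hi = u_bounds
    return ((lo - b * u_hi) / a, (hi - b * u_lo) / a)

def l_step_controllable(target, u_bounds, l):
    """C_l(X_f) = C_1(C_{l-1}(X_f)), eqn. (41), applied recursively."""
    s = target
    for _ in range(l):
        s = one_step_controllable(s, u_bounds)
    return s

# X_f = [-1, 1], u in [-1, 1]: each application widens the set by |u_max|
C2 = l_step_controllable((-1.0, 1.0), (-1.0, 1.0), 2)
```

Each application of the operator enlarges the set, which mirrors the containment X_f \subset C_1(X_f) stated above and, iterated N_c times, yields the feasibility region.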
The boundary of the target set, i.e., \partial(X_f), is included in the one-step controllable set C_1(X_f):
\partial(X_f) \subset C_1(X_f).   (42)
Given the terminal set X_f, the tightened constraints \tilde{x} \in \tilde{X}_{t+l}, \tilde{w} \in \tilde{W}_{t+l}, for l = 1, \ldots, N_c, and the control constraint u \in U, the robust feasibility set is obtained by N_c applications of the one-step controllable set operator C_1(\cdot) by recursively solving OCP (3), such that:
which can be generalized as follows:
Thus, the recursive XMPC Feasibility Region determination procedure is summarized in Table 6.
The method given in Algorithms 5-6 is computationally demanding. However, all of these algorithms are used offline in the proposed MPC scheme, so the computational burden is not an overriding concern. We must provide the caveat, however, that choosing initial conditions for higher-dimensional systems is far less intuitive. In that case, Algorithms 5-6 should be implemented with heuristic (non-gradient-based) approaches to avoid the problem of local minima.
Plot 1100 of
This method contemplates the distributed control of a fleet of autonomous agents. Often, the main task in multi-vehicle cooperative control is formation. Formation control means the ability to move the entire fleet with a common speed and heading. This invariably means that the vehicles in the team should be able to either sense the states of team members or receive state information from other team members. In most cases, however, the communication occurs wirelessly, as the agents are spread over a large area or it is not possible to maintain a tethered connection network due to the movement of the vehicles and the presence of obstacles. Also, due to the mobile nature of these vehicles, the on-board computational power is limited by size and power budgets. Therefore, distributed control 1800, as shown in
There are three basic elements in multi-agent formation control. Cohesion: attraction to distant neighbors up to a reachable distance. Alignment: velocity and average heading agreement with neighbors. Separation: repulsion from neighbors within some minimal distance; this is also called collision avoidance. Formation control without collision avoidance is also called state synchronization.
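The three primitives just listed can be sketched as follows. The 2-D tuples, unit gains, and the separation radius r_sep are illustrative choices, not parameters specified by the invention.

```python
def formation_terms(pos_i, vel_i, neighbors, r_sep=1.0):
    """Cohesion, alignment, and separation terms for agent i, given neighbor
    (position, velocity) pairs: cohesion pulls toward the neighbor centroid,
    alignment matches the average neighbor velocity, separation repels from
    neighbors closer than r_sep (collision avoidance).  Gains omitted."""
    n = len(neighbors)
    cx = sum(p[0] for p, _ in neighbors) / n - pos_i[0]
    cy = sum(p[1] for p, _ in neighbors) / n - pos_i[1]
    ax = sum(v[0] for _, v in neighbors) / n - vel_i[0]
    ay = sum(v[1] for _, v in neighbors) / n - vel_i[1]
    sx = sy = 0.0
    for p, _ in neighbors:
        dx, dy = pos_i[0] - p[0], pos_i[1] - p[1]
        d2 = dx * dx + dy * dy
        if 0.0 < d2 < r_sep * r_sep:   # inside the minimal distance: repel
            sx += dx / d2
            sy += dy / d2
    return (cx, cy), (ax, ay), (sx, sy)

# agent at the origin with two distant neighbors moving right
coh, ali, sep = formation_terms((0.0, 0.0), (0.0, 0.0),
                                [((2.0, 0.0), (1.0, 0.0)),
                                 ((0.0, 2.0), (1.0, 0.0))])
```

Dropping the separation term recovers pure state synchronization, consistent with the distinction drawn above.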
A dynamic neural network-based adaptive control scheme provides distributed fleet state synchronization without the need to know the local or leader (nonlinear) dynamics. Lyapunov analysis is used to derive the tuning rules, with the implicit need for persistent excitation, for both strongly connected and weakly (simply) connected networks. Plot 2300 of
We address the problem of leader-follower formation control of constrained autonomous vehicles subject to propagation delays. We consider the simultaneous presence of six sources of uncertainty: (i) error in estimating the current state; (ii) error in estimating the current external input (disturbance or external information); (iii) error in predicting the future system state due to model mismatch; (iv) error in predicting the future external input due to disturbance model mismatch (the disturbance model is another uncertain dynamic system with unknown input); (v) error in approximating the trajectory due to data compression; and (vi) error in approximating the last segments (tail) of the compressed trajectory due to propagation delays.
Limited network throughput demands a reduction in packet size. The proposed approach achieves formation tracking through NMPC such that each agent performs local optimization based on planned approximate trajectories received from its neighbors. Since exchanging the entire horizon would increase the packet size in proportion to the length of the prediction horizon, the trajectory is compressed using neural networks, which is shown to reduce the packet size considerably. Moreover, the method allows the agents to be heterogeneous, make asynchronous measurements, and have different local optimization parameters. Correction for propagation delays is achieved by time-stamping each communication packet. Collision avoidance is achieved by formulating a novel spatial-filtered potential field, which is activated in a "zone of safety" around the agent's trajectory. New theoretical results are presented along with validating simulations.
OCP (4): Consider a set of N agents, where each agent is denoted as Ai with i=1, . . . , N. Each agent has the following open loop nonlinear discrete-time dynamics described by
x_{t+1}^i = f_0^i(x_t^i, u_t^i), \quad \forall t \ge 0, \; i = 1, \ldots, N,   (45)
where f0i is a nonlinear map for local open loop dynamics, xti, uti are states and controls local to agent Ai. These variables belong to the following constraint sets:
One can observe that the agents' dynamics (45) are decoupled from each other in open loop. This is the standard case for most formation control problems. Focusing on a team of dynamically decoupled agents, and with measurements corrupted by sensor noise, we assume that the local states are estimated (locally) with bounded error \xi_x^i, such that:
\tilde{x}_t^i = x_t^i + \xi_x^i.
Even though the agents are dynamically decoupled, they need to cooperate with each other to perform the formation keeping task. To achieve this goal, a co-operation component is added to the cost functional (performance index) of each agent. To this end, define wti as an information vector transmitted to agent Ai by other agents in its neighborhood Gi, which consists of the states of these neighbors.
The external input to agent Ai in formation control task in a multi-agent system consists of state information of other agents in its neighborhood Gi, such that:
where M_i is the number of agents in the neighborhood of A_i. This external input, in the form of the information vector w^i, is driven by the dynamics of the neighboring systems, as below:
where gi is a nonlinear map composed of nonlinear dynamics of neighboring agents and their local inputs
We assume that the information vector is constrained to the following set:
Moreover, assume that we have an updatable approximation for wi, which produces the approximation {tilde over (w)}i, such that:
\tilde{w}_t^i = w_t^i + \xi_w^i,   (51)
We assume that we do not have exact knowledge of the evolution of the information over the horizon, i.e., of w_{t,t+N_p}^i, so the nominal information vector model used for prediction is:
\tilde{w}_{t+1}^i = \tilde{g}^i(\tilde{w}_t^i),   (52)
such that there is a bounded information vector transition uncertainty due to information vector model mismatch:
\tilde{g}^i(w_t^i) = g^i(w_t^i, \varphi_t^i) + e_w^i.   (53)
Now, let the distributed cost function of each agent be given as:
where N_p^i and N_c^i are the prediction and control horizons, respectively, according to the NMPC notation. The distributed cost function (54) consists of: (i) the local transition cost h^i, which is the cost to reach a local target state, which is embedded in the local alignment vector d_t^i
x_{t+1}^i = f^i(x_t^i, u_t^i, w_t^i), \quad \forall t \ge 0, \; i = 1, \ldots, N.   (55)
We assume that our model of agent dynamics is not perfect, such that the nominal model used for control synthesis is:
\tilde{x}_{t+1}^i = \tilde{f}^i(\tilde{x}_t^i, u_t^i, \tilde{w}_t^i),   (56)
where the actual state of the system is x_t^i, while the state predicted by the nominal model (56) is \tilde{x}_t^i. This system model mismatch leads to agent transition uncertainty such that:
\tilde{f}^i(x_t, u_t, w_t) = f^i(x_t, u_t, w_t) + e_x^i.   (57)
Now, due to uncertainty, the constraint sets for x^i and w^i will be 'larger' than the constraint sets for \tilde{x}^i and \tilde{w}^i, such that:
\tilde{x}_t^i \in \tilde{X}_t^i(\bar{e}_x^i, \tilde{\xi}_x^i, \bar{e}_w^i, \tilde{\xi}_w^i),   (58)
Distributed finite-horizon optimal control problem (OCP (4)): At every instant t \ge 0, given the prediction and control horizons N_p^i, N_c^i, the terminal control k_f^i(\tilde{x}^i): R^n \to R^m, the state estimate \tilde{x}_t^i, and the information vector approximation \tilde{w}_{t,t+N_p}^i, solve:
subject to (I) the nominal state dynamics (56); (II) the nominal information vector dynamics (52); (III) the tightened constraint sets (58); and (IV) the terminal state \tilde{x}_{t+N_p}^i being constrained to an invariant terminal set X_f^i:
\tilde{x}_{t+l}^i \in X_f^i, \quad \forall l = N_c^i, \ldots, N_p^i.   (60)
The loop is closed by implementing only the first element of u_{t,t+N_p}^i:
\Theta_t^i(\tilde{x}^i, \tilde{w}^i) = u_t^{0,i}(\tilde{x}_t^i, \tilde{w}_t^i, N_p^i, N_c^i),   (61)
and the closed loop dynamics becomes:
x_{t+1}^i = f^i(x_t^i, \Theta_t^i(\tilde{x}^i, \tilde{w}^i), w_t^i) = f_C^i(x_t^i, w_t^i),   (62)
with local closed loop nonlinear map fCi(x, w). This process is repeated every sampling instant, as illustrated in state and control plots 200 of
\tilde{J}_t^i = J_t^i(1 + \varphi_t^i).   (63)
A collision course 2100 as illustrated in
where R_min is the radius of the safety zone of an agent and d_{ij}^k is the Euclidean distance between two agents, the summation representing the total number of agents on a collision course with agent A_i. The repelling potential is formulated as:
Successful collision avoidance occurs if the weighted average distance between the agents on a collision course is increased during the next time instant, i.e.:
For an agent on a collision course, the optimal trajectory with the modified cost will avoid the collision while maintaining input-to-state practical stability if the repulsive spatial filter weights are computed at each sampling instant t as follows:
Successful collision avoidance is illustrated in
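The repelling potential and the success criterion above can be sketched as follows. The classic inverse-distance potential shown is a stand-in for the patent's spatial-filtered field, and the gain and distances are hypothetical values.

```python
def repulsive_potential(d, r_min, gain=1.0):
    """Illustrative repelling potential: zero outside the safety zone r_min,
    growing without bound as the inter-agent distance d shrinks inside it.
    The disclosure's spatial-filtered field is more elaborate."""
    if d >= r_min:
        return 0.0
    return 0.5 * gain * (1.0 / d - 1.0 / r_min) ** 2

def avoidance_succeeded(d_now, d_next, weights):
    """Success test from the text: the weighted average distance to agents
    on a collision course increases at the next time instant."""
    avg = lambda ds: sum(w * d for w, d in zip(weights, ds)) / sum(weights)
    return avg(d_next) > avg(d_now)

p = repulsive_potential(0.5, 1.0)                       # inside safety zone
ok = avoidance_succeeded([0.5, 0.8], [0.7, 0.9], [1.0, 1.0])
```

Because the potential is identically zero outside the safety zone, it perturbs the formation-tracking cost only when a collision is actually imminent, which is the "zone of safety" activation described earlier.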
With respect to neural network based trajectory compression, for cooperation, the agents transmit their planned state trajectories, as mentioned supra. These communication packets are received by vehicles within the neighborhood of the transmitting agents. The neighborhood may be defined based on communication range, the number of channels on the receiving agents, etc. With reference to neural network compression 2000 of
To reduce the packet size, this trajectory, containing n_i \times N_p^i floating-point values, is compressed by approximating it with a neural network N^i of q_i weights and biases, with compression factor C_w^i of
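The packet-size arithmetic behind this compression can be sketched as follows. The definition of the factor as the ratio of raw trajectory floats to network parameters is assumed from context, and the sizes used are hypothetical.

```python
def packet_floats_raw(n_i, Np_i):
    """Uncompressed planned trajectory: n_i state components x Np_i steps."""
    return n_i * Np_i

def compression_factor(n_i, Np_i, q_i):
    """Assumed definition of C_w^i: raw trajectory size divided by the q_i
    weights and biases of the approximating neural network N^i."""
    return packet_floats_raw(n_i, Np_i) / q_i

# e.g. 3 states over a 30-step horizon approximated by a 15-parameter net
cw = compression_factor(n_i=3, Np_i=30, q_i=15)
```

Since q_i is fixed by the network topology, the packet size no longer grows with the prediction horizon, which is the throughput benefit claimed above.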
Tail recovery of a useful part of the trajectory at reception time t is accomplished by tail prediction as follows:
\tilde{w}_{t+N-\Delta+1}^i = \tilde{g}^i(\tilde{w}_{t+N-\Delta}^i).
The preferred neural network for this computation is a two-layer NN, although other NN topologies are contemplated by the present method.
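The tail-prediction step can be sketched as follows: the time stamp on a received packet reveals the delay Δ in samples, and the receiver re-extends the decompressed trajectory by iterating the nominal information model that many times. The decaying scalar model is a placeholder for g̃^i.

```python
def recover_tail(w_last, g_tilde, delay_steps):
    """Tail recovery: a packet time-stamped at transmission arrives
    delay_steps (Delta) samples late, so the stale tail is rebuilt by
    iterating the nominal model w+ = g_tilde(w) Delta times from the last
    usable sample w_last."""
    tail = []
    w = w_last
    for _ in range(delay_steps):
        w = g_tilde(w)
        tail.append(w)
    return tail

# hypothetical nominal model: slow decay of the communicated information
tail = recover_tail(1.0, lambda w: 0.5 * w, delay_steps=3)
```

The recovered tail inherits the disturbance-model mismatch bound of eqn. (53), which is exactly uncertainty source (vi) in the list given earlier.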
The Distributed NMPC Algorithm for Formation Control procedure is summarized in Table 8.
Multi-agent prediction error bounds are analogous to the single agent prediction error bounds developed supra. With actual constraints Xi and Wi, the tightened constraints are given by:
The prediction error bound \tilde{\rho}_x^i is defined as:
The constraint tightening procedure for multi-agents distributed processing is summarized in Table 9.
The inputs to the procedure are the error bounds \bar{\xi}_x^i, \bar{e}_x^i, \bar{e}_w^i, and the horizons N_c^i, N_p^i.
It is to be understood that the present invention is not limited to the embodiments described above, but encompasses any and all embodiments within the scope of the following claims.