Systems, such as industrial process systems, utilize advanced controllers to control processes carried out by the systems. Controllers are purchased by many companies having a wide variety of systems to control, from boilers and oil refineries to ice cream manufacturing, heating and cooling plants, and packaging plants to name a few. Such controllers utilize many parameters that are measured values corresponding to different pieces of equipment making up the system. The measured values may correspond to temperatures, pressures, flow rates, and many other measurable and estimated values for parameters that cannot be directly measured. Variables may be set by the controllers to control the processes. To help set the variables, the systems may be modeled using complex modelling techniques. The modelling may be referred to as system identification.
System identification is about science (algorithms) and art (experience). Both parts are indispensable when developing mathematical models. In a white-box model, an engineer knows the system/process, all parameters are known or measurable, and the model may be constructed using first-principles knowledge. In a gray-box model, some of the parameters are unknown and must be estimated using data. Physically well-founded models are available for use, but there is no systematic approach that may be used to estimate gray-box model parameters. In a black-box model, the engineer knows only experimental data and the model is built using some standard regression models. Some well-designed algorithms exist for parameter estimation of black-box models.
At present, more and more complex models are being developed as more complex systems are being controlled or observed. It is apparent that such models require more information to be correctly identified. The information can be collected by measuring experimental/operational data, but this can be intractable or expensive. On the other hand, a substantial piece of information comes from first-principle modeling. Using this information in model derivation results in more robust models with respect to errors in data, and it also lowers the demand on the amount of experimental data required.
A natural way to describe first-principle models is by using a gray-box modeling approach. The gray-box model (GBM) effectively captures a user's knowledge about the system, which is typically described by a set of parameterized differential equations. The GBM is parameterized by a vector of unknown parameters that are to be identified using experimental data.
Whereas the GBM definition falls into the “art” part of the system identification, the parameter estimation problem is about the “science.” Looking for the optimal value of GBM parameters results in a hard optimization task, which is the main drawback of the gray-box modeling approach.
A method includes using first principles and engineering knowledge to define a continuous time nonlinear gray-box model of a system performing a process, defining a numerically tractable optimization problem for parameter estimation of the nonlinear gray-box model, tuning a vector of parameters of a static model of the nonlinear gray-box model, and extending the vector of parameters of the static model to a dynamic model by fitting measured transient data from the process.
An advanced controller having a parameterized model for controlling a system to control a process, the controller including a processor and a storage device having code and data to define a gray-box model of the system and corresponding parameters for the gray-box model. The parameters are determined in accordance with a method at least partially performed by a programmed computer. The method includes defining a numerically tractable optimization problem for parameter estimation of a general nonlinear gray-box model, tuning parameters of a static model of the nonlinear gray-box model, and extending the parameters of the static model to a dynamic model by fitting measured transient data from the process. A method includes estimating a non-linear steady state model representative of a system performing a process using steady-state data, the model providing tuned static parameters for an advanced controller to control the system, and using the non-linear steady state model with the tuned static parameters to tune dynamic gray-box model parameters using sub-sequences of transient experimental data.
An advanced controller having a parameterized model for controlling a system to perform a process. The controller includes a processor, and a storage device having code and data to define a dynamic gray-box model of the system and corresponding dynamic gray-box parameters for the gray-box model, wherein the parameters are determined in accordance with a method at least partially performed by a programmed computer. The method includes estimating a non-linear steady state model representative of a system performing a process using steady-state data, the model providing tuned static parameters for an advanced controller to control the system, and using the non-linear steady state model with the tuned static parameters to provide dynamic gray-box model parameters using sub-sequences of transient experimental data.
A method includes performing a steady-state model estimation to obtain a vector of steady state parameters for a non-linear model representative of a system controlled by an advanced controller to perform a process, and tuning the vector of steady-state parameters by using a local approximation for estimation of non-linear model parameters at different operating levels of the system to obtain local linear model vectors of gray-box parameters for the advanced controller.
An advanced controller having a parameterized model for controlling a system to perform a process includes a processor and a storage device having code and data to define a dynamic gray-box model of the system and corresponding dynamic gray-box parameters for the gray-box model, wherein the parameters are determined in accordance with a method at least partially performed by a programmed computer.
The method includes performing a steady-state model estimation to obtain a vector of steady state parameters for a non-linear model representative of a system controlled by the advanced controller, and tuning the vector of steady-state parameters by using a local approximation for estimation of non-linear model parameters at different operating levels of the system to obtain local linear model vectors of gray-box parameters for the advanced controller.
In the following description, reference is made to the accompanying drawings that form a part hereof, and in which is shown by way of illustration specific embodiments which may be practiced. These embodiments are described in sufficient detail to enable those skilled in the art to practice the invention, and it is to be understood that other embodiments may be utilized and that structural, logical and electrical changes may be made without departing from the scope of the present invention. The following description of example embodiments is, therefore, not to be taken in a limited sense, and the scope of the present invention is defined by the appended claims.
The functions or algorithms described herein may be implemented in software or a combination of software and human implemented procedures in one embodiment. The software may consist of computer executable instructions stored on computer readable media or computer readable storage device such as one or more memory or other type of hardware based storage devices, either local or networked. Further, such functions correspond to modules, which are software, hardware, firmware or any combination thereof. Multiple functions may be performed in one or more modules as desired, and the embodiments described are merely examples. The software may be executed on a digital signal processor, ASIC, microprocessor, or other type of processor operating on a computer system, such as a personal computer, server or other computer system.
Gray-box models are created for use by advanced controllers which use models to compute control signals to control a process performed by a system such as a boiler or other systems that include multiple components operating together to perform a process. Other uses of the models include model simulation and error detection, such as system malfunction detection, sensor error detection and other types of error detection. Many controllers utilize classic simple types of control, such as those based on proportional, integral, and derivative (PID) algorithms. More advanced controllers utilize a model of a system that depends on estimation of multiple parameters for the model to effectively control the system implementing the process.
Generally speaking, the model of a system and the controller can be considered independently. In practical application the process of developing models for control is iterative. Testing is performed to determine whether the model is good enough to be used by a controller. Various embodiments address the modeling phase independent of the controller.
Modelling may be performed to obtain a good model of a system (process), such that the output of the system (process) can be precisely predicted using current model state and future (manipulated) inputs.
“Good model”=a model which provides reliable predictions of the outputs of the system.
“Current model state”=state of the dynamical model, which accumulates the past history of the model, such that the future behavior is only a function of this state and future inputs to the system.
Such models are typically used in advanced controllers to compute the control actions (the future manipulated inputs) to the system so that the system behaves optimally.
One goal is to find this model, which can be used, e.g., in the controller to predict the behavior of the system and thus to optimize the control action. To compose the model, some prior information about the system and a set of measured input/output data, which is typically gathered from identification experiments, are used.
Given data and prior knowledge, a natural way to obtain a model is to use gray-box modelling. Gray-box models have a given structure that comes from the prior information/engineering knowledge and a list of (partially) unknown parameters, which have to be estimated using data. This estimation is done off-line and only the resulting model is provided to a controller. The drawback of this modelling method is the resulting non-convex and non-linear optimization task, which has to be solved by a computer (program) to find optimal values of the parameters. Optimal values, in this case, mean that the predicted outputs of the model are in accordance with the measured outputs of the real system.
In some embodiments described in the present application, the hard (off-line) optimization task of finding the optimal parameters of the gray-box model is solved. The solutions can be described from above (“what I want”) down to (“how to do it”) as follows:
Having a well-designed (i.e. numerically tractable) optimization problem for parameter estimation of a general (non-linear) gray-box model.
A two-step identification (tune the static model first, then extend it to a dynamic model by fitting transient data).
Fitting transient data using local linear approximation of the non-linear gray-box model and thus avoiding computationally expensive numerical integration.
Support the optimization algorithm by providing sensitivity to the optimizing parameters. In other words, provide a gradient of a cost function. The gradient is computed using prediction errors and their sensitivities to the parameters.
Fast evaluation of the computation of the prediction errors and their sensitivities (of the locally linearized gray-box models) using an extended discrete-time linear stochastic model.
Computation of the extended discrete-time linear stochastic model using the continuous-time gray-box model, as was provided by the user.
In various embodiments, parameters may be efficiently estimated for continuous time linear gray-box models using a fast sensitivity evaluation, which is a measure of how output of the model of the system changes with changes in parameters. In a further embodiment, a two-step parameter estimation of non-linear models, such as a steady-state model and dynamical model, is performed. In still a further embodiment, an efficient parameter estimation algorithm is employed for estimation of non-linear model parameters via local approximation at different system operating levels or operating set-points. The above embodiments provide a fast and reliable identification of first-principle models.
A block diagram of a system to be controlled is illustrated in
The system 130 may be any type of system that requires a controller to operate the system to perform a process properly and in an optimal manner. Example systems range from industrial boilers and oil refineries to ice cream manufacturing, heating and cooling plants, and packaging plants to name a few. Variability in input and environmental conditions may lead to difficulty of control of such systems. Many sensors 135 may be used to measure variables in the system, such as temperatures, pressures, flow rates. Such measurements are generally provided to the controller 115 to enable the controller to react to changes to obtain and maintain a desired set point 120 of the system, which again may be measured by the sensors.
A natural way to describe first-principle models for systems is by using a gray-box modeling approach. The gray-box model (GBM) effectively captures a user's knowledge about the system, which is typically described by a set of parameterized differential equations. The GBM is parameterized by a vector of unknown parameters that are to be estimated using experimental data.
In various embodiments, there are several improvements which may be used to enhance the performance of the parameter estimation task. One improvement includes efficient parameter estimation of linear, continuous-time, stochastic, gray-box models. A further improvement includes a two-step parameter estimation of non-linear, continuous-time, gray-box models. Yet another improvement includes employing the efficient parameter estimation algorithm for the non-linear gray-box models.
Parameter Estimation of Linear Models:
$$\dot{x}(t) = A(\theta)\,x(t) + B(\theta)\,u(t),$$
$$y(t) = C(\theta)\,x(t) + D(\theta)\,u(t). \tag{2.1}$$
where x(t)∈Rn
The goal is to estimate the model parameters at 230 using measured (experimental) data 225, which may be obtained from sensors 135 to arrive at a parameterized model 235 in one embodiment:
$$D_N = \{U_N, Y_N\}, \quad U_N = \{u(0), \ldots, u(N-1)\}, \quad Y_N = \{y(0), \ldots, y(N-1)\}. \tag{2.2}$$
Here $D_N$ denotes the measured input/output data sequence of N samples.
Because the measured data and unmeasured disturbances are stochastic in nature (i.e., they cannot be precisely predicted), the above-mentioned deterministic LTI GBM 215 is not sufficient for a good parameter estimation. A stochastic model 220 should be introduced.
Stochastic Model:
It is commonly known that choosing the right noise model is helpful in parameter estimation. Defining the noise properties within the GBM is difficult. There is usually a good knowledge of the deterministic part of the model; however, the stochastic part is typically hidden. The noise model may be utilized to set up filters, which are employed to estimate the current state of the system x(t). In the parameter estimation, the noise model may be used mainly to correctly estimate model parameters, whereas in the controller settings, the parameters are assumed to be valid and the noise model is used to estimate (or filter) the current state. This state is used together with the model to predict the output behavior of the system and thus to find an optimal control sequence.
Consider a continuous-time linear stochastic model with discrete-time measurements
dx(t)=Ax(t)dt+Bu(t)dt+dv(t),
y(tk)=Cx(tk)+Du(tk)+w(tk),
tk=k·Ts, where k∈Z,Ts>0, (2.3)
where the matrices A, . . . , D are taken from the deterministic description of the GBM, Ts is a sampling time, k is an integer, v(t)∈Rn
Note that defining the process noise in the continuous-time domain is unusual in system identification. Data are typically collected using regular sampling; i.e. a discrete-time domain appears to be more convenient. On the other hand, the continuous-time domain has an advantage in simple de-correlation of the noise-to-state impacts (compared to a discretized model definition), which results in a smaller number of stochastic parameters being required.
Due to the discrete-time measurements, it is convenient to translate the model 220 into the discrete-time domain. This can be easily done for the case of LTI models.
Discrete-Time Domain Translation (Discretization):
Translation of the model from the continuous time domain to the discrete time domain and model performance analysis may also be performed. Consider the stochastic model (2.3) to be periodically sampled with a sampling interval Ts. The corresponding discrete-time linear stochastic model has a description
x(k+1)=Mx(k)+Nu(k)+v(k),
y(k)=Cx(k)+Du(k)+w(k), (2.4)
where x(k)=x(tk), tk=kTs is a discrete-time state, and u(k)=u(τ), τ∈[kTs, (k+1)Ts) is a piece-wise constant input. The discrete-time noises are considered to have the Gaussian distribution and to be white; i.e.
The discrete-time matrices are
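$$M = \exp(A T_s), \qquad N = \int_0^{T_s} \exp(A\tau)\,d\tau\; B,$$
$$Q = \int_0^{T_s} \exp(A\tau)\, Q_c \exp(A^T\tau)\,d\tau.$$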
Note that the above equation can be easily solved using an augmented matrix exponential
and for the discrete-time covariance matrix we can use e.g. the following approximation
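By way of a non-limiting illustration, the discretization may be sketched in Python as follows. The sketch assumes zero-order-hold sampling and uses the augmented matrix exponential for M and N. For Q it substitutes Van Loan's exact block-exponential construction rather than an approximation. The function name `discretize` and the numeric values are hypothetical and chosen only for illustration.

```python
import numpy as np
from scipy.linalg import expm


def discretize(A, B, Qc, Ts):
    """Zero-order-hold discretization of dx = A x dt + B u dt + dv.

    Qc is the (assumed) continuous-time process noise intensity.
    Returns the discrete-time matrices M, N and the noise covariance Q.
    """
    n, m = B.shape
    # Augmented exponential: expm([[A, B], [0, 0]] * Ts) = [[M, N], [0, I]]
    aug = np.zeros((n + m, n + m))
    aug[:n, :n] = A
    aug[:n, n:] = B
    Phi = expm(aug * Ts)
    M, N = Phi[:n, :n], Phi[:n, n:]
    # Van Loan construction (swapped in here for the covariance integral):
    # expm([[-A, Qc], [0, A^T]] * Ts) = [[*, M^-1 Q], [0, M^T]]
    vl = np.zeros((2 * n, 2 * n))
    vl[:n, :n] = -A
    vl[:n, n:] = Qc
    vl[n:, n:] = A.T
    Psi = expm(vl * Ts)
    Q = Psi[n:, n:].T @ Psi[:n, n:]
    return M, N, Q


# Hypothetical two-state example; the values are illustrative only.
A = np.array([[0.0, 1.0], [-2.0, -3.0]])
B = np.array([[0.0], [1.0]])
Qc = np.diag([0.0, 0.1])
M, N, Q = discretize(A, B, Qc, Ts=0.1)
```

Both exponentials act on small block matrices, so the cost of the discretization is dominated by two calls to the matrix exponential per parameter value.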
Parameter Estimation:
The continuous-time LTI stochastic model and its discrete-time counterpart have been defined. Note that the discrete-time model (2.4) is parameterized by the same vector of deterministic parameters θ as the original continuous-time model (2.1); i.e.,
M=M(θ,Ts), N=N(θ,Ts),
C=C(θ), D=D(θ) (2.9)
and that the discretized process noise matrix Q depends on both η and θ
Q=Q(η,θ). (2.10)
A typical approach to parameter estimation problem is by using the prediction error method (PEM). The PEM tries to minimize (in some way) the prediction errors
$$e(k) = y(k) - \hat{y}(k|k-1), \tag{2.11}$$
where y(k) is a measured output and
$$\hat{y}(k|k-1) = E\{y(k) \mid D_{k-1}, u(k)\} \tag{2.12}$$
is the predicted output (the conditional expectation of the output) at time instant k, predicted using input/output data up to the time instant k−1 and input at time instant k. The predicted output is in this case defined by the well-known Kalman filter as follows
$$\hat{x}(k+1|k) = M\hat{x}(k|k-1) + N u(k) + K(k)\,\bigl(y(k) - \hat{y}(k|k-1)\bigr),$$
$$\hat{y}(k|k-1) = C\hat{x}(k|k-1) + D u(k). \tag{2.13}$$
Here, $\hat{x}(k+1|k)$ is a predicted state defined similarly to (2.12) and K(k) is the Kalman gain matrix driven by a discrete-time Riccati equation. (The cross-covariance matrix S is considered to be zero. The approach can be easily extended to non-zero S, which appears e.g. when a realistic sampling is considered.)
The P(k+1|k) corresponds to a covariance matrix of the state estimate.
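As a minimal, non-limiting sketch, the prediction errors (2.11) may be generated by running the predictor (2.13) over the data. The sketch assumes the standard Riccati recursion with S = 0, which agrees with the algebraic form (2.18). The function name `prediction_errors` and the example values are hypothetical.

```python
import numpy as np


def prediction_errors(M, N, C, D, Q, R, u, y, x0, P0):
    """One-step-ahead prediction errors e(k) = y(k) - yhat(k|k-1).

    u has shape (samples, inputs) and y has shape (samples, outputs).
    Assumes zero cross-covariance S between process and measurement noise.
    """
    x, P = x0.copy(), P0.copy()
    errors = []
    for uk, yk in zip(u, y):
        e = yk - (C @ x + D @ uk)               # prediction error (2.11)
        Sigma = C @ P @ C.T + R                 # innovation covariance
        K = M @ P @ C.T @ np.linalg.inv(Sigma)  # Kalman gain K(k)
        x = M @ x + N @ uk + K @ e              # state prediction (2.13)
        P = M @ P @ M.T + Q - K @ Sigma @ K.T   # Riccati recursion, S = 0
        errors.append(e)
    return np.array(errors)


# Hypothetical scalar example: noise-free data generated by the model itself.
M = np.array([[0.9]]); N = np.array([[0.1]])
C = np.array([[1.0]]); D = np.array([[0.0]])
u = np.ones((30, 1))
x_true, y = np.array([0.5]), []
for uk in u:
    y.append(C @ x_true + D @ uk)
    x_true = M @ x_true + N @ uk
y = np.array(y)
E = prediction_errors(M, N, C, D, np.zeros((1, 1)), np.eye(1),
                      u, y, np.array([0.5]), np.zeros((1, 1)))
```

With exact parameters, a known initial state and no noise, the errors vanish; a parameter mismatch shows up as non-zero errors, which is what the prediction error method exploits.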
The prediction error is a preliminary part for any PEM that will use the LTI GBM and measured data to estimate unknown (or uncertain) parameters. The basic algorithm works as follows:
Here, p is a vector combining all unknown parameters, e.g. $p = [\theta^T\ \eta^T]^T$, and the prediction errors e(0), . . . , e(N−1) are functions of the parameter values.
It is important to mention that the resulting optimization problem is very difficult and is the main drawback of the gray-box modeling approach. The optimization problem is non-linear and non-convex with all its issues, such as multiple local minima, iterative optimization, a computationally demanding task, and no certificate of a global minimum nor guarantee of reaching a local minimum.
These issues cannot be solved in general, but some aids can be provided. Approaching the global optimum can be supported by a careful initial estimation of model parameters, which belongs to the “art” part of the system identification. From the “science” point of view, the computationally demanding iterative optimization should be refined as much as possible.
Sensitivity of Prediction Errors:
When looking for an iteration update during the parameter optimization, a substantial piece of information is the sensitivity (or gradient) of the criterion function V (p,e(0), . . . , e(N−1)) with respect to the parameter vector p
The difficulty in evaluating the previous expression comes from the complex dependence of the prediction errors on the parameters in p. Let p∈{p1, . . . , pn
When a solution to the previous equation is found, the solution to (2.15) is usually obvious. The inventors have found no systematic approach to this for the case of linear, continuous-time, stochastic GBMs.
Suppose that we know the initial state exactly (Initial state can be also considered as an additional unknown vector of parameters); i.e.
$$\hat{x}(0|-1) = x_0, \quad P(0|-1) = 0, \tag{2.17}$$
then the Riccati equation (2.14) becomes an algebraic equation
$$K = (M P C^T + S)(C P C^T + R)^{-1},$$
$$P = M P M^T + Q - K \Sigma K^T,$$
$$\Sigma = C P C^T + R, \tag{2.18}$$
and the output predictor is an LTI model
$$\hat{y}(k|k-1) = C\hat{x}(k|k-1) + D u(k),$$
$$\hat{x}(k+1|k) = M\hat{x}(k|k-1) + N u(k) + K\,\bigl(y(k) - \hat{y}(k|k-1)\bigr). \tag{2.19}$$
Let the sensitivity of the state vector with respect to a parameter pi be
and the sensitivity of the output with respect to parameter pi is
Now, using the chain rule, the output sensitivities can be computed using the augmented system as follows
where the subscript (i) indicates the partial derivative with respect to parameter pi; i.e., Xi=∂X/∂pi, X∈{M,N,C,D,K}. The original problem is thereby translated into the question of how to evaluate the partial derivatives of these matrices. Whereas for the matrices of the continuous-time LTI system (2.1) these can be easily computed, in the case of the discrete-time matrices M, N and K the evaluation becomes more involved.
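One way to picture the sensitivity recursion is the following non-limiting Python sketch. It is obtained by applying the chain rule directly to the predictor (2.19) for a single parameter pi, under the assumption that the initial state does not depend on pi. The function name `errors_and_sensitivity` and the tuple layout of its arguments are hypothetical.

```python
import numpy as np


def errors_and_sensitivity(mats, dmats, u, y, x0):
    """Prediction errors e(k) and their derivatives e_i(k) w.r.t. one parameter.

    mats  = (M, N, C, D, K)       matrices of the steady-state predictor (2.19)
    dmats = (Mi, Ni, Ci, Di, Ki)  their partial derivatives w.r.t. p_i
    """
    M, N, C, D, K = mats
    Mi, Ni, Ci, Di, Ki = dmats
    x = x0.copy()
    xi = np.zeros_like(x0)  # state sensitivity; x0 assumed independent of p_i
    errs, sens = [], []
    for uk, yk in zip(u, y):
        e = yk - (C @ x + D @ uk)            # prediction error
        ei = -(Ci @ x + C @ xi + Di @ uk)    # its sensitivity (chain rule)
        x_next = M @ x + N @ uk + K @ e
        # Differentiate the state update term by term.
        xi = Mi @ x + M @ xi + Ni @ uk + Ki @ e + K @ ei
        x = x_next
        errs.append(e)
        sens.append(ei)
    return np.array(errs), np.array(sens)
```

The state and its sensitivity are propagated together in one pass over the data, which is the augmented-system idea: no repeated simulation with perturbed parameters is needed.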
Lemma 1: A partial derivative of the discrete-time matrices Mi and Ni can be evaluated using matrix exponential as follows:
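A non-limiting Python sketch of one such evaluation follows. It relies on the known block-triangular identity for the derivative of a matrix exponential, expm([[X, Xi], [0, X]]) = [[expm(X), d expm(X)], [0, expm(X)]], applied to the augmented matrix X = [[A, B], [0, 0]]·Ts. Whether this matches the exact form used in Lemma 1 is an assumption; the function name `discretize_with_derivative` is hypothetical.

```python
import numpy as np
from scipy.linalg import expm


def discretize_with_derivative(A, B, Ai, Bi, Ts):
    """Return M, N and their partial derivatives Mi, Ni w.r.t. one parameter.

    Ai and Bi are the partial derivatives of the continuous-time A and B.
    """
    n, m = B.shape
    s = n + m
    X = np.zeros((s, s))
    X[:n, :n] = A
    X[:n, n:] = B
    Xi = np.zeros((s, s))
    Xi[:n, :n] = Ai
    Xi[:n, n:] = Bi
    # Block upper-triangular matrix whose exponential carries the derivative
    # of expm(X * Ts) in its upper-right block.
    big = np.zeros((2 * s, 2 * s))
    big[:s, :s] = X
    big[:s, s:] = Xi
    big[s:, s:] = X
    E = expm(big * Ts)
    value, deriv = E[:s, :s], E[:s, s:]
    M, N = value[:n, :n], value[:n, n:]
    Mi, Ni = deriv[:n, :n], deriv[:n, n:]
    return M, N, Mi, Ni


# Hypothetical scalar example with A = -p, B = p, evaluated at p = 2.
M, N, Mi, Ni = discretize_with_derivative(
    np.array([[-2.0]]), np.array([[2.0]]),
    np.array([[-1.0]]), np.array([[1.0]]), Ts=0.1)
```

For this scalar case M = exp(−p·Ts) and N = 1 − exp(−p·Ts), so the derivatives can be checked by hand against −Ts·exp(−p·Ts) and Ts·exp(−p·Ts).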
Finally, the evaluation of the partial derivatives of the Kalman gain matrix K may be performed. Looking at (2.18) it is seen that the matrix K is a function of the deterministic as well as the stochastic parameters.
Sensitivity of the Riccati's Equation
Lemma 2: A partial derivative of matrix K can be evaluated by solving the following Lyapunov's equation
Pi=ĀPiĀT+
where
Ā=M−KC,
Āi=Mi−KCi,
The partial derivative of the covariance matrix will be
$$\Sigma_i = C_i P C^T + C P_i C^T + C P C_i^T, \tag{2.26}$$
and the partial derivative of the Kalman gain will be
$$K_i = \bigl[\bar{A}_i P C^T + \bar{A}\,(P_i C^T + P C_i^T)\bigr]\Sigma^{-1}. \tag{2.27}$$
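The following non-limiting Python sketch illustrates the computation for the case S = 0. The right-hand side of the Lyapunov equation used here is an assumption: it is derived from the Joseph form P = ĀPĀT + Q + KRKT, in which the terms containing Ki cancel at the optimal gain. The sketch also carries a derivative Ri of the measurement noise covariance; with Ri = 0 the expressions reduce to (2.26) and (2.27). The function name `kalman_gain_sensitivity` is hypothetical.

```python
import numpy as np
from scipy.linalg import solve_discrete_are, solve_discrete_lyapunov


def kalman_gain_sensitivity(M, C, Q, R, Mi, Ci, Qi, Ri):
    """Steady-state Kalman gain K and its derivative Ki w.r.t. one parameter."""
    # Filter-form algebraic Riccati equation (2.28) via the dual problem.
    P = solve_discrete_are(M.T, C.T, Q, R)
    Sigma = C @ P @ C.T + R
    Sigma_inv = np.linalg.inv(Sigma)
    K = M @ P @ C.T @ Sigma_inv
    Abar = M - K @ C
    Abar_i = Mi - K @ Ci
    # Assumed right-hand side, obtained by differentiating the Joseph form.
    rhs = (Abar_i @ P @ Abar.T + Abar @ P @ Abar_i.T
           + Qi + K @ Ri @ K.T)
    # Solves Pi = Abar Pi Abar^T + rhs, which is linear in Pi.
    Pi = solve_discrete_lyapunov(Abar, rhs)
    Ki = (Abar_i @ P @ C.T + Abar @ (Pi @ C.T + P @ Ci.T)
          - K @ Ri) @ Sigma_inv
    return K, Ki, P, Pi
```

The Riccati equation is solved once per parameter vector, after which each parameter costs only one linear Lyapunov solve.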
The algebraic Riccati's equation is defined as
$$K = M P C^T (C P C^T + R)^{-1},$$
$$P = M P M^T + Q - K \Sigma K^T,$$
$$\Sigma = C P C^T + R. \tag{2.28}$$
The system matrices M and C are functions of unknown parameters θ, the covariance matrix of measurement noise, R, is a function of η and covariance matrix of process noise, Q, is a function of both θ and η (see (2.8)). Consider that all partial derivatives of matrices M, C, Q and R are known, then by differentiating the first equation in (2.28) we obtain
where the inner terms are
Substituting (2.30) into (2.29) results in
The previous equation can be rewritten as Lyapunov's equation
Pi=ĀPiĀT+
where
Ā=M−KC,
Āi=Mi−KCi,
Summarizing the previous results, it may be observed that the sensitivity of the prediction errors can be computed iteratively using augmented system (2.22), partial derivatives of the discrete-time (deterministic) matrices can be solved using augmented matrix exponential (2.23), and partial derivatives of the Kalman gain matrix K can be computed by solving Lyapunov's (linear) equation (2.24) and by evaluating (2.27).
Maximum-Likelihood Estimation
The goal is to find the parameters θ and η of the LTI stochastic system (2.3). Here we consider Qc to be parameterized by a vector of unknown parameters η∈ Rn
The goal is to find a vector of parameters that maximize the following likelihood function
where p(y(k)|Yk−1, Uk,θ) is a probability density function, which comes from the noise properties of the stochastic LTI GBM. It is well known that in the case of an LTI model with Gaussian process and measurement noises, the probability density of the predicted output will also be Gaussian. Specifically, for the steady-state Kalman gain (or known initial state), the probability density will be:
$$p(y(k) \mid Y_N, U_N, \theta, \eta) \sim \mathcal{N}\bigl(\hat{y}(k|k-1), \Sigma\bigr). \quad (2.36), (2.37)$$
Finding the arguments that maximize the likelihood function leads to optimization of the problem solution at 320 and is equivalent to minimizing the corresponding negative log-likelihood
which is also equivalent to minimizing
Here E∈ Rn
E=[e(0|−1) . . . e(N−1|N−2)]. (2.40)
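For illustration only, the Gaussian negative log-likelihood may be evaluated from the matrix of prediction errors as in the Python sketch below. The sketch assumes the standard form V = ½(N·log det Σ + Σk e(k)ᵀΣ⁻¹e(k)) plus a constant, with E holding one prediction error per column. The function names `neg_log_likelihood` and `sample_covariance` are hypothetical.

```python
import numpy as np


def neg_log_likelihood(E, Sigma):
    """Gaussian negative log-likelihood of prediction errors.

    E has shape (outputs, samples); Sigma is the innovation covariance.
    """
    ny, n_samples = E.shape
    _, logdet = np.linalg.slogdet(Sigma)
    # Sum over k of e(k)^T Sigma^-1 e(k), without forming the inverse.
    quad = np.sum(E * np.linalg.solve(Sigma, E))
    const = n_samples * ny * np.log(2.0 * np.pi)
    return 0.5 * (n_samples * logdet + quad + const)


def sample_covariance(E):
    """Estimated covariance of the prediction errors, (1/N) E E^T."""
    return E @ E.T / E.shape[1]


# Hypothetical prediction errors for a single output and four samples.
E = np.array([[0.3, -0.1, 0.2, -0.4]])
V = neg_log_likelihood(E, sample_covariance(E))
```

For fixed prediction errors, the sample covariance is the value of Σ that minimizes this cost, which is why an estimated covariance matrix appears when the cost is differentiated with respect to Σ.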
In order to support numerical minimization of (2.39) and hence solve the formulated optimization problem at 320, a gradient (parameter sensitivity) of the cost function must be provided. The partial derivative of the cost function can be computed as follows
In the previous equation
The partial derivative of Σ with respect to pi is given by (2.26).
The sensitivity function (or the gradient) (2.41) can be evaluated when the prediction errors e(k|k−1), their sensitivities ei(k|k−1) and the sensitivity of the solution to the algebraic Riccati equation are known. All of these have already been introduced in the previous section.
It is worth mentioning that a similar approach can be employed when evaluating the gradient of various PEM criterion functions, e.g. Least Squares Estimation (LSE) and its modifications. Usually it is the prediction error and its sensitivity that are the building blocks for evaluating gradients of the PEM cost functions. Model performance analysis is illustrated at 330 and may utilize a model output fit computed with the estimated values of the parameters. The parameters may be searched using an iterative (Gauss-Newton) optimization algorithm, i.e. the cost function and all sensitivities are repeatedly evaluated and new updates are computed until a possible local minimum is found.
Finally, when the optimal parameters are estimated, the model performance is analyzed. If the performance is not adequate a new model structure and/or new initial estimate of parameters should be provided and the process is repeated.
The proof to (2.42) and (2.43):
With regard to the sensitivity of the MLE, consider the objective function to be parameterized as follows
The derivative of the objective function
A partial derivative of the cost function with respect to a parameter pi will be (see (3.2) and (3.5))
A partial derivative of the cost function with respect to Σ gives (see (3.3), (3.6), and (3.5))
In the last equation
denotes an estimated covariance matrix.
where
Various feedback loops are summed at junctions 550, 555, and 560 providing respective inputs. The block-oriented design can be naturally incorporated into the GBM definition, as the model is described in the continuous-time domain. This makes the model easy to modify. In one example, with 5000 samples of input/output data and a standard laptop processor (Intel i5, 2.4 GHz), the computation may take approximately 18 seconds using an analytical gradient, whereas it may take 13 minutes using the numerically evaluated gradient. Note that an equivalent black-box model would have 24 parameters and presumably difficulties with over-fitting.
In summary, a continuous time GBM may be derived using first-principles and model splitting. A noise model is included for a good estimate of deterministic parameters. A prediction error method is used to estimate model parameters. To evaluate output predictions, a numerical integration of the stochastic differential equation is solved. In general, this can be computationally very expensive. In the case of a linear differential equation and Gaussian process and measurement noises, an exact discretization can be performed, which results in a set of difference equations:
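$$dx(\tau) = A_c x(\tau)\,d\tau + B_c u(\tau)\,d\tau + dv_c(\tau) \;\;\longrightarrow\;\; x(k+1) = A x(k) + B u(k) + v(k)$$
$$y(t_k) = C_c x(t_k) + D_c u(t_k) + w(t_k) \;\;\longrightarrow\;\; y(k) = C x(k) + D u(k) + w(k)$$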
The sensitivity of these exactly discretized model equations is then used to evaluate the gradient of the cost function in a chosen PEM.
In the previous section linear time invariant (LTI) gray-box models (GBMs) were considered. Such models typically result from some approximation of more complex non-linear systems and they are valid only locally—close to a given linearization point. In many cases this is sufficient, e.g. when the model is used by a controller to maintain a certain operating point (OP). On the other hand, by using advanced control with multiple manipulated and controlled variables, the optimal OP can be placed in varied positions, which involves more general non-linear models that are valid over an extended region.
Consider two descriptions of a non-linear model of a system. The first one is a static map of steady state inputs to steady state outputs. We will refer to this model as a “static” or “steady-state” model. The other one is a “dynamic model”, which also describes the transient behavior between the steady-states. As expected, the second description should embrace the static description as well.
The static model can be used to place the OP optimally, whereas the dynamic model (a set of models) is employed by the controller to accomplish bringing the system to the operating point (OP). Acquired experimental data is often consistent with that assumption. The experimental data is usually composed of various step-tests at different levels (operating points).
The assumptions are used to derive a two-step parameter estimation algorithm illustrated at 700 in
The only difference to the LTI GBM definition is that the resulting model is represented by a set of nonlinear differential equations
dx(t)=f(x(t),u(t),θ,t)dt+dv(t)
y(tk)=g(x(tk),u(tk),θ,tk)+e(tk)
For the steady state model estimation, there is no need for numerical integration. The estimation provides enhanced fidelity at low frequency and a physical plausibility of state. In one embodiment, input and output data is analyzed and steady states are extracted. The steady states are considered as new optimization variables which are tuned together with the model parameters such that the steady-state condition is fulfilled and output error minimized. This allows constraint of the model states such that they remain physically feasible. For the dynamical model estimation, pre-estimated parameters from the steady state model estimation may be used. Dynamical parameters are tuned using sub-sequences of experimental data. In one embodiment sequences of input and output data 740 wherein the dynamical modes are excited are determined. An initial guess of the model parameters and their uncertainty from the steady-state model estimation is made. Parameters can be tuned using a brute force ordinary differential equation (ODE) solver plus non-linear Gauss-Newton search. Further improvement uses local linearization and avoids numerical integration by employing the algorithms for linear GBM parameter estimation, as will be described later. Input and output data for the two-step identification of parameters is illustrated in a graph at 900 in
The deterministic non-linear GBM is defined as follows
$$\dot{x}(t) = f(x(t), u(t), \theta),$$
$$y(t) = g(x(t), u(t), \theta), \tag{3.9}$$
where x(t)∈Rn
A steady-state model will be
0=f(xss,uss,θ),
yss=g(xss,uss,θ). (3.10)
where xss is a steady-state that corresponds to a steady-state input uss and output yss. Instead of looking for an explicit solution to xss the implicit definition is kept. The steady-state data {uss(k), yss(k)} has to be extracted from the measured data.
The idea is to consider the steady-states xss(k) as free optimization variables, to use the first equation in (3.10) as a soft constraint, which handles the steady-state feasibility, and to try to minimize the output errors
ess(k)=yss(k)−g(xss(k),uss(k),θ). (3.11)
A cost function, which captures this idea
The last term in (3.12) regularizes the parameter estimation: θ0 is an initial estimate of the parameters, and the matrix P (typically a diagonal matrix) is used to determine the weight of the initial estimate. An auxiliary parameter α∈ (0,1) is a tuning parameter, which helps with convergence of the optimization problem.
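The steady-state estimation idea may be pictured with the following non-limiting Python sketch. The tank-like model, the residual weights `w_f` and `w_reg`, and all numeric values are hypothetical and are not taken from (3.12); in particular, the sketch does not model the tuning parameter α. It only shows steady states treated as free variables, the steady-state equation used as a soft constraint, a regularization toward the initial estimate, and simple bounds that keep the states physically feasible.

```python
import numpy as np
from scipy.optimize import least_squares


def f(x, u, theta):
    """Hypothetical state equation: dx/dt = u - theta * sqrt(x)."""
    return u - theta * np.sqrt(x)


def g(x, u, theta):
    """Hypothetical output equation: the state is measured directly."""
    return x


def residuals(z, u_ss, y_ss, theta0, w_f, w_reg):
    """Stacked residuals: output errors, soft constraint, regularization."""
    theta, x_ss = z[0], z[1:]
    r_out = y_ss - g(x_ss, u_ss, theta)       # output errors (3.11)
    r_ss = w_f * f(x_ss, u_ss, theta)         # soft constraint 0 = f(...)
    r_reg = w_reg * np.atleast_1d(theta - theta0)
    return np.concatenate([r_out, r_ss, r_reg])


# Hypothetical steady-state data generated with theta = 2.
u_ss = np.array([1.0, 2.0, 3.0, 4.0])
theta_true = 2.0
y_ss = (u_ss / theta_true) ** 2

theta0 = 1.0                                   # initial parameter estimate
z0 = np.concatenate([[theta0], y_ss])          # initial steady states from data
lower = np.concatenate([[0.1], np.full(u_ss.size, 1e-6)])   # keep x_ss > 0
upper = np.concatenate([[10.0], np.full(u_ss.size, np.inf)])
sol = least_squares(residuals, z0, bounds=(lower, upper),
                    args=(u_ss, y_ss, theta0, 1.0, 1e-3))
theta_hat, x_hat = sol.x[0], sol.x[1:]
```

No differential equation is integrated at any point; the optimizer only evaluates f and g at the candidate steady states.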
To evaluate the gradient of the cost function (3.12), the partial derivatives of the model equation must be computed. Assume that functions f(x,u,θ) and g(x,u,θ) are continuous and have continuous first-order partial derivatives for given xss, uss, θ; and define
The gradient is then
Note that a simplified notation X(k)=X(xss(k), uss(k),θ) has been used for X∈{A,B,C,D,G,F}.
Similarly, the second-order derivatives can also be evaluated and used to construct the Hessian matrix. These derivatives are reused in the second step, the estimation of dynamic parameters using transient data.
A substantial amount of information lies in known constraints. Constraints on model parameters can be derived from first-principles knowledge and data inspection. These assumptions can be easily added as linear constraints, typically
$$
\begin{aligned}
&\theta_{min} \le \theta \le \theta_{max},\\
&\Omega(u_{ss}(k), k)\, x_{ss}(k) \le \omega(u_{ss}(k), k).
\end{aligned}
\tag{3.16}
$$
Finally, the user has to select an initial estimate of (feasible) parameters and run the optimization
The optimal steady-state estimate $\theta_{ss}^{*}$ and the corresponding Hessian matrix $H^{*}$ are used as the starting point for the next step, the estimation of the non-linear dynamical model.
Further Detail of the Non-Linear Dynamical Model Estimation:
Once the steady-state model has been estimated, the next step is to tune the remaining parameters using measured transient responses of the system. A prediction error method is again employed to find optimal values of the parameters of the non-linear model.
To evaluate the prediction errors of the non-linear model, a numerical integration method is typically employed to solve the differential equation in (3.9). Furthermore, the equation must be solved at each step of the iterative optimization algorithm that searches for the optimal parameters. This makes the approach usable only for small data sets and simple models (where the corresponding differential equation can be easily solved).
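The cost of this brute-force approach can be seen in a minimal Python sketch, in which every evaluation of the prediction-error cost requires a full numerical integration of the model. The classical fourth-order Runge-Kutta scheme, the zero-order hold on the input, the scalar toy model, and all names below are assumptions chosen for illustration; the source does not prescribe a particular integrator.

```python
def f(x, u, theta):
    # Hypothetical toy dynamics: first-order lag with gain theta[1] / theta[0].
    return -theta[0] * x + theta[1] * u


def g(x, u, theta):
    # Hypothetical output map: the state is measured directly.
    return x


def rk4_step(x, u, theta, h):
    """One classical Runge-Kutta step; the input u is held constant over the step."""
    k1 = f(x, u, theta)
    k2 = f(x + 0.5 * h * k1, u, theta)
    k3 = f(x + 0.5 * h * k2, u, theta)
    k4 = f(x + h * k3, u, theta)
    return x + h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0


def simulate(theta, x0, u_seq, h):
    """Integrate the model over the whole input sequence and collect outputs."""
    x = x0
    outputs = []
    for u in u_seq:
        outputs.append(g(x, u, theta))
        x = rk4_step(x, u, theta, h)
    return outputs


def sum_squared_prediction_error(theta, x0, u_seq, y_seq, h):
    """Cost evaluated by an optimizer; each call re-integrates the model."""
    y_model = simulate(theta, x0, u_seq, h)
    return sum((y - ym) ** 2 for y, ym in zip(y_seq, y_model))


# Usage: synthetic step-response data generated with theta_true.
theta_true = [2.0, 4.0]
u_seq = [1.0] * 50
h = 0.1
y_seq = simulate(theta_true, 0.0, u_seq, h)

sse_true = sum_squared_prediction_error(theta_true, 0.0, u_seq, y_seq, h)
sse_wrong = sum_squared_prediction_error([1.0, 4.0], 0.0, u_seq, y_seq, h)
```

An iterative search would call `sum_squared_prediction_error` many times, which is why the integration cost dominates for large data sets or stiff models.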
Because experimental transient data is usually centered on a given operating point, a local approximation can be sufficient to capture the model behavior. The linearized model has the form
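$$
\begin{aligned}
\dot{x}(t) &= A x(t) + B u(t) + b,\\
y(t) &= C x(t) + D u(t) + d,
\end{aligned}
\tag{3.18}
$$
and the matrices $A$, $B$, $C$, $D$ and vectors $b$, $d$ will be functions of the operating point $x_{op}$, $u_{op}$ and the parameter vector $\theta$:
$$
\begin{aligned}
A &= A(x_{op}, u_{op}, \theta), & C &= C(x_{op}, u_{op}, \theta),\\
B &= B(x_{op}, u_{op}, \theta), & D &= D(x_{op}, u_{op}, \theta),\\
b &= f(x_{op}, u_{op}, \theta) - A x_{op} - B u_{op}, & d &= g(x_{op}, u_{op}, \theta) - C x_{op} - D u_{op}.
\end{aligned}
\tag{3.19}
$$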
Note that by extending the input vector, $u(t) \to [u^{T}(t)\ \ 1]^{T}$, the above "affine" representation may be translated into the standard linear representation (2.1). The system matrices are functions not only of the deterministic parameters $\theta$ but also of the operating point.
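The affine linearization and the extended-input trick can be sketched in a few lines of Python. The sketch assumes, as in a first-order Taylor expansion, that A, B, C, D are the partial derivatives of f and g with respect to x and u at the operating point. The scalar toy model, its analytic derivatives, and all names are hypothetical.

```python
def f(x, u, theta):
    # Hypothetical non-linear toy dynamics.
    return -theta[0] * x * x + theta[1] * u


def g(x, u, theta):
    # Hypothetical output map.
    return x


def linearize(x_op, u_op, theta):
    """Return the affine model (A, B, C, D, b, d) at the operating point."""
    A = -2.0 * theta[0] * x_op   # df/dx at the operating point
    B = theta[1]                 # df/du
    C = 1.0                      # dg/dx
    D = 0.0                      # dg/du
    b = f(x_op, u_op, theta) - A * x_op - B * u_op
    d = g(x_op, u_op, theta) - C * x_op - D * u_op
    return A, B, C, D, b, d


def affine_rhs(x, u, A, B, b):
    """Right-hand side of the affine state equation."""
    return A * x + B * u + b


def extended_rhs(x, u, A, B, b):
    """Same right-hand side written as a linear model with input [u, 1]."""
    B_ext = [B, b]
    u_ext = [u, 1.0]
    return A * x + sum(bi * ui for bi, ui in zip(B_ext, u_ext))


# Usage: linearize at a steady-state operating point of the toy model.
theta = [0.5, 2.0]
x_op, u_op = 2.0, 1.0
A, B, C, D, b, d = linearize(x_op, u_op, theta)

exact_at_op = f(x_op, u_op, theta)
approx_at_op = affine_rhs(x_op, u_op, A, B, b)
exact_near = f(2.1, 1.0, theta)
approx_near = extended_rhs(2.1, 1.0, A, B, b)
```

The affine model reproduces the non-linear right-hand side exactly at the operating point, and the mismatch grows only quadratically with the distance from it.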
Regarding the character of the transient data, we can assume that
The triple $\{x_{op}, u_{op}, \theta\}$ is linked, and thus the number of parameters can be reduced by up to $m \times (n_x + n_u)$ variables, where $m$ is the number of operating points (transient data sets) considered.
To apply the approach derived for the LTI GBM in Section 2, the partial derivatives of the matrices in (3.19) have to be defined. This is not a difficult task if the second-order derivatives of the functions f(⋅) and g(⋅) are known.
Let p∈{θ1, . . . , θn
By using the relation for the steady-state operating point
0=f(xop,uop,θ), we obtain for
Now all the information is available for estimating the parameters of the linearized GBM by the above-mentioned algorithm, using for example the MLE described above. The point is that the parameters to be estimated are actually parameters of the corresponding non-linear model (or at least a subset of them). Based on the data available, several operating points may be selected and used for tuning the "dynamical" parameters of the non-linear GBM. Note that the steady-state model is considered to be already tuned as described above.
In one embodiment, a two-step parameter estimation may be performed. First, a non-linear static map is estimated from the steady-state data as described above. Next, a set of operating points is identified using the steady-state condition for each level. For a chosen value of $u_{op}$, and using the steady-state condition $0 = f(x_{op}, u_{op}, \hat{\theta}_{ss})$, where $\hat{\theta}_{ss}$ is the estimated steady-state vector of parameters, we have $\hat{x}_{op} = x_{op}(\hat{\theta}_{ss}, u_{op})$. Substituting this into (3.19), the linearized model is obtained:
$$
A_c = A_c(\hat{x}_{op}, u_{op}, \theta),\ \ldots,\ D_c = D_c(\hat{x}_{op}, u_{op}, \theta). \tag{3.23}
$$
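Because the operating point is defined only implicitly by the steady-state condition, it generally has to be found numerically. The following minimal Python sketch uses Newton's method for a scalar toy model. Newton's method, the toy model, the starting point, and the function names are assumptions for illustration and are not taken from the source.

```python
def f(x, u, theta):
    # Hypothetical non-linear toy dynamics.
    return -theta[0] * x * x + theta[1] * u


def df_dx(x, u, theta):
    # Analytic partial derivative of f with respect to the state.
    return -2.0 * theta[0] * x


def solve_operating_point(u_op, theta_ss, x_init, tol=1e-12, max_iter=50):
    """Solve 0 = f(x_op, u_op, theta_ss) for x_op by Newton iteration."""
    x = x_init
    for _ in range(max_iter):
        residual = f(x, u_op, theta_ss)
        if abs(residual) < tol:
            return x
        x = x - residual / df_dx(x, u_op, theta_ss)
    raise RuntimeError("Newton iteration did not converge")


# Usage: operating point for a chosen input level and estimated parameters.
theta_ss_hat = [0.5, 2.0]
x_op_hat = solve_operating_point(1.0, theta_ss_hat, x_init=1.0)
```

The starting point selects which root is found when the steady-state equation has several solutions, so a physically plausible initial value (for example the nearest extracted steady state) is a sensible choice.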
In further detail, let $V_{LTI}^{(i)}(\theta, \eta)$ be a cost function for LTI GBM parameter estimation (e.g., the MLE (2.39)) that corresponds to transient data set #i. Here, $\eta$ is a vector of "stochastic" parameters as defined above. An example of an overall cost function for estimating the "dynamical" parameters of the non-linear GBM is
$$
V(\theta, \eta) = V_{LTI}^{(1)}(\theta, \eta) + \ldots + V_{LTI}^{(m)}(\theta, \eta) + \gamma\, \lVert \theta - \theta_{ss}^{*} \rVert_{H_{ss}} \tag{3.24}
$$
The last term in equation (3.24) ensures that the information from the non-linear steady-state model estimation is preserved ($\theta_{ss}^{*}$ and $H_{ss}$ are the optimal parameter estimate and its Hessian from the steady-state model parameter estimation, and $\gamma$ is a weighting factor).
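How the per-data-set costs and the steady-state prior combine can be shown with a short Python sketch. The per-data-set cost functions below are simple stand-ins rather than the MLE referenced above, and the weighted norm is assumed to be the squared form (θ − θss*)ᵀ Hss (θ − θss*). All names and values are hypothetical.

```python
def weighted_squared_norm(diff, H):
    """Compute diff^T H diff for a list vector and a list-of-lists matrix."""
    n = len(diff)
    return sum(diff[i] * H[i][j] * diff[j] for i in range(n) for j in range(n))


def overall_cost(theta, eta, dataset_costs, theta_ss, H_ss, gamma):
    """Sum of per-data-set costs plus a penalty for leaving the steady-state estimate."""
    transient_part = sum(cost(theta, eta) for cost in dataset_costs)
    diff = [t - ts for t, ts in zip(theta, theta_ss)]
    return transient_part + gamma * weighted_squared_norm(diff, H_ss)


# Stand-in costs for two transient data sets (placeholders, not an MLE).
def cost_a(theta, eta):
    return (theta[0] - 1.0) ** 2


def cost_b(theta, eta):
    return (theta[1] - 2.0) ** 2 + eta[0] ** 2


# Usage with values chosen so that the arithmetic is exact.
theta = [1.5, 2.0]
eta = [0.5]
theta_ss = [1.0, 2.0]
H_ss = [[4.0, 0.0], [0.0, 1.0]]   # large entry: parameter well determined by steady-state data
gamma = 0.5

total = overall_cost(theta, eta, [cost_a, cost_b], theta_ss, H_ss, gamma)
```

A large Hessian entry penalizes movement of a parameter that the steady-state data already determined well, while parameters with small entries remain free to be tuned by the transient data.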
A summary of a practical application is now provided. In one embodiment a user defines a non-linear gray-box model as follows:
dx(τ)=f(x(τ),u(τ),θ)+τv(τ)
y(tk)=g(x(tk),u(tk),θ)+e(tk) (3.25)
First and second order partial derivatives may be solved using a symbolic solver or by using pre-defined model blocks that are simply interconnected by a user as illustrated in
Steady-state data and transient data may then be provided to a programmed computer for formulation of the parameter estimation optimization problem, including steady-state parameter estimation supported by analytical gradient and Hessian, and dynamical parameter estimation supported by fast exact discretization and analytical gradient.
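Exact discretization of a linear model can be sketched with an augmented matrix exponential. The minimal Python sketch below assumes a zero-order hold on the input and uses the standard identity exp([[A, B], [0, 0]]·T) = [[Ad, Bd], [0, I]]. It covers only the deterministic part; discretization of the noise model and of the parameter sensitivities is not shown. The small Taylor-series exponential and all names are illustrative assumptions.

```python
import math


def mat_mul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def mat_exp(m, squarings=8, terms=18):
    """Matrix exponential by scaling, truncated Taylor series, and repeated squaring."""
    n = len(m)
    scale = 2.0 ** squarings
    scaled = [[v / scale for v in row] for row in m]
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms + 1):
        term = [[v / k for v in row] for row in mat_mul(term, scaled)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    for _ in range(squarings):
        result = mat_mul(result, result)
    return result


def discretize(A, B, T):
    """Return (Ad, Bd) for x[k+1] = Ad x[k] + Bd u[k] with sampling period T."""
    n, p = len(A), len(B[0])
    M = [[0.0] * (n + p) for _ in range(n + p)]
    for i in range(n):
        for j in range(n):
            M[i][j] = A[i][j] * T
        for j in range(p):
            M[i][n + j] = B[i][j] * T
    E = mat_exp(M)
    Ad = [row[:n] for row in E[:n]]
    Bd = [row[n:] for row in E[:n]]
    return Ad, Bd


# Usage: scalar lag dx/dt = -x + u sampled at T = 0.1.
Ad, Bd = discretize([[-1.0]], [[1.0]], 0.1)
expected_Ad = math.exp(-0.1)
expected_Bd = 1.0 - math.exp(-0.1)
```

For the affine model (3.18), the same routine applies after the input has been extended to [u, 1], so that the offset b is discretized together with B.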
Conclusion:
We have shown that, assuming that we have provided the first and second-order derivatives of the system equations (3.9), the parameter estimation task can be well-formulated as follows
A coherent algorithm for the parameter estimation of the LTI GBM makes the hard non-linear optimization task numerically tractable and computationally efficient. In various embodiments, the complexity of the parameter optimization is handled by a machine, making the model identification as user-friendly as possible.
Memory 1203 may include volatile memory 1214 and non-volatile memory 1208. Computer 1200 may include, or have access to a computing environment that includes, a variety of computer-readable media, such as volatile memory 1214 and non-volatile memory 1208, removable storage 1210 and non-removable storage 1212. Computer storage includes random access memory (RAM), read only memory (ROM), erasable programmable read-only memory (EPROM) and electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, compact disc read-only memory (CD ROM), Digital Versatile Disks (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium capable of storing computer-readable instructions.
Computer 1200 may include or have access to a computing environment that includes input 1206, output 1204, and a communication connection 1216. Output 1204 may include a display device, such as a touchscreen, that also may serve as an input device. The input 1206 may include one or more of a touchscreen, touchpad, mouse, keyboard, camera, one or more device-specific buttons, one or more sensors integrated within or coupled via wired or wireless data connections to the computer 1200, and other input devices. The computer may operate in a networked environment using a communication connection to connect to one or more remote computers, such as database servers. The remote computer may include a personal computer (PC), server, router, network PC, a peer device or other common network node, or the like. The communication connection may include a Local Area Network (LAN), a Wide Area Network (WAN), cellular, WiFi, Bluetooth, or other networks.
Computer-readable instructions stored on a computer-readable medium are executable by the processing unit 1202 of the computer 1200. A hard drive, CD-ROM, and RAM are some examples of articles including a non-transitory computer-readable medium such as a storage device. The terms computer-readable medium and storage device do not include carrier waves. For example, a computer program 1218 capable of providing a generic technique to perform access control check for data access and/or for doing an operation on one of the servers in a component object model (COM) based system may be included on a CD-ROM and loaded from the CD-ROM to a hard drive. The computer-readable instructions allow computer 1200 to provide generic access controls in a COM based computer network system having multiple users and servers.
1. A method comprising:
using first principles and engineering knowledge to define a continuous time nonlinear gray-box model of a system performing a process;
defining a numerically tractable optimization problem for parameter estimation of the nonlinear gray-box model;
tuning a vector of parameters of a static model of the nonlinear gray-box model; and
extending the vector of parameters of the static model to a dynamic model by fitting measured transient data from the process.
2. The method of example 1 wherein fitting measured transient data uses a local linear approximation of the non-linear gray-box model, thereby avoiding computationally expensive numerical integration.
3. The method of any of examples 1-2 and further comprising providing analytical sensitivity to the optimizing parameters.
4. The method of example 3 wherein sensitivity is provided as a gradient of a cost function, wherein the gradient is computed using prediction errors and their sensitivities to the parameters.
5. The method of example 4 and further comprising evaluating the computation of the prediction errors and their sensitivities of locally linearized gray-box models corresponding to local linearized models of the nonlinearized gray-box model using an extended discrete-time linear stochastic model.
6. The method of example 5 wherein the locally linearized gray-box models are defined by a set of first order stochastic differential equations:
$$
\begin{aligned}
dx(t) &= A(\theta)x(t)\,dt + B(\theta)u(t)\,dt + dv(t),\\
y(t_k) &= C(\theta)x(t_k) + D(\theta)u(t_k) + w(t_k),
\end{aligned}
$$
where x(t)∈Rn
7. The method of example 5 and further comprising computing an extended discrete-time linear stochastic model using the continuous-time stochastic gray-box model.
8. The method of example 7 wherein discretizing the extended continuous time stochastic linear model to create a discrete time stochastic model parameterized by the same vector of parameters while maintaining the sensitivities includes:
generating discrete-time matrices; and
solving the discrete-time matrices using an augmented matrix exponential to obtain the discrete time stochastic model.
9. The method of example 8 wherein discretizing the extended continuous time stochastic linear model to create a discrete time stochastic model with the optimized vector of parameters while maintaining the sensitivities further comprises:
parameterizing the discrete time stochastic model using the continuous time model parameters;
utilizing a prediction error method with the continuous time linear gray-box model and measured data to estimate unknown parameters.
10. The method of example 9 wherein the prediction error is determined by solving Riccati's equation to compute the prediction error using the optimized vector of parameters by:
defining an initial estimate of all the unknown parameters that define a vector of deterministic parameters θ, a discretized process noise matrix Q, measurement noise matrix R, and initial state of the model x;
selecting a prediction error cost function; and
iteratively looking for an optimal value of the unknown parameters that minimizes the prediction error cost function.
11. The method of any of examples 1-10 and further comprising evaluating the sensitivity of the prediction errors to the model optimized parameters via a solution to a specific Lyapunov's equation.
12. The method of any of examples 1-11 and further comprising:
providing the gray-box model with optimized parameters to the advanced controller; and
controlling the system using the optimized gray-box model.
13. The method of any of examples 1-12 wherein discretizing the extended continuous time stochastic model is performed using Riccati's equation and a Lyapunov equation is used to evaluate the sensitivity of the Riccati's equation.
14. An advanced controller having a parameterized model for controlling a system to control a process comprising:
a processor; and
a storage device having code and data to define a gray-box model of the system and corresponding parameters for the gray-box model, wherein the parameters are determined in accordance with a method at least partially performed by a programmed computer, the method comprising:
defining a numerically tractable optimization problem for parameter estimation of a general nonlinear gray-box model;
tuning parameters of a static model of the nonlinear gray-box model;
extending the parameters of the static model to a dynamic model by fitting measured transient data from the process.
15. The advanced controller of example 14 wherein the gray-box model is defined by a set of first order differential equations:
$$
\begin{aligned}
\dot{x}(t) &= f(x(t), u(t), \theta, t),\\
y(t) &= g(x(t), u(t), \theta, t),
\end{aligned}
$$
where x(t)∈Rn
16. The advanced controller of any of examples 14-15 wherein the nonlinear gray-box model 15 is used to control the process, primarily to compute an optimal set-point of the underlying closed-loop controller.
17. The advanced controller of any of examples 14-16 wherein the nonlinear gray-box model 15 is used to compute an optimal set-point and initialize a locally linearized gray-box model, which is used by a closed-loop model predictive controller to perform optimal control.
$$
\begin{aligned}
\dot{x}(t) &= A(\theta)x(t) + B(\theta)u(t),\\
y(t) &= C(\theta)x(t) + D(\theta)u(t),
\end{aligned}
$$
where x(t)∈Rn
18. The advanced controller of any of examples 14-17 wherein the method at least partially performed by a programmed computer further comprises defining a noise model including process and measurement noise in the continuous time domain utilizing stochastic parameters.
19. The advanced controller of example 18 wherein the linear stochastic model is defined as:
$$
\begin{aligned}
dx(t) &= A(\theta)x(t)\,dt + B(\theta)u(t)\,dt + dv(t),\\
y(t_k) &= C(\theta)x(t_k) + D(\theta)u(t_k) + w(t_k),
\end{aligned}
$$
where x(t)∈Rn
20. The advanced controller of any of examples 14-19 wherein discretizing the continuous time stochastic linear model to create a discrete time stochastic model with optimized parameters while maintaining the sensitivities includes:
generating discrete-time matrices; and
solving the discrete-time matrices using an augmented matrix exponential to obtain the discrete time stochastic model.
21. A method comprising:
estimating a non-linear steady state model representative of a system performing a process using steady-state data, the model providing tuned static parameters for an advanced controller to control the system; and
using the non-linear steady state model with the tuned static parameters to tune dynamic gray-box model parameters using sub-sequences of transient experimental data.
22. The method of example 21 and further comprising:
providing tuned gray-box parameters to the advanced controller; and
controlling the system using the provided gray-box parameters.
23. An advanced controller having a parameterized model for controlling a system to perform a process comprising:
a processor; and
a storage device having code and data to define a dynamic gray-box model of the system and corresponding dynamic gray-box parameters for the gray-box model, wherein the parameters are determined in accordance with a method at least partially performed by a programmed computer, the method comprising:
estimating a non-linear steady state model representative of a system performing a process using steady-state data, the model providing tuned static parameters for an advanced controller to control the system; and
using the non-linear steady state model with the tuned static parameters to provide dynamic gray-box model parameters using sub-sequences of transient experimental data.
24. A method comprising:
performing a steady-state model estimation to obtain a vector of steady state parameters for a non-linear model representative of a system controlled by an advanced controller to perform a process; and
tuning the vector of steady-state parameters by using a local approximation for estimation of non-linear model parameters at different operating levels of the system to obtain local linear model vectors of gray-box parameters for the advanced controller.
25. The method of example 24 and further comprising:
providing the tuned gray-box parameters to the advanced controller; and
controlling the system using the provided gray-box parameters.
26. An advanced controller having a parameterized model for controlling a system to perform a process comprising:
a processor; and
a storage device having code and data to define a dynamic gray-box model of the system and corresponding dynamic gray-box parameters for the gray-box model wherein the parameters are determined in accordance with a method at least partially performed by a programmed computer, the method comprising:
performing a steady-state model estimation to obtain a vector of steady state parameters for a non-linear model representative of a system controlled by the advanced controller; and
tuning the vector of steady-state parameters by using a local approximation for estimation of non-linear model parameters at different operating levels of the system to obtain local linear model vectors of gray-box parameters for the advanced controller.
Although a few embodiments have been described in detail above, other modifications are possible. For example, the logic flows depicted in the figures do not require the particular order shown, or sequential order, to achieve desirable results. Other steps may be provided, or steps may be eliminated, from the described flows, and other components may be added to, or removed from, the described systems. Other embodiments may be within the scope of the following claims.
Number | Date | Country | Kind |
---|---|---|---|
15170586 | Jun 2015 | EP | regional |
Number | Name | Date | Kind |
---|---|---|---|
9235657 | Wenzel | Jan 2016 | B1 |
20100049369 | Lou et al. | Feb 2010 | A1 |
20120065783 | Fadell | Mar 2012 | A1 |
Entry |
---|
Jiří Řehoř et al., A Practical Approach to Grey-box Model Identification, Sep. 2, 2011, The International Federation of Automatic Control, 10776-10781 (Year: 2011). |
Erik Wernholt, Nonlinear Gray-Box Identification Using Local Models Applied to Industrial Robots, Nov. 15, 2010, Linkoping University, 1-15 (Year: 2010). |
“European Application Serial No. 15170586.0, Response filed Jun. 7, 2017 to Extended European Search Report dated Dec. 15, 2015”, 13 pgs. |
“European Application Serial No. 15170586.0, Extended European Search Report dated Dec. 15, 2015”, 6 pgs. |
Mukhopadhyay, Vivek, “Digital Robust Control Law Synthesis Using Constrained Optimization”, Journal of Guidance and Control and Dynamics, 12(2), (1989), 175-181. |
Tan, Kay Chen, et al., “Evolutionary grey-box modelling for practical system”, Genetic Algorithms in Engineering System: Innovations and Applications, (1997), 369-375. |
Number | Date | Country | |
---|---|---|---|
20160357166 A1 | Dec 2016 | US |