The present invention generally relates to monitoring and controlling continuous sheetmaking systems using model-based controllers and more specifically to techniques for identifying suitable process models by generating good quality process data while the system is operating in closed-loop.
The paper machine is a large-scale process to convert fibers into sheets of paper with high efficiency. It has hundreds of actuators at the head along the cross direction to control the properties of the pulp on the paper sheet. Thousands of measurement boxes are located at the end to measure the paper properties. For the controller design, there are two important directions associated with the paper machine: machine direction (MD) and cross direction (CD). The MD refers to the direction in which the paper sheet moves while the CD is the direction perpendicular to the MD.
Apart from the large number of actuators and measurement bins, the CD process is also an ill-conditioned process. Besides, the CD process model suffers from large uncertainties. All of these characteristics add to the complexity associated with the corresponding model identification and controller design. A common technique to address this issue assumes that all the actuators have identical temporal (in the time direction) and spatial (in the CD) response behavior. Moreover, the temporal and spatial responses are assumed to be separable. These assumptions are valid in practice and make the CD process easier to handle. Even then, the controller design and model identification for the CD process are still challenging.
A control strategy currently employed in the CD process is model predictive control (MPC), which requires a good-quality model. Therefore, model identification for the CD process plays an essential role in determining the performance of the MPC. In system identification, it is well known that a good excitation signal is necessary to make the identified model reliable and precise. How to design the excitation signal in an optimal way has received extensive attention. A number of well-known strategies have been proposed, such as the frequency domain approach, the time domain approach, open-loop optimal input design and closed-loop optimal input design.
In terms of the optimal input design for the CD process, most existing results focus on the open-loop case to generate good data for process model identification. The main drawback is that normal process operations are interrupted, which sacrifices product quality and may bring significant profit loss for the mills. The industry needs a technique that can generate good quality process data without having to suspend control and without sacrificing product quality.
The present invention simplifies the optimal design of process experiments for closed-loop CD process identification by using a causal equivalent process model to find the optimal spatial input spectrum subject to input and output power constraints. The technique includes converting the current or nominal process model from a large matrix to a non-causal transfer function. This reduces the number of parameters in the model, but puts it in a form that is difficult to use for optimal input design. Next, the non-causal model is converted into a causal model that has an equivalent output spectrum, which puts the model into a simple form that can be used for optimal input design. The causal model is then implemented to design the optimal input spectrum. This optimal input spectrum indicates how much input excitation should exist at each frequency to have an optimal experiment (one that generates the data from which we can obtain a process model estimate with the smallest uncertainty). Finally, the frequency domain representation of the optimal experiment is converted to a time domain realization, that is, a series of perturbations to the sheetmaking process that will generate the required informative data.
In one aspect, the invention is directed to a method of closed-loop identification of process models for a model predictive controller (MPC) for an industrial sheetmaking system having a plurality of actuators arranged in the cross-direction (CD) wherein the MPC provides control for a spatially-distributed sheet process which is employed in the sheetmaking system. The method includes the steps of:
(a) selecting a process model for the spatially-distributed process wherein the process model is defined by a matrix. The matrix defines the steady-state gains between actuator positions and spatially distributed process measurements. This model form is particularly convenient for use with the MPC. The initial process model selection may be based on use of an existing process model that is not as accurate as desired or it may be based on some a priori information about the process whereby the point of the initial model is to allow for the design of an excitation sequence that is tailored to a specific process of interest.
(b) converting the matrix into a non-causal transfer function. The non-causal spatial finite impulse response model is generated by taking the parameters from a single column of a spatial gain matrix and the non-causal spatial impulse response model is factored into a causal transfer function and an identical but anti-causal transfer function.
(c) converting the non-causal transfer function into a causal model that has an equivalent spectrum. This is preferably accomplished by taking the square (squaring) of the causal factor of the non-causal transfer function.
(d) using the causal model to design an optimal input spectrum for process excitation. The input spectrum is designed to minimize the covariance of the non-causal process model parameters subject to constraints on input and output power. To solve this optimization problem, a finite dimensional parameterization of the spectrum is made which allows the problem to be solved by readily available optimization toolboxes.
(e) transforming a frequency domain representation of the optimal input spectrum to a time domain realization which is a sequence of actuator movements. A controllable and observable state space realization of the spectrum can be constructed using established techniques.
(f) applying the sequence of actuator movements to the plurality of actuators and collecting data regarding changes in cross-directional sheet properties due to the sequence of actuator movements to determine measured actuator response profiles.
(g) analyzing the data to extract new model parameters.
(h) and inputting the new model parameters for the process model.
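Step (e) above can be pictured with a minimal Python sketch. It assumes, purely for illustration, that the optimal spectrum has already been factored into a stable FIR spectral factor h (the coefficients below are hypothetical), and it swaps the state-space realization mentioned in step (e) for the simpler idea of filtering unit-variance white noise through that factor. The function name realize_excitation is not taken from the source.

```python
import random


def realize_excitation(h, length, seed=0):
    """Filter Gaussian white noise through the FIR factor h.

    h is an assumed spectral factor of the designed spectrum; the
    returned sequence has spectrum |H(e^{jw})|^2 in expectation.
    """
    rng = random.Random(seed)
    # Extra leading samples so that every output uses a full window.
    noise = [rng.gauss(0.0, 1.0) for _ in range(length + len(h) - 1)]
    return [
        sum(h[i] * noise[x + len(h) - 1 - i] for i in range(len(h)))
        for x in range(length)
    ]


h = [1.0, 0.6, 0.2]                      # hypothetical spectral factor
moves = realize_excitation(h, 200, seed=0)
```

The resulting sequence would still have to be checked against the actuator limits before it is applied in step (f).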
In another aspect, the invention is directed to a multivariable model predictive controller (MPC) for providing control to a cross-direction (CD) process having at least one manipulated actuator array and at least one controlled measurement array, wherein the MPC employs a process model that is defined by a matrix and the MPC includes a processor that is configured to:
convert the matrix into a non-causal transfer function;
convert the non-causal transfer function into a causal model that has an equivalent spectrum;
use the causal model to design an optimal input spectrum for process excitation;
transform a frequency domain representation of the optimal input spectrum to a time domain realization which is a sequence of actuator movements;
apply the sequence of actuator movements to the plurality of actuators and collect data regarding changes in cross-directional sheet properties due to the sequence of actuator movements to determine measured actuator response profiles;
analyze the data to extract new model parameters; and
input the new model parameters for the process model.
The present invention is particularly suited for so-called “single-beam” applications for identifying suitable process models for the MPC with respect to a single actuator array comprising a plurality of manipulated actuators that are arranged in the CD and a corresponding single controlled measurement array.
As shown in
The system further includes a profile analyzer 44 that is connected, for example, to scanning sensor 38 and actuators 18, 20, 32 and 36 on the headbox 10, steam box 12, vacuum boxes 28, and dryer 34, respectively. The profile analyzer is a computer which includes a control system that operates in response to the cross-directional measurements from scanning sensor 38. In operation, scanning sensor 38 provides the analyzer 44 with signals that are indicative of the magnitude of a measured sheet property, e.g., caliper, dry basis weight, gloss or moisture, at various cross-directional measurement points. The analyzer 44 also includes software for controlling the operation of various components of the sheetmaking system, including, for example, the above-described actuators. To implement the control system of the present invention, analyzer 44 can include memory 62 and processing devices 64 to execute software/firmware instructions for performing various operations related to MPC control of an industrial process. Interface 60 allows the processing devices 64 to receive data and provide signals to actuators or controllers.
As an example shown in
It is understood that the inventive technique is sufficiently flexible as to be applicable for online implementation with any large-scale industrial cross-directional process having multiple actuator arrays and multiple product quality measurements that is controlled by a single-input-single-output (SISO) controller or by a multivariable model predictive controller (MPC), such as in papermaking. Suitable paper machine processes where paper is continuously manufactured from wet stock are further described, for instance, in U.S. Pat. No. 6,807,510 to Backstrom et al., U.S. Pat. No. 8,224,476 to Chu et al., and U.S. 2015/0268645 to Shi et al., which are incorporated herein by reference. In so-called “bump tests,” operating parameters on the sheetmaking system, such as a papermaking machine, are altered and changes of certain dependent variables resulting therefrom are measured. Bump test techniques are described, for open-loop CD process model estimation, in U.S. Pat. No. 6,086,237 to Gorinevsky et al. and, for closed-loop CD process alignment identification, in U.S. Pat. No. 7,459,060 to Stewart, which are incorporated herein by reference. While the invention will be described with respect to a papermaking machine, it is understood that the invention is applicable to other spatially-distributed processes such as plastic sheetmaking, rubber sheetmaking, and sheet metal operation.
The invention will be illustrated with a closed-loop input design for the CD process. Since the closed-loop CD process operates at steady-state most of the time, the preferred approach focuses on the steady-state closed-loop CD process. The main challenges are how to deal with the large input-output dimensions of the process and how to incorporate the controller in order to perform the closed-loop optimal input design. With the present invention, a non-causal model for the CD process is developed to avoid the high dimensional issue associated with the conventional multi-input-multi-output (MIMO) CD model. To eliminate the resultant difficulty for the input design, we then propose an approach to obtain a causal model with an output spectrum equivalent to that of the non-causal model. It is shown that the maximum likelihood estimate and the parameter covariance matrix of the causal-equivalent model will converge to those of the non-causal model asymptotically with probability one. In this sense, the optimal excitation signal can be designed directly based on the causal model.
In the following CD process model that includes controller 70 and plant 72 shown in
$$y(t) = g(z^{-1})\,G\,u(t) + v(t), \qquad (1)$$
$y(t) \in \mathbb{R}^m$ represents the measured controlled variable (CV) profile, and m is the number of measurement boxes along the cross direction. $u(t) \in \mathbb{R}^m$ is the manipulated variable (MV) profile; here we assume a square CD model. $v(t) \in \mathbb{R}^m$ is the disturbance acting on the output of the process. $z^{-1}$ is the unit backward-shift operator. $g(z^{-1})$ describes the dynamics associated with each actuator and is assumed to be a scalar transfer function. In other words, we assume that all the actuators share the same dynamics, which is a common practice to simplify the modeling of the process. $G \in \mathbb{R}^{m \times m}$ is the steady-state gain matrix, and each column of G is the sampled impulse response of a single actuator along the CD at steady-state. The most distinctive features of G are its ill-conditioning, Toeplitz structure and sparsity. These characteristics greatly reduce the complexity of the controller design and model identification of the CD process. For clarity, we pose the following assumption on the structure of the G matrix.
Assumption 1.
The actuators of the CD process have the same and symmetric impulse response shape along the spatial direction at steady-state, except the center of the response shape of each actuator is different. The columns of G are indeed sampled version of these responses.
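Assumption 1 can be made concrete with a short Python sketch. It builds a banded, symmetric Toeplitz gain matrix from one side of a spatial impulse response and reads the full response back out of an interior column. The response values, the matrix size and the helper names toeplitz_gain and column_response are hypothetical and chosen only for illustration.

```python
def toeplitz_gain(half_response, m):
    """Build an m-by-m symmetric banded Toeplitz matrix.

    half_response = [g0, g1, ..., gn] holds one side of the symmetric
    spatial impulse response (g_{-i} = g_i). Entries farther than n
    bins from the diagonal are zero, which gives the sparsity.
    """
    n = len(half_response) - 1
    return [
        [half_response[abs(i - j)] if abs(i - j) <= n else 0.0
         for j in range(m)]
        for i in range(m)
    ]


def column_response(G, j, n):
    """Read g_{-n}..g_n out of column j (requires n <= j < m - n)."""
    return [G[j + k][j] for k in range(-n, n + 1)]


half = [1.0, 0.6, 0.2]            # hypothetical g0, g1, g2
G = toeplitz_gain(half, m=8)
fir = column_response(G, j=4, n=2)
print(fir)
```

Columns near the sheet edges are truncated versions of the same shape, which is why the sketch reads the response from an interior column.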
For the dynamic model, we assume $g(z^{-1})$ to have the following form (the subscript t represents temporal),
$$g(z^{-1}) = z^{-d}\,\frac{B_t(z^{-1})}{A_t(z^{-1})}, \qquad (2)$$
where d is the time-delay. $B_t(z^{-1})$ and $A_t(z^{-1})$ are polynomials, and in most cases $g(z^{-1})$ is the discretization of a first-order plus time-delay model with unit gain. Similarly, the output disturbance v(t) is assumed to be filtered white noise, both temporally and spatially,
$$v(t) = \frac{C_t(z^{-1})}{D_t(z^{-1})}\,\phi\,e(t), \qquad (3)$$
where $C_t(z^{-1})$ and $D_t(z^{-1})$ are monic scalar polynomials describing the filter in the temporal direction. Again, we assume the disturbances affecting all the output channels have the same dynamic model. The constant matrix $\phi \in \mathbb{R}^{m \times m}$ denotes the spatial correlation of the noise. e(t) is white noise with zero mean and covariance $\mathbb{E}[e(t)e^T(t_0)] = \Sigma_e\,\delta(t - t_0)$, where $\mathbb{E}$ is the expectation operator, $\Sigma_e$ is the covariance matrix and δ is the Dirac delta function.
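The first-order plus time-delay dynamics mentioned above can be illustrated with a small Python sketch. It assumes a zero-order-hold discretization in which the pole is a = exp(-Ts/T), so that the difference equation is y(t) = a·y(t−1) + (1−a)·u(t−d), which has unit steady-state gain. The time constant, sample time and delay below are hypothetical values, not taken from the source.

```python
import math


def foptd_step_response(time_constant, sample_time, delay_samples, steps):
    """Unit-step response of y(t) = a*y(t-1) + (1-a)*u(t-d)."""
    a = math.exp(-sample_time / time_constant)   # discrete-time pole
    y_prev = 0.0
    response = []
    for t in range(steps):
        # The unit step reaches the process only after d samples.
        u_delayed = 1.0 if t >= delay_samples else 0.0
        y = a * y_prev + (1.0 - a) * u_delayed
        response.append(y)
        y_prev = y
    return response


# Hypothetical actuator: T = 30 s, Ts = 10 s, delay of 3 samples.
resp = foptd_step_response(30.0, 10.0, 3, 200)
```

The response stays at zero for the first d samples and then rises monotonically toward one, consistent with the unit-gain assumption.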
For the purpose of spatial optimal excitation signal design, the following steady-state CD process model is of interest,
$$y_{ss} = G_{ss}\,u_{ss} + v_{ss}, \qquad (4)$$
where $y_{ss} \in \mathbb{R}^m$ is the steady-state measured CV profile, and $u_{ss} \in \mathbb{R}^m$ is the steady-state MV profile. $G_{ss} = G$ is the steady-state process gain. $v_{ss} \in \mathbb{R}^m$ is the steady-state output disturbance. For convenience, we suppose that the spatial filter $\phi$ is also Toeplitz-structured and sparse as $G_{ss}$.
It is well-known that the closed-loop dither signal design will involve the explicit expression of the controller in the formulation of the objective function. Most modern industrial CD controllers are MPC and if any of the constraints is active, the controller will become highly complicated and even nonlinear.
The presence of a nonlinear controller in the control loop will add to the complexity of closed-loop optimal input design. Thus to simplify this procedure, we introduce the following assumption.
Assumption 2.
Throughout this analysis, the MPC is assumed to operate in the linear mode and no constraints are active.
From Assumption 2, the specific expression of the model predictive controller at steady-state for the CD process is known to be,
$$K_{ss} = Q_3^{-1}\,\alpha_K\,G_{ss}\,Q_1, \qquad (5)$$
where Q1 is the weight matrix in the MPC objective function penalizing the deviation of the CV profile from its set-points. Q3 is the corresponding weight matrix to penalize the offset of the steady-state manipulated variable (MV) from its target. αK is a constant determined from the dynamic model (2) of the actuators. In practice, for convenience, the weighting matrices Q1 and Q3 are in general selected to be diagonal. From (5), it is observed that the controller Kss will possess a similar structure as the gain Gss. Awareness of this important point will greatly facilitate the proceedings of the derivations in the sequel.
Combining (4) and (5), from
$$y_{ss} = (I + G_{ss}K_{ss})^{-1}G_{ss}K_{ss}\,r + (I + G_{ss}K_{ss})^{-1}v_{ss}, \qquad (6)$$
$$u_{ss} = (I + K_{ss}G_{ss})^{-1}r - (I + K_{ss}G_{ss})^{-1}K_{ss}\,y_{ss}, \qquad (7)$$
where $r \in \mathbb{R}^m$ is the spatial excitation signal to be designed.
When it comes to the spatial optimal input, the parameters of interest for the CD process model will be those in the gain matrix Gss (or more specifically, the parameters in a column of Gss). Notice that the optimal input design directly based on the closed-loop model (6)-(7) is nontrivial due to the large input-output dimensions as well as the large number of parameters in Gss. To avoid this issue, we propose to use a scalar transfer function along the spatial coordinate to represent the spatial response of the actuators. In this sense, the original optimal input design aimed for the MIMO CD model can be re-formulated into that for a scalar spatial model, which significantly reduces the associated complexity. However, the price to pay is that the scalar spatial transfer function will have to be non-causal as any bumped actuator will generate responses on two sides (see
In this section, we show the procedures to develop causal-equivalent models for both the CD closed-loop model and open-loop model. Let us focus on the steady-state process model Gss since the steady-state controller matrix Kss will follow the same vein as long as it possesses a similar structure as G.
From the aforementioned structure of the Gss as well as Assumption 1, one is readily able to extract a scalar non-causal FIR model from any single column of Gss to represent the spatial impulse response of the actuator,
$$g(\lambda, \lambda^{-1}) = g_{-n}\lambda^{-n} + \dots + g_0 + \dots + g_n\lambda^{n}, \qquad (8)$$
where λ is the spatial forward-shift operator. The positive and negative powers of λ denote the anti-causal and causal shifts, respectively. The $g_i$, $i = -n, \dots, n$, are the spatial impulse response coefficients of each single actuator and in general the symmetry of the impulse response is enforced, i.e., $g_i = g_{-i}$. Since in most cases the non-causal FIR model (8) has a high order (i.e., n is normally large), a parsimonious non-causal transfer function is necessary to simplify this model. Before we show that, the following assumption is posed.
Assumption 3.
The CD MIMO steady-state models (e.g., Gss and Kss) are Toeplitz-structured, and the corresponding spatial impulse response sequence satisfies the Wiener-Hopf factorization conditions: it is real and symmetric and, taking G as an example,
$$g_{-n}\lambda^{-n} + \dots + g_0 + \dots + g_n\lambda^{n} = M(\lambda)\,M(\lambda^{-1}), \quad \forall\omega, \qquad (9)$$
where λ=ejω. Here M(λ) has the following expression,
M(λ)=m0+m1λ−1+ . . . +mnλn, (10)
where mi, i=1, n, are the coefficients.
An immediate observation is that the frequency response of the left-hand side of (9) is non-negative and real for any frequency, which places certain restrictions on the scope of the possible spatial impulse response shapes that we may investigate. However, industrial experience reveals that most actual actuator response shapes are able to satisfy this condition. The relationship between Gss and Kss from (5) affirms that if Gss satisfies (9) then so does Kss.
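The non-negativity condition attached to (9) can be screened numerically, as in the following Python sketch. It forms the symmetric coefficients of M(λ)M(λ⁻¹) from a causal factor with real coefficients and evaluates the real-valued frequency response on a grid. The coefficient values, the grid size and the function names are hypothetical, and a grid test is only a screening aid, not a proof of non-negativity.

```python
import math


def autocorrelation(m_coeffs):
    """Coefficients g_0..g_n of M(lambda) * M(lambda^-1) for real M."""
    n = len(m_coeffs) - 1
    return [
        sum(m_coeffs[i] * m_coeffs[i + k] for i in range(n + 1 - k))
        for k in range(n + 1)
    ]


def spatial_spectrum(half_response, omega):
    """Frequency response g0 + 2*sum_k g_k*cos(k*omega) of a symmetric FIR."""
    return half_response[0] + 2.0 * sum(
        g * math.cos(k * omega)
        for k, g in enumerate(half_response) if k > 0
    )


def satisfies_condition(half_response, grid=512, tol=1e-12):
    """True if the response is non-negative at every grid frequency."""
    return all(
        spatial_spectrum(half_response, math.pi * i / grid) >= -tol
        for i in range(grid + 1)
    )


g_ok = autocorrelation([1.0, 0.5])    # built from a factor, so it passes
g_bad = [1.0, 0.8]                    # 1 + 1.6*cos(w) dips below zero
print(satisfies_condition(g_ok), satisfies_condition(g_bad))
```

A response shape that fails this screen cannot be written in the factored form of (9), so the causal-equivalent construction would not apply to it directly.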
After obtaining the causal FIR model M(λ−1), the next step would be to find a parsimonious transfer function model (e.g. output error model) to represent M(λ−1). This process can be accomplished from the system identification toolbox in Matlab and the original noncausal g(λ, λ−1) is re-written as follows,
where na and nb are the orders of B(λ−1) and A(λ−1), respectively. In a similar fashion, the noncausal transfer function form of the controller is achieved to be,
where ne and nf are the orders of E(λ−1) and F(λ−1), respectively. From (11)-(16), the original high-dimensional MIMO steady-state closed-loop model (6)-(7) can be replaced by scalar but non-causal transfer functions,
where x stands for the spatial coordinate. Note that the input and output sensitivity functions have the same non-causal transfer function representation as shown in the above equations.
Up to now, the closed-loop scalar non-causal model of the CD process (17)-(18) is still not a convenient form for further processing such as the optimal input design. In this subsection, we will develop methods to find causal equivalent models for the non-causal transfer functions such as
Lemma 1.
Suppose that
Proof.
Since
1(ejω,e−jω)≧0,∀ω,
2(ejω,e−jω))≧0,∀ω,
Thus it follows that,
1(ejω,e−jω)+
Besides, the coefficient sequence of (19) is real and symmetric. Thus one is always able to find an M(λ) such that (9) is satisfied. This ends the proof.
Defining
from (17)-(18), we have,
From Lemma 1, it follows that the denominator of (20) can be factorized to be the product of a causal FIR filter and its anti-causal form. Therefore, the closed-loop transfer functions (17)-(18) are simplified to be,
y
ss(x)=
u
ss(x)=
where
where {e(x)} is a spatial white noise sequence. To find a causal-equivalent transfer function for (21)-(23), we establish the following theorem.
Theorem 1.
Consider a stochastic process with the output sequence {y(x), x=1, . . . , m} (in the sequel, the subscript is omitted and the argument x is used to indicate the steady-state and output sequence) generated according to the following non-causal Box-Jenkins model
where {e(x), x=1, . . . , m} is a Gaussian white noise sequence. The polynomials with arguments λ−1 and λ are the causal and anti-causal parts, respectively. Assume that all the polynomials have no zeros on the unit circle and are minimum phase. Then there exist causal polynomials $\tilde{M}_y(\lambda^{-1})$, $\tilde{N}_y(\lambda^{-1})$, $\tilde{R}_y(\lambda^{-1})$,
Proof.
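Multiplying both sides of (24) by $N(\lambda)N(\lambda^{-1})S(\lambda)S(\lambda^{-1})$, we obtain,
$$N(\lambda)N(\lambda^{-1})S(\lambda)S(\lambda^{-1})\,y(x) = M(\lambda)M(\lambda^{-1})S(\lambda)S(\lambda^{-1})\,r(x) + N(\lambda)N(\lambda^{-1})R(\lambda)R(\lambda^{-1})\,e(x). \qquad (26)$$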
Define the roots of the anti-causal polynomials M(λ−1), N(λ−1), R(λ−1), S(λ−1) to be, respectively, αi, βi, γi and δi. Let
Notice that N(λ)πN=N2(λ−1) and the same also holds for M(λ), R(λ) and S(λ). Multiplying both sides of (26) by πmπs, after some manipulations, one can obtain,
Since πM, πN, πR, and πS are all-pass filters, $\{\tilde{e}_y(x)\}$ is a white noise sequence with the same spectra as {e(x)} but may correspond to different realizations. Besides, $\{\tilde{y}(x)\}$ has the same spectra as {yss(x)}. Therefore, (25) is verified by pairing $\tilde{M}(\lambda^{-1}) = M^2(\lambda^{-1})$ and so on with (27), which ends this proof.
Remark 1.
From Theorem 1, the equivalence between $\{\tilde{y}(x)\}$ and {y(x)} is in terms of the spectra, although the realizations might be different. However, this equivalence greatly facilitates the maximum likelihood estimation for the original non-causal model by reducing it to a causal-equivalent form. The rationale is that the log-likelihood functions of the non-causal model and the causal model converge to the same value with probability one as the sample number tends to infinity, a result which can also be extended to the non-causal Box-Jenkins model in (24).
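The spectral equivalence behind Theorem 1 can be checked numerically for a single factor. The Python sketch below takes a hypothetical causal FIR factor M with real coefficients, squares it by polynomial convolution, and compares the magnitude response of the squared causal filter with the real response of the non-causal product M(λ)M(λ⁻¹) on the unit circle. The coefficients and helper names are assumptions for illustration only.

```python
import cmath
import math


def convolve(a, b):
    """Polynomial product of two coefficient lists."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def causal_response(coeffs, omega):
    """Evaluate c0 + c1*e^{-jw} + c2*e^{-2jw} + ... (powers of lambda^-1)."""
    return sum(c * cmath.exp(-1j * omega * k) for k, c in enumerate(coeffs))


m_causal = [1.0, 0.5, 0.1]                 # hypothetical factor M
m_squared = convolve(m_causal, m_causal)   # causal equivalent M^2

max_gap = 0.0
for i in range(257):
    w = math.pi * i / 256
    # For real coefficients, M(e^{jw}) * M(e^{-jw}) = |M(e^{-jw})|^2.
    noncausal = abs(causal_response(m_causal, w)) ** 2
    causal_equiv = abs(causal_response(m_squared, w))
    max_gap = max(max_gap, abs(noncausal - causal_equiv))
print(max_gap)
```

The two responses differ only in phase, which is the sense in which the causal model reproduces the spectrum but not necessarily the realization.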
Similarly, the input signal uss(x) in (22) can also be represented through causal filters,
where $\{\tilde{u}(x)\}$ and {uss(x)} have the same spectra. The equations (25) and (29) will be necessary for the optimal input design in the sequel.
It is well known that if the white noise is Gaussian distributed, the prediction error method with properly chosen criterion will coincide with the maximum likelihood estimation. It has been shown that, for the open-loop data, the log-likelihood function of the non-causal ARX model and that of the corresponding causal ARX model will converge to the same value as the sample number tends to infinity. In this subsection, we will demonstrate a similar statement for the closed-loop data.
Theorem 2.
Let us consider the following non-causal process model (θ is the parameter in a compact set Ω),
y(x)=
where
Remark 2.
Theorem 2 implies that both the log-likelihood function and its derivative with respect to the parameter θ obtained from the original non-causal model and the causal model are identical asymptotically. Therefore, we can conclude that the parameter covariance matrices from the two schemes coincide, and hence we may perform the optimal input design based on the causal model.
This section investigates the optimal input design in closed loop for the steady-state CD process model. The emphasis is placed on the non-causal CD process model due to the drawbacks associated with the MIMO CD closed-loop model (6)-(7). The actual implementation of the input design is carried out on the causal-equivalent CD closed-loop model, as justified by Theorem 2.
Note that in practice, the noise model parameters are of less interest and thus we split the parameter θ to be $\theta = [\rho^T\ \eta^T]^T$, where ρ is the process model parameter vector and η is the noise model parameter vector. For the input design, the focus will be on minimizing the covariance of ρ only. Based on this motivation, due to Theorem 2, the parameter covariance matrix of ρ, $P_\rho$, is expressed as,
where λ0 is the variance of the noise $\tilde{e}_y(x)$. $\tilde{g}$ and $\tilde{h}$ are the causal equivalent forms of g and
The closed-loop optimal input design can be formulated as minimizing a function of the parameter covariance Pρ subject to a set of constraints, e.g., input and output power constraints,
where cu and cy are the limits on the input signal power and output signal power. The constraints (34)-(35) can be written in terms of the design variable φr(ω) by (32) and (25), respectively. As this optimization problem is still infinite-dimensional (since φr(ω) is a continuous function of ω), a technique known as finite dimensional parameterization can be employed to reduce it to a finite-dimensional one. Specifically, φr(ω) can be parameterized by the definition of a spectrum,
where ck, k=−mc, . . . , mc, are the parameters, and mc is the selected number of parameters. With (36), the original optimization problem can be cast into one with a finite number of parameters. It is worth pointing out that the non-negativity of the parameterized spectrum (36) at every frequency has to be satisfied while searching for the optimal ck. This requirement is fulfilled through the KYP lemma by constructing a controllable and observable state-space realization for the spectrum. After these treatments, we obtain a convex optimization problem (provided f0(·) is chosen to be convex) which can be readily solved by off-the-shelf solvers such as the CVX toolbox.
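The finite-dimensional parameterization can be illustrated with a short Python sketch. It assumes the common form in which the spectrum is a truncated sum of c_k·e^{−jωk} with real, symmetric coefficients (c_{−k} = c_k), so that the signal power equals c_0. The coefficient values are hypothetical, and the grid-based non-negativity check is a simple stand-in for the KYP-lemma constraint used in the actual optimization, not a replacement for it.

```python
import math


def spectrum(c_half, omega):
    """Phi_r(w) = c0 + 2*sum_{k>=1} c_k*cos(k*w), with c_{-k} = c_k."""
    return c_half[0] + 2.0 * sum(
        c_half[k] * math.cos(k * omega) for k in range(1, len(c_half))
    )


def average_power(c_half, grid=2048):
    """(1/(2*pi)) * integral of Phi_r over [-pi, pi), rectangle rule."""
    step = 2.0 * math.pi / grid
    total = sum(spectrum(c_half, -math.pi + i * step) for i in range(grid))
    return total * step / (2.0 * math.pi)


c_half = [4.0, 1.5, 0.5]      # hypothetical c_0, c_1, c_2
power = average_power(c_half)
min_value = min(spectrum(c_half, math.pi * i / 1024) for i in range(1025))
print(power, min_value)
```

Under this parameterization a power constraint on the excitation becomes a linear constraint on c_0, while non-negativity couples all of the coefficients.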
Remark 3.
Note that the aforementioned optimal input design only considers the power constraints on the input and output (34)-(35). However, in practice, the hard constraints on the CVs and MVs make more sense and this is still an open problem for the frequency-domain optimal input design as posed above. Besides, specific to the CD process, the second-order bending constraints preventing the ‘picketing’ on the actuators are also important.
A simulation example was used to validate the proposed CD process model identification and closed-loop optimal input design methods. In particular, the effect of the optimally designed input on the identification was compared with that of the bumped excitation that is currently employed in the industry, as described in U.S. Pat. No. 8,224,476 to Chu et al., which is incorporated herein.
In practice, the spatial response shape of a single actuator is assumed to satisfy the following nonlinear equation,
where γ, ξ, β, α represent the gain, width, divergence and attenuation, respectively. x is the spatial coordinate. In this example, these parameters are specified with values, respectively, γ=0.3802, ξ=268.6414 mm, β=0.10, α=3.5. The response shape under impulse signal of amplitude 5 is illustrated as the solid curve in the plot of
To make the comparison between the optimal excitation signal and the bumped excitation meaningful, we set a hard constraint of ±10 on the amplitude of the excitation signals. For the optimally designed input, if any part of its amplitude violates this constraint, that part is saturated at the bound. For the bumped signal, the amplitude of the bumps alternates between −10 and 10. To achieve this goal, we carefully choose cu=4 and cy=0.2. The plot in
In some embodiments, various functions described above are implemented or supported by a computer program that is formed from computer readable program code and that is embodied in a computer readable medium. The phrase “computer readable program code” includes any type of computer code, including source code, object code, and executable code. The phrase “computer readable medium” includes any type of medium capable of being accessed by a computer, such as read only memory (ROM), random access memory (RAM), a hard disk drive, a compact disc (CD), a digital video disc (DVD), or any other type of memory. A “non-transitory” computer readable medium excludes wired, wireless, optical, or other communication links that transport transitory electrical or other signals. A non-transitory computer readable medium includes media where data can be permanently stored and media where data can be stored and later overwritten, such as a rewritable optical disc or an erasable memory device.
This application claims priority to U.S. Provisional Patent Application No. 62/305,412 that was filed on Mar. 8, 2016 and which is incorporated herein by reference.