Claims
- 1. A method for determining the optimum operation of a system, comprising the steps of:
receiving the outputs of the system and the measurable inputs to the system; and optimizing select ones of the outputs as a function of the inputs by minimizing an objective function J to provide optimal values for select ones of the inputs; wherein the step of optimizing includes the step of predicting the select ones of the outputs with a plurality of models of the system, each model operable to map the inputs through a representation of the system to provide predicted outputs corresponding to the select ones of the outputs which predicted outputs of each of the plurality of models are combined in accordance with a predetermined combination algorithm to provide a single output corresponding to each of the select ones of the outputs.
- 2. The method of claim 1, wherein the optimal value of the outputs of the plurality of models is determined as a single averaged optimal output value for each of the select ones of the outputs.
- 3. The method of claim 1, and further comprising the step of applying the optimal values of the select ones of the inputs to the corresponding inputs of the system after determination thereof.
- 4. The method of claim 1, wherein the step of receiving the outputs of the system comprises receiving measurable outputs of the system.
- 5. The method of claim 1, wherein the step of optimizing comprises a derivative-based optimization operation.
- 6. The method of claim 5, wherein the step of optimizing comprises the steps of:
determining the average predicted output of the plurality of models <y(t)>; determining the average derivative of the average predicted output <y(t)> with respect to the inputs x(t) as ∂<y(t)>/∂x(t); the objective function J being a function of <y(t)> and determining a derivative of the objective function J with respect to <y(t)> as ∂J/∂<y(t)>; determining with the chain rule the relationship ∂J/∂x(t); and determining the minimum of J.
- 7. The method of claim 5, wherein the average derivative of the average predicted output is weighted over the plurality of models.
- 8. The method of claim 2, wherein the step of predicting the select ones of the outputs with the plurality of models of the system comprises predicting the output to a point forward in time as a trajectory.
- 9. A method for optimizing the parameters of a system having a vector input x(t) and a vector output y(t), comprising the steps of:
storing a representation of the system in a plurality of models, each model operable to map the inputs through a representation of the system to provide a predicted output, each of the models operable to predict the output of the system for a given input value of x(t), each model operable to map the inputs through a representation of the system to provide a predicted output; providing predetermined optimization objectives; and determining a single optimized input vector value x̂(t) by applying a predetermined optimization algorithm to the plurality of models to achieve a minimum error to the predetermined optimization objective.
- 10. The method of claim 9, wherein the step of determining comprises determining the derivative ∂y(t)/∂x(t) of each of the models and then determining an average of the derivatives ∂y(t)/∂x(t).
- 11. The method of claim 10, wherein the step of determining the average of the derivative comprises determining the weighted average of the derivatives ∂y(t)/∂x(t).
- 12. The method of claim 11, wherein the step of determining the average derivative is defined by the following relationship:
- 13. The method of claim 9, wherein the step of storing a representation of the system in a plurality of models comprises storing a representation of the system in a plurality of non-linear or linear networks, each operable to map the input x(t) to a predicted output through a stored representation of the system.
- 14. The method of claim 13, wherein the stored representation of the system in each of the plurality of non-linear or linear networks are related in such a manner wherein the parameters of each of the linear or non-linear networks are stochastically related to each other.
- 15. The method of claim 14, wherein the stochastic relationship is a Bayesian relationship.
- 16. The method of claim 9, wherein the predetermined optimization algorithm is an iterative optimization algorithm.
- 17. The method of claim 9, wherein the step of determining the single optimized input vector value x̂(t) comprises determining the derivative of the predetermined optimization objective relative to the input vector x(t) as ∂J/∂x(t), where J represents the predetermined optimization objective.
- 18. The method of claim 9, wherein the step of determining comprises determining the derivative ∂y(t)/∂x(t) of each of the models and then determining an average of the derivatives ∂y(t)/∂x(t).
- 19. The method of claim 18, wherein the step of determining the average derivative is defined over a (q, p) matrix by the following relationship:
- 20. The method of claim 19, wherein the step of determining ∂J/∂<x(t)> comprises the steps of:
determining the weighted average of the predicted output of each of the models by the following relationship:

$$\langle \vec{y}(t) \rangle \propto \sum_{w=1}^{N_w} F^{(w)}(\vec{x}) \prod_{i=1}^{n} P\left(y^{(i)} \mid x^{(i)}, \omega\right) P(\omega)$$

where P(y(i)|x(i), ω)P(ω) represents the posterior probability of the model indexed by w, and Nw represents the maximum number of models in the stochastic relationship, and wherein the stored representation of the system in each of the plurality of models are related in such a manner wherein the parameters of each of the models are stochastically related to each other; determining the derivatives ∂J/∂<y(t)> as the variation of the predetermined optimization objective with respect to the predicted output y(t); and determining by the chain rule the following:

$$\left.\frac{\partial J}{\partial \vec{x}_p(t)}\right|_p = \sum_q \frac{\partial J}{\partial \langle \vec{y}_q(t) \rangle} \, \frac{\partial \langle \vec{y}_q(t) \rangle}{\partial \vec{x}(t)_p}.$$
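
The derivative-based optimization recited in claims 6, 10, 11 and 20 can be illustrated with a minimal Python sketch. It is not the claimed implementation. It assumes two hypothetical linear models (claim 13 permits linear networks), fixed weights standing in for normalized posterior probabilities, a squared-error objective J, and plain gradient descent as the iterative optimization algorithm. All function names, model parameters, the target, and the step size are illustrative assumptions.

```python
# Illustrative sketch only: the linear models, weights, target, and step size
# below are hypothetical values, not taken from the patent.

def predict(model, x):
    """Linear model y = A x + b; returns the predicted output vector."""
    A, b = model
    return [sum(a * x_p for a, x_p in zip(row, x)) + b_q
            for row, b_q in zip(A, b)]

def jacobian(model, x):
    """dy_q/dx_p of a linear model is the matrix A (independent of x)."""
    A, _ = model
    return A

def averaged_output(models, weights, x):
    """<y> = sum_w weights[w] * F_w(x), with the weights summing to 1."""
    outputs = [predict(m, x) for m in models]
    return [sum(w * y[q] for w, y in zip(weights, outputs))
            for q in range(len(outputs[0]))]

def averaged_jacobian(models, weights, x):
    """Weighted average over the models of each (q, p) derivative entry."""
    jacs = [jacobian(m, x) for m in models]
    n_out, n_in = len(jacs[0]), len(jacs[0][0])
    return [[sum(w * j[q][p] for w, j in zip(weights, jacs))
             for p in range(n_in)] for q in range(n_out)]

def objective(y_avg, target):
    """Assumed objective: J = 0.5 * ||<y> - target||^2."""
    return 0.5 * sum((y - t) ** 2 for y, t in zip(y_avg, target))

def objective_gradient(models, weights, x, target):
    """Chain rule: dJ/dx_p = sum_q dJ/d<y_q> * d<y_q>/dx_p."""
    y_avg = averaged_output(models, weights, x)
    dJ_dy = [y - t for y, t in zip(y_avg, target)]  # derivative of assumed J
    dy_dx = averaged_jacobian(models, weights, x)
    return [sum(dJ_dy[q] * dy_dx[q][p] for q in range(len(dJ_dy)))
            for p in range(len(x))]

def optimize_inputs(models, weights, x0, target, step=0.1, iterations=500):
    """Gradient descent on the inputs, yielding a single optimized input vector."""
    x = list(x0)
    for _ in range(iterations):
        grad = objective_gradient(models, weights, x, target)
        x = [x_p - step * g_p for x_p, g_p in zip(x, grad)]
    return x

# Two hypothetical models as (A, b) pairs, with equal weights.
models = [
    ([[1.0, 0.0], [0.0, 2.0]], [0.0, 0.0]),
    ([[3.0, 0.0], [0.0, 2.0]], [0.0, 2.0]),
]
weights = [0.5, 0.5]
target = [4.0, 5.0]
x_opt = optimize_inputs(models, weights, [0.0, 0.0], target)
```

With these assumed values the averaged model is <y> = (2·x1, 2·x2 + 1), so the descent converges toward x = (2, 2), where the averaged prediction equals the target. For non-linear models, `jacobian` would depend on x and would need to be re-evaluated at every iteration, which the loop already does.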
CROSS REFERENCE TO THE RELATED APPLICATION
[0001] The present application is a Continuation Application of application Ser. No. 09/290,791, filed Oct. 6, 1998, entitled: BAYESIAN NEURAL NETWORK FOR OPTIMIZATION, which is a Continuation-in-Part of, and claims priority in, U.S. Provisional Patent Application Serial No. 60/103,269, entitled Bayesian Neural Networks For Optimization and Control, and filed Oct. 6, 1998 (Attorney Docket No. PAVI-24,473).
Provisional Applications (1)
| Number | Date | Country |
| --- | --- | --- |
| 60103269 | Oct 1998 | US |
Continuations (1)
| | Number | Date | Country |
| --- | --- | --- | --- |
| Parent | 09290791 | Apr 1999 | US |
| Child | 10827977 | Apr 2004 | US |