This application is related to U.S. patent application Ser. No. 10/007,906 and U.S. patent application Ser. No. 10/080,742 which are incorporated by reference herein in their entirety.
The present invention relates generally to the optimization of designs, models, structures and shapes, for example in fluid-dynamic applications such as the design of an aircraft wing, a gas turbine or a compressor blade. In particular, it relates to the optimization of such designs using Evolutionary Algorithms (EAs).
In the field of evolutionary algorithms, basic principles of natural evolution are used for generating or optimizing technical structures. Basic operations are mutation and recombination as a method for modifying structures or parameters. To eliminate unfavorable modifications and to proceed with modifications which increase the overall quality of the system, a selection operation is used. Principles of the evolution strategy can be found for example in Rechenberg, Ingo (1994) “Evolutionsstrategie”, Friedrich Frommann Holzboog Verlag, which is incorporated by reference herein in its entirety.
The application of EAs in the optimization of designs is well known, see for example the book “Evolutionary Algorithms in Engineering Applications” by D. Dasgupta and Z. Michalewicz, Springer Verlag, 1997, which is incorporated by reference herein in its entirety.
As evolutionary algorithms are more and more successfully used as optimization tools for large-scale “real-world” problems, the influence of noise on the performance and the convergence properties of evolutionary algorithms has come into focus.
Quality evaluations in optimization processes are frequently noisy due to design uncertainties concerning production tolerances or actuator imprecision acting directly on the design variables x. That is, the performance f of a design becomes a stochastic quantity $\tilde{f}$ via internal design perturbations
$\tilde{f}(x) = f(x + \delta)$, δ a random vector,
where the random vector δ obeys a certain unknown distribution (often modeled as a Gaussian distribution) and
$E[\delta] = 0$.
This means, given a design x, evaluating its quality $\tilde{f}(x)$ necessarily yields stochastic quantity values. As a result, an optimization algorithm applied to $\tilde{f}(x)$ must deal with these uncertainties and it must use this information to calculate a robust optimum based on an appropriate robustness measure.
Probably the most widely used measure is the expected value of $\tilde{f}(x)$, that is
$E[\tilde{f} \mid x]$.
Assuming a continuous design space, the expected value robustness measure is given by the integral
$E[\tilde{f} \mid x] = \int f(x + \delta)\, p(\delta)\, d\delta$,
where p(δ) is the density function of δ, and the optimal design $\hat{x}$ is formally obtained by
$\hat{x} = \arg\operatorname{opt}_x E[\tilde{f} \mid x]$.
If one were able to calculate $E[\tilde{f} \mid x]$ analytically, the resulting optimization problem would be an ordinary one, and standard (numerical) optimization techniques could be applied. However, real-world applications will usually not allow for an analytical treatment; therefore one has to rely on numerical estimates of $E[\tilde{f} \mid x]$ using Monte-Carlo simulations. Alternatively one can use direct search strategies capable of dealing with the noisy information directly.
The latter is the domain of evolutionary algorithms (EAs). In particular, evolutionary algorithms have been shown to cope with such stochastic variations better than other optimization algorithms, see e.g. “On the robustness of population-based versus point-based optimization in the presence of noise” by V. Nissen and J. Propach, IEEE Transactions on Evolutionary Computation 2(3):107-119, 1998, which is incorporated by reference herein in its entirety.
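The Monte-Carlo route mentioned above can be illustrated with a minimal sketch. It assumes Gaussian perturbations with a common standard deviation `eps` for all components; the function names `expected_quality` and `sphere` and all numerical values are hypothetical and chosen only for illustration.

```python
import numpy as np

def expected_quality(f, x, eps, n_samples=10_000, rng=None):
    """Monte-Carlo estimate of E[f(x + delta)], delta ~ N(0, eps^2 I) (assumed)."""
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x, dtype=float)
    # Draw n_samples perturbation vectors delta and average f over them.
    deltas = rng.normal(0.0, eps, size=(n_samples, x.size))
    values = np.array([f(x + d) for d in deltas])
    return values.mean()

def sphere(x):
    """Sphere model ||x||^2, used here only as a test function."""
    return float(np.dot(x, x))

# Estimate the expected quality of the design x = 0 in N = 5 dimensions.
estimate = expected_quality(sphere, np.zeros(5), eps=0.1,
                            rng=np.random.default_rng(0))
print(estimate)
```

For the sphere model the expectation is known in closed form, ‖x‖² + N·eps², so the estimate for this example should lie close to 0.05. Each estimate costs `n_samples` quality evaluations per design, which is what makes this route expensive for real-world problems.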
A conventional technique to find approximate solutions to the above equation using EAs is to use the design uncertainties δ explicitly. That is, given an individual design x, the perturbation δ is explicitly added to the design x. While the EA works on the evolution of x, the goal function in the black-box is evaluated with respect to $\tilde{x} := x + \delta$. Since in center-of-mass evolution strategies an individual offspring design is the result of a mutation z applied to a parental individual or to the parental centroid ⟨x⟩, respectively, the design actually tested is
$\tilde{x} := \langle x \rangle + z + \delta$.
Taking now another perspective, one might interpret z+δ as a mutation in its own right. This raises the question whether it is really necessary to artificially add the perturbation in a black-box to the design x. As an alternative one might simply use a mutation $\tilde{z} = z + \delta$ with a larger mutation strength instead of z. In other words, the mutations themselves may serve as robustness testers.
However, even though evolutionary algorithms/strategies are regarded as well suited for noisy optimization, their application to robust optimization bears some subtleties/problems: due to selection, the robustness of a design x is not tested with respect to samples of the density function p(δ). Selection prefers those designs which are by chance well adapted to the individual realizations of the perturbation δ.
For example, when considering actuator noise of standard deviation ε on a sphere model ‖x‖² (to be minimized), the actually measured standard deviation Di of a specific component i of the parent population will usually be smaller, i.e. Di<ε. This is so because selection eliminates those x+δ states with large length ‖x+δ‖. That is, shorter δ vectors are preferred, resulting in a smaller measured standard deviation. Therefore, an evolutionary algorithm for robust optimization must take this effect into account and take appropriate countermeasures.
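The shrinking effect of selection can be reproduced numerically. The following sketch assumes a design sitting at the optimum of the sphere model, Gaussian actuator noise, and truncation selection of the μ best out of λ perturbed states; all numerical values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
N, lam, mu, eps = 10, 1000, 250, 0.5    # hypothetical example values

x = np.zeros(N)                                      # design at the sphere optimum
perturbed = x + rng.normal(0.0, eps, size=(lam, N))  # realized states x + delta
fitness = np.sum(perturbed ** 2, axis=1)             # sphere model, to be minimized

# Truncation selection: keep the mu states with the smallest sphere value.
parents = perturbed[np.argsort(fitness)[:mu]]

# Measured standard deviation D_i of every component of the parent population.
D = parents.std(axis=0)
print("largest measured D_i:", D.max(), " applied eps:", eps)
```

Because only the shortest perturbed vectors survive, every measured Di falls below the applied noise strength, which is the effect the robustness variance control has to compensate.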
Furthermore, it is well known that noise deteriorates the performance of evolutionary algorithms. If the function to be optimized is noisy at its global or local optimizer, the evolutionary algorithm cannot reach the optimizer in expectation. That is, in the long run (steady state behavior) the parental individuals are located at a certain (expected) distance from the optimizer, both in the object parameter space and usually also in the quality/fitness space.
What is needed is an improved system and method (1) for evaluating the robustness of an Evolutionary Algorithm; (2) where the observed parental variance is controlled such that robustness (with regard to noise etc.) is tested correctly; and/or (3) for optimization that is driven by the trade-off between reducing the residual distance (induced by the noise) to the optimizer state and reducing the number of required fitness evaluations. In other words, such a method for optimization can reduce the residual distance (induced by the noise) to the optimizer state while at the same time minimizing the required additional fitness evaluation effort.
The present invention provides a system and method (1) for evaluating the robustness of an Evolutionary Algorithm; (2) where the observed parental variance is controlled such that robustness (with regard to noise etc.) is tested correctly; and/or (3) for optimization that is driven by the trade-off between reducing the residual distance (induced by the noise) to the optimizer state and reducing the number of required fitness evaluations. In other words, such a method for optimization can reduce the residual distance (induced by the noise) to the optimizer state while at the same time minimizing the required additional fitness evaluation effort.
One embodiment of the present invention is a method for optimizing a parameter set comprising object parameters, the method comprising the steps of: (a) creating an initial population of a plurality of individual parameter sets, the parameter sets comprising object parameters describing a model, structure, shape, design or process to be optimized and setting the initial population as a current parent population; (b) for each individual parameter set in a parent population mutating the parameters and optionally recombining the parameters to create an offspring population of individual parameter sets, wherein the strength of an individual object parameter mutation is enlarged by a noise contribution to enhance the robustness of the optimization; (c) evaluating a quality of each individual in the offspring population; (d) selecting individuals of the offspring population to be the next parent generation; and (e) repeating steps (b) through (d) until a termination criterion is reached.
The features and advantages described in the specification are not all inclusive and, in particular, many additional features and advantages will be apparent to one of ordinary skill in the art in view of the drawings, specification, and claims. Moreover, it should be noted that the language used in the specification has been principally selected for readability and instructional purposes, and may not have been selected to delineate or circumscribe the inventive subject matter.
A preferred embodiment of the present invention is now described with reference to the figures where like reference numbers indicate identical or functionally similar elements. Also in the figures, the left most digit of each reference number corresponds to the figure in which the reference number is first used.
Reference in the specification to “one embodiment” or to “an embodiment” means that a particular feature, structure, or characteristic described in connection with the embodiments is included in at least one embodiment of the invention. The appearances of the phrase “in one embodiment” in various places in the specification are not necessarily all referring to the same embodiment.
Some portions of the detailed description that follows are presented in terms of algorithms and symbolic representations of operations on data bits within a computer memory. These algorithmic descriptions and representations are the means used by those skilled in the data processing arts to most effectively convey the substance of their work to others skilled in the art. An algorithm is here, and generally, conceived to be a self-consistent sequence of steps (instructions) leading to a desired result. The steps are those requiring physical manipulations of physical quantities. Usually, though not necessarily, these quantities take the form of electrical, magnetic or optical signals capable of being stored, transferred, combined, compared and otherwise manipulated. It is convenient at times, principally for reasons of common usage, to refer to these signals as bits, values, elements, symbols, characters, terms, numbers, or the like. Furthermore, it is also convenient at times, to refer to certain arrangements of steps requiring physical manipulations of physical quantities as modules or code devices, without loss of generality.
However, all of these and similar terms are to be associated with the appropriate physical quantities and are merely convenient labels applied to these quantities. Unless specifically stated otherwise as apparent from the following discussion, it is appreciated that throughout the description, discussions utilizing terms such as “processing” or “computing” or “calculating” or “determining” or “displaying” or the like, refer to the action and processes of a computer system, or similar electronic computing device, that manipulates and transforms data represented as physical (electronic) quantities within the computer system memories or registers or other such information storage, transmission or display devices.
Certain aspects of the present invention include process steps and instructions described herein in the form of an algorithm. It should be noted that the process steps and instructions of the present invention could be embodied in software, firmware or hardware, and when embodied in software, could be downloaded to reside on and be operated from different platforms used by a variety of operating systems.
The present invention also relates to an apparatus for performing the operations herein. This apparatus may be specially constructed for the required purposes, or it may comprise a general-purpose computer selectively activated or reconfigured by a computer program stored in the computer. Such a computer program may be stored in a computer readable storage medium, such as, but not limited to, any type of disk including floppy disks, optical disks, CD-ROMs, magnetic-optical disks, read-only memories (ROMs), random access memories (RAMs), EPROMs, EEPROMs, magnetic or optical cards, application specific integrated circuits (ASICs), or any type of media suitable for storing electronic instructions, and each coupled to a computer system bus. Furthermore, the computers referred to in the specification may include a single processor or may be architectures employing multiple processor designs for increased computing capability.
The algorithms and displays presented herein are not inherently related to any particular computer or other apparatus. Various general-purpose systems may also be used with programs in accordance with the teachings herein, or it may prove convenient to construct more specialized apparatus to perform the required method steps. The required structure for a variety of these systems will appear from the description below. In addition, the present invention is not described with reference to any particular programming language. It will be appreciated that a variety of programming languages may be used to implement the teachings of the present invention as described herein, and any references below to specific languages are provided for disclosure of enablement and best mode of the present invention.
In addition, the language used in the specification has been principally selected for readability and instructional purposes, and may not have been selected to delineate or circumscribe the inventive subject matter. Accordingly, the disclosure of the present invention is intended to be illustrative, but not limiting, of the scope of the invention, which is set forth in the claims.
In order to unify/simplify the notations for the pseudo code description of the algorithms the following conventions will be used: g is the generation (time) counter; it appears as a parenthesized superscript on the respective quantities. The i-th component of a vector x is denoted by xi. Vectors are always denoted by boldface letters in the following. In order to refer to the i-th component of a vector x we will alternatively write (x)i, that is (x)i≡xi. N denotes the object parameter space dimension. x ∈ ℝ^N is the N-dimensional object parameter vector. σ is referred to as the mutation strength, being the standard deviation of the normally distributed mutations. μ is the parental population size. Quantities related to parental individuals are indexed by subscript m. λ is the number of offspring generated in a single generation. Quantities related to offspring individuals are indexed by subscript l and are denoted with a tilde. The truncation ratio μ/λ is an exogenous strategy parameter.
Normally distributed random variables/numbers y are denoted by N(
This is basically a centroid calculation. Overlined symbols, such as
$R := \|\langle x \rangle\|$.
r is the length of the first N−1 components of the vector x
a. Parameter Initialization
In step 100, the basic parameters of the optimization algorithm are initialized.
The generation g is set to a start value (0). The mutation strength σ, the parental population size μ and the number of offspring generated in a single generation λ are all set to initial values, whereby the number of offspring depends on the truncation ratio μ/λ. The recombinant ⟨x⟩ is set to an initial vector x(init).
b. Procreation of Offspring for Generation n
In step 110, the procreation of the λ offspring is realized.
First, a log-normal mutation of the recombined strategy parameter σ is performed:
$\tilde{\sigma}_l := \langle \sigma \rangle \exp[\tau_\sigma N_l(0,1)]$ (1)
where τσ is an exogenous strategy parameter, the so-called learning parameter. In order to ensure linear convergence order of the evolutionary algorithm on the sphere, it is known to be sufficient to ensure that $\tau_\sigma \propto 1/\sqrt{N}$, such that
is a reasonable choice.
The new recombination of the strategy parameter is calculated as
Secondly, a mutation of the object parameter is performed on top of the recombinant ⟨x⟩:
$(\tilde{x}_l)_i := \langle x_i \rangle + \sqrt{\tilde{\sigma}_l^2 + \varepsilon_i^2}\, N_{l,i}(0,1)$ (4)
where l=1, . . . , λ and i=1, . . . , N and the new recombinant ⟨x⟩ is calculated as
At this point, it is to be noted that the actual strength by which the mutation is performed differs from the known “standard” evolution strategy: In order to account for noise (actuator noise etc.), the strength consists of the evolution strategy specific contribution σ and an additional noise contribution ε. Since normality of the noise is assumed, the sum of the strategy-specific mutation contribution and the noise contribution is still a normally distributed random vector, however with variance $\tilde{\sigma}_l^2 + \varepsilon_i^2$ for the i-th component. Performing the mutations in this way allows for taking advantage of the mutation inherent in the evolution strategy (of strength $\tilde{\sigma}_l$) as an additional robustness tester. That is, there is no need to evolve the EA's mutation strength down to very small values since it can take over a part of the robustness testing itself.
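The procreation step described by equations (1) and (4) can be sketched as follows. This is an illustrative sketch, not the pseudocode of the figures: the function name `procreate`, the argument names, and the concrete choice τσ = 1/√N (the text only requires proportionality to 1/√N) are assumptions.

```python
import numpy as np

def procreate(x_mean, sigma_mean, eps, lam, tau_sigma, rng):
    """Create lam offspring around the recombinant, following eqs. (1) and (4)."""
    N = x_mean.size
    # Eq. (1): log-normal mutation of the recombined mutation strength.
    sigma_off = sigma_mean * np.exp(tau_sigma * rng.standard_normal(lam))
    # Eq. (4): per-offspring, per-component strength sqrt(sigma_l^2 + eps_i^2).
    strength = np.sqrt(sigma_off[:, None] ** 2 + eps[None, :] ** 2)
    x_off = x_mean[None, :] + strength * rng.standard_normal((lam, N))
    return x_off, sigma_off

rng = np.random.default_rng(0)
N = 4
x_off, sigma_off = procreate(
    x_mean=np.zeros(N),           # recombinant <x>
    sigma_mean=1.0,               # recombined mutation strength <sigma>
    eps=np.full(N, 0.1),          # noise contributions eps_i
    lam=20,
    tau_sigma=1.0 / np.sqrt(N),   # assumed proportionality constant of 1
    rng=rng,
)
print(x_off.shape, sigma_off.shape)
```

Even if the mutation strength of an offspring becomes very small, its object parameters are still perturbed with a standard deviation of at least εi, so every offspring is tested against the noise.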
c. Controlling Mutation Strength (Robustness Variance Control)
In step 120, the strength of the mutation of the object parameter (cf. equation 4) is controlled by (a) measuring the parental population variance or its standard deviation Di and (b) adjusting the εi accordingly.
With regard to substep (a), it must be noted that, since robustness testing is highly noisy, calculating the parental population variance from just one generation results in highly fluctuating Di estimates not well suited for εi control. Therefore, a smooth Di estimate is needed. One way of smoothing the data is by weighted accumulation, also known as exponential averaging. Since
$D_i = \sqrt{\overline{x_i^2} - \overline{x_i}^2}$,
Di can be obtained from the smoothed time averages of xi and xi². The exponential averaging is designed in such a way that the xi and xi² information fades away exponentially fast if ⟨xi⟩ and ⟨xi²⟩, respectively, are zero. The time constant by which this process happens is controlled by the accumulation time constant cx ∈ [0,1]. Since the changing rates of the evolution strategy (e.g. the progress rate on the sphere) are often of the order 1/N, it is reasonable to use
$c_x = 1/N$ (7)
as a first choice.
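A minimal sketch of the smoothed Di estimate follows. The accumulation form (1 − cx)·old + cx·new is an assumption consistent with the exponential fading described above, and the variance is taken as the smoothed mean of squares minus the squared smoothed mean; the class name `SmoothedSpread`, the value cx = 1/N, and the synthetic parent data are hypothetical.

```python
import numpy as np

class SmoothedSpread:
    """Exponentially averaged estimate of the parental standard deviation D_i."""

    def __init__(self, n, c_x):
        self.c_x = c_x
        self.mean = np.zeros(n)       # smoothed average of <x_i>
        self.mean_sq = np.zeros(n)    # smoothed average of <x_i^2>

    def update(self, parents):
        """parents: array of shape (mu, N) holding the selected individuals."""
        c = self.c_x
        # Old information fades by the factor (1 - c) in every generation.
        self.mean = (1.0 - c) * self.mean + c * parents.mean(axis=0)
        self.mean_sq = (1.0 - c) * self.mean_sq + c * (parents ** 2).mean(axis=0)
        # Clip at zero to guard against small negative values from rounding.
        return np.sqrt(np.maximum(self.mean_sq - self.mean ** 2, 0.0))

rng = np.random.default_rng(3)
N, mu = 10, 50
tracker = SmoothedSpread(N, c_x=1.0 / N)
for _ in range(200):
    # Synthetic stand-in for a parent population with true spread 0.3.
    parents = rng.normal(0.0, 0.3, size=(mu, N))
    D = tracker.update(parents)
print(D)
```

Starting both accumulators at zero biases the first few estimates; with a time constant of the order 1/N this bias decays within a few multiples of N generations.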
With regard to substep (b), it is to be noted that the actual mutation strength depends on σ and εi. It is important to realize that εi is not equivalent to the desired actuator noise strength εi*. The latter is the desired strength by which the actually realized design instances are tested. As was already stated in the beginning, due to the (μ,λ)-selection, the actual variances of the selected (i.e. parental) $\tilde{x}_{m;\lambda}$ states are usually smaller than the desired εi*.
Therefore, εi must be controlled in such a way that the observed (i.e. measured) standard deviation
$D_i := \sqrt{\mathrm{Var}[\{(\tilde{x}_{1;\lambda})_i, \ldots, (\tilde{x}_{\mu;\lambda})_i\}]}$ (8)
gets close to εi*.
Given a stable estimate of the real parental population variance, one can compare it with the desired noise strength εi*.
That is, the aim is to control the observed parental population variance Di in such a way that
Di≈εi* (9).
If the evolution strategy is able to get close to the robust optimizer, then the above condition ensures that robustness is guaranteed for the correct target noise strength. While, in general, one cannot be sure that the evolution strategy locates the robust optimizer, fulfilling the above condition can be ensured by the control rule
εi:=εi exp[τεSign(εi*−Di)] (10).
If Di=εi*, the above equation does not change εi. In case Di<εi*, εi is increased and if Di>εi*, εi is decreased. Due to the choice of the sign function, the ε change rate is independent of the actual value of the Di−εi* difference. This ensures that large differences do not result in extreme εi changes. As an alternative one might replace sign( ) by a sigmoid function, e.g. the hyperbolic tangent.
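The control rule of equation (10) translates almost literally into code. In the sketch below the function name `update_eps` and the value of τε are hypothetical.

```python
import numpy as np

def update_eps(eps, D, eps_target, tau_eps):
    """Eq. (10): eps_i := eps_i * exp(tau_eps * sign(eps_target_i - D_i))."""
    return eps * np.exp(tau_eps * np.sign(eps_target - D))

eps = np.array([0.1, 0.1, 0.1])            # current noise contributions eps_i
D = np.array([0.05, 0.10, 0.20])           # measured parental spreads D_i
eps_target = np.array([0.1, 0.1, 0.1])     # desired noise strengths eps_i*

new_eps = update_eps(eps, D, eps_target, tau_eps=0.1)   # tau_eps is hypothetical
print(new_eps)
```

The first component is increased because its measured spread is too small, the second is left unchanged, and the third is decreased. The step is the same factor exp(±τε) regardless of how large the deviation is.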
The choice of the parameter τε, which may be interpreted as a damping constant, must be taken with care. The dynamics of the Di and εi interfere with each other. As a result, such a system can exhibit instabilities, e.g. oscillatory behavior. In order to prevent such instabilities, cx and τε must be chosen appropriately. While there is clearly a need for a thorough analysis, in the investigations done so far, the choice
worked flawlessly.
d. Increasing/Controlling Population Size
In step 130 as shown in
In order to control the population size λ, a measure is needed which allows for a decision whether to increase λ. Assuming a stationary actuator noise distribution, the dynamics of the evolution strategy will (usually) approach a steady state behavior in a certain vicinity of the optimizer. That is, for a certain time period one observes on average a measurable improvement in the observed parental fitness values. If, however, one reaches the vicinity of the steady state, parental fitness will start to fluctuate around an average value. Therefore, if one observes on average no improvements of the fitness values from generation g to g+1, it is time to increase the population size (a rule for λ-decrease has not been developed so far). The average parental fitness change ΔF is given by
$\Delta F = \langle F \rangle^{(g)} - \langle F \rangle^{(g-1)}$ (12)
where
Since ΔF itself is a strongly fluctuating quantity, an exponential smoothing should be used to avoid unnecessary population increase due to random fluctuations, as in line 28 of the algorithm shown in
can be used.
As an update rule, the population size is increased at every Δg-th generation if the (exponentially smoothed) average fitness change indicates no improvement.
When considering maximization, desired fitness changes are of the kind ΔF>0. Therefore, if ΔF≤0, the population size λ should be increased. Since the increased population does not necessarily change the sign of ΔF in the next generation (random fluctuations), the test of the update rule in Line 29 is performed only every Δg-th generation.
The λ-update itself is carried out in step 140 of
$\mu := \lceil \mu c_\mu \rceil$ (15).
That is, the new μ is obtained from the old μ using the change rate cμ.
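The population size control can be sketched as below for the maximization case. The exponential smoothing form, the rounding up to integers, and the derivation of λ from μ via the truncation ratio μ/λ are assumptions made for illustration; the names `smooth_delta_f`, `maybe_grow`, `c_f` and `ratio` are hypothetical.

```python
import math

def smooth_delta_f(prev, delta_f, c_f):
    """Exponentially smoothed fitness change (assumed accumulation form)."""
    return (1.0 - c_f) * prev + c_f * delta_f

def maybe_grow(mu, ratio, smoothed_delta_f, g, delta_g, c_mu):
    """Increase mu by the change rate c_mu every delta_g-th generation
    if the smoothed fitness change shows no improvement (maximization)."""
    if g % delta_g == 0 and smoothed_delta_f <= 0.0:
        mu = math.ceil(mu * c_mu)        # rounding up is an assumption
    lam = math.ceil(mu / ratio)          # offspring number from mu / lam = ratio
    return mu, lam

# No improvement at a test generation: the population grows.
print(maybe_grow(mu=3, ratio=0.5, smoothed_delta_f=-0.1, g=10, delta_g=5, c_mu=4))
# Improvement observed: the population is left unchanged.
print(maybe_grow(mu=3, ratio=0.5, smoothed_delta_f=0.2, g=10, delta_g=5, c_mu=4))
```

Testing the condition only every Δg-th generation gives the enlarged population time to show its effect before the next decision is taken.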
e. Terminating Condition
In step 150 it is checked whether the optimized state fulfils a predetermined criterion.
If yes, the procedure ends. If no, the procedure continues with the procreation of offspring in the next generation.
As termination conditions the standard stopping rules can be used:
A concrete example of an implementation of the above-described method is presented using pseudocode in
Parameter initialization (step 100 in
The procreation of offspring (step 110 in
An intermediate recombination of the strategy parameter σ is done on line 18 of
Control of the actually observed fluctuation strength (step 120 in
Control of the population size (steps 130 and 140) is realized according to the invention in lines 27 to 32 of
There are three new exogenous strategy parameters to be fixed: the truncation ratio μ/λ, the update time interval Δg, and the μ change rate cμ. Extensive simulations suggest
Δg=N, cμ=4
as a rule of thumb. That is, population upgrading should be done in a rather aggressive manner. The choice of the truncation ratio should be in the interval
μ/λ = 0.4, . . . , 0.6.
The second embodiment of the invention shown in
Unlike the mutation strength adaptation by the previous embodiment described with reference to
Due to the special way in which the mutation strength is determined in the CSA evolution strategy and in which the offspring are generated using the same mutation strength σ for all offspring individuals, there seems to be no direct way to transfer the idea of direct robustness testing through mutations to the CSA evolution strategy.
Therefore, the present embodiment employs a black-box approach: the evolution strategy is applied without modifications to the function ƒ(x), which is internally disturbed by actuator noise of strength ε. Thus, one has to differentiate between the evolution strategy's individual vectors $\tilde{x}_l$ and the real actuator state $\tilde{x}_l^{a}$ entering the ƒ function. The latter is invisible to the employed evolution strategy; however, it is needed for calculating the actually realized parental actuator fluctuations measured by the standard deviation Di.
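The black-box arrangement can be sketched as a wrapper that perturbs the design internally and records the actuator state for later use. The class name `NoisyBlackBox`, the Gaussian noise model, the sphere test function, and the randomly drawn stand-in offspring are assumptions for illustration only.

```python
import numpy as np

class NoisyBlackBox:
    """Evaluates f at the perturbed actuator state and remembers that state."""

    def __init__(self, f, eps, rng):
        self.f = f
        self.eps = eps                   # actuator noise strength per component
        self.rng = rng
        self.actuator_states = []        # hidden from the evolution strategy

    def __call__(self, x):
        x_act = x + self.rng.normal(0.0, self.eps, size=x.shape)
        self.actuator_states.append(x_act)
        return self.f(x_act)

rng = np.random.default_rng(2)
N, lam, mu = 5, 40, 10
box = NoisyBlackBox(lambda v: float(np.dot(v, v)), eps=np.full(N, 0.2), rng=rng)

offspring = rng.normal(0.0, 1.0, size=(lam, N))   # stand-in for ES offspring
fitness = np.array([box(x) for x in offspring])
best = np.argsort(fitness)[:mu]                   # selection for minimization

# D_i is measured on the actuator states of the selected parents,
# not on the vectors the evolution strategy itself works with.
D = np.asarray(box.actuator_states)[best].std(axis=0)
print(D)
```

The evolution strategy only ever sees `offspring` and `fitness`; the recorded actuator states are used solely by the variance control.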
In order to perform the path accumulation, the accumulation time constant cσ must be fixed. Two different recommendations concerning cσ can be found in the literature: $\propto 1/\sqrt{N}$ and $\propto 1/N$. From the viewpoint of stability
should be chosen. According to experimental evidence and theoretical analysis, the damping constant dσ must be chosen depending on cσ
A concrete example of an implementation of the method described in relation to
Parameter initialization (step 200 in
The procreation of offspring (step 210 in
The major difference to the method according to the first embodiment is located in Lines 20 and 21 of
Control of the actually observed fluctuation strength (step 220 in
Control of the population size (steps 230 and 240) is realized according to the invention in lines 32 to 35 of
The rest of the algorithm in
The present invention finds application for all kinds of structure encoding.
Specific examples of application of the present invention apart from turbine blades are airfoil shapes and other aerodynamic or hydrodynamic structures. Other fields of application are architecture and civil engineering, computer science, wing design, engine valve design and scheduling problems.
While particular embodiments and applications of the present invention have been illustrated and described herein, it is to be understood that the invention is not limited to the precise construction and components disclosed herein and that various modifications, changes, and variations may be made in the arrangement, operation, and details of the methods and apparatuses of the present invention without departing from the spirit and scope of the invention as it is defined in the appended claims.
Number | Date | Country | Kind |
---|---|---|---|
05 019 800.1 | Sep 2005 | EP | regional |