The present invention generally relates to a method and apparatus for empirical designs of experiments, and more particularly relates to a particular design of experiments pertaining to a simulator model using historical data and a nonlinear neural network model.
Designs of experiments are often used in studying the effects of multiple input variables upon one or more output variables, such as the quantifiable output of a particular process. For example, designs of experiments can be used in testing the effects of various environmental conditions upon the operation of a particular apparatus, such as a gas turbine engine. In such an example, the input variables can represent certain quantifiable conditions, such as altitude and inlet pressure, and the output variables can represent quantifiable measures representing the operation of an apparatus, such as the exhaust gas temperature of a gas turbine engine. Designs of experiments often use linear models to approximate the relationship between the input variables and the output variables.
Often a design of experiments is conducted by running a series of experiments on an apparatus such as a gas turbine engine. In such experiments, the input variables representing the environmental conditions are systematically altered, and the corresponding effects on the output variables are recorded. However, in many circumstances the physical apparatus may be costly to obtain and/or not readily available. Moreover, it is often difficult, costly and time consuming to properly configure the testing so that the input variables represent the entire range of environmental conditions, and to perform the testing and collect the data from the results of all of the tests to obtain complete and accurate results in the experiments on the apparatus.
An alternative approach, using an accurate model as a proxy for the apparatus, can save a significant amount of time and money with little loss of accuracy, depending on the accuracy of the baseline model. However, frequently the available models are too complex and/or cumbersome to run efficiently, often relying on thousands of data points, and taking weeks or months to run, for example in the case of available finite element models for gas turbine engines. Other available models, such as linear regression models, may not provide a very accurate fit for the data, particularly for nonlinear relationships among the variables.
Accordingly, there is a need for an improved design of experiments for modeling relationships between input variables and output variables associated with the operation of an apparatus or other process, such as the operation of a gas turbine engine, that is more accurate, time effective and/or cost effective than existing models, that does not require running new tests on the apparatus or process, and that does not have the limitations of a linear regression model.
A method is provided for a design of experiments for modeling the effects of two or more input variables on one or more output variables. The method comprises a first step of generating a data set comprising data points from historical data for the input variables and the output variables, each data point comprising corresponding values for one or more input variables and one or more output variables. The method further comprises a second step of identifying any fault data points in the historical data, a fault data point being a data point in which an output variable value is determined to be caused by factors other than the input variables, and a third step of removing the identified fault data points from the data set, thereby generating a revised data set. The method further comprises a fourth step of supplying the data points from the revised data set into a nonlinear neural network model, and a fifth step of deriving a simulator model characterizing a relationship between the input variables and the output variables using the nonlinear neural network model with the supplied data.
An apparatus is provided for modeling the effects of two or more input variables on one or more output variables. The apparatus comprises means for generating a data set comprising data points from historical data for the input variables and the output variables, in which each data point comprises corresponding values for one or more input variables and one or more output variables. The apparatus further comprises means for identifying any fault data points from the historical data, and means for removing the identified fault data points from the data set, thereby generating a revised data set. The apparatus further comprises means for supplying the data points from the revised data set into a nonlinear neural network model, and means for deriving a simulator model characterizing a relationship between the input variables and the output variables using the nonlinear neural network model with the supplied data.
The present invention will hereinafter be described in conjunction with the following drawing figures, wherein like numerals denote like elements, and
The following detailed description of the invention is merely exemplary in nature and is not intended to limit the invention or the application and uses of the invention. Furthermore, there is no intention to be bound by any theory presented in the preceding background of the invention or the following detailed description of the invention.
Turning now to
Preferably each data point 22 comprises corresponding values for each input variable 12 and each output variable 14, so that the data points 22 represent more accurate and meaningful relationships between the input variables 12 and the output variables 14. For example, in the case of a gas turbine engine, each data point 22 preferably includes values for each of the input variable 12 environmental conditions affecting turbine engine performance during a particular time period, as well as values for each of the output variable 14 turbine engine performance measures resulting from this particular set of environmental conditions. By including values for each of the input variables 12 and each of the output variables 14 in each data point 22, this preferred embodiment helps to prevent a situation in which the effects of a particular input variable 12 may otherwise be masked or incorrectly attributed to another input variable 12, which could occur if the particular input variable 12 did not have a value represented in a particular data point 22. However, it will be appreciated that in some situations values may be unavailable for one or more of the input variables 12 or output variables 14 in a particular data point 22, in which case the data point 22 may take a different configuration with fewer than all of the variable values.
It will also be appreciated that the historical data 18 may be obtained in any one of a number of different manners, for example from sensor records of prior operations of an apparatus or system. Next, in step 24 a data set 26 is generated by assembling the various data points 22. The data set 26 comprises the various data points 22 of the historical data 18.
Next, in step 28, the data set 26 is analyzed so as to split fault data 30 from no fault data 32. For the purposes of step 28, the fault data 30 includes any data points 22 for which an output variable 14 value is determined to be caused by factors other than the input variables 12. For example, in the example of a gas turbine engine, the fault data 30 may include data points 22 for which the output variable 14 values are determined to be caused in significant part by some problem in the gas turbine engine, or the operation thereof, rather than by any environmental conditions that may be represented in the input variables 12. For the purposes of step 28, the no fault data 32 includes any data points 22 that are not fault data 30. In other words, the no fault data 32 includes data points 22 for which the output variable 14 values are determined to be caused predominantly by the input variables 12. As shown, the fault data 30 is removed from the data set 26 in step 34, the fault data 30 thereby becoming removed data 36. Conversely, the no fault data 32 is retained in step 38, resulting in a revised data set 40 comprising data points 22 of the no fault data 32. The revised data set 40 allows for a more accurate modeling of the effects of the input variables 12 on the output variables 14.
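By way of illustration only, the splitting of step 28 and the removal and retention of steps 34 and 38 can be sketched in Python as follows. The description does not specify how a data point 22 is determined to be fault data 30, so this sketch assumes a hypothetical `fault_logged` flag taken from maintenance records; the names `DataPoint` and `split_fault_data` and all numeric values are likewise made up for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class DataPoint:
    """One historical record: input values, output values, and a fault flag."""
    inputs: Dict[str, float]    # e.g. altitude, inlet pressure
    outputs: Dict[str, float]   # e.g. exhaust gas temperature
    fault_logged: bool = False  # hypothetical flag; not part of the description


def split_fault_data(
    data_set: List[DataPoint],
    is_fault: Callable[[DataPoint], bool],
) -> Tuple[List[DataPoint], List[DataPoint]]:
    """Return (removed_data, revised_data_set) according to the predicate."""
    removed: List[DataPoint] = []
    revised: List[DataPoint] = []
    for point in data_set:
        (removed if is_fault(point) else revised).append(point)
    return removed, revised


# Made-up values, for illustration only.
data_set = [
    DataPoint({"altitude": 0.0, "inlet_pressure": 101.3}, {"egt": 600.0}),
    DataPoint({"altitude": 5000.0, "inlet_pressure": 54.0}, {"egt": 640.0}),
    DataPoint({"altitude": 5000.0, "inlet_pressure": 54.0}, {"egt": 790.0},
              fault_logged=True),
]

removed_data, revised_data_set = split_fault_data(
    data_set, lambda p: p.fault_logged
)
```

Passing the fault test as a predicate keeps the split independent of whichever fault-detection rule is actually used in a given application.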
Next, in step 42, the data points 22 of the revised data set 40 are supplied to a neural network model 44 for the purposes of generating a simulator model 10. In a preferred embodiment a feed-forward neural network model 44 is used; however, it will be appreciated that any one of a number of different types of nonlinear models can be used for the neural network model 44. Regardless of the particular type of model used, in step 46 the neural network model 44 generates the simulator model 10, which includes one or more formulas modeling the effects of the input variables 12 on the output variables 14. For ease of reference, steps 42 and 46 will be collectively hereafter referenced as a single step 48, “Build a simulator model”, as depicted in
The simulator model 10 can be a very useful tool in designing, monitoring, and analyzing the particular apparatus, systems or processes for which the simulator model 10 is used. For example, in the above-mentioned application of a gas turbine engine, the simulator model 10 can be used for designing a gas turbine engine or components or parts thereof, improving the engine, components or parts, and predicting performance of an engine, among various other uses. The simulator model 10 can save significant time and money, particularly when (i) the apparatus, system or process to be studied is expensive or difficult to obtain; (ii) it is difficult, expensive or time consuming to run comprehensive testing on the apparatus, system or process; and/or (iii) available models lack sufficient accuracy, precision, simplicity or speed in running.
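As a minimal sketch of how a simulator model might be derived from a revised data set with a feed-forward neural network, the following Python example trains a network with one hidden layer by stochastic gradient descent. The network size, learning rate, tanh activation, training procedure, the function name `train_simulator`, and the toy data are all assumptions chosen for illustration; the description does not prescribe them.

```python
import math
import random


def train_simulator(points, hidden=4, epochs=500, lr=0.05, seed=0):
    """Fit a one-hidden-layer tanh network to (inputs, output) pairs.

    points: list of (list_of_input_values, output_value); values are
    assumed to be pre-scaled to roughly [-1, 1].
    Returns a function mapping an input list to a predicted output.
    """
    rng = random.Random(seed)
    n_in = len(points[0][0])
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(hidden)]
    b1 = [0.0] * hidden
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]
    b2 = 0.0

    def forward(x):
        # Hidden activations, then a linear output unit.
        h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
             for row, b in zip(w1, b1)]
        return h, sum(w * hj for w, hj in zip(w2, h)) + b2

    for _ in range(epochs):
        for x, y in points:
            h, pred = forward(x)
            err = pred - y  # gradient of 0.5 * squared error w.r.t. pred
            for j in range(hidden):
                # Back-propagate through tanh before updating w2[j].
                grad_h = err * w2[j] * (1.0 - h[j] ** 2)
                w2[j] -= lr * err * h[j]
                for i in range(n_in):
                    w1[j][i] -= lr * grad_h * x[i]
                b1[j] -= lr * grad_h
            b2 -= lr * err

    return lambda x: forward(x)[1]


# Toy nonlinear relationship standing in for a revised data set.
grid = [-1.0, -0.5, 0.0, 0.5, 1.0]
revised_data_set = [([a, b], 0.5 * a + 0.5 * b * b) for a in grid for b in grid]
simulator = train_simulator(revised_data_set)
```

The returned `simulator` callable plays the role of the simulator model in this sketch: given a list of input variable values, it returns a predicted output variable value.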
The original algorithm 52 includes a subset of the input variables 12, namely certain input variables 12 originally determined to have substantial effects on the output variables 14. It will be understood that the original algorithm 52 may be commonly known in the industry, and/or may be the result of empirical testing, a theory or hypothesis, or any one of a number of different ways to generate an algorithm. Regardless of the origin of the original algorithm 52, the extended process 50 uses a series of steps for enhancing the original algorithm 52.
The extended process 50, similar to the DOE process 16, begins with generating the data set 26, comprising data points 22 from the historical data 18. The data set 26 is split into fault data 30 and no fault data 32, as with the DOE process 16. Next, in step 48 the simulator model 10 is built, via the process set forth in greater detail in steps 42 and 46 of
In addition, statistical measures 54 are selected for the revised data set 40, and values for the statistical measures 54 are determined for each of the input variables 12, based on the data points 22 in the revised data set 40, for subsequent use with the simulator model 10. As shown in
Next, in step 56, the data points 22 corresponding to the values of the statistical measures 54 are used, in conjunction with the simulator model 10, to predict values of the output variables 14 corresponding to the values of the statistical measures 54 for the input variables 12. As shown in
Returning now to
Specifically,
Returning again to
Regardless of the particular techniques used in steps 58 and 60, the results of these steps are utilized in step 62 in generating an enhanced algorithm 64, which represents the addition and/or removal of certain input variables 12 as determined in steps 58 and 60. The enhanced algorithm 64 can be used for various purposes such as, for example, improved modeling and analysis of the effects of the input variables 12 on the output variables 14. For example, step 42 of the DOE process 16 can be re-run using values from the revised data set 40 corresponding with the input variables 12 in the enhanced algorithm 64, along with the output variables 14, to generate a new simulator model 10 in step 46 corresponding with the enhanced algorithm 64. As shown in
For example,
In
It will be appreciated that the extended process 50 and the enhanced algorithm 64 can be used for various other types of testing, modeling, and analysis, and can be used in any one of a number of different applications.
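The determination of values for the statistical measures 54 and the prediction of step 56 described above can be sketched as follows. This is an illustration under stated assumptions: the description does not name the statistical measures, so minimum, mean and maximum are assumed here, and combining them in a full factorial arrangement is also an assumption. The function names, the variable names and the stand-in `simulator` are hypothetical.

```python
from itertools import product
from statistics import mean


def statistical_levels(revised_data_set, names):
    """Compute (min, mean, max) of each input variable over the data set."""
    levels = {}
    for i, name in enumerate(names):
        column = [inputs[i] for inputs, _ in revised_data_set]
        levels[name] = (min(column), mean(column), max(column))
    return levels


def predict_design(levels, simulator):
    """Predict the output for every combination of input levels."""
    names = list(levels)
    rows = []
    for combo in product(*(levels[n] for n in names)):
        rows.append((dict(zip(names, combo)), simulator(list(combo))))
    return rows


# Made-up (inputs, output) pairs; the outputs are not used in this step.
revised_data_set = [([0.0, 1.0], 0.5), ([1.0, 3.0], 5.0), ([2.0, 2.0], 3.0)]

# Stand-in for a trained simulator model (hypothetical formula).
simulator = lambda x: 0.5 * x[0] + 0.5 * x[1] ** 2

levels = statistical_levels(revised_data_set, ["altitude", "inlet_pressure"])
rows = predict_design(levels, simulator)
```

With two input variables and three levels each, `rows` holds nine predicted design points, which could then be examined to judge how strongly each input variable affects the predicted output.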
In addition, the DOE process 16 and the extended process 50 can be implemented in a wide variety of platforms including, for example, any one of numerous computer systems. Turning now to
The processor 86 performs the computation and control functions of the computer system 84. The processor 86 may comprise any type of processor, including single integrated circuits such as a microprocessor, or may comprise any suitable number of integrated circuit devices and/or circuit boards working in cooperation to accomplish the functions of a processing unit. In addition, the processor 86 may comprise multiple processors implemented on separate systems. In addition, the processor 86 may be part of an overall system for an apparatus or process. During operation, the processor 86 executes the programs contained within the memory 94 and as such, controls the general operation of the computer system 84.
The memory 94 can be any type of suitable memory. This would include the various types of dynamic random access memory (DRAM) such as SDRAM, the various types of static RAM (SRAM), and the various types of non-volatile memory (PROM, EPROM, and flash). It should be understood that the memory 94 may be a single type of memory component, or it may be composed of many different types of memory components. In addition, the memory 94 and the processor 86 may be distributed across several different computers that collectively comprise the computer system 84. For example, a portion of the memory 94 may reside on a computer within a particular apparatus or process, and another portion may reside on a remote computer.
The bus 92 serves to transmit programs, data, status and other information or signals between the various components of the computer system 84. The bus 92 can be any suitable physical or logical means of connecting computer systems and components. This includes, but is not limited to, direct hard-wired connections, fiber optics, infrared and wireless bus technologies.
The interface 88 allows communication to the computer system 84, and can be implemented using any suitable method and apparatus. It can include one or more network interfaces to communicate to other systems, terminal interfaces to communicate with technicians, and storage interfaces to connect to storage apparatuses such as the storage device 90. The storage device 90 can be any suitable type of storage apparatus, including direct access storage devices such as hard disk drives, flash systems, floppy disk drives and optical disk drives. As shown in
In accordance with a preferred embodiment, the computer system 84 includes a program 98 for use in implementing the DOE process 16 and/or the extended process 50. During operation, the program 98 is stored in the memory 94 and executed by the processor 86. As one example implementation, the computer system 84 may also utilize an Internet website, for example for providing or maintaining data or performing operations thereon.
It should be understood that while the embodiment is described here in the context of a fully functioning computer system, those skilled in the art will recognize that the mechanisms of the present invention are capable of being distributed as a program product in a variety of forms, and that the present invention applies equally regardless of the particular type of computer-readable signal bearing media used to carry out the distribution. Examples of signal bearing media include: recordable media such as floppy disks, hard drives, memory cards and optical disks (e.g., disk 96), and transmission media such as digital and analog communication links.
While at least one exemplary embodiment has been presented in the foregoing detailed description of the invention, it should be appreciated that a vast number of variations exist. It should also be appreciated that the exemplary embodiment or exemplary embodiments are only examples, and are not intended to limit the scope, applicability, or configuration of the invention in any way. Rather, the foregoing detailed description will provide those skilled in the art with a convenient road map for implementing an exemplary embodiment of the invention, it being understood that various changes may be made in the function and arrangement of elements described in an exemplary embodiment without departing from the scope of the invention as set forth in the appended claims and their legal equivalents.
Number | Date | Country | |
---|---|---|---|
20070239633 A1 | Oct 2007 | US |