This application claims priority to Korean Patent Application No. 10-2019-0103991, filed on Aug. 23, 2019, and Korean Patent Application No. 10-2019-0164802, filed on Dec. 11, 2019, in the Korean Intellectual Property Office, the disclosures of which are incorporated by reference herein in their entireties.
The disclosure relates to modeling, and more particularly, to a method and a system for a hybrid model including a machine learning model and a rule-based model.
A modeling technique may be used to estimate an object or phenomenon having a causal relationship, and a model generated through the modeling technique may be used to predict or optimize the object or phenomenon. For example, the machine learning model may be generated by training (or learning) based on a large amount of sample data, and the rule-based model may be generated by at least one rule defined based on physical laws and the like. The machine learning model and the rule-based model may have different characteristics and thus may be applicable to different fields and have different advantages and disadvantages. Accordingly, a hybrid model that minimizes the disadvantages of the machine learning model and the rule-based model and maximizes the advantages thereof may be very useful.
According to an aspect of an example embodiment, there is provided a method for a hybrid model that includes a machine learning model and a rule-based model, the method including obtaining a first output from the rule-based model by providing a first input to the rule-based model, and obtaining a second output from the machine learning model by providing the first input, a second input, and the obtained first output to the machine learning model. The method further includes training the machine learning model, based on errors of the obtained second output.
According to another aspect of an example embodiment, there is provided a method for a hybrid model that includes a machine learning model and a rule-based model, the method including obtaining an output from the machine learning model by providing an input to the machine learning model, and evaluating the obtained output by providing the obtained output to the rule-based model. The method further includes training the machine learning model, based on a result of the obtained output being evaluated.
According to another aspect of an example embodiment, there is provided a method for a hybrid model that includes a plurality of machine learning models and a plurality of rule-based models, the method including obtaining a first output from a first rule-based model by providing a first input to the first rule-based model, and obtaining a second output from a first machine learning model by providing a second input to the first machine learning model. The method further includes obtaining a third output by providing the obtained first output and the obtained second output to a second rule-based model or a second machine learning model, and training the first machine learning model, based on errors of the obtained third output.
According to another aspect of an example embodiment, there is provided a system for a hybrid model that includes a machine learning model and a rule-based model, the system including at least one computer subsystem, and at least one component that is executed by the at least one computer subsystem. The at least one component includes the rule-based model configured to obtain a first output from a first input, based on at least one predefined rule, the machine learning model configured to obtain a second output from the first input, a second input, and the obtained first output, and a model trainer configured to train the machine learning model, based on errors of the obtained second output.
According to another aspect of an example embodiment, there is provided a system for a hybrid model that includes a machine learning model and a rule-based model, the system including at least one computer subsystem, and at least one component that is executed by the at least one computer subsystem. The at least one component includes the machine learning model configured to obtain an output from an input, the rule-based model configured to evaluate the obtained output, based on at least one predefined rule, and a model trainer configured to train the machine learning model, based on a result of the obtained output being evaluated.
According to another aspect of an example embodiment, there is provided a system for a hybrid model that includes a plurality of machine learning models and a plurality of rule-based models, the system including at least one computer subsystem, and at least one component that is executed by the at least one computer subsystem. The at least one component includes a first rule-based model configured to obtain a first output from a first input, based on at least one predefined rule, a first machine learning model configured to obtain a second output from a second input, a second rule-based model or a second machine learning model configured to obtain a third output from the obtained first output and the obtained second output, and a model trainer configured to train the first machine learning model, based on errors of the obtained third output.
Example embodiments of the disclosure will be more clearly understood from the following detailed description taken in conjunction with the accompanying drawings in which:
The hybrid model 10 may be implemented by any computing system (e.g., a computing system 170 of
A rule-based model based on at least one predefined rule and a machine learning model based on a large amount of sample data may each have unique advantages and disadvantages due to different features. For example, the rule-based model may be easy for humans to understand and require a relatively small amount of data. Thus, the rule-based model may provide relatively high explainability and generalizability but may be applicable to relatively limited areas and provide relatively low predictability. On the other hand, the machine learning model may not be easy for humans to understand and may require a large amount of sample data. Accordingly, the machine learning model may provide relatively low generalizability and low explainability but may be applicable to a wide range of areas and provide relatively high predictability. As will be described below with reference to the drawings, in the hybrid model 10 according to embodiments, the rule-based model 12 and the machine learning model 14 are integrated together to maximize the advantages of the rule-based model 12 and the machine learning model 14 and minimize the disadvantages thereof, thereby providing high modeling accuracy and reducing costs.
The first input IN1 and the second input IN2 may correspond to at least some of factors affecting an object or a phenomenon to be modeled by the hybrid model 10, and the output OUT may represent a state or a change of the object or the phenomenon. The first input IN1 may correspond to factors that affect the output OUT and for which rules are defined, and the second input IN2 may correspond to factors that affect the output OUT and for which rules are not defined. In the present specification, the first input IN1 may be referred to as an input for the rule-based model 12, and the second input IN2 may be referred to as an input not for the rule-based model 12. In embodiments, the second input IN2 may be omitted.
The rule-based model 12 may include at least one rule defined by the first input IN1. For example, the rule-based model 12 may include at least one formula defined by the first input IN1 and include at least one condition that the first input IN1 may satisfy. In embodiments, the rule-based model 12 may include any one or any combination of a physical simulator, an emulator modeled on the physical simulator, an analytical rule, a heuristic rule, and an empirical rule, to which at least a portion of the first input IN1 is input. For example, the rule-based model 12 may include at least one model, e.g., a SPICE model used for circuit simulation, which uses electrical values, e.g., voltage, current, and the like, as inputs. Rules included in the rule-based model 12 may be defined based on physical phenomena, and the rule-based model 12 may be referred to as a physical model herein.
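As an illustration of this structure (not a rule taken from the embodiments themselves), a rule-based model of this kind can be sketched as a function that applies a predefined formula subject to a condition on its input; Ohm's law is used here purely as a placeholder:

```python
# Hypothetical sketch: a rule-based model as a predefined formula plus a
# condition that the input may satisfy. The formula (Ohm's law, I = V / R)
# and the condition are illustrative stand-ins, not rules from the
# disclosure.

def rule_based_model(voltage, resistance):
    # Condition on the input: the rule is only defined for a physically
    # meaningful, positive resistance.
    if resistance <= 0:
        raise ValueError("rule undefined for non-positive resistance")
    # Formula defined by the input: Ohm's law.
    return voltage / resistance

print(rule_based_model(3.0, 1.5))  # 2.0
```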
The machine learning model 14 may have any structure that may be trained by machine learning. Examples of the machine learning model 14 may include an artificial neural network, a decision tree, a support vector machine, a Bayesian network, and/or a genetic algorithm. Objects or phenomena may not be completely modeled by the rules included in the rule-based model 12, and the machine learning model 14 may supplement parts not modeled by the rules. Non-limiting examples of the hybrid model 10 including the rule-based model 12 and the machine learning model 14 and non-limiting examples of a method for the hybrid model 10 will be described with reference to drawings below.
Referring to
In operation S33, the first input IN1, a second input IN2, and the first output OUT1 may be provided to the machine learning model 24. In operation S34, a second output OUT2 may be obtained from the machine learning model 24. As described above with reference to
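The data flow just described, in which the rule-based model's first output is fed, together with both inputs, into the machine learning model, can be sketched as follows; both models here are hypothetical stand-ins (a fixed formula and an untrained linear layer), not the models of the embodiments:

```python
import numpy as np

rng = np.random.default_rng(0)

def rule_based_model(in1):
    # Hypothetical physical rule: a fixed formula over the first input.
    return 2.0 * in1 + 1.0

# Hypothetical machine learning model: a linear layer over the
# concatenation of the first input, the second input, and the rule-based
# model's output (the weights would normally be learned in training).
w = rng.normal(size=3)

def machine_learning_model(in1, in2, out1):
    features = np.array([in1, in2, out1])
    return float(w @ features)

in1, in2 = 0.5, -1.0
out1 = rule_based_model(in1)                   # first output OUT1
out2 = machine_learning_model(in1, in2, out1)  # second output OUT2
```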
In operation S35, the machine learning model 24 may be trained based on errors of the second output OUT2. The errors of the second output OUT2 may correspond to the differences between expected values (or measured values) of the second output OUT2 and values of the second output OUT2. The machine learning model 24 may be trained in various ways. For example, the machine learning model 24 may include an artificial neural network, and weights of the artificial neural network may be adjusted based on values back-propagated from the errors of the second output OUT2. An example of operation S35 will be described with reference to
In operation S35_1, a loss function may be calculated based on the errors of the second output OUT2. The loss function may be defined to evaluate the second output OUT2 generated from the first input IN1 and the second input IN2, and may also be referred to as a cost function. The loss function may define a value that increases as the difference between the second output OUT2 and an expected value (or a measured value) increases. In embodiments, a value of the loss function may increase as the errors of the second output OUT2 increase. Thereafter, in operation S35_2, the machine learning model 24 may be trained to reduce a result value of the loss function.
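The two operations above can be sketched as a loss computed from the errors of the second output, reduced by gradient descent; a plain linear model stands in for the machine learning model, and all data here are synthetic placeholders:

```python
import numpy as np

# Hypothetical sketch of operations S35_1 and S35_2: a squared-error loss
# over the second output, reduced by gradient descent.

rng = np.random.default_rng(1)
X = rng.normal(size=(32, 3))      # columns: first input, second input, first output
true_w = np.array([1.0, -2.0, 0.5])
y = X @ true_w                    # expected (measured) values of the second output

w = np.zeros(3)                   # trainable parameters of the stand-in model
for _ in range(500):
    out2 = X @ w
    errors = out2 - y             # errors of the second output
    loss = np.mean(errors ** 2)   # loss function (S35_1): grows with the errors
    grad = 2 * X.T @ errors / len(X)
    w -= 0.1 * grad               # train to reduce the loss value (S35_2)

final_loss = np.mean((X @ w - y) ** 2)
```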
In embodiments, the hybrid model 20 of
In Equation 1, the metal gate boundary (MGB) represents the distance of a gate from a boundary, and Nfin represents the number of fins included in a FinFET. The first input IN1 may include the MGB, Nfin, and the like of Equation 1. The rule-based model 22 may generate, as the first output OUT1, the variation ΔVt of the threshold voltage corresponding to the first input IN1, based on Equation 1. The machine learning model 24 may receive not only the first input IN1 but also the first output OUT1, i.e., the variation ΔVt of the threshold voltage, and generate a finally estimated variation ΔVt of the threshold voltage as the second output OUT2.
Referring to
Referring to
The rule-based model 62 may include rules that define at least part of the plasma process. For example, as illustrated in
The machine learning model 64 may receive the first input IN1 and the second input IN2, and receive, as the first output OUT1, the ion/radical ratio D61, the electron temperature D62, the energy distribution D63, and the angular distribution D64 from the rule-based model 62. The machine learning model 64 may generate the second output OUT2 from the first input IN1, the second input IN2, and the first output OUT1. The second output OUT2 may include values for accurately estimating a profile of a pattern formed by the plasma process and/or a degree of opening of the pattern.
The horizontal axis of the graph of
Referring to
In operation S86, rules of the rule-based model 22 may be modified based on the errors of the second output OUT2. For example, the rule-based model 22 may include a plurality of parameters used to generate the first output OUT1 from the first input IN1, and any one or any combination of the plurality of parameters may be modified based on the errors of the second output OUT2. Accordingly, the machine learning model 24 may be trained in operation S85 and the rules of the rule-based model 22 may be modified in operation S86, thereby increasing the accuracy of the hybrid model 20. An example of operation S86 will be described with reference to
In embodiments, the machine learning model 24 may be trained based on a degree to which the rules included in the rule-based model 22 are modified. The rules included in the rule-based model 22 may be defined based on physical phenomena, and thus, the machine learning model 24 may be trained such that fewer modifications are made to the rules included in the rule-based model 22. For example, operation S85 of
Lnew(θ), which is the first term of Equation 2, may correspond to the errors of the second output OUT2 or values derived from the errors. In the second term of Equation 2, λ may be a constant that determines the relative weight given to training the machine learning model 24 and to regularizing the rule-based model 22, θn may represent an nth parameter included in the rule-based model 22 before the rule-based model is adjusted, θn* may represent the nth parameter after the rule-based model is adjusted, and Fn may be a constant determined according to the importance of the nth parameter. As the differences between the plurality of parameters included in the rule-based model 22 and the adjusted plurality of parameters increase, the second term of Equation 2 may increase and thus the value of the loss function L(θ) may also increase. As described above with reference to
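From the terms described here, Equation 2 appears to take the following form; the quadratic shape of the penalty is an assumption inferred from the statement that the second term grows with the parameter differences (it parallels elastic-weight-consolidation-style regularization), not a form stated explicitly in the text:

```latex
% Hedged reconstruction of Equation 2 from the surrounding description.
L(\theta) \;=\; L_{\mathrm{new}}(\theta)
  \;+\; \lambda \sum_{n} F_{n}\,\bigl(\theta_{n}^{*} - \theta_{n}\bigr)^{2}
```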
In operation S86_1, the machine learning model 24 may be frozen. For example, values of internal parameters of the machine learning model 24 may be changed in a process of training the machine learning model 24 in operation S85 of
In operation S86_2, errors of the first output OUT1 may be generated from errors of the second output OUT2. For example, the errors of the first output OUT1 due to the errors of the second output OUT2 may be generated from the machine learning model 24 frozen in operation S86_1 while the first input IN1 and the second input IN2 are given. In some embodiments, when the machine learning model 24 includes an artificial neural network, the errors of the first output OUT1 may be calculated from the errors of the second output OUT2 while weights included in the artificial neural network are fixed.
In operation S86_3, the rules of the rule-based model 22 may be modified based on the errors of the first output OUT1. For example, any one or any combination of the plurality of parameters included in the rule-based model 22 may be adjusted, based on the given first input IN1 and the errors of the first output OUT1. Accordingly, the rule-based model 22 may include rules modified according to the adjusted parameters.
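Operations S86_1 to S86_3 can be sketched as follows: with the stand-in machine learning model frozen, the error observed at the second output is propagated back to an error of the first output, and the rule-based model's parameter is adjusted accordingly. All models, weights, and targets here are illustrative placeholders:

```python
import numpy as np

theta = 1.5                            # rule parameter (illustrative true value: 2.0)

def rule_based_model(in1, theta):
    return theta * in1                 # rule defined by the first input

w = np.array([0.0, 1.0])               # S86_1: ML weights are frozen (not updated)

def machine_learning_model(in1, out1):
    return float(w @ np.array([in1, out1]))

in1, target = 1.0, 2.0                 # expected value of the second output
for _ in range(100):
    out1 = rule_based_model(in1, theta)
    out2 = machine_learning_model(in1, out1)
    err2 = out2 - target               # error of the second output
    err1 = err2 * w[1]                 # S86_2: first-output error through frozen model
    theta -= 0.5 * err1 * in1          # S86_3: adjust the rule parameter
```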
The first machine learning model 101 may receive a first input IN1 as process parameters and may output a threshold voltage Vt of the transistor from the first input IN1. In some embodiments, unlike that illustrated in
The rule-based model 102 may receive the first input IN1, receive the threshold voltage Vt from the first machine learning model 101, and output drain current IdPHY physically estimated from the first input IN1 and the threshold voltage Vt. As illustrated in
Id = μCox(Vg−Vt)² [Equation 3]
In Equation 3, μ may represent the mobility of electrons (or holes), Cox may represent a gate capacitance per unit area, and Vg may represent a gate voltage.
The second machine learning model 104 may receive the first input IN1, the second input IN2, and the physically estimated drain current IdPHY, and output drain current IdFIN finally estimated from the first input IN1, the second input IN2, and the estimated drain current IdPHY.
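The cascade described above can be sketched as follows; the "machine learning models" here are simple placeholder functions (not trained models), the mobility and capacitance constants are illustrative values, and only Equation 3 is taken from the text:

```python
# Hypothetical sketch of the cascade: a first model estimates the threshold
# voltage Vt, Equation 3 turns it into a physically estimated drain current
# IdPHY, and a second model refines it into the final estimate IdFIN.

MU = 200e-4      # electron mobility in m^2/Vs (illustrative value)
COX = 1e-2       # gate capacitance per unit area in F/m^2 (illustrative)

def first_ml_model(process_params):
    # Placeholder for the first machine learning model: Vt from process
    # parameters (here, just a fixed affine map).
    return 0.3 + 0.01 * sum(process_params)

def rule_based_model(vg, vt):
    # Equation 3: Id = mu * Cox * (Vg - Vt)^2
    return MU * COX * (vg - vt) ** 2

def second_ml_model(id_phy, correction):
    # Placeholder for the second machine learning model: a learned
    # correction applied to the physical estimate.
    return id_phy * (1.0 + correction)

params = [1.0, 2.0]
vt = first_ml_model(params)             # threshold voltage Vt
id_phy = rule_based_model(1.0, vt)      # physically estimated drain current
id_fin = second_ml_model(id_phy, 0.05)  # finally estimated drain current
```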
In some embodiments, the rules included in the rule-based model 102 may be modified, as well as the first machine learning model 101 and the second machine learning model 104. For example, in the rule defined by Equation 3, μ, representing the electron mobility, may be modified (or corrected) based on Equation 4 below.
μ = g(μmin, μmax) [Equation 4]
In Equation 4, μmin may represent a minimum value of the electron mobility μ determined by errors of the physically estimated drain current IdPHY, and μmax may represent a maximum value of the electron mobility μ determined by the errors of the physically estimated drain current IdPHY. The electron mobility μ may be defined by a function g of the minimum value μmin and the maximum value μmax, and the rule defined by Equation 3 may be modified accordingly. According to an experiment with about 100 samples, the hybrid model 100 may perform three or more times better than a single machine learning model.
Referring to
Referring to
In operation S123, the second output OUT2 may be provided to the rule-based model 112. In operation S124, the second output OUT2 may be evaluated based on the first output OUT1 of the rule-based model 112. In some embodiments, the rule-based model 112 may include a rule defining an allowable range of the second output OUT2, and the second output OUT2 may be evaluated better (a score of evaluating the second output OUT2 may be increased) as it approaches the allowable range. In embodiments, the rule-based model 112 may include, as a rule, a formula defined by the second output OUT2, and the second output OUT2 may be evaluated better (the score of evaluating the second output OUT2 may be increased) as it approximates the formula. In embodiments, the first output OUT1 may have a value that increases or decreases as the result of evaluating the second output OUT2 improves.
In operation S125, the machine learning model 114 may be trained based on the evaluation result. In some embodiments, operation S125 may include operations S35_1 and S35_2 of
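Operations S123 to S125 can be sketched as a rule-based evaluation that penalizes the machine learning model's output when it leaves an allowable range, with the penalty folded into training; the range, toy model, and constants are illustrative placeholders:

```python
# Hypothetical sketch: the rule defines an allowable range; the evaluation
# score worsens (penalty grows) as the output moves away from the range,
# and the stand-in model is trained to reduce the penalty.

LOW, HIGH = 0.0, 1.0                   # allowable range defined by the rule

def evaluate(out2):
    # 0.0 inside the allowable range; grows as out2 moves away from it.
    if out2 < LOW:
        return (LOW - out2) ** 2
    if out2 > HIGH:
        return (out2 - HIGH) ** 2
    return 0.0

w = 5.0                                # trainable parameter of a toy model
x = 1.0
for _ in range(200):
    out2 = w * x                       # machine learning model's output
    penalty = evaluate(out2)           # rule-based evaluation (S124)
    # S125: nudge the parameter to reduce the penalty (finite difference)
    grad = (evaluate((w + 1e-4) * x) - penalty) / 1e-4
    w -= 0.1 * grad
```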
Based on a large number of experiments, a rule that a change of dimension with respect to a flow rate of a gas, i.e., sensitivity, is within a range EXP may be predefined, and the rule-based model 112 may include the predefined rule. When a single machine learning model is used, sensitivity beyond the range EXP may be estimated as indicated by “P1” in
Referring to
Referring to
A first input IN1 may be provided to the first rule-based model 142a in operation S151, and a first output OUT1 may be obtained from the first rule-based model 142a in operation S152. A second input IN2 may be provided to the first machine learning model 144a in operation S153, and a second output OUT2 may be obtained from the first machine learning model 144a in operation S154.
The first output OUT1 and the second output OUT2 may be provided to the second rule-based model 146a and/or the second machine learning model 146b in operation S155. A third output OUT3 may be obtained from the second rule-based model 146a and/or the second machine learning model 146b in operation S156. Next, the first machine learning model 144a may be trained based on errors of the third output OUT3 in operation S157. In some embodiments, in the hybrid model 140b of
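The pipeline of operations S151 to S157 can be sketched as two branches merged by a second model, with the first machine learning branch trained on the errors of the merged third output; every model here is a placeholder function with illustrative numbers:

```python
# Hypothetical sketch: a rule-based branch and a machine-learning branch
# computed in parallel, merged by a second model (S155/S156), with the
# first machine learning model trained on the third output's errors (S157).

def first_rule_based_model(in1):
    return 3.0 * in1                   # S151/S152: first output

a = 0.0                                # trainable parameter of the ML branch

def first_ml_model(in2):
    return a * in2                     # S153/S154: second output

def second_model(out1, out2):
    return out1 + out2                 # S155/S156: third output

in1, in2, target = 1.0, 2.0, 7.0
for _ in range(100):
    out3 = second_model(first_rule_based_model(in1), first_ml_model(in2))
    err3 = out3 - target               # errors of the third output
    a -= 0.1 * err3 * in2              # S157: train the first ML branch
```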
Referring to
The hybrid model 162′ may include first to fourth machine learning models ML1 to ML4, as well as the physical models Ph, Imp and SR, and the physical model MR and the fifth machine learning model ML5 may be integrated together. For example, similar to the machine learning model 24 of
The fourth machine learning model ML4 may receive the additional input X4 and provide an output to the fifth machine learning model ML5 and the physical model MR, which are integrated together. The fifth machine learning model ML5 may be integrated with the physical model MR. For example, the physical model MR and the fifth machine learning model ML5 may process outputs of the first to fourth machine learning models ML1 to ML4 in parallel as illustrated in
The computing system 170 may be a stationary computing system, such as a desktop computer, a workstation, or a server, or a mobile computing system such as a laptop computer. As illustrated in
The processor 171 may be referred to as a processing unit, for example, a micro-processor, an application processor (AP), a digital signal processor (DSP), or a graphics processing unit (GPU), and may include at least one core capable of executing an instruction set (e.g., IA-32 (Intel Architecture-32), 64-bit extensions of IA-32, x86-64, IA-64, PowerPC, Sparc, MIPS, or ARM). For example, the processor 171 may access memory, i.e., the RAM 174 or the ROM 175, via the bus 177, and execute instructions stored in the RAM 174 or the ROM 175.
The RAM 174 may store a program 174_1 for performing a method for a hybrid model or at least a part thereof, and the program 174_1 may cause the processor 171 to perform at least some of operations included in the method for the hybrid model. That is, the program 174_1 may include a plurality of instructions executable by the processor 171, and the plurality of instructions in the program 174_1 may cause the processor 171 to perform at least some of the operations included in the method described above.
The storage device 176 may retain data stored therein even when power supplied to the computing system 170 is cut off. Examples of the storage device 176 may include a non-volatile memory device or a storage medium such as a magnetic tape, an optical disk, or a magnetic disk. The storage device 176 may be detachable from the computing system 170. The storage device 176 may store the program 174_1 according to embodiments. The program 174_1 or at least a part thereof may be loaded from the storage device 176 to the RAM 174 before the program 174_1 is executed by the processor 171. Alternatively, the storage device 176 may store a file written in a programming language, and the program 174_1 generated by a compiler or the like from the file or at least a part thereof may be loaded to the RAM 174. As illustrated in
The storage device 176 may store data to be processed or data processed by the processor 171. That is, the processor 171 may generate data by processing data stored in the storage device 176 according to the program 174_1, and store the generated data in the storage device 176.
The I/O devices 172 may include an input device, such as a keyboard or a pointing device, and an output device such as a display device or a printer. For example, a user may trigger execution of the program 174_1, input training data, or check result data by the processor 171 through the I/O devices 172.
The network interface 173 may provide access to a network outside the computing system 170. For example, the network may include a large number of computing systems and communication links, and the communication links may include wired links, optical links, wireless links, or any other form of links.
The computer system 182 may include at least one computer subsystem, and the program 184_1 may include at least one component executed by at least one computer subsystem. For example, the at least one component may include a rule-based model and a machine learning model as described above with reference to the drawings, and include a model trainer that trains a machine learning model or modifies rules included in a rule-based model. The computer-readable medium 184 may include a non-volatile memory device, similar to the storage device 176 of
In operation S191, a hybrid model modeled on a semiconductor process may be generated. For example, the hybrid model may be generated by modeling any one or any combination of a plurality of processes included in the semiconductor process. As described above with reference to the drawings, the hybrid model may include at least one rule-based model (or physical model) and at least one machine learning model, and may be generated to output characteristics of an integrated circuit by receiving process parameters.
In operation S192, characteristics of an integrated circuit corresponding to process parameters may be obtained. For example, the characteristics of the integrated circuit, e.g., electron mobility and a dimension and profile of a pattern, may be obtained by providing the process parameters to the hybrid model generated in operation S191. As described above with reference to the drawings, the obtained characteristics of the integrated circuit may have high accuracy even though a small amount of sample data is provided to the hybrid model.
In operation S193, whether the process parameters are to be adjusted may be determined. For example, it may be determined whether the characteristics of the integrated circuit obtained in operation S192 satisfy requirements. When the characteristics of the integrated circuit do not satisfy the requirements, the process parameters may be adjusted and operation S192 may be performed again. Alternatively, when the characteristics of the integrated circuit satisfy the requirements, operation S194 may be subsequently performed.
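The adjustment loop of operations S192 and S193 can be sketched as follows; the stand-in hybrid model, the requirement value, and the update rule are all illustrative assumptions, not taken from the embodiments:

```python
# Hypothetical sketch: process parameters are fed to a stand-in hybrid
# model (S192), the resulting characteristic is checked against a
# requirement, and the parameters are adjusted until it is met (S193).

def hybrid_model(process_param):
    # Placeholder for the generated hybrid model: maps a process
    # parameter to an integrated-circuit characteristic.
    return 0.8 * process_param + 0.1

REQUIREMENT = 1.0                      # desired characteristic value
param, iterations = 0.0, 0
while iterations < 1000:
    characteristic = hybrid_model(param)             # operation S192
    if abs(characteristic - REQUIREMENT) < 1e-6:
        break                                        # requirement satisfied
    param += 0.5 * (REQUIREMENT - characteristic)    # adjust parameters (S193)
    iterations += 1
```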
In operation S194, an integrated circuit may be manufactured by a semiconductor process. For example, an integrated circuit may be manufactured by a semiconductor process to which the process parameters finally adjusted in operation S193 are applied. The semiconductor process may include a front-end-of-line (FEOL) process and a back-end-of-line (BEOL) process in which masks fabricated based on an integrated circuit are used. For example, the FEOL process may include planarizing and cleaning a wafer, forming trenches, forming wells, forming gate lines, forming a source and a drain, and the like. The BEOL process may include siliciding the gate, source, and drain regions, adding a dielectric, performing planarization, forming holes, adding a metal layer, forming vias, forming a passivation layer, and the like. The integrated circuit manufactured in operation S194 may have characteristics that closely match those obtained in operation S192, due to the high accuracy of the hybrid model. Accordingly, the time and costs for manufacturing an integrated circuit with desirable characteristics may be reduced, and an integrated circuit with better characteristics may be manufactured.
In operation S201, a hybrid model may be generated. For example, as described above with reference to the drawings, a hybrid model that includes a rule-based model and a machine learning model may be generated. The hybrid model may provide high efficiency and accuracy. Next, in operation S202, samples of an input and an output of the hybrid model may be collected. For example, samples of an input may be provided to the hybrid model, and samples of an output corresponding to the samples of the input may be obtained from the hybrid model.
In operation S203, a machine learning model modeled on the hybrid model may be generated. In some embodiments, a machine learning model (e.g., an artificial neural network) may be generated by modeling the hybrid model to reduce the computing resources consumed in implementing a hybrid model including a rule-based model and a machine learning model. To this end, the machine learning model modeled on the hybrid model may be trained with the samples of the input and the output collected in operation S202. The trained machine learning model may provide relatively low accuracy when compared to the hybrid model but may be implemented with reduced computing resources.
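Operations S201 to S203 amount to distilling the hybrid model into a lighter surrogate. A minimal sketch, assuming a linear stand-in for both the hybrid model and the surrogate (a real surrogate could be an artificial neural network as the text suggests):

```python
import numpy as np

def hybrid_model(x):
    # Placeholder for the hybrid model to be distilled (S201).
    return 2.0 * x + 1.0

rng = np.random.default_rng(2)
x_samples = rng.uniform(-1.0, 1.0, size=64)     # S202: input samples
y_samples = hybrid_model(x_samples)             # S202: output samples

# S203: fit the surrogate (slope and intercept) to the collected samples
# by least squares.
A = np.column_stack([x_samples, np.ones_like(x_samples)])
coef, _, _, _ = np.linalg.lstsq(A, y_samples, rcond=None)

def surrogate(x):
    return coef[0] * x + coef[1]
```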
As is traditional in the field of the technical concepts, the embodiments are described, and illustrated in the drawings, in terms of functional blocks, units and/or modules. Those skilled in the art will appreciate that these blocks, units and/or modules are physically implemented by electronic (or optical) circuits such as logic circuits, discrete components, microprocessors, hard-wired circuits, memory elements, wiring connections, and the like, which may be formed using semiconductor-based fabrication techniques or other manufacturing technologies. In the case of the blocks, units and/or modules being implemented by microprocessors or similar, they may be programmed using software (e.g., microcode) to perform various functions discussed herein and may optionally be driven by firmware and/or software. Alternatively, each block, unit and/or module may be implemented by dedicated hardware, or as a combination of dedicated hardware to perform some functions and a processor (e.g., one or more programmed microprocessors and associated circuitry) to perform other functions. Also, each block, unit and/or module of the embodiments may be physically separated into two or more interacting and discrete blocks, units and/or modules without departing from the scope of the technical concepts. Further, the blocks, units and/or modules of the embodiments may be physically combined into more complex blocks, units and/or modules without departing from the scope of the technical concepts.
While example embodiments have been shown and described, it will be understood that various changes in form and details may be made therein without departing from the spirit and scope of the following claims.
Number | Date | Country | Kind |
---|---|---|---|
10-2019-0103991 | Aug 2019 | KR | national |
10-2019-0164802 | Dec 2019 | KR | national |