This application claims the benefit of and priority to Korean Patent Application No. 10-2023-0144301, filed on Oct. 26, 2023, the entirety of which is incorporated herein by reference for all purposes.
Embodiments of the present disclosure relate to machine-learning techniques and semiconductor processes, and particularly to, for example and without limitation, a machine learning-based semiconductor process optimization method and system.
In recent years, neural networks have been actively used to predict and analyze the impact of semiconductor design and manufacturing processes and optimize device characteristics. However, in semiconductor processes demanding significant investment and precise yield control, utilizing neural networks for optimization and process monitoring with uncertain reliability can potentially yield detrimental results. Therefore, it is vital to verify results obtained using a neural network.
In particular, since electrical properties of manufactured devices have a non-uniform distribution, including uncontrollable variability, data density is not constant over the entire range. Consequently, there exists a sparse data space where the accuracy of neural network predictions may deteriorate. Moreover, when training and validation data sets are split to prevent overfitting of the neural network in the sparse data space, the prediction error in the sparse data space further increases due to the lack of training data.
Furthermore, when gradient descent is applied to a single neural network to optimize the input for a low output in the sparse data space, the resulting optimal input has an output deviation that depends on the training index of the neural network. That is, an input optimized with a single neural network can result in significant output variations, posing a challenge to the reliability of optimization processes.
The description of the related art should not be assumed to be prior art merely because it is mentioned in or associated with this section. The description of the related art includes information that describes one or more aspects of the subject technology, and the description in this section does not limit the invention.
The inventors of the present disclosure have recognized the problems and disadvantages of the related art and have performed extensive research. The inventors of the present disclosure have thus invented a new method and a new system that substantially obviate one or more problems due to limitations and disadvantages of the related art.
The disclosed embodiments are intended to provide a machine-learning method for semiconductor process optimization, capable of decreasing output variation and increasing reliability, and a computing device for performing the same.
In one or more aspects, there is provided a machine-learning method for semiconductor process optimization that is executed in a computing device including one or more processors and a memory storing one or more programs executed by the one or more processors. The machine-learning method may include inputting semiconductor-related parameters into each of a plurality of first neural network models and outputting, based on the semiconductor-related parameters, a predicted figure of merit of a semiconductor device as a first output value from each of the plurality of first neural network models. The semiconductor-related parameters may include electrical measurement parameter values measured on one or more semiconductor devices.
The plurality of first neural network models may each have the same neural network structure and be trained through a training data set randomized through a plurality of epochs, so that weights within neural network models of the plurality of first neural network models have different values.
The machine-learning method may further include sorting a plurality of first output values output from the plurality of first neural network models in order of size, removing first output values that fall within a preset upper range and first output values that fall within a preset lower range among the plurality of first output values, and inputting, into a second neural network model, first output values remaining after the first output values that fall within the preset upper range and the first output values that fall within the preset lower range are removed.
The machine-learning method may further include outputting a plurality of second output values from the second neural network model based on the remaining first output values and calculating a final output value based on the plurality of second output values.
The second neural network model may be trained to transform the remaining first output values to have the same number as the plurality of second output values.
The final output value may be a mean value of the plurality of second output values or a median value among the plurality of second output values.
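Purely for purposes of illustration, and not as a limitation of the claimed method, the sorting, removal, and final-value calculation described above may be sketched as follows; the function names, the trim ratio, and the use of a simple mean/median reduction are illustrative assumptions:

```python
from statistics import mean, median

def trim_outputs(first_outputs, trim_ratio=0.25):
    """Sort the first output values in order of size and remove those falling
    within the preset upper and lower ranges (here, a fixed fraction of each end)."""
    ordered = sorted(first_outputs)
    k = int(len(ordered) * trim_ratio)
    return ordered[k:len(ordered) - k]

def final_output(second_outputs, use_median=False):
    """Reduce the second output values to a single final output value, using
    either their mean value or their median value."""
    return median(second_outputs) if use_median else mean(second_outputs)
```

In this sketch, the removed extremes never reach the reduction step, so a single outlying model prediction cannot dominate the final predicted figure of merit.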
A semiconductor manufacturing process may be performed with at least one semiconductor manufacturing parameter as a target value. After the semiconductor manufacturing process is performed, the electrical measurement parameter values may be measured using one or more measuring devices. According to the semiconductor-related parameters, which are based on the at least one semiconductor manufacturing parameter and the electrical measurement parameter values measured using the one or more measuring devices, each of the plurality of first neural network models may determine the predicted figure of merit of the semiconductor device.
The machine-learning method may utilize a feedback loop between an output and an input of the plurality of first neural network models. The machine-learning method may further include computing an output value based on a plurality of first output values output from the plurality of first neural network models, updating the electrical measurement parameter values based on the output value, inputting updated semiconductor-related parameters into each of the plurality of first neural network models, and outputting, based on the updated semiconductor-related parameters, an updated predicted figure of merit of the semiconductor device as an updated first output value from each of the plurality of first neural network models. The updated semiconductor-related parameters may include the updated electrical measurement parameter values.
Updating the electrical measurement parameter values may include computing a gradient of the output value, and limiting at least one value of the updated electrical measurement parameter values to cause the at least one value to satisfy a preset limit. Updating the electrical measurement parameter values can enable the gradient to move in a preset direction.
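As an illustrative sketch only (the learning rate and limit values below are hypothetical, and the function form is not part of the claims), one such update step may move each parameter value opposite the computed gradient and then limit the updated value to a preset range:

```python
def update_parameters(params, grads, learning_rate=0.01, limits=None):
    """One gradient-descent update of the electrical measurement parameter values.
    Each value moves in the preset (negative-gradient) direction and, if limits
    are given, is clamped so that it satisfies the preset limit."""
    updated = [p - learning_rate * g for p, g in zip(params, grads)]
    if limits is not None:
        lower, upper = limits
        updated = [min(max(v, lower), upper) for v in updated]
    return updated
```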
The plurality of first neural network models may have different biases, and the plurality of first neural network models may have different local minima.
The plurality of first neural network models may be trained using a set of training data, including a fixed data set and a randomized data set. The randomized data set may include a training data set and a verification data set.
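By way of a hypothetical sketch (the 20/70/10 proportions follow the example embodiment described later in this disclosure and are illustrative, not limiting), such a split may hold out a fixed test set while re-randomizing the training/verification division:

```python
import random

def split_dataset(indices, seed=None):
    """Split sample indices into a fixed 20% test set and an 80% remainder that
    is re-shuffled (e.g., once per epoch) into a 70% training set and a 10%
    verification set."""
    n = len(indices)
    n_test = int(n * 0.2)
    test_set = indices[:n_test]      # fixed data set, identical across epochs
    remainder = indices[n_test:]     # randomized data set
    rng = random.Random(seed)
    shuffled = remainder[:]
    rng.shuffle(shuffled)
    n_train = int(n * 0.7)
    return shuffled[:n_train], shuffled[n_train:], test_set
```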
In one or more aspects, there is provided a computing device including one or more processors, a memory, and one or more programs, in which the one or more programs are configured to be stored in the memory and executed by the one or more processors. The one or more programs may include instructions for inputting semiconductor-related parameters into each of a plurality of first neural network models and instructions for outputting, based on the semiconductor-related parameters, a predicted figure of merit of a semiconductor device as a first output value from each of the plurality of first neural network models. The semiconductor-related parameters may include electrical measurement parameter values measured on one or more semiconductor devices.
In one or more aspects, there is provided a machine-learning method for semiconductor process optimization that is executed in a computing device including one or more processors and a memory storing one or more programs executed by the one or more processors. The machine-learning method may include inputting semiconductor-related parameters into each of a plurality of neural network models, outputting, based on the semiconductor-related parameters, a predicted figure of merit of a semiconductor device as an output value from each of the plurality of neural network models, sorting a plurality of output values output from the plurality of neural network models in order of size, removing output values that fall within a preset upper range and output values that fall within a preset lower range among the plurality of output values, and calculating a final output value based on output values remaining after the output values that fall within the preset upper range and the output values that fall within the preset lower range are removed.
In one or more aspects, there is provided a computing device including one or more processors, a memory, and one or more programs, in which the one or more programs are configured to be stored in the memory and executed by the one or more processors. The one or more programs may include instructions for inputting semiconductor-related parameters into each of a plurality of neural network models, instructions for outputting, based on the semiconductor-related parameters, a predicted figure of merit of a semiconductor device as an output value from each of the plurality of neural network models, instructions for sorting a plurality of output values output from the plurality of neural network models in order of size, instructions for removing output values that fall within a preset upper range and output values that fall within a preset lower range among the plurality of output values, and instructions for calculating a final output value based on output values remaining after the output values that fall within the preset upper range and the output values that fall within the preset lower range are removed.
Additional features, advantages, and aspects of the present disclosure are set forth in part in the description that follows and in part will become apparent from the present disclosure or may be learned by practice of the inventive concepts provided herein. Other features, advantages, and aspects of the present disclosure may be realized and attained by the descriptions provided in the present disclosure, or derivable therefrom, and the claims hereof as well as the appended drawings. It is intended that all such features, advantages, and aspects be included within this description, be within the scope of the present disclosure, and be protected by the following claims. Nothing in this section should be taken as a limitation on those claims. Further aspects and advantages are discussed below in conjunction with embodiments of the disclosure.
It is to be understood that both the foregoing description and the following description of the present disclosure are examples, and are intended to provide further explanation of the disclosure as claimed.
The accompanying drawings are incorporated in and constitute a part of this disclosure, illustrate aspects and embodiments of the disclosure, and together with the description serve to explain principles and examples of the disclosure.
Throughout the drawings and the detailed description, unless otherwise described, the same drawing reference numerals should be understood to refer to the same elements, features, and structures. The features, elements and depiction thereof may be exaggerated for clarity, illustration, and/or convenience.
Hereinafter, specific example embodiments of the present disclosure are described with reference to the accompanying drawings. The following detailed description is provided to assist in a comprehensive understanding of the methods, devices and/or systems described herein. However, the detailed description is only for illustrative purposes and the present disclosure is not limited thereto.
In describing the embodiments of the present disclosure, when it is determined that detailed descriptions of known technology related to the present disclosure may unnecessarily obscure the gist of the present disclosure, the detailed descriptions thereof may be omitted. The terms used below are defined in consideration of functions in the present disclosure, but may be changed depending on the customary practice, the intention of a user or operator, or the like. Thus, the definitions should be determined based on the overall content of the present specification. The terms used herein are only for describing example embodiments of the present disclosure, and should not be construed as limiting. Unless expressly stated otherwise, a singular form includes a plural form. Embodiments are example embodiments. Aspects are example aspects. In the present description, the terms “including,” “comprising,” “having,” and the like are used to indicate certain characteristics, numbers, steps, operations, elements, and a portion or combination thereof, but should not be interpreted to preclude one or more other characteristics, numbers, steps, operations, elements, and a portion or combination thereof.
In describing a temporal relationship, when the temporal order is described as, for example, “after,” “subsequent,” “next,” “before,” “preceding,” “prior to,” or the like, a case that is not consecutive or not sequential may be included and thus one or more other events may occur therebetween, unless a more limiting term, such as “just,” “immediate(ly),” or “direct(ly),” is used.
For the expression that an element is “connected,” “coupled,” “linked,” or the like to another element, the element can not only be directly connected, coupled, linked, or the like to another element, but also be indirectly connected, coupled, linked, or the like to another element with one or more intervening elements disposed or interposed between the elements, unless otherwise specified.
The term “at least one” should be understood as including any and all combinations of one or more of the associated listed items. For example, at least one of a plurality of elements can represent (i) one element of the plurality of elements, (ii) some elements of the plurality of elements, or (iii) all elements of the plurality of elements. The expression of a first element, a second element “and/or” a third element should be understood as one of the first, second and third elements or as any or all combinations of the first, second and third elements. The term “or” means “inclusive or” rather than “exclusive or.” For example, “a or b” may mean “a,” “b,” or “a and b.”
Further, it will be understood that, although the terms first, second, and so on may be used herein to describe various elements, these elements should not be limited by these terms. These terms may be used to distinguish one element from another element. For example, without departing from the scope of the present disclosure, a first element could be termed a second element, and similarly, a second element could be termed a first element.
Referring to
An electrical measurement parameter corresponding to each of semiconductor manufacturing parameters may be input into each of the plurality of first neural network models 102. Here, the semiconductor manufacturing parameters are design parameters used to manufacture a semiconductor of a specific structure. For example, when a semiconductor device includes transistors, the semiconductor manufacturing parameters may include gate length, oxide thickness, doping concentration, junction gradient, gate stack height, and so on.
In addition, the electrical measurement parameter is a measured parameter value that, after performing a semiconductor manufacturing process with the semiconductor manufacturing parameter as a target value, enables estimation of the semiconductor manufacturing parameter. For example, the electrical measurement parameter corresponding to the gate length may be an electrical critical dimension (ECD), and the electrical measurement parameter corresponding to the insulating film thickness may be an effective oxide thickness (EOT) and breakdown voltage (BV).
The plurality of first neural network models 102 may each predict a figure of merit (FOM) of the semiconductor device by using electrical measurement parameters as input. Hereinafter, electrical measurement parameters will be described as being used as input, but the present disclosure is not limited thereto and semiconductor manufacturing parameters may also be used as input. Electrical measurement parameters and semiconductor manufacturing parameters may be referred to as semiconductor-related parameters.
Here, figures of merit of the semiconductor device may include a power delay product (PDP), a frequency, a ring oscillator delay (ROD), power dissipation, a current-resistance (IR) drop, a voltage drop, and so on, but hereinafter, an example in which the figure of merit of the semiconductor device is power delay product (PDP) will be described. In an example, a PDP may be a metric used to evaluate the efficiency of a semiconductor device or circuit, and may represent the product of the power dissipation and the propagation delay of a semiconductor device or circuit.
When electrical measurement parameters are input, each of the plurality of first neural network models 102 may be trained to output the power delay product (PDP) of the corresponding semiconductor device as a predicted value based on the input electrical measurement parameters. In this case, each first neural network model 102 may output one predicted value.
The plurality of first neural network models 102 may each have the same neural network structure. In an example embodiment, when the plurality of first neural network models 102 are trained, a test data set may be fixed at 20% of the total data set, and the remaining 80% of the data set may be randomly divided, a total of M times, into a training data set (70%) and a verification data set (10%); each of the plurality of first neural network models 102 may then be trained through M epochs.
In this case, since the plurality of first neural network models 102 are trained through the training data set randomized during M epochs, the first neural network models 102 are trained so that weights within the neural network models have different values. That is, the plurality of first neural network models 102 may be trained to have different biases. Accordingly, the plurality of first neural network models 102 may have different local minima.
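As a simplified, non-limiting illustration of this training scheme, the sketch below fits N structurally identical models (plain least-squares lines standing in for neural network models) on independently re-randomized training subsets, so that the learned weights differ across models:

```python
import random

def fit_line(points):
    """Closed-form least-squares fit of y = w*x + b over (x, y) pairs."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    var = sum((x - mx) ** 2 for x, _ in points)
    cov = sum((x - mx) * (y - my) for x, y in points)
    w = cov / var
    return w, my - w * mx

def train_ensemble(data, n_models=5, seed=0):
    """Train N identical-structure models on independently randomized training
    subsets, so the learned weights (and hence biases/minima) differ."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        subset = rng.sample(data, int(0.7 * len(data)))  # randomized split
        models.append(fit_line(subset))
    return models
```

Because each model sees a different random subset, the fitted weights disagree slightly, which is the property the trimming stage later exploits.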
The semiconductor process optimization device 100 may sort output values (that is, predicted PDP values) of the plurality of first neural network models 102 in order of size. The semiconductor process optimization device 100 may remove output values that fall within a preset upper range (e.g., output values that exceed or are equal to a predetermined upper limit or a predetermined maximum level) and output values that fall within a preset lower range (e.g., output values that are less than or equal to a predetermined lower limit or a predetermined minimum level) from among a plurality of (that is, N) output values.
For example, the semiconductor process optimization device 100 may remove output values corresponding to a preset upper ratio and output values corresponding to a preset lower ratio among the plurality of output values. As another example, the semiconductor process optimization device 100 may remove output values that are equal to or greater than a preset upper threshold and output values that are less than or equal to a preset lower threshold in the Gaussian distribution of the plurality of output values.
The semiconductor process optimization device 100 may calculate a final output value (that is, a predicted final PDP value) based on output values remaining after the output values that fall within the preset upper range and the output values that fall within the preset lower range are removed among the plurality of output values.
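The two removal strategies described above may be sketched, purely for illustration (the ratios and the sigma threshold are hypothetical values, not limitations), as:

```python
from statistics import mean, pstdev

def trim_by_ratio(values, lower_ratio=0.2, upper_ratio=0.2):
    """Remove output values corresponding to preset lower and upper ratios
    of the size-sorted list."""
    ordered = sorted(values)
    lo = int(len(ordered) * lower_ratio)
    hi = len(ordered) - int(len(ordered) * upper_ratio)
    return ordered[lo:hi]

def trim_by_sigma(values, n_sigma=2.0):
    """Remove output values beyond preset thresholds of the fitted Gaussian
    distribution (mean plus/minus n_sigma standard deviations)."""
    mu, sigma = mean(values), pstdev(values)
    return [v for v in values if mu - n_sigma * sigma < v < mu + n_sigma * sigma]
```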
In
According to one or more example embodiments, by allowing the plurality of first neural network models 102 to have different biases and calculating the final output value based on the output values remaining after the output values that fall within the upper range and the output values that fall within the lower range are excluded among the output values of the plurality of first neural network models 102, it is possible to decrease a prediction error due to local minima of the plurality of first neural network models 102 and to derive optimal input with low output variance in a sparse data space.
Referring to
The semiconductor process optimization device 100 may calculate a gradient of the final PDP value, and then update the electrical measurement parameters so that the calculated gradient moves in a negative direction, using a preset learning rate, starting at time stamp t=0.
Meanwhile, the semiconductor process optimization device 100 may further include a limiter to prevent optimal electrical measurement parameters detected through the gradient descent method from exceeding the range (that is, the minimum to maximum values) of the electrical measurement parameters learned by the plurality of first neural network models 102. The limiter may receive, as input, an unbounded latent value (that is, ranging from −∞ to +∞) and use a limiter function to output a value within the previously learned range. The latent value may be a value generated by applying a sigmoid function to the electrical measurement parameters.
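A minimal sketch of such a limiter function follows, assuming a sigmoid-based mapping from an unbounded latent value into the learned parameter range (the specific affine scaling is an illustrative choice, not a claimed implementation):

```python
import math

def limiter(latent, learned_min, learned_max):
    """Map an unbounded latent value (-inf to +inf) into the range of electrical
    measurement parameter values learned by the first neural network models."""
    squashed = 1.0 / (1.0 + math.exp(-latent))  # sigmoid: output in (0, 1)
    return learned_min + (learned_max - learned_min) * squashed
```

Because the sigmoid output stays strictly between 0 and 1, any latent value the optimizer proposes is mapped back inside the range covered by the training data.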
In one or more examples, the semiconductor process optimization device 100 may compute an output value (e.g., a final PDP value) based on a plurality of first output values output from the plurality of first neural network models 102 and then update the electrical measurement parameter values based on the output value. The semiconductor process optimization device 100 may then input the updated semiconductor-related parameters into each of the plurality of first neural network models 102 and output, based on the updated semiconductor-related parameters, an updated predicted figure of merit of the semiconductor device as an updated first output value from each of the plurality of first neural network models 102. The updated semiconductor-related parameters may include the updated electrical measurement parameter values. The semiconductor process optimization device 100 may thus utilize a feedback loop between an input and an output of the plurality of first neural network models 102, as shown in
Referring to
The plurality of first neural network models 202 may be trained to predict the figure of merit (e.g., power delay product (PDP)) of a semiconductor device by each receiving electrical measurement parameters as input. N first neural network models 202 may be arranged in parallel. Each of the plurality of first neural network models 202 may output one first output value.
The semiconductor process optimization device 200 may sort first output values (that is, predicted PDP values) of the plurality of first neural network models 202 in order of size. The semiconductor process optimization device 200 may remove first output values that fall within a preset upper range and first output values that fall within a preset lower range (marked as “trim” in
The semiconductor process optimization device 200 may input remaining first output values Pre-PDP1 to Pre-PDPN/2 (in
The second neural network model 204 may serve to transform the remaining first output values into a plurality of second output values. That is, because the remaining first output values are those left after the first output values that fall within the preset upper range and the preset lower range are removed, a final output value calculated using only the remaining first output values may be unable to reach values within the already-removed upper and lower ranges. Accordingly, by transforming the remaining first output values into a plurality of second output values through the second neural network model 204, a value biased according to the distribution of the remaining first output values may be output.
The semiconductor process optimization device 200 may calculate a final output value based on the plurality of second output values of the second neural network model 204. In
The semiconductor process optimization device 200 may train the second neural network model 204 to minimize the difference between the final output value and the correct answer value. When the second neural network model 204 is trained, weights of the plurality of first neural network models 202 may be fixed. That is, the second neural network model 204 may be trained in a state in which the training of the plurality of first neural network models 202 has been completed.
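For illustration only, the sketch below stands in for this second-stage training: the first-stage models are frozen (their trimmed outputs are treated as fixed input data), and a simple affine second stage is fitted to minimize the error against the correct answer values. The affine least-squares form is an assumption standing in for the second neural network model 204:

```python
def train_second_stage(trimmed_outputs_per_sample, targets):
    """Fit an affine map y = w * mean(trimmed first outputs) + b by least
    squares, with the first-stage models' weights held fixed (their outputs
    enter only as data)."""
    xs = [sum(row) / len(row) for row in trimmed_outputs_per_sample]
    n = len(xs)
    mx = sum(xs) / n
    my = sum(targets) / n
    var = sum((x - mx) ** 2 for x in xs)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, targets))
    w = cov / var
    return w, my - w * mx
```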
The illustrated computing environment 10 includes a computing device 12. In one or more example embodiments, the computing device 12 may be the semiconductor process optimization device 100 or 200.
The computing device 12 includes at least one processor 14, a computer-readable storage medium 16, and a communication bus 18. The processor 14 may cause the computing device 12 to operate according to one or more example embodiments of the present disclosure. For example, the processor 14 may execute one or more programs stored in the computer-readable storage medium 16. The one or more programs may include one or more computer-executable instructions, which may be configured to cause, when executed by the processor 14, the computing device 12 to perform operations according to one or more example embodiments of the present disclosure.
The computer-readable storage medium 16 is configured to store computer-executable instructions or program codes, program data, and/or other suitable forms of information. A program 20 stored in the computer-readable storage medium 16 includes a set of instructions executable by the processor 14. In an example embodiment, the computer-readable storage medium 16 may be a memory (a volatile memory such as a random-access memory, a non-volatile memory, or any suitable combination thereof), one or more magnetic disk storage devices, optical disc storage devices, flash memory devices, other types of storage media that are accessible by the computing device 12 and may store desired information, or any suitable combination thereof.
The communication bus 18 interconnects various other components of the computing device 12, including the processor 14 and the computer-readable storage medium 16.
The computing device 12 may also include one or more input/output interfaces 22 that provide an interface for one or more input/output devices 24, and one or more network communication interfaces 26. The input/output interface 22 and the network communication interface 26 are connected to the communication bus 18. The input/output device 24 may be connected to other components of the computing device 12 via the input/output interface 22. The example input/output device 24 may include a pointing device (e.g., a mouse, a trackpad, or the like), a keyboard, a touch input device (e.g., a touch pad, a touch screen, or the like), a voice or sound input device, input devices such as various types of sensor devices and/or imaging devices, and/or output devices such as a display device, a printer, a speaker, and/or a network card. The example input/output device 24 may be included inside the computing device 12 as one of components constituting the computing device 12, or may be connected to the computing device 12 as a separate device distinct from the computing device 12.
The source device 61 may include, for example, a signal generator(s), an electrical source(s), optical source(s), a current source(s), a voltage source(s), and/or a power source(s). The measuring device 62 may include, for example, an electrical meter(s), optical detector(s), a volt meter(s), a current meter(s), and/or power meter(s). The source device 61 may apply one or more signals (e.g., electrical signals, optical signals, current, voltage and/or power) to the semiconductor device 51 using, for example, a probe(s), a probe card, or the like. The measuring device 62 may measure or detect one or more signals (e.g., electrical signals, optical signals, voltage, current, and/or power) output from the semiconductor device 51. The measuring device 62 may supply the measured or detected signals to the computing device 70. The computing device 70 may control, communicate with, or be coupled to the source device 61 and/or the measuring device 62 using wired communication interfaces or wireless communication interfaces. The computing device 70 may be local or remote to the source device 61 and/or the measuring device 62. In one or more examples, the computing device 70 may be the computing environment 10 or the computing device 12.
In one or more examples, a semiconductor device 51 may be one or more semiconductor devices. A source device 61 may be one or more source devices. A measuring device 62 may be one or more measuring devices.
According to one or more example embodiments of the present disclosure, by allowing a plurality of neural network models to have different biases and calculating a final output value based on remaining output values after output values that fall within an upper range and output values that fall within a lower range are excluded among output values of the plurality of neural network models, it is possible to decrease a prediction error due to local minima of the plurality of neural network models and to derive optimal input with low output variance in a sparse data space.
Although the representative embodiments of the present disclosure have been described in detail as above, those skilled in the art will understand that various modifications may be made thereto without departing from the scope of the present disclosure. Therefore, the scope of the present disclosure should not be limited to the described embodiments. The scope of protection of the present disclosure should be construed based on the following claims, and all technical features within the scope of equivalents thereof should be construed as being included within the scope of the present disclosure.