The present disclosure relates to an inference device, an inference method, and an inference program.
Conventionally, in the field of various manufacturing processes, inference techniques are known for inferring, from measurement data (a data set of multiple types of time series data, hereinafter referred to as a “time series data group”) measured during processing a target object, the state of the target object after processing and an event in a process during the processing.
As an example, in a semiconductor manufacturing process, a virtual measurement technique for inferring the state of a wafer after processing and an abnormality detection technique for inferring the presence or absence of an abnormality in the process during processing are known.
On the other hand, the models used in these inference techniques (e.g., virtual measurement models, abnormality detection models) need to be generated and optimized on a process-by-process basis to realize more precise inference, which requires cost and time.
With respect to the above, if a model that achieves high-precision inference for a specific process can be applied to other processes of the same type, the cost and time required to optimize the model can be reduced.
[Patent Document 1] Japanese Laid-open Patent Publication No. 2006-163517
The present disclosure provides an inference device, an inference method, and an inference program that can realize high precision inference regardless of an application target.
An inference device according to one aspect of the present disclosure has the following configuration, for example.
That is, the inference device includes:
an acquisition section configured to acquire a time series data group measured in accordance with processing of a target object in a predetermined processing unit of a manufacturing process; and
an inference section configured to tune respective output data that is output by processing the acquired time series data group using a plurality of network sections that have been machine-learned in advance and to output an inference result by combining the respective tuned output data;
wherein the inference section is configured to tune the respective output data using a correction parameter corresponding to an error included in the inference result.
According to the present disclosure, it is possible to provide an inference device, an inference method, and an inference program that can realize high precision inference regardless of an application target.
In the following, each embodiment will be described with reference to the accompanying drawings. In each of the following embodiments, a case will be described in which, for a specific semiconductor manufacturing process as a target, a time series data group measured in accordance with wafer processing is used to generate
a virtual measurement model that infers a state of a wafer after processing; or
an abnormality detection model that infers the presence or absence of an abnormality in the process. At this time, in each of the following embodiments, a model that realizes high-precision inference is generated by performing multifaceted analysis by processing a time series data group using a plurality of network sections.
In each of the following embodiments, by adding a fine-tuning function to the generated model, when the model is applied to other semiconductor manufacturing processes of the same type, errors (errors included in the inference result) caused by individual differences between processes are reduced by using the fine-tuning function.
Thereby, according to each of the following embodiments, it is possible to provide an inference device, an inference method, and an inference program that can realize high precision inference regardless of an application target. As a result, cost and time can be reduced compared to a case where a new model is generated for another semiconductor manufacturing process for optimization.
Of the following embodiments, in the first embodiment, a case will be described in which a virtual measurement model is generated as a model based on a time series data group, and a correction matrix is used as a fine-tuning function. In the second embodiment, a case will be described in which a neural network is used instead of the correction matrix as a fine-tuning function. Further, in the third embodiment, a case will be described in which, as a model based on a time series data group, an abnormality detection model is generated instead of the virtual measurement model.
In the following embodiments and the accompanying drawings, elements having substantially the same functional configurations are referred to by the same numerals, and a duplicate description thereof will be omitted.
<Application Example of Inference Device>
First, an application example of a virtual measurement device (inference device) with a fine-tuning function added to a virtual measurement model will be described.
As illustrated in
A system 100B includes a semiconductor manufacturing process B, time series data acquisition devices 140B_1 to 140B_n, an inspection data acquisition device 150B, and a virtual measurement device 160B. In the system 100B, the semiconductor manufacturing process B is another process similar to the semiconductor manufacturing process A, and in the present embodiment, the semiconductor manufacturing process B is a target to which a virtual measurement device (inference device) with a fine-tuning function added to the virtual measurement model generated in the system 100A is applied.
In the system 100A, the semiconductor manufacturing process A processes a target object (a wafer 110A before processing) in a predetermined processing unit 120A to generate a result (a wafer 130A after processing). It should be noted that the processing unit 120A is an abstract concept and will be described in detail later. The wafer 110A before processing refers to a wafer (substrate) before being processed in the processing unit 120A, and the wafer 130A after processing refers to a wafer (substrate) after being processed in the processing unit 120A.
Further, in the system 100A, the time series data acquisition devices 140A_1 to 140A_n respectively measure the time series data in accordance with the processing of the wafer 110A before processing. The time series data acquisition devices 140A_1 to 140A_n measure kinds of measurement items different from each other. It should be noted that the number of measurement items measured by each of the time series data acquisition devices 140A_1 to 140A_n may be one or more. The time series data measured in accordance with the processing of the wafer 110A before processing includes not only time series data measured during the processing of the wafer 110A before processing but also time series data measured during pre-processing and post-processing that are performed before and after the processing of the wafer 110A before processing. This processing may include pre-processing and post-processing performed without a wafer (substrate).
A time series data group measured by the time series data acquisition devices 140A_1 to 140A_n is stored as learning data (input data) in a learning data storage section 163A of the virtual measurement device 160A.
In the system 100A, the inspection data acquisition device 150A inspects predetermined inspection items (e.g., an ER (Etch Rate)) of the wafer 130A after processing processed in the processing unit 120A to acquire inspection data. The inspection data acquired by the inspection data acquisition device 150A is stored in the learning data storage section 163A of the virtual measurement device 160A as learning data (labeled data).
In addition, in the system 100A, a virtual measurement program including a learning program and an inference program is installed in the virtual measurement device 160A. When the virtual measurement program is executed, the virtual measurement device 160A functions as a learning section 161A and an inference section 162A.
The learning section 161A performs machine learning by using the time series data group measured by the time series data acquisition devices 140A_1 to 140A_n and the inspection data acquired by the inspection data acquisition device 150A.
Specifically, a plurality of network sections included in the learning section 161A are used to process the time series data group, and machine learning is performed for the plurality of network sections so that the combined result of respective output data output from the plurality of network sections approaches the inspection data.
The inference section 162A acquires the time series data group measured in accordance with the processing of a new target object (wafer before processing) and inputs it to the plurality of network sections for which machine learning has been performed. Accordingly, the inference section 162A infers, based on the time series data acquired in accordance with the processing of the new wafer before processing, the inspection data of the wafer after processing and outputs the inference result (virtual measurement data).
As described above, by processing the time series data group measured in accordance with the processing of the target object using the plurality of network sections, the virtual measurement device 160A can analyze the time series data group from various aspects. As a result, a virtual measurement model (inference section 162A) that realizes high-precision inference can be generated compared to a case where the time series data group is processed using one network section.
On the other hand, in the system 100B, the semiconductor manufacturing process B is the same type of process as the semiconductor manufacturing process A of the system 100A. Further, in the system 100B, the time series data acquisition devices 140B_1 to 140B_n and the inspection data acquisition device 150B correspond to the time series data acquisition devices 140A_1 to 140A_n and the inspection data acquisition device 150A, respectively.
Further, in the system 100B, the virtual measurement device 160B (inference device) corresponds to the virtual measurement device 160A of the system 100A. However, in the case of the virtual measurement device 160B of the system 100B, the learning section 161A is not included. Also, instead of the inference section 162A, an inference section 162B with a fine-tuning function is included (a virtual measurement program that does not include a learning program but includes an inference program similar to the inference program installed in the virtual measurement device 160A is installed).
In a case of the virtual measurement device 160B of the system 100B, rather than generating a new virtual measurement model and performing machine learning by using the time series data group for optimization, the virtual measurement model (inference section 162A) generated in the virtual measurement device 160A of the system 100A is applied.
Here, the semiconductor manufacturing process A and the semiconductor manufacturing process B are the same type of process as described above, but have individual differences. Therefore, even if the virtual measurement model (the inference section 162A) generated in the virtual measurement device 160A is applied as it is, the inference result (virtual measurement data) includes an error.
Thus, in a case of the virtual measurement device 160B (the inference device), an inference section having a fine-tuning function added to the virtual measurement model (the inference section 162A) generated in the virtual measurement device 160A is generated. In
In the inference section 162B with a fine-tuning function, the virtual measurement model (the inference section 162A) generated in the virtual measurement device 160A is applied (see the dashed line 170), and a fine-tuning function is added to reduce an error caused by an individual difference (error included in the inference result).
Specifically, the inference section 162B with a fine-tuning function updates correction parameters (parameters included in a correction matrix used when tuning respective output data, details are described below) so as to reduce an error between
an inference result (virtual measurement data) that is output by processing a time series data group using a plurality of network sections included in the generated virtual measurement model and combining respective output data output from the plurality of network sections after tuning; and
inspection data acquired by the inspection data acquisition device 150B.
As a result, the virtual measurement device 160B, to which the virtual measurement model (inference section 162A) generated in the virtual measurement device 160A to realize high-precision inference is applied, can realize a model capable of high-precision inference even for the semiconductor manufacturing process B that is the application target.
<Predetermined Processing Unit for Semiconductor Manufacturing Processes>
Next, the predetermined processing units 120A and 120B of the semiconductor manufacturing processes A and B will be described.
Here, 2a of
The time series data group measured in accordance with the processing of the wafers 110A and 110B before processing in the processing units 120A and 120B in 2a of
a time series data group measured in accordance with processing in the chamber A (first processing space);
a time series data group measured in accordance with processing in the chamber B (second processing space); and
a time series data group measured in accordance with processing in the chamber C (third processing space).
On the other hand, 2b of
In the processing units 120A and 120B of 2b of
Here, 3a of
Also, in the processing units 120A and 120B of 3a of
In the example of 3a of
On the other hand, 3b of
Also, in the processing units 120A and 120B of 3a of
<Example of Time Series Data Group>
Next, a specific example of the time series data groups acquired by the time series data acquisition devices 140A_1 to 140A_n and 140B_1 to 140B_n will be described.
Of these, 4a of
On the other hand, 4b of
In 4a of
Specifically, the time series data acquisition devices 140A_1 to 140A_n and 140B_1 to 140B_n may acquire, as the time series data group 1, a plurality of sets of time series data measured during execution of the pre-processing. The time series data acquisition devices 140A_1 to 140A_n and 140B_1 to 140B_n may acquire, as the time series data group 2, a plurality of sets of time series data measured during execution of the wafer processing. Further, the time series data acquisition devices 140A_1 to 140A_n and 140B_1 to 140B_n may acquire, as the time series data group 3, a plurality of sets of time series data measured during execution of the post-processing.
Similarly, the time series data acquisition devices 140A_1 to 140A_n and 140B_1 to 140B_n may acquire, as the time series data group 1, a plurality of sets of time series data measured during execution of the recipe I. The time series data acquisition devices 140A_1 to 140A_n and 140B_1 to 140B_n may acquire, as the time series data group 2, a plurality of sets of time series data measured during execution of the recipe II. Further, the time series data acquisition devices 140A_1 to 140A_n and 140B_1 to 140B_n may acquire, as the time series data group 3, a plurality of sets of time series data measured during execution of the recipe III.
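As a non-limiting illustration of acquiring the time series data groups 1 to 3 per phase, the following minimal sketch slices each measurement item by sample range. The dictionary layout, the measurement item names (`pressure`, `rf_power`), and the phase boundaries are assumptions made only for this example and are not taken from the disclosure.

```python
from typing import Dict, List, Tuple

# Hypothetical layout: one list of samples per measurement item,
# all items sampled on a shared sample index.
TimeSeriesGroup = Dict[str, List[float]]


def split_by_phase(
    group: TimeSeriesGroup, boundaries: Dict[str, Tuple[int, int]]
) -> Dict[str, TimeSeriesGroup]:
    """Slice every measurement item into one sub-group per phase.

    `boundaries` maps a phase name to a half-open sample range [start, stop).
    """
    return {
        phase: {item: samples[start:stop] for item, samples in group.items()}
        for phase, (start, stop) in boundaries.items()
    }


measured = {
    "pressure": [1.0, 1.1, 5.0, 5.2, 5.1, 1.2],
    "rf_power": [0.0, 0.0, 300.0, 310.0, 305.0, 0.0],
}
# Assumed boundaries: pre-processing, wafer processing, post-processing.
phases = {"pre": (0, 2), "wafer": (2, 5), "post": (5, 6)}
groups = split_by_phase(measured, phases)
print(groups["wafer"]["pressure"])  # [5.0, 5.2, 5.1]
```

The same slicing could be keyed by recipe (I, II, III) instead of by phase.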
<Hardware Configuration of Virtual Measurement Device>
Next, a hardware configuration of the virtual measurement devices 160A and 160B will be described. FIG. is a diagram illustrating an example of a hardware configuration of the virtual measurement devices. As illustrated in
Each of the virtual measurement devices 160A and 160B further includes an auxiliary storage device 505, a display device 506, an operating device 507, an I/F (interface) device 508, and a drive device 509. The hardware parts of each of the virtual measurement devices 160A and 160B are connected to one another through a bus 510.
The CPU 501 is an arithmetic device that executes various types of programs (e.g., a virtual measurement program) installed in the auxiliary storage device 505.
The ROM 502 is a nonvolatile memory and functions as a main memory device. The ROM 502 stores various types of programs, data, and the like necessary for the CPU 501 to execute the various types of programs installed in the auxiliary storage device 505. Specifically, the ROM 502 stores boot programs and the like such as BIOS (basic input/output system) and EFI (extensible firmware interface).
The RAM 503 is a volatile memory such as a DRAM (dynamic random access memory) or an SRAM (static random access memory) and functions as a main memory device. The RAM 503 provides a work area to which the various types of programs installed in the auxiliary storage device 505 are loaded when executed by the CPU 501.
The GPU 504 is an arithmetic device for image processing, and when a virtual measurement program is executed by the CPU 501, the GPU 504 performs high-speed calculation by parallel processing on various image data (in the present embodiment, a time series data group). The GPU 504 is equipped with an internal memory (GPU memory), and temporarily holds information necessary for performing parallel processing on various image data.
The auxiliary storage device 505 stores various types of programs, and various types of data used when the various types of programs are executed by the CPU 501.
The display device 506 is a display device that displays an internal state of the virtual measurement devices 160A and 160B. The operating device 507 is an input device that is used by an administrator of the virtual measurement devices 160A and 160B to input various types of instructions to the virtual measurement devices 160A and 160B. The I/F device 508 is a connection device for connecting to a non-illustrated network for performing communication.
The drive device 509 is a device for setting a recording medium 520. Here, the recording medium 520 includes a medium for optically, electrically, or magnetically recording information, such as a CD-ROM, a flexible disk, a magneto-optical disk, or the like. The recording medium 520 may also include a semiconductor memory or the like that electrically records information, such as a ROM, a flash memory, or the like.
The various types of programs to be installed in the auxiliary storage device 505 are installed by the drive device 509 reading the various types of programs recorded in the recording medium 520 upon the recording medium 520 being set in the drive device 509, for example. Alternatively, the various types of programs to be installed in the auxiliary storage device 505 may be installed upon being downloaded from a network.
<Functional Configuration of Learning Section>
Next, a functional configuration of the learning section 161A of the virtual measurement device 160A in the system 100A will be described.
The branch section 610 reads out the time series data group from the learning data storage section 163A. The branch section 610 processes the read-out time series data group so that the time series data group is processed using a plurality of network sections from the first network section 620_1 to the Mth network section 620_M.
The first network section 620_1 to the Mth network section 620_M are configured based on a convolutional neural network (CNN) and have a plurality of layers.
Specifically, the first network section 620_1 includes a first layer 620_11 to an Nth layer 620_1N. Similarly, the second network section 620_2 includes a first layer 620_21 to an Nth layer 620_2N. The remaining network sections are configured in the same manner, and the Mth network section 620_M includes a first layer 620_M1 to an Nth layer 620_MN.
In each layer of the first layer 620_1 to the Nth layer 620_1N of the first network section 620_1, various processes such as a normalization process, a convolution process, an activation process, and a pooling process are performed. Further, similar various processes are performed in each layer of the second network section 620_2 to the Mth network section 620_M.
The coupling section 630 combines the respective output data, from the output data output from the Nth layer 620_1N of the first network section 620_1 through the output data output from the Nth layer 620_MN of the Mth network section 620_M, and outputs the combined result to the comparison section 640.
The comparison section 640 compares the combined result output from the coupling section 630 with the inspection data (labeled data) read from the learning data storage section 163A and calculates the error. In the learning section 161A, machine learning is performed for the first network section 620_1 to the Mth network section 620_M and the coupling section 630 so that the error calculated by the comparison section 640 satisfies a predetermined condition.
Thus, the model parameters of the respective layers of the first network section 620_1 to the Mth network section 620_M and the model parameters of the coupling section 630 are optimized.
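As a non-limiting illustration of the data flow from the network sections through the coupling section to the comparison section, the following toy sketch may be considered. It is not the CNN configuration described above: each network section is replaced by a hypothetical one-weight stand-in (`ToySection`), the coupling is assumed to be a weighted sum, and the update is an assumed plain gradient step on the squared error. All names and data values are illustrative only.

```python
from statistics import mean


class ToySection:
    """Stand-in for one network section: maps its input series to a scalar."""

    def __init__(self, weight=0.5, bias=0.0):
        self.weight, self.bias = weight, bias

    def forward(self, series):
        return self.weight * mean(series) + self.bias


def predict(sections, coupling, branches):
    """Run each section on its own branch, then combine the outputs."""
    outputs = [s.forward(x) for s, x in zip(sections, branches)]
    return outputs, sum(c * o for c, o in zip(coupling, outputs))


def train_epoch(sections, coupling, data, lr=0.01):
    """One pass over the learning data; returns the mean squared error."""
    total = 0.0
    for branches, label in data:
        outputs, combined = predict(sections, coupling, branches)
        error = combined - label  # role of the comparison section
        total += error ** 2
        for m, (section, x) in enumerate(zip(sections, branches)):
            c_old = coupling[m]
            coupling[m] -= lr * error * outputs[m]
            section.weight -= lr * error * c_old * mean(x)
            section.bias -= lr * error * c_old
    return total / len(data)


# Each sample: (one series per network section, inspection value).
data = [
    ([[1.0, 2.0, 3.0], [0.0, 1.0]], 3.0),
    ([[0.0, 0.0, 3.0], [2.0, 2.0]], 5.0),
    ([[3.0, 3.0, 3.0], [1.0, 1.0]], 5.0),
]
sections = [ToySection(), ToySection()]
coupling = [1.0, 1.0]
losses = [train_epoch(sections, coupling, data) for _ in range(200)]
print(losses[0], losses[-1])
```

In this sketch both the parameters of the sections and the coupling weights are updated from the same error, mirroring the statement that the model parameters of the network sections and of the coupling section are optimized together.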
<Details of Processing of Each Section of the Learning Section>
Next, the details of processing of each section (here, in particular, the branch section 610) of the learning section 161A of the virtual measurement device 160A in the system 100A will be described with reference to a specific example.
(1) Detail 1 of Processing of Branch Section
Also, the branch section 610 generates the time series data group 2 (the second time series data group) by processing the time series data group measured by the time series data acquisition devices 140A_1 to 140A_n according to a second criterion and inputs it to the second network section 620_2.
As described above, by processing the time series data group according to different criteria and performing machine learning with a configuration in which the resulting time series data groups are processed using respective different network sections, the time series data group can be analyzed in a multifaceted manner. As a result, it is possible to generate a virtual measurement model (inference section 162A) that realizes high-precision inference compared to a case in which a time series data group is input to one network section and machine learning is performed.
In the example of
(2) Detail 2 of Processing of Branch Section
Next, another processing of the branch section 610 will be described in detail.
As described above, by dividing the time series data group into a plurality of groups according to a data type and by processing using different network sections to perform machine learning, the time series data group can be analyzed in a multifaceted manner. As a result, it is possible to generate a virtual measurement model (inference section 162A) that realizes high-precision inference compared to a case in which a time series data group is input to one network section and machine learning is performed.
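As a non-limiting illustration of dividing a time series data group according to data type, the following minimal sketch groups measurement items by a type label so that each group can be passed to a different network section. The item names and the type labels (`process`, `spectral`) are hypothetical and are not taken from the disclosure.

```python
from collections import defaultdict
from typing import Dict, List


def group_by_data_type(
    group: Dict[str, List[float]], data_type_of: Dict[str, str]
) -> Dict[str, Dict[str, List[float]]]:
    """Collect measurement items that share a data type into one group."""
    grouped = defaultdict(dict)
    for item, samples in group.items():
        grouped[data_type_of[item]][item] = samples
    return dict(grouped)


measured = {
    "pressure": [1.0, 1.2],
    "temperature": [20.0, 21.0],
    "oes_250nm": [0.3, 0.4],
}
# Assumed mapping from measurement item to data type.
data_type_of = {
    "pressure": "process",
    "temperature": "process",
    "oes_250nm": "spectral",
}
by_type = group_by_data_type(measured, data_type_of)
# Group 1 would go to the first network section, group 2 to the second.
network_inputs = [by_type["process"], by_type["spectral"]]
print([sorted(g) for g in network_inputs])
```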
In the example of
(3) Detail 3 of Processing of Branch Section
Next, another processing of the branch section 610 will be described in detail.
The example of
Of these, the normalization section 1001 performs a first normalization process on the time series data group input by the branch section 610 and generates a normalized time series data group 1 (first time series data group).
Similarly, the example of
Of these, the normalization section 1011 performs a second normalization process on the time series data group input by the branch section 610 and generates a normalized time series data group 2 (second time series data group).
As described above, by performing machine learning with a configuration of processing a time series data group using a plurality of network sections each of which includes a normalization section that performs a normalization process using a different method, the time series data group can be analyzed in a multifaceted manner. As a result, it is possible to generate a virtual measurement model (inference section 162A) that realizes high-precision inference compared to a case in which a time series data group is input to one network section that performs one normalization process and machine learning is performed.
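As a non-limiting illustration of applying different normalization methods to the same time series data, the following minimal sketch may be considered. The disclosure does not state which methods the first and second normalization processes use; min-max scaling and z-score standardization are assumed here purely as examples.

```python
from statistics import mean, pstdev
from typing import List


def min_max_normalize(series: List[float]) -> List[float]:
    """Assumed first normalization process: scale values into [0, 1]."""
    lo, hi = min(series), max(series)
    span = hi - lo
    return [0.0 if span == 0 else (x - lo) / span for x in series]


def z_score_normalize(series: List[float]) -> List[float]:
    """Assumed second normalization process: zero mean, unit variance."""
    mu, sigma = mean(series), pstdev(series)
    return [0.0 if sigma == 0 else (x - mu) / sigma for x in series]


series = [2.0, 4.0, 6.0]
group_1 = min_max_normalize(series)  # input to the first network section
group_2 = z_score_normalize(series)  # input to the second network section
print(group_1)  # [0.0, 0.5, 1.0]
```

The two normalized groups carry the same measurements in different scalings, which is what allows the network sections to analyze the data from different aspects.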
(4) Detail 4 of Processing of Branch Section
Next, another processing of the branch section 610 will be described in detail.
Among the time series data groups measured by the time series data acquisition devices 140A_1 to 140A_n, the branch section 610 inputs the time series data group 2 (the second time series data group) measured in accordance with the processing in the chamber B to the second network section 620_2.
As described above, by performing machine learning with a configuration of using different network sections to process respective time series data groups measured in accordance with the processing in the different chambers (the first processing space and the second processing space), the time series data groups can be analyzed in a multifaceted manner. As a result, it is possible to generate a virtual measurement model (inference section 162A) that realizes high-precision inference compared to a case in which machine learning is performed by inputting respective time series data groups to one network section.
<Functional Configuration of Inference Section of Virtual Measurement Device>
Next, a functional configuration of the inference section 162A of the virtual measurement device 160A in the system 100A will be described.
The branch section 1210 acquires a time series data group newly measured by the time series data acquisition devices 140A_1 to 140A_n. The branch section 1210 performs control so that the acquired time series data group is processed using the first network section 1220_1 to the Mth network section 1220_M.
The first network section 1220_1 to the Mth network section 1220_M are formed by machine learning performed by the learning section 161A and optimizing model parameters of respective layers of the first network section 620_1 to the Mth network section 620_M.
The coupling section 1230 is formed by the coupling section 630 for which machine learning is performed by the learning section 161A and model parameters are optimized. The coupling section 1230 combines the respective output data, from the output data output from the Nth layer 1220_1N of the first network section 1220_1 through the output data output from the Nth layer 1220_MN of the Mth network section 1220_M, and outputs the virtual measurement data.
<Flow of Virtual Measurement Processing>
Next, the entire flow of virtual measurement processing by the virtual measurement device 160A in the system 100A will be described.
In step S1301, the learning section 161A acquires a time series data group and inspection data as learning data.
In step S1302, the learning section 161A performs machine learning using, of the acquired learning data, the time series data group as input data and the inspection data as labeled data.
In step S1303, the learning section 161A determines whether or not to continue machine learning. In a case of acquiring further learning data to continue the machine learning (in the case of YES in step S1303), the processing returns to step S1301. Meanwhile, in a case of ending the machine learning (in the case of NO in step S1303), the processing proceeds to step S1304.
In step S1304, the inference section 162A generates the first network section 1220_1 to the Mth network section 1220_M by reflecting the model parameters optimized by the machine learning.
In step S1305, the inference section 162A inputs a time series data group measured in accordance with the processing of a new wafer 110A before processing and infers virtual measurement data.
In step S1306, the inference section 162A outputs the inferred virtual measurement data.
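As a non-limiting illustration of the order of steps S1301 to S1306, the following minimal sketch may be considered. The model is a deliberately simple stand-in (a single least-squares scale factor on the mean of a series), not the network sections of the embodiment, and all function names are hypothetical.

```python
from statistics import mean


def train(params, batch):
    """S1302: accumulate least-squares sums for label ~ k * mean(series)."""
    sum_xy, sum_xx = params
    for series, label in batch:
        feature = mean(series)
        sum_xy += feature * label
        sum_xx += feature * feature
    return sum_xy, sum_xx


def build_inference(params):
    """S1304: reflect the optimized parameter in an inference function."""
    sum_xy, sum_xx = params
    k = sum_xy / sum_xx
    return lambda series: k * mean(series)


def run_virtual_measurement(batches, new_series):
    params = (0.0, 0.0)
    for batch in batches:  # S1301; the loop ends when learning stops (S1303)
        params = train(params, batch)
    infer = build_inference(params)
    return infer(new_series)  # S1305 and S1306


# Each batch holds (time series, inspection value) pairs.
batches = [[([1.0, 1.0], 2.0)], [([2.0, 2.0], 4.0)]]
virtual_measurement = run_virtual_measurement(batches, [3.0, 3.0])
print(virtual_measurement)  # 6.0
```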
<Functional Configuration of Inference Section with Fine-Tuning Function of Virtual Measurement Device>
Next, a functional configuration of the inference section 162B with a fine-tuning function of the virtual measurement device 160B in the system 100B will be described.
As illustrated in
Of these, since the branch section 1210 is the same as the branch section 1210 of the inference section 162A and has been described with reference to
Specifically, the first network section 1220_1 to the Mth network section 1220_M are formed by machine learning performed by the learning section 161A and optimizing model parameters of respective layers of the first network section 620_1 to the Mth network section 620_M.
The coupling section 1410 is formed by the coupling section 630 for which machine learning is performed by the learning section 161A and model parameters are optimized. However, in the case of the coupling section 1410, the respective output data, from the output data output from the Nth layer 1220_1N of the first network section 1220_1 through the output data output from the Nth layer 1220_MN of the Mth network section 1220_M, are output without being combined.
The individual tuning section 1420 multiplies the respective output data output from the coupling section 1410 by a factor (referred to as the “individual sensitivity”) corresponding to the individual difference between the processing unit 120A of the semiconductor manufacturing process A and the processing unit 120B of the semiconductor manufacturing process B.
The fine-tuning section 1430 multiplies the respective output data, which have been multiplied by the individual sensitivity in the individual tuning section 1420, by a correction matrix to calculate virtual measurement data that is a scalar quantity.
The comparison section 1440 acquires the virtual measurement data output by the fine-tuning section 1430 and acquires inspection data for the wafer 130B after processing. The comparison section 1440 calculates the difference between the acquired virtual measurement data and the inspection data and notifies the fine-tuning section 1430 of the difference.
Thus, in the inference section 162B with a fine-tuning function, the fine-tuning section 1430 updates the correction parameters (P1 to PM) based on the inspection data for the wafer 130B after processing for a predetermined period of time in the semiconductor manufacturing process B. The fine-tuning section 1430 of the inference section 162B with the fine-tuning function continues to update the correction parameters (P1 to PM) until the difference between the virtual measurement data and the inspection data is equal to or less than a predetermined threshold value.
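As a non-limiting illustration of the calculation performed by the individual tuning section 1420 and the fine-tuning section 1430, the following minimal sketch may be considered. It assumes that each network section contributes one scalar output, that a single individual sensitivity factor is shared by all outputs, and that the correction matrix is a row of M correction parameters; none of these shapes is specified here, and the function name is hypothetical.

```python
from typing import List


def tune_and_combine(
    outputs: List[float], sensitivity: float, correction: List[float]
) -> float:
    """Scale each output by the sensitivity, then weight and sum them."""
    tuned = [sensitivity * o for o in outputs]  # individual tuning
    return sum(p * t for p, t in zip(correction, tuned))  # correction matrix


# Three network-section outputs, assumed sensitivity 1.5, parameters P1 to P3.
virtual_measurement = tune_and_combine([1.0, 2.0, 3.0], 1.5, [0.5, 0.25, 0.25])
print(virtual_measurement)  # 2.625
```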
This enables the fine-tuning section 1430 to reduce errors (errors included in the inference result) caused by the individual difference between the processing unit 120A of the semiconductor manufacturing process A and the processing unit 120B of the semiconductor manufacturing process B.
In the case of the inference section 162B with a fine-tuning function, the cost and time can be reduced compared to a case where a virtual measurement model is optimized by re-learning with the time series data group measured in the semiconductor manufacturing process B as additional data.
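As a non-limiting illustration of updating the correction parameters until the difference falls to or below the threshold value, the following minimal sketch may be considered. How the parameters are updated from the difference is not specified here; a gradient-style step on the squared difference is assumed, and the learning rate, the round limit, and the sample values are hypothetical.

```python
def fine_tune(samples, sensitivity, correction,
              threshold=1e-3, lr=0.1, max_rounds=1000):
    """Update correction parameters until every difference is <= threshold.

    `samples` holds (network-section outputs, inspection value) pairs.
    The update rule is an assumption made for this sketch.
    """
    correction = list(correction)
    worst = float("inf")
    for _ in range(max_rounds):
        worst = 0.0
        for outputs, inspection in samples:
            tuned = [sensitivity * o for o in outputs]
            diff = sum(p * t for p, t in zip(correction, tuned)) - inspection
            worst = max(worst, abs(diff))
            if abs(diff) > threshold:  # update only when needed
                correction = [p - lr * diff * t
                              for p, t in zip(correction, tuned)]
        if worst <= threshold:
            break
    return correction, worst


samples = [([1.0, 2.0], 2.2), ([2.0, 1.0], 2.0)]
params, worst = fine_tune(samples, sensitivity=1.0, correction=[0.5, 0.5])
print(params, worst)
```

Only the M correction parameters are changed in this sketch, while the outputs of the network sections are treated as fixed, which reflects why fine-tuning is cheaper than re-learning the whole model.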
<Flow of Fine-Tuning Processing>
Next, a flow of fine-tuning processing performed by the virtual measurement device 160B in the system 100B will be described.
In step S1501, the branch section 1210 of the inference section 162B with a fine-tuning function acquires a time series data group measured in accordance with the processing of a new wafer 110B before processing in the processing unit 120B of the semiconductor manufacturing process B. The first to Mth network sections 1220_1 to 1220_M of the inference section 162B with a fine-tuning function process the acquired time series data group. Accordingly, the respective output data are output from the final layers of the first to Mth network sections 1220_1 to 1220_M.
In step S1502, the individual tuning section 1420 of the inference section 162B with a fine-tuning function tunes the respective output data by multiplying the respective output data output from the final layers of the first to Mth network sections 1220_1 to 1220_M by the individual sensitivity.
In step S1503, the fine-tuning section 1430 of the inference section 162B with a fine-tuning function multiplies the respective output data, by which the individual sensitivity is multiplied, by the correction matrix to calculate the virtual measurement data.
In step S1504, the inference section 162B with a fine-tuning function acquires inspection data for the post-processed wafer 130B and reports the inspection data to the comparison section 1440. The comparison section 1440 then compares the virtual measurement data output from the fine-tuning section 1430 with the reported inspection data and calculates the difference (error included in the inference result).
In step S1505, the comparison section 1440 of the inference section 162B with a fine-tuning function determines whether or not it is necessary to update the correction parameters by determining whether or not the difference is equal to or less than the predetermined threshold value based on the comparison result.
In a case of determining in step S1505 that the difference exceeds the predetermined threshold value and it is necessary to update the correction parameters (in the case of YES in step S1505), the processing proceeds to step S1506.
In step S1506, the fine-tuning section 1430 of the inference section 162B with a fine-tuning function updates correction parameters (P1 to PM) of the correction matrix in accordance with the difference (error included in the inference result) calculated by the comparison section 1440. Thereafter, the processing proceeds to step S1507.
Meanwhile, in a case of determining in step S1505 that the difference is equal to or less than the predetermined threshold value and it is not necessary to update the correction parameters (in the case of NO in step S1505), the processing proceeds directly to step S1507.
In step S1507, the inference section 162B with a fine-tuning function determines whether to end the fine-tuning processing. In a case of determining in step S1507 not to end the fine-tuning processing (in the case of NO in step S1507), the processing returns to step S1501.
Meanwhile, in a case of determining in step S1507 to end the fine-tuning processing (in the case of YES in step S1507), the fine-tuning processing ends.
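The flow of steps S1501 to S1507 can be illustrated by the following minimal sketch. It is not the disclosed implementation. It assumes that each network section outputs a single scalar, that the correction matrix is diagonal with entries P1 to PM, that the combined result is the sum of the corrected outputs, and that the update in step S1506 is a normalized gradient step on the difference. The names `fine_tune`, `NetworkSection`, and `step_size` are hypothetical.

```python
from typing import Callable, Iterable, List, Sequence, Tuple

Series = Sequence[float]
NetworkSection = Callable[[Series], float]  # hypothetical: one scalar output per section


def fine_tune(
    samples: Iterable[Tuple[Series, float]],
    network_sections: Sequence[NetworkSection],
    sensitivity: Sequence[float],
    params: List[float],
    threshold: float,
    step_size: float = 0.5,
) -> List[float]:
    """Update the correction parameters P1..PM in place and return them."""
    for series, inspection in samples:
        # S1501: the machine-learned (fixed) network sections process the data.
        outputs = [section(series) for section in network_sections]
        # S1502: multiply each output by its individual sensitivity.
        tuned = [s * o for s, o in zip(sensitivity, outputs)]
        # S1503: apply the correction matrix (assumed diagonal, entries P1..PM).
        virtual = sum(p * t for p, t in zip(params, tuned))
        # S1504: difference between virtual measurement data and inspection data.
        diff = virtual - inspection
        # S1505/S1506: update only when the difference exceeds the threshold.
        if abs(diff) > threshold:
            norm = sum(t * t for t in tuned) or 1.0
            for i, t in enumerate(tuned):
                params[i] -= step_size * diff * t / norm  # assumed update rule
        # S1507: in this sketch, processing ends when the samples are exhausted.
    return params
```

In this sketch the network sections themselves are never modified; only the list `params` changes, which mirrors the point that the correction parameters alone are updated.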
<Summary>
As is obvious from the above description, the virtual measurement device 160A
acquires a time series data group measured in accordance with the processing of a target object in a predetermined processing unit of a manufacturing process; and
performs machine learning on a plurality of network sections so that the result of combining the respective output data, which the respective network sections output by processing the acquired time series data group, approaches inspection data of a result object obtained by processing the target object.
As described above, multifaceted analysis can be performed by processing a time series data group using a plurality of network sections. As a result, the virtual measurement device 160A can generate a virtual measurement model that realizes high-precision inference.
Also, the virtual measurement device 160B (inference device)
uses a plurality of network sections included in the generated virtual measurement model to process a time series data group measured in accordance with the processing of a target object in a predetermined processing unit of another manufacturing process to output respective output data;
combines the respective output data after being fine-tuned using correction parameters to infer virtual measurement data; and
updates the correction parameters according to an error included in the inferred virtual measurement data.
As described above, when applying a virtual measurement model, which is generated using a time series data group measured at a predetermined processing unit of a manufacturing process, to another manufacturing process, the virtual measurement device 160B adds a function to fine-tune respective output data that is output from a plurality of network sections.
This makes it possible to reduce errors (errors included in an inference result) due to individual differences between processes when applying the virtual measurement model to other manufacturing processes. That is, according to the first embodiment, an inference device, an inference method, and an inference program that can realize high-precision inference regardless of an application target can be provided.
In the above-described first embodiment, respective output data output from the final layers of the respective network sections are fine-tuned using an individual sensitivity and a correction matrix. However, the method of fine-tuning the respective output data by the inference section with a fine-tuning function is not limited thereto. For example, a network section for fine-tuning may be used to fine-tune the respective output data.
The fine-tuning network section 1610 is configured based on a convolutional neural network. The fine-tuning network section 1610 receives, as input, the respective output data output from the coupling section 1410 and outputs virtual measurement data.
The fine-tuning network section 1610 updates the correction parameters that are model parameters of the fine-tuning network section 1610 based on the difference reported from the comparison section 1440 in accordance with the output of the virtual measurement data.
Thus, in the inference section 1600B with a fine-tuning function, the fine-tuning network section 1610 updates the correction parameters based on the inspection data for the wafer 130B after processing for a predetermined period of time in the semiconductor manufacturing process B. At this time, the model parameters of the first network section 1220_1 to the Mth network section 1220_M are maintained in a fixed state. Then, the fine-tuning network section 1610 of the inference section 1600B with a fine-tuning function continues to update the correction parameters until the difference between the virtual measurement data and the inspection data is equal to or less than a predetermined threshold value.
This enables the fine-tuning network section 1610 to reduce an error (an error included in an inference result) caused by an individual difference between the processing unit 120A of the semiconductor manufacturing process A and the processing unit 120B of the semiconductor manufacturing process B.
In the case of the inference section 1600B with a fine-tuning function, the possibility of overfitting can be reduced compared to a case in which a virtual measurement model is newly generated and optimized using a time series data group measured in the semiconductor manufacturing process B.
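The separation between fixed model parameters and trainable correction parameters in the second embodiment can be illustrated by the following minimal sketch. It is an illustration under stated assumptions rather than the disclosed configuration: a single linear layer with a bias stands in for the convolutional fine-tuning network section, each frozen network section is reduced to a fixed weighted sum, and training is plain full-batch gradient descent. The names `FROZEN_SECTIONS`, `section_outputs`, and `FineTuningHead`, and all numeric values, are hypothetical.

```python
from typing import List, Sequence, Tuple

# Fixed model parameters of the machine-learned network sections (hypothetical values).
FROZEN_SECTIONS: Tuple[Tuple[float, ...], ...] = ((0.5, 0.5), (1.0, -1.0))

Sample = Tuple[Sequence[float], float]  # (time series data, inspection data)


def section_outputs(series: Sequence[float]) -> List[float]:
    """Outputs of the first to Mth network sections; their weights are never updated."""
    return [sum(w * x for w, x in zip(weights, series)) for weights in FROZEN_SECTIONS]


class FineTuningHead:
    """Linear stand-in for the fine-tuning network section (an assumption)."""

    def __init__(self, size: int) -> None:
        self.weights = [1.0] * size  # correction parameters (trainable)
        self.bias = 0.0              # correction parameter (trainable)

    def predict(self, outputs: Sequence[float]) -> float:
        return sum(w * o for w, o in zip(self.weights, outputs)) + self.bias

    def mse(self, data: Sequence[Sample]) -> float:
        return sum((self.predict(section_outputs(s)) - y) ** 2 for s, y in data) / len(data)

    def train(self, data: Sequence[Sample], steps: int = 2000, lr: float = 0.1) -> None:
        """Full-batch gradient descent on the mean squared difference."""
        for _ in range(steps):
            grad_w = [0.0] * len(self.weights)
            grad_b = 0.0
            for series, inspection in data:
                outputs = section_outputs(series)
                diff = self.predict(outputs) - inspection
                for i, o in enumerate(outputs):
                    grad_w[i] += 2.0 * diff * o / len(data)
                grad_b += 2.0 * diff / len(data)
            # Only the head's parameters change; FROZEN_SECTIONS is left untouched.
            self.weights = [w - lr * g for w, g in zip(self.weights, grad_w)]
            self.bias -= lr * grad_b
```

Because the head has only M + 1 trainable values in this sketch, far fewer parameters are fitted to the data of process B than when a whole model is re-learned, which is one intuition for the reduced possibility of overfitting.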
In the first and second embodiments described above, a virtual measurement model generated by the virtual measurement device 160A is applied to another semiconductor manufacturing process B. However, the model applied to another semiconductor manufacturing process B is not limited to the virtual measurement model.
In a third embodiment, a case is described in which the virtual measurement devices 160A and 160B described in the first and second embodiments are read as the abnormality detection devices 160A and 160B and an abnormality detection model generated by the abnormality detection device 160A is applied to another semiconductor manufacturing process B.
In a case of the abnormality detection device 160A, the learning section 161A performs machine learning on an abnormality detection model (inference section 162A) with a time series data group as input data and an event (information indicating the presence or absence of an abnormality) as labeled data. The abnormality detection model (inference section 162A) has a similar configuration to the virtual measurement model (inference section 162A), and differs only in learning data used for machine learning.
In a case of the abnormality detection device 160A, examples of the time series data acquisition devices 140A_1 to 140A_n that output a time series data group used for machine learning include:
an emission spectroscopy analyzer that outputs OES (Optical Emission Spectrometry) data, which is a time series data group;
a process data acquisition device that outputs process data such as temperature data or pressure data, which is a time series data group; and
a radio-frequency power supply device for plasma that outputs RF data, which is time series data.
Also, in a case of the abnormality detection device 160B (inference device), the inference section 1600B with a fine-tuning function inputs the time series data group and infers information indicating the presence or absence of an abnormality.
In a case of the abnormality detection device 160B, examples of the time series data acquisition devices 140A_1 to 140A_n that output a time series data group used for inference include:
an emission spectroscopy analyzer that outputs OES (Optical Emission Spectrometry) data, which is a time series data group;
a process data acquisition device that outputs process data such as temperature data or pressure data, which is a time series data group; and
a radio-frequency power supply device for plasma that outputs RF data, which is time series data.
<Summary>
As is obvious from the above description, the abnormality detection device 160A
acquires a time series data group (OES data, process data, RF data) measured in accordance with the processing of a target object in a predetermined processing unit of a manufacturing process; and
performs machine learning on a plurality of network sections so that the result of combining the respective output data, which the respective network sections output by processing the acquired time series data group, approaches an event (information indicating the presence or absence of an abnormality) that occurs in accordance with the processing of the target object.
In this way, by processing a time series data group using a plurality of network sections, it is possible to perform multifaceted analysis. As a result, the abnormality detection device 160A can generate an abnormality detection model that realizes high-precision inference.
Also, the abnormality detection device 160B (inference device)
uses a plurality of network sections included in the generated abnormality detection model to process a time series data group (OES data, process data, RF data) measured in accordance with the processing of a target object in a predetermined processing unit of another manufacturing process to output respective output data;
combines the respective output data after being fine-tuned using correction parameters to infer information indicating the presence or absence of an abnormality; and
updates the correction parameters according to an error included in the inferred information indicating the presence or absence of an abnormality.
As described above, when applying an abnormality detection model, which is generated using a time series data group measured at a predetermined processing unit of a manufacturing process, to another manufacturing process, the abnormality detection device 160B adds a function to fine-tune respective output data that is output from a plurality of network sections.
This makes it possible to reduce errors (errors included in an inference result) due to individual differences between processes when applying the abnormality detection model to other manufacturing processes. That is, according to the third embodiment, an inference device, an inference method, and an inference program that can realize high-precision inference regardless of an application target can be provided.
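For the abnormality detection case, the update of correction parameters according to an error in the inferred presence or absence of an abnormality can be illustrated by the following minimal sketch. The disclosure does not specify the update rule. The sketch assumes that the corrected output data are summed and passed through a sigmoid to give a probability of abnormality, and that the correction parameters follow one gradient step on the log loss per labeled event. The names `abnormality_probability` and `update_params` are hypothetical.

```python
import math
from typing import List, Sequence


def abnormality_probability(params: Sequence[float], outputs: Sequence[float]) -> float:
    """Combine the corrected output data and map the sum to a probability of abnormality."""
    z = sum(p * o for p, o in zip(params, outputs))
    return 1.0 / (1.0 + math.exp(-z))


def update_params(
    params: List[float], outputs: Sequence[float], label: int, lr: float = 0.5
) -> float:
    """One gradient step on the log loss; returns the error (label minus probability)."""
    error = label - abnormality_probability(params, outputs)
    for i, o in enumerate(outputs):
        params[i] += lr * error * o  # assumed update rule for the correction parameters
    return error
```

Here `label` is 1 when an abnormality actually occurred and 0 otherwise, and `outputs` are the respective output data of the fixed network sections for one time series data group.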
In the above-described first and second embodiments, a case is described in which an individual sensitivity and a correction matrix or a network section for fine-tuning are used as a method of fine-tuning each output data. However, the method of fine-tuning respective output data is not limited thereto, and, for example, a generalized linear mixed model, Gaussian process regression analysis, Kalman filter, or the like may be used.
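As one illustration of such an alternative, the following minimal sketch uses a scalar Kalman filter to track an additive correction between the combined inference result and the inspection data. How a Kalman filter would actually be applied is not specified in the disclosure. The random-walk state model, the additive form of the correction, the noise variances, and the name `OffsetKalmanFilter` are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class OffsetKalmanFilter:
    """Scalar Kalman filter tracking an additive correction (random-walk state)."""

    offset: float = 0.0              # current estimate of the correction
    variance: float = 1.0            # uncertainty of that estimate
    process_noise: float = 1e-4      # assumed drift variance of the true offset
    measurement_noise: float = 1e-2  # assumed variance of the inspection residual

    def update(self, raw_inference: float, inspection: float) -> float:
        """Fold one inspection result into the estimate and return the new offset."""
        # Predict: the offset is assumed unchanged, but its uncertainty grows.
        predicted_variance = self.variance + self.process_noise
        # Correct: blend in the observed residual according to the Kalman gain.
        residual = inspection - raw_inference
        gain = predicted_variance / (predicted_variance + self.measurement_noise)
        self.offset += gain * (residual - self.offset)
        self.variance = (1.0 - gain) * predicted_variance
        return self.offset

    def correct(self, raw_inference: float) -> float:
        """Apply the current correction to a raw inference result."""
        return raw_inference + self.offset
```

A filter of this kind weights each new inspection result by the current uncertainty, so the correction settles quickly at first and then changes only gradually unless the residual drifts.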
In the third embodiment described above, the abnormality detection device acquires OES data, process data, and RF data output from an emission spectroscopic analyzer, a process data acquisition device, and a radio-frequency power supply device for plasma in accordance with the processing of a target object. However, the combination of data acquired by the abnormality detection device is not limited thereto. Any one of these types of data may be acquired, or a combination of any two of them may be acquired.
In each of the above-described embodiments, the inference sections 162B and 1600B with a fine-tuning function include the first to Mth network sections 1220_1 to 1220_M. However, the inference sections 162B and 1600B with a fine-tuning function are not required to include all of the first to Mth network sections 1220_1 to 1220_M, and need only include at least two of the network sections.
In each of the above-described embodiments, a machine learning algorithm of each network section of the learning section 161A is described as being configured based on a convolutional neural network. However, the machine learning algorithm of each network section of the learning section 161A is not limited to a convolutional neural network, and may be configured based on other machine learning algorithms.
In each of the embodiments described above, the virtual measurement device or the abnormality detection device 160A functions as the learning section 161A and the inference section 162A. However, a device functioning as the learning section 161A need not be integral with a device functioning as the inference section 162A, but may be configured separately. That is, the virtual measurement device or the abnormality detection device 160A may function as the learning section 161A not including the inference section 162A, or may function as the inference section 162A not including the learning section 161A.
In each of the embodiments described above, a virtual measurement device (or an abnormality detection device) in which a fine-tuning function is added to a virtual measurement model (or an abnormality detection model) generated in the system 100A is applied to the system 100B. However, the application target of the virtual measurement device (or the abnormality detection device) to which a fine-tuning function is added is not limited to other systems, but may be the own system.
For example, in a case where the degree of change is small, such as a case where a part of a process recipe is changed, a fine-tuning function may be added to a virtual measurement model (or an abnormality detection model) generated by the own system.
Alternatively, the fine-tuning function may be applied when the accuracy of a virtual measurement model (or an abnormality detection model) generated by the own system decreases, for example, when maintenance work such as parts replacement is performed on a device in the own system, or when the environment inside a device changes due to consumption of parts of the device in the own system.
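One conceivable way to detect such a decrease in accuracy is sketched below. The disclosure does not describe a detection mechanism. The sketch assumes that accuracy is judged by the mean absolute error over a sliding window of recent inspection results; the name `AccuracyMonitor`, the window length, and the limit are hypothetical.

```python
from collections import deque
from typing import Deque


class AccuracyMonitor:
    """Flags when the recent mean absolute error exceeds a limit."""

    def __init__(self, window: int, limit: float) -> None:
        self.errors: Deque[float] = deque(maxlen=window)
        self.limit = limit

    def record(self, inferred: float, inspection: float) -> bool:
        """Store one error and report whether fine-tuning should be started."""
        self.errors.append(abs(inferred - inspection))
        window_full = len(self.errors) == self.errors.maxlen
        mean_error = sum(self.errors) / len(self.errors)
        # Wait for a full window so that a single outlier does not trigger fine-tuning.
        return window_full and mean_error > self.limit
```

When `record` returns `True`, for example after parts replacement shifts the inference results away from the inspection data, the fine-tuning processing could be started in the own system.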
The present invention is not limited to the configurations illustrated here, including combinations of the configurations described in the above embodiments with other elements. These aspects can be changed without departing from the spirit of the present invention, and can be determined appropriately in accordance with the form of application.
The present application is based on and claims priority to Japanese Patent Application No. 2019-217439, filed on Nov. 29, 2019, the entire contents of which are hereby incorporated herein by reference.
| Number | Date | Country | Kind |
|---|---|---|---|
| 2019-217439 | Nov 2019 | JP | national |

| Filing Document | Filing Date | Country | Kind |
|---|---|---|---|
| PCT/JP2020/042564 | 11/16/2020 | WO | |