The present disclosure relates to a method which allows classification of defect type such as a particle and a scratch adhered onto a sample like a semiconductor wafer, or classification of defects into a defect of interest (DOI), and other defects or noise (Nuisance). The disclosure further relates to a method and a system for classification, and a computer-readable medium.
Defects existing on surfaces of semiconductor substrates and thin film substrates are inspected on a line for manufacturing the semiconductor substrates and the thin film substrates for the purpose of maintaining and improving the product yield. Patent Literature 1 discloses the defect inspection apparatus provided with multiple detectors each having a different relative angle to the normal vector on the substrate surface.
The inspection apparatus provided with multiple detectors as disclosed in Patent Literature 1 allows highly efficient detection of the multidirectionally scattering light. Meanwhile, the inventors have examined application of the artificial intelligence as the tool for classification of defects or the like for the purpose of classifying the defect with high accuracy. Appropriate training has to be executed for applying the artificial intelligence as the appropriate classifier. Patent Literature 1 does not disclose the method for training the classifier.
The present disclosure proposes the method and the system for defect inspection, aiming at defect classification using an appropriately trained learning device, and a computer-readable medium.
According to an aspect to attain the object, the disclosure proposes a defect inspection method for inspecting a defect on a sample based on output information of a detector for detecting a scattered light generated by irradiation of the sample with a light, using one or more computers. The defect inspection method includes the steps of receiving outputs of multiple detectors arranged at multiple elevation angles to a surface of the sample, and in multiple azimuths to an irradiation point of the light on the sample in a direction to the surface of the sample, and inputting the output information of the multiple detectors to a learning device trained using the output information of the multiple detectors and defect information for outputting the defect information.
According to another aspect to attain the object, the disclosure proposes a system which includes an inspection tool composed of multiple detectors for detecting a scattered light generated by irradiation of a sample with a light. The multiple detectors are arranged at multiple elevation angles to a surface of the sample, and in multiple azimuths to an irradiation point of the light on the sample in a direction to the surface of the sample. The system further includes a computer allowed to execute a program stored in a computer-readable storage medium for processing output information of the multiple detectors. The computer receives the output information of the multiple detectors, and inputs the output information of the multiple detectors to a learning device trained using the output information of the multiple detectors and defect information for outputting the defect information.
The disclosure further proposes a non-transitory computer-readable medium configured to store instructions of a program executable by a computer for processing output information of a detector for detecting a scattered light generated by irradiation of a sample with a light. The instructions cause the computer to receive outputs of multiple detectors arranged at multiple elevation angles to a surface of the sample, and in multiple azimuths to an irradiation point of the light on the sample in a direction to the surface of the sample, and to input the output information of the multiple detectors to a learning device trained using the output information of the multiple detectors and defect information for outputting the defect information.
The above-described configuration allows execution of the defect classification using the appropriately trained learning device.
Hereinafter, explanations will be made about a classification method which allows classification of defect types such as a particle and a scratch, adhered onto a sample like a semiconductor wafer or classification of defects into a defect of interest (DOI), and other defects or noise. The disclosure further relates to a system for classification, and a computer-readable medium.
<Structure Example of Inspection Apparatus>
The signal processing system 11 (for example, one or more computer systems) is configured to execute threshold determination with respect to signal intensity of a detector output. The system is further configured to identify the defect type based on a plurality of set thresholds. The signal processing system 11 includes an A/D conversion section 107, a signal processing section 110, a comparative computing section 111, an external input section 112, a data processing section 113, and a display section 114. The signal processing section 110 serves as a signal separator for separating a signal indicating the defect, the particle or the like as described later from a HAZE (reflected light (background light) caused by the sample surface roughness) signal. The data processing section 113 includes a learning device to be described later, and one or more computer systems for training the learning device. For example, a neural network, a regression tree, a Bayesian classifier, and the like may be used as the learning device.
Information to be input to an input section 115 may be input to the operation system 109 by an operator together with a recipe condition. A wafer identifier ID and a process state are displayed on an inspection result screen or expressed as recipe information. A table for converting the wafer identifier ID into an individual object and the process state allows the wafer identifier ID to contain the process state.
The signal processing section 110 may be configured to include an amplifier for amplifying outputs of multiple detectors, and a computation unit for addition or subtraction of outputs of multiple detectors.
The detection system 154 as illustrated in
The inspection apparatus as illustrated in
<Specific Structure Example of Detection System>
Referring to
The illuminating light is supplied obliquely to the wafer 101 so that the detection systems 151 to 156, and 161 to 166 detect the scattered light. The inspection apparatus 10 of the embodiment may be referred to as a so-called dark field type device. Each aperture of the detection systems 151 to 156, and 161 to 166 is illustrated as substantially circular when the wafer 101 is seen from its normal direction. However, the aperture may instead have a polygonal shape.
The detection system as illustrated in
The scattered light is focused by the detection optical system with a prescribed numerical aperture. The detection optical system includes multiple lenses (lens group), and constitutes a light condensing optical system or an image forming optical system. Undesirable components of the focused scattered light are blocked by a spatial filter and a polarizing filter, and the remaining light is photoelectrically converted by the photoelectric conversion element. The photoelectrically converted signal obtained as a current or a voltage is A/D converted by the A/D conversion section 107, and processed by the signal processing section 110. Examples of the photoelectric conversion element include a photomultiplier, an avalanche photodiode array, and a multi-pixel photon counter.
<Beam Spot Scanning Method>
<Configuration Example of Signal Processing System>
Alternatively, as illustrated in
<Configuration Example of System for Defect Classification>
The system includes a computer system 403 connected to those apparatus via a bus or a network. The computer system 403 is composed of one or more computer systems.
The computer system 403 includes a computer-readable medium 406, a processing unit 405 for executing the respective modules stored in the computer-readable medium 406, and an input/output device 410 which receives an input of information for generating teaching data or the like for a learning device.
The computer readable medium 406 stores a signal processing module (component) 407 for processing signals output from the inspection apparatus 10, a defect inspection module 408 for estimating type of the defect using the learning device such as the neural network, the regression tree, and the Bayesian classifier, and a model generation module 409 for training the learning device (model). The model generation module 409 is configured to train the learning device using teaching data constituted by data set including, for example, multiple outputs of multiple detectors of the inspection apparatus 10, a predetermined classification algorithm, defect type information input from the input/output device 410, ADC (Auto Defect Classification) results of the electron microscope 401, and the like.
The computer system as illustrated in
The defect inspection module 408 includes, for example, a learning model having parameters adjusted so that the estimation processing is executed using the learning model. The learning model formed as, for example, the neural network includes one or more input layers, one or more intermediate layers (hidden layers), and one or more output layers as illustrated in
The neural network executes learning by adjusting the parameters (weight, bias) so that desired classification results are obtained from the output layer. The learning is executed by successively updating the parameters (weight, bias) using, for example, an error back propagation algorithm (back propagation). The output error is partially differentiated with respect to each weight (taking the activation function into account), and the weights are adjusted gradually toward an optimum value.
In the neural network, the information input to the input layer is propagated sequentially to the intermediate layer, and to the output layer for outputting the estimated result. The intermediate layer is composed of multiple intermediate units. The information input to the input layer is weighted with a coupling coefficient between the input unit and the intermediate unit. The weighted information is then input to the intermediate unit. The input to the intermediate unit is added to obtain the value of the intermediate unit. The value of the intermediate unit is nonlinearly converted using an I/O function. The output from the intermediate unit is weighted with the coupling coefficient between the intermediate unit and the output unit. The weighted output is then input to each of the output units. The input to the output unit is added to obtain the output value of the output layer.
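The forward propagation described above may be sketched as follows. This is a minimal illustration under stated assumptions rather than the implementation of the embodiment: the function name `forward`, the use of `tanh` as the I/O function, the bias terms, and the list-based weight layout are all hypothetical choices made for the example.

```python
import math

def forward(inputs, w_hidden, b_hidden, w_out, b_out):
    """Propagate detector outputs through one intermediate layer.

    w_hidden[j][i] is the coupling coefficient from input unit i to
    intermediate unit j; w_out[k][j] couples intermediate unit j to
    output unit k. tanh is an assumed I/O function.
    """
    hidden = []
    for weights, bias in zip(w_hidden, b_hidden):
        # Weighted inputs are added to obtain the value of the unit.
        s = sum(w * x for w, x in zip(weights, inputs)) + bias
        # The value is nonlinearly converted by the I/O function.
        hidden.append(math.tanh(s))
    outputs = []
    for weights, bias in zip(w_out, b_out):
        # Weighted intermediate outputs are added at each output unit.
        outputs.append(sum(w * h for w, h in zip(weights, hidden)) + bias)
    return hidden, outputs
```

Here `inputs` would hold one value per detector (or per amplified, added, or subtracted detector signal), and each entry of `outputs` would correspond to one candidate class.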
The parameters (constant, coefficient) such as the coupling coefficient between the respective units, and the coefficient which describes the I/O function of each unit are gradually optimized by proceeding learning. The optimized parameters are stored in a predetermined storage medium as learning results of the neural network. Similarly, in the case of using the learning device other than the neural network, the parameters optimized in the learning process are stored in the predetermined storage medium.
The model generation module 409 calculates an error between the estimated result derived from the defect inspection module 408 and the information input as correct answer data (teaching data). More specifically, the model generation module 409 calculates a conversion error between the estimated result derived from the forward propagation and the correct answer data. The model generation module 409 adjusts the neural network parameter (variable) based on the calculated conversion error to suppress the possible conversion error. Repetitive execution of the forward propagation and the back propagation makes it possible to improve the output accuracy.
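The repetition of forward propagation, error calculation, and parameter adjustment may be illustrated with a deliberately reduced sketch. It trains a single sigmoid output unit by gradient descent instead of a full multi-layer network; the names `train_unit` and `predict`, the learning rate, and the number of epochs are assumptions introduced only for this example.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(w, b, x):
    """Forward propagation for one sigmoid unit."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def train_unit(samples, labels, lr=0.5, epochs=200):
    """Repeat forward propagation and parameter adjustment.

    samples: feature vectors (e.g. multi-detector outputs)
    labels:  correct answer data, 0 or 1 per sample
    """
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(epochs):
        for x, t in zip(samples, labels):
            y = predict(w, b, x)   # estimated result
            err = y - t            # error against the correct answer
            # Adjust weight and bias to reduce the error (gradient of
            # the cross-entropy loss for a sigmoid unit).
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
            b -= lr * err
    return w, b
```

In a network with intermediate layers, the same error would additionally be propagated backward through each layer, which this sketch omits.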
<Function Configuration Example of Computer System 403 Constituting the Defect Classifying System>
In a learning phase, the computer system 403 receives information necessary for generating the teaching data from a label information storage medium 501 which stores information on type (classification) of the defect or the like, and from a learning information storage medium 502 which stores outputs of multiple detectors of the inspection apparatus 10, or multi-detector output information such as signals obtained by amplifying, adding, and subtracting those outputs so that the learning model is trained. In an estimation phase, the computer system receives outputs of the multiple detectors, which are stored in the estimation information storage medium 503, or multi-detector output information such as signals obtained by amplifying, adding, and subtracting those outputs so that estimation processing of the defect type or the like is executed.
The information estimated by the estimation section may be fed back to provide the new teaching data. It is also possible to output the information estimated by the estimation section and an operator's determination result to the teaching data storage section 506 as the teaching data as indicated by a dashed line. Referring to
Explanations will be made about an estimation method, and an estimation system for identifying the type of the defect or the particle on the sample using the inspection apparatus and the system as described above, and a computer-readable medium for storing a program that allows one or more computers to execute the estimation processing.
<Scattered Light Intensity Distribution of Defect on Wafer>
The COP is a void defect as crystal defect type. Upon incidence of laser from a diagonal direction, the scattered light intensity distribution of the COP forms a symmetrical shape in a direction (left-right direction) orthogonal to the front-rear direction (when seen from the vertical direction to the wafer surface) as illustrated in
Referring to the inspection apparatus as illustrated in
In the case that the DOI is an isotropically scattering particle, it may be classified based on evaluation of the left-right symmetry without using the teaching data. To attain this classification, at least a part of the multiple detectors are disposed left-right symmetrically with respect to the incident light. Unlike the particle, the scratch and the noise exhibit a left-right asymmetrical distribution. Accordingly, they can be distinguished from the particle.
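One possible way to evaluate the left-right symmetry is sketched below. The asymmetry measure, the function names, and the tolerance of 0.1 are assumptions chosen for illustration and are not specified in the disclosure; `left` and `right` are assumed to hold the outputs of detectors placed at mirror-image positions with respect to the incident light.

```python
def left_right_asymmetry(left, right):
    """Return 0.0 for perfectly symmetric outputs, up to 1.0 otherwise.

    left[i] and right[i] are outputs of a mirror-symmetric detector pair.
    """
    total = sum(left) + sum(right)
    if total == 0:
        return 0.0
    diff = sum(abs(l - r) for l, r in zip(left, right))
    return diff / total

def is_isotropic_particle(left, right, tolerance=0.1):
    """Treat a nearly symmetric distribution as a particle (assumed rule)."""
    return left_right_asymmetry(left, right) <= tolerance
```

A signal rejected by `is_isotropic_particle` would then be a candidate for the scratch or noise classes.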
The information about the identified type of the particle, or the information about the type identified using the SEM and AFM is input to the system illustrated in
The detection optical system illustrated in
The signal processing system 11 or the computer system 403 determines existence/non-existence of the defect by the threshold determination with respect to the detector output information.
The noise contains much shot noise owing to background light generated by minute unevenness (roughness) 708 on the wafer surface. As the shot noise is randomly generated, it is unlikely to be detected by multiple detectors simultaneously. In this embodiment, outputs of the multiple detectors are evaluated against the threshold 709, which is set at a level that tolerates the noise. It is further determined whether the output ratio among multiple detectors is a predetermined value or larger, whether the incidence of signals of the multiple detectors that are equal to or larger than the threshold is a predetermined value or larger, or whether all outputs of the multiple detectors exceed the threshold 709 (determination based on AND condition). If the condition is satisfied, it is determined that the signal indicates the defect. Otherwise, it is determined that the signal indicates the noise. This makes it possible to remove the noise component.
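The threshold determination across multiple detectors may be sketched as follows. The function name `is_defect`, the `mode` argument, and the parameter `min_count` are hypothetical; only the AND condition and the count-based condition from the paragraph above are shown, and the output-ratio condition is omitted.

```python
def is_defect(outputs, threshold, mode="and", min_count=None):
    """Decide defect versus noise from simultaneous detector outputs.

    mode="and":   every detector output must exceed the threshold.
    mode="count": at least min_count outputs must be equal to or
                  larger than the threshold.
    """
    if mode == "and":
        return all(v > threshold for v in outputs)
    if mode == "count":
        if min_count is None:
            raise ValueError("min_count is required for mode='count'")
        return sum(v >= threshold for v in outputs) >= min_count
    raise ValueError("unknown mode: " + mode)
```

Because shot noise rarely appears in several detectors at once, a signal seen by only one detector fails both conditions and is treated as noise.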
Determination is made on detectability using multiple detectors to separate the noise from the particles and defects. Thereafter, evaluation is made on symmetry and distribution of the scattered light intensity to attain the micro defect detection and high accuracy classification. The noise signal (multi-detector output information) is labelled as noise to separate the noise from the defect signal with higher accuracy.
The multi-detector output information, and the database representing the relationship between the defect type and noise are preliminarily provided to allow identification of the defect type, and noise filtering with reference to the database.
<Another Example of Defect Classification>
An explanation will be made about another example of the defect classification based on outputs of the multiple detectors.
Referring to
The boundary in the multidimensional space is evaluated from multiple directions to allow estimation of size of minimum classifiable particle. Outputs of three or more detectors are plotted in the multidimensional space to allow formation of a curved boundary with higher classifying accuracy based on each boundary between clusters in each dimension.
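As a simple stand-in for a boundary between clusters in the multidimensional space, the sketch below assigns a point to the class whose cluster centroid is nearest. Nearest-centroid classification is an assumption made for illustration and yields piecewise-linear boundaries, not the curved boundary mentioned above; the class names and function names are likewise hypothetical.

```python
import math

def centroid(points):
    """Gravity center of a cluster; each point has one value per detector."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(len(points[0]))]

def classify(point, centroids):
    """Return the label of the nearest centroid in detector-output space."""
    return min(centroids, key=lambda label: math.dist(point, centroids[label]))

# Example with three detectors, i.e. a three-dimensional space.
centroids = {
    "particle": centroid([[1, 1, 1], [3, 3, 3]]),
    "scratch": centroid([[10, 0, 0], [12, 0, 2]]),
}
```

With these centroids, `classify([2, 2, 2], centroids)` selects the particle cluster and `classify([11, 0, 1], centroids)` selects the scratch cluster.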
<Details of Labeling Processing>
The specific method of labeling for constructing the learning model will be described.
In the simulation, the scattered light intensity distribution in accordance with the shape and reflection factor of the particle is obtained using the Monte Carlo method, for example (step 902). The scattered light intensity at the elevation angle and the azimuth angle at which the detector is positioned is calculated (step 903). When the respective values of scattered light intensity detected by multiple detectors are obtained, information of combined outputs is labeled using the information about the type of the defect, which has been input in step 901 (step 904). Results of estimating the scattered light intensity are plotted in the multidimensional space having an output of the single detector being in one dimension (step 905).
The foregoing processing attains construction of the learning device capable of estimating the defect type based on outputs from the inspection apparatus provided with multiple detectors.
Referring to the flowchart in
A comparison is made between the scattered light intensity distribution (plotted result) derived from the scattered light simulation and the scattered light intensity result based on the real sample inspection (step 909).
<Labeling to Feature Value>
An explanation will be made about the method for executing labeling to the feature value (multi-detector output information) based on the information derived from inspection executed multiple times. The wafer is introduced into the inspection apparatus 10 (step 1101), and subjected to the inspection multiple times (for example, 10 times) (step 1102). When inspecting the same wafer multiple times, the same defect is detected on the same coordinate. Meanwhile, as the noise is randomly generated, it is unlikely that the noise detected on the coordinate in an inspection is detected in another inspection again.
After execution of the inspection multiple times, or in the inspection, the computer system 403 extracts the defect candidate by the threshold determination (step 1103). Similar to the threshold 709 in
The teaching data are generated using the selected signal as described above, and the labeling information indicating either the real defect or the noise. The generated teaching data are used for training the learning device, which allows construction of a learning device capable of accurately distinguishing the real defect from the noise (steps 1106, 1107). The output signals each determined as the noise are labeled as either noise or nuisance, which allows construction of a learning device that distinguishes the DOI from the nuisance with higher accuracy.
It is possible to execute more specific classification with respect to the signal classified as the real defect through the classification process represented by the flowchart in
Coordinates in the range where the capture rate is close to the predetermined value (for example, assuming that the capture rate CP1 is 90%, the range from 88% (CP2) to 92% (CP3), that is, CP2≤CP1≤CP3) may be reviewed using the electron microscope in place of, or in addition to the coordinates on which the capture rate is equal to or higher than the predetermined value. It is difficult to determine whether a signal close to the threshold originates from the real defect or the false report. Selectively reviewing these coordinates collects data for appropriate learning while keeping the number of reviews from becoming excessive. If the image of the particle cannot be acquired by observation using the SEM, it is determined that the signal originates from noise, and labeling is executed accordingly.
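The capture-rate evaluation over repeated inspections may be sketched as follows. The sketch assumes that the capture rate of a coordinate is the fraction of inspections in which a defect candidate was detected there, and that coordinates from different runs have already been matched; the function names and the three category strings are hypothetical.

```python
def capture_rates(detections_per_run):
    """Map each coordinate to the fraction of runs in which it was detected.

    detections_per_run: one collection of coordinates per inspection run.
    """
    n_runs = len(detections_per_run)
    counts = {}
    for run in detections_per_run:
        for coord in set(run):  # count a coordinate once per run
            counts[coord] = counts.get(coord, 0) + 1
    return {coord: k / n_runs for coord, k in counts.items()}

def triage(rates, cp1=0.90, cp2=0.88, cp3=0.92):
    """Sort coordinates into review / real-defect / noise candidates."""
    result = {}
    for coord, rate in rates.items():
        if cp2 <= rate <= cp3:
            result[coord] = "review"          # close to CP1: review by SEM
        elif rate >= cp1:
            result[coord] = "real_candidate"
        else:
            result[coord] = "noise_candidate"
    return result
```

For ten inspections, a coordinate detected nine times falls in the review range, one detected every time becomes a real-defect candidate, and one detected once becomes a noise candidate.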
Instead of executing the ADC, the label information may be manually input from a GUI screen as illustrated in
The defect type information is input to the right section with reference to the left section as illustrated in
<Another Defect Classification Processing Using Learning Device>
Another defect classification method using the learning device will be described.
After acquisition of the peripheral information, the high accuracy classification is executed on the stored defect candidate information and peripheral information, using outputs of more detectors than those used for the candidate determination executed in step 1301, so that the inspection is executed with higher accuracy (step 1303). Execution of the high accuracy classification through the fitting processing takes a relatively longer time than the rough classification of the defect candidates. If the processing takes too much time, the inspection apparatus may not be operated at a high operation rate. In this embodiment, the rough classification (which tolerates mixture of noise) is executed using outputs of a small number of detectors. The high accuracy classification is then executed on the roughly classified result using a relatively larger number of detectors (multiple low elevation angle detectors and multiple high elevation angle detectors) in step 1303. The high accuracy classification is executed using widely ranged detector outputs (for example, raw data).
Execution of the high accuracy classification to all the defect candidates takes much time. The rough classification is executed up to the stage of peripheral data collection, and the classification is further executed using the peripheral data collected in the high accuracy classification. This makes it possible to execute the classification with high efficiency and high accuracy.
Data equivalent to N cycles are stored in a signal buffer (FIFO: First In First Out) to allow execution of steps 1301 to 1303 repeatedly. In this case, the original signal data corresponding to the N cycles may be referred to. However, the processing using data prior to the N cycles has to be completed before execution of the next scanning.
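A FIFO signal buffer holding the most recent N cycles may be sketched with a bounded double-ended queue. The class name `SignalBuffer` and its methods are hypothetical; what constitutes one "cycle" of data depends on the scanning scheme and is left abstract here.

```python
from collections import deque

class SignalBuffer:
    """Keep the original signal data of the latest n_cycles cycles."""

    def __init__(self, n_cycles):
        # A deque with maxlen discards the oldest entry automatically
        # when a new one is appended to a full buffer (first in, first out).
        self._buf = deque(maxlen=n_cycles)

    def push(self, cycle_data):
        self._buf.append(cycle_data)

    def window(self):
        """Return buffered cycles, oldest first, for peripheral lookup."""
        return list(self._buf)
```

Because the oldest cycle is dropped as soon as the buffer is full, any processing that needs that cycle has to finish before the next `push`, which mirrors the timing constraint stated above.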
Each defect extending in a specific direction forms a Gaussian distribution extending in that direction. It is possible to execute the high accuracy classification using the two-dimensional distribution information (detector output information in the two-dimensional region including multiple sampling points which contain the defect candidate coordinate) by preliminarily providing a table indicating a relationship between the defect type and the fitting shape of the scattered light intensity distribution in the two-dimensional region. It is also possible to execute the classification using the neural network by training the learning device with teaching data in which the two-dimensional distribution information is labeled with the defect type.
<Setting of Inspection Condition>
An explanation will be made about the computer system or the inspection apparatus for setting appropriate learning and appropriate inspection conditions.
The defect candidate for evaluation (teaching) is selected (step 1604). It is preferable to select the defect required to be identified, and the defect positioned around the boundary between the clusters in the feature value space. Preferably, the defect candidates are selected evenly in the feature value space on the premise that they are selected as the teaching data for the learning device. In this case, the defect candidates suitable for the teaching data are selected. The selected defect candidate is classified as the real defect or the false report (step 1605). The classification may be performed by executing the inspection multiple times repeatedly under the same inspection conditions as represented in
After the classification as described above, the inspection condition suitable for the inspection apparatus is selected (step 1607). The inspection condition suitable for classification of the real defect/false report, or classification and detection of the defect type is selected. The method for selecting the inspection condition may be implemented as described below. For example, the distance from the boundary set in the feature value space, or the distance from the gravity center of the cluster is set as the evaluation criteria to select the inspection condition in which the distance to the cluster or the region to be classified is relatively reduced. It is also possible to select the inspection condition having a relatively higher S/N of the signal as the parameter. The above-described processing is automatically executed to allow setting of the inspection conditions suitable for training of the learning device and classification using the learning device.
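The selection of an inspection condition by distance from the gravity center of a cluster may be sketched as follows. The sketch assumes that, for each candidate condition, feature vectors of defects belonging to one class have been collected, and that the condition giving the tightest cluster is preferred; the function names and the use of the mean Euclidean distance are assumptions for illustration.

```python
import math

def mean_distance_to_centroid(points):
    """Average distance of feature vectors from their gravity center."""
    dims = len(points[0])
    center = [sum(p[i] for p in points) / len(points) for i in range(dims)]
    return sum(math.dist(p, center) for p in points) / len(points)

def select_condition(features_by_condition):
    """Return the condition name whose cluster is most compact.

    features_by_condition: {condition name: list of feature vectors}
    """
    return min(
        features_by_condition,
        key=lambda name: mean_distance_to_centroid(features_by_condition[name]),
    )
```

The same structure could instead rank conditions by S/N, with `max` in place of `min`.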
Additionally, apparatus conditions and processing may be optimized. Specifically, load distribution conditions upon parallel signal processing of the CPU and the GPU may be optimized. In such a case, the real inspection is executed to learn the load state in each processing upon inspection so that allocation to the CPU core is changed. More specifically, the parallel processing is executed in the server, the core is selected, and the learning result is used to determine as to which processing is allocated to which core. In the case of the detection system as illustrated in
<Labeling Processing Before/after Semiconductor Manufacturing Process>
A wafer 1701 illustrated in
In order to execute the highly sensitive inspection, the feed pitch in the direction r of the stage is narrowed to increase an overlap amount of beam major axes so that the stable signal is acquired. The position of the same wafer in the direction r is inspected multiple times by stopping the movement in the direction r so that an average is taken. The randomly generated noise is removed in the processing to emphasize only the real signal. More specifically, the inspection is normally executed at the positions from R1 (radial position of the wafer), R2, R3, . . . to Rfinal. Meanwhile, the inspection is executed repeatedly at each position from R1, R2, R3, . . . to Rfinal. The inspection time is obtained by multiplying the normal inspection time by the number of repeated inspections. Alternatively, the beam diameter may be reduced to narrow the feed pitch, or the rotation speed may be lowered to increase the integrated value of the scattered light quantity as the beam passes the defect.
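The averaging of repeated inspections at the same radial position may be sketched as follows. The function name `average_repeats` and the representation of one scan as a list of samples taken at identical positions are assumptions for the example.

```python
from statistics import mean

def average_repeats(scans):
    """Average sample by sample over repeated scans of the same position.

    scans: list of scans, each a list of detector samples of equal length.
    Randomly generated noise tends to average out, while a real signal
    that appears in every scan is preserved.
    """
    return [mean(samples) for samples in zip(*scans)]
```

For example, `average_repeats([[1.0, 10.0, 0.0], [0.0, 10.0, 1.0]])` keeps the persistent value 10.0 and halves the isolated noise samples.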
The highly sensitive inspection may be executed by execution of the inspection multiple times as represented in
A second inspection is executed to a wafer 1702 corresponding to the wafer which has been subjected to the management target process. In this case, the highly sensitive inspection is executed as well.
A comparison is made between the wafers before and after execution of the management target process on each coordinate on which the predetermined feature value is acquired (the coordinate on which the detector output becomes equal to or larger than a predetermined value). A newly generated defect, which has not been detected in the first inspection, is regarded as a particle adhered in the management target process. For the purpose of managing the management target process, preferably, the learning device is configured to selectively classify the defect generated in the management target process as the DOI. As a result of the coordinate comparison, the computer system 403 therefore applies labeling indicating nuisance to the defect which has been labeled as the DOI on the wafer 1701. If a particle has not been detected on the wafer 1701 but is newly determined as the DOI on the wafer 1702, the particle is labeled as the DOI.
Learning is executed based on the label information generated by executing the above-described process to allow formation of the learning device suitable for evaluating the management target process.
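The coordinate comparison between the first and second inspections may be sketched as follows. The matching tolerance `tol`, the function name, and the representation of coordinates as (x, y) tuples are assumptions; the disclosure does not specify how coordinates are matched.

```python
import math

def label_after_process(before_coords, after_coords, tol=1.0):
    """Label defects found after the management target process.

    A defect that already existed within tol of a pre-process coordinate
    is labeled nuisance; a newly appearing defect is labeled DOI.
    """
    labels = []
    for coord in after_coords:
        seen_before = any(math.dist(coord, b) <= tol for b in before_coords)
        labels.append((coord, "nuisance" if seen_before else "DOI"))
    return labels
```

The resulting (coordinate, label) pairs can be joined with the multi-detector output information at each coordinate to form teaching data.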
The first and the second inspections may be executed without the threshold determination, or using the low threshold. Upon the inspection without using the threshold, if the distance of the feature value in the feature value space between before and after execution of the management target process is equal to or longer than a predetermined value, and the size (signal amount) is equal to or larger than a predetermined value, the label information indicating the DOI may be generated, and otherwise, the label information indicating the nuisance is generated. If the inspection is executed using the threshold, or the shot noise is removed, data obtained before execution of the management target process are compared with those obtained after execution of the management target process based on the defect data coordinate which exists after execution of the management target process. This makes it possible to determine whether the label information indicates the DOI (newly detected in the second inspection), or the nuisance (defect is detected on the same coordinate both in the first and the second inspections).
An inspection using the inspection apparatus 10 under the inspection apparatus condition (sensitivity) in a normal operation state is executed to the wafer 1702 as a target of the second inspection. Compared with the case of the highly sensitive inspection, the inspection apparatus in the normal operation state has to be operated while considering the throughput. It is therefore difficult to perform the work which may lower the throughput such as the inspection executed multiple times. An explanation will be made about the method for forming a learning device which allows estimation without executing the inspection multiple times utilizing the learning device which has been trained with the label information derived from the highly sensitive inspection.
As the same wafer is inspected both in the highly sensitive inspection and the normal operation, the same defect exists on the same coordinate. It is possible to use the label on each coordinate applied in the highly sensitive inspection for the feature value obtained in the inspection in the normal operation state. The label on each coordinate in the highly sensitive inspection is applied to the coordinate on which the feature value of the inspection in the normal operation state is obtained. Such label is then set as the one indicating the feature value of the inspection in the normal operation state.
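The transfer of labels from the highly sensitive inspection to the feature values of the normal-operation inspection may be sketched as follows. The sketch assumes that coordinates have been quantized so that the same defect has an identical key in both inspections; the function name and the dictionary layout are hypothetical.

```python
def transfer_labels(high_sens_labels, normal_features):
    """Attach labels from the highly sensitive inspection by coordinate.

    high_sens_labels: {coordinate: label}
    normal_features:  {coordinate: feature vector from normal operation}
    Returns (feature vector, label) pairs usable as teaching data.
    Coordinates without a label from the highly sensitive inspection
    are skipped.
    """
    return [
        (features, high_sens_labels[coord])
        for coord, features in normal_features.items()
        if coord in high_sens_labels
    ]
```

Repeating this for several normal-operation inspections of the same wafer yields multiple labeled feature vectors per coordinate.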
Labeling is executed repeatedly by acquiring the feature values upon inspection in the normal operation state, and making a comparison with the information derived from the highly sensitive inspection. This makes it possible to acquire the teaching data which contain variation in the feature value space. The feature values derived from the inspection executed multiple times are averaged for each coordinate so that variation in the teaching data generated in the normal operation inspection may be suppressed. Execution of the deviation-containing learning or labeling adapted to allocation allows training of the learning device employed in the normal operation state using the teaching data based on secure classification results derived from the high accuracy inspection.
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/JP2020/023173 | 6/12/2020 | WO |
Publishing Document | Publishing Date | Country | Kind |
---|---|---|---|
WO2021/250884 | 12/16/2021 | WO | A |
Number | Name | Date | Kind |
---|---|---|---|
8908172 | Urano | Dec 2014 | B2 |
9601393 | Lee | Mar 2017 | B2 |
9823065 | Kondo | Nov 2017 | B2 |
10228332 | Honda et al. | Mar 2019 | B2 |
10755396 | Enyama | Aug 2020 | B2 |
20080075352 | Shibuya et al. | Mar 2008 | A1 |
20080075353 | Tek | Mar 2008 | A1 |
20090299681 | Chen | Dec 2009 | A1 |
20120229618 | Urano | Sep 2012 | A1 |
20120293795 | Urano et al. | Nov 2012 | A1 |
20120294507 | Sakai et al. | Nov 2012 | A1 |
20130077092 | Sasazawa et al. | Mar 2013 | A1 |
20160358041 | Venkataraman et al. | Dec 2016 | A1 |
20170146463 | Honda et al. | May 2017 | A1 |
20170194126 | Bhaskar et al. | Jul 2017 | A1 |
20180157933 | Brauer et al. | Jun 2018 | A1 |
20190073566 | Brauer | Mar 2019 | A1 |
20190073568 | He et al. | Mar 2019 | A1 |
20190294923 | Riley et al. | Sep 2019 | A1 |
20190303717 | Bhaskar et al. | Oct 2019 | A1 |
20190370955 | Zhang et al. | Dec 2019 | A1 |
20230175979 | Honda | Jun 2023 | A1 |
20230175982 | Honda | Jun 2023 | A1 |
Number | Date | Country |
---|---|---|
2008-82821 | Apr 2008 | JP |
2011-163855 | Aug 2011 | JP |
2011-179823 | Sep 2011 | JP |
2013-72788 | Apr 2013 | JP |
2015-197320 | Nov 2015 | JP |
6328468 | May 2018 | JP |
2018-524804 | Aug 2018 | JP |
2019-508678 | Mar 2019 | JP |
2020-500422 | Jan 2020 | JP |
2020-501154 | Jan 2020 | JP |
WO-2005024404 | Mar 2005 | WO |
Entry |
---|
International Search Report (PCT/ISA/210) issued in PCT Application No. PCT/JP2020/023173 dated Sep. 15, 2020 with English translation (seven (7) pages). |
Japanese-language Written Opinion (PCT/ISA/237) issued in PCT Application No. PCT/JP2020/023173 dated Sep. 15, 2020 (four (4) pages). |
Number | Date | Country | |
---|---|---|---|
20230175981 A1 | Jun 2023 | US |