This disclosure relates to semiconductor inspection, and more specifically to classifying defects detected by semiconductor inspection.
Modern optical semiconductor-inspection tools use wavelengths that are significantly longer than the dimensions of a typical defect, often by an order of magnitude or more. As such, inspection tools cannot resolve the defects and thus cannot provide images showing the defects; instead, the inspection tools merely provide an indication that a defect has been detected. Furthermore, many of the detected defects are so-called nuisance defects that do not impact device functionality and are not of interest to process-integration and yield-improvement engineers. Moreover, nuisance defects may outnumber defects of interest, for example by a factor of 1000 or more. The high volume of nuisance defects makes it impractical to perform subsequent failure analysis (e.g., visualization using a scanning electron microscope) on all identified defects. The high volume of nuisance defects also makes it impossible to determine whether a wafer should be scrapped or reworked due to a high number of defects of interest.
Existing techniques for distinguishing defects of interest from nuisance defects are limited in their effectiveness. For example, a single best optical mode for distinguishing the two types of defects may be identified and used for inspection. This approach ignores information that other optical modes can provide. Other techniques that consider the union or intersection of inspection results for multiple modes fail to consider the multiple modes in cohort and thus are too simplistic.
Accordingly, there is a need for improved methods and systems of classifying defects. Such methods and systems may use inspection results for multiple modes in cohort.
In some embodiments, a defect-classification method includes scanning a semiconductor die in a semiconductor-inspection tool, using a plurality of optical modes. The method also includes steps performed in a computer system comprising one or more processors and memory storing instructions for execution by the one or more processors. The steps include identifying a plurality of defects on the semiconductor die based on results of the scanning. Respective defects of the plurality of defects correspond to respective pixel sets of the semiconductor-inspection tool. The scanning fails to resolve the respective defects. The results include multi-dimensional data based on pixel intensity for the respective pixel sets, wherein each dimension of the multi-dimensional data corresponds to a distinct mode of the plurality of optical modes. The steps also include, for the respective pixel sets, applying a discriminant function (which may also be referred to as a classification function) to the results to transform the multi-dimensional data into respective scores and, based at least in part on the respective scores, dividing the respective defects into distinct classes.
In some embodiments, a non-transitory computer-readable storage medium stores one or more programs for execution by one or more processors of a semiconductor-inspection system that includes a semiconductor-inspection tool. The one or more programs include instructions for causing a semiconductor-inspection tool to scan a semiconductor die using a plurality of optical modes and for identifying a plurality of defects on the semiconductor die based on results of the scanning. Respective defects of the plurality of defects correspond to respective pixel sets of the semiconductor-inspection tool. The scanning fails to resolve the respective defects. The results include multi-dimensional data based on pixel intensity for the respective pixel sets, wherein each dimension of the multi-dimensional data corresponds to a distinct mode of the plurality of optical modes. The one or more programs also include instructions for applying a discriminant function to the results for the respective pixel sets to transform the multi-dimensional data into respective scores and for dividing the respective defects into distinct classes, based at least in part on the respective scores.
For a better understanding of the various described implementations, reference should be made to the Detailed Description below, in conjunction with the following drawings.
Like reference numerals refer to corresponding parts throughout the drawings and specification.
Reference will now be made in detail to various embodiments, examples of which are illustrated in the accompanying drawings. In the following detailed description, numerous specific details are set forth in order to provide a thorough understanding of the various described embodiments. However, it will be apparent to one of ordinary skill in the art that the various described embodiments may be practiced without these specific details. In other instances, well-known methods, procedures, components, circuits, and networks have not been described in detail so as not to unnecessarily obscure aspects of the embodiments.
In some embodiments, each optical mode of the plurality of optical modes has (304) a distinct combination of optical characteristics. In some embodiments, the optical characteristics are selected from the group consisting of range of wavelengths, polarization, focus, transmission distribution in the illumination aperture, transmission distribution in the collection aperture, and phase-shift distribution in the collection aperture. For example, a first mode has a first polarization and a second mode has a second polarization distinct from the first polarization. In the same example or another example, the first mode has a first range of wavelengths and the second mode has a second range of wavelengths distinct from the first range of wavelengths.
In some embodiments, respective portions of the semiconductor die are simultaneously illuminated (306) using the plurality of optical modes or a subset thereof. The respective portions of the semiconductor die are simultaneously imaged (306) using distinct detector arrays (e.g., detector arrays 122,
A plurality of defects is identified (308) on the semiconductor die based on results of the scanning (e.g., by comparing, for each optical mode, an image obtained by the scanning to another image that does not include defects). Respective defects of the plurality of defects correspond to respective pixel sets of the semiconductor-inspection tool. (Each pixel set is the set affected by a corresponding defect.) The scanning fails to resolve the respective defects, because the respective defects are too small in comparison to the wavelength being used. The results include multi-dimensional data based on pixel intensity (e.g., pixel-intensity data or data for an attribute derived from the pixel intensity) for the respective pixel sets, wherein each dimension of the multi-dimensional data corresponds to a distinct mode of the plurality of optical modes. For example, the results include a vector $\vec{x}$ for each defect, where each entry $x_k$ of the vector $\vec{x}$ is a value based on pixel intensity (e.g., the intensity or an attribute value derived from the intensity) for a respective optical mode (i.e., for the $k$th optical mode) of the plurality of optical modes. Each entry of the vector $\vec{x}$ thus corresponds to a result for a distinct optical mode for a particular defect.
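As a rough illustration of how the per-defect vectors might be assembled, the sketch below subtracts a defect-free reference image from the scanned image for each optical mode and collects the difference intensities at flagged pixels. It is a minimal sketch under stated assumptions: the function names, the single-pixel "pixel set," and the detection rule (absolute difference above a `threshold` in at least one mode) are all hypothetical and are not taken from the description above.

```python
from typing import Dict, List, Tuple

Image = List[List[float]]  # one grayscale image per optical mode (rows of intensities)


def difference_image(test: Image, reference: Image) -> Image:
    """Pixel-wise difference between a scanned image and a defect-free reference."""
    return [[t - r for t, r in zip(t_row, r_row)] for t_row, r_row in zip(test, reference)]


def defect_vectors(
    test_images: List[Image],
    reference_images: List[Image],
    threshold: float,  # hypothetical detection threshold, not from the source
) -> Dict[Tuple[int, int], List[float]]:
    """Return {(row, col): x}, where x[k] is the difference intensity in mode k.

    Assumed rule: a pixel is flagged when its absolute difference exceeds
    the threshold in at least one optical mode.
    """
    diffs = [difference_image(t, r) for t, r in zip(test_images, reference_images)]
    rows, cols = len(diffs[0]), len(diffs[0][0])
    vectors: Dict[Tuple[int, int], List[float]] = {}
    for r in range(rows):
        for c in range(cols):
            x = [d[r][c] for d in diffs]  # one entry per optical mode
            if any(abs(v) > threshold for v in x):
                vectors[(r, c)] = x
    return vectors


# Two optical modes, 2x3 images; one pixel deviates from the reference.
reference = [[10.0, 10.0, 10.0], [10.0, 10.0, 10.0]]
mode_a = [[10.0, 10.0, 10.0], [10.0, 14.0, 10.0]]
mode_b = [[10.0, 10.0, 10.0], [10.0, 11.0, 10.0]]
vectors = defect_vectors([mode_a, mode_b], [reference, reference], threshold=3.0)
print(vectors)  # {(1, 1): [4.0, 1.0]}
```

Here the defect is detected only because of its signal in the first mode, yet the resulting vector retains the weaker second-mode value, which is the information that a single-mode approach would discard.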
For the respective pixel sets, a discriminant function (i.e., a classification function) is applied (310) to the results to transform the multi-dimensional data into respective scores. In some embodiments, the discriminant function is linear. The linear discriminant function may specify a direction that separates (e.g., maximally separates) the respective defects into the distinct classes. For example, for a respective pixel set, applying the discriminant function includes determining a projection of a vector containing the multi-dimensional data onto an axis corresponding to (e.g., perpendicular to) the direction specified by the linear discriminant function. In other embodiments, the discriminant function is non-linear.
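For example, to apply the discriminant function, means are calculated for each class of defects. If there are two classes indexed class 0 and class 1 (e.g., nuisance defects and defects of interest, respectively), then means $\mu_0$ and $\mu_1$ are calculated:
$$\mu_0 = \frac{1}{N_0} \sum_{i=1}^{N_0} \vec{x}_{0i} \quad (1)$$
$$\mu_1 = \frac{1}{N_1} \sum_{i=1}^{N_1} \vec{x}_{1i} \quad (2)$$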
where $\vec{x}_{ji}$ is the $i$th defect in class $j$, $N_0$ is the number of defects in class 0 (e.g., the number of nuisance defects), and $N_1$ is the number of defects in class 1 (e.g., the number of defects of interest). Each summation is thus over all the defects in the respective class. Covariances are then calculated using the means. For class 0 and class 1, respective covariances $S_0$ and $S_1$ are calculated:
where $i$ indexes the defects of the respective classes. A pooled covariance $S_p$ for the defect classes is then calculated:
where $N = N_0 + N_1$.
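The class statistics described above can be sketched as follows. The covariance and pooled-covariance expressions are not reproduced in this text, so the sketch assumes the conventional unbiased estimators: each class covariance is normalized by $N_j - 1$, and the pooled covariance is $[(N_0 - 1) S_0 + (N_1 - 1) S_1] / (N - 2)$. Those normalizations, the function names, and the toy data are assumptions for illustration. Plain Python lists are used only to keep the example dependency-free.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def class_mean(samples: List[Vector]) -> Vector:
    """Mean vector over all defects of one class."""
    n = len(samples)
    return [sum(column) / n for column in zip(*samples)]


def class_covariance(samples: List[Vector], mean: Vector) -> Matrix:
    """Class covariance; the 1/(N_j - 1) normalization is an assumption."""
    n, dim = len(samples), len(mean)
    cov = [[0.0] * dim for _ in range(dim)]
    for x in samples:
        d = [xi - mi for xi, mi in zip(x, mean)]
        for a in range(dim):
            for b in range(dim):
                cov[a][b] += d[a] * d[b] / (n - 1)
    return cov


def pooled_covariance(s0: Matrix, n0: int, s1: Matrix, n1: int) -> Matrix:
    """Pooled covariance; the (N_j - 1) / (N - 2) weighting is an assumption."""
    dim = len(s0)
    n = n0 + n1
    return [
        [((n0 - 1) * s0[a][b] + (n1 - 1) * s1[a][b]) / (n - 2) for b in range(dim)]
        for a in range(dim)
    ]


# Toy training data: each vector holds one value per optical mode (two modes).
nuisance = [[1.0, 2.0], [2.0, 1.0], [3.0, 3.0]]           # class 0
defects_of_interest = [[6.0, 5.0], [7.0, 7.0], [8.0, 6.0]]  # class 1

mu0 = class_mean(nuisance)
mu1 = class_mean(defects_of_interest)
s0 = class_covariance(nuisance, mu0)
s1 = class_covariance(defects_of_interest, mu1)
sp = pooled_covariance(s0, len(nuisance), s1, len(defects_of_interest))
print(mu0, mu1, sp)
```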
The pooled covariance $S_p$ is used in a transformation that transforms $\vec{x}$ into a score $\vec{L}$ with a dimensionality equal to the number of classes. For the example of class 0 and class 1,
$$\vec{L} = [L_0, L_1]^T \quad (6)$$
The transformation is achieved by applying the discriminant function of step 310. In the example of a linear discriminant function,
where $i$ indexes the classes. In equation 11, $pr_i$ is a prior probability distribution that may be assumed to be constant:
$$pr_i = N_i / N \quad (12)$$
Equation 7 effectively specifies a direction that maximally separates the identified defects into class 0 and class 1. Equation 7 projects the vector $\vec{x}$, which contains the multi-dimensional data, onto an axis perpendicular to this direction. If the discriminant function is non-linear, the transformation of equation 7 is replaced by a non-linear transformation.
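A linear discriminant of this kind can be sketched as below. Because equations 7 through 11 are not reproduced in this text, the sketch assumes the textbook linear-discriminant form, in which the score for class $i$ is $L_i = \mu_i^T S_p^{-1} \vec{x} - \tfrac{1}{2} \mu_i^T S_p^{-1} \mu_i + \ln pr_i$ with $pr_i = N_i / N$. That form, the function names, and the numeric inputs are assumptions. The small `solve` helper stands in for a library linear solver.

```python
import math
from typing import List, Sequence, Tuple

Vector = List[float]
Matrix = List[List[float]]


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(ai * bi for ai, bi in zip(a, b))


def solve(a: Matrix, b: Vector) -> Vector:
    """Solve a @ y = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= factor * aug[col][k]
    y = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(aug[r][k] * y[k] for k in range(r + 1, n))
        y[r] = (aug[r][n] - tail) / aug[r][r]
    return y


def fit_linear_discriminant(
    means: List[Vector], pooled_cov: Matrix, counts: List[int]
) -> Tuple[List[Vector], List[float]]:
    """Return per-class weight vectors and offsets (assumed textbook LDA form)."""
    total = sum(counts)
    # S_p is symmetric, so solving S_p w = mu_i gives w^T = mu_i^T S_p^-1.
    weights = [solve(pooled_cov, mu) for mu in means]
    offsets = [
        -0.5 * dot(w, mu) + math.log(n / total)  # prior pr_i = N_i / N
        for w, mu, n in zip(weights, means, counts)
    ]
    return weights, offsets


def scores(x: Vector, weights: List[Vector], offsets: List[float]) -> List[float]:
    """Transform a multi-mode vector x into one score per class."""
    return [dot(w, x) + c for w, c in zip(weights, offsets)]


# Hypothetical training statistics for two optical modes and two classes.
means = [[2.0, 2.0], [7.0, 6.0]]    # class 0 (nuisance), class 1 (defect of interest)
pooled = [[1.0, 0.5], [0.5, 1.0]]
weights, offsets = fit_linear_discriminant(means, pooled, counts=[3, 3])

s_doi = scores([6.5, 6.0], weights, offsets)
s_nuis = scores([2.0, 2.5], weights, offsets)
print(s_doi, s_nuis)
```

In this toy case the first vector scores higher for class 1 and the second scores higher for class 0, so taking the class with the larger score reproduces the intended split.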
In some embodiments, the discriminant function is determined based on a training set of defects that includes defects from all of the distinct classes. A training set of defects is initially identified by scanning one or more die of the type of interest and then performing failure analysis to classify at least some of the identified defects (e.g., by performing scanning electron microscopy and/or using other appropriate failure-analysis techniques). For example, in equation 7, $\hat{W}$ and $\vec{c}$ may be determined based on a training set.
Based at least in part on the respective scores, the respective defects are divided (312) into distinct classes. The distinct classes may include (314) defects of interest that will impede functionality of the semiconductor die and nuisance defects that will not impede functionality of the semiconductor die. In some embodiments, the defects of interest may be divided into multiple classes (e.g., corresponding to respective types of defects).
In some embodiments, the respective scores are converted (316) to probabilities that the respective defects belong to particular classes of the distinct classes. The respective defects are classified (318) based on the probabilities. For example, the respective scores are converted to probabilities that the respective defects are defects of interest or nuisance defects, and the respective defects are classified based on the probabilities. To convert the scores obtained in equation 7 to probabilities, the softmax function may be applied to obtain:
$$P_i = \frac{e^{L_i}}{\sum_j e^{L_j}} \quad (13)$$
where i again indexes the classes, as does j. The summation in the denominator is thus over the plurality of classes (e.g., over class 0 and class 1), while the value in the numerator is for a specific class (e.g., class 0 or class 1).
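The score-to-probability conversion can be sketched with a standard softmax. The class labels and the `doi_threshold` cutoff below are hypothetical choices for illustration; the description above does not specify a particular decision rule.

```python
import math
from typing import List, Tuple


def softmax(scores: List[float]) -> List[float]:
    """Convert per-class scores into probabilities that sum to 1."""
    shift = max(scores)  # subtracting the max avoids overflow; the result is unchanged
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify(scores: List[float], doi_threshold: float = 0.5) -> Tuple[str, List[float]]:
    """Label a defect from its [class 0, class 1] scores.

    Class 0 is taken to be nuisance and class 1 a defect of interest.
    The threshold is an assumed tuning parameter.
    """
    probabilities = softmax(scores)
    label = "defect_of_interest" if probabilities[1] >= doi_threshold else "nuisance"
    return label, probabilities


label_a, probs_a = classify([14.0, 26.0])
label_b, probs_b = classify([3.3, -9.7])
print(label_a, probs_a)
print(label_b, probs_b)
```

Raising `doi_threshold` would trade missed defects of interest for fewer nuisance defects passed along for review, which is one reason a probability may be more convenient than a raw score.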
In some embodiments, images obtained from the distinct detector arrays are aligned (e.g., before the defects are identified in step 308 and before application of the discriminant function in step 310). The multi-dimensional data are obtained from the aligned images. Each dimension of the multi-dimensional data corresponds to a respective image of the aligned images and also to a respective optical mode. In some embodiments, the images are aligned based on simulation results for respective optical modes, with the simulation results being matched up against the inspection results to determine alignment. The simulation results may be obtained by simulating illumination of a die with the respective optical modes, using a file (e.g., a gds file) that specifies the layout of the die.
In some embodiments, to account for potential misalignment of images for respective optical modes, steps 310 and 312 are performed both for a pixel that corresponds to an identified defect and for adjacent pixels (e.g., for a 3×3 patch of pixels centered on the pixel that corresponds to the identified defect). For example, if any of the adjacent pixels are determined to be defects of interest, then the identified defect is classified as a defect of interest.
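The neighborhood check can be sketched as follows. The `is_doi` callable is a stand-in for steps 310 and 312 (scoring and class assignment) applied to one pixel's multi-mode vector; it and the other names are hypothetical. A `radius` of 1 gives the 3×3 patch mentioned above.

```python
from typing import Callable, List

Image = List[List[float]]  # one aligned image per optical mode


def classify_with_neighbors(
    images: List[Image],
    row: int,
    col: int,
    is_doi: Callable[[List[float]], bool],
    radius: int = 1,
) -> bool:
    """Return True if the flagged pixel or any neighbor is a defect of interest.

    The patch is clipped at the image borders.
    """
    rows, cols = len(images[0]), len(images[0][0])
    for r in range(max(0, row - radius), min(rows, row + radius + 1)):
        for c in range(max(0, col - radius), min(cols, col + radius + 1)):
            x = [image[r][c] for image in images]  # one entry per optical mode
            if is_doi(x):
                return True
    return False


# The defect signal appears at (1, 1) in mode A but, because of a
# one-pixel misalignment, at (1, 2) in mode B.
mode_a = [[0.0, 0.0, 0.0], [0.0, 4.0, 0.0], [0.0, 0.0, 0.0]]
mode_b = [[0.0, 0.0, 0.0], [0.0, 0.0, 4.0], [0.0, 0.0, 0.0]]


def toy_rule(x: List[float]) -> bool:
    """Toy stand-in classifier: fires on a strong mode-B signal."""
    return x[1] > 3.0


with_patch = classify_with_neighbors([mode_a, mode_b], 1, 1, toy_rule)
center_only = classify_with_neighbors([mode_a, mode_b], 1, 1, toy_rule, radius=0)
print(with_patch, center_only)  # True False
```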
In some embodiments, the plurality of optical modes is selected from a group of available optical modes that is larger (i.e., contains more optical modes) than the plurality of optical modes. The plurality of optical modes may be selected based on Fisher's score (or another score that indicates the efficacy of a set of optical modes in classifying defects). One or more die of the type of interest are scanned using all of the optical modes in the group of available optical modes. If the group of available optical modes has M optical modes, then Fisher's score for a subset of the group of optical modes is defined as:
The summations in equations 16, 17, and 19 are over all classes i (e.g., over classes 0 and 1). The summation in equation 18 is over defects in a particular class i. Fisher's score may be calculated for multiple subsets of the group, and the subset with the highest score is selected as the plurality of optical modes. For example, Fisher's score may be calculated for all subsets with two or more optical modes, for all subsets with exactly two optical modes, or for all subsets with a number of optical modes greater than or equal to two and less than or equal to a specified number.
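Subset selection by score can be sketched as below. Equations 14 through 19 are not reproduced in this text, so the sketch substitutes one commonly used multivariate form of Fisher's score, $\mathrm{tr}\{S_b (S_t + \gamma I)^{-1}\}$, where $S_b$ is the between-class scatter, $S_t$ is the total scatter, and $\gamma$ is a small regularizer. That form, the value of `gamma`, the function names, and the toy data are assumptions and may differ from the definition intended above.

```python
from itertools import combinations
from typing import List, Optional, Sequence, Tuple

Vector = List[float]
Matrix = List[List[float]]


def solve(a: Matrix, b: Vector) -> Vector:
    """Solve a @ y = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= factor * aug[col][k]
    y = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(aug[r][k] * y[k] for k in range(r + 1, n))
        y[r] = (aug[r][n] - tail) / aug[r][r]
    return y


def mean(samples: List[Vector]) -> Vector:
    return [sum(column) / len(samples) for column in zip(*samples)]


def add_outer(acc: Matrix, d: Vector, weight: float) -> None:
    """Accumulate weight * d d^T into acc in place."""
    for i, di in enumerate(d):
        for j, dj in enumerate(d):
            acc[i][j] += weight * di * dj


def fisher_score(classes: List[List[Vector]], modes: Sequence[int], gamma: float = 1e-6) -> float:
    """Assumed form: trace(S_b (S_t + gamma I)^-1), restricted to the chosen modes."""
    data = [[[x[m] for m in modes] for x in cls] for cls in classes]
    pooled = [x for cls in data for x in cls]
    dim = len(modes)
    mu = mean(pooled)
    s_b = [[0.0] * dim for _ in range(dim)]
    s_t = [[gamma if i == j else 0.0 for j in range(dim)] for i in range(dim)]
    for cls in data:
        add_outer(s_b, [a - b for a, b in zip(mean(cls), mu)], float(len(cls)))
    for x in pooled:
        add_outer(s_t, [a - b for a, b in zip(x, mu)], 1.0)
    # trace(S_b A^-1) = trace(A^-1 S_b) = sum over k of (A^-1 S_b[:, k])[k]
    return sum(solve(s_t, [row[k] for row in s_b])[k] for k in range(dim))


def best_mode_subset(
    classes: List[List[Vector]], n_modes: int, min_size: int = 2, max_size: Optional[int] = None
) -> Tuple[int, ...]:
    """Score every subset within the size bounds and return the highest-scoring one."""
    upper = n_modes if max_size is None else max_size
    candidates = [s for k in range(min_size, upper + 1) for s in combinations(range(n_modes), k)]
    return max(candidates, key=lambda s: fisher_score(classes, s))


# Toy data for M = 3 available modes; mode 1 carries little class information.
nuisance = [[1.0, 5.0, 1.0], [2.0, 3.0, 1.5], [1.5, 4.0, 0.5], [0.5, 4.5, 1.2]]
doi = [[6.0, 4.0, 5.0], [7.0, 5.0, 6.0], [6.5, 3.0, 5.5], [5.5, 4.2, 6.5]]
best = best_mode_subset([nuisance, doi], n_modes=3, min_size=2, max_size=2)
print(best)  # (0, 2)
```

The example restricts the search to subsets of exactly two modes. With this particular trace form, adding a mode cannot lower the score, so an unrestricted search would tend to favor the largest subsets; bounding the subset size, as the passage above allows, keeps the comparison meaningful.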
In some embodiments, a report is generated (320) specifying the classes for the respective defects and/or specifying the defects in one or more classes (e.g., in a particular class, such as the defects of interest). For example, the report may list all of the defects (e.g., with their coordinates) and specify the class of each defect. Alternatively, the report may list defects of a specified class or set of classes (e.g., with their coordinates) and omit the other defects. For example, the report may list the defects of interest (e.g., with their coordinates) and omit the nuisance defects. The report may be graphical; for example, the report may show a map of the die with indications of the locations of defects by class, or with indications of the locations of defects in one or more classes (e.g., in a particular class, such as the defects of interest). The report may be displayed and/or transmitted to a client device for display.
In some embodiments, the steps 310 and 312 are performed in real-time, as respective defects are identified, such that defects are classified in real-time. Defects determined to be nuisance defects are ignored and thus effectively are not identified: while technically they are identified in step 308, they are not reported to the user.
In some embodiments, a decision whether to scrap, rework, or continue to process a wafer is made based at least in part on defects of interest identified using the method 300.
The inspection tool 404 includes an illumination source 405 (e.g., light source 102,
The user interfaces 410 may include a display 411 and one or more input devices 412 (e.g., a keyboard, mouse, touch-sensitive surface of the display 411, etc.). The display 411 may display results of defect classification. For example, the display 411 may display the report of step 320 of the method 300 (
Memory 414 includes volatile and/or non-volatile memory. Memory 414 (e.g., the non-volatile memory within memory 414) includes a non-transitory computer-readable storage medium. Memory 414 optionally includes one or more storage devices remotely located from the processors 402 and/or a non-transitory computer-readable storage medium that is removably inserted into the server system 400. In some embodiments, memory 414 (e.g., the non-transitory computer-readable storage medium of memory 414) stores the following modules and data, or a subset or superset thereof: an operating system 416 that includes procedures for handling various basic system services and for performing hardware-dependent tasks, an inspection module 418 (e.g., for causing steps 302, 304, and/or 306 of the method 300,
The memory 414 (e.g., the non-transitory computer-readable storage medium of the memory 414) thus includes instructions for performing all or a portion of the method 300 (
The foregoing description, for purpose of explanation, has been described with reference to specific embodiments. However, the illustrative discussions above are not intended to be exhaustive or to limit the scope of the claims to the precise forms disclosed. Many modifications and variations are possible in view of the above teachings. The embodiments were chosen in order to best explain the principles underlying the claims and their practical applications, to thereby enable others skilled in the art to best use the embodiments with various modifications as are suited to the particular uses contemplated.
This application claims priority to US Provisional Patent Application Nos. 62/701,007, filed Jul. 20, 2018, titled “Multimode Approach for Defect and Nuisance Filtering,” and 62/767,916, filed Nov. 15, 2018, titled “Multimode Defect Classification in Semiconductor Inspection,” which are hereby incorporated by reference in their entirety for all purposes.
Number | Date | Country | |
---|---|---|---|
20200025689 A1 | Jan 2020 | US |
Number | Date | Country | |
---|---|---|---|
62767916 | Nov 2018 | US | |
62701007 | Jul 2018 | US |