The described embodiments relate to metrology systems and methods, and more particularly to methods and systems for improved measurement of semiconductor structures.
Semiconductor devices such as logic and memory devices are typically fabricated by a sequence of processing steps applied to a specimen. The various features and multiple structural levels of the semiconductor devices are formed by these processing steps. For example, lithography among others is one semiconductor fabrication process that involves generating a pattern on a semiconductor wafer. Additional examples of semiconductor fabrication processes include, but are not limited to, chemical-mechanical polishing, etch, deposition, and ion implantation. Multiple semiconductor devices may be fabricated on a single semiconductor wafer and then separated into individual semiconductor devices.
Metrology processes are used at various steps during a semiconductor manufacturing process to detect defects on wafers to promote higher yield. Optical and X-ray based metrology techniques offer the potential for high throughput without the risk of sample destruction. A number of metrology based techniques including scatterometry, reflectometry, and ellipsometry implementations and associated analysis algorithms are commonly used to characterize critical dimensions, film thicknesses, composition, overlay and other parameters of nanoscale structures.
Many metrology techniques are indirect methods of measuring physical properties of a specimen under measurement. In most cases, the raw measurement signals cannot be used to directly determine the physical properties of the specimen. Instead, a measurement model is employed to estimate the values of one or more parameters of interest based on the raw measurement signals. For example, ellipsometry is an indirect method of measuring physical properties of the specimen under measurement. In general, a physics-based measurement model or a machine learning based measurement model is required to determine the physical properties of the specimen based on the raw measurement signals (e.g., αmeas and βmeas).
In some examples, a physics-based measurement model is created that attempts to predict the raw measurement signals (e.g., αmeas and βmeas) based on assumed values of one or more model parameters. As illustrated in equations (1) and (2), the measurement model includes parameters associated with the metrology tool itself, e.g., machine parameters (Pmachine), and parameters associated with the specimen under measurement. When solving for parameters of interest, some specimen parameters are treated as fixed valued (Pspec-fixed) and other specimen parameters of interest are floated (Pspec-float), i.e., resolved based on the raw measurement signals.
αmodel=f(Pmachine,Pspec-fixed,Pspec-float) (1)
βmodel=g(Pmachine,Pspec-fixed,Pspec-float) (2)
Machine parameters are parameters used to characterize the metrology tool (e.g., ellipsometer 101). Exemplary machine parameters include angle of incidence (AOI), analyzer angle (A0), polarizer angle (P0), illumination wavelength, numerical aperture (NA), compensator or waveplate (if present), etc. Specimen parameters are parameters used to characterize the specimen (e.g., material and geometric parameters characterizing the structure(s) under measurement). For a thin film specimen, exemplary specimen parameters include refractive index, dielectric function tensor, nominal layer thickness of all layers, layer sequence, etc. For a CD specimen, exemplary specimen parameters include geometric parameter values associated with different layers, refractive indices associated with different layers, etc. For measurement purposes, the machine parameters and many of the specimen parameters are treated as known, fixed valued parameters. However, the values of one or more of the specimen parameters are treated as unknown, floating parameters of interest.
In some examples, the values of the floating parameters of interest are resolved by an iterative process (e.g., regression) that produces the best fit between theoretical predictions and experimental data. The values of the unknown, floating parameters of interest are varied and the model output values (e.g., αmodel and βmodel) are calculated and compared to the raw measurement data in an iterative manner until a set of specimen parameter values is determined that results in a sufficiently close match between the model output values and the experimentally measured values (e.g., αmeas and βmeas). In some other examples, the floating parameters are resolved by a search through a library of pre-computed solutions to find the closest match.
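By way of non-limiting illustration, the two approaches described above may be sketched as follows. The forward model `toy_model`, the constant `K`, the single floating parameter (a thickness), and the choice of a Gauss-Newton update with a finite-difference Jacobian are all assumptions made for this example only; they do not represent the measurement model of any particular metrology tool.

```python
import math

K = 0.01  # hypothetical phase constant per nanometer, treated as a fixed parameter


def toy_model(thickness):
    """Hypothetical forward model: floating thickness -> (alpha_model, beta_model)."""
    phase = K * thickness
    return (math.cos(phase), 0.5 * math.sin(phase))


def residuals(thickness, measured):
    """Difference between modelled and measured signals."""
    return [m - s for m, s in zip(toy_model(thickness), measured)]


def fit_thickness(measured, start, tol=1e-9, max_iter=50, h=1e-6):
    """Iterative regression (Gauss-Newton) on one floating parameter."""
    t = start
    for _ in range(max_iter):
        r = residuals(t, measured)
        r_h = residuals(t + h, measured)
        jac = [(b - a) / h for a, b in zip(r, r_h)]  # finite-difference Jacobian
        step = -sum(j * e for j, e in zip(jac, r)) / sum(j * j for j in jac)
        t += step
        if abs(step) < tol:
            break
    return t


def library_lookup(measured, grid):
    """Alternative: return the pre-computed grid value whose signals match best."""
    return min(grid, key=lambda t: sum(e * e for e in residuals(t, measured)))


measured = toy_model(80.0)  # synthetic "measurement" with a known answer
fitted = fit_thickness(measured, start=50.0)
nearest = library_lookup(toy_model(80.2), [float(t) for t in range(0, 151)])
```

In this sketch the regression recovers the thickness used to synthesize the signals, while the library search returns the nearest grid point, so its resolution is limited by the grid spacing.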
In some other examples, a trained machine learning based measurement model is employed to directly estimate values of parameters of interest based on raw measurement data. In these examples, a machine learning based measurement model takes raw measurement signals as model input and generates values of the parameters of interest as model output.
Both physics based measurement models and machine learning based measurement models must be trained to generate useful estimates of parameters of interest for a particular measurement application. Generally, model training is based on raw measurement signals collected from specimens having known values of the parameters of interest (i.e., Design of Experiments (DOE) data).
A machine learning based measurement model is parameterized by a number of weight parameters. Traditionally, the machine learning based measurement model is trained by a regression process (e.g., ordinary least squares regression). The values of the weight parameters are iteratively adjusted to minimize the differences between the known, reference values of the parameters of interest and values of the parameters of interest estimated by the machine learning based measurement model based on the measured raw measurement signals.
As described hereinbefore, a physics based measurement model is parameterized by a number of machine parameters and specimen parameters. Traditionally, a physics based measurement model is also trained by a regression process (e.g., ordinary least squares regression). One or more of the machine parameters and specimen parameters are iteratively adjusted to minimize the differences between the raw measurement data and the modelled measurement data. For each iteration, the values of the specimen parameters of interest are maintained at the known DOE values.
Traditionally, the training of both machine learning based measurement models and physics based measurement models (a.k.a., measurement recipe generation) is achieved by minimizing total output error, typically expressed as a least squares minimization. Total output error is an expression of total measurement uncertainty, i.e., an aggregation of all of the errors arising from the measurement, including precision errors, tool-to-tool matching errors, parameter tracking errors, within wafer variations, etc. Unfortunately, model training based on total measurement uncertainty without control over the components of the total measurement uncertainty leads to suboptimal measurement performance. In many examples, large modeling errors arise, particularly when training is performed based on simulated data, due to discrepancies between simulated and real data.
Furthermore, domain knowledge acquired from experience, measurement data, and physics is not directly expressed in the objective function driving the optimization of the measurement model. As a result, domain knowledge is not fully exploited in the measurement recipe development process. Again, this leads to suboptimal measurement performance.
Future metrology applications present challenges for metrology due to increasingly small resolution requirements, multi-parameter correlation, increasingly complex geometric structures, and increasing use of opaque materials. Thus, methods and systems for improved measurement recipe generation are desired.
Methods and systems for training and implementing metrology recipes based on specific domain knowledge associated with measurement data are presented herein. Domain knowledge includes performance metrics employed to quantitatively characterize the measurement performance of a metrology system in a particular measurement application. Domain knowledge is employed to regularize the optimization process employed during measurement model training, model-based regression, or both.
By way of non-limiting example, probability distributions associated with measurement precision, tool to tool matching, tracking, within wafer variations, etc., are employed to physically regularize the optimization process. In this manner, these important metrics are controlled during measurement model training, model-based regression, or both. The resulting trained measurement models, model-based measurements, or both, provide significant improvement in measurement performance and reliability.
In one aspect, a measurement model is trained based on a physically regularized optimization function. The training is based on measurement data associated with multiple instances of one or more Design of Experiments (DOE) metrology targets disposed on one or more wafers, reference values of parameters of interest associated with the DOE metrology targets, actual measurement data collected from multiple instances of one or more regularization structures disposed on one or more wafers, and measurement performance metrics associated with the actual measurement data.
Furthermore, the one or more measurement performance metrics are employed to regularize the optimization driving the measurement model training process. For example, statistical information characterizing actual measurement data collected from regularization structures, e.g., the known distributions associated with important measurement performance metrics such as measurement precision, wafer mean, etc., is specifically employed to regularize the optimization that drives measurement model training.
In a further aspect, the trained measurement model is employed to estimate values of parameters of interest based on measurements of structures having unknown values of one or more parameters of interest. In some embodiments, the measurement system employed to measure the unknown structures is the same measurement system employed to collect the DOE measurement data. In general, the trained measurement model may be employed to estimate values of parameters of interest based on a single measured spectrum or estimate values of parameters of interest simultaneously based on multiple spectra.
In some embodiments, the regularization structures are the same structures as the DOE metrology targets. However, in general, regularization structures may be different from the DOE metrology targets.
In some embodiments, the actual regularization measurement data is collected by a particular metrology system. In these embodiments, the measurement model is trained for a measurement application involving measurements performed by the same metrology system.
In some other embodiments, the actual regularization measurement data is collected by multiple instances of a metrology system, i.e., multiple metrology systems that are substantially identical. In these embodiments, the measurement model is trained for a measurement application involving measurements performed by any of the multiple instances of the metrology system.
In some examples, the measurement data associated with the measurement of each of the multiple instances of one or more Design of Experiments (DOE) metrology targets by a metrology system is simulated. The simulated data is generated from a parameterized model of the measurement of each of the one or more DOE metrology structures by the metrology system.
In some other examples, the measurement data associated with the multiple instances of one or more Design of Experiments (DOE) metrology targets is actual measurement data collected by a metrology system or multiple instances of a metrology system. In some of these embodiments, the same metrology system or multiple instances of the metrology system is employed to collect the actual regularization measurement data from the regularization structures.
In some embodiments, the physical measurement performance metrics characterize the actual measurement data collected from each of the multiple instances of the one or more regularization structures. In some embodiments the performance metrics are based on historical data, domain knowledge about the processes involved in producing the structure, physics, or a best guess by a user. In some examples, a measurement performance metric is a single point estimate. In other examples, the measurement performance metric is a distribution of estimated values.
In general, the measurement performance metric associated with the measurement data collected from the regularization structures provides information about the values of the physical attributes of the regularization structures. By way of non-limiting example, the physical attributes of the regularization structures include any of measurement precision, tool to tool matching, wafer mean, within wafer range, tracking to reference, wafer to wafer matching, tracking to wafer split, etc.
In a further aspect, trained measurement model performance is validated with test data using error budget analysis. Real measurement data, simulated measurement data, or both, may be employed as test data for validation purposes. Error budget analysis over real data allows the estimation of the individual contribution of accuracy, tracking, precision, tool matching errors, wafer to wafer consistency, wafer signature consistency, etc. to total error. In some embodiments, test data is designed such that total model error is split into each contributing component.
In another further aspect, the training of a measurement model includes optimization of model hyper-parameters. For example, hyper-parameters for neural network based models include the number and types of neural-network layers, the number of neurons in each layer, optimizer settings, etc. During hyper-parameter optimization multiple models are created and the model with the minimum cost is chosen as the best model.
In another aspect, a model-based regression on a measurement model is physically regularized by one or more measurement performance metrics. Estimates of one or more parameters of interest are determined based on actual measurement data collected from multiple instances of one or more structures of interest disposed on one or more wafers, statistical information associated with the measurement, and prior estimated values of the parameters of interest.
The foregoing is a summary and thus contains, by necessity, simplifications, generalizations and omissions of detail; consequently, those skilled in the art will appreciate that the summary is illustrative only and is not limiting in any way. Other aspects, inventive features, and advantages of the devices and/or processes described herein will become apparent in the non-limiting detailed description set forth herein.
Reference will now be made in detail to background examples and some embodiments of the invention, examples of which are illustrated in the accompanying drawings.
Methods and systems for training and implementing metrology recipes based on specific domain knowledge associated with measurement data are presented herein. Domain knowledge includes performance metrics employed to quantitatively characterize the measurement performance of a metrology system in a particular measurement application. Domain knowledge is employed to regularize the optimization process employed during measurement model training, model-based regression, or both. In this manner, the optimization process is physically regularized by one or more expressions of the physically based measurement performance metrics. By way of non-limiting example, probability distributions associated with measurement precision, tool to tool matching, tracking, within wafer variations, etc., are employed to physically regularize the optimization process. In this manner, these important metrics are controlled during measurement model training, model-based regression, or both. The resulting trained measurement models, model-based measurements, or both, provide significant improvement in measurement performance and reliability.
Physically regularizing the optimization process employed to train a measurement model using domain knowledge characterizing the training data improves model consistency and reduces the computational effort associated with model training. The optimization process is less sensitive to overfitting. Measurement performance specifications, such as precision, tool to tool matching, and parameter tracking, are more reliably met across different measurement model architectures and measurement applications when physical regularization is employed. In some embodiments, simulated training data is employed. In these embodiments, physical regularization significantly reduces errors due to discrepancies between simulated and real measurement data.
In one aspect, a measurement model is trained based on a physically regularized optimization function. The training is based on measurement data associated with multiple instances of one or more Design of Experiments (DOE) metrology targets disposed on one or more wafers, reference values of parameters of interest associated with the DOE metrology targets, actual measurement data collected from multiple instances of one or more regularization structures disposed on one or more wafers, and measurement performance metrics associated with the actual measurement data. Furthermore, the one or more measurement performance metrics are employed to regularize the optimization driving the measurement model training process.
As depicted in
In a further embodiment, system 100 may include one or more computing systems 130 employed to perform measurements of structures based on measurement models developed in accordance with the methods described herein. The one or more computing systems 130 may be communicatively coupled to the spectrometer 104. In one aspect, the one or more computing systems 130 are configured to receive measurement data 111 associated with measurements of a structure under measurement (e.g., structure 101).
In one aspect, computing system 130 is configured as a measurement model training engine 150 to train a measurement model based on measurements of regularization structures as described herein.
Measurement model training module 154 trains a measurement model based on an optimization function regularized by the one or more measurement performance metrics. In some examples, the measurement model is a neural network model. In some examples, each measurement performance metric is represented as a separate distribution. In one example, the distribution of measurement precision associated with the regularization structures is an inverse gamma distribution. Equation (1) illustrates a probability density function, p, for measurement precision dataset, x, where, Γ(·), denotes the gamma function, the constant, a, denotes a shape parameter, and the constant, b, denotes a scale parameter.
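For reference, the standard inverse gamma density, written with the symbols defined above (shape parameter a, scale parameter b), has the form below. It is reproduced here from the textbook definition of the distribution, on the assumption that this is the form intended.

```latex
p(x; a, b) = \frac{b^{a}}{\Gamma(a)}\, x^{-a-1} \exp\!\left(-\frac{b}{x}\right), \qquad x > 0
```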
In another example, the distribution of mean values of instances of a measured regularization structure over a wafer is described by a normal distribution. Equation (2) illustrates a probability density function, m, for measurement wafer mean dataset, x, where μ denotes a specific mean and σ² denotes a specific variance associated with the distribution.
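For reference, the standard normal density with mean μ is shown below, with the variance written as σ². It is reproduced from the textbook definition of the distribution, on the assumption that this is the form intended.

```latex
m(x; \mu, \sigma^{2}) = \frac{1}{\sqrt{2\pi\sigma^{2}}} \exp\!\left(-\frac{(x - \mu)^{2}}{2\sigma^{2}}\right)
```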
In a further aspect, the statistical information characterizing actual measurement data collected from regularization structures, e.g., the known distributions associated with important measurement performance metrics such as measurement precision, wafer mean, etc., is specifically employed to regularize the optimization that drives measurement model training. Equation (3) illustrates the joint likelihood of DOE parameters of interest, yDOE, along with the measurement performance metrics, criteriareg, associated with measurements of regularization structures. By maximizing the joint likelihood, the measurement model, h(·), evolves during training to maintain fidelity on DOE measurement data, xDOE, while adapting to satisfy the measurement performance on measurement data associated with the regularization structures, xreg.
P(yDOE,criteriareg|h(·),xDOE,xreg) (3)
To maximize the joint likelihood, the DOE measurement data contributes to mean squared errors and measurement data associated with the regularization structures contribute as regularization terms in the loss function. In summary, maximizing the joint likelihood is equivalent to the minimization of the loss function illustrated in Equation (4) assuming independence between DOE measurement data and regularization data as well as among different regularization datasets, where Reg(h(·)) is the generic regularization for model parameters weighted by constant parameter, α, Regk (xreg,k,h(·),θreg,k) is the kth regularization term weighted by constant parameter, γk, where xreg,k, is the kth regularization data set and θreg,k is the vector of parameters describing the statistical information associated with the actual measurement data collected from the regularization structures.
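One form of the loss function consistent with the description above is sketched below. The normalization of the mean squared error term by the number of DOE samples, N, and the sample index, i, are assumptions; the remaining symbols are those named in the text.

```latex
J\bigl(h(\cdot)\bigr) = \frac{1}{N}\sum_{i=1}^{N}\bigl(y_{DOE,i} - h(x_{DOE,i})\bigr)^{2}
  + \alpha\,\mathrm{Reg}\bigl(h(\cdot)\bigr)
  + \sum_{k} \gamma_{k}\,\mathrm{Reg}_{k}\bigl(x_{reg,k},\, h(\cdot),\, \theta_{reg,k}\bigr)
```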
In one example, a measurement model optimization employs two different regularization terms, Reg1 and Reg2. Reg1 represents regularization of measurement precision on measurement precision datasets, xreg-prec, and Reg2 represents regularization of wafer mean on within wafer datasets, xWIW. By way of non-limiting example, we assume that measurement precision is described by an inverse Gamma distribution with shape parameter aσ
Similarly, regularization term, Reg2, can be written as illustrated in equation (6), where θreg-WiW={μWiW,σWiW2} and
In this example, the measurement model optimization function can be written as illustrated in equation (7), where hW,b(·) is a neural network model having weighting values, W, and bias values, b, model error variance, σD², and weight variance, σW².
The DOE datasets, measurement precision datasets, and wafer mean datasets employed for model training using the measurement model optimization function are illustrated in equation (8).
The known parameters of statistical models describing model error, neural network weight values, measurement precision, and wafer mean within wafer are illustrated in equation (9).
θ={σD2,σW2,aσ
During model training, the optimization function illustrated by equation (7) balances between the DOE data estimation errors and all other criteria. The first term expresses the DOE data estimation error as a mean squared error penalized by the model error variance, σD². The second term is a generic regularizer for the model weights, W. The model weights, W, are penalized by the weight variance, σW². The last two terms regularize the optimization for measurement precision and wafer mean as described hereinbefore.
At each iteration, the optimization function drives changes to the weighting values, W, and bias values, b, of the neural network model, hW,b(·) that minimize the optimization function. When the optimization function reaches a sufficiently low value, the measurement model is considered trained, and the trained measurement model 157 is stored in memory (e.g., memory 132).
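By way of non-limiting illustration, a four-term optimization of the kind described above may be sketched as follows. This is a toy under stated assumptions, not the optimization function of equation (7): the model is a one-weight linear function rather than a neural network, the gradient is taken by finite differences, the precision term is an inverse gamma negative log density (constants dropped) evaluated on the variance of repeated-measurement estimates, the wafer mean term is a normal negative log density (constants dropped), and all datasets and parameter values are invented.

```python
import math
import statistics

# Hypothetical toy datasets (all values invented for illustration).
X_DOE = [0.0, 0.5, 1.0, 1.5, 2.0]
Y_DOE = [2.0 * x + 1.0 for x in X_DOE]   # known reference values
X_PREC = [1.00, 1.02, 0.98, 1.01, 0.99]  # repeated measurements of one site
X_WIW = [0.8, 1.0, 1.2, 1.1, 0.9]        # sites across one wafer

# Assumed known statistical parameters.
SIGMA_D2, SIGMA_W2 = 1.0, 1.0e4          # model error and weight variances
A_SHAPE, B_SCALE = 3.0, 3.2e-3           # inverse gamma parameters (precision)
MU_WIW, SIGMA_WIW2 = 3.0, 0.01           # normal parameters (wafer mean)
GAMMA_1, GAMMA_2 = 0.1, 0.01             # regularization weights


def h(x, w, bias):
    """Stand-in measurement model: signal -> estimated parameter of interest."""
    return w * x + bias


def loss(w, bias):
    doe_err = [(h(x, w, bias) - y) ** 2 for x, y in zip(X_DOE, Y_DOE)]
    data_term = statistics.fmean(doe_err) / SIGMA_D2
    weight_term = w * w / SIGMA_W2
    # Precision regularizer on the variance of repeated-measurement estimates.
    var_prec = statistics.pvariance([h(x, w, bias) for x in X_PREC])
    reg_prec = (A_SHAPE + 1.0) * math.log(var_prec) + B_SCALE / var_prec
    # Wafer mean regularizer.
    wafer_mean = statistics.fmean([h(x, w, bias) for x in X_WIW])
    reg_wiw = (wafer_mean - MU_WIW) ** 2 / (2.0 * SIGMA_WIW2)
    return data_term + weight_term + GAMMA_1 * reg_prec + GAMMA_2 * reg_wiw


def train(w=1.0, bias=0.0, lr=0.05, steps=2000, eps=1e-6):
    """Plain gradient descent using central finite-difference gradients."""
    for _ in range(steps):
        grad_w = (loss(w + eps, bias) - loss(w - eps, bias)) / (2 * eps)
        grad_b = (loss(w, bias + eps) - loss(w, bias - eps)) / (2 * eps)
        w, bias = w - lr * grad_w, bias - lr * grad_b
    return w, bias


w_fit, bias_fit = train()
```

In this toy the regularization data were chosen to agree with the DOE relationship, so the trained values land close to the generating weight and bias; with conflicting data, the weights γ1 and γ2 would set the trade-off between DOE fidelity and the performance metrics.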
In another further aspect, the trained measurement model is employed to estimate values of parameters of interest based on measurements of structures having unknown values of one or more parameters of interest. In some examples, the trained model provides both an estimate of the value of parameter of interest and the uncertainty of the measured value. The trained measurement model is employed to estimate values of one or more parameters of interest from actual measurement data (e.g., measured spectra) collected by the measurement system (e.g., metrology system 100). In some embodiments, the measurement system is the same measurement system employed to collect the DOE measurement data. In other embodiments, the measurement system is the system simulated to generate the DOE measurement data synthetically. In one example, the actual measurement data includes measured spectra 111 collected by metrology system 100 from one or more metrology targets having unknown values of the one or more parameters of interest.
In general, the trained measurement model may be employed to estimate values of parameters of interest based on a single measured spectrum or estimate values of parameters of interest simultaneously based on multiple spectra.
In some embodiments, the regularization structures are the same structures as the DOE metrology targets. However, in general, regularization structures may be different from the DOE metrology targets.
In some embodiments, the actual regularization measurement data collected from the multiple instances of the one or more regularization structures is collected by a particular metrology system. In these embodiments, the measurement model is trained for a measurement application involving measurements performed by the same metrology system.
In some other embodiments, the actual regularization measurement data collected from the multiple instances of the one or more regularization structures is collected by multiple instances of a metrology system, i.e., multiple metrology systems that are substantially identical. In these embodiments, the measurement model is trained for a measurement application involving measurements performed by any of the multiple instances of the metrology system.
In some examples, the measurement data associated with the measurement of each of the multiple instances of one or more Design of Experiments (DOE) metrology targets by a metrology system is simulated. The simulated data is generated from a parameterized model of the measurement of each of the one or more DOE metrology structures by the metrology system.
In some other examples, the measurement data associated with the multiple instances of one or more Design of Experiments (DOE) metrology targets is actual measurement data collected by a metrology system or multiple instances of a metrology system. In some of these embodiments, the same metrology system or multiple instances of the metrology system is employed to collect the actual regularization measurement data from the regularization structures.
In some embodiments, the physical measurement performance metrics characterize the actual measurement data collected from each of the multiple instances of the one or more regularization structures. In some embodiments the performance metrics are based on historical data, domain knowledge about the processes involved in producing the structure, physics, or a best guess by a user. In some examples, a measurement performance metric is a single point estimate. In other examples, the measurement performance metric is a distribution of estimated values.
In general, the measurement performance metric associated with the measurement data collected from the regularization structures provides information about the values of the physical attributes of the regularization structures. By way of non-limiting example, the physical attributes of the regularization structures include any of measurement precision, tool to tool matching, wafer mean, within wafer range, tracking to reference, wafer to wafer matching, tracking to wafer split, etc.
In some examples, a measurement performance metric includes specific values of a parameter of a regularization structure and corresponding uncertainties at specific locations on the wafer. In one example, the measurement performance metric is a critical dimension (CD) at a particular location on a wafer and its uncertainty, e.g., the CD is 35 nanometers +/−0.5 nanometers.
In some examples, a measurement performance metric includes a probability distribution of values of a parameter of a structure within a wafer, within a lot of wafers, or across multiple wafer lots. In one example, the CD has a normal distribution with a mean value and a standard deviation, e.g., mean value of CD is 55 nanometers and the standard deviation is 2 nanometers.
In some examples, a measurement performance metric includes a spatial distribution of values of a parameter of interest across a wafer, e.g., a wafer map, and the corresponding uncertainties at each location.
In some examples, a measurement performance metric includes distributions of measured values of parameters of interest across multiple tools to characterize tool to tool matching. The distributions may represent mean values across each wafer, values at each site, or both.
In some examples, a measurement performance metric includes a distribution of measurement precision errors.
In some examples, a measurement performance metric includes a wafer map matching estimates across wafer lots.
In some examples, a measurement performance metric includes one or more metrics characterizing the tracking of estimated values of a parameter of interest with reference values of the parameter of interest. In some examples, the metrics characterizing tracking performance include any of an R² value, a slope value, and an offset value.
In some examples, a measurement performance metric includes one or more metrics characterizing the tracking of estimated values of a parameter of interest to wafer mean for a DOE split experiment. In some examples, the metrics characterizing tracking performance include any of an R² value, a slope value, and an offset value.
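By way of non-limiting illustration, the slope, offset, and R² tracking metrics may be computed by an ordinary least squares fit of estimated values against reference values, as sketched below. The function name `tracking_metrics` and the sample values are hypothetical.

```python
def tracking_metrics(reference, estimated):
    """Least squares line of estimated vs. reference: (slope, offset, R^2)."""
    n = len(reference)
    mean_r = sum(reference) / n
    mean_e = sum(estimated) / n
    s_rr = sum((r - mean_r) ** 2 for r in reference)
    s_ee = sum((e - mean_e) ** 2 for e in estimated)
    s_re = sum((r - mean_r) * (e - mean_e) for r, e in zip(reference, estimated))
    slope = s_re / s_rr
    offset = mean_e - slope * mean_r
    r_squared = s_re ** 2 / (s_rr * s_ee)
    return slope, offset, r_squared


# Hypothetical reference CD values (nm) and estimates that under-track them.
reference = [30.0, 32.0, 34.0, 36.0]
estimated = [0.9 * r + 3.5 for r in reference]
slope, offset, r_squared = tracking_metrics(reference, estimated)
```

Ideal tracking corresponds to a slope of one, an offset of zero, and an R² of one; in this example the estimates are perfectly correlated with the reference but compress its range.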
In a further aspect, trained measurement model performance is validated with test data using error budget analysis. Real measurement data, simulated measurement data, or both, may be employed as test data for validation purposes.
Error budget analysis over real data allows the estimation of the individual contribution of accuracy, tracking, precision, tool matching errors, wafer to wafer consistency, wafer signature consistency, etc. to total error. In some embodiments, test data is designed such that total model error is split into each contributing component.
By way of non-limiting example, real data includes any of the following subsets: real data with reference values for accuracy and tracking calculations, where the calculated metrics include slope, offset, R², 3STEYX, mean squared error, 3 sigma error, etc.; real data from measurements of the same site measured multiple times to estimate measurement precision; real data from measurements of the same site measured by different tools to estimate tool-to-tool matching; real data from measurement of sites on multiple wafers to estimate wafer to wafer changes of wafer mean and wafer variance; and real data from measurements of multiple wafers to identify wafer signatures, e.g., typical wafer patterns like a bullseye pattern that is expected to be present for given wafers.
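By way of non-limiting illustration, two of the error budget components listed above may be estimated from such subsets as sketched below. The specific definitions used here (precision as three times the sample standard deviation of repeated measurements, and tool matching as the largest deviation of any tool mean from the fleet mean) are assumptions chosen for the example, as are the function names and data values.

```python
import statistics


def precision_3sigma(repeats):
    """3-sigma repeatability of one site measured several times."""
    return 3.0 * statistics.stdev(repeats)


def tool_matching(per_tool_means):
    """Largest deviation of any tool's mean value from the fleet mean."""
    means = list(per_tool_means.values())
    fleet_mean = statistics.fmean(means)
    return max(abs(m - fleet_mean) for m in means)


# Hypothetical CD values in nanometers.
repeats = [35.0, 35.2, 34.8, 35.1, 34.9]
per_tool = {"tool_A": 35.0, "tool_B": 35.3, "tool_C": 34.9}

precision = precision_3sigma(repeats)
matching = tool_matching(per_tool)
```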
In some other examples, a parametrized model of the structure is employed to generate simulated data for error budget analysis. Simulated data is generated such that each parameter of the structure is sampled within its DOE while other parameters are fixed at nominal values. In some examples, other parameters of the simulation, e.g., system model parameters, are included in an error budget analysis. The true reference values of a parameter are known with simulated data, so errors due to changes of each parameter of the structure can be separated.
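By way of non-limiting illustration, the one-parameter-at-a-time sampling described above may be sketched as follows. The parameter names, nominal values, DOE ranges, and the function `one_at_a_time_doe` are hypothetical; a real error budget analysis would pass each sample to a measurement simulator, which is omitted here.

```python
def one_at_a_time_doe(nominal, ranges, points=5):
    """Sweep each parameter over its DOE range while others stay nominal.

    Returns a list of (swept_parameter_name, parameter_set) pairs.
    """
    samples = []
    for name, (low, high) in ranges.items():
        for i in range(points):
            sample = dict(nominal)  # copy so other parameters stay nominal
            sample[name] = low + (high - low) * i / (points - 1)
            samples.append((name, sample))
    return samples


# Hypothetical structure parameters in nanometers.
nominal = {"cd": 35.0, "height": 60.0}
ranges = {"cd": (33.0, 37.0), "height": (55.0, 65.0)}
samples = one_at_a_time_doe(nominal, ranges)
```

Because only one parameter varies within each group of samples, an error observed on that group can be attributed to that parameter.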
In some examples, additional simulated data is generated with different noise sampling to calculate precision error.
In some examples, additional simulated data is generated outside of the DOE of the parametrized structure to estimate extrapolation errors.
In another further aspect, the training of a measurement model includes optimization of model hyper-parameters. For example, hyper-parameters for neural network based models include the number and types of neural network layers, the number of neurons in each layer, optimizer settings, etc. During hyper-parameter optimization, multiple models are created and the model with the minimum cost is chosen as the best model.
In general, multiple models created during hyper-parameter optimization may have similar total cost, but the costs associated with each performance metric and its associated regularization term may be very different. Error budget analysis is applied to separate the errors, and the optimization described herein provides the flexibility to weight the contribution of each performance metric differently, allowing a user to choose the model that best suits user criteria.
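The selection among candidate models with similar total cost can be illustrated with a short Python sketch. The candidate records, metric names, and weights below are hypothetical; the sketch only shows how user-supplied weights on the per-metric costs change which model has the minimum weighted cost.

```python
def weighted_cost(costs, weights):
    """Weighted sum of per-metric costs; metrics without a weight count as zero."""
    return sum(weights.get(name, 0.0) * value for name, value in costs.items())


def select_model(candidates, weights):
    """Return the candidate whose weighted cost is smallest."""
    return min(candidates, key=lambda c: weighted_cost(c["costs"], weights))


# Two hypothetical candidates with the same unweighted total cost.
candidates = [
    {"name": "A", "costs": {"accuracy": 0.2, "precision": 0.8}},
    {"name": "B", "costs": {"accuracy": 0.8, "precision": 0.2}},
]
prefers_precision = select_model(candidates, {"accuracy": 1.0, "precision": 3.0})
prefers_accuracy = select_model(candidates, {"accuracy": 3.0, "precision": 1.0})
```

Here a user who weights precision more heavily obtains candidate B, while a user who weights accuracy more heavily obtains candidate A, even though both candidates have the same unweighted total.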
For example,
In another aspect, a model-based regression on a measurement model is physically regularized by one or more measurement performance metrics. Estimates of one or more parameters of interest are determined based on actual measurement data collected from multiple instances of one or more structures of interest disposed on one or more wafers, statistical information associated with the measurement, and prior estimated values of the parameters of interest.
In one aspect, computing system 130 is configured as a measurement model regression engine to perform measurements of structures as described herein.
Measurement model regression module 191 estimates values of one or more parameters of interest 195 associated with the measured metrology targets based on an optimization function regularized by the one or more measurement performance metrics. The estimated values of the parameters of interest 195 are stored in a memory (e.g., memory 132).
The loss function of regression includes a data reconstruction error term and one or more regularization terms. Equation (10) illustrates an exemplary loss function of a model based regression to estimate values of one or more parameters of interest from actual measurements.
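$$J(Y; X, \theta) = \lVert g(Y) - X \rVert_{\Sigma}^{2} + \gamma_1 \cdot \mathrm{Reg}_1(Y_1, \theta_1) + \cdots + \gamma_k \cdot \mathrm{Reg}_k(Y_k, \theta_k) \qquad (10)$$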
The first term of the loss function is a reconstruction error that measures the difference between the real measurement data, X, and the simulated measurement data, g(Y), where g(·) is the known measurement simulation model that estimates measured spectra from a current estimated value of one or more parameters of interest, Y. In the example illustrated in equation (10), the reconstruction error term is weighted by the inverse of the noise covariance matrix, Σ. The regularization terms evaluate how well the measurement performance metrics are met based on known parameters of models describing each measurement performance metric and prior estimated values of the one or more parameters of interest. Each dataset, Xk, is a subset of data X with corresponding measurement information, θk, and estimated parameters, Yk. The goal of the regression is to find the values of the one or more parameters of interest that minimize the loss function. During regression, the parameter, Y, is adjusted to reduce the mismatch between simulated data and real data while satisfying the measurement performance metrics given the prior information.
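A loss of the general form of equation (10) can be evaluated as in the Python sketch below. Two simplifications are assumed: the noise covariance is taken to be diagonal, so the inverse-covariance weighting reduces to dividing each squared residual by its variance, and each regularization term is supplied as a callable of the current estimate. The function name and argument layout are hypothetical.

```python
def regression_loss(y, x_meas, simulate, noise_var, reg_terms):
    """Reconstruction error plus weighted regularization terms.

    y         -- current estimate of the parameters of interest (list)
    x_meas    -- real measurement data X (list of signal values)
    simulate  -- stand-in for g(.), mapping y to simulated measurement data
    noise_var -- per-channel noise variances (diagonal covariance assumed)
    reg_terms -- list of (gamma, reg) pairs, where reg(y) returns a penalty
    """
    residual = [s - m for s, m in zip(simulate(y), x_meas)]
    reconstruction = sum(r * r / v for r, v in zip(residual, noise_var))
    penalty = sum(gamma * reg(y) for gamma, reg in reg_terms)
    return reconstruction + penalty
```

As a usage example, a penalty such as `lambda y: (y[0] - 1.5) ** 2` with weight 2.0 pulls the estimate toward a prior value of 1.5. This quadratic form is an illustrative assumption, not the form of the regularization terms in the disclosure.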
In one example, the regularization terms are measurement precision and wafer mean within wafer as described hereinbefore. In this example, the regularization term associated with measurement precision is illustrated in equation (11), where Yreg-prec is the prior estimated values of the parameter of interest and σ(Yreg-prec) denotes the standard deviation of Yreg-prec.
The regularization term associated with wafer mean within wafer precision is illustrated in equation (12), where
In this example, the loss function is illustrated by equation (13).
In some embodiments, values of parameters of interest employed to train a measurement model are derived from measurements of DOE wafers by a reference metrology system. The reference metrology system is a trusted measurement system that generates sufficiently accurate measurement results. In some examples, reference metrology systems are too slow to be used to measure wafers on-line as part of the wafer fabrication process flow, but are suitable for off-line use for purposes such as model training. By way of non-limiting example, a reference metrology system may include a stand-alone optical metrology system, such as a spectroscopic ellipsometer (SE), SE with multiple angles of illumination, SE measuring Mueller matrix elements, a single-wavelength ellipsometer, a beam profile ellipsometer, a beam profile reflectometer, a broadband reflective spectrometer, a single-wavelength reflectometer, an angle-resolved reflectometer, an imaging system, a scatterometer, such as a speckle analyzer, an X-ray based metrology system such as a small angle x-ray scatterometer (SAXS) operated in a transmission or grazing incidence mode, an x-ray diffraction (XRD) system, an x-ray fluorescence (XRF) system, an x-ray photoelectron spectroscopy (XPS) system, an x-ray reflectometer (XRR) system, a Raman spectroscopy system, an atomic force microscopy (AFM) system, a transmission electron microscopy system, a scanning electron microscopy system, or other technologies capable of determining device geometry.
In some embodiments, a measurement model trained as described herein is implemented as a neural network model. In other examples, a measurement model may be implemented as a linear model, a non-linear model, a polynomial model, a response surface model, a support vector machines model, a random forest model, a deep network model, a convolutional network model, or other types of models. In some examples, a measurement model trained as described herein may be implemented as a combination of models.
In yet another further aspect, the measurement results described herein can be used to provide active feedback to a process tool (e.g., lithography tool, etch tool, deposition tool, etc.). For example, values of measured parameters determined based on measurement methods described herein can be communicated to an etch tool to adjust the etch time to achieve a desired etch depth. In a similar way, etch parameters (e.g., etch time, diffusivity, etc.) or deposition parameters (e.g., time, concentration, etc.) may be included in a measurement model to provide active feedback to etch tools or deposition tools, respectively. In some examples, corrections to process parameters determined based on measured device parameter values and a trained measurement model may be communicated to the process tool. In one embodiment, computing system 130 determines values of one or more parameters of interest during processing based on measured signals 111 received from a measurement system. In addition, computing system 130 communicates control commands to a process controller (not shown) based on the determined values of the one or more parameters of interest. The control commands cause the process controller to change the state of a process (e.g., stop the etch process, change the diffusivity, change lithography focus, change lithography dosage, etc.).
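The etch-time feedback example can be sketched in Python as a simple proportional correction. The function name, the units, the constant etch rate, and the `gain` parameter are hypothetical and chosen only for illustration; the disclosure does not specify a control law.

```python
def etch_time_correction(target_depth_nm, measured_depth_nm,
                         etch_rate_nm_per_s, gain=1.0):
    """Additional etch time (seconds) needed to reach the target depth.

    Assumes a constant etch rate. A gain below 1.0 applies only part of the
    correction, which can damp the response to measurement noise.
    """
    if etch_rate_nm_per_s <= 0.0:
        raise ValueError("etch rate must be positive")
    return gain * (target_depth_nm - measured_depth_nm) / etch_rate_nm_per_s
```

A negative return value indicates that the measured depth already exceeds the target, in which case a process controller might shorten the etch time for subsequent wafers.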
In some embodiments, the methods and systems for metrology of semiconductor devices as described herein are applied to the measurement of memory structures. These embodiments enable optical critical dimension (CD), film, and composition metrology for periodic and planar structures.
In some examples, the measurement models are implemented as an element of a SpectraShape® optical critical-dimension metrology system available from KLA-Tencor Corporation, Milpitas, Calif., USA. In this manner, the model is created and ready for use immediately after the spectra are collected by the system.
In some other examples, the measurement models are implemented off-line, for example, by a computing system implementing AcuShape® software available from KLA-Tencor Corporation, Milpitas, Calif., USA. The resulting, trained model may be incorporated as an element of an AcuShape® library that is accessible by a metrology system performing measurements.
In block 301, an amount of Design of Experiments (DOE) measurement data associated with measurements of one or more DOE metrology targets is received by a computing system.
In block 302, known, reference values of one or more parameters of interest associated with the DOE metrology targets are received by the computing system.
In block 303, an amount of regularization measurement data from measurements of one or more regularization structures disposed on a first wafer by a metrology tool is received by the computing system.
In block 304, values of one or more measurement performance metrics associated with the regularization measurement data are received by the computing system.
In block 305, a measurement model is trained based on an optimization function including the amount of Design of Experiments (DOE) measurement data, the reference values of one or more parameters of interest, the regularization measurement data, and the one or more measurement performance metrics. The optimization function is regularized by the one or more measurement performance metrics.
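Blocks 301 through 305 can be illustrated with a deliberately small Python sketch. It assumes a one-signal linear measurement model, y = w·s + b, trained by gradient descent on the mean squared error against the DOE reference values plus a precision penalty equal to the variance of the model output over repeated measurements of one regularization site. The model form, the penalty, and all names are assumptions for illustration, not the disclosed training procedure.

```python
import statistics


def train_regularized(doe_signals, reference_values, repeat_signals,
                      gamma, lr=0.05, steps=5000):
    """Fit y = w * s + b with a precision-based regularization term.

    Loss = mean((w*s_i + b - r_i)^2) + gamma * w^2 * var(repeat_signals),
    where the second term is the variance of the model output over repeated
    measurements of the same site (a stand-in precision metric).
    The learning rate must suit the scale of the signals.
    """
    n = len(doe_signals)
    var_q = statistics.pvariance(repeat_signals)
    w, b = 0.0, 0.0
    for _ in range(steps):
        residuals = [w * s + b - r
                     for s, r in zip(doe_signals, reference_values)]
        grad_w = (2.0 * sum(e * s for e, s in zip(residuals, doe_signals)) / n
                  + 2.0 * gamma * w * var_q)
        grad_b = 2.0 * sum(residuals) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b
```

With gamma set to zero the sketch reduces to an ordinary least squares fit. Increasing gamma trades some accuracy on the DOE data for reduced sensitivity to the repeat-to-repeat signal variation.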
In block 401, an amount of measurement data from measurements of one or more metrology targets disposed on a wafer by a metrology tool is received by a computing system.
In block 402, values of one or more measurement performance metrics associated with the measurement data are received by the computing system.
In block 403, values of one or more parameters of interest characterizing the one or more metrology targets are estimated from the amount of measurement data based on a regression analysis including an optimization function that is regularized by the one or more measurement performance metrics.
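For the regression of blocks 401 through 403, a closed-form special case helps show how regularization affects the estimate. The Python sketch below assumes a single parameter of interest, a simulation model that is linear in that parameter on each signal channel (signal_i = a_i·y + c_i), a diagonal noise covariance, and a quadratic penalty toward a prior estimate. These assumptions and the function name are illustrative only.

```python
def regularized_estimate(x_meas, slopes, intercepts, noise_var, gamma, prior):
    """Minimize sum_i (a_i*y + c_i - x_i)^2 / v_i + gamma * (y - prior)^2.

    Setting the derivative with respect to y to zero gives
    y = (sum_i a_i*(x_i - c_i)/v_i + gamma*prior) / (sum_i a_i^2/v_i + gamma).
    """
    numerator = gamma * prior
    denominator = gamma
    for x, a, c, v in zip(x_meas, slopes, intercepts, noise_var):
        numerator += a * (x - c) / v
        denominator += a * a / v
    if denominator == 0.0:
        raise ValueError("estimate is undetermined (no sensitivity, no prior)")
    return numerator / denominator
```

With gamma equal to zero the result is the noise-weighted least squares estimate. As gamma grows, the estimate moves from the data-only solution toward the prior value.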
In a further embodiment, system 100 includes one or more computing systems 130 employed to perform measurements of semiconductor structures based on spectroscopic measurement data collected in accordance with the methods described herein. The one or more computing systems 130 may be communicatively coupled to one or more spectrometers, active optical elements, process controllers, etc. In one aspect, the one or more computing systems 130 are configured to receive measurement data associated with spectral measurements of structures of wafer 101.
It should be recognized that one or more steps described throughout the present disclosure may be carried out by a single computer system 130 or, alternatively, by multiple computer systems 130. Moreover, different subsystems of system 100 may include a computer system suitable for carrying out at least a portion of the steps described herein. Therefore, the aforementioned description should not be interpreted as a limitation on the present invention but merely as an illustration.
In addition, the computer system 130 may be communicatively coupled to the spectrometers in any manner known in the art. For example, the one or more computing systems 130 may be coupled to computing systems associated with the spectrometers. In another example, the spectrometers may be controlled directly by a single computer system coupled to computer system 130.
The computer system 130 of system 100 may be configured to receive and/or acquire data or information from the subsystems of the system (e.g., spectrometers and the like) by a transmission medium that may include wireline and/or wireless portions. In this manner, the transmission medium may serve as a data link between the computer system 130 and other subsystems of system 100.
Computer system 130 of system 100 may be configured to receive and/or acquire data or information (e.g., measurement results, modeling inputs, modeling results, reference measurement results, etc.) from other systems by a transmission medium that may include wireline and/or wireless portions. In this manner, the transmission medium may serve as a data link between the computer system 130 and other systems (e.g., memory on-board system 100, external memory, or other external systems). For example, the computing system 130 may be configured to receive measurement data from a storage medium (i.e., memory 132 or an external memory) via a data link. For instance, spectral results obtained using the spectrometers described herein may be stored in a permanent or semi-permanent memory device (e.g., memory 132 or an external memory). In this regard, the spectral results may be imported from on-board memory or from an external memory system. Moreover, the computer system 130 may send data to other systems via a transmission medium. For instance, a measurement model or an estimated parameter value determined by computer system 130 may be communicated and stored in an external memory. In this regard, measurement results may be exported to another system.
Computing system 130 may include, but is not limited to, a personal computer system, mainframe computer system, workstation, image computer, parallel processor, or any other device known in the art. In general, the term “computing system” may be broadly defined to encompass any device having one or more processors, which execute instructions from a memory medium.
Program instructions 134 implementing methods such as those described herein may be transmitted over a transmission medium such as a wire, cable, or wireless transmission link. For example, as illustrated in
As described herein, the term “critical dimension” includes any critical dimension of a structure (e.g., bottom critical dimension, middle critical dimension, top critical dimension, sidewall angle, grating height, etc.), a critical dimension between any two or more structures (e.g., distance between two structures), and a displacement between two or more structures (e.g., overlay displacement between overlaying grating structures, etc.). Structures may include three dimensional structures, patterned structures, overlay structures, etc.
As described herein, the term “critical dimension application” or “critical dimension measurement application” includes any critical dimension measurement.
As described herein, the term “metrology system” includes any system employed at least in part to characterize a specimen in any aspect, including measurement applications such as critical dimension metrology, overlay metrology, focus/dosage metrology, and composition metrology. However, such terms of art do not limit the scope of the term “metrology system” as described herein. In addition, the system 100 may be configured for measurement of patterned wafers and/or unpatterned wafers. The metrology system may be configured as a LED inspection tool, edge inspection tool, backside inspection tool, macro-inspection tool, or multi-mode inspection tool (involving data from one or more platforms simultaneously), and any other metrology or inspection tool that benefits from the calibration of system parameters based on critical dimension data.
Various embodiments are described herein for a semiconductor measurement system that may be used for measuring a specimen within any semiconductor processing tool (e.g., an inspection system or a lithography system). The term “specimen” is used herein to refer to a wafer, a reticle, or any other sample that may be processed (e.g., printed or inspected for defects) by means known in the art.
As used herein, the term “wafer” generally refers to substrates formed of a semiconductor or non-semiconductor material. Examples include, but are not limited to, monocrystalline silicon, gallium arsenide, and indium phosphide. Such substrates may be commonly found and/or processed in semiconductor fabrication facilities. In some cases, a wafer may include only the substrate (i.e., bare wafer). Alternatively, a wafer may include one or more layers of different materials formed upon a substrate. One or more layers formed on a wafer may be “patterned” or “unpatterned.” For example, a wafer may include a plurality of dies having repeatable pattern features.
A “reticle” may be a reticle at any stage of a reticle fabrication process, or a completed reticle that may or may not be released for use in a semiconductor fabrication facility. A reticle, or a “mask,” is generally defined as a substantially transparent substrate having substantially opaque regions formed thereon and configured in a pattern. The substrate may include, for example, a glass material such as amorphous SiO2. A reticle may be disposed above a resist-covered wafer during an exposure step of a lithography process such that the pattern on the reticle may be transferred to the resist.
One or more layers formed on a wafer may be patterned or unpatterned. For example, a wafer may include a plurality of dies, each having repeatable pattern features. Formation and processing of such layers of material may ultimately result in completed devices. Many different types of devices may be formed on a wafer, and the term wafer as used herein is intended to encompass a wafer on which any type of device known in the art is being fabricated.
In one or more exemplary embodiments, the functions described may be implemented in hardware, software, firmware, or any combination thereof. If implemented in software, the functions may be stored on or transmitted over a computer-readable medium as one or more instructions or code. Computer-readable media includes both computer storage media and communication media including any medium that facilitates transfer of a computer program from one place to another. A storage medium may be any available medium that can be accessed by a general-purpose or special-purpose computer. By way of example, and not limitation, such computer-readable media can comprise RAM, ROM, EEPROM, CD-ROM or other optical disk storage, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to carry or store desired program code means in the form of instructions or data structures and that can be accessed by a general-purpose or special-purpose computer, or a general-purpose or special-purpose processor. Also, any connection is properly termed a computer-readable medium. For example, if the software is transmitted from a website, server, or other remote source using a coaxial cable, fiber optic cable, twisted pair, digital subscriber line (DSL), or wireless technologies such as infrared, radio, and microwave, then the coaxial cable, fiber optic cable, twisted pair, DSL, or wireless technologies such as infrared, radio, and microwave are included in the definition of medium. Disk and disc, as used herein, include compact disc (CD), laser disc, optical disc, digital versatile disc (DVD), floppy disk, and Blu-ray disc, where disks usually reproduce data magnetically, while discs reproduce data optically with lasers. Combinations of the above should also be included within the scope of computer-readable media.
Although certain specific embodiments are described above for instructional purposes, the teachings of this patent document have general applicability and are not limited to the specific embodiments described above. Accordingly, various modifications, adaptations, and combinations of various features of the described embodiments can be practiced without departing from the scope of the invention as set forth in the claims.
The present application for patent claims priority under 35 U.S.C. § 119 from U.S. provisional patent application Ser. No. 62/942,730, entitled “Metrology System Utilizing Probabilistic Domain Knowledge and Physical Realization,” filed Dec. 2, 2019, the subject matter of which is incorporated herein by reference in its entirety.
Number | Date | Country | |
---|---|---|---|
20210165398 A1 | Jun 2021 | US |
Number | Date | Country | |
---|---|---|---|
62942730 | Dec 2019 | US |