The subject matter described herein generally relates to a method and apparatus for determining sensitivities of reflection or transmission spectra to various grating parameters via a time or frequency domain simulation. The computation of these sensitivities may be central in a number of techniques used in optically based metrology tools to deduce the features on semiconductor wafers.
Some metrology tools measure more than just one-dimensional critical dimension (CD) features, because of smaller device dimensions and tighter process control windows. It has become critical to efficiently detect, identify, and measure changes in feature profiles to control current and future semiconductor lithography and etch processes.
Generally, the determination of sensitivities of reflection or transmission spectra to various grating parameters via a time or frequency domain simulation has used a finite difference approximation of the derivatives.
The method of computing derivatives by finite differences has a number of disadvantages. For example, each desired derivative with respect to a single scalar parameter requires an additional solution of Maxwell's equations. This computation can be costly, and as a result limits the total number of derivatives that may be computed for applications where computation time is important. The computation cost also limits the number of wavelengths from the optical scattering data that may be processed, which prevents the use of the full information provided by the optical scattering data.
One embodiment of the invention relates to a method of model-based metrology. An area of a geometrical structure of dispersive materials is illuminated with incident electromagnetic radiation, wherein the incident electromagnetic radiation is polarized, and spectral components of the incident electromagnetic radiation reflected from the area are measured. A determination is made as to parameter values that minimize an objective function which represents a difference between the measured spectral components and computed spectral components based on a parameterized model of the geometrical structure.
Another embodiment of the invention relates to an apparatus for metrology. The apparatus includes at least a polarized illuminator, a detector, and a data processing system. The polarized illuminator is configured to illuminate an area of a geometrical structure of dispersive materials with incident polarized electromagnetic radiation, and the detector is configured to measure spectral components of the incident electromagnetic radiation reflected from the area. The data processing system is configured to determine parameter values that minimize an objective function which represents a difference between the measured spectral components and computed spectral components based on a parameterized model of the geometrical structure.
Another embodiment relates to a method for estimating a critical dimension from spectroscopic measurements. The method includes computing a solution to state equations driven by a function representing the incident electromagnetic radiation, and computing a solution to an adjoint to the state equations. Such computations may be made using regression, or a library, or a combination of regression and a library. Advantageously, the time to compute each iteration of the solution at a given level of accuracy scales slower with an increasing number of parameters than computing each iteration of the solution at the given level of accuracy by finite difference derivatives.
Another embodiment relates to a method of computation to estimate a critical dimension from spectroscopic measurements, wherein with an increasing number of parameters N, a computation time per iteration at a given level of accuracy increases at a rate slower than N to the third power. In a preferred embodiment, the computation time per iteration at a given level of accuracy increases linearly with N.
Other embodiments and features are also disclosed.
The accompanying drawings are included to provide further understanding of embodiments of the invention, illustrate various embodiments of the invention, and together with the description serve to explain the principles and operation of the invention.
Some embodiments of the present invention provide a mechanism for measuring critical dimensions (CD) in “grating” structures on semiconductor substrates, such as semiconductor wafers, using a technique called Optical CD (OCD). The mechanism minimizes an objective function that is a difference (typically mean square) between physical measurements of diffracted spectra and those that may be computed for a given parameterized description of the grating structure. The minimization procedure, referred to generally as regression, uses derivatives of the objective and/or measurement function with regard to the geometric parameters describing the grating.
In accordance with one embodiment of the invention, a method includes computing the spectra from a solution of Maxwell's equations. The solution may be obtained broadly in at least two ways: first, using a frequency domain computation, one example of which is the Rigorous Coupled Wave Analysis (RCWA) method; and second, using a time domain computation.
The adjoint methods disclosed herein and described below provide for the computation of objective or measurement function derivatives with regard to geometric parameters more quickly and accurately than other methods, thus speeding up the regression process. Furthermore, the same techniques may be used to find sensitivities of the measurements to optical parameters, such as wavelength, angle of incidence, azimuth angle, pitch, and index of refraction. The results may be used to optimize the signal-to-noise ratio of the measurement or to provide information about the precision of the estimate.
In one embodiment of the invention, the adjoint methods include five computations. The first computation is called the state solution. The state solution includes the computation of the measured reflection and transmission spectra of a grating on a semiconductor wafer via the solution to Maxwell's equations. The source function of these equations is a function describing the incident light.
The second computation is called the adjoint solution. The adjoint solution is the computation of the solution to a set of equations related to state equations. The source function for these equations is a function describing the derivative or first variation of the objective or measurement function as a function of the solution of the state equation.
The third computation is that of parametric derivatives due to perturbations of the state and/or adjoint equations.
The fourth computation is an inner-product-like functional involving the state and adjoint solutions, where the inner-product functional depends on the parametric derivatives. The results of the fourth computation are derivatives with respect to local parameters, such as a set of slab heights and slab CD fractions, or with respect to optical parameters, such as wavelength, angle of incidence, azimuth angle, pitch, and index of refraction.
The fifth computation is that of the linear map between the local parameters and geometric parameters, such as trapezoid height, width, and wall angle.
Typically, the state computation may be time-intensive to perform, while the adjoint, derivative and inner-product computations are generally less so. For example, for the RCWA based 2D and 3D frequency domain Maxwell's equations solver, the computation time of the state solution scales as the third power of the number of Fourier modes describing the index of refraction of the grating. The adjoint, derivative and inner-product computations scale as the second power of the Fourier modes.
In various embodiments, the time and/or frequency domain methods may more quickly compute the derivatives of the measurement function with regard to geometric parameters that define the grating, thus enabling higher order parameterization of a grating. Additionally, these derivatives may be computed more accurately than with, for example, a finite-difference type method.
In various embodiments, a further advantage of the time and frequency domain methods is that the computation of measurement sensitivities allows the measurement setup to be optimized, maximizing the measured signal as a function of, for example, wavelength, angle of incidence, and the like. By computing measurement derivatives, the reflection or transmission coefficients may be interpolated or extrapolated to neighboring values of incidence angle using a Taylor expansion or a cubic spline, which may be applied in numerical aperture averaging. Computing measurement derivatives may also enable faster computation of spectra in a wavelength neighborhood about the originally computed spectra in the case of the frequency domain solution.
Additional advantages, objects and features of embodiments of the invention are set forth in part in the detailed description which follows. It is to be understood that both the foregoing general description and the following detailed description are merely exemplary of embodiments of the invention, and are intended to provide an overview or framework for understanding the nature and character of embodiments of the invention.
In the previous discussion it was noted that if derivatives could be computed much faster than the spectra, then the computational speed of regression algorithms that require derivatives would be greatly increased. This is because finite difference derivatives, which require an extra computation (forward solve) per dimension, would not need to be computed.
Another way in which the fast computation of derivatives would be advantageous would be in the production of a measurement library. A measurement library starts with a parameterization of a particular structure on a semiconductor wafer, and a volume in parameter space over which the parameters are expected to be varied. A measurement library is a collection of spectra computed at various nodes in parameter space. Once a library is generated, multidimensional polynomial models may be constructed for each spectrum, where the polynomial support is related to the parameterization of the structure. If these multidimensional polynomial models are sufficiently accurate, they may be used in place of the actual forward solves within regression. This significantly increases the speed of the regression, which eliminates a computational time bottleneck within the CD measurement process. In other words, the cost of the forward solve is paid in advance in the computation of the library, not during measurement. Nevertheless, it is also important to reduce the time it takes to compute the library, and the ability to compute derivatives more quickly than the spectra can help in this regard.
To motivate this discussion, we start with a simple one dimensional example. Suppose we have a model of a semiconductor feature for which only the total height is unknown. Suppose, too, that the height is expected to vary from 100 nm to 130 nm, and that the dependence of the spectra, α and β, on the height is smooth enough that a cubic model sufficiently captures their behavior over 30 nm intervals.
In such a case, we could build two cubic polynomial models,
for α and β, to cover the region in parameter space from p=100 to p=130 nm.
If we had only the spectra available (and not the derivatives) we would have to build a library consisting of the evaluation of α and β at four nodes, e.g., [100 110 120 130]. The coefficients would be computed in the linear system, AX=B, where each row of the matrix A is row vector rA formed by appropriately exponentiating the parameter value, p
and where each row rB in the matrix B is the spectra associated with the particular parameter value
rB(p)=[α(p),β(p)]
In the example above, the coefficients of the cubic spectra models would be computed by solving the following linear system:
If, however, we have derivatives available, we need only compute spectra and their derivatives at two nodes, e.g., [100 130]. In this case the linear system would be comprised of two-row blocks, RA, in the matrix A for each parameter value
as well as two-row blocks, RB, in the matrix B, formed from the spectra and their derivatives associated with the particular parameter value
In the example above, the coefficients would be computed by solving the following linear system:
In the example above, we can see that the time to compute the library is halved for the one dimensional problem when spectral derivatives are available; presuming the cost to produce the spectral derivatives is negligible via the adjoint method.
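The two ways of building the one dimensional library described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: `toy_spectra` and `toy_spectra_dp` are hypothetical stand-ins for a forward solve and its adjoint derivatives, and the parameter is rescaled from [100, 130] nm onto [-1, 1] purely to keep the small linear systems well conditioned.

```python
import numpy as np

P_MID, P_HALF = 115.0, 15.0  # assumed rescaling of [100, 130] nm onto [-1, 1]

def to_unit(p):
    return (p - P_MID) / P_HALF

def toy_spectra(p):
    """Hypothetical stand-in for a forward solve: returns [alpha(p), beta(p)]."""
    t = to_unit(p)
    return np.array([0.5 + 0.2 * t - 0.1 * t**2 + 0.05 * t**3,
                     -0.8 + 0.1 * t + 0.3 * t**2 - 0.02 * t**3])

def toy_spectra_dp(p):
    """Hypothetical stand-in for adjoint derivatives d[alpha, beta]/dp."""
    t = to_unit(p)
    return np.array([0.2 - 0.2 * t + 0.15 * t**2,
                     0.1 + 0.6 * t - 0.06 * t**2]) / P_HALF

def row_value(t):
    # Row of A for a spectrum value: monomials 1, t, t^2, t^3.
    return np.array([1.0, t, t**2, t**3])

def row_derivative(t):
    # Row of A for a spectrum derivative: d/dt of the monomials.
    return np.array([0.0, 1.0, 2 * t, 3 * t**2])

def fit_values_only(nodes):
    """Cubic fit from spectra at four nodes (one row of A and B per node)."""
    A = np.vstack([row_value(to_unit(p)) for p in nodes])
    B = np.vstack([toy_spectra(p) for p in nodes])
    return np.linalg.solve(A, B)

def fit_values_and_derivatives(nodes):
    """Cubic fit from spectra and derivatives at two nodes (two-row blocks)."""
    A_rows, B_rows = [], []
    for p in nodes:
        t = to_unit(p)
        A_rows += [row_value(t), row_derivative(t)]
        # Chain rule: d/dt = P_HALF * d/dp for the rescaled parameter.
        B_rows += [toy_spectra(p), P_HALF * toy_spectra_dp(p)]
    return np.linalg.solve(np.vstack(A_rows), np.vstack(B_rows))

X_four_nodes = fit_values_only([100.0, 110.0, 120.0, 130.0])
X_two_nodes = fit_values_and_derivatives([100.0, 130.0])
```

Each column of the returned coefficient matrix holds the four cubic coefficients for one spectrum. Because the synthetic spectra here are exactly cubic, both fits recover the same coefficients, while the second uses half as many forward solves.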
For multidimensional problems, the formulation is quite similar, with the exception that the monomials will be made up of cross-terms formed from the elements of the parameter vector. For example, for the two dimensional interpolation problem, a cubic multidimensional polynomial derived from parameter p=[x y]T would be comprised of ten monomials in x and y. For each parameter value, the system matrix is made up of three-row blocks with ten columns, one for each monomial
The associated rows of the matrix B are
In one embodiment of the library interpolation method, local cubic multidimensional models are built on the fly in a region where the solution is expected to lie. Numerical experiments indicate that for a given volume of parameter space, the expected precision of the local models is roughly equal when the library space contains the same number of independent data, whether that data is comprised of just spectra or spectra and derivatives. In other words, for a parameter space with M degrees of freedom, one might expect the same accuracy of the local models for (M+1) N spectra evaluations or N spectral evaluations with MN derivatives. If we presume the time to compute the spectral derivatives is negligible compared to the spectra, then we can expect a speed-up of a factor of M+1 in the time it takes to compute a library.
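For the two dimensional case, the three-row blocks (value, x-derivative, y-derivative) over the ten cubic monomials can be sketched as below. The node placement at the four corners of a unit square, the single synthetic spectrum, and all function names are assumptions chosen for illustration. With M=2 parameters and N=4 nodes this gives (M+1)N=12 independent data for 10 unknown coefficients, so a least-squares solve is used.

```python
import numpy as np

# Exponent pairs (i, j) of the ten monomials x**i * y**j with i + j <= 3.
EXPONENTS = [(i, j) for i in range(4) for j in range(4 - i)]

def block_rows(x, y):
    """Three-row block of A at (x, y): monomial values, d/dx, and d/dy."""
    val = [x**i * y**j for i, j in EXPONENTS]
    d_dx = [i * x**(i - 1) * y**j if i > 0 else 0.0 for i, j in EXPONENTS]
    d_dy = [j * x**i * y**(j - 1) if j > 0 else 0.0 for i, j in EXPONENTS]
    return np.array([val, d_dx, d_dy], dtype=float)

def fit_local_model(nodes, values, gradients):
    """Least-squares cubic fit from values and gradients at the given nodes."""
    A = np.vstack([block_rows(x, y) for x, y in nodes])
    B = np.concatenate([[v, g[0], g[1]] for v, g in zip(values, gradients)])
    coeffs, *_ = np.linalg.lstsq(A, B, rcond=None)
    return A, coeffs

# Synthetic cubic "spectrum" with known coefficients (illustrative only).
true_coeffs = np.linspace(-1.0, 1.0, len(EXPONENTS))
nodes = [(-1.0, -1.0), (-1.0, 1.0), (1.0, -1.0), (1.0, 1.0)]
values = [block_rows(x, y)[0] @ true_coeffs for x, y in nodes]
gradients = [block_rows(x, y)[1:] @ true_coeffs for x, y in nodes]

A_demo, fitted_coeffs = fit_local_model(nodes, values, gradients)
```

In this sketch the four corner nodes with gradients determine a cubic uniquely, so `fitted_coeffs` reproduces `true_coeffs`; a values-only library would need at least ten forward solves for the same model.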
In the following description, numerous specific details are set forth in order to provide a thorough understanding of embodiments of the present invention. Embodiments of the present invention may be practiced without some or all of these specific details. In other instances, well known process operations have not been described in detail in order not to unnecessarily obscure the present invention.
Spectroscopic Scatterometer
Before the diffracting structure 102c is measured, an XYZ stage 104 is used for moving the wafer in the horizontal XY directions in order to first measure the film thickness and refractive index of the underlying structure underneath the photoresist pattern 102c.
For the purpose of measuring the film thickness and refractive index of the underlying structure (102b and 102a), a broadband radiation source such as white light source 106 supplies light through a fiber optic cable 108 which randomizes the polarization and creates a uniform light source for illuminating the wafer. Upon emerging from fiber 108, the radiation passes through an optical illuminator that may include a slit aperture and a focus lens (not shown). The slit aperture causes the emerging light beam to image a small area of layer 102b. The light emerging from illuminator 110 is polarized by a polarizer 112 to produce a polarized sampling beam 114 illuminating the layer 102b.
The radiation originating from sampling beam 114 that is reflected by layer 102b is passed through an analyzer 116 and to a spectrometer 118 to detect different spectral components of the reflected radiation. In the spectroscopic ellipsometry (SE) mode of system 100 for measuring film thickness and refractive index, either the polarizer 112 or the analyzer 116 is rotated (to cause relative rotational motion between the polarizer and the analyzer) when spectrometer 118 is detecting the reflected light at a plurality of wavelengths, such as those in the spectrum of the radiation source 106, where the rotation is controlled by computer 120 in a manner known to those skilled in the art. The reflected intensities detected at different wavelengths are supplied to computer 120, which computes the film thickness and the n and k values of the refractive index of layer 102b.
Frequency Domain
As shown in
In the embodiment where derivatives with respect to geometric parameter are desired, the computation continues with the computation of the derivatives of the input vector and system matrix from the linear system of step 1 (202) above with regard to the local parameters, such as slab heights and CD fractions (206); the computation of a set of local parameter derivatives, namely with respect to slab heights and CD fractions, computed via the inner product of the state and the adjoint vectors weighted by the derivatives of the state matrix with regard to local parameter derivatives; and the computation of a map between local parameter derivatives (208), and geometric parameter derivatives, such as grating height, width, and sidewall angle.
In the embodiment where derivatives with respect to optical parameters are desired, the computation (204) is followed by the optional computation (206), the computation of the derivatives of the state input vector and system matrix from the linear system of computation (202) above with regard to the optical parameters, and the optional computation of optical parameter derivatives of the inner product between the adjoint and the difference between the derivative of the state input vector and the product of the state and the derivative of the state matrix (from computation 206).
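The sequence of state solve, adjoint solve, and inner product described above can be sketched for a generic parameterized linear system A(p)x=b(p). The 3×3 real-valued system, the extraction vector `c`, the target value `m0`, and all function names below are hypothetical stand-ins chosen for illustration; an RCWA system would be complex-valued and much larger.

```python
import numpy as np

def system(p):
    """Hypothetical parameterized system A(p) x = b(p) with p = [f, h]."""
    f, h = p
    A = np.array([[2.0 + f, h, 0.0],
                  [h, 3.0 + f * h, 1.0],
                  [0.0, 1.0, 4.0 + h**2]])
    b = np.array([1.0, f, 0.5])
    return A, b

def system_derivatives(p):
    """Analytic derivatives of A and b with respect to f and h."""
    f, h = p
    dA = [np.array([[1.0, 0.0, 0.0], [0.0, h, 0.0], [0.0, 0.0, 0.0]]),
          np.array([[0.0, 1.0, 0.0], [1.0, f, 0.0], [0.0, 0.0, 2.0 * h]])]
    db = [np.array([0.0, 1.0, 0.0]), np.zeros(3)]
    return dA, db

c = np.array([1.0, 0.0, 0.0])  # assumed vector extracting the "measurement"
m0 = 0.3                       # assumed measured value

def objective(p):
    A, b = system(p)
    x = np.linalg.solve(A, b)
    return 0.5 * (c @ x - m0) ** 2

def adjoint_gradient(p):
    A, b = system(p)
    x = np.linalg.solve(A, b)              # state solve
    grad_x = (c @ x - m0) * c              # derivative of J w.r.t. the state
    x_hat = np.linalg.solve(A.T, grad_x)   # adjoint solve (one, for all parameters)
    dA, db = system_derivatives(p)
    # Inner product of the adjoint with (db/dp - dA/dp x), one per parameter.
    return np.array([x_hat @ (db_k - dA_k @ x) for dA_k, db_k in zip(dA, db)])

def finite_difference_gradient(p, eps=1e-6):
    p = np.asarray(p, dtype=float)
    grad = np.zeros_like(p)
    for k in range(p.size):                # two extra state solves per parameter
        step = np.zeros_like(p)
        step[k] = eps
        grad[k] = (objective(p + step) - objective(p - step)) / (2 * eps)
    return grad

p_demo = np.array([0.25, 1.1])
g_adjoint = adjoint_gradient(p_demo)
g_fd = finite_difference_gradient(p_demo)
```

In this sketch the adjoint gradient needs one state solve and one adjoint solve regardless of the number of parameters, whereas the finite-difference gradient needs additional state solves for every parameter.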
A scattering problem may be described as light impinging on a geometrical structure and measuring the fraction of light that is reflected and/or transmitted. These geometric structures typically include an incident layer, a grating layer with a parameterized geometry, and a substrate layer. While both the incident layer and the substrate layer have uniform indices of refraction, the grating layer has a non-uniform index of refraction. This non-uniformity may be viewed as a function of height and length.
An illustration of a scattering problem may include a grating region that contains two rectangular areas with different indices of refraction, na and nb. The sum of the widths of the rectangular regions constitutes the pitch, Λ, and the width of the rectangle associated with index of refraction, nb, is the critical dimension (CD). The geometry of this problem may be parameterized by grating height, h, and CD fraction, ƒ=CD/Λ.
One technique to compute the solution of this scattering problem is to use the RCWA framework, casting the problem as the solution to a linear system, Ax=b, where A is the system matrix and b is the system input vector. In the transverse electric (TE) field case, this linear system may be written as Aexe=be or:
The constituent elements of the system matrix Ae and the system input vector be are defined along the lines of (1) and are given as follows. The matrix
is the difference between the square of the diagonal matrix Kx that contains the weightings of the diffraction modes in the forward (horizontal) direction of the incident light and the matrix E, and a Toeplitz matrix comprised of the difference of the Fourier coefficients of the function describing the permittivity in the grating region. The constant
The solution to the linear system, xe, contains the sought-after values, namely the reflection coefficient, re, and the transmission coefficient, te, as well as two coefficients describing the boundary conditions between the grating layer and the incident layer, and the grating layer and the substrate layer.
It should be noted that this simple geometry forms a basis for solving the scattering problems, as they may be built up with many individual slabs having rectangular geometries, in Ziggurat fashion. In this embodiment, the system solution vector is elongated containing additional terms describing the boundary conditions between each slab.
Similarly, the linear system for the transverse magnetic (TM) field case may be written as Amxm=bm or
where
$B = A^{-1}(K_x E^{-1} K_x - I)$, (4)
Zl, Zll, N, and bl are similarly defined in (1). Here, the matrix A is a Toeplitz matrix comprised of the difference of the Fourier coefficients of the function describing the inverse of the permittivity in the grating region.
Inferring the geometry of the scattering region requires finding the parameter values, p, associated with the spectra α and β, that are closest to measured spectra, α0 and β0. The measure of closeness is given by an objective function; this is often a quadratic form based on the differences between measured and computed spectra. In the above example, this would entail finding the grating height, h, and CD fraction, ƒ, whose attendant spectra, when differenced with the measured spectra, minimize the quadratic form. A suitable objective function, J, used within the RCWA framework may be defined based on functions of the state vectors, xe and xm:
where
and
with
The vectors s and c are defined as:
s=sin(η)er0 (11)
and
c=cos(η)er0 (12)
where the vector $e_{r0} = [0 \ldots 0\ 1\ 0 \ldots 0]^T$ extracts the zeroth order reflection coefficient from the state vectors xe and xm in (1) and (2) respectively. The angle η is the analyzer angle. The derivatives of the objective function with regard to the parameter vector p=[ƒ,h]T may be written as follows
where the adjoint vectors, $\hat{x}_e$ and $\hat{x}_m$, are the solutions of the linear systems
and
In the embodiment described in this section, the spectra of interest are defined as α and β; however, this definition may be made more general by allowing any function of the state to be considered as spectra. Such functions may be quadratic functions of the state, among whose class Mueller matrix elements belong, or linear functions of the state, among whose class Jones matrix elements belong. Such reformulations would be evident to one of skill in the art.
Matrix Derivatives
The computation of (13) includes the computation of derivatives of scalar functions relative to vectors and matrices relative to scalars. Starting with the right hand sides of (14) and (15):
∇xeJ=(α−α0)∇xeα+(β−β0)∇xeβ (16)
where
and
with similar definition for ∇xmJ, that is, the derivative of the objective function with respect to xm. The derivatives of I0, Iα, and Iβ with respect to the state vectors, xe and xm, may be written as follows:
Next, compute the values of:
and
In the problem given, the values of ∇pbe=0 and ∇pbm=0; however, this in general is not so, particularly for optical parameters like angle of incidence, pitch, and wavelength.
Note from (2) that the Fourier expansion of the permittivity function ε(·) is given as:
The matrix E in (2) and (4) is a Toeplitz matrix formed from the elements of ε, i.e., [E]ij=εi−j. Similarly, the Fourier expansion of the inverse permittivity function a=1/ε is given as:
The matrix A in (4) is also a Toeplitz matrix formed from the elements of a, i.e., [A]ij=ai-j. Since ε and a are analytic in ƒ, they may be differentiated,
It should be noted that while in this case the derivatives of the Fourier component matrices E and A have been computed analytically, they may also be computed numerically. Since the numerical derivative ultimately involves the differentiation of a vector with regard to a scalar (or a set of scalars when a plurality of materials is present in the grating), the numerical computation of the derivatives by finite differences is efficient and does not materially impact the speed of the method.
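The analytic and numerical routes to the derivative of the Toeplitz matrix E with respect to CD fraction can be compared in a short sketch. The closed-form Fourier coefficients used here assume a single rectangle of permittivity `eps_b`, centered in the period, in a background of permittivity `eps_a`; that geometry, the truncation `n_modes`, and the function names are assumptions for illustration rather than the expressions of the original equations.

```python
import numpy as np

def permittivity_coeffs(f, eps_a, eps_b, n_modes):
    """Fourier coefficients eps_k, k = -2n..2n, for a centered rectangle
    of width fraction f (assumed geometry)."""
    k = np.arange(-2 * n_modes, 2 * n_modes + 1)
    # np.sinc(x) = sin(pi x) / (pi x), so this equals sin(pi k f) / (pi k).
    coeffs = (eps_b - eps_a) * f * np.sinc(k * f)
    coeffs[k == 0] += eps_a
    return coeffs

def permittivity_coeffs_df(f, eps_a, eps_b, n_modes):
    """Analytic derivative of the coefficients with respect to f."""
    k = np.arange(-2 * n_modes, 2 * n_modes + 1)
    return (eps_b - eps_a) * np.cos(np.pi * k * f)

def toeplitz_from_coeffs(coeffs, n_modes):
    """Build [E]_ij = eps_(i-j) for i, j = 0..2n."""
    idx = np.arange(2 * n_modes + 1)
    offset = 2 * n_modes  # position of k = 0 in the coefficient vector
    return coeffs[idx[:, None] - idx[None, :] + offset]

eps_a, eps_b, f0, n_modes = 1.0, 4.0, 0.25, 5  # n_a = 1, n_b = 2 as permittivities

E = toeplitz_from_coeffs(permittivity_coeffs(f0, eps_a, eps_b, n_modes), n_modes)
dE_analytic = toeplitz_from_coeffs(
    permittivity_coeffs_df(f0, eps_a, eps_b, n_modes), n_modes)

# Finite differences act on the coefficient vector only, which is cheap.
step = 1e-6
dE_numeric = toeplitz_from_coeffs(
    (permittivity_coeffs(f0 + step, eps_a, eps_b, n_modes)
     - permittivity_coeffs(f0 - step, eps_a, eps_b, n_modes)) / (2 * step),
    n_modes)
```

Either derivative matrix could then be passed to the Fréchet derivative computation; the finite-difference version touches only a vector of 4·`n_modes`+1 coefficients, which is why it does not dominate the cost.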
It should also be noted that the parameterization of the discrete change in permittivity by a single parameter, namely CD fraction, may not be adequate for describing index of refraction in the slab. For example, multiple materials in the slab might require multiple parameters akin to CD fraction. A slab wherein there is a smooth (as opposed to an abrupt) change in permittivity might require a different parameterization, such as one that describes a linear or cubic spline. A smooth change in slab permittivity may also be desirable particularly when multiple slabs are used to represent a complex shape. Such a smooth description might be a means of more accurately representing the overall permittivity of the structure, resulting in better accuracy with fewer Fourier modes or slabs in the approximation.
The definition of
and
follows as merely the Toeplitz matrices formed from
and
vectors. The definitions of
and
are given simply as:
Next, find the derivatives of the block matrices in the system matrices in (1) and (3) with regard to height, h, and CD fraction, ƒ. This is made up of Fréchet derivatives with respect to two functions, the matrix square root and the matrix exponential. To compute these derivatives, assume that E and A are semi-simple matrices and use the spectral product method of computing the Fréchet derivative of a function of a matrix [4]. This may be defined as follows. Suppose
$A = X \Lambda X^{-1}$ (28)
where Λ is a diagonal matrix with elements λi and where the analytical function of interest is defined as F(·). Then the Fréchet derivative of the matrix F(A) in the direction V, i.e., DV(F(A)) is given as:
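$D_V(F(A)) = X\left(\bar{F} \circ (X^{-1} V X)\right) X^{-1}$ (29)
with ∘ denoting the Hadamard (element by element) product, and where $\bar{F}$ is the matrix of divided differences of F at the eigenvalues, $[\bar{F}]_{ij} = (F(\lambda_i) - F(\lambda_j))/(\lambda_i - \lambda_j)$ for $\lambda_i \neq \lambda_j$ and $[\bar{F}]_{ij} = F'(\lambda_i)$ otherwise.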
The definition of the Fréchet derivative in (29) provides a particularly efficient mechanism to compute the derivative of the system matrices Ae and Am, as it involves using many of the computational by-products available from the computation of the spectra. In an alternative embodiment, a method for computing the Fréchet derivative of the square root of a matrix is via a Lyapunov equation, that is, if the eigenvector decomposition of the matrix is available, the solution of the Lyapunov equation reduces to the solution of a diagonal system. Also, it should be noted that the methods described herein to compute the Fréchet derivative may be focused on spectral methods. Other methods to compute the Fréchet derivative may be based on functions of block upper triangular matrices with the system matrices on the diagonal and the derivative direction in the upper right hand quadrant.
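A spectral computation of the Fréchet derivative can be sketched as follows, using the standard divided-difference (Daleckii-Krein) form for a diagonalizable matrix, which is assumed here to be the form intended by (29). The test matrix, the direction V, and the function names are illustrative assumptions. The square-root derivative is checked against the Sylvester (Lyapunov-type) relation S·D + D·S = V, and the exponential derivative against a central finite difference.

```python
import numpy as np

def matrix_function(A, f):
    """F(A) = X f(Lambda) X^{-1} for a diagonalizable matrix A."""
    lam, X = np.linalg.eig(A)
    return X @ np.diag(f(lam)) @ np.linalg.inv(X)

def frechet_spectral(A, V, f, df):
    """Frechet derivative of F at A in direction V via the eigendecomposition."""
    lam, X = np.linalg.eig(A)
    X_inv = np.linalg.inv(X)
    f_lam = f(lam)
    gap = lam[:, None] - lam[None, :]
    close = np.isclose(gap, 0.0)           # diagonal and repeated eigenvalues
    safe_gap = np.where(close, 1.0, gap)   # avoid division by zero
    F_bar = np.where(close,
                     df(lam)[:, None],
                     (f_lam[:, None] - f_lam[None, :]) / safe_gap)
    # Hadamard product of the divided differences with V in the eigenbasis.
    return X @ (F_bar * (X_inv @ V @ X)) @ X_inv

rng = np.random.default_rng(0)
M = rng.standard_normal((4, 4))
A_demo = M @ M.T + 4.0 * np.eye(4)
A_demo = A_demo / np.linalg.norm(A_demo, 2)   # eigenvalues in (0, 1]
V_demo = rng.standard_normal((4, 4))

# Square root: the derivative D satisfies S D + D S = V with S = sqrt(A).
S = matrix_function(A_demo, np.sqrt)
D_sqrt = frechet_spectral(A_demo, V_demo, np.sqrt, lambda z: 0.5 / np.sqrt(z))
sylvester_residual = S @ D_sqrt + D_sqrt @ S - V_demo

# Exponential: compare with a central finite difference.
D_exp = frechet_spectral(A_demo, V_demo, np.exp, np.exp)
step = 1e-5
D_exp_fd = (matrix_function(A_demo + step * V_demo, np.exp)
            - matrix_function(A_demo - step * V_demo, np.exp)) / (2 * step)
rel_err_exp = np.linalg.norm(D_exp - D_exp_fd) / np.linalg.norm(D_exp_fd)
```

The eigenvectors and eigenvalues are the same quantities a spectral evaluation of F(A) already needs, so in this sketch the derivative reuses them rather than requiring a new decomposition per direction.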
In the TE case, it is noted that dC/dƒ=−dE/dƒ; thus:
and
The derivative of the system matrix with respect to CD fraction,
is then given by:
Given that
the derivative of the system matrix with respect to CD fraction,
is defined analogously.
The derivatives of the system matrices with regard to the height h may be computed using the derivative of the matrix exponential:
Thus, the derivative of the system matrix with respect to height,
may be written as:
with
given analogously.
It should also be noted that the framework discussed in this example represents a two dimensional grating. The extension of this example to three dimensions is evident to one of skill in the art.
The following example is an illustration of a method of the present invention, but is not meant to be limiting to specifics of the example. The numerical example is given, with indices of refraction nl=na=1 and nll=nb=2, impinging light wavelength λ0=350 nm, pitch Λ=400 nm, height h=110 nm, CD fraction ƒ=0.25, and analyzer angle φ=−25°. These values give α0=0.5795 and β0=−0.8033. The value of the objective function J may be viewed as a surface 300 over the two parameters, height h, and CD fraction ƒ, and is depicted in
The derivatives of the objective function J with regard to height h and CD fraction ƒ are computed two ways, first with the adjoint method of the present invention, and second using finite differences. These values coincide, as is shown in
In this embodiment, the objective function value and gradient are used in a regression in which the optimization method is a trust-region Newton method. The regression starts with initial values h=115 nm and ƒ=0.14. The optimization routine converges to within 0.1% of the correct values h=110 nm and ƒ=0.25 in 13 iterations. The progression of the algorithm is plotted on top of a contour plot of the objective function surface, and is shown in
Time Domain
In the time domain embodiment, as illustrated in
In an embodiment including mirror symmetry of the grating relative to the incident light, the computation of the time domain adjoint equation may be replaced by a suitable scaling of the state equations.
In the time domain solution embodiment of the present invention, the reflection coefficients are determined by the Fourier transform of the reflected electromagnetic field ur(t) divided by the Fourier transform of the incident pulse ui(t) at an observation point (dependence of ur(t) on spatial variables has been omitted).
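The ratio of Fourier transforms described above can be sketched with a discrete FFT. The Gaussian-modulated incident pulse, the synthetic "reflected" signal (a delayed, half-amplitude copy), the threshold used to skip frequencies where the incident spectrum is negligible, and the function name are all assumptions for illustration. NumPy's FFT uses the exp(−iωt) sign convention, which may differ from the convention of the described embodiment; the magnitude of the ratio does not depend on it.

```python
import numpy as np

def reflection_spectrum(u_r, u_i, dt, rel_floor=1e-6):
    """Return (omega, T, valid) with T = FFT(u_r) / FFT(u_i) where the
    incident spectrum is non-negligible, and NaN elsewhere."""
    U_r = np.fft.rfft(u_r)
    U_i = np.fft.rfft(u_i)
    omega = 2.0 * np.pi * np.fft.rfftfreq(len(u_i), d=dt)
    valid = np.abs(U_i) > rel_floor * np.abs(U_i).max()
    T = np.full(U_i.shape, np.nan, dtype=complex)
    T[valid] = U_r[valid] / U_i[valid]
    return omega, T, valid

# Synthetic signals at a single observation point (illustrative only).
dt = 0.01
t = np.arange(2048) * dt
u_i = np.exp(-((t - 5.0) / 0.5) ** 2) * np.cos(20.0 * (t - 5.0))
u_r = 0.5 * np.roll(u_i, 100)   # half amplitude, delayed by 100 samples

omega, T, valid = reflection_spectrum(u_r, u_i, dt)
```

For this synthetic reflector the magnitude of T is 0.5 at every frequency carried by the pulse, and the delay appears only in the phase. Restricting the ratio to the pulse bandwidth avoids dividing by a vanishing incident spectrum.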
To solve the inverse problem, find the parameters of the grating such that T(ω) is the same as (or close to) T0(ω) for a finite set of ω,
$\{\omega_n\}_{n=1}^{N}$. (0.2)
Thus, the inverse problem is a minimization problem. Therefore, q (e.g. geometrical parameters of the grating) may be found which provides:
where
In this embodiment, therefore
From definition (0.1) we have
and
and finally
where
are complex numbers calculated from the previous direct solution.
Note that the same result may be obtained with a functional with some filtering of ur. Indeed, consider functional:
with ufilt defined by following finite Fourier series with
Then, using
the following filter presentation may be obtained:
Alternatively, in a more common embodiment, weights {ak} are used, where each ak is a complex number; thus
Therefore, Parseval's identity may be obtained:
Note that if T is chosen such that the set $\{\omega_n\}_{n=1}^{N}$ from (0.2) is a subset of $\{\tilde{\omega}_k\}_{k=-K}^{K}$ (0.6) (or, at least, is approximated by a subset of (0.6)), and one takes
for this subset, and puts ak=0 for the remaining frequencies $\tilde{\omega}_k$ from set (0.6), then the same functional as (0.3) is obtained.
Filter is defined by:
Functional (0.3) may be satisfactorily used for adjoint equation. The right-hand side of the adjoint equation is given by:
The choice of functional depends on what is known about T0(ω). If one measures only |T0(ω)|, i.e. just the amplitude of T0(ω), one might use the functional:
Then
Finally, the derivative in the form (0.4)
where cn is slightly different
The right-hand side of the adjoint equation is given by (0.8). Note that this observation functional is for a single observation point. In other embodiments, an observation plane may be incorporated. For 0-order scattering from the grating:
Here Γ is a segment of observation plane, α=ω sin φ/c0, φ is an angle of the incident wave.
For m-order mode scattering from the grating, this yields:
where Λ is the grating period. The right-hand side of the adjoint equation in this case has spatial distribution
on observation plane. The wave
is damped and thus this reflection coefficient depends on choice of observation plane.
Observation Functional, Alpha/Beta Coefficients
Using objective function J defined in equation (5):
Or, alternatively:
where α(ω), β(ω) may be expressed by means of bilinear form of coefficients of Jones matrix. The Jones matrix relates TM- and TE-polarized components of the scattered field with TM- and TE-polarized components of incident field.
For a diagonal Jones matrix:
where α(ω), β(ω) are defined by
and for more complicated full Jones matrix
where A is analyzer angle and Mueller coefficients are defined by
For both embodiments, derivatives of bilinear expressions with respect to TM- and TE-polarized component of field are calculated.
Here, let ETE denote the solution with an x-polarized incident field and ETM the solution with a y-polarized incident field. Then:
where
The derivatives with respect to ErxTE, and the like, of the quadratic forms Rp, Rs may be expressed the same way as for ordinary reflection coefficients. Consider the derivative of Cd. The electromagnetic field is a real-valued vector in this context; thus complex conjugation of rss leads to a change of sign of ω
r*ss(ω)=rss(−ω)
and
Cd(ω)=Re(rpp(ω)r*ss(ω))=½[rpp(ω)rss(−ω)+rpp(−ω)rss(ω)].
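The conjugate symmetry used above, and the resulting bilinear form of Cd, can be checked numerically for real-valued time signals. The random signals standing in for the two polarization responses and the helper name `negate_frequency` are assumptions for illustration. Note the factor 1/2 that follows from Re(z) = (z + z*)/2.

```python
import numpy as np

def negate_frequency(R):
    """Return the array whose k-th entry is R[-k], with indices taken modulo N."""
    return np.roll(R[::-1], 1)

rng = np.random.default_rng(1)
r_pp_time = rng.standard_normal(64)   # real-valued stand-in signals
r_ss_time = rng.standard_normal(64)

r_pp = np.fft.fft(r_pp_time)
r_ss = np.fft.fft(r_ss_time)

# For real signals, conjugation equals a sign change of the frequency.
conj_matches_negation = np.allclose(np.conj(r_ss), negate_frequency(r_ss))

# C_d written with the conjugate, and as a bilinear form without it.
C_d_conjugate = np.real(r_pp * np.conj(r_ss))
C_d_bilinear = 0.5 * (r_pp * negate_frequency(r_ss)
                      + negate_frequency(r_pp) * r_ss)
```

The bilinear form contains no conjugation, so it can be differentiated with respect to the field components by the product rule.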
This form is bilinear and may be differentiated as:
Thus, the derivatives of J with respect to ErxTE, EryTE and ErxTM, EryTM may be calculated and it provides the right hand side of two adjoint equations for TM and TE embodiments, respectively.
Anti-Hermitian Hamiltonian Form of Equations, General
Consider the system of Maxwell's equations with the dispersion law
where χp is a Lorentz susceptibility function
Coefficients αp, βp and γp are parameters of the frequency-domain susceptibility function:
The medium polarization Pp is described by the second-order differential equation
$\ddot{P}_p + 2\alpha_p\dot{P}_p + \omega_p^2 P_p = \gamma_p\beta_p E$
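As an illustration of how this second-order equation encodes a frequency-domain susceptibility, the sketch below integrates it for a single pole driven by E = exp(iωt) and compares the steady-state ratio P/E with the closed form obtained by substituting exp(iωt) into the equation, γβ/(ωp² − ω² + 2iαω). This is a hedged example: the parameter values, the function name, and the use of a classical Runge-Kutta integrator are assumptions made for illustration, and the normalization may differ from the susceptibility formula of the specification.

```python
import cmath


def lorentz_steady_state_ratio(alpha, omega_p, gamma_beta, omega, t_end=40.0, dt=0.01):
    """Integrate P'' + 2*alpha*P' + omega_p**2 * P = gamma_beta * E with
    E = exp(i*omega*t) by classical RK4 and return P/E at the final time."""

    def rhs(t, p, v):
        e_field = cmath.exp(1j * omega * t)
        return v, gamma_beta * e_field - 2.0 * alpha * v - omega_p ** 2 * p

    p, v, t = 0j, 0j, 0.0
    for _ in range(int(round(t_end / dt))):
        k1p, k1v = rhs(t, p, v)
        k2p, k2v = rhs(t + dt / 2, p + dt / 2 * k1p, v + dt / 2 * k1v)
        k3p, k3v = rhs(t + dt / 2, p + dt / 2 * k2p, v + dt / 2 * k2v)
        k4p, k4v = rhs(t + dt, p + dt * k3p, v + dt * k3v)
        p += dt / 6 * (k1p + 2 * k2p + 2 * k3p + k4p)
        v += dt / 6 * (k1v + 2 * k2v + 2 * k3v + k4v)
        t += dt
    return p / cmath.exp(1j * omega * t)


# Illustrative values only; the damping is large enough for the transient to decay by t_end.
alpha, omega_p, gamma_beta, omega = 0.5, 2.0, 1.5, 1.5
ratio = lorentz_steady_state_ratio(alpha, omega_p, gamma_beta, omega)
chi = gamma_beta / (omega_p ** 2 - omega ** 2 + 2j * alpha * omega)
```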
Thus, additional equations for $P_p$ and $\dot{P}_p$ are added, which yields the Hamiltonian form of Maxwell's equations with Lorentz polarization in auxiliary differential equation (ADE) form:
In order to obtain an anti-Hermitian Hamiltonian, a variable transformation may be used:
and the system may be rewritten in the form
Consider the Hamiltonian H:
Define the scalar product with the components of E multiplied by ε−1. For media with damping coefficient αp=0, the Hamiltonian is anti-Hermitian with respect to this scalar product. Under adjunction, the spatial differential part of H changes sign and the polarization part of H is replaced by its transposed matrix.
The adjoint operator to
where H* is the transposed complex conjugate matrix to H. Here, adjunction is taken in space with respect to the scalar product weighted by ε−1, with boundary conditions imposed on all corresponding vectors. Additional concerns regarding boundary conditions and the right-hand side (RHS) are considered later. The adjoint equation has the form:
where Hα=−H*. Equation (0.22) may be solved in inverse time direction.
Consider Hα. If αp=0, then Hα=H and the adjoint operator is
Otherwise, if αp>0, the only difference is the sign of the bottom-right entry of the matrix (2αp instead of −2αp).
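The effect of damping on the adjoint operator can be illustrated on the polarization block alone. The sketch below assumes the scaled variables Q1 = ωp·P and Q2 = dP/dt, which turn the undriven Lorentz equation into a first-order system whose 2×2 matrix is antisymmetric when αp=0; this scaling and the helper names are assumptions for illustration and need not coincide with the variable transformation used in the specification.

```python
def conjugate_transpose(m):
    """Return the transposed complex conjugate of a square matrix given as nested lists."""
    n = len(m)
    return [[complex(m[j][i]).conjugate() for j in range(n)] for i in range(n)]


def negate(m):
    return [[-entry for entry in row] for row in m]


def polarization_block(omega_p, alpha_p):
    """First-order form of P'' + 2*alpha_p*P' + omega_p**2 * P = 0 acting on
    (Q1, Q2) = (omega_p * P, dP/dt): Q1' = omega_p*Q2, Q2' = -omega_p*Q1 - 2*alpha_p*Q2."""
    return [[0.0, omega_p], [-omega_p, -2.0 * alpha_p]]


h_lossless = polarization_block(2.0, 0.0)
h_lossy = polarization_block(2.0, 0.3)

# Adjoint operator H_a = -H*.
adjoint_lossless = negate(conjugate_transpose(h_lossless))
adjoint_lossy = negate(conjugate_transpose(h_lossy))
```

With no damping the block reproduces itself under H → −H*, while with damping only the bottom-right entry flips sign, which is why the adjoint equation is integrated in the inverse time direction.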
By changing variables and unknown functions:
Thus,
The last term in the YQ2 equation changes sign. Thus, passing from the inverse-time equation to a direct-time equation corresponds to passing from Hα to H. Finally, the adjoint equation is the same equation as the direct equation for E, H, Q1, Q2, the only difference being the RHS.
Mirror Symmetry
Consider an incident wave of the form:
exp(iαx−iβz+iωt). (0.25)
The observation functional depends on 0-order scattering from grating and is defined by:
where
where Γ is an observation plane, α=ω sin φ/c0, and φ is the angle of the incident wave. The RHS of the adjoint system is defined by the complex-conjugate derivative of the density of the observation functional
Thus,
where
In variables with “tilde”, $\tilde{t}=T-t$, $\tilde{Y}$
Thus, FRHS has the angle of incidence −φ.
Consider the case of profiles with mirror symmetry ƒ(x)=ƒ(−x). By changing variables
Thus,
$(\tilde{\nabla}\times\tilde{Y}_E)_x=(\nabla\times Y_E)_x$, and the same for $(\tilde{\nabla}\times\tilde{Y}_H)_x$,
$(\tilde{\nabla}\times\tilde{Y}_E)_y=-(\nabla\times Y_E)_y$, and likewise for all other components,
and equation formulation
This is the same equation as (0.24). The right-hand side (RHS) has form
Thus, the right-hand side (RHS) has the same angle φ of incidence as (0.25).
Taking the incident wave in the form:
sin(αx−βz+ωt) (0.30)
and replacing exp with sin in definition (0.26), real-valued functions may be computed without the complex-conjugate operation. The RHS function is in the form
If the observation functional depends only on 0-order scattering from the grating, the adjoint solution may be computed as follows: 1) use the forward-time solution to obtain the frequency response of the system (the frequency response of the adjoint system is identical due to mirror symmetry); 2) apply the frequency response to the RHS of the adjoint system; and 3) take the inverse Fast Fourier Transform (FFT) of the result of step 2) to obtain the time-series response of the adjoint system (needed to estimate gradients of the observation functional).
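The three steps above can be sketched on a toy linear time-invariant system. In this hedged example, a one-pole recursion stands in for the Maxwell solver, the RHS samples are arbitrary, and a naive discrete Fourier transform replaces the FFT so that the block stays self-contained; none of these choices comes from the specification.

```python
import cmath


def dft(x, inverse=False):
    """Naive O(N^2) discrete Fourier transform (stand-in for an FFT)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = [
        sum(x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]
    return [v / n for v in out] if inverse else out


def time_step(source, decay=0.5):
    """Toy forward-time solver: y[n] = decay * y[n-1] + source[n]."""
    response, previous = [], 0.0
    for s in source:
        previous = decay * previous + s
        response.append(previous)
    return response


n = 64
impulse = [1.0] + [0.0] * (n - 1)
adjoint_rhs = [0.0, 1.0, -0.5, 0.25] + [0.0] * (n - 4)  # illustrative RHS samples

# Step 1: one forward-time run gives the frequency response of the system.
frequency_response = dft(time_step(impulse))
# Step 2: apply the frequency response to the spectrum of the adjoint RHS.
product = [h * s for h, s in zip(frequency_response, dft(adjoint_rhs))]
# Step 3: the inverse transform recovers the time series of the adjoint response.
adjoint_series = [v.real for v in dft(product, inverse=True)]

# Reference: drive the same toy solver directly with the adjoint RHS.
reference = time_step(adjoint_rhs)
max_error = max(abs(a - b) for a, b in zip(adjoint_series, reference))
```

The record length is chosen long enough that the impulse response has decayed, so the circular convolution implied by the transform matches the direct time stepping.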
Boundary Conditions
To conjugate the differential part of Hamiltonian (0.21) boundary conditions are applied. Let the geometry of the equation be Λ-periodic with respect to x and unbounded in z direction.
Making conjugation to
Boundary conditions to adjoint variable Y may be obtained from equality
In the frequency domain the condition
Ez(x+Λ,ω)exp(−ikxΛ)=Ez(x,ω) (0.32)
exists and from (0.31):
Y(x+Λ,ω)exp(−ikxΛ)=Y(x,ω)
In the embodiment including mirror symmetry, a change of variables is made: $\tilde{x}=-x$, $\tilde{t}=T-t$. Thus, formally the condition changes to
$\tilde{Y}(\tilde{x}+\Lambda,\omega)\exp(ik_x\Lambda)=\tilde{Y}(\tilde{x},\omega)$.
The complex-conjugate function $\tilde{Y}(\omega)^*$ may be taken, which satisfies condition (0.32). Because the solution of the original equation is real-valued, both frequencies ω and −ω are available, and they are related by $\tilde{Y}(-\omega)=\tilde{Y}(\omega)^*$. Thus the condition of periodicity remains the same for the adjoint solution.
In order to consider the z-direction, the originally unbounded domain from −∞ to ∞ is used and set such that all functions vanish in both directions. Therefore, the same boundary conditions may be used for the adjoint equation.
Observation Functional Gradients
In another embodiment, consider formulation of the adjoint solution based on:
and the observation functional reads as follows:
Variation of (0.33) yields:
and the observation functional variation is estimated by
The adjoint equation is
Functional variation may be obtained in the form:
Formula (0.35) yields a gradient of the functional. Evidently, to calculate the functional, the direct and adjoint solutions must be known in the non-zero region of the operator
For example, to calculate the right-hand side (RHS) of the adjoint equation, c(ωn), which is a function of the direct solution, must be known.
Now consider the frequency domain formulation in order to optimize the time domain formulation. For each prescribed frequency ωn from a finite set $\{\omega_n\}_{n=1}^{N}$, a pair of direct and adjoint equations is provided as
where Tω is the boundary operator at the top boundary far from domain of interest, Xωi is the incident wave, and ƒω is the delta-function of observation plane with coefficient
The observation functional variation may be written in the form:
In this embodiment, to solve the direct frequency-domain equation, an artificial time domain calculation is made with a smooth incident pulse X0. Fourier components are then extracted and scaled in order to obtain {Xω} for each $\omega\in\{\omega_n\}_{n=1}^{N}$ (this set prescribes the parameters of the incident pulse). Doing the same for the adjoint frequency-domain equation provides {Yω} for $\omega\in\{\omega_n\}_{n=1}^{N}$. Solving the time domain equation with some pulse, then using the Fourier transformation and scaling by
Thus, the functional variation (0.35) and the time domain adjoint equation (0.34) with RHS from 0 to T are not needed. Instead, the functional (0.37) and an artificial adjoint equation with a short pulse are used, as for the direct equation.
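The pulse-and-scale procedure can be illustrated with a scalar toy problem. The sketch below drives x' = −a·x + s(t) once with a smooth Gaussian pulse, accumulates Fourier components of both the response and the pulse at a few prescribed frequencies during time stepping, and divides one by the other to recover the per-frequency solution 1/(iω + a). The scalar model, the pulse shape, and all names are assumptions chosen for illustration, not the equations of the specification.

```python
import cmath
import math


def pulse(t, center=5.0, width=1.0):
    """Smooth incident pulse (Gaussian), effectively zero at t = 0."""
    return math.exp(-((t - center) / width) ** 2)


def normalized_response(frequencies, decay=1.0, t_end=30.0, dt=0.005):
    """Run one time-domain simulation of x' = -decay*x + pulse(t) and return
    X(w)/S(w) for every prescribed frequency w."""
    x, t = 0.0, 0.0
    x_hat = [0j] * len(frequencies)
    s_hat = [0j] * len(frequencies)
    for _ in range(int(round(t_end / dt))):
        # Running Fourier sums of the response and of the pulse.
        for i, w in enumerate(frequencies):
            phase = cmath.exp(-1j * w * t) * dt
            x_hat[i] += x * phase
            s_hat[i] += pulse(t) * phase
        # Classical RK4 step for the toy state equation.
        k1 = -decay * x + pulse(t)
        k2 = -decay * (x + 0.5 * dt * k1) + pulse(t + 0.5 * dt)
        k3 = -decay * (x + 0.5 * dt * k2) + pulse(t + 0.5 * dt)
        k4 = -decay * (x + dt * k3) + pulse(t + dt)
        x += dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
        t += dt
    return [xh / sh for xh, sh in zip(x_hat, s_hat)]


frequencies = [0.5, 1.0, 2.0]  # illustrative set of prescribed frequencies
responses = normalized_response(frequencies)
expected = [1.0 / (1j * w + 1.0) for w in frequencies]
```

A single short time-domain run thus serves all prescribed frequencies, provided the pulse spectrum is not negligible at any of them.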
Model-Based Metrology Using Objective Function
As disclosed herein, an adjoint-based technique may be advantageously used in determining the parameter values that minimize the objective function. Specific embodiments of this technique are discussed above in relation to
The steps for determining the parameter values that minimize the objective function may include computing a solution to state equations driven by a function representing the incident electromagnetic radiation, wherein the state equations are derived from Maxwell's equations with dispersive materials, and computing a solution to an adjoint to the state equations.
In one implementation, computing the solution to the state equations may be performed in the frequency domain using a banded matrix linear system solver. Advantageously, the solution to the adjoint to the state equations may then be computed by re-using LU-decomposition factors from the banded matrix linear system solver. In addition, derivatives of the state equations with respect to local parameters may be computed, and a mapping may be computed between derivatives of the state equations with respect to the local parameters and derivatives of the state equations with respect to the geometric parameters.
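Re-use of the LU factors for the adjoint solve can be sketched as follows. If A = LU, then the conjugate transpose is A* = U*L*, so the adjoint system is solved by one forward substitution with U* and one back substitution with L*, with no new factorization. This is a minimal illustration on a small tridiagonal matrix with made-up entries; the dense Doolittle factorization without pivoting is an assumption made for brevity and does not exploit the band structure the way a banded solver would.

```python
def lu_factor(a):
    """Doolittle LU factorization without pivoting: a = lower * upper."""
    n = len(a)
    lower = [[0j] * n for _ in range(n)]
    upper = [[0j] * n for _ in range(n)]
    for i in range(n):
        lower[i][i] = 1 + 0j
        for j in range(i, n):
            upper[i][j] = a[i][j] - sum(lower[i][k] * upper[k][j] for k in range(i))
        for j in range(i + 1, n):
            lower[j][i] = (a[j][i] - sum(lower[j][k] * upper[k][i] for k in range(i))) / upper[i][i]
    return lower, upper


def conjugate_transpose(m):
    n = len(m)
    return [[m[j][i].conjugate() for j in range(n)] for i in range(n)]


def forward_substitution(tri, rhs):
    """Solve a lower-triangular system."""
    x = []
    for i, row in enumerate(tri):
        x.append((rhs[i] - sum(row[k] * x[k] for k in range(i))) / row[i])
    return x


def back_substitution(tri, rhs):
    """Solve an upper-triangular system."""
    n = len(tri)
    x = [0j] * n
    for i in reversed(range(n)):
        x[i] = (rhs[i] - sum(tri[i][k] * x[k] for k in range(i + 1, n))) / tri[i][i]
    return x


def matvec(m, v):
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in m]


# Illustrative diagonally dominant tridiagonal (banded) system matrix.
matrix = [
    [4 + 1j, 1 + 0j, 0j, 0j],
    [1 + 0j, 4 + 0j, 1 - 1j, 0j],
    [0j, 2j, 5 + 0j, 1 + 0j],
    [0j, 0j, 1 + 0j, 3 - 1j],
]
source = [1 + 0j, 2j, -1 + 0j, 0.5 + 0j]   # RHS of the state equation
weights = [1 + 0j, 0j, 1j, -2 + 0j]        # RHS of the adjoint equation

lower, upper = lu_factor(matrix)           # factor once
state = back_substitution(upper, forward_substitution(lower, source))
adjoint = back_substitution(
    conjugate_transpose(lower),
    forward_substitution(conjugate_transpose(upper), weights),
)
```

In a production setting, a banded LAPACK-style solver typically offers the same re-use through a conjugate-transpose option on its triangular solve routine.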
In another implementation, computing the solution to the state equations may be performed in the time domain. In such an implementation, the technique may compute a description of grating boundaries via control parameters. The technique may also compute a derivative of the objective function with respect to the control parameters, wherein the computation involves computing an inner product between the state and adjoint equations.
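How an inner product of state and adjoint solutions yields the derivative of an objective function can be shown on a small algebraic stand-in. The sketch assumes a parameterized linear state equation A(p)x = b and a linear objective J = cᵀx; then dJ/dp_k = −yᵀ(∂A/∂p_k)x, where y solves the adjoint system Aᵀy = c. The 2×2 matrix, the vectors, and the function names are hypothetical. One state solve plus one adjoint solve gives every derivative, whereas the finite-difference check needs additional solves for each parameter.

```python
def solve_2x2(m, rhs):
    """Solve a 2x2 linear system by Cramer's rule."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [
        (rhs[0] * m[1][1] - m[0][1] * rhs[1]) / det,
        (m[0][0] * rhs[1] - rhs[0] * m[1][0]) / det,
    ]


def system_matrix(params):
    """Hypothetical state operator; each parameter enters one diagonal entry."""
    return [[2.0 + params[0], 1.0], [0.5, 3.0 + params[1]]]


SOURCE = [1.0, 2.0]     # stands in for the incident-field driving term
WEIGHTS = [1.0, -1.0]   # stands in for the derivative of the objective w.r.t. the state


def objective(params):
    state = solve_2x2(system_matrix(params), SOURCE)
    return WEIGHTS[0] * state[0] + WEIGHTS[1] * state[1]


def adjoint_gradient(params):
    a = system_matrix(params)
    state = solve_2x2(a, SOURCE)
    transposed = [[a[0][0], a[1][0]], [a[0][1], a[1][1]]]
    adjoint = solve_2x2(transposed, WEIGHTS)
    # dA/dp0 is 1 at entry (0, 0) and dA/dp1 is 1 at entry (1, 1),
    # so -y^T (dA/dp_k) x reduces to -y[k] * x[k].
    return [-adjoint[0] * state[0], -adjoint[1] * state[1]]


def finite_difference_gradient(params, step=1e-6):
    gradient = []
    for k in range(len(params)):
        up, down = list(params), list(params)
        up[k] += step
        down[k] -= step
        gradient.append((objective(up) - objective(down)) / (2 * step))
    return gradient


params = [0.3, -0.2]
grad_adjoint = adjoint_gradient(params)
grad_fd = finite_difference_gradient(params)
```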
The technique disclosed herein may be applied to a grating layer patterned along one direction. For example, such a grating layer may comprise lines of gates for transistors being fabricated.
The technique disclosed herein may be extended by one of skill in the art and applied to a grating layer patterned in two dimensions. For example, such a grating layer may comprise cells for flash memory being fabricated.
As seen in the graph, computational time for the finite difference technique of the prior art (802) at a given level of accuracy scales in proportion to the third power of the number of parameters. In contrast, the computational time for the “adjoint” technique (804) at a given level of accuracy scales linearly or approximately linearly with the number of parameters in accordance with an embodiment of the present invention. This advantageously enables the adjoint technique to be performed for a larger number of parameters. In other words, the adjoint technique scales better with the number of parameters than the prior finite difference technique.
The present disclosure provides various new aspects and features over prior techniques. These new aspects and features include at least the following. (i) A frequency domain solution of the adjoint method is formulated using rigorous coupled wave analysis (RCWA). (ii) Parametric derivatives used in optical CD regression are computed using an adjoint method. (iii) Fréchet matrix derivatives in the optical CD adjoint method are computed using a spectral method. (iv) Caching of the Fréchet derivative direction in RCWA. (v) Computation of boundary conditions using S-matrix propagation. (vi) Analytical derivatives of quotients of quadratic forms involving the RCWA state solution are computed. (vii) An analytical form describing a trapezoid is used to produce a local parameter to structural parameter diffeomorphism. (viii) A slabbing algorithm is used to produce a local parameter to structural parameter map. (ix) Measurement sensitivities (derivatives) in a Taylor's expansion are used to perform Numerical Aperture averaging. (x) Measurement sensitivities (derivatives) in a Taylor's expansion are used to compute the measurement spectra at neighboring wavelengths. (xi) The Adjoint solution in the RCWA-based Frequency Domain solution is accelerated using the LU factors from a banded matrix solver. (xii) A time-domain method for computing objective function derivatives is formulated with regard to structural parameters by computing the solution to an adjoint equation. And, (xiii) A time-domain method for computing objective function derivatives is formulated with regard to structural parameters that exploits mirror symmetry to speed up the solution of the adjoint equation.
In various embodiments of the invention, the operations discussed herein (e.g., with reference to
Reference in the specification to “one embodiment” or “an embodiment” means that a particular feature, structure, or characteristic described in connection with the embodiment may be included in at least an implementation. The appearances of the phrase “in one embodiment” in various places in the specification may or may not be all referring to the same embodiment.
Although embodiments have been described in language specific to geometric features and/or methodological acts, it is to be understood that claimed subject matter may not be limited to the specific features or acts described. Rather, the specific features and acts are disclosed as sample forms of implementing various embodiments. While the invention has been described above in conjunction with one or more specific embodiments, it should be understood that the invention is not intended to be limited to one embodiment. The invention is intended to cover alternatives, modifications, and equivalents as may be included within the spirit and scope of the invention, such as those defined by the appended claims.