The present invention relates in general to manufacturing processes that require lithography and, in particular, to methods of designing photomasks and optimizing lithographic and etch processes used in microelectronics manufacturing.
Today's integrated circuits (ICs) contain features which are not easily resolved by available lithography tools, so proper printing of critical feature dimensions (CDs) requires that compensating adjustments be made to the IC shapes deployed on the mask. In so-called model-based optical-proximity-correction (MBOPC), the appropriate adjustments are determined by simulating the lithographic process, and in particular, providing a means to simulate the image at the wafer plane. Ultimately, a determination of the image formed in the resist layer (e.g. the latent resist image) is desired, but often the aerial image at the wafer plane is used as an approximation for the latent image in the resist.
Conventional image simulation is typically done using the Hopkins integral for scalar partially coherent image formation, where the expression for the aerial image intensity I0 is given by,
$$I_0(\vec{r}) = \iiiint d\vec{r}'\,d\vec{r}''\; h(\vec{r}-\vec{r}')\,h^*(\vec{r}-\vec{r}'')\,j(\vec{r}'-\vec{r}'')\,m(\vec{r}')\,m^*(\vec{r}''), \qquad \text{(Equation 1)}$$
where h is the lens impulse response function (point spread function), j is the mutual coherence of the illumination at the mask, m is the mask transmission function, and the asterisk denotes complex conjugation.
This 4-dimensional (4D) Hopkins integral (Equation 1) may be approximated as an incoherent sum of 2-dimensional (2D) coherent image integrals. This method of computing the Hopkins integral is known as the sum of coherent systems (SOCS) procedure. In the SOCS procedure, an optimal n-term approximation to the partially coherent Hopkins integral is:
$$I_0(\vec{r}) \approx \sum_{k=1}^{n} \lambda_k \left| (m \otimes \phi_k)(\vec{r}) \right|^2, \qquad \text{(Equation 2)}$$
where ⊗ represents the two-dimensional (2D) convolution operation, and λk, φk({right arrow over (r)}) represent the kth eigenvalue and eigenfunction of the Hopkins kernel, respectively, derived from the Mercer expansion of:
$$h(\vec{r}')\,h^*(\vec{r}'')\,j(\vec{r}'-\vec{r}'') = \sum_{k=1}^{\infty} \lambda_k\, \phi_k(\vec{r}')\,\phi_k^*(\vec{r}''), \qquad \text{(Equation 3)}$$
which suggests that a partially coherent imaging problem can be optimally approximated by a finite sum of coherent images obtained, for example, by linear convolution. Typically, the source and the mask polygons are decomposed (e.g. into grids or sectors), and each field image is computed as an incoherent sum of coherent sub-images (also referred to as component-images, or pre-images). The total intensity at an image point {right arrow over (r)} in question is then the sum over all component images. In the SOCS approximation, the number of coherent sub-images that must be calculated is minimized, for example, by diagonalizing the image matrix to achieve an acceptable approximate matrix of minimal rank by eigenvalue decomposition. For example, even a large-fill source can be adequately approximated when the number of 2D convolutions n is about 10.
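As a rough illustration of the SOCS idea, the following Python sketch discretizes a one-dimensional analogue of the Hopkins kernel, takes its eigendecomposition, and compares the n-term sum of coherent images with the direct bilinear sum. The sinc impulse response, the Gaussian coherence function, the grid, the test mask, and all function names are assumptions chosen for illustration; they are not taken from the method described above.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def hopkins_kernel_1d(x, coherence_length=1.5):
    """Toy kernel W(u, v) = h(u) h*(v) j(v - u); h and j are assumed forms."""
    h = np.sinc(x)  # assumed coherent impulse response
    j = np.exp(-0.5 * ((x[None, :] - x[:, None]) / coherence_length) ** 2)
    return h[:, None] * np.conj(h)[None, :] * j


def dominant_kernels(W, n_terms):
    """Return the n_terms largest eigenvalues and the matching eigenvectors."""
    lam, phi = np.linalg.eigh(W)  # eigh returns eigenvalues in ascending order
    keep = np.argsort(lam)[::-1][:n_terms]
    return lam[keep], phi[:, keep]


def image_direct(W, mask):
    """Direct bilinear sum: I(r) = sum_{u,v} W(u,v) m(r-u) m*(r-v)."""
    windows = sliding_window_view(mask, W.shape[0])[:, ::-1]
    return np.einsum("pu,uv,pv->p", windows, W, windows.conj()).real


def image_socs(lam, phi, mask):
    """Incoherent sum of coherent images: sum_k lam_k |(m conv phi_k)(r)|^2."""
    image = 0.0
    for lam_k, phi_k in zip(lam, phi.T):
        field = np.convolve(mask, phi_k, mode="valid")
        image = image + lam_k * np.abs(field) ** 2
    return image


x = np.linspace(-4.0, 4.0, 41)
W = hopkins_kernel_1d(x)
mask = np.zeros(120)
mask[30:50] = 1.0
mask[70:78] = 1.0
reference = image_direct(W, mask)
for n_terms in (2, 8, 41):
    lam, phi = dominant_kernels(W, n_terms)
    error = np.max(np.abs(image_socs(lam, phi, mask) - reference))
    print(n_terms, error)
```

With all 41 terms the two computations agree to rounding error; with fewer terms the error depends on how quickly the eigenvalues of the chosen kernel decay.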
The mask transmission function m can be approximated by a binary mask design pattern, usually consisting of polygons with one or more transmission values, and can be represented in different ways, such as grid cells or a sector decomposition. The image at a point {right arrow over (r)} may then be represented by a finite incoherent sum of weighted coherent convolutions of the mask transmission function m and the eigenfunctions φk. Each of the eigenfunctions, and its convolution with a possible mask sector, may be pre-computed, thus providing a fast method of computing the aerial image I0. The aerial image is often assumed to be an adequate approximation of the resist latent image. To account for resist effects, previous methods have post-processed the aerial image by applying a lumped parameter model or by applying resist blurring. However, in extending model-based optical-proximity-correction (MBOPC) to the sub-100 nm dimensions of next-generation IC products, the prior art has a number of limitations.
For example, the scalar treatment is applicable for numerical apertures (NAs) less than about 0.7, where the angle between interfering orders is fairly small in resist, so that the electric fields in different beams are almost parallel or anti-parallel when they interfere. Under these circumstances, superposition is almost scalar. However, numerical apertures (NAs) of at least 0.85 must be employed when extending optical lithography to the sub-100 nm dimensions. At the resulting steep obliquities the standard scalar Hopkins integral becomes inaccurate, and the vector character of the electric field must be considered. It is known that this can be accomplished by calculating independent images in each of the global Cartesian coordinates of the electric field (Ex, Ey, Ez), and then summing these images over each of the allowed orientations of the illuminating polarization, and again over each of the different source directions that illuminate the mask. However, the computational efficiency of this procedure is not adequate for MBOPC.
Adam et al. (K. Adam, Y. Granik, A. Torres and N. B. Cobb, “Improved modeling performance with an adapted vectorial formulation of the Hopkins imaging equation,” in SPIE vol. 5040, Optical Microlithography XVI, ed. Anthony Yen (2003), p. 78–91) and Hafeman et al. (in SPIE vol. 5040, Optical Microlithography XVI, ed. Anthony Yen (2003), p. 700) disclosed an extension of the scalar Hopkins imaging equation to include vectorial addition of the electromagnetic (EM) field inside a film. However, the approaches of Adam et al. and Hafeman et al. ignore the z-component of the field, and do not take into account effects such as lens birefringence, tailored source polarization, or blur from the resist or mask.
The scalar treatment of image formation ignores illumination polarization, and assumes that the lens and resist surfaces introduce negligible partial polarization. However, OPC for future lithography will also have to take into account the polarization properties of the lens itself. As NA and lens complexity increase, the polarization state of the output beam is changed by cumulative polarization-dependent reflection losses that arise at the surfaces of the lens elements. Attainable performance in the antireflection coatings that inhibit these losses will become increasingly poor as wavelength decreases into the deep ultraviolet (UV). Forthcoming 157 nm lenses will be birefringent even in their bulk substrates, due to spatial dispersion in the element substrates. Skew incidence at beamsplitter and other coatings will distort the light polarization. There is also interest in sources that deliberately introduce polarization variations between rays in order to minimize the contrast loss that arises in transverse magnetic (TM) polarization at high NA (i.e. polarization-tailored sources). The exposed image is also impacted by the resist film stack, due to multiple polarization-dependent reflections between the various interfaces. The anti-reflective (AR) films that inhibit these reflections can be less effective over the broad angular ranges that arise at high NA. Refraction at the top surface of the resist gives rise to spherical aberration in the transmitted image.
Finally, minimum feature sizes in next generation ICs are beginning to approach the resolution of the photoresist, and this must be taken into account during MBOPC.
Methods are known for modeling each of these phenomena over mask areas of modest size. Vector images can be calculated by summing over all illumination and image-plane polarizations, and over all source points.
Resist blur may occur due to the finite resolution of the resist. Propagation in spatially dispersive media has been treated in the physics literature, and this analysis has now been applied to 157 nm lithographic lenses. Finite resist resolution can be treated in an approximate way by blurring the exposed optical image using a resist kernel, which in the frequency domain is equivalent to frequency filtering the 4D Hopkins integral. It is known that a post-exposure blurring in the resist arises in the chemical image that is acted on by the resist developer; this resist blurring can be accounted for by convolution of the optical image with a blur function, or equivalently by attenuation of the image spatial frequency content by a modulation transfer function (for example, see J. Garofalo et al., “Reduction of ASIC Gate-Level line-end shortening by Mask Compensation,” in SPIE v.2440, Optical/Laser Microlithography VIII, ed. Timothy A. Brunner (SPIE, 1995), p. 171). Hoffnagle et al. (in “Method of measuring the spatial resolution of a photoresist,” Optics Letters 27, no. 20 (2002): p. 1776) have shown that the resist modulation transfer function (also known as the effective resist MTF) can be determined for a particular spatial frequency from critical dimension (CD) measurements in resist that has been exposed to a pair of interfering plane waves (this interference pattern exhibiting essentially 100% optical contrast). The latent image contrast deduced in this way can be substantially less than 100%, and this modulation loss becomes steadily more important as feature sizes come closer to the resolution limit of resists. For example, at a pitch of 225 nm, the UV2 resist analyzed in Hoffnagle et al. transfers to the latent chemical image only about 50% of the modulation in the exposing optical image. More modest contrast losses arise even at relatively coarse spatial frequencies, and can give rise to proximity effects of relatively long range.
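The equivalence between blurring by convolution and attenuation by a modulation transfer function can be sketched numerically. The Python example below assumes a Gaussian blur function (an assumption made only for illustration, not a claim about any particular resist) and picks its width so that the Gaussian MTF equals 0.5 at a 225 nm pitch, mirroring the figure quoted above. The function names and the sampling grid are hypothetical.

```python
import numpy as np


def gaussian_mtf(freq, sigma):
    """MTF of a Gaussian blur of standard deviation sigma (assumed blur model)."""
    return np.exp(-2.0 * (np.pi * sigma * freq) ** 2)


def blur_image(image, dx, sigma):
    """Blur a periodic 1D image by multiplying its spectrum by the MTF.

    This is the frequency-domain form of a circular convolution with the
    Gaussian blur function.
    """
    freqs = np.fft.fftfreq(image.size, d=dx)
    return np.fft.ifft(np.fft.fft(image) * gaussian_mtf(freqs, sigma)).real


def contrast(image):
    """Modulation (max - min) / (max + min) of an intensity profile."""
    return (image.max() - image.min()) / (image.max() + image.min())


pitch = 225.0                                   # nm, pitch quoted in the text
dx = 0.5                                        # nm, assumed sampling step
x = np.arange(9000) * dx                        # exactly 20 periods
optical = 1.0 + np.cos(2.0 * np.pi * x / pitch) # 100% contrast fringe pattern

# Width chosen so that the Gaussian MTF is 0.5 at a spatial frequency of 1/pitch.
sigma = pitch * np.sqrt(np.log(2.0) / 2.0) / np.pi
latent = blur_image(optical, dx, sigma)

print(contrast(optical), contrast(latent))
```

Because the input is a single sinusoid on a periodic grid, the contrast of the blurred profile equals the MTF value at that spatial frequency.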
However, such methods to account for resist blurring involve a direct spatial-domain convolution of the resist blur function with the continuous optical image, which is relatively slow to compute. By contrast, polygons of the kind that are deployed on the mask are relatively simple to handle and relatively fast to compute. Future MBOPC will only be practical if a way can be found to calculate these effects very quickly.
Other aspects of the MBOPC process (e.g. etch simulation, simulation of mask electromagnetic effects, polygon edge adjustment, as well as many non-lithographic issues) are not directly considered in the present discussion; however, it is important to bear in mind that lithographic simulation is only one small aspect of the full IC design process. Mask polygons must be stored in a recognized data structure of complex hierarchical organization; the patterns are arrived at after considerable effort that involves a long succession of circuit and device design software. Additional processing software converts the post-OPC shapes to a format used in mask writing. Because of the complexity of this computer-aided design (CAD) process, it has become customary to use suites of software design tools (including tools for MBOPC) to avoid problems when migrating the chip data from one stage of the design process to the next. Improved MBOPC methods must be compatible with this design flow. It is also desirable that improved methods for lithographic simulation be compatible with existing OPC programs.
The SOCS method gives a fast and accurate enough algorithm for computing the scalar aerial image (AI). This would be enough to satisfy and even exceed any practical requirements if only the AI was needed as the final output of the algorithm. However, because of the presence of the resist development step, further computation is needed to get the resist image from the AI.
Images obtained using the SOCS method are calculated by pre-storing the corresponding images of all possible semi-infinite sectors from which the mask polygons are composed. The images of polygonal mask features can therefore be calculated very rapidly.
These pre-stored tables are based only on the scalar Hopkins integral. To take into account such phenomena as resist resolution, vector imaging, resist thin-film effects, illumination polarization, and lens birefringence, we need to obtain new tables that are able to reproduce the effects of these phenomena as if the phenomena arose from an incoherent sum of coherent images.
Accordingly, there is a need for an efficient method of simulating images that more accurately includes non-scalar effects (i.e. “non-Hopkins” effects), including vector electric field and polarization effects, and that is applicable to computing a variety of images including an aerial image or a resist image. In addition, it would be desirable to implement such a method so that it can be incorporated into existing computer codes without significant restructuring of the code.
Bearing in mind the problems and deficiencies of the prior art, it is therefore an object of the present invention to provide a fast method for computing images that takes into account non-scalar (i.e. “non-Hopkins”) effects.
It is a further objective of the present invention to provide a fast method of computing non-scalar images that can be incorporated into existing computer codes without significant restructuring of such codes.
Still other objects and advantages of the invention will in part be obvious and will in part be apparent from the specification.
In accordance with the present invention, an efficient method and system is provided for computing lithographic images that takes into account vector effects such as lens birefringence, resist stack effects and tailored source polarizations, and may also include blur effects of the mask and the resist. In accordance with the present invention, these effects are included by forming a generalized bilinear kernel, which can then be treated using a SOCS decomposition to allow rapid computation of an image that includes such non-scalar effects. A source is provided that is characterized by an intensity distribution S({right arrow over (k)}s), and that may be characterized by independent polarizations m, each of which is characterized by a polarization map. A projection impulse response function {right arrow over (h)}m is provided, which preferably includes the vector impulse response function of the lens including lens birefringence, but may also be generalized to include vector effects in the resist film stack. Furthermore, in the case of a source with tailored polarization, the projection impulse response function {right arrow over (h)} may also include a source polarization map. A generalized bilinear kernel V({right arrow over (r)}′,{right arrow over (r)}″) is formed by forming a bilinear autocorrelation of the source intensity distribution S({right arrow over (k)}s) with the vector impulse response function {right arrow over (h)}({right arrow over (k)}′,{right arrow over (k)}s). A generalized bilinear kernel can be formed to include mask blur. Resist blur may be included in a generalized bilinear kernel, for example, where averaging over the depth of the resist stack and/or a range of focus positions may be performed. Preferably, a SOCS decomposition of the generalized bilinear kernel is performed.
The generalized bilinear kernel V is formed so as to be independent of the mask transmission function m({right arrow over (r)}). In another aspect of the present invention, the image I({right arrow over (r)}) at a point {right arrow over (r)} is then computed by combining, or more particularly, performing a bilinear integration of the generalized bilinear kernel with the mask transmission function. The resulting image I({right arrow over (r)}) can then be used to perform model-based optical proximity correction (MBOPC).
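The structure of such a mask-independent kernel can be sketched in a few lines of Python. In this toy example, random complex numbers stand in for the vector impulse responses (an assumption made purely for illustration), the array layout and the names `generalized_kernel` and `bilinear_image` are hypothetical, and the kernel is sampled on a small 1D set of pixels. The point of the sketch is only that summing outer products over source points, source polarizations, and Cartesian field components gives a single Hermitian kernel whose bilinear form with the mask reproduces the explicit vector sum.

```python
import numpy as np


def generalized_kernel(h, source_weights):
    """Form V[u, v] = sum_s S_s sum_m sum_c h[m, s, c, u] * conj(h[m, s, c, v]).

    h has shape (n_pol, n_src, 3, n_pix): source polarization m, source
    point s, Cartesian field component c, and kernel pixel u.
    """
    return np.einsum("s,mscu,mscv->uv", source_weights, h, h.conj())


def bilinear_image(V, window):
    """Intensity at one image point: sum_{u,v} V[u,v] w[u] conj(w[v])."""
    return np.real(np.einsum("u,uv,v->", window, V, window.conj()))


def explicit_vector_image(h, source_weights, window):
    """Reference: sum |field|^2 over polarizations, source points, components."""
    fields = np.einsum("mscu,u->msc", h, window)
    return np.einsum("s,msc->", source_weights, np.abs(fields) ** 2)


rng = np.random.default_rng(0)
n_pol, n_src, n_pix = 2, 5, 9
shape = (n_pol, n_src, 3, n_pix)
h = rng.normal(size=shape) + 1j * rng.normal(size=shape)   # stand-in responses
source_weights = rng.uniform(0.5, 1.0, size=n_src)         # stand-in S(k_s)
window = rng.integers(0, 2, size=n_pix).astype(float)      # mask samples m(r - u)

V = generalized_kernel(h, source_weights)                  # computed once, mask-free
print(bilinear_image(V, window), explicit_vector_image(h, source_weights, window))
```

Since V is Hermitian and positive semidefinite by construction, it admits the same kind of eigendecomposition that the SOCS procedure applies to the scalar Hopkins kernel.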
In a preferred embodiment of the present invention, the SOCS decomposition of the generalized bilinear kernel may be performed. First the image plane within the region of interest (ROI) integration domain is gridded. If possible, in accordance with the invention, the ROI integration domain is preferably folded according to the symmetry of the system. The generalized bilinear kernel (GBK) is computed and tabulated at the grid points of the image plane, in the folded ROI integration domain. In this preferred embodiment, the tabulated GBK values are remapped to a reduced basis, and then the eigenfunctions of the GBK are calculated in the reduced basis, and converted back to the original grid. If necessary, the dominant eigenfunctions may be iteratively refined against the tabulated GBK values, for example, as in the Lanczos method. Then, the convolutions of the dominant eigenfunctions with the possible mask polygon sectors are pre-computed.
According to yet another aspect of the present invention, the computation of the image includes the step of decomposing the mask transmission function into an appropriate set of the possible polygon sectors. Then, for each of the dominant eigenfunctions, a pre-image is computed by forming the coherent sum of the contributions from the pre-computed convolutions of the dominant eigenfunctions with the mask polygon sectors within the ROI. The pre-image is weighted, preferably by the eigenvalue of the eigenfunction, and in particular, more preferably by the square root of the eigenvalue. Alternatively, the weights may be determined empirically. Finally, the image I({right arrow over (r)}) is formed from the incoherent sum of the weighted pre-images of all the dominant eigenfunctions.
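The table look-up step can be illustrated with a small Python sketch. It assumes axis-aligned, non-overlapping rectangles on a pixel grid, and it stores the convolution of each kernel with a quarter-plane sector as a 2D prefix sum; the field from a rectangle then follows from four corner look-ups. The kernels, eigenvalues, and function names here are hypothetical placeholders, and general polygon sectors would need a more elaborate table than the one shown.

```python
import numpy as np


def sector_table(phi):
    """Exclusive 2D prefix sum: table[a, b] = sum of phi[:a, :b].

    Each entry is the convolution of the kernel with a quarter-plane sector
    whose corner sits at the corresponding offset.
    """
    table = np.zeros((phi.shape[0] + 1, phi.shape[1] + 1), dtype=phi.dtype)
    table[1:, 1:] = phi.cumsum(axis=0).cumsum(axis=1)
    return table


def corner(table, radius, dy, dx):
    """Look up the sector contribution for a corner at offset (dy, dx)."""
    a = int(np.clip(dy + radius, 0, table.shape[0] - 1))
    b = int(np.clip(dx + radius, 0, table.shape[1] - 1))
    return table[a, b]


def rect_field(table, radius, point, rect):
    """Coherent field at `point` from one rectangle (y0, y1, x0, x1), half-open."""
    py, px = point
    y0, y1, x0, x1 = rect
    hi_y, lo_y = py - y0 + 1, py - y1 + 1
    hi_x, lo_x = px - x0 + 1, px - x1 + 1
    return (corner(table, radius, hi_y, hi_x) - corner(table, radius, lo_y, hi_x)
            - corner(table, radius, hi_y, lo_x) + corner(table, radius, lo_y, lo_x))


def image_at(point, rects, tables, eigenvalues, radius):
    """Incoherent sum over kernels of |sqrt(eigenvalue) * coherent pre-image|^2."""
    intensity = 0.0
    for table, lam in zip(tables, eigenvalues):
        pre_image = np.sqrt(lam) * sum(rect_field(table, radius, point, r) for r in rects)
        intensity += abs(pre_image) ** 2
    return intensity


radius = 3                                            # kernel half-width in pixels
rng = np.random.default_rng(1)
size = 2 * radius + 1
kernels = [rng.normal(size=(size, size)) + 1j * rng.normal(size=(size, size))
           for _ in range(2)]                         # stand-in eigenfunctions
eigenvalues = [1.0, 0.3]                              # stand-in eigenvalues
tables = [sector_table(k) for k in kernels]
rects = [(2, 5, 1, 4), (6, 9, 5, 10)]
print(image_at((5, 4), rects, tables, eigenvalues, radius))
```

The cost per image point scales with the number of polygon corners inside the kernel support rather than with the number of pixels, which is the reason for pre-storing the sector convolutions.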
The present invention has the advantage that non-scalar effects such as vector effects including tailored source polarization, lens birefringence, and resist stack polarization, as well as blur in the mask or resist, can be incorporated efficiently in image calculation.
The present invention provides the following advantages:
1. The avoidance of extremely large computational errors which can result from taking the derivatives of a function which already has been computed with rounding errors.
2. Reduction of the time complexity of the computation of the resist image to that of the computation of just the aerial image. In accordance with the present invention, a generalized bilinear kernel is formed that has the same or similar form as the bilinear kernel used to form the aerial image according to the SOCS approximation of the Hopkins model. In other words the invention reduces the complexity of the resist image computation to the complexity of obtaining just the aerial image, or by at least a factor on the order of 10.
3. Improved accuracy of the image due to the inclusion of non-scalar effects, including, but not limited to effects of the vector electric field, tailored source polarization, lens birefringence, defocus variations, resist stack and blur from the mask and the resist.
The features of the invention believed to be novel and the elements characteristic of the invention are set forth with particularity in the appended claims. The figures are for illustration purposes only and are not drawn to scale. The invention itself, however, both as to organization and method of operation, may best be understood by reference to the detailed description which follows taken in conjunction with the accompanying drawings, in which:
In the following description, numerous specific details may be set forth to provide a thorough understanding of the present invention. However, it will be obvious to those skilled in the art that the present invention may be practiced without such specific details. In other instances, well-known features may have been shown in block diagram form in order not to obscure the present invention in unnecessary detail.
Refer now to the drawings wherein depicted elements are not necessarily shown to scale and wherein like or similar elements are designated by the same reference numeral through the several views.
Each image calculation from MBOPC programs typically involves only those mask shapes that are contained within a small region (the so-called Region Of Interest, or ROI) that is centered on the image point of interest {right arrow over (r)}. For example,
Suppose we then interpret the ROI 310 as the unit cell of a grating, where the size of the grating 300 is large compared to the ROI 310, but still microscopic on the scale of the projection lens 130. Referring to
Vector and Polarization Effects
Referring to
For a ray propagating in a direction {circumflex over (κ)}=(α,β,γ), with γ=√(1−α²−β²), the polarization of the ray can be tracked against an (s, p) basis, where the p component of the ray polarization lies within the meridional plane 220 and the s component of the ray polarization is orthogonal to the p component, where:
where the ray directions in image space 132 ({circumflex over (κ)}′) and in the resist layer 142 ({circumflex over (κ)}″) are determined from the object space 122 direction ({circumflex over (κ)}) using the magnification (sine condition), and the resist index (Snell's law). Note that we have the identity
$$\hat{s} = \hat{s}' = \hat{s}'' \qquad \text{(Equation 8)}$$
since the sign change between the ŝ component of the ray polarization of Equation 5 and the ŝ′ component of Equation 6 simply represents the inversion of the ray direction by the lens 130 (i.e. the magnification is negative). Note also that a distinction is drawn between image space 132 and the resist layer 142. Although it is not required by the present invention, preferably, the propagation from mask to wafer is treated using polarization ray tracing, but propagation into the resist stack is treated using a conventional plane wave thin-film approach.
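The mapping of ray directions between object space, image space, and the resist, and the invariance of the s direction, can be sketched as follows. The sign convention for the s vector (the cross product of the ray direction with the optical axis, with a sign flip behind a lens of negative magnification) is an assumption made for this illustration and need not match Equations 5 to 7. The numerical values of the magnification, resist index, and ray direction are likewise hypothetical.

```python
import numpy as np

Z_AXIS = np.array([0.0, 0.0, 1.0])


def direction(alpha, beta):
    """Unit ray direction from its x and y direction cosines."""
    gamma = np.sqrt(1.0 - alpha ** 2 - beta ** 2)
    return np.array([alpha, beta, gamma])


def image_direction(kappa, magnification):
    """Sine condition: transverse direction cosines scale by 1 / magnification."""
    return direction(kappa[0] / magnification, kappa[1] / magnification)


def resist_direction(kappa_image, n_resist):
    """Snell's law from air into resist: transverse cosines scale by 1 / n."""
    return direction(kappa_image[0] / n_resist, kappa_image[1] / n_resist)


def sp_basis(kappa, flip=False):
    """Return (s, p): s normal to the meridional plane, p transverse and in-plane.

    Undefined for a ray exactly along the optical axis.
    """
    s = np.cross(kappa, Z_AXIS)
    norm = np.linalg.norm(s)
    if norm == 0.0:
        raise ValueError("s is undefined for a ray along the optical axis")
    s = s / norm
    if flip:
        s = -s
    p = np.cross(s, kappa)
    return s, p


magnification = -0.25          # assumed 4x reduction with image inversion
n_resist = 1.7                 # assumed resist refractive index
kappa = direction(0.05, 0.02)  # object-space ray
kappa_img = image_direction(kappa, magnification)
kappa_res = resist_direction(kappa_img, n_resist)

inverted = magnification < 0
s_obj, p_obj = sp_basis(kappa)
s_img, p_img = sp_basis(kappa_img, flip=inverted)
s_res, p_res = sp_basis(kappa_res, flip=inverted)
print(s_obj, s_img, s_res)
```

With the sign flip applied behind the inverting lens, the three s vectors coincide, while the p vector tilts with the ray in each space.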
To make large-area lithographic simulations practical, it is customary to make simplifying assumptions about mask diffraction. This involves two approximations; first, the so-called thin mask approximation, and second, an assumption that the projection lens impulse response function (which arises from spatial invariance) is also invariant with respect to illumination direction. For the present invention it is necessary to extend both approximations to the case of a polarized source. To some extent both approximations are improved at large reduction ratios, where propagation angles of collected orders are relatively small and mask features relatively large; however, even at low mask-side NA both approximations can be invalidated by strong topographic effects in the patterned mask.
Vector Effects
At low mask-side NA there is no difficulty in representing the source polarization distribution by a 2D map which specifies the (2D) polarization direction of each source point. However, the reduction to 2D becomes somewhat arbitrary when mask-side NA is non-negligible. By convention we will refer to the coordinate directions for the source map as x′ and y′ (i.e. we take the optical axis to lie along z). The illuminating direction will be specified by the (global) x and y direction cosines, while local x′ and y′ coordinate directions will be used to specify the illuminating polarization along each ray, for example by defining the local x′ axis to be perpendicular to the plane 220 containing the ray and the global y axis (with the local y′ axis then being perpendicular to the x′ axis and the ray 201). The unit vectors along the x′ and y′ directions can be expressed as follows:
where {circumflex over (x)}, ŷ are the unit vectors of the global coordinate system, and {circumflex over (κ)}s is the illumination direction.
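One way to realize the verbal definition of the local axes is shown in the Python sketch below. It takes x′ along the cross product of the global y axis with the illumination direction and y′ along the cross product of the illumination direction with x′. The overall signs are an assumption chosen so that the on-axis ray recovers the global x and y axes; the function name and sample direction are hypothetical.

```python
import numpy as np


def source_polarization_axes(kappa_s):
    """Local polarization axes (x', y') for an illuminating ray direction.

    x' is perpendicular to the plane containing the ray and the global y
    axis; y' is perpendicular to both x' and the ray. Signs are assumed so
    that kappa_s = z gives x' = x and y' = y. Undefined for a ray along y.
    """
    kappa_s = np.asarray(kappa_s, dtype=float)
    y_hat = np.array([0.0, 1.0, 0.0])
    x_local = np.cross(y_hat, kappa_s)
    x_local = x_local / np.linalg.norm(x_local)
    y_local = np.cross(kappa_s, x_local)
    return x_local, y_local


alpha_s, beta_s = 0.10, 0.05   # assumed illumination direction cosines
kappa_s = np.array([alpha_s, beta_s, np.sqrt(1.0 - alpha_s ** 2 - beta_s ** 2)])
x_local, y_local = source_polarization_axes(kappa_s)
print(x_local, y_local)
```

Both local axes are transverse to the ray, so any source polarization map expressed in (x′, y′) describes a physically allowed polarization along that ray.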
Unpolarized or partially polarized sources are treated by independently superposing the system kernels calculated for independent polarized sources.
The assumption of spatial invariance (i.e. the assumption that the projection lens 130 can be characterized by a fixed lens impulse response function within each local imaging field) primarily involves the projection lens 130. However, to treat partially coherent illumination an associated assumption is conventionally made about mask diffraction, namely that mask diffraction can properly be treated by applying the same fixed lens impulse response function to the mask shapes even as the illuminating direction shifts from source point to source point (with the amplitude transmitted through the mask shapes changing only by the illumination tilt phase). The present treatment likewise requires that the projection lens exhibit a spatially invariant impulse response. It is not strictly necessary that this impulse response be invariant with respect to source direction, even for a single source polarization. However, the pre-calculation phase of the invention is significantly lengthened unless the impulse response obeys a constrained dependence on source direction, which is hereinafter referred to as a generalized source-invariance. The simplest such source-invariance model constitutes a natural extension of the assumption conventionally made in the scalar Hopkins treatment; the projection lens is taken to be governed by two vector impulse response functions that are independent of illumination direction (except for a phase tilt), with each vector response corresponding to polarization along one of the two source polarization directions, x′ or y′. At non-negligible mask-side NA, the polarization of the illuminating ray cannot be completely invariant as ray direction is varied. However, since mask-side NA is reduced by a factor of at least 4, the polarization of the illuminating ray is almost invariant as ray direction is varied.
Efficient pre-calculation is also possible with more complex models that do not require approximate source invariance. For example, the source 115 can be divided into independent regions (e.g. the poles of a quadrupole) that are each governed by a pair of source impulse response functions. In accordance with the present invention, efficient integration to determine a generalized bilinear kernel does not require full directional source invariance within the region; only that the response be a sum of separable functions of the illumination and collection directions (i.e. diffraction directions).
The present invention uses the thin mask approximation, as known in the art, in that the amplitude of the diffracted light is calculated as the Fourier transform of the mask patterns (or, in some embodiments, as the Fourier transform of a blurred rendition of the mask patterns; in general the function must be linear). It has been shown that the thin-mask framework can be extended to accommodate mask topography effects by replacing the true mask patterns with effective patterns whose Fourier transform more closely reproduces the true diffraction spectrum. A simple approach along these lines is to bias each edge with a slightly increased chrome width.
In a preferred embodiment of the present invention, consider the case of an ideal lens for a simple model of mask diffraction. Consider the on-axis source point to be polarized along a direction {right arrow over (E)}0 that has no z component. The {right arrow over (E)}0 direction will usually have a component perpendicular to the meridional plane 220 of a given ray 201; we identify the projection of {right arrow over (E)}0 perpendicular to the meridional plane 220 as the s component of the ray polarization, and assume initially that the amplitude of the ŝ component is maintained in propagating through the lens 130 (this is true of an ideal lens; the general case will be considered later). In our initial simple scenario the s amplitude of the electric field is taken to be equal to this projection factor multiplied by the usual Fourier component of the mask diffraction pattern, which (due to relatively low NA) is calculated on a scalar basis (as is done in the prior art). For an ideal lens, the s component of the electric field is then given by:
where A({circumflex over (κ)}) is the scalar order-amplitude, given in Equation 13 below. When the s component is mathematically subtracted from {right arrow over (E)}0, the remainder is not transverse to the ray 220, and therefore does not represent a true p polarization component. Since we are treating diffraction from the mask 120 in an approximate way (given the relatively low NA), we will initially assume that the magnitude of the p component can be calculated by finding the magnitude of this in-plane {right arrow over (E)}0 component (multiplied by the appropriate scalar diffraction order amplitude A({circumflex over (κ)})). The direction of this p component is of course taken to lie along the transverse direction {circumflex over (p)}. We take the sign of the p component to be sign({circumflex over (p)}·{right arrow over (E)}0).
After some algebra, we find that the p component is then given by:
where we have used {circumflex over (z)}·{right arrow over (E)}0=0. Note that Equation 11 will obey the identity
since the x and y components of {circumflex over (κ)} (the direction of propagation of the ray 220 in object space 122) and {circumflex over (κ)}′ (the direction of propagation of the ray 220 in image space 132) only change in the ratio of the magnification, which cancels between numerator and denominator.
In our model the electric field strength along the ray 220 will also be multiplied by the scalar diffraction efficiency A({circumflex over (κ)}, {circumflex over (κ)}s) of the mask 120 in the ray propagation direction {circumflex over (κ)}, which for unit illuminating intensity from source direction {circumflex over (κ)}s is given by:
A({circumflex over (κ)}, {circumflex over (κ)}s)=∫∫d{right arrow over (r)}m({right arrow over (r)})ei({circumflex over (κ)}
where m({right arrow over (r)}) is the mask transmittance function.
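In the thin mask approximation this order amplitude is a Fourier transform of the mask transmittance, shifted by the illumination tilt. The Python sketch below evaluates such an integral numerically for a single clear rectangle and checks it against the closed-form sinc result. The exponent used here, i·(2π/λ)·(κ − κs)·r, including its sign and the absence of any normalization, is an assumption, since the exponent of the expression above is truncated in this text. The wavelength, grid, and rectangle size are also illustrative.

```python
import numpy as np


def diffraction_amplitude(mask, xs, ys, kappa, kappa_s, wavelength):
    """Midpoint-rule estimate of A = integral of m(r) exp(i k0 (kappa - kappa_s) . r).

    Only the x and y direction cosines of kappa and kappa_s are used.
    The sign and scaling of the exponent are assumptions of this sketch.
    """
    k0 = 2.0 * np.pi / wavelength
    qx = k0 * (kappa[0] - kappa_s[0])
    qy = k0 * (kappa[1] - kappa_s[1])
    dx = xs[1] - xs[0]
    dy = ys[1] - ys[0]
    phase = np.exp(1j * (qx * xs[None, :] + qy * ys[:, None]))
    return np.sum(mask * phase) * dx * dy


def rectangle_amplitude(width, height, kappa, kappa_s, wavelength):
    """Closed form for a centered clear rectangle (product of two sinc factors)."""
    k0 = 2.0 * np.pi / wavelength
    qx = k0 * (kappa[0] - kappa_s[0])
    qy = k0 * (kappa[1] - kappa_s[1])
    # np.sinc(t) is sin(pi t) / (pi t), hence the division by 2 pi.
    return (width * np.sinc(qx * width / (2.0 * np.pi))
            * height * np.sinc(qy * height / (2.0 * np.pi)))


wavelength = 193.0                                  # nm, assumed
n_cells, extent = 400, 400.0                        # 1 nm cells over a 400 nm window
xs = (np.arange(n_cells) + 0.5) * (extent / n_cells) - extent / 2.0
ys = xs.copy()
width, height = 120.0, 80.0                         # nm, clear rectangle
mask = ((np.abs(xs)[None, :] < width / 2.0)
        & (np.abs(ys)[:, None] < height / 2.0)).astype(float)

kappa = (0.08, -0.03)                               # diffracted direction cosines
kappa_s = (0.02, 0.01)                              # illumination direction cosines
numeric = diffraction_amplitude(mask, xs, ys, kappa, kappa_s, wavelength)
analytic = rectangle_amplitude(width, height, kappa, kappa_s, wavelength)
print(numeric, analytic)
```

Changing `kappa_s` only shifts the argument of the transform, which is the "illumination tilt phase" behavior assumed in the conventional Hopkins treatment.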
In an ideal lens the s and p polarizations will not mix, and ray intensity will not be absorbed in the lens. As discussed above, we can identify the ray 220 with a single order diffracted from the mask 120, truncated with a cross-section bounded by e.g. the ROI 310 in the mask plane (or some multiple thereof, taken here to be 1). The cross-section area subtended in the wafer plane is equal to ROI×M² (where M is the magnification of the lithographic system, and typically M=−0.25). However, this area is only subtended obliquely, i.e. it is foreshortened as seen from the ray direction, and the perpendicular cross-section area is therefore reduced by a factor γ′, where γ′ denotes the z component of {circumflex over (κ)}′, the direction of propagation in image space 132. This concentration causes an increase in the electric field strength. In an ideal lens the electric field along the ray (diffraction order) therefore changes by a ratio M√(γ/γ′) between object and image. We suppress the constant factor M for simplicity, and define the so-called obliquity factor as:
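The energy argument in the preceding paragraph can be written out as a short worked equation. This is a sketch in which the ROI area $A_{\mathrm{ROI}}$ and the field magnitudes $|E|$ and $|E'|$ are notation introduced here for illustration. It assumes a lossless lens and the same foreshortening argument applied on the object side with the factor $\gamma$, and it tracks the angular dependence only, leaving aside the constant factor that depends on $M$ alone.

```latex
% Power carried by one diffraction order through its perpendicular cross-section
% is conserved between object space and image space (lossless ideal lens):
|E|^{2}\,\gamma\,A_{\mathrm{ROI}} \;=\; |E'|^{2}\,\gamma'\,M^{2}A_{\mathrm{ROI}}
\quad\Longrightarrow\quad
\frac{|E'|}{|E|} \;\propto\; \sqrt{\frac{\gamma}{\gamma'}} .
```

The angle-dependent part, the square root of γ/γ′, is the quantity the text refers to as the obliquity factor.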
We should also note that it is customary to describe mask-plane quantities in so-called “1×” wafer-scale coordinates, often without explicitly saying so. We will follow this custom below, referring to both the mask-scale and wafer-scale integration domains as “the ROI”.
The s component of the ray 220 is transverse to the resist plane of incidence (the resist plane of incidence is the plane containing the ray and the normal to the resist surface), so the s vector component propagates into the resist 142 with a transmittance τs(Δzr), where τs(Δzr) is the coefficient for s transmission through the wafer film stack 142 to a depth Δzr. Using standard thin film methods one can also calculate a transmission coefficient τp(Δzr) for the p component. The transmittance coefficients calculated by these standard methods provide the tangential component of the field (i.e. the component parallel to the interface of the resist stack 142), so we define
as the transmission coefficient for the magnitude of the p amplitude. In the equations below we will suppress the Δzr dependence for brevity.
If the bottom surface of the resist layer 142 is anti-reflected, the electric field {right arrow over (E)}″ in the resist 142 will then be proportional to the quantity {right arrow over (Q)}({circumflex over (κ)}), that is:
$$\vec{E}''(\hat{\kappa}'') = A(\hat{\kappa}''; \hat{\kappa}_s)\, O(\hat{\kappa}')\, \vec{Q}''(\hat{\kappa}''), \qquad \text{(Equation 16)}$$
where {right arrow over (Q)}({circumflex over (κ)}) is defined as
$$\vec{Q}(\hat{\kappa}) \equiv \tau_s E_s\, \hat{s}'' + \tilde{\tau}_p E_p\, \hat{p}''. \qquad \text{(Equation 17)}$$
Note that the electric field in image space is similarly given by:
{right arrow over (E)}′({circumflex over (κ)}′)=A({circumflex over (κ)}′; {circumflex over (κ)}s)O({circumflex over (κ)}′){right arrow over (Q)}′({circumflex over (κ)}′) Equation 18
where the obliquity factor O({circumflex over (κ)}′) in both the resist (Equation 16) and image space (Equation 18) is given by the same factor given by Equation 14.
By making use of Equation 5, Equation 6, Equation 7, Equation 8, Equation 10, Equation 11, Equation 12 and Equation 15, we find after some algebra that inside the resist layer, with an index-matched substrate:
where {right arrow over (k)}″ is the projection of {circumflex over (κ)}″ transverse to the local optical axis within the resist stack 142:
{right arrow over (k)}″≡{circumflex over (κ)}″−({circumflex over (κ)}″·{circumflex over (z)}){circumflex over (z)}={α, β,0}, Equation 20
with the last form representing a vector of Cartesian components. Note that the first term τs{right arrow over (E)}0 on the right of Equation 19 is essentially the standard, near-scalar result that obtains at low NA, while the remaining terms can be thought of as vector corrections. However, the τs factor appearing in the first term of Equation 19 can have a significant angle dependence at high NA. The term in square brackets in Equation 19 is of order unity as {right arrow over (k)}″→0 because τs→{tilde over (τ)}p, γ″→1 at normal incidence; we therefore have {right arrow over (Q)}→τs{right arrow over (E)}0 in the low-NA limit, as expected.
The beam cone angle is more pronounced in air than in resist, so the image in air (the so-called aerial image) shows larger departures from scalar behavior at high NA than does the resist image. Thus, one must be cautious in approximating the resist image with the aerial image at NA's where vector interference effects are important. Bearing this caveat in mind, we note that Equation 19 provides as a special case a simple expression for the electric field {right arrow over (E)}′ along rays in the aerial image space 132:
{right arrow over (Q)}″({circumflex over (κ)}) of Equation 19 can be generalized to include the case of a stack of films 142 on the wafer 144, e.g. antireflection layers above and below the resist, and/or other process films. Transfer into a general film stack 142 is not governed by a single transmittance τ; in addition to the transmitted down-traveling wave having amplitude u, the resist layer 142 will in general also contain an up-traveling wave having amplitude v that has reflected from the substrate 144. To solve for these amplitudes we divide the film stack 142 into an upper and lower substack, separated by the depth-plane within the resist at which we want to calculate the image. Standard thin-film methods can be used to calculate the up-traveling and down-traveling amplitudes, v,u, respectively; separate calculations must be made for the up and down-traveling s components (vs and us, respectively) and the p components (vp and up, respectively). Thin-film methods conventionally provide transfer coefficients for the tangential field components, since the equations that are solved for these coefficients express the continuity of tangential components across interfaces. Our Equation 26 below recovers the full electric field from these components.
To solve for the fields, we first calculate the standard thin-film characteristic matrix for each layer, defined as
where β≡2πnd cos θ/λ, n is the film index of refraction, and d is the film thickness of the layer. Y is the layer admittance, defined as Y≡n cosθ for s polarization, and Y=n/cosθ for p polarization, with θ the ray angle inside the film (calculated from Snell's law). The film matrices are multiplied together to form substack matrices M1 for the upper substack and M2 for the lower, and full stack matrix MTot≡M1M2. The following four linear equations are then solved for u and v (eliminating the supplementary unknowns ρTot and τTot):
repeating this procedure for s and p polarizations. The field inside the resist 142 is then obtained from:
where ũ is defined as ũ≡γ′u/γ″, and similarly for v.
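The substack construction described above can be sketched numerically. The following is a minimal sketch, not the patent's own implementation: it assumes the widely used Macleod sign convention for the characteristic matrix, [[cos β, i sin β/Y], [i Y sin β, cos β]], which may differ from the matrix intended in the text by sign convention, and it assumes an incident medium of index 1. The function names are hypothetical. The sketch stops at forming M1, M2 and MTot; it does not solve the linear equations for u and v.

```python
import numpy as np

def layer_matrix(n, d, wavelength, sin_inc, pol, n_inc=1.0):
    """2x2 characteristic matrix of one layer (Macleod-style signs, an assumption)."""
    cos_t = np.sqrt(1.0 - (n_inc * sin_inc / n) ** 2 + 0j)  # Snell's law
    beta = 2.0 * np.pi * n * d * cos_t / wavelength           # phase thickness
    Y = n * cos_t if pol == "s" else n / cos_t                # layer admittance
    return np.array([[np.cos(beta), 1j * np.sin(beta) / Y],
                     [1j * Y * np.sin(beta), np.cos(beta)]])

def stack_matrix(layers, wavelength, sin_inc, pol):
    """Ordered product of layer matrices; layers = [(n, d), ...], top layer first."""
    M = np.eye(2, dtype=complex)
    for n, d in layers:
        M = M @ layer_matrix(n, d, wavelength, sin_inc, pol)
    return M

def reflectance(layers, n_sub, wavelength, sin_inc, pol):
    """Intensity reflectance of the stack on a substrate, incident medium index 1."""
    cos0 = np.sqrt(1.0 - sin_inc ** 2 + 0j)
    cos_s = np.sqrt(1.0 - (sin_inc / n_sub) ** 2 + 0j)
    Y0 = cos0 if pol == "s" else 1.0 / cos0
    Ys = n_sub * cos_s if pol == "s" else n_sub / cos_s
    B, C = stack_matrix(layers, wavelength, sin_inc, pol) @ np.array([1.0, Ys])
    return abs((Y0 * B - C) / (Y0 * B + C)) ** 2

# Illustrative values only: split a 200 nm resist (n = 1.7) at an 80 nm depth plane.
wavelength, resist_n, resist_d, depth = 193.0, 1.7, 200.0, 80.0
M1 = stack_matrix([(resist_n, depth)], wavelength, 0.5, "s")             # upper substack
M2 = stack_matrix([(resist_n, resist_d - depth)], wavelength, 0.5, "s")  # lower substack
M_tot = stack_matrix([(resist_n, resist_d)], wavelength, 0.5, "s")       # full stack

# Sanity check of the convention: an ideal quarter-wave layer with
# n = sqrt(n_sub) removes the normal-incidence reflection.
R_quarter_wave = reflectance([(1.5, wavelength / (4 * 1.5))], 2.25, wavelength, 0.0, "s")
```

Splitting the resist at the image depth plane leaves the full-stack matrix unchanged (M1 @ M2 equals M_tot), which is what allows the same layer data to serve every depth plane.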
In the case of a non-ideal lens, the electric field in image space can be determined by polarization raytracing, as known in the art. The relationship between image-space field {right arrow over (E)}′ and object-space field {right arrow over (E)} can be represented as multiplication by a Jones matrix J:
{Es′, Ep′}=J({right arrow over (k)}′)·{Es,Ep} Equation 25
where the Jones matrix J may be used to include birefringence of the lens. Equation 25 can then be employed in the following generalized version of Equation 24:
{right arrow over (Q)}({circumflex over (k)}′)=(vs+us)[{circumflex over (z)}×{circumflex over (k)}′]Es′+[{tilde over (v)}p({circumflex over (k)}′γ″+{circumflex over (z)}k″)+ũp({circumflex over (k)}′γ″−{circumflex over (z)}k″)]Ep′. Equation 26
Equation 25 and Equation 26 assume that matrix J is calculated using the s, p basis of Equation 5, Equation 6 and Equation 7. Thus, the electric field {right arrow over (E)}′ in image space obtained using {right arrow over (Q)}({right arrow over (k)}′) of Equation 26 includes the vector effects of the film stack.
The s component of the electric field Es (of Equation 10) and p component of the electric field Ep (of Equation 11) are written in terms of a single coherent illuminating polarization {right arrow over (E)}0. Unpolarized illumination can be handled as an incoherent sum of coherent contributions from orthogonal initial choices of {right arrow over (E)}0, as will be discussed further below.
Customized Source Polarization
Another important generalization is that of a source having customized or tailored polarization. For example, as NA approaches or exceeds 1, the image quality of certain low-k1 features can be enhanced by polarizing off-axis illumination in a tangential direction (i.e. in a direction perpendicular to the tilted plane of incidence). Formally, such customized polarization distributions can be handled by making {right arrow over (E)}0 (and therefore {right arrow over (Q)}) functions of the projected illuminating direction {right arrow over (k)}s, which is defined by analogy with Equation 20 as
{right arrow over (k)}s≡{circumflex over (κ)}s−({circumflex over (κ)}s·{circumflex over (z)}){circumflex over (z)}={αs, βs,0}. Equation 27
This also allows us to handle the polarization of light diffracted by the mask in a more general manner.
Vector Image Based on a Generalized Bilinear Kernel
In accordance with the present invention, we now obtain an expression for the overall image I({right arrow over (r)}) at a point {right arrow over (r)} in the resist 142 that comprises a generalized bilinear kernel. To do so, we calculate the total electric field vector at each position {right arrow over (r)} in the resist 142 (from each source point and from each independent illumination polarization), then calculate |{right arrow over (E)}|², and then sum over all source points σ and all independent illumination polarizations {right arrow over (k)}s. (The sum over independent source polarizations, ranging from m=1 to m=mMax, with mMax typically equal to 1 or 2, allows us to account for unpolarized or partially polarized illumination.) We obtain:
As discussed further below, in accordance with the present invention, the image I({right arrow over (r)}) of Equation 28 is now expressed in terms of a generalized bilinear kernel, V({right arrow over (r)}′,{right arrow over (r)}″), given by:
which is dependent only on the source function S({right arrow over (k)}s) and the impulse response function of the lens {right arrow over (h)} (which may also include information about the resist stack), but is independent of the mask transmittance function m({right arrow over (r)}).
In the last line of Equation 28 we have written the mask patterns m({right arrow over (r)}) as if the reduction ratio R=1/Abs(M) were 1, following the common practice of describing masks in “1× dimensions”. The integrals in Equation 28 should technically extend to infinity, but in practice they are restricted for computational reasons to a domain (such as ROI 310) modestly larger than the lens resolution, e.g. ±˜4λ/NA. P({right arrow over (k)}) represents the aberrated scalar pupil of the system, which may include defocus aberration. We have also introduced the symbol {right arrow over (h)} to represent the electric field distribution in the image plane due to illuminating the projection; it is referred to as the lens vector impulse response, and is defined as
where {right arrow over (h)}m is the projection impulse response for a given source polarization m, which may also include lens birefringence, and may also include defocus aberration corresponding to a defocus position different from zero. {right arrow over (h)}m does not include the effect of limited resist resolution, but in general includes the effect of multiple reflections within the resist stack. However, if we use the aerial image field {right arrow over (Q)}({circumflex over (k)}) (as in Equation 21) instead of the more general expression for the image field of Equation 26, then the projection impulse response function {right arrow over (h)}m of Equation 30 provides the aerial image impulse response of the lens without resist effects. Moreover, if we retain only the leading term in Equation 21, {right arrow over (E)}0, we recover the scalar aerial image impulse response in Equation 30 rather than a generalized impulse response that includes vector interference, resist film stack, lens birefringence, customized source polarization, and varying partial degree of polarization in the source.
It should be noted that {right arrow over (Q)}({circumflex over (k)}) of Equation 24 or Equation 26 only provides the field at a particular depth in the resist. This is consistent with the common practice of applying OPC to an image calculated within a single fixed image plane; however, in many cases it may be preferable to base OPC on the exposing image as averaged through the depth of the resist. This depth average can be approximated by an average over several planes, spaced by e.g. Δz=0.15λ/NA². (One can average over lens focus in a similar way, for example to account for variation in positioning the wafer against the image beam; some of this variation will be truly cumulative in modern scanned lithography tools, since it is incurred when the wafer is scanned through the imaging field.) If the total number of planes is W, then using w as an index to denote the particular {right arrow over (Q)} value obtained for each resist plane (from Equation 24 or Equation 26), the bilinear terms appearing in the expression for the image intensity I({right arrow over (r)}) of Equation 28 are:
which provide the image in resist averaged over the depth of the resist. Though Equation 31 requires that the field be calculated in multiple planes, the actual use of depth-averaged images entails no additional computation once this pre-computation is completed (as long as the same total number of kernels is used). Note also that only a single sum over w should be made when {right arrow over (h)} appears pairwise in bilinear terms; each copy of {right arrow over (h)} or {right arrow over (Q)} is indexed with the same value of w.
For simplicity we denote the inner integrals in the expression for image intensity I({right arrow over (r)}) of Equation 28 with the symbol V, which is a 4D function in the x,y components of {right arrow over (r)}′ and {right arrow over (r)}″, so that the image intensity I({right arrow over (r)}) of Equation 28, in accordance with the present invention, becomes a double convolution over the mask patterns:
which, in accordance with the present invention, is a generalized bilinear kernel that is independent of the mask transmission function. Thus, in accordance with the present invention, the image intensity I({right arrow over (r)}) of Equation 32 can be viewed as a generalization of the scalar Hopkins integral, but which additionally includes the effects of vector diffraction, resist film stack, and tailored source polarization. In addition, according to the present invention, this leads to a generalization of the SOCS method, which reduces the 4D Hopkins integral to a sum of squared 2D convolutions. Additionally, if the polygonal shapes of IC mask patterns are exploited, for example, as described by K. Lai et al. (in co-assigned U.S. patent application Ser. No. 10/694,466, filed on Oct. 27, 2003, the contents of which are hereby incorporated by reference in its entirety), then the convolution integrals over mask patterns can be calculated rapidly. Note that if the index of refraction of the resist is equal to 1, then I({right arrow over (r)}) of Equation 32 is equivalent to the aerial image.
Mask Blur
However, the mask shapes m({right arrow over (r)}) will not be exactly polygonal, since mask-making tools have finite resolution, causing corner rounding in the pattern. Roughly speaking, we may approximate this loss in mask definition by convolution with a mask blur function b({right arrow over (r)}). If we denote the blurred mask patterns as m′, we have in this approximation the following replacement for the image intensity I({right arrow over (r)}) of Equation 32 that includes mask blur:
I({right arrow over (r)})=∫∫d2{right arrow over (r)}′d2{right arrow over (r)}″V({right arrow over (r)}′,{right arrow over (r)}″)m′({right arrow over (r)}−{right arrow over (r)}′)m′*({right arrow over (r)}−{right arrow over (r)}″), Equation 34
where
m′({right arrow over (r)})≡∫∫d2{right arrow over (R)}b({right arrow over (R)})m({right arrow over (r)}−{right arrow over (R)}). Equation 35
Unfortunately, the image intensity I({right arrow over (r)}) including mask blur as in Equation 34 requires considerably more computation to evaluate than Equation 32, since the functions m′ are continuously varying functions, rather than binary polygons.
However, we can generalize the intensity I({right arrow over (r)}) of Equation 28 to include the effect of the mask blur function b({right arrow over (r)}) as follows:
Substituting the blurred mask shapes m′ of Equation 35 into the intensity I({right arrow over (r)}) of Equation 34 gives:
I({right arrow over (r)})=∫∫∫∫d2{right arrow over (r)}′d2{right arrow over (r)}″d2{right arrow over (R)}′d2{right arrow over (R)}″V({right arrow over (r)}′, {right arrow over (r)}″)b({right arrow over (R)}′)b({right arrow over (R)}″)m({right arrow over (r)}−{right arrow over (r)}′−{right arrow over (R)}′)m*({right arrow over (r)}−{right arrow over (r)}″−{right arrow over (R)}″). Equation 36
Making the change of variables
where
is the generalized bilinear kernel without the mask (see Equation 33). It is convenient to modify Equation 38 and Equation 39 with additional changes of variables, to obtain
I({right arrow over (r)})=∫∫d2{right arrow over (r)}′d2{right arrow over (r)}″V′({right arrow over (r)}′, {right arrow over (r)}″)m({right arrow over (r)}−{right arrow over (r)}′)m*({right arrow over (r)}−{right arrow over (r)}″), Equation 40
with
Note that Equation 40 images the nominal polygonal mask patterns m({right arrow over (r)}), rather than the blurred mask m′({right arrow over (r)}) that is used in Equation 34.
Thus, in accordance with the present invention, the image I({right arrow over (r)}) of Equation 40 includes the generalized bilinear kernel V′({right arrow over (r)}′,{right arrow over (r)}″) of Equation 41, which includes blurring from mask fabrication, and accounts for blurring imposed by the optics (and by multiple reflections within the film stack) using a generalized vector model (also accounting for the contrast loss at high-NA that can occur with interfering vector fields).
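The equivalence between imaging a blurred mask (Equation 34) and imaging the nominal mask with a blur-modified kernel (Equation 40) can be checked numerically. The sketch below is a toy under stated assumptions, not the patent's implementation: it uses one dimension instead of two, a periodic grid, a real-valued blur b, and a kernel built as an incoherent sum of random coherent responses; it takes V′ to be V convolved with b along each of its two arguments, which is what the substitution of Equation 35 into Equation 34 yields. All names are hypothetical.

```python
import numpy as np

def bilinear_image(V, m):
    """I[x] = sum_{a,b} V[a,b] m[x-a] conj(m[x-b]) on a periodic 1D grid."""
    N = len(m)
    x = np.arange(N)
    shifted = m[(x[:, None] - x[None, :]) % N]      # shifted[x, a] = m[x - a]
    return np.einsum("xa,ab,xb->x", shifted, V, shifted.conj()).real

def circular_convolve(b, m):
    """m'[x] = sum_R b[R] m[x - R] (periodic)."""
    return np.fft.ifft(np.fft.fft(b) * np.fft.fft(m))

def mask_blurred_kernel(V, b):
    """V'[p,q] = sum_{R',R''} b[R'] b[R''] V[p-R', q-R''] (periodic, real b)."""
    return np.fft.ifft2(np.fft.fft2(V) * np.fft.fft2(np.outer(b, b)))

rng = np.random.default_rng(0)
N = 16
# Toy Hermitian kernel: incoherent sum of three random coherent responses.
V = np.zeros((N, N), dtype=complex)
for _ in range(3):
    h = rng.normal(size=N) + 1j * rng.normal(size=N)
    V += np.outer(h, h.conj())

offsets = np.fft.fftfreq(N, d=1.0 / N)              # signed offsets in FFT order
b = np.exp(-offsets ** 2 / 2.0)
b /= b.sum()                                        # normalized Gaussian mask blur
m = (rng.random(N) > 0.5).astype(complex)           # binary "polygon" mask

I_blurred_mask = bilinear_image(V, circular_convolve(b, m))    # Equation 34 route
I_blurred_kernel = bilinear_image(mask_blurred_kernel(V, b), m)  # Equation 40 route
```

The two routes agree to rounding error, while only the second keeps the mask binary, which is the property that fast polygon-based convolution relies on.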
However, resist blur may occur due to the finite resolution of the resist. As discussed above, this resist blurring can be accounted for by convolution of the optical image with a blur function, or equivalently by attenuation of the image spatial frequency content by a modulation transfer function. However, such methods, which involve direct convolution of the resist blur function with the image, are inefficient.
Resist Blur
In accordance with the present invention, the effects of resist blur can be included efficiently in the image, for example by determining an effective resist blur function from the measured resist modulation transfer function (MTF) by 2D Fourier transform (Hankel transform):
where g2D({right arrow over (r)}) is the exposure response of the resist, and more particularly, the exposure response at a plane in the resist stack structure, and J0 is a Bessel function. The exposure response of the resist is typically measured, or estimated, or provided by the resist manufacturer. The lithographically relevant image is determined from the optical image according to:
Unfortunately, the optical image I({right arrow over (R)}) is continuously varying, making Equation 43 difficult to integrate in the rapid fashion required for OPC. However, in accordance with the present invention, we can account for resist resolution using a modified generalized bilinear imaging kernel V″ that adds resist blur to the extended list of phenomena included in Equation 41:
ILitho({right arrow over (r)})=∫∫d2{right arrow over (r)}′d2{right arrow over (r)}″V″({right arrow over (r)}′,{right arrow over (r)}″)m({right arrow over (r)}−{right arrow over (r)}′)m*({right arrow over (r)}−{right arrow over (r)}″), Equation 44
where the modified generalized bilinear kernel V″({right arrow over (r)}′,{right arrow over (r)}″) in accordance with the present invention is
which includes resist blur averaged over depth of the resist stack and/or a range of focus positions. The generalized bilinear kernel V″ includes a projection impulse response function {right arrow over (h)} combined with the resist blur function g2D. It also includes mask blur b and vector imaging effects. The combination of any one of the resist blur function, the mask blur function or other resist stack effects, is referred to hereinafter as an exposure response. If W is 1, then Equation 45 represents resist blur at a single resist plane.
Note that the last term of the generalized bilinear kernel of Equation 45 given by
represents the bilinear impulse response of the lithographic subsystem including the (scalar or vector) projection impulse response, mask blur and resist blur. Additionally, in accordance with the present invention, the subsystem impulse response can include the scalar or vector projection impulse response plus resist blur (without mask blur), or scalar or vector projection impulse response with mask blur (without resist blur).
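The idea of folding resist blur into the kernel, so that the blurred image is still a bilinear form in the nominal mask, can be illustrated with a small numerical check. This is a hedged sketch only: it covers a single resist plane (W=1) with no mask blur, works in one dimension on a periodic grid, and assumes the folded kernel is obtained by shifting both kernel arguments by the same displacement R and weighting by g(R), which is what convolving the intensity with g implies. It is not a transcription of Equation 45, and all names are hypothetical.

```python
import numpy as np

def bilinear_image(V, m):
    """I[x] = sum_{a,b} V[a,b] m[x-a] conj(m[x-b]) on a periodic 1D grid."""
    N = len(m)
    x = np.arange(N)
    shifted = m[(x[:, None] - x[None, :]) % N]      # shifted[x, a] = m[x - a]
    return np.einsum("xa,ab,xb->x", shifted, V, shifted.conj()).real

def resist_blurred_kernel(V, g):
    """V''[p,q] = sum_R g[R] V[p-R, q-R]: both arguments shift together."""
    V_out = np.zeros_like(V)
    for R in range(len(g)):
        V_out += g[R] * np.roll(V, shift=(R, R), axis=(0, 1))
    return V_out

rng = np.random.default_rng(3)
N = 16
V = np.zeros((N, N), dtype=complex)
for _ in range(3):                                   # toy Hermitian kernel
    h = rng.normal(size=N) + 1j * rng.normal(size=N)
    V += np.outer(h, h.conj())

offsets = np.fft.fftfreq(N, d=1.0 / N)               # signed offsets in FFT order
g = np.exp(-offsets ** 2 / 2.0)
g /= g.sum()                                         # normalized resist blur
m = (rng.random(N) > 0.5).astype(complex)            # binary mask

# Route 1: compute the optical image, then blur the continuous image.
I_optical = bilinear_image(V, m)
I_litho_direct = np.real(np.fft.ifft(np.fft.fft(g) * np.fft.fft(I_optical)))
# Route 2: blur the kernel once, then image the binary mask directly.
I_litho_kernel = bilinear_image(resist_blurred_kernel(V, g), m)
```

Route 2 moves the blur into a one-time precomputation, so the per-layout work is the same as for an unblurred kernel.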
Even though the generalized bilinear kernel V″ of Equation 45 would ordinarily make use of the generalized vector impulse response {right arrow over (h)}m of the lens (Equation 30), we have noted above that the impulse response {right arrow over (h)}m can instead be calculated using either the electric field term {right arrow over (Q)} of Equation 21 or the leading term {right arrow over (E)}0 of Equation 21 in order to obtain a generalized bilinear kernel involving either the aerial image or the scalar aerial image, respectively.
Optionally, mask blur, resist blur, depth averaging and focus averaging may be added later.
Transmission Cross-Correlation Coefficients (TCCs)
If the generalized bilinear kernel V″({right arrow over (r)}′,{right arrow over (r)}″) of Equation 45 is Fourier-transformed, we obtain a generalization of what are known as the transmission cross-correlation coefficients (TCCs):
where each capital-letter variable denotes the Fourier transform of the spatial domain quantity that is represented by the same symbol in lower-case. Variable H is an exception to this convention; we use H to denote the circular lens pupil P (e.g. a circular tophat function, equal to the Fourier transform of the scalar projection impulse response function) combined with the obliquity factor O, or more generally an aberrated scalar pupil combined with an obliquity factor and any defocus in the lens that may be present.
Application to Resist Blur Models
The generalized bilinear kernel V″ of Equation 45 includes the blurring in the optical exposure introduced by multiple reflections within the resist stack, the effect of finite resolution of the resist material itself, and (if averaged through depth) the effect of defocus within the resist layer, as well as the inherent resolution limits of a finite-NA lens imaging a vector field. However, it does not include dynamic effects that occur when the developer (which is the chemical used to develop the patterns of exposed resist) interacts with the exposed latent image to print the developed circuit feature, nor the similar effects that occur when the developed resist feature is transferred into an inorganic circuit film. These dynamic transfer effects are small, since the resist is strongly hard-limited, e.g. it is characterized by a power-law nonlinearity with the resist contrast, Γ, which is typically >10. Thus, resist development follows a constant threshold model to first order, and the goal that a given feature edge be printed at a specified target position is approximately met by requiring that the intensity at that position equal a reference intensity. This reference intensity is often chosen as the edge intensity of a particularly critical feature, since the development process is generally adjusted to print the most critical feature edge at nominal.
However, in reality resist contrast Γ cannot be infinite, so the printed feature will usually be biased slightly away from the nominal image contour. This can be accounted for as a bias, or equivalently (for small biases) as an effective change in the edge intensity. In some development models, such as the Brunner-Ferguson model, this bias (or intensity change) is modeled as a function of the intensity and intensity slope at the nominal edge position, denoted f(I,∂I/∂x). (Without loss of generality we consider the right edge of a horizontally oriented feature, so that image slope can be represented as a positive derivative along x.) If f is expanded in a series, we have for the effective intensity at the edge:
Ieff({right arrow over (r)})=I({right arrow over (r)})+f(1)[I({right arrow over (r)})−I({right arrow over (r)}ref)]+f(2)[İ({right arrow over (r)})−İ({right arrow over (r)}ref)], Equation 48
where İ is shorthand for ∂I/∂x, {right arrow over (r)}ref refers to the edge of the reference feature, and where f(1) and f(2) represent small correction coefficients.
Coefficients f(1) and f(2) can be determined analytically, or by fitting to empirical data. The effective intensity can equivalently be expressed in terms of different parameters c0, c1, and c2, as
Ieff({right arrow over (r)})=c0+I({right arrow over (r)})(1+c1)+c2İ({right arrow over (r)}) Equation 49
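For reference, the correspondence between the two parameterizations can be made explicit by expanding the series form and collecting terms. The identification below is a reader's check rather than part of the original text; it assumes Equation 48 is read as the intensity plus two bracketed correction terms, one in I and one in İ.

```latex
\begin{aligned}
I_{\mathrm{eff}}(\vec r)
 &= I(\vec r)
  + f^{(1)}\bigl[I(\vec r) - I(\vec r_{\mathrm{ref}})\bigr]
  + f^{(2)}\bigl[\dot I(\vec r) - \dot I(\vec r_{\mathrm{ref}})\bigr] \\
 &= \underbrace{-f^{(1)} I(\vec r_{\mathrm{ref}}) - f^{(2)} \dot I(\vec r_{\mathrm{ref}})}_{c_0}
  + \bigl(1 + \underbrace{f^{(1)}}_{c_1}\bigr)\, I(\vec r)
  + \underbrace{f^{(2)}}_{c_2}\, \dot I(\vec r)
\end{aligned}
```

Under this reading, c0 collects everything that depends only on the reference feature, so it is a constant for a given process, while c1 and c2 simply equal the two expansion coefficients.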
If the generalized bilinear kernel V for intensity I is obtained by the methods described above according to the present invention, it can immediately be modified to incorporate c0 and c1 by direct substitution. The kernel for İ can be obtained in several ways, for example by differentiating an eigenvector expansion of V. We are then able to calculate the effective intensity Ieff directly, rather than postcalculating it from separate calculations of intensity and intensity slope, thus speeding execution time. In accordance with the present invention, an effective generalized bilinear kernel is obtained as:
where L represents the number of retained eigenfunctions Ψ of the generalized bilinear kernel V″. {dot over (Ψ)} is the partial derivative of Ψ with respect to x. The effective generalized bilinear kernel Veff can then be diagonalized into a final set of eigenfunctions.
Other empirical terms can likewise be added to Veff. Given a general functional dependence of ΔI on the behavior of I in the vicinity of an edge, we can expand this dependence in a functional Taylor series, and obtain a lowest order linear term that takes the form of an integration of I with a first-order kernel Ω (i.e. the first order term of the functional Taylor series)
where we have included an explicit fitting constant c3 (which may also be absorbed into Ω). Ω may be determined from fits to CD data; for example, CD data that contains many different assist feature combinations. Regularizing terms can be added during the fit to ensure smooth and monotonic behavior in Ω. Once Ω has been determined, the fit can be updated against later data by simply readjusting the linear coefficient c3. We then have
where ΔVΩ is obtained by replacing the resist blur function g2D in the generalized bilinear kernel V″({right arrow over (r)}′,{right arrow over (r)}″) of Equation 45 with a corrected influence function gΩ given by:
(Experimental determinations of the influence function are likely to determine gΩ directly.) Note that the (1+c1) and c3 terms in Equation 52 can be combined as a total blur function
geff({right arrow over (r)})≡(1+c1)g2D({right arrow over (r)})+c3gΩ({right arrow over (r)}). Equation 54
In many cases gΩ will be a longer range function than g2D. Formally, we can thus regard gΩ as either a long distance component of the resist blur function, or alternatively as a separate (often density-like) influence function. If readjustment in c1 and c3 is not contemplated, it may be desirable to carry out a one-time fitting in which (by iteration) the eigenfunctions of Veff itself are made to serve in the c2 slope term of Equation 52. This is convenient if the tabulated kernels are to be input to pre-existing OPC software that does not support post-application of fitting kernels. On the other hand, if the long range character of gΩ forces an increased number of retained eigenvectors, one may prefer to leave the c3 term separate from the main image generalized bilinear kernel, V″, adding it to the intensity only in cases where the most accurate OPC calculations are needed (e.g. avoiding it when checking for print-through or basic image polarity, or when carrying out initial iterations of OPC).
Generalization of SOCS Decomposition
In accordance with the present invention, the generalized bilinear kernel may be used to generalize the SOCS decomposition. The dominant eigenfunctions of the generalized bilinear kernel operator V″, and their associated eigenvalues, are calculated (alternatively, the dominant eigenfunctions of the Fourier-domain operator T″ may be calculated and then Fourier transformed). The eigenfunctions Ψ and associated eigenvalues μ satisfy the equation
Fast Pre-Computation
Equation 55 may be solved for Ψ and μ using matrix eigendecomposition methods. Typically, only the largest 10 or 20 eigenvalues and associated eigenfunctions need be calculated. (We refer to these as the dominant eigenelements.) The eigenelements may be found by approximating the Equation 55 integral as a summation on a grid, with grid step of order 0.2λ/(NA(1+σMax)), with σMax denoting the tilt of the most obliquely incident illuminating direction used. We may also set a lower limit of, e.g. 0.4 on σMax, even in particular near-coherent exposure processes in which the maximum σ value actually employed is somewhat lower. Therefore, the generalized bilinear kernel V may be approximated as a matrix, with variables {right arrow over (r)}′ and {right arrow over (r)}″ stepped out along rows and columns respectively. (Though each of these variables is actually two-dimensional, their associated 2D grids are unwrapped into an unraveled linear grid along a matrix axis.)
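The matrix form of the decomposition can be sketched as follows. This is an illustrative toy under stated assumptions, not the patent's procedure: it assumes Equation 55 is the usual eigen-equation (the kernel integrated against Ψ returns μΨ), works in one dimension on a periodic grid, and uses a made-up Hopkins-like kernel built from tilted sinc responses. The helper `grid_step` encodes the grid-step rule of thumb quoted above, including the 0.4 floor on σMax; all function names are hypothetical.

```python
import numpy as np

def grid_step(wavelength, NA, sigma_max, sigma_floor=0.4):
    """Grid step of order 0.2*lambda/(NA*(1+sigma_max)), with a floor on sigma_max."""
    return 0.2 * wavelength / (NA * (1.0 + max(sigma_max, sigma_floor)))

def bilinear_image(V, m):
    """Reference image: I[x] = sum_{a,b} V[a,b] m[x-a] conj(m[x-b]) (periodic)."""
    N = len(m)
    x = np.arange(N)
    shifted = m[(x[:, None] - x[None, :]) % N]
    return np.einsum("xa,ab,xb->x", shifted, V, shifted.conj()).real

def socs_image(mu, psi, m, n_terms):
    """Sum of squared coherent convolutions using the n_terms dominant eigenelements."""
    M = np.fft.fft(m)
    image = np.zeros(len(m))
    for k in range(n_terms):
        field = np.fft.ifft(np.fft.fft(psi[:, k]) * M)   # psi_k convolved with m
        image += mu[k] * np.abs(field) ** 2
    return image

# Toy kernel: incoherent sum over source tilts of a sinc-shaped coherent response.
N = 24
offsets = np.fft.fftfreq(N, d=1.0 / N)                   # signed offsets, FFT order
tilts = np.linspace(-0.8, 0.8, 41)
V = np.zeros((N, N), dtype=complex)
for s in tilts:
    h = np.sinc(offsets / 3.0) * np.exp(2j * np.pi * s * offsets / 6.0)
    V += np.outer(h, h.conj()) / len(tilts)

mu, psi = np.linalg.eigh(V)                              # Hermitian eigendecomposition
order = np.argsort(mu)[::-1]                             # dominant eigenelements first
mu, psi = mu[order], psi[:, order]

rng = np.random.default_rng(1)
m = (rng.random(N) > 0.5).astype(complex)
I_direct = bilinear_image(V, m)
I_full = socs_image(mu, psi, m, N)                       # all terms: exact
I_trunc = socs_image(mu, psi, m, 8)                      # truncated approximation
```

Because the kernel is positive semidefinite, every retained term adds a non-negative contribution, so a truncated sum approaches the full image from below as more eigenelements are kept.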
Variable Grid Pre-Computation
Matrix size can be reduced without sacrificing accuracy by using a variable-grid algorithm, in which a coarser grid is used where the generalized bilinear kernel is small and/or slowly varying. Let the x and y grid sizes at the kth row or column be Δxk and Δyk. If we then calculate the eigenvectors Ψ′(xj, yj) of the modified discrete matrix equation
where the regridded matrix {tilde over (V)} is given by
{tilde over (V)}(xi, yi,xj, yj)≡V(xi, yi,xj, yj)√{square root over (ΔxiΔyiΔxjΔyj,)} Equation 57
then the desired eigenvectors Ψ(xj, yj) of the imaging kernel V will be given by
Ψ(xj, yj)=Ψ′(xj, yj)/√{square root over (ΔxjΔyj)}. Equation 58
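The effect of the square-root weighting in Equation 57 can be illustrated with a small numerical check. The sketch assumes that the desired eigenvectors are recovered by dividing the regridded eigenvectors by the square root of the local cell size, which is the rescaling that Equation 57 implies; it uses a one-dimensional grid (a single width Δ per cell in place of ΔxΔy) and a made-up smooth kernel. Names are hypothetical.

```python
import numpy as np

def toy_kernel(xa, xb):
    """Made-up smooth, real symmetric kernel used only for illustration."""
    diff = xa[:, None] - xb[None, :]
    envelope = np.exp(-(xa[:, None] ** 2 + xb[None, :] ** 2) / 8.0)
    return np.exp(-diff ** 2 / 2.0) * envelope

# Variable grid: fine cells near the center, coarse cells in the tails.
widths = np.concatenate([np.full(4, 0.8), np.full(12, 0.25), np.full(4, 0.8)])
edges = np.concatenate([[0.0], np.cumsum(widths)])
edges -= edges[-1] / 2.0
centers = 0.5 * (edges[:-1] + edges[1:])

V = toy_kernel(centers, centers)
sqrt_w = np.sqrt(widths)

# Regridded matrix: V scaled by sqrt(cell size) on both sides stays symmetric,
# so a standard symmetric eigensolver applies.
V_tilde = V * sqrt_w[:, None] * sqrt_w[None, :]
mu, psi_prime = np.linalg.eigh(V_tilde)

# Undo the scaling to recover samples of the kernel's eigenfunctions.
psi = psi_prime / sqrt_w[:, None]

# Check against the quadrature form of the integral equation:
# sum_j V[i,j] * width[j] * psi[j] = mu * psi[i]
lhs = V @ (widths[:, None] * psi)
rhs = psi * mu[None, :]
```

Weighting the kernel directly by the cell widths would give a non-symmetric matrix; the symmetric square-root split is what keeps efficient Hermitian eigensolvers usable on a non-uniform grid.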
Reduced Basis Pre-Computation
The dominant eigenvectors and eigenvalues (Equation 58) of the kernel matrix (Equation 57) can be obtained by standard methods, such as the Lanczos method, which can avoid calculation of most non-dominant eigenelements, improving efficiency.
Matrix size can be significantly reduced by remapping to a suitable reduced basis. Preferably, a suitable reduced basis may be obtained so that the eigenvectors of the original kernel matrix are approximated by the new reduced basis vectors, for example, reduced basis vectors from previously-calculated eigenvectors obtained under similar system parameters, or reduced basis vectors formed from a coarse-grid solution. Similarly, eigenanalysis can be performed on such a reduced-basis matrix in order to obtain a good starting solution for the iterative eigensolution algorithm. To do so, the eigenvectors of the reduced-basis matrix are converted back to the original grid basis. Then, the eigenvectors for the original kernel matrix may be iteratively refined, if necessary, by methods such as the Lanczos method.
The total element count in the kernel matrix (whether V, the generalized bilinear kernel in the spatial domain, or T, which is the generalized bilinear kernel in the Fourier domain) scales as the 4th power of the ROI, and the time required to calculate the eigenelements usually increases even more rapidly (for example, eigenanalysis computation time may scale as the 3/2 power of the element count). In cases where the ROI must be unusually large (for example, when flare or other interaction from distant features must be included, or when the effective resist blur function has a long tail), it is particularly desirable to minimize the precomputation involved in calculating the kernel and its eigenfunctions. This is also important when many different system models must be computed, e.g. in data fitting.
Application of Symmetry to Pre-Computation
Evaluation time of the generalized bilinear kernel, in accordance with the present invention, can be reduced by a factor of 2 if the Hermitian symmetry of the kernel is exploited, i.e. if one exploits
V″({right arrow over (r)}′,{right arrow over (r)}″)≡V″*({right arrow over (r)}″,{right arrow over (r)}′) Equation 59
to eliminate half the kernel evaluations.
In many cases it is acceptable to regard the ROI boundary as circular; if so, both kernel evaluation time and matrix size can be reduced by a factor (π/4)² by using a circular rather than a square domain (in each variable of the kernel).
Both kernel evaluation time and eigenanalysis time can usually be reduced substantially by exploiting system symmetry. For example, projection lenses have nominal rotational symmetry if residual imperfections are neglected (at least, the underlying design form has a nominal axis of rotational symmetry even when the physical lithographic lens itself uses an off-axis imaging field), and usually it is not desirable to include residual asymmetries when carrying out OPC correction, since aberration-specific correction would entail the use of tool-specific masks, and would also force single circuit elements to undergo multiple OPC solutions when repeated in different places across the chip. Defocus is in some cases an exception to this rule, i.e. defocus can be the one aberration which should be taken into account during OPC; however, defocus preserves rotational symmetry.
Even though the present invention can handle systems that are not symmetric (and can treat images which are averaged over a scan path by using a single averaged 4D kernel), precomputation is much faster when symmetry is exploited.
System symmetry also requires symmetry in the source. Residual source asymmetries, like residual lens asymmetries, are most often deliberately neglected during OPC; however, it is often the case that even the nominal source will inherently exhibit a lesser degree of symmetry than the lens. (For example, the source may have a dipole shape even when the lens is rotationally symmetric.) In virtually all cases, the (nominal) source shape will at least have bilateral symmetry about the x and y axes. In most cases the source will also be symmetric about the 45°,135° diagonals (since this provides matching performance in vertically and horizontally oriented features), the principal exception being dipole sources. In some cases the source shape will have rotational symmetry as well (e.g. disk or annular sources). These shape symmetries are preserved in the source as a whole when the source is unpolarized, and should preferably be maintained when source polarization is deliberately customized to a more complicated distribution. Any of these customary source symmetries will be shared by the nominal projection lens (even with defocus), as well as by the isotropic films in the resist stack, and also by the isotropic mask and resist blur functions; thus it is the source symmetry which usually determines the symmetry of the entire system.
Application of Bilateral Symmetry in Pre-Computation
In nearly all cases of interest for OPC the system will exhibit bilateral symmetry about the x and y axes. We will refer to this as “dipole symmetry”, since this symmetry is obeyed even by dipole sources (which are perhaps the least symmetric nominal sources used in lithography). Note that a source with dipole symmetry actually has a four-fold mirror symmetry, i.e.
V(x′, y′;x″, y″)≡V(−x′, y′;−x″, y″),
V(x′,y′;x″,y″)≡V(x′,−y′;x″,−y″). Equation 60
Equation 60 dictates a similar symmetry in the eigenfunctions of the generalized bilinear kernel V. To see this, note that if Ψ(x′, y′) is such an eigenfunction, then application of Equation 60 after direct substitution of e.g. Ψ(−x′, y′) into Equation 55 demonstrates that Ψ(−x′, y′) will also be an eigenfunction of V with the same eigenvalue. We thus see by forming the linear combinations [Ψ(x′, y′)±Ψ(−x′, y′)]/√{square root over (2)} that the eigenfunctions of V will be either odd or even in x (and likewise y), or, in the case of two-fold degeneracy in μ, that the associated eigenfunctions will be spanned by subspace eigenbasis functions that are odd or even only.
Let us then specify the odd or even symmetry of Ψ along x with the parameter ξx, so that ξx=−1 when Ψ has odd symmetry, and +1 with even symmetry. Similarly, we use parameter ξy to specify the y-symmetry of Ψ. By making the appropriate changes of variables in the Equation 55 integral, we can then remap the domain of integration to the positive x′, y′ quadrant:
The integration domain in Equation 61 has ¼ the area of that in Equation 55, and it need only be evaluated over ¼ as large a range in x″,y″ (namely 0<x″<ROI, 0<y″<ROI), since Ψ is determined by symmetry in the other (negative) quadrants (according to ξx,ξy). The matrix obtained by discretizing Equation 61 thus contains 1/16 as many elements as would be obtained by simply discretizing Equation 55, and the eigenelements of an Equation 61 matrix can be found far more rapidly (e.g. in 1/64 the time if eigensolution time scales as the 3/2 power of element count). However, Equation 61 must be solved four times (since there are four different combinations of ξx,ξy), so in a typical case the total eigensolution time may improve by about 16× when bilateral system symmetry is exploited in this way. Speed also improves because the generalized bilinear kernel V of the present invention need only be evaluated at ¼ as many points. Once the separate (dominant) eigensolutions are obtained for each combination of ξx and ξy, they are merged and re-sorted to obtain the dominant eigenelements of V. Note that different ξx,ξy combinations generally produce different eigenvalues.
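The folding described above can be illustrated with a small numerical sketch. It is not the implementation of the invention: the toy kernel, the grid (`quadrant_points`, with points offset by half a cell so that none lies on an axis) and all function names are assumptions introduced here for illustration. The sketch folds a real symmetric kernel with x and y mirror symmetry onto the positive quadrant for each (ξx, ξy) combination, solves the four small eigenproblems, and merges the results for comparison with a direct eigenanalysis of the unfolded matrix.

```python
import numpy as np

def kernel(p, q):
    """Toy real-symmetric kernel V(p; q) with mirror symmetry about x and y.
    p: (n, 2) and q: (m, 2) arrays of (x, y) points; returns an (n, m) matrix."""
    d2 = ((p[:, None, :] - q[None, :, :]) ** 2).sum(axis=-1)

    def window(r):
        # Anisotropic window: mirror-symmetric in x and y, but not under x<->y.
        return np.exp(-(r[:, 0] ** 2 / 4.0 + r[:, 1] ** 2))

    return np.exp(-0.5 * d2) * window(p)[:, None] * window(q)[None, :]

def quadrant_points(n, h):
    """n x n grid in the positive quadrant, offset by h/2 from both axes."""
    c = (np.arange(n) + 0.5) * h
    x, y = np.meshgrid(c, c, indexing="ij")
    return np.stack([x.ravel(), y.ravel()], axis=1)

def folded_kernel(quad, xi_x, xi_y):
    """Sum the kernel over the four mirror images of the integration point,
    weighting each image by the parity (xi_x, xi_y) assumed for Psi."""
    k = np.zeros((len(quad), len(quad)))
    for sx in (1, -1):
        for sy in (1, -1):
            sign = (xi_x if sx < 0 else 1) * (xi_y if sy < 0 else 1)
            k += sign * kernel(quad, quad * np.array([sx, sy]))
    return k

def folded_eigenvalues(quad):
    """Solve the four folded problems, then merge and re-sort (descending)."""
    parts = [np.linalg.eigvalsh(folded_kernel(quad, xi_x, xi_y))
             for xi_x in (1, -1) for xi_y in (1, -1)]
    return np.sort(np.concatenate(parts))[::-1]

def full_eigenvalues(quad):
    """Reference: eigenvalues of the unfolded kernel on all four quadrants."""
    pts = np.concatenate([quad * np.array([sx, sy])
                          for sx in (1, -1) for sy in (1, -1)])
    return np.sort(np.linalg.eigvalsh(kernel(pts, pts)))[::-1]

quad = quadrant_points(4, 0.5)
merged = folded_eigenvalues(quad)      # four 16 x 16 problems
reference = full_eigenvalues(quad)     # one 64 x 64 problem
print(np.max(np.abs(merged - reference)))
```

Each folded matrix here has 16 × 16 elements against 64 × 64 for the unfolded one, which is the 1/16 element count noted above, and the kernel is evaluated at only ¼ as many point pairs.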
Application of Quadrupole Symmetry to Pre-Computation
Besides having symmetry about the x and y axes, the source (and the projection system as a whole) will in most cases also have bilateral symmetry about the ±45° diagonals. We will refer to this as “quadrupole symmetry”, since it is obeyed by standard quadrupole source shapes; note that it implies an eight-fold mirror symmetry of the system about the boundaries of each octant. The generalized bilinear kernel for such systems will obey the relation
V(x′, y′;x″, y″)≡V(y′,x′;y″,x″) Equation 62
while continuing to obey the Hermitian and dipole symmetries of Equation 59 and Equation 60, respectively. We can exploit the Equation 62 symmetry in somewhat the same way as was done above with dipole symmetry, but the quadrupole case is more complicated. If with a quad-symmetric kernel we substitute the transpose Ψ(y′,x′) of an eigenfunction satisfying Equation 61 into the Equation 61 operator (i.e. we apply the folded dipole-symmetry kernel to the transposed [diagonally flipped] eigenfunction), then make a change of variables (swap) x′→y′, y′→x′ in the integral, and then apply Equation 62, we see that (with appropriate choice of ξx and ξy) Ψ(y′,x′) will be an eigenfunction of V with the same eigenvalue as Ψ(x′, y′). Moreover, if ξx=ξy, the transposed eigenfunction will have the same dipole symmetry as Ψ(x′, y′), i.e. it will satisfy Equation 61 under the same values of ξx and ξy. Thus, we first consider determination of the subset of quad eigenfunctions which obey the dipole symmetry ξ=+1 or −1, where ξ is defined by ξ≡ξx=ξy. This set of quad eigenfunctions is obtained by reducing Equation 61 to an integration over one octant:
where ξ45 can be +1 or −1. If Equation 63 is solved by reduction to a discrete grid (i.e. to matrix equations), its solution for given ξ entails eigenanalysis of two matrix choices (corresponding to the choices ξ45=+1 and ξ45=−1), each reduced in size 4× from the dipole symmetry case. If solution time scales as the 3/2 power of element count, this provides a net 4× speed improvement over the corresponding portion of the dipole eigenanalysis. 2× fewer kernel evaluations are needed.
The remaining quad-symmetry eigenfunctions are those associated with ξx=−ξy (i.e. they are solutions with what may be termed opposed or x-y-asymmetric dipole symmetry); these do not obey an octagonal folding of the kind exploited in Equation 63. This may be seen by substituting Ψ(y′,x′) into Equation 61 as above; we find that in this case the transposed Ψ is an eigenfunction with the same eigenvalue as Ψ(x′, y′), but with opposite dipole parameters (i.e. with ξx→ξy, ξy→ξx). Thus, to exploit quadrupole symmetry in calculating such eigenelements, we first solve a single opposed-symmetry combination using the dipole-symmetry folded kernel of Equation 61 (e.g. we choose ξx=+1,ξy=−1, and solve that case as a dipole). After doing so, the eigenfunctions for the remaining case (e.g. ξy=+1;ξx=−1) are recovered immediately by taking the transpose of the first set (e.g. Ψ(y′,x′)). This represents a net 2× speed improvement over the corresponding steps of the dipole solution. Likewise, only half as many kernel evaluations are needed if Equation 62 is exploited.
Though quadrupole symmetry implies that Ψ(y′,x′) will be an eigenfunction whenever Ψ(x′, y′) is an eigenfunction, this is a rather trivial result when ξx=ξy=±1, because in those cases the octagonal folding symmetry of Equation 63 indicates that Ψ(y′,x′) and Ψ(x′, y′) will actually be the same function (to within multiplication by −1). However, the eigenfunctions for the remaining two dipole cases (ξx=−ξy=+1 and ξx=−ξy=−1), though related by a simple transpose, are distinct functions (with the same eigenvalue); thus the (equal) eigenvalues obtained in these cases are two-fold degenerate. When ξx=−ξy, any linear combination of Ψ(y′,x′) and Ψ(x′, y′) will also be an eigenfunction having the same eigenvalue.
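The transpose relation between the two opposed-symmetry cases can be checked numerically. The following is a minimal sketch under stated assumptions: the toy kernel (symmetric under x and y mirrors and under the x↔y swap, but not under rotation), the half-cell-offset grid, and the names `kernel`, `folded_kernel`, `N` and `H` are all hypothetical and chosen only for illustration. It solves the (ξx=+1, ξy=−1) folded problem, transposes the dominant eigenfunction, and tests it against the (ξx=−1, ξy=+1) folded kernel.

```python
import numpy as np

N = 4    # grid points per axis in the positive quadrant (illustrative)
H = 0.5  # grid spacing (illustrative)

def kernel(p, q):
    """Toy real-symmetric kernel with x/y mirror symmetry and x<->y symmetry."""
    d2 = ((p[:, None, :] - q[None, :, :]) ** 2).sum(axis=-1)

    def window(r):
        # Invariant under mirrors and under swapping x and y, not under rotation.
        return np.exp(-(r[:, 0] ** 4 + r[:, 1] ** 4) / 8.0)

    return np.exp(-0.5 * d2) * window(p)[:, None] * window(q)[None, :]

def folded_kernel(quad, xi_x, xi_y):
    """Dipole-symmetry folding onto the positive quadrant for parity (xi_x, xi_y)."""
    k = np.zeros((len(quad), len(quad)))
    for sx in (1, -1):
        for sy in (1, -1):
            sign = (xi_x if sx < 0 else 1) * (xi_y if sy < 0 else 1)
            k += sign * kernel(quad, quad * np.array([sx, sy]))
    return k

c = (np.arange(N) + 0.5) * H
x, y = np.meshgrid(c, c, indexing="ij")
quad = np.stack([x.ravel(), y.ravel()], axis=1)   # flat index = ix * N + iy

# Solve only one opposed-symmetry case ...
mu_pm, vec_pm = np.linalg.eigh(folded_kernel(quad, +1, -1))

# ... and recover the other case by transposing the eigenfunction.
psi = vec_pm[:, -1]                       # dominant eigenfunction, (+1, -1) case
psi_t = psi.reshape(N, N).T.ravel()       # Psi(y', x')

k_mp = folded_kernel(quad, -1, +1)        # built here only to verify the claim
mu_mp = np.linalg.eigvalsh(k_mp)
residual = np.linalg.norm(k_mp @ psi_t - mu_pm[-1] * psi_t)
print(residual)
```

In a production setting the second folded kernel would not be built at all; it appears in the sketch only so that the equal eigenvalues and the transposed eigenfunction can be verified.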
Application of Rotational Symmetry in Pre-Computation
If the optical system is rotationally symmetric, we can find the eigenelements using a numerical version of the method disclosed in R. M. von Bunau, Y. C. Pati, and Y.-T. Wang, “Optimal coherent decompositions for radially symmetric optical systems,” J. Vac. Sci. Technol. B 15, no. 6 (1997), p. 2412, hereinafter referred to as Von Bunau, the contents of which are hereby incorporated by reference in their entirety. (Von Bunau also found analytic expressions for the eigenfunctions of the focused Hopkins kernel in the case of disk or annular sources.) In all likelihood the majority of lithographic sources in use today are disk-shaped or annular, and these sources will be circularly symmetric if unpolarized or tangentially polarized (or if treated as scalar). Rotational symmetry is preserved under defocus, and also in the presence of isotropic mask or resist blur, and likewise within isotropic layers of the resist film stack. Rotational symmetry is a continuous symmetry, and as shown below this allows us to reduce the dimensionality of the eigenproblem by 1, providing a considerable time saving (increasingly so as problem size increases). Moreover, since we are only interested in the dominant eigenelements, we can reduce the dimensionality from 4 to “2.5”, in the sense that decomposition of several 2D problems can replace decomposition of the symmetry-derived 3D kernel; in cases of practical interest the additional time savings is appreciable. Note also that with radial symmetry, all Fourier transforms are efficiently carried out using fast Hankel transforms.
To see how these savings are realized, we consider the generalized bilinear kernel V″({right arrow over (r)}′,{right arrow over (r)}″) of Equation 45 in the case of an unpolarized source. In that case mMax is 2, and the sum extends over uncorrelated images under x-polarized and y-polarized illumination; thus under the ideal lens model for the s polarization component of the electric field of Equation 10 and p component of the electric field of Equation 11, in accordance with the present invention, the generalized bilinear kernel V″ becomes:
where j({right arrow over (r)}) is the coherence function occurring in the Hopkins kernel of Equation 2, and where the projection impulse response function is
{right arrow over (h)}m({right arrow over (r)})≡∫∫d2{right arrow over (K)}{right arrow over (h)}m({right arrow over (K)})b({right arrow over (r)}−{right arrow over (K)}). Equation 65
We therefore form over one octant of {right arrow over (r)} the function
f({right arrow over (r)},s)≡j(2s)[{right arrow over (h)}x({right arrow over (r)}−s{circumflex over (x)}+{right arrow over (R)})·{right arrow over (h)}x*({right arrow over (r)}+s{circumflex over (x)}+{right arrow over (R)})+{right arrow over (h)}y({right arrow over (r)}−s{circumflex over (x)}+{right arrow over (R)})·{right arrow over (h)}y*({right arrow over (r)}+s{circumflex over (x)}+{right arrow over (R)})], Equation 66
and then fill in the remainder of the {right arrow over (r)} domain by symmetry. After convolving f with g2D by FFT, then, in accordance with the present invention, the symmetric generalized bilinear kernel V″ can be expressed as a 3D function in polar coordinates:
V″({right arrow over (r)}′,{right arrow over (r)}″)=V″(r′,θ′;r″,θ″)≡{tilde over (V)}″(r′,r″,θ′−θ″)=f(r′cos(θ)+r″,r′sin(θ),√{square root over (r′2+r″2−2r′r″cos(θ))}) Equation 67
where θ≡θ′−θ″.
If we then denote the mth order azimuthal Fourier component of {tilde over (V)}″ as vm, i.e.
we can reduce the eigenanalysis problem for a given value of m to a purely radial one (1D in each variable). When calculating those eigenfunctions of V″ that are associated with the mth azimuthal order, we find that the jth such eigenfunction is given by
where φ′j,m denotes the jth eigenfunction of the symmetrized 2D radial kernel obtained from vm, i.e.
The eigenvalue associated with Ψj,m is λj,m. Note that the eigenvalues are degenerate when m≠0; the eigenfunctions Ψj,m defined in Equation 69 for +m and −m then provide a pair that span the subspace. A reasonable value for m is about 10.
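The reduction to separate radial problems, one per azimuthal order, can be sketched on a discrete polar grid. This is an illustration under assumptions, not the method of Von Bunau or of the invention: the toy kernel depends only on the distance between the two points, the grid uses uniformly spaced angles, and the radial quadrature weights that a continuum treatment would carry (the symmetrization mentioned above) are omitted. The names `angular_blocks`, `full_matrix` and `eigenvalues_by_order` are hypothetical.

```python
import numpy as np

def angular_blocks(radii, n_theta, g):
    """blocks[k][i, j] = g(distance between (r_i, theta) and (r_j, 0)) for
    theta = 2*pi*k/n_theta, i.e. a kernel depending only on theta' - theta''."""
    theta = 2.0 * np.pi * np.arange(n_theta) / n_theta
    r1 = radii[None, :, None]
    r2 = radii[None, None, :]
    d2 = r1 ** 2 + r2 ** 2 - 2.0 * r1 * r2 * np.cos(theta)[:, None, None]
    return g(np.sqrt(np.maximum(d2, 0.0)))   # clamp tiny negative round-off

def full_matrix(blocks):
    """Assemble the block-circulant matrix over all (angle, radius) pairs."""
    n_theta, n_r, _ = blocks.shape
    m = np.empty((n_theta * n_r, n_theta * n_r))
    for a in range(n_theta):
        for b in range(n_theta):
            m[a * n_r:(a + 1) * n_r, b * n_r:(b + 1) * n_r] = blocks[(a - b) % n_theta]
    return m

def eigenvalues_by_order(blocks):
    """A discrete Fourier transform over the angle difference yields one small
    radial matrix v_m per azimuthal order m; each is decomposed separately."""
    v_m = np.fft.fft(blocks, axis=0)
    return [np.linalg.eigvalsh(v_m[m]) for m in range(blocks.shape[0])]

radii = np.linspace(0.25, 2.0, 6)
blocks = angular_blocks(radii, 16, lambda d: np.exp(-0.5 * d ** 2))

by_order = eigenvalues_by_order(blocks)             # sixteen 6 x 6 problems
merged = np.sort(np.concatenate(by_order))
reference = np.linalg.eigvalsh(full_matrix(blocks)) # one 96 x 96 problem
print(np.max(np.abs(merged - reference)))
```

On this grid the orders m and 16−m play the roles of +m and −m, and their eigenvalues coincide, which is the two-fold degeneracy noted above for m≠0.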
Application of the Generalized Bilinear Kernel to OPC
Once the eigenelements are calculated, the image intensity I({right arrow over (r)}) in accordance with the present invention may be calculated at OPC fragmentation points in much the same way as is done in conventional OPC methods. For example, this may be done by pre-storing a set of sector convolutions CΨ for each of the dominant eigenfunctions Ψ (for Manhattan geometries, this set need contain only a single such sector convolution, representing convolution with a 90° corner; however other geometries require additional convolutions for other corner angles). The image intensity at point {right arrow over (r)} is then calculated by a sum over M dominant eigenelements and L corners enclosed within the ROI:
where {right arrow over (r)}l is the position of the lth corner, and where δl is +1 or −1, depending on the order in which the corner occurs when tracing the perimeter of the polygon that contains it (see K. Lai et al., co-assigned U.S. patent application Ser. No. 10/694,466). The CΨ tables can then be used by MBOPC software (including software that is designed for use with the Hopkins model).
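A corner-table lookup of this kind can be sketched for a single Manhattan rectangle. The sketch assumes the usual SOCS form, in which the intensity is an eigenvalue-weighted sum of squared coherent amplitudes and each amplitude is a signed sum of table lookups at the polygon corners; the exact form of Equation 71 is not reproduced here. The integer grid, the stand-in kernels, the weights, and the names `corner_table`, `lookup`, `rectangle_corners` and `intensity` are all illustrative assumptions.

```python
import numpy as np

R = 6  # kernel half-width in grid units (a stand-in for the ROI)

def corner_table(phi):
    """Table C with C[ux + R, uy + R] = sum of phi over offsets a <= ux, b <= uy,
    i.e. the convolution of phi with a 90-degree corner (quadrant) shape."""
    return np.cumsum(np.cumsum(phi, axis=0), axis=1)

def lookup(table, ux, uy):
    """Read the corner table, handling offsets outside the kernel support."""
    if ux < -R or uy < -R:
        return 0.0                          # the quadrant misses the kernel
    return table[min(ux, R) + R, min(uy, R) + R]

def rectangle_corners(x0, y0, x1, y1):
    """Corners of the rectangle [x0, x1) x [y0, y1) with alternating signs."""
    return [((x0, y0), +1), ((x1, y0), -1), ((x1, y1), +1), ((x0, y1), -1)]

def intensity(point, corners, tables, weights):
    """Weighted sum over kernels of |signed sum of corner lookups|^2."""
    total = 0.0
    for mu, table in zip(weights, tables):
        amp = sum(d * lookup(table, point[0] - cx, point[1] - cy)
                  for (cx, cy), d in corners)
        total += mu * abs(amp) ** 2
    return total

def intensity_direct(point, rect, kernels, weights):
    """Reference: sum each kernel directly over every pixel of the rectangle."""
    x0, y0, x1, y1 = rect
    total = 0.0
    for mu, phi in zip(weights, kernels):
        amp = 0.0
        for px in range(x0, x1):
            for py in range(y0, y1):
                ax, ay = point[0] - px, point[1] - py
                if abs(ax) <= R and abs(ay) <= R:
                    amp += phi[ax + R, ay + R]
        total += mu * abs(amp) ** 2
    return total

# Two stand-in "eigenfunctions" sampled on offsets -R..R, with made-up weights.
a = np.arange(-R, R + 1)
gx, gy = np.meshgrid(a, a, indexing="ij")
kernels = [np.exp(-(gx ** 2 + gy ** 2) / 8.0),
           gx * np.exp(-(gx ** 2 + gy ** 2) / 8.0)]
weights = [1.0, 0.3]
tables = [corner_table(phi) for phi in kernels]

rect = (2, -3, 9, 4)
corners = rectangle_corners(*rect)
points = [(0, 0), (5, 1), (12, -8), (-4, 3), (20, 20)]
fast = [intensity(p, corners, tables, weights) for p in points]
slow = [intensity_direct(p, rect, kernels, weights) for p in points]
print(np.max(np.abs(np.array(fast) - np.array(slow))))
```

The cost of the table-based evaluation scales with the number of corners rather than with the polygon area, which is why pre-stored sector convolutions suit OPC fragmentation points.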
Equation 71 is easy to differentiate with respect to position {right arrow over (r)}, for example in a direction perpendicular to the polygon's edge, and one can prestore tables of the derivatives of the CΨ. This means, as discussed further below, that resist development models which involve image slope can also exploit the fast computation speed provided by the present invention.
At this point it is appropriate to consider some computational aspects of Equation 45 for the generalized bilinear kernel V″ and Equation 47 for the transmission cross-correlation coefficient T″. Their evaluation is complicated by the dependence of {right arrow over (h)}m on both {right arrow over (r)}′ and {right arrow over (k)}s. (Of course, this evaluation is only required during the pre-calculation steps of the invention; once the eigenfunctions of the resulting kernel are obtained, Equation 45 imposes no computational burden when the invention is actually applied to the mask shapes of interest.) As noted in the discussion surrounding Equation 27 regarding a polarized source, the dependence of {right arrow over (h)}m on {right arrow over (k)}s arises with sources that have customized (non-uniform) polarization. The best-known examples are probably annular sources in which the polarization is made tangential in order to image critical spatial frequencies (at high NA) in transverse electric field (TE) polarization. The changes in the ray polarization {right arrow over (E)} that are incurred between mask and wafer will depend only on the diffracted direction {right arrow over (k)}″, but the polarization that diffracts from the mask will also be dependent on the polarization of the illuminating ray, and hence on {right arrow over (k)}s. Let Xx and Xy denote the x′ and y′ polarization components of the illuminating ray (in the local coordinate system for source polarization described above), so that Xx and Xy provide a map of the polarization distribution of the source, which may be tailored or customized. Then, using the thin-mask approximation, the following expression for the p component of the ray polarization after diffraction from the mask is obtained:
{right arrow over (E)}p({circumflex over (κ)}, {right arrow over (k)}S)=A({right arrow over (κ)}+{right arrow over (κ)}S)S({right arrow over (k)}S)[Xx′({right arrow over (k)}S)Dp,x′({circumflex over (κ)}; {right arrow over (k)}S)+Xy′({right arrow over (k)}S)Dp,y′({circumflex over (κ)},{right arrow over (k)}S)]{circumflex over (p)}. Equation 72
(A similar expression holds for the s component.) Here A({circumflex over (κ)},{circumflex over (κ)}s) is a Fourier component of the mask pattern, given by Equation 13. We have chosen to break out the intensity S({right arrow over (k)}s) along the illuminating ray as a separate term, making Xx and Xy the components of a unit vector. The D coefficients represent the dependence of diffracted polarization on illuminating polarization. For example, under the generalized assumption of spatial invariance discussed above, we would take the diffracted polarization in the case of e.g. an x′-polarized source to be independent of illuminating ray direction. We indicate this {right arrow over (k)}s-independence by adding a superscript (H) (to denote pupil H rather than source S). If we further apply the specific model used above to derive Equation 10 and Equation 11, we have (see Equation 11)
(We have also absorbed the p unit vector into D(H), making it a vector quantity.)
Rapid evaluation of the integrals comprising the generalized bilinear kernel V″ of Equation 45 does not force the use of the simple spatial invariance model of Equation 73. More generally, we need only assume that Dp,x′ (along with Dp,y′,Ds,x′, and Ds,y′) can be written in separable form within each region of the source. For example if we divide the source into a total of J regions, each referenced by index j, then we require that
for each of the J source regions. (Either D(H) or D(S) may be a vector quantity.) Noting that both the lens parameters in Equation 25 and the film-stack parameters in Equation 26 are independent of illumination direction (and bearing in mind the simple geometrical correspondence between the various ray direction variables k″, k′ and {circumflex over (κ)}), we can collapse all dependence in {right arrow over (Q)} on ray direction into generalized D coefficients [which we denote {right arrow over ({tilde over (D)}(H)({right arrow over (κ)}″)] that are specific to the source polarization (i.e. to the specific x′ and y′ components along a given illuminating ray) and to the source region j. The projection impulse response function {right arrow over (h)}m of Equation 30 then becomes
where we have introduced impulse response components {right arrow over ({tilde over (h)} that do not depend on the illuminating ray direction (except through index j), and where we have combined the illumination-dependent terms D(S) and S into single factors denoted {tilde over (S)}.
The generalized bilinear kernel of Equation 45 becomes
If the dot products are expanded out, the integrals for the generalized bilinear kernel V″({right arrow over (r)}′,{right arrow over (r)}″) reduce to a sum of bilinear integral terms in {right arrow over ({tilde over (h)}. Several approaches are available to evaluate these bilinear terms. For example, one method first considers the preliminary case of no blurring from resist or mask (Equation 33). In the no-blur case, V is given by a sum of terms having the form
where subscripts e and f can refer to any terms in the sum, and where a coherence term J has been introduced to denote the result of the source integral, and where in the last line we have approximated the unblurred V term by a bilinear sum of its dominant eigenfunctions (which are denoted Θ).
When resist and/or mask blur is added, the expression for the generalized bilinear kernel V″ (Equation 76, where the double prime on V denotes the presence of blur) contains blurred versions of the same set of bilinear terms in {right arrow over ({tilde over (h)} that arise without blur (given in Equation 77). When blur is added, these terms take the form
Ve,f″=∫∫d2{right arrow over (R)}g2D(|{right arrow over (R)}|)∫∫d2{right arrow over (K)}′d2{right arrow over (K)}″Je,f({right arrow over (K)}′−{right arrow over (K)}″)b({right arrow over (r)}′+{right arrow over (K)}′)b({right arrow over (r)}″+{right arrow over (R)}−{right arrow over (K)}″){right arrow over ({tilde over (h)}e({right arrow over (K)}′)·{right arrow over ({tilde over (h)}f*({right arrow over (K)}″) Equation 78
Substituting from Equation 77, this becomes
The functions in the generalized bilinear kernel V″ of Equation 79 are smoothly varying and hence easy to integrate numerically in the spatial domain. The number of integrations involved is similar to that involved in the calculation of scalar TCCs in an ordinary Hopkins model (though of course many such e,f combinations must be evaluated in the most general case).
Another approach to calculating V″ is to use integration in the frequency domain (T″ of Equation 47), which can be easier than spatial-domain integration (except in the case of simple convolution) if one explicitly determines the boundary intersections of the pupil H and source S functions. The spatial domain kernels V″ can then be determined by inverse transform. Efficient integration bounds in the frequency domain are defined by the bandlimited system response, and the periodic repeats of the object structure that are implicitly generated by integration against a discrete grid in the frequency domain may mimic excluded mask structure (i.e. exterior to the ROI) in a more realistic way than does simple spatial domain truncation.
More preferably, another approach is to employ a mixed strategy. If the Equation 77 blur-free eigenfunctions Θn,e,f are Fourier transformed (we denote their transforms by {tilde over (Θ)}n,e,f), the transform of Equation 79 becomes
which after tabulation can be inverse transformed to obtain Ve,f″. The eigenfunctions may also be transformed. The advantage of this approach is that, as shown by the middle line of Equation 77, tabulation of the unblurred kernel V does not require repeated integral evaluations. (One-time Fourier transforms are required, but these precede the tabulation loops, and may be carried out quickly by Fast Fourier Transform (FFT).) Once the unblurred kernel is calculated and eigendecomposed, it can be converted rapidly to the frequency domain by FFT, where application of blur is again a simple matter of function evaluation (no integration). However, this procedure does require two eigenanalysis steps.
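The remark that blur becomes a matter of function evaluation in the frequency domain can be illustrated generically. The sketch below does not reproduce the transformed form of Equation 79; it only shows, for an assumed Gaussian blur of width `SIGMA` on a periodic grid and a made-up stand-in eigenfunction `theta`, that multiplying the FFT of a function by the analytically evaluated transfer function matches an explicit spatial-domain convolution sum. All names and parameter values are hypothetical.

```python
import numpy as np

N = 48        # grid points per axis (illustrative)
SIGMA = 2.0   # Gaussian blur width in grid units (assumed blur model)

# Stand-in for a blur-free eigenfunction sampled on the grid (made up).
n = np.arange(N) - N // 2
x, y = np.meshgrid(n, n, indexing="ij")
theta = (1.0 + 0.1 * x) * np.exp(-(x ** 2 + y ** 2) / 50.0)

def blur_by_function_evaluation(f):
    """Multiply the FFT of f by the Gaussian transfer function, which is
    evaluated directly at each grid frequency (no integration loop)."""
    k = np.fft.fftfreq(N)                      # frequencies in cycles per sample
    kx, ky = np.meshgrid(k, k, indexing="ij")
    transfer = np.exp(-2.0 * np.pi ** 2 * SIGMA ** 2 * (kx ** 2 + ky ** 2))
    return np.fft.ifft2(np.fft.fft2(f) * transfer).real

def blur_by_direct_sum(f):
    """Reference: explicit periodic convolution with the sampled Gaussian."""
    out = np.zeros_like(f)
    for dx in range(-N // 2, N // 2):
        for dy in range(-N // 2, N // 2):
            g = np.exp(-(dx ** 2 + dy ** 2) / (2.0 * SIGMA ** 2))
            g /= 2.0 * np.pi * SIGMA ** 2      # unit-area normalization
            out += g * np.roll(f, shift=(dx, dy), axis=(0, 1))
    return out

fast = blur_by_function_evaluation(theta)
slow = blur_by_direct_sum(theta)
print(np.max(np.abs(fast - slow)))
```

The two results agree only up to the small aliasing error of sampling the Gaussian on the grid, so the comparison is approximate rather than exact; the frequency-domain route replaces the double loop over shifts with a single pointwise multiplication.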
In summary, the present invention is directed to an efficient method and system for computing lithographic images that takes into account vector effects such as lens birefringence, resist stack effects and tailored source polarizations, and may also include blur effects of the mask and the resist. In accordance with the present invention, these effects are included by forming a generalized bilinear kernel, which can then be treated using a SOCS decomposition to allow rapid computation of an image that includes such non-scalar effects. Referring to
In a preferred embodiment of the present invention, the SOCS decomposition of the generalized bilinear kernel (Block 540) may be performed as illustrated in
A preferred embodiment for computing the image (Block 560) is illustrated in
The present invention has the advantage that non-scalar effects such as vector effects including tailored source polarization, lens birefringence, and resist stack polarization, as well as blur in the mask or resist, can be incorporated efficiently in image calculation using techniques such as SOCS decomposition. For example, the computation of the image (Block 550) can be performed efficiently by a method such as outlined in
The method of the present invention for computing an image of an integrated circuit design may be implemented by a computer program or software incorporating the process steps and instructions described above in otherwise conventional program code and stored on an electronic design automation (EDA) tool or an otherwise conventional program storage device. These instructions include providing an impulse response function for the projection lens (which may include resist stack effects and blur), the generation of a generalized bilinear kernel, taking into account vector effects, and performing a SOCS-like decomposition of the generalized bilinear kernel, using methods as described above. The instructions to perform the method of the present invention may be incorporated into program code to perform model-based OPC. As shown in
It will be appreciated by those skilled in the art that the method and system for performing the method in accordance with the present invention is not limited to the embodiments discussed above. Accordingly, the invention is intended to encompass all such alternatives, modifications and variations which fall within the scope and spirit of the invention and the following claims.
Number | Date | Country | |
---|---|---|---|
20050185159 A1 | Aug 2005 | US |