Disclosed embodiments relate to integrated circuits (ICs) including semiconductor fabrication, and more particularly to computational lithography for forming IC devices and IC devices therefrom.
Analog ICs generally need a high degree of precision in terms of local parametric matching as well as matching across the die. A modern patterning process which transfers a circuit design from a reticle (or mask) to a layer (e.g., polysilicon or metal) on a substrate surface (e.g., Si) using optical lithography undergoes multiple process steps such as lithography imaging and develop etching of a resist, followed by plasma and/or wet etching to form the features, and then chemical cleans for removal of residual resist polymers. All these process steps cumulatively impact the dimensions of devices, such as transistors and circuit elements including resistors and capacitors, and thus the IC parameters dependent thereon, depending on the pattern density in the reticle.
The workhorse to enable sub-wavelength lithography is referred to as computational lithography (CL). CL makes use of numerical simulations to improve the performance (resolution and contrast) provided by cutting-edge reticles. CL combines techniques including Resolution Enhancement Technology (RET) and Optical Proximity Correction (OPC), and some non-optical portions. Beyond the models used for RET and OPC, CL can include the signature of the scanner to help improve accuracy of the OPC model, polarization characteristics of the lens pupil, a Jones matrix of the stepper lens, optical parameters of the resist stack, and a model for diffusion through the resist.
Generally, processes such as for the transistor active area, gate electrode and metal processes are modeled by collecting empirical (inline) critical dimension (CD) data only after both the resist patterning and the etching process which together define the resulting structures. Heuristic threshold-based CL models are formed using convolution of an aerial image (AI) and Gaussian kernels representing photoacid diffusion in resists and other process-related effects by training them (fitting the model with “thresholds”) to the empirical data using statistical methods. Such threshold-based models have been used over several process nodes. In this form, the modeling accuracy is proportional to the number of sampling functions used, with a trade-off made between accuracy and run-time.
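As a non-limiting illustration of the threshold-based model form described above, the following minimal sketch blurs a one-dimensional intensity profile with a Gaussian kernel and extracts a CD at a constant threshold. The idealized top-hat profile is only a stand-in for a simulated aerial image, and the function names, grid size, sigma and threshold values are hypothetical choices made for this sketch, not values taken from this Disclosure.

```python
import numpy as np

def gaussian_kernel(sigma_nm, grid_nm):
    """Normalized 1D Gaussian kernel sampled on the simulation grid."""
    half = int(np.ceil(4 * sigma_nm / grid_nm))
    x = np.arange(-half, half + 1) * grid_nm
    k = np.exp(-0.5 * (x / sigma_nm) ** 2)
    return k / k.sum()

def width_above_threshold(signal, threshold, grid_nm):
    """Width of the single region where signal >= threshold.

    Edge positions are refined by linear interpolation between the last
    sample inside the region and the first sample outside it.
    Assumes the profile contains one feature only.
    """
    idx = np.flatnonzero(signal >= threshold)
    if idx.size == 0:
        return 0.0
    left, right = idx[0], idx[-1]

    def crossing(i_in, i_out):
        # Fraction of a grid step from the inside sample to the threshold.
        return (signal[i_in] - threshold) / (signal[i_in] - signal[i_out])

    width = (right - left) * grid_nm
    if left > 0:
        width += crossing(left, left - 1) * grid_nm
    if right < len(signal) - 1:
        width += crossing(right, right + 1) * grid_nm
    return float(width)

# Hypothetical stand-in for an aerial image: a bright top-hat on a 1 nm grid.
grid_nm = 1.0
x_nm = np.arange(-200, 201) * grid_nm
aerial_image = (np.abs(x_nm) <= 50).astype(float)

# Gaussian blur standing in for photoacid diffusion (sigma is illustrative).
blurred = np.convolve(aerial_image, gaussian_kernel(20.0, grid_nm), mode="same")

# The extracted CD depends on the chosen threshold.
cd_low = width_above_threshold(blurred, 0.3, grid_nm)
cd_mid = width_above_threshold(blurred, 0.5, grid_nm)
cd_high = width_above_threshold(blurred, 0.7, grid_nm)
print(cd_low, cd_mid, cd_high)
```

In such a form, the threshold and the kernel width are the quantities fitted to the empirical CD data.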
Disclosed embodiments recognize that the heuristic threshold-based models conventionally used for integrated circuit (IC) process levels, including transistor active area and gate electrode, lack accuracy. These models rely on Gaussian kernel coefficients generated from inline data collected only after both resist patterning and etch, so that resist patterning effects and etch effects are combined. A major weakness in such conventional threshold-based modeling is recognized by disclosed embodiments to lie in the calibration of the resist portion of the model, which relies solely on Gaussian kernel-based models.
It has been recognized that conventional kernel coefficients and output thresholds (either constant or variable) cannot accurately model the resist patterning processes. For instance, the process of “developing” resist is recognized herein to involve chemical “etching”, and the resulting resist loss is not accounted for in conventional threshold-based lithography models.
Algorithms disclosed herein calibrate computational lithography (CL) models individually to the individual resist patterning process step, and include a non-Gaussian developer etching kernel which represents the developer used for printing which can account for the process of chemical “etching” when developing resist, in addition to a Gaussian kernel. Disclosed developer etching kernels thus improve the accuracy of the resist model which models the resist patterning process. With disclosed algorithms, the process of modeling the resist patterning process and the etch process are separated, and since the resist model is carried into the etch model, a more accurate etch model is provided by disclosed resist models.
Reference will now be made to the accompanying drawings, which are not necessarily drawn to scale, wherein:
Example embodiments are described with reference to the drawings, wherein like reference numerals are used to designate similar or equivalent elements. Illustrated ordering of acts or events should not be considered as limiting, as some acts or events may occur in different order and/or concurrently with other acts or events. Furthermore, some illustrated acts or events may not be required to implement a methodology in accordance with this disclosure.
The RAFs include RAFs in a size range selected so that a lithography system (including a specific resist composition) used for printing prints some of the RAFs, and some of the RAFs do not print. The set of gratings can include gratings all having a constant pitch, with different pattern densities provided, and the size range can span from zero (nothing) to a size of the main features. As noted above, the features can be lines and/or spaces.
Step 102 comprises determining a plurality of resist kernels, using a computing device, from the post-develop resist CD data including a non-Gaussian developer etching kernel for representing an effect from the developer used for the printing, and a Gaussian kernel. The Gaussian kernel can include a representation for an effect of a base quencher to a photoacid generator in the resist.
The method can further comprise scaling/assigning relative weights to the post-develop resist CD data, wherein the determining can comprise minimizing a figure of merit (FOM) based on a standard deviation of a weighted residual error of the post-develop resist CD data. The post-develop resist CD data inherently includes the exposure variance.
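By way of a hedged example, the FOM described above may be sketched as follows. The function name, the sample CD values and the weights are hypothetical, and applying each weight as a multiplier on the residual is an assumption of this sketch, since the weighting scheme is not specified further here.

```python
import numpy as np

def weighted_residual_std(cd_measured, cd_simulated, weights):
    """FOM: standard deviation of the weighted residual CD error."""
    cd_measured = np.asarray(cd_measured, dtype=float)
    cd_simulated = np.asarray(cd_simulated, dtype=float)
    weights = np.asarray(weights, dtype=float)
    # Assumption: each weight scales the residual of its own measurement.
    weighted_residual = weights * (cd_simulated - cd_measured)
    return float(np.std(weighted_residual))

# Hypothetical post-develop resist CD data (nm) and relative weights.
measured = [100.0, 102.0, 98.0, 105.0]
simulated = [101.0, 101.0, 99.0, 103.0]
weights = [1.0, 1.0, 2.0, 0.5]

fom = weighted_residual_std(measured, simulated, weights)
print(fom)
```

A regression would adjust the kernel parameters to minimize this value. Note that a standard deviation is unchanged by a uniform offset in the weighted residual, so this FOM measures the spread of the error rather than its mean.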
The non-Gaussian developer etching kernel can be in the form of an Arrhenius relation. Equation 1 below describes a develop process, using a modified Arrhenius equation, as a function of pattern density:
exp(−Rate x (1−Pr)) (1)
where Pr is an exportable 2D-convolution object generated by convolving a disk kernel of radius r over a pattern, and Rate is a fitting parameter that is regressed empirically along with r. Alternatively, the 2D-convolution object representing Pr can be obtained from a Gaussian kernel where σ of the Gaussian kernel replaces r in the above convolving.
Step 103 comprises generating a resist model which provides a resist image contour from an aerial image (AI) contour and the plurality of resist kernels determined in step 102. Step 104 comprises collecting inline post-etch CD data after etching the layer, such as a layer comprising polysilicon, metal, or a dielectric material. Step 105 comprises generating an etch model which generates an etch contour from the resist image contour provided by the resist model and a plurality of etch kernels.
CL can be performed using the etch model to design a reticle for at least one level for fabricating an IC. The resist model may be represented in the following form, expressing the Resist Image Contour as a function (f) of several terms as shown below:
Resist Image Contour=f(AI*GAD+Mask*(modified Arrhenius equation)), or simplifying:
Resist Image Contour=Aerial Image (AI) Contour+Developer Bias Kernels
Where AI is the aerial image, GAD is a photoacid term obtained from Data type 2 described below, and Mask*(modified Arrhenius equation) represents developer loading. The term “developer bias kernels” in the simplified equation form shown above includes (i) a disclosed non-Gaussian developer etching kernel which represents a developer used for the printing and (ii) a conventional Gaussian kernel.
An example resist model calibration procedure is now provided. Thresholds are extracted using Data Type 1 defined as lines (or spaces) of varying spacing. Structures to generate Data type 1 are known. Regression is used to determine the developer bias kernels using Data Type 2.
Data type 2 is obtained from disclosed test structures having a set of gratings (lines or spaces) having main features and RAFs in proximity to the main features, wherein the RAFs include RAFs in a size range selected so that a lithography system used for the printing prints some of the RAFs, and does not print others of the RAFs. See
The final thresholds (Gaussian Diffusion Kernel(s)) are then fine-tuned using Data Type 1 or the entire data-set (Data type 1 and Data type 2). For example, a Transmission Cross Coefficient (TCC) matrix where the AI is represented as a Bessel function may be used for fine-tuning to determine the final thresholds.
Data is collected using Data Type 2 after etch of the layer exposed by the resist pattern, such as after a plasma (or wet) etch. An example etch model has the following form, where the etch contour is a function of the resist image contour described above (AI Contour+Developer Bias Kernels):
ETCH Contour=f(Resist Image Contour+Mask*EtchKernels), or simplifying:
ETCH Contour=Resist Image Contour+Etch Bias Kernels
The etch (or “etch bias”) kernel(s) can be represented as conventional Gaussian kernel(s), and/or include one or more non-Gaussian etch kernels. Non-Gaussian etch kernels may be determined analogously to the non-Gaussian developer etching kernel as described above.
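The staged structure of the simplified resist and etch forms above can be illustrated with the following sketch, in which the developer bias is first fitted to post-develop CD data and then frozen before the etch bias is fitted to post-etch CD data. Representing each bias as a linear function of (1 − pattern density) is purely a hypothetical stand-in for the disclosed kernels, and all names and numerical values are synthetic and chosen only for illustration.

```python
import numpy as np

def fit_bias(one_minus_density, bias_observed):
    """Least-squares fit of bias = c0 + c1 * (1 - density)."""
    design = np.column_stack([np.ones_like(one_minus_density), one_minus_density])
    coeffs, *_ = np.linalg.lstsq(design, bias_observed, rcond=None)
    return coeffs

def predict_bias(coeffs, one_minus_density):
    """Evaluate the fitted linear bias model."""
    return coeffs[0] + coeffs[1] * one_minus_density

# Synthetic test structures: pattern density and AI contour CD (nm).
density = np.array([0.1, 0.25, 0.4, 0.5, 0.6, 0.75, 0.9])
u = 1.0 - density
ai_cd = np.full_like(density, 100.0)

# Synthetic "inline" data generated from known biases, for demonstration only.
resist_cd_measured = ai_cd + (-2.0 + 6.0 * u)              # post-develop
etch_cd_measured = resist_cd_measured + (1.5 - 4.0 * u)    # post-etch

# Stage 1: calibrate the resist model against post-develop CD data only.
resist_coeffs = fit_bias(u, resist_cd_measured - ai_cd)
resist_cd_model = ai_cd + predict_bias(resist_coeffs, u)

# Stage 2: carry the frozen resist model into the etch model and fit
# the etch bias against post-etch CD data.
etch_coeffs = fit_bias(u, etch_cd_measured - resist_cd_model)
etch_cd_model = resist_cd_model + predict_bias(etch_coeffs, u)

print(resist_coeffs, etch_coeffs)
```

Because the resist stage is fitted separately, any error in the etch stage is not absorbed into the resist parameters, which is the motivation for separating the two calibrations.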
Disclosed embodiments are further illustrated by the following specific Examples, which should not be construed as limiting the scope or content of this Disclosure in any way.
As described below, the CL simulator comprising an aerial image convolved with Gaussian diffusion was enhanced with the addition of a base quencher term to the acid diffusion and of developer loading kernels. The relative importance of these kernels was demonstrated by model regression with these kernels against one data-set and validating the result against an independent data-set. The presence of the bulk loading kernel was determined to be significant in not only lowering the simulation FOM, but also in resulting model predictability that was valid beyond the region of the collected empirical (inline) data.
This Example utilized an independent data-set to extract the resist parameters to simplify the procedure while improving the accuracy and portability of the generated resist model. Whenever possible, the goal was to minimize changes to the existing model form to facilitate use in existing correction algorithms with no or minimal change to provide portability. Besides introducing developer loading parameters, the acid-quencher diffusion model was also considered to replace a conventional straight Gaussian diffusion kernel. As the sampling data for both these effects is complementary and the procedure for the simulator to extract all these parameters is purely statistical, it is possible and helpful to simultaneously extract both of these terms.
A set of gratings comprising lines were generated as shown in
Similarly, an inverse of this module (features being spaces) was also created as shown in
Regarding kernel formation, the threshold based model form was enhanced to account for chemically amplified resists enhanced with base quencher and developer loading to improve simulation accuracy. The optical model was represented with the minimum number of kernels required to describe convolution of an AI over a pattern using the Synopsys PROGEN package (Synopsys Corp, Mountain View, Calif.). The reduction in fitting error, where necessary, was achieved with the addition of kernels tuned specifically to a free parameter representing a process effect. In this manner, the deterministic form of the model was retained which opened the possibility to make the model portable.
The quencher influences the photoacid concentration both during formation (exposure) and diffusion during post-exposure bake. The possibility of acid volatility and its re-deposition (chemical flare) was not considered independently. However, if this were indeed occurring in any significant manner, then its effect would be lumped into the density term. The diffusion process during post-exposure bake would generally be more significant for proximity modeling since it would be a longer range effect. Furthermore, diffusion has no measurable impact on photoacid generation kinetics, and therefore would maintain the simplicity of the threshold based model form. The Gaussian kernel shown as (f(x)) below was modified by subtracting a constant (truncation of diffusion length due to quencher (“quencher”)) from the kernel as shown in Equation 2 below, where the integral is evaluated from −∞ to ∞.
∫f(x)dx−quencher (2)
In the Equation 2 representation, the quencher term changes the blurring of the AI in a manner that reduces the concentration (amplitude) and diffusion length (ambit). Mathematically, this approximation describes the long range distribution of a low level base concentration. Alternatively, an additional long range Gaussian kernel could be used to represent the long range distribution of a low level base concentration. However, since its impact was identical, it was dropped because it increased run time.
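The effect of the quencher constant on amplitude and ambit can be visualized with the following minimal sketch. Clipping the result at zero after subtracting the constant is an assumption of this sketch (the text above states only that a constant is subtracted), and the sigma, quencher value and function name are hypothetical.

```python
import numpy as np

def quenched_gaussian(x_nm, sigma_nm, quencher):
    """Normalized Gaussian with a constant subtracted.

    Assumption: negative values are clipped to zero, which truncates
    the kernel where the Gaussian falls below the quencher level.
    """
    g = np.exp(-0.5 * (x_nm / sigma_nm) ** 2) / (sigma_nm * np.sqrt(2.0 * np.pi))
    return np.clip(g - quencher, 0.0, None)

x = np.linspace(-200.0, 200.0, 4001)   # 0.1 nm grid
dx = x[1] - x[0]
sigma = 30.0                           # illustrative diffusion length (nm)

plain = quenched_gaussian(x, sigma, 0.0)
quenched = quenched_gaussian(x, sigma, 0.002)

# Amplitude (peak), integrated weight, and ambit (half-width of support).
area_plain = plain.sum() * dx
area_quenched = quenched.sum() * dx
ambit_quenched = x[quenched > 0].max()

print(plain.max(), quenched.max())
print(area_plain, area_quenched, ambit_quenched)
```

Under this assumption the truncated kernel has both a lower peak and a finite support, consistent with the reduced amplitude and ambit described above.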
The relative comparison of both these kernels is shown in
The other kernel introduced was to replicate the effect of developer with varying pattern density. Physically, the effect of developer represents removal of the resin that has been modified by the photoacid, by a base. This effect is known to be a function of resist and developer chemistry, and physical conditions such as temperature and time. For purpose of computational lithography it is generally sufficient to capture only the final state of the resist pattern as a function of density. Ideally, this could be represented by modeling it as the process of resist removal such as in an etching process. However, it can be sufficient to investigate this simply as a problem of threshold modification. The threshold modification was considered by superposition of independent effects, in this case developer loading, to the optical model form. This concept can also be extended to other density effects such as etch loading or substrate (e.g., Si) loss during surface cleaning by chemical or other (e.g., thermal) means.
A disk or cylindrical kernel was chosen to detect pattern density. Convolving a disk kernel with pattern data removes those portions of the kernel where no pattern exists or vice versa. This is a simplified way to model a variation in feature size as a function of pattern density.
Equation 1 (disclosed above, copied again below) describes the develop process, using a modified Arrhenius equation, as a function of pattern density:
exp(−Rate x (1−Pr)) (1)
where Pr is an exportable 2D-convolution object generated by convolving a disk of radius r over a pattern, and Rate is a fitting parameter that is regressed empirically along with r. However, as described above, the 2D-convolution object representing Pr can alternatively be obtained from a Gaussian kernel where σ of the Gaussian kernel replaces r in the convolving. Two kernels were used, one with a large radius representing the bulk effect of developer and one with a short radius representing a reduced (or enhanced) rate for tight (closely spaced) features due either to localized developer depletion and/or to surface tension/capillary action for the resist/developer system.
As expected, the short-range kernel would be mapping local (micro-loading) effects such as surface tension while the longer range kernel would represent the bulk loading effects. One could add additional kernels depending on the pattern interaction range of available empirical data.
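As a non-limiting sketch of the density kernels described above, the following example convolves a normalized disk kernel over a binary pattern to obtain Pr and then evaluates exp(−Rate x (1−Pr)) for a short-range and a long-range radius. The grating layout, the radii, the Rate values, the periodic boundary handling and the function names are all hypothetical choices for this sketch. The two terms are evaluated separately because the manner of combining them is not specified here.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def disk_kernel(radius_px):
    """Normalized disk kernel; weights sum to 1 so Pr lies in [0, 1]."""
    r = int(radius_px)
    y, x = np.mgrid[-r:r + 1, -r:r + 1]
    k = (x * x + y * y <= r * r).astype(float)
    return k / k.sum()

def pattern_density(mask, radius_px):
    """Pr: disk kernel convolved over the pattern.

    Assumption: periodic ("wrap") padding, which suits repeating gratings.
    The disk is symmetric, so correlation and convolution coincide.
    """
    k = disk_kernel(radius_px)
    r = k.shape[0] // 2
    padded = np.pad(mask.astype(float), r, mode="wrap")
    windows = sliding_window_view(padded, k.shape)
    return np.einsum("ijkl,kl->ij", windows, k)

def developer_term(mask, radius_px, rate):
    """Modified Arrhenius form of Equation 1: exp(-Rate * (1 - Pr))."""
    return np.exp(-rate * (1.0 - pattern_density(mask, radius_px)))

# Hypothetical grating: 4-pixel lines on an 8-pixel pitch (50% density).
mask = np.zeros((64, 64))
mask[:, (np.arange(64) % 8) < 4] = 1.0

# Short-range (micro-loading) and long-range (bulk loading) terms.
short_range = developer_term(mask, radius_px=2, rate=0.5)
long_range = developer_term(mask, radius_px=16, rate=0.1)

print(pattern_density(mask, 16).mean())
print(short_range.min(), short_range.max())
print(long_range.min(), long_range.max())
```

In a calibration, Rate and r for each kernel would be regressed against the inline CD data rather than fixed as in this sketch.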
If a precise and transportable resist model were available, then the process of generating analog components could be automated. Besides the clear advantages in availability of a precise and compact CL model, such a CL model could also form the foundation for etch correction using staged etch models and for model-based validation work.
Disclosed models were regressed using inline data collected from a set of structures described in
Since the goal of the calibration process was to accurately extract physical constants to form a deterministic model, rather than to minimize fitting error, minimizing the standard deviation of residual error was chosen as the FOM to avoid convergence to points with the highest residual error. To test this hypothesis and the fit of the extracted parameters, the models were validated on an independent data set using the test structures obtained using the grating shown in
Data from the validation pattern was not used in any form during the calculation of empirical error. The experimental matrix with the included parameters regressed in this Example was compiled in Table 1 shown below. No attempt was made to change the imaging parameters. Instead, as and when needed, additional developer loading kernels were added.
Table 1 shows the design of experiment matrix with the relative importance of the additional kernels (parameters) to the model, including a surface loading kernel and a bulk loading kernel. The FOM decreased significantly from 3.75 nm for a model with only a conventional Gaussian diffusion kernel (simulation 1) to 2.29 nm with the addition of a disclosed bulk loading kernel (simulation 4). While this in itself was a significant improvement in fitting, the validation results were of even more significance. Here, the residual fitting error was plotted with the asymmetric pitch.
With the addition of a disclosed bulk load kernel (simulation 4), the fitting error for the same pitch was reduced by more than ⅔ from 120 nm to less than 40 nm as shown in
All subsequent simulations included a disclosed bulk developer loading kernel in the CL model form. To that, the surface developer loading kernel was added in simulation 5. With the addition of the surface developer loading kernel, the FOM improved from 2.29 nm to 2.19 nm. Moreover, the model validation demonstrated an acceptable fit for all points, with the residual fitting error for the tightest pitch being reduced from ~40 nm to 4 nm, as can be seen in
This model was further enhanced by the addition of the quencher term in simulation 6. While the FOM showed further improvement, most fitting errors in the validation suite were now within 2 nm as shown in
Disclosed embodiments can be used for a variety of lithography systems to form semiconductor devices that may include various elements therein and/or layers thereon, including barrier layers, dielectric layers, device structures, active elements and passive elements including source regions, drain regions, bit lines, bases, emitters, collectors, conductive lines, conductive vias, etc. Those skilled in the art to which this disclosure relates will appreciate that many other embodiments and variations of embodiments are possible within the scope of the claimed invention, and further additions, deletions, substitutions and modifications may be made to the described embodiments without departing from the scope of this disclosure.
This application claims the benefit of Provisional Application Ser. No. 61/614,962 entitled “SYSTEM AND METHOD TO CALIBRATE MULTIPLE DENSITY KERNELS TO BE USED FOR OPC”, filed Mar. 23, 2012, which is herein incorporated by reference in its entirety.
Number | Date | Country
---|---|---
61614962 | Mar 2012 | US