The disclosure generally relates to the field of wafer surface metrology, and particularly to a system and a method for the prediction of in-plane distortions (IPD) introduced by the wafer shape in the semiconductor wafer chucking process.
Thin polished plates such as silicon wafers and the like are a very important part of modern technology. A wafer, for instance, may refer to a thin slice of semiconductor material used in the fabrication of integrated circuits and other devices. Other examples of thin polished plates may include magnetic disc substrates, gauge blocks and the like. While the technique described here refers mainly to wafers, it is to be understood that the technique also is applicable to other types of polished plates as well. The term wafer and the term thin polished plate may be used interchangeably in the present disclosure.
Generally, certain requirements are established for the flatness and thickness uniformity of the wafers. However, chucking of substrates with wafer shape (defined as the median surface of the wafer in its free state obtained from the front and back surfaces of the wafer) and thickness variations results in elastic deformation that can cause significant in-plane distortions (IPD). IPD may lead to errors in downstream applications such as overlay errors in lithographic patterning or the like. Therefore, providing the ability to predict/estimate IPD due to wafer shape in the chucking process, and thus to control the wafer shape specification, is a vital part of the semiconductor manufacturing process.
The development and usage of a finite element (FE) model based IPD prediction is described in: Predicting distortions and overlay errors due to wafer deformation during chucking on lithography scanners, Kevin Turner et al., Journal of Micro/Nanolithography, MEMS, and MOEMS, 8(4), 043015 (October-December 2009), which is herein incorporated by reference in its entirety. The FE model based IPD prediction utilizes full-scale 3-D wafer and chuck geometry information and simulates the non-linear contact mechanics of the wafer chucking mechanism, allowing the FE model to provide the most accurate prediction of IPD of the wafer surface. The FE model is developed and executed through a simulation-driven product development tool such as the ANSYS software package from ANSYS, Inc. However, FE model based IPD prediction is computationally expensive and may be complicated to set up, and therefore is not suitable for use in a high volume manufacturing environment.
Wafer higher order shape (HOS) information extracted using wafer geometry tools, such as WaferSight from KLA-Tencor, can also be utilized to provide IPD prediction. For instance, wafer shape and HOS information may be used to simulate wafer chucking and predict its IPD. However, studies have shown that while HOS based IPD prediction may provide acceptable results for medium warp wafers, the accuracy of the IPD prediction degrades as the degree of wafer warp increases. The accuracy of HOS based IPD prediction degrades primarily because large 2nd order shape of the wafer (e.g., bowl, dome, saddle and the like) contributes to IPD during wafer chucking in a way that is not completely represented by just the local higher order wafer slope. HOS, which is a local higher order slope based metric, is unable to adequately capture the IPD coma components (i.e., IPD distribution contours which closely resemble contours of coma components of Zernike polynomials) produced by large 2nd order shape and other lower order high magnitude shape components.
Therein lies a need for systems and methods for accurate and efficient prediction of in-plane distortions due to semiconductor wafer shape in the chucking process without the aforementioned shortcomings.
The present disclosure is directed to a computer implemented method for providing in-plane distortion (IPD) prediction. The method includes: generating a series of Zernike basis wafer shapes; performing finite element (FE) model based IPD prediction for the series of Zernike basis wafer shapes; performing higher order shape (HOS) based IPD prediction for the series of Zernike basis wafer shapes; comparing the FE model based IPD prediction and the HOS based IPD prediction of a particular Zernike basis wafer shape of the series of Zernike basis wafer shapes to determine whether said particular Zernike basis wafer shape produces large prediction differences between the FE model based IPD prediction and the HOS based IPD prediction; storing the Zernike basis wafer shapes that produce large prediction differences between the FE model based IPD prediction and the HOS based IPD prediction; and providing a HOS based IPD prediction for a given wafer utilizing the stored Zernike basis wafer shapes.
The method described above may be utilized for overlay error prediction. The method may include: generating a series of Zernike basis wafer shapes; performing finite element (FE) model based IPD prediction for the series of Zernike basis wafer shapes; performing higher order shape (HOS) based IPD prediction for the series of Zernike basis wafer shapes; comparing the FE model based IPD prediction and the HOS based IPD prediction of a particular Zernike basis wafer shape of the series of Zernike basis wafer shapes to determine whether said particular Zernike basis wafer shape produces large prediction differences between the FE model based IPD prediction and the HOS based IPD prediction; storing the Zernike basis wafer shapes that produce large prediction differences between the FE model based IPD prediction and the HOS based IPD prediction; performing a first HOS based IPD prediction for a given wafer; improving accuracy of the first HOS based IPD prediction result utilizing the stored Zernike basis wafer shapes; performing a second HOS based IPD prediction for the given wafer after a wafer patterning process; improving accuracy of the second HOS based IPD prediction result utilizing the stored Zernike basis wafer shapes; calculating differences between the IPD for the given wafer predicted before the wafer patterning process and the IPD for the given wafer predicted after the wafer patterning process; and applying a linear scanner correction routine to the IPD differences to obtain the overlay error prediction.
A further embodiment of the present disclosure is directed to a system for providing in-plane distortion (IPD) prediction for a given wafer. The system may include an optical system configured for obtaining a wafer shape of the given wafer and an IPD prediction module in communication with the optical system. The IPD prediction module may be configured for: generating a series of Zernike basis wafer shapes; performing finite element (FE) model based IPD prediction for the series of Zernike basis wafer shapes; performing higher order shape (HOS) based IPD prediction for the series of Zernike basis wafer shapes; comparing the FE model based IPD prediction and the HOS based IPD prediction of a particular Zernike basis wafer shape of the series of Zernike basis wafer shapes to determine whether said particular Zernike basis wafer shape produces large prediction differences between the FE model based IPD prediction and the HOS based IPD prediction; storing the Zernike basis wafer shapes that produce large prediction differences between the FE model based IPD prediction and the HOS based IPD prediction; and providing a HOS based IPD prediction for the given wafer utilizing the stored Zernike basis wafer shapes.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not necessarily restrictive of the present disclosure. The accompanying drawings, which are incorporated in and constitute a part of the specification, illustrate subject matter of the disclosure. Together, the descriptions and the drawings serve to explain the principles of the disclosure.
The numerous advantages of the disclosure may be better understood by those skilled in the art by reference to the accompanying figures in which:
Reference will now be made in detail to the subject matter disclosed, which is illustrated in the accompanying drawings.
The present disclosure is directed to systems and methods for prediction of in-plane distortions (IPD) due to wafer shape in the semiconductor wafer chucking process. In accordance with one embodiment of the present disclosure, a process that combines analytical and empirical methods is utilized to emulate the non-linear finite element (FE) contact mechanics model based IPD prediction. The emulated FE model based prediction process (which may be referred to as the EFE process) is substantially more efficient than, and provides accuracy comparable to, the FE model based IPD prediction, which utilizes full-scale 3-D wafer and chuck geometry information and requires computation intensive simulations.
The purpose of using the EFE process in accordance with the present disclosure is to generate IPD signatures that are similar to the IPD coma components that would be observed in FE predictions. More specifically, suppose for each initial shape, there exists a certain shape, referred to as the interim shape, whose local slopes (x- and y-slopes) cause the IPD signatures to change similar to the IPD coma components that would be observed in FE predictions. Under this postulate, for any given initial shape, once its corresponding interim shape is determined, its IPD coma components can be captured as well, allowing the EFE process to provide relatively accurate emulation of FE IPD predictions for that given initial shape.
Now the question is how to determine the interim shape for a given initial wafer shape.
In one embodiment, step 202 is processed for a large number of initial sample wafer shapes created from 2nd order polynomial equations. For example, a polynomial equation such as $Z_0(x,y)=b_1+b_2x^2+b_3xy+b_4y^2$ may be utilized for creating various sample wafer shapes. More specifically, by varying the coefficients $b_1$ through $b_4$, a variety of input shapes can be created, including bow up, bow down, saddle shapes and the like.
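The sample-shape generation described above can be sketched as follows. This is a minimal illustration, not the disclosed implementation: the wafer radius, grid density, coefficient magnitudes and the names `second_order_shape` and `wafer_grid` are all assumptions, as is the sign convention that labels positive curvature a bowl.

```python
import numpy as np

def second_order_shape(b, x, y):
    """Evaluate Z0(x, y) = b1 + b2*x^2 + b3*x*y + b4*y^2 at the given points."""
    b1, b2, b3, b4 = b
    return b1 + b2 * x**2 + b3 * x * y + b4 * y**2

def wafer_grid(radius=150.0, n=61):
    """Return x, y coordinates of grid points inside a circular wafer.

    The 150 mm radius and 61 x 61 grid are illustrative assumptions.
    """
    axis = np.linspace(-radius, radius, n)
    xx, yy = np.meshgrid(axis, axis)
    inside = xx**2 + yy**2 <= radius**2
    return xx[inside], yy[inside]

x, y = wafer_grid()

# Hypothetical coefficient sets (b1, b2, b3, b4); the sign convention
# (positive curvature = bowl) is an assumption made for this sketch.
samples = {
    "bowl":   (0.0,  2e-6, 0.0,  2e-6),
    "dome":   (0.0, -2e-6, 0.0, -2e-6),
    "saddle": (0.0,  2e-6, 0.0, -2e-6),
}
shapes = {name: second_order_shape(b, x, y) for name, b in samples.items()}
```

In practice the coefficient sets would be swept over a much larger range so that the training data covers the wafer shapes expected in production.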
As described above, the FE model based IPD prediction process is performed for each of the initial wafer shapes created. For instance, as shown in
Step 204 then integrates the FE model based prediction results to derive the interim shape for each initial wafer shape. In one embodiment, a pair of polynomial equations is utilized to describe the X-IPD 402 divided by a neutral surface factor and the Y-IPD 404 divided by the neutral surface factor. The development and usage of shape-slope residual metric is described in: Overlay and Semiconductor Process Control Using a Wafer Geometry Metric, P. Vukkadala et al., U.S. patent application Ser. No. 13/476,328, which is herein incorporated by reference in its entirety. More specifically, the X-IPD 402, divided by the neutral surface factor, may be expressed as:

$$\frac{\partial Z}{\partial x} = 2a_1x + a_2y + 4a_4x^3 + 3a_5x^2y + 2a_6xy^2 + a_7y^3 + \cdots$$

the Y-IPD 404, divided by the neutral surface factor, may be expressed as:

$$\frac{\partial Z}{\partial y} = a_2x + 2a_3y + a_5x^3 + 2a_6x^2y + 3a_7xy^2 + 4a_8y^3 + \cdots$$

and the polynomial equations for $\partial Z/\partial x$ and $\partial Z/\partial y$ may now be fitted to their corresponding shapes 402 and 404, respectively, to obtain the coefficients a1 through an. The polynomial equations for $\partial Z/\partial x$ and $\partial Z/\partial y$ chosen here are obtained by taking the partial derivatives of the equation for Z(x,y) (given below) with respect to the variables x and y. This method allows for efficiently integrating the X-IPD and Y-IPD to obtain the interim shape Z. However, note that other available methods for integrating two independent derivatives into an integral may also be used to achieve the same result.
It is noted that the Taylor polynomials described above are open-ended to indicate that polynomials of higher order may be utilized without departing from the spirit and scope of the present disclosure. Using higher order polynomials will reduce the shape fitting error. However, note that a higher order model requires computing more fitting coefficients and may degrade the model prediction for general shape variations. It is also contemplated that the specific polynomial equations used to express the shapes are not limited to the Taylor polynomials described above. For instance, polynomial fitting using Zernike polynomials may also be utilized without departing from the spirit and scope of the present disclosure. Furthermore, it is understood that any surface mapping/fitting technique may be employed to facilitate the fitting process to determine the coefficients a1 through an.
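One way to carry out the fitting step is a joint linear least-squares fit of the two slope polynomials, sketched below for the eight-coefficient truncation of the interim shape that the text goes on to use. The function names are hypothetical, the inputs are assumed to be X-IPD and Y-IPD values already divided by the neutral surface factor, and least squares is only one of the fitting techniques the text allows.

```python
import numpy as np

def slope_design(x, y):
    """Design matrices for dZ/dx and dZ/dy of the truncated interim shape
    Z = a1 x^2 + a2 xy + a3 y^2 + a4 x^4 + a5 x^3 y + a6 x^2 y^2 + a7 x y^3 + a8 y^4.
    Column j multiplies coefficient a_(j+1).
    """
    zero = np.zeros_like(x)
    dzdx = np.column_stack(
        [2 * x, y, zero, 4 * x**3, 3 * x**2 * y, 2 * x * y**2, y**3, zero]
    )
    dzdy = np.column_stack(
        [zero, x, 2 * y, zero, x**3, 2 * x**2 * y, 3 * x * y**2, 4 * y**3]
    )
    return dzdx, dzdy

def fit_interim_coefficients(x, y, x_ipd_scaled, y_ipd_scaled):
    """Fit a1..a8 so both slope polynomials match the scaled IPD maps.

    x_ipd_scaled and y_ipd_scaled are the X-IPD and Y-IPD divided by the
    neutral surface factor (assumed to be done by the caller).
    """
    gx, gy = slope_design(x, y)
    design = np.vstack([gx, gy])                # stack both sets of equations
    target = np.concatenate([x_ipd_scaled, y_ipd_scaled])
    a, *_ = np.linalg.lstsq(design, target, rcond=None)
    return a
```

Fitting both slope maps in a single system keeps the shared coefficients (a2, a5, a6, a7) consistent between the x and y equations, which is what makes the result integrable into a single surface Z. Working in coordinates normalized to the wafer radius keeps the cubic columns well conditioned.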
Now, for simplicity of the discussion, suppose Taylor polynomials are utilized, and upon completion of the fitting process, coefficients a1 through a8 have been determined. Such coefficients can then be used to derive the interim shape Z for each initial wafer shape Z0. More specific to the example described above, for each initial shape defined as
$$Z_0(x,y) = b_1 + b_2x^2 + b_3xy + b_4y^2,$$

its corresponding interim shape Z may be defined as

$$Z(x,y) = a_1x^2 + a_2xy + a_3y^2 + a_4x^4 + a_5x^3y + a_6x^2y^2 + a_7xy^3 + a_8y^4.$$
It is contemplated that the process described above for obtaining the interim shape for a given initial shape may be repeated (or executed in parallel) for each of the large number of sample wafer shapes created in step 202. That is, for each initial shape defined by a set of coefficients b1 through b4 (jointly referred to as B), a set of corresponding coefficients a1 through a8 (jointly referred to as A) can be determined. Suppose that the relationship between the set of coefficients B and the set of coefficients A can be defined as a function ƒ, then if ƒ is obtained, coefficients A can be computed directly for a given set of B.
Step 206 therefore tries to obtain the function ƒ based on the data collected from steps 202 and 204. That is, each set of B used to generate a sample wafer shape in step 202 and its corresponding set of A obtained in step 204 are used as training data in step 206 in order to obtain the function ƒ.
In one embodiment, the function ƒ is defined as A=ƒ(B)×C, wherein C is also a set of coefficients. More specifically, ai may be defined as:
$$a_i = c_1b_3 + c_2b_2^2 + c_3b_3^2 + c_4b_4^2 + c_5b_2b_4 + c_6b_2b_3 + c_7b_4b_3 + c_8b_2^3 + c_9b_4^3 + c_{10}b_2^2b_3 + c_{11}b_2^2b_4 + c_{12}b_3^2b_2 + c_{13}b_3^2b_4 + c_{14}b_4^2b_3 + c_{15}b_4^2b_2$$
It is contemplated that more terms may be used in the polynomial above to make adjustments to the model if needed to improve model accuracy. However, for simplicity of the discussion, 15 coefficients, i.e., $c_1$ through $c_{15}$, are used for each $a_i \in A$. This results in a total of 15×8=120 coefficients to be determined in order to establish the relationship between B and A. Since the values of all $b_i \in B$ and all $a_i \in A$ are known in this training process, the 120 coefficients can be obtained by solving the equations using any equation solving technique.
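The training step can be sketched as an ordinary least-squares problem, assuming the 15 terms in $b_2$, $b_3$ and $b_4$ listed in the polynomial for $a_i$ and one training row per sample wafer shape. The names `b_features`, `fit_c` and `predict_a` are hypothetical, and least squares is just one of the equation solving techniques the text permits.

```python
import numpy as np

def b_features(B):
    """Build the 15 polynomial terms for each row of B (columns b1, b2, b3, b4).

    The term order follows c1..c15 in the polynomial for a_i; b1 does not
    appear in any of the 15 terms.
    """
    B = np.atleast_2d(np.asarray(B, dtype=float))
    b2, b3, b4 = B[:, 1], B[:, 2], B[:, 3]
    return np.column_stack([
        b3, b2**2, b3**2, b4**2, b2 * b4, b2 * b3, b4 * b3,
        b2**3, b4**3,
        b2**2 * b3, b2**2 * b4, b3**2 * b2, b3**2 * b4,
        b4**2 * b3, b4**2 * b2,
    ])

def fit_c(B_train, A_train):
    """Estimate the 15 x 8 coefficient matrix C such that A ~= features(B) @ C.

    B_train has shape (n_samples, 4); A_train has shape (n_samples, 8).
    """
    C, *_ = np.linalg.lstsq(b_features(B_train), A_train, rcond=None)
    return C

def predict_a(b, C):
    """Compute a1..a8 for a single coefficient set b = (b1, b2, b3, b4)."""
    return b_features(b)[0] @ C
```

Each column of C holds the 15 coefficients for one $a_i$, so the 120 unknowns are solved as eight independent 15-parameter fits sharing the same feature matrix. At least 15 distinct sample shapes are needed; using many more, as the text suggests, makes the fit overdetermined and less sensitive to noise in the FE results.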
With the values of C determined, they can be used to compute the values of A directly for any given set of B. For example, upon receiving shape data of a new wafer, the values of $b_1$ through $b_4$ that describe the new wafer shape $Z_0$ may be determined by fitting the equation $Z_0(x,y)=b_1+b_2x^2+b_3xy+b_4y^2$ to the new wafer shape. Subsequently, the values of A can be calculated based on the values of B obtained using surface fitting and the values of C determined using the prediction process 200 described above. With the values of A determined, the interim shape $Z(x,y)=a_1x^2+a_2xy+a_3y^2+a_4x^4+a_5x^3y+a_6x^2y^2+a_7xy^3+a_8y^4$ that corresponds to the new wafer shape $Z_0$ can also be determined. Furthermore, by definition of the interim shape Z, its x-slope $\partial Z/\partial x$ can be calculated to predict the X-IPD for $Z_0$. Similarly, the y-slope of the interim shape, $\partial Z/\partial y$, can be calculated to predict the Y-IPD for $Z_0$.
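The prediction path for a new wafer can be sketched as below. The mapping from B to A is passed in as a callable (`b_to_a`, a hypothetical name) so that the sketch does not depend on how C was obtained. The `neutral_surface_factor` parameter is an assumed scalar multiplier that undoes the division applied during training; its value is not given in the text.

```python
import numpy as np

def fit_second_order(x, y, z):
    """Least-squares fit of Z0 = b1 + b2 x^2 + b3 xy + b4 y^2; returns (b1..b4)."""
    design = np.column_stack([np.ones_like(x), x**2, x * y, y**2])
    b, *_ = np.linalg.lstsq(design, z, rcond=None)
    return b

def interim_slopes(a, x, y):
    """Return (dZ/dx, dZ/dy) of the eight-coefficient interim shape Z."""
    a1, a2, a3, a4, a5, a6, a7, a8 = a
    dzdx = (2 * a1 * x + a2 * y + 4 * a4 * x**3 + 3 * a5 * x**2 * y
            + 2 * a6 * x * y**2 + a7 * y**3)
    dzdy = (a2 * x + 2 * a3 * y + a5 * x**3 + 2 * a6 * x**2 * y
            + 3 * a7 * x * y**2 + 4 * a8 * y**3)
    return dzdx, dzdy

def predict_ipd(z_measured, x, y, b_to_a, neutral_surface_factor=1.0):
    """Predict (X-IPD, Y-IPD) for a measured wafer shape.

    b_to_a is any callable mapping (b1..b4) to (a1..a8), for example a
    closure over trained coefficients C. The scale factor is an assumption.
    """
    b = fit_second_order(x, y, z_measured)
    a = b_to_a(b)
    dzdx, dzdy = interim_slopes(a, x, y)
    return neutral_surface_factor * dzdx, neutral_surface_factor * dzdy
```

The whole prediction reduces to one small least-squares fit and two polynomial evaluations, which is what makes the emulated approach cheap compared with running an FE simulation per wafer.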
As described above, the emulated FE model based prediction process in accordance with the present disclosure is an analytical/empirical model which is highly efficient compared to FE models. Once the prediction process 200 is completed for the sample wafer shapes created in step 202, the only inputs needed to perform IPD prediction for a new wafer are the initial shape data of that wafer and a set of 120 coefficients (i.e., c1 through c15 for calculating each ai, as previously described). Furthermore, testing results indicate that the emulated FE model based prediction process in accordance with the present disclosure provides comparable results against the true FE model based prediction process. For example, both
It is contemplated that while the emulated FE model based prediction process described above efficiently addresses the differences between FE model based prediction and HOS based prediction that occur due to the presence of large magnitude 2nd order components in the wafer shape, a more generic approach may be utilized to address the FE and HOS IPD differences that are results of not only large magnitude 2nd order components but also large magnitude higher-order components of wafer shape. Although the more generic approach may not be as efficient as the emulated FE model based prediction process, it may be suitable to address more general cases and may provide greater prediction accuracies.
As previously mentioned, in the semiconductor industry, the finite element (FE) model based prediction process has been widely utilized to analyze the wafer shape change during the chucking process. The FE model takes into account many wafer and system factors of the process, such as the initial wafer shape and wafer material parameters, chuck types and pressure configurations. Once the correct FE model is set up, it can generate an accurate prediction of the wafer in-plane distortion (IPD) and out-of-plane distortion (OPD) from the wafer chucking process. However, an FE model generally takes a long time to run and may not be suitable for high volume manufacturing applications.
On the other hand, a wafer higher order shape (HOS) based model has been constructed to provide a more efficient prediction of the wafer IPD and OPD in the chucking process. This model takes the initial wafer shape as input and simulates the wafer chucking process by calculating the corresponding wafer shape and shape slopes in various predefined orientations. The prediction of the wafer IPD and OPD is then calculated. While this model is efficient and can provide good prediction results for low bow wafer shapes, studies have shown that the accuracy of the prediction degrades as warp increases.
The more generic approach in accordance with the present disclosure is directed to addressing the differences between the FE model and the HOS model. More specifically, the accuracy and applicability of the HOS based IPD and OPD prediction model to a wider range of wafer shapes can be improved by incorporating the analysis results of the FE model responses to the Zernike basis images. These Zernike basis images form a complete set of models that are orthogonal over a circle of unit radius, and therefore any wafer shape image can be well approximated by a linear combination of Zernike basis images when enough Zernike basis model images are used.
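The Zernike decomposition referred to above can be sketched as follows, assuming the standard (unnormalized) Zernike polynomials indexed by radial order n and azimuthal frequency m, and a least-squares fit on sampled points of the unit disk. The function names and the choice of maximum order are assumptions for illustration.

```python
import numpy as np
from math import factorial

def zernike(n, m, rho, theta):
    """Unnormalized Zernike polynomial Z_n^m on the unit disk.

    m >= 0 selects the cosine term and m < 0 the sine term.
    """
    rho = np.asarray(rho, dtype=float)
    theta = np.asarray(theta, dtype=float)
    m_abs = abs(m)
    if m_abs > n or (n - m_abs) % 2:
        raise ValueError("need |m| <= n and n - |m| even")
    radial = np.zeros_like(rho)
    for k in range((n - m_abs) // 2 + 1):
        coeff = (-1) ** k * factorial(n - k) / (
            factorial(k)
            * factorial((n + m_abs) // 2 - k)
            * factorial((n - m_abs) // 2 - k)
        )
        radial += coeff * rho ** (n - 2 * k)
    angular = np.cos(m_abs * theta) if m >= 0 else np.sin(m_abs * theta)
    return radial * angular

def zernike_modes(max_order):
    """List all (n, m) index pairs up to and including radial order max_order."""
    return [(n, m) for n in range(max_order + 1) for m in range(-n, n + 1, 2)]

def fit_zernike(shape, rho, theta, modes):
    """Least-squares Zernike coefficients of a shape sampled at (rho, theta)."""
    basis = np.column_stack([zernike(n, m, rho, theta) for n, m in modes])
    coeffs, *_ = np.linalg.lstsq(basis, shape, rcond=None)
    return coeffs
```

Here `rho` is the radial coordinate normalized to the wafer radius, so it lies in [0, 1]. The fit is done by least squares rather than by inner products because the basis is only approximately orthogonal on a discrete sampling grid.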
Referring to
Subsequently, both the FE model and the HOS model proceed to predict wafer IPD/OPD, as indicated in steps 708 and 710, respectively, and the output responses from these two model systems are compared in step 712 to identify major Zernike shape components that produce large differences (e.g., above a certain threshold) between the two model systems. The term and magnitude information of the identified Zernike shapes may be stored (e.g., in a reference database, lookup table or the like) for use in the HOS performance enhancement stage. The image information stored as Zernike terms and coefficients can be efficiently retrieved, allowing the images to be easily reconstructed from the Zernike model information.
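The comparison in step 712 can be sketched as a per-term screening loop. In this illustration the FE and HOS outputs are assumed to be available as arrays keyed by Zernike term, the difference is summarized by its RMS value, and the stored record keeps the difference map itself. The metric, the threshold and the record layout are assumptions rather than details given in the text.

```python
import numpy as np

def find_major_terms(fe_ipd, hos_ipd, threshold):
    """Identify Zernike terms whose FE and HOS predictions disagree strongly.

    fe_ipd, hos_ipd: dicts mapping a Zernike term (n, m) to the IPD array
    predicted for that basis shape by each model (assumed precomputed).
    Returns a lookup table keyed by term for the terms above the threshold.
    """
    table = {}
    for term, fe in fe_ipd.items():
        difference = fe - hos_ipd[term]
        rms = float(np.sqrt(np.mean(difference**2)))
        if rms > threshold:
            table[term] = {"rms_difference": rms, "fe_minus_hos": difference}
    return table
```

Because this screening runs offline, the expensive FE simulations are paid for once per basis shape rather than once per production wafer.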
Once the Zernike components that produce large prediction differences are identified, step 808 can retrieve the FE model prediction results for these components and combine the FE model prediction results with the HOS model to produce a more accurate IPD/OPD prediction. That is, since the HOS based IPD/OPD model system is linear, the contributions of the identified major shape components can be combined with the results generated by the HOS IPD/OPD model to produce enhanced IPD/OPD prediction for a wide range of the wafer shape variations. The predicted wafer IPD/OPD from the enhanced HOS model may then be reported in step 810.
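The combination step can be sketched as adding stored corrections to the HOS result. This sketch assumes each stored correction is an (FE minus HOS) IPD map for a unit-amplitude basis shape and that it scales linearly with the wafer's Zernike coefficient for that term. That linear scaling is a simplification made here for illustration: the FE contact model is non-linear, and the text indicates that magnitude information is stored alongside each term.

```python
import numpy as np

def enhanced_hos_ipd(hos_ipd, zernike_coeffs, correction_table):
    """Combine an HOS prediction with stored per-term corrections.

    hos_ipd: IPD array from the HOS model for the wafer.
    zernike_coeffs: dict mapping term (n, m) to the wafer's coefficient.
    correction_table: dict mapping term (n, m) to a stored correction map
        for a unit-amplitude basis shape (assumed layout).
    """
    enhanced = np.array(hos_ipd, dtype=float, copy=True)
    for term, correction in correction_table.items():
        # Terms absent from the wafer's decomposition contribute nothing.
        enhanced += zernike_coeffs.get(term, 0.0) * correction
    return enhanced
```

A table indexed by both term and magnitude, with interpolation between stored magnitudes, would be one way to relax the linear-scaling assumption.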
It is contemplated that the analysis steps described in
It is also contemplated that the emulated FE model based approach (i.e., using the interim shape) and the generic approach (i.e., using the enhanced HOS prediction) may be used together to further improve the overall IPD prediction. Both approaches for prediction of IPD in accordance with the present disclosure are capable of providing efficient prediction of the wafer IPD in the chucking process, which may be appreciated in various downstream applications such as monitoring and/or controlling overlay errors that occur during semiconductor manufacturing.
Overlay error is misalignment between any of the patterns used at different stages of semiconductor integrated circuit manufacture.
The systems and methods for prediction of IPD due to wafer chucking in accordance with the present disclosure can be utilized to identify overlay errors. For instance, as illustrated in
It is contemplated that while the example above referenced the emulated FE model based IPD prediction process, the enhanced HOS prediction process may also be utilized alternatively/additionally without departing from the spirit and scope of the present disclosure. It is also contemplated that the IPD prediction processes in accordance with the present disclosure may be utilized for other applications in addition to overlay error prediction and control described above.
Another critical application of IPD prediction using wafer shape is to feed forward computed scanner corrections based on predicted IPD to improve the alignment of the wafer on the scanner. In this scenario, the wafer shape is measured after step N, and the interim shape that corresponds to the measured shape may then be determined and utilized to obtain the predicted IPD as previously described. The wafer shape is then measured right before step N+1, and the corresponding interim shape may likewise be determined and utilized to obtain the predicted IPD. Using these two IPDs, the scanner corrections needed to align the wafer at step N+1 so as to minimize the misalignment to step N can be computed and fed forward to the scanner prior to step N+1. Utilizing the feed-forward technique reduces the misalignment and overlay error prior to printing the litho-layer and eventually reduces wafer re-work.
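The feed-forward computation can be sketched as a linear fit to the difference between the two predicted IPD maps. The six-parameter model used here (an offset plus linear terms in x and y for each axis) is an assumption standing in for the scanner's actual correction routine, which the text does not specify. The fitted parameters play the role of the corrections to feed forward, and the residual is what a linear correction cannot remove.

```python
import numpy as np

def linear_correctables(x, y, dx, dy):
    """Fit dx ~ p0 + p1*x + p2*y and dy ~ q0 + q1*x + q2*y by least squares."""
    design = np.column_stack([np.ones_like(x), x, y])
    px, *_ = np.linalg.lstsq(design, dx, rcond=None)
    py, *_ = np.linalg.lstsq(design, dy, rcond=None)
    return px, py

def residual_after_linear_correction(x, y, ipd_before, ipd_after):
    """Return (residual_x, residual_y, (px, py)) for two predicted IPD maps.

    ipd_before and ipd_after are (x_ipd, y_ipd) tuples predicted from the
    shapes measured after step N and right before step N+1.
    """
    dx = ipd_after[0] - ipd_before[0]
    dy = ipd_after[1] - ipd_before[1]
    px, py = linear_correctables(x, y, dx, dy)
    design = np.column_stack([np.ones_like(x), x, y])
    return dx - design @ px, dy - design @ py, (px, py)
```

The same residual corresponds to the overlay error prediction obtained elsewhere in the disclosure by applying a linear scanner correction routine to the IPD differences.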
It is understood that overlay and alignment control and monitoring is one of the many critical applications of the emulated FE metric. The application can also be extended to monitor and control other processes such as Rapid Thermal Processing (RTP), Chemical-Mechanical Planarization (CMP), Chemical Vapor Deposition (CVD) and the like. To enable process control, a new prediction process based on the process conditions needs to be developed. For example, the chuck design varies from process to process, resulting in different wafer-chuck interactions that need to be modeled. One of the key components of localized metric computation is the local area in which the metric is computed. The wafer can be divided into local areas based on the process. For example, in the case of an RTP process, RTP chucks have radial zones for heating the wafer, and non-uniform heating can result in wafer geometry variations at different radial bands. Similarly, the EFE based IPD metric can be divided into radial bands and an appropriate metric computed within each radial band to capture these geometry variations due to non-uniform heating.
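A radial-band metric of the kind described above can be sketched as follows. The band edges and the choice of RMS of the IPD magnitude as the per-band metric are assumptions; the text leaves the "appropriate metric" open.

```python
import numpy as np

def radial_band_rms(x, y, ipd_x, ipd_y, band_edges):
    """Return the RMS IPD magnitude within each radial band.

    band_edges: increasing radii [r0, r1, ..., rk] defining k bands; each
    band is [r_i, r_(i+1)), except the last, which includes its outer edge.
    Bands containing no sample points yield NaN.
    """
    r = np.hypot(x, y)
    magnitude = np.hypot(ipd_x, ipd_y)
    edges = np.asarray(band_edges, dtype=float)
    result = []
    for i in range(len(edges) - 1):
        lo, hi = edges[i], edges[i + 1]
        is_last = i == len(edges) - 2
        mask = (r >= lo) & ((r <= hi) if is_last else (r < hi))
        if mask.any():
            result.append(float(np.sqrt(np.mean(magnitude[mask] ** 2))))
        else:
            result.append(float("nan"))
    return result
```

For an RTP chuck, the band edges would be chosen to match the radial heating zones so that each reported value maps to one zone.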
Referring now to
The IPD prediction system 1100 also includes an IPD prediction module 1104 in communication with the optical system 1102. The IPD prediction module 1104 is configured for carrying out the methods for providing IPD prediction for a given wafer as described above. The prediction results may subsequently be utilized as control input for various downstream applications 1106, including, but not limited to, overlay error monitor and control, alignment control, as well as RTP, CMP and CVD processes or the like.
It is contemplated that the IPD prediction method and system in accordance with the present disclosure can be used for bare wafers, patterned wafers and the like. Furthermore, the IPD prediction method and system in accordance with the present disclosure can be used for wafers with streets defined thereon. As shown in
In one embodiment, the street regions 1204 are masked off when the metrics of the device regions 1202 are calculated for the selected wafer surfaces, including front, back and shape image maps. Unlike street masking for patterned wafer inspection, where only the scribe streets on the front surface are masked, for wafer surface metrology measurements using surface metrology tools such as KLA-Tencor WaferSight, the scribe streets on the front, back and shape maps of the wafer can be selectively masked. Masking off the street regions 1204 can be done manually or systematically when the patterned wafer geometry measurements are taken. For instance, in manual mode, the user may define the measurement grid size and shift, and the street size, for the algorithm to use in the metric calculation. In systematic mode, on the other hand, the grid size and shift, as well as the street positions and widths, can be estimated from the patterned wafer image based on projections and periodic peak identification. As a result, users are not required to provide these device related values in systematic mode.
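The systematic mode can be sketched as a projection of the image along one axis followed by detection of periodic peaks. The sketch assumes that streets show up as runs of higher values than the device regions in the projected profile and uses a simple relative threshold; both the contrast convention and the threshold are assumptions, and `find_streets` is a hypothetical name.

```python
import numpy as np

def find_streets(image, axis=0, rel_threshold=0.5):
    """Estimate street centers, widths and pitch from an image projection.

    image: 2-D array (for example a front, back or shape map).
    axis: axis to average over; axis=0 finds streets that run vertically.
    rel_threshold: fraction of the profile range used as the peak level.
    """
    profile = image.mean(axis=axis)
    level = profile.min() + rel_threshold * (profile.max() - profile.min())
    above = profile > level
    # Locate the start (inclusive) and stop (exclusive) of each run above level.
    padded = np.concatenate([[False], above, [False]])
    edges = np.flatnonzero(padded[1:] != padded[:-1])
    starts, stops = edges[::2], edges[1::2]
    centers = (starts + stops - 1) / 2.0
    widths = stops - starts
    pitch = float(np.median(np.diff(centers))) if len(centers) > 1 else float("nan")
    return {"centers": centers, "widths": widths, "pitch": pitch}
```

Running the function once per axis gives the street positions in both directions, from which a mask for the street regions can be built. The pitch corresponds to the measurement grid size, and the offset of the first center to the grid shift.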
It is understood that while the measurement sites and street size in the exemplary patterned wafer shown in
It is contemplated that while the examples above referred to wafer metrology measurements, the systems and methods in accordance with the present disclosure are applicable to other types of polished plates as well without departing from the spirit and scope of the present disclosure. The term wafer used in the present disclosure may include a thin slice of semiconductor material used in the fabrication of integrated circuits and other devices, as well as other thin polished plates such as magnetic disc substrates, gauge blocks and the like.
The methods disclosed may be implemented as sets of instructions, through a single production device, and/or through multiple production devices. Further, it is understood that the specific order or hierarchy of steps in the methods disclosed are examples of exemplary approaches. Based upon design preferences, it is understood that the specific order or hierarchy of steps in the method can be rearranged while remaining within the scope and spirit of the disclosure. The accompanying method claims present elements of the various steps in a sample order, and are not necessarily meant to be limited to the specific order or hierarchy presented.
It is believed that the system and method of the present disclosure and many of its attendant advantages will be understood by the foregoing description, and it will be apparent that various changes may be made in the form, construction and arrangement of the components without departing from the disclosed subject matter or without sacrificing all of its material advantages. The form described is merely explanatory.
The present application is a continuation of U.S. patent application Ser. No. 13/735,737 filed on Jan. 7, 2013, which claims the benefit under 35 U.S.C. §119(e) of U.S. Provisional Application Ser. No. 61/712,259, filed Oct. 11, 2012, and U.S. Provisional Application Ser. No. 61/712,746, filed Oct. 11, 2012. Said U.S. patent application Ser. No. 13/735,737, U.S. Provisional Application Ser. No. 61/712,259, and U.S. Provisional Application Ser. No. 61/712,746 are herein incorporated by reference in their entireties.
Number | Date | Country
---|---|---
61712259 | Oct 2012 | US
61712746 | Oct 2012 | US

Relation | Number | Date | Country
---|---|---|---
Parent | 13735737 | Jan 2013 | US
Child | 15172667 | | US