This application claims priority of EP application 22155715.0 which was filed on 8 Feb. 2022 and which is incorporated herein in its entirety by reference.
The present invention relates to a lithographic process and more specifically to a method to measure a parameter of a lithographic process.
A lithographic apparatus is a machine that applies a desired pattern onto a substrate, usually onto a target portion of the substrate. A lithographic apparatus can be used, for example, in the manufacture of integrated circuits (ICs). In that instance, a patterning device, which is alternatively referred to as a mask or a reticle, may be used to generate a circuit pattern to be formed on an individual layer of the IC. This pattern can be transferred onto a target portion (e.g., including part of, one, or several dies) on a substrate (e.g., a silicon wafer). Transfer of the pattern is typically via imaging onto a layer of radiation-sensitive material (resist) provided on the substrate. In general, a single substrate will contain a network of adjacent target portions that are successively patterned. In lithographic processes, it is frequently desirable to make measurements of the structures created, e.g., for process control and verification. Various tools for making such measurements are known, including scanning electron microscopes, which are often used to measure critical dimension (CD), and specialized tools to measure overlay, a measure of the accuracy of alignment of two layers in a device. Overlay may be described in terms of the degree of misalignment between the two layers; for example, reference to a measured overlay of 1 nm may describe a situation where two layers are misaligned by 1 nm.
Recently, various forms of scatterometers have been developed for use in the lithographic field. These devices direct a beam of radiation onto a target and measure one or more properties of the scattered radiation—e.g., intensity at a single angle of reflection as a function of wavelength; intensity at one or more wavelengths as a function of reflected angle; or polarization as a function of reflected angle—to obtain a “spectrum” from which a property of interest of the target can be determined. Determination of the property of interest may be performed by various techniques: e.g., reconstruction of the target by iterative approaches such as rigorous coupled wave analysis or finite element methods; library searches; and principal component analysis.
The targets used by conventional scatterometers are relatively large, e.g., 40 μm by 40 μm, gratings and the measurement beam generates a spot that is smaller than the grating (i.e., the grating is underfilled). This simplifies mathematical reconstruction of the target as it can be regarded as infinite. However, in order to reduce the size of the targets, e.g., to 10 μm by 10 μm or less, e.g., so they can be positioned in amongst product features, rather than in the scribe lane, metrology has been proposed in which the grating is made smaller than the measurement spot (i.e., the grating is overfilled). Typically such targets are measured using dark field scatterometry in which the zeroth order of diffraction (corresponding to a specular reflection) is blocked, and only higher orders processed. Examples of dark field metrology can be found in international patent applications WO 2009/078708 and WO 2009/106279 which documents are hereby incorporated by reference in their entirety. Further developments of the technique have been described in patent publications US20110027704A, US20110043791A and US20120242970A. Modifications of the apparatus to improve throughput are described in US2010201963A1 and US2011102753A1. The contents of all these applications are also incorporated herein by reference. Diffraction-based overlay using dark-field detection of the diffraction orders enables overlay measurements on smaller targets. These targets can be smaller than the illumination spot and may be surrounded by product structures on a wafer. Targets can comprise multiple gratings which can be measured in one image.
There are several methods for overlay inference. A known method of determining overlay from metrology images, such as those obtained using dark-field methods, while making some correction for non-overlay asymmetry, is known as the A+/A− regression method. This method comprises measuring a biased target having two differently biased sub-targets using radiation having at least two different wavelengths, and plotting intensity asymmetry from one of the sub-targets against intensity asymmetry from the other of the sub-targets for each wavelength. Regressing through the data points yields a line having a slope indicative of overlay.
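By way of illustration only, the regression described above can be sketched as follows. The sketch assumes a linear model in which the asymmetry of each sub-target is A(±b) = K·(ov ± b) + c, with a wavelength-dependent sensitivity K and a wavelength-independent offset c; this model, the closed-form conversion from slope to overlay, and the function name `overlay_from_asymmetry_regression` are assumptions made for the example and are not taken from the cited method.

```python
import numpy as np

def overlay_from_asymmetry_regression(a_plus, a_minus, bias):
    """Estimate overlay from per-wavelength asymmetries of two biased sub-targets.

    Assumed model (hypothetical): A(+/-b) = K * (ov +/- b) + c, where K varies
    with wavelength and c is a constant non-overlay offset.
    """
    a_plus = np.asarray(a_plus, dtype=float)
    a_minus = np.asarray(a_minus, dtype=float)
    # Fit the line A+ = slope * A- + intercept through the per-wavelength points.
    slope, intercept = np.polyfit(a_minus, a_plus, 1)
    # Under the assumed model, slope = (ov + b) / (ov - b); solve for ov.
    overlay = bias * (slope + 1.0) / (slope - 1.0)
    return overlay, slope, intercept

# Synthetic example: four wavelengths, true overlay 1.5, bias 20, offset 0.3.
k = np.array([0.8, 1.1, -0.4, 0.6])
ov_est, m, c0 = overlay_from_asymmetry_regression(
    k * (1.5 + 20.0) + 0.3, k * (1.5 - 20.0) + 0.3, 20.0
)
```

In this sketch the intercept of the fitted line absorbs the constant offset, which is one way of seeing how the regression can make some correction for non-overlay asymmetry.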
It would be desirable to improve on such known overlay inference methods.
The invention in a first aspect provides a method for determining a parameter of interest relating to at least one target on a substrate. The method comprises obtaining metrology data comprising at least one asymmetry signal, said at least one asymmetry signal comprising a difference or imbalance in a measurement parameter; obtaining a trained model having been trained or configured to relate said at least one asymmetry signal to the parameter of interest, the trained model comprising at least one proxy for at least one nuisance component of the at least one asymmetry signal; and inferring said parameter of interest for said at least one target from said at least one asymmetry signal using the trained model.
The invention in a second aspect provides a method of training a model to relate asymmetry signals to a parameter of interest, the model comprising at least one proxy for at least one nuisance component of the asymmetry signals, the method comprising: obtaining training data comprising a plurality of training asymmetry signals relating to a plurality of training targets; and training said model by optimizing a respective training parameter for each of said at least one proxy.
The invention in a third aspect provides a metrology apparatus being operable to perform the method of the first or second aspect.
In a further aspect of the invention, there is provided a computer program comprising program instructions operable to perform the method of the first aspect when run on a suitable apparatus.
Further features and advantages of the invention, as well as the structure and operation of various embodiments of the invention, are described in detail below with reference to the accompanying drawings. It is noted that the invention is not limited to the specific embodiments described herein. Such embodiments are presented herein for illustrative purposes only. Additional embodiments will be apparent to persons skilled in the relevant art(s) based on the teachings contained herein.
Embodiments of the invention will now be described, by way of example only, with reference to the accompanying drawings in which:
Before describing embodiments of the invention in detail, it is instructive to present an example environment in which embodiments of the present invention may be implemented.
The illumination optical system may include various types of optical or non-optical components, such as refractive, reflective, magnetic, electromagnetic, electrostatic or other types of components, or any combination thereof, for directing, shaping, or controlling radiation.
The patterning device support holds the patterning device in a manner that depends on the orientation of the patterning device, the design of the lithographic apparatus, and other conditions, such as for example whether or not the patterning device is held in a vacuum environment. The patterning device support can use mechanical, vacuum, electrostatic or other clamping techniques to hold the patterning device. The patterning device support may be a frame or a table, for example, which may be fixed or movable as required. The patterning device support may ensure that the patterning device is at a desired position, for example with respect to the projection system. Any use of the terms “reticle” or “mask” herein may be considered synonymous with the more general term “patterning device.”
The term “patterning device” used herein should be broadly interpreted as referring to any device that can be used to impart a radiation beam with a pattern in its cross-section such as to create a pattern in a target portion of the substrate. It should be noted that the pattern imparted to the radiation beam may not exactly correspond to the desired pattern in the target portion of the substrate, for example if the pattern includes phase-shifting features or so called assist features. Generally, the pattern imparted to the radiation beam will correspond to a particular functional layer in a device being created in the target portion, such as an integrated circuit.
The patterning device may be transmissive or reflective. Examples of patterning devices include masks, programmable mirror arrays, and programmable LCD panels. Masks are well known in lithography, and include mask types such as binary, alternating phase-shift, and attenuated phase-shift, as well as various hybrid mask types. An example of a programmable mirror array employs a matrix arrangement of small mirrors, each of which can be individually tilted so as to reflect an incoming radiation beam in different directions. The tilted mirrors impart a pattern in a radiation beam, which is reflected by the mirror matrix.
As here depicted, the apparatus is of a transmissive type (e.g., employing a transmissive mask). Alternatively, the apparatus may be of a reflective type (e.g., employing a programmable mirror array of a type as referred to above, or employing a reflective mask).
The lithographic apparatus may also be of a type wherein at least a portion of the substrate may be covered by a liquid having a relatively high refractive index, e.g., water, so as to fill a space between the projection system and the substrate. An immersion liquid may also be applied to other spaces in the lithographic apparatus, for example, between the mask and the projection system. Immersion techniques are well known in the art for increasing the numerical aperture of projection systems. The term “immersion” as used herein does not mean that a structure, such as a substrate, must be submerged in liquid, but rather only means that liquid is located between the projection system and the substrate during exposure.
Referring to
The illuminator IL may include an adjuster AD for adjusting the angular intensity distribution of the radiation beam. Generally, at least the outer and/or inner radial extent (commonly referred to as σ-outer and σ-inner, respectively) of the intensity distribution in a pupil plane of the illuminator can be adjusted. In addition, the illuminator IL may include various other components, such as an integrator IN and a condenser CO. The illuminator may be used to condition the radiation beam, to have a desired uniformity and intensity distribution in its cross section.
The radiation beam B is incident on the patterning device (e.g., mask) MA, which is held on the patterning device support (e.g., mask table MT), and is patterned by the patterning device. Having traversed the patterning device (e.g., mask) MA, the radiation beam B passes through the projection optical system PS, which focuses the beam onto a target portion C of the substrate W, thereby projecting an image of the pattern on the target portion C. With the aid of the second positioner PW and position sensor IF (e.g., an interferometric device, linear encoder, 2-D encoder or capacitive sensor), the substrate table WT can be moved accurately, e.g., so as to position different target portions C in the path of the radiation beam B. Similarly, the first positioner PM and another position sensor (which is not explicitly depicted in
Patterning device (e.g., mask) MA and substrate W may be aligned using mask alignment marks M1, M2 and substrate alignment marks P1, P2. Although the substrate alignment marks as illustrated occupy dedicated target portions, they may be located in spaces between target portions (these are known as scribe-lane alignment marks). Similarly, in situations in which more than one die is provided on the patterning device (e.g., mask) MA, the mask alignment marks may be located between the dies. Small alignment markers may also be included within dies, in amongst the device features, in which case it is desirable that the markers be as small as possible and not require any different imaging or process conditions than adjacent features. The alignment system, which detects the alignment markers is described further below.
Lithographic apparatus LA in this example is of a so-called dual stage type which has two substrate tables WTa, WTb and two stations (an exposure station and a measurement station) between which the substrate tables can be exchanged. While one substrate on one substrate table is being exposed at the exposure station, another substrate can be loaded onto the other substrate table at the measurement station and various preparatory steps carried out. The preparatory steps may include mapping the surface contour of the substrate using a level sensor LS and measuring the position of alignment markers on the substrate using an alignment sensor AS. This enables a substantial increase in the throughput of the apparatus.
The depicted apparatus can be used in a variety of modes, including for example a step mode or a scan mode. The construction and operation of lithographic apparatus is well known to those skilled in the art and need not be described further for an understanding of the present invention.
As shown in
In order that the substrates that are exposed by the lithographic apparatus are exposed correctly and consistently, it is desirable to inspect exposed substrates to measure properties such as overlay between subsequent layers, line thicknesses, critical dimensions (CD), etc. Accordingly a manufacturing facility in which lithocell LC is located also includes metrology system MET which receives some or all of the substrates W that have been processed in the lithocell. Metrology results are provided directly or indirectly to the supervisory control system SCS. If errors are detected, adjustments may be made to exposures of subsequent substrates, especially if the inspection can be done soon and fast enough that other substrates of the same batch are still to be exposed. Also, already exposed substrates may be stripped and reworked to improve yield, or discarded, thereby avoiding performing further processing on substrates that are known to be faulty. In a case where only some target portions of a substrate are faulty, further exposures can be performed only on those target portions which are good.
Within metrology system MET, an inspection apparatus is used to determine the properties of the substrates, and in particular, how the properties of different substrates or different layers of the same substrate vary from layer to layer. The inspection apparatus may be integrated into the lithographic apparatus LA or the lithocell LC or may be a stand-alone device. To enable most rapid measurements, it is desirable that the inspection apparatus measure properties in the exposed resist layer immediately after the exposure. However, the latent image in the resist has a very low contrast—there is only a very small difference in refractive index between the parts of the resist which have been exposed to radiation and those which have not—and not all inspection apparatuses have sufficient sensitivity to make useful measurements of the latent image. Therefore measurements may be taken after the post-exposure bake step (PEB) which is customarily the first step carried out on exposed substrates and increases the contrast between exposed and unexposed parts of the resist. At this stage, the image in the resist may be referred to as semi-latent. It is also possible to make measurements of the developed resist image—at which point either the exposed or unexposed parts of the resist have been removed—or after a pattern transfer step such as etching. The latter possibility limits the possibilities for rework of faulty substrates but may still provide useful information.
A metrology apparatus is shown in
As shown in
At least the 0 and +1 orders diffracted by the target T on substrate W are collected by objective lens 16 and directed back through beam splitter 15. Returning to
A second beam splitter 17 divides the diffracted beams into two measurement branches. In a first measurement branch, optical system 18 forms a diffraction spectrum (pupil plane image) of the target on first sensor 19 (e.g. a CCD or CMOS sensor) using the zeroth and first order diffractive beams. Each diffraction order hits a different point on the sensor, so that image processing can compare and contrast orders. The pupil plane image captured by sensor 19 can be used for focusing the metrology apparatus and/or normalizing intensity measurements of the first order beam. The pupil plane image can also be used for many measurement purposes such as reconstruction.
In the second measurement branch, optical system 20, 22 forms an image of the target T on sensor 23 (e.g. a CCD or CMOS sensor). In the second measurement branch, an aperture stop 21 is provided in a plane that is conjugate to the pupil-plane. Aperture stop 21 functions to block the zeroth order diffracted beam so that the image of the target formed on sensor 23 is formed only from the −1 or +1 first order beam. The images captured by sensors 19 and 23 are output to processor PU which processes the image, the function of which will depend on the particular type of measurements being performed. Note that the term ‘image’ is used here in a broad sense. An image of the grating lines as such will not be formed, if only one of the −1 and +1 orders is present.
The particular forms of aperture plate 13 and field stop 21 shown in
In order to make the measurement radiation adaptable to these different types of measurement, the aperture plate 13 may comprise a number of aperture patterns formed around a disc, which rotates to bring a desired pattern into place. Note that aperture plate 13N or 13S can only be used to measure gratings oriented in one direction (X or Y depending on the set-up). For measurement of an orthogonal grating, rotation of the target through 90° and 270° might be implemented. Different aperture plates are shown in
Once the separate images of the overlay targets have been identified, the intensities of those individual images can be measured, e.g., by averaging or summing selected pixel intensity values within the identified areas. Intensities and/or other properties of the images can be compared with one another. These results can be combined to measure different parameters of the lithographic process. Overlay performance is an important example of such a parameter.
Using for example the method described in applications such as US20110027704A, mentioned above, overlay (i.e., undesired and unintentional overlay misalignment) between the two layers within the sub-targets 32 to 35 is measured. Such a method may be referred to as micro diffraction based overlay (μDBO). This measurement may be done through overlay target asymmetry, as revealed by comparing their intensities in the +1 order and −1 order dark field images (the intensities of other complementary higher orders can be compared, e.g. +2 and −2 orders) to obtain a measure of the intensity asymmetry.
There are several methods for overlay inference from such measurements, which aim to separate or isolate the wanted overlay signal from other nuisance contributions (e.g., inter alia non-overlay target asymmetries). These methods may differ in the data dimensions used in the model, the model itself, and/or the model/recipe setup. A number of methods have been developed which use a multi-wavelength data dimension. Other methods use additional data dimensions, such as multiple target positions, so as to acquire more information from a diversified input. These methods may also require a condition for model/recipe setup, such as external reference data.
Present overlay inference methods, and in particular after-development inspection ADI (i.e., pre-etch inspection) overlay inference methods (e.g., compatible with faster metrology techniques such as dark-field metrology), suffer from a number of drawbacks which include:
In understanding the concepts disclosed herein, it should be appreciated that the measured asymmetry signal A is a function of overlay and any process variations:
where A is the measured asymmetry signal, OV is the unknown true overlay (the parameter of interest), PA is the asymmetric process variation and PS describes the symmetric process variation. These are geometrical properties: independent of wavelength and polarization and usually depend on target position i on the wafer and pixel position j in the captured image (or region of interest ROI). ∂A/∂OV, ∂A/∂PAi, ∂A/∂PSj describe the sensitivity of the measured asymmetry to each of these unknown parameters and are dependent on wavelength and polarization. As such, the first term describes the overlay signal component, the second term describes the asymmetric non-overlay signal contribution (a first nuisance component) and the third term describes the symmetric non-overlay signal component (a second nuisance component).
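As a reading aid, one first-order form consistent with the three terms just described is sketched below. It is an assumption reconstructed from that description and is not necessarily the exact expression intended; in particular, the dependence on target position i and pixel position j is suppressed for brevity.

```latex
A \;\approx\; \frac{\partial A}{\partial OV}\,OV
  \;+\; \frac{\partial A}{\partial PA}\,PA
  \;+\; \frac{\partial A}{\partial PS}\,PS
```

Read this way, the partial derivatives carry the wavelength and polarization dependence, while OV, PA and PS are the geometrical quantities.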
Examples of asymmetric process variations include inter alia: a difference in sidewall angle dSWA (e.g., between two walls of a grating feature), a (non-overlay) grating asymmetry, a floor tilt. Examples of symmetric process variations include inter alia: layer thickness variation, CD variation, grating imbalance GI and symmetric SWA variation.
An intensity image can be decomposed as:
where S is the symmetrical part and A the asymmetrical part. The asymmetrical part comprises the overlay information. For an infinite grating pair, A(ov) is periodic with pitch P and therefore A(ov) can be written as:
Using the atan approximation, only the first term of the above expression may be taken without expanding sin(x):
In this formulation, K is a measure of sensitivity (Jacobian) of the asymmetric part of the intensity to the overlay.
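The decomposition and approximation described in the preceding paragraphs may be sketched as follows. The sketch assumes that the asymmetrical part is an odd function of overlay, so that its Fourier series over one pitch contains only sine terms; the coefficient names K_n are introduced here for illustration only.

```latex
I = S + A, \qquad
A(ov) = \sum_{n=1}^{\infty} K_n \sin\!\left(\frac{2\pi n\, ov}{P}\right), \qquad
A(ov) \approx K \sin\!\left(\frac{2\pi\, ov}{P}\right)
```

For overlay values that are small compared with the pitch P, the retained first term is approximately linear in overlay, A ≈ (2πK/P)·ov.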
The measured asymmetry signal may be, for example, an intensity asymmetry of complementary diffraction orders or phase difference asymmetry from a pair of complementary sub-targets (“M pad” and “W pad”), depending on whether the metrology method is intensity based (e.g., μDBO) or phase based (e.g., continuous DBO or cDBO). As such, for intensity based applications, A may be defined as $A_{+} = I_{+b,+1} - I_{+b,-1}$ and $A_{-} = I_{-b,+1} - I_{-b,-1}$ (conveniently written as $A_{\pm} = (I_{+1} - I_{-1})_{\pm b}$), where $I_{+1}$, $I_{-1}$ are the respective intensities of the +1 diffraction order and −1 diffraction order from each sub-target having respective biases +b, −b.
For phase based applications asymmetry signal A may be defined as the phase difference between a diffraction order from an “M pad” and a corresponding diffraction order from a “W pad”, optionally averaged over both diffraction orders of a complementary diffraction order pair: e.g., $A = (\phi_M - \phi_W)_{+1} + (\phi_M - \phi_W)_{-1}$, where $(\phi_M - \phi_W)_{+1}$, $(\phi_M - \phi_W)_{-1}$ are the measured phase differences between the “M pad” and “W pad” of the +1 diffraction order and −1 diffraction order respectively. As such, it can be appreciated that the concepts described herein are applicable to different types of asymmetry signal. The principle of cDBO is described in Matsunobu et al., Novel diffraction-based overlay metrology utilizing phase-based overlay for improved robustness, Proc. SPIE 11611, Metrology, Inspection, and Process Control for Semiconductor Manufacturing XXXV, 1161126 (22 Feb. 2021), which is incorporated herein by reference.
Like a μDBO target, a cDBO target comprises multiple pads or sub-targets (optionally per direction), each sub-target having overlaid gratings in respective layers for which an overlay value is to be measured. Instead of the gratings having the same pitch in the two layers like a μDBO target, cDBO targets comprise sub-targets each having gratings of different pitches in the two layers. More specifically a typical cDBO target comprises an arrangement of two different types of sub-targets (e.g., per direction): an “M pad” or “M sub-grating” which comprises a bottom grating having a smaller pitch than a top grating, with a “W pad” or “W sub-grating” which has these gratings reversed (i.e., it has the same pitches as the M pad but with the larger pitch in the top layer).
Different example arrangements of cDBO targets are described in PCT application WO2021224009A1, which is incorporated herein by reference. The cDBO targets may comprise at least one pair of similar target regions which are arranged such that the whole target arrangement is, or at least the target regions for measurement in a single direction together are, centrosymmetric (i.e., the arrangement is the same if rotated through 180 degrees). Such an arrangement can help address matching issues caused by distortions in the metrology tool optics. Generally, for inferring overlay in cDBO, a target arrangement comprises two (or more) clusters of sub-targets, each cluster comprising at least one M pad and one W pad per direction (a pad may be shared between clusters in some designs). As such, each cluster can be used independently to infer overlay. However, to address the matching issues, overlay may be determined as an average of the overlay computed per each cluster:
where ci is the ith cluster, p1 and p2 are the grating pitches of the gratings in each layer.
The measured symmetry signal, in an intensity-based application, can be determined as the sum of the intensity measured from the normal and complementary branches, i.e., $S_{\pm b} = (I_{+1} + I_{-1})_{\pm b}$. In a phase-based application such as cDBO metrology, the phase symmetry cannot be directly measured without a phase reference. However, the amplitude is measured independently from the phase in cDBO metrology and therefore can be accessed directly. As such, for cDBO applications, the symmetry signal may be defined as $S_{M,W} = (a_{+1} + a_{-1})_{M,W}$, where a is the amplitude signal (the difference between the maximum and minimum intensity from each pad (M, W)).
As such, symmetric and asymmetric process variations each induce a respective component in the measured signal. To achieve accurate and robust overlay inference, the sensitivity of the measured signal to process variations should be suppressed. However, information from only a single measurement cannot separate overlay from the other nuisance contributors described by PA and PS. Also, process variations are highly stack and application dependent. Therefore, a more flexible model is desirable.
Described herein is an improved overlay inference method (e.g., for ADI overlay) which increases the accuracy and robustness of the inference to both symmetric and asymmetric process variations. One aspect of the proposal comprises a multi-dimensional regression, which uses the diversity in multiple information channels: e.g., intensity channels, for example relating to some or all of: positive bias/W pad, negative bias/M pad, normal diffraction order, complementary diffraction order, different pixels inside the ROI, the measurement wavelengths, target positions on the wafer. This results in the most efficient use of the diversity present in the information channels of presently performed measurement strategies and does not require additional metrology compared to present (e.g., μDBO/cDBO) methods.
Alternatively or in addition, another aspect of the proposed method is the use of a parameterized model which comprises at least a symmetric proxy for at least the symmetric process variations, and optionally separate symmetric and asymmetric proxies for the symmetric process variations and asymmetric process variations respectively. This helps make the overlay inference model more flexible. Additionally, the inference method may be data driven or training based; the model parameters may be trained or calibrated based on available external reference data; or in a self-referenced embodiment, on the known target biases (e.g., for μDBO applications). For a cDBO self-referenced embodiment, the model parameters may be trained or calibrated on the equality of inference from pairs of sub-target clusters.
As such, a method is described for determining a parameter of interest such as overlay, the method comprising: obtaining asymmetry signals (e.g., captured at an image plane, such as intensity asymmetry and/or phase difference asymmetry signals); obtaining a trained model having been trained to relate said asymmetry signals to the parameter of interest in terms of at least one proxy for at least one nuisance component; and using the trained model to infer said parameter of interest from said asymmetry signals.
In an embodiment, the trained model has been trained to relate said asymmetry signals to the parameter of interest in terms of at least two proxies, an asymmetric first proxy for an asymmetric nuisance component and a symmetric proxy for a symmetric nuisance component. The trained model may be a multi-dimensional regression model.
For an intensity based embodiment, such as μDBO, the method may comprise measuring the intensity asymmetry of complementary diffraction orders from each biased sub-target, an intensity summation for the symmetric proxy and a second-order intensity asymmetry for the asymmetric proxy. In the cDBO application, the phase difference between M and W pads (e.g., averaged over a pair of complementary diffraction orders), amplitude symmetry and at least one amplitude asymmetry may be measured.
Also disclosed is a method of training said model to relate asymmetry signals to a parameter of interest in terms of at least one proxy for at least one nuisance component (and optionally at least an asymmetric first proxy for an asymmetric nuisance component and a symmetric proxy for a symmetric nuisance component). In a further embodiment, more than one asymmetric proxy may be defined. The training may use external training data (reference values for the parameter of interest), or it may be self-referenced by training on known properties of the target designs (e.g., target biases for μDBO and target clusters for centrosymmetric cDBO targets).
Such a method exploits the variation in the available information channels in the measurement data (e.g., obtained from the dark-field images). There are multiple information channels (data dimensions) in the measured dark-field images obtained via, for example, μDBO or cDBO metrology. Using this diversity, the proposed methods aim to separate the real overlay signal from the nuisance component(s). Furthermore, the proposed methods aim to suppress the sensitivity to nuisance (e.g., symmetric and asymmetric) process variations by combining multiple information channels e.g., intensity or phase (optionally per pixel or image region), wavelengths and target locations. To incorporate the additional dimensions efficiently and effectively, the overlay estimation problem may be formulated as a multi-dimensional regression problem.
To suppress the effect of process variations, it is proposed to learn the sensitivity and magnitude of process variations in a data driven method. This may be achieved by parameterizing the overlay inference model with sensitivities to process variations. The exact value and types of the symmetric and asymmetric process variations are unknown and direct measurement is either impossible or impractical. For this reason, it is proposed in an embodiment to define a symmetrical proxy P and at least one asymmetrical proxy G for these variations, based on the measured data. If training targets are used which have a programmed or known overlay (or a derived overlay parameter v±), training parameters M and N relating to the proxies can be calibrated or trained on the training targets, and the learnt model can then be applied to production targets. As such, in an externally referenced intensity based (μDBO) embodiment (i.e., where known reference overlay values are available to label the training data), the model may take the following form, for the positive biased and negative biased target pads (sub-targets) +b and −b:
which may be represented more conveniently as:
The overlay parameter v±b,i may, for example, take one of the following forms depending on whether a sine or linear overlay transform is used:
where ov is the true overlay. Note that bold-face lower case letters denote vectors, bold-face upper case letters denote matrices and normal letters denote scalars.
Training or calibration of the model comprises performing the following minimization for training parameters M±b and N±b:
where $v_{\pm} \in \mathbb{R}^{N_t}$ and $H_{i,\pm b} = \Delta I_{i,\pm b} \otimes [\,1 \;\; P_{i,\pm b}^{T} \;\; 1 \;\; G_{i}^{T}\,]$, $\forall i \in N_t$.
In an embodiment, the symmetric proxy may comprise a sum of the measurement values (e.g., intensities or phase differences) from a pair of complementary diffraction orders. In an embodiment, the asymmetric proxy may comprise a second order measurement value difference comprising a difference of: a first measurement value difference of corresponding diffraction orders (e.g., the normal or +1 order) from each of a first biased (or M) sub-target and second biased (or W) sub-target and a second measurement value difference of corresponding diffraction orders (e.g., the complementary or −1 order) from each of the first biased (or M) sub-target and second biased (or W) sub-target; e.g., using an intensity example:
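By way of illustration only, the two proxies described above may be sketched numerically as follows. The function and argument names are hypothetical and are not taken from the disclosure; the inputs are assumed to be arrays of measurement values (e.g., intensities per pixel or per wavelength) for the +1 and −1 diffraction orders of the +b and −b biased sub-targets.

```python
import numpy as np

def symmetric_proxy(i_plus1, i_minus1):
    """Sum of the measurement values of a pair of complementary orders."""
    return np.asarray(i_plus1, dtype=float) + np.asarray(i_minus1, dtype=float)

def asymmetric_proxy(i_p1_pos, i_p1_neg, i_m1_pos, i_m1_neg):
    """Second order difference between biases and diffraction orders.

    i_p1_pos: +1 order, +b sub-target    i_p1_neg: +1 order, -b sub-target
    i_m1_pos: -1 order, +b sub-target    i_m1_neg: -1 order, -b sub-target
    """
    first_diff = np.asarray(i_p1_pos, dtype=float) - np.asarray(i_p1_neg, dtype=float)
    second_diff = np.asarray(i_m1_pos, dtype=float) - np.asarray(i_m1_neg, dtype=float)
    return first_diff - second_diff

# Made-up example values for two channels (illustrative only).
I_p1_pos = [10.0, 12.0]
I_p1_neg = [9.0, 11.0]
I_m1_pos = [8.0, 10.0]
I_m1_neg = [8.5, 9.0]

P_pos = symmetric_proxy(I_p1_pos, I_m1_pos)                       # proxy P for the +b sub-target
G = asymmetric_proxy(I_p1_pos, I_p1_neg, I_m1_pos, I_m1_neg)      # proxy G for the target
```

In this sketch the symmetric proxy is evaluated per sub-target, whereas the asymmetric proxy combines both biased sub-targets and is therefore shared between them.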
Once trained, the training parameters M±b and N±b can be used to infer overlay v±,i from measured asymmetry signals ΔI±b,i:
where the actual true overlay ovi per target i can be determined via reverse transformation of the two values v+,i, v−,i and averaging of the resulting values:
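The reverse transformation and averaging step may be sketched as follows. The exact transform expressions are not reproduced here, so this sketch assumes two commonly used forms: a linear transform v = ov ± b and a sine transform v = sin(2π(ov ± b)/p), with p a grating pitch. The function names and the `pitch` parameter are hypothetical.

```python
import numpy as np

def forward_transform(ov, bias, pitch=None):
    """Assumed overlay transform: linear if pitch is None, else sine."""
    if pitch is None:
        return ov + bias
    return np.sin(2.0 * np.pi * (ov + bias) / pitch)

def reverse_transform(v, bias, pitch=None):
    """Inverse of forward_transform (principal branch for the sine case)."""
    if pitch is None:
        return v - bias
    return pitch / (2.0 * np.pi) * np.arcsin(np.clip(v, -1.0, 1.0)) - bias

def final_overlay(v_plus, v_minus, b, pitch=None):
    """Reverse-transform the +b and -b estimates and average them."""
    ov_plus = reverse_transform(v_plus, +b, pitch)
    ov_minus = reverse_transform(v_minus, -b, pitch)
    return 0.5 * (ov_plus + ov_minus)

# Round trip with made-up numbers (overlay and bias in nm, pitch in nm).
ov_true, b, p = 1.5, 20.0, 600.0
ov_sine = final_overlay(forward_transform(ov_true, +b, p),
                        forward_transform(ov_true, -b, p), b, p)
ov_linear = final_overlay(forward_transform(ov_true, +b),
                          forward_transform(ov_true, -b), b)
```

The sine inverse in this sketch is only valid while (ov ± b) stays within a quarter pitch of zero, which is an assumption of the example rather than a statement about the disclosed method.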
For the cDBO embodiment, the overlay equation may be defined as:
In such a cDBO embodiment, the proxies may be defined independently per cluster as indicated by the subscript c: Pc and Gc.
In addition, since phase differences between M and W pads and the amplitude of each of the pads are measured, either one or both of two different asymmetry proxies may be defined:
where $a_{M/W} = (I_{\max} - I_{\min})_{M/W}$.
In embodiments with three (or more) proxies, additional respective training parameters need to be defined, for example a third magnitude/sensitivity matrix, e.g.
Multiple symmetric and/or asymmetric proxies may be defined. In such an embodiment, each proxy may comprise a variable that correlates with a particular type of process variation. For example, a three proxy embodiment may comprise a first (symmetric) proxy which correlates with symmetric process variations, a second (asymmetric) proxy which correlates with e.g., dSWA and a third (asymmetric) proxy which correlates with e.g., floor tilt. In this way, the contribution of each process variation to the overlay signal can be better distinguished and thus a better model defined to extract/suppress the effect from the overlay signal. Since in cDBO there is access to both amplitude and phase measurement, there are more signal dimensions to identify relevant proxies. However, additional (e.g., more than two) proxies may also in general be defined for μDBO.
The models disclosed herein may be derived as follows (this derivation is provided in terms of a μDBO example, although the concepts are broadly similar for cDBO). For a single pixel, single wavelength λ and single target position on the wafer, and assuming the 2-biased grating target design (per direction) of μDBO (e.g., as illustrated in
where ΔIj,λ,i,±b
The proposed method, in contrast to present methods based on eliminating K, exploits K to learn about the process variations. It can be shown that K is correlated with both symmetric process variations and asymmetric process variations. In other words, while only the asymmetric part of the intensity signal is typically used, symmetric variations are included due to the dependence of K on symmetric variations. Therefore, K can be used as a proxy for the process variations in a training-based algorithm.
Applying a change of variable and simplifying the notations as:
A linear transformation may be used instead of the sine transformation here. Rearranging the equation to describe sensitivity of overlay parameter v to a change in the measured signal ΔI (instead of sensitivity of ΔI to change in v) yields:
$v_i = w_{j,\lambda,i}\,\Delta I_{j,\lambda,i};\ \forall j \in N_p,\ \forall \lambda \in N_\lambda,\ \forall i \in N_t$
where weight $w = K^{-1}$
The two-wave model may be expanded by defining proxies Pj,λ,i, Gj,λ,i for symmetric and asymmetric process variations. To do so, the weights related to symmetric and asymmetric process variations may be separated and the estimation model rewritten as:
where wj,λ=(zj,λ+rj,λ), with zj,λ and rj,λ comprising weights that are a function of symmetric and asymmetric process variations respectively:
Symmetric process variations: zj,λ(Pj,λ,i):
Asymmetric process variations: rj,λ(Gj,λ,i);
i.e., the difference between the images of both biases of the normal or +1 diffraction order and the complementary or −1 diffraction order.
It can be assumed that zj,λ and rj,λ are generic unknown functions of Pj,λ,i and Gj,λ,i respectively. The dependence on wavelength and pixels may be dropped by vectorizing zj,λ, rj,λ, Pj,λ,i and Gj,λ,i for all the considered wavelengths and pixels, yielding the vectors z, r, Pi and Gi.
Defining moT=z0T(
where M∈(1+N
$z^{T}(P_i) = [\,1 \;\; P_i^{T}\,]\,M$
Similarly, defining noT=r0T(
where N∈(1+N
$r^{T}(G_i) = [\,1 \;\; G_i^{T}\,]\,N$
Combining the above, the overlay inference model becomes:
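As a hedged numerical sketch of this combination, the weight vector may be formed as w = z + r with z and r given by the two bracketed expressions above, and then applied to the vectorized asymmetry signal ΔIi. The function name and the array shapes (M and N having one more row than columns) are assumptions made for the example.

```python
import numpy as np

def infer_overlay_parameter(P_i, G_i, dI_i, M, N):
    """Evaluate v_i = ([1 P_i^T] M + [1 G_i^T] N) dI_i for one target.

    P_i, G_i, dI_i: 1-D arrays of length n (vectorized channels).
    M, N: arrays of shape (1 + n, n) holding the training parameters.
    """
    p_aug = np.concatenate(([1.0], np.asarray(P_i, dtype=float)))  # [1, P_i^T]
    g_aug = np.concatenate(([1.0], np.asarray(G_i, dtype=float)))  # [1, G_i^T]
    w = p_aug @ M + g_aug @ N  # combined weight vector z + r, length n
    return float(w @ np.asarray(dI_i, dtype=float))

# Made-up parameters for n = 2 channels (illustrative only).
M_demo = np.array([[1.0, 0.0],
                   [0.0, 1.0],
                   [0.0, 0.0]])
N_demo = np.array([[0.0, 0.0],
                   [0.0, 0.0],
                   [1.0, 1.0]])
v_demo = infer_overlay_parameter([2.0, 3.0], [0.5, 0.1], [1.0, 1.0], M_demo, N_demo)
```

The leading 1 in each augmented vector provides a constant (proxy-independent) weight term, while the remaining rows of M and N scale the weights with the proxy values.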
The model and theory outlined above rely on there being known training labels or known parameter of interest values (e.g., overlay) for the training data. Such known values may be measured using accurate metrology techniques such as scanning electron microscopy, for example, or obtained by any other suitable method. The known values may comprise scanner set values for wafers with programmed overlay. Where training data comprises simulated data (training data may be measured and/or simulated), the parameter of interest/overlay values may be the simulated values. However, reference values may not always be available. As such, a self-referencing embodiment will now be described which is an adaptation of the embodiment described above.
In this embodiment, it is proposed to find the model training parameters
using all the target positions collectively, based on the assumption that overlay on both sub-targets must be the same:
where:
It is also assumed that v+b and v−b can be captured by the same model, i.e.,
As such, the training problem comprises solving a single minimization problem to minimize
whereas in the previous embodiment, the training problem comprises solving a pair of minimization problems (one cost function per sub-target bias +b, −b) to minimize
Therefore, inserting the equality in v space yields, for the training problem:
which may be conveniently written as:
where:
This training is completely self-referenced, with the training using the known bias b of the sub-targets in place of external overlay values.
In an inference phase or manufacturing phase, the overlay can be determined from the following equation.
Once again, this is a single equation rather than the pair of equations (one per bias) of the previous externally referenced embodiment, such that the signals ΔI+b,i, ΔI−b,i from the sub-targets of both biases are used together. The transformed overlay values v+b,i, v−b,i can then be reverse transformed and averaged as with the previous embodiment to determine a final overlay value ov for each target.
For a cDBO self-referenced embodiment, the following assumptions may be made:
where c1 and c2 indicate respective different (e.g., centrosymmetric) target clusters of a pair of clusters. The training may then be performed to enforce this equality; i.e., to find the training parameters which minimize the difference between the overlay inferred from two (or more) centrosymmetric target design clusters. The training equation in this case can be written as:
and, once trained, overlay may be determined from:
with the final overlay comprising the average of the overlay of the two clusters (i is the target position index):
A proposed method based on the above concepts may comprise two phases: a recipe setup phase or training phase (which may be combined with present qualification processes); and a manufacturing phase or overlay inference phase.
In addition to the training data, external reference data POI Ref comprising known reference values for the parameter of interest (e.g., known overlay values) is obtained. This may comprise measured values (e.g., using SEM metrology), other AEI references or (scanner model corrected) scanner set values for wafers with programmed overlay for measured data MEA DAT and/or programmed/computed/set values for the simulated data SIM DAT.
The reference data POI Ref and training data TD are used in a training or optimization step TRA to find/train at least the training parameters TP M±b, N±b (or per cluster in a cDBO example). Of course, there will be more training parameters in embodiments with more proxies defined. This step may use the minimization function of Equation (2), and may be solved via a closed form solution and/or multi-pixel version (as will be described). In this training phase, the parameter estimation problem distills to an optimization problem to be solved simultaneously for the model parameters over multiple dimensions (e.g., intensity, wavelength and target position on the wafer). This training phase can be performed in-line or offline.
In the inference phase, the parameter of interest POI of the new measurement data MD is computed using the trained model TM, based on the training parameters TP (e.g., M±b, N±b) determined in the training phase. This may be done using Equation (3), for example. The parameter of interest POI inference can be done collectively for all the targets, for a subset of the targets, or on a per-target basis. In this phase, to prevent deviation from the process during manufacturing, it is optionally proposed to use one or more monitoring KPIs. These KPIs can be used to identify a drift from the trained model TM. The monitoring KPIs can flag changes for which the learned model is not sufficiently descriptive. This can be used within a control loop to trigger a further recipe setup or training phase.
The training data TD is used in a training or optimization step TRA to find/train the training parameters TP M, N. This step may use the minimization function of Equation (4), and may be solved via a closed form solution and/or multi-pixel version (as will be described). No external reference is used, the training being done on the known biases.
The proposed multi-dimensional regression problem has a closed form solution. The externally referenced problem (Equation (2)) may be rewritten as:
The solution, in the training phase, can be found by computing the pseudo-inverse as:
where H+ indicates the pseudo-inverse of H. Additionally, to avoid instability and apply regularization, the modes used in computing the pseudo-inverse H+ may be restricted by applying a threshold on the singular values of H.
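A minimal sketch of such a pseudo-inverse based training, using synthetic data, is given below. It assumes that each row of H is assembled from Kronecker products of the augmented proxy vectors with the asymmetry signal; the ordering of the unknowns used here is chosen for convenience of the example and may differ from the ordering in the expressions above by a permutation. All function names, the threshold value and the data are hypothetical.

```python
import numpy as np

def design_row(P_i, G_i, dI_i):
    """Row h_i such that v_i = h_i @ [M.ravel(), N.ravel()]."""
    p_aug = np.concatenate(([1.0], P_i))  # [1, P_i^T]
    g_aug = np.concatenate(([1.0], G_i))  # [1, G_i^T]
    return np.concatenate((np.kron(p_aug, dI_i), np.kron(g_aug, dI_i)))

def build_design_matrix(P, G, dI):
    """P, G, dI have shape (n_channels, n_targets); one row of H per target."""
    return np.stack([design_row(P[:, i], G[:, i], dI[:, i])
                     for i in range(dI.shape[1])])

def train(H, v_ref, n, rel_threshold=1e-10):
    """Solve H x = v_ref with a singular-value-thresholded pseudo-inverse."""
    # Singular values below rel_threshold * (largest singular value) are discarded.
    x = np.linalg.pinv(H, rcond=rel_threshold) @ v_ref
    k = (1 + n) * n
    return x[:k].reshape(1 + n, n), x[k:].reshape(1 + n, n)

# Synthetic demonstration: 2 channels, 40 target positions.
rng = np.random.default_rng(0)
n, n_t = 2, 40
P = rng.normal(size=(n, n_t))
G = rng.normal(size=(n, n_t))
dI = rng.normal(size=(n, n_t))
x_true = rng.normal(size=2 * (1 + n) * n)
H = build_design_matrix(P, G, dI)
v_ref = H @ x_true                      # noise-free reference overlay parameters
M, N = train(H, v_ref, n)
v_fit = H @ np.concatenate((M.ravel(), N.ravel()))
```

In this particular sketch the constant rows of M and N multiply the same signal, so H has repeated columns and is rank deficient; the thresholded pseudo-inverse then returns the minimum-norm solution, which reproduces the reference values even though M and N individually are not unique.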
After computing M, N in the training phase, the overlay vector v±b may be computed as:
Similarly, for the self-referenced embodiment, a closed form solution of Equation (4) may be:
where $H = H_{+b} - H_{-b}$
A closed form solution for the cDBO embodiments can be constructed similarly, as will be apparent to the skilled person.
Explicit regularization may also be added to the optimization problem. For example, with Tikhonov regularization, the optimization problem can be written as:
This leads to the closed form solution:
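For illustration, the standard closed form of a Tikhonov-regularized least-squares problem, x = (HᵀH + γI)⁻¹Hᵀv, may be sketched as follows. The symbol γ (here `gamma`) for the regularization strength and the function name are assumptions of the example.

```python
import numpy as np

def tikhonov_solve(H, v, gamma):
    """Minimize ||H x - v||^2 + gamma * ||x||^2 in closed form."""
    H = np.asarray(H, dtype=float)
    v = np.asarray(v, dtype=float)
    n_unknowns = H.shape[1]
    A = H.T @ H + gamma * np.eye(n_unknowns)  # regularized normal matrix
    return np.linalg.solve(A, H.T @ v)

# With H equal to the identity and gamma = 1, the solution is v / 2.
x_demo = tikhonov_solve(np.eye(2), [2.0, 4.0], 1.0)

# For a well-conditioned H, a very small gamma approaches ordinary least squares.
rng = np.random.default_rng(1)
H_demo = rng.normal(size=(30, 4))
v_demo = rng.normal(size=30)
x_ridge = tikhonov_solve(H_demo, v_demo, 1e-12)
x_lstsq = np.linalg.lstsq(H_demo, v_demo, rcond=None)[0]
```

Larger values of the regularization strength shrink the solution towards zero, which trades some fit quality for stability when H is ill-conditioned.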
A similar regularized formulation can also be stated for the self-referenced version, leading to the closed-form solution:
As has been described, it is possible to solve the regression problem using the information from multiple pixels within the image ROI. The above described regression problems may be solved for multiple pixels Np, or with Np set to 1 (standard averaging within the ROI).
However, if multi-pixel information is to be used, the known data matrices P, G, ΔI have the size $N_pN_\lambda \times N_t$ and the unknown training matrices M,N∈(N
In the training phase, SVD is applied on each of the data matrices P, G, ΔI:
where $U_{p,g,a}$ are the respective singular-vector matrices of P, G and ΔI.
The compressed version of the data matrices, i.e. Pc, Gc, ΔIc may be calculated as:
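One way such an SVD-based compression could be sketched is shown below. It assumes that the compressed matrix is obtained by projecting the data matrix onto its q leading left singular vectors; this projection form, the function name and the choice of q are assumptions of the example.

```python
import numpy as np

def compress(X, q):
    """Project X (n_features x n_targets) onto its q leading left singular vectors.

    Returns the compressed matrix (q x n_targets) and the basis U_q.
    """
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    U_q = U[:, :q]
    return U_q.T @ X, U_q

# Synthetic rank-2 data matrix with 6 features and 10 target positions.
rng = np.random.default_rng(2)
X = rng.normal(size=(6, 2)) @ rng.normal(size=(2, 10))
X_c, U_q = compress(X, 2)
X_back = U_q @ X_c  # map the compressed data back to the original space
```

Because the example matrix has rank 2, keeping two singular vectors loses no information; for measured data the number of retained modes would be a tuning choice.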
Applying the data compression, the regression problem (Equation (2)) can be rewritten as:
where the singular-vector matrices are extended in the diagonal with 1, and $M_r$ and $N_r$ are the reduced training parameter matrices.
This regression problem can be equivalently written as:
To find the solution of this optimization problem, the compressed version of the data matrices (Pc, Gc and ΔIc) may be used to train the reduced version of the training parameter matrices (Mr, Nr). For example, the training phase may comprise solving:
and in the inference phase, the overlay values may be obtained via:
This approach may be adapted and used for the self-referenced embodiment and/or a cDBO embodiment, as will be apparent to the skilled person.
It can also be appreciated that a dimension reduction step such as an SVD step may be applied independently of an individual pixel treatment, e.g., it may be applied to any of the other embodiments disclosed herein.
In an embodiment, the overlay methods disclosed herein may allow removal of one biased grating (e.g., per direction) from the μDBO target, thereby halving the target area. Similarly, cDBO targets may be made smaller (e.g., halved in size) by having only one cluster (e.g., only one M pad and W pad) per cDBO target. As such, targets with only a single sub-target per direction having a single bias or no bias (μDBO), or with two sub-targets, one each of an M pad and W pad (cDBO), can be used for inline ADI overlay monitoring.
In a first variation of such an embodiment, the bias may be varied over different single-biased targets (the scope of the term “single biased target” encompasses a zero biased/unbiased target in this context), to provide a collective multi-bias. As such, each wafer may comprise multiple targets, all of which have a single bias (optionally per direction), but with the bias varied between targets such that there are two (or more) biases on the wafer. This enables the proxies Pj,λ,i and Gj,λ,i to be defined in the same way as already described; in particular it allows the asymmetric proxy Gj,λ,i to be used, this proxy relying on there being different biases to determine the second order difference, as has been described. More specifically, to determine the asymmetric proxy in such an embodiment, the required signal differences $(I^{+1}_{j,\lambda,i,+b} - I^{+1}_{j,\lambda,i,-b})$ and $(I^{-1}_{j,\lambda,i,+b} - I^{-1}_{j,\lambda,i,-b})$ between the differently biased sub-targets can now be obtained from different target positions, rather than from the same target position (i.e., $I^{+1}_{j,\lambda,i,+b}$, $I^{-1}_{j,\lambda,i,+b}$ relate to a different target position than $I^{+1}_{j,\lambda,i,-b}$, $I^{-1}_{j,\lambda,i,-b}$). The two target positions used to determine the asymmetric proxy in such an embodiment may be close or adjacent, such that it may be assumed that the asymmetric process effect is similar for the two target locations.
A similar distributed approach may be applied to cDBO targets, where the targets on a wafer may comprise first targets of a first cluster type at some wafer locations, and targets of a second cluster type at other wafer locations.
In an alternative embodiment, all the targets on the wafer can have the same single bias/same cluster type. However, for the μDBO embodiments, this only allows the symmetric proxy to be used. As such, in such a μDBO embodiment, the asymmetric proxy is omitted or set to zero: Gj,λ,i=0, and its corresponding training parameter N±b is neither trained nor used in inference.
In summary, an ADI metrology method is described, usable in fast metrology techniques (e.g., dark-field metrology) for inline overlay monitoring. As such, this approach extends training-based inference methods to dark-field applications. The method incorporates both asymmetric process variations and symmetric process variations into the model.
Such an approach can improve accuracy and robustness against process variations, due to the fact that the method may be used to estimate overlay collectively on multiple target positions on the wafer. The proposed methods may be used to reduce target size. Additionally, the methods allow pixel intensity information from the dark-field images to be incorporated in the overlay inference problem. This provides an opportunity for co-development of calibration and inference methods within the same framework, and for incorporation of pixel-selection and ROI-refinement methods, such as pixel mapping, directly inside the overlay inference.
As has been mentioned, while the embodiments disclosed have largely been described in relation to μDBO or more generally intensity based applications, the same methods can be extended to use in cDBO or more generally phase based applications.
While the targets described above are metrology targets specifically designed and formed for the purposes of measurement, in other embodiments, properties may be measured on targets which are functional parts of devices formed on the substrate. Many devices have regular, grating-like structures. The terms ‘target grating’ and ‘target’ as used herein do not require that the structure has been provided specifically for the measurement being performed. In such an embodiment, either the target gratings and mediator grating may all comprise product structure, or only one or both target gratings comprise product structure, with the mediator grating being specifically formed to mediate the allowable pitches, and therefore enable measurements directly on the product structure. Further, the pitch of the metrology targets is close to the resolution limit of the optical system of the scatterometer, but may be much larger than the dimension of typical product features made by the lithographic process in the target portions C. In practice the lines and/or spaces of the overlay gratings within the targets may be made to include smaller structures similar in dimension to the product features.
An embodiment may include a computer program containing one or more sequences of machine-readable instructions describing methods of measuring targets on a substrate and/or analyzing measurements to obtain information about a lithographic process. This computer program may be executed for example within unit PU in the apparatus of
The program may optionally be arranged to control the optical system, substrate support and the like to perform the steps necessary to calculate the overlay error for measurement of asymmetry on a suitable plurality of targets.
Further embodiments according to the invention are presented in below numbered clauses:
59. A processing apparatus comprising a processor, and being configured to perform the method of any preceding clause.
Although specific reference may have been made above to the use of embodiments of the invention in the context of optical lithography, it will be appreciated that the invention may be used in other applications, for example imprint lithography, and where the context allows, is not limited to optical lithography. In imprint lithography a topography in a patterning device defines the pattern created on a substrate. The topography of the patterning device may be pressed into a layer of resist supplied to the substrate whereupon the resist is cured by applying electromagnetic radiation, heat, pressure or a combination thereof. The patterning device is moved out of the resist leaving a pattern in it after the resist is cured.
The terms “radiation” and “beam” used herein encompass all types of electromagnetic radiation, including ultraviolet (UV) radiation (e.g., having a wavelength of or about 365, 355, 248, 193, 157 or 126 nm) and extreme ultra-violet (EUV) radiation (e.g., having a wavelength in the range of 5-20 nm), as well as particle beams, such as ion beams or electron beams.
The term “lens”, where the context allows, may refer to any one or combination of various types of components, including refractive, reflective, magnetic, electromagnetic and electrostatic components.
The foregoing description of the specific embodiments will so fully reveal the general nature of the invention that others can, by applying knowledge within the skill of the art, readily modify and/or adapt for various applications such specific embodiments, without undue experimentation, without departing from the general concept of the present invention. Therefore, such adaptations and modifications are intended to be within the meaning and range of equivalents of the disclosed embodiments, based on the teaching and guidance presented herein. It is to be understood that the phraseology or terminology herein is for the purpose of description by example, and not of limitation, such that the terminology or phraseology of the present specification is to be interpreted by the skilled artisan in light of the teachings and guidance.
The breadth and scope of the present invention should not be limited by any of the above-described exemplary embodiments, but should be defined only in accordance with the following claims and their equivalents.
| Number | Date | Country | Kind |
|---|---|---|---|
| 22155715.0 | Feb 2022 | EP | regional |

| Filing Document | Filing Date | Country | Kind |
|---|---|---|---|
| PCT/EP2023/051111 | 1/18/2023 | WO | |