This application claims priority of EP application 21173654.1 which was filed on May 12, 2021 and which is incorporated herein in its entirety by reference.
This description relates to mapping metrics between manufacturing systems.
A lithographic apparatus is a machine constructed to apply a desired pattern onto a substrate. A lithographic apparatus can be used, for example, in the manufacture of integrated circuits (ICs). A lithographic apparatus may, for example, project a pattern (also often referred to as “design layout” or “design”) at a patterning device (e.g., a mask) onto a layer of radiation-sensitive material (resist) provided on a substrate (e.g., a wafer).
To project a pattern on a substrate a lithographic apparatus may use electromagnetic radiation. The wavelength of this radiation determines the minimum size of features which can be formed on the substrate. Typical wavelengths currently in use are 365 nm (i-line), 248 nm, 193 nm and 13.5 nm. A lithographic apparatus, which uses extreme ultraviolet (EUV) radiation, having a wavelength within the range 4-20 nm, for example 6.7 nm or 13.5 nm, may be used to form smaller features on a substrate than a lithographic apparatus which uses, for example, radiation with a wavelength of 193 nm.
Low-k1 lithography may be used to process features with dimensions smaller than the classical resolution limit of a lithographic apparatus. In such a process, the resolution formula may be expressed as CD=k1×λ/NA, where λ is the wavelength of radiation employed, NA is the numerical aperture of the projection optics in the lithographic apparatus, CD is the “critical dimension” (generally the smallest feature size printed, but in this case half-pitch) and k1 is an empirical resolution factor. In general, the smaller k1 is, the more difficult it becomes to reproduce on the substrate a pattern that resembles the shape and dimensions planned by a circuit designer in order to achieve particular electrical functionality and performance.
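By way of a non-limiting illustration, the resolution formula CD=k1×λ/NA may be evaluated numerically as in the following minimal sketch. The function name and the operating points (k1 of 0.30, NA of 1.35 and 0.33) are assumptions chosen only for illustration and are not values taken from this description.

```python
def half_pitch_nm(k1: float, wavelength_nm: float, na: float) -> float:
    """Return CD = k1 * lambda / NA, in nanometres."""
    if na <= 0:
        raise ValueError("NA must be positive")
    return k1 * wavelength_nm / na


# Assumed, illustrative operating points (not taken from the description).
duv_cd = half_pitch_nm(k1=0.30, wavelength_nm=193.0, na=1.35)
euv_cd = half_pitch_nm(k1=0.30, wavelength_nm=13.5, na=0.33)

print(f"193 nm:  CD = {duv_cd:.1f} nm")
print(f"13.5 nm: CD = {euv_cd:.1f} nm")
```

For a fixed k1, the shorter wavelength yields the smaller half-pitch, consistent with the relationship stated above.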
To overcome these difficulties, sophisticated fine-tuning steps may be applied to the lithographic projection apparatus and/or design layout. These include, for example, but are not limited to, optimization of NA, customized illumination schemes, use of phase shifting patterning devices, various optimization of the design layout such as optical proximity correction (OPC, sometimes also referred to as “optical and process correction”) in the design layout, or other methods generally defined as “resolution enhancement techniques” (RET). Alternatively, tight control loops for controlling a stability of the lithographic apparatus may be used to improve reproduction of the pattern at low k1.
Various metrology operations may be used to measure features of a design. If measured on different metrology systems, the data from a metrology operation on one system may not match the data from the same metrology operation on a different system. Advantageously, the present method(s) and system(s) are configured to provide a framework to improve matching between systems by exhaustive use of available system calibration data.
According to an embodiment, a method for determining a model configured to predict values of physical characteristics associated with a patterned substrate measured using different measurement tools is provided. The method involves obtaining (i) training data comprising a first set of measured data (e.g., a set of intensity images) associated with a first set of patterned substrates using a first measurement tool (e.g., T1), and reference measurements of a physical characteristic (e.g., overlay, CD) associated with the first set of patterned substrates, (ii) a second set of measured data (e.g., another set of intensity images) associated with a second set of patterned substrates that is measured using a second set of measurement tools, the second set of measurement tools being different from the first measurement tool, and (iii) virtual data based on the second set of measured data, the virtual data being associated with a virtual tool. The method generates a set of mapping functions between the second set of measured data and the virtual data, where each mapping function maps each measured data of the second set of measured data to the virtual data. The method converts, based on the set of mapping functions, the first set of measured data of the training data. The method determines a model based on the reference measurements and the converted first set of measured data such that the model predicts values of the physical characteristic that are within an acceptable threshold of the reference measurements.
In some embodiments, the model may be a machine learning model, an empirical model, or other mathematical models characterized by parameters trained according to the above method.
In some embodiments, each of the first measured data and the second measured data comprises signals detected by sensors configured to measure the portion of the second patterned substrate. In some embodiments, each of the first measured data and the second measured data comprises intensities corresponding to light reflected from the portion of the second patterned substrate.
In some embodiments, the reference measurements of the physical characteristic are obtained by measuring the first set of patterned substrates using a scanning electron microscope (SEM) or an atomic force microscope (AFM).
In some embodiments, the method may further involve creating, based on the set of mapping functions, a recipe of the virtual tool, the recipe comprising a configuration of one or more tool characteristics used during a measurement. In some embodiments, the one or more tool characteristics include, but are not limited to, a wavelength of the light used for measurements; a pupil shape used for measurements; an intensity of light used for measurements; and/or a grating-to-sensor orientation of a patterned substrate.
According to an embodiment, another method for determining a model configured to predict consistent values of physical characteristics associated with a patterned substrate measured using different measurement tools is provided. The method involves obtaining (i) reference measurements of a physical characteristic (e.g., overlay, CD) associated with a first set of patterned substrates, (ii) first measured data associated with a portion of a second patterned substrate using a first measurement tool, and (iii) second measured data associated with the portion of the second patterned substrate using a second measurement tool. The method determines a model by adjusting model parameters based on the first measured data, the second measured data, and the reference measurements to cause the model to predict values of the physical characteristic that are within an acceptable threshold of the reference measurements.
In some embodiments, the physical characteristic comprises at least one of an overlay between a feature on a first layer and a feature on a second layer of the patterned substrate, a critical dimension of features of the patterned substrate, a tilt of the patterned substrate, or an edge placement error associated with the patterned substrate.
In some embodiments, determining of the model involves computing a difference between the first measured data and the second measured data; determining a set of basis functions characterizing the difference data; applying the set of basis functions to the first measured data and the second measured data to generate projected data; and determining the model by adjusting model parameters based on the projected data and the reference measurements to cause the model to predict values of the physical characteristic that are within the acceptable threshold of the reference measurements. In an embodiment, the set of basis functions are determined by a singular value decomposition of the difference data, or principal component analysis of the difference data.
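By way of a non-limiting illustration, the basis-function embodiment may be sketched as follows. The sketch assumes that the measured data are arranged as matrices with one row per measured target and one column per signal component, that a reference value is available for each row, and that the model is a linear least-squares fit; none of these choices is prescribed by the description. The function names, the synthetic data, and the number of retained basis functions are likewise hypothetical.

```python
import numpy as np


def difference_basis(first, second, n_components):
    """Basis functions (one per row) from an SVD of the difference data."""
    diff = first - second
    _, _, vt = np.linalg.svd(diff, full_matrices=False)
    return vt[:n_components]


def fit_on_projected(first, second, reference, basis):
    """Least-squares model parameters fitted on data projected onto the basis."""
    projected = np.vstack([first @ basis.T, second @ basis.T])
    targets = np.concatenate([reference, reference])
    weights, *_ = np.linalg.lstsq(projected, targets, rcond=None)
    return weights


# Synthetic stand-ins for measured data (assumption: 20 targets, 8 components).
rng = np.random.default_rng(0)
first = rng.normal(size=(20, 8))                 # first measurement tool
second = first + 0.1 * rng.normal(size=(20, 8))  # second tool, slightly different
reference = rng.normal(size=20)                  # reference measurements

basis = difference_basis(first, second, n_components=3)
weights = fit_on_projected(first, second, reference, basis)
```

The rows of the basis returned by the singular value decomposition are orthonormal, so applying the basis to the measured data reduces each measurement to a small number of coefficients. Whether the projected components are retained or removed before fitting is a design choice that this sketch does not settle.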
In some embodiments, determining of the model involves determining model parameters by satisfying a difference constraint comprising a difference between a first predicted physical characteristic value and a second predicted physical characteristic value, the first predicted physical characteristic value being predicted using the first measured data as input to the model and the second predicted physical characteristic value being predicted using the second measured data as input to the model.
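By way of a non-limiting illustration, one way to account for such a difference constraint is to add it as a quadratic penalty to a least-squares fit. The sketch below assumes a linear model, a soft (penalty) form of the constraint, and a reference value for each measured target; the description does not prescribe any of these, and the function name, penalty values, and synthetic data are hypothetical.

```python
import numpy as np


def fit_with_difference_penalty(first, second, reference, penalty):
    """Minimise ||S w - t||^2 + penalty * ||(first - second) w||^2 over w.

    S stacks both tools' data and t repeats the reference values; the second
    term penalises disagreement between the two tools' predictions.
    """
    stacked = np.vstack([first, second])
    targets = np.concatenate([reference, reference])
    diff = first - second
    lhs = stacked.T @ stacked + penalty * (diff.T @ diff)
    rhs = stacked.T @ targets
    return np.linalg.solve(lhs, rhs)  # normal equations of the penalised fit


def mismatch(first, second, weights):
    """Norm of the tool-to-tool difference in predicted values."""
    return float(np.linalg.norm((first - second) @ weights))


# Synthetic stand-ins for measured data (assumption: 30 targets, 5 components).
rng = np.random.default_rng(1)
first = rng.normal(size=(30, 5))
second = first + 0.2 * rng.normal(size=(30, 5))
reference = first @ rng.normal(size=5) + 0.05 * rng.normal(size=30)

w_low = fit_with_difference_penalty(first, second, reference, penalty=0.0)
w_high = fit_with_difference_penalty(first, second, reference, penalty=100.0)
```

Increasing the penalty trades some agreement with the reference measurements for closer agreement between the two tools' predictions.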
In some embodiments, the model may be a machine learning model, an empirical model, or other mathematical models characterized by parameters trained according to the above method.
In some embodiments, each of the first measured data and the second measured data comprises signals detected by sensors configured to measure the portion of the second patterned substrate. In some embodiments, each of the first measured data and the second measured data comprises intensities corresponding to light reflected from the portion of the second patterned substrate.
According to an embodiment, a metrology tool is provided. The metrology tool includes a sensor configured to detect signals associated with a portion of a patterned substrate being measured, and one or more processors configured to execute a model trained according to the methods discussed herein. The one or more processors are configured to receive the signals from the sensor; and determine, via a model using the signals as input, values of a physical characteristic associated with the patterned substrate, the model being configured based on measurement data associated with one or more patterned substrates measured using different metrology tools, and reference measurements of the physical characteristic associated with a reference patterned substrate.
In some embodiments, the physical characteristic comprises at least one of an overlay between a feature on a first layer and a feature on a second layer of the patterned substrate, a critical dimension of features of the patterned substrate, a tilt of the patterned substrate, or an edge placement error associated with the patterned substrate.
In some embodiments, the detected signal of the metrology tool includes intensities corresponding to light reflected from the portion of the patterned substrate being measured. In some embodiments, each detected signal is represented as a pixelated image in which one or more pixels have an intensity indicative of a feature of the patterned substrate. In an embodiment, the metrology tool is an optical tool configured to measure a portion of the patterned substrate.
According to an embodiment, there is provided a non-transitory computer-readable medium configured for determining a model configured to predict values of physical characteristics associated with a patterned substrate measured using different measurement tools, the medium comprising instructions stored therein that, when executed by one or more processors, cause operations or processes of methods discussed herein.
The accompanying drawings, which are incorporated in and constitute a part of the specification, illustrate one or more embodiments and, together with the description, explain these embodiments. Embodiments of the invention will now be described, by way of example only, with reference to the accompanying schematic drawings in which corresponding reference symbols indicate corresponding parts, and in which:
Various metrology operations may be used to measure features of a design. If measured on different metrology systems, the data from a metrology operation on one system may not match the data from the same metrology operation on a different system. For example, in the context of integrated circuits, matching between overlay values measured on different overlay measurement systems is often out of specification. A current approach for ensuring that data from different metrology systems is comparable uses the Jones framework. The Jones framework is a ray-based framework that accounts for the polarization state of the light used by the system for measuring (e.g., a light/pupil based metrology system). However, this current approach ignores any phase shift of the light as it travels through the metrology system and thus fails to capture phase-related differences between systems. Phase effects are a major source of system-to-system matching issues. For example, the objective retardation (a.k.a. alpha-map) and the phase-induced channel leakage for a given system are thought to be causes of the system-to-system matching issues.
Advantageously, the present method(s) and system(s) are configured to provide a generic framework to improve matching between systems by exhaustive use of available system calibration data. These calibration data are assumed to be present in the form of the incoming and outgoing density matrices (e.g., ρin and Mout). In the present method(s) and system(s), an intensity metric (e.g., which may, in some embodiments, be and/or include an intensity image (associated with a pupil), an intensity map, a set of intensity values, and/or other intensity metrics) is determined for a manufacturing system (e.g., a light/pupil based system configured to measure overlay continuing with the example above). The intensity metric is determined based on a reflectivity of a location on a substrate (e.g., a wafer and/or other substrates), a manufacturing system characteristic, and/or other information. A mapped intensity metric for a reference system is determined. The reference system has a reference system characteristic. The mapped intensity metric is determined based on the intensity metric, the manufacturing system characteristic, and the reference system characteristic, to mimic the determination of the intensity metric for the manufacturing system using the reference system. In this way, any number of intensity metrics from any number of manufacturing systems may be mapped to this reference system to facilitate comparison of data from different manufacturing systems.
Although specific reference may be made in this text to the manufacture of ICs, and/or metrology related to the manufacture of ICs, the description herein has many other possible applications. For example, it may be employed in the manufacture of integrated optical systems, guidance and detection patterns for magnetic domain memories, liquid-crystal display panels, thin-film magnetic heads, etc. The skilled artisan will appreciate that, in the context of such alternative applications, any use of the terms “reticle”, “wafer” or “die” in this text should be considered as interchangeable with the more general terms “mask”, “substrate” and “target portion”, respectively. In addition, it should be noted that the method described herein may have many other possible applications in diverse fields such as language processing systems, self-driving cars, medical imaging and diagnosis, semantic segmentation, denoising, chip design, electronic design automation, etc. The present method may be applied in any field where quantifying uncertainty in machine learning model predictions is advantageous.
In the present document, the terms “radiation” and “beam” are used to encompass all types of electromagnetic radiation, including ultraviolet radiation (e.g. with a wavelength of 365, 248, 193, 157 or 126 nm) and EUV (extreme ultra-violet radiation, e.g. having a wavelength in the range of about 5-100 nm).
A patterning device may comprise, or may form, one or more design layouts. The design layout may be generated utilizing CAD (computer-aided design) programs. This process is often referred to as EDA (electronic design automation). Most CAD programs follow a set of predetermined design rules in order to create functional design layouts/patterning devices. These rules are set based on processing and design limitations. For example, design rules define the space tolerance between devices (such as gates, capacitors, etc.) or interconnect lines, to ensure that the devices or lines do not interact with one another in an undesirable way. One or more of the design rule limitations may be referred to as a “critical dimension” (CD). A critical dimension of a device can be defined as the smallest width of a line or hole, or the smallest space between two lines or two holes. Thus, the CD regulates the overall size and density of the designed device. One of the goals in device fabrication is to faithfully reproduce the original design intent on the substrate (via the patterning device).
The term “reticle,” “mask,” or “patterning device” as employed in this text may be broadly interpreted as referring to a generic patterning device that can be used to endow an incoming radiation beam with a patterned cross-section, corresponding to a pattern that is to be created in a target portion of the substrate. The term “light valve” can also be used in this context. Besides the classic mask (transmissive or reflective; binary, phase-shifting, hybrid, etc.), examples of other such patterning devices include a programmable mirror array.
As a brief introduction,
In operation, the illumination system IL receives a radiation beam from a radiation source SO, e.g. via a beam delivery system BD. The illumination system IL may include various types of optical components, such as refractive, reflective, magnetic, electromagnetic, electrostatic, and/or other types of optical components, or any combination thereof, for directing, shaping, and/or controlling radiation. The illuminator IL may be used to condition the radiation beam B to have a desired spatial and angular intensity distribution in its cross section at a plane of the patterning device MA.
The term “projection system” PS used herein should be broadly interpreted as encompassing various types of projection system, including refractive, reflective, catadioptric, anamorphic, magnetic, electromagnetic and/or electrostatic optical systems, or any combination thereof, as appropriate for the exposure radiation being used, and/or for other factors such as the use of an immersion liquid or the use of a vacuum. Any use of the term “projection lens” herein may be considered as synonymous with the more general term “projection system” PS.
The lithographic apparatus LA may be of a type wherein at least a portion of the substrate may be covered by a liquid having a relatively high refractive index, e.g., water, so as to fill a space between the projection system PS and the substrate W—which is also referred to as immersion lithography. More information on immersion techniques is given in U.S. Pat. No. 6,952,253, which is incorporated herein by reference.
The lithographic apparatus LA may also be of a type having two or more substrate supports WT (also named “dual stage”). In such “multiple stage” machine, the substrate supports WT may be used in parallel, and/or steps in preparation of a subsequent exposure of the substrate W may be carried out on the substrate W located on one of the substrate support WT while another substrate W on the other substrate support WT is being used for exposing a pattern on the other substrate W.
In addition to the substrate support WT, the lithographic apparatus LA may comprise a measurement stage. The measurement stage is arranged to hold a sensor and/or a cleaning device. The sensor may be arranged to measure a property of the projection system PS or a property of the radiation beam B. The measurement stage may hold multiple sensors. The cleaning device may be arranged to clean part of the lithographic apparatus, for example a part of the projection system PS or a part of a system that provides the immersion liquid. The measurement stage may move beneath the projection system PS when the substrate support WT is away from the projection system PS.
In operation, the radiation beam B is incident on the patterning device, e.g. mask, MA which is held on the mask support MT, and is patterned by the pattern (design layout) present on patterning device MA. Having traversed the mask MA, the radiation beam B passes through the projection system PS, which focuses the beam onto a target portion C of the substrate W. With the aid of the second positioner PW and a position measurement system IF, the substrate support WT can be moved accurately, e.g., so as to position different target portions C in the path of the radiation beam B at a focused and aligned position. Similarly, the first positioner PM and possibly another position sensor (which is not explicitly depicted in
In order for the substrates W (
An inspection apparatus, which may also be referred to as a metrology apparatus, is used to determine properties of the substrates W (
The computer system CL may use (part of) the design layout to be patterned to predict which resolution enhancement techniques to use and to perform computational lithography simulations and calculations to determine which mask layout and lithographic apparatus settings achieve the largest overall process window of the patterning process (depicted in
The metrology apparatus (tool) MT may provide input to the computer system CL to enable accurate simulations and predictions, and may provide feedback to the lithographic apparatus LA to identify possible drifts, e.g. in a calibration status of the lithographic apparatus LA (depicted in
In lithographic processes, it is desirable to make frequent measurements of the structures created, e.g., for process control and verification. Tools to make such measurements include metrology tool (apparatus) MT. Different types of metrology tools MT for making such measurements are known, including scanning electron microscopes or various forms of scatterometer metrology tools MT. Scatterometers are versatile instruments which allow measurements of the parameters of a lithographic process by having a sensor in the pupil or a conjugate plane with the pupil of the objective of the scatterometer, in which case the measurements are usually referred to as pupil based measurements, or by having the sensor in the image plane or a plane conjugate with the image plane, in which case the measurements are usually referred to as image or field based measurements. Such scatterometers and the associated measurement techniques are further described in patent applications US20100328655, US2011102753A1, US20120044470A, US20110249244, US20110026032 or EP1,628,164A, incorporated herein by reference in their entirety. The aforementioned scatterometers may measure features of a substrate, such as gratings, using light in the soft x-ray and visible to near-IR wavelength ranges, for example.
In some embodiments, a scatterometer MT is an angular resolved scatterometer. In these embodiments, scatterometer reconstruction methods may be applied to the measured signal to reconstruct or calculate properties of a grating and/or other features in a substrate. Such reconstruction may, for example, result from simulating interaction of scattered radiation with a mathematical model of the target structure and comparing the simulation results with those of a measurement. Parameters of the mathematical model are adjusted until the simulated interaction produces a diffraction pattern similar to that observed from the real target.
In some embodiments, scatterometer MT is a spectroscopic scatterometer MT. In these embodiments, spectroscopic scatterometer MT may be configured such that the radiation emitted by a radiation source is directed onto target features of a substrate and the reflected or scattered radiation from the target is directed to a spectrometer detector, which measures a spectrum (i.e. a measurement of intensity as a function of wavelength) of the specular reflected radiation. From this data, the structure or profile of the target giving rise to the detected spectrum may be reconstructed, e.g. by Rigorous Coupled Wave Analysis and non-linear regression or by comparison with a library of simulated spectra.
In some embodiments, scatterometer MT is an ellipsometric scatterometer. The ellipsometric scatterometer allows for determining parameters of a lithographic process by measuring scattered radiation for each polarization state. Such a metrology apparatus (MT) emits polarized light (such as linear, circular, or elliptic) by using, for example, appropriate polarization filters in the illumination section of the metrology apparatus. A source suitable for the metrology apparatus may provide polarized radiation as well. Various embodiments of existing ellipsometric scatterometers are described in U.S. patent application Ser. Nos. 11/451,599, 11/708,678, 12/256,780, 12/486,449, 12/920,968, 12/922,587, 13/000,229, 13/033,135, 13/533,110 and 13/891,410, incorporated herein by reference in their entirety.
In some embodiments, scatterometer MT is adapted to measure the overlay of two misaligned gratings or periodic structures (and/or other target features of a substrate) by measuring asymmetry in the reflected spectrum and/or the detection configuration, the asymmetry being related to the extent of the overlay. The two (typically overlapping) grating structures may be applied in two different layers (not necessarily consecutive layers), and may be formed substantially at the same position on the wafer. The scatterometer may have a symmetrical detection configuration as described e.g. in patent application EP1,628,164A, such that any asymmetry is clearly distinguishable. This provides a way to measure misalignment in gratings. Further examples for measuring overlay may be found in PCT patent application publication no. WO 2011/012624 or US patent application US 20160161863, incorporated herein by reference in their entirety.
Other parameters of interest may be focus and dose. Focus and dose may be determined simultaneously by scatterometry (or alternatively by scanning electron microscopy) as described in US patent application US2011-0249244, incorporated herein by reference in its entirety. A single structure (e.g., feature in a substrate) may be used which has a unique combination of critical dimension and sidewall angle measurements for each point in a focus energy matrix (FEM—also referred to as Focus Exposure Matrix). If these unique combinations of critical dimension and sidewall angle are available, the focus and dose values may be uniquely determined from these measurements.
A metrology target may be an ensemble of composite gratings and/or other features in a substrate, formed by a lithographic process, commonly in resist, but also after etch processes, for example. Typically the pitch and line-width of the structures in the gratings depend on the measurement optics (in particular the NA of the optics) to be able to capture diffraction orders coming from the metrology targets. A diffracted signal may be used to determine shifts between two layers (also referred to as ‘overlay’) or may be used to reconstruct at least part of the original grating as produced by the lithographic process. This reconstruction may be used to provide guidance on the quality of the lithographic process and may be used to control at least part of the lithographic process. Targets may have smaller sub-segmentations which are configured to mimic dimensions of the functional part of the design layout in a target. Due to this sub-segmentation, the targets will behave more similarly to the functional part of the design layout such that the overall process parameter measurements resemble the functional part of the design layout. The targets may be measured in an underfilled mode or in an overfilled mode. In the underfilled mode, the measurement beam generates a spot that is smaller than the overall target. In the overfilled mode, the measurement beam generates a spot that is larger than the overall target. In such an overfilled mode, it may also be possible to measure different targets simultaneously, thus determining different processing parameters at the same time.
Overall measurement quality of a lithographic parameter using a specific target is at least partially determined by the measurement recipe used to measure this lithographic parameter. The term “substrate measurement recipe” may include one or more parameters of the measurement itself, one or more parameters of the one or more patterns measured, or both. For example, if the measurement used in a substrate measurement recipe is a diffraction-based optical measurement, one or more of the parameters of the measurement may include the wavelength of the radiation, the polarization of the radiation, the incident angle of radiation relative to the substrate, the orientation of radiation relative to a pattern on the substrate, etc. One of the criteria to select a measurement recipe may, for example, be a sensitivity of one of the measurement parameters to processing variations. More examples are described in US patent application US2016-0161863 and published US patent application US 2016/0370717A1, incorporated herein by reference in their entirety.
It is often desirable to be able to computationally determine how a patterning process would produce a desired pattern on a substrate. Computational determination may comprise simulation and/or modeling, for example. Models and/or simulations may be provided for one or more parts of the manufacturing process. For example, it is desirable to be able to simulate the lithography process of transferring the patterning device pattern onto a resist layer of a substrate as well as the yielded pattern in that resist layer after development of the resist, simulate metrology operations such as the determination of overlay, and/or perform other simulations. The objective of a simulation may be to accurately predict, for example, metrology metrics (e.g., overlay, a critical dimension, a reconstruction of a three dimensional profile of features of a substrate, a dose or focus of a lithography apparatus at a moment when the features of the substrate were printed with the lithography apparatus, etc.), manufacturing process parameters (e.g., edge placements, aerial image intensity slopes, sub resolution assist features (SRAF), etc.), and/or other information which can then be used to determine whether an intended or target design has been achieved. The intended design is generally defined as a pre-optical proximity correction design layout which can be provided in a standardized digital file format such as GDSII, OASIS or another file format.
Simulation and/or modeling can be used to determine one or more metrology metrics (e.g., performing overlay and/or other metrology measurements), configure one or more features of the patterning device pattern (e.g., performing optical proximity correction), configure one or more features of the illumination (e.g., changing one or more characteristics of a spatial/angular intensity distribution of the illumination, such as changing a shape), configure one or more features of the projection optics (e.g., numerical aperture, etc.), and/or for other purposes. Such determination and/or configuration can be generally referred to as mask optimization, source optimization, and/or projection optimization, for example. Such optimizations can be performed on their own, or combined in different combinations. One such example is source-mask optimization (SMO), which involves the configuring of one or more features of the patterning device pattern together with one or more features of the illumination. The optimizations may use the parameterized model described herein to predict values of various parameters (including images, etc.), for example.
In some embodiments, an optimization process of a system may be represented as a cost function. The optimization process may comprise finding a set of parameters (design variables, process variables, etc.) of the system that minimizes the cost function. The cost function can have any suitable form depending on the goal of the optimization. For example, the cost function can be weighted root mean square (RMS) of deviations of certain characteristics (evaluation points) of the system with respect to the intended values (e.g., ideal values) of these characteristics. The cost function can also be the maximum of these deviations (i.e., worst deviation). The term “evaluation points” should be interpreted broadly to include any characteristics of the system or fabrication method. The design and/or process variables of the system can be confined to finite ranges and/or be interdependent due to practicalities of implementations of the system and/or method. In the case of a lithographic projection apparatus, the constraints are often associated with physical properties and characteristics of the hardware such as tunable ranges, and/or patterning device manufacturability design rules. The evaluation points can include physical points on a resist image on a substrate, as well as non-physical characteristics such as dose and focus, for example.
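By way of a non-limiting illustration, the two forms of cost function mentioned above may be sketched as follows. The function names, the normalization of the weighted RMS by the sum of the weights, and the sample values are assumptions made for illustration only.

```python
import math


def weighted_rms_cost(values, targets, weights):
    """Weighted root mean square of deviations from the intended values."""
    total = sum(w * (v - t) ** 2 for v, t, w in zip(values, targets, weights))
    return math.sqrt(total / sum(weights))  # assumed normalization


def worst_case_cost(values, targets):
    """Maximum absolute deviation over all evaluation points."""
    return max(abs(v - t) for v, t in zip(values, targets))


# Hypothetical evaluation points: achieved values, intended values, weights.
values = [1.0, 2.5, 2.0]
targets = [1.0, 2.0, 4.0]
weights = [1.0, 2.0, 1.0]

rms = weighted_rms_cost(values, targets, weights)
worst = worst_case_cost(values, targets)
```

An optimization would then search the design and/or process variables, within their constraints, for the set that minimizes the chosen cost.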
As described above, method 50 (and/or the other methods and systems described herein) is configured to provide a generic framework to improve matching between systems using available system calibration data. These calibration data are assumed to be present in the form of the incoming and outgoing density matrices (e.g., ρin and Mout) and/or in other forms. The density matrices are related to the Jones matrices of the incoming (from source to target) and outgoing (from target to detector) optical paths of a manufacturing (e.g., metrology) system. A Jones matrix associated with an optical path describes how the optical electric field propagates along said path. The associated density matrix is defined as the product of the associated Jones matrix with the conjugate transpose (also known as the Hermitian transpose, designated by “†”) of that same Jones matrix. More specifically, ρin=JinJin†, and Mout=Jout†Jout, with Jin and Jout the respective Jones matrices.
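As a minimal sketch of the definitions ρin=JinJin† and Mout=Jout†Jout, the density matrices can be computed from 2×2 complex Jones matrices as shown below. The numeric Jones matrices are hypothetical and do not represent calibration data of any actual system.

```python
import numpy as np

# Hypothetical 2x2 complex Jones matrices for the incoming and outgoing optical paths.
J_in = np.array([[1.0 + 0.1j, 0.02], [0.01j, 0.95]])
J_out = np.array([[0.9, 0.03j], [0.02, 1.0 - 0.05j]])

rho_in = J_in @ J_in.conj().T    # rho_in = J_in J_in^dagger
M_out = J_out.conj().T @ J_out   # M_out = J_out^dagger J_out

# Both products are Hermitian and positive semidefinite by construction.
rho_in_eigenvalues = np.linalg.eigvalsh(rho_in)
```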
In method 50, an intensity metric (e.g., which may, in some embodiments, be and/or include an intensity image (associated with a pupil), an intensity map, a set of intensity values, and/or other intensity metrics) is determined for a manufacturing system (e.g., a light/pupil based system). The intensity metric is determined based on a reflectivity of a location on a substrate (e.g., a wafer and/or other substrates), a manufacturing system characteristic, and/or other information. A corresponding mapped intensity metric for a reference system is determined. The reference system has a reference system characteristic. The manufacturing system characteristic and/or the reference system characteristic may be and/or include one or more matrices comprising calibration data and/or other information for a given system (e.g., as further described below). The mapped intensity metric is determined based on the intensity metric, the manufacturing system characteristic, the reference system characteristic, and/or other information, to mimic the determination of the intensity metric for the manufacturing system using the reference system. In this way, any number of intensity metrics from any number of manufacturing systems may be mapped to this reference system to facilitate comparison of data from different manufacturing systems.
In some embodiments (as described herein), reference system 62 is an idealized system with predetermined characteristics. The predetermined characteristics may include system operating parameters and/or set points, calibration settings and/or other data, and/or other information. In some embodiments, the predetermined characteristics may be measured for a given manufacturing system, electronically obtained from a manufacturing system and/or electronic storage associated with such a system, programmed by a user (e.g., for a virtual system), assigned by a user, and/or may include other information. In some embodiments, the reference system may be a physical system or a virtual system. In some embodiments, the reference system may represent an average or typical system. In some embodiments, the reference system is configured to represent a plurality of different (physical and/or virtual) manufacturing systems. In some embodiments, the reference system is virtual, and the manufacturing system(s) is (are) physical.
Returning to
Method 50 combines different “measurement channels”, each channel characterized by an incoming-outgoing-polarization and grating-to-sensor-angle (and wavelength), and/or other information. Each channel corresponds to a different set of density matrices (and system matrices) and also to different measured intensities I. A channel is an aggregate of measured data, calibration data, and labels. It includes a set of points, each point having a position in the pupil-plane, a measured intensity value (all together forming a pupil intensity image), an incoming density matrix, and an outgoing density matrix. Said channel also has labels: the associated incoming polarization value, outgoing polarization value, the wavelength, and a grating-to-sensor angle. Additional aspects of operation 52 are further described below in context with operation 54.
At an operation 54, a mapped intensity metric (e.g., 68 and/or 69 in
By way of a non-limiting example, the intensity metric may be associated with overlay measured as part of a semiconductor manufacturing process, and the mapped intensity metric may be associated with a mapped overlay, such that the mapped overlay can be compared to other mapped overlays from other manufacturing systems also associated with the semiconductor manufacturing process. In some embodiments, the intensity metric is an intensity in an intensity-image (pupil), an intensity image itself, an intensity map, a set of intensity values, and/or other intensity metrics. A mapped overlay (for comparison with other overlay values measured by other manufacturing systems) may be determined by taking all these intensities together (in a linear or non-linear way) with certain weight-factors (e.g., as described below). Overlay is not necessarily associated with a single point in a pupil.
The present system(s) and method(s) make use of the Jones framework. The Jones framework describes the propagation of polarized light through an optical system in terms of Jones matrices. A Jones matrix of an optical element, J, is a 2×2 complex matrix that acts on a 2×1 electric field input-vector Ein to produce a 2×1 electric field output-vector Eout, according to Eout=JEin. Each electric field E is expressed as a linear combination of two chosen orthogonal unit-(field-) vectors that span a 2D subspace perpendicular to the propagation direction of the light. Said unit vectors constitute the local polarization directions of the light. The Jones matrix of an optical system is the matrix product of the Jones matrices of the associated optical elements.
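The relation Eout=JEin and the composition of element matrices into a system matrix can be illustrated with a short sketch. The two optical elements chosen here (a polarization rotator followed by an ideal horizontal polarizer) are assumptions for illustration only and are not elements of any particular system described herein.

```python
import numpy as np

# Jones matrix of an ideal horizontal linear polarizer.
polarizer_h = np.array([[1, 0], [0, 0]], dtype=complex)

def rotator(theta):
    """Jones matrix that rotates the polarization by an angle theta (radians)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]], dtype=complex)

# The light traverses the rotator first and the polarizer second, so the
# polarizer matrix appears leftmost in the matrix product.
J_system = polarizer_h @ rotator(np.pi / 4)

E_in = np.array([1, 0], dtype=complex)   # horizontally polarized input field
E_out = J_system @ E_in                  # E_out = J E_in
intensity_out = float(np.vdot(E_out, E_out).real)
```

In this example half of the input intensity is transmitted, since the rotator turns the field 45 degrees away from the polarizer axis.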
The reference system has a reference system characteristic and/or other associated information. In some embodiments, the reference system characteristic is a matrix (or a plurality of matrices) that comprises calibration data for the reference system and/or other information. In some embodiments, the reference system characteristic is one or more matrices and/or other arrangements of characteristics that comprise calibration data and/or other data for the manufacturing system. The reference system matrix (or matrices) may include any data that may be uniquely associated with the reference system so that any variation caused by a reference system itself is represented in, and/or otherwise accounted for by, the reference system matrix (or matrices).
The mapped intensity metric is determined based on the intensity metric, the manufacturing system characteristic, the reference system characteristic, and/or other information. In some embodiments, the manufacturing system matrix and the reference system matrix form a transform matrix. The components of the transform matrix “T” are determined by the system matrices of the manufacturing system(s) and the matrices of the reference system.
In some embodiments, determining the mapped intensity metric comprises a linear transform of measured channel intensities. In some embodiments, determining the mapped intensity metric comprises combining pointwise linear transforms of measured channel intensities. Individual measurement channels may be characterized by an incoming-outgoing polarization, a grating to sensor rotation, a wavelength, and/or other parameters. Polarized light comprises a light wave of which the electric field vector oscillates in a single direction (linear polarization) or in a rotating fashion (circular or elliptical polarization). Light may be polarized with a filter and/or with other components. In the case of linearly polarized light, a direction attribute, e.g., H, V, S or P, is used to specify the direction. In the case of circularly or elliptically polarized light, a rotational sense and/or ellipticity attribute is used to specify the light. In some embodiments, a grating to sensor rotation may comprise an azimuthal angle between a substrate and a sensor in a manufacturing system used to measure reflectivity, intensity, and/or other parameters. The wavelength may refer to the wavelength of light used by the manufacturing system for measuring the reflectivity, intensity, and/or other parameters.
The incoming-outgoing linear polarization comprises horizontal (in) horizontal (out) (H-H), vertical horizontal (V-H), horizontal vertical (H-V), and/or vertical vertical (V-V). The polarization attribute H or V refers to the linear polarization direction of the light as it (e.g., virtually) travels through the pupil plane of the objective. The H direction refers to a first chosen direction in the pupil plane. The V direction refers to a second direction perpendicular to the first direction. Said filters to select incoming and outgoing H and V polarizations are aligned accordingly. In some embodiments, the incoming-outgoing linear polarization comprises S-P, where S (“Senkrecht”) and P (Parallel) form machine independent polarization directions. The S and P polarization directions are defined in relation to the plane spanned by the direction of the (incoming or outgoing) light and the surface normal of the target. The S direction refers to a first direction perpendicular to said plane. The P direction associated with the incoming light is perpendicular to said S direction and perpendicular to the propagation direction of the incoming light. The P direction associated with the outgoing light is perpendicular to said S direction and perpendicular to the propagation direction of the outgoing light. In some embodiments, the grating to sensor rotation comprises a set of given angles (these can be any angles whatsoever), and the set of given angles plus 180 degrees.
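The geometric definition of the S and P directions given above can be sketched with cross products. The function name, the choice of sign for each direction, and the example propagation direction are assumptions of this sketch; only the perpendicularity relations are taken from the description.

```python
import numpy as np

def s_p_directions(k, normal=(0.0, 0.0, 1.0)):
    """Return unit S and P vectors for propagation direction k and a surface normal."""
    k = np.asarray(k, dtype=float)
    n = np.asarray(normal, dtype=float)
    k = k / np.linalg.norm(k)
    n = n / np.linalg.norm(n)
    s = np.cross(n, k)   # perpendicular to the plane spanned by k and the normal
    length = np.linalg.norm(s)
    if length < 1e-12:
        raise ValueError("S and P are not uniquely defined when k is parallel to the normal")
    s = s / length
    p = np.cross(s, k)   # perpendicular to both S and the propagation direction
    return s, p

# Hypothetical incoming light at 30 degrees from the surface normal, in the x-z plane.
theta = np.deg2rad(30.0)
k_in = np.array([np.sin(theta), 0.0, -np.cos(theta)])
s_in, p_in = s_p_directions(k_in)
```

Because S and P depend only on the light direction and the target normal, they do not depend on how a given machine orients its H and V filters.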
In some embodiments, determining the mapped intensity metric comprises mapping individual intensities directly from different points on a pupil, and mapping corresponding intensities from reciprocal points on the pupil. For example,
In some embodiments, determining the mapped intensity metric comprises weighting the intensities directly mapped from the different points on the pupil, and the corresponding intensities from the reciprocal points on the pupil. The weighting is based on the calibration data in the manufacturing system matrix and/or the reference system matrix, a corresponding vectorized form of the reflectivity (as described below), and/or other information. Individual weights are determined based on an incoming polarization, an outgoing polarization, a grating to sensor rotation, a reciprocity, a diffraction order, and/or other parameters associated with a given intensity metric.
For example, the individual mapped points indicated by arrows shown in
then r†=(r1,r2,r3,r4)*, with * denoting the complex conjugate.
As a reminder, in relation 95, intensity I (e.g., an intensity metric) is determined by a manufacturing system (e.g., as described above), S is a system matrix (e.g., comprising one or more manufacturing characteristics as appropriate), and the reflectivity r is unknown (and need not be known). An advantage of using the system matrix S is that the (manufacturing) system properties only enter into the mathematics once, and in a linear way. This enables making linear combinations of sets of equations, even if the actual reflectivity R or r is unknown.
In this example, only the incoming and outgoing polarizations are used and it is assumed that four pupils are measured: HH, HV, VH, and VV. Reciprocity is not taken into account in this example. The four mapped pupils with the same polarization labels (and label “ref”) are determined. There are four expressions (a, b, c, d) corresponding to the four polarization states of I. Taking linear combinations of these equations comprises taking linear combinations of the manufacturing system matrix S (or matrices) on one side (without the need to know r), and the same linear combinations of I on the other side. For each mapped polarization label the linear combinations are sought such that the resulting combination of the actual system matrices S approaches the corresponding reference system matrix with that same mapped polarization label (HH in the example). The linear combination can be optimized for instance with respect to a minimal Frobenius norm of the difference between the combination of manufacturing system matrices and the corresponding reference system matrix. Also other choices can be made. Finally, the linear combination is applied to the intensities I to yield the mapped (or “reference”) intensity. Carrying out the procedure for other mapped polarization labels gives the mapping matrix T that transforms measured intensities to mapped intensities. The mapping operation (e.g., operation 54 shown in
In some embodiments, a “default” use case for the present system(s) and method(s) may be to map to a reference system that somehow resembles the actual manufacturing systems used. Typically, an idealized version of such a system is taken for reference. However, the principles described herein can also be used to define a (hypothetical and/or virtual) reference system that may be difficult to make in reality. In doing so it may be possible to extract intrinsic (semiconductor manufacturing) stack properties that virtually do not depend on any physical manufacturing system. The intrinsic optical stack properties are usually expressed in terms of a complex reflectivity matrix. The elements of this matrix act on the S and P polarization components of the light, where S (“Senkrecht”) and P (Parallel) form machine independent polarization directions, only depending on the direction of the incoming/outgoing light.
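The mapping to a reference system described above, in which a linear combination of manufacturing system matrices is fitted to a reference system matrix under a Frobenius-norm criterion and the same combination is then applied to the measured intensities, can be sketched as follows. This is an illustration under stated assumptions: the intensity is taken to be of the form I=r†Sr with a 4-element vectorized reflectivity r, the system matrices are random Hermitian placeholders, and the reference matrices are constructed synthetically as exact mixtures so that the result can be checked. With real calibration data the fit would generally be approximate.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_hermitian(n):
    """Placeholder for a calibrated system matrix (hypothetical data)."""
    a = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    return a + a.conj().T

def fit_weights(S_list, S_ref):
    """Real weights w minimizing the Frobenius norm of sum_c w[c]*S_list[c] - S_ref."""
    A = np.stack([S.ravel() for S in S_list], axis=1)   # shape (n*n, channels)
    A_real = np.vstack([A.real, A.imag])                # real and imaginary parts stacked
    b = S_ref.ravel()
    b_real = np.concatenate([b.real, b.imag])
    w, *_ = np.linalg.lstsq(A_real, b_real, rcond=None)
    return w

labels = ["HH", "HV", "VH", "VV"]
S_tool = [random_hermitian(4) for _ in labels]

# Synthetic reference system matrices, built as known mixtures of the tool matrices.
T_true = np.eye(4) + 0.05 * rng.normal(size=(4, 4))
S_ref = [sum(T_true[i, c] * S_tool[c] for c in range(4)) for i in range(4)]

# One row of the mapping matrix T per mapped polarization label.
T = np.vstack([fit_weights(S_tool, S) for S in S_ref])

# The reflectivity r is only used here to generate test intensities; the fit never sees it.
r = rng.normal(size=4) + 1j * rng.normal(size=4)
I_tool = np.array([np.vdot(r, S @ r).real for S in S_tool])
I_ref = np.array([np.vdot(r, S @ r).real for S in S_ref])
I_mapped = T @ I_tool
```

The weights are computed from system matrices alone, which reflects the point made above that the reflectivity need not be known to construct the mapping.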
Returning to
In-device metrology (IDM) focuses on measuring physical characteristics such as stack parameters (e.g., overlay) associated with a substrate that are of interest. In existing technology, a model may be trained to determine the physical characteristics of a substrate from measured data (e.g., pupil data) obtained from a metrology tool such as an optical tool. To generate the model, a data-driven approach is used in order to learn how to associate physical characteristics to measured data, using substrates whose reference values of the physical characteristics of interest are given.
Typically, the measured data associated with these substrates all originate from a single measurement tool. However, the trained model is expected to provide consistent measurements of the physical characteristics even if the substrates are measured with different metrology tools used in semiconductor manufacturing. This is not always the case, as small differences in the hardware components of the metrology tools can make a model trained on one tool unsuitable for another tool, generating significant tool-to-tool matching issues.
In some embodiments, a method such as observable mapping was developed to improve tool optical calibration and therefore tool-to-tool matching. However, in some cases, observable mapping may face challenges when measuring particular circuit patterns. For example, in circuit patterns such as 3D-NAND stacks, there exist high-frequency components in the optical signals obtained from the metrology tool. These high-frequency components make calibration via observable mapping difficult. In another example, such as a circuit pattern including DRAM layers, there may be differences in measurements from different tools due to hardware mismatch between the metrology tools coupled with a weak signal providing information about the physical characteristics.
To mitigate the above matching issues (e.g., related to 3D-NAND, DRAM, etc.), a series of time-consuming steps and engineering resources may be required. For example, the mitigation may require a user to measure an additional 10-20 patterned substrates (i.e., in addition to the substrates used to train the model) on the different tools that are meant to give matching measurements.
The present disclosure provides solutions for improving tool-to-tool matching when determining physical characteristics of a patterned substrate. These solutions employ data-driven approaches for model training and recipe creation by adding a number of steps (different from existing training and recipe creation methods) in a procedure to develop a trained inference model suitable for different metrology tools, ensuring physical characteristics measurements match across different tools.
Typically, metrology recipe creation involves using a number of substrates measured on a single metrology tool. For these substrates, corresponding reference data of the physical characteristics is also made available to allow for data-driven model training. On the other hand, the methods herein include calibration substrates that are measured by different tools. For example, the different tools may be a first optical metrology tool and a second optical metrology tool used in the semiconductor manufacturing process. The details of the methods for training a model and recipe creation are further discussed as follows.
Process S11 involves obtaining (i) training data comprising a first set of measured data TDX associated with a first set of patterned substrates using a first measurement tool T1, and reference measurements REF1 of a physical characteristic associated with the first set of patterned substrates, (ii) a second set of measured data CDX (also referred to as calibration data) associated with a second set of patterned substrates (also referred to as calibration wafers) that is measured using a second set of measurement tools T2, the second set of measurement tools T2 being different from the first measurement tool T1, and (iii) virtual data VD1 based on the second set of measured data CDX, the virtual data VD1 being associated with a virtual tool. In one embodiment, the second set of measurement tools includes the first measurement tool T1 and additional tools different from the tool T1.
In an embodiment, the first set of measured data TDX comprises measured data in a form of signals detected by a sensor of the first measurement tool T1 configured to measure a portion of a patterned substrate of the first set of patterned substrates. In an embodiment, the first set of measured data TDX includes a first measured data detected by the sensor of the first measurement tool T1 configured to measure a portion of a first patterned substrate of the first set of patterned substrates; and a second measured data detected by the sensor of the first measurement tool T1 configured to measure a portion of a second patterned substrate of the first set of patterned substrates.
In an embodiment, each measured data of the first set of measured data TDX comprises intensities corresponding to light reflected from a portion of a particular patterned substrate of the first set of patterned substrates. In an embodiment, the intensities comprise pixel intensities of a pixelated image generated by using a pupil for measuring the portion of the particular patterned substrate of the first set of patterned substrates.
In an embodiment, the physical characteristic includes, but is not limited to, an overlay between a feature on a first layer and a feature on a second layer of a patterned substrate; and/or a critical dimension of features of a patterned substrate, a tilt of the patterned substrate, or an edge placement error associated with the patterned substrate.
In an embodiment, the reference measurements REF1 are obtained using a reference tool, the reference tool being different from the first measurement tool T1. In an embodiment, the reference tool is a scanning electron microscope (SEM) or an atomic force microscope (AFM). For example, the reference measurements REF1 of the physical characteristic (e.g., overlay) are obtained by measuring the first set of patterned substrates using the SEM or the AFM. In an embodiment, the reference measurements REF1 may be in the form of self-reference targets (also called programmed patterned substrates), for example, in an alignment radiation source (ASR).
In an embodiment, the virtual data VD1 is determined by applying a mathematical operation between each of the second set of measured data CDX. In an embodiment, the mathematical operation comprises an averaging operation or a weighted averaging operation applied to the second set of measured data CDX. As the virtual data VD1 may be generated based on the calibration data CDX, the virtual data VD1 comprises variations caused by different tool hardware or settings. In an embodiment, based on the mathematical operation, the virtual data VD1 may include common aspects related to the different tools, and filter out uncommon aspects (e.g., variations due to difference in recipes, hardware, etc.).
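As a minimal sketch of the averaging and weighted averaging operations mentioned above, virtual data can be formed from calibration pupil images measured on different tools. The array sizes, the synthetic tool offsets, and the weights are hypothetical values chosen for this example.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical calibration pupil images of the same wafer measured on two tools:
# a common signal plus a small tool-specific offset.
common = rng.normal(size=(8, 8))
cd_tool_a = common + 0.1
cd_tool_b = common - 0.1

def virtual_data(measured, weights=None):
    """Average (or weighted average) of pixelated images across tools."""
    stack = np.stack(measured, axis=0)
    return np.average(stack, axis=0, weights=weights)

vd1 = virtual_data([cd_tool_a, cd_tool_b])
vd1_weighted = virtual_data([cd_tool_a, cd_tool_b], weights=[3.0, 1.0])
```

In this synthetic case the plain average cancels the opposite tool offsets, while the weighted average retains part of the offset of the more heavily weighted tool.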
Process S13 involves generating a set of mapping functions MFX between the second set of measured data CDX and the virtual data VD1, each mapping function mapping each measured data of the second set of measured data CDX to the virtual data VD1. In an embodiment, the set of mapping functions MFX may be linear functions that map one data point (e.g., a pixel value of measured data CDX) to a corresponding data point in the virtual data VD1.
In an embodiment, generating of the set of mapping functions MFX involves mapping each measured data of the second set of measured data CDX to the virtual data VD1, each mapping function providing a means to represent each measured data as if measured by the virtual tool. In an embodiment, generating of the set of mapping functions MFX (e.g., MF1 and MF2) involves determining a function for mapping each measured data of the second set of measured data CDX to the virtual data VD1. In an embodiment, the mapping function may be determined using any appropriate data mapping method, such as a least-squares fit. In an embodiment, each measured data and the virtual data VD1 are represented as pixelated images.
For example, the mapping function MFX may be a linear function configured to map a particular measured data to the virtual data VD1, a non-linear function configured to map a particular measured data to the virtual data VD1, or other types of functions. For example, MF1 is a linear map between pixel values of a first measured data and pixel values of virtual data VD1, and MF2 is another linear map between pixel values of a second measured data and pixel values of virtual data VD1.
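One possible form of a linear mapping function determined by least squares is sketched below. Representing each mapping function as a single gain and offset applied to every pixel is an assumption of this sketch, as are the function names and the synthetic data; other linear or non-linear forms are equally possible.

```python
import numpy as np

def fit_linear_map(measured, virtual):
    """Least-squares gain and offset such that gain * measured + offset approximates virtual."""
    x = np.asarray(measured, dtype=float).ravel()
    y = np.asarray(virtual, dtype=float).ravel()
    A = np.column_stack([x, np.ones_like(x)])
    (gain, offset), *_ = np.linalg.lstsq(A, y, rcond=None)
    return gain, offset

def apply_linear_map(mapping, image):
    """Represent an image as if it had been measured by the virtual tool."""
    gain, offset = mapping
    return gain * np.asarray(image, dtype=float) + offset

rng = np.random.default_rng(2)
vd1 = rng.normal(size=(8, 8))        # hypothetical virtual data
cd1 = (vd1 - 0.3) / 1.2              # hypothetical measurement from one tool
cd2 = (vd1 + 0.1) / 0.9              # hypothetical measurement from another tool

mf1 = fit_linear_map(cd1, vd1)
mf2 = fit_linear_map(cd2, vd1)
```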
Process S15 involves converting, based on the set of mapping functions MFX, the first set of measured data TDX of the training data. In an embodiment, the converting operation causes the first measured data TDX to be mapped to the virtual tool while incorporating (via the mapping functions) effects of variations in the tools. As such, when the converted data is used for training the model, the trained model predictions (e.g., overlay values) correspond as if determined using the virtual tool. Process S17 involves determining a model M10 based on the reference measurements REF1 and the converted first set of measured data TDX such that the model M10 predicts values of the physical characteristic that are within an acceptable threshold (e.g., within 10% range) of the reference measurements REF1.
In an embodiment, determining of the model M10 is an iterative process. Each iteration may involve predicting, via a base model configured with initial values of model parameters and using the converted first set of measured data TDX as input, values of the physical characteristic associated with the first set of patterned substrates. The predicted values of the physical characteristic (e.g., CD, overlay, etc.) are compared with the reference measurements REF1. In an embodiment, the comparison involves determining a difference between the predicted values and the reference measurements REF1. Based on the comparison, the initial values of the model parameters are adjusted to cause the predicted values (e.g., CD, overlay, etc.) to be within the acceptable threshold of the reference measurements REF1, wherein the adjusted model parameters configure the model M10 for predicting values of the physical characteristic for any measurement tool.
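The iterative adjustment of model parameters described above can be illustrated with a simple gradient-based fit. The linear base model, the learning rate, the absolute tolerance used as the acceptable threshold, and the synthetic data are all assumptions of this sketch and are not prescribed by the method.

```python
import numpy as np

def train_linear_model(X, y, lr=0.1, tol=1e-3, max_iter=10000):
    """Iteratively adjust parameters until predictions are within tol of the references.

    X: (n_samples, n_features) converted measured data; y: reference measurements.
    """
    n, d = X.shape
    w = np.zeros(d)   # initial values of the model parameters
    b = 0.0
    for _ in range(max_iter):
        pred = X @ w + b                 # predict with the current parameters
        err = pred - y                   # compare with the reference measurements
        if np.max(np.abs(err)) <= tol:   # stop once within the acceptable threshold
            break
        w -= lr * (X.T @ err) / n        # gradient step on the mean squared error
        b -= lr * err.mean()
    return w, b

rng = np.random.default_rng(3)
X = rng.normal(size=(50, 3))                     # hypothetical converted pupil features
y = X @ np.array([0.5, -1.0, 2.0]) + 0.25        # hypothetical reference overlay values

w, b = train_linear_model(X, y)
max_error = float(np.max(np.abs(X @ w + b - y)))
```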
In an embodiment, the method 1100 further involves creating, based on the set of mapping functions MFX, a recipe of the virtual tool, the recipe including a configuration of one or more tool characteristics used during a measurement. In an embodiment, the one or more tool characteristics include, but are not limited to, a wavelength of the light used for measurements; a pupil shape used for measurements; an intensity of light used for measurements; and/or a grating-to-sensor orientation of a patterned substrate.
In an embodiment, the method 1100 further includes transforming, based on the trained model M10 and the set of mapping functions MFX, a recipe of the virtual tool associated with the virtual data VD1 to recipes associated with the first measurement tool and the second measurement tool. In an embodiment, each recipe causes the respective tool to provide consistent measurements. For example, a first recipe includes characteristics associated with the first measurement tool T1, and a second recipe includes characteristics associated with the second measurement tool (e.g., a tool of T2).
In an embodiment, the method 1100 may further include process S18 for capturing, via a metrology tool, signals associated with a portion of a patterned substrate; and process S19 for executing the trained model M10 using the captured signals as input to determine measurements of the physical characteristic associated with the patterned substrate.
In an embodiment, the process S19 further includes converting, via a mapping function from the set of mapping functions corresponding to the metrology tool being used, the signals; and executing the trained model using the converted signals as input to determine measurements of the physical characteristic associated with the patterned substrate. For example, the metrology tool captures an image of a portion of the patterned substrate. The captured image can be used as an input to the trained model M10 that is configured using a mapping function (e.g., MF1) corresponding to the metrology tool, so that the model M10 can predict overlay values associated with patterns printed on the patterned substrate.
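The convert-then-predict flow described above can be sketched as follows. The tool identifiers, the gain and offset form of the mapping functions, and the linear form of the trained model are assumptions for this illustration only.

```python
import numpy as np

# Hypothetical per-tool mapping functions, stored as (gain, offset) pairs.
mapping_functions = {"tool_A": (1.25, -0.5), "tool_B": (0.8, 0.3)}

def convert(signal, tool_id):
    """Convert captured signals with the mapping function of the tool being used."""
    gain, offset = mapping_functions[tool_id]
    return gain * np.asarray(signal, dtype=float) + offset

def predict(model, signal, tool_id):
    """Execute a (hypothetical linear) trained model on the converted signals."""
    weights, bias = model
    converted = convert(signal, tool_id)
    return float(converted.ravel() @ weights + bias)

rng = np.random.default_rng(4)
virtual_pupil = rng.normal(size=(4, 4))
model = (rng.normal(size=16), 0.2)

# Synthetic captured signals: what each tool would record for the same wafer.
signals = {tool: (virtual_pupil - offset) / gain
           for tool, (gain, offset) in mapping_functions.items()}
predictions = {tool: predict(model, sig, tool) for tool, sig in signals.items()}
```

In this synthetic setting both tools yield the same predicted value, since each captured signal is first brought to the virtual-tool representation.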
Based on the calibration data related to wafers CWA and CWB, virtual data may be generated. For example, an average, or linear combination of the calibration data may be computed to generate the virtual data. In an embodiment, such virtual data may be considered to be associated with a virtual tool VT. A virtual setting or recipe may also be computed based on the recipes of the tools T2 and T3 or based on the virtual data. As such, when the virtual tool is considered to be configured according to the virtual recipe, it generates the virtual data.
As shown in
Furthermore, training data comprising measured data MDX associated with wafers TW1, TW2, and TW3, and reference data R1 (e.g., overlay values) corresponding to each of the wafers TW1-TW3 may be obtained. For example, the measured data MDX includes a first pupil data MD1, a second pupil data MD2, and a third pupil data MD3 obtained by light reflected from a portion of the wafers TW1, TW2, and TW3, respectively. Furthermore, the training data includes reference data R1 such as overlay values associated with wafers TW1, TW2, and TW3. In an embodiment, the reference data R1 may be obtained using a tool such as SEM or AFM.
In an embodiment, the measured data MDX correspond to a particular tool, and may not correspond to measurements that could have been obtained if the wafers TW1-TW3 were measured using the virtual tool. As such, the measured data MDX is converted using the mapping functions MF2 and MF3. The converted data (e.g., T1′ and T2′) of MDX along with the reference data R1 is further used for determining a model. For example, a process 1200 may be a machine learning or data fitting process based on the model type (e.g., a machine learning model, or an empirical model). In an embodiment, the process 1200 is configured to determine model parameters of the model using the converted data T1′ and T2′ as input for making predictions of physical characteristics. The predicted characteristic values may be compared with the reference data R1 to adjust the model parameters. For example, a gradient based adjustment of model parameters may be employed to cause an error between the predictions and reference data to be minimized. The process 1200 generates a trained model M1 configured to predict values of the physical characteristics of interest.
In an embodiment, the model M1 may be further combined with the mapping functions such as MF2 and MF3 to generate models M11 and M12. The model M11 may be employed when determining physical characteristics using the metrology tool T2, while model M12 may be employed when determining physical characteristics using the metrology tool T3.
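For the special case in which both the trained model and the mapping function are linear, combining them into a single tool-specific model reduces to folding the gain and offset into the model parameters. The sketch below assumes this linear case; the function name and numeric values are hypothetical.

```python
import numpy as np

def compose(model, mapping):
    """Fold a (gain, offset) mapping into a linear model (weights, bias).

    weights @ (gain * x + offset) + bias == (gain * weights) @ x + (bias + offset * sum(weights))
    """
    weights, bias = model
    gain, offset = mapping
    return gain * weights, bias + offset * weights.sum()

rng = np.random.default_rng(7)
m1 = (rng.normal(size=16), 0.2)   # hypothetical trained model on virtual-tool data
mf2 = (1.25, -0.5)                # hypothetical mapping function for one tool
m11 = compose(m1, mf2)            # tool-specific model acting on raw tool signals

x_raw = rng.normal(size=16)       # hypothetical raw signal from that tool
two_step = float(m1[0] @ (mf2[0] * x_raw + mf2[1]) + m1[1])
one_step = float(m11[0] @ x_raw + m11[1])
```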
In an embodiment, the model M1 may be trained to determine measurement recipes to be applied by a metrology tool so that consistent measurements from different tools may be obtained. The model M11 may be employed for determining a recipe for the metrology tool T2, while model M12 may be employed for determining a recipe for the metrology tool T3.
Process S31 involves obtaining (i) reference measurements REF1 of a physical characteristic associated with a first set of patterned substrates, (ii) first measured data MD13 associated with a portion of a second patterned substrate using a first measurement tool T1, and (iii) second measured data CD13 associated with the portion of the second patterned substrate using a second measurement tool T2.
In an embodiment, each of the first measured data MD13 and the second measured data CD13 comprises signals detected by sensors of tools T1 and T2, respectively, configured to measure the portion of the second patterned substrate. In an embodiment, each of the first measured data MD13 and the second measured data CD13 comprises a pixelated image, wherein each pixel has an intensity corresponding to light reflected from the portion of the second patterned substrate.
Process S33 involves determining a model M30 by adjusting model parameters based on the first measured data MD13, the second measured data CD13, and the reference measurements REF1 to cause the model M30 to predict values of the physical characteristic that are within an acceptable threshold of the reference measurements REF1. In an embodiment, the model M30 is a machine learning model (e.g., CNN), or an empirical model.
In an embodiment, the reference measurements REF1 are obtained using a reference tool, the reference tool being different from the tools T1 and T2. In an embodiment, the reference tool is a scanning electron microscope (SEM) or an atomic force microscope (AFM). For example, the reference measurements REF1 of the physical characteristic (e.g., overlay) are obtained by measuring the first set of patterned substrates using the SEM or the AFM.
In an embodiment, the process S33 of determining of the model M30 involves example operations S331, S333, S335, and S337, as shown in
Step S331 involves computing a difference between the first measured data MD13 and the second measured data CD13. As an example, the measured data may be pupil data such as an intensity image created by light reflecting from a portion of the substrate being measured. Accordingly, for measurements from two different tools, a pupil-to-pupil difference between measurements is computed as a pupil difference: PΔ=P1−P2, where P1 represents the first measured data MD13 and P2 represents the second measured data CD13.
Step S333 involves determining a set of basis functions BF characterizing the difference data (e.g., the pupil difference PΔ). In an embodiment, the set of basis functions BF is determined by a singular value decomposition (SVD) of the difference data, or by principal component analysis (PCA) of the difference data. The decomposition methods SVD and PCA are only exemplary, and the present disclosure is not limited to a particular set of basis functions or a particular decomposition method.
In an embodiment, the singular value decomposition of the obtained pupil difference data may be computed as follows:
$$P_\Delta = U_\Delta S_\Delta V_\Delta^{T},$$
$$\tilde{U}_\Delta = U_\Delta(:, 1:k)$$
In the above equation, matrix UΔ represents components or a set of basis functions that explain the difference data. Matrix ŨΔ represents a filter constructed from the difference between the two tools. The term k represents the number of leading columns of the matrix UΔ that account for a desired amount (e.g., more than 80%) of the total energy or variation in the pupil difference data. In an embodiment, ŨΔ represents a set of coefficients of the set of basis functions BF (e.g., principal components or other basis functions) that account for the desired amount (e.g., more than 80%) of the total energy or variation in the pupil difference data. In this example, the pupil difference data is represented as a linear combination of the columns of this matrix. In one example, matrix VΔ represents the components or the set of basis functions that are orthonormal to UΔ. In another example, the matrices U and V may be in different spaces that are not necessarily orthogonal.
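As an illustration only, the decomposition and truncation described above can be sketched in NumPy. The function name `difference_basis`, the column-per-measurement layout of the pupil data, and the synthetic data are assumptions made for this sketch and are not part of the described embodiments.

```python
import numpy as np


def difference_basis(p1, p2, energy=0.8):
    """Leading left singular vectors of the tool-to-tool pupil difference.

    p1, p2 : arrays of shape (n_pixels, n_measurements); each column is one
             flattened pupil image of the same target from tool 1 / tool 2
             (this layout is an assumption of the sketch).
    energy : fraction of the summed squared singular values to retain.
    """
    p_delta = p1 - p2                                   # P_delta = P1 - P2
    u, s, _ = np.linalg.svd(p_delta, full_matrices=False)
    captured = np.cumsum(s**2) / np.sum(s**2)           # energy in the first k columns
    k = int(np.searchsorted(captured, energy)) + 1      # smallest k reaching the target
    k = min(k, s.size)                                  # guard against round-off near energy = 1
    return u[:, :k], k


# Synthetic example (hypothetical data): the two tools differ only along
# two fixed pixel patterns, so two basis functions should be sufficient.
rng = np.random.default_rng(0)
common = rng.normal(size=(50, 30))
patterns, _ = np.linalg.qr(rng.normal(size=(50, 2)))    # two orthonormal pixel patterns
t = np.arange(30)
amplitudes = np.vstack([np.cos(0.7 * t), np.sin(0.7 * t)])
p_tool1 = common + patterns @ amplitudes
p_tool2 = common

u_tilde, k = difference_basis(p_tool1, p_tool2, energy=0.99)
print(k, u_tilde.shape)
```

Selecting k from the cumulative squared singular values is one common way to quantify the "total energy" criterion; other criteria may equally be used.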
Step S335 involves applying the set of basis functions BF to the first measured data MD13 and the second measured data CD13 to generate projected data 1310. For example, the pupil data (e.g., MD13) of the training wafers (e.g., TW1 and TW2) is projected onto the subspace orthogonal to ŨΔ using the following equation:
$$P_{pr} = \left(1 - \tilde{U}_\Delta \tilde{U}_\Delta^{T}\right) P$$
The above projection indicates that when the pupil data from the training wafers is projected using the above projection operation, the pupil data is cleaned of signals that differ between the two tools (e.g., T1 and T2). In other words, the projection filters out the signals that are not common to the two tools. Hence, when the projected data is used for training the model, the trained model will not be sensitive to these differences, and the model will not associate the tool differences with the values of the physical characteristic (e.g., overlay). In an embodiment, the above process may be applied for any product, for example, a product related to memory, a circuit performing a desired function related to an application, etc. In an embodiment, the above process may be applied every time a product changes.
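The filtering effect of the projection can be illustrated with a short NumPy sketch. The helper name `project_out` and the synthetic pupil data are hypothetical; the sketch assumes the columns of ŨΔ are orthonormal and that the "1" in the projection equation denotes the identity matrix.

```python
import numpy as np


def project_out(u_tilde, p):
    """Compute (1 - U U^T) P without forming the n_pixels x n_pixels projector."""
    return p - u_tilde @ (u_tilde.T @ p)


# Hypothetical data: both tools see the same signal plus tool-specific
# contributions confined to the two columns of u_tilde.
rng = np.random.default_rng(1)
u_tilde, _ = np.linalg.qr(rng.normal(size=(40, 2)))     # orthonormal columns
signal = rng.normal(size=(40, 25))
p_tool1 = signal + u_tilde @ rng.normal(size=(2, 25))
p_tool2 = signal + u_tilde @ rng.normal(size=(2, 25))

pr1 = project_out(u_tilde, p_tool1)
pr2 = project_out(u_tilde, p_tool2)
print(np.max(np.abs(p_tool1 - p_tool2)))   # raw tool-to-tool difference
print(np.max(np.abs(pr1 - pr2)))           # difference remaining after projection
```

In this constructed case the projected data from the two tools coincide up to round-off, because the tool-specific contributions lie entirely in the removed subspace. With real data the cancellation would only be as good as the basis ŨΔ.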
Step S337 involves determining the model M30 by adjusting model parameters based on the projected data 1310 (e.g., Ppr) and the reference measurements REF1 to cause the model M30 to predict values of the physical characteristic that are within the acceptable threshold of the reference measurements REF1.
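A minimal sketch of fitting a model on projected data follows. A ridge-regularized linear model is used here purely as a stand-in for the model M30 (which may be a CNN or an empirical model); the function names, the synthetic data, and the threshold value are assumptions of the sketch.

```python
import numpy as np


def project_out(u_tilde, p):
    """Remove the components of p that lie along the columns of u_tilde."""
    return p - u_tilde @ (u_tilde.T @ p)


def fit_linear_model(p_pr, ref, ridge=1e-6):
    """Least-squares weights mapping projected pupils to reference values.

    p_pr : (n_pixels, n_measurements) projected pupil data.
    ref  : (n_measurements,) reference values, e.g. overlay from a reference tool.
    """
    x = p_pr.T
    gram = x.T @ x + ridge * np.eye(x.shape[1])   # small ridge keeps the system solvable
    return np.linalg.solve(gram, x.T @ ref)


def predict(weights, p_pr):
    return p_pr.T @ weights


# Hypothetical data: the physical characteristic depends on the pupil only
# through directions outside the tool-difference subspace.
rng = np.random.default_rng(2)
n_pixels, n_meas = 20, 200
u_tilde, _ = np.linalg.qr(rng.normal(size=(n_pixels, 2)))
signal = rng.normal(size=(n_pixels, n_meas))
w_true = project_out(u_tilde, rng.normal(size=n_pixels))
ref = signal.T @ w_true
p_tool1 = signal + u_tilde @ rng.normal(size=(2, n_meas))
p_tool2 = signal + u_tilde @ rng.normal(size=(2, n_meas))

weights = fit_linear_model(project_out(u_tilde, p_tool1), ref)
pred1 = predict(weights, project_out(u_tilde, p_tool1))
pred2 = predict(weights, project_out(u_tilde, p_tool2))
threshold = 1e-3                                   # assumed acceptable threshold
print(np.max(np.abs(pred1 - ref)) < threshold)
print(np.max(np.abs(pred1 - pred2)))
```

The model is trained only on projected data from one tool, yet yields matching predictions for the other tool once its data is projected in the same way.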
In an embodiment, the process S33 of determining of the model M30 involves determining model parameters by satisfying a difference constraint comprising a difference between a first predicted physical characteristic value and a second predicted physical characteristic value, the first predicted physical characteristic value being predicted using the first measured data MD13 as input to the model M30 and the second predicted physical characteristic value being predicted using the second measured data CD13 as input to the model M30.
In an embodiment, the process S33 of determining of the model M30 involves example steps S341, S343, and S345, as shown in
Step S345 involves, responsive to the difference constraint not being satisfied, adjusting the initial values of the model parameters based on a gradient descent of the difference constraint with respect to the model parameters such that the difference constraint is satisfied. In an embodiment, the gradient descent indicates a direction in which values of the model parameters are to be adjusted. It can be understood by a person of ordinary skill in the art that the present disclosure is not limited to the gradient descent method, and any other optimization or model fitting method may be used to determine appropriate model parameters.
In an embodiment, the determining of the model parameters further involves computing a cost function as a function of the predicted physical characteristic values and the reference measurements REF1; determining whether the cost function satisfies a desired threshold associated therewith; and adjusting the initial values of the model parameters based on the cost function to cause the cost function to be within the desired threshold, the adjusting being performed using a gradient descent of the cost function with respect to the model parameters. In an embodiment, the cost function may be a sum of squared errors plus a regularization term that helps prevent overfitting of the model M30. In an embodiment, the model fitting is done using Lagrange multipliers, by iterating to find the Lagrange multiplier that satisfies the constraints and minimizes the cost function.
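The cost function and threshold check described above may be sketched as follows. The sketch assumes a linear model, a mean-squared-error term with an L2 regularization term, and hypothetical values for the learning rate, regularization weight, and threshold; none of these choices are prescribed by the present description.

```python
import numpy as np


def cost(w, x, ref, lam):
    """Mean squared error plus an L2 regularization term."""
    residual = x @ w - ref
    return np.mean(residual**2) + lam * np.sum(w**2)


def cost_gradient(w, x, ref, lam):
    """Gradient of `cost` with respect to the model parameters w."""
    residual = x @ w - ref
    return 2.0 * x.T @ residual / len(ref) + 2.0 * lam * w


def train(x, ref, lam=1e-3, lr=0.05, threshold=1e-2, max_iter=5000):
    """Gradient descent from zero initial parameters until the cost is within the threshold."""
    w = np.zeros(x.shape[1])
    for _ in range(max_iter):
        if cost(w, x, ref, lam) <= threshold:
            break
        w -= lr * cost_gradient(w, x, ref, lam)   # step against the gradient
    return w


# Hypothetical data: five input features and noise-free reference values.
rng = np.random.default_rng(3)
x = rng.normal(size=(200, 5))
w_true = np.array([1.0, -0.5, 0.25, 0.0, 0.75])
ref = x @ w_true

w_fit = train(x, ref)
print(cost(w_fit, x, ref, 1e-3))
```

The regularization weight trades fitting accuracy against parameter magnitude, so the threshold must be chosen to be reachable for the chosen weight.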
In an embodiment, the cost function and constraints used during the training of the model are defined as follows:
minimize cost function $f(x)$
Constraint: $\operatorname{mean}\left(par_{tool1} - par_{tool2}\right)^{2} < \epsilon$
During the training of a base model (having initial model parameter values), the above constraints are used. During the training, measured data from the tool T1 is used as input to the base model for predicting values partool1 of the physical characteristics. Similarly, measured data from the other tool T2 is used as input to the base model to predict values partool2 of the physical characteristics. According to the above constraints, the model parameters are configured to maintain the difference in values of the physical characteristics below an acceptable threshold. In other words, after completing the training, the model M30 predicts values of the physical characteristics using input data from different tools. The predicted difference also matches the reference data REF1. Hence, after completing the training, the model M30, when applied, predicts substantially the same values of the physical characteristics (e.g., overlay) irrespective of whether the input data is received from the metrology tool T1 or the other metrology tool T2. Hence, consistent measurements of the physical characteristics may be obtained.
In the present example, the calibration data may be obtained by measuring calibration wafers CWA and CWB (an example of the second set of measured data CDX) using two different metrology tools. For example, a first calibration wafer CWA may be measured using an optical metrology tool T2, and a second calibration wafer CWB may be measured using another optical metrology tool T3. In another example, the first calibration wafer CWA may be measured using both optical metrology tools T2 and T3 to generate measured data C1 and C2, respectively. Similarly, the second calibration wafer CWB may be measured using the optical metrology tools T2 and T3 to generate measured data C3 and C4 (not illustrated). In one embodiment, the measured data may be represented as intensity images obtained from reflected light from a portion of the substrates CWA and CWB. In the above example, settings or measurement recipes used with the tools T2 and T3 may be the same or different. For example, a first recipe involves obtaining pupil data or an intensity image using a 400-nanometer wavelength, and a second recipe involves obtaining pupil data or an intensity image using a 700-nanometer wavelength.
The measured data MDX and the corresponding reference data R1, and the calibration data C1-C4, are used for determining a model M3. According to an embodiment, the model M3 is determined by the process 1300 (of
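One possible way to realize the constrained training is sketched below for a linear model. The sketch reads the constraint as the mean of the squared prediction differences, which is an interpretation, and finds the Lagrange multiplier by bisection; the function names, data, and parameter values are hypothetical.

```python
import numpy as np


def solve_for_multiplier(x1, x2, ref, mu, lam=1e-3):
    """Minimize mse(x1 w, ref) + lam |w|^2 + mu * mean(((x1 - x2) w)^2) in closed form."""
    n, d = x1.shape
    diff = x1 - x2
    lhs = x1.T @ x1 / n + lam * np.eye(d) + mu * (diff.T @ diff) / n
    return np.linalg.solve(lhs, x1.T @ ref / n)


def tool_mismatch(w, x1, x2):
    """Mean squared difference between predictions for tool 1 and tool 2."""
    return float(np.mean(((x1 - x2) @ w) ** 2))


def train_constrained(x1, x2, ref, eps, lam=1e-3, n_bisect=60):
    """Search for the smallest multiplier mu for which the mismatch is below eps."""
    w = solve_for_multiplier(x1, x2, ref, 0.0, lam)
    if tool_mismatch(w, x1, x2) < eps:
        return w, 0.0                                  # constraint inactive
    lo, hi = 0.0, 1.0
    while tool_mismatch(solve_for_multiplier(x1, x2, ref, hi, lam), x1, x2) >= eps:
        lo, hi = hi, 2.0 * hi                          # grow until feasible
    for _ in range(n_bisect):
        mid = 0.5 * (lo + hi)
        if tool_mismatch(solve_for_multiplier(x1, x2, ref, mid, lam), x1, x2) >= eps:
            lo = mid
        else:
            hi = mid                                   # hi always stays feasible
    return solve_for_multiplier(x1, x2, ref, hi, lam), hi


# Hypothetical data: the two tools disagree only in the first input feature.
rng = np.random.default_rng(4)
n, d = 300, 6
x1 = rng.normal(size=(n, d))
v = np.zeros(d)
v[0] = 1.0
x2 = x1 + np.outer(rng.normal(size=n), v)
ref = x1 @ np.array([1.0, 0.5, -0.5, 0.25, 0.0, 0.75])
eps = 1e-3

w_free = solve_for_multiplier(x1, x2, ref, mu=0.0)
w_con, mu = train_constrained(x1, x2, ref, eps)
print(tool_mismatch(w_free, x1, x2), tool_mismatch(w_con, x1, x2), mu)
```

In this sketch the constrained model reduces its reliance on the feature that differs between tools, at the cost of a larger fitting error on the reference values.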
In an embodiment, the methods discussed herein may be provided as one or more computer program products or a non-transitory computer readable medium having instructions recorded thereon, the instructions when executed by a computer implementing the operation of the method 400 discussed above. For example, an example computer system CS in
Computer system CS may be coupled via bus BS to a display DS, such as a cathode ray tube (CRT) or flat panel or touch panel display for displaying information to a computer user. An input device ID, including alphanumeric and other keys, is coupled to bus BS for communicating information and command selections to processor PRO. Another type of user input device is cursor control CC, such as a mouse, a trackball, or cursor direction keys for communicating direction information and command selections to processor PRO and for controlling cursor movement on display DS. This input device typically has two degrees of freedom in two axes, a first axis (e.g., x) and a second axis (e.g., y), that allows the device to specify positions in a plane. A touch panel (screen) display may also be used as an input device.
According to one embodiment, portions of one or more methods described herein may be performed by computer system CS in response to processor PRO executing one or more sequences of one or more instructions contained in main memory MM. Such instructions may be read into main memory MM from another computer-readable medium, such as storage device SD. Execution of the sequences of instructions contained in main memory MM causes processor PRO to perform the process steps described herein. One or more processors in a multi-processing arrangement may also be employed to execute the sequences of instructions contained in main memory MM. In an alternative embodiment, hard-wired circuitry may be used in place of or in combination with software instructions. Thus, the description herein is not limited to any specific combination of hardware circuitry and software.
The term “computer-readable medium” as used herein refers to any medium that participates in providing instructions to processor PRO for execution. Such a medium may take many forms, including but not limited to, non-volatile media, volatile media, and transmission media. Non-volatile media include, for example, optical or magnetic disks, such as storage device SD. Volatile media include dynamic memory, such as main memory MM. Transmission media include coaxial cables, copper wire and fiber optics, including the wires that comprise bus BS. Transmission media can also take the form of acoustic or light waves, such as those generated during radio frequency (RF) and infrared (IR) data communications. Computer-readable media can be non-transitory, for example, a floppy disk, a flexible disk, hard disk, magnetic tape, any other magnetic medium, a CD-ROM, DVD, any other optical medium, punch cards, paper tape, any other physical medium with patterns of holes, a RAM, a PROM, an EPROM, a FLASH-EPROM, any other memory chip or cartridge. Non-transitory computer readable media can have instructions recorded thereon. The instructions, when executed by a computer, can implement any of the features described herein. Transitory computer-readable media can include a carrier wave or other propagating electromagnetic signal.
Various forms of computer readable media may be involved in carrying one or more sequences of one or more instructions to processor PRO for execution. For example, the instructions may initially be borne on a magnetic disk of a remote computer. The remote computer can load the instructions into its dynamic memory and send the instructions over a telephone line using a modem. A modem local to computer system CS can receive the data on the telephone line and use an infrared transmitter to convert the data to an infrared signal. An infrared detector coupled to bus BS can receive the data carried in the infrared signal and place the data on bus BS. Bus BS carries the data to main memory MM, from which processor PRO retrieves and executes the instructions. The instructions received by main memory MM may optionally be stored on storage device SD either before or after execution by processor PRO.
Computer system CS may also include a communication interface CI coupled to bus BS. Communication interface CI provides a two-way data communication coupling to a network link NDL that is connected to a local network LAN. For example, communication interface CI may be an integrated services digital network (ISDN) card or a modem to provide a data communication connection to a corresponding type of telephone line. As another example, communication interface CI may be a local area network (LAN) card to provide a data communication connection to a compatible LAN. Wireless links may also be implemented. In any such implementation, communication interface CI sends and receives electrical, electromagnetic or optical signals that carry digital data streams representing various types of information.
Network link NDL typically provides data communication through one or more networks to other data devices. For example, network link NDL may provide a connection through local network LAN to a host computer HC. This can include data communication services provided through the worldwide packet data communication network, now commonly referred to as the “Internet” INT. Local network LAN and the Internet INT both use electrical, electromagnetic or optical signals that carry digital data streams. The signals through the various networks and the signals on network data link NDL and through communication interface CI, which carry the digital data to and from computer system CS, are exemplary forms of carrier waves transporting the information.
Further embodiments of the present non-transitory computer-readable medium, method and metrology tool are disclosed in the subsequent list of numbered clauses:
The concepts disclosed herein may simulate or mathematically model any generic imaging system for imaging sub-wavelength features, and may be especially useful with emerging imaging technologies capable of producing increasingly shorter wavelengths. Emerging technologies already in use include EUV (extreme ultraviolet) lithography and DUV lithography, the latter being capable of producing a 193 nm wavelength with the use of an ArF laser, and even a 157 nm wavelength with the use of a fluorine laser. Moreover, EUV lithography is capable of producing wavelengths within a range of 5-20 nm by using a synchrotron or by hitting a material (either solid or a plasma) with high energy electrons in order to produce photons within this range.
While the concepts disclosed herein may be used for imaging on a substrate such as a silicon wafer, it shall be understood that the disclosed concepts may be used with any type of lithographic imaging systems, e.g., those used for imaging on substrates other than silicon wafers, and/or metrology systems. In addition, the combination and sub-combinations of disclosed elements may comprise separate embodiments. For example, predicting a complex electric field image and determining a metrology metric such as overlay may be performed by the same parameterized model and/or different parameterized models. These features may comprise separate embodiments, and/or these features may be used together in the same embodiment.
Although specific reference may be made in this text to embodiments of the invention in the context of a metrology apparatus, embodiments of the invention may be used in other apparatus. Embodiments of the invention may form part of a mask inspection apparatus, a lithographic apparatus, or any apparatus that measures or processes an object such as a wafer (or other substrate) or mask (or other patterning device). These apparatus may be generally referred to as lithographic tools. Such a lithographic tool may use vacuum conditions or ambient (non-vacuum) conditions.
While specific embodiments of the invention have been described above, it will be appreciated that the invention may be practiced otherwise than as described. The descriptions above are intended to be illustrative, not limiting. Thus it will be apparent to one skilled in the art that modifications may be made to the invention as described without departing from the scope of the claims set out below.
Number | Date | Country | Kind |
---|---|---|---|
21173654.1 | May 2021 | EP | regional |

Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/EP2022/060839 | 4/25/2022 | WO | |