The described embodiments relate to metrology systems and methods, and more particularly to methods and systems for improved measurement accuracy.
Semiconductor devices such as logic and memory devices are typically fabricated by a sequence of processing steps applied to a specimen. The various features and multiple structural levels of the semiconductor devices are formed by these processing steps. For example, lithography among others is one semiconductor fabrication process that involves generating a pattern on a semiconductor wafer. Additional examples of semiconductor fabrication processes include, but are not limited to, chemical-mechanical polishing, etch, deposition, and ion implantation. Multiple semiconductor devices may be fabricated on a single semiconductor wafer and then separated into individual semiconductor devices.
Metrology processes are used at various steps during a semiconductor manufacturing process to detect defects on wafers to promote higher yield. Optical and x-ray based metrology techniques offer the potential for high throughput without the risk of sample destruction. A number of techniques including scatterometry and reflectometry implementations and associated analysis algorithms are commonly used to characterize critical dimensions, film thicknesses, composition and other parameters of nanoscale structures.
As devices (e.g., logic and memory devices) move toward smaller nanometer-scale dimensions, characterization becomes more difficult. Devices incorporating complex three-dimensional geometry and materials with diverse physical properties contribute to characterization difficulty. In general, semiconductor device shapes and profiles are changing dramatically along with new process capabilities. In particular, advanced logic and memory devices must meet increasingly demanding specifications for Critical Dimension (CD) profiles. Thus, detailed features of geometric profiles must be measured accurately.
Significant advances in process chemistry have enabled new etch applications. In some examples, High Aspect Ratio (HAR) etch tools are capable of etching away very narrow vertical channels in semiconductor die with aspect ratios, i.e., ratio of height/width, of 80:1, or higher. This capability has enabled flash memory architectures to transition from two dimensional floating-gate architectures to fully three dimensional geometries. In some examples, film stacks and etched structures are very deep (e.g., three micrometers in depth, or more) and include an extremely high number of layers (e.g., 400 layers, or more).
As the etch process penetrates deeper into the structure, the etch rate is susceptible to change along the channel. This leads to a non-uniform etch profile, i.e., the Critical Dimension (CD) of a fabricated channel varies as a function of height. Typical semiconductor devices include millions of HAR channels separated from each other by extremely small distances, e.g., tens of nanometers. Thus, etch profile uniformity and parallelism of HAR channels must be controlled to very tight specifications to achieve an acceptable device yield.
High aspect ratio structures create challenges for film and CD measurements. The ability to measure the critical dimensions that define the shapes of holes and trenches of these structures is critical to achieve desired performance levels and device yield. The metrology must be capable of measuring the CD of a continuous profile through a deep channel to determine the location of CD variations and inflection points of profile variations.
In other examples, the most advanced memory and logic device structures, e.g., nanowire structures, forksheet structures, complementary field effect transistor (CFET) structures, multi-deck VNAND structures, etc., incorporate new complex three-dimensional geometry, dramatic topographic changes, and materials with diverse orientation and physical properties. These advanced devices are difficult to characterize.
In response to these challenges, more complex metrology tools have been developed. Measurements are performed over large ranges of several machine parameters (e.g., wavelength, azimuth and angle of incidence, etc.), and often simultaneously. As a result, the measurement time, computation time, and the overall time to generate reliable results, including measurement recipes and accurate measurement models, increases significantly.
Existing physical model based metrology methods typically include a series of steps to model and then measure structure parameters. Typically, measurement data (e.g., DOE spectra) is collected from a set of samples or wafers, a particular metrology target, a testing critical dimension target, an in-cell actual device target, an SRAM memory target, etc. An accurate model of the optical response from these complex structures including a model of the geometric features, dispersion parameter, and the measurement system is formulated. Typically, a regression is performed to refine the geometric model. In addition, simulation approximations (e.g., slabbing, Rigorous Coupled Wave Analysis (RCWA), etc.) are performed to avoid introducing excessively large errors. Discretization and RCWA parameters are defined. A series of simulations, analysis, and regressions are performed to refine the geometric model and determine which model parameters to float. A library of synthetic spectra is generated. Finally, measurements are performed using the library or regression in real time with the geometric model.
In other examples, machine learning model based metrology methods are employed to perform measurements. In these examples, training data are gathered and employed to train a machine learning based model that maps measured spectra to values of one or more parameters of interest.
In many fabrication scenarios, dimensions measured before and after a particular process step, or set of process steps, are correlated. Thus, in principle, pre-process measurement data may be employed to improve the accuracy of measurements of structures after one or more intervening process steps.
In some examples, pre-process measurement data are employed as part of a physical model based metrology method. In these examples, pre-process measurement data are fed forward to a post-process measurement model. This approach may be employed in a regression based solution using an RCWA solver where all model degrees of freedom are floated simultaneously. In these examples, the pre-process measurement data breaks parameter correlations among multiple interacting model parameters within the post-process measurement model. Unfortunately, RCWA solver based measurement methods have not shown promise for many measurement applications involving the most advanced memory and logic device structures.
Machine learning model based metrology methods have shown greater promise for many measurement applications involving the most advanced memory and logic device structures. However, pre-process measurement data cannot be directly fed forward to a post-process ML based measurement model because each floating parameter is resolved using an independently optimized ML model. Thus, there are no parameter correlations to be broken in an ML based measurement model. As a result, data feedforward of pre-process measurement data does not improve the accuracy of an ML model based measurement.
In general, the effectiveness of a ML based measurement model depends on the quality and quantity of reference training data. Training data must be gathered at high sampling frequency over a wide range of the process parameter space to meet the measurement application requirements involving the most advanced memory and logic device structures. Unfortunately, actual reference measurement data, e.g., Transmission Electron Microscopy (TEM) measurements, Scanning Electron Microscopy (SEM) measurements, etc., are limited to a very small number of measurement sites on a wafer, e.g., fewer than 50 locations, for a particular measurement application. The lack of accurate reference measurement data collected from each wafer leaves vast portions of the process space unrepresented or underrepresented in the training data set. As a result, a ML based measurement model trained on sparse reference measurement data tends to inaccurately characterize structures fabricated within large portions of the process space.
In an attempt to overcome the lack of actual reference measurement data, training data are generated synthetically. In some examples, post-process synthetic data sets, e.g., spectra, are generated based on a physics based measurement model. The synthetic data sets are generated using an RCWA solver over a range of different parameter values that simulates the process space. The synthetically generated data sets are employed to train a ML based measurement model. In practice, however, synthetically generated training data do not capture all pre-process variation and cannot predict all the possible post-process variation before library generation. In some examples, pre-process variation occurs in parameters that are fixed in value when generating post-process synthetic data sets. As a result, the pre-process variation is not captured at all in the training data. In one example, pre-process variation of a material optical dispersion occurs, yet the dispersion parameter is treated as fixed when generating post-process synthetic data sets.
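The limitation described above can be illustrated with a minimal sketch. It assumes a toy forward model, `toy_forward_model`, as a hypothetical stand-in for a physics based solver such as RCWA; the function names, the cosine response, the wavelengths, and all numeric values are illustrative assumptions and are not taken from the described embodiments. The sketch sweeps an assumed critical dimension while holding a dispersion-like parameter fixed, so any real variation of that parameter is absent from the synthetic set.

```python
import math
from typing import List, Sequence, Tuple


def toy_forward_model(cd_nm: float, dispersion: float,
                      wavelengths_nm: Sequence[float]) -> List[float]:
    """Hypothetical stand-in for a physics based solver (NOT an RCWA solver)."""
    return [dispersion * math.cos(2.0 * math.pi * cd_nm / wl)
            for wl in wavelengths_nm]


def generate_synthetic_set(cd_values_nm: Sequence[float],
                           wavelengths_nm: Sequence[float],
                           fixed_dispersion: float = 1.5
                           ) -> List[Tuple[List[float], float]]:
    """Pair each assumed CD value with a simulated spectrum.

    The dispersion-like parameter is held fixed for every sample, so the
    resulting set contains no variation of that parameter.
    """
    return [(toy_forward_model(cd, fixed_dispersion, wavelengths_nm), cd)
            for cd in cd_values_nm]


# Illustrative values only.
wavelengths = [400.0, 500.0, 600.0]
cd_values = [20.0 + 0.5 * i for i in range(5)]
synthetic = generate_synthetic_set(cd_values, wavelengths)

# A structure whose dispersion drifted before the process step produces a
# spectrum that no sample in the synthetic set reproduces.
drifted_spectrum = toy_forward_model(20.0, 1.6, wavelengths)
```

In this sketch, a model trained only on `synthetic` never sees spectra like `drifted_spectrum`, which mirrors the uncaptured pre-process variation discussed above.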
In practice, the number of pre-process parameters that can be floated in a physics based measurement model is limited, e.g., 10-15 parameters, by concerns about model quality, time to solution, and the computational effort associated with library generation. Moreover, the number of pre-process parameters that actually vary greatly exceeds that number. As a result, the pre-process variation of many parameters is not captured in the post-process synthetic data sets, even when some pre-process parameters are floated when generating post-process synthetic data sets.
Modeling of emerging semiconductor structures has resulted in increasingly complicated models with unsatisfactory results. Machine learning model based metrology methods have shown promise for many measurement applications involving the most advanced memory and logic device structures. Unfortunately, the training data currently employed to train ML based measurement models is limiting the performance of trained ML based measurement models. In particular, pre-process measurement data that correlates with post-process parameters of interest is not adequately captured by the training data sets currently employed. As complex semiconductor structures become more common, and with less time per project, improved measurement model training methods and tools are desired.
Methods and systems for using historical measurement data to train a present state, machine learning (ML) based measurement model are described herein. This approach takes advantage of the correlation between structural characteristics of measured samples fabricated in accordance with different design revisions, process revisions, or both. In addition, prior state measurement data are employed to train a present state, ML based measurement model. This approach takes advantage of the correlation between structural characteristics of measured samples fabricated before and after one or more intervening process steps.
In one aspect, a present state, ML based measurement model is trained using training data associated with measurements of a plurality of instances of a current version of a semiconductor structure in a present state of a semiconductor process flow and training data associated with measurements of a plurality of instances of a historical version of the semiconductor structure in the present state of the semiconductor process flow.
A present state indicates the state of the semiconductor structure after the latest process step applied to the semiconductor structure, and before any subsequent process steps are applied to the semiconductor structure. A prior state indicates the state of the semiconductor structure before the latest process step was applied to the semiconductor structure. In some examples, a significant amount of validated measurement data is collected from a semiconductor structure in a prior state. In some of these examples, accurate measurements of one or more parameters of interest in a prior state are relatively easy to obtain compared to a present state. In this manner, measurement data associated with a semiconductor structure in a prior state are typically trusted. To the extent that a semiconductor structure in a prior state is reasonably correlated with the semiconductor structure in a present state, prior state measurement data can be employed to improve the accuracy and reliability of a trained present state measurement model.
A version of a semiconductor structure indicates the design version of a semiconductor structure, a process recipe version employed to fabricate the semiconductor structure, or both. A current version of a semiconductor structure is the design revision, process recipe, or both, associated with the semiconductor structure for which a present state measurement model is being trained. A historical version of the semiconductor structure is a different design revision, different process recipe, or both, associated with the semiconductor structure. Typically, a historical version of a semiconductor structure is an earlier design revision, earlier process recipe, or both, for which a significant amount of validated measurement data has been collected. In this manner, measurement data associated with historical versions of a semiconductor structure are typically trusted. To the extent that historical versions of the semiconductor structure are reasonably correlated with a current version of the semiconductor structure, historical measurement data can be employed to improve the accuracy and reliability of a trained present state measurement model.
In one aspect, a training data set includes raw measurement signals associated with a measurement of each of the plurality of instances of the current version of the semiconductor structure in the present process state, and a corresponding measured value of the parameter of interest associated with a reference measurement of each of the plurality of instances of the current version of the semiconductor structure by a reference metrology system.
In some embodiments, the training data set also includes raw measurement signals associated with a measurement of each of a plurality of instances of a historical version of the semiconductor structure in the present process state, and a corresponding measured value of the parameter of interest associated with a reference measurement of each of the plurality of instances of the historical version of the semiconductor structure by a reference metrology system.
In some embodiments, a training data set also includes raw measurement signals associated with a measurement of each of a plurality of instances of a historical version of the semiconductor structure in the present process state, and a corresponding measured value of the measured parameter of interest associated with each of the plurality of instances of the historical version of the semiconductor structure by a high-throughput, in-line production metrology system.
In some embodiments, a training data set also includes a plurality of assumed values of a parameter of interest characterizing the current version of the semiconductor structure in the present process state and synthetically generated, raw measurement signals corresponding to each of the assumed values of the parameter of interest.
In some embodiments, a training data set includes raw measurement signals associated with a measurement of each of a plurality of instances of the current version of the semiconductor structure in a prior process state and a corresponding measured value of the parameter of interest associated with a reference measurement of each of the plurality of instances of the current version of the semiconductor structure by a reference metrology system.
In some embodiments, a training data set includes assumed values of a parameter of interest characterizing the current version of the semiconductor structure in a prior process state and synthetically generated, raw measurement signals corresponding to each of the assumed values of a parameter of interest.
In some embodiments, a training data set includes raw measurement signals associated with a measurement of each of a plurality of instances of a historical version of the semiconductor structure in a prior process state and a corresponding measured value of the parameter of interest associated with a reference measurement of each of the plurality of instances of the historical version of the semiconductor structure in the prior process state by a reference metrology system.
In some embodiments, a training data set includes raw measurement signals associated with a measurement of each of the plurality of instances of a historical version of the semiconductor structure in the present process state and a corresponding measured value of the measured parameter of interest associated with each of the plurality of instances of the historical version of the semiconductor structure by a high-throughput, in-line production metrology system. This training data set is derived from high-throughput, in-line measurements of historical instances of the structure of interest fabricated on one or more production wafers in the ith prior state, where i is any non-zero positive integer number bounded by the total number of prior process states before the present process state in the semiconductor fabrication process flow.
In general, training data sets may include training data associated with historical, prior state measurements at any number of prior process steps. Each of the different prior states of the semiconductor process flow and the present state of the semiconductor process flow are separated by one or more intervening semiconductor manufacturing process steps. Furthermore, each of the different prior states of the semiconductor process flow and any other of the different prior states are separated by one or more intervening semiconductor manufacturing process steps.
In some examples, training data continues to be generated based on reliable, high-throughput, in-line measurements of a continuously growing number of instances of a structure of interest at a prior state, along with corresponding high-throughput, in-line measurement signals in the present state. In these examples, prior state and present state measurements continue to be collected from in-line, production wafers. Periodically, the expanded set of training data is employed to retrain the present state measurement model to continuously improve the accuracy and reliability of the trained present state measurement model as production continues.
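The periodic retraining described above can be sketched as follows. This is a minimal illustration under stated assumptions: the class `RetrainingScheduler`, the fixed retraining interval, and the placeholder trainer `mean_value_model` are hypothetical names introduced here, and the placeholder trainer merely averages the trusted values rather than fitting a real measurement model.

```python
from typing import Callable, List, Optional, Sequence, Tuple

# (present state measurement signals, trusted value from in-line measurement)
Pair = Tuple[Tuple[float, ...], float]


class RetrainingScheduler:
    """Accumulates in-line training pairs and retrains at a fixed interval."""

    def __init__(self, train_fn: Callable[[List[Pair]], object],
                 retrain_every: int) -> None:
        if retrain_every < 1:
            raise ValueError("retrain_every must be at least 1")
        self.train_fn = train_fn
        self.retrain_every = retrain_every
        self.pairs: List[Pair] = []
        self.model: Optional[object] = None
        self.retrain_count = 0

    def add_pair(self, signals: Sequence[float], value: float) -> bool:
        """Store one new pair; retrain on the full expanded set when due."""
        self.pairs.append((tuple(signals), value))
        if len(self.pairs) % self.retrain_every == 0:
            self.model = self.train_fn(list(self.pairs))
            self.retrain_count += 1
            return True
        return False


def mean_value_model(pairs: List[Pair]) -> float:
    """Placeholder trainer (assumption): the 'model' is the mean trusted value."""
    return sum(value for _, value in pairs) / len(pairs)


# Illustrative usage: retrain after every second new production sample.
scheduler = RetrainingScheduler(mean_value_model, retrain_every=2)
for i in range(5):
    scheduler.add_pair([0.1 * i, 0.2 * i], float(i))
```

Each retraining uses every pair collected so far, so the training set only grows as production continues. A real deployment would likely trigger retraining on a schedule or on a drift metric; the fixed sample count is chosen here only for simplicity.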
In a further aspect, a present state measurement model training engine includes a weighting module that assigns different weighting values to different sets of training data. The relative weighting of different sets of training data emphasizes training data sets assigned a relatively high weighting and deemphasizes training data sets assigned a relatively low weighting. In this manner, training data sets associated with a higher level of trust in the data or higher correlation to the current version of the structure of interest in the present state are emphasized over training data sets that are less trusted or have lower correlation to the current version of the structure of interest in the present state.
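One way such weighting could enter a training objective is sketched below. The sketch assumes a weighted mean squared error, which is only one of many possible choices and is not stated in the description; the function name `weighted_mse`, the data set labels, and the weight values are hypothetical.

```python
from typing import Dict, List, Tuple

# Each training data set: list of (estimated POI, trusted POI) pairs.
DataSet = List[Tuple[float, float]]


def weighted_mse(data_sets: Dict[str, DataSet],
                 weights: Dict[str, float]) -> float:
    """Weighted mean squared error over several training data sets.

    Each sample's squared error is scaled by the weight assigned to its
    data set, so highly weighted sets dominate the objective.
    """
    weighted_error = 0.0
    total_weight = 0.0
    for name, pairs in data_sets.items():
        w = weights[name]
        for estimated, trusted in pairs:
            weighted_error += w * (estimated - trusted) ** 2
            total_weight += w
    if total_weight == 0.0:
        raise ValueError("at least one sample must have non-zero weight")
    return weighted_error / total_weight


# Illustrative values: current-version data is trusted more than historical data.
data_sets = {
    "current_doe": [(10.0, 10.5), (12.0, 12.0)],
    "historical_doe": [(9.0, 11.0)],
}
weights = {"current_doe": 1.0, "historical_doe": 0.25}
loss = weighted_mse(data_sets, weights)

# With equal weights, the large historical error contributes more.
equal_loss = weighted_mse(data_sets, {"current_doe": 1.0, "historical_doe": 1.0})
```

Lowering the weight on the historical set reduces the influence of its errors on the objective, which is the deemphasis described above.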
The foregoing is a summary and thus contains, by necessity, simplifications, generalizations and omissions of detail; consequently, those skilled in the art will appreciate that the summary is illustrative only and is not limiting in any way. Other aspects, inventive features, and advantages of the devices and/or processes described herein will become apparent in the non-limiting detailed description set forth herein.
Reference will now be made in detail to background examples and some embodiments of the invention, examples of which are illustrated in the accompanying drawings.
Methods and systems for using historical measurement data to train a present state, machine learning (ML) based measurement model are described herein. Historical measurement data are employed as part of the training data set for a present state, ML based measurement model to take advantage of the correlation between structural characteristics of measured samples fabricated in accordance with different design revisions, process revisions, or both.
In addition, methods and systems for using prior state measurement data to train a present state, machine learning (ML) based measurement model are described herein. Prior state measurement data are employed as part of the training data set for a present state, ML based measurement model to take advantage of the correlation between structural characteristics of measured samples fabricated before and after one or more intervening process steps.
The methods and systems described herein improve measurement accuracy and robustness for cutting-edge measurement applications involving Gate All Around (GAA) structures, Fork Sheet structures, CFET structures, 3D VNAND structures, 3D DRAM structures, etc. In one example, the methods and systems described herein improve critical dimension measurements of a GAA SRAM structure after a Sheet Formation process step.
In a further embodiment, the metrology system 100 is a measurement system 100 that includes one or more computing systems 116 configured to execute present state measurement tool 150 in accordance with the description provided herein. In the preferred embodiment, present state measurement tool 150 is a set of program instructions 120 stored on a carrier medium 118. The program instructions 120 stored on the carrier medium 118 are read and executed by computing system 116 to realize model based measurement functionality as described herein. The one or more computing systems 116 may be communicatively coupled to the spectrometer 104. In one aspect, the one or more computing systems 116 are configured to receive measurement data 111 associated with a measurement (e.g., critical dimension, film thickness, composition, process, etc.) of the structure 114 of specimen 112. In one example, the measurement data 111 includes an indication of the measured spectral response (e.g., measured intensity as a function of wavelength) of the specimen by measurement system 100 based on the one or more sampling processes from the spectrometer 104. In some embodiments, the one or more computing systems 116 are further configured to determine specimen parameter values of structure 114 from measurement data 111.
In a further aspect, computing system 116 is configured to determine dimensions of a sample by providing present state raw measurement signals, e.g., measurement data 111 collected from a sample in a present state, as input to a trained present state measurement model. The trained present state measurement model is a machine learning based measurement model that determines dimensions of a sample based on measurement data provided as input to the model.
In the embodiment depicted in
In one example, illuminator 102 illuminates an instance of a current version of a semiconductor structure in a present state of a semiconductor process flow with an amount of radiation 106. Spectrometer 104 detects raw measurement signals from the instance of the current version of the semiconductor structure in response to the incident radiation. Computing system 116 provides the measurement signals as input to a trained present state, machine learning based measurement model. The trained present state machine learning based measurement model estimates a value of a parameter of interest characterizing the instance of the current version of the semiconductor structure in the present state. The estimated value of the parameter of interest is an output of the trained, present state, machine learning based measurement model generated in response to the raw measurement signals provided as input.
In some examples, measurements are performed at N different instances of a semiconductor structure fabricated at different locations, different wafers, or both, where N is any positive integer value. Specimen parameters of interest can be deterministic (e.g., CD, SWA, etc.) or statistical (e.g., rms height of sidewall roughness, roughness correlation length, etc.).
In some embodiments, measurement system 100 is further configured to store one or more trained present state measurement models in a memory (e.g., carrier medium 118).
In one aspect, a present state, ML based measurement model is trained using training data associated with measurements of a plurality of instances of a current version of a semiconductor structure in a present state of a semiconductor process flow and training data associated with measurements of a plurality of instances of a historical version of the semiconductor structure in the present state of the semiconductor process flow.
A state of a semiconductor structure indicates the particular state of the semiconductor structure within a semiconductor fabrication process flow. For example, a present state indicates the state of the semiconductor structure after the latest process step applied to the semiconductor structure, and before any subsequent process steps are applied to the semiconductor structure. A prior state indicates the state of the semiconductor structure before the latest process step was applied to the semiconductor structure. A semiconductor fabrication process flow includes many process steps, e.g., over one hundred process steps. As such, for a typical present state of a complex semiconductor structure there are many prior states of the semiconductor structure.
In some examples, a significant amount of validated measurement data is collected from a semiconductor structure in a prior state. In some of these examples, accurate measurements of one or more parameters of interest in a prior state are relatively easy to obtain compared to a present state. In this manner, measurement data associated with a semiconductor structure in a prior state are typically trusted. To the extent that a semiconductor structure in a prior state is reasonably correlated with the semiconductor structure in a present state, prior state measurement data can be employed to improve the accuracy and reliability of a trained present state measurement model.
A version of a semiconductor structure indicates the design version of a semiconductor structure, a process recipe version employed to fabricate the semiconductor structure, or both. For example, a current version of a semiconductor structure is the design revision, process recipe, or both, associated with the semiconductor structure for which a present state measurement model is being trained. A historical version of the semiconductor structure is a different design revision, different process recipe, or both, associated with the semiconductor structure. Typically, a historical version of a semiconductor structure is an earlier design revision, earlier process recipe, or both, for which a significant amount of validated measurement data has been collected. In this manner, measurement data associated with historical versions of a semiconductor structure are typically trusted. To the extent that historical versions of the semiconductor structure are reasonably correlated with a current version of the semiconductor structure, historical measurement data can be employed to improve the accuracy and reliability of a trained present state measurement model.
Measurements of parameters of interest at a present state using a present state measurement model trained using training data associated with both a current version and one or more historical versions of a semiconductor structure are generally more robust and more accurate compared to measurements performed using a present state measurement model trained using only training data associated with a current version of the semiconductor structure.
In some examples, training a present state, ML based measurement model using training data associated with both a current version and one or more historical versions of a semiconductor structure dramatically shortens the time required to generate an accurate and reliable measurement model.
In a further aspect, computing system 116 is configured to train a present state measurement model based at least in part on training data associated with both a current version and one or more historical versions of a semiconductor structure. In the embodiment depicted in
As depicted in
Training data sets 165 and 166 are extracted from sets of paired measurement data derived from a number of different reference data sources as described herein. Each set of paired measurement data includes raw measurement signals, e.g., spectral measurements, diffraction images, etc., associated with measurements of a number of instances of the structure of interest and corresponding values of a parameter of interest characterizing a structural characteristic of the structure of interest. Training data set 165 includes the raw measurement signals associated with each set of paired measurement data, and training data set 166 includes the corresponding values of the parameter of interest associated with each set of paired measurement data included in training set 165.
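The split of paired measurement data into two aligned training sets can be sketched as below. This is an illustrative data layout only: the class `PairedMeasurement`, the function `build_training_sets`, the source labels, and the numeric values are hypothetical and are not defined in the described embodiments.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class PairedMeasurement:
    signals: Tuple[float, ...]  # raw measurement signals, e.g., a spectrum
    poi: float                  # corresponding value of the parameter of interest
    source: str                 # hypothetical provenance label


def build_training_sets(
    *sources: Sequence[PairedMeasurement],
) -> Tuple[List[Tuple[float, ...]], List[float]]:
    """Merge paired data from several sources into two index-aligned sets.

    The first returned list plays the role of the raw-signal set and the
    second plays the role of the parameter-of-interest set; entry k of one
    corresponds to entry k of the other.
    """
    signal_set: List[Tuple[float, ...]] = []
    poi_set: List[float] = []
    for source in sources:
        for pair in source:
            signal_set.append(pair.signals)
            poi_set.append(pair.poi)
    return signal_set, poi_set


# Illustrative values only.
current = [
    PairedMeasurement((0.91, 0.42), 15.2, "current"),
    PairedMeasurement((0.88, 0.47), 15.9, "current"),
]
historical = [PairedMeasurement((0.95, 0.40), 14.8, "historical")]
signal_set, poi_set = build_training_sets(current, historical)
```

Keeping the two lists index-aligned lets a training routine look up the trusted value for any set of raw signals by position.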
In some examples, machine learning module 161 generates estimated values of one or more parameters of interest, POI* 164, based on each set of raw measurement signals comprising the training data set 165. Error evaluation module 162 receives the estimated values of the one or more parameters of interest, POI* 164, generated by machine learning module 161. In addition, error evaluation module 162 receives training data set 166 including the corresponding values of the parameter of interest associated with each set of raw measurement signals included in training data set 165. The values of training set 166 indicate trusted values of the one or more parameters of interest associated with each set of raw measurement signals included in training data set 165. Error evaluation module 162 generates updated values of weighting parameters 163 of the machine learning model 161 undergoing training to minimize differences between the estimated values of the one or more parameters of interest, POI* 164, and the trusted values of the one or more parameters of interest associated with each set of measurement signals. In the next iteration of model training, new estimated values of the one or more parameters of interest, POI* 164, are generated by machine learning module 161 based on the values of the weighting parameters 163 generated in the previous iteration. The training process continues until the differences between the estimated values of the one or more parameters of interest, POI* 164, and the trusted values of the one or more parameters of interest associated with each set of measurement signals are acceptably small. At this point, the trained present state measurement model 168 is stored in a memory, e.g., memory 132.
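The iterative loop described above can be sketched in a few lines. The sketch assumes a linear model trained by batch gradient descent on mean squared error; the description does not specify the model type or the update rule, so these choices, the function names `predict` and `train`, the hyperparameters, and the toy data are all illustrative assumptions.

```python
from typing import List, Sequence, Tuple


def predict(weights: Sequence[float], bias: float,
            signals: Sequence[float]) -> float:
    """Estimate the parameter of interest from one set of raw signals."""
    return sum(w * s for w, s in zip(weights, signals)) + bias


def train(signal_set: Sequence[Sequence[float]], poi_set: Sequence[float],
          learning_rate: float = 0.1, tolerance: float = 1e-8,
          max_iterations: int = 20000) -> Tuple[List[float], float]:
    """Iterate estimate -> error evaluation -> parameter update."""
    n = len(signal_set)
    n_features = len(signal_set[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(max_iterations):
        # Estimation step: generate estimates from the current parameters.
        estimates = [predict(weights, bias, s) for s in signal_set]
        # Error evaluation step: compare estimates with trusted values.
        residuals = [e - t for e, t in zip(estimates, poi_set)]
        mse = sum(r * r for r in residuals) / n
        if mse < tolerance:  # differences are acceptably small
            break
        # Update step: gradient descent on the mean squared error.
        gradients = [
            2.0 / n * sum(r * s[j] for r, s in zip(residuals, signal_set))
            for j in range(n_features)
        ]
        weights = [w - learning_rate * g for w, g in zip(weights, gradients)]
        bias -= learning_rate * 2.0 / n * sum(residuals)
    return weights, bias


# Toy data generated from poi = 2*s0 - 1*s1 + 0.5 (illustrative only).
signal_set = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (0.5, 0.25)]
poi_set = [0.5, 2.5, -0.5, 1.5, 1.25]
trained_weights, trained_bias = train(signal_set, poi_set)
```

The stopping test on `mse` corresponds to the condition that differences between estimated and trusted values become acceptably small; the iteration cap is a safeguard added for this sketch.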
In one aspect, training data set 165 includes raw measurement signals associated with a measurement of each of the plurality of instances of the current version of the semiconductor structure in the present process state, [MEASSPRESENT]DOE, and training data set 166 includes a corresponding measured value of the parameter of interest associated with a reference measurement of each of the plurality of instances of the current version of the semiconductor structure by a reference metrology system, [REFPOIPRESENT]DOE. This measurement data pair, [MEASSPRESENT, REFPOIPRESENT]DOE, is derived from measurements of instances of the structure of interest fabricated on one or more Design Of Experiments (DOE) wafers in the present state. The DOE wafers are typically off-line wafers purposely fabricated with variations in process parameters to probe the expected process space and ensure that the measurement model is trained to reliably perform measurements of structures fabricated within the expected process window during high volume production.
For many process steps of a complex semiconductor structure, reliable, actual reference measurements are only available from low throughput, expensive, and often destructive measurement techniques, e.g., Transmission Electron Microscopy (TEM), Scanning Electron Microscopy (SEM), etc. Thus, in practice, it is not feasible to generate very large reference data sets based on actual reference measurement data generated by trustworthy reference measurement systems for many process steps. In response, historical measurement data are employed to compensate for the scarcity of actual reference measurement data, which are collected at only a limited number of different locations on a limited number of different wafers.
In some embodiments, training data set 165 includes raw measurement signals associated with a measurement of each of the plurality of instances of a historical version of the semiconductor structure in the present process state, [MEASSPRESENT]DOE-H, and training data set 166 includes a corresponding measured value of the parameter of interest associated with a reference measurement of each of the plurality of instances of the historical version of the semiconductor structure by a reference metrology system, [REFPOIPRESENT]DOE-H. This measurement data pair, [MEASSPRESENT, REFPOIPRESENT]DOE-H, is derived from measurements of historical instances of the structure of interest fabricated on one or more Design Of Experiments (DOE) wafers in the present state. The historical DOE wafers (DOE-H) are typically off-line wafers purposely fabricated with variations in process parameters to probe the expected process space during measurement recipe development for a historical version of the structure of interest. Although the design revision, process revision, or both, associated with the historical version is not exactly the same as that of the current version of the structure of interest, this historical training data is available, validated, and correlated with the current version. As such, this historical training data contributes to the training of a present state measurement model that reliably performs measurements of structures fabricated within the expected process window during high volume production.
In some embodiments, training data set 165 includes raw measurement signals associated with a measurement of each of the plurality of instances of a historical version of the semiconductor structure in the present process state, [MEASSPRESENT]PROD-H, and training data set 166 includes a corresponding measured value of the measured parameter of interest associated with each of the plurality of instances of the historical version of the semiconductor structure by a high-throughput, in-line production metrology system, [REFPOIPRESENT]PROD-H. This measurement data pair, [MEASSPRESENT, REFPOIPRESENT]PROD-H, is derived from high-throughput, in-line measurements of historical instances of the structure of interest fabricated on one or more production wafers in the present state.
Typically, measurement data pair, [MEASSPRESENT, REFPOIPRESENT]PROD-H is a very large data set collected using one or more instances of a high-throughput measurement system, e.g., a high-throughput optical critical dimension (OCD) measurement system such as the spectroscopic ellipsometry system depicted in
In some embodiments, training data set 165 includes synthetically generated, raw measurement signals, [SYNSPRESENT]DOE, corresponding to each of a plurality of assumed values of a parameter of interest, [KWNPOIPRESENT]DOE, characterizing the current version of the semiconductor structure in the present process state, and training data set 166 includes the assumed values of the parameter of interest, [KWNPOIPRESENT]DOE. The measurement data pair, [SYNSPRESENT, KWNPOIPRESENT]DOE, is generated by a measurement system model that simulates raw measurement signals generated by a high-throughput measurement system, e.g., spectroscopic ellipsometer 101 depicted in
In some embodiments, training data set 165 includes raw measurement signals associated with a measurement of each of a plurality of instances of the current version of the semiconductor structure in a prior process state, [MEASSPRIOR-I]DOE, and training data set 166 includes a corresponding measured value of the parameter of interest associated with a reference measurement of each of the plurality of instances of the current version of the semiconductor structure by a reference metrology system, [REFPOIPRIOR-I]DOE. This measurement data pair, [MEASSPRIOR-I, REFPOIPRIOR-I]DOE, is derived from measurements of instances of the structure of interest fabricated on one or more Design Of Experiments (DOE) wafers in a prior process state, i. The DOE wafers are typically off-line wafers purposely fabricated with variations in process parameters to probe the expected process space and ensure that the measurement model is trained to reliably perform measurements of structures fabricated within the expected process window during high volume production.
In general, training data sets 165 and 166 include training data associated with measurements and reference measurements, respectively, associated with the current version of the semiconductor structure of interest at any number of prior process steps. Each of the different prior states of the semiconductor process flow and the present state of the semiconductor process flow are separated by one or more intervening semiconductor manufacturing process steps. Furthermore, each of the different prior states of the semiconductor process flow and any other of the different prior states are separated by one or more intervening semiconductor manufacturing process steps.
Although a semiconductor structure of interest at a present state is not identically formed at a prior state, the prior state training data is available, validated, and correlated with the current version. As such, the prior state training data contributes to the training of a present state measurement model that reliably performs measurements of structures fabricated within the expected process window during high volume production.
In some embodiments, training data set 165 includes synthetically generated, raw measurement signals, [SYNSPRIOR-I]DOE, corresponding to each of a plurality of assumed values of a parameter of interest, [KWNPOIPRIOR-I]DOE, characterizing the current version of the semiconductor structure in a prior process state, and training data set 166 includes the assumed values of the parameter of interest, [KWNPOIPRIOR-I]DOE. The measurement data pair, [SYNSPRIOR-I, KWNPOIPRIOR-I]DOE, is generated by a measurement system model that simulates raw measurement signals generated by a high-throughput measurement system, e.g., spectroscopic ellipsometer 101 depicted in
In some embodiments, training data set 165 includes raw measurement signals associated with a measurement of each of a plurality of instances of a historical version of the semiconductor structure in a prior process state, [MEASSPRIOR-I]DOE-H, and training data set 166 includes a corresponding measured value of the parameter of interest associated with a reference measurement of each of the plurality of instances of the historical version of the semiconductor structure in the prior process state by a reference metrology system, [REFPOIPRIOR-I]DOE-H. This measurement data pair, [MEASSPRIOR-I, REFPOIPRIOR-I]DOE-H, is derived from measurements of historical instances of the structure of interest fabricated on one or more historical Design Of Experiments (DOE-H) wafers in the prior state. The DOE-H wafers are typically off-line wafers purposely fabricated with variations in process parameters to probe the expected process space during measurement recipe development for a historical version of the structure of interest in the prior state. Although the design revision, process revision, or both, associated with the historical version is not exactly the same as that of the current version of the structure of interest, and a semiconductor structure of interest at a present state is not identically formed at a prior state, the historical, prior state training data is available, validated, and correlated with the current version. As such, the historical, prior state training data contributes to the training of a present state measurement model that reliably performs measurements of structures fabricated within the expected process window during high volume production.
In some embodiments, training data set 165 includes raw measurement signals associated with a measurement of each of the plurality of instances of a historical version of the semiconductor structure in a prior process state, [MEASSPRIOR-I]PROD-H, and training data set 166 includes a corresponding measured value of the parameter of interest associated with each of the plurality of instances of the historical version of the semiconductor structure by a high-throughput, in-line production metrology system, [REFPOIPRIOR-I]PROD-H. This measurement data pair, [MEASSPRIOR-I, REFPOIPRIOR-I]PROD-H, is derived from high-throughput, in-line measurements of historical instances of the structure of interest fabricated on one or more production wafers in the ith prior state, where i is any non-zero positive integer number bounded by the total number of prior process states before the present process state in the semiconductor fabrication process flow.
Typically, measurement data pair, [MEASSPRIOR-I, REFPOIPRIOR-I]PROD-H is a very large data set collected using one or more instances of a high-throughput measurement system, e.g., a high-throughput optical critical dimension (OCD) measurement system such as the spectroscopic ellipsometry system depicted in
In general, training data sets 165 and 166 include training data associated with historical, prior state measurements at any number of prior process steps. Each of the different prior states of the semiconductor process flow and the present state of the semiconductor process flow are separated by one or more intervening semiconductor manufacturing process steps. Furthermore, each of the different prior states of the semiconductor process flow and any other of the different prior states are separated by one or more intervening semiconductor manufacturing process steps.
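By way of non-limiting illustration, the different sources of training data described above may be organized as tagged records before being combined into training data sets 165 and 166. The record type `TrainingPair`, its field names, the label format, and the function `assemble` are hypothetical and are introduced only to make the bookkeeping concrete.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class TrainingPair:
    signals: Tuple[float, ...]  # raw or synthetic measurement signals (toward data set 165)
    poi: float                  # reference, in-line, or assumed POI value (toward data set 166)
    version: str                # "current" or "historical"
    state: str                  # "present" or "prior-<i>", e.g., "prior-1"
    origin: str                 # "DOE", "PROD", or "SYN" (synthetically generated)

    @property
    def label(self) -> str:
        """Short label such as 'present/DOE' or 'prior-1/DOE-H' (hypothetical format)."""
        suffix = "-H" if self.version == "historical" else ""
        return f"{self.state}/{self.origin}{suffix}"


def assemble(pairs: List[TrainingPair]):
    """Split tagged records into signal and POI lists and count the records per source."""
    data_set_165 = [p.signals for p in pairs]
    data_set_166 = [p.poi for p in pairs]
    counts = Counter(p.label for p in pairs)
    return data_set_165, data_set_166, counts
```

Retaining the source label with each record allows a later step, e.g., a weighting step, to treat records differently depending on their version, process state, and origin.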
In some examples, training data continues to be generated based on reliable, high-throughput, in-line measurements of a continuously growing number of instances of a structure of interest at a prior state, e.g., [MEASSPRIOR-I, REFPOIPRIOR-I]PROD-H, along with corresponding high-throughput, in-line measurement signals in the present state, e.g., [MEASSPRESENT, REFPOIPRESENT]PROD-H. In these examples, prior state and present state measurements continue to be collected from in-line, production wafers. As described hereinbefore, the prior state and present state measurements are communicated to present state measurement model training engine 160. Periodically, the expanded set of training data is employed to retrain the present state measurement model to continuously improve the accuracy and reliability of the trained present state measurement model as production continues.
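By way of non-limiting illustration, periodic retraining on a growing set of in-line training data may be sketched as follows. The class `RetrainingScheduler`, the count-based retraining trigger, and the `train_fn` callback are assumptions made for illustration; the described embodiments do not specify how the retraining period is determined.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Pair = Tuple[Tuple[float, ...], float]  # (raw measurement signals, POI value)


class RetrainingScheduler:
    """Accumulates training pairs and retrains after every `retrain_every` new pairs."""

    def __init__(self, train_fn: Callable[[List[Pair]], object], retrain_every: int):
        if retrain_every < 1:
            raise ValueError("retrain_every must be at least 1")
        self._train_fn = train_fn          # hypothetical callback that returns a model
        self._retrain_every = retrain_every
        self._pairs: List[Pair] = []
        self._since_last = 0
        self.model: Optional[object] = None
        self.retrain_count = 0

    def add(self, signals: Sequence[float], poi: float) -> bool:
        """Store one new pair; return True if this call triggered a retraining."""
        self._pairs.append((tuple(signals), poi))
        self._since_last += 1
        if self._since_last < self._retrain_every:
            return False
        # Retrain on the full expanded data set, not only on the newest pairs.
        self.model = self._train_fn(list(self._pairs))
        self.retrain_count += 1
        self._since_last = 0
        return True
```

A time-based or drift-based trigger may be substituted for the count-based trigger shown here.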
In a further aspect, present state measurement model training engine 160 includes a weighting module (not shown) that assigns different weighting values to different sets of training data, e.g., any of the different set of training data depicted in
In the example, depicted in
In other examples, training data sets including actual reference measurement data associated with measurements of the current version in the present state are weighted higher than training data sets including synthetically generated data, historical data, prior state data, and historical, prior state data.
In general, any relative weighting among the training data sets may be contemplated within the scope of this patent document.
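By way of non-limiting illustration, one way to apply different weighting values to different sets of training data is a weighted mean squared error, in which each training sample contributes to the training error in proportion to the weight assigned to its source. The function `weighted_mse`, the source labels, and the weight values below are hypothetical.

```python
from typing import Dict, Sequence


def weighted_mse(
    estimates: Sequence[float],
    trusted: Sequence[float],
    labels: Sequence[str],
    label_weights: Dict[str, float],
) -> float:
    """Mean squared error in which each sample is weighted by its training data source."""
    if not (len(estimates) == len(trusted) == len(labels)):
        raise ValueError("estimates, trusted, and labels must have equal length")
    weights = [label_weights[label] for label in labels]
    total = sum(weights)
    if total <= 0.0:
        raise ValueError("the sum of the sample weights must be positive")
    squared = sum(w * (e - t) ** 2 for w, e, t in zip(weights, estimates, trusted))
    return squared / total


# Example: actual reference data weighted three times higher than synthetic data.
example_error = weighted_mse(
    estimates=[1.0, 2.0],
    trusted=[0.0, 0.0],
    labels=["reference", "synthetic"],
    label_weights={"reference": 3.0, "synthetic": 1.0},
)
```

In this example the weighted error is (3·1 + 1·4)/4 = 1.75, whereas the unweighted mean squared error would be 2.5, so errors on the more trusted source have the larger influence on training.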
In general, there must be a process correlation between historical and prior state training data and the current version in a present state. This correlation must be captured by the high-throughput measurement signals associated with measurements of different versions, at different states, or both. In this manner, process correlation between steps enables improved solution accuracy and robustness.
The shape of a nanowire or nanosheet affects device performance significantly. Furthermore, the shape of a nanowire or nanosheet can change from one process step to another. In one example, sacrificial Silicon Germanium sheets are removed at the nanosheet release step. In addition to the removal of the SiGe sheets, part of the interspaced Silicon layers is consumed by the Silicon Germanium removal process, and as a result, the height of the Silicon nanosheets is also reduced during the release step. This can lead to height non-uniformity among Silicon nanosheets. In addition, the underlying Silicon nanosheets are affected by various etch recipes prior to the nanosheet release step.
The critical dimensions and features depicted in
In one example, process correlation from a prior state to a present state exists for inner spacer width between an epitaxial growth step employed to form the SiGe source and drain structures 191 and 192 (prior state) and the subsequent SiGe release step (present state).
In another example, process correlation from a prior state to a present state exists for the dimension MGCD1 depicted in
In another example, process correlation from a prior state to a present state exists for the metal gate critical dimension, MGCD2, depicted in
In another example, process correlation from a prior state to a present state exists for the metal gate critical dimension, MGCD3, depicted in
In general, present state and historical training data are employed to generate measurement models of film structures and critical dimension structures, including, but not limited to, complex memory structures, e.g., 3D flash memory, DRAM cells, multi-deck VNAND structures, etc., and complex logic structures, e.g., Gate-All-Around structures, forksheet structures, complementary field effect transistor (CFET) structures, etc.
The flexibility of building measurement models using historical training data enables rapid measurement recipe generation based on relatively large data sets, including data sets acquired during production. This allows fabrication facilities to quickly refine process conditions to improve yield of FinFET structures, Gate-All-Around structures, DRAM and VNAND memory structures, etc. In many cases, the only other available alternatives to measurement of these high aspect ratio structures are destructive techniques, such as Focused Ion Beam microscopy, and extremely low throughput techniques, such as Transmission Electron Microscopy (TEM).
It should be recognized that the various steps described throughout the present disclosure may be carried out by a single computer system 116, or, alternatively, multiple computer systems 116. Moreover, different subsystems of system 100, such as the spectroscopic ellipsometer 101, may include a computer system suitable for carrying out at least a portion of the steps described herein. Therefore, the aforementioned description should not be interpreted as a limitation on the present invention but merely an illustration. Further, the one or more computing systems 116 may be configured to perform any other step(s) of any of the method embodiments described herein.
The computing system 116 may include, but is not limited to, a personal computer system, mainframe computer system, cloud-based computing system, workstation, image computer, parallel processor, or any other device known in the art. In general, the term “computing system” may be broadly defined to encompass any device having one or more processors, which execute instructions from a memory medium. In general, computing system 116 may be integrated with a measurement system such as measurement system 100, or alternatively, may be separate from any measurement system. In this sense, computing system 116 may be remotely located and receive measurement data from any measurement source.
Program instructions 120 implementing methods such as those described herein may be transmitted over or stored on carrier medium 118. The carrier medium may be a transmission medium such as a wire, cable, or wireless transmission link. The carrier medium may also include a computer-readable medium such as a read-only memory, a random access memory, a magnetic or optical disk, or a magnetic tape.
Although the methods discussed herein are explained with reference to system 100, any optical or x-ray metrology system configured to illuminate and detect light reflected, transmitted, or diffracted from a specimen may be employed to implement the exemplary methods described herein. Exemplary systems include an angle-resolved reflectometer, a scatterometer, a reflectometer, an ellipsometer, a spectroscopic reflectometer or ellipsometer, a beam profile reflectometer, a multi-wavelength, two-dimensional beam profile reflectometer, a multi-wavelength, two-dimensional beam profile ellipsometer, a rotating compensator spectroscopic ellipsometer, an optically modulated reflectometer, an optically modulated ellipsometer, a transmissive x-ray scatterometer, a reflective x-ray scatterometer, etc. By way of non-limiting example, an ellipsometer may include a single rotating compensator, multiple rotating compensators, a rotating polarizer, a rotating analyzer, a modulating element, multiple modulating elements, or no modulating element.
It is noted that the output from a measurement system may be configured in such a way that the measurement system uses more than one technology. In fact, an application may be configured to employ any combination of available metrology sub-systems within a single tool, or across a number of different tools.
A system implementing the methods described herein may also be configured in a number of different ways. For example, a wide range of wavelengths (including visible, ultraviolet, infrared, and X-ray), angles of incidence, states of polarization, and states of coherence may be contemplated. In another example, the system may include any of a number of different light sources (e.g., a directly coupled light source, a laser-sustained plasma light source, etc.). In another example, the system may include elements to condition light directed to or collected from the specimen (e.g., apodizers, filters, etc.).
In block 301, a first instance of a current version of a semiconductor structure is illuminated with an amount of radiation. The first instance of the current version of the semiconductor structure is disposed on a current version production semiconductor wafer in a present state of a semiconductor process flow.
In block 302, a first amount of raw measurement data is detected from the first instance of the current version of the semiconductor structure in response to the amount of radiation.
In block 303, the first amount of raw measurement data is provided as input to a trained present state, machine learning based measurement model.
In block 304, a value of a parameter of interest characterizing the first instance of the current version of the semiconductor structure in the present state is estimated. The estimated value of the parameter of interest is an output of the trained, present state, machine learning based measurement model generated in response to the first amount of raw measurement data provided as input. The trained present state, machine learning based measurement model is trained based at least in part on a first amount of training data associated with measurements of a plurality of instances of the current version of the semiconductor structure in the present state of the semiconductor process flow and a second amount of training data associated with measurements of a plurality of instances of a historical version of the semiconductor structure in the present state of the semiconductor process flow.
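By way of non-limiting illustration, blocks 303 and 304 may be sketched as follows, assuming that the raw measurement data detected in block 302 is already available as a sequence of numbers. The `PresentStateModel` interface, the `LinearModel` stand-in, and the function `estimate_poi` are hypothetical; any trained machine learning based measurement model exposing a comparable prediction call may be substituted.

```python
from typing import Protocol, Sequence


class PresentStateModel(Protocol):
    """Hypothetical interface for a trained present state measurement model."""

    def predict(self, raw_signals: Sequence[float]) -> float: ...


class LinearModel:
    """Stand-in trained model: a weighted sum of the raw signals plus a bias."""

    def __init__(self, weights: Sequence[float], bias: float) -> None:
        self._weights = list(weights)
        self._bias = bias

    def predict(self, raw_signals: Sequence[float]) -> float:
        if len(raw_signals) != len(self._weights):
            raise ValueError("unexpected number of raw measurement signals")
        return sum(w * s for w, s in zip(self._weights, raw_signals)) + self._bias


def estimate_poi(model: PresentStateModel, raw_signals: Sequence[float]) -> float:
    """Blocks 303 and 304: provide raw data as model input and return the estimated POI."""
    if len(raw_signals) == 0:
        raise ValueError("no raw measurement data was provided")
    return model.predict(raw_signals)
```

For example, `estimate_poi(LinearModel([2.0, -1.0], 0.5), [1.0, 1.0])` evaluates 2.0 - 1.0 + 0.5 and returns 1.5.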
As described herein, the term “critical dimension” includes any critical dimension of a structure (e.g., bottom critical dimension, middle critical dimension, top critical dimension, sidewall angle, grating height, etc.), a critical dimension between any two or more structures (e.g., distance between two structures), a displacement between two or more structures (e.g., overlay displacement between overlaying grating structures, etc.), and a dispersion property value of a material used in the structure or part of the structure. Structures may include three dimensional structures, patterned structures, overlay structures, etc.
As described herein, the term “critical dimension application” or “critical dimension measurement application” includes any critical dimension measurement.
As described herein, the term “metrology system” includes any system employed at least in part to characterize a specimen in any aspect. However, such terms of art do not limit the scope of the term “metrology system” as described herein. In addition, the metrology system 100 may be configured for measurement of patterned wafers and/or unpatterned wafers. The metrology system may be configured as an LED inspection tool, edge inspection tool, backside inspection tool, macro-inspection tool, or multi-mode inspection tool (involving data from one or more platforms simultaneously), or any other metrology or inspection tool that benefits from the calibration of system parameters based on critical dimension data.
Various embodiments are described herein for a semiconductor processing system (e.g., an inspection system or a lithography system) that may be used for processing a specimen. The term “specimen” is used herein to refer to a site, or sites, on a wafer, a reticle, or any other sample that may be processed (e.g., printed or inspected for defects) by means known in the art. In some examples, the specimen includes a single site having one or more measurement targets whose simultaneous, combined measurement is treated as a single specimen measurement or reference measurement. In some other examples, the specimen is an aggregation of sites where the measurement data associated with the aggregated measurement site is a statistical aggregation of data associated with each of the multiple sites. Moreover, each of these multiple sites may include one or more measurement targets associated with a specimen or reference measurement.
As used herein, the term “wafer” generally refers to substrates formed of a semiconductor or non-semiconductor material. Examples include, but are not limited to, monocrystalline silicon, gallium arsenide, and indium phosphide. Such substrates may be commonly found and/or processed in semiconductor fabrication facilities. In some cases, a wafer may include only the substrate (i.e., bare wafer). Alternatively, a wafer may include one or more layers of different materials formed upon a substrate. One or more layers formed on a wafer may be “patterned” or “unpatterned.” For example, a wafer may include a plurality of dies having repeatable pattern features.
A “reticle” may be a reticle at any stage of a reticle fabrication process, or a completed reticle that may or may not be released for use in a semiconductor fabrication facility. A reticle, or a “mask,” is generally defined as a substantially transparent substrate having substantially opaque regions formed thereon and configured in a pattern. The substrate may include, for example, a glass material such as amorphous SiO2. A reticle may be disposed above a resist-covered wafer during an exposure step of a lithography process such that the pattern on the reticle may be transferred to the resist.
One or more layers formed on a wafer may be patterned or unpatterned. For example, a wafer may include a plurality of dies, each having repeatable pattern features. Formation and processing of such layers of material may ultimately result in completed devices. Many different types of devices may be formed on a wafer, and the term wafer as used herein is intended to encompass a wafer on which any type of device known in the art is being fabricated.
In one or more exemplary embodiments, the functions described may be implemented in hardware, software, firmware, or any combination thereof. If implemented in software, the functions may be stored on or transmitted over as one or more instructions or code on a computer-readable medium. Computer-readable media include both computer storage media and communication media including any medium that facilitates transfer of a computer program from one place to another. A storage medium may be any available medium that can be accessed by a general purpose or special purpose computer. By way of example, and not limitation, such computer-readable media can comprise RAM, ROM, EEPROM, CD-ROM or other optical disk storage, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to carry or store desired program code means in the form of instructions or data structures and that can be accessed by a general-purpose or special-purpose computer, or a general-purpose or special-purpose processor. Also, any connection is properly termed a computer-readable medium. For example, if the software is transmitted from a website, server, or other remote source using a coaxial cable, fiber optic cable, twisted pair, digital subscriber line (DSL), or wireless technologies such as infrared, radio, and microwave, then the coaxial cable, fiber optic cable, twisted pair, DSL, or wireless technologies such as infrared, radio, and microwave are included in the definition of medium. Disk and disc, as used herein, include compact disc (CD), laser disc, optical disc, digital versatile disc (DVD), floppy disk and Blu-ray disc, where disks usually reproduce data magnetically, while discs reproduce data optically with lasers. Combinations of the above should also be included within the scope of computer-readable media.
Although certain specific embodiments are described above for instructional purposes, the teachings of this patent document have general applicability and are not limited to the specific embodiments described above. Accordingly, various modifications, adaptations, and combinations of various features of the described embodiments can be practiced without departing from the scope of the invention as set forth in the claims.
Number | Date | Country
---|---|---
63622592 | Jan 2024 | US