This relates generally to semiconductor manufacturing operations and, particularly, to tools for the analysis of data from semiconductor processing operations.
In semiconductor processing operations, the components may be semiconductor wafers which, in turn, may be separated into semiconductor dice. Multiple wafers may be processed together as a group of wafers, called a lot. Collections of lots may be called batches. The wafers may be processed through entities which are basically semiconductor processing stations. Examples of such entities may be ion implantation equipment, deposition chambers, etching chambers, and the like.
A wide variety of software applications are used to operate a semiconductor manufacturing facility. In many cases, these applications are incompatible, for example, because they were all developed by different sources. If it is desired to correlate data from different applications, the data may be in different formats, making such correlation difficult.
The need for rapid correlation and collection of data arises because when the semiconductor manufacturing facility is not operating, it is necessary to quickly determine the cause of the problem and to restart the operation. This is because downtime may be extremely expensive, with a large number of wafers sitting idle, while production resources are not being used but, nonetheless, incur costs.
Thus, when something goes wrong, which may be called an excursion, a large amount of money is at stake. Analysis of an excursion may require getting all the manufacturing and test data and understanding correlations between that data. Key questions may include where each lot, wafer, or die was processed and what data is associated with such processing, which may be called lot level traceability, wafer level traceability, and die level traceability, respectively.
Conventionally, lot level analysis has limited effectiveness because wafers are often split out from the parent lot during manufacturing and test operations. Thus, a lot identifier that identifies a lot may change over the course of processing. In addition, in test operations, especially after the wafer is cut into dice, the physical meaning of the lot or wafer changes fundamentally, which makes tracking by lot or wafer especially challenging.
In many cases, there is a collection of automation systems that have inconsistent data domain formats. Thus, different data domains store data in different databases with different formats. Storage of data by various automation systems may have online transaction processing use cases, as well as offline analytical processing use case requirements. Online transaction processing use cases generally take priority over offline analytical processing.
Information from a variety of automated systems within a semiconductor manufacturing facility may be joined in a way that enables automatic extraction of data from across those systems. This data extraction may be particularly useful in connection with the analysis of excursions, but may be useful in other cases as well. In some cases, a unified system is provided which enables extraction at various levels within the manufacturing facility. These levels may include the lot, the batch, the wafer, and the die levels to varying degrees of specificity.
As shown in
The semiconductor manufacturing fabrication and test data of the model may be created by breaking the various sources of data into data domains and then determining how to link those domains together. The number of data domains may expand over time, but may include at least the following data domain models. Statistical process control data primarily consists of metrology data in a matrix plotted on control charts and used to monitor the process. Run to run control data consists of metrology data, often from sources other than the statistical process control data, and recommended or actual setting data. Fault detection and classification data consists primarily of trace data collected while process equipment is running and information on faults, such as detection alarms, classification information, and the like. Work-in-process history contains information on the flow of material through various processing and metrology operations, entities, and sub-entities. Entity attributes are typically information on the state of a tool at a given time. The attributes may be developed by counters which indicate, for example, the number of wafers processed since the last preventive maintenance. Preventive maintenance data contains details on the maintenance activities, such as the type and identifier of the part changed. The yield data gives inline and end-of-line yields and defect information. Bin test results have binning information that describes whether a die works and, if not, the type of failure. In addition, there may be various parametric test data. Electrical test data has results from various electrical tests, such as current-voltage, capacitance-voltage, and the like. Facilities data includes various environmental data, such as temperature, pressure, and the like. Assembly test data includes information on testing done after the die is packaged.
After defining domains, such as those described above, the next step in defining the data model may be to understand the levels of the data, as well as the query use cases. The model may have wafer, lot, die, and entity level data. For each level, an identifier needs to be unique or needs to be made to be unique if combined with other keys. This uniqueness may necessitate source database modifications to provide the keys in a fashion amenable to joining the data from the different applications. Additionally, other items that may be useful for joining data across databases may require database modifications.
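The uniqueness requirement described above can be checked mechanically before any join is attempted. The following is a minimal sketch, assuming rows are available as dictionaries; the function name `is_unique` and the column names `lot_id` and `wafer_no` are hypothetical and chosen only for illustration.

```python
from collections import Counter


def is_unique(rows, key_columns):
    """Return True if no two rows share the same values for key_columns."""
    counts = Counter(tuple(row[c] for c in key_columns) for row in rows)
    return all(n == 1 for n in counts.values())


# Hypothetical rows: short wafer numbers repeat from lot to lot.
rows = [
    {"lot_id": "LOT_A", "wafer_no": "001"},
    {"lot_id": "LOT_A", "wafer_no": "002"},
    {"lot_id": "LOT_B", "wafer_no": "001"},
]

# The wafer number alone is not unique, but it becomes unique
# when combined with the lot identifier.
wafer_alone_unique = is_unique(rows, ["wafer_no"])
wafer_with_lot_unique = is_unique(rows, ["lot_id", "wafer_no"])
```

A check of this kind may indicate which source databases need an additional key column before their data can be joined with other domains.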
The run level identifier differs from the conventional run identifier in that a new conventional run identifier is assigned for each pass through re-work or re-measure, whereas the run level identifier stays the same regardless of re-works or re-measures, so that the data for the overall run at a tool may be correlated. A controller may generate a run level identifier which is then distributed to the various domains.
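The distinction between the two identifiers can be illustrated with a small sketch of a controller. This is an illustration under stated assumptions and not the controller of any particular embodiment: the class name `RunLevelController`, the identifier format, and the choice of (lot, operation, entity) as the key that defines an overall run are all hypothetical.

```python
import itertools


class RunLevelController:
    """Hands out identifiers for each pass of a lot through a tool."""

    def __init__(self):
        self._sequence = itertools.count(1)
        self._run_level_ids = {}  # (lot, operation, entity) -> run level id
        self._pass_counts = {}    # (lot, operation, entity) -> passes so far

    def record_pass(self, lot_id, operation, entity):
        key = (lot_id, operation, entity)
        # Assumption: the run level identifier is created on the first pass
        # and reused for every re-work or re-measure of the same run.
        if key not in self._run_level_ids:
            self._run_level_ids[key] = f"RL{next(self._sequence):06d}"
        # The conventional run identifier changes on every pass.
        self._pass_counts[key] = self._pass_counts.get(key, 0) + 1
        return {
            "run_level_id": self._run_level_ids[key],
            "run_id": self._pass_counts[key],
        }


controller = RunLevelController()
first_pass = controller.record_pass("LOT_A", "ETCH_1", "ETCHER_07")
rework = controller.record_pass("LOT_A", "ETCH_1", "ETCHER_07")
other_lot = controller.record_pass("LOT_B", "ETCH_1", "ETCHER_07")
```

In this sketch, the original pass and the re-work share one run level identifier while their conventional run identifiers differ, which is what allows data for the overall run to be grouped.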
A process batch identifier may also be assigned. Conventional batch identifiers are re-assigned when a batch goes from a process tool to a metrology tool. If there are a number of metrology steps, a large number of hard to reconcile batch numbers may be assigned. The process batch identifier solves this problem because it is only reassigned when the batch is processed at a process tool.
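The reassignment rule for the process batch identifier may be sketched as follows. The function name, the tool names, the tool type labels, and the identifier format are assumptions made for illustration only.

```python
def assign_process_batch_ids(steps):
    """Assign a process batch identifier to each step of a batch's route.

    steps is an ordered list of (tool_name, tool_type) pairs, where
    tool_type is either "process" or "metrology" (hypothetical labels).
    """
    assigned = []
    counter = 0
    for tool_name, tool_type in steps:
        # A new identifier is issued only at a process tool; metrology
        # steps inherit the identifier of the preceding process step.
        if tool_type == "process":
            counter += 1
        assigned.append((tool_name, f"PB{counter:04d}"))
    return assigned


route = [
    ("DEP_01", "process"),
    ("THICKNESS_01", "metrology"),
    ("CD_SEM_01", "metrology"),
    ("ETCH_01", "process"),
]
batch_ids = assign_process_batch_ids(route)
```

Here the deposition step and the two metrology steps that follow it carry the same identifier, so metrology results can be tied back to the process step that produced them.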
A unique wafer identifier may also be assigned to each wafer. For example, a twelve digit identifier may be used so that the wafer number is not duplicated over any time period of interest, as can happen with conventional three digit wafer identifiers. Also, this unique wafer identifier allows tracing of a wafer through split lots.
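Tracing a wafer through a split lot may be sketched as a filter on the unique wafer identifier. The history records, the lot naming for the split (`LOT_A.1`), and the function name `trace_wafer` are hypothetical.

```python
# Hypothetical work-in-process history. Wafer ...102 is split out of
# LOT_A into a child lot before the etch operation.
history = [
    {"wafer_id": "000000000101", "lot_id": "LOT_A", "operation": "DEP"},
    {"wafer_id": "000000000102", "lot_id": "LOT_A", "operation": "DEP"},
    {"wafer_id": "000000000102", "lot_id": "LOT_A.1", "operation": "ETCH"},
    {"wafer_id": "000000000101", "lot_id": "LOT_A", "operation": "ETCH"},
]


def trace_wafer(records, wafer_id):
    """Return (lot, operation) pairs for one wafer, in recorded order."""
    return [
        (r["lot_id"], r["operation"])
        for r in records
        if r["wafer_id"] == wafer_id
    ]


split_wafer_trace = trace_wafer(history, "000000000102")
```

Because the filter uses the wafer identifier rather than the lot identifier, the trace follows the wafer even though its lot identifier changed at the split.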
The strategy for querying and joining each of the key data domains from the level perspective may include the following. At the batch level, the input specification for the query may include a list of process batch identifiers of interest, a list of operations of interest, and a list of domain specific parameters of interest. The join strategy may be the unique batch identifier and the lot level run identifier. At the lot level, the input specification for the query may be the lot identifier list, operation list, and domain specific parameters. The join strategy may include a unique lot identifier and a lot level run identifier. At the wafer level, the input specification of the query includes the wafer identifier list, operation list, and domain specific parameters. The join strategy is unique wafer identifiers and lot level run identifiers.
At the die level, the input specification for the query may be the lot or wafer identifier list, optional die identifier list, operation list, and domain specific parameters. The join strategy may include providing unique lot identifiers and lot level run identifiers. At the entity level, the input specification for the query may not be used. The join strategy is a nearest in time query between the transaction date and time of the entity level transaction and the transaction date and time of the lot, wafer, or die level transaction in question.
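The per-level input specifications and join strategies described above can be captured as configuration. The following is a minimal sketch; the class name `LevelStrategy`, the field names, and the helper `missing_inputs` are hypothetical and serve only to show one way such a table might be organized.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LevelStrategy:
    required_inputs: tuple
    optional_inputs: tuple
    join_keys: tuple


# One entry per level, following the text. All names are illustrative.
LEVEL_STRATEGIES = {
    "batch": LevelStrategy(
        ("process_batch_ids", "operations", "parameters"), (),
        ("batch_id", "lot_run_level_id")),
    "lot": LevelStrategy(
        ("lot_ids", "operations", "parameters"), (),
        ("lot_id", "lot_run_level_id")),
    "wafer": LevelStrategy(
        ("wafer_ids", "operations", "parameters"), (),
        ("wafer_id", "lot_run_level_id")),
    "die": LevelStrategy(
        ("lot_or_wafer_ids", "operations", "parameters"), ("die_ids",),
        ("lot_id", "lot_run_level_id")),
    # The entity level takes no input specification and is joined
    # by the nearest in time strategy rather than by identifier.
    "entity": LevelStrategy((), (), ("nearest_in_time",)),
}


def missing_inputs(level, query_spec):
    """List the required inputs that a query specification lacks."""
    strategy = LEVEL_STRATEGIES[level]
    return [name for name in strategy.required_inputs if name not in query_spec]


incomplete = missing_inputs(
    "wafer", {"wafer_ids": ["000000000101"], "operations": ["ETCH"]})
```

Keeping the strategies in a table of this kind means a new level or domain can be added by configuration rather than by changing query code.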
The various data sources may need to be modified in some cases. Generally, full wafer identifiers are provided pursuant to specification and may be scribed on the wafer. Identifiers for batch, lot, entity, and die can be made unique by convention, but they vary by semiconductor manufacturer. All of these identifiers need to be consistent. The run identifier is a unique identifier when paired with a particular run of a lot through an operation. However, the run identifier is reset each time a lot is introduced at a tool. The run identifier is useful for distinguishing re-works and re-measures from original passes of the lot through the tool. It may be used across databases to join these kinds of results.
Thus, referring to
For each domain, a join strategy is developed as indicated in block 26. Examples of potential join strategies include nearest in time join, lot-wafer-die join, run level identifier, process batch identifier, or wafer number join. The nearest in time join looks at an entity attribute or preventive maintenance. It asks when the last entity attribute change or preventive maintenance was done on a tool. This is then tied back to an appropriate item such as a lot, wafer, or die. It then looks at what lots ran under the changed condition. It may use SQL software to find the last value communicated, looking for that last value in a time window of a given duration. In a lot-wafer-die join, the identifiers for the lot, wafer, and die are tied together and used to join the data. Finally, a process batch identifier, a run level identifier, and/or a wafer identifier may be utilized to join the data across the different domains. As a result, the data model (
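The nearest in time join described above may be sketched in a few lines. The text mentions SQL software for this purpose; the sketch below instead uses a binary search over in-memory events, which is a substitution made for brevity. The function name, the event labels, and the timestamps are hypothetical.

```python
from bisect import bisect_right
from datetime import datetime, timedelta


def last_event_before(events, when, window):
    """Return the value of the latest event at or before `when`.

    events must be a list of (timestamp, value) pairs sorted by timestamp.
    Returns None if no event falls within `window` before `when`.
    """
    timestamps = [t for t, _ in events]
    i = bisect_right(timestamps, when)
    if i == 0:
        return None  # nothing happened on the tool before this transaction
    event_time, value = events[i - 1]
    if when - event_time > window:
        return None  # the last event is too old to be considered
    return value


# Hypothetical preventive maintenance events on one tool.
pm_events = [
    (datetime(2024, 1, 1, 8, 0), "PM-1"),
    (datetime(2024, 1, 3, 8, 0), "PM-2"),
]

# Lot transactions are tied to the maintenance state they ran under.
lot_a_state = last_event_before(
    pm_events, datetime(2024, 1, 2, 10, 0), timedelta(days=7))
lot_b_state = last_event_before(
    pm_events, datetime(2024, 1, 3, 9, 0), timedelta(days=7))
```

Applying the same lookup to every lot transaction on the tool identifies which lots ran under a given maintenance state.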
Once the data model is developed, the data may be manipulated (block 18) and integrated (block 20).
Referring to
The software automatically executes the cross-domain joined query by collecting the configured query information for the specified domain queries and by collecting the configured cross-domain join conditions. Instead of simply creating one giant database, the required information is extracted from a number of different domains or databases.
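Extracting from separate domains and joining the results, rather than loading everything into one database, may be sketched as follows. The two in-memory lists stand in for separate domain databases, and all names, column labels, and values are hypothetical.

```python
def query_domain(rows, wanted_ids, id_column):
    """Stand-in for a domain-specific query: keep rows for wanted ids."""
    return [r for r in rows if r[id_column] in wanted_ids]


def join_on(left, right, keys):
    """Inner-join two lists of dict rows on the configured key columns."""
    index = {}
    for r in right:
        index.setdefault(tuple(r[k] for k in keys), []).append(r)
    joined = []
    for l in left:
        for r in index.get(tuple(l[k] for k in keys), []):
            joined.append({**l, **r})
    return joined


# Two hypothetical domains, each held in its own data source.
wip_history = [
    {"wafer_id": "000000000101", "run_level_id": "RL000001", "entity": "ETCHER_07"},
    {"wafer_id": "000000000102", "run_level_id": "RL000001", "entity": "ETCHER_07"},
    {"wafer_id": "000000000103", "run_level_id": "RL000002", "entity": "ETCHER_09"},
]
spc_data = [
    {"wafer_id": "000000000101", "run_level_id": "RL000001", "cd_nm": 45.2},
    {"wafer_id": "000000000103", "run_level_id": "RL000002", "cd_nm": 44.8},
]

wanted = {"000000000101", "000000000102"}
result = join_on(
    query_domain(wip_history, wanted, "wafer_id"),
    query_domain(spc_data, wanted, "wafer_id"),
    ["wafer_id", "run_level_id"],
)
```

Each domain is queried only for the identifiers of interest, and only those partial results are combined on the configured join keys.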
Referring to
The next tier is the server tier 42. It includes a data join cluster 60 with a database gateway that is a data join solution for query optimization over multiple databases.
Tier 3 includes the data source 44 domains. These may include, for example, the domains for sort and electrical test data, a database for persisting metadata configuration information, a data source for work in process, a data source for advanced process control, and other process control data as already outlined.
References throughout this specification to “one embodiment” or “an embodiment” mean that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one implementation encompassed within the present invention. Thus, appearances of the phrase “one embodiment” or “in an embodiment” are not necessarily referring to the same embodiment. Furthermore, the particular features, structures, or characteristics may be instituted in suitable forms other than the particular embodiment illustrated, and all such forms may be encompassed within the claims of the present application.
While the present invention has been described with respect to a limited number of embodiments, those skilled in the art will appreciate numerous modifications and variations therefrom. It is intended that the appended claims cover all such modifications and variations as fall within the true spirit and scope of this present invention.