The embodiments of the invention generally relate to improving manufacturing processes, and more particularly to an improved method that simplifies statistical correlation processes by eliminating the need for the user to identify independent variables and which automatically identifies independent variables used in statistical analysis.
With the fast-paced progress of modern technologies, continued scaling down, and the development of more complex devices and circuit designs, process control becomes more critical for yield learning. Process shifts of a few degrees Celsius or a microsecond could shift device performance significantly. Some of the challenging characteristics of manufacturing data analysis include multiple data types, large volumes, subtle device shifts, and data outliers. To detect and determine possible factors which can impact product quality, new applications of statistical techniques and automated analyses have been developed.
One of the objectives in manufacturing engineering is to understand the factors which can impact yield. Conventional methods for detecting the factors that affect yield are based on an engineer's experience or theories. Engineers select dependent variables (such as limited yields) and independent variables (such as some metrology data) to build up a table, then analyze the table by using a data mining tool or by building a scatter plot to see if there is a strong correlation between the identified dependent and independent variables.
These traditional methods sometimes do not account for all possible factors, both because the engineer's experience or theories are limited and because manual data extraction takes an inordinately long time. Further, such methods cannot cover large volumes of data and different data types, such as production line yield data, inline test data, and metrology data. It is also difficult to perform data manipulation conventionally using the common vertical databases. In addition, the manual selection of independent variables sometimes cannot respond fast enough to emerging problems which may have major revenue impact. Finally, such conventional systems are not very user friendly, because they require the user to be very experienced in statistical analysis and to have extensive knowledge of which dependent and independent variables will produce the most useful statistical correlations.
Therefore, the present embodiments provide a method that has the user only input (or select) a dependent variable table (comprising dependent variables), a data type, and a report key (and possibly filtering restrictions and statistical model selections). Using this information, the method automatically locates independent variable data based on the data type and the report key. This independent variable data can be in the form of a table and comprises independent variables. The method automatically joins the dependent variable table and the independent variable data to create a joint table. Then, the method can automatically perform a statistical analysis of the joint table to find correlations between the dependent variables and the independent variables and output the correlation results. This avoids having the user input or select the independent variables.
In addition, the method can automatically and independently filter the dependent variables and the independent variables (based on the filtering restrictions input by the user) to produce filtered dependent variables and filtered independent variables within the joint table. The filtering can comprise using different filters for the dependent variables and the independent variables. Similarly, the method can remove dependent variables and independent variables from the joint table that are based on a sample size that is below a predetermined minimum to maintain statistical quality.
As used herein, the dependent variables are related to product quality, yield, performance, etc., while the independent variables are related to process parameters and inline electrical test parameters. The data type comprises different data sources and the report key comprises a module list or photo layer list of the data type. Either modules or photo layers can be used to point to specific process sectors.
These and other aspects of the embodiments of the invention will be better appreciated and understood when considered in conjunction with the following description and the accompanying drawings. It should be understood, however, that the following descriptions, while indicating preferred embodiments of the invention and numerous specific details thereof, are given by way of illustration and not of limitation. Many changes and modifications may be made within the scope of the embodiments of the invention without departing from the spirit thereof, and the embodiments of the invention include all such modifications.
The embodiments of the invention will be better understood from the following detailed description with reference to the drawings, in which:
The embodiments of the invention and the various features and advantageous details thereof are explained more fully with reference to the non-limiting embodiments that are illustrated in the accompanying drawings and detailed in the following description. It should be noted that the features illustrated in the drawings are not necessarily drawn to scale. Descriptions of well-known components and processing techniques are omitted so as to not unnecessarily obscure the embodiments of the invention. The examples used herein are intended merely to facilitate an understanding of ways in which the embodiments of the invention may be practiced and to further enable those of skill in the art to practice the embodiments of the invention. Accordingly, the examples should not be construed as limiting the scope of the embodiments of the invention.
As mentioned above, traditional methods sometimes do not account for all possible factors due to the limited experience or theories and the inordinately long times for manual data extraction. Further, such methods cannot cover large volumes of data and different data types, such as production line yield data, inline test data, and metrology data. In addition, the manual selection of independent variables may not be able to respond fast enough to emerging problems. In addition, such conventional systems are not very user friendly, because they require the user to be very experienced in statistical analysis and to have extensive knowledge of which dependent and independent variables will produce the most useful statistical correlations.
Therefore, one idea of the invention is to have the user just supply an input table which has dependent variables and related categorical variables. This is different from traditional data mining systems which require the user to provide both dependent variables and independent variables. Therefore, with the invention, the user does not need to list, or even know, each of the independent variables. The user just needs to know which sector or part of the manufacturing process they want to focus on for data mining. With embodiments herein, the user just needs to identify the data type and data group (report key). The embodiments herein query all related independent variables automatically.
More specifically, as shown in flowchart form in
Using this information, the method automatically locates (queries areas of a database to find) independent variable data based on the data type and the report key in item 104, without further user input. This independent variable data can be in the form of a table and comprises independent variables. In item 104, the method also automatically (without further user input) joins the dependent variable table and the independent variable data to create a joint table.
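The locate-and-join step of item 104 can be illustrated with a minimal sketch. The sketch assumes a small in-memory list standing in for a vertical manufacturing database (one measurement per row) and assumes that the tables are joined on a lot identifier; the names MEASUREMENTS, locate_independent_variables, build_joint_table, and the column layout are hypothetical and are not taken from the embodiments.

```python
from collections import defaultdict

# Hypothetical stand-in for a vertical database: one measurement per row.
# Columns: (data_type, report_key, lot_id, parameter, value)
MEASUREMENTS = [
    ("metrology", "module_A", "lot1", "film_thickness", 101.0),
    ("metrology", "module_A", "lot1", "etch_depth", 50.2),
    ("metrology", "module_A", "lot2", "film_thickness", 99.5),
    ("metrology", "module_A", "lot2", "etch_depth", 49.8),
    ("metrology", "module_B", "lot1", "cd_width", 45.0),
    ("inline_electrical", "module_A", "lot3", "sheet_resistance", 12.1),
]


def locate_independent_variables(rows, data_type, report_keys):
    """Select rows matching the data type and report key, then pivot them
    from vertical form into {lot_id: {parameter: value}}."""
    wide = defaultdict(dict)
    for dtype, key, lot_id, parameter, value in rows:
        if dtype == data_type and key in report_keys:
            wide[lot_id][parameter] = value
    return dict(wide)


def build_joint_table(dependent_table, independent_data):
    """Inner-join the dependent variable table with the independent
    variable data on lot_id (an assumed join key)."""
    joint = []
    for row in dependent_table:
        independents = independent_data.get(row["lot_id"])
        if independents is not None:
            joint.append({**row, **independents})
    return joint


# The user supplies only the dependent variable table, data type, and report key.
dependent_table = [
    {"lot_id": "lot1", "yield": 0.91},
    {"lot_id": "lot2", "yield": 0.85},
    {"lot_id": "lot3", "yield": 0.88},
]
independent_data = locate_independent_variables(
    MEASUREMENTS, data_type="metrology", report_keys={"module_A"}
)
joint_table = build_joint_table(dependent_table, independent_data)
```

In this sketch the user never names film_thickness or etch_depth; those columns appear in the joint table only because they belong to the selected data type and report key.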
In addition, the method can automatically and independently filter the dependent variables and the independent variables (based on the filtering restrictions input by the user) to produce filtered dependent variables and filtered independent variables within the joint table in item 106. With embodiments herein, the user is presented with options to filter on any of the dependent variables in the input dataset and to filter independent variables automatically based on the distribution of each independent variable.
The filtering can comprise using different filters for the dependent variables and the independent variables; that is, the dual filtering functions that occur in item 106 apply one set of filters to the dependent variables and another to the independent variables. The filters for the dependent variables can be based on both the distribution of the variable and the other variables in the input table, and can be determined by using a query builder. The filters for the independent variables can be based on a sigma rule. For example, if 3 sigma is selected, independent variable data lying more than 3 sigma from the mean will be filtered out of the analysis.
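The sigma rule for independent variables can be sketched as follows. This is an illustrative sketch only: it assumes the rule is applied around the sample mean using the sample standard deviation, and the function name sigma_filter and the example data are hypothetical.

```python
from statistics import mean, stdev


def sigma_filter(pairs, n_sigma=3.0):
    """Keep (independent, dependent) pairs whose independent value lies
    within n_sigma sample standard deviations of the sample mean."""
    xs = [x for x, _ in pairs]
    if len(xs) < 2:
        return list(pairs)  # a standard deviation needs at least two points
    mu = mean(xs)
    sd = stdev(xs)
    if sd == 0:
        return list(pairs)  # all values identical, nothing to filter
    return [(x, y) for x, y in pairs if abs(x - mu) <= n_sigma * sd]


# Twenty well-behaved measurements plus one gross outlier at 100.0.
pairs = [(9.8, 0.90), (10.2, 0.91)] * 10 + [(100.0, 0.40)]
filtered = sigma_filter(pairs, n_sigma=3.0)
```

Filtering the pairs rather than the bare values keeps each independent value aligned with its dependent value. With very few data points a single outlier inflates the standard deviation enough that it may not exceed the cutoff, which is one reason a minimum sample size is also useful.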
Similarly, in item 108, the method can remove from the joint table any dependent variables and independent variables whose sample size is below a predetermined minimum, to maintain statistical quality. The minimum sample size function eliminates dependent variables whose sample size is smaller than the minimum sample size. Minimum sample sizes are likewise applied to the independent variables to eliminate false signals in the statistical analysis.
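A minimal sketch of the minimum sample size check of item 108 is given below. It assumes that the joint table is a list of rows and that a missing measurement is represented by an absent key or by None; the function name drop_small_samples and the example columns are hypothetical.

```python
def drop_small_samples(joint_table, variables, min_samples):
    """Return only those variables that have at least min_samples
    non-missing values in the joint table."""
    kept = []
    for var in variables:
        count = sum(1 for row in joint_table if row.get(var) is not None)
        if count >= min_samples:
            kept.append(var)
    return kept


example_table = [
    {"lot_id": "lot1", "yield": 0.90, "cd_width": 45.0, "overlay": None},
    {"lot_id": "lot2", "yield": 0.88, "cd_width": 46.0, "overlay": 3.1},
    {"lot_id": "lot3", "yield": 0.86, "cd_width": 47.0},
]
kept_variables = drop_small_samples(
    example_table, ["cd_width", "overlay"], min_samples=3
)
```

Here overlay has only one usable value and is dropped before any correlation is computed, so it cannot produce a spurious signal.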
Then, in item 110, the method can automatically perform a statistical analysis of the joint table to find correlations between the dependent variables and the independent variables, output the correlation results, and rank the signals output by the statistical models. Thus, the models can be used to rank the signals to help the user pinpoint the most important signals. The most important signals can be further analyzed by using the correlation by time series.
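The ranking of item 110 can be sketched as follows, assuming for illustration that each independent variable is scored by the R-squared of a simple one-variable linear fit against the dependent variable. The function names r_squared and rank_signals and the example data are hypothetical, and an actual embodiment may use other statistical models.

```python
def r_squared(xs, ys):
    """R-squared of a one-variable least-squares line, which equals the
    squared Pearson correlation coefficient."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        return 0.0  # a constant column carries no correlation signal
    return (sxy * sxy) / (sxx * syy)


def rank_signals(joint_table, dependent, independents):
    """Score each independent variable against the dependent variable and
    return (variable, R-squared) tuples, strongest signal first."""
    results = []
    for var in independents:
        pairs = [
            (row[var], row[dependent])
            for row in joint_table
            if row.get(var) is not None and row.get(dependent) is not None
        ]
        if len(pairs) < 3:
            continue  # too few points for a meaningful fit
        xs, ys = zip(*pairs)
        results.append((var, r_squared(xs, ys)))
    return sorted(results, key=lambda item: item[1], reverse=True)


example_joint = [
    {"yield": 0.80, "anneal_temp": 1.0, "queue_time": 3.0},
    {"yield": 0.82, "anneal_temp": 2.0, "queue_time": 1.0},
    {"yield": 0.84, "anneal_temp": 3.0, "queue_time": 4.0},
    {"yield": 0.86, "anneal_temp": 4.0, "queue_time": 1.0},
    {"yield": 0.88, "anneal_temp": 5.0, "queue_time": 5.0},
]
ranked = rank_signals(example_joint, "yield", ["queue_time", "anneal_temp"])
```

In this example anneal_temp tracks yield exactly and is ranked first, while queue_time shows only a weak relationship and falls to the bottom of the list.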
The statistical models used with embodiments herein can include any models, whether now known or developed in the future. For example, the embodiments herein can use Generalized Linear Models (GLM) and quadratic models. The GLM is a linear model which can be used to rank signals based on their R-squared values. Three options can be used for the analysis: positive correlation, negative correlation, and combination correlation. The positive correlation can be used to find the relationship between functional yield and inline test health of line yield. The negative correlation can be used to find relationships between functional yield and defect density. The combination correlation can be used for process window evaluation and abnormality identification. The quadratic model can be used to highlight a process which has a significant quadratic shape and to evaluate whether process windows are too wide or too narrow.
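The quadratic model can be illustrated with a minimal least-squares sketch. The sketch assumes a fit of the form y = c0 + c1*u + c2*u*u, where u is the independent variable centered on its mean, solved through the normal equations. The function names solve_3x3 and quadratic_fit and the example data are hypothetical, and reading the curvature as a process window indicator is an interpretation offered for illustration rather than a statement of the embodiments.

```python
def solve_3x3(matrix, rhs):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    m = [row[:] + [b] for row, b in zip(matrix, rhs)]
    n = 3
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: need at least 3 distinct x values")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    solution = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * solution[j] for j in range(i + 1, n))
        solution[i] = (m[i][n] - tail) / m[i][i]
    return solution


def quadratic_fit(xs, ys):
    """Fit y = c0 + c1*u + c2*u**2 with u = x - mean(x).
    Returns (c2, r_squared); c2 is the curvature of the fitted parabola."""
    n = len(xs)
    mean_x = sum(xs) / n
    u = [x - mean_x for x in xs]
    s = [sum(v ** k for v in u) for k in range(5)]          # sums of u**k
    t = [sum((v ** k) * y for v, y in zip(u, ys)) for k in range(3)]
    normal = [[s[0], s[1], s[2]], [s[1], s[2], s[3]], [s[2], s[3], s[4]]]
    c0, c1, c2 = solve_3x3(normal, t)
    mean_y = sum(ys) / n
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    ss_res = sum((y - (c0 + c1 * v + c2 * v * v)) ** 2 for v, y in zip(u, ys))
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else 0.0
    return c2, r2


# Yield peaks at a process temperature of 100 and falls off on both sides.
temps = [97.0, 98.0, 99.0, 100.0, 101.0, 102.0, 103.0]
yields = [0.90 - 0.01 * (t - 100.0) ** 2 for t in temps]
curvature, quad_r2 = quadratic_fit(temps, yields)
```

A purely linear fit to this data would report almost no correlation because the rise and fall cancel out, whereas the quadratic fit captures the shape. A strongly negative curvature with a high R-squared suggests that the current operating range spans the yield peak and drops off at its edges, which may indicate a process window that is too wide.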
As part of the output, the invention can output various charts to visually confirm the signals output by the statistical models in item 112. This allows the user to take action to change various process windows in item 114 without having the user input or select the independent variables.
The system can be used efficiently with vertical database design and the user can control the sample size for statistical analysis. Further, multiple statistical models can be used to rank correlation results. The system can be used for process window evaluation, to detect abnormal process change, and for further physical failure analysis.
In the example shown in
The dependent variables are related to product quality, yield, performance, etc., while the independent variables are related to process parameters and process related measurement parameters. In other words, changes to the independent variables (e.g., changes in processing temperature, processing time, etc.) cause changes in the dependent variables (e.g., the product yield or performance).
The data type comprises different data sources and the report key comprises a module list or photo layer list of the data type. For example, some data types include metrology data, photo-limited yield (PLY) analysis, inline electrical data, or other related data types. These data types are useful for vertical database design to make the system work efficiently. The data types can be used to identify independent variables automatically.
The processing herein is different from data mining systems which require the user to provide both dependent variables and independent variables. Therefore, with embodiments herein the user does not need to list, or even know, each of the independent variables. The user just needs to know which sector or part of the manufacturing process they want to focus on for data mining. The independent variable data can be retrieved automatically from a manufacturing database with embodiments herein.
The embodiments of the invention can take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment including both hardware and software elements. In a preferred embodiment, the invention is implemented in software, which includes but is not limited to firmware, resident software, microcode, etc.
Furthermore, the embodiments of the invention can take the form of a computer program product accessible from a computer-usable or computer-readable medium providing program code for use by or in connection with a computer or any instruction execution system. For the purposes of this description, a computer-usable or computer readable medium can be any apparatus that can comprise, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
The medium can be an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system (or apparatus or device) or a propagation medium. Examples of a computer-readable medium include a semiconductor or solid state memory, magnetic tape, a removable computer diskette, a random access memory (RAM), a read-only memory (ROM), a rigid magnetic disk and an optical disk. Current examples of optical disks include compact disk-read only memory (CD-ROM), compact disk-read/write (CD-R/W) and DVD.
A data processing system suitable for storing and/or executing program code will include at least one processor coupled directly or indirectly to memory elements through a system bus. The memory elements can include local memory employed during actual execution of the program code, bulk storage, and cache memories which provide temporary storage of at least some program code in order to reduce the number of times code must be retrieved from bulk storage during execution.
Input/output (I/O) devices (including but not limited to keyboards, displays, pointing devices, etc.) can be coupled to the system either directly or through intervening I/O controllers. Network adapters may also be coupled to the system to enable the data processing system to become coupled to other data processing systems or remote printers or storage devices through intervening private or public networks. Modems, cable modems, and Ethernet cards are just a few of the currently available types of network adapters.
A representative hardware environment for practicing the embodiments of the invention is depicted in
The foregoing description of the specific embodiments will so fully reveal the general nature of the invention that others can, by applying current knowledge, readily modify and/or adapt for various applications such specific embodiments without departing from the generic concept, and, therefore, such adaptations and modifications should and are intended to be comprehended within the meaning and range of equivalents of the disclosed embodiments. It is to be understood that the phraseology or terminology employed herein is for the purpose of description and not of limitation. Therefore, while the embodiments of the invention have been described in terms of preferred embodiments, those skilled in the art will recognize that the embodiments of the invention can be practiced with modification within the spirit and scope of the appended claims.