The instant specification generally relates to quality control in electronic device manufacturing, including semiconductor processing lines. More specifically, the instant specification relates to monitoring remaining useful life of various processing tools used in semiconductor manufacturing.
Manufacturing of modern materials often involves various deposition techniques, such as chemical vapor deposition (CVD) or physical vapor deposition (PVD) techniques, in which atoms or molecules of one or more selected types are deposited on a wafer (substrate) held in low or high vacuum environments that are provided by vacuum processing (e.g., deposition, etching, etc.) chambers. Materials manufactured in this manner may include monocrystals, semiconductor films, fine coatings, and numerous other substances used in practical applications, such as electronic device manufacturing. Many of these applications depend on the purity and specifications of the materials grown in the processing chambers. The quality of such materials, in turn, depends on adherence of the manufacturing operations to correct process specifications. To maintain isolation of the inter-chamber environment and to minimize exposure of wafers to ambient atmosphere and contaminants, various sensor detection techniques are used to monitor the processing chamber environment, wafer transportation, physical and chemical properties of the products, and the like. Improving precision, reliability, and efficiency of such monitoring presents a number of technological challenges whose successful resolution facilitates continuing progress of electronic device manufacturing and helps to meet the constantly increasing demands on the quality of the products of semiconductor device manufacturing.
In one implementation, disclosed is a method that includes storing, by a processing device, a failure index (FI) model generated using run-time sensor data that was collected during one or more operations of a tool of a manufacturing system that occurred prior to the first five failures of the tool. The FI model includes an FI function, an input into the FI function comprising the run-time sensor data, and one or more FI threshold values for the FI function, wherein each of the one or more FI threshold values is associated with at least one of a present condition of the tool or a projected condition of the tool. The method further includes collecting new run-time sensor data for one or more instances of the tool, and applying, by the processing device, the FI model to the new run-time sensor data to identify one or more conditions associated with each of the one or more instances of the tool. The method further includes, responsive to one or more tool failures of the one or more instances of the tool, updating the FI model. Updating the FI model includes modifying at least one of a dependence of the FI function on the run-time sensor data, or at least one FI threshold value of the one or more FI threshold values.
In another implementation, disclosed is a method that includes storing, by a processing device, a failure index (FI) model generated using run-time sensor data that was collected during one or more operations of a tool of a manufacturing system that occurred prior to a first failure of the tool. The FI model includes an FI function, an input into the FI function comprising the run-time sensor data, and one or more FI threshold values for the FI function, wherein each of the one or more FI threshold values is associated with at least one of a present condition of the tool or a projected condition of the tool. The method further includes collecting new run-time sensor data for one or more instances of the tool and applying the FI model to the new run-time sensor data to identify one or more conditions associated with each of the one or more instances of the tool.
In another implementation, disclosed is a system that includes a memory and a processing device operatively coupled to the memory. The processing device is to store a failure index (FI) model generated using run-time sensor data that was collected during one or more operations of a tool of a manufacturing system that occurred prior to the first five failures of the tool. The FI model includes an FI function, an input into the FI function comprising the run-time sensor data, and one or more FI threshold values for the FI function, wherein each of the one or more FI threshold values is associated with at least one of a present condition of the tool or a projected condition of the tool. The processing device is further to collect new run-time sensor data for one or more instances of the tool and apply the FI model to the new run-time sensor data to identify one or more conditions associated with each of the one or more instances of the tool. The processing device is further to update the FI model responsive to one or more tool failures of the one or more instances of the tool. Updating the FI model includes modifying at least one of a dependence of the FI function on the run-time sensor data, or at least one FI threshold value of the one or more FI threshold values.
The implementations disclosed herein provide for efficient monitoring of a state, including remaining useful life (RUL) or a time to some threshold condition (TTC), of various tools used in manufacturing of various products, including but not limited to semiconductor wafers, films, and/or fully or partially manufactured devices. The implementations disclosed herein provide for using available sensor data to estimate when a failure of various processing tools and/or processes is likely to happen, both in situations when tool failure data is too scarce or not yet available and in situations when substantial tool failure data has been accumulated. For example, the implementations disclosed herein can be used to estimate a current state of a particular tool (e.g., normal state, warning state, advanced state, etc.) and inform a manufacturing line controller about a likely RUL or TTC for the tool, and/or a time to a certain condition (e.g., a process stoppage).
The robotic delivery and retrieval of wafers, as well as maintaining controlled environments in loading, processing, and transfer chambers, improve speed, efficiency, and quality of device manufacturing. Typical device manufacturing processes often require tens and even hundreds of steps, such as introducing a gas into a processing chamber, heating the chamber environment, changing a composition of gas, purging a chamber, pumping the gas out, changing pressure, moving a wafer from one position to another, creating or adjusting a plasma environment, performing etching, polishing, and/or deposition steps, and so on. The very complexity of the manufacturing technology calls for processing of a constant stream of runtime data from various sensors placed within or near the manufacturing system. Such sensors may include temperature sensors, pressure sensors, chemical sensors, gas flow sensors, motion sensors, position sensors, optical sensors, and/or other types of sensors. The manufacturing system can deploy multiple sensors of the same (or similar) type distributed throughout various parts of the system. For example, a single processing chamber can have multiple chemical sensors detecting a concentration of chemical vapor at various locations within the processing chamber and can similarly have multiple temperature sensors monitoring a temperature distribution.
The collected sensor data, e.g., raw run-time trace data or statistical characteristics of the raw data, can inform a handler (a user, engineer, or supervisor) of the processing line when a specific tool is about to fail. Automatic estimation of the state of various tools without the need to stop the processing line and manually inspect each tool (a slow and expensive operation) is advantageous for increasing the processing line output and ensuring that the output comports with applicable technological specifications. Existing approaches to correlating runtime statistics with the state of various tools include using artificial intelligence (AI), e.g., machine-learning models. Training reliable AI models, however, requires a considerable number of prior (historical) tool failures. Given that a tool can have multiple designs and is capable of failing in multiple different ways (e.g., a polishing tool can break down, become deformed, lose abrasion, and so on), collecting an amount of statistics sufficient for successful training of the AI models can require waiting for many tool failures. Additionally, training the AI models is typically performed by data science specialists who may lack subject matter expertise (e.g., expertise in physics and chemistry of the relevant processes and phenomena). Integration of feedback from subject matter specialists into a data-driven AI training process requires significant developmental effort and remains uncommon and/or impractical.
Aspects and implementations of the present disclosure address these and other challenges of the existing tool maintenance technology by disclosing a hybrid monitoring framework that combines unsupervised (or minimally supervised) monitoring during early stages of tool deployment, when tool failure data is not yet available or is scarce, with supervised monitoring as more tool failure data is being collected. Hybrid monitoring integrates subject matter expertise, for use in estimation of tools' RULs (or other TTCs) during early stages of tool deployment, with data-driven prediction during later stages (e.g., after several tool lifecycles). In some embodiments, a set {Xj}=X1, X2, . . . of runtime data and/or statistical characteristics of that data (denoted herein generally as Xj) collected during manufacturing can be identified. The run-time data can include one or more quantities (sensor values) that are monitored during processing operations, e.g., temperature, pressure, concentration of plasma particles, density, various tool-specific metrics, and the like. The identified run-time data or their statistical characteristics can be used to define a failure index (FI) function FI({Xj}), also referred to as FI herein, whose value is indicative of a likelihood that a specific tool is approaching the end of its RUL or some other TTC, e.g., a state where the tool is to be cleaned, re-charged, and/or undergo any other maintenance operation. In some embodiments, the FI can include a weighted sum FI({Xj}) = Σj wj (ΔXj)^β, where wj are weights, β is a suitably chosen exponent, and ΔXj represents a departure of Xj from its optimal value or normal operating range.
Initially, parameters of the FI can be set based on recommendations of subject-matter experts and/or based on expectations that certain metrics do not significantly deviate from their normal ranges. For example, if a particular sensor quantity Xj associated with a new tool has an average value and a standard deviation observed during normal operations, the corresponding FI parameters can be set such that the FI remains small while Xj stays within a certain number of standard deviations of that average value and increases appreciably once Xj departs further from it.
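By way of a non-limiting illustration, a minimal Python sketch of a weighted-sum FI of the kind described above is shown below; the function name compute_fi, the baseline statistics, and the equal initial weights are hypothetical choices made for the example only and are not prescribed by this disclosure.

```python
import numpy as np

def compute_fi(x, baseline_mean, baseline_std, weights, beta=1.0):
    """Weighted-sum failure index FI({Xj}) = sum_j w_j * (dXj)**beta.

    dXj is the departure of the j-th statistical characteristic from its
    baseline (normal-operation) mean, measured in units of the baseline
    standard deviation, so that the FI stays small while the tool behaves
    normally and grows as characteristics drift out of range.
    """
    x = np.asarray(x, dtype=float)
    d = np.abs(x - baseline_mean) / baseline_std      # normalized departures dXj
    return float(np.sum(weights * d ** beta))

# Example: three monitored characteristics (e.g., mean temperature,
# pressure variance, gas-flow mean), with equal initial weights chosen
# before any failure data is available.
baseline_mean = np.array([350.0, 2.0, 40.0])
baseline_std = np.array([5.0, 0.3, 1.5])
weights = np.array([1 / 3, 1 / 3, 1 / 3])

current = np.array([361.0, 2.9, 40.5])                # latest runtime statistics
print(compute_fi(current, baseline_mean, baseline_std, weights))
```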
The advantages of the disclosed techniques include (but are not limited to) a timely identification of different states of degradation of various tools that are deployed in manufacturing processing systems and are hard or inefficient to monitor without stopping the manufacturing line. Additional benefits include an ability to identify a quality of a maintenance operation performed for the tool. The disclosed implementations pertain to a variety of manufacturing techniques that use processing chambers (that may include deposition chambers, etching chambers, and the like), such as chemical vapor deposition (CVD) techniques, physical vapor deposition (PVD) techniques, plasma-enhanced CVD, plasma-enhanced PVD, sputter deposition, atomic layer CVD, combustion CVD, catalytic CVD, evaporation deposition, molecular-beam epitaxy techniques, and so on. The disclosed implementations may be employed in techniques that use vacuum deposition chambers (e.g., ultrahigh vacuum CVD or PVD, low-pressure CVD, etc.) as well as in atmospheric pressure deposition chambers.
The transfer chamber 104 may include a robot 108, a robot blade 110, and an optical inspection tool for accurate optical inspection of a wafer 112 that is being transported by the robot blade 110 after processing in one of the processing chambers 106. The transfer chamber 104 may be held at a pressure and/or temperature that is higher or lower than the ambient atmospheric pressure and temperature. The robot blade 110 may be attached to an extendable arm sufficient to move the robot blade 110 into the processing chamber 106 to retrieve the wafer from the chamber after processing of the wafer is complete.
The robot blade 110 may enter the processing chamber(s) 106 through a slit valve port (not shown) while a lid to the processing chamber(s) 106 remains closed. The processing chamber(s) 106 may contain processing gases, plasma, and various particles used in deposition processes. A magnetic field may exist inside the processing chamber(s) 106. The inside of the processing chamber(s) 106 may be held at temperatures and pressures that are different from the temperature and pressure outside the processing chamber(s) 106.
The manufacturing machine 100 may deploy one or more sensors 114. Each sensor 114 may be a temperature sensor, pressure sensor, chemical detection sensor, chemical composition sensor, gas flow sensor, motion sensor, position sensor, optical sensor, or any other type of sensor. Some or all of the sensors 114 may include a light source to produce light (or any other electromagnetic radiation), direct it towards a target, such as a component of the machine 100 or a wafer, a film deposited on the wafer, etc., and detect light reflected from the target. The sensors 114 can be located anywhere inside the manufacturing machine 100 (for example, within any of the chambers including the loading stations, on the robot 108, on the robot blade 110, between the chambers, and so on), or even outside the manufacturing machine 100 (where the sensors can test ambient temperature, pressure, gas concentration, and so on).
In some implementations, a computing device 101 may control operations of the manufacturing machine 100 and its various tools and components, including operations of the robot 108, operations that manage processes in the processing chambers 106, operations of the sensors 114, and so on. The computing device 101 may communicate with an electronics module 150 of the robot 108 and with the sensors 114. In some implementations, such communication may be performed wirelessly. The computing device 101 may control operations of the robot 108 and may also receive sensing data from the sensors 114, including raw sensor data or sensor data that undergoes preliminary processing (such as conversion from analog to digital format) by sensors 114 or by another processing device, such as a microcontroller of the electronics module 150 or any other processing device of the manufacturing machine 100. In some implementations, some of the sensor data is processed by the electronics module 150 whereas some of the sensor data is processed by the computing device 101. The computing device 101 may include a sensor data module (SDM) 120. The SDM 120 may activate sensors, deactivate sensors, place sensors in an idle state, change settings of the sensors, detect sensor hardware or software problems, and so on. In some implementations, SDM 120 may keep track of the processing operations performed by the manufacturing machine 100 and determine which sensors 114 are to be sampled for a particular processing (or diagnostic, maintenance, etc.) operation of the manufacturing machine 100. For example, during a chemical deposition step inside one of the processing chambers 106, SDM 120 may sample sensors 114 that are located inside the respective processing chamber 106 but not activate (or sample) sensors 114 located inside the transfer chamber 104 and/or the loading station 102. The raw data obtained by SDM 120 may include time series data where a specific sensor 114 captures or generates one or more readings of a detected quantity at a series of times. For example, a pressure sensor may generate N pressure readings P(ti) at times t1, t2, . . . tN. In some implementations, the raw data obtained by SDM 120 may include spatial maps at a predetermined set of spatial locations. For example, an optical reflectivity sensor may determine reflectivity of a film deposited on the surface of a wafer, R(xj, yk), at a set (e.g., a two-dimensional set) of spatial locations xj, yk, on the surface of the film/wafer. In some implementations, both the time series and the spatial maps raw data can be collected. For example, as the film is being deposited on the wafer, SDM 120 can collect the reflectivity data from various locations on the surface of the film and at a set of consecutive instances of time, R(ti, xj, yk).
In some embodiments, SDM 120 may process the raw data (also referred to as sensor values herein) obtained by sensors 114 and determine statistical characteristics of the obtained sensor values. For example, for each or some of the sensor values S(a), SDM 120 may determine one or more statistical characteristics Xj(a) of the respective sensor value S(a), such as a mean (e.g., X1(a)), a median (e.g., X2(a)), a mode (e.g., X3(a)), an upper bound (e.g., X4(a)), a lower bound (e.g., X5(a)), a standard deviation (e.g., X6(a)), a skewness (e.g., X7(a)), a kurtosis (e.g., X8(a)), or any further moments or cumulants of the data distribution. In various embodiments, only some of the statistical characteristics Xj(a) may be used. In some embodiments, any additional value not listed above may be used. In some embodiments, at least some of the values Xj(a) may be raw data that is not statistically averaged. For example, temperature T and/or pressure P in the processing chamber may be taken at regular time intervals (e.g., one second) and not statistically processed. In some embodiments, SDM 120 may model (e.g., via regression analysis and/or some form of statistical fitting) the sensor values with various model distributions, e.g., the normal distribution, the log-normal distribution, the binomial distribution, the Poisson distribution, the Gamma distribution, or any other distribution. In such embodiments, the one or more parameters may include an identification of the fitting distribution being used together with the fitting parameters determined by SDM 120. In some embodiments, SDM 120 may use multiple distributions to fit the raw data from one sensor, e.g., a main distribution and a tail distribution for outlier data points.
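By way of a non-limiting illustration, the following sketch shows one way the per-sensor statistical characteristics and a fitting distribution described above could be computed; the helper name summarize_sensor and the choice of a normal fitting distribution are assumptions made for the example only.

```python
import numpy as np
from scipy import stats

def summarize_sensor(values):
    """Compute statistical characteristics Xj of one sensor's raw values."""
    values = np.asarray(values, dtype=float)
    summary = {
        "mean": np.mean(values),
        "median": np.median(values),
        "std": np.std(values, ddof=1),
        "min": np.min(values),
        "max": np.max(values),
        "skewness": stats.skew(values),
        "kurtosis": stats.kurtosis(values),
    }
    # Optionally model the values with a fitting distribution; a normal
    # distribution is assumed here purely for illustration.
    mu, sigma = stats.norm.fit(values)
    summary["fit"] = ("normal", {"mu": mu, "sigma": sigma})
    return summary

# Example: pressure readings P(t1), ..., P(tN) from one sensor.
rng = np.random.default_rng(0)
pressure = rng.normal(loc=1.2, scale=0.05, size=500)
print(summarize_sensor(pressure)["mean"])
```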
The statistical characteristics obtained by SDM 120 and included into the FI function may be sensor-specific. For example, for some sensors a small number of values may be determined (e.g., only the mean, or the mean and the variance) whereas for other sensors more moments (e.g., skewness, kurtosis, etc.) may be determined. The computing device 101 may also include a remaining useful life estimation module (REM) 122 to process, aggregate, and analyze the statistics collected by SDM 120, as described in more detail below.
The preprocessed data (including filtered raw data, sets of statistical characteristics of the raw data, and/or the like) can be analyzed by REM 122, which may deploy multiple machine-learning models (MLMs), including but not limited to an FI construction model (FCM) 220, a tool state detection model (TSDM) 230, a RUL/TTC prediction model (RPM) 240, and/or any other MLMs. More specifically, FCM 220 may be used to construct an FI function representative of the current state of a particular tool (or a set of multiple tools). The constructed index FI({Xj(a)}) may weight different sets of sensor data and/or statistical characteristics {Xj(a)} in view of a degree (estimated and/or learned during training) to which a respective sensor value S(a) is representative (relative to other sensor values) of a condition (e.g., state of deterioration) of the tool(s). In some embodiments, operation of FCM 220 can be performed prior to collection of the runtime sensor data 202. In some embodiments, operation of FCM 220 can continue during runtime processing. For example, the weights in the FI and/or FI thresholds (e.g., TW, TA, TC, etc.) can be adjusted at runtime as tool failure data become available over a number of lifecycles of the tool(s). TSDM 230 can apply the constructed FI to runtime sensor data 202 and determine a current state of the tool(s), e.g., normal state, warning state, advanced state, and/or any other defined states. RPM 240 can be used at later stages of tool degradation, e.g., once the tool has been determined to be in the advanced state, RPM 240 can estimate how much useful life remains for the tool(s).
FCM 220, TSDM 230, and/or RPM 240 can be trained by the training server 270. Training server 270 can be (and/or include) a rackmount server, a router computer, a personal computer, a laptop computer, a tablet computer, a desktop computer, a media center, or any combination thereof. Training server 270 can include a training engine 272. Training engine 272 can construct various models, including machine learning models. FCM 220, TSDM 230, and/or RPM 240 can be trained by the training engine 272 using training data that includes training inputs 274, corresponding target outputs 276, and mapping data mapping training inputs 274 to target outputs 276. In some implementations, FCM 220, TSDM 230, and/or RPM 240 can be trained separately.
The mapping data can include correct associations (mappings) of training inputs 274 to target outputs 276. The training engine 272 can find patterns in the training data that map the training inputs 274 to the target outputs 276 (e.g., the associations to be predicted), and train FCM 220, TSDM 230, and/or RPM 240 to capture these patterns. The patterns can subsequently be used by FCM 220, TSDM 230, and/or RPM 240 for subsequent data processing, tool state determination, and RUL/TTC determination. For example, upon receiving a new set of runtime sensor data 202, TSDM 230 and/or RPM 240 can be capable of determining a current status of one or more tools of the processing line and, if the tool(s) are nearing the end of RUL (or some other threshold condition), estimate how long the tool(s) are to remain operational.
In some embodiments, FCM 220, TSDM 230, and/or RPM 240 can include one or more neural networks, e.g., neural networks having a single or multiple layers of linear and/or non-linear neural operations. In some embodiments, FCM 220, TSDM 230, and/or RPM 240 can deploy deep neural networks having multiple levels of linear or non-linear operations. Examples of deep neural networks include convolutional neural networks, recurrent neural networks (RNN) with one or more hidden layers, fully connected neural networks, Boltzmann machines, and so on. In some implementations, the neural networks deployed in FCM 220, TSDM 230, and/or RPM 240 can include multiple neurons, each of which can receive its input from other neurons or from an external source and can produce an output by applying an activation function to the sum of weighted inputs and a trainable bias value. A neural network (e.g., any neural network deployed in FCM 220, TSDM 230, and/or RPM 240) can include multiple neurons arranged in layers, including an input layer, one or more hidden layers, and an output layer. Neurons from adjacent layers can be connected by weighted edges. Initially, all the edge weights can be assigned some starting (e.g., random) values. For every training input 274 in the training dataset, training engine 272 can cause the neural network to generate training outputs (e.g., predicted RUL/TTC of a particular tool). The training engine can compare the observed training output of the neural network with target output 276. The resulting error, e.g., the difference between the training output and target output 276, can be propagated back through the neural network, and the weights and biases in the neural network can be adjusted to make the training output closer to target output 276. This adjustment can be repeated until the output error for a particular training input 274 satisfies a predetermined condition (e.g., falls below a predetermined value). Subsequently, a different training input 274 can be selected, a new output generated, and a new series of adjustments implemented, and so on, until the neural network is trained to an acceptable degree of accuracy.
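By way of a non-limiting illustration, a minimal sketch of such a training loop is shown below (using the PyTorch library); the network size, optimizer, loss function, and synthetic training data are assumptions made for the example and do not represent a specific implementation of FCM 220, TSDM 230, or RPM 240.

```python
import torch
from torch import nn

# Toy training data: each row holds statistical characteristics {Xj(a)} for
# one historical observation; the target is the observed RUL (e.g., number of
# remaining operations). Shapes and values are purely illustrative.
inputs = torch.randn(256, 8)                  # training inputs 274
targets = torch.randn(256, 1)                 # target outputs 276

model = nn.Sequential(                        # small fully connected network
    nn.Linear(8, 32), nn.ReLU(),
    nn.Linear(32, 16), nn.ReLU(),
    nn.Linear(16, 1),
)
loss_fn = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

for epoch in range(200):
    optimizer.zero_grad()
    predicted = model(inputs)                 # training output of the network
    loss = loss_fn(predicted, targets)        # error vs. target outputs 276
    loss.backward()                           # propagate the error back
    optimizer.step()                          # adjust weights and biases
```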
Training inputs 274 may include historical sensor data 282, which may be stored, e.g., in tool statistics repository 280, which may be accessible to the computing device 101 directly or via network 260. Historical sensor data 282 can be past statistics, e.g., statistics collected by sensors 114 of manufacturing machine 100 or similar manufacturing machines. In some embodiments, historical sensor data 282 can include runtime sensor data 202 collected during previous lifecycles of similar tool(s). Historical sensor data 282 can be annotated with the times when the tool(s) experienced a failure and/or were replaced as a result of tool degradation. Historical sensor data 282 can be further annotated with a type of the failure, an estimate of the RUL/TTC at the time of the tool replacement, and/or any other appropriate data.
Tool statistics repository 280 can be a persistent storage capable of storing sensor data or sensor data statistics as well as metadata for the stored data/statistics. Tool statistics repository 280 can be hosted by one or more storage devices, such as main memory, magnetic or optical storage disks, tapes, or hard drives, network-attached storage (NAS), storage area network (SAN), and so forth. Although depicted as separate from the computing device 101, in some implementations the tool statistics repository 280 can be a part of the computing device 101. In some implementations, the tool statistics repository 280 can be a network-attached file server, while in other implementations the tool statistics repository 280 can be some other type of persistent storage, such as an object-oriented database, a relational database, and so forth, that can be hosted by a server machine, or one or more different machines coupled to the computing device 101 via the network 260.
Once FCM 220, TSDM 230, and/or RPM 240 have been trained, the trained models can be provided to computing device 101 for processing of new runtime sensor data 202 by REM 122. In some embodiments, a copy of training engine 272 can also be provided to computing device 101. The copy of training engine 272 can be used for training and/or retraining of some or all of FCM 220, TSDM 230, and/or RPM 240 using runtime sensor data 202. For example, as additional lifecycles of a particular tool are completed, the runtime sensor data 202 from those additional cycles can be correlated with various statistical characteristics {Xj(a)} and adjustments of FI weights of FCM 220, state thresholds of TSDM 230, and/or RUL/TTC metrics of RPM 240 can be performed in view of the completed lifecycles.
Any or all of FCM 220, TSDM 230, and/or RPM 240 can be deployed using one or more processing devices. “Processing device,” as used herein, refers to a device capable of executing instructions encoding arithmetic, logical, or I/O operations. In one illustrative example, a processing device may follow von Neumann architectural model and can include an arithmetic logic unit (ALU), a control unit, and a plurality of registers. In a further aspect, a processing device can be a single core processor, which is typically capable of executing one instruction at a time (or process a single pipeline of instructions), or a multi-core processor which may simultaneously execute multiple instructions. In another aspect, a processing device can be implemented as a single integrated circuit, two or more integrated circuits, or can be a component of a multi-chip module. A processing device can also be referred to as a CPU. “Memory device” herein refers to a volatile or non-volatile memory, such as random-access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), or any other device capable of storing data.
Using available sensor types 302, normal operating ranges 310 for various sensor values (temperature, pressure, gas flow, thickness of the polishing element, etc.) associated with performance of a specific processing tool can be established. In the following, a reference will often be made to a single tool, but it should be understood that any number of tools can be evaluated in a similar manner. For example, for some sensor value S, an optimal value S0 can be identified (e.g., based on technological specification of the tool and/or the processing line). Additionally, a range of normal operations around the sensor value S, e.g., [S1, S2], can be established, representing an acceptable departure of the sensor value from the optimal value S0 as part of normal tool operations. In some embodiments, normal operating ranges 310 can be established based on observation of sensor values and sensor statistics of new tools even when no tool failure data is yet available. For example, a normal operating range [S1, S2] for the sensor value S can be established as a certain number k (e.g., k=2, 3, etc.) of standard deviations σ from the mean sensor value.
In some embodiments, FI generation stage 300 can use historical sensor data 282 (if available) for determination of the normal operating ranges 310 of at least some of the available sensor types 302. Available historical sensor data 282 can be processed, e.g., with a statistical analysis module 304 that extracts various statistical information from historical sensor data 282, including (but not limited to) one or more of the following statistical characteristics Xj(a): a mean value, a median value, a mode, a standard deviation, a half-width, a lower/upper bound, a skewness, a kurtosis, and/or the like. Such values can be determined for different datasets of the corresponding quantities, e.g., for datasets associated with different timestamps of historical sensor data 282. If any previous tool failure data is available (e.g., the instances when the tool failed or had to be replaced in the past), the statistical analysis module 304 can correlate the past instances of the tool failures with various features in the statistical characteristics {Xj(a)} extracted from historical sensor data 282.
Using the obtained correlations, the statistical analysis module 304 can perform a regression analysis and identify specific sensor values Sa (generated by specific sensor types 302) that are correlated with the past tool failures. Such correlations can be characterized by a predictive power Pj(a), e.g., a numerical value computed for the respective statistical characteristics Xj(a) and quantifying a degree to which changes in Xj(a) are correlated with historic instances of the tool failures. Sensor values Sa (sensor types 302) that include statistical characteristics Xj(a) having a higher predictive power Pj(a) can be identified by the statistical analysis module 304 as being more relevant for predicting future tool failures. Correspondingly, sensor values Sa that have statistical characteristics Xj(a) with lower predictive powers can be identified by the statistical analysis module 304 as being less relevant for predicting future tool failures.
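By way of a non-limiting illustration, the following sketch computes a simple predictive power for each statistical characteristic as the absolute correlation between that characteristic and a binary near-failure label; the use of Pearson correlation and the names predictive_powers and failure_labels are assumptions made for the example only, as the disclosure does not prescribe a particular regression technique.

```python
import numpy as np

def predictive_powers(history, failure_labels):
    """Rank statistical characteristics by a simple predictive power P.

    history: array of shape (num_observations, num_characteristics) holding
        historical values of the characteristics Xj(a).
    failure_labels: array of shape (num_observations,) with 1 for observations
        recorded shortly before a historical tool failure and 0 otherwise.
    Returns characteristic indices ordered from most to least predictive,
    together with the computed powers.
    """
    history = np.asarray(history, dtype=float)
    labels = np.asarray(failure_labels, dtype=float)
    powers = np.array([
        abs(np.corrcoef(history[:, j], labels)[0, 1])
        for j in range(history.shape[1])
    ])
    order = np.argsort(powers)[::-1]
    return order, powers

# Example with synthetic data: characteristic 0 drifts before failures.
rng = np.random.default_rng(1)
labels = (rng.random(300) < 0.1).astype(float)
history = rng.normal(size=(300, 5))
history[:, 0] += 2.0 * labels                 # correlated with failures
order, powers = predictive_powers(history, labels)
print(order, powers.round(2))
```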
Predictive powers Pj(a) for various sensor data and/or their statistical characteristics Xj(a) and various sensor values Sa can be communicated to a FI selection module 330 that selects a type of the FI (e.g., a linear function, a polynomial function, an exponential function, etc., or any combination thereof) and parameters of the FI, e.g., weights with which various sensor data and/or their statistical characteristics Xj(a) of one or more selected sensor values Sa are represented in the FI. In one example non-limiting implementation, the FI can be a linear function, constructed by selecting the N most predictive sensor values S1 . . . SN, each represented with one or more (e.g., Ma) most relevant statistical characteristics Xj(a):

FI({Xj(a)}) = Σa=1 . . . N Σj=1 . . . Ma wj(a) δXj(a),
where, e.g., δXj(a) represents a departure of the statistical characteristic Xj(a) from its corresponding optimal value or normal operating range. For example, one sensor value S1 can be represented in the FI by its mean (X1(1)) and variance (X2(1)) whereas another sensor value S2 can be represented by only variance (X1(2)), and so on. In some embodiments, the FI for some tools can be constructed with as few as one statistical characteristic X1(1) of a single sensor value (single sensor type 302). In some embodiments, the FI for some tools can be constructed with tens or even hundreds (or more) of various sensor values Sa, each described by one or more statistical characteristics Xj(a). In some embodiments, the selected FI can be a suitably chosen non-linear function, e.g., an exponential function of the weighted departures δXj(a), or a polynomial function,

FI({Xj(a)}) = Σa Σj wj(a) (δXj(a))^βj(a),

with suitably chosen (integer or non-integer) exponents βj(a), and/or any other non-linear function.
In some embodiments, the weights wj(a) (and/or other parameters of the FI such as the exponents βj(a)) can be selected in view of the predictive powers Pj(a) identified by the statistical analysis module 304, e.g., can be proportional to the predictive powers Pj(a) or can be some other (e.g., non-linear) functions of the predictive powers Pj(a).
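By way of a non-limiting illustration, the following sketch selects the most predictive characteristics and assigns weights proportional to their predictive powers, one of the options mentioned above; the helper name build_fi and the normalization of the weights to a unit sum are assumptions made for the example only.

```python
import numpy as np

def build_fi(powers, top_n, beta=1.0):
    """Construct an FI from the top_n most predictive characteristics.

    The weights are taken proportional to the predictive powers Pj(a) and
    normalized to sum to one. Returns the selected indices, the weights, and
    a callable evaluating FI = sum_j w_j * |dXj|**beta on the departures dXj.
    """
    powers = np.asarray(powers, dtype=float)
    selected = np.argsort(powers)[::-1][:top_n]
    weights = powers[selected] / powers[selected].sum()

    def fi(departures):
        d = np.asarray(departures, dtype=float)[selected]
        return float(np.sum(weights * np.abs(d) ** beta))

    return selected, weights, fi

# Example: keep the 3 most predictive of 5 characteristics.
selected, weights, fi = build_fi(powers=[0.7, 0.1, 0.4, 0.05, 0.3], top_n=3)
print(selected, weights.round(2), fi([1.5, 0.2, 0.8, 0.1, 2.0]))
```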
In some embodiments, various FI parameters 326 (e.g., weights wj(a), exponents βj(a), etc.) can be selected based, at least in part, on SME input 324. For example, if no significant historical sensor data 282 has been collected for a particular sensor of a new type, FI parameters 326 can be set by subject matter experts, e.g., based on mathematical/physical/chemical modeling, and the like. In some embodiments, FI parameters 326 can initially be set using statistical analysis module 304 and subsequently screened (e.g., confirmed or modified) by subject matter experts. In some embodiments, FI selection module 330 can use a machine-learning model, e.g., FCM 220, trained as described above.
After FI parameters 326 are determined during the FI selection process, the resulting FI 340 can be used for evaluation of the tool degradation. In some embodiments, a state threshold selection module 350 can define one or more thresholds TW, TA, etc., indicating when a state of the tool deteriorates sufficiently to warrant sending a notification to an operator of the processing line. More specifically, the range of the FI where FI<TW can correspond to a normal state of the tool where the operator receives no notifications. The range TW≤FI<TA, where the FI reaches or exceeds a warning threshold TW but is below an advanced threshold TA, can correspond to a warning state of the tool where a notification is output to the operator indicating that the tool is approaching the end of its useful life. The range TA≤FI, where the FI reaches or exceeds the advanced threshold TA, can correspond to the state where the tool is about to fail and where another notification may be delivered to the operator. Although this example has three states separated by two thresholds TW, TA, any other number of states/thresholds can be defined in some embodiments. In some embodiments, a failure threshold TF can be defined. The failure threshold may indicate a state of the tool failure, e.g., a state where the tool is expected to stop operating or providing an adequate performance. In some embodiments, the failure threshold TF need not indicate the failure of the tool, but can instead indicate a state where a tool maintenance operation is to be completed, which can include tool cleaning, recharging, recalibration, and/or the like.
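By way of a non-limiting illustration, the state determination based on the thresholds TW, TA (and, optionally, TF) described above can be sketched as follows; the function name tool_state and the particular threshold values in the usage example are hypothetical.

```python
def tool_state(fi_value, t_warning, t_advanced, t_failure=None):
    """Map an FI value to a tool state using the thresholds TW, TA (and TF).

    FI < TW            -> "normal"   (no notification)
    TW <= FI < TA      -> "warning"  (notify: tool approaching end of RUL)
    TA <= FI           -> "advanced" (notify: tool failure is imminent)
    TF <= FI (if set)  -> "failure"  (tool failed or maintenance is due)
    """
    if t_failure is not None and fi_value >= t_failure:
        return "failure"
    if fi_value >= t_advanced:
        return "advanced"
    if fi_value >= t_warning:
        return "warning"
    return "normal"

print(tool_state(0.8, t_warning=1.0, t_advanced=2.0))                    # "normal"
print(tool_state(2.4, t_warning=1.0, t_advanced=2.0, t_failure=3.0))     # "advanced"
```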
In some embodiments, state threshold selection can be based on historical sensor data 282 and the analysis performed by statistical analysis module 304. For example, when at least one prior life cycle of the tool is available, the state thresholds TW, TA, etc., can be set based on the historical data points. For example, threshold TW can be set based on the prior tool failure data, e.g., by identifying a point where the tool has started a departure from its normal operating range. In one example, this point can correspond to a state of the tool where some representative sensor value S (or multiple sensor values) departed from the normal operating range [S1, S2] of the respective sensor value(s) by a certain predetermined value ΔS. In some embodiments, the state thresholds TW, TA, etc., can initially be set using historical sensor data 282 and subsequently confirmed or adjusted using SME input 328. In some embodiments, e.g., when no historical sensor data 282 is available, the state thresholds TW, TA, etc., can be set (at least initially) based solely on SME input 328. The set state thresholds 360 can subsequently be used during runtime monitoring of the tool, as disclosed in more detail below.
Deployment stage 400 can receive runtime sensor data 402-1. Runtime sensor data 402-1 can include one or more sensor values Sa collected and/or monitored by one or more sensors (e.g., sensors 114).
If the warning condition is met, the state of the tool can be determined as warning 430, indicating that the tool's performance is outside the normal operating range. A warning notification 432 can be generated and provided to an operator of the processing line, indicating that the tool is approaching the end of its useful life and may have to be replaced within a certain time (which does not have to be determined yet), or that the tool is approaching some other reference event. While the tool is in the warning state, deployment stage 400 may continue with collecting the runtime sensor data 402-2 and re-computing the failure index. At decision-making block 445, deployment stage 400 can determine if an advanced condition is met, e.g., whether FI<TA (the advanced condition is not met) or FI≥TA (the advanced condition is met).
In some embodiments, an additional function can be computed at block 440, e.g., a cumulative failure index (CFI) function. In some embodiments, the CFI can be an isotonic function of FI 340, such that the CFI tracks various increases of FI 340 but is not reset to a lower FI value even if FI 340 decreases later. For example, the CFI at timestamp tn can be determined based on the CFI value CFI(tn−1) at the previous timestamp tn−1 and a new value of the FI, e.g., CFI(tn) = max[CFI(tn−1), FI(tn)].
Correspondingly, in some embodiments, an isotonic CFI function (or some other suitable cumulative function) can be used instead of FI 340 at the decision-making block, e.g., whether CFI<TA (the advanced condition is not met) or CFI≥TA (the advanced condition is met).
In some embodiments, the advanced condition can involve the slope of CFI 502. For example, a discrete derivative (representing a rate of change) of the CFI 502 can be computed, D(tn)=CFI(tn)−CFI(tn−1), and compared to a threshold derivative DT (which can be set using SME input 328 and/or based on historical sensor data 282, as described above).
Numerous other metrics that are based on the FI and/or the CFI, the derivatives (first and/or higher) of the FI and/or the CFI, and/or any combinations thereof can be used to identify the threshold for the advanced condition.
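By way of a non-limiting illustration, the following sketch computes an isotonic CFI from a series of FI values and evaluates the advanced condition using either the CFI itself or its discrete derivative D(tn); the function names and the threshold values in the usage example are hypothetical.

```python
import numpy as np

def cumulative_fi(fi_series):
    """Isotonic cumulative failure index: CFI(tn) = max(CFI(tn-1), FI(tn))."""
    return np.maximum.accumulate(np.asarray(fi_series, dtype=float))

def advanced_condition(fi_series, t_advanced, d_threshold=None):
    """Check the advanced condition on the CFI and, optionally, its slope.

    Returns True if CFI >= TA at the latest timestamp, or if the discrete
    derivative D(tn) = CFI(tn) - CFI(tn-1) meets or exceeds DT.
    """
    cfi = cumulative_fi(fi_series)
    if cfi[-1] >= t_advanced:
        return True
    if d_threshold is not None and len(cfi) > 1:
        if cfi[-1] - cfi[-2] >= d_threshold:
            return True
    return False

fi = [0.2, 0.5, 0.4, 0.9, 1.6, 1.5]            # FI values at t1..t6
print(cumulative_fi(fi))                        # [0.2 0.5 0.5 0.9 1.6 1.6]
print(advanced_condition(fi, t_advanced=1.5))   # True
```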
If the advanced condition is met, the state of the tool can be determined as advanced 450, indicating that the tool's failure is imminent. An advanced notification 452 can be generated and provided to the operator of the processing line, indicating that the tool is at or near the end of its RUL (or some other TTC for the tool). The deployment stage 400 can then estimate the tool's RUL/TTC. In some embodiments, the RUL/TTC of the tool can be estimated differently depending on the amount of available historical tool failure data. In those instances where at a decision-making block 455 it is determined that the tool failure data is absent (e.g., the deployment stage 400 is evaluating a new tool) or is insufficient (e.g., the amount of historical data is below a threshold), the deployment stage 400 can use an FI-based TTC projection 460. For example, at block 455 the number of historical tool failures N can be compared to a predetermined minimum number of tool failures Nmin, and in the instance N<Nmin, the historical statistics can be deemed insufficient.
FI-based TTC projection 460 can use one or more models, e.g., mathematical models that extrapolate the tool's degradation state into the future and determine the TTC for the tool, e.g., by determining a likely time (or an interval of times) when the projected (extrapolated) CFI(t) is to cross a failure threshold TF.
The obtained regression CFI 604 with the estimated statistics of the regression parameters correctly characterizes the past temporal evolution of the FI, including occurrences of the warning state and/or the advanced state. The regression model fitted to such past data can be used to predict subsequent dynamics of the FI, e.g., to generate a projected CFI 606 for times t>T and to determine a projected distribution of failure times 610, e.g., based on intersections of the family of projected CFI 606 (determined using the regression model) with the failure threshold TF 608.
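By way of a non-limiting illustration, the following sketch fits a simple linear regression to the observed CFI, treats the fitted parameters as Gaussian random variables (using the covariance reported by the fit), and samples projected crossing times of the failure threshold TF; the linear trend, the Gaussian parameter model, and all names are simplifying assumptions made for the example only.

```python
import numpy as np

def project_failure_times(times, cfi, t_failure, num_samples=1000, seed=0):
    """Project when the CFI is likely to cross the failure threshold TF.

    A linear trend is fitted to the observed CFI; uncertainty in the fitted
    slope/intercept is sampled from a Gaussian with the covariance reported
    by the fit, producing a distribution of projected crossing times.
    """
    times = np.asarray(times, dtype=float)
    cfi = np.asarray(cfi, dtype=float)
    coeffs, cov = np.polyfit(times, cfi, deg=1, cov=True)
    rng = np.random.default_rng(seed)
    samples = rng.multivariate_normal(coeffs, cov, size=num_samples)
    slopes, intercepts = samples[:, 0], samples[:, 1]
    valid = slopes > 0                          # only rising trends cross TF
    crossing = (t_failure - intercepts[valid]) / slopes[valid]
    return crossing                             # projected failure times

t = np.arange(10.0)
cfi = 0.15 * t + 0.3 + np.random.default_rng(2).normal(0, 0.02, size=10)
cfi = np.maximum.accumulate(cfi)                # keep the series isotonic
times_to_failure = project_failure_times(t, cfi, t_failure=3.0)
print(np.percentile(times_to_failure, [5, 50, 95]).round(1))
```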
In those instances where the number of past tool failures N is at or above the predetermined minimum number Nmin, the deployment stage 400 can use a supervised RUL prediction 470 instead of (or in addition to) the FI-based RUL projection. In some embodiments, Nmin can be a low number, e.g., Nmin=5, 4, etc., or even Nmin=1.
Numerous other predictions and metrics may be obtained from the distribution PCUR(NREM).
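By way of a non-limiting illustration, the following sketch computes several such metrics from samples of the remaining number of operations; the helper name rul_metrics, the 90% interval, and the minimum-operations threshold are hypothetical values chosen for the example only.

```python
import numpy as np

def rul_metrics(remaining_ops_samples, min_required=200, interval=0.90):
    """Summarize a distribution over the remaining number of operations.

    remaining_ops_samples: samples (e.g., Monte-Carlo draws) of the number of
    operations the tool is projected to support before failure.
    """
    n = np.asarray(remaining_ops_samples, dtype=float)
    lo, hi = np.percentile(n, [(1 - interval) / 2 * 100,
                               (1 + interval) / 2 * 100])
    values, counts = np.unique(np.round(n), return_counts=True)
    return {
        "most_likely": float(values[np.argmax(counts)]),   # mode of the samples
        "average": float(np.mean(n)),
        "interval": (float(lo), float(hi)),                # range w/ given probability
        "p_at_least_min": float(np.mean(n >= min_required)),
    }

samples = np.random.default_rng(3).normal(300, 40, size=5000)
print(rul_metrics(samples))
```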
Tools that fail and/or are replaced due to an impending failure can contribute to a tool failure data 495. For example, a tool that is replaced can be examined and the estimates of the RULs (e.g., generated by FI-based RUL prediction 470) can be verified against the actual state of the replaced tool. The collected tool failure data 495 can be used as the ground truth data for retraining the hybrid preventive maintenance system. More specifically, tool failure data 495 can be used (as indicated schematically with the dashed arrows) to retrain FCM 220, TSDM 230, and/or RPM 240.
The one or more selected statistical characteristics can be based, at least in part, on the plurality of ranked statistical characteristics. For example, the one or more selected statistical characteristics can be user-modified, compared with the plurality of ranked statistical characteristics, in that: (1) at least one statistical characteristic of the one or more selected statistical characteristics is ranked differently compared with the plurality of ranked statistical characteristics, (2) at least one statistical characteristic of the one or more selected statistical characteristics is not included in the plurality of ranked statistical characteristics, or (3) at least one of the plurality of ranked statistical characteristics is not included in the one or more selected statistical characteristics.
At block 860, method 800 can continue with obtaining runtime statistics of sensor data. The runtime statistics of sensor data can include the statistics collected during operations of the device manufacturing system (e.g., during wafer/film/pattern processing). Some runtime statistics can be collected during stoppages of the device manufacturing system, e.g., for maintenance or any other purpose. A set of sensors that provide sensor data can be selected (e.g., by the computing device) in view of a specific tool whose state of deterioration is being monitored. For example, the sensors can provide the statistical characteristics selected at block 850 and used by the FI model.
At block 870, method 800 can continue with computing, using the runtime statistics of sensor data, a time series of FI values, e.g., FI(t1), FI(t2), . . . , FI(tn). Computing the time series of FI values can include computing one or more of a mean, a median, a mode, a variance, a standard deviation, a range, a maximum, a minimum, a skewness, or a kurtosis of one or more quantities of the runtime statistics of the sensor data, e.g., a mean value of a first quantity (e.g., mean value of the temperature of plasma), a mean value of a second quantity (e.g., mean value of plasma density), a variance of the first quantity (e.g., variance of the temperature), a variance of the second quantity (e.g., variance of plasma density), and so on. The number of statistical characteristics used in computing the time series of FI values need not be limited. In some embodiments, the time series of FI values is an isotonic time series (e.g., the CFI, as described above).
At block 880, method 800 can include estimating, using the time series of FI values, one or more projected TTCs of the tool. In some embodiments, method 800 can include generating one or more notifications. Such notifications can be generated responsive to a value of the time series of FI values (e.g., the most recent value) satisfying a respective threshold condition (e.g., meeting or exceeding a warning threshold, advanced threshold, and/or the like).
In some embodiments, method 800 can include, at block 882, generating, using the time series of FI values, a regression model that characterizes temporal evolution of the time series of FI values. In some embodiments, the regression model can have one or more hidden stochastic parameters whose statistics are modeled with suitable distributions, e.g., the Gaussian distribution.
At block 884, method 800 can continue with applying the regression model to generate a distribution of one or more projected TTCs of the tool. In some embodiments, obtaining the distribution of the projected TTCs of the tool can be based on the expected probabilities of the tool reaching a failure state. At block 886, method 800 can continue with estimating one or more metrics associated with TTC for the tool. For example, the one or more metrics can include some or all of: (1) a most likely number of operations that the current tool is projected to support before tool failure, (2) an average number of operations that the current tool is projected to support before tool failure, (3) a range of the number of operations that the current tool is projected to support with a first probability, (4) a minimum number of operations that the current tool is projected to process with a second probability, (5) a third probability that the current tool is projected to support at least a threshold minimum number of operations, and/or other suitable metrics.
In some embodiments, method 800 can include obtaining historical FI data associated with the plurality of historical failures of the tool, e.g., responsive to a number of a plurality of historical failures of the tool being above a threshold number of failures. Method 800 can then include obtaining, using the historical FI data, one or more predicted RULs for the current instance of the tool. Method 800 may also include displaying, on a user interface, a first representation of the one or more projected TTCs for the current instance of the tool and/or a second representation of the one or more predicted RULs for the current instance of the tool. As disclosed above, the projected TTCs can be based on the time series of FI values for the current instance of the tool, whereas the predicted RULs can be based on the historical FI data for the tool. Any of the first representation and/or the second representation can include one, some, or all of the metrics referenced in conjunction with block 886 above.
At block 930, method 900 can include determining that a value (e.g., the latest value or some other value) of the plurality of FI values meets a threshold (e.g., a warning threshold or an advanced threshold). At block 940, method 900 can include obtaining historical FI data associated with the plurality of historical failures of the tool. In some embodiments, obtaining the historical data can be responsive to the number of the plurality of historical failures of the tool being above (or at or above) a threshold number of tool failures. In some embodiments, method 900 can include generating one or more notifications. Such notifications can be generated responsive to a value of the time series of FI values (e.g., the most recent value) satisfying a respective threshold condition (e.g., meeting or exceeding a warning threshold, advanced threshold, and/or the like).
At block 950, method 900 can include obtaining, using the plurality of FI values and the historical FI data, one or more predicted RULs for the tool. In some embodiments, block 950 can include operations of a trained machine-learning model, e.g., RPM 240 described above.
At block 960, method 900 can include displaying, on a user interface, a representation of the one or more predicted RULs. In some embodiments, the representation of the one or more predicted RULs can include one, some, or all of the following: the most likely number of operations that the current tool is predicted to support before tool failure, an average number of operations that the current tool is predicted to support before tool failure, a range of the number of operations that the current tool is predicted to support with a first probability, a minimum number of operations that the current tool is predicted to process with a second probability, or a third probability that the current tool is predicted to support at least a threshold minimum number of operations.
The FI model can include an FI function. An input into the FI function can include the run-time sensor data. In some implementations, the FI function can include a plurality of weighted statistical characteristics of the run-time sensor data. In some implementations, the FI model can further include one or more FI threshold values for the FI function. The one or more FI threshold values can be associated with a present condition of the tool (e.g., a warning state) or with a projected condition of the tool (e.g., an advanced deterioration state expected to occur at a certain time in the future).
In some implementations, generating the FI model can include operations of the FI generation stage described above.
In some embodiments, applying the FI model can include the additional operations described below.
At block 1036, method 1000 can include computing, using the run-time sensor data collected for a second instance of the tool, one or more FI function values for the second instance of the tool. At block 1038, method 1000 can include estimating, using the one or more FI function values computed for the second instance of the tool, a quality of a maintenance operation performed for the second instance of the tool. For example, the processing device performing method 1000 can determine that the FI function values computed after the maintenance operation indicate that performance of the tool has not improved to a degree expected for the maintenance operation. The processing device can then output a notification to an operator of the manufacturing system that the maintenance operation was not successful.
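By way of a non-limiting illustration, a crude sketch of such a maintenance-quality check might compare post-maintenance FI values with the level expected for a freshly maintained tool; the function name maintenance_successful, the expected level, and the margin are hypothetical values chosen for the example only.

```python
import numpy as np

def maintenance_successful(fi_after, expected_post_maintenance_fi, margin=0.1):
    """Crude check of maintenance quality from post-maintenance FI values.

    If the FI values observed after the maintenance operation remain well
    above the level expected for a freshly maintained tool, the maintenance
    is flagged as unsuccessful and a notification can be issued.
    """
    observed = float(np.mean(np.asarray(fi_after, dtype=float)))
    return observed <= expected_post_maintenance_fi + margin

if not maintenance_successful(fi_after=[0.9, 1.1, 1.0],
                              expected_post_maintenance_fi=0.3):
    print("Notify operator: maintenance operation may not have been successful")
```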
At block 1040, method 1000 can include updating the FI model. In some implementations, updating the FI model can be responsive to one or more tool failures of the one or more instances of the tool. In some implementations, updating the FI model can include modifying (1) a dependence of the FI function on the run-time sensor data, and/or (2) at least one FI threshold value of the one or more FI threshold values. For example, one or more tool failures can indicate that the existing FI model underestimates or overestimates degradation of the tool. Furthermore, one or more tool failures can indicate that the FI threshold values (e.g., warning, advanced, failed state, etc.) underestimate or overestimate degradation of the tool. Correspondingly, the update to the FI model can change how the FI function depends on particular run-time sensor data, remove dependence on particular run-time sensor data, add dependence on other run-time sensor data, adjust the FI threshold values, and/or the like. In some implementations, updating the FI model is responsive to a first failure of the one or more instances of the tool.
At block 1050, method 1000 can include collecting additional run-time sensor data for one or more additional instances of the tool, e.g., after the FI model update. At block 1060, method 1000 can continue with applying the updated FI model to the additional run-time sensor data to identify one or more conditions of the one or more additional instances of the tool. In some implementations, method 1000 can include generating one or more notifications to a user (block 1070). The notifications can be generated responsive to a value of the time series of FI function values satisfying a respective threshold condition of one or more threshold conditions.
Example processing device 1100 can be connected to other processing devices in a LAN, an intranet, an extranet, and/or the Internet. The processing device 1100 can be a personal computer (PC), a set-top box (STB), a server, a network router, switch or bridge, or any device capable of executing a set of instructions (sequential or otherwise) that specify actions to be taken by that device. Further, while only a single example processing device is illustrated, the term “processing device” shall also be taken to include any collection of processing devices (e.g., computers) that individually or jointly execute a set (or multiple sets) of instructions to perform any one or more of the methods discussed herein.
Example processing device 1100 can include a processor 1102 (e.g., a CPU), a main memory 1104 (e.g., read-only memory (ROM), flash memory, dynamic random access memory (DRAM) such as synchronous DRAM (SDRAM), etc.), a static memory 1106 (e.g., flash memory, static random access memory (SRAM), etc.), and a secondary memory (e.g., a data storage device 1118), which can communicate with each other via a bus 1130.
Processor 1102 represents one or more general-purpose processing devices such as a microprocessor, central processing unit, or the like. More particularly, processor 1102 can be a complex instruction set computing (CISC) microprocessor, reduced instruction set computing (RISC) microprocessor, very long instruction word (VLIW) microprocessor, processor implementing other instruction sets, or processors implementing a combination of instruction sets. Processor 1102 can include processing logic 1126 and can be one or more special-purpose processing devices such as an application specific integrated circuit (ASIC), a field programmable gate array (FPGA), a digital signal processor (DSP), network processor, or the like. In accordance with one or more aspects of the present disclosure, processor 1102 can be configured to execute instructions implementing methods 800-1000 of preventive maintenance and tool state monitoring in manufacturing systems.
Example processing device 1100 can further comprise a network interface device 1108, which can be communicatively coupled to a network 1120. Example processing device 1100 can further comprise a video display 1110 (e.g., a liquid crystal display (LCD), a touch screen, or a cathode ray tube (CRT)), an alphanumeric input device 1112 (e.g., a keyboard), an input control device 1114 (e.g., a cursor control device, a touch-screen control device, a mouse), and a signal generation device 1116 (e.g., an acoustic speaker).
Data storage device 1118 can include a computer-readable storage medium (or, more specifically, a non-transitory computer-readable storage medium) 1128 on which is stored one or more sets of executable instructions 1122. In accordance with one or more aspects of the present disclosure, executable instructions 1122 can comprise executable instructions implementing methods 800-1000 of preventive maintenance and tool state monitoring in manufacturing systems.
Executable instructions 1122 can also reside, completely or at least partially, within main memory 1104 and/or within processor 1102 during execution thereof by example processing device 1100, main memory 1104 and processor 1102 also constituting computer-readable storage media. Executable instructions 1122 can further be transmitted or received over a network via network interface device 1108.
While the computer-readable storage medium 1128 is shown as a single medium, the term “computer-readable storage medium” should be taken to include a single medium or multiple media that store the one or more sets of executable instructions 1122.
It should be understood that the above description is intended to be illustrative, and not restrictive. Many other implementation examples will be apparent to those of skill in the art upon reading and understanding the above description. Although the present disclosure describes specific examples, it will be recognized that the systems and methods of the present disclosure are not limited to the examples described herein, but can be practiced with modifications within the scope of the appended claims. Accordingly, the specification and drawings are to be regarded in an illustrative sense rather than a restrictive sense. The scope of the present disclosure should, therefore, be determined with reference to the appended claims, along with the full scope of equivalents to which such claims are entitled.
The implementations of methods, hardware, software, firmware or code set forth above can be implemented via instructions or code stored on a machine-accessible, machine readable, computer accessible, or computer readable medium which are executable by a processing element. “Memory” includes any mechanism that provides (i.e., stores and/or transmits) information in a form readable by a machine, such as a computer or electronic system. For example, “memory” includes random-access memory (RAM), such as static RAM (SRAM) or dynamic RAM (DRAM); ROM; magnetic or optical storage medium; flash memory devices; electrical storage devices; optical storage devices; acoustical storage devices, and any type of tangible machine-readable medium suitable for storing or transmitting electronic instructions or information in a form readable by a machine (e.g., a computer).
Reference throughout this specification to “one implementation” or “an implementation” means that a particular feature, structure, or characteristic described in connection with the implementation is included in at least one implementation of the disclosure. Thus, the appearances of the phrases “in one implementation” or “in an implementation” in various places throughout this specification are not necessarily all referring to the same implementation. Furthermore, the particular features, structures, or characteristics can be combined in any suitable manner in one or more implementations.
In the foregoing specification, a detailed description has been given with reference to specific exemplary implementations. It will, however, be evident that various modifications and changes can be made thereto without departing from the broader spirit and scope of the disclosure as set forth in the appended claims. The specification and drawings are, accordingly, to be regarded in an illustrative sense rather than a restrictive sense. Furthermore, the foregoing use of “implementation,” “an implementation,” and/or other exemplary language does not necessarily refer to the same implementation or the same example, but can refer to different and distinct implementations, as well as potentially the same implementation.
The words “example” or “exemplary” are used herein to mean serving as an example, instance, or illustration. Any aspect or design described herein as “example” or “exemplary” is not necessarily to be construed as preferred or advantageous over other aspects or designs. Rather, use of the words “example” or “exemplary” is intended to present concepts in a concrete fashion. As used in this application, the term “or” is intended to mean an inclusive “or” rather than an exclusive “or.” That is, unless specified otherwise, or clear from context, “X includes A or B” is intended to mean any of the natural inclusive permutations. That is, if X includes A; X includes B; or X includes both A and B, then “X includes A or B” is satisfied under any of the foregoing instances. In addition, the articles “a” and “an” as used in this application and the appended claims should generally be construed to mean “one or more” unless specified otherwise or clear from context to be directed to a singular form. Moreover, use of the term “an implementation” or “one implementation” throughout is not intended to mean the same implementation unless described as such. Also, the terms “first,” “second,” “third,” “fourth,” etc. as used herein are meant as labels to distinguish among different elements and may not necessarily have an ordinal meaning according to their numerical designation.