This application claims priority under 35 U.S.C. § 119 to patent application no. DE 10 2023 207 670.5, filed on Aug. 10, 2023 in Germany, the disclosure of which is incorporated herein by reference in its entirety.
The disclosure relates to a method for teaching a machine learning system to predict failure probabilities of chips on a wafer, and to a method for predicting the failure probabilities of the chips with the taught machine learning system.
It is generally known that machine learning (ML) methods can be suitable for predicting defects in components during production.
Currently, predicting the failure probability or defect probability of an individual semiconductor chip during final testing based on wafer-level test measurements is not possible due to the loss of traceability between processes. In particular, the link between a single chip for which wafer-level measurements have been performed and the associated final test result is lost after the wafer has been cut, making the training of supervised models impossible.
It is therefore an object of the disclosure to provide a prediction of the probability of an individual chip being defective based on test measurements at wafer level.
By predicting the failure probability, chips with a high failure probability can be removed early in the manufacturing process or, depending on the failure probabilities of all chips in a batch, an intelligent combination of chips can be determined so that the end product has a low failure probability.
As a result, the disclosure can reduce waste, rework and the additional costs associated with manufacturing and shipping defective products. By detecting and eliminating defects at an early stage, manufacturers can improve product quality and reduce the reject rate.
In a first aspect, the disclosure relates to a method for teaching a machine learning system to predict failure probabilities of the chips on a wafer during final testing or, in particular, later during operation of the chips.
The method begins by providing a training data set comprising wafer-level test measurements and an associated final test yield of a plurality of wafers. The final test yield can be understood as the quotient of the number of chips on the wafer that have passed the final test and the total number of chips on the wafer tested in the final test.
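As a minimal illustration of this quotient (an assumption-laden sketch, not part of the claimed method), the final test yield of a wafer could be computed from hypothetical per-chip pass/fail results as follows:

```python
def final_test_yield(pass_flags):
    """Final test yield: chips that passed the final test divided by the
    total number of chips tested in the final test.

    pass_flags: iterable of booleans, one entry per tested chip
    (hypothetical input format, for illustration only).
    """
    flags = list(pass_flags)
    return sum(flags) / len(flags)

# Example: 94 of 100 tested chips pass the final test -> yield = 0.94
print(final_test_yield([True] * 94 + [False] * 6))
```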
The following steps are then repeated several times until a predefined termination criterion is met: wafer-level test measurements of the wafers of a lot are drawn from the training data set and processed as input by the machine learning system in order to obtain a failure probability for each chip; the failure probabilities are aggregated into a single lot yield; and the aggregated lot yield is compared with the associated final test yield in order to adjust the parameters of the machine learning system.
What is surprising here is that this simplified training task of yield prediction is sufficient for the machine learning system to learn during training which chips will have a high or low failure probability, and that the predicted failure probabilities essentially correspond to the actual failure probabilities on a validation data set.
The disclosure of the first aspect can be used to mark defective chips at an early stage of the manufacturing process, so that the chip is not further processed or is removed after the dicing process (i.e. cutting the individual chips out of the wafer). Another application is using the failure probability to perform smart sorting, in which chips with a high failure probability are packaged together. This reduces the risk of multiple good chips being packaged together with a single bad chip, which would result in the entire package being discarded.
It is also conceivable within the scope of the disclosure that the assignment of the components to the functional units comprises sorting them into at least two or three different classes according to the probability of a failure or defect occurring.
Furthermore, it is advantageous within the scope of the disclosure if the assignment combines the components into the functional units depending on their class, so that preferably each of the functional units contains only components of the same class. This increases the likelihood that a functional unit which is sorted out and, in particular, disposed of comprises more than one defective component. Alternatively or additionally, the assignment of the components to the functional units can be carried out in such a way that the components are combined into the functional units depending on their failure probability, so that preferably the probability is maximized that, in the case of a failed functional unit, more than one defective component in the functional unit is responsible for this failure.
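A minimal sketch of such a class-based assignment, assuming per-chip failure probabilities are already available; the class thresholds, class names and the fixed number of components per functional unit are illustrative assumptions, not taken from the disclosure:

```python
def assign_to_functional_units(failure_probs, unit_size=4, thresholds=(0.05, 0.2)):
    """Sort components into classes by failure probability and combine
    components of the same class into functional units.

    failure_probs: per-component failure probabilities (list of floats)
    unit_size: components per functional unit (illustrative assumption)
    thresholds: class boundaries for low/medium/high (illustrative assumption)
    """
    low, high = thresholds
    classes = {"low": [], "medium": [], "high": []}
    for idx, p in enumerate(failure_probs):
        if p < low:
            classes["low"].append(idx)
        elif p < high:
            classes["medium"].append(idx)
        else:
            classes["high"].append(idx)

    units = []
    for label, members in classes.items():
        # Only components of the same class end up in the same functional unit.
        for i in range(0, len(members), unit_size):
            units.append((label, members[i:i + unit_size]))
    return units
```

Grouping by class in this way concentrates high-risk components in a few functional units, so that a failed unit is more likely to contain more than one defective component.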
Another object of the disclosure is a computer program, in particular a computer program product, comprising instructions which, when the computer program is executed by a computer, cause the computer to carry out the method according to the disclosure. The computer program according to the disclosure thus brings with it the same advantages as have been described in detail with reference to a method according to the disclosure.
The disclosure also relates to a device for data processing which is configured to carry out the method according to the disclosure. The device can be a computer, for example, that executes the computer program according to the disclosure. The computer can comprise at least one processor for executing the computer program. A non-volatile data memory can be provided as well, in which the computer program can be stored and from which the computer program can be read by the processor for execution.
An object of the disclosure can also be a computer-readable storage medium comprising the computer program according to the disclosure. The storage medium is configured as a data memory such as a hard drive and/or a non-volatile memory and/or a memory card, for example. The storage medium can, for example, be integrated into the computer.
In addition, the method according to the disclosure can also be designed as a computer-implemented method.
Further advantages, features and details of the disclosure will emerge from the following description, in which embodiment examples of the disclosure are described in detail with reference to the drawings. The features mentioned in the claims and in the description can each be essential to the disclosure individually or in any combination. Shown is:
The FIGURE is a schematic visualization of a method, a device, a storage medium and a computer program according to embodiment examples of the disclosure.
In the manufacture of semiconductor wafers, the wafers are tested after various production steps. Before the wafer is broken down into individual chips and the chips are packaged into the end product, the “finished” wafers are tested in what is known as wafer-level testing or EWS (Electrical Wafer Sorting). In this step, the position of the individual chips on the wafer is still known.
After dicing and packaging, the chips are tested again in the so-called final test. Since the position of the chips is lost after the dicing process, it is no longer possible to trace the chips back to their original coordinates on the wafer, which makes the training of supervised machine learning models impossible. Even though the individual position of each chip is lost, the information as to which wafers as a whole are part of a test lot is still available.
In the following, a method is proposed that enables the prediction of the failure probability of individual chips in the final test based on wafer-level measurements.
For this purpose, a neural network architecture is used that receives a series of wafer-level measurements of a wafer as input and predicts a failure probability for each chip on the wafer (e.g. by using a sigmoid activation function in the last layer). For each matching set of wafers and final test lots, the average of the chip predictions can be calculated to obtain a predicted yield of the chips on the wafer.
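A minimal sketch of such an architecture, assuming PyTorch; the feature dimension, layer sizes and the use of a single hidden layer are illustrative assumptions, not taken from the disclosure:

```python
import torch
import torch.nn as nn

class ChipFailureNet(nn.Module):
    """Maps per-chip wafer-level measurements to a per-chip failure probability."""

    def __init__(self, num_features: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(num_features, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 1),
            nn.Sigmoid(),  # sigmoid in the last layer -> probability in [0, 1]
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (num_chips, num_features) wafer-level measurements of one wafer
        return self.net(x).squeeze(-1)  # (num_chips,) failure probabilities

model = ChipFailureNet(num_features=16)
measurements = torch.randn(500, 16)      # dummy measurements: 500 chips, 16 features
p_fail = model(measurements)             # one failure probability per chip
predicted_yield = (1.0 - p_fail).mean()  # predicted yield as mean pass probability
```

Here the predicted yield is taken as the mean of the per-chip pass probabilities (1 minus the failure probability); whether the network outputs pass or failure probabilities is an implementation choice.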
For this purpose, the neural network is trained so that it can predict a yield of chips after the final test. The training is carried out as shown in the FIGURE.
According to embodiment examples of the disclosure, the FIGURE illustrates a method 100 for teaching a machine learning system, in particular a neural network, to predict failure probabilities of chips on a wafer. Also shown is a computer program 20, a storage medium 40 and a device 10 for data processing according to embodiment variants of the disclosure.
The method 100 begins by providing 101 a training data set comprising wafer-level test measurements at the chip level and an associated final test yield of a plurality of wafers.
The subsequent steps are then carried out several times until a predefined termination criterion is met:
Wafer-level test measurements of the wafers of a lot are drawn from the training data set. The drawn wafer-level measurements are processed as input by the machine learning system, in particular propagated through the neural network, to obtain a failure probability for each chip.
All failure probabilities are then aggregated into a single lot yield (e.g. by using the average). The aggregated lot yield is compared with the associated final test yield from the training data set, and the parameters of the machine learning system are adjusted accordingly.
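A minimal training-loop sketch for these repeated steps, reusing the hypothetical ChipFailureNet from the sketch above; the mean-squared-error loss, the Adam optimizer and a fixed number of epochs as termination criterion are illustrative assumptions:

```python
import torch
import torch.nn as nn

def train(model: nn.Module, training_lots, epochs: int = 10, lr: float = 1e-3):
    """training_lots: list of (measurements, final_test_yield) pairs, where
    measurements is a (num_chips, num_features) tensor for the wafers of one
    lot and final_test_yield is the observed yield of that lot in [0, 1]."""
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    for _ in range(epochs):  # termination criterion: fixed epoch count (assumption)
        for measurements, final_test_yield in training_lots:
            p_fail = model(measurements)             # failure probability per chip
            predicted_yield = (1.0 - p_fail).mean()  # aggregate to a single lot yield
            target = torch.tensor(final_test_yield)  # yield observed in the final test
            loss = loss_fn(predicted_yield, target)  # compare predicted and observed yield
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
    return model
```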
After training, the neural network can be used to predict a failure probability for each chip in the final test.
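For example, continuing the sketches above with dummy data in place of real wafer-level measurements (the 0.5 decision threshold is an illustrative assumption):

```python
model = train(ChipFailureNet(num_features=16),
              training_lots=[(torch.randn(500, 16), 0.94)])  # one dummy lot
new_wafer = torch.randn(500, 16)   # wafer-level measurements of a new wafer (dummy)
with torch.no_grad():
    failure_probs = model(new_wafer)               # one failure probability per chip
risky = (failure_probs > 0.5).nonzero().flatten()  # candidate chips to mark or sort out
```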
Number | Date | Country | Kind
---|---|---|---
10 2023 207 670.5 | Aug 2023 | DE | national