The present technology relates to a polishing apparatus and a program.
A polishing apparatus for polishing a substrate (for example, a wafer) is known. For example, as disclosed in Japanese patent publication No. 2017-76779, there is a known technique for stopping polishing by detecting, from a signal related to frictional force during polishing, that a surface of an underlayer is exposed and that the initial unevenness is flattened. This detection is also referred to as end point detection. In this detection, whether the signal waveform satisfies a predetermined condition is determined in real time in order to determine the end point.
However, when the conventional end point detection method is used, there is a problem that the timings of the end point detection differ between substrates, so that the thicknesses (also referred to as residual film thicknesses) of the remaining films (also referred to as residual films) of the substrates are not constant.
In the conventional end point detection method, it is detected whether a simple numerical value (for example, inclination) characterizing a signal waveform related to a frictional force in polishing satisfies a predetermined condition, and predetermined additional polishing is performed after the detection. In actual polishing, for example, the polishing rates change due to the wear of a polishing pad, and the polishing profiles of the substrates are not always constant. In order to make the residual film thickness constant in accordance with the polishing situation (or state) that changes as described above, it has been necessary to establish a new end point detection method. In addition, in a case where the polishing amount or the residual film amount during polishing deviates from a predetermined condition, it is desirable that polishing can be performed so as to achieve a target polishing amount without increasing the polishing time, for example, by changing a polishing condition (for example, polishing pressure). In any case, even if the situation of polishing changes, it has been desired to estimate a parameter (for example, a polishing amount or a residual film amount, a polishing end point probability, and remaining polishing time or additional polishing time from the end point detection timing) at a target time point during polishing.
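The conventional approach described above can be illustrated by the following minimal Python sketch. The function name `detect_endpoint_by_slope`, the window length, the threshold, and the sample torque values are all hypothetical and are not taken from the cited publication; the sketch only shows the general idea of testing a single numerical value (here, an inclination) against a predetermined condition.

```python
def detect_endpoint_by_slope(signal, window, slope_threshold):
    """Return the first index at which the mean slope over `window` samples
    is at or below `slope_threshold`, or None if that never happens."""
    for i in range(window, len(signal)):
        slope = (signal[i] - signal[i - window]) / window
        if slope <= slope_threshold:
            return i
    return None


# Hypothetical friction-related signal (for example, a table motor torque).
torque = [5.0, 5.0, 5.0, 4.9, 4.5, 4.0, 3.6, 3.5, 3.5]
endpoint_index = detect_endpoint_by_slope(torque, window=2, slope_threshold=-0.2)

# Conventional flow: a fixed, predetermined additional polishing follows.
FIXED_ADDITIONAL_SAMPLES = 3  # assumed constant, independent of pad wear
stop_index = endpoint_index + FIXED_ADDITIONAL_SAMPLES
```

Because the additional polishing is fixed, a change in the polishing rate shifts the residual film thickness, which is the problem that the present technology addresses.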
The present technology has been made in view of the above problems, and it is desirable to provide a polishing apparatus and a program capable of estimating a parameter at a target time point during polishing even when a polishing situation changes.
A polishing apparatus of one embodiment comprises: a generation unit configured to generate time-series data of a feature value up to a target time point by using data regarding a frictional force between a polishing member and a target substrate up to the target time point during polishing or temperature measurement data of the polishing member or the target substrate; and a prediction unit configured to input at least the time-series data of the feature value generated by the generation unit to a machine learning model trained with a training data set including, as an input, time-series data of the feature value up to a specific time point during polishing of another substrate, and as an output, a polishing amount or a residual film amount at the specific time point, or time-series data of the polishing amount or the residual film amount up to the specific time point during polishing, the polishing amount or the residual film amount being predicted using at least a film thickness measured after polishing of the other substrate, and to output a predicted value of a polishing amount or a residual film amount at the target time point during polishing of the target substrate.
A polishing apparatus of one embodiment comprises: a generation unit configured to generate time-series data of a feature value up to a target time point by using data regarding a frictional force between a polishing member and a target substrate up to the target time point during polishing or temperature measurement data of the polishing member or the target substrate; a prediction unit configured to input at least the time-series data of the feature value generated by the generation unit to a machine learning model trained with a training data set that includes, as an input, time-series data of the feature value up to a specific time point during polishing of another substrate, and as an output, a polishing end point probability at the specific time point during polishing of the other substrate or time-series data of the polishing end point probability up to the specific time point, and to output a predicted value of the polishing end point probability at the target time point of the target substrate; and a determination unit configured to determine whether or not a polishing end point has been reached by using the predicted value.
A polishing apparatus of one embodiment comprises: a generation unit configured to generate time-series data of a feature value up to a target time point by using data regarding a frictional force between a polishing member and a target substrate up to the target time point during polishing or temperature measurement data of the polishing member or the target substrate; a prediction unit configured to input at least the time-series data of the feature value generated by the generation unit to a machine learning model trained with a training data set including, as an input, time-series data of the feature value up to a specific time point during polishing of another substrate, and as an output, a remaining polishing time at the specific time point or an additional polishing time from an end point detection timing, or time-series data of the remaining polishing time up to the specific time point or the additional polishing time from the end point detection timing, the remaining polishing time or the additional polishing time being determined such that a remaining film thickness or a polishing amount of the other substrate becomes a target value, and to output a predicted value of the remaining polishing time or the additional polishing time from an end point detection timing of the target substrate; and a determination unit configured to determine whether or not a polishing end point has been reached by using the predicted value.
A program of one embodiment causes a computer to function as: a generation unit configured to generate time-series data of a feature value up to a target time point by using data regarding a frictional force between a polishing member and a target substrate up to the target time point during polishing or temperature measurement data of the polishing member or the target substrate; and a prediction unit configured to input at least the time-series data of the feature value generated by the generation unit to a machine learning model trained with a training data set including, as an input, time-series data of the feature value up to a specific time point during polishing of another substrate, and as an output, a polishing amount or a residual film amount at the specific time point during polishing, or time-series data of the polishing amount or the residual film amount up to the specific time point, the polishing amount or the residual film amount being predicted by using at least a film thickness measured after polishing of the other substrate, and to output a predicted value of the polishing amount or the residual film amount at the target time point during polishing of the target substrate.
A program of one embodiment causes a computer to function as: a generation unit configured to generate time-series data of a feature value up to a target time point by using data regarding a frictional force between a polishing member and a target substrate up to the target time point during polishing or temperature measurement data of the polishing member or the target substrate; and a prediction unit configured to input at least the time-series data of the feature value generated by the generation unit to a machine learning model trained with a training data set that includes, as an input, time-series data of the feature value up to a specific time point during polishing of another substrate, and as an output, a polishing end point probability at the specific time point or time-series data of the polishing end point probability up to the specific time point during polishing of the other substrate, and to output a predicted value of the polishing end point probability at the target time point.
A program of one embodiment causes a computer to function as: a generation unit configured to generate time-series data of a feature value up to a target time point by using data regarding a frictional force between a polishing member and a target substrate up to the target time point during polishing or temperature measurement data of the polishing member or the target substrate; and a prediction unit configured to input at least the time-series data of the feature value generated by the generation unit to a machine learning model trained with a training data set including, as an input, time-series data of the feature value up to a specific time point during polishing of another substrate, and as an output, a remaining polishing time at the specific time point or an additional polishing time from an end point detection timing, or time-series data of the remaining polishing time up to the specific time point or the additional polishing time from the end point detection timing, the remaining polishing time or the additional polishing time being determined such that a remaining film thickness or a polishing amount of the other substrate becomes a target value, and to output a predicted value of the remaining polishing time or the additional polishing time from the end point detection timing of the target substrate.
An information processing system of one embodiment comprises: a generation unit configured to generate time-series data of a feature value up to a target time point by using data regarding a frictional force between a polishing member and a target substrate up to the target time point during polishing or temperature measurement data of the polishing member or the target substrate; and a prediction unit configured to input at least the time-series data of the feature value generated by the generation unit to a machine learning model trained with a training data set including, as an input, time-series data of the feature value up to a specific time point during polishing of another substrate, and as an output, a polishing amount or a residual film amount at the specific time point, or time-series data of the polishing amount or the residual film amount up to the specific time point during polishing, the polishing amount or the residual film amount being predicted using at least a film thickness measured after polishing of the other substrate, and to output a predicted value of a polishing amount or a residual film amount at the target time point during polishing of the target substrate.
An information processing system of one embodiment comprises: a generation unit configured to generate time-series data of a feature value up to a target time point by using data regarding a frictional force between a polishing member and a target substrate up to the target time point during polishing or temperature measurement data of the polishing member or the target substrate; a prediction unit configured to input at least the time-series data of the feature value generated by the generation unit to a machine learning model trained with a training data set that includes, as an input, time-series data of the feature value up to a specific time point during polishing of another substrate, and as an output, a polishing end point probability at the specific time point during polishing of the other substrate or time-series data of the polishing end point probability up to the specific time point, and to output a predicted value of the polishing end point probability at the target time point of the target substrate; and a determination unit configured to determine whether or not a polishing end point has been reached by using the predicted value.
An information processing system of one embodiment comprises: a generation unit configured to generate time-series data of a feature value up to a target time point by using data regarding a frictional force between a polishing member and a target substrate up to the target time point during polishing or temperature measurement data of the polishing member or the target substrate; a prediction unit configured to input at least the time-series data of the feature value generated by the generation unit to a machine learning model trained with a training data set including, as an input, time-series data of the feature value up to a specific time point during polishing of another substrate, and as an output, a remaining polishing time at the specific time point or an additional polishing time from an end point detection timing, or time-series data of the remaining polishing time up to the specific time point or the additional polishing time from the end point detection timing, the remaining polishing time or the additional polishing time being determined such that a remaining film thickness or a polishing amount of the other substrate becomes a target value, and to output a predicted value of the remaining polishing time or the additional polishing time from an end point detection timing of the target substrate; and a determination unit configured to determine whether or not a polishing end point has been reached by using the predicted value.
A substrate polishing method of one embodiment comprises: a generation step of generating time-series data of a feature value up to a target time point by using data regarding a frictional force between a polishing member and a target substrate up to the target time point during polishing or temperature measurement data of the polishing member or the target substrate; and an estimation step of inputting at least the time-series data of the feature value generated in the generation step to a machine learning model trained with a training data set including, as an input, time-series data of the feature value up to a specific time point during polishing of another substrate, and as an output, a polishing amount or a residual film amount at the specific time point, or time-series data of the polishing amount or the residual film amount up to the specific time point during polishing, the polishing amount or the residual film amount being predicted using at least a film thickness measured after polishing of the other substrate, and outputting a predicted value of a polishing amount or a residual film amount at the target time point during polishing of the target substrate.
A substrate polishing method of one embodiment comprises: a generation step of generating time-series data of a feature value up to a target time point by using data regarding a frictional force between a polishing member and a target substrate up to the target time point during polishing or temperature measurement data of the polishing member or the target substrate; an estimation step of inputting at least the time-series data of the feature value generated in the generation step to a machine learning model trained with a training data set that includes, as an input, time-series data of the feature value up to a specific time point during polishing of another substrate, and as an output, a polishing end point probability at the specific time point during polishing of the other substrate or time-series data of the polishing end point probability up to the specific time point, and outputting a predicted value of the polishing end point probability at the target time point of the target substrate; and a determination step of determining whether or not a polishing end point has been reached by using the predicted value.
A substrate polishing method of one embodiment comprises: a generation step of generating time-series data of a feature value up to a target time point by using data regarding a frictional force between a polishing member and a target substrate up to the target time point during polishing or temperature measurement data of the polishing member or the target substrate; an estimation step of inputting at least the time-series data of the feature value generated in the generation step to a machine learning model trained with a training data set including, as an input, time-series data of the feature value up to a specific time point during polishing of another substrate, and as an output, a remaining polishing time at the specific time point or an additional polishing time from an end point detection timing, or time-series data of the remaining polishing time up to the specific time point or the additional polishing time from the end point detection timing, the remaining polishing time or the additional polishing time being determined such that a remaining film thickness or a polishing amount of the other substrate becomes a target value, and outputting a predicted value of the remaining polishing time or the additional polishing time from an end point detection timing of the target substrate; and a determination step of determining whether or not a polishing end point has been reached by using the predicted value.
Hereinafter, each embodiment of the present invention will be described with reference to the drawings. However, unnecessarily detailed description may be omitted. For example, a detailed description of a well-known matter and a repeated description of substantially the same configuration may be omitted. This is to avoid unnecessary redundancy in the following description and to facilitate understanding by those skilled in the art.
A polishing apparatus according to a first aspect of the present technology comprises: a generation unit configured to generate time-series data of a feature value up to a target time point by using data regarding a frictional force between a polishing member and a target substrate up to the target time point during polishing or temperature measurement data of the polishing member or the target substrate; and a prediction unit configured to input at least the time-series data of the feature value generated by the generation unit to a machine learning model trained with a training data set including, as an input, time-series data of the feature value up to a specific time point during polishing of another substrate, and as an output, a polishing amount or a residual film amount at the specific time point, or time-series data of the polishing amount or the residual film amount up to the specific time point during polishing, the polishing amount or the residual film amount being predicted using at least a film thickness measured after polishing of the other substrate, and to output a predicted value of a polishing amount or a residual film amount at the target time point during polishing of the target substrate.
With this configuration, the relationship between a feature value related to a change in frictional force or temperature during polishing and the polishing amount or the residual film amount resulting from polishing is learned, and the polishing amount or the residual film amount during polishing of a new substrate is predicted using the trained machine learning model. Through this learning, the trained machine learning model reflects the influence of consumable members such as the polishing pad and the non-uniformity of polishing. Therefore, the polishing amount or the residual film amount during polishing of a new substrate can be estimated in consideration of these influences. By using the predicted value for detecting the polishing end point of the target substrate, it is possible to realize end point detection capable of suppressing the difference in residual film thickness between substrates even if the polishing situation changes.
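The flow of the first aspect may be sketched as follows. This is only an illustration under stated assumptions: the moving-average feature, the nearest-neighbor regressor standing in for the machine learning model, and all numerical values are hypothetical and are not specified by the present description.

```python
def generate_features(friction, window=3):
    """Generation unit sketch: moving average of the friction-related signal."""
    features = []
    for i in range(len(friction)):
        chunk = friction[max(0, i - window + 1):i + 1]
        features.append(sum(chunk) / len(chunk))
    return features


class NearestNeighborModel:
    """Stand-in for the machine learning model (an assumption, not the
    model of the description): returns the label of the closest example."""

    def __init__(self):
        self.examples = []

    def fit(self, feature_series_list, targets):
        self.examples = list(zip(feature_series_list, targets))

    def predict(self, feature_series):
        def distance(a, b):
            n = min(len(a), len(b))  # compare the most recent n samples
            return sum((x - y) ** 2 for x, y in zip(a[-n:], b[-n:]))

        _, target = min(self.examples,
                        key=lambda ex: distance(ex[0], feature_series))
        return target


# Training data from other substrates: friction traces up to a specific time
# point, labeled with polishing amounts (nm) derived from film thicknesses
# measured after polishing. All values are made up for illustration.
train_friction = [[5.0, 5.0, 4.8, 4.2], [5.0, 4.9, 4.9, 4.8], [5.0, 4.6, 4.0, 3.6]]
train_amount_nm = [300.0, 250.0, 350.0]

model = NearestNeighborModel()
model.fit([generate_features(f) for f in train_friction], train_amount_nm)

# Prediction unit sketch: predict for the target substrate at the target time.
target_friction = [5.0, 5.0, 4.7, 4.3]
predicted_amount_nm = model.predict(generate_features(target_friction))
```

In practice a model suited to time-series input would replace the nearest-neighbor stand-in; the sketch only fixes the roles of the generation unit and the prediction unit.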
A polishing apparatus according to a second aspect of the present technology is the polishing apparatus according to the first aspect, further comprising: a determination unit configured to determine whether or not a polishing end point has been reached by using the predicted value; and a control unit configured to control the polishing apparatus so as to stop polishing in a case where the determination unit determines that the polishing end point has been reached.
According to this configuration, since it is possible to control the polishing apparatus so as to stop polishing by using the polishing amount or the residual film amount during polishing predicted in consideration of the influence of consumable members such as polishing pads and non-uniformity of substrates, the difference between the substrates in the polishing amount or the residual film amount at the end of polishing can be reduced.
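As a minimal sketch of the determination unit and the control unit, the loop below stops polishing at the first sample for which the predicted residual film amount reaches a target. The callable `predict_residual_nm`, the toy predictor, and the threshold are hypothetical placeholders for the trained model and the apparatus settings.

```python
def polish_until_endpoint(friction_stream, predict_residual_nm, target_residual_nm):
    """Return the sample index at which polishing is stopped, or None."""
    history = []
    for step, sample in enumerate(friction_stream):
        history.append(sample)
        # Determination unit: compare the predicted value with the target.
        if predict_residual_nm(history) <= target_residual_nm:
            # Control unit: a real apparatus would issue a stop command here.
            return step
    return None


def toy_predictor(history):
    """Hypothetical predictor: the residual film shrinks 40 nm per sample."""
    return 500.0 - 40.0 * len(history)


stop_step = polish_until_endpoint([5.0] * 20, toy_predictor, target_residual_nm=100.0)
```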
A polishing apparatus according to a third aspect of the present technology is the polishing apparatus according to the first or second aspect, wherein the input of the machine learning model further includes a polishing recipe, a use time of a consumable member, the number of substrates treated with the consumable member, and/or an initial film thickness.
According to this configuration, it is possible to estimate the polishing amount or the residual film amount according to the polishing condition and the state of the consumable members, so that the estimation accuracy can be improved.
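One possible way to combine these additional inputs with the feature time series is sketched below. The field names, the units, and the numeric encoding of the polishing recipe are assumptions made for illustration; a categorical encoding of the recipe may be more appropriate in practice.

```python
from dataclasses import dataclass


@dataclass
class AuxiliaryInputs:
    recipe_id: int                # hypothetical numeric code for the recipe
    pad_use_hours: float          # use time of the consumable member
    substrates_processed: int     # substrates treated with the member
    initial_thickness_nm: float   # initial film thickness


def build_model_input(feature_series, aux):
    """Append the auxiliary inputs to the feature time series."""
    return list(feature_series) + [
        float(aux.recipe_id),
        aux.pad_use_hours,
        float(aux.substrates_processed),
        aux.initial_thickness_nm,
    ]


aux = AuxiliaryInputs(recipe_id=2, pad_use_hours=12.5,
                      substrates_processed=340, initial_thickness_nm=800.0)
model_input = build_model_input([5.0, 4.9, 4.7], aux)
```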
A polishing apparatus according to a fourth aspect of the present technology is the polishing apparatus according to any one of the first to third aspects, wherein the polishing amount or the residual film amount at each time point in the training data set is calculated using a first polishing rate until an interface between a polishing target layer and a lower layer is exposed and a second polishing rate after the interface is exposed.
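The two-rate calculation can be read as a piecewise-linear function of time. The sketch below assumes that the interface exposure time and both polishing rates are already known for the substrate in question; the function names and the numerical values are hypothetical.

```python
def polishing_amount(t, exposure_time, rate_before, rate_after):
    """Polishing amount at time t using two polishing rates."""
    if t <= exposure_time:
        return rate_before * t
    return rate_before * exposure_time + rate_after * (t - exposure_time)


def residual_film_amount(t, initial_thickness, exposure_time, rate_before, rate_after):
    """Residual film amount is the initial thickness minus the polishing amount."""
    return initial_thickness - polishing_amount(
        t, exposure_time, rate_before, rate_after)


# Assumed example: 5.0 nm/s before exposure at 20 s, 2.0 nm/s afterwards.
labels = [polishing_amount(t, 20, 5.0, 2.0) for t in (0, 10, 20, 30)]
residual_at_30 = residual_film_amount(30, 300.0, 20, 5.0, 2.0)
```

Evaluating the function at each sampling time yields the time-series labels used as the output side of the training data set.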
A polishing apparatus according to a fifth aspect of the present technology comprises: a generation unit configured to generate time-series data of a feature value up to a target time point by using data regarding a frictional force between a polishing member and a target substrate up to the target time point during polishing or temperature measurement data of the polishing member or the target substrate; a prediction unit configured to input at least the time-series data of the feature value generated by the generation unit to a machine learning model trained with a training data set that includes, as an input, time-series data of the feature value up to a specific time point during polishing of another substrate, and as an output, a polishing end point probability at the specific time point during polishing of the other substrate or time-series data of the polishing end point probability up to the specific time point, and to output a predicted value of the polishing end point probability at the target time point of the target substrate; and a determination unit configured to determine whether or not a polishing end point has been reached by using the predicted value.
According to this configuration, the relationship between the feature value related to a change in frictional force or temperature during polishing and the polishing end point probability at each time point during polishing is learned, and the polishing end point probability at each time point during polishing of a new substrate is predicted using the trained machine learning model. Through this learning, the trained machine learning model reflects the influence of consumable members such as the polishing pad and the non-uniformity of polishing, and thus the polishing end point probability at each time point during polishing of a new substrate can be estimated in consideration of these influences. By using the predicted value for detecting the polishing end point of the target substrate, it is possible to realize end point detection capable of suppressing the difference in residual film thickness between substrates even if the polishing situation changes.
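How the determination unit uses the predicted probability is not limited here. One simple possibility, shown as a sketch, is to require the probability to stay at or above a threshold for several consecutive samples; the threshold of 0.5 and the requirement of two consecutive samples are assumptions.

```python
def endpoint_reached(probabilities, threshold=0.5, consecutive=2):
    """Return the index at which the end point is judged reached, or None.

    The end point is judged reached once `consecutive` successive predicted
    probabilities are at or above `threshold`.
    """
    run = 0
    for i, p in enumerate(probabilities):
        run = run + 1 if p >= threshold else 0
        if run >= consecutive:
            return i
    return None


# Hypothetical predicted end point probabilities at successive time points.
predicted = [0.1, 0.2, 0.6, 0.4, 0.7, 0.8, 0.9]
endpoint_index = endpoint_reached(predicted)
```

Requiring consecutive samples makes the judgment less sensitive to a single noisy prediction, at the cost of a short delay.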
A polishing apparatus according to a sixth aspect of the present technology is the polishing apparatus according to the fifth aspect, further comprising: a control unit configured to control the polishing apparatus so as to stop polishing in a case where the determination unit determines that the polishing end point has been reached.
According to this configuration, since the influence of the consumable members such as a polishing pad and non-uniformity of substrates can be taken into consideration, a deviation range of the polishing amount or the residual film amount at the end of polishing can be reduced.
A polishing apparatus according to a seventh aspect of the present technology comprises: a generation unit configured to generate time-series data of a feature value up to a target time point by using data regarding a frictional force between a polishing member and a target substrate up to the target time point during polishing or temperature measurement data of the polishing member or the target substrate; a prediction unit configured to input at least the time-series data of the feature value generated by the generation unit to a machine learning model trained with a training data set including, as an input, time-series data of the feature value up to a specific time point during polishing of another substrate, and as an output, a remaining polishing time at the specific time point or an additional polishing time from an end point detection timing, or time-series data of the remaining polishing time up to the specific time point or the additional polishing time from the end point detection timing, the remaining polishing time or the additional polishing time being determined such that a remaining film thickness or a polishing amount of the other substrate becomes a target value, and to output a predicted value of the remaining polishing time or the additional polishing time from an end point detection timing of the target substrate; and a determination unit configured to determine whether or not a polishing end point has been reached by using the predicted value.
According to this configuration, the relationship between the feature value related to a change in frictional force or temperature during polishing and the remaining polishing time or the additional polishing time from the end point detection timing is learned, and the remaining polishing time or the additional polishing time from the end point detection timing during polishing of a new substrate is predicted using the trained machine learning model. Through this learning, the trained machine learning model reflects the influence of consumable members such as the polishing pad and the non-uniformity of polishing. Therefore, the remaining polishing time or the additional polishing time from the end point detection timing during polishing of the new substrate can be predicted in consideration of these influences. By using the predicted value for detecting the polishing end point of the target substrate, it is possible to realize end point detection capable of suppressing the difference in residual film thickness between substrates even if the polishing situation changes.
A polishing apparatus according to an eighth aspect of the present technology is the polishing apparatus according to the seventh aspect, further comprising: a control unit configured to control the polishing apparatus so as to stop polishing by using the predicted value of the remaining polishing time or the additional polishing time from the end point detection timing.
According to this configuration, since the influence of the consumable members such as a polishing pad and non-uniformity of substrates can be taken into consideration, a deviation range of the polishing amount or the residual film amount at the end of polishing can be reduced.
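For the seventh and eighth aspects, the training labels and the stop control might look like the following sketch. It assumes that, for each training substrate, the time at which the target remaining film thickness or polishing amount would be reached has already been determined; the function names and values are hypothetical.

```python
def remaining_time_labels(sample_times, target_reached_time):
    """Remaining polishing time at each sample time, clipped at zero."""
    return [max(0.0, target_reached_time - t) for t in sample_times]


def scheduled_stop_time(current_time, predicted_remaining_time):
    """Control unit sketch: stop after the predicted remaining time elapses."""
    return current_time + max(0.0, predicted_remaining_time)


# Labels for a training substrate whose target value is reached at 25 s.
labels = remaining_time_labels([0.0, 10.0, 20.0, 30.0], target_reached_time=25.0)

# During polishing of the target substrate: at 18 s the model predicts 6.5 s.
stop_at = scheduled_stop_time(current_time=18.0, predicted_remaining_time=6.5)
```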
A program according to a ninth aspect of the present technology causes a computer to function as: a generation unit configured to generate time-series data of a feature value up to a target time point by using data regarding a frictional force between a polishing member and a target substrate up to the target time point during polishing or temperature measurement data of the polishing member or the target substrate; a prediction unit configured to input at least the time-series data of the feature value generated by the generation unit to a machine learning model trained with a training data set including, as an input, time-series data of the feature value up to a specific time point during polishing of another substrate, and as an output, a remaining polishing time at the specific time point or an additional polishing time from an end point detection timing, or time-series data of the remaining polishing time up to the specific time point or the additional polishing time from the end point detection timing, the remaining polishing time or the additional polishing time being determined such that a remaining film thickness or a polishing amount of the other substrate becomes a target value, and to output a predicted value of the remaining polishing time or the additional polishing time from an end point detection timing of the target substrate; and a determination unit configured to determine whether or not a polishing end point has been reached by using the predicted value.
A program according to a tenth aspect of the present technology causes a computer to function as: a generation unit configured to generate time-series data of a feature value up to a target time point by using data regarding a frictional force between a polishing member and a target substrate up to the target time point during polishing or temperature measurement data of the polishing member or the target substrate; and a prediction unit configured to input at least the time-series data of the feature value generated by the generation unit to a machine learning model trained with a training data set that includes, as an input, time-series data of the feature value up to a specific time point during polishing of another substrate, and as an output, a polishing end point probability at the specific time point or time-series data of the polishing end point probability up to the specific time point during polishing of the other substrate, and to output a predicted value of the polishing end point probability at the target time point.
A program according to an eleventh aspect of the present technology causes a computer to function as: a generation unit configured to generate time-series data of a feature value up to a target time point by using data regarding a frictional force between a polishing member and a target substrate up to the target time point during polishing or temperature measurement data of the polishing member or the target substrate; and a prediction unit configured to input at least the time-series data of the feature value generated by the generation unit to a machine learning model trained with a training data set including, as an input, time-series data of the feature value up to a specific time point during polishing of another substrate, and as an output, a remaining polishing time at the specific time point or an additional polishing time from an end point detection timing, or time-series data of the remaining polishing time up to the specific time point or the additional polishing time from the end point detection timing, the remaining polishing time or the additional polishing time being determined such that a remaining film thickness or a polishing amount of the other substrate becomes a target value, and to output a predicted value of the remaining polishing time or the additional polishing time from the end point detection timing of the target substrate.
An information processing system according to a twelfth aspect of the present technology comprises: a generation unit configured to generate time-series data of a feature value up to a target time point by using data regarding a frictional force between a polishing member and a target substrate up to the target time point during polishing or temperature measurement data of the polishing member or the target substrate; and a prediction unit configured to input at least the time-series data of the feature value generated by the generation unit to a machine learning model trained with a training data set including, as an input, time-series data of the feature value up to a specific time point during polishing of another substrate, and as an output, a polishing amount or a residual film amount at the specific time point, or time-series data of the polishing amount or the residual film amount up to the specific time point during polishing, the polishing amount or the residual film amount being predicted using at least a film thickness measured after polishing of the other substrate, and to output a predicted value of a polishing amount or a residual film amount at the target time point during polishing of the target substrate.
An information processing system according to a thirteenth aspect of the present technology comprises: a generation unit configured to generate time-series data of a feature value up to a target time point by using data regarding a frictional force between a polishing member and a target substrate up to the target time point during polishing or temperature measurement data of the polishing member or the target substrate; a prediction unit configured to input at least the time-series data of the feature value generated by the generation unit to a machine learning model trained with a training data set that includes, as an input, time-series data of the feature value up to a specific time point during polishing of another substrate, and as an output, a polishing end point probability at the specific time point during polishing of the other substrate or time-series data of the polishing end point probability up to the specific time point, and to output a predicted value of the polishing end point probability at the target time point of the target substrate; and a determination unit configured to determine whether or not a polishing end point has been reached by using the predicted value.
An information processing system according to a fourteenth aspect of the present technology comprises: a generation unit configured to generate time-series data of a feature value up to a target time point by using data regarding a frictional force between a polishing member and a target substrate up to the target time point during polishing or temperature measurement data of the polishing member or the target substrate; a prediction unit configured to input at least the time-series data of the feature value generated by the generation unit to a machine learning model trained with a training data set including, as an input, time-series data of the feature value up to a specific time point during polishing of another substrate, and as an output, a remaining polishing time at the specific time point or an additional polishing time from an end point detection timing, or time-series data of the remaining polishing time up to the specific time point or the additional polishing time from the end point detection timing, the remaining polishing time or the additional polishing time being determined such that a remaining film thickness or a polishing amount of the other substrate becomes a target value, and to output a predicted value of the remaining polishing time or the additional polishing time from the end point detection timing of the target substrate; and a determination unit configured to determine whether or not a polishing end point has been reached by using the predicted value.
A substrate polishing method according to a fifteenth aspect of the present technology comprises: a generation step of generating time-series data of a feature value up to a target time point by using data regarding a frictional force between a polishing member and a target substrate up to the target time point during polishing or temperature measurement data of the polishing member or the target substrate; and an estimation step of inputting at least the time-series data of the feature value generated in the generation step to a machine learning model trained with a training data set including, as an input, time-series data of the feature value up to a specific time point during polishing of another substrate, and as an output, a polishing amount or a residual film amount at the specific time point, or time-series data of the polishing amount or the residual film amount up to the specific time point during polishing, the polishing amount or the residual film amount being predicted using at least a film thickness measured after polishing of the other substrate, and outputting a predicted value of a polishing amount or a residual film amount at the target time point during polishing of the target substrate.
A substrate polishing method according to a sixteenth aspect of the present technology comprises: a generation step of generating time-series data of a feature value up to a target time point by using data regarding a frictional force between a polishing member and a target substrate up to the target time point during polishing or temperature measurement data of the polishing member or the target substrate; an estimation step of inputting at least the time-series data of the feature value generated in the generation step to a machine learning model trained with a training data set that includes, as an input, time-series data of the feature value up to a specific time point during polishing of another substrate, and as an output, a polishing end point probability at the specific time point during polishing of the other substrate or time-series data of the polishing end point probability up to the specific time point, and outputting a predicted value of the polishing end point probability at the target time point of the target substrate; and a determination step of determining whether or not a polishing end point has been reached by using the predicted value.
A substrate polishing method according to a seventeenth aspect of the present technology comprises: a generation step of generating time-series data of a feature value up to a target time point by using data regarding a frictional force between a polishing member and a target substrate up to the target time point during polishing or temperature measurement data of the polishing member or the target substrate; an estimation step of inputting at least the time-series data of the feature value generated in the generation step to a machine learning model trained with a training data set including, as an input, time-series data of the feature value up to a specific time point during polishing of another substrate, and as an output, a remaining polishing time at the specific time point or an additional polishing time from an end point detection timing, or time-series data of the remaining polishing time up to the specific time point or the additional polishing time from the end point detection timing, the remaining polishing time or the additional polishing time being determined such that a remaining film thickness or a polishing amount of the other substrate becomes a target value, and outputting a predicted value of the remaining polishing time or the additional polishing time from the end point detection timing of the target substrate; and a determination step of determining whether or not a polishing end point has been reached by using the predicted value.
According to one aspect of the present technology, a relationship between a feature value related to a change in a frictional force or temperature when polishing is performed and a polishing amount or a residual film amount as a result of polishing is trained, and the polishing amount or the residual film amount during polishing of a new substrate is predicted using the trained machine learning model. By the learning of the machine learning model, the trained machine learning model can estimate a polishing amount or a residual film amount in consideration of the influence of the consumable member such as the polishing pad and the non-uniformity of polishing. Therefore, it is possible to estimate the polishing amount or the residual film amount during polishing of a new substrate in consideration of the influence of the consumable member such as the polishing pad and the non-uniformity of polishing.
According to one aspect of the present technology, the relationship between the feature value related to a change in a frictional force or temperature when polishing is performed and the polishing end point probability at each time point during polishing is trained, and a polishing end point probability at each time point during polishing of a new substrate is predicted using the trained machine learning model. By the learning of the machine learning model, the trained machine learning model can estimate a polishing end point probability at each time point during polishing in consideration of the influence of the consumable member such as the polishing pad and the non-uniformity of polishing, and thus, it is possible to estimate the polishing end point probability at each time point during polishing of a new substrate in consideration of the influence of the consumable member such as the polishing pad and the non-uniformity of polishing.
According to one aspect of the present technology, the relationship between the feature value related to a change in the frictional force or temperature at the time of polishing and the remaining polishing time or the additional polishing time from the end point detection timing is trained, and a remaining polishing time or additional polishing time from the end point detection timing during polishing of a new substrate is predicted using the trained machine learning model. By the learning of the machine learning model, the trained machine learning model can estimate the remaining polishing time or the additional polishing time from the end point detection timing in consideration of the influence of the consumable member such as the polishing pad and the non-uniformity of polishing. Therefore, the remaining polishing time or the additional polishing time from the end point detection timing during the polishing of the new substrate can be predicted in consideration of the influence of the consumable member such as the polishing pad and the non-uniformity of polishing.
The inventors of the present application have found that there is a correlation between a feature value related to a change in a frictional force or temperature when polishing is performed and a polishing amount or a residual film amount as a result of polishing. In addition, the inventors of the present application have found that there is a correlation between the feature value related to the change in the frictional force or temperature when polishing is performed and a polishing end point probability at each time point during polishing. Further, the inventors of the present application have found that there is a correlation between the feature value related to the change in the frictional force or temperature when polishing is performed and the remaining polishing time or the additional polishing time from the end point detection timing. Therefore, in each embodiment, a machine learning model (for example, a recurrent neural network or a long short-term memory (LSTM)) is used to learn one of the relationships described above. In each embodiment, a wafer will be described as an example of the substrate.
First, a first embodiment will be described.
The polishing apparatus 10 includes a polishing table 100 and a polishing head 1 as a substrate holding apparatus that holds a substrate (here, a wafer) to be polished and presses the substrate against a polishing surface on the polishing table 100. The polishing head 1 is also referred to as a top ring. The polishing table 100 is connected to a table rotating motor 102 via a table shaft 100a disposed therebelow. The polishing table 100 rotates around the table shaft 100a as the table rotating motor 102 rotates. A polishing pad 101 as a polishing member is attached to the upper surface of the polishing table 100. The surface of the polishing pad 101 constitutes a polishing surface 101a for polishing a semiconductor wafer W. As described above, the polishing apparatus 10 includes the polishing table 100, which is provided with a polishing member (here, the polishing pad 101 as an example) and is configured to be rotatable, and the polishing head 1, which is configured to face the polishing table 100 and to be rotatable, and to which a substrate (here, the wafer) can be attached on a surface facing the polishing table 100.
A polishing liquid supply nozzle 60 is installed above the polishing table 100. A polishing liquid (polishing slurry) Q is supplied from the polishing liquid supply nozzle 60 onto the polishing pad 101 on the polishing table 100.
The polishing head 1 basically includes a top ring main body 2 that presses the semiconductor wafer W against the polishing surface 101a, and a retainer ring 3 as a retainer member that holds an outer peripheral edge of the semiconductor wafer W and prevents the semiconductor wafer W from jumping out of the polishing head 1. The polishing head 1 is connected to a top ring shaft 111. The top ring shaft 111 is moved up and down with respect to a top ring head 110 by a vertical movement mechanism 124. Positioning of the polishing head 1 in a vertical direction is performed by a vertical movement of the entire polishing head 1 with respect to the top ring head 110 by moving the top ring shaft 111 up and down. A rotary joint 26 is attached to an upper end of the top ring shaft 111.
The vertical movement mechanism 124 that vertically moves the top ring shaft 111 and the polishing head 1 includes a bridge 128 that rotatably supports the top ring shaft 111 via a bearing 126, a ball screw 132 attached to the bridge 128, a support base 129 supported by a support column 130, and a servomotor 138 provided on the support base 129. The support base 129 that supports the servomotor 138 is fixed to the top ring head 110 via the support column 130.
The ball screw 132 includes a screw shaft 132a connected to the servomotor 138 and a nut 132b into which the screw shaft 132a is screwed. When the servomotor 138 is driven, the bridge 128 moves up and down via the ball screw 132, whereby the top ring shaft 111 and the polishing head 1 move up and down integrally with the bridge 128.
In addition, as illustrated in
The top ring head 110 is supported by a top ring head shaft 117 rotatably supported by a frame (not illustrated). The polishing apparatus 10 is connected to each device in the apparatus including the top ring rotating motor 114, the servomotor 138, and the table rotating motor 102 via a control line, and includes a control unit 500 that controls each device. The control unit 500 controls the polishing apparatus so as to polish the substrate by pressing the substrate against the polishing member (here, the polishing pad 101) while rotating the polishing head 1 to which the substrate is attached and the polishing table 100.
The table rotation, the head rotation, and the rotation of a motor (not illustrated) for swinging the top ring head 110 can each serve as the basis of the feature values to be input to a machine learning model to be described later. For any of these, one or more sensor detected values (for example, a motor current value) or a calculated value of torque calculated from the sensor detected values may be used.
The polishing apparatus 10 includes an AI unit 4 connected to the control unit 500 via wiring.
The storage unit 41 stores a machine learning model trained with a training data set that includes, as an input, a feature value based on data regarding a frictional force at each time point during polishing or a feature value based on temperature measurement data, and as an output, a polishing amount or a residual film amount at each time point during polishing predicted by using at least a film thickness measured after polishing. The storage unit 41 also stores a program to be read and executed by the processor 45. The storage unit 41 may be a storage such as a hard disk or a DVD, an external storage medium such as an SD card or a flash memory, an online storage, or a storage device.
Here, the data regarding the frictional force at each time point during polishing is, for example, a current value (hereinafter, also referred to as table torque current) for torque calculation of the table rotating motor 102 during polishing. Alternatively, the data regarding the frictional force at each time point during polishing may be a calculated value of torque converted from the current value of the motor. Note that the data regarding the frictional force at each time point during polishing may be a drive current value of the top ring rotating motor 114 that rotates the polishing head 1, or may be a drive current value of a motor (not illustrated) that rotates the top ring head 110 (and thus the top ring head shaft 117).
In addition, the polishing apparatus 10 may include a load cell that measures a frictional force between the polishing member and the substrate, and in this case, the data regarding the frictional force at each time point during polishing may be a signal value of the load cell. The polishing apparatus 10 may include a strain sensor that measures the strain of the substrate. In this case, the data regarding the frictional force at each time point during polishing may be a signal value of the strain sensor.
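The description does not specify how a motor current value is converted into a calculated torque value. As a minimal sketch only, assuming a simple proportional relation, the conversion could look like the following. The torque constant `kt_nm_per_a` and the no-load current offset `no_load_current_a` are hypothetical parameters introduced for illustration and do not appear in this description.

```python
def current_to_torque(current_a, kt_nm_per_a, no_load_current_a=0.0):
    """Convert one motor current sample [A] to an estimated torque [N*m].

    Assumption: torque is proportional to the current above a no-load
    offset. kt_nm_per_a and no_load_current_a are hypothetical values
    that would have to be identified for the actual motor.
    """
    return kt_nm_per_a * (current_a - no_load_current_a)


def torque_series(currents_a, kt_nm_per_a, no_load_current_a=0.0):
    """Apply the conversion to a time series of table torque current samples."""
    return [
        current_to_torque(i, kt_nm_per_a, no_load_current_a)
        for i in currents_a
    ]
```

Either the raw current series or the converted torque series may then serve as the data regarding the frictional force.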
The memory 42 is a medium that temporarily stores information.
The input unit 43 receives information from the control unit 500 and outputs the information to the processor 45.
The output unit 44 receives information from the processor 45 and outputs the information to the control unit 500.
The processor 45 functions as a generation unit 451, a prediction unit 452, and a determination unit 453 by reading a program from the storage unit 41 and executing the program.
For example, the generation unit 451 generates a feature value using the data regarding the frictional force between the polishing member and a target substrate at a target time point during polishing. Here, “during polishing” means, for example, during polishing of a substrate by pressing the substrate against a polishing member while rotating the polishing head 1 to which the substrate is attached and the polishing table 100. Details of this process will be described later.
The prediction unit 452 inputs at least the feature value generated by the generation unit 451 to the trained machine learning model, and outputs a predicted value of the polishing amount or the residual film amount at the target time point during polishing of the target substrate. Details of this process will be described later. The determination unit 453 uses the predicted value to determine whether or not the polishing end point has been reached.
As illustrated in
As the lengths of arrows A12 and A13 are shorter than the lengths of arrows A11 and A14 in
The inventor of the present application has found that there is a correlation between the data on the frictional force between the polishing member and the substrate (for example, the signal of the table torque current) and the residual film thickness or the polishing amount, since the polishing rate varies depending on the polishing position due to wear of the polishing pad or the like and the timing at which the lower layer film is exposed varies in the wafer plane due to uneven polishing of the film or the like. Here, the residual film thickness is a remaining thickness of the polishing target layer 51, that is, a thickness from the bottom in the recess to the lower surface of the polishing target layer 51, and is, for example, a thickness (for example, the lengths of arrows A11, A12, A13, A14) of the film remaining in the recess in a case of interface protrusion as indicated by the point P3 in
Therefore, in the present embodiment, a machine learning model is trained with a training data set in which data regarding the frictional force between the polishing member and the substrate, obtained when a substrate having a certain initial film thickness is polished to a certain residual film thickness, is used as an input, and the residual film thickness or the polishing amount at that time point is used as an output. Data regarding the frictional force between the polishing member and a new target substrate is then input to the trained machine learning model, so that a predicted value of the residual film thickness or the polishing amount is output, and the polishing is stopped at the timing when the residual film thickness or the polishing amount reaches a target value.
In the training process, the feature value based on the data (for example, table torque current) regarding a frictional force between the polishing member and the substrate at each time point during polishing is extracted with reference to the storage unit 41. In addition, with reference to the storage unit 41, the polishing amount or the residual film amount at each time point during polishing predicted using at least the film thickness measured after polishing is extracted.
Machine learning is performed using the training data set that includes, as an input, a feature value based on the data regarding the frictional force between the polishing member and the substrate at each time point during polishing, and as an output, a polishing amount or a residual film amount at each time point during polishing predicted using at least the film thickness measured after polishing. As a result, the trained machine learning model is stored in the storage unit 41. In addition to the feature value based on the data regarding the frictional force between the polishing member and the substrate at each time point during polishing, as the input in the training data set, a polishing recipe, a use time of one consumable member, the number of substrates processed with a same consumable member, and/or the initial film thickness may be added as described later.
Here, the polishing amount or the residual film amount at each time point in the training data set is obtained by calculating the polishing amount or the residual film amount (residual film thickness) at each time point on the basis of the measurement result of the initial film thickness and the film thickness after polishing, assuming that the polishing rate during polishing is constant. Alternatively, a change in the polishing rate during polishing may be obtained by an experiment to calculate the polishing amount or the residual film amount at each time point. Note that a first polishing rate until an interface between the polishing target layer and the lower layer is exposed and a second polishing rate after the interface is exposed may be calculated separately.
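The label calculation described above can be sketched as follows. This is an illustrative sketch rather than the actual implementation: the function names, argument names, and the assumption that time is measured from the start of polishing are all introduced here for illustration.

```python
def polishing_amount_labels(initial_thickness, final_thickness,
                            total_time, sample_times):
    """Polishing amount at each sample time, assuming a constant polishing rate.

    The rate is derived from the measured initial film thickness and the
    film thickness measured after polishing.
    """
    if total_time <= 0:
        raise ValueError("total_time must be positive")
    rate = (initial_thickness - final_thickness) / total_time
    return [rate * t for t in sample_times]


def residual_film_labels(initial_thickness, final_thickness,
                         total_time, sample_times):
    """Residual film thickness at each sample time under the same assumption."""
    amounts = polishing_amount_labels(
        initial_thickness, final_thickness, total_time, sample_times
    )
    return [initial_thickness - a for a in amounts]


def two_rate_polishing_amount(t, rate_before, rate_after, t_exposed):
    """Polishing amount when separate rates apply before and after the
    interface with the lower layer is exposed at time t_exposed."""
    if t <= t_exposed:
        return rate_before * t
    return rate_before * t_exposed + rate_after * (t - t_exposed)
```

With the constant-rate assumption, the labels are a straight line between the two measured thicknesses. The two-rate variant replaces that line with two linear segments joined at the exposure time.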
For example, learning is performed using learning data in which time-series data of the feature value from a start of polishing to a time point t1 is input and a value of the output parameter at the time point t1 is output.
As another example of learning data, learning is performed using learning data in which time-series data of the feature value from the start of polishing to a time point t2 is input and a value of the output parameter at the time point t2 is output.
As a further example of learning data, learning is performed using learning data in which time-series data of the feature value from the start of polishing to a time point t3 is input and a value of the output parameter at the time point t3 is output.
From the time-series data of the feature values up to the times t1, t2, and t3, learning is performed in which the values of the output parameters at the times t1, t2, and t3 are output.
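The construction of such learning data can be sketched as follows, assuming that the feature values and the output parameter are sampled at the same time points and held in two lists of equal length. The function name `make_prefix_samples` and the use of list indices to represent the time points t1, t2, and t3 are assumptions made for illustration.

```python
def make_prefix_samples(features, outputs, time_indices):
    """Build (input, output) learning samples from one polishing run.

    features[i] and outputs[i] are the feature value and the output
    parameter at sample index i. For each index k in time_indices, the
    input is the feature time series from the start of polishing up to
    and including k, and the output is the output parameter at k.
    """
    if len(features) != len(outputs):
        raise ValueError("features and outputs must have the same length")
    samples = []
    for k in time_indices:
        samples.append((features[: k + 1], outputs[k]))
    return samples
```

A single polishing run therefore yields one learning sample per chosen time point, each with a longer input sequence than the previous one.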
After the learning is completed, when time-series data of the feature value up to a certain time point is input to the machine learning model in a new polishing, a predicted value (for example, an unknown residual film amount) of the output parameter at the time point is output. For example, an RNN or an LSTM may be used as the machine learning model. However, a machine learning model (method) other than the RNN or the LSTM may be used.
As illustrated in
That is, learning is performed using the learning data in which the time-series data of the feature value from the start of polishing to the end of polishing is input and the time-series data of the output parameter from the start of polishing to the end of polishing is output.
After the learning is completed, when time-series data of the feature value up to a certain time point is input to the machine learning model in a new polishing, predicted values (for example, unknown residual film amounts) of the output parameter up to the time point are output. That is, when time-series data of the feature value up to the time point t1 is input to the machine learning model, predicted values (for example, unknown residual film amounts) of the output parameter up to the time point t1 are output. In addition, when time-series data of the feature value up to the time point t2 is input to the machine learning model, predicted values (for example, unknown residual film amounts) of the output parameter up to the time point t2 are output. Furthermore, when time-series data of the feature value up to the time point t3 is input to the machine learning model, predicted values (for example, unknown residual film amounts) of the output parameter up to the time point t3 are output. In this manner, since a plurality of predicted values of the output parameter up to that time point are output, the prediction unit 452 may acquire the predicted value at that time point from among the plurality of predicted values. The determination unit 453 may determine whether or not the polishing end point has been reached using the predicted value at that time point.
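Selecting the predicted value at the current time point from the plurality of predicted values can be sketched as follows. Here `model` is a hypothetical callable that returns one predicted value per input sample, and `running_sum_model` is only a stand-in for a trained model so that the sketch can be run. Neither name comes from this description.

```python
def predicted_value_at_time_point(model, feature_series):
    """Return the predicted value for the latest time point.

    model is assumed to map a feature time series of length n to a list
    of n predicted values, one for each time point up to the current one.
    """
    if not feature_series:
        raise ValueError("feature_series must not be empty")
    predictions = model(feature_series)
    if len(predictions) != len(feature_series):
        raise ValueError("model must return one prediction per sample")
    # The last element corresponds to the current (target) time point.
    return predictions[-1]


def running_sum_model(series):
    """Stand-in for a trained sequence model (illustration only)."""
    total = 0.0
    out = []
    for x in series:
        total += x
        out.append(total)
    return out
```

For example, `predicted_value_at_time_point(running_sum_model, [1.0, 2.0, 3.0])` returns the last of the three stand-in predictions.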
Subsequently, referring back to
Note that, in a case where the machine learning model is trained as described with reference to
The input of the machine learning model may further include a polishing recipe, a use time of one consumable member, the number of substrates treated with a same consumable member, and/or an initial film thickness. As a result, it is possible to estimate the polishing amount or the residual film amount according to the polishing conditions and the state of the consumable member, and the estimation accuracy can be improved.
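How these additional inputs are combined with the feature time series is not specified here. One possible arrangement, shown purely as an assumption, is to append the same static values to the feature value at every time step. The numeric `recipe_id` encoding and all argument names below are hypothetical.

```python
def build_model_input(feature_series, recipe_id, pad_use_time_h,
                      substrates_processed, initial_thickness):
    """Combine a feature time series with static polishing conditions.

    Assumption: the static values (polishing recipe encoded as a numeric
    id, use time of the consumable member, number of substrates processed
    with the same consumable member, initial film thickness) are repeated
    at every time step.
    """
    static = [
        float(recipe_id),
        float(pad_use_time_h),
        float(substrates_processed),
        float(initial_thickness),
    ]
    # One row per time step: [feature value, static values...]
    return [[float(f)] + static for f in feature_series]
```

Other arrangements, such as feeding the static values to a separate input of the model, would serve the same purpose.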
(Step S110) First, the processor 45 loads a trained machine learning model (also referred to as an AI model) from the storage unit 41 into the memory 42.
(Step S120) Next, the processor 45 acquires table torque current data.
(Step S130) Next, the generation unit 451 calculates a feature value from the table torque current data acquired in step S120.
(Step S140) Next, the prediction unit 452 inputs the feature value calculated in step S130 to the trained machine learning model, and outputs a predicted value of the polishing amount at the target time point during polishing of the target substrate.
(Step S150) Next, the determination unit 453 determines whether or not the predicted value of the polishing amount output in step S140 is equal to or more than a set threshold value. In a case where the predicted value of the polishing amount is not equal to or more than the set threshold value, the process returns to step S130 and the process is repeated. On the other hand, in a case where the predicted value of the polishing amount is equal to or more than the set threshold value, the determination unit 453 outputs an instruction to stop polishing to the control unit 500, and the control unit 500 that has received the instruction to stop polishing controls the polishing apparatus so as to stop polishing. In this manner, the determination unit 453 controls the polishing apparatus so as to stop polishing by using the predicted value predicted by the prediction unit 452. According to this configuration, since the influence of the consumable members such as a polishing pad and non-uniformity of substrates can be taken into consideration, a deviation range of the polishing amount or the residual film amount at the end of polishing can be reduced.
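Steps S110 to S150 can be sketched as a simple loop. This is an illustrative sketch under several assumptions: the trained model is passed in as a hypothetical callable `model`, the feature calculation is passed in as a hypothetical callable `feature_fn`, and a new table torque current sample is assumed to be taken on every repetition. The two stand-in functions exist only so that the sketch can be run.

```python
def run_endpoint_detection(model, torque_current_samples, feature_fn, threshold):
    """Return the sample index at which polishing should stop, or None.

    model: hypothetical callable standing in for the loaded AI model (S110);
           maps the feature time series so far to a predicted polishing amount.
    feature_fn: hypothetical callable computing a feature value from the
           table torque current samples acquired so far.
    """
    history = []
    features = []
    for index, sample in enumerate(torque_current_samples):
        history.append(sample)                  # S120: acquire torque current
        features.append(feature_fn(history))    # S130: calculate feature value
        predicted_amount = model(features)      # S140: predict polishing amount
        if predicted_amount >= threshold:       # S150: compare with threshold
            return index                        # instruct the controller to stop
    return None


def last_sample_feature(history):
    """Stand-in feature: the most recent sample (illustration only)."""
    return history[-1]


def linear_stub_model(features):
    """Stand-in model: 10 units of polishing amount per sample (illustration only)."""
    return 10.0 * len(features)
```

In an actual apparatus the returned index would correspond to the moment at which the instruction to stop polishing is sent to the control unit 500.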
In practice, as illustrated in the upper diagram of
(Step S210) First, the processor 45 acquires an initial film thickness of the substrate.
(Step S220) Next, the processor 45 loads a trained machine learning model (also referred to as an AI model) from the storage unit 41 into the memory 42.
(Step S230) Next, the processor 45 acquires table torque current data.
(Step S240) Next, the generation unit 451 calculates a feature value from the table torque current data acquired in step S230.
(Step S250) Next, the prediction unit 452 inputs the feature value calculated in step S240 to the trained machine learning model, outputs a predicted value of the polishing amount at the target time point during polishing of the target substrate, and calculates a predicted value of the residual film thickness by subtracting the predicted value of the polishing amount from the initial film thickness acquired in step S210.
(Step S260) Next, the determination unit 453 determines whether or not the predicted value of the residual film thickness output in step S250 is equal to or less than a set threshold value. In a case where the predicted value of the residual film thickness is not equal to or less than the set threshold value, the process returns to step S230 and the processing is repeated. On the other hand, in a case where the predicted value of the residual film thickness is equal to or less than the set threshold value, the determination unit 453 outputs an instruction to stop polishing to the control unit 500, and the control unit 500 that has received the instruction to stop polishing controls the polishing apparatus so as to stop polishing.
Note that, in a case where a machine learning model is used that has been trained with a training data set including, as an input, a feature value based on data regarding a frictional force at each time point during polishing, and as an output, a residual film amount at each time point during polishing predicted using at least a film thickness measured after polishing, a predicted value of the residual film thickness may be directly output from the trained machine learning model in step S250, instead of the predicted value of the polishing amount.
As described above, the information processing system S according to the first embodiment includes the generation unit 451 that generates the feature value based on the data regarding the frictional force between the polishing member and the target substrate at the target time point during polishing. Furthermore, the information processing system S includes the prediction unit 452 that inputs at least the feature value generated by the generation unit 451 to the machine learning model trained with the training data set that includes, as an input, a feature value based on data regarding the frictional force between the polishing member and the substrate at each time point during polishing, and as an output, a polishing amount or a residual film amount at each time point during polishing predicted using at least a film thickness measured after polishing, and outputs a predicted value of the polishing amount or the residual film amount at a target time point during polishing of the target substrate.
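The residual film thickness calculation and the stop decision of steps S250 and S260 reduce to a subtraction and a comparison. A minimal sketch follows, with function and argument names that are assumptions for illustration.

```python
def residual_thickness(initial_thickness, predicted_polishing_amount):
    """S250: predicted residual film thickness from the initial film
    thickness (S210) and the predicted polishing amount."""
    return initial_thickness - predicted_polishing_amount


def should_stop(initial_thickness, predicted_polishing_amount, threshold):
    """S260: stop when the predicted residual film thickness is equal to
    or less than the set threshold value."""
    return residual_thickness(initial_thickness, predicted_polishing_amount) <= threshold
```

When the model outputs the residual film thickness directly, only the comparison in `should_stop` is needed and the subtraction is skipped.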
With this configuration, a relationship between a feature value related to a change in a frictional force or temperature when polishing is performed and a polishing amount or a residual film amount as a result of polishing is trained, and the polishing amount or the residual film amount during polishing of a new substrate is predicted using the trained machine learning model. By the learning of the machine learning model, the trained machine learning model can estimate a polishing amount or a residual film amount in consideration of the influence of the consumable member such as the polishing pad and the non-uniformity of polishing. Therefore, it is possible to estimate the polishing amount or the residual film amount during polishing of a new substrate in consideration of the influence of the consumable member such as the polishing pad and the non-uniformity of polishing. By using the predicted value for detecting the polishing end point of the target substrate, it is possible to realize end point detection capable of suppressing the difference in residual film thickness between the substrates even if the polishing situation changes.
<First Modification of First Embodiment>
Next, a first modification of the first embodiment will be described. In the first modification, the storage unit 41 stores a machine learning model trained with a training data set in which at least a feature value based on data regarding a frictional force at each time point during polishing or a feature value based on temperature measurement data is used as an input, and a polishing end point probability at each time point during polishing is used as an output. The polishing end point probability is defined, for example, by setting the output of training data based on data up to the middle of polishing to 0, and setting the output of training data based on polishing data that reaches an ideal polishing end point or an ideal detection point to 1.
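The 0/1 labeling described above can be illustrated with a minimal sketch. The function name `end_point_labels` and the parameter `end_index` (the sample index of the ideal polishing end point) are hypothetical and do not appear in the embodiment; the sketch only assumes that one label is assigned per time point of a single polishing run.

```python
from typing import List


def end_point_labels(num_samples: int, end_index: int) -> List[int]:
    """Return one 0/1 training label per time point of a polishing run.

    Time points before `end_index` (hypothetical: index of the ideal
    polishing end point) are labeled 0; time points at or after it are
    labeled 1.
    """
    if not 0 <= end_index <= num_samples:
        raise ValueError("end_index must lie within the run")
    return [0 if i < end_index else 1 for i in range(num_samples)]


labels = end_point_labels(num_samples=6, end_index=4)
print(labels)  # prints [0, 0, 0, 0, 1, 1]
```

A model trained on such labels outputs a value between 0 and 1 that can be compared with a threshold during polishing.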
The generation unit 451 generates a feature value using data related to a frictional force between the polishing member and the target substrate at a target time point during polishing.
The prediction unit 452 inputs at least the feature value generated by the generation unit 451 to the trained machine learning model stored in the storage unit 41, and outputs a predicted value of the polishing end point probability at the target time point.
With this configuration, by using the machine learning model, it is possible, for example, to perform inference that retains not only the instantaneous value of the feature value of the data but also the change in the waveform, so that the polishing end point probability can be estimated in consideration of the influence of non-uniformity of the consumable member such as the polishing pad or of the substrate. Then, by using a predicted value of the polishing end point probability for polishing termination control, it is possible to reduce the difference in residual film thickness between the substrates after polishing.
The determination unit 453 controls the polishing apparatus so as to stop polishing by using the predicted value predicted by the prediction unit 452.
(Step S310) First, the processor 45 loads a trained machine learning model (also referred to as an AI model) from the storage unit 41 into the memory 42.
(Step S320) Next, the processor 45 acquires table torque current data.
(Step S330) Next, the generation unit 451 calculates a feature value from the table torque current data acquired in step S320.
(Step S340) Next, the prediction unit 452 inputs the feature value calculated in step S330 to the trained machine learning model, and outputs a predicted value of the polishing end point probability at the target time point.
(Step S350) Next, the determination unit 453 determines whether or not the predicted value of the polishing end point probability output in step S340 is equal to or greater than a set threshold value. In a case where the predicted value of the polishing end point probability is less than the set threshold value, the process returns to step S320 and is repeated. On the other hand, in a case where the predicted value of the polishing end point probability is equal to or greater than the set threshold value, the determination unit 453 outputs an instruction to stop polishing to the control unit 500, and the control unit 500 that has received the instruction controls the polishing apparatus so as to stop polishing. As described above, in a case where the determination unit 453 determines that the polishing end point has been reached, the control unit 500 controls the polishing apparatus to stop polishing.
According to this configuration, the relationship between the feature value related to a change in the frictional force or temperature during polishing and the polishing end point probability at each time point during polishing is learned, and the polishing end point probability at each time point during polishing of a new substrate is predicted using the trained machine learning model. Through this learning, the trained machine learning model can estimate the polishing end point probability at each time point during polishing in consideration of the influence of the consumable member such as the polishing pad and the non-uniformity of polishing, and thus it is possible to estimate the polishing end point probability at each time point during polishing of a new substrate in consideration of these influences.
By using the predicted value for detecting the polishing end point of the target substrate, it is possible to realize end point detection capable of suppressing the difference in residual film thickness between the substrates even if the polishing situation changes.
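Steps S310 to S350 can be sketched as a simple loop. This is an illustration under stated assumptions, not the implementation of the embodiment: the slope feature `slope_feature`, the stand-in `toy_model`, the window size, and the threshold 0.9 are all hypothetical, and loading the trained model (step S310) is represented by passing a callable.

```python
from typing import Callable, Iterable, List, Optional


def slope_feature(window: List[float]) -> float:
    """Hypothetical feature: average slope of the torque current in the window."""
    if len(window) < 2:
        return 0.0
    return (window[-1] - window[0]) / (len(window) - 1)


def detect_end_point(
    torque_currents: Iterable[float],
    model: Callable[[float], float],
    threshold: float,
    window_size: int = 3,
) -> Optional[int]:
    """Return the sample index at which polishing would be stopped, or None."""
    window: List[float] = []
    for index, current in enumerate(torque_currents):  # S320: acquire data
        window = (window + [current])[-window_size:]
        feature = slope_feature(window)                # S330: feature value
        probability = model(feature)                   # S340: predicted probability
        if probability >= threshold:                   # S350: compare with threshold
            return index                               # stop instruction issued here
    return None


def toy_model(slope: float) -> float:
    """Stand-in for the trained model: a falling current gives a high probability."""
    return min(1.0, max(0.0, -slope))


stop_index = detect_end_point([5.0, 5.0, 5.0, 4.0, 3.0, 2.0], toy_model, threshold=0.9)
print(stop_index)  # prints 4
```

In an actual apparatus the model would be the trained machine learning model loaded from the storage unit 41, and the returned index would correspond to the moment the determination unit 453 instructs the control unit 500 to stop polishing.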
<Second Modification of First Embodiment>
Next, a second modification of the first embodiment will be described. In the second modification, the storage unit 41 stores a machine learning model trained with a training data set that includes, as an input, at least a feature value based on data regarding a frictional force at each time point during polishing, and as an output, remaining polishing time or additional polishing time from an end point detection timing determined so that a residual film thickness or a polishing amount becomes a target value. Here, the predicted value of the additional polishing time from the end point detection timing is a predicted value of the time for additional polishing from the end point detection timing until the target residual film thickness illustrated in
The generation unit 451 generates a feature value using data related to a frictional force between the polishing member and the target substrate at the target time point during polishing or temperature measurement data of the polishing member or the substrate.
The prediction unit 452 inputs at least the feature value generated by the generation unit 451 to the trained machine learning model stored in the storage unit 41, and outputs a predicted value of the remaining polishing time or the additional polishing time from the end point detection timing.
With this configuration, by using the machine learning model, it is possible, for example, to perform inference that retains not only the instantaneous value of the feature value of the data but also the change in the waveform, so that the remaining polishing time or the additional polishing time from the end point detection timing can be estimated in consideration of the influence of non-uniformity of the consumable member such as the polishing pad or of the substrate. Then, by using the predicted value of the remaining polishing time or the additional polishing time from the end point detection timing for polishing termination control, it is possible to reduce the difference in residual film thickness between the substrates after polishing.
(Step S410) First, the processor 45 loads a trained machine learning model (also referred to as an AI model) from the storage unit 41 into the memory 42.
(Step S420) Next, the processor 45 acquires table torque current data.
(Step S430) Next, the generation unit 451 calculates a feature value from the table torque current data acquired in step S420.
(Step S440) Next, the prediction unit 452 inputs the feature value calculated in step S430 to the trained machine learning model and outputs a predicted value of the remaining polishing time.
(Step S450) Next, the determination unit 453 determines whether or not the predicted value of the remaining polishing time output in step S440 is 0 or less. In a case where the predicted value of the remaining polishing time is greater than 0, the process returns to step S420 and is repeated. On the other hand, in a case where the predicted value of the remaining polishing time is 0 or less, the determination unit 453 outputs an instruction to stop polishing to the control unit 500, and the control unit 500 that has received the instruction controls the polishing apparatus so as to stop polishing. As described above, in a case where the determination unit 453 determines that the polishing end point has been reached, the control unit 500 controls the polishing apparatus to stop polishing.
According to this configuration, the relationship between the feature value related to a change in the frictional force or temperature during polishing and the remaining polishing time or the additional polishing time from the end point detection timing is learned, and the remaining polishing time or the additional polishing time from the end point detection timing during polishing of a new substrate is predicted using the trained machine learning model. Through this learning, the trained machine learning model can estimate the remaining polishing time or the additional polishing time from the end point detection timing in consideration of the influence of the consumable member such as the polishing pad and the non-uniformity of polishing. Therefore, the remaining polishing time or the additional polishing time from the end point detection timing during polishing of a new substrate can be predicted in consideration of these influences.
By using the predicted value for detecting the polishing end point of the target substrate, it is possible to realize end point detection capable of suppressing the difference in residual film thickness between the substrates even if the polishing situation changes.
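The stop condition of steps S410 to S450 differs from the probability-based variant only in what is compared: polishing stops once the predicted remaining polishing time is 0 or less. The following sketch assumes the feature values have already been calculated (step S430); `toy_remaining_time` is a hypothetical stand-in for the trained model.

```python
from typing import Callable, Iterable, Optional


def stop_when_time_exhausted(
    features: Iterable[float],
    predict_remaining_time: Callable[[float], float],
) -> Optional[int]:
    """Return the index of the first time point whose predicted remaining
    polishing time is 0 or less, or None if no such time point occurs."""
    for index, feature in enumerate(features):       # S420-S430: data and feature
        remaining = predict_remaining_time(feature)  # S440: predicted remaining time
        if remaining <= 0:                           # S450: stop when 0 or less
            return index
    return None


def toy_remaining_time(feature: float) -> float:
    """Stand-in for the trained model (hypothetical linear relation)."""
    return feature - 4.0


print(stop_when_time_exhausted([8.0, 6.0, 4.0, 2.0], toy_remaining_time))  # prints 2
```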
(Step S510) First, the processor 45 loads a trained machine learning model (also referred to as an AI model) from the storage unit 41 into the memory 42.
(Step S520) Next, the processor 45 acquires table torque current data.
(Step S530) Next, the generation unit 451 calculates a feature value from the table torque current data acquired in step S520.
(Step S540) Next, the prediction unit 452 inputs the feature value calculated in step S530 to the trained machine learning model and outputs a predicted value of the remaining polishing time.
(Step S550) In parallel with steps S530 and S540, the processor 45 executes a conventional end point detection process. For example, the processor 45 detects a polishing end point when a time derivative value of the table torque current falls below a preset threshold.
(Step S560) The processor 45 determines whether or not the polishing end point is detected in step S550, and in a case where the polishing end point is not detected (NO in step S560), the process returns to step S520 and the process is repeated.
(Step S570) On the other hand, when the polishing end point is detected (YES in step S560), the predicted value of the remaining polishing time output by the prediction unit 452 at that timing is set as an additional polishing time (also referred to as overpolishing time).
(Step S580) The determination unit 453 determines whether or not the additional polishing time (overpolishing time) has elapsed after the detection of the polishing end point. In a case where the additional polishing time (overpolishing time) has elapsed after the detection of the polishing end point, the determination unit 453 outputs an instruction to stop polishing to the control unit 500, and the control unit 500 that has received the instruction to stop polishing controls the polishing apparatus so as to stop polishing.
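Steps S510 to S580 combine the conventional derivative-based detection with the predicted remaining polishing time. The sketch below is illustrative only: using the raw torque current as the feature value, the finite-difference derivative, clamping a negative prediction to zero, and the stand-in model passed as `predict_remaining_time` are all assumptions made here, not details given in the embodiment.

```python
from typing import Callable, Optional, Sequence


def hybrid_stop_time(
    times: Sequence[float],
    currents: Sequence[float],
    predict_remaining_time: Callable[[float], float],
    derivative_threshold: float,
) -> Optional[float]:
    """Return the time at which polishing should stop, or None if the
    conventional end point is never detected."""
    for k in range(1, len(times)):                        # S520: acquire data
        remaining = predict_remaining_time(currents[k])   # S530-S540 (feature = raw current, assumed)
        derivative = (currents[k] - currents[k - 1]) / (times[k] - times[k - 1])
        if derivative < derivative_threshold:             # S550-S560: conventional detection
            overpolish = max(0.0, remaining)              # S570: additional polishing time (clamp assumed)
            return times[k] + overpolish                  # S580: stop after it has elapsed
    return None


stop_time = hybrid_stop_time(
    times=[0.0, 1.0, 2.0, 3.0, 4.0],
    currents=[5.0, 5.0, 5.0, 3.0, 2.0],
    predict_remaining_time=lambda current: 2.0 * current,  # hypothetical model
    derivative_threshold=-1.5,
)
print(stop_time)  # prints 9.0
```

Here the end point is detected at t = 3.0, the model predicts 6.0 of additional polishing time at that timing, and polishing would stop at t = 9.0.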
Note that the AI unit 4 may be mounted on a gateway in a factory to which the polishing apparatus is connected by a network line. This gateway is preferably in the vicinity of the polishing apparatus. In a case where high-speed processing is required (for example, in a case where the sampling interval is 100 ms or less), the AI unit 4 in the polishing apparatus or the AI unit 4 mounted on the gateway may execute the processing as edge computing. The AI unit 4 in the polishing apparatus may be mounted on a PC or a controller of the apparatus.
Next, a second embodiment will be described. In the first embodiment, the polishing apparatus 10 includes the information processing system having the AI unit 4, whereas the second embodiment differs in that an information processing system S2 having the AI unit 4 is provided not in the polishing apparatus but in a factory management room, a clean room, or the like in a factory.
In a case where the AI unit 4 is provided in the polishing apparatus or the gateway, it is possible to perform high-speed processing by executing the trained machine learning model by edge computing. For example, it is possible to perform processing in real time.
In addition, in a case where the AI unit 4 is mounted on a server or a fog computer in a factory, data of a plurality of polishing apparatuses in the factory may be collected to update the machine learning model. In addition, data of a plurality of polishing apparatuses in the factory may be collected and analyzed, and the analysis result may be reflected in setting polishing parameters.
Next, a third embodiment will be described. In the first embodiment, the polishing apparatus 10 includes the AI unit 4, but in the third embodiment, the AI unit 4 is provided not in the polishing apparatus but in an analysis center.
By providing the AI unit 4 in the analysis center physically separated from the polishing apparatus in this manner, the AI unit 4 can be shared among the plurality of factories, and maintainability of the AI unit 4 is improved. Further, by utilizing data during polishing in a plurality of factories to cause the machine learning model to relearn with a large amount of data, estimation accuracy can be improved more quickly.
In addition, the machine learning model may be updated by collecting data (for example, a large amount of data) of a plurality of polishing apparatuses across a plurality of factories. In addition, data (for example, a large amount of data) of a plurality of polishing apparatuses across a plurality of factories may be collected and analyzed, and the analysis result may be reflected in setting polishing parameters.
Note that the AI unit 4 may be provided in a cloud instead of the analysis center that intensively performs analysis.
A mounting place of the AI unit 4 may be (1) in the polishing apparatus, and/or (2) a gateway in the vicinity of the polishing apparatus, and/or (3) a computer (PC, server, fog computer, and the like) in a factory (for example, in a factory management room).
A mounting place of the AI unit 4 may be (1) in the polishing apparatus, and/or (2) a gateway near the polishing apparatus, and/or (4) a computer in an analysis center (or cloud).
A mounting place of the AI unit 4 may be (1) in the polishing apparatus and/or (2) a gateway in the vicinity of the polishing apparatus, and/or (3) a computer in a factory (for example, in a factory management room), and/or (4) a computer in an analysis center (or cloud).
In addition, each configuration of the AI unit 4 may be dispersedly arranged in (1) the inside of the polishing apparatus and/or (2) the gateway in the vicinity of the polishing apparatus, and/or (3) the computer (PC, server, fog computer, and the like) in the factory (for example, in a factory management room), and/or (4) the computer of the analysis center (or cloud).
Note that, in each embodiment, the input of the machine learning model is a feature value based on data regarding the frictional force between the polishing member and the substrate at each time point during polishing, but is not limited thereto. The input of the machine learning model may be a feature value based on temperature measurement data of the polishing member (here, the polishing pad 101) or the substrate at each time point during polishing. This is because when the frictional force between the polishing member and the substrate during polishing increases, a calorific value of the polishing member or the substrate increases accordingly, and a temperature of the polishing member or the substrate increases, so that the temperature of the polishing member or the substrate has a positive correlation with the frictional force between the polishing member and the substrate during polishing.
For example, in the case of the first embodiment, the storage unit 41 may store a machine learning model trained with a training data set in which at least a feature value based on temperature measurement data of a polishing member or a substrate at each time point during polishing is input, and a polishing amount or a residual film amount at each time point during polishing predicted using at least a film thickness measured after polishing is output.
In this case, the generation unit 451 may generate a feature value using temperature measurement data of the polishing member or a target substrate at a target time point during polishing. Then, the prediction unit 452 may input at least the feature value generated by the generation unit 451 to the trained machine learning model and output a predicted value of the polishing amount or the residual film amount at the target time point during polishing of the target substrate.
In addition, for example, in the case of the first modification of the first embodiment, the storage unit 41 may store a machine learning model trained with a training data set in which at least a feature value based on temperature measurement data of a polishing member or a substrate at each time point during polishing is an input and a polishing end point probability at each time point during polishing is an output.
In this case, the generation unit 451 may generate a feature value using temperature measurement data of the polishing member or a target substrate at a target time point during polishing. Then, the prediction unit 452 may input at least the feature value generated by the generation unit 451 to the trained machine learning model and output the predicted value of the polishing end point probability at the target time.
In addition, for example, in the case of the second modification of the first embodiment, the storage unit 41 may store a machine learning model trained with a training data set in which at least a feature value based on the temperature measurement data of the polishing member or the substrate at each time point during polishing is input, and a remaining polishing time or an additional polishing time from end point detection timing determined so that the residual film thickness or the polishing amount becomes a target value is output.
In this case, the generation unit 451 may generate a feature value using temperature measurement data of the polishing member or a target substrate at a target time point during polishing. Then, the prediction unit 452 may input at least the feature value generated by the generation unit 451 to the trained machine learning model and output a predicted value of the remaining polishing time or the additional polishing time from the end point detection timing.
Note that at least a part of the AI unit 4 described in the above-described embodiment may be configured by hardware or software. In a case where the AI unit 4 is configured by software, a program for realizing at least some functions of the AI unit may be stored in a recording medium such as a flexible disk or a CD-ROM, and may be read and executed by a computer. The recording medium is not limited to a removable recording medium such as a magnetic disk or an optical disk, and may be a fixed recording medium such as a hard disk device or a memory.
Furthermore, the program for realizing at least some functions of the AI unit 4 may be distributed via a communication line (including wireless communication) such as the Internet. Further, the program may be distributed via a wired or wireless line such as the Internet, or stored in a recording medium, in an encrypted, modulated, or compressed state.
Furthermore, the AI unit 4 may be caused to function by one or a plurality of information processing apparatuses. In a case where a plurality of information processing apparatuses is used, at least one of the information processing apparatuses may be a computer, and the computer may execute a predetermined program to implement a function as at least one means of the AI unit 4.
In an invention of a method, all the processes (steps) may be realized by automatic control by a computer. In addition, progress control between the processes may be performed manually while the computer performs each process. Furthermore, at least some of the steps may be performed manually.
Note that, in the above embodiment, as illustrated in
In addition, the present technology may be used not only for determining the end of polishing, but also for changing a polishing condition (for example, a polishing pressure or the like) in a case where the polishing amount or the residual film amount predicted during polishing deviates from a predetermined condition, and for example, polishing may be performed so that a target polishing amount is obtained without increasing a polishing time.
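One conceivable way to change the polishing pressure when the predicted polishing amount deviates from plan is sketched below. It is purely illustrative and not described in the embodiment: it assumes that the removal rate is roughly proportional to pressure (as in Preston's equation), and the function name, parameters, units, and pressure limits are all hypothetical.

```python
def adjusted_pressure(
    pressure: float,
    elapsed_time: float,
    planned_end_time: float,
    predicted_amount: float,
    target_amount: float,
    min_pressure: float = 0.5,
    max_pressure: float = 5.0,
) -> float:
    """Scale the pressure so that the remaining amount is removed in the
    remaining planned time, assuming removal rate is proportional to pressure."""
    remaining_time = planned_end_time - elapsed_time
    if elapsed_time <= 0 or remaining_time <= 0 or predicted_amount <= 0:
        return pressure  # not enough information to adjust
    current_rate = predicted_amount / elapsed_time
    required_rate = (target_amount - predicted_amount) / remaining_time
    new_pressure = pressure * required_rate / current_rate
    # Clamp to an assumed safe operating range.
    return min(max_pressure, max(min_pressure, new_pressure))


# Halfway through a 50 s recipe, only 40 of the 100 target units are predicted
# to have been removed, so the pressure is raised from 2.0 to about 3.0.
print(adjusted_pressure(2.0, 25.0, 50.0, 40.0, 100.0))
```

With such an adjustment, the target polishing amount could be approached at the planned end time rather than by extending the polishing time.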
As described above, the present technology is not limited to the above-described embodiment as it is, and can be embodied by modifying the components without departing from the gist of the present technology at an implementation stage. In addition, various inventions can be formed by appropriately combining a plurality of constituent elements disclosed in the above embodiment. For example, some components may be deleted from all the components illustrated in the embodiments. Furthermore, constituent elements in different embodiments may be appropriately combined.
Number | Date | Country | Kind |
---|---|---|---|
2020-104283 | Jun 2020 | JP | national |