This application is based upon and claims the benefit of priority from Japanese Patent Application No. 2023-131894, filed on Aug. 14, 2023; the entire contents of which are incorporated herein by reference.
Embodiments described herein relate generally to an information processing apparatus, an information processing method, and a computer program product.
In factories (such as semiconductor factories) and plants (such as chemical plants), monitoring of quality characteristics, grasping of trend changes and abnormalities, examination of countermeasures, and the like are performed by utilizing manufacturing big data in order to improve productivity, yield, and reliability.
For example, a regression model is used as an analysis method of big data. The regression model is, for example, a model in which process data such as a sensor value, a control value, and a setting value is used as an explanatory variable and a quality characteristic is used as an objective variable. In factories and plants, it may be necessary to update a regression model according to an event such as maintenance of equipment, a change in trend of process data, and the like, and it is desired to efficiently manage the regression model.
According to an embodiment, an information processing apparatus includes a processing unit configured to: detect whether or not one or more conditions defining timings to perform learning of a regression model configured to predict one or more objective variables for a plurality of explanatory variables are satisfied; determine priorities of the plurality of explanatory variables according to a condition detected to be satisfied; and perform learning of the regression model by using an objective function and learning data, the objective function including a regularization term having a regularization strength changing according to the priorities.
Hereinafter, a preferred embodiment of an information processing apparatus according to the present invention will be described in detail with reference to the accompanying drawings.
The information processing apparatus of the present embodiment can be applied to, for example, management (including construction, updating, learning, etc.) of a model used in a system that performs quality management in factories and plants. Applicable systems are not limited thereto.
In factories, plants, and the like, a regression model is utilized for the following four applications, for example, in order to improve productivity, yield, and reliability.
(1) Defect factor analysis: The influence of the process on quality characteristics can be grasped, and factors of yield reduction and quality variation can be identified.
(2) Soft Sensor: Quality characteristics that are difficult to measure or impossible to measure physically can be estimated from the process data.
(3) Control and Adjustment: Certain processes can be controlled and adjusted to achieve desired quality characteristics.
(4) Detection of Abnormality and Change: The prediction error (prediction residual) of the regression model or the change of the regression model itself can be monitored to detect the abnormality and the change.
The regression model is a model that predicts one or more objective variables for a plurality of explanatory variables. The regression model may be any parametric regression model such as a linear regression model, a logistic regression model, a Poisson regression model, a generalized linear regression model, a spline regression model, a generalized additive model, or a neural network. Here, the prediction represents prediction of an objective variable from an explanatory variable, and is not necessarily limited to future forecasting, and may be estimation of a past value.
The objective variable is, for example, a quality characteristic, a defect rate, or a variable indicating whether a product is non-defective or defective. The objective variable may be a sensor value detected by a sensor. The explanatory variables are other sensor values, setting values, control values, and the like. Preprocessing may be applied to the explanatory variables in advance. The preprocessing is, for example, standardization, normalization, conversion by a specific function, addition of an interaction term, time lag, time lead, dummy variable conversion, encoding, outlier processing, missing value processing, and the like.
Data (such as process data) including the objective variable and the explanatory variable is stored in a data management system, a database, and the like.
First, a case where the regression model has one objective variable will be described. It is assumed that there are n pieces of data in total (n is an integer of 2 or more), and each piece of data includes p explanatory variables and one objective variable. That is, data is represented by (xi, yi), xi ∈Rp, yi ∈R, i=1, . . . , n. Here, xi is an explanatory variable of the p-dimensional vector, yi is a scalar objective variable, and R denotes the set of real numbers.
At this time, the regression model coefficients β̂0∈R and β̂∈Rp can be estimated by the least squares method as in the following Equation (1). Note that a variable with a hat symbol "^" indicates an estimated value. The same applies to the following variables. In addition, the symbol "T" represents transposition.
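Equation (1) is not rendered in this text; assuming the ordinary least-squares formulation the description implies, it presumably takes the standard form:

```latex
(\hat{\beta}_0, \hat{\beta})
= \mathop{\mathrm{arg\,min}}_{\beta_0 \in \mathbb{R},\; \beta \in \mathbb{R}^{p}}
\sum_{i=1}^{n} \left( y_i - \beta_0 - x_i^{T} \beta \right)^{2}
\tag{1}
```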
In addition, in a case of p>>n, the coefficients β̂0 and β̂ can be estimated by minimizing a square error (an example of a loss function) with L1 regularization as shown in the following Equation (2). Note that λ represents a regularization parameter.
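Assuming the standard Lasso formulation referenced below, Equation (2) can be written as:

```latex
(\hat{\beta}_0, \hat{\beta})
= \mathop{\mathrm{arg\,min}}_{\beta_0,\; \beta}\;
\frac{1}{2n} \sum_{i=1}^{n} \left( y_i - \beta_0 - x_i^{T} \beta \right)^{2}
+ \lambda \sum_{j=1}^{p} \left| \beta_j \right|
\tag{2}
```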
The method shown in Equation (2) is called the Lasso (for example, Tibshirani, R. (1996). "Regression shrinkage and selection via the Lasso." Journal of the Royal Statistical Society: Series B (Methodological), 58(1), 267-288.); many regression coefficients are estimated to be exactly zero, and a sparse regression coefficient vector is obtained. Therefore, generalization performance on high-dimensional data and interpretability of the model can be improved.
The loss function is not limited to the function representing the square error as described above, and may be any other loss function. In addition, the regularization is not limited to the L1 regularization, and any other regularization may be used.
In quality management in factories, plants, and the like, a regression model is not finished once it has been constructed. When a predetermined condition is satisfied, it is necessary to reconstruct (update) the regression model. The condition is, for example, one or more conditions defining the timings to perform learning of the regression model, and is, for example, the following condition.
As a method of updating the regression model, a method of collecting new data and estimating the regression model from the beginning can be considered. However, such a method may cause problems such as an increased load of calculation processing, large changes in the result of variable selection, unstable changes in the regression coefficients, and an increased load for checking the regression model.
Therefore, a method of updating the model only in a necessary portion (for example, M. Takada et al., "Transfer Learning via ℓ1 Regularization", Advances in Neural Information Processing Systems (NeurIPS 2020), 33, 14266-14277., hereinafter referred to as the method MA) can be utilized. The method MA is a method of estimating and updating a regression model as shown in the following Equation (3) by using a regression model β−∈Rp before update. Note that α represents a new regularization parameter.
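One plausible form of Equation (3), consistent with the description of the second term (variable selection) and third term (penalizing deviation from the previous model β−) given later, is the following; the exact parameterization in the cited paper may differ slightly:

```latex
(\hat{\beta}_0, \hat{\beta})
= \mathop{\mathrm{arg\,min}}_{\beta_0,\; \beta}\;
\frac{1}{2n} \sum_{i=1}^{n} \left( y_i - \beta_0 - x_i^{T} \beta \right)^{2}
+ \lambda \sum_{j=1}^{p} \left| \beta_j \right|
+ \alpha \sum_{j=1}^{p} \left| \beta_j - \beta_j^{-} \right|
\tag{3}
```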
With this formulation, it is possible to realize an update method in which only the minimum necessary parameters are updated and the other parameters are not updated.
When the method MA is used, the parameters to be updated are automatically determined from the data. Consequently, in a case where the noise included in the data is large, the collinearity of the data is strong, or the data is high-dimensional, an inappropriate parameter may be updated.
For example, when maintenance is performed on a specific machine, only a parameter (regression coefficient) related to the machine should be updated, but in some cases, a parameter not related to the machine may be updated. In addition, when a specific processing condition is changed, it is desirable to update only parameters related to the processing condition, but other parameters may be updated.
Therefore, in the present embodiment, when a condition (such as occurrence of an event) defining a timing to perform learning (updating) of the regression model is satisfied, a parameter (corresponding to regression coefficient and explanatory variable) that needs to be updated is specified according to the condition, and the specified parameter is preferentially updated. As a result, only necessary portions (parameters) can be more appropriately specified and updated. That is, the regression model can be managed more efficiently.
More specifically, in the present embodiment, a priority of the parameter (explanatory variable) is changed according to the condition, and the regression model is updated on the basis of the priority. As a result, it is possible to stably and efficiently manage (operate) the regression model while realizing the update of the regression model matching the domain knowledge. Hereinafter, an example of using a condition indicating that an event has occurred will be mainly described, but a similar method can be applied to conditions other than the occurrence of an event.
Note that the present embodiment can be applied not only when updating an already constructed regression model but also when newly constructing a regression model. Hereinafter, it is assumed that the learning of the regression model includes both the update of the already constructed (learned) regression model and the construction of the new regression model.
The model storage unit 121 stores information related to the regression model such as parameters (regression coefficients and the like) of the regression model. For example, the model storage unit 121 stores information of a regression model learned (constructed or updated) in the past.
The event storage unit 122 stores event information indicating an event that has occurred.
One event may be associated with a plurality of event types. Similarly, one event type may be associated with a plurality of events. That is, an event and an event type may have a one-to-one, one-to-many, many-to-one, or many-to-many relationship.
Returning to
As illustrated in
The selection priority corresponds to, for example, a priority with respect to the second term on the right side of Equation (3) indicating the method MA. This is because the second term is represented by p regression coefficients βj to be estimated corresponding to the p explanatory variables xij (1≤j≤p). The update priority corresponds to, for example, a priority with respect to the third term on the right side of Equation (3). This is because the third term is represented by a difference between the p regression coefficients βj to be estimated and the p regression coefficients β−j estimated last time.
The priority may be any one of the selection priority and the update priority. For example, when Equation (2) is used instead of Equation (3), only the selection priority corresponding to the second term on the right side of Equation (2) may be used.
In the example of the correspondence information 123a of
Each priority may be expressed in any way, but may be expressed at a plurality of levels such as “large, medium, small”, or may be expressed by the magnitude of a numerical value.
As illustrated in
By referring to the correspondence information 123a and 123b illustrated in
The event type and the variable type of the correspondence information 123a may have a one-to-one, one-to-many, many-to-one, or many-to-many relationship. Similarly, the variable type and the variable of the correspondence information 123b may have a one-to-one, one-to-many, many-to-one, or many-to-many relationship.
Note that the correspondence information illustrated in
In addition, correspondence information that does not explicitly define the priority may be used. For example, correspondence information in which an event type and a variable type are associated with each other may be used instead of the correspondence information 123a. Alternatively, correspondence information in which an event and a variable are associated with each other may be used instead of the correspondence information 123a and 123b. In this case, different priorities may be determined depending on whether or not it is described in the correspondence information. For example, a high (or low) priority may be determined for a variable described in the correspondence information, and a low (or high) priority may be determined for a variable not described in the correspondence information.
Returning to
For example, the sensor data includes one piece of sensor data corresponding to the objective variable and a plurality of pieces of sensor data corresponding to the explanatory variables. As described above, the explanatory variable is not limited to the sensor data (sensor value), and may be a setting value, a control value, or the like. Hereinafter, an example in which sensor data is used as an explanatory variable will be mainly described.
Measurement points W1 and W2 are information indicating positions corresponding to the sensor data. In the example of
Returning to
Note that the sensor data illustrated in
Returning to
Note that each storage unit (the model storage unit 121, the event storage unit 122, the correspondence information storage unit 123, the sensor data storage unit 124, and the generated data storage unit 125) can be configured by any commonly used storage medium such as a flash memory, a memory card, a random access memory (RAM), a hard disk drive (HDD), and an optical disc.
Each storage unit may be a physically different storage medium or may be realized as different storage areas of physically the same storage medium. Furthermore, each of the storage units may be realized by a plurality of physically different storage media.
The model acquisition unit 101 acquires information regarding the regression model to be learned (constructed and updated) from the model storage unit 121.
The event detection unit 102 detects whether or not one or more conditions defining the timings to perform learning of the regression model are satisfied. For example, the event detection unit 102 detects whether or not a predetermined event has occurred. The detection method may be any method, and for example, the following method can be used.
The event detection unit 102 may detect that an event has occurred, for example, when event information is stored in the event storage unit 122 by another system or the like. The event detection unit 102 may read the event information stored in the event storage unit 122 periodically or when specified, and detect a predetermined event from the read event information.
The priority determination unit 103 determines the priority of each of the plurality of explanatory variables according to the condition detected to be satisfied. For example, when an event is detected by the event detection unit 102, the priority determination unit 103 determines an explanatory variable related to the event and the priority of the explanatory variable according to the detected event.
The priority determination unit 103 can determine the priority of the explanatory variable by using the event information stored in the event storage unit 122 and the correspondence information (the correspondence information 123a and 123b) stored in the correspondence information storage unit 123. First, the priority determination unit 103 extracts an event type corresponding to the detected event with reference to the event information. The priority determination unit 103 extracts a variable type corresponding to the extracted event type and priority (selection priority or update priority) with reference to the correspondence information 123a. The priority determination unit 103 extracts a variable (explanatory variable) corresponding to the extracted variable type with reference to the correspondence information 123b. The priority determination unit 103 decides the priority corresponding to the variable type of the extracted variable.
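The lookup chain described above (detected event → event type → variable type and priority → explanatory variables) can be sketched as follows. All table contents and names here (event names, variable types, priority levels) are hypothetical examples, not values from the embodiment.

```python
# Event information (event storage unit 122): detected event -> event type.
# Contents are hypothetical examples.
event_info = {"maintenance of pump A": "pump maintenance"}

# Correspondence information 123a:
# event type -> (variable type, selection priority, update priority)
correspondence_a = {"pump maintenance": ("pump sensors", "large", "large")}

# Correspondence information 123b: variable type -> explanatory variables
correspondence_b = {"pump sensors": ["sensor_1", "sensor_2"]}

def determine_priorities(event):
    """Return {explanatory variable: (selection priority, update priority)}
    for a detected event, by chaining the three tables above."""
    event_type = event_info[event]
    var_type, sel_pri, upd_pri = correspondence_a[event_type]
    return {v: (sel_pri, upd_pri) for v in correspondence_b[var_type]}

print(determine_priorities("maintenance of pump A"))
```

Variables not reached through the tables can simply be given a default ("as usual") priority, matching the fallback behavior described for correspondence information without explicit priorities.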
In this manner, the priority determination unit 103 determines the priority corresponding to the detected condition (event) for the explanatory variable corresponding to the detected condition (event) by using the correspondence information in which the condition (event), the explanatory variable, and the priority are associated with each other.
The priority determination unit 103 may further determine the priority in consideration of factors other than the detected event. For example, the priority determination unit 103 obtains a change of each of the plurality of explanatory variables between the learning data and the test data. The change in an explanatory variable can also be interpreted as a change in the distribution of the explanatory variable. The priority determination unit 103 determines the level of the priority according to the magnitude of the obtained change. For example, the priority determination unit 103 may set the priority of an explanatory variable whose obtained change is larger than that of the other explanatory variables to a value larger than the priorities of the other explanatory variables. Conversely, the priority determination unit 103 may set the priority of such an explanatory variable to a value smaller than the priorities of the other explanatory variables.
The test data is data serving as an input of prediction using the regression model after learning. The test data is, for example, sensor data (such as sensor data detected after learning based on the learning data) different from the learning data.
The data generation unit 104 generates learning data. For example, the data generation unit 104 generates part of the sensor data stored in the sensor data storage unit 124 as learning data. In a case where the sensor data is used as it is as the learning data, the generated data storage unit 125 and the data generation unit 104 are not necessarily provided.
The data generation unit 104 may generate the learning data in consideration of the priority. For example, the data generation unit 104 generates learning data including one or more explanatory variables having a higher priority than other explanatory variables among the plurality of explanatory variables and an objective variable. As the priority, for example, a value determined by the priority determination unit 103 before generation of learning data to be used for next learning can be referred to.
The model learning unit 105 performs learning of the regression model using the learning data. When the learning data is generated by the data generation unit 104, the model learning unit 105 performs learning of the regression model using the generated learning data. In addition, the model learning unit 105 performs learning of the regression model so as to optimize the objective function including the regularization term (penalty term) having a regularization strength changing according to the priority. The model learning unit 105 stores information regarding the learned regression model in the model storage unit 121.
Details of learning of the regression model by the model learning unit 105 will be described below. The learning data is represented as (xi,yi), xi ∈Rp, yi ∈R, i=1, . . . , n. In addition, a regression model learned in the past is represented as β−∈Rp. The weight based on the selection priority and the weight based on the update priority are represented as uj and vj (j=1, . . . , p), respectively.
In a case where the selection priority and the update priority are represented by numerical values, the model learning unit 105 can use the numerical values themselves, or values calculated from them, as the weights uj and vj. In a case where the selection priority and the update priority are not represented by numerical values, the model learning unit 105 converts them into numerical values to obtain the weights uj and vj. Each weight is a real number of 0 or more, and a larger value indicates a lower priority.
Similarly, the weight vj based on the update priority is set to any of ¼, ½, 1, 2, and 4. These values correspond to the explanatory variable being "almost always changing", "actively changing", "as usual", "actively constant", and "almost always constant", respectively.
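The level-to-weight correspondence above can be sketched as a simple lookup table; the dictionary and variable names are hypothetical, but the numerical values are those listed in the description.

```python
# Update-priority levels -> weights v_j (larger weight = lower update priority).
UPDATE_WEIGHT = {
    "almost always changing": 0.25,
    "actively changing": 0.5,
    "as usual": 1.0,
    "actively constant": 2.0,
    "almost always constant": 4.0,
}

# Example: two explanatory variables with different update priorities.
v = [UPDATE_WEIGHT[level] for level in ["as usual", "almost always constant"]]
print(v)  # -> [1.0, 4.0]
```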
The model learning unit 105 performs learning of the regression model by using learning data (xi, yi), a regression model β−, the weight uj, and the weight vj. In the present embodiment, a method (hereinafter, referred to as a method MB) in which a weight based on the priority is added to the method MA is adopted. The method MB is expressed by the following Equation (4).
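Following the reconstruction of the method MA above, Equation (4) presumably adds the weights uj and vj to the two regularization terms:

```latex
(\hat{\beta}_0, \hat{\beta})
= \mathop{\mathrm{arg\,min}}_{\beta_0,\; \beta}\;
\frac{1}{2n} \sum_{i=1}^{n} \left( y_i - \beta_0 - x_i^{T} \beta \right)^{2}
+ \lambda \sum_{j=1}^{p} u_j \left| \beta_j \right|
+ \alpha \sum_{j=1}^{p} v_j \left| \beta_j - \beta_j^{-} \right|
\tag{4}
```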
In the method MB, weights uj based on the selection priorities and weights vj based on the update priorities are added to the regularization term as compared with the method MA. The priorities of variable selection and variable update are adjusted by these weights.
As such, in the method MB, the objective function includes the following three terms: a loss term (square error), a selection regularization term weighted by the weights uj, and an update regularization term weighted by the weights vj.
When uj=1 and vj=1 for all the explanatory variables (all j), the method MB matches the method MA.
When the value of uj is large, a large penalty is applied when the regression coefficient becomes non-zero, and the selection of the variable is suppressed. Similarly, when the value of vj is large, a large penalty is applied to the update of the regression coefficient from the regression model β− before the update, and the update of the regression coefficient is suppressed.
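As a concrete illustration of how an objective of the form of Equation (4) can be optimized, the following is a minimal proximal-gradient (ISTA-style) sketch. It is not the embodiment's implementation: the intercept is omitted, the function and parameter names (fit_method_mb, lam, alpha, n_iter) are hypothetical, and the settings are illustrative.

```python
import numpy as np

def prox_two_l1(t, lam_u, alpha_v, c):
    """Proximal operator of g(b) = lam_u*|b| + alpha_v*|b - c| at point t.
    0.5*(b - t)**2 + g(b) is convex and piecewise quadratic with kinks at
    0 and c, so the exact minimizer is among the kinks and the
    sign-consistent stationary points of each piece."""
    candidates = [0.0, c]
    for s1 in (-1.0, 1.0):        # assumed sign of b
        for s2 in (-1.0, 1.0):    # assumed sign of b - c
            b = t - s1 * lam_u - s2 * alpha_v
            if s1 * b >= 0 and s2 * (b - c) >= 0:  # keep sign-consistent points
                candidates.append(b)
    obj = lambda b: 0.5 * (b - t) ** 2 + lam_u * abs(b) + alpha_v * abs(b - c)
    return min(candidates, key=obj)

def fit_method_mb(X, y, beta_prev, u, v, lam=0.1, alpha=0.1, n_iter=500):
    """Proximal-gradient sketch of the weighted objective: squared error plus
    lam * sum_j u_j|beta_j| + alpha * sum_j v_j|beta_j - beta_prev_j|."""
    n, p = X.shape
    beta = beta_prev.copy()
    step = 1.0 / (np.linalg.norm(X, 2) ** 2 / n)   # 1 / Lipschitz constant
    for _ in range(n_iter):
        grad = X.T @ (X @ beta - y) / n
        t = beta - step * grad
        beta = np.array([
            prox_two_l1(t[j], step * lam * u[j], step * alpha * v[j], beta_prev[j])
            for j in range(p)
        ])
    return beta
```

The per-coordinate proximal step handles the two L1 terms, centered at 0 and at the previous coefficient, exactly; this is what lets a large weight vj pin a coefficient to its previous value while a large uj drives it to zero.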
Since the priority is adjusted according to the detected event (condition), it is possible to select and update an appropriate explanatory variable for the event. Since only the necessary explanatory variables are selected or only the necessary explanatory variables are updated, for example, an access load to each storage unit, a network load, or a calculation processing load can be reduced. In addition, the model can be stably updated even when the noise is large, when the data collinearity is high, or when the number of data is small. Furthermore, since the model update is consistent with the event information, the management cost for the model update can be reduced.
Note that, in a case of constructing the regression model first, since the learned regression model β− does not exist, the model learning unit 105 may perform learning of the regression model by, for example, a method not using the learned regression model β− as in the above Equation (2).
The prediction unit 111 executes prediction processing using the regression model after learning. For example, the prediction unit 111 executes prediction processing of predicting the objective variable for the test data using the regression model after learning.
The prediction unit 111 may estimate whether or not an object related to the test data is in a specific state (for example, abnormal) using the prediction result (prediction value) of the regression model. The object is, for example, a specific facility of a factory or plant. The prediction unit 111 may detect an abnormality of the object based on the prediction error of the objective variable predicted using the regression model. The prediction unit 111 estimates that an abnormality has occurred in the object, for example, in a case where the prediction error is larger than a threshold value.
Note that the prediction processing may be executed by an external device (prediction device) of the information processing apparatus 100. In this case, the information processing apparatus 100 does not necessarily include the prediction unit 111.
The output control unit 112 controls output of various types of information used in the information processing apparatus 100. For example, the output control unit 112 outputs a result of the prediction processing by the prediction unit 111. The output method may be any method, and for example, a method of transmitting to an external device via a network and a method of displaying on a display device such as a liquid crystal display can be applied.
At least a part of each unit (the model acquisition unit 101, the event detection unit 102, the priority determination unit 103, the data generation unit 104, the model learning unit 105, the prediction unit 111, and the output control unit 112) may be realized by one processing unit. Each of the above units is realized by, for example, one or a plurality of processors. For example, each of the above units may be realized by causing a processor such as a central processing unit (CPU) or a graphics processing unit (GPU) to execute a program, that is, by software. Each of the above units may be realized by a processor such as a dedicated integrated circuit (IC), that is, by hardware. Each of the above units may be realized by using software and hardware in combination. When a plurality of processors is used, each processor may realize one of the units or two or more of the units.
Furthermore, the information processing apparatus 100 may be physically configured by one apparatus or may be physically configured by a plurality of apparatuses. For example, the information processing apparatus 100 may be constructed on a cloud environment.
Next, learning processing by the information processing apparatus 100 according to the embodiment will be described.
In a case where the event detection unit 102 detects an event, the learning processing is started (Step S101). The priority determination unit 103 determines the priority of the explanatory variable using the correspondence information according to the detected event (Step S102).
The data generation unit 104 generates learning data used for learning of the regression model (Step S103). The model acquisition unit 101 acquires information of the learned regression model from, for example, the model storage unit 121 (Step S104). The model learning unit 105 performs learning of the parameters of the regression model using the learning data generated in Step S103 so as to minimize the objective function using the priority determined in Step S102 (Step S105).
As described above, in the information processing apparatus according to the embodiment, when the condition defining the timing to perform learning of the regression model is satisfied, the priority of the explanatory variable is determined according to the condition, and the regression model is learned using the objective function including the regularization term according to the priority. As a result, the regression model can be managed more efficiently.
Many regression models may be required in factories and plants. For example, in a semiconductor factory, not only one product but various products are produced on the same line, and the tendency differs for each product to be manufactured (variety and model number). In addition, there are various quality characteristics for one product. Quality characteristics of a semiconductor used as the objective variable of the regression model include, for example, electrical characteristic values and trench sizes. Furthermore, the tendency of these quality characteristics varies depending on the measured position of a device, a chamber, a wafer, a chip, or the like. Therefore, a regression model representing each quality characteristic is required.
Therefore, a technique for constructing an integrated model in which a plurality of regression models is integrated has been proposed. With such a technique, it is possible to estimate the integrated model using a plurality of pieces of data. However, in the conventional technique, since there is no means for updating the constructed model, it is necessary to recreate the integrated model from scratch when an update is necessary. In the method of recreating the integrated model, problems such as unstable changes of the integrated model, long construction (recreation) processing, and an increased burden on workers for construction and verification may occur.
An integrated model obtained by integrating a plurality of regression models can be interpreted as a model that predicts a plurality of objective variables. Therefore, in the modification, an example of managing a regression model that predicts a plurality of objective variables will be described. Basically, learning (constructing, or updating) of the regression model can be performed by a method similar to that of the above embodiment.
An example of a plurality of regression models will be described. Assuming that there are K objective variables (K is an integer of 2 or more), data including the objective variable and the explanatory variable is represented by (xi, yi), xi ∈Rp, yi ∈RK, i=1, . . . , n. The regression model is estimated as the following Equation (5).
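A plausible form of Equation (5), assuming each of the K models is estimated independently by least squares as in Equation (1), is:

```latex
(\hat{\beta}_{0k}, \hat{\beta}_{k})
= \mathop{\mathrm{arg\,min}}_{\beta_{0k},\; \beta_{k}}
\sum_{i=1}^{n} \left( y_{ik} - \beta_{0k} - x_i^{T} \beta_{k} \right)^{2},
\qquad k = 1, \dots, K
\tag{5}
```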
There are K regression models, and the number of regression coefficients (the number of parameters) including the intercept is (p+1)×K. In this way, when each of the plurality of regression models is individually constructed, the number of regression coefficients becomes enormous, and construction and management thereof become complicated.
Therefore, by introducing sparsity, the number of parameters is reduced to improve and stabilize generalization performance.
For example, when there are K types of objective variables, a K-dimensional dummy variable zi ∈{0, 1}K representing the type of the i-th data is introduced. This dummy variable is obtained by a dummy variable conversion called one-hot encoding of a categorical variable; a (K−1)-dimensional dummy variable conversion or another encoding method may also be used.
A categorical variable is a variable that can take a plurality of levels (values) for a certain category. The category and the levels may be any combination, and for example, the following combinations can be used.
The dummy variable conversion is executed, for example, as follows.
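As a minimal sketch, the one-hot dummy variable conversion can be implemented as follows, assuming K = 3 hypothetical type names.

```python
# Hypothetical list of K = 3 types (levels of the categorical variable).
types = ["chamber 1", "chamber 2", "chamber 3"]

def one_hot(type_name):
    """Convert a categorical level into the K-dimensional 0/1 dummy vector z_i."""
    return [1 if t == type_name else 0 for t in types]

print(one_hot("chamber 2"))  # -> [0, 1, 0]
```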
At this time, assuming that a regression coefficient common to types and an intercept different for each type are provided, the regression model is expressed by the following Equation (6).
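Given the parameter count (1+p+K) stated below, Equation (6) presumably takes the form of a common intercept β0, a common slope β, and a type-specific intercept offset z_iᵀγ:

```latex
\hat{y}_i = \hat{\beta}_0 + z_i^{T} \hat{\gamma} + x_i^{T} \hat{\beta},
\qquad \hat{\gamma} \in \mathbb{R}^{K},\; \hat{\beta} \in \mathbb{R}^{p}
\tag{6}
```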
Here, γ∈RK, and the k-th element γk represents an intercept when the type of the objective variable is k. As a result, the number of parameters is (1+p+K). Furthermore, if the regression coefficients differ for each type, a model represented by the following Equation (7) can be considered as the regression model.
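Consistent with the parameter count (1+p+K+pK) stated below, Equation (7) presumably adds a type-specific coefficient vector γj ∈ RK for each explanatory variable j (γ0 here denotes the type-specific intercepts of Equation (6); the naming is an assumption):

```latex
\hat{y}_i = \hat{\beta}_0 + z_i^{T} \hat{\gamma}_0
+ \sum_{j=1}^{p} x_{ij} \left( \hat{\beta}_j + z_i^{T} \hat{\gamma}_j \right),
\qquad \hat{\gamma}_0, \hat{\gamma}_j \in \mathbb{R}^{K}
\tag{7}
```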
At this time, the number of parameters is (1+p+K+pK). Although the number of parameters increases in this form, the number of effective parameters can be reduced by estimating many parameters to be zero through appropriate regularization as shown in the following Equation (8) (for example, Tibshirani, R., & Friedman, J. (2020). "A pliable Lasso." Journal of Computational and Graphical Statistics, 29(1), 215-225.).
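Following the pliable Lasso cited above, Equation (8) presumably takes a form such as the following, where ρ ∈ [0, 1] is a mixing parameter (written here as ρ to avoid confusion with the α of Equations (3) and (4)):

```latex
\min_{\beta_0, \gamma_0, \beta, \gamma}\;
\frac{1}{2n} \sum_{i=1}^{n}
\left( y_i - \beta_0 - z_i^{T}\gamma_0
- \sum_{j=1}^{p} x_{ij}\left(\beta_j + z_i^{T}\gamma_j\right) \right)^{2}
+ (1-\rho)\,\lambda \sum_{j=1}^{p}
\left( \left\| (\beta_j, \gamma_j) \right\|_2 + \left\| \gamma_j \right\|_2 \right)
+ \rho\,\lambda \sum_{j=1}^{p} \sum_{k=1}^{K} \left| \gamma_{jk} \right|
\tag{8}
```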
Here, the first term of the objective function corresponding to the right side of Equation (8) is a square error, the second term is a term (regularization term) sparsifying the common regression coefficient and the individual regression coefficient with each variable j, and the third term is a term (regularization term) sparsifying the individual regression coefficient.
In particular, the second term is a formulation that constrains “γjk may be non-zero only when βj is non-zero (there is almost no case where βj is zero even though γjk is non-zero)”. By using such regularization, effective (non-zero) parameters are reduced, and generalization performance, stabilization, and interpretability can be improved.
Such a method is also called multi-task learning in the sense that there are a plurality of objective variables, and various methods other than the above have been proposed (for example, Japanese Patent No. 6480022; Yuan, M., & Lin, Y. (2006). “Model selection and estimation in regression with grouped variables.” Journal of the Royal Statistical Society: Series B (Statistical Methodology), 68(1), 49-67; Obozinski, G., Wainwright, M. J., & Jordan, M. I. (2008). “High-dimensional union support recovery in multivariate regression.” Advances in Neural Information Processing Systems, 21(3); Lee, S., Zhu, J., & Xing, E. (2010). “Adaptive multi-task Lasso: with application to eQTL detection.” Advances in Neural Information Processing Systems, 23; and Zhang, K., Zhe, S., Cheng, C., Wei, Z., Chen, Z., Chen, H., . . . & Ye, J. (2016, August). “Annealed sparsity via adaptive and dynamic shrinking.” In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 1325-1334)).
In addition, such a method can be interpreted as extending the original learning data (xij) to new learning data (xij, zik, xijzik) using the dummy variables, and estimating an integrated regression model by applying sparse regularization to the extended learning data.
For example, zi1 is a variable representing, by one or zero, whether or not the objective variable is the quality characteristic of a chamber 1, and xi1 is a variable of the sensor 1. In this case, the new variable xi1zi1 is a variable representing “sensor 1×chamber 1”: it takes the value zero in cases other than the chamber 1, and takes the value of the sensor 1 itself in the case of the chamber 1.
That is, by extending the data, an integrated model in which a plurality of regression models are integrated can be constructed. In implementing the algorithm, if the learning data is extended as preprocessing, general methods and packages for regularized optimization problems can be used.
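As a minimal sketch of this preprocessing (the function name and example values are illustrative), the extension from (xij) to (xij, zik, xijzik) can be written, for example, as follows.

```python
import numpy as np

def extend_data(X, Z):
    """Extend learning data X (n x p) with dummy variables Z (n x K):
    returns [X, Z, all interaction columns x_ij * z_ik], shape (n, p + K + p*K)."""
    n, p = X.shape
    _, K = Z.shape
    # Interaction terms x_ij * z_ik for every (j, k) pair, via broadcasting.
    inter = (X[:, :, None] * Z[:, None, :]).reshape(n, p * K)
    return np.hstack([X, Z, inter])

X = np.array([[2.0], [3.0]])      # p = 1 sensor value per sample
Z = np.array([[1, 0], [0, 1]])    # K = 2 chambers, one-hot encoded
print(extend_data(X, Z))
# Row 1: sensor=2 in chamber 1 -> [2, 1, 0, 2, 0]
# Row 2: sensor=3 in chamber 2 -> [3, 0, 1, 0, 3]
```

The interaction column “sensor 1×chamber 1” is zero except for samples of the chamber 1, where it equals the sensor 1 value itself, as described above.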
Y represents sensor data (such as quality characteristics) corresponding to the objective variable. The sensor 1 is sensor data measured by a sensor identified by “sensor 1”. “Chamber 1” and “chamber 2” correspond to dummy variables representing the type of Y that is an objective variable. For example, when the value of “chamber 1” is one, it indicates that a semiconductor product is manufactured in a chamber corresponding to “chamber 1”. When the value of “chamber 2” is one, it indicates that a semiconductor product is manufactured in a chamber corresponding to “chamber 2”. That is, this example can be interpreted as an example of data for an integrated model for predicting two quality characteristics of products manufactured in two different chambers. In this integrated model, the two quality characteristics are represented by one objective variable Y.
The data extension is executed by the data generation unit 104, for example. That is, the data generation unit 104 calculates a plurality of explanatory variables by multiplying one or more variables VA (first variables) and a plurality of dummy variables respectively corresponding to the plurality of objective variables. In the example of
In the modification, the learning method of the integrated model similar to that of the above embodiment is applied by regarding the explanatory variable and the regression model (integrated model) of the extended data as the p-dimension. For example, in the example of
The integrated model constructed as described above may also need to be updated. In particular, since the integrated model targets a plurality of quality characteristics, it is updated more frequently than a regression model of a single quality characteristic.
For the update of the integrated model, the method MA for updating only necessary portions can be utilized. In order to apply the method MA to the integrated model, it is sufficient to expand the data in advance by newly regarding a vector in which xij, zik, and xijzik (j=1, . . . , p, k=1, . . . , K) are arranged as xi.
However, since the integrated model covers a plurality of quality characteristics, the noise tends to be large and the dimension of the data often becomes large, so automatic parameter updating by the method MA is not always appropriate.
Therefore, in the present modification, when a condition (such as occurrence of an event) defining a timing to perform learning (updating) of the integrated model is satisfied, a parameter that needs to be updated is specified according to the condition, and the specified parameter is preferentially updated.
An example of a priority determination method according to a modification will be described.
Also in the modification, the priority determination unit 103 can determine the priority of the explanatory variable using the event information stored in the event storage unit 122 and the correspondence information (the correspondence information 123a and 123b) stored in the correspondence information storage unit 123.
The correspondence information 123b between the variable and the variable type may be set manually by, for example, an administrator or the like. Alternatively, the correspondence information 123b may be generated by, for example, the priority determination unit 103 (or by the data generation unit 104, the model learning unit 105, or the like) with reference to the extended data.
For example, the priority determination unit 103 specifies the type of the objective variable corresponding to the dummy variable. In the example of
As a result, using the generated correspondence information 123b and the correspondence information 123a (second correspondence information), the priority determination unit 103 can determine, for the explanatory variables included in the variable type corresponding to the type (event type) of the detected condition, the priority corresponding to that type.
The priority determination unit 103 may further determine the priority in consideration of the prediction error of the prediction processing by the model after learning. For example, the priority determination unit 103 determines the level of the priority of the corresponding explanatory variable according to the magnitude of the prediction error of the regression model after learning for each of the plurality of objective variables. For example, when the prediction error of the quality characteristic (objective variable) related to the chamber 2 is larger than that of the other quality characteristics (for example, the quality characteristic related to the chamber 1), the priority determination unit 103 determines the priority of the explanatory variables (for example, the sensor 1×the chamber 2) related to that quality characteristic to be a value larger than that of the other explanatory variables. Conversely, the priority determination unit 103 may determine the priority of the explanatory variables having the larger prediction error to be a value smaller than the priority of the other explanatory variables.
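As an illustrative sketch of such error-based priority determination (the function name, the normalization, and the example values are hypothetical, not part of the embodiment), a priority per objective-variable type can be derived from per-type prediction errors, for example, as follows.

```python
import numpy as np

def priorities_from_errors(y_true, y_pred, types, base=1.0):
    """Assign a higher priority to the objective variable (type, e.g. chamber)
    whose prediction error (RMSE) is larger.
    `types[i]` gives the type of the i-th sample; returns a dict
    type -> priority, normalized so the mean priority equals `base`."""
    types = np.asarray(types)
    rmse = {t: np.sqrt(np.mean((y_true[types == t] - y_pred[types == t]) ** 2))
            for t in np.unique(types)}
    mean_rmse = np.mean(list(rmse.values()))
    return {t: base * e / mean_rmse for t, e in rmse.items()}

# Chamber 2 predictions are worse, so variables tied to chamber 2
# (such as "sensor 1 x chamber 2") receive a higher priority.
y_true = np.array([1.0, 2.0, 1.0, 2.0])
y_pred = np.array([1.1, 1.9, 2.0, 4.0])
print(priorities_from_errors(y_true, y_pred, types=[1, 1, 2, 2]))
```

The resulting per-type priority can then be assigned to every explanatory variable belonging to that type via the correspondence information 123b.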
The model learning unit 105 performs learning of the regression model so as to optimize the objective function including the regularization term (penalty term) having a regularization strength that changes according to the determined priorities. As described above, in the modification, a learning method of the integrated model (regression model) similar to that of the above embodiment can be applied by newly regarding the explanatory variables of the extended data as the p-dimensional learning data xi.
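As a minimal sketch of priority-dependent regularization strength (a weighted Lasso solved by cyclic coordinate descent; the mapping priority→weight and all names are assumptions of this sketch, not the embodiment's implementation), a higher priority can be mapped to a smaller per-coefficient penalty weight so that high-priority coefficients are updated more freely.

```python
import numpy as np

def weighted_lasso(X, y, lam, weights, n_iter=200):
    """Cyclic coordinate descent for:
        (1/2n) * ||y - X b||^2 + lam * sum_j weights[j] * |b_j|.
    A smaller weights[j] (higher priority) means weaker shrinkage of b_j."""
    n, p = X.shape
    b = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    for _ in range(n_iter):
        for j in range(p):
            r = y - X @ b + X[:, j] * b[j]          # partial residual w.r.t. j
            rho = X[:, j] @ r / n
            thr = lam * weights[j]                  # per-coefficient threshold
            b[j] = np.sign(rho) * max(abs(rho) - thr, 0.0) / col_sq[j]
    return b

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = X @ np.array([1.5, 0.0, -2.0]) + 0.01 * rng.normal(size=100)
# Priority -> weight, e.g. weight = 1/priority; variable 2 has high priority.
w = np.array([1.0, 1.0, 0.2])
b = weighted_lasso(X, y, lam=0.1, weights=w)
print(b)
```

In this sketch, the coefficient of the high-priority third variable is shrunk only weakly, while the irrelevant second variable is driven to zero by the stronger penalty.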
As described above, in the modification, even for an integrated model having a plurality of quality characteristics, the model is updated with priorities changed according to the event. Accordingly, the model can be updated stably, the load of updating the model can be reduced, and the model can be managed efficiently.
Next, a hardware configuration of an information processing apparatus according to the embodiment (and the modification) will be described with reference to
The information processing apparatus according to the embodiment includes a control device such as a CPU 51, a storage device such as a read only memory (ROM) 52 and a random access memory (RAM) 53, a communication I/F 54 that is connected to a network and performs communication, and a bus 61 that connects the respective units.
The program executed by the information processing apparatus according to the embodiment is provided by being incorporated in the ROM 52 or the like in advance.
The program executed by the information processing apparatus according to the embodiment may be provided as a computer program product by being recorded as a file in an installable format or an executable format in a computer-readable recording medium such as a compact disk read only memory (CD-ROM), a flexible disk (FD), a compact disk recordable (CD-R), or a digital versatile disk (DVD).
Furthermore, the program executed by the information processing apparatus according to the embodiment may be stored on a computer connected to a network such as the Internet and provided by being downloaded via the network. Furthermore, the program executed by the information processing apparatus according to the embodiment may be provided or distributed via a network such as the Internet.
The program executed by the information processing apparatus according to the embodiment can cause a computer to function as each unit of the information processing apparatus described above. In this computer, the CPU 51 can read a program from a computer-readable storage medium onto a main storage device and execute the program.
A configuration example of the embodiment will be described below:
An information processing apparatus including
The information processing apparatus according to Configuration Example 1, wherein
The information processing apparatus according to Configuration Example 2, wherein
The information processing apparatus according to Configuration Example 2, wherein
The information processing apparatus according to any one of Configuration Examples 1 to 4, wherein
The information processing apparatus according to any one of Configuration Examples 1 to 5, wherein
The information processing apparatus according to any one of Configuration Examples 1 to 6, wherein
The information processing apparatus according to any one of Configuration Examples 1 to 7, wherein
The information processing apparatus according to any one of Configuration Examples 1 to 8, wherein
The information processing apparatus according to Configuration Example 9, wherein
The information processing apparatus according to any one of Configuration Examples 1 to 10, wherein
The information processing apparatus according to any one of Configuration Examples 1 to 11, wherein
An information processing method executed by an information processing apparatus, the method including:
A program causing a computer to execute:
While certain embodiments have been described, these embodiments have been presented by way of example only, and are not intended to limit the scope of the inventions. Indeed, the novel embodiments described herein may be embodied in a variety of other forms; furthermore, various omissions, substitutions and changes in the form of the embodiments described herein may be made without departing from the spirit of the inventions. The accompanying claims and their equivalents are intended to cover such forms or modifications as would fall within the scope and spirit of the inventions.
Number | Date | Country | Kind |
---|---|---|---
2023-131894 | Aug 2023 | JP | national |