The present disclosure relates to an analysis device, an analysis method, and a non-transitory computer-readable medium having a program stored thereon.
A predicted value of a prediction model for a certain data point may greatly deviate from an actual value due to factors such as overfitting or underfitting with respect to training data and a shift in data distribution. This is called a prediction error. In a case where analysis of a prediction error and an action for eliminating a factor of the prediction error are manually performed, a person in charge of analysis first performs specialized examination accompanied by multifaceted analysis based on a plurality of metrics using a prediction model, training data, and the like to identify the factor. Next, the person in charge of analysis devises an action for eliminating the found factor and executes the action.
As techniques related to evaluation of a prediction model, some techniques are known. For example, a metric monitoring system described in Non Patent Literature 1 continuously evaluates a plurality of metrics and presents an evaluation result to a user of the system. In addition, a prediction model maintenance system described in Patent Literature 1 continuously evaluates prediction accuracy and a magnitude of distribution shift of data, and when a deterioration state of a prediction model is detected from an evaluation result, automatically performs re-learning to update the model.
The metric monitoring system of Non Patent Literature 1 merely calculates a plurality of metrics individually and presents a determination result for each metric separately. For this reason, identifying a factor of a prediction error still requires expert consideration by a person in charge of analysis. In addition, the prediction model maintenance system of Patent Literature 1 does not identify a factor of a prediction error based on evaluation results with respect to a plurality of metrics.
Therefore, in view of the above problems, a main object of the present disclosure is to provide an analysis device, an analysis method, and a program capable of easily identifying a factor of a prediction error in prediction using a prediction model on the basis of various viewpoints.
An analysis device according to a first aspect of the present disclosure includes:
An analysis method according to a second aspect of the present disclosure includes:
A program according to a third aspect of the present disclosure causes a computer to execute:
According to the present disclosure, it is possible to provide an analysis device, an analysis method, and a program capable of easily identifying a factor of a prediction error in prediction using a prediction model on the basis of various viewpoints.
Before describing the details of an example embodiment, an outline of the example embodiment will be described first.
The metric evaluation unit 2 calculates a plurality of types of metrics (or indexes) with respect to a prediction model, data of explanatory variables used in the prediction model, or data of target variables used in the prediction model. Then, the metric evaluation unit 2 evaluates each of the plurality of types of calculated metrics. The metrics calculated by the metric evaluation unit 2 may be any predetermined metrics. For example, a metric may be the accuracy of the prediction model, may be an abnormality score of a value of an explanatory variable or a target variable in data for which prediction using the prediction model has failed (hereinafter referred to as a prediction error sample), or may be a magnitude of temporal shift of a distribution of explanatory variables or target variables. Note that these are merely examples, and the metric evaluation unit 2 may calculate other metrics.
The factor identification unit 3 identifies a factor of an error in prediction by the prediction model according to a combination of the evaluation results from the metric evaluation unit 2 for the plurality of types of metrics. The factor identification unit 3 identifies a factor by using, for example, a predetermined rule for associating a combination of evaluation results with a factor.
According to the analysis device 1, a plurality of types of metrics are evaluated, and a factor corresponding to the combination of the evaluation results for those metrics is automatically identified. Therefore, according to the analysis device 1, a factor of a prediction error in prediction using the prediction model can be easily identified on the basis of various viewpoints.
Hereinafter, example embodiments will be described in detail with reference to the drawings. When the prediction model causes a prediction error, that is, when the prediction model fails to predict a certain data point, the analysis device of the present example embodiment identifies a prediction error factor for the data point (prediction error sample) by analyzing the prediction error using a plurality of metrics. Note that the target prediction model is arbitrary, and may be, for example, a regression model or a classification model. In a case where the target model is a regression model, the analysis device of the present example embodiment identifies, for example, a factor of a predicted value of a target variable being inappropriate. In a case where the target prediction model is a classification model, the analysis device of the present example embodiment identifies, for example, a factor of a predicted value of a label, a classification score, or the like being inappropriate.
The analysis device of the present example embodiment calculates a plurality of metrics using a prediction error sample, training data, and the like, and performs analysis using the plurality of metrics to identify a prediction error factor. Examples of metrics to be used include an evaluation metric of a prediction model such as a mean square error (accuracy of the prediction model), an abnormality score of a prediction error sample calculated using an abnormality detection method, a magnitude of distribution shift of data calculated from a distance between distributions of explanatory variables of training data and operation data, and the like.
First, the storage unit 20 will be described. The storage unit 20 stores information necessary for analysis of a prediction error factor. Specifically, as illustrated in
The prediction model 21 is a prediction model trained using the training data 22. That is, the prediction model 21 is a trained model. The prediction model 21 functions as a function that outputs a predicted value of a target variable when input data (data of explanatory variables) is input. As described above, the model type of the prediction model 21 is not particularly limited.
The training data 22 is data used for training, parameter tuning, and the like of the prediction model 21, and is a set of data of explanatory variables and data of target variables.
The training test data 23 is data used to evaluate generalization performance of the prediction model 21 at the time of training the prediction model 21, and is a set of data of explanatory variables and data of target variables. The training data 22 and the training test data 23 can be said to be data in a training phase with respect to the prediction model 21.
The operation data 24 is data obtained at the time of operation of the prediction model 21, and is data including data of explanatory variables used to obtain prediction by the prediction model 21 and actual values of target variables corresponding to the data of the explanatory variables. The operation data 24 may include predicted values of the target variables corresponding to the data of the explanatory variables, predicted by the prediction model 21, in addition to the actual values of the target variables corresponding to the data of the explanatory variables.
The operation data 24 includes a prediction error sample 25. The prediction error sample 25 is designated from the operation data 24 by, for example, a user of the analysis device 10 as a sample in which a prediction error has occurred. In the present example embodiment, the analysis device 10 uses, as the prediction error sample 25, the sample of the operation data 24 designated by an instruction received by the instruction reception unit 70, which will be described later. The number of designated prediction error samples 25 is not limited to one, and may be plural. When a plurality of prediction error samples 25 are designated, the analysis device 10 sequentially identifies a prediction error factor for each of the prediction error samples.
The analysis control information 26 is information for controlling processing of the analysis device 10. Examples of the analysis control information 26 include a program in which an algorithm used by the diagnosis unit 30 to evaluate a metric is implemented, a setting value of a threshold used by the diagnosis unit 30 to evaluate a metric, information defining a rule used by the diagnosis unit 30 or the action determination unit 40, and the like. Note that the storage unit 20 may store a plurality of pieces of analysis control information 26 that can be substituted for each other. For example, the storage unit 20 may store, as the analysis control information 26, various algorithms for calculating the same type of metric, or may store various setting values (various evaluation algorithms) of thresholds used for evaluation of metrics. Furthermore, for example, the storage unit 20 may store various types of definition information of rules used by the diagnosis unit 30 or the action determination unit 40 as the analysis control information 26. When a plurality of pieces of analysis control information 26 that can be substituted for each other are stored, the analysis device 10 performs processing using the analysis control information 26 designated by an instruction received by the instruction reception unit 70. With such a configuration, the analysis device 10 can perform analysis by various analysis methods.
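As a concrete illustration of such substitutable analysis control information 26, the following is a minimal sketch in Python, assuming the control information is held as a registry of alternative algorithms and threshold setting values per metric; all function names, algorithms, and threshold values here are illustrative assumptions, not part of the disclosure.

```python
import numpy as np

def knn_score(sample, train, k=5):
    """Abnormality score as the mean distance to the k nearest training samples."""
    dists = np.sort(np.linalg.norm(train - sample, axis=1))
    return float(dists[:k].mean())

def zscore_score(sample, train):
    """Abnormality score as the largest standardized deviation over explanatory variables."""
    mu, sigma = train.mean(axis=0), train.std(axis=0) + 1e-9
    return float(np.max(np.abs((sample - mu) / sigma)))

# Substitutable pieces of analysis control information for one type of metric:
# alternative evaluation algorithms and their threshold setting values.
ABNORMALITY_CONTROL = {
    "knn": {"algorithm": knn_score, "threshold": 2.5},
    "zscore": {"algorithm": zscore_score, "threshold": 3.0},
}

def evaluate_abnormality(designated, sample, train):
    """Evaluate the metric with the algorithm designated via the instruction reception unit."""
    entry = ABNORMALITY_CONTROL[designated]
    score = entry["algorithm"](sample, train)
    return score, score > entry["threshold"]
```

Switching the designation (for example, from "knn" to "zscore") changes the analysis method without modifying the diagnosis logic.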
Next, the diagnosis unit 30 will be described. The diagnosis unit 30 identifies a prediction error factor for the prediction error sample 25 using information stored in the storage unit 20. Specifically, the diagnosis unit 30 calculates a metric and evaluates a calculation result of the metric for each of a plurality of metrics. Then, the diagnosis unit 30 identifies a prediction error factor using each evaluation result obtained for each metric.
As illustrated in
The metric evaluation unit 31 calculates a plurality of metrics necessary for analysis of a prediction error factor and determines calculation results of the metrics using information in the storage unit 20. For example, the metric evaluation unit 31 calculates an abnormality score of an explanatory variable of the prediction error sample 25 with respect to the training data 22 and evaluates the calculated abnormality score. In this case, the metric evaluation unit 31 evaluates the metric by determining whether the calculated value of the abnormality score is a value at which the prediction error sample 25 is recognized as an abnormal sample. That is, in this case, the metric evaluation unit 31 determines whether the prediction error sample 25 is an abnormal sample using the calculated abnormality score. As another example, the metric evaluation unit 31 calculates an inter-distribution distance (hereinafter also referred to as a magnitude of distribution shift of data) between the training data 22 and the operation data 24, and evaluates the calculated inter-distribution distance. In this case, the metric evaluation unit 31 evaluates the metric by determining whether the calculated value of the inter-distribution distance is a value at which it is recognized that there is a shift in the distribution of data between training and operation. That is, in this case, the metric evaluation unit 31 determines whether or not a shift in the distribution of data occurs between training and operation by using the calculated inter-distribution distance. Note that these are merely examples, and the metric evaluation unit 31 can perform calculation and evaluation with respect to various types of metrics. As described above, in the present example embodiment, the metric evaluation unit 31 performs predetermined determination on metrics as evaluation on the metrics. Determination on each metric is performed using, for example, a threshold stored as the analysis control information 26. Note that a parameter for specifying the threshold may be stored as the analysis control information 26 instead of the threshold itself.
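The calculate-then-determine pattern described above can be summarized as follows; a minimal sketch assuming the metric values have already been calculated, and assuming the thresholds below stand in for setting values stored as the analysis control information 26 (metric names and values are illustrative assumptions).

```python
# Hypothetical threshold setting values, standing in for analysis control information 26.
THRESHOLDS = {"abnormality_score": 9.0, "distribution_shift": 0.1}

def determine(metric_values):
    """Turn raw metric values into Yes/No determination results per metric."""
    return {
        "is_abnormal_sample":
            metric_values["abnormality_score"] > THRESHOLDS["abnormality_score"],
        "has_distribution_shift":
            metric_values["distribution_shift"] > THRESHOLDS["distribution_shift"],
    }

# Example: determine({"abnormality_score": 12.3, "distribution_shift": 0.02})
# yields {"is_abnormal_sample": True, "has_distribution_shift": False}.
```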
Here, the type and number of metrics calculated to identify a factor of a prediction error for one prediction error sample 25 are arbitrary, but it is preferable to use two or more metrics. This is because, by using a large number of metrics, more multifaceted analysis can be achieved and the number of types of prediction error factors that can be identified can be increased.
In addition, an evaluation method for each metric in the metric evaluation unit 31 is arbitrary. For example, when an abnormality score of an explanatory variable of the prediction error sample 25 is calculated and it is determined whether the prediction error sample is an abnormal sample, various abnormality detection methods such as the Hotelling method and the k-nearest neighbor method can be used. As described above, a program for realizing an evaluation method (algorithm) used by the metric evaluation unit 31 for each metric is stored in the storage unit 20 as the analysis control information 26, for example. Furthermore, as described above, the analysis control information 26 may include a plurality of programs in which different algorithms are implemented for the same type of metric. For example, the analysis control information 26 may include two programs, i.e., a program implementing the Hotelling method and a program implementing the k-nearest neighbor method, as programs implementing an evaluation method (algorithm) regarding an abnormality score of an explanatory variable of the prediction error sample 25. According to such a configuration, the diagnosis unit 30 can evaluate metrics using various evaluation methods by switching the analysis control information 26 to be used.
The factor identification unit 32 identifies a prediction error factor according to a combination of the evaluation results of the plurality of types of metrics from the metric evaluation unit 31. In the present example embodiment, the factor identification unit 32 identifies a prediction error factor according to a combination of the determination results of the predetermined determinations for the metrics. Specifically, the factor identification unit 32 identifies a prediction error factor by using a predetermined rule (hereinafter, a factor determination rule) for associating the prediction error factor with a combination of a plurality of determination results.
As described above, the factor identification unit 32 identifies a factor of the error in prediction by the prediction model 21 according to the rule for associating a factor with a combination of evaluation results (determination results) of a plurality of types of metrics. The content of the factor determination rule used by the factor identification unit 32 is arbitrary. In addition, as described above, the factor determination rule is stored in the storage unit 20, for example, as the analysis control information 26. Furthermore, as described above, the analysis control information 26 may include a plurality of factor determination rules having different types or numbers of determination results to be analyzed. According to such a configuration, the diagnosis unit 30 can analyze a prediction error using different factor determination rules by switching the analysis control information 26 to be used. Note that since it is necessary to obtain a determination result corresponding to a factor determination rule to be used, the type and number of metrics to be evaluated by the metric evaluation unit 31 depend on the factor determination rule.
In addition, the form of the factor determination rule is also arbitrary. The factor determination rule used by the factor identification unit 32 may be, for example, a factor determination rule for allocating a combination of determination results to a prediction error factor using a table, or a factor determination rule for allocating a combination of determination results to a prediction error factor using a flowchart. These forms of the factor determination rule will be described below.
The factor identification unit 32 identifies a prediction error factor using determination results obtained by the metric evaluation unit 31 and the factor determination rule of
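As one way a table-format factor determination rule might be realized, the following minimal sketch maps each combination of determination results to a factor; the two determinations, the factor labels, and the particular allocation are illustrative assumptions, not the rule of the drawing.

```python
# Table-format rule: a combination of determination results is allocated to a factor.
FACTOR_TABLE = {
    # (is_abnormal_sample, has_distribution_shift) -> prediction error factor
    (False, False): "local error of the prediction model",
    (False, True): "shift in the data distribution",
    (True, False): "abnormality in the explanatory variables",
    (True, True): "shift in the data distribution",
}

def identify_factor(results):
    """Look up the factor allocated to the combination of determination results."""
    key = (results["is_abnormal_sample"], results["has_distribution_shift"])
    return FACTOR_TABLE[key]
```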
As described above, a factor determination rule in a flowchart format may be used as the factor determination rule used by the factor identification unit 32.
The factor determination rule in the flowchart format shown in
When the determination result of Q1 is Yes, it means that the explanatory variables of the prediction error sample 25 are normal and a sample whose explanatory variables are similar to those of the prediction error sample 25 can occur with a high frequency. Therefore, it is assumed that there are a large number of neighboring training samples in the training data 22. In this case, if actual values of target variables of these neighboring training samples are appropriately trained, the prediction model 21 becomes a prediction model with high prediction accuracy. In addition, when the determination result of Q1 is Yes, since the prediction error sample 25 is a normal sample, there is a low possibility of the data distribution changing between training and operation. Therefore, when the determination result of Q1 is Yes, determination of Q3 is meaningless.
If the determination result of Q1 is Yes, it is subsequently determined in Q2 whether the prediction model 21 has appropriately learned actual values of the target variables of the neighboring training samples. When the determination result of Q2 is Yes, since the prediction model 21 is assumed to be a prediction model with high prediction accuracy, it is expected that no prediction error occurs. Therefore, a factor other than the prediction model and the data is conceivable, such as a sample without a prediction error having been analyzed as the prediction error sample 25 due to a malfunction of the analysis device 10 (a malfunction of a user interface or the like) or an erroneous operation by a user of the system. Therefore, in this case, the factor identification unit 32 determines, with reference to the factor determination rule, that the factor of the prediction error is an error other than the prediction model and data. In addition, when the determination result of Q2 is No, it is conceivable that the prediction model 21 has not appropriately learned the actual values of the target variables of the neighboring training samples due to underfitting or the like. Therefore, in this case, it is concluded that the prediction model 21 is a model having a local error around the prediction error sample 25, and the factor identification unit 32 determines, with reference to the factor determination rule, that the factor of the prediction error is a local error. As described above, since the determination of Q2 is meaningful only when the determination result of Q1 is Yes, Q2 is arranged after Q1.
On the other hand, when the determination result of Q1 is No, it means that there are not sufficient neighboring training samples in the training data 22, and in this case, it is impossible to accurately determine in Q2 whether the prediction model 21 fits the neighboring training samples satisfactorily. Therefore, when the determination result of Q1 is No, it is important to identify the reason why a sample having a high abnormality score such as the prediction error sample 25 has occurred. Therefore, in Q3, it is determined whether the distribution of the data has shifted with the lapse of time. Hereinafter, a shift with the lapse of time is referred to as a temporal shift. When the determination result of Q3 is Yes, the following conclusion is drawn: a temporal shift in the data distribution has increased the frequency at which samples having high abnormality scores relative to the training data 22 occur, and as a result, the prediction error sample 25 having a high abnormality score relative to the training data 22 has occurred and caused the prediction error. Therefore, in this case, the factor identification unit 32 determines, with reference to the factor determination rule, that the factor of the prediction error is a shift in the data distribution. In addition, when the determination result of Q3 is No, since the distribution of the data has not shifted with time, it is concluded that the prediction error sample 25 is an abnormal sample caused by a factor other than a temporal shift in the data distribution. Therefore, in this case, the factor identification unit 32 determines, with reference to the factor determination rule, that the factor of the prediction error is an abnormality in the explanatory variables due to some reason. As described above, the factor determination rule in a flowchart format has a structure in which the details of the reason why the determination result of Q1 is No are determined in Q3, and thus Q3 is arranged after Q1.
As described above, in the factor determination rule illustrated in
When the factor determination rule in a flowchart format as illustrated in
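The Q1/Q2/Q3 flow described above might be sketched as follows, assuming each determination is supplied as a zero-argument callable so that only the determinations on the taken path are actually evaluated; the function form is an implementation assumption, while the factor labels follow the text.

```python
def identify_factor_flowchart(q1_is_normal, q2_learned_neighbors, q3_temporal_shift):
    """Flowchart-format rule: each argument is a callable returning True (Yes) or False (No)."""
    if q1_is_normal():  # Q1: is the prediction error sample a normal sample?
        if q2_learned_neighbors():  # Q2: were the neighboring training samples learned appropriately?
            return "error other than the prediction model and data"
        return "local error of the prediction model"
    if q3_temporal_shift():  # Q3: has the data distribution shifted over time?
        return "shift in the data distribution"
    return "abnormality in the explanatory variables"
```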
Next, the action determination unit 40 will be described. The action determination unit 40 determines an action (work) for eliminating the factor identified by the factor identification unit 32 of the diagnosis unit 30. In the present example embodiment, the action determination unit 40 creates an action proposal sentence (hereinafter, an action proposal) for eliminating a prediction error factor for the prediction error factor identified by the diagnosis unit 30. At this time, the action determination unit 40 creates an action proposal by using a predetermined rule (hereinafter, an action determination rule) for allocating the action proposal to the prediction error factor.
Here, an example of the action determination rule is illustrated in
In this manner, the action determination unit 40 determines an action to be performed to eliminate the prediction error factor identified by the factor identification unit 32. As a result, it is possible to output an action proposal for eliminating the prediction error factor, and thus the user can immediately start an action necessary for improvement. That is, the user does not need to perform examination for determining an action from the identified factor.
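An action determination rule of this kind could be held, for example, as a simple table allocating an action proposal to each prediction error factor; the proposal sentences below are illustrative assumptions.

```python
# Hypothetical action determination rule: factor -> action proposal sentence.
ACTION_RULE = {
    "local error of the prediction model":
        "Re-train the prediction model (e.g., with adjusted hyperparameters) so that it fits the neighboring training samples.",
    "shift in the data distribution":
        "Collect recent operation data and re-train the prediction model on it.",
    "abnormality in the explanatory variables":
        "Inspect the data acquisition process (sensors, preprocessing) for the abnormal explanatory variables.",
    "error other than the prediction model and data":
        "Check the analysis device and user operations for malfunctions or erroneous operation.",
}

def determine_action(factor):
    """Allocate an action proposal to the identified prediction error factor."""
    return ACTION_RULE.get(factor, "No action proposal registered for this factor.")
```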
Next, the visualization unit 50 will be described. The visualization unit 50 visualizes information describing each determination result in the diagnosis unit 30. A method of visualizing the information describing each determination result is arbitrary. For example, in the case of visualization regarding an abnormality score of a prediction error sample, the visualization unit 50 may generate image data of a graph as illustrated in
A program for generating information (image data) describing a determination result may be stored in the storage unit 20 as the analysis control information 26. In this case, the analysis control information 26 may hold a plurality of programs for realizing different visualization methods for a certain metric in order to perform different visualizations illustrated in
Note that, in the above description, visualization of an abnormality score of a prediction error sample has been described as an example, but the visualization unit 50 may visualize information describing other determination results. For example, the visualization unit 50 may generate image data of a graph as illustrated in
In this manner, the visualization unit 50 may generate image data of a predetermined graph corresponding to a metric. With such visualization, the user can visually confirm the validity of a determination result for each metric.
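As one possible realization of such a graph, the following minimal sketch draws a histogram of abnormality scores over the training data, with the prediction error sample's score and the decision threshold marked; the use of matplotlib and all labels are assumptions.

```python
import matplotlib.pyplot as plt

def plot_abnormality_scores(train_scores, sample_score, threshold, path="abnormality.png"):
    """Save image data visualizing the abnormality-score determination."""
    fig, ax = plt.subplots()
    ax.hist(train_scores, bins=30, alpha=0.7, label="training data")
    ax.axvline(sample_score, color="red", label="prediction error sample")
    ax.axvline(threshold, color="black", linestyle="--", label="threshold")
    ax.set_xlabel("abnormality score")
    ax.set_ylabel("frequency")
    ax.legend()
    fig.savefig(path)  # the image data can then be output by the result output unit 60
    plt.close(fig)
```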
Furthermore, in the case of a factor determination rule in a flowchart format, the visualization unit 50 may generate image data describing a flow of a determination result in the flowchart as in
Next, the result output unit 60 will be described. The result output unit 60 outputs calculation results of metrics from the metric evaluation unit 31, determination results of the metrics from the metric evaluation unit 31, the prediction error factor identified by the factor identification unit 32, the action proposal created by the action determination unit 40, the image data created by the visualization unit 50, and the like. Note that the result output unit 60 may output all or only some of such information. The output method of the result output unit 60 is arbitrary, and the result output unit 60 may display the above-described information on, for example, a monitor (display) or the like. Furthermore, the result output unit 60 may transmit the above-described information to another device.
Next, the instruction reception unit 70 will be described. The instruction reception unit 70 receives an instruction from a user of the analysis device 10. For example, the instruction reception unit 70 receives an instruction to designate which sample of the operation data 24 is the prediction error sample 25. As a result, the user can easily change samples to be analyzed. A user interface of the instruction reception unit 70 may be displayed, for example, on a monitor (display). That is, the instruction reception unit 70 may display a screen for receiving an instruction on the monitor. The instruction reception unit 70 receives an instruction from the user via, for example, an input device (for example, a mouse, a keyboard, or the like) connected to the analysis device 10.
Note that, as described above, the instruction reception unit 70 may receive an instruction to designate a metric calculation algorithm or an evaluation algorithm. In this case, the metric evaluation unit 31 calculates or evaluates metrics by the calculation algorithm or the evaluation algorithm designated by the instruction. Further, the instruction reception unit 70 may receive an instruction to designate a factor determination rule. In this case, the factor identification unit 32 identifies a factor of an error in prediction by the prediction model 21 according to the factor determination rule designated by the instruction. With such a configuration, the user can easily change the analysis method. Note that the instruction is not limited to the above-described designation, and the instruction reception unit 70 may receive an instruction to designate an action determination rule or an instruction to designate a visualization method.
Next, a hardware configuration of the analysis device 10 will be described.
The input/output interface 150 is an interface for connecting the analysis device 10 and an input/output device. For example, an input device such as a mouse and a keyboard, and an output device such as a monitor (display) are connected to the input/output interface 150.
The network interface 151 is used to communicate with any other device as necessary. The network interface 151 may include, for example, a network interface card (NIC).
The memory 152 includes, for example, a combination of a volatile memory and a nonvolatile memory. The memory 152 is used to store software (a computer program) including one or more instructions executed by the processor 153, data used for various types of processing of the analysis device 10, and the like. For example, the above-described storage unit 20 may be realized by a storage device such as the memory 152.
The processor 153 reads and executes the software (computer program) from the memory 152 to perform processing of the diagnosis unit 30, the action determination unit 40, the visualization unit 50, the result output unit 60, and the instruction reception unit 70. The processor 153 may be, for example, a microprocessor, a micro processing unit (MPU), or a central processing unit (CPU). The processor 153 may include a plurality of processors.
As described above, the analysis device 10 has a function as a computer.
Furthermore, the program described above can be stored using various types of non-transitory computer-readable media and supplied to a computer. Non-transitory computer-readable media include various types of tangible storage media. Examples of non-transitory computer-readable media include a magnetic recording medium (for example, a flexible disk, a magnetic tape, or a hard disk drive), a magneto-optical recording medium (for example, a magneto-optical disk), a CD-read only memory (CD-ROM), a CD-R, a CD-R/W, and a semiconductor memory (for example, a mask ROM, a programmable ROM (PROM), an erasable PROM (EPROM), a flash ROM, or a random access memory (RAM)). In addition, the program may be supplied to a computer through various types of transitory computer-readable media. Examples of transitory computer-readable media include electrical signals, optical signals, and electromagnetic waves. A transitory computer-readable medium can supply the program to a computer via a wired communication path such as an electric wire or an optical fiber, or via a wireless communication path.
Next, the operation of the analysis device 10 of the present example embodiment will be described.
First, as preparation before analysis processing performed by the analysis device 10, the prediction model 21, the training data 22, the training test data 23, and the operation data 24 are stored in the storage unit 20 (step S11). For example, these pieces of information are stored in the storage unit 20 by user operation. The analysis control information 26 is stored in the storage unit 20 in advance. Next, the user inputs an instruction to designate a prediction error sample 25 to be analyzed to the analysis device 10, and the instruction reception unit 70 receives the instruction (step S12). Next, the diagnosis unit 30 calculates a plurality of metrics, determines each metric, and identifies a prediction error factor using a factor determination rule (step S13). Next, the action determination unit 40 creates an action proposal for eliminating the identified prediction error factor (step S14). Next, the visualization unit 50 visualizes information describing the analysis process (step S15). Then, the result output unit 60 displays the identification result of the prediction error factor, the action proposal, and the visualized information (step S16).
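The flow of steps S11 to S16 might be orchestrated as follows; a minimal sketch assuming the components expose the methods used here (all method names are illustrative assumptions).

```python
def run_analysis(storage, diagnosis, action_determiner, visualizer, output):
    """One pass of the analysis flow (steps S12 to S16); S11 has stored the inputs."""
    sample = storage.prediction_error_sample           # S12: the designated sample
    results = diagnosis.evaluate_metrics(sample)       # S13: calculate and determine metrics
    factor = diagnosis.identify_factor(results)        # S13: apply the factor determination rule
    proposal = action_determiner.determine(factor)     # S14: create an action proposal
    images = visualizer.render(results)                # S15: visualize the analysis process
    output.display(results, factor, proposal, images)  # S16: output the results
```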
The analysis device 10 has been described above. According to the analysis device 10, a plurality of types of metrics are evaluated, and a factor according to a combination of the evaluation results is automatically identified. Therefore, according to the analysis device 10, it is possible to easily identify a factor of a prediction error in prediction using the prediction model on the basis of various viewpoints. In particular, in the analysis device 10, since the action determination unit 40 determines an action to be performed in order to eliminate a prediction error factor, the user can omit examination on what action needs to be performed. Furthermore, since the analysis device 10 includes the visualization unit 50, it is possible to visualize information describing an analysis process in the analysis device 10. Note that the configuration of the analysis device 10 described above is merely an example, and various modifications can be made. For example, the analysis device 10 may further include a processing unit that performs prediction using the prediction model 21.
Meanwhile, in the above description, specific examples of the factor determination rule and the action determination rule have been described in order to aid in understanding, but these are not limited to the above specific examples. For example, the following rules may be used.
Hereinafter, specific examples different from the above examples will be described with respect to the factor determination rule and the action determination rule.
In the example of
In Q1, whether the prediction error sample 25 is a normal sample is determined from an abnormality score of an explanatory variable of the prediction error sample 25 with respect to the training data 22. In addition, in Q2, in a case where the determination result of Q1 is Yes, it is determined whether an actual value of a target variable of the prediction error sample 25 is similar to an actual value of a target variable of a neighboring training sample. By performing the determinations of Q1 and Q2, it is possible to determine whether the prediction error sample 25 is a normal sample with respect to both the explanatory variables and the target variable when compared with the training data 22. Processing of the metric evaluation unit 31 corresponding to Q1 and Q2 can be implemented using an abnormality detection technique. For example, in a case where an abnormality detection technique called the Hotelling method is used, in order to determine Q1, the metric evaluation unit 31 calculates a Mahalanobis distance of the prediction error sample 25 using a distribution of explanatory variables of the training data 22, and sets the Mahalanobis distance as an abnormality score. Similarly, in this case, in order to determine Q2, the metric evaluation unit 31 calculates a Mahalanobis distance of the prediction error sample 25 using a distribution of target variables of the neighboring training samples, and sets the Mahalanobis distance as an abnormality score. Then, with respect to the calculated abnormality score, the metric evaluation unit 31 determines whether the prediction error sample 25 is a normal sample using a threshold stored as the analysis control information 26. If the sample is determined to be an abnormal sample, the determination result of Q1 or Q2 is No.
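A minimal sketch of the Q1 determination with the Hotelling method follows: the squared Mahalanobis distance of the prediction error sample's explanatory variables under the training distribution serves as the abnormality score (Q2 is analogous, over the target variables of the neighboring training samples). The covariance regularization term and the threshold value are illustrative assumptions.

```python
import numpy as np

def mahalanobis_score(x, train):
    """Squared Mahalanobis distance of sample x under the distribution of train."""
    mu = train.mean(axis=0)
    cov = np.cov(train, rowvar=False) + 1e-6 * np.eye(train.shape[1])  # regularized covariance
    diff = x - mu
    return float(diff @ np.linalg.solve(cov, diff))

def q1_is_normal(sample_x, train_x, threshold=9.0):
    """Q1 is Yes when the abnormality score does not exceed the stored threshold."""
    return mahalanobis_score(sample_x, train_x) <= threshold
```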
In Q4, in a case where the determination result of Q1 is No, it is determined whether a temporal shift occurs in the data distribution by focusing on the explanatory variables of the training data 22 and the operation data 24. In addition, in Q5, in a case where the determination result of Q2 is No, it is determined whether a temporal shift occurs in the data distribution by focusing on the distributions of the target variables of the neighboring training samples and the samples (hereinafter, neighboring operation samples) in the operation data 24 located in the neighboring region. In this way, by focusing only on the samples of the neighboring region in Q5, the influence of a correlation between the explanatory variables and the target variables can be removed, and a temporal shift in a noise distribution of the target variables can be easily calculated. By performing the determinations of Q4 and Q5, the diagnosis unit 30 determines, when the prediction error sample 25 is an abnormal sample, whether the reason why such an abnormal sample has appeared is a temporal shift in the data distribution. Processing of the metric evaluation unit 31 corresponding to Q4 and Q5 can be implemented using an inter-distribution distance estimation technique or a change point detection technique. For example, in a case where an inter-distribution distance estimation technique is used, in order to determine Q4, the metric evaluation unit 31 calculates an inter-distribution distance such as the Kullback-Leibler divergence using the distributions of actual values of the explanatory variables of the training data 22 and the operation data 24, and sets the calculated inter-distribution distance as a magnitude of distribution shift of data. Similarly, in this case, in order to determine Q5, the metric evaluation unit 31 calculates an inter-distribution distance such as the Kullback-Leibler divergence using the distributions of actual values of the target variables of the neighboring training samples and the neighboring operation samples, and sets the calculated inter-distribution distance as a magnitude of distribution shift of data. Then, with respect to the calculated magnitude of distribution shift of data, the metric evaluation unit 31 determines whether or not a temporal shift occurs in the data distribution using the threshold stored as the analysis control information 26.
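The Q4 determination might, for instance, estimate the Kullback-Leibler divergence from histograms of an explanatory variable in the training data and the operation data, as sketched below (Q5 is analogous, over the target variables of the neighboring samples); the binning, the smoothing constant, and the threshold are illustrative assumptions.

```python
import numpy as np

def kl_divergence(train_values, operation_values, bins=20):
    """Histogram-based estimate of KL(train || operation) for one variable."""
    lo = min(train_values.min(), operation_values.min())
    hi = max(train_values.max(), operation_values.max())
    p, edges = np.histogram(train_values, bins=bins, range=(lo, hi))
    q, _ = np.histogram(operation_values, bins=edges)
    p = (p + 1e-9) / (p + 1e-9).sum()  # smoothing avoids division by zero
    q = (q + 1e-9) / (q + 1e-9).sum()
    return float(np.sum(p * np.log(p / q)))

def q4_has_temporal_shift(train_values, operation_values, threshold=0.1):
    """Q4 is Yes when the magnitude of distribution shift exceeds the stored threshold."""
    return kl_divergence(train_values, operation_values) > threshold
```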
Q3 is determined when the determination results of Q1 and Q2 are both Yes (that is, in a case where the prediction error sample 25 is determined to be a normal sample in comparison with the training data 22). Q3 is a question of determining whether the prediction model 21 has suffered neither underfitting nor overfitting with respect to the training data 22 near the prediction error sample 25. By outputting the determination result of Q3, it is possible to determine whether the factor of the prediction error lies in the prediction model 21. Processing of the metric evaluation unit 31 corresponding to Q3 can be implemented using various evaluation methods of the prediction model. As an example, there is a method of using an evaluation metric of a prediction model such as a mean square error. Specifically, in order to determine Q3, the metric evaluation unit 31 calculates a mean square error using the neighboring training samples and the prediction model 21, and compares the mean square error with a first threshold stored as the analysis control information 26, thereby determining the presence or absence of underfitting for the neighboring training samples. Further, the metric evaluation unit 31 calculates a mean square error using the samples (neighboring test samples) in the training test data 23 located in the neighboring region and the prediction model 21, and compares the mean square error with a second threshold stored as the analysis control information 26. As a result, the metric evaluation unit 31 determines the presence or absence of overfitting for the neighboring training samples. Note that the first threshold and the second threshold may be the same or different. In this manner, the presence or absence of each of underfitting and overfitting is determined. In a case where neither underfitting nor overfitting has occurred, it is determined that the prediction model 21 fits the training data and the training test data satisfactorily, and the determination result of Q3 is Yes.
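The Q3 determination described above might be sketched as follows, comparing mean square errors on the neighboring training samples and the neighboring test samples with the first and second thresholds; the model interface (an sklearn-style predict method) and the threshold values are illustrative assumptions.

```python
import numpy as np

def q3_model_fits(model, neighbor_train_x, neighbor_train_y,
                  neighbor_test_x, neighbor_test_y,
                  first_threshold=1.0, second_threshold=1.0):
    """Q3 is Yes when neither underfitting nor overfitting is detected."""
    mse_train = float(np.mean((model.predict(neighbor_train_x) - neighbor_train_y) ** 2))
    mse_test = float(np.mean((model.predict(neighbor_test_x) - neighbor_test_y) ** 2))
    underfitting = mse_train > first_threshold  # poor fit even on neighboring training samples
    overfitting = mse_test > second_threshold   # poor generalization to neighboring test samples
    return not underfitting and not overfitting
```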
A major difference between the factor determination rule illustrated in
Next, a dependence relationship of each question Q in the factor determination rule of
If the determination result of Q1 is Yes, it is subsequently determined in Q2 whether the actual value of the target variable of the prediction error sample 25 could be accurately predicted if the prediction model 21 had appropriately learned the actual values of the target variables of the neighboring training samples. If the determination result of Q2 is No, the value of the target variable of the prediction error sample 25 is an abnormal value with respect to the values of the target variables of the neighboring training samples, which means that it is difficult to perform highly accurate prediction. Therefore, it is subsequently determined in Q5 whether the reason why a sample having such an abnormal target variable has been generated is a shift in the distribution of the data of the target variables. If the determination result of Q5 is No, it is concluded that the prediction error factor is that the prediction error sample 25 is a sample having an abnormal target variable value generated regardless of a shift in the data distribution. That is, it is concluded that the factor of the prediction error is an abnormality in the target variable due to some reason. If the determination result of Q5 is Yes, it is concluded that the frequency at which samples having abnormal target variable values are generated has increased due to the temporal shift in the distribution of the target variables, and as a result, the prediction error sample 25 having an abnormal target variable value has been generated and the prediction error has occurred.
If the determination result of Q2 is Yes, subsequently, it is determined in Q3 whether the prediction model 21 has appropriately learned the actual values of the target variables of the neighboring training sample. If the determination result of Q3 is Yes, since the prediction model 21 is assumed to be a prediction model with high prediction accuracy, it is expected that no prediction error occurs. Therefore, a factor other than the prediction model and data, such as a sample without a prediction error being analyzed as the prediction error sample 25 due to a malfunction of the system (analysis device 10) (a malfunction of a user interface or the like) or an erroneous operation of the user of the system, is conceivable. In addition, if the determination result of Q3 is No, this corresponds to a case where the prediction model 21 has not appropriately learned the actual values of the target variables of the neighboring training sample due to overfitting or underfitting. Therefore, in this case, it is concluded that the prediction model 21 is a model having a local error around the prediction error sample 25.
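Putting the five questions together, the dependence relationship described above might be sketched as follows, again with zero-argument callables so that only the determinations on the taken path are evaluated; the factor labels follow the text, and the function form is an assumption.

```python
def identify_factor_five_questions(q1, q2, q3, q4, q5):
    """Flowchart-format rule over Q1 to Q5; each argument is a callable returning True/False."""
    if not q1():  # explanatory variables of the prediction error sample are abnormal
        return ("shift in the data distribution" if q4()
                else "abnormality in the explanatory variables")
    if not q2():  # target variable value of the prediction error sample is abnormal
        return ("shift in the data distribution" if q5()
                else "abnormality in the target variable")
    if q3():      # the model fits; no prediction error should occur
        return "error other than the prediction model and data"
    return "local error of the prediction model"
```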
Next, the action determination rule of
Although the present invention has been described above with reference to the example embodiments, the present invention is not limited to the above. Various modifications that can be understood by those skilled in the art can be made to the configuration and details of the present invention within the scope of the invention.
Some or all of the above example embodiments may be described as the following supplementary notes, but are not limited to the following.
An analysis device including:
The analysis device according to Supplementary note 1, wherein the factor identification means identifies a factor of an error in prediction by the prediction model according to a rule for associating combinations of evaluation results of the plurality of types of the metrics with factors.
The analysis device according to Supplementary note 2, wherein the factor identification means identifies a factor of an error in prediction by the prediction model according to a combination of an evaluation result of a predetermined metric among the plurality of types of the metrics and an evaluation result of the metric selected according to the evaluation result of the predetermined metric.
The analysis device according to any one of Supplementary notes 1 to 3, further including an instruction reception unit configured to receive an instruction to designate a calculation algorithm or an evaluation algorithm for the metrics,
The analysis device according to Supplementary note 2, further including an instruction reception unit configured to receive an instruction to designate the rule,
The analysis device according to any one of Supplementary notes 1 to 5, further including an action determination means for determining an action for eliminating the factor identified by the factor identification means.
The analysis device according to any one of Supplementary notes 1 to 6, further including a visualization means for generating image data of a predetermined graph according to the metrics.
The analysis device according to Supplementary note 3, further including a visualization means for generating image data representing a flowchart defining the metric used to identify the factor and an order of using the metric and a transition history in the flowchart.
An analysis method including:
A non-transitory computer-readable medium storing a program causing a computer to execute:
Filing Document: PCT/JP2021/007191 | Filing Date: 2/25/2021 | Country: WO