The present disclosure relates to analysis of a machine learning model.
MLOps (Machine Learning Operations) is recognized as a technology that enables the continuous and cost-effective operation of machine learning models. MLOps encompasses techniques for developing, deploying, and managing a machine learning model. Particularly, in a practical operation of a system using the machine learning model, it is important to analyze a factor contributing to a decrease in prediction accuracy of the machine learning model.
In order to analyze the factor contributing to the decrease in prediction accuracy in an operational system, sufficient knowledge and experience in programming during a development of the machine learning model and maintenance during the operation are required. Additionally, Patent Document 1 discloses a method for converting natural language into a programming language.
It is one object of the present disclosure to enable appropriate analysis of a factor contributing to a decrease in prediction accuracy of a machine learning model, even in a case where knowledge and experience related to a development and an operation of a machine learning model are not necessarily sufficient.
According to an example aspect of the present disclosure, there is provided an information processing device including:
According to another example aspect of the present disclosure, there is provided an information processing method performed by a computer, the information processing method including:
According to a further example aspect of the present disclosure, there is provided a program causing a computer to perform a process including:
According to the present disclosure, it is possible for even a person who does not necessarily have sufficient knowledge or experience in a development and an operation of a machine learning model to appropriately analyze the factor contributing to the decrease in prediction accuracy of the machine learning model.
In the following, example embodiments will be described with reference to the accompanying drawings.
The terminal device 20 is operated by a user who manages and maintains the machine learning model (hereinafter, referred to as a “predictive model”) for performing a predetermined prediction. The predetermined prediction may be any of a variety of predictions, for instance, forecasts of weather, forecasts of power demand, forecasts of sales at stores, and the like. In the present example embodiment, the user may need a certain level of knowledge and experience in the development and maintenance of machine learning models, but the user does not need to be an expert with extensive knowledge and experience.
If a prediction error occurs in the predictive model in operation, the user analyzes the factor of the prediction error. Specifically, if the prediction error occurs, the user operates the terminal device 20, and transmits a prompt instructing a factor analysis of the prediction error to the information processing device 10.
The information processing device 10 interprets the input prompt using a natural language model and executes analysis of a factor of a prediction error. Note that the “prompt” refers to an instruction sentence given to a generative AI (Artificial Intelligence) including a natural language model, or the like. The information processing device 10 interactively executes the factor analysis of the prediction error using an analytical algorithm prepared in advance. Specifically, according to the analytical algorithm, the information processing device 10 asks the user about the variables or the predictive model, and acquires an answer from the user. Then, the information processing device 10 analyzes the factor of the prediction error based on the answer of the user, generates an analysis result, and outputs the analysis result to the terminal device 20. Accordingly, the user can analyze the factor of the prediction error by inputting natural language into the information processing device 10.
Here, the natural language model will be described. The natural language model is a model which learns the relationships between words in a sentence and, based on a target string, generates a relevant string which is related to the target string. By using a natural language model which has learned sentences and phrases of various contexts, it is possible to generate a relevant string with a reasonable description related to the target string. For instance, a case of using the natural language model for question answering will be described below. In this case, the natural language model accepts a question “What kind of country is Japan?” as the target string, and generates a string such as “Japan is an island in the northern hemisphere . . . ”, as the answer to the question.
The training method of the natural language model is not particularly limited; as an example, the model may be trained to output at least one sentence including an input string. For instance, the natural language model may be a GPT (Generative Pre-trained Transformer), which outputs a sentence containing the input string by predicting a most probable string following the input string. Alternatively, the natural language model may be, for instance, T5 (Text-to-Text Transfer Transformer), BERT (Bidirectional Encoder Representations from Transformers), RoBERTa (Robustly optimized BERT approach), ELECTRA (Efficiently Learning an Encoder that Classifies Token Replacements Accurately), or the like. In the present example embodiment, a large-scale language model may be used as the natural language model. The natural language model may also be capable of accessing the Internet or other specialized knowledge bases to obtain information.
The processor 11 is a computer such as a CPU (Central Processing Unit) and controls the entire information processing device 10 by executing a program prepared in advance. Specifically, the processor 11 may be a CPU, a GPU (Graphics Processing Unit), a DSP (Digital Signal Processor), an MPU (Micro Processing Unit), an FPU (Floating Point number Processing Unit), a PPU (Physics Processing Unit), a TPU (Tensor Processing Unit), a quantum processor, a microcontroller, or a combination thereof.
The processor 11 loads the program stored in the ROM 13 or the recording medium 16 to the RAM 14, and executes processes coded in the program. The processor 11 functions as part or all of the information processing device 10. The processor 11 performs a factor analysis process which will be described later.
The IF 12 transmits and receives data to and from an external device. Specifically, the information processing device 10 receives from the terminal device 20, through the IF 12, a prompt which instructs the factor analysis or an answer to a question output to the user.
The ROM 13 stores various programs to be executed by the processor 11. The RAM 14 is used as a working memory during executions of various processes by the processor 11.
The DB 15 stores a plurality of analytical algorithms used for the factor analysis.
The recording medium 16 is a non-volatile and non-transitory recording medium such as a disk-shaped recording medium or a semiconductor memory. The recording medium 16 may be detachably formed to the information processing device 10. The recording medium 16 records various programs executed by the processor 11.
In addition to the above, the information processing device 10 may include a display device such as a liquid crystal display or the like, and an input device such as a keyboard, a mouse, or the like. The display device and the input device are used, for instance, by an administrator of the information processing device 10.
The processor 21 is a computer such as a CPU, and controls the entire terminal device 20 by executing programs prepared in advance. The processor 21 may be a GPU, an FPGA, a DSP, an ASIC or the like.
The IF 22 transmits and receives data to and from an external device. Specifically, the terminal device 20 transmits the prompt created by the user and the answer of the user to the question to the information processing device 10 through the IF 22. The terminal device 20 receives the question to the user and the analysis result from the information processing device 10 through the IF 22.
The ROM 23 stores various programs executed by the processor 21. Also, the RAM 24 is used as a working memory during executions of various operations by the processor 21.
The recording medium 25 is a non-volatile and non-transitory recording medium such as a disk-shaped recording medium or a semiconductor memory. The recording medium 25 may be detachably formed to the terminal device 20. The recording medium 25 records various programs executed by the processor 21.
The input unit 26 is, for instance, an input device such as a keyboard, a mouse, or a touch panel. The display unit 27 is a display for displaying data based on a control of the processor 21.
The communication unit 111 is formed by the IF 12 illustrated in
The analytical algorithm DB 113 is formed by the DB 15 illustrated in
The analysis unit 112 is formed by the processor 11 illustrated in
At the time of the factor analysis of the prediction error, the analysis unit 112 acquires a prompt which the user has created by operating the terminal device 20. The prompt includes information designating the analytical algorithm to be used for the analysis and an instruction to perform the factor analysis of the prediction error, as will be described in detail below. The analysis unit 112 acquires an analytical algorithm designated from the analytical algorithm DB 113 in accordance with the prompt. The analytical algorithm is described in natural language, and the analysis unit 112 interprets and executes the analytical algorithm using the natural language model.
During the execution of the analytical algorithm, the analysis unit 112 asks the user the necessary questions. Specifically, the analysis unit 112 generates a question corresponding to a conditional branch included in the analytical algorithm, and transmits the question to the terminal device 20 via the communication unit 111. The user operates the terminal device 20 to send an answer to the question, and the analysis unit 112 acquires the answer through the communication unit 111. Then, the analysis unit 112 makes a determination for the conditional branch based on the acquired answer and continues the execution of the analytical algorithm. Thus, the analysis unit 112 repeats transmitting a question to the user and acquiring an answer for each conditional branch included in the analytical algorithm, and executes the analytical algorithm interactively. Then, when the analysis result is acquired in accordance with the analytical algorithm, the analysis unit 112 transmits the analysis result to the terminal device 20. Note that the analysis unit 112 corresponds to examples of a prompt acquisition unit, an algorithm acquisition unit, and an analysis execution unit.
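The question-and-answer handling for a single conditional branch can be sketched as follows. This is only an illustrative sketch: in the example embodiment the natural language model generates the question and interprets the answer, whereas here a fixed template and keyword matching stand in for the model. The names `Branch`, `make_question`, and `parse_answer` are hypothetical and do not appear in the disclosure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Branch:
    """One conditional branch of an analytical algorithm (hypothetical structure)."""
    subject: str    # e.g. "explanatory variable X"
    condition: str  # e.g. "condition A1"


def make_question(branch: Branch) -> str:
    """Build the natural-language question for one conditional branch."""
    return f"Does the {branch.subject} satisfy {branch.condition}? (yes/no)"


def parse_answer(answer: str) -> bool:
    """Map a user answer to a branch decision; reject unclear answers."""
    text = answer.strip().lower()
    if text in {"yes", "y"}:
        return True
    if text in {"no", "n"}:
        return False
    # An unclear answer is an assumption-level error case; a real system
    # might instead ask the user again.
    raise ValueError(f"unclear answer: {answer!r}")
```

For instance, `make_question(Branch("explanatory variable X", "condition A1"))` yields a question for step S21, and `parse_answer` converts the reply into the Yes/No determination for that branch.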
As described above, in the present example embodiment, the information processing device 10 executes the analysis process interactively by repeating questions and answers in natural language in accordance with the analytical algorithm prepared in advance. Therefore, the user who analyzes the prediction error can proceed with the analysis along the analytical algorithm simply by answering the questions transmitted from the information processing device 10. Consequently, if an appropriate analytical algorithm is prepared, even a user who does not have advanced knowledge and experience related to analysis of prediction errors can perform an appropriate analysis.
Next, an example of the factor analysis of the prediction error by the analysis unit 112 will be described.
First, the analytical algorithm AG analyzes whether or not the explanatory variable X is the prediction error factor. Specifically, if the explanatory variable X satisfies a condition A1 (step S21: Yes), the analytical algorithm AG determines that the explanatory variable X is not the prediction error factor (step S22). Moreover, if the explanatory variable X does not satisfy the condition A1 but satisfies a condition A2 (step S23: Yes), the analytical algorithm AG determines that the explanatory variable X is not the prediction error factor (step S22). On the other hand, if the explanatory variable X satisfies neither the condition A1 nor the condition A2 (step S23: No), the analytical algorithm AG determines that the explanatory variable X is the prediction error factor and outputs the analysis result “Factor location: Explanatory variable X, Factor: B1, Countermeasure: C1” (step S24).
If the explanatory variable X is not the prediction error factor, the analytical algorithm AG analyzes whether or not the objective variable Y is the prediction error factor. Specifically, if the objective variable Y satisfies a condition A3 (step S25: Yes), the analytical algorithm AG determines that the objective variable Y is not the prediction error factor (step S26). Moreover, if the objective variable Y does not satisfy the condition A3 but satisfies a condition A4 (step S27: Yes), the analytical algorithm AG determines that the objective variable Y is not the prediction error factor (step S26). On the other hand, if the objective variable Y satisfies neither the condition A3 nor the condition A4 (step S27: No), the analytical algorithm AG determines that the objective variable Y is the prediction error factor and outputs the analysis result “Factor location: Objective variable Y, Factor: B2, Countermeasure: C2” (step S28).
If the objective variable Y is not the prediction error factor, the analytical algorithm AG analyzes whether or not the predictive model M is the prediction error factor. Specifically, the analytical algorithm AG determines that the predictive model M is the prediction error factor if the predictive model M does not satisfy a condition A5 (step S29: No), and outputs the analysis result “Factor location: Model M, Factor: B3, Countermeasure: C3” (step S30). On the other hand, if the predictive model M satisfies the condition A5 (step S29: Yes), the analytical algorithm AG determines that the factor of the prediction error is unexplained, and outputs the analysis result of “unexplained factor” (step S31).
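The branching of the analytical algorithm AG in steps S21 to S31 can be summarized as a small decision function. This is an illustrative sketch only: the function name `analyze_ag` and the use of Boolean arguments for the conditions A1 to A5 are assumptions made for this example, and the disclosure itself describes the algorithm in natural language rather than in program code.

```python
def analyze_ag(a1: bool, a2: bool, a3: bool, a4: bool, a5: bool) -> str:
    """Return the analysis result of algorithm AG for given condition outcomes.

    Each argument states whether the corresponding condition (A1 to A5)
    is satisfied. In the interactive flow, a later condition is only asked
    about when it is actually reached.
    """
    # Steps S21 to S24: X is the factor only if neither A1 nor A2 holds.
    if not (a1 or a2):
        return "Factor location: Explanatory variable X, Factor: B1, Countermeasure: C1"
    # Steps S25 to S28: Y is the factor only if neither A3 nor A4 holds.
    if not (a3 or a4):
        return "Factor location: Objective variable Y, Factor: B2, Countermeasure: C2"
    # Steps S29 to S31: the model M is the factor if A5 does not hold.
    if not a5:
        return "Factor location: Model M, Factor: B3, Countermeasure: C3"
    return "unexplained factor"
```

Written this way, the order of the checks mirrors the order of the description: the explanatory variable is examined first, then the objective variable, and the predictive model last.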
Note that for convenience of explanation, details of the analytical algorithm AG have been described in the flowchart in
In
An item “#Explanatory variable analysis” describes contents of steps S21 to S24 in
The user generates a prompt including a designation of the analytical algorithm AG and an instruction to perform the factor analysis, and inputs the prompt to the information processing device 10.
The analysis unit 112 of the information processing device 10 interprets the prompt 41 using the natural language model and acquires the analytical algorithm AG designated from the analytical algorithm DB 113. The analytical algorithm AG is described in natural language as illustrated in
Specifically, the analysis unit 112 generates a question 42 asking whether or not the explanatory variable X satisfies the condition A1 based on a conditional branch of step S21 in
That is, the analysis unit 112 provides the question 44 to the user according to the analytical algorithm AG, and proceeds with the analysis based on an answer 45 to the question 44. When the answer 45 to the question based on the conditional branch of the step S29 is received, the analysis unit 112 outputs an analysis result 46 according to step S30 and terminates the analysis.
As described above, in the present example embodiment, the user interacts with the information processing device 10 using natural language, so that the factor analysis of the prediction error can be performed. Therefore, it is possible for even the user who does not have advanced knowledge or experience concerning the machine learning model to appropriately perform factor analysis.
Next, a flow of the analysis process for performing the analysis as described above will be described.
First, the information processing device 10 reads the designated analytical algorithm from the analytical algorithm DB 113 in accordance with the prompt generated by the user (step S51). Next, the information processing device 10 executes the designated analytical algorithm (step S52). If there is a conditional branch in the analytical algorithm (step S53: Yes), the information processing device 10 generates a question corresponding to that condition, transmits the generated question to the terminal device 20, and continues the analysis based on an answer of the user to the question (step S54). Thus, the information processing device 10 asks the user a question for each conditional branch and continues the analysis. Next, when the information processing device 10 ends the analysis in accordance with the analytical algorithm (step S55: Yes), the information processing device 10 outputs the analysis result to the terminal device 20 (step S56). After that, the analysis process is terminated.
Next, modifications of the above example embodiment will be described. The following modifications can be applied in appropriate combination.
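The flow of steps S51 to S56 can be sketched as a loop over conditional branches. The encoding below is an assumption made purely for illustration: the disclosure stores the analytical algorithm in natural language and relies on the natural language model to interpret it, whereas this sketch uses a hypothetical dictionary `ALGORITHM_DB` in which each branch node maps to a question and the next node for a Yes or No answer. The `ask` callback stands in for sending a question to the terminal device 20 and receiving the user's answer.

```python
from typing import Callable, Dict, Tuple

# Hypothetical encoding: node -> (question, next if yes, next if no).
# Any name that is not a branch node is treated as a final analysis result.
Algorithm = Dict[str, Tuple[str, str, str]]

RESULT_X = "Factor location: Explanatory variable X, Factor: B1, Countermeasure: C1"
RESULT_Y = "Factor location: Objective variable Y, Factor: B2, Countermeasure: C2"
RESULT_M = "Factor location: Model M, Factor: B3, Countermeasure: C3"

ALGORITHM_DB: Dict[str, Algorithm] = {
    "AG": {
        "S21": ("Does the explanatory variable X satisfy condition A1?", "S25", "S23"),
        "S23": ("Does the explanatory variable X satisfy condition A2?", "S25", RESULT_X),
        "S25": ("Does the objective variable Y satisfy condition A3?", "S29", "S27"),
        "S27": ("Does the objective variable Y satisfy condition A4?", "S29", RESULT_Y),
        "S29": ("Does the predictive model M satisfy condition A5?", "unexplained factor", RESULT_M),
    }
}


def run_analysis(name: str, ask: Callable[[str], bool], start: str = "S21") -> str:
    """Execute a designated algorithm interactively and return its result."""
    algorithm = ALGORITHM_DB[name]   # step S51: read the designated algorithm
    node = start                     # step S52: begin execution
    while node in algorithm:         # step S53: a conditional branch remains
        question, if_yes, if_no = algorithm[node]
        # Step S54: ask the user and continue along the chosen branch.
        node = if_yes if ask(question) else if_no
    return node                      # steps S55 and S56: output the result
```

As a usage example, `run_analysis("AG", lambda q: input(q + " [y/n] ").strip().lower() == "y")` would run the analysis from a console, asking one question per branch that is actually reached.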
In the above-described analysis example, the analysis unit 112 asks the user concerning all conditional branches in the analytical algorithm, but if information on a specific conditional branch has been already acquired, the question concerning the specific conditional branch can be omitted. For instance, if the prompt created by the user at a start of the analysis includes a statement that “the explanatory variable X satisfies the conditions A1 and A2,” the analysis unit 112 may omit questions corresponding to steps S21 and S23 in
Moreover, if the user executes the analysis again after changing an answer to a question or the like, the analysis unit 112 can omit the questions for a part which uses the previous answers of the user. For instance, in a case where the prompt created by the user at the start of the analysis includes a statement such as “assume the same answers for the explanatory variable X as in the previous analysis and execute the analysis again”, the analysis unit 112 may perform the analysis by reusing the previous answers of the user for steps S21 and S23 in
The analysis unit 112 may change the wording of the question depending on a level of knowledge or experience of the user (hereinafter, simply referred to as “knowledge level”). For instance, the analysis unit 112 may change the terms and phrases used for questions depending on whether the user is a data scientist having less than three years of experience or a data scientist having three years or more of experience. In a case where the knowledge level of the user is low, the analysis unit 112 may generate a question by reducing the use of technical terms in the questions to the user, using a less technical term if there are multiple terms with the same meaning, adding explanations of the technical terms, and the like.
In this case, if the information processing device 10 has information on the knowledge level of the user, the wording of the question may be changed based on that information. Also, if the user specifies the knowledge level of the user in the prompt, the wording of the question may be changed accordingly. For instance, if the user writes “ask a question at the knowledge level of the data scientist having the experience of less than three years” in the prompt, the information processing device 10 may generate the question adjusted to that knowledge level.
In addition, if the user answers the question output by the information processing device 10 with a reply such as “I don't understand the meaning of the question.” or “I don't understand the meaning of the term XX in the question.”, the information processing device 10 may regenerate the question with a different expression or regenerate the question using a different term.
In the analysis example described above, the information processing device 10 outputs the question to the user corresponding to the conditional branch in the analytical algorithm, but the question to the user is not limited to one related to the conditional branch. For instance, it may be a question to confirm a type of the analytical algorithm, a precondition for executing the analytical algorithm, or the like.
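Adjusting the wording of a question to the knowledge level can be sketched as follows. This is an illustrative sketch under stated assumptions: in the disclosure the natural language model would rephrase the question, whereas here a small hypothetical glossary `GLOSSARY` and simple string replacement stand in for the model. The three-year threshold follows the example above; the glossary entries and the function name `adapt_question` are assumptions for this example.

```python
from typing import Dict, Tuple

# Hypothetical glossary: technical term -> (plainer term, short explanation).
GLOSSARY: Dict[str, Tuple[str, str]] = {
    "explanatory variable": ("input data", "the values the model uses to make a prediction"),
    "objective variable": ("target value", "the value the model is trying to predict"),
}


def adapt_question(question: str, experience_years: float, threshold: float = 3.0) -> str:
    """Reword a question for users below the experience threshold."""
    if experience_years >= threshold:
        return question  # experienced users receive the original wording
    notes = []
    for term, (plain, explanation) in GLOSSARY.items():
        if term in question:
            # Use the less technical term and remember its explanation.
            question = question.replace(term, plain)
            notes.append(f"{plain}: {explanation}")
    if notes:
        question += " (" + "; ".join(notes) + ")"
    return question
```

The same function could be called again with a lower `experience_years` value when the user replies that the question was not understood, which corresponds to regenerating the question with a different expression.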
In the example embodiment described above, the user designates the analytical algorithm in the prompt, and the information processing device 10 acquires the designated analytical algorithm from the analytical algorithm DB 113. Alternatively, the user may give an analysis instruction to the information processing device 10 by including the analytical algorithm described in natural language in the prompt. For instance, the user may instruct the use of the analytical algorithm in the prompt by including a description of the analytical algorithm illustrated in
In the example embodiment described above, the analytical algorithm described in natural language is stored in the analytical algorithm DB 113 illustrated in
According to the information processing device 70 of the second example embodiment, it is possible for even a person who does not necessarily have sufficient knowledge or experience in a development and an operation of a machine learning model to appropriately analyze the factor contributing to the decrease in prediction accuracy of the machine learning model.
A part or all of the example embodiments described above may also be described as the following supplementary notes, but not limited thereto.
An information processing device comprising:
The information processing device according to supplementary note 1, wherein
The information processing device according to supplementary note 1, wherein the analysis execution means outputs the question concerning each of conditional branches included in the analytical algorithm.
The information processing device according to supplementary note 3, wherein
The information processing device according to supplementary note 1, wherein
The information processing device according to supplementary note 1, wherein
The information processing device according to supplementary note 1, wherein
An information processing method performed by a computer, the information processing method comprising:
A program causing a computer to perform a process comprising:
While the disclosure has been described with reference to the example embodiments and examples, the disclosure is not limited to the above example embodiments and examples. It will be understood by those of ordinary skill in the art that various changes in form and details may be made therein without departing from the spirit and scope of the present disclosure as defined by the claims.
This application is based upon and claims the benefit of priority from Japanese Patent Application 2023-209024, filed on Dec. 12, 2023, the disclosure of which is incorporated herein in its entirety by reference.
| Number | Date | Country | Kind |
|---|---|---|---|
| 2023-209024 | Dec 2023 | JP | national |