The present invention relates to an impact visualization system, an impact visualization method, and an impact visualization program that enable visualization of the impacts of explanatory variables used in prediction formulas.
Recently, accumulated data are increasingly analyzed to make future predictions. Accumulated data often follow different regularities depending on the conditions, so one known approach is to make predictions while switching between a plurality of prediction formulas according to the conditions.
For example, Non Patent Literature 1 (NPL 1) describes extracting complicated rules and patterns by using a heterogeneous mixture learning technology and outputting a model of the learned results. The learned results described in NPL 1 include prediction formulas classified according to factors such as day of the week, temperature, etc., and each prediction formula is expressed by a linear sum of weighted explanatory variables indicating the respective factors.
NPL 1 also describes a method of displaying influential factors (contributing factors) used when making predictions by switching between the prediction formulas. The display method illustrated in
NPL 1: NEC Corporation, “Data Utilization by Advanced Machine Learning Technology”, Administration & Information Systems, The Institute of Administrative Information Systems, October 2014, Vol. 50, pp. 84-87
In the example shown in
On the other hand, there may be a case where the impact of an explanatory variable on a predicted value (objective variable) varies in accordance with the value (segment) of the explanatory variable. Even with the same prediction formula, if the impacts vary according to the values of the explanatory variables, it will be difficult to express the impacts on the objective variable using such a simple stem plot as described in NPL 1.
In view of the foregoing, an object of the present invention is to provide an impact visualization system, an impact visualization method, and an impact visualization program that enable visualization of the impacts of explanatory variables on a prediction result such that a user can readily understand the impacts even in the case where the impacts of the explanatory variables on the prediction result vary according to the values or segments of the explanatory variables.
An impact visualization system according to the present invention is an impact visualization system that enables visualization of impacts of explanatory variables used in a prediction formula, the system including: an explanatory variable display unit which, with the prediction formula being expressed by a linear sum of functions of the explanatory variables, displays the explanatory variables used in the prediction formula on one dimensional axis by allocating a predetermined width to a respective one of the explanatory variables; and a function value display unit which, in accordance with possible values or segments of the respective explanatory variables, sets values or segments of the explanatory variables in the widths allocated thereto, and plots values of the functions specified by the values or the segments that have been set, at corresponding positions in another dimensional axis direction.
An impact visualization method according to the present invention is an impact visualization method that enables visualization of impacts of explanatory variables used in a prediction formula, the method including: with the prediction formula being expressed by a linear sum of functions of the explanatory variables, displaying the explanatory variables used in the prediction formula on one dimensional axis by allocating a predetermined width to a respective one of the explanatory variables; and in accordance with possible values or segments of the respective explanatory variables, setting values or segments of the explanatory variables in the widths allocated thereto, and plotting values of the functions specified by the values or the segments that have been set, at corresponding positions in another dimensional axis direction.
An impact visualization program according to the present invention is an impact visualization program that is applied to a computer and that enables visualization of impacts of explanatory variables used in a prediction formula, the program causing the computer to perform: an explanatory variable displaying process of, with the prediction formula being expressed by a linear sum of functions of the explanatory variables, displaying the explanatory variables used in the prediction formula on one dimensional axis by allocating a predetermined width to a respective one of the explanatory variables; and a function value displaying process of, in accordance with possible values or segments of the respective explanatory variables, setting values or segments of the explanatory variables in the widths allocated thereto, and plotting values of the functions specified by the values or the segments that have been set, at corresponding positions in another dimensional axis direction.
According to the present invention, even in the case where the impacts of explanatory variables on a prediction result vary according to the values or segments of the explanatory variables, the impacts of the explanatory variables on the prediction result can be visualized so as to be readily understood by a user.
An embodiment of the present invention will be described below with reference to the drawings.
The input unit 11 inputs a prediction formula to be displayed, to the display information generation unit 12. For example, in the case where necessary information is stored in a storage unit (not shown), the input unit 11 may extract the information from the storage unit and input the information to the display information generation unit 12. In the case where necessary information is to be received from another system (not shown), the input unit 11 may operate as an interface for receiving the information from the other system, and input the received information to the display information generation unit 12.
The information obtained by the input unit 11 is not limited to the form of the prediction formula. FIG. 2 illustrates an example of a prediction model. When the prediction model illustrated in
For example, in the case where a prediction model as illustrated in
The display information generation unit 12 generates display information for enabling visualization of impacts of explanatory variables used in a prediction formula. It is assumed in the present embodiment that a prediction formula is expressed by a linear sum of functions of the explanatory variables. Here, it is assumed that a function of an explanatory variable is expressed by a piecewise combination of linear functions, with the value of the function being uniquely determined in accordance with the value or segment of the explanatory variable. The piecewise combination of linear functions is a combination of the linear functions, defined in accordance with the value ranges or segments of the explanatory variable, and it is defined to cover possible values or segments of the explanatory variable.
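The assumed form of the prediction formula can be written compactly as follows. The symbols $\hat{y}$, $x_i$, $f_i$, $a_{i,k}$, $b_{i,k}$ and $R_{i,k}$ are notation introduced here for illustration only and do not appear in the original description; any weights or constant term are assumed to be absorbed into the functions $f_i$.

```latex
\hat{y} = \sum_{i=1}^{n} f_i(x_i),
\qquad
f_i(x_i) = a_{i,k}\, x_i + b_{i,k}
\quad \text{for } x_i \in R_{i,k},\; k = 1, \dots, K_i
```

Here $R_{i,1}, \dots, R_{i,K_i}$ are the value ranges or segments that together cover the possible values of the explanatory variable $x_i$. For a variable expressed by segments, each $R_{i,k}$ can be read as a single segment on which the function takes the constant value $b_{i,k}$.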
In the case where the explanatory variable is a variable expressed by continuous values (for example, price, temperature, etc.), the function of the explanatory variable is, for example, a function in which a conversion method is defined for each predetermined range. In the case where the explanatory variable is a variable expressed by discrete values (segments) (for example, weather, day of the week, etc.), the function of the explanatory variable is, for example, a function in which a value is defined according to each discrete value.
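The two kinds of functions described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the embodiment: the helper names `make_continuous` and `make_discrete`, the half-open range convention, and all numeric coefficients are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

# One linear piece: (lower bound inclusive, upper bound exclusive, slope, intercept).
# The half-open convention is an assumption made for this sketch.
Piece = Tuple[float, float, float, float]


def make_continuous(pieces: List[Piece]) -> Callable[[float], float]:
    """Return f(x) that applies the linear piece whose range contains x."""
    def f(x: float) -> float:
        for lo, hi, slope, intercept in pieces:
            if lo <= x < hi:
                return slope * x + intercept
        raise ValueError(f"{x} is outside every defined range")
    return f


def make_discrete(table: Dict[str, float]) -> Callable[[str], float]:
    """Return f(segment) that looks up the value defined for each segment."""
    def f(segment: str) -> float:
        return table[segment]
    return f


# Hypothetical functions for two explanatory variables.
temperature = make_continuous([
    (-10.0, 20.0, 0.5, 0.0),    # below 20 degrees: gentle slope
    (20.0, 40.0, 2.0, -30.0),   # 20 degrees and above: steeper slope
])
weather = make_discrete({"sunny": 3.0, "cloudy": 0.0, "rainy": -4.0})


def predict(temp: float, sky: str) -> float:
    """Prediction formula as a sum of the functions of the explanatory variables."""
    return temperature(temp) + weather(sky)


print(predict(30.0, "rainy"))
```

The impact of `temperature` on the predicted value differs between the two ranges, which is the situation a plain stem plot of weights cannot express.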
The display information generation unit 12 generates display information in which the explanatory variables used in a prediction formula are arranged on one dimensional axis, with a predetermined width allocated to a respective one of the explanatory variables. In the description of the present embodiment, it is assumed that the display information is displayed on a two-dimensional space and that the axis (one dimensional axis) on which the explanatory variables are arranged is the y axis.
On the one dimensional axis, any width may be allocated to an explanatory variable. The display information generation unit 12 may allocate, to each explanatory variable, an interval width predetermined for that explanatory variable, or it may allocate equal interval widths to all explanatory variables. Further, the display information generation unit 12 may allocate, to each explanatory variable, a width according to the range of possible values (segments) of that explanatory variable.
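Two of the allocation options mentioned above, equal widths and widths proportional to the range of possible values, can be sketched as follows. The function names, the `total` parameter, and the representation of an allocation as a `(start, end)` interval are assumptions made for illustration.

```python
from typing import Dict, List, Tuple

Interval = Tuple[float, float]  # (start, end) position on the one dimensional axis


def allocate_equal(names: List[str], total: float = 1.0) -> Dict[str, Interval]:
    """Give every explanatory variable the same interval width."""
    width = total / len(names)
    return {name: (i * width, (i + 1) * width) for i, name in enumerate(names)}


def allocate_by_range(spans: Dict[str, float], total: float = 1.0) -> Dict[str, Interval]:
    """Give each variable a width proportional to the size of its value range.

    `spans` maps a variable name to the size of its range, for example
    max - min for continuous values or the number of segments.
    """
    span_sum = sum(spans.values())
    allocation: Dict[str, Interval] = {}
    start = 0.0
    for name, span in spans.items():
        width = total * span / span_sum
        allocation[name] = (start, start + width)
        start += width
    return allocation


print(allocate_equal(["price", "weather"], total=4.0))
print(allocate_by_range({"price": 30.0, "weather": 10.0}, total=8.0))
```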
Next, the display information generation unit 12 sets values or segments of an explanatory variable in the width allocated thereto, in accordance with possible values or segments of the explanatory variable. The values or the segments of an explanatory variable may be set in any predetermined manner. In the case where the explanatory variable is a variable expressed by continuous values, the display information generation unit 12 may set the values of the explanatory variable such that, for example, the value increases in a fixed direction of the axis. In the case where the explanatory variable is a variable expressed by discrete values (segments), the display information generation unit 12 may set the segments of the explanatory variable such that, for example, each segment is set in a width obtained by dividing the allocated width by the number of possible segments.
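The setting of values or segments within an allocated width can be sketched as below, assuming linear interpolation for continuous values and an even split for segments. Both helper names and their parameters are hypothetical.

```python
from typing import Dict, List, Tuple


def position_continuous(value: float, lo: float, hi: float,
                        start: float, end: float) -> float:
    """Map a value in [lo, hi] to a position in [start, end].

    Linear interpolation makes the value increase in a fixed axis direction.
    """
    fraction = (value - lo) / (hi - lo)
    return start + fraction * (end - start)


def slots_discrete(segments: List[str], start: float,
                   end: float) -> Dict[str, Tuple[float, float]]:
    """Divide the allocated width evenly by the number of possible segments."""
    width = (end - start) / len(segments)
    return {
        segment: (start + i * width, start + (i + 1) * width)
        for i, segment in enumerate(segments)
    }


print(position_continuous(15.0, 10.0, 20.0, 0.0, 4.0))
print(slots_discrete(["sunny", "rainy"], 4.0, 6.0))
```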
Next, the display information generation unit 12 generates display information in which values of the functions specified by the values or the segments that have been set are plotted at corresponding positions in another dimensional axis direction. In the present embodiment, it is assumed that the function values are displayed in the x axis (the other dimensional axis) direction.
In the case where there are two or more prediction formulas, the display information generation unit 12 may plot function values at corresponding positions where the values of the functions specified by the explanatory variables in the respective prediction formulas are accumulated.
In the example shown in
Displaying the function values cumulatively in the above-described manner facilitates understanding, at a glance, the impacts of the explanatory variables over a plurality of prediction formulas. It is noted that the function values may be accumulated in any order. For example, the display information generation unit 12 may accumulate the function values in the order of identifiers that identify the respective prediction formulas.
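One possible reading of this accumulation is a stacked display: for each sampled value of an explanatory variable, the function value from each prediction formula is drawn starting where the previous formula's value ended. The sketch below follows that reading and accumulates in ascending order of formula identifier. The function name `accumulate`, the integer identifiers, and the example functions are assumptions, and the handling of mixed-sign values is left out.

```python
from typing import Callable, Dict, List, Tuple

Bar = Tuple[float, float]  # (start, end) of one stacked function value


def accumulate(functions: Dict[int, Callable[[float], float]],
               xs: List[float]) -> Dict[int, List[Bar]]:
    """Stack the function values of one explanatory variable across formulas.

    `functions` maps a prediction formula identifier to that formula's
    function of the explanatory variable; `xs` are the sampled values.
    """
    running = [0.0] * len(xs)           # accumulated total at each sample
    layers: Dict[int, List[Bar]] = {}
    for formula_id in sorted(functions):  # accumulate in identifier order
        f = functions[formula_id]
        layer: List[Bar] = []
        for j, x in enumerate(xs):
            value = f(x)
            layer.append((running[j], running[j] + value))
            running[j] += value
        layers[formula_id] = layer
    return layers


# Hypothetical functions of the same explanatory variable in two formulas.
layers = accumulate({2: lambda x: 2 * x, 1: lambda x: x + 1}, [0.0, 1.0])
print(layers)
```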
For example, in the case where a prediction formula for use in prediction from input data is selected in accordance with the content of the input data and the selected prediction formula is used to make a prediction from the input data, as in the prediction model generated by the heterogeneous mixture learning described in NPL 1, there exist two or more prediction formulas that can be selected.
In the case where these prediction formulas are each expressed by a linear sum of functions of explanatory variables, it is difficult to understand the impacts of the explanatory variables in the plurality of prediction formulas from a normal stem plot, because the term of each explanatory variable is expressed by a function. In contrast, in the present embodiment, a graph is displayed in which values of the functions of the respective explanatory variables are accumulated. Such a display enables understanding, at a glance, the explanatory variables used in the prediction formulas and also the impacts of the explanatory variables on a predicted value. This leads to an improved interpretation of the prediction model. Thus, for example, when a problem or degradation in performance occurs in such a prediction model as described above, the cause of the problem can also be readily identified.
The output unit 13 outputs the display information generated by the display information generation unit 12. For example, in the case where the output unit 13 is implemented by a display device, the output unit 13 by itself may display the display information. Alternatively, the output unit 13 may send an output instruction of the display information to another display device (not shown) to cause the display information to be output.
Further, in the present embodiment, a description has been made about the case where display information is generated by the display information generation unit 12 and, then, the display information is output by the output unit 13. Alternatively, it may be configured such that display information is displayed on a display device (not shown) each time the display information generation unit 12 generates the display information.
The input unit 11 and the display information generation unit 12 are each implemented by a CPU of a computer that operates in accordance with a program (impact visualization program). For example, the program may be stored in a storage unit (not shown) included in the impact visualization system, and the CPU may read the program and operate as the input unit 11 and the display information generation unit 12 in accordance with the program.
In the impact visualization system of the present embodiment, the input unit 11, the display information generation unit 12, and the output unit 13 may each be implemented by dedicated hardware. Further, the impact visualization system according to the present invention may be configured with two or more physically separate devices which are connected in a wired or wireless manner.
An operation of the impact visualization system of the present embodiment will now be described.
The display information generation unit 12 displays explanatory variables used in a prediction formula input by the input unit 11 on one dimensional axis (y axis) by allocating a predetermined width to a respective one of the explanatory variables (step S11). Next, the display information generation unit 12 sets values or segments of the explanatory variables in the widths allocated thereto, in accordance with possible values or segments of the respective explanatory variables (step S12). The display information generation unit 12 then plots values of the functions specified by the values or the segments that have been set, at corresponding positions in the other dimensional axis (x axis) direction (step S13).
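Steps S11 to S13 can be sketched end to end as a routine that returns plot coordinates rather than drawing them. This is a simplified illustration, not the embodiment itself: it handles only continuous explanatory variables, allocates an equal width of 1.0 to each, and samples a fixed number of values; the name `build_points` and the example functions are hypothetical.

```python
from typing import Callable, List, Tuple

# (name, lowest possible value, highest possible value, function of the variable)
Variable = Tuple[str, float, float, Callable[[float], float]]
Point = Tuple[str, float, float]  # (variable name, y position, x function value)


def build_points(variables: List[Variable], samples: int = 3) -> List[Point]:
    """Compute the points to plot; `samples` must be at least 2."""
    points: List[Point] = []
    for i, (name, lo, hi, f) in enumerate(variables):
        # S11: allocate an equal width of 1.0 to each variable on the y axis.
        y_start, y_end = float(i), float(i + 1)
        for k in range(samples):
            # S12: set values so that they increase along the allocated width.
            fraction = k / (samples - 1)
            value = lo + fraction * (hi - lo)
            y = y_start + fraction * (y_end - y_start)
            # S13: the function value gives the position in the x direction.
            points.append((name, y, f(value)))
    return points


# Hypothetical explanatory variables and functions.
example = [
    ("price", 0.0, 10.0, lambda v: 0.5 * v),
    ("temperature", 0.0, 20.0, lambda v: v if v < 10.0 else 10.0),
]
for point in build_points(example):
    print(point)
```

The returned tuples could be handed to any plotting library; keeping the coordinate computation separate from rendering mirrors the split between generating display information and outputting it.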
As described above, in the present embodiment, a prediction formula is expressed by a linear sum of functions of explanatory variables, and the display information generation unit 12 displays the explanatory variables used in the prediction formula on one dimensional axis by allocating a predetermined width to a respective one of the explanatory variables. Further, the display information generation unit 12 sets values or segments of the explanatory variables in the widths allocated thereto, in accordance with possible values or segments of the respective explanatory variables, and plots values of the functions specified by the values or the segments that have been set, at corresponding positions in the other dimensional axis direction. With such a configuration, even in the case where the impacts of explanatory variables on a prediction result vary in accordance with the values or segments of the explanatory variables, the impacts of the explanatory variables on the prediction result can be visualized so as to be readily understood by a user.
The present invention will be outlined below.
With such a configuration, even in the case where the impacts of explanatory variables on a prediction result vary in accordance with the values or segments of the explanatory variables, it is possible to visualize the impacts of the explanatory variables on the prediction result such that a user can readily understand the impacts.
In the case where there are two or more prediction formulas to be displayed, the function value display unit 82 may plot function values at corresponding positions where the values of the functions specified by the explanatory variables in the respective prediction formulas are accumulated. Such a configuration allows the explanatory variables used in a plurality of prediction formulas as well as the impacts of the explanatory variables on a predicted value to be understood at a glance.
Further, the impact visualization system may include a prediction formula extraction unit (for example, the input unit 11) that extracts each prediction formula from a prediction model in which a prediction formula for use in prediction from input data is selected in accordance with the content of the input data and the selected prediction formula is used to make a prediction from the input data.
Specifically, the function of an explanatory variable is expressed by a piecewise combination of linear functions, with the value of the function being uniquely determined in accordance with the value or the segment of the explanatory variable.
While the present invention has been described with reference to an embodiment and examples, the present invention is not limited to the embodiment or examples above. Various modifications appreciable by those skilled in the art are possible to the configuration and details of the present invention within the scope of the present invention.
This application claims priority based on U.S. Provisional Application Ser. No. 62/117,555 filed Feb. 18, 2015, the disclosure of which is incorporated herein in its entirety.
The present invention is suitably applied to an impact visualization system that enables visualization of the impacts of explanatory variables used in prediction formulas. For example, it is suitably applied to an apparatus that enables visualization of the impacts of the explanatory variables used in each prediction formula in a prediction model generated by heterogeneous mixture learning.
| Filing Document | Filing Date | Country | Kind |
|---|---|---|---|
| PCT/JP2016/000406 | 1/27/2016 | WO | 00 |

| Number | Date | Country |
|---|---|---|
| 62117555 | Feb 2015 | US |