MODEL GENERATION DEVICE, PARAMETER CALCULATION DEVICE, MODEL GENERATION METHOD, PARAMETER CALCULATION METHOD, AND RECORDING MEDIUM

Information

  • Patent Application
  • 20220229428
  • Publication Number
    20220229428
  • Date Filed
    May 22, 2019
  • Date Published
    July 21, 2022
Abstract
A model generation device includes a model generation unit that generates a third model indicating a relationship between a first model and a parameter of a second model, the first model indicating a relationship between a sample and a label of the sample, the second model indicating the relationship and being different from the first model.
Description
TECHNICAL FIELD

The present invention relates to a model generation device, a parameter calculation device, a model generation method, a parameter calculation method, and a recording medium.


BACKGROUND ART

Patent Document 1 discloses a simulation device that applies operation status prediction data learned preliminarily using meteorological data and the like to the execution of training simulation by a simulator, for the purpose of realizing a simulation based on the actual situation.


PRIOR ART DOCUMENTS
Patent Documents

[Patent Document 1] Japanese Unexamined Patent Application, First Publication No. 2008-180784


SUMMARY OF THE INVENTION
Problem to be Solved by the Invention

In the case where meaning can be given to a parameter of a model used in a simulator, it is conceivable to use the value of the parameter for analyzing an analysis target. For example, it is conceivable to acquire a parameter value that enables accurate simulation of the analysis target and estimate the state of the analysis target using the obtained parameter value. However, the process of acquiring an appropriate parameter value includes many processes. Therefore, the length of time required for the processing is long.


An example object of the present invention is to provide a model generation device, a parameter calculation device, a model generation method, a parameter calculation method, and a recording medium that can solve the problem described above.


Means for Solving the Problem

According to a first example aspect of the present invention, a model generation device includes: a model generation means for generating a third model indicating a relationship between a first model and a parameter of a second model, the first model indicating a relationship between a sample and a label of the sample, the second model indicating the relationship and being different from the first model.


According to a second example aspect of the present invention, a parameter calculation device includes: a model execution means for calculating a parameter of a second model regarding a given sample of a first model by applying a third model to the given sample, the first model indicating a relationship between a sample and a label of the sample, the second model indicating the relationship and being different from the first model, the third model indicating a relationship between the first model and a parameter of the second model.


According to a third example aspect of the present invention, a model generation method executed by a computer includes: generating a third model indicating a relationship between a first model and a parameter of a second model, the first model indicating a relationship between a sample and a label of the sample, the second model indicating the relationship and being different from the first model.


According to a fourth example aspect of the present invention, a parameter calculation method executed by a computer includes: calculating a parameter of a second model regarding a given sample of a first model by applying a third model to the given sample, the first model indicating a relationship between a sample and a label of the sample, the second model indicating the relationship and being different from the first model, the third model indicating a relationship between the first model and a parameter of the second model.


According to a fifth example aspect of the present invention, a recording medium stores a program for causing a computer to execute a function of: generating a third model indicating a relationship between a first model and a parameter of a second model, the first model indicating a relationship between a sample and a label of the sample, the second model indicating the relationship and being different from the first model.


According to a sixth example aspect of the present invention, a recording medium stores a program for causing a computer to execute a function of: calculating a parameter of a second model regarding a given sample of a first model by applying a third model to the given sample, the first model indicating a relationship between a sample and a label of the sample, the second model indicating the relationship and being different from the first model, the third model indicating a relationship between the first model and a parameter of the second model.


Effect of the Invention

According to an example embodiment of the present invention, data that can be used for analyzing an analysis target can be obtained in a comparatively short period of time.





BRIEF DESCRIPTION OF THE DRAWINGS


FIG. 1 is a schematic configuration diagram showing a device configuration example of a prediction system according to an example embodiment.



FIG. 2 is a schematic block diagram showing an example of a functional configuration of a model generation device according to the example embodiment.



FIG. 3 is a diagram showing an example of a production line that serves as a target of the prediction system according to the example embodiment.



FIG. 4 is a diagram showing an example of a configuration of the model generation device according to an example embodiment.



FIG. 5 is a diagram showing an example of a configuration of a parameter calculation device according to an example embodiment.



FIG. 6 is a diagram showing an example of processing in a model generation method according to an example embodiment.



FIG. 7 is a diagram showing an example of processing in a parameter calculation method according to an example embodiment.



FIG. 8 is a schematic block diagram showing a configuration of a computer according to at least one example embodiment.





EXAMPLE EMBODIMENT

Hereinafter, example embodiments of the present invention are described. However, the present invention within the scope of the claims is not limited by the following example embodiments. Furthermore, not all combinations of the features described in the example embodiments are necessarily essential to the solving means of the invention.



FIG. 1 is a schematic configuration diagram showing a device configuration example of a prediction system according to the example embodiment of the present invention. In the configuration shown in FIG. 1, a prediction system 1 includes a simulator device 10, a machine learning device 20, and a model generation device 30. Moreover, the simulator device 10, the machine learning device 20, the model generation device 30, and a prediction target 910 communicate with each other via a communication network 920.


The prediction system 1 predicts the behavior or state of the prediction target 910. Furthermore, the prediction system 1 acquires information that aids the analysis of the behavior or state of the prediction target 910.


The prediction target 910 may be any prediction target that enables simulation of the behavior or state thereof, and is not limited to a specific prediction target.


For example, in the case where the prediction target 910 is a physical distribution system of a courier company or the like, the prediction system 1 may, on the basis of the arrangement of resources such as trucks and staff members as well as the distribution of goods to be delivered, predict the delivery status after a predetermined length of time such as three hours (the arrangement of resources and the distribution of goods to be delivered after the predetermined length of time), and may provide the user with the prediction result. In such a case, the values of input data are a parameter value representing the arrangement of resources such as trucks and staff members, and a parameter value representing the distribution of goods to be delivered. The value of the output data is a parameter value representing the delivery status after a predetermined length of time. In addition, the prediction system 1 may provide the user with parameter values of a simulation model when the prediction is performed by a simulator. Moreover, the prediction system 1 may include parameters representing state values different from input data and output data (for example, parameters representing intermediate states, data representing the relationship between input data and output data).


The user can use the data from the prediction system 1 to analyze the delivery status such as whether or not the delivery status is favorable after a predetermined time, and if not, where a bottleneck is occurring.


Hereinafter, analysis of the behavior or state of the prediction target 910 is simply referred to as analysis of the prediction target 910. The above analysis of the delivery status corresponds to an example of the analysis of the prediction target 910.


The simulator device 10 simulates the behavior or state of the prediction target 910. The simulator device 10 uses a model including parameters as a simulation model for simulating the prediction target 910. The simulation model represents a process of calculating output data from input data. The simulation model may be, for example, a model that mathematically expresses the relationship between input data and output data, or a model that physically expresses an event between input and output.


The values to be set in the parameters are preliminarily defined with respect to the values of input data to the model for the simulator device 10, on the basis of the actual data in the prediction target 910. The simulator device 10 may automatically acquire the relationship between the value of input data and the parameter setting value by machine learning for example. Alternatively, a person (for example, the user of the prediction system 1) may preliminarily determine the parameter setting value for the value of the input data by executing a simulation or performing data analysis.


The simulation model of the simulator device 10 corresponds to an example of a second model. The simulator device 10 corresponds to an example of a parameter setting unit (parameter setting means).


The machine learning device 20 learns the behavior or state of the prediction target 910 (machine learning), and predicts the behavior or state of the prediction target 910, using the learning result. The machine learning device 20 may include a neural network for learning; however, the mechanism by which the machine learning device 20 performs machine learning is not limited to this example. The machine learning device 20 may learn using, for example, a model such as a support vector machine and a decision tree. The machine learning device 20 calculates the parameter value of the model so as to accurately calculate the output data value with respect to the input data value.


The model obtained by machine learning of the machine learning device 20 is called a machine learning model. The machine learning model of the machine learning device 20 corresponds to an example of a first model.


Comparing the prediction by the simulator device 10 and the prediction by the machine learning device 20, the prediction by the machine learning device 20 takes less time than the prediction by the simulator device 10. This is because the calculation amount of machine learning is typically less than the calculation amount of simulation performed on the basis of a physical model. Meanwhile, a person (for example, the user) can analyze the basis of the prediction performed by the simulator device 10. In contrast, it is difficult for a person to analyze the basis of the prediction performed by the machine learning device 20. This is because a physical model in the simulator device 10 is easier to understand than a mathematical model.


For example, the parameters of a simulation model used by the simulator device 10 are physical quantities related to the actual prediction target 910, and the user can take advantage of the values for the analysis of the prediction target 910. On the other hand, in the case where the machine learning device 20 performs machine learning using a neural network, it is usually difficult to associate the weight (parameter value) in the neural network with an actual physical quantity.


The model generation device 30 acquires parameter values to be set in the simulation model for the simulator device 10 to perform a similar prediction on the basis of the input/output in the prediction performed by the machine learning device 20. To this end, the model generation device 30 preliminarily learns the model that receives input of the input data for prediction and the prediction result obtained by the machine learning device 20 and outputs a parameter value of the simulation model for acquiring the prediction result from the input data. Hereinafter, the parameter value of the simulation model of the simulator device 10 for the simulator device 10 to output the same prediction result as the prediction result of the machine learning device 20 is referred to as the parameter value of the simulation model corresponding to the prediction result of the machine learning device 20. The prediction result from the simulator device 10 and the prediction result from the machine learning device 20 typically include errors. Therefore, in the present example embodiment, prediction results whose difference is within a predetermined range (for example, 1%, 5%, 7%, and so forth) are regarded as matching. In the following, for convenience of explanation, even if an error within a predetermined range occurs in a prediction result, the prediction result will be described using the term “same” or “matching”.
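
The tolerance-based notion of “matching” described above can be illustrated with a minimal sketch. The function name predictions_match, the use of a per-component relative error, and the default tolerance of 5% are assumptions made for illustration only; the example embodiment does not specify how the error is measured.

```python
def predictions_match(y_sim, y_ml, rel_tol=0.05):
    """Return True if every component of two predictions agrees within rel_tol.

    y_sim and y_ml are sequences of numbers (hypothetical simulator and
    machine learning outputs for the same input data).
    """
    if len(y_sim) != len(y_ml):
        raise ValueError("predictions must have the same length")
    for a, b in zip(y_sim, y_ml):
        # Relative error against the simulator output; fall back to the
        # absolute error when the reference value is zero.
        scale = abs(a) if a != 0 else 1.0
        if abs(a - b) / scale > rel_tol:
            return False
    return True
```

With a 5% tolerance, outputs of 100 and 103 would be treated as the same prediction result, whereas outputs of 100 and 110 would not.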


Moreover, the model learned by the model generation device 30 is also referred to as a bridge model. The bridge model corresponds to an example of a third model. In other words, the bridge model is a model that represents the relationship between the parameter values calculated by the machine learning device 20 and the parameter values of the simulation model.


The model generation device 30 is configured using a computer such as a personal computer (PC) or a workstation (WS).


The processing performed by the model generation device 30 is classified into a learning phase and a prediction phase. The model generation device 30 generates a bridge model in the learning phase. For example, the model generation device 30 acquires the bridge model and calculates the parameter values of the bridge model. Then, the model generation device 30 uses the bridge model in the prediction phase to acquire the parameter values of the simulation model corresponding to the prediction result of the machine learning device 20.


The model generation device 30 acquiring the parameter value of the simulation model enables the user to use the parameter value of the simulation model, for example, for the analysis of the prediction target 910 such as the analysis of the prediction result obtained by the machine learning device 20.


The communication network 920 mediates communication between the model generation device 30, the simulator device 10, the machine learning device 20, and the prediction target 910. The type of communication network 920 is not limited to a particular type. For example, the communication network 920 may be the Internet. Alternatively, the communication network 920 may be configured as a communication network of a dedicated line of the prediction system 1.


The method by which the prediction system 1 predicts the behavior or state of the prediction target 910 is not limited to a method performed by machine learning. Furthermore, the data acquired by the prediction system 1 for aiding the analysis of the prediction target 910 is not limited to the parameter values of the simulation model. For example, the prediction system 1 can be applied to various situations where the following conditions (1) and (2) are satisfied.


(1) As a method for predicting the behavior or state of the prediction target 910, a method is used in which it is difficult for a person to directly understand the basis of the prediction (from the parameter values of the prediction model and so forth).


(2) In the case where the data acquired by the prediction system 1 to aid the analysis of the prediction target 910 is supposed to be directly acquired (by simulation, analysis, or the like), it takes more time compared to acquiring the data using a bridge model or it is more difficult to directly acquire the data.


Hereunder, the data acquired by the prediction system 1 for aiding the analysis of the prediction target 910 is referred to as analysis aid data. The parameter values of the simulation model correspond to an example of the analysis aid data.


As will be described later, the model generation device 30 may generate a bridge model that outputs analysis aid data for the prediction input data without requiring the prediction results.


Any two or more of the simulator device 10, the machine learning device 20, and the model generation device 30 may be integrated into a single device. In such a case, the bridge model is a model that represents the relationship between the input data and the parameter values of the simulation model.



FIG. 2 is a schematic block diagram showing an example of a functional configuration of the model generation device 30. In the configuration shown in FIG. 2, the model generation device 30 includes a communication unit 110, a display unit 120, an operation input unit 130, a storage unit 180, and a control unit 190. The control unit 190 includes a model generation unit 191 and a model execution unit 192.


The communication unit 110 communicates with other devices. For example, the communication unit 110 receives a prediction result of the behavior or state of the prediction target 910 from the machine learning device 20.


The display unit 120 includes a display screen such as a liquid crystal panel or an LED (light emitting diode) panel, and displays various types of images. For example, the display unit 120 displays analysis aid data.


The operation input unit 130 includes input devices such as a keyboard and a mouse, and accepts user operations. For example, the operation input unit 130 accepts a user operation instructing acquisition of analysis aid data.


The storage unit 180 stores various types of data. The storage unit 180 is configured using a storage device included in the model generation device 30.


The control unit 190 controls each unit of the model generation device 30 and executes various processes. Functions of the control unit 190 are executed by a CPU (central processing unit) included in the model generation device 30 reading out a program from the storage unit 180 and executing the program.


The model generation unit 191 generates a bridge model in the learning phase. In the prediction phase, the model execution unit 192 acquires analysis aid data using the bridge model generated by the model generation unit 191. Specifically, the model execution unit 192 applies prediction input data to the bridge model to calculate the analysis aid data.


The device that executes the function of the model generation unit 191 (that is, the device that generates a bridge model) and the device that executes the function of the model execution unit 192 (that is, the device that acquires analysis aid data using the bridge model) may be configured as separate devices.


Generation of a bridge model by the model generation unit 191 will be further described.


Here, assuming that the following conditions (A) to (E) are satisfied, an example is described in which the machine learning device 20 performs machine learning for predicting the behavior or state of the prediction target 910, and the model generation device 30 generates a bridge model that outputs the parameter values of the simulation model of the simulator device 10.


(A) A simulation model for simulating the prediction target 910 exists. Moreover, a person can use the parameter values of the simulation model for the analysis of the prediction target 910.


(B) While the machine learning device 20 can sufficiently accurately predict the behavior or state of the prediction target 910, the values of the machine learning parameters of the machine learning device 20 cannot be used for the analysis of the prediction target 910.


(C) The calculation cost for the simulator device 10 to predict the behavior or state of the prediction target 910 in a simulation is higher than the calculation cost for the machine learning device 20 to perform a prediction using a machine learning result. In particular, the length of time required for the simulator device 10 to perform a prediction is longer than the length of time required for the machine learning device 20 to perform a prediction.


(D) There is a relationship between the value of the input data for predicting the behavior or state of the prediction target 910 and the value of the parameter of the simulation model of the simulator device 10.


(E) A sufficient length of time is available for offline calculation to acquire the relationship between the value of the input data for predicting the behavior or state of the prediction target 910 and the value of the parameter of the simulation model of the simulator device 10. Meanwhile, the length of time available for predicting the behavior or state of the prediction target 910 is limited.


A machine learning model of the machine learning device 20 is expressed as Equation (1).


[Equation 1]

$$y = f_{\mathrm{ml}}(x;\ \xi) \qquad (1)$$


x is input data for prediction and includes $d_x$ real numbers. That is to say, x is an element of $\mathbb{R}^{d_x}$. “$\mathbb{R}$” indicates the real space. x corresponds to an example of a sample.


y is output data indicating a prediction result and includes $d_y$ real numbers. That is to say, y is an element of $\mathbb{R}^{d_y}$. y corresponds to an example of a label. The label here refers to data related to a sample. The label may be a class representing discrete information or a numerical value representing continuous information.


ξ is a vector representation of a machine learning parameter value. The machine learning device 20 has $d_\xi$ real-number parameters as machine learning parameters. That is to say, ξ is an element of $\mathbb{R}^{d_\xi}$.


Moreover, a simulation performed by the simulator device 10 is expressed as Equation (2).


[Equation 2]





$$y = f_{\mathrm{sim}}(x;\ \theta) \qquad (2)$$


x and y are similar to those in the case of Equation (1). Ideally, the machine learning device 20 and the simulator device 10 output the same prediction result (output data y) for the same input data x. In the following, it is assumed that the difference in the output data y for the same input data x is sufficiently small and the output data y can be considered the same between the machine learning device 20 and the simulator device 10.


θ is a vector representation of a simulation model parameter value. The simulation model of the simulator device 10 has $d_\theta$ real-number parameters. That is to say, θ is an element of $\mathbb{R}^{d_\theta}$.


The model generation unit 191 acquires the expression in the RKHS (Reproducing Kernel Hilbert Space) for each of the function $f_{\mathrm{ml}}$ indicating the machine learning model of the machine learning device 20 and the function $f_{\mathrm{sim}}$ indicating the simulation model of the simulator device 10. This process is referred to as preprocessing.


The model generation unit 191 acquires the function that receives the input of the function indicating the machine learning model of the machine learning device 20 and outputs the function of the simulation model of the simulator device 10. This process is referred to as actual processing.


In the preprocessing, upon receiving the input of $\{X_n^1, Y_n^1, \ldots, X_n^L, Y_n^L\}$, the model generation unit 191 calculates $\{\hat{\mu}_1^{\mathrm{ml}}, \ldots, \hat{\mu}_L^{\mathrm{ml}}\}$ and $\{\hat{\mu}_1^{\mathrm{sim}}, \ldots, \hat{\mu}_L^{\mathrm{sim}}\}$.


Here, it is assumed that the parameter value ξ of the machine learning model of the machine learning device 20 and the parameter value θ of the simulation model of the simulator device 10 change according to the changes in the state of the prediction target 910.


$X_n^l$ (l = 1, ..., L) is sample data of the input data x for prediction in a unit time in which the parameter values ξ and θ can be considered constant. $X_n^l$ shows the distribution of x with n sample data. As mentioned above, x is an element of $\mathbb{R}^{d_x}$, and therefore, $X_n^l$ is indicated by $n \times d_x$ real numbers. That is to say, $X_n^l$ is an element of $\mathbb{R}^{n \times d_x}$.


$Y_n^l$ (l = 1, ..., L) is sample data of the output data y indicating the prediction result obtained by the machine learning device 20 in a unit time in which the parameter values ξ and θ can be considered constant. $Y_n^l$ shows the distribution of y with n sample data. As mentioned above, y is an element of $\mathbb{R}^{d_y}$, and therefore, $Y_n^l$ is indicated by $n \times d_y$ real numbers. That is to say, $Y_n^l$ is an element of $\mathbb{R}^{n \times d_y}$.


In the following, a case will be described in which the unit time in which the parameter values ξ and θ can be considered constant is one day. However, the unit time in which the parameter values ξ and θ can be considered constant is not limited to a particular length of time. For example, in the case where variation is comparatively likely in the state of the prediction target 910, the unit time in which the parameter values ξ and θ can be considered constant may be three hours.


$\hat{\mu}_l^{\mathrm{ml}}$ (l = 1, ..., L) indicates a kernel mean of the machine learning model corresponding to a dataset $\{X_n^l, Y_n^l\}$. The model corresponding to the dataset $\{X_n^l, Y_n^l\}$ is a model that outputs y of the distribution indicated by $Y_n^l$ for the input of x of the distribution indicated by $X_n^l$. The hat symbol “^” indicates an estimated value.


$\hat{\mu}_l^{\mathrm{sim}}$ (l = 1, ..., L) indicates a kernel mean of the simulation model corresponding to a dataset $\{X_n^l, Y_n^l\}$.


The kernel means are indicated by points on the RKHS. As a method for the model generation unit 191 to calculate $\hat{\mu}_l^{\mathrm{ml}}$ and $\hat{\mu}_l^{\mathrm{sim}}$, a method of kernel ABC (Kernel Approximate Bayesian Computation) can be used.
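
As a rough illustration of how kernel ABC yields a kernel mean, the following sketch computes the weights of prior samples from simulated outputs and an observed output, following the commonly published form of kernel ABC, w = (G + m·δ·I)⁻¹ k(y_obs). The function names, the Gaussian kernel on the outputs, the bandwidth, and the regularization constant delta are assumptions for illustration and are not specified by the example embodiment.

```python
import numpy as np

def gaussian_gram(A, B, bandwidth):
    """Gaussian kernel matrix between the rows of A (p, d) and the rows of B (q, d)."""
    sq_dist = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq_dist / (2.0 * bandwidth ** 2))

def kernel_abc_weights(simulated, observed, bandwidth, delta):
    """Weights w such that the posterior kernel mean is sum_j w[j] * k_theta(., theta_j).

    simulated: (m, d) array; row j is the output simulated with prior sample theta_j.
    observed:  (d,) array; the output that the parameter should reproduce.
    """
    m = simulated.shape[0]
    G = gaussian_gram(simulated, simulated, bandwidth)
    k_obs = gaussian_gram(simulated, observed[None, :], bandwidth)[:, 0]
    # Regularized linear solve: (G + m * delta * I) w = k_obs.
    return np.linalg.solve(G + m * delta * np.eye(m), k_obs)
```

Prior samples whose simulated output lies close to the observed output receive large weights, so the weighted sum of kernels concentrates on parameter values that reproduce the observation.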


In the actual processing, the model generation unit 191 calculates $\hat{T}$ on the basis of $\{\hat{\mu}_1^{\mathrm{ml}}, \ldots, \hat{\mu}_L^{\mathrm{ml}}\}$ and $\{\hat{\mu}_1^{\mathrm{sim}}, \ldots, \hat{\mu}_L^{\mathrm{sim}}\}$. $\hat{T}$ is an expression in the RKHS space of the function that outputs the simulation model $\hat{\mu}^{\mathrm{sim}}$ upon receiving the input of the machine learning model $\hat{\mu}^{\mathrm{ml}}$.


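The model generation unit 191 calculates $\hat{T}$ on the basis of Equation (3).


[Equation 3]

$$\hat{T} = \operatorname*{arg\,min}_{T} \left\{ \frac{1}{L} \sum_{l=1}^{L} \left\| \hat{\mu}_l^{\mathrm{sim}} - T\left(\hat{\mu}_l^{\mathrm{ml}}\right) \right\|_{H}^{2} + \lambda \left\| T \right\|_{H}^{2} \right\} \qquad (3)$$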


Here, “λ” is a constant for regularization and “λ>0”. “H” indicates the RKHS space. $\|\cdot\|_H$ indicates the norm in the RKHS space. In the RKHS space, polynomial functions are indicated by points, and the similarity of functions can be calculated by the norm.


As shown by the term $\sum_{l=1}^{L} \| \hat{\mu}_l^{\mathrm{sim}} - T(\hat{\mu}_l^{\mathrm{ml}}) \|_H^2$ of Equation (3), $\hat{T}$ is calculated so that the error from $\hat{\mu}_l^{\mathrm{sim}}$ is as small as possible when converting $\hat{\mu}_l^{\mathrm{ml}}$ using the function T.


$\lambda \| T \|_H^2$ in Equation (3) is a regularization term for preventing overfitting, and functions as a penalty term for the model becoming complex.


Next, the calculation of the parameter value of the simulation model by the model execution unit 192 will be further described.


Upon receiving an input of $\{X_n^{L+1}, Y_n^{L+1}\}$, the model execution unit 192 calculates the parameter value of the simulation model corresponding to this $\{X_n^{L+1}, Y_n^{L+1}\}$. The parameter value of the simulation model corresponding to $\{X_n^{L+1}, Y_n^{L+1}\}$ here is a parameter value for the simulator device 10 to output $Y_n^{L+1}$ for the input of $X_n^{L+1}$.


The model execution unit 192 calculates $\hat{\mu}_{L+1}^{\mathrm{ml}}$ on the basis of $\{X_n^{L+1}, Y_n^{L+1}\}$, and applies the obtained $\hat{\mu}_{L+1}^{\mathrm{ml}}$ to $\hat{T}$ to calculate $\hat{\mu}_{L+1}^{\mathrm{sim}}$. The simulator device 10 calculates $\theta_{L+1}$ on the basis of the obtained $\hat{\mu}_{L+1}^{\mathrm{sim}}$.


The method by which the model execution unit 192 calculates $\hat{\mu}_{L+1}^{\mathrm{ml}}$ on the basis of $\{X_n^{L+1}, Y_n^{L+1}\}$, and the method by which the model generation unit 191 finds $\hat{\mu}_l^{\mathrm{ml}}$ on the basis of $\{X_n^l, Y_n^l\}$ can be the same.


Here, a Gaussian-like kernel κ is defined as in Equation (4), where μ and μ′ are both taken as functions in the RKHS space.


[Equation 4]

$$\kappa(\mu, \mu') = \exp\left\{ -\frac{1}{2\sigma_\mu^{2}} \left\| \mu - \mu' \right\|_{H}^{2} \right\} \in H \qquad (4)$$


$\sigma_\mu$ is a constant indicating the width of the kernel κ, where “$\sigma_\mu > 0$”.
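
The value of the kernel in Equation (4) can be evaluated in closed form when each kernel mean is represented as a weighted sum of kernels centered on samples, since the squared RKHS distance then expands into three Gram-matrix terms. The following is a minimal sketch under that assumption; the representation of a kernel mean as a (weights, points) pair, the inner Gaussian kernel, and all function names are hypothetical.

```python
import numpy as np

def gaussian_gram(A, B, bandwidth):
    """Gaussian kernel matrix between the rows of A (p, d) and the rows of B (q, d)."""
    sq_dist = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq_dist / (2.0 * bandwidth ** 2))

def rkhs_sq_dist(w_a, pts_a, w_b, pts_b, bandwidth):
    """Squared RKHS distance between mu_a = sum_i w_a[i] k(., pts_a[i]) and mu_b."""
    aa = w_a @ gaussian_gram(pts_a, pts_a, bandwidth) @ w_a
    ab = w_a @ gaussian_gram(pts_a, pts_b, bandwidth) @ w_b
    bb = w_b @ gaussian_gram(pts_b, pts_b, bandwidth) @ w_b
    # Clip tiny negative values caused by floating-point round-off.
    return max(aa - 2.0 * ab + bb, 0.0)

def gaussian_like_kernel(mean_a, mean_b, bandwidth, sigma_mu):
    """Kernel value exp(-||mu_a - mu_b||_H^2 / (2 sigma_mu^2)) between two kernel means."""
    w_a, pts_a = mean_a
    w_b, pts_b = mean_b
    dist2 = rkhs_sq_dist(w_a, pts_a, w_b, pts_b, bandwidth)
    return float(np.exp(-dist2 / (2.0 * sigma_mu ** 2)))
```

Identical kernel means give a value of 1, and the value decreases toward 0 as the two kernel means move apart in the RKHS.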


The calculation of $\hat{\mu}_{L+1}^{\mathrm{sim}}$ performed by the model execution unit 192 is expressed as Equation (5).


[Equation 5]

$$\hat{\mu}_{L+1}^{\mathrm{sim}} = \hat{T}\left(\hat{\mu}_{L+1}^{\mathrm{ml}}\right) = \sum_{l=1}^{L} v_l\, \hat{\mu}_l^{\mathrm{sim}} \in H \qquad (5)$$


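The coefficients $v_l$ are given as the components $v_1, \ldots, v_L$ of the vector v in Equation (6).


[Equation 6]

$$v = (v_1, \ldots, v_L)^{T} = (G + \lambda L I)^{-1}\, k\left(\hat{\mu}_{L+1}^{\mathrm{ml}}\right) \in \mathbb{R}^{L} \qquad (6)$$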


The superscript “T” indicates the transposition of a matrix or vector. “I” indicates an identity matrix.


“G” indicates the Gram matrix and is expressed as Equation (7).


[Equation 7]

$$G = \left\{ \kappa\left(\hat{\mu}_l^{\mathrm{ml}}, \hat{\mu}_{l'}^{\mathrm{ml}}\right) \right\}_{l,l'=1}^{L} \in \mathbb{R}^{L \times L} \qquad (7)$$


“$k(\hat{\mu}_{L+1}^{\mathrm{ml}})$” is expressed as Equation (8).


[Equation 8]

$$k\left(\hat{\mu}_{L+1}^{\mathrm{ml}}\right) = \left( \kappa\left(\hat{\mu}_1^{\mathrm{ml}}, \hat{\mu}_{L+1}^{\mathrm{ml}}\right), \ldots, \kappa\left(\hat{\mu}_L^{\mathrm{ml}}, \hat{\mu}_{L+1}^{\mathrm{ml}}\right) \right)^{T} \in \mathbb{R}^{L} \qquad (8)$$
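
Equations (5) to (8) amount to a kernel ridge regression between kernel means, which can be sketched as a single linear solve. In the sketch below, the Gram matrix G and the vector k_new are assumed to have been computed beforehand with the kernel κ; the function names and the array layout of the simulation-side weights are hypothetical.

```python
import numpy as np

def bridge_weights(G, k_new, lam):
    """Solve v = (G + lam * L * I)^{-1} k_new, following Equation (6).

    G:     (L, L) Gram matrix of kappa between the L machine learning kernel means.
    k_new: (L,) vector of kappa between each stored kernel mean and the new one.
    lam:   regularization constant, lam > 0.
    """
    L = G.shape[0]
    return np.linalg.solve(G + lam * L * np.eye(L), k_new)

def apply_bridge(v, sim_weights):
    """Coefficients of the new simulation-side kernel mean, following Equation (5).

    sim_weights[l, j] is the weight of the j-th parameter sample in the l-th
    stored simulation kernel mean; the result holds v[l] * sim_weights[l, j].
    """
    return v[:, None] * sim_weights
```

When the new machine learning kernel mean coincides with one of the stored ones and the regularization constant is very small, v approaches a one-hot vector, so the bridge model returns the corresponding stored simulation kernel mean.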


As a method for the model execution unit 192 to calculate the parameter value of the simulation model from the kernel mean $\hat{\mu}_{L+1}^{\mathrm{sim}}$, a method of kernel herding can be used. For example, the model execution unit 192 calculates sample data $\theta_{L+1,j}$ of the parameter value of the simulation model with respect to the kernel mean $\hat{\mu}_{L+1}^{\mathrm{sim}}$, using Equation (9).


[Equation 9]

$$\theta_{L+1,j} = \operatorname*{arg\,max}_{\theta} \sum_{l=1}^{L} \sum_{j'=1}^{m} v_l\, w_{l,j'}\, k_\theta\left(\theta, \theta_{l,j'}\right) + \frac{1}{m+1} \sum_{j'=1}^{m} k_\theta\left(\theta, \theta_{l,j'}\right) \in \mathbb{R}^{d_\theta} \qquad (9)$$


$\theta_{L+1,j}$ (j = 1, ..., m) indicates the jth sampling data of $\theta_{L+1}$. Therefore, $(\theta_{L+1,1}, \ldots, \theta_{L+1,m})$ indicates $\theta_{L+1}$.


The weight $w_{l,j}$ is calculated by kernel ABC (Kernel Approximate Bayesian Computation) with respect to $\{X_n^l, Y_n^l\}$ for obtaining the kernel mean of a posterior distribution of $\theta_l$.


“$k_\theta$” indicates a Gaussian kernel. $\theta_{l,j}$ indicates the jth sampling data from the prior distribution with respect to the lth dataset $\{X_n^l, Y_n^l\}$.


In the case of “j=2, . . . , m”, the entire Equation (9) is applied. For the case of “j=1”, which is the initial state, the first term on the right side of Equation (9) is used. Therefore, in the case of “j=1”, Equation (10) is applied.









[Equation 10]

$$
\theta_{L+1,j} = \underset{\theta}{\arg\max} \sum_{l=1}^{L} \sum_{j'=1}^{m} v_{l}\, w_{l,j'}\, k_{\theta}(\theta, \theta_{l,j'}) \in \mathbb{R}^{d_{\theta}} \tag{10}
$$







As described above, the user can use $\theta_{L+1}$ for the analysis of the prediction target 910 as a result of the model execution unit 192 calculating $\theta_{L+1}$.
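The sampling step can be sketched in Python as follows. This is a minimal sketch of the textbook kernel herding update over a finite candidate grid, which subtracts the averaged kernel to the samples already selected; its sign and normalization may differ from Equation (9) as printed, and the one-dimensional parameter, the grid, the toy weights, and all function names are assumptions made for illustration.

```python
import math

def gaussian_kernel(a, b, sigma):
    """Gaussian kernel between two scalar parameter values."""
    return math.exp(-((a - b) ** 2) / (2.0 * sigma ** 2))

def flatten(v, w, theta):
    """Pair each prior sample theta[l][j] with its combined weight v[l] * w[l][j]."""
    return [(v[l] * w[l][j], theta[l][j])
            for l in range(len(v)) for j in range(len(theta[l]))]

def kernel_mean(x, weighted_points, sigma):
    """Weighted kernel sum: sum over l, j' of v_l * w_{l,j'} * k(x, theta_{l,j'})."""
    return sum(wt * gaussian_kernel(x, p, sigma) for wt, p in weighted_points)

def kernel_herding(weighted_points, candidates, m, sigma):
    """Greedily choose m candidates; the first pick uses the kernel-mean term only."""
    selected = []
    for _ in range(m):
        def score(x):
            # Penalize candidates close to samples that were already selected.
            repulsion = sum(gaussian_kernel(x, s, sigma) for s in selected)
            return kernel_mean(x, weighted_points, sigma) - repulsion / (len(selected) + 1)
        selected.append(max(candidates, key=score))
    return selected

# Hypothetical toy inputs: L = 2 training sets with m = 2 prior samples each.
v = [0.5, 0.5]
w = [[1.0, 0.0], [0.0, 1.0]]
theta = [[1.0, 2.0], [2.0, 3.0]]
points = flatten(v, w, theta)
candidates = [i * 0.1 for i in range(41)]
samples = kernel_herding(points, candidates, m=4, sigma=0.5)
```

With these toy weights the kernel mean has two modes, and the first two selected samples land on one mode each, which is the behavior that makes herded samples representative of the distribution.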


Calculation of $\hat{\mu}^{\mathrm{sim}}_{l}$ by the model generation unit 191 will be further described.


The model generation unit 191 may calculate $\hat{\mu}^{\mathrm{sim}}_{l}$ using Equation (11).









[Equation 11]

$$
\hat{\mu}^{\mathrm{sim}}_{l} = \sum_{j=1}^{m} w_{l,j}\, k_{\theta}(\cdot, \theta_{l,j}) \in H \tag{11}
$$







Each $\theta_{l,j}$ is a sample of parameter values drawn from the prior distribution $\pi(\theta)$. m indicates the number of samples, and j is an index that identifies each individual sample.


The “·” in the parentheses indicates that the variable of the function in the RKHS is not limited to a particular variable.


The l in Equation (11) may be read as L+1, and the model execution unit 192 may calculate $\hat{\mu}^{\mathrm{sim}}_{L+1}$ using Equation (11).
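Because the kernel mean in Equation (11) is a function rather than a number, it can help to see it as a closure that is evaluated at a query point. The Python sketch below assumes two-dimensional parameter vectors and made-up weights; the names `make_kernel_mean` and `mu_hat` are hypothetical.

```python
import math

def gaussian_kernel(a, b, sigma):
    """Gaussian kernel between two equal-length parameter vectors."""
    sq_dist = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def make_kernel_mean(samples, weights, sigma):
    """Return the function theta -> sum_j weights[j] * k(theta, samples[j])."""
    if len(samples) != len(weights):
        raise ValueError("samples and weights must have the same length")
    def mu_hat(theta):
        return sum(w * gaussian_kernel(theta, s, sigma)
                   for w, s in zip(weights, samples))
    return mu_hat

# Hypothetical prior samples (theta_1, theta_2) and weights for one data set.
theta_samples = [(2.0, 1.0), (2.5, 1.5), (3.0, 1.0)]
weights = [0.2, 0.5, 0.3]
mu_hat = make_kernel_mean(theta_samples, weights, sigma=1.0)
value_at_sample = mu_hat((2.5, 1.5))
```

The placeholder “·” in the equation corresponds to the argument `theta` of the returned function.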


Next, calculation of $\hat{\mu}^{\mathrm{ml}}_{l}$ by the model generation unit 191 will be further described.


Here, as an example of the case where the machine learning model of the machine learning device 20 is a parametric model, a case is assumed where the machine learning device 20 performs machine learning using a Bayesian neural network having several hidden layers. The parametric model referred to here is a model having parameters (here, learning parameters).


In this case, the model generation unit 191 may calculate $\hat{\mu}^{\mathrm{ml}}_{l}$ on the basis of Equation (12).









[Equation 12]

$$
\hat{\mu}^{\mathrm{ml}}_{l} = \sum_{j=1}^{m} k_{\xi}(\cdot, \xi_{l,j}) \in H \tag{12}
$$







The posterior distribution $\xi_{l}$ for “l=1, . . . , L” is an element of R and can be obtained using the Markov chain Monte Carlo (MCMC) method or a variation thereof.


m indicates the number of parameter samples. “j=1, . . . , m” is used as an index to identify each individual parameter sample.


The Gaussian-like kernel $\kappa$ of the function $\hat{\mu}^{\mathrm{ml}}_{l}$ (where $\hat{\mu}^{\mathrm{ml}}_{l}$ is an element of H) is taken as shown in Equation (13).









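[Equation 13]

$$
\begin{aligned}
\kappa\!\left( \hat{\mu}^{\mathrm{ml}}_{l}, \hat{\mu}^{\mathrm{ml}}_{l'} \right)
&= \exp\left\{ -\frac{1}{2\sigma_{\mu}^{2}} \left\| \hat{\mu}^{\mathrm{ml}}_{l} - \hat{\mu}^{\mathrm{ml}}_{l'} \right\|_{H}^{2} \right\} \in H \\
&= \exp\left\{ -\frac{1}{2\sigma_{\mu}^{2}} \left( 1 - \left\langle \hat{\mu}^{\mathrm{ml}}_{l}, \hat{\mu}^{\mathrm{ml}}_{l'} \right\rangle \right) \right\} \\
&= \exp\left\{ -\frac{1}{\sigma_{\mu}^{2}} \left( 1 - \sum_{j=1}^{m} \sum_{j'=1}^{m} k_{\xi}(\xi_{l,j}, \xi_{l',j'}) \right) \right\}
\end{aligned}
\tag{13}
$$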







The constant $\sigma_{\mu}$ indicates the width of the Gaussian-like kernel $\kappa$, where “$\sigma_{\mu} > 0$”.


<⋅, ⋅> indicates the inner product.


The Gaussian Kernel of ξ is taken as shown in Equation (14).









[Equation 14]

$$
k_{\xi}(\xi_{l,j}, \xi_{l',j'}) = \exp\left\{ -\frac{1}{2\sigma_{\xi}^{2}} \left\| \xi_{l,j} - \xi_{l',j'} \right\|^{2} \right\} \in \mathbb{R} \tag{14}
$$







The constant $\sigma_{\xi}$ indicates the width of the Gaussian kernel $k_{\xi}$, where “$\sigma_{\xi} > 0$”.


While $k_{\xi}(\cdot, \xi_{l,j})$ in Equation (12) cannot be calculated directly, the calculation becomes possible by taking the inner product, because “$\langle k(\cdot, \xi_{1}), k(\cdot, \xi_{2}) \rangle = k(\xi_{1}, \xi_{2})$” holds. In the case of $k_{\xi}(\cdot, \xi_{l,j})$ in Equation (12), the inner product becomes $k_{\xi}(\xi_{l,j}, \xi_{l',j'})$ in Equation (14), which makes the calculation possible.
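The inner-product identity can be sketched in Python as follows. The sketch evaluates the Gaussian-like kernel through the squared RKHS distance in the first line of Equation (13), expanding the norm into three inner products that only require kernel evaluations between samples. The uniform 1/m weighting of each kernel mean and the toy sample values are assumptions of this example.

```python
import math

def gaussian_kernel(a, b, sigma):
    """Gaussian kernel between two equal-length parameter vectors."""
    sq_dist = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def inner_product(samples_a, samples_b, sigma_xi):
    """<mu_a, mu_b> for uniformly weighted kernel means, via <k(., x), k(., y)> = k(x, y)."""
    total = sum(gaussian_kernel(a, b, sigma_xi)
                for a in samples_a for b in samples_b)
    return total / (len(samples_a) * len(samples_b))

def kappa(samples_a, samples_b, sigma_xi, sigma_mu):
    """Gaussian-like kernel exp(-||mu_a - mu_b||^2 / (2 * sigma_mu^2))."""
    sq_norm = (inner_product(samples_a, samples_a, sigma_xi)
               + inner_product(samples_b, samples_b, sigma_xi)
               - 2.0 * inner_product(samples_a, samples_b, sigma_xi))
    # Clamp tiny negative values caused by floating-point rounding.
    return math.exp(-max(sq_norm, 0.0) / (2.0 * sigma_mu ** 2))

# Hypothetical posterior samples of learning parameters for two data sets.
xi_a = [(0.1, 0.2), (0.3, 0.1)]
xi_b = [(1.0, 1.2), (0.9, 1.1)]
k_aa = kappa(xi_a, xi_a, sigma_xi=1.0, sigma_mu=1.0)
k_ab = kappa(xi_a, xi_b, sigma_xi=1.0, sigma_mu=1.0)
```

No explicit feature map is ever constructed; every quantity reduces to pairwise evaluations of the Gaussian kernel on the samples.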


The l in Equation (12) may be read as L+1, and the model execution unit 192 may calculate $\hat{\mu}^{\mathrm{ml}}_{L+1}$ using Equation (12).


In this way, even in the case where prediction data cannot be obtained by the machine learning device 20, the parameter value of the simulation model can still be calculated, because the model generation unit 191 and the model execution unit 192 calculate the kernel mean $\hat{\mu}^{\mathrm{ml}}_{l}$ of the machine learning model.


Meanwhile, as an example of the case where the machine learning model of the machine learning device 20 is a non-parametric model, a case is assumed where the machine learning device 20 performs machine learning using Gaussian process regression (GPR). The non-parametric model referred to here is a model not having parameters (here, learning parameters).


Gaussian process regression is equivalent to a Bayesian neural network having a single hidden layer and an infinite number of nodes.


As a result of Gaussian process regression, the mean of the anterior distribution shown in Equation (15) is obtained.









[Equation 15]

$$
y = \hat{\mu}_{Y|X,l} = \sum_{i=1}^{n} u_{l,i}\, k_{x}(x, X_{l,i}) \in H \tag{15}
$$







$u_{l,i}$ is calculated using a Gram matrix of the kernel $k_{x}$. Equation (15) holds because of the equivalence between Gaussian process regression and kernel ridge regression.
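One common way to obtain such weights is the kernel ridge regression form u = (K + λI)⁻¹Y, where K is the Gram matrix of the kernel on the training inputs. The Python sketch below follows that form; the noise term `noise`, the scalar inputs, and the data values are assumptions made for illustration, not values from the specification.

```python
import math

def k_x(a, b, sigma_x):
    """Gaussian kernel between two scalar inputs."""
    return math.exp(-((a - b) ** 2) / (2.0 * sigma_x ** 2))

def solve(A, b):
    """Solve A u = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[r]] for r, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [x - f * y for x, y in zip(M[r], M[col])]
    return [M[r][n] / M[r][r] for r in range(n)]

def fit_weights(X, Y, sigma_x, noise):
    """u = (K + noise * I)^{-1} Y, with K the Gram matrix of k_x on the inputs X."""
    n = len(X)
    K = [[k_x(X[i], X[j], sigma_x) + (noise if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    return solve(K, Y)

def predict(x, X, u, sigma_x):
    """Evaluate sum_i u_i * k_x(x, X_i), the form of Equation (15)."""
    return sum(u_i * k_x(x, X_i, sigma_x) for u_i, X_i in zip(u, X))

# Hypothetical training data: production amounts X and shipping times Y.
X = [100.0, 105.0, 110.0, 115.0]
Y = [10.0, 10.5, 11.0, 14.0]
u = fit_weights(X, Y, sigma_x=5.0, noise=1e-2)
y_new = predict(112.0, X, u, sigma_x=5.0)
```

The same `predict` function evaluates the fitted function at any new input, including one that was not part of the training data.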


Using $\hat{\mu}_{Y|X,l}$, $\hat{Y}_{l,n+1}$ is calculated as shown in Equation (16).


[Equation 16]





$$
\hat{Y}_{l,n+1} = \hat{\mu}_{Y|X,l}(X_{l,n+1}) \tag{16}
$$


In the case of a parametric model, the parameter $\xi_{l}$ is an input, whereas $X^{n}_{l}$ is an input in the case of a non-parametric model. Therefore, in a parametric model, the machine learning parameter ξ is converted to the simulator parameter θ, whereas in a non-parametric model, the input X is converted to the simulator parameter θ.


Therefore, the model execution unit 192 can obtain the parameter value of the simulator model without the need for obtaining the prediction result of the machine learning device 20.


Here, the Gaussian-like kernel $\kappa$ of the function $\hat{\mu}^{\mathrm{ml}}_{l}$ is taken as shown in Equation (17).














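[Equation 17]

$$
\kappa\!\left( \hat{\mu}^{\mathrm{ml}}_{l}, \hat{\mu}^{\mathrm{ml}}_{l'} \right) = \exp\left\{ -\frac{1}{\sigma_{\mu}^{2}} \left( 1 - \sum_{i=1}^{n} \sum_{i'=1}^{n} u_{l,i}\, u_{l',i'}\, k_{x}(X_{l,i}, X_{l',i'}) \right) \right\} \in H \tag{17}
$$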







Also, the Gaussian Kernel of x is taken as shown in Equation (18).









[Equation 18]

$$
k_{x}(X_{l,i}, X_{l',i'}) = \exp\left\{ -\frac{1}{2\sigma_{x}^{2}} \left\| X_{l,i} - X_{l',i'} \right\|^{2} \right\} \in \mathbb{R} \tag{18}
$$







The constant $\sigma_{x}$ indicates the width of the kernel $k_{x}$, where “$\sigma_{x} > 0$”.
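Equations (17) and (18) can be read together as a double sum of weighted kernel evaluations. The following Python sketch implements Equation (17) literally for scalar inputs; the function names and the toy weights are assumptions. Note that this form returns exactly 1 for identical arguments only when the inner product of the function with itself is 1.

```python
import math

def k_x(a, b, sigma_x):
    """Gaussian kernel of Equation (18) for scalar inputs."""
    return math.exp(-((a - b) ** 2) / (2.0 * sigma_x ** 2))

def inner_product(u_a, X_a, u_b, X_b, sigma_x):
    """sum_i sum_i' u_a[i] * u_b[i'] * k_x(X_a[i], X_b[i'])."""
    return sum(ua * ub * k_x(xa, xb, sigma_x)
               for ua, xa in zip(u_a, X_a)
               for ub, xb in zip(u_b, X_b))

def kappa(u_a, X_a, u_b, X_b, sigma_x, sigma_mu):
    """Gaussian-like kernel of Equation (17): exp(-(1 - <mu_a, mu_b>) / sigma_mu^2)."""
    ip = inner_product(u_a, X_a, u_b, X_b, sigma_x)
    return math.exp(-(1.0 - ip) / sigma_mu ** 2)

# Hypothetical regression weights and inputs for two data sets.
u_1, X_1 = [0.6, 0.4], [100.0, 105.0]
u_2, X_2 = [0.5, 0.5], [110.0, 115.0]
k_12 = kappa(u_1, X_1, u_2, X_2, sigma_x=5.0, sigma_mu=1.0)
k_21 = kappa(u_2, X_2, u_1, X_1, sigma_x=5.0, sigma_mu=1.0)
```

Evaluating this kernel for every pair of data sets yields the entries of the Gram matrix used by the bridge model.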


Next, an example of a situation to which the prediction system 1 is applied will be described.


The prediction system 1 can be applied to, for example, prediction of the required length of time on a factory production line.



FIG. 3 is a diagram showing an example of a production line that serves as a target of the prediction system 1. In the example of FIG. 3, an assembly device and an inspection device are installed on the production line.


The assembly device assembles four components (an upper component, a lower component, and two screws) to produce a product. The product assembled by the assembly device is transported into the inspection device. The inspection device performs an inspection once four of the products have been transported into it.


In this assembly process, the production amount of the product per unit time is defined as data X, and the shipping time of X products (value of data X) is defined as data Y. In addition, the number of parameters in the simulation model of the simulator device 10 is two, the working time of the assembly device (the length of time required for the assembly process) is θ1, and the working time of the inspection device (the length of time required for the inspection process) is θ2.


It is assumed that, as the number of products to be produced increases in this process, the load increases and the length of time that elapses in each process increases significantly. Specifically, if the value of X exceeds 110, it takes time for assembly and inspection, and the values of both θ1 and θ2 become large.
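To make the roles of θ1 and θ2 concrete, the production line can be caricatured by a very small simulator. The Python sketch below is purely hypothetical: it assumes strictly serial processing, one inspection per batch of four products, and invented working times that jump once X exceeds 110. None of the numeric values other than the batch size and the threshold come from the specification.

```python
import math

def shipping_time(x, theta1, theta2, batch_size=4):
    """Toy shipping time: x assemblies, then one inspection per full or partial batch."""
    batches = math.ceil(x / batch_size)
    return x * theta1 + batches * theta2

def true_parameters(x, threshold=110):
    """Hypothetical load effect: both working times jump once x exceeds the threshold."""
    return (2.0, 20.0) if x <= threshold else (3.5, 30.0)

def observe(x):
    """Generate a toy (X, Y) pair from the hidden parameters."""
    theta1, theta2 = true_parameters(x)
    return x, shipping_time(x, theta1, theta2)

light_load = observe(100)
heavy_load = observe(120)
```

In this picture the machine learning model learns the map from X to Y, while the bridge model is meant to recover the hidden (θ1, θ2) behind a given prediction.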


By predicting the shipping time of this production line by the machine learning device 20, the user can confirm whether or not the production line is operating appropriately. Moreover, by calculating the value of the parameters θ1, θ2 of the simulation model by the model generation device 30, the user can use the parameter value calculated by the model generation device 30 for analyzing the production line, for example analyzing where the bottleneck of the shipping time is.


The parameter value calculated by the model generation device 30 is not limited to the length of time required for each process described above, and may be various parameter values that can affect the prediction of the machine learning device 20. For example, in the case of a production line that is affected by the state of the surrounding environment such as weather or temperature, the model generation device 30 may calculate the weather or temperature, or a combination thereof as a parameter value in addition to or instead of the length of time required for each process.


However, the application target of the prediction system 1 is not particularly limited. For example, the prediction system 1 may be applied to a physical distribution system of a delivery company or the like. Alternatively, the prediction system 1 may be applied to predicting the flow of people, such as when guiding people safely and efficiently at a venue where people gather, such as a fireworks display.


In the above description, the case of so-called distribution-distribution regression, which is illustrated by the distribution using sample data in both the machine learning model and the simulation model, has been described as an example. However, the application scope of the prediction system 1 is not limited to this example.


With use of a linear kernel as the kernel in the RKHS space, the kernel mean is the mean value of the distribution. Therefore, even when either one or both of the machine learning model and the simulation model are indicated by points, the prediction system 1 can be applied in a manner similar to that described above.
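This property of the linear kernel can be checked numerically. The Python sketch below compares the kernel mean evaluated at an arbitrary point with the inner product between that point and the ordinary sample mean; the sample values and function names are assumptions chosen for the example.

```python
def linear_kernel(a, b):
    """Linear kernel k(a, b) = <a, b> for equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def kernel_mean_at(t, samples):
    """Kernel mean evaluated at t: (1/m) * sum_j k(t, samples[j])."""
    return sum(linear_kernel(t, s) for s in samples) / len(samples)

def sample_mean(samples):
    """Component-wise arithmetic mean of the samples."""
    m = len(samples)
    return [sum(s[d] for s in samples) / m for d in range(len(samples[0]))]

samples = [(1.0, 2.0), (3.0, 0.0), (2.0, 4.0)]
t = (0.5, -1.0)
lhs = kernel_mean_at(t, samples)             # kernel mean as a function of t
rhs = linear_kernel(t, sample_mean(samples)) # inner product with the mean vector
```

Both quantities agree for every choice of `t`, so under a linear kernel the kernel mean carries exactly the information contained in the mean value of the distribution.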


That is to say, the model generation unit 191 may calculate as a bridge model the function in the RKHS space that receives an input of the distribution of a function indicating the machine learning model and outputs the distribution of the function of the simulation model.


Alternatively, the model generation unit 191 may calculate as a bridge model the function in the RKHS space that receives an input of the distribution of a function indicating the machine learning model and outputs the point indicating the function of the simulation model.


Alternatively, the model generation unit 191 may calculate as a bridge model the function in the RKHS space that receives an input of a point indicating a function that indicates the machine learning model and outputs the distribution of the function of the simulation model.


Alternatively, the model generation unit 191 may calculate as a bridge model the function in the RKHS space that receives an input of a point indicating a function that indicates the machine learning model and outputs the point indicating the function of the simulation model.


By indicating either one or both of the machine learning model and the simulation model by points, the calculation cost of the model generation device 30 can be made comparatively low. In this respect, the model generation device 30 can calculate the parameter values of the simulation model relatively quickly.


As described above, the prediction system 1 can be applied not only in the case of distribution-distribution regression, but also in those cases of distribution-point regression, point-distribution regression, and point-point regression.


As described above, the model generation unit 191 generates a bridge model indicating a relationship between: a machine learning model that indicates a relationship between prediction data input to the machine learning model and the prediction result based on the prediction data; and a parameter of a simulation model that indicates the above relationship and is different from the machine learning model.


According to the model generation device 30, when predicting the behavior or state of the prediction target 910, the parameter value of the simulation model can be obtained without the need for executing the simulation model. In this respect, according to the model generation device 30, data that can be used for analyzing an analysis target can be obtained in a comparatively short period of time.


Moreover, the model generation unit 191 generates as a bridge model the function in the RKHS space that receives an input of the distribution of a function indicating the machine learning model and outputs the distribution of the function of the simulation model.


According to the model generation unit 191, the distribution of parameter values of the simulation model can be calculated, and in this respect, the parameter values of the simulation model can be calculated at a high level of accuracy.


Moreover, the model generation unit 191 generates as a bridge model the function in the RKHS space that receives an input of the distribution of a function indicating the machine learning model and outputs a point indicating the function of the simulation model.


According to the model generation unit 191, since the function indicating the simulation model is indicated by a point, the calculation cost can be made comparatively low.


Moreover, the model generation unit 191 generates as a bridge model the function in the RKHS space that receives an input of a point that indicates a function indicating the machine learning model and outputs the distribution of the function of the simulation model.


According to the model generation unit 191, since the machine learning model is indicated by a point, the calculation cost can be made comparatively low.


Moreover, the model generation unit 191, upon receiving an input of points that indicate a function indicating the machine learning model, generates the function in the RKHS space as a bridge model that outputs points indicating the function of the simulation model.


According to the model generation unit 191, since the function indicating the machine learning model and the function indicating the simulation model are both indicated by points, the calculation cost can be made comparatively low.


Moreover, the model generation unit 191 generates as a bridge model the function in the RKHS space that receives an input of the kernel mean that indicates the machine learning model and outputs the kernel mean that indicates the function of the simulation model.


According to the model generation device 30, a technique such as kernel mean can be used as part of bridge model generation, and the bridge model generation process can be designed comparatively easily.


Moreover, the model execution unit 192 calculates a parameter value of the simulation model on the basis of the kernel mean that indicates the simulation model.


The user can use this parameter value for the analysis of the prediction target 910.


The model execution unit 192 applies a bridge model indicating a relationship between: a first model that indicates a relationship between prediction data input to a machine learning model and the prediction result of the prediction data; and a parameter of a simulation model that indicates the above relationship and is different from the machine learning model, to a given sample of the machine learning model to thereby calculate the parameter of the simulation model regarding the given sample.


The user can use this parameter value for the analysis of the prediction target 910.


Next, a configuration example of an example embodiment will be described, with reference to FIG. 4 to FIG. 7.



FIG. 4 is a diagram showing a configuration example of a model generation device according to an example embodiment. A model generation device 200 shown in FIG. 4 includes a model generation unit 201.


In this configuration, the model generation unit 201 generates a third model indicating a relationship between: a first model that indicates the relationship between a sample and the label of the sample; and a parameter of a second model that indicates the above relationship and that is different from the first model.


According to the model generation device 200, when predicting the behavior or state of a prediction target, the parameter value of the simulation model can be obtained without the need for executing the simulation model. In this respect, according to the model generation device 200, data that can be used for analyzing an analysis target can be obtained in a comparatively short period of time.



FIG. 5 is a diagram showing an example of a configuration of a parameter calculation device according to an example embodiment. A parameter calculation device 210 shown in FIG. 5 includes a model execution unit 211.


In this configuration, the model execution unit 211 calculates a parameter of the second model regarding a given sample of a first model by applying to the given sample, a third model indicating a relationship between: the first model that indicates a relationship between a sample and a label of the sample; and a parameter of a second model that indicates the above relationship and that is different from the first model.


The user can use the parameter value of the second model regarding the given sample for the analysis of a prediction target.



FIG. 6 is a diagram showing an example of processing in a model generation method according to an example embodiment.


The model generation method shown in FIG. 6 includes Step S11. Step S11 is a step of generating a third model indicating a relationship between: a first model that indicates a relationship between a sample and a label of the sample; and a parameter of a second model that indicates the above relationship and that is different from the first model.


According to the model generation method shown in FIG. 6, when predicting the behavior or state of a prediction target, the parameter value of the simulation model can be obtained without the need for executing the simulation model. In this respect, according to the model generation method, data that can be used for analyzing an analysis target can be obtained in a comparatively short period of time.



FIG. 7 is a diagram showing an example of processing in the parameter calculation method according to an example embodiment.


The parameter calculation method shown in FIG. 7 includes Step S21. Step S21 is a step of calculating a parameter of the second model regarding a given sample of a first model by applying to the given sample, a third model indicating a relationship between: the first model that indicates a relationship between a sample and a label of the sample; and a parameter of a second model that indicates the above relationship and that is different from the first model.


According to the parameter calculation method shown in FIG. 7, the user can use the parameter value of the second model regarding the given sample for the analysis of a prediction target.


In the example embodiments described above, the description has been made using a simulation; however, the parameter value may actually represent the actual operation (or state) of a prediction target, instead of the simulation. Alternatively, the operation of the prediction target may be controlled according to the calculated parameter value. In such a case, the model generation device sets the calculated parameter value as the parameter value of a control device that controls the processing (operation) of the prediction target. The control device controls the prediction target according to the parameter value. For example, the model generation device functions as a device that determines the goods to be delivered to be loaded on the truck according to the calculated parameter value. Alternatively, the model generation device functions as a device that determines the processing amount to be processed by each device on the production line according to the calculated parameter value, and controls the operation of each device according to the determined processing amount. That is to say, the model generation device functions as a device that controls the processing (operation) of the prediction target according to the calculated parameter value.



FIG. 8 is a schematic block diagram showing a configuration of a computer according to at least one example embodiment.


In the configuration shown in FIG. 8, a computer 700 includes a CPU 710, a main storage device 720, an auxiliary storage device 730, and an interface 740.


One or both of the model generation device 30 and the model generation device 200 may be implemented in the computer 700. In such a case, operations of the respective processing units described above are stored in the auxiliary storage device 730 in the form of a program. The CPU 710 reads the program from the auxiliary storage device 730, loads it on the main storage device 720, and executes the processing described above according to the program. Moreover, the CPU 710 secures, according to the program, storage regions corresponding to the respective storage units mentioned above, in the main storage device 720. Communication between each device and another device is executed by the interface 740 having a communication function and communicating according to the control of the CPU 710.


In the case where the model generation device 30 is implemented in the computer 700, operations of the control unit 190 and each unit thereof are stored in the auxiliary storage device 730 in the form of a program. The CPU 710 reads the program from the auxiliary storage device 730, loads it on the main storage device 720, and executes the processing described above according to the program.


Moreover, the CPU 710 secures, according to the program, a storage region corresponding to the storage unit 180 mentioned above, in the main storage device 720. Communication performed by the communication unit 110 is executed by the interface 740 having a communication function and communicating according to the control of the CPU 710. The processing performed by the display unit 120 is executed by the interface 740 including a display device and displaying images according to the control of the CPU 710. The processing performed by the operation input unit 130 is executed by the interface 740 including an input device, accepting user operations, and outputting signals indicating the performed user operations to the CPU 710.


In the case where the model generation device 200 is implemented in the computer 700, operations of the model generation unit 201 are stored in the auxiliary storage device 730 in the form of a program. The CPU 710 reads the program from the auxiliary storage device 730, loads it on the main storage device 720, and executes the processing described above according to the program.


It should be noted that a program for realizing all or part of the functions of the model generation device 30, the model generation device 200, or the parameter calculation device 210 may be recorded on a computer-readable recording medium, and the program recorded on the recording medium may be read into and executed on a computer system, to thereby perform the processing of each unit. The “computer system” referred to here includes an operating system and hardware such as peripheral devices.


Moreover, the “computer-readable recording medium” referred to here refers to a portable medium such as a flexible disk, a magneto-optical disk, a ROM (Read Only Memory), and a CD-ROM (Compact Disc Read Only Memory), or a storage device such as a hard disk built in a computer system. The above program may be a program for realizing a part of the functions described above, and may be a program capable of realizing the functions described above in combination with a program already recorded in a computer system.


The example embodiments of the present invention have been described in detail with reference to the drawings. However, the specific configuration of the invention is not limited to the example embodiments, and may include design changes and so forth that do not depart from the scope of the present invention.


The whole or part of the example embodiments disclosed above can be described as, but not limited to, the following supplementary notes.


(Supplementary Note 1)

A model generation device comprising:


a model generation means for generating a third model indicating a relationship between a first model and a parameter of a second model, the first model indicating a relationship between a sample and a label of the sample, the second model indicating the relationship and being different from the first model.


(Supplementary Note 2)

The model generation device according to supplementary note 1, wherein the model generation means generates the third model that receives input of a distribution of a function indicating the first model and outputs a distribution of a function indicating the second model.


(Supplementary Note 3)

The model generation device according to supplementary note 1, wherein the model generation means generates the third model that receives input of a distribution of a function indicating the first model and outputs a point indicating a function indicating the second model.


(Supplementary Note 4)

The model generation device according to supplementary note 1, wherein the model generation means generates the third model that receives input of a point indicating a function indicating the first model and outputs a distribution of a function indicating the second model.


(Supplementary Note 5)

The model generation device according to supplementary note 1, wherein the model generation means generates the third model that receives input of a point indicating a function indicating the first model and outputs a point indicating a function indicating the second model.


(Supplementary Note 6)

The model generation device according to any one of supplementary notes 1 to 5, wherein the model generation means generates, as the third model, a function of an RKHS space that receives input of a kernel mean indicating the first model and outputs a kernel mean indicating the second model.


(Supplementary Note 7)

The model generation device according to supplementary note 6, further comprising:


a model execution means for calculating a value of the parameter of the second model based on the kernel mean indicating the second model.


(Supplementary Note 8)

A parameter calculation device comprising:


a model execution means for calculating a parameter of a second model regarding a given sample of a first model by applying a third model to the given sample, the first model indicating a relationship between a sample and a label of the sample, the second model indicating the relationship and being different from the first model, the third model indicating a relationship between the first model and a parameter of the second model.


(Supplementary Note 9)

A model generation method executed by a computer, comprising:


generating a third model indicating a relationship between a first model and a parameter of a second model, the first model indicating a relationship between a sample and a label of the sample, the second model indicating the relationship and being different from the first model.


(Supplementary Note 10)

A parameter calculation method executed by a computer, comprising:


calculating a parameter of a second model regarding a given sample of a first model by applying a third model to the given sample, the first model indicating a relationship between a sample and a label of the sample, the second model indicating the relationship and being different from the first model, the third model indicating a relationship between the first model and a parameter of the second model.


(Supplementary Note 11)

A recording medium storing a program for causing a computer to execute a function of:


generating a third model indicating a relationship between a first model and a parameter of a second model, the first model indicating a relationship between a sample and a label of the sample, the second model indicating the relationship and being different from the first model.


(Supplementary Note 12)

A recording medium storing a program for causing a computer to execute a function of:


a model execution means for calculating a parameter of a second model regarding a given sample of a first model by applying a third model to the given sample, the first model indicating a relationship between a sample and a label of the sample, the second model indicating the relationship and being different from the first model, the third model indicating a relationship between the first model and a parameter of the second model.


This application is based upon and claims the benefit of priority from Japanese patent application No. 2019-096325, filed May 22, 2019, the disclosure of which is incorporated herein in its entirety by reference.


INDUSTRIAL APPLICABILITY

The present invention may be applied to a model generation device, a parameter calculation device, a model generation method, a parameter calculation method, and a recording medium.


REFERENCE SYMBOLS


1 Prediction system



10 Simulator device



20 Machine learning device



30, 200 Model generation device



110 Communication unit (communication means)



120 Display unit (display means)



130 Operation input unit (operation input means)



180 Storage unit (storage means)



190 Control unit (control means)



191, 201 Model generation unit (model generation means)



192 Model execution unit (model execution means)

Claims
  • 1. A model generation device comprising: at least one memory configured to store the instructions; and at least one processor configured to execute the instructions to: generate a third model indicating a first relationship between a first model and a parameter of a second model, the first model indicating a second relationship between a sample and a label of the sample, the second model indicating the second relationship and being different from the first model.
  • 2. The model generation device according to claim 1, wherein the at least one processor is configured to execute the instructions to generate the third model that receives input of a distribution of a function indicating the first model and outputs a distribution of a function indicating the second model.
  • 3. The model generation device according to claim 1, wherein the at least one processor is configured to execute the instructions to generate the third model that receives input of a distribution of a function indicating the first model and outputs a point indicating a function indicating the second model.
  • 4. The model generation device according to claim 1, wherein the at least one processor is configured to execute the instructions to generate the third model that receives input of a point indicating a function indicating the first model and outputs a distribution of a function indicating the second model.
  • 5. The model generation device according to claim 1, wherein the at least one processor is configured to execute the instructions to generate the third model that receives input of a point indicating a function indicating the first model and outputs a point indicating a function indicating the second model.
  • 6. The model generation device according to claim 1, wherein the at least one processor is configured to execute the instructions to generate, as the third model, a function of a reproducing kernel Hilbert space (RKHS) that receives input of a kernel mean indicating the first model and outputs a kernel mean indicating the second model.
  • 7. The model generation device according to claim 6, wherein the at least one processor is configured to execute the instructions to calculate a value of the parameter of the second model based on the kernel mean indicating the second model.
  • 8. A parameter calculation device comprising: at least one memory configured to store the instructions; and at least one processor configured to execute the instructions to: calculate a parameter of a second model regarding a given sample of a first model by applying a third model to the given sample, the first model indicating a relationship between a sample and a label of the sample, the second model indicating the relationship and being different from the first model, the third model indicating a relationship between the first model and a parameter of the second model.
  • 9-12. (canceled)
Priority Claims (1)
Number Date Country Kind
2019-096325 May 2019 JP national
PCT Information
Filing Document Filing Date Country Kind
PCT/JP2020/020085 5/22/2019 WO 00