This application is based upon and claims the benefit of priority from Japanese Patent Application No. 2021-072646, filed on Apr. 22, 2021; the entire contents of which are incorporated herein by reference.
Embodiments described herein relate generally to an information processing device, an information processing method, and a computer program product.
Conventionally, techniques for modelling a physical phenomenon have been known. For example, there are techniques for obtaining a mathematical model describing a physical phenomenon from time-series data by applying symbolic regression, which is a type of machine learning.
However, in the conventional techniques, it has been difficult to further improve the accuracy of generating a model of a physical phenomenon.
According to an embodiment, an information processing device includes a memory and one or more processors coupled to the memory. The memory is configured to store therein time-series data including one or more variables. The one or more processors are configured to: calculate one or more time differential values of the one or more variables; calculate one or more differences representing variation of the one or more variables from an initial value; estimate a coefficient of a linear regression equation by machine learning in which the time differential values and the differences are used as learning data; and output the linear regression equation.
Hereinafter, an embodiment of an information processing device, an information processing method, and a computer program product will be described in detail with reference to the accompanying drawings.
In the method described in S. L. Brunton, J. L. Proctor, and J. N. Kutz, "Discovering governing equations from data by sparse identification of nonlinear dynamical systems", Proc. Natl. Acad. Sci., 113 (2016), pp. 3932-3937, which extends symbolic regression, the following equation (1) is used, and it is assumed that the true model can be expressed as a linear combination of nonlinear terms.
Ẋ = Θ(X)Ξ (1)
The information processing device of the embodiment likewise uses the above equation (1) and assumes that the true model can be expressed as a linear combination of nonlinear terms. The information processing device of the embodiment then estimates a coefficient Ξ of a library in the following equation (2) composed of nonlinear function candidates, using a machine learning technique.
The information processing device of the embodiment can significantly reduce the time required for learning and the amount of data required for learning, by defining a search space in the library.
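As an illustration only (the embodiment does not fix this particular library), a candidate library Θ(X) restricted to polynomial terms can be assembled as follows; `build_library` is a hypothetical name, and the choice of constant, linear, and quadratic candidates is an assumption made for this sketch.

```python
import numpy as np

def build_library(X):
    """Build a hypothetical candidate library Theta(X) whose columns are
    nonlinear functions of the state variables: a constant, each variable,
    and all pairwise products (including squares)."""
    n_samples, n_vars = X.shape
    columns = [np.ones(n_samples)]            # constant candidate
    for i in range(n_vars):                   # linear candidates
        columns.append(X[:, i])
    for i in range(n_vars):                   # quadratic candidates
        for j in range(i, n_vars):
            columns.append(X[:, i] * X[:, j])
    return np.column_stack(columns)

X = np.array([[1.0, 2.0],
              [3.0, 4.0]])
Theta = build_library(X)
# columns: 1, x1, x2, x1*x1, x1*x2, x2*x2 -> shape (2, 6)
```

Restricting which candidates enter the library is one way to define the search space mentioned above: fewer columns mean fewer coefficients to estimate and less data needed.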
In the following embodiment, a thermal circuit network model is generated as a model of a physical phenomenon. In a node equation of a thermal circuit network method expressed by the following equation (3), time change in temperature can be expressed as a linear combination of nonlinear terms.
Consequently, there is a method of estimating the coefficient using a machine learning technique, by considering a library composed of candidate basis functions proportional to the right-hand side. More specifically, the coefficient Ξ is estimated by expressing the right-hand side of the above equation (3) as in the following equation (4).
Most of the coefficients Ξ of the given basis functions are zero. Thus, the coefficients are estimated using a sparse estimation technique, which is a type of machine learning technique.
To estimate the coefficient, the fit to the data is used as one index. In general, the sum of squared errors is used as this index. For example, when an L1 norm term is added to the sum of squared errors to form a loss function, a sparse estimation technique called lasso is obtained, as in the following equation (5).
min_Ξ ‖Ẋ − Θ(X)Ξ‖₂²  s.t. ‖Ξ‖₁ ≤ s (5)
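Equation (5) can be solved in practice with any L1-regularized estimator. The sketch below solves the penalized form of lasso (equivalent to the constrained form for a suitable regularization strength) by plain coordinate descent on synthetic data; `Theta`, `lam`, and the sparsity pattern are illustrative assumptions, not the embodiment's settings.

```python
import numpy as np

def soft_threshold(z, t):
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_cd(Theta, y, lam, n_sweeps=200):
    """Coordinate descent for min 0.5*||y - Theta xi||_2^2 + lam*||xi||_1,
    a penalized counterpart of the constrained problem in equation (5)."""
    n, p = Theta.shape
    xi = np.zeros(p)
    col_sq = (Theta ** 2).sum(axis=0)
    for _ in range(n_sweeps):
        for j in range(p):
            # residual with the j-th term's contribution removed
            r = y - Theta @ xi + Theta[:, j] * xi[j]
            xi[j] = soft_threshold(Theta[:, j] @ r, lam) / col_sq[j]
    return xi

rng = np.random.default_rng(0)
Theta = rng.normal(size=(200, 10))   # candidate library Theta(X)
xi_true = np.zeros(10)
xi_true[2], xi_true[7] = 1.5, -0.8   # only two active terms
X_dot = Theta @ xi_true              # noiseless "temperature gradients"

xi_hat = lasso_cd(Theta, X_dot, lam=1.0)
```

On such noiseless data the inactive coefficients are driven to (near) zero while the two active terms are recovered, which is the behavior the sparse estimation step relies on.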
Because the sum of squared errors is in effect a weighted sum in which the weights are left unadjusted, it is strongly affected by the magnitude of the left-hand side. Suppose that the left-hand side is a temperature gradient (the time differential of temperature), and that results of thermal fluid analysis are to be learned. To perform thermal fluid analysis efficiently, time steps (Δt) of different lengths are generally used: Δt is narrowed in the time zones where the temperature changes rapidly, and widened in the time zones where the temperature changes gradually.
When the temperature gradient is calculated by a difference method or the like, its range of values becomes very wide: the temperature gradient in the time zones of rapid temperature change is far larger than that in the time zones of gradual change, so the coefficient estimation is dominated by the rapidly changing zones. Moreover, when the learned model is used for designing, a very long-term prediction must be performed, so a highly accurate short-term prediction alone is not enough. However, because the above equation (3) has a temperature dependency, it is not preferable to increase Δt by simply thinning the data.
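The imbalance described above can be reproduced with a small sketch: a hypothetical exponential cooling curve (the decay constant and step lengths are assumptions, not data from the embodiment) sampled with fine Δt in the fast zone and coarse Δt in the slow zone yields gradients whose magnitudes differ by orders of magnitude.

```python
import numpy as np

# Hypothetical cooling curve sampled with variable time steps:
# fine steps while the temperature changes rapidly, coarse steps afterwards.
t = np.concatenate([np.arange(0.0, 1.0, 0.01),    # fast zone, dt = 0.01
                    np.arange(1.0, 10.0, 0.5)])   # slow zone, dt = 0.5
T = 100.0 * np.exp(-3.0 * t) + 25.0               # exponential approach to 25 C

dTdt = np.gradient(T, t)                          # handles nonuniform spacing

fast = np.abs(dTdt[t < 1.0]).mean()
slow = np.abs(dTdt[t >= 1.0]).mean()
# The early gradients dwarf the late ones, so an unweighted least-squares
# fit of dT/dt is dominated by the fast-changing zone.
```

Since `np.gradient` accepts the sample coordinates, the nonuniform Δt is handled directly, but the magnitude imbalance between zones remains, which is exactly the problem the embodiment addresses with the long-term component.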
Consequently, in an information processing device 1 of the embodiment, attention is paid to the fact that the coefficient appearing in the equation representing the amount of change from the initial temperature matches the coefficient Ξ in the above equation (4). Taking this into account, the time differential of the temperature, which is a short-term component, and the difference (amount of change) from the initial temperature, which is a long-term component, are mixed and used as learning data.
Hereinafter, an operation example of the information processing device of the embodiment capable of further improving the accuracy of generating a model of a physical phenomenon will be described in detail.
Example of Functional Configuration
The storage unit 11 stores therein time-series data including at least one of a dependent variable and an independent variable. The dependent variable (response variable) is a variable determined depending on the independent variable (explanatory variable). The independent variable is a variable representing the factor of change in the dependent variable. For example, the dependent variable is temperature of an electronic component, a heat sink, and the like. For example, the independent variable is wind velocity indicating the strength of air from a fan for cooling an electronic component, electric current that flows through the electronic component, voltage supplied to the electronic component, and the like.
In the information processing device 1 of the embodiment, the value of the dependent variable is expressed by a unit unified for each physical quantity represented by the dependent variable. For example, if the physical quantity is weight, the value of the dependent variable is unified to kg or g, without mixing the dependent variable expressed in kg and the dependent variable expressed in g. Similarly, the value of the independent variable is expressed by a unit unified for each physical quantity represented by the independent variable.
The storage unit 11 may also store a plurality of types of time-series data. The types of time-series data may differ from one another in at least one of the initial condition and the boundary condition.
The time differential value calculation module 12 calculates a time differential value of a variable (dependent variable or independent variable) included in the time-series data. The time differential value of the variable included in the time-series data is used as learning data in the form of the above equation (4).
The variation difference calculation module 13 calculates a difference representing variation of a variable included in the time-series data from the initial value. More specifically, the variation difference calculation module 13 calculates the following equation (8), by discretizing the left-hand side of the above equation (4) with first order accuracy, for example, with forward differential.
In this example, the superscript in the above equation (8) indicates time. By transforming the above equation (8) to the following equation (9), the variation difference calculation module 13 calculates a difference representing variation of the variable from the initial value.
ΔT_{t=0} = Σ_{i=1}^{t} Θ(X)^{(i)} dt^{(i)} Ξ (9)
The difference representing variation of the variable from the initial value included in the time-series data is used as learning data in the form of the above equation (9).
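Assuming the discretization above, the learning data in the form of equation (9) can be computed as a cumulative sum of the library rows scaled by each local time step. The sketch below (with hypothetical variable names and random stand-in data) also checks that the same coefficient vector Ξ linearly explains the accumulated differences.

```python
import numpy as np

rng = np.random.default_rng(1)
n_steps, n_terms = 50, 4
Theta = rng.normal(size=(n_steps, n_terms))   # library rows Theta(X)^(i)
dt = rng.uniform(0.01, 0.5, size=n_steps)     # variable time steps dt^(i)
xi = np.array([0.7, 0.0, -1.2, 0.0])          # sparse coefficients

# Each row of equation (9)'s regressor is the running sum of
# Theta(X)^(i) * dt^(i) up to that time.
increments = Theta * dt[:, None]              # row i scaled by its own dt
G = np.cumsum(increments, axis=0)

dT = G @ xi                                   # differences from the initial value
```

Because the cumulative sum commutes with the linear map by Ξ, `dT` equals the running sum of the per-step increments, so equation (9) is again a linear regression in the same coefficient vector, even with variable Δt.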
The nonlinear function generation module 14 generates a nonlinear function on the basis of at least one of the dependent variable and the independent variable. For example, the nonlinear function generation module 14 generates a nonlinear function on the basis of temperature Ti at a position i, and temperature Tj at a position j.
The regression equation generation module 15 generates a linear regression equation in which the nonlinear function generated by the nonlinear function generation module 14 is used as a basis function.
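For the thermal case, basis functions built from the node temperatures Ti and Tj might include a conduction-like difference, a convection-like nonlinearity, and a radiation-like fourth-power difference. These particular candidates are illustrative assumptions, not the embodiment's fixed library, and `thermal_basis` is a hypothetical name.

```python
import numpy as np

def thermal_basis(Ti, Tj):
    """Hypothetical basis-function candidates built from two node
    temperatures (one column per candidate, one row per sample)."""
    return np.column_stack([
        Tj - Ti,                         # conduction-like linear exchange
        np.abs(Tj - Ti) * (Tj - Ti),     # convection-like nonlinearity
        Tj ** 4 - Ti ** 4,               # radiation-like exchange
    ])

Ti = np.array([300.0, 310.0])
Tj = np.array([305.0, 300.0])
B = thermal_basis(Ti, Tj)   # shape (2, 3)
```

Each column of `B` becomes one candidate term of the linear regression equation, and the sparse estimation decides which of them actually carry nonzero coefficients.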
The estimation module 16 estimates the coefficient of the linear regression equation generated by the regression equation generation module 15, by machine learning in which the time differential value and the difference are used as learning data. More specifically, the estimation module 16 estimates the coefficient of the linear regression equation by machine learning, using both of the learning data in the form of the above equation (4) and the learning data in the form of the above equation (9).
The coefficient Ξ in the above equation (4) and the coefficient Ξ in the above equation (9) are the same. The above equation (9) can advantageously handle the nonlinear basis function and varied time steps (Δt) of different lengths. By transforming the time-series data in the forms of the above equations (4) and (9), mixing the time-series data, and using the mixed time-series data as learning data, learning in which the short-term component and the long-term component are taken into consideration becomes possible. In this manner, compared to when only the short-term component is taken into consideration, it is possible to further improve the accuracy of long-term prediction by the model generated by learning.
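A minimal way to mix the two forms, assuming for the sketch that both are given equal weight, is to stack the short-term system of equation (4) and the long-term system of equation (9) into one regression and solve for the shared coefficient vector. All names and the random stand-in data below are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)
xi_true = np.array([1.0, 0.0, -0.5, 0.0, 2.0])

Theta_short = rng.normal(size=(40, 5))   # equation (4): T_dot = Theta xi
T_dot = Theta_short @ xi_true

G_long = rng.normal(size=(40, 5))        # equation (9): dT = G xi
dT = G_long @ xi_true

# Stack both forms so a single least-squares problem sees
# the short-term and the long-term learning data at once.
A = np.vstack([Theta_short, G_long])
b = np.concatenate([T_dot, dT])
xi_hat, *_ = np.linalg.lstsq(A, b, rcond=None)
```

Since both systems share the same Ξ, the stacked problem is still an ordinary linear regression; in the embodiment the sparse estimator plays the role of `lstsq` here.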
If the short-term component is only taken into consideration, the accuracy of long-term prediction by the model is degraded. Alternatively, if the long-term component is only taken into consideration, the accuracy of short-term prediction by the model is degraded. If the short-term prediction is not correct, the accuracy of long-term prediction by the model is also degraded.
The least square method used in machine learning is a weighted sum method in which the weight adjustment is ignored. Hence, to further improve the accuracy of the model, it is important to perform preprocessing on the learning data such that the following equation (10) is obtained.
αΣiΔTt=0=Σi{dot over (T)} (10)
In this example, i is the index over the data, and α is a weight satisfying α>1. That is, the total sum of the time differential values of the variables included in the learning data is larger than the total sum of the differences representing the variations of the variables from the initial values.
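One possible preprocessing that realizes the relation of equation (10) is to rescale the long-term samples so that the short-term total dominates by the chosen factor α. The helper below is a hypothetical sketch, not the embodiment's actual preprocessing; in practice the corresponding regressor rows would be scaled by the same factor.

```python
import numpy as np

def balance_long_term(T_dot, dT_long, alpha=2.0):
    """Rescale the long-term samples (differences from the initial value)
    so that alpha * sum(scaled dT) == sum(T_dot), i.e. the short-term
    total dominates by the factor alpha > 1, as in equation (10)."""
    scale = T_dot.sum() / (alpha * dT_long.sum())
    return dT_long * scale

T_dot = np.array([4.0, 6.0, 10.0])   # time differentials, total 20
dT = np.array([1.0, 2.0, 2.0])       # differences from initial value, total 5
dT_scaled = balance_long_term(T_dot, dT, alpha=2.0)
# scaled total is 20 / 2 = 10, so alpha * 10 == 20 == sum(T_dot)
```

Because the least square method applies no weight adjustment of its own, this kind of rescaling is the place where the balance between the short-term and long-term components is controlled.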
Moreover, in the example of the embodiment, the basis function candidates include additions and subtractions of variables, as in the right-hand side of the following equation (11), and the calculation results have physical meanings. Hence, normalizing the variables (dependent variables and independent variables) causes inconvenience. Therefore, the estimation module 16 estimates the coefficient of the linear regression equation by a machine learning method in which normalization of variables is not required.
When a certain convergence condition is satisfied, the output control module 17 outputs a linear regression equation expressed by the corrected coefficient. For example, the certain convergence condition includes the number of repetition times of the machine learning process and the like.
Example of Generation Method of Model
Next, the information processing device 1 initializes the data used when the model is machine-learned (for example, a hyperparameter and the like) (step S3).
Next, the estimation module 16 estimates the coefficient of a linear regression equation generated by the regression equation generation module 15, by the machine learning in which the time differential value calculated at step S1 and the difference calculated at step S2 are used as learning data (step S4). More specifically, the time differential value of a variable included in the time-series data is used as learning data in the form of the above equation (4), and the difference representing variation of a variable included in the time-series data from the initial value is used as learning data in the form of the above equation (9).
For example, for the learning data used in the estimation process at step S4, a part of data may be selected at random from the entire learning data. Moreover, for example, the learning data used in the estimation process at step S4 may also be sequentially selected from unused data included in the learning data.
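Both selection policies mentioned above can be sketched in a few lines; the batch size and the permutation-based bookkeeping of unused samples are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)
n_samples, batch = 100, 16
data = np.arange(n_samples)          # indices into the full learning data

# Policy 1: select a part of the data at random from the entire learning data.
random_idx = rng.choice(n_samples, size=batch, replace=False)

# Policy 2: sequentially select from data not yet used, by walking through
# a shuffled permutation; consecutive slices never overlap.
perm = rng.permutation(n_samples)
first_batch = perm[:batch]
second_batch = perm[batch:2 * batch]
```

With the permutation approach, every sample is used exactly once per pass, which matches the "sequentially selected from unused data" variant.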
Next, the estimation module 16 determines whether the result of the coefficient estimation process has satisfied the convergence condition (step S5). For example, the convergence condition includes the number of times the coefficient estimation process is executed.
When the convergence condition is not satisfied (No at step S5), the process returns to step S4. When the convergence condition is satisfied (Yes at step S5), the output control module 17 calculates a performance evaluation index of the model (step S6). Next, the output control module 17 determines whether the learned model has satisfied the convergence condition (step S7). For example, the convergence condition is the number of times the learning processes (steps S4 to S6) of the model are executed. Moreover, for example, the convergence condition is that the performance evaluation index calculated by the process at step S6 is greater than a predetermined evaluation threshold. When the convergence condition is not satisfied (No at step S7), the hyperparameter is updated (step S8), and the process returns to step S4.
When the convergence condition is satisfied (Yes at step S7), the output control module 17 outputs the model (step S9).
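The flow of steps S4 to S9 can be sketched as the nested loop below. The estimator, evaluation metric, hyperparameter update, and toy convergence target are all placeholders rather than the embodiment's actual choices.

```python
TARGET = 3.0  # toy "true" value the stand-in estimator relaxes toward

def estimate(coef, h):
    """Toy coefficient estimation step (stand-in for step S4)."""
    coef = 0.0 if coef is None else coef
    return coef + h * (TARGET - coef)

def evaluate(coef):
    """Toy performance evaluation index (stand-in for step S6)."""
    return 1.0 - abs(TARGET - coef) / TARGET

def update_hyperparam(h):
    """Toy hyperparameter update (stand-in for step S8)."""
    return 1.5 * h

def train_model(hyperparam, max_outer=10, n_inner=5, threshold=0.9):
    """Steps S4-S9: repeat coefficient estimation until its convergence
    condition (a fixed iteration count here) is met (S4-S5), evaluate
    the model (S6), check the outer convergence condition (S7), and
    otherwise update the hyperparameter (S8) and repeat."""
    coef, score = None, float("-inf")
    for _ in range(max_outer):
        for _ in range(n_inner):                  # S4-S5
            coef = estimate(coef, hyperparam)
        score = evaluate(coef)                    # S6
        if score > threshold:                     # S7: converged
            break
        hyperparam = update_hyperparam(hyperparam)  # S8
    return coef, score                            # S9: output the model

coef, score = train_model(hyperparam=0.1)
```

Separating the inner coefficient-estimation loop from the outer hyperparameter loop mirrors the two convergence conditions checked at steps S5 and S7.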
Explanation of Effects
Next, accuracy of the model generated by the information processing device 1 of the embodiment will be described.
For example, the information processing device 1 of the embodiment generates a model that expresses, with only 60 nodes, a thermal fluid analysis having several million nodes, by performing the machine learning described above using time-series data including the temperature history of 60 nodes of the thermal fluid analysis performed on the power electronic apparatus 100.
In the verification of the effectiveness of the model generated by the information processing device 1 of the embodiment described below, the input data includes 73 variables (the heat generation amounts of chips chip_1 to chip_12, the wind velocity of a fan that blows air to the heat sink, and the initial temperatures at 60 locations (nodes)), and the output data includes 60 variables (the temperatures at the 60 locations). Moreover, the learning data is the result of thermal fluid analysis performed twelve times. The range of the learning data covers heat generation amounts of 1 to 69 W and fan wind velocities of 1.0 to 2.0 m/s. The range of the evaluation data (unknown input data not included in the learning data) covers heat generation amounts of 0 to 80 W and fan wind velocities of 1.5 to 3.5 m/s.
As described above, in the information processing device 1 of the embodiment, the storage unit 11 stores therein the time-series data including one or more variables. The time differential value calculation module 12 calculates the time differential value of the variable. The variation difference calculation module 13 calculates a difference representing variation of the variable from the initial value. The estimation module 16 estimates the coefficient of a linear regression equation, by machine learning in which the time differential value and the difference are used as learning data. Then, the output control module 17 outputs the linear regression equation.
In this manner, with the information processing device 1 of the embodiment, it is possible to further improve the accuracy of generating a model of a physical phenomenon.
While the above-described embodiment describes a case where the information processing device 1 generates the linear regression equation of the thermal model, the linear regression equation of a model of another physical phenomenon (for example, electric resistance or physical deformation amount) may be generated.
Finally, an example of a hardware configuration of the information processing device 1 of the embodiment will be described.
Example of Hardware Configuration
The information processing device 1 of the embodiment includes a control device 201, a main storage device 202, an auxiliary storage device 203, a display device 204, an input device 205, and a communication device 206. The control device 201, the main storage device 202, the auxiliary storage device 203, the display device 204, the input device 205, and the communication device 206 are connected via a bus 210.
The control device 201 executes a computer program read out to the main storage device 202 from the auxiliary storage device 203. The main storage device 202 is a memory such as a read only memory (ROM) or a random access memory (RAM). The auxiliary storage device 203 is a hard disk drive (HDD), a memory card, or the like.
The display device 204 displays display information. For example, the display device 204 is a liquid crystal display and the like. The input device 205 is an interface for operating the information processing device 1. For example, the input device 205 is a keyboard, a mouse, and the like. When the information processing device 1 is a smartphone or a smart device such as a tablet-type terminal, for example, the display device 204 and the input device 205 are a touch panel.
The communication device 206 is an interface for communicating with another device and the like.
A computer program executed by the information processing device 1 of the embodiment is recorded on a computer-readable storage medium such as a compact disc-read only memory (CD-ROM), a memory card, a compact disc-recordable (CD-R), and a digital versatile disc (DVD) in an installable or executable file format, and is provided as a computer program product.
Moreover, the computer program executed by the information processing device 1 of the embodiment may also be stored on a computer connected to a network such as the Internet, and may be provided by causing a user to download the computer program via the network. Furthermore, the computer program executed by the information processing device 1 of the embodiment may also be provided via a network such as the Internet without causing a user to download the computer program.
Still furthermore, the computer program of the information processing device 1 of the embodiment may also be provided by being incorporated in advance in ROM or the like.
The computer program executed by the information processing device 1 of the embodiment has a modular configuration including functional blocks that can also be implemented by a computer program among the functional blocks described above.
A part or the whole of the functional blocks described above may be implemented by hardware such as an integrated circuit (IC) instead of being implemented by software.
Moreover, when the functions are implemented using a plurality of processors, each processor may implement one of the functions or implement two or more functions.
Furthermore, an operating mode of the information processing device 1 of the embodiment may be optional. For example, the information processing device 1 of the embodiment may be operated as a cloud system on a network.
While certain embodiments have been described, these embodiments have been presented by way of example only, and are not intended to limit the scope of the inventions.
Indeed, the novel embodiments described herein may be embodied in a variety of other forms; furthermore, various omissions, substitutions and changes in the form of the embodiments described herein may be made without departing from the spirit of the inventions. The accompanying claims and their equivalents are intended to cover such forms or modifications as would fall within the scope and spirit of the inventions.
Number | Date | Country | Kind
---|---|---|---
2021-072646 | Apr. 2021 | JP | national