This application is based upon and claims the benefit of priority of the prior Japanese Patent Application No. 2021-104194, filed on Jun. 23, 2021, the entire contents of which are incorporated herein by reference.
The embodiment discussed herein is related to an inference program, an inference method, and an information processing apparatus.
Numerical simulation, in which numerical analysis is performed on a computer by using a mesh obtained by discretizing phenomena of structural mechanics, fluid mechanics, or the like, is widely used to design structural objects.
It is possible to reduce design costs or the like by speeding up simulations: an approximate model, which is obtained by training a neural network (NN) or the like on the relationship between the input and output data of numerical simulations conducted by using high-performance computing (HPC), is caused to infer the simulation results.
Furthermore, the generalized minimal residual method (GMRES) is a type of iterative method for obtaining a solution of a simultaneous linear equation, and there is a related technology for speeding up the iterative calculation by applying a deep neural network (DNN) to the GMRES. Hereinafter, this is simply referred to as the related technology.
With the related technology, regarding a simultaneous linear equation (Ax=b), the relationship between a right-hand side vector b and a solution vector x is trained on the DNN while a non-stationary analysis of a fluid is performed by using the GMRES. In the equation above, “A” denotes a matrix. Furthermore, in the related technology, by using the solution vector x that is output from the DNN trained up to that point and in which the training result is reflected, it is possible to reduce the time needed for the iterative calculation.
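The following is a minimal sketch, not part of the patent disclosure, of how an inferred solution vector can shorten the iterative calculation in such an approach. It assumes SciPy's gmres and uses the initial-guess parameter x0, which is one common way to reflect a DNN output; the concrete coupling used by the related technology is not specified here, and x_pred is a placeholder for the inferred vector.

```python
import numpy as np
from scipy.sparse.linalg import gmres

# A is the coefficient matrix and b the right-hand side vector of Ax = b.
A = np.array([[4.0, 1.0],
              [1.0, 3.0]])
b = np.array([1.0, 2.0])

# x_pred stands in for the solution vector inferred by the trained DNN.
x_pred = np.array([0.09, 0.63])

# Supplying the inferred vector as the initial guess x0 lets GMRES start
# close to the true solution, so fewer iterations are needed than when
# starting from the zero vector.
x, info = gmres(A, b, x0=x_pred)
print(x, info)  # info == 0 indicates successful convergence
```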
Patent Document 1: Japanese Laid-open Patent Publication No. 2007-157000
Patent Document 2: Japanese Laid-open Patent Publication No. 2008-226178
According to an aspect of the embodiments, a non-transitory computer-readable recording medium stores therein an inference program that causes a computer to execute a process including: inferring, by inputting new input data to an approximate model that is obtained by conducting training by using training data in which a relationship between input data and output data of a numerical simulation is defined, output data that is associated with the new input data; determining whether or not accuracy of an inference result obtained from the approximate model satisfies a condition; correcting the inference result when the accuracy of the inference result obtained from the approximate model does not satisfy the condition; and outputting the corrected inference result.
The object and advantages of the invention will be realized and attained by means of the elements and combinations particularly pointed out in the claims.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory and are not restrictive of the invention.
For example, it is conceivable to expand the situations in which the approximate model described above can be used by adopting a method for determining whether or not the accuracy of the inference result is sufficient or for correcting the inference result.
Here, in order to examine the accuracy of the output obtained from the approximate model, it is conceivable to perform a simulation under the same condition and compare the simulation result with the output result obtained from the approximate model. However, if a simulation is performed for each of the output results obtained from the approximate model, the speed-up of the iterative calculation gained by using the approximate model is lost.
Furthermore, in general, the output of the approximate model (NN) has a fixed length; therefore, the length of the output from the approximate model does not match the vector length of the solution vector x. As a result, it is not easy to correct the solution by using the output obtained from the approximate model or to improve the model based on the inference result.
Furthermore, the related technology that applies the DNN to the GMRES is limited to the case (non-stationary analysis) in which the same coefficient matrix A is used repeatedly; therefore, the related technology is not able to be applied to a steady analysis. In a non-stationary analysis, a plurality of time steps are included, so it is possible to advance the training of the DNN. In contrast, a steady analysis does not include time steps; therefore, it is not possible to train the DNN by using the results of the time steps obtained up to the previous time.
In one aspect, the embodiments provide an inference program, an inference method, and an information processing apparatus capable of efficiently obtaining a solution with accuracy that satisfies a request received from a user.
Preferred embodiments will be explained with reference to accompanying drawings. Furthermore, the present invention is not limited to the embodiments.
Before the present embodiment is described, an example of a graph neural network (GNN) will be described. Here, as an example of the GNN, a description will be given by using a graph convolutional network (GCN).
The GCN 30 receives, as an input, a graph structure (a set of vertices and branches) in which a feature value is given to each vertex, and then obtains, by using an NN, an output for each of the vertices based on the feature values of the adjacent vertices connected by branches. It is assumed that a feature vector is given to each of the vertices, and the following processes at Step S1 and Step S2 are performed on, for example, all of the layers.
The process performed on each of the layers included in the GCN 30 is performed by, for each of the vertices, aggregating (Aggregation) the feature vectors held by the adjacent vertices (Step S1), and calculating (Combine) the vertex's own output feature vector by combining the aggregated vectors with the weights (Step S2).
In the GCN 30, training and inference are possible with respect to an arbitrary number of vertices. The weights inside the NN are common to all of the vertices (independent of the number of vertices); therefore, an NN trained (practiced) by using pieces of data including an arbitrary number of vertices can be used to infer on pieces of data including a different arbitrary number of vertices.
The process performed on the layers included in the GCN 30 is able to be represented by Equation (1). In Equation (1), X denotes a feature matrix of N×F0, Hi denotes a feature matrix of N×Fi, A denotes an adjacency matrix of N×N, N denotes the number of vertices, and F0 denotes the size of a feature (vector).
Hi=f(Hi−1,A), H0=X (1)
In the case of the simplest GCN, the process performed on the layers included in the GCN 30 is also able to be represented by Equation (2). Here, Wi denotes a weighting matrix of Fi×Fi+1, and σ denotes an activation function. The size of the weighting matrix does not depend on the graph shape.
f(Hi,A)=σ(AHiWi) (2)
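As a concrete illustration of Equation (2), the following minimal NumPy sketch applies one GCN layer to a small graph. The activation function, the example adjacency matrix, and the random features are assumptions made here for illustration (in practice, the adjacency matrix is often normalized and given self-loops).

```python
import numpy as np

def gcn_layer(H, A, W, sigma=lambda z: np.maximum(z, 0.0)):
    """Equation (2): f(H_i, A) = sigma(A H_i W_i).

    H : (N, F_i)       feature matrix of the current layer
    A : (N, N)         adjacency matrix (Step S1, aggregation over adjacent vertices)
    W : (F_i, F_i+1)   weighting matrix shared by all vertices (Step S2, combine)
    """
    return sigma(A @ H @ W)

# Example with 3 vertices and 2-dimensional features.
A = np.array([[0.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 0.0]])
H0 = np.random.rand(3, 2)   # X: feature matrix of N x F0
W0 = np.random.rand(2, 4)
H1 = gcn_layer(H0, A, W0)   # shape (3, 4); works for any number of vertices N
```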
The information processing apparatus according to the present embodiment performs the process, which will be described later, by using the GNN (GCN) described above.
In the following, the information processing apparatus according to the present embodiment will be described.
The information processing apparatus inputs inference purpose data 40 that is related to a scheduled analysis to the trained GNN 143, and infers a solution vector x. In the inference purpose data 40, as will be described later, mesh data, a coefficient matrix A, and a right-hand side vector b are included. The information processing apparatus determines whether or not the accuracy of the solution vector x is sufficient by obtaining a residual error of the solution vector x. If the accuracy of the solution vector x is not sufficient, the information processing apparatus corrects the solution vector x by using a conjugate gradient (CG) method, and outputs a corrected solution vector x′. The CG method is an example of an iterative method.
As described above, the information processing apparatus determines whether or not the accuracy of a solution of the GNN 143 obtained in the case where the inference purpose data 40 is input is sufficient, and, if the accuracy is not sufficient, the information processing apparatus corrects the solution vector x by using the iterative method. As a result, it is possible to obtain a solution with accuracy that satisfies a request received from a user.
The information processing apparatus inputs a plurality of kinds of inference purpose data 40a, 40b, 40c to the trained GNN 143, and calculates a plurality of solution vectors x. The information processing apparatus determines, by obtaining a residual error of each of the solution vectors x, whether or not the accuracy of the solution vectors x is sufficient.
If the accuracy of the solution vectors x is not sufficient, the information processing apparatus corrects the solution vectors x by using the CG method, adds the corrected solution vector x′ to the training purpose data, and uses the updated training purpose data when retraining the GNN 143. If the accuracy of the solution vectors x is sufficient, the information processing apparatus does not add the solution vectors x to the training purpose data.
If the accuracy of the solution vectors x is insufficient, it can be said that training of the GNN 143 related to the solution vectors x is not sufficient. Thus, by adding, to the training purpose data 142, a pair of the inference purpose data associated with the solution vector x whose accuracy is insufficient and the corrected solution vector x′, it is possible to efficiently generate a simulation case that is useful for improving the GNN 143.
In the following, a configuration of the information processing apparatus according to the present embodiment that performs the processes described above with reference to
The communication unit 110 is connected to an external device or the like in a wired or wireless manner, and transmits and receives information to and from the external device or the like. For example, the communication unit 110 is implemented by a network interface card (NIC), or the like. The communication unit 110 may be connected to a network that is not illustrated.
The input unit 120 is an input device that inputs various kinds of information to the information processing apparatus 100. The input unit 120 corresponds to a keyboard, a mouse, a touch panel, or the like. The user operates the input unit 120 and designates, for example, a threshold that is related to a residual error or an error and that is requested in terms of the accuracy of a solution.
The display unit 130 is a display device that displays information that is output from the control unit 150. The display unit 130 corresponds to a liquid crystal display, an organic electro luminescence (EL) display, a touch panel, or the like. For example, a simulation result obtained from the GNN 143 is displayed on the display unit 130.
The storage unit 140 includes structural analysis data 141, the training purpose data 142, the GNN 143, and an inference result table 144. The storage unit 140 is implemented by a semiconductor memory device, such as a random access memory (RAM) or a flash memory, or a storage device, such as a hard disk or an optical disk.
The structural analysis data 141 is data that is used for a simulation. In the present embodiment, the solution vector x of a simultaneous linear equation (Ax=b) is calculated by the simulation. For example, the solution vector x obtained by the simulation indicates a displacement of each of the points (elements) in the structure. In the structural analysis data 141, a set of the mesh data, a coefficient matrix A, and a right-hand side vector b is included.
The training purpose data 142 is data for training (practicing) the GNN 143.
The GNN 143 is an approximate model corresponding to the GNN (the GCN 30) described above with reference to
The inference result table 144 is a table that holds an inference result of the solution vectors x obtained from the GNN 143.
A description will be given here by referring back to
The simulation execution unit 151 acquires the structural analysis data 141 from the training purpose data generating unit 152, and performs a simulation for calculating the solution vector x. By conducting a simulation under a plurality of conditions, the simulation execution unit 151 calculates a plurality of sets of the mesh data m, the coefficient matrix A, the right-hand side vector b, and the solution vector x. The simulation execution unit 151 outputs the execution result of the simulation to the training purpose data generating unit 152.
The training purpose data generating unit 152 is a processing unit that generates the training purpose data 142 by using the simulation execution unit 151. The training purpose data generating unit 152 outputs the structural analysis data 141 to the simulation execution unit 151 to conduct a simulation under a plurality of conditions, and obtains a simulation result. In the simulation result, a plurality of pairs of a set of mesh data m, the coefficient matrix A, and the right-hand side vector b, and the solution vector x are included. The training purpose data generating unit 152 registers, in the training purpose data 142, the mesh data m, the coefficient matrix A, and the right-hand side vector b as input data, and the solution vector x as the correct answer label.
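A minimal sketch of how such pairs of input data and correct answer labels might be accumulated is shown below. Here, assemble_system, the list of conditions, and the use of a dense direct solver in place of a full simulation are hypothetical placeholders introduced for illustration, not details taken from the embodiment.

```python
import numpy as np

def generate_training_purpose_data(conditions, assemble_system, solve=np.linalg.solve):
    """For each simulation condition, build an (input data, correct answer label) pair.

    conditions      : list of analysis conditions (loads, boundary conditions, ...)
    assemble_system : hypothetical helper returning (mesh_m, A, b) for a condition
    """
    training_purpose_data = []
    for cond in conditions:
        mesh_m, A, b = assemble_system(cond)
        x = solve(A, b)  # simulation result: the solution vector of Ax = b
        training_purpose_data.append(
            {"input": (mesh_m, A, b), "label": x}  # x serves as the correct answer label
        )
    return training_purpose_data
```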
If the training purpose data generating unit 152 acquires an instruction from the end determination unit 158, which will be described later, indicating that the training purpose data is to be added, the training purpose data generating unit 152 newly generates a pair of the input data and the correct answer label by changing the condition, and adds the generated pair to the training purpose data 142.
The approximate model building unit 153 performs training on the GNN 143 based on the training purpose data 142. For example, the approximate model building unit 153 repeatedly performs a process of adjusting the parameter of the GNN 143 such that the output data that is output from the GNN 143 approaches the correct answer label in the case where the approximate model building unit 153 acquires a pair of the input data and the correct answer label from the training purpose data 142 and inputs the input data to the GNN 143. The approximate model building unit 153 performs training on the GNN 143 by using an error backpropagation method or the like.
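The training step can be sketched as follows. This is an assumption-laden illustration: PyTorch, a single GCN layer following Equation (2), per-vertex input features H, and mean squared error against the correct answer label are all choices made here for brevity, not details disclosed for the GNN 143.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GCNLayer(nn.Module):
    """One layer following Equation (2): f(H, A) = sigma(A H W)."""
    def __init__(self, f_in, f_out):
        super().__init__()
        self.W = nn.Parameter(torch.randn(f_in, f_out) * 0.1)

    def forward(self, H, A):
        return torch.relu(A @ H @ self.W)

class SolutionGNN(nn.Module):
    """Maps per-vertex features to one scalar per vertex (an entry of x)."""
    def __init__(self, f_in, f_hidden):
        super().__init__()
        self.gcn = GCNLayer(f_in, f_hidden)
        self.out = nn.Linear(f_hidden, 1)

    def forward(self, H, A):
        return self.out(self.gcn(H, A)).squeeze(-1)

def train_gnn(model, samples, epochs=100, lr=1e-3):
    # samples: list of (H, A, x_label) tensors built from the training purpose data
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(epochs):
        for H, A, x_label in samples:
            opt.zero_grad()
            loss = F.mse_loss(model(H, A), x_label)  # output approaches the label
            loss.backward()                          # error backpropagation
            opt.step()
    return model
```

Because the layer weights depend only on the feature dimensions, the same model can be trained and evaluated on graphs with different numbers of vertices.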
If new input data and a correct answer label are registered in the training purpose data 142 by the accuracy correction unit 157, which will be described later, and the training purpose data 142 is thereby updated, the approximate model building unit 153 performs retraining on the GNN 143 by using the updated training purpose data 142.
The inference purpose data generating unit 154 generates inference purpose data. For example, if the inference purpose data generating unit 154 receives data on the structure that corresponds to the analysis object from the input unit 120 or the like, the inference purpose data generating unit 154 generates the mesh data m in which the structure is represented in the form of a mesh.
Furthermore, the inference purpose data generating unit 154 generates the coefficient matrix A and the right-hand side vector b based on the mesh data m. The inference purpose data generating unit 154 outputs the inference purpose data that includes the mesh data m, the coefficient matrix A, and the right-hand side vector b to the solution vector inference unit 155.
If the inference purpose data generating unit 154 receives an update of a step from the end determination unit 158, which will be described later, the inference purpose data generating unit 154 generates the inference purpose data by changing the condition, and repeatedly performs the process of outputting the generated data to the solution vector inference unit 155.
The solution vector inference unit 155 infers (calculates) the solution vector x by inputting the inference purpose data to the trained GNN 143. The solution vector inference unit 155 registers a relationship between the inference purpose data and the solution vector x that corresponds to the inference result into the inference result table 144. Furthermore, the solution vector inference unit 155 outputs the inference purpose data and the solution vector x to the accuracy determination unit 156.
The solution vector inference unit 155 repeatedly performs the process described above every time the solution vector inference unit 155 acquires the inference purpose data from the inference purpose data generating unit 154.
The accuracy determination unit 156 determines whether or not the accuracy of the solution vector x is sufficient. For example, the accuracy determination unit 156 determines, based on a residual error or an error, whether or not the accuracy of the solution vector x is sufficient.
A process of determining the accuracy based on the residual error and a process of determining the accuracy based on an error performed by the accuracy determination unit 156 will be described.
The process of determining the accuracy performed based on the residual error by the accuracy determination unit 156 will be described. The accuracy determination unit 156 calculates a residual error r based on Equation (3). In Equation (3), “A” corresponds to the coefficient matrix A included in the inference purpose data, and “b” corresponds to the right-hand side vector b included in the inference purpose data. The value of the residual error r becomes closer to zero as the accuracy of the solution vector x becomes higher.
r=∥Ax−b∥ (3)
If the residual error r is less than a first threshold, the accuracy determination unit 156 determines that the accuracy of the solution vector x is sufficient. If the accuracy determination unit 156 determines that the accuracy of the solution vector x is not sufficient, the accuracy determination unit 156 outputs the inference purpose data and the solution vector x to the accuracy correction unit 157. The first threshold is previously designated by a user.
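A minimal sketch of the residual-based determination is given below, assuming dense NumPy arrays for A, b, and x; the function and threshold names are placeholders.

```python
import numpy as np

def accuracy_is_sufficient_by_residual(A, b, x, first_threshold):
    """Equation (3): r = ||Ax - b||; the accuracy of x is sufficient when r < first threshold."""
    r = np.linalg.norm(A @ x - b)
    return r < first_threshold
```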
The process of determining the accuracy performed based on an error by the accuracy determination unit 156 will be described. The accuracy determination unit 156 calculates an error e based on Equation (4). The accuracy determination unit 156 calculates x(1) by advancing the iterative method by one step by using the solution vector x as the initial value. The difference between the solution vector x and x(1) becomes smaller as the accuracy of the solution vector x becomes higher.
e=∥x−x(1)∥ (4)
If the error e is less than a second threshold, the accuracy determination unit 156 determines that the accuracy of the solution vector x is sufficient. If the accuracy determination unit 156 determines that the accuracy of the solution vector x is not sufficient, the accuracy determination unit 156 outputs the inference purpose data and the solution vector x to the accuracy correction unit 157. The second threshold is previously designated by the user.
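A minimal sketch of the error-based determination follows; a single CG step, which is valid when A is symmetric positive definite, is assumed here as the one iterative-method step that produces x(1).

```python
import numpy as np

def one_cg_step(A, b, x):
    """Advance the iterative method (here, the CG method) by one step from x."""
    r = b - A @ x
    denom = r @ (A @ r)
    if denom == 0.0:            # x already solves Ax = b exactly
        return x
    alpha = (r @ r) / denom
    return x + alpha * r

def accuracy_is_sufficient_by_error(A, b, x, second_threshold):
    """Equation (4): e = ||x - x(1)||; the accuracy of x is sufficient when e < second threshold."""
    x1 = one_cg_step(A, b, x)
    return np.linalg.norm(x - x1) < second_threshold
```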
If the accuracy correction unit 157 acquires the solution vector x that is targeted for the correction from the accuracy determination unit 156, the accuracy correction unit 157 performs the iterative method (the CG method) for linear solvers, and corrects the solution vector x until the residual error r is less than the first threshold. The accuracy correction unit 157 associates the inference purpose data with the corrected solution vector x′ and registers the associated data into the inference result table 144.
The accuracy correction unit 157 associates the inference purpose data with the corrected solution vector x′ and registers the associated data into the training purpose data 142. The inference purpose data corresponds to the input data and the corrected solution vector x′ corresponds to the correct answer label.
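A minimal sketch of the correction is shown below. A hand-written CG loop is used for self-containedness (it assumes a symmetric positive definite A), and the commented registration lines at the end indicate, with hypothetical container names, how the corrected pair could be recorded in the inference result table and the training purpose data.

```python
import numpy as np

def correct_with_cg(A, b, x, first_threshold, max_iter=10_000):
    """Iterate the CG method from the inferred x until ||Ax - b|| < first threshold."""
    x = x.astype(float).copy()
    r = b - A @ x              # residual; ||r|| equals ||Ax - b||
    p = r.copy()
    rs_old = r @ r
    for _ in range(max_iter):
        if np.linalg.norm(r) < first_threshold:
            break
        Ap = A @ p
        alpha = rs_old / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        p = r + (rs_new / rs_old) * p
        rs_old = rs_new
    return x

# Registration of the corrected solution vector x' (container names are hypothetical):
# inference_result_table.append({"input": (mesh_m, A, b), "x": x_corrected})
# training_purpose_data.append({"input": (mesh_m, A, b), "label": x_corrected})
```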
The end determination unit 158 is a processing unit that determines whether or not the simulation has ended. The end determination unit 158 determines to end the simulation when a predetermined end condition is satisfied. The end determination unit 158 outputs an update of the step to the inference purpose data generating unit 154 during the period in which the simulation is continued. Furthermore, the end determination unit 158 may output, to the training purpose data generating unit 152, an instruction indicating that the training purpose data is to be added.
The end determination unit 158 may output the information on the inference result table 144 to the display unit 130 and cause the display unit 130 to display the output information, or the end determination unit 158 may notify an external device of the information on the inference result table 144 via a network.
In the following, an example of the flow of the process performed by the information processing apparatus according to the present embodiment will be described.
The training purpose data generating unit 152 included in the information processing apparatus 100 accumulates the mesh data m, the coefficient matrix A, the right-hand side vector b, and the solution vector x of each of the simulations, and generates the training purpose data 142 (Step S102).
The approximate model building unit 153 included in the information processing apparatus 100 performs training on the GNN 143 by using the training purpose data 142 (Step S103).
Furthermore, the process illustrated in
The solution vector inference unit 155 included in the information processing apparatus 100 inputs the mesh data m, the coefficient matrix A, and the right-hand side vector b to the trained GNN 143, and then, infers the solution vector x (Step S203).
The accuracy determination unit 156 included in the information processing apparatus 100 calculates the residual error of the solution vector x or the error (Step S204). The accuracy determination unit 156 determines whether or not the accuracy of the solution vector x is sufficient based on the obtained residual error or the error (Step S205). For example, in a case of using the residual error at Step S205, if the residual error is less than the first threshold, it is determined that the accuracy of the solution vector x is sufficient. Furthermore, in a case of using the error, if the error is less than the second threshold, it is determined that the accuracy of the solution vector x is sufficient.
If the accuracy of the solution vector x is sufficient (Yes at Step S205), the accuracy determination unit 156 proceeds to Step S208.
In contrast, if the accuracy of the solution vector x is not sufficient (No at Step S205), the accuracy determination unit 156 proceeds to Step S206. The accuracy correction unit 157 included in the information processing apparatus 100 corrects the solution vector x by using the iterative method (Step S206).
The accuracy correction unit 157 adds the pair of the input data and the corrected solution vector x′ to the training purpose data 142 (Step S207).
If the end determination unit 158 included in the information processing apparatus 100 ends the simulation (Yes at Step S208), the end determination unit 158 ends the process. In contrast, if the end determination unit 158 does not end the simulation (No at Step S208), the end determination unit 158 updates the step of the simulation (Step S209), and proceeds to Step S202.
In the following, effects of the information processing apparatus 100 according to the present embodiment will be described. The information processing apparatus 100 determines whether or not the accuracy of the solution of the GNN 143 obtained in the case where the inference purpose data is input is sufficient, and corrects, if the accuracy is not sufficient, the solution vector x by using the iterative method. As a result, it is possible to efficiently obtain the solution with accuracy that satisfies the request received from the user.
If the accuracy of the solution vector x is not sufficient, the information processing apparatus 100 corrects the solution vector x by using the iterative method, adds the corrected solution vector x′ to the training purpose data 142, and uses the updated training purpose data 142 at the time of retraining the GNN 143. As a result, it is possible to efficiently generate a case of the simulation that is useful to improve the GNN 143.
The information processing apparatus 100 calculates the residual error of the solution vector x or the error, and determines whether or not the accuracy of the solution vector x is sufficient based on the calculated residual error or the error. As a result, it is possible to appropriately determine whether or not the accuracy of the solution vector x is sufficient.
Incidentally, in the embodiment described above, a case has been described in which the information processing apparatus 100 calculates the solution vector x of the simultaneous linear equation; however, the embodiment is not limited to this. For example, by solving a simultaneous linear equation over a plurality of steps based on the Newton-Raphson method, it is possible to conduct a nonlinear analysis (elastic mechanics).
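The nonlinear case can be sketched as below. Here, residual_func, jacobian_func, and solve_linear are hypothetical callables introduced for illustration; solve_linear could, for example, be the GNN inference followed by the accuracy determination and CG correction described above.

```python
import numpy as np

def newton_raphson(residual_func, jacobian_func, x0, solve_linear, tol=1e-8, max_steps=50):
    """At each step, the simultaneous linear equation J(x) dx = -F(x) is solved."""
    x = x0.astype(float).copy()
    for _ in range(max_steps):
        F = residual_func(x)
        if np.linalg.norm(F) < tol:
            break
        dx = solve_linear(jacobian_func(x), -F)  # e.g., infer with the GNN, then correct
        x += dx
    return x
```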
In the following, an example of a hardware configuration of a computer that implements the same function as that of the information processing apparatus 100 according to the embodiment described above will be described.
As illustrated in
The hard disk device 207 includes a training program 207a and an inference program 207b. Furthermore, the CPU 201 reads each of the programs 207a and 207b and loads the programs 207a and 207b into the RAM 206.
The training program 207a functions as a training process 206a. The inference program 207b functions as an inference process 206b.
The process of the training process 206a corresponds to each of the processes performed by the simulation execution unit 151, the training purpose data generating unit 152, and the approximate model building unit 153. The process of the inference process 206b corresponds to each of the processes performed by the inference purpose data generating unit 154, the solution vector inference unit 155, the accuracy determination unit 156, the accuracy correction unit 157, and the end determination unit 158.
Furthermore, each of the programs 207a and 207b does not need to be stored in the hard disk device 207 from the beginning. For example, each of the programs may be stored in a “portable physical medium”, such as a flexible disk (FD), a CD-ROM, a DVD, a magneto-optical disk, or an IC card, that is inserted into the computer 200. Then, the computer 200 may read each of the programs 207a and 207b from the portable physical medium and execute the programs.
It is possible to efficiently obtain a solution with accuracy that satisfies a request received from a user.
All examples and conditional language recited herein are intended for pedagogical purposes of aiding the reader in understanding the invention and the concepts contributed by the inventor to further the art, and are not to be construed as limitations to such specifically recited examples and conditions, nor does the organization of such examples in the specification relate to a showing of the superiority and inferiority of the invention. Although the embodiments of the present invention have been described in detail, it should be understood that the various changes, substitutions, and alterations could be made hereto without departing from the spirit and scope of the invention.