This application is based upon and claims the benefit of priority from Japanese Patent Application No. 2023-042587, filed on Mar. 17, 2023; the entire contents of which are incorporated herein by reference.
Embodiments described herein relate generally to an information processing device, an information processing method, and a computer program product.
As a molecular dynamics method of simulating the physical movement of atoms, the first-principles molecular dynamics method based on density functional theory (DFT) or the like is known. Furthermore, as a method capable of reducing the calculation cost as compared with the first-principles molecular dynamics method, a machine learning molecular dynamics method using a model obtained by machine learning has been proposed.
In the machine learning molecular dynamics method, for example, a model that outputs the forces acting on the respective atoms and the energy of the entire system on the basis of an input atomic arrangement (the positions of the respective atoms) is constructed, and the model is learned such that the error between the output value of the model and the correct data is made small. As the error, for example, the root mean square of the prediction errors of the respective atoms is used.
According to an embodiment, an information processing device includes one or more hardware processors configured to: set an error function including one or more terms based on a plurality of weights according to features of a plurality of elements, the error function being a function used during learning of a machine learning model into which positions of a plurality of atoms included in an analysis target, and information indicating which of the plurality of elements the plurality of atoms are, are input, and that outputs a physical quantity of the analysis target; and learn the machine learning model using the error function.
Hereinafter, preferred embodiments of an information processing device according to the present invention will be described in detail with reference to the accompanying drawings.
In the conventional machine learning molecular dynamics method, a physically invalid result may be obtained in a case where the forces acting on the respective atoms are calculated using a learned model and the motion of the atoms is simulated on the basis of the calculated forces. One conceivable cause is that a feature (for example, mass) of each element is not considered during learning of the model. For example, in a case where the errors of the forces are the same, the error in acceleration is larger for a lighter atom, since the acceleration is the force divided by the mass; for the same force error, the acceleration error of a Li atom (about 6.9 u) is roughly 20 times that of a La atom (about 138.9 u). This can affect the accuracy of the simulation. For this reason, a phenomenon may occur in which the simulation succeeds in a case where the accuracy of the learned model is low and the prediction error is large, while the simulation fails in a case where the prediction error is small.
In the following embodiments, an error function using weights (load values) set according to a feature (for example, mass) of the elements is used as the error function during learning of a model. For example, in a case where the mass is used as the feature, the model is learned such that the error, that is, the output value of an error function in which a lighter atom is given a larger weight, is made small. A simulation is more likely to succeed with a model learned in this way.
The following embodiments can be applied to, for example, learning of a model used in a machine learning molecular dynamics method. The machine learning molecular dynamics method can be applied to, for example, processing of searching for a material of a battery (including a secondary battery), processing of searching for a material of a catalyst, and the like.
The feature storage unit 151 stores features for each element.
The weight storage unit 152 stores the weights set for the respective elements.
The learning data storage unit 153 stores learning data used for learning of a machine learning model (hereinafter, simply referred to as a model). The model is, for example, a model into which the positions of a plurality of atoms included in an analysis target and information indicating which of a plurality of elements the plurality of atoms are, are input, and that outputs a physical quantity of the analysis target. The information indicating which of the plurality of elements the plurality of atoms are is, for example, an element symbol.
The analysis target is, for example, a crystal or a molecule (including a polymer). The physical quantity is, for example, at least one of the forces acting on the respective atoms included in the analysis target or the energy of the entire analysis target. The forces acting on the respective atoms may be output independently of the energy, or values obtained by differentiating the energy with respect to the positions of the atoms may be output.
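For reference, the latter case corresponds to the standard physical relation in which the force acting on an atom is the negative gradient of the energy with respect to the position of that atom:

\[
\boldsymbol{F}_i = -\frac{\partial E}{\partial \boldsymbol{r}_i},
\]

where \(\boldsymbol{r}_i\) is the position of the i-th atom and \(E\) is the energy of the entire analysis target.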
The learning data includes input data input to such a model and correct data corresponding to correct answers of output of the model.
The first row of the learning data indicates the energy of the unit cell of the crystal.
From the first column “192” of the row below “PRIMCOORD”, it can be seen that the number of atoms included in the unit cell is 192. In the subsequent rows, information of 192 atoms is described. For example, information of an atom includes an element symbol, an x coordinate, a y coordinate, a z coordinate, an x component of the force, a y component of the force, and a z component of the force in this order. For example, element symbols and the xyz coordinates of the respective atoms correspond to input data, and the xyz components of forces and the energy in the first row correspond to correct data.
For example, for the first atom, the element is Li, the xyz coordinates are (1.1674292981, 11.1419581078, 11.8087410073), and the force acting on this atom is (−0.0273284000, 0.0104727700, 0.0224368600).
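To make the layout concrete, the following is a minimal parsing sketch under the simplified assumptions described above (the energy appearing as the last token of the first row, a "PRIMCOORD" marker followed by an atom-count row, and one row per atom); a real xsf header contains additional keywords, and the function name is hypothetical.

```python
def parse_sample(path):
    """Parse one learning-data sample in the simplified xsf layout
    described above and split it into input data and correct data."""
    with open(path) as f:
        rows = [line.split() for line in f if line.strip()]
    energy = float(rows[0][-1])                      # correct data: energy in the first row
    start = next(i for i, r in enumerate(rows) if r[0] == "PRIMCOORD") + 1
    num_atoms = int(rows[start][0])                  # first column, e.g. 192
    symbols, coords, forces = [], [], []
    for r in rows[start + 1 : start + 1 + num_atoms]:
        symbols.append(r[0])                         # element symbol (input data)
        coords.append([float(v) for v in r[1:4]])    # x, y, z coordinates (input data)
        forces.append([float(v) for v in r[4:7]])    # force components (correct data)
    return energy, symbols, coords, forces
```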
The learning data storage unit 153 stores a plurality of samples, each sample being a piece of learning data (xsf file) such as that described above.
The parameter storage unit 154 stores parameters of the model to be learned.
In a case where the model is a neural network model, the parameter storage unit 154 stores, for example, a weight, a bias, or the like as a model parameter.
Note that each storage unit (feature storage unit 151, weight storage unit 152, learning data storage unit 153, parameter storage unit 154) can be configured by any commonly used storage medium such as a flash memory, a memory card, a random access memory (RAM), a hard disk drive (HDD), or an optical disc.
Each storage unit may be a physically different storage medium or may be implemented as different storage areas of a physically same storage medium. Furthermore, each storage unit may be implemented by a plurality of physically different storage media.
The weight setting unit 101 sets a plurality of weights according to the features of the plurality of elements, and stores the set weights in the weight storage unit 152. For example, the weight setting unit 101 sets a larger weight for an element having a smaller mass.
The function setting unit 102 sets an error function used during learning of the model. Setting an error function means defining an error function used for learning by the learning unit 110 to be described below. Note that processing of calculating an error that is an output value of the error function using the set error function is executed by the learning unit 110.
For example, the function setting unit 102 sets an error function such that it includes a plurality of terms obtained by multiplying the error of the physical quantity for each of the plurality of elements by the weight of the corresponding element set by the weight setting unit 101. The function setting unit 102 sets the error function using selected N (N is an integer of 2 or more) samples of the learning data. For example, the function setting unit 102 may select all or a part of the learning data stored in the learning data storage unit 153. The following Formula (1) indicates an example of the error function set by the function setting unit 102.
“(n)” in Formula (1) represents a value corresponding to the n-th (n is an integer satisfying 1≤n≤N) sample (hereinafter, also referred to as a sample n) among the N samples of the learning data. ek(n) is an error with respect to an element k (hereinafter, referred to as element-specific error), and is expressed by the following Formula (2).
σk in Formula (2) is a set of identification information for identifying one or more atoms of the element k. The identification information for identifying an atom is, for example, a numerical value of a serial number starting from 1. In the example of the learning data described above, the 192 atoms in the unit cell can be identified by serial numbers 1 to 192.
k is identification information for identifying a plurality of elements. The identification information of an element is, for example, a numerical value of a serial number starting from 1, but is not limited to a numerical value, and may be a symbol (for example, element symbol) or the like. Hereinafter, k is assumed to be an integer satisfying 1≤k≤K. K is the number of types of the elements included in the analysis target.
λ in Formula (1) is a constant. For example, λ is designated by a user. x represents input data to the model. θ represents a model parameter. Note that “DFT” indicates that energy of the learning data (corresponding to the correct data) is obtained by density functional theory (DFT), but the correct data may be obtained by any method other than DFT.
wk is the weight of the element k. The weight wk is calculated by, for example, the following Formula (3). In Formula (3), mk is the mass of an atom of the element k. Note that Formula (3) is a formula indicating that the weight wk in Formula (1) is calculated as the reciprocal of the mass.
Formula (2) corresponds to a formula for calculating, for each element, an element-specific error ek(n) (second error) that is the sum of the errors (first errors) of the forces (an example of the physical quantities) acting on the respective atoms of that element. Furthermore, Formula (1) corresponds to an error function including a plurality of (K) terms obtained by multiplying the element-specific error ek(n) by the weight wk for each element.
The function setting unit 102 may set an error function for all the samples from error functions for the respective samples. For example, the function setting unit 102 sets an error function L for all the samples by taking a sum of the error functions for the respective samples. The error function L is expressed by, for example, the following Formula (4).
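As an illustration of Formulas (1) to (4), the following sketch computes the weighted error under the assumptions that the force and energy errors are squared errors and that wk is the reciprocal of the mass as in Formula (3); the exact error norms are not fixed by the text above, and the function and variable names are hypothetical.

```python
import numpy as np

def sample_loss(pred_forces, true_forces, pred_energy, true_energy,
                elements, masses, lam=1.0):
    """Per-sample error function corresponding to Formulas (1) to (3).

    pred_forces, true_forces: (num_atoms, 3) arrays of force components.
    elements: element symbol of each atom; masses: symbol -> atomic mass.
    lam: the constant lambda weighting the energy term.
    """
    loss = 0.0
    for k in set(elements):
        w_k = 1.0 / masses[k]                        # Formula (3): reciprocal of the mass
        idx = [i for i, e in enumerate(elements) if e == k]       # the set sigma_k
        e_k = np.sum((pred_forces[idx] - true_forces[idx]) ** 2)  # element-specific error
        loss += w_k * e_k                            # weighted term of Formula (1)
    return loss + lam * (pred_energy - true_energy) ** 2          # energy term of Formula (1)

def total_loss(samples, lam=1.0):
    """Formula (4): sum of the per-sample error functions over the N samples,
    where each sample is a tuple of the positional arguments of sample_loss."""
    return sum(sample_loss(*s, lam=lam) for s in samples)
```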
The learning unit 110 learns the model using a set error function. For example, the learning unit 110 repeatedly executes learning of the model using a plurality of pieces of the learning data a plurality of times until the learning is determined to be ended. The learning unit 110 includes an update unit 111 and a determination unit 112.
The update unit 111 calculates an output value of an error function using the learning data, and updates (corrects) the model parameter on the basis of the calculation result. For example, the update unit 111 updates the model parameter such that the output value of the error function is made small. A method of updating the model parameter may be any method, and for example, a method such as a steepest descent method, Adam, or a Kalman filter can be used. The update unit 111 stores the updated model parameter in the parameter storage unit 154.
The determination unit 112 determines whether to end learning. For example, the determination unit 112 checks the output value of the error function calculated by the update unit 111, and determines to end the learning in a case where change in the output value of the error function is smaller than a threshold. The method of determining the end of learning is not limited to this, and any method may be used. For example, the determination unit 112 may determine to end the learning in a case where the number of times of update of the model parameter is larger than the number of times of learning designated in advance.
The output control unit 131 controls output of various types of information used in the information processing device 100. For example, the output control unit 131 outputs the model parameter of the learned model to an external device that performs processing using the model (for example, analysis of an analysis target by the machine learning molecular dynamics method).
At least some of each unit (weight setting unit 101, function setting unit 102, learning unit 110, and output control unit 131) may be implemented by one processing unit. Each unit is implemented by, for example, one or a plurality of processors. For example, each unit may be implemented by a processor such as a central processing unit (CPU) or a graphics processing unit (GPU) being caused to execute a program, that is, by software. Each unit may be implemented by a processor such as a dedicated integrated circuit (IC), that is, by hardware. Each unit may be implemented using software and hardware in combination. In a case where a plurality of processors is used, each of the processors may implement one of the units or two or more of the units.
Details of the processing by the function setting unit 102 will be further described.
In processing 601, the function setting unit 102 selects N samples used for setting an error function. As described above, the function setting unit 102 may select all the samples stored in the learning data storage unit 153 or may select some of all the samples.
In processing 602, the function setting unit 102 sets the error function for each of the selected N samples n, for example, according to Formula (1).
In processing 603, the function setting unit 102 sets the error function L for selected samples by taking a sum of the error functions for the respective samples according to Formula (4), for example.
In a case where the learning is repeated a plurality of times, the processing 601 to 603 described above can be executed for each repetition of the learning.
Next, details of the processing by the update unit 111 will be further described.
The update unit 111 calculates output values using the error functions for the respective samples for M (M is an integer of 1 or more) samples m (1≤m≤M) designated as samples used for learning. The M samples may be designated in any way.
For example, for each of the M samples m, the update unit 111 calculates errors that are the differences between the energy and the forces acting on the atoms that are output from the model as prediction values and the energy and the forces acting on the atoms that are stored in the sample as the correct data. The update unit 111 inputs the calculated errors to, for example, Formula (1), and calculates the output value of the error function for each sample m.
The update unit 111 also calculates an output value of the error function L that is a sum of output values of error functions for the M samples according to Formula (4). Further, the update unit 111 updates the model parameter such that the value of L is made small according to an algorithm such as the steepest descent method, Adam, or the Kalman filter.
Next, model learning processing by the information processing device 100 according to the first embodiment will be described.
The weight setting unit 101 reads a feature of the elements from the feature storage unit 151 (step S101). For example, the weight setting unit 101 reads the mass among the features of the elements. The learning data is read from the learning data storage unit 153 (step S102). The weight setting unit 101 sets the weights of the respective elements using the read feature, and stores the set weights in the weight storage unit 152 (step S103).
The learning unit 110 (update unit 111) reads the model parameter stored in the parameter storage unit 154 (step S104).
The function setting unit 102 sets error functions using the weights set by the weight setting unit 101 (step S105).
The update unit 111 calculates an output value of the error function L using the learning data read in step S102, the model parameter read in step S104, and the error functions set in step S105 (step S106).
The determination unit 112 determines whether to end the learning (step S107). For example, in a case where the difference between the output value calculated last time and the output value calculated this time is less than a threshold, or in a case where the number of times of update of the model parameter exceeds a designated number of times of learning, the determination unit 112 determines to end the learning.
In a case where the learning is determined not to be ended (step S107: No), the update unit 111 updates the model parameter such that the output value of the error function calculated in step S106 is made small, and stores the updated value in the parameter storage unit 154 (step S108). The update unit 111 also increments the number of times of update of the model parameter by 1. Note that the number of times of update of the model parameter is initialized to 0 at the start of the learning.
In a case where the learning is determined to be ended (step S107: Yes), the learning processing ends.
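The loop of steps S104 to S108 can be sketched as follows, assuming the steepest descent method mentioned above; the callable loss_and_grad, which returns the output value of the error function L and its gradient with respect to the model parameter, stands in for the model and is hypothetical.

```python
import numpy as np

def learn(model_params, loss_and_grad, lr=1e-3, threshold=1e-6, max_updates=10000):
    """Learning loop corresponding to steps S104 to S108 (steepest descent)."""
    prev_value = None
    num_updates = 0                    # number of updates, initialized to 0
    while True:
        value, grad = loss_and_grad(model_params)        # step S106
        # Step S107: end when the change in the output value is smaller than a
        # threshold, or the number of updates exceeds the designated number.
        if (prev_value is not None and abs(prev_value - value) < threshold) \
                or num_updates > max_updates:
            return model_params
        model_params = model_params - lr * grad          # step S108
        num_updates += 1
        prev_value = value

# For example, minimizing a toy quadratic converges toward the zero vector:
# learn(np.ones(3), lambda p: (float(np.sum(p ** 2)), 2 * p))
```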
As described above, in the first embodiment, the model is learned using the error function using the weights set by a feature of the elements. As a result, learning accuracy of the model used for analysis of a motion of an atom and the like can be further improved.
An information processing device according to the second embodiment aggregates the number of atoms for each of a plurality of elements, and sets a weight using the number of atoms or a ratio of the number of atoms to the total number of atoms.
The second embodiment is different from the first embodiment in that the aggregation unit 103-2 is added and in the function of the weight setting unit 101-2. Other configurations and functions are similar to those in the first embodiment.
The aggregation unit 103-2 aggregates, for each of the plurality of elements, the number of atoms of that element included in the plurality of pieces of learning data. For example, the aggregation unit 103-2 selects one or more samples from among the samples read as the learning data, counts the atoms included in the selected samples for each element, and obtains the number of atoms for each element.
The weight setting unit 101-2 sets the weights of the respective elements using the feature of each element and the number of atoms aggregated by the aggregation unit 103-2.
Next, learning processing by the information processing device 100-2 according to the second embodiment will be described.
Since steps S201 and S202 are similar to steps S101 and S102 in the information processing device 100 of the first embodiment, the description thereof will be omitted.
The aggregation unit 103-2 aggregates the number of atoms for each of the elements included in the learning data using all or a part of the read learning data (step S203). The weight setting unit 101-2 obtains the weight wk = 1/(mk × nk) of the element k using the feature (for example, the mass mk) of the element and the aggregated number of atoms nk of the element, and stores the weight in the weight storage unit 152 (step S204).
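A minimal sketch of steps S203 and S204 follows, under the assumption that the atom counts are aggregated over all of the read samples; the helper name and the sample representation (a list of element symbols per sample) are hypothetical.

```python
from collections import Counter

def set_weights(samples, masses):
    """Aggregate the number of atoms per element (step S203) and set
    w_k = 1 / (m_k * n_k) for each element k (step S204)."""
    counts = Counter()
    for elements in samples:           # elements: element symbols of one sample
        counts.update(elements)
    return {k: 1.0 / (masses[k] * n) for k, n in counts.items()}

# For example:
# set_weights([["Li", "Li", "O"], ["Li", "O"]], {"Li": 6.94, "O": 16.0})
# -> {"Li": 1 / (6.94 * 3), "O": 1 / (16.0 * 2)}
```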
Since steps S205 to S209 are similar to steps S104 to S108 in the information processing device 100 of the first embodiment, the description thereof will be omitted.
As described above, the information processing device according to the second embodiment learns the model using an error function with weights in which the number of atoms (or the ratio of the number of atoms) of each element included in the learning data is further considered. As a result, the learning accuracy of the model can be further improved.
Next, modifications applicable to the above embodiments and the following embodiments will be described. Hereinafter, a case where the first embodiment is the application target will be described as an example, but other embodiments can also be an application target similarly.
In the above embodiments, the error function is set so as to include terms obtained by multiplying the error of the physical quantity for each of the elements by the weight of that element. In the modifications, the function setting unit 102 selects one or more elements from among the plurality of elements with a probability corresponding to the magnitude of the weights, and sets an error function including the errors of the physical quantities only for the selected elements. Hereinafter, two modifications with different selection methods will be described.
In the first modification, the function setting unit 102 selects an element s for each sample with a probability corresponding to the magnitude of the weights, as indicated in Formula (5). In a case where the element s is selected, the function setting unit 102 sets the error function for the sample using only the error of the forces with respect to the element s (element-specific error es). For example, the function setting unit 102 sets an error function that includes the error of the forces with respect to the selected element s (element-specific error es) and does not include the errors of the unselected elements.
The following Formulas (6) and (7) illustrate examples of the error function set in the present modification. Formula (6) is an example of an error function in a case where one element s is selected. Formula (7) is an example of an error function in a case where two elements s and s′ are selected. The number of selected elements is not limited to one or two, and may be three or more.
In the first modification, the function setting unit 102 selects the element s with a probability according to the weights set by the weight setting unit 101 (step S301). The function setting unit 102 sets an error function including a term of the selected element s as in, for example, Formula (6) or Formula (7) (step S302).
According to the first modification, an error function including terms of only some elements selected according to the weights can be used, as indicated in Formulas (6) and (7). As a result, the error function becomes a simpler formula, and the calculation load can be reduced. In a case where a large number of samples of learning data are used, many different elements are selected over the samples according to the weights. Therefore, the selected elements are not biased, and a learning result similar to that of the embodiments can be obtained.
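The selection of step S301 can be sketched as follows, assuming that Formula (5) normalizes the weights into selection probabilities (the probability of selecting the element k being proportional to wk); this normalized form and the function name are assumptions.

```python
import random

def select_elements(weights, num_select=1):
    """Select element(s) with probability proportional to the weights,
    as in step S301. Selection here is with replacement; selecting two
    distinct elements s and s' would require sampling without replacement."""
    elements = list(weights)
    return random.choices(elements, weights=[weights[k] for k in elements],
                          k=num_select)

# With inverse-mass weights, a light element such as Li is selected
# far more often than a heavy one such as La:
# select_elements({"Li": 1 / 6.94, "La": 1 / 138.91})
```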
In a second modification, an element common to samples of all pieces of learning data is selected in each of a plurality of times of learning repeatedly executed.
In the processing 1301, the function setting unit 102 of the second modification selects an element with the probability indicated in Formula (5), similarly to the first modification.
In the first modification, a different element is selected for each sample, whereas in the second modification, the same element is selected for all samples. That is, the function setting unit 102 of the second modification selects one or more elements with a probability corresponding to the magnitude of a plurality of weights for each of a plurality of times of learning.
In the second modification, the function setting unit 102 selects the element s with a probability according to the weights of the respective elements (step S405). The function setting unit 102 sets the error function for every sample using only the error of the forces with respect to the selected element s (element-specific error es) (step S406).
Since steps S407 to S409 are similar to steps S106 to S108 in the information processing device 100 of the first embodiment, the description thereof will be omitted.
In the second modification, an error function including terms of only some elements selected according to the weights can be used, similarly to the first modification. As a result, the error function becomes a simpler formula, and the calculation load can be reduced. In the second modification, the same element is selected for all the samples included in the learning data used in one repetition of the learning; however, in a case where the learning is repeated a plurality of times, many different elements can be selected according to the weights. Therefore, the selected elements are not biased, and a learning result similar to that of the embodiments can be obtained.
An information processing device according to a third embodiment includes a function of correcting a weight value.
The third embodiment is different from the first embodiment in the function of the output control unit 131-3 and in that the correction unit 132-3 is added. Other configurations and functions are similar to those in the first embodiment.
The output control unit 131-3 is different from the output control unit 131 of the first embodiment in that a function of outputting output information including the output value of the error function is further included. For example, the output control unit 131-3 controls a display device such as a display connected to the information processing device 100-3 such that the output information is displayed.
The correction unit 132-3 executes correction processing of correcting at least some of a plurality of weights to a designated value. For example, the correction unit 132-3 corrects a weight for which correction is designated among the plurality of weights to a value designated by a user according to the output information displayed on the display device.
Note that the function setting unit 102 sets an error function including one or more terms based on the plurality of weights after the correction processing is executed.
As described above, a value of a weight can be corrected by a user or the like in the third embodiment. As a result, for example, a more appropriate weight value can be searched for.
An information processing device according to a fourth embodiment sets weights used for setting an error function on the basis of weights designated by a user or the like.
The fourth embodiment is different from the first embodiment in that the feature storage unit 151 is removed, in that the reception unit 104-4 is added, and in the function of the weight setting unit 101-4. Other configurations and functions are similar to those in the first embodiment.
The reception unit 104-4 receives a plurality of weights designated for a plurality of respective elements. For example, the reception unit 104-4 receives inputs of a plurality of weights for a plurality of respective elements designated by a user or the like.
The weight setting unit 101-4 sets the received weights as the weights used in the setting of the error function by the function setting unit 102. For example, in a case where the values of the weights are designated by a user or the like such that different values are set according to the features of the respective elements, a function similar to that of the above embodiments can be implemented using an error function with the designated weights. For example, the designated weights may be any values as long as the correlation with the reciprocals of the masses of the elements is equal to or greater than a threshold (for example, 90%).
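As a sketch of the check described above, the correlation between the designated weights and the reciprocals of the masses can be evaluated as follows, assuming that the Pearson correlation coefficient is the intended measure; the function name is hypothetical.

```python
import numpy as np

def weights_acceptable(designated, masses, threshold=0.9):
    """Return True if the designated weights correlate with the reciprocals
    of the element masses at or above the threshold (for example, 90%)."""
    elements = sorted(designated)
    w = np.array([designated[k] for k in elements])
    inv_m = np.array([1.0 / masses[k] for k in elements])
    r = np.corrcoef(w, inv_m)[0, 1]    # Pearson correlation coefficient
    return r >= threshold
```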
The weight setting unit 101-4 of the present embodiment does not need to calculate and set weights from the features stored in the feature storage unit 151 as in the first embodiment. Therefore, the feature storage unit 151 is not necessarily included in the present embodiment.
As described above, according to the first to fourth embodiments, learning accuracy of a model used for analysis of a motion of an atom and the like can be further improved.
Next, a hardware configuration of the information processing device according to the first to fourth embodiments will be described.
The information processing device according to the first to fourth embodiments includes a control device such as a CPU 51, storage devices such as a read only memory (ROM) 52 and a RAM 53, a communication I/F 54 that is connected to a network and performs communication, and a bus 61 that connects each unit. A GPU may be further included as a control device.
A program executed by the information processing device according to the first to fourth embodiments is provided by being incorporated in the ROM 52 or the like in advance.
The program executed by the information processing device according to the first to fourth embodiments may be provided as a computer program product by being recorded as a file in an installable format or an executable format in a computer-readable recording medium such as a compact disk read only memory (CD-ROM), a flexible disk (FD), a compact disk recordable (CD-R), or a digital versatile disk (DVD).
Furthermore, the program executed by the information processing device according to the first to fourth embodiments may be stored on a computer connected to a network such as the Internet and provided by being downloaded via the network. Furthermore, the program executed by the information processing device according to the first to fourth embodiments may be provided or distributed via a network such as the Internet.
The program executed by the information processing device according to the first to fourth embodiments may cause a computer to function as each unit of the information processing devices described above. In this computer, the CPU 51 can read a program from a computer-readable storage medium onto a main storage device and execute the program.
Configuration Examples of the embodiments will be described below:
An information processing device including
The information processing device according to Configuration Example 1,
The information processing device according to Configuration Example 1 or 2,
set the plurality of weights according to the features of the plurality of elements, and
The information processing device according to Configuration Example 3,
The information processing device according to any one of Configuration Examples 1 to 4,
The information processing device according to any one of Configuration Examples 1 to 5,
The information processing device according to Configuration Example 6,
The information processing device according to any one of Configuration Examples 1 to 7,
The information processing device according to Configuration Example 1,
The information processing device according to Configuration Example 9,
The information processing device according to Configuration Example 9,
The information processing device according to any one of Configuration Examples 1 to 11,
The information processing device according to Configuration Example 12,
The information processing device according to any one of Configuration Examples 1 to 13,
The information processing device according to any one of Configuration Examples 1 to 14,
An information processing method executed by an information processing device, including:
A program for causing a computer to execute:
While certain embodiments have been described, these embodiments have been presented by way of example only, and are not intended to limit the scope of the inventions. Indeed, the novel embodiments described herein may be embodied in a variety of other forms; furthermore, various omissions, substitutions and changes in the form of the embodiments described herein may be made without departing from the spirit of the inventions. The accompanying claims and their equivalents are intended to cover such forms or modifications as would fall within the scope and spirit of the inventions.