This invention relates generally to graph neural network (GNN) force field computational algorithms for direct prediction of atomic forces in molecular dynamics computer simulations in material systems, such as electrochemical and water filtration devices.
Molecular dynamics is a computational materials science methodology for simulating the motion of atoms in a material system at real operating pressure and temperature conditions. Methodologies exist to calculate the underlying atomic forces used in the simulation of the motion of atoms. One methodology is the ab-initio quantum mechanics approach. This approach is very accurate but is also very expensive because of the tremendous amount of computational resources necessary to apply it. While other approaches exist that consume fewer computational resources, these other approaches do not deliver the same accuracy.
According to one embodiment, a computational method for simulating the motion of elements within a multi-element system using a graph neural network (GNN) is disclosed. The method includes converting a molecular dynamics snapshot of the elements within the multi-element system into a directed graph comprised of nodes and edges. The nodes include first and second nodes and the edges include first and second edges. The method further includes the step of initially embedding the nodes and the edges to obtain initially embedded nodes and edges. The method also includes updating the initially embedded nodes and edges by passing a first message from the first edge to the first node using a first message function and passing a second message from the first node to the first edge using a second message function to obtain updated embedded nodes and edges. The method also includes predicting a force vector for one or more elements within the multi-element system based on the updated embedded edges and a unit vector pointing from the first node to the second node or the second node to the first node.
According to another embodiment, a computational method for simulating the motion of elements within a multi-element system using a graph neural network (GNN) is disclosed. The method includes converting a molecular dynamics snapshot of the elements within the multi-element system into a directed graph comprised of nodes and edges. The nodes include first, second and third nodes and the edges include first, second and third edges. The method further includes initially embedding the nodes and the edges to obtain initially embedded nodes and edges. The method also includes updating the initially embedded nodes and edges by passing a first message from the first edge to the first node using a first message function and passing a second message from the first node to the first edge using a second message function to obtain updated embedded nodes and edges. The second message function represents interactions between the first, second and third nodes. The method also includes predicting a force vector for one or more elements within the multi-element system based on the updated embedded edges.
According to yet another embodiment, a computational method for simulating the motion of elements within a multi-element system using a graph neural network (GNN) is disclosed. The method includes converting a molecular dynamics snapshot of the elements within the multi-element system into a directed graph comprised of nodes and edges. The nodes include first and second nodes and the edges include first and second edges. The method further includes initially embedding the nodes and the edges to obtain initially embedded nodes and edges. The method also includes updating the initially embedded nodes and edges by iteratively passing a first message from the first edge to the first node using a first message function and iteratively passing a second message from the first node to the first edge using a second message function to obtain updated embedded nodes and edges. The method also includes predicting a force vector for one or more elements within the multi-element system based on the updated embedded edges.
As required, detailed embodiments of the present invention are disclosed herein; however, it is to be understood that the disclosed embodiments are merely exemplary of the invention that may be embodied in various and alternative forms. The figures are not necessarily to scale; some features may be exaggerated or minimized to show details of particular components. Therefore, specific structural and functional details disclosed herein are not to be interpreted as limiting, but merely as a representative basis for teaching one skilled in the art to variously employ the present invention.
The term “substantially” may be used herein to describe disclosed or claimed embodiments. The term “substantially” may modify a value or relative characteristic disclosed or claimed in the present disclosure. In such instances, “substantially” may signify that the value or relative characteristic it modifies is within ±0%, 0.1%, 0.5%, 1%, 2%, 3%, 4%, 5% or 10% of the value or relative characteristic.
Molecular dynamics (MD) methodologies are beneficial for studying physical phenomena, such as, but not limited to, ionic transport, chemical reactions, and material bulk and surface degradation in material systems, such as devices or functional materials. Non-limiting examples of such material systems include fuel cells, surface coatings, batteries, water desalination, and water filtration. Methodologies exist to calculate the underlying atomic forces used in the simulation of the motion of atoms. The ab-initio quantum mechanics approach is very accurate but is also very expensive because of the tremendous amount of computational resources necessary to apply the approach.
Neural networks have been utilized to fit and predict quantum mechanics energies. These methodologies have been referred to as neural network force fields (NNFF). Atomic forces are then obtained as derivatives of the predicted energy with respect to atomic positions. However, these methodologies are also computationally expensive. In light of the foregoing, what is needed is a computational methodology for calculating atomic forces that delivers an adequate level of accuracy while consuming a reasonable amount of computing resources.
Molecular dynamics uses atomic positions (and possibly charges, bonds, or other structural information) to calculate the interatomic forces on each atom, which are consequently used to modify the velocities of atoms in the simulation. The resulting trajectories of the atoms are utilized to describe physical phenomena, such as, but not limited to, ionic transport in batteries and fuel cells, chemical reactions during bulk and surface material degradation, and solid-state material phase change. A tradeoff exists between the accuracy and the size (measured by the number of atoms and the simulated dynamics time) of the simulation depending on the underlying method used to calculate the atomic forces. As set forth above, one accurate but expensive method uses the ab-initio quantum mechanics approach, known as ab-initio molecular dynamics (AIMD).
AIMD is too computationally demanding for many relevant applications. For instance, according to one estimation, an AIMD simulation on a high-performance computer with two hundred (200) cores would take sixty-five (65) years to complete to study the dynamics behavior of amorphous lithium phosphate glass with about five hundred (500) atoms for five hundred (500) nanoseconds.
An alternative to AIMD is classical molecular dynamics. Using this alternative, interatomic forces have been fitted into some closed functional forms to enable several orders of magnitude speedup over AIMD. This alternative approach has been used to simulate large systems, such as polymers and amorphous materials. According to one estimation, the same simulation identified above using the classical molecular dynamics approach finishes in under two (2) days.
One of the drawbacks of classical molecular dynamics models using pre-defined functional forms is that the functional forms restrict the behavior of interatomic interactions to assumptions that are valid only under certain conditions. For example, a classical force field with quadratic functional forms for the bond between atoms assumes that the chemical bond between the atoms does not break. This assumption is contrary to the behavior of nature, where chemical bonds in molecules form and break during chemical reactions. Relatively more complex pre-assumed functional forms for simulating chemical reactions exist. For example, reactive force fields, such as ReaxFF, are one such class of models. These models suffer from the disadvantage of being difficult to parametrize and require a large number of relatively complex, handcrafted, closed functional forms.
As described above, NNFF algorithms exist that require the computation of quantum mechanics energies. One implementation of NNFF uses rotation-invariant features developed by Behler and Parrinello (otherwise referred to as the Behler-Parrinello approach), as described in the following articles: J. Behler and M. Parrinello, “Generalized Neural-Network Representation of High-Dimensional Potential-Energy Surfaces”, Phys. Rev. Lett. 98, 146401 (2007); M. Gastegger, J. Behler, and P. Marquetand, “Machine Learning Molecular Dynamics for Simulation of Infrared Spectra”, Chem. Sci. 2017, 8, 6924; and J. Behler, “Atom-centered Symmetry Functions for Constructing High-Dimensional Neural Network Potentials”, J. Chem. Phys., 134, 074106 (2011), all of which are incorporated by reference in their entirety herein.
In this implementation, atoms surrounding central atom i within a predefined radius Rc are used as inputs into the neural network. Instead of using atomic Cartesian coordinates of the atoms as the input features, the atomic Cartesian coordinates are converted into fingerprint feature vectors. These vectors are invariant under physical operations that should not change the energy of the local atomic structure. These physical operations include rotation, translation, and permutation. Rotation-invariance refers to the energy invariance upon rotating the entire Cartesian coordinates of the local atomic environment by an identical rotation with respect to an arbitrary point in space, such as the central atom i. Translational-invariance refers to the energy invariance upon shifting the entire Cartesian coordinates of the local atomic environment by an identical displacement. Permutation-invariance refers to the energy invariance upon changing the ordering sequence of the atoms in the local atomic environment while keeping their positions fixed. In addition to satisfying these three (3) conditions, the chosen fingerprint feature vectors may also satisfy the following criteria (which are assumed to be consistent with the physics of the system): decaying contribution from atoms further away from the central atom i (down to 0 for R>Rc); and smoothly varying as the atomic positions vary smoothly in the local atomic environment.
In one implementation, these five (5) requirements are accomplished by converting a list of atomic coordinates surrounding a central atom i {{right arrow over (Ri)}} into fingerprint feature vectors {{right arrow over (Gi)}} described by two-body and three-body functions. Some non-limiting examples of fingerprint functions are listed below:
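The standard two-body and three-body symmetry functions from the cited Behler references take the following forms (reproduced from the cited literature as representative, non-limiting examples; R_s denotes a radial shift parameter defined there):

```latex
G_i^{1} = \sum_{j} f_c(R_{ij}), \qquad
G_i^{2} = \sum_{j} e^{-\eta (R_{ij}-R_s)^2}\, f_c(R_{ij}), \qquad
G_i^{3} = \sum_{j} \cos(\kappa R_{ij})\, f_c(R_{ij})

G_i^{4} = 2^{1-\zeta} \sum_{j,k \neq i} \left(1+\lambda \cos\theta_{ijk}\right)^{\zeta}
          \, e^{-\eta\left(R_{ij}^{2}+R_{ik}^{2}+R_{jk}^{2}\right)}
          \, f_c(R_{ij})\, f_c(R_{ik})\, f_c(R_{jk})
```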
where η, κ, ζ, and λ are user-defined parameters of the G fingerprint functions as described in J. Behler, “Atom-centered Symmetry Functions for Constructing High-Dimensional Neural Network Potentials”, J. Chem. Phys., 134, 074106 (2011). The values of these parameters are typically chosen to appropriately cover the length-scales relevant for the local atomic system under consideration (typically within a cutoff radius Rc < 10 Å). The cutoff function ƒc depends on Rc and can be defined as follows:
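By way of illustration, the following is a minimal Python sketch of this machinery, assuming the standard cosine cutoff ƒc(R) = 0.5[cos(πR/Rc) + 1] for R ≤ Rc (and 0 otherwise) and the standard two-body G form from the cited Behler reference; the function names and parameter values are illustrative only:

```python
import math

def cutoff(r, rc):
    """Cosine cutoff f_c: decays smoothly to 0 at r = rc (standard Behler form)."""
    if r > rc:
        return 0.0
    return 0.5 * (math.cos(math.pi * r / rc) + 1.0)

def g2_fingerprint(distances, eta, rs, rc):
    """Two-body radial symmetry function G^2 for one central atom.

    `distances` holds the distances R_ij from the central atom i to each
    neighbor j of a single element type (the summation is performed per
    element, as required for multi-element systems)."""
    return sum(math.exp(-eta * (r - rs) ** 2) * cutoff(r, rc)
               for r in distances)

# Example: three neighbors at 1.0, 2.5 and 6.0 (arbitrary length units) with
# rc = 5.0; the neighbor at 6.0 lies beyond the cutoff and contributes zero.
g = g2_fingerprint([1.0, 2.5, 6.0], eta=0.5, rs=0.0, rc=5.0)
```

Note how the cutoff enforces the decaying-contribution criterion: any neighbor beyond Rc is multiplied by exactly zero, and contributions vary smoothly as atoms move.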
The permutation-invariance is only true for atoms of the same element. For example, exchanging a C atom at position 1 and a C atom at position 2 describes two systems with the same physics, and accordingly, the permutation-invariance holds true. As another example, exchanging a C atom at position 1 and an O atom at position 2 describes two systems with different physics, and permutation-invariance does not hold true. Consequently, when multi-element systems are investigated, the summations in the G functions above should only be performed over atoms of the same element (for two-body G's) or the same element pair (for three-body G's). Given m number of elements in an MD system of interest surrounding a central atom i, m different G's are calculated for each parametrized two-body function and m(m+1)/2 different G's for each parametrized three-body function. Due to the construction of these functions (including rotational invariance), these functions provide beneficial representations for learning the rotation-invariant nature of quantum mechanics. However, a machine learning system using these invariant features is limited to predicting rotation-invariant outputs, such as energy.
Many methodologies based on quantum mechanics (e.g., density functional-theory) are used to calculate energy E of a system. However, these methodologies cannot conveniently partition the electronic energy into contributions from individual atoms. Accordingly, in one or more embodiments, {{right arrow over (Gi)}} feature vectors for each atom in a system are calculated and then utilized to train or predict a single scalar quantity of energy E as shown in
The NNFF approach for AIMD applications suffers from one or more limitations. First, during a training cycle, each of the atomic NNs (typically one NN for each element) is co-trained to a single energy quantity E, which is tied to the system of interest. Accordingly, the NNs are co-dependent because each needs to be co-used to predict a single energy value E, and they are only optimized for predicting the exact MD system used in the training set. There are no independent {{right arrow over (Ri)}}→Ei samples that can be included for the training of different MD systems containing local atomic environments similar to {{right arrow over (Ri)}}. Second, during MD runtime (the force prediction cycle), expensive calculations are performed to calculate the force vector {right arrow over (F)}i on atom i. {right arrow over (F)}i is the derivative of the total system energy E with respect to the displacement of a single atom {right arrow over (Ri)}. However, prediction of E requires knowing G for all atoms in the MD system {{right arrow over (G1)}, {right arrow over (G2)}, . . . , {right arrow over (GN)}}. Consequently, calculating {right arrow over (F)}i requires calculating the fingerprint vector 3N times, as shown in the equation below.
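Consistent with the symbols defined in the surrounding text, the referenced derivative may be reconstructed as the standard chain-rule expression:

```latex
F_{i,\alpha} = -\frac{\partial E}{\partial R_{i,\alpha}}
             = -\sum_{j=1}^{N} \sum_{m=1}^{M_j}
               \frac{\partial E}{\partial G_{j,m}}\,
               \frac{\partial G_{j,m}}{\partial R_{i,\alpha}}
```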
where α indicates the Cartesian x, y, and z axes, N is the total number of atoms in the MD system, and Mi is the total number of fingerprint functions describing atom i. Accordingly, calculating forces for all N atoms in the system scales as O(N²). In certain implementations, certain algorithm modifications can reduce the number of fingerprint vector calculations to 3P times, where P is the average number of atoms in the local atomic environment of the atoms, reducing the computational cost scale to O(N). Nevertheless, this approach still requires 3P {{right arrow over (Gi)}} calculations for one {right arrow over (F)}i prediction (P may range from 100 to 400 atoms depending on the chosen Rc), as opposed to one {{right arrow over (Gi)}} calculation in a direct {right arrow over (F)}i prediction. Accordingly, an NNFF algorithm with direct {right arrow over (F)}i prediction can accelerate NNFF-based MD {right arrow over (F)}i calculations by two (2) to three (3) orders of magnitude.
In one or more embodiments, a graph neural network (GNN) is utilized because it does not require manually computing rotationally-invariant features of a local atomic environment to predict atomic forces for an MD simulation. The GNN model architecture may be configured to automatically extract translationally and rotationally-invariant features from an input MD simulation snapshot during a training process.
The computational algorithm of one or more embodiments is configured to perform direct {{right arrow over (Ri)}}→{right arrow over (F)}i prediction for each training sample. Accordingly, {{right arrow over (Ri)}}→{right arrow over (F)}i samples from different sets of systems or materials may be combined for force field training and development. The computational algorithm of one or more embodiments may be practically applied to a relatively large atomic system of an interface between materials A and B. A training set may be created by combining separate smaller structures of material A, material B, and the interface between the two.
The computational algorithm of one or more embodiments may be integrated into an MD simulation engine. This integration may be used to perform accurate MD simulations without using the more expensive AIMD methods. The computational algorithm of one or more embodiments finds practical application in the modeling of force vectors for one or more elements within multi-element systems, such as atom/ion dynamics and chemical reactions in fuel cells, water desalination, catalysis, coatings and batteries.
In one or more embodiments, the force prediction algorithm of one or more embodiments is integrated into MD software. A non-limiting example of suitable MD software is Large-scale Atomic/Molecular Massively Parallel Simulator (LAMMPS). LAMMPS is available from Sandia National Laboratories, located in Albuquerque, N. Mex. LAMMPS uses a Message Passing Interface (MPI) for parallel communication. LAMMPS is open-source software, distributed under the terms of the GNU General Public License.
The atomic force prediction algorithm of one or more embodiments utilizes graph neural networks (GNN) that may operate directly on graph-structured data.
In one embodiment, embedding part 402 may be mathematically expressed as the following equations (8) and (9).
ƒa: One-Hot(type(atom i))→hi0 (8)

GF: dij→h(i,j)0 (9)
where ƒa denotes the neural network that maps the one-hot encoding of atom type to hi0, and GF denotes the Gaussian filtering operation. GNN 400 of one embodiment uses a single dense hidden layer for ƒa.
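A minimal Python sketch of the initial embedding step of equations (8) and (9) follows; the names and the Gaussian basis parameters are illustrative only, and the described embodiment additionally applies a dense hidden layer to the one-hot vector:

```python
import math

def one_hot_embed(atom_type, num_elements):
    """f_a input (equation (8)): one-hot encoding of the atom type.
    The described embodiment follows this with a single dense hidden layer."""
    v = [0.0] * num_elements
    v[atom_type] = 1.0
    return v

def gaussian_filter_embed(d_ij, centers, width):
    """GF (equation (9)): expand an interatomic distance d_ij onto a basis of
    Gaussians, giving the initial edge embedding h_(i,j)^0."""
    return [math.exp(-((d_ij - c) ** 2) / width ** 2) for c in centers]

# Example: an atom of element index 1 out of 3 element types, and one edge of
# length 1.7 (arbitrary units) expanded on Gaussians centered at 0.0 .. 5.0.
node0 = one_hot_embed(1, num_elements=3)                 # [0.0, 1.0, 0.0]
centers = [0.5 * k for k in range(11)]
edge0 = gaussian_filter_embed(1.7, centers, width=0.5)   # length-11 vector
```

The Gaussian expansion turns a single scalar distance into a smooth, fixed-length feature vector, which is what allows the downstream dense layers to operate on edges of any length.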
Message-passing part 404 of GNN 400 may be composed of several message passing layers 410 where at each layer, local messages may be passed between the nodes and edges. Local message passing between the nodes and edges may be defined by equations (10) and (11):
Edge to node: ƒvl:{hil-1,h(i,j)l-1|j∈Ni}→hil (10)
Node to edge: ƒel:{hil,hjl,h(i,j)l-1}→h(i,j)l (11)
where hil is the embedding of node vi in layer l and h(i,j)l is the embedding of edge eij, and where the initial conditions hi0 and h(i,j)0 are determined in embedding part 402 of GNN 400. Ni denotes the set of indices of neighbor nodes connected to node vi. Message functions ƒvl and ƒel are node- and edge-specific neural networks that update hil-1 and h(i,j)l-1 to hil and h(i,j)l, respectively. These functions are defined such that, after each update (message-passing), the embeddings hil and h(i,j)l better represent, respectively, the local atomic environment of atom i and the interatomic interactions between atoms i and j. The message functions may be designed flexibly so that they can be altered to emphasize different interatomic dynamics that occur within MD simulations.
In one embodiment, the message functions ƒvl and ƒel are defined by equations (12) and (13), respectively.
ƒvl: hil=g[hil-1+Σj∈Ni σ((hil-1⊕hjl-1⊕h(i,j)l-1)W1l+b1l)⊙g((hil-1⊕hjl-1⊕h(i,j)l-1)W2l+b2l)] (12)

ƒel: h(i,j)l=g[h(i,j)l-1+σ((hil⊙hjl)W3l+b3l)⊙g((hil⊙hjl)W4l+b4l)
+Σk∈Ni\{j} σ((h(i,j)l-1⊕h(i,k)l-1)W5l+b5l)⊙g((h(i,j)l-1⊕h(i,k)l-1)W6l+b6l)] (13)
where g denotes a hyperbolic tangent function, σ denotes a sigmoid function, ⊙ denotes an element-wise multiplication operator, and ⊕ denotes a concatenation operator. Wl and bl represent the weights and biases of the dense hidden layers that compose layer l. The second line of equation (13) represents a three-body function. Equation (12) and the remaining part of equation (13) represent a two-body function.
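Because equations (12) and (13) are implementation-specific, the following is only a simplified, illustrative Python sketch of the gated message-passing pattern that equations (10) and (11) describe; the function names are hypothetical, the dense-layer weights and biases and the three-body term are omitted, and node and edge embeddings are assumed to share one dimension:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def node_update(h_i, incident_edges):
    """Edge-to-node message in the shape of equation (10): each incident edge
    embedding contributes a gated message sigmoid(e) * tanh(e); the node
    embedding is the squashed sum of its previous value and the messages."""
    dim = len(h_i)
    msg = [sum(sigmoid(e[k]) * math.tanh(e[k]) for e in incident_edges)
           for k in range(dim)]
    return [math.tanh(h + m) for h, m in zip(h_i, msg)]

def edge_update(h_ij, h_i, h_j):
    """Node-to-edge message in the shape of equation (11): the two-body part
    gates on the element-wise product of the two connected node embeddings."""
    prod = [a * b for a, b in zip(h_i, h_j)]
    return [math.tanh(e + sigmoid(p) * math.tanh(p))
            for e, p in zip(h_ij, prod)]

# Example: one node with a 2-dim embedding and a single incident edge,
# followed by the reverse node-to-edge pass.
h_new = node_update([0.0, 0.0], [[1.0, -1.0]])
e_new = edge_update([0.0, 0.0], [0.5, -0.5], [0.5, 0.5])
```

The sigmoid factor acts as a soft gate on each message while the tanh bounds the embedding values, mirroring the roles of σ and g in the equations above.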
In force vector calculation part 406, the edge embedding outputs of message-passing part 404 may be used to calculate force vectors for one or more elements within a multi-element system. For a GNN with L message-passing layers, the final state of edge eij embedding h(i,j)L may be used to calculate nij, a scalar quantity that denotes the magnitude of the force that atom i exerts on atom j. The force exerted by atom i onto atom j is given by nij·{right arrow over (u)}ij, and the force prediction of atom j is the sum of the forces exerted by neighboring atoms. These calculations may be mathematically represented by equations (14) and (15), reproduced below.
ƒƒ:h(i,j)L→nij (14)
{right arrow over (F)}j=Σi∈Nj nij·{right arrow over (u)}ij (15)
where function ƒƒ is a neural network that maps h(i,j)L to nij and {right arrow over (F)}j is the force prediction of atom j. GNN 400 of one embodiment may apply a series of three (3) dense hidden layers for ƒƒ, where the intermediate hidden states are passed through a ReLU activation function.
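The force assembly of equation (15) can be sketched in a few lines of Python; the names are illustrative, and the nij values would come from the ƒƒ network described above:

```python
def predict_force(neighbor_scalars, unit_vectors):
    """Equation (15): the force on atom j is the sum over its neighbors i of
    the predicted scalar magnitude n_ij times the unit vector u_ij.
    `neighbor_scalars[i]` holds n_ij and `unit_vectors[i]` holds u_ij."""
    fx = fy = fz = 0.0
    for n_ij, (ux, uy, uz) in zip(neighbor_scalars, unit_vectors):
        fx += n_ij * ux
        fy += n_ij * uy
        fz += n_ij * uz
    return (fx, fy, fz)

# Example: two neighbors pushing atom j along +x and +y with equal magnitude.
f = predict_force([2.0, 2.0], [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)])  # (2.0, 2.0, 0.0)
```

Because the network predicts only rotation-invariant scalars nij and the directionality comes entirely from the unit vectors, the resulting force prediction rotates with the input coordinates, consistent with the rotation-covariance discussed below.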
During the training process of GNN 400, the weights and biases in ƒa, ƒvl, ƒel, and ƒƒ (l=1, 2, . . . , L) are updated through backpropagation to minimize the error between the predicted forces and the MD-calculated forces. In such an embodiment, the force field predictions by GNN 400 are translation-invariant but rotation-covariant to the input atomic coordinate environment {{right arrow over (Ri)}} of the MD snapshots.
In one embodiment, GNN 400 may be used to directly predict atomic force vectors in an MD simulation for fuel cells, water desalination, catalysis, coating and battery applications. Moreover, GNN 400 of one or more embodiments may be used as part of an MD simulation for drug design and biosciences, including biochemistry design. In one or more embodiments, MD snapshot 300 represented as directed graphs 302 may be composed of nodes, edges and unit vectors. The nodes may be a representation of constituent atoms. The edges may be a representation of interatomic influence between neighboring atoms. The unit vectors may be a representation of the directions of the interatomic influences. The nodes may be initially embedded using, for example, a one-hot encoding of atom type. The edges may be initially embedded using, for example, the interatomic distances processed by a Gaussian filter. The initial node and edge embedding may be iteratively updated by iteratively passing messages from edge to node and node to edge. The messages can be, but are not limited to, functions that are designed to update the node and edge embedding to better represent the local atomic environment and interatomic interactions, respectively. The message passing steps may be conducted more than once. The edge embedding and unit vectors may be used to calculate the interatomic forces between a center atom and its neighboring atoms. The values can be summed to predict the force vector of the center atom.
√{square root over (|{right arrow over (F)}i,GNN−{right arrow over (F)}i,QM|2)} (16)
Using the error equation set forth above, the error between the GNN algorithm and the quantum mechanics DFT simulation is 0.273 eV/Å, 0.159 eV/Å, 0.148 eV/Å, 0.221 eV/Å, and 0.340 eV/Å for the prediction of C, F, H, O and S atoms, respectively, which is within an acceptable range. In these examples, no samples from the test set were seen during the training cycle.
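Equation (16) amounts to the magnitude of the per-atom difference between the two force vectors; a small Python sketch follows, with hypothetical force vectors rather than values from the results reported above:

```python
import math

def force_error(f_gnn, f_qm):
    """Equation (16): magnitude of the vector difference between the
    GNN-predicted and quantum-mechanics (DFT) forces on one atom,
    in whatever units the inputs carry (eV/Angstrom in the text)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(f_gnn, f_qm)))

# Example with hypothetical 3-4-5 component differences scaled down by 10.
err = force_error((1.0, 0.0, 0.0), (1.0, 0.3, -0.4))  # ~0.5
```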
The GNN algorithms and/or methodologies of one or more embodiments are implemented using a computing platform, such as the computing platform 900 illustrated in
The processor 904 may be configured to read into memory 902 and execute computer-executable instructions residing in GNN software module 908 of the non-volatile storage 906 and embodying GNN algorithms and/or methodologies of one or more embodiments. The processor 904 may be further configured to read into memory 902 and execute computer-executable instructions residing in MD software module 910 (such as LAMMPS) of the non-volatile storage 906 and embodying MD algorithms and/or methodologies. The software modules 908 and 910 may include operating systems and applications. The software modules 908 and 910 may be compiled or interpreted from computer programs created using a variety of programming languages and/or technologies, including, without limitation, and either alone or in combination, Java, C, C++, C#, Objective-C, Fortran, Pascal, JavaScript, Python, Perl, and PL/SQL. In one embodiment, PyTorch, which is a package for the Python programming language, may be used to implement code for the GNNs of one or more embodiments. The code framework may be based on a crystal graph convolutional neural network (CGCNN) code, which is available under license from the Massachusetts Institute of Technology of Cambridge, Mass.
Upon execution by the processor 904, the computer-executable instructions of the GNN software module 908 and the MD software module 910 may cause the computing platform 900 to implement one or more of the GNN algorithms and/or methodologies and MD algorithms and/or methodologies, respectively, disclosed herein. The non-volatile storage 906 may also include GNN data 912 and MD data 914 supporting the functions, features, and processes of the one or more embodiments described herein.
The program code embodying the algorithms and/or methodologies described herein is capable of being individually or collectively distributed as a program product in a variety of different forms. The program code may be distributed using a computer readable storage medium having computer readable program instructions thereon for causing a processor to carry out aspects of one or more embodiments. Computer readable storage media, which is inherently non-transitory, may include volatile and non-volatile, and removable and non-removable tangible media implemented in any method or technology for storage of information, such as computer-readable instructions, data structures, program modules, or other data. Computer readable storage media may further include RAM, ROM, erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other solid state memory technology, portable compact disc read-only memory (CD-ROM), or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to store the desired information and which can be read by a computer. Computer readable program instructions may be downloaded to a computer, another type of programmable data processing apparatus, or another device from a computer readable storage medium or to an external computer or external storage device via a network.
Computer readable program instructions stored in a computer readable medium may be used to direct a computer, other types of programmable data processing apparatus, or other devices to function in a particular manner, such that the instructions stored in the computer readable medium produce an article of manufacture including instructions that implement the functions, acts, and/or operations specified in the flowcharts or diagrams. In certain alternative embodiments, the functions, acts, and/or operations specified in the flowcharts and diagrams may be re-ordered, processed serially, and/or processed concurrently consistent with one or more embodiments. Moreover, any of the flowcharts and/or diagrams may include more or fewer nodes or blocks than those illustrated consistent with one or more embodiments.
While the invention has been illustrated by a description of various embodiments, and while these embodiments have been described in considerable detail, it is not the intention of the applicant to restrict or in any way limit the scope of the appended claims to such detail. Additional advantages and modifications will readily appear to those skilled in the art. The invention in its broader aspects is therefore not limited to the specific details, representative apparatus and method, and illustrative examples shown and described. Accordingly, departures may be made from such details without departing from the spirit or scope of the general inventive concept.