The present application claims priority to Korean Patent Application No. 10-2023-0022659, filed Feb. 21, 2023, the entire contents of which are incorporated herein for all purposes by this reference.
The present invention relates to a device and method for modeling 2-dimensional (2D) materials using machine learning, and more particularly, to a technology capable of discovering new 2D materials with high elastic modulus and shear modulus through deep learning, machine learning, and high-throughput calculation.
A 2D material refers to a substance with a thin thickness measured in nanometers. 2D materials have excellent electrical and chemical properties, as well as the advantages of being flexible and optically transparent, which are derived from their thin thickness.
For example, graphene has a high elastic modulus of around 1 TPa and exhibits excellent electrical conductivity. A single layer of MoS2 has a direct bandgap of approximately 1.8 eV and exhibits high luminescence efficiency.
Phase-change random-access memory (PRAM) made of 2D MoTe2 has the advantage of being faster in data storage and retrieval speed than those made of conventional materials. The properties of each 2D material suggest the possibility of applying 2D materials to a variety of industries, including next-generation displays, solar energy, the aerospace industry, and semiconductors.
However, commercializing these 2D materials remains a challenging problem. The difficulty lies in achieving mass production of widely known 2D materials, using mechanical exfoliation and chemical vapor deposition methods, without compromising their excellent properties. Furthermore, the unique properties of each 2D material pose difficulties in terms of their commercialization.
For example, graphene possesses strong bonding forces between its constituent atoms, making it difficult to bond with other materials. The same is true of MoS2. Furthermore, these materials have the disadvantage of being prone to oxidation, which results in the electrical properties being easily altered.
Therefore, in addition to existing 2D materials, finding new 2D materials that are easy to synthesize plays a very important role in the development of various industrial fields. Furthermore, in the commercialization of 2D materials, structural stability is an important criterion for assessment. The material must be highly resistant to external forces such as tension and compression, as well as to fracture, damage, and breakage. For these reasons, the discovery of new 2D materials with high elastic modulus and shear modulus holds significant industrial value.
On the other hand, using traditional experimental methods to discover such 2D materials would lead to inefficient results in terms of time and cost. This is due to the inevitable trial and error process and the uncertainty regarding the synthesis feasibility of new 2D materials. Additionally, there is a drawback of being highly influenced by changes in experimental conditions.
Korean Registered Patent No. 10-2402582 (invention name: Electronic device that generates chemical structural formulas for new drug development based on a machine learning-based chemical generation model and the operating method thereof) discloses an electronic device including a training set storage unit in which multiple chemical structural formulas represented by SMILES notation are stored as a training set; a first correction unit that selects one of the multiple chemical structural formulas as a first chemical structural formula, appends a start character to the front of the first chemical structural formula and appends an end character to the end of the first chemical structural formula to correct the first chemical structural formula; a first character pair generation unit that, upon completion of the correction of the first chemical structural formula, generates multiple first character pairs corresponding to the corrected first chemical structural formula by sequentially designating a character located at the nth position (where n is a natural number greater than or equal to 1) in the string composing the corrected first chemical structural formula as input and designating the character located at the (n+1)th position as the answer; a first vector verification unit that refers to a vector storage unit to verify embedding vectors for characters specified as inputs in the multiple first character pairs; and a first probability value generation unit that, for each of the multiple first character pairs, sequentially matches them to multiple time steps of a recurrent neural network (RNN), inputs the embedding vector for the character specified as input in each character pair to the matched time step, and calculates probability values indicating the likelihood of the character designated as the answer in each character pair being produced using predetermined activation functions for each time step matched to each character pair, based on the start character, end character, and multiple characters stored in a character storage unit.
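The character-pair generation described in the cited patent can be illustrated with a minimal Python sketch. The start and end markers ("^" and "$") and the function name are hypothetical, since the cited patent does not name the characters, and the sketch treats every character as one token, as the description above does.

```python
# Hypothetical start/end markers; the cited patent does not specify them.
START, END = "^", "$"


def make_character_pairs(smiles: str) -> list[tuple[str, str]]:
    """Return (input, answer) pairs in which the character at position n
    is the input and the character at position n + 1 is the answer."""
    corrected = START + smiles + END  # the "corrected" structural formula
    return [(corrected[n], corrected[n + 1]) for n in range(len(corrected) - 1)]


# Example: ethanol written in SMILES notation.
pairs = make_character_pairs("CCO")
```

Each pair would then be assigned to one time step of the recurrent neural network, with the first element embedded as the input and the second used as the target.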
The present invention has been conceived to solve the above problems and it is an object of the present invention to provide a technology that is capable of finding new 2D materials with high elastic modulus and shear modulus through the use of deep learning, machine learning, and high-throughput calculation methods.
The technical objects of the present invention are not limited to the aforesaid, and other objects not described herein can be clearly understood by those skilled in the art from the descriptions below.
In order to accomplish the above objects, a device of the present invention includes a virtual inorganic material generation unit generating a plurality of virtual inorganic material chemical formulas, a material classification unit receiving data on the virtual inorganic material chemical formulas from the virtual inorganic material generation unit and classifying a 2D material among the plurality of virtual inorganic material chemical formulas into a preliminary 2D material, a space group analysis unit receiving data on the preliminary 2D material from the material classification unit, predicting the space group of the preliminary 2D material, and selecting the preliminary 2D material having the same space group as an existing 2D material as a structurally similar 2D material, a similarity analysis unit receiving data on the structurally similar 2D material from the space group analysis unit and deriving the structurally similar 2D material having a similar chemical composition to the existing 2D material as a compositionally similar 2D material, and a new material generation unit receiving data on the compositionally similar 2D material from the similarity analysis unit and deriving a new 2D material by performing element substitution between chemical formulas of the compositionally similar 2D material and the existing 2D material matching the compositionally similar 2D material.
In an embodiment of the present invention, the virtual inorganic material generation unit may generate the virtual inorganic material chemical formulas using a generative adversarial network (GAN).
In an embodiment of the present invention, the material classification unit may select the preliminary 2D material using a random forest model.
In an embodiment of the present invention, the material classification unit may train the random forest model using a virtual material database as predefined material data.
In an embodiment of the present invention, the space group analysis unit may predict the space group of the preliminary 2D material using a neural network (NN).
In an embodiment of the present invention, the similarity analysis unit may analyze the chemical similarity of the existing 2D material and the structurally similar 2D material using a distance function.
In an embodiment of the present invention, the distance function may be the earth mover's distance (EMD) function.
In an embodiment of the present invention, the device may further include a property analysis unit receiving data on the new 2D material from the new material generation unit and computing thermodynamic stability and mechanical property of the new 2D material.
In an embodiment of the present invention, the property analysis unit may perform density functional theory (DFT) calculation on the new 2D material and remove the new 2D material having a negative value for the stiffness tensor among a plurality of types of new 2D materials.
In an embodiment of the present invention, the device may further include a display unit displaying information on the new 2D material received from the property analysis unit on a screen.
In order to accomplish the above objects, a method of the present invention includes generating a plurality of virtual inorganic material chemical formulas; classifying a 2D material among the plurality of virtual inorganic material chemical formulas into a preliminary 2D material using data on the virtual inorganic material chemical formulas; predicting the space group of the preliminary 2D material and selecting the preliminary 2D material having the same space group as an existing 2D material as a structurally similar 2D material; deriving the structurally similar 2D material having a similar chemical composition to the existing 2D material as a compositionally similar 2D material; and deriving a new 2D material by performing element substitution between chemical formulas of the compositionally similar 2D material and the existing 2D material matching the compositionally similar 2D material.
Hereinafter, the present invention will be described with reference to the accompanying drawings. However, the present invention may be embodied in many different forms and is not limited to the embodiments described herein. In order to clearly describe the present invention, parts irrelevant to the description may be omitted in the drawings, and similar reference numerals may be used for similar components throughout the specification.
Throughout the specification, when a part is said to be “connected (coupled, contacted, or combined)” with another part, this includes not only being “directly connected” but also being “indirectly connected” with another member in between. Also, when a part is said to “comprise” a certain component, this means that other components may be further included, rather than excluded, unless specifically stated otherwise.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used herein, the singular forms are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms “comprises” or “has,” when used in this specification, specify the presence of a stated feature, number, step, operation, component, element, or a combination thereof, but they do not preclude the presence or addition of one or more other features, numbers, steps, operations, components, elements, or combinations thereof.
Hereinafter, the present invention will be described in detail with reference to the accompanying drawings.
As shown in
Each of the virtual inorganic material generation unit 110, material classification unit 120, space group analysis unit 130, similarity analysis unit 140, and new material generation unit 150, and a subsequent property analysis unit 160 to be described later may be implemented using hardware such as a computer or central processing units (CPU), software such as an operating program, or a combination of both. In addition, one unit may be implemented using two or more hardware components, and two or more units may also be implemented using a single hardware component.
The virtual inorganic material generation unit 110 may generate virtual inorganic chemical formulas using generative adversarial networks (GAN). In detail, the virtual inorganic material generation unit 110 may use a generative deep learning model for inorganic materials (MatGAN) trained on inorganic chemical formulas, which is capable of generating over one million virtual inorganic chemical formulas.
In an embodiment of the present invention, it is possible to generate 1,450,283 virtual inorganic chemical formulas.
The MatGAN Generator is a Python-based platform that can be obtained through open-source websites, and the acquired program is stored in the virtual inorganic material generation unit 110 for use in generating a plurality of virtual inorganic chemical formulas as described above.
The material classification unit 120 may use a random forest model (RF) for the selection of preliminary 2D materials. The material classification unit 120 may also train the RF model using a virtual material database as the predefined material data.
The material classification unit 120 may screen preliminary 2D materials from the plurality of virtual inorganic materials to form a set of candidates for 2D materials, called the 2D material candidate group, and a set of materials not suitable for preliminary 2D materials, called the non-2D material group.
To accomplish this, the material classification unit 120 may first train the random forest model using an existing virtual material database, which may be obtained from the Materials Project (MP) and the 2D Materials Encyclopedia (2DMatPedia) websites.
The random forest (RF) model is effective in processing large amounts of data and robust to data imbalance. In the virtual material database, the database obtained from the Materials Project (MP) may be referred to as the first data, and the database obtained from the 2D Materials Encyclopedia (2DMatPedia) may be referred to as the second data.
Here, the first data may include the chemical formulas of materials other than 2D materials, and the second data may include the chemical formulas of 2D materials. In an embodiment, the first data may consist of 126,631 entries while the second data may include 6,351 entries. However, the numbers of entries are not necessarily limited thereto.
When the first data and the second data are input into the material classification unit 120, the material classification unit 120 may generate non-2D training data for the chemical formula of materials other than 2D materials in the first data by excluding the second data from the first data, and 2D training data using the second data as training data for 2D materials.
The material classification unit 120 may train the random forest (RF) model using both the 2D training data and the non-2D training data. Here, the number of chemical formulas in the non-2D training data may be approximately 19 times larger than the number of chemical formulas in the 2D training data.
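The construction of the two training sets and the resulting class imbalance can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, and the arithmetic at the end assumes that every entry of the second data also appears in the first data, so that exclusion reduces to a simple subtraction of the entry counts quoted above.

```python
def build_training_sets(first_data: set[str], second_data: set[str]):
    """Split chemical formulas into 2D and non-2D training sets.

    first_data:  formulas from the general materials database
    second_data: formulas from the 2D materials database
    """
    two_d = set(second_data)
    non_2d = set(first_data) - two_d  # exclude the known 2D formulas
    return two_d, non_2d


def imbalance_ratio(n_non_2d: int, n_2d: int) -> float:
    """How many non-2D training formulas exist per 2D training formula."""
    return n_non_2d / n_2d


# Entry counts quoted in the embodiment (assumes full overlap, see above).
n_first, n_second = 126_631, 6_351
ratio = imbalance_ratio(n_first - n_second, n_second)  # roughly 19
```

Under that assumption the non-2D set holds 120,280 formulas, which is consistent with the stated factor of approximately 19.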
Here, part (a) in
As shown in
Here, when the ratio of 2D to non-2D training data is 1:1, the performance metrics of precision, recall, f1-score, and accuracy for the random forest (RF) model may be analyzed to be approximately 0.850, which is the highest; thus, the material classification unit 120 may perform training using the random forest (RF) model with a ratio of 1:1 for 2D and non-2D training data.
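A minimal sketch of how a training ratio such as 1:1 may be obtained by undersampling, and how the four performance metrics are defined, is given below. The helper names are hypothetical, random undersampling is an assumption (the description does not state how the ratio is set), and the metrics are computed for the 2D class as the positive class, which is also an assumption.

```python
import random


def undersample(two_d, non_2d, ratio=1.0, seed=0):
    """Keep every 2D formula and draw ratio * len(two_d) non-2D formulas."""
    rng = random.Random(seed)
    k = min(len(non_2d), round(ratio * len(two_d)))
    return list(two_d), rng.sample(sorted(non_2d), k)


def binary_metrics(y_true, y_pred):
    """Precision, recall, f1-score and accuracy, with label 1 meaning '2D'."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    accuracy = (tp + tn) / len(y_true) if y_true else 0.0
    return {"precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy}
```

Repeating the training for several candidate ratios and comparing the returned metrics would reproduce the kind of comparison described above.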
Although the description is focused on the machine learning being performed in the material classification unit 120 by generating 2D training data and non-2D training data using the first data and the second data in an embodiment of the present invention, the performance metrics may be changed by changing the data.
The material classification unit 120 may store a performance metric threshold value (decision threshold) of 0.9 for the random forest (RF) model, and when setting the ratio of 2D to non-2D training data as described above, the ratio of 2D to non-2D training data may be adjusted to match the performance metrics of the random forest (RF) model that are close to 0.9.
As shown in
Here, part (a) of
Part (b) of
Part (c) of
The space group analysis unit 130 may predict the space group of preliminary 2D materials using an artificial neural network (NN). As shown in part (a) of
Upon receiving the data about the preliminary 2D materials, the space group analysis unit 130 may perform predictions for the space group of each preliminary 2D material included in the data using crystal structure predictions via neural network (CRYSPNet) and the data about the preliminary 2D materials.
CRYSPNet, which is an engine for use in predicting the crystal structure of inorganic material using a neural network (NN), is a tool (program) capable of predicting the Bravais lattice, space group, and lattice parameters of inorganic materials.
CRYSPNet is a platform based on Python and other programming languages, which may be obtained through open-source websites or similar sources, and once acquired, the program may be stored in the space group analysis unit 130 for use by the space group analysis unit 130 in performing space group predictions for the preliminary 2D materials.
As described above, the space group analysis unit 130 may perform space group predictions on the preliminary 2D materials and compare the predicted space groups of the preliminary 2D materials with the space groups of the existing 2D materials in the second data to match and classify the preliminary 2D materials with the same space group as the existing 2D materials.
Accordingly, based on the matching performed by the space group analysis unit 130, the preliminary 2D materials having the same space group as the existing 2D materials may be classified, and data for structurally similar 2D materials may be generated.
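The matching step can be sketched in Python as a lookup of existing 2D materials by space group. The sketch assumes the space group predictions are already available as a mapping; the function name, the placeholder material labels, and the space group numbers in the example are illustrative only and are not taken from the embodiment.

```python
from collections import defaultdict


def match_by_space_group(predicted: dict[str, int], existing: dict[str, int]):
    """Pair each preliminary 2D material with the existing 2D materials
    that share its predicted space group; unmatched materials are dropped.

    predicted: {preliminary formula: predicted space group number}
    existing:  {existing 2D formula: known space group number}
    """
    by_group = defaultdict(list)
    for formula, group in existing.items():
        by_group[group].append(formula)
    return {
        formula: list(by_group[group])
        for formula, group in predicted.items()
        if group in by_group
    }


# Placeholder labels and space group numbers, for illustration only.
matches = match_by_space_group(
    {"X1": 187, "X2": 164, "X3": 12},
    {"E1": 187, "E2": 187, "E3": 164},
)
```

The keys of the returned mapping correspond to the structurally similar 2D materials, and the values list the existing 2D materials that each one is matched with.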
In the space group analysis unit, it is possible for one structurally similar 2D material to match with one existing 2D material among multiple existing 2D materials. Here, as shown in part (b) of
Part (b) of
After the analysis in the space group analysis unit 130, the data on the structurally similar 2D materials and the data on the existing 2D materials with the same space group as the structurally similar 2D materials may be transmitted from the space group analysis unit 130 to the similarity analysis unit 140.
The similarity analysis unit 140 may analyze the chemical similarity between the existing 2D materials and the structurally similar 2D materials using a distance function. Here, the distance function may be the earth mover's distance function (EMD).
In detail, the Earth Mover's Distance (EMD) function may be used to compute the distance between the constituents of 2D materials, and an operational program called element mover distance (EIMD) may be developed using EMD for such calculations.
The similarity analysis unit 140 may utilize the data on structurally similar 2D materials, the data on existing 2D materials with the same space group as the structurally similar 2D materials, and the EIMD to derive the EIMD result values for the structurally similar 2D materials and the existing 2D materials with the same space group as the structurally similar 2D materials.
EIMD is a Python-based library used to measure the chemical similarity between two materials, and it may be obtained from open-source sites and stored in the similarity analysis unit 140, allowing the similarity analysis unit 140 to compute EIMD result values between structurally similar 2D materials and existing 2D materials with the same space group.
The similarity analysis unit 140 may derive the EIMD result value for the structurally similar 2D materials and the existing 2D materials with the same space group and may determine whether the EIMD result value is less than a threshold value.
Here, the lower the EIMD result value, the more similar the compositions of the constituent elements between the structurally similar 2D materials and the existing 2D materials with the same space group. Here, the threshold value may be 2, but is not limited thereto, and the threshold value may be changed.
The similarity analysis unit 140 may determine whether the EIMD result value between one of a plurality of structurally similar 2D materials and at least one existing 2D material with the same space group as the corresponding structurally similar 2D material is less than the threshold value.
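The idea behind an earth mover's distance between two compositions can be sketched with a one-dimensional calculation. This is not the EIMD program itself: the function names are hypothetical, and the element scale passed in as `position` is an assumption, since the actual tool defines its own ordering of the elements. The default threshold of 2 is the value given above and is only meaningful on the scale used by that tool.

```python
def emd_1d(comp_a: dict[str, float], comp_b: dict[str, float],
           position: dict[str, float]) -> float:
    """Earth mover's distance between two compositions whose elements are
    placed on a 1D scale; each composition is normalised to unit mass."""
    elements = sorted(set(comp_a) | set(comp_b), key=lambda e: position[e])
    total_a, total_b = sum(comp_a.values()), sum(comp_b.values())
    distance = 0.0
    carried = 0.0  # surplus mass of comp_a that must still move along the scale
    for left, right in zip(elements, elements[1:]):
        carried += comp_a.get(left, 0) / total_a - comp_b.get(left, 0) / total_b
        distance += abs(carried) * (position[right] - position[left])
    return distance


def is_compositionally_similar(distance: float, threshold: float = 2.0) -> bool:
    """A pair is kept when its distance is below the threshold."""
    return distance < threshold


# Toy scale with three placeholder elements one unit apart.
scale = {"A": 0.0, "B": 1.0, "C": 2.0}
d = emd_1d({"A": 1, "B": 1}, {"A": 1, "C": 1}, scale)
```

In the toy example half of the mass has to move one unit along the scale, so the distance is 0.5 and the pair would be kept under a threshold of 2.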
As shown in part (d) of
The structurally similar 2D materials included in the set classified in the above manner may be called compositionally similar 2D materials, and the existing 2D materials that match the compositionally similar 2D materials in the set may be referred to as substitutional 2D materials.
The new material generation unit 150 may receive data on compositionally similar 2D materials and substitutional 2D materials from the similarity analysis unit 140 and derive the chemical formula of a new 2D material from the compositionally similar 2D material by substituting the constituent elements of one compositionally similar 2D material and one substitutional 2D material in a 1:1 ratio.
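A one-to-one element substitution on a chemical formula can be sketched as follows. The description does not state which formula serves as the template or how the elements are paired, so both are left to the caller here: the `mapping` argument and the function names are hypothetical, and the parser only handles simple formulas without parentheses.

```python
import re


def parse_formula(formula: str) -> dict[str, int]:
    """Parse a simple formula such as 'MoS2' into {'Mo': 1, 'S': 2}."""
    counts: dict[str, int] = {}
    for symbol, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[symbol] = counts.get(symbol, 0) + int(number or 1)
    return counts


def substitute(template: str, mapping: dict[str, str]) -> str:
    """Replace elements of the template formula one-to-one, keeping the
    stoichiometry; elements missing from the mapping are left unchanged."""
    if len(set(mapping.values())) != len(mapping):
        raise ValueError("mapping must be one-to-one")
    counts: dict[str, int] = {}
    for symbol, n in parse_formula(template).items():
        new_symbol = mapping.get(symbol, symbol)
        counts[new_symbol] = counts.get(new_symbol, 0) + n
    return "".join(f"{s}{n if n > 1 else ''}" for s, n in counts.items())


# Illustration with a well-known pair of formulas: Mo is swapped for W.
new_formula = substitute("MoS2", {"Mo": "W"})
```

Requiring the mapping to be one-to-one keeps the number of distinct elements and their proportions unchanged, which is one reading of substitution "in a 1:1 ratio".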
The modeling device of the present invention may further include a property analysis unit 160 that receives data on the new 2D material from the new material generation unit 150 and computes the thermodynamic stability and mechanical properties of the new 2D material.
The property analysis unit 160 may perform density functional theory (DFT) calculations on the new 2D material to remove the new 2D material with a negative value of the stiffness tensor from a plurality of types of new 2D materials.
In detail, the property analysis unit 160 may perform density functional theory (DFT) calculations using a simulation program, and the simulation program stored in the property analysis unit 160 may be Vienna ab initio simulation package (VASP).
During the process of performing density functional theory (DFT) calculations in the property analysis unit 160, the structural stabilization of the new 2D material formed through element substitution may be performed, and the thermodynamic stability and mechanical properties of the new 2D material may be calculated.
To exclude physically unrealistic candidates, the property analysis unit 160 may derive the stiffness tensor of each new 2D material through density functional theory (DFT) calculations and selectively remove from the data the new 2D materials whose stiffness tensor has negative values.
The modeling device of the present invention may further include a display unit 170 displaying information about the new 2D materials on a screen. The display unit 170 may receive information on the new 2D material from the property analysis unit 160 and display information on the crystal structure, mechanical properties, etc. of the new 2D material on the screen in the form of images, text, and the like.
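The removal step can be sketched as a simple filter over precomputed stiffness tensors. The sketch takes the wording above literally and discards a material if any component of its stiffness tensor is negative; the function names are hypothetical, the tensors are assumed to be given as nested lists, and the numbers in the example are placeholders rather than calculated results.

```python
def has_negative_stiffness(stiffness: list[list[float]]) -> bool:
    """True if any component of the stiffness tensor is negative."""
    return any(value < 0 for row in stiffness for value in row)


def filter_by_stiffness(candidates: dict[str, list[list[float]]]):
    """Keep only the materials whose stiffness tensor has no negative value."""
    return {
        name: tensor
        for name, tensor in candidates.items()
        if not has_negative_stiffness(tensor)
    }


# Placeholder tensors (arbitrary numbers, not DFT results).
kept = filter_by_stiffness({
    "candidate_1": [[100.0, 20.0, 0.0], [20.0, 100.0, 0.0], [0.0, 0.0, 40.0]],
    "candidate_2": [[-5.0, 20.0, 0.0], [20.0, 90.0, 0.0], [0.0, 0.0, 35.0]],
})
```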
First, it is possible to generate a plurality of inorganic chemical formulas at step S110. As described above, step S110 may be performed by the virtual inorganic material generation unit 110.
Next, it is possible to classify a preliminary 2D material among a plurality of virtual inorganic materials at step S120 using data on the virtual inorganic chemical formulas. As described above, step S120 may be performed by the material classification unit 120.
In detail, the material classification unit 120 may analyze a virtual inorganic material to determine at step S120 whether the virtual inorganic material is a 2D material, classify the virtual inorganic material as a preliminary 2D material based on the virtual inorganic material being a 2D material, and remove the virtual inorganic material from the data based on the virtual inorganic material not being a 2D material.
Next, it is possible to predict the space group of the preliminary 2D material and select the preliminary 2D material as a structurally similar 2D material based on the preliminary 2D material having the same space group as an existing 2D material at step S130. As described above, step S130 may be performed by the space group analysis unit 130.
In detail, the space group analysis unit 130 compares the predicted space group of the preliminary 2D material with the space group of the existing 2D material at step S130 to classify the preliminary 2D material into a structurally similar 2D material based on the preliminary 2D material having the same space group as the existing 2D material and remove the preliminary 2D material from the data based on the preliminary 2D material not having the same space group as the existing 2D material.
Next, it is possible to derive the structurally similar 2D material as a compositionally similar 2D material based on the structurally similar 2D material having a similar chemical composition to the existing 2D material with the same space group at step S140. As described above, step S140 may be performed by the similarity analysis unit 140.
In detail, the similarity analysis unit 140 may determine at step S140 whether the existing 2D material and the structurally similar 2D material having the same space group have the chemical composition similarity, i.e., chemical similarity, to classify the structurally similar 2D material into a compositionally similar 2D material based on the structurally similar 2D material and the existing 2D material being chemically similar and remove the structurally similar 2D material from the data based on the structurally similar 2D material and the existing 2D material being not similar chemically.
Next, it is possible to derive a new 2D material at step S150 by performing element substitution between the chemical formulas of the compositionally similar 2D material and the existing 2D material. As described above, step S150 may be performed by the new material generation unit 150.
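The sequence of steps S110 to S150 can be summarised as a single filtering pipeline. The sketch below is illustrative only: every callable passed in (`generate`, `is_2d`, `predict_space_group`, `distance`, `substitute`) is a hypothetical stand-in for the corresponding unit, and the default threshold of 2 is the example value given for the similarity analysis.

```python
from typing import Callable, Iterable


def discover_new_2d_materials(
    generate: Callable[[], Iterable[str]],
    is_2d: Callable[[str], bool],
    predict_space_group: Callable[[str], int],
    existing: dict[str, int],
    distance: Callable[[str, str], float],
    substitute: Callable[[str, str], str],
    threshold: float = 2.0,
) -> list[str]:
    """Run steps S110 to S150 and return the derived new 2D materials."""
    new_materials = []
    for formula in generate():                      # S110: virtual formulas
        if not is_2d(formula):                      # S120: keep preliminary 2D
            continue
        group = predict_space_group(formula)        # S130: space group match
        for reference, reference_group in existing.items():
            if reference_group != group:
                continue
            if distance(formula, reference) < threshold:             # S140
                new_materials.append(substitute(formula, reference))  # S150
    return new_materials
```

A candidate is discarded as soon as it fails one step, so the more expensive later steps only run on the materials that survive the earlier ones.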
The remaining detailed aspects of the modeling method of the present invention are the same as those described in the modeling device of the present invention.
In detail, parts (a) and (b) of
As shown in
As described above, the modeling device and method of the present invention make it possible to generate information about new 2D materials with high structural stability and facilitate the process of deriving such information about new 2D materials.
Additionally, the modeling device and method of the present invention are capable of overcoming the limitations of traditional experimental synthesis methods and property exploration based on post-synthesis analysis and implementing new frameworks for the development of 2D materials, which allows for the rapid exploration of various 2D materials with different characteristics, making it easier to search for materials suitable for commercialization and large-scale production.
Furthermore, the modeling device and method of the present invention can provide structurally stable 2D materials with an average elastic modulus of approximately 700 N/m, indicating their synthesis feasibility.
The present invention with the above configuration is advantageous in terms of being able to generate information on new 2D materials with high structural stability and quickly retrieve information on such new 2D materials.
The present invention is also advantageous in terms of implementing a new 2D material development framework that is capable of overcoming the disadvantages of conventional experimental synthesis methods and property exploration research based on property analysis after synthesis and facilitating search for materials suitable for commercialization and mass production by allowing for quick exploration of 2D materials with various properties.
The present invention is also advantageous in terms of providing structurally stable 2D materials with an average elastic modulus of around 700 N/m and exhibiting synthesis feasibility.
It should be understood that the advantages of the present invention are not limited to the aforesaid but include all advantages that can be inferred from the detailed description of the present invention or the configuration specified in the claims.
The above description of the present invention is for illustrative purposes only, and it will be understood by those skilled in the art that various modifications and changes may be made thereto without departing from the spirit and scope of the invention. Therefore, it should be understood that the embodiments described above are exemplary and not limited in all respects. For example, each component described as a single type may be implemented in a distributed manner, and similarly, components described as distributed may be implemented in a combined form.
The scope of the invention should be determined by the appended claims, and all changes or modifications derived from the meaning and scope of the claims and equivalent concepts thereof should be construed as being included in the scope of the present invention.
Number | Date | Country | Kind |
---|---|---|---|
10-2023-0022659 | Feb 2023 | KR | national |