The present disclosure claims priority to Chinese Patent Application No. 202110543978.2, filed with the Chinese Patent Office on May 18, 2021, and titled “Information processing method and device”, the disclosure of which is incorporated herein by reference in its entirety.
The present disclosure relates to the field of artificial intelligence, and in particular, to a graph neural network in the field of deep learning. More specifically, the present disclosure relates to an information processing method and device, an electronic device, a computer readable storage medium, and a computer program product.
In the fields of computational biology and computational chemistry, effective molecular characterization is essential for the understanding and accurate prediction of various biochemical properties. A network structure diagram composed of multiple types of atomic interactions can be used to characterize molecules.
The present disclosure provides an information processing method and device, a device and a storage medium.
According to a first aspect of the present disclosure, an information processing method is provided. The method includes determining an initial representation of edges connected between multiple atoms in a molecule based on three-dimensional structure information of the molecule. The method further includes determining a first representation of a neighbor edge of each of the atoms based on the initial representation of the edges, the neighbor edge of each of the atoms indicating at least one edge connected with each of the atoms. The method further includes determining a first representation of each of the atoms based on the first representation of the neighbor edge of each of the atoms. The method further includes determining a feature representation for characterizing the molecule based on the first representation of each of the atoms.
According to a second aspect of the present disclosure, an information processing device is provided. The device includes an initial representation determination module. The initial representation determination module is configured to determine initial representation of edges connected between multiple atoms in a molecule based on three-dimensional structure information of the molecule. The device further includes an edge determination module. The edge determination module is configured to determine first representation of a neighbor edge of each of the atoms based on the initial representation of the edges. The neighbor edge of each of the atoms indicates at least one edge connected with each of the atoms. The device further includes an atom determination module. The atom determination module is configured to determine first representation of each of the atoms based on the first representation of the neighbor edge of each of the atoms. The device further includes a characterization module. The characterization module is configured to determine feature representation for characterizing the molecule based on the first representation of each of the atoms.
According to a third aspect of the present disclosure, an electronic device is provided. The electronic device includes at least one processor and a memory in communication connection with the at least one processor. The memory stores an instruction executable by the at least one processor. The instruction, when executed by the at least one processor, causes the at least one processor to perform the method according to the first aspect of the present disclosure.
According to a fourth aspect of the present disclosure, a non-transitory storage medium storing a computer instruction is provided. The computer instruction is used for a computer to perform the method according to the first aspect of the present disclosure.
According to a fifth aspect of the present disclosure, a computer program product is provided. The computer program product includes a computer program, and the method according to the first aspect of the present disclosure is implemented when the computer program is executed by a processor.
It is to be understood that, the content described in this section is not intended to identify the key or important features of the embodiments of the present disclosure, nor is it intended to limit the scope of the present disclosure. Other features of the present disclosure will become easy to understand through the following description.
The drawings are used to better understand the solution and are not intended to limit the present disclosure.
Exemplary embodiments of the present disclosure are described in detail below with reference to the drawings, including various details of the embodiments of the present disclosure to facilitate understanding, and should be regarded as merely exemplary. Thus, those of ordinary skill in the art shall understand that variations and modifications can be made to the embodiments described herein without departing from the scope and spirit of the present disclosure. Likewise, for clarity and conciseness, descriptions of well-known functions and structures are omitted in the following description.
As described above, in the fields of computational biology and computational chemistry, effective molecular characterization is essential for the understanding and accurate prediction of various biochemical properties. A molecule is essentially a network structure diagram composed of multiple types of atomic interactions. In addition to topological structure information, the network structure diagram of the molecule also includes key spatial structure information, for example, an angle and distance between atoms forming the molecule. At present, there is still a need for a method that can better characterize the three-dimensional structure information of the molecule.
According to an embodiment of the present disclosure, a solution for characterizing molecules is provided. In the solution, an initial representation of edges connected between multiple atoms in a molecule is determined based on three-dimensional structure information of the molecule. A first representation of a neighbor edge of each of the atoms is then determined based on the initial representation of the edges. The neighbor edge of each of the atoms indicates at least one edge connected with each of the atoms. A first representation of each of the atoms is determined based on the first representation of the neighbor edge of each of the atoms. Finally, a feature representation for characterizing the molecule is determined based on the first representation of each of the atoms. In this way, the representations of the atoms and the edges can be interactively generated to integrate the information of the neighbor atoms so as to better characterize the molecule.
In some embodiments, the pre-processing module 110 may receive the three-dimensional structure information 101 of the molecule. The three-dimensional structure information 101 of the molecule may include types and space distribution of the atoms forming the molecule. Additionally, the three-dimensional structure information 101 of the molecule may further include the type, physical and chemical properties, name, and the like of the molecule. The three-dimensional structure information 101 of the molecule may be in a form of a molecular diagram, or in a form that may represent the three-dimensional structure information 101 of the molecule. The pre-processing module 110 determines the initial representation of the atoms in the molecule based on the three-dimensional structure information 101 of the molecule. The initial representation of the atoms may be initial vector representation generated based on the properties of the atoms, space distribution, the properties of the molecule and other information. The initial representation of the atoms may be determined by utilizing multiple methods, which is not limited by the scope of the present disclosure.
The pre-processing module 110 may also determine the space distribution of the atoms based on the three-dimensional structure information 101 of the molecule. The pre-processing module 110 may construct an edge between any two atoms whose distance from each other is less than a threshold distance (for example, 3 angstroms). The pre-processing module 110 may determine the characterization of the distance between the atoms connected by the edge based on the three-dimensional structure information 101 of the molecule. In some embodiments, the characterization of the distance may be obtained by vectorizing the distance between the atoms. For example, the distance between the atoms may be discretized to obtain a one-hot encoding of the distance. Based on the one-hot encoding of the distance, the characterization of the distance may be obtained.
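The edge construction and distance encoding described above can be illustrated with a minimal Python sketch. The function names `build_edges` and `one_hot_distance`, the equal-width binning, and the bin count of 6 are assumptions made for illustration only; the disclosure states only that a threshold distance (for example, 3 angstroms) and a one-hot encoding of the discretized distance may be used.

```python
import math
from itertools import combinations

def build_edges(coords, threshold=3.0):
    """Return (i, j, distance) for every atom pair closer than `threshold` angstroms."""
    edges = []
    for i, j in combinations(range(len(coords)), 2):
        d = math.dist(coords[i], coords[j])
        if d < threshold:
            edges.append((i, j, d))
    return edges

def one_hot_distance(d, threshold=3.0, num_bins=6):
    """Discretize a distance in [0, threshold) into equal-width bins (an assumed scheme)."""
    width = threshold / num_bins
    idx = min(int(d / width), num_bins - 1)  # clamp so d == threshold cannot overflow
    return [1.0 if b == idx else 0.0 for b in range(num_bins)]

# Hypothetical coordinates in angstroms; the last atom is too far away to form an edge.
coords = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.5, 0.0), (5.0, 5.0, 5.0)]
edges = build_edges(coords)
distance_features = [one_hot_distance(d) for _, _, d in edges]
```

A learned embedding of the bin index could replace the raw one-hot vector; the disclosure leaves this step open.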
The pre-processing module 110 may also determine an included angle between neighbor edges connected with the same atom based on the three-dimensional structure information 101 of the molecule. In some embodiments, a polar coordinate system may be utilized to represent the three-dimensional structure information 101 of the molecule. In this case, the included angle between the neighbor edges may be calculated more easily. For example, a first edge connected with a first atom may be taken as a polar axis, and the first atom is taken as a pole. The pre-processing module 110 may determine an included angle between each of the other edges except the first edge in the neighbor edge connected with the first atom and the first edge to obtain multiple included angles. In some embodiments, the included angle may be represented by using (θ, φ), and θ and φ may be in a range of 0 to 180°.
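As a rough illustration of the included-angle computation, the following sketch uses Cartesian coordinates and a dot product rather than the polar coordinate system mentioned above; that substitution is made only to keep the example short. The function names and the coordinates are hypothetical, and the atoms are assumed to be at distinct positions so that no edge has zero length.

```python
import math

def included_angle(center, end_a, end_b):
    """Angle in degrees between the edges center->end_a and center->end_b."""
    va = [a - c for a, c in zip(end_a, center)]
    vb = [b - c for b, c in zip(end_b, center)]
    dot = sum(x * y for x, y in zip(va, vb))
    norm_a = math.sqrt(sum(x * x for x in va))
    norm_b = math.sqrt(sum(x * x for x in vb))
    # Clamp to [-1, 1] to guard against floating-point drift before acos.
    cos_angle = max(-1.0, min(1.0, dot / (norm_a * norm_b)))
    return math.degrees(math.acos(cos_angle))

def angles_to_first_edge(center, first_end, other_ends):
    """Included angles between the first edge and each other edge sharing `center`."""
    return [included_angle(center, first_end, end) for end in other_ends]

# Hypothetical first atom at the origin with three neighbor atoms.
angles = angles_to_first_edge(
    (0.0, 0.0, 0.0), (1.0, 0.0, 0.0), [(0.0, 1.0, 0.0), (-1.0, 0.0, 0.0)]
)
```

The result always lies in the range of 0 to 180°, matching the range given above for the included angle.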
The pre-processing module 110 may input the initial representation of the atoms, the characterization of the distance between the atoms and the multiple included angles determined based on the three-dimensional structure information 101 of the molecule into the graph neural network module 120. The graph neural network module 120 may be a graph neural network that outputs the feature representation of the molecule based on the above input data.
In some embodiments, the atom-edge determination module 131 may determine the initial representation of the edges connecting the atoms based on the initial representation of the atoms and the characterization of the distance between the atoms. The initial representation of the edges may be one-dimensional vector representation. In some embodiments, the initial representation of the atoms connected by the edge and the characterization of the distance may be concatenated to determine the initial representation of the edges. Alternatively, the average value of the initial representation of the connected atoms and the characterization of the distance may be concatenated to determine the initial representation of the edges.
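The two concatenation options described above may be sketched as follows. The function names and the toy vectors are hypothetical, and plain Python lists stand in for whatever vector type an implementation would use.

```python
def edge_init_concat(atom_i, atom_j, dist_feat):
    """Concatenate both atom representations with the distance characterization."""
    return list(atom_i) + list(atom_j) + list(dist_feat)

def edge_init_mean(atom_i, atom_j, dist_feat):
    """Concatenate the element-wise average of the two atom representations
    with the distance characterization."""
    mean = [(x + y) / 2 for x, y in zip(atom_i, atom_j)]
    return mean + list(dist_feat)

# Hypothetical 2-dimensional atom representations and a 3-bin distance encoding.
a_i, a_j, d_ij = [1.0, 3.0], [3.0, 5.0], [0.0, 1.0, 0.0]
e_concat = edge_init_concat(a_i, a_j, d_ij)
e_mean = edge_init_mean(a_i, a_j, d_ij)
```

The averaged variant does not depend on the order of the two atoms, whereas the plain concatenation does. This may matter if the edges are treated as undirected.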
The atom-edge determination module 131 also determines the first representation of the neighbor edge based on the initial representation of the neighbor edge. A combination process for the edges will be described in detail below with reference to
In some embodiments, the atom-edge determination module 131 may divide the other edges eki into different angle domains based on the multiple included angles between the neighbor edges. For example, a formula (1) may be utilized to calculate indexes Indki of the angle domains at which the other edges eki are located.
Wherein, DA represents an angle domain divider, ┌·┐ represents a rounding symbol, ϕkij ∈ [0, 180°] represents an included angle between the edges eki and eij, and N represents the quantity of the angle domains. As shown in
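The assignment of an edge to an angle domain can be sketched as below. Formula (1) is not reproduced in this text, so the sketch assumes that the angle domain divider equals 180° divided by the quantity N of angle domains and that the index is the ceiling of the included angle divided by that divider. Both choices, and the function name `angle_domain_index`, are assumptions.

```python
import math

def angle_domain_index(angle_deg, num_domains):
    """Return a 1-based angle-domain index for an included angle in [0, 180] degrees.

    Assumes equal-width domains of 180 / num_domains degrees and a ceiling-based
    index; an angle of exactly 0 is placed in the first domain.
    """
    if not 0.0 <= angle_deg <= 180.0:
        raise ValueError("included angle must lie in [0, 180] degrees")
    divider = 180.0 / num_domains
    return max(1, math.ceil(angle_deg / divider))

# With six domains, each domain spans 30 degrees.
indices = [angle_domain_index(a, 6) for a in (0.0, 10.0, 30.0, 31.0, 180.0)]
```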
In some embodiments, the atom-edge determination module 131 may determine the attention weight of the other edges eki to the first edge eij in each of the angle domains. For example, the attention weight of the other edge e1i to the first edge eij in the angle domain 201 may be determined; the attention weight of the other edges e2i and e3i to the first edge eij in the angle domain 202 may be determined; and the attention weight of the other edge eki to the first edge eij in the angle domain 203 may be determined. Formulas (2) to (3) may be utilized to calculate the attention weight of the other edges eki to the first edge eij in the angle domain q.
A function attnql may calculate an importance factor of the neighbor edge eki to eij in a layer l. In the calculation of the atom-edge determination module 131, the layer l is the first layer 151. As shown in the formula (2), the importance factor may be calculated by concatenating eki and eij. we,q(l), be,q(l), and ul,qT are trainable parameter matrices. αki,q(l) represents the attention weight of the neighbor edges eki in the specific angle domain q. As shown in
In some embodiments, based on the attention weight αki,q(l) of the other edges eki to the first edge eij in each of the angle domains q, the atom-edge determination module 131 may determine weighted initial representation for each of the angle domains q through weighted summation of the initial representation of the other edges in each of the angle domains q. For example, a formula (4) may be utilized to calculate the weighted initial representation mij,q(l) for the angle domain q.
The atom-edge determination module 131 may also determine the characterization of the combined first edge eij, that is, the first representation of the first edge eij, by concatenating the weighted initial representation mij,q(l) for each of the angle domains q. For example, a formula (5) may be utilized to calculate the first representation of the first edge eij through concatenating.
$$e_{ij}^{(l)} = \left[\, m_{ij,1}^{(l)} \,\|\, m_{ij,2}^{(l)} \,\|\, \dots \,\|\, m_{ij,N}^{(l)} \,\right] \quad (5)$$
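The per-domain aggregation and the concatenation of formula (5) can be sketched as follows. Formulas (2) to (4) are not reproduced in this text, so the sketch takes the importance factor as a caller-supplied function `score_fn`, normalizes it with a softmax inside each angle domain, and fills an empty domain with zeros. All three choices, together with the function names, are assumptions rather than the disclosed computation.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def combine_edge(e_ij, neighbors, num_domains, score_fn):
    """Concatenate one attention-weighted sum per angle domain.

    neighbors: list of (domain_index, e_ki) with domain_index in 1..num_domains.
    score_fn:  hypothetical importance factor taking (e_ki, e_ij).
    """
    dim = len(e_ij)
    parts = []
    for q in range(1, num_domains + 1):
        members = [e for idx, e in neighbors if idx == q]
        if not members:
            parts.extend([0.0] * dim)  # assumption: empty domains contribute zeros
            continue
        alphas = softmax([score_fn(e, e_ij) for e in members])
        parts.extend(
            sum(a * e[d] for a, e in zip(alphas, members)) for d in range(dim)
        )
    return parts

# Hypothetical 2-dimensional edge representations spread over three angle domains.
dot = lambda u, v: sum(x * y for x, y in zip(u, v))
neighbors = [(1, [1.0, 1.0]), (2, [2.0, 0.0]), (2, [0.0, 2.0])]
first_rep = combine_edge([1.0, 0.0], neighbors, 3, dot)
```

The output length is N times the edge dimension, so a practical layer would likely project it back to a fixed size before the next step.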
Similarly, the atom-edge determination module 131 may determine first representation of all edges in the molecule. In this way, the information of the neighbor edges of each of the atoms may be combined with the information of the edge connected with the atom, so that the first representation of the edge may better characterize the edge and a surrounding molecular structure, thereby better characterizing the molecule.
Further referring to
In some embodiments, the edge-atom determination module 141 may determine a distance between the neighbor edges eij, e1i, e2i, e3i, and e4i and the first atom ai. The distance between the neighbor edge and the first atom ai may be a distance between a second atom (the atoms a1, a2, a3, and a4 shown in
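$$w_{ki}^{(l)} = \mathrm{LeakyRelu}\left( v_l^{T} \cdot \left[\, \bar{e}_{ki}^{(l)} \,\|\, \bar{a}_{i}^{(l)} \,\|\, W_d^{(l)} d_{ki} \,\right] \right) \quad (6)$$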
A function LeakyRelu may calculate an importance factor of the neighbor edge eki to ai in a layer l. In the calculation of the edge-atom determination module 141, the layer l is the first layer 151. As shown in a formula (6), the importance factor wki(l) may be calculated by concatenating ēki(l), āi(l) and Wd(l)dki. ēki(l) and āi(l) are respectively the converted first representation of the neighbor edge eki and the converted initial representation of the first atom ai. The conversion maps the first representation of the neighbor edge eki and the initial representation of the first atom ai into the same feature space, so that the subsequent concatenating operation can be performed. Wd(l) and vlT are trainable parameter matrices.
βki(l) represents the attention weight of the neighbor edge eki to the first atom ai. As shown in a formula (7), the importance factor wki(l) may be normalized by using the softmax function to obtain βki(l). Based on the attention weight βki(l) of the neighbor edge eki to the first atom ai, the edge-atom determination module 141 may determine the first representation of the first atom ai by determining a weighted average of the first representation of the neighbor edge.
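A minimal sketch of the distance-aware attention of formulas (6) and (7) follows. It assumes that the edge and atom representations have already been converted to the same feature space, uses a LeakyReLU negative slope of 0.2, and represents the trainable parameters as plain Python lists. The function names and all numeric values are hypothetical.

```python
import math

def leaky_relu(x, slope=0.2):
    """LeakyReLU with an assumed negative slope."""
    return x if x > 0 else slope * x

def softmax(scores):
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def matvec(matrix, vec):
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]

def atom_from_edges(edge_reps, atom_rep, dist_feats, v, w_d):
    """Score each neighbor edge as in formula (6), normalize with softmax as in
    formula (7), and return the weighted average of the edge representations."""
    scores = []
    for e_ki, d_ki in zip(edge_reps, dist_feats):
        z = list(e_ki) + list(atom_rep) + matvec(w_d, d_ki)  # concatenation
        scores.append(leaky_relu(sum(vi * zi for vi, zi in zip(v, z))))
    betas = softmax(scores)
    dim = len(edge_reps[0])
    atom = [sum(b * e[k] for b, e in zip(betas, edge_reps)) for k in range(dim)]
    return atom, betas

# Hypothetical inputs: two neighbor edges, 2-dimensional features throughout.
edge_reps = [[2.0, 0.0], [0.0, 2.0]]
atom_rep = [1.0, 1.0]
dist_feats = [[1.0, 0.0], [0.0, 1.0]]
w_d = [[1.0, 0.0], [0.0, 1.0]]
v = [0.5, -0.5, 0.0, 0.0, 1.0, 0.0]
atom, betas = atom_from_edges(edge_reps, atom_rep, dist_feats, v, w_d)
```

Because the softmax weights sum to 1, the weighted sum is a weighted average of the neighbor-edge representations.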
Additionally, the edge-atom determination module 141 may calculate the attention weight of the neighbor edge eki to the first atom ai multiple times by using a multi-head attention algorithm. In this case, a formula (8) may be utilized to calculate the weighted average of the first representation of the neighbor edge, so as to determine the first representation of the first atom ai.
In the formula (8), C represents the quantity of attention heads.
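The multi-head variant can be sketched as below. Formula (8) is not reproduced in this text, so the sketch assumes that the per-head weighted sums are averaged over the C heads; other combinations, such as concatenating the heads, are equally plausible. The function name and inputs are hypothetical.

```python
def multi_head_atom(edge_reps, head_weights):
    """Average, over C attention heads, the weighted sum of neighbor-edge representations.

    head_weights: one list of attention weights per head, each aligned with edge_reps.
    """
    num_heads = len(head_weights)
    dim = len(edge_reps[0])
    out = [0.0] * dim
    for betas in head_weights:
        for beta, e_ki in zip(betas, edge_reps):
            for k in range(dim):
                out[k] += beta * e_ki[k] / num_heads
    return out

# Hypothetical example with two heads that each attend to a different edge.
edge_reps = [[2.0, 0.0], [0.0, 2.0]]
atom_rep = multi_head_atom(edge_reps, [[1.0, 0.0], [0.0, 1.0]])
```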
Similarly, the edge-atom determination module 141 may determine the first representation of all atoms in the molecule. In this way, by combining the information of the neighbor edge of each of the atoms into the first representation of the atom, the first representation of the atom may better characterize the atom and a surrounding molecular structure.
Further referring to
Similarly, the atom-edge determination module 132 may determine second representation of the neighbor edge of each of the atoms based on the first representation of each of the atoms. For example, the atom-edge determination module 132 may determine second representation of the edges by concatenating the first representation of the atoms connected by the neighbor edges and the characterization of the distance. The atom-edge determination module 132 may determine third representation of the neighbor edges based on the second representation of the neighbor edges. For example, the information of the neighbor edges may be transmitted into the third representation of a target edge in the neighbor edges based on a combination of angles. The edge-atom determination module 142 may determine second representation of the first atom based on the third representation of the neighbor edges of the first atom. For example, the information of the neighbor edges and neighbor atoms may be transmitted into the second representation of the atoms based on a combination of distances. Additionally, the graph neural network module 120 may also utilize the follow-up iteration in other layers to determine final representation of the atoms and the edges. In this way, the representation of the atoms and the edges can be interactively generated. The space structure information of the atoms is integrated based on the combination of the angles and the distances, so that the molecule is better characterized.
Further referring to
In some embodiments, the system 100 or the graph neural network module 120 may be trained based on a downstream task of molecular characterization. Different loss functions may be selected for different downstream tasks, so as to train the system 100 or the graph neural network module 120. For example, the L1 loss function may be selected when molecular property prediction is performed. A cross entropy function may be selected when binary-classification DTI property prediction is performed. The scope of the present disclosure is not limited in this respect.
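The two loss functions named above can be written out as a short sketch. Averaging over the samples and clamping the probabilities with a small `eps` are assumptions made for the example, and the function names are hypothetical.

```python
import math

def l1_loss(predictions, targets):
    """Mean absolute error, for example for regression-style property prediction."""
    return sum(abs(p - t) for p, t in zip(predictions, targets)) / len(predictions)

def binary_cross_entropy(probabilities, labels, eps=1e-12):
    """Mean binary cross entropy for predicted probabilities and 0/1 labels."""
    total = 0.0
    for p, y in zip(probabilities, labels):
        p = min(max(p, eps), 1.0 - eps)  # avoid log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(probabilities)

regression_loss = l1_loss([1.0, 2.0], [2.0, 4.0])
classification_loss = binary_cross_entropy([0.9, 0.2], [1, 0])
```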
In some embodiments, determining the first representation of the neighbor edge of each of the atoms includes: determining an included angle between a first edge in the neighbor edge of a first atom in the multiple atoms and each of the other edges except the first edge in the neighbor edge, to obtain multiple included angles; and determining the first representation of the first edge based on the multiple included angles and initial representation of the other edges.
In some embodiments, determining the first representation of the first edge includes: dividing the other edges into different angle domains based on the multiple included angles; determining, based on attention weight of the other edges in each of the angle domains to the first edge, weighted initial representation for each of the angle domains through weighted summation of the initial representation of the other edges in each of the angle domains; and concatenating the weighted initial representation for each of the angle domains as the first representation of the first edge.
At 403, first representation of each of the atoms is determined based on the first representation of the neighbor edge of each of the atoms. In some embodiments, determining the first representation of each of the atoms includes: determining a distance between the neighbor edge of the first atom in the multiple atoms and the first atom, the distance between the neighbor edge and the first atom indicating a distance between a second atom connected with the neighbor edge and the first atom; determining the attention weight of the neighbor edge of the first atom to the first atom based on the distance; and determining, based on the attention weight, a weighted average of the first representation of the neighbor edge of the first atom as first representation of the first atom.
At 404, feature representation for characterizing the molecule is determined based on the first representation of each of the atoms. In some embodiments, determining the feature representation characterizing the molecule includes: determining second representation of the edge based on the first representation of each of the atoms; determining third representation of the neighbor edge of each of the atoms based on the second representation of the edge; determining second representation of each of the atoms based on the third representation of the neighbor edge of each of the atoms; and determining the feature representation characterizing the molecule based on the second representation of each of the atoms.
In some embodiments, a polar coordinate system is utilized to represent the three-dimensional structure information of the molecule.
In some embodiments, the edge determination module 504 includes: an included angle determination sub-module, configured to determine an included angle between each of the other edges except a first edge in the neighbor edge of a first atom in the multiple atoms and the first edge to obtain multiple included angles; and a combination sub-module, configured to determine the first representation of the first edge based on the multiple included angles and initial representation of the other edges.
In some embodiments, the combination sub-module includes: a division sub-module, configured to divide the other edges into different angle domains based on the multiple included angles; a summation sub-module, configured to determine weighted initial representation for each of the angle domains through weighted summation of the initial representation of the other edges in each of the angle domains based on attention weight of the other edges in each of the angle domains to the first edge; and a concatenating sub-module, configured to concatenate the weighted initial representation for each of the angle domains as the first representation of the first edge.
In some embodiments, the atom determination module 506 includes: a distance determination sub-module, configured to determine a distance between the neighbor edge of the first atom in the multiple atoms and the first atom, the distance between the neighbor edge and the first atom indicating a distance between a second atom connected with the neighbor edge and the first atom; a weight determination sub-module, configured to determine the attention weight of the neighbor edge of the first atom to the first atom based on the distance; and a weighted average sub-module, configured to determine a weighted average of the first representation of the neighbor edge of the first atom based on the attention weight as first representation of the first atom.
In some embodiments, the characterization module 508 includes: a second edge determination sub-module, configured to determine second representation of the edges based on the first representation of each of the atoms; a second edge determination sub-module, configured to determine third representation of the neighbor edge of each of the atoms based on the second representation of the edges; a second atom determination sub-module, configured to determine second representation of each of the atoms based on the third representation of the neighbor edge of each of the atoms; and a second characterization sub-module, configured to determine the feature representation for characterizing the molecule based on the second representation of each of the atoms.
In some embodiments, a polar coordinate system is utilized to represent the three-dimensional structure information of the molecule.
According to an embodiment of the present disclosure, the present disclosure further provides an electronic device, a readable storage medium, and a computer program product.
As shown in
Multiple components in the device 600 are connected with the I/O interface 605, and include: an input unit 606, such as a keyboard and a mouse; an output unit 607, such as various types of displays and loudspeakers; the storage unit 608, such as a disk and an optical disc; and a communication unit 609, such as a network card, a modem, and a wireless communication transceiver. The communication unit 609 allows the device 600 to exchange information/data with other devices through a computer network, such as the Internet, and/or various telecommunication networks.
The computing unit 601 may be various general and/or special processing assemblies with processing and computing capabilities. Some examples of the computing unit 601 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various dedicated Artificial Intelligence (AI) computing chips, various computing units for running machine learning model algorithms, a Digital Signal Processor (DSP), and any appropriate processors, controllers, microcontrollers, etc. The computing unit 601 performs the various methods and processing operations described above, for example, the method 400. For example, in some embodiments, the method 400 may be implemented as a computer software program, which is tangibly included in a machine-readable medium, such as the storage unit 608. In some embodiments, part or all of the computer programs may be loaded and/or installed on the device 600 via the ROM 602 and/or the communication unit 609. When the computer program is loaded into the RAM 603 and performed by the computing unit 601, one or more steps of the method 400 described above may be performed. Alternatively, in other embodiments, the computing unit 601 may be configured to perform the method 400 in any other suitable manners (for example, by means of firmware).
The various implementations of systems and technologies described herein may be implemented in a digital electronic circuit system, an integrated circuit system, a Field Programmable Gate Array (FPGA), an Application-Specific Integrated Circuit (ASIC), an Application-Specific Standard Product (ASSP), a System-On-Chip (SOC), a Complex Programmable Logic Device (CPLD), computer hardware, firmware, software, and/or a combination thereof. These various implementations may include implementation in one or more computer programs that can be executed and/or interpreted on a programmable system including at least one programmable processor. The programmable processor may be a dedicated or general programmable processor, which can receive data and instructions from a storage system, at least one input device, and at least one output device, and transmit the data and instructions to the storage system, the at least one input device, and the at least one output device.
Program code used to implement the method of the present disclosure can be written in any combination of one or more programming languages. The program code can be provided to a processor or controller of a general-purpose computer, a special-purpose computer, or another programmable data processing device, so that, when the program code is executed by the processor or controller, the functions/operations specified in the flowcharts and/or block diagrams are implemented. The program code can be executed entirely on a machine, partly on the machine, partly on the machine and partly on a remote machine as an independent software package, or entirely on the remote machine or a server.
In the context of the present disclosure, a machine-readable medium may be a tangible medium, which may include or store a program for use by an instruction execution system, device, or apparatus, or for use in combination with the instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. The machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, device, or apparatus, or any suitable combination of the foregoing. More specific examples of the machine-readable storage medium may include electrical connections based on one or more wires, a portable computer disk, a hard disk, a RAM, a ROM, an Erasable Programmable Read-Only Memory (EPROM or flash memory), an optical fiber, a portable Compact Disk Read-Only Memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above.
In order to provide interaction with a user, the system and technologies described herein can be implemented on a computer that includes a display device for displaying information to the user (for example, a Cathode Ray Tube (CRT) or a Liquid Crystal Display (LCD) monitor), as well as a keyboard and a pointing device (for example, a mouse or a trackball). The user can provide an input to the computer by using the keyboard and the pointing device. Other types of devices may also be configured to provide interaction with the user. For example, the feedback provided to the user may be any form of sensory feedback (such as visual feedback, auditory feedback, or tactile feedback), and the input from the user may be received in any form (including acoustic input, voice input, or tactile input).
The system and technologies described herein may be implemented in a computing system including a back-end component (for example, as a data server), or a computing system including a middleware component (for example, an application server), or a computing system including a front-end component (for example, a user computer with a graphical user interface or network browser, through which the user may interact with implementations of the system and technologies described herein), or a computing system including any combination of the back-end component, the middleware component, or the front-end component. The components of the system can be connected with each other through any form or medium of digital data communication (for example, a communication network). Examples of the communication network include a Local Area Network (LAN), a Wide Area Network (WAN), and the Internet.
The computer system may include a client and a server. The client and the server are generally far away from each other and usually interact by means of the communication network. The relationship between the client and the server arises from computer programs that run on the respective computers and have a client-server relationship with each other.
It is to be understood that the various forms of flows shown above may be used, with steps reordered, added, or deleted. For example, the steps described in the present disclosure may be performed in parallel, sequentially, or in a different order, as long as the desired results of the technical solutions disclosed in the present disclosure can be achieved, which is not limited herein.
The foregoing specific implementations do not constitute limitations on the protection scope of the present disclosure. Those skilled in the art should understand that, various modifications, combinations, sub-combinations and substitutions can be made according to design requirements and other factors. Any modifications, equivalent replacements, improvements and the like made within the spirit and principle of the present disclosure shall fall within the scope of protection of the present disclosure.
Number | Date | Country | Kind |
---|---|---|---|
202110543978.2 | May 2021 | CN | national |