The invention pertains to pre-processing for molecular force field energy calculation, and in particular to a method of atomic sequence rearrangement.
Before the force field energy of a structure is calculated, its atomic sequence must be rearranged. The current approach rearranges the atoms through a topological comparison performed with the graph theory tool networkx. Because the topological comparison contains only the 2D information of the structure, the 3D features of the structure may cause the rearranged atomic sequence to be wrong. Manual inspection is therefore still required after the topological rearrangement, which is inefficient.
For example, rearrange the atomic sequence of a structure which contains a symmetric aliphatic ring, as shown in
Based on this, it is necessary to provide an atomic sequence rearrangement method that can improve the accuracy of atomic sequence rearrangement.
A method of atomic sequence rearrangement, including:
Topological rearrangement: the atomic sequence of the target structure is rearranged referring to the reference structure using the two-dimensional topological rearrangement method.
Judgment of equivalent atoms: judge the equivalent atoms in the topological structure.
Measuring and marking: mark the atomic sequence chiral information of the rearranged structure and the reference structure.
Second rearrangement: the atomic sequence of the rearranged structure is rearranged a second time with reference to the reference structure.
In a preferred embodiment, the measuring and marking step comprises: marking the atomic sequence chiral information of the rearranged structure and the reference structure according to the method for measuring and marking atomic sequence chirality.
In a preferred embodiment, the method for measuring and marking atomic sequence chirality is: taking the central atom as the starting point, the dihedral angle of the atoms connected to the central atom is taken in a clockwise direction, and the atoms taken must include the equivalent atoms. If the dihedral angle is greater than 0, the two topologically consistent atoms are marked True and False, respectively, in the order in which the atoms were taken. If the dihedral angle is less than 0, they are marked False and True, respectively, in the same order.
In a preferred embodiment, an atom is judged to have atomic sequence chirality if the atomic sequence of the molecular structure does not overlap with the atomic sequence of its mirror structure.
In a preferred embodiment, an atom is judged to have atomic sequence chirality if its topological connection degree is greater than or equal to 3.
In a preferred embodiment, the measuring and marking step comprises: measuring the atomic sequence chirality of the central atom that connects two topologically equivalent atoms, and marking the measurement result on the equivalent non-hydrogen atoms.
In a preferred embodiment, the judgment of equivalent atoms includes: judging topologically equivalent atoms through a list of adjacent atoms, and the list of adjacent atoms is generated according to the topological connections of atoms.
In a preferred embodiment, the equivalent atom is an atom having an equivalent adjacent atom list.
In a preferred embodiment, if there are two or more equivalent atoms among the atoms connected to the central atom, two atoms are arbitrarily selected as equivalent atoms.
In a preferred embodiment, the second rearrangement step includes: performing a second rearrangement of the original structure carrying atomic sequence chiral information against the reference structure, to obtain a structure with the same atomic sequence as the reference structure.
The above atomic sequence rearrangement method marks the atomic sequence chiral information of the rearranged structure and the reference structure, and then performs a second rearrangement of the atomic sequence of the rearranged structure with reference to the reference structure. By introducing atomic sequence chirality, part of the 3D information of the structure is included in the 2D topological atomic sequence rearrangement, so that the rearrangement takes the 3D information of the structure into account, avoids disorder of the atomic sequence, and solves the problem of inconsistent atomic sequences, which benefits the subsequent accurate calculation of the force field energy of the structure.
As shown in
Step S101, Topological rearrangement: rearrange the atomic sequence of the target structure referring to the reference structure using a two-dimensional topological rearrangement method;
Step S103, equivalent atom judgment: judging the equivalent atoms in the topological structure;
Step S105, measuring and marking: marking the atomic sequence chiral information of the rearranged structure and the reference structure;
Step S107, second rearrangement: the rearranged structure is subjected to a second rearrangement of atomic sequence referring to the reference structure.
Further, in the topological rearrangement step of this embodiment, the structure whose atomic sequence needs to be rearranged (the target structure) is compared with the reference structure: the is_isomorphic method of the isomorphism module in the networkx algorithm library is used to calculate the atomic correspondence from the two-dimensional topologies of the two structures, and the atomic sequence of the target structure is rearranged according to that correspondence. For the is_isomorphic method, please refer to: L. P. Cordella, P. Foggia, C. Sansone, M. Vento, “An Improved Algorithm for Matching Large Graphs”, 3rd IAPR-TC15 Workshop on Graph-based Representations in Pattern Recognition, Cuen, pp. 149-159, 2001.
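By way of illustration only, the topological rearrangement step might be sketched as follows. The helper names build_graph and topological_order, the "element" node attribute, and the three-atom example are assumptions introduced for this sketch and do not come from the embodiment; only GraphMatcher, is_isomorphic, and categorical_node_match are networkx calls.

```python
import networkx as nx
from networkx.algorithms import isomorphism


def build_graph(elements, bonds):
    """Build a 2D topology graph: nodes are atom indices, edges are bonds."""
    graph = nx.Graph()
    for index, element in enumerate(elements):
        graph.add_node(index, element=element)  # "element" is an assumed attribute name
    graph.add_edges_from(bonds)
    return graph


def topological_order(target, reference):
    """Return the target atom indices listed in the reference atom order."""
    matcher = isomorphism.GraphMatcher(
        reference,
        target,
        node_match=isomorphism.categorical_node_match("element", None),
    )
    if not matcher.is_isomorphic():
        raise ValueError("target and reference topologies differ")
    # After a successful match, matcher.mapping maps
    # reference atom index -> target atom index.
    return [matcher.mapping[i] for i in sorted(reference.nodes)]


# Hypothetical C-O-H fragment written in two different atom orders.
reference = build_graph(["C", "O", "H"], [(0, 1), (1, 2)])
target = build_graph(["H", "O", "C"], [(0, 1), (1, 2)])
order = topological_order(target, reference)
```

In this sketch, order[i] is the index of the target atom that corresponds to reference atom i. When the structure contains topologically equivalent atoms, several correspondences are valid and the matcher returns only one of them, which is the ambiguity that the later steps address.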
Further, in the measuring and marking step of this embodiment, the atomic sequence chiral information of the rearranged structure and the reference structure is marked according to the method for measuring and marking atomic sequence chirality.
Further, the method for measuring and marking atomic sequence chirality in this embodiment is: taking the central atom as the starting point, the dihedral angle of the atoms connected to the central atom is taken in a clockwise direction, and the atoms taken must include the equivalent atoms. If the dihedral angle is greater than 0, the two topologically consistent atoms are marked True and False, respectively, in the order in which the atoms were taken. If the dihedral angle is less than 0, they are marked False and True, respectively, in the same order.
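The sign-based marking rule above may be illustrated by the following minimal sketch. The order in which the four atoms enter the dihedral angle (first equivalent atom, central atom, a third connected atom, second equivalent atom) is an assumption made for this sketch, since the description does not fix it; the function names and coordinates are likewise hypothetical.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle, in degrees, defined by four 3D points."""
    b0, b1, b2 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1, n2 = _cross(b0, b1), _cross(b1, b2)
    y = math.sqrt(_dot(b1, b1)) * _dot(b0, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))


def mark_equivalent_pair(first, center, third, second, coords):
    """Mark two equivalent atoms True/False from the sign of the dihedral angle.

    Assumed atom order for the dihedral: first, center, third, second.
    """
    angle = dihedral(coords[first], coords[center], coords[third], coords[second])
    if angle > 0:
        return {first: True, second: False}
    if angle < 0:
        return {first: False, second: True}
    return {}  # sign undefined (angle of exactly 0): leave both atoms unmarked


# Hypothetical coordinates: atom 0 is the central atom, atoms 1 and 2 are
# the equivalent atoms, atom 3 is a third atom connected to the central atom.
coords = {0: (0.0, 0.0, 0.0), 1: (1.0, 0.0, -0.3),
          2: (-0.5, 0.8, -0.3), 3: (0.0, 0.0, 1.0)}
# Mirror image of the same geometry (y coordinate negated).
mirrored = {k: (x, -y, z) for k, (x, y, z) in coords.items()}

marks = mark_equivalent_pair(1, 0, 3, 2, coords)
mirror_marks = mark_equivalent_pair(1, 0, 3, 2, mirrored)
```

Under this assumed ordering, swapping which equivalent atom is taken first reverses the sign of the angle, so the same atom receives the True mark either way, while the mirror-image geometry receives the opposite marks.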
Furthermore, in this embodiment an atom is judged to have atomic sequence chirality if the atomic sequence of the molecular structure does not overlap with the atomic sequence of its mirror structure.
Furthermore, in this embodiment an atom is judged to have atomic sequence chirality if its topological connection degree is greater than or equal to 3.
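As a minimal sketch of the connection-degree criterion, the candidate central atoms could be collected by counting bonds per atom. The function name candidate_centers and the example bond list (a six-membered ring with one substituent on atom 0) are hypothetical and are not the structure of the embodiment.

```python
def candidate_centers(bonds):
    """Return atoms whose topological connection degree is at least 3."""
    degree = {}
    for a, b in bonds:
        degree[a] = degree.get(a, 0) + 1
        degree[b] = degree.get(b, 0) + 1
    return sorted(atom for atom, d in degree.items() if d >= 3)


# Hypothetical heavy-atom skeleton: ring atoms 0..5 plus substituent atom 6 on atom 0.
ring_bonds = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0), (0, 6)]
centers = candidate_centers(ring_bonds)
```

Here only atom 0 has three connections in the heavy-atom skeleton; if hydrogen atoms were included in the bond list, more atoms would meet the criterion.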
Further, in the measuring and marking step of this embodiment, the atomic sequence chirality of the central atom that connects two topologically equivalent atoms is measured, and the measurement result is marked on the equivalent non-hydrogen atoms.
Further, the equivalent atom judgment in this embodiment includes: judging the topologically equivalent atoms through the adjacent atom list, which is generated according to the topological connection of the atoms.
Furthermore, the equivalent atom in this embodiment is an atom having an equivalent adjacent atom list.
Furthermore, if there are two or more equivalent atoms among the atoms connected to the central atom, two atoms are arbitrarily selected as equivalent atoms.
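The equivalent atom judgment described above could be sketched as follows. How two adjacent atom lists are compared is not spelled out in the description; this sketch assumes that two atoms connected to the same central atom are equivalent when they have the same element and the same sorted list of neighbouring elements. The function names and the example skeleton are hypothetical.

```python
from collections import defaultdict


def adjacent_atom_list(n_atoms, bonds):
    """Generate the adjacent atom list from the topological connections."""
    adjacency = {i: [] for i in range(n_atoms)}
    for a, b in bonds:
        adjacency[a].append(b)
        adjacency[b].append(a)
    return {atom: sorted(nbrs) for atom, nbrs in adjacency.items()}


def equivalent_neighbors(center, elements, adjacency):
    """Group the atoms connected to `center` that look topologically equivalent.

    Assumed criterion: same element and same sorted neighbour elements.
    """
    groups = defaultdict(list)
    for nbr in adjacency[center]:
        signature = (elements[nbr],
                     tuple(sorted(elements[j] for j in adjacency[nbr])))
        groups[signature].append(nbr)
    return [sorted(group) for group in groups.values() if len(group) >= 2]


# Hypothetical skeleton: N (atom 0) in a six-membered ring with a substituent (atom 6).
elements = ["N", "C", "C", "C", "C", "C", "C"]
bonds = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0), (0, 6)]
adjacency = adjacent_atom_list(len(elements), bonds)
equivalent = equivalent_neighbors(0, elements, adjacency)
```

In this example the two ring carbons bonded to the nitrogen form one equivalent group, while the substituent carbon does not belong to it. A one-shell comparison like this may be too coarse for larger molecules, in which case a deeper neighbourhood signature would be needed.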
Further, the second rearrangement step of this embodiment includes: performing a second rearrangement of the original structure carrying atomic sequence chiral information against the reference structure, to obtain a structure consistent with the atomic sequence of the reference structure.
The present invention introduces the concept of atomic sequence chirality and incorporates part of the 3D information of the structure into the 2D topological atomic sequence rearrangement, so that the rearrangement takes the 3D information of the structure into account. Atomic sequence chirality: if the atomic sequence of the molecular structure does not overlap with the atomic sequence of its mirror structure, the atom has atomic sequence chirality. From the perspective of topology, an atom whose topological connectivity is greater than or equal to 3 has atomic sequence chirality. Adjacent atom list: the adjacent atom list is generated according to the topological connections of the atoms.
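One possible way to realize the second rearrangement is to repeat the graph matching while also requiring the True/False marks to agree, as in the sketch below. The node attribute names "element" and "mark", the helper names, and the four-atom example are assumptions made for illustration; the embodiment does not state how the marks enter the matching.

```python
import networkx as nx
from networkx.algorithms import isomorphism


def marked_graph(elements, marks, bonds):
    """Topology graph whose nodes carry an element and an optional chirality mark."""
    graph = nx.Graph()
    for index, element in enumerate(elements):
        graph.add_node(index, element=element, mark=marks.get(index))
    graph.add_edges_from(bonds)
    return graph


def second_rearrangement(target, reference):
    """Return target atom indices in reference order, honouring chirality marks."""
    def node_match(ref_attrs, tgt_attrs):
        # Atoms correspond only if both element and mark agree.
        return (ref_attrs["element"] == tgt_attrs["element"]
                and ref_attrs.get("mark") == tgt_attrs.get("mark"))

    matcher = isomorphism.GraphMatcher(reference, target, node_match=node_match)
    if not matcher.is_isomorphic():
        raise ValueError("no correspondence consistent with the chirality marks")
    return [matcher.mapping[i] for i in sorted(reference.nodes)]


# Hypothetical fragment: central N (atom 0) bonded to three carbons, two of which
# are equivalent and carry True/False marks in opposite index order.
bonds = [(0, 1), (0, 2), (0, 3)]
reference = marked_graph(["N", "C", "C", "C"], {1: True, 2: False}, bonds)
target = marked_graph(["N", "C", "C", "C"], {1: False, 2: True}, bonds)
final_order = second_rearrangement(target, reference)
```

Without the marks, atoms 1 and 2 of the target could be assigned to either equivalent atom of the reference; with the marks, only one assignment remains.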
As shown in
As shown in
As shown in
By introducing the concept of atomic sequence chirality, part of the 3D information of the structure is included in the 2D topological atomic sequence rearrangement, so that the rearrangement takes the 3D information of the structure into account and avoids disorder of the atomic sequence of the structure.
As shown in
Due to the original structure containing a symmetric six-membered ring (as shown in
Rearrange the original structure (as shown in
Judge topologically equivalent non-hydrogen atoms through the adjacent atom list. For example, in Table 1, the adjacent atoms of atom C_11 and C_0v are the same, and the adjacent atoms of atom C_12 and C_0y are the same.
The atomic sequence chirality is measured for the central atom connecting two topologically equivalent atoms, and the measurement result is marked on the equivalent non-hydrogen atom. For example, the atomic sequence chirality of the central atom N_18 in
The original structure with atomic sequence chiral information and the reference structure are rearranged for the second time and a structure consistent with the atomic sequence of the reference structure can be obtained.
The following Table 1 is the adjacent atom list of this embodiment
The atomic sequence rearrangement method of the present invention is suitable for the pre-processing of molecular structure force field energy calculation. By including part of the 3D information of the structure in the 2D topological atomic sequence rearrangement, the problem of atomic sequence inconsistency can be solved, so that the force field energy of the structure can be calculated accurately.
Taking the above preferred embodiments of this application as guidance, and in light of the above description, persons skilled in the art can make various changes and modifications without departing from the scope of the technical idea of this application. The technical scope of this application is not limited to the content of the specification and must be determined according to the scope of the claims.
Those skilled in the art should understand that the embodiments of the present application can be provided as a method, a system, or a computer program product. Therefore, this application may adopt the form of a complete hardware embodiment, a complete software embodiment, or an embodiment combining software and hardware. Moreover, this application may adopt the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) containing computer-usable program codes.
This application is described with reference to flowcharts and/or block diagrams of the methods, devices (systems), and computer program products of the embodiments of this invention. It should be understood that each process and/or block in the flowchart and/or block diagram, and any combination of processes and/or blocks in the flowchart and/or block diagram, can be realized by computer program instructions. These computer program instructions can be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor, or other programmable data processing equipment to produce a machine, so that the instructions executed by the processor of the computer or other programmable data processing equipment produce a device that realizes the functions specified in one process or multiple processes in the flowchart and/or one block or multiple blocks in the block diagram.
These computer program instructions can also be stored in a computer-readable memory that can direct a computer or other programmable data processing equipment to work in a specific manner, so that the instructions stored in the computer-readable memory produce an article of manufacture including the instruction device. The device implements the functions specified in one process or multiple processes in the flowchart and/or one block or multiple blocks in the block diagram.
These computer program instructions can also be loaded on a computer or other programmable data processing equipment, so that a series of operation steps are executed on the computer or other programmable equipment to produce computer-implemented processing, and the instructions executed on the computer or other programmable equipment thereby provide steps for implementing the functions specified in one process or multiple processes in the flowchart and/or one block or multiple blocks in the block diagram.
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/CN2020/093287 | 5/29/2020 | WO | 00 |