The present invention relates to a compact structure for topology among spheres, and more particularly a compact structure for topology among spheres defining a blending surface of a sphere set and a method of constructing the same.
The Voronoi diagram and its related concepts have been quite popular in various technical fields such as science and engineering fields. In particular, the use of Voronoi diagram for a point set has been increasing over the last few years with respect to various applications. This can be attributed to a greater understanding of its mathematical and computational properties, as well as the development of dynamic and efficient codes.
In biology, for example, the Voronoi diagram for the centers of atoms in a molecule was first used by Richards in 1974 to study the packing density of molecules (see F. M. Richards, The interpretation of protein structures: Total volume, group volume distributions and packing density, Journal of Molecular Biology 82 (1974) 1-14). Since then the Voronoi diagram has been used as one of the most important computational tools for conducting structure analysis of molecules.
Since 1974, the Voronoi diagram of a point set has been used quite extensively in the solution processes of various structural biological problems. However, Richards realized that the ordinary Voronoi diagram of points cannot adequately account for the size variations among atoms. As such, Richards proposed to translate the planar bisector between two atoms in the Voronoi diagram according to the size differences between two atoms. However, the translations of bisectors caused the so-called “vertex error” since this transformation cannot generally guarantee a correct tessellation of the space. In 1982, Gellatly and Finney proposed using a radical plane as the bisector between two atoms since such planes can guarantee that vertex errors will not occur (see B. J. Gellatly, J. L. Finney, Calculation of protein volumes: An alternative to the Voronoi procedure, Journal of Molecular Biology 161 (2) (1982) 305-322). While reflecting the size variations among atoms at a certain level, this transformation can guarantee a valid tessellation of the space. The tessellation using radical planes is indeed identical to the power diagram named by Aurenhammer (see F. Aurenhammer, Power diagrams: Properties, algorithms and applications, SIAM Journal on Computing 16 (1987) 78-96).
By introducing the concept of alpha-shapes in 1994, Edelsbrunner and Muecke provided a basis for applying the Voronoi diagram of a point set in reconstructing the shape from which the point set can be produced (see H. Edelsbrunner, E. P. Muecke, Three-dimensional alpha shapes, ACM Transactions on Graphics 13 (1) (1994) 43-72). They also provided an efficient code to compute alpha-shapes using the properties of Delaunay triangulation. Since alpha-shapes are fundamentally based on the rigorous theory of the Voronoi diagram for a point set and the Delaunay triangulation, they have been used in various applications. The main applications of alpha-shapes lie in deriving the surface shape, which is defined by a point set. Based on this property, many researchers have tried to use alpha-shapes for restructuring and deriving the spatial structures of biological systems.
However, alpha-shapes have limited applicability to biological systems, mainly because they cannot account for the size variations among atoms. In general, the proximity among spheres is not necessarily identical to the proximity among the centers of the spheres.
The three-dimensional alpha-shape will be briefly reviewed with reference to
R3 filled with Styrofoam and the points 102 of S made of a more solid material (e.g., rock). Also, there is a spherical eraser 104 with radius α. It is omnipresent in the sense that it carves out Styrofoam at all positions where it does not enclose any of the sprinkled rocks, that is, points 102 of S. The resulting object 106 is referred to as the alpha-hull. To simplify the result, the surface of the object is straightened by substituting straight edges 108 for the circular ones 110 and triangles for the spherical caps. The obtained object 112 is the alpha-shape of S.
Therefore, an alpha-shape 112 is identical to the convex hull of S when α=∞. For α=0, the alpha-shape 112 reduces to the point set S itself. Generally, alpha-shapes can be concave and disconnected. Alpha-shapes can contain two-dimensional patches of triangles and one-dimensional strings of edges. Its components can even be points. An alpha-shape 112 is a subset of the closure of the Delaunay triangulation of S, and it may have handles and interior voids.
∂X, i(X) and cl(X) denote the boundary, the interior and the closure of a set X, respectively. In addition, Hα(S) and Sα(S) denote an alpha-hull 106 and an alpha-shape 112 of the set S, respectively. Given the above, it can be generally shown that ∂i(Sα(S))≠∂Sα(S). This implies that alpha-shapes 112 are generally non-manifold.
Although the theory of alpha-shape has been applied to many situations, there are some critical instances where the application of this construct is not suitable. For such situations, a pocket recognition will be described hereinafter with reference to
First, we will present the definition of a pocket on the surface of a protein from a geometric point of view. Most proteins consist of at most six different types of atoms, i.e., H, C, N, O, P and S, which have the corresponding Van Der Waals radii of 1.2, 1.7, 1.55, 1.52, 1.8 and 1.8 Å, respectively. These atoms with Van Der Waals radii are usually referred to as Van Der Waals atoms. The number of atoms in a protein varies from hundreds to hundreds of thousands.
In most studies conducted to analyze geometric characteristics of a protein with respect to another molecule (usually relatively small) referred to as a ligand, such analysis is typically done using the concept of a spherical probe which encloses the ligand. While a probe is an approximation of the ligand, the probe can best represent the ligand by incorporating its shape, conformation changes, and all possible orientations of the ligand with respect to the protein. Hence, it is considered that the behavior of a probe best represents the geometric behavior of the ligand with respect to a protein. In the case of a water molecule, the corresponding probe is a sphere with the radius of 1.4 Å.
The method of determining whether a beta-ball 208 can freely enter the pocket 206 is discussed below. Ignoring the size differences among the atoms 202 and considering the atom centers 204 only, the best approach for the systematic reasoning of spatial structure may be to use the ordinary Voronoi diagram of the atom centers 204 or the power diagram of atoms.
However, the decision of whether the beta-ball 208 can or cannot pass through may not be correct unless the size variation among atoms 202 is properly accounted for. For example, in
Since the ordinary Voronoi diagram of point set (hence the corresponding Delaunay triangulation) does not properly deliver the proximity information among atoms, applications requiring precise proximity information cannot efficiently yield accurate results.
In order to incorporate the size difference among atoms, Edelsbrunner extended the alpha-shape to the weighted alpha-shape using the regular triangulation, which is the topological dual of the power diagram of the atoms (see H. Edelsbrunner, Weighted alpha shapes, Technical Report UIUCDCS-R-92-1760, Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, Ill. (1992); and H. Edelsbrunner, The union of balls and its dual shape, Discrete & Computational Geometry 13 (1995) 415-440). Since then, the weighted alpha-shapes have been used in restructuring and reasoning of the spatial structure for molecular systems. However, it should be noted that the distance metric in a power diagram is the power distance. In other words, when the radius of the spherical eraser is smaller than the minimum tangential distance between the atoms defining an edge or a face, the eraser is considered small enough to pass between these atoms without a collision, and the corresponding edge or face is considered larger than the eraser. As such, it is always necessary in the weighted alpha-shape to check if the minimum Euclidean distance between the atoms defining a bisector in a power diagram really allows for a spherical probe with a predefined size to freely pass through without a collision. Particularly, the power diagram (and therefore the weighted alpha-shape) cannot correctly provide the proximity information among atoms in the Euclidean distance metric, if such atoms do not intersect. Since the power diagram correctly recognizes the intersections between atoms, if the sizes of atoms are properly adjusted, the power diagram of the adjusted atoms may produce correct results. However, the adjustment of atom sizes depends on the particular application. Hence, different applications may require different size adjustments and a separate computation of the power diagram.
Besides the computational inefficiency of using power diagram, it is not clear if the size adjustment can be nicely defined for a particular application. Thus, weighted alpha-shapes themselves also have deficiencies in biological applications based on Euclidean distance metric even though they reflect the size variations of atoms at a certain level.
The main pitfall of the alpha-shape and the weighted alpha-shape is that neither can fully and efficiently account for the size variation of the input geometry.
It is, therefore, an object of the present invention to provide a compact structure for topology among spheres defining a blending surface of a sphere set. It is another object of the present invention to provide a method for constructing such structure from the sphere set. It is yet another object of the present invention to provide a method utilizing the above structure.
According to one aspect of the present invention, there is provided a computer-readable medium, which comprises the following: computer-readable code adapted to obtain a Voronoi diagram object of spheres; computer-readable code adapted to search for partially accessible Voronoi edges from the Voronoi diagram object; and computer-readable code adapted to compute faces of a beta-shape object from the partially accessible Voronoi edges.
According to another aspect of the present invention, there is provided a computer-readable medium, which comprises the following: computer-readable code adapted to obtain beta-shape objects; and computer-readable code adapted to recognize pockets from the beta-shape objects.
According to yet another aspect of the present invention, there is provided a system, which comprises the following: logic configured to obtain a Voronoi diagram object of spheres; logic configured to search for partially accessible Voronoi edges from the Voronoi diagram object; and logic configured to compute faces of a beta-shape object from the partially accessible Voronoi edges.
According to still yet another aspect of the present invention, there is provided a system, which comprises the following: logic configured to obtain beta-shape objects; and logic configured to recognize pockets from the beta-shape objects.
According to a further aspect of the present invention, there is provided a method of constructing a beta-shape, which comprises the following steps: acquiring a Voronoi diagram of spheres; searching for partially accessible Voronoi edges; and obtaining faces of the beta-shape from the partially accessible Voronoi edges.
According to yet another aspect of the present invention, there is provided a method of recognizing pockets, which comprises the following steps: acquiring beta-shapes; and recognizing pockets from the beta-shapes.
The present invention introduces a concept of beta-shape, which is based on Euclidean distance metric and considers the size differences among circles and spheres. Further, the present invention provides an efficient method of constructing the beta-shape. The present invention also provides a geometric method for pocket recognition based on the beta-shape and Voronoi diagram for atoms in the Euclidean distance metric.
The above and other objects and features in accordance with the present invention will become apparent from the following descriptions of preferred embodiments given in conjunction with the accompanying drawings, in which:
Numerous specific details are set forth below in order to provide a thorough understanding of the present invention. It will be apparent, however, to one skilled in the art, that the present invention may be practiced without some or all of these specific details. Further, it should be noted herein that well-known process steps have not been described in detail so as not to unnecessarily obscure the present invention.
To incorporate the size differences of atoms into the concept of alpha-shapes, one aspect of the present invention provides the concept of beta-shapes. The beta-shape is a compact structure for topology among spheres defining a blending surface of a sphere set. In view of
Conceptually, a beta-hull is a generalization of an alpha-hull and can be similarly described. The point set from which an alpha-hull is defined is now replaced by a set of three-dimensional spherical balls. As in the case of alpha-hulls, assume R3 filled with Styrofoam and some spherical rocks scattered around the inside of the Styrofoam. The radii of the spherical rocks are different. Then, carving out the Styrofoam with an omnipresent and empty spherical eraser with the radius of β will result in a beta-hull. Since the eraser is omnipresent, there can be interior voids as well.
In
Consider A={ai|i=1, . . . , n} as a finite set of three-dimensional spherical rocks. For 0<β≦∞, a beta-ball b is an open ball with a radius β. The beta-ball b with β=0 generally corresponds to a point. It should be noted that β=∞ corresponds to an open half-space. The beta-ball b is called empty if and only if b∩A=Ø.
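The emptiness condition can be checked numerically. The following is a minimal sketch, assuming each atom is stored as a (center, radius) pair and is treated as a closed ball; the function name and the data layout are illustrative assumptions and are not part of the specification.

```python
import math

def is_empty_beta_ball(center, beta, atoms):
    """Return True if the open ball (center, beta) intersects no atom.

    `atoms` is an iterable of ((x, y, z), radius) pairs (assumed layout).
    An open ball of radius beta meets a closed atom of radius r exactly
    when the center distance is strictly less than beta + r, so a
    tangential contact still counts as empty.
    """
    for atom_center, atom_radius in atoms:
        if math.dist(center, atom_center) < beta + atom_radius:
            return False
    return True
```

With a single atom of radius 1.0 at the origin, a beta-ball of radius 2.0 centered at (3, 0, 0) only touches the atom and is therefore empty, whereas a radius of 2.5 at the same center is not.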
For 0≦β≦∞, the beta-hull of A, which is denoted by Hβ(A), is defined as the complement of the union of all empty beta-balls. If β=∞, then Hβ(A) is the convex hull of A. If β=0, then Hβ(A) is A itself. In fact, the boundary of the beta-hull for set A, which is denoted by ∂Hβ(A), is the molecular surface of A.
Given an alpha-hull, an alpha-shape is obtained by straightening the curved geometry in the corresponding alpha-hull. In other words, circular arcs and spherical caps of alpha-hulls are replaced by line segments and triangles. A beta-shape can be similarly explained, with a slight yet fundamental difference: in the beta-family, the relationship between a beta-hull and the corresponding beta-shape is slightly different from that of their counterparts in the alpha-family.
Assuming that a beta-hull Hβ(A) for an atomic structure A is given, the boundary ∂b of an empty beta-ball b may or may not touch one or more atoms in the set A. Also, it should be assumed that Asub(b) is a subset of A consisting of atoms, which are simultaneously touched by the boundary of an empty beta-ball b.
Then, the beta-shape Sβ(A), which corresponds to Hβ(A), is obtained by connecting the centers of the appropriate atoms, as follows: if |Asub(b)|=1, then the center of the touched atom becomes a vertex in Sβ(A); if |Asub(b)|=2, then the edge defined between the centers of the two touched atoms becomes an edge in Sβ(A); and if |Asub(b)|=3, then the triangle defined among the centers of the three touched atoms becomes a face in Sβ(A). A formal definition of a beta-shape is as follows.
Assume that Asub(b)={a∈A|b∩A=Ø, a∩∂b≠Ø} and Csub(b)={c|a=(c, r)∈Asub(b)}, wherein c and r are the center and radius of an atom a, respectively. Further assume that Δb is the convex combination of elements in Csub(b) for a particular b. Then, the beta-shape Sβ(A) of A is a polytope bounded by a set ∪Δb, for all possible b in the space corresponding to a particular value of β, 0≦β≦∞.
It should be emphasized herein that a beta-shape is a polytope, which has both an interior and a boundary. It should be also noted that 1≦|Asub(b)|≦3 if the atoms in A are all located in general position. In other words, it is assumed that ∂b for an empty beta-ball b can simultaneously touch either one, two or three atoms. This argument can be generalized to the d-dimension as well.
Assume that a beta-shape Sβ(A) is represented as sets of vertices, edges and faces. In other words, Sβ(A)=(Vβ, Eβ, Fβ) where Vβ={vβ1, vβ2, . . . }, Eβ={eβ1, eβ2, . . . } and Fβ={fβ1, fβ2, . . . } are sets of vertices, edges, and faces of Sβ(A), respectively. It should be noted that vβi∈Vβ corresponds to the center ci of an atom ai∈A. As such, the following property can be derived.
Assume that b represents an empty beta-ball. If b touches an atom ai∈A, then the center ci of ai maps to a vertex vβi∈Vβ. If b simultaneously touches two atoms ai and aj, then their centers ci and cj define an edge eβij∈Eβ. If b simultaneously touches three atoms ai, aj and ak, then the convex combination of their centers ci, cj and ck defines a triangular face fβijk∈Fβ.
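The mapping from the atoms touched by an empty beta-ball to an element of the beta-shape can be sketched as follows. This is an illustrative sketch only; it assumes that atoms are identified by integer indices and that the atoms are in general position, so that between one and three atoms are touched. The function name is hypothetical.

```python
def simplex_for_contact(touched):
    """Map the indices of atoms touched by an empty beta-ball to a simplex.

    Returns a (kind, indices) pair, where kind is "vertex", "edge" or
    "face" and indices are the sorted atom indices whose centers span
    the simplex.
    """
    indices = tuple(sorted(set(touched)))
    kinds = {1: "vertex", 2: "edge", 3: "face"}
    if len(indices) not in kinds:
        raise ValueError("general position assumes 1 to 3 touched atoms")
    return kinds[len(indices)], indices
```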
∂Hβ(A) corresponds to a blended model of A with a probe of radius β. There are two types of blending surface in Hβ(A): a rolling blend and a link blend. A rolling blend ρ is defined between two nearby or adjacent atoms by rolling the probe around the line passing through the atom centers while keeping tangential contacts with such atoms. When the probe touches a third atom, the rolling blend ceases to exist. Instead, a link blend λ is defined among three neighboring atoms by placing the probe on top of the three atoms.
Assume that two atoms ai and aj are in close proximity to each other so that they define a rolling blend ρij. Then, an edge eβij between the centers ci and cj of the atoms ai and aj can be defined. If this operation is applied for all such pairs of atoms, then the definitions of Eβ for the beta-shape Sβ(A) can be obtained. Assume that a triplet of atoms ai, aj and ak defines a link blend λijk. Then, a triangular face fβijk using vβi, vβj, and vβk as the vertices can be defined. If this operation is applied for all such triplets of atoms, the definition of Fβ can be obtained.
Assume that a rolling blend ρij is defined between two atoms ai and aj and the swept-volume of the probe along the spine of the sweeping does not intersect with any other atoms. Then, the spine curve of the rolling blend ρij is a complete circle. In such a case, the corresponding edge eβij does not contribute to any face fβ∈Fβ. Thus, it is considered as a dangling edge. Similarly, there can be dangling faces as well.
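Once Eβ and Fβ are available, dangling edges can be identified as the edges that bound no face. The following is a minimal sketch under the assumption that edges are given as pairs and faces as triples of atom indices; this representation and the function name are illustrative and not taken from the specification.

```python
from itertools import combinations

def dangling_edges(edges, faces):
    """Return the edges of a beta-shape that bound no triangular face.

    `edges` holds pairs and `faces` holds triples of atom indices
    (assumed representation). Every edge is normalized to a sorted tuple.
    """
    covered = set()
    for face in faces:
        # Each triangle contributes its three bounding edges.
        covered.update(combinations(sorted(face), 2))
    return {tuple(sorted(edge)) for edge in edges} - covered
```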
Sβ(A) is referred to as a beta-shape since the construct is defined based on the concept of blending over all atoms in A. It should be noted that the actual operation of blending need not be explicitly performed at all. The existence of such blending patches on the surface of the atomic complex is only checked by referring to the appropriate Voronoi edges and faces. The name, beta-shape, also implies that it is a generalization of the well-known alpha-shape that fully takes into account the size differences among atoms.
Although
Hereinafter, with reference to
A Voronoi diagram VD(A) for an atom set A is defined as follows. Associated with each atom ai∈A, there is a corresponding Voronoi region VRi for ai, wherein VRi={p|dist(p,ci)−ri≦dist(p,cj)−rj, i≠j} and dist(p,q) denotes a Euclidean distance between points p and q. Then, VD(A)={VR1, VR2, . . . , VRn} is the Voronoi diagram for the given atoms and represented as Gv=(Vv, Ev, Fv), wherein Vv={vv1, vv2, . . . }, Ev={ev1, ev2, . . . } and Fv={fv1, fv2, . . . } are sets of Voronoi vertices, edges and faces, respectively. It is important that the topological adjacencies among vertices, edges and faces should be also appropriately represented in the Voronoi diagram. From the definition of a Voronoi diagram, a Voronoi vertex vv is the center of an empty sphere tangent to four nearby atoms, while a Voronoi edge ev is defined as a locus of points equidistant from the surfaces of three surrounding atoms. In addition, a Voronoi face fv is the surface defined by two neighboring atoms. Note that the face is always a hyperbolic surface and any point on the face is equidistant from the surfaces of both atoms. As mentioned above, the Voronoi diagram may be stored in a storage as a data structure, which can be handled by a computer or the like.
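The defining inequality of a Voronoi region can be illustrated with a direct evaluation: a point p belongs to the region of the atom that minimizes dist(p, c) − r. The following is a minimal sketch, assuming atoms are stored as (center, radius) pairs; it is a brute-force query for illustration and is not an algorithm for constructing VD(A).

```python
import math

def owning_atom(p, atoms):
    """Return the index i of the atom whose Voronoi region contains p.

    `atoms` is a sequence of ((x, y, z), radius) pairs (assumed layout).
    The region of atom i is where dist(p, c_i) - r_i is minimal; ties
    (points on a Voronoi face) resolve to the lowest index here.
    """
    return min(
        range(len(atoms)),
        key=lambda i: math.dist(p, atoms[i][0]) - atoms[i][1],
    )
```

For two atoms with radii 1.0 and 3.0 centered 10 units apart, the midpoint between the centers lies in the region of the larger atom, whereas an ordinary Voronoi diagram of the centers would place it on the bisector.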
Since Aurenhammer initially discussed the Voronoi diagram of atoms, a few researchers have studied the algorithm to compute this Voronoi diagram. Recently, as described in U.S. Patent Publication No. 2005/0248567, practical algorithms and their implementations are reported, which will not be described in detail in this specification.
Assume ev is an edge of VD(A) with starting and ending Voronoi vertices vvs and vve, respectively, and ev is defined by three atoms ai, aj and ak. It is known that ev is conic and therefore planar. Assume DeV(x) is the Euclidean distance from a point x∈ev to the surface of one of the three atoms, and rp is the radius of a given probe p. Further assume that DmineV and DmaxeV are the minimum and maximum values of DeV(x), respectively, wherein x∈ev. Then, it can be shown that DeV(x) is unimodal when x moves from vvs to vve along the edge ev. Hence, DmineV occurs when either x≡vvs or x≡vve, or x is the intersection between ev and the plane passing through ci, cj and ck, i.e., the centers of three atoms ai, aj and ak, respectively. On the other hand, DmaxeV always occurs at either x≡vvs or x≡vve. Note that the distances DeV(vvs) and DeV(vve) can be computed while VD(A) is constructed. An edge ev is defined to be partially accessible by a probe p if DmineV<rp≦DmaxeV. Also, the edge ev is defined to be fully accessible if rp≦DmineV and non-accessible if DmaxeV<rp.
Assume fv is a face in VD(A) defined by two neighboring atoms ai and aj, and DfV(x) is the Euclidean distance from a point x∈fv to the surface of one of the atoms. DminfV and DmaxfV are defined similarly as before. A face fv is defined to be partially accessible by a probe p if DminfV<rp≦DmaxfV. The face fv is defined to be fully accessible if rp≦DminfV and non-accessible if DmaxfV<rp.
Assume that a Voronoi edge ev is defined by three atoms ai, aj and ak. If ev is partially accessible by a probe p, then the three atoms define a link blend λijk and the corresponding triangular face fβijk defined by the vertices ci, cj and ck is an element of Fβ, wherein Sβ(A)=(Vβ, Eβ, Fβ) is the beta-shape.
Assume that a Voronoi face fv is defined by two atoms ai and aj. If fv is partially accessible by a probe p, then the two atoms define a rolling blend ρij and the corresponding edge eβij defined by the vertices ci and cj is an element of Eβ, wherein Sβ(A)=(Vβ, Eβ, Fβ) is the beta-shape.
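The three accessibility classes for a Voronoi edge or face follow directly from the inequalities above. The following is a minimal sketch, assuming the minimum and maximum distances have already been computed for the element; the function name and the returned labels are illustrative.

```python
def classify_accessibility(d_min, d_max, probe_radius):
    """Classify a Voronoi edge or face against a probe of a given radius.

    d_min and d_max are the minimum and maximum distances from points on
    the element to the surface of a defining atom, assumed precomputed.
    """
    if probe_radius <= d_min:
        return "fully accessible"
    if probe_radius <= d_max:  # here d_min < probe_radius <= d_max
        return "partially accessible"
    return "non-accessible"
```

For example, with a water probe of radius 1.4 Å, an element whose distances range from 1.0 Å to 3.0 Å is partially accessible.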
Thus, in the present embodiment, the construction of a beta-shape begins by obtaining a Voronoi diagram of spheres in step 702. It then searches for such partially accessible Voronoi edges and/or faces from the obtained Voronoi diagram in step 704. For example, all of the edges and/or faces in VD(A) may be checked against the above properties of the partially accessible edges and/or faces in a brute force way to collect partially accessible Voronoi edges and/or faces. Next, in step 706, it obtains faces and/or edges of a beta-shape from the partially accessible Voronoi edges and/or faces, respectively. For example, applying the above properties to the partially accessible edges and/or faces will eventually produce the desired result. That is, the face set Fβ can be easily identified in O(|Ev|) time in the worst case by scanning all the Voronoi edges in Ev once and applying the properties.
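The brute force scan of steps 704 and 706 can be sketched as a single pass over the Voronoi edges. The sketch below assumes a simplified record for each Voronoi edge that already carries its three defining atoms and its distance range; the `VoronoiEdge` type and its fields are hypothetical stand-ins for whatever data structure stores VD(A).

```python
from typing import NamedTuple, Tuple

class VoronoiEdge(NamedTuple):
    """Hypothetical record for one Voronoi edge of VD(A)."""
    atoms: Tuple[int, int, int]  # indices of the three defining atoms
    d_min: float                 # minimum distance to an atom surface
    d_max: float                 # maximum distance to an atom surface

def beta_shape_faces(voronoi_edges, probe_radius):
    """Collect beta-shape faces by scanning every Voronoi edge once."""
    faces = set()
    for edge in voronoi_edges:
        # A partially accessible edge yields the triangle of its atoms.
        if edge.d_min < probe_radius <= edge.d_max:
            faces.add(tuple(sorted(edge.atoms)))
    return faces
```

The loop touches each edge exactly once, which matches the O(|Ev|) bound stated above.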
Hereinafter, another method of constructing a beta-shape from a Voronoi diagram of spheres, which is in accordance with a fourth preferred embodiment of the present invention, will be described with reference to
The method begins by collecting infinite edges in the Voronoi diagram in step 802, and the infinite edges are put into a stack in step 804. It checks whether the stack contains at least one edge in step 806. If the stack does not contain an edge, then the method finishes. If the stack contains at least one edge, it pops an edge from the stack in step 808, and then checks whether the popped edge is partially accessible in step 810. If the popped edge is partially accessible, then it determines the popped edge as one of the Voronoi edges corresponding to the faces of a beta-shape in step 812. Otherwise, if the popped edge is not partially accessible, then it puts edges incident to the popped edge into the stack on the condition that the incident edges are not in the stack in step 814. Then, the process returns back to the step 806, where it is checked whether the stack contains at least one edge.
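Steps 802 through 814 can be sketched as a stack-driven search that starts at the infinite edges and stops expanding wherever a partially accessible edge is found. The sketch below uses hypothetical inputs: `incident` maps an edge identifier to the identifiers of its incident edges, and `d_range` maps an edge identifier to its (minimum, maximum) distance pair. It also keeps a set of every edge ever pushed, which is an added assumption so that an edge already popped is not pushed again.

```python
def search_partially_accessible(infinite_edges, incident, d_range, probe_radius):
    """Find partially accessible Voronoi edges reachable from infinity.

    All three containers are assumed, simplified views of VD(A).
    """
    stack = list(infinite_edges)   # steps 802 and 804
    seen = set(stack)              # edges pushed so far (assumption)
    found = []
    while stack:                   # step 806
        edge = stack.pop()         # step 808
        d_min, d_max = d_range[edge]
        if d_min < probe_radius <= d_max:   # step 810
            found.append(edge)              # step 812
        else:
            for neighbor in incident[edge]:  # step 814
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
    return found
```

Because the search does not expand past a partially accessible edge, edges lying behind the beta-shape boundary are never visited in this sketch.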
Even though its time complexity would be identical to the one for the brute force approach, the method of the present embodiment is much faster in practice than the brute force approach since there are significantly fewer partially accessible Voronoi edges than |Ev|. Note that |Ev| is O(n) for proteins in the worst case scenario, although it can generally be O(n²) in the case of arbitrarily sized spheres.
Similarly, the steps shown in
It should be noted herein that the method of the present embodiment simply locates the faces and edges of the beta-shape Sβ(A). To complete the topology structure among vertices, edges and faces, it is necessary to perform a file inversion among Vβ, Eβ and Fβ. The following method in accordance with a fifth preferred embodiment (or an edge-crossing method) combines the steps of
The edge-crossing method begins by finding a partially accessible edge from an infinite Voronoi edge by a depth-first search in step 1002. It then constructs a face of the beta-shape corresponding to the found edge in step 1004. In step 1006, it constructs three edges of the face, and pushes them into a stack. Then, it checks whether the stack contains at least one edge in step 1008. If the stack does not contain any edge, then the process finishes. However, if the stack contains at least one edge, then it pops an edge from the stack in step 1010, and then searches for mate faces sharing the popped edge in step 1012. It checks whether there is at least one mate face in step 1014. If there is no mate face, then it declares the popped edge as a dangling edge in step 1016, in which case the process returns to the step 1008. Otherwise, if there is at least one mate face, then it stitches topology information between the mate faces and the popped edge in step 1018. Then, in step 1020, it obtains the two other edges on each mate face (that is, the edges other than the popped edge). Next, it checks whether each obtained edge already exists in the stack in step 1022. If the obtained edge does not exist in the stack, then it puts the obtained edge into the stack in step 1024. However, if the obtained edge already exists in the stack, then it updates the partial topologies of the obtained edge using the topology information of the existing edge in step 1026 and deletes the existing edge from the stack in step 1028. Thereafter, the process returns to the step 1008.
Note that the edge in the step 1002 can be either an infinite edge or a finite edge. The topology data of an edge eβpop popped from the stack in the step 1010 is only partially defined. To complete the topology associated with eβpop, it is necessary to locate faces incident to eβpop. Given a face fβold∈Fβ incident to eβpop, another face fβnew∈Fβ incident to eβpop can be easily located by crossing the edge eβpop starting from fβold. The face fβnew is referred to as a mate face of fβold. Hence, such a method is referred to as an edge-crossing method. Note that the edge-crossing can be easily done if VD is represented in a non-manifold data structure, for example, a radial edge data structure. Hence, stitching the connectivity among fβnew, fβold and eβpop can be done without any difficulty.
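The edge-crossing traversal can be illustrated with a simplified sketch. It assumes a hypothetical callback `faces_on_edge(edge)` that returns the beta-shape faces incident to an edge, standing in for the query that would be answered from the Voronoi diagram. Unlike steps 1022 through 1028, the sketch does not merge partial topologies of duplicate stack entries; it simply skips an edge that has already been stitched. Edges for which no mate face is found are collected separately, corresponding to step 1016.

```python
from itertools import combinations

def edge_crossing(seed_face, faces_on_edge):
    """Collect faces reachable from seed_face by crossing shared edges.

    Returns (faces, edge_faces, no_mate_edges), where edge_faces maps
    each visited edge to the set of faces stitched to it.
    """
    seed = tuple(sorted(seed_face))
    faces = {seed}
    edge_faces = {}
    no_mate_edges = set()
    # Step 1006: push the three edges of the first face.
    stack = [(edge, seed) for edge in combinations(seed, 2)]
    while stack:                                   # step 1008
        edge, origin = stack.pop()                 # step 1010
        if edge in edge_faces:
            continue                               # already stitched
        incident = {tuple(sorted(f)) for f in faces_on_edge(edge)} | {origin}
        mates = incident - {origin}                # step 1012
        edge_faces[edge] = incident                # step 1018
        if not mates:
            no_mate_edges.add(edge)                # step 1016
        for mate in mates:
            faces.add(mate)
            for other in combinations(mate, 2):    # step 1020
                if other != edge and other not in edge_faces:
                    stack.append((other, mate))    # step 1024
    return faces, edge_faces, no_mate_edges
```

A simple way to exercise the sketch is to back `faces_on_edge` with a plain list of triangles and return those that contain both endpoints of the queried edge.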
Assume that all the edges ev1, ev2, ev3 and ev4 bounding the Voronoi face fv are fully accessible while fv itself is partially accessible. This then means that there can be a complete rolling blend, which is defined between atoms a1 and a2. Thus, in such a case, fv corresponds to a dangling edge between the centers of a1 and a2 as shown in
Assume that ev1 has been tested and it was found that it is partially accessible so that its corresponding beta-shape face fβ1 is already found (shown in
The edge-crossing method runs in O(|Ev|) time in the worst case. Note, however, that the number of surface atoms is usually much less than the total number of atoms in the system. Although the methods provided by the present invention are all designed to handle only the outer shell for the beta-shape of a single component, it should be recognized that the methods can be easily modified to appropriately account for the situations in which the beta-shape contains more than one component and/or has inner shells corresponding to the interior voids. While the edge-crossing method of the present embodiment assumes the existence of some partially accessible edges, the other case can also be handled via a slight variation in the presented method.
The beta-shape may be stored in a storage as a data structure, which can be handled by a computer or the like. For example, the beta-shape may be stored as a graph consisting of faces, edges and vertices. Thus, it should be recognized that the methods of the present invention may be implemented as computer-readable codes, which can be stored in a computer-readable medium and utilized via the use of a computer or the like. Furthermore, the methods of the present invention may be easily practiced with a system.
For example, the edge-crossing method may be implemented as follows:
Hereinafter, a method of recognizing pockets on a protein using the beta-shape, which is in accordance with a sixth preferred embodiment of the present invention, will be described.
Given a protein, the method first obtains beta-shapes of Van Der Waals atoms. In a specific example, the method may first obtain a Voronoi diagram of Van Der Waals atoms. Then, it may compute the beta-shapes from the Voronoi diagram using a spherical probe.
The first step in defining a pocket is to define the spatial proximity among the atoms on the surface of the protein. This is done by using a beta-shape. Then, pocket primitives are defined on the beta-shape, wherein a pocket primitive is a unit of depressed region on the beta-shape. Lastly, the validity of the boundaries between neighboring pocket primitives is evaluated to test whether two neighbors should be considered as belonging to a single pocket. Eventually, a few pockets will be left on the surface of a receptor, wherein each pocket corresponds to an appropriately depressed region.
It should be emphasized herein once again that the beta-shape takes the size variations of atoms into account when computing the proximity among atoms on the surface of a protein. As mentioned above, the radius difference between atoms (e.g., H and P) is quite significant.
Assume A={a1, a2, . . . , an} is a protein consisting of atoms ai=(ci, ri) where ci={xi, yi, zi} and ri is the center and the radius of the atom ai, respectively. In addition, assume that L={l1, l2, . . . , lm} is a ligand which also consists of a number of atoms lj (defined similarly to ai), and L will be docking with A. Further assume that C={c1, c2, . . . , cn} is the set of centers of atoms. Note that in general m<<n. Assume p=(cp, rp) (called a probe) is the minimum sphere enclosing all atoms in the ligand L.
Assume πj is a pocket wherein πj={aj1, aj2, . . . , ajk}, in which these atoms together define a depressed region on the surface S of protein A. The surface S={s1, s2, . . . , sl} is defined as a set of atoms of the protein, wherein some points on the surface of the atoms contribute to the molecular surface. Hence, S is a set of atoms ai∈A which may be touched by the ligand. Since πjSA and π1∩π2∩ . . . ∩πpSA, there may be some atoms that are not included in any pocket. Assume Π={π1, π2, . . . , πp} is the set of all possible pockets on S.
Since πj⊆S, extracting pockets requires querying the surface shape of the protein. Hence, an appropriate definition of the surface of a protein and an efficient representation of the topological structure among the atoms on the surface are necessary. In this regard, the beta-shape based on the Voronoi diagram of spheres is employed.
Assume that pL and p∞ are a probe for a ligand L and a hypothetical probe with infinite radius, respectively. Further assume that BL and B∞ are the beta-shapes of a protein corresponding to pL and p∞, respectively. Then, B∞ is a beta-shape bounded by faces defined by the centers of atoms with unbounded Voronoi regions. Unlike an alpha-shape, however, a beta-shape B∞ may contain some isolated vertices.
Let BI and BO denote BL and B∞ to specify the inner and outer beta-shapes of a given model, respectively. Assume that BI={VBI, EBI, FBI} and BO={VBO, EBO, FBO}. Let VBO={vO1, vO2, . . . }. EBO, FBO, VBI, EBI, and FBI are similarly defined.
A similar observation can be made for the 3D counterpart. For a face fO∈FBO of a 3D protein, there is a corresponding depressed region on BI unless fO coincides with a face fI∈FBI.
However, a pocket on BI may or may not correspond to a single face fO∈FBO. A large pocket, for example, may correspond to two faces of BO in FBO when the depressed regions from the two faces of BO do not have a clear boundary between them. In such a case, the depressed region on BI corresponding to a single face fO∈FBO does not by itself form a complete pocket. Instead, both depressed regions together define a single pocket. Hence, the concept of a pocket primitive as a unit depressed region on BI corresponding to each face fO∈FBO will be described.
A face fOi∈FBO has three associated vertices vOi1, vOi2, and vOi3 in VBO. There are always three vertices vIi1, vIi2 and vIi3 in VBI, which coincide with vOi1, vOi2 and vOi3, respectively. Let γ(i1, i2) be the geodesic (i.e., shortest path) on the inner beta-shape BI between vIi1 and vIi2. The path from a vertex follows an incident edge, and the distance between two neighboring vertices is defined as the length of the edge between the two vertices. Hence, the distance between two arbitrary vertices is the sum of the edge lengths along the shortest path connecting the two vertices. Geodesic γ(i1, i2) is referred to as a ridge between two pocket primitives.
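As an illustrative sketch only, such a geodesic may be found with Dijkstra's shortest-path algorithm on the vertices and edges of the inner beta-shape BI, using Euclidean edge lengths as weights. The function name geodesic and the dictionary-based mesh representation are assumptions made for this example and are not prescribed by the embodiment.

```python
import heapq
import math


def geodesic(vertices, edges, source, target):
    """Shortest edge path between two vertices of the inner beta-shape.

    vertices: dict mapping a vertex id to its (x, y, z) atom center
    edges:    iterable of (u, v) vertex-id pairs
    Returns (list of vertex ids, path length), or (None, inf) when the
    target cannot be reached. Vertex ids must be mutually comparable
    because they serve as tie-breakers in the heap.
    """
    def length(u, v):
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(vertices[u], vertices[v])))

    adjacency = {v: [] for v in vertices}
    for u, v in edges:
        w = length(u, v)
        adjacency[u].append((v, w))
        adjacency[v].append((u, w))

    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, math.inf):
            continue  # stale heap entry
        if u == target:
            break
        for v, w in adjacency[u]:
            nd = d + w
            if nd < dist.get(v, math.inf):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))

    if target not in dist:
        return None, math.inf
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    path.reverse()
    return path, dist[target]
```

Calling geodesic for each of the three vertex pairs of a face fOi would yield the three ridges γ(i1, i2), γ(i2, i3) and γ(i3, i1) as vertex chains.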
The geometric meaning of γ(i1, i2) is as follows. While the extreme vertices vIi1 and vIi2 are on both BO and BI, the other vertices on the path define depressions on BI from the corresponding face of BO. Hence, the geodesic γ(i1, i2) can be interpreted as the most upward wall separating two relatively deep depressions on BI. The other geodesics γ(i2, i3) and γ(i3, i1) can be similarly interpreted.
Assume that FsubIi is a set of faces fIh, wherein fIh∈FBI is interior to the three geodesics γ(i1, i2), γ(i2, i3) and γ(i3, i1). Then, FsubIi forms a topologically triangular shaped depression on BI from the corresponding face of BO. This depression is referred to as a pocket primitive fi corresponding to a face fOi∈FBO and is also represented by another graph φi=(VsubIi, EsubIi, FsubIi).
When a face fOi∈FBO coincides with a face fIi∈FBI, FsubIi consists of the single face fIi∈FBI and it is not considered as a pocket primitive. Note that FsubIi can even be a null set when the three geodesics degenerate to three curve segments without containing any face inside. In such a case, no pocket primitive corresponds to the face. Based on the above, a few properties can be obtained:
If φi and φj are not topological neighbors, then FsubIi∩FsubIj=Ø where i≠j. Further, if φi and φj are topological neighbors, then EsubIi∩EsubIj=γ(ij)≠Ø. The geodesic γ(ij) is referred to as a ridge between φi and φj.
Next, the method of the present embodiment will be described in more detail.
First, pocket primitives are extracted from BO. In step 1502, the method obtains the inner beta-shape BI and the outer beta-shape BO. In one example, the method may first obtain a Voronoi diagram of spheres for the atom set, and then compute the inner and outer beta-shapes from the Voronoi diagram of spheres. It then identifies the faces fOi of the outer beta-shape BO in step 1504, and then identifies the three vertices vi1, vi2, and vi3 of the inner beta-shape BI corresponding to the vertices of fOi in step 1506. Next, the geodesics γ(i1, i2), γ(i2, i3) and γ(i3, i1) connecting the three identified vertices vi1, vi2 and vi3 are found in step 1508. Thus, a pocket primitive φi=(VsubIi, EsubIi, FsubIi) surrounded by the geodesics γ(i1, i2), γ(i2, i3) and γ(i3, i1) can be found in step 1510.
Given the pocket primitives, one or more neighboring pocket primitives may form a pocket. Hence, it needs to be determined whether two neighboring pocket primitives can be merged together to form a more meaningful depression based on an appropriate criterion in step 1512. As mentioned above, a ridge γ(i1, i2) exists between two incident pocket primitives φi and φj. It is also an edge chain on BI corresponding to the geodesic between two extreme vertices of a pocket primitive. Hence, a ridge plays the role of the boundary between two incident pocket primitives.
Assume that a mountain is the edge chain on BI separating two pockets. If a ridge is sufficiently high, then it can be regarded as a mountain. Note that a pocket primitive always has 3 ridges and a pocket is surrounded by 3 or more mountains. Therefore, the boundary of a pocket primitive may or may not be the boundary of a pocket.
Assume that a path γk, which is a ridge, exists between two incident pocket primitives φi and φj corresponding to an edge eOk of EBO. Note that there always exists a geodesic on BI for an edge of BO. Then, a certain measure can be defined to determine the discrepancy between the two chains, eOk and γk. Depending on the measure and its prescribed threshold value, two pocket primitives sharing the chain can be merged into one larger pocket in step 1514.
There are various ways to define such a measure for the merge. For example, the concept of the average distance between two chains may be employed. Let δk be the average distance between eOk and γk. If δk is larger than a prescribed value, then the ridge lies far below the outer edge and the two neighboring pocket primitives sharing the chain can be merged. Otherwise, γk can be regarded as a mountain chain. For the threshold value in this embodiment, the average of all δ values may be chosen, without being limited thereto. Finally, the atoms corresponding to the vertices in the merged pocket primitives define the pocket πk. Note that there may be various other measures which can be used for merging the pocket primitives, and these measures can be easily computed once they are well defined. The internal angles at the edges of a ridge, the intrinsic shape of a pocket primitive, etc. can constitute such examples.
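The merge criterion may be sketched as follows, again purely for illustration. Because the embodiment does not fix how the average distance between two chains is computed, this example assumes that δk is the mean distance from the vertices of the ridge (including its two end vertices) to the straight edge eOk joining the two ends. The threshold is taken as the average of all δ values, and merged primitives are grouped with a union-find structure. All function names are hypothetical.

```python
import math


def point_segment_distance(p, a, b):
    """Euclidean distance from point p to the segment ab in 3D."""
    ab = [b[i] - a[i] for i in range(3)]
    ap = [p[i] - a[i] for i in range(3)]
    denom = sum(c * c for c in ab)
    t = 0.0 if denom == 0.0 else sum(ap[i] * ab[i] for i in range(3)) / denom
    t = max(0.0, min(1.0, t))  # clamp the projection onto the segment
    return math.sqrt(sum((p[i] - (a[i] + t * ab[i])) ** 2 for i in range(3)))


def ridge_delta(chain):
    """Assumed measure: mean distance of ridge vertices from the outer edge.

    chain: list of (x, y, z) points along the ridge; the first and last
    points are the end vertices shared with the outer edge.
    """
    a, b = chain[0], chain[-1]
    return sum(point_segment_distance(p, a, b) for p in chain) / len(chain)


def merge_primitives(num_primitives, ridges):
    """Group pocket primitives into pockets.

    ridges: list of (i, j, chain), where i and j index neighboring
    primitives and chain is their shared ridge.
    Returns a list of sets of primitive indices, ordered by smallest index.
    """
    deltas = [ridge_delta(chain) for _, _, chain in ridges]
    threshold = sum(deltas) / len(deltas) if deltas else 0.0
    parent = list(range(num_primitives))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for (i, j, _), delta in zip(ridges, deltas):
        if delta > threshold:  # ridge far below the outer edge: merge
            parent[find(i)] = find(j)

    groups = {}
    for x in range(num_primitives):
        groups.setdefault(find(x), set()).add(x)
    return sorted(groups.values(), key=min)
```

A different measure, such as one based on the internal angles along the ridge, could be substituted for ridge_delta without changing the grouping step.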
Once pockets are recognized, it is often necessary to evaluate the significance of the pockets. In other words, some recognized pockets may not be regarded as significant pockets. There are different criteria for such evaluation (e.g., the average or maximum depth of pocket from the entrance of the pocket, the volume of the pocket from the entrance, etc.). Note that these can be also easily computed.
BO can also be defined by a probe p with a radius rp<<∞. Then, the pockets resulting from such a BO are of a finer grain than those from B∞. Often, the pockets extracted from such a finer-grain BO can be more meaningful.
After pocket primitives are properly extracted, the ridges around all pocket primitives can be evaluated, wherein the appropriate pocket primitive pairs can be merged, if necessary, to form pockets.
In the above, a method has been presented to automatically recognize the pockets on the surface of proteins. In this method, a Euclidean Voronoi diagram of atoms is first computed, and beta-shapes corresponding to given probes are constructed from the Voronoi diagram. Two beta-shapes can be computed, namely, one for the inner definition and the other for the outer definition of surface atom sets. Then, the pocket primitives on the inner beta-shape, each of which corresponds to a face of the outer beta-shape, can be computed. After extracting the pocket primitives, the quality of the boundaries between neighboring pocket primitives can be evaluated to test whether two neighbors should be merged into a single pocket or not. Eventually, a few pockets remain on the surface of a receptor, wherein each pocket corresponds to an appropriately depressed region.
Although this preferred embodiment illustrates the present invention in the context of a biological field, the present invention can also be applied in various other fields (e.g., science and engineering fields). For example, the present invention can be applied in the fields of physics, chemistry, astronomy, computer graphics, etc.
Moreover, although the present invention has been mainly illustrated in terms of methods in the preferred embodiments, the present invention can be implemented as computer-readable codes that can be stored in a computer-readable medium, as mentioned above.
The present invention can also be easily practiced as a system by those of ordinary skill in the art.
The present invention can be applied in various disciplines including science and engineering, and more particularly in biological systems such as proteins without being limited thereto.
While the present invention has been shown and described with respect to preferred embodiments, those skilled in the art will recognize that various changes and modifications may be made without departing from the spirit and scope of the invention as defined in the appended claims.
This application is a U.S. national stage application filed under 35 U.S.C. §371 of International Patent Application PCT/KR2006/002424, accorded an international filing date of Jun. 22, 2006, which claims benefit under 35 U.S.C. 119(e) of U.S. Provisional Patent Application Ser. No. 60/692,790, filed Jun. 22, 2005, both of which are incorporated herein by reference in their entirety.
Filing Document | Filing Date | Country | Kind | 371c Date |
---|---|---|---|---|
PCT/KR2006/002424 | 6/22/2006 | WO | 00 | 12/21/2007 |
Publishing Document | Publishing Date | Country | Kind |
---|---|---|---|
WO2006/137710 | 12/28/2006 | WO | A |
Number | Name | Date | Kind |
---|---|---|---|
6119069 | McCauley et al. | Sep 2000 | A |
6178539 | Papadopoulou | Jan 2001 | B1 |
6384826 | Bern et al. | May 2002 | B1 |
20030083117 | Rupert et al. | May 2003 | A1 |
20050248567 | Kim | Nov 2005 | A1 |
Number | Date | Country |
---|---|---|
10-2003-0018160 | Mar 2003 | KR |
Number | Date | Country | |
---|---|---|---|
20100183226 A1 | Jul 2010 | US |
Number | Date | Country | |
---|---|---|---|
60692790 | Jun 2005 | US |