A portion of the disclosure of this patent document contains material that is subject to copyright protection. The copyright owner has no objection to the facsimile reproduction by anyone of the patent document or patent disclosure, as it appears in the Patent and Trademark Office patent file or records, but otherwise reserves all copyright rights whatsoever.
The present invention relates to two-, three- or four-dimensional structures of a polo domain. In particular, the invention relates to a crystal comprising a polo domain. The crystal may be useful for modeling and/or synthesizing mimetics of a polo domain or ligands that associate with the polo domain. Such mimetics or ligands may be capable of acting as modulators of activity of polo family kinases, and they may be useful for treating, inhibiting, or preventing diseases modulated by such kinases.
The Polo-like kinases (Plks) include S. cerevisiae Cdc5, S. pombe Plo1, Drosophila Polo, and the four mammalian genes Plk1, Prk/Fnk, Snk and Sak. The Plks play multiple and overlapping roles in cell cycle progression [reviewed in refs. 1-3]. Mutation of polo in Drosophila, plo1 in S. pombe, and cdc5 in S. cerevisiae causes mitotic defects including monopolar spindles, aberrant chromosome segregation, and failure of cytokinesis [4-8]. The targeted disruption of Sak in mouse is embryonic lethal at gastrulation with cells arresting in late stage mitosis and displaying failure of cytokinesis [9]. In S. cerevisiae, mitotic defects arising from the loss of cdc5 function can be rescued by the heterologous expression of mammalian Plk [10] or Prk/Fnk [11].
The Plks localize to characteristic mitotic structures during cell cycle progression, presumably to promote the interaction of the enzymes with specific substrates and effectors. Plk, Prk/Fnk, Cdc5, Plo1, Polo and Sak localize to centrosomes in early M phase and/or to the cleavage furrow or mother bud neck during cytokinesis [9, 12-17]. Mutational analyses of Cdc5 and Plk1 have demonstrated a requirement and sufficiency of the polo box motifs for sub-cellular localization [13-15]. In addition, these studies have demonstrated a requirement of proper sub-cellular localization for Plk family function. Interestingly, while most Plks possess two polo box motifs, the Sak orthologues possess only one. Since the sub-cellular localization of Sak conforms to that of the other Plks, the functional relevance of this difference remains to be determined.
Citation of documents herein is not intended as an admission that any of the documents cited herein is pertinent prior art, or an admission that the cited documents are considered material to the patentability of any of the claims of the present application. All statements as to the date or representation as to the contents of these documents are based on the information available to the applicant and do not constitute any admission as to the correctness of the dates or contents of these documents.
Applicants have solved the x-ray crystal structure of a polo domain. Solving the crystal structure has enabled the determination of structural features of the polo domain that permit the design of modulators of proteins comprising a polo domain. The crystal structure also enables the determination of structural features in molecules or ligands that interact or associate with the polo domain.
Knowledge of the conformation of the polo domain and binding pockets thereof is of significant utility in drug discovery. The association of natural substrates and effectors with a polo domain and binding pockets thereof may be the basis of many biological mechanisms. The associations may occur with all or any parts of a polo domain. An understanding of the association of a drug with the polo domain or part thereof will lead to the design and optimization of drugs having favorable associations with target polo family kinases and thus provide improved biological effects. Therefore, information about the shape and structure of the polo domain is valuable in designing potential modulators of proteins comprising a polo domain for use in treating diseases and conditions associated with or modulated by the proteins.
The present invention relates to a two-, three- or four-dimensional structure of a polo domain, or a binding pocket thereof.
The invention also relates to a crystal comprising a polo domain or binding pocket thereof.
The present invention also contemplates molecules or molecular complexes that comprise all or parts of one or more polo domains, or homologs thereof, that have similar structure and shape.
The present invention also provides a crystal comprising a polo domain or binding pocket thereof and at least one ligand. A ligand may be complexed or associated with a polo domain or binding pocket thereof. Ligands include a substrate or analogue thereof or effector. A ligand may be a modulator of the activity of a polo family kinase.
In an aspect the invention contemplates a crystal comprising a polo domain or binding pocket thereof complexed with a ligand (e.g. substrate or analogue thereof) from which it is possible to derive structural data for the ligand (e.g. substrate or analogue thereof).
The shape and structure of a polo domain or binding pocket thereof may be defined by selected atomic contacts in the domain or pocket. In an embodiment, the polo domain binding pocket is defined by one or more atomic interactions or enzyme atomic contacts.
An isolated polypeptide comprising a polo domain or binding pocket thereof with the shape and structure of a polo domain or binding pocket thereof described herein is also within the scope of the invention.
The invention also provides a method for preparing a crystal of the invention, preferably a crystal of a polo domain or binding pocket thereof, or a complex of such a domain or binding pocket thereof, and a ligand.
Crystal structures of the invention enable a model to be produced for a polo domain or binding pocket thereof, or complexes or parts thereof. The models will provide structural information about a polo domain, or a ligand and its interactions with a polo domain or binding pocket thereof. Models may also be produced for ligands. A model and/or the crystal structure of the present invention may be stored on a computer-readable medium.
A crystal and/or model of the invention may be used in a method of determining the secondary and/or tertiary structures of a polypeptide or binding pocket thereof with incompletely characterised structure. Thus, a method is provided for determining at least a portion of the secondary and/or tertiary structure of molecules or molecular complexes that contain at least some structurally similar features to a polo domain or binding pocket thereof of the invention. This is achieved by using at least some of the structural coordinates set out in Table 2.
A crystal of the invention may be useful for designing, modeling, identifying, evaluating, and/or synthesizing mimetics of a polo domain or binding pocket thereof, or ligands that associate with a binding pocket. Such mimetics or ligands may be capable of acting as modulators of polo kinase activity, and they may be useful for treating, inhibiting, or preventing conditions or diseases modulated by such kinases.
Thus the present invention contemplates a method of identifying a potential modulator of a polo family kinase comprising the step of applying the structural coordinates of a polo domain or binding pocket thereof, or atomic interactions, or atomic contacts thereof, to computationally evaluate a test compound for its ability to associate with the polo domain or binding pocket thereof, wherein a test compound that is found to associate with the polo domain or binding pocket thereof is a potential modulator. Use of the structural coordinates of a polo domain or binding pocket thereof, or atomic interactions, or atomic contacts thereof to design or identify a modulator is also provided.
The invention further contemplates classes of modulators of polo family kinases based on the shape and structure of a ligand defined in relation to the molecule's spatial association with a polo domain or binding pocket thereof. Generally, a method is provided for designing potential inhibitors of polo family kinases comprising the step of applying the structural coordinates of a ligand defined in relation to its spatial association with a polo domain or binding pocket thereof, to generate a compound that is capable of associating with the polo domain or binding pocket thereof.
It will be appreciated that a modulator of a polo family kinase may be identified by generating an actual secondary or three-dimensional model of a polo domain or binding pocket thereof, synthesizing a compound, and examining the components to find whether the required interaction occurs.
Therefore, the methods of the invention for identifying modulators may comprise one or more of the following additional steps:
Steps (a), (b), (c) and (d) may be carried out in any order, at different points in time, and they need not be sequential.
A potential modulator of a polo family kinase identified by a method of the present invention may be confirmed as a modulator by synthesizing the compound, and testing its effect on the polo family kinase in an assay for enzymatic activity. Such assays are known in the art (e.g. phosphorylation assays).
A modulator of the invention may be converted using customary methods into pharmaceutical compositions. A modulator may be formulated into a pharmaceutical composition containing a modulator either alone or together with other active substances.
The invention also contemplates a method of treating or preventing a disease or condition associated with polo family kinases in a cellular organism, comprising:
The invention provides for the use of a modulator identified by the methods of the invention in the preparation of a medicament to treat or prevent a disease in a cellular organism. Use of modulators of the invention to manufacture a medicament is also provided.
Still another aspect of the present invention provides a method of conducting a drug discovery business comprising:
In certain embodiments, the subject method can also include a step of establishing a distribution system for distributing the pharmaceutical preparation for sale, and may optionally include establishing a sales group for marketing the pharmaceutical preparation.
Yet another aspect of the invention provides a method of conducting a target discovery business comprising:
These and other aspects of the present invention will become evident upon reference to the following detailed description and Tables, and attached drawings.
The present invention will now be described only by way of example, in which reference will be made to the following Figures:
The present invention will now be described only by way of example, in which reference will be made to the following Tables:
Table 1 shows the data collection, structure determination and refinement statistics for the polo domain of Sak. The following is the legend for Table 1:
¹ Numbers in parentheses refer to data for the highest resolution shell (2.00-2.07 Å).
² Rsym = 100 × Σ|I − <I>| / Σ<I>, where I is the observed intensity and <I> is the average intensity from multiple observations of symmetry-related reflections.
³ Phasing power for isomorphous and anomalous acentric reflections, where phasing power = <[|Fh,c| / phase-integrated lack of closure]>.
⁴ Rfree was calculated with 10% of the data.
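The Rsym definition in the legend can be illustrated with a minimal sketch. It assumes that symmetry-related reflections have already been mapped to a common index, that reflections observed only once are skipped, and that the denominator counts <I> once per observation; the function name `r_sym` and the input layout are hypothetical and are not taken from the data processing actually used for Table 1.

```python
from collections import defaultdict


def r_sym(observations):
    """Return Rsym in percent for an iterable of (hkl, intensity) pairs.

    Assumption: symmetry-related reflections already share the same hkl key.
    """
    groups = defaultdict(list)
    for hkl, intensity in observations:
        groups[hkl].append(intensity)

    numerator = 0.0
    denominator = 0.0
    for intensities in groups.values():
        if len(intensities) < 2:
            # Assumption: a single observation gives no agreement information.
            continue
        mean_i = sum(intensities) / len(intensities)
        # Sum of |I - <I>| over every observation of this reflection.
        numerator += sum(abs(i - mean_i) for i in intensities)
        # <I> counted once per observation.
        denominator += mean_i * len(intensities)

    if denominator == 0.0:
        raise ValueError("no reflection was observed more than once")
    return 100.0 * numerator / denominator
```

For example, two observations of 100 and 110 for one reflection and two observations of 50 for another give a numerator of 10 and a denominator of 310.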
Table 2 shows the structural coordinates of a polo domain.
In Table 2, from the left, the second column identifies the atom number; the third identifies the atom type; the fourth identifies the amino acid type; the fifth identifies the chain name; the sixth identifies the residue number; the seventh identifies the x coordinates; the eighth identifies the y coordinates; the ninth identifies the z coordinates; the tenth identifies the occupancy; and the eleventh identifies the temperature factor.
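The column layout described above can be read programmatically. The following is a minimal sketch that assumes each row of Table 2 is whitespace-separated and that the first column is a record label; the class name `AtomRecord`, the function name `parse_row`, and the sample row are hypothetical and are not values from Table 2.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AtomRecord:
    atom_number: int           # second column
    atom_type: str             # third column
    residue_type: str          # fourth column (amino acid type)
    chain: str                 # fifth column
    residue_number: int        # sixth column
    x: float                   # seventh column
    y: float                   # eighth column
    z: float                   # ninth column
    occupancy: float           # tenth column
    temperature_factor: float  # eleventh column


def parse_row(line):
    """Parse one whitespace-separated coordinate row into an AtomRecord."""
    fields = line.split()
    if len(fields) < 11:
        raise ValueError("expected at least 11 columns, got %d" % len(fields))
    # fields[0] is assumed to be a record label and is ignored.
    return AtomRecord(
        atom_number=int(fields[1]),
        atom_type=fields[2],
        residue_type=fields[3],
        chain=fields[4],
        residue_number=int(fields[5]),
        x=float(fields[6]),
        y=float(fields[7]),
        z=float(fields[8]),
        occupancy=float(fields[9]),
        temperature_factor=float(fields[10]),
    )


# Hypothetical row for illustration only.
example = parse_row("ATOM 1 N ASP A 868 1.000 2.000 3.000 1.00 20.00")
```

Whitespace splitting is chosen for simplicity; a fixed-width layout would need column slicing instead.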
Glossary
Unless otherwise indicated, all terms used herein have the same meaning as they would to one skilled in the art of the present invention. Practitioners are particularly directed to Current Protocols in Molecular Biology (Ausubel) for definitions and terms of the art.
“Polo Family Kinase” refers to a member of a family of cell cycle regulators that have been shown to be important for progression through the cell cycle (Lane, H. A., Trends in Cell Biol. 1997, 7:63-68). The family contains the following related but distinct members:
The polo family kinases are characterized by a kinase domain and one or two conserved sequences in the noncatalytic C-terminal domain i.e. the polo domain.
A polo family kinase may be derivable from a variety of sources, including viruses, bacteria, fungi, plants and animals. In a preferred embodiment a polo family kinase is derivable from a mammal. For example, a polo family kinase may be a human Sak polo family kinase.
A polo family kinase in the present invention may be a wild type enzyme, or part thereof, or a mutant, variant or homolog, or part of such an enzyme.
The term “wild type” refers to a polypeptide having a primary amino acid sequence that is identical with the native enzyme (for example, the human enzyme).
The term “mutant” refers to a polypeptide having a primary amino acid sequence which differs from the wild type sequence by one or more amino acid additions, substitutions or deletions. Preferably, the mutant has at least 90% sequence identity with the wild type sequence. Preferably, the mutant has 20 mutations or less over the whole wild-type sequence. More preferably the mutant has 10 mutations or less, most preferably 5 mutations or less over the whole wild-type sequence.
The term “variant” refers to a naturally occurring polypeptide that differs from a wild-type sequence. A variant may be found within the same species (i.e. if there is more than one isoform of the enzyme) or may be found within a different species. Preferably the variant has at least 90% sequence identity with the wild type sequence. Preferably, the variant has 20 mutations or less over the whole wild-type sequence. More preferably, the variant has 10 mutations or less, most preferably 5 mutations or less over the whole wild-type sequence.
The term “part” indicates that the polypeptide comprises a fraction of the wild-type amino acid sequence. It may comprise one or more large contiguous sections of sequence or a plurality of small sections. The “part” may comprise a binding pocket as described herein. The polypeptide may also comprise other elements of sequence, for example, it may be a fusion protein with another protein (such as one which aids isolation or crystallisation of the polypeptide). Preferably the polypeptide comprises at least 50%, more preferably at least 65%, most preferably at least 80% of the wild-type sequence.
The term “homolog” means a polypeptide having a degree of homology with the wild-type amino acid sequence. The term “homology” refers to a degree of complementarity. There may be partial homology or complete homology. A sequence that is “substantially homologous” refers to a partially complementary sequence that at least partially inhibits an identical sequence from hybridizing to a target nucleic acid. Inhibition of hybridization of a completely complementary sequence to the target sequence may be examined using a hybridization assay (e.g. Southern or northern blot, solution hybridization, etc.) under conditions of reduced stringency. A sequence that is substantially homologous or a hybridization probe will compete for and inhibit the binding of a completely homologous sequence to the target sequence under conditions of reduced stringency. This is not to say that conditions of reduced stringency are such that non-specific binding is permitted, as reduced stringency conditions require that the binding of two sequences to one another be a specific (i.e., a selective) interaction. The absence of non-specific binding may be tested using a second target sequence which lacks even a partial degree of complementarity (e.g., less than about 30% homology or identity). The substantially homologous sequence or probe will not hybridize to the second non-complementary target sequence in the absence of non-specific binding.
The phrase “percent identity” or “% identity” refers to the percentage of sequence similarity found in a comparison of two or more amino acid sequences. Percent identity can be determined electronically using conventional programs, e.g., by using the MEGALIGN program (LASERGENE software package, DNASTAR). The MEGALIGN program can create alignments between two or more amino acid sequences according to different methods, e.g., the Clustal Method. (Higgins, D. G. and P. M. Sharp (1988) Gene 73:237-244.) Gaps of low or of no homology between the two amino acid sequences are not included in determining percentage similarity.
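The percent identity calculation can be sketched for two sequences that have already been aligned. This is a simplified illustration and not the MEGALIGN or Clustal algorithm; the function name `percent_identity` and the use of "-" as the gap character are assumptions. Consistent with the definition above, alignment columns containing a gap are excluded from the count.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Return percent identity over the non-gap columns of a pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # gapped columns are not included in the calculation
        compared += 1
        if a == b:
            matches += 1

    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared
```

Under this sketch, a mutant or variant with "at least 90% sequence identity" would be one for which the returned value is 90.0 or greater when compared against the wild-type sequence.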
In the present context, a homologous sequence is taken to include an amino acid sequence which may have at least 75, 85 or 90% identity, preferably at least 95 or 98% identity to the wild-type sequence. The homologs will comprise the same sites (for example, binding pockets) as the subject amino acid sequence.
A sequence for a polo family kinase or a polo domain or binding pocket thereof may have deletions, insertions or substitutions of amino acid residues which produce a silent change and result in a functionally equivalent enzyme. Deliberate amino acid substitutions may be made on the basis of similarity in polarity, charge, solubility, hydrophobicity, hydrophilicity, and/or the amphipathic nature of the residues as long as the secondary binding activity of the substance is retained. For example, negatively charged amino acids include aspartic acid and glutamic acid; positively charged amino acids include lysine and arginine; and amino acids with uncharged polar head groups having similar hydrophilicity values include leucine, isoleucine, valine, glycine, alanine, asparagine, glutamine, serine, threonine, phenylalanine, and tyrosine.
The polypeptides may also have a homologous substitution (substitution and replacement are both used herein to mean the interchange of an existing amino acid residue, with an alternative residue) i.e. like-for-like substitution such as basic for basic, acidic for acidic, polar for polar etc. Non-homologous substitution may also occur i.e. from one class of residue to another or alternatively involving the inclusion of unnatural amino acids such as ornithine (hereinafter referred to as Z), diaminobutyric acid ornithine (hereinafter referred to as B), norleucine ornithine (hereinafter referred to as O), pyridylalanine, thienylalanine, naphthylalanine and phenylglycine.
A “polo domain” refers to a domain comprising a polo motif that is a highly conserved sequence in the non-catalytic domain of polo family kinases.
In the present invention the polo domain may be a polo domain of Plk1, Polo, cdc5, Plx, Plo, Prk, Fnk, Snk, or Sak, preferably Sak.
“Binding pocket” refers to a region or site of a polo domain or molecular complex thereof that as a result of its shape, favorably associates with another region of the polo domain or polo family kinase, or with a ligand or a part thereof. For example, it may comprise a region responsible for binding a ligand. In an aspect, a binding pocket comprises a dimeric structure.
A “ligand” refers to a compound or entity that associates with a polo domain or binding pocket thereof including substrates or analogues or parts thereof, effectors, or modulators of polo family kinases, including inhibitors. A ligand may be designed rationally by using a model according to the present invention. For example, a ligand for Plk may be Golgi Reassembly Stacking Protein of 65 kDa (GRASP65) (Lin Cy et al, Proc. Natl. Acad. Sci. USA 2000, 7; 97(23): 12589-94), an α, β, or γ-tubulin (Feng, Y et al, Biochem J 1999 15;339 (Pt2): 435-42), human cytomegalovirus (HCMV) pp65 lower matrix protein (Gallina, A. et al J. Virol. 1999 73(2): 1468-78), peptidyl-prolyl isomerase (Pin1), septins [8], Spc72, Smc1, Smc3, IrrI [23], Bfa1 [25], Mid1p [26], cyclin B1, Scc1, Cdc16, Cdc27, MKLP-1, and Hsp90 [reviewed in ref. 1]. A ligand for Prk/Fnk and Snk may be Cib, a Ca2+ and integrin-binding protein.
The term “binding pocket” (BP) also includes a homolog of the binding pocket or a portion thereof. As used herein, the term “homolog” in reference to a binding pocket refers to a binding pocket or a portion thereof which may have deletions, insertions or substitutions of amino acid residues as long as the binding specificity is retained. In this regard, deliberate amino acid substitutions may be made on the basis of similarity in polarity, charge, solubility, hydrophobicity, hydrophilicity, and/or the amphipathic nature of the residues as long as the binding specificity of the binding pocket is retained.
As used herein, the term “portion thereof” means the structural coordinates corresponding to a sufficient number of amino acid residues of a binding pocket (or homologs thereof) that are capable of associating with a ligand. For example, the structural coordinates provided in a crystal structure may contain a subset of the amino acid residues in a binding pocket which may be useful in the modelling and design of compounds that bind to the binding pocket.
Crystal
The invention provides crystal structures. As used herein, the term “crystal” or “crystalline” means a structure (such as a three dimensional (3D) solid aggregate) in which the plane faces intersect at definite angles and in which there is a regular structure (such as internal structure) of the constituent chemical species. Thus, the term “crystal” can include any one of: a solid physical crystal form such as an experimentally prepared crystal, a crystal structure derivable from the crystal (including secondary and/or tertiary and/or quaternary structural elements), a 2D and/or 3D model based on the crystal structure, a representation thereof such as a schematic representation thereof or a diagrammatic representation thereof, or a data set thereof for a computer. In one aspect, the crystal is usable in X-ray crystallography techniques. Here, the crystals used can withstand exposure to X-ray beams used to produce the diffraction pattern data necessary to solve the X-ray crystallographic structure. A crystal of a polo domain or binding pocket may be characterized as being capable of diffracting x-rays in a pattern defined by one of the crystal forms depicted in Blundell et al 1976, Protein Crystallography, Academic Press.
The invention contemplates a crystal comprising a polo domain or binding pocket thereof of the invention.
In an embodiment, the invention relates to a crystal that is characterized as follows:
The crystal comprising two monomers (i.e. a dimer), preferably a crystal of the polo domain of Sak that is dimeric, may be further characterized by one or more of the following properties:
A crystal of the invention may comprise amino acid residues Asp 868 and Lys 906.
Preferably the atoms of the Asp 868 and Lys 906 amino acid residues have the structural coordinates as set out in Table 2.
In an embodiment, a crystal of a polo domain of the invention belongs to space group P3212. The term “space group” refers to the lattice and symmetry of the crystal. In a space group designation the capital letter indicates the lattice type and the other symbols represent symmetry operations that can be carried out on the contents of the asymmetric unit without changing its appearance.
A crystal of the invention may comprise a unit cell having the following unit dimensions: a=b=51.78 (±0.05) Å, c=146.94 (±0.05) Å. The term “unit cell” refers to the smallest and simplest volume element (i.e. parallelepiped-shaped block) of a crystal that is completely representative of the unit of pattern of the crystal. The unit cell axial lengths are represented by a, b, and c. Those of skill in the art understand that a set of atomic coordinates determined by X-ray crystallography is not without standard error.
In a preferred embodiment, a crystal of the invention has the structural coordinates as shown in Table 2. As used herein, the term “structural coordinates” refers to a set of values that define the position of one or more amino acid residues with reference to a system of axes. The term refers to a data set that defines the three dimensional structure of a molecule or molecules (e.g. Cartesian coordinates, temperature factors, and occupancies). Structural coordinates can be slightly modified and still render nearly identical three dimensional structures. A measure of a unique set of structural coordinates is the root-mean-square deviation of the resulting structure. Structural coordinates that render three dimensional structures (in particular a three dimensional structure of a ligand binding pocket) that deviate from one another by a root-mean-square deviation of less than 5 Å, 4 Å, 3 Å, 2 Å, or 1.5 Å may be viewed by a person of ordinary skill in the art as very similar.
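The root-mean-square deviation referred to above can be computed as in the following minimal sketch. It assumes that the two coordinate sets list equivalent atoms in the same order and have already been superimposed; no optimal superposition is performed here. The function name `rmsd` is hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Return the root-mean-square deviation (in the input units, e.g. angstroms)
    between two equally long lists of (x, y, z) tuples."""
    if len(coords_a) != len(coords_b) or len(coords_a) == 0:
        raise ValueError("coordinate sets must be non-empty and of equal length")

    squared_sum = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared_sum += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared_sum / len(coords_a))
```

The returned value can then be compared against one of the thresholds listed above (for example, less than 1.5 Å) to decide whether two structures are to be regarded as very similar.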
Variations in structural coordinates may be generated because of mathematical manipulations of the structural coordinates of a polo domain described herein. For example, the structural coordinates of Table 2 may be manipulated by crystallographic permutations of the structural coordinates, fractionalization of the structural coordinates, integer additions or subtractions to sets of the structural coordinates, inversion of the structural coordinates or any combination of the above.
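Fractionalization and integer additions can be illustrated with a minimal sketch using the unit cell lengths given above. The cell angles are an assumption: α=β=90° and γ=120° are used because they are conventional for a trigonal space group, but they are not stated in the text. The axis convention (a along x, b in the xy plane) and the function names are likewise assumptions.

```python
import math

# Unit cell lengths from the description, in angstroms.
A = B = 51.78
C = 146.94
# Assumption: gamma = 120 degrees, alpha = beta = 90 degrees.
GAMMA = math.radians(120.0)


def to_fractional(xyz):
    """Convert Cartesian (x, y, z) in angstroms to fractional (u, v, w)."""
    x, y, z = xyz
    v = y / (B * math.sin(GAMMA))
    u = (x - B * math.cos(GAMMA) * v) / A
    w = z / C
    return (u, v, w)


def to_cartesian(uvw):
    """Convert fractional (u, v, w) back to Cartesian (x, y, z) in angstroms."""
    u, v, w = uvw
    x = A * u + B * math.cos(GAMMA) * v
    y = B * math.sin(GAMMA) * v
    z = C * w
    return (x, y, z)


def translate(uvw, shift):
    """Add integer lattice translations to fractional coordinates."""
    return tuple(f + n for f, n in zip(uvw, shift))
```

Because every atom is shifted by the same lattice vector, applying `translate` to all atoms of a structure changes the numerical coordinates but leaves all interatomic distances, and therefore the structure, unchanged.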
Variations in the crystal structure due to mutations, additions, substitutions, and/or deletions of the amino acids, or other changes in any of the components that make up the crystal may also account for modifications in structural coordinates. If such modifications are within an acceptable standard error as compared to the original structural coordinates, the resulting structure may be the same. Therefore, a ligand that bound to a polo domain or binding pocket thereof, would also be expected to bind to another polo domain or binding pocket whose structural coordinates defined a shape that fell within the acceptable error. Such modified structures of a polo domain or binding pocket thereof are also within the scope of the invention.
Various computational analyses may be used to determine whether a molecule or the binding pocket thereof is sufficiently similar to all or parts of a polo domain or binding pocket thereof. Such analyses may be carried out using conventional software applications and methods as described herein.
A crystal of the invention may also be specifically characterised by the parameters, diffraction statistics and/or refinement statistics set out in Table 1.
With reference to a crystal of the present invention, residues in a binding pocket may be defined by their spatial proximity to a ligand in the crystal structure. For example, a binding pocket may be defined by its proximity to a modulator.
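Defining pocket residues by spatial proximity can be sketched as a simple distance filter. The 4.0 Å cutoff, the function name `pocket_residues`, and the input layout are assumptions made for illustration; the text does not specify a cutoff distance.

```python
import math


def pocket_residues(protein_atoms, ligand_atoms, cutoff=4.0):
    """Return the set of residue labels having at least one atom within
    `cutoff` angstroms of any ligand atom.

    protein_atoms: iterable of (residue_label, (x, y, z)) pairs.
    ligand_atoms: iterable of (x, y, z) tuples.
    """
    ligand_atoms = list(ligand_atoms)
    selected = set()
    for residue, coord in protein_atoms:
        if residue in selected:
            continue  # residue already qualifies through another atom
        if any(math.dist(coord, lig) <= cutoff for lig in ligand_atoms):
            selected.add(residue)
    return selected
```

With hypothetical coordinates in which one atom of Asp 868 lies 3 Å from a ligand atom and one atom of Lys 906 lies 7 Å away, only Asp 868 would be returned at the default cutoff.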
A crystal or secondary or three-dimensional structure of a polo domain or binding pocket thereof may be more specifically defined by one or more of the atomic contacts or atomic interactions in the crystal (e.g. between Asp 868 and Lys 906). An atomic interaction can be defined by an atomic contact (more preferably, a specific atom of an amino acid residue where indicated) on the polo domain, and an atomic contact (more preferably, a specific atom of an amino acid residue where indicated) on the polo domain or ligand.
Illustrations of particular crystals of the invention are shown in
A crystal of the invention includes a polo domain or binding pocket thereof in association with one or more moieties, including heavy-metal atoms i.e. a derivative crystal, or one or more ligands or molecules i.e. a co-crystal.
The term “associate”, “association” or “associating” refers to a condition of proximity between a moiety (i.e. chemical entity or compound or portions or fragments thereof), and a polo domain or binding pocket thereof. The association may be non-covalent i.e. where the juxtaposition is energetically favored by for example, hydrogen-bonding, van der Waals, or electrostatic or hydrophobic interactions, or it may be covalent.
The term “heavy-metal atom” refers to an atom that can be used to solve an x-ray crystallography phase problem, including but not limited to a transition element, a lanthanide metal, or an actinide metal. Lanthanide metals include elements with atomic numbers between 57 and 71, inclusive. Actinide metals include elements with atomic numbers between 89 and 103, inclusive.
Multiwavelength anomalous diffraction (MAD) phasing may be used to solve protein structures using selenomethionyl (SeMet) proteins. Therefore, a complex of the invention may comprise a crystalline polo domain or binding pocket with selenium on the methionine residues of the protein.
A crystal may comprise a complex between a polo domain or binding pocket thereof and one or more ligands or molecules. In other words the polo domain or binding pocket may be associated with one or more ligands or molecules in the crystal. The ligand may be any compound that is capable of stably and specifically associating with the polo domain or binding pocket. A ligand may, for example, be a modulator of a polo family kinase or another polo family kinase, in particular a polo domain of another polo family kinase.
In an embodiment of the invention, a binding pocket is in association with a cofactor in the crystal. A “cofactor” refers to a molecule required for enzyme activity and/or stability. For example, the cofactor may be a metal ion.
Therefore, the present invention also provides:
A structure of a complex of the invention may be defined by selected intermolecular contacts.
A crystal of the invention may enable the determination of structural data for a ligand. In order to be able to derive structural data for a ligand, it is necessary for the molecule to have sufficiently strong electron density to enable a model of the molecule to be built using standard techniques. For example, there should be sufficient electron density to allow a model to be built using XTALVIEW (McRee 1992 J. Mol. Graphics. 10 44-46).
Method of Making a Crystal
The present invention also provides a method of making a crystal according to the invention. The crystal may be formed from an aqueous solution comprising a purified polypeptide comprising a polo domain, in particular a polo family kinase or part or fragment thereof (e.g. a binding pocket). A method may utilize a purified polypeptide comprising a binding pocket to form a crystal. For example, amino acid residues 839 to 925 of murine Sak may be used to prepare a polo domain structure of the invention.
The term “purified” in reference to a polypeptide does not require absolute purity such as a homogeneous preparation; rather, it represents an indication that the polypeptide is relatively purer than in the natural environment. Generally, a purified polypeptide is substantially free of other proteins, lipids, carbohydrates, or other materials with which it is naturally associated, preferably at a functionally significant level, for example at least 85% pure, more preferably at least 95% pure, most preferably at least 99% pure. A skilled artisan can purify a polypeptide using standard techniques for protein purification. A substantially pure polypeptide will yield a single major band on a non-reducing polyacrylamide gel. Purity of the polypeptide can also be determined by amino-terminal amino acid sequence analysis.
A polypeptide used in the method may be chemically synthesized in whole or in part using techniques that are well-known in the art. Alternatively, methods are well known to the skilled artisan to construct expression vectors containing a native or mutated polo family kinase coding sequence and appropriate transcriptional/translational control signals. These methods include in vitro recombinant DNA techniques, synthetic techniques, and in vivo recombination/genetic recombination. See for example the techniques described in Sambrook et al. (Molecular Cloning: A Laboratory Manual, 2nd Edition, Cold Spring Harbor Laboratory Press (1989)), and other laboratory textbooks. (See also Sarker et al, Glycoconjugate J. 7:380, 1990; Sarker et al, Proc. Natl. Acad. Sci. USA 88:234-238, 1991; Sarker et al, Glycoconjugate J. 11: 204-209, 1994; Hull et al, Biochem Biophys Res Commun 176:608, 1991 and Pownall et al, Genomics 12:699-704, 1992).
Crystals may be grown from an aqueous solution containing the purified polypeptide by a variety of conventional processes. These processes include batch, liquid bridge, dialysis, vapor diffusion, and hanging drop methods. (See for example, McPherson, 1982, John Wiley, New York; McPherson, 1990, Eur. J. Biochem. 189: 1-23; Weber, 1991, Adv. Protein Chem. 41:1-36). Generally, native crystals of the invention are grown by adding precipitants to the concentrated solution of the polypeptide. The precipitants are added at a concentration just below that necessary to precipitate the protein. Water is removed by controlled evaporation to produce precipitating conditions, which are maintained until crystal growth ceases.
Derivative crystals of the invention can be obtained by soaking native crystals in a solution containing salts of heavy metal atoms. A complex of the invention can be obtained by soaking a native crystal in a solution containing a compound that binds the polypeptide, or by co-crystallizing the polypeptide in the presence of one or more compounds. To obtain co-crystals with a compound that binds deep within the tertiary structure of the polypeptide, it is necessary to use the second method, co-crystallization.
Once the crystal is grown, it can be placed in a glass capillary tube and mounted onto a holding device connected to an X-ray generator and an X-ray detection device. The collection of X-ray diffraction patterns is well documented by those skilled in the art (see for example, Ducruix and Giegé, 1992, IRL Press, Oxford, England). A beam of X-rays enters the crystal and is diffracted by it. An X-ray detection device can be utilized to record the diffraction patterns emanating from the crystal. Suitable devices include the Marr 345 imaging plate detector system with an RU200 rotating anode generator.
Multiwavelength anomalous diffraction (MAD) phasing using selenomethionyl (SeMet) proteins may be used to determine the structure of a crystal of the invention. Thus, the invention contemplates a method for determining a crystal structure of the invention using a selenomethionyl derivative of a polo domain or a binding pocket thereof.
Methods for obtaining the three dimensional structure of the crystalline form of a molecule or complex are described herein and known to those skilled in the art (see Ducruix and Giegé, 1992, IRL Press, Oxford, England). Generally, the X-ray crystal structure is given by the diffraction patterns. Each diffraction pattern reflection is characterized as a vector, and the data collected at this stage determine the amplitude of each vector. The phases of the vectors may be determined by the isomorphous replacement method, where heavy atoms soaked into the crystal are used as reference points in the X-ray analysis (see for example, Otwinowski, 1991, Daresbury, United Kingdom, 80-86). The phases of the vectors may also be determined by molecular replacement (see for example, Naraza, 1994, Proteins 11:281-296). The amplitudes and phases of vectors from the crystalline form, determined in accordance with these methods, can be used to analyze other related crystalline polypeptides.
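The description of each reflection as a vector with an amplitude and a phase can be illustrated with a minimal sketch. It assumes a one-dimensional unit cell of length 1 containing two point atoms; the scattering weights, the positions, and the function name `structure_factor` are hypothetical and are not taken from the disclosure.

```python
import cmath
import math

def structure_factor(h, atoms):
    """Complex structure factor F(h) for a 1-D unit cell of length 1.

    atoms: list of (scattering_weight, fractional_position) pairs.
    """
    return sum(f * cmath.exp(2j * math.pi * h * x) for f, x in atoms)

# Hypothetical two-atom cell (weights and positions are made up).
atoms = [(6.0, 0.10), (8.0, 0.35)]
# The same cell with its origin shifted by 0.2.
shifted = [(f, x + 0.2) for f, x in atoms]

for h in (1, 2, 3):
    F = structure_factor(h, atoms)
    G = structure_factor(h, shifted)
    # A detector records intensity, proportional to |F|^2, so the
    # amplitude is measured but the phase is not.
    print(h, round(abs(F), 3), round(cmath.phase(F), 3), round(cmath.phase(G), 3))
```

In this toy case the amplitudes of `F` and `G` agree for every `h` while their phases differ, which is one way to see why the phases have to be obtained separately, for example by isomorphous replacement or molecular replacement.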
The unit cell dimensions and symmetry, and vector amplitude and phase information can be used in a Fourier transform function to calculate the electron density in the unit cell i.e. to generate an experimental electron density map. This may be accomplished using the PHASES package (Furey, 1990). Amino acid sequence structures are fit to the experimental electron density map (i.e. model building) using computer programs (e.g. Jones, T A. et al, Acta Crystallogr A47, 100-119, 1991). This structure can also be used to calculate a theoretical electron density map. The theoretical and experimental electron density maps can be compared and the agreement between the maps can be described by a parameter referred to as R-factor. A high degree of overlap in the maps is represented by a low value R-factor. The R-factor can be minimized by using computer programs that refine the structure to achieve agreement between the theoretical and observed electron density map. For example, the XPLOR program, developed by Brunger (1992, Nature 355:472-475) can be used for model refinement.
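The two calculations described above, Fourier synthesis of a density map and the R-factor, can be sketched as follows. This is an illustration under stated assumptions, not the PHASES or XPLOR implementation: the cell is one-dimensional with unit length, the atoms are hypothetical points, and the R-factor is written in its conventional form on structure factor amplitudes, R = Σ| |Fo| − |Fc| | / Σ|Fo|. The function names are invented for the example.

```python
import cmath
import math

def fourier_synthesis(structure_factors, x):
    """Density at fractional position x in a 1-D cell of length 1.

    structure_factors maps index h to complex F(h) and must contain
    Friedel pairs (F(-h) is the conjugate of F(h)) so the sum is real.
    """
    total = sum(F * cmath.exp(-2j * math.pi * h * x)
                for h, F in structure_factors.items())
    return total.real

def r_factor(f_obs, f_calc):
    """Conventional residual on amplitudes: sum||Fo| - |Fc|| / sum|Fo|."""
    return sum(abs(o - c) for o, c in zip(f_obs, f_calc)) / sum(f_obs)

# Hypothetical point atoms: (scattering weight, fractional position).
atoms = [(6.0, 0.10), (8.0, 0.35)]
sf = {h: sum(f * cmath.exp(2j * math.pi * h * x) for f, x in atoms)
      for h in range(-10, 11)}

# Sample the map on a grid and locate its highest peak.
grid = [i / 100 for i in range(100)]
density = [fourier_synthesis(sf, x) for x in grid]
peak = grid[density.index(max(density))]
print("highest density at x =", peak)

# Lower R means closer agreement between observed and calculated amplitudes.
print(r_factor([10.0, 20.0], [9.0, 22.0]))
```

The highest peak falls at the position of the heavier of the two hypothetical atoms, and the R-factor drops to zero when the calculated amplitudes match the observed ones exactly.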
A three dimensional structure of the molecule or complex may be described by atoms that fit the theoretical electron density characterized by a minimum R value. Files can be created for the structure that define each atom by coordinates in three dimensions.
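As an illustration of a file that defines each atom by three coordinates, the sketch below writes and reads a single fixed-column ATOM record. It assumes the widely used Protein Data Bank column layout; the disclosure does not state which file format is used, and the atom shown is a made-up placeholder rather than an entry from Table 2.

```python
from typing import NamedTuple

class Atom(NamedTuple):
    serial: int
    name: str
    res_name: str
    chain: str
    res_seq: int
    x: float
    y: float
    z: float

def format_atom(a: Atom) -> str:
    """Write one ATOM record using PDB-style fixed columns."""
    # Names of atoms with one-letter element symbols start in column 14.
    name = a.name if len(a.name) == 4 else " " + a.name
    return ("ATOM  {:>5} {:<4} {:>3} {:1}{:>4}    {:8.3f}{:8.3f}{:8.3f}"
            .format(a.serial, name, a.res_name, a.chain, a.res_seq,
                    a.x, a.y, a.z))

def parse_atom(line: str) -> Atom:
    """Read the same fixed columns back (0-based slices)."""
    return Atom(
        serial=int(line[6:11]),
        name=line[12:16].strip(),
        res_name=line[17:20].strip(),
        chain=line[21],
        res_seq=int(line[22:26]),
        x=float(line[30:38]),
        y=float(line[38:46]),
        z=float(line[46:54]),
    )

# Placeholder atom: the values are invented for the example.
example = Atom(1, "CA", "ALA", "A", 1, 12.345, -6.789, 0.5)
record = format_atom(example)
print(record)
print(parse_atom(record))
```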
Model
A crystal structure of the present invention may be used to make a model of a polo domain or binding pocket thereof. A model may, for example, be a structural model or a computer model. A model may represent the secondary, tertiary and/or quaternary structure of the binding pocket. The model itself may be in two or three dimensions. It is possible for a computer model to be in three dimensions despite the constraints imposed by a conventional computer screen, if it is possible to scroll along at least a pair of axes, causing “rotation” of the image.
As used herein, the term “modelling” includes the quantitative and qualitative analysis of molecular structure and/or function based on atomic structural information and interaction models. The term “modelling” includes conventional numeric-based molecular dynamic and energy minimization models, interactive computer graphic models, modified molecular mechanics models, distance geometry and other structure-based constraint models.
Preferably, modelling is performed using a computer and may be further optimized using known methods. This is called modelling optimisation.
An integral step in an approach of the invention for designing modulators of a subject polo domain involves construction of computer graphics models of the domain which can be used to design pharmacophores by rational drug design. For instance, for a modulator to interact optimally with the subject domain, it will generally be desirable that it have a shape which is at least partly complementary to that of a particular binding pocket of the domain, as for example those portions of the domain which are involved in recognition of a ligand. Additionally, other factors, including electrostatic interactions, hydrogen bonding, hydrophobic interactions, desolvation effects, and cooperative motions of ligand and domain, all influence the binding effect and should be taken into account in attempts to design bioactive modulators.
As described herein, a computer-generated molecular model of the subject polo domain can be created. In preferred embodiments, at least the Cα-carbon positions of the polo domain sequence of interest are mapped to a particular coordinate pattern, such as the coordinates for a polo domain shown in Table 2, by homology modeling, and the structure of the protein and velocities of each atom are calculated at a simulation temperature (To) at which the docking simulation is to be determined. Typically, such a protocol involves primarily the prediction of side-chain conformations in the modeled domain, while assuming a main-chain trace taken from a tertiary structure such as provided in Table 2 and the Figures. Computer programs for performing energy minimization routines are commonly used to generate molecular models. For example, both the CHARMM (Brooks et al. (1983) J Comput Chem 4:187-217) and AMBER (Weiner et al (1981) J. Comput. Chem. 106: 765) algorithms handle all of the molecular system setup, force field calculation, and analysis (see also, Eisenfield et al. (1991) Am J Physiol 261:C376-386; Lybrand (1991) J Pharm Belg 46:49-54; Froimowitz (1990) Biotechniques 8:640-644; Burbam et al. (1990) Proteins 7:99-111; Pedersen (1985) Environ Health Perspect 61:185-190; and Kini et al. (1991) J Biomol Struct Dyn 9:475-488). At the heart of these programs is a set of subroutines that, given the position of every atom in the model, calculate the total potential energy of the system and the force on each atom. These programs may utilize a starting set of atomic coordinates, such as the coordinates provided in Table 2, the parameters for the various terms of the potential energy function, and a description of the molecular topology (the covalent structure). 
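The role of the core subroutines described above, which take the atomic positions and return the total potential energy and the force on each atom, can be sketched with a single energy term. The example assumes only harmonic bond-stretch terms, E = Σ ½k(r − r0)²; real force fields such as those in CHARMM and AMBER include further terms, and the function name, force constant, and geometry here are hypothetical.

```python
import math

def bond_energy_and_forces(coords, bonds):
    """Total harmonic bond energy and the force on every atom.

    coords: list of [x, y, z] positions.
    bonds:  list of (i, j, k, r0) with force constant k and
            equilibrium length r0 (this is the 'topology').
    """
    energy = 0.0
    forces = [[0.0, 0.0, 0.0] for _ in coords]
    for i, j, k, r0 in bonds:
        d = [coords[j][a] - coords[i][a] for a in range(3)]
        r = math.sqrt(sum(c * c for c in d))
        energy += 0.5 * k * (r - r0) ** 2
        # dE/dr = k (r - r0); the force is minus the gradient.
        g = k * (r - r0) / r
        for a in range(3):
            forces[j][a] -= g * d[a]
            forces[i][a] += g * d[a]
    return energy, forces

# Two atoms 2.0 apart joined by a bond whose rest length is 1.5.
coords = [[0.0, 0.0, 0.0], [2.0, 0.0, 0.0]]
bonds = [(0, 1, 100.0, 1.5)]
energy, forces = bond_energy_and_forces(coords, bonds)
print(energy, forces)
```

The stretched bond pulls the two atoms toward each other with equal and opposite forces, so the net force on the pair is zero.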
Common features of such molecular modeling methods include: provisions for handling hydrogen bonds and other constraint forces; the use of periodic boundary conditions; and provisions for occasionally adjusting positions, velocities, or other parameters in order to maintain or change temperature, pressure, volume, forces of constraint, or other externally controlled conditions.
Most conventional energy minimization methods use the input data described above and the fact that the potential energy function is an explicit, differentiable function of Cartesian coordinates, to calculate the potential energy and its gradient (which gives the force on each atom) for any set of atomic positions. This information can be used to generate a new set of coordinates in an effort to reduce the total potential energy and, by repeating this process over and over, to optimize the molecular structure under a given set of external conditions. These energy minimization methods are routinely applied to molecules similar to the subject polo domain.
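The iterative scheme described above, in which the energy and its gradient are used to generate new coordinates of lower energy, can be sketched as a steepest-descent loop. The example assumes a Lennard-Jones pair potential in reduced units and a simple adaptive step size; both are illustrative choices, not the method of any particular program.

```python
import math

def lj_energy_and_gradient(coords, epsilon=1.0, sigma=1.0):
    """Lennard-Jones energy and its gradient in Cartesian coordinates."""
    energy = 0.0
    grad = [[0.0, 0.0, 0.0] for _ in coords]
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            d = [coords[i][a] - coords[j][a] for a in range(3)]
            r2 = sum(c * c for c in d)
            s6 = (sigma * sigma / r2) ** 3
            energy += 4.0 * epsilon * (s6 * s6 - s6)
            # (dE/dr) / r, so that multiplying by d gives the gradient.
            coeff = 4.0 * epsilon * (-12.0 * s6 * s6 + 6.0 * s6) / r2
            for a in range(3):
                grad[i][a] += coeff * d[a]
                grad[j][a] -= coeff * d[a]
    return energy, grad

def minimize(coords, max_steps=500, step=0.01, tol=1e-8):
    """Steepest descent: accept downhill moves, shrink the step otherwise."""
    coords = [list(c) for c in coords]
    energy, grad = lj_energy_and_gradient(coords)
    for _ in range(max_steps):
        if sum(g * g for row in grad for g in row) < tol:
            break
        trial = [[c - step * g for c, g in zip(rc, rg)]
                 for rc, rg in zip(coords, grad)]
        e_trial, g_trial = lj_energy_and_gradient(trial)
        if e_trial < energy:
            coords, energy, grad = trial, e_trial, g_trial
            step *= 1.2
        else:
            step *= 0.5
    return coords, energy

final, e_min = minimize([[0.0, 0.0, 0.0], [1.5, 0.0, 0.0]])
print(math.dist(final[0], final[1]), e_min)
```

For two atoms the minimum is known analytically (separation 2^(1/6)·σ, energy −ε), which gives a convenient check on the loop.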
In general, energy minimization methods can be carried out for a given temperature, Ti, which may be different from the docking simulation temperature, To. Upon energy minimization of the molecule at Ti, coordinates and velocities of all the atoms in the system are computed. Additionally, the normal modes of the system are calculated. It will be appreciated by those skilled in the art that each normal mode is a collective, periodic motion, with all parts of the system moving in phase with each other, and that the motion of the molecule is the superposition of all normal modes. For a given temperature, the mean square amplitude of motion in a particular mode is inversely proportional to the effective force constant for that mode, so that the motion of the molecule will often be dominated by the low frequency vibrations.
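The inverse relationship between mean square amplitude and force constant can be made concrete with a short sketch. It assumes a classical harmonic mode obeying equipartition, so that the mean square amplitude is k_B·T / k, with k = m·ω² for effective mass m and angular frequency ω; the numerical inputs are arbitrary.

```python
K_B = 1.380649e-23  # Boltzmann constant in J/K

def force_constant(effective_mass, angular_frequency):
    """Effective force constant of a harmonic mode, k = m * omega^2."""
    return effective_mass * angular_frequency ** 2

def mean_square_amplitude(k, temperature):
    """Classical equipartition estimate, <x^2> = k_B * T / k."""
    return K_B * temperature / k

# Arbitrary illustrative values: same effective mass, two frequencies.
mass = 1.0e-26            # kg
slow, fast = 1.0e12, 1.0e13   # rad/s
ratio = (mean_square_amplitude(force_constant(mass, slow), 300.0)
         / mean_square_amplitude(force_constant(mass, fast), 300.0))
print(ratio)
```

Under these assumptions a mode with one tenth the frequency has about one hundred times the mean square amplitude, which is why low frequency modes tend to dominate the motion.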
After the molecular model has been energy minimized at Ti, the system is “heated” or “cooled” to the simulation temperature, To, by carrying out an equilibration run where the velocities of the atoms are scaled in a step-wise manner until the desired temperature, To, is reached. The system is further equilibrated for a specified period of time until certain properties of the system, such as average kinetic energy, remain constant. The coordinates and velocities of each atom are then obtained from the equilibrated system.
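The step-wise velocity scaling described above can be sketched as follows. The example assumes reduced units with k_B = 1, defines the instantaneous temperature from the kinetic energy as T = 2·KE / (3N·k_B), and omits the dynamics that a real equilibration run would perform between scaling stages. All names and values are hypothetical.

```python
import math
import random

def temperature(masses, velocities, k_b=1.0):
    """Instantaneous temperature from the total kinetic energy."""
    kinetic = 0.5 * sum(m * sum(c * c for c in v)
                        for m, v in zip(masses, velocities))
    return 2.0 * kinetic / (3 * len(masses) * k_b)

def rescale(masses, velocities, target, k_b=1.0):
    """Scale every velocity by sqrt(T_target / T_current)."""
    factor = math.sqrt(target / temperature(masses, velocities, k_b))
    return [[factor * c for c in v] for v in velocities]

def equilibrate_to(masses, velocities, target, n_stages=5, k_b=1.0):
    """Move from the current temperature to the target in equal stages."""
    start = temperature(masses, velocities, k_b)
    for stage in range(1, n_stages + 1):
        stage_target = start + (target - start) * stage / n_stages
        velocities = rescale(masses, velocities, stage_target, k_b)
        # A real run would integrate the equations of motion here
        # before moving on to the next stage.
    return velocities

random.seed(0)
masses = [12.0, 1.0, 16.0, 14.0]
velocities = [[random.gauss(0.0, 1.0) for _ in range(3)] for _ in masses]
heated = equilibrate_to(masses, velocities, target=3.0)
print(temperature(masses, velocities), temperature(masses, heated))
```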
Further energy minimization routines can also be carried out. For example, a second class of methods involves calculating approximate solutions to the constrained equations of motion (EOM) for the protein. These methods use an iterative approach to solve for the Lagrange multipliers and, typically, need only a few iterations if the corrections required are small. The most popular method of this type, SHAKE (Ryckaert et al. (1977) J Comput Phys 23:327; and Van Gunsteren et al. (1977) Mol Phys 34:1311), is easy to implement and scales as O(N) as the number of constraints increases. Therefore, the method is applicable to molecules such as the polo domains of the present invention. An alternative method, RATTLE (Andersen (1983) J Comput Phys 52:24), is based on the velocity version of the Verlet algorithm. Like SHAKE, RATTLE is an iterative algorithm and can be used to energy minimize the model of the subject protein.
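The iterative solution for the Lagrange multipliers can be illustrated with a minimal SHAKE-style sketch. It assumes fixed-distance constraints between pairs of atoms and corrects a set of positions obtained from an unconstrained update, using the pre-update bond vectors as correction directions; the function name, tolerance, and coordinates are hypothetical.

```python
import math

def shake(old, new, masses, constraints, tol=1e-10, max_iter=500):
    """Iteratively restore fixed distances after an unconstrained update.

    old, new:    positions before and after the update ([x, y, z] lists).
    constraints: list of (i, j, d) requiring |r_i - r_j| = d.
    """
    new = [list(p) for p in new]
    for _ in range(max_iter):
        converged = True
        for i, j, d in constraints:
            s = [new[i][a] - new[j][a] for a in range(3)]
            diff = d * d - sum(c * c for c in s)
            if abs(diff) > tol:
                converged = False
                r = [old[i][a] - old[j][a] for a in range(3)]
                # Linearised multiplier for this constraint.
                g = diff / (2.0 * (1.0 / masses[i] + 1.0 / masses[j])
                            * sum(sa * ra for sa, ra in zip(s, r)))
                for a in range(3):
                    new[i][a] += g * r[a] / masses[i]
                    new[j][a] -= g * r[a] / masses[j]
        if converged:
            return new
    raise RuntimeError("constraints did not converge")

# Three-atom chain with both bonds constrained to length 1.0.
old = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [2.0, 0.0, 0.0]]
moved = [[0.0, 0.05, 0.0], [1.1, -0.05, 0.0], [2.05, 0.1, 0.0]]
masses = [12.0, 1.0, 12.0]
fixed = shake(old, moved, masses, [(0, 1, 1.0), (1, 2, 1.0)])
print(math.dist(fixed[0], fixed[1]), math.dist(fixed[1], fixed[2]))
```

Because each correction moves the two atoms in opposite directions in inverse proportion to their masses, the centre of mass of the system is left unchanged.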
Overlays and superpositioning with a three dimensional model of a polo domain or binding pocket thereof of the invention may be used for modelling optimisation. Additionally, alignment and/or modelling can be used as a guide for the placement of mutations on a polo domain or binding pocket thereof to characterize the nature of the site in the context of a cell.
The three dimensional structure of a new crystal may be modelled using molecular replacement. The term “molecular replacement” refers to a method that involves generating a preliminary model of a molecule or complex whose structure coordinates are unknown, by orienting and positioning a molecule whose structure coordinates are known within the unit cell of the unknown crystal, so as best to account for the observed diffraction pattern of the unknown crystal. Phases can then be calculated from this model and combined with the observed amplitudes to give an approximate Fourier synthesis of the structure whose coordinates are unknown. This, in turn, can be subject to any of the several forms of refinement to provide a final, accurate structure of the unknown crystal. Lattman, E., “Use of the Rotation and Translation Functions”, in Methods in Enzymology, 115, pp. 55-77 (1985); M. G. Rossmann, ed., “The Molecular Replacement Method”, Int. Sci. Rev. Ser., No. 13, Gordon & Breach, New York, (1972).
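The combination of model phases with observed amplitudes can be sketched in a few lines. The example assumes that structure factors have already been calculated from the oriented and positioned search model and that observed amplitudes are indexed by the same (h, k, l) triples; the numbers and the function name are invented for illustration.

```python
import cmath

def combine(observed_amplitudes, model_structure_factors):
    """Pair each observed amplitude with the phase of the model.

    Both arguments map a reflection index (h, k, l) to a value:
    a real amplitude |Fobs| and a complex model structure factor.
    Reflections missing from either set are skipped.
    """
    return {hkl: cmath.rect(observed_amplitudes[hkl], cmath.phase(f_model))
            for hkl, f_model in model_structure_factors.items()
            if hkl in observed_amplitudes}

# Invented values for two reflections.
observed = {(1, 0, 0): 12.0, (0, 1, 0): 7.5}
model = {(1, 0, 0): 3.0 + 4.0j, (0, 1, 0): -2.0 + 0.5j, (0, 0, 1): 1.0j}
hybrid = combine(observed, model)
for hkl, f in hybrid.items():
    print(hkl, abs(f), cmath.phase(f))
```

The resulting coefficients carry the measured amplitudes and the model phases, and could be fed to a Fourier synthesis to give the approximate map referred to above.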
Commonly used computer software packages for molecular replacement are X-PLOR (Brunger 1992, Nature 355: 472-475), AMoRE (Navaza, 1994, Acta Crystallogr. A50:157-163), the CCP4 package (Collaborative Computational Project, Number 4, “The CCP4 Suite: Programs for Protein Crystallography”, Acta Cryst., Vol. D50, pp. 760-763, 1994), the MERLOT package (P. M. D. Fitzgerald, J. Appl. Cryst., Vol. 21, pp. 273-278, 1988) and XTALVIEW (McRee et al. (1992) J. Mol. Graphics 10: 44-46). It is preferable that the resulting structure not exhibit a root-mean-square deviation of more than 3 Å.
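A root-mean-square deviation of the kind referred to above can be computed as in the sketch below. It assumes that the two structures contain the same atoms in the same order and have already been superimposed; the optimal superposition step is not shown, and the coordinates are made up.

```python
import math

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between paired atoms, in the input units."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate sets of equal length")
    total = sum(math.dist(a, b) ** 2 for a, b in zip(coords_a, coords_b))
    return math.sqrt(total / len(coords_a))

# Made-up coordinates in angstroms.
reference = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]
candidate = [(0.0, 0.0, 1.0), (1.5, 0.0, 1.0), (1.5, 1.5, 1.0)]
value = rmsd(reference, candidate)
print(value, value <= 3.0)
```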
Molecular replacement computer programs generally involve the following steps: (1) determining the number of molecules in the unit cell and defining the angles between them (self rotation function); (2) rotating the known structure against diffraction data to define the orientation of the molecules in the unit cell (rotation function); (3) translating the known structure in three dimensions to correctly position the molecules in the unit cell (translation function); (4) determining the phases of the X-ray diffraction data and calculating an R-factor from the reference data set and from the new data, wherein an R-factor between 30% and 50% indicates that the orientations of the atoms in the unit cell have been reasonably determined by the method; and (5) optionally, decreasing the R-factor to about 20% by refining the new electron density map using iterative refinement techniques known to those skilled in the art (refinement).
The quality of the model may be analysed using a program such as PROCHECK or 3D-Profiler [Laskowski et al 1993 J. Appl. Cryst. 26:283-291; Luthy R. et al, Nature 356: 83-85, 1992; and Bowie, J. U. et al, Science 253: 164-170, 1991]. Once any irregularities have been resolved, the entire structure may be further refined.
Other molecular modelling techniques may also be employed in accordance with this invention. See, e.g., Cohen, N. C. et al, “Molecular Modelling Software and Methods for Medicinal Chemistry”, J. Med. Chem., 33, pp. 883-894 (1990). See also, Navia, M. A. and M. A. Murcko, “The Use of Structural Information in Drug Design”, Current Opinion in Structural Biology, 2, pp. 202-210 (1992).
Using the structural coordinates of crystals provided by the invention, molecular modelling may be used to determine the structural coordinates of a crystalline mutant or homolog of a polo domain or binding pocket thereof. By the same token a crystal of the invention can be used to provide a model of a ligand. Modelling techniques can then be used to approximate the three dimensional structure of ligand derivatives and other components which may be able to mimic the atomic contacts between a ligand and polo domain or binding pocket.
Computer Format of Crystals/Models
Information derivable from a crystal of the present invention (for example the structural coordinates) and/or the model of the present invention may be provided in a computer-readable format.
Therefore, the invention provides a computer readable medium or a machine readable storage medium which comprises the structural coordinates of a polo domain or binding pocket thereof including all or any parts thereof, or ligands including portions thereof. Such a storage medium, when encoded with these data, is capable of displaying on a computer screen or similar viewing device a three-dimensional graphical representation of a molecule or molecular complex which comprises such polo domain, binding pockets or similarly shaped homologous domains or binding pockets. Thus, the invention also provides computerized representations of the secondary or three-dimensional structures of a polo domain or binding pocket of the invention, including any electronic, magnetic, or electromagnetic storage forms of the data needed to define the structures such that the data will be computer readable for purposes of display and/or manipulation.
In an aspect the invention provides a computer for producing a three-dimensional representation of a molecule or molecular complex, wherein said molecule or molecular complex comprises a polo domain or binding pocket thereof defined by structural coordinates of a polo domain or binding pocket or structural coordinates of atoms of a ligand, or a three-dimensional representation of a homologue of said molecule or molecular complex, wherein said homologue comprises a polo domain, binding pocket or ligand that has a root mean square deviation from the backbone atoms not more than 1.5 angstroms wherein said computer comprises:
The invention also provides a computer for determining at least a portion of the structural coordinates corresponding to an X-ray diffraction pattern of a molecule or molecular complex wherein said computer comprises:
The present invention also provides a method for determining the secondary and/or tertiary structures of a polo domain or part thereof by using a crystal, or a model according to the present invention. The domain or part thereof may be any domain or part thereof for which the secondary and/or tertiary structure is uncharacterised or incompletely characterised. In a preferred embodiment the domain shares (or is predicted to share) some structural or functional homology to a crystal of the present invention. For example, the domain may show a degree of structural homology over some or all parts of the primary amino acid sequence.
The polo domain may be a polo domain of a polo family kinase with a different specificity for a ligand or substrate. Alternatively (or in addition) the domain may be a polo domain from a different species.
The domain may be from a mutant of a wild-type polo family kinase, in particular Plk1 or Sak. A mutant may arise naturally, or may be made artificially (for example using molecular biology techniques). The mutant may also not be “made” at all in the conventional sense, but merely tested theoretically using the model of the present invention. A mutant may or may not be functional.
Thus, using the model of the present invention, the effect of a particular mutation on the overall two and/or three dimensional structure of a polo domain and/or the interaction between a binding pocket of the enzyme and a ligand can be investigated.
Alternatively, the domain may perform an analogous function or be suspected to show a similar mechanism to a polo domain of a polo family kinase.
The domain may also be the same as the polo domain of the crystal, but in association with a different ligand (for example, modulator or inhibitor) or cofactor. In this way it is possible to investigate the effect of altering the ligand or compound with which the polo domain is associated on the structure of the binding pocket.
Secondary or tertiary structure may be determined by applying the structural coordinates of the crystal or model of the present invention to other data such as an amino acid sequence, X-ray crystallographic diffraction data, or nuclear magnetic resonance (NMR) data. Homology modeling, molecular replacement, and nuclear magnetic resonance methods using these other data sets are described below.
Homology modeling (also known as comparative modeling or knowledge-based modeling) methods develop a three dimensional model from a sequence based on the structures of known proteins (i.e. a polo domain of the crystal of the invention). The method utilizes a computer model of the crystal of the present invention (the “known structure”), a computer representation of the amino acid sequence of the domain with an unknown structure, and standard computer representations of the structures of amino acids. The method in particular comprises the steps of: (a) identifying structurally conserved and variable regions in the known structure; (b) aligning the amino acid sequences of the known structure and unknown structure; (c) generating co-ordinates of main chain atoms and side chain atoms in structurally conserved and variable regions of the unknown structure based on the coordinates of the known structure, thereby obtaining a homology model; and (d) refining the homology model to obtain a three dimensional structure for the unknown structure. This method is well known to those skilled in the art (Greer, 1985, Science 228, 1055; Blundell et al., 1988, Eur. J. Biochem. 172, 513; Knighton et al., 1992, Science 258:130-135, http://biochem.vt.edu/coul-ses/modeling/homology.htn). Computer programs that can be used in homology modelling are Quanta and the Homology module in the Insight II modelling package distributed by Molecular Simulations Inc, or MODELLER (Rockefeller University, www.iucr.ac.uk/sinris-top/logical/prg-modeller.html).
In step (a) of the homology modelling method, a known structure is examined to identify the structurally conserved regions (SCRs) from which an average structure, or framework, can be constructed for these regions of the protein. Variable regions (VRs), in which known structures may differ in conformation, also must be identified. SCRs generally correspond to the elements of secondary structure, such as alpha-helices and beta-sheets, and to ligand- and substrate-binding sites (e.g. nucleotide binding sites). The VRs usually lie on the surface of the proteins and form the loops where the main chain turns.
Many methods are available for sequence alignment of known structures and unknown structures. Sequence alignments generally are based on the dynamic programming algorithm of Needleman and Wunsch [J. Mol. Biol. 48: 443-453, 1970]. Current methods include FASTA, Smith-Waterman, and BLASTP, with the BLASTP method differing from the other two in not allowing gaps. Scoring of alignments typically involves construction of a 20×20 matrix in which identical amino acids and those of similar character (i.e., conservative substitutions) may be scored higher than those of different character. Substitution schemes which may be used to score alignments include the scoring matrices PAM (Dayhoff et al., Meth. Enzymol. 91: 524-545, 1983), and BLOSUM (Henikoff and Henikoff, Proc. Nat. Acad. Sci. USA 89: 10915-10919, 1992), and the matrices based on alignments derived from three-dimensional structures including that of Johnson and Overington (JO matrices) (J. Mol. Biol. 233: 716-738, 1993).
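The dynamic programming approach of Needleman and Wunsch can be sketched as a global alignment routine. The example assumes a linear gap penalty and a simple identity score (+1 match, −1 mismatch) in place of a 20×20 substitution matrix such as PAM or BLOSUM; a matrix lookup could be passed in through the `score` argument. The sequences are arbitrary strings, not polo domain sequences.

```python
def identity_score(a, b):
    """Stand-in for a substitution matrix: +1 for a match, -1 otherwise."""
    return 1 if a == b else -1

def needleman_wunsch(seq_a, seq_b, score=identity_score, gap=-2):
    """Global alignment; returns (score, aligned_a, aligned_b)."""
    n, m = len(seq_a), len(seq_b)
    # F[i][j] is the best score for aligning seq_a[:i] with seq_b[:j].
    F = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        F[i][0] = i * gap
    for j in range(1, m + 1):
        F[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            F[i][j] = max(F[i - 1][j - 1] + score(seq_a[i - 1], seq_b[j - 1]),
                          F[i - 1][j] + gap,
                          F[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if (i > 0 and j > 0 and
                F[i][j] == F[i - 1][j - 1] + score(seq_a[i - 1], seq_b[j - 1])):
            out_a.append(seq_a[i - 1]); out_b.append(seq_b[j - 1])
            i -= 1; j -= 1
        elif i > 0 and F[i][j] == F[i - 1][j] + gap:
            out_a.append(seq_a[i - 1]); out_b.append("-")
            i -= 1
        else:
            out_a.append("-"); out_b.append(seq_b[j - 1])
            j -= 1
    return F[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))

best, top, bottom = needleman_wunsch("ACGT", "AGT")
print(best)
print(top)
print(bottom)
```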
Alignment based solely on sequence may be used; however, other structural features also may be taken into account. In Quanta, multiple sequence alignment algorithms are available that may be used when aligning a sequence of the unknown with the known structures. Four scoring systems (i.e. sequence homology, secondary structure homology, residue accessibility homology, CA-CA distance homology) are available, each of which may be evaluated during an alignment so that relative statistical weights may be assigned.
When generating coordinates for the unknown structure, main chain atoms and side chain atoms, both in SCRs and VRs, need to be modelled. A variety of approaches known to those skilled in the art may be used to assign co-ordinates to the unknown. In particular, the coordinates of the main chain atoms of SCRs will be transferred to the unknown structure. VRs correspond most often to the loops on the surface of the polypeptide, and if a loop in the known structure is a good model for the unknown, then the main chain co-ordinates of the known structure may be copied. Side chain coordinates of SCRs and VRs are copied if the residue type in the unknown is identical to or very similar to that in the known structure. For other side chain coordinates, a side chain rotamer library may be used to define the side chain coordinates. When a good model for a loop cannot be found, fragment databases may be searched for loops in other proteins that may provide a suitable model for the unknown. If desired, the loop may then be subjected to conformational searching to identify low energy conformers.
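The coordinate assignment rules above can be sketched as a pass over a pairwise alignment. This simplified illustration copies all atoms only when the residue types are identical (the "very similar" case is not handled), copies main chain atoms otherwise, and merely flags residues that would need a rotamer library or a loop search. The data layout, status labels, and coordinates are hypothetical.

```python
BACKBONE = {"N", "CA", "C", "O"}

def transfer_coordinates(aligned_known, aligned_unknown, known_residues):
    """Build a partial model of the unknown from an alignment.

    aligned_known, aligned_unknown: equal-length strings with '-' for gaps.
    known_residues: one dict per residue of the known structure, in
                    sequence order, mapping atom name to (x, y, z).
    """
    model = []
    k = 0  # index into known_residues
    for res_known, res_unknown in zip(aligned_known, aligned_unknown):
        template = None
        if res_known != "-":
            template = known_residues[k]
            k += 1
        if res_unknown == "-":
            continue  # residue exists only in the known structure
        if template is None:
            # Insertion in the unknown: no template coordinates at all.
            model.append({"residue": res_unknown, "atoms": {},
                          "status": "needs_loop"})
        elif res_known == res_unknown:
            model.append({"residue": res_unknown, "atoms": dict(template),
                          "status": "copied"})
        else:
            atoms = {n: xyz for n, xyz in template.items() if n in BACKBONE}
            model.append({"residue": res_unknown, "atoms": atoms,
                          "status": "needs_rotamer"})
    return model

# Made-up template residues A, G, K with placeholder coordinates.
known = [
    {"N": (0, 0, 0), "CA": (1, 0, 0), "C": (2, 0, 0), "O": (2, 1, 0), "CB": (1, 1, 0)},
    {"N": (3, 0, 0), "CA": (4, 0, 0), "C": (5, 0, 0), "O": (5, 1, 0)},
    {"N": (6, 0, 0), "CA": (7, 0, 0), "C": (8, 0, 0), "O": (8, 1, 0), "CB": (7, 1, 0)},
]
model = transfer_coordinates("AG-K", "ASWK", known)
print([(r["residue"], r["status"]) for r in model])
```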
Once a homology model has been generated it is analyzed to determine its correctness. A computer program available to assist in this analysis is the Protein Health module in Quanta which provides a variety of tests. Other programs that provide structure analysis along with output include PROCHECK and 3D-Profiler [Luthy R. et al, Nature 356: 83-85, 1992; and Bowie, J. U. et al, Science 253: 164-170, 1991]. Once any irregularities have been resolved, the entire structure may be further refined. Refinement may consist of energy minimization with restraints, especially for the SCRs. Restraints may be gradually removed for subsequent minimizations. Molecular dynamics may also be applied in conjunction with energy minimization.
Molecular replacement involves applying a known structure to solve the X-ray crystallographic data set of a polypeptide of unknown structure. The method can be used to define the phases describing the X-ray diffraction data of a polypeptide of unknown structure when only the amplitudes are known. Thus in an embodiment of the invention, a method is provided for determining three dimensional structures of polypeptides with unknown structure by (a) applying the structural coordinates of a crystal of the present invention to provide an X-ray crystallographic data set for a polypeptide of unknown structure, and (b) determining a low energy conformation of the resulting structure.
The structural coordinates of the crystal of the present invention may be applied to nuclear magnetic resonance (NMR) data to determine the three dimensional structures of polypeptides with uncharacterised or incompletely characterised structure. (See for example, Wuthrich, 1986, John Wiley and Sons, New York: 176-199; Pflugrath et al., 1986, J. Molecular Biology 189: 383-386; Kline et al., 1986 J. Molecular Biology 189:377-382). While the secondary structure of a polypeptide may often be determined by NMR data, the spatial connections between individual pieces of secondary structure are not as readily determined. The structural coordinates of a polypeptide defined by X-ray crystallography can guide the NMR spectroscopist to an understanding of the spatial interactions between secondary structural elements in a polypeptide of related structure. Information on spatial interactions between secondary structural elements can greatly simplify Nuclear Overhauser Effect (NOE) data from two-dimensional NMR experiments. In addition, applying the structural coordinates after the determination of secondary structure by NMR techniques simplifies the assignment of NOE's relating to particular amino acids in the polypeptide sequence and does not greatly bias the NMR analysis of polypeptide structure.
In an embodiment, the invention relates to a method of determining three dimensional structures of domains with unknown structures, by applying the structural coordinates of a crystal of the present invention to nuclear magnetic resonance (NMR) data of the unknown structure. This method comprises the steps of: (a) determining the secondary structure of an unknown structure using NMR data; and (b) simplifying the assignment of through-space interactions of amino acids. The term “through-space interactions” defines the orientation of the secondary structural elements in the three dimensional structure and the distances between amino acids from different portions of the amino acid sequence. The term “assignment” defines a method of analyzing NMR data and identifying which amino acids give rise to signals in the NMR spectrum.
Screening Method
Another aspect of the present invention concerns molecular models, in particular three-dimensional molecular models of polo domains, and their use as templates for the design of agents able to mimic or inhibit the activity of a polypeptide comprising a polo domain.
In certain embodiments, the present invention provides a method of screening for a ligand that associates with a polo domain or binding pocket and/or modulates the function of a polo family kinase by using a crystal or a model according to the present invention. The method may involve investigating whether a test compound is capable of associating with or binding a polo domain or binding pocket thereof, and/or inhibiting or enhancing interactions of atomic contacts in a polo domain or binding pocket thereof.
In accordance with an aspect of the present invention, a method is provided for screening for a ligand capable of binding to a polo domain or a binding pocket thereof, wherein the method comprises using a crystal or model according to the invention.
In another aspect, the invention relates to a method of screening for a ligand capable of binding to a polo domain or binding pocket thereof, wherein the polo domain or binding pocket thereof is defined by the structural coordinates given herein, the method comprising contacting the polo domain or binding pocket thereof with a test compound and determining if the test compound binds to the polo domain or binding pocket thereof.
In one embodiment, the present invention provides a method of screening for a test compound capable of interacting with one or more key amino acid residues of a binding pocket of a polo domain.
Another aspect of the invention provides a process comprising the steps of:
A further aspect of the invention provides a process comprising the steps of:
Once a test compound capable of interacting with one or more key amino acid residues in a binding pocket of a polo domain has been identified, further steps may be carried out either to select and/or modify compounds and/or to modify existing compounds, to modulate the interaction with the key amino acid residues in the binding pocket.
Yet another aspect of the invention provides a process comprising the steps of:
As used herein, the term “test compound” means any compound which is potentially capable of associating with a binding pocket, and/or inhibiting or enhancing interactions of atomic contacts in a binding pocket. If, after testing, it is determined that the test compound does bind to the binding pocket and/or inhibits or enhances interactions of atomic contacts in a binding pocket, it is known as a “ligand”.
The test compound may be designed or obtained from a library of compounds which may comprise peptides, as well as other compounds, such as small organic molecules and particularly new lead compounds. By way of example, the test compound may be a natural substance, a biological macromolecule, or an extract made from biological materials such as bacteria, fungi, or animal (particularly mammalian) cells or tissues, an organic or an inorganic molecule, a synthetic test compound, a semi-synthetic test compound, a carbohydrate, a monosaccharide, an oligosaccharide or polysaccharide, a glycolipid, a glycopeptide, a saponin, a heterocyclic compound, a structural or functional mimetic, a peptide, a peptidomimetic, a derivatised test compound, a peptide cleaved from a whole protein, or a synthesised peptide (such as, by way of example, one made using a peptide synthesizer, by recombinant techniques, or by combinations thereof), a recombinant test compound, a natural or a non-natural test compound, a fusion protein or equivalent thereof and mutants, derivatives or combinations thereof.
The increasing availability of biomacromolecule structures of potential pharmacophoric molecules that have been solved crystallographically has prompted the development of a variety of direct computational methods for molecular design, in which the steric and electronic properties of substrate binding sites are used to guide the design of potential ligands (Cohen et al. (1990) J. Med. Chem. 33: 883-894; Kuntz et al. (1982) J. Mol. Biol. 161: 269-288; DesJarlais (1988) J. Med. Chem. 31: 722-729; Bartlett et al. (1989) (Spec. Publ., Roy. Soc. Chem.) 78: 182-196; Goodford et al. (1985) J. Med. Chem. 28: 849-857; DesJarlais et al. J. Med. Chem. 29: 2149-2153). Directed methods generally fall into two categories: (1) design by analogy, in which 3-D structures of known molecules (such as from a crystallographic database) are docked to the domain structure and scored for goodness-of-fit; and (2) de novo design, in which the ligand model is constructed piece-wise in the domain structure. The latter approach, in particular, can facilitate the development of novel molecules, uniquely designed to bind to the subject domain.
The test compound may be screened as part of a library or a database of molecules. Databases which may be used include ACD (Molecular Designs Limited), NCI (National Cancer Institute), CCDC (Cambridge Crystallographic Data Center), CAST (Chemical Abstract Service), Derwent (Derwent Information Limited), Maybridge (Maybridge Chemical Company Ltd), Aldrich (Aldrich Chemical Company), DOCK (University of California in San Francisco), and the Directory of Natural Products (Chapman & Hall). Computer programs such as CONCORD (Tripos Associates) or DB-Converter (Molecular Simulations Limited) can be used to convert a data set represented in two dimensions to one represented in three dimensions.
Test compounds may be tested for their capacity to fit spatially into a binding pocket. As used herein, the term “fits spatially” means that the three-dimensional structure of the test compound is accommodated geometrically in a cavity of a binding pocket. The test compound can then be considered to be a ligand.
A favourable geometric fit occurs when the surface area of the test compound is in close proximity with the surface area of the cavity of a binding pocket without forming unfavorable interactions. A favourable complementary interaction occurs where the test compound interacts by hydrophobic, aromatic, ionic, dipolar, or hydrogen donating and accepting forces. Unfavourable interactions may be steric hindrance between atoms in the test compound and atoms in the binding pocket.
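By way of illustration only, the geometric criterion above can be sketched in Python. The function name `fit_score`, the atom representation (x, y, z, radius in Ångstroms), and the parameters `overlap_tol` and `contact_shell` are hypothetical choices made for this sketch and are not taken from any program named herein. The sketch treats overlap of atomic spheres as steric hindrance and counts close, non-overlapping atom pairs as a crude measure of surface proximity; it does not evaluate hydrophobic, ionic, or hydrogen bonding interactions.

```python
import math

def fit_score(ligand_atoms, pocket_atoms, overlap_tol=0.5, contact_shell=1.0):
    """Crude geometric fit of a test compound in a binding pocket.

    Each atom is a tuple (x, y, z, radius) in angstroms.
    Returns None when any ligand/pocket pair overlaps (steric hindrance),
    otherwise the number of pairs in close, non-overlapping proximity.
    overlap_tol and contact_shell are assumed values for illustration.
    """
    contacts = 0
    for lx, ly, lz, lr in ligand_atoms:
        for px, py, pz, pr in pocket_atoms:
            d = math.dist((lx, ly, lz), (px, py, pz))
            if d < lr + pr - overlap_tol:
                return None  # unfavourable interaction: atoms overlap
            if d <= lr + pr + contact_shell:
                contacts += 1  # surfaces are in close proximity
    return contacts
```

A test compound returning `None` would be rejected on steric grounds, while compounds with larger contact counts would be carried forward for evaluation of complementary interactions.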
If a model of the present invention is a computer model, the test compounds may be positioned in a binding pocket through computational docking. If, on the other hand, the model of the present invention is a structural model, the test compounds may be positioned in the binding pocket by, for example, manual docking.
As used herein the term “docking” refers to a process of placing a compound in close proximity with a binding pocket, or a process of finding low energy conformations of a test compound/binding pocket complex.
In an illustrative embodiment, the design of potential polo domain ligands begins from the general perspective of shape complementarity for an active site and substrate specificity subsites of the domain, and a search algorithm is employed which is capable of scanning a database of small molecules of known three-dimensional structure for candidates which fit geometrically into the target protein site. It is not expected that the molecules found in the shape search will necessarily be leads themselves, since no evaluation of chemical interaction need necessarily be made during the initial search. Rather, it is anticipated that such candidates might act as the framework for further design, providing molecular skeletons to which appropriate atomic replacements can be made. Of course, the chemical complementarity of these molecules can be evaluated, but it is expected that atom types will be changed to maximize the electrostatic, hydrogen bonding, and hydrophobic interactions with the protein. Most algorithms of this type provide a method for finding a wide assortment of chemical structures that are complementary to the shape of a binding pocket of the subject domain. Each of a set of small molecules from a particular database, such as the Cambridge Crystallographic Data Bank (CCDB) (Allen et al. (1973) J. Chem. Doc. 13: 119), is individually docked to the binding pocket or site of a polo domain, in particular a Sak or Plk polo domain, in a number of geometrically permissible orientations with use of a docking algorithm. In a preferred embodiment, a set of computer algorithms called DOCK can be used to characterize the shape of invaginations and grooves that form active sites and recognition surfaces of a subject structure (Kuntz et al. (1982) J. Mol. Biol. 161: 269-288). The program can also search a database of small molecules for templates whose shapes are complementary to particular binding pockets or sites of a structure (DesJarlais et al. (1988) J Med Chem 31: 722-729). These templates normally require modification to achieve good chemical and electrostatic interactions (DesJarlais et al. (1989) ACS Symp Ser 413: 60-69). However, the program has been shown to position accurately known cofactors or ligands based on shape constraints alone.
The orientations are evaluated for goodness-of-fit and the best are kept for further examination using molecular mechanics programs, such as AMBER or CHARMM. Such algorithms have previously proven successful in finding a variety of molecules that are complementary in shape to a given binding site of a structure, and have been shown to have several attractive features. First, such algorithms can retrieve a remarkable diversity of molecular architectures. Second, the best structures have, in previous applications to other proteins, demonstrated impressive shape complementarity over an extended surface area. Third, the overall approach appears to be quite robust with respect to small uncertainties in positioning of the candidate atoms.
Goodford (1985, J Med Chem 28:849-857) and Boobbyer et al. (1989, J Med Chem 32:1083-1094) have produced a computer program (GRID) which seeks to determine regions of high affinity for different chemical groups (termed probes) on the molecular surface of the binding site. GRID hence provides a tool for suggesting modifications to known ligands that might enhance binding. It may be anticipated that some of the sites discerned by GRID as regions of high affinity correspond to “pharmacophoric patterns” determined inferentially from a series of known ligands. As used herein, a pharmacophoric pattern is a geometric arrangement of features of the anticipated ligand that is believed to be important for binding. Attempts have been made to use pharmacophoric patterns as a search screen for novel ligands (Jakes et al. (1987) J Mol Graph 5:41-48; Brint et al. (1987) J Mol Graph 5:49-56; Jakes et al. (1986) J Mol Graph 4:12-20); however, the constraint of steric and “chemical” fit in the putative (and possibly unknown) binding pocket or site is ignored. Goodsell and Olson (1990, Proteins: Struct Funct Genet 8:195-202) have used the Metropolis (simulated annealing) algorithm to dock a single known ligand into a target protein. They allow torsional flexibility in the ligand and use GRID interaction energy maps as rapid lookup tables for computing approximate interaction energies. Given the large number of degrees of freedom available to the ligand, the Metropolis algorithm is time-consuming and is unsuited to searching a candidate database of a few thousand small molecules.
Yet a further embodiment of the present invention utilizes a computer algorithm such as CLIX which searches such databases as CCDB for small molecules which can be oriented in a binding pocket or site in a way that is both sterically acceptable and has a high likelihood of achieving favorable chemical interactions between the candidate molecule and the surrounding amino acid residues. The method is based on characterizing a binding pocket in terms of an ensemble of favorable binding positions for different chemical groups and then searching for orientations of the candidate molecules that cause maximum spatial coincidence of individual candidate chemical groups with members of the ensemble. The current availability of computer power dictates that a computer-based search for novel ligands follows a breadth-first strategy. A breadth-first strategy aims to reduce progressively the size of the potential candidate search space by the application of increasingly stringent criteria, as opposed to a depth-first strategy wherein a maximally detailed analysis of one candidate is performed before proceeding to the next. CLIX conforms to this strategy in that its analysis of binding is rudimentary—it seeks to satisfy the necessary conditions of steric fit and of having individual groups in “correct” places for bonding, without imposing the sufficient condition that favorable bonding interactions actually occur. A ranked “shortlist” of molecules, in their favored orientations, is produced which can then be examined on a molecule-by-molecule basis, using computer graphics and more sophisticated molecular modeling techniques. CLIX is also capable of suggesting changes to the substituent chemical groups of the candidate molecules that might enhance binding.
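The breadth-first strategy described above can be sketched, for illustration only, as a sequence of increasingly stringent filters applied to the whole candidate pool before any single candidate is analysed in detail. The function `breadth_first_screen` and the candidate fields used in the usage note are hypothetical and do not reproduce CLIX.

```python
def breadth_first_screen(candidates, filters):
    """Apply each filter to the entire pool in turn, shrinking it stage by stage.

    `filters` is an ordered list of predicates, cheapest and least
    stringent first. Returns the survivors and the pool size after
    each stage (the first entry is the starting size).
    """
    survivors = list(candidates)
    history = [len(survivors)]
    for keep in filters:
        survivors = [c for c in survivors if keep(c)]
        history.append(len(survivors))
    return survivors, history
```

For example, a first filter might reject candidates with steric clashes and a second might require a minimum number of chemical groups placed at favorable positions; only the resulting shortlist would then be passed to more detailed molecular modeling.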
The algorithmic details of CLIX are described in Lawrence et al. (1992) Proteins 12:31-41, and the CLIX algorithm can be summarized as follows. The GRID program is used to determine discrete favorable interaction positions (termed target sites) in the binding pocket or site of the protein for a wide variety of representative chemical groups. For each candidate ligand in the CCDB an exhaustive attempt is made to make coincident, in a spatial sense in the binding site of the protein, a pair of the candidate's substituent chemical groups with a pair of corresponding favorable interaction sites proposed by GRID. All possible combinations of pairs of ligand groups with pairs of GRID sites are considered during this procedure. Upon locating such coincidence, the program rotates the candidate ligand about the two pairs of groups and checks for steric hindrance and coincidence of other candidate atomic groups with appropriate target sites. Particular candidate/orientation combinations that are good geometric fits in the binding site and show sufficient coincidence of atomic groups with GRID sites are retained.
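The pairwise coincidence step can be illustrated with a minimal Python sketch. It is not the CLIX implementation: the function `matching_pairs`, the representation of groups and target sites as (type, coordinates) tuples, and the tolerance `tol` are assumptions made here. The sketch relies on the observation that a pair of ligand groups can be superimposed on a pair of target sites only if the group types correspond and the two separations agree; the subsequent rotation about the matched pair and the steric check are not shown.

```python
import math
from itertools import combinations, permutations

def matching_pairs(ligand_groups, target_sites, tol=0.5):
    """List ((i, j), (k, l)) such that ligand groups i, j can coincide with sites k, l.

    ligand_groups and target_sites are lists of (group_type, (x, y, z)).
    A match requires identical group types and separations that differ
    by no more than `tol` angstroms (an assumed tolerance).
    """
    matches = []
    for (i, a), (j, b) in combinations(enumerate(ligand_groups), 2):
        d_ligand = math.dist(a[1], b[1])
        # Ordered site pairs, so that i maps onto k and j maps onto l.
        for (k, s), (l, t) in permutations(enumerate(target_sites), 2):
            if a[0] == s[0] and b[0] == t[0]:
                if abs(d_ligand - math.dist(s[1], t[1])) <= tol:
                    matches.append(((i, j), (k, l)))
    return matches
```

Each returned match would serve as a starting orientation to be refined by rotation and checked for steric hindrance and further coincidences.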
Consistent with the breadth-first strategy, this approach involves simplifying assumptions. Rigid protein and small molecule geometry is maintained throughout. As a first approximation rigid geometry is acceptable as the energy minimized coordinates of a polo domain, in particular a Sak polo domain deduced structure, as described herein, describe an energy minimum for the molecule, albeit a local one. If the surface residues of the site of interest are not involved in crystal contacts then the crystal configuration of those residues is used merely as a starting point for energy minimization, and potential solution structures for those residues determined. The deduced structure described herein should reasonably mimic the mean solution configuration.
A further assumption implicit in CLIX is that the potential ligand, when introduced into the binding pocket or site, does not induce change in the protein's stereochemistry or partial charge distribution and so alter the basis on which the GRID interaction energy maps were computed. It must also be stressed that the interaction sites predicted by GRID are used in a positional and type sense only, i.e., when a candidate atomic group is placed at a site predicted as favorable by GRID, no check is made to ensure that the bond geometry, the state of protonation, or the partial charge distribution favors a strong interaction between the protein and that group. Such detailed analysis should form part of more advanced modeling of candidates identified in the CLIX shortlist.
Yet another embodiment of a computer-assisted molecular design method for identifying ligands of a polo domain comprises the de novo synthesis of potential ligands by algorithmic connection of small molecular fragments that will exhibit the desired structural and electrostatic complementarity with a polo domain or binding pocket thereof. The methodology employs a large template set of small molecules which are iteratively pieced together in a model of a polo domain or binding pocket. Each stage of ligand growth is evaluated according to a molecular mechanics-based energy function, which considers van der Waals and coulombic interactions, internal strain energy of the lengthening ligand, and desolvation of both ligand and domain. The search space can be managed by use of a data tree which is kept under control by pruning according to the binding criteria.
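For illustration, the van der Waals and coulombic terms of such an energy function may be sketched as follows. The Lennard-Jones 12-6 form, the uniform `r_min` and `epsilon` values, and the function names are assumptions of this sketch rather than the energy function of any particular program; the internal strain and desolvation terms mentioned above are omitted.

```python
import math

# Converts e^2/angstrom to kcal/mol (standard electrostatic conversion factor).
COULOMB_CONSTANT = 332.06

def pair_energy(r, r_min, epsilon, q1, q2, dielectric=1.0):
    """Lennard-Jones 12-6 plus Coulomb energy (kcal/mol) for one atom pair.

    r is the separation in angstroms, r_min the separation at the
    Lennard-Jones minimum, epsilon the well depth, q1 and q2 partial charges.
    """
    ratio = r_min / r
    van_der_waals = epsilon * (ratio ** 12 - 2.0 * ratio ** 6)
    coulomb = COULOMB_CONSTANT * q1 * q2 / (dielectric * r)
    return van_der_waals + coulomb

def interaction_energy(ligand_atoms, site_atoms, r_min=3.8, epsilon=0.1):
    """Sum pair energies between ligand and site atoms.

    Atoms are ((x, y, z), charge). A single r_min and epsilon are used
    for every pair, which is a simplification assumed for this sketch.
    """
    total = 0.0
    for l_pos, l_q in ligand_atoms:
        for s_pos, s_q in site_atoms:
            total += pair_energy(math.dist(l_pos, s_pos), r_min, epsilon, l_q, s_q)
    return total
```

In a fuller treatment each atom type would carry its own parameters, and strain and desolvation contributions would be added before the score is used to prune the data tree.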
In an illustrative embodiment, the search space is limited to consider only amino acids and amino acid analogs as the molecular building blocks. Such a methodology generally employs a large template set of amino acid conformations, though need not be restricted to just the 20 natural amino acids, as it can easily be extended to include other related fragments of interest to the medicinal chemist, e.g. amino acid analogs. The putative ligands that result from this construction method are peptides and peptide-like compounds rather than the small organic molecules that are typically the goal of drug design research. The appeal of the peptide building approach is not that peptides are preferable to organics as potential pharmaceutical agents, but rather that: (1) they can be generated relatively rapidly de novo; (2) their energetics can be studied by well-parameterized force field methods; (3) they are much easier to synthesize than are most organics; and (4) they can be used in a variety of ways, for peptidomimetic ligand design, protein-protein binding studies, and even as shape templates in the more commonly used 3D organic database search approach described above.
Such a de novo peptide design method has been incorporated in a software package called GROW (Moon et al. (1991) Proteins 11:314-328). In a typical design session, standard interactive graphical modeling methods are employed to define the structural environment in which GROW is to operate. For instance, the environment could be an active site binding pocket of a polo domain, in particular a Sak or Plk polo domain, or it could be a set of features on the protein's surface to which the user wishes to bind a peptide-like molecule. The GROW program then operates to generate a set of potential ligand molecules. Interactive modeling methods then come into play again, for examination of the resulting molecules, and for selection of one or more of them for further refinement.
To illustrate, GROW operates on an atomic coordinate file generated by the user in the interactive modeling session, such as the coordinates provided in Table 2 plus a small fragment (e.g., an acetyl group) positioned in the active site to provide a starting point for peptide growth. These are referred to as “site” atoms and “seed” atoms, respectively. A second file provided by the user contains a number of control parameters to guide the peptide growth (Moon et al. (1991) Proteins 11:314-328).
The operation of the GROW algorithm is conceptually fairly simple. GROW proceeds in an iterative fashion, to systematically attach to the seed fragment each amino acid template in a large preconstructed library of amino acid conformations. When a template has been attached, it is scored for goodness-of-fit to the polo domain or binding pocket thereof, and then the next template in the library is attached to the seed. After all the templates have been tested, only the highest scoring ones are retained for the next level of growth. This procedure is repeated for the second growth level; each library template is attached in turn to each of the bonded seed/amino acid molecules that were retained from the first step, and is then scored. Again, only the best of the bonded seed/dipeptide molecules that result are retained for the third level of growth. The growth of peptides can proceed in the N-to-C direction only, the reverse direction only, or in alternating directions, depending on the initial control specifications supplied by the user. Successive growth levels therefore generate peptides that are lengthened by one residue. The procedure terminates when the user-defined peptide length has been reached, at which point the user can select from the constructed peptides those to be studied further. The resulting data provided by the GROW procedure includes not only residue sequences and scores, but also atomic coordinates of the peptides, related directly to the coordinate system of the domain site atoms.
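The level-by-level growth described above resembles what is commonly called a beam search, and can be sketched as follows for illustration. The sketch is not the GROW program: the function `grow_peptides`, the representation of a growing chain as a tuple of template labels, and the caller-supplied `score` function are hypothetical, and growth is shown in a single direction only.

```python
def grow_peptides(seed, templates, score, levels, keep):
    """Grow chains one template per level, retaining only the best `keep`.

    seed      -- label of the starting fragment (for example an acetyl group)
    templates -- library of amino acid conformation labels
    score     -- function mapping a chain (tuple of labels) to goodness-of-fit
    levels    -- number of residues to add (the user-defined peptide length)
    keep      -- number of highest scoring chains retained at each level
    """
    retained = [(seed,)]
    for _ in range(levels):
        # Attach every library template to every retained chain.
        extended = [chain + (t,) for chain in retained for t in templates]
        # Keep only the highest scoring chains for the next growth level.
        extended.sort(key=score, reverse=True)
        retained = extended[:keep]
    return retained
```

In practice the score would be computed from the atomic coordinates of the attached template within the site, and the retained chains would carry those coordinates along with their sequences.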
In yet another embodiment, potential pharmacophoric compounds can be determined using a method based on an energy minimization-quenched molecular dynamics algorithm for determining energetically favorable positions of functional groups in the binding pockets of the subject polo domain. The method can aid in the design of molecules that incorporate such functional groups by modification of known ligands or de novo construction.
For example, the multiple copy simultaneous search (MCSS) method described by Miranker et al. (1991) Proteins 11: 29-34 may be used. To determine and characterize the local minima of a functional group in the forcefield of the protein, multiple copies of selected functional groups are first distributed in a binding pocket of interest on the polo domain. Energy minimization of these copies by molecular mechanics or quenched dynamics yields the distinct local minima. The neighborhood of these minima can then be explored by a grid search or by constrained minimization. In one embodiment, the MCSS method uses the classical time dependent Hartree (TDH) approximation to simultaneously minimize or quench many identical groups in the forcefield of the protein.
Implementation of the MCSS algorithm requires a choice of functional groups and a molecular mechanics model for each of them. Groups must be simple enough to be easily characterized and manipulated (3-6 atoms, few or no dihedral degrees of freedom), yet complex enough to approximate the steric and electrostatic interactions that the functional group would have in binding to the pocket or site of interest in the polo domain. A preferred set is, for example, one in which most organic molecules can be described as a collection of such groups (Patai's Guide to the Chemistry of Functional Groups, ed. S. Patai (New York: John Wiley and Sons, 1989)). This includes fragments such as acetonitrile, methanol, acetate, methyl ammonium, dimethyl ether, methane, and acetaldehyde.
Determination of the local energy minima in the binding pocket or site requires that many starting positions be sampled. This can be achieved by distributing, for example, 1,000-5,000 groups at random inside a sphere centered on the binding site; only the space not occupied by the protein needs to be considered. If the interaction energy of a particular group at a certain location with the protein is more positive than a given cut-off (e.g. 5.0 kcal/mole) the group is discarded from that site. Given the set of starting positions, all the fragments are minimized simultaneously by use of the TDH approximation (Elber et al. (1990) J Am Chem Soc 112: 9161-9175). In this method, the forces on each fragment consist of its internal forces and those due to the protein. The essential element of this method is that the interactions between the fragments are omitted and the forces on the protein are normalized to those due to a single fragment. In this way simultaneous minimization or dynamics of any number of functional groups in the field of a single protein can be performed.
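The initial distribution of groups can be sketched in Python as follows, for illustration only. The function `seed_positions` and its arguments are hypothetical; the interaction energy is supplied by the caller as a function of position, whereas a real implementation would evaluate a molecular mechanics forcefield and would also sample group orientations. The subsequent simultaneous minimization under the TDH approximation is not shown.

```python
import random

def seed_positions(center, radius, n, energy, cutoff=5.0, rng=None):
    """Place n groups at random in a sphere and discard unfavourable ones.

    center -- (x, y, z) of the sphere centred on the binding site
    radius -- sphere radius in angstroms
    energy -- caller-supplied function: position -> interaction energy (kcal/mole)
    cutoff -- positions with energy more positive than this are discarded
    """
    rng = rng or random.Random(0)  # fixed seed so the sketch is reproducible
    kept = []
    for _ in range(n):
        # Rejection sampling gives a uniform distribution inside the sphere.
        while True:
            offset = tuple(rng.uniform(-radius, radius) for _ in range(3))
            if sum(c * c for c in offset) <= radius * radius:
                break
        position = tuple(c + o for c, o in zip(center, offset))
        if energy(position) <= cutoff:
            kept.append(position)
    return kept
```

Positions that overlap the protein acquire a large positive energy and are removed by the cut-off, so that only the space not occupied by the protein is carried into minimization.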
Minimization is performed successively on subsets of, for example, 100 of the randomly placed groups. After a certain number of step intervals, such as 1,000 intervals, the results can be examined to eliminate groups converging to the same minimum. This process is repeated until minimization is complete (e.g. RMS gradient of 0.01 kcal/mole/Å). Thus the resulting energy minimized set of molecules comprises what amounts to a set of disconnected fragments in three dimensions representing potential pharmacophores.
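The elimination of groups converging to the same minimum can be illustrated by the following sketch. The function `unique_minima` and the merge distance `merge_dist` are assumptions; the sketch compares group positions only, whereas a fuller treatment would also compare orientations.

```python
import math

def unique_minima(positions, energies, merge_dist=0.2):
    """Return indices of groups that represent distinct minima.

    Groups are visited from lowest to highest energy; a group is kept only
    if it lies farther than `merge_dist` angstroms (an assumed threshold)
    from every group already kept.
    """
    order = sorted(range(len(positions)), key=lambda i: energies[i])
    kept = []
    for i in order:
        if all(math.dist(positions[i], positions[j]) > merge_dist for j in kept):
            kept.append(i)
    return kept
```

Visiting the groups in order of increasing energy ensures that, where several copies have converged on one minimum, the lowest energy copy is the one retained.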
The next step then is to connect the pharmacophoric pieces with spacers assembled from small chemical entities (atoms, chains, or ring moieties). In a preferred embodiment, each of the disconnected fragments can be linked in space to generate a single molecule using such computer programs as, for example, NEWLEAD (Tschinke et al. (1993) J Med Chem 36: 3863-3870). The procedure adopted by NEWLEAD executes the following sequence of commands: (1) connect two isolated moieties, (2) retain the intermediate solutions for further processing, (3) repeat the above steps for each of the intermediate solutions until no disconnected units are found, and (4) output the final solutions, each of which is a single molecule. Such a program can use, for example, three types of spacers: library spacers, single-atom spacers, and fuse-ring spacers. The library spacers are optimized structures of small molecules such as ethylene, benzene and methylamide. The output produced by programs such as NEWLEAD consists of a set of molecules containing the original fragments now connected by spacers. The atoms belonging to the input fragments maintain their original orientations in space. The molecules are chemically plausible because of the simple makeup of the spacers and functional groups, and energetically acceptable because of the rejection of solutions with van der Waals radii violations.
A screening method of the present invention may comprise the following steps:
In an aspect of the invention, a method is provided comprising the following steps:
In an embodiment of the invention, a method is provided which comprises the following steps:
The method may be applied to a plurality of test compounds, to identify those that best fit the selected site.
The model used in the screening method may comprise a binding pocket either alone or in association with one or more ligands and/or cofactors. For example, the model may comprise the binding pocket in association with a nucleotide (or analogue thereof), a substrate (or analogue thereof), and/or modulator.
If the model comprises an unassociated binding pocket, then the selected site under investigation may be the binding pocket itself. The test compound may, for example, mimic a known ligand (e.g. substrate) for a polo family kinase in order to interact with the binding pocket. The selected site may alternatively be another site on the polo domain or polo family kinase.
If the model comprises an associated binding pocket, for example a binding pocket in association with a ligand, the selected site may be the binding pocket or a site made up of the binding pocket and the complexed ligand, or a site on the ligand itself. The test compound may be investigated for its capacity to modulate the interaction with the associated molecule.
A test compound (or plurality of test compounds) may be selected on the basis of their similarity to a known ligand for a polo domain, in particular a Sak or Plk1 polo domain. For example, the screening method may comprise the following steps:
Searching may be carried out using a database of computer representations of potential compounds, using methods known in the art.
The present invention also provides a method for designing a ligand for a polo domain. It is well known in the art to use a screening method as described above to identify a test compound with promising fit, but then to use this test compound as a starting point to design a ligand with improved fit to the model. Such techniques are known as “structure-based ligand design” (See Kuntz et al., 1994, Acc. Chem. Res. 27:117; Guida, 1994, Current Opinion in Struc. Biol. 4: 777; and Colman, 1994, Current Opinion in Struc. Biol. 4: 868, for reviews of structure-based drug design and identification; and Kuntz et al., 1982, J. Mol. Biol. 161:269; Kuntz et al., 1994, Acc. Chem. Res. 27: 117; Meng et al., 1992, J. Comput. Chem. 13: 505; Bohm, 1994, J. Comp. Aided Molec. Design 8: 623 for methods of structure-based modulator design).
Examples of computer programs that may be used for structure-based ligand design are CAVEAT (Bartlett et al., 1989, in “Chemical and Biological Problems in Molecular Recognition”, Roberts, S. M.; Ley, S. V.; Campbell, N. M., eds; Royal Society of Chemistry: Cambridge, pp 182-196); FLOG (Miller et al., 1994, J. Comp. Aided Molec. Design 8:153); PRO Modulator (Clark et al., 1995, J. Comp. Aided Molec. Design 9:13); MCSS (Miranker and Karplus, 1991, Proteins: Structure, Function, and Genetics 11:29); and GRID (Goodford, 1985, J. Med. Chem. 28:849).
The method may comprise the following steps:
Evaluation of fit may comprise the following steps:
The fit of the modified test compound may then be evaluated using the same criteria.
The chemical modification of a group may either enhance or reduce hydrogen bonding interaction, charge interaction, hydrophobic interaction, van der Waals interaction or dipole interaction between the test compound and the key amino acid residue(s) of the binding pocket. Preferably the group modifications involve the addition, removal, or replacement of substituents on the test compound such that the substituents are positioned to collide or to bind preferentially with one or more amino acid residues that correspond to the key amino acid residues of the binding pocket.
If a modified test compound model has an improved fit, then it may bind to a binding pocket and be considered to be a “ligand”. Rational modification of groups may be made with the aid of libraries of molecular fragments which may be screened for their capacity to fit into the available space and to interact with the appropriate atoms. Databases of computer representations of libraries of chemical groups are available commercially, for this purpose.
The test compound may also be modified “in situ” (i.e. once docked into the potential binding pocket), enabling immediate evaluation of the effect of replacing selected groups. The computer representation of the test compound may be modified by deleting a chemical group or groups, or by adding a chemical group or groups. After each modification to a compound, the atoms of the modified compound and potential binding pocket can be shifted in conformation and the distance between the modulator and the binding pocket atoms may be scored on the basis of geometric fit and favourable complementary interactions between the molecules. This technique is described in detail in Molecular Simulations User Manual, 1995 II LUDI.
Examples of ligand building and/or searching computer programs include programs in the Molecular Simulations Package (Catalyst), ISIS/HOST, ISIS/BASE, and ISIS/DRAW (Molecular Designs Limited), and UNITY (Tripos Associates).
The “starting point” for rational ligand design may be a known ligand for a polo domain. For example, in order to identify potential modulators of a polo domain or polo family kinase, in particular Sak or Plk, a logical approach would be to start with a known ligand to produce a molecule which mimics the binding of the ligand. Such a molecule may, for example, act as a competitive inhibitor for the true ligand, or may bind so strongly that the interaction (and inhibition) is effectively irreversible.
Such a method may comprise the following steps:
The replacement groups could be selected and replaced using a compound construction program which replaces computer representations of chemical groups with groups from a computer database, where the representations of the compounds are defined by structural coordinates.
In an embodiment, a screening method is provided for identifying a ligand of a polo domain, in particular a Sak or Plk polo domain, comprising the step of using the structural coordinates of a substrate or component thereof, defined in relation to its spatial association with a binding pocket of the invention, to generate a compound that is capable of associating with the binding pocket.
Screening methods of the present invention may be used to identify compounds or entities that associate with a molecule that associates with a polo domain, in particular a Sak or Plk polo domain.
Test compounds and ligands which are identified using a crystal or model of the present invention can be screened in assays such as those well known in the art. Screening may be for example in vitro, in cell culture, and/or in vivo. Biological screening assays preferably centre on activity-based response models, binding assays (which measure how well a compound binds to a domain), and bacterial, yeast, and animal cell lines (which measure the biological effect of a compound in a cell). The assays may be automated for high throughput screening in which large numbers of compounds can be tested to identify compounds with the desired activity. The biological assay may also be an assay for the binding activity of a compound that selectively binds to the binding pocket compared to other proteins.
Ligands/Compounds Identified by Screening Methods
The present invention provides a ligand or compound identified by a screening method of the present invention. A ligand or compound may have been designed rationally by using a model according to the present invention. A ligand or compound identified using the screening methods of the invention specifically associates with a target compound, or part thereof (e.g. a binding pocket). In the present invention the target compound may be the polo family kinase (e.g. Sak or Plk1) or part thereof (polo domain), or a molecule that is capable of associating with the polo family kinase or polo domain (e.g. substrate).
A ligand or compound identified using a screening method of the invention may act as a “modulator”, i.e. a compound which affects the activity of a polo family kinase, in particular Sak or Plk1. A modulator may reduce, enhance or alter the biological function of a polo family kinase, in particular Sak or Plk1. For example, a modulator may modulate the capacity of the enzyme to phosphorylate. An alteration in biological function may be characterised by a change in specificity. In order to exert its function, the modulator commonly binds to a binding pocket.
A “modulator” which is capable of reducing the biological function of the enzyme may also be known as an inhibitor. Preferably an inhibitor reduces or blocks the capacity of the enzyme to phosphorylate. The inhibitor may mimic the binding of a substrate, for example, it may be a substrate analogue. A substrate analogue may be designed by considering the interactions between the substrate and a polo domain (for example by using information derivable from the crystal of the invention) and specifically altering one or more groups (as described above).
The present invention also provides a method for modulating the activity of a polo family kinase, in particular Sak or Plk1, using a modulator according to the present invention. It would be possible to monitor activity following such treatment by a number of methods known in the art.
A modulator may be an agonist, partial agonist, partial inverse agonist or antagonist of a polo family kinase.
As used herein, the term “agonist” means any ligand, which is capable of binding to a binding pocket and which is capable of increasing a proportion of the protein that is in an active form, resulting in an increased biological response. The term includes partial agonists and inverse agonists.
As used herein, the term “partial agonist” means an agonist that is unable to evoke the maximal response of a biological system, even at a concentration sufficient to saturate the specific proteins.
As used herein, the term “partial inverse agonist” is an inverse agonist that evokes a submaximal response to a biological system, even at a concentration sufficient to saturate the specific proteins. At high concentrations, it will diminish the actions of a full inverse agonist.
As used herein, the term “antagonist” means any agent that reduces the action of another agent, such as an agonist. The antagonist may act at the same site as the agonist (competitive antagonism). The antagonistic action may result from combination with the substance being antagonised (chemical antagonism) or the production of an opposite effect through a different protein (functional antagonism or physiological antagonism) or as a consequence of competition for the binding site of an intermediate that links enzyme activation to the effect observed (indirect antagonism).
As used herein, the term “competitive antagonism” refers to the competition between an agonist and an antagonist for a protein that occurs when the binding of agonist and antagonist becomes mutually exclusive. This may be because the agonist and antagonist compete for the same binding sites or combine with adjacent but overlapping sites. A third possibility is that different sites are involved but that they influence the protein macromolecules in such a way that agonist and antagonist molecules cannot be bound at the same time. If the agonist and antagonist form only short lived combinations with the protein so that equilibrium between agonist, antagonist and protein is reached during the presence of the agonist, the antagonism will be surmountable over a wide range of concentrations. In contrast, some antagonists, when in close enough proximity to their binding site, may form a stable covalent bond with it and the antagonism becomes insurmountable when no spare proteins remain.
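By way of illustration only, the surmountable case described above may be modelled with the classical Gaddum equation for reversible competitive antagonism. The following is a minimal sketch; the equation is a standard pharmacological model and is not taken from the present disclosure, and the function names, dissociation constants and concentrations are hypothetical values chosen for the example.

```python
# Illustrative sketch: reversible (surmountable) competitive antagonism
# modelled with the Gaddum equation. All names and numbers are hypothetical.


def agonist_occupancy(agonist, k_a, antagonist=0.0, k_b=1.0):
    """Fraction of protein bound by agonist at equilibrium.

    agonist, antagonist: free concentrations (same units as k_a and k_b)
    k_a, k_b: equilibrium dissociation constants of agonist and antagonist
    """
    return agonist / (agonist + k_a * (1.0 + antagonist / k_b))


def dose_ratio(antagonist, k_b):
    """Factor by which the agonist concentration must be raised to restore
    the occupancy observed in the absence of antagonist."""
    return 1.0 + antagonist / k_b


if __name__ == "__main__":
    k_a, k_b = 1.0, 1.0  # hypothetical dissociation constants
    base = agonist_occupancy(1.0, k_a)
    blocked = agonist_occupancy(1.0, k_a, antagonist=3.0, k_b=k_b)
    # Raising the agonist by the dose ratio restores the original occupancy.
    restored = agonist_occupancy(1.0 * dose_ratio(3.0, k_b), k_a,
                                 antagonist=3.0, k_b=k_b)
    print(base, blocked, restored)
```

In this model the antagonist shifts the agonist concentration-response relationship to higher concentrations without lowering the maximum, which corresponds to surmountable antagonism. The insurmountable case involving a stable covalent bond is not captured by this equation.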
As mentioned above, an identified ligand or compound may act as a ligand model (for example, a template) for the development of other compounds. A modulator may be a mimetic of a ligand.
Like the test compound (see above), a modulator may be one or a variety of different sorts of molecule. (See examples herein.) A modulator may be an endogenous physiological compound, or it may be a natural or synthetic compound. The term “modulator” also refers to a chemically modified ligand or compound.
The technique suitable for preparing a modulator will depend on its chemical nature. For example, peptides can be synthesized by solid phase techniques (Roberge J Y et al (1995) Science 269: 202-204) and automated synthesis may be achieved, for example, using the ABI 431A Peptide Synthesizer (Perkin Elmer) in accordance with the instructions provided by the manufacturer. Once cleaved from the resin, the peptide may be purified by preparative high performance liquid chromatography (e.g., Creighton (1983) Proteins Structures and Molecular Principles, WH Freeman and Co, New York N.Y.). The composition of the synthetic peptides may be confirmed by amino acid analysis or sequencing (e.g., the Edman degradation procedure; Creighton, supra).
Organic compounds may be prepared by organic synthetic methods described in references such as March, 1994, Advanced Organic Chemistry: Reactions, Mechanisms, and Structure, New York, McGraw Hill.
The invention also relates to classes of modulators of polo family kinases based on the structure and shape of a substrate or component thereof, defined in relation to the substrate's spatial association with a crystal structure of the invention or part thereof.
The invention contemplates all optical isomers and racemic forms of the modulators of the invention.
Compositions
The present invention also provides the use of a modulator according to the invention, in the manufacture of a medicament to treat and/or prevent a disease in a mammalian patient. There is also provided a pharmaceutical composition comprising such a modulator and a method of treating and/or preventing a disease comprising the step of administering such a modulator or composition to a mammalian patient.
The pharmaceutical compositions may be for human or animal usage in human and veterinary medicine and will typically comprise a pharmaceutically acceptable carrier, diluent, excipient, adjuvant or combination thereof.
Acceptable carriers or diluents for therapeutic use are well known in the pharmaceutical art, and are described, for example, in Remington's Pharmaceutical Sciences, Mack Publishing Co. (A. R. Gennaro edit. 1985). The choice of pharmaceutical carrier, excipient or diluent can be selected with regard to the intended route of administration and standard pharmaceutical practice. The pharmaceutical compositions may also comprise suitable binder(s), lubricant(s), suspending agent(s), coating agent(s), solubilising agent(s).
Preservatives, stabilizers, dyes and even flavouring agents may be provided in the pharmaceutical composition. Examples of preservatives include sodium benzoate, sorbic acid and esters of p-hydroxybenzoic acid. Antioxidants and suspending agents may be also used.
The routes for administration (delivery) include, but are not limited to, one or more of: oral (e.g. as a tablet, capsule, or as an ingestible solution), topical, mucosal (e.g. as a nasal spray or aerosol for inhalation), nasal, parenteral (e.g. by an injectable form), gastrointestinal, intraspinal, intraperitoneal, intramuscular, intravenous, intrauterine, intraocular, intradermal, intracranial, intratracheal, intravaginal, intracerebroventricular, intracerebral, subcutaneous, ophthalmic (including intravitreal or intracameral), transdermal, rectal, buccal, vaginal, epidural, sublingual.
Where the pharmaceutical composition is to be delivered mucosally through the gastrointestinal mucosa, it should be able to remain stable during transit through the gastrointestinal tract; for example, it should be resistant to proteolytic degradation, stable at acid pH and resistant to the detergent effects of bile.
Where appropriate, the pharmaceutical compositions can be administered by inhalation, in the form of a suppository or pessary, topically in the form of a lotion, gel, hydrogel, solution, cream, ointment or dusting powder, by use of a skin patch, orally in the form of tablets containing excipients such as starch or lactose or chalk, or in capsules or ovules either alone or in admixture with excipients, or in the form of elixirs, solutions or suspensions containing flavouring or colouring agents, or they can be injected parenterally, for example, intravenously, intramuscularly or subcutaneously. For parenteral administration, the compositions may be best used in the form of a sterile aqueous solution which may contain other substances, for example enough salts or monosaccharides to make the solution isotonic with blood. The aqueous solutions should be suitably buffered (preferably to a pH of from 3 to 9), if necessary. The preparation of suitable parenteral formulations under sterile conditions is readily accomplished by standard pharmaceutical techniques well-known to those skilled in the art.
If the agent of the present invention is administered parenterally, then examples of such administration include one or more of: intravenously, intra-arterially, intraperitoneally, intrathecally, intraventricularly, intraurethrally, intrasternally, intracranially, intramuscularly or subcutaneously administering the agent; and/or by using infusion techniques.
For buccal or sublingual administration the compositions may be administered in the form of tablets or lozenges which can be formulated in a conventional manner.
The tablets may contain excipients such as microcrystalline cellulose, lactose, sodium citrate, calcium carbonate, dibasic calcium phosphate and glycine, disintegrants such as starch (preferably corn, potato or tapioca starch), sodium starch glycollate, croscarmellose sodium and certain complex silicates, and granulation binders such as polyvinylpyrrolidone, hydroxypropylmethylcellulose (HPMC), hydroxypropylcellulose (HPC), sucrose, gelatin and acacia. Additionally, lubricating agents such as magnesium stearate, stearic acid, glyceryl behenate and talc may be included.
Solid compositions of a similar type may also be employed as fillers in gelatin capsules. Preferred excipients in this regard include lactose, starch, cellulose, milk sugar or high molecular weight polyethylene glycols. For aqueous suspensions and/or elixirs, the agent may be combined with various sweetening or flavouring agents, colouring matter or dyes, with emulsifying and/or suspending agents and with diluents such as water, ethanol, propylene glycol and glycerin, and combinations thereof.
As indicated, the therapeutic agent (e.g. modulator) of the present invention can be administered intranasally or by inhalation and is conveniently delivered in the form of a dry powder inhaler or an aerosol spray presentation from a pressurised container, pump, spray or nebuliser with the use of a suitable propellant, e.g. dichlorodifluoromethane, trichlorofluoromethane, dichlorotetrafluoroethane, a hydrofluoroalkane such as 1,1,1,2-tetrafluoroethane (HFA 134A™) or 1,1,1,2,3,3,3-heptafluoropropane (HFA 227EA™), carbon dioxide or other suitable gas. In the case of a pressurised aerosol, the dosage unit may be determined by providing a valve to deliver a metered amount. The pressurised container, pump, spray or nebuliser may contain a solution or suspension of the active compound, e.g. using a mixture of ethanol and the propellant as the solvent, which may additionally contain a lubricant, e.g. sorbitan trioleate. Capsules and cartridges (made, for example, from gelatin) for use in an inhaler or insufflator may be formulated to contain a powder mix of the agent and a suitable powder base such as lactose or starch.
Therapeutic administration of polypeptide modulators may also be accomplished using gene therapy. A nucleic acid including a promoter operatively linked to a heterologous polypeptide may be used to produce high-level expression of the polypeptide in cells transfected with the nucleic acid. DNA or isolated nucleic acids may be introduced into cells of a subject by conventional nucleic acid delivery systems. Suitable delivery systems include liposomes, naked DNA, and receptor-mediated delivery systems, and viral vectors such as retroviruses, herpes viruses, and adenoviruses.
The invention further provides a method of treating a mammal, the method comprising administering to a mammal a modulator or pharmaceutical composition of the present invention.
Typically, a physician will determine the actual dosage which will be most suitable for an individual subject and it will vary with the age, weight and response of the particular patient and severity of the condition. The dosages below are exemplary of the average case. There can, of course, be individual instances where higher or lower dosage ranges are merited.
The specific dose level and frequency of dosage for any particular patient may be varied and will depend upon a variety of factors including the activity of the specific compound employed, the metabolic stability and length of action of that compound, the age, body weight, general health, sex, diet, mode and time of administration, rate of excretion, drug combination, the severity of the particular condition, and the individual undergoing therapy. By way of example, the pharmaceutical composition of the present invention may be administered in accordance with a regimen of 1 to 10 times per day, such as once or twice per day.
For oral and parenteral administration to human patients, the daily dosage level of the agent may be in single or divided doses.
Applications
The modulators and compositions of the invention may be useful in treating, inhibiting, or preventing diseases modulated by polo family kinases. They may be used to treat, inhibit, or prevent proliferative diseases. The modulators may be used to stimulate or inhibit cell proliferation.
Accordingly, modulators of the invention may be useful in the prevention and treatment of conditions including but not limited to lymphoproliferative conditions, malignant and pre-malignant conditions, arthritis, inflammation, and autoimmune disorders. Malignant and pre-malignant conditions may include solid tumors, B cell lymphomas, chronic lymphocytic leukemia, chronic myelogenous leukemia, prostate hypertrophy, Hirschsprung disease, glioblastoma, breast and ovarian cancer, adenocarcinoma of the salivary gland, premyelocytic leukemia, prostate cancer, multiple endocrine neoplasia type IIA and IIB, medullary thyroid carcinoma, papillary carcinoma, papillary renal carcinoma, hepatocellular carcinoma, gastrointestinal stromal tumors, sporadic mastocytosis, acute myeloid leukemia, large cell lymphoma or Alk lymphoma, chronic myeloid leukemia, hematological/solid tumors, papillary thyroid carcinoma, stem cell leukemia/lymphoma syndrome, acute myelogenous leukemia, osteosarcoma, multiple myeloma, preneoplastic liver foci, and resistance to chemotherapy. Diseases associated with increased cell survival, or the inhibition of apoptosis, include cancers (e.g. follicular lymphomas, carcinomas with p53 mutations, hormone-dependent tumors such as breast cancer, prostate cancer, Kaposi's sarcoma and ovarian cancer); autoimmune disorders (such as lupus erythematosus, immune-related glomerulonephritis and rheumatoid arthritis); viral infections (such as herpes viruses, pox viruses, and adenoviruses); inflammation, graft vs. host disease, acute graft rejection and chronic graft rejection.
Modulators that stimulate cell proliferation may be useful in the treatment of conditions involving damaged cells including conditions in which degeneration of tissue occurs such as arthropathy, bone resorption, inflammatory disease, degenerative disorders of the central nervous system, and for promoting wound healing.
The invention will now be illustrated by the following non-limiting examples:
The following methods were used in the investigation described in the example: Protein expression, mutagenesis and purification: The polo domain of Sak (residues 839 to 925), which was delimited by proteolysis and mass spectrometry, was expressed in E. coli as a GST-fusion protein using the pGEX-2T vector (Pharmacia). The QuikChange™ kit (Stratagene) was used to generate the double site-directed mutant C909L/V874M to improve long-term protein stability and for phasing purposes. Protein was purified by affinity chromatography using glutathione-sepharose (Pharmacia). Bound protein was eluted by cleavage with thrombin (Sigma). Eluate was applied to a HiQ ion-exchange column under low salt conditions. The flow-through containing the polo domain was concentrated to approximately 1 mM and then applied to a Superdex 75 gel filtration column (Pharmacia) for final purification and characterization by static light scattering as described by Luo et al. [35].
Crystallization and data collection: Hanging drops containing 1 μl of 50 mg ml−1 native or mutant protein in 20 mM Hepes pH 8.0, 5 mM dithiothreitol (DTT), were mixed with equal volumes of reservoir buffer containing 100 mM Tris pH 7.0, 32.5% (v/v) Jeffamine M-600 (Hampton), and 200 mM MgCl2. Hexagonal-like crystals of approximate dimensions 0.10×0.10×0.03 mm were obtained overnight for both native and mutant proteins. The asymmetric unit of the crystals consists of two polypeptides forming an interdigitated dimer. The crystals belong to the space group P3₂12 (a=b=51.782 Å, c=146.941 Å).
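As a consistency check on the cell parameters given above, the unit cell volume, the Matthews coefficient and an approximate solvent content may be estimated as follows. This is a minimal sketch under stated assumptions that are not part of the original disclosure: hexagonal axes (γ = 120°) for the trigonal space group, six symmetry operators per unit cell, a chain mass of 10.8 kDa (the predicted monomer molecular weight reported in the Results), and the conventional factor of 1.23 Å³/Da for the protein volume. The function names are hypothetical.

```python
import math


def hexagonal_cell_volume(a, c):
    """Unit cell volume in cubic angstroms for a = b, alpha = beta = 90 deg,
    gamma = 120 deg (hexagonal axes, assumed here for the trigonal cell)."""
    return a * a * c * math.sin(math.radians(120.0))


def matthews_coefficient(cell_volume, n_sym_ops, chains_per_asu, chain_mass_da):
    """Matthews coefficient V_M in cubic angstroms per dalton."""
    return cell_volume / (n_sym_ops * chains_per_asu * chain_mass_da)


def solvent_fraction(v_m):
    """Approximate solvent fraction using the conventional 1.23 factor."""
    return 1.0 - 1.23 / v_m


if __name__ == "__main__":
    volume = hexagonal_cell_volume(51.782, 146.941)  # values from the text
    # Assumptions: 6 symmetry operators, 2 chains per asymmetric unit,
    # 10.8 kDa per chain (predicted monomer mass).
    v_m = matthews_coefficient(volume, 6, 2, 10800.0)
    print(f"cell volume    = {volume:.0f} A^3")
    print(f"V_M            = {v_m:.2f} A^3/Da")
    print(f"solvent (est.) = {solvent_fraction(v_m):.2f}")
```

Under these assumptions the estimate falls within the range commonly observed for protein crystals, which is consistent with two polypeptides per asymmetric unit.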
MAD diffraction data were collected on frozen crystals at the Structural Biology Center 19-BM and BIOCARS 14-BMC at the Advanced Photon Source at Argonne National Laboratory. Data processing and reduction were carried out using HKL 2000 [36]. Heavy atom sites were identified using CNS [37] and phasing, density modification, and experimental electron density map calculation were performed using SHARP3 [38].
Model building and Refinement: Model building was performed using O [39]. A starting model comprised of approximately 85% of the polypeptide sequence was refined using CNS [37]. Bulk solvent correction was applied during refinement and simulated annealing protocols were employed. The remaining structure was built into 2|Fo-Fc| electron density maps generated with CNS. The final refinement statistics are shown in Table 1. The first and last 6 residues of the polo domain fragment are disordered (residues 839 to 844 and residues 920 to 925) and have not been modeled. Analysis by PROCHECK [40] indicated that no amino acid residues occupy disallowed regions of the Ramachandran plot and 94% occupy the most favored regions.
Sak protein localization: Full length Sak (residues 1-925), SakΔpd (residues 1-823), Sak241 (residues 596-836), SakΔ(pd+241) (residues 1-595), and Sakpd (residues 824-925) were fused to enhanced green fluorescent protein (EGFP) in the vector pEGFP-C1 (Clontech). NIH 3T3 murine fibroblast cells were maintained in DMEM containing 10% FBS. For transient gene expression, cells at 20-30% confluence on glass cover slips were transiently transfected with pEGFP-Sak, pEGFP-SakΔpd, pEGFP-SakΔ(pd+241), pEGFP-Sak241, pEGFP-Sakpd, or pEGFP-C1 with Effectene™ (Qiagen). Cells were released from 48 h of serum starvation by addition of fresh media containing 10% FBS and fixed at intervals as they proceeded through the cell cycle. Cells were processed by rinsing twice in PBS, fixed with 3.7% para-formaldehyde in PBS for 12 min, and permeabilized for 5 min in PBS containing 0.5% Triton X-100. Actin microfilaments were stained with a 1:100 dilution of TRITC-phalloidin (Sigma) in PBS. γ-tubulin was stained with a 1:200 dilution of anti-γ-tubulin antibody (Sigma) in Tris/Saline+0.1% Tween20 at 20° C. for 40 min. Cells were washed three times in Tris/Saline+0.1% Tween20 and incubated in a 1:500 dilution of rhodamine-conjugated goat anti-mouse antibody (Pierce) for 40 min. Nuclei were stained with Hoechst 33258 (Molecular Probes) in PBS for 1 min. Images were obtained using an Olympus IX-70 inverted microscope equipped with a Princeton CCD camera and Deltavision Deconvolution microscopy software (Applied Precision).
Quantification of EGFP fusion proteins exhibiting centrosomal localization was performed by counting three independent populations of 100 cells. Because of the inability to generate large populations of cells undergoing cytokinesis, the localization of EGFP fusion proteins to the cleavage furrow was not scored. The SakΔpd construct (residues 1-823) fused to EGFP differs from the FLAG- and Myc-tagged SakΔpd construct (residues 1-836) prepared for coimmunoprecipitation studies by a deletion of 13 amino acid residues from the C-terminus. The Sakpd construct (residues 824-925) fused to EGFP differs from the FLAG- and Myc-tagged Sakpd (residues 819-925) prepared for coimmunoprecipitation studies by the deletion of 5 amino acid residues at the N-terminus.
Immunoprecipitation: NIH 3T3 murine fibroblast cells were maintained in DMEM containing 10% FBS. For transient gene expression, cells at 30-40% confluence were transfected using Effectene™ (Qiagen). At 24 h post transfection, cells were lysed in 50 mM Tris pH 7.5, 100 mM NaCl, 1 mM EDTA, 0.5% Triton-X 100. Immunoprecipitations were performed using anti-FLAG antibody (Sigma) and Protein G Sepharose (Pharmacia) according to product specifications. The Protein G sepharose matrix was washed three times with lysis buffer. Western blots were performed using a 1:200 dilution of anti-Myc antibody (Santa Cruz Biotech) or a 1:4000 dilution of anti-FLAG antibody (Sigma).
Coordinates
The Sak polo domain coordinates are in Table 2.
Results and Discussion
A protein fragment encompassing the polo box motif of Sak (residues 839 to 925) was expressed and characterized. Using limited proteolysis and mass spectrometry, it was found that the polo box motif comprises an autonomously folding unit, which is designated the polo domain, that behaves as a dimer in solution as indicated by size exclusion chromatography and static light scattering analysis (SLS molecular weight=22.6±0.9 kDa versus predicted monomer molecular weight=10.8 kDa). The domain was crystallized and its structure determined using the selenomethionine multiple anomalous dispersion (SeMet MAD) method. Structure determination and crystallographic refinement statistics are provided in Table 1. A comprehensive structure based sequence alignment of the polo domain is shown in
Structure Description
The crystal structure of the polo domain of Sak is dimeric, consisting of two α-helices and two six-stranded β-sheets (
Residues of Sak that compose much of the polo domain hydrophobic core are highly conserved across the Plks (
The presence of two polo domains in all Plks other than the Sak orthologs raises an interesting possibility for an intramolecular mode of polo domain dimerization. In support of this possibility is a covariance in primary structure across paired polo domains involving the conserved salt bridge (Asp 868 and Lys 906) and a dimer interface residue equivalent to Val 846 in Sak (
While less conserved than the hydrophobic core and dimer interface structure, the interfacial cleft and pocket display properties suggestive of a functionally important surface. Of the 19 conserved hydrophobic positions in the polo domain alignment, 9 contribute side chains to the outer cleft and inner pocket (
Polo Domain Self-Association in vivo
To investigate the ability of the polo domain of Sak to dimerize in vivo, differentially tagged mammalian expression constructs were generated and tested for self-association in vivo using a coimmunoprecipitation assay. As shown in
Polo Domain Subcellular Localization
To investigate the role of the polo domain in the subcellular localization of Sak, enhanced green fluorescent protein (EGFP) fusion constructs of Sak, SakΔpd, SakΔ(pd+241), Sak241, and Sakpd were transiently transfected into NIH 3T3 cells and examined using immunofluorescence. EGFP-Sak colocalizes in cells with γ-tubulin and actin, which indicate the positions of centrosomes and the cleavage furrow, respectively (
The polo domain of Sak forms dimers both in vitro and in a crystal environment, can self-associate in vivo, and localizes to mitotic structures. The conservation of the hydrophobic core and dimer interface residues, the presence of two copies of the polo domain in most Plks, and the covariance across tandem polo domains in most Plks suggest that the ability to adopt a dimeric conformation may be a general characteristic of all polo domains and that dimerization may occur in an intramolecular manner for some family members.
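The in vitro dimer assignment summarized above rests on the static light scattering result reported earlier (22.6±0.9 kDa measured versus a predicted monomer molecular weight of 10.8 kDa). The arithmetic can be sketched as follows; the function name and the use of simple rounding to assign the oligomeric state are illustrative assumptions and not a method described in the specification.

```python
def oligomer_estimate(measured_kda, measured_err_kda, monomer_kda):
    """Estimate the number of chains per particle from a measured mass.

    Returns the nearest integer stoichiometry, the raw mass ratio, and the
    ratio range implied by the stated measurement uncertainty.
    """
    ratio = measured_kda / monomer_kda
    low = (measured_kda - measured_err_kda) / monomer_kda
    high = (measured_kda + measured_err_kda) / monomer_kda
    return round(ratio), ratio, (low, high)


if __name__ == "__main__":
    # Values taken from the static light scattering analysis in the text.
    n, ratio, (low, high) = oligomer_estimate(22.6, 0.9, 10.8)
    print(f"chains per particle: {n} (ratio {ratio:.2f}, range {low:.2f}-{high:.2f})")
```

The whole uncertainty range of the ratio lies close to two, so the measured mass is consistent with a dimer and not with a monomer or trimer.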
The deregulation of Plks alters mitotic checkpoints and chromosome stability and can lead to tumour development [27, 28]. Indeed, Plk1 is overexpressed in many human tumours [29-32] and causes malignant transformation when overexpressed in NIH 3T3 cells [33]. In addition, overexpression of a kinase-deficient form of Plk1 results in cell death, an apparent dominant-negative effect that is more pronounced in tumor cells than non-transformed cells [34]. This identifies the Plks as potential targets for cancer therapy. The requirement of the polo domain for Plk family function and, in contrast to the catalytic domain, its exclusive presence in this small family of proteins that regulate mitotic progression suggest that the polo domain itself may serve as a good target for intervention. Indeed, the large semi-enclosed cleft and pocket with its partial hydrophobic character appears well suited for the design of small molecule inhibitors.
The present invention is not to be limited in scope by the specific embodiments described herein, since such embodiments are intended as but single illustrations of one aspect of the invention and any functionally equivalent embodiments are within the scope of this invention. Indeed, various modifications of the invention in addition to those shown and described herein will become apparent to those skilled in the art from the foregoing description and accompanying drawings. Such modifications are intended to fall within the scope of the appended claims.
All publications, patents and patent applications referred to herein are incorporated by reference in their entirety to the same extent as if each individual publication, patent or patent application was specifically and individually indicated to be incorporated by reference in its entirety. All publications, patents and patent applications mentioned herein are incorporated herein by reference for the purpose of describing and disclosing the cell lines, vectors, methodologies etc. which are reported therein which might be used in connection with the invention. Nothing herein is to be construed as an admission that the invention is not entitled to antedate such disclosure by virtue of prior invention.
It must be noted that as used herein and in the appended claims, the singular forms “a”, “an”, and “the” include plural reference unless the context clearly dictates otherwise. Thus, for example, reference to “a host cell” includes a plurality of such host cells, reference to the “antibody” is a reference to one or more antibodies and equivalents thereof known to those skilled in the art, and so forth.
Full Citations for References Referred to in the Specification
Number | Date | Country | |
---|---|---|---|
60357475 | Feb 2002 | US | |
60360704 | Feb 2002 | US |