The invention relates to a high resolution crystal structure for human Rab9, and in particular to methods of use for this crystal structure for drug discovery.
Rab proteins are the largest subfamily of the Ras-like small GTPase superfamily and serve as key regulators in vesicular transport (Lombardi et al. (1993) EMBO J. 12: 677-82). Rab proteins are thought to function in vesicle fusion and/or targeting events, a hypothesis bolstered by evidence that most organelles of the endocytic and secretory pathways bear distinct Rab proteins on their surfaces (Novick et al. (1980) Cell 21: 205-15; Plutner et al. (1991) J. Cell Biol. 115: 31-43; Rexach and Schekman (1991) J. Cell Biol. 114: 219-29; Segev (1991) Science 252: 1553-56; Gorvel et al. (1991) Cell 64: 915-25; Lombardi et al. (1993) EMBO J. 12: 677-82). Biochemical and genetic studies of chimeric and mutant Rab proteins have identified several hypervariable regions, including the N- and C-termini and the α3/β5 loop, that play an important role in determining functional specificity (Brennwald and Novick (1993) Nature 362: 560-63; Stenmark et al. (1994) EMBO J. 13: 575-83).
Rab proteins serve as molecular switches mediating tethering, docking, fusion, and motility of intracellular membranes by cycling between GTP-bound (on/active) and GDP-bound (off/inactive) conformations (Pfeffer (2001) Trends Cell Biol. 11: 487-91; Pfeffer (1994) Curr. Opin. Cell Biol. 6: 522-26; Bourne et al. (1991) Nature 349: 117-27; Sprang (1997) Ann. Rev. Biochem. 66: 639-78). The active form is stabilized by additional hydrogen bond interactions with the γ-phosphate of GTP, mediated by serine residues in the phosphate-binding loop (P-loop) and switch I region, as well as an extensive hydrophobic interface between the switch I and II regions (Dumas et al. (1999) Structure Fold Des. 7: 413-23; Esters et al. (2000) J. Mol. Biol. 298: 111-21). The inactive conformation usually has displaced and mobile switch regions (Stroupe and Brunger (2000) J. Mol. Biol. 304: 585-98).
One particular Rab protein, Rab9, was first identified in a screen of cDNA clones (Chavrier et al. (1990) Cell 62: 317-29). Rab9 is localized predominantly on the surface of late endosomes and stimulates the recycling of mannose-6-phosphate receptors (MPRs) from late endosomes to the trans-Golgi network (TGN) (Lombardi et al. (1993) EMBO J. 12: 677-82; Riederer et al. (1994) J. Cell Biol. 125: 573-82). Rab9 also interacts with the vesicle cargo selection protein TIP47, which has been shown to bind the cytoplasmic tail of the HIV envelope glycoprotein subunit gp41 (Blot et al. (2003) J. Virol. 77: 6931-45).
Recent interest in Rab9 has focused on its role in HIV-1, Ebola, Marburg, and measles virus replication, making drugs that target Rab9 of significant technical and commercial interest. However, no crystal structure of human Rab9 has been available that could be used for structure-guided drug design. There is therefore a need for a method to determine a high resolution crystal structure of human Rab9, as well as methods of using this crystal structure for drug discovery.
The present invention relates to crystalline Rab9 in complex with either GDP or the GTP analog Guanosine 5′-(β,γ-imido) triphosphate (GppNHp), the three dimensional coordinates and structures of Rab9 in Rab9-GDP and Rab9-GppNHp complexes, and uses thereof for drug design. In particular, the present invention relates to methods for producing Rab9 crystals of sufficient quality to obtain a determination of the three dimensional structure of Rab9 to a high resolution in both its GTP-bound (on/active) and GDP-bound (off/inactive) conformations. The present invention also relates to a computer-readable medium encoded with the three dimensional coordinates of Rab9 in Rab9-GDP and Rab9-GppNHp complexes wherein, using a graphical display software program, the three dimensional coordinates create an electronic file that can be visualized on a computer capable of representing the electronic file as a three dimensional image.
In one embodiment, a crystal of a C-terminally truncated human Rab9 is provided having the three dimensional atomic coordinates of Table 4 or Table 6, wherein said Rab9 comprises amino acids 1-177 of SEQ ID NO:1.
In another embodiment, a crystal of a Rab9-GDP complex is provided wherein the crystal effectively diffracts X-rays for the determination of the atomic coordinates of the Rab9-GDP complex to a resolution of less than about 1.75 Å, preferably between about 1.25 Å and about 1.75 Å, and most preferably about 1.25 Å.
In another embodiment, a crystal of a Rab9-GDP complex is provided having a space group of P1 and a unit cell of dimensions a=38.40 Å, b=45.62 Å, c=51.22 Å, α=99.8°, β=107.2°, and γ=101.8°.
In another embodiment, a Rab9-GDP complex is provided wherein the Rab9 has secondary structural elements that include six β-sheets and five α-helices, wherein the β-sheets are designated B1-B6 and wherein the α-helices are designated H1-H5, and wherein the β-sheets and the α-helices connect with a B1-H1-B2-B3-H2-B4-H3-B5-H4-B6-H5 topology.
In another embodiment, a crystal of a Rab9-GppNHp complex is provided wherein the crystal effectively diffracts X-rays for the determination of the atomic coordinates of the Rab9-GppNHp complex to a resolution of less than about 20 Å, preferably less than about 2 Å, more preferably between about 1.25 Å and about 2 Å, and most preferably about 1.73 Å.
In another embodiment, a crystal of a Rab9-GppNHp complex is provided having a space group of P212121 and a unit cell of dimensions a=56.24 Å, b=76.60 Å, and c=174.35 Å.
In another embodiment, a Rab9-GppNHp complex is provided wherein the Rab9 has secondary structural elements that include six β-sheets and five α-helices, wherein the β-sheets are designated B1-B6 and wherein the α-helices are designated H1-H5, and wherein the β-sheets and the α-helices connect with a B1-H1-B2-B3-H2-B4-H3-B5-H4-B6-H5 topology.
In another aspect, a method of using any of the crystals described above to screen for an agent that modulates Rab9 activity is provided, comprising the steps of: a) selecting a candidate agent by performing structure based drug design with the three-dimensional structure determined for the crystal, wherein selection is performed in conjunction with computer modeling; b) contacting the candidate agent with Rab9; and c) detecting the ability of the candidate agent to modulate Rab9. In another embodiment, the screening method comprises use of a candidate agent selected from a database, designed de novo, or designed from a known modulator of Rab9 activity. In yet another embodiment, the screening method comprises a candidate agent that is an inhibitor of Rab9, including competitive inhibitors and non-competitive or uncompetitive inhibitors of Rab9. In yet another embodiment, the step of employing the three-dimensional structure to design or select the candidate agent further comprises the steps of: a) screening for chemical entities or fragments capable of associating with Rab9; and b) assembling the chemical entities or fragments into a single molecule to provide the structure of the candidate agent.
In another embodiment, a method of screening for an agent that modulates Rab9 activity is provided, comprising the steps of: a) providing a model of Rab9 including at least one of the binding sites defined by at least one of Table 2,
In another embodiment of the screening method, the binding sites are selected from the group consisting of Region I, Region II, Region III, Region IV, Region V, Region VI, and Region VII of the C-terminally truncated human Rab9 defined by the amino acid sequence of SEQ ID NO:1, wherein Region I consists of amino acids 1 to 7 of SEQ ID NO:1, Region II consists of amino acids 33 to 43 of SEQ ID NO:1, Region III consists of amino acids 50 to 54 of SEQ ID NO:1, Region IV consists of amino acids 63 to 79 of SEQ ID NO:1, Region V consists of amino acids 108 to 117 of SEQ ID NO:1, Region VI consists of amino acids 128 to 130 of SEQ ID NO:1, and Region VII consists of amino acids 168 to 177 of SEQ ID NO:1. In yet another embodiment, the screening method screens for an agent that modulates Rab9 activity by inhibiting the binding of a Rab9-associated protein with Rab9, including but not limited to the Rab9-associated proteins TIP47 or P40.
In another embodiment, a method of screening for an agent that modulates Rab9 activity is provided, comprising the steps of: a) providing a three-dimensional structure of Rab9 as defined by the atomic coordinate data of Table 4 or Table 6, wherein said Rab9 comprises amino acids 1-177 of SEQ ID NO:1; b) employing the three-dimensional structure to design or select a candidate agent; c) synthesizing the candidate agent; and d) contacting the candidate agent with the Rab9 in the presence of a substrate to test the ability of the candidate agent to modulate the activity of the Rab9. In another embodiment, the screening method comprises use of a candidate agent selected from a database, designed de novo, or designed from a known modulator of Rab9 activity. In yet another embodiment, the screening method comprises a candidate agent that is an inhibitor of Rab9, including competitive inhibitors and non-competitive or uncompetitive inhibitors of Rab9. In yet another embodiment, the step of employing the three-dimensional structure to design or select the candidate agent further comprises the steps of: a) screening for chemical entities or fragments capable of associating with Rab9; and b) assembling the chemical entities or fragments into a single molecule to provide the structure of the candidate agent.
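By way of illustration only, the residue ranges recited above for Regions I through VII may be held in a simple lookup so that any residue number of SEQ ID NO:1 can be assigned to its region. The following is a minimal sketch; the names `REGIONS`, `region_of`, and `residues_in_regions` are hypothetical and are not part of the disclosure.

```python
# Region boundaries as recited in the text (residue numbers of SEQ ID NO:1,
# both ends inclusive). The dictionary and function names are illustrative.
REGIONS = {
    "Region I": (1, 7),
    "Region II": (33, 43),
    "Region III": (50, 54),
    "Region IV": (63, 79),
    "Region V": (108, 117),
    "Region VI": (128, 130),
    "Region VII": (168, 177),
}


def region_of(residue_number):
    """Return the name of the region containing the residue, or None."""
    for name, (start, end) in REGIONS.items():
        if start <= residue_number <= end:
            return name
    return None


def residues_in_regions(residue_numbers):
    """Group residue numbers by region; residues outside all regions are dropped."""
    grouped = {}
    for number in residue_numbers:
        name = region_of(number)
        if name is not None:
            grouped.setdefault(name, []).append(number)
    return grouped
```

Such a lookup could be used, for example, to filter a list of residues contacted by a candidate agent down to those lying within the recited binding site regions.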
In another embodiment, a method of screening for an agent that modulates Rab9 activity is provided, comprising: a) selecting or designing a candidate agent by performing structure based drug design with a computer system encoded with computer readable data comprising atomic coordinate data or binding site data or both, wherein said atomic coordinate data is defined by Table 4 or Table 6, and wherein said binding site data is defined by at least one of Table 2,
In another embodiment, a computer system is provided, where the computer system is encoded with computer readable data comprising atomic coordinate data or binding site data or both, wherein said atomic coordinate data is defined by Table 4 or Table 6, and wherein said binding site data is defined by at least one of Table 2,
In another embodiment, a computer-readable medium is provided, where the computer-readable medium is encoded with atomic coordinate data or binding site data or both, wherein said atomic coordinate data is defined by Table 4 or Table 6, and wherein said binding site data is defined by at least one of Table 2,
This patent contains multiple drawings executed in color. Copies of this patent with color drawings will be provided by the Office upon request and payment of the necessary fee.
The present invention now will be described more fully hereinafter with reference to the accompanying examples, in which some, but not all embodiments of the invention are shown. Indeed, these inventions may be embodied in many different forms and should not be construed as limited to the embodiments set forth herein; rather, these embodiments are provided so that this disclosure will satisfy applicable legal requirements. Like numbers refer to like elements throughout.
Many modifications and other embodiments of the inventions set forth herein will come to mind to one skilled in the art to which these inventions pertain having the benefit of the teachings presented in the foregoing descriptions and the associated drawings. Therefore, it is to be understood that the inventions are not to be limited to the specific embodiments disclosed and that modifications and other embodiments are intended to be included within the scope of the appended claims. Although specific terms are employed herein, they are used in a generic and descriptive sense only and not for purposes of limitation.
The articles “a” and “an” are used herein to refer to one or more than one (i.e., to at least one) of the grammatical object of the article. By way of example, “an element” means one or more than one element.
As used herein, sequences contain the one letter code for nucleotide sequence characters and the three letter codes for amino acids as defined in conformity with the IUPAC-IUB standards described in Nucleic Acids Res. 13, 3021-3030 (1985) and in the Biochemical Journal 219, 345-373 (1984) which are herein incorporated by reference. Specifically, abbreviations of amino acids are defined below:
The terms “isolated” and “biologically pure” do not necessarily reflect the extent to which the protein has been purified. An isolated protein of the present invention can be obtained from its natural source, can be produced using recombinant DNA technology or can be produced by chemical synthesis. It is also to be noted that the terms “tertiary” and “three-dimensional” can be used interchangeably. It is also to be noted that reference to a “Rab9 protein” or a “Rab9 GTPase” can also be recited simply as “Rab9” and such terms can be used to refer to the complete Rab9 protein, a portion of the Rab9 protein, such as a polypeptide, and/or a monomer or a dimer of the Rab9 protein. When reference is specifically made to a monomer or dimer, for example, such term is typically used in conjunction with the Rab9 protein name.
The term “unit cell” refers to a basic parallelepiped-shaped block (in other words, a six faced block, each a parallelogram and each being parallel to the opposite face). The entire volume of a crystal may be constructed by regular assembly of such blocks. Each unit cell comprises a complete representation of the unit of pattern, the repetition of which builds up the crystal.
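As an illustrative aside, the volume of such a parallelepiped follows from the cell edges and angles by the standard crystallographic formula V = abc·sqrt(1 − cos²α − cos²β − cos²γ + 2·cosα·cosβ·cosγ). The sketch below applies that formula to the unit cell dimensions recited herein for the two crystal forms; the function name `unit_cell_volume` is hypothetical, and the orthorhombic cell is assumed to have all angles equal to 90°, as implied by its space group.

```python
import math


def unit_cell_volume(a, b, c, alpha=90.0, beta=90.0, gamma=90.0):
    """Volume in cubic angstroms of a general unit cell; angles in degrees."""
    ca = math.cos(math.radians(alpha))
    cb = math.cos(math.radians(beta))
    cg = math.cos(math.radians(gamma))
    return a * b * c * math.sqrt(1.0 - ca**2 - cb**2 - cg**2 + 2.0 * ca * cb * cg)


# Triclinic (P1) cell recited for the Rab9-GDP crystal.
p1_volume = unit_cell_volume(38.40, 45.62, 51.22, 99.8, 107.2, 101.8)

# Orthorhombic cell recited for the Rab9-GppNHp crystal (angles assumed 90 degrees).
ortho_volume = unit_cell_volume(56.24, 76.60, 174.35)
```

For a cell with all angles at 90° the formula reduces to the simple product of the three edge lengths.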
The term “space group” refers to the arrangement of symmetry elements of a crystal.
The term “root mean square deviation” refers to the square root of the arithmetic mean of the squares of the deviations from the mean.
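The definition above may be sketched in code as follows. The first function implements the definition literally for a list of values. The second is an assumption added for illustration: in structural comparisons the "deviations" are commonly taken as distances between equivalent atoms of two already superimposed structures, which is not stated in the definition itself. Both function names are hypothetical.

```python
import math


def rms_deviation_from_mean(values):
    """Square root of the arithmetic mean of squared deviations from the mean."""
    mean = sum(values) / len(values)
    return math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))


def coordinate_rmsd(coords_a, coords_b):
    """RMS distance between paired (x, y, z) atoms of two superimposed structures.

    Assumes the two lists are the same length and already in the same frame;
    no superposition is performed here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))
```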
The term “complex” refers to a Rab9 protein (or Rab9 truncation or homologue) in covalent or non-covalent association with a ligand. Ligand may include, for example, a chemical entity, compound, or inhibitor, candidate drug, and the like. The term “association” refers to a condition of proximity between the ligand and Rab9, or their respective portions thereof, in any appropriate physicochemical interaction. Ligands may include, but are not limited to, GDP, GTP, and GTP analog Guanosine 5′-(β,γ-imido) triphosphate (GppNHp).
The term “binding site” refers to a site, such as an atom or functional group of an amino acid residue, in the Rab9 active site that may bind to an agent or Rab9-associated protein. The term “active site” refers to any or all of the following: 1) the portion of the Rab9 sequence that binds to a substrate; 2) the portion of the Rab9 sequence that binds to an inhibitor; 3) the portion of the Rab9 sequence that binds to GDP, GTP, and/or GppNHp; and 4) the portion of the Rab9 sequence that binds to a Rab9-associated protein. Depending on the particular molecule bound to Rab9, sites may exhibit attractive or repulsive binding interactions, brought about by charge, steric considerations and the like.
The term “Rab9-associated protein” refers to a protein that normally interacts with Rab9, including but not limited to such Rab9 interacting proteins as Rab effector P40 and the vesicle cargo selection protein TIP47.
By “fitting” is meant determining by automatic, or semi-automatic means, interactions between one or more atoms of an agent and one or more binding sites of Rab9, and determining the extent to which such interactions are stable. Various computer-based methods for fitting are described further herein.
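A very simple stand-in for one ingredient of such fitting is to count the atom pairs of an agent and a binding site that lie within a distance cutoff. The sketch below is illustrative only: the function name `count_contacts` and the 4.0 Å default cutoff are assumptions, and practical fitting programs use far richer scoring of interaction stability than a contact count.

```python
import math


def count_contacts(agent_atoms, site_atoms, cutoff=4.0):
    """Count agent/site atom pairs within `cutoff` angstroms of each other.

    Each atom is an (x, y, z) tuple in angstroms. The cutoff value is an
    illustrative assumption, not a value taken from the disclosure.
    """
    contacts = 0
    for agent_atom in agent_atoms:
        for site_atom in site_atoms:
            if math.dist(agent_atom, site_atom) <= cutoff:
                contacts += 1
    return contacts
```

A higher count for one placement of an agent than for another would suggest, under this crude measure, closer packing against the binding site.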
By a “computer system” is meant the hardware means, software means and data storage means used to analyze atomic coordinate data. The computer may comprise a central processing unit (“CPU”), a working memory, for example, random access memory (“RAM”) and/or storage memory in the form of one or more disk drives (e.g., floppy, Zip™, Jazz™), tape drives, CD-ROM drives, DVD drives, and the like, a display terminal such as for example, a cathode ray tube type display, and input and output lines for data transmission, including a keyboard and/or mouse controller. The computer may be a stand-alone, or connected to a network and/or shared server. Examples of such systems are microcomputer workstations available from Silicon Graphics Incorporated and Sun Microsystems running Unix based, Windows NT or IBM OS/2 operating systems.
By “computer readable media” or “CRM” is meant any media which can be read and accessed directly by a computer, for example so that the media is suitable for use in the above-mentioned computer system. Computer-readable data storage materials include, for example, floppy discs, hard disc storage medium and magnetic tape; optical storage media such as optical discs or CD-ROM; electrical storage media such as RAM and ROM; Zip™ and Jazz™-type disks; tapes; CDs; and DVDs; and hybrids of these categories such as magnetic/optical storage media.
Rab proteins, the largest subfamily of the Ras-like small GTPase superfamily, serve as molecular switches mediating tethering, docking, fusion, and motility of intracellular membranes (Pfeffer, S. R. (2001) Trends Cell Biol. 11, 487-491). Rabs cycle between GTP-bound (on/active) and GDP-bound (off/inactive) conformations (Pfeffer, S. R. (1994) Curr. Opin. Cell Biol. 6, 522-526; Bourne, H. R. et al. (1991) Nature 349, 117-127; Sprang, S. R. (1997) Annu. Rev. Biochem. 66, 639-678). The active form is stabilized by additional hydrogen bond interactions with the γ-phosphate of GTP mediated by serine residues in the phosphate-binding loop and switch I region as well as an extensive hydrophobic interface between the switch I and II regions (Dumas, J. J. et al. (1999) Structure Fold Des. 7, 413-423; Esters, H. et al. (2000) J. Mol. Biol. 298, 111-121). The inactive conformation usually has displaced and mobile switch regions (Bourne, H. R. et al. (1991) Nature 349, 117-127; Stroupe, C., and Brunger, A. T. (2000) J. Mol. Biol. 304, 585-598). Biochemical and genetic studies of chimeric and mutant Rab proteins have identified several hypervariable regions, including the N and C termini and the α3/β5 loop, that play an important role in determining functional specificity (Brennwald, P., and Novick, P. (1993) Nature 362, 560-563; Stenmark, H. et al. (1994) EMBO J. 13, 575-583).
The Rab9 GTPase is localized predominantly to late endosomes and is required for the transport of mannose 6-phosphate receptors from endosomes to the trans-Golgi network (Lombardi, D. et al. (1993) EMBO J. 12, 677-682; Riederer, M. A. et al. (1994) J. Cell Biol. 125, 573-582). By targeting Rab9 mRNA for degradation with small interfering RNA, Rab9 has been identified to be a key cellular component for HIV-1, Ebola, Marburg, and measles virus replication, suggesting that inhibitors of Rab9 function, if developed, might prove useful in the control of those viruses.
Rab9-associated proteins may be involved in the normal physiological activity of Rab9. For example, Rab9 facilitates vesicular transport by pairing with its cognate Rab effector P40 (Diaz, E. et al. (1997) J. Cell Biol. 138, 283-290). Rab9 also interacts with the vesicle cargo selection protein TIP47, which has been shown to bind the cytoplasmic tail of the HIV envelope glycoprotein subunit gp41 (Blot, G. et al. (2003) J. Virol. 77, 6931-6945). Because Rab9-associated proteins may be involved in the normal physiological activity of Rab9, in one embodiment, binding sites mediating the interaction of Rab9 with its associated proteins are provided as targets for structure-based drug design and development.
In one embodiment of the present invention, a crystal of a C-terminally truncated human Rab9 is provided having the three dimensional atomic coordinates of Table 4 or Table 6, wherein said Rab9 comprises amino acids 1-177 of SEQ ID NO:1.
As discussed herein, the term “Rab9” is intended to include all or part of the amino acid sequence of SEQ ID NO:1 as well as any biologically active variants of human Rab9 in which functionally equivalent amino acid residues are substituted for residues within the sequence resulting in conservative amino acid substitutions. A “biologically active variant” of human Rab9 is a polypeptide derived from the human Rab9 polypeptide by deletion (so-called truncation) or addition of one or more amino acids to the N-terminal and/or C-terminal end of the native protein; deletion or addition of one or more amino acids at one or more sites in the protein; or substitution of one or more amino acids at one or more sites in the protein. Biologically active variant human Rab9 polypeptides encompassed by the present invention are biologically active, that is they are capable of having the GTPase activity of the Rab protein subfamily of the Ras-like small GTPase superfamily (Lombardi et al. (1993) EMBO J. 12: 677-82). Such biologically active variants may result from, for example, genetic polymorphism or from human manipulation. Biologically active variants of human Rab9 according to the invention will have at least about 50%, 60%, 65%, 70%, generally at least about 75%, 80%, 85%, preferably at least about 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, such as at least about 98%, 99% or more sequence identity to the amino acid sequence shown in SEQ ID NO:1. Thus, a biologically active variant of human Rab9 of the invention may differ from the amino acid sequences shown in SEQ ID NO:1 by as few as 1-15 amino acid residues, as few as 1-10 amino acid residues, such as 6-10 amino acid residues, as few as 5 amino acid residues, or as few as 4, 3, 2, or even 1 amino acid residue.
In order to retain biological activity, any substitutions will preferably be conservative in nature, and truncations and substitutions will generally be made in residues that are not required for GTPase activity. For example, one or more amino acid residues within the sequence can be substituted by another amino acid of a similar polarity, which acts as a functional equivalent, resulting in a silent alteration. Substitutes for an amino acid within the sequence may be selected from other members of the class to which the amino acid belongs. For example, nonpolar (hydrophobic) amino acids include alanine, leucine, isoleucine, valine, proline, phenylalanine, tryptophan and methionine. Amino acids containing aromatic ring structures are phenylalanine, tryptophan, and tyrosine. Polar neutral amino acids include glycine, serine, threonine, cysteine, tyrosine, asparagine, and glutamine. Positively charged (basic) amino acids include arginine, lysine, and histidine. Negatively charged (acidic) amino acids include aspartic acid and glutamic acid. Such alterations will not be expected to affect apparent molecular weight as determined by polyacrylamide gel electrophoresis, or isoelectric point.
The comparison of sequences and determination of percent identity and percent similarity between two sequences can be accomplished using a mathematical algorithm. In a preferred embodiment, the percent identity between two amino acid sequences is determined using the Needleman and Wunsch (1970) J. Mol. Biol. 48:444-453 algorithm, which is incorporated into the GAP program in the GCG software package (available at www.accelrys.com), using either a BLOSUM62 matrix or a PAM250 matrix, and a gap weight of 16, 14, 12, 10, 8, 6, or 4 and a length weight of 1, 2, 3, 4, 5, or 6. In yet another preferred embodiment, the percent identity between two nucleotide sequences is determined using the GAP program in the GCG software package, using a BLOSUM62 scoring matrix (see Henikoff et al. (1992) Proc. Natl. Acad. Sci. USA 89:10915) and a gap weight of 40, 50, 60, 70, or 80 and a length weight of 1, 2, 3, 4, 5, or 6. A particularly preferred set of parameters (and the one that should be used if the practitioner is uncertain about what parameters should be applied to determine if a molecule is within a sequence identity limitation of the invention) is a BLOSUM62 scoring matrix with a gap weight of 60 and a length weight of 3.
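The amino acid classes recited above lend themselves to a simple membership check. The following minimal sketch, using the three letter codes adopted herein, treats a substitution as conservative when the two residues share at least one of the recited classes; that decision rule and the names `CLASSES`, `shared_classes`, and `is_conservative` are illustrative assumptions. Note that some residues (for example tyrosine) appear in more than one class in the text.

```python
# Classes exactly as listed in the text, keyed by hypothetical class names.
CLASSES = {
    "nonpolar": {"Ala", "Leu", "Ile", "Val", "Pro", "Phe", "Trp", "Met"},
    "aromatic": {"Phe", "Trp", "Tyr"},
    "polar_neutral": {"Gly", "Ser", "Thr", "Cys", "Tyr", "Asn", "Gln"},
    "basic": {"Arg", "Lys", "His"},
    "acidic": {"Asp", "Glu"},
}


def shared_classes(res_a, res_b):
    """Return the set of class names that contain both residues."""
    return {
        name
        for name, members in CLASSES.items()
        if res_a in members and res_b in members
    }


def is_conservative(res_a, res_b):
    """True when two different residues share at least one recited class."""
    return res_a != res_b and bool(shared_classes(res_a, res_b))
```

Under this rule a leucine-to-isoleucine change is conservative, whereas an aspartic acid-to-lysine change is not.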
The percent identity between two amino acid or nucleotide sequences can also be determined using the algorithm of E. Meyers and W. Miller (1989) CABIOS 4:11-17 which has been incorporated into the ALIGN program (version 2.0), using a PAM120 weight residue table, a gap length penalty of 12 and a gap penalty of 4.
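To make the notion of percent identity from a global alignment concrete, a simplified sketch of the Needleman and Wunsch dynamic programming approach is given below. It is not the GAP or ALIGN program: it assumes a plain match/mismatch score and a single linear gap penalty in place of a BLOSUM62 or PAM matrix and separate gap and length weights, it operates on one letter strings for brevity, and it reports identity over the full alignment length, which is only one of several conventions. All names and default scores are hypothetical.

```python
def needleman_wunsch(seq_a, seq_b, match=1, mismatch=-1, gap=-2):
    """Globally align two strings; returns the two gapped strings."""
    n, m = len(seq_a), len(seq_b)
    # score[i][j] = best score aligning seq_a[:i] with seq_b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            pair = match if seq_a[i - 1] == seq_b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + pair,  # align the two residues
                score[i - 1][j] + gap,       # gap in seq_b
                score[i][j - 1] + gap,       # gap in seq_a
            )
    # Trace back from the bottom-right corner to recover one optimal alignment.
    aligned_a, aligned_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            pair = match if seq_a[i - 1] == seq_b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + pair:
                aligned_a.append(seq_a[i - 1])
                aligned_b.append(seq_b[j - 1])
                i -= 1
                j -= 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            aligned_a.append(seq_a[i - 1])
            aligned_b.append("-")
            i -= 1
        else:
            aligned_a.append("-")
            aligned_b.append(seq_b[j - 1])
            j -= 1
    return "".join(reversed(aligned_a)), "".join(reversed(aligned_b))


def percent_identity(seq_a, seq_b):
    """Identical aligned positions as a percentage of the alignment length."""
    aligned_a, aligned_b = needleman_wunsch(seq_a, seq_b)
    if not aligned_a:
        return 0.0
    identical = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    return 100.0 * identical / len(aligned_a)
```

A practitioner applying the sequence identity limitations recited herein would use the stated programs and parameters; the sketch only shows how an alignment yields an identity percentage.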
In another embodiment, a crystal of a Rab9-GDP complex is provided wherein the crystal effectively diffracts X-rays for the determination of the atomic coordinates of the Rab9-GDP complex to a resolution of less than about 1.75 Å, preferably between about 1.25 Å and about 1.75 Å, and most preferably about 1.25 Å. In another embodiment, a crystal of a C-terminally truncated human Rab9 (residues 1-177) is provided having a space group of P1 and a unit cell of dimensions a=38.40 Å, b=45.62 Å, c=51.22 Å, α=99.8°, β=107.2°, and γ=101.8°. In another embodiment, a crystal of a C-terminally truncated human Rab9 (residues 1-177) is provided where the Rab9 has secondary structural elements that include six β-sheets and five α-helices, where the β-sheets are designated B1-B6 and where the α-helices are designated H1-H5, and where the β-sheets and the α-helices connect with a B1-H1-B2-B3-H2-B4-H3-B5-H4-B6-H5 topology.
In another embodiment, a crystal of a Rab9-GppNHp complex is provided wherein the crystal effectively diffracts X-rays for the determination of the atomic coordinates of the Rab9-GppNHp complex to a resolution of less than about 20 Å, preferably less than about 2 Å, more preferably between about 1.25 Å and about 2 Å, and most preferably about 1.73 Å. In another embodiment, a crystal of a Rab9-GppNHp complex is provided having a space group of P212121 and a unit cell of dimensions a=56.24 Å, b=76.60 Å, and c=174.35 Å. In another embodiment, a Rab9-GppNHp complex is provided wherein the Rab9 has secondary structural elements that include six β-sheets and five α-helices, wherein the β-sheets are designated B1-B6 and wherein the α-helices are designated H1-H5, and wherein the β-sheets and the α-helices connect with a B1-H1-B2-B3-H2-B4-H3-B5-H4-B6-H5 topology.
Structure based drug design involves the rational design of ligand molecules to interact with the three-dimensional (“3-D”) structure of target receptors; the ultimate goal being to identify or design molecules with 3-D complementarity to the target protein (Kirkpatrick et al. (1999) Comb. Chem. High Throughput Screen. 2: 211-21). The accuracy required of a protein structure depends on the question addressed by the design process, with some processes predicated on the assumption that a lead molecule will need to complement a known binding site for a ligand precisely, or match the presumed transition state structure of a reaction closely (Whittle and Blundell (1994) Annu. Rev. Biophys. Biomol. Struct. 23: 349-75). Such cases call for an accurate model at the highest resolution possible. Alternatively, the design process may exploit the structure to indicate the general availability of space to fill, hydrogen bonds to make, or electrostatic interactions to optimize, in which case knowledge of the general topography of the binding site is often useful (Whittle and Blundell (1994) Annu. Rev. Biophys. Biomol. Struct. 23: 349-75).
Factors that affect the accuracy of structure-based drug design include aspects of the determination of the three-dimensional structure of proteins such as refinement, resolution, the number of restraints introduced in the structure analysis, statistical indicators of agreement between the model and the experimental data, and the conformity of the model to stereochemistry found in proteins in general (Whittle and Blundell (1994) Annu. Rev. Biophys. Biomol. Struct. 23: 349-75). Most statistical parameters can be optimized, at least within the constraints of the data. However, if the data is of poor quality or the conformations are incorrect (particularly for the sidechains and loops), then it is difficult to optimize all of the parameters at the same time (Whittle and Blundell (1994) Annu. Rev. Biophys. Biomol. Struct. 23: 349-75). Computer programs are available to introduce a check on such parameters, including PROCHECK™, which analyzes the distribution of a range of conformational parameters and compares them with expected distributions (Laskowski et al. (1993) J. Appl. Crystallogr. 26: 283). Sequence-dependent indications of the probability that the structure is correct can be derived through a comparison of the local environment in the proposed structure to the propensity of an amino acid (Luthy et al. (1991) Proteins Struct. Funct. Genet. 10: 229; Novotny et al. (1988) Proteins Struct. Funct. Genet. 4: 19), the knowledge-based potential (Hendlich et al. (1990) J. Mol. Biol. 216: 167), or the probability of amino acid substitution (Overington et al. (1990) Proc. R. Soc. London Ser. B 241: 132; Topham et al. (1991) Biochem. Soc. Symp. 57: 1) in the proposed structure.
Protein structures cannot generally be predicted by simulation of the folding pathway due to the fact that the forces between the atoms of the protein, and particularly with the surrounding solvent and counter-ions, are not very well described (Whittle and Blundell (1994) Annu. Rev. Biophys. Biomol. Struct. 23: 349-75). However, proteins belong to families with a common fold, including more than 1500 groups of homologous proteins that can be recognized by sequence searches alone, and over 500 that have common topologies or folds (Whittle and Blundell (1994) Annu. Rev. Biophys. Biomol. Struct. 23: 349-75).
Profiles or templates are useful in the search for the common fold and alignment of sequences for proteins with sequence identities of <30% (Whittle and Blundell (1994) Annu. Rev. Biophys. Biomol. Struct. 23: 349-75). Structural information can be used to identify key features in protein architecture and then to associate these with invariant or conserved sequences (Bedarkar et al. (1977) Nature, 270: 449; Eigenbrot et al. (1991) J. Mol. Biol. 221; 15). Projection of the restraints of the three-dimensional fold onto the one dimension of the sequence and comparison to sequence templates or profiles provides a more systematic approach (Sali et al. (1990) J. Mol. Biol. 212: 403). The template search can also be approached by determining the propensity of an amino acid to occur in each class of local structural environment defined by solvent accessibility and secondary structure, or by calculation of amino acid substitution tables as a function of local environment (Bowie et al. (1991) Science 253: 164; Johnson et al. (1993) J. Mol. Biol. 231: 735; Luthy et al. (1991) Proteins Struct. Funct. Genet. 10: 229; Overington et al. (1990) Proc. R. Soc. London Ser. B 241: 132).
The three-dimensional structure of a protein can also be predicted by using information derived from the identification of a new sequence with a known fold (Summers et al. (1987) J. Mol. Biol. 196: 175; Sutcliffe et al. (1987) Protein Eng. 1: 385). Some methods depend on the assembly of rigid fragments to select sets of fragments that define the framework: the structurally variable (mainly loop) regions and the sidechains (Blundell et al. (1988) Eur. J. Biochem. 172: 513; Blundell et al. (1987) Nature 326: 347; Claessens et al. (1989) Protein Eng. 2: 335; Jones et al. EMBO J. 5: 819; Topham et al. (1993) J. Mol. Biol. 229: 194). Such modeling procedures are very successful when the percentage sequence identity to the unknown is high (greater than 40%) and when the known structures cluster around that to be predicted (Srinivasan & Blundell (1993) Protein Eng. 6: 501).
Where a common fold is not known, combinatorial approaches that depend upon the identification of secondary-structure elements using conformational propensities and residue patterns can be valuable (Presnell et al. (1992) Biochemistry 31: 983). The elements of secondary structure are then assembled by docking and/or by using rules concerning supersecondary structures (Whittle and Blundell (1994) Annu. Rev. Biophys. Biomol. Struct. 23: 349-75).
The present invention provides a method of using crystals of a human Rab9 as described herein (for example, a crystal of a Rab9-GDP complex or a Rab9-GppNHp complex, or both) to screen for an agent that modulates Rab9 activity. The screening method of the present invention may comprise use of a candidate agent selected from a database, designed de novo, or designed from a known modulator of Rab9 activity. Suitable candidate agents of the present invention include peptides or other organic molecules, and inorganic molecules. Suitable organic molecules include small organic molecules. Preferably, a therapeutic compound of the present invention is not harmful (e.g., toxic) to an animal when such compound is administered to an animal. Peptides refer to a class of compounds that is small in molecular weight and yields two or more amino acids upon hydrolysis. A polypeptide is comprised of two or more peptides. As used herein, a protein is comprised of one or more polypeptides. Preferred therapeutic compounds to design include peptides composed of “L” and/or “D” amino acids that are configured as normal or retroinverso peptides, peptidomimetic compounds, small organic molecules, or homo- or hetero-polymers thereof, in linear or branched configurations.
In one embodiment, the screening method of the present invention comprises the steps of: a) selecting a candidate agent by performing structure based drug design with the three-dimensional structure determined for crystals of a human Rab9 as described herein (for example, a crystal of a Rab9-GDP complex or a Rab9-GppNHp complex, or both), where selection is performed in conjunction with computer modeling; b) contacting the candidate agent with Rab9; and c) detecting the ability of the candidate agent to modulate Rab9. The step of employing the three-dimensional structure to design or select the candidate agent may further comprise the steps of: a) screening for chemical entities or fragments capable of associating with Rab9; and b) assembling the chemical entities or fragments into a single molecule to provide the structure of the candidate agent.
In another embodiment, the screening method of the present invention comprises the steps of: a) providing a model of Rab9, said model including at least one of the binding sites defined by at least one of Table 2,
Rab9 contains seven hypervariable regions that are significantly different in conformation from other Rab proteins. These seven hypervariable regions in the Rab9 structure are thought to be involved in the binding of associated proteins to Rab9. Without being bound by theory, because Rab9-associated proteins may be involved in the normal physiological activity of Rab9, these seven hypervariable regions in Rab9 provide an excellent target for structure-based drug design and development.
In another embodiment, the screening method of the present invention provides a model of Rab9 including binding sites selected from the group consisting of Region I, Region II, Region III, Region IV, Region V, Region VI, and Region VII of the C-terminally truncated human Rab9 defined by the amino acid sequence of SEQ ID NO:1, wherein Region I consists of amino acids 1 to 7 of SEQ ID NO:1, Region II consists of amino acids 33 to 43 of SEQ ID NO:1, Region III consists of amino acids 50 to 54 of SEQ ID NO:1, Region IV consists of amino acids 63 to 79 of SEQ ID NO:1, Region V consists of amino acids 108 to 117 of SEQ ID NO:1, Region VI consists of amino acids 128 to 130 of SEQ ID NO:1, and Region VII consists of amino acids 168 to 177 of SEQ ID NO:1. In yet another embodiment, the screening method screens for an agent that modulates Rab9 activity by inhibiting the binding of a Rab9-associated protein with Rab9, including but not limited to wherein the Rab9-associated protein is TIP47 or P40.
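The region definitions above lend themselves to a simple lookup table. The following is a minimal sketch, assuming the residue ranges are inclusive and numbered according to SEQ ID NO:1 as stated in the text; the names `RAB9_REGIONS`, `region_of`, and `residues_in_regions` are hypothetical and introduced only for illustration.

```python
# Residue ranges for the seven hypervariable regions as listed in the text
# (inclusive, numbered according to SEQ ID NO:1). The dictionary and the
# helper names below are illustrative assumptions, not part of the source.
RAB9_REGIONS = {
    "Region I": (1, 7),
    "Region II": (33, 43),
    "Region III": (50, 54),
    "Region IV": (63, 79),
    "Region V": (108, 117),
    "Region VI": (128, 130),
    "Region VII": (168, 177),
}


def region_of(residue_number):
    """Return the name of the region containing the residue, or None."""
    for name, (start, end) in RAB9_REGIONS.items():
        if start <= residue_number <= end:
            return name
    return None


def residues_in_regions(selected):
    """Return the sorted residue numbers covered by the selected regions."""
    residues = set()
    for name in selected:
        start, end = RAB9_REGIONS[name]
        residues.update(range(start, end + 1))
    return sorted(residues)
```

A binding-site model restricted to, say, Regions I and VI could then be built by filtering coordinate records to the residue numbers returned by `residues_in_regions(["Region I", "Region VI"])`.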
In another embodiment, the screening method of the present invention screens for an agent that modulates Rab9 activity, and comprises the steps of: a) providing a three-dimensional structure of Rab9 as defined by the atomic coordinate data of Table 4 or Table 6, wherein said Rab9 comprises amino acids 1-177 of SEQ ID NO:1; b) employing the three-dimensional structure to design or select a candidate agent; c) synthesizing the candidate agent; and d) contacting the candidate agent with the Rab9 in the presence of a substrate to test the ability of the candidate agent to modulate the activity of the Rab9. The step of employing the three-dimensional structure to design or select the candidate agent may further comprise the steps of: a) screening for chemical entities or fragments capable of associating with Rab9; and b) assembling the chemical entities or fragments into a single molecule to provide the structure of the candidate agent.
In another embodiment, the screening method of the present invention screens for an agent that modulates Rab9 activity, and comprises the steps of: a) selecting or designing a candidate agent by performing structure based drug design with a computer system encoded with computer readable data comprising atomic coordinate data or binding site data or both, wherein said atomic coordinate data is defined by Table 4 or Table 6, and wherein said binding site data is defined by at least one of Table 2,
In one embodiment of the present invention, the screening method comprises a candidate agent that is an inhibitor of Rab9, including competitive inhibitors and non-competitive or uncompetitive inhibitors of Rab9. A potential inhibitor is selected by performing rational drug design with the three-dimensional structure (or structures) determined for a crystal as described herein, especially in conjunction with computer modeling and methods as described herein. The potential inhibitor is obtained from commercial sources or is synthesized from readily available starting materials using standard synthetic techniques and methodologies known to those of ordinary skill in the art. The potential inhibitor is then assayed to determine its ability to inhibit Rab9 and/or a Rab9-associated intracellular process or pathway. The assay may be in vitro or in vivo. Inhibition can be measured by various methods, including, for example, those methods illustrated in the examples below. Assays include any assay wherein a nucleoside or nucleotide is a cofactor or substrate of the peptide of interest, and particularly any assay involving phosphotransfer in which the substrates and/or cofactors are GDP, GTP, and/or GppNHp. The assay may be an enzyme inhibition assay, utilizing a full length or truncated GTPase. The enzyme is contacted with the potential inhibitor and a measurement of the binding affinity of the potential inhibitor against a standard is determined. Such assays are known to one of ordinary skill in the art. The assay may also be a cell-based assay in which the potential inhibitor is contacted with a cell and a measurement of inhibition of a standard marker produced in the cell is determined. Cells may be either isolated from an animal, including a transformed cultured cell, or may be in a living animal. Such assays are also known to one of ordinary skill in the art.
A variety of methods are available to one skilled in the art for evaluating and virtually screening candidate agents appropriate for associating with Rab9. Such association may be in a variety of forms including, for example, steric interactions, van der Waals interactions, electrostatic interactions, solvation interactions, charge interactions, covalent bonding interactions, non-covalent bonding interactions (e.g., hydrogen-bonding interactions), entropically or enthalpically favorable interactions, and the like.
Once the three-dimensional structure of a target protein has been defined, computational procedures may be used to suggest ligands that will bind at the active site. Interactive graphics approaches explore new ligand designs manually in ways that might involve, for example, modification of groups on the ligand to optimize complementarity with receptor/enzyme subsites, optimization of a transition state isostere to reflect data from mechanistic studies, replacement of peptide bonds with groups that improve hydrolytic stability while maintaining key hydrogen bond interactions, or linking of adjacent side groups to increase the rigidity of the ligand (Whittle and Blundell (1994) Annu. Rev. Biophys. Biomol. Struct. 23: 349-75). Most of these steps can now be done using systematic computational approaches that fall into three classes: 1) automated docking of whole molecules into receptor sites; 2) precalculating potentials at grid points and fitting molecules to these potentials; and 3) docking fragments and either joining them or growing them into real molecules (Whittle and Blundell (1994) Annu. Rev. Biophys. Biomol. Struct. 23: 349-75).
Attempts at automated docking through the evaluation of electrostatic, steric, or more complex energy terms during a systematic search of rotational and translational space for the two molecules have produced some successes, but the simplification of energy functions required to achieve reasonable computational times has proved limiting (Kuntz et al. (1982) J. Mol. Biol. 161: 269; Wodak (1978) J. Mol. Biol. 124: 323; Whittle and Blundell (1994) Annu. Rev. Biophys. Biomol. Struct. 23: 349-75). Interactive or manual docking involving the positioning of molecules with constant feedback of the energy has been used as an alternative, but the many degrees of freedom and modes of interaction have imposed their own limitations on the utility of this approach (Busetta et al. (1983) J. Appl. Crystallogr. 16: 432; Pattabiraman et al. (1985) J. Comput. Chem. 6: 432; Tomioka et al. (1987) J. Comput. Aided Mol. Des. 1: 197).
Precalculating terms for each point on a grid can be used to identify hydrogen-bonding sites within enzyme active sites and also significantly reduces computational time (Goodford (1985) J. Med. Chem. 28: 849). A similar approach involves the use of pseudoenergies calculated from pairwise distributions of atoms in protein complexes or crystals of small molecules, with probe molecules then fitted to these potentials and ranked according to energy (Whittle and Blundell (1994) Annu. Rev. Biophys. Biomol. Struct. 23: 349-75). For example, software such as DOCK (available from University of California, San Francisco) creates a negative image of the target site by placing a set of overlapping spheres so that they fill the complex invaginations of the proposed binding site, and the putative ligands are then placed into the site by matching X-ray or computer derived structures on the basis of a comparison of internal distances (Whittle and Blundell (1994) Annu. Rev. Biophys. Biomol. Struct. 23: 349-75). The candidates are then ranked on the basis of their best orientations. Other methods include a directed version of DOCK that allows for hydrogen-bond information to be used and conformational flexibility to be allowed, and a method that uses least squares fitting to maximize overlap of enzymes and putative ligands (Leach & Kuntz (1992) J. Comput. Chem. 13: 730; Bacon & Moult (1992) J. Mol. Biol. 225: 849). Still further methods involve the use of genetic algorithms and graph theory to generate molecular structures within constraints of an enzyme active site or a receptor binding site (Payne & Glen (1993) J. Mol. Graph. 11: 76; Lewis (1993) J. Mol. Graph. 10: 131). For all of these methods to be useful in drug discovery, however, they must depend upon the existence of large databases of small molecule structures, such as the Cambridge Structural Database and the Fine Chemicals Directory (Allen et al. (1979) Acta Cryst. B 35: 2331; Rusinko et al. (1989) J. Chem. Inf. Comput. Sci. 29: 251).
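The idea of precalculating an interaction term at each grid point can be illustrated with a short sketch. This is not the GRID program or its force field: it assumes a single spherical probe, a plain 12-6 Lennard-Jones term with placeholder parameters (`epsilon`, `sigma`), and hypothetical helper names (`build_grid`, `best_points`). It only shows why the lookup is cheap once the grid has been filled.

```python
import math


def lennard_jones(r, epsilon=0.2, sigma=3.4):
    """12-6 Lennard-Jones energy; epsilon and sigma are placeholder values."""
    ratio = (sigma / r) ** 6
    return 4.0 * epsilon * (ratio * ratio - ratio)


def build_grid(site_atoms, origin, spacing, shape, min_dist=0.5):
    """Precompute the probe energy at every grid point.

    site_atoms: list of (x, y, z) coordinates of binding-site atoms.
    Returns a dict mapping (i, j, k) grid indices to the summed energy.
    """
    grid = {}
    for i in range(shape[0]):
        for j in range(shape[1]):
            for k in range(shape[2]):
                point = (origin[0] + i * spacing,
                         origin[1] + j * spacing,
                         origin[2] + k * spacing)
                energy = 0.0
                for atom in site_atoms:
                    # Clamp very short distances to avoid division by zero.
                    r = max(math.dist(point, atom), min_dist)
                    energy += lennard_jones(r)
                grid[(i, j, k)] = energy
    return grid


def best_points(grid, n=3):
    """Return the n grid indices with the lowest (most favorable) energy."""
    return sorted(grid, key=grid.get)[:n]
```

Once the grid is built, scoring a candidate placement reduces to looking up or interpolating stored values rather than re-summing over all site atoms, which is the source of the time savings described above.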
Methods involving fragment docking and then developing algorithms to grow them into larger structures to fill the space available depend upon the exploration of electrostatic, van der Waals, or hydrogen bonding interactions involved in molecular recognition (Whittle and Blundell (1994) Annu. Rev. Biophys. Biomol. Struct. 23: 349-75). Many of these methods incorporate the GRID algorithm as a starting point, and then use GenStar and/or GroupBuild to generate chemically reasonable structures to fill the active sites of enzymes (Rotstein and Murcko (1993) J. Comput. Aided Mol. Des. 7: 23; Rotstein and Murcko (1993) J. Med. Chem. 36: 1700). Alternatively, the program can start with a docked core or the structure of a fragment from an inhibitor complex and for each atom generated, several hundred candidate positions, representing different bond lengths and torsion angles, are scored on the basis of contacts with the enzyme (Whittle and Blundell (1994) Annu. Rev. Biophys. Biomol. Struct. 23: 349-75).
Numerous computer programs are available and suitable for rational drug design and the processes of computer modeling, model building, and computationally identifying, selecting and evaluating potential inhibitors in the methods described herein. These include, for example, SYBYL (available from TRIPOS, St. Louis Mo.), DOCK (available from University of California, San Francisco), GRID (available from Oxford University, UK), MCSS (available from Molecular Simulations Inc., Burlington, Mass.), AUTODOCK (available from Oxford Molecular Group), FLEX X (available from TRIPOS, St. Louis Mo.), CAVEAT (available from University of California, Berkeley), HOOK (available from Molecular Simulations Inc., Burlington, Mass.), and 3-D database systems such as MACCS-3D (available from MDL Information Systems, San Leandro, Calif.), UNITY (available from TRIPOS, St. Louis Mo.), and CATALYST (available from Molecular Simulations Inc., Burlington, Mass.). Potential inhibitors may also be computationally designed de novo using such software packages as LUDI (available from Biosym TechMA), and LEAPFROG (TRIPOS Associates, St. Louis, Mo.). Compound deformation energy and electrostatic repulsion may be evaluated using programs such as GAUSSIAN 92, AMBER, QUANTA/CHARMM, and INSIGHT II/DISCOVER. These computer evaluation and modeling techniques may be performed on any suitable hardware including, for example, workstations available from Silicon Graphics, Sun Microsystems, and the like. These techniques, methods, hardware and software packages are representative and are not intended to be a comprehensive listing. Other modeling techniques known in the art may also be employed in accordance with this invention. See, for example, N. C. Cohen, Molecular Modeling in Drug Design, Academic Press (1996); Whittle and Blundell (1994) Annu. Rev. Biophys. Biomol. Struct. 23: 349-75; Grootenhuis et al. (1992) Bull. Soc. Chim. Belg 101: 661; Lawrence and Davis (1992) Proteins Struct. Funct. Genet. 12: 31; Miranker and Karplus (1991) Proteins Struct. Funct. Genet. 11: 29. Other methods and programs include CLIX (a suite of computer programs that searches the Cambridge Structural Database for small molecules that have both geometrical and chemical complementarity to a defined binding site on a protein of known three-dimensional structure), and software identified at internet sites including the CAOS/CAMM Center Cheminformatics Suite at http://www.caos.kun.nl/, and the NIH Molecular Modeling Home Page at http://www.fi.muni.cz/usr/mejzlik/mirrors/molbio.info.nih.gov/modeling/software list/.
In one embodiment of the present invention a computer system is provided, wherein the computer system is encoded with computer readable data comprising atomic coordinate data or binding site data or both, wherein said atomic coordinate data is defined by Table 4 or Table 6, and wherein said binding site data is defined by at least one of Table 2,
In another embodiment, a computer-readable medium is provided, where the computer-readable medium is encoded with atomic coordinate data or binding site data or both, wherein said atomic coordinate data is defined by Table 4 or Table 6, and wherein said binding site data is defined by at least one of Table 2,
For this experiment, the crystal structure of a C-terminally truncated human Rab9 (residues 1-177) in complex with GDP (Rab9-GDP) was determined.
Cloning and Expression—The gene for human Rab9 (GenBank™ accession number NM_004251) was obtained from the IMAGE clone collection (IMAGE ID number 4139714) through distribution by Open Biosystems. A C-terminally truncated fragment coding for residues 1-177 (20.1 kDa) was PCR subcloned using primers 5′-G ACA GCT AGC ATG GCA GGC AAA TCA TCA CTT TTT AAA G-3′ and 5′-C ATG GAT CCT TCA GTC CTC GGT AGC AAG AAC TCT TC-3′ into the NheI/BamHI restriction sites of pET28b (Novagen); the resulting construct encodes a Rab9-(1-177) protein product with an N-terminal His6-containing fusion (MGSSHHHHHHSSGLVPRGSHMAS). The pET28-Rab9-(1-177) vector was transformed into Escherichia coli BL21 (DE3) (Novagen), and overproduction of the fusion protein was induced at an A600 nm of ~2.0 with 1 mM isopropyl-1-thio-β-D-galactopyranoside at 310 K for 2 h; the cells were harvested by centrifugation and frozen at 253 K.
Protein Purification—The cell pellet was re-suspended in nickel buffer A (20 mM Tris, pH 8.0, 500 mM NaCl, 5 mM imidazole), lysed by sonication, and centrifuged at 20,000×g for 20 min at 277 K. The soluble fraction was filtered through a 0.45-micron filter and applied to chelating Sepharose (Amersham Biosciences), which had been previously charged with 50 mM NiSO4 and equilibrated with nickel buffer A. The column was then washed with nickel wash buffer (20 mM Tris-HCl, pH 8.0, 500 mM NaCl, 55 mM imidazole), and the His6-Rab9-(1-177) fusion protein was eluted with nickel elution buffer (20 mM Tris-HCl, pH 8.0, 500 mM NaCl, 350 mM imidazole). The His6-Rab9-(1-177) fusion protein was then dialyzed against Thr buffer (20 mM Tris-HCl, pH 8.4, 150 mM NaCl, 2.5 mM CaCl2) at 277 K, and precipitate was removed by centrifugation at 20,000×g for 20 min at 277 K. To the soluble fraction, 1 unit of thrombin protease (Novagen) was added per milligram of fusion protein, and the His tag was removed by digestion for 4 h at 298 K (thrombin cleavage results in a Rab9a-(1-177) protein with an N-terminal GSHMAS extension). The thrombin cleavage reaction was diluted (1:3, v/v) with 20 mM 4-morpholineethanesulfonic acid (MES), pH 6.5, and applied to Q-Sepharose (Amersham Biosciences), which had been previously equilibrated with Q buffer A (20 mM MES, pH 6.5, 50 mM NaCl). Native Rab9-(1-177) was eluted from Q-Sepharose with a 50-750 mM NaCl linear gradient in MES, pH 6.5; fractions containing native Rab9-(1-177) were identified by denaturing gel electrophoresis and pooled. The pooled Q fractions were then further purified by gel filtration on Sephacryl S-200 (Amersham Biosciences) in MES, pH 6.5, 150 mM NaCl; fractions containing Rab9-(1-177) were pooled and concentrated by ultrafiltration.
Crystallization and Data Collection—The stock protein solution used for crystallization contained 20 mM MES buffer, pH 6.5, and 150 mM sodium chloride with a protein concentration of 10 mg/ml. Crystals were grown at 277 K by the hanging-drop vapor diffusion method with 100 mM sodium acetate buffer, pH 5.0, 5% (v/v) polyethylene glycol 4000 as crystallization solution. Crystals formed in space group P1 with a=38.40 Å, b=45.62 Å, c=51.22 Å, α=99.8°, β=107.2°, and γ=101.8° and contained two monomers in the unit cell. X-ray diffraction data to 1.25-Å resolution were collected at beamline 22-ID in the facilities of the South East Regional Collaborative Access Team at the Advanced Photon Source, Argonne National Laboratory. The statistics for data collection and processing are summarized in Table 1.
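As a worked check on the cell parameters above, the triclinic unit cell volume and a Matthews coefficient can be estimated from the reported values. The sketch below is illustrative only and was not part of the reported analysis: it assumes the standard triclinic volume formula, two monomers per P1 cell as stated, and the 20.1 kDa mass given for residues 1-177 (ignoring the N-terminal extension left by thrombin cleavage). The function names are hypothetical.

```python
import math


def unit_cell_volume(a, b, c, alpha, beta, gamma):
    """Volume of a general (triclinic) unit cell; lengths in Å, angles in degrees."""
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    return a * b * c * math.sqrt(1.0 - ca * ca - cb * cb - cg * cg
                                 + 2.0 * ca * cb * cg)


def matthews_coefficient(volume, molecules_per_cell, mass_da):
    """Crystal volume per unit of protein mass, in Å^3 per dalton."""
    return volume / (molecules_per_cell * mass_da)


# Cell parameters as reported for the Rab9-GDP crystal (space group P1).
volume = unit_cell_volume(38.40, 45.62, 51.22, 99.8, 107.2, 101.8)
# Two monomers per cell; 20.1 kDa per monomer is an approximation.
vm = matthews_coefficient(volume, 2, 20100.0)
print(f"cell volume = {volume:.0f} A^3, Matthews coefficient = {vm:.2f} A^3/Da")
```

The same two functions apply to an orthorhombic cell by passing 90° for all three angles and the appropriate number of molecules per cell.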
Structure Determination and Refinement—The orientation and position of the Rab9 dimer in the P1 unit cell were determined using the molecular replacement protocols in Crystallography & NMR System (CNS) (Brunger, A. T. et al. (1998) Acta Crystallogr. Sect. D Biol. Crystallogr. 54, 905-921) starting from the structure of Rab11a (PDB code 1OIV (Pasqualato, S. et al. (2004) J. Biol. Chem. 279, 11480-11488)) as the search model. The composite omit map was calculated to guide electron density fitting of the model. Energy-restrained crystallographic refinement was carried out with maximum likelihood algorithms implemented in CNS (Brunger, A. T. et al. (1998) Acta Crystallogr. Sect. D Biol. Crystallogr. 54, 905-921). Refinement proceeded through several cycles in combination with manual checking by the program O (Jones, T. A. et al. (1991) Acta Crystallogr. Sect. A 47, 110-119). The addition of two GDPs and 473 water molecules and refinement up to 1.25 Å resulted in R and Rfree values of 0.213 and 0.232, respectively. Further refinement was continued with SHELX-97 (Sheldrick, G. M., and Schneider, T. R. (1997) Methods Enzymol. 277, 319-343) by subjecting the structure to cycles of isotropic conjugate gradient least squares refinement; then tightly restrained anisotropic displacement parameters were introduced and refined. The final refinement cycle resulted in R/Rfree values of 0.139/0.196. The final model contains residues 2-34 and 39-175 of monomer A, residues 5-34, 39-110, and 115-175 of monomer B, and 2 GDP molecules plus 508 water molecules. The phasing and refinement statistics are summarized in Table 1.
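For readers unfamiliar with the R and Rfree values quoted above, the conventional crystallographic residual is the sum of the absolute differences between observed and calculated structure factor amplitudes divided by the sum of the observed amplitudes, with Rfree computed over a set of reflections held out of refinement. The sketch below illustrates that definition only; it is not the CNS or SHELX implementation, it omits scaling between observed and calculated amplitudes, and the function names are hypothetical.

```python
def r_factor(f_obs, f_calc):
    """Conventional residual: sum(| |Fobs| - |Fcalc| |) / sum(|Fobs|)."""
    if len(f_obs) != len(f_calc) or not f_obs:
        raise ValueError("amplitude lists must be non-empty and equal in length")
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    denominator = sum(abs(o) for o in f_obs)
    return numerator / denominator


def work_and_free_r(f_obs, f_calc, free_flags):
    """Return (R, Rfree), splitting reflections by a boolean free-set flag."""
    work_obs, work_calc, free_obs, free_calc = [], [], [], []
    for o, c, is_free in zip(f_obs, f_calc, free_flags):
        if is_free:
            free_obs.append(o)
            free_calc.append(c)
        else:
            work_obs.append(o)
            work_calc.append(c)
    return r_factor(work_obs, work_calc), r_factor(free_obs, free_calc)
```

Because the free reflections are never used to adjust the model, an Rfree that stays close to R is generally taken as a sign that the model has not been overfitted.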
Protein Fold Analysis—Secondary structure elements were defined by the hydrogen-bonding patterns in combination with visual inspection. The Dali algorithm of comparing protein domain structures by alignment of distance matrices was used to search for structural homologues of Rab9 and also used for structure-based sequence alignment (Holm, L., and Sander, C. (1993) J. Mol. Biol. 233, 123-138; Holm, L., and Sander, C. (1998) Proteins 33, 88-96). Ribbon diagrams were prepared by the program MOLSCRIPT (Kraulis, P. J. (1991) J. Appl. Crystallogr. 24, 946-950).
Structure Determination—The human Rab9 variant used for crystal structure determination included residues 1-177, lacking its last 24 residues (
Overall Structure of Rab9 in the Rab9-GDP Complex—Like other members of the Rab GTPase family, Rab9 adopts a classical nucleotide binding fold consisting of a six-stranded β-sheet surrounded by five α-helices (
Active Site Structure—The crystal structure reported here contains a tightly bound GDP molecule in the active site (
Structure Comparison—The overall structure of Rab9 is very similar to the prototype Ras protein p21Ras (Scheidig, A. J. et al. (1999) Structure Fold Des. 7, 1311-1324) and several Rab proteins (Table 3). Among those, Rab9 has the highest sequence identity with Ypt7p (54% over 153 equivalent positions), followed by Rab11a (43% over 161 equivalent positions). The structural similarity Z-scores (Holm, L., and Sander, C. (1993) J. Mol. Biol. 233, 123-138; Holm, L., and Sander, C. (1998) Proteins 33, 88-96) range from 26.6 to 23.2 with r.m.s. deviations of equivalent positions in the range of 1.4-2.0 Å. Structure-based sequence alignment reveals that the active site of Rab9 consists of residues highly conserved in the Rab GTPase family (
Atomic Coordinates and Structure Factors—The atomic coordinates and structure factors (code 1WMS) have been deposited in the Protein Data Bank (PDB), Research Collaboratory for Structural Bioinformatics, Rutgers University, New Brunswick, N.J. (http://www.rcsb.org/). Portions of the information filed therein have been reproduced here in Table 4. All abbreviations, terms, and formats of the listed values follow the PDB Format Guide (Version 2.3, 1998).
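Because the coordinate tables follow the fixed-column PDB format, individual ATOM records can be read by column position. The following is a minimal sketch, assuming the standard ATOM/HETATM column layout of the PDB format (serial in columns 7-11, atom name in 13-16, residue name in 18-20, chain in 22, residue number in 23-26, x/y/z in 31-54, occupancy in 55-60, B-factor in 61-66). The helper names and the example atom values are made up for illustration and are not taken from Table 4.

```python
def format_atom_line(serial, name, res_name, chain, res_seq, x, y, z,
                     occupancy=1.0, b_factor=0.0):
    """Build a fixed-width ATOM record covering columns 1-66 only."""
    return ("ATOM  {:>5d} {:<4s} {:>3s} {:1s}{:>4d}    "
            "{:8.3f}{:8.3f}{:8.3f}{:6.2f}{:6.2f}").format(
        serial, name, res_name, chain, res_seq, x, y, z, occupancy, b_factor)


def parse_atom_line(line):
    """Parse one ATOM/HETATM record by column position; return None otherwise."""
    if not line.startswith(("ATOM", "HETATM")):
        return None
    return {
        "serial": int(line[6:11]),
        "name": line[12:16].strip(),
        "res_name": line[17:20].strip(),
        "chain": line[21],
        "res_seq": int(line[22:26]),
        "x": float(line[30:38]),
        "y": float(line[38:46]),
        "z": float(line[46:54]),
        "occupancy": float(line[54:60]),
        "b_factor": float(line[60:66]),
    }


# Made-up example atom, used only to exercise the round trip.
example = format_atom_line(1, " CA ", "ALA", "A", 2, 1.234, -5.678, 9.0,
                           occupancy=1.0, b_factor=12.5)
atom = parse_atom_line(example)
```

Slicing by column rather than splitting on whitespace matters because adjacent fields in real coordinate files can run together without a separating space.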
For this experiment, the crystal structure of a C-terminally truncated human Rab9 (residues 1-177) in complex with the GTP analog Guanosine 5′-(β,γ-imido) triphosphate (GppNHp) was determined.
Complex Preparation—After the truncated Rab9 (1-177) was expressed in BL21(DE3) (Chen et al. (2004) J. Biol. Chem. 279: 40204-40208) and purified by His-affinity chromatography (HiTrap Chelating HP, Amersham Biosciences), the protein was cleaved from His-tags by thrombin protease (Novagen). The protein was then further purified with anion-exchange chromatography (HiTrap Q HP, Amersham Biosciences) at pH 6.5. The complex of truncated Rab9 (1-177) and GppNHp trisodium salt (Sigma-Aldrich) was prepared by adding 3× molar excess of GppNHp, 10 units alkaline phosphatase (Sigma-Aldrich) per milligram of truncated Rab9 (1-177), 5 mM MgCl2, 1 mM DTT, 100 mM NaCl in 20 mM MES pH 6.5, and incubated overnight (~12 hours) at room temperature. The complex solution of truncated Rab9 (1-177) and GppNHp was further purified with preparatory gel filtration chromatography (Sephacryl S-200, Amersham Biosciences).
Crystallization and Data Collection—The complex solution used for crystallization contained 20 mM MES buffer, pH 6.5 and 150 mM sodium chloride with a protein concentration of 15 mg/ml. Crystals were grown at 277 K by the sitting-drop vapor diffusion method with 100 mM sodium acetate buffer, pH 5.0, 5-8% (v/v) polyethylene glycol (PEG) 4000 as crystallization solution. Crystals formed in space group P212121 with a=56.24 Å, b=76.60 Å, c=174.35 Å and contained four monomers in the asymmetric unit. X-ray diffraction data to 1.73 Å resolution were collected at beamline 22-ID in the facilities of the South East Regional Collaborative Access Team (SER-CAT) at the Advanced Photon Source, Argonne National Laboratory, USA. The statistics for data collection and processing are summarized in Table 5.
Structure Determination and Refinement—The orientation and position of the four Rab9 monomers in the asymmetric unit were determined using the molecular replacement protocols in the program CNS (Brunger, A. T. et al. (1998) Acta Crystallogr. Sect. D Biol. Crystallogr. 54, 905-921) starting from the dimer structure of Rab9-GDP complex (PDB code 1WMS) as the search model. The composite omit map was calculated to guide electron density fitting of the model. Energy-restrained crystallographic refinement was carried out with maximum likelihood algorithms implemented in CNS. Refinement proceeded through several cycles in combination with manual checking by the program O (Jones, T. A. et al. (1991) Acta Crystallogr. Sect. A 47, 110-119). The final refinement cycle resulted in R/Rfree values of 0.194/0.223. The final model contains residues 6-175 of monomer A′, residues 4-110 and 115-175 of monomer B′, residues 6-174 of monomer C′, residues 6-110 and 115-175 of monomer D′, four GppNHp molecules, four magnesium ions, and 495 water molecules. The phasing and refinement statistics are summarized in Table 5.
Structure Determination—Rab9 bound to GppNHp was crystallized, and its structure was determined by molecular replacement and refined against 1.73 Å resolution data (Table 5). The refined structure has excellent stereochemistry with r.m.s. deviations for bond lengths and angles of 0.005 Å and 1.2°, respectively. The Ramachandran plot statistics show that 91.4% of the backbone dihedral angles are in the most favored regions, 8.6% are in the additional allowed regions, and none of the non-glycine residues are in the disallowed regions. There are four crystallographically unique Rab9 molecules in the crystal asymmetric unit. They can be divided into two groups with two monomers each. Monomers A′ and D′ (Group I) have almost identical structures with the r.m.s. deviation between the 166 equivalent Cα atoms of 0.25 Å, while Monomers B′ and C′ (Group II) are very similar with the r.m.s. deviation between the 165 equivalent Cα atoms of 0.52 Å. The r.m.s. deviations between any inter-group pair of monomers are much higher, in the range of 1.89 to 1.94 Å. For simplicity, Monomers A′ and C′ will be used in the present description of Rab9-GppNHp complex structures (
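The r.m.s. deviations quoted for equivalent Cα atoms can be illustrated with a short calculation. The sketch below assumes the two coordinate sets have already been superimposed and paired atom-for-atom; it does not perform the superposition itself and is not the procedure used to obtain the reported values. The function name `rmsd` is hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation between paired, pre-superimposed atoms.

    coords_a, coords_b: equal-length lists of (x, y, z) tuples in Å, where
    the i-th entries of each list are the equivalent atoms.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    total = sum(math.dist(p, q) ** 2 for p, q in zip(coords_a, coords_b))
    return math.sqrt(total / len(coords_a))
```

In practice the pairing of equivalent residues and the optimal rigid-body superposition are determined first, for example by a structure alignment program, and only then is this average taken.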
Overall Structure of Rab9 in the Rab9-GppNHp Complex—Rab9 in the complex adopts a classical nucleotide binding fold consisting of a six-stranded β-sheet surrounded by five α-helices (
Active Site Structure—The crystal structure reported here contains a tightly bound GppNHp molecule in the active site of each monomer (
Structure Comparison With the Rab9-GDP Complex—The overall structures of Rab9-GppNHp complex are similar to that of Rab9-GDP complex with r.m.s. deviation of equivalent Cα atoms in the range of 1.28 Å for monomer A′ to 1.32 Å for monomer C′ (
Atomic Coordinates and Structure Factors—Atomic coordinates and structure factors for the Rab9-GppNHp complex have been reproduced here in Table 6. All abbreviations, terms, and formats of the listed values follow the PDB Format Guide (Version 2.3, 1998).
All publications and patent applications mentioned in the specification are indicative of the level of those skilled in the art to which this invention pertains. All publications and patent applications are herein incorporated by reference to the same extent as if each individual publication or patent application was specifically and individually indicated to be incorporated by reference.
Although the foregoing invention has been described in some detail by way of illustration and example for purposes of clarity of understanding, it will be obvious that certain changes and modifications may be practiced within the scope of the appended claims.
aValues in parentheses are for the highest resolution shell.
aProduced by Dali algorithm (Holm, L., and Sander, C. (1993) J. Mol. Biol. 233, 123-138; Holm, L., and Sander, C. (1998) Proteins 33, 88-96).
bZ-score strength of structural similarity in standard deviations above expected.
cPositional root mean square deviation of superimposed Cα atoms in Å.
dTotal number of equivalent residues.
eLength of the entire chain of the equivalent structure.
fPercentage of sequence identity over equivalent positions.
aValues in parentheses are for the highest resolution shell.
This application is a divisional of co-pending U.S. application Ser. No. 11/154,203, filed Jun. 16, 2005, which claims the benefit of U.S. Provisional Application No. 60/603,904, filed Aug. 24, 2004 and U.S. Provisional Application No. 60/581,961, filed Jun. 22, 2004; each of which is hereby incorporated by reference in its entirety.
The research underlying this invention was supported in part with funds from the National Science Foundation (NSF; NSF EPSCoR Program Grant No. EPS-0091853), the National Aeronautics and Space Administration (NASA; NASA Cooperative Agreement Number NCC8-243), and the United States Department of Energy (DOE; Contract No. W-31-109-Eng-38). The United States Government may have an interest in the subject matter of this invention.
Number | Date | Country
---|---|---
60603904 | Aug 2004 | US
60581961 | Jun 2004 | US
Relation | Number | Date | Country
---|---|---|---
Parent | 11154203 | Jun 2005 | US
Child | 11933636 | | US