The present invention generally pertains to the fields of molecular biology, protein crystallization, X-ray diffraction analysis, three-dimensional structural determination, molecular modeling and structure based rational drug design. The present invention provides crystallized peptides of the c-fms kinase domain as well as descriptions of the X-ray diffraction patterns. The X-ray diffraction patterns of the c-fms kinase domain crystals are of sufficient resolution so that the three-dimensional structure can be determined at atomic resolution, ligand binding sites on c-fms can be identified, and the interactions of ligands with c-fms amino acid residues can be modeled.
The high resolution maps provided by the present invention and the models prepared using such maps also permit the design of ligands which can function as active agents. Thus, the present invention has applications to the design of active agents which include, but are not limited to, those that find use as inhibitors of c-fms for the treatment of diseases caused by inappropriate activity of c-fms.
Protein kinases are enzymes that serve as key components of signal transduction pathways by catalyzing the transfer of the terminal phosphate from ATP to the hydroxy group of tyrosine, serine and threonine residues of proteins. As a consequence, protein kinase inhibitors and substrates are valuable tools for assessing the physiological consequences of protein kinase activation. The overexpression or inappropriate expression of normal or mutant protein kinases in mammals has been demonstrated to play significant roles in the development of many diseases, including cancer, diabetes and autoimmune diseases.
Protein kinases can be divided into two classes: those that preferentially phosphorylate tyrosine residues (protein tyrosine kinases) and those that preferentially phosphorylate serine and/or threonine residues (protein serine/threonine kinases). Protein tyrosine kinases perform diverse functions ranging from stimulation of cell growth and differentiation to arrest of cell proliferation. They can be classified as either receptor protein tyrosine kinases or intracellular protein tyrosine kinases. The FMS or CSF-1-R protooncogene encodes the macrophage colony stimulating factor I receptor (or CSF-1-R or c-fms), which is the cell surface receptor for the colony stimulating factor I (CSF-1 or M-CSF) [1]. c-fms is part of the Platelet Derived Growth Factor (“PDGF”) receptor family, which includes PDGFR, the stem cell factor receptor (c-kit), c-fms, VEGFR-1 (flt-1) and VEGFR-2 (KDR).
Receptor Tyrosine Kinases (RTKs), such as c-fms, share a common architecture by which an extracellular ligand-binding domain is connected via a transmembrane segment to an intracellular catalytic domain with intrinsic tyrosine kinase activity. Binding of the ligand to the ligand-binding domain induces a conformational change, which leads in most cases to receptor dimerization, autophosphorylation of the kinase domain or adjacent domains and activation of the kinase. The activated RTK in turn trans-phosphorylates specific tyrosine residues of its respective substrates, thus transmitting the signal further. Additional members of the RTK subfamilies include the epidermal growth factor (“EGF”) family (HER-1, HER-2/neu and HER-3 receptors), which code for oncogenes that have been linked to breast, colorectal and prostate cancers.
Insulin receptor (“IR”) and insulin-like growth factor I receptor (“IGF-1R”) are structurally and functionally related but exert distinct biological effects. IGF-1R over-expression has been associated with breast cancer. The fibroblast growth factor (“FGF”) receptor family consists of four receptors, which are responsible for the production of blood vessels, for limb outgrowth, and for the growth and differentiation of numerous cell types.
Mononuclear phagocyte colony-stimulating factor (CSF-1 or M-CSF) is a polypeptide growth factor, which stimulates the survival, proliferation, and differentiation of haematopoietic cells of the monocyte-macrophage series. Multiple forms of soluble CSF-1 are produced by proteolytic cleavage of membrane-bound precursors, some of which are stably expressed at the cell surface [2].
Valuable insight into the signaling role of M-CSF and its receptor c-fms comes from the M-CSF-deficient mouse strain (op/op) [3]. These mice exhibit a selective reduction of monocytes, osteoclasts and macrophages in muscle, joints and other tissues. Furthermore, these mice are osteoporotic and exhibit reduced fertility, but the inability to produce functional M-CSF appears not to be life threatening per se. op/op mice are resistant to collagen-induced arthritis and show a reduced rate of mammary tumor progression into metastasis [4].
M-CSF has been shown to exacerbate collagen-induced arthritis in mouse models, an effect that could be suppressed with M-CSF blocking antibodies [5]. In another study, M-CSF and GM-CSF (granulocyte macrophage colony-stimulating factor) induced prolonged inflammation and recruitment of macrophages in an mBSA-induced arthritis model [6]. These studies demonstrated that CSF-1 and GM-CSF can exacerbate and prolong the histopathology of acute inflammatory arthritis and lend support to monocytes/macrophages being a driving influence in the pathogenesis of inflammatory arthritis. The data shown in these studies suggest that either M-CSF or its cognate receptor c-fms is a suitable target for treating arthritis or other macrophage-induced inflammatory diseases.
In a recent study, expression of c-fms in various tumors was linked to poor survivability and increased tumor size [7]. M-CSF and c-fms have also been shown to be expressed by carcinomas of the breast and other epithelia of the female reproductive tract where activation of the receptor by ligand produced either by the tumor cells or by stromal elements stimulates tumor cell invasion by a urokinase-dependent mechanism [8]. These results also support other preclinical findings that CSF-1R may be involved in local invasion and metastasis. Thus, this receptor may be an effective anti-cancer target.
Recent studies indicate that macrophages infected with HIV-1 produce high levels of M-CSF related specifically to HIV-1 and not other viral infections. High levels of M-CSF appear to be important to sustain HIV replication in vitro [9], a fact that is also corroborated by inhibition of HIV-1 replication through M-CSF scavenging agents (anti-M-CSF monoclonal or polyclonal antibodies or soluble M-CSF receptors). These results suggest that antagonists of the action of M-CSF may represent a novel strategy for inhibiting the spread of HIV-1.
Overview of the c-fms Structures
The structure of the c-fms kinase domain closely resembles other kinase domain structures in the inactive form determined so far [12-14]. c-fms is organized in a two-lobe structure (
Activation Loop
The activation loop in RTKs is an essential element for the regulation of the kinase activity. In RTKs the activation loop is approximately 22 amino acids long and begins with a conserved Asp-Phe-Gly (DFG) motif and ends with a tyrosine kinase conserved Pro [15]. Autophosphorylation of tyrosines present in the activation loop has been shown to be essential for stimulation of activity for RTKs. In the absence of phosphorylation the activation loop is not properly positioned for catalysis and prevents binding of ATP. Phosphorylation events in the activation loop stabilize a conformation in which the activation loop is accessible to substrates and residues important for catalysis are positioned properly. Tyr809 is the single tyrosine present in the c-fms activation loop and is one of several that are phosphorylated in response to ligand binding. Tyr809 is bound in the active site in a manner very similar to that of Tyr1162 of the inactive form of IRK [13]. The phenol group of Tyr809 forms hydrogen-bonding interactions with Asp778 and Arg782 of the catalytic loop, which stabilize the inactive conformation of the activation loop.
The effect of phosphorylation of this critical residue in the activation loop has been established by several mutagenesis studies. For instance, a Tyr809Phe mutation prevents differentiation of macrophage colony-stimulating factor (M-CSF)-dependent bone marrow macrophages into osteoclasts [16]. In a rat cell line model, Tyr809Gly abolished kinase activity and Tyr809Phe reduced kinase activity by 40-60% [17]. In a previous study, the same Tyr809Phe mutation was shown to retain activity as a tyrosine kinase in vitro and in vivo, was able to undergo CSF-1-dependent association with a phosphatidylinositol 3-kinase, and induced expression of the protooncogenes c-fos and junB, underscoring its ability to trigger some of the known cellular responses to CSF-1 [18]. On the other hand, the mutated receptor failed to induce mitogenesis.
Juxtamembrane Domain
The c-fms juxtamembrane domain (JM-domain) corresponds to residues 538-572 and contains two tyrosines (546 and 561). Tyr546 was shown to be a major autophosphorylation site and binds to an as yet unidentified 55 kDa phosphoprotein [19]. A phosphopeptide modeled on the sequence of Tyr561 and surrounding residues competed with the association of Fyn with c-fms [20]. Furthermore, mutational analysis demonstrated that this and other sequences were required for the efficient association of Src family kinases with activated c-fms in vivo.
Structurally, the JM-domain adopts an arrangement similar to the one observed in the recently determined flt3 kinase structure [21]. Residues 548-552 are wedged between the catalytically important α-helix C and a β-sheet like loop region (residues 772-776) just preceding the catalytic loop. Residues N-terminal to Val548 do not show any electron density and are disordered. Parts of the JM-domain (558-559 and 565-572) are also disordered; this is in contrast to the flt3 structure in which the whole JM-domain was traced.
Another difference is the main anchoring point of the JM-domain to the bulk of the kinase domain. In c-fms, Trp550 serves as the main anchor and is wedged deep into a cleft under α-helix C, whereas in flt3 the anchoring residue is Tyr572, which is located 2 residues upstream along the JM-domain. Trp550 sits in a hydrophobic pocket formed by Ile636, Met637, Leu640, Ile646, Leu769, Cys774 and Ile794. It also forms a π face-to-edge interaction with His776. The backbone amide forms a hydrogen bond with one of the carboxylate oxygen atoms of Asp796, which is part of the DFG motif and marks the start of the activation loop. The backbone oxygen of Trp550 forms another hydrogen bond with the side chain of Arg777. The extended network of hydrophobic interactions and the backbone hydrogen-bonding network keep Trp550 firmly seated in its place, a fact underscored by the significantly lower B-factors of Trp550 as compared to its neighboring residues. Downstream of Trp550, a 3-residue antiparallel β-sheet like interaction between residues 551-553 and 773-775 provides additional anchorage for the JM-domain. This structural organization is also similar to that adopted by the activation loop of activated Insulin Receptor Kinase (IRK-A) in complex with AMP-PNP [22]. In IRK-A the activation loop is displaced from its inhibitory position in the nucleotide-binding pocket and folds partly around the kinase domain parallel to the interface between the N- and C-lobe. Residues 1153 and 1157 are also wedged under the α-helix C and form a similar β-sheet like interaction as the c-fms JM-domain.
Kinase Insert Domain (KID)
The kinase insert domain is an additional loop region found in a subset of RTKs, located between α-helix D and α-helix E. It can vary in length from a dozen to almost 100 residues. There are several reports that the KID is involved in downstream signaling of c-fms through the mediation of protein-protein interactions. Deletion of the entire kinase insert domain completely abrogated signal transduction by the CSF-1 receptor expressed in Rat-2 fibroblasts [23]. Mutation of either Tyr697 or Tyr721 (Tyr699 and Tyr723 in human c-fms) compromised signal transduction by c-fms, and the receptor lost all ability to induce changes in morphology or to increase cell growth rate in response to CSF-1. Early protein constructs that utilized the full KID did not yield any crystals.
A need continues to exist for the development of modeling systems to design and select potent, small molecules that are inhibitors of c-fms for the treatment of diseases caused by inappropriate activity of c-fms.
The present invention includes an isolated chimeric kinase receptor polypeptide, wherein the polypeptide comprises an ATP binding pocket linked to a substrate binding pocket by a kinase insert domain (KID), wherein the domain is heterologous to the ATP binding pocket or the substrate binding pocket. The invention also includes a crystal comprising the chimeric kinase receptor polypeptide and a crystal comprising a fragment of the chimeric kinase receptor polypeptide. In one aspect of the invention, the ATP binding pocket and substrate binding pocket are c-fms. In a different aspect of the invention, the heterologous KID is selected from the group consisting of FGFR1, tie2 and IRK.
In one aspect of the invention, the invention includes an isolated chimeric kinase receptor polypeptide, wherein the chimeric polypeptide comprises an amino acid sequence beginning at c-fms amino acid position 538 and continuing through c-fms amino acid position 922 wherein the native c-fms KID is replaced with a KID sequence comprising a heterologous KID amino acid sequence beginning at c-fms amino acid positions 672-688, SEQ ID NO. 1. The invention also includes a crystal comprising the polypeptide or a crystal comprising a fragment of the polypeptide. In one embodiment, the heterologous KID is selected from the group consisting of FGFR1, tie2 and IRK. In another aspect of the invention, the chimeric polypeptide has an amino acid sequence having at least 95% amino acid sequence identity to a sequence selected from the group consisting of SEQ ID NO. 2 (FMS/FGFR1 chimera); SEQ ID NO. 4 (FMS/tie chimera) and SEQ ID NO: 6 (FMS/irk chimera). The chimeric polypeptide or the chimeric kinase receptor polypeptide can be in crystalline form.
The invention also includes a crystal comprising a chimeric kinase receptor polypeptide wherein the polypeptide comprises an ATP binding pocket linked to a substrate binding pocket by a kinase insert domain wherein the domain is heterologous to the ATP binding pocket or the substrate binding pocket, or a fragment thereof, wherein the crystal comprises a crystal structure defined by one or more structure coordinates of c-fms amino acid residues Trp 550, Lys 586, Thr 587, Leu 588, Gly 589, Val 596, Glu 598, Ala 614, Val 615, Lys 616, Glu 633, Met 637, Leu 640, Ile 646, Val 647, Val 661, Thr 663, Glu 664, Tyr 665, Cys 666, Cys 667, Tyr 668, Gly 669, Asp 670, Asn 673, Arg 677, Cys 774, Ile 775, His 776, Arg 782, Asn 783, Leu 785, Ile 794, Gly 795, Asp 796, Phe 797, Gly 798, Leu 799, Ala 800, Arg 801, Asp 802 according to Tables 1, 2 or 3 or similar structure coordinates for said amino acids comprising a root mean square deviation of non-hydrogen atoms of less than about 1.5 Å when superimposed on the non-hydrogen atom positions of the corresponding atomic coordinates of Tables 1, 2 or 3.
In a different aspect the crystal comprises a chimeric kinase receptor polypeptide wherein the polypeptide comprises an ATP binding pocket linked to a substrate binding pocket by a kinase insert domain wherein the domain is heterologous to the ATP binding pocket or the substrate binding pocket, or a fragment thereof, wherein the crystal comprises a crystal structure defined by one or more structure coordinates of c-fms amino acid residues Trp 550, Lys 586, Thr 587, Leu 588, Gly 589, Val 596, Glu 598, Ala 614, Val 615, Lys 616, Glu 633, Met 637, Leu 640, Ile 646, Val 647, Val 661, Thr 663, Glu 664, Tyr 665, Cys 666, Cys 667, Tyr 668, Gly 669, Asp 670, Asn 673, Arg 677, Cys 774, Ile 775, His 776, Arg 782, Asn 783, Leu 785, Ile 794, Gly 795, Asp 796, Phe 797, Gly 798, Leu 799, Ala 800, Arg 801, Asp 802 according to Tables 1, 2 or 3 or similar structure coordinates for said amino acids comprising a root mean square deviation of non-hydrogen atoms of less than about 0.75 Å when superimposed on the non-hydrogen atom positions of the corresponding atomic coordinates of Tables 1, 2 or 3.
In a different aspect, the crystal comprises a chimeric kinase receptor polypeptide wherein the polypeptide comprises an ATP binding pocket linked to a substrate binding pocket by a kinase insert domain wherein the domain is heterologous to the ATP binding pocket or the substrate binding pocket, or a fragment thereof, wherein the crystal comprises a crystal structure defined by one or more structure coordinates of c-fms amino acid residues Lys 586, Leu 588, Gly 589, Val 596, Glu 598, Ala 614, Lys 616, Val 647, Thr 663, Glu 664, Tyr 665, Cys 666, Cys 667, Tyr 668, Gly 669, Asp 670, Arg 677, Arg 782, Leu 785, Asp 796, Phe 797, Gly 798, Leu 799, Ala 800, Arg 801, Asp 802 according to Tables 1, 2 or 3 or similar structure coordinates for said amino acids comprising a root mean square deviation of non-hydrogen atoms of less than about 1.5 Å when superimposed on the non-hydrogen atom positions of the corresponding atomic coordinates of Tables 1, 2 or 3.
The invention also includes a crystal comprising a chimeric kinase receptor polypeptide wherein the polypeptide comprises an ATP binding pocket linked to a substrate binding pocket by a kinase insert domain wherein the domain is heterologous to the ATP binding pocket or the substrate binding pocket, or a fragment thereof, wherein the crystal comprises a crystal structure defined by one or more structure coordinates of c-fms amino acid residues Lys 586, Leu 588, Gly 589, Val 596, Glu 598, Ala 614, Lys 616, Val 647, Thr 663, Glu 664, Tyr 665, Cys 666, Cys 667, Tyr 668, Gly 669, Asp 670, Arg 677, Arg 782, Leu 785, Asp 796, Phe 797, Gly 798, Leu 799, Ala 800, Arg 801, Asp 802 according to Tables 1, 2 or 3 or similar structure coordinates for said amino acids comprising a root mean square deviation of non-hydrogen atoms of less than about 0.75 Å when superimposed on the non-hydrogen atom positions of the corresponding atomic coordinates of Tables 1, 2 or 3.
The invention also includes a crystal comprising a chimeric kinase receptor polypeptide wherein the polypeptide comprises an ATP binding pocket linked to a substrate binding pocket by a kinase insert domain wherein the domain is heterologous to the ATP binding pocket or the substrate binding pocket, or a fragment thereof, wherein the crystal comprises a crystal structure defined by structure coordinates of c-fms amino acid residues Lys 586, Leu 588, Gly 589, Val 596, Glu 598, Ala 614, Lys 616, Val 647, Thr 663, Glu 664, Tyr 665, Cys 666, Cys 667, Tyr 668, Gly 669, Asp 670, Arg 782, Asn 783, Leu 785, Asp 796, Phe 797, Leu 799, Ala 800, Arg 801 according to Tables 1, 2 or 3 or similar structure coordinates for said amino acids comprising a root mean square deviation of non-hydrogen atoms of less than about 1.5 Å when superimposed on the non-hydrogen atom positions of the corresponding atomic coordinates of Tables 1, 2 or 3.
The invention also includes a crystal comprising a chimeric kinase receptor polypeptide wherein the polypeptide comprises an ATP binding pocket linked to a substrate binding pocket by a kinase insert domain wherein the domain is heterologous to the ATP binding pocket or the substrate binding pocket, or a fragment thereof, wherein the crystal comprises a crystal structure defined by structure coordinates of c-fms amino acid residues Lys 586, Leu 588, Gly 589, Val 596, Glu 598, Ala 614, Lys 616, Val 647, Thr 663, Glu 664, Tyr 665, Cys 666, Cys 667, Tyr 668, Gly 669, Asp 670, Arg 782, Asn 783, Leu 785, Asp 796, Phe 797, Leu 799, Ala 800, Arg 801 according to Tables 1, 2 or 3 or similar structure coordinates for said amino acids comprising a root mean square deviation of non-hydrogen atoms of less than about 0.75 Å when superimposed on the non-hydrogen atom positions of the corresponding atomic coordinates of Tables 1, 2 or 3.
In another aspect of the invention, the invention comprises a crystal comprising a chimeric kinase receptor polypeptide, wherein said chimeric polypeptide comprises an amino acid sequence beginning at c-fms amino acid position 538 and continuing through c-fms amino acid position 922 wherein the native c-fms KID is replaced with a KID sequence comprising a heterologous KID amino acid sequence beginning at c-fms amino acid positions 672-688, or a fragment of the chimeric kinase receptor polypeptide, wherein said crystal comprises a crystal structure defined by one or more structure coordinates of c-fms amino acid residues Trp 550, Lys 586, Thr 587, Leu 588, Gly 589, Val 596, Glu 598, Ala 614, Val 615, Lys 616, Glu 633, Met 637, Leu 640, Ile 646, Val 647, Val 661, Thr 663, Glu 664, Tyr 665, Cys 666, Cys 667, Tyr 668, Gly 669, Asp 670, Asn 673, Arg 677, Cys 774, Ile 775, His 776, Arg 782, Asn 783, Leu 785, Ile 794, Gly 795, Asp 796, Phe 797, Gly 798, Leu 799, Ala 800, Arg 801, Asp 802 according to Tables 1, 2 or 3 or similar structure coordinates for said amino acids comprising a root mean square deviation of non-hydrogen atoms of less than about 1.5 Å when superimposed on the non-hydrogen atom positions of the corresponding atomic coordinates of Tables 1, 2 or 3.
In another aspect of the invention, the invention comprises a crystal comprising a chimeric kinase receptor polypeptide, wherein said chimeric polypeptide comprises an amino acid sequence beginning at c-fms amino acid position 538 and continuing through c-fms amino acid position 922 wherein the native c-fms KID is replaced with a KID sequence comprising a heterologous KID amino acid sequence beginning at c-fms amino acid positions 672-688, or a fragment of the chimeric kinase receptor polypeptide, wherein said crystal comprises a crystal structure defined by one or more structure coordinates of c-fms amino acid residues Trp 550, Lys 586, Thr 587, Leu 588, Gly 589, Val 596, Glu 598, Ala 614, Val 615, Lys 616, Glu 633, Met 637, Leu 640, Ile 646, Val 647, Val 661, Thr 663, Glu 664, Tyr 665, Cys 666, Cys 667, Tyr 668, Gly 669, Asp 670, Asn 673, Arg 677, Cys 774, Ile 775, His 776, Arg 782, Asn 783, Leu 785, Ile 794, Gly 795, Asp 796, Phe 797, Gly 798, Leu 799, Ala 800, Arg 801, Asp 802 according to Tables 1, 2 or 3 or similar structure coordinates for said amino acids comprising a root mean square deviation of non-hydrogen atoms of less than about 0.75 Å when superimposed on the non-hydrogen atom positions of the corresponding atomic coordinates of Tables 1, 2 or 3.
In another aspect of the invention, the invention comprises a crystal comprising a chimeric kinase receptor polypeptide, wherein said chimeric polypeptide comprises an amino acid sequence beginning at c-fms amino acid position 538 and continuing through c-fms amino acid position 922 wherein the native c-fms KID is replaced with a KID sequence comprising a heterologous KID amino acid sequence beginning at c-fms amino acid positions 672-688, or a fragment of the chimeric kinase receptor polypeptide, wherein said crystal comprises a crystal structure defined by one or more structure coordinates of c-fms amino acid residues Lys 586, Leu 588, Gly 589, Val 596, Glu 598, Ala 614, Lys 616, Val 647, Thr 663, Glu 664, Tyr 665, Cys 666, Cys 667, Tyr 668, Gly 669, Asp 670, Arg 677, Arg 782, Leu 785, Asp 796, Phe 797, Gly 798, Leu 799, Ala 800, Arg 801, Asp 802 according to Tables 1, 2 or 3 or similar structure coordinates for said amino acids comprising a root mean square deviation of non-hydrogen atoms of less than about 1.5 Å when superimposed on the non-hydrogen atom positions of the corresponding atomic coordinates of Tables 1, 2 or 3.
In another aspect of the invention, the invention comprises a crystal comprising a chimeric kinase receptor polypeptide, wherein said chimeric polypeptide comprises an amino acid sequence beginning at c-fms amino acid position 538 and continuing through c-fms amino acid position 922 wherein the native c-fms KID is replaced with a KID sequence comprising a heterologous KID amino acid sequence beginning at c-fms amino acid positions 672-688, or a fragment of the chimeric kinase receptor polypeptide, wherein said crystal comprises a crystal structure defined by one or more structure coordinates of c-fms amino acid residues Lys 586, Leu 588, Gly 589, Val 596, Glu 598, Ala 614, Lys 616, Val 647, Thr 663, Glu 664, Tyr 665, Cys 666, Cys 667, Tyr 668, Gly 669, Asp 670, Arg 677, Arg 782, Leu 785, Asp 796, Phe 797, Gly 798, Leu 799, Ala 800, Arg 801, Asp 802 according to Tables 1, 2 or 3 or similar structure coordinates for said amino acids comprising a root mean square deviation of non-hydrogen atoms of less than about 0.75 Å when superimposed on the non-hydrogen atom positions of the corresponding atomic coordinates of Tables 1, 2 or 3.
In another aspect of the invention, the invention comprises a crystal comprising a chimeric kinase receptor polypeptide, wherein said chimeric polypeptide comprises an amino acid sequence beginning at c-fms amino acid position 538 and continuing through c-fms amino acid position 922 wherein the native c-fms KID is replaced with a KID sequence comprising a heterologous KID amino acid sequence beginning at c-fms amino acid positions 672-688, or a fragment of the chimeric kinase receptor polypeptide, wherein said crystal comprises a crystal structure defined by structure coordinates of c-fms amino acid residues Lys 586, Leu 588, Gly 589, Val 596, Glu 598, Ala 614, Lys 616, Val 647, Thr 663, Glu 664, Tyr 665, Cys 666, Cys 667, Tyr 668, Gly 669, Asp 670, Arg 782, Asn 783, Leu 785, Asp 796, Phe 797, Leu 799, Ala 800, Arg 801 according to Tables 1, 2 or 3 or similar structure coordinates for said amino acids comprising a root mean square deviation of non-hydrogen atoms of less than about 1.5 Å when superimposed on the non-hydrogen atom positions of the corresponding atomic coordinates of Tables 1, 2 or 3.
In another aspect of the invention, the invention comprises a crystal comprising a chimeric kinase receptor polypeptide, wherein said chimeric polypeptide comprises an amino acid sequence beginning at c-fms amino acid position 538 and continuing through c-fms amino acid position 922 wherein the native c-fms KID is replaced with a KID sequence comprising a heterologous KID amino acid sequence beginning at c-fms amino acid positions 672-688, or a fragment of the chimeric kinase receptor polypeptide, wherein said crystal comprises a crystal structure defined by structure coordinates of c-fms amino acid residues Lys 586, Leu 588, Gly 589, Val 596, Glu 598, Ala 614, Lys 616, Val 647, Thr 663, Glu 664, Tyr 665, Cys 666, Cys 667, Tyr 668, Gly 669, Asp 670, Arg 782, Asn 783, Leu 785, Asp 796, Phe 797, Leu 799, Ala 800, Arg 801 according to Tables 1, 2 or 3 or similar structure coordinates for said amino acids comprising a root mean square deviation of non-hydrogen atoms of less than about 0.75 Å when superimposed on the non-hydrogen atom positions of the corresponding atomic coordinates of Tables 1, 2 or 3.
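The root mean square deviation criterion recited in the preceding aspects can be illustrated with a minimal sketch. It assumes the two coordinate sets have already been superimposed (the superposition step itself is not shown), and the function name, data layout and coordinates are hypothetical illustrations, not values from Tables 1, 2 or 3.

```python
import math

def heavy_atom_rmsd(reference, model):
    """RMSD over non-hydrogen atoms shared by two already-superimposed structures.

    Each structure maps (residue_number, atom_name) -> (element, (x, y, z)).
    """
    squared = []
    for key, (element, ref_xyz) in reference.items():
        if element.upper() == "H" or key not in model:
            continue  # skip hydrogens and atoms absent from the model
        _, mod_xyz = model[key]
        squared.append(sum((r - m) ** 2 for r, m in zip(ref_xyz, mod_xyz)))
    if not squared:
        raise ValueError("no shared non-hydrogen atoms to compare")
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical coordinates for illustration only (not taken from the Tables).
reference = {
    (550, "N"): ("N", (0.0, 0.0, 0.0)),
    (550, "CA"): ("C", (1.5, 0.0, 0.0)),
    (550, "H"): ("H", (0.0, 1.0, 0.0)),
}
model = {
    (550, "N"): ("N", (0.0, 0.0, 1.0)),
    (550, "CA"): ("C", (1.5, 0.0, 1.0)),
    (550, "H"): ("H", (9.0, 9.0, 9.0)),
}

rmsd = heavy_atom_rmsd(reference, model)
within_1_5 = rmsd < 1.5    # the broader threshold recited above
within_0_75 = rmsd < 0.75  # the narrower threshold recited above
```

In this toy case each non-hydrogen atom is displaced by 1 Å, so the set would fall under the 1.5 Å threshold but not the 0.75 Å threshold; the hydrogen is ignored regardless of its displacement.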
In yet a different aspect, any of the crystals further comprise a ligand, wherein the ligand is an ATP-binding pocket ligand. In one embodiment, the ATP-binding pocket ligand is a small molecule inhibitor.
In one embodiment, the small molecule inhibitor is an arylamide compound or a derivative thereof. In another embodiment, the small molecule inhibitor is a quinolone compound or a derivative thereof. In a preferred embodiment, the arylamide compound is 5-cyano-furan-2-carboxylic acid [5-hydroxymethyl-2-(4-methyl-piperidine-1-yl)-phenyl]-amide or a derivative thereof. In another preferred embodiment, the quinolone compound is 6-Chloro-3-(3-methyl-isoxazol-5-yl)-4-phenyl-1H-quinolin-2-one or a derivative thereof.
In one aspect of the invention, the crystal-ligand complex has a space group of R3 (Form I). In another aspect of the invention, the crystal-ligand complex has a space group of P212121 (Form II). In yet a different aspect, the crystal effectively diffracts X-rays for determination of atomic coordinates to a resolution of at least about 1.9 Å (Form I). In another aspect, the crystal effectively diffracts X-rays for determination of atomic coordinates to a resolution of at least about 3.0 Å (Form II).
Also included in the invention is a crystal comprising a unit cell having dimensions consisting of: a=81.07; b=81.07; c=144.67; alpha=90; beta=90; gamma=120. In one embodiment, the crystal comprises a unit cell having dimensions consisting of a=53.1; b=72.4; c=91.7; alpha=90; beta=90; gamma=90.
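As a worked illustration of the unit cell parameters above, the following sketch computes the cell volume from the general (triclinic) formula. The text gives no units; the sketch assumes the conventional ones (edge lengths in Å, angles in degrees), and the function and variable names are hypothetical.

```python
import math

def unit_cell_volume(a, b, c, alpha, beta, gamma):
    """General unit cell volume; edges in one length unit, angles in degrees."""
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    return a * b * c * math.sqrt(1 - ca**2 - cb**2 - cg**2 + 2 * ca * cb * cg)

# Cell parameters as listed in the text (units assumed: Å and degrees).
first_cell = unit_cell_volume(81.07, 81.07, 144.67, 90, 90, 120)
second_cell = unit_cell_volume(53.1, 72.4, 91.7, 90, 90, 90)
```

Under these assumptions the first cell (a=b, gamma=120) reduces to a·a·c·sin(120°), roughly 8.2 × 10^5 cubic Å, and the second (all angles 90) reduces to a·b·c, roughly 3.5 × 10^5 cubic Å.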
The invention also includes a crystal comprising a polypeptide which comprises a peptide having at least 95% amino acid sequence identity to SEQ ID NO. 2 (FMS/FGFR1 chimera); SEQ ID NO. 4 (FMS/tie chimera) or SEQ ID NO: 6 (FMS/irk chimera). In one embodiment, the crystal comprises a peptide having at least 95% sequence identity to SEQ ID NO. 2.
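The 95% sequence identity threshold can be illustrated with a minimal sketch. It assumes the two sequences have already been aligned and counts identity over columns where neither sequence has a gap, which is only one of several conventions in use; the function name and the 20-residue toy sequences are hypothetical and are not SEQ ID NO. 2, 4 or 6.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over ungapped columns of two pre-aligned sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap or y == gap:
            continue  # ignore columns containing a gap
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared

# Hypothetical toy sequences differing at a single position.
reference = "MKVLDFGLARDIMNDSNYIV"
candidate = "MKVLDFGLARDIMNDSNYLV"

identity = percent_identity(reference, candidate)
meets_threshold = identity >= 95.0
```

With one mismatch in 20 aligned positions, the toy candidate sits exactly at the 95% boundary and therefore satisfies an "at least 95%" test.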
In one aspect, the crystal comprises SEQ ID NO: 2 comprising an atomic structure characterized by the coordinates of Tables 1, 2 or 3. In another aspect the invention includes an isolated nucleic acid molecule encoding any of the chimeric polypeptides or polypeptides disclosed above, a vector comprising the nucleic acid, a host cell comprising the vector and a method of producing the polypeptide by culturing the host cell.
Also included in the invention is a computer system comprising (a) a database containing information on the three dimensional structure of a crystal comprising a c-fms chimera, or a fragment or a target structural motif or derivative thereof, and a ligand, wherein the ligand is a small molecule inhibitor, stored on a computer readable storage medium; and, (b) a user interface to view the information.
In one aspect of the invention, the information comprises diffraction data obtained from a crystal comprising SEQ ID NO: 2, 4 or 6. In another aspect, the information comprises an electron density map of a crystal form comprising SEQ ID NO: 2, 4 or 6. In yet a different aspect, the information comprises the structure coordinates of Tables 1, 2 or 3 or homologous structure coordinates for the amino acids of SEQ ID NO: 2 comprising a root mean square deviation of non-hydrogen atoms of less than about 1.5 Å when superimposed on the non-hydrogen atom positions of the corresponding atomic coordinates of Tables 1, 2 or 3. In one embodiment, the information comprises structure coordinates for amino acid residues of SEQ ID NO: 2 comprising a root mean square deviation of non-hydrogen atoms of less than about 0.75 Å when superimposed on the non-hydrogen atom positions of the corresponding atomic coordinates of Tables 1, 2 or 3.
In yet a different embodiment, the information comprises the structure coordinates for one or more amino acid residues Trp 550, Lys 586, Thr 587, Leu 588, Gly 589, Val 596, Glu 598, Ala 614, Val 615, Lys 616, Glu 633, Met 637, Leu 640, Ile 646, Val 647, Val 661, Thr 663, Glu 664, Tyr 665, Cys 666, Cys 667, Tyr 668, Gly 669, Asp 670, Asn 673, Arg 677, Cys 774, Ile 775, His 776, Arg 782, Asn 783, Leu 785, Ile 794, Gly 795, Asp 796, Phe 797, Gly 798, Leu 799, Ala 800, Arg 801, Asp 802 according to Tables 1, 2 or 3 or similar structure coordinates for said amino acids comprising a root mean square deviation of non-hydrogen atoms of less than about 1.5 Å when superimposed on the non-hydrogen atom positions of the corresponding atomic coordinates of Tables 1, 2 or 3. In another aspect, the information further comprises the structure coordinates for one or more amino acid residues Trp 550, Lys 586, Thr 587, Leu 588, Gly 589, Val 596, Glu 598, Ala 614, Val 615, Lys 616, Glu 633, Met 637, Leu 640, Ile 646, Val 647, Val 661, Thr 663, Glu 664, Tyr 665, Cys 666, Cys 667, Tyr 668, Gly 669, Asp 670, Asn 673, Arg 677, Cys 774, Ile 775, His 776, Arg 782, Asn 783, Leu 785, Ile 794, Gly 795, Asp 796, Phe 797, Gly 798, Leu 799, Ala 800, Arg 801, Asp 802 according to Tables 1, 2 or 3 or similar structure coordinates for said amino acids comprising a root mean square deviation of non-hydrogen atoms of less than about 0.75 Å when superimposed on the non-hydrogen atom positions of the corresponding atomic coordinates of Tables 1, 2 or 3.
In a different aspect, the computer system comprises a crystal structure defined by structure coordinates of one or more c-fms amino acid residues Lys 586, Leu 588, Gly 589, Val 596, Glu 598, Ala 614, Lys 616, Val 647, Thr 663, Glu 664, Tyr 665, Cys 666, Cys 667, Tyr 668, Gly 669, Asp 670, Arg 677, Arg 782, Leu 785, Asp 796, Phe 797, Gly 798, Leu 799, Ala 800, Arg 801, Asp 802 according to Tables 1, 2 or 3 or similar structure coordinates for said amino acids comprising a root mean square deviation of non-hydrogen atoms of less than about 1.5 Å when superimposed on the non-hydrogen atom positions of the corresponding atomic coordinates of Tables 1, 2 or 3. In another aspect, the computer system comprises a crystal structure defined by structure coordinates of one or more c-fms amino acid residues Lys 586, Leu 588, Gly 589, Val 596, Glu 598, Ala 614, Lys 616, Val 647, Thr 663, Glu 664, Tyr 665, Cys 666, Cys 667, Tyr 668, Gly 669, Asp 670, Arg 677, Arg 782, Leu 785, Asp 796, Phe 797, Gly 798, Leu 799, Ala 800, Arg 801, Asp 802 according to Tables 1, 2 or 3 or similar structure coordinates for said amino acids comprising a root mean square deviation of non-hydrogen atoms of less than about 0.75 Å when superimposed on the non-hydrogen atom positions of the corresponding atomic coordinates of Tables 1, 2 or 3.
In yet a different aspect, the computer system comprises a crystal structure defined by structure coordinates of c-fms amino acid residues Lys 586, Leu 588, Gly 589, Val 596, Glu 598, Ala 614, Lys 616, Val 647, Thr 663, Glu 664, Tyr 665, Cys 666, Cys 667, Tyr 668, Gly 669, Asp 670, Arg 782, Asn 783, Leu 785, Asp 796, Phe 797, Leu 799, Ala 800, Arg 801 according to Tables 1, 2 or 3 or similar structure coordinates for said amino acids comprising a root mean square deviation of non-hydrogen atoms of less than about 1.5 Å when superimposed on the non-hydrogen atom positions of the corresponding atomic coordinates of Tables 1, 2 or 3. In one embodiment, the computer system comprises a crystal structure defined by structure coordinates of c-fms amino acid residues Lys 586, Leu 588, Gly 589, Val 596, Glu 598, Ala 614, Lys 616, Val 647, Thr 663, Glu 664, Tyr 665, Cys 666, Cys 667, Tyr 668, Gly 669, Asp 670, Arg 782, Asn 783, Leu 785, Asp 796, Phe 797, Leu 799, Ala 800, Arg 801 according to Tables 1, 2 or 3 or similar structure coordinates for said amino acids comprising a root mean square deviation of non-hydrogen atoms of less than about 0.75 Å when superimposed on the non-hydrogen atom positions of the corresponding atomic coordinates of Tables 1, 2 or 3.
The invention further includes a method of evaluating the potential of an agent to associate with c-fms chimeric polypeptides comprising (a) exposing the c-fms chimera to the agent; and (b) detecting the association of the agent to one or more c-fms amino acid residues selected from the group consisting of (i) Trp 550, Lys 586, Thr 587, Leu 588, Gly 589, Val 596, Glu 598, Ala 614, Val 615, Lys 616, Glu 633, Met 637, Leu 640, Ile 646, Val 647, Val 661, Thr 663, Glu 664, Tyr 665, Cys 666, Cys 667, Tyr 668, Gly 669, Asp 670, Asn 673, Arg 677, Cys 774, Ile 775, His 776, Arg 782, Asn 783, Leu 785, Ile 794, Gly 795, Asp 796, Phe 797, Gly 798, Leu 799, Ala 800, Arg 801, Asp 802; (ii) Lys 586, Leu 588, Gly 589, Val 596, Glu 598, Ala 614, Lys 616, Val 647, Thr 663, Glu 664, Tyr 665, Cys 666, Cys 667, Tyr 668, Gly 669, Asp 670, Arg 677, Arg 782, Leu 785, Asp 796, Phe 797, Gly 798, Leu 799, Ala 800, Arg 801, Asp 802; and, (iii) Lys 586, Leu 588, Gly 589, Val 596, Glu 598, Ala 614, Lys 616, Val 647, Thr 663, Glu 664, Tyr 665, Cys 666, Cys 667, Tyr 668, Gly 669, Asp 670, Arg 782, Asn 783, Leu 785, Asp 796, Phe 797, Leu 799, Ala 800, Arg 801 thereby evaluating the potential. The agent can be a virtual compound.
In addition, also included is a method of evaluating the potential of an agent to associate with the polypeptide, comprising: (a) exposing the polypeptide to the agent; and (b) detecting the level of association of the agent to the polypeptide, thereby evaluating the potential of the agent to associate with the polypeptide; the agent can be a virtual compound. In one aspect, step (a) comprises comparing the atomic structure of the compound to the three dimensional structure of a c-fms chimeric polypeptide. In one embodiment, the comparing comprises employing a computational means to perform a fitting operation between the compound and at least one binding site of a c-fms chimera.
In one embodiment, the binding site is defined by one or more structure coordinates for amino acids Trp 550, Lys 586, Thr 587, Leu 588, Gly 589, Val 596, Glu 598, Ala 614, Val 615, Lys 616, Glu 633, Met 637, Leu 640, Ile 646, Val 647, Val 661, Thr 663, Glu 664, Tyr 665, Cys 666, Cys 667, Tyr 668, Gly 669, Asp 670, Asn 673, Arg 677, Cys 774, Ile 775, His 776, Arg 782, Asn 783, Leu 785, Ile 794, Gly 795, Asp 796, Phe 797, Gly 798, Leu 799, Ala 800, Arg 801, Asp 802 according to Tables 1, 2 or 3 or similar structure coordinates for said amino acids of a c-fms chimera comprising a root mean square deviation of non-hydrogen atoms of less than about 1.5 Å when superimposed on the non-hydrogen atom positions of the corresponding atomic coordinates of Tables 1, 2 or 3. In another aspect, the binding site is defined by one or more structure coordinates for amino acids Trp 550, Lys 586, Thr 587, Leu 588, Gly 589, Val 596, Glu 598, Ala 614, Val 615, Lys 616, Glu 633, Met 637, Leu 640, Ile 646, Val 647, Val 661, Thr 663, Glu 664, Tyr 665, Cys 666, Cys 667, Tyr 668, Gly 669, Asp 670, Asn 673, Arg 677, Cys 774, Ile 775, His 776, Arg 782, Asn 783, Leu 785, Ile 794, Gly 795, Asp 796, Phe 797, Gly 798, Leu 799, Ala 800, Arg 801, Asp 802 according to Tables 1, 2 or 3 or similar structure coordinates for said amino acids of a c-fms chimera comprising a root mean square deviation of non-hydrogen atoms of less than about 0.75 Å when superimposed on the non-hydrogen atom positions of the corresponding atomic coordinates of Tables 1, 2 or 3.
In one embodiment, the method comprises a crystal structure defined by one or more structure coordinates of c-fms amino acid residues Lys 586, Leu 588, Gly 589, Val 596, Glu 598, Ala 614, Lys 616, Val 647, Thr 663, Glu 664, Tyr 665, Cys 666, Cys 667, Tyr 668, Gly 669, Asp 670, Arg 677, Arg 782, Leu 785, Asp 796, Phe 797, Gly 798, Leu 799, Ala 800, Arg 801, Asp 802 according to Tables 1, 2 or 3 or similar structure coordinates for said amino acids comprising a root mean square deviation of non-hydrogen atoms of less than about 1.5 Å when superimposed on the non-hydrogen atom positions of the corresponding atomic coordinates of Tables 1, 2 or 3. In a different embodiment, the method comprises a crystal structure defined by one or more structure coordinates of c-fms amino acid residues Lys 586, Leu 588, Gly 589, Val 596, Glu 598, Ala 614, Lys 616, Val 647, Thr 663, Glu 664, Tyr 665, Cys 666, Cys 667, Tyr 668, Gly 669, Asp 670, Arg 677, Arg 782, Leu 785, Asp 796, Phe 797, Gly 798, Leu 799, Ala 800, Arg 801, Asp 802 according to Tables 1, 2 or 3 or similar structure coordinates for said amino acids comprising a root mean square deviation of non-hydrogen atoms of less than about 0.75 Å when superimposed on the non-hydrogen atom positions of the corresponding atomic coordinates of Tables 1, 2 or 3.
In yet a different embodiment, the method comprises a crystal structure defined by structure coordinates of c-fms amino acid residues Lys 586, Leu 588, Gly 589, Val 596, Glu 598, Ala 614, Lys 616, Val 647, Thr 663, Glu 664, Tyr 665, Cys 666, Cys 667, Tyr 668, Gly 669, Asp 670, Arg 782, Asn 783, Leu 785, Asp 796, Phe 797, Leu 799, Ala 800, Arg 801 according to Tables 1, 2 or 3 or similar structure coordinates for said amino acids comprising a root mean square deviation of non-hydrogen atoms of less than about 1.5 Å when superimposed on the non-hydrogen atom positions of the corresponding atomic coordinates of Tables 1, 2 or 3. In another embodiment, the method comprises a crystal structure defined by structure coordinates of c-fms amino acid residues Lys 586, Leu 588, Gly 589, Val 596, Glu 598, Ala 614, Lys 616, Val 647, Thr 663, Glu 664, Tyr 665, Cys 666, Cys 667, Tyr 668, Gly 669, Asp 670, Arg 782, Asn 783, Leu 785, Asp 796, Phe 797, Leu 799, Ala 800, Arg 801 according to Tables 1, 2 or 3 or similar structure coordinates for said amino acids comprising a root mean square deviation of non-hydrogen atoms of less than about 0.75 Å when superimposed on the non-hydrogen atom positions of the corresponding atomic coordinates of Tables 1, 2 or 3. In a different aspect of the method, the agent is exposed to a crystalline c-fms chimera and the detecting of step (b) comprises determining the three dimensional structure of the agent-c-fms chimera complex.
In a different aspect, the invention includes a method of identifying a potential agonist or antagonist against a c-fms chimera comprising employing the three dimensional structure of the c-fms chimera cocrystallized with a small molecule inhibitor to design or select a potential agonist or antagonist. In one embodiment, the three dimensional structure corresponds to the atomic structure characterized by the coordinates of Tables 1, 2 or 3 or similar structure coordinates for said c-fms chimera comprising a root mean square deviation of non-hydrogen atoms of less than about 1.5 Å when superimposed on the non-hydrogen atom positions of the corresponding atomic coordinates of Tables 1, 2 or 3. In another embodiment, the method further comprises the steps of: (b) synthesizing the potential agonist or antagonist; and (c) contacting the potential agonist or antagonist with a chimeric c-fms polypeptide.
The invention is also directed to a method of locating the attachment site of an inhibitor to a c-fms chimeric polypeptide, comprising (a) obtaining X-ray diffraction data for a crystal of a chimeric c-fms polypeptide; (b) obtaining X-ray diffraction data for a complex of a chimeric c-fms polypeptide and the inhibitor; (c) subtracting the X-ray diffraction data obtained in step (a) from the X-ray diffraction data obtained in step (b) to obtain the difference in the X-ray diffraction data; (d) obtaining phases that correspond to X-ray diffraction data obtained in step (a); (e) utilizing the phases obtained in step (d) and the difference in the X-ray diffraction data obtained in step (c) to compute a difference Fourier image of the inhibitor; and (f) locating the attachment site of the inhibitor based on the computations obtained in step (e).
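Steps (c) through (e) of the method above can be sketched in code. The following is a minimal illustration only, assuming that the native and complex data sets have already been scaled to each other and reduced to structure factor amplitudes indexed by Miller indices (h, k, l). The function names, the dictionary layout and the numerical values are hypothetical and are not taken from the present disclosure. Only the reflections supplied are summed, so symmetry-related and Friedel-related reflections would have to be included by the caller.

```python
import math

def difference_coefficients(native_amp, complex_amp, native_phase_rad):
    """Return (|F_complex| - |F_native|) * exp(i * phase_native) per reflection.

    All three arguments are dictionaries keyed by Miller index tuples (h, k, l).
    Reflections missing from any of the inputs are skipped.
    """
    coeffs = {}
    for hkl, f_native in native_amp.items():
        if hkl not in complex_amp or hkl not in native_phase_rad:
            continue
        delta = complex_amp[hkl] - f_native          # step (c): amplitude difference
        phi = native_phase_rad[hkl]                  # step (d): native phase
        coeffs[hkl] = complex(delta * math.cos(phi), delta * math.sin(phi))
    return coeffs

def difference_density(coeffs, frac_xyz, cell_volume):
    """Evaluate the difference Fourier synthesis at one fractional coordinate.

    Step (e): rho(x) = (1/V) * sum over hkl of coeff * exp(-2*pi*i*(hx + ky + lz)).
    Only the real part is returned.
    """
    x, y, z = frac_xyz
    total = 0j
    for (h, k, l), coeff in coeffs.items():
        angle = -2.0 * math.pi * (h * x + k * y + l * z)
        total += coeff * complex(math.cos(angle), math.sin(angle))
    return total.real / cell_volume

# Made-up amplitudes and phases for two reflections, for illustration only.
native = {(1, 0, 0): 10.0, (0, 1, 0): 5.0}
bound = {(1, 0, 0): 12.0, (0, 1, 0): 3.0}
phases = {(1, 0, 0): 0.0, (0, 1, 0): math.pi / 2}
coeffs = difference_coefficients(native, bound, phases)
```

Evaluating `difference_density` on a grid of fractional coordinates and looking for the strongest positive features corresponds to step (f), locating the attachment site.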
In a different aspect, the invention is directed to a method of obtaining a modified inhibitor comprising (a) obtaining a crystal comprising a chimeric c-fms polypeptide and an inhibitor; (b) obtaining the atomic coordinates of the crystal; (c) using the atomic coordinates and one or more molecular modeling techniques to determine how to modify the interaction of the inhibitor with the chimeric c-fms polypeptide; and (d) modifying the inhibitor based on the determinations obtained in step (c) to produce a modified inhibitor.
In one embodiment, the crystal comprises a peptide selected from the group consisting of: a peptide having SEQ ID NO: 2; a peptide having SEQ ID NO: 4 and a peptide having SEQ ID NO: 6. In another embodiment, the one or more molecular modeling techniques are selected from the group consisting of graphic molecular modeling and computational chemistry. In one embodiment, step (b) comprises detecting the interaction of the inhibitor to one or more amino acid residues selected from the group consisting of (i) Trp 550, Lys 586, Thr 587, Leu 588, Gly 589, Val 596, Glu 598, Ala 614, Val 615, Lys 616, Glu 633, Met 637, Leu 640, Ile 646, Val 647, Val 661, Thr 663, Glu 664, Tyr 665, Cys 666, Cys 667, Tyr 668, Gly 669, Asp 670, Asn 673, Arg 677, Cys 774, Ile 775, His 776, Arg 782, Asn 783, Leu 785, Ile 794, Gly 795, Asp 796, Phe 797, Gly 798, Leu 799, Ala 800, Arg 801, Asp 802; (ii) Lys 586, Leu 588, Gly 589, Val 596, Glu 598, Ala 614, Lys 616, Val 647, Thr 663, Glu 664, Tyr 665, Cys 666, Cys 667, Tyr 668, Gly 669, Asp 670, Arg 677, Arg 782, Leu 785, Asp 796, Phe 797, Gly 798, Leu 799, Ala 800, Arg 801, Asp 802; and, (iii) Lys 586, Leu 588, Gly 589, Val 596, Glu 598, Ala 614, Lys 616, Val 647, Thr 663, Glu 664, Tyr 665, Cys 666, Cys 667, Tyr 668, Gly 669, Asp 670, Arg 782, Asn 783, Leu 785, Asp 796, Phe 797, Leu 799, Ala 800, Arg 801. In one embodiment of the method, an inhibitor of a chimeric c-fms polypeptide is identified.
The invention further includes an isolated protein fragment comprising a binding pocket or active site defined by one or more structure coordinates of chimeric c-fms amino acid residues selected from the group consisting of (i) Trp 550, Lys 586, Thr 587, Leu 588, Gly 589, Val 596, Glu 598, Ala 614, Val 615, Lys 616, Glu 633, Met 637, Leu 640, Ile 646, Val 647, Val 661, Thr 663, Glu 664, Tyr 665, Cys 666, Cys 667, Tyr 668, Gly 669, Asp 670, Asn 673, Arg 677, Cys 774, Ile 775, His 776, Arg 782, Asn 783, Leu 785, Ile 794, Gly 795, Asp 796, Phe 797, Gly 798, Leu 799, Ala 800, Arg 801, Asp 802; (ii) Lys 586, Leu 588, Gly 589, Val 596, Glu 598, Ala 614, Lys 616, Val 647, Thr 663, Glu 664, Tyr 665, Cys 666, Cys 667, Tyr 668, Gly 669, Asp 670, Arg 677, Arg 782, Leu 785, Asp 796, Phe 797, Gly 798, Leu 799, Ala 800, Arg 801, Asp 802; and, (iii) Lys 586, Leu 588, Gly 589, Val 596, Glu 598, Ala 614, Lys 616, Val 647, Thr 663, Glu 664, Tyr 665, Cys 666, Cys 667, Tyr 668, Gly 669, Asp 670, Arg 782, Asn 783, Leu 785, Asp 796, Phe 797, Leu 799, Ala 800, Arg 801. The invention also includes a fragment linked to a solid support, an isolated nucleic acid molecule encoding the fragment, a vector comprising the nucleic acid molecule, a host cell comprising the vector, and, a method of producing a protein fragment comprising culturing the host cell under conditions in which the fragment is expressed.
The invention also includes a method of screening for an agent that associates with a chimeric c-fms polypeptide, comprising (a) exposing a protein molecule fragment to the agent; and (b) detecting the level of association of the agent to the fragment. Also included in the invention is a kit comprising the protein molecule fragment.
In another aspect, the invention is directed to a method for the production of a crystal complex comprising a chimeric c-fms polypeptide-ligand comprising (a) contacting the chimeric c-fms polypeptide with the ligand in a suitable solution; and (b) crystallizing the resulting complex of chimeric c-fms polypeptide-ligand from the solution. In one embodiment, the invention includes a method for the production of a crystal comprising crystallizing a peptide comprising a sequence selected from the group consisting of SEQ ID NO: 2, 4 or 6 with a potential inhibitor. In another embodiment, the method further comprises contacting the crystalline chimeric c-fms polypeptide-ligand complex with another ligand in a suitable solution to replace the bound ligand.
The invention also includes methods of identifying a potential inhibitor of a chimeric c-fms polypeptide comprising (a) using a three dimensional structure of a chimeric c-fms polypeptide as defined by atomic coordinates according to Tables 1, 2 or 3 or similar structure coordinates for the amino acids of a c-fms chimera comprising a root mean square deviation of non-hydrogen atoms of less than about 1.5 Å when superimposed on the non-hydrogen atom positions of the corresponding atomic coordinates of Tables 1, 2 or 3; (b) replacing one or more chimeric c-fms polypeptide amino acids selected from the group consisting of (i) Trp 550, Lys 586, Thr 587, Leu 588, Gly 589, Val 596, Glu 598, Ala 614, Val 615, Lys 616, Glu 633, Met 637, Leu 640, Ile 646, Val 647, Val 661, Thr 663, Glu 664, Tyr 665, Cys 666, Cys 667, Tyr 668, Gly 669, Asp 670, Asn 673, Arg 677, Cys 774, Ile 775, His 776, Arg 782, Asn 783, Leu 785, Ile 794, Gly 795, Asp 796, Phe 797, Gly 798, Leu 799, Ala 800, Arg 801, Asp 802; (ii) Lys 586, Leu 588, Gly 589, Val 596, Glu 598, Ala 614, Lys 616, Val 647, Thr 663, Glu 664, Tyr 665, Cys 666, Cys 667, Tyr 668, Gly 669, Asp 670, Arg 677, Arg 782, Leu 785, Asp 796, Phe 797, Gly 798, Leu 799, Ala 800, Arg 801, Asp 802; and, (iii) Lys 586, Leu 588, Gly 589, Val 596, Glu 598, Ala 614, Lys 616, Val 647, Thr 663, Glu 664, Tyr 665, Cys 666, Cys 667, Tyr 668, Gly 669, Asp 670, Arg 782, Asn 783, Leu 785, Asp 796, Phe 797, Leu 799, Ala 800, Arg 801 in the three-dimensional structure with a different amino acid to produce a modified three-dimensional structure; and, (c) using the modified three-dimensional structure to design or select the potential inhibitor. In one aspect, the method further comprises (d) synthesizing said potential inhibitor.
In a different aspect, the method further comprises (e) contacting said potential inhibitor with said modified chimeric c-fms polypeptide in the presence of a ligand to test the ability of said potential inhibitor to inhibit a chimeric c-fms polypeptide or said modified chimeric c-fms polypeptide, thereby identifying the inhibitor. In one embodiment, the replacing of one or more amino acid residues further comprises replacing SEQ ID NO: 2 amino acid residues selected from the group consisting of Lys 586, Leu 588, Gly 589, Val 596, Glu 598, Ala 614, Lys 616, Val 647, Thr 663, Glu 664, Tyr 665, Cys 666, Cys 667, Tyr 668, Gly 669, Asp 670, Arg 782, Asn 783, Leu 785, Asp 796, Phe 797, Leu 799, Ala 800, Arg 801. In yet a different aspect, the potential inhibitor is selected from a database. In another aspect, the potential inhibitor is designed de novo. In one embodiment, the potential inhibitor is designed from a known inhibitor. In yet a different embodiment, the step of employing said modified three-dimensional structure to design or select said potential inhibitor comprises the steps of: (a) identifying chemical entities or fragments capable of associating with a modified chimeric c-fms polypeptide; and (b) assembling the identified chemical entities or fragments into a single molecule to provide the structure of said potential inhibitor. In one aspect, the potential inhibitor is a competitive inhibitor. In a different aspect, the potential inhibitor is a non-competitive or uncompetitive inhibitor. In one embodiment, the potential inhibitor is an irreversible inhibitor.
The present invention includes methods of producing and using three-dimensional structure information derived from c-fms and c-fms chimeric polypeptides and inhibitory compounds which form a complex with c-fms and c-fms chimeric polypeptides and prevent c-fms and c-fms chimeras from interacting with their naturally occurring ligand or ligands. The present invention also includes specific crystallization to obtain crystals of the c-fms-ligand (inhibitor) complex. The crystals are subsequently used to obtain a 3-dimensional structure of the complex using X-ray crystallography (or NMR) and the obtained data is used for rational drug discovery design with the aim to improve the complex formation between c-fms and its chimeras and the inhibitor, and, also to improve the inhibition of the binding of c-fms ligands. In this invention the KID was replaced by the shorter KID derived from the FGFR1 receptor [24]. Additional constructs also include c-fms chimeras derived from replacing the native KID with the KID of tie2 or IRK.
Definitions
As is generally the case in biotechnology and chemistry, the description of the present invention has required the use of a number of terms of art. Although it is not practical to do so exhaustively, definitions for some of these terms are provided here for ease of reference. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. Definitions for other terms also appear elsewhere herein. However, the definitions provided here and elsewhere herein should always be considered in determining the intended scope and meaning of the defined terms. Although any methods and materials similar or equivalent to those described herein can be used in the practice of the present invention, the preferred methods and materials are described.
As used herein, the term “atomic coordinates” or “structure coordinates” refers to mathematical coordinates that describe the positions of atoms in crystals of c-fms chimeras in Protein Data Bank (PDB) format, including X, Y, Z and B, for each atom. The diffraction data obtained from the crystals are used to calculate an electron density map of the repeating unit of the crystal. The electron density maps may be used to establish the positions (i.e., coordinates X, Y and Z) of the individual atoms within the crystal. Those of skill in the art understand that a set of structure coordinates determined by X-ray crystallography is not without standard error. For the purpose of this invention, any set of structure coordinates for c-fms chimeras from any source having a root mean square deviation of non-hydrogen atoms of less than about 1.5 Å when superimposed on the non-hydrogen atom positions of the corresponding atomic coordinates of Tables 1, 2 or 3 are considered substantially identical or homologous. In a preferred embodiment, any set of structure coordinates for c-fms chimeras from any source having a root mean square deviation of non-hydrogen atoms of less than about 0.75 Å when superimposed on the non-hydrogen atom positions of the corresponding atomic coordinates of Tables 1, 2 or 3 are considered substantially identical or homologous.
As used herein, the term “unit cell” means the fundamental portion of a crystal structure that is repeated infinitely by translation in three dimensions. A unit cell is characterized by three vectors a, b, and c, not located in one plane, which form the edges of a parallelepiped. Angles alpha, beta and gamma define the angles between the vectors: angle alpha is the angle between vectors b and c; angle beta is the angle between vectors a and c; and angle gamma is the angle between vectors a and b. The entire volume of a crystal can be constructed by regular assembly of unit cells. Each unit cell comprises a complete representation of the unit of pattern, the repetition of which builds up the crystal. See, for example, U.S. Appl. No. 2004/0002145.
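As an illustration of how the six unit cell parameters defined above determine the size of the repeating unit, the following minimal sketch computes the volume of a general (triclinic) cell from the edge lengths a, b and c and the angles alpha, beta and gamma. The function name and the example cell dimensions are hypothetical and do not describe any crystal of the present disclosure.

```python
import math

def unit_cell_volume(a, b, c, alpha_deg, beta_deg, gamma_deg):
    """Volume of a parallelepiped unit cell.

    V = a*b*c*sqrt(1 - cos^2(alpha) - cos^2(beta) - cos^2(gamma)
                   + 2*cos(alpha)*cos(beta)*cos(gamma))
    Edge lengths are in angstroms and angles in degrees.
    """
    ca = math.cos(math.radians(alpha_deg))
    cb = math.cos(math.radians(beta_deg))
    cg = math.cos(math.radians(gamma_deg))
    factor = 1.0 - ca * ca - cb * cb - cg * cg + 2.0 * ca * cb * cg
    if factor <= 0.0:
        raise ValueError("angles do not describe a valid unit cell")
    return a * b * c * math.sqrt(factor)

# Hypothetical orthorhombic cell: all angles are 90 degrees, so V = a*b*c.
volume = unit_cell_volume(10.0, 20.0, 30.0, 90.0, 90.0, 90.0)
```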
As used herein, the term “asymmetric unit” (ASU) means part of a symmetric object, which by itself does not possess any symmetry and from which the whole unit cell is built up by the application of symmetry operations of its point group. See, for example, U.S. Appl. No. 2004/0002145.
As used herein, the term “space group” means a group or array of operations consistent with an infinitely extended regularly repeating pattern. It is the symmetry of a three-dimensional structure, or the arrangement of symmetry elements of a crystal. There are 230 space group symmetries possible; however, there are only 65 space group symmetries available for biological structures. See, for example, U.S. Appl. No. 2004/0002145.
The term “atom type” refers to the chemical element whose coordinates are measured. For instance, the first letter in a column in Table 1 identifies the element.
The terms “X,” “Y” and “Z” refer to the crystallographically-defined atomic position of the element measured with respect to the chosen crystallographic origin. The term “B” refers to a thermal factor that measures the mean variation of an atom's position with respect to its average position.
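For illustration, the X, Y, Z and B values of a coordinate record in PDB format can be read from fixed column positions. The sketch below assumes the standard fixed-width ATOM/HETATM record layout of the PDB file format; whether Tables 1, 2 or 3 follow that layout exactly is an assumption. The helper `format_atom_line` and the coordinate values passed to it are hypothetical and serve only to build a demonstration record.

```python
def format_atom_line(serial, name, res_name, chain, res_seq, x, y, z, occ, b, element):
    """Build one fixed-width ATOM record (hypothetical helper for the demo)."""
    return (f"ATOM  {serial:>5d} {name:<4s} {res_name:>3s} {chain:1s}{res_seq:>4d}    "
            f"{x:8.3f}{y:8.3f}{z:8.3f}{occ:6.2f}{b:6.2f}          {element:>2s}")

def parse_atom_record(line):
    """Extract residue identity, X, Y, Z and B from an ATOM or HETATM record."""
    if not line.startswith(("ATOM  ", "HETATM")):
        raise ValueError("not an ATOM/HETATM record")
    return {
        "atom_name": line[12:16].strip(),   # columns 13-16
        "res_name": line[17:20].strip(),    # columns 18-20
        "chain": line[21],                  # column 22
        "res_seq": int(line[22:26]),        # columns 23-26
        "x": float(line[30:38]),            # columns 31-38
        "y": float(line[38:46]),            # columns 39-46
        "z": float(line[46:54]),            # columns 47-54
        "b": float(line[60:66]),            # columns 61-66 (B, thermal factor)
        "element": line[76:78].strip(),     # columns 77-78
    }

# Made-up values for a backbone nitrogen of Lys 586, for illustration only.
line = format_atom_line(1, " N  ", "LYS", "A", 586, 11.104, 6.134, -6.504, 1.0, 25.0, "N")
record = parse_atom_record(line)
```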
As used herein, the term “crystal” refers to any three-dimensional ordered array of molecules that diffracts X-rays.
As used herein, the term “carrier” in a composition refers to a diluent, adjuvant, excipient, or vehicle with which the product is mixed.
As used herein, the term “composition” refers to the combining of distinct elements or ingredients to form a whole. A composition comprises more than one element or ingredient. For the purposes of this invention, a composition will often, but not always, comprise a carrier.
As used herein, the term “SAR,” an abbreviation for Structure-Activity Relationships, collectively refers to the structure-activity/structure property relationships pertaining to the relationship(s) between a compound's activity/properties and its chemical structure.
As used herein, the term “molecular structure” refers to the three dimensional arrangement of molecules of a particular compound or complex of molecules (e.g., the three dimensional structure of a c-fms chimera and ligands that interact with the c-fms chimera).
As used herein, the term “molecular modeling” refers to the use of computational methods, preferably computer assisted methods, to draw realistic models of what molecules look like and to make predictions about structure activity relationships of ligands. The methods used in molecular modeling range from molecular graphics to computational chemistry.
As used herein, the term “molecular model” refers to the three dimensional arrangement of the atoms of a molecule connected by covalent bonds or the three dimensional arrangement of the atoms of a complex comprising more than one molecule, e.g., a protein-ligand complex.
As used herein, the term “molecular graphics” refers to 3D representations of the molecules, for instance, a 3D representation produced using computer assisted computational methods.
As used herein, the term “computational chemistry” refers to calculations of the physical and chemical properties of the molecules.
As used herein, the term “molecular replacement” refers to a method that involves generating a preliminary model of a crystal whose coordinates are unknown, by orienting and positioning the atomic coordinates described in the present invention so as best to account for the observed diffraction pattern of the unknown crystal. Phases can then be calculated from this model and combined with the observed amplitudes to give an approximate Fourier synthesis of the structure whose coordinates are unknown. (Rossmann, M. G., ed., “The Molecular Replacement Method,” Gordon & Breach, New York, 1972).
As used herein, the term “homolog” refers to the protein molecule or the nucleic acid molecule which encodes the protein, or a functional domain from said protein from a first source having at least about 30%, 40% or 50% sequence identity, or at least about 60%, 70% or 75% sequence identity, or at least about 80% sequence identity, or more preferably at least about 85% sequence identity, or even more preferably at least about 90% sequence identity, and most preferably at least about 95%, 97% or 99% amino acid or nucleotide sequence identity, with the protein, encoding nucleic acid molecule or any functional domain thereof, from a second source. The second source may be a version of the molecule from the first source that has been genetically altered by any available means to change the primary amino acid or nucleotide sequence or may be from the same or a different species than that of the first source.
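The sequence identity thresholds in the preceding definition can be illustrated with a short calculation. The sketch below assumes that the two sequences have already been aligned to equal length and that identity is counted over aligned positions in which neither sequence has a gap; other conventions for the denominator exist, and the present text does not specify one. The function name and the example sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over aligned, non-gap positions of two aligned sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # positions with a gap in either sequence are not counted
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared

# Five of six aligned residues match in this made-up pair.
identity = percent_identity("ACDEFG", "ACDQFG")
```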
As used herein, the term “active site” refers to regions on a protein or a structural motif of a protein that are directly involved in the function or activity of the c-fms chimera or c-fms protein.
As used herein, the terms “binding site” or “binding pocket” refer to a region of a protein or a molecular complex comprising the protein or polypeptide that, as a result of the primary amino acid sequence of the protein and/or its three-dimensional shape, favorably associates with another chemical entity or compound including ligands or inhibitors.
For the purpose of this invention, any active site, binding site or binding pocket defined by a set of structure coordinates for a protein or for a homolog of a protein from any source having a root mean square deviation of non-hydrogen atoms of less than about 1.5 Å when superimposed on the non-hydrogen atom positions of the corresponding atomic coordinates of Tables 1, 2 or 3 are considered substantially identical or homologous. In a more preferred embodiment, any set of structure coordinates for a protein or a homolog of a protein from any source having a root mean square deviation of non-hydrogen atoms of less than about 0.75 Å when superimposed on the non-hydrogen atom positions of the corresponding atomic coordinates of Tables 1, 2 or 3 are considered substantially identical or homologous.
The term “root mean square deviation” means the square root of the arithmetic mean of the squares of the deviations from the mean.
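As applied throughout this description to two sets of structure coordinates, the root mean square deviation is taken over the distances between corresponding non-hydrogen atoms after superposition. The following minimal sketch assumes that the two coordinate sets are already superimposed and list the same atoms in the same order; the superposition step itself is not shown. The function names and the example coordinates are hypothetical.

```python
import math

def rmsd(coords_a, coords_b):
    """Root mean square deviation between two equally ordered lists of (x, y, z)."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))

def within_cutoff(coords_a, coords_b, cutoff_angstrom=1.5):
    """True if the two superimposed coordinate sets deviate by less than the cutoff."""
    return rmsd(coords_a, coords_b) < cutoff_angstrom

# Two made-up atoms, each displaced by 1 angstrom along z.
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
candidate = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
deviation = rmsd(reference, candidate)
```

With these example values the candidate set would fall under the 1.5 Å threshold but not under the 0.75 Å threshold.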
As used herein, the term “amino acids” refers to the L-isomers of the naturally occurring amino acids. The naturally occurring amino acids are glycine, alanine, valine, leucine, isoleucine, serine, methionine, threonine, phenylalanine, tyrosine, tryptophan, cysteine, proline, histidine, aspartic acid, asparagine, glutamic acid, glutamine, γ-carboxyglutamic acid, arginine, ornithine, and lysine. Unless specifically indicated, all amino acids referred to in this application are in the L-form.
As used herein, the term “nonnatural amino acids” refers to amino acids that are not naturally found in proteins. An example is selenomethionine.
As used herein, the term “positively charged amino acid” includes any amino acids having a positively charged side chain under normal physiological conditions. Examples of positively charged naturally occurring amino acids are arginine, lysine, and histidine.
As used herein, the term “negatively charged amino acid” includes any amino acids having a negatively charged side chain under normal physiological conditions. Examples of negatively charged naturally occurring amino acids are aspartic acid and glutamic acid.
As used herein, the term “hydrophobic amino acid” includes any amino acids having an uncharged, nonpolar side chain that is relatively insoluble in water. Examples of naturally occurring hydrophobic amino acids are alanine, leucine, isoleucine, valine, proline, phenylalanine, tryptophan, and methionine.
As used herein, the term “hydrophilic amino acid” refers to any amino acids having an uncharged, polar side chain that is relatively soluble in water. Examples of naturally occurring hydrophilic amino acids are serine, threonine, tyrosine, asparagine, glutamine and cysteine.
As used herein, the term “hydrogen bond” refers to two hydrophilic atoms (either O or N), which share a hydrogen that is covalently bonded to only one atom, while interacting with the other.
As used herein, the term “hydrophobic interaction” refers to interactions made by two hydrophobic residues or atoms (such as C).
As used herein, the term “conjugated system” refers to a system in which more than two double bonds are adjacent to each other and in which electrons are completely delocalized over the entire system. This also includes aromatic residues.
As used herein, the term “aromatic residue” refers to amino acids with side chains having a delocalized conjugated system. Examples of aromatic residues are phenylalanine, tryptophan, and tyrosine.
As used herein, the terms “c-fms chimera,” “c-fms chimeric polypeptide” and “c-fms chimeric protein” are used interchangeably unless a different meaning is specifically indicated otherwise. When a different meaning is intended, such will be clear from the text. The term “c-fms,” “c-fms protein” and “c-fms polypeptide” may refer to either the chimeric or non-chimeric c-fms. When a different meaning is intended, such will be clear from the text.
As used herein, the term “inhibitor” or “potential inhibitor” means a substance that is believed to interact with another moiety, for example a given ligand that is believed to interact to at least partially inhibit the activity of a complete c-fms or a chimeric c-fms polypeptide, or fragment of either, and which can be subsequently evaluated for such an interaction and inhibitory effect. Representative candidate compounds or substrates include drugs and other therapeutic agents, carcinogens and environmental pollutants, natural products and extracts, as well as steroids, fatty acids and prostaglandins. Other examples of potential inhibitors that can be investigated using the methods of the present invention include, but are not restricted to, agonists and antagonists of c-fms, a chimeric c-fms polypeptide, toxins and venoms, viral epitopes, hormones, hormone receptors, peptides, enzymes, enzyme substrates, co-factors, lectins, sugars, oligonucleotides or nucleic acids, oligosaccharides, proteins, small molecules and monoclonal antibodies. See, for example, U.S. Patent Appl. No. 20040002145.
As used herein, the phrase “inhibiting the binding” refers to preventing or reducing the direct or indirect association of one or more molecules, peptides, proteins, enzymes, or receptors, or preventing or reducing the normal activity of one or more molecules, peptides, proteins, enzymes or receptors, e.g., preventing or reducing the direct or indirect association with c-fms chimeric polypeptides.
As used herein, the term “competitive inhibitor” refers to inhibitors that bind to c-fms chimeras at the same sites as its binding partner(s), thus directly competing with them. Competitive inhibition may, in some instances, be reversed completely by increasing the substrate concentration.
As used herein, the term “uncompetitive inhibitor” refers to one that inhibits the functional activity of a c-fms chimera by binding to a different site than does its substrate(s).
As used herein, the term “non-competitive inhibitor” refers to one that can bind to either the free or bound form of a c-fms chimera.
As used herein, the term “irreversible” or “covalent” inhibitor refers to one that inhibits a c-fms chimera by forming a covalent bond with the chimera and either inhibiting the enzyme by excluding its substrate or causing a permanent reorientation of catalytic residues, thus rendering the enzyme inactive. Those of skill in the art may identify inhibitors as competitive, uncompetitive, or non-competitive by computer fitting enzyme kinetic data using standard methods. See, for example, Segel, I. H., Enzyme Kinetics, J. Wiley & Sons, (1975). Examples of irreversible inhibition are found, for example, in U.S. Pat. Nos. 6,153,617; 6,127,374; 5,4981,616; 5,298,508 and 5,082,964.
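The kinetic distinction between these inhibitor classes can be illustrated with a minimal sketch. It uses the standard textbook Michaelis-Menten rate equations rather than any method specific to this disclosure; the function name `rate` and all numerical values are hypothetical and chosen only for illustration.

```python
def rate(s, vmax, km, i=0.0, ki=1.0, mode="none"):
    """Initial velocity under textbook Michaelis-Menten inhibition models.

    s: substrate concentration, i: inhibitor concentration,
    ki: inhibition constant (all in the same concentration units).
    """
    factor = 1.0 + i / ki
    if mode == "none":
        return vmax * s / (km + s)
    if mode == "competitive":
        # Apparent Km is raised; Vmax is unchanged.
        return vmax * s / (km * factor + s)
    if mode == "uncompetitive":
        # Inhibitor binds only the enzyme-substrate complex.
        return vmax * s / (km + s * factor)
    if mode == "noncompetitive":
        # Pure non-competitive case: apparent Vmax is lowered, Km unchanged.
        return vmax * s / ((km + s) * factor)
    raise ValueError("unknown inhibition mode: " + mode)


if __name__ == "__main__":
    # Hypothetical parameters: Vmax = 100, Km = 10, [I] = 50, Ki = 5.
    for mode in ("none", "competitive", "uncompetitive", "noncompetitive"):
        low = rate(10.0, 100.0, 10.0, i=50.0, ki=5.0, mode=mode)
        high = rate(1e6, 100.0, 10.0, i=50.0, ki=5.0, mode=mode)
        print(mode, round(low, 2), round(high, 2))
```

In this sketch only the competitive case recovers nearly the full uninhibited velocity at very high substrate concentration, which is the reversal behavior described above for competitive inhibition.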
As used herein, the term “R or S-isomer” refers to the two possible stereoisomers of a chiral carbon according to the Cahn-Ingold-Prelog system adopted by the International Union of Pure and Applied Chemistry (IUPAC). Each group attached to the chiral carbon is first assigned a preference or priority a, b, c or d on the basis of the atomic number of the atom that is directly attached to the chiral carbon. The group with the highest atomic number is given the highest preference a, the group with the next highest atomic number is given the next highest preference b, and so on. The group with the lowest preference (d) is then directed away from the viewer. If the trace of a path from a to b to c is counterclockwise, the isomer is designated (S); in the opposite direction, clockwise, the isomer is designated (R).
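The viewing rule above can be expressed geometrically. The following is a minimal sketch, assuming the priorities a to d have already been assigned and that three-dimensional positions of the four substituent atoms are available; the function name `chirality_label` and the coordinates are hypothetical.

```python
def _sub(p, q):
    return (p[0] - q[0], p[1] - q[1], p[2] - q[2])


def _cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def _dot(u, v):
    return u[0] * v[0] + u[1] * v[1] + u[2] * v[2]


def chirality_label(a, b, c, d):
    """Return "R" or "S" from substituent positions in priority order a > b > c > d.

    The sign of the triple product (a-d) . ((b-d) x (c-d)) tells whether the
    path a -> b -> c runs counterclockwise (positive, S) or clockwise
    (negative, R) when the lowest-priority group d points away from the viewer.
    """
    triple = _dot(_sub(a, d), _cross(_sub(b, d), _sub(c, d)))
    if abs(triple) < 1e-9:
        raise ValueError("substituents are coplanar; chirality is undefined")
    return "S" if triple > 0 else "R"


if __name__ == "__main__":
    # Hypothetical geometry: d points along -z, away from a viewer on +z.
    a, b, c = (1.0, 0.0, 0.0), (-0.5, 0.866, 0.0), (-0.5, -0.866, 0.0)
    d = (0.0, 0.0, -1.0)
    print(chirality_label(a, b, c, d))  # a -> b -> c is counterclockwise
    print(chirality_label(a, c, b, d))  # swapping b and c inverts the center
```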
As used herein, the term “ligand” refers to any molecule or chemical entity which binds with or to a c-fms chimera, a subunit of a c-fms chimera, a domain of a c-fms chimera, a target structural motif of a c-fms chimera or a fragment of a c-fms chimera. Thus, ligands include, but are not limited to, small molecule inhibitors, for example.
The term “soaking in a ligand” or “soaking a ligand” or “soaking” in the context of protein crystallography/structure based drug design refers to a process by which a ligand is brought in contact with and preferably bound to a protein present in crystalline form through diffusion of the ligand through the crystalline matrix. In a typical application, a crystal of the protein of interest is placed for a certain period of time (hours or days) in a stabilization solution in which a molar excess of the ligand of interest has been at least partially solubilized. Typically the protein is present in unliganded form to facilitate ligand binding, but it could also be present in complex with a weaker or equally strong ligand as the one that one seeks to replace. If the binding affinity of the ligand is high enough and the ligand-binding site is unobstructed, the ligand will bind to the crystalline protein, thus enabling the 3-dimensional structure of the protein-ligand complex to be determined by X-ray crystallography.
As used herein, the term “small molecule inhibitor” refers to compounds useful in the present invention having measurable or inhibiting activity. In addition to small organic molecules, peptides, antibodies, cyclic peptides and peptidomimetics are contemplated as being useful in the disclosed methods. Preferred inhibitors are small molecules, preferably less than 700 Daltons, and more preferably less than 450 Daltons.
As used herein, the terms “bind,” “binding,” “bond,” or “bonded” when used in reference to the association of atoms, molecules, or chemical groups, refer to any physical contact or association of two or more atoms, molecules, or chemical groups.
As used herein, the terms “covalent bond” or “valence bond” refer to a chemical bond between two atoms in a molecule created by the sharing of electrons, usually in pairs, by the bonded atoms.
As used herein, “noncovalent bond” refers to an interaction between atoms and/or molecules that does not involve the formation of a covalent bond between them.
As used herein, the term “native protein” refers to a protein comprising an amino acid sequence identical to that of a protein isolated from its natural source or organism.
Included in this invention is a substitution of the native kinase insert domain of c-fms, which, due to its structural bulk and potential disorder, prevents crystallization of the native protein. The invention also includes a method for replacing the native kinase insert domain with shorter kinase insert domains from the FGF receptor kinase, the tie-2 kinase and the insulin receptor kinase, and obtaining crystals of a c-fms chimeric protein.
A. Modeling the Three-Dimensional Structure of the c-fms Chimeric Protein
The atomic coordinate data provided in Tables 1, 2 or 3 or the coordinate data derived from homologous proteins may be used to build a three-dimensional model of a c-fms chimeric protein. Any available computational methods may be used to build the three-dimensional model. As a starting point, the X-ray diffraction pattern obtained from the assemblage of the molecules or atoms in a crystalline version of a c-fms chimera or a c-fms chimeric homolog can be used to build an electron density map using tools well known to those skilled in the art of crystallography and X-ray diffraction techniques. Additional phase information extracted either from the diffraction data and available in the published literature and/or from supplementing experiments may then be used to complete the reconstruction.
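The relationship between diffraction data and an electron density map can be illustrated with a one-dimensional toy Fourier synthesis. This is a minimal sketch, not the procedure used to obtain the coordinates of this disclosure: it assumes that both amplitudes and phases are known, and the atom positions, scattering weights and function names are hypothetical.

```python
import cmath
import math


def structure_factors(atoms, h_max):
    """Toy 1D structure factors F(h) for atoms given as (fractional x, weight)."""
    return {
        h: sum(f * cmath.exp(2j * math.pi * h * x) for x, f in atoms)
        for h in range(-h_max, h_max + 1)
    }


def density(factors, x):
    """Fourier synthesis: sum of F(h) * exp(-2*pi*i*h*x) at fractional position x."""
    total = sum(F * cmath.exp(-2j * math.pi * h * x) for h, F in factors.items())
    return total.real


if __name__ == "__main__":
    # Two hypothetical atoms in a 1D unit cell; the heavier one sits at x = 0.65.
    atoms = [(0.20, 6.0), (0.65, 8.0)]
    factors = structure_factors(atoms, h_max=10)
    grid = [i / 200 for i in range(200)]
    rho = [density(factors, x) for x in grid]
    peak = grid[max(range(len(grid)), key=rho.__getitem__)]
    print("highest density near x =", peak)
```

With complete amplitudes and phases the density peaks at the atom positions; in a real experiment the phases are not measured directly, which is why the additional phase information mentioned above is needed.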
For basic concepts and procedures of collecting, analyzing, and utilizing X-ray diffraction data for the construction of electron densities see, for example, Campbell et al., 1984, Biological Spectroscopy, The Benjamin/Cummings Publishing Co., Inc., Menlo Park, Calif.; Cantor et al., 1980, Biophysical Chemistry, Part II: Techniques for the study of biological structure and function, W.H. Freeman and Co., San Francisco, Calif.; A. T. Brunger, 1993, X-PLOR Version 3.1: A system for X-ray crystallography and NMR, Yale Univ. Pr., New Haven, Conn.; M. M. Woolfson, 1997, An Introduction to X-ray Crystallography, Cambridge Univ. Pr., Cambridge, UK; J. Drenth, 1999, Principles of Protein X-ray Crystallography (Springer Advanced Texts in Chemistry), Springer Verlag, Berlin; Tsirelson et al., 1996, Electron Density and Bonding in Crystals: Principles, Theory and X-ray Diffraction Experiments in Solid State Physics and Chemistry, Inst. of Physics Pub.; U.S. Pat. No. 5,942,428; U.S. Pat. No. 6,037,117; U.S. Pat. No. 5,200,910 and U.S. Pat. No. 5,365,456 (“Method for Modeling the Electron Density of a Crystal”), each of which is herein specifically incorporated by reference in its entirety.
For basic information on molecular modeling, see, for example, M. Schlecht, Molecular Modeling on the PC, 1998, John Wiley & Sons; Gans et al., Fundamental Principles of Molecular Modeling, 1996, Plenum Pub. Corp.; N. C. Cohen (editor), Guidebook on Molecular Modeling in Drug Design, 1996, Academic Press; and W. B. Smith, Introduction to Theoretical Organic Chemistry and Molecular Modeling, 1996. U.S. patents which provide detailed information on molecular modeling include U.S. Pat. Nos. 6,093,573; 6,080,576; 6,075,014; 6,075,123; 6,071,700; 5,994,503; 5,612,894; 5,583,973; 5,030,103; 4,906,122; and 4,812,12, each of which is incorporated by reference herein in its entirety.
B. Methods of Using the Atomic Coordinates to Identify and Design Ligands of Interest
The atomic coordinates of the invention, such as those described in Tables 1, 2 or 3 or coordinates substantially identical to or homologous to those of Tables 1, 2 or 3 may be used with any available methods to prepare three dimensional models of c-fms chimeras as well as to identify and design ligands, inhibitors or antagonists or agonist molecules.
For instance, three-dimensional modeling may be performed using the experimentally determined coordinates derived from X-ray diffraction patterns, such as those in Tables 1, 2 or 3, for example, wherein such modeling includes, but is not limited to, drawing pictures of the actual structures, building physical models of the actual structures, and determining the structures of related subunits and /ligand and subunit/ligand complexes using the coordinates. Such molecular modeling can utilize known X-ray diffraction molecular modeling algorithms or molecular modeling software to generate atomic coordinates corresponding to the three-dimensional structure of c-fms chimeras.
As described above, molecular modeling involves the use of computational methods, preferably computer assisted methods, to build realistic models of molecules that are identifiably related in sequence to the known crystal structure. It also involves modeling new small molecule inhibitors bound to c-fms chimeras starting with the structures of c-fms chimeras alone or complexed with known ligands or inhibitors. The methods utilized in ligand modeling range from molecular graphics (i.e., 3D representations) to computational chemistry (i.e., calculations of the physical and chemical properties) to make predictions about the binding of ligands or activities of ligands; to design new ligands; and to predict novel molecules, including ligands such as drugs, for chemical synthesis, collectively referred to as rational drug design.
One approach to rational drug design is to search for known molecular structures that might bind to an active site. Using molecular modeling, rational drug design programs can look at a range of different molecular structures of drugs that may fit into the active site of an enzyme or protein, and by moving them in a three-dimensional environment it can be decided which structures actually fit the site well. See, also, for example, data in Tables 1, 2 or 3. An alternate but related rational drug design approach starts with the known structure of a complex with a small molecule ligand and models modifications of that small molecule in an effort to make additional favorable interactions with c-fms chimeras and c-fms proteins.
The present invention includes the use of molecular and computer modeling techniques to design and select ligands, such as small molecule agonists or antagonists or other therapeutic agents that interact with c-fms chimeras or proteins. Such agents include, but are not limited to, arylamides and quinolones and derivatives thereof. For example, the invention as herein described includes the design of ligands that act as partial or complete inhibitors of at least one function by binding to all, or a portion of, the active sites or other regions of c-fms chimeras or proteins.
This invention also includes the design of compounds that act as uncompetitive inhibitors of at least one function of c-fms chimeras or proteins. These inhibitors may bind to all, or a portion of, the active sites or other regions of the chimeras or proteins already bound to a ligand and may be more potent and less non-specific than competitive inhibitors that compete for active sites. Similarly, non-competitive inhibitors that bind to and inhibit at least one function of c-fms chimeras or proteins whether or not it is bound to another chemical entity, such as a natural ligand, for example, may be designed using the atomic coordinates of the chimeras or complexes comprising the chimeras of this invention.
The atomic coordinates of the present invention also provide the needed information to probe a crystal of a c-fms chimera with molecules composed of a variety of different chemical features to determine optimal sites for interaction between candidate inhibitors and/or activators and c-fms chimeras. For example, high resolution X-ray diffraction data collected from crystals saturated with solvent allows the determination of where each type of solvent molecule sticks. Small molecules that bind to those sites can then be designed and synthesized and tested for their inhibitory activity (Travis, J., Science 262:1374 (1993)).
The present invention also includes methods for computationally screening small molecule databases and libraries for chemical entities, agents, ligands, or compounds that can bind in whole, or in part, to c-fms chimeras. In this screening, the quality of fit of such entities or compounds to the binding site or sites may be judged either by shape complementarity or by estimated interaction energy (Meng, E. C. et al., J. Comp. Chem. 13:505-524 (1992)).
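Ranking candidates by estimated interaction energy can be sketched as follows. This is a minimal illustration that assumes a single generic 12-6 Lennard-Jones term for every atom pair; the parameter values, coordinates and names (`lennard_jones`, `interaction_energy`, `rank_candidates`) are hypothetical and do not reproduce the scoring function of any particular screening program.

```python
import math


def lennard_jones(r, epsilon=0.2, sigma=3.5):
    """12-6 Lennard-Jones energy (kcal/mol) for one atom pair at distance r (angstroms)."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


def interaction_energy(ligand, pocket):
    """Sum the pair energies between every ligand atom and every pocket atom."""
    return sum(lennard_jones(math.dist(p, q)) for p in ligand for q in pocket)


def rank_candidates(candidates, pocket):
    """Return candidate names ordered from most to least favorable energy."""
    return sorted(candidates, key=lambda name: interaction_energy(candidates[name], pocket))


if __name__ == "__main__":
    pocket = [(0.0, 0.0, 0.0)]            # a single hypothetical pocket atom
    candidates = {
        "snug": [(3.93, 0.0, 0.0)],       # close to the energy minimum
        "clash": [(2.5, 0.0, 0.0)],       # too close, strongly repulsive
        "distant": [(10.0, 0.0, 0.0)],    # barely interacting
    }
    for name in rank_candidates(candidates, pocket):
        print(name, round(interaction_energy(candidates[name], pocket), 4))
```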
The design of compounds that bind to, promote or inhibit the functional activity of c-fms proteins and/or chimeras according to this invention generally involves consideration of two factors. First, the compound must be capable of physically and structurally associating with the c-fms protein and/or c-fms chimera. Non-covalent molecular interactions important in the association of the c-fms protein with the compound include hydrogen bonding, van der Waals and hydrophobic interactions. Second, the compound must be able to assume a conformation that allows it to associate with a c-fms protein and/or chimera. Although certain portions of the compound may not directly participate in the association with c-fms, those portions may still influence the overall conformation of the molecule. This, in turn, may have a significant impact on binding affinities, therapeutic efficacy, drug-like qualities and potency. Such conformational requirements include the overall three-dimensional structure and orientation of the chemical entity or compound in relation to all or a portion of the active site or other region of c-fms, or the spacing between functional groups of a compound comprising several chemical entities that directly interact with c-fms.
The potential or predicted inhibitory, agonist, antagonist or binding effect of a ligand or other compound on a c-fms protein may be analyzed prior to its actual synthesis and testing by the use of computer modeling techniques. If the theoretical structure of the given compound suggests insufficient interaction and association between it and the c-fms protein, synthesis and testing of the compound may be obviated. However, if computer modeling indicates a strong interaction, the molecule may then be synthesized and tested for its ability to interact with c-fms. In this manner, synthesis of inoperative compounds may be avoided. In some cases, inactive compounds are synthesized predicated on modeling and then tested to develop a SAR (structure-activity relationship) for compounds interacting with a specific region of a c-fms protein.
One skilled in the art may use one of several methods to screen chemical entities, fragments, compounds, or agents for their ability to associate with a c-fms protein and more particularly with the individual binding pockets or active sites of the c-fms protein. This process may begin by visual inspection of, for example, the active site based on the atomic coordinates of the chimeric protein or the chimeric protein complexed with a ligand. Selected chemical entities, compounds, or agents may then be positioned in a variety of orientations, or docked within an individual binding pocket of the chimeric c-fms protein. Docking may be accomplished using software such as Quanta and Sybyl, followed by energy minimization and molecular dynamics with standard molecular mechanics forcefields, such as CHARMM and AMBER.
Specialized computer programs may also assist in the process of selecting chemical entities. These include but are not limited to: GRID (Goodford, P. J., “A Computational Procedure for Determining Energetically Favorable Binding Sites on Biologically Important Macromolecules,” J. Med. Chem. 28:849-857 (1985), available from Oxford University, Oxford, UK); MCSS (Miranker, A. and M. Karplus, “Functionality Maps of Binding Sites: A Multiple Copy Simultaneous Search Method.” Proteins: Structure, Function and Genetics 11: 29-34 (1991), available from Molecular Simulations, Burlington, Mass.); AUTODOCK (Goodsell, D. S. and A. J. Olson, “Automated Docking of Substrates to Proteins by Simulated Annealing” Proteins: Structure, Function, and Genetics 8:195-202 (1990), available from Scripps Research Institute, La Jolla, Calif.); DOCK (Kuntz, I. D. et al., “A Geometric Approach to Macromolecule-Ligand Interactions,” J. Mol. Biol. 161:269-288 (1982), available from University of California, San Francisco, Calif.); Gold (Jones, G. et al., “Development and validation of a genetic algorithm for flexible docking.” J. Mol. Biol. 267: 727-748 (1997)); Glide (Halgren, T. A. et al., “Glide: a new approach for rapid, accurate docking and scoring. 2. Enrichment factors in database screening.” J Med Chem, 47:1750-1759 (2004), Friesner, R. A. et al., “Glide: a new approach for rapid, accurate docking and scoring. 1. Method and assessment of docking accuracy.” J Med Chem, 47:1739-1749 (2004)); FlexX (Rarey, M. et al., “A fast flexible docking method using an incremental construction algorithm.” J. Mol. Biol. 261: 470-489 (1996)); and ICM (Abagyan, R. A. and Totrov, M. M., J. Mol. Biol. 235: 983-1002 (1994)).
Software such as GRID, a program that determines probable interaction sites between probes with various functional group characteristics and the macromolecular surface, is used to analyze the surface sites to determine structures of similar inhibiting proteins or compounds. The GRID calculations, with suitable inhibiting groups on molecules (e.g., protonated primary amines) as the probe, are used to identify potential hotspots around accessible positions at suitable energy contour levels. The program DOCK may be used to analyze an active site or ligand binding site and suggest ligands with complementary steric properties. See also, Kellenberger, P. N et al., “Recovering the true targets of specific ligands by virtual screening of the protein data bank,” Proteins 54(4):671-80 (2004); Oldfield, T., “Applications for macromolecular map interpretation: X-AUTOFIT, X-POWERFIT, X-BUILD, X-LIGAND, and X-SOLVATE,” Methods Enzymol. 374:271-300 (2003); Richardson, J. S. et al., “New tools and data for improving structures, using all-atom contacts,” Methods Enzymol. 374: 385-412 (2003); Terwilliger, T. C., “Improving macromolecular atomic models at moderate resolution by automated iterative model building, statistical density modification and refinement,” Acta Crystallogr D Biol Crystallogr. 59(Pt 7): 1174-82 (2003); Toerger, T. C. and Sacchettini, J. C., “TEXTAL system: artificial intelligence techniques for automated protein model building,” Methods Enzymol. 374:244-70 (2003); von Grotthuss, M. et al., “Predicting protein structures accurately,” Science 304(5677):1597-9 (2004); Rajakiannan, V. et al., “The use of ACORN in solving a 39.5 kDa macromolecule with 1.9 Å resolution laboratory source data,” J Synchrotron Radiat. 11(Pt 4):358-62 (2004); Claude, J. B. et al., “CaspR: a web server for automated molecular replacement using homology modelling,” Nucleic Acids Res. 32(Web Server issue):W606-9 (2004); Suhre, K. and Sanejouand, Y. H., “ElNemo: a normal mode web server for protein movement analysis and the generation of templates for molecular replacement,” Nucleic Acids Res. 32(Web Server issue):W610-4 (2004).
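The probe-on-a-lattice idea behind the GRID calculations described above can be sketched in a few lines. This is only an illustration of the general technique, not the GRID program or its force field: it assumes a single positively charged probe, a generic Lennard-Jones term, a distance-dependent dielectric of 4r, and hypothetical coordinates and names (`probe_energy`, `hotspots`).

```python
import itertools
import math

COULOMB = 332.06  # kcal*angstrom/(mol*e^2), standard electrostatic conversion factor


def probe_energy(point, atoms, probe_charge=1.0, epsilon=0.2, sigma=3.2):
    """Energy of a charged probe at `point` near atoms given as (x, y, z, charge)."""
    energy = 0.0
    for x, y, z, charge in atoms:
        r = max(math.dist(point, (x, y, z)), 0.5)   # clamp to avoid division by zero
        sr6 = (sigma / r) ** 6
        energy += 4.0 * epsilon * (sr6 * sr6 - sr6)              # steric term
        energy += COULOMB * probe_charge * charge / (4.0 * r * r)  # dielectric = 4r (assumed)
    return energy


def hotspots(atoms, lo, hi, spacing, contour):
    """Return (energy, point) pairs on a cubic lattice at or below the contour level."""
    n = int(round((hi - lo) / spacing)) + 1
    axis = [lo + i * spacing for i in range(n)]
    scored = ((probe_energy(p, atoms), p) for p in itertools.product(axis, repeat=3))
    return sorted((e, p) for e, p in scored if e <= contour)


if __name__ == "__main__":
    # One hypothetical negatively charged atom, e.g. a carboxylate oxygen, at the origin.
    atoms = [(0.0, 0.0, 0.0, -1.0)]
    best_energy, best_point = hotspots(atoms, -6.0, 6.0, 1.0, contour=-5.0)[0]
    print(round(best_energy, 2), best_point)
```

Lattice points that survive the contour cut-off form a shell around the charged atom; they are too far to clash and close enough to interact, which is the sense in which such points are treated as candidate hotspots.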
Once suitable chemical entities, compounds, or agents have been selected, they can be assembled into a single ligand or compound or inhibitor or activator. Assembly may proceed by visual inspection of the relationship of the fragments to each other on the three-dimensional image. This may be followed by manual model building using software such as Quanta or Sybyl.
Useful programs to aid in connecting the individual chemical entities, compounds, or agents include but are not limited to: CAVEAT (Bartlett, P. A. et al., “CAVEAT: A Program to Facilitate the Structure-Derived Design of Biologically Active Molecules.” In Molecular Recognition in Chemical and Biological Problems, Special Pub., Royal Chem. Soc., 78, pp. 82-196 (1989)); 3D Database systems such as MACCS-3D (MDL Information Systems, San Leandro, Calif.; and Martin, Y. C., “3D Database Searching in Drug Design,” J. Med. Chem. 35: 2145-2154 (1992)); and HOOK (available from Molecular Simulations, Burlington, Mass.).
Several methodologies for searching three-dimensional databases to test pharmacophore hypotheses and select compounds for screening are available. These include the program CAVEAT (Bacon et al., J. Mol. Biol. 225:849-858 (1992)). For instance, CAVEAT uses databases of cyclic compounds which can act as “spacers” to connect any number of chemical fragments already positioned in the active site. This allows one skilled in the art to quickly generate hundreds of possible ways to connect the fragments already known or suspected to be necessary for tight binding.
Instead of proceeding to build an inhibitor, activator, agonist or antagonist of a c-fms chimeric protein in a step-wise fashion one chemical entity at a time as described above, such compounds may be designed as a whole or “de novo” using either an empty active site or optionally including some portion(s) of a known molecule. These methods include: LUDI (Bohm, H.-J., “The Computer Program LUDI: A New Method for the De Novo Design of Enzyme Inhibitors”, J. Comp. Aid. Molec. Design, 6, pp. 61-78 (1992), available from Biosym Technologies, San Diego, Calif.); LEGEND (Nishibata, Y. and A. Itai, Tetrahedron 47:8985 (1991), available from Molecular Simulations, Burlington, Mass.); and LeapFrog (available from Tripos Associates, St. Louis, Mo.).
For instance, the program LUDI can determine a list of interaction sites into which to place both hydrogen bonding and hydrophobic fragments. LUDI then uses a library of linkers to connect up to four different interaction sites into fragments. Then smaller “bridging” groups such as —CH2— and —COO— are used to connect these fragments. For example, for the enzyme DHFR, the placements of key functional groups in the well-known inhibitor methotrexate were reproduced by LUDI. See also, Rotstein and Murcko, J. Med. Chem. 36: 1700-1710 (1992).
Other molecular modeling techniques may also be employed in accordance with this invention. See, e.g., Cohen, N. C. et al., “Molecular Modeling Software and Methods for Medicinal Chemistry,” J. Med. Chem. 33:883-894 (1990). See also, Navia, M. A. and M. A. Murcko, “The Use of Structural Information in Drug Design,” Current Opinions in Structural Biology, 2, pp. 202-210 (1992).
Once a compound has been designed or selected by the above methods, the affinity with which that compound may bind or associate with a c-fms protein may be tested and optimized by computational evaluation and/or by testing biological activity after synthesizing the compound. Inhibitors or compounds may interact with c-fms in more than one conformation that is similar in overall binding energy. In those cases, the deformation energy of binding is taken to be the difference between the energy of the free compound and the average energy of the conformations observed when the compound binds to a c-fms protein.
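The averaging described above amounts to simple arithmetic. The sketch below assumes the sign convention "average bound-state energy minus free-state energy", so that a positive value represents strain paid on binding; that convention, the function name `deformation_energy` and the energies are assumptions made for illustration.

```python
def deformation_energy(free_energy, bound_energies):
    """Average energy of the bound conformations minus the energy of the free compound.

    Energies must share the same units (for example kcal/mol) and the same
    computational method; the sign convention is an assumption of this sketch.
    """
    if not bound_energies:
        raise ValueError("at least one bound conformation is required")
    average_bound = sum(bound_energies) / len(bound_energies)
    return average_bound - free_energy


if __name__ == "__main__":
    # Hypothetical conformational energies in kcal/mol.
    print(deformation_energy(-120.0, [-115.0, -117.0]))
```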
A compound designed or selected as binding or associating with c-fms may be further computationally optimized so that in its bound state it would preferably lack repulsive electrostatic interaction with the protein. Such non-complementary (e.g., electrostatic) interactions include repulsive charge-charge, dipole-dipole and charge-dipole interactions. Specifically, the sum of all electrostatic interactions between the inhibitor and the chimera when the inhibitor is bound preferably makes a neutral or favorable contribution to the enthalpy of binding. Weak binding compounds will also be designed by these methods so as to determine SAR.
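The electrostatic criterion above can be checked with a pairwise Coulomb sum. This is a minimal sketch assuming fixed partial point charges and a uniform dielectric constant of 4; the charges, coordinates and names (`electrostatic_energy`, `is_non_repulsive`) are hypothetical, and dedicated programs use considerably more detailed models.

```python
import math

COULOMB = 332.06  # kcal*angstrom/(mol*e^2), standard electrostatic conversion factor


def electrostatic_energy(ligand, protein, dielectric=4.0):
    """Sum of Coulomb pair energies; atoms are ((x, y, z), partial_charge) tuples."""
    total = 0.0
    for pos_l, q_l in ligand:
        for pos_p, q_p in protein:
            total += COULOMB * q_l * q_p / (dielectric * math.dist(pos_l, pos_p))
    return total


def is_non_repulsive(ligand, protein, dielectric=4.0):
    """True when the summed electrostatic contribution is neutral or favorable."""
    return electrostatic_energy(ligand, protein, dielectric) <= 0.0


if __name__ == "__main__":
    protein = [((0.0, 0.0, 0.0), -0.5)]          # hypothetical acceptor atom
    donor = [((3.0, 0.0, 0.0), 0.5)]             # complementary ligand charge
    mismatch = [((3.0, 0.0, 0.0), -0.5)]         # like charges facing each other
    print(round(electrostatic_energy(donor, protein), 3), is_non_repulsive(donor, protein))
    print(round(electrostatic_energy(mismatch, protein), 3), is_non_repulsive(mismatch, protein))
```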
Specific computer software is available in the art to evaluate compound deformation energy and electrostatic interaction. Examples of programs designed for such uses include: Gaussian 92, revision C (M. J. Frisch, Gaussian, Inc., Pittsburgh, Pa., © 1992); AMBER, version 4.0 (P. A. Kollman, University of California at San Francisco, © 1994); QUANTA/CHARMM (Molecular Simulations, Inc., Burlington, Mass., © 1994); Insight II/Discover (Biosym Technologies Inc., San Diego, Calif., © 1994); and Delphi (A. Nicholls and B. Honig, “A rapid finite difference algorithm, utilizing successive over-relaxation to solve the Poisson-Boltzmann equation,” J. Comp. Chem. 12: 435-445 (1991); M. K. Gilson and B. Honig, “Calculation of the total electrostatic energy of a macromolecular system: Solvation energies, binding energies and conformation analysis,” Proteins 4: 7-18 (1988); M. K. Gilson et al., “Calculating the electrostatic potential of molecules in solution: Method and error assessment,” J. Comp. Chem. 9: 327-335 (1987)). Other hardware systems and software packages will be known to those skilled in the art.
Once a compound that associates with the c-fms chimera and/or c-fms protein has been optimally selected or designed, as described above, substitutions may then be made in some of its atoms or side groups in order to improve or modify its binding properties. Generally, initial substitutions are conservative, i.e., the replacement group will have approximately the same size, shape, hydrophobicity and charge as the original group. It should, of course, be understood that components known in the art to alter conformation may be avoided. Such substituted chemical compounds may then be analyzed for efficiency of fit to a c-fms chimera by the same computer methods described in detail, above.
C. Use of Homology Structure Modeling to Design Ligands with Modulated Binding or Activity to c-fms Proteins
The present invention includes the use of the atomic coordinates and structures of c-fms chimeric proteins and/or c-fms chimeric protein inhibitor complexes. The structure of a complex between the chimera and the starting compound can be used to guide the modification of that compound to produce new compounds that have other desirable properties for applicable industrial and other uses (e.g., as pharmaceuticals), such as chemical stability, solubility or membrane permeability (Lipinski et al., Adv. Drug Deliv. Rev. 23:3 (1997)).
Binding compounds, agonists, antagonists and such that are known in the art include but are not limited to arylamides and quinolones. Such compounds can be diffused into or soaked with the stabilized crystals of a c-fms chimera to form a complex for collecting X-ray diffraction data. Alternatively, other compounds, known and unknown in the art, can be cocrystallized with the c-fms chimera by mixing the compound with the chimera before precipitation.
To produce custom high affinity and very specific compounds, the structure of a c-fms chimera can be compared to the structure of a selected non-targeted molecule and a hybrid constructed by changing the structure of residues at the binding site for a ligand for the residues at the same positions of the non-target molecule. The process whereby this modeling is achieved is referred to as homology structure modeling. This is done computationally by removing the side chains from the molecule or target of known structure and replacing them with the side chains of the non-targeted structure put in sterically plausible positions. In this way it can be understood how the shapes of the active site cavities of the targeted and non-targeted molecules differ. This process, therefore, provides information concerning how a bound ligand can be chemically altered in order to produce compounds that will bind tightly and specifically to the desired target but will simultaneously be sterically prevented from binding to the non-targeted molecule. Likewise, knowledge of the portions of the bound ligands that face the solvent would allow introduction of other functional groups for additional pharmaceutical purposes. The use of homology structure modeling to design molecules (ligands) that bind more tightly to the target enzyme than to the non-target enzyme has widespread applicability.
D. High Throughput Assays
Any high throughput screening may be utilized to test new compounds which are identified or designed for their ability to interact with c-fms. For general information on high-throughput screening see, for example, Devlin, 1998, High Throughput Screening, Marcel Dekker; and U.S. Pat. No. 5,763,263. High throughput assays utilize one or more different assay techniques including, but not limited to, those described below.
Immunodiagnostics and Immunoassays. These are a group of techniques used for the measurement of specific biochemical substances, commonly at low concentrations in complex mixtures such as biological fluids, that depend upon the specificity and high affinity shown by suitably prepared and selected antibodies for their complementary antigens. A substance to be measured must, of necessity, be antigenic—either an immunogenic macromolecule or a haptenic small molecule. To each sample a known, limited amount of specific antibody is added and the fraction of the antigen combining with it, often expressed as the bound:free ratio, is estimated, using as indicator a form of the antigen labeled with radioisotope (radioimmunoassay), fluorescent molecule (fluoroimmunoassay), stable free radical (spin immunoassay), enzyme (enzyme immunoassay), or other readily distinguishable label.
Antibodies can be labeled in various ways, including: enzyme-linked immunosorbent assay (ELISA); radioimmunoassay (RIA); fluorescent immunoassay (FIA); chemiluminescent immunoassay (CLIA); and labeling the antibody with colloidal gold particles (immunogold). Common assay formats include the sandwich assay, competitive or competition assay, latex agglutination assay, homogeneous assay, microtitre plate format and the microparticle-based assay.
Enzyme-linked immunosorbent assay (ELISA). ELISA is an immunochemical technique that avoids the hazards of radiochemicals and the expense of fluorescence detection systems. Instead, the assay uses enzymes as indicators. ELISA is a form of quantitative immunoassay based on the use of antibodies (or antigens) that are linked to an insoluble carrier surface, which is then used to “capture” the relevant antigen (or antibody) in the test solution. The antigen-antibody complex is then detected by measuring the activity of an appropriate enzyme that had previously been covalently attached to the antigen (or antibody).
For information on ELISA techniques, see, for example, Crowther, (1995) ELISA—Theory and Practice (Methods in Molecular Biology), Humana Press; Challacombe & Kemeny, (1998) ELISA and Other Solid Phase Immunoassays—Theoretical and Practical Aspects, John Wiley; Kemeny, (1991) A Practical Guide to ELISA, Pergamon Press; Ishikawa, (1991) Ultrasensitive and Rapid Enzyme Immunoassay (Laboratory Techniques in Biochemistry and Molecular Biology) Elsevier.
Colorimetric Assays for Enzymes. Colorimetry is any method of quantitative chemical analysis in which the concentration or amount of a compound is determined by comparing the color produced by the reaction of a reagent with both standard and test amounts of the compound, often using a colorimeter. A colorimeter is a device for measuring color intensity or differences in color intensity, either visually or photoelectrically.
Standard colorimetric assays of beta-galactosidase enzymatic activity are well known to those skilled in the art (see, for example, Norton et al., Mol. Cell. Biol. 5:281-290 (1985)). A colorimetric assay can be performed on whole cell lysates using O-nitrophenyl-beta-D-galactopyranoside (ONPG, Sigma) as the substrate in a standard colorimetric beta-galactosidase assay (Sambrook et al., (1989) Molecular Cloning—A Laboratory Manual, Cold Spring Harbor Laboratory Press). Automated colorimetric assays are also available for the detection of beta-galactosidase activity, as described in U.S. Pat. No. 5,733,720.
Immunofluorescence Assays. Immunofluorescence or immunofluorescence microscopy is a technique in which an antigen or antibody is made fluorescent by conjugation to a fluorescent dye and then allowed to react with the complementary antibody or antigen in a tissue section or smear. The location of the antigen or antibody can then be determined by observing the fluorescence by microscopy under ultraviolet light.
For general information on immunofluorescent techniques, see, for example, Knapp et al., (1978) Immunofluorescence and Related Staining Techniques, Elsevier; Allan, (1999) Protein Localization by Fluorescent Microscopy—A Practical Approach (The Practical Approach Series) Oxford University Press; Caul, (1993) Immunofluorescence Antigen Detection Techniques in Diagnostic Microbiology, Cambridge University Press. For detailed explanations of immunofluorescent techniques applicable to the present invention, see U.S. Pat. No. 5,912,176; U.S. Pat. No. 5,869,264; U.S. Pat. No. 5,866,319; and U.S. Pat. No. 5,861,259.
E. Databases and Computer Systems
An amino acid sequence or nucleotide sequence of a c-fms chimera and/or X-ray diffraction data, useful for computer molecular modeling of a c-fms chimera or a portion thereof, can be “provided” in a variety of mediums to facilitate use thereof. As used herein, “provided” refers to a manufacture, which contains, for example, an amino acid sequence or nucleotide sequence and/or atomic coordinates derived from X-ray diffraction data of the present invention, e.g., an amino acid or nucleotide sequence of a c-fms chimera, a representative fragment thereof, or a homologue thereof. Such a product provides the amino acid sequence and/or X-ray diffraction data in a form which allows a skilled artisan to analyze and molecular model the three-dimensional structure of a c-fms chimera or related molecules, including a subdomain thereof.
In one application of this embodiment, databases comprising data pertaining to a c-fms chimera, or at least one subdomain thereof, including amino acid and nucleic acid sequence and/or X-ray diffraction data of the present invention, are recorded on a computer readable medium. As used herein, “computer readable medium” refers to any medium which can be read and accessed directly by a computer. Such media include, but are not limited to: magnetic storage media, such as floppy discs, hard disc storage media, and magnetic tape; optical storage media such as optical discs or CD-ROM; electrical storage media such as RAM and ROM; and hybrids of these categories such as magnetic/optical storage media. A skilled artisan can readily appreciate how any of the presently known computer readable media can be used to create a manufacture comprising a computer readable medium having recorded thereon an amino acid sequence and/or X-ray diffraction data of the present invention.
As used herein, “recorded” refers to a process for storing information on computer readable media. A skilled artisan can readily adopt any of the presently known methods for recording information on computer readable media to generate manufactures comprising an amino acid sequence and/or atomic coordinate/X-ray diffraction data information of the present invention.
A variety of data storage structures are available to a skilled artisan for creating a computer readable medium having recorded thereon an amino acid sequence and/or atomic coordinate/X-ray diffraction data of the present invention. The choice of the data storage structure will generally be based on the means chosen to access the stored information. In addition, a variety of data processor programs and formats can be used to store the sequence and X-ray data information of the present invention on computer readable media. The sequence information can be represented in a word processing text file, formatted in commercially-available software such as WordPerfect and MICROSOFT Word, or represented in the form of an ASCII file, stored in a database application, such as DB2, Sybase, Oracle, or the like. A skilled artisan can readily adapt any number of data processor structuring formats (e.g., text file or database) in order to obtain computer readable media having recorded thereon the information of the present invention.
By providing computer readable media having sequence and/or atomic coordinates based on X-ray diffraction data, a skilled artisan can routinely access the sequence and atomic coordinate or X-ray diffraction data to model, for instance, a related molecule, a subdomain, mimetic, or a ligand thereof. Computer algorithms are publicly and commercially available which allow a skilled artisan to access this data provided in a computer readable medium and analyze it for molecular modeling and/or RDD (rational drug design). See, e.g., Biotechnology Software Directory, MaryAnn Liebert Publ., New York (1995).
The present invention further provides systems, particularly computer-based systems, which contain the sequence, structure, and/or diffraction data described herein. Such systems are designed to do structure determination and RDD for a c-fms chimera or at least one subdomain thereof. Non-limiting examples are microcomputer workstations available from Silicon Graphics Incorporated and Sun Microsystems running UNIX based, Windows or IBM OS/2 operating systems.
As used herein, “a computer-based system” refers to the hardware means, software means, and data storage means used to analyze the sequence, structure, and/or X-ray diffraction data of the present invention. The minimum hardware means of the computer-based systems of the present invention comprises a central processing unit (CPU), input means, output means, and data storage means. A skilled artisan can readily appreciate which of the currently available computer-based systems are suitable for use in the present invention. A visualization device, such as a monitor, is optionally provided to visualize structure data.
As stated above, the computer-based systems of the present invention comprise a data storage means having stored therein sequence, structure, and/or atomic coordinate/X-ray diffraction data of the present invention and the necessary hardware means and software means for supporting and implementing an analysis means. As used herein, “data storage means” refers to memory which can store sequence, structure, or atomic coordinate/X-ray diffraction data of the present invention, or a memory access means which can access manufactures having recorded thereon the sequence or X-ray data of the present invention.
As used herein, “search means” or “analysis means” refers to one or more programs which are implemented on the computer-based system to compare a target sequence or target structural motif with the sequence, structure, or X-ray data stored within the data storage means. Search means are used, for instance, to identify fragments or regions of a protein or polypeptide which match a particular target sequence or target motif. A variety of known algorithms are disclosed publicly, and a variety of commercially available software packages for conducting search means exist and can be used in the computer-based systems of the present invention. A skilled artisan can readily recognize that any one of the available algorithms or implementing software packages for conducting computer analyses can be adapted for use in the present computer-based systems.
As used herein, “a target structural motif,” or “target motif,” refers to any rationally selected sequence or combination of sequences in which the sequence(s) are chosen based on a three-dimensional configuration or electron density map which is formed upon the folding of the target motif. There are a variety of target motifs known in the art. Protein target motifs include, but are not limited to, enzymatic active sites, inhibitor binding sites, structural subdomains, epitopes, functional domains and signal sequences. Similar motifs are known for RNA. A variety of structural formats for the input and output means can be used to input and output the information in the computer-based systems of the present invention.
A variety of comparing means can be used to compare a target sequence or target motif with the data described herein to identify structural motifs or electron density maps derived in part from the atomic coordinate/X-ray diffraction data. A skilled artisan can readily recognize that any one of the publicly available computer modeling programs can be used as the search means for the computer-based systems of the present invention.
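As a minimal illustration of a search means operating on sequence data, the following sketch scans a sequence for a short target motif and reports the position of each match. The function name `find_motif`, the toy sequence, and the choice of an Asp-Phe-Gly pattern are assumptions made for illustration only; the toy sequence is not the c-fms sequence, and this is not the software used in the present invention.

```python
import re

def find_motif(sequence, motif_regex):
    """Return (1-based start position, matched text) for each non-overlapping match."""
    return [(m.start() + 1, m.group(0)) for m in re.finditer(motif_regex, sequence)]

# Hypothetical toy sequence in one-letter code (NOT the c-fms sequence).
toy_sequence = "MKVLAVGDFGLARDIYK"

# Search for the three-residue pattern Asp-Phe-Gly ("DFG").
for position, match in find_motif(toy_sequence, "DFG"):
    print(f"motif {match} starts at residue {position}")
```

Because the motif is given as a regular expression, degenerate patterns (for example `D[FL]G`) can be searched in the same way.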
F. Target Molecule Fragments and Portions
Fragments of c-fms, for instance fragments comprising active sites defined by two or more amino acids selected from the group consisting of: Trp 550, Lys 586, Thr 587, Leu 588, Gly 589, Val 596, Glu 598, Ala 614, Val 615, Lys 616, Glu 633, Met 637, Leu 640, Ile 646, Val 647, Val 661, Thr 663, Glu 664, Tyr 665, Cys 666, Cys 667, Tyr 668, Gly 669, Asp 670, Asn 673, Arg 677, Cys 774, Ile 775, His 776, Arg 782, Asn 783, Leu 785, Ile 794, Gly 795, Asp 796, Phe 797, Gly 798, Leu 799, Ala 800, Arg 801, Asp 802 may be prepared by any available means including synthetic or recombinant means. Such fragments may then be used in the assays as described above, for instance, high throughput assays to detect interactions between prospective agents and the active site within the fragment.
For recombinant expression or production of the fragments of the invention, nucleic acid molecules encoding the fragment may be prepared. As used herein, “nucleic acid” is defined as RNA or DNA that encodes a protein or peptide as defined above, or is complementary to nucleic acid sequence encoding such peptides, or hybridizes to such nucleic acid and remains stably bound to it under appropriate stringency conditions.
Nucleic acid molecules encoding fragments of the invention may differ in sequence because of the degeneracy in the genetic code or may differ in sequence as they encode proteins or protein fragments that differ in amino acid sequence. Homology or sequence identity between two or more such nucleic acid molecules is determined by BLAST (Basic Local Alignment Search Tool) analysis using the algorithm employed by the programs blastp, blastn, blastx, tblastn and tblastx (Karlin et al., Proc. Natl. Acad. Sci. USA 87:2264-2268 (1990) and Altschul et al., J. Mol. Evol. 36:290-300 (1993), fully incorporated by reference) which are tailored for sequence similarity searching.
The approach used by the BLAST program is to first consider similar segments between a query sequence and a database sequence, then to evaluate the statistical significance of all matches that are identified and finally to summarize only those matches which satisfy a preselected threshold of significance. For a discussion of basic issues in similarity searching of sequence databases, see Altschul et al. (Nat. Genet. 6, 119-129 (1994)) which is fully incorporated by reference. The search parameters for histogram, descriptions, alignments, expect (i.e., the statistical significance threshold for reporting matches against database sequences), cutoff, matrix and filter are at the default settings. The default scoring matrix used by blastp, blastx, tblastn, and tblastx is the BLOSUM62 matrix (Henikoff et al., Proc. Natl. Acad. Sci. USA 89:10915-10919 (1992), fully incorporated by reference). Four blastn parameters were adjusted as follows: Q=10 (gap creation penalty); R=10 (gap extension penalty); wink=1 (generates word hits at every winkth position along the query); and gapw=16 (sets the window width within which gapped alignments are generated). The equivalent Blastp parameter settings were Q=9; R=2; wink=1; and gapw=32. A Bestfit comparison between sequences, available in the GCG package version 10.0, uses DNA parameters GAP=50 (gap creation penalty) and LEN=3 (gap extension penalty) and the equivalent settings in protein comparisons are GAP=8 and LEN=2.
“Stringent conditions” are those that (1) employ low ionic strength and high temperature for washing, for example, 0.015 M NaCl/0.0015 M sodium citrate/0.1% SDS at 50° C. or (2) employ during hybridization a denaturing agent such as formamide, for example, 50% formamide with 0.1% bovine serum albumin/0.1% Ficoll/0.1% polyvinylpyrrolidone/50 mM sodium phosphate buffer at pH 6.5 with 750 mM NaCl, 75 mM sodium citrate at 42° C. Another example is use of 50% formamide, 5×SSC, 50 mM sodium phosphate (pH 6.8), 0.1% sodium pyrophosphate, 5× Denhardt's solution, sonicated salmon sperm DNA (50 mg/ml), 0.1% SDS and 10% dextran sulfate at 42° C., with washes at 42° C. in 0.2×SSC and 0.1% SDS. A skilled artisan can readily determine and vary the stringency conditions appropriately to obtain a clear and detectable hybridization signal.
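The salt concentrations above can be related to the SSC multiples used in the same paragraph. The sketch below assumes the conventional definition of 1×SSC as 150 mM NaCl and 15 mM sodium citrate; under that assumption, 0.015 M NaCl/0.0015 M sodium citrate corresponds to 0.1×SSC, and 750 mM NaCl/75 mM sodium citrate corresponds to 5×SSC. The function names are illustrative.

```python
# Assumed conventional composition of 1x SSC, in mM.
SSC_1X_MM = {"NaCl": 150.0, "sodium citrate": 15.0}

def ssc_concentrations(fold):
    """Return component concentrations (mM) for SSC at the given fold strength."""
    return {name: conc * fold for name, conc in SSC_1X_MM.items()}

def ssc_fold_from_nacl(nacl_mm):
    """Return the SSC fold strength implied by a NaCl concentration in mM."""
    return nacl_mm / SSC_1X_MM["NaCl"]

for fold in (0.1, 0.2, 5):
    print(fold, ssc_concentrations(fold))

print(ssc_fold_from_nacl(750.0))  # fold strength for 750 mM NaCl
```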
As used herein, a nucleic acid molecule is said to be “isolated” when the nucleic acid molecule is substantially separated from contaminant nucleic acid encoding other polypeptides from the source of nucleic acid.
The encoding nucleic acid molecules of the present invention (i.e., synthetic oligonucleotides) and those that are used as probes or specific primers for polymerase chain reaction (PCR) or to synthesize gene sequences encoding proteins of the invention can easily be synthesized by chemical techniques, for example, the phosphotriester method of Matteucci et al. (J. Am. Chem. Soc. 103:3185-3191 (1981)) or using automated synthesis methods. In addition, larger DNA segments can readily be prepared by well known methods, such as synthesis of a group of oligonucleotides that define various modular segments of the gene, followed by ligation of oligonucleotides to build the complete modified gene.
The encoding nucleic acid molecules of the present invention may further be modified so as to contain a detectable label for diagnostic and probe purposes. A variety of such labels are known in the art and can readily be employed with the encoding molecules herein described. Suitable labels include, but are not limited to, biotin, radiolabeled nucleotides and the like. A skilled artisan can employ any of the art-known labels to obtain a labeled encoding nucleic acid molecule.
The present invention further provides recombinant DNA molecules (rDNA) that contain a coding sequence for a protein fragment as described above. As used herein, a rDNA molecule is a DNA molecule that has been subjected to molecular manipulation. Methods for generating rDNA molecules are well known in the art, for example, see Sambrook et al., Molecular Cloning—A Laboratory Manual, Cold Spring Harbor Laboratory Press (1989). In the preferred rDNA molecules, a coding DNA sequence is operably linked to expression control sequences and/or vector sequences.
The choice of vector and expression control sequences to which one of the protein encoding sequences of the present invention is operably linked depends directly, as is well known in the art, on the functional properties desired (e.g., protein expression) and the host cell to be transformed. A vector of the present invention may be capable of directing the replication or insertion into the host chromosome, and preferably also expression, of the structural gene included in the rDNA molecule.
Expression control elements that are used for regulating the expression of an operably linked protein encoding sequence are known in the art and include, but are not limited to, inducible promoters, constitutive promoters, secretion signals, and other regulatory elements. Preferably, the inducible promoter is readily controlled, such as being responsive to a nutrient in the host cell's medium.
The present invention further provides host cells transformed with a nucleic acid molecule that encodes a protein, polypeptide, or fragment of a protein or polypeptide of the present invention. The host cell can be either prokaryotic or eukaryotic. Eukaryotic cells useful for expression of a protein of the invention are not limited, so long as the cell line is compatible with cell culture methods and compatible with the propagation of the expression vector and expression of the gene product. Preferred eukaryotic host cells include, but are not limited to, yeast, insect and mammalian cells, preferably vertebrate cells such as those from a mouse, rat, monkey or human cell line. Preferred eukaryotic host cells include Chinese hamster ovary (CHO) cells available from the ATCC as CCL61, NIH Swiss mouse embryo cells NIH-3T3 available from the ATCC as CRL1658, baby hamster kidney cells (BHK), and the like eukaryotic tissue culture cell lines.
Transformed host cells of the invention may be cultured under conditions that allow the production of the recombinant protein. Optionally the recombinant protein is isolated from the medium or from the cells; recovery and purification of the protein may not be necessary in some instances where some impurities may be tolerated.
Kits may also be prepared with any of the above described nucleic acid molecules, protein fragments, vector and/or host cells optionally packaged with the reagents needed for a specific assay, such as those described above. In such kits, the protein fragments or other reagents may be attached to a solid support, such as glass or plastic beads.
G. Integrated Procedures Which Utilize the Present Invention
Molecular modeling is provided by the present invention for rational drug design (RDD) of mimetics and ligands of a c-fms chimera. As described above, the drug design paradigm uses computer modeling programs to determine potential mimetics and ligands of a c-fms chimera which are expected to interact with sites on the protein. The potential mimetics or ligands are then screened for activity and/or binding and/or interaction with the c-fms protein. For c-fms-related mimetics or ligands, screening methods can be selected from assays for at least one biological activity of c-fms, e.g., such as phosphorylation, according to known method steps. See, for example, U.S. Appl. No. 2004/0002145 A1.
Thus, the tools and methodologies provided by the present invention may be used in procedures for identifying and designing ligands which bind in desirable ways with the target, a c-fms protein. Such procedures utilize an iterative process whereby ligands are synthesized, tested and characterized. New ligands can be designed based on the information gained in the testing and characterization of the initial ligands and then such newly identified ligands can themselves be tested and characterized. This series of processes may be repeated as many times as necessary to obtain ligands with the desirable binding properties.
The following steps serve as an example of the overall procedure:
1. A biological activity of a target is selected.
2. A ligand is identified that appears to be in some way associated with the chosen biological activity (e.g., the ligand may be an inhibitor of a known activity). The activity of the ligand may be tested by in vivo and/or in vitro methods.
A ligand of the present invention can be, but is not limited to, at least one selected from a lipid, a nucleic acid, a compound, a protein, an element, an antibody, a saccharide, an isotope, a carbohydrate, an imaging agent, a lipoprotein, a glycoprotein, an enzyme, a detectable probe, an antibody or fragment thereof, or any combination thereof, which can be detectably labeled as for labeling antibodies. Such labels include, but are not limited to, enzymatic labels, radioisotope or radioactive compounds or elements, fluorescent compounds or metals, chemiluminescent compounds and bioluminescent compounds. Alternatively, any other known diagnostic or therapeutic agent can be used in a method of the invention. Suitable compounds are then tested for activities in relationship to the target.
Complexes between a c-fms chimera and ligands are made either by co-crystallization or more commonly by diffusing the small molecule ligand into the crystal. X-ray diffraction data from the complex crystal are measured and a difference electron density map is calculated. This process provides the precise location of the bound ligand on the target molecule. The difference Fourier is calculated using measured diffraction amplitudes and the phases of these reflections calculated from the coordinates.
3. Using the methods of the present invention, X-ray crystallography is utilized to create electron density maps and/or molecular models of the interaction of the ligand with the target molecule.
The entry of the coordinates of the target into the computer programs discussed above results in the calculation of the most probable structure of the macromolecule. These structures are combined and refined by additional calculations using such programs to determine the probable or actual three-dimensional structure of the target including potential or actual active or binding sites of ligands. Such molecular modeling (and related) programs useful for rational drug design of ligands or mimetics, are also provided by the present invention.
4. The electron density maps and/or molecular models obtained in Step 3 are compared to the electron density maps and/or molecular models of a non-ligand containing target and the observed/calculated differences are used to specifically locate the binding of the ligand on the target or subunit.
5. Modeling tools, such as computational chemistry and computer modeling, are used to adjust or modify the structure of the ligand so that it can make additional or different interactions with the target.
The ligand design uses computer modeling programs which calculate how different molecules interact with the various sites of a target. This procedure determines potential ligands or mimetics of the ligand(s).
The ligand design uses computer modeling programs which calculate how different molecules interact with the various sites of the target, subunit, or a fragment thereof. Thus, this procedure determines potential ligands or ligand mimetics.
6. The newly designed ligand from Step 5 can be tested for its biological activity using appropriate in vivo or in vitro tests, including the high throughput screening methods discussed above.
The potential ligands or mimetics are then screened for activity relating to a c-fms chimera, c-fms protein, or at least a fragment thereof. Such screening methods are selected from assays for at least one biological activity of the native target.
The resulting ligands or mimetics, provided by methods of the present invention, are useful for treating, screening or preventing diseases in animals, such as mammals (including humans) and birds.
7. Of course, each of the above steps can be modified as desired by those of skill in the art so as to refine the procedure for the particular goal in mind. Also, additional X-ray diffraction data may be collected on c-fms chimeric proteins, c-fms chimeric proteins/ligand complexes, structural target motifs and subunit/ligand complexes at any step or phase of the procedure. Such additional diffraction data can be used to reconstruct electron density maps and molecular models which may further assist in the design and selection of ligands with the desirable binding attributes.
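For reference, the difference Fourier mentioned in connection with step 2 can be written in one commonly used form, shown below. This is a general textbook expression rather than a statement of the exact coefficients used in the present work; the symbols (unit cell volume V, observed and calculated structure factor amplitudes, calculated phase α_calc, Miller indices hkl, and fractional coordinates x, y, z) are introduced here for illustration.

```latex
\Delta\rho(x,y,z) \;=\; \frac{1}{V}\sum_{hkl}
  \bigl(\,|F_{\mathrm{obs}}(hkl)| - |F_{\mathrm{calc}}(hkl)|\,\bigr)\,
  e^{\,i\alpha_{\mathrm{calc}}(hkl)}\,
  e^{-2\pi i\,(hx + ky + lz)}
```

Here the amplitudes |F_obs| are the measured diffraction amplitudes of the complex crystal, and the amplitudes |F_calc| and phases α_calc are calculated from the coordinates of the model without the ligand, so that positive density in the map indicates where the bound ligand is located.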
It is to be understood that the present invention is considered to include stereoisomers as well as optical isomers, e.g., mixtures of enantiomers as well as individual enantiomers and diastereomers, which arise as a consequence of structural asymmetry in selected compounds, ligands or mimetics of the present series.
Some of the compounds or agents disclosed or discovered by the methods herein may contain one or more asymmetric centers and thus give rise to enantiomers, diastereomers, and other stereoisomeric forms. The present invention is also meant to encompass all such possible forms as well as their racemic and resolved forms and mixtures thereof. When the compounds described or discovered herein contain olefinic double bonds or other centers of geometric asymmetry, and unless otherwise specified, it is intended to include both E and Z geometric isomers. All tautomers are intended to be encompassed by the present invention as well.
As used herein, the term “stereoisomers” is a general term for all isomers of individual molecules that differ only in the orientation of their atoms in space. It includes enantiomers and isomers of compounds with more than one chiral center that are not mirror images of one another (diastereomers).
As used herein, the term “chiral center” refers to a carbon atom to which four different groups are attached.
As used herein, the term “enantiomer” or “enantiomeric” refers to a molecule that is nonsuperimposable on its mirror image and hence optically active wherein the enantiomer rotates the plane of polarized light in one direction and its mirror image rotates the plane of polarized light in the opposite direction.
As used herein, the term “racemic” refers to a mixture of equal parts of enantiomers and which is optically inactive.
As used herein, the term “resolution” refers to the separation or concentration or depletion of one of the two enantiomeric forms of a molecule. In the context of this application, the term “resolution” also refers to the amount of detail which can be resolved by the diffraction experiment. In other terms, since the inherent disorder of a protein crystal causes the diffraction pattern to fade away at some diffraction angle θmax, the corresponding distance dmin of the reciprocal lattices is determined by Bragg's law.
In practice in protein crystallography it is usual to quote the nominal resolution of a protein electron density in terms of dmin, the minimum lattice distance to which data is included in the calculation of the map.
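The relationship between θmax and dmin follows from Bragg's law, nλ = 2d sin θ, taken with n = 1. The following sketch computes dmin from a wavelength and a maximum Bragg angle. The function name and the example wavelength (1.5418 Å, a typical value for a copper-anode laboratory source) are assumptions for illustration and are not taken from the experiments described herein. Note that θ is the Bragg angle, i.e., half of the scattering angle 2θ.

```python
import math

def d_min(wavelength_angstrom, theta_max_deg):
    """Bragg's law with n = 1: d = lambda / (2 * sin(theta))."""
    return wavelength_angstrom / (2.0 * math.sin(math.radians(theta_max_deg)))

# Assumed example values (not from the source): Cu K-alpha wavelength in Angstrom.
wavelength = 1.5418
for theta_max in (10.0, 15.0, 24.0):
    print(f"theta_max = {theta_max:5.1f} deg -> d_min = {d_min(wavelength, theta_max):.2f} A")
```

Larger values of θmax give smaller values of dmin, which corresponds to higher nominal resolution.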
Without further description, it is believed that one of ordinary skill in the art can, using the preceding description and the following illustrative examples, make and utilize the compounds of the present invention and practice the claimed methods. The following working examples therefore, specifically point out preferred embodiments of the present invention, and are not to be construed as limiting in any way the remainder of the disclosure.
All constructs begin at amino acid 538 of FMS and end at amino acid 922 of FMS. Chimeras were created by replacing FMS KID with KID sequences from RTK's known to be structured, like tie2 and irk. Chimeras are based on structure prediction and sequence alignment.
c-fms fragments from amino acids 538-678 and 753-922 were generated by PCR using a c-fms construct derived by RT-PCR from THP-1 cells. For cloning purposes, a Sal I site was included on the 5′ side of the 538-678 PCR product and a stop codon followed by a Not I site was included on the 3′ end of the 753-922 PCR product. The FGFR1 kinase insert domain was generated by annealing 2 synthesized oligonucleotides corresponding to amino acids 671-679 of c-fms, followed by amino acids 577-617 of FGFR1 and ending with amino acids 753-760 of c-fms. To obtain the chimera, overlapping PCR was performed using the FMS PCR fragments 538-678 and 753-922 and the annealed synthesized FGFR1 kinase insert domain oligonucleotides as a template. The final PCR product was subcloned into pCRII (Invitrogen) and the sequence was confirmed. For expression in SF9 cells, a recombinant baculovirus was generated by subcloning the FMS-FGFR1 chimera into a modified Invitrogen GATEWAY pDEST8 vector, and following the protocol for Baculovirus Expression according to the Bac-to-Bac manual. Other chimeras were generated in a similar manner by using synthetic oligonucleotides corresponding to the KID of TIE2 or IR.
Frozen cells were thawed and resuspended in 50 mM NaKPO4 pH 7.5, 200 mM NaCl, 5% Glycerol, 1 mM Glutathione, 5 mM Imidazole, 1× Complete EDTA-free protease inhibitor cocktail (Roche) (Buffer A). Thawed cells were dounce homogenized, mechanically lysed with an Emulsiflex-C5 (Avestin) at 10,000-15,000 psi and centrifuged at 40,000×g (16,000 rpm) for 1 hour to remove insoluble material. The supernatant was filtered through a 0.45 μm vacuum filter and incubated with a BD Talon metal affinity resin (BD Biosciences Clontech) overnight at 4° C. After washing with 20 column volumes of buffer A containing 10 mM Imidazole, c-fms was eluted using a 10 column volume linear gradient from 10 mM to 200 mM Imidazole in buffer A. Fractions containing c-fms, as assayed by SDS-PAGE, were pooled and combined with 0.2 Units of TEV Protease (Invitrogen)/μg of c-fms to remove the histidine tag. The reaction was dialyzed overnight against 50 mM NaKPO4 pH 7.5, 200 mM NaCl, 5% Glycerol, 2 mM Glutathione and was then incubated with a BD Talon metal affinity resin for two hours to remove TEV protease. Purified c-fms was then filtered through a 0.2 μm filter, concentrated and further purified on a size exclusion column (Superdex 200 HR 10/30, Amersham Biosciences). The buffer used for gel filtration was 50 mM HEPES pH 7.5, 200 mM NaCl, 5 mM Glutathione, 3% Glycerol. Fractions containing c-fms were pooled, passed through a 0.1 μm vacuum filter, incubated with a compound for 2 hours and concentrated to a final concentration of 7 to 11 mg/ml.
(a) An autophosphorylation, fluorescence polarization competition immunoassay was used for compounds with IC50's >10 nM. The assay was performed in black 96-well micro plates (UL BioSystems). The assay buffer used was 100 mM HEPES, pH 7.5, 1 mM DTT, 0.01% (v/v) Tween-20. Compounds were diluted in assay buffer containing 4% DMSO just prior to the assay. To each well, 5 μl of compound were added followed by the addition of 3 μl of a mix containing 33 nM c-fms (3DP) and 16.7 mM MgCl2 (Sigma) in assay buffer. The kinase reaction was initiated by adding 2 μl of 5 mM ATP (Sigma) in assay buffer. The final concentrations in the assay were 10 nM c-fms, 1 mM ATP, 5 mM MgCl2, 2% DMSO. Control reactions were run in each plate: in positive and negative control wells, assay buffer (made 4% in DMSO) was substituted for the compound; in addition, positive control wells received 1.2 μl of 50 mM EDTA.
The plates were covered and incubated at room temperature for 45 min. At the end of the incubation, the reaction was quenched with 1.2 μl of 50 mM EDTA (EDTA was not added to the positive control wells at this point; see above). Following a 5-min incubation, each well received 10 μl of a 1:1:3 mixture of anti-phosphotyrosine antibody, 10X, PTK green tracer, 10X (vortexed), FP dilution buffer, respectively (all from PanVera, cat. # P2837). The plate was covered, incubated for 30 min at room temperature and the fluorescence polarization was read on the Analyst. The instrument settings were: 485 nm excitation filter; 530 nm emission filter; Z height: middle of well; G factor: 0.93. Under these conditions, the fluorescence polarization values for positive and negative controls were ˜300 and ˜150, respectively, and were used to define the 100% and 0% inhibition of the c-fms reaction. The IC50 values reported are the averages of three independent measurements.
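Since the positive and negative controls define 100% and 0% inhibition, a fluorescence polarization reading for a test well can be converted to percent inhibition. The sketch below assumes a simple linear interpolation between the two control values; the source states only that the controls define the two endpoints, so the linear form and the function name are assumptions.

```python
def percent_inhibition(fp_sample, fp_positive, fp_negative):
    """Linearly map FP readings: fp_negative -> 0 %, fp_positive -> 100 %."""
    return 100.0 * (fp_sample - fp_negative) / (fp_positive - fp_negative)

# Approximate control values reported for assay (a).
FP_POSITIVE = 300.0  # EDTA added before the reaction: no kinase activity
FP_NEGATIVE = 150.0  # no compound: uninhibited kinase reaction

# Hypothetical test-well readings, for illustration only.
for reading in (150.0, 225.0, 300.0):
    print(reading, percent_inhibition(reading, FP_POSITIVE, FP_NEGATIVE))
```

An IC50 would then be estimated by fitting percent inhibition against compound concentration across a dilution series, which is outside the scope of this sketch.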
(b) A peptide phosphorylation, fluorescence polarization competition immunoassay was used for compounds with IC50's <10 nM. The assay was performed in black 96-well micro plates. The assay buffer used was 100 mM HEPES, pH 7.5, 1 mM DTT, 0.01% (v/v) Tween-20. Compounds were diluted in assay buffer containing 4% DMSO just prior to the assay. To each well, 5 μl of compound were added followed by the addition of 2 μl of a mix containing 5 nM c-fms and 25 mM MgCl2 in assay buffer. 2 μl of 1540 μM peptide SYEGNSYTFIDPTQ (AnaSpec) were subsequently added. The kinase reaction was initiated by adding 1 μl of 10 mM ATP in assay buffer. The final concentrations in the assay were 1 nM c-fms, 308 μM peptide, 1 mM ATP, 5 mM MgCl2, 2% DMSO. Control reactions were run in each plate: in positive and negative control wells, assay buffer (made 4% in DMSO) was substituted for the compound; in addition, positive control wells received 1.2 μl of 50 mM EDTA. The plates were covered and incubated at room temperature for 80 min. Quenching of the reaction and detection of product formation were performed as described in (a). Fluorescence polarization values for positive and negative controls were ˜290 and ˜160, respectively, and were used to define the 100% and 0% inhibition of the c-fms reaction.
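The final concentrations stated for assays (a) and (b) follow from the stock concentrations and the volumes added to each 10 μl reaction. The following sketch reproduces that arithmetic; the function and variable names are illustrative.

```python
def final_concentration(stock_conc, added_ul, total_ul):
    """Concentration after diluting added_ul of stock into a total of total_ul."""
    return stock_conc * added_ul / total_ul

# Assay (a): 5 ul compound + 3 ul enzyme/MgCl2 mix + 2 ul ATP = 10 ul
total_a = 5 + 3 + 2
assay_a = {
    "c-fms (nM)": final_concentration(33, 3, total_a),
    "MgCl2 (mM)": final_concentration(16.7, 3, total_a),
    "ATP (mM)": final_concentration(5, 2, total_a),
    "DMSO (%)": final_concentration(4, 5, total_a),
}

# Assay (b): 5 ul compound + 2 ul enzyme/MgCl2 mix + 2 ul peptide + 1 ul ATP = 10 ul
total_b = 5 + 2 + 2 + 1
assay_b = {
    "c-fms (nM)": final_concentration(5, 2, total_b),
    "MgCl2 (mM)": final_concentration(25, 2, total_b),
    "peptide (uM)": final_concentration(1540, 2, total_b),
    "ATP (mM)": final_concentration(10, 1, total_b),
    "DMSO (%)": final_concentration(4, 5, total_b),
}

for label, table in (("assay (a)", assay_a), ("assay (b)", assay_b)):
    for component, value in table.items():
        print(f"{label}: {component} = {value:.2f}")
```

For assay (a) the computed values of 9.9 nM c-fms and 5.01 mM MgCl2 agree with the stated 10 nM and 5 mM after rounding.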
Crystallization and Structure Determination
In a typical crystallization experiment, 1-2 μl of c-fms protein complexed with the lead compounds and concentrated to 7-10 mg/ml was mixed in a 1:1 ratio with well solution (15-28% PEG 3350, 100 mM sodium acetate pH 5.0-5.6, 200 mM Li2SO4, 5 mM DTT, 0-3% glycerol) and placed on a glass cover slip. The cover slip was inverted, sealed over a reservoir of 500-1000 μl of well solution and incubated at 22° C. Crystals usually appeared overnight. In most cases, micro-seeding with a seed stock obtained from one of the lead compounds was used to induce crystallization. Crystals were harvested with a nylon loop, placed for less than 10 seconds in cryo-solution (27% PEG 3350, 100 mM sodium acetate pH 5.5, 200 mM Li2SO4, 5 mM DTT, 10% glycerol) and frozen by immersion in liquid nitrogen. Data were collected at 100 K on a Bruker AXS MO6XCE rotating anode and a SMART 6000 CCD detector or at the IMCA-CAT ID-17 beamline at the Argonne National Laboratory. The diffraction data were processed with the Bruker Proteum suite or the HKL suite (Denzo/Scalepack). The initial c-fms structure was solved by molecular replacement using the FGFR crystal structure as a search model in CNX. Structure refinement and model building were carried out according to standard protocols using CNX [10] and O [11].
Crystal form I was obtained with the following characteristics: a=81.07 Å, b=81.07 Å, c=144.67 Å, α=90°, β=90°, γ=120°, space group R3, diffraction limit 1.9 Å (synchrotron), one molecule per asymmetric unit. Crystal form II was obtained with the following characteristics: a=53.1 Å, b=72.4 Å, c=91.7 Å, α=90°, β=90°, γ=90°, space group P212121, diffraction limit 3 Å (synchrotron), one molecule per asymmetric unit.
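As an illustration of how the cell parameters above can be used, the following sketch computes the unit cell volume of each crystal form from the general triclinic volume formula. The function name is hypothetical, and the calculation is offered only as a worked example; it is not part of the structure determination protocol described here.

```python
import math


def unit_cell_volume(a, b, c, alpha, beta, gamma):
    """Unit cell volume from edge lengths and angles given in degrees.

    Uses the general formula
    V = abc * sqrt(1 - cos^2(alpha) - cos^2(beta) - cos^2(gamma)
                   + 2 cos(alpha) cos(beta) cos(gamma)).
    The result is in cubic units of the edge lengths (here cubic angstroms).
    """
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    return a * b * c * math.sqrt(1 - ca * ca - cb * cb - cg * cg + 2 * ca * cb * cg)


# Cell parameters of the two crystal forms as listed above
volume_form_i = unit_cell_volume(81.07, 81.07, 144.67, 90, 90, 120)
volume_form_ii = unit_cell_volume(53.1, 72.4, 91.7, 90, 90, 90)

print(f"Form I:  {volume_form_i:.0f} cubic angstroms")
print(f"Form II: {volume_form_ii:.0f} cubic angstroms")
```

For the orthorhombic form II the formula reduces to the product a·b·c, and for form I (γ=120°) it reduces to a·b·c·sin(120°).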
A highly preferred crystal structure is a crystal structure defined by structure coordinates of c-fms amino acids Trp 550, Lys 586, Thr 587, Leu 588, Gly 589, Val 596, Glu 598, Ala 614, Val 615, Lys 616, Glu 633, Met 637, Leu 640, Ile 646, Val 647, Val 661, Thr 663, Glu 664, Tyr 665, Cys 666, Cys 667, Tyr 668, Gly 669, Asp 670, Asn 673, Arg 677, Cys 774, Ile 775, His 776, Arg 782, Asn 783, Leu 785, Ile 794, Gly 795, Asp 796, Phe 797, Gly 798, Leu 799, Ala 800, Arg 801, Asp 802.
A preferred crystal structure is a crystal structure defined by structure coordinates of c-fms amino acid residues Lys 586, Leu 588, Gly 589, Val 596, Glu 598, Ala 614, Lys 616, Val 647, Thr 663, Glu 664, Tyr 665, Cys 666, Cys 667, Tyr 668, Gly 669, Asp 670, Arg 677, Arg 782, Leu 785, Asp 796, Phe 797, Gly 798, Leu 799, Ala 800, Arg 801, Asp 802.
Another crystal structure of the invention is defined by structure coordinates of c-fms amino acid residues Lys 586, Leu 588, Gly 589, Val 596, Glu 598, Ala 614, Lys 616, Val 647, Thr 663, Glu 664, Tyr 665, Cys 666, Cys 667, Tyr 668, Gly 669, Asp 670, Arg 782, Asn 783, Leu 785, Asp 796, Phe 797, Leu 799, Ala 800, Arg 801.
Ligands of c-fms
Binding Mode of the Arylamide Series
The inhibitor of the arylamide series occupies the nucleotide-binding pocket, located between the N-domain and the C-domain. The carbonyl oxygen of the amide bond forms a hydrogen bond with the amide-N of Cys666 in the hinge region. The five-membered ring together with the cyano group occupies the adenine pocket. A π-π stacking interaction is formed with Phe797 of the DFG motif. Other van der Waals interactions are mediated by the surrounding hydrophobic pocket formed by Val596, Ala614, Lys616, Val647, Thr663, Leu785 and Ala800. The ortho-methyl-piperidine ring is located in the sugar pocket. Arg801 forms the bottom of that pocket, and Asn783 and Gly589 flank the piperidine ring on either side. The methoxy-aryl ring projects into the solvent area and interacts with part of the solvent interface residues, mainly Leu588 and Gly669. A weak hydrogen bonding interaction between the methyl-hydroxy group and the phenol-hydroxy group of Tyr665 can be observed as well. A general description of the arylamide series is set forth below.
Binding Mode of Quinolone Series
The quinolone also occupies the nucleotide-binding pocket, with the chloro-aryl ring located in the adenine pocket. Both the amide oxygen and nitrogen form hydrogen-bonding interactions with the backbone of the hinge residues Cys666 and Glu664. The remaining interactions are mainly hydrophobic in nature and involve residues of the solvent interface and the sugar pocket (Leu588, Gly589, Val596, Gly669, Asp670, Leu785, Phe797, Ala800, Arg801). A general description of the quinolone series is set forth below.
Specific ligands used in this invention:
REMARK c-fms (538-922, FGF chimera) complexed with 793693
REMARK refinement resolution: 500.0-2.8 A
REMARK starting r = 0.2645 free_r = 0.3191
REMARK final r = 0.2617 free_r = 0.3141
CRYST1 82.210 82.210 143.440 90.00 90.00 120.00 R 3
REMARK Written by CNX VERSION: 2000.12
REMARK c-fms (538-922 tie2 chimera) complexed with 1183648
REMARK refinement resolution: 500.0-1.8 A
REMARK starting r = 0.2383 free_r = 0.2832
REMARK final r = 0.2367 free_r = 0.2811
CRYST1 80.440 80.440 143.760 90.00 90.00 120.00 R 3
REMARK Written by CNX VERSION:2002
Although the present invention has been described in detail with reference to examples above, it is understood that various modifications can be made without departing from the spirit of the invention. All cited patents, published patent applications, publications and other documents cited in this application are herein incorporated by reference in their entirety.
Number | Date | Country
---|---|---
60620698 | Oct 2004 | US