The present invention relates to the human orphan nuclear receptors steroidogenic factor-1 (SF-1) and liver receptor homolog-1 (LRH-1) and modulation of the activity of those receptors.
The following description is provided solely to assist the understanding of the reader, and does not constitute an admission that any of the information provided or references cited are prior art to the present invention.
Nuclear receptors constitute a protein superfamily whose members specifically bind particular physiologically relevant small molecules, such as hormones or vitamins. As distinguished from integral membrane receptors and membrane-associated receptors, nuclear receptors are located in either the cytoplasm or nucleus of eukaryotic cells.
In many cases of binding of a molecule to a nuclear receptor, the nuclear receptor changes the ability of a cell to transcribe DNA, i.e., nuclear receptors modulate DNA transcription, but can also have transcription independent effects. Thus nuclear receptors comprise a class of intracellular, soluble ligand-regulated transcription factors. Nuclear receptors include but are not limited to receptors for glucocorticoids, androgens, mineralocorticoids, progestins, estrogens, thyroid hormones, vitamin D, retinoids, and icosanoids. Many nuclear receptors identified by either sequence homology to known receptors (see, e.g., Drewes et al., Mol. Cell. Biol., 1996, 16:925-31) or based on their affinity for specific DNA binding sites in gene promoters (see, e.g., Sladek et al., Genes & Dev., 1990, 4:2353-65) have unascertained ligands and are therefore termed “orphan receptors.”
In a structural context, nuclear receptors are generally characterized by two distinct structural elements. First, nuclear receptors include a DNA binding domain that targets the receptor to specific DNA sequences, which are known as hormone response elements (HREs). The DNA binding domains of these receptors are related in structure and sequence. Second, the C-terminal region of nuclear receptors encompasses the ligand binding domain (LBD). Upon binding a ligand, the receptor adopts a transcriptionally active state.
Steroidogenic factor-1 (SF-1), also known as adrenal 4-binding protein (Ad4BP) and NR5A1, is an essential factor in adrenal and gonadal development and for the proper functioning of the hypothalamic-pituitary-gonadal axis. SF-1 maps to human gene map locus 9q33. SF-1 is a transcription factor which activates the promoters of various adrenal/gonadal steroid hydroxylase genes, as well as a variety of genes essential for endocrine organogenesis (Ikeda et al., Mol. Endocrinol., 1993, 7:852-860; Morohashi et al., Mol. Endocrinol., 1993, 7:1196-1204; and Parker & Schimmer, Endocr. Rev., 1997, 18:361-377). Mammalian SF-1 exhibits significant similarity to Drosophila fushi tarazu factor 1 (Ftz-F1), a regulator of the developmental homeobox gene fushi tarazu (Lavorgna et al., Science, 1992, 252:848-851; and Ueda et al., Genes & Dev., 1990, 4:624-635). The mouse SF-1 gene therefore has been designated mouse Ftz-F1.
SF-1 is conserved across both vertebrate and invertebrate species, indicating a conserved role for the protein in all metazoans (Honda et al., J. Biol. Chem., 1993, 268:7494-7502; Lala et al., Mol. Endocrinol., 1992, 6:1249-1258; Nomura et al., J. Biol. Chem., 1995, 270:7453-7461; Oba et al., Biochem. Biophys. Res. Comm., 1996, 226:261-267; Sun et al., Dev. Biol., 1994, 162:426-437; and Wong et al., J. Mol. Endocrinol., 1996, 17:139-147). SF-1 homologs have been cloned, for example, from silkworm, chicken and frog as well as a variety of mammalian species.
SF-1 is a member of the steroid receptor superfamily, and all SF-1 homologs have a common structural organization that shares several features with other members of the steroid receptor superfamily. A classic zinc finger DNA-binding domain (DBD) is present in the amino-terminal region; this domain confers high affinity binding to the SF-1 cognate response element and is essential for DNA binding and subsequent transcriptional activation (Wilson et al., Science, 1992, 256:107-110; Wilson et al., Mol. Cell. Biol., 1993, 13:5794-5804). The major nuclear import signal also maps to the tandem zinc finger domain.
In contrast to the majority of steroid receptors, which function as dimers in DNA-binding and transcriptional regulation, SF-1 binds DNA as a monomer at an extended AGGTCA site such as the perfect SF-1 binding site, TCAAGGTCA (Wilson et al., supra, 1993). In SF-1 and other monomeric nuclear receptors, amino acid residues carboxy-terminal to the DNA-binding domain, denoted the “A” box, contribute to binding specificity by recognizing nucleotides 5′ to the AGGTCA response element, resulting in an extended monomer response element with increased binding fidelity (Ueda et al., Mol. Cell. Biol., 1992, 12:5667-5672; Wilson et al., supra, 1992; and Wilson et al., supra, 1993). Such monomeric nuclear receptors include liver receptor homolog 1/fetoprotein transcription factor (LRH-1/FTF/SF-1β), nerve growth factor-induced gene-B (NGF-IB), estrogen-related receptor 1 (ERR1), estrogen-related receptor 2 (ERR2) and retinoic acid receptor-related orphan nuclear receptor (ROR).
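As an illustration of the extended monomer response element described above, the following minimal sketch scans a DNA string for exact occurrences of the TCAAGGTCA site on either strand. The function names are hypothetical, and exact string matching is an assumption made for simplicity; it is not a model of actual binding affinity or of tolerated mismatches.

```python
from __future__ import annotations

# Extended monomer response element quoted in the text above.
SF1_SITE = "TCAAGGTCA"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_sites(seq: str, site: str = SF1_SITE) -> list[tuple[int, str]]:
    """Return (0-based start, strand) for every exact match on either strand.

    Hypothetical helper: '+' means the site reads 5' to 3' in `seq` itself,
    '-' means it is found on the complementary strand.
    """
    seq = seq.upper()
    hits = []
    for strand, pattern in (("+", site), ("-", reverse_complement(site))):
        start = seq.find(pattern)
        while start != -1:
            hits.append((start, strand))
            start = seq.find(pattern, start + 1)
    return sorted(hits)
```

Calling `find_sites` on a promoter fragment returns a list of start positions with strand labels, which could serve as a first-pass list of candidate sites for experimental follow-up.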
A variety of genes bound and regulated by SF-1 are known in the art. These SF-1 target genes include, for example, steroidogenic enzymes such as cytochrome P450 cholesterol side-chain cleavage enzyme (P450scc) and other steroidogenic targets such as the ACTH receptor; gonadal SF-1 target genes such as the gene for the male-specific Mullerian inhibiting substance (MIS), which is expressed in the Sertoli cells of the testis and responsible for regression of the female specific Mullerian duct; and pituitary and hypothalamic target genes such as αGSU and the luteinizing hormone β subunit (LHβ). A variety of additional SF-1 target genes are known in the art; see, e.g., Hammer & Ingraham, Frontiers in Neurobiology, 1999, 20:199-223.
Like other members of the steroid receptor superfamily, SF-1 contains a conserved ligand-binding domain positioned at the carboxy-terminus of the receptor and a conserved activation function 2 (AF2) sequence in the carboxy-terminal region of the ligand-binding domain. In many nuclear receptors, this domain confers responsiveness to specific ligands that activate or, in some cases, repress receptor transcriptional activity (Evans, Science, 1988, 240:889-895; Forman et al., Nature, 1998, 395:612-615). While SF-1-dependent transcriptional activity has been shown in one instance to exhibit a modest increase in response to 25-, 26-, and 27-hydroxycholesterol in CV-1 cells (Lala et al., Proc. Natl. Acad. Sci. USA, 1997, 94:4895-4900), a ligand for SF-1 has not been definitively identified, and SF-1 consequently is referred to as an “orphan receptor.”
SF-1 has been shown to have transactivating activity in the absence of exogenous ligand. Two regions have been identified as important for SF-1 transactivation. Point mutations within the conserved AF2 hexamer motif, LLIEML, which is critical for transactivation function of many nuclear receptors (Mangelsdorf et al., Cell, 1995, 83:835-839), abrogated SF-1 activity, as did removal of the distal hinge region that follows the DNA-binding domain. In contrast, much of the ligand-binding domain can be truncated without significantly impairing SF-1 transcriptional activity. Furthermore, in cell lines that support SF-1 transcriptional activity, the AF1 domain of SF-1 is constitutively phosphorylated at serine 203. A nonphosphorylatable mutant, SF-1S203A, consistently exhibited a significant 50-80% reduction in transcriptional activity on the MIS promoter and other promoters as compared to wild-type SF-1 activity. Point mutations in the AF2 hexamer motif also resulted in significant reduction in SF-1 transactivation, and a further reduction in activity was observed when the AF2 hexamer mutation was combined with the S203A mutation (Hammer et al., Mol. Cell, 1999, 3:521-526). In sum, maximal SF-1 transcriptional activity requires both the AF1 in the distal hinge domain and AF2 (Crawford et al., Mol. Endocrinol., 1997, 11:1626-1635; Ito et al., Mol. Cell. Biol., 1997, 17:1476-1483). Two motifs in particular, the phosphorylated Ser 203 and LLIEML hexamer of the AF2 domain, are essential for full SF-1 transcriptional activity.
Consistent with a role for SF-1 as a regulator of steroid hydroxylases, SF-1 is expressed in the primary organs that produce steroid hormones, including adrenal cortical cells, testicular Leydig cells, and ovarian theca and granulosa cells (Ikeda et al., Mol. Endocrinol., 1994, 8:654-662; Sasano et al., J. Clin. Endocrinol. Metab., 1995, 80:2378-2380; Takayama et al., J. Clin. Endocrinol. Metab., 1995, 80:2815-2821). SF-1 also is expressed in the testicular Sertoli cell, the pituitary gonadotrope, and the ventral medial nucleus (VMN) of the hypothalamus (Asa et al., J. Clin. Endocrinol. Metab., 1996, 81:2165-2170; Hatano et al., Develop., 1994, 120:2787-2797; Ikeda et al., supra, 1994; Ingraham et al., Genes & Dev., 1994, 8:2302-2312; Morohashi et al., Mol. Endocrinol., 1993, 7:1196-1204; and Roselli et al., Brain Res. Mol. Brain Res., 1997, 44:66-72). SF-1 transcripts have been detected in spleen and placenta in addition to the gonad, adrenal, pituitary and hypothalamus.
In vivo significance of SF-1 has been demonstrated in SF-1 knockout mice. Homozygous Ftz-F1 −/− mice all died of glucocorticoid and mineralocorticoid insufficiency (Luo et al., Mol. Endocrinol., 1995, 9:1233-1239). The absence of SF-1 resulted in female external genitalia regardless of chromosomal sex, consistent with a role for SF-1 in gonadal formation and synthesis of androgens such as dihydrotestosterone, which is required for development of male external genitalia. Gonads and adrenal glands were completely absent from both sexes. Furthermore, all mice, regardless of chromosomal sex, displayed a female internal reproductive tract (Luo et al., Cell, 1994, 77:481-490; Sadovsky et al., Proc. Natl. Acad. Sci. USA, 1995, 92:10939-10943), consistent with a known role of SF-1 in regulation of Mullerian inhibiting substance (Giuili et al., Development, 1997, 124:1799-1807; Shen et al., Cell, 1994, 77:651-661). In the absence of this inhibitory substance, regression of the Mullerian duct, the precursor of the vagina, uterus and fallopian tube, does not take place. SF-1 null mice also lacked follicle stimulating hormone (FSH) and luteinizing hormone (LH) expression in the anterior pituitary. These results indicate that SF-1 is critical for appropriate development of the adrenals, gonads and pituitary gonadotropes.
The phenotype of the SF-1 null mice parallels the phenotype observed in the human syndrome of X-linked congenital hypoplasia, a disorder which is characterized by hypoplastic adrenal glands often accompanied by profound hypogonadism. The gene responsible for the human syndrome, DAX-1 (dosage-sensitive sex reversal-adrenal hypoplasia congenita critical region on the X chromosome), localizes to Xp21 and, like deletions of SF-1, DAX-1 deletions result in profound adrenal hypoplasia in humans (Muscatelli et al., Nature, 1994, 372:672-676; Zanaria et al., Nature, 1994, 372:635-641). Dax-1 also is an orphan nuclear receptor expressed in multiple endocrine organs; Dax-1 and SF-1 appear to colocalize to cells of the adrenals, gonads, gonadotropes and VMN (Ikeda et al., Mol. Endocrinol., 1995, 9:478-486; Swain et al., Nat. Genetics, 1996, 12:404-409). Together with the similar phenotypes of SF-1 null mice and Dax mutations in humans, these results reinforce the importance of SF-1 and indicate that SF-1 and DAX-1 can work together as essential regulators of the hypothalamic-pituitary-steroidogenesis axis in humans.
Ingraham et al., U.S. Pat. Pub. No. 20040092716, Appl. No. 10/616,897, discusses a properly folded steroidogenic factor-1 (SF-1)-like receptor variant, or active fragment thereof, which has an amino acid sequence that encodes a SF-1-like receptor variant or active fragment thereof and that lacks at least one naturally occurring cysteine residue within the ligand-binding domain of the receptor. This patent publication also discusses a LRH-1 receptor variant or an active fragment thereof that contains a substitution at particular cysteine residues.
Liver receptor homolog-1 (LRH-1) is a second orphan nuclear receptor that has sequence similarity to SF-1. LRH-1 is expressed in liver, intestine, and pancreas, and acts on genes coordinating bile acid synthesis, enterohepatic circulation, and absorption. Gene knockout and heterozygous loss-of-function studies show that both SF-1 and LRH-1 are essential during embryogenesis for normal development of the organs in which they are expressed, and mammalian cell transfection experiments indicate that SF-1 and LRH-1 function as obligate factors for their target genes, acting apparently constitutively. The mouse LRH-1 structure contains a cavity available for potential ligands, but mutations to fill this cavity did not diminish activity, supporting a model of constitutive, ligand-independent function.
LRH-1 is involved in the regulation of a number of different genes, including, for example, steroidogenic acute regulatory protein (Kim et al., J. Clin. Endocrinol. Metab., 2004, 89:3042-3047), apolipoprotein AI (Delerive et al., Mol. Endocrinol., 2004, 18:2378-87), cholesterol 7 alpha-hydroxylase (Qin et al., Mol. Endocrinol., 2004, 18:2424-2439), aromatase (Clyne et al., Mol. Cell. Endocrinol., 2004, 215:39-44), carboxyl ester lipase (Fayard et al., J. Biol. Chem., 2003, 278:35725-31), and cytochrome P450 7A.
Zhao et al., U.S. Pat. Pub. No. 20030077664, application Ser. No. 09/922,226, provides methods of screening for compounds that modulate hormone receptor activity in which an isolated receptor-containing complex is assayed for an altered modification state as compared to a control modification state. The presence of an altered modification state serves to identify an effective agent that modulates a biological activity of the nuclear hormone receptor. Potential receptors mentioned for use in the methods include without limitation RXR, HNF4, TLX, COUP-TF, TR, RAR, PPAR, Rev-erb, ROR, SF-1, LRH-1, EcR, PXR, CAR, NOR1, NURR1, ER, ERR, GR, AR, PR, and MR.
Goodwin et al., U.S. Pat. Pub. No. 2004/0038862, application Ser. No. 10/343,289, concerns a method to identify compounds that modulate bile acid synthesis by assessing the ability of a compound to act as a ligand for short heterodimerizing partner-1 or liver receptor homologue-1, preferably a compound that modulates the interaction of short heterodimerizing partner-1 with liver receptor homologue-1.
In accordance with the present invention, it has been discovered that “orphan” nuclear receptors human steroidogenic factor-1 (SF-1) and liver receptor homolog-1 (LRH-1) bind phospholipid ligands in a ligand binding domain (LBD) pocket. As a result, the invention provides methods for the identification of modulators that bind in the respective LBD pockets of these receptors.
Thus, in a first aspect, the invention provides a method for identifying compounds that bind to the ligand binding domain of SF-1 or LRH-1 by contacting the ligand binding domain with a test compound and determining whether the compound binds to the domain, thereby identifying compounds that bind to the ligand binding domain of SF-1 or LRH-1. Compounds that bind to the ligand binding domain but do not have detectable modulating activity can be useful for development of derivative compounds that are active modulators, but in preferred embodiments, such binding compounds modulate activity of SF-1 or LRH-1. Thus, such binding compounds can be assayed for modulating activity. The method can be carried out for a plurality of compounds, e.g., a large plurality such as at least 100, 500, 1000, 5000, or 10000 compounds. The method can additionally involve determining whether the compound binds in a ligand binding pocket. Such a binding determination can be carried out in a variety of ways, e.g., as a direct binding assay or as a competitive assay in which the test compound competes for binding with a known binding compound, e.g., a molecular scaffold as identified herein. The method can also involve determining whether the compound binds at one or both of the co-activator binding surfaces as identified herein. Such a binding determination can be carried out in a variety of ways, e.g., as a direct binding assay or as a competitive assay in which the test compound competes for binding with a known binding compound, e.g., a phospholipid as identified herein.
Identification of such compounds enables a method for identifying or developing additional compounds active on these receptors, e.g., improved modulators. Such identification includes without limitation determining whether any of a plurality of test compounds active on SF-1 or LRH-1 provides an improvement in one or more desired pharmacologic properties relative to an active reference compound. Thereafter, invention methods comprise selecting a compound, if any, that has an improvement in the desired pharmacologic property, thereby providing an improved modulator. In particular embodiments of aspects of modulator development, the desired pharmacologic property is serum half-life longer than 2 hr or longer than 4 hr or longer than 8 hr, aqueous solubility, oral bioavailability more than 10%, or oral bioavailability more than 20%. In certain embodiments, a plurality of derivatives of an active reference compound (e.g., a compound identified in a method described herein) are used.
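The selection step described above can be sketched as a simple comparison against a reference compound. This is a minimal illustration only: the function name, the property key, and the data layout are hypothetical, and the sketch assumes a property for which a higher value is the improvement (as with serum half-life or oral bioavailability).

```python
def select_improved(candidates, reference, prop, minimum=None):
    """Return sorted names of candidates that improve on the reference.

    candidates: dict mapping compound name -> dict of measured properties
    reference:  dict of the same properties for the active reference compound
    prop:       property key to compare (hypothetical, e.g. "half_life_hr")
    minimum:    optional absolute floor, e.g. 2, 4 or 8 for half-life in hours
    Assumes a larger value of `prop` is the desired direction.
    """
    floor = reference[prop]
    if minimum is not None:
        floor = max(floor, minimum)
    return sorted(name for name, values in candidates.items()
                  if values[prop] > floor)
```

For example, with a reference half-life of 1.5 hr, passing `minimum=4` keeps only those derivatives whose half-life is longer than 4 hr, matching one of the thresholds recited above.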
Also in particular embodiments of aspects of modulator development, the process can be repeated multiple times, i.e., multiple rounds of preparation of derivatives and/or selection of additional related compounds and evaluation of such further derivatives of related compounds, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more additional rounds.
In another aspect, the invention provides a method of designing a ligand that binds to SF-1 or LRH-1, by identifying one or more molecular scaffolds that bind to a binding site of SF-1 or LRH-1 ligand binding domain polypeptide with low affinity; determining the orientation of the one or more molecular scaffolds at the binding site of the polypeptide by obtaining co-crystal structures of the one or more molecular scaffolds in the binding site; and modifying one or more structures of at least one scaffold molecule so as to provide a ligand having altered binding affinity or binding specificity or both for binding to the polypeptide as compared to the binding of the scaffold molecule. The designed ligand(s) can then be provided, e.g., by synthesizing or otherwise obtaining the ligand(s). In particular embodiments, one or more molecular scaffolds interact with at least 3 conserved amino acid residues in a binding pocket of the ligand binding domain and/or with at least 3 residues with which a phospholipid ligand interacts. In another aspect, the invention provides a method of developing altered modulators for SF-1 or LRH-1 by selecting a molecular scaffold from a set of at least 3 molecular scaffolds that bind to SF-1 or LRH-1, and modifying one or more structures of the scaffold molecule so as to provide a ligand having altered binding affinity or binding specificity or both for binding to SF-1 or LRH-1 as compared to the binding of the scaffold molecule.
In particular embodiments, a plurality of distinct compounds are assayed for binding to the binding site of the SF-1 or LRH-1 ligand binding domain polypeptide; co-crystals of the molecular scaffolds bound to the polypeptide are isolated, and the orientation of the molecular scaffold is determined by performing X-ray crystallography on the co-crystals. In further embodiments, the method involves identifying common chemical structures of the molecular scaffolds, placing the molecular scaffolds into groups based on having at least one common chemical structure, and determining the orientation of the one or more molecular scaffolds at the binding site of the polypeptide for at least one representative compound from a plurality of groups; the ligand binds to the target molecule with greater binding affinity or greater binding specificity or both than the molecular scaffold; the orientation of the molecular scaffold is determined by nuclear magnetic resonance in co-crystal structure determination; the plurality of distinct compounds are each assayed for binding to a plurality of members of the NR5A nuclear receptor family.
Also in particular embodiments, after the identification of common chemical structures of the distinct compounds that bind, the compounds are grouped into classes based on common chemical structures and a representative compound from a plurality of the classes is selected for performing X-ray crystallography on co-crystals of the compound and target molecule; the distinct compounds are selected based on criteria selected from molecular weight, clogP, and the number of hydrogen bond donors and acceptors; the clogP is less than 2, and the number of hydrogen bond donors and acceptors is less than 5.
In certain embodiments, the distinct compounds have a molecular weight of from about 100 to about 350 daltons, or more preferably from about 150 to about 350 daltons or from 150 to 300 daltons, or from 200 to 300 daltons. The distinct compounds can be of a variety of structures. In some embodiments, the distinct compounds can have a ring structure, either a carbocyclic or heterocyclic ring, such as for example, a phenyl ring, a pyrrole, imidazole, pyridine, purine, or any ring structure.
In various embodiments, a compound or compounds binds with extremely low affinity, very low affinity, low affinity, moderate affinity, or high affinity; at least about 5% of the binding compounds bind with low affinity (and/or has low activity), or at least about 10%, 15%, or 20% of the compounds bind with low affinity (or very low or extremely low). After the identification of common chemical structures of the distinct compounds that bind, the compounds can be grouped into classes based on common chemical structures and at least one representative compound from at least one, or preferably a plurality, of the classes selected for performing orientation determination, e.g., by X-ray crystallography and/or NMR analysis.
In selecting the distinct compounds for assay in the present invention, the selection can be based on various criteria appropriate for the particular application, such as molecular weight, clogP (or other method of assessing lipophilicity), Polar Surface Area (PSA) (or other indicator of charge and polarity or related properties), and the number of hydrogen bond donors and acceptors. Compounds can also be selected using the presence of specific chemical moieties which, based on information derived from the molecular family, might be indicated as having some affinity for members of the family. Compounds with highly similar structures and/or properties can be identified and grouped using computational techniques to facilitate the selection of a representative subset of the group. As indicated above, in preferred embodiments, the molecular weight is from about 150 to about 350 daltons, more preferably from 150 to 300 daltons. The clogP is preferably less than 2, the number of hydrogen bond donors and acceptors is preferably less than 5, and the PSA is preferably less than 100. Compounds can be selected that include chemical structures of drugs having acceptable pharmacological properties and/or lacking chemical structures that are known to result in undesirable pharmacological properties, e.g., excessive toxicity and lack of solubility.
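The preferred selection criteria recited above can be expressed as a simple property filter. The following is a minimal sketch, assuming the properties have already been computed by some other tool; the `Compound` class and `passes_filter` function are hypothetical names. Two points are assumptions rather than statements from the text: the hydrogen bond limit is applied to the combined count of donors and acceptors, and the PSA is taken to be in square angstroms.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Compound:
    """Precomputed properties for one candidate compound (hypothetical layout)."""
    name: str
    mol_weight: float       # daltons
    clogp: float            # calculated logP
    h_bond_donors: int
    h_bond_acceptors: int
    psa: float              # polar surface area; units assumed to be square angstroms


def passes_filter(c, mw_range=(150.0, 350.0), max_clogp=2.0,
                  max_hbond=5, max_psa=100.0):
    """Return True if the compound meets all preferred criteria.

    Defaults follow the preferred values in the text: molecular weight
    about 150 to 350 daltons, clogP < 2, donors plus acceptors < 5
    (combined count is an assumption), PSA < 100.
    """
    low, high = mw_range
    return (low <= c.mol_weight <= high
            and c.clogp < max_clogp
            and c.h_bond_donors + c.h_bond_acceptors < max_hbond
            and c.psa < max_psa)
```

Passing `mw_range=(150.0, 300.0)` applies the narrower, more preferred molecular weight window instead of the default.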
In some embodiments, the assay is an enzymatic assay, and the number of groups of molecular scaffolds formed can conveniently be about 500. In some embodiments, the assay is a competition assay, e.g., a binding competition assay. Cell-based assays can also be used. As indicated above, compounds can be used that have low, very low, or extremely low activity in a biochemical or cell-based assay.
The modification of a molecular scaffold can be the addition, subtraction, or substitution of a chemical group. The modification may desirably cause the scaffold to be actively transported to or into or out of particular cells and/or a particular organ. In various embodiments, the modification of the compound includes the addition or subtraction of a chemical atom, substituent or group, such as, for example, a hydrogen, alkyl, alkoxy, phenoxy, alkenyl, alkynyl, phenylalkyl, hydroxyalkyl, haloalkyl, aryl, arylalkyl, alkyloxy, alkylthio, alkenylthio, phenyl, phenylalkyl, phenylalkylthio, hydroxyalkyl-thio, alkylthiocarbamylthio, cyclohexyl, pyridyl, piperidinyl, alkylamino, amino, nitro, mercapto, cyano, hydroxyl, a halogen atom, halomethyl, an oxygen atom (e.g., forming a ketone, ether or N-oxide), and a sulphur atom (e.g., forming a thiol, thione, sulfonamide or di-alkylsulfoxide (sulfone)).
In certain embodiments, the information provided by performing X-ray crystallography on the co-crystals is provided to a computer program, wherein the computer program provides a measure of the interaction between the molecular scaffold and the protein and a prediction of changes in the interaction between the molecular scaffold and the protein that result from specific modifications to the molecular scaffold, and the molecular scaffold is chemically modified based on the prediction of the biochemical result. The computer program can provide the prediction based on a virtual assay such as, for example, virtual docking of the compound to the protein, shape-based matching, molecular dynamics simulations, free energy perturbation studies, and similarity to a three-dimensional pharmacophore. A variety of such programs are well-known in the art.
Chemical modification of a chemically tractable structure can result in, or be selected to provide, one or more physical changes, e.g., to result in a ligand that fills a void volume in the protein-ligand complex, or in an attractive polar interaction being produced in the protein-ligand complex. The modification can also result in a sub-structure of the ligand being present in a binding pocket of the protein binding site when the protein-ligand complex is formed. After common chemical structures of the compounds that bind are identified, the compounds can be grouped based on having a common chemical sub-structure and a representative compound from each group (or a plurality of groups) can be selected for co-crystallization with the protein and performance of the X-ray crystallography. The X-ray crystallography is preferably performed on the co-crystals under distinct environmental conditions, such as at least 20, 30, 40, or 50 distinct environmental conditions, or more preferably under about 96 distinct environmental conditions. The X-ray crystallography and the modification of a chemically tractable structure of the compound can each be performed a plurality of times, e.g., 2, 3, 4, or more rounds of crystallization and modification.
Also in certain embodiments, one or more molecular scaffolds are selected which bind to a plurality of nuclear receptors, such as members of the NR5A group of nuclear receptors.
The method can also include the identification of conserved residues in a binding site(s) of a SF-1 or LRH-1 ligand binding domain polypeptide, that interact with a molecular scaffold, ligand or other binding compound. Conserved residues can, for example, be identified by sequence alignment of different members of the NR5A family and/or homologs of SF-1 or LRH-1, and identifying binding site residues that are the same or at least similar between multiple members of the group. Interacting residues can be characterized as those within a selected distance from the binding compound(s), e.g., 3, 3.5, 4, 4.5, or 5 angstroms.
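Identification of conserved positions from a sequence alignment, as described above, can be sketched as follows. This is a minimal illustration that assumes the sequences have already been aligned to equal length by an external tool and that treats only identical residues as conserved; the text also contemplates residues that are "at least similar", which this sketch does not handle. The function name is hypothetical.

```python
def conserved_columns(aligned, gap="-"):
    """Return 0-based alignment columns where every sequence has the same residue.

    aligned: list of equal-length strings from a precomputed alignment
    gap:     character used for alignment gaps; gapped columns are skipped
    Only strict identity is checked (an assumption made for simplicity).
    """
    if not aligned:
        return []
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("aligned sequences must have equal length")
    first = aligned[0]
    return [i for i in range(length)
            if first[i] != gap and all(seq[i] == first[i] for seq in aligned)]
```

The returned column indices would then be intersected with the binding site positions to obtain conserved binding site residues.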
As used in connection with binding of a compound with a target, the term “interact” indicates that the distance from a bound compound to a particular amino acid residue will be 5.0 angstroms or less. In particular embodiments, the distance from the compound to the particular amino acid residue is 4.5 angstroms or less, 4.0 angstroms or less, or 3.5 angstroms or less. Such distances can be determined, for example, using co-crystallography, or estimated using computer fitting of a compound in an active site.
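The distance criterion defined above can be illustrated with a short calculation over atomic coordinates. This is a minimal sketch, assuming coordinates in angstroms taken from a co-crystal structure or a docking model; the function names and residue labels are hypothetical. Measuring the distance as the closest pair of atoms between the compound and the residue is an assumption, since the text does not specify which atoms are used.

```python
import math


def min_distance(atoms_a, atoms_b):
    """Smallest Euclidean distance between any atom in A and any atom in B."""
    return min(math.dist(a, b) for a in atoms_a for b in atoms_b)


def interacting_residues(ligand_atoms, residue_atoms, cutoff=5.0):
    """Return sorted residue labels within `cutoff` angstroms of the ligand.

    ligand_atoms:  list of (x, y, z) tuples for the bound compound
    residue_atoms: dict mapping a residue label -> list of (x, y, z) tuples
    cutoff:        5.0 by default; 4.5, 4.0 or 3.5 for the narrower embodiments
    Closest atom-pair distance is used (an assumption, see lead-in).
    """
    return sorted(label for label, atoms in residue_atoms.items()
                  if min_distance(ligand_atoms, atoms) <= cutoff)
```

Tightening `cutoff` to 3.5 restricts the result to the closest contacts, which corresponds to the narrowest embodiment recited above.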
In a related aspect, the invention provides a method of designing a ligand that binds to at least one member of the NR5A family, by identifying as molecular scaffolds one or more compounds that bind to binding sites of a plurality of members of the NR5A family, determining the orientation of one or more molecular scaffolds at the binding site of a NR5A receptor(s) to identify chemically tractable structures of the scaffold(s) that, when modified, alter the binding affinity or binding specificity between the scaffold(s) and the receptor(s), and synthesizing a ligand wherein one or more of the chemically tractable structures of the molecular scaffold(s) is modified to provide a ligand that binds to the receptor with altered binding affinity or binding specificity relative to binding of the scaffold.
Particular embodiments include those described for the preceding aspect.
The invention also provides a method to identify interaction properties that a likely SF-1 or LRH-1 binding compound will possess, thereby allowing, for example, more efficient selection of compounds for structure activity relationship determinations and/or for selection for screening. Thus, another aspect concerns a method for identifying binding characteristics of a ligand of a NR5A protein (e.g., SF-1 or LRH-1), by identifying at least one conserved interacting residue in the receptor that interacts with at least two binding compounds; and identifying at least one common interaction property of those binding compounds with the conserved residue(s). The interaction property and location with respect to the structure of the binding compound defines the binding characteristic.
In various embodiments, the identification of conserved interacting residues involves comparing (e.g., by sequence alignment) a plurality of amino acid sequences in the NR5A family and identifying binding site residues conserved in that family; identification of binding site residues by determining co-crystal structure(s); identifying interacting residues (preferably conserved residues) within a selected distance of the binding compounds, e.g., 3, 3.5, 4, 4.5, or 5 angstroms; the interaction property involves hydrophobic interaction, charge-charge interaction, hydrogen bonding, charge-polar interaction, polar-polar interaction, or combinations thereof.
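The idea of a common interaction property shared by at least two binding compounds at a conserved residue can be sketched as a counting step. This is a minimal illustration only; the data layout, function name, and residue and compound labels are hypothetical, and the interaction types are assumed to have been assigned beforehand (for example, from co-crystal structures).

```python
from collections import Counter


def common_interactions(contacts, conserved, min_compounds=2):
    """Return sorted (residue, property) pairs shared by enough compounds.

    contacts:      dict mapping compound name -> iterable of
                   (residue label, interaction property) pairs
    conserved:     set of residue labels regarded as conserved
    min_compounds: minimum number of compounds that must share the pair
    """
    counts = Counter()
    for pairs in contacts.values():
        # set() ensures a pair is counted once per compound
        for residue, prop in set(pairs):
            if residue in conserved:
                counts[(residue, prop)] += 1
    return sorted(pair for pair, n in counts.items() if n >= min_compounds)
```

Each returned pair names a conserved residue together with the interaction type that multiple binding compounds make with it, which corresponds to one candidate binding characteristic.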
Another related aspect concerns a method for developing ligands for SF-1 or LRH-1 using a set of scaffolds. The method involves selecting one or both of those receptors, selecting a molecular scaffold, or a compound from a scaffold group, from a set of at least 3 scaffolds or scaffold groups where each of the scaffolds or compounds from each scaffold group are known to bind to the target. In particular embodiments, the set of scaffolds or scaffold groups is at least 4, 5, 6, 7, 8, or even more scaffolds or scaffold groups.
In another aspect the invention provides a method of identifying a modulator of a SF-1 or LRH-1 polypeptide by designing or selecting a compound that interacts with amino acid residues in a ligand binding site of the SF-1 or LRH-1 polypeptide, based upon a crystal structure of the respective ligand binding domain polypeptide, e.g., a structure of such a peptide in complex with one or more of a ligand and a coactivator polypeptide. The method can also involve synthesizing the modulator, and/or determining whether the compound modulates the activity of the SF-1 or LRH-1 polypeptide. Compounds that modulate SF-1 or LRH-1 are thus identified as modulators.
In certain embodiments the amino acid residues are conserved residues; are residues that interact with a phospholipid ligand as described herein; include at least 3, 4, 5, 6, or more conserved residues; include at least 3, 4, 5, 6, or more residues that interact with a phospholipid ligand as described herein; or include at least 2, 3, 4, or more residues that, when mutated from wild-type to a non-similar amino acid residue, change the level of transcription or expression of a gene regulated by SF-1 or LRH-1 by at least 20% in an assay appropriate for determining such transcription or expression level (in particular embodiments, the gene is one identified herein as regulated by SF-1 or LRH-1).
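As a simple worked illustration of the 20% criterion, the following sketch computes the relative change in a measured transcription or expression level between wild-type and mutant receptor. It assumes that the change is expressed relative to the wild-type level and may be in either direction. The function names and the example values are hypothetical.

```python
def percent_change(wild_type_level, mutant_level):
    """Relative change of the mutant level versus the wild-type level, in percent."""
    if wild_type_level <= 0:
        raise ValueError("wild-type level must be positive")
    return 100.0 * (mutant_level - wild_type_level) / wild_type_level

def meets_threshold(wild_type_level, mutant_level, threshold_pct=20.0):
    """True if the level changes by at least `threshold_pct` percent, up or down."""
    return abs(percent_change(wild_type_level, mutant_level)) >= threshold_pct

# Hypothetical reporter readouts in arbitrary units.
decreased = meets_threshold(100.0, 75.0)   # 25% decrease
unchanged = meets_threshold(100.0, 90.0)   # 10% decrease
increased = meets_threshold(100.0, 125.0)  # 25% increase
```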
The invention also provides a method of designing a modulator that modulates the activity of a SF-1 or LRH-1 by evaluating the three-dimensional structure of crystallized SF-1 or LRH-1 ligand binding domain polypeptide complexed with one or more of a ligand and a co-activator polypeptide, and synthesizing or selecting a compound based on the three-dimensional structure of the crystal complex that will bind to the polypeptide. Optionally, such a compound binds to the polypeptide as a potential modulator. The method can also involve determining whether the compound modulates the activity of a SF-1 or LRH-1; such determination can include determination of specificity (e.g., specificity between SF-1 and LRH-1, or specificity between SF-1 or LRH-1 and other members of the NR5A nuclear receptor family, or between SF-1 or LRH-1 and other nuclear receptors).
In another aspect, the invention concerns a method of screening for a modulator of SF-1 or LRH-1. The method involves contacting SF-1 or LRH-1 ligand binding domain polypeptide with a plurality of test compounds and determining whether any of the compounds bind with the ligand binding domain polypeptide. The method can also involve determining whether the compound binds in a LBD phospholipid binding pocket or at one or both of the coactivator binding surfaces as identified herein. Such a binding determination can be carried out as a direct binding assay or as a competitive assay in which the test compound competes for binding with a known binding compound, e.g., a phospholipid as identified herein. Test compounds that bind with SF-1 or LRH-1 can also be assayed for ability to modulate SF-1 or LRH-1 activity.
Additional variants of methods for identifying nuclear receptor modulators that can be applied to SF-1 and LRH-1 are described in Bledsoe et al., U.S. Pat. Pub. No. 2004/0018560, application Ser. No. 10/418,007, which is incorporated herein by reference in its entirety.
In another aspect, the invention provides a protein crystal comprising a substantially pure SF-1 ligand binding domain polypeptide optionally comprising a ligand, or a LRH-1 ligand binding domain optionally comprising a ligand. In further embodiments of this aspect, the ligand is a phospholipid ligand.
Preferably, the crystalline form has lattice constants as shown in Table 1 and/or has coordinates as specified in Table 2 or Table 3. In certain embodiments, the ligand is a phospholipid.
The invention also provides a method for obtaining a crystal of SF-1 or LRH-1 ligand binding domain by subjecting substantially pure SF-1 or LRH-1, in the presence of a coactivator peptide and/or a ligand (e.g., a phospholipid ligand as described herein), to conditions substantially equivalent to the crystallization conditions described in the Examples herein.
A related aspect concerns a method for determining the three-dimensional structure of a crystallized SF-1 or LRH-1 ligand binding domain polypeptide in complex with one or more of a ligand and a coactivator polypeptide to a resolution of about 2.8 angstroms or better. In certain embodiments, the method includes: (a) crystallizing a SF-1 or LRH-1 ligand binding domain polypeptide in complex with one or more of a ligand and a coactivator polypeptide to form a crystallized complex; and (b) analyzing the crystallized complex to determine the three-dimensional structure of the SF-1 or LRH-1 ligand binding domain polypeptide in complex with one or more of a ligand and a coactivator polypeptide, whereby the three-dimensional structure of a crystallized SF-1 or LRH-1 ligand binding domain polypeptide in complex with one or more of a ligand and a coactivator polypeptide is determined to a resolution of about 2.8 angstroms or better. It is also preferable that the ligand is a phospholipid, e.g., as described herein.
The invention also provides a modified SF-1 or LRH-1 ligand binding domain, e.g., a domain which is modified as described in the Examples herein. In particular embodiments, the domain is SF-1 ligand binding domain which is modified by substitution or deletion of surface cysteines, C247 and/or C412. The modification can be substitution by serine residues.
As is conventional, the terms “a” and “an” mean “one or more” when used herein, including in the claims.
As used herein, the term “expression” generally refers to the cellular processes by which a polypeptide is produced from RNA.
As used herein, the term “transcription factor” means a cytoplasmic or nuclear protein which binds to a gene, or binds to an RNA transcript of a gene, or binds to another protein which binds to a gene or an RNA transcript or another protein which in turn binds to a gene or an RNA transcript, so as to thereby modulate expression of the gene. Such modulation can additionally be achieved by other mechanisms; the essence of a “transcription factor for a gene” pertains to a factor that alters the level of transcription of the gene in some way.
As used herein in connection with polynucleotides and polypeptides, the term “isolated” means that the molecule is separated from a substantial amount of other nucleic acids, proteins, lipids, carbohydrates or other materials with which it associates, such association being either in cellular material or in a synthesis medium. For example, the polynucleotide or polypeptide can be separated from 50, 60, 70, 80, 90, 95, 97, 98, 99% or more of such other materials.
As used herein, the term “substantially pure” means that the polynucleotide or polypeptide is substantially free of other polynucleotides and/or polypeptides, and thus constitutes at least 50, 60, 70, 80, 90, 95, 97, 98, 99% or more of a sample or preparation as the substantially pure polynucleotide or polypeptide.
As used herein, the term “modified” means an alteration from an entity's normally occurring state. An entity can be modified by removing discrete chemical units or by adding discrete chemical units. The term “modified” encompasses detectable labels as well as those entities added as aids in purification and entities added or removed as aids in crystallization.
As used herein, the terms “structure coordinates” and “structural coordinates” mean mathematical coordinates derived from mathematical equations related to the patterns obtained on diffraction of a monochromatic beam of X-rays by the atoms (scattering centers) of a molecule in crystal form. The diffraction data are used to calculate an electron density map of the repeating unit of the crystal. The electron density maps are used to establish the positions of the individual atoms within the unit cell of the crystal.
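For orientation, the electron density calculation referred to above is conventionally written as a Fourier summation over the measured reflections. The expression below is the standard textbook form and is not specific to the present structures. Here V is the unit cell volume, |F(hkl)| is the structure factor amplitude obtained from the diffraction intensities, and α(hkl) is the corresponding phase, which must be obtained separately (e.g., by molecular replacement or isomorphous replacement).

```latex
\rho(x, y, z) = \frac{1}{V} \sum_{h} \sum_{k} \sum_{l}
  \left| F(hkl) \right| \, e^{\, i \alpha(hkl)} \, e^{-2 \pi i (hx + ky + lz)}
```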
As used herein, the term “space group” means the arrangement of symmetry elements of a crystal.
As used herein, the term “molecular replacement” means a method that involves generating a preliminary model of, for example, the wild-type SF-1 ligand binding domain, or a SF-1 mutant crystal whose structure coordinates are unknown, by orienting and positioning a molecule whose structure coordinates are known within the unit cell of the unknown crystal so as best to account for the observed diffraction pattern of the unknown crystal. Phases can then be calculated from this model and combined with the observed amplitudes to give an approximate Fourier synthesis of the structure whose coordinates are unknown. This, in turn, can be subject to any of the several forms of refinement to provide a final, accurate structure of the unknown crystal. See, e.g., Lattman, 1985, Methods Enzymol., 115: 55-77; Rossmann (ed.), 1972, The Molecular Replacement Method, Gordon & Breach, New York. Using the structure coordinates of a SF-1 or LRH-1 ligand binding domain provided by the present invention, molecular replacement can be used to determine the structure coordinates of a crystalline mutant or homologue of a SF-1 or LRH-1 ligand binding domain, or of a different crystal form of the SF-1 or LRH-1 ligand binding domain.
As used herein, the term “isomorphous replacement” means a method of using heavy atom derivative crystals to obtain the phase information necessary to elucidate the three-dimensional structure of a native crystal (Blundell et al., Protein Crystallography, 1976, Academic Press; Otwinowski, in Isomorphous Replacement and Anomalous Scattering, (Evans & Leslie, eds.), 1991, 80-86, Daresbury Laboratory, Daresbury, United Kingdom). The phrase “heavy-atom derivatization” is synonymous with the term “isomorphous replacement.”
As used herein, the term “polypeptide” means a polymer of amino acids, regardless of its size. Although “protein” is often used in reference to relatively large polypeptides, and “peptide” is often used in reference to small polypeptides, usage of these terms in the art overlaps and varies. The term “polypeptide” as used herein refers to peptides, polypeptides and proteins, unless clearly indicated to the contrary. The terms “protein”, “polypeptide” and “peptide” are used interchangeably herein when referring to a gene product.
As used herein, the term “modulate” means an increase, decrease, or other alteration of any, or all, chemical and biological activities or properties of a wild-type or mutant SF-1 or LRH-1 polypeptide. The term “modulation” as used herein refers to both upregulation (i.e., activation or stimulation) and downregulation (i.e. inhibition or suppression) of a response. Thus a modulator may be either an agonist or an antagonist.
As used herein, the term “gene” is used for simplicity to refer to a functional protein, polypeptide or peptide encoding unit. As will be understood by those in the art, this functional term includes both genomic sequences and cDNA sequences.
As used herein, the term “intron” means a DNA sequence present in a given gene that is not translated into protein.
As used herein, the term “agonist” means an agent that increases, supplements, or potentiates the bioactivity of a functional gene or protein, e.g., SF-1 or LRH-1.
As used herein, the term “antagonist” means an agent that decreases or inhibits the bioactivity of a functional gene or protein, e.g., SF-1 or LRH-1.
As used herein in connection with SF-1 and LRH-1 modulating compounds, binding compounds or ligands, the terms “specific for SF-1”, “specific for LRH-1” and terms of like import mean that a particular compound binds to the specified receptor to a statistically greater extent than to other biomolecules that may be present in a particular organism, e.g., at least 2, 3, 4, 5, 10, 20, 50, 100, or 1000-fold. Also, where biological activity other than binding is indicated, the term “specific for SF-1” or “specific for LRH-1” indicates that a particular compound has greater biological activity associated with binding to the specified receptor than to other biomolecules (e.g., at a level as indicated for binding specificity). Similarly, the specificity can be for the specific receptor with respect to other nuclear receptors that may be present from an organism. In particular embodiments, the specificity is between SF-1 and LRH-1.
As used herein, the terms “ligand” and “modulator” are used equivalently to refer to a compound that alters the activity of a target biomolecule, e.g., SF-1 or LRH-1. Generally a ligand or modulator will be a small molecule, where “small molecule” refers to a compound with a molecular weight of 1500 daltons or less, or preferably 1000 daltons or less, 800 daltons or less, or 600 daltons or less. Thus, an “improved ligand” is one that possesses better pharmacological and/or pharmacokinetic properties than a reference compound, where “better” can be defined by a person for a particular biological system or therapeutic use. In terms of the development of ligands from scaffolds, a ligand is a derivative of a molecular scaffold that has been chemically modified at one or more chemically tractable structures to bind to the target molecule with altered binding affinity or binding specificity relative to the molecular scaffold. The ligand can bind with a greater specificity and/or affinity for a member of the molecular family relative to the molecular scaffold. A ligand binds non-covalently to a target molecule, which can preferably be a protein or enzyme.
In the context of binding compounds, molecular scaffolds, and ligands, the term “derivative” or “derivative compound” refers to a compound having a common core chemical structure relative to a parent or reference compound, but differs by having at least one structural difference, e.g., by having one or more substituents added and/or removed and/or substituted, and/or by having one or more atoms substituted with different atoms. Unless clearly indicated to the contrary, the term “derivative” does not mean that the derivative is synthesized using the parent compound as a starting material or as an intermediate, although in some cases, the derivative may be synthesized from the parent.
Thus, the term “parent compound” refers to a reference compound for another compound, having structural features also present in the derivative compound. Often but not always, a parent compound has a simpler chemical structure than the derivative.
Also in the context of compounds binding to a biomolecular target, the term “greater specificity” indicates that a compound binds to a specified target to a greater extent than to another biomolecule or biomolecules that may be present under relevant binding conditions, where binding to such other biomolecules produces a different biological activity than binding to the specified target. In some cases, the specificity is with reference to a limited set of other biomolecules, e.g., in the case of SF-1 and LRH-1, in some cases the reference may be other nuclear receptors, or for SF-1 it may be LRH-1 and for LRH-1 it may be SF-1. In particular embodiments, the greater specificity is at least 2, 3, 4, 5, 8, 10, 50, 100, 200, 400, 500, or 1000-fold greater specificity.
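As a minimal sketch of how a fold specificity might be quantified, the example below compares dissociation constants for the specified target and for another biomolecule. The use of dissociation constants as the measure of binding extent is an assumption made here for illustration, since the text above does not prescribe a particular measure. The function names and numerical values are hypothetical.

```python
# Fold levels recited in the paragraph above.
FOLD_LEVELS = (2, 3, 4, 5, 8, 10, 50, 100, 200, 400, 500, 1000)

def fold_specificity(kd_target, kd_other):
    """Ratio of dissociation constants; a larger value means tighter binding to the target."""
    if kd_target <= 0 or kd_other <= 0:
        raise ValueError("dissociation constants must be positive")
    return kd_other / kd_target

def highest_level_met(fold, levels=FOLD_LEVELS):
    """Return the largest recited fold level that `fold` reaches, or None if none is reached."""
    met = [level for level in levels if fold >= level]
    return max(met) if met else None

# Hypothetical values in nM: 50 nM for the target, 5000 nM for the comparison receptor.
fold = fold_specificity(kd_target=50.0, kd_other=5000.0)
level = highest_level_met(fold)
```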
Another aspect of the invention concerns novel compounds that bind to a ligand binding domain of SF-1 or LRH-1 and make interactions with amino acids in the ligand binding domain pocket that interact with the phospholipids identified herein.
A related aspect of this invention concerns pharmaceutical compositions that include such a binding compound and at least one pharmaceutically acceptable carrier, excipient, or diluent. The composition can include a plurality of different pharmacologically active compounds.
As used herein, the term “pharmaceutical composition” refers to a preparation that includes a therapeutically significant quantity of an active agent, that is prepared in a form adapted for administration to a subject. Thus, the preparation does not include any component or components in such quantity that a reasonably prudent medical practitioner would find the preparation unsuitable for administration to a normal subject. In many cases, such a pharmaceutical composition is a sterile preparation.
In a related aspect, the invention provides kits that include a pharmaceutical composition as described herein. In particular embodiments, the pharmaceutical composition is packaged, e.g., in a vial, bottle, flask, which may be further packaged, e.g., within a box, envelope, or bag; the pharmaceutical composition is approved by the U.S. Food and Drug Administration or similar regulatory agency for administration to a mammal, e.g., a human; the pharmaceutical composition is approved for administration to a mammal, e.g., a human for a SF-1- or LRH-1-mediated disease or condition; the kit includes written instructions or other indication that the composition is suitable or approved for administration to a mammal, e.g., a human, for a SF-1- or LRH-1-mediated disease or condition; the pharmaceutical composition is packaged in unit dose or single dose form, e.g., single dose pills, capsules, or the like.
In another related aspect, such binding compounds can be used in the preparation of a medicament for the treatment of a SF-1- or LRH-1-mediated disease or condition or a disease or condition in which modulation of one of those nuclear receptors provides a therapeutic benefit.
In another aspect, the invention concerns a method of treating or prophylaxis of a disease or condition in a mammal, e.g., a SF-1- or LRH-1-mediated disease or condition or a disease or condition in which modulation of one of those receptors provides a therapeutic benefit, by administering to the mammal a therapeutically effective amount of a compound that binds in the ligand binding domain pocket, a prodrug of such compound, or a pharmaceutically acceptable salt of such compound or prodrug. The compound can be alone or can be part of a pharmaceutical composition. In a further embodiment, the invention provides a method of treating or prophylaxis of a disease or condition in a mammal, e.g., a SF-1- or LRH-1-mediated disease or condition or a disease or condition in which modulation of one of those receptors provides a therapeutic benefit, by administering to the mammal a therapeutically effective amount of a compound that modulates the activity of SF-1 or LRH-1, a prodrug of such compound, or a pharmaceutically acceptable salt of such compound or prodrug. In a preferred embodiment, the SF-1 or LRH-1 modulator is designed according to a method for designing a ligand that binds to SF-1 or LRH-1 as described herein.
In aspects and embodiments involving treatment or prophylaxis of a disease or condition, the disease or condition includes without limitation elevated cholesterol level, cancer, hepatitis virus infection, and improper development or risk of improper development.
As used herein, the terms “SF-1-mediated” and “LRH-1-mediated” disease or condition and like terms refer to a disease or condition in which the biological function of the specified receptor affects the development and/or course of the disease or condition, and/or in which modulation of the receptor alters the development, course, and/or symptoms of the disease or condition. Similarly, the phrases “SF-1 modulation provides a therapeutic benefit” and “LRH-1 modulation provides a therapeutic benefit” and the like indicate that modulation of the level of activity of the specified receptor in a subject reduces the severity and/or duration of the disease or condition, reduces the likelihood or delays the onset of the disease or condition, and/or causes an improvement in one or more symptoms of the disease or condition.
In the present context, the term “therapeutically effective” indicates that the materials or amount of material are effective to prevent, alleviate, or ameliorate one or more symptoms of a disease or medical condition, and/or to prolong the survival of the subject being treated.
The term “pharmaceutically acceptable” indicates that the indicated material does not have properties that would cause a reasonably prudent medical practitioner to avoid administration of the material to a patient, taking into consideration the disease or conditions to be treated and the respective route of administration. For example, it is commonly required that such a material be essentially sterile, e.g., for injectables.
“A pharmaceutically acceptable salt” is intended to mean a salt that retains the biological effectiveness of the free acids and bases of the specified compound and that is not biologically or otherwise unacceptable. A compound of the invention may possess a sufficiently acidic functional group, a sufficiently basic functional group, or both, and accordingly react with any of a number of inorganic or organic bases, and inorganic and organic acids, to form a pharmaceutically acceptable salt. Exemplary pharmaceutically acceptable salts include those salts prepared by reaction of the compounds of the present invention with a mineral or organic acid or an inorganic base, such as salts including sodium, chloride, sulfates, pyrosulfates, bisulfates, sulfites, bisulfites, phosphates, monohydrogenphosphates, dihydrogenphosphates, metaphosphates, pyrophosphates, chlorides, bromides, iodides, acetates, propionates, decanoates, caprylates, acrylates, formates, isobutyrates, caproates, heptanoates, propiolates, oxalates, malonates, succinates, suberates, sebacates, fumarates, maleates, butyne-1,4-dioates, hexyne-1,6-dioates, benzoates, chlorobenzoates, methylbenzoates, dinitrobenzoates, hydroxybenzoates, methoxybenzoates, phthalates, sulfonates, xylenesulfonates, phenylacetates, phenylpropionates, phenylbutyrates, citrates, lactates, γ-hydroxybutyrates, glycollates, tartrates, methanesulfonates, propanesulfonates, naphthalene-1-sulfonates, naphthalene-2-sulfonates, and mandelates.
The term “pharmaceutically acceptable metabolite” refers to a pharmacologically acceptable product, which may be an active product, produced through metabolism of a specified compound (or salt thereof) in the body of a subject or patient. Metabolites of a compound may be identified using routine techniques known in the art, and their activities determined using tests such as those described herein. For example, in some compounds, one or more alkoxy groups can be metabolized to hydroxyl groups while retaining pharmacologic activity and/or carboxyl groups can be esterified, e.g., glucuronidation. In some cases, there can be more than one metabolite, where an intermediate metabolite(s) is further metabolized to provide an active metabolite. For example, in some cases a derivative compound resulting from metabolic glucuronidation may be inactive or of low activity, and can be further metabolized to provide an active metabolite.
In another aspect, the invention provides a method for identifying structurally and energetically allowed sites on a binding compound for attachment of an additional component(s) by analyzing the orientation of the binding compound(s) in a SF-1 or LRH-1 binding site (e.g., by analyzing co-crystal structures), thereby identifying accessible sites on the compound for attachment of the additional component. In particular embodiments, the binding compound is a phospholipid, e.g., as described herein.
In various embodiments, the method involves calculating the change in binding energy on attachment of the additional component at one or more of the accessible sites; the orientation is determined by co-crystallography; the additional component includes a linker, a label such as a fluorophore, a solid phase material such as a gel, bead, plate, chip, or well.
In a related aspect, the invention provides a method for attaching a SF-1 or LRH-1 binding compound to an attachment component(s) without substantially altering the ability of the SF-1 or LRH-1 binding compound to bind SF-1 or LRH-1, by identifying energetically allowed sites for attachment of such an attachment component on a binding compound (e.g., as described for the preceding aspect), and attaching the binding compound or derivative thereof to the attachment component(s) at the energetically allowed site(s). In particular embodiments, the binding compound is a phospholipid as identified herein.
In various embodiments, the attachment component is a linker (which can be a traceless linker) for attachment to a solid phase medium, and the method also involves attaching the binding compound or derivative to a solid phase medium through the linker attached at the energetically allowed site; the binding compound or derivative thereof is synthesized on a linker attached to the solid phase medium; a plurality of compounds or derivatives are synthesized in combinatorial synthesis; the attachment of the compound(s) to the solid phase medium provides an affinity medium.
In a related aspect, the invention provides a method for making an affinity matrix for SF-1 or LRH-1, where the method involves identifying energetically allowed sites on a SF-1 or LRH-1 binding compound for attachment to a solid phase matrix without substantially altering the ability of the SF-1 or LRH-1 binding compound to bind SF-1 or LRH-1; and attaching the binding compound to the solid phase matrix through the energetically allowed site. In particular embodiments, the binding compound is a phospholipid, e.g., as described herein.
Various embodiments are as described for attachment of an additional component above; identifying energetically allowed sites for attachment to a solid phase matrix is performed for at least 5, 10, 20, 30, 50, 80, or 100 different compounds; identifying energetically allowed sites is performed for molecular scaffolds or other SF-1 or LRH-1 binding compounds.
SF-1 homologs can be identified by their sequences, where exemplary reference sequence accession numbers are NM_004959 (cDNA sequence for hSF-1) (SEQ ID NO:______) and NP_004950 (protein sequence for hSF-1) (SEQ ID NO:______). One of ordinary skill in the art will recognize that sequence differences will exist due to allelic variation, and will also recognize that other animals, particularly other mammals, have corresponding receptors, which have been identified or can be readily identified using sequence alignment and confirmation of activity, which can also be used. A number of such sequences are readily available from GenBank. One of ordinary skill in the art will also recognize that modifications can be introduced in a SF-1 sequence without destroying receptor activity. Such modified receptors can also be used in the present invention, e.g., if the modifications do not alter the binding site conformation to the extent that the modified receptor lacks substantially normal ligand binding.
As used herein, the terms “steroidogenic factor 1 ligand binding domain polypeptide”, “SF-1 ligand binding domain polypeptide”, and “SF-1 LBD polypeptide” (and like terms) refer to a polypeptide that contains the site where phospholipid binding as identified herein occurs. For human SF-1, such domain generally includes residues P221 through T461 of NP_004950. An exemplary such domain polypeptide is the polypeptide used for crystallization herein consisting of residues G219 to T461 of NP_004950; additional examples include homologs and variants thereof.
LRH-1 homologs can be identified by their sequences, where exemplary reference sequence accession numbers are NM_003822 (cDNA sequence for hLRH-1 isoform 2) (SEQ ID NO:______), NP_003813 (protein sequence for hLRH-1 isoform 2) (SEQ ID NO:______), NM_205860 (cDNA sequence for hLRH-1 isoform 1) (SEQ ID NO:______), and NP_995582 (protein sequence for hLRH-1 isoform 1) (SEQ ID NO:______). One of ordinary skill in the art will recognize that sequence differences will exist due to allelic variation, and will also recognize that other animals, particularly other mammals, have corresponding receptors, which have been identified or can be readily identified using sequence alignment and confirmation of activity, which can also be used. A number of such sequences are readily available from GenBank. One of ordinary skill in the art will also recognize that modifications can be introduced in a LRH-1 sequence without destroying receptor activity. Such modified receptors can also be used in the present invention, e.g., if the modifications do not alter the binding site conformation to the extent that the modified receptor lacks substantially normal ligand binding.
As used herein, the terms “liver receptor homolog 1 ligand binding domain polypeptide”, “LRH-1 ligand binding domain polypeptide”, and “LRH-1 LBD polypeptide” (and like terms) refer to a polypeptide that contains the site where phospholipid binding as identified herein occurs. For human LRH-1, such domain generally includes residues A253 through A495 of NP_003813 encoded by NM_003822 (supra). For mouse LRH-1, such sequence generally extends from A318 through A560 of the protein encoded by NM_030676 (SEQ ID NO:______). An exemplary such human domain polypeptide is the polypeptide used for crystallization herein consisting of residues S251-A495 of NP_003822 (supra); additional examples include homologs and variants thereof.
As used herein in connection with the design or development of ligands, the term “bind” and “binding” and like terms refer to a non-covalent energetically favorable association between the specified molecules (i.e., the bound state has a lower free energy than the separated state, which can be measured calorimetrically). For binding to a target, the binding is at least selective, that is, the compound binds preferentially to a particular target or to members of a target family at a binding site, as compared to non-specific binding to unrelated proteins not having a similar binding site. For example, BSA is often used for evaluating or controlling non-specific binding. In addition, for an association to be regarded as binding, the decrease in free energy going from a separated state to the bound state must be sufficient so that the association is detectable in a biochemical assay suitable for the molecules involved.
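The relationship between a measured dissociation constant and the free energy difference between the bound and separated states can be sketched as follows, using the standard thermodynamic relation ΔG° = RT ln(Kd / 1 M). The 1 M standard state, the default temperature of 298.15 K, and the example dissociation constant are assumptions made for illustration and are not values reported herein.

```python
import math

R_KJ_PER_MOL_K = 8.314462618e-3  # molar gas constant in kJ/(mol*K)

def binding_free_energy(kd_molar, temperature_k=298.15):
    """Standard binding free energy in kJ/mol from a dissociation constant in mol/L.

    Uses dG = R * T * ln(Kd / 1 M); a negative value means the bound state
    is lower in free energy than the separated state at the 1 M standard state.
    """
    if kd_molar <= 0:
        raise ValueError("Kd must be positive")
    return R_KJ_PER_MOL_K * temperature_k * math.log(kd_molar)

# Hypothetical compound with Kd = 1 micromolar: roughly -34 kJ/mol.
dg_micromolar = binding_free_energy(1e-6)
```

A tenfold tighter dissociation constant lowers the free energy by a further RT ln(10), about 5.7 kJ/mol at 298.15 K.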
By “assaying” is meant the creation of experimental conditions and the gathering of data regarding a particular result of the experimental conditions. For example, enzymes can be assayed based on their ability to act upon a detectable substrate. Likewise, for example, a compound or ligand can be assayed based on its ability to bind to a particular target molecule or molecules and/or to modulate an activity of a target molecule.
By “background signal” in reference to a binding assay is meant the signal that is recorded under standard conditions for the particular assay in the absence of a test compound, molecular scaffold, or ligand that binds to the target molecule. Persons of ordinary skill in the art will realize that accepted methods exist and are widely available for determining background signal.
When a decision is described as “based on” particular criteria, it is meant that the criteria selected are parameters of the decision and guide its outcome. A substantial change in the parameters is likely to result in a change in the decision.
By “binding site” is meant an area of a target molecule to which a ligand can bind non-covalently. Binding sites embody particular shapes and often contain multiple binding pockets present within the binding site. The particular shapes are often conserved within a class of molecules, such as a molecular family. Binding sites within a class also can contain conserved structures such as, for example, chemical moieties, the presence of a binding pocket, and/or an electrostatic charge at the binding site or some portion of the binding site, all of which can influence the shape of the binding site.
By “binding pocket” is meant a specific region of space within a binding site. A binding pocket is a particular space within a binding site at least partially bounded by target molecule atoms. Thus a binding pocket is a particular shape, indentation, or cavity in the binding site. Binding pockets can contain particular chemical groups or structures that are important in the non-covalent binding of another molecule such as, for example, groups that contribute to ionic, hydrogen bonding, van der Waals, or hydrophobic interactions between the molecules.
By “chemical structure” or “chemical substructure” is meant any definable atom or group of atoms that constitute a part of a molecule. Normally, chemical substructures of a scaffold or ligand can have a role in binding of the scaffold or ligand to a target molecule, or can influence the three-dimensional shape, electrostatic charge, and/or conformational properties of the scaffold or ligand.
By “orientation” in reference to a binding compound bound to a target molecule is meant the spatial relationship of the binding compound and at least some of its constituent atoms to the binding pocket and/or atoms of the target molecule at least partially defining the binding pocket.
In the context of target molecules in the present invention, the term “crystal” refers to an ordered complex of target molecule, such that the complex produces an X-ray diffraction pattern when placed in an X-ray beam. Thus, a “crystal” is distinguished from a disordered or partially ordered complex or aggregate of molecules that do not produce such a diffraction pattern. Preferably a crystal is of sufficient order and size to be useful for X-ray crystallography. A crystal may be formed only of target molecule (with solvent and ions) or may be a co-crystal of more than one molecule, for example, as a co-crystal of target molecule and binding compound, and/or of a complex of proteins (such as a holoenzyme).
In the context of this invention, unless otherwise specified, by “co-crystals” is meant an ordered complex of the compound, molecular scaffold, or ligand bound non-covalently to the target molecule that produces a diffraction pattern when placed in an X-ray beam. Preferably the co-crystal is in a form appropriate for analysis by X-ray or protein crystallography. In preferred embodiments the target molecule-ligand complex can be a protein-ligand complex.
By “clogP” is meant the calculated log P of a compound, “P” referring to the partition coefficient of the compound between a lipophilic and an aqueous phase, usually between octanol and water.
By “chemically tractable structures” is meant chemical structures, sub-structures, or sites on a molecule that can be covalently modified to produce a ligand with a more desirable property. The desirable property will depend on the needs of the particular situation. The property can be, for example, that the ligand binds with greater affinity to a target molecule, binds with more specificity, or binds to a larger or smaller number of target molecules in a molecular family, or other desirable properties as needs require.
In the context of compounds binding to a target, the term “greater affinity” indicates that the compound binds more tightly than a reference compound, or than the same compound in a reference condition, i.e., with a lower dissociation constant. In particular embodiments, the greater affinity is at least 2, 3, 4, 5, 8, 10, 50, 100, 200, 400, 500, 1000, or 10,000-fold greater affinity.
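The fold comparison described above can be illustrated with a minimal sketch, assuming that affinity is compared as the ratio of two dissociation constants measured under the same conditions. The function name `fold_greater_affinity` and the example values are hypothetical and are not taken from the present disclosure.

```python
def fold_greater_affinity(kd_reference: float, kd_compound: float) -> float:
    """Return how many fold more tightly the compound binds than the reference.

    Both dissociation constants must be in the same units (e.g., molar).
    A result above 1 means the compound has the lower KD, i.e., greater affinity.
    """
    if kd_reference <= 0 or kd_compound <= 0:
        raise ValueError("dissociation constants must be positive")
    return kd_reference / kd_compound


# Hypothetical values: reference KD of 5 uM, compound KD of 50 nM.
fold = fold_greater_affinity(5e-6, 50e-9)
print(f"{fold:.0f}-fold greater affinity")
```

A ratio is used because a lower dissociation constant corresponds to tighter binding, so dividing the reference KD by the compound KD gives a value that increases with affinity.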
By “designing a ligand,” “preparing a ligand,” “discovering a ligand,” and like phrases is meant the process of considering relevant data (especially, but not limited to, any individual or combination of binding data, X-ray co-crystallography data, molecular weight, clogP, and the number of hydrogen bond donors and acceptors) and making decisions about advantages that can be achieved as a result of specific structural modifications to a molecule, and implementing those decisions. This process of gathering data and making decisions about structural modifications that can be advantageous, implementing those decisions, and determining the result can be repeated as many times as necessary to obtain a ligand with desired properties.
By “docking” is meant the process of attempting to fit a three-dimensional configuration of a binding pair member into a three-dimensional configuration of the binding site or binding pocket of the partner binding pair member, which can be a protein, and determining the extent to which a fit is obtained. The extent to which a fit is obtained can depend on the amount of void volume in the resulting binding pair complex (or target molecule-ligand complex). The configuration can be physical or a representative configuration of the binding pair member, e.g., an in silico representation or other model.
By binding with “low affinity” is meant binding to the target molecule with a dissociation constant (KD) of greater than 1 μM under standard conditions. In particular cases, low affinity binding is in a range of 1 μM-10 mM, 1 μM-1 mM, 1 μM-500 μM, 1 μM-200 μM, or 1 μM-100 μM. By binding with “very low affinity” is meant binding with a KD of above about 100 μM under standard conditions, e.g., in a range of 100 μM-1 mM, 100 μM-500 μM, or 100 μM-200 μM. By binding with “extremely low affinity” is meant binding at a KD of above about 1 mM under standard conditions. By “moderate affinity” is meant binding with a KD of from about 200 nM to about 1 μM under standard conditions. By “moderately high affinity” is meant binding at a KD of from about 1 nM to about 200 nM. By binding at “high affinity” is meant binding at a KD of below about 1 nM under standard conditions. For example, low affinity binding can occur because of a poorer fit into the binding site of the target molecule or because of a smaller number of non-covalent bonds, or weaker covalent bonds present to cause binding of the scaffold or ligand to the binding site of the target molecule relative to instances where higher affinity binding occurs. The standard conditions for binding are at pH 7.2 at 37° C. for one hour. For example, 100 μl/well can be used in HEPES 50 mM buffer at pH 7.2, NaCl 15 mM, ATP 2 μM, and bovine serum albumin 1 μg/well, at 37° C. for one hour.
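The affinity terms defined above can be summarized in a short sketch that maps a measured KD to the most specific applicable term. The function name `classify_affinity` is hypothetical, and the sketch assumes sharp cut-offs at the stated values, whereas the definitions use “about” for most boundaries; it also assumes that a KD falling within a narrower term (e.g., “very low affinity”) is reported under that term rather than the broader “low affinity.”

```python
def classify_affinity(kd_molar: float) -> str:
    """Map a dissociation constant (in molar) to the most specific affinity term.

    Thresholds follow the definitions in the text; treating them as exact
    cut-offs is a simplifying assumption of this sketch.
    """
    if kd_molar <= 0:
        raise ValueError("KD must be positive")
    if kd_molar > 1e-3:        # above 1 mM
        return "extremely low affinity"
    if kd_molar > 100e-6:      # above 100 uM
        return "very low affinity"
    if kd_molar > 1e-6:        # above 1 uM
        return "low affinity"
    if kd_molar >= 200e-9:     # 200 nM to 1 uM
        return "moderate affinity"
    if kd_molar >= 1e-9:       # 1 nM to 200 nM
        return "moderately high affinity"
    return "high affinity"     # below 1 nM


# Hypothetical KD values spanning the defined ranges.
for kd in (5e-3, 5e-4, 5e-6, 5e-7, 5e-8, 5e-10):
    print(f"{kd:.0e} M -> {classify_affinity(kd)}")
```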
Binding compounds can also be characterized by their effect on the activity of the target molecule. Thus, a “low activity” compound has an inhibitory concentration (IC50) (for inhibitors or antagonists) or effective concentration (EC50) (applicable to agonists) of greater than 1 μM under standard conditions. By “very low activity” is meant an IC50 or EC50 of above 100 μM under standard conditions. By “extremely low activity” is meant an IC50 or EC50 of above 1 mM under standard conditions. By “moderate activity” is meant an IC50 or EC50 of 200 nM to 1 μM under standard conditions. By “moderately high activity” is meant an IC50 or EC50 of 1 nM to 200 nM. By “high activity” is meant an IC50 or EC50 of below 1 nM under standard conditions. The IC50 (or EC50) is defined as the concentration of compound at which 50% of the activity of the target molecule (e.g., enzyme or other protein) activity being measured is lost (or gained) relative to activity when no compound is present. Activity can be measured using methods known to those of ordinary skill in the art, e.g., by measuring any detectable product or signal produced by occurrence of an enzymatic reaction, or other activity by a protein being measured. For SF-1 and LRH-1 agonists and antagonists, activities can be determined as described in the Examples, or using other such assay methods as described herein or known in the art.
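As an illustration of the IC50 definition above, the following sketch estimates the concentration at which 50% of activity is lost by log-linear interpolation between the two measured points that bracket the 50% level. This is an assumed, simplified approach chosen for brevity (curve fitting is commonly used in practice), and the function name `estimate_ic50` and the example data are hypothetical.

```python
import math


def estimate_ic50(concentrations, activities):
    """Estimate IC50 by log-linear interpolation.

    concentrations: ascending compound concentrations (same units, all > 0).
    activities: activity at each concentration, as a fraction of the activity
        measured when no compound is present (1.0 = uninhibited).
    Returns the concentration at which activity crosses 0.5, or None if the
    data never cross the 50% level.
    """
    points = list(zip(concentrations, activities))
    for (c_lo, a_lo), (c_hi, a_hi) in zip(points, points[1:]):
        if a_lo >= 0.5 >= a_hi and a_lo != a_hi:
            # Fraction of the way from a_lo down to 0.5, applied on a log10 scale.
            t = (a_lo - 0.5) / (a_lo - a_hi)
            log_ic50 = math.log10(c_lo) + t * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_ic50
    return None


# Hypothetical dose-response data (molar concentrations, fractional activity).
concs = [1e-9, 1e-8, 1e-7, 1e-6]
acts = [0.95, 0.75, 0.25, 0.05]
print(estimate_ic50(concs, acts))
```

For an agonist, the same interpolation could be applied to the fraction of maximal activity gained in order to estimate an EC50.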
By “molecular scaffold” or “scaffold” is meant a small target binding molecule to which one or more additional chemical moieties can be covalently attached, modified, or eliminated to form a plurality of molecules with common structural elements. The moieties can include, but are not limited to, a halogen atom, a hydroxyl group, a methyl group, a nitro group, a carboxyl group, or any other type of molecular group including, but not limited to, those recited in this application. Molecular scaffolds bind to at least one target molecule with low or very low affinity and/or bind to a plurality of molecules in a target family (e.g., protein family), and the target molecule is preferably an enzyme, receptor, or other protein. Preferred characteristics of a scaffold include molecular weight of less than about 350 daltons; binding at a target molecule binding site such that one or more substituents on the scaffold are situated in binding pockets in the target molecule binding site; having chemically tractable structures that can be chemically modified, particularly by synthetic reactions, so that a combinatorial library can be easily constructed; having chemical positions where moieties can be attached that do not interfere with binding of the scaffold to a protein binding site, such that the scaffold or library members can be modified to form ligands, to achieve additional desirable characteristics, e.g., enabling the ligand to be actively transported into cells and/or to specific organs, or enabling the ligand to be attached to a chromatography column for additional analysis. Thus, a molecular scaffold is a small, identified target binding molecule prior to modification to improve binding affinity and/or specificity, or other pharmacologic properties.
The term “scaffold core” refers to the core structure of a molecular scaffold onto which various substituents can be attached. Thus, for a number of scaffold molecules of a particular chemical class, the scaffold core is common to all the scaffold molecules. In many cases, the scaffold core will consist of or include one or more ring structures.
The term “scaffold group” refers to a set of compounds that share a scaffold core and thus can all be regarded as derivatives of one scaffold molecule.
By “molecular family” is meant groups of molecules classed together based on structural and/or functional similarities. Examples of molecular families include proteins, enzymes, polypeptides, receptor molecules, oligosaccharides, nucleic acids, DNA, RNA, etc. Thus, for example, a protein family is a molecular family. Molecules can also be classed together into a family based on, for example, homology. The person of ordinary skill in the art will realize many other molecules that can be classified as members of a molecular family based on similarities in chemical structure or biological function.
By “protein-ligand complex” or “co-complex” is meant a protein and ligand bound non-covalently together.
By “protein” is meant a polymer of amino acids. The amino acids can be naturally or non-naturally occurring. Proteins can also contain adaptations, such as being glycosylated, phosphorylated, or other common modifications.
By “protein family” is meant a classification of proteins based on structural and/or functional similarities. For example, kinases, phosphatases, proteases, and similar groupings of proteins are protein families. Proteins can be grouped into a protein family based on having one or more protein folds in common, a substantial similarity in shape among folds of the proteins, homology, or based on having a common function. In many cases, smaller families will be specified, e.g., the nuclear receptor family or the NR5A nuclear receptor family.
“Protein folds” are 3-dimensional shapes exhibited by the protein and defined by the existence, number, and location in the protein of alpha helices, beta-sheets, and loops, i.e., the basic secondary structures of protein molecules. Folds can be, for example, domains or partial domains of a particular protein.
By “ring structure” is meant a molecule having a chemical ring or sub-structure that is a chemical ring. In most cases, ring structures will be carbocyclic or heterocyclic rings. The chemical ring may be, but is not limited to, a phenyl ring, aryl ring, pyrrole ring, imidazole, pyridine, purine, or any ring structure.
By “specific biochemical effect” is meant a therapeutically significant biochemical change in a biological system causing a detectable result. This specific biochemical effect can be, for example, the inhibition or activation of an enzyme, the inhibition or activation of a protein that binds to a desired target, or similar types of changes in the body's biochemistry. The specific biochemical effect can cause alleviation of symptoms of a disease or condition or another desirable effect. The detectable result can also be detected through an intermediate step.
By “standard conditions” is meant conditions under which an assay is performed to obtain scientifically meaningful data. Standard conditions are dependent on the particular assay, and can be generally subjective. Normally the standard conditions of an assay will be those conditions that are optimal for obtaining useful data from the particular assay. The standard conditions will generally minimize background signal and maximize the signal sought to be detected.
By “standard deviation” is meant the square root of the variance. The variance is a measure of how spread out a distribution is. It is computed as the average squared deviation of each number from its mean. For example, for the numbers 1, 2, and 3, the mean is 2 and the variance is 0.667; viz., $\frac{(1-2)^2 + (2-2)^2 + (3-2)^2}{3} = \frac{2}{3} \approx 0.667$.
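The arithmetic in this example can be checked with a few lines of Python using the standard library `statistics` module. The sketch assumes the population form of the variance (dividing by the number of values), which matches the “average squared deviation” wording above; the variable names are illustrative only.

```python
import statistics

values = [1, 2, 3]

mean = statistics.mean(values)           # arithmetic mean of the values
variance = statistics.pvariance(values)  # population variance: average squared deviation
std_dev = statistics.pstdev(values)      # square root of the population variance

print(mean, round(variance, 3), round(std_dev, 3))
```

The sample variance (`statistics.variance`), which divides by one less than the number of values, would give a different result for the same data.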
By a “set” of compounds is meant a collection of compounds. The compounds may or may not be structurally related.
In the context of this invention, by “target molecule” is meant a molecule that a compound, molecular scaffold, or ligand is being assayed for binding to. The target molecule has an activity that binding of the molecular scaffold or ligand to the target molecule will alter or change. The binding of the compound, scaffold, or ligand to the target molecule can preferably cause a specific biochemical effect when it occurs in a biological system. A “biological system” includes, but is not limited to, a living system such as a human, animal, plant, or insect. In most but not all cases, the target molecule will be a protein or nucleic acid molecule.
By “pharmacophore” is meant a representation of molecular features that are considered to be responsible for a desired activity, such as interacting or binding with a receptor. A pharmacophore can include 3D (hydrophobic groups, charged/ionizable groups, hydrogen bond donors/acceptors), 2D (substructures), and 1D (physical or biological) properties.
As used herein in connection with numerical values, the terms “approximately” and “about” mean ±10% of the indicated value.
Additional aspects and embodiments will be apparent from the following Detailed Description and from the claims.
Table 1 provides crystal properties for SF-1 and LRH-1 determined as described in the Examples.
Table 2 provides atomic coordinates for SF1 ligand binding domain polypeptide crystal co-crystallized with a phospholipid ligand as described herein. In this table, the various columns have the following content, beginning with the left-most column:
Table 3 provides atomic coordinates for LRH1 ligand binding domain polypeptide crystal co-crystallized with a phospholipid ligand as described herein. Table entries are as in Table 2.
Table 4 provides the reference nucleotide sequence for human SF-1 cDNA and the amino acid sequence of the encoded SF-1 polypeptide.
Table 5 provides the reference nucleotide sequence for human LRH-1 cDNA isoform 2 and the corresponding amino acid sequence of the encoded LRH-1 polypeptide, and the reference nucleotide sequence for human LRH-1 cDNA isoform 1 and the encoded amino sequence of the corresponding LRH-1 polypeptide. Additionally, Table 5 provides the nucleotide sequence of mouse LRH-1.
I. General
Steroidogenic factor-1 (SF-1, ADFBP, ELP, NR5A1) and liver receptor homologue-1 (LRH-1, FTF, HB1F, CPF, NR5A2) are ‘orphan’ members of the nuclear receptor family for which no natural ligands have been identified (Fayard et al., Trends Cell Biol., 2004, 14:250-60; Val et al., Nucl. Recept., 2003, 1:8). These two factors are related to fushi tarazu factor-1 (FTZ-F1) of Drosophila, and comprise the NR5A branch of the nuclear receptor gene family in man. Functional similarities follow their sequence similarities, as SF-1 and LRH-1 both function as monomers (Li et al., J. Biol. Chem., 1998, 273:29022-29031) to regulate genes at similar response elements.
However, SF-1 is expressed predominantly in the adrenals, testis, ventromedial hypothalamus, and pituitary, and regulates genes coordinating adrenal and sex steroid syntheses (Val et al., Nucl. Recept., 2003, 1:8), while LRH-1 is expressed in liver, intestine, and pancreas, and acts on genes coordinating bile acid synthesis, enterohepatic circulation, and absorption. (Fayard et al., Trends Cell Biol., 2004, 14:250-260.) Gene knockout and heterozygous loss-of-function studies show that both SF-1 and LRH-1 are essential during embryogenesis for normal development of the organs in which they are expressed, and mammalian cell transfection experiments indicate that SF-1 and LRH-1 function as obligate factors for their target genes, apparently acting constitutively. (Pare et al., J. Biol. Chem., 2004, 279:21206-21216; Zhao et al., Mol. Cell Endocrinol., 2001, 185:27-32; Sadovsky et al., Proc. Natl. Acad. Sci. USA, 1995, 92:10939-10943; Shinoda et al., Dev. Dyn., 1995, 204:22-29; Luo et al., Cell, 1994, 77:481-490; Achermann et al., J. Clin. Endocrinol Metab., 2002, 87:1829-1833.) The mouse LRH-1 structure contains a cavity available for potential ligands, but mutations to fill this cavity did not diminish activity, supporting a model of constitutive, ligand-independent function. (Sablin et al., Mol. Cell, 2003, 11:1575-1585.)
X-ray structures of the ligand-binding domains of human SF-1 and human LRH-1 have been determined. Additionally, it has been discovered that each structure includes a phospholipid ligand. The receptor-ligand interactions indicate that, as a class, phospholipids are well-suited as ligands to stabilize the active conformation, a conclusion supported by specific structure-guided mutational analyses. Coactivator-derived peptides included in the co-crystallization experiments bind not only to the canonical activation-function (AF-2) surface of both SF-1 and LRH-1 but, in the case of LRH-1, also to a novel second site. These structures indicate a link between phospholipids and cholesterol regulation, and further introduce possible new modes of co-regulator recruitment unique to the NR5A branch of the nuclear receptor superfamily.
The SF-1 and LRH-1 LBD structures adopt an α-helical sandwich architecture composed of 12 α-helices and one β-hairpin (
In the SF-1 crystal there are two molecules in the crystallographic asymmetric unit, each delineating residues P221 through K459, one completely and the other incompletely, lacking residues Q249 through R255 in the flexible loop after H2. In the LRH-1 crystal there is one molecule in the asymmetric unit, delineating residues A253 through Q284 and K292 through A492, but also lacking residues 285-291 in the loop after H2. Consistent with reports that SF-1 and LRH-1 function as monomers, none of the crystallization contacts form through the canonical H10 dimerization surface used by other NRs. (Gampe et al., 2000, Mol Cell 5, 545-55; Bourguet et al., 2000, Mol Cell 5, 289-98.)
Strikingly, as indicated above, both structures reveal buried phospholipid molecules derived from the E. coli expression host. Based on well-defined electron density, the molecule in SF-1 can be identified as a phosphatidylethanolamine, and in LRH-1, as a phosphatidylglycerol-phosphoglycerol. In each structure the two acyl chains consist of a palmitic acid (16:0) attached to C1 and a palmitoleic acid (16:1, Δ9) attached to C2 of the glycerol backbone. The Δ9-cis unsaturation of the palmitoleic acid causes a bend that allows the lipid tails to compact around each other. The polar headgroups of the bound phospholipids reach outside the pocket through the channel formed by H3, H6, H11, and the β-hairpin. In the SF-1 structure the ethanolamine interacts through water molecules with E445 in the loop between H11 and H12. In the LRH-1 structure the glycerol-phosphoglycerol headgroup wraps between the N-terminal end of H7 and the C-terminal end of H11, with the glycerol and phosphate oxygen atoms forming hydrogen bonds with A366 and T377 (H7) and Y473 (H11).
Ligands derived from the expression host have been observed previously in other orphan nuclear receptor structures. In some cases the ligand appears to fill the ligand-binding pocket, making multiple interactions with the protein, suggesting biological relevance. (Kallen et al., 2002, Structure (Camb) 10, 1697-707; Dhe-Paganon et al., 2002, J. Biol. Chem. 277, 37973-6; Wisely et al., 2002, Structure (Camb) 10, 1225-34.) In other cases the ligand is loosely fit, making interactions with nonconserved residues within the pocket, suggesting these as possible pseudo-ligands. (Stehlin et al., 2001, Embo J. 20, 5822-31.) Phosphatidylethanolamine has also been observed in the structures of the insect nuclear receptor, ultraspiracle, adopting the inactive conformation. (Clayton et al., 2001, Proc Natl Acad Sci USA 98, 1549-54; Billas et al., 2003, Nature 426, 91-6.) The lipids extracted from SF-1 and LRH-1 proteins used here contain several mass spectral peaks that can be interpreted as phosphatidylethanolamine and phosphatidylglycerol, with acyl chain lengths varying from 14 to 18, and of varying saturation. However, the glycerolipid tails of the ligands observed in both the SF-1 and LRH-1 crystal structures are the same, and make extensive van der Waals contacts with hydrophobic residues lining the inside wall of the pocket (FIGS. 1C,D and 2A,B), stabilizing these proteins in the active conformation directly through contacts with the C-terminal activation helix, H12, as well as through hydrophobic interactions with H3 and H11 that support H12. The total volumes of the LRH-1 and SF-1 cavities are 510 and 550 Å³, respectively (
Both SF-1 and LRH-1 make interactions with the phosphate group of the phospholipid that appear likely to affect both ligand affinity and selectivity, and receptor activation. The phosphate lies partially buried, stabilized by forming a salt bridge with a Lys from H11 (K440 in SF-1; K474 in LRH-1), and a hydrogen bond with a Tyr from H11 (Y436 in SF-1; Y470 in LRH-1) (
Curiously, in the mouse LRH-1 sequence a Glu (residue 440 in mouse) replaces the Gly of the phosphate-binding triad of human LRH-1. In the structure of the mouse LRH-1 this Glu mimics the nucleating interactions with the Lys and Tyr of H11 that the phospholipid phosphorous group makes in other structures of human LRH-1 and SF-1 (
When tested for coactivator binding in vitro, both SF-1 and LRH-1 proteins made in E. coli demonstrated constitutive activity for coactivator recruitment. Addition of phospholipids to these preparations showed little increase in signal, consistent with the preexisting binding of phospholipids. However, the lipids binding SF-1 could be partially extracted by washing the proteins with liposomes prepared using phosphatidylcholine (C22 acyl chain length). It was reasoned that such liposomes with long acyl chains could act as a sink for extracted lipids, without binding the receptors themselves. After such washing the coactivator binding by SF-1 was diminished, but could be activated by the addition of phosphatidylethanolamine (
A selection of structure-guided mutations of SF-1 and LRH-1 pockets were constructed (
Six pocket mutations, A303F, A303M, L378F, A467F, A467M, and Y470F/K474A were tested in LRH-1 (
Both the SF-1 and LRH-1 structures were obtained as complexes with a peptide matching the NR-box 3 of the coactivator NCOA2 (TIF2). The coactivator peptide bound the canonical AF-2 surface through specific sidechain interactions (
Surprisingly, in the LRH-1 structure a coactivator peptide was also bound to a novel second site on the surface formed by residues of H2 (M277, L280), H3 (T295, L298, M299, and M302), the β-hairpin (V365), and H6 (I369) that form a hydrophobic patch complementary to the LRYLL motif of the peptide. The hydrophobic patch also includes atoms of the C1 acyl chain of the phospholipid, in coordination with the methyl group of T295, suggesting a direct participation by the ligand in recruitment of coactivator to this site. Unlike the canonical binding site, there is no strong charge-clamp to the coactivator peptide dipole in the second binding site. However, the Tyr of the peptide forms a hydrogen bond with D366 of the β-hairpin, suggesting that the residue at the second X of the LXXLL motif will influence the coactivator selectivity. Although no second peptide was bound in the SF-1 crystal, the surface features of SF-1 are similar enough to those of LRH-1 to suggest that SF-1 could also bind coactivators at this site. The difference in results may be due to crystal packing differences; in the LRH-1 crystal the second peptide is located at a favorable crystal packing interface, but in the SF-1 crystal the packing interferes with peptide binding to this site.
Mutated forms of LRH-1 were engineered for analysis of the novel second coactivator binding site observed in the structure (
In LRH-1 mutation of the canonical site gave strong reductions in activity (96%), suggesting that under these conditions the canonical site is dominant (
In addition to the structural and functional analysis indicated above, the identification of phospholipids as ligands for SF-1 and LRH-1 is also reasonable on mechanistic grounds. Both receptors regulate genes important for cholesterol metabolism. Phospholipid composition must be balanced with cholesterol content in membranes to maintain proper membrane fluidity, and therefore regulation of genes for cholesterol metabolism by a phospholipid signal makes sense. (McConnell & Radhakrishnan, Biochim Biophys Acta 2003, 1610:159-73; Quinn, Prog. Biophys. Mol. Biol., 1981, 38:1-104.) This may be especially true for cells of the adrenal and liver that are specialized for high flux and turnover of cholesterol. (Jefcoate, J. Clin Invest., 2002, 110:881-90.) In fact, a major source of phospholipid in such cells derives from the blood lipoprotein particles, which are known to carry large amounts of phospholipid in addition to cholesterol, so a source of phospholipid signals may derive from these particles. (Vance & Vance, J. Biol. Chem., 1986, 261:4486-91; Wang et al., J. Biol. Chem., 2003, 278:42906-12.) Whether derived from the blood or from intracellular synthesis, phospholipid composition is known to vary with nutrition, exercise, pregnancy, and other metabolic and hormonal status, and such changes could lead to variable NR5A activation, or conceivably, inhibition. (Clamp et al., Lipids, 1997, 32:179-84; Tranquilli et al., Acta Obstet. Gynec., Scand., 2004, 83:443-8; Imai et al., Biochem. Pharmacol., 1999, 58:925-33; Lin et al., J. Lipid Res., 2004, 45:529-35; Andersson et al., Am. J Physiol., 1998, 274:E432-8.) Therefore ligand regulation of these receptors should be considered within a general context of lipid homeostasis.
It is noteworthy that cholesterol and phosphatidylethanolamine have been documented to regulate, in mammals and insects respectively, the post-translational processing of the nuclear factor SREBP, which is important in the regulation of many genes of lipid homeostasis, in some cases cooperating with SF-1. (Wang et al., Cell, 1994, 77:53-62; Dobrosotskaya et al., Science, 2002, 296:879-83; Lopez & McLean, Endocrinology, 1999, 140:5669-81.) Thus the identification, by the current X-ray structures, of phospholipids as a class of molecule regulating SF-1 and LRH-1 provides target structures and allows the identification and development of modulators of these receptors.
II. Applications of SF1 and LRH1 Modulators and Exemplary Assay Methods
A. LRH-1
Compounds that modulate LRH-1 activity can have beneficial effects in the management of cholesterol excess. Thus, activators of LRH-1 would lower circulating cholesterol levels. This is because LRH-1 regulates several genes involved in cholesterol homeostasis, including: CYP7A1, the rate-limiting enzyme for conversion of cholesterol to bile acids (Wang et al., J. Lipid Res., 1996, 37:1831-41; Nitta et al., Proc. Natl. Acad. Sci. USA, 1999, 96:6660-5); the scavenger receptor class B type I (SR-BI), which mediates selective cellular cholesterol uptake from high-density lipoproteins (HDLs) (Schoonjans et al., EMBO Rep., 2002, 3:1181-7); and cholesterol ester transfer protein (CETP), important for remodeling of HDL particles (Luo et al., J. Biol. Chem., 2001, 276:24767-73).
A second indication for LRH-1 modulators is in treatment or management of hepatitis virus infection. Hepatitis B virus is the major cause of acute and chronic hepatitis, and is associated with development of hepatocellular carcinoma. Certain hepatitis virus genes are stimulated by LRH-1. (Li et al., J. Biol. Chem., 1998, 273:29022-31; Gilbert et al., J. Virol., 2000, 74:5032-9.) Thus inhibitors or modulators of LRH-1 would limit the functions of the hepatitis virus, with beneficial effects on infected individuals.
LRH-1 also regulates other genes important for cholesterol homeostasis, including:
Other targets of LRH-1 include:
Thus, such additional LRH-1 targets can also be used for assaying or screening for modulators of LRH-1. Such modulators can then be used for treatment of diseases or conditions associated with those additional LRH-1 target genes.
B. SF-1
Compounds that modulate SF-1 can have desirable effects on sexual function and sex-related phenotypic aspects. SF-1 is very important during prenatal development of the sexual anatomy. In conjunction with a genetic screening protocol, situations that are expected to lead to phenotypic development unsupportive of the primary sexual genotype could be corrected, at least in part, by modulation of SF-1.
SF-1 also functions after birth to regulate genes involved in sex hormone synthesis in the testis or ovaries. Thus modulation of SF-1 should assist in the maintenance of sexual function or of sex-related phenotypic appearance.
SF-1 also regulates genes important for the synthesis of adrenal steroids. Thus it controls the levels of a set of very potent hormone regulators of lipid and carbohydrate metabolism (glucocorticoids), and hypertension (mineralocorticoids). SF-1 is a key regulator in the hypothalamic-pituitary-adrenal axis through which environmental factors such as stress, or physiological factors such as starvation, have effects on overall physiology and metabolism. Pharmaceutical modulators of SF-1 can assist in maintaining a normal physiological balance in situations where the unassisted organs are over-reacting to environmental effects (such as too much stress) or medical procedures (such as surgery or other interventional procedures), or drug-induced manipulations intended to intervene in a subset of the normal metabolic regulatory mechanisms.
Pharmaceutical modulators of SF-1 can also be used in the management of ectopic tumors that produce steroid hormones. Initially modulators of SF-1 can be useful in the diagnosis of abnormal steroid production. Once a diagnosis of steroid-producing tumors is established but before surgical procedures are implemented, normal (or closer to normal) physiological tone can be produced with inhibitors of SF-1. In the case of brain or other tumor locations or conditions in which surgery is difficult, longer-term treatment with SF-1 modulators would be valuable.
Modulators of SF-1 would also be useful for treatment of conditions of poisoning with endocrine-disrupting agents, such as pesticides and polychlorinated biphenyls (PCBs), known to interfere with normal endocrine function. These agents certainly interfere with the normal production of hormones regulated by SF-1, and some may interfere directly with SF-1 function. Thus modulators of SF-1 can reverse the negative effects of such compounds.
SF-1 regulates most of the genes encoding enzymes catalyzing the synthesis of steroid hormones, including P450 cholesterol sidechain cleavage enzyme (CYP11A1) (Hu, M. C., et al., Mol. Endocrinol., 2001, 15:812-818), 11-β-hydroxylase (CYP11B1), aldosterone synthase (CYP11B2), CYP17, and CYP19; see, e.g., Mascaro, C., et al., Biochem J., 2000, 350 (Pt 3):785-790, for review.
SF-1 also regulates the gene encoding steroidogenic acute regulatory (StAR) protein, which transports cholesterol into the mitochondria where steroids are synthesized. This transport is the rate-limiting step for steroidogenesis.
Other target genes of SF-1 include, for example:
Thus, such additional SF-1 targets can also be used for assaying or screening for modulators of SF-1. Such modulators can then be used for treatment of diseases or conditions associated with those additional SF-1 target genes.
Nuclear receptors that are highly structurally related to SF-1 are present in most insects, as SF-1 (and LRH-1) comprise the members of the nuclear receptors in man that are most related to the FTZ-F1 receptors in insects. Thus, modulators of SF-1 could serve as effective insecticides through actions on an insect receptor related to SF-1, or as molecular scaffolds or reference compounds for developing effective insecticides. Such development can be carried out as described herein for development of modulators of SF-1 and LRH-1 using the respective insect FTZ-F1 receptor, or by using conventional medicinal chemistry to select and test derivatives of the SF-1 or LRH-1 active compounds.
For example, sequence alignments of all 48 human nuclear receptors indicate that SF-1 and LRH-1 are highly related: these receptors are within the NR5 subfamily of the nuclear receptor (NR) superfamily. When the SF-1 and LRH-1 sequences are compared to all currently known sequences from all species, it is observed that the NR5 subfamily also includes the FTZ-F1 gene from Drosophila. Because Drosophila is a member of the Insect class of eukaryotes, it is likely that inhibitors of SF-1 and LRH-1 as provided herein will have insecticidal properties or inhibit insect development. Thus, compounds provided by the present invention can be used to target many diverse insect pests such as flies, gnats, and fleas among many other types. Furthermore, compounds provided by the invention that bind to SF-1 and LRH-1 can be used to refine other compounds that bind to FTZ-F1. Also, the crystal structures of SF-1 and LRH-1 provided by the invention can be used to make models of FTZ-F1 to predict how one or a series of potential ligands for FTZ-F1 will bind to that target; thereby facilitating development of FTZ-F1-inhibiting compounds.
Screening for molecules, e.g., small molecules, that bind to and modulate the SF-1 and LRH-1 receptors can be accomplished using in vitro assays that quantify the amount of binding of co-regulatory proteins with the SF-1 or LRH-1 receptor proteins. Several co-regulatory proteins have been documented to bind to these receptors, including SRC-1, TReP-132, DAX-1, and SHP. The receptor proteins can be produced in E. coli or other convenient expression system. The co-regulatory proteins are typically too large to be conveniently made as full-length proteins; however the relevant receptor-binding motifs can be produced in E. coli. Alternatively, peptides can be chemically synthesized that contain these co-regulator motifs and used in the assays.
A variety of different methods for detecting molecular interactions can be used. For example, Alpha Screen technology (Perkin Elmer) is suitable to detect the interaction of the receptor with the coactivator fragment. In this case it is suitable to engineer the ligand-binding domain of the SF-1 and LRH-1 as an N-terminally HIS-tagged protein that can bind the acceptor bead (containing Nickel moieties that will bind the HIS tag). The coactivator fragment can be synthesized containing a biotin moiety that will bind the donor bead. In the presence of ‘activating’ compounds the association of the receptor with the co-regulator may be strengthened, whereas the presence of ‘inhibitory’ compounds may destabilize this interaction. Libraries of chemicals, or derivatives, can be quantified for their effects on co-regulator binding.
Thus, in an exemplary implementation, the Alpha Screen Histidine detection (Nickel chelate) kit (Perkin Elmer) is used to detect binding between His-tagged receptor LBD and biotinylated coactivator peptides or fragments. The assay is performed in Costar 384-well white polystyrene plates (Corning Inc.), in a total volume of 20 μL. Compounds to be tested for their abilities to modulate the interaction of nuclear receptor with coactivator are added to the 384-well plate in 1 μL of DMSO or buffer in advance of addition of the receptor and coactivator proteins.
Reactions are initiated in 15 μL containing 50 nM His-tagged nuclear receptor and 50 nM biotin-tagged coactivator fragment, using buffer containing 50 mM Bis-tris HCl (pH 7.0), 50 mM KCl, 0.05% Tween 20, 1 mM DTT, and 0.1% BSA. Other buffer variations can be tested to maximize the difference in signals obtained using the apo receptor and receptor bound to compounds already determined to bind and activate the receptor. After the protein solutions are added to the compounds, the plate is sealed and incubated at room temperature for 2 hours. After incubation, a 5 μL mixture containing streptavidin donor beads (15 μg/ml) and Ni-chelate acceptor beads (15 μg/ml) from the Nickel chelate kit is added. Plates are resealed and incubated in the dark for 2 hours at room temperature and then read in an AlphaFusion reader set at a read time of 1 s/well.
A signal is produced by the binding of coactivator to nuclear receptor that can be detected by the AlphaFusion reader (the binding brings the acceptor beads into close proximity of the donor beads, which allows the acceptor beads to detect the singlet oxygens produced by the donor beads, causing them to emit a light detected by the instrument). Data analysis can be performed using GraphPad Prism (GraphPad Software, Inc.). The relative abilities of many compounds to activate the receptor can be assessed by calculating and comparing each of their EC50 values (i.e., the concentration of compound that causes 50% of the maximal effect, interpolated from the results of a series of tests with varying concentrations of each compound).
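As an illustration of the EC50 interpolation described above, the following minimal Python sketch estimates an EC50 by log-linear interpolation between the two tested concentrations that bracket the half-maximal response. The function name, the sample data, the treatment of the lowest-dose signal as the baseline, and the use of simple interpolation (rather than the nonlinear curve fitting performed by a package such as GraphPad Prism) are assumptions made for illustration only.

```python
import math

def estimate_ec50(concentrations, signals):
    """Estimate EC50 by log-linear interpolation.

    concentrations: tested compound concentrations (ascending, > 0).
    signals: assay signal at each concentration (assumed to rise with dose).
    Returns the concentration giving 50% of the maximal effect, or None
    if the half-maximal response is not bracketed by the data.
    """
    baseline = signals[0]          # assumption: lowest dose ~ no effect
    maximum = max(signals)         # maximal observed response
    half = baseline + 0.5 * (maximum - baseline)
    points = list(zip(concentrations, signals))
    # Walk adjacent dose pairs and find the pair that brackets `half`.
    for (c_lo, s_lo), (c_hi, s_hi) in zip(points, points[1:]):
        if s_lo <= half <= s_hi and s_hi != s_lo:
            frac = (half - s_lo) / (s_hi - s_lo)
            log_lo, log_hi = math.log10(c_lo), math.log10(c_hi)
            return 10 ** (log_lo + frac * (log_hi - log_lo))
    return None

# Hypothetical dose-response data (concentration in nM, signal in counts).
doses = [1, 10, 100, 1000, 10000]
counts = [1000, 2000, 6000, 10000, 11000]
ec50 = estimate_ec50(doses, counts)
```

Compounds can then be ranked by comparing the values returned for each dose series; a lower EC50 indicates a more potent activator under the same assay conditions.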
C. Assaying the Effects of Ligands in Cell Culture
Ligands that modulate the interaction of SF-1 or LRH-1 with co-regulators will affect the expression of genes that are targets of these receptors. Thus assays of the levels of expression of these genes will indicate the effect such compounds are having. For SF-1 an exemplary suitable cell type is the H-295R human adrenal cell. This cell expresses the enzymes, transport proteins, and receptors required for steroid hormone synthesis, and in fact makes the steroid hormone, progesterone, in assayable amounts. After treatment with a ligand, the levels of mRNA encoding these proteins can be quantified by QPCR methods. Alternatively the levels of progesterone can be assayed.
In the case of LRH-1, an exemplary suitable cell type is the HepG2 human liver cell. This cell expresses enzymes, receptors, and transporters important for bile acid synthesis. After treatment with a ligand, the levels of mRNA encoding one or more of these proteins can be quantified by QPCR methods as indicators of the effects of LRH-1 modulation.
III. Development of SF-1 and LRH-1 Active Compounds
A. Modulator Identification and Design
A large number of different methods can be used to identify modulators and to design improved modulators. Some useful methods involve structure-based design.
Structure-based modulator design and identification methods are powerful techniques that can involve searches of computer databases containing a wide variety of potential modulators and chemical functional groups. The computerized design and identification of modulators is useful as the computer databases contain more compounds than the chemical libraries, often by an order of magnitude. For reviews of structure-based drug design and identification, see Kuntz et al., Acc. Chem. Res., 1994, 27:117; Guida, Current Opinion in Struc. Biol., 1994, 4:777; Colman, Current Opinion in Struc. Biol., 1994, 4:868.
The three dimensional structure of a polypeptide defined by structural coordinates can be utilized by these design methods, for example, the structural coordinates of SF-1 or LRH-1. In addition, the three dimensional structures of SF-1 or LRH-1 determined by the homology, molecular replacement, and NMR techniques can also be applied to modulator design and identification methods.
For identifying modulators, structural information for SF-1 or LRH-1, in particular, structural information for the active site of the SF-1 or LRH-1, can be used. However, it may be advantageous to utilize structural information from one or more co-crystals of the receptor with one or more binding compounds. It can also be advantageous if the binding compound has a structural core in common with test compounds.
1. Design by Searching Molecular Data Bases
One method of rational design searches for modulators by docking the computer representations of compounds from a database of molecules. Publicly available databases include, for example:
a) ACD from Molecular Designs Limited
b) NCI from National Cancer Institute
c) CCDC from Cambridge Crystallographic Data Center
d) CAST from Chemical Abstract Service
e) Derwent from Derwent Information Limited
f) Maybridge from Maybridge Chemical Company LTD
g) Aldrich from Aldrich Chemical Company
h) Directory of Natural Products from Chapman & Hall
One such data base (ACD distributed by Molecular Designs Limited Information Systems) contains compounds that are synthetically derived or are natural products. Methods available to those skilled in the art can convert a data set represented in two dimensions to one represented in three dimensions. These methods can be carried out using such computer programs as CONCORD from Tripos Associates or DE-Converter from Molecular Simulations Limited.
Multiple methods of structure-based modulator design are known to those in the art. (Kuntz et al., J. Mol. Biol., 1982, 162:269; Kuntz et al., Acc. Chem. Res., 1994, 27:117; Meng et al., J. Comp. Chem., 1992, 13: 505; Bohm, J. Comp. Aided Molec. Design, 1994, 8: 623.)
A computer program widely utilized by those skilled in the art of rational modulator design is DOCK from the University of California in San Francisco. The general methods utilized by this computer program and programs like it are described in three applications below. More detailed information regarding some of these techniques can be found in the Accelrys User Guide, 1995 (Accelrys, San Diego, Calif.). A typical computer program used for this purpose can perform a process comprising the following steps or functions:
Part (c) refers to characterizing the geometry and the complementary interactions formed between the atoms of the active site and the compounds. A favorable geometric fit is attained when a significant surface area is shared between the compound and active-site atoms without forming unfavorable steric interactions. One skilled in the art would note that the method can be performed by skipping parts (d) and (e) and screening a database of many compounds.
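The notion of a favorable geometric fit described above can be sketched with a toy scoring function. The Python example below counts ligand-to-site atom pairs that are close enough to share surface and penalizes pairs that are so close that they would clash sterically. The function name, the distance cutoffs, the penalty weight, and the coordinates are all hypothetical; this is not the scoring scheme of DOCK or of any other named program.

```python
import math
from itertools import product

def fit_score(ligand_atoms, site_atoms, clash=2.5, contact=4.5):
    """Toy geometric-fit score: favorable contacts minus a clash penalty.

    Each atom is an (x, y, z) tuple in angstroms. Pairs closer than
    `clash` are treated as unfavorable steric overlaps; pairs between
    `clash` and `contact` are treated as favorable contacts. Both
    cutoffs and the penalty weight of 10 are illustrative assumptions.
    """
    contacts = 0
    clashes = 0
    for lig, site in product(ligand_atoms, site_atoms):
        d = math.dist(lig, site)
        if d < clash:
            clashes += 1
        elif d <= contact:
            contacts += 1
    return contacts - 10 * clashes

# Hypothetical active-site atoms and two candidate ligand poses.
site = [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0), (0.0, 4.0, 0.0)]
good_pose = [(2.0, 2.0, 2.5)]   # near all site atoms, no overlap
bad_pose = [(0.5, 0.0, 0.0)]    # overlaps the first site atom

good = fit_score(good_pose, site)
bad = fit_score(bad_pose, site)
```

Applying such a function to each compound in a database and keeping the highest-scoring poses corresponds to the database-screening use of the method mentioned above.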
Structure-based design and identification of modulators of SF-1 and LRH-1 function can be used in conjunction with assay screening. As large computer databases of compounds (around 10,000 compounds) can be searched in a matter of hours or even less, the computer-based method can narrow the compounds tested as potential modulators of SF-1 or LRH-1 function in biochemical or cellular assays.
The above descriptions of structure-based modulator design are not all encompassing and other methods are reported in the literature and can be used, e.g.:
2. Design by Modifying Compounds in Complex with SF-1 and LRH-1
Another way of identifying compounds as potential modulators is to modify an existing modulator in the polypeptide active site. For example, the computer representation of modulators can be modified within the computer representation of a SF-1 or LRH-1 active site (e.g., LBD pocket). Detailed instructions for this technique can be found, for example, in the Accelrys User Manual, 1995 in LUDI. The computer representation of the modulator is typically modified by the deletion of a chemical group or groups or by the addition of a chemical group or groups.
Upon each modification to the compound, the atoms of the modified compound and active site can be shifted in conformation and the distance between the modulator and the active-site atoms may be scored along with any complementary interactions formed between the two molecules. Scoring can be complete when a favorable geometric fit and favorable complementary interactions are attained. Compounds that have favorable scores are potential modulators.
3. Design by Modifying the Structure of Compounds that Bind SF-1 or LRH-1
A third method of structure-based modulator design is to screen compounds designed by a modulator building or modulator searching computer program. Examples of these types of programs can be found in the Molecular Simulations Package, Catalyst. Descriptions for using this program are documented in the Molecular Simulations User Guide (1995). Other computer programs used in this application are ISIS/HOST, ISIS/BASE, and ISIS/DRAW from Molecular Designs Limited and UNITY from Tripos Associates.
These programs can be operated on the structure of a compound that has been removed from the active site of the three dimensional structure of a compound-receptor complex. Operating the program on such a compound is preferable since it is in a biologically active conformation.
A modulator construction computer program is a computer program that may be used to replace computer representations of chemical groups in a compound complexed with a receptor or other biomolecule with groups from a computer database. A modulator searching computer program is a computer program that may be used to search a computer database for computer representations of compounds that have three dimensional structures and chemical groups similar to those of a compound bound to a particular biomolecule.
A typical program can operate by using the following general steps:
Those skilled in the art also recognize that not all of the possible chemical features of the compound need be present in the model of (b). One can use any subset of the model to generate different models for data base searches.
B. Identification of Active Compounds Using SF-1 or LRH-1 Structure and Molecular Scaffolds
In addition to the methods described above that are normally applied based on screening hits that have a substantial level of activity, the availability of crystal structures that include ligand binding sites for SF-1 and LRH-1 enables application of a scaffold method for identifying and developing additional active compounds.
Thus, the present invention also concerns methods for designing ligands active on SF-1 or LRH-1 by using structural information about the respective ligand binding sites and identified binding compounds. While such methods can be implemented in many ways (e.g., as described above), advantageously the process utilizes molecular scaffolds. Such development processes and related methods are described generally below, and can, as indicated, be applied to SF-1 and LRH-1, individually or as a family.
Molecular scaffolds as discussed herein are low molecular weight molecules that bind with low or very low affinity to the target and typically have low or very low activity on that target and/or act broadly across families of target molecules. The ability of a scaffold or other compound to act broadly across multiple members of a target family is advantageous in developing ligands. For example, a scaffold or set of scaffolds can serve as starting compounds for developing ligands with desired specificity or with desired cross-activity on a selected subset of members of a target family. Further, identification of a set of scaffolds that each bind with members of a target family provides an advantageous basis for selecting a starting point for ligand development for a particular target or subset of targets. In many cases, the ability of a scaffold to bind to and/or have activity on multiple members of a target family is related to active site or binding site homology that exists across the target family.
A scaffold active across multiple members of the target family interacts with surfaces or residues of relatively high homology, i.e., binds to conserved regions of the binding pockets. Scaffolds that bind with multiple members can be modified to provide greater specificity or to have a particular cross-reactivity, e.g., by exploiting differences between target binding sites to provide specificity, and exploiting similarities to design in cross-reactivities. Adding substituents that provide attractive interactions with the particular target typically increases the binding affinity, often increasing the activity. The various parts of the ligand development process are described in more detail in following sections, but the following describes an advantageous approach for scaffold-based ligand development.
Scaffold-based ligand development (scaffold-based drug discovery) can be implemented in a variety of ways, but large scale expression of protein is useful to provide material for crystallization, co-crystallization, and biochemical screening (e.g., binding and activity assays). For crystallization, crystallization conditions can be established for apo protein and a structure determined from those crystals. For screening, preferably a biased library selected for the particular target family is screened for binding and/or activity on the target. Highly preferably a plurality of members from the target family is screened. Such screening, whether on a single target or on multiple members of a target family provides screening hits. Low affinity and/or low activity hits are selected. Such low affinity hits can either identify a scaffold molecule, or allow identification of a scaffold molecule by analyzing common features between binding molecules. Simpler molecules containing the common features can then be tested to determine if they retain binding and/or activity, thereby allowing identification of a scaffold molecule.
When multiple members of a particular target family are used for screening, the overlap in binding and/or activity of compounds can provide a useful selection for compounds that will be subjected to crystallization. For example, for 3 target molecules from a target family, if each target has about 200-500 hits in screening of a particular library, much smaller subsets of those hits will be common to any 2 of the 3 targets, and a still smaller subset will be common to all 3 targets, e.g., 100-300. In many cases, compounds in the subset common to all 3 targets will be selected for co-crystallography, as they provide the broadest potential for ligand development.
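The selection of hits common to several family members amounts to a set intersection. The following short Python sketch shows one way this could be computed; the target labels, compound identifiers, and function names are hypothetical.

```python
from itertools import combinations

def common_hits(hits_by_target):
    """Return compounds that are screening hits on every target."""
    return set.intersection(*hits_by_target.values())

def pairwise_common_hits(hits_by_target):
    """Return the hits shared by each pair of targets."""
    return {
        (a, b): hits_by_target[a] & hits_by_target[b]
        for a, b in combinations(sorted(hits_by_target), 2)
    }

# Hypothetical screening hits for three members of a target family.
hits = {
    "target_A": {"cpd1", "cpd2", "cpd3", "cpd5"},
    "target_B": {"cpd2", "cpd3", "cpd4", "cpd5"},
    "target_C": {"cpd3", "cpd5", "cpd6"},
}

shared_by_all = common_hits(hits)
shared_by_pairs = pairwise_common_hits(hits)
```

Compounds in `shared_by_all` would be the candidates for co-crystallography under the approach described above.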
Once compounds for co-crystallography are selected, conditions for forming co-crystals are determined, allowing determination of co-crystal structure, and the orientation of binding compound in the binding site of the target is determined by solving the structure (this can be highly assisted if an apo protein crystal structure has been determined or if the structure of a close homolog is available for use in a homology model.) Preferably the co-crystals are formed by direct co-crystallization rather than by soaking the compound into crystals of apo protein.
From the co-crystals and knowledge of the structure of the binding compounds, additional selection of scaffolds or other binding compounds can be made by applying selection filters, e.g., for (1) binding mode, (2) multiple sites for substitution, and/or (3) tractable chemistry. A binding mode filter can, for example, be based on the demonstration of a dominant binding mode. That is, a scaffold or compounds of a scaffold group bind with a consistent orientation, preferably a consistent orientation across multiple members of a target family. Filtering scaffolds for multiple sites for substitution provides greater potential for developing ligands for specific targets due to the greater capacity for appropriately modifying the structure of the scaffold. Filtering for tractable chemistry also facilitates preparation of ligands derived from a scaffold because the synthetic paths for making derivative compounds are available. Carrying out such a process of development provides scaffolds, preferably of divergent structure.
In some cases, it may be impractical or undesirable to work with a particular target for some or all of the development process. For example, a particular target may be difficult to express, be easily degraded, or be difficult to crystallize. In these cases, a surrogate target from the target family can be used. It is desirable to have the surrogate be as similar as possible to the desired target, thus a family member that has high homology in the binding site should be used, or the binding site can be modified to be more similar to that of the desired target, or part of the sequence of the desired target can be inserted in the family member replacing the corresponding part of the sequence of the family member.
Once one or more scaffolds are identified for a target family, the scaffolds can be used to develop multiple products directed at specific members of the family, or at specific subsets of family members. Thus, starting from a scaffold that acts on multiple members of the target family, derivative compounds (ligands) can be designed and tested that have increasing selectivity. In addition, such ligands are typically developed to have greater activity, and will also typically have greater binding affinity. In this process, starting with the broadly acting scaffold, ligands are developed that have improved selectivity and activity profiles, leading to identification of lead compounds for drug development, leading to drug candidates, and final drug products.
C. Scaffolds
Typically it is advantageous to select scaffolds (and/or compound sets or libraries for scaffold or binding compound identification) with particular types of characteristics, e.g., to select compounds that are more likely to bind to a particular target and/or to select compounds that have physical and/or synthetic properties to simplify preparation of derivatives, to be drug-like, and/or to provide convenient sites and chemistry for modification or synthesis.
Useful chemical properties of molecular scaffolds can include one or more of the following characteristics, but are not limited thereto: an average molecular weight below about 350 daltons, or from about 150 to about 350 daltons, or from about 150 to about 300 daltons; having a clogP below 3; a number of rotatable bonds of less than 4; a number of hydrogen bond donors and acceptors below 5 or below 4; a Polar Surface Area of less than 100 Å²; binding at protein binding sites in an orientation so that chemical substituents from a combinatorial library that are attached to the scaffold can be projected into pockets in the protein binding site; and possessing chemically tractable structures at its substituent attachment points that can be modified, thereby enabling rapid library construction.
The term “Molecular Polar Surface Area (PSA)” refers to the sum of surface contributions of polar atoms (usually oxygens, nitrogens and attached hydrogens) in a molecule. The polar surface area has been shown to correlate well with drug transport properties, such as intestinal absorption, or blood-brain barrier penetration.
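The numerical property ranges listed above lend themselves to a simple computational filter. The Python sketch below assumes that the descriptors have already been calculated by a separate cheminformatics tool, and it interprets the hydrogen bond criterion as applying to donors and to acceptors individually; the class, field names, and example compounds are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Compound:
    name: str
    mol_weight: float       # daltons
    clogp: float            # calculated logP
    rotatable_bonds: int
    hbond_donors: int
    hbond_acceptors: int
    psa: float              # polar surface area, square angstroms

def is_scaffold_like(c):
    """Apply the scaffold property ranges given in the text.

    Assumption: the "below 5" hydrogen bond limit is applied to donors
    and to acceptors separately.
    """
    return (
        150 <= c.mol_weight <= 350
        and c.clogp < 3
        and c.rotatable_bonds < 4
        and c.hbond_donors < 5
        and c.hbond_acceptors < 5
        and c.psa < 100
    )

# Hypothetical library members with precomputed descriptors.
library = [
    Compound("frag_1", 210.0, 1.8, 2, 1, 3, 55.0),
    Compound("big_1", 480.0, 4.2, 7, 3, 6, 120.0),
]
scaffold_candidates = [c.name for c in library if is_scaffold_like(c)]
```

The remaining criteria in the list, such as binding orientation and chemically tractable attachment points, depend on structural and synthetic judgment and are not captured by a descriptor filter of this kind.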
Additional useful chemical properties of distinct compounds for inclusion in a combinatorial library include the ability to attach chemical moieties to the compound that will not interfere with binding of the compound to at least one protein of interest, and that will impart desirable properties to the library members, for example, causing the library members to be actively transported to cells and/or organs of interest, or the ability to attach to a device such as a chromatography column (e.g., a streptavidin column through a molecule such as biotin) for uses such as tissue and proteomics profiling purposes.
A person of ordinary skill in the art will realize other properties that can be desirable for the scaffold or library members to have depending on the particular requirements of the use, and that compounds with these properties can also be sought and identified in like manner. Methods of selecting compounds for assay are known to those of ordinary skill in the art, for example, methods and compounds described in U.S. Pat. Nos. 6,288,234, 6,090,912, and 5,840,485, each of which is hereby incorporated by reference in its entirety, including all charts and drawings.
In various embodiments, the present invention provides methods of designing ligands that bind to a plurality of members of a molecular family, where the ligands contain a common molecular scaffold. Thus, a compound set can be assayed for binding to a plurality of members of a molecular family, e.g., a protein family. One or more compounds that bind to a plurality of family members can be identified as molecular scaffolds. When the orientation of the scaffold at the binding site of the target molecules has been determined and chemically tractable structures have been identified, a set of ligands can be synthesized starting with one or a few molecular scaffolds to arrive at a plurality of ligands, wherein each ligand binds to a separate target molecule of the molecular family with altered or changed binding affinity or binding specificity relative to the scaffold. Thus, a plurality of drug lead molecules can be designed to individually target members of a molecular family based on the same molecular scaffold, and act on them in a specific manner.
D. Binding Assays
1. Use of Binding Assays
The methods of the present invention can involve assays that are able to detect the binding of compounds to a target molecule at a signal of at least about three times the standard deviation of the background signal, or at least about four times the standard deviation of the background signal. The assays can also include assaying compounds for low affinity binding to the target molecule. A large variety of assays indicative of binding are known for different target types and can be used for this invention. Compounds that act broadly across protein families are not likely to have a high affinity against individual targets, due to the broad nature of their binding. Thus, assays (e.g., as described herein) highly preferably allow for the identification of compounds that bind with low affinity, very low affinity, and extremely low affinity. Therefore, potency (or binding affinity) is not the primary, nor even the most important, indicia of identification of a potentially useful binding compound. Rather, even those compounds that bind with low affinity, very low affinity, or extremely low affinity can be considered as molecular scaffolds that can continue to the next phase of the ligand design process.
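The signal criterion described above can be expressed as a simple threshold calculation. The Python sketch below treats a compound as a binder when its signal exceeds the mean of the background wells by more than a chosen multiple of the background standard deviation; measuring the signal relative to the background mean is an assumption about how the criterion is applied, and the function name and well readings are hypothetical.

```python
from statistics import mean, stdev

def call_hits(background, signals, n_sd=3.0):
    """Flag compounds whose signal exceeds background by n_sd deviations.

    background: readings from wells without compound.
    signals: mapping of compound name to its reading.
    Returns (sorted list of hit names, threshold used).
    """
    threshold = mean(background) + n_sd * stdev(background)
    hits = sorted(name for name, s in signals.items() if s > threshold)
    return hits, threshold

# Hypothetical plate readings in arbitrary signal units.
background_wells = [100, 102, 98, 101, 99]
compound_wells = {"cpd_A": 103.0, "cpd_B": 106.0, "cpd_C": 150.0}

hits, threshold = call_hits(background_wells, compound_wells)
```

With this criterion a weak binder such as the hypothetical `cpd_B` is retained alongside a strong binder, consistent with the emphasis on keeping low affinity compounds as potential scaffolds. Passing `n_sd=4.0` applies the stricter four-standard-deviation cutoff.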
As indicated above, to design or discover scaffolds that act broadly across protein families, proteins of interest can be assayed against a compound collection or set. The assays can preferably be enzymatic or binding assays. In some embodiments it may be desirable to enhance the solubility of the compounds being screened and then analyze all compounds that show activity in the assay, including those that bind with low affinity or produce a signal with greater than about three times the standard deviation of the background signal. These assays can be any suitable assay such as, for example, binding assays that measure the binding affinity between two binding partners. Various types of screening assays that can be useful in the practice of the present invention are known in the art, such as those described in U.S. Pat. Nos. 5,763,198, 5,747,276, 5,877,007, 6,243,980, and 6,294,330, each of which is hereby incorporated by reference in its entirety, including all charts and drawings.
In various embodiments of the assays at least one compound, at least about 5%, at least about 10%, at least about 15%, at least about 20%, or at least about 25% of the compounds can bind with low affinity. In many cases, up to about 20% of the compounds can show activity in the screening assay and these compounds can then be analyzed directly with high-throughput co-crystallography, computational analysis to group the compounds into classes with common structural properties (e.g., structural core and/or shape and polarity characteristics), and the identification of common chemical structures between compounds that show activity.
The person of ordinary skill in the art will realize that decisions can be based on criteria that are appropriate for the needs of the particular situation, and that the decisions can be made by computer software programs. Classes can be created containing almost any number of scaffolds, and the criteria selected can be based on increasingly exacting criteria until an arbitrary number of scaffolds is arrived at for each class that is deemed to be advantageous.
2. Surface Plasmon Resonance
Binding parameters can be measured using surface plasmon resonance, for example, with a BIAcore® chip (Biacore, Japan) coated with immobilized binding components. Surface plasmon resonance is used to characterize the microscopic association and dissociation constants of reaction between an sFv or other ligand directed against target molecules. Such methods are generally described in the following references which are incorporated herein by reference: Vely F. et al., Methods in Molecular Biology., 2000, 121:313-21; Liparoto et al., J. Molecular Recognition., 1999, 12:316-21; Lipschultz et al., Methods. 2000, 20:310-8; Malmqvist., Biochemical Society Transactions, 1999, 27:335-40; Alfthan, 1998, Biosensors & Bioelectronics. 13:653-63; Fivash et al., Current Opinion in Biotechnology, 1998, 9:97-101; Price et al., 1998, Tumour Biology 19 Suppl 1:1-20; Malmqvist et al., Current Opinion in Chemical Biology., 1997, 1:378-83; O'Shannessy et al., Analytical Biochemistry. 1996, 236:275-83; Malmborg et al., 1995, J. Immunological Methods. 183:7-13; Van Regenmortel, Developments in Biological Standardization., 1994, 83:143-51; and O'Shannessy, Current Opinions in Biotechnology., 1994, 5:65-71.
BIAcore® uses the optical properties of surface plasmon resonance (SPR) to detect alterations in protein concentration bound to a dextran matrix lying on the surface of a gold/glass sensor chip interface, a dextran biosensor matrix. In brief, proteins are covalently bound to the dextran matrix at a known concentration and a ligand for the protein is injected through the dextran matrix. Near infrared light, directed onto the opposite side of the sensor chip surface, is reflected and also induces an evanescent wave in the gold film, which in turn causes an intensity dip in the reflected light at a particular angle known as the resonance angle. If the refractive index of the sensor chip surface is altered (e.g., by ligand binding to the bound protein) a shift occurs in the resonance angle. This angle shift can be measured and is expressed as resonance units (RUs) such that 1000 RUs is equivalent to a change in surface protein concentration of 1 ng/mm². These changes are displayed with respect to time along the y-axis of a sensorgram, which depicts the association and dissociation of any biological reaction.
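The stated equivalence of 1000 RUs to 1 ng/mm² allows a resonance shift to be converted directly into a surface concentration. The following Python sketch shows the arithmetic; the function names and the example sensor area are hypothetical and are not taken from any instrument specification.

```python
RU_PER_NG_PER_MM2 = 1000.0  # from the text: 1000 RU = 1 ng/mm^2

def ru_to_surface_concentration(delta_ru):
    """Convert a resonance shift (RU) to surface concentration (ng/mm^2)."""
    return delta_ru / RU_PER_NG_PER_MM2

def bound_mass_ng(delta_ru, area_mm2):
    """Estimate bound mass (ng) over a sensor area given in mm^2."""
    return ru_to_surface_concentration(delta_ru) * area_mm2

# A shift of 250 RU corresponds to 0.25 ng/mm^2 of bound material.
surface_conc = ru_to_surface_concentration(250)
# Assumed (hypothetical) sensing area of 1.2 mm^2.
mass = bound_mass_ng(500, 1.2)
```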
E. High Throughput Screening (HTS) Assays
HTS typically uses automated assays to search through large numbers of compounds for a desired activity. Typically HTS assays are used to find new drugs by screening for chemicals that act on a particular enzyme or molecule. For example, if a chemical inactivates an enzyme it might prove to be effective in preventing a process in a cell which causes a disease. High throughput methods enable researchers to assay thousands of different chemicals against each target molecule very quickly using robotic handling systems and automated analysis of results.
As used herein, “high throughput screening” or “HTS” refers to the rapid in vitro screening of large numbers of compounds (libraries), generally tens to hundreds of thousands of compounds, using robotic screening assays. Ultra high-throughput screening (uHTS) generally refers to high-throughput screening accelerated to greater than 100,000 tests per day.
To achieve high-throughput screening, it is advantageous to house samples on a multicontainer carrier or platform. A multicontainer carrier facilitates measuring reactions of a plurality of candidate compounds simultaneously. Multi-well microplates may be used as the carrier. Such multi-well microplates, and methods for their use in numerous assays, are both known in the art and commercially available.
Screening assays may include controls for purposes of calibration and confirmation of proper manipulation of the components of the assay. Blank wells that contain all of the reactants but no member of the chemical library are usually included. As another example, a known inhibitor (or activator) of an enzyme for which modulators are sought can be incubated with one sample of the assay, and the resulting decrease (or increase) in the enzyme activity used as a comparator or control. It will be appreciated that modulators can also be combined with the enzyme activators or inhibitors to find modulators which inhibit the enzyme activation or repression that is otherwise caused by the presence of the known enzyme modulator. Similarly, when ligands to a target are sought, known ligands of the target can be present in control/calibration assay wells.
F. Measuring Enzymatic and Binding Reactions During Screening Assays
Techniques for measuring the progression of enzymatic and binding reactions, e.g., in multicontainer carriers, are known in the art and include, but are not limited to, the following.
Spectrophotometric and spectrofluorometric assays are well known in the art. Examples of such assays include the use of colorimetric assays for the detection of peroxides, as described in Gordon, A. J. and Ford, R. A., The Chemist's Companion: A Handbook Of Practical Data, Techniques, And References, John Wiley and Sons, N.Y., 1972, Page 437.
Fluorescence spectrometry may be used to monitor the generation of reaction products. Fluorescence methodology is generally more sensitive than the absorption methodology. The use of fluorescent probes is well known to those skilled in the art. For reviews, see Bashford et al., Spectrophotometry and Spectrofluorometry: A Practical Approach, pp. 91-114, IRL Press Ltd. (1987); and Bell, Spectroscopy In Biochemistry, Vol. 1, pp. 155-194, CRC Press (1981).
In spectrofluorometric methods, enzymes are exposed to substrates that change their intrinsic fluorescence when processed by the target enzyme. Typically, the substrate is nonfluorescent and is converted to a fluorophore through one or more reactions. As a non-limiting example, sphingomyelinase (SMase) activity can be detected using the Amplex® Red reagent (Molecular Probes, Eugene, Oreg.). In order to measure SMase activity using Amplex® Red, the following reactions occur. First, SMase hydrolyzes sphingomyelin to yield ceramide and phosphorylcholine. Second, alkaline phosphatase hydrolyzes phosphorylcholine to yield choline. Third, choline is oxidized by choline oxidase to betaine, generating H2O2. Finally, the H2O2, in the presence of horseradish peroxidase, reacts with Amplex® Red to produce the fluorescent product, Resorufin, and the signal therefrom is detected using spectrofluorometry.
Fluorescence polarization (FP) is based on a decrease in the speed of molecular rotation of a fluorophore that occurs upon binding to a larger molecule, such as a receptor protein, allowing for polarized fluorescent emission by the bound ligand. FP is empirically determined by measuring the vertical and horizontal components of fluorophore emission following excitation with plane polarized light. Polarized emission is increased when the molecular rotation of a fluorophore is reduced. A fluorophore produces a larger polarized signal when it is bound to a larger molecule (i.e. a receptor), slowing molecular rotation of the fluorophore. The magnitude of the polarized signal relates quantitatively to the extent of fluorescent ligand binding. Accordingly, polarization of the “bound” signal depends on maintenance of high affinity binding.
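The measurement described above can be sketched numerically. The sketch below uses the conventional definition of polarization, P = (Iv − Ih)/(Iv + Ih), where Iv and Ih are the vertical and horizontal emission intensities following vertically polarized excitation. This formula is the standard textbook definition rather than one recited in this description, and the function names and intensity values are hypothetical. Instrument-specific corrections (e.g., a G-factor) are deliberately omitted.

```python
def polarization(i_vertical, i_horizontal):
    """Polarization P from vertical and horizontal emission intensities."""
    total = i_vertical + i_horizontal
    if total <= 0:
        raise ValueError("total emission intensity must be positive")
    return (i_vertical - i_horizontal) / total


def millipolarization(i_vertical, i_horizontal):
    """Polarization expressed in millipolarization units (mP)."""
    return 1000.0 * polarization(i_vertical, i_horizontal)


# Hypothetical readings: a free tracer rotates quickly (low polarization);
# a receptor-bound tracer rotates slowly (higher polarization).
free_mp = millipolarization(110.0, 100.0)
bound_mp = millipolarization(300.0, 100.0)
print(f"free: {free_mp:.1f} mP, bound: {bound_mp:.1f} mP")
```

Because P is a ratio, scaling both intensities by the same factor leaves it unchanged, which is one way to see why the readout does not depend on overall emission intensity.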
FP is a homogeneous technology and reactions are very rapid, taking seconds to minutes to reach equilibrium. The reagents are stable, and large batches may be prepared, resulting in high reproducibility. Because of these properties, FP has proven to be highly automatable, often performed with a single incubation with a single, premixed, tracer-receptor reagent. For a review, see Owicki et al., Application of Fluorescence Polarization Assays in High-Throughput Screening, in Genetic Engineering News, 1997, 17:27.
FP is particularly desirable since its readout is independent of the emission intensity (Checovich, W. J., et al., Nature 1995, 375:254-256; Dandliker, W. B., et al., Methods in Enzymology 1981, 74:3-28) and is thus insensitive to the presence of colored compounds that quench fluorescence emission. FP and FRET (see below) are well-suited for identifying compounds that block interactions between sphingolipid receptors and their ligands. See, for example, Parker et al., Development of high throughput screening assays using fluorescence polarization: nuclear receptor-ligand-binding and kinase/phosphatase assays, J. Biomol Screen, 2000, 5:77-88.
Fluorophores derived from sphingolipids that may be used in FP assays are commercially available. For example, Molecular Probes (Eugene, Oreg.) currently sells two sphingomyelin fluorophores and one ceramide fluorophore. These are, respectively, N-(4,4-difluoro-5,7-dimethyl-4-bora-3a,4a-diaza-s-indacene-3-pentanoyl)sphingosyl phosphocholine (BODIPY® FL C5-sphingomyelin); N-(4,4-difluoro-5,7-dimethyl-4-bora-3a,4a-diaza-s-indacene-3-dodecanoyl)sphingosyl phosphocholine (BODIPY® FL C12-sphingomyelin); and N-(4,4-difluoro-5,7-dimethyl-4-bora-3a,4a-diaza-s-indacene-3-pentanoyl)sphingosine (BODIPY® FL C5-ceramide). U.S. Pat. No. 4,150,949 (Immunoassay for gentamicin) discloses fluorescein-labelled gentamicins, including fluoresceinthiocarbamyl gentamicin. Additional fluorophores may be prepared using methods well known to the skilled artisan.
Exemplary normal-and-polarized fluorescence readers include the POLARION® fluorescence polarization system (Tecan AG, Hombrechtikon, Switzerland). General multiwell plate readers for other assays are available, such as the VERSAMAX® reader and the SPECTRAMAX® multiwell plate spectrophotometer (both from Molecular Devices).
Fluorescence resonance energy transfer (FRET) is another useful assay for detecting interaction and has been described. See, e.g., Heim et al., Curr. Biol. 1996, 6:178-182; Mitra et al., Gene, 1996, 173:13-17; and Selvin et al., Meth. Enzymol., 1995, 246:300-345. FRET detects the transfer of energy between two fluorescent substances in close proximity, having known excitation and emission wavelengths. As an example, a protein can be expressed as a fusion protein with green fluorescent protein (GFP). When two fluorescent proteins are in proximity, such as when a protein specifically interacts with a target molecule, the resonance energy can be transferred from one excited molecule to the other. As a result, the emission spectrum of the sample shifts, which can be measured by a fluorometer, such as a fMAX multiwell fluorometer (Molecular Devices, Sunnyvale Calif.).
Scintillation proximity assay (SPA) is a particularly useful assay for detecting an interaction with the target molecule. SPA is widely used in the pharmaceutical industry and has been described (Hanselman et al., J. Lipid Res., 1997, 38:2365-2373; Kahl et al., Anal. Biochem., 1996, 243:282-283; Udenfriend et al., Anal. Biochem., 1987, 161:494-500). See also U.S. Pat. Nos. 4,626,513 and 4,568,649, and European Patent No. 0,154,734. One commercially available system uses FLASHPLATE® scintillant-coated plates (NEN Life Science Products, Boston, Mass.).
The target molecule can be bound to the scintillator plates by a variety of well known means. Scintillant plates are available that are derivatized to bind to fusion proteins such as GST, His6 or Flag fusion proteins. Where the target molecule is a protein complex or a multimer, one protein or subunit can be attached to the plate first, then the other components of the complex added later under binding conditions, resulting in a bound complex.
In a typical SPA assay, the gene products in the expression pool will have been radiolabeled and added to the wells, and allowed to interact with the solid phase, which is the immobilized target molecule and scintillant coating in the wells. The assay can be measured immediately or allowed to reach equilibrium. Either way, when a radiolabel becomes sufficiently close to the scintillant coating, it produces a signal detectable by a device such as a TOPCOUNT NXT® microplate scintillation counter (Packard BioScience Co., Meriden Conn.). If a radiolabeled expression product binds to the target molecule, the radiolabel remains in proximity to the scintillant long enough to produce a detectable signal.
In contrast, the labeled proteins that do not bind to the target molecule, or bind only briefly, will not remain near the scintillant long enough to produce a signal above background. Any time spent near the scintillant caused by random Brownian motion will also not result in a significant amount of signal. Likewise, residual unincorporated radiolabel used during the expression step may be present, but will not generate significant signal because it will be in solution rather than interacting with the target molecule. These non-binding interactions will therefore cause a certain level of background signal that can be mathematically removed. If too many signals are obtained, salt or other modifiers can be added directly to the assay plates until the desired specificity is obtained (Nichols et al., Anal. Biochem., 1998, 257:112-119).
Additionally, the assay can utilize the AlphaScreen (amplified luminescent proximity homogeneous assay) format, e.g., the AlphaScreening system (Packard BioScience). AlphaScreen is generally described in Seethala and Prabhavathi, Homogenous Assays: AlphaScreen, Handbook of Drug Screening, Marcel Dekker Pub., 2001, pp. 106-110.
G. Assay Compounds and Molecular Scaffolds
As described above, preferred characteristics of a scaffold include being of low molecular weight (e.g., less than 350 daltons, or from about 100 to about 350 daltons, or from about 150 to about 300 daltons). Preferably clogP of a scaffold is from −1 to 8, more preferably less than 6, 5, or 4, most preferably less than 3. In particular embodiments the clogP is in a range of −1 to an upper limit of 2, 3, 4, 5, 6, or 8; or is in a range of 0 to an upper limit of 2, 3, 4, 5, 6, or 8. Preferably the number of rotatable bonds is less than 5, more preferably less than 4. Preferably the number of hydrogen bond donors and acceptors is below 6, more preferably below 5. An additional criterion that can be useful is a Polar Surface Area of less than 100 Å². Guidance that can be useful in identifying criteria for a particular application can be found in Lipinski et al., Advanced Drug Delivery Reviews, 1997, 23:3-25, which is hereby incorporated by reference in its entirety.
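By way of illustration only, the preferred scaffold characteristics listed above can be expressed as a simple filter over precomputed descriptors. The sketch below is a minimal example under stated assumptions: the class and function names are hypothetical, the descriptor values are invented for the example, and the hydrogen bond donor and acceptor counts are each required to be below 6 (the text could also be read as a limit on their sum). Descriptor calculation itself is outside the scope of the sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Descriptors:
    """Precomputed properties of a candidate scaffold (hypothetical record)."""
    name: str
    mol_weight: float          # daltons
    clogp: float
    rotatable_bonds: int
    h_bond_donors: int
    h_bond_acceptors: int
    polar_surface_area: float  # square angstroms


def passes_scaffold_criteria(d, max_clogp=8.0):
    """Apply the broadest preferred ranges given in the text."""
    return (
        100.0 <= d.mol_weight <= 350.0
        and -1.0 <= d.clogp <= max_clogp
        and d.rotatable_bonds < 5
        and d.h_bond_donors < 6        # assumption: donors counted separately
        and d.h_bond_acceptors < 6     # assumption: acceptors counted separately
        and d.polar_surface_area < 100.0
    )


# Invented example compounds, not data from this description.
candidates = [
    Descriptors("cmpd-A", 306.8, 4.2, 4, 0, 4, 48.0),
    Descriptors("cmpd-B", 512.6, 5.9, 9, 3, 8, 140.0),
]
selected = [d.name for d in candidates if passes_scaffold_criteria(d)]
print(selected)
```

Tighter embodiments (for example, clogP below 3) can be explored by passing a smaller `max_clogp`.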
A scaffold will preferably bind to a given protein binding site in a configuration that causes substituent moieties of the scaffold to be situated in pockets of the protein binding site. Also, possessing chemically tractable groups that can be chemically modified, particularly through synthetic reactions, to easily create a combinatorial library can be a preferred characteristic of the scaffold. Also preferred can be having positions on the scaffold to which other moieties can be attached, which do not interfere with binding of the scaffold to the protein(s) of interest but do cause the scaffold to achieve a desirable property, for example, active transport of the scaffold to cells and/or organs, enabling the scaffold to be attached to a chromatographic column to facilitate analysis, or another desirable property. A molecular scaffold can bind to a target molecule with any affinity, such as binding with an affinity measurable as about three times the standard deviation of the background signal, or at high affinity, moderate affinity, low affinity, very low affinity, or extremely low affinity.
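The criterion of an affinity measurable as about three times the standard deviation of the background signal can be illustrated with a short calculation. The sketch below assumes that background is estimated from blank wells (wells containing all reactants but no library member) and that the sample standard deviation is used; the function names and well readings are hypothetical.

```python
from statistics import mean, stdev


def detection_threshold(blank_signals, n_sd=3.0):
    """Signal level that lies n_sd standard deviations above mean background."""
    return mean(blank_signals) + n_sd * stdev(blank_signals)


def is_measurable_binder(signal, blank_signals, n_sd=3.0):
    """True if a well's signal exceeds the background-derived threshold."""
    return signal > detection_threshold(blank_signals, n_sd)


# Hypothetical blank-well readings in arbitrary signal units.
blanks = [100.0, 102.0, 98.0, 101.0, 99.0]
print(round(detection_threshold(blanks), 2))
print(is_measurable_binder(104.0, blanks), is_measurable_binder(106.0, blanks))
```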
Thus, the above criteria can be utilized to select many compounds for testing that have the desired attributes. Many compounds having the criteria described are available in the commercial market, and may be selected for assaying depending on the specific needs to which the methods are to be applied. In some cases sufficiently large numbers of compounds may meet specific criteria that additional methods to group similar compounds may be helpful. A variety of methods to assess molecular similarity, such as the Tanimoto coefficient have been used, see Willett et al., J. Chemical Information and Computer Science, 1998, 38:983-996. These can be used to select a smaller subset of a group of highly structurally redundant compounds. In addition, cluster analysis based on relationships between the compounds, or structural components of the compound, can also be carried out to the same end; see Lance & Williams, Computer J., 1967, 9:373-380, Jarvis & Patrick IEEE Transactions in Computers, 1973, C-22:1025-1034 for clustering algorithms, and Downs et al. J. Chemical Information and Computer Sciences, 1994, 34:1094-1102 for a review of these methods applied to chemical problems. One method of deriving the chemical components of a large group of potential scaffolds is to virtually break up the compound at rotatable bonds so as to yield components of no less than 10 atoms. The resulting components may be clustered based on some measure of similarity, e.g. the Tanimoto coefficient, to yield the common component groups in the original collection of compounds. For each component group, all compounds containing that component may be clustered, and the resulting clusters used to select a diverse set of compounds containing a common chemical core structure. In this fashion, a useful library of scaffolds may be derived even from millions of commercial compounds.
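The use of the Tanimoto coefficient to reduce a structurally redundant group of compounds can be sketched as follows. This is a minimal illustration, not the clustering procedure of any cited reference: each compound is assumed to be represented by a set of integer feature identifiers (a hypothetical fingerprint), and a simple greedy pass keeps a compound only if it is sufficiently dissimilar to every compound already kept.

```python
def tanimoto(a, b):
    """Tanimoto coefficient of two feature sets: |A & B| / |A | B|."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def pick_diverse(fingerprints, max_similarity):
    """Greedily keep compounds whose similarity to all kept ones is below the cutoff.

    `fingerprints` maps a compound name to its set of feature identifiers.
    The result depends on input order, which is a limitation of this sketch.
    """
    kept = []
    for name, features in fingerprints.items():
        if all(tanimoto(features, fingerprints[k]) < max_similarity for k in kept):
            kept.append(name)
    return kept


# Hypothetical fingerprints: "a" and "b" share most features, "c" shares none.
library = {
    "a": {1, 2, 3, 4},
    "b": {1, 2, 3, 5},
    "c": {7, 8, 9},
}
print(pick_diverse(library, max_similarity=0.5))
```

The same similarity function could serve as the distance measure when clustering the components obtained by virtually breaking compounds at rotatable bonds.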
A “compound library” or “library” is a collection of different compounds having different chemical structures. A compound library is screenable, that is, the compound library members therein may be subject to screening assays. In preferred embodiments, the library members can have a molecular weight of from about 100 to about 350 daltons, or from about 150 to about 350 daltons.
Libraries can contain at least one compound that binds to the target molecule at low affinity. Libraries of candidate compounds can be assayed by many different assays, such as those described above, e.g., a fluorescence polarization assay. Libraries may consist of chemically synthesized peptides, peptidomimetics, or arrays of combinatorial chemicals that are large or small, focused or nonfocused. By “focused” it is meant that the collection of compounds is prepared using the structure of previously characterized compounds and/or pharmacophores.
Compound libraries may contain molecules isolated from natural sources, artificially synthesized molecules, or molecules synthesized, isolated, or otherwise prepared in such a manner so as to have one or more moieties variable, e.g., moieties that are independently isolated or randomly synthesized. Types of molecules in compound libraries include but are not limited to organic compounds, polypeptides and nucleic acids as those terms are used herein, and derivatives, conjugates and mixtures thereof.
Compound libraries useful for the invention may be purchased on the commercial market or prepared or obtained by any means including, but not limited to, combinatorial chemistry techniques, fermentation methods, plant and cellular extraction procedures and the like (see, e.g., Cwirla et al., Biochemistry, 1990, 87:6378-6382; Houghten et al., Nature, 1991, 354:84-86; Lam et al., Nature, 1991, 354:82-84; Brenner et al., Proc. Natl. Acad. Sci. USA, 1992, 89:5381-5383; R. A. Houghten, Trends Genet., 1993, 9:235-239; E. R. Felder, Chimia, 1994, 48:512-541; Gallop et al., J. Med. Chem., 1994, 37:1233-1251; Gordon et al., J. Med. Chem., 1994, 37:1385-1401; Carell et al., Chem. Biol., 1995, 3:171-183; Madden et al., Perspectives in Drug Discovery and Design 2:269-282; Lebl et al., Biopolymers, 1995, 37:177-198); small molecules assembled around a shared molecular structure; collections of chemicals that have been assembled by various commercial and noncommercial groups; natural products; and extracts of marine organisms, fungi, bacteria, and plants.
Preferred libraries can be prepared in a homogeneous reaction mixture, and separation of unreacted reagents from members of the library is not required prior to screening. Although many combinatorial chemistry approaches are based on solid state chemistry, liquid phase combinatorial chemistry is also capable of generating libraries (Sun C M., Recent advances in liquid-phase combinatorial chemistry, Combinatorial Chemistry & High Throughput Screening, 1999, 2:299-318).
Libraries of a variety of types of molecules, from which members having one or more preselected attributes can be obtained, can be prepared by a variety of techniques, including but not limited to parallel array synthesis (Houghten, Ann. Rev. Pharmacol. Toxicol., 2000, 40:273-82); solution-phase combinatorial chemistry (Merritt, Comb Chem High Throughput Screen, 1998, 1:57-72; Coe et al., Mol. Divers, 1998-99, 4:31-38; Sun, Comb Chem High Throughput Screen, 1999, 2:299-318); synthesis on soluble polymer (Gravert et al., Curr Opin Chem Biol., 1997, 1:107-13); and the like. See, e.g., Dolle et al., J. Comb Chem., 1999, 1:235-82; and Kundu et al., Prog Drug Res., 1999, 53:89-156 (Combinatorial chemistry: polymer supported synthesis of peptide and non-peptide libraries). Compounds may be chemically tagged for ease of identification (Chabala, Curr Opin Biotechnol., 1995, 6:633-9, Solid-phase combinatorial chemistry and novel tagging methods for identifying leads).
The combinatorial synthesis of carbohydrates and libraries containing oligosaccharides has been described (Schweizer et al., Curr. Opin. Chem. Biol., 1999, 3:291-8, Combinatorial synthesis of carbohydrates). The synthesis of natural-product based compound libraries has been described (Wessjohann, Curr. Opin. Chem. Biol., 2000, 4:303-9).
Libraries of nucleic acids are prepared by various techniques, including by way of non-limiting example the ones described herein, for the isolation of aptamers. Libraries that include oligonucleotides and polyaminooligonucleotides (Markiewicz et al., Farmaco., 2000, 55:174-7) displayed on streptavidin magnetic beads are known. Nucleic acid libraries are known that can be coupled to parallel sampling and be deconvoluted without complex procedures such as automated mass spectrometry (Enjalbal et al., Mass Spectrometry Reviews, 2000, 19:139-61) and parallel tagging (Perrin D M., Combinatorial Chemistry & High Throughput Screening, 3:243-69).
Peptidomimetics are identified using combinatorial chemistry and solid phase synthesis (Kim H O., Kahn M., Combinatorial Chemistry & High Throughput Screening, 2000, 3:167-83; al-Obeidi, Mol Biotechnol., 1998, 9:205-23). The synthesis may be entirely random or based in part on a known polypeptide.
Polypeptide libraries can be prepared according to various techniques. In brief, phage display techniques can be used to produce polypeptide ligands (Gram H., Combinatorial Chemistry & High Throughput Screening, 1999, 2:19-28) that may be used as the basis for synthesis of peptidomimetics. Polypeptides, constrained peptides, proteins, protein domains, antibodies, single chain antibody fragments, antibody fragments, and antibody combining regions are displayed on filamentous phage for selection.
Large libraries of individual variants of human single chain Fv antibodies have been produced. See, e.g., Siegel et al., J. Molecular Biology, 2000, 302:285-93; Poul et al., J. Molecular Biology, 2000, 301:1149-61; Amersdorfer et al., Methods in Molecular Biology, 2001, 145:219-40; Hughes-Jones et al., British J. Haematology, 1999, 105:811-6; McCall et al., Immunotechnology, 1998, 4:71-87; Sheets et al., Proc Natl Acad Sci USA, 1998, 95:6157-62 (published erratum appears in Proc Natl Acad Sci USA, 1999, 96:795).
Focused or smart chemical and pharmacophore libraries can be designed with the help of sophisticated strategies involving computational chemistry (e.g., Kundu et al., Progress in Drug Research 1999, 53:89-156) and the use of structure-based ligands using database searching and docking, de novo drug design and estimation of ligand binding affinities (Joseph-McCarthy D., Pharmacology & Therapeutics 1999, 84:179-91; Kirkpatrick et al., Combinatorial Chemistry & High Throughput Screening., 1999, 2:211-21; Eliseev & Lehn, Current Topics in Microbiology & Immunology, 1999, 243:159-72; Bolger et al., Methods Enz. 1991, 203:21-45; Martin, Methods Enz. 1991, 203:587-613; Neidle et al., Methods Enz. 1991, 203:433-458; U.S. Pat. No. 6,178,384).
Selecting a library of potential scaffolds and a set of assays measuring binding to representative target molecules which are in a particular protein family thus allows the creation of a data set profiling binding of the library to the target protein family. Groups of scaffolds with different sets of binding properties can be identified using the information within this dataset. Thus, groups of scaffolds binding to one, two or three members of the family may be selected for particular applications.
In many cases, a group of scaffolds exhibiting binding to two or more members of a target protein family will contain scaffolds with a greater likelihood that such binding results from specific interactions with the individual target proteins. This would be expected to substantially reduce the effect of so-called “promiscuous inhibitors” which severely complicate the interpretation of screening assays (see McGovern et al., J. Med. Chem. 2002, 45:1712-22). Thus, in many preferred applications the property of displaying binding to multiple target molecules in a protein family may be used as a selection criterion to identify molecules with desirable properties. In addition, groups of scaffolds binding to specific subsets of a set of potential target molecules may be selected. Such a case would include the subset of scaffolds that bind to any two of three or three of five members of a target protein family.
Such subsets may also be used in combination or opposition to further define a group of scaffolds that have additional desirable properties. This would be of significant utility in cases where inhibiting some members of a protein family had known desirable effects, such as inhibiting tumor growth, whereas inhibiting other members of the protein family which were found to be essential for normal cell function would have undesirable effects. A criterion that would be useful in such a case includes selecting the subset of scaffolds binding to any two of three desirable target molecules and eliminating from this group any that bound to more than one of any three undesirable target molecules.
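The selection criterion just described (binding to at least two of three desirable targets, and to no more than one of three undesirable targets) amounts to a pair of set-intersection counts over a binding profile. The following sketch is illustrative only; the scaffold and target labels are hypothetical placeholders and the binding data are invented.

```python
def select_scaffolds(binding, desirable, undesirable,
                     min_desirable=2, max_undesirable=1):
    """Return scaffolds meeting the combined inclusion/exclusion criterion.

    `binding` maps each scaffold name to the set of targets it was found to bind.
    """
    selected = []
    for scaffold, targets in binding.items():
        hits_wanted = len(targets & desirable)
        hits_unwanted = len(targets & undesirable)
        if hits_wanted >= min_desirable and hits_unwanted <= max_undesirable:
            selected.append(scaffold)
    return selected


# Placeholder target labels: D1-D3 desirable, U1-U3 undesirable.
desirable = {"D1", "D2", "D3"}
undesirable = {"U1", "U2", "U3"}
profile = {
    "S1": {"D1", "D2"},                    # kept: two desirable, none undesirable
    "S2": {"D1", "D2", "D3", "U1", "U2"},  # dropped: two undesirable
    "S3": {"D1"},                          # dropped: only one desirable
    "S4": {"D2", "D3", "U3"},              # kept: one undesirable is tolerated
}
print(select_scaffolds(profile, desirable, undesirable))
```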
Representative molecular scaffolds of the invention include, but are not limited to compounds of Formula I:
A-L1-B-L2-C Formula I
wherein:
The following compound obtained from Chembridge (San Diego, Calif.), 5-(4-chloro-3-methyl-phenoxymethyl)-3-thiophen-2-ylmethyl-[1,2,4]oxadiazole, is an example of a possible molecular scaffold compound for development of ligands that bind to SF-1 or LRH-1:
H. Crystallography
After binding compounds have been determined, the orientation of compound bound to target is determined. Preferably this determination involves crystallography on co-crystals of molecular scaffold compounds with target. Most protein crystallographic platforms can preferably be designed to analyze up to about 500 co-complexes of compounds, ligands, or molecular scaffolds bound to protein targets due to the physical parameters of the instruments and convenience of operation.
If the number of scaffolds that have binding activity exceeds a number convenient for the application of crystallography methods, the scaffolds can be placed into groups based on having at least one common chemical structure or other desirable characteristics, and representative compounds can be selected from one or more of the classes. Classes can be made with increasingly exacting criteria until a desired number of classes (e.g., 10, 20, 50, 100, 200, 300, 400, 500) is obtained. The classes can be based on chemical structure similarities between molecular scaffolds in the class, e.g., all possess a pyrrole ring, benzene ring, or other chemical feature. Likewise, classes can be based on shape characteristics, e.g., space-filling characteristics.
The co-crystallography analysis can be performed by co-complexing each scaffold with its target, e.g., at concentrations of the scaffold that showed activity in the screening assay. This co-complexing can, for example, be accomplished with the use of low percentage organic solvents with the target molecule and then concentrating the target with each of the scaffolds. In preferred embodiments these solvents are less than 5% organic solvent such as dimethyl sulfoxide (DMSO), ethanol, methanol, or ethylene glycol in water or another aqueous solvent.
Each scaffold complexed to the target molecule can then be screened with a suitable number of crystallization screening conditions at appropriate temperatures, e.g., both 4° C. and 20° C. In preferred embodiments, about 96 crystallization screening conditions can be performed in order to obtain sufficient information about the co-complexation and crystallization conditions, and the orientation of the scaffold at the binding site of the target molecule. Crystal structures can then be analyzed to determine how the bound scaffold is oriented physically within the binding site or within one or more binding pockets of the molecular family member.
It is desirable to determine the atomic coordinates of the compounds bound to the target proteins in order to determine which is a most suitable scaffold for the protein family. X-ray crystallographic analysis is therefore most preferable for determining the atomic coordinates. Those compounds selected can be further tested with the application of medicinal chemistry. Compounds can be selected for medicinal chemistry testing based on their binding position in the target molecule. For example, when the compound binds at a binding site, the compound's binding position in the binding site of the target molecule can be considered with respect to the chemistry that can be performed on chemically tractable structures or sub-structures of the compound, and how such modifications on the compound are expected to interact with structures or sub-structures on the binding site of the target. Thus, one can explore the binding site of the target and the chemistry of the scaffold in order to make decisions on how to modify the scaffold to arrive at a ligand with higher potency and/or selectivity.
The structure of the target molecule bound to the compound may also be superimposed or aligned with other structures of members of the same protein family. In this way modifications of the scaffold can be made to enhance the binding to members of the target family in general, thus enhancing the utility of the scaffold library. Different useful alignments may be generated, using a variety of criteria such as minimal RMSD superposition of alpha-carbons or backbone atoms of homologous or structurally related regions of the proteins.
These processes allow for more direct design of ligands, by utilizing structural and chemical information obtained directly from the co-complex, thereby enabling one to more efficiently and quickly design lead compounds that are likely to lead to beneficial drug products. In various embodiments it may be desirable to perform co-crystallography on all scaffolds that bind, or only those that bind with a particular affinity, for example, only those that bind with high affinity, moderate affinity, low affinity, very low affinity, or extremely low affinity. It may also be advantageous to perform co-crystallography on a selection of scaffolds that bind with any combination of affinities.
Standard X-ray protein diffraction studies such as by using a Rigaku RU-200® (Rigaku, Tokyo, Japan) with an X-ray imaging plate detector or a synchrotron beam-line can be performed on co-crystals and the diffraction data measured on a standard X-ray detector, such as a CCD detector or an X-ray imaging plate detector.
Performing X-ray crystallography on about 200 co-crystals should generally lead to about 50 co-crystal structures, which should provide about 10 scaffolds for validation in chemistry, which should finally result in about 5 selective leads for target molecules.
Additives that promote co-crystallization can of course be included in the target molecule formulation in order to enhance the formation of co-crystals. In the case of proteins or enzymes, the scaffold to be tested can be added to the protein formulation, which is preferably present at a concentration of approximately 1 mg/ml. The formulation can also contain between 0%-10% (v/v) organic solvent, e.g. DMSO, methanol, ethanol, propane diol, or 1,3 dimethyl propane diol (MPD) or some combination of those organic solvents. Compounds are preferably solubilized in the organic solvent at a concentration of about 100 mM and added to the protein sample at a concentration of about 1-10 mM. The protein-compound complex is then concentrated to a final concentration of protein of from about 5 to about 20 mg/ml. The complexation and concentration steps can conveniently be performed using a 96 well formatted concentration apparatus (e.g., Amicon Inc., Piscataway, N.J.). Buffers and other reagents present in the formulation being crystallized can contain other components that promote crystallization or are compatible with crystallization conditions, such as DTT, propane diol, glycerol.
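The concentrations recited above are mutually consistent, as a short calculation shows: diluting a 100 mM compound stock to a final 1-10 mM introduces 1%-10% (v/v) of the stock solvent, and concentrating protein from about 1 mg/ml to about 5-20 mg/ml corresponds to a 5- to 20-fold volume reduction. The sketch below assumes the compound stock is the only source of organic solvent and that the protein is fully retained during concentration; the function names are hypothetical.

```python
def solvent_percent_v_v(final_compound_mM, stock_mM=100.0):
    """Percent (v/v) of stock solvent introduced by diluting the stock."""
    return 100.0 * final_compound_mM / stock_mM


def fold_concentration(start_mg_per_ml, target_mg_per_ml):
    """Fold volume reduction needed to reach the target protein concentration."""
    return target_mg_per_ml / start_mg_per_ml


for final_mM in (1.0, 10.0):
    print(f"{final_mM} mM compound -> {solvent_percent_v_v(final_mM)}% solvent")

for target in (5.0, 20.0):
    print(f"1 mg/ml -> {target} mg/ml: {fold_concentration(1.0, target)}-fold")
```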
The crystallization experiment can be set up by placing small aliquots of the concentrated protein-compound complex (e.g., 1 μl) in a 96 well format and sampling under 96 crystallization conditions. (Other formats can also be used, for example, plates with fewer or more wells.) Crystals can typically be obtained using standard crystallization protocols that can involve the 96 well crystallization plate being placed at different temperatures. Varying factors other than temperature can also be considered in the co-crystallization of each protein-compound complex if desirable. For example, atmospheric pressure, the presence or absence of light or oxygen, a change in gravity, and many other variables can all be tested. The person of ordinary skill in the art will realize other variables that can advantageously be varied and considered. Conveniently, commercially available crystal screening plates with specified conditions in individual wells can be utilized.
I. Virtual Assays
As described above, virtual assays or compound design techniques are useful for identification and design of modulators; such techniques are also applicable to a molecular scaffold method. Commercially available software that generates three-dimensional graphical representations of the complexed target and compound from a set of coordinates provided (e.g., InsightII®, Accelrys, San Diego, Calif.; or Sybyl®, Tripos Associates, St. Louis, Mo.) can be used to illustrate and study how a compound is oriented when bound to a target. Thus, the existence of binding pockets at the binding site of the targets can be particularly useful in the present invention. These binding pockets are revealed by the crystallographic structure determination and show the precise chemical interactions involved in binding the compound to the binding site of the target. The person of ordinary skill will realize that the illustrations can also be used to decide where chemical groups might be added, substituted, modified, or deleted from the scaffold to enhance binding or another desirable effect, by considering where unoccupied space is located in the complex and which chemical substructures might have suitable size and/or charge characteristics to fill it. The person of ordinary skill will also realize that regions within the binding site can be flexible and their properties can change as a result of scaffold binding, and that chemical groups can be specifically targeted to those regions to achieve a desired effect. Specific locations on the molecular scaffold can be considered with reference to where a suitable chemical substructure can be attached and in which conformation, and which site has the most advantageous chemistry available.
An understanding of the forces that bind the compounds to the target proteins reveals which compounds can most advantageously be used as scaffolds, and which properties can most effectively be manipulated in the design of ligands. The person of ordinary skill will realize that steric, ionic, polar, hydrogen bond, and other forces can be considered for their contribution to the maintenance or enhancement of the target-compound complex. Additional data can be obtained with automated computational methods, such as docking and/or molecular dynamics simulations, which can afford a measure of the energy of binding. In addition, to account for other effects such as entropies of binding and desolvation penalties, methods which provide a measure of these effects can be integrated into the automated computational approach. The compounds selected can be used to generate information about the chemical interactions with the target or for elucidating chemical modifications that can enhance selectivity of binding of the compound.
An exemplary calculation of binding energies for protein-ligand complexes can be obtained using the FlexX score (an implementation of the Bohm scoring function) within the Tripos software suite (Tripos Associates, St. Louis, Mo.). The form of that equation is shown below:
ΔGbind = ΔGtr + ΔGhb + ΔGion + ΔGlipo + ΔGarom + ΔGrot
where: ΔGtr is a constant term that accounts for the overall loss of rotational and translational entropy of the ligand, ΔGhb accounts for hydrogen bonds formed between the ligand and protein, ΔGion accounts for the ionic interactions between the ligand and protein, ΔGlipo accounts for the lipophilic interaction that corresponds to the protein-ligand contact surface, ΔGarom accounts for interactions between aromatic rings in the protein and ligand, and ΔGrot accounts for the entropic penalty of restricting rotatable bonds in the ligand upon binding. The calculated binding energy for compounds that bind strongly to a given target will likely be lower than −25 kcal/mol, while the calculated binding energy for a good scaffold or an unoptimized compound will generally be in the range of −15 to −20 kcal/mol. The penalty for restricting a linker such as ethylene glycol or hexatriene is estimated as typically being in the range of +5 to +15 kcal/mol.
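By way of illustration only, the additive form of the score can be sketched in a few lines of Python. The sketch below simply sums six term values supplied by the user and compares the total with the ranges stated above; it does not reproduce how FlexX derives each term, and the class name, function names, and numerical term values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ScoreTerms:
    """Contributions to the additive score, all in kcal/mol (values are user-supplied)."""
    tr: float    # constant term: loss of rotational and translational entropy
    hb: float    # hydrogen bonds between ligand and protein
    ion: float   # ionic interactions
    lipo: float  # lipophilic contact surface
    arom: float  # aromatic ring interactions
    rot: float   # entropic penalty for restricted rotatable bonds


def total_binding_energy(t: ScoreTerms) -> float:
    """Sum the six terms, mirroring the additive equation in the text."""
    return t.tr + t.hb + t.ion + t.lipo + t.arom + t.rot


def classify(dg_bind: float) -> str:
    """Compare a calculated energy with the ranges quoted in the text."""
    if dg_bind < -25.0:
        return "strong binder"
    if -20.0 <= dg_bind <= -15.0:
        return "scaffold-like"
    return "unclassified"


# Hypothetical term values chosen only to exercise the functions.
example = ScoreTerms(tr=5.4, hb=-9.4, ion=-8.3, lipo=-6.8, arom=-2.1, rot=2.8)
print(classify(total_binding_energy(example)))
```

Energies falling between the two quoted ranges are reported as unclassified because the text does not assign them to either group.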
This method estimates the free energy of binding that a lead compound should have to a target protein for which there is a crystal structure, and it accounts for the entropic penalty of flexible linkers. It can therefore be used to estimate the penalty incurred by attaching linkers to molecules being screened and the binding energy that a lead compound must attain in order to overcome the penalty of the linker. The method does not account for solvation, and the entropic penalty is likely overestimated when the linkers are bound to the solid phase through an additional binding complex, e.g., a biotin:streptavidin complex.
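The linker arithmetic described above can be illustrated with a minimal sketch. It assumes, as a simplification, that the linker penalty adds directly to the calculated binding energy of the unlinked compound; the function names are hypothetical and the default penalty bounds are the +5 to +15 kcal/mol range given above.

```python
def net_binding_energy(unlinked_energy: float, linker_penalty: float) -> float:
    """Calculated energy of the linked compound (kcal/mol), assuming simple additivity."""
    return unlinked_energy + linker_penalty


def required_unlinked_energy(target_energy: float, linker_penalty: float) -> float:
    """Energy the unlinked compound must attain so the linked compound reaches target_energy."""
    return target_energy - linker_penalty


def required_range(target_energy: float,
                   penalty_low: float = 5.0,
                   penalty_high: float = 15.0) -> tuple:
    """Most and least demanding requirements across the stated penalty range."""
    return (required_unlinked_energy(target_energy, penalty_high),
            required_unlinked_energy(target_energy, penalty_low))


# A linked compound that should still score at -15 kcal/mol.
print(required_range(-15.0))
```

Under these assumptions, a compound intended to retain a scaffold-like score after linker attachment must score in the range associated with strongly binding compounds before attachment.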
Another exemplary method for calculating binding energies is the MM-PBSA technique (Massova & Kollman, J. Amer. Chem. Soc., 1999, 121:8133-43; Chong et al., Proc. of the Natl. Acad. of Sci. USA, 1999, 96:14330-5; Donini & Kollman, J. Med. Chem. 2000, 43:4180-8). This method uses a Molecular Dynamics approach to generate many sample configurations of the compound and complexed target molecule, then calculates an interaction energy using the well-known AMBER force field (Cornell, et al., J. Amer. Chem. Soc., 1995, 117:5179-97) with corrections for desolvation and entropy of binding from the ensemble.
Use of this method yields binding energies highly correlated with those found experimentally. The absolute binding energies calculated with this method are reasonably accurate, and the variation of binding energies is approximately linear with a slope of 1±0.5. Thus, the binding energies of compounds interacting strongly with a given target will be lower than about −8 kcal/mol, while a binding energy of a good scaffold or unoptimized compound will be in the range of −3 to −7 kcal/mol.
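The ensemble-averaging step of such a calculation can be sketched as follows. This is a simplified illustration and not the published MM-PBSA protocol: it assumes that per-snapshot interaction energies and desolvation corrections have already been computed by other software, and that a single entropy term (T multiplied by the entropy change on binding) is supplied for the ensemble. The function name and the numerical values are hypothetical.

```python
from statistics import mean


def ensemble_binding_energy(frames, t_delta_s):
    """Estimate a binding free energy (kcal/mol) from sampled configurations.

    frames    : list of (interaction_energy, desolvation_correction) pairs,
                one pair per sampled configuration, each already expressed
                as complex minus separated target and compound.
    t_delta_s : temperature times the entropy change on binding (kcal/mol);
                typically negative because binding restricts motion.
    """
    enthalpic = mean(e_int + g_solv for e_int, g_solv in frames)
    return enthalpic - t_delta_s


# Hypothetical values for three sampled configurations.
frames = [(-45.0, 30.0), (-47.0, 31.0), (-43.0, 29.0)]
dg = ensemble_binding_energy(frames, t_delta_s=-9.0)
print(dg)
```

With these illustrative inputs the estimate falls within the −3 to −7 kcal/mol range described above for a good scaffold or unoptimized compound.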
Computer models, such as homology models (i.e., based on a known, experimentally derived structure) can be constructed using data from the co-crystal structures. A computer program such as Modeller (Accelrys, San Diego Calif.) may be used to assign the three dimensional coordinates to a protein sequence using an alignment of sequences and a set or sets of template coordinates. When the target molecule is a protein or enzyme, preferred co-crystal structures for making homology models contain high sequence identity in the binding site of the protein sequence being modeled, and the proteins will preferentially also be within the same class and/or fold family. Knowledge of conserved residues in active sites of a protein class can be used to select homology models that accurately represent the binding site. Homology models can also be used to map structural information from a surrogate protein where an apo or co-crystal structure exists to the target protein.
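As one illustration of how sequence identity in the binding site might be assessed when selecting a template, the following sketch computes the fraction of identical residues over a chosen set of alignment columns. It assumes a pairwise alignment is already available as two equal-length strings with "-" for gaps; the sequences, the column indices, and the function name are hypothetical.

```python
def binding_site_identity(target_aln: str, template_aln: str, site_columns) -> float:
    """Fraction of binding-site alignment columns with identical residues.

    Columns containing a gap in either sequence count as non-identical.
    """
    if len(target_aln) != len(template_aln):
        raise ValueError("aligned sequences must have equal length")
    columns = list(site_columns)
    if not columns:
        raise ValueError("no binding-site columns given")
    identical = sum(
        1
        for c in columns
        if target_aln[c] != "-" and target_aln[c] == template_aln[c]
    )
    return identical / len(columns)


# Hypothetical aligned fragments and binding-site columns (0-based).
target = "MKT-LVAG"
template = "MRTELVSG"
site = [0, 1, 2, 4, 5, 6, 7]
print(binding_site_identity(target, template, site))
```

Candidate templates could be ranked by this value, with the class and fold family of each template considered separately.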
Virtual screening methods, such as docking, can also be used to predict the binding configuration and affinity of scaffolds, compounds, and/or combinatorial library members to homology models. Using these data and carrying out “virtual experiments” with computer software can save substantial resources and allow the person of ordinary skill to make decisions about which compounds can be suitable scaffolds or ligands, without having to actually synthesize the ligand and perform co-crystallization. Decisions thus can be made about which compounds merit actual synthesis and co-crystallization. An understanding of such chemical interactions aids in the discovery and design of drugs that interact more advantageously with target proteins and/or are more selective for one protein family member over others. Thus, applying these principles, compounds with superior properties can be discovered.
Another commonly-used virtual screening method is pharmacophore-based search. Crystal structures of a target protein allow the identification of pharmacophore features in the three-dimensional space using programs such as Catalyst (Accelrys, San Diego Calif.) or MOE (CCG, Montreal, Canada). Programs such as Catalyst and MOE can be used to search a large collection of existing compounds or virtual compounds that satisfy all or a subset of the defined pharmacophore features. Use of these data allows the person of ordinary skill to make decisions about which compounds may have activity for the target. These compounds and the binding hypothesis generated by using pharmacophore-based methods can then be used as a starting point to design compounds with better properties.
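A pharmacophore-based search can be illustrated, in greatly simplified form, by comparing inter-feature distances. The sketch below is not the algorithm used by Catalyst or MOE; it assumes each compound conformer has already been reduced to a list of typed feature points, and it accepts a conformer if some assignment of its features reproduces every pairwise distance of the query within a tolerance. The feature types, coordinates, tolerance, and function name are hypothetical.

```python
from itertools import permutations
from math import dist


def matches_pharmacophore(query, candidate, tol=1.0):
    """Return True if candidate features can be mapped onto all query features.

    query, candidate : lists of (feature_type, (x, y, z)) tuples, in angstroms.
    A mapping is accepted when feature types agree and every pairwise distance
    in the candidate is within tol of the corresponding query distance.
    """
    n = len(query)
    for combo in permutations(range(len(candidate)), n):
        if any(candidate[c][0] != query[q][0] for q, c in enumerate(combo)):
            continue
        if all(
            abs(dist(query[i][1], query[j][1])
                - dist(candidate[combo[i]][1], candidate[combo[j]][1])) <= tol
            for i in range(n)
            for j in range(i + 1, n)
        ):
            return True
    return False


# Hypothetical three-point query and two hypothetical compound conformers.
query = [("donor", (0.0, 0.0, 0.0)),
         ("acceptor", (3.0, 0.0, 0.0)),
         ("hydrophobe", (0.0, 4.0, 0.0))]
hit = [("hydrophobe", (1.0, 5.0, 1.0)), ("donor", (1.0, 1.0, 1.0)),
       ("acceptor", (4.0, 1.0, 1.0)), ("donor", (9.0, 9.0, 9.0))]
miss = [("donor", (0.0, 0.0, 0.0)), ("acceptor", (8.0, 0.0, 0.0)),
        ("hydrophobe", (0.0, 4.0, 0.0))]
print(matches_pharmacophore(query, hit), matches_pharmacophore(query, miss))
```

Searching against a subset of the defined features corresponds to passing a shortened query list. Because only distances are compared, this sketch does not distinguish a feature arrangement from its mirror image, and the exhaustive search over assignments is practical only for small feature sets.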
J. Ligand Design and Preparation
The design and preparation of ligands can be performed with or without structural and/or co-crystallization data by considering the chemical structures in common between the active scaffolds of a set. In this process structure-activity hypotheses can be formed and those chemical structures found to be present in a substantial number of the scaffolds, including those that bind with low affinity, can be presumed to have some effect on the binding of the scaffold. This binding can be presumed to induce a desired biochemical effect when it occurs in a biological system (e.g., a treated mammal). New or modified scaffolds or combinatorial libraries derived from scaffolds can be tested to disprove the maximum number of binding and/or structure-activity hypotheses. The remaining hypotheses can then be used to design ligands that achieve a desired binding and biochemical effect.
In many cases, however, it will be preferred to have co-crystallography data for consideration of how to modify the scaffold to achieve the desired binding effect (e.g., binding at higher affinity or with higher selectivity). Using the case of proteins and enzymes, co-crystallography data show the binding pocket of the protein with the molecular scaffold bound to the binding site, and it will be apparent that a modification can be made to a chemically tractable group on the scaffold. For example, a small volume of space at a protein binding site or pocket might be filled by modifying the scaffold to include a small chemical group that fills the volume. Filling the void volume can be expected to result in a greater binding affinity, or the loss of undesirable binding to another member of the protein family. Similarly, the co-crystallography data may show that deletion of a chemical group on the scaffold may decrease a hindrance to binding and result in greater binding affinity or specificity.
Various software packages have implemented techniques which facilitate the identification and characterization of interactions of potential binding sites from a complex structure, or from an apo structure of a target molecule, i.e., one without a compound bound (e.g., SiteID, Tripos Associates, St. Louis, Mo.; SiteFinder, Chemical Computing Group, Montreal, Canada; and GRID, Molecular Discovery Ltd., London, UK). Such techniques can be used with the coordinates of a complex between the scaffold of interest and a target molecule, or these data in conjunction with data for a suitably aligned or superimposed related target molecule, in order to evaluate changes to the scaffold that would enhance binding to the desired target molecule structure or structures. Molecular Interaction Field-computing techniques, such as those implemented in the program GRID, result in energy data for particular positive and negative binding interactions of different computational chemical probes being mapped to the vertices of a matrix in the coordinate space of the target molecule. These data can then be analyzed for areas of substitution around the scaffold binding site which are predicted to have a favorable interaction for a particular target molecule. Compatible chemical substitution on the scaffold, e.g., a methyl, ethyl, or phenyl group in a favorable interaction region computed from a hydrophobic probe, would be expected to result in an improvement in affinity of the scaffold. Conversely, a scaffold could be made more selective for a particular target molecule by making such a substitution in a region predicted to have an unfavorable hydrophobic interaction in a second, related undesirable target molecule.
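The mapping of probe interaction energies to grid vertices can be illustrated with a toy calculation. The sketch below is not the GRID force field: it assumes a single generic probe interacting with every target atom through the same 12-6 Lennard-Jones potential, with hypothetical well depth and optimal distance, and it ignores electrostatics and hydrogen bonding. All function names are hypothetical.

```python
from math import dist


def lj_energy(r, epsilon=0.2, r_min=4.0):
    """12-6 Lennard-Jones energy (kcal/mol) at distance r (angstroms).

    The minimum, -epsilon, occurs at r == r_min; both parameters are hypothetical.
    """
    ratio = r_min / r
    return epsilon * (ratio ** 12 - 2.0 * ratio ** 6)


def interaction_field(atoms, origin, spacing, shape):
    """Map the summed probe energy to each vertex of a regular grid."""
    field = {}
    for i in range(shape[0]):
        for j in range(shape[1]):
            for k in range(shape[2]):
                point = (origin[0] + i * spacing,
                         origin[1] + j * spacing,
                         origin[2] + k * spacing)
                # Clamp very short distances to avoid division by zero.
                field[(i, j, k)] = sum(
                    lj_energy(max(dist(point, atom), 0.5)) for atom in atoms
                )
    return field


def favorable_points(field, cutoff=-0.1):
    """Grid indices whose energy is at or below the cutoff (favorable for the probe)."""
    return sorted(idx for idx, energy in field.items() if energy <= cutoff)


# One hypothetical target atom at the origin; two grid vertices along x.
field = interaction_field([(0.0, 0.0, 0.0)], origin=(2.0, 0.0, 0.0),
                          spacing=2.0, shape=(2, 1, 1))
print(favorable_points(field))
```

Vertices returned by favorable_points correspond, in this simplified picture, to regions where a substituent resembling the probe might be placed on the scaffold. Computing the same field for a second, related target and comparing the two would indicate regions favorable in one target and unfavorable in the other.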
It can be desirable to take advantage of the presence of a charged chemical group located at the binding site or pocket of the protein. For example, a positively charged group can be complemented with a negatively charged group introduced on the molecular scaffold. This can be expected to increase binding affinity or binding specificity, thereby resulting in a more desirable ligand. In many cases, regions of protein binding sites or pockets are known to vary from one family member to another based on the amino acid differences in those regions. Chemical additions in such regions can result in the creation or elimination of certain interactions (e.g., hydrophobic, electrostatic, or entropic) that allow a compound to be more specific for one protein target over another or to bind with greater affinity, thereby enabling one to synthesize a compound with greater selectivity or affinity for a particular family member. Additionally, certain regions can contain amino acids that are known to be more flexible than others. This often occurs in amino acids contained in loops connecting elements of the secondary structure of the protein, such as alpha helices or beta strands. Additions of chemical moieties can also be directed to these flexible regions in order to increase the likelihood of a specific interaction occurring between the protein target of interest and the compound. Virtual screening methods can also be conducted in silico to assess the effect of chemical additions, subtractions, modifications, and/or substitutions on compounds with respect to members of a protein family or class.
The addition, subtraction, or modification of a chemical structure or sub-structure to a scaffold can be performed with any suitable chemical moiety. For example, the following moieties, which are provided by way of example and are not intended to be limiting, can be utilized: hydrogen, alkyl, alkoxy, phenoxy, alkenyl, alkynyl, phenylalkyl, hydroxyalkyl, haloalkyl, aryl, arylalkyl, alkyloxy, alkylthio, alkenylthio, phenyl, phenylalkylthio, hydroxyalkyl-thio, alkylthiocarbamylthio, cyclohexyl, pyridyl, piperidinyl, alkylamino, amino, nitro, mercapto, cyano, hydroxyl, a halogen atom, halomethyl, an oxygen atom (e.g., forming a ketone or N-oxide), or a sulfur atom (e.g., forming a thiol, thione, di-alkylsulfoxide or sulfone).
Additional examples of structures or sub-structures that may be utilized are an aryl optionally substituted with one, two, or three substituents independently selected from the group consisting of alkyl, alkoxy, halogen, trihalomethyl, carboxylate, nitro, and ester moieties; an amine of formula —NX2X3, where X2 and X3 are independently selected from the group consisting of hydrogen, saturated or unsaturated alkyl, and homocyclic or heterocyclic ring moieties; halogen or trihalomethyl; a ketone of formula —COX4, where X4 is selected from the group consisting of alkyl and homocyclic or heterocyclic ring moieties; a carboxylic acid of formula —(X5)nCOOH or ester of formula —(X6)nCOOX7, where X5, X6, and X7 are independently selected from the group consisting of alkyl and homocyclic or heterocyclic ring moieties and where n is 0 or 1; an alcohol of formula —(X8)nOH or an alkoxy moiety of formula —(X8)nOX9, where X8 and X9 are independently selected from the group consisting of saturated or unsaturated alkyl and homocyclic or heterocyclic ring moieties, wherein said ring is optionally substituted with one or more substituents independently selected from the group consisting of alkyl, alkoxy, halogen, trihalomethyl, carboxylate, nitro, and ester and where n is 0 or 1; an amide of formula —NHCOX10, where X10 is selected from the group consisting of alkyl, hydroxyl, and homocyclic or heterocyclic ring moieties, wherein said ring is optionally substituted with one or more substituents independently selected from the group consisting of alkyl, alkoxy, halogen, trihalomethyl, carboxylate, nitro, and ester; —SO2NX11X12, where X11 and X12 are selected from the group consisting of hydrogen, alkyl, and homocyclic or heterocyclic ring moieties; a homocyclic or heterocyclic ring moiety optionally substituted with one, two, or three substituents independently selected from the group consisting of alkyl, alkoxy, halogen, trihalomethyl, carboxylate, nitro, and ester moieties; an aldehyde of formula —COH; a sulfone of formula —SO2X13, where X13 is selected from the group consisting of saturated or unsaturated alkyl and homocyclic or heterocyclic ring moieties; and a nitro of formula —NO2.
K. Identification of Binding Characteristics of Binding Compounds
It can also be beneficial in selecting compounds for testing to first identify binding characteristics that a ligand should advantageously possess. This can be accomplished by analyzing the interactions that a plurality of different binding compounds have with a particular target, e.g., interactions with one or more conserved residues in the binding site. These interactions are identified by considering the nature of the interacting moieties. In this way, atoms or groups that can participate in hydrogen bonding, polar interactions, charge-charge interactions, and the like are identified based on known structural and electronic factors.
L. Identification of Energetically Allowed Sites for Attachment
In addition to the identification and development of ligands, determination of the orientation of a molecular scaffold or other binding compound in a binding site allows identification of energetically allowed sites for attachment of the binding molecule to another component. For such sites, any free energy change associated with the presence of the attached component should not destabilize the binding of the compound to the target to an extent that will disrupt the binding. Preferably, the binding energy with the attachment should be at least 4 kcal/mol, more preferably at least 6, 8, 10, 12, 15, or 20 kcal/mol. Preferably, the presence of the attachment at the particular site reduces binding energy by no more than 3, 4, 5, 8, 10, 12, or 15 kcal/mol.
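The two preferences stated above can be expressed as a simple check. The sketch below assumes binding energies are given as positive magnitudes in kcal/mol, consistent with the "at least 4 kcal/mol" phrasing; the function name and default thresholds are illustrative choices taken from the first values listed above.

```python
def is_allowed_attachment_site(energy_without: float,
                               energy_with: float,
                               min_binding: float = 4.0,
                               max_loss: float = 3.0) -> bool:
    """Check an attachment site against the two stated preferences.

    energy_without : binding energy magnitude of the compound alone (kcal/mol)
    energy_with    : binding energy magnitude with the component attached
    min_binding    : minimum acceptable binding energy with the attachment
    max_loss       : maximum acceptable reduction caused by the attachment
    """
    loss = energy_without - energy_with
    return energy_with >= min_binding and loss <= max_loss


# Hypothetical energies for three candidate sites on the same compound.
print(is_allowed_attachment_site(10.0, 8.0))   # small loss, still binds
print(is_allowed_attachment_site(10.0, 3.0))   # binding too weak after attachment
print(is_allowed_attachment_site(6.0, 4.5))    # acceptable on both criteria
```

More stringent embodiments correspond to raising min_binding or lowering max_loss to other values from the lists above.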
In many cases, suitable attachment sites will be those that are exposed to solvent when the binding compound is bound in the binding site. In some cases, attachment sites can be used that will result in small displacements of a portion of the enzyme without an excessive energetic cost. Exposed sites can be identified in various ways. For example, exposed sites can be identified using a graphic display or 3-dimensional model. In a graphic display, such as a computer display, an image of a compound bound in a binding site can be visually inspected to reveal atoms or groups on the compound that are exposed to solvent and oriented such that attachment at such atom or group would not preclude binding of the enzyme and binding compound. Energetic costs of attachment can be calculated based on changes or distortions that would be caused by the attachment as well as entropic changes.
Many different types of components can be attached. Persons with skill are familiar with the chemistries used for various attachments. Examples of components that can be attached include, without limitation: solid phase components such as beads, plates, chips, and wells; a direct or indirect label; a linker, which may be a traceless linker; among others. Such linkers can themselves be attached to other components, e.g., to solid phase media, labels, and/or binding moieties.
The binding energy of a compound and the effects on binding energy for attaching the molecule to another component can be calculated approximately by manual calculation, or by using any of a variety of available computational virtual assay techniques, such as docking or molecular dynamics simulations. A virtual library of compounds derived from the attachment of components to a particular scaffold can be assembled using a variety of software programs (such as Afferent, MDL Information Systems, San Leandro, Calif. or CombiLibMaker, Tripos Associates, St. Louis, Mo.). This virtual library can be assigned appropriate three dimensional coordinates using software programs (such as Concord, Tripos Associates, St. Louis, Mo. or Omega, Openeye Scientific Software, Santa Fe, N.Mex.). These structures may then be submitted to the appropriate computational technique for evaluation of binding energy to a particular target molecule. This information can be used for purposes of prioritizing compounds for synthesis, for selecting a subset of chemically tractable compounds for synthesis, and for providing data to correlate with the experimentally determined binding energies for the synthesized compounds.
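The assembly and prioritization of such a virtual library can be sketched generically. The example below does not use Afferent, CombiLibMaker, Concord, or Omega; it only enumerates every combination of attachment site and component for one scaffold and ranks the results with a scoring function supplied by the caller, which stands in for whatever binding energy calculation is chosen. All names and scores are hypothetical.

```python
from itertools import product


def enumerate_library(scaffold, sites, components):
    """Every (scaffold, attachment site, component) combination."""
    return [(scaffold, site, comp) for site, comp in product(sites, components)]


def prioritize(library, score, top_n):
    """Return the top_n members with the lowest (most favorable) score."""
    return sorted(library, key=score)[:top_n]


# Hypothetical scaffold, attachment sites, and components.
library = enumerate_library("scaffold-A", ["R1", "R2"],
                            ["methyl", "phenyl", "PEG-linker"])

# Hypothetical precomputed energies (kcal/mol) standing in for a real calculation.
toy_scores = {
    ("scaffold-A", "R1", "methyl"): -16.0,
    ("scaffold-A", "R1", "phenyl"): -19.5,
    ("scaffold-A", "R1", "PEG-linker"): -9.0,
    ("scaffold-A", "R2", "methyl"): -15.0,
    ("scaffold-A", "R2", "phenyl"): -12.0,
    ("scaffold-A", "R2", "PEG-linker"): -14.0,
}
for member in prioritize(library, toy_scores.get, top_n=2):
    print(member)
```

The ranked list can then be filtered for chemical tractability before compounds are selected for synthesis.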
The crystallographic determination of the orientation of the scaffold in the binding site specifically enables more productive methods of assessing the likelihood of the attachment of a particular component resulting in an improvement in binding energy. Such an example is shown for a docking-based strategy in Haque et al. (J. Med. Chem. 1999, 42:1428-40), wherein potent and selective inhibitors were rapidly created using an “Anchor and Grow” technique that relied on a crystallographically determined fragment of a larger molecule. The use of a crystallographically characterized small molecule fragment in guiding the selection of productive compounds for synthesis has also been demonstrated in Boehm et al., J. Med. Chem. 2000, 43:2664-74. An illustration of the use of crystallographic data and molecular dynamics simulations in the prospective assessment of inhibitor binding energies can be found in Pearlman and Charifson, J. Med. Chem. 2001, 44:3417-23. Another important class of techniques which rely on a well-defined structural starting point for computational design is the combinatorial growth algorithm-based systems, such as the GrowMol program (Bohacek & McMartin, J. Amer. Chem. Soc., 1994, 116:5560-71). These techniques have been used to enable the rapid computational evolution of virtual inhibitors with improved computed binding energies, and directly led to more potent synthesized compounds whose binding mode was validated crystallographically (see Organic Letters, 2001, 3:2309-2312).
1. Linkers
Linkers suitable for use in the invention can be of many different types. Linkers can be selected for particular applications based on factors such as linker chemistry compatible for attachment to a binding compound and to another component utilized in the particular application. Additional factors can include, without limitation, linker length, linker stability, and ability to remove the linker at an appropriate time. Exemplary linkers include, but are not limited to, hexenyl, hexatrienyl, ethylene glycol, and peptide linkers. Traceless linkers can also be used, e.g., as described in Plunkett & Ellman, J. Org. Chem., 1995, 60:6006.
Typical functional groups that are utilized to link binding compound(s) include, but are not limited to, carboxylic acid, amine, hydroxyl, and thiol. (Examples can be found in Solid-supported combinatorial and parallel synthesis of small molecular weight compound libraries; Tetrahedron Organic Chemistry Series 1998, Vol. 17:85; Pergamon.)
2. Labels
As indicated above, labels can also be attached to a binding compound or to a linker attached to a binding compound. Such attachment may be direct (attached directly to the binding compound) or indirect (attached to a component that is directly or indirectly attached to the binding compound). Such labels allow detection of the compound either directly or indirectly. Attachment of labels can be performed using conventional chemistries. Labels can include, for example, fluorescent labels, radiolabels, light scattering particles, light absorbent particles, magnetic particles, enzymes, and specific binding agents (e.g., biotin or an antibody target moiety).
3. Solid Phase Media
Additional examples of components that can be attached directly or indirectly to a binding compound include various solid phase media. Similar to attachment of linkers and labels, attachment to solid phase media can be performed using conventional chemistries. Such solid phase media can include, for example, small components such as beads, nanoparticles, and fibers (e.g., in suspension or in a gel or chromatographic matrix). Likewise, solid phase media can include larger objects such as plates, chips, slides, and tubes. In many cases, the binding compound will be attached in only a portion of such an object, e.g., in a spot or other local element on a generally flat surface or in a well or portion of a well.
IV. Organic Synthetic Techniques
The versatility of computer-based modulator design and identification lies in the diversity of structures screened by the computer programs. The computer programs can search databases that contain very large numbers of molecules and can modify modulators already complexed with the enzyme with a wide variety of chemical functional groups. A consequence of this chemical diversity is that a potential modulator of a biomolecular function may take a chemical form that is not predictable. A wide array of organic synthetic techniques exists in the art to meet the challenge of constructing these potential modulators. Many of these organic synthetic methods are described in detail in standard reference sources utilized by those skilled in the art. One example of such a reference is March, 1994, Advanced Organic Chemistry; Reactions, Mechanisms and Structure, New York, McGraw Hill. Thus, the techniques useful to synthesize a potential modulator of biomolecular function identified by computer-based methods are readily available to those skilled in the art of organic chemical synthesis.
V. Isomers, Prodrugs, and Active Metabolites
The present invention concerns compounds that can be described with generic formulas and specific compounds. In addition, such compounds may exist in a number of different forms or derivatives, all within the scope of the present invention. These include, for example, tautomers, stereoisomers, racemic mixtures, regioisomers, salts, prodrugs (e.g., carboxylic acid esters), solvated forms, different crystal forms or polymorphs, and active metabolites.
A. Tautomers, Stereoisomers, Regioisomers, and Solvated Forms
It is understood that certain compounds may exhibit tautomerism. In such cases, the formula drawings within this specification expressly depict only one of the possible tautomeric forms. It is therefore to be understood that within the invention the formulas are intended to represent any tautomeric form of the depicted compounds and are not to be limited merely to the specific tautomeric form depicted by the formula drawings.
Likewise, some of the compounds according to the present invention may exist as stereoisomers, i.e., they have the same sequence of covalently bonded atoms and differ in the spatial orientation of the atoms. For example, the compounds may be optical stereoisomers, which contain one or more chiral centers, and therefore, may exist in two or more stereoisomeric forms (e.g., enantiomers or diastereomers). Thus, such compounds may be present as single stereoisomers (i.e., essentially free of other stereoisomers), racemates, and/or mixtures of enantiomers and/or diastereomers. As another example, stereoisomers include geometric isomers, such as cis- or trans-orientation of substituents on adjacent carbons of a double bond. All such single stereoisomers, racemates and mixtures thereof are intended to be within the scope of the present invention. Unless specified to the contrary, all such stereoisomeric forms are included within the formulas provided herein.
In certain embodiments, a chiral compound of the present invention is in a form that contains at least 80% of a single isomer (60% enantiomeric excess (“e.e.”) or diastereomeric excess (“d.e.”)), or at least 85% (70% e.e. or d.e.), 90% (80% e.e. or d.e.), 95% (90% e.e. or d.e.), 97.5% (95% e.e. or d.e.), or 99% (98% e.e. or d.e.). As generally understood by those skilled in the art, an optically pure compound having one chiral center is one that consists essentially of one of the two possible enantiomers (i.e., is enantiomerically pure), and an optically pure compound having more than one chiral center is one that is both diastereomerically pure and enantiomerically pure. In certain embodiments, the compound is present in optically pure form.
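The percentages and excess values paired above follow from the usual two-component relation, in which the excess equals the percentage of the major isomer minus the percentage of the minor isomer. The short sketch below checks each stated pair under that assumption; the function name is hypothetical.

```python
def excess_from_major_percent(major_percent: float) -> float:
    """Enantiomeric or diastereomeric excess (%) for a two-isomer mixture.

    excess = major - minor = major - (100 - major) = 2 * major - 100
    """
    return 2.0 * major_percent - 100.0


# (percent of single isomer, stated e.e. or d.e.) pairs taken from the text.
stated_pairs = [(80.0, 60.0), (85.0, 70.0), (90.0, 80.0),
                (95.0, 90.0), (97.5, 95.0), (99.0, 98.0)]
for major, stated_excess in stated_pairs:
    print(major, excess_from_major_percent(major), stated_excess)
```

Each computed value agrees with the excess stated for the corresponding purity level.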
For compounds in which synthesis involves addition of a single group at a double bond, particularly a carbon-carbon double bond, the addition may occur at either of the double bond-linked atoms. For such compounds, the present invention includes both such regioisomers.
Additionally, the formulas are intended to cover solvated as well as unsolvated forms of the identified structures. For example, the indicated structures include both hydrated and non-hydrated forms. Other examples of solvates include the structures in combination with isopropanol, ethanol, methanol, DMSO, ethyl acetate, acetic acid, or ethanolamine.
B. Prodrugs and Metabolites
For compounds useful in the present invention, the invention also includes prodrugs (generally pharmaceutically acceptable prodrugs), active metabolic derivatives (active metabolites), and their pharmaceutically acceptable salts.
In this context, prodrugs are compounds or pharmaceutically acceptable salts thereof which, when metabolized under physiological conditions or when converted by solvolysis, yield the desired active compound. Typically, the prodrug is inactive, or less active than the active compound, but may provide advantageous handling, administration, or metabolic properties. For example, some prodrugs are esters of the active compound; during metabolism, the ester group is cleaved to yield the active drug. Also, some prodrugs are activated enzymatically to yield the active compound, or a compound which, upon further chemical reaction, yields the active compound. A common example is an alkyl ester of a carboxylic acid.
As described in The Practice of Medicinal Chemistry, Ch. 31-32 (Ed. Wermuth, Academic Press, San Diego, Calif., 2001), prodrugs can be conceptually divided into two non-exclusive categories, bioprecursor prodrugs and carrier prodrugs. Generally, bioprecursor prodrugs are compounds that are inactive or have low activity compared to the corresponding active drug compound, that contain one or more protective groups, and that are converted to an active form by metabolism or solvolysis. Both the active drug form and any released metabolic products should have acceptably low toxicity. Typically, the formation of active drug compound involves a metabolic process or reaction that is one of the following types:
Oxidative reactions: Oxidative reactions are exemplified, without limitation, by reactions such as oxidation of alcohol, carbonyl, and acid functions, hydroxylation of aliphatic carbons, hydroxylation of alicyclic carbon atoms, oxidation of aromatic carbon atoms, oxidation of carbon-carbon double bonds, oxidation of nitrogen-containing functional groups, oxidation of silicon, phosphorus, arsenic, and sulfur, oxidative N-dealkylation, oxidative O- and S-dealkylation, oxidative deamination, as well as other oxidative reactions.
Reductive reactions: Reductive reactions are exemplified, without limitation, by reactions such as reduction of carbonyl groups, reduction of alcoholic groups and carbon-carbon double bonds, reduction of nitrogen-containing functional groups, and other reduction reactions.
Reactions without change in the state of oxidation: Reactions without change in the state of oxidation are exemplified, without limitation, by reactions such as hydrolysis of esters and ethers, hydrolytic cleavage of carbon-nitrogen single bonds, hydrolytic cleavage of non-aromatic heterocycles, hydration and dehydration at multiple bonds, new atomic linkages resulting from dehydration reactions, hydrolytic dehalogenation, removal of hydrogen halide molecule, and other such reactions.
Carrier prodrugs are drug compounds that contain a transport moiety, e.g., that improves uptake and/or localized delivery to a site(s) of action. Desirably for such a carrier prodrug, the linkage between the drug moiety and the transport moiety is a covalent bond, the prodrug is inactive or less active than the drug compound, and the prodrug and any released transport moiety are acceptably non-toxic. For prodrugs where the transport moiety is intended to enhance uptake, typically the release of the transport moiety should be rapid. In other cases, it is desirable to utilize a moiety that provides slow release, e.g., certain polymers or other moieties, such as cyclodextrins. (See, e.g., Cheng et al., U.S. Pat. Pub. No. 2004/0077595, U.S. Ser. No. 10/656,838, incorporated herein by reference.) Such carrier prodrugs are often advantageous for orally administered drugs. Carrier prodrugs can, for example, be used to improve one or more of the following properties: increased lipophilicity, increased duration of pharmacological effects, increased site-specificity, decreased toxicity and adverse reactions, and/or improvement in drug formulation (e.g., stability, water solubility, suppression of an undesirable organoleptic or physicochemical property). For example, lipophilicity can be increased by esterification of hydroxyl groups with lipophilic carboxylic acids, or of carboxylic acid groups with alcohols, e.g., aliphatic alcohols. See Wermuth, The Practice of Medicinal Chemistry, Ch. 31-32, Ed. Wermuth, Academic Press, San Diego, Calif., 2001.
Prodrugs may proceed from prodrug form to active form in a single step or may have one or more intermediate forms which may themselves have activity or may be inactive.
Metabolites, e.g., active metabolites, overlap with prodrugs as described above, e.g., bioprecursor prodrugs. Thus, such metabolites are pharmacologically active compounds, or compounds that further metabolize to pharmacologically active compounds, that are derivatives resulting from metabolic processes in the body of a subject or patient. Of these, active metabolites are such pharmacologically active derivative compounds. For prodrugs, the prodrug compound is generally inactive or of lower activity than the metabolic product. For active metabolites, the parent compound may be either an active compound or an inactive prodrug.
Prodrugs and active metabolites may be identified using routine techniques known in the art. See, e.g., Bertolini et al., J. Med Chem., 1997, 40:2011-2016; Shan et al., J. Pharm Sci 86:756-757; Bagshawe, Drug Dev Res., 1995, 34:220-230; Wermuth, (supra).
C. Pharmaceutically Acceptable Salts
Compounds can be formulated as or be in the form of pharmaceutically acceptable salts. Pharmaceutically acceptable salts are non-toxic salts in the amounts and concentrations at which they are administered. The preparation of such salts can facilitate the pharmacological use by altering the physical characteristics of a compound without preventing it from exerting its physiological effect. Useful alterations in physical properties include lowering the melting point to facilitate transmucosal administration and increasing the solubility to facilitate administering higher concentrations of the drug.
Pharmaceutically acceptable salts include acid addition salts such as those containing sulfate, chloride, hydrochloride, fumarate, maleate, phosphate, sulfamate, acetate, citrate, lactate, tartrate, methanesulfonate, ethanesulfonate, benzenesulfonate, p-toluenesulfonate, cyclohexylsulfamate and quinate. Pharmaceutically acceptable salts can be obtained from acids such as hydrochloric acid, maleic acid, sulfuric acid, phosphoric acid, sulfamic acid, acetic acid, citric acid, lactic acid, tartaric acid, malonic acid, methanesulfonic acid, ethanesulfonic acid, benzenesulfonic acid, p-toluenesulfonic acid, cyclohexylsulfamic acid, fumaric acid, and quinic acid.
Pharmaceutically acceptable salts also include basic addition salts such as those containing benzathine, chloroprocaine, choline, diethanolamine, ethylenediamine, meglumine, procaine, aluminum, calcium, lithium, magnesium, potassium, sodium, ammonium, alkylamine, and zinc, when acidic functional groups, such as carboxylic acid or phenol are present. For example, see Remington's Pharmaceutical Sciences, 19th ed., Mack Publishing Co., Easton, Pa., Vol. 2, p. 1457, 1995. Such salts can be prepared using the appropriate corresponding bases.
Pharmaceutically acceptable salts can be prepared by standard techniques. For example, the free-base form of a compound can be dissolved in a suitable solvent, such as an aqueous or aqueous-alcohol solution containing the appropriate acid, and the salt then isolated by evaporating the solution. In another example, a salt can be prepared by reacting the free base and acid in an organic solvent.
Thus, for example, if the particular compound is a base, the desired pharmaceutically acceptable salt may be prepared by any suitable method available in the art, for example, treatment of the free base with an inorganic acid, such as hydrochloric acid, hydrobromic acid, sulfuric acid, nitric acid, phosphoric acid and the like, or with an organic acid, such as acetic acid, maleic acid, succinic acid, mandelic acid, fumaric acid, malonic acid, pyruvic acid, oxalic acid, glycolic acid, salicylic acid, a pyranosidyl acid, such as glucuronic acid or galacturonic acid, an alpha-hydroxy acid, such as citric acid or tartaric acid, an amino acid, such as aspartic acid or glutamic acid, an aromatic acid, such as benzoic acid or cinnamic acid, a sulfonic acid, such as p-toluenesulfonic acid or ethanesulfonic acid, or the like.
Similarly, if the particular compound is an acid, the desired pharmaceutically acceptable salt may be prepared by any suitable method, for example, treatment of the free acid with an inorganic or organic base, such as an amine (primary, secondary or tertiary), an alkali metal hydroxide or alkaline earth metal hydroxide, or the like. Illustrative examples of suitable salts include organic salts derived from amino acids, such as glycine and arginine, ammonia, primary, secondary, and tertiary amines, and cyclic amines, such as piperidine, morpholine and piperazine, and inorganic salts derived from sodium, calcium, potassium, magnesium, manganese, iron, copper, zinc, aluminum and lithium.
The pharmaceutically acceptable salt of the different compounds may be present as a complex. Examples of complexes include 8-chlorotheophylline complex (analogous to, e.g., dimenhydrinate: diphenhydramine 8-chlorotheophylline (1:1) complex; Dramamine) and various cyclodextrin inclusion complexes.
Unless specified to the contrary, specification of a compound herein includes pharmaceutically acceptable salts of such compound.
D. Polymorphic forms
In the case of agents that are solids, it is understood by those skilled in the art that the compounds and salts may exist in different crystal or polymorphic forms, all of which are intended to be within the scope of the present invention and specified formulas.
VI. Administration
The methods and compounds will typically be used in therapy for human patients. However, they may also be used to treat similar or identical diseases in other vertebrates, e.g., mammals such as other primates, sports animals, bovines, equines, porcines, ovines, and pets such as dogs and cats.
Suitable dosage forms, in part, depend upon the use or the route of administration, for example, oral, transdermal, transmucosal, or by injection (parenteral). Such dosage forms should allow the compound to reach target cells. Other factors are well known in the art, and include considerations such as toxicity and dosage forms that retard the compound or composition from exerting its effects. Techniques and formulations generally may be found in Remington: The Science and Practice of Pharmacy, 21st edition, Lippincott, Williams and Wilkins, Philadelphia, Pa., 2005 (hereby incorporated by reference herein).
Carriers or excipients can be used to produce pharmaceutical compositions. The carriers or excipients can be chosen to facilitate administration of the compound. Examples of carriers include calcium carbonate, calcium phosphate, various sugars such as lactose, glucose, or sucrose, or types of starch, cellulose derivatives, gelatin, vegetable oils, polyethylene glycols and physiologically compatible solvents. Examples of physiologically compatible solvents include sterile solutions of water for injection (WFI), saline solution, and dextrose.
The compounds can be administered by different routes including intravenous, intraperitoneal, subcutaneous, intramuscular, oral, transmucosal, rectal, or transdermal. Oral administration is preferred. For oral administration, for example, the compounds can be formulated into conventional oral dosage forms such as capsules, tablets, and liquid preparations such as syrups, elixirs, and concentrated drops.
Pharmaceutical preparations for oral use can be obtained, for example, by combining the active compounds with solid excipients, optionally grinding a resulting mixture, and processing the mixture of granules, after adding suitable auxiliaries, if desired, to obtain tablets or dragee cores. Suitable excipients are, in particular, fillers such as sugars, including lactose, sucrose, mannitol, or sorbitol; cellulose preparations, for example, maize starch, wheat starch, rice starch, potato starch, gelatin, gum tragacanth, methyl cellulose, hydroxypropylmethyl-cellulose, sodium carboxymethylcellulose (CMC), and/or polyvinylpyrrolidone (PVP: povidone). If desired, disintegrating agents may be added, such as the cross-linked polyvinylpyrrolidone, agar, or alginic acid, or a salt thereof such as sodium alginate.
Dragee cores are provided with suitable coatings. For this purpose, concentrated sugar solutions may be used, which may optionally contain, for example, gum arabic, talc, polyvinylpyrrolidone, carbopol gel, polyethylene glycol (PEG), and/or titanium dioxide, lacquer solutions, and suitable organic solvents or solvent mixtures. Dye-stuffs or pigments may be added to the tablets or dragee coatings for identification or to characterize different combinations of active compound doses.
Pharmaceutical preparations that can be used orally include push-fit capsules made of gelatin (“gelcaps”), as well as soft, sealed capsules made of gelatin, and a plasticizer, such as glycerol or sorbitol. The push-fit capsules can contain the active ingredients in admixture with filler such as lactose, binders such as starches, and/or lubricants such as talc or magnesium stearate and, optionally, stabilizers. In soft capsules, the active compounds may be dissolved or suspended in suitable liquids, such as fatty oils, liquid paraffin, or liquid polyethylene glycols (PEGs). In addition, stabilizers may be added.
Alternatively, injection (parenteral administration) may be used, e.g., intramuscular, intravenous, intraperitoneal, and/or subcutaneous. For injection, the compounds of the invention are formulated in sterile liquid solutions, preferably in physiologically compatible buffers or solutions, such as saline solution, Hank's solution, or Ringer's solution. In addition, the compounds may be formulated in solid form and redissolved or suspended immediately prior to use. Lyophilized forms can also be produced.
Administration can also be by transmucosal or transdermal means. For transmucosal or transdermal administration, penetrants appropriate to the barrier to be permeated are used in the formulation. Such penetrants are generally known in the art, and include, for example, for transmucosal administration, bile salts and fusidic acid derivatives. In addition, detergents may be used to facilitate permeation. Transmucosal administration, for example, may be through nasal sprays or suppositories (rectal or vaginal).
The amounts of the various compounds to be administered can be determined by standard procedures taking into account factors such as the compound IC50, the biological half-life of the compound, the age, size, and weight of the patient, and the disorder associated with the patient. The importance of these and other factors is well known to those of ordinary skill in the art. Generally, a dose will be between about 0.01 and 50 mg/kg, preferably 0.1 and 20 mg/kg, of the patient being treated. Multiple doses may be used.
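As a rough illustration of the weight-based range stated above, the following Python sketch converts a per-kilogram range into an absolute dose range for a given body weight. The function name and the 70 kg example weight are assumptions made for illustration only; they are not part of the specification, and actual dosing depends on the other factors listed above.

```python
def dose_range_mg(weight_kg, low_mg_per_kg=0.01, high_mg_per_kg=50.0):
    """Return (low, high) absolute dose in mg for a patient of the given weight.

    Defaults are the general range stated in the text (about 0.01 to 50 mg/kg).
    """
    if weight_kg <= 0:
        raise ValueError("weight_kg must be positive")
    if low_mg_per_kg > high_mg_per_kg:
        raise ValueError("low_mg_per_kg must not exceed high_mg_per_kg")
    return weight_kg * low_mg_per_kg, weight_kg * high_mg_per_kg


# Hypothetical 70 kg patient: general range, then the preferred 0.1 to 20 mg/kg range.
general = dose_range_mg(70.0)
preferred = dose_range_mg(70.0, low_mg_per_kg=0.1, high_mg_per_kg=20.0)
print(f"general:   {general[0]:.2f} to {general[1]:.0f} mg")
print(f"preferred: {preferred[0]:.1f} to {preferred[1]:.0f} mg")
```

The sketch only scales the stated range by body weight; it does not model half-life, IC50, or multiple-dose schedules.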
A number of examples involved in the present invention are described below. In most cases, alternative techniques could also be used. The examples are intended to be illustrative and are not limiting or restrictive to the scope of the invention.
Human SF-1 and LRH-1 constructs were obtained by PCR amplification of cDNA (BD Biosciences). For E. coli expression the SF-1 G219-T461 insert was cloned into a modified pET vector (Novagen) encoding an N-terminal hexa-HIS tag, cleavable using TEV protease. The SF-1 LBD primer containing a BamHI cloning site and a TEV protease recognition site before residue G219 was:
The non-coding strand primer, adding a stop codon and a SalI cloning site, was
An analogous strategy was used for expression of the LRH-1 S251-A495 (see below, SEQ ID NO:______) using the coding-strand primer,
and the non-coding strand primer,
From structure-based alignment with the mouse LRH-1 structure (1PK5) it was obvious that human SF-1 would have surface-exposed Cys residues at positions 247 and 412. For crystallography of SF-1 these Cys were removed by mutagenesis of the SF-1 DNA using Quick-change protocols (Stratagene) with complementary primers (see below, SEQ ID NO______). The coding-strand primers used were:
For analysis in mammalian cell culture, transient transfection vectors encoding the LBDs of SF-1 and LRH-1 were cloned as fusion proteins with the GAL4 DBD into a modified SG5-GAL4 vector. The SF-1 G219-T461 LBD primer containing an NdeI cloning site before residue G219 was:
The LRH-1 S251-A495 LBD primer containing an NdeI site before S251 was
Coding-strand primers for mutations of SF-1 and LRH-1 to test ligand binding and coactivator binding using Quick-change protocols were:
E. coli expression vectors for GST fusion proteins with SRC-1 (residues M595-Q780, containing NR-boxes I, II and III) were made as described (Marimuthu et al., Mol. Endocrinol., 2002, 16:271-86) except a modified pGEX-2T vector (Amersham) was engineered to encode a C-terminal fusion peptide,
with a biotinylation site (Kim & McHenry, J. Biol. Chem., 1996, 271:20690-20698.) The insert encoding a NR-binding site from the coactivator TReP (Gizard et al., J. Biol. Chem., 2002, 277, 39144-39155), M173-P192, encoding residues
was engineered by gene synthesis, and cloned into the N-terminal GST/C-terminal biotinylation site vector. All constructs were sequenced (DavisSequencing, Inc.).
SF-1 G219-T461 with Cys 247 and 412 Removed:
Nucleic acid(SEQ ID NO: ______)
Encoded protein (SEQ ID NO: ______)
LRH-1 S251-A495 with Cys 247 and 412 Removed:
Nucleic acid(SEQ ID NO: ______)
Encoded protein (SEQ ID NO: ______)
The SF-1 LBD (G219-T461 with C247S/C412S mutations) and the LRH-1 LBD (S251-A495) used for crystallography were produced as TEV-cleavable N-terminally HIS-tagged proteins in E. coli strain BL21(DE3) RIL (Stratagene). Single colonies were grown for 4 hrs at 37° C. in 2 separate 200 mL Luria broth (LB) media containing kanamycin (30 μg/mL) and chloramphenicol (15 μg/mL). The 400 mL of culture was transferred to a 45 L Bioreactor containing 30 L Terrific Broth (TB) media also supplemented with kanamycin and chloramphenicol. Cultures were allowed to grow at 37° C. until reaching an OD600 of 2.0-2.5, and were then grown at 20° C., with 0.5 mM IPTG added, for a further 15 hrs. Cells were harvested using a continuous flow centrifuge and the paste was frozen at −80° C.
Cell pastes with SF-1 or LRH-1 were resuspended with 40 mL lysis buffer (50 mM Na/K Phosphate [pH 8.0], 250 mM NaCl, 5% glycerol) per liter of cells, and lysed using a microfluidizer (Microfluidics M-110H) at 18,000 psi. Lysate was clarified by centrifugation at 15,000 g at 4° C. for 2 hrs. Imidazole was added to the clarified lysate to a final concentration of 15 mM, and the lysate was then loaded onto a 50 mL Ni-Chelating Sepharose (AP Biotech) column. The column was washed with 500 mL of buffer A (20 mM HEPES [pH 8.0], 250 mM NaCl, 5% glycerol) containing 15 mM imidazole, and eluted with a 100 mL gradient to 100% buffer B (20 mM HEPES [pH 8.0], 250 mM imidazole, 250 mM NaCl, 5% glycerol). Eluted LBDs were diluted six-fold with buffer C (20 mM Tris [pH 8.0]) and loaded onto a 75 mL Source 30Q (AP Biotech) column. The column was washed with 100 mL buffer C containing 20 mM NaCl and eluted with a fifteen column volume linear gradient from 2 to 25% buffer D (20 mM Tris [pH 8.0], 1 M NaCl). The LBD proteins, which eluted between 50 mM and 150 mM NaCl, were analyzed using native and SDS-PAGE, and tested for coactivator-binding activity. Pooled fractions were incubated with TEV protease at 50 μg/mg overnight at 4° C. for removal of the N-terminal tag. The sequence removed is:
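The construct names used here (e.g., G219-T461, C247S) follow full-length residue numbering, so a position inside an expressed fragment is offset by the number of the first residue. The Python sketch below illustrates that bookkeeping. The helper names are hypothetical, and the sequence is a toy placeholder rather than the actual SF-1 sequence, which is not reproduced in this passage.

```python
import re


def construct_length(first_residue, last_residue):
    """Number of residues in a construct spanning first..last (inclusive)."""
    return last_residue - first_residue + 1


def apply_point_mutation(sequence, first_residue, mutation):
    """Apply a mutation such as 'C247S' to a construct sequence.

    `sequence` holds one-letter codes; `first_residue` is the full-length
    residue number of sequence[0] (219 for the SF-1 construct).
    """
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
    if match is None:
        raise ValueError(f"cannot parse mutation {mutation!r}")
    wild_type, number, replacement = match.group(1), int(match.group(2)), match.group(3)
    index = number - first_residue  # 0-based position within the construct
    if not 0 <= index < len(sequence):
        raise ValueError(f"residue {number} lies outside the construct")
    if sequence[index] != wild_type:
        raise ValueError(f"expected {wild_type} at {number}, found {sequence[index]}")
    return sequence[:index] + replacement + sequence[index + 1:]


print(construct_length(219, 461))  # SF-1 G219-T461
print(construct_length(251, 495))  # LRH-1 S251-A495

# Toy placeholder (NOT the real SF-1 sequence): alanine everywhere, with Cys
# placed at the two positions that the text describes as mutated.
residues = ["A"] * construct_length(219, 461)
residues[247 - 219] = "C"
residues[412 - 219] = "C"
toy = "".join(residues)
for mutation in ("C247S", "C412S"):
    toy = apply_point_mutation(toy, 219, mutation)
```

Checking the wild-type residue before substituting is a cheap guard against an off-by-one error in the numbering offset.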
The cleaved protein was re-purified using a Source30Q column, and eluted with an eight column volume gradient from 2 to 25% buffer D. At this stage, the proteins were >95% pure as determined by SDS-PAGE analysis. Prior to concentration, beta-mercaptoethanol was added to 14 mM final concentration, and the proteins concentrated to 20 mg/mL and stored at −80° C.
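The anion-exchange gradients above are given as percent buffer D, while the elution window is given in mM NaCl. A minimal Python sketch of the conversion follows, assuming ideal linear mixing of buffer C (no NaCl, as listed) and buffer D (1 M NaCl). The function names are illustrative and not part of the described protocol.

```python
def nacl_mM(percent_d, nacl_c_mM=0.0, nacl_d_mM=1000.0):
    """NaCl concentration (mM) of a buffer C / buffer D mixture at a given % D."""
    fraction_d = percent_d / 100.0
    return (1.0 - fraction_d) * nacl_c_mM + fraction_d * nacl_d_mM


def percent_d_for(nacl_target_mM, nacl_c_mM=0.0, nacl_d_mM=1000.0):
    """Percent buffer D needed to reach a target NaCl concentration (mM)."""
    return 100.0 * (nacl_target_mM - nacl_c_mM) / (nacl_d_mM - nacl_c_mM)


def gradient_slope_mM_per_cv(start_percent, end_percent, column_volumes):
    """Change in NaCl concentration per column volume for a linear gradient."""
    return (nacl_mM(end_percent) - nacl_mM(start_percent)) / column_volumes


# The 2 to 25% buffer D gradient expressed in mM NaCl.
print(nacl_mM(2), nacl_mM(25))
# Percent buffer D corresponding to the 50 to 150 mM NaCl elution window.
print(percent_d_for(50), percent_d_for(150))
# Steepness of the fifteen and eight column volume gradients.
print(gradient_slope_mM_per_cv(2, 25, 15), gradient_slope_mM_per_cv(2, 25, 8))
```

Under these assumptions the 2 to 25% gradient spans 20 to 250 mM NaCl, so the reported elution window falls within the first half of the gradient.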
Coactivator N-terminal GST/C-terminal biotinylation site fusion proteins were produced in E. coli strain BL21(DE3) RIL (Stratagene). Shaker cultures (750 ml 2× LB) were grown at 37° C. until an OD600 of 1.2. Then, 0.5 mM IPTG was added and cultures were cooled to 15° C. with continued shaking overnight. Cells were harvested by centrifugation, frozen in liquid N2 and stored at −80° C. Cell pastes (5 gm) were suspended in 50 mL extraction buffer (50 mM Tris pH 8.0, 250 mM NaCl, 0.1% Triton X-100). Lysozyme (0.5 mL of 20 mg/mL, Sigma) was added and the suspension was left on ice for 15-30 min., followed by sonication (1.5 min on ice) using a flat-tip probe at setting 6 of a Model 550 Sonic Dismembrator (Fisher). The prep was checked for loss of DNA viscosity, then centrifuged at 17,000 rpm for 30 min. at 4° C. in a SA-600 rotor (Beckman). Supernatant was recovered and mixed with 0.5 mL buffer-washed slurry of Glutathione-Sepharose beads (Amersham) continuously for 1 hr at 4° C. Beads were centrifuged at low speed and washed once with 20 mL extraction buffer, and twice with 50 mM Tris pH 8.0. GST protein was recovered by elution with 3-5 ml elution buffer (50 mM Tris pH 8.0, 6.5 mg/ml glutathione (Sigma)).
For co-expression studies, the ampicillin-resistant GST-coactivator fusion plasmids were co-introduced with the kanamycin-resistant HIS-tagged LRH-1 or SF-1 plasmids. Growth and extraction were the same as for GST-tagged coactivators, above. To the centrifuged prep from 750 mL culture was added imidazole to a final 10 mM, and 1.0 mL buffer-washed slurry of Talon cobalt affinity resin (BD Biosciences), stirring continuously for 1 hr at 4° C. Beads were centrifuged at low speed and washed once with 20 mL extraction buffer containing 10 mM imidazole, and twice with cobalt wash buffer (20 mM Tris pH 8.0, 100 mM NaCl, 10% glycerol) also with 10 mM imidazole. HIS-tagged protein was recovered by elution with 3-5 ml cobalt wash buffer with 200 mM imidazole.
For liposome washing of HIS-tagged SF-1 protein, 20 mg was extracted from a 750 mL culture, bound to cobalt affinity resin, and washed as above. While remaining bound to the resin, two sequential 30 minute, 5 mL washes in cobalt wash buffer containing sonicated 100 μM 1,2-didodecanoyl-sn-glycero-3-phosphocholine (Sigma) were applied, followed by two final washes in cobalt wash buffer. The HIS-tagged protein was recovered in 3 mL cobalt wash buffer with 200 mM imidazole.
Initial crystallization of human SF-1 and LRH-1 was observed in sparse-matrix screens using Hampton Index screen kits (Hampton Research). Human SF-1 protein was diluted to 15 mg/ml in 20 mM Tris-HCl, pH 8.0, 100 mM NaCl, 10 mM DTT with a 2× molar excess of the peptides NCOA1 (SRC-1) NID-2
and/or NCOA2 (TIF2, GRIP1) NID-3
Crystals were grown by sitting drop vapor diffusion at 4° C., mixing equal volumes of protein/peptide sample with reservoir solution containing 18% polyethylene glycol (PEG) 3350, 0.2M ammonium sulfate, 0.1M BisTris pH 5.5, and 2.5% sucrose. Crystals grew to a size of 0.6 mm×0.3 mm×0.3 mm in 5-8 days. For cryo-protection sucrose was added to SF-1 crystals prior to freezing.
Human LRH-1 protein was diluted to 10 mg/ml in 20 mM Tris-HCl, pH 7.5, 62 mM NaCl, 100 mM ammonium acetate, 2 mM CHAPS with a 2× molar excess of the peptide NCOA2 NID-3
Crystals were grown by sitting drop vapor diffusion at 20° C., mixing equal volumes of protein/peptide sample with reservoir solution containing 0.9M NaH2PO4, 0.1 M K2HPO4 (Hampton Index screen #17). Crystals grew to a size of 0.13 mm×0.03 mm×0.03 mm in 2 weeks. Glycerol was used for cryo-protection.
The X-ray diffraction data of both human SF-1 and human LRH-1 were collected at the Advanced Light Source (ALS) beam line 8.3.1 using a Quantum 210 CCD detector. Data collection was performed under cryogenic temperature. The diffraction data were integrated and scaled using programs Mosflm and SCALA (Table 1). (Leslie, Acta Crystallogr. D Biol Crystallogr., 1999, 55 (Pt 10):1696-1702.)
To solve the SF-1 structure, a homology model was generated based on the crystal structure of mouse LRH-1 (1PK5). (Sablin et al., Mol. Cell, 2003, 11:1575-1585.) Molecular replacement of the data up to 3.5 Å was carried out using EPMR (Kissinger et al., Acta Crystallogr. D Biol Crystallogr., 1999, 55 (Pt 2):484-91) obtaining a solution in space group P3121. Two molecules related by non-crystallographic symmetry were determined in each asymmetric unit. The electron density map calculated with the initial phases revealed the majority of the structure. An initial model was obtained manually using program O. (Jones et al., Acta Crystallogr A, 1991, 47 (Pt 2):110-9.) The initial model was then subjected to refinement using program CNX (Brunger et al., Acta Crystallogr D Biol Crystallogr., 1998, 54 (Pt 5):905-21) with least-squares minimization on the maximum likelihood target functions, simulated annealing and torsion angle dynamics. Subsequent interactive model building and refinement were performed against 2.1 Å data with least-squares refinement, individual B-factor refinement, and TLS refinement using programs CNX and REFMAC5. (Brunger et al., Acta Crystallogr D Biol Crystallogr., 1998, 54 (Pt 5):905-21.) Well-defined electron density indicated one NCOA2 NID-3 peptide bound to the surface and the unexpected PE ligand bound inside the ligand pocket.
The human LRH-1 structure determination and refinement were similar to those for SF-1. A homology model was generated based on the crystal structure of mouse LRH-1 (1PK5). (Sablin et al., Mol. Cell, 2003, 11, 1575-85.) It was then used as the search model for molecular replacement using program EPMR. (Kissinger et al., Acta Crystallogr D Biol Crystallogr., 1999, 55 (Pt 2):484-91.) The crystal is in space group P212121 with one molecule in each asymmetric unit. The initial molecular replacement solution was then subjected to iterative refinement against data up to 2.5 Å. At a late stage of refinement, some electron density appeared in the ligand binding pocket representing a phospholipid molecule. The shape of the electron density suggested the structure of a phosphatidylglycerol-phosphoglycerol, confirmed by further refinement. NCOA2 NID-3 peptide was found to bind at two sites on the molecular surface.
The AlphaScreen Histidine detection (Nickel chelate) kit (Perkin Elmer) was used to detect binding between His-tagged SF-1 LBD and biotinylated GST-SRC-1 fragments. The assay was performed in Costar 384-well white polystyrene plates (Corning Inc.) in a total volume of 20 μL using buffer containing 50 mM Bis-tris HCl (pH 7.5), 50 mM KCl, 0.05% Tween 20, 1 mM DTT, 0.1% BSA. Reactions were initiated in 15 μL containing 50 nM His-tagged SF-1 receptor and 50 nM biotin-tagged SRC-1 fragment. Phospholipid was included as indicated. PE 18:3 (1,2-Dilinolenoyl-sn-glycero-3-phosphoethanolamine) was from Avanti Polar Lipids. The plate was sealed and incubated at room temperature for 2 hours. After incubation, 5 μL containing streptavidin donor beads (15 μg/ml) and Ni-chelate acceptor beads (15 μg/ml) was added from the Nickel chelate kit. Plates were resealed and incubated in the dark for 2 hours at room temperature and then read in a Fusion Alpha reader set at a read time of 1 s/well. Data analysis was done using GraphPad Prism (GraphPad Software, Inc.).
HEK293T cells were cultured at 37° C. in Dulbecco's modified Eagle's medium (DMEM) with penicillin (100 U/ml), streptomycin (100 U/ml) and 10% heat-inactivated fetal calf serum (Invitrogen). For transient transfection HEK293T cells were grown to 80% confluency in 96-well plates, and medium exchanged for 100 μl serum-free medium before addition of 100 ng pSG-GAL4-SF-1-LBD or pSG-GAL4-LRH-1-LBD expression vector, 40 ng pFR-Luc reporter gene (Stratagene), and 12 ng pRL-TK transfection control plasmids (Promega) mixed with 0.5 μl Metafectene (Biontex). After 4 hours serum-containing medium was added. After 24 hrs medium was removed and cells were lysed in Renilla luciferase assay lysis buffer (Promega). Firefly luciferase was measured using Luciferase Reporter Gene Assay kit (Roche) and Renilla luciferase was measured using Renilla Luciferase Assay System (Promega).
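The volumes in the binding assay above imply a dilution when the 5 μL bead mixture is added to the 15 μL reaction. The Python sketch below works through that arithmetic under the assumption that the stated 50 nM and 15 μg/ml values refer to the 15 μL and 5 μL additions, respectively, rather than to the final 20 μL; the text does not say which, so the results are illustrative only. The function names are hypothetical.

```python
def diluted_concentration(stock_conc, stock_volume, final_volume):
    """C2 = C1 * V1 / V2. Concentration units are preserved; volumes share a unit."""
    if stock_volume <= 0 or final_volume < stock_volume:
        raise ValueError("require 0 < stock_volume <= final_volume")
    return stock_conc * stock_volume / final_volume


def picomoles(conc_nM, volume_uL):
    """Amount in pmol: nM x uL gives fmol, so divide by 1000."""
    return conc_nM * volume_uL / 1000.0


# Assumed reading: 50 nM receptor in the 15 uL initiating volume.
receptor_final_nM = diluted_concentration(50.0, 15.0, 20.0)
receptor_pmol = picomoles(50.0, 15.0)
# Assumed reading: beads at 15 ug/ml in the 5 uL addition.
bead_final_ug_per_ml = diluted_concentration(15.0, 5.0, 20.0)

print(f"receptor: {receptor_pmol:.2f} pmol, {receptor_final_nM:.1f} nM after bead addition")
print(f"each bead type: {bead_final_ug_per_ml:.2f} ug/ml in the final 20 uL")
```

If the stated concentrations are instead final concentrations in 20 μL, the same helper can be inverted to find the stock concentrations required.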
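The text does not describe how the two luciferase readings were combined. A common approach with a co-transfected Renilla control such as pRL-TK is to divide the firefly signal by the Renilla signal in each well and then express results as fold activation over a control condition. The Python sketch below illustrates that approach as an assumption, with made-up readings and hypothetical function names.

```python
def normalized_activity(firefly, renilla):
    """Firefly signal divided by the Renilla transfection-control signal."""
    if renilla <= 0:
        raise ValueError("Renilla reading must be positive")
    return firefly / renilla


def mean_normalized(wells):
    """Mean firefly/Renilla ratio over replicate wells given as (firefly, renilla)."""
    if not wells:
        raise ValueError("at least one well is required")
    ratios = [normalized_activity(firefly, renilla) for firefly, renilla in wells]
    return sum(ratios) / len(ratios)


def fold_activation(sample_wells, control_wells):
    """Mean normalized activity of sample wells relative to control wells."""
    return mean_normalized(sample_wells) / mean_normalized(control_wells)


# Made-up relative light units for two replicate wells per condition.
control = [(1000.0, 500.0), (1200.0, 600.0)]
sample = [(6000.0, 500.0), (5400.0, 450.0)]
print(fold_activation(sample, control))
```

Normalizing within each well before averaging keeps well-to-well differences in transfection efficiency from inflating the apparent activation.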
All patents and other references cited in the specification are indicative of the level of skill of those skilled in the art to which the invention pertains, and are incorporated by reference in their entireties, including any tables and figures, to the same extent as if each reference had been incorporated by reference in its entirety individually.
One skilled in the art would readily appreciate that the present invention is well adapted to obtain the ends and advantages mentioned, as well as those inherent therein. The methods, variances, and compositions described herein as presently representative of preferred embodiments are exemplary and are not intended as limitations on the scope of the invention. Changes therein and other uses that will occur to those skilled in the art, and that are encompassed within the spirit of the invention, are defined by the scope of the claims.
It will be readily apparent to one skilled in the art that varying substitutions and modifications may be made to the invention disclosed herein without departing from the scope and spirit of the invention. For example, variations can be made in the method for identifying modulators and/or various methods of administration can be used. Thus, such additional embodiments are within the scope of the present invention and the following claims.
The invention illustratively described herein suitably may be practiced in the absence of any element or elements, limitation or limitations which is not specifically disclosed herein. Thus, for example, in each instance herein any of the terms “comprising”, “consisting essentially of” and “consisting of” may be replaced with either of the other two terms. The terms and expressions which have been employed are used as terms of description and not of limitation, and there is no intention in the use of such terms and expressions of excluding any equivalents of the features shown and described or portions thereof, but it is recognized that various modifications are possible within the scope of the invention claimed. Thus, it should be understood that although the present invention has been specifically disclosed by preferred embodiments and optional features, modification and variation of the concepts herein disclosed may be resorted to by those skilled in the art, and that such modifications and variations are considered to be within the scope of this invention as defined by the appended claims.
In addition, where features or aspects of the invention are described in terms of Markush groups or other grouping of alternatives, those skilled in the art will recognize that the invention is also thereby described in terms of any individual member or subgroup of members of the Markush group or other group.
Also, unless indicated to the contrary, where various numerical values are provided for embodiments, additional embodiments are described by taking any 2 different values as the endpoints of a range. Such ranges are also within the scope of the described invention.
Thus, additional embodiments are within the scope of the invention and within the following claims.
a Rsym = Σ|Iavg − Ij|/ΣIj.
b Rcryst = Σ|Fo − Fc|/ΣFo, where Fo and Fc are the observed and calculated structure factors, respectively. Rfree was calculated from a randomly chosen 5% of reflections excluded from the refinement, and Rcryst was calculated from the remaining 95% of reflections.
r.m.s.d. is the root-mean-square deviation from ideal geometry. Numbers in parentheses are for the highest resolution shell.
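The footnote definitions above can be sketched in Python as follows. This is a minimal illustration of the two formulas with made-up numbers; it is not the implementation used by the data-processing or refinement programs cited in the text, and the function and variable names are assumptions.

```python
def r_sym(observations):
    """Rsym = sum|Iavg - Ij| / sum Ij over all measurements of all reflections.

    `observations` maps a reflection index (h, k, l) to the list of
    intensities Ij measured for that reflection and its symmetry mates.
    """
    numerator = 0.0
    denominator = 0.0
    for intensities in observations.values():
        i_avg = sum(intensities) / len(intensities)
        numerator += sum(abs(i_avg - i_j) for i_j in intensities)
        denominator += sum(intensities)
    return numerator / denominator


def r_factor(f_obs, f_calc):
    """R = sum|Fo - Fc| / sum Fo over paired structure-factor amplitudes.

    Applied to the working reflections this gives Rcryst; applied to the
    reflections held out of refinement it gives Rfree.
    """
    if len(f_obs) != len(f_calc):
        raise ValueError("f_obs and f_calc must have the same length")
    return sum(abs(fo - fc) for fo, fc in zip(f_obs, f_calc)) / sum(f_obs)


# Made-up measurements for two reflections.
toy_observations = {(1, 0, 0): [100.0, 110.0, 90.0], (0, 1, 0): [50.0, 50.0]}
print(r_sym(toy_observations))
# Made-up observed and calculated amplitudes.
print(r_factor([10.0, 20.0, 30.0], [9.0, 22.0, 30.0]))
```

Using one function for both Rcryst and Rfree reflects that the two statistics share a formula and differ only in which reflections are included.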
This application claims the benefit of U.S. Provisional App. No. 60/634,827, filed Dec. 8, 2004, entitled SF-1 and LRH-1 Modulator Development, which is incorporated herein by reference in its entirety and for all purposes.
Number | Date | Country
---|---|---
60634827 | Dec 2004 | US