The present invention relates generally to the structure of the ligand binding domain of HNF4γ, and more particularly to the crystalline structure of the ligand binding domain of HNF4γ. The invention further relates to methods by which modulators and ligands of HNF4γ, and HNF4α, can be identified.
Nuclear receptors represent a superfamily of proteins that specifically bind a physiologically relevant small molecule, such as a hormone or vitamin. As a result of a molecule binding to a nuclear receptor, the nuclear receptor changes the ability of a cell to transcribe DNA, i.e. nuclear receptors modulate the transcription of DNA. However, they can also have transcription-independent actions.
Unlike integral membrane receptors and membrane-associated receptors, nuclear receptors reside in either the cytoplasm or nucleus of eukaryotic cells. Thus nuclear receptors comprise a class of intracellular, soluble ligand-regulated transcription factors. Nuclear receptors include but are not limited to receptors for glucocorticoids, androgens, mineralocorticoids, progestins, estrogens, thyroid hormones, vitamin D, retinoids, icosanoids and peroxisome proliferators. Many nuclear receptors, identified by either sequence homology to known receptors (See, Drewes et al., (1996) Mol. Cell. Biol. 16:925-31) or based on their affinity for specific DNA binding sites in gene promoters (See, Sladek et al., (1990) Genes Dev. 4:2353-65), have unascertained ligands and are therefore termed “orphan receptors”.
Hepatocyte Nuclear Factor 4 (HNF4) is an orphan nuclear receptor, and two isoforms, HNF4α and HNF4γ, have been identified to date. HNF4α was originally identified based on its ability to bind promoter regions in the plasma transthyretin (TTR) and apoCIII genes. Sladek et al., (1990) Genes Dev. 4:2353-65. HNF4γ was identified based on its homology to HNF4α. Drewes et al., (1996) Mol. Cell. Biol. 16:925-31. Nuclear receptors activate or repress transcription through partner proteins called co-activators or co-repressors, respectively. CREB-binding protein, or CBP, is a known co-activator for HNF4α (Wang et al., (1998) J. Biol. Chem. 273:30847-50; Dell & Hadzopoulou-Cladaras, (1999) J. Biol. Chem. 274:9013-21). Mutations in HNF4α have been linked to the metabolic disorder Maturity Onset Diabetes of the Young (MODY), type 1. Yamagata et al., (1996) Nature 384:458-60. HNF4α+/− subjects experience reduced serum levels of apolipoprotein AII, apolipoprotein CIII and lipoprotein(a), leading to reduced triglycerides. Shih et al., (2000) Diabetes 49:832-37. HNF4α regulation had previously been identified for these apolipoprotein genes (Mietus-Snyder et al., (1992) Mol. Cell. Biol. 12:1708-18; Chan et al., (1993) Nucleic Acids Res. 21:1205-11), as well as for other factors involved in glucose metabolism and insulin secretion. Diaz Guerra et al., (1993) Mol. Cell. Biol. 13:7725-33; Miquerol et al., (1994) J. Biol. Chem. 269:8944-51; Stoffel & Duncan, (1997) Proc. Natl. Acad. Sci. U.S.A. 94:13209-14; Wang et al., (1998) J. Biol. Chem. 273:30847-50.
Structurally, the HNF4 family of nuclear receptors, including HNF4α and HNF4γ, is generally characterized by two distinct structural elements. First, nuclear receptors comprise a central DNA binding domain, which targets the receptor to specific DNA sequences known as hormone response elements (HREs). The DNA binding domains of these receptors are related in structure and sequence, and are located within the middle of the receptor. Second, the C-terminal region of the HNF4 family of nuclear receptors encompasses the ligand binding domain (LBD). Upon binding a ligand, the receptor shifts to a transcriptionally active state.
Almost all nuclear hormone receptors bind DNA, and the physiologically active complex of many is as a heterodimer with the retinoid X receptor (RXR). The HNF4 isoforms are unusual in that they are obligate homodimers and cannot dimerize with any other nuclear receptors. In fact, retinoid X receptor (RXR) heterodimer formation is actually prevented by LBD interactions. Jiang & Sladek, (1997) J. Biol. Chem. 272:1218-25.
Polypeptides, including the ligand binding domain of HNF4γ, have a three-dimensional structure determined by the primary amino acid sequence and the environment surrounding the polypeptide. This three-dimensional structure establishes the polypeptide's activity, stability, binding affinity, binding specificity, and other biochemical attributes. Thus, knowledge of a protein's three-dimensional structure can provide much guidance in designing agents that mimic, inhibit, or improve its biological activity in soluble or membrane bound forms.
The three-dimensional structure of a polypeptide can be determined in a number of ways. Many of the most precise methods employ X-ray crystallography (See, e.g., Van Holde, (1971) Physical Biochemistry, Prentice-Hall, N.J., 221-39). This technique relies on the ability of crystalline lattices to diffract X-rays or other forms of radiation. Diffraction experiments suitable for determining the three-dimensional structure of macromolecules typically require high-quality crystals. Unfortunately, such crystals have been unavailable for the ligand binding domain of HNF4γ, as well as many other proteins of interest. Thus, high-quality diffracting crystals of the ligand binding domain of HNF4γ would greatly assist in the elucidation of HNF4γ's three-dimensional structure, and would provide insight into the ligand binding properties of HNF4γ.
Clearly, the solved crystal structure of the HNF4γ ligand binding domain would be useful in the design of modulators of activity mediated by all HNF4 isoforms. Evaluation of the available sequence data has made it clear that HNF4α shares significant sequence homology with HNF4γ. Further, HNF4γ shows structural homology with the three-dimensional fold of other proteins.
The solved HNF4γ-ligand crystal structure would provide structural details and insights necessary to design a modulator of HNF4γ that maximizes preferred requirements for any modulator, i.e. potency and specificity. By exploiting the structural details obtained from an HNF4γ-ligand crystal structure, it would be possible to design an HNF4 modulator that, despite HNF4γ's similarity with other proteins, exploits the unique structural features of HNF4γ. An HNF4 modulator developed using structure-assisted design would take advantage of heretofore unknown HNF4 structural considerations and thus be more effective than a modulator developed using homology-based design. Potential or existent homology models cannot provide the necessary degree of specificity. An HNF4γ modulator designed using the structural coordinates of a crystalline form of HNF4γ would also provide a starting point for the development of modulators of other HNF4s.
What is needed, therefore, is a crystallized form of an HNF4γ LBD polypeptide, preferably in complex with a ligand. Acquisition of crystals of the HNF4γ LBD polypeptide will permit the three dimensional structure of the HNF4γ LBD to be determined. Knowledge of this three dimensional structure will facilitate the design of modulators of HNF4γ activity. Such modulators can lead to therapeutic compounds to treat a wide range of conditions, including lipid homeostasis disorders and glucose homeostasis disorders.
A substantially pure HNF4γ ligand binding domain polypeptide in crystalline form is disclosed. Preferably, the crystalline form has lattice constants of a=152.71 Å, b=152.71 Å, c=93.42 Å, α=90°, β=90°, γ=90°. More preferably, the crystalline form is a tetragonal crystalline form. Even more preferably, the crystalline form has a space group of I4₁22. Still more preferably, the HNF4γ ligand binding domain polypeptide has the amino acid sequence shown in SEQ ID NO:4.
In a preferred embodiment, the HNF4γ ligand binding domain polypeptide is in complex with a ligand. More preferably, the ligand is a fatty acid.
A method for determining the three-dimensional structure of a crystallized HNF4γ ligand binding domain polypeptide to a resolution of about 3.0 Å or better is also disclosed. The method comprises (a) crystallizing an HNF4γ ligand binding domain polypeptide; and (b) analyzing the HNF4γ ligand binding domain polypeptide to determine the three-dimensional structure of the crystallized HNF4γ ligand binding domain polypeptide, whereby the three-dimensional structure of a crystallized HNF4γ ligand binding domain polypeptide is determined to a resolution of about 3.0 Å or better.
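By way of illustration only, the resolution limit recited above can be related to the scattering angle of the outermost observed reflection through Bragg's law, d = λ/(2 sin θ). The following is a minimal sketch and not part of the disclosed method; the function names, the Cu Kα wavelength of 1.5418 Å and the 2θ value of 30° are assumptions chosen for the example, not parameters taken from this disclosure.

```python
import math


def bragg_d_spacing(wavelength_angstrom: float, two_theta_deg: float) -> float:
    """Return the lattice-plane spacing d (in Å) from Bragg's law with n = 1.

    d = wavelength / (2 * sin(theta)), where theta is half the scattering
    angle 2-theta measured between the incident and diffracted beams.
    """
    theta = math.radians(two_theta_deg / 2.0)
    if not 0.0 < theta < math.pi / 2.0:
        raise ValueError("2-theta must lie strictly between 0 and 180 degrees")
    return wavelength_angstrom / (2.0 * math.sin(theta))


def meets_resolution(d_min_angstrom: float, target_angstrom: float = 3.0) -> bool:
    """A smaller d-spacing means finer detail, so 'or better' means <= target."""
    return d_min_angstrom <= target_angstrom


# Hypothetical example values (assumptions, not data from this disclosure):
# Cu K-alpha radiation and an outermost reflection observed at 2-theta = 30 degrees.
d_min = bragg_d_spacing(1.5418, 30.0)
print(f"d_min = {d_min:.2f} Å, meets 3.0 Å target: {meets_resolution(d_min)}")
```

In this sketch a data set whose outermost reflection corresponds to a spacing just under 3.0 Å would satisfy the "about 3.0 Å or better" criterion.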
A method of designing a modulator of an HNF4 polypeptide is also disclosed. The method comprises (a) designing a potential modulator of an HNF4 polypeptide that will form bonds with amino acids in a substrate binding site based upon a crystalline structure of an HNF4γ ligand binding domain polypeptide; (b) synthesizing the modulator; and (c) determining whether the potential modulator modulates the activity of the HNF4 polypeptide, whereby a modulator of an HNF4 polypeptide is designed.
In an alternative embodiment, a method of designing a modulator that selectively modulates the activity of an HNF4 polypeptide in accordance with the present invention comprises: (a) obtaining a crystalline form of an HNF4γ ligand binding domain polypeptide; (b) evaluating the three-dimensional structure of the crystallized HNF4γ ligand binding domain polypeptide; and (c) synthesizing a potential modulator based on the three-dimensional crystal structure of the crystallized HNF4γ ligand binding domain polypeptide, whereby a modulator that selectively modulates the activity of an HNF4 polypeptide is designed. Preferably, the method further comprises contacting an HNF4γ ligand binding domain polypeptide with the potential modulator; and assaying the HNF4γ ligand binding domain polypeptide for binding of the potential modulator, for a change in activity of the HNF4γ ligand binding domain polypeptide, or both. More preferably, the crystalline form is such that the three-dimensional structure of the crystallized HNF4γ ligand binding domain polypeptide can be determined to a resolution of about 3.0 Å or better.
In yet another embodiment, a method of designing a modulator of an HNF4 polypeptide in accordance with the present invention comprises: (a) selecting a candidate HNF4 ligand; (b) determining which amino acid or amino acids of an HNF4 polypeptide interact with the ligand using a three-dimensional model of a crystallized protein comprising an HNF4γ LBD; (c) identifying in a biological assay for HNF4 activity a degree to which the ligand modulates the activity of the HNF4 polypeptide; (d) selecting a chemical modification of the ligand wherein the interaction between the amino acids of the HNF4 polypeptide and the ligand is predicted to be modulated by the chemical modification; (e) performing the chemical modification on the ligand to form a modified ligand; (f) contacting the modified ligand with the HNF4 polypeptide; (g) identifying in a biological assay for HNF4 activity a degree to which the modified ligand modulates the biological activity of the HNF4 polypeptide; and (h) comparing the biological activity of the HNF4 polypeptide in the presence of modified ligand with the biological activity of the HNF4 polypeptide in the presence of the unmodified ligand, whereby a modulator of an HNF4 polypeptide is designed. Preferably, the HNF4 polypeptide is an HNF4γ polypeptide. More preferably, the three-dimensional model of a crystallized protein is an HNF4γ LBD polypeptide with a bound ligand. Even more preferably, the method further comprises repeating steps (a) through (f), if the biological activity of the HNF4 polypeptide in the presence of the modified ligand varies from the biological activity of the HNF4 polypeptide in the presence of the unmodified ligand.
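Step (b) of the foregoing method, determining which amino acids interact with the ligand, can be approximated computationally by a simple distance criterion applied to a three-dimensional model. The sketch below is illustrative only: the function name, the 4.0 Å cutoff and all coordinates are hypothetical assumptions, and the residue labels are borrowed from the active site definition given elsewhere herein merely as labels; no coordinate in the example is taken from the actual structure.

```python
import math


def residues_in_contact(residue_atoms, ligand_atoms, cutoff=4.0):
    """Return the names of residues having any atom within `cutoff` Å of any ligand atom.

    residue_atoms: dict mapping a residue label to a list of (x, y, z) tuples.
    ligand_atoms:  list of (x, y, z) tuples for the ligand.
    The 4.0 Å default is an assumed, commonly used contact distance.
    """
    contacts = []
    for name, atoms in residue_atoms.items():
        if any(math.dist(atom, lig) <= cutoff for atom in atoms for lig in ligand_atoms):
            contacts.append(name)
    return contacts


# Hypothetical coordinates in Å, invented for the example.
ligand = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
model = {
    "Arg186": [(3.0, 0.0, 0.0)],     # 1.5 Å from the second ligand atom
    "Ile135": [(0.0, 3.5, 0.0)],     # 3.5 Å from the first ligand atom
    "Leu320": [(10.0, 10.0, 10.0)],  # far from the ligand
}
print(residues_in_contact(model, ligand))
```

Residues flagged in this way would be candidates for the interaction analysis of step (d); a real analysis would also consider the chemical nature of each contact, which this sketch does not.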
A method for identifying an HNF4 modulator is also disclosed. The method comprises (a) providing atomic coordinates of an HNF4γ ligand binding domain to a computerized modeling system; and (b) modeling ligands that fit spatially into the binding pocket of the HNF4γ ligand binding domain to thereby identify an HNF4 modulator. Preferably, the method further comprises identifying in an assay for HNF4-mediated activity a modeled ligand that increases or decreases the activity of the HNF4.
A method of identifying an HNF4γ modulator that selectively modulates the activity of an HNF4γ polypeptide compared to other polypeptides is disclosed. The method comprises (a) providing atomic coordinates of an HNF4γ ligand binding domain to a computerized modeling system; and (b) modeling a ligand that fits into the binding pocket of an HNF4γ ligand binding domain and that interacts with conformationally constrained residues of an HNF4γ that are conserved among HNF4 isoforms to thereby identify an HNF4γ modulator. Preferably, the method further comprises identifying in a biological assay for HNF4γ-mediated activity a modeled ligand that selectively binds to the HNF4γ ligand binding domain and increases or decreases the activity of the HNF4γ.
An assay method for identifying a compound that inhibits binding of a ligand to an HNF4 polypeptide is disclosed. The assay method comprises (a) incubating an HNF4 polypeptide with a ligand in the presence of a test inhibitor compound; (b) determining an amount of ligand that is bound to the HNF4 polypeptide, wherein decreased binding of ligand to the HNF4 protein in the presence of the test inhibitor compound relative to binding of ligand in the absence of the test inhibitor compound is indicative of inhibition; and (c) identifying the test compound as an inhibitor of ligand binding if decreased ligand binding is observed. Preferably, the ligand is a fatty acid.
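The comparison in step (b) of the assay method can be expressed numerically as a percent inhibition of ligand binding. The following minimal sketch assumes that bound ligand is quantified as a non-negative signal (for example, counts from a labeled ligand); the function names, the example readings and the optional threshold are assumptions for illustration and are not specified by this disclosure.

```python
def percent_inhibition(bound_with_inhibitor: float, bound_without_inhibitor: float) -> float:
    """Percent decrease in bound ligand caused by the test compound.

    Both arguments are measured amounts of bound ligand in the same units.
    """
    if bound_without_inhibitor <= 0:
        raise ValueError("control binding must be positive")
    if bound_with_inhibitor < 0:
        raise ValueError("bound ligand cannot be negative")
    return 100.0 * (1.0 - bound_with_inhibitor / bound_without_inhibitor)


def is_inhibitor(bound_with_inhibitor: float, bound_without_inhibitor: float,
                 min_percent: float = 0.0) -> bool:
    """Step (c): flag the compound if binding decreased by more than `min_percent`.

    The default of 0.0 mirrors the text (any decrease); a practical assay would
    likely use a threshold above experimental noise, which is an assumption here.
    """
    return percent_inhibition(bound_with_inhibitor, bound_without_inhibitor) > min_percent


# Hypothetical readings: 1000 units bound without the compound, 250 with it.
print(percent_inhibition(250.0, 1000.0))  # 75.0
print(is_inhibitor(250.0, 1000.0))        # True
```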
Accordingly, it is an object of the present invention to provide a three dimensional structure of the ligand binding domain of HNF4γ. The object is achieved in whole or in part by the present invention.
An object of the invention having been stated hereinabove, other objects will be evident as the description proceeds, when taken in connection with the accompanying Drawings and Laboratory Examples as best described hereinbelow.
Until disclosure of the present invention presented herein, the ability to obtain crystalline forms of an HNF4γ LBD has not been realized. And until disclosure of the present invention presented herein, a detailed three-dimensional crystal structure of an HNF4γ polypeptide has not been solved.
In addition to providing structural information, crystalline polypeptides provide other advantages. For example, the crystallization process itself further purifies the polypeptide, and satisfies one of the classical criteria for homogeneity. In fact, crystallization frequently provides unparalleled purification quality, removing impurities that are not removed by other purification methods such as HPLC, dialysis, conventional column chromatography, etc. Moreover, crystalline polypeptides are often stable at ambient temperatures and free of protease contamination and other degradation associated with solution storage. Crystalline polypeptides can also be useful as pharmaceutical preparations. Finally, crystallization techniques in general are largely free of problems such as denaturation associated with other stabilization methods (e.g., lyophilization). Once crystallization has been accomplished, crystallographic data provides useful structural information that can assist the design of compounds that can serve as agonists or antagonists, as described herein below. In addition, the crystal structure provides information useful to map a receptor binding domain, which could then be mimicked by a small non-peptide molecule that would serve as an antagonist or agonist.
I. Definitions
Following long-standing patent law convention, the terms “a” and “an” mean “one or more” when used in this application, including the claims.
As used herein, the term “mutation” carries its traditional connotation and means a change, inherited, naturally occurring or introduced, in a nucleic acid or polypeptide sequence, and is used in its sense as generally known to those of skill in the art.
As used herein, the term “labeled” means the attachment of a moiety, capable of detection by spectroscopic, radiologic or other methods, to a probe molecule.
As used herein, the term “target cell” refers to a cell, into which it is desired to insert a nucleic acid sequence or polypeptide, or to otherwise effect a modification from conditions known to be standard in the unmodified cell. A nucleic acid sequence introduced into a target cell can be of variable length. Additionally, a nucleic acid sequence can enter a target cell as a component of a plasmid or other vector or as a naked sequence.
As used herein, the term “transcription” means a cellular process involving the interaction of an RNA polymerase with a gene that directs the expression as RNA of the structural information present in the coding sequences of the gene. The process includes, but is not limited to the following steps: (a) the transcription initiation, (b) transcript elongation, (c) transcript splicing, (d) transcript capping, (e) transcript termination, (f) transcript polyadenylation, (g) nuclear export of the transcript, (h) transcript editing, and (i) stabilizing the transcript.
As used herein, the term “expression” generally refers to the cellular processes by which a polypeptide is produced from RNA.
As used herein, the term “transcription factor” means a cytoplasmic or nuclear protein which binds to a gene, or binds to an RNA transcript of a gene, or binds to another protein which binds to a gene or an RNA transcript or another protein which in turn binds to a gene or an RNA transcript, so as to thereby modulate expression of the gene. Such modulation can additionally be achieved by other mechanisms; the essence of “transcription factor for a gene” is that the level of transcription of the gene is altered in some way.
As used herein, the term “hybridization” means the binding of a probe molecule, a molecule to which a detectable moiety has been bound, to a target sample.
As used herein, the term “detecting” means confirming the presence of a target entity by observing the occurrence of a detectable signal, such as a radiologic or spectroscopic signal that will appear exclusively in the presence of the target entity.
As used herein, the term “sequencing” means determining the ordered linear sequence of nucleic acids or amino acids of a DNA or protein target sample, using conventional manual or automated laboratory techniques.
As used herein, the term “isolated” means oligonucleotides substantially free of other nucleic acids, proteins, lipids, carbohydrates or other materials with which they can be associated, such association being either in cellular material or in a synthesis medium. The term can also be applied to polypeptides, in which case the polypeptide will be substantially free of nucleic acids, carbohydrates, lipids and other undesired polypeptides.
As used herein, the term “substantially pure” means that the polynucleotide or polypeptide is substantially free of the sequences and molecules with which it is associated in its natural state, and those molecules used in the isolation procedure. The term “substantially free” means that the sample is at least 50%, preferably at least 70%, more preferably 80% and most preferably 90% free of the materials and compounds with which it is associated in nature.
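The graded thresholds in the foregoing definition can be summarized as a simple classification. The sketch below is illustrative only; the function name and the tier labels are assumptions introduced for the example, while the numeric thresholds (50%, 70%, 80% and 90%) are those recited in the definition.

```python
def purity_tier(percent_free: float) -> str:
    """Map a percent-free-of-contaminants value onto the tiers of the definition."""
    if not 0.0 <= percent_free <= 100.0:
        raise ValueError("percent_free must be between 0 and 100")
    if percent_free >= 90.0:
        return "most preferred"
    if percent_free >= 80.0:
        return "more preferred"
    if percent_free >= 70.0:
        return "preferred"
    if percent_free >= 50.0:
        return "substantially free"
    return "not substantially free"


for value in (45.0, 50.0, 75.0, 85.0, 95.0):
    print(value, "->", purity_tier(value))
```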
As used herein, the term “primer” means a sequence comprising two or more deoxyribonucleotides or ribonucleotides, preferably more than three, and more preferably more than eight and most preferably at least about 20 nucleotides of an exonic or intronic region. Such oligonucleotides are preferably between ten and thirty bases in length.
As used herein, the term “DNA segment” means a DNA molecule that has been isolated free of total genomic DNA of a particular species. In a preferred embodiment, a DNA segment encoding an HNF4 polypeptide refers to a DNA segment that contains SEQ ID NO:1, but can optionally comprise fewer or additional nucleic acids, yet is isolated away from, or purified free from, total genomic DNA of a source species, such as Homo sapiens. Included within the term “DNA segment” are DNA segments and smaller fragments of such segments, and also recombinant vectors, including, for example, plasmids, cosmids, phages, viruses, and the like.
As used herein, the phrase “enhancer-promoter” means a composite unit that contains both enhancer and promoter elements. An enhancer-promoter is operatively linked to a coding sequence that encodes at least one gene product.
As used herein, the phrase “operatively linked” means that an enhancer-promoter is connected to a coding sequence in such a way that the transcription of that coding sequence is controlled and regulated by that enhancer-promoter. Techniques for operatively linking an enhancer-promoter to a coding sequence are well known in the art; the precise orientation and location relative to a coding sequence of interest is dependent, inter alia, upon the specific nature of the enhancer-promoter.
As used herein, the terms “candidate substance” and “candidate compound” are used interchangeably and refer to a substance that is believed to interact with another moiety, for example a given ligand that is believed to interact with a complete, or a fragment of, an HNF4 polypeptide, and which can be subsequently evaluated for such an interaction. Representative candidate substances or compounds include xenobiotics such as drugs and other therapeutic agents, carcinogens and environmental pollutants, natural products and extracts, as well as endobiotics such as steroids, fatty acids and prostaglandins. Other examples of candidate compounds that can be investigated using the methods of the present invention include, but are not restricted to, agonists and antagonists of an HNF4 polypeptide, toxins and venoms, viral epitopes, hormones (e.g., opioid peptides, steroids, etc.), hormone receptors, peptides, enzymes, enzyme substrates, co-factors, lectins, sugars, oligonucleotides or nucleic acids, oligosaccharides, proteins, small molecules and monoclonal antibodies.
As used herein, the term “biological activity” means any observable effect flowing from interaction between an HNF4 polypeptide and a ligand. Representative, but non-limiting, examples of biological activity in the context of the present invention include homodimerization of an HNF4, lipid binding by HNF4 and association of an HNF4 with DNA.
As used herein, the term “modified” means an alteration from an entity's normally occurring state. An entity can be modified by removing discrete chemical units or by adding discrete chemical units. The term “modified” encompasses detectable labels as well as those entities added as aids in purification.
As used herein, the terms “structure coordinates” and “structural coordinates” mean mathematical coordinates derived from mathematical equations related to the patterns obtained on diffraction of a monochromatic beam of X-rays by the atoms (scattering centers) of a molecule in crystal form. The diffraction data are used to calculate an electron density map of the repeating unit of the crystal. The electron density maps are used to establish the positions of the individual atoms within the unit cell of the crystal.
Those of skill in the art understand that a set of structure coordinates determined by X-ray crystallography is not without standard error. For the purpose of this invention, any set of structure coordinates for HNF4γ or an HNF4γ mutant that have a root mean square (RMS) deviation from ideal of no more than 0.5 Å when superimposed, using the polypeptide backbone atoms, on the structure coordinates listed in Table 2 shall be considered identical.
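The 0.5 Å criterion can be illustrated with a root mean square deviation calculation over paired backbone atoms. The following minimal sketch assumes that the two coordinate sets have already been optimally superimposed (for example, by a least-squares fitting procedure, which is not shown) and that atoms are listed in corresponding order; the function names and the example coordinates are hypothetical and are not taken from Table 2.

```python
import math


def backbone_rmsd(coords_a, coords_b):
    """RMS deviation (Å) between two equally long, pre-superimposed lists of (x, y, z)."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(math.dist(p, q) ** 2 for p, q in zip(coords_a, coords_b))
    return math.sqrt(squared / len(coords_a))


def considered_identical(coords_a, coords_b, threshold=0.5):
    """Apply the 'no more than 0.5 Å' criterion of the definition."""
    return backbone_rmsd(coords_a, coords_b) <= threshold


# Hypothetical backbone positions; the second set is displaced by 0.3 Å along y.
reference = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
candidate = [(0.0, 0.3, 0.0), (1.5, 0.3, 0.0), (3.0, 0.3, 0.0)]
print(round(backbone_rmsd(reference, candidate), 3))
print(considered_identical(reference, candidate))
```

A uniform 0.3 Å displacement gives an RMS deviation of 0.3 Å, which falls within the stated tolerance; without prior superposition the value would overstate the true structural difference.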
As used herein, the term “space group” means the arrangement of symmetry elements of a crystal.
As used herein, the term “molecular replacement” means a method that involves generating a preliminary model of the wild-type HNF4γ ligand binding domain, or an HNF4γ mutant crystal whose structure coordinates are unknown, by orienting and positioning a molecule whose structure coordinates are known within the unit cell of the unknown crystal so as best to account for the observed diffraction pattern of the unknown crystal. Phases can then be calculated from this model and combined with the observed amplitudes to give an approximate Fourier synthesis of the structure whose coordinates are unknown. This, in turn, can be subject to any of the several forms of refinement to provide a final, accurate structure of the unknown crystal. (See, e.g., Lattman, (1985) Methods Enzymol. 115:55-77; Rossmann, ed., (1972) The Molecular Replacement Method, Gordon & Breach, New York.) Using the structure coordinates of the ligand binding domain of HNF4γ provided by this invention, molecular replacement can be used to determine the structure coordinates of a crystalline mutant or homologue of the HNF4γ ligand binding domain, or of a different crystal form of the HNF4γ ligand binding domain.
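The step of combining model-derived phases with observed amplitudes can be illustrated with a one-dimensional toy Fourier synthesis. This is a deliberately simplified sketch, not a molecular replacement implementation: it assumes a centro-independent one-dimensional cell, Friedel symmetry (so that only positive indices h are supplied), and hypothetical amplitude and phase values; the function name and its parameters are assumptions introduced for the example.

```python
import math


def fourier_synthesis_1d(amplitudes, phases, x_frac, f000=0.0, cell_length=1.0):
    """Approximate density at fractional coordinate x in a 1-D cell.

    amplitudes: dict h -> observed amplitude |F_obs(h)| for h > 0.
    phases:     dict h -> phase in radians calculated from the positioned model.
    Using Friedel pairs, each h contributes 2|F| cos(2*pi*h*x - phase).
    """
    total = f000
    for h, amplitude in amplitudes.items():
        total += 2.0 * amplitude * math.cos(2.0 * math.pi * h * x_frac - phases[h])
    return total / cell_length


# Hypothetical observed amplitudes and model phases (assumed values).
observed_amplitudes = {1: 10.0, 2: 4.0}
model_phases = {1: 0.0, 2: math.pi}

# Sample the approximate density at a few fractional positions along the cell.
for x in (0.0, 0.25, 0.5, 0.75):
    print(x, round(fourier_synthesis_1d(observed_amplitudes, model_phases, x), 3))
```

In three dimensions the same idea applies with indices (h, k, l) and fractional coordinates (x, y, z); the resulting approximate map is then improved by refinement as described above.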
As used herein, the term “isomorphous replacement” means a method of using heavy atom derivative crystals to obtain the phase information necessary to elucidate the three-dimensional structure of a native crystal (Blundell et al., (1976) Protein Crystallography, Academic Press; Otwinowski, (1991), in Isomorphous Replacement and Anomalous Scattering, (Evans & Leslie, eds.), pp. 80-86, Daresbury Laboratory, Daresbury, United Kingdom). The phrase “heavy-atom derivatization” is synonymous with the term “isomorphous replacement”.
As used herein, the terms “β-sheet” and “beta-sheet” mean the conformation of a polypeptide chain stretched into an extended zig-zag conformation. Portions of polypeptide chains that run “parallel” all run in the same direction. Polypeptide chains that are “antiparallel” run in the opposite direction from the parallel chains.
As used herein, the terms “α-helix” and “alpha-helix” mean the conformation of a polypeptide chain wherein the polypeptide backbone is wound around the long axis of the molecule in a left-handed or right-handed direction, and the R groups of the amino acids protrude outward from the helical backbone, wherein the repeating unit of the structure is a single turn of the helix, which extends about 0.56 nm along the long axis.
As used herein, the term “unit cell” means a basic parallelepiped shaped block. The entire volume of a crystal can be constructed by regular assembly of such blocks. Each unit cell comprises a complete representation of the unit of pattern, the repetition of which builds up the crystal. Thus, the term “unit cell” means the fundamental portion of a crystal structure that is repeated infinitely by translation in three dimensions. A unit cell is characterized by three vectors a, b, and c, not located in one plane, which form the edges of a parallelepiped. Angles α, β and γ define the angles between the vectors: angle α is the angle between vectors b and c; angle β is the angle between vectors a and c; and angle γ is the angle between vectors a and b.
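Given the three edge lengths and three angles that characterize a unit cell, its volume follows from the standard parallelepiped formula V = abc·sqrt(1 − cos²α − cos²β − cos²γ + 2 cos α cos β cos γ). The sketch below is illustrative only; the function name is an assumption, and the example uses the preferred lattice constants recited in this disclosure (a = b = 152.71 Å, c = 93.42 Å, all angles 90°).

```python
import math


def unit_cell_volume(a, b, c, alpha_deg, beta_deg, gamma_deg):
    """Volume (Å^3) of a general (triclinic) unit cell from edges (Å) and angles (degrees)."""
    ca = math.cos(math.radians(alpha_deg))
    cb = math.cos(math.radians(beta_deg))
    cg = math.cos(math.radians(gamma_deg))
    radicand = 1.0 - ca * ca - cb * cb - cg * cg + 2.0 * ca * cb * cg
    if radicand <= 0.0:
        raise ValueError("angles do not describe a valid parallelepiped")
    return a * b * c * math.sqrt(radicand)


# Preferred lattice constants recited in this disclosure.
volume = unit_cell_volume(152.71, 152.71, 93.42, 90.0, 90.0, 90.0)
print(f"{volume:.3e} cubic angstroms")
```

For right angles the formula reduces to the product a·b·c, which for these constants is about 2.18 × 10⁶ Å³.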
As used herein, the term “tetragonal unit cell” means a unit cell wherein a=b≠c; and α=β=γ=90°. The vectors a, b and c describe the unit cell edges and the angles α, β, and γ describe the unit cell angles.
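The metric conditions of the foregoing definition (a=b≠c and α=β=γ=90°) can be checked directly from a set of lattice constants. The following minimal sketch uses an assumed numerical tolerance and a hypothetical function name; note that satisfying these metric conditions is necessary for, but does not by itself establish, tetragonal symmetry, which also depends on the space group.

```python
import math


def is_tetragonal_metric(a, b, c, alpha_deg, beta_deg, gamma_deg, tol=1e-3):
    """Return True if a == b != c and all angles are 90 degrees, within `tol`."""
    right_angles = all(
        math.isclose(angle, 90.0, abs_tol=tol)
        for angle in (alpha_deg, beta_deg, gamma_deg)
    )
    a_equals_b = math.isclose(a, b, abs_tol=tol)
    c_differs = not math.isclose(a, c, abs_tol=tol)
    return right_angles and a_equals_b and c_differs


# Preferred lattice constants recited in this disclosure.
print(is_tetragonal_metric(152.71, 152.71, 93.42, 90.0, 90.0, 90.0))
# Hypothetical cubic and orthorhombic cells for comparison.
print(is_tetragonal_metric(10.0, 10.0, 10.0, 90.0, 90.0, 90.0))
print(is_tetragonal_metric(10.0, 11.0, 12.0, 90.0, 90.0, 90.0))
```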
As used herein, the term “crystal lattice” means the array of points defined by the vertices of packed unit cells.
As used herein, the term “active site” means that site in a polypeptide where substrate binding occurs. For HNF4γ, the active site comprises the residues Ile135, Val138, Cys139, Ser141, Met142, Gln145, Leu179, Leu180, Gly182, Ala183, Arg186, Leu194, Leu196, Gly197, Ile202, Glu210, Ile211, Val214, Ala215, Val218, Met301, Gln304, Ile305, Val308, Val314, Ile316 and Leu320.
As used herein, the term “HNF4” means nucleic acids encoding a hepatocyte nuclear factor 4 (HNF4) nuclear receptor polypeptide that can bind DNA and/or one or more ligands and/or has the ability to form multimers. The term “HNF4” encompasses at least the HNF4α and HNF4γ isoforms. The term “HNF4” includes invertebrate homologs; however, preferably, HNF4 nucleic acids and polypeptides are isolated from vertebrate sources. “HNF4” further includes vertebrate homologs of HNF4 family members, including, but not limited to, mammalian and avian homologs. Representative mammalian homologs of HNF4 family members include, but are not limited to, murine and human homologs.
As used herein, the terms “HNF4 gene product”, “HNF4 protein”, “HNF4 polypeptide”, and “HNF4 peptide” are used interchangeably and mean peptides having amino acid sequences which are substantially identical to native amino acid sequences from an organism of interest and which are biologically active in that they comprise all or a part of the amino acid sequence of an HNF4 polypeptide, or cross-react with antibodies raised against an HNF4 polypeptide, or retain all or some of the biological activity (e.g., DNA or ligand binding ability and/or dimerization ability) of the native amino acid sequence or protein. Such biological activity can include immunogenicity.
As used herein, the terms “HNF4 gene product”, “HNF4 protein”, “HNF4 polypeptide”, and “HNF4 peptide” also include analogs of an HNF4 polypeptide. By “analog” is intended that a DNA or peptide sequence can contain alterations relative to the sequences disclosed herein, yet retain all or some of the biological activity of those sequences. Analogs can be derived from genomic nucleotide sequences as are disclosed herein or from other organisms, or can be created synthetically. Those skilled in the art will appreciate that other analogs, as yet undisclosed or undiscovered, can be used to design and/or construct HNF4 analogs. There is no need for an “HNF4 gene product”, “HNF4 protein”, “HNF4 polypeptide”, or “HNF4 peptide” to comprise all or substantially all of the amino acid sequence of an HNF4 polypeptide gene product. Shorter or longer sequences are anticipated to be of use in the invention; shorter sequences are herein referred to as “segments”. Thus, the terms “HNF4 gene product”, “HNF4 protein”, “HNF4 polypeptide”, and “HNF4 peptide” also include fusion, chimeric or recombinant HNF4 polypeptides and proteins comprising sequences of the present invention. Methods of preparing such proteins are disclosed herein and are known in the art.
In the present invention, the terms “HNF4γ gene product”, “HNF4γ protein”, “HNF4γ polypeptide”, and “HNF4γ peptide” are used interchangeably and refer to a preferred isoform of the HNF4 polypeptide family, namely HNF4γ. A more preferred embodiment of an HNF4γ polypeptide comprises the amino acid sequence of SEQ ID NO:2.
As used herein, the term “polypeptide” means any polymer comprising any of the 20 protein amino acids, regardless of its size. Although “protein” is often used in reference to relatively large polypeptides, and “peptide” is often used in reference to small polypeptides, usage of these terms in the art overlaps and varies. The term “polypeptide” as used herein refers to peptides, polypeptides and proteins, unless otherwise noted. The terms “protein”, “polypeptide” and “peptide” are used interchangeably herein when referring to a gene product.
As used herein, the term “modulate” means an increase, decrease, or other alteration of any or all chemical and biological activities or properties of a wild-type or mutant HNF4 polypeptide, preferably a wild-type or mutant HNF4γ polypeptide. The term “modulation” as used herein refers to both upregulation (i.e., activation or stimulation) and downregulation (i.e. inhibition or suppression) of a response.
As used herein, the term “diabetes” means disorders related to alterations in glucose homeostasis. In the mildest forms of diabetes, this alteration is detected only after challenge with a carbohydrate load, while in moderate to severe forms of disease, hyperglycemia is present. Type I diabetes, insulin dependent diabetes mellitus or IDDM, is the result of a progressive autoimmune destruction of the pancreatic β-cells with subsequent insulin deficiency. The more prevalent Type II, non-insulin dependent diabetes mellitus or NIDDM, is associated with peripheral insulin resistance, elevated hepatic glucose production, and inappropriate insulin secretion. Type II diabetes that develops between 20 and 30 years of age and is associated with chronic hyperglycemia and monogenic inheritance is referred to as maturity onset diabetes of the young (MODY, Type II). Other forms of Type II diabetes develop in an individual sometime after 20-30 years of age, for example, late-onset NIDDM. HNF4α is linked to MODY I.
As used herein, the terms “HNF4 gene” and “recombinant HNF4 gene” mean a nucleic acid molecule comprising an open reading frame encoding an HNF4 polypeptide of the present invention, including both exon and (optionally) intron sequences.
As used herein, the term “gene” is used for simplicity to refer to a functional protein, polypeptide or peptide encoding unit. As will be understood by those in the art, this functional term includes both genomic sequences and cDNA sequences. Preferred embodiments of genomic and cDNA sequences are disclosed herein.
As used herein, the term “DNA sequence encoding an HNF4 polypeptide” can refer to one or more coding sequences within a particular individual. Moreover, certain differences in nucleotide sequences can exist between individual organisms, which are called alleles. It is possible that such allelic differences might or might not result in differences in amino acid sequence of the encoded polypeptide yet still encode a protein with the same biological activity. As is well known, genes for a particular polypeptide can exist in single or multiple copies within the genome of an individual. Such duplicate genes can be identical or can have certain modifications, including nucleotide substitutions, additions or deletions, all of which still code for polypeptides having substantially the same activity.
As used herein, the term “intron” means a DNA sequence present in a given gene which is not translated into protein.
As used herein, the term “interact” means detectable interactions between molecules, such as can be detected using, for example, a yeast two hybrid assay. The term “interact” is also meant to include “binding” interactions between molecules. Interactions can, for example, be protein-protein or protein-nucleic acid in nature.
As used herein, the terms “cells,” “host cells” or “recombinant host cells” are used interchangeably and refer not only to the particular subject cell, but also to the progeny or potential progeny of such a cell. Because certain modifications can occur in succeeding generations due to either mutation or environmental influences, such progeny might not, in fact, be identical to the parent cell, but are still included within the scope of the term as used herein.
As used herein, the term “agonist” means an agent that supplements or potentiates the bioactivity of a functional HNF4 gene or protein or of a polypeptide encoded by a gene that is up- or down-regulated by an HNF4 polypeptide and/or a polypeptide encoded by a gene that contains an HNF4 binding site in its promoter region.
As used herein, the term “antagonist” means an agent that decreases or inhibits the bioactivity of a functional HNF4 gene or protein, or that supplements or potentiates the bioactivity of a naturally occurring or engineered non-functional HNF4 gene or protein. Alternatively, an antagonist can decrease or inhibit the bioactivity of a functional gene or polypeptide encoded by a gene that is up- or down-regulated by an HNF4 polypeptide and/or contains an HNF4 binding site in its promoter region. An antagonist can also supplement or potentiate the bioactivity of a naturally occurring or engineered non-functional gene or polypeptide encoded by a gene that is up- or down-regulated by an HNF4 polypeptide, and/or contains an HNF4 binding site in its promoter region.
As used herein, the terms “chimeric protein” or “fusion protein” are used interchangeably and mean a fusion of a first amino acid sequence encoding an HNF4 polypeptide with a second amino acid sequence defining a polypeptide domain foreign to, and not homologous with, any domain of an HNF4 polypeptide. A chimeric protein can present a foreign domain which is found in an organism which also expresses the first protein, or it can be an “interspecies” or “intergenic” fusion of protein structures expressed by different kinds of organisms. In general, a fusion protein can be represented by the general formula X-HNF4-Y, wherein HNF4 represents a portion of the protein which is derived from an HNF4 polypeptide, and X and Y are independently absent or represent amino acid sequences which are not related to an HNF4 sequence in an organism, which includes naturally occurring mutants.
II. Description of Tables
Table 1 is a table summarizing the crystal and data statistics obtained from the crystallized ligand binding domain of HNF4γ. Data on the unit cell are presented, including data on the crystal space group, unit cell dimensions, molecules per asymmetric cell and crystal resolution.
Table 2 is a table of the atomic structure coordinate data obtained from X-ray diffraction from the ligand binding domain of HNF4γ in complex with a ligand.
Table 3 is a table depicting a sequence alignment comparing HNF4γ and HNF4α. Boxed HNF4γ residues are in alpha helices; shaded HNF4γ residues are in beta strands. Bold HNF4γ residues are those residues that have the potential to form van der Waals contacts (5 Å) with palmitic acid; bold and underlined HNF4γ residues form hydrogen bonds to palmitic acid. Underlined HNF4α residues are mutations associated with the disease MODY 1.
Table 4 is a table summarizing data obtained from analytes detected by GC/MS using chemical ionization.
III. General Considerations
Hepatocyte nuclear factor cDNAs code for several different genes and map to different chromosomes. HNF1 maps to chromosome 12, vHNF1 maps to chromosome 17, HNF4α maps to chromosome 20 and HNF4γ maps to chromosome 8. HNF1 and vHNF1 are homologous to each other, regulate several of the same genes and have similar tissue expression patterns. HNF4α and HNF4γ are also homologous to each other. Additionally, HNF4α and HNF4γ have overlapping, but not identical, expression patterns. The existence of multiple isoforms of the HNF4 polypeptide could explain the complex forms of regulation controlled by these transcription factors in different tissues. The redundancy of these transcription factors suggests the possibility of biological complementation by these genes with respect to each other; when one isoform is defective, for example in a subject afflicted with diabetes, the other isoform could compensate.
The present invention will usually be applicable mutatis mutandis to all HNF4 polypeptides, as discussed herein based, in part, on the patterns of HNF4 structure and modulation that have emerged as a consequence of determining the three dimensional structure of HNF4γ in complex with a ligand. Generally, the HNF4s display substantial regions of amino acid homology. Additionally, the HNF4s display an overall structural motif comprising three modular domains: an amino terminal domain, a DNA binding domain (DBD) and a ligand binding domain (LBD).
The amino terminal domain of the HNF4 isoforms is the least conserved of the three domains. This domain is involved in transcriptional activation and, in some cases, its uniqueness can dictate selective receptor-DNA binding and activation of target genes by HNF4 isoforms.
The DNA binding domain (DBD) is the most conserved structure amongst the HNF4s. It typically contains about 70 amino acids that fold into two zinc finger motifs, wherein a zinc ion coordinates four cysteines. The DBD contains two perpendicularly oriented α-helices that extend from the base of the first and second zinc fingers. The two zinc fingers function in concert along with non-zinc finger residues to direct the HNF4s to specific target sites on DNA. Various amino acids in the DBD influence spacing between two half-sites (each of which usually comprises six nucleotides) for receptor homodimerization. The optimal spacings facilitate cooperative interactions between DBDs, and D box residues are part of the dimerization interface. Other regions of the DBD facilitate DNA-protein and protein-protein interactions required for HNF4 homodimerization.
The LBD is the second most highly conserved domain in these receptors. Whereas the integrity of several different LBD sub-domains is important for ligand binding, truncated molecules containing only the LBD can retain normal ligand binding activity. This domain also participates in other functions, including dimerization, nuclear translocation and transcriptional regulation activities. Importantly, this domain can bind a ligand and can undergo ligand-induced conformational changes. Ligand binding allows the activation domain to serve as an interaction site for essential co-activator proteins that function to stimulate or inhibit transcription.
The carboxy-terminal activation subdomain is in close three-dimensional proximity in the LBD to the ligand, so as to allow for ligands bound to the LBD to coordinate (or interact) with amino acid(s) in the activation subdomain. As disclosed herein, the LBD of an HNF4 is expressed, crystallized and its three dimensional structure determined. Computational and other methods for the design of ligands to the LBD are also disclosed.
IV. Production of HNF4 Polypeptides
The native and mutated HNF4 polypeptides, and fragments thereof, of the present invention can be chemically synthesized in whole or part using techniques that are well-known in the art (See, e.g., Creighton, (1983) Proteins: Structures and Molecular Principles, W.H. Freeman & Co., New York, incorporated herein in its entirety). Alternatively, methods which are well known to those skilled in the art can be used to construct expression vectors containing a partial or the entire native or mutated HNF4 polypeptide coding sequence and appropriate transcriptional/translational control signals. These methods include in vitro recombinant DNA techniques, synthetic techniques and in vivo recombination/genetic recombination. See, for example, the techniques described in Sambrook et al., (1989) Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, New York, and Ausubel et al., (1989) Current Protocols in Molecular Biology, Greene Publishing Associates and Wiley Interscience, New York, both incorporated herein in their entirety.
A variety of host-expression vector systems can be utilized to express an HNF4 coding sequence. These include but are not limited to microorganisms such as bacteria transformed with recombinant bacteriophage DNA, plasmid DNA or cosmid DNA expression vectors containing an HNF4 coding sequence; yeast transformed with recombinant yeast expression vectors containing an HNF4 coding sequence; insect cell systems infected with recombinant virus expression vectors (e.g., baculovirus) containing an HNF4 coding sequence; plant cell systems infected with recombinant virus expression vectors (e.g., cauliflower mosaic virus, CaMV; tobacco mosaic virus, TMV) or transformed with recombinant plasmid expression vectors (e.g., Ti plasmid) containing an HNF4 coding sequence; or animal cell systems. The expression elements of these systems vary in their strength and specificities.
Depending on the host/vector system utilized, any of a number of suitable transcription and translation elements, including constitutive and inducible promoters, can be used in the expression vector. For example, when cloning in bacterial systems, inducible promoters such as pL of bacteriophage λ, plac, ptrp, ptac (ptrp-lac hybrid promoter) and the like can be used. When cloning in insect cell systems, promoters such as the baculovirus polyhedrin promoter can be used. When cloning in plant cell systems, promoters derived from the genome of plant cells (e.g., heat shock promoters; the promoter for the small subunit of RUBISCO; the promoter for the chlorophyll a/b binding protein) or from plant viruses (e.g., the 35S RNA promoter of CaMV; the coat protein promoter of TMV) can be used. When cloning in mammalian cell systems, promoters derived from the genome of mammalian cells (e.g., metallothionein promoter) or from mammalian viruses (e.g., the adenovirus late promoter; the vaccinia virus 7.5 K promoter) can be used. When generating cell lines that contain multiple copies of the tyrosine kinase domain DNA, SV40-, BPV- and EBV-based vectors can be used with an appropriate selectable marker.
V. Formation of HNF4γ Ligand Binding Domain Crystals
In one embodiment, the present invention provides crystals of HNF4γ. The crystals were obtained using the methodology disclosed in the Examples. The HNF4γ crystals, which can be native crystals, derivative crystals or co-crystals, have tetragonal unit cells (a tetragonal unit cell is a unit cell wherein a=b≠c, and wherein α=β=γ=90°) and space group symmetry I4₁22. There is one HNF4γ molecule in the asymmetric unit. In the HNF4γ crystalline form, the unit cell has dimensions of a=b=152.71 Å, c=93.42 Å, and α=β=γ=90°.
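As a rough illustration of how the unit cell parameters above can be used, the following sketch computes the cell volume from V = a²c (valid for a tetragonal cell) and the volume available per asymmetric unit. It assumes the cell edges are given in ångströms and that the space group is I4₁22, which has 16 general positions; the function and variable names are illustrative only and are not part of the disclosure.

```python
# Unit-cell geometry for the tetragonal crystal form described above.
# Assumption: cell edges are in ångströms.
A_ANGSTROM = 152.71  # a = b, from the text
C_ANGSTROM = 93.42   # c, from the text

# Assumption: space group I4(1)22 (No. 98), which has 16 general positions.
GENERAL_POSITIONS = 16


def tetragonal_cell_volume(a, c):
    """Volume of a tetragonal cell (a = b, all angles 90 degrees): V = a^2 * c."""
    return a * a * c


cell_volume = tetragonal_cell_volume(A_ANGSTROM, C_ANGSTROM)
asu_volume = cell_volume / GENERAL_POSITIONS

print(f"unit cell volume:           {cell_volume:,.0f} cubic ångströms")
print(f"volume per asymmetric unit: {asu_volume:,.0f} cubic ångströms")
```

With one molecule per asymmetric unit, the second figure is the volume occupied by a single HNF4γ LBD molecule together with its share of solvent.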
The HNF4γ LBD-ligand structure was solved using single isomorphous replacement anomalous scattering (SIRAS) techniques. In the SIRAS method of solving protein crystals, a derivative crystal is prepared that contains an atom that is heavier than the other atoms of the sample. One representative heavy atom that can be incorporated into the derivative crystal is mercury. A mercury-based heavy atom derivative crystal was used to solve the structure of the HNF4γ ligand binding domain of the present invention. Heavy atom derivative crystals can be prepared by soaking a crystal in a solution containing a selected heavy atom salt. In the present invention, heavy atom derivative crystals were prepared by soaking a crystalline form of the HNF4γ LBD in methyl mercury chloride (MeHgCl).
Symmetry-related reflections in the X-ray diffraction pattern, usually identical, are altered by the anomalous scattering contribution of the heavy atoms. The measured differences in symmetry-related reflections are used to determine the position of the heavy atoms, leading to an initial estimation of the diffraction phases, and subsequently, an electron density map is prepared. The prepared electron density map is then used to identify the position of the other atoms in the sample.
V.A. Preparation of HNF4 Crystals
The native and derivative co-crystals, and fragments thereof, disclosed in the present invention can be obtained by a variety of techniques, including batch, liquid bridge, dialysis, vapor diffusion and hanging drop methods (See, e.g., McPherson, (1982) Preparation and Analysis of Protein Crystals, John Wiley, New York; McPherson, (1990) Eur. J. Biochem. 189:1-23; Weber, (1991) Adv. Protein Chem. 41:1-36). In a preferred embodiment, the vapor diffusion and hanging drop methods are used for the crystallization of HNF4 polypeptides and fragments thereof.
In general, native crystals of the present invention are grown by dissolving substantially pure HNF4 polypeptide or a fragment thereof in an aqueous buffer containing a precipitant at a concentration just below that necessary to precipitate the protein. Water is removed by controlled evaporation to produce precipitating conditions, which are maintained until crystal growth ceases.
In a preferred embodiment of the invention, native crystals are grown by vapor diffusion (See, e.g., McPherson, (1982) Preparation and Analysis of Protein Crystals, John Wiley, New York; McPherson, (1990) Eur. J. Biochem. 189:1-23). In this method, the polypeptide/precipitant solution is allowed to equilibrate in a closed container with a larger aqueous reservoir having a precipitant concentration optimal for producing crystals. Generally, less than about 25 μL of HNF4 polypeptide solution is mixed with an equal volume of reservoir solution, giving a precipitant concentration about half that required for crystallization. This solution is suspended as a droplet underneath a coverslip, which is sealed onto the top of the reservoir. The sealed container is allowed to stand until crystals grow. Crystals generally form within two to six weeks, and are suitable for data collection within approximately seven to ten weeks. Of course, those of skill in the art will recognize that the above-described crystallization procedures and conditions can be varied.
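The halving of the precipitant concentration on mixing equal volumes can be sketched as a simple dilution calculation. The example below is a minimal illustration only: the reservoir concentration and drop volumes are hypothetical values chosen for the arithmetic, and the sketch assumes the protein solution itself contains no precipitant.

```python
def drop_precipitant_concentration(reservoir_conc, protein_volume_ul, reservoir_volume_ul):
    """Initial precipitant concentration in a drop made by mixing protein
    solution with reservoir solution.

    Assumption: the protein solution contains no precipitant, so mixing
    simply dilutes the reservoir solution.
    """
    total_volume = protein_volume_ul + reservoir_volume_ul
    return reservoir_conc * reservoir_volume_ul / total_volume


# Hypothetical values, for illustration only.
reservoir_conc = 20.0  # precipitant concentration in the reservoir, % (w/v)
drop_start = drop_precipitant_concentration(reservoir_conc, 2.0, 2.0)

print(f"reservoir: {reservoir_conc}%  drop at setup: {drop_start}%")
```

As water leaves the drop during equilibration, the drop concentration rises from this starting value toward that of the reservoir.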
V.B. Preparation of Derivative Crystals
Derivative crystals of the present invention, e.g. heavy atom derivative crystals, can be obtained by soaking native crystals in mother liquor containing salts of heavy metal atoms. Such derivative crystals are useful for phase analysis in the solution of crystals of the present invention. In a preferred embodiment of the present invention, for example, soaking a native crystal in a solution containing methyl-mercury chloride provides derivative crystals suitable for use as isomorphous replacements in determining the X-ray crystal structure of an HNF4 polypeptide. Additional reagents useful for the preparation of the derivative crystals of the present invention will be apparent to those of skill in the art after review of the disclosure of the present invention presented herein.
V.C. Preparation of Co-Crystals
Co-crystals of the present invention can be obtained by soaking a native crystal in mother liquor containing compounds known or predicted to bind the LBD of an HNF4, or a fragment thereof. Alternatively, co-crystals can be obtained by co-crystallizing an HNF4 LBD polypeptide or a fragment thereof in the presence of one or more compounds known or predicted to bind the polypeptide. In a preferred embodiment, such a compound is a fatty acid of variable length.
V.D. Solving a Crystal Structure of the Present Invention
Crystal structures of the present invention can be solved using a variety of techniques including, but not limited to, isomorphous replacement anomalous scattering or molecular replacement methods. Computer software packages will also be helpful in solving a crystal structure of the present invention. Applicable software packages include but are not limited to the X-PLOR™ program (Brünger, (1992) X-PLOR, Version 3.1. A System for X-ray Crystallography and NMR, Yale University Press, New Haven, Conn.; X-PLOR is available from Molecular Simulations, Inc., San Diego, Calif.), XtalView (McRee, (1992) J. Mol. Graphics 10: 44-47; XtalView is available from the San Diego Supercomputer Center), SHELXS 97 (Sheldrick (1990) Acta Cryst. A46: 467; SHELX 97 is available from the Institute of Inorganic Chemistry, Georg-August-Universität, Göttingen, Germany), HEAVY (Terwilliger, Los Alamos National Laboratory) and SHAKE-AND-BAKE (Hauptman, (1997) Curr. Opin. Struct. Biol. 7: 672-80; Weeks et al., (1993) Acta Cryst. D49: 179; available from the Hauptman-Woodward Medical Research Institute, Buffalo, N.Y.). See also, Ducruix & Giegé, (1992) Crystallization of Nucleic Acids and Proteins: A Practical Approach, IRL Press, Oxford, England, and references cited therein.
VI. Summary of Results for the HNF4γ Ligand Binding Domain
The three-dimensional structure of the HNF4γ LBD has been solved by X-ray crystallography and is depicted in
VI.A. Overall Structure of the HNF4γ LBD
The overall fold of the HNF4γ LBD of the present invention is an “α-helical sandwich”, is depicted in
VI.B. Structural Features of the Dimerization Site
The HNF4γ homodimer interface is composed of residues in helices 7, 9 and 10, and is the same interface seen in other nuclear receptor homo- and heterodimers (Bourguet et al., (1995) Nature 375: 377-82; Brzozowski et al., (1997) Nature (London) 389: 753-58; Nolte et al., (1998) Nature (London) 395: 1374; Bourguet et al., (2000) Mol. Cell 5: 289-98; Gampe, Jr. et al., (2000) Mol. Cell 5: 545-55). Of the 22 residues involved in HNF4γ dimerization, 20 are conserved in HNF4α, and all charged residues in the HNF4γ dimer interface are identical in HNF4α. This homodimer interface exemplifies themes seen in other nuclear receptor dimers: buried hydrophobic surface for stability, with hydrogen bonds and charge-pairing for specificity. As a dimer, HNF4γ buries 1320 Å² of accessible surface per monomer, between the 1266 Å² and 1632 Å² observed for RXRα and ERα homodimers, respectively. The HNF4γ homodimer interface includes specific side-chain/side-chain interactions, with hydrogen bonds between Q266Nε-E286Oε and Q295Oε-Q295Nε and salt bridges between E228Oε-K259Nζ and possibly D271Oδ-R281Nε, and R281NH. HNF4/RXR heterodimer formation is prevented by LBD interactions (Jiang & Sladek, (1997) J. Biol. Chem. 272: 1218-25), and LBD heterodimer formation is precluded because not all salt bridges will form. The RXR equivalents of HNF4γ E228 and K259 are D359 and E390, respectively, so a heterodimer will create one salt bridge and one potentially unfavorable pairing. HNF4γ D271 and R281 are equivalent to RXRα A402 and P412, respectively, and no charge-pairing is possible. Also, the critical heterodimer salt bridge observed between RXRα R393 and PPARγ D441 cannot be made to HNF4, where the equivalent residue is a Thr. Our observation that the E228-K259 salt bridge is important for homodimer formation agrees with the work of Bogan et al. (Bogan et al., (2000) J. Mol. Biol. 302: 831-851). Their results showed that wild-type HNF4α cannot form heterodimers with HNF4α mutants where residues E327 (γE286) and K300 (γK259) are changed to their RXR equivalents. The conservation of the interface residues between HNF4γ and HNF4α suggests that the HNF4α homodimer interface is similar to that of HNF4γ. In fact, HNF4α/HNF4γ heterodimers could possibly exist in tissues where both are present.
VI.C. Structural Features of the HNF4γ Binding Pocket
The HNF4γ LBD has a well-defined ligand binding pocket, which is similar to those of the nuclear receptors RXR (Bourguet et al., (1995) Nature 375: 377-82) and RAR (Renaud et al., (1995) Nature 378: 681-89). The pocket volume, 476 Å³, is consistent with binding a small molecule ligand, and the pocket is hydrophobic over 76% of its surface. Arginine 186, which is conserved among a number of nuclear receptors, occupies the same pocket position seen in retinoid X receptor (RXR), retinoic acid receptor (RAR), thyroid hormone receptor (TR), estrogen receptor (ER) and progesterone receptor (PR). In all previous structures, this binding-pocket arginine makes hydrogen bonds to oxygen atoms of bound ligands. HNF4γ's pocket is too narrow to accommodate steroids. Another prominent feature of the pocket is a direct contact between M142 in helix 3 and M301 in helix 11. This contact bridges the binding pocket, and effectively blocks direct ligand access to residues 318-325 in helix 12.
Palmitic acid forms hydrogen bonds with the side chain of arginine 186, and with the backbone nitrogen of glycine 197 (not shown). Alanine 215 corresponds to serine 256 in HNF4α. Because serine can form hydrogen bonds to the ligand, the specificity is different for the two receptor subtypes. GC/MS analyses of receptor extracts indicate that although HNF4α also binds palmitic and stearic acid, it preferentially binds different fatty acids. Valine 214 corresponds to valine 255 in HNF4α, which is the site of one of the mutations associated with MODY, Type 1.
VI.D. Identification and Characterization of an HNF4α Binding Pocket Ligand
Electron density was observed in the HNF4γ binding pocket in the first solvent-flattened SIRAS map. During the course of refinement, the pocket density improved and appeared consistent with a thin curved ligand, depicted in
The description of the bound ligand from the structural data led to the belief that the compound was a fatty acid. Analytical methods were used to obtain a definitive identification of the ligand. First, bound ligand(s) was separated from a purified preparation of HNF4γ LBD by liquid-liquid extraction (Folch et al., (1957) J. Biol. Chem. 226: 497-509). The extract was then treated with 3% (v/v) acetyl chloride in methanol. This reagent converts fatty acids to their corresponding fatty acid methyl esters (FAME). The derivatized sample was then analyzed by gas chromatography/mass spectrometry (GC/MS) using both electron impact ionization (EI) and chemical ionization (CI) in separate analyses. The constituents of the extract were identified by comparing the GC/MS data for the extract with data for standard fatty acids, acquired likewise.
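To illustrate how a protonated molecular ion observed under chemical ionization can be matched to a fatty acid methyl ester, the following sketch computes the expected monoisotopic m/z of [M+H]+ for the methyl esters of several saturated fatty acids. It is a minimal illustration, not the analysis actually performed: it assumes saturated, unbranched acids (CnH2nO2), uses standard monoisotopic atomic masses, and the function names are hypothetical.

```python
# Standard monoisotopic atomic masses (u) and the proton mass.
MASS = {"C": 12.0, "H": 1.00782503, "O": 15.99491462}
PROTON = 1.00727647


def fame_formula(n_carbons):
    """Elemental composition of the methyl ester of a saturated C_n fatty acid.

    A saturated fatty acid is C_n H_2n O_2; methyl esterification adds CH2.
    """
    return {"C": n_carbons + 1, "H": 2 * n_carbons + 2, "O": 2}


def protonated_mz(formula):
    """Monoisotopic m/z of the singly charged [M+H]+ ion."""
    neutral = sum(MASS[element] * count for element, count in formula.items())
    return neutral + PROTON


for name, n in [("lauric", 12), ("myristic", 14), ("palmitic", 16), ("stearic", 18)]:
    mz = protonated_mz(fame_formula(n))
    print(f"{name:9s} C{n}:0 methyl ester  [M+H]+ = {mz:.3f}")
```

A peak whose protonated molecular ion and retention time both match a standard prepared in the same way can then be assigned to that fatty acid.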
A comparison of
Results of the GC/MS analyses show that the HNF4γ extract consisted of a mixture of fatty acids with palmitic acid as the most abundant component. Data from the CI analysis are summarized in Table 4. The second column lists the protonated molecular ion detected of each labeled peak in
VI.E. Confirmation of the Functionality of the HNF4 Ligand by FRET Assay
To confirm that fatty acids were functional HNF4 ligands, HNF4α and HNF4γ were tested for their ability to recruit the nuclear receptor co-activator CREB binding protein, a known activation partner (Wang et al., (1998) J. Biol. Chem. 273: 30847-50; Dell & Hadzopoulou-Cladaras, (1999) J. Biol. Chem. 274: 9013-21). A FRET (fluorescence resonance energy transfer) assay was employed using purified recombinant CREB-binding protein (CBP) and HNF4 LBD (Zhou et al., (1998) Mol. Endocrinol. 12: 1594-1604). Long-chain fatty acids (LCFA) with chain lengths from 12 to 18 carbons were tested for their ability to modulate the association between HNF4 and CBP in a dose dependent manner. Saturated fatty acids with chains shorter than 16 carbons did not affect basal CBP association. Palmitic and stearic acids increased the allosteric interaction between CBP and HNF4, with apparent binding constants of 1 μM.
The reported ligands for HNF4α are fatty acyl-CoA thioesters (Hertz et al., (1998) Nature (London) 392: 512-16), which are much larger than other nuclear receptor ligands (Bogan et al., (1998) Nat. Struct. Biol. 5: 679-81). When tested in the FRET assay, palmitoyl-CoA and stearoyl-CoA decreased the basal level of CBP recruitment to both HNF4α and HNF4γ. Shorter fatty acyl-CoAs had no effect on CBP association. This behavior indicates that longer chain fatty-acyl CoA derivatives are not HNF4 agonists.
HNF4α is primarily expressed in the liver and pancreas and is regulated by fatty acids, indicating a link between fatty acid and glucose metabolism. There are known effects of free fatty acids on glucose-stimulated insulin secretion (GSIS), including an initial stimulatory effect (Stein et al., (1997) J. Clin. Invest. 100:398-403; Dobbins et al., (1998a) Diabetes 47:1613-18; Dobbins et al., (1998b) J. Clin. Invest. 101:2370-76), followed by a decrease after long term exposure (Zhou & Grill, (1994) J. Clin. Invest. 93:870-76; Zhou & Grill, (1995) J. Clin. Endocrinol. Metab. 80:1584-90; Bollheimer et al., (1998) J. Clin. Invest. 101:1094-1101; Björklund & Grill (1999) Diabetes 48:1409-14; Jacqueminet et al., (2000) Metab. Clin. Exp. 49:532-36). The observed negative effects of long term fatty acid exposure on pancreatic islet function (Zhou & Grill, (1995) J. Clin. Endocrinol. Metab. 80:1584-90) are likely to be partially mediated by HNF4.
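A dose dependent response with an apparent binding constant of about 1 μM can be pictured with a simple one-site binding model. The sketch below is illustrative only: the hyperbolic model is an assumption, since the fitting procedure used for the FRET data is not described here, and the concentrations are hypothetical.

```python
def fractional_response(ligand_um, kd_um):
    """Fraction of the maximal ligand-induced response at a given ligand
    concentration, for a one-site binding model (an assumed model)."""
    return ligand_um / (kd_um + ligand_um)


# Apparent binding constant reported for palmitic and stearic acids.
KD_UM = 1.0

# Hypothetical ligand concentrations spanning the binding constant.
for conc_um in (0.01, 0.1, 1.0, 10.0, 100.0):
    fraction = fractional_response(conc_um, KD_UM)
    print(f"{conc_um:7.2f} uM fatty acid -> {fraction:.2f} of maximal response")
```

Under this model the response is half-maximal when the ligand concentration equals the apparent binding constant, which is how such a constant is typically read from a titration curve.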
VI.F. Analysis of the HNF4α Ligand Binding Mode
Although fatty acids are ligands for both PPARs and HNF4s, the proposed binding mode and specificity are significantly different. The structure of EPA bound to PPARδ (Xu et al., (1999) Mol. Cell 3: 397-403) showed that the acid head group hydrogen bonds to PPARδ residues H323, H449 and Y473 in the AF2 helix. In HNF4γ, the fatty acid head group most likely hydrogen bonds to residue R185 in helix 5, and possibly to G197, much like the acid-protein interactions observed in retinoid binding nuclear receptors (Bourguet et al., (2000) Mol. Cell 5: 289-98; Gampe, Jr. et al., (2000) Mol. Cell 5: 545-55; Renaud et al., (1995) Nature 378: 681-89). The hydrophobic tail in the PPARδ/EPA complex can adopt two bent conformations, with the tail-up conformation pointing towards helix 5. In contrast, the hydrophobic tail in HNF4γ curves around the M142-M301 bridge and points towards the loop between helix 11 and the AF2 helix. Thus, the fatty acid in PPARδ binds in essentially the reverse orientation to that in HNF4γ.
The substrate specificity of the HNF4s is also markedly different from PPARs. The PPARs accept a wide range of fatty acids, but C18-20 mono- and poly-unsaturated fatty acids bind most tightly. Both HNF4s bind a much smaller range of substrates, with 16-18 carbon saturated fatty acids highly preferred. Thus, all HNF4 substrates are also bound by PPARα and PPARδ, but the converse is not true. The greater substrate specificity of HNF4 indicates a more specific role in the regulation of biological pathways.
VI.G. Unique Structural Differences Between HNF4γ and HNF4α
Without an atomic structure for HNF4α, the structure of HNF4γ can be considered in order to speculate on the design of isoform specific compounds. The solved structure of HNF4γ suggests that there is a potential for isoform specific ligand recognition based on amino acid differences between HNF4α and HNF4γ. Of the 26 amino acids in the binding pocket, 6 are different between HNF4γ and HNF4α. The substitution that can be directly exploited for designing isoform specific ligands is Ala215γ-Ser256α. This substitution adds a hydrogen bond donor near the C8-C9 of palmitic acid, and represents a substantial change to the chemical character of the binding pocket. Compounds that make this hydrogen bond will preferentially bind to HNF4α. Alternatively, compounds with a bulky hydrophobic group in that position may clash sterically with the hydroxyl of serine, and would preferentially bind HNF4γ. Thus, the HNF4γ structure provides a roadmap for the design of isoform specific compounds.
Most of the substitutions between HNF4α and HNF4γ are conservative, exchanging one hydrophobic residue for another. These are Ile202γ-Val242α, Ile211γ-Met252α, Val218γ-Ile259α, Val308γ-Ile349α, and Val314γ-Ala355α. These substitutions have the effect of changing the shape of the binding pocket without altering its chemical characteristics greatly. Two of the substitutions that add mass to the binding pocket residues (Ile211γ-Met252α, Val218γ-Ile259α) occur along the curve of palmitic acid, and have the effect of restricting the pocket. This is partially offset by the substitution Ile202γ-Val242α near palmitic acid C6-C8, which enlarges a cavity in the binding pocket. The pair of substitutions Val308γ-Ile349α and Val314γ-Ala355α occur near the palmitic acid tail, and direct the fatty acid tail more towards the loop connecting helix 11 and helix 12 (the AF2 helix), while expanding the pocket there. These shape changes to the binding pocket can also be exploited in the design of isoform specific compounds.
One other difference between HNF4γ and HNF4α that could change the characteristics of the binding pocket is that HNF4α has an extra residue, Ala250, in the loop between the beta turn and helix 7. This extra residue could slightly shift the positions of the residues in helix 7, i.e. Glu251, Met252, Val255, Ser256, and Ile259. However, modeling amino acid shifts caused by extra loop residues is more speculative than substitutions.
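The residue pairings quoted above can be checked for internal consistency with a few lines of code. The sketch below tabulates the six binding-pocket substitutions given in the text and computes the numbering offset between the two isoforms on either side of the extra HNF4α residue Ala250. The data structure and variable names are illustrative only.

```python
# Binding-pocket substitutions listed in the text:
# (HNF4γ residue, HNF4γ number, HNF4α residue, HNF4α number)
SUBSTITUTIONS = [
    ("Ile", 202, "Val", 242),
    ("Ile", 211, "Met", 252),
    ("Ala", 215, "Ser", 256),
    ("Val", 218, "Ile", 259),
    ("Val", 308, "Ile", 349),
    ("Val", 314, "Ala", 355),
]

ALPHA_INSERTION = 250  # position of the extra Ala in HNF4α, from the text
POCKET_RESIDUES = 26   # binding-pocket residues, from the text

# Numbering offset (α number minus γ number) before and after the insertion.
offsets_before = {a - g for _, g, _, a in SUBSTITUTIONS if a < ALPHA_INSERTION}
offsets_after = {a - g for _, g, _, a in SUBSTITUTIONS if a > ALPHA_INSERTION}

# Fraction of pocket residues that are identical in the two isoforms.
pocket_identity = (POCKET_RESIDUES - len(SUBSTITUTIONS)) / POCKET_RESIDUES

print("offset before Ala250:", offsets_before)
print("offset after Ala250: ", offsets_after)
print(f"pocket identity: {pocket_identity:.0%}")
```

The offset increases by one for residues that follow position 250 in HNF4α, which is what a single-residue insertion at that position would be expected to produce.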
VI.H. Generation of Easily-Solved HNF4 Crystals
The present invention discloses a substantially pure HNF4 LBD polypeptide in crystalline form. In a preferred embodiment, exemplified in the Figures and Laboratory Examples, HNF4γ is crystallized with bound ligand. Crystals are formed from HNF4 LBD polypeptides that are usually expressed by a cell culture, such as E. coli. Bromo-, iodo- and substitutions can be included during the preparation of crystal forms and can act as heavy atom substitutions in HNF4 ligands and crystals of HNF4s. This method can be advantageous for the phasing of the crystal, which is a crucial, and sometimes limiting, step in solving the three-dimensional structure of a crystallized entity. Thus, the need for generating the heavy metal derivatives traditionally employed in crystallography can be eliminated. After the three-dimensional structure of an HNF4 or HNF4 LBD with or without a ligand bound is determined, the resultant three-dimensional structure can be used in computational methods to design synthetic ligands for HNF4γ and other HNF4 polypeptides. Further activity structure relationships can be determined through routine testing, using assays disclosed herein and known in the art.
VII. Uses of HNF4γ Crystals and the Three-Dimensional Structure of the Ligand Binding Domain of HNF4γ
VII.A. Design and Development of HNF4 Modulators
The knowledge of the structure of the HNF4γ ligand binding domain (LBD), an aspect of the present invention, provides a tool for investigating the mechanism of action of HNF4γ and other HNF4 polypeptides in a subject. For example, various computer models, as described herein, can predict the binding of various substrate molecules to the LBD of HNF4γ. Upon discovering that such binding in fact takes place, knowledge of the protein structure then allows design and synthesis of small molecules that mimic the functional binding of the substrate to the LBD of HNF4γ, and to the LBDs of other HNF4 polypeptides. This is the method of “rational” drug design, further described herein.
Use of the isolated and purified HNF4γ crystalline structure of the present invention in rational drug design is thus provided in accordance with the present invention. Additional rational drug design techniques are described in U.S. Pat. Nos. 5,834,228 and 5,872,011, incorporated herein in their entirety.
Thus, in addition to the compounds described herein, other sterically similar compounds can be formulated to mimic the key structural regions of an HNF4 in general, or of HNF4γ in particular. The generation of a structural functional equivalent can be achieved by the techniques of modeling and chemical design known to those of skill in the art and described herein. It will be understood that all such sterically similar constructs fall within the scope of the present invention.
VII.A.1. Rational Drug Design
The three-dimensional structure of the ligand binding domain of HNF4γ is unprecedented and will greatly aid in the development of new synthetic ligands for an HNF4 polypeptide, such as HNF4 agonists and antagonists, including those that bind exclusively to any one of the HNF4 isoforms. In addition, the HNF4s are well suited to modern methods, including three-dimensional structure elucidation and combinatorial chemistry, such as those disclosed in U.S. Pat. No. 5,463,564, incorporated herein by reference. Structure determination using X-ray crystallography is possible because of the solubility properties of the HNF4s. Computer programs that use crystallography data when practicing the present invention will enable the rational design of ligands to these receptors. Programs such as RASMOL (Biomolecular Structures Group, Glaxo Wellcome Research & Development, Stevenage, Hertfordshire, UK; Version 2.6, August 1995, Version 2.6.4, December 1998, Copyright © Roger Sayle 1992-1999) can be used with the atomic structural coordinates from crystals generated by practicing the invention or used to practice the invention by generating three-dimensional models and/or determining the structures involved in ligand binding. Computer programs such as those sold under the registered trademark INSIGHT II® and such as GRASP (Nicholls et al., (1991) Proteins 11: 282) allow for further manipulations and the ability to introduce new structures. In addition, high throughput binding and bioactivity assays can be devised using purified recombinant protein and modern reporter gene transcription assays known to those of skill in the art in order to refine the activity of a designed ligand.
A method of identifying modulators of the activity of an HNF4 polypeptide using rational drug design is thus provided in accordance with the present invention. The method comprises designing a potential modulator for an HNF4 polypeptide of the present invention that will form non-covalent bonds with amino acids in the ligand binding pocket based upon the crystalline structure of the HNF4γ LBD polypeptide; synthesizing the modulator; and determining whether the potential modulator modulates the activity of the HNF4 polypeptide. In a preferred embodiment, the modulator is designed for an HNF4γ polypeptide. Preferably, the HNF4γ polypeptide comprises the nucleic acid sequence of SEQ ID NO:1, and the HNF4γ LBD comprises the nucleic acid sequence SEQ ID NO:3. The determination of whether the modulator modulates the biological activity of an HNF4 polypeptide is made in accordance with the screening methods disclosed herein, or by other screening methods known to those of skill in the art. Modulators can be synthesized using techniques known to those of ordinary skill in the art.
In an alternative embodiment, a method of designing a modulator of an HNF4 polypeptide in accordance with the present invention is disclosed comprising: (a) selecting a candidate HNF4 ligand; (b) determining which amino acid or amino acids of an HNF4 polypeptide interact with the ligand using a three-dimensional model of a crystallized HNF4γ LBD; (c) identifying in a biological assay for HNF4 activity a degree to which the ligand modulates the activity of the HNF4 polypeptide; (d) selecting a chemical modification of the ligand wherein the interaction between the amino acids of the HNF4 polypeptide and the ligand is predicted to be modulated by the chemical modification; (e) performing the chemical modification on the ligand to form a modified ligand; (f) contacting the modified ligand with the HNF4 polypeptide; (g) identifying in a biological assay for HNF4 activity a degree to which the modified ligand modulates the biological activity of the HNF4 polypeptide; and (h) comparing the biological activity of the HNF4 polypeptide in the presence of modified ligand with the biological activity of the HNF4 polypeptide in the presence of the unmodified ligand, whereby a modulator of an HNF4 polypeptide is designed.
VII.A.2. Methods for Using the HNF4γ LBD Structural Coordinates for Molecular Design
For the first time, the present invention permits the use of molecular design techniques to design, select and synthesize chemical entities and compounds, including modulatory compounds, capable of binding to the ligand binding pocket or an accessory binding site of HNF4γ and the HNF4γ LBD, in whole or in part. Correspondingly, the present invention also provides for the application of similar techniques in the design of modulators of any HNF4 polypeptide.
In accordance with a preferred embodiment of the present invention, the structure coordinates of a crystalline HNF4γ LBD can be used to design compounds that bind to an HNF4 LBD (more preferably an HNF4γ LBD) and alter the properties of an HNF4 LBD (for example, the dimerization or ligand binding ability) in different ways. One aspect of the present invention provides for the design of compounds that act as competitive inhibitors of an HNF4 polypeptide by binding to all, or a portion of, the binding sites on an HNF4 LBD. The present invention also provides for the design of compounds that can act as uncompetitive inhibitors of an HNF4 LBD. These compounds can bind to all, or a portion of, an accessory binding site of an HNF4 that is already binding its ligand and can, therefore, be more potent and less non-specific than known competitive inhibitors that compete only for the HNF4 ligand binding pocket. Similarly, non-competitive inhibitors that bind to and inhibit HNF4 LBD activity, whether or not it is bound to another chemical entity, can be designed using the HNF4 LBD structure coordinates of this invention.
A second design approach is to probe an HNF4 or HNF4 LBD (preferably an HNF4γ or HNF4γ LBD) crystal with molecules comprising a variety of different chemical entities to determine optimal sites for interaction between candidate HNF4 or HNF4 LBD modulators and the polypeptide. For example, high resolution X-ray diffraction data collected from crystals saturated with solvent allows the determination of the site where each type of solvent molecule adheres. Small molecules that bind tightly to those sites can then be designed, synthesized and tested for their HNF4γ modulator activity.
Once a computationally-designed ligand is synthesized using the methods of the present invention or other methods known to those of skill in the art, assays can be used to establish the efficacy of the ligand as a modulator of HNF4 (preferably HNF4γ) activity. After such assays, the ligands can be further refined by generating intact HNF4, or HNF4 LBD, crystals with a ligand bound to the LBD. The structure of the ligand can then be further refined using the chemical modification methods described herein and known to those of skill in the art, in order to improve the modulation activity or the binding affinity of the ligand. This process can lead to second generation ligands with improved properties.
Ligands also can be selected that modulate HNF4 responsive gene transcription by the method of altering the interaction of co-activators and co-repressors with their cognate HNF4. For example, agonistic ligands can be selected that block or dissociate a co-repressor from interacting with the HNF4, and/or that promote binding or association of a co-activator. Antagonistic ligands can be selected that block co-activator interaction and/or promote co-repressor interaction with a target receptor. Selection can be done via binding assays that screen for designed ligands having the desired modulatory properties. Preferably, interactions of an HNF4γ polypeptide are targeted. Suitable assays for screening that can be employed, mutatis mutandis in the present invention, are described in published PCT international applications WO 00/037,077 and WO 00/025,134, which are incorporated herein in their entirety by reference.
VII.A.3. Methods of Designing HNF4 LBD Modulator Compounds
The design of candidate substances, also referred to as “compounds” or “candidate compounds”, that bind to or inhibit HNF4 LBD-mediated activity according to the present invention generally involves consideration of two factors. First, the compound must be capable of physically and structurally associating with an HNF4 LBD. Non-covalent molecular interactions important in the association of an HNF4 LBD with its substrate include hydrogen bonding, van der Waals interactions and hydrophobic interactions.
Second, the compound must be able to assume a conformation that allows it to associate with an HNF4 LBD. Although certain portions of the compound will not directly participate in this association with an HNF4 LBD, those portions can still influence the overall conformation of the molecule. This, in turn, can have a significant impact on potency. Such conformational requirements include the overall three-dimensional structure and orientation of the chemical entity or compound in relation to all or a portion of the binding site, e.g., the ligand binding pocket or an accessory binding site of an HNF4 LBD, or the spacing between functional groups of a compound comprising several chemical entities that directly interact with an HNF4 LBD.
The potential modulatory or binding effect of a chemical compound on an HNF4 LBD can be analyzed prior to its actual synthesis and testing by the use of computer modeling techniques that employ the coordinates of a crystalline HNF4γ LBD polypeptide of the present invention. If the theoretical structure of the given compound suggests insufficient interaction and association between it and an HNF4 LBD, synthesis and testing of the compound is obviated. However, if computer modeling indicates a strong interaction, the molecule can then be synthesized and tested for its ability to bind and modulate the activity of an HNF4 LBD. In this manner, synthesis of unproductive or inoperative compounds can be avoided.
A modulatory or other binding compound of an HNF4 LBD polypeptide (preferably an HNF4γ LBD) can be computationally evaluated and designed via a series of steps in which chemical entities or fragments are screened and selected for their ability to associate with the individual binding sites or other areas of a crystalline HNF4γ LBD polypeptide of the present invention.
One of several methods can be used to screen chemical entities or fragments for their ability to associate with an HNF4 LBD and, more particularly, with the individual binding sites of an HNF4 LBD, such as the ligand binding pocket or an accessory binding site. This process can begin by visual inspection of, for example, the ligand binding pocket on a computer screen based on the HNF4γ LBD atomic coordinates in Table 2. Selected fragments or chemical entities can then be positioned in a variety of orientations, or docked, within an individual binding site of an HNF4γ LBD as defined herein above. Docking can be accomplished using software programs such as those available under the tradenames QUANTA™ (Molecular Simulations Inc., San Diego, Calif.) and SYBYL™ (Tripos, Inc., St. Louis, Mo.), followed by energy minimization and molecular dynamics with standard molecular mechanics forcefields, such as CHARMM (Brooks et al., (1983) J. Comp. Chem., 8: 132) and AMBER 5 (Case et al., (1997), AMBER 5, University of California, San Francisco; Pearlman et al., (1995) Comput. Phys. Commun. 91: 1-41).
Specialized computer programs can also assist in the process of selecting fragments or chemical entities. These include:
Once suitable chemical entities or fragments have been selected, they can be assembled into a single compound or modulator. Assembly can proceed by visual inspection of the relationship of the fragments to each other on the three-dimensional image displayed on a computer screen in relation to the structure coordinates of an HNF4γ LBD. Manual model building using software such as QUANTA™ or SYBYL™ typically follows.
Useful programs to aid one of ordinary skill in the art in connecting the individual chemical entities or fragments include:
Instead of proceeding to build an HNF4 LBD modulator (preferably an HNF4γ LBD modulator) in a step-wise fashion one fragment or chemical entity at a time as described above, modulatory or other binding compounds can be designed as a whole or de novo using the structural coordinates of a crystalline HNF4γ LBD polypeptide of the present invention and either an empty binding site or optionally including some portion(s) of a known modulator(s). Applicable methods can employ the following software programs:
Other molecular modeling techniques can also be employed in accordance with this invention. See, e.g., Cohen et al., (1990) J. Med. Chem. 33: 883-94. See also, Navia & Murcko, (1992) Curr. Opin. Struc. Biol. 2: 202-10; U.S. Pat. No. 6,008,033, herein incorporated by reference.
Once a compound has been designed or selected by the above methods, the efficiency with which that compound can bind to an HNF4γ LBD can be tested and optimized by computational evaluation. By way of particular example, a compound that has been designed or selected to function as an HNF4γ LBD modulator should also preferably traverse a volume not overlapping that occupied by the binding site when it is bound to its native ligand. Additionally, an effective HNF4 LBD modulator should preferably demonstrate a relatively small difference in energy between its bound and free states (i.e., a small deformation energy of binding). Thus, the most efficient HNF4 LBD modulators should preferably be designed with a deformation energy of binding of not greater than about 10 kcal/mole, and preferably, not greater than 7 kcal/mole. It is possible for HNF4 LBD modulators to interact with the polypeptide in more than one conformation that is similar in overall binding energy. In those cases, the deformation energy of binding is taken to be the difference between the energy of the free compound and the average energy of the conformations observed when the modulator binds to the polypeptide.
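The deformation energy criterion described above can be illustrated with a short calculation. The following Python sketch is a minimal example under stated assumptions: the energies are taken to be conformational energies in kcal/mole supplied by any molecular mechanics package, the sign convention (average bound energy minus free energy) is chosen here for illustration, and the function names deformation_energy and rate_modulator are hypothetical.

```python
from statistics import mean

MAX_DEFORMATION = 10.0        # kcal/mole, upper limit stated in the text
PREFERRED_DEFORMATION = 7.0   # kcal/mole, preferred limit stated in the text


def deformation_energy(free_energy, bound_energies):
    """Return the average bound-state energy minus the free-state energy.

    bound_energies holds one value per binding conformation; when several
    conformations of similar binding energy exist, their average is used.
    """
    if not bound_energies:
        raise ValueError("at least one bound conformation is required")
    return mean(bound_energies) - free_energy


def rate_modulator(free_energy, bound_energies):
    """Classify a candidate against the two thresholds given in the text."""
    delta = deformation_energy(free_energy, bound_energies)
    if delta <= PREFERRED_DEFORMATION:
        return "preferred"
    if delta <= MAX_DEFORMATION:
        return "acceptable"
    return "rejected"


# Hypothetical candidate: free state at -42.0, three bound conformations.
print(rate_modulator(-42.0, [-36.0, -35.0, -37.0]))
```

In this example the three bound conformations average -36.0 kcal/mole, giving a deformation energy of 6.0 kcal/mole, which falls within the preferred limit.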
A compound designed or selected as binding to an HNF4 polypeptide (preferably an HNF4γ LBD polypeptide) can be further computationally optimized so that in its bound state it would preferably lack repulsive electrostatic interaction with the target polypeptide. Such non-complementary (e.g., electrostatic) interactions include repulsive charge-charge, dipole-dipole and charge-dipole interactions. Specifically, the sum of all electrostatic interactions between the modulator and the polypeptide when the modulator is bound to an HNF4 LBD preferably makes a neutral or favorable contribution to the enthalpy of binding.
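The electrostatic criterion above can be sketched as a pairwise Coulomb sum. The following Python example is a simplified illustration and not the procedure of any particular software package: it assumes point charges, a uniform dielectric constant (the default of 4.0 is an arbitrary illustrative choice), and no distance cutoff. The function names and the example atoms are hypothetical.

```python
import math

# Conversion factor giving kcal/mole for charges in units of e and
# distances in angstroms.
COULOMB_CONSTANT = 332.06


def electrostatic_energy(ligand_atoms, protein_atoms, dielectric=4.0):
    """Sum Coulomb energies (kcal/mole) over all ligand-protein atom pairs.

    Each atom is a (charge, (x, y, z)) tuple.
    """
    total = 0.0
    for q_lig, xyz_lig in ligand_atoms:
        for q_prot, xyz_prot in protein_atoms:
            r = math.dist(xyz_lig, xyz_prot)
            total += COULOMB_CONSTANT * q_lig * q_prot / (dielectric * r)
    return total


def is_neutral_or_favorable(energy, tolerance=1e-9):
    """True when the summed interaction is zero or attractive."""
    return energy <= tolerance


# Hypothetical pair: a negatively charged ligand oxygen 3 angstroms from a
# positively charged side-chain nitrogen.
ligand = [(-0.8, (0.0, 0.0, 0.0))]
protein = [(0.7, (3.0, 0.0, 0.0))]
energy = electrostatic_energy(ligand, protein)
print(f"{energy:.2f} kcal/mole, favorable: {is_neutral_or_favorable(energy)}")
```

A candidate whose summed energy is positive under such a model would be revised, for example by repositioning or replacing the group responsible for the repulsive term.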
Specific computer software is available in the art to evaluate compound deformation energy and electrostatic interaction. Examples of programs designed for such uses include:
These programs can be implemented using a suitable computer system. Other hardware systems and software packages will be apparent to those skilled in the art after review of the disclosure of the present invention presented herein.
Once an HNF4 LBD modulating compound has been optimally selected or designed, as described above, substitutions can then be made in some of its atoms or side groups in order to improve or modify its binding properties. Generally, initial substitutions are conservative, i.e., the replacement group will have approximately the same size, shape, hydrophobicity and charge as the original group. It should, of course, be understood that components known in the art to alter conformation should be avoided. Such substituted chemical compounds can then be analyzed for efficiency of fit to an HNF4 LBD binding site using the same computer-based approaches described in detail above.
VII.B. Distinguishing Between HNF4 Isoforms
The present invention discloses the ability to generate new synthetic ligands to distinguish between HNF4 isoforms. As described herein, computer-designed ligands can be generated that distinguish between binding isoforms, thereby allowing the generation of either tissue specific or function specific ligands. The atomic structural coordinates disclosed in the present invention reveal structural details unique to HNF4γ. These structural details can be exploited when a novel ligand is designed using the methods of the present invention or other ligand design methods known in the art. The structural features that differentiate an HNF4γ from an HNF4α can be targeted in ligand design. Thus, for example, a ligand can be designed that will recognize HNF4γ, while not interacting with other HNF4s or even with moieties having similar structural features. Prior to the disclosure of the present invention, the ability to target an HNF4 isoform was unattainable.
VII.C. Method of Screening for Chemical and Biological Modulators of the Biological Activity of HNF4γ
A candidate substance identified according to a screening assay of the present invention has an ability to modulate the biological activity of an HNF4 polypeptide or an HNF4 LBD polypeptide. In a preferred embodiment, such a candidate compound can have utility in the treatment of disorders and conditions associated with the biological activity of an HNF4γ or an HNF4γ LBD polypeptide, including diabetes, glucose homeostasis and lipid homeostasis.
In a cell-free system, the method comprises the steps of establishing a control system comprising an HNF4γ polypeptide and a ligand which is capable of binding to the polypeptide; establishing a test system comprising an HNF4γ polypeptide, the ligand, and a candidate compound; and determining whether the candidate compound modulates the activity of the polypeptide by comparison of the test and control systems. A representative ligand comprises a fatty acid or other small molecule, and in this embodiment, the biological activity or property screened includes binding affinity.
In another embodiment of the invention, a form of an HNF4γ polypeptide or a catalytic or immunogenic fragment or oligopeptide thereof, can be used for screening libraries of compounds in any of a variety of drug screening techniques. The fragment employed in such a screening can be affixed to a solid support. The formation of binding complexes, between an HNF4γ polypeptide and the agent being tested, will be detected. In a preferred embodiment, the HNF4γ polypeptide has an amino acid sequence of SEQ ID NO:2. When an HNF4γ LBD polypeptide is employed, a preferred embodiment will include an HNF4γ polypeptide having the amino acid sequence of SEQ ID NO:4.
Another technique for drug screening which can be used provides for high throughput screening of compounds having suitable binding affinity to the protein of interest as described in published PCT application WO 84/03564, herein incorporated by reference. In this method, as applied to a polypeptide of the present invention, large numbers of different small test compounds are synthesized on a solid substrate, such as plastic pins or some other surface. The test compounds are reacted with the polypeptide, or fragments thereof. Bound polypeptide is then detected by methods well known to those of skill in the art. The polypeptide can also be placed directly onto plates for use in the aforementioned drug screening techniques.
In yet another embodiment, a method of screening for a modulator of an HNF4γ polypeptide or an HNF4γ LBD polypeptide comprises: providing a library of test samples; contacting an HNF4γ polypeptide or an HNF4γ LBD polypeptide with each test sample; detecting an interaction between a test sample and an HNF4γ polypeptide or an HNF4γ LBD polypeptide; identifying a test sample that interacts with an HNF4γ polypeptide or an HNF4γ LBD polypeptide; and isolating a test sample that interacts with an HNF4γ polypeptide or an HNF4γ LBD polypeptide.
In each of the foregoing embodiments, an interaction can be detected spectrophotometrically, radiologically or immunologically. An interaction between an HNF4γ polypeptide or an HNF4γ LBD polypeptide and a test sample can also be quantified using methodology known to those of skill in the art. In another embodiment, the HNF4γ polypeptide or the HNF4γ LBD is in crystalline form.
In accordance with the present invention there is also provided a rapid and high throughput screening method that relies on the methods described above. This screening method comprises separately contacting each of a plurality of substantially identical samples with an HNF4γ polypeptide or an HNF4γ LBD and detecting a resulting binding complex. In such a screening method the plurality of samples preferably comprises more than about 10⁴ samples, or more preferably comprises more than about 5×10⁴ samples.
VII.D. Method of Identifying Compounds Which Inhibit Ligand Binding
Until disclosure of the present invention, the natural ligand of HNF4γ was unknown. Various hypotheses predicted the general properties an HNF4γ ligand might exhibit, but no ligand was conclusively identified. The present invention solves this problem by conclusively identifying a natural ligand of HNF4γ, the fatty acid palmitic acid. Using the identity of HNF4γ's natural ligand, disclosed for the first time herein, it is possible to design test compounds that inhibit binding of ligands normally bound by an HNF4 polypeptide.
In one aspect of the present invention, an assay method for identifying a compound that inhibits binding of a ligand to an HNF4 polypeptide is disclosed. A natural ligand of HNF4γ, such as a fatty acid, can be used in the assay method as the ligand against which the inhibition by a test compound is gauged. Palmitic acid is a preferred fatty acid in the assay method. The method comprises (a) incubating an HNF4 polypeptide with a ligand in the presence of a test inhibitor compound; (b) determining an amount of ligand that is bound to the HNF4 polypeptide, wherein decreased binding of ligand to the HNF4 polypeptide in the presence of the test inhibitor compound relative to binding in the absence of the test inhibitor compound is indicative of inhibition; and (c) identifying the test compound as an inhibitor of ligand binding if decreased ligand binding is observed. Preferably, the ligand is a fatty acid and even more preferably, the fatty acid is palmitic acid.
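The comparison in steps (b) and (c) can be expressed as a percent inhibition. The following Python sketch is illustrative only: it assumes that bound ligand is reported as a single background-corrected signal (for example, counts from a radiolabeled fatty acid), and the min_inhibition parameter is a hypothetical allowance for assay noise that is not specified in the present disclosure.

```python
def percent_inhibition(bound_without, bound_with):
    """Percent decrease in bound ligand caused by the test compound.

    bound_without: ligand bound in the absence of the test compound.
    bound_with:    ligand bound in the presence of the test compound.
    """
    if bound_without <= 0:
        raise ValueError("control binding must be positive")
    return 100.0 * (1.0 - bound_with / bound_without)


def is_inhibitor(bound_without, bound_with, min_inhibition=0.0):
    """Step (c): flag the compound when ligand binding is decreased."""
    return percent_inhibition(bound_without, bound_with) > min_inhibition


# Hypothetical readings: 1000 units bound in the control, 250 with compound.
print(percent_inhibition(1000.0, 250.0))   # 75.0
print(is_inhibitor(1000.0, 250.0))         # True
```

In practice a nonzero min_inhibition, chosen from the variability of replicate controls, would reduce false positives.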
In another aspect of the present invention, the disclosed assay method can be used in the structural refinement of candidate HNF4 inhibitors. For example, multiple rounds of optimization can be followed by gradual structural changes in a strategy of inhibitor design. A strategy such as this is made possible by the disclosure of the coordinates of the HNF4γ LBD and the disclosure of a natural ligand of HNF4, the fatty acid, palmitic acid.
VII.E. Design of HNF4 Isoform Modulators
The HNF4γ crystal structure of the present invention can be used to generate modulators of other HNF4 isoforms, such as HNF4α. Analysis of the disclosed crystal structure can provide a guide for designing HNF4α modulators. Absent the crystal structure of the present invention, researchers would be required to design HNF4α modulators de novo. The present invention, however, addresses this problem by providing insights into the binding pocket of HNF4γ which can be extended, due to significant structural similarity, to the binding pocket of HNF4α. An evaluation of the binding pocket of HNF4γ indicates that a potential HNF4α modulator would meet a broad set of general criteria. Broadly, it can be stated that, based on the crystal structure of HNF4γ, a potent HNF4α ligand would require several general features including: (a) a carboxylic acid or equivalent isosteric “head group” to interact with the amino acids R186 and G197 to form a strong polar hydrogen bonding interaction; (b) a lipophilic non-head group region of the molecule, which could possibly consist of aromatic rings, aliphatic carbon atoms, ether oxygen atoms, etc.; and (c) the ability to adopt a conformation that is complementary to the shape of the binding pocket.
Using the discerned structural similarities and differences between HNF4 isoforms, as represented and predicted based on the crystal structure of the present invention and homology models, an HNF4α modulator can be designed. For example, based on an evaluation of a homology model of HNF4α, which is derived from the HNF4γ crystal structure, it is expected that a potent ligand would need similar characteristics as listed above for a compound recognized by HNF4γ. Additional modifications can be included, based on the disclosed structure, which are predicted to further define a modulator specific for HNF4α over other isoforms. For example, if amino acid A215 (using HNF4γ numbering scheme) is mutated to a serine residue, a group capable of hydrogen bonding (which could be either donating or accepting) placed within 3 angstroms of the serine residue (distance of OG of the serine residue to the “heavy atom” of the hydrogen bonding group) would increase both the potency and selectivity of the compounds for HNF4α. Thus, the disclosed crystal structure of HNF4γ can be useful when designing modulators of HNF4α and other isoforms.
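The 3 angstrom criterion described above reduces to a distance check between two atoms of a model. The following Python sketch is a minimal illustration: the coordinates are hypothetical values in angstroms, not taken from Table 2, and the function name within_hbond_range is an assumption introduced here.

```python
import math

H_BOND_CUTOFF = 3.0  # angstroms, distance stated in the text


def within_hbond_range(ser_og, ligand_heavy_atom, cutoff=H_BOND_CUTOFF):
    """True when the ligand heavy atom lies within cutoff of the serine OG.

    Both arguments are (x, y, z) coordinates in angstroms.
    """
    return math.dist(ser_og, ligand_heavy_atom) <= cutoff


# Hypothetical coordinates for the OG atom of the modeled serine and for
# the oxygen of a candidate hydroxyl group on the ligand.
ser_og = (12.4, 8.1, 20.3)
ligand_oxygen = (13.9, 9.6, 21.8)
print(within_hbond_range(ser_og, ligand_oxygen))
```

The check is applied to the heavy atom of the hydrogen bonding group, consistent with the text; hydrogen positions, which are generally not resolved in X-ray structures at typical resolutions, are not needed.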
VIII. Design, Preparation and Structural Analysis of HNF4γ and HNF4γ LBD Mutants and Structural Equivalents
The present invention provides for the generation of HNF4 and HNF4 LBD mutants (preferably HNF4γ and HNF4γ LBD mutants), and the ability to solve the crystal structures of those that crystallize. More particularly, through the provision of the three-dimensional structure of an HNF4γ LBD, desirable sites for mutation can be identified.
The structure coordinates of an HNF4γ LBD provided in accordance with the present invention also facilitate the identification of related proteins or enzymes analogous to HNF4γ in function, structure or both, (for example, an HNF4α), which can lead to novel therapeutic modes for treating or preventing a range of disease states.
VIII.A. Sterically Similar Compounds
A further aspect of the present invention is that sterically similar compounds can be formulated to mimic the key portions of an HNF4 LBD structure. Such compounds are functional equivalents. The generation of a structural functional equivalent can be achieved by the techniques of modeling and chemical design known to those of skill in the art and described herein. Modeling and chemical design of HNF4 and HNF4 LBD structural equivalents can be based on the structure coordinates of a crystalline HNF4γ LBD polypeptide of the present invention. It will be understood that all such sterically similar constructs fall within the scope of the present invention.
VIII.B. HNF4 Polypeptides
The generation of chimeric HNF4 polypeptides is also an aspect of the present invention. Such a chimeric polypeptide can comprise an HNF4 LBD polypeptide or a portion of an HNF4 LBD, (e.g. an HNF4γ LBD) that is fused to a candidate polypeptide or a suitable region of the candidate polypeptide, for example HNF4α. Throughout the present disclosure it is intended that the term “mutant” encompass not only mutants of an HNF4 LBD polypeptide but chimeric proteins generated using an HNF4 LBD as well. It is thus intended that the following discussion of mutant HNF4 LBDs apply mutatis mutandis to chimeric HNF4 and HNF4 LBD polypeptides and to structural equivalents thereof.
In accordance with the present invention, a mutation can be directed to a particular site or combination of sites of a wild-type HNF4 LBD. For example, an accessory binding site or the binding pocket can be chosen for mutagenesis. Similarly, a residue having a location on, at or near the surface of the polypeptide can be replaced, resulting in an altered surface charge of one or more charge units, as compared to the wild-type HNF4 and HNF4 LBD. Alternatively, an amino acid residue in an HNF4 or an HNF4 LBD can be chosen for replacement based on its hydrophilic or hydrophobic characteristics.
Such mutants can be characterized by any one of several different properties as compared with the wild-type HNF4 LBD. For example, such mutants can have an altered surface charge of one or more charge units, or can have an increase in overall stability. Other mutants can have altered substrate specificity in comparison with, or a higher specific activity than, a wild-type HNF4 or HNF4 LBD.
HNF4 and HNF4 LBD mutants of the present invention can be generated in a number of ways. For example, the wild-type sequence of an HNF4 or an HNF4 LBD can be mutated at those sites identified using this invention as desirable for mutation, by means of oligonucleotide-directed mutagenesis or other conventional methods, such as deletion. Alternatively, mutants of an HNF4 or an HNF4 LBD can be generated by the site-specific replacement of a particular amino acid with an unnaturally occurring amino acid. In addition, HNF4 or HNF4 LBD mutants can be generated through replacement of an amino acid residue, for example, a particular cysteine or methionine residue, with selenocysteine or selenomethionine. This can be achieved by growing a host organism capable of expressing either the wild-type or mutant polypeptide on a growth medium depleted of either natural cysteine or methionine (or both) but enriched in selenocysteine or selenomethionine (or both).
Mutations can be introduced into a DNA sequence coding for an HNF4 or an HNF4 LBD using synthetic oligonucleotides. These oligonucleotides contain nucleotide sequences flanking the desired mutation sites. Mutations can be generated in the full-length DNA sequence of an HNF4 or an HNF4 LBD or in any sequence coding for polypeptide fragments of an HNF4 or an HNF4 LBD.
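By way of non-limiting illustration, the construction of an oligonucleotide in which the desired mutation is flanked by wild-type sequence can be sketched as follows. The function name, the flank length, and the template sequence are hypothetical and chosen for illustration only; the template shown is not an HNF4 sequence.

```python
def design_mutagenic_oligo(template: str, codon_start: int, new_codon: str, flank: int = 15) -> str:
    """Return an oligonucleotide carrying `new_codon` in place of the codon that
    begins at `codon_start` (0-based), flanked on each side by up to `flank`
    nucleotides copied from the template."""
    template = template.upper()
    new_codon = new_codon.upper()
    if len(new_codon) != 3:
        raise ValueError("new_codon must be exactly three nucleotides")
    if codon_start < 0 or codon_start + 3 > len(template):
        raise ValueError("codon_start lies outside the template")
    left = template[max(0, codon_start - flank):codon_start]
    right = template[codon_start + 3:codon_start + 3 + flank]
    return left + new_codon + right


# Hypothetical template: replace the cysteine codon TGC (position 9) with TCC (serine).
example_oligo = design_mutagenic_oligo("ATGGCTAGCTGCGGTACCGAATTC", 9, "TCC", flank=6)
```

In practice the flank length would be selected so that the oligonucleotide anneals stably to the template despite the mismatch; the value used above is an assumption only.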
According to the present invention, a mutated HNF4 or HNF4 LBD DNA sequence produced by the methods described above, or any alternative methods known in the art, can be expressed using an expression vector. An expression vector, as is well known to those of skill in the art, typically includes elements that permit autonomous replication in a host cell independent of the host genome, and one or more phenotypic markers for selection purposes. Either prior to or after insertion of the DNA sequences surrounding the desired HNF4 or HNF4 LBD mutant coding sequence, an expression vector also will include control sequences encoding a promoter, operator, ribosome binding site, translation initiation signal, and, optionally, a repressor gene or various activator genes and a signal for termination. In some embodiments, where secretion of the produced mutant is desired, nucleotides encoding a “signal sequence” can be inserted prior to an HNF4 or an HNF4 LBD mutant coding sequence. For expression under the direction of the control sequences, a desired DNA sequence must be operatively linked to the control sequences; that is, the sequence must have an appropriate start signal in front of the DNA sequence encoding the HNF4 or HNF4 LBD mutant, and the correct reading frame to permit expression of that sequence under the control of the control sequences and production of the desired product encoded by that HNF4 or HNF4 LBD sequence must be maintained.
Any of a wide variety of well-known available expression vectors can be useful to express the mutated HNF4 or HNF4 LBD coding sequences of this invention. These include, for example, vectors consisting of segments of chromosomal, non-chromosomal and synthetic DNA sequences, such as various known derivatives of SV40; known bacterial plasmids, e.g., plasmids from E. coli including col E1, pCR1, pBR322, pMB9 and their derivatives; wider host range plasmids, e.g., RP4; phage DNAs, e.g., the numerous derivatives of phage λ, e.g., NM 989, and other DNA phages, e.g., M13 and filamentous single stranded DNA phages; yeast plasmids; and vectors derived from combinations of plasmids and phage DNAs, such as plasmids which have been modified to employ phage DNA or other expression control sequences. In a preferred embodiment of this invention, the E. coli vector pRSET A, including a T7-based expression system, is employed.
In addition, any of a wide variety of expression control sequences (sequences that control the expression of a DNA sequence when operatively linked to it) can be used in these vectors to express the mutated DNA sequences according to this invention. Such useful expression control sequences include, for example, the early and late promoters of SV40 for animal cells; the lac system, the trp system, the TAC or TRC system, the major operator and promoter regions of phage λ, and the control regions of fd coat protein, all for E. coli; the promoter for 3-phosphoglycerate kinase or other glycolytic enzymes, the promoters of acid phosphatase, e.g., Pho5, and the promoters of the yeast α-mating factors, all for yeast; and other sequences known to control the expression of genes of prokaryotic or eukaryotic cells or their viruses, and various combinations thereof.
A wide variety of hosts are also useful for producing mutated HNF4γ and HNF4γ LBD polypeptides according to this invention. These hosts include, for example, bacteria, such as E. coli, Bacillus and Streptomyces; fungi, such as yeasts; animal cells, such as CHO and COS-1 cells; plant cells; insect cells, such as Sf9 cells; and transgenic host cells.
It should be understood that not all expression vectors and expression systems function in the same way to express mutated DNA sequences of this invention, and to produce modified HNF4 and HNF4 LBD polypeptides or HNF4 or HNF4 LBD mutants. Neither do all hosts function equally well with the same expression system. One of skill in the art can, however, make a selection among these vectors, expression control sequences and hosts without undue experimentation and without departing from the scope of this invention. For example, an important consideration in selecting a vector will be the ability of the vector to replicate in a given host. The copy number of the vector, the ability to control that copy number, and the expression of any other proteins encoded by the vector, such as antibiotic markers, should also be considered.
In selecting an expression control sequence, a variety of factors should also be considered. These include, for example, the relative strength of the system, its controllability and its compatibility with the DNA sequence encoding a modified HNF4 or HNF4 LBD polypeptide of this invention, with particular regard to the formation of potential secondary and tertiary structures.
Hosts should be selected by consideration of their compatibility with the chosen vector, the toxicity of a modified HNF4 or HNF4 LBD to them, their ability to express mature products, their ability to fold proteins correctly, their fermentation requirements, the ease of purification of a modified HNF4 or HNF4 LBD and safety. Within these parameters, one of skill in the art can select various vector/expression control system/host combinations that will produce useful amounts of a mutant HNF4 or HNF4 LBD. A mutant HNF4 or HNF4 LBD produced in these systems can be purified by a variety of conventional steps and strategies, including those used to purify the wild-type HNF4 or HNF4 LBD.
Once an HNF4 LBD mutation(s) has been generated in the desired location, such as an active site or dimerization site, the mutants can be tested for any one of several properties of interest. For example, mutants can be screened for an altered charge at physiological pH. This is determined by measuring the mutant HNF4 or HNF4 LBD isoelectric point (pI) and comparing the observed value with that of the wild-type parent. Isoelectric point can be measured by gel-electrophoresis according to the method of Wellner (Wellner, (1971) Anal. Chem. 43: 597). A mutant HNF4 or HNF4 LBD polypeptide containing a replacement amino acid located at the surface of the polypeptide, as provided by the structural information of this invention, can lead to an altered surface charge and an altered pI.
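As a purely computational complement to the electrophoretic measurement, the expected direction of a pI shift can be estimated from sequence before a mutant is made. The following is a minimal sketch, assuming approximate textbook pKa values for the ionizable groups and ignoring any influence of the folded structure on those values; the pKa constants, function names, and the short peptides used are illustrative assumptions and are not HNF4 sequences.

```python
# Approximate pKa values (assumed for illustration; real values depend on environment).
POSITIVE_PKA = {"K": 10.5, "R": 12.5, "H": 6.0}
NEGATIVE_PKA = {"D": 3.9, "E": 4.1, "C": 8.3, "Y": 10.1}
N_TERM_PKA = 9.0
C_TERM_PKA = 2.0


def net_charge(sequence: str, ph: float) -> float:
    """Henderson-Hasselbalch estimate of the net charge of a polypeptide at a given pH."""
    charge = 1.0 / (1.0 + 10 ** (ph - N_TERM_PKA))    # protonated amino terminus
    charge -= 1.0 / (1.0 + 10 ** (C_TERM_PKA - ph))   # deprotonated carboxyl terminus
    for residue in sequence.upper():
        if residue in POSITIVE_PKA:
            charge += 1.0 / (1.0 + 10 ** (ph - POSITIVE_PKA[residue]))
        elif residue in NEGATIVE_PKA:
            charge -= 1.0 / (1.0 + 10 ** (NEGATIVE_PKA[residue] - ph))
    return charge


def estimate_pi(sequence: str, tolerance: float = 0.001) -> float:
    """Bisect on pH for the point at which the estimated net charge crosses zero."""
    low, high = 0.0, 14.0
    while high - low > tolerance:
        mid = (low + high) / 2.0
        if net_charge(sequence, mid) > 0:
            low = mid      # still positively charged, so the pI lies at higher pH
        else:
            high = mid
    return (low + high) / 2.0


# Hypothetical peptides: a Glu-to-Lys replacement is expected to raise the estimated pI.
wild_type_pi = estimate_pi("MAEELLKDG")
mutant_pi = estimate_pi("MAKELLKDG")
```

Such an estimate indicates only the expected direction of the change; the observed pI is still determined experimentally as described above.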
VIII.C. Generation of an Engineered HNF4 or HNF4 LBD Mutant
In another aspect of the present invention, a unique HNF4 or HNF4 LBD polypeptide can be generated. Such a mutant can facilitate purification and the study of the ligand-binding abilities of an HNF4 polypeptide.
As used in the following discussion, the terms “engineered HNF4”, “engineered HNF4 LBD”, “HNF4 mutant”, and “HNF4 LBD mutant” refer to polypeptides having amino acid sequences which contain at least one mutation in the wild-type sequence. The terms also refer to HNF4 and HNF4 LBD polypeptides which are capable of exerting a biological effect in that they comprise all or a part of the amino acid sequence of an engineered HNF4 or HNF4 LBD mutant polypeptide of the present invention, or cross-react with antibodies raised against an engineered HNF4 or HNF4 LBD mutant polypeptide, or retain all or some or an enhanced degree of the biological activity of the engineered HNF4 or HNF4 LBD mutant amino acid sequence or protein. Such biological activity can include lipid binding in general, and fatty acid binding in particular.
The terms “engineered HNF4 LBD” and “HNF4 LBD mutant” also include analogs of an engineered HNF4 LBD or HNF4 LBD mutant polypeptide. By “analog” is intended that a DNA or polypeptide sequence can contain alterations relative to the sequences disclosed herein, yet retain all or some or an enhanced degree of the biological activity of those sequences. Analogs can be derived from genomic nucleotide sequences or from other organisms, or can be created synthetically. Those of skill in the art will appreciate that other analogs, as yet undisclosed or undiscovered, can be used to design and/or construct HNF4 LBD or HNF4 LBD mutant analogs. There is no need for an engineered HNF4 LBD or HNF4 LBD mutant polypeptide to comprise all or substantially all of the amino acid sequence of SEQ ID NOs:2 or 4. Shorter or longer sequences are anticipated to be of use in the invention; shorter sequences are herein referred to as “segments”. Thus, the terms “engineered HNF4 LBD” and “HNF4 LBD mutant” also include fusion, chimeric or recombinant engineered HNF4 LBD or HNF4 LBD mutant polypeptides and proteins comprising sequences of the present invention. Methods of preparing such proteins are disclosed herein above and are known in the art.
VIII.D. Sequence Similarity and Identity
As used herein, the term “substantially similar” means that a particular sequence varies from the nucleic acid sequence of SEQ ID NOs:1 or 3, or the amino acid sequence of SEQ ID NOs:2 or 4, by one or more deletions, substitutions, or additions, the net effect of which is to retain at least some of the biological activity of the natural gene, gene product, or sequence. Such sequences include “mutant” or “polymorphic” sequences, or sequences in which the biological activity and/or the physical properties are altered to some degree but which retain at least some or an enhanced degree of the original biological activity and/or physical properties. In determining nucleic acid sequences, all subject nucleic acid sequences capable of encoding substantially similar amino acid sequences are considered to be substantially similar to a reference nucleic acid sequence, regardless of differences in codon sequences or substitution of equivalent amino acids to create biologically functional equivalents.
VIII.D.1. Sequences that are Substantially Identical to an Engineered HNF4 or HNF4 LBD Mutant Sequence of the Present Invention
Nucleic acids that are substantially identical to a nucleic acid sequence of an engineered HNF4 or HNF4 LBD mutant of the present invention, e.g. allelic variants, genetically altered versions of the gene, etc., bind to an engineered HNF4 or HNF4 LBD mutant sequence under stringent hybridization conditions. By using probes, particularly labeled probes of DNA sequences, one can isolate homologous or related genes. The source of homologous genes can be any species, e.g. primate species; rodents, such as rats and mice, canines, felines, bovines, equines, yeast, nematodes, etc.
Between mammalian species, e.g. human and mouse, homologs have substantial sequence similarity, i.e. at least 75% sequence identity between nucleotide sequences. Sequence similarity is calculated based on a reference sequence, which can be a subset of a larger sequence, such as a conserved motif, coding region, flanking region, etc. A reference sequence will usually be at least about 18 nt long, more usually at least about 30 nt long, and can extend to the complete sequence that is being compared. Algorithms for sequence analysis are known in the art, such as BLAST, described in Altschul et al., (1990) J. Mol. Biol. 215: 403-10.
Percent identity or percent similarity of a DNA or peptide sequence can be determined, for example, by comparing sequence information using the GAP computer program, available from the University of Wisconsin Genetics Computer Group. The GAP program utilizes the alignment method of Needleman et al., (1970) J. Mol. Biol. 48: 443, as revised by Smith et al., (1981) Adv. Appl. Math. 2:482. Briefly, the GAP program defines similarity as the number of aligned symbols (i.e., nucleotides or amino acids) which are similar, divided by the total number of symbols in the shorter of the two sequences. The preferred parameters for the GAP program are the default parameters, which do not impose a penalty for end gaps. See, e.g., Schwartz et al., eds., (1979), Atlas of Protein Sequence and Structure, National Biomedical Research Foundation, pp. 357-358, and Gribskov et al., (1986) Nucl. Acids. Res. 14: 6745.
The term “similarity” is contrasted with the term “identity”. Similarity is defined as above; “identity”, however, means a nucleic acid or amino acid sequence having the same amino acid at the same relative position in a given family member of a gene family. Homology and similarity are generally viewed as broader terms than the term identity. Biochemically similar amino acids, for example leucine/isoleucine or glutamate/aspartate, can be present at the same position; these are not identical per se, but are biochemically “similar.” As disclosed herein, these are referred to as conservative differences or conservative substitutions. This differs from a conservative mutation at the DNA level, which changes the nucleotide sequence without making a change in the encoded amino acid, e.g. TCC to TCA, both of which encode serine.
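The similarity ratio described above (aligned similar symbols divided by the number of symbols in the shorter sequence) can be illustrated with a short sketch. This is not the GAP program and performs no alignment; it assumes the two sequences have already been aligned, with "-" marking gaps, and by default treats only identical symbols as similar. The function name and the example sequences are hypothetical.

```python
def fraction_similar(aligned_a: str, aligned_b: str, similar=lambda x, y: x == y) -> float:
    """Number of aligned positions judged similar, divided by the ungapped
    length of the shorter sequence. Inputs must already be aligned ('-' = gap)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = sum(
        1
        for x, y in zip(aligned_a, aligned_b)
        if x != "-" and y != "-" and similar(x, y)
    )
    shorter = min(len(aligned_a.replace("-", "")), len(aligned_b.replace("-", "")))
    if shorter == 0:
        raise ValueError("sequences must contain at least one symbol")
    return matches / shorter


# Hypothetical pre-aligned nucleotide sequences: 7 matching positions, 8 symbols each.
example_fraction = fraction_similar("ACGT-ACGT", "ACGTTAC-T")
```

A different `similar` predicate can be supplied so that biochemically similar residues, rather than only identical ones, are counted.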
As used herein, DNA analog sequences are “substantially identical” to specific DNA sequences disclosed herein if: (a) the DNA analog sequence is derived from coding regions of the nucleic acid sequence shown in SEQ ID NOs:1 or 3; or (b) the DNA analog sequence is capable of hybridization with DNA sequences of (a) under stringent conditions and which encode a biologically active HNF4γ or HNF4γ LBD gene product; or (c) the DNA sequences are degenerate as a result of alternative genetic code to the DNA analog sequences defined in (a) and/or (b). Substantially identical analog proteins and nucleic acids will have between about 70% and 80%, preferably between about 81% to about 90% or even more preferably between about 91% and 99% sequence identity with the corresponding sequence of the native protein or nucleic acid. Sequences having lesser degrees of identity but comparable biological activity are considered to be equivalents.
As used herein, “stringent conditions” means conditions of high stringency, for example 6×SSC, 0.2% polyvinylpyrrolidone, 0.2% Ficoll, 0.2% bovine serum albumin, 0.1% sodium dodecyl sulfate, 100 μg/ml salmon sperm DNA and 15% formamide at 68° C. For the purposes of specifying additional conditions of high stringency, preferred conditions are salt concentration of about 200 mM and temperature of about 45° C. One example of such stringent conditions is hybridization at 4×SSC, at 65° C., followed by a washing in 0.1×SSC at 65° C. for one hour. Another exemplary stringent hybridization scheme uses 50% formamide, 4×SSC at 42° C.
In contrast, nucleic acids having sequence similarity are detected by hybridization under lower stringency conditions. Thus, sequence identity can be determined by hybridization under lower stringency conditions, for example, at 50° C. or higher and 0.1×SSC (9 mM NaCl/0.9 mM sodium citrate) and the sequences will remain bound when subjected to washing at 55° C. in 1×SSC.
VIII.D.2. Complementarity and Hybridization to an Engineered HNF4 or HNF4 LBD Mutant Sequence
As used herein, the term “complementary sequences” means nucleic acid sequences which are base-paired according to the standard Watson-Crick complementarity rules. The present invention also encompasses the use of nucleotide segments that are complementary to the sequences of the present invention.
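For illustration only, Watson-Crick complementarity between two DNA segments can be checked as sketched below. The function names are hypothetical, and the sketch handles only the four unmodified DNA bases.

```python
# Translation table mapping each DNA base to its Watson-Crick partner.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(sequence: str) -> str:
    """Return the complementary strand, written 5' to 3'."""
    return sequence.translate(_COMPLEMENT)[::-1]


def is_complementary(strand_a: str, strand_b: str) -> bool:
    """True if strand_b base-pairs with strand_a along its entire length."""
    return strand_a.upper() == reverse_complement(strand_b).upper()


example_rc = reverse_complement("ATGC")
```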
Hybridization can also be used for assessing complementary sequences and/or isolating complementary nucleotide sequences. As discussed above, nucleic acid hybridization will be affected by such conditions as salt concentration, temperature, or organic solvents, in addition to the base composition, length of the complementary strands, and the number of nucleotide base mismatches between the hybridizing nucleic acids, as will be readily appreciated by those skilled in the art. Stringent temperature conditions will generally include temperatures in excess of about 30° C., typically in excess of about 37° C., and preferably in excess of about 45° C. Stringent salt conditions will ordinarily be less than about 1,000 mM, typically less than about 500 mM, and preferably less than about 200 mM. However, the combination of parameters is much more important than the measure of any single parameter. See, e.g., Wetmur & Davidson, (1968) J. Mol. Biol. 31: 349-70. Determining appropriate hybridization conditions to identify and/or isolate sequences containing high levels of homology is well known in the art. See, e.g., Sambrook et al., (1989) Molecular Cloning: A Laboratory Manual, Cold Spring Harbor, N.Y.
VIII.D.3. Functional Equivalents of an Engineered HNF4 or HNF4 LBD Mutant Nucleic Acid Sequence of the Present Invention
As used herein, the term “functionally equivalent codon” is used to refer to codons that encode the same amino acid, such as the AGC and AGU codons for serine. HNF4γ or HNF4γ LBD-encoding nucleic acid sequences comprising SEQ ID NOs:1 and 3 which have functionally equivalent codons are covered by the present invention. Thus, when referring to the sequence example presented in SEQ ID NOs:1 and 3, applicants contemplate substitution of functionally equivalent codons into the sequence example of SEQ ID NOs:1 and 3. Thus, applicants are in possession of amino acid and nucleic acids sequences which include such substitutions but which are not set forth herein in their entirety for convenience.
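Whether two codons are functionally equivalent can be checked against the standard genetic code, as in the following sketch. The function names are hypothetical; the table is built in the conventional T, C, A, G base order, and RNA codons are accepted by treating U as T.

```python
from itertools import product

_BASES = "TCAG"
# Standard genetic code, one letter per codon in T, C, A, G order ('*' = stop).
_AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(_BASES, repeat=3), _AMINO_ACIDS)
}


def translate_codon(codon: str) -> str:
    """Return the one-letter amino acid (or '*') encoded by a DNA or RNA codon."""
    return CODON_TABLE[codon.upper().replace("U", "T")]


def are_functionally_equivalent(codon_a: str, codon_b: str) -> bool:
    """True if both codons encode the same amino acid."""
    return translate_codon(codon_a) == translate_codon(codon_b)


def equivalent_codons(amino_acid: str) -> list:
    """All DNA codons that encode the given one-letter amino acid."""
    return sorted(codon for codon, aa in CODON_TABLE.items() if aa == amino_acid)


serine_codons = equivalent_codons("S")
```

For example, `are_functionally_equivalent("AGC", "AGU")` and `are_functionally_equivalent("TCC", "TCA")` both evaluate to true, since all four codons encode serine.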
It will also be understood by those of skill in the art that amino acid and nucleic acid sequences can include additional residues, such as additional N- or C-terminal amino acids or 5′ or 3′ nucleic acid sequences, and yet still be essentially as set forth in one of the sequences disclosed herein, so long as the sequence retains biological protein activity where polypeptide expression is concerned. The addition of terminal sequences particularly applies to nucleic acid sequences which can, for example, include various non-coding sequences flanking either of the 5′ or 3′ portions of the coding region or can include various internal sequences, i.e., introns, which are known to occur within genes.
VIII.D.4. Biological Equivalents
The present invention envisions and includes biological equivalents of an engineered HNF4 or HNF4 LBD mutant polypeptide of the present invention. The term “biological equivalent” refers to proteins having amino acid sequences which are substantially identical to the amino acid sequence of an engineered HNF4 LBD mutant of the present invention and which are capable of exerting a biological effect in that they are capable of binding lipid moieties or cross-reacting with anti-HNF4 or HNF4 LBD mutant antibodies raised against an engineered mutant HNF4 or HNF4 LBD polypeptide of the present invention.
For example, certain amino acids can be substituted for other amino acids in a protein structure without appreciable loss of interactive capacity with, for example, structures in the nucleus of a cell. Since it is the interactive capacity and nature of a protein that defines that protein's biological functional activity, certain amino acid sequence substitutions can be made in a protein sequence (or the nucleic acid sequence encoding it) to obtain a protein with the same, enhanced, or antagonistic properties. Such properties can be achieved by interaction with the normal targets of the protein, but this need not be the case, and the biological activity of the invention is not limited to a particular mechanism of action. It is thus in accordance with the present invention that various changes can be made in the amino acid sequence of an engineered HNF4 or HNF4 LBD mutant polypeptide of the present invention or its underlying nucleic acid sequence without appreciable loss of biological utility or activity.
Biologically equivalent polypeptides, as used herein, are polypeptides in which certain, but not most or all, of the amino acids can be substituted. Thus, when referring to the sequence examples presented in SEQ ID NOs:1 and 3, applicants envision substitution of codons that encode biologically equivalent amino acids, as described herein, into the sequence example of SEQ ID NOs:2 and 4, respectively. Thus, applicants are in possession of amino acid and nucleic acids sequences which include such substitutions but which are not set forth herein in their entirety for convenience.
Alternatively, functionally equivalent proteins or peptides can be created via the application of recombinant DNA technology, in which changes in the protein structure can be engineered, based on considerations of the properties of the amino acids being exchanged, e.g. substitution of Ile for Leu. Changes designed by man can be introduced through the application of site-directed mutagenesis techniques, e.g., to introduce improvements to the antigenicity of the protein or to test an engineered HNF4 or HNF4 LBD mutant polypeptide of the present invention in order to modulate lipid-binding or other activity, at the molecular level.
Amino acid substitutions, such as those which might be employed in modifying an engineered HNF4 or HNF4 LBD mutant polypeptide of the present invention are generally, but not necessarily, based on the relative similarity of the amino acid side-chain substituents, for example, their hydrophobicity, hydrophilicity, charge, size, and the like. An analysis of the size, shape and type of the amino acid side-chain substituents reveals that arginine, lysine and histidine are all positively charged residues; that alanine, glycine and serine are all of similar size; and that phenylalanine, tryptophan and tyrosine all have a generally similar shape. Therefore, based upon these considerations, arginine, lysine and histidine; alanine, glycine and serine; and phenylalanine, tryptophan and tyrosine; are defined herein as biologically functional equivalents. Other biologically functionally equivalent changes will be appreciated by those of skill in the art. It is implicit in the above discussion, however, that one of skill in the art can appreciate that a radical, rather than a conservative substitution is warranted in a given situation. Non-conservative substitutions in engineered mutant HNF4 or HNF4 LBD polypeptides of the present invention are also an aspect of the present invention.
In making biologically functional equivalent amino acid substitutions, the hydropathic index of amino acids can be considered. Each amino acid has been assigned a hydropathic index on the basis of its hydrophobicity and charge characteristics. These are: isoleucine (+4.5); valine (+4.2); leucine (+3.8); phenylalanine (+2.8); cysteine (+2.5); methionine (+1.9); alanine (+1.8); glycine (−0.4); threonine (−0.7); serine (−0.8); tryptophan (−0.9); tyrosine (−1.3); proline (−1.6); histidine (−3.2); glutamate (−3.5); glutamine (−3.5); aspartate (−3.5); asparagine (−3.5); lysine (−3.9); and arginine (−4.5).
The importance of the hydropathic amino acid index in conferring interactive biological function on a protein is generally understood in the art (Kyte & Doolittle, (1982), J. Mol. Biol. 157: 105-132, incorporated herein by reference). It is known that certain amino acids can be substituted for other amino acids having a similar hydropathic index or score and still retain a similar biological activity. In making changes based upon the hydropathic index, the substitution of amino acids whose hydropathic indices are within ±2 of the original value is preferred, those which are within ±1 of the original value are particularly preferred, and those within ±0.5 of the original value are even more particularly preferred.
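The selection of candidate replacements by hydropathic index can be sketched as follows, using the index values listed above. The function name and the small numerical allowance for floating-point rounding are illustrative assumptions.

```python
# Hydropathic index values as listed above (Kyte & Doolittle), keyed by one-letter code.
HYDROPATHY = {
    "I": 4.5, "V": 4.2, "L": 3.8, "F": 2.8, "C": 2.5, "M": 1.9, "A": 1.8,
    "G": -0.4, "T": -0.7, "S": -0.8, "W": -0.9, "Y": -1.3, "P": -1.6,
    "H": -3.2, "E": -3.5, "Q": -3.5, "D": -3.5, "N": -3.5, "K": -3.9, "R": -4.5,
}


def substitutions_within(residue: str, tolerance: float = 2.0) -> list:
    """Amino acids whose hydropathic index differs from that of `residue`
    by no more than `tolerance` (the residue itself is excluded)."""
    reference = HYDROPATHY[residue]
    return sorted(
        amino_acid
        for amino_acid, value in HYDROPATHY.items()
        # 1e-9 guards against floating-point rounding at the boundary
        if amino_acid != residue and abs(value - reference) <= tolerance + 1e-9
    )


# Candidate replacements for leucine at the three tolerance levels described above.
leucine_preferred = substitutions_within("L", 2.0)
leucine_particularly = substitutions_within("L", 1.0)
leucine_most = substitutions_within("L", 0.5)
```

The index is only one consideration; side-chain size, shape and charge, as discussed above, can still argue against a substitution that falls within the tolerance.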
It is also understood in the art that the substitution of like amino acids can be made effectively on the basis of hydrophilicity. U.S. Pat. No. 4,554,101, incorporated herein by reference, states that the greatest local average hydrophilicity of a protein, as governed by the hydrophilicity of its adjacent amino acids, correlates with its immunogenicity and antigenicity, i.e. with a biological property of the protein. It is understood that an amino acid can be substituted for another having a similar hydrophilicity value and still obtain a biologically equivalent protein.
As detailed in U.S. Pat. No. 4,554,101, the following hydrophilicity values have been assigned to amino acid residues: arginine (+3.0); lysine (+3.0); aspartate (+3.0±1); glutamate (+3.0±1); serine (+0.3); asparagine (+0.2); glutamine (+0.2); glycine (0); threonine (−0.4); proline (−0.5±1); alanine (−0.5); histidine (−0.5); cysteine (−1.0); methionine (−1.3); valine (−1.5); leucine (−1.8); isoleucine (−1.8); tyrosine (−2.3); phenylalanine (−2.5); tryptophan (−3.4).
In making changes based upon similar hydrophilicity values, the substitution of amino acids whose hydrophilicity values are within ±2 of the original value is preferred, those which are within ±1 of the original value are particularly preferred, and those within ±0.5 of the original value are even more particularly preferred.
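A proposed substitution can be assigned to one of these preference levels as sketched below. The sketch uses the nominal hydrophilicity values listed above and, as a simplifying assumption, ignores the ±1 ranges given for aspartate, glutamate and proline; the function name is hypothetical.

```python
# Nominal hydrophilicity values as listed above, keyed by one-letter code.
HYDROPHILICITY = {
    "R": 3.0, "K": 3.0, "D": 3.0, "E": 3.0, "S": 0.3, "N": 0.2, "Q": 0.2,
    "G": 0.0, "T": -0.4, "P": -0.5, "A": -0.5, "H": -0.5, "C": -1.0,
    "M": -1.3, "V": -1.5, "L": -1.8, "I": -1.8, "Y": -2.3, "F": -2.5, "W": -3.4,
}


def preference_level(original: str, replacement: str):
    """Classify a substitution by the difference in hydrophilicity value.
    Returns None when the difference exceeds the widest tolerance."""
    difference = abs(HYDROPHILICITY[original] - HYDROPHILICITY[replacement])
    epsilon = 1e-9  # allowance for floating-point rounding at the boundaries
    if difference <= 0.5 + epsilon:
        return "even more particularly preferred"
    if difference <= 1.0 + epsilon:
        return "particularly preferred"
    if difference <= 2.0 + epsilon:
        return "preferred"
    return None


leu_to_ile = preference_level("L", "I")
```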
While discussion has focused on functionally equivalent polypeptides arising from amino acid changes, it will be appreciated that these changes can be effected by alteration of the encoding DNA, taking into consideration also that the genetic code is degenerate and that two or more codons can code for the same amino acid.
Thus, it will also be understood that this invention is not limited to the particular amino acid and nucleic acid sequences of SEQ ID NOs:1-4. Recombinant vectors and isolated DNA segments can therefore variously include an engineered HNF4γ or HNF4γ LBD mutant polypeptide-encoding region itself, include coding regions bearing selected alterations or modifications in the basic coding region, or include larger polypeptides which nevertheless comprise an HNF4γ or HNF4γ LBD mutant polypeptide-encoding region, or can encode biologically functional equivalent proteins or polypeptides which have variant amino acid sequences. Biological activity of an engineered HNF4γ or HNF4γ LBD mutant polypeptide can be determined, for example, by lipid-binding assays known to those of skill in the art.
The nucleic acid segments of the present invention, regardless of the length of the coding sequence itself, can be combined with other DNA sequences, such as promoters, enhancers, polyadenylation signals, additional restriction enzyme sites, multiple cloning sites, other coding segments, and the like, such that their overall length can vary considerably. It is therefore contemplated that a nucleic acid fragment of almost any length can be employed, with the total length preferably being limited by the ease of preparation and use in the intended recombinant DNA protocol. For example, nucleic acid fragments can be prepared which include a short stretch complementary to a nucleic acid sequence set forth in SEQ ID NOs:1 and 3, such as about 10 nucleotides, and which are up to 10,000 or 5,000 base pairs in length. DNA segments with total lengths of about 4,000, 3,000, 2,000, 1,000, 500, 200, 100, and about 50 base pairs in length are also useful.
The DNA segments of the present invention encompass biologically functional equivalents of engineered HNF4 or HNF4 LBD mutant polypeptides. Such sequences can arise as a consequence of codon redundancy and functional equivalency that are known to occur naturally within nucleic acid sequences and the proteins thus encoded. Alternatively, functionally equivalent proteins or polypeptides can be created via the application of recombinant DNA technology, in which changes in the protein structure can be engineered, based on considerations of the properties of the amino acids being exchanged. Changes can be introduced through the application of site-directed mutagenesis techniques, e.g., to introduce improvements to the antigenicity of the protein or to test variants of an engineered HNF4 or HNF4 LBD mutant of the present invention in order to examine the degree of lipid-binding activity, or other activity at the molecular level. Various site-directed mutagenesis techniques are known to those of skill in the art and can be employed in the present invention.
The invention further encompasses fusion proteins and peptides wherein an engineered HNF4 or HNF4 LBD mutant coding region of the present invention is aligned within the same expression unit with other proteins or peptides having desired functions, such as for purification or immunodetection purposes.
Recombinant vectors form important further aspects of the present invention. Particularly useful vectors are those in which the coding portion of the DNA segment is positioned under the control of a promoter. The promoter can be that naturally associated with an HNF4 gene, as can be obtained by isolating the 5′ non-coding sequences located upstream of the coding segment or exon, for example, using recombinant cloning and/or PCR technology and/or other methods known in the art, in conjunction with the compositions disclosed herein.
In other embodiments, certain advantages will be gained by positioning the coding DNA segment under the control of a recombinant, or heterologous, promoter. As used herein, a recombinant or heterologous promoter is a promoter that is not normally associated with an HNF4 gene in its natural environment. Such promoters can include promoters isolated from bacterial, viral, eukaryotic, or mammalian cells. Naturally, it will be important to employ a promoter that effectively directs the expression of the DNA segment in the cell type chosen for expression. The use of promoter and cell type combinations for protein expression is generally known to those of skill in the art of molecular biology (See, e.g., Sambrook et al., (1989) Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, New York, specifically incorporated herein by reference). The promoters employed can be constitutive or inducible and can be used under the appropriate conditions to direct high level expression of the introduced DNA segment, such as is advantageous in the large-scale production of recombinant proteins or peptides. One preferred promoter system contemplated for use in high-level expression is a T7 promoter-based system.
IX. The Role of the Three-Dimensional Structure of the HNF4γ LBD in Solving Additional HNF4 Crystals
Because polypeptides can crystallize in more than one crystal form, the structural coordinates of an HNF4γ LBD, or portions thereof, as provided by the present invention, are particularly useful in solving the structure of other crystal forms of HNF4γ and the crystalline forms of other HNF4s. The coordinates provided in the present invention can also be used to solve the structure of HNF4 or HNF4 LBD mutants (such as those described in Section VIII above), HNF4 LBD co-complexes, or of the crystalline form of any other protein with significant amino acid sequence homology to any functional domain of HNF4.
IX.A. Determining the Three-Dimensional Structure of a Polypeptide Using the Three-Dimensional Structure of the HNF4γ LBD as a Template in Molecular Replacement
One method that can be employed for the purpose of solving additional HNF4 crystal structures is molecular replacement. See generally, Rossmann, ed, (1972) The Molecular Replacement Method, Gordon & Breach, New York. In the molecular replacement method, the unknown crystal structure, whether it is another crystal form of an HNF4γ or an HNF4γ LBD, (i.e. an HNF4γ or an HNF4γ LBD mutant), or an HNF4γ or an HNF4γ LBD polypeptide complexed with another compound (a “co-complex”), or the crystal of some other protein with significant amino acid sequence homology to any functional region of the HNF4γ LBD, can be determined using the HNF4γ LBD structure coordinates provided in Table 2. This method provides an accurate structural form for the unknown crystal more quickly and efficiently than attempting to determine such information ab initio.
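As a purely illustrative sketch, and not the procedure of any particular molecular replacement program, the geometric core of the method is the placement of a known search model in the unknown cell by a rotation R and a translation t, so that each atom moves as x' = Rx + t. The Python below applies one candidate placement to a few atoms. The coordinates, rotation angle, and translation are hypothetical; an actual search would score many such placements against the measured diffraction data.

```python
import math

def rotation_z(angle_deg):
    """Return a 3x3 rotation matrix for a rotation about the z axis."""
    a = math.radians(angle_deg)
    c, s = math.cos(a), math.sin(a)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def place_model(coords, rotation, translation):
    """Apply x' = R x + t to every (x, y, z) coordinate of the search model."""
    placed = []
    for x in coords:
        rotated = [sum(rotation[i][j] * x[j] for j in range(3)) for i in range(3)]
        placed.append(tuple(rotated[i] + translation[i] for i in range(3)))
    return placed

# Hypothetical search-model atoms (in Å); real input would be the LBD coordinates.
search_model = [(1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (0.0, 0.0, 3.0)]

# One hypothetical candidate placement: 90° about z, then a shift of (10, 0, 5) Å.
candidate = place_model(search_model, rotation_z(90.0), (10.0, 0.0, 5.0))
```

Separating the rotation from the translation mirrors the way the search is commonly split into a rotation step followed by a translation step, which keeps each search space small.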
In addition, in accordance with this invention, HNF4γ or HNF4γ LBD mutants can be crystallized in complex with known modulators. The crystal structures of a series of such complexes can then be solved by molecular replacement and compared with that of wild-type HNF4γ or the wild-type HNF4γ LBD. Potential sites for modification within the various binding sites of the receptor can thus be identified. This information provides an additional tool for determining the most efficient binding interactions, for example, increased hydrophobic interactions, between the HNF4γ LBD and a chemical entity or compound.
All of the complexes referred to in the present disclosure can be studied using X-ray diffraction techniques (See, e.g., Blundell & Johnson (1985) Methods Enzymol., 114A & 115B, (Wyckoff et al., eds.), Academic Press) and can be refined using computer software, such as the X-PLOR™ program (Brünger, (1992) X-PLOR, Version 3.1. A System for X-ray Crystallography and NMR, Yale University Press, New Haven, Conn.; X-PLOR is available from Molecular Simulations, Inc., San Diego, Calif.). This information can thus be used to optimize known classes of HNF4 and HNF4 LBD modulators, and more importantly, to design and synthesize novel classes of HNF4 and HNF4 LBD modulators.
The following Laboratory Examples have been included to illustrate preferred modes of the invention. Certain aspects of the following Laboratory Examples are described in terms of techniques and procedures found or contemplated by the present inventors to work well in the practice of the invention. These Laboratory Examples are exemplified through the use of standard laboratory practices of the inventors. In light of the present disclosure and the general level of skill in the art, those of skill will appreciate that the following Laboratory Examples are intended to be exemplary only and that numerous changes, modifications and alterations can be employed without departing from the spirit and scope of the invention.
Amino acids 102-408 of the HNF4γ LBD (SEQ ID NO:3) were expressed by subcloning into a T7 E. coli expression vector, pRSETa (Invitrogen, Carlsbad, Calif.). A histidine tag, sequence MKKGHHHHHHG (SEQ ID NO:5), was engineered at the N-terminus of the HNF4γ protein using a 5′ oligo. The plasmid was transformed into BL21(DE3) cells, which were grown at 22° C. overnight and then harvested. The soluble protein was purified with an affinity column of Ni2+-NTA coupled agarose (Qiagen, Valencia, Calif.) (25 mM Tris pH 8.0, 50 mM imidazole pH 8.0, 150 mM NaCl). A 50-500 mM imidazole gradient was used for elution. HNF4γ eluted at 100 mM imidazole. The protein was diluted to 25 mM salt and further purified using a POROS™ 50HQ column (PerSeptive Biosystems, Foster City, Calif.) (25 mM Tris pH 8.0, 0.5 mM EDTA, 25 mM NaCl, 5 mM DTT, 5% propane-diol), eluting with a 25 to 500 mM NaCl gradient. Two peaks were isolated, one representing homodimers of full-length HNF4γ LBD, the other containing heterodimers of full-length and C-terminally truncated HNF4γ. The homodimer peak was concentrated to 20 mg/ml and further purified by gel filtration chromatography (10 mM Tris pH 8.0, 0.1 mM EDTA, 150 mM NaCl, 10 mM DTT, 5% propane-diol) using a Superdex 75 column (AP Biotech, Piscataway, N.J.). Protein sequence and purity were confirmed by N-terminal sequencing and mass spectrometry to greater than 95% homogeneity.
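The arithmetic behind two steps of this purification can be sketched as follows. This is a minimal illustration under stated assumptions: that the pooled affinity eluate carries roughly the 150 mM NaCl of the affinity buffer, with imidazole not counted as salt, and that the imidazole gradient is linear, which the text does not state. The function names are hypothetical.

```python
def dilution_factor(c_initial_mM, c_final_mM):
    """Fold dilution needed to go from c_initial to c_final (C1*V1 = C2*V2)."""
    if c_final_mM <= 0 or c_final_mM > c_initial_mM:
        raise ValueError("final concentration must be positive and not exceed the initial one")
    return c_initial_mM / c_final_mM

def gradient_fraction(c_elution_mM, c_start_mM, c_end_mM):
    """Fraction of a linear gradient completed when c_elution is reached."""
    if not c_start_mM <= c_elution_mM <= c_end_mM:
        raise ValueError("elution concentration lies outside the gradient")
    return (c_elution_mM - c_start_mM) / (c_end_mM - c_start_mM)

# Assumed 150 mM NaCl in the eluate, diluted to the 25 mM loading condition.
fold = dilution_factor(150.0, 25.0)

# Elution at 100 mM imidazole on an assumed linear 50-500 mM gradient.
fraction = gradient_fraction(100.0, 50.0, 500.0)
```

Under these assumptions the eluate would be diluted six-fold before ion exchange, and the protein would elute about one ninth of the way through the imidazole gradient.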
Crystallization trials were initially conducted with both the homogeneous purified protein and the heterogeneous mixture. Crystals were obtained from both; however, the heterogeneous crystals were of poor diffraction quality. The purified protein was concentrated to 30 mg/ml (10 mM Tris pH 8.0, 0.1 mM EDTA, 150 mM NaCl, 10 mM DTT, 5% propane-diol) and crystallized using the vapor diffusion method by adding equal volumes of concentrated protein and a crystallization buffer of 0.75 M ammonium di-hydrogen phosphate/di-ammonium hydrogen phosphate pH 5.0, 10 mM DTT. Crystals formed within 2-3 weeks and were suitable for data collection in 7 to 10 weeks.
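Mixing equal volumes halves each starting concentration in the drop, after which vapor diffusion against the undiluted reservoir slowly concentrates the drop again. A minimal sketch of the starting conditions follows; the 1 µl volumes and the function name are hypothetical, since the drop size is not given.

```python
def initial_drop_concentrations(protein_mg_ml, v_protein_ul, precipitant_M, v_buffer_ul):
    """Concentrations in the drop immediately after mixing the two solutions."""
    total = v_protein_ul + v_buffer_ul
    if total <= 0:
        raise ValueError("total drop volume must be positive")
    protein = protein_mg_ml * v_protein_ul / total
    precipitant = precipitant_M * v_buffer_ul / total
    return protein, precipitant

# 30 mg/ml protein and 0.75 M phosphate, mixed 1:1 (assumed 1 µl + 1 µl).
protein_0, phosphate_0 = initial_drop_concentrations(30.0, 1.0, 0.75, 1.0)
```

With a 1:1 ratio the drop starts at 15 mg/ml protein and 0.375 M phosphate regardless of the absolute volumes chosen.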
HNF4γ LBD crystallized in the space group I4122 with a unit cell of dimensions a=b=152.71 Å, c=93.42 Å, α=β=γ=90°, and one molecule in the asymmetric unit. The structure was solved using single isomorphous replacement anomalous scattering (SIRAS) from a methyl-mercury derivative collected at beam line 17-ID at the Advanced Photon Source (located at the Argonne National Lab, Argonne, Ill.). Mercury sites were found using the software package Shake-and-Bake (Hauptman, (1997) Curr. Opin. Struct. Biol. 7: 672-80; Weeks et al., (1993) Acta Cryst. D49: 179; available from the Hauptman-Woodward Medical Research Institute, Buffalo, N.Y.), and phases were improved by solvent flipping (Abrahams & Leslie, (1996) Acta Cryst. D52: 30-42), which produced traceable electron density. Models were built using QUANTA™ (Molecular Simulations Inc., San Diego, Calif.), and refined using CNX™ (Molecular Simulations Inc., San Diego, Calif.).
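The reported cell dimensions allow a quick consistency check of the crystal packing. The sketch below computes the cell volume for a cell with a=b and all angles 90°, then a Matthews coefficient and an approximate solvent fraction. It assumes the space group is I4122, which has 16 general positions, and it uses a molecular weight estimated from the construct length (307 LBD residues plus the 11-residue tag, at an assumed average of 110 Da per residue). That mass is an illustrative assumption, not a measured value from this disclosure.

```python
def tetragonal_cell_volume(a, c):
    """Volume in Å^3 of a cell with a = b and all angles equal to 90 degrees."""
    return a * a * c

def matthews_coefficient(cell_volume, positions_per_cell, molecules_per_asu, mw_da):
    """Crystal volume per unit of protein mass, in Å^3 per Da."""
    return cell_volume / (positions_per_cell * molecules_per_asu * mw_da)

def solvent_fraction(vm):
    """Approximate solvent fraction from the Matthews coefficient (1 - 1.23/Vm)."""
    return 1.0 - 1.23 / vm

volume = tetragonal_cell_volume(152.71, 93.42)

# Hypothetical mass estimate: 318 residues x 110 Da (assumption, not from the text).
ASSUMED_MW_DA = 318 * 110.0

# 16 general positions assumed for I4122; one molecule per asymmetric unit as reported.
vm = matthews_coefficient(volume, 16, 1, ASSUMED_MW_DA)
solvent = solvent_fraction(vm)
```

The cell volume comes to about 2.18 x 10^6 Å^3. Under the assumed mass, the Matthews coefficient is a little under 4 Å^3/Da, which would correspond to a fairly solvent-rich crystal; a different mass estimate shifts this value proportionally.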
Lipids were extracted from an aliquot of HNF4γ LBD with chloroform/methanol 2:1 (v/v). The extract was dried under argon and then dissolved in a small volume of organic solvent. The extract was then treated with an aliquot of 3% (v/v) acetyl chloride in methanol for 30 min at room temperature to produce the methyl ester of the predicted fatty acid. After the reaction, the sample was dried again under argon. The derivatized sample was then analyzed by GC/MS on a Shimadzu GC-17A QP-5050A instrument. Analytes were eluted from a 25 meter DB5 column by increasing the column temperature from 100-280° C. at 120° C. per minute. Ionization of analytes was achieved by either EI or CI. Mass spectra were acquired using a scan range of 70-500 Da in 0.5 seconds. Representative data are depicted in
A cell-free fluorescence resonance energy transfer (FRET) assay was used to measure the association between the amino portion of CBP (CREB-binding protein) (residues 54-457) and the HNF4 LBD (HNF4α amino acids 141-465 and HNF4γ amino acids 102-408) (Zhou et al., (1998) Mol. Endocrinol. 12: 1594-1604). Proteins were expressed in E. coli, purified to homogeneity, and biotinylated. CBP, the fluorescence donor, was labeled with a europium chelate, and HNF4 LBD was labeled with the streptavidin-conjugated fluorophore allophycocyanin (Molecular Probes, Eugene, Oreg.). Labeled HNF4 LBD and CBP were incubated together with ligands for 15 minutes at 21° C. before assaying. A small basal level was observed, as depicted in
The crystal structure of HNF4γ was subjected to hydrogen addition and subsequent minimization holding all heavy atoms fixed using the DISCOVER™ CVFF™ force field (Molecular Simulations, San Diego, Calif.). The model of palmitic acid was generated using the above-described HNF4γ protein and docking calculations using the program MVP (Lambert, (1997) in Practical Application of Computer-Aided Drug Design, (Charifson, ed.) Marcel-Dekker, New York, pp. 243-303). Crystallographically determined atoms were used as a template and the corresponding atoms in palmitic acid were constrained to within 0.5 Å of the template.
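The positional constraint described above can be checked after docking with a short script. This is a sketch only: it is not part of MVP, the coordinates are hypothetical, and it assumes the docked and template atoms are listed in matching order.

```python
import math

def deviations(docked, template):
    """Distance in Å between each docked atom and its template atom."""
    if len(docked) != len(template):
        raise ValueError("docked and template atom lists must have equal length")
    return [math.dist(p, q) for p, q in zip(docked, template)]

def within_constraint(docked, template, tolerance=0.5):
    """True if every docked atom lies within `tolerance` Å of its template atom."""
    return all(d <= tolerance for d in deviations(docked, template))

# Hypothetical template atoms (crystallographically determined positions, in Å).
template_atoms = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (2.3, 1.2, 0.0)]

# Hypothetical docked positions for the corresponding palmitic acid atoms.
docked_atoms = [(0.1, 0.0, 0.2), (1.5, 0.3, 0.0), (2.3, 1.2, 0.4)]

satisfied = within_constraint(docked_atoms, template_atoms)
worst = max(deviations(docked_atoms, template_atoms))
```

Reporting the largest deviation alongside the pass/fail result shows how close a pose sits to the 0.5 Å limit.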
The references listed below as well as all references cited in the specification are incorporated herein by reference to the extent that they supplement, explain, provide a background for or teach methodology, techniques and/or compositions employed herein.
WO 2000/025,134
It will be understood that various details of the invention can be changed without departing from the scope of the invention. Furthermore, the foregoing description is for the purpose of illustration only, and not for the purpose of limitation—the invention being defined by the claims.
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/US02/02992 | 1/31/2002 | WO |
Number | Date | Country | |
---|---|---|---|
60265656 | Jan 2001 | US |