Reporter assays have been used routinely in the pharmaceutical and biotechnology industries to identify lead compounds that affect protein function. In the last decade, the chemist's ability to synthesize large numbers of chemical compounds in a short amount of time through techniques such as combinatorial chemistry has greatly increased, and often, thousands to millions of compounds need to be screened to identify those having a desired effect on a protein of interest.
Typically, reporter assays measure the activities of one reporter protein in a sample, but may combine multiple reporters. One strategy for co-expression of multiple reporters involves the design of bicistronic constructs, in which two genes separated by an internal ribosome entry site (IRES) sequence are expressed as a single transcriptional cassette (or bicistronic transcript) under the control of a common upstream promoter (Yen et al., Science. 2008 Nov. 7; 322(5903):918-23). The intervening IRES sequence functions as a ribosome-binding site for efficient cap-independent internal initiation of translation. Such a design enables transcription of both genes with IRES-directed cap-independent translation. This system allows for co-expression of both a control reporter, not expected to change upon experimental treatment, along with a test reporter that is normalized to the control in each test sample. However, many perturbations in the cell can differentially affect cap-dependent translation compared to cap-independent translation. Moreover, some IRESes have been shown to display variable expression of the downstream gene (Wong et al. Gene Ther. 2002 March; 9(5):337-44). This leads to high false positives and unreliable reporter assays. Thus, there is a need for an efficient high-throughput approach for analysis of protein stability where nonspecific alterations in reporter activity are used to control for the inherent variability in cell based protein stability assays. This allows for reducing the error in the data required to effectively and efficiently run an HTS assay.
The present disclosure relates, in some aspects, to the development of a plasmid that can be used to efficiently monitor the stabilities of thousands of proteins after specific perturbations.
According to some aspects, the present disclosure provides a method to identify a test compound that stabilizes or destabilizes a protein of interest, the method comprising:
In some embodiments, the first and second reporter proteins have distinguishable detectable reporter signals. In some embodiments, the first and second reporter proteins are enzyme proteins having distinguishable signals generated from their products. In some embodiments, the first and second reporter proteins are bioluminescent proteins having distinguishable bioluminescence signals. In some embodiments, the first and second reporter proteins are fluorescent proteins having distinguishable fluorescence signals. In some embodiments, the first and second reporter proteins are selected from the group consisting of Renilla luciferase (Rluc) and firefly luciferase (FLuc). In some embodiments, the first and second reporter proteins are selected from the group consisting of green fluorescent protein and red fluorescent protein. In some embodiments, the promoter is a eukaryotic promoter or a synthetic promoter. In some embodiments, the promoter comprises cytomegalovirus (CMV) promoter. In some embodiments, the open reading frame is derived from an ORFeome of an organism. In some embodiments, the open reading frame encodes an oncoprotein. In some embodiments, the oncoprotein is selected from the group consisting of MYC, Ikaros family zinc finger protein 1 (IKZF1), Ikaros family zinc finger protein 3 (IKZF3), Interferon regulatory factor 4 (IRF4), mutant p53, N-Ras, c-Fos, and c-Jun. In some embodiments, contacting a transformed host cell comprising the plasmid with a test compound comprises growing the transformed host cell in the presence of the test compound for an appropriate time.
Each of the embodiments and aspects of the invention can be practiced independently or combined. Also, the phraseology and terminology used herein is for the purpose of description and should not be regarded as limiting. The use of “including”, “comprising”, or “having”, “containing”, “involving”, and variations thereof herein, is meant to encompass the items listed thereafter and equivalents thereof as well as additional items.
These and other aspects of the inventions, as well as various advantages and utilities will be apparent with reference to the Detailed Description. Each aspect of the invention can encompass various embodiments as will be understood.
All documents identified in this application are incorporated in their entirety herein by reference.
The accompanying drawings are not intended to be drawn to scale. In the drawings, each identical or nearly identical component that is illustrated in various figures is represented by a like numeral. For purposes of clarity, not every component may be labeled in every drawing. In the drawings:
The present application is based, in some aspects, on the development of a plasmid that can be used to efficiently monitor the stabilities of thousands of proteins after specific perturbations. The plasmid allows for the co-expression of two reporter proteins, each of which is placed under the control of an IRES. In this way both reporters are transcribed together (i.e., are encoded by the same mRNA) and both are translated using an IRES. This minimizes the problem of spurious changes in the ratio of the two reporters caused by perturbations (e.g., compounds) that differentially affect IRES-dependent versus IRES-independent translation, and thus minimizes false positives.
According to some aspects, the present disclosure provides a method to identify a test compound that stabilizes or destabilizes a protein of interest. The method comprises
(i) contacting a transformed host cell comprising a DNA plasmid with a test compound, wherein the plasmid comprises in operable linkage
(a) a promoter;
(b) a first internal ribosomal entry site (IRES);
(c) a nucleotide sequence encoding a first reporter protein;
(d) a second IRES; and
(e) a nucleotide sequence encoding a second reporter protein,
wherein an open reading frame (ORF) is fused to the nucleotide sequence encoding a first reporter protein or to the nucleotide sequence encoding a second reporter protein and wherein said open reading frame codes for a protein of interest;
(ii) determining ratios of fused reporter protein signal to unfused reporter protein signal in presence and absence of the test compound; and
(iii) identifying said test compound as a stabilizer when the ratio of fused reporter protein signal to unfused reporter protein signal in the presence of the test compound is increased as compared to the ratio of fused reporter protein signal to unfused reporter protein signal in the absence of the test compound, and identifying said test compound as a destabilizer when the ratio of fused reporter protein signal to unfused reporter protein signal in the presence of the test compound is decreased as compared to the ratio of fused reporter protein signal to unfused reporter protein signal in the absence of the test compound.
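The arrangement of elements (a) through (e) can be pictured with a small data model. The following is a minimal sketch only: the class names `Reporter` and `Cassette`, the `::` label notation, and the example element names are hypothetical and chosen for illustration. It encodes two constraints stated above, namely that exactly one reporter carries the fused ORF and that the two reporters are distinguishable.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Reporter:
    name: str                         # e.g. "FLuc" or "RLuc"
    fused_orf: Optional[str] = None   # ORF fused to this reporter, if any


@dataclass(frozen=True)
class Cassette:
    """Hypothetical model of: promoter, IRES 1, reporter 1, IRES 2, reporter 2."""
    promoter: str
    ires_1: str
    reporter_1: Reporter
    ires_2: str
    reporter_2: Reporter

    def __post_init__(self):
        fused = [r for r in (self.reporter_1, self.reporter_2) if r.fused_orf]
        if len(fused) != 1:
            raise ValueError("exactly one reporter must carry the fused ORF")
        if self.reporter_1.name == self.reporter_2.name:
            raise ValueError("the two reporters must be distinguishable")

    @property
    def fused(self) -> Reporter:
        # Reporter whose signal tracks the protein of interest.
        return self.reporter_1 if self.reporter_1.fused_orf else self.reporter_2

    @property
    def unfused(self) -> Reporter:
        # Reporter used as the internal control.
        return self.reporter_2 if self.reporter_1.fused_orf else self.reporter_1

    def layout(self) -> str:
        def label(r: Reporter) -> str:
            # "::" marks a fusion; it does not imply 5' or 3' placement.
            return f"{r.fused_orf}::{r.name}" if r.fused_orf else r.name
        parts = [self.promoter, self.ires_1, label(self.reporter_1),
                 self.ires_2, label(self.reporter_2)]
        return " > ".join(parts)


example = Cassette("CMV", "IRES", Reporter("FLuc", "IKZF1"), "IRES", Reporter("RLuc"))
print(example.layout())
```

In this sketch the `fused` and `unfused` properties identify which reporter supplies the numerator and which the denominator of the ratio in step (ii).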
As used herein, “operable linkage” refers to a functional linkage between two nucleic acid sequences, such as a transcription control element (e.g., a promoter) and the linked transcribed sequence. Thus, a promoter is in operable linkage with a gene if it can mediate transcription of the gene.
As used herein a “promoter” usually contains specific DNA sequences (responsive elements) that provide binding sites for RNA polymerase and transcriptional factors for transcription to take place. In some embodiments, the promoter is a eukaryotic promoter or a synthetic promoter. Examples of promoters include, but are not limited to, the TATA box, the SV40 late promoter from simian virus 40, cytomegalovirus (CMV) promoter, ubiquitin C promoter (UbC promoter) and the T7 promoter. These and other promoter sequences are well known in the art. In one example of the invention, the promoter is a CMV promoter. In one example of the invention, the promoter is a UbC promoter.
As used herein, an “internal ribosomal entry site” or “IRES” is a cis-acting nucleic acid element that mediates the internal entry of ribosomes on an RNA molecule and thereby regulates translation in eukaryotic systems. In the methods and compositions of the present invention, a first and a second IRES element are contained in the plasmid. The first and second IRES elements permit the independent translation of a nucleotide sequence encoding a reporter protein and an open reading frame fused to a nucleotide sequence encoding another reporter protein from a single messenger RNA. In some embodiments, the first and second IRESs are the same (i.e., they have identical sequences). In some embodiments, the first and second IRESs are not the same (i.e., they do not have identical sequences).
Many IRES elements have been identified in both viral and eukaryotic genomes. In addition, synthetic IRES elements have also been developed. For example, IRES elements have been found in a variety of viruses including members of the genus Enterovirus (e.g., human poliovirus 1 (Ishii et al. (1998) J Virol. 72:2398-405 and Shiroki et al. (1997) J. Virol. 77:1-8), human Coxsackievirus B); Rhinovirus (e.g., human rhinovirus); Hepatovirus (Hepatitis A virus); Cardiovirus (Encephalomyocarditis virus EMCV (nucleotides 2137-2752 of GenBank Accession No. AB041927 and Kim et al. (1992) Mol Cell Biology 72:3636-43) and Theiler's encephalomyelitis virus); Aphthovirus (Foot-and-mouth disease virus (nucleotides 600-1058 of GenBank Accession No. AF308157; Belsham et al. (1990) EMBO 77:1105-10; Poyry et al. (2001) RNA 7:647-60; and Stoneley et al. (2000) Nucleic Acids Research 25:687-94), equine rhinitis A virus, equine rhinitis B virus); Pestivirus (e.g., Bovine viral diarrhea virus (Poole et al. (1995) Virology 206:150-154) and Classical swine fever virus (Rijnbrand et al. (1997) J. Virol 77:451-7)); Hepacivirus (e.g., Hepatitis C virus (Tsukiyama-Kohara et al. (1992) J. Virol. 66:1476-1483, Lemon et al. (1997) Semin. Virol. 5:274-288, and nucleotides 1201-1812 of GenBank Accession No. AJ242654) and GB virus B). Each of these references is herein incorporated by reference.
IRES elements have also been found in viruses from the family Retroviridae, including members of the Lentivirus family (e.g., Simian immunodeficiency virus (Ohlmann et al. (2000) Journal of Biological Chemistry 275:11899-906) and human immunodeficiency virus 1 (Buck et al. (2001) J Virol. 75:181-91)); the BLV-HTLV retroviruses (e.g., Human T-lymphotropic virus type 1 (Attal et al. (1996) FEBS Letters 392:220-4)); and the Mammalian type C retroviral family (e.g., Moloney murine leukemia virus (Vagner et al. (1995) J. Biol. Chem 270:20316-83), Friend murine leukemia virus, Harvey murine sarcoma virus, Avian reticuloendotheliosis virus (Lopez-Lastra et al. (1997) Hum. Gene Ther 5:1855-65), Murine leukemia virus (env RNA) (Deffaud et al. (2000) J. Virol. 74:846-50), Rous sarcoma virus (Deffaud et al. (2000) J. Virol. 74:11581-8)). Each of these references is herein incorporated by reference.
Eukaryotic mRNAs also contain IRES elements including, for example, BiP (Macejak et al. (1991) Nature 355:91); Antennapedia of Drosophila (exons d and e) (Oh et al. (1992) Genes and Development 6:1643-1653); c-myc; and the X-linked inhibitor of apoptosis (XIAP) gene (U.S. Pat. No. 6,171,821).
Various synthetic IRES elements have been generated. See, for example, De Gregorio et al. (1999) EMBO J. 75:4865-74; Owens et al. (2001) PNAS 4:1471-6; and Venkatesan et al. (2001) Molecular and Cellular Biology 21:2826-37. For additional IRES elements known in the art, see, for example, rangueil.inserm.fr/IRESdatabase.
In a specific embodiment, the IRES sequence is derived from encephalomyocarditis virus (EMCV).
As used herein, a reporter protein is any protein that can be specifically detected when expressed (i.e., has a detectable signal when expressed), for example, via its fluorescence or enzyme activity. The plasmid comprises a nucleotide sequence encoding a first reporter protein and a nucleotide sequence encoding a second reporter protein. An open reading frame is fused either to the nucleotide sequence encoding a first reporter protein or to the nucleotide sequence encoding a second reporter protein. In some embodiments, the open reading frame is fused to the nucleotide sequence encoding a first reporter protein. In some embodiments, the open reading frame is fused to the nucleotide sequence encoding a second reporter protein. This allows one to study the expression of the linked open reading frame in response to different stimuli. As used herein, “fused” is intended to mean that the amino acids encoded by the ORF and the reporter protein are joined by peptide bonds to create a contiguous protein sequence. Thus, the reporter protein fused to the open reading frame serves as a marker of the stability of the protein encoded by the fused open reading frame. The other reporter protein that is unfused to the open reading frame (and thus does not create a contiguous protein sequence with the amino acids encoded by the ORF) serves as an internal control to normalize for cell number and expression variability.
Typically, the first and second reporter proteins have distinguishable detectable reporter signals. For example, the first and second reporter proteins are enzyme proteins having distinguishable signals generated from their products. In some embodiments, the first and second reporter proteins are bioluminescent proteins that emit light at different wavelengths and/or utilize different substrates. Alternatively, the first and second reporter proteins are fluorescent proteins that fluoresce at different wavelengths.
Many reporter proteins known in the art may be used, including but not limited to bioluminescent proteins, fluorescent reporter proteins, and enzyme proteins such as beta-galactosidase, horseradish peroxidase and alkaline phosphatase that produce specific detectable products. The fluorescent reporter proteins include, for example, green fluorescent protein (GFP), cyan fluorescent protein (CFP), red fluorescent protein (RFP) and yellow fluorescent protein (YFP) as well as modified forms thereof, e.g., enhanced GFP (EGFP), enhanced CFP (ECFP), enhanced RFP (ERFP), mCherry, and enhanced YFP (EYFP).
Examples of bioluminescent proteins, such as luciferases, including but not limited to Renilla luciferase (Rluc), firefly luciferase (FLuc) and NanoLuc, are known in the art (see, for example, Fan, F. and Wood, K., Assay and drug development technologies V5 #1 (2007); Gupta, R. et al. Nature Methods V8 #10 (2011); Nano-Glo® Luciferase Assay System (Promega); and en.wikipedia.org/wiki/Bioluminescence).
Other non-limiting examples of reporter proteins are shown below:
Species | Luciferase |
---|---|
Photinus pyralis | |
Luciola cruciata | |
Luciola italica | |
Luciola lateralis | |
Luciola mingrelica | |
Photuris pennsylvanica | |
Pyrophorus plagiophthalamus | |
Phrixothrix hirtus | |
Renilla reniformis | Renilla luciferase |
Gaussia princeps | Gaussia luciferase; Gaussia-Dura luciferase |
Cypridina noctiluca | Cypridina luciferase |
Cypridina hilgendorfii | Cypridina (Vargula) luciferase |
Metridia longa | Metridia luciferase |
Oplophorus gracilirostris | |
In some embodiments, the first and second reporter proteins are selected from the group consisting of Renilla luciferase (Rluc), firefly luciferase (FLuc) and NanoLuc. In some embodiments, the first and second reporter proteins are selected from the group consisting of green fluorescent protein and red fluorescent protein.
An open reading frame is fused either to the nucleotide sequence encoding a first reporter protein or to the nucleotide sequence encoding a second reporter protein. The open reading frame is fused to the 5′ or to the 3′ end of the nucleotide sequence. As used herein, an open reading frame or ORF refers to a sequence of nucleotides that codes for a contiguous sequence of amino acids. The translated open reading frame may be all or a portion of a gene encoding a protein or polypeptide of interest.
The ORF of the plasmid codes for a protein of interest. As used herein, a “protein of interest” can be any conceivable polypeptide or protein that may be of interest, such as to study or otherwise characterize. In some embodiments, the ORF may be derived from an ORFeome of an organism. A complete ORFeome contains nucleic acids that encode all proteins of a given organism. A representative fraction of a full ORFeome is at least 60% of all proteins expressed by the organism. In some embodiments, the organism is a mammal. In some embodiments, the mammal is human.
In some embodiments, the protein of interest is a human polypeptide or protein. In some embodiments, the protein of interest is an oncoprotein, such as, but not limited to, RAS, MYC, SRC, FOS, JUN, MYB, ABL, BCL2, HOX11, HOX11L2, TAL1/SCL, LMO1, LMO2, EGFR, MYCN, MDM2, CDK4, GLI1, IGF2, FLT3-ITD, TP53, PAX3, PAX7, BCR/ABL, HER2/NEU, FLT3R, TAN1, B-RAF, E2A-PBX1, and NPM-ALK, as well as fusions of members of the PAX and FKHR gene families, WNT, ERK, FGFR3, CDH5, KIT, RET, Interferon regulatory factor 4 (IRF4) and TRK. Other exemplary oncogenes are well known in the art and several such examples are described in, for example, The Genetic Basis of Human Cancer (Vogelstein, B. and Kinzler, K. W. eds. McGraw-Hill, New York, N.Y., 1998).
In some embodiments, the protein of interest is a transcription factor. Some examples of such transcription factors include (but are not limited to) the STAT family (STATs 1, 2, 3, 4, 5a, 5b, and 6), FOS/JUN, NF-κB, HIV-TAT, and the E2F family. In some embodiments, the protein of interest is an IKAROS family zinc finger protein. In some embodiments, the protein of interest is IKZF1, IKZF2, IKZF3, IKZF4, or IKZF5. In some embodiments, the protein of interest is IKZF1 or IKZF3.
The nucleotide sequence encoding a reporter protein and the fused ORF are “in frame”, i.e., consecutive triplet codons of a single polynucleotide comprising the nucleotide sequence encoding the reporter protein and the fused open reading frame encode a single continuous amino acid sequence.
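The "in frame" requirement can be checked mechanically. The sketch below is illustrative only and makes simplifying assumptions: both inputs are plain DNA coding strings read from their first base, the function names are hypothetical, and a stop codon is tolerated only as the final codon of the joined sequence. It is not a substitute for sequence verification of an actual construct.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def codons(seq):
    """Split a DNA string into consecutive triplets, dropping any trailing partial codon."""
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def is_in_frame_fusion(upstream, downstream):
    """Return True if upstream + downstream reads as one continuous coding sequence.

    Both parts must be whole numbers of codons, so the downstream part keeps its
    reading frame, and no stop codon may occur before the last codon.
    """
    if len(upstream) % 3 != 0 or len(downstream) % 3 != 0:
        return False
    triplets = codons(upstream + downstream)
    return not any(c in STOP_CODONS for c in triplets[:-1])


# Toy sequences, not real ORF or reporter sequences.
print(is_in_frame_fusion("ATGGCC", "GGATCCTAA"))   # stop only at the very end
print(is_in_frame_fusion("ATGGCCTAA", "GGATCC"))   # stop between the two parts
print(is_in_frame_fusion("ATGGC", "GGATCC"))       # frameshift at the junction
```

The second and third calls illustrate the two ways a fusion can fail: a retained stop codon at the junction, or a junction that shifts the downstream reading frame.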
The methods described herein allow one to screen libraries of compounds and identify a test compound that stabilizes or destabilizes a protein of interest. A compound library is a collection of stored compounds typically used in high-throughput screening. The library compounds may include, for example, synthesized organic molecules, naturally occurring organic molecules, peptides, polypeptides, nucleic acid molecules and components thereof. Examples of compound libraries include, but are not limited to, Screen-Well® Compound Libraries (Enzo Life Sciences), EXPRESS-Pick Collection and CORE Library (ChemBridge), National Cancer Institute Library, Prestwick Chemical Library® and Tocriscreen Compound Library Collections.
The plasmids described herein may be introduced into the host cell using any available technique known in the art. For example, the plasmid may be introduced into the host cell by lipofection, calcium phosphate transfection, DEAE-dextran mediated transfection, electroporation, transduction, sonoporation, infection, or optical transfection. Suitable host cells include, but are not limited to, bacterial cells (e.g., E. coli, Bacillus subtilis, and Salmonella typhimurium), yeast cells (e.g., Saccharomyces cerevisiae and Schizosaccharomyces pombe), plant cells (e.g., Nicotiana tabacum and Gossypium hirsutum), and mammalian cells (e.g., CHO cells, 3T3 fibroblasts, HEK 293 cells, and U-2 OS cells).
In some embodiments, contacting a host cell transformed with the plasmid described herein with a test compound comprises growing the transformed host cell in the presence of the test compound for an appropriate time under suitable culture conditions. Suitable culture conditions, including the duration of the culture, will vary depending on the cell being cultured. However, one skilled in the art can easily determine the culture conditions by following standard protocols, such as those described in the series Methods in Microbiology, Academic Press Inc. Typically, the cell culture medium may contain any of the following nutrients in appropriate amounts and combinations: salt(s), buffer(s), amino acids, glucose or other sugar(s), antibiotics, serum or serum replacement, and other components such as, but not limited to, peptide growth factors, cofactors, and trace elements. In some embodiments, the transfected host cells are grown in the presence of the compound for 15 mins, 30 mins, 1 hour, 2 hours, 4 hours, 6 hours, 8 hours, 10 hours, 12 hours, 14 hours, 16 hours, 18 hours, 20 hours, 24 hours, 30 hours, 48 hours, or 72 hours.
In some embodiments, a single transformed host cell is first isolated, cloned and expanded. Clones are selected based on their responses to a control test compound and are confirmed to provide the low error required for HTS campaigns. Selection of appropriate clones is aided by determining the response of the fused reporter protein relative to the unfused reporter, with the response stability and reproducibility required for high-throughput screening. Normalizing the fused reporter signal to the control unfused reporter signal can significantly reduce the inherent error relative to measuring the response from the fused reporter alone. This reduction in error is critical for identifying a useful clonal cell line, i.e., one whose response to a test compound is large enough, relative to the error in the signals observed from the treated and untreated samples, to provide a Z factor sufficient for high-throughput screening (en.wikipedia.org/wiki/Z-factor).
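The effect of normalization on assay quality can be illustrated with the commonly used Z′ statistic, Z′ = 1 − 3(σ_treated + σ_untreated) / |μ_treated − μ_untreated|. The sketch below is not taken from the Example; all well values are invented for illustration and are constructed so that well-to-well variation (for instance, in cell number) scales both reporters equally. The function names `z_prime` and `ratios` are hypothetical.

```python
from statistics import mean, stdev


def z_prime(treated, untreated):
    """Z' = 1 - 3 * (sd_treated + sd_untreated) / |mean_treated - mean_untreated|."""
    window = abs(mean(treated) - mean(untreated))
    if window == 0:
        raise ValueError("treated and untreated means are identical")
    return 1.0 - 3.0 * (stdev(treated) + stdev(untreated)) / window


def ratios(fused, unfused):
    """Per-well fused/unfused reporter ratio."""
    return [f / u for f, u in zip(fused, unfused)]


# Invented signals for five wells per condition (arbitrary light units).
# Wells differ in cell number, which scales both reporters together.
untreated_fused = [800, 1000, 1200, 900, 1100]
untreated_unfused = [400, 500, 600, 450, 550]
treated_fused = [240, 300, 360, 270, 330]      # control destabilizer lowers the fusion
treated_unfused = [400, 500, 600, 450, 550]    # control reporter is unaffected

z_raw = z_prime(treated_fused, untreated_fused)
z_ratio = z_prime(ratios(treated_fused, treated_unfused),
                  ratios(untreated_fused, untreated_unfused))
print(f"Z' from fused signal alone: {z_raw:.2f}")
print(f"Z' from fused/unfused ratio: {z_ratio:.2f}")
```

In this idealized data set the ratio removes the shared variation entirely, so the ratio-based Z′ is far higher than the value obtained from the fused signal alone. Real data would show a smaller but analogous improvement only to the extent that the variation is common to both reporters.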
As used herein, “fused reporter protein signal” refers to the detectable signal of the reporter protein encoded by the nucleotide sequence that is fused to the ORF. As used herein, “unfused reporter protein signal” refers to the detectable signal of the reporter protein encoded by the nucleotide sequence that is not fused to the ORF. The fused and unfused reporter protein signals in the presence and absence of the test compound are determined using methods known in the art. Detectors such as, but not limited to, luminometers, spectrophotometers, and fluorimeters, or any other device that can detect changes in reporter protein activity can be used. Assay systems known in the art that allow for quantitation of a stable reporter signal from two reporter genes in a single sample can be used. Examples include, but are not limited to, Dual-Glo® Luciferase Assay System (Promega) that measures the activities of firefly and Renilla luciferases sequentially from a single sample.
After detecting the signals generated by the reporter proteins, the ratio of the fused reporter protein signal to unfused reporter protein signal in the presence of the test compound is compared to the ratio of the fused reporter protein signal to unfused reporter protein signal in the absence of the test compound. When the ratio of fused reporter protein signal to unfused reporter protein signal in the presence of the test compound is increased as compared to the ratio of fused reporter protein signal to unfused reporter protein signal in the absence of the test compound, the test compound is identified as a stabilizer of the protein of the interest. In contrast, when the ratio of fused reporter protein signal to unfused reporter protein signal in the presence of the test compound is decreased as compared to the ratio of fused reporter protein signal to unfused reporter protein signal in the absence of the test compound, the test compound is identified as a destabilizer of the protein of interest.
In some embodiments, the open reading frame is fused to the nucleotide sequence encoding a first reporter protein. In such embodiments, ratios of first reporter protein signal to second reporter protein signal are determined in presence and absence of the compound. The test compound is identified as a stabilizer when the ratio of the first reporter protein signal to second reporter protein signal in the presence of the test compound is increased as compared to the ratio of first reporter protein signal to second reporter protein signal in the absence of the test compound. The test compound is identified as a destabilizer when the ratio of first reporter protein signal to second reporter protein signal in the presence of the test compound is decreased as compared to the ratio of first reporter protein signal to second reporter protein signal in the absence of the test compound.
In some embodiments, the open reading frame is fused to the nucleotide sequence encoding a second reporter protein. In such embodiments, ratios of second reporter protein signal to first reporter protein signal are determined in presence and absence of the compound. The test compound is identified as a stabilizer when the ratio of the second reporter protein signal to first reporter protein signal in the presence of the test compound is increased as compared to the ratio of second reporter protein signal to first reporter protein signal in the absence of the test compound. The test compound is identified as a destabilizer when the ratio of second reporter protein signal to first reporter protein signal in the presence of the test compound is decreased as compared to the ratio of second reporter protein signal to first reporter protein signal in the absence of the test compound.
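The comparison described in the preceding paragraphs reduces to a few lines of arithmetic. The following is a minimal sketch under stated assumptions: the function names, the `orf_on` argument, and the optional `min_fold_change` cutoff are hypothetical additions for illustration. The text above requires only that the ratio be increased or decreased, which corresponds to the default cutoff of 1.0.

```python
def fused_to_unfused_ratio(first_signal, second_signal, orf_on="first"):
    """Return fused/unfused signal, given which reporter carries the ORF."""
    if orf_on == "first":
        fused, unfused = first_signal, second_signal
    elif orf_on == "second":
        fused, unfused = second_signal, first_signal
    else:
        raise ValueError("orf_on must be 'first' or 'second'")
    if unfused <= 0:
        raise ValueError("unfused reporter signal must be positive")
    return fused / unfused


def classify(ratio_with_compound, ratio_without_compound, min_fold_change=1.0):
    """Label a compound by comparing the ratio with and without it.

    min_fold_change is an assumed, user-chosen cutoff (>= 1.0); at 1.0 any
    increase is a stabilizer call and any decrease is a destabilizer call.
    """
    fold = ratio_with_compound / ratio_without_compound
    if fold > min_fold_change:
        return "stabilizer"
    if fold < 1.0 / min_fold_change:
        return "destabilizer"
    return "no change"


# Invented signals: ORF fused to the first reporter.
without_compound = fused_to_unfused_ratio(1000, 500, orf_on="first")
with_compound = fused_to_unfused_ratio(300, 500, orf_on="first")
print(classify(with_compound, without_compound))
```

In a screening setting a cutoff above 1.0 would typically be chosen from the observed assay error, but that choice is outside what this sketch assumes.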
The present invention is further illustrated by the following Example, which in no way should be construed as further limiting. The entire contents of all of the references (including literature references, issued patents, published patent applications, and co-pending patent applications) cited throughout this application are hereby expressly incorporated by reference.
pIRIGF constructs express in 293FT and HELA cells (
293FT and HELA cells were transfected with IKZF1-firefly, IKZF3-firefly and MYC-firefly fusion proteins and selected using puromycin and geneticin respectively. These pools were very unstable, lost signal within 10 to 30 days, and generally had very small responses to IMiDs (
Cell line clones (IKZF1-2B4, IKZF1-2B11, myc-1C3 and myc-5F2) expressing the indicated firefly fusion protein were evaluated in the Dual-Glo assay for reproducibility. Potency of IMiDs and relative reduction in firefly luciferase signals confirmed the expected responses and generated data with Z′ values sufficient for screening (
Pilot screen results for IKZF1 2B4 cells—Active compounds (Prestwick collection and NCI collection are shown in
Pilot screen results for MYC 5F2 cells—Active compounds NCI collection are shown in
The hits tested on IKZF1 2B4 and MYC 5F2 cell lines were confirmed (
Cherry pick retests for IKZF1 were screened at ICCB (
The results of the ICCB cherry picks IKZF1 vs. MYC selectivity comparison show that the majority of hits in the IKZF1 cell line screen were also active in the counter screen assay suggesting a nonspecific mechanism (
Two HSP90 inhibitors, BIIB021 (
Seven cell lines were used to measure the half-life of the luciferases after blocking all protein synthesis with cycloheximide. The decay observed for both fused luciferases (MYC-firefly and MYC-nanoluc;
Seven MYC-luciferase reporter cell lines were used to measure changes in MYC-luciferase expression after blocking the proteasome with MG132. The expression of the unfused Renilla and firefly were unchanged for about 6 hours but decreased after 18 hours to a variable extent among cell lines (
Seven MYC-luciferase reporter cell lines were used to measure changes in MYC-luciferase expression after inhibition of ubiquitin dependent proteolysis with the neddylation inhibitor MLN-4924 (
A549 and H1299 Cell Lines Expressing MYC-Firefly after siMYC Knockdown
siRNA was used to knock down the MYC-firefly luciferase fusion protein in the A549 and H1299 cell lines using 48 hour treatment with siRNA directed to MYC mRNA. The reduction in fusion protein, as observed by western blotting with MYC and firefly directed antibodies (
siGENOME siRNA Library
This application claims the benefit under 35 U.S.C. §119(e) of U.S. provisional application U.S. Ser. No. 62/062,257, filed Oct. 10, 2014, the entire contents of which are incorporated herein by reference.
This invention was made with government support under 2R01CA068490-19 and 2R01CA076120-13 awarded by the National Institutes of Health. The government has certain rights in the invention.
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/US15/54914 | 10/9/2015 | WO | 00 |
Number | Date | Country | |
---|---|---|---|
62062257 | Oct 2014 | US |