The instant application contains a Sequence Listing which has been submitted electronically in ASCII format and is hereby incorporated by reference in its entirety. Said ASCII copy, created on Mar. 24, 2016, is named 0100-0013WO1_SL.txt and is 192,837 bytes in size.
The field of the invention is cell and molecular biology. Specifically, the field of the invention is cell signal transduction and methods of genetically engineering or modifying the same. More specifically, the invention relates to a novel nuclear receptor-based ligand inducible polypeptide coupler and methods of modulating protein-protein interactions within a host cell.
In the field of genetic engineering and medicine, precise control and modulation of cellular signaling pathways is a valuable and sought after tool for studying, manipulating, and controlling development and other physiological processes (e.g., pathological conditions). Signaling pathways are known to regulate a wide array of cellular processes and functions, including proliferation, differentiation, and apoptosis. Signaling pathways can be regulated through a number of mechanisms such as post-translational modifications (e.g., phosphorylation, ubiquitination, etc.) and protein-protein interactions. One common mechanism for activating or regulating a signaling pathway is through the formation of multi-protein complexes (e.g., dimers, trimers, and oligomers) via protein-protein interactions. Such complexes can include multiple copies of the same protein (homo-complex) or copies of distinct proteins (hetero-complex). The induction of the protein-protein interaction and formation of the complex is in some cases triggered by binding of a ligand to one or more of the member proteins (e.g., a receptor molecule). While numerous such cell signaling pathways have been discovered and characterized, there remains a need to be able to target and manipulate such pathways in a rapid, efficient, and reliable manner using pharmaceutically acceptable and available activating ligands.
In contrast to the relative scarcity of modulation systems for cell signaling pathways, methods for regulating gene expression through induction of protein-protein interactions between transcription factors have been developed and employed. In order for gene expression to be triggered, such that it produces the RNA necessary as the first step in protein synthesis, a transcriptional activator must be brought into proximity of a promoter that controls gene transcription. Typically, the transcriptional activator itself is associated with a protein that has at least one DNA binding domain that binds to DNA binding sites present in the promoter regions of genes. Thus, for gene expression to occur, a protein comprising a DNA binding domain and an activation domain located at an appropriate distance from the DNA binding domain must be brought into the correct position in the promoter region of the gene.
One method for inducing protein-protein interactions relies on immunosuppressive molecules such as FK506, rapamycin and cyclosporine A, which can bind to immunophilins such as FKBP12 and cyclophilin. A general strategy has been devised to bring together any two proteins by placing FK506 on each of the two proteins or by placing FK506 on one and cyclosporine A on another one. A synthetic homodimer of FK506 (FK1012) or a compound resulting from fusion of FK506-cyclosporine (FKCsA) can then be used to induce dimerization of these molecules (Spencer et al., 1993, Science 262: 1019-24; Belshaw et al., 1996 Proc Natl Acad Sci USA 93: 4604-7). A Gal4 DNA binding domain fused to FKBP12 and a VP16 activator domain fused to cyclophilin, together with the FKCsA compound, were used to show heterodimerization and activation of a reporter gene under the control of a promoter containing Gal4 binding sites. Unfortunately, this system relies on immunosuppressants, which can have unwanted side effects and therefore limit its use for various mammalian applications.
Higher eukaryotic transcription activation systems such as steroid hormone receptor systems have also been employed to regulate gene expression. Steroid hormone receptors are members of the nuclear receptor superfamily and are found in vertebrate and invertebrate cells. Unfortunately, use of steroidal compounds that activate the receptors for the regulation of gene expression, particularly in plants and mammals, is limited due to their involvement in many other natural biological pathways in such organisms. In order to overcome such difficulties, an alternative system has been developed using insect ecdysone receptors (EcR).
Growth, molting, and development in insects are regulated by the ecdysone steroid hormone (molting hormone) and the juvenile hormones (Dhadialla, et al., 1998, Annu. Rev. Entomol. 43: 545-569). The molecular target for ecdysone in insects consists of at least ecdysone receptor (EcR) and ultraspiracle protein (USP). EcR is a member of the nuclear steroid receptor superfamily that is characterized by signature DNA and ligand binding domains, and an activation domain (Koelle et al. 1991, Cell, 67:59-77). EcR receptors are responsive to a number of steroidal compounds such as ponasterone A and muristerone A. Non-steroidal compounds with ecdysteroid agonist activity have also been described, including the commercially available insecticides tebufenozide and methoxyfenozide (see International Patent Application No. PCT/EP96/00686 and U.S. Pat. No. 5,530,028, each of which is incorporated by reference herein in its entirety). Both analogs have exceptional safety profiles in other organisms.
The insect ecdysone receptor (EcR) heterodimerizes with Ultraspiracle (USP), the insect homologue of the mammalian retinoid X receptor (RXR), binds ecdysteroids through its ligand binding domain, and also binds ecdysone receptor response elements to activate transcription of ecdysone responsive genes (Riddiford et al., 2000).
EcR has five modular domains: A/B (transactivation), C (DNA binding, heterodimerization), D (hinge, heterodimerization), E (ligand binding, heterodimerization and transactivation), and F (transactivation). Some of these domains such as A/B, C and E retain their function when they are fused to other proteins. EcR is a member of the nuclear receptor superfamily and classified into subfamily 1, group H (referred to herein as “Group H nuclear receptors”). The members of each group share 40-60% amino acid identity in the E (ligand binding) domain (Laudet et al., A Unified Nomenclature System for the Nuclear Receptor Subfamily, 1999; Cell 97: 161-163). In addition to the ecdysone receptor, other members of this nuclear receptor subfamily 1, group H, include: ubiquitous receptor (UR), Orphan receptor 1 (OR-1), steroid hormone nuclear receptor 1 (NER-1), RXR interacting protein-15 (RIP-15), liver X receptor β (LXRβ), steroid hormone receptor like protein (RLD-1), liver X receptor (LXR), liver X receptor α (LXRα), farnesoid X receptor (FXR), receptor interacting protein 14 (RIP-14), and farnesol receptor (HRR-1).
In mammalian cells, it has been demonstrated that insect ecdysone receptor (EcR) can heterodimerize with mammalian retinoid X receptor (RXR) and can be used to regulate expression of target genes in a ligand dependent manner. The use of such expression system components, however, has not been contemplated, demonstrated, or applied for regulating protein-protein interaction or for use, for example, in regulating, controlling, inducing or inhibiting extracellular and intracellular signal transduction pathways and protein-protein associations.
While other gene expression systems have been developed, a need remains for systems that allow precise modulation of cell signaling pathways, in both plants and animals, via regulation of protein-protein interactions.
Various publications are cited herein, the disclosures of which are incorporated by reference herein in their entireties.
In some embodiments, the invention comprises two polypeptides comprising a first non-naturally occurring polypeptide comprising a fragment or domain of a nuclear receptor protein and a second non-naturally occurring polypeptide comprising a different fragment or domain of a nuclear receptor protein, wherein the first polypeptide is capable of binding an activating ligand, wherein the second polypeptide is capable of associating with the first polypeptide in the presence of the activating ligand, wherein each of the first and second polypeptides further comprise heterologous amino acids or polypeptide sequences such that activating ligand induced association of the first and second polypeptides results in an activated functional, biological or cell signal transduction condition.
In certain embodiments of the invention, one or both nuclear receptor protein fragments or domains comprise an arthropod nuclear receptor amino acid sequence.
In some embodiments of the invention, one or both nuclear receptor protein fragments or domains comprise a Group H nuclear receptor amino acid sequence.
In certain embodiments of the invention, the nuclear receptor amino acid sequence of the first polypeptide comprises an ecdysone receptor (EcR) ligand binding domain, polypeptide fragment, or substitution mutant thereof.
In some embodiments of the invention, the second polypeptide nuclear receptor protein fragment or domain comprises a mammalian nuclear receptor amino acid sequence.
In certain embodiments of the invention, the mammalian nuclear receptor protein fragment or domain comprises a RXR nuclear receptor polypeptide fragment, or substitution mutant thereof.
In some embodiments of the invention, the second polypeptide nuclear receptor protein fragment or domain comprises a chimera of invertebrate and mammalian nuclear receptor amino acid sequences, or substitution mutants thereof.
In certain embodiments of the invention, the second polypeptide nuclear receptor protein fragment or domain comprises a chimera of invertebrate USP (RXR homologue) and mammalian RXR nuclear receptor amino acid sequences, or substitution mutants thereof.
In some embodiments, the invention comprises a ligand inducible polypeptide coupling (LIPC) system comprising: a) a first non-naturally occurring polypeptide comprising a fragment or domain of an arthropod nuclear receptor protein, and b) a second non-naturally occurring polypeptide comprising a fragment or domain of an arthropod and/or mammalian nuclear receptor protein, wherein the first and second polypeptides comprise additional heterologous sequences capable of producing an activated functional, biological or cell signal transduction condition following contact with an activating ligand.
In some embodiments of the invention, one or both nuclear receptor protein fragments or domains of the LIPC comprise a Group H nuclear receptor amino acid sequence.
In certain embodiments of the invention, the first polypeptide of the LIPC comprises an ecdysone receptor (EcR) ligand binding domain, polypeptide fragment, or substitution mutant thereof.
In some embodiments of the invention, the second polypeptide of the LIPC comprises a mammalian nuclear receptor amino acid sequence.
In certain embodiments of the invention, the second polypeptide of the LIPC comprises a RXR nuclear receptor polypeptide fragment, or substitution mutant thereof.
In some embodiments of the invention, the second polypeptide of the LIPC comprises a chimera of invertebrate and mammalian nuclear receptor amino acid sequences, or substitution mutants thereof.
In certain embodiments of the invention, the second polypeptide of the LIPC comprises a chimera of invertebrate USP (RXR homologue) and mammalian RXR nuclear receptor amino acid sequences, or substitution mutants thereof.
In some embodiments of the invention, the nuclear receptor protein fragments of the first and second polypeptides of the invention, including of the LIPC, are derived from an ecdysone receptor polypeptide selected from the group consisting of a spruce budworm Choristoneura fumiferana EcR (“CfEcR”) LBD, a beetle Tenebrio molitor EcR (“TmEcR”) LBD, a Manduca sexta EcR (“MsEcR”) LBD, a Heliothis virescens EcR (“HvEcR”) LBD, a midge Chironomus tentans EcR (“CtEcR”) LBD, a silk moth Bombyx mori EcR (“BmEcR”) LBD, a fruit fly Drosophila melanogaster EcR (“DmEcR”) LBD, a mosquito Aedes aegypti EcR (“AaEcR”) LBD, a blowfly Lucilia capitata EcR (“LcEcR”) LBD, a blowfly Lucilia cuprina EcR (“LucEcR”) LBD, a Mediterranean fruit fly Ceratitis capitata EcR (“CcEcR”) LBD, a locust Locusta migratoria EcR (“LmEcR”) LBD, an aphid Myzus persicae EcR (“MpEcR”) LBD, a fiddler crab Celuca pugilator EcR (“CpEcR”) LBD, a whitefly Bemisia argentifolii EcR (“BaEcR”) LBD, a leafhopper Nephotettix cincticeps EcR (“NcEcR”) LBD, and an ixodid tick Amblyomma americanum EcR (“AmaEcR”) LBD.
In certain embodiments of the invention, the nuclear receptor protein fragments of the first and second polypeptides of the invention, including of the LIPC, are derived from an ecdysone receptor polypeptide encoded by a polynucleotide comprising a nucleic acid sequence of SEQ ID NO: 1 (CfEcR-DEF), SEQ ID NO: 2 (CfEcR-CDEF), SEQ ID NO: 3 (DmEcR-DEF), SEQ ID NO: 4 (TmEcR-DEF), SEQ ID NO: 5 (AmaEcR-DEF), or a polynucleotide encoding a functional variant that is substantially identical thereto.
In certain embodiments of the invention, at least one of the ecdysone receptor polypeptides comprises a polypeptide sequence of SEQ ID NO: 6 (CfEcR-DEF), SEQ ID NO: 7 (DmEcR-DEF), SEQ ID NO: 8 (CfEcR-CDEF), SEQ ID NO: 9 (TmEcR-DEF), SEQ ID NO: 10 (AmaEcR-DEF), or a polypeptide sequence substantially identical thereto.
In certain embodiments of the invention, the ecdysone receptor polypeptide sequence comprises about or at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, or substitution mutations relative to the corresponding wild-type ecdysone receptor polypeptide.
In certain embodiments of the invention, the ecdysone receptor polypeptide is encoded by a polynucleotide comprising a codon mutation that results in a substitution of an amino acid residue, wherein the amino acid residue is at a position equivalent to or analogous to a) amino acid residue 20, 21, 48, 51, 52, 55, 58, 59, 61, 62, 92, 93, 95, 96, 107, 109, 110, 120, 123, 125, 175, 218, 219, 223, 230, 234, or 238 of SEQ ID NO: 17, b) amino acid residues 95 and 110 of SEQ ID NO: 17, c) amino acid residues 218 and 219 of SEQ ID NO: 17, d) amino acid residues 107 and 175 of SEQ ID NO: 17, e) amino acid residues 127 and 175 of SEQ ID NO: 17, f) amino acid residues 107 and 127 of SEQ ID NO: 17, g) amino acid residues 107, 127 and 175 of SEQ ID NO: 17, h) amino acid residues 52, 107 and 175 of SEQ ID NO: 17, i) amino acid residues 96, 107 and 175 of SEQ ID NO: 17, j) amino acid residues 107, 110 and 175 of SEQ ID NO: 17, k) amino acid residue 107, 121, 213, or 217 of SEQ ID NO: 18, or l) amino acid residue 91 or 105 of SEQ ID NO: 19.
In certain embodiments of the invention, the substitution mutation of the ecdysone receptor polypeptide is selected from the group consisting of a) E20A, Q21A, F48A, I51A, T52A, T52V, T52I, T52L, T55A, T58A, V59A, L61A, I62A, M92A, M93A, R95A, V96A, V96T, V96D, V96M, V107I, F109A, A110P, A110S, A110M, A110L, Y120A, A123F, M125A, R175E, M218A, C219A, L223A, L230A, L234A, W238A, R95A/A110P, M218A/C219A, V107I/R175E, Y127E/R175E, V107I/Y127E, V107I/Y127E/R175E, T52V/V107I/R175E, V96A/V107I/R175E, T52A/V107I/R175E, V96T/V107I/R175E, or V107I/A110P/R175E substitution mutation of SEQ ID NO: 17, b) A107P, G121R, G121L, N213A, C217A, or C217S substitution mutation of SEQ ID NO: 18, and c) G91A or A105P substitution mutation of SEQ ID NO: 19.
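The substitution mutations recited above follow the common shorthand of wild-type residue, position, and replacement residue (for example, R175E), with a forward slash joining the members of a multiple mutant (for example, R95A/A110P). A minimal Python sketch of reading that shorthand is given below for illustration only. The function names are hypothetical, and the four-residue sequence in the usage lines is invented for the example and is not any sequence of the Sequence Listing.

```python
import re

# One substitution: wild-type letter, 1-based position, replacement letter.
_SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_mutant(label):
    """Split a label such as 'V107I/A110P/R175E' into
    (wild_type, position, replacement) tuples."""
    parsed = []
    for part in label.split("/"):
        match = _SUBSTITUTION.match(part.strip())
        if match is None:
            raise ValueError(f"not a single-residue substitution: {part!r}")
        wild_type, position, replacement = match.groups()
        parsed.append((wild_type, int(position), replacement))
    return parsed


def apply_mutant(sequence, label):
    """Apply the substitutions in `label` to a one-letter amino acid string,
    checking that each stated wild-type residue is present at its position."""
    residues = list(sequence)
    for wild_type, position, replacement in parse_mutant(label):
        if position < 1 or position > len(residues):
            raise ValueError(f"position {position} is outside the sequence")
        if residues[position - 1] != wild_type:
            raise ValueError(
                f"expected {wild_type} at position {position}, "
                f"found {residues[position - 1]}"
            )
        residues[position - 1] = replacement
    return "".join(residues)


# Usage with an invented toy sequence (not a SEQ ID NO):
print(parse_mutant("V107I/A110P/R175E"))
print(apply_mutant("MVAR", "V2I/R4E"))  # prints MIAE
```

Checking the stated wild-type residue before substituting is a design choice that catches numbering mismatches, which matters here because the positions are recited relative to particular reference sequences (SEQ ID NOs: 17, 18, and 19).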
In some embodiments of the invention, the retinoid X receptor polypeptide comprises a polypeptide selected from the group consisting of a vertebrate retinoid X receptor polypeptide, an invertebrate retinoid X receptor polypeptide (USP), and a chimeric retinoid X polypeptide comprising polypeptide fragments from a vertebrate and invertebrate RXR.
In certain embodiments of the invention, the chimeric retinoid X receptor polypeptide comprises at least two different retinoid X receptor polypeptide fragments selected from the group consisting of a vertebrate species retinoid X receptor polypeptide fragment, an invertebrate species retinoid X receptor polypeptide fragment, and a non-Dipteran/non-Lepidopteran invertebrate species retinoid X receptor polypeptide fragment.
In some embodiments of the invention, the chimeric retinoid X receptor polypeptide comprises a retinoid X receptor polypeptide comprising at least one retinoid X receptor polypeptide fragment selected from the group consisting of an EF-domain helix 1, an EF-domain helix 2, an EF-domain helix 3, an EF-domain helix 4, an EF-domain helix 5, an EF-domain helix 6, an EF-domain helix 7, an EF-domain helix 8, an EF-domain helix 9, an EF-domain helix 10, an EF-domain helix 11, an EF-domain helix 12, an F-domain, and an EF-domain β-pleated sheet, wherein the retinoid X receptor polypeptide fragment is from a different species retinoid X receptor polypeptide or a different isoform retinoid X receptor polypeptide than the second retinoid X receptor polypeptide fragment.
In certain embodiments of the invention, the chimeric retinoid X receptor polypeptide is encoded by a polynucleotide comprising a nucleic acid sequence of a) SEQ ID NO: 11, b) nucleotides 1-348 of SEQ ID NO: 12 and nucleotides 268-630 of SEQ ID NO: 13, c) nucleotides 1-408 of SEQ ID NO: 12 and nucleotides 337-630 of SEQ ID NO: 13, d) nucleotides 1-465 of SEQ ID NO: 12 and nucleotides 403-630 of SEQ ID NO: 13, e) nucleotides 1-555 of SEQ ID NO: 12 and nucleotides 490-630 of SEQ ID NO: 13, f) nucleotides 1-624 of SEQ ID NO: 12 and nucleotides 547-630 of SEQ ID NO: 13, g) nucleotides 1-645 of SEQ ID NO: 12 and nucleotides 601-630 of SEQ ID NO: 13, and h) nucleotides 1-717 of SEQ ID NO: 12, nucleotides 613-630 of SEQ ID NO: 13, or a polynucleotide encoding a functional variant that is substantially identical thereto.
In some embodiments of the invention, the chimeric retinoid X polypeptide comprises a polypeptide sequence of a) SEQ ID NO: 14, b) amino acids 1-116 of SEQ ID NO: 15 and amino acids 90-210 of SEQ ID NO: 16, c) amino acids 1-136 of SEQ ID NO: 15 and amino acids 113-210 of SEQ ID NO: 16, d) amino acids 1-155 of SEQ ID NO: 15 and amino acids 135-210 of SEQ ID NO: 16, e) amino acids 1-185 of SEQ ID NO: 15 and amino acids 164-210 of SEQ ID NO: 16, f) amino acids 1-208 of SEQ ID NO: 15 and amino acids 183-210 of SEQ ID NO: 16, g) amino acids 1-215 of SEQ ID NO: 15 and amino acids 201-210 of SEQ ID NO: 16, and h) amino acids 1-239 of SEQ ID NO: 15, amino acids 205-210 of SEQ ID NO: 16, or a polypeptide sequence substantially identical thereto.
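The amino acid ranges recited for SEQ ID NOs: 15 and 16 correspond to the nucleotide ranges recited for SEQ ID NOs: 12 and 13 under the usual relationship of three nucleotides per codon. The following Python sketch illustrates that arithmetic only. It assumes that each coding sequence begins in frame at nucleotide 1, which is an assumption made for illustration and is not taken from the Sequence Listing, and the function name is hypothetical.

```python
def aa_range_to_nt_range(first_aa, last_aa):
    """Map a 1-based inclusive amino acid range to the 1-based inclusive
    nucleotide range encoding it, assuming the reading frame starts at
    nucleotide 1 (an illustrative assumption)."""
    if first_aa < 1 or last_aa < first_aa:
        raise ValueError("expected 1 <= first_aa <= last_aa")
    first_nt = 3 * (first_aa - 1) + 1  # first base of the first codon
    last_nt = 3 * last_aa              # last base of the last codon
    return first_nt, last_nt


# Amino acid ranges taken from the text above.
for first_aa, last_aa in [(1, 116), (90, 210), (1, 155), (135, 210), (205, 210)]:
    first_nt, last_nt = aa_range_to_nt_range(first_aa, last_aa)
    print(f"amino acids {first_aa}-{last_aa} -> nucleotides {first_nt}-{last_nt}")
```

For example, amino acids 1-116 map to nucleotides 1-348, amino acids 90-210 map to nucleotides 268-630, and amino acids 1-155 map to nucleotides 1-465.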
In certain embodiments of the invention, one or both additional heterologous sequences of the first and second polypeptides or the LIPC system comprise a transmembrane domain.
In certain embodiments of the invention, at least one of the transmembrane domains of the first and second polypeptides or the LIPC system is a single-pass type I transmembrane domain.
In certain embodiments of the invention, LIPC components are fused to heterologous polypeptides which result in or produce cell death, or anergy, upon ligand-induced dimerization; such systems may be referred to as “suicide” or “kill” switches.
In some embodiments, the invention comprises an isolated polynucleotide comprising a polynucleotide sequence that encodes the first or second polypeptides described herein.
In certain embodiments, the invention comprises a first polynucleotide comprising a nucleotide sequence encoding the first polypeptide and a second polynucleotide comprising a nucleotide sequence encoding the second polypeptide described herein.
In some embodiments, the invention comprises a vector comprising any one of the polynucleotides above. In certain embodiments, the invention comprises a vector comprising both of the first and second polynucleotides described herein. In some embodiments, the vector of the invention is an expression vector.
In certain embodiments, the invention comprises a host cell comprising any one of the vectors above. In some embodiments, the host cell is a mammalian T-cell. In certain embodiments, the host cell is a human T-cell.
In some embodiments, the invention comprises a method of inducing cell signal transduction comprising introducing the first and second polypeptides, the LIPC system, the polynucleotides, and/or any of the vectors described herein into a host cell and contacting the host cell with an activating ligand.
In certain embodiments of the invention, the activating ligand of the first and second polypeptides, the LIPC system, the polynucleotides, the vector, and/or the method described herein is:
wherein:
E is a (C4-C6)alkyl containing a tertiary carbon or a cyano(C3-C5)alkyl containing a tertiary carbon; R1 is H, Me, Et, i-Pr, F, formyl, CF3, CHF2, CHCl2, CH2F, CH2Cl, CH2OH, CH2OMe, CH2CN, CN, C≡CH, 1-propynyl, 2-propynyl, vinyl, OH, OMe, OEt, cyclopropyl, CF2CF3, CH═CHCN, allyl, azido, SCN, or SCHF2;
R2 is H, Me, Et, n-Pr, i-Pr, formyl, CF3, CHF2, CHCl2, CH2F, CH2Cl, CH2OH, CH2OMe, CH2CN, CN, C≡CH, 1-propynyl, 2-propynyl, vinyl, Ac, F, Cl, OH, OMe, OEt, O-n-Pr, OAc, NMe2, NEt2, SMe, SEt, SOCF3, OCF2CF2H, COEt, cyclopropyl, CF2CF3, CH═CHCN, allyl, azido, OCF3, OCHF2, O-i-Pr, SCN, SCHF2, SOMe, NH—CN, or joined with R3 and the phenyl carbons to which R2 and R3 are attached to form an ethylenedioxy, a dihydrofuryl ring with the oxygen adjacent to a phenyl carbon, or a dihydropyryl ring with the oxygen adjacent to a phenyl carbon;
R3 is H, Et, or joined with R2 and the phenyl carbons to which R2 and R3 are attached to form an ethylenedioxy, a dihydrofuryl ring with the oxygen adjacent to a phenyl carbon, or a dihydropyryl ring with the oxygen adjacent to a phenyl carbon;
R4, R5, and R6 are independently H, Me, Et, F, Cl, Br, formyl, CF3, CHF2, CHCl2, CH2F, CH2Cl, CH2OH, CN, C≡CH, 1-propynyl, 2-propynyl, vinyl, OMe, OEt, SMe, or SEt; or
In certain embodiments of the invention, the activating ligand of the first and second polypeptides, the LIPC system, the polynucleotides, the vector, and/or the method described herein is a compound of the formula:
wherein R1, R2, R3, and R4 are: a) H, (C1-C6)alkyl; (C1-C6)haloalkyl; (C1-C6)cyanoalkyl; (C1-C6)hydroxyalkyl; (C1-C4)alkoxy(C1-C6)alkyl; (C2-C6)alkenyl optionally substituted with halo, cyano, hydroxyl, or (C1-C4)alkyl; (C2-C6)alkynyl optionally substituted with halo, cyano, hydroxyl, or (C1-C4)alkyl; (C3-C5)cycloalkyl optionally substituted with halo, cyano, hydroxyl, or (C1-C4)alkyl; or b) unsubstituted or substituted benzyl wherein the substituents are independently 1 to 5 H, halo, nitro, cyano, hydroxyl, (C1-C6)alkyl, or (C1-C6)alkoxy; and
R5 is H; OH; F; Cl; or (C1-C6)alkoxy;
provided that: when R1, R2, R3, and R4 are isopropyl, then R5 is not hydroxyl;
when R5 is H, hydroxyl, methoxy, or fluoro, then at least one of R1, R2, R3, and R4 is not H;
when only one of R1, R2, R3, and R4 is methyl, and R5 is H or hydroxyl, then the remainder of R1, R2, R3, and R4 are not H;
when both R4 and one of R1, R2, and R3 are methyl, then R5 is neither H nor hydroxyl;
when R1, R2, R3, and R4 are all methyl, then R5 is not hydroxyl;
when R1, R2, and R3 are all H and R5 is hydroxyl, then R4 is not ethyl, n-propyl, n-butyl, allyl, or benzyl.
In certain embodiments of the invention, the activating ligand of the first and second polypeptides, the LIPC system, the polynucleotides, the vector, and/or the method described herein is a compound of the formula:
wherein X and X′ are independently O or S;
(a) substituted or unsubstituted phenyl wherein the substituents are independently 1-5 H, (C1-C4)alkyl, (C1-C4)alkoxy, (C2-C4)alkenyl, halo (F, Cl, Br, I), (C1-C4)haloalkyl, hydroxy, amino, cyano, or nitro; or
(b) substituted or unsubstituted 2-pyridyl, 3-pyridyl, or 4-pyridyl, wherein the substituents are independently 1-4 H, (C1-C4)alkyl, (C1-C4)alkoxy, (C2-C4)alkenyl, halo (F, Cl, Br, I), (C1-C4)haloalkyl, hydroxy, amino, cyano, or nitro;
i) when R9 and R10 are both H, or
ii) when either R9 or R10 are halo, (C1-C3)alkyl, (C1-C3)alkoxy(C1-C3)alkyl, or benzoyloxy(C1-C3)alkyl, or
iii) when R5 and R6 do not together form a linkage of the type (—OCHR9CHR10O—),
then the number of carbon atoms, excluding those of cyano substitution, for either or both of groups R1 or R2 is greater than 4, and the number of carbon atoms, excluding those of cyano substitution, for the sum of groups R1, R2, and R3 is 10, 11, or 12.
A more complete understanding of the present invention may be obtained by reference to the accompanying drawings, when considered in conjunction with the subsequent detailed description. The embodiments illustrated in the drawings are intended only to exemplify the invention and should not be construed as limiting the invention to the illustrated embodiments. Additional embodiments and configurations can provide further useful embodiments.
The invention provided herein uses components of EcR-RXR transcriptional switch systems (see e.g., PCT Publication Nos. WO 2001/070816, WO 2002/066612, WO 2002/066613, WO 2002/066614, WO 2002/066615, WO 2003/027266, WO 2003/027289, and WO 2005/108617, each of which is hereby incorporated herein by reference in its entirety) which can be expressed in, or by, a host cell to control, regulate or modulate association of fused protein components. One role of protein-protein interactions is to initiate cell signal transduction processes, such as by activating cytoplasmic and/or extracellular signaling domains or restoring functionality to a fragmented or split protein via receptor-ligand binding interactions. Thus, this naturally occurring system can be artificially modulated by driving the association of two inactive signaling domains via induced formation of a “bridge” between an EcR and an RXR component (in the presence of an EcR ligand) wherein the latter components have been incorporated with (i.e., fused to) the signaling domain polypeptides.
In certain embodiments, described herein are systems and methods relating to selective activation of cellular signaling domains via ligand-induced polypeptide coupling. The systems and methods provide a ligand induced polypeptide coupling system which allows for induction (e.g., modulation, control, regulation) of protein-protein interactions and (“on demand”) activation of signaling domains, or inactivation/inhibition of signaling domains.
Accordingly, disclosed herein are systems and methods that use protein components of a gene transcriptional switch system (expressed in a host cell) for inducing physical association with one another (via an activating ligand) to form a complex (i.e., induce protein-protein interactions) of other associated proteins or domains. Ligand induced protein association can, for example, initiate functions such as activating cytoplasmic and/or extracellular signaling domains in the presence of activating ligand. Thus, in the presence of activating ligand, two signaling domains that are normally inactive can be activated by bringing them together via a “bridge” between the EcR and USP/RXR components.
The use of the word “a” or “an” when used in conjunction with the term “comprising” in the claims and/or the specification may mean “one,” but it is also consistent with the meaning of “one or more,” “at least one,” and “one or more than one.”
The use of the term “for example” and its corresponding abbreviation “e.g.” (whether italicized or not) means that the specific terms cited are representative examples only (that is, specimens, samples, illustrations, models, etc) and embodiments of the invention are not intended to be limited to the specific examples referenced or cited unless explicitly stated otherwise.
The forward slash character (“/”), when used herein in reference to gene or polypeptide components (unless indicated otherwise) is an abbreviation for the words “and/or”. For example, unless specified otherwise, the term “USP/RXR” indicates a polypeptide that can have a mixture of components of both USP and RXR polypeptides or fragments thereof (e.g., a chimeric polypeptide), or USP polypeptide components or fragments thereof (e.g., domains) only, or RXR components or fragments thereof (e.g., domains) only.
As used in this specification and claim(s), the words “comprising” (and any form of comprising, such as “comprise” and “comprises”), “having” (and any form of having, such as “have” and “has”), “including” (and any form of including, such as “includes” and “include”) or “containing” (and any form of containing, such as “contains” and “contain”) are inclusive or open-ended and do not exclude additional, unrecited elements or method steps. It is contemplated that any embodiment discussed in this specification can be implemented with respect to any method, system, host cell, expression vector, or composition of the invention. Furthermore, systems, host cells, expression vectors, and/or compositions of the invention can be used to achieve methods of the invention.
“Synthetic” as used herein refers to compounds formed through a chemical process by human agency, as opposed to those of natural origin.
By “isolated” is meant the removal of a nucleic acid, peptide, or polypeptide from its natural environment. By “purified” is meant that a given nucleic acid, whether one that has been removed from nature (including genomic DNA and mRNA) or synthesized (including cDNA) and/or amplified under laboratory conditions, peptide, or polypeptide has been increased in purity, wherein “purity” is a relative term, not “absolute purity.” It is to be understood, however, that nucleic acids, peptides, and polypeptides may be formulated with diluents or adjuvants and still for practical purposes be isolated. For example, nucleic acids typically are mixed with an acceptable carrier or diluent when used for introduction into cells.
A “nucleic acid” is a polymeric compound comprised of covalently linked subunits called nucleotides. Nucleic acid includes polyribonucleic acid (RNA) and polydeoxyribonucleic acid (DNA), both of which may be single-stranded or double-stranded. DNA includes but is not limited to cDNA, genomic DNA, plasmid DNA, synthetic DNA, and semi-synthetic DNA. DNA may be linear, circular, or supercoiled.
A “nucleic acid molecule” refers to the phosphate ester polymeric form of ribonucleosides (adenosine, guanosine, uridine or cytidine; “RNA molecules”) or deoxyribonucleosides (deoxyadenosine, deoxyguanosine, deoxythymidine, or deoxycytidine; “DNA molecules”), or any phosphoester analogs thereof, such as phosphorothioates and thioesters, in either single-stranded form or a double-stranded helix. Double-stranded DNA-DNA, DNA-RNA and RNA-RNA helices are possible. The term nucleic acid molecule, and in particular DNA or RNA molecule, refers only to the primary and secondary structure of the molecule, and does not limit it to any particular tertiary forms. Thus, this term includes double-stranded DNA found, inter alia, in circular or linear DNA molecules (e.g., restriction fragments), plasmids, and chromosomes. In discussing the structure of particular double-stranded DNA molecules, sequences may be described herein according to the normal convention of indicating only the sequence in the 5′ to 3′ direction along the non-transcribed strand of DNA, i.e., the strand having a sequence homologous to the mRNA. A “recombinant DNA molecule” is a DNA molecule that has undergone a molecular biological manipulation.
The term “fragment” will be understood to mean, in reference to polynucleotides, a nucleotide sequence of reduced length relative to the reference nucleic acid and comprising, over the common portion, a nucleotide sequence identical to the reference nucleic acid. Such a nucleic acid fragment, according to the invention may be, where appropriate, included in a larger polynucleotide of which it is a constituent. Such fragments comprise, or alternatively consist of, oligonucleotides ranging in length from at least 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 105, 110, 120, 125, 130, 135, 140, 145, 150, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 1250, 1500, 2000, 2500, 3000, 3500, 4000, 4500, 5000 or 6000 consecutive nucleotides of a nucleic acid according to the invention. In certain embodiments, such fragments may comprise, or alternatively consist of, oligonucleotides of any integer in length ranging, for example, from 6 to 6,000 nucleotides. In certain embodiments such fragments may be any integer in length which is evenly divisible by 3 (e.g., such that the polynucleotide encodes a full or partial polypeptide open reading frame). In certain embodiments such fragments may be any integer in length (e.g., such that the polynucleotide may be used as a PCR primer or other hybridizable fragment or for use in generating synthetic or restriction fragment length polynucleotides).
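The length considerations described above can be expressed as a simple check. The Python sketch below is illustrative only: the function name is hypothetical, and the default bounds of 6 and 6,000 nucleotides are the example range recited above rather than limits on the invention.

```python
def describe_fragment_length(length, minimum=6, maximum=6000):
    """Report whether a fragment length (in nucleotides) falls within the
    example range and whether it corresponds to a whole number of codons."""
    if not isinstance(length, int):
        raise TypeError("length must be an integer number of nucleotides")
    whole_codons = length % 3 == 0
    return {
        "in_range": minimum <= length <= maximum,
        "whole_codons": whole_codons,
        # Codon count is only meaningful when the length is divisible by 3.
        "codons": length // 3 if whole_codons else None,
    }


print(describe_fragment_length(21))  # divisible by 3: seven codons
print(describe_fragment_length(20))  # a possible primer length, not whole codons
```

A length that is not divisible by 3 is not excluded by the definition; it simply cannot encode a whole number of codons, which is why the sketch reports the two properties separately.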
As used herein, an “isolated nucleic acid fragment” is a polymer of RNA or DNA that is single- or double-stranded, optionally containing synthetic, non-natural or altered nucleotide bases. An isolated nucleic acid fragment in the form of a polymer of DNA may be comprised of one or more segments of cDNA, genomic DNA or synthetic DNA.
A “gene” refers to an assembly of nucleotides that encodes a polypeptide, and includes cDNA and genomic DNA nucleic acids. “Gene” also refers to a nucleic acid fragment that expresses a specific protein or polypeptide, including regulatory sequences preceding (5′ non-coding sequences) and following (3′ non-coding sequences) the coding sequence. “Native gene” refers to a gene as found in nature with its own regulatory sequences. “Chimeric gene” refers to any gene that is not a native gene, comprising regulatory and/or coding sequences that are not found together in nature. Accordingly, a chimeric gene may comprise regulatory sequences and coding sequences that are derived from different sources, or regulatory sequences and coding sequences derived from the same source, but arranged in a manner different than that found in nature. A chimeric gene may comprise coding sequences derived from different sources and/or regulatory sequences derived from different sources. “Endogenous gene” refers to a native gene in its natural location in the genome of an organism. A “foreign” gene or “heterologous” gene refers to a gene not normally found in a host organism or cell, but that is introduced into the host organism or cell by gene transfer. Foreign genes can comprise, without limitation, native genes inserted into a non-native organism and chimeric genes. A “transgene” is a foreign or heterologous gene that has been introduced into the genome of a host organism or cell. “Heterologous” DNA refers to DNA not naturally located in the cell, or in a chromosomal site of a cell's genome. In some embodiments, heterologous DNA includes a gene foreign to the cell.
“Polynucleotide” or “oligonucleotide” as used herein refers to a polymeric form of nucleotides of any length, either ribonucleotides or deoxyribonucleotides. This term refers only to the primary structure of the molecule. Thus, this term includes double and single stranded DNA, triplex DNA, as well as double and single stranded RNA. It also includes modified, for example, by methylation and/or by capping, and unmodified forms of the polynucleotide. The term is also meant to include molecules that include non-naturally occurring or synthetic nucleotides as well as nucleotide analogs. In certain embodiments, an oligonucleotide is hybridizable to a genomic DNA molecule, a cDNA molecule, a plasmid DNA or an mRNA molecule. Oligonucleotides can be labeled (e.g., with 32P-nucleotides or nucleotides to which a label, such as biotin, has been covalently conjugated). In some embodiments, a labeled oligonucleotide can be used as a probe to detect the presence of a nucleic acid. Oligonucleotides (one or both of which may be labeled) can be used as PCR primers, either for cloning full length or a fragment of a nucleic acid, or to detect the presence of a nucleic acid. An oligonucleotide can also be used to form a triple helix with a DNA molecule. In certain embodiments, oligonucleotides are prepared synthetically, for example, on a nucleic acid synthesizer. Accordingly, oligonucleotides can be prepared with non-naturally occurring phosphoester analog bonds, such as thioester bonds, etc.
Nucleic acids and/or nucleic acid sequences are “homologous” when they are derived, naturally or artificially, from a common ancestral nucleic acid or nucleic acid sequence. Proteins and/or protein sequences are homologous when their encoding DNAs are derived, naturally or artificially, from a common ancestral nucleic acid or nucleic acid sequence. The homologous molecules can be termed homologs. For example, any naturally occurring proteins, as described herein, can be modified by any available mutagenesis method. When expressed, this mutagenized nucleic acid encodes a polypeptide that is homologous to the protein encoded by the original nucleic acid. Homology is generally inferred from sequence identity between two or more nucleic acids or proteins (or sequences thereof). The precise percentage of identity between sequences that is useful in establishing homology varies with the nucleic acid and protein at issue, but as little as 25% sequence identity is routinely used to establish homology. Higher levels of sequence identity, e.g., 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95% or 99% or more can also be used to establish homology. Methods for determining sequence identity percentages (e.g., BLASTP and BLASTN using default parameters) are described herein and are generally available.
A DNA “coding sequence” is a double-stranded DNA sequence that is transcribed and translated into a polypeptide in a cell in vitro or in vivo when placed under the control of appropriate regulatory sequences. “Suitable regulatory sequences” refer to nucleotide sequences located upstream (5′ non-coding sequences), within, or downstream (3′ non-coding sequences) of a coding sequence, and which influence the transcription, RNA processing or stability, or translation of the associated coding sequence. Regulatory sequences may include promoters, translation leader sequences, introns, polyadenylation recognition sequences, RNA processing site, effector binding site and stem-loop structure. The boundaries of the coding sequence are determined by a start codon at the 5′ (amino) terminus and a translation stop codon at the 3′ (carboxyl) terminus. A coding sequence can include, but is not limited to, prokaryotic sequences, cDNA from mRNA, genomic DNA sequences, and synthetic DNA sequences. If the coding sequence is intended for expression in a eukaryotic cell, a polyadenylation signal and transcription termination sequence will usually be located 3′ to the coding sequence.
“Open reading frame,” abbreviated ORF, means a length of nucleic acid sequence, either DNA, cDNA or RNA, that comprises a translation start signal or initiation codon, such as an ATG or AUG, and a termination codon, and can be potentially translated into a polypeptide sequence.
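By way of non-limiting illustration, the open reading frame definition above can be sketched in Python as a scan for an initiation codon followed in-frame by a termination codon. The function name `find_orfs`, the `min_codons` parameter, and the choice to scan only the given strand and to report only the first initiation codon per termination codon are assumptions made for this sketch and are not taken from the specification.

```python
# Illustrative sketch only; names and parameters are hypothetical.
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard termination codons (DNA alphabet)


def find_orfs(seq, min_codons=2):
    """Return (start, end, orf) tuples for ORFs on the given strand.

    start/end are 0-based and end-exclusive; each ORF runs from an ATG
    through its first in-frame stop codon. RNA input is accepted by
    treating U as T, so AUG is recognized as the initiation codon.
    """
    dna = seq.upper().replace("U", "T")
    orfs = []
    for frame in range(3):  # the three forward reading frames
        i = frame
        while i + 3 <= len(dna):
            if dna[i:i + 3] == "ATG":
                # Walk codon by codon until the first in-frame stop codon.
                for j in range(i + 3, len(dna) - 2, 3):
                    if dna[j:j + 3] in STOP_CODONS:
                        if (j + 3 - i) // 3 >= min_codons:
                            orfs.append((i, j + 3, dna[i:j + 3]))
                        i = j  # resume scanning after this stop codon
                        break
            i += 3
    return orfs


print(find_orfs("CCATGAAATAGGG"))
```

Because the coordinates are end-exclusive, the length of every reported ORF is a multiple of three, consistent with fragments that encode a full or partial polypeptide reading frame.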
“Homologous recombination” refers to the insertion of a foreign DNA sequence into another DNA molecule (e.g., insertion of a vector in a chromosome). In some embodiments, the vector targets a specific chromosomal site for homologous recombination. For specific homologous recombination, the vector will contain sufficiently long regions of homology to sequences of the chromosome to allow complementary binding and incorporation of the vector into the chromosome. Longer regions of homology, and greater degrees of sequence similarity, may increase the efficiency of homologous recombination.
A “vector” or “expression vector” is any modality for the cloning of and/or transfer of a nucleic acid into a host cell. A vector may be a replicon to which another DNA segment may be attached so as to bring about the replication of the attached segment. A “replicon” is any genetic element (e.g., plasmid, phage, cosmid, chromosome, virus) that functions as an autonomous unit of DNA replication in a cell. The term “vector” includes both viral and nonviral means for introducing the nucleic acid into a cell in vitro, ex vivo or in vivo.
The term “plasmid” refers to an extra chromosomal element often carrying a gene that is not part of the central metabolism of the cell, and may be in the form of circular double-stranded DNA molecules. Such elements may be autonomously replicating sequences, genome integrating sequences, phage or nucleotide sequences, linear, circular, or supercoiled, of a single- or double-stranded DNA or RNA, derived from any source, in which a number of nucleotide sequences have been joined or recombined into a unique construction which is capable of introducing a promoter fragment and DNA sequence for a selected gene product along with appropriate 3′ untranslated sequence into a cell.
Vectors may be introduced into the desired host cells by methods known in the art, e.g., transfection, electroporation, microinjection, transduction, cell fusion, DEAE dextran, calcium phosphate precipitation, lipofection (liposome fusion), use of a gene gun, or a DNA vector transporter (see, e.g., Wu et al., 1992, J. Biol. Chem. 267: 963-967; Wu and Wu, 1988, J. Biol. Chem. 263: 14621-14624; and Hartmut et al., Canadian Patent Application No. 2,012,311, filed Mar. 15, 1990, each of which is incorporated by reference herein in its entirety).
It is also possible to introduce a vector in vivo as a naked DNA plasmid (see, e.g., U.S. Pat. Nos. 5,693,622, 5,589,466 and 5,580,859, each of which is incorporated by reference herein in its entirety). Receptor-mediated DNA delivery approaches can also be used (see, e.g., Curiel et al., 1992, Hum. Gene Ther. 3: 147-154; and Wu and Wu, 1987, J. Biol. Chem. 262: 4429-4432, each of which is incorporated by reference herein in its entirety).
The term “transfection” means the uptake of exogenous or heterologous RNA or DNA by a cell. A cell has been “transfected” by exogenous or heterologous RNA or DNA when such RNA or DNA has been introduced inside the cell. A cell has been “transformed” by exogenous or heterologous RNA or DNA when the transfected RNA or DNA effects a phenotypic change. The transforming RNA or DNA can be integrated (covalently linked) into chromosomal DNA making up the genome of the cell.
“Transformation” refers to the transfer of a nucleic acid fragment into the genome of a host organism, resulting in genetically stable inheritance. Host organisms containing the transformed nucleic acid fragments are referred to as “transgenic” or “recombinant” or “transformed” organisms.
The term “selectable marker” means an identifying factor, usually an antibiotic or chemical resistance gene, that is able to be selected for based upon the marker gene's effect, e.g., resistance to an antibiotic, resistance to a herbicide, colorimetric markers, enzymes, fluorescent markers, and the like, wherein the effect is used to track the inheritance of a nucleic acid of interest and/or to identify a cell or organism that has inherited the nucleic acid of interest. Examples of selectable marker genes known and used in the art include, but are not limited to: genes providing resistance to ampicillin, streptomycin, gentamycin, kanamycin, hygromycin, bialaphos herbicide, sulfonamide, and the like; and genes that are used as phenotypic markers, for example, anthocyanin regulatory genes, isopentenyl transferase gene, and the like.
The term “reporter gene” means a nucleic acid encoding an identifying factor that is able to be identified based upon the reporter gene's effect, wherein the effect is used to track the inheritance of a nucleic acid of interest, to identify a cell or organism that has inherited the nucleic acid of interest, and/or to measure gene expression induction or transcription. Examples of reporter genes known and used in the art include, but are not limited to: luciferase (Luc), green fluorescent protein (GFP), chloramphenicol acetyltransferase (CAT), β-galactosidase (LacZ), β-glucuronidase (Gus), and the like. Selectable marker genes may also be considered reporter genes.
“Operably linked” as used herein refers to the physical and/or functional linkage of a DNA segment to another DNA segment in such a way as to allow the segments to function in their intended manners. A DNA sequence encoding a gene product is operably linked to a regulatory sequence when it is linked to the regulatory sequence, such as, for example, promoters, enhancers and/or silencers, in a manner which allows modulation of transcription of the DNA sequence, directly or indirectly. For example, a DNA sequence is operably linked to a promoter when it is ligated to the promoter downstream with respect to the transcription initiation site of the promoter, in the correct reading frame with respect to the transcription initiation site, and allows transcription elongation to proceed through the DNA sequence. An enhancer or silencer is operably linked to a DNA sequence coding for a gene product when it is ligated to the DNA sequence in such a manner as to increase or decrease, respectively, the transcription of the DNA sequence. Enhancers and silencers may be located upstream, downstream or embedded within the coding regions of the DNA sequence. A DNA for a signal sequence is operably linked to DNA coding for a polypeptide if the signal sequence is expressed as a preprotein that participates in the secretion of the polypeptide. The terms “cassette,” “expression cassette,” and “gene expression cassette” refer to a segment of DNA that can be inserted into a nucleic acid or polynucleotide (e.g., at specific restriction sites or by homologous recombination). The segment of DNA may comprise a polynucleotide that encodes a polypeptide of interest, and the cassette and restriction sites may be designed to ensure insertion of the cassette in the proper reading frame for transcription and translation.
“Transformation cassette” refers to a vector comprising a polynucleotide that encodes a polypeptide of interest and having elements in addition to the polynucleotide that facilitate transformation of a particular host cell. Cassettes, expression cassettes, gene expression cassettes and transformation cassettes of the invention may also comprise elements that allow for enhanced expression of a polynucleotide encoding a polypeptide of interest in a host cell. These elements may include, but are not limited to: a promoter, a minimal promoter, an enhancer, a response element, a terminator sequence, a polyadenylation sequence, and the like. “Regulatory region” means a nucleic acid sequence that regulates the expression of a second nucleic acid sequence. A regulatory region may include sequences which are naturally responsible for expressing a particular nucleic acid (a homologous region) or may include sequences of a different origin that are responsible for expressing different proteins or even synthetic proteins (a heterologous region). In particular, the sequences can be sequences of prokaryotic, eukaryotic, or viral genes or derived sequences that stimulate or repress transcription of a gene in a specific or non-specific manner and in an inducible or non-inducible manner. Regulatory regions include origins of replication, RNA splice sites, promoters, enhancers, transcriptional termination sequences, and signal sequences which direct the polypeptide into the secretory pathways of the target cell. A regulatory region from a “heterologous source” is a regulatory region that is not naturally associated with the expressed nucleic acid. Included among the heterologous regulatory regions are regulatory regions from a different species, regulatory regions from a different gene, hybrid regulatory sequences, and regulatory sequences which do not occur in nature.
“Peptide” is used herein to refer to a compound containing two or more amino acid residues linked in a chain. A “polypeptide” is a polymeric compound comprised of covalently linked amino acid residues. Amino acids have the following general structure: H2N-CH(R)-COOH, wherein R denotes the side chain.
Amino acids are classified into seven groups on the basis of the side chain R: (1) aliphatic side chains, (2) side chains containing a hydroxylic (OH) group, (3) side chains containing sulfur atoms, (4) side chains containing an acidic or amide group, (5) side chains containing a basic group, (6) side chains containing an aromatic ring, and (7) proline, an imino acid in which the side chain is fused to the amino group.
A “protein” comprises a polypeptide. An “isolated polypeptide” or “isolated protein” is a polypeptide or protein that is substantially free of those compounds that are normally associated therewith in its natural state (e.g., other proteins or polypeptides, nucleic acids, carbohydrates, lipids). “Isolated” is not meant to exclude artificial or synthetic mixtures with other compounds, or the presence of impurities which do not interfere with biological activity, and which may be present, for example, due to incomplete purification, addition of stabilizers, or compounding into a pharmaceutically acceptable preparation.
A “substitution mutant polypeptide” or a “substitution mutant” as used herein means a polypeptide comprising a substitution or substitutions (or consisting of a substitution or substitutions) of about or at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, or more wild-type or naturally occurring amino acids with a different amino acid relative to the wild-type or naturally occurring polypeptide. A substitution mutant polypeptide comprising only one (1) amino acid substitution compared to the wild-type or naturally occurring polypeptide may be referred to as a “point mutant” or a “single point mutant” polypeptide.
When a substitution mutant polypeptide includes, or consists of, a substitution of one (1) or more wild-type or naturally occurring amino acids, this substitution may comprise, or consist of, either an equivalent number of wild-type or naturally occurring amino acids deleted for the substitution, i.e., two wild-type or naturally occurring amino acids replaced with two non-wild-type or non-naturally occurring amino acids, or a non-equivalent number of wild-type amino acids deleted for the substitution, e.g., two wild-type amino acids replaced with one non-wild-type amino acid (a substitution+deletion mutation), or two wild-type amino acids replaced with three non-wild-type amino acids (a substitution+insertion mutation). Substitution mutants may be described using an abbreviated nomenclature system to indicate the amino acid residue and number replaced within the reference polypeptide sequence and the new substituted amino acid residue. For example, a substitution mutant in which the twentieth (20th) amino acid residue of a polypeptide is substituted may be abbreviated as “x20z,” wherein “x” is the parent, normally occurring or naturally occurring amino acid to be replaced, “20” is the amino acid residue position or number referenced within the polypeptide, and “z” is the newly substituted amino acid. Therefore, a substitution mutant abbreviated interchangeably as “E20A” or “Glu20Ala” indicates that the substitution mutant comprises an alanine residue (typically abbreviated in the art as “A” or “Ala”) in place of a glutamic acid (typically abbreviated in the art as “E” or “Glu”) at position 20 of the polypeptide.
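By way of non-limiting illustration, the abbreviated nomenclature described above (“x20z”, e.g., “E20A” or “Glu20Ala”) can be parsed and applied to a sequence as sketched below in Python. The function names `parse_substitution` and `apply_substitution` are hypothetical, positions are treated as 1-based, and only single-residue substitutions among the twenty standard amino acids are handled.

```python
import re

# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}
ONE_LETTER = set(THREE_TO_ONE.values())
_RESIDUE = r"([A-Z][a-z]{2}|[A-Z])"  # three-letter or one-letter code
_LABEL = re.compile(_RESIDUE + r"(\d+)" + _RESIDUE)


def _to_one_letter(code):
    """Normalize a one- or three-letter residue code to one letter."""
    one = THREE_TO_ONE.get(code, code)
    if one not in ONE_LETTER:
        raise ValueError(f"unknown amino acid code: {code!r}")
    return one


def parse_substitution(label):
    """Parse 'E20A' or 'Glu20Ala' into (parent, position, replacement)."""
    m = _LABEL.fullmatch(label)
    if m is None:
        raise ValueError(f"not a substitution label: {label!r}")
    parent, pos, new = m.groups()
    return _to_one_letter(parent), int(pos), _to_one_letter(new)


def apply_substitution(sequence, label):
    """Return the sequence with the substitution applied (1-based position)."""
    parent, pos, new = parse_substitution(label)
    if not 1 <= pos <= len(sequence):
        raise ValueError(f"position {pos} is outside the sequence")
    if sequence[pos - 1] != parent:
        raise ValueError(
            f"expected {parent} at position {pos}, found {sequence[pos - 1]}"
        )
    return sequence[:pos - 1] + new + sequence[pos:]


reference = "M" * 19 + "EK"  # glutamic acid at position 20
print(apply_substitution(reference, "Glu20Ala"))
```

Checking the parent residue before substituting guards against applying a label to the wrong reference sequence or with an off-by-one position.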
“Fragment,” when used in relation to a polypeptide, as used herein means a polypeptide whose amino acid sequence is shorter than that of a reference polypeptide and which comprises, or consists of, over the entire portion shared with the reference polypeptide, an identical amino acid sequence (unless explicitly stated otherwise, e.g., “a fragment 95% identical to . . . ”). Such fragments may, where appropriate, be included in a larger polypeptide of which they are a part. Such fragments of a polypeptide according to the invention may comprise, or alternatively consist of, a polymer ranging in length from at least 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 105, 110, 120, 125, 130, 135, 140, 145, 150, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 1250, 1500, 2000, 2500, 3000, 3500, 4000, 4500, or 5000 amino acid residues. In certain embodiments, such fragments may comprise, or alternatively consist of, amino acid polymers (i.e., peptides, polypeptides) of any integer in length ranging, for example, from 4 to 5,000 residues.
“Truncate” or “truncated,” when used in relation to a polypeptide, is a polypeptide fragment whose amino acid sequence is shorter (at either the N-terminus, C-terminus, or both N- and C- termini) compared to that of a reference polypeptide (e.g., such as may result from a deletion or enzymatic processing of amino acid residues).
A “variant” of a polypeptide or protein is any analogue, fragment, truncation, derivative, or mutant which is derived from, or differing from, a similar polypeptide or protein but which retains at least one biological property of the original, or reference, polypeptide or protein. Different variants of the polypeptide or protein may exist in nature. These variants may be naturally occurring allelic variations characterized by differences in the nucleotide sequences of the structural gene coding for the protein, or may involve differential splicing or post-translational modification, or variants may be artificially (e.g., genetically, synthetically, recombinantly) engineered. The skilled artisan can produce variants having single or multiple amino acid substitutions, deletions, additions, or replacements. These variants may include, inter alia: (a) variants in which one or more amino acid residues are substituted with conservative or non-conservative amino acids, (b) variants in which one or more amino acids are added to the polypeptide or protein, (c) variants in which one or more of the amino acids includes a substituent group, and/or (d) variants in which the polypeptide or protein is fused with another polypeptide. The techniques for obtaining these variants, including genetic (suppressions, deletions, mutations, etc.), chemical, and enzymatic techniques, are known to persons having ordinary skill in the art. A “functional variant” or “functional fragment” of a protein disclosed herein retains at least a portion of the function of a reference protein. For example, a “functional variant” or “functional fragment” of a protein can retain at least about 10%, about 20%, about 30%, about 40%, about 50%, about 60%, about 70%, about 80%, about 90%, or about 100% of the biological activity or function of the reference protein to which it is compared.
In addition, a “functional variant” or “functional fragment” of a protein can, for example, comprise, or consist of, the amino acid sequence of the reference protein with at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 conservative amino acid substitutions per every 100 consecutive amino acid residues. The phrase “conservative amino acid substitution” or “conservative mutation” refers to the replacement of one amino acid by another amino acid with a common property (e.g., hydrophobicity, hydrophilicity, ionic charge, basic, acidic, polar, non-polar, etc). A functional way to define common properties between individual amino acids is to analyze the normalized frequencies of amino acid changes between corresponding proteins of homologous organisms (Schulz, G. E. and Schirmer, R. H., Principles of Protein Structure, Springer-Verlag, New York (1979), which is incorporated by reference herein in its entirety). According to such analyses, groups of amino acids may be defined where amino acids within a group exchange preferentially with each other, and therefore resemble each other most in their impact on the overall protein structure (Schulz, G. E. and Schirmer, R. H., supra). Examples of conservative mutations include amino acid substitutions of amino acids within the sub-groups above, for example, lysine for arginine and vice versa such that a positive charge may be maintained; glutamic acid for aspartic acid and vice versa such that a negative charge may be maintained; serine for threonine such that a free —OH can be maintained; and glutamine for asparagine such that a free —NH2 can be maintained. In some instances, it may be preferable for the conservative amino acid substitution to not interfere with, or inhibit the biological activity of, the functional variant. 
In some instances the conservative amino acid substitution may enhance the biological activity of the functional variant, such that the biological activity of the functional variant is increased as compared to the parent molecule. In other instances, it may be desirable for the conservative substitution to interfere with, eliminate, or reduce at least one or more biological activities.
Alternatively or additionally, functional variants can comprise, or consist of, the amino acid sequence of the reference protein with at least one non-conservative amino acid substitution. “Non-conservative mutations” involve amino acid substitutions between different groups (i.e., wherein the original and substituted amino acids have a different chemical property, such as differences in properties relating to hydrophobicity, hydrophilicity, ionic charge, polar, non-polar, acidic, basic properties, etc.). A few examples of non-conservative substitutions include lysine (basic) for tryptophan (non-polar) or for glutamic acid (acidic), aspartic acid (acidic) for tyrosine (polar) or for histidine (basic), or phenylalanine (non-polar) for arginine (basic) or for serine (polar), etc. In some instances, it may be preferable for the non-conservative amino acid substitution to not interfere with, or inhibit the biological activity of, the functional variant. In some instances the non-conservative amino acid substitution may enhance the biological activity of the functional variant, such that the biological activity of the functional variant is increased as compared to the parent molecule. In other instances, it may be desirable for the non-conservative substitution to interfere with, eliminate, or reduce at least one or more biological activities.
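By way of non-limiting illustration, the distinction between conservative and non-conservative substitutions can be sketched in Python as a lookup of whether two residues share a property group. The table below is a deliberately coarse assumption built from the residues named in the examples above, with threonine, asparagine and glutamine labelled polar as a further assumption. It is not a complete or authoritative classification, and the name `classify_substitution` is hypothetical.

```python
# Simplified, illustrative property groups (an assumption for this sketch).
PROPERTY_GROUP = {
    "K": "basic", "R": "basic", "H": "basic",
    "D": "acidic", "E": "acidic",
    "S": "polar", "T": "polar", "Y": "polar", "N": "polar", "Q": "polar",
    "W": "non-polar", "F": "non-polar",
}


def classify_substitution(original, substituted):
    """Label a substitution by whether both residues share a property group."""
    if original == substituted:
        raise ValueError("identical residues are not a substitution")
    for residue in (original, substituted):
        if residue not in PROPERTY_GROUP:
            raise ValueError(f"residue {residue!r} is not in the sketch table")
    same_group = PROPERTY_GROUP[original] == PROPERTY_GROUP[substituted]
    return "conservative" if same_group else "non-conservative"


for pair in [("K", "R"), ("D", "E"), ("K", "W"), ("F", "S")]:
    print(pair, classify_substitution(*pair))
```

A finer grouping, for example one that separates hydroxyl-containing from amide-containing side chains, would classify some of these polar-for-polar exchanges differently, which is why the table is presented only as an assumption.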
A “heterologous protein” refers to a protein not naturally produced in the cell. A “mature protein” refers to a post-translationally processed polypeptide, i.e., one from which any pre- or propeptides present in the primary translation product have been removed. “Precursor” protein refers to the primary product of translation of mRNA, i.e., with pre- and propeptides still present. Pre- and propeptides may be but are not limited to signal peptides or intracellular localization signals.
The term “signal peptide” refers to an amino terminal polypeptide preceding the secreted mature protein. The signal peptide is cleaved from and is therefore not present in the mature protein. Signal peptides have the function of directing and translocating secreted proteins across cell membranes. Signal peptide is also referred to as signal protein.
A “signal sequence” is included at the beginning of the coding sequence of a protein to be expressed on the surface of a cell. This sequence encodes a signal peptide, N-terminal to the mature polypeptide, that directs the host cell to translocate the polypeptide. The term “translocation signal sequence” may also be used to refer to this type of signal sequence. Translocation signal sequences can be found associated with a variety of proteins native to eukaryotes and prokaryotes, and are often functional in both types of organisms.
The term “homology” refers to the percent of identity between two polynucleotide or two polypeptide molecules. The correspondence between the sequence of one molecule and that of another can be determined by techniques known in the art. For example, homology can be determined by a direct comparison of the sequence information between two polypeptide molecules by aligning the sequence information and using readily available computer programs. Alternatively, homology can be determined by hybridization of polynucleotides under conditions that form stable duplexes between homologous regions, followed by digestion with single-stranded-specific nuclease(s) and size determination of the digested fragments.
Accordingly, the term “sequence similarity” in all its grammatical forms refers to the degree of identity, homology, or correspondence between nucleic acid or amino acid sequences of proteins that may or may not share a common evolutionary origin (see Reeck et al., 1987, Cell 50:667, which is incorporated by reference herein in its entirety). In certain embodiments, two DNA sequences are “substantially homologous” or “substantially similar” when at least about 50%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 97%, at least about 98%, or at least about 99% of the nucleotides match over the defined length of the DNA sequences. Sequences that are substantially homologous can be identified by comparing the sequences using standard software available in sequence data banks, or in a Southern hybridization experiment under, for example, stringent conditions as understood by those of ordinary skill in the art. For example, stringent hybridization conditions may comprise, or alternatively consist of, hybridization of either target, “probe”, or detection-reagent DNA to filter-bound DNA in 6x sodium chloride/sodium citrate (SSC) at about 45 degrees Celsius, followed by one or more washes in 0.2x SSC, 0.1% SDS at about 50-65 degrees Celsius, followed by one or more washes in 0.1x SSC, 0.2% SDS at about 68 degrees Celsius; or other stringent hybridization conditions which are known to those of skill in the art (see, for example, Ausubel, F. M. et al., eds., 1989 Current Protocols in Molecular Biology, Greene Publishing Associates, Inc., and John Wiley & Sons Inc., New York, at pages 6.3.1-6.3.6 and 2.10.3). Polynucleotides encoding such polypeptides are also encompassed by the invention.
The terms “identical” or “sequence identity” in the context of two nucleic acid sequences or amino acid sequences of polypeptides refer to the residues in the two sequences which are the same when aligned for maximum correspondence over a specified comparison window. A “comparison window”, as used herein, refers to a segment of at least about 10, at least about 20, at least about 50, at least about 100, at least about 200, at least about 300, at least about 500, or at least about 1000 residues in which a sequence may be compared to a reference sequence of the same number of contiguous positions after the two sequences are aligned optimally. Methods of alignment of sequences for comparison are well-known in the art. Optimal alignment of sequences for comparison may be conducted by the local homology algorithm of Smith and Waterman (1981) Adv. Appl. Math. 2:482, incorporated by reference herein in its entirety; by the alignment algorithm of Needleman and Wunsch (1970) J. Mol. Biol. 48:443, incorporated by reference herein in its entirety; by the search for similarity method of Pearson and Lipman (1988) Proc. Natl. Acad. Sci. U.S.A. 85:2444, incorporated by reference herein in its entirety; or by computerized implementations of these algorithms (including, but not limited to, CLUSTAL in the PC/Gene program by Intelligenetics, Mountain View, Calif., and GAP, BESTFIT, BLAST, FASTA, and TFASTA in the Wisconsin Genetics Software Package, Genetics Computer Group (GCG), 575 Science Dr., Madison, Wis., U.S.A.). The CLUSTAL program is well described by Higgins and Sharp (1988) Gene 73:237-244 and Higgins and Sharp (1989) CABIOS 5:151-153; see also Corpet et al. (1988) Nucleic Acids Res. 16:10881-10890; Huang et al. (1992) Computer Applications in the Biosciences 8:155-165; and Pearson et al. (1994) Methods in Molecular Biology 24:307-331, each of which is incorporated by reference herein in its entirety.
In addition to computer software-based alignments, alignments may also be performed by manual inspection and manual alignment.
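By way of non-limiting illustration, a global alignment in the style of the Needleman and Wunsch algorithm referenced above can be sketched in Python as follows. The scoring values (match +1, mismatch -1, linear gap penalty -1) and the function name `needleman_wunsch` are assumptions chosen for this sketch. They are not the defaults of any program named above, and practical implementations typically use substitution matrices and affine gap penalties.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Globally align a and b; return (aligned_a, aligned_b, score)."""
    n, m = len(a), len(b)
    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + pair,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,       # gap in b
                score[i][j - 1] + gap,       # gap in a
            )
    # Trace back from the bottom-right cell to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        pair = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + pair:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b)), score[n][m]


aligned_a, aligned_b, total = needleman_wunsch("ACGT", "AGT")
print(aligned_a)
print(aligned_b)
print(total)
```

When several alignments share the optimal score, the traceback order used here (diagonal first, then a gap in the second sequence) determines which one is returned.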
In one class of embodiments, polypeptides are 70%, at least 70%, 75%, at least 75%, 80%, at least 80%, 85%, at least 85%, 90%, at least 90%, 95%, at least 95%, 97%, at least 97%, 98%, at least 98%, 99%, at least 99%, or 100% identical to a reference polypeptide, or a fragment thereof (e.g., as measured by BLASTP or CLUSTAL, or other alignment software, using default parameters). Similarly, nucleic acids can also be described with reference to a starting nucleic acid, e.g., they can be 50%, at least 50%, 60%, at least 60%, 70%, at least 70%, 75%, at least 75%, 80%, at least 80%, 85%, at least 85%, 90%, at least 90%, 95%, at least 95%, 97%, at least 97%, 98%, at least 98%, 99%, at least 99%, or 100% identical to a reference nucleic acid or a fragment thereof (e.g., as measured by BLASTN or CLUSTAL, or other alignment software, using default parameters). When one molecule is said to have a certain percentage of sequence identity with a larger molecule, it means that when the two molecules are optimally aligned, said percentage of residues in the smaller molecule finds a matching residue in the larger molecule in accordance with the order by which the two molecules are optimally aligned, and the “%” (percent) identity is calculated in accord with the length of the smaller molecule.
The term “substantially identical” as applied to nucleic acid or amino acid sequences means that a nucleic acid or amino acid sequence comprises, or consists of, a sequence that has 70%, at least 70%, 75%, at least 75%, 80%, at least 80%, 85%, at least 85%, 90%, at least 90%, 95%, at least 95%, 97%, at least 97%, 98%, at least 98%, 99%, or at least 99% or 100% sequence identity, compared to a reference sequence. As indicated above, sequence identity may be calculated, for example, using programs well-known and routinely used by those of ordinary skill in the art. For example, the BLASTN program (for nucleotide sequences) uses as defaults a word length (W) of 11, an expectation (E) of 10, M=5, N=−4, and a comparison of both strands. For amino acid sequences, the BLASTP program uses as defaults a word length (W) of 3, an expectation (E) of 10, and the BLOSUM62 scoring matrix (see Henikoff & Henikoff, Proc. Natl. Acad. Sci. USA 89:10915 (1992), incorporated by reference herein in its entirety). Percentage of sequence identity is determined by comparing two optimally aligned sequences over a comparison window, wherein the portion of the polynucleotide sequence in the comparison window may comprise additions or deletions (i.e., gaps) as compared to the reference sequence (which does not comprise additions or deletions) for optimal alignment of the two sequences. The percentage is calculated by determining the number of positions at which the identical nucleic acid base or amino acid residue occurs in both sequences to yield the number of matched positions, dividing the number of matched positions by the total number of positions in the window of comparison and multiplying the result by 100 to yield the percentage of sequence identity.
Preferably, the substantial identity exists over a region of the sequences that is at least about 10, at least about 20, at least about 50, at least about 100, at least about 200, at least about 300, at least about 500, or at least about 1000 residues in length. In a most preferred embodiment, the sequences are substantially identical over the entire length of the coding region.
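By way of illustration only, the percent identity calculation described above may be sketched as follows. This is a minimal sketch that assumes the two sequences have already been optimally aligned by one of the referenced programs and that gaps are written as “-”. The function name percent_identity and its denominator argument are hypothetical names introduced for this example; the “window” option divides the matched positions by the total number of positions in the comparison window, and the “shorter” option divides by the length of the smaller molecule.

```python
def percent_identity(aligned_a: str, aligned_b: str, denominator: str = "window") -> float:
    """Percent identity of two pre-aligned sequences (gaps written as '-').

    denominator="window":  matched positions / positions in the comparison window
    denominator="shorter": matched positions / ungapped length of the shorter sequence
    Both names are illustrative assumptions, not terms from any alignment program.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    # A matched position holds the identical residue in both sequences;
    # a gap aligned against a gap is not counted as a match.
    matches = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")

    if denominator == "window":
        total = len(aligned_a)
    elif denominator == "shorter":
        total = min(len(aligned_a.replace("-", "")), len(aligned_b.replace("-", "")))
    else:
        raise ValueError("denominator must be 'window' or 'shorter'")

    if total == 0:
        raise ValueError("nothing to compare")
    return 100.0 * matches / total


# Toy alignment of ten positions, seven of which match.
a = "ACGTACGT-A"
b = "ACGAAC-TTA"
print(percent_identity(a, b))             # identity over the comparison window
print(percent_identity(a, b, "shorter"))  # identity relative to the shorter sequence
```

The two denominators give different values for the same alignment, so a stated percentage is meaningful only together with the convention used to compute it.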
Proteins disclosed herein (including functional portions and functional variants thereof) may comprise synthetic amino acids in place of one or more naturally-occurring amino acids. Such synthetic amino acids are known in the art, and include, for example but not limited to, aminocyclohexane carboxylic acid, norleucine, α-amino n-decanoic acid, homoserine, S-acetylaminomethyl-cysteine, trans-3- and trans-4-hydroxyproline, 4-aminophenylalanine, 4-nitrophenylalanine, 4-chlorophenylalanine, 4-carboxyphenylalanine, β-phenylserine β-hydroxyphenylalanine, phenylglycine, α-naphthylalanine, cyclohexylalanine, cyclohexylglycine, indoline-2-carboxylic acid, 1,2,3,4-tetrahydroisoquinoline-3-carboxylic acid, aminomalonic acid, aminomalonic acid monoamide, N′-benzyl-N′-methyl-lysine, N′,N′-dibenzyl-lysine, 6-hydroxylysine, ornithine, α-aminocyclopentane carboxylic acid, α-aminocyclohexane carboxylic acid, α-aminocycloheptane carboxylic acid, α-(2-amino-2-norbornane)-carboxylic acid, α,γ-diaminobutyric acid, α, β-diaminopropionic acid, homophenylalanine, and α-tert-butylglycine.
The term “substantially purified” refers to a nucleic acid sequence, polypeptide, protein or other compound which is essentially free of, i.e., is more than about 50% free of, more than about 70% free of, or more than about 90% free of, the polynucleotides, proteins, polypeptides and other molecules that the nucleic acid, polypeptide, protein or other compound is naturally associated with.
“Synthetic genes” can be assembled from oligonucleotide building blocks that are chemically synthesized using procedures known to those of ordinary skill in the art. These building blocks are ligated and annealed to form gene segments that are then enzymatically assembled to construct the entire gene. “Chemically synthesized,” as related to a sequence of DNA, means that the component nucleotides were assembled in vitro. Manual chemical synthesis of DNA may be accomplished using well-established procedures. The skilled artisan appreciates the likelihood of enhanced gene expression if codon usage is biased towards those codons favored by the host cell or organism in which the gene is expressed. Determination of preferred codons can be based on a survey of genes derived from the host cell where sequence information is available.
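By way of illustration only, a survey of host genes to determine preferred codons may be sketched as follows. This is a minimal sketch assuming the standard genetic code and in-frame coding sequences supplied as DNA strings; the function name preferred_codons and the two toy sequences are hypothetical and are not taken from any host organism.

```python
from collections import Counter, defaultdict

# Standard genetic code, built from the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def preferred_codons(coding_sequences):
    """Return, for each amino acid observed, the codon used most often.

    coding_sequences: iterable of in-frame DNA strings (hypothetical host genes).
    Stop codons and codons containing ambiguous bases are skipped.
    """
    usage = defaultdict(Counter)
    for cds in coding_sequences:
        cds = cds.upper()
        for start in range(0, len(cds) - len(cds) % 3, 3):
            codon = cds[start:start + 3]
            amino_acid = CODON_TABLE.get(codon)
            if amino_acid is None or amino_acid == "*":
                continue
            usage[amino_acid][codon] += 1
    return {aa: counts.most_common(1)[0][0] for aa, counts in usage.items()}


# Two toy "genes"; real surveys would use many host coding sequences.
toy_genes = ["ATGCTGCTGCTAAAATAA", "ATGCTGAAGAAGTAA"]
print(preferred_codons(toy_genes))
```

A synthetic gene could then be designed by choosing, for each amino acid of the encoded polypeptide, the codon this survey reports for the intended host.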
The term “hybrid,” when used in reference to a polypeptide, nucleotide, or fragment thereof, as used herein refers to a polypeptide, polynucleotide, or fragment thereof, whose amino acid and/or nucleotide sequence is not found in nature. For example, a hybrid may be a fusion protein of two heterologous proteins or polypeptides, or a cDNA encoding a fusion polypeptide.
“Ligand Inducible Polypeptide Coupler” and “Ligand Inducible Polypeptide Couplers” are used interchangeably herein with “LIPC” and “LIPCs,” irrespective of number; that is, “LIPC” can mean “Coupler” (singular) or “Couplers” (plural). As such, LIPC refers to a system, and polypeptide components of that system, for bringing together (“coupling”; i.e., oligomerizing, dimerizing) polypeptides in a small molecule ligand-dependent manner via incorporation of nuclear receptor polypeptide components into fusion proteins (e.g., use of Group H nuclear receptor and EcR receptor polypeptide components (e.g., EcR polypeptide fragments or domains), including EcR ligand binding polypeptides and nuclear receptor USP and/or RXR nuclear receptor polypeptide components (e.g., polypeptide fragments or domains thereof), as described herein).
Administration of an activating ligand and configuration of LIPC components can be used to regulate the timing and location of dimerization and polypeptide coupling activation. LIPC relies upon protein factors encoded by genes which are not native to the host, and which are encoded by heterologous sequences. A LIPC that is used to control the spatial and temporal association of polypeptide components in a host system can be derived from a foreign source such as bacteria, yeast, plants, insects, or viruses. Thus, the LIPC nuclear receptor polypeptide components confer utility in the host by providing a mechanism to control the association (e.g., dimerization, oligomerization) of polypeptides or proteins with which LIPC components are “fused” (i.e., engineered to be fusion proteins).
“Genetic switches,” also referred to as “gene switches” or “transcriptional switches,” are used for controlling gene expression and are artificially designed for the deliberate regulation of transgenes. Gene switches typically encode a trans-activator or trans-inhibitor whose activity can be regulated and a trans-activator-responsive or trans-inhibitor-susceptible promoter for controlling a gene of interest. These factors may be ligand-responsive, chimeric proteins containing a DNA-binding domain, a ligand-binding domain and a transcriptional activation domain or inhibition domain, respectively. These include for example, antibiotic responsive switches based on tetracycline-sensory trans-activators and trans-inhibitors, mammalian or insect steroid receptor-derived trans-activators, and rapamycin-induced trans-activators. Other genetic switches make use of endogenous transcription factors that can be deliberately activated by physical cues or signals, and whose transient activation is tolerated by the host cell. Examples of systems of this kind include gene switches that make use of transcription factors which can be activated by heat or ionizing radiation for example. See e.g., Auslander, S. and Fussenegger, M. (2012). Trends in Biotechnology (electronic release) pp. 1-14; Vilaboa N, Boellmann F, Voellmy R (2011) Gene Switches for Deliberate Regulation of Transgene Expression: Recent Advances in System Development and Uses. J Genet Syndr Gene Ther 2:107, each of which is incorporated by reference herein in its entirety.
In one embodiment, the genetic switch includes the following components: 1) a Co-Activation Partner (CAP) and a Ligand-inducible Transcription Factor (LTF), which form unstable and unproductive heterodimers in the absence of Activator Ligand; 2) an Activator Ligand: a molecule (e.g., an ecdysone analog or other non-steroid small molecule); and 3) an Inducible Promoter (e.g., a customizable promoter which binds the LTF). In one embodiment, the genetic switch allows for the expression of transduced genes only when the small molecule activator ligand combines with the switch components (CAP and LTF), thereby activating gene transcription from an inducible promoter and ultimately resulting in expression of desired proteins. The timing, location, and concentration of genetic switch can be regulated in a dose dependent manner with the activator ligand. In certain embodiments, components of the EcR-based genetic switch developed by Applicant (for example, as referenced under the trademark RHEOSWITCH®) are used as component parts to generate ligand inducible polypeptide couplers (LIPCs) of the present invention (see, for example, PCT Publication Nos. WO 2001/070816, WO 2002/066612, WO 2002/066613, WO 2002/066614, WO 2002/066615, WO 2003/027266, WO 2003/027289, and WO 2005/108617, each of which is hereby incorporated by reference herein in its entirety).
In the present invention, components of EcR-based “genetic switches” are employed to create “ligand inducible polypeptide couplers” described, and envisaged by, the disclosure herein. “Ecdysone receptor” and “EcR” are used interchangeably herein and refer to members of the Arthropod superfamily of nuclear receptors, classified into subfamily 1, group H (referred to herein as “Group H nuclear receptors”). The members of each group share 40-60% amino acid identity in the E (ligand binding) domain (Laudet et al., A Unified Nomenclature System for the Nuclear Receptor Subfamily, 1999; Cell 97: 161-163, which is incorporated by reference herein in its entirety). In addition to the ecdysone receptor, other members of this nuclear receptor subfamily 1, group H include: ubiquitous receptor (UR), Orphan receptor 1 (OR-1), steroid hormone nuclear receptor 1 (NER-1), RXR interacting protein-15 (RIP-15), liver x receptor β (LXRβ), steroid hormone receptor like protein (RLD-1), liver x receptor (LXR), liver x receptor α (LXRα), farnesoid x receptor (FXR), receptor interacting protein 14 (RIP-14), and farnesol receptor (HRR-1). EcR proteins are characterized by signature DNA and ligand binding domains (LBD), and an activation domain (Koelle et al. 1991, Cell, 67:59-77, which is incorporated by reference herein in its entirety). EcR receptors are responsive to a number of steroidal and non-steroidal compounds, i.e., activating ligands.
“Retinoid X receptor” and “RXR” are used interchangeably herein and refer to a member of the nuclear hormone receptor family, in particular the steroid and thyroid hormone receptor superfamily. Vertebrate RXR includes at least three distinct genes (RXR alpha, beta and gamma), which give rise to a large number of protein products through differential promoter usage and alternative splicing. Invertebrate homologs of RXR (e.g., the ultraspiracle (USP) protein) are found in a wide range of species and are envisaged for use in the present invention.
“Activating ligand” as used herein refers to a compound that is capable of binding to a member of the nuclear steroid receptor super family (e.g., EcR and RXR) and activating the member by inducing association (e.g., dimerization, oligomerization, or protein-protein interaction) of the nuclear receptor components. Exemplary activating ligands for the present invention are provided below.
The term “inactive” or “inactivated,” when referencing inactive polypeptides, domains, signaling molecules, protein or polypeptide fragments, or protein subunits of polypeptides, as used herein means a protein or polypeptide that is not presently generating all or substantially all of one or more of its inherent biological functions or activities. In some embodiments, an inactive or inactivated protein or polypeptide becomes activated through association with another protein or polypeptide, i.e., protein-protein interaction. Such activation can occur, for example, through oligomerization induced by the binding of a first nuclear receptor ligand binding protein fragment to a second nuclear receptor protein fragment, wherein the first and second nuclear receptor fragments are part of two separate, larger, first and second heterologous polypeptides, wherein the first and second heterologous polypeptides change from a biologically inactive to a biologically active state upon ligand induced oligomerization.
“T cell” or “T lymphocyte” as used herein is a type of lymphocyte that plays a central role in cell-mediated immunity. They may be distinguished from other lymphocytes, such as B cells and natural killer cells (NK cells), by the presence of a T-cell receptor (TCR) on the cell surface.
“Antibody” as used herein refers to monoclonal or polyclonal antibodies. The term “monoclonal antibodies,” as used herein, refers to antibodies that bind to the same epitope (for example, such as antibodies that are produced by a single clone of B-cells). In contrast, “polyclonal antibodies” refer to a population of antibodies that bind to different epitopes of the same antigen (for example, such as antibodies that are produced by a heterogenous mixture of different B-cells).
Ligand Inducible Polypeptide Coupler (LIPC) of the Invention
Described herein is a ligand inducible polypeptide coupler (LIPC) that utilizes the ability of a pair of interacting nuclear receptor proteins (by engineering the LIPC (i.e., nuclear receptor) components to generate fusion proteins) to bring together, and induce the association (e.g., dimerization, oligomerization) of, otherwise separate proteins or domains (e.g., separated, biologically inactive polypeptide monomers, such as receptor tyrosine kinase polypeptides (RTKs) which typically require dimerization to form an active signaling complex). In certain embodiments, the switch system of the present invention is an ecdysone receptor (EcR)-based system. The ecdysone receptor-based ligand inducible polypeptide coupler may be either heterodimeric or homodimeric with respect to the “parent” non-nuclear receptor (LIPC) polypeptide components or domains. On the other hand, it is understood that a functional nuclear receptor (e.g., EcR complex) generally refers to a heterodimeric protein complex containing two or more members of the steroid receptor family. Examples of such members include an ecdysone receptor protein obtained from various insects, and an ultraspiracle (USP) protein or the vertebrate homolog of USP, retinoid X receptor (RXR) protein (see, e.g., Yao, et al. (1993) Nature 366, 476-479 and Yao, et al., (1992) Cell 71, 63-72, each of which is incorporated by reference herein in its entirety).
The present invention can include two or more expression cassettes; e.g., encoding EcR and USP/RXR components fused to separate polypeptides or domains (e.g., signaling molecules, signaling domains, complementary protein fragments, protein subunits, and natural or engineered partial or truncated proteins). In the presence of activating ligand, the interaction of EcR-containing polypeptides with the USP/RXR-containing polypeptides brings the attached (fusion) proteins or domains in close proximity allowing for their association (protein-protein interaction), see e.g.,
The ecdysone receptor complex typically includes proteins which are members of the nuclear receptor superfamily wherein all members are generally characterized by the presence of an amino-terminal transactivation domain, a DNA binding domain (“DBD”), and a ligand binding domain (“LBD”) separated from the DBD by a hinge region. Members of the nuclear receptor superfamily are also characterized by the presence of four or five domains: A/B, C, D, E, and in some members F (see, e.g., US patent 4,981,784 and Evans, Science 240:889-895(1988), each of which is incorporated by reference herein in its entirety). The “A/B” domain corresponds to the transactivation domain, “C” corresponds to the DNA binding domain, “D” corresponds to the hinge region, and “E” corresponds to the ligand binding domain. Some members of the family may also have another transactivation domain on the carboxy-terminal side of the LBD corresponding to “F.”
These domains may be either native (i.e., naturally-occurring), modified, or chimeras (i.e., heterologous fusion proteins) of domains from different nuclear receptor proteins. Because the domains of EcR, USP, and RXR are modular in nature, the LBD, DBD, and transactivation domains may be interchanged.
Within certain embodiments, a dipteran (fruit fly Drosophila melanogaster) or a lepidopteran (spruce budworm Choristoneura fumiferana) ultraspiracle protein (USP) is utilized as part of an LIPC system. In certain embodiments, a vertebrate or mammalian retinoid X receptor (RXR) (see, e.g., International Publ. No. WO/2001/070816, which is incorporated by reference herein in its entirety) is utilized as part of an LIPC system. In certain embodiments, the ultraspiracle protein of Locusta migratoria (“LmUSP”) and the RXR homolog 1 and RXR homolog 2 of the ixodid tick Amblyomma americanum (“AmaRXR1” and “AmaRXR2,” respectively) and their non-Dipteran, non-Lepidopteran homologs including, but not limited to: fiddler crab Celuca pugilator RXR homolog (“CpRXR”), beetle Tenebrio molitor RXR homolog (“TmRXR”), honeybee Apis mellifera RXR homolog (“AmRXR”), and an aphid Myzus persicae RXR homolog (“MpRXR”), all of which are referred to herein collectively as invertebrate RXRs (and which can function similarly to vertebrate retinoid X receptor (RXR)), are utilized as part of an LIPC system.
EcR Components
The present invention provides for ecdysone receptor (EcR) polypeptide components, e.g., EcR ligand binding domains (LBD), to be employed in a ligand inducible polypeptide coupler system described herein. Exemplary EcR components that can be used in the invention are described, for example, in International PCT Publ. Nos. WO 2001/070816, WO 2002/066612, WO 2002/066613, WO 2002/066614, WO 2002/066615, WO 2003/027266, WO 2003/027289, WO 2005/108617, and WO 2009/114201, each of which is incorporated by reference herein in its entirety.
In certain embodiments, the LIPC EcR component is an EcR ligand binding domain (LBD), or a related steroid/thyroid hormone nuclear receptor family member LBD, analog, combination, modification, or fragment thereof. In some embodiments, the LIPC LBD is from a truncated EcR polypeptide or EcR LBD. A truncation or substitution mutation thereof may be made by any method used in the art, including but not limited to restriction endonuclease digestion/deletion, PCR-mediated oligonucleotide-directed deletion, chemical mutagenesis, DNA strand breakage, and the like.
The LIPC EcR polypeptide component may be an invertebrate EcR, for example, selected from the class Arthropod. In some embodiments, the LIPC EcR polypeptide component (or fragments thereof) is selected from the group consisting of a Lepidopteran EcR, a Dipteran EcR, an Orthopteran EcR, a Homopteran EcR and a Hemipteran EcR. In particular embodiments, the EcR is from a spruce budworm Choristoneura fumiferana EcR (“CfEcR”), a beetle Tenebrio molitor EcR (“TmEcR”), a Manduca sexta EcR (“MsEcR”), a Heliothis virescens EcR (“HvEcR”), a midge Chironomus tentans EcR (“CtEcR”), a silk moth Bombyx mori EcR (“BmEcR”), a fruit fly Drosophila melanogaster EcR (“DmEcR”), a mosquito Aedes aegypti EcR (“AaEcR”), a blowfly Lucilia capitata EcR (“LcEcR”), a blowfly Lucilia cuprina EcR (“LucEcR”), a Mediterranean fruit fly Ceratitis capitata EcR (“CcEcR”), a locust Locusta migratoria EcR (“LmEcR”), an aphid Myzus persicae EcR (“MpEcR”), a fiddler crab Celuca pugilator EcR (“CpEcR”), an ixodid tick Amblyomma americanum EcR (“AmaEcR”), a whitefly Bemisia argentifolii EcR (“BaEcR”, SEQ ID NO: 20) or a leafhopper Nephotettix cincticeps EcR (“NcEcR”, SEQ ID NO: 21). In one embodiment, the LIPC LBD (or fragment thereof) is from spruce budworm (Choristoneura fumiferana) EcR (“CfEcR”) or fruit fly Drosophila melanogaster EcR (“DmEcR”).
In certain embodiments, the LIPC LBD is from a truncated EcR polypeptide. In some embodiments, the LIPC EcR polypeptide truncation results in a deletion of at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, or 265 amino acids. Preferably, an LIPC EcR polypeptide truncation results in a deletion of at least a partial polypeptide domain. More preferably, the LIPC EcR polypeptide truncation results in a deletion of at least an entire polypeptide domain. In certain embodiments, the LIPC EcR polypeptide truncation results in a deletion of at least an A/B-domain, a C-domain, a D-domain, an F-domain, an A/B/C-domains, an A/B/1/2-C-domains, an A/B/C/D-domains, an A/B/C/D/F-domains, an A/B/F-domains, an A/B/C/F-domains, a partial E domain, or a partial F domain. A combination of several complete and/or partial domain deletions may also be performed.
In some embodiments, an LIPC ecdysone receptor polypeptide component, or fragment thereof, is encoded by a polynucleotide comprising a nucleic acid sequence of SEQ ID NO: 22 (CfEcR-EF), SEQ ID NO: 23 (DmEcR-EF), SEQ ID NO: 24 (CfEcR-DE), or SEQ ID NO: 25 (DmEcR-DE), or a fragment thereof.
In some embodiments, an LIPC ecdysone receptor polypeptide component, or fragment thereof, is encoded by a polynucleotide comprising a nucleic acid sequence of SEQ ID NO: 1 (CfEcR-DEF), SEQ ID NO: 2 (CfEcR-CDEF), SEQ ID NO: 3 (DmEcR-DEF), SEQ ID NO: 4 (TmEcR-DEF) or SEQ ID NO: 5 (AmaEcR-DEF), or a fragment thereof.
In certain embodiments, an LIPC ecdysone receptor polypeptide component comprises an amino acid sequence of SEQ ID NO: 26 (CfEcR-EF), SEQ ID NO: 27 (DmEcR-EF), SEQ ID NO: 28 (CfEcR-DE), or SEQ ID NO: 29 (DmEcR-DE), or a fragment thereof. In some embodiments, an LIPC ecdysone receptor polypeptide component comprises an amino acid sequence of SEQ ID NO: 6 (CfEcR-DEF), SEQ ID NO: 8 (CfEcR-CDEF), SEQ ID NO: 7 (DmEcR-DEF), SEQ ID NO: 9 (TmEcR-DEF), or SEQ ID NO: 10 (AmaEcR-DEF), or a fragment thereof.
In addition, amino acid residues that are involved in ligand binding to Group H nuclear receptor ligand binding domains (e.g., EcR ligand binding domains) that affect the ligand sensitivity and magnitude of gene expression induction in an ecdysone receptor-based inducible gene expression (“gene switch”) system have been identified (see, e.g., International Publ. No. WO 02/066612, which is incorporated by reference herein in its entirety). These substitution mutant nuclear receptor polypeptides and their use in a LIPC system can provide improved ligand-induced (“activated”) polypeptide coupling in host cells and organisms in which regulation (modulation, control) of ligand sensitivity and magnitude of ligand induced oligomerization may be selected as desired, depending upon the application. As described further below, Group H nuclear receptors which comprise substitution mutations (referred to herein as “substitution mutants”) can be employed in ligand inducible polypeptide couplers (LIPC) of the present invention.
LIPC ecdysone receptor (EcR) polypeptide components (including EcR ligand binding domains (LBD)) used in the present invention may be from an invertebrate EcR, e.g., selected from the class Arthropod EcR. In certain embodiments, the LIPC EcR polypeptide component is selected from the group consisting of a Lepidopteran EcR, a Dipteran EcR, an Orthopteran EcR, a Homopteran EcR and a Hemipteran EcR. In certain embodiments, the EcR ligand binding domain for use in the present invention is from a spruce budworm Choristoneura fumiferana EcR (“CfEcR”), a beetle Tenebrio molitor EcR (“TmEcR”), a Manduca sexta EcR (“MsEcR”), a Heliothis virescens EcR (“HvEcR”), a midge Chironomus tentans EcR (“CtEcR”), a silk moth Bombyx mori EcR (“BmEcR”), a squinting bush brown Bicyclus anynana EcR (“BanEcR”), a buckeye Junonia coenia EcR (“JcEcR”), a fruit fly Drosophila melanogaster EcR (“DmEcR”), a mosquito Aedes aegypti EcR (“AaEcR”), a blowfly Lucilia capitata EcR (“LcEcR”), a blowfly Lucilia cuprina EcR (“LucEcR”), a blowfly Calliphora vicina EcR (“CvEcR”), a Mediterranean fruit fly Ceratitis capitata EcR (“CcEcR”), a locust Locusta migratoria EcR (“LmEcR”), an aphid Myzus persicae EcR (“MpEcR”), a fiddler crab Celuca pugilator EcR (“CpEcR”), an ixodid tick Amblyomma americanum EcR (“AmaEcR”), a whitefly Bemisia argentifolii EcR or a leafhopper Nephotettix cincticeps EcR. In some embodiments, the LIPC polypeptide component is from a CfEcR, a DmEcR, or an AmaEcR.
In certain embodiments, the LIPC Group H nuclear receptor polypeptide component is encoded by a polynucleotide comprising, or consisting of, a codon mutation that results in a substitution of a) amino acid residue 20, 21, 48, 51, 52, 55, 58, 59, 61, 62, 92, 93, 95, 96, 107, 109, 110, 120, 123, 125, 175, 218, 219, 223, 230, 234, or 238 of SEQ ID NO: 17, b) amino acid residues 95 and 110 of SEQ ID NO: 17, c) amino acid residues 218 and 219 of SEQ ID NO: 17, d) amino acid residues 107 and 175 of SEQ ID NO: 17, e) amino acid residues 127 and 175 of SEQ ID NO: 17, f) amino acid residues 107 and 127 of SEQ ID NO: 17, g) amino acid residues 107, 127 and 175 of SEQ ID NO: 17, h) amino acid residues 52, 107 and 175 of SEQ ID NO: 17, i) amino acid residues 96, 107, and 175 of SEQ ID NO: 17, j) amino acid residues 107, 110, and 175 of SEQ ID NO: 17, k) amino acid residue 107, 121, 213, or 217 of SEQ ID NO: 18, or l) amino acid residue 91 or 105 of SEQ ID NO: 19. In certain embodiments, the Group H nuclear receptor ligand binding domain is from an ecdysone receptor. In certain embodiments, an LIPC EcR polypeptide component comprising a substitution mutation can comprise, or consist of, a substitution of about or at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, or more wild-type or naturally occurring amino acids with a different amino acid relative to the wild-type or naturally occurring EcR receptor ligand binding domain polypeptide.
In another embodiment, the LIPC Group H nuclear receptor ligand polypeptide component is encoded by a polynucleotide comprising, or consisting of, a codon mutation that results in a substitution of a) an alanine residue at a position equivalent or analogous to amino acid residue 20, 21, 48, 51, 55, 58, 59, 61, 62, 92, 93, 95, 109, 120, 125, 218, 219, 223, 230, 234, or 238 of SEQ ID NO: 17, b) an alanine, valine, isoleucine, or leucine residue at a position equivalent or analogous to amino acid residue 52 of SEQ ID NO: 17, c) an alanine, threonine, aspartic acid, or methionine residue at a position equivalent or analogous to amino acid residue 96 of SEQ ID NO: 17, d) a proline, serine, methionine, or leucine residue at a position equivalent or analogous to amino acid residue 110 of SEQ ID NO: 17, e) a phenylalanine residue at a position equivalent or analogous to amino acid residue 123 of SEQ ID NO: 17, f) an alanine residue at a position equivalent or analogous to amino acid residue 95 of SEQ ID NO: 17 and a proline residue at a position equivalent or analogous to amino acid residue 110 of SEQ ID NO: 17, g) an alanine residue at a position equivalent or analogous to amino acid residues 218 and 219 of SEQ ID NO: 17, h) an isoleucine residue at a position equivalent or analogous to amino acid residue 107 of SEQ ID NO: 17, i) a glutamine residue at a position equivalent or analogous to amino acid residue 175 of SEQ ID NO: 17, j) an isoleucine residue at a position equivalent or analogous to amino acid residue 107 of SEQ ID NO: 17 and a glutamine residue at a position equivalent or analogous to amino acid residue 175 of SEQ ID NO: 17, k) a glutamine residue at a position equivalent or analogous to amino acid residues 127 and 175 of SEQ ID NO: 17, l) an isoleucine residue at a position equivalent or analogous to amino acid residue 107 of SEQ ID NO: 17 and a glutamine residue at a position equivalent or analogous to amino acid residue 127 of SEQ ID NO: 17, m) an isoleucine residue at a position equivalent or analogous to amino acid residue 107 of SEQ ID NO: 17 and a glutamine residue at a position equivalent or analogous to amino acid residues 127 and 175 of SEQ ID NO: 17, n) a valine residue at a position equivalent or analogous to amino acid residue of SEQ ID NO: 17, an isoleucine residue at a position equivalent or analogous to amino acid residue 107 of SEQ ID NO: 17 and a glutamine residue at a position equivalent or analogous to amino acid residue 175 of SEQ ID NO: 17, o) an alanine residue at a position equivalent or analogous to amino acid residue 96 of SEQ ID NO: 17, an isoleucine residue at a position equivalent or analogous to amino acid residue 107 of SEQ ID NO: 17 and a glutamine residue at a position equivalent or analogous to amino acid residue of SEQ ID NO: 17, p) an alanine residue at a position equivalent or analogous to amino acid residue 52 of SEQ ID NO: 17, an isoleucine residue at a position equivalent or analogous to amino acid residue 107 of SEQ ID NO: 17, and a glutamine residue at a position equivalent or analogous to amino acid residue 175 of SEQ ID NO: 17, q) a threonine residue at a position equivalent or analogous to amino acid residue 96 of SEQ ID NO: 17, an isoleucine residue at a position equivalent or analogous to amino acid residue 107 of SEQ ID NO: 17, and a glutamine residue at a position equivalent or analogous to amino acid residue 175 of SEQ ID NO: 17, r) an isoleucine residue at a position equivalent or analogous to amino acid residue 107 of SEQ ID NO: 17, a proline residue at a position equivalent or analogous to amino acid 110 of SEQ ID NO: 17, and a glutamine residue at a position equivalent or analogous to amino acid 175 of SEQ ID NO: 17, s) a proline at a position equivalent or analogous to amino acid residue 107 of SEQ ID NO: 18, t) an arginine or a leucine at a position equivalent or analogous to amino acid residue 121 of SEQ ID NO: 18, u) an alanine at a position equivalent or analogous to amino acid residue 213 of SEQ ID NO: 18, v) an alanine or a serine at a position equivalent or analogous to amino acid residue 217 of SEQ ID NO: 18, w) an alanine at a position equivalent or analogous to amino acid residue 91 of SEQ ID NO: 19, or x) a proline at a position equivalent or analogous to amino acid residue 105 of SEQ ID NO: 19. In certain embodiments, the LIPC Group H nuclear receptor polypeptide component is from an ecdysone receptor.
In another embodiment, the LIPC Group H nuclear receptor polypeptide component having a substitution mutation is an ecdysone receptor ligand binding domain comprising, or consisting of, a substitution mutation encoded by a polynucleotide comprising, or consisting of, a codon mutation that results in a substitution mutation selected from the group consisting of a) E20A, Q21A, F48A, I51A, T52A, T52V, T52I, T52L, T55A, T58A, V59A, L61A, I62A, M92A, M93A, R95A, V96A, V96T, V96D, V96M, V107I, F109A, A110P, A110S, A110M, A110L, Y120A, A123F, M125A, R175E, M218A, C219A, L223A, L230A, L234A, W238A, R95A/A110P, M218A/C219A, V107I/R175E, Y127E/R175E, V107I/Y127E, V107I/Y127E/R175E, T52V/V107I/R175E, V96A/V107I/R175E, T52A/V107I/R175E, V96T/V107I/R175E or V107I/A110P/R175E substitution mutation of SEQ ID NO: 17, b) A107P, G121R, G121L, N213A, C217A, or C217S substitution mutation of SEQ ID NO: 18, and c) G91A or A105P substitution mutation of SEQ ID NO: 19.
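By way of illustration only, the substitution notation used above (wild-type residue, 1-based position, replacement residue, with multiple substitutions joined by “/”) may be applied to a reference sequence as sketched below. This is a minimal sketch; the function name apply_substitutions is hypothetical, and the six-residue sequence is a toy example that is not SEQ ID NO: 17, 18, or 19.

```python
import re

# One substitution token: wild-type letter, 1-based position, replacement letter.
_SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def apply_substitutions(reference: str, notation: str) -> str:
    """Apply substitutions such as 'V107I/R175E' to a reference amino acid sequence.

    Raises ValueError if a token is malformed, a position is out of range,
    or the stated wild-type residue does not match the reference.
    """
    residues = list(reference)
    for token in notation.split("/"):
        match = _SUBSTITUTION.match(token.strip())
        if match is None:
            raise ValueError(f"malformed substitution: {token!r}")
        wild_type, position, replacement = match.group(1), int(match.group(2)), match.group(3)
        if not 1 <= position <= len(residues):
            raise ValueError(f"position {position} is outside the reference sequence")
        if residues[position - 1] != wild_type:
            raise ValueError(
                f"reference has {residues[position - 1]} at position {position}, not {wild_type}"
            )
        residues[position - 1] = replacement
    return "".join(residues)


# Toy reference (hypothetical): valine at position 3 and arginine at position 6.
toy_reference = "MKVLAR"
print(apply_substitutions(toy_reference, "V3I/R6E"))
```

Checking the wild-type residue before substituting guards against applying a mutation to a sequence that is numbered differently from the intended reference, which matters when positions are described as equivalent or analogous across receptors.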
In other embodiments, the LIPC Group H nuclear receptor polypeptide component having a substitution mutation is an ecdysone receptor ligand binding domain polypeptide comprising, or consisting of, a substitution mutation encoded by a polynucleotide that hybridizes to a polynucleotide comprising a codon mutation that results in a substitution mutation selected from the group consisting of a) T58A, A110P, A110L, A110S, or A110M of SEQ ID NO: 17, b) A107P of SEQ ID NO: 18, and c) A105P of SEQ ID NO: 19 under hybridization conditions comprising a hybridization step in less than 500 mM salt and at least 37 degrees Celsius, and a washing step in 2XSSPE at least 63 degrees Celsius. In certain embodiments, the hybridization conditions comprise less than 200 mM salt and at least 37 degrees Celsius for the hybridization step. In another embodiment, the hybridization conditions comprise 2XSSPE and 63 degrees Celsius for both the hybridization and washing steps. In another embodiment, the ecdysone receptor ligand binding domain lacks or exhibits reduced steroid binding activity, such as 20-hydroxyecdysone binding activity, ponasterone A binding activity, or muristerone A binding activity.
In another embodiment, the LIPC Group H nuclear receptor polypeptide component has a substitution mutation at a position equivalent or analogous to a) amino acid residue 20, 21, 48, 51, 52, 55, 58, 59, 61, 62, 92, 93, 95, 96, 107, 109, 110, 120, 123, 125, 175, 218, 219, 223, 230, 234, or 238 of SEQ ID NO: 17, b) amino acid residues 95 and 110 of SEQ ID NO: 17, c) amino acid residues 218 and 219 of SEQ ID NO: 17, d) amino acid residues 107 and 175 of SEQ ID NO: 17, e) amino acid residues 127 and 175 of SEQ ID NO: 17, f) amino acid residues 107 and 127 of SEQ ID NO: 17, g) amino acid residues 107, 127 and 175 of SEQ ID NO: 17, h) amino acid residues 52, 107 and 175 of SEQ ID NO: 17, i) amino acid residues 96, 107 and 175 of SEQ ID NO: 17, j) amino acid residues 107, 110, and 175 of SEQ ID NO: 17, k) amino acid residue 107, 121, 213, or 217 of SEQ ID NO: 18, or l) amino acid residue 91 or 105 of SEQ ID NO: 19. In certain embodiments, the LIPC Group H nuclear receptor polypeptide component is from an ecdysone receptor.
In some embodiments, the LIPC Group H nuclear receptor polypeptide component has a substitution of a) an alanine residue at a position equivalent or analogous to amino acid residue 20, 21, 48, 51, 55, 58, 59, 61, 62, 92, 93, 95, 109, 120, 125, 218, 219, 223, 230, 234, or 238 of SEQ ID NO: 17, b) an alanine, valine, isoleucine, or leucine residue at a position equivalent or analogous to amino acid residue 52 of SEQ ID NO: 17, c) an alanine, threonine, aspartic acid, or methionine residue at a position equivalent or analogous to amino acid residue 96 of SEQ ID NO: 17, d) a proline, serine, methionine, or leucine residue at a position equivalent or analogous to amino acid residue 110 of SEQ ID NO: 17, e) a phenylalanine residue at a position equivalent or analogous to amino acid residue 123 of SEQ ID NO: 17, f) an alanine residue at a position equivalent or analogous to amino acid residue 95 of SEQ ID NO: 17 and a proline residue at a position equivalent or analogous to amino acid residue 110 of SEQ ID NO: 17, g) an alanine residue at a position equivalent or analogous to amino acid residues 218 and 219 of SEQ ID NO: 17, h) an isoleucine residue at a position equivalent or analogous to amino acid residue 107 of SEQ ID NO: 17, i) a glutamic acid residue at a position equivalent or analogous to amino acid residue 175 of SEQ ID NO: 17, j) an isoleucine residue at a position equivalent or analogous to amino acid residue 107 of SEQ ID NO: 17 and a glutamic acid residue at a position equivalent or analogous to amino acid residue 175 of SEQ ID NO: 17, k) a glutamic acid residue at a position equivalent or analogous to amino acid residues 127 and 175 of SEQ ID NO: 17, l) an isoleucine residue at a position equivalent or analogous to amino acid residue 107 of SEQ ID NO: 17 and a glutamic acid residue at a position equivalent or analogous to amino acid residue 127 of SEQ ID NO: 17, m) an isoleucine residue at a position equivalent or analogous to amino acid residue 107 of SEQ ID NO: 17 and a glutamic acid residue at a position equivalent or analogous to amino acid residues 127 and 175 of SEQ ID NO: 17, n) a valine residue at a position equivalent or analogous to amino acid residue 52 of SEQ ID NO: 17, an isoleucine residue at a position equivalent or analogous to amino acid residue 107 of SEQ ID NO: 17 and a glutamic acid residue at a position equivalent or analogous to amino acid residue 175 of SEQ ID NO: 17, o) an alanine residue at a position equivalent or analogous to amino acid residue 96 of SEQ ID NO: 17, an isoleucine residue at a position equivalent or analogous to amino acid residue 107 of SEQ ID NO: 17 and a glutamic acid residue at a position equivalent or analogous to amino acid residue 175 of SEQ ID NO: 17, p) an alanine residue at a position equivalent or analogous to amino acid residue 52 of SEQ ID NO: 17, an isoleucine residue at a position equivalent or analogous to amino acid residue 107 of SEQ ID NO: 17, and a glutamic acid residue at a position equivalent or analogous to amino acid residue 175 of SEQ ID NO: 17, q) a threonine residue at a position equivalent or analogous to amino acid residue 96 of SEQ ID NO: 17, an isoleucine residue at a position equivalent or analogous to amino acid residue 107 of SEQ ID NO: 17, and a glutamic acid residue at a position equivalent or analogous to amino acid residue 175 of SEQ ID NO: 17, r) an isoleucine residue at a position equivalent or analogous to amino acid residue 107 of SEQ ID NO: 17, a proline residue at a position equivalent or analogous to amino acid residue 110 of SEQ ID NO: 17, and a glutamic acid residue at a position equivalent or analogous to amino acid residue 175 of SEQ ID NO: 17, s) a proline at a position equivalent or analogous to amino acid residue 107 of SEQ ID NO: 18, t) an arginine or a leucine at a position equivalent or analogous to amino acid residue 121 of SEQ ID NO: 18, u) an alanine at a position equivalent or analogous to amino acid residue 213 of SEQ ID NO: 18, v) an alanine or a serine at a position equivalent or analogous to amino acid residue 217 of SEQ ID NO: 18, w) an alanine at a position equivalent or analogous to amino acid residue 91 of SEQ ID NO: 19, or x) a proline at a position equivalent or analogous to amino acid residue 105 of SEQ ID NO: 19. In certain embodiments, the LIPC Group H nuclear receptor polypeptide component is from an ecdysone receptor.
In another embodiment, an LIPC Group H nuclear receptor polypeptide component having a substitution mutation is an ecdysone receptor ligand binding domain polypeptide comprising a substitution mutation, wherein the substitution mutation is selected from the group consisting of a) E20A, Q21A, F48A, I51A, T52A, T52V, T52I, T52L, T55A, T58A, V59A, L61A, I62A, M92A, M93A, R95A, V96A, V96T, V96D, V96M, V107I, F109A, A110P, A110S, A110M, A110L, Y120A, A123F, M125A, R175E, M218A, C219A, L223A, L230A, L234A, W238A, R95A/A110P, M218A/C219A, V107I/R175E, Y127E/R175E, V107I/Y127E, V107I/Y127E/R175E, T52V/V107I/R175E, V96A/V107I/R175E, T52A/V107I/R175E, V96T/V107I/R175E, or V107I/A110P/R175E substitution mutation of SEQ ID NO: 17, b) A107P, G121R, G121L, N213A, C217A, or C217S substitution mutation of SEQ ID NO: 18, and c) G91A or A105P substitution mutation of SEQ ID NO: 19. In certain embodiments, an EcR polypeptide component (amino acid sequence) used in an LIPC protein of the invention comprises, or alternatively consists of, one or more substitution mutations selected from the group consisting of substitutions indicated in Table 1.
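The substitution notation used above can be read mechanically: a token such as V107I names the wild-type residue, its 1-based position in the reference sequence, and the replacement residue, and a slash joins substitutions that occur together (e.g., V107I/R175E). The following Python sketch is offered only as an illustration of that reading. The function names are hypothetical, and the six-residue toy sequence is a placeholder, not SEQ ID NO: 17, 18, or 19.

```python
import re
from typing import List, Tuple

# One substitution token: wild-type residue, 1-based position, new residue.
_TOKEN = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_mutations(spec: str) -> List[Tuple[str, int, str]]:
    """Split a spec such as 'V107I/R175E' into (wild_type, position, new) tuples."""
    parsed = []
    for token in spec.split("/"):
        match = _TOKEN.match(token.strip())
        if match is None:
            raise ValueError(f"not a substitution token: {token!r}")
        wild_type, position, new = match.groups()
        parsed.append((wild_type, int(position), new))
    return parsed


def apply_mutations(sequence: str, spec: str) -> str:
    """Apply every substitution in spec, checking the wild-type residue first."""
    residues = list(sequence)
    for wild_type, position, new in parse_mutations(spec):
        if not 1 <= position <= len(residues):
            raise ValueError(f"position {position} is outside the sequence")
        if residues[position - 1] != wild_type:
            raise ValueError(
                f"expected {wild_type} at position {position}, "
                f"found {residues[position - 1]}"
            )
        residues[position - 1] = new
    return "".join(residues)


# Hypothetical toy sequence used only to exercise the functions.
toy_reference = "MKTAYV"
toy_mutant = apply_mutations(toy_reference, "T3A/V6I")
```

Checking the wild-type residue before substituting guards against applying a mutation to the wrong reference sequence, which matters here because the positions listed for SEQ ID NO: 17, 18, and 19 are not interchangeable.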
RXR Components
The present invention provides for particular RXR components, including RXR ligand binding domains (LBD), to be employed in ligand inducible polypeptide couplers (LIPCs) described herein. Exemplary RXR components that can be used in the present invention include, for example, those described in International PCT Publ. Nos.: WO 2001/070816; WO 2002/066612; WO 2002/066613; WO 2002/066614; WO 2002/066615; WO 2003/027266; WO 2003/027289; WO 2005/108617; and WO 2009/114201, each of which is incorporated by reference herein in its entirety.
In certain embodiments, the LIPC RXR component is a mouse Mus musculus RXR (MmRXR) or a human Homo sapiens RXR (HsRXR). The LIPC RXR component may be an RXRα, RXRβ, or RXRγ isoform, or fragment thereof.
In some embodiments, the RXR LIPC component is a truncated RXR. The LIPC RXR polypeptide truncation can comprise, or consist of, a deletion of at least 1, 2, 3, 4, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, or 265 amino acids. In certain embodiments, the LIPC RXR polypeptide truncation comprises, or consists of, a deletion of at least a partial polypeptide domain. In some embodiments, the LIPC RXR polypeptide truncation comprises, or consists of, a deletion of at least an entire polypeptide domain. In a specific embodiment, the LIPC RXR polypeptide truncation comprises, or consists of, at least an A/B-domain deletion, a C-domain deletion, a D-domain deletion, an E-domain deletion, an F-domain deletion, an A/B/C-domains deletion, an A/B/1/2-C-domains deletion, an A/B/C/D-domains deletion, an A/B/C/D/F-domains deletion, an A/B/F-domains deletion, and an A/B/C/F-domains deletion. A combination of several complete and/or partial domain deletions may also be performed.
In certain embodiments, the LIPC RXR polypeptide component is encoded by a polynucleotide comprising, or consisting of, a nucleic acid sequence selected from the group consisting of SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32, SEQ ID NO: 33, SEQ ID NO: 34, SEQ ID NO: 35, SEQ ID NO: 36, SEQ ID NO: 37, SEQ ID NO: 38, and SEQ ID NO: 39, or a fragment thereof.
In another embodiment, the LIPC RXR component comprises or consists of a polypeptide sequence selected from the group consisting of SEQ ID NO: 40, SEQ ID NO: 41, SEQ ID NO: 42, SEQ ID NO: 43, SEQ ID NO: 44, SEQ ID NO: 45, SEQ ID NO: 46, SEQ ID NO: 47, SEQ ID NO: 48, and SEQ ID NO: 49, or a fragment thereof.
In certain embodiments, LIPC of the invention include a chimeric RXR polypeptide comprising at least two polypeptide fragments selected from the group consisting of: 1) a vertebrate species RXR polypeptide fragment; 2) an invertebrate species RXR polypeptide fragment; and, 3) a non-Dipteran/non-Lepidopteran invertebrate species RXR polypeptide fragment. An LIPC chimeric RXR polypeptide component of the invention may comprise or consist of two different animal species RXR polypeptide fragments, or when the animal species is the same, the two or more polypeptide fragments may be from two or more different isoforms of the animal species RXR polypeptide fragment.
In some embodiments, the vertebrate species LIPC RXR polypeptide fragment comprises or consists of a mouse Mus musculus RXR (MmRXR) or a human Homo sapiens RXR (HsRXR), or fragment thereof. The LIPC RXR polypeptide component may comprise or consist of an RXRα, RXRβ, or RXRγ isoform, or fragment thereof.
In some embodiments, the vertebrate species LIPC RXR polypeptide fragment is from a vertebrate species RXR encoded by a polynucleotide comprising, or consisting of, a nucleic acid sequence selected from the group consisting of SEQ ID NO: 62, SEQ ID NO: 63, SEQ ID NO: 64, SEQ ID NO: 65, SEQ ID NO: 66, and SEQ ID NO: 67, or fragment thereof. In another embodiment, the vertebrate species LIPC RXR polypeptide fragment is from a vertebrate species RXR comprising, or consisting of, an amino acid sequence selected from the group consisting of SEQ ID NO: 68, SEQ ID NO: 69, SEQ ID NO: 70, SEQ ID NO: 71, SEQ ID NO: 72, and SEQ ID NO: 73, or fragment thereof.
In another embodiment, a LIPC invertebrate species RXR polypeptide fragment is from a locust Locusta migratoria ultraspiracle polypeptide (LmUSP), an ixodid tick Amblyomma americanum RXR homolog 1 (AmaRXR1), an ixodid tick Amblyomma americanum RXR homolog 2 (AmaRXR2), a fiddler crab Celuca pugilator RXR homolog (CpRXR), a beetle Tenebrio molitor RXR homolog (TmRXR), a honeybee Apis mellifera RXR homolog (AmRXR), or an aphid Myzus persicae RXR homolog (MpRXR).
In certain embodiments, a LIPC invertebrate species RXR polypeptide fragment is from an invertebrate species RXR polypeptide encoded by a polynucleotide comprising or consisting of a nucleic acid sequence of SEQ ID NO: 50, SEQ ID NO: 51, SEQ ID NO: 52, SEQ ID NO: 53, SEQ ID NO: 54, or SEQ ID NO: 55, or fragment thereof. In another embodiment, a LIPC invertebrate species RXR polypeptide fragment is from an invertebrate species RXR polypeptide comprising or consisting of an amino acid sequence of SEQ ID NO: 56, SEQ ID NO: 57, SEQ ID NO: 58, SEQ ID NO: 59, SEQ ID NO: 60, or SEQ ID NO: 61, or fragment thereof.
In certain embodiments, a LIPC invertebrate species RXR polypeptide fragment is from a non-Dipteran/non-Lepidopteran invertebrate species RXR homolog.
In some embodiments, a LIPC chimeric RXR component comprises or consists of at least one vertebrate species RXR polypeptide fragment and one invertebrate species RXR polypeptide fragment.
In another embodiment, a LIPC chimeric RXR component comprises or consists of at least one vertebrate species RXR polypeptide fragment and one non-Dipteran/non-Lepidopteran invertebrate species RXR homolog polypeptide fragment.
In another embodiment, a LIPC chimeric RXR component comprises or consists of at least one invertebrate species RXR polypeptide fragment and one non-Dipteran/non-Lepidopteran invertebrate species RXR homolog polypeptide fragment.
In another embodiment, a LIPC chimeric RXR component comprises or consists of at least one vertebrate species RXR polypeptide fragment and one different vertebrate species RXR polypeptide fragment.
In another embodiment, a LIPC chimeric RXR component comprises or consists of at least one invertebrate species RXR polypeptide fragment and one different invertebrate species RXR polypeptide fragment.
In another embodiment, a LIPC chimeric RXR component comprises or consists of at least one non-Dipteran/non-Lepidopteran invertebrate species RXR polypeptide fragment and one different non-Dipteran/non-Lepidopteran invertebrate species RXR polypeptide fragment.
In certain embodiments, a LIPC chimeric RXR component has an RXR region comprising at least one polypeptide fragment selected from the group consisting of an EF-domain helix 1, an EF-domain helix 2, an EF-domain helix 3, an EF-domain helix 4, an EF-domain helix 5, an EF-domain helix 6, an EF-domain helix 7, an EF-domain helix 8, an EF-domain helix 9, an EF-domain helix 10, an EF-domain helix 11, an EF-domain helix 12, an F-domain, and/or an EF-domain β-pleated sheet, wherein at least one of two or more domains are from different species RXR (e.g., a human RXR polypeptide fragment and a murine RXR polypeptide fragment).
In another embodiment, a first polypeptide fragment of a LIPC chimeric RXR component comprises or consists of helices 1-6, helices 1-7, helices 1-8, helices 1-9, helices 1-10, helices 1-11, or helices 1-12 of a first species RXR, and a second polypeptide fragment of the chimeric LIPC RXR component comprises or consists of helices 7-12, helices 8-12, helices 9-12, helices 10-12, helices 11-12, helix 12, or F domain of a second species RXR, respectively.
In another embodiment, a first polypeptide fragment of a LIPC chimeric RXR component comprises or consists of helices 1-6 of a first species RXR, and a second polypeptide fragment of the LIPC chimeric RXR component comprises helices 7-12 of a second species RXR.
In another embodiment, a first polypeptide fragment of a LIPC chimeric RXR component comprises or consists of helices 1-7 of a first species RXR, and a second polypeptide fragment of the LIPC chimeric RXR component comprises or consists of helices 8-12 of a second species RXR.
In another embodiment, a first polypeptide fragment of a LIPC chimeric RXR component comprises or consists of helices 1-8 of a first species RXR, and a second polypeptide fragment of the LIPC chimeric RXR component comprises or consists of helices 9-12 of a second species RXR.
In another embodiment, a first polypeptide fragment of a LIPC chimeric RXR component comprises or consists of helices 1-9 of a first species RXR, and a second polypeptide fragment of the LIPC chimeric RXR component comprises or consists of helices 10-12 of a second species RXR.
In another embodiment, a first polypeptide fragment of a LIPC chimeric RXR component comprises or consists of helices 1-10 of a first species RXR, and a second polypeptide fragment of the LIPC chimeric RXR component comprises or consists of helices 11-12 of a second species RXR.
In another embodiment, a first polypeptide fragment of a LIPC chimeric RXR component comprises or consists of helices 1-11 of a first species RXR, and a second polypeptide fragment of the LIPC chimeric RXR component comprises or consists of helix 12 of a second species RXR.
In another embodiment, a first polypeptide fragment of a LIPC chimeric RXR component comprises or consists of helices 1-12 of a first species RXR, and a second polypeptide fragment of the LIPC chimeric RXR component comprises or consists of an F domain of a second species RXR.
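The seven helix-swap embodiments above follow one pattern: the first species contributes helices 1 through n, and the second species contributes everything after helix n, which is the F domain when n is 12. The following Python sketch enumerates those pairings purely as an illustration. The function name and its parameters are hypothetical and are not taken from the specification.

```python
from typing import List, Tuple


def chimera_junctions(first_junction: int = 6, last_helix: int = 12) -> List[Tuple[str, str]]:
    """List (first-species fragment, second-species fragment) pairs.

    The junction falls after helix n for n = first_junction .. last_helix.
    """
    pairs = []
    for n in range(first_junction, last_helix + 1):
        first = f"helices 1-{n}"
        remaining = last_helix - n
        if remaining == 0:
            # All twelve helices come from the first species.
            second = "F domain"
        elif remaining == 1:
            second = f"helix {last_helix}"
        else:
            second = f"helices {n + 1}-{last_helix}"
        pairs.append((first, second))
    return pairs


junctions = chimera_junctions()
for first, second in junctions:
    print(f"{first} (first species RXR) + {second} (second species RXR)")
```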
In another embodiment, a LIPC RXR component comprises or consists of a truncated chimeric RXR. A chimeric RXR truncation can comprise a deletion of at least 1, 2, 3, 4, 5, 6, 8, 10, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 25, 26, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, or 240 amino acids. In certain embodiments, a chimeric RXR truncation results in a deletion of at least a partial polypeptide domain. In other embodiments, a chimeric RXR truncation results in a deletion of at least an entire polypeptide domain. In another embodiment, a chimeric RXR truncation results in a deletion of at least a partial E-domain, a complete E-domain, a partial F-domain, a complete F-domain, an EF-domain helix 1, an EF-domain helix 2, an EF-domain helix 3, an EF-domain helix 4, an EF-domain helix 5, an EF-domain helix 6, an EF-domain helix 7, an EF-domain helix 8, an EF-domain helix 9, an EF-domain helix 10, an EF-domain helix 11, an EF-domain helix 12, and/or an EF-domain β-pleated sheet. A combination of several partial and/or complete domain deletions may also be performed.
In certain embodiments, a LIPC truncated chimeric RXR component is encoded by a polynucleotide comprising or consisting of a nucleic acid sequence of SEQ ID NO: 74, SEQ ID NO: 75, SEQ ID NO: 76, SEQ ID NO: 77, SEQ ID NO: 78, or SEQ ID NO: 79, or fragments thereof. In another embodiment, a LIPC truncated chimeric RXR component comprises or consists of an amino acid sequence of SEQ ID NO: 80, SEQ ID NO: 81, SEQ ID NO: 82, SEQ ID NO: 83, SEQ ID NO: 84, or SEQ ID NO: 85, or fragment thereof.
In another embodiment, a LIPC chimeric RXR component is encoded by a polynucleotide comprising or consisting of a nucleic acid sequence of a) SEQ ID NO: 11, b) nucleotides 1-348 of SEQ ID NO: 12 and nucleotides 268-630 of SEQ ID NO: 13, c) nucleotides 1-408 of SEQ ID NO: 12 and nucleotides 337-630 of SEQ ID NO: 13, d) nucleotides 1-465 of SEQ ID NO: 12 and nucleotides 403-630 of SEQ ID NO: 13, e) nucleotides 1-555 of SEQ ID NO: 12 and nucleotides 490-630 of SEQ ID NO: 13, f) nucleotides 1-624 of SEQ ID NO: 12 and nucleotides 547-630 of SEQ ID NO: 13, g) nucleotides 1-645 of SEQ ID NO: 12 and nucleotides 601-630 of SEQ ID NO: 13, and h) nucleotides 1-717 of SEQ ID NO: 12 and/or nucleotides 613-630 of SEQ ID NO: 13, or a fragment thereof.
In another preferred embodiment, a LIPC chimeric RXR component comprises or consists of an amino acid sequence of a) SEQ ID NO: 14, b) amino acids 1-116 of SEQ ID NO: 15 and amino acids 90-210 of SEQ ID NO: 16, c) amino acids 1-136 of SEQ ID NO: 15 and amino acids 113-210 of SEQ ID NO: 16, d) amino acids 1-155 of SEQ ID NO: 15 and amino acids 135-210 of SEQ ID NO: 16, e) amino acids 1-185 of SEQ ID NO: 15 and amino acids 164-210 of SEQ ID NO: 16, f) amino acids 1-208 of SEQ ID NO: 15 and amino acids 183-210 of SEQ ID NO: 16, g) amino acids 1-215 of SEQ ID NO: 15 and amino acids 201-210 of SEQ ID NO: 16, and/or h) amino acids 1-239 of SEQ ID NO: 15 or amino acids 205-210 of SEQ ID NO: 16, or a fragment thereof.
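The nucleotide ranges and amino acid ranges recited in items b) through h) of the two preceding paragraphs can be cross-checked by codon arithmetic: an in-frame nucleotide range s-e corresponds to amino acids ((s - 1) / 3 + 1) through (e / 3). The following Python sketch performs that check. It assumes, without confirmation from the Sequence Listing, that SEQ ID NO: 12 and 13 are read in frame from nucleotide 1 and encode SEQ ID NO: 15 and 16, respectively; the function and variable names are hypothetical.

```python
from typing import Dict, List, Tuple

Range = Tuple[int, int]

# Ranges transcribed from items b) to h) above. Each entry holds:
# (nt range in SEQ ID NO: 12, nt range in SEQ ID NO: 13,
#  aa range in SEQ ID NO: 15, aa range in SEQ ID NO: 16)
RANGES: Dict[str, Tuple[Range, Range, Range, Range]] = {
    "b": ((1, 348), (268, 630), (1, 116), (90, 210)),
    "c": ((1, 408), (337, 630), (1, 136), (113, 210)),
    "d": ((1, 465), (403, 630), (1, 155), (135, 210)),
    "e": ((1, 555), (490, 630), (1, 185), (164, 210)),
    "f": ((1, 624), (547, 630), (1, 208), (183, 210)),
    "g": ((1, 645), (601, 630), (1, 215), (201, 210)),
    "h": ((1, 717), (613, 630), (1, 239), (205, 210)),
}


def nt_to_aa(start: int, end: int) -> Range:
    """Convert a 1-based, inclusive, in-frame nucleotide range to residues."""
    if (start - 1) % 3 != 0 or end % 3 != 0:
        raise ValueError(f"range {start}-{end} does not fall on codon boundaries")
    return ((start - 1) // 3 + 1, end // 3)


def inconsistent_items(ranges: Dict[str, Tuple[Range, Range, Range, Range]]) -> List[str]:
    """Return the item letters whose nucleotide and amino acid ranges disagree."""
    bad = []
    for item, (nt_first, nt_second, aa_first, aa_second) in ranges.items():
        if nt_to_aa(*nt_first) != aa_first or nt_to_aa(*nt_second) != aa_second:
            bad.append(item)
    return bad


mismatches = inconsistent_items(RANGES)
```

Under the stated in-frame assumption, every recited nucleotide range maps onto the corresponding recited amino acid range, so the list of mismatches is empty.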
EcR and/or RXR Polypeptide Components
In certain embodiments, EcR and/or USP/RXR polypeptides used in a LIPC of the invention comprise, or consist of, at least one or more EcR and/or RXR substitution mutants selected from the group consisting of substitution mutants described in any one or more of International PCT Publ. Nos. WO 2001/070816, WO 2002/066612, WO 2002/066613, WO 2002/066614, WO 2002/066615, WO 2003/027266, WO 2003/027289, and WO 2005/108617, each of which is incorporated by reference herein in its entirety.
Gene Expression Cassettes of the Present Invention
One embodiment of the invention includes a ligand inducible polypeptide coupler (LIPC) system comprising: a) a first expression cassette that is capable of being expressed in a host cell comprising a polynucleotide that encodes a first fusion protein (polypeptide) comprising i) a nuclear receptor polypeptide or fragment thereof and ii) a first inactive signaling domain; and b) a second expression cassette that is capable of being expressed in the host cell comprising a polynucleotide sequence that encodes a second, separate, fusion protein (polypeptide) comprising i) a second nuclear receptor polypeptide or fragment thereof and ii) a second inactive signaling domain; wherein the first and second inactive signaling domains are activated upon association of the two fusion proteins with one another.
Another embodiment of the invention includes a ligand inducible polypeptide coupler (LIPC) system comprising: a) a first expression cassette that is capable of being expressed in a host cell comprising a polynucleotide that encodes a first fusion protein (polypeptide) comprising i) an arthropod nuclear receptor polypeptide or fragment thereof; and ii) a first inactive signaling domain; and b) a second expression cassette that is capable of being expressed in the host cell comprising a polynucleotide sequence that encodes a second, separate, fusion protein (polypeptide) comprising i) a second, non-arthropod nuclear receptor polypeptide or fragment thereof; and ii) a second inactive signaling domain; wherein the first and second inactive signaling domains are activated upon association of the two fusion proteins with one another. In another embodiment the non-arthropod nuclear receptor comprises a non-dipteran/non-lepidopteran nuclear receptor polypeptide or fragment thereof. In another embodiment the non-arthropod nuclear receptor comprises a mammalian nuclear receptor polypeptide or fragment thereof. In another embodiment the non-arthropod nuclear receptor comprises a human nuclear receptor polypeptide or fragment thereof. In another embodiment the non-arthropod nuclear receptor comprises a murine nuclear receptor polypeptide or fragment thereof. In another embodiment the non-arthropod nuclear receptor comprises a chimeric nuclear receptor polypeptide or fragments thereof, wherein the chimera comprises polypeptide components from two or more different species.
One embodiment of the invention includes a ligand inducible polypeptide coupler (LIPC) system comprising: a) a first expression cassette that is capable of being expressed in a host cell comprising a polynucleotide that encodes a first fusion protein (polypeptide) comprising i) an ecdysone receptor (EcR) polypeptide or fragment thereof and ii) a first inactive signaling domain; and b) a second expression cassette that is capable of being expressed in the host cell comprising a polynucleotide sequence that encodes a second, separate, fusion protein (polypeptide) comprising i) a retinoid X receptor polypeptide or fragment thereof and ii) a second inactive signaling domain; wherein the first and second inactive signaling domains are activated upon association of the two fusion proteins with one another.
Ligands, optionally, for use in the invention as described below, when combined with an EcR ligand binding domain and an RXR ligand binding domain, as described herein, provide the means for external temporal regulation (activation or withdrawal of activation; i.e., via cessation of administration of, or contact with, ligand) of the signaling domain(s). Binding of ligand to the LIPC EcR and RXR polypeptide components enables protein-protein interaction of LIPC-fusion proteins and, in certain embodiments, activation of the signaling domains. In some embodiments, one or more of the LIPC domains is varied, producing a hybrid LIPC. In certain embodiments, hybrid genes and the resulting hybrid proteins are optimized in the chosen host cell or organism for desired activity and complementary binding of the ligand.
Embodiments of the invention include ligand inducible polypeptide coupler systems that allow for tailored (e.g., dose-regulated, inducible) activation of inactive domains (e.g., signaling molecules, signaling domains, complementary protein fragments, protein subunits, and natural or engineered partial or truncated proteins) through protein-protein interaction or association.
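The two-cassette arrangement described above can be summarized as a small state model: two separate fusion proteins, each pairing a nuclear receptor component with an inactive signaling domain, whose signaling domains become active only when ligand brings the two fusions together. The following Python sketch is a deliberately simplified illustration. The class names, field names, and the all-or-nothing activation rule are assumptions made for the example; it does not model dose response, binding kinetics, or any particular signaling domain.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FusionProtein:
    """One LIPC fusion: a receptor component fused to an inactive signaling domain."""
    receptor_component: str   # e.g. "EcR LBD" or "RXR LBD"
    signaling_domain: str     # placeholder label for an inactive domain


@dataclass(frozen=True)
class CouplerSystem:
    """Two separately expressed fusion proteins forming one coupler system."""
    first: FusionProtein
    second: FusionProtein

    def signaling_active(
        self,
        ligand_present: bool,
        first_expressed: bool = True,
        second_expressed: bool = True,
    ) -> bool:
        # Simplifying assumption: the fusions associate, and the signaling
        # domains are activated, exactly when both fusions are expressed
        # and activating ligand is present.
        return ligand_present and first_expressed and second_expressed


system = CouplerSystem(
    first=FusionProtein("EcR LBD", "inactive signaling domain 1"),
    second=FusionProtein("RXR LBD", "inactive signaling domain 2"),
)
with_ligand = system.signaling_active(ligand_present=True)
after_withdrawal = system.signaling_active(ligand_present=False)
```

In this toy model, withdrawing ligand returns the system to the inactive state, mirroring the external temporal regulation described in the preceding paragraph.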
In certain embodiments, a signaling protein and/or polypeptide domain whose activity is to be modulated is a homologous protein or fragment thereof with respect to the host cell. In other embodiments, the signaling protein and/or polypeptide domain whose activity is to be modulated is a heterologous protein or fragment thereof with respect to the host cell.
Embodiments of the invention include compositions and uses of signaling proteins and polypeptide domains encoding polypeptides or signaling domains involved in a disease, a disorder, a dysfunction, a genetic defect, targets for drug discovery, and proteomics analyses and applications, etc.
Numerous cell signaling polypeptides and domains (e.g., signaling proteins) that require association (e.g., dimerization or oligomerization) or protein-protein interaction for activation have been identified in a wide-range of organisms and can be used in the present invention. Many of these signaling molecules participate in signaling pathways that are conserved throughout a large number of organisms.
For example, many cell surface receptors anchored in the membrane with a single transmembrane domain are primarily activated by endogenous (i.e., naturally occurring) ligand-induced dimerization or oligomerization. Generally, these molecules do not associate on their own, but are brought together (or in close proximity to their binding partner) through interactions with an endogenous extracellular ligand. In contrast to endogenous naturally occurring cell signal protein activation, the present invention provides for a small-molecule, ligand inducible polypeptide coupler system to modulate (i.e., turn on, turn off, increase or decrease) activity, i.e., dimerization or oligomerization, of cell signaling proteins and domains via “on demand” administration (or withdrawal of administration) of a small molecule nuclear receptor activating ligand. For a review of various molecules and pathways that utilize protein dimerization or oligomerization for activation, see, e.g., Klemm, et al. Annu. Rev. Immunol. 16:569-92 (1998), which is incorporated by reference herein in its entirety.
In certain embodiments the following signaling molecules and/or domains from cell surface receptors, intracellular signaling proteins, and their associated pathway members are envisaged for use with the invention as the first and/or second inactive signaling domain, signaling molecule, complementary protein fragment, protein subunit, or natural or engineered partial or truncated protein of the invention:
Receptor tyrosine kinase (RTK) receptors and their associated pathway members, including RTK class I (EGF receptor family) (ErbB family), RTK class II (Insulin receptor family), RTK class III (PDGF receptor family), RTK class IV (FGF receptor family), RTK class V (VEGF receptors family), RTK class VI (HGF receptor family), RTK class VII (Trk receptor family), RTK class VIII (Eph receptor family), RTK class IX (AXL receptor family), RTK class X (LTK receptor family), RTK class XI (TIE receptor family), RTK class XII (ROR receptor family), RTK class XIII (DDR receptor family), RTK class XIV (RET receptor family), RTK class XV (KLG receptor family), RTK class XVI (RYK receptor family), and RTK class XVII (MuSK receptor family).
Cytokine receptors and their associated pathway members, including type I cytokine receptor (e.g., Type I interleukin receptors, Erythropoietin receptor, GM-CSF receptor, G-CSF receptor, growth hormone receptor, prolactin receptor, Oncostatin M receptor, and Leukemia inhibitory factor receptor), type II cytokine receptor (e.g., Type II interleukin receptors, interferon-alpha/beta receptor, and interferon-gamma receptor), members of the immunoglobulin superfamily (e.g., Interleukin-1 receptor, CSF1, C-kit receptor, and Interleukin-18 receptor). Tumor necrosis factor receptor family (e.g., CD27, CD30, CD40, CD120, and Lymphotoxin beta receptor). Chemokine receptors (e.g., Interleukin-8 receptor, CCR1, CXCR4, MCAF receptor, and NAP-2 receptor). TGF beta receptors (e.g., TGF beta receptor 1 and TGF beta receptor 2). Antigen receptor signaling receptors (e.g., B cell and T cell antigen receptors).
Additional signaling proteins and/or domains that are envisaged to be used with the present invention include, but are not limited to, firefly luciferase (fLuc), Signal Transducer and Activator of Transcription (STAT) proteins, NF-κB proteins, antibodies (including antibody fragments), transcription factors, nuclear receptors, including nuclear hormone receptors, 14-3-3 proteins, G-protein coupled receptors, G proteins, kinesin, triosephosphate isomerase (TIM), alcohol dehydrogenase, Factor XI, Factor XIII, Toll-like receptors, fibrinogen, Bcl-2 family members, Smad family members, and the like.
In certain embodiments, the inactive signaling domain of the invention has a transmembrane domain. In some embodiments the transmembrane domain is a single-pass transmembrane domain. In certain embodiments, the single-pass transmembrane domain is a single-pass type I transmembrane domain. In other embodiments, the transmembrane domain is a multi-pass transmembrane domain. In certain embodiments, the transmembrane domain(s) have a hydrophilic alpha helix motif.
Acceptable activating ligands that can be used with the invention are any that modulate protein-protein interaction of the signaling domains of the switch system wherein the presence of the ligand results in activation of the inactive signaling domains. Such ligands include those disclosed in International PCT Publ. Nos. WO 2002/066612, WO 2002/066614, WO 2003/105849, WO 2004/072254, WO 2004/005478, WO 2004/078924, WO 2005/017126, WO 2008/153801, WO 2009/114201, WO 2013/036758, WO 2014/144380 and in U.S. Pat. Nos. 6,258,603 and 8,748,125, each of which is incorporated by reference herein in its entirety.
Exemplary ligands include, but are not limited to, ponasterone, muristerone A, 9-cis-retinoic acid, synthetic analogs of retinoic acid, N,N′-diacylhydrazines such as those disclosed in U.S. Pat. Nos. 6,013,836, 5,117,057, 5,530,028 and 537,872, each of which is incorporated by reference herein in its entirety; dibenzoylalkyl cyanohydrazines such as those disclosed in European Application No. 461809, which is incorporated by reference herein in its entirety; N-alkyl-N,N′-diaroylhydrazines such as those disclosed in U.S. Pat. No. 5,225,443, which is incorporated by reference herein in its entirety; N-acyl-N-alkylcarbonylhydrazines such as those disclosed in European Application No. 234994, which is incorporated by reference herein in its entirety; N-aroyl-N-alkyl-N′-aroylhydrazines such as those described in U.S. Pat. No. 4,985,461, which is incorporated by reference herein in its entirety; and other similar materials including 3,5-di-tert-butyl-4-hydroxy-N-isobutyl-benzamide, 8-O-acetylharpagide, and the like.
In certain embodiments, the ligand for use in the methods of the present invention is a compound of the formula:
wherein E is a (C4-C6)alkyl containing a tertiary carbon or a cyano(C3-C5)alkyl containing a tertiary carbon; R1 is H, Me, Et, i-Pr, F, formyl, CF3, CHF2, CHCl2, CH2F, CH2Cl, CH2OH, CH2OMe, CH2CN, CN, C≡CH, 1-propynyl, 2-propynyl, vinyl, OH, OMe, OEt, cyclopropyl, CF2CF3, CH═CHCN, allyl, azido, SCN, or SCHF2;
R2 is H, Me, Et, n-Pr, i-Pr, formyl, CF3, CHF2, CHCl2, CH2F, CH2Cl, CH2OH, CH2OMe, CH2CN, CN, C≡CH, 1-propynyl, 2-propynyl, vinyl, Ac, F, Cl, OH, OMe, OEt, O-n-Pr, OAc, NMe2, NEt2, SMe, SEt, SOCF3, OCF2CF2H, COEt, cyclopropyl, CF2CF3, CH═CHCN, allyl, azido, OCF3, OCHF2, O-i-Pr, SCN, SCHF2, SOMe, NH—CN, or joined with R3 and the phenyl carbons to which R2 and R3 are attached to form an ethylenedioxy, a dihydrofuryl ring with the oxygen adjacent to a phenyl carbon, or a dihydropyryl ring with the oxygen adjacent to a phenyl carbon;
R3 is H, Et, or joined with R2 and the phenyl carbons to which R2 and R3 are attached to form an ethylenedioxy, a dihydrofuryl ring with the oxygen adjacent to a phenyl carbon, or a dihydropyryl ring with the oxygen adjacent to a phenyl carbon; R4, R5, and R6 are independently H, Me, Et, F, Cl, Br, formyl, CF3, CHF2, CHCl2, CH2F, CH2Cl, CH2OH, CN, C≡CH, 1-propynyl, 2-propynyl, vinyl, OMe, OEt, SMe, or SEt.
In some embodiments, the ligand for use with the methods of the present invention is a compound of the formula:
wherein R1, R2, R3, and R4 are:
a) H, (C1-C6)alkyl; (C1-C6)haloalkyl; (C1-C6)cyanoalkyl; (C1-C6)hydroxyalkyl; (C1-C4)alkoxy(C1-C6)alkyl; (C2-C6)alkenyl optionally substituted with halo, cyano, hydroxyl, or (C1-C4)alkyl; (C2-C6)alkynyl optionally substituted with halo, cyano, hydroxyl, or (C1-C4)alkyl; (C3-C5)cycloalkyl optionally substituted with halo, cyano, hydroxyl, or (C1-C4)alkyl; oxiranyl optionally substituted with halo, cyano, or (C1-C4)alkyl; or
b) unsubstituted or substituted benzyl wherein the substituents are independently 1 to 5 H, halo, nitro, cyano, hydroxyl, (C1-C6)alkyl, or (C1-C6)alkoxy; and R5 is H; OH; F; Cl; or (C1-C6)alkoxy.
In some embodiments, when R1, R2, R3, and R4 are H, then R5 is not H or hydroxy.
In certain embodiments, at least one of R1, R2, R3, and R4 is not H. In another embodiment, at least two of R1, R2, R3, and R4 are not H. In another embodiment, at least three of R1, R2, R3, and R4 are not H. In another embodiment, each of R1, R2, R3, and R4 is not H.
In some embodiments, when R1, R2, R3, and R4 are H, then R5 is not methoxy, when R1, R2, R3, and R4 are isopropyl, then R5 is not hydroxy, and when R1, R2, and R3 are H and R5 is hydroxy, then R4 is not methyl or ethyl.
In specific embodiments, R1, R2, R3, and R4 are: a) H, (C1-C6)alkyl; (C1-C6)haloalkyl; (C1-C6)cyanoalkyl; (C1-C6)hydroxyalkyl; (C1-C4)alkoxy(C1-C6)alkyl; (C2-C6)alkenyl; (C2-C6)alkynyl; oxiranyl optionally substituted with halo, cyano, or (C1-C4)alkyl; or b) unsubstituted or substituted benzyl wherein the substituents are independently 1 to 5 H, halo, cyano, or (C1-C6)alkyl; and R5 is H, OH, F, Cl, or (C1-C6)alkoxy.
In other specific embodiments, R1, R2, R3, and R4 are H, (C1-C6)alkyl; (C2-C6)alkenyl; (C2-C6)alkynyl; 2′-ethyloxiranyl, or benzyl; and R5 is H; OH; or F.
In specific embodiments, when R1, R2, R3, and R4 are isopropyl, then R5 is not hydroxyl; when R5 is H, hydroxyl, methoxy, or fluoro, then at least one of R1, R2, R3, and R4 is not H; when only one of R1, R2, R3, and R4 is methyl, and R5 is H or hydroxyl, then the remainder of R1, R2, R3, and R4 are not H; when both R4 and one of R1, R2, and R3 are methyl, then R5 is neither H nor hydroxyl; when R1, R2, R3, and R4 are all methyl, then R5 is not hydroxyl; and when R1, R2, and R3 are all H and R5 is hydroxyl, then R4 is not ethyl, n-propyl, n-butyl, allyl, or benzyl.
Certain embodiments of the invention include the use of the following steroidal ligands: 20-hydroxyecdysone, 2-methyl ether; 20-hydroxyecdysone, 3-methyl ether; 20-hydroxyecdysone, 14-methyl ether; 20-hydroxyecdysone, 2,22-dimethyl ether; 20-hydroxyecdysone, 3,22-dimethyl ether; 20-hydroxyecdysone, 14,22-dimethyl ether; 20-hydroxyecdysone, 22,25-dimethyl ether; 20-hydroxyecdysone, 2,3,14,22-tetramethyl ether; 20-hydroxyecdysone, 22-n-propyl ether; 20-hydroxyecdysone, 22-n-butyl ether; 20-hydroxyecdysone, 22-allyl ether; 20-hydroxyecdysone, 22-benzyl ether; 20-hydroxyecdysone, 22-(28R,S)-2′-ethyloxiranyl ether; ponasterone A, 2-methyl ether; ponasterone A, 14-methyl ether; ponasterone A, 22-methyl ether; ponasterone A, 2,22-dimethyl ether; ponasterone A, 3,22-dimethyl ether; ponasterone A, 14,22-dimethyl ether; and dacryhainansterone, 22-methyl ether.
Additional embodiments of the invention include the use of the following steroidal ligands: 25,26-didehydroponasterone A, (iso-stachysterone C (Δ25(26))), shidasterone (stachysterone D), stachysterone C, 22-deoxy-20-hydroxyecdysone (taxisterone), ponasterone A, polyporusterone B, 22-dehydro-20-hydroxyecdysone, ponasterone A 22-methyl ether, 20-hydroxyecdysone, pterosterone, (25R)-inokosterone, (25S)-inokosterone, pinnatasterone, 25-fluoroponasterone A, 24(28)-dehydromakisterone A, 24-epi-makisterone A, makisterone A, 20-hydroxyecdysone-22-methyl ether, 20-hydroxyecdysone-25-methyl ether, abutasterone, 22,23-di-epi-geradiasterone, 20,26-dihydroxyecdysone (podecdysone C), 24-epi-abutasterone, geradiasterone, 29-norcyasterone, ajugasterone B, 24(28)[Z]-dehydroamarasterone B, amarasterone A, makisterone C, rapisterone C, 20-hydroxyecdysone-22,25-dimethyl ether, 20-hydroxyecdysone-22-ethyl ether, carthamosterone, 24(25)-dehydroprecyasterone, leuzeasterone, cyasterone, 20-hydroxyecdysone-22-allyl ether, 24(28)[Z]-dehydro-29-hydroxymakisterone C, 20-hydroxyecdysone-22-acetate, viticosterone E (20-hydroxyecdysone 25-acetate), 20-hydroxyecdysone-22-n-propyl ether, 24-hydroxycyasterone, 20-hydroxyecdysone-22-n-butyl ether, ponasterone A 22-hemisuccinate, 22-acetoacetyl-20-hydroxyecdysone, 20-hydroxyecdysone-22-benzyl ether, canescensterone, 20-hydroxyecdysone-22-hemisuccinate, inokosterone-26-hemisuccinate, 20-hydroxyecdysone-22-benzoate, 20-hydroxyecdysone-22-β-D-glucopyranoside, 20-hydroxyecdysone-25-β-D-glucopyranoside, sileneoside A (20-hydroxyecdysone-22α-galactoside), 3-deoxy-1β,20-dihydroxyecdysone (3-deoxyintegristerone A), 2-deoxyintegristerone A, 1-epi-integristerone A, integristerone A, sileneoside C (integristerone A 22α-galactoside), 2,22-dideoxy-20-hydroxyecdysone, 2-deoxy-20-hydroxyecdysone, 2-deoxy-20-hydroxyecdysone-3-acetate, 2-deoxy-20,26-dihydroxyecdysone, 2-deoxy-20-hydroxyecdysone-22-acetate, 2-deoxy-20-hydroxyecdysone-3,22-diacetate, 2-deoxy-20-hydroxyecdysone-22-benzoate, ponasterone A 2-hemisuccinate, 20-hydroxyecdysone-2-methyl ether, 20-hydroxyecdysone-2-acetate, 20-hydroxyecdysone-2-hemisuccinate, 20-hydroxyecdysone-2-β-D-glucopyranoside, 2-dansyl-20-hydroxyecdysone, 20-hydroxyecdysone-2,22-dimethyl ether, ponasterone A 3β-D-xylopyranoside (limnantheoside B), 20-hydroxyecdysone-3-methyl ether, 20-hydroxyecdysone-3-acetate, 20-hydroxyecdysone-3β-D-xylopyranoside (limnantheoside A), 20-hydroxyecdysone-3-β-D-glucopyranoside, sileneoside D (20-hydroxyecdysone-3α-galactoside), 20-hydroxyecdysone 3β-D-glucopyranosyl-[1-3]-β-D-xylopyranoside (limnantheoside C), 20-hydroxyecdysone-3,22-dimethyl ether, cyasterone-3-acetate, 2-dehydro-3-epi-20-hydroxyecdysone, 3-epi-20-hydroxyecdysone (coronatasterone), rapisterone D, 3-dehydro-20-hydroxyecdysone, 5β-hydroxy-25,26-didehydroponasterone A, 5β-hydroxystachysterone C, 25-deoxypolypodine B, polypodine B, 25-fluoropolypodine B, 5β-hydroxyabutasterone, 26-hydroxypolypodine B, 29-norsengosterone, sengosterone, 6β-hydroxy-20-hydroxyecdysone, 6α-hydroxy-20-hydroxyecdysone, 20-hydroxyecdysone-6-oxime, ponasterone A 6-carboxymethyloxime, 20-hydroxyecdysone-6-carboxymethyloxime, ajugasterone C, rapisterone B, muristerone A, atrotosterone B, atrotosterone A, turkesterone-2-acetate, punisterone (rhapontisterone), turkesterone, atrotosterone C, 25-hydroxyatrotosterone B, 25-hydroxyatrotosterone A, paxillosterone, turkesterone-2,22-diacetate, turkesterone-22-acetate, turkesterone-11α-acetate, turkesterone-2,11α-diacetate, turkesterone-11α-propionate, turkesterone-11α-butanoate, turkesterone-11α-hexanoate, turkesterone-11α-decanoate, turkesterone-11α-laurate, turkesterone-11α-myristate, turkesterone-11α-arachidate, 22-dehydro-12β-hydroxynorsengosterone, 22-dehydro-12β-hydroxycyasterone, 22-dehydro-12β-hydroxysengosterone, 14-deoxy(14α-H)-20-hydroxyecdysone, 20-hydroxyecdysone-14-methyl ether, 14α-perhydroxy-20-hydroxyecdysone, 20-hydroxyecdysone 14,22-dimethyl ether, 20-hydroxyecdysone-2,3,14,22-tetramethyl ether, (20S)-22-deoxy-20,21-dihydroxyecdysone, 22,25-dideoxyecdysone, (22S)-20-(2,2′-dimethylfuranyl)ecdysone, (22R)-20-(2,2′-dimethylfuranyl)ecdysone, 22-deoxyecdysone, 25-deoxyecdysone, 22-dehydroecdysone, ecdysone, 22-epi-ecdysone, 24-methylecdysone (20-deoxymakisterone A), ecdysone-22-hemisuccinate, 25-deoxyecdysone-22-β-D-glucopyranoside, ecdysone-22-myristate, 22-dehydro-20-iso-ecdysone, 20-iso-ecdysone, 20-iso-22-epi-ecdysone, 2-deoxyecdysone, sileneoside E (2-deoxyecdysone 3β-glucoside; blechnoside A), 2-deoxyecdysone-22-acetate, 2-deoxyecdysone-3,22-diacetate, 2-deoxyecdysone-22-β-D-glucopyranoside, 2-deoxyecdysone glucopyranoside, 2-deoxy-21-hydroxyecdysone, 3-epi-22-iso-ecdysone, 3-dehydro-2-deoxyecdysone (silenosterone), 3-dehydroecdysone, 3-dehydro-2-deoxyecdysone-22-acetate, ecdysone-6-carboxymethyloxime, ecdysone-2,3-acetonide, 14-epi-20-hydroxyecdysone-2,3-acetonide, 20-hydroxyecdysone-2,3-acetonide, 20-hydroxyecdysone-20,22-acetonide, 14-epi-20-hydroxyecdysone-2,3,20,22-diacetonide, paxillosterone-20,22-p-hydroxybenzylidene acetal, poststerone, (20S)-dihydropoststerone, (20S)dihydropoststerone, poststerone-20-dansylhydrazine, (20S)-dihydropoststerone-2,3,20-tribenzoate, (20R)-dihydropoststerone-2,3,20-tribenzoate, (20R)dihydropoststerone-2,3-acetonide, (20S)dihydropoststerone-2,3-acetonide, (5α-H)-dihydrorubrosterone, 2,14,22,25-tetradeoxy-5α-ecdysone, 5α-ketodiol, bombycosterol, 2α,3α,22S,25-tetrahydroxy-5α-cholestan-6-one, (5α-H)-2-deoxy-21-hydroxyecdysone, castasterone, 24-epi-castasterone, (5α-H)-2-deoxyintegristerone A, (5α-H)-22-deoxyintegristerone A, (5α-H)-20-hydroxyecdysone, 24,25-didehydrodacryhainansterone, 25,26-didehydrodacryhainansterone, 5-deoxykaladasterone (dacryhainansterone), (14α-H)-14-deoxy-25-hydroxydacryhainansterone, 25-hydroxydacryhainansterone, rubrosterone, (5β-H)-dihydrorubrosterone, dihydrorubrosterone-17β-acetate, sidisterone, 20-hydroxyecdysone-2,3,22-triacetate, 14-deoxy(14β-H)-20-hydroxyecdysone, 14-epi-20-hydroxyecdysone, 9β,20-dihydroxyecdysone, malacosterone, 2-deoxypolypodine B-3-β-D-glucopyranoside, ajugalactone, cheilanthone B, 2β,3β,6α-trihydroxy-5β-cholestane, 2β,3β,6β-trihydroxy-5β-cholestane, 14-dehydroshidasterone, stachysterone B, 2β,3β,9α,20R,22R,25-hexahydroxy-5β-cholest-7,14-dien-6-one, kaladasterone, (14β-H)-14-deoxy-25-hydroxydacryhainansterone, 4-dehydro-20-hydroxyecdysone, 14-methyl-12-en-shidasterone, 14-methyl-12-en-15,20-dihydroxyecdysone, podecdysone B, 2β,3β,20R,22R-tetrahydroxy-25-fluoro-5β-cholest-8,14-dien-6-one (25-fluoropodecdysone B), calonysterone, 14-deoxy-14,18-cyclo-20-hydroxyecdysone, 9α,14α-epoxy-20-hydroxyecdysone, 9β,14β-epoxy-20-hydroxyecdysone, 9α,14α-epoxy-20-hydroxyecdysone 2,3,20,22-diacetonide, 28-homobrassinolide, and iso-homobrassinolide.
In some embodiments, the ligand for use with the methods of the present invention is a compound of the general formula:
wherein X and X′ are independently O or S;
Y is:
(a) substituted or unsubstituted phenyl wherein the substituents are independently 1-5 H, (C1-C4)alkyl, (C1-C4)alkoxy, (C2-C4)alkenyl, halo (F, Cl, Br, I), (C1-C4)haloalkyl, hydroxy, amino, cyano, or nitro; or
(b) substituted or unsubstituted 2-pyridyl, 3-pyridyl, or 4-pyridyl, wherein the substituents are independently 1-4 H, (C1-C4)alkyl, (C1-C4)alkoxy, (C2-C4)alkenyl, halo (F, Cl, Br, I), (C1-C4)haloalkyl, hydroxy, amino, cyano, or nitro;
R1 and R2 are independently: H; cyano; cyano-substituted or unsubstituted (C1-C7) branched or straight-chain alkyl; cyano-substituted or unsubstituted (C2-C7) branched or straight-chain alkenyl; cyano-substituted or unsubstituted (C3-C7) branched or straight-chain alkenylalkyl; or together the valences of R1 and R2 form a (C1-C7)cyano-substituted or unsubstituted alkylidene group (RaRbC═) wherein the sum of non-substituent carbons in Ra and Rb is 0-6;
R3 is H, methyl, ethyl, n-propyl, isopropyl, or cyano;
R4, R7, and R8 are independently: H, (C1-C4)alkyl, (C1-C4)alkoxy, (C2-C4)alkenyl, halo (F, Cl, Br, I), (C1-C4)haloalkyl, hydroxy, amino, cyano, or nitro; and
R5 and R6 are independently: H, (C1-C4)alkyl, (C2-C4)alkenyl, (C3-C4)alkenylalkyl, halo (F, Cl, Br, I), C1-C4 haloalkyl, (C1-C4)alkoxy, hydroxy, amino, cyano, nitro, or together as a linkage of the type (—OCHR9CHR10O—) form a ring with the phenyl carbons to which they are attached; wherein R9 and R10 are independently: H, halo, (C1-C3)alkyl, (C2-C3)alkenyl, (C1-C3)alkoxy(C1-C3)alkyl, benzoyloxy(C1-C3)alkyl, hydroxy(C1-C3)alkyl, halo(C1-C3)alkyl, formyl, formyl(C1-C3)alkyl, cyano, cyano(C1-C3)alkyl, carboxy, carboxy(C1-C3)alkyl, (C1-C3)alkoxycarbonyl(C1-C3)alkyl, (C1-C3)alkylcarbonyl(C1-C3)alkyl, (C1-C3)alkanoyloxy(C1-C3)alkyl, amino(C1-C3)alkyl, (C1-C3)alkylamino(C1-C3)alkyl (—(CH2)nRcRc), oximo (—CH═NOH), oximo(C1-C3)alkyl, (C1-C3)alkoximo (—C═NORd), alkoximo(C1-C3)alkyl, (C1-C3)carboxamido (—C(O)NReRf), (C1-C3)carboxamido(C1-C3)alkyl, (C1-C3)semicarbazido (—C═NNHC(O)NReRf), semicarbazido(C1-C3)alkyl, aminocarbonyloxy (—OC(O)NHRg), aminocarbonyloxy(C1-C3)alkyl, pentafluorophenyloxycarbonyl, pentafluorophenyloxycarbonyl(C1-C3)alkyl, p-toluenesulfonyloxy(C1-C3)alkyl, arylsulfonyloxy(C1-C3)alkyl, (C1-C3)thio(C1-C3)alkyl, (C1-C3)alkylsulfoxido(C1-C3)alkyl, (C1-C3)alkylsulfonyl(C1-C3)alkyl, or (C1-C5)trisubstituted-siloxy(C1-C3)alkyl (—(CH2)nSiORdReRg); wherein n=1-3, Rc and Rd represent straight or branched hydrocarbon chains of the indicated length, Re, Rf represent H or straight or branched hydrocarbon chains of the indicated length, Rg represents (C1-C3)alkyl or aryl optionally substituted with halo or (C1-C3)alkyl, and Rc, Rd, Re, Rf, and Rg are independent of one another;
provided that
i) when R9 and R10 are both H, or
ii) when either R9 or R10 is halo, (C1-C3)alkyl, (C1-C3)alkoxy(C1-C3)alkyl, or benzoyloxy(C1-C3)alkyl, or
iii) when R5 and R6 do not together form a linkage of the type (—OCHR9CHR10O—),
then the number of carbon atoms, excluding those of cyano substitution, for either or both of groups R1 or R2 is greater than 4, and the number of carbon atoms, excluding those of cyano substitution, for the sum of groups R1, R2, and R3 is 10, 11, or 12.
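By way of illustration only, the proviso above can be expressed as a short check. The following is a minimal sketch under one reading of the proviso, namely that at least one of R1 and R2 must have more than 4 carbon atoms and that the carbon total over R1, R2, and R3 must be 10, 11, or 12 whenever any of conditions i) to iii) holds. The class name, field names, and substituent-class labels are hypothetical and are introduced solely for this example.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical labels for the substituent classes named in condition ii).
CONDITION_II_CLASSES = {"halo", "alkyl", "alkoxyalkyl", "benzoyloxyalkyl"}


@dataclass
class Candidate:
    """Carbon counts exclude carbons belonging to cyano substitution."""
    r1_carbons: int
    r2_carbons: int
    r3_carbons: int
    # True if R5 and R6 together form the (-OCHR9CHR10O-) linkage.
    has_linkage: bool
    # Substituent class of R9 and R10 (e.g. "H", "halo"); None if no linkage.
    r9_class: Optional[str] = None
    r10_class: Optional[str] = None


def proviso_applies(c: Candidate) -> bool:
    """Return True if any of conditions i), ii), or iii) holds."""
    if not c.has_linkage:                                 # condition iii)
        return True
    if c.r9_class == "H" and c.r10_class == "H":          # condition i)
        return True
    return (c.r9_class in CONDITION_II_CLASSES            # condition ii)
            or c.r10_class in CONDITION_II_CLASSES)


def satisfies_proviso(c: Candidate) -> bool:
    """Check the carbon-count limits when the proviso is triggered."""
    if not proviso_applies(c):
        return True  # the carbon-count limits are not triggered
    one_group_large = c.r1_carbons > 4 or c.r2_carbons > 4
    total = c.r1_carbons + c.r2_carbons + c.r3_carbons
    return one_group_large and total in (10, 11, 12)
```

A candidate with R1 of 5 carbons, R2 of 4 carbons, R3 of 1 carbon, and no linkage would pass under this reading, whereas one with R1 and R2 of 4 carbons each would not.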
A novel ecdysone receptor/retinoid X receptor-based ligand inducible polypeptide coupler system of the invention may comprise an expression cassette having a polynucleotide sequence that encodes a hybrid polypeptide comprising an EcR nuclear receptor polypeptide component and an inactive signaling domain or a RXR nuclear receptor polypeptide component and an inactive signaling domain. These expression cassettes, the polynucleotides they comprise, and the hybrid polypeptides they encode are useful as components of an EcR/RXR-based ligand inducible polypeptide coupler system to modulate the activity of signaling domains within a host cell.
Thus, the present invention provides an isolated polynucleotide that encodes a hybrid polypeptide having an EcR nuclear receptor polypeptide component and an inactive signaling domain and/or a RXR nuclear receptor polypeptide component and an inactive signaling domain. The isolated polynucleotides that encode the EcR and/or RXR nuclear receptor polypeptide components of the invention comprise, but are not limited to, the polynucleotide sequences described above, including wild-type, truncated, and substitution mutation-containing EcR polypeptides described herein and/or wild-type, truncated, and chimeric RXR polypeptides described herein, including combinations thereof.
In addition, the isolated polynucleotides of the present invention can have polynucleotide sequences that encode signaling domains, including those described herein. The polynucleotide sequences of such signaling domains are readily accessible via publicly available databases that are known to those of ordinary skill in the art. Such databases include, but are not limited to, GenBank (ncbi.nlm.nih.gov/genbank), UniProt (uniprot.org), and the like.
The novel ecdysone receptor/retinoid X receptor-based ligand inducible polypeptide coupler system of the invention can comprise an expression cassette having a polynucleotide that encodes a hybrid polypeptide comprising an EcR polypeptide and/or an inactive signaling domain or a RXR polypeptide and an inactive signaling domain. These expression cassettes, the polynucleotides they comprise, and the hybrid polypeptides they encode are useful as components of an EcR/RXR-based ligand inducible polypeptide coupler system to modulate the activity of signaling domains within a host cell.
Thus, the present invention also relates to an isolated hybrid polypeptide having an EcR polypeptide and an inactive signaling domain (e.g., signaling molecules, signaling domains, complementary protein fragments, protein subunits, and natural or engineered partial or truncated proteins) and/or a RXR polypeptide and an inactive signaling domain (e.g., signaling molecules, signaling domains, complementary protein fragments, protein subunits, and natural or engineered partial or truncated proteins) according to the invention. The EcR and/or RXR domains of the isolated polypeptides of the invention can comprise, but are not limited to, polypeptide sequences described herein, including wild-type, truncated, functional fragments, and substitution mutation-containing EcR ligand binding domains described herein and/or wild-type, truncated, functional fragments, and chimeric RXR polypeptides described herein, including combinations thereof.
In addition, the isolated hybrid polypeptides of the invention can have signaling domains (e.g., signaling molecules, signaling domains, complementary protein fragments, protein subunits, and natural or engineered partial or truncated proteins), including those described herein. The amino acid sequences of such signaling domains are readily accessible via publicly available databases that are known to those of ordinary skill in the art. Such databases include, but are not limited to, GenBank (ncbi.nlm.nih.gov/genbank), UniProt (uniprot.org), and the like.
The novel ecdysone receptor/retinoid X receptor-based ligand inducible polypeptide coupler system of the invention comprises an expression cassette comprising a polynucleotide that encodes a hybrid polypeptide comprising an EcR ligand binding domain and an inactive signaling domain (e.g., signaling molecules, signaling domains, complementary protein fragments, protein subunits, and natural or engineered partial or truncated proteins) and/or a RXR polypeptide and an inactive signaling domain (e.g., signaling molecules, signaling domains, complementary protein fragments, protein subunits, and natural or engineered partial or truncated proteins). These expression cassettes, the polynucleotides they comprise, and the hybrid polypeptides they encode can be expressed in a host cell using any suitable expression vector. Suitable expression vectors are well known to those of ordinary skill in the art and the choice of expression vector and optimal expression conditions in view of the desired host cell can be readily determined by one of ordinary skill in the art. Exemplary expression vectors that can be employed with the invention include, but are not limited to, the expression vectors described above.
As described above, the ligand inducible polypeptide coupler system of the present invention may be used to modulate protein-protein interaction, i.e., association, within a host cell. Modulation in transgenic host cells may be useful for the modulation of various proteins of interest. Thus, the invention provides an isolated host cell comprising a ligand inducible polypeptide coupler system according to the invention. The present invention also provides an isolated host cell comprising a ligand inducible polypeptide coupler system comprising one or more expression cassettes according to the invention. The invention also provides an isolated host cell comprising a polynucleotide or a polypeptide. The isolated host cell may be either a prokaryotic or a eukaryotic host cell.
In certain embodiments, the isolated host cell is a prokaryotic host cell or a eukaryotic host cell. In another specific embodiment, the isolated host cell is an invertebrate host cell or a vertebrate host cell. Such host cells may be selected from a bacterial cell, a fungal cell, a yeast cell, a nematode cell, an insect cell, a fish cell, a plant cell, an avian cell, an animal cell, and a mammalian cell. More specifically, the host cell is a yeast cell, a nematode cell, an insect cell, a plant cell, a zebrafish cell, a chicken cell, a hamster cell, a mouse cell, a rat cell, a rabbit cell, a cat cell, a dog cell, a bovine cell, a goat cell, a cow cell, a pig cell, a horse cell, a sheep cell, a simian cell, a monkey cell, a chimpanzee cell, or a human cell. Examples of host cells include, but are not limited to, fungal or yeast species such as Aspergillus, Trichoderma, Saccharomyces, Pichia, Candida, and Hansenula; bacterial species such as those in the genera Synechocystis, Synechococcus, Salmonella, Bacillus, Acinetobacter, Rhodococcus, Streptomyces, Escherichia, Pseudomonas, Methylomonas, Methylobacter, Alcaligenes, Anabaena, Thiobacillus, Methanobacterium, and Klebsiella; and animal and mammalian host cells.
In certain embodiments, the host cell is a yeast cell selected from the group consisting of a Saccharomyces, a Pichia, and a Candida host cell. In a specific embodiment, the host cell is a Caenorhabditis elegans nematode cell. In another specific embodiment, the host cell is a hamster cell. In another embodiment, the host cell is a murine cell. In another embodiment, the host cell is a monkey cell. In another specific embodiment, the host cell is a human cell.
In another embodiment, the host cell is a mammalian cell selected from the group consisting of a hamster cell, a mouse cell, a rat cell, a rabbit cell, a cat cell, a dog cell, a bovine cell, a goat cell, a cow cell, a pig cell, a horse cell, a sheep cell, a monkey cell, a chimpanzee cell, and a human cell. In certain embodiments the host cell is an immortalized cell, an immune cell, or a T-cell.
Host cell transformation is well known in the art and may be achieved by a variety of methods including but not limited to electroporation, viral infection, plasmid/vector transfection, non-viral vector mediated transfection, particle bombardment, and the like. Expression of desired gene products involves culturing the transformed host cells under suitable conditions and inducing expression of the transformed gene. Culture conditions and gene expression protocols in prokaryotic and eukaryotic cells are well known in the art. Cells may be harvested and the gene products isolated according to protocols specific for the gene product.
In addition, a host cell may be chosen that modulates the expression of the inserted polynucleotide, or modifies and processes the polypeptide product in the specific fashion desired.
The invention also relates to a non-human organism comprising an isolated host cell according to the invention. In certain embodiments, the non-human organism is selected from the group consisting of a bacterium, a fungus, a yeast, an animal, and a mammal. In some embodiments, the non-human organism is a yeast, a mouse, a rat, a rabbit, a cat, a dog, a bovine, a goat, a pig, a horse, a sheep, a monkey, or a chimpanzee.
In certain embodiments, the non-human organism is a yeast selected from the group consisting of Saccharomyces, Pichia, and Candida. In another embodiment, the non-human organism is a Mus musculus mouse.
Applicant's invention encompasses methods of incorporating LIPCs into polypeptides (generating heterologous polypeptides) to modulate activity of signaling domains in host cells. Specifically, Applicant's invention provides a method of inducing or inhibiting activation of signaling proteins and pathways via incorporation of LIPC components into signal activating or inhibiting polypeptides expressed in a host cell, and contacting the host cell with a ligand, to bring about the signal transduction activation or inhibition.
In one embodiment, cell signal transduction is activated by LIPC-induced dimerization or oligomerization of signaling domains (e.g., signaling molecules, signaling domains, complementary protein fragments, protein subunits, and natural or engineered partial or truncated proteins).
In another embodiment, cell signal transduction is inhibited by LIPC-induced dimerization of an inhibitory polypeptide to a cell signal transduction (activation) pathway polypeptide. In one embodiment, a component of the LIPC alone (e.g., an EcR or RxR/USP polypeptide) is the inhibitory polypeptide.
In one embodiment, LIPC polypeptides are used to modulate (i.e., activate or inhibit) intracellular protein-protein interactions. In another embodiment, LIPC polypeptides are used to modulate (i.e., activate or inhibit) extracellular protein-protein interactions. In another embodiment, LIPC polypeptides are used to modulate (i.e., activate or inhibit) transmembrane protein-protein interactions.
Genes and proteins of interest for expression and modulation of activity via LIPC in a host cell may be endogenous genes or heterologous genes. Nucleic acid or amino acid sequence information for a desired gene or protein can be located in one of many public access databases, for example, GenBank, EMBL, Swiss-Prot, and PIR, or in numerous biology-related journal publications. Thus, those of ordinary skill in the art have access to nucleic acid sequence and/or amino acid sequence information for virtually all known genes and proteins. Such information can then be used to construct the desired constructs for expression of the protein of interest (e.g., signaling domain) within the expression cassettes used in Applicant's methods described herein.
Examples of genes and proteins of interest for expression in a host cell using Applicant's methods include, but are not limited to, enzymes, reporter genes, structural proteins, transmembrane receptors, nuclear receptors, genes encoding polypeptides or signaling domains involved in a disease, a disorder, a dysfunction, or a genetic defect, antibodies, targets for drug discovery, proteomics analyses and applications, and the like.
Among the many and varied manners in which a Ligand Inducible Polypeptide Coupler (LIPC) of the present invention may be utilized and incorporated into control of or effect upon a biological cell signal transduction system, one general example is substitution of any other ligand inducible dimerization or multimerization system (such as those utilizing FK506 or rapamycin) with LIPC components of the present invention.
A specific example in which a Ligand Inducible Polypeptide Coupler (LIPC) of the present invention may be utilized and incorporated into control of a biological cell signal transduction system, is for use in generating an inducible cell “kill switch” or “suicide switch”; such as has been proposed for use in destroying genetically modified T cells (e.g., chimeric antigen receptor (CAR) T cells).
Some examples of the above-referenced systems are reviewed and described in:
Applicant's RheoSwitch genetic switch technology drives transcription in the presence of an activating ligand. The ligand binds the EcR ligand-binding domain portion of a GAL4-EcR fusion protein, which recruits an RXR-VP16 component (see, e.g.,
The ligand inducible polypeptide coupler operates differently than a transcriptional gene switch. Using the LIPC system, protein-protein interaction is controlled, not gene expression. Levels of activation may be regulated in a dose-dependent fashion as controlled via concentration and quantity of small molecule ligand administration.
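The dose-dependent regulation described above can be illustrated with a simple saturable response model. The following is a minimal sketch only: the Hill equation is an assumed model that is not stated herein, and the function name, parameter names, and the concentrations in the usage lines are hypothetical placeholders rather than measured values.

```python
def hill_response(conc_nM, ec50_nM, hill_coeff=1.0, bottom=0.0, top=1.0):
    """Fraction of maximal association signal at a given ligand concentration.

    Assumed model: response = bottom + (top - bottom) * C^h / (EC50^h + C^h).
    All parameters are illustrative; none is taken from the specification.
    """
    if conc_nM < 0:
        raise ValueError("ligand concentration must be non-negative")
    if ec50_nM <= 0 or hill_coeff <= 0:
        raise ValueError("EC50 and Hill coefficient must be positive")
    occupied = conc_nM ** hill_coeff
    fraction = occupied / (ec50_nM ** hill_coeff + occupied)
    return bottom + (top - bottom) * fraction


if __name__ == "__main__":
    # Hypothetical EC50 of 10 nM, chosen only to show the shape of the curve.
    for conc in (0.0, 1.0, 10.0, 100.0, 1000.0):
        print(conc, hill_response(conc, ec50_nM=10.0))
```

In such a model, no ligand gives the baseline response, a ligand concentration equal to the EC50 gives the half-maximal response, and higher concentrations approach the plateau.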
As described herein, a split firefly luciferase system has been used to demonstrate ligand-inducible EcR-RXR fusion protein association. This system represents a new method for employing protein switch components. Such a switch is fundamentally different from gene transcriptional activation switches, which are directed to controlling protein expression. Controlling protein-protein interaction, i.e., association, requires careful and specific engineering, as the molecules to be associated (e.g., dimerized or oligomerized) must have some differential function when associated and have limited, or no, natural affinity for each other under the non-ligand conditions.
A series of EcR and RXR fusion proteins (some with a split firefly luciferase (fLuc)) has been conceived and designed (see
The fLuc protein was divided into two pieces having no intrinsic affinity for each other (such that it is inactive until brought into close association by fused protein elements) for use as a system of testing protein-protein association. HEK293 cells were transfected with the split fLuc fused to EcR and RXR domains as follows:
A day before transfection, 10,000 cells (293T cells) were plated into each well of a 96-well plate containing 100 μl of growth medium (Dulbecco's Modified Eagle's Medium with 10% Fetal Bovine Serum) without antibiotics. Plasmids in pairs, RxR_Nluc with Cluc_EcR and EcR_Nluc with Cluc_RxR (see
Twenty-four hours (24 hrs) post-transfection, the cell culture medium in each well of the 96-well plate was replaced with medium containing 100 nM Veledimex activating ligand or, as a negative control, dimethyl sulfoxide (DMSO). Each component was diluted one-thousand-fold in Dulbecco's Modified Eagle's Medium with 10% Fetal Bovine Serum, and the cells were incubated for 6 hrs at 37° C. in a 5% CO2 incubator. ONE-Glo™ Luciferase Assay Buffer was combined with ONE-Glo™ Luciferase Assay Substrate, which contains 5′-Fluoroluciferin (a luciferin analog). This reagent was frozen after reconstitution and stored at −20° C. until use. The reconstituted ONE-Glo™ Luciferase reagent was thawed to room temperature in a water bath. The 96-well plate was removed from the incubator and equilibrated for approximately 1 hr at room temperature, with the plate bottom covered with Corning® 96-well microplate aluminum sealing tape, before addition of the reagent. 100 μl of the ONE-Glo™ Luciferase reagent was added to each well of the 96-well plate. After 3 minutes of incubation at room temperature to ensure complete cell lysis, the 96-well plate was placed in a GloMax™ 96 Microplate Luminometer to measure bioluminescence from each well.
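For illustration, the dilution arithmetic and a typical way of summarizing the luminometer readings can be sketched as follows. This is a hedged example only: it assumes that 100 nM is the final concentration in the well after the thousand-fold dilution, the function names are hypothetical, and fold induction over the DMSO control is one common summary statistic rather than the analysis stated herein. No readings from the actual experiment are reproduced.

```python
from statistics import mean


def stock_concentration_nM(final_nM, dilution_factor):
    """Stock concentration needed to reach final_nM after dilution."""
    if final_nM < 0 or dilution_factor <= 0:
        raise ValueError("concentration must be >= 0 and dilution factor > 0")
    return final_nM * dilution_factor


def fold_induction(ligand_rlu, vehicle_rlu):
    """Mean ligand-well luminescence divided by mean DMSO-well luminescence."""
    if not ligand_rlu or not vehicle_rlu:
        raise ValueError("both well groups need at least one reading")
    vehicle_mean = mean(vehicle_rlu)
    if vehicle_mean <= 0:
        raise ValueError("vehicle control mean must be positive")
    return mean(ligand_rlu) / vehicle_mean


if __name__ == "__main__":
    # 100 nM final at a 1:1000 dilution implies a 100,000 nM (100 uM) stock.
    print(stock_concentration_nM(100, 1000))
```

Replicate wells for each plasmid pair would be passed to `fold_induction` as two lists of relative light units, one for ligand-treated wells and one for DMSO-treated wells.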
In the absence of activating ligand, only background signal was observed. fLuc signal was detected following addition of activating ligand (
Upon addition of activating ligand, a clear fLuc signal is generated using the EcR and RXR LIPC system. Only background is observed in the absence of ligand (see
Positive signal should only be observed in complementing pairs of vectors that have been exposed to activating ligand, driving association of EcR and RXR components and restoring fLuc activity. Ligand dose response curves are shown in
Results of EcR dimerization induction by Veledimex ligand are shown in
Data generated by the present system can be used to inform molecular designs for additional systems going forward. Additional uses of such a system include, but are not limited to, screening for signaling domains (e.g., signaling molecules, signaling domains, complementary protein fragments, protein subunits, and natural or engineered partial or truncated proteins) that are activated through protein-protein interaction.
Based on the experiments and results with the intracellular split fLuc reporter, new designs for LIPC systems will be undertaken. Additional configurations of EcR, RXR, and split fLuc elements will be assayed to demonstrate additional pairings. All of this information can be used to inform the generation of comparative models of the proteins that can in turn provide guidance for future designs. The current split fLuc vectors will also be tested in other important cell types for consistent activity. As the proteins are constitutively expressed in the present example, the dimerization event should be rapid when activating ligand is administered. Conversely, given that the fLuc halves have no affinity for each other and do not covalently interact, this system could also be used to examine off-rate kinetics following removal of activating ligand. Both signal onset and decay experiments are envisaged and being undertaken.
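The signal decay experiments contemplated above could be analyzed, for example, with a single-exponential model. The following sketch is illustrative only: first-order dissociation is an assumption rather than a result reported herein, and all function names, parameter names, and units are hypothetical.

```python
import math


def signal_after_washout(s0, k_off_per_min, t_min, background=0.0):
    """Assumed first-order decay: S(t) = bg + (S0 - bg) * exp(-k_off * t)."""
    if k_off_per_min < 0 or t_min < 0:
        raise ValueError("rate constant and time must be non-negative")
    return background + (s0 - background) * math.exp(-k_off_per_min * t_min)


def half_life_min(k_off_per_min):
    """Time for the ligand-dependent signal to fall by half: ln(2) / k_off."""
    if k_off_per_min <= 0:
        raise ValueError("rate constant must be positive")
    return math.log(2) / k_off_per_min


def estimate_k_off(s0, s_t, t_min, background=0.0):
    """Estimate k_off from the signal at washout and one later time point."""
    if t_min <= 0:
        raise ValueError("elapsed time must be positive")
    ratio = (s_t - background) / (s0 - background)
    if not 0 < ratio <= 1:
        raise ValueError("signal must decay toward, and stay above, background")
    return -math.log(ratio) / t_min
```

Under this model, a two-point estimate from `estimate_k_off` gives a first approximation of the off-rate, and `half_life_min` converts it to the time at which half of the ligand-dependent signal remains.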
Further, additional LIPC designs are being pursued. Some of the designs are similar to those of the fLuc system above, with differences being, for example, that the molecules involved in the interaction can be single-pass type I transmembrane proteins. Initial designs and experiments will be with EcR and RXR localized intracellularly with at least portions of the fused proteins located extracellularly (see
Further research will include experiments to understand on- and off-rates, to determine the optimal expression levels required to drive desired activation effects, and to reduce (if needed) potential background (e.g., biological effects of the unpartnered proteins in the absence of ligand).
Experiments were performed to test if nuclear receptor domains (i.e., EcR and RxR polypeptides) could be induced to homodimerize upon addition of ligand (
“EcR” is Ecdysone receptor;
“EcR-EcR” means “EcR_Nluc+Cluc_EcR” which is a luciferase polypeptide split into two halves, such that an EcR polypeptide is fused to the N-terminus of a luciferase polypeptide fragment (EcR_Nluc) and another fragment of luciferase has an EcR polypeptide fused to its C-terminal end (Cluc_EcR); thereby activating luciferase (generation of bioluminescence) upon EcR homodimerization;
“RxR” is Retinoid X receptor;
“Mock” means no vector added;
“eGFP” is enhanced GFP (used as a negative control);
“RxR_EcR” means “EcR_Nluc+Cluc_RXR” which is a luciferase polypeptide split into two halves, such that an EcR polypeptide is fused to the N-terminus of a luciferase polypeptide fragment (EcR_Nluc) and another fragment of luciferase has an RxR polypeptide fused to its C-terminal end (Cluc_RxR); thereby activating luciferase (generation of bioluminescence) upon EcR-RxR heterodimerization;
The results (
Unless defined otherwise, all technical and scientific terms and any acronyms used herein have the same meanings as commonly understood by one of ordinary skill in the art in the field of this invention.
All references cited herein are incorporated by reference herein to the full extent allowed by law. The discussion of those references is intended merely to summarize the assertions made by their authors. No admission is made that any reference (or a portion of any reference) is relevant art. Applicants reserve the right to challenge the accuracy and pertinence of any cited reference.
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/US2016/024690 | 3/29/2016 | WO | 00 |
Number | Date | Country |
---|---|---|
62140380 | Mar 2015 | US |