A computer readable form of the Sequence Listing is filed with this application by electronic submission and is incorporated into this application by reference in its entirety. The Sequence Listing is contained in the file created on Sep. 10, 2024 having the file name “23-1345-US.xml” and is 635,732 bytes in size.
Membrane fusion is the process by which two membranes merge. In a biological system, membrane fusion proteins catalyze this process by pulling two lipid bilayer membranes into close proximity and forcing them to merge into a single membrane. Enveloped viruses and some viral vectors, such as lentiviral or retroviral vectors, have membrane fusion proteins on their surface that efficiently fuse the viral membrane with the host cell membrane to deliver their genetic material into cells. Membrane fusion proteins are therefore potentially useful for intracellular drug delivery applications. However, natural membrane fusion proteins, including viral envelope glycoproteins and SNARE proteins, are hard to engineer and are sometimes immunogenic or toxic in biological applications. Newly designed membrane fusion proteins might be advantageous over existing membrane fusion proteins since they could be modular and easier to engineer.
In a first aspect, the disclosure provides nucleic acids encoding a polypeptide comprising the formula X1-X2-X3, wherein
X1 comprises an amino acid sequence at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence selected from the group consisting of the bold font residues in SEQ ID NO: 149-208;
X2 comprises a juxtamembrane domain (JMD), wherein X2 comprises an amino acid sequence at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence selected from the group consisting of SEQ ID NO:500-505; and
X3 comprises a transmembrane domain (TMD).
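The identity thresholds recited above can be illustrated with a minimal sketch. It assumes the simplest possible measure, ungapped position-by-position identity between two sequences of equal length; this section does not specify an alignment method, so the function below is an illustrative assumption and not the disclosure's definition. The names `percent_identity`, `highest_threshold_met`, and `THRESHOLDS` are hypothetical. The two example sequences are X1 domains as printed later in this section.

```python
# Illustrative sketch only: ungapped identity between equal-length sequences.
# A real comparison would normally use a sequence alignment tool; the method
# here is an assumption made for the sake of a short, runnable example.

THRESHOLDS = (50, 55, 60, 65, 70, 75, 80, 85, 90,
              91, 92, 93, 94, 95, 96, 97, 98, 99, 100)


def percent_identity(a: str, b: str) -> float:
    """Return the percentage of positions at which a and b carry the same residue."""
    if not a or len(a) != len(b):
        raise ValueError("this sketch needs two non-empty sequences of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def highest_threshold_met(identity: float):
    """Return the largest recited threshold that the identity value satisfies, or None."""
    met = [t for t in THRESHOLDS if identity >= t]
    return max(met) if met else None


# Two X1 domain sequences as printed in the X1 listing of this section.
x1_a = "SSSSSEKLRETQAQVDEVVDIMRVNVDKVLERDQKLSELDDRADALQAGASQFETSAA"
x1_b = "NLASNRRLQQTQAQVDEVVDIMRVNVDKVLERDQKLSELDDRADALQAGASQFETSAA"

identity = percent_identity(x1_a, x1_b)
print(f"{identity:.1f}% identical; highest recited threshold met: "
      f"{highest_threshold_met(identity)}")
```

Sequences of different lengths would need a gapped alignment before identity is counted, which this sketch deliberately leaves out.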
In one embodiment, X1 comprises an amino acid sequence at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO:156. In another embodiment, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, or all 13 of L8, L12, V15, V18, I21, M22, L28, V29, G33, I36, L39, L46, L53 are conserved (i.e., identical) in the polypeptide relative to SEQ ID NO:156. In a further embodiment, X2 comprises an amino acid sequence at least 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence selected from the group consisting of SEQ ID NO:209-222 and 456. In one embodiment, X2 comprises an amino acid sequence at least 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence selected from the group consisting of SEQ ID NO:213 and 214. In another embodiment, X3 comprises an amino acid sequence at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence selected from the group consisting of SEQ ID NO:223-234. In a further embodiment, the nucleic acid encodes a polypeptide comprising an amino acid sequence at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence selected from the group consisting of SEQ ID NO:1-37, 147, and 236-289. In one embodiment, the nucleic acid encodes a polypeptide comprising an amino acid sequence at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO:8. In another embodiment, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, or all 13 of L8, L12, V15, V18, I21, M22, L28, V29, G33, I36, L39, L46, L53 are conserved (i.e., identical) in the polypeptide relative to SEQ ID NO:8.
In other embodiments, the nucleic acids encode a fusion protein that further comprises an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO:290 or 291; optionally wherein the encoded polypeptide and the polypeptide domain are connected by an amino acid linker. In certain embodiments, the nucleic acid encodes a fusion protein comprising an amino acid sequence at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence selected from the group consisting of SEQ ID NO:147 and 244-258.
In a second aspect, the disclosure provides nucleic acids encoding a polypeptide comprising an amino acid sequence at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence selected from the group consisting of SEQ ID NO:310-316, wherein X1 is an amino acid linker. In one embodiment, the polypeptide comprises an amino acid sequence at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence selected from the group consisting of SEQ ID NO: 314-316, wherein X1 is an amino acid linker. In another embodiment, the nucleic acid encodes a polypeptide comprising an amino acid sequence at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence selected from the group consisting of SEQ ID NO:38-44.
In a third aspect, the disclosure provides nucleic acids encoding a polypeptide comprising the formula X1-X2-X3, wherein
In one embodiment, X2 comprises an amino acid sequence at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence selected from the group consisting of SEQ ID NO:331-332. In another embodiment, X3 comprises an amino acid sequence at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence selected from the group consisting of SEQ ID NO:223-234. In a further embodiment, the polypeptide comprises an amino acid sequence at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence selected from the group consisting of SEQ ID NO:45-63.
In a fourth aspect, the disclosure provides nucleic acids encoding a polypeptide comprising the formula X1-X2-X3, wherein
In one embodiment, X2 comprises an amino acid sequence at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence selected from the group consisting of SEQ ID NO:331-332 and 426-445. In another embodiment, X3 comprises an amino acid sequence at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence selected from the group consisting of SEQ ID NO:223-234. In a further embodiment, the polypeptide comprises the formula B1-B2-X1-X2-X3, wherein
In one embodiment, the polypeptide comprises an amino acid sequence at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence selected from the group consisting of SEQ ID NO: 64-146, 148, and 446-455.
In all aspects, the nucleic acids may encode a polypeptide that further comprises a signal peptide at its amino-terminus, including but not limited to a signal peptide that comprises the amino acid sequence selected from the group consisting of SEQ ID NO: 292-309. In all aspects, the nucleic acids may comprise an expression vector comprising the nucleic acid operatively linked to a control sequence, such as a promoter.
The disclosure also provides polypeptides or fusion proteins encoded by the nucleic acid of any aspect or embodiment of the disclosure, and host cells comprising the nucleic acid, expression vector, polypeptide, and/or fusion protein of any aspect or embodiment of the disclosure. In one embodiment, the host cell comprises a membrane fusion protein complex anchored in a lipid bilayer membrane of the cell, wherein the membrane fusion protein complex comprises the following components:
In another embodiment, the host cell comprises a membrane fusion protein complex anchored in a lipid bilayer membrane of the cell, wherein the membrane fusion protein complex comprises the following components:
The disclosure also provides vesicles comprising one or more polypeptides or fusion proteins of any aspect or embodiment herein incorporated into the lipid envelope of the vesicle. In various embodiments, the vesicle comprises a liposome, a lipid nanoparticle, a viral vector, or an enveloped particle that may optionally comprise any suitable cargo, including but not limited to a protein or nucleic acid cargo. In other embodiments, one or more polypeptides or fusion proteins of any aspect or embodiment disclosed herein are anchored on a surface of the liposome, the lipid nanoparticle, the viral vector, or the enveloped particle.
In one embodiment, the host cell or vesicle of any embodiment herein further comprises a therapeutic or diagnostic moiety loaded in the host cell or vesicle.
The disclosure also provides kits. In one embodiment, the kit comprises
In another embodiment, the kit comprises
The disclosure also provides methods for inducing membrane fusion. In one embodiment, the method comprises mixing
In another embodiment, the method comprises mixing:
All references cited are herein incorporated by reference in their entirety. Within this application, unless otherwise stated, the techniques utilized may be found in any of several well-known references such as: Molecular Cloning: A Laboratory Manual (Sambrook, et al., 1989, Cold Spring Harbor Laboratory Press), Gene Expression Technology (Methods in Enzymology, Vol. 185, edited by D. Goeddel, 1991. Academic Press, San Diego, CA), “Guide to Protein Purification” in Methods in Enzymology (M. P. Deutscher, ed., 1990, Academic Press, Inc.), PCR Protocols: A Guide to Methods and Applications (Innis, et al. 1990. Academic Press, San Diego, CA), Culture of Animal Cells: A Manual of Basic Technique, 2nd Ed. (R. I. Freshney. 1987. Liss, Inc. New York, NY), Gene Transfer and Expression Protocols (pp. 109-128, ed. E. J. Murray, The Humana Press Inc., Clifton, NJ), and the Ambion 1998 Catalog (Ambion, Austin, TX).
As used herein, the singular forms “a”, “an” and “the” include plural referents unless the context clearly dictates otherwise.
All embodiments of any aspect of the disclosure can be used in combination, unless the context clearly dictates otherwise.
Unless the context clearly requires otherwise, throughout the description and the claims, the words “comprise”, “comprising”, and the like are to be construed in an inclusive sense as opposed to an exclusive or exhaustive sense; that is to say, in the sense of “including, but not limited to”. Words using the singular or plural number also include the plural and singular number, respectively. Additionally, the words “herein,” “above,” and “below” and words of similar import, when used in this application, shall refer to this application as a whole and not to any particular portions of the application.
As used herein, the amino acid residues are abbreviated as follows: alanine (Ala; A), asparagine (Asn; N), aspartic acid (Asp; D), arginine (Arg; R), cysteine (Cys; C), glutamic acid (Glu; E), glutamine (Gln; Q), glycine (Gly; G), histidine (His; H), isoleucine (Ile; I), leucine (Leu; L), lysine (Lys; K), methionine (Met; M), phenylalanine (Phe; F), proline (Pro; P), serine (Ser; S), threonine (Thr; T), tryptophan (Trp; W), tyrosine (Tyr; Y), and valine (Val; V).
Any N-terminal amino acids are optional, and may be deleted.
The polypeptides of all aspects and embodiments of the disclosure are able, for example, to form a heterooligomer with corresponding other protein(s) of the disclosure on a lipid bilayer membrane (as discussed in more detail below), followed by induction of merger of two membranes into a single continuous membrane.
VAMP2-Redesign (v-SNARE-Like Proteins (v-SPs))
In a first aspect, the disclosure provides nucleic acids encoding a v-SNARE-like protein (v-SP) polypeptide comprising the formula X1-X2-X3, wherein
As described in the examples, the disclosure provides a series of membrane fusion proteins that can induce cell-cell fusion when expressed on the surface of mammalian cells, or liposome fusion when displayed on the surface of liposomes. The designed proteins are based on the human neuronal SNARE complex (which is composed of three proteins, VAMP2, Syntaxin 1A or Syn1A, and SNAP25), which has a parallel four-helical bundle structure and transmembrane domains at the C-terminus of VAMP2 and Syn1A (see
As further described in the examples, new sequences were generated that are believed to fold into the four-helix bundle structure like the parental SNARE complex, followed by engineering SNAP25 so that one of the two coiled-coil domains of SNAP25 is an anti-parallel coiled-coil (
The amino acid sequences of the encoded X1 domains (SEQ ID NO:149-208) are provided in Tables 1 and 2 below.
SSSSSEKLRETQAQVDEVVDIMRVNVDKVLERDQKLSELDDRADALQAGASQFETSAA
NLASNRRLQQTSEEVREVNDIMRVNVDKVLERDQKLSELDDRADALQAGASQFETSAA
NLASNRRLQQTQAQVDEVVDVMRDNRNLVDERDQKLSELDDRADALQAGASQFETSAA
NLASNRRLQQTQAQVDEVVDIMRVNVDKVLRQGEQIDRLEDRADALQAGASQFETSAA
NLASNRRLQQTQAQVDEVVDIMRVNVDKVLERDQKLSELDERADELEKSASQFETSAA
NLASNRRLQQTQAQVDEVVDIMRVNVDKVLERDQKLSELDDRADALQAGAERLEENAT
NLASNRRLQQTQAQVDEVVDIMRVNVDKVLERDQKLSELDDRADALQAGASQFETSAA
SSSSNEKLRETLREVEDVKNIMEDNRRLVERQGRQIDRLEEKADDLERSAERLSDNSE
SSSDSSKLRETEEETRDVIDIMRDNRRSVEERGRQIDRLEERADDLEDSAERLEENAK
SGSSNERLREVSKEAREVREMAMDVKEKIEEQGRKIEELEEKAESLKDSAERFDENAK
SGTTNEKLRKVSSEADEVKEMGMDVKEKVEEQGRKIEELEEKAEDLKDSAERFDENAK
SGSSSEKLRQISSEAEEVKEMGMDILKKIEEQGEKIERLEEKAESLKDSAERFADNAK
DGTSNERLRETSKEAREVRDMAMDNMKKVEEQGEKIEELEEKAEELKDSAERLDDNAK
DGTSNEKLRETSEQAREVRDMALDNKEKIEEQGEKIDRLEEKAESLKDSAERFAENAK
SEEMSKKLEETSKEVDEVLEIMEEIREMLEEQGRRIDRLEKKAEELEEGAEKFEELSE
SEERKEKLEETLKEVDEVLEIMKENKEMLEEQGERLERLEEKAEELEEGAEKFEELAE
SKERSEKLKETMEEVEEVLEIMKEIRRMMEEQGERIDRLEEKAEELEEGAEKFEELAE
In one embodiment, X1 comprises an amino acid sequence at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO:156. The X1 domain of SEQ ID NO:156 is present in the full length VAMP2 redesign of SEQ ID NO:8 (Table 5). Residues present at the interface between VAMP2 and sc-t-SP in SEQ ID NO:156 are L8, L12, V15, V18, I21, M22, L28, V29, G33, I36, L39, L46, L53 (see highlighted residues below). Thus, in one embodiment, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, or all 13 of L8, L12, V15, V18, I21, M22, L28, V29, G33, I36, L39, L46, L53 are conserved (i.e., identical) in the polypeptide relative to SEQ ID NO:156 (see below).
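As an illustration of the conservation count described above, the following sketch checks which of the 13 interface positions are retained in a candidate X1 sequence. It rests on two labeled assumptions: that the eighth X1 sequence printed in the listing above corresponds to SEQ ID NO:156, and that the candidate is compared position by position without gaps. The isoleucine positions are written as I21 and I36, which is what that reference sequence carries at positions 21 and 36. The names `INTERFACE`, `REFERENCE_X1`, and `conserved_interface` are hypothetical.

```python
import re

# Interface residues named in the text, as one-letter code plus 1-based position.
INTERFACE = ["L8", "L12", "V15", "V18", "I21", "M22", "L28",
             "V29", "G33", "I36", "L39", "L46", "L53"]

# Assumption: the eighth X1 sequence in the listing above is SEQ ID NO:156.
REFERENCE_X1 = "SSSSNEKLRETLREVEDVKNIMEDNRRLVERQGRQIDRLEEKADDLERSAERLSDNSE"


def parse_label(label: str):
    """Split a label such as 'L8' into ('L', 8)."""
    match = re.fullmatch(r"([A-Z])(\d+)", label)
    if match is None:
        raise ValueError(f"not a residue label: {label!r}")
    return match.group(1), int(match.group(2))


def conserved_interface(candidate: str, labels=INTERFACE):
    """Return the labels whose residue is present at the same position in candidate.

    Assumes an ungapped, position-by-position comparison with the reference.
    """
    kept = []
    for label in labels:
        residue, position = parse_label(label)
        if position <= len(candidate) and candidate[position - 1] == residue:
            kept.append(label)
    return kept


# The reference retains every interface residue; a single L8A substitution loses one.
print(len(conserved_interface(REFERENCE_X1)))
mutant = REFERENCE_X1[:7] + "A" + REFERENCE_X1[8:]
print(conserved_interface(mutant))
```

A candidate with insertions or deletions relative to the reference would first need to be aligned so that positions correspond.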
In another embodiment, the X2 domain comprises an amino acid sequence at least 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence selected from the group consisting of SEQ ID NO:209-222 and 456, or selected from the group consisting of SEQ ID NO:213 and 214. The amino acid sequences of the X2 JMDs are provided in Table 3.
In a further embodiment, X3 comprises an amino acid sequence at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence selected from the group consisting of SEQ ID NO:223-234, or selected from the group consisting of SEQ ID NO:223 or 232. The amino acid sequences of the X3 TMDs are provided in Table 4.
In one embodiment, the polypeptide comprises the genus B2-X1-X2-X3, wherein B2 comprises an amino acid sequence at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO:235.
Some v-SP designs contain a Pro-rich region (SATAATAPPAAPAGEGGPPAPPP, (SEQ ID NO: 235)) at the N terminus derived from native VAMP2, but this region is dispensable for fusion activity, and thus is optional in the present designs, and may be present or absent.
In another embodiment of this first aspect, the nucleic acid encodes a polypeptide comprising an amino acid sequence at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence selected from the group consisting of SEQ ID NO:1-37, 147, and 236-289, or at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO:8. In a further embodiment, the nucleic acid encodes an amino acid sequence at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO:235 N-terminal to the polypeptide comprising an amino acid sequence at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence selected from the group consisting of SEQ ID NO:1-37, 147, and 236-289.
In one embodiment of this first aspect, the nucleic acid encodes a fusion protein, comprising:
In these embodiments, the fusion protein can be used for inducible binding to sc-t-SP (see below) in the presence of rapamycin. The domain of SEQ ID NO:290 is an FKBP domain that, in the presence of rapamycin, can bind to its cognate binding partner, the FRB domain (SEQ ID NO:291), fused to the N-terminus of inducible sc-t-SPs (e.g., SEQ ID NO:148) and induce fusion. The FKBP domain fused to v-SPs and the FRB domain fused to sc-t-SPs would function similarly if they were interchanged. In one such embodiment, the nucleic acid encodes a fusion protein comprising an amino acid sequence at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence selected from the group consisting of SEQ ID NO:147 and 244-258.
The sequences of these full length VAMP2 redesigned proteins are shown in Tables 5 and 6.
SSSSSEKLRETQAQVDEVVDIMRVNVDKVLERDQKLSELDDRADALQAGASQFETSAA
KLKRK
YWWKNLK
MMIILGVICAIILIIIIVYFST
NLASNRRLQQTSEEVREVNDIMRVNVDKVLERDQKLSELDDRADALQAGASQFETSAA
KLKRK
YWWKNLK
MMIILGVICAIILIIIIVYFST
NLASNRRLQQTQAQVDEVVDVMRDNRNLVDERDQKLSELDDRADALQAGASQFETSAA
KLKRK
YWWKNLK
MMIILGVICAIILIIIIVYFST
NLASNRRLQQTQAQVDEVVDIMRVNVDKVLRQGEQIDRLEDRADALQAGASQFETSAA
KLKRK
YWWKNLK
MMIILGVICAIILIIIIVYFST
NLASNRRLQQTQAQVDEVVDIMRVNVDKVLERDQKLSELDERADELEKSASQFETSAA
KLKRK
YWWKNLK
MMIILGVICAIILIIIIVYFST
NLASNRRLQQTQAQVDEVVDIMRVNVDKVLERDQKLSELDDRADALQAGAERLEENAT
KLKRK
YWWKNLK
MMIILGVICAIILIIIIVYFST
NLASNRRLQQTQAQVDEVVDIMRVNVDKVLERDQKLSELDDRADALQAGASQFETSAA
KLTKY
YEEKESK
MMIILGVICAIILIIIIVYFST
SSSSNEKLRETLREVEDVKNIMEDNRRLVERQGRQIDRLEEKADDLERSAERLSDNSE
KLKRK
YWWKNSK
MMIILGVICAIILIIIIVYFST
SSSDSSKLRETEEETRDVIDIMRDNRRSVEERGRQIDRLEERADDLEDSAERLEENAK
KLKRK
YWWKNSK
MMIILGVICATILIIIIVYFST
SSSSNEKLRETLREVEDVKNIMEDNRRLVERQGRQIDRLEEKADDLERSAERLSDNSE
KLKRK
YWWKNSK
MMIILGVICAIILIIIIVYFST
SGSSNERLREVSKEAREVREMAMDVKEKIEEQGRKIEELEEKAESLKDSAERFDENAK
KLKRK
YWWKNLK
MMIILGVICAIILIIIIVYFST
SGTTNEKLRKVSSEADEVKEMGMDVKEKVEEQGRKIEELEEKAEDLKDSAERFDENAK
KLKRK
YWWKNLK
MMIILGVICAIILIIIIVYFST
SGSSSEKLRQISSEAEEVKEMGMDILKKIEEQGEKIERLEEKAESLKDSAERFADNAK
KLKRK
YWWKNLK
MMIILGVICAIILIIIIVYFST
DGTSNERLRETSKEAREVRDMAMDNMKKVEEQGEKIEELEEKAEELKDSAERLDDNAK
KLKRK
YWWKNLK
MMIILGVICAIILIIIIVYFST
DGTSNEKLRETSEQAREVRDMALDNKEKIEEQGEKIDRLEEKAESLKDSAERFAENAK
KLKRK
YWWKNLK
MMIILGVICAIILIIIIVYFST
SEEMSKKLEETSKEVDEVLEIMEEIREMLEEQGRRIDRLEKKAEELEEGAEKFEELSE
KLKRK
YWWKNLK
MMIILGVICATILIIIIVYFST
SEERKEKLEETLKEVDEVLEIMKENKEMLEEQGERLERLEEKAEELEEGAEKFEELAE
KLKRK
YWWKNLK
MMIILGVICAIILIIIIVYFST
SKERSEKLKETMEEVEEVLEIMKEIRRMMEEQGERIDRLEEKAEELEEGAEKFEELAE
KLKRK
YWWKNLK
MMIILGVICAIILIIIIVYFST
SSSSNEKLRETLREVEDVKNIMEDNRRLVERQGRQIDRLEEKADDLERSAERLSDNSE
KLKRK
YWWKNLK
MMIILGVICAIILIIIIVYFST
SSSSNEKLRETLREVEDVKNIMEDNRRLVERQGRQIDRLEEKADDLERSAERLSDNSE
KLKKK
KKKKKKK
MMIILGVICAIILIIIIVYFST
SSSSNEKLRETLREVEDVKNIMEDNRRLVERQGRQIDRLEEKADDLERSAERLSDNSE
KLKRK
YWWKNLK
AAVLVLLVIVIISLIVLVVIWST
SSSSNEKLRETLREVEDVKNIMEDNRRLVERQGRQIDRLEEKADDLERSAERLSDNS
EKLKRK
YWWKNLK
MMVVVVVVVVVVVVVVVVYFST
SSSSNEKLRETLREVEDVKNIMEDNRRLVERQGRQIDRLEEKADDLERSAERLSDNSE
KLKRK
YWWKNLK
MMIIIIIIIIIIIIIIIIYFST
SSSSNEKLRETLREVEDVKNIMEDNRRLVERQGRQIDRLEEKADDLERSAERLSDNSE
KLKKK
KKKKKKK
AAVLVLLVIVIISLIVLVVIWST
SSSSNEKLRETLREVEDVKNIMEDNRRLVERQGRQIDRLEEKADDLERSAERLSDNSE
KLKKK
KKKKKKK
MMVVVVVVVVVVVVVVVVYFST
NLASNRRLQQTQAQVDEVVDIMRVNVDKVLERDQKLSELDDRADALQAGASQFETSAA
KLKKK
KKKKKKK
MMIILGVICAIILIIIIVYFST
NLASNRRLQQTQAQVDEVVDIMRVNVDKVLERDQKLSELDDRADALQAGASQFETSAA
KLKRK
YWWKNLK
MMLLLLLLLLLLLLLLLLYFST
NLASNRRLQQTQAQVDEVVDIMRVNVDKVLERDQKLSELDDRADALQAGASQFETSAA
KLKRK
YWWKNLK
MMVVVVVVVVVVVVVVVVYFST
NLASNRRLQQTQAQVDEVVDIMRVNVDKVLERDQKLSELDDRADALQAGASQFETSAA
KLKRK
YWWKNLK
FFFIIGLIIGLFLVLRVGIHLST
NLASNRRLQQTQAQVDEVVDIMRVNVDKVLERDQKLSELDDRADALQAGASQFETSAA
KLKRK
YWWKNLK
ILWISFAISCFLLCVVLLGFIST
NLASNRRLQQTQAQVDEVVDIMRVNVDKVLERDQKLSELDDRADALQAGASQFETSAA
KLKRK
YWWKNLK
IATGMVGALLLLLVVALGIGLFST
NLASNRRLQQTQAQVDEVVDIMRVNVDKVLERDQKLSELDDRADALQAGASQFETSAA
KLKRK
YWWKNLK
IMIIICCVILGIVIASTVGGIST
NLASNRRLQQTQAQVDEVVDIMRVNVDKVLERDQKLSELDDRADALQAGASQFETSAA
KLKRK
YWWKNLK
AAVLVLLVIVIISLIVLVVIWST
SSSSNEKLRETLREVEDVKNIMEDNRRLVERQGRQIDRLEEKADDLERSAERLSDNSE
KIKKK
FFFKKEK
MMIILGVICAIILIIIIVYFST
SSSSNEKLRETLREVEDVKNIMEDNRRLVERQGRQIDRLEEKADDLERSAERLSDNSE
RIRRR
FFFRRFR
MMIILGVICAIILIIIIVYFST
SSSSNEKLRETLREVEDVKNIMEDNRRLVERQGRQIDRLEEKADDLERSAERLSDNSE
KIKKK
FFFKKFK
AAVLVLLVIVIISLIVLVVIWST
SSSSNEKLRETLREVEDVKNIMEDNRRLVERQGRQIDRLEEKADDLERSAERLSDNSE
KIKKK
FFFKKFK
MMVVVVVVVVVVVVVVVVFST
QVDEVVDIMRVNVDKVLERDQKLSELDDRADALQAGASQFETSAA
KLKRKYWWKNLK
MMIILG
VICAIILIIIIVYFST
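The full-length rows listed above read as an X1 domain followed by juxtamembrane residues and a transmembrane segment, consistent with the formula X1-X2-X3, optionally preceded by the Pro-rich B2 region of SEQ ID NO:235. A minimal sketch of that assembly follows. It assumes that the two short fragments printed between X1 and the transmembrane segment in each row together form the X2 domain; the class name `VSPDesign` and its fields are hypothetical and used only for illustration.

```python
from dataclasses import dataclass

# Pro-rich region as given in the text for SEQ ID NO:235.
PRO_RICH = "SATAATAPPAAPAGEGGPPAPPP"


@dataclass(frozen=True)
class VSPDesign:
    """Hypothetical container for the domains of one v-SP design."""
    x1: str        # coiled-coil domain
    x2: str        # juxtamembrane domain (JMD)
    x3: str        # transmembrane domain (TMD)
    b2: str = ""   # optional N-terminal Pro-rich region

    def sequence(self) -> str:
        """Concatenate the domains in N-to-C order: B2-X1-X2-X3."""
        return self.b2 + self.x1 + self.x2 + self.x3


# Domains taken from the first row of the full-length listing above.
# Assumption: the fragments 'KLKRK' and 'YWWKNLK' together make up X2.
design = VSPDesign(
    x1="SSSSSEKLRETQAQVDEVVDIMRVNVDKVLERDQKLSELDDRADALQAGASQFETSAA",
    x2="KLKRK" + "YWWKNLK",
    x3="MMIILGVICAIILIIIIVYFST",
)
with_pro_rich = VSPDesign(design.x1, design.x2, design.x3, b2=PRO_RICH)

print(design.sequence())
print(len(design.sequence()), len(with_pro_rich.sequence()))
```

Keeping the domains as separate fields mirrors the modular design described in the text: a different JMD or TMD can be swapped in without touching the coiled-coil domain.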
In a further embodiment of all aspects of the disclosure, the nucleic acid encodes a polypeptide that further comprises a signal peptide at its amino-terminus. Any signal peptide may be used as suitable for an intended purpose. The signal peptide may be directly linked to the polypeptide, or may be connected via an amino acid linker. In some embodiments, the signal peptide comprises the amino acid sequence selected from the group consisting of SEQ ID NO:292-309. The amino acid sequences of these exemplary signal peptides are provided in Table 7.
In a further embodiment of any aspect of the disclosure, the nucleic acid comprises an expression vector comprising the nucleic acid operatively linked to a control sequence, such as a promoter.
In a second aspect, the disclosure provides nucleic acids encoding a SNAP25-redesigned polypeptide comprising an amino acid sequence at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence selected from the group consisting of SEQ ID NO:310-316, wherein X1 is an amino acid linker.
In this second aspect, the disclosure provides redesigned SNAP25 proteins in which one of the two coiled-coil domains is an anti-parallel coiled-coil, as described above. When combined with v-SP and native or redesigned Syn1A, redesigned SNAP25 is capable of inducing cell-cell fusion when displayed on the surface of mammalian cells, or liposome fusion when displayed on the surface of liposomes. For membrane fusion to occur, v-SP is presented on one membrane and SNAP25 and Syn1A on the other.
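The placement requirement stated above (v-SP on one membrane, SNAP25 and Syn1A on the other) can be written as a small check. This sketch encodes only that stated rule; it says nothing about fusion efficiency or kinetics, and the component labels and function name are hypothetical.

```python
# Component labels are hypothetical strings used only for this illustration.
V_SIDE = frozenset({"v-SP"})
T_SIDE = frozenset({"SNAP25", "Syn1A"})


def placement_allows_fusion(membrane_a: set, membrane_b: set) -> bool:
    """Return True if one membrane carries v-SP and the other carries both
    SNAP25 and Syn1A, in either orientation."""
    def complementary(v_membrane: set, t_membrane: set) -> bool:
        return V_SIDE <= v_membrane and T_SIDE <= t_membrane

    return (complementary(membrane_a, membrane_b)
            or complementary(membrane_b, membrane_a))


# v-SP on one liposome, SNAP25 and Syn1A on the other: placement satisfied.
print(placement_allows_fusion({"v-SP"}, {"SNAP25", "Syn1A"}))
# Syn1A missing from the second membrane: placement not satisfied.
print(placement_allows_fusion({"v-SP"}, {"SNAP25"}))
```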
The amino acid sequences of SEQ ID NO:310-316 are provided in Table 9.
AEDADLEKQKQEEEKRGETLKDESLEATRKMVNMVREAREMAMRNGELLESQGEKLDRIEEKADRMETKLDE
ADEDLKKIEG-X1-
DARENEMDENLEQVSGIIGNLRHMALDMGNEIDTQNRQIDRIMEKADSAKTRIDEANQRATKMLGSG
AEDADSLAQQQQEEQRGSTLIDESLEATRKMKEMVEEAVRMAMDNGELLRSQGEKLDRIEEKADRMESLLDE
ADENLDKIEG-X1-
DARENEMDENLEQVSGIIGNLRHMALDMGNEIDTQNRQIDRIMEKADSAKTRIDEANQRATKMLGSG
AEDADMRNELEEMQRRADQLADESLESTRRMLQLVEESKDAGIRTLVMLDEQGEQLERIEEGMDQINKDMKE
AEKNLADLGK-X1-
PSYIREVNNSEKEKEINEGLGRVDQQVQELKDMAVVMGEKVDEQNEKIDRINEKADKNEQRVNDLTKEAEKL
LNSG
AEDADMRNELEEMQRRADQLADESLESTRRMLQLVEESKDAGIRTLVMLDEQGEQLERIEEGMDQINKDMKE
AEKNLADLGK-X1-
SSFIRRVNGSEREREIDRGLERVDQQVKELKDMARVMGDKTDEQGEKIDRIEEKADRNEERVEKLVKEAKEL
LESG
DARENEMDENLEQVSGIIGNLRHMALDMGNEIDTQNRQIDRIMEKADSAKTRIDEANQRATKMLGSG
X1-EELKDLEKEGKELKELVEELDREVKELKESMEKLKEMTEEAAELSSQALEIMRRTRKLSEELLKEAKE
EEEEEEEEEEEEE
DARENEMDENLEQVSGIIGNLRHMALDMGNEIDTQNRQIDRIMEKADSAKTRIDEANQRATKMLGSG-X1-
EELKKLEKEGEKLKELVEELDREIKELKEGMERLREMFEEAAKLSEEALEIMRRTRKLSEEELEEAKQKAKE
QEKEKALKEK
NKREEEIDKGLDRVGEIISKLNEMAREMGEKIEEQNQKISEIEKKADEAIEKVEKLIKDAEKLLGSG-X1-
EELKKLEKEGEKLKELVEELDREIKELKEGMERLREMFEEAAKLSEEALEIMRRTRKLSEEELEEAKQKAKE
QEKEKALKEK
In one embodiment, the polypeptide comprises an amino acid sequence at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence selected from the group consisting of SEQ ID NO:314-316, wherein X1 is an amino acid linker.
The X1 linker may be any linker suitable for an intended purpose. In some embodiments, the amino acid linker is a GS-rich linker of less than 20, less than 15, or less than 10 amino acids in length. As used here, “GS-rich” means at least 50% G or S residues.
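The “GS-rich” definition and the length limits above translate directly into a short check. The sketch below is illustrative; the function names are hypothetical, and the example linker GGSGG is an assumption chosen for demonstration.

```python
def gs_fraction(linker: str) -> float:
    """Fraction of residues in the linker that are glycine (G) or serine (S)."""
    if not linker:
        return 0.0
    return sum(residue in "GS" for residue in linker) / len(linker)


def is_gs_rich(linker: str) -> bool:
    """GS-rich as defined in the text: at least 50% G or S residues."""
    return bool(linker) and gs_fraction(linker) >= 0.5


def is_short_gs_rich_linker(linker: str, max_length: int = 20) -> bool:
    """GS-rich and strictly shorter than max_length (20, 15, or 10 in the text)."""
    return is_gs_rich(linker) and len(linker) < max_length


# Example linker chosen for illustration (an assumption, not a recited sequence).
example = "GGSGG"
print(gs_fraction(example), is_short_gs_rich_linker(example, max_length=10))
```

A linker with exactly half G or S residues, such as GSAA, still qualifies because the definition is "at least 50%".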
In a further embodiment, the nucleic acid encodes a polypeptide comprising an amino acid sequence at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence selected from the group consisting of SEQ ID NO:38-44. In a further embodiment, the nucleic acid encodes a polypeptide comprising an amino acid sequence at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence selected from the group consisting of SEQ ID NO:42-44.
The amino acid sequences of SEQ ID NO:38-44 are provided in Table 9.
AEDADLEKQKQEEEKRGETLKDESLEATRKMVNMVREAREMAMRNGELLESQGEKLDRIEEKADRMETKLDE
ADEDLKKIEGFSGLSVSPSNKLKSSDAYKKAWGNNQDGVVASQPARVVDEREQMAISGGFIRRVTNDARENE
MDENLEQVSGIIGNLRHMALDMGNEIDTQNRQIDRIMEKADSAKTRIDEANQRATKMLGSG
AEDADSLAQQQQEEQRGSTLIDESLEATRKMKEMVEEAVRMAMDNGELLRSQGEKLDRIEEKADRMESLLDE
ADENLDKIEGFSGLSVSPSNKLKSSDAYKKAWGNNQDGVVASQPARVVDEREQMAISGGFIRRVTNDARENE
MDENLEQVSGIIGNLRHMALDMGNEIDTQNRQIDRIMEKADSAKTRIDEANQRATKMLGSG
AEDADMRNELEEMQRRADQLADESLESTRRMLQLVEESKDAGIRTLVMLDEQGEQLERIEEGMDQINKDMKE
AEKNLADLGKFSGLSVSPSNKLKSSDAYKKAWGNNQDGVVASQPARVVDEREQMAISPSYIREVNNSEKEKE
INEGLGRVDQQVQELKDMAVVMGEKVDEQNEKIDRINEKADKNEQRVNDLTKEAEKLLNSG
AEDADMRNELEEMQRRADQLADESLESTRRMLQLVEESKDAGIRTLVMLDEQGEQLERIEEGMDQINKDMKE
AEKNLADLGKFSGLSVSPSNKLKSSDAYKKAWGNNQDGVVASQPARVVDEREQMAISSSFIRRVNGSERERE
IDRGLERVDQQVKELKDMARVMGDKTDEQGEKIDRIEEKADRNEERVEKLVKEAKELLESG
DARENEMDENLEQVSGIIGNLRHMALDMGNEIDTQNRQIDRIMEKADSAKTRIDEANQRATKMLGSGGGSGG
EEEEEEAEEEE
DARENEMDENLEQVSGIIGNLRHMALDMGNEIDTQNRQIDRIMEKADSAKTRIDEANQRATKMLGSGGGSGG
EQEKEKALKEK
NKREEEIDKGLDRVGEIISKLNEMAREMGEKIEEQNQKISEIEKKADEAIEKVEKLIKDAEKLLGSGGGSGG
EQEKEKALKEK
In a further embodiment, the nucleic acids of this second aspect encode a polypeptide that further comprises a signal peptide at its amino-terminus. Any signal peptide may be used as suitable for an intended purpose. The signal peptide may be directly linked to the polypeptide, or may be connected via an amino acid linker. In some embodiments, the signal peptide comprises the amino acid sequence selected from the group consisting of SEQ ID NO:292-309. The amino acid sequences of these exemplary signal peptides are provided in Table 7.
In a further embodiment, the nucleic acids of the second aspect comprise an expression vector comprising the nucleic acid operatively linked to a control sequence, such as a promoter.
In a third aspect, the disclosure provides nucleic acids encoding a Syn1A-redesigned polypeptide comprising the formula X1-X2-X3, wherein:
This aspect provides redesigned Syntaxin 1A (Syn1A) variants as described above. When combined with v-SP and native or redesigned SNAP25, redesigned Syn1A is capable of inducing cell-cell fusion when displayed on the surface of mammalian cells, or liposome fusion when displayed on the surface of liposomes. For membrane fusion to occur, v-SP is presented on one membrane and Syn1A and SNAP25 on the other.
The amino acid sequences of SEQ ID NO:317-330 are provided in Table 10.
SIEMEELSEIETRHSEIIKLENSIRELHDMEMDMAMLVESQGEMIDRIEYNVEHAVDYVERAVSD
SISKQALKRLEERHSEIIKLENSIRELHDMEMDMAMLVESQGEMIDRIEYNVEHAVDYVERAVSD
SISKQALSEIETNNEQVIKLENSIRELHDMFMDMAMLVESQGEMIDRIEYNVEHAVDYVERAVSD
SISKQALSEIETRHSEIRRLLESIRELHDMEMDMAMLVESQGEMIDRIEYNVEHAVDYVERAVSD
SISKQALSEIETRHSEIIKLENSVEEMHDMEMDMAMLVESQGEMIDRIEYNVEHAVDYVERAVSD
SISKQALSEIETRHSEIIKLENSIRELKDMARDMAMLVESQGEMIDRIEYNVEHAVDYVERAVSD
SISKQALSEIETRHSEIIKLENSIRELHDMEMRLGDMVESQGEMIDRIEYNVEHAVDYVERAVSD
SISKQALSEIETRHSEIIKLENSIRELHDMEMDMAMLVDEQGEMIDRIEYNVEHAVDYVERAVSD
SISKQALSEIETRHSEIIKLENSIRELHDMEMDMAMLVESQGEKISRIEYNVEHAVDYVERAVSD
SISKQALSEIETRHSEIIKLENSIRELHDMEMDMAMLVESQGEMIDRIEERVEHAVDYVERAVSD
SISKQALSEIETRHSEIIKLENSIRELHDMEMDMAMLVESQGEMIDRIEYNVEAAEAYVERAVSD
SISKQALSEIETRHSEIIKLENSIRELHDMEMDMAMLVESQGEMIDRIEYNVEHAVDGVKAAVSD
SISKQALSEIETRHSEIIKLENSIRELHDMEMDMAMLVESQGEMIDRIEYNVEHAVDYVERAKDN
SISKQALSEIETRHSEIIKLENSIRELHDMFMDMAMLVESQGEMIDRIEYNVEHAVDYVERAVSD
In one embodiment of this third aspect, X2 comprises an amino acid sequence at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence selected from the group consisting of SEQ ID NO:331-332.
In a further embodiment, X3 comprises an amino acid sequence at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence selected from the group consisting of SEQ ID NO:223-234, or selected from the group consisting of SEQ ID NO:223 or 232. The amino acid sequences of the X3 TMDs are provided in Table 4.
In another embodiment of this third aspect, the nucleic acid encodes a polypeptide comprising an amino acid sequence at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence selected from the group consisting of SEQ ID NO:45-63. The amino acid sequences of SEQ ID NO:45-63 are provided in Table 11.
SIEMEELSEIETRHSEIIKLENSIRELHDMEMDMAMLVESQGEMIDRIEYNVEHAVDYVERAVSD
TKKAVKY
QSKARRKK
IMIIICCVILGIVIASTVGGIFA
SISKQALKRLEERHSEIIKLENSIRELHDMEMDMAMLVESQGEMIDRIEYNVEHAVDYVERAVSD
TKKAVKY
QSKARRKKIMIIICCVILGIVIASTVGGIFA
SISKQALSEIETNNEQVIKLENSIRELHDMFMDMAMLVESQGEMIDRIEYNVEHAVDYVERAVSD
TKKAVKY
QSKARRKK
IMIIICCVILGIVIASTVGGIFA
SISKQALSEIETRHSEIRRLLESIRELHDMEMDMAMLVESQGEMIDRIEYNVEHAVDYVERAVSD
TKKAVKY
QSKARRKK
IMIIICCVILGIVIASTVGGIFA
SISKQALSEIETRHSEIIKLENSVEEMHDMEMDMAMLVESQGEMIDRIEYNVEHAVDYVERAVSD
TKKAVKY
QSKARRKK
IMIIICCVILGIVIASTVGGIFA
SISKQALSEIETRHSEIIKLENSIRELKDMARDMAMLVESQGEMIDRIEYNVEHAVDYVERAVSD
TKKAVKY
QSKARRKK
IMIIICCVILGIVIASTVGGIFA
SISKQALSEIETRHSEIIKLENSIRELHDMEMRLGDMVESQGEMIDRIEYNVEHAVDYVERAVSD
TKKAVKY
QSKARRKK
IMIIICCVILGIVIASTVGGIFA
SISKQALSEIETRHSEIIKLENSIRELHDMEMDMAMLVDEQGEMIDRIEYNVEHAVDYVERAVSD
TKKAVKY
QSKARRKK
IMIIICCVILGIVIASTVGGIFA
SISKQALSEIETRHSEIIKLENSIRELHDMEMDMAMLVESQGEKISRIEYNVEHAVDYVERAVSD
TKKAVKY
QSKARRKK
IMIIICCVILGIVIASTVGGIFA
SISKQALSEIETRHSEIIKLENSIRELHDMFMDMAMLVESQGEMIDRIEERVEHAVDYVERAVSD
TKKAVKY
QSKARRKK
IMIIICCVILGIVIASTVGGIFA
SISKQALSEIETRHSEIIKLENSIRELHDMFMDMAMLVESQGEMIDRIEYNVEAAEAYVERAVSD
TKKAVKY
QSKARRKK
IMIIICCVILGIVIASTVGGIFA
SISKQALSEIETRHSEIIKLENSIRELHDMEMDMAMLVESQGEMIDRIEYNVEHAVDGVKAAVSD
TKKAVKY
QSKARRKK
IMIIICCVILGIVIASTVGGIFA
SISKQALSEIETRHSEIIKLENSIRELHDMEMDMAMLVESQGEMIDRIEYNVEHAVDYVERAKDN
TKKAVKY
QSKARRKK
IMIIICCVILGIVIASTVGGIFA
SISKQALSEIETRHSEIIKLENSIRELHDMEMDMAMLVESQGEMIDRIEYNVEHAVDYVERAKDN
TKKAVKY
QSKARRKK
IMIIICCVILGIVIASTVGGIFA
SISKQALSEIETRHSEIIKLENSIRELHDMEMDMAMLVESQGEMIDRIEYNVEHAVDYVERAVSD
TKKAVKY
QSKARRKKFFFIIGLIIGLFLVLRVGIHLFA
SISKQALSEIETRHSEIIKLENSIRELHDMEMDMAMLVESQGEMIDRIEYNVEHAVDYVERAVSD
TKKAVKY
QSKARRKKILWISFAISCFLLCVVLLGFIFA
SISKQALSEIETRHSEIIKLENSIRELHDMEMDMAMLVESQGEMIDRIEYNVEHAVDYVERAVSD
TKKAVKY
QSKARRKKIATGMVGALLLLLVVALGIGLFFA
SISKQALSEIETRHSEIIKLENSIRELHDMEMDMAMLVESQGEMIDRIEYNVEHAVDYVERAVSD
TKKAVKY
QSKARRKKMMIILGVICAIILIIIIVYFFA
SISKQALSEIETRHSEIIKLENSIRELHDMEMDMAMLVESQGEMIDRIEYNVEHAVDYVERAVSD
TKKAVKY
QSKARRKKAAVLVLLVIVIISLIVLVVIWFA
In a further embodiment, the nucleic acids of this third aspect encode a polypeptide that further comprises a signal peptide at its amino-terminus. Any signal peptide may be used as suitable for an intended purpose. The signal peptide may be directly linked to the polypeptide, or may be connected via an amino acid linker. In some embodiments, the signal peptide comprises the amino acid sequence selected from the group consisting of SEQ ID NO:292-309. The amino acid sequences of these exemplary signal peptides are provided in Table 7.
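As an illustration only, the following minimal Python sketch shows how an X1-X2-X3 polypeptide with an optional amino-terminal signal peptide and optional linker might be assembled as a string. The X1, JMD, and TMD strings are copied from the sequence tables of this section, but their assignment to X1, X2, and X3 is an assumption based on the row layout of those tables. The signal peptide and linker shown are hypothetical placeholders, not sequences from SEQ ID NO:292-309, and the function name `assemble_polypeptide` is likewise introduced only for this example.

```python
# Minimal sketch: concatenate domains N-terminus to C-terminus, with an
# optional signal peptide joined directly or through a linker.

STANDARD_RESIDUES = set("ACDEFGHIKLMNPQRSTVWY")

def assemble_polypeptide(domains, signal_peptide="", linker=""):
    """Return signal_peptide + linker + domains joined in order.

    The linker is only used when a signal peptide is present.
    Raises ValueError if any character is not a standard residue code.
    """
    body = "".join(domains)
    full = signal_peptide + linker + body if signal_peptide else body
    invalid = set(full) - STANDARD_RESIDUES
    if invalid:
        raise ValueError(f"non-standard residue codes: {sorted(invalid)}")
    return full

# Strings taken from the tables in this section; the domain assignment
# (X1 / X2 / X3) is an assumption of this sketch.
X1 = "SISKQALSEIETRHSEIIKLENSIRELHDMEMDMAMLVESQGEMIDRIEYNVEHAVDYVERAVSD"
X2_JMD = "TKKAVKY" + "QSKARRKK"
X3_TMD = "IMIIICCVILGIVIASTVGGIFA"

# Hypothetical placeholders, NOT sequences from the disclosure.
PLACEHOLDER_SIGNAL = "MAAA"
PLACEHOLDER_LINKER = "GGS"

direct = assemble_polypeptide([X1, X2_JMD, X3_TMD], PLACEHOLDER_SIGNAL)
linked = assemble_polypeptide([X1, X2_JMD, X3_TMD], PLACEHOLDER_SIGNAL,
                              PLACEHOLDER_LINKER)
mature = assemble_polypeptide([X1, X2_JMD, X3_TMD])
```

In this sketch `direct` corresponds to a signal peptide directly linked to the polypeptide, `linked` to a signal peptide connected via an amino acid linker, and `mature` to the polypeptide without a signal peptide. The residue check is included because single-character transcription errors in long sequences are otherwise easy to miss.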
In a further embodiment, the nucleic acids of the third aspect comprise an expression vector comprising the nucleic acid operatively linked to a control sequence, such as a promoter.
sc-t-SP Designs
In a fourth aspect, the disclosure provides nucleic acids encoding single-chain t-SNARE-like proteins (sc-t-SPs) comprising the formula X1-X2-X3, wherein:
In this fourth aspect, the disclosure provides single-chain t-SNARE-like proteins (sc-t-SPs) which fuse the C-terminus of redesigned SNAP25 to the N-terminus of Syn1A sequences as described below. The sc-t-SP designs have three coiled-coil domains, and one of them is anti-parallel as it is derived from the redesigned SNAP25 described above. The sc-t-SP designs can bind to v-SPs and form a four-helix bundle like the native SNARE complex, though one helix is anti-parallel unlike the native SNARE complex. When combined with v-SP, sc-t-SPs are capable of inducing membrane fusion in mammalian cells or liposomes. The sc-t-SP should be present on one membrane and v-SP on the other to induce membrane fusion. The amino acid sequences of the X1 domains of SEQ ID NO:333-425 are provided in Tables 12 and 13.
DARENEMDENLEQVSGIIGNLRHMALDMGNEIDTQNRQIDRIMEKADSAKTRIDEANQRATKMLGSGGGSGG
EEEEEEAEEEEGGSGGSGGSSISKQALSEIETRHSEIIKLENSIRELHDMEMDMAMLVESQGEMIDRIEYNV
EHAVDYVERAVSD
DARENEMDENLEQVSGIIGNLRHMALDMGNEIDTQNRQIDRIMEKADSAKTRIDEANQRATKMLGSGGGSGG
EQEKEKALKEKGGSGGSGGSSISKQALSEIETRHSEIIKLENSIRELHDMEMDMAMLVESQGEMIDRIEYNV
EHAVDYVERAVSD
DARENEMDENLEQVSGIIGNLRHMALDMGNEIDTQNRQIDRIMEKADSAKTRIDEANQRATKMLGSGGGSGG
EEEEEEGGSGGSGGSSISKQALSEIETRHSEIIKLENSIRELHDMEMDMAMLVESQGEMIDRIEYNVEHAVD
YVERAVSD
DARENEMDENLEQVSGIIGNLRHMALDMGNEIDTQNRQIDRIMEKADSAKTRIDEANQRATKMLGSGGGSGG
EGGSGGSGGSSISKQALSEIETRHSEIIKLENSIRELHDMEMDMAMLVESQGEMIDRIEYNVEHAVDYVERA
VSD
DARENEMDENLEQVSGIIGNLRHMALDMGNEIDTQNRQIDRIMEKADSAKTRIDEANQRATKMLGSGGGSGG
DARENEMDENLEQVSGIIGNLRHMALDMGNEIDTQNRQIDRIMEKADSAKTRIDEANQRATKMLGSGGGSGG
SISKQALSEIETRHSEIIKLENSIRELHDMEMDMAMLVESQGEMIDRIEYNVEHAVDYVERAVSD
DARENEMDENLEQVSGIIGNLRHMALDMGNEIDTQNRQIDRIMEKADSAKTRIDEANQRATKMLGSGGGSGG
ALSEIETRHSEIIKLENSIRELHDMEMDMAMLVESQGEMIDRIEYNVEHAVDYVERAVSD
DARENEMDENLEQVSGIIGNLRHMALDMGNEIDTQNRQIDRIMEKADSAKTRIDEANQRATKMLGSGGGSGG
EEEEEEGGSGGSGGSALSEIETRHSEIIKLENSIRELHDMEMDMAMLVESQGEMIDRIEYNVEHAVDYVERA
VSD
DARENEMDENLEQVSGIIGNLRHMALDMGNEIDTQNRQIDRIMEKADSAKTRIDEANQRATKMLGSGGGSGG
EGGSGGSGGSALSEIETRHSEIIKLENSIRELHDMFMDMAMLVESQGEMIDRIEYNVEHAVDYVERAVSD
DARENEMDENLEQVSGIIGNLRHMALDMGNEIDTQNRQIDRIMEKADSAKTRIDEANQRATKMLGSGGGSGG
DARENEMDENLEQVSGIIGNLRHMALDMGNEIDTQNRQIDRIMEKADSAKTRIDEANQRATKMLGSGGGSGG
ALSEIETRHSEIIKLENSIRELHDMEMDMAMLVESQGEMIDRIEYNVEHAVDYVERAVSD
DARENEMDENLEQVSGIIGNLRHMALDMGNEIDTQNRQIDRIMEKADSAKTRIDEANQRATKMLGSGGGSGG
EQEKEKALKEKGGSGGSGGSEKVKEELAKVEERHKQIQALLDKIEELYEMEKEMSEKISEQGQKIDRIEEKV
SKASEHVSKGVED
DARENEMDENLEQVSGIIGNLRHMALDMGNEIDTQNRQIDRIMEKADSAKTRIDEANQRATKMLGSGGGSGG
EQEKEKALKEKGGSGGSGGSEKVKKELAEVEKRHKQILELEEKIKELYEMEKEMSEKIEKQGQKIDRIDDKV
SEAKKHVEKAVED
DARENEMDENLEQVSGIIGNLRHMALDMGNEIDTQNRQIDRIMEKADSAKTRIDEANQRATKMLGSGGGSGG
EQEKEKALKEKGGSGGSGGSEKVKRELAQIEERHQQILELEEKIKELLEMFKELSEKIEEQGQKIDRIEDKV
SKAKEHVEKGVED
DARENEMDENLEQVSGIIGNLRHMALDMGNEIDTQNRQIDRIMEKADSAKTRIDEANQRATKMLGSGGGSGG
EQEKEKALKEKGGSGGSGGSEAVKKSLAAKEERHKQILELLEKIKELHEMFKELSEKIEKQGQKIDRIEDKV
SKASEHVSKGVED
DARENEMDENLEQVSGIIGNLRHMALDMGNEIDTQNRQIDRIMEKADSAKTRIDEANQRATKMLGSGGGSGG
EQEKEKALKEKGGSGGSGGSEAVKKELAAIEERHEQILELLKKIEELYEMFKELSEKIEKQGQKIDRIEKKV
SEASRHVSKAVED
DARENEMDENLEQVSGIIGNLRHMALDMGNEIDTQNRQIDRIMEKADSAKTRIDEANQRATKMLGSGGGSGG
EQEKEKALKEKGGSGGSGGSEKVKKDLAAIEERHQQILELEEKIKELHEMFKEMSEKISEQMQKIDRIEEKV
SKASEHVSKGVED
DARENEMDENLEQVSGIIGNLRHMALDMGNEIDTQNRQIDRIMEKADSAKTRIDEANQRATKMLGSGGGSGG
EQEKEKALKEKGGSGGSGGSEKVRRELAAIEERHRQILELLEKIEELHEMFKEMSEKISKQMEKIDRIDDRV
SEASRHVEKGVED
DARENEMDENLEQVSGIIGNLRHMALDMGNEIDTQNRQIDRIMEKADSAKTRIDEANQRATKMLGSGGGSGG
EQEKEKALKEKGGSGGSGGSKAVKEELANIENRHKQIDALYEKIKELHEMFLEMSERIEAQLQKIDRIDDKV
SKAKAHVEKGVED
NKREEEIDKGLDRVGEIISKLNEMAREMGEKIEEQNQKISEIEKKADEAIEKVEKLIKDAEKLLGSGGGSGG
EQEKEKALKEKGGSGGSGGSSISKQALSEIETRHSEIIKLENSIRELHDMFMDMAMLVESQGEMIDRIEYNV
EHAVDYVERAVSD
NKREEEIDKGLDRVGEIISKLNEMAREMGEKIEEQNQKISEIEKKADEAIEKVEKLIKDAEKLLGSGGGSGG
EQEKEKALKEKGGSGGSGGSALSEIETRHSEIIKLENSIRELHDMEMDMAMLVESQGEMIDRIEYNVEHAVD
YVERAVSD
NKREEEIDKGLDRVGEIISKLNEMAREMGEKIEEQNQKISEIEKKADEAIEKVEKLIKDAEKLLGSGGGSGG
EQEKEKALKEKGGSGGSGGSSISKQALSEIETRHSEIIKLENSIRELHDMEMDMAMLVESQGEMIDRIEYNV
EHAVDYVERAVSD
NKREEEIDKGLDRVGEIISKLNEMAREMGEKIEEQNQKISEIEKKADEAIEKVEKLIKDAEKLLGSGGGSGG
EQEKEKALKEKGGSGGSGGSSISKQALSEIETRHSEIIKLENSIRELHDMEMDMAMLVESQGEMIDRIEYNV
EHAVDYVERAVSD
NKREEEIDKGLDRVGEIISKLNEMAREMGEKIEEQNQKISEIEKKADEAIEKVEKLIKDAEKLLGSGGGSGG
EQEKEKALKEKGGSGGSGGSSISKQALSEIETRHSEIIKLENSIRELHDMEMDMAMLVESQGEMIDRIEYNV
EHAVDYVERAVSD
NEREKEIDEGLERVGELISKLKELAREMSEKIEEQNQKLSEIDKKAEEAIKLLEKANASAKKLLEKPGGSGG
EQEKEKALKEKGGSGGSGGSEKVKRELAQIEERHQQILELEEKIKELLEMFKELSEKIEEQGQKIDRIEDKV
SKAKEHVEKGVED
NEREKEIEEGLERVGELISELKEMAREMSEKIEEQNKKLDEISKKADEAIKLLEKANKGAEELLKKPGGSGG
EQEKEKALKEKGGSGGSGGSEKVKRELAQIEERHQQILELEEKIKELLEMFKELSEKIEEQGQKIDRIEDKV
SKAKEHVEKGVED
NEREKEIDEGLEKIGELISKLKEMAREMSEKIEEQNEKLDEIDKKADEAIKLLEEANKKAEKLLKKKGGSGG
EQEKEKALKEKGGSGGSGGSEKVKRELAQIEERHQQILELEEKIKELLEMFKELSEKIEEQGQKIDRIEDKV
SKAKEHVEKGVED
NEREKEIEEGLERIGELISKLKELAREMSEKIEEQNEKLSEISEKADEAIKLLEKANASAQKLLEKPGGSGG
EQEKEKALKEKGGSGGSGGSEKVKRELAQIEERHQQILELEEKIKELLEMFKELSEKIEEQGQKIDRIEDKV
SKAKEHVEKGVED
NPREEEIDKGLEEIGKLISELKELAREMSEKIEEQNEKISEIDEKAKEAIELLKKANEKAKELLEKEGGSGG
EQEKEKALKEKGGSGGSGGSEKVKRELAQIEERHQQILELEEKIKELLEMFKELSEKIEEQGQKIDRIEDKV
SKAKEHVEKGVED
SPREKEIDEGLERVSELVKKLKELAEKMKEMIEEQGRRIERIERKAEEAKERIEKLNEKAEKLLEDPGGSGG
EQEKEKALKEKGGSGGSGGSEKVKRELAQIEERHQQILELEEKIKELLEMFKELSEKIEEQGQKIDRIEDKV
SKAKEHVEKGVED
SEREKEIDEGLDRVSEIVKELKKMAEEMRRMIEEQGRRIERIEEKAEEAKEKIEEANERAEKLLKDPGGSGG
EQEKEKALKEKGGSGGSGGSEKVKRELAQIEERHQQILELEEKIKELLEMFKELSEKIEEQGQKIDRIEDKV
SKAKEHVEKGVED
SEREKEIDEGLEKVSEIVKELKEMAEEMREMIERQGEQIERIEKKAEEAKKKIEEQNERAERLLKDPGGSGG
EQEKEKALKEKGGSGGSGGSEKVKRELAQIEERHQQILELEEKIKELLEMFKELSEKIEEQGQKIDRIEDKV
SKAKEHVEKGVED
SEREKEIEEGLERVSEIVRRLKELAEEMRRMIEEQGRRIDRIEEKADKAKEEIEKQNEKLEKLLKDPGGSGG
EQEKEKALKEKGGSGGSGGSEKVKRELAQIEERHQQILELEEKIKELLEMFKELSEKIEEQGQKIDRIEDKV
SKAKEHVEKGVED
SEREKEIDEGLEKVSEIVKELKELAKEMKEMIEEQGRRIDRIERKAEETKKKIEELNEQAERLLKDPGGSGG
EQEKEKALKEKGGSGGSGGSEKVKRELAQIEERHQQILELEEKIKELLEMFKELSEKIEEQGQKIDRIEDKV
SKAKEHVEKGVED
SEREEEIDKGLERVSEIVKKLKELAEKMKEEIERQGEQIDRIEKKADETIKEIERLNESADRLLKSPGGSGG
EQEKEKALKEKGGSGGSGGSEKVKRELAQIEERHQQILELEEKIKELLEMFKELSEKIEEQGQKIDRIEDKV
SKAKEHVEKGVED
SEREKEIDEGLDRVSEIVKELKKMAEEMRRMIEEQGRRIERIEEKAEEAKEKIEEANERAEKLLKDPGGSGG
EQEKEKALKEKGGSGGSGGSEKVKRELAQIEERHQQILELEEKIKELLEMFKELSEKIEEQGQKIDRIEDKV
SKAKEHVEKGVED
SEREKEIDEGLDRVSEIVKELKKMAEEMRRMIEEQGRRIERIEEKAEEAKEKIEEANERAEKLLKDPGGSGG
EQEKEKALKEKGGSGGSGGSEKVKRELAQIEERHQQILELEEKIKELLEMFKELSEKIEEQGQKIDRIEDKV
SKAKEHVEKGVED
SEREKEIDEGLDRVSEIVKELKKMAEEMRRMIEEQGRRIERIEEKAEEAKEKIEEANERAEKLLKDPGGSGG
EQEKEKALKEKGGSGGSGGSEKVKRELAQIEERHQQILELEEKIKELLEMFKELSEKIEEQGQKIDRIEDKV
SKAKEHVEKGVED
SEREKEIDEGLDRVSEIVKELKKMAEEMRRMIEEQGRRIERIEEKAEEAKEKIEEANERAEKLLKDPGGSGG
EQEKEKALKEKGGSGGSGGSEKVKRELAQIEERHQQILELEEKIKELLEMFKELSEKIEEQGQKIDRIEDKV
SKAKEHVEKGVED
SEREKEIDEGLDRVSEIVKELKKMAEEMRRMIEEQGRRIERIEEKAEEAKEKIEEANERAEKLLKDPGGSGG
EQEKEKALKEKGGSGGSGGSEKVKRELAQIEERHQQILELEEKIKELLEMFKELSEKIEEQGQKIDRIEDKV
SKAKEHVEKGVED
SEREKEIDEGLDRVSEIVKELKKMAEEMRRMIEEQGRRIERIEEKAEEAKEKIEEANERAEKLLKDPGGSGG
EQEKEKALKEKGGSGGSGGSEKVKRELAQIEERHQQILELEEKIKELLEMFKELSEKIEEQGQKIDRIEDKV
SKAKEHVEKGVED
SEREKEIDEGLDRVSEIVKELKKMAEEMRRMIEEQGRRIERIEEKAEEAKEKIEEANERAEKLLKDPGGSGG
EQEKEKALKEKGGSGGSGGSEKVKRELAQIEERHQQILELEEKIKELLEMFKELSEKIEEQGQKIDRIEDKV
SKAKEHVEKGVED
SEREKEIDEGLDRVSEIVKELKKMAEEMRRMIEEQGRRIERIEEKAEEAKEKIEEANERAEKLLKDPGGSGG
EQEKEKALKEKGGSGGSGGSEKVKRELAQIEERHQQILELEEKIKELLEMFKELSEKIEEQGQKIDRIEDKV
SKAKEHVEKGVED
SEREKEIDEGLDRVSEIVKELKKMAEEMRRMIEEQGRRIERIEEKAEEAKEKIEEANERAEKLLKDPGGSGG
EQEKEKALKEKGGSGGSGGSEKVKRELAQIEERHQQILELEEKIKELLEMFKELSEKIEEQGQKIDRIEDKV
SKAKEHVEKGVED
DARENEMDENLEQVSGIIGNLRHMALDMGNEIDTQNRQIDRIMEKADSAKTRIDEANQRATKMLGSGGGSGG
EQEKEKALKEKGGSGGSGGSEKVKESLANKEERHKQILELEEKIKELYEMEKELSEKIEEQLKKIDRIEEKV
SEASRHVSKGVES
DARENEMDENLEQVSGIIGNLRHMALDMGNEIDTQNRQIDRIMEKADSAKTRIDEANQRATKMLGSGGGSGG
EQEKEKALKEKGGSGGSGGSEKTRRTLAEIEERHRQILELEEKIEELYEMFKELSEKISEQGQKISRIEDKV
SKASEHVSKGVEN
DARENEMDENLEQVSGIIGNLRHMALDMGNEIDTQNRQIDRIMEKADSAKTRIDEANQRATKMLGSGGGSGG
EQEKEKALKEKGGSGGSGGSEKVKKSLAEIEKRHEQILQLEKQIEELHEMFKELSEKISKQGQKIDRIEEKV
EEAKRHVEKAVKD
DARENEMDENLEQVSGIIGNLRHMALDMGNEIDTQNRQIDRIMEKADSAKTRIDEANQRATKMLGSGGGSGG
EQEKEKALKEKGGSGGSGGSKKVKEELARIEARHQQILALEEKIRELYEMEKELSEKIEEQGKKIDRIEDKV
SKASEHVSKGVEN
DARENEMDENLEQVSGIIGNLRHMALDMGNEIDTQNRQIDRIMEKADSAKTRIDEANQRATKMLGSGGGSGG
SEELKKLEKEGEKLKELVEELDREIKELKEGMERLREMFEEAAKLSEEALEIMRRTRKLSEEELEEAKQKAK
EQEKEKALKEKGGSGGSGGSKKVKEELKEKEKRHRQIEELLKKIEELHEMFEELSERISEQGQKIDRIDDKV
SKASEHVSKGVED
DARENEMDENLEQVSGIIGNLRHMALDMGNEIDTQNRQIDRIMEKADSAKTRIDEANQRATKMLGSGGGSGG
EQEKEKALKEKGGSGGSGGSEEVKKSLAEIEKRHEQILALEKKIEELYEMEKELGEKIEKQLQKISRIEEKV
SEASRHVSKGVED
DARENEMDENLEQVSGIIGNLRHMALDMGNEIDTQNRQIDRIMEKADSAKTRIDEANQRATKMLGSGGGSGG
EQEKEKALKEKGGSGGSGGSEKVKKELAEIEARHQQIEALLEQIKELYEMEKELSEKIEEQGQKISRIEDKV
SKASEHVSKGVEQ
DARENEMDENLEQVSGIIGNLRHMALDMGNEIDTQNRQIDRIMEKADSAKTRIDEANQRATKMLGSGGGSGG
EQEKEKALKEKGGSGGSGGSEKVKKELAEKEKRHKQIDELLEKIKELYEMFKEMGEKIEKQGEKIDRIEKKV
SEASKHVSKAVED
DARENEMDENLEQVSGIIGNLRHMALDMGNEIDTQNRQIDRIMEKADSAKTRIDEANQRATKMLGSGGGSGG
EQEKEKALKEKGGSGGSGGSEAVKKELAAIEARHKQIDALLEKIKELHEMFEEMSKKIEEQMQKISRIEDKV
SEASRHVSKAVSD
DARENEMDENLEQVSGIIGNLRHMALDMGNEIDTQNRQIDRIMEKADSAKTRIDEANQRATKMLGSGGGSGG
EQEKEKALKEKGGSGGSGGSEKVKKELAEIEKRHKQILELEEKIKELHEMFKELGEKIEKQGQKISRIDDKV
SEAKRHVEKGVED
DARENEMDENLEQVSGIIGNLRHMALDMGNEIDTQNRQIDRIMEKADSAKTRIDEANQRATKMLGSGGGSGG
EQEKEKALKEKGGSGGSGGSKKVKEELKKIEERHKQILELEEKIEELYEMEKELAERIEKQGEKIDRIDEKV
SEAKRNVEKAVED
DARENEMDENLEQVSGIIGNLRHMALDMGNEIDTQNRQIDRIMEKADSAKTRIDEANQRATKMLGSGGGSGG
EQEKEKALKEKGGSGGSGGSEKVKKELEEKEKRHKQILELEEKIKELYEMEKELSEKIEEQLQKIDRIDDKV
SEASRHVSKGVED
DARENEMDENLEQVSGIIGNLRHMALDMGNEIDTQNRQIDRIMEKADSAKTRIDEANQRATKMLGSGGGSGG
EQEKEKALKEKGGSGGSGGSEEVRRSLEEIERRHRQILELEEKIEELYEMEKEMSEKIEEQGQKISRIEEKV
SKASEHVSKAVED
DARENEMDENLEQVSGIIGNLRHMALDMGNEIDTQNRQIDRIMEKADSAKTRIDEANQRATKMLGSGGGSGG
EQEKEKALKEKGGSGGSGGSEEVKKELAEIEARHEQIKELEKQIEELHEMFKELGEKIEKQGEKIDRIDEKV
SEASRHVSKAVED
DARENEMDENLEQVSGIIGNLRHMALDMGNEIDTQNRQIDRIMEKADSAKTRIDEANQRATKMLGSGGGSGG
EQEKEKALKEKGGSGGSGGSEKVKEELAKKEARHEQILELTEKIEELSEMFKELSEKISEQGQKIDRIEDKV
SKASEHVSKAVED
SELEKKIDELLEEISRLVRELKEIAKELRRLTERQGRQVERIEREVEEAEREIEELNKEAEELLEKEDSEDE
PYTPEELEKVKERHELIKKLKEEIKELKEMFEELRELVRRQGERLDRIEEKVRRAVEHVKKAEEN
SEKEKEIDELLDKVSEIVKELKKLAEELKRRTERQGRQIEEIERKTEEAKRKIEELNKKAEELLKKEDDDSD
PYTPEELEKVRERHELIKKLLEEIEELEEMFEELERLVEEQGRRLERIEEKVSRAVRHVERAEEN
SSKEEEIEELLDEVSEIVRRLKEMAREIREMVERQGRQIERIERKVEEAKRKIEELNKKAEELLEKEDDESE
EYTPEELEEVEERHKLIQKLLEEIKELKEMFEELERLVEEQGRRLERIEEKVRRAVEHVKRALEN
PYTPEELEKVKERHELIKKLKEEIKELKEMFEELRELVRRQGERLDRIEEKVRRAVEHVKKAEEN
SEKEKEIDELLDKVSEIVKELKKLAEELKRRTERQGRQIEEIERKTEEAKRKIEELNKKAEELLKKEDDDSD
PYTPEELEKVRERHELIKKLLEEIEELEEMFEELERLVEEQGRRLERIEEKVSRAVRHVERAEEN
SETEKKENEALEELERLLEEAKKLLEEQRRLLEAQGEVQKEQEKLEDELEEIQEEAEKYQNKLLESKDEEDE
EAEAAEKAGKAAGVSLKEEVEKVKRELAQIEERHQQILELEEKIKELLEMFKELSEKIEEQGQKIDRIEDKV
SKAKEHVEKGVED
SSLEKKIDENLEKALELLEELKEKLEEMRRLLEESGRLQDELEELMDETQKQQEELEKLLEKLLKMDDSDEQ
EAKKELEEAKKKGEEGKEKLEKVKRELAQIEERHQQILELEEKIKELLEMFKELSEKIEEQGQKIDRIEDKV
SKAKEHVEKGVED
SEEKKKMEELLDKIKELLEELKKLAEEIKKLLEEQGRQLEKLEEEADKALRQAEEAIRLQEKALELEDDEEI
EKLKEYEELKKLGILPKEELEKVKRELAQIEERHQQILELEEKIKELLEMFKELSEKIEEQGQKIDRIEDKV
SKAKEHVEKGVED
EEFEKKENELLEELKKKLEEAKKLLKENRRLLEEQGRQLEEIEEKMEEAEELQEKALEYQEKAEKAGFSDES
KAEEKAKEAAKKGVDLSEELEKVKRELAQIEERHQQILELEEKIKELLEMFKELSEKIEEQGQKIDRIEDKV
SKAKEHVEKGVED
SEEEEKFNELLEKIEEELEEIKELAEELREKLEELGRLTEKALELADELEKLFEEAEKLLEEALKLGDGEEL
EAEKKAEELKKLGIDPTEELEKVKRELAQIEERHQQILELEEKIKELLEMFKELSEKIEEQGQKIDRIEDKV
SKAKEHVEKGVED
SSLKEKFNELLDELLKELEEAKELLEEIREQLERIGEQLEELEEQFDEILKEQEELEKQQKKLLESPGSEEE
EAKKKYEEKKKKGINGTEELEKVKRELAQIEERHQQILELEEKIKELLEMFKELSEKIEEQGQKIDRIEDKV
SKAKEHVEKGVED
KESEEKFEEKLKELEKLLEEAKELLEKQREYLEESGRLLDEAEKLMDETERIFEETLKLQDKLLAAKDEEDQ
EAEEKEKEGEKKGVSYKEEVEKVKRELAQIEERHQQILELEEKIKELLEMFKELSEKIEEQGQKIDRIEDKV
SKAKEHVEKGVED
SKLKEEFEKYLEELLKQLEKLKEKLKELREKLEEQGKQLEKLEEQFDRILEQQEKLLEQQEKLLEDEGSEEE
EAEKKYEELKKKGINGTEELEKVKRELAQIEERHQQILELEEKIKELLEMFKELSEKIEEQGQKIDRIEDKV
SKAKEHVEKGVED
SEREKEFNERLEEMEKLLEKIKKLAEEIRRLLEKQGELLDKLEELADEALRLQEKAIEKSEKILEKGYNEET
ELEKRKEEYEKLGIDLSEELEKVKRELAQIEERHQQILELEEKIKELLEMFKELSEKIEEQGQKIDRIEDKV
SKAKEHVEKGVED
SSFEEEINKLLEELKRKLEELKEILEEIRKLLEEQGRQLDEIEEKMDEAEELAEKAEEYLKKAEEAGGGEES
KAEAAAAEAAKLGIDKSEELEKVKRELAQIEERHQQILELEEKIKELLEMFKELSEKIEEQGQKIDRIEDKV
SKAKEHVEKGVED
SEEEEKFNELLEKIEEKLEEAKELAEELREELEKIGELTDEAERLADEALKLAEEAEKLLKEALKLGDEDEL
KAEEKYKEDLKKGIDNTEEIEKVKRELAQIEERHQQILELEEKIKELLEMFKELSEKIEEQGQKIDRIEDKV
SKAKEHVEKGVED
ESFEKELEELLEKIQELMEKIKELAEKLREALEESGRLLEEIEEAVDKLEEKFEEIEKLQENAEKYEDTEEA
EAEKKLKEAEEYNEELEKEVEKVKRELAQIEERHQQILELEEKIKELLEMFKELSEKIEEQGQKIDRIEDKV
SKAKEHVEKGVED
SELEEKENKYLEELLETLEKLKEALEKIREKLEEQGKQLDKIEEAFDELLKQQEELLKQQEELLADPGSEES
EAEKEYKEDKKKGINNKEKLEKVKRELAQIEERHQQILELEEKIKELLEMFKELSEKIEEQGQKIDRIEDKV
SKAKEHVEKGVED
SSLEKKIDENLEKALELLEELKEKLEEMRRLLEESGRLQDELEELMDETQKQQEELEKLLEKLLKMDDSDEQ
EAKKELEEAKKKGEEGKEKLEKVKRELAQIEERHQQILELEEKIKELLEMFKELSEKIEEQGQKIDRIEDKV
SKAKEHVEKGVED
SSLEKKIDENLEKALELLEELKEKLEEMRRLLEESGRLQDELEELMDETQKQQEELEKLLEKLLKMDDSDEQ
EAKKELEEAKKKGEEGKEKLEKVKRELAQIEERHQQILELEEKIKELLEMFKELSEKIEEQGQKIDRIEDKV
SKAKEHVEKGVED
GELKEKKEKLSKEFEKLLKESKRLAEELKEKLEELGRALDEAEELADEVERQQEELEKLQEEILKSEENEDE
ATKAEAEAAKAKGEDISDKLEAAEKEYKSVKEELKLVEEIKKKVEEIKEMLEEMKERIEEMEEKVKRIEEKL
KRIEESLKRVEEN
DELEKKIKELEEKSEEELKEAKELAEELRRLLEELERALDEAERLADEVERKQEELEKLMEEMLKSEDNESD
ETKKKAEEMKKKGIDISEELEKAKEELESVKKNLELVKKILEEVKEIKEELEEMGEEIERMEEKVDRIEEKL
ERVEESLERVSKN
SEEDKKMEELLEEALKLLEELKELLEKNRELLEELGRQQEELEKLQDEAERLQEELEEAFKKMEENEESEEG
RAEAARAEYEAAGSPEVERAEQVLEEYREAKEFYEKVEELLREVKEIKEEIKEMEERIKEIGERIKRIEEKI
ERVEKLLERTEKN
SEKEKEFNELLEEALRELEKLKELLEENGRLLERTGEQLERMEELMDEAEEKQEELEEAIKKMEKYEDSEEG
KAEKAKAEYEAAGKDEVKECEKVKEKYEEAKKRYEQVEKLLKEVEEIKEEIERMGEEIKRQGERIERIEEKI
ERVEEELERLEEN
KPGEEKLNKLLEELLKKLEELKKLAEENRRLLERQGRQLEELERRFEELNRRMEELNEKLEKLLKEEPNEE
EPKEKELEELLEELLRELEEIKKLLEEFRRLQEEIGRQIEEIERQLEELLERLEELNEKLENLLKREDNEN
RPEEEKLNELLDELLRLLEEIKKLLEENRALLEEIGRQIDRIEEQLDRLLRELKELNEKLEALLKREDNEN
SLEEILEKLKEIAELLEEVEELTEELKEETERAGRELEELERRLEELVRRAEELNRKLEKILEEEDSDDIL
ERLKEARRELRELRERLEEVEREIERLIREAEEQSELLEELERELEEIKELLKELLEKEEELSEEELELIK
KLLEEIEELEEMFEELERLVEEQGRRLERIEEKVSRAVRHVERAEEN SEQ ID NO: 419
QNRQIDRIMEKADSAKTRIDEANQRATKMLGSGGGSGGSEELKKLEKEGEKLKELVEELDREIKELKE
GMERLREMFEEAAKLSEEALEIMRRTRKLSEEELEEAKQKAKEQEKEKALKEKGGSGGSGGSEKVKRE
LAQIEERHQQILELEEKIKELLEMFKELSEKIEEQGQKIDRIEDKVSKAKEHVEKGVED
MALDMGNEIDTQNRQIDRIMEKADSAKTRIDEANQRATKMLGSGGGSGGSEELKKLEKEGEKLKELVE
ELDREIKELKEGMERLREMFEEAAKLSEEALEIMRRTRKLSEEELEEAKQKAKEQEKEKALKEKGGSG
ED SEQ ID NO: 421
QGRRIERIEEKAEEAKEKIEEANERAEKLLKDPGGSGGSEELKKLEKEGEKLKELVEELDREIKELKE
GMERLREMFEEAAKLSEEALEIMRRTRKLSEEELEEAKQKAKEQEKEKALKEKGGSGGSGGSEKVKRE
LAQIEERHQQILELEEKIKELLEMFKELSEKIEEQGQKIDRIEDKVSKAKEHVEKGVED
MAEEMRRMIEEQGRRIERIEEKAEEAKEKIEEANERAEKLLKDPGGSGGSEELKKLEKEGEKLKELVE
ELDREIKELKEGMERLREMFEEAAKLSEEALEIMRRTRKLSEEELEEAKQKAKEQEKEKALKEKGGSG
ED SEQ ID NO: 423
RIEEKAEEAKEKIEEANERAEKLLKDPGGSGGSEELKKLEKEGEKLKELVEELDREIKELKEGMERLR
EMFEEAAKLSEEALEIMRRTRKLSEEELEEAKQKAKEQEKEKALKEKGGSGGSGGSEKVKRELAQIEE
RHQQILELEEKIKELLEMFKELSEKIEEQGQKIDRIEDKVSKAKEHVEKGVED SEQ ID NO: 424
EIERKTEEAKRKIEELNKKAEELLKKEDDDSDLEKTKELLKEAKEQLREVKEIKRRVEELKREQEETL
KLTKEAAELAEEAKELMEEMLELSEEILEEMLENPKPYTPEELEKVRERHELIKKLLEEIEELEEMFE
ELERLVEEQGRRLERIEEKVSRAVRHVERAEEN SEQ ID NO: 425
In a further embodiment of this fourth aspect, the X2 JMD domain comprises an amino acid sequence at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence selected from the group consisting of SEQ ID NO:331-332 and 426-445. The amino acid sequences of SEQ ID NO: 426-445 are shown in Table 14.
In another embodiment, the X3 TMD domain comprises an amino acid sequence at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence selected from the group consisting of SEQ ID NO:223-234. The amino acid sequences of the X3 TMDs are provided in Table 4.
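The percent identity thresholds recited above can be illustrated with a short Python sketch. This passage does not state how identity is to be calculated, so the sketch makes a simplifying assumption: the two sequences are already aligned and of equal length, and identity is the fraction of positions with the same residue. Sequences of different length would first need a pairwise alignment, which is not shown. The function name `percent_identity` is introduced only for this example, and the two juxtamembrane-region strings are taken from the sequence tables of this section.

```python
def percent_identity(candidate: str, reference: str) -> float:
    """Percent of positions with identical residues.

    Assumption of this sketch: both sequences are pre-aligned and have
    the same length (no gaps, no alignment step).
    """
    if len(candidate) != len(reference):
        raise ValueError("sketch assumes pre-aligned, equal-length sequences")
    if not reference:
        raise ValueError("reference sequence is empty")
    matches = sum(1 for a, b in zip(candidate, reference) if a == b)
    return 100.0 * matches / len(reference)

# Two 15-residue juxtamembrane strings appearing in the tables of this section.
REFERENCE_JMD = "TKKAVKYQSKARRKK"
VARIANT_JMD = "TKKAVKYQSKKKKKK"

identity = percent_identity(VARIANT_JMD, REFERENCE_JMD)
meets_80 = identity >= 80.0
meets_90 = identity >= 90.0
print(f"{identity:.1f}% identical; >=80%: {meets_80}; >=90%: {meets_90}")
```

The two strings differ at three of fifteen positions, so under the stated assumption the variant would satisfy an "at least 80% identical" recitation but not an "at least 90% identical" one.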
In further embodiments of this fourth aspect, the nucleic acid encodes a polypeptide comprising the formula B1-B2-X1-X2-X3, wherein
In these embodiments, the fusion protein can be used for inducible binding to v-SPs (described above) in the presence of rapamycin. The sc-t-SPs can be fused to the FRB domain (SEQ ID NO:291) at the N-terminus, and FRB can bind to its cognate binding partner, the FKBP domain (e.g., SEQ ID NO:290) fused to the N-terminus of v-SPs (e.g., SEQ ID NO:148), only in the presence of rapamycin. The fusion activity of designed fusogens fused to FKBP and FRB can thus be induced by rapamycin. The FRB domain fused to sc-t-SPs and the FKBP domain fused to v-SPs would function similarly if they were interchanged with each other.
In one embodiment, the nucleic acids of this fourth aspect encode a polypeptide comprising an amino acid sequence at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence selected from the group consisting of SEQ ID NO:64-146, 148, and 446-455.
The amino acid sequences of SEQ ID NO:64-146 and 446-455 are shown in Tables 15 and 16.
In a further embodiment, the nucleic acids of this fourth aspect encode a polypeptide that further comprises a signal peptide at its amino-terminus. Any signal peptide may be used as suitable for an intended purpose. The signal peptide may be directly linked to the polypeptide, or may be connected via an amino acid linker. In some embodiments, the signal peptide comprises the amino acid sequence selected from the group consisting of SEQ ID NO:292-309. The amino acid sequences of these exemplary signal peptides are provided in Table 7.
In a further embodiment, the nucleic acids of the fourth aspect comprise an expression vector comprising the nucleic acid operatively linked to a control sequence, such as a promoter.
DARENEMDENLEQVSGIIGNLRHMALDMGNEIDTQNRQIDRIMEKADSAKTRIDEANQRATKMLGSGGGSGG
EEEEEEEEEEGGSGGSGGSSISKQALSEIETRHSEIIKLENSIRELHDMFMDMAMLVESQGEMIDRIEYNV
EHAVDYVERAVSD
TKKAVKYQSKARRKK
IMIIICCVILGIVIASTVGGIFA
DARENEMDENLEQVSGIIGNLRHMALDMGNEIDTQNRQIDRIMEKADSAKTRIDEANQRATKMLGSGGGSGG
EQEKEKALKEKGGSGGSGGSSISKQALSEIETRHSEIIKLENSIRELHDMEMDMAMLVESQGEMIDRIEYNV
EHAVDYVERAVSD
TKKAVKYQSKARRKK
IMIIICCVILGIVIASTVGGIFA
DARENEMDENLEQVSGIIGNLRHMALDMGNEIDTQNRQIDRIMEKADSAKTRIDEANQRATKMLGSGGGSGG
YVERAVSD
TKKAVKYQSKARRKK
IMIIICCVILGIVIASTVGGIFA
DARENEMDENLEQVSGIIGNLRHMALDMGNEIDTQNRQIDRIMEKADSAKTRIDEANQRATKMLGSGGGSGG
EGGSGGSGGSSISKQALSEIETRHSEIIKLENSIRELHDMEMDMAMLVESQGEMIDRIEYNVEHAVDYVERA
VSD
TKKAVKYQSKARRKK
IMIIICCVILGIVIASTVGGIFA
DARENEMDENLEQVSGIIGNLRHMALDMGNEIDTQNRQIDRIMEKADSAKTRIDEANQRATKMLGSGGGSGG
KAVKYQSKARRKKIMIIICCVILGIVIASTVGGIFA
DARENEMDENLEQVSGIIGNLRHMALDMGNEIDTQNRQIDRIMEKADSAKTRIDEANQRATKMLGSGGGSGG
SISKQALSEIETRHSEIIKLENSIRELHDMEMDMAMLVESQGEMIDRIEYNVEHAVDYVERAVSD
TKKAVKY
QSKARRKK
IMIIICCVILGIVIASTVGGIFA
DARENEMDENLEQVSGIIGNLRHMALDMGNEIDTQNRQIDRIMEKADSAKTRIDEANQRATKMLGSGGGSGG
ALSEIETRHSEIIKLENSIRELHDMEMDMAMLVESQGEMIDRIEYNVEHAVDYVERAVSD
TKKAVKYQSKAR
RKK
IMIIICCVILGIVIASTVGGIFA
DARENEMDENLEQVSGIIGNLRHMALDMGNEIDTQNRQIDRIMEKADSAKTRIDEANQRATKMLGSGGGSGG
EEEEEEGGSGGSGGSALSEIETRHSEIIKLENSIRELHDMFMDMAMLVESQGEMIDRIEYNVEHAVDYVERA
VSD
TKKAVKYQSKARRKK
IMIIICCVILGIVIASTVGGIFA
DARENEMDENLEQVSGIIGNLRHMALDMGNEIDTQNRQIDRIMEKADSAKTRIDEANQRATKMLGSGGGSGG
EGGSGGSGGSALSEIETRHSEIIKLENSIRELHDMEMDMAMLVESQGEMIDRIEYNVEHAVDYVERAVSDTK
KAVKYQSKARRKK
IMIIICCVILGIVIASTVGGIFA
DARENEMDENLEQVSGIIGNLRHMALDMGNEIDTQNRQIDRIMEKADSAKTRIDEANQRATKMLGSGGGSGG
QSKARRKK
IMIIICCVILGIVIASTVGGIFA
DARENEMDENLEQVSGIIGNLRHMALDMGNEIDTQNRQIDRIMEKADSAKTRIDEANQRATKMLGSGGGSGG
ALSEIETRHSEIIKLENSIRELHDMEMDMAMLVESQGEMIDRIEYNVEHAVDYVERAVSD
TKKAVKYQSKAR
RKK
IMIIICCVILGIVIASTVGGIFA
DARENEMDENLEQVSGIIGNLRHMALDMGNEIDTQNRQIDRIMEKADSAKTRIDEANQRATKMLGSGGGSGG
EQEKEKALKEKGGSGGSGGSEKVKEELAKVEERHKQIQALLDKIEELYEMFKEMSEKISEQGQKIDRIEEKV
SKASEHVSKGVED
TKKAVKYQSKARRKK
IMIIICCVILGIVIASTVGGIFA
DARENEMDENLEQVSGIIGNLRHMALDMGNEIDTQNRQIDRIMEKADSAKTRIDEANQRATKMLGSGGGSGG
EQEKEKALKEKGGSGGSGGSEKVKKELAEVEKRHKQILELEEKIKELYEMEKEMSEKIEKQGQKIDRIDDKV
SEAKKHVEKAVED
TKKAVKYQSKARRKK
IMIIICCVILGIVIASTVGGIFA
DARENEMDENLEQVSGIIGNLRHMALDMGNEIDTQNRQIDRIMEKADSAKTRIDEANQRATKMLGSGGGSGG
EQEKEKALKEKGGSGGSGGSEKVKRELAQIEERHQQILELEEKIKELLEMFKELSEKIEEQGQKIDRIEDKV
SKAKEHVEKGVED
TKKAVKYQSKARRKK
IMIIICCVILGIVIASTVGGIFA
DARENEMDENLEQVSGIIGNLRHMALDMGNEIDTQNRQIDRIMEKADSAKTRIDEANQRATKMLGSGGGSGG
EQEKEKALKEKGGSGGSGGSEAVKKSLAAKEERHKQILELLEKIKELHEMFKELSEKIEKQGQKIDRIEDKV
SKASEHVSKGVED
TKKAVKYQSKARRKK
IMIIICCVILGIVIASTVGGIFA
DARENEMDENLEQVSGIIGNLRHMALDMGNEIDTQNRQIDRIMEKADSAKTRIDEANQRATKMLGSGGGSGG
EQEKEKALKEKGGSGGSGGSEAVKKELAAIEERHEQILELLKKIEELYEMEKELSEKIEKQGQKIDRIEKKV
SEASRHVSKAVED
TKKAVKYQSKARRKK
IMIIICCVILGIVIASTVGGIFA
DARENEMDENLEQVSGIIGNLRHMALDMGNEIDTQNRQIDRIMEKADSAKTRIDEANQRATKMLGSGGGSGG
EQEKEKALKEKGGSGGSGGSEKVKKDLAAIEERHQQILELEEKIKELHEMFKEMSEKISEQMQKIDRIEEKV
SKASEHVSKGVED
TKKAVKYQSKARRKK
IMIIICCVILGIVIASTVGGIFA
DARENEMDENLEQVSGIIGNLRHMALDMGNEIDTQNRQIDRIMEKADSAKTRIDEANQRATKMLGSGGGSGG
EQEKEKALKEKGGSGGSGGSEKVRRELAAIEERHRQILELLEKIEELHEMFKEMSEKISKQMEKIDRIDDRV
SEASRHVEKGVED
TKKAVKYQSKARRKK
IMIIICCVILGIVIASTVGGIFA
DARENEMDENLEQVSGIIGNLRHMALDMGNEIDTQNRQIDRIMEKADSAKTRIDEANQRATKMLGSGGGSGG
EQEKEKALKEKGGSGGSGGSKAVKEELANIENRHKQIDALYEKIKELHEMFLEMSERIEAQLQKIDRIDDKV
SKAKAHVEKGVED
TKKAVKYQSKARRKK
IMIIICCVILGIVIASTVGGIFA
NKREEEIDKGLDRVGEIISKLNEMAREMGEKIEEQNQKISEIEKKADEAIEKVEKLIKDAEKLLGSGGGSGG
EQEKEKALKEKGGSGGSGGSSISKQALSEIETRHSEIIKLENSIRELHDMEMDMAMLVESQGEMIDRIEYNV
EHAVDYVERAVSD
TKKAVKYQSKARRKK
IMIIICCVILGIVIASTVGGIFA
NKREEEIDKGLDRVGEIISKLNEMAREMGEKIEEQNQKISEIEKKADEAIEKVEKLIKDAEKLLGSGGGSGG
EQEKEKALKEKGGSGGSGGSALSEIETRHSEIIKLENSIRELHDMEMDMAMLVESQGEMIDRIEYNVEHAVD
YVERAVSD
TKKAVKYQSKARRKK
IMIIICCVILGIVIASTVGGIFA
NKREEEIDKGLDRVGEIISKLNEMAREMGEKIEEQNQKISEIEKKADEAIEKVEKLIKDAEKLLGSGGGSGG
SEELKKLEKEGEKLKELVEELDREIKELKEGMERLREMFEEAAKLSEEALEIMRRTRKLSEEELEEAKQKAK
EQEKEKALKEKGGSGGSGGSSISKQALSEIETRHSEIIKLENSIRELHDMEMDMAMLVESQGEMIDRIEYNV
EHAVDYVERAVSD
TKKAVKYQSKARRKKLLLLLLLLLLLLLLLLLLLLLFA
NKREEEIDKGLDRVGEIISKLNEMAREMGEKIEEQNQKISEIEKKADEAIEKVEKLIKDAEKLLGSGGGSGG
EQEKEKALKEKGGSGGSGGSSISKQALSEIETRHSEIIKLENSIRELHDMEMDMAMLVESQGEMIDRIEYNV
EHAVDYVERAVSD
TKKAVKYQSKARRKKMMIILGVICAIILIIIIVYFFA
NKREEEIDKGLDRVGEIISKLNEMAREMGEKIEEQNQKISEIEKKADEAIEKVEKLIKDAEKLLGSGGGSGG
EQEKEKALKEKGGSGGSGGSSISKQALSEIETRHSEIIKLENSIRELHDMEMDMAMLVESQGEMIDRIEYNV
EHAVDYVERAVSDTKKAVKYQSRRRRRRMMIILGVICAIILIIIIVYFFA
NEREKEIDEGLERVGELISKLKELAREMSEKIEEQNQKLSEIDKKAEEAIKLLEKANASAKKLLEKPGGSGG
EQEKEKALKEKGGSGGSGGSEKVKRELAQIEERHQQILELEEKIKELLEMFKELSEKIEEQGQKIDRIEDKV
SKAKEHVEKGVED
TKKAVKYQSKARRKK
IMIIICCVILGIVIASTVGGIFA
NEREKEIEEGLERVGELISELKEMAREMSEKIEEQNKKLDEISKKADEAIKLLEKANKGAEELLKKPGGSGG
EQEKEKALKEKGGSGGSGGSEKVKRELAQIEERHQQILELEEKIKELLEMFKELSEKIEEQGQKIDRIEDKV
SKAKEHVEKGVED
TKKAVKYQSKARRKK
IMIIICCVILGIVIASTVGGIFA
NEREKEIDEGLEKIGELISKLKEMAREMSEKIEEQNEKLDEIDKKADEAIKLLEEANKKAEKLLKKKGGSGG
EQEKEKALKEKGGSGGSGGSEKVKRELAQIEERHQQILELEEKIKELLEMFKELSEKIEEQGQKIDRIEDKV
SKAKEHVEKGVED
TKKAVKYQSKARRKK
IMIIICCVILGIVIASTVGGIFA
NEREKEIEEGLERIGELISKLKELAREMSEKIEEQNEKLSEISEKADEAIKLLEKANASAQKLLEKPGGSGG
EQEKEKALKEKGGSGGSGGSEKVKRELAQIEERHQQILELEEKIKELLEMFKELSEKIEEQGQKIDRIEDKV
SKAKEHVEKGVED
TKKAVKYQSKARRKK
IMIIICCVILGIVIASTVGGIFA
NPREEEIDKGLEEIGKLISELKELAREMSEKIEEQNEKISEIDEKAKEAIELLKKANEKAKELLEKEGGSGG
EQEKEKALKEKGGSGGSGGSEKVKRELAQIEERHQQILELEEKIKELLEMFKELSEKIEEQGQKIDRIEDKV
SKAKEHVEKGVED
TKKAVKYQSKARRKK
IMIIICCVILGIVIASTVGGIFA
SPREKEIDEGLERVSELVKKLKELAEKMKEMIEEQGRRIERIERKAEEAKERIEKLNEKAEKLLEDPGGSGG
EQEKEKALKEKGGSGGSGGSEKVKRELAQIEERHQQILELEEKIKELLEMFKELSEKIEEQGQKIDRIEDKV
SKAKEHVEKGVED
TKKAVKYQSKARRKK
IMIIICCVILGIVIASTVGGIFA
SEREKEIDEGLDRVSEIVKELKKMAEEMRRMIEEQGRRIERIEEKAEEAKEKIEEANERAEKLLKDPGGSGG
EQEKEKALKEKGGSGGSGGSEKVKRELAQIEERHQQILELEEKIKELLEMFKELSEKIEEQGQKIDRIEDKV
SKAKEHVEKGVED
TKKAVKYQSKARRKK
IMIIICCVILGIVIASTVGGIFA
SEREKEIDEGLEKVSEIVKELKEMAEEMREMIERQGEQIERIEKKAEEAKKKIEEQNERAERLLKDPGGSGG
EQEKEKALKEKGGSGGSGGSEKVKRELAQIEERHQQILELEEKIKELLEMFKELSEKIEEQGQKIDRIEDKV
SKAKEHVEKGVED
TKKAVKYQSKARRKK
IMIIICCVILGIVIASTVGGIFA
SEREKEIEEGLERVSEIVRRLKELAEEMRRMIEEQGRRIDRIEEKADKAKEEIEKQNEKLEKLLKDPGGSGG
EQEKEKALKEKGGSGGSGGSEKVKRELAQIEERHQQILELEEKIKELLEMFKELSEKIEEQGQKIDRIEDKV
SKAKEHVEKGVED
TKKAVKYQSKARRKK
IMIIICCVILGIVIASTVGGIFA
SEREKEIDEGLEKVSEIVKELKELAKEMKEMIEEQGRRIDRIERKAEETKKKIEELNEQAERLLKDPGGSGG
EQEKEKALKEKGGSGGSGGSEKVKRELAQIEERHQQILELEEKIKELLEMFKELSEKIEEQGQKIDRIEDKV
SKAKEHVEKGVED
TKKAVKYQSKARRKK
IMIIICCVILGIVIASTVGGIFA
SEREEEIDKGLERVSEIVKKLKELAEKMKEEIERQGEQIDRIEKKADETIKEIERLNESADRLLKSPGGSGG
EQEKEKALKEKGGSGGSGGSEKVKRELAQIEERHQQILELEEKIKELLEMFKELSEKIEEQGQKIDRIEDKV
SKAKEHVEKGVED
TKKAVKYQSKARRKK
IMIIICCVILGIVIASTVGGIFA
SEREKEIDEGLDRVSEIVKELKKMAEEMRRMIEEQGRRIERIEEKAEEAKEKIEEANERAEKLLKDPGGSGG
EQEKEKALKEKGGSGGSGGSEKVKRELAQIEERHQQILELEEKIKELLEMFKELSEKIEEQGQKIDRIEDKV
SKAKEHVEKGVEDTKKAVKYQSRRRRRRIMIIICCVILGIVIASTVGGIFA
SEREKEIDEGLDRVSEIVKELKKMAEEMRRMIEEQGRRIERIEEKAEEAKEKIEEANERAEKLLKDPGGSGG
EQEKEKALKEKGGSGGSGGSEKVKRELAQIEERHQQILELEEKIKELLEMFKELSEKIEEQGQKIDRIEDKV
SKAKEHVEKGVED
TKKAVKYQSKKKKKK
IMIIICCVILGIVIASTVGGIFA
SEREKEIDEGLDRVSEIVKELKKMAEEMRRMIEEQGRRIERIEEKAEEAKEKIEEANERAEKLLKDPGGSGG
EQEKEKALKEKGGSGGSGGSEKVKRELAQIEERHQQILELEEKIKELLEMFKELSEKIEEQGQKIDRIEDKV
SKAKEHVEKGVED
TKKAVKYQSKARRKK
AAVLVLLVIVIISLIVLVVIW
SEREKEIDEGLDRVSEIVKELKKMAEEMRRMIEEQGRRIERIEEKAEEAKEKIEEANERAEKLLKDPGGSGG
SEELKKLEKEGEKLKELVEELDREIKELKEGMERLREMFEEAAKLSEEALEIMRRTRKLSEEELEEAKQKAK
EQEKEKALKEKGGSGGSGGSEKVKRELAQIEERHQQILELEEKIKELLEMFKELSEKIEEQGQKIDRIEDKV
SKAKEHVEKGVED
TKKAVKYQSKARRKK
MMVVVVVVVVVVVVVVVVYF
SEREKEIDEGLDRVSEIVKELKKMAEEMRRMIEEQGRRIERIEEKAEEAKEKIEEANERAEKLLKDPGGSGG
EQEKEKALKEKGGSGGSGGSEKVKRELAQIEERHQQILELEEKIKELLEMFKELSEKIEEQGQKIDRIEDKV
SKAKEHVEKGVED
TKKAVKYQSKARRKK
MMIIIIIIIIIIIIIIIIYF
SEREKEIDEGLDRVSEIVKELKKMAEEMRRMIEEQGRRIERIEEKAEEAKEKIEEANERAEKLLKDPGGSGG
SEELKKLEKEGEKLKELVEELDREIKELKEGMERLREMFEEAAKLSEEALEIMRRTRKLSEEELEEAKQKAK
EQEKEKALKEKGGSGGSGGSEKVKRELAQIEERHQQILELEEKIKELLEMFKELSEKIEEQGQKIDRIEDKV
SKAKEHVEKGVED
TKKAVKYQSRRRRRR
AAVLVLLVIVIISLIVLVVIW
SEREKEIDEGLDRVSEIVKELKKMAEEMRRMIEEQGRRIERIEEKAEEAKEKIEEANERAEKLLKDPGGSGG
EQEKEKALKEKGGSGGSGGSEKVKRELAQIEERHQQILELEEKIKELLEMFKELSEKIEEQGQKIDRIEDKV
SKAKEHVEKGVED
TKKAVKYQSRRRRRR
MMVVVVVVVVVVVVVVVVYF
SEREKEIDEGLDRVSEIVKELKKMAEEMRRMIEEQGRRIERIEEKAEEAKEKIEEANERAEKLLKDPGGSGG
EQEKEKALKEKGGSGGSGGSEKVKRELAQIEERHQQILELEEKIKELLEMFKELSEKIEEQGQKIDRIEDKV
SKAKEHVEKGVED
TKKAVKYQSKKKKKK
AAVLVLLVIVIISLIVLVVIW
SEREKEIDEGLDRVSEIVKELKKMAEEMRRMIEEQGRRIERIEEKAEEAKEKIEEANERAEKLLKDPGGSGG
EQEKEKALKEKGGSGGSGGSEKVKRELAQIEERHQQILELEEKIKELLEMFKELSEKIEEQGQKIDRIEDKV
SKAKEHVEKGVED
TKKAVKYQSKKKKKK
MMVVVVVVVVVVVVVVVVYF
DARENEMDENLEQVSGIIGNLRHMALDMGNEIDTQNRQIDRIMEKADSAKTRIDEANQRATKMLGSGGGSGG
EQEKEKALKEKGGSGGSGGSEKVKESLANKEERHKQILELEEKIKELYEMFKELSEKIEEQLKKIDRIEEKV
SEASRHVSKGVES
LKEAVEYDEKARRKK
IMIIICCVILGIVIASTVGGIFA
DARENEMDENLEQVSGIIGNLRHMALDMGNEIDTQNRQIDRIMEKADSAKTRIDEANQRATKMLGSGGGSGG
EQEKEKALKEKGGSGGSGGSEKTRRTLAEIEERHRQILELEEKIEELYEMEKELSEKISEQGQKISRIEDKV
SKASEHVSKGVEN
LKKAVEYDEKARRKK
IMIIICCVILGIVIASTVGGIFA
DARENEMDENLEQVSGIIGNLRHMALDMGNEIDTQNRQIDRIMEKADSAKTRIDEANQRATKMLGSGGGSGG
EQEKEKALKEKGGSGGSGGSEKVKKSLAEIEKRHEQILQLEKQIEELHEMEKELSEKISKQGQKIDRIEEKV
EEAKRHVEKAVKD
LKEAVEYEEKARRKK
IMIIICCVILGIVIASTVGGIFA
DARENEMDENLEQVSGIIGNLRHMALDMGNEIDTQNRQIDRIMEKADSAKTRIDEANQRATKMLGSGGGSGG
EQEKEKALKEKGGSGGSGGSKKVKEELARIEARHQQILALEEKIRELYEMFKELSEKIEEQGKKIDRIEDKV
SKASEHVSKGVEN
LKEAVEYDEKARRKK
IMIIICCVILGIVIASTVGGIFA
DARENEMDENLEQVSGIIGNLRHMALDMGNEIDTQNRQIDRIMEKADSAKTRIDEANQRATKMLGSGGGSGG
EQEKEKALKEKGGSGGSGGSKKVKEELKEKEKRHRQIEELLKKIEELHEMFEELSERISEQGQKIDRIDDKV
SKASEHVSKGVED
LKEAVEYEEKARRKK
IMIIICCVILGIVIASTVGGIFA
DARENEMDENLEQVSGIIGNLRHMALDMGNEIDTQNRQIDRIMEKADSAKTRIDEANQRATKMLGSGGGSGG
EQEKEKALKEKGGSGGSGGSEEVKKSLAEIEKRHEQILALEKKIEELYEMEKELGEKIEKQLQKISRIEEKV
SEASRHVSKGVED
LKEAVKYREESEKME
IMIIICCVILGIVIASTVGGIFA
DARENEMDENLEQVSGIIGNLRHMALDMGNEIDTQNRQIDRIMEKADSAKTRIDEANQRATKMLGSGGGSGG
EQEKEKALKEKGGSGGSGGSEKVKKELAEIEARHQQIEALLEQIKELYEMFKELSEKIEEQGQKISRIEDKV
SKASEHVSKGVEQ
LKEAVKYNEEGKKME
IMIIICCVILGIVIASTVGGIFA
DARENEMDENLEQVSGIIGNLRHMALDMGNEIDTQNRQIDRIMEKADSAKTRIDEANQRATKMLGSGGGSGG
EQEKEKALKEKGGSGGSGGSEKVKKELAEKEKRHKQIDELLEKIKELYEMFKEMGEKIEKQGEKIDRIEKKV
SEASKHVSKAVED
LKEAVEYREKSEKME
IMIIICCVILGIVIASTVGGIFA
DARENEMDENLEQVSGIIGNLRHMALDMGNEIDTQNRQIDRIMEKADSAKTRIDEANQRATKMLGSGGGSGG
EQEKEKALKEKGGSGGSGGSEAVKKELAAIEARHKQIDALLEKIKELHEMFEEMSKKIEEQMQKISRIEDKV
SEASRHVSKAVSD
LKEAVKYKEESEKME
IMIIICCVILGIVIASTVGGIFA
DARENEMDENLEQVSGIIGNLRHMALDMGNEIDTQNRQIDRIMEKADSAKTRIDEANQRATKMLGSGGGSGG
EQEKEKALKEKGGSGGSGGSEKVKKELAEIEKRHKQILELEEKIKELHEMFKELGEKIEKQGQKISRIDDKV
SEAKRHVEKGVED
LKKAVEYKEKSEKKE
IMIIICCVILGIVIASTVGGIFA
DARENEMDENLEQVSGIIGNLRHMALDMGNEIDTQNRQIDRIMEKADSAKTRIDEANQRATKMLGSGGGSGG
EQEKEKALKEKGGSGGSGGSKKVKEELKKIEERHKQILELEEKIEELYEMEKELAERIEKQGEKIDRIDEKV
SEAKRNVEKAVED
TKKAVKYQSESEKME
IMIIICCVILGIVIASTVGGIFA
DARENEMDENLEQVSGIIGNLRHMALDMGNEIDTQNRQIDRIMEKADSAKTRIDEANQRATKMLGSGGGSGG
EQEKEKALKEKGGSGGSGGSEKVKKELEEKEKRHKQILELEEKIKELYEMEKELSEKIEEQLQKIDRIDDKV
SEASRHVSKGVED
TKKAVKYQSEAEKME
IMIIICCVILGIVIASTVGGIFA
DARENEMDENLEQVSGIIGNLRHMALDMGNEIDTQNRQIDRIMEKADSAKTRIDEANQRATKMLGSGGGSGG
EQEKEKALKEKGGSGGSGGSEEVRRSLEEIERRHRQILELEEKIEELYEMFKEMSEKIEEQGQKISRIEEKV
SKASEHVSKAVED
TKKAVKYQSEAEKKE
IMIIICCVILGIVIASTVGGIFA
DARENEMDENLEQVSGIIGNLRHMALDMGNEIDTQNRQIDRIMEKADSAKTRIDEANQRATKMLGSGGGSGG
EQEKEKALKEKGGSGGSGGSEEVKKELAEIEARHEQIKELEKQIEELHEMFKELGEKIEKQGEKIDRIDEKV
SEASRHVSKAVED
TKKAVKYQSESEKME
IMIIICCVILGIVIASTVGGIFA
DARENEMDENLEQVSGIIGNLRHMALDMGNEIDTQNRQIDRIMEKADSAKTRIDEANQRATKMLGSGGGSGG
EQEKEKALKEKGGSGGSGGSEKVKEELAKKEARHEQILELTEKIEELSEMFKELSEKISEQGQKIDRIEDKV
SKASEHVSKAVED
TKKAVKYQSESEKME
IMIIICCVILGIVIASTVGGIFA
SELEKKIDELLEEISRLVRELKEIAKELRRLTERQGRQVERIEREVEEAEREIEELNKEAEELLEKEDSEDE
PYTPEELEKVKERHELIKKLKEEIKELKEMFEELRELVRRQGERLDRIEEKVRRAVEHVKKAEEN
IGEAVKY
LEKSKELE
IMIIICCVILGIVIASTVGGIFA
SEKEKEIDELLDKVSEIVKELKKLAEELKRRTERQGRQIEEIERKTEEAKRKIEELNKKAEELLKKEDDDSD
PYTPEELEKVRERHELIKKLLEEIEELEEMFEELERLVEEQGRRLERIEEKVSRAVRHVERAEEN
LGEAVEY
LEKSKKLE
IMIIICCVILGIVIASTVGGIFA
SSKEEEIEELLDEVSEIVRRLKEMAREIREMVERQGRQIERIERKVEEAKRKIEELNKKAEELLEKEDDESE
LEELEKLVEEAKEQLREVEEINREVEELGREQERLLRKTREAAKLAEKAEELMKKMLELSEEILEEMKEKPK
EYTPEELEEVEERHKLIQKLLEEIKELKEMFEELERLVEEQGRRLERIEEKVRRAVEHVKRALEN
LKEAKEY
RKKNEELE
IMIIICCVILGIVIASTVGGIFA
SELEKKIDELLEEISRLVRELKEIAKELRRLTERQGRQVERIEREVEEAEREIEELNKEAEELLEKEDSEDE
PYTPEELEKVKERHELIKKLKEEIKELKEMFEELRELVRRQGERLDRIEEKVRRAVEHVKKAEEN
IGEAVKY
SEKEKEIDELLDKVSEIVKELKKLAEELKRRTERQGRQIEEIERKTEEAKRKIEELNKKAEELLKKEDDDSD
PYTPEELEKVRERHELIKKLLEEIEELEEMFEELERLVEEQGRRLERIEEKVSRAVRHVERAEEN
LGEAVEY
SETEKKENEALEELERLLEEAKKLLEEQRRLLEAQGEVQKEQEKLEDELEEIQEEAEKYQNKLLESKDEEDE
EAEAAEKAGKAAGVSLKEEVEKVKRELAQIEERHQQILELEEKIKELLEMFKELSEKIEEQGQKIDRIEDKV
SKAKEHVEKGVED
TKKAVKYQSKARRKK
IMIIICCVILGIVIASTVGGIFA
SSLEKKIDENLEKALELLEELKEKLEEMRRLLEESGRLQDELEELMDETQKQQEELEKLLEKLLKMDDSDEQ
EAKKELEEAKKKGEEGKEKLEKVKRELAQIEERHQQILELEEKIKELLEMFKELSEKIEEQGQKIDRIEDKV
SKAKEHVEKGVED
TKKAVKYQSKARRKK
IMIIICCVILGIVIASTVGGIFA
SEEKKKMEELLDKIKELLEELKKLAEEIKKLLEEQGRQLEKLEEEADKALRQAEEAIRLQEKALELEDDEEI
EKLKEYEELKKLGILPKEELEKVKRELAQIEERHQQILELEEKIKELLEMFKELSEKIEEQGQKIDRIEDKV
SKAKEHVEKGVED
TKKAVKYQSKARRKK
IMIIICCVILGIVIASTVGGIFA
EEFEKKENELLEELKKKLEEAKKLLKENRRLLEEQGRQLEEIEEKMEEAEELQEKALEYQEKAEKAGFSDES
KAEEKAKEAAKKGVDLSEELEKVKRELAQIEERHQQILELEEKIKELLEMFKELSEKIEEQGQKIDRIEDKV
SKAKEHVEKGVED
TKKAVKYQSKARRKK
IMIIICCVILGIVIASTVGGIFA
SEEEEKFNELLEKIEEELEEIKELAEELREKLEELGRLTEKALELADELEKLFEEAEKLLEEALKLGDGEEL
EAEKKAEELKKLGIDPTEELEKVKRELAQIEERHQQILELEEKIKELLEMFKELSEKIEEQGQKIDRIEDKV
SKAKEHVEKGVED
TKKAVKYQSKARRKK
IMIIICCVILGIVIASTVGGIFA
SSLKEKFNELLDELLKELEEAKELLEEIREQLERIGEQLEELEEQFDEILKEQEELEKQQKKLLESPGSEEE
EAKKKYEEKKKKGINGTEELEKVKRELAQIEERHQQILELEEKIKELLEMFKELSEKIEEQGQKIDRIEDKV
SKAKEHVEKGVED
TKKAVKYQSKARRKK
IMIIICCVILGIVIASTVGGIFA
KESEEKFEEKLKELEKLLEEAKELLEKQREYLEESGRLLDEAEKLMDETERIFEETLKLQDKLLAAKDEEDQ
EAEEKEKEGEKKGVSYKEEVEKVKRELAQIEERHQQILELEEKIKELLEMFKELSEKIEEQGQKIDRIEDKV
SKAKEHVEKGVED
TKKAVKYQSKARRKK
IMIIICCVILGIVIASTVGGIFA
SKLKEEFEKYLEELLKQLEKLKEKLKELREKLEEQGKQLEKLEEQFDRILEQQEKLLEQQEKLLEDEGSEEE
EAEKKYEELKKKGINGTEELEKVKRELAQIEERHQQILELEEKIKELLEMFKELSEKIEEQGQKIDRIEDKV
SKAKEHVEKGVED
TKKAVKYQSKARRKK
IMIIICCVILGIVIASTVGGIFA
SEREKEFNERLEEMEKLLEKIKKLAEEIRRLLEKQGELLDKLEELADEALRLQEKAIEKSEKILEKGYNEET
ELEKRKEEYEKLGIDLSEELEKVKRELAQIEERHQQILELEEKIKELLEMFKELSEKIEEQGQKIDRIEDKV
SKAKEHVEKGVED
TKKAVKYQSKARRKK
IMIIICCVILGIVIASTVGGIFA
SSFEEEINKLLEELKRKLEELKEILEEIRKLLEEQGRQLDEIEEKMDEAEELAEKAEEYLKKAEEAGGGEES
KAEAAAAEAAKLGIDKSEELEKVKRELAQIEERHQQILELEEKIKELLEMFKELSEKIEEQGQKIDRIEDKV
SKAKEHVEKGVED
TKKAVKYQSKARRKK
IMIIICCVILGIVIASTVGGIFA
SEEEEKFNELLEKIEEKLEEAKELAEELREELEKIGELTDEAERLADEALKLAEEAEKLLKEALKLGDEDEL
KAEEKYKEDLKKGIDNTEEIEKVKRELAQIEERHQQILELEEKIKELLEMFKELSEKIEEQGQKIDRIEDKV
ESFEKELEELLEKIQELMEKIKELAEKLREALEESGRLLEEIEEAVDKLEEKFEEIEKLQENAEKYEDTEEA
EAEKKLKEAEEYNEELEKEVEKVKRELAQIEERHQQILELEEKIKELLEMFKELSEKIEEQGQKIDRIEDKV
SKAKEHVEKGVED
TKKAVKYQSKARRKK
IMIIICCVILGIVIASTVGGIFA
SELEEKFNKYLEELLETLEKLKEALEKIREKLEEQGKQLDKIEEAFDELLKQQEELLKQQEELLADPGSEES
EAEKEYKEDKKKGINNKEKLEKVKRELAQIEERHQQILELEEKIKELLEMFKELSEKIEEQGQKIDRIEDKV
SKAKEHVEKGVED
TKKAVKYQSKARRKK
IMIIICCVILGIVIASTVGGIFA
SSLEKKIDENLEKALELLEELKEKLEEMRRLLEESGRLQDELEELMDETQKQQEELEKLLEKLLKMDDSDEQ
EAKKELEEAKKKGEEGKEKLEKVKRELAQIEERHQQILELEEKIKELLEMFKELSEKIEEQGQKIDRIEDKV
SKAKEHVEKGVED
TKKAVKYQSKARRKK
AAVLVLLVIVIISLIVLVVIW
SSLEKKIDENLEKALELLEELKEKLEEMRRLLEESGRLQDELEELMDETQKQQEELEKLLEKLLKMDDSDEQ
EAKKELEEAKKKGEEGKEKLEKVKRELAQIEERHQQILELEEKIKELLEMFKELSEKIEEQGQKIDRIEDKV
SKAKEHVEKGVED
TKKAVKYQSKARRKK
MMVVVVVVVVVVVVVVVVYF
GELKEKKEKLSKEFEKLLKESKRLAEELKEKLEELGRALDEAEELADEVERQQEELEKLQEEILKSEENEDE
ATKAEAEAAKAKGEDISDKLEAAEKEYKSVKEELKLVEEIKKKVEEIKEMLEEMKERIEEMEEKVKRIEEKL
KRIEESLKRVEEN
LKEIKELAKKREEKG
IMIIICCVILGIVIASTVGGIFG
DELEKKIKELEEKSEEELKEAKELAEELRRLLEELERALDEAERLADEVERKQEELEKLMEEMLKSEDNESD
ETKKKAEEMKKKGIDISEELEKAKEELESVKKNLELVKKILEEVKEIKEELEEMGEEIERMEEKVDRIEEKL
ERVEESLERVSKN
LEEIERLFEERKEKG
IMIIICCVILGIVIASTVGGIFG
SEEDKKMEELLEEALKLLEELKELLEKNRELLEELGRQQEELEKLQDEAERLQEELEEAFKKMEENEESEEG
RAEAARAEYEAAGSPEVERAEQVLEEYREAKEFYEKVEELLREVKEIKEEIKEMEERIKEIGERIKRIEEKI
ERVEKLLERTEKN
IKEIKELRDKIEKNG
IMIIICCVILGIVIASTVGGIFG
SEKEKEFNELLEEALRELEKLKELLEENGRLLERTGEQLERMEELMDEAEEKQEELEEAIKKMEKYEDSEEG
KAEKAKAEYEAAGKDEVKECEKVKEKYEEAKKRYEQVEKLLKEVEEIKEEIERMGEEIKRQGERIERIEEKI
ERVEEELERLEEN
LEEIKKLREKIKENG
IMIIICCVILGIVIASTVGGIFN
KPGEEKLNKLLEELLKKLEELKKLAEENRRLLERQGRQLEELERRFEELNRRMEELNEKLEKLLKEEPNEE
VEYLEKSKKLE
IMIIICCVILGIVIASTVGGIFA SEQ ID NO: 446
EPKEKELEELLEELLRELEEIKKLLEEFRRLQEEIGRQIEEIERQLEELLERLEELNEKLENLLKREDNEN
VEYLEKSKKLE
IMIIICCVILGIVIASTVGGIFA SEQ ID NO: 447
RPEEEKLNELLDELLRLLEEIKKLLEENRALLEEIGRQIDRIEEQLDRLLRELKELNEKLEALLKREDNEN
VEYLEKSKKLE
IMIIICCVILGIVIASTVGGIFA SEQ ID NO: 448
SLEEILEKLKEIAELLEEVEELTEELKEETERAGRELEELERRLEELVRRAEELNRKLEKILEEEDSDDIL
ERLKEARRELRELRERLEEVEREIERLIREAEEQSELLEELERELEEIKELLKELLEKEEELSEEELELIK
LGIVIASTVGGILA SEQ ID NO: 449
QNRQIDRIMEKADSAKTRIDEANQRATKMLGSGGGSGGSEELKKLEKEGEKLKELVEELDREIKELKE
GMERLREMFEEAAKLSEEALEIMRRTRKLSEEELEEAKQKAKEQEKEKALKEKGGSGGSGGSEKVKRE
LAQIEERHQQILELEEKIKELLEMFKELSEKIEEQGQKIDRIEDKVSKAKEHVEKGVED
TKKAVKYQS
KARRKK
IMIIICCVILGIVIASTVGGIFA SEQ ID NO: 450
MALDMGNEIDTQNRQIDRIMEKADSAKTRIDEANQRATKMLGSGGGSGGSEELKKLEKEGEKLKELVE
ELDREIKELKEGMERLREMFEEAAKLSEEALEIMRRTRKLSEEELEEAKQKAKEQEKEKALKEKGGSG
ED
TKKAVKYQSKARRKK
IMIIICCVILGIVIASTVGGIFA SEQ ID NO: 451
KSGNVKDLTQAWDLYYHVERRISGGSGGSGGSGGSSEREKEIDEGLDRVSEIVKELKKMAEEMRRMIEE
QGRRIERIEEKAEEAKEKIEEANERAEKLLKDPGGSGGSEELKKLEKEGEKLKELVEELDREIKELKE
GMERLREMFEEAAKLSEEALEIMRRTRKLSEEELEEAKQKAKEQEKEKALKEKGGSGGSGGSEKVKRE
LAQIEERHQQILELEEKIKELLEMFKELSEKIEEQGQKIDRIEDKVSKAKEHVEKGVED
TKKAVKYQS
KARRKK
IMIIICCVILGIVIASTVGGIFA SEQ ID NO: 452
MAEEMRRMIEEQGRRIERIEEKAEEAKEKIEEANERAEKLLKDPGGSGGSEELKKLEKEGEKLKELVE
ELDREIKELKEGMERLREMFEEAAKLSEEALEIMRRTRKLSEEELEEAKQKAKEQEKEKALKEKGGSG
ED
TKKAVKYQSKARRKK
IMIIICCVILGIVIASTVGGIFA SEQ ID NO: 453
RIEEKAEEAKEKIEEANERAEKLLKDPGGSGGSEELKKLEKEGEKLKELVEELDREIKELKEGMERLR
EMFEEAAKLSEEALEIMRRTRKLSEEELEEAKQKAKEQEKEKALKEKGGSGGSGGSEKVKRELAQIEE
RHQQILELEEKIKELLEMFKELSEKIEEQGQKIDRIEDKVSKAKEHVEKGVED
TKKAVKYQSKARRKK
IMIIICCVILGIVIASTVGGIFA SEQ ID NO: 454
EIERKTEEAKRKIEELNKKAEELLKKEDDDSDLEKTKELLKEAKEQLREVKEIKRRVEELKREQEETL
KLTKEAAELAEEAKELMEEMLELSEEILEEMLENPKPYTPEELEKVRERHELIKKLLEEIEELEEMFE
ELERLVEEQGRRLERIEEKVSRAVRHVERAEEN
LGEAVEYLEKSKKLEI
MIIICCVILGIVIASTVGG
IFA SEQ ID NO: 455
The nucleic acids of all aspects of the disclosure may comprise single-stranded or double-stranded RNA or DNA in genomic or cDNA form, or DNA-RNA hybrids, each of which may include chemically or biochemically modified, non-natural, or derivatized nucleotide bases. Such nucleic acid sequences may comprise additional sequences useful for promoting expression and/or purification of the encoded peptide or chimeric molecular construct, including but not limited to polyA sequences, modified Kozak sequences, and sequences encoding epitope tags, export signals, secretory signals, nuclear localization signals, and plasma membrane localization signals. It will be apparent to those of skill in the art, based on the teachings herein, what nucleic acid sequences will encode the polypeptide or fusion protein of the disclosure.
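As a hedged illustration of how a coding sequence can be derived from a polypeptide sequence, the following Python sketch back-translates a peptide into one of many possible DNA sequences. The codon table picks a single arbitrary codon per residue from the standard genetic code; it is an assumption made for illustration, not a codon-optimized table and not a sequence taken from this disclosure. The function names `back_translate` and `translate` are hypothetical. The peptide used is the Syn1A JMD recited elsewhere in this disclosure.

```python
# One arbitrarily chosen codon per amino acid (standard genetic code).
# Assumption for illustration only: this is not a codon-optimized table.
CODON_FOR = {
    "A": "GCC", "R": "CGC", "N": "AAC", "D": "GAC", "C": "TGC",
    "Q": "CAG", "E": "GAG", "G": "GGC", "H": "CAC", "I": "ATC",
    "L": "CTG", "K": "AAG", "M": "ATG", "F": "TTC", "P": "CCC",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAC", "V": "GTG",
}
STOP_CODON = "TGA"


def back_translate(protein: str, add_start: bool = True) -> str:
    """Return one DNA sequence that encodes `protein`, ending in a stop codon."""
    unknown = sorted(set(protein) - set(CODON_FOR))
    if unknown:
        raise ValueError(f"non-standard residue letters: {unknown}")
    codons = [CODON_FOR[aa] for aa in protein]
    # Prepend a start codon unless the peptide already begins with Met.
    if add_start and not protein.startswith("M"):
        codons.insert(0, "ATG")
    codons.append(STOP_CODON)
    return "".join(codons)


def translate(dna: str) -> str:
    """Translate DNA produced by back_translate, stopping at the stop codon."""
    aa_for = {codon: aa for aa, codon in CODON_FOR.items()}
    residues = []
    for i in range(0, len(dna), 3):
        codon = dna[i:i + 3]
        if codon == STOP_CODON:
            break
        residues.append(aa_for[codon])
    return "".join(residues)


SYN1A_JMD = "TKKAVKYQSKARRKK"  # JMD sequence recited in this disclosure
dna = back_translate(SYN1A_JMD)
# Round trip: the DNA encodes the peptide with an added N-terminal Met.
assert translate(dna) == "M" + SYN1A_JMD
```

Because the genetic code is degenerate, many other DNA sequences encode the same peptide; the sketch returns only one of them.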
The expression vectors of all aspects of the disclosure comprise the nucleic acid of any aspect of the disclosure operatively linked to a suitable control sequence, such as a promoter. “Expression vector” includes vectors that operatively link a nucleic acid coding region or gene to any control sequences capable of effecting expression of the gene product. “Control sequences” operably linked to the nucleic acid sequences of the disclosure are nucleic acid sequences capable of effecting the expression of the nucleic acid molecules. The control sequences need not be contiguous with the nucleic acid sequences, so long as they function to direct the expression thereof. Thus, for example, intervening untranslated yet transcribed sequences can be present between a promoter sequence and the nucleic acid sequences and the promoter sequence can still be considered “operably linked” to the coding sequence. Other such control sequences include, but are not limited to, polyadenylation signals, termination signals, and ribosome binding sites. Such expression vectors can be of any type, including but not limited to plasmid and viral-based expression vectors. The control sequence used to drive expression of the disclosed nucleic acid sequences in a mammalian system may be constitutive (driven by any of a variety of promoters, including but not limited to, CMV, SV40, RSV, actin, EF) or inducible (driven by any of a number of inducible promoters including, but not limited to, tetracycline, ecdysone, steroid-responsive). The expression vector must be replicable in the host organisms either as an episome or by integration into host chromosomal DNA. In various embodiments, the expression vector may comprise a plasmid, viral-based vector, or any other suitable expression vector.
In a fifth aspect, the disclosure provides polypeptides or fusion proteins encoded by the nucleic acid of any embodiment herein.
In a sixth aspect, the disclosure provides host cells comprising the nucleic acid, expression vector, polypeptide, and/or fusion protein of any embodiment or combination of embodiments herein. In one embodiment, the host cell comprises a membrane fusion protein complex anchored in a lipid bilayer membrane of the cell, wherein the membrane fusion protein complex comprises the following components:
As described in the examples, the disclosure provides a series of membrane fusion proteins that can induce cell-cell fusion when expressed on the surface of mammalian cells, or liposome fusion when displayed on the surface of liposomes. The designed proteins are based on the human neuronal SNARE complex (which is composed of three proteins, VAMP2, Syntaxin 1A or Syn1A, and SNAP25), which has a parallel four-helical bundle structure and transmembrane domains at the C-terminus of VAMP2 and Syn1A (see
In another embodiment, the host cell comprises a membrane fusion protein complex anchored in a lipid bilayer membrane of the cell, wherein the membrane fusion protein complex comprises the following components:
As further described in the examples, new sequences were generated that are believed to fold into the four-helix bundle structure like the parental SNARE complex, followed by engineering SNAP25 so that one of the two coiled-coil domains of SNAP25 is an anti-parallel coiled-coil (
In a seventh aspect, the disclosure provides vesicles comprising one or more polypeptides or fusion proteins of any embodiment herein incorporated into the lipid envelope of the vesicle. The vesicle may be any vesicle that comprises a lipid envelope. In various non-limiting embodiments, the vesicle comprises a liposome, a lipid nanoparticle, a viral vector, or an enveloped particle that may optionally comprise any suitable cargo, including but not limited to a protein or nucleic acid cargo. In some embodiments, one or more polypeptides or fusion proteins of any embodiment herein are anchored on a surface of the liposome, the lipid nanoparticle, the viral vector, or the enveloped particle.
All embodiments of the host cells and vesicles disclosed herein may further comprise a therapeutic or diagnostic moiety loaded in the host cell or vesicle. The host cells and vesicles may be used, for example, for intracellular delivery of such therapeutic or diagnostic moieties. Any therapeutic or diagnostic moiety may be loaded into the host cell or vesicle as appropriate for an intended use. In non-limiting embodiments, the therapeutic or diagnostic moiety may comprise a protein or nucleic acid therapeutic or diagnostic moiety.
In an eighth aspect, the disclosure provides kits, comprising
In one embodiment, the first host cell comprises a polypeptide encoded by the nucleic acid of any embodiment of the first aspect of the disclosure (VAMP2 redesign/v-SNARE-like) anchored in a lipid bilayer membrane of the cell or vesicle. In another embodiment, the second host cell comprises a polypeptide encoded by the nucleic acid of any embodiment of the third aspect of the disclosure (Syn1A) anchored in a lipid bilayer membrane of the cell or vesicle.
In a ninth aspect, the disclosure provides kits comprising
In one embodiment, the first host cell or vesicle comprises a polypeptide encoded by the nucleic acid of any embodiment of the first aspect of the disclosure (VAMP2 redesign/v-SNARE-like) anchored in a lipid bilayer membrane of the cell or vesicle. In another embodiment, the second host cell or vesicle comprises a polypeptide encoded by the nucleic acid of any embodiment of the fourth aspect of the disclosure anchored in a lipid bilayer membrane of the cell or vesicle.
In all embodiments of the kits of the disclosure, the first host cell or vesicle and/or the second host cell or vesicle may further comprise a therapeutic or diagnostic moiety loaded in the cell or vesicle, as described herein.
In a tenth aspect, the disclosure provides methods for inducing membrane fusion, comprising mixing:
In another embodiment, the methods for inducing membrane fusion comprise mixing:
In one embodiment, the methods comprise first delivering the nucleic acids or expression vectors of the disclosure into target cells in vivo by a conventional delivery system (viral vector, etc.) so that the target cell becomes a first host cell as recited above, and then delivering a therapeutic or other moiety to the target cell by using a vesicle that is a second vesicle as described above. In another embodiment, the methods comprise first delivering the nucleic acids of the disclosure into target cells in vivo by a conventional delivery system (viral vector, etc.) so that the target cell becomes a second host cell as recited above, and then delivering a therapeutic or other moiety to the target cell by using a vesicle that is a first vesicle as described above.
The host cells of the disclosure may comprise the polypeptide, fusion protein, nucleic acid, and/or expression vector (i.e., episomal or chromosomally integrated) disclosed herein, wherein the host cells can be either prokaryotic or eukaryotic. The cells can be transiently or stably engineered to incorporate the expression vector of the disclosure, using techniques including but not limited to bacterial transformation, calcium phosphate co-precipitation, electroporation, or liposome-mediated, DEAE dextran-mediated, polycation-mediated, or virus-mediated transfection. In some embodiments, the cells are eukaryotic cells comprising lipid bilayers, such as mammalian cells including but not limited to human cells.
We made a series of membrane fusion proteins (see Tables above) that can induce cell-cell fusion when expressed on the surface of mammalian cells, or liposome fusion when displayed on the surface of liposomes. The human neuronal SNARE complex (which is composed of three proteins, VAMP2, Syntaxin 1A or Syn1A, and SNAP25) has a parallel four-helical bundle structure and transmembrane domains at the C-terminus of VAMP2 and Syn1A (see
We first redesigned the amino acid sequence of the human neuronal SNARE and generated new sequences that are likely to fold into the four-helix bundle structure like the parental SNARE complex. Next, we engineered SNAP25 so that one of the two coiled-coil domains of SNAP25 is an anti-parallel coiled-coil (
Furthermore, we have generated new protein backbones based on the structure of sc-t-SP and made new sequences for these backbones. These sequences are likely to fold into SNARE complex-like structures, but their predicted structures are slightly different from the original neuronal SNARE (single-digit RMSD). These new proteins also showed significantly higher fusion activity compared to native neuronal SNARE.
Our studies demonstrated that the juxtamembrane domain (JMD) of native VAMP2 and v-SP is important for activity, but that various non-native JMD sequences (K9, KIF, and RIF) still supported substantial fusion activity. Our studies further demonstrated that the transmembrane domain (TMD) of native VAMP2, v-SPs, native t-SNAREs, and sc-t-SPs can be replaced with non-native sequences, including TMDs derived from VSV-G, flu HA, EGFR, PDGFR, and non-cognate SNAREs (such as a VAMP2 protein with the Syn1A TMD, or vice versa). Finally, while the Syn1A JMD (TKKAVKYQSKARRKK; SEQ ID NO: 331) is critical for fusion activity in the native three-component fusion machinery, in our designs this JMD sequence is not essential for activity and can be replaced with non-native sequences, as shown in various sc-t-SP designs disclosed herein.
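The modular domain replacement described above can be sketched in Python as simple string assembly of a core, a JMD, and a TMD. This is an illustrative sketch only: the function names `find_nonstandard` and `assemble` are hypothetical, the "core" is a short excerpt of a listed core segment used as a stand-in for a full helical core, and the second TMD is taken from the TMD position of one listed design without any claim about its origin. The check against the 20 standard one-letter codes is an added convenience for catching transcription errors (for example, a stray "O") and is not part of the disclosed methods.

```python
STANDARD_AMINO_ACIDS = frozenset("ACDEFGHIKLMNPQRSTVWY")

# Sequences copied from this disclosure.
SYN1A_JMD = "TKKAVKYQSKARRKK"            # Syn1A JMD recited in the text
LISTED_TMD = "IMIIICCVILGIVIASTVGGIFA"   # TMD appearing in most listed designs
ALTERNATIVE_TMD = "AAVLVLLVIVIISLIVLVVIW"  # appears in the TMD position of one design
# Stand-in only: a short excerpt of a listed core segment, not a full core.
CORE_STAND_IN = "SKAKEHVEKGVED"


def find_nonstandard(sequence: str) -> list[tuple[int, str]]:
    """Return (1-based position, letter) for letters outside the 20 standard codes."""
    return [
        (pos, aa)
        for pos, aa in enumerate(sequence, start=1)
        if aa not in STANDARD_AMINO_ACIDS
    ]


def assemble(core: str, jmd: str, tmd: str) -> str:
    """Join core, JMD, and TMD in N-to-C order after validating each part."""
    for name, part in (("core", core), ("JMD", jmd), ("TMD", tmd)):
        bad = find_nonstandard(part)
        if bad:
            raise ValueError(f"{name} contains non-standard residues: {bad}")
    return core + jmd + tmd


# Same core and JMD, two different membrane anchors.
original = assemble(CORE_STAND_IN, SYN1A_JMD, LISTED_TMD)
swapped = assemble(CORE_STAND_IN, SYN1A_JMD, ALTERNATIVE_TMD)
assert original[: -len(LISTED_TMD)] == swapped[: -len(ALTERNATIVE_TMD)]
```

Keeping each domain as a separate value makes a TMD or JMD swap a one-argument change, which mirrors the modularity the designs are intended to provide.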
We describe designed proteins that have membrane fusion activity, and are useful, for example, in intracellular delivery and synthetic intracellular membrane trafficking systems.
The disclosed designs are computationally designed amino acid sequences that fold into a SNARE complex-like four-helix bundle structure and are capable of inducing the fusion of two membranes when displayed on a lipid bilayer membrane, such as a cell membrane or a liposomal membrane. The designed protein complex is composed of two (SEQ ID NO: 64-146) or three (SEQ ID NO: 38-63) protein components and is anchored into the membrane by the components' transmembrane domains.
The new sequences were designed using the native SNARE structure as a template (SEQ ID NO: 1-63, except 26-33, 42-44, and 58-63). The predicted structure of these designs is identical to that of the parental SNARE complex (
Furthermore, we generated new backbones for sc-t-SP. The structure of the parental neuronal SNARE was “partially diffused” by RFdiffusion, and the newly generated backbones and sequences were predicted to fold into SNARE complex-like four-helix bundle structures (SEQ ID NO: 128-146). These designs showed superior fusion activity compared to the parental native SNARE complex (
When combined with small molecule-dependent heterodimeric domains, the fusogenic activity of these designed fusion proteins can be controlled by the presence of specific small molecules (chemically induced dimerization; exemplified by rapamycin induced binding herein
Genes for the designed proteins were synthesized and cloned into mammalian expression vectors such as pCMV or pcDNA3.1. All designs were expressed in the human embryonic kidney cell line HEK293T by transfection of plasmid DNA using polyethyleneimine (PEI). Designed SNARE-like proteins were expressed on the surface of HEK293T cells as flipped SNARE3. The v-cells express v-SP and T7 RNA polymerase, while the t-cells express t-SP and a luciferase reporter under the T7 promoter. In this assay, the luciferase reporter gene is expressed only after cell-cell fusion between v-cells and t-cells. Transfected cells were mixed together and, after overnight incubation, cell-cell fusion was quantitatively assessed by luciferase assay.
Number | Date | Country
---|---|---
63582937 | Sep 2023 | US