A computer readable form of the Sequence Listing is filed with this application by electronic submission and is incorporated into this application by reference in its entirety. The Sequence Listing is contained in the file created on Feb. 11, 2021, having the file name “20-1008-PCT2_SeqList_ST25.txt” and is 161 kb in size.
Many proteins, including but not limited to viral glycoprotein antigens, must be expressed as secreted proteins in eukaryotic cells. This requirement can derive from many different causes, including but not limited to a requirement for post-translational modifications including but not limited to N-linked glycosylation, disulfide bond formation, etc. However, the yield of secreted protein from eukaryotic cells varies widely for reasons that are not fully understood by those of skill in the art, and some proteins altogether fail to secrete at appreciable levels.
In one aspect, the disclosure provides polypeptides comprising or consisting of:
(a) an amino acid sequence at least 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical to the amino acid sequence of SEQ ID NO:1 (I3-01 wild type), wherein 1, 2, 3, 4, 5, 6, 7, 8, 9, or all 10 of the following mutations relative to SEQ ID NO:1 are present in the polypeptide: F32Y, H37D/E/K/N/Q/R, F43Q, F168D/E/K/N/Q/R/S/T/Y, K169D/E/N/Q, L173D/E/N/Q/S, A174S, S179D/E, K183D/E, and/or T185D/E/K/N/Q/S;
(b) an amino acid sequence at least 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical to the amino acid sequence of SEQ ID NO:2 (O43-38 tetramer wild type), wherein 1, 2, 3, 4, 5, 6, 7, 8, or all 9 of the following mutations relative to SEQ ID NO:2 are present in the polypeptide: M138D/E/K/N/Q/R/S/T, L139D/N/S, A141S, V142R/T, A143S, N146D/E/K/R, R147N, H172D/E/K/N/Q, and/or E173D/K;
(c) an amino acid sequence at least 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical to the amino acid sequence of SEQ ID NO:3 (O43-38 trimer wild type), wherein 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, or all 21 of the following mutations relative to SEQ ID NO:3 are present in the polypeptide: R17D/E/K/N/Q/S/T, N19D/E, S20D/E/K/N, V21D/T, V22D/E/Q/S/T, L23D/E/K/N/Q/R/S, A26S, K27N/Q, A30S, V31N/S/T, F32R/Y, L33D/E/K/N/Q/R/S/T, H37D/E/K/N/R, F43Q, W167D/E/K/N/Q/R/S/T/Y, F168D/E/K/N/Q/R/S/T/Y, K169D/E/N, L173D/E/N/Q/R/S, A174S, S179D/E/K/N/Q/R, and/or K183D/E/N/Q;
(d) an amino acid sequence at least 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical to the amino acid sequence of SEQ ID NO:4 (I53_dn5A wild type), wherein 1, 2, 3, 4, 5, 6, 7, 8, 9, or all 10 of the following mutations relative to SEQ ID NO:4 are present in the polypeptide: R17T, W18D/E/K/N/Q/R/S/T/Y, N19E, E21D, L28D/E/K/N/Q/R/S/T/Y, L31D/E/K/N/Q/S/T, K32D/E/N/Q, T118D/E/N/Q/S, L120D/E/K/N/Q/R/S/T, and/or T121D/E/K/N/S; or
(e) an amino acid sequence at least 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical to the amino acid sequence of SEQ ID NO:5 or 6 (hMPV wild type), wherein 1, 2, 3, 4, or all 5 of the following mutations relative to SEQ ID NO:5 or 6 are present in the polypeptide: A107D, V112R, T114E, V118R, and/or G264D;
wherein residues in parentheses are optional and may be present or may be absent in whole or in part.
In another aspect, the disclosure provides fusion proteins comprising:
(a) the polypeptide according to any embodiment of the first aspect of the disclosure; and
(b) a second functional polypeptide.
In a further aspect, the disclosure provides nanoparticles comprising a plurality of the polypeptides or fusion proteins of any embodiment of the first aspect and second aspect of the disclosure, and compositions comprising a plurality of such nanoparticles. In further aspects, the disclosure provides nucleic acids encoding the polypeptides or fusion proteins, expression vectors comprising the nucleic acids operatively linked to a suitable control sequence, host cells comprising the polypeptides, fusion proteins, nanoparticles, compositions, nucleic acids, and/or expression vectors, and pharmaceutical compositions thereof.
In a further aspect, the disclosure provides computer-implemented methods for designing a secreted peptide, such as the polypeptides of the disclosure.
All references cited are herein incorporated by reference in their entirety. Within this application, unless otherwise stated, the techniques utilized may be found in any of several well-known references such as: Molecular Cloning: A Laboratory Manual (Sambrook, et al., 1989, Cold Spring Harbor Laboratory Press), Gene Expression Technology (Methods in Enzymology, Vol. 185, edited by D. Goeddel, 1991. Academic Press, San Diego, Calif.), “Guide to Protein Purification” in Methods in Enzymology (M. P. Deutscher, ed., (1990) Academic Press, Inc.); PCR Protocols: A Guide to Methods and Applications (Innis, et al. 1990. Academic Press, San Diego, Calif.), Culture of Animal Cells: A Manual of Basic Technique, 2nd Ed. (R. I. Freshney. 1987. Liss, Inc. New York, N.Y.), Gene Transfer and Expression Protocols (pp. 109-128, ed. E. J. Murray, The Humana Press Inc., Clifton, N.J.), and the Ambion 1998 Catalog (Ambion, Austin, Tex.).
As used herein, the singular forms “a”, “an” and “the” include plural referents unless the context clearly dictates otherwise.
As used herein, the amino acid residues are abbreviated as follows: alanine (Ala; A), asparagine (Asn; N), aspartic acid (Asp; D), arginine (Arg; R), cysteine (Cys; C), glutamic acid (Glu; E), glutamine (Gln; Q), glycine (Gly; G), histidine (His; H), isoleucine (Ile; I), leucine (Leu; L), lysine (Lys; K), methionine (Met; M), phenylalanine (Phe; F), proline (Pro; P), serine (Ser; S), threonine (Thr; T), tryptophan (Trp; W), tyrosine (Tyr; Y), and valine (Val; V).
All embodiments of any aspect of the disclosure can be used in combination, unless the context clearly dictates otherwise.
Unless the context clearly requires otherwise, throughout the description and the claims, the words ‘comprise’, ‘comprising’, and the like are to be construed in an inclusive sense as opposed to an exclusive or exhaustive sense; that is to say, in the sense of “including, but not limited to”. Words using the singular or plural number also include the plural and singular number, respectively. Additionally, the words “herein,” “above,” and “below” and words of similar import, when used in this application, shall refer to this application as a whole and not to any particular portions of the application.
In a first aspect, the disclosure provides polypeptides comprising or consisting of:
(a) an amino acid sequence at least 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical to the amino acid sequence of SEQ ID NO:1 (I3-01 wild type), wherein 1, 2, 3, 4, 5, 6, 7, 8, 9, or all 10 of the following mutations relative to SEQ ID NO:1 are present in the polypeptide: F32Y, H37D/E/K/N/Q/R, F43Q, F168D/E/K/N/Q/R/S/T/Y, K169D/E/N/Q, L173D/E/N/Q/S, A174S, S179D/E, K183D/E, and/or T185D/E/K/N/Q/S;
(b) an amino acid sequence at least 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical to the amino acid sequence of SEQ ID NO:2 (O43-38 tetramer wild type), wherein 1, 2, 3, 4, 5, 6, 7, 8, or all 9 of the following mutations relative to SEQ ID NO:2 are present in the polypeptide: M138D/E/K/N/Q/R/S/T, L139D/N/S, A141S, V142R/T, A143S, N146D/E/K/R, R147N, H172D/E/K/N/Q, and/or E173D/K;
(c) an amino acid sequence at least 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical to the amino acid sequence of SEQ ID NO:3 (O43-38 trimer wild type), wherein 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, or all 21 of the following mutations relative to SEQ ID NO:3 are present in the polypeptide: R17D/E/K/N/Q/S/T, N19D/E, S20D/E/K/N, V21D/T, V22D/E/Q/S/T, L23D/E/K/N/Q/R/S, A26S, K27N/Q, A30S, V31N/S/T, F32R/Y, L33D/E/K/N/Q/R/S/T, H37D/E/K/N/R, F43Q, W167D/E/K/N/Q/R/S/T/Y, F168D/E/K/N/Q/R/S/T/Y, K169D/E/N, L173D/E/N/Q/R/S, A174S, S179D/E/K/N/Q/R, and/or K183D/E/N/Q;
(d) an amino acid sequence at least 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical to the amino acid sequence of SEQ ID NO:4 (I53_dn5A wild type), wherein 1, 2, 3, 4, 5, 6, 7, 8, 9, or all 10 of the following mutations relative to SEQ ID NO:4 are present in the polypeptide: R17T, W18D/E/K/N/Q/R/S/T/Y, N19E, E21D, L28D/E/K/N/Q/R/S/T/Y, L31D/E/K/N/Q/S/T, K32D/E/N/Q, T118D/E/N/Q/S, L120D/E/K/N/Q/R/S/T, and/or T121D/E/K/N/S; or
(e) an amino acid sequence at least 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical to the amino acid sequence of SEQ ID NO:5 or 6 (hMPV wild type), wherein 1, 2, 3, 4, or all 5, of the following mutations relative to SEQ ID NO:5 or 6 are present in the polypeptide: A107D, V112R, T114E, V118R, and/or G264D;
wherein residues in parentheses are optional and may be present or may be absent in whole or in part.
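By way of non-limiting illustration, the mutation notation used above (for example, “H37D/E/K/N/Q/R” denoting replacement of histidine at position 37 by any one of the listed residues) can be checked against a candidate sequence as sketched below. The function names, the toy sequences, and the simple identity measure over pre-aligned sequences are assumptions introduced only for this sketch; they are not taken from the Sequence Listing, and other identity definitions and alignment tools may be used.

```python
import re


def parse_mutation_spec(spec):
    """Parse e.g. 'F32Y, H37D/E/K/N/Q/R' into {position: (wild_type, allowed_set)}."""
    parsed = {}
    for token in re.split(r"[,\s]+", spec.strip()):
        if not token:
            continue
        match = re.fullmatch(r"([A-Z])(\d+)([A-Z](?:/[A-Z])*)", token)
        if match is None:
            raise ValueError(f"unrecognized mutation token: {token!r}")
        wild_type, position, substitutions = match.groups()
        parsed[int(position)] = (wild_type, set(substitutions.split("/")))
    return parsed


def count_listed_mutations(candidate, spec):
    """Count listed positions (1-based) at which the candidate carries an allowed residue."""
    count = 0
    for position, (_, allowed) in parse_mutation_spec(spec).items():
        if position <= len(candidate) and candidate[position - 1] in allowed:
            count += 1
    return count


def percent_identity(aligned_a, aligned_b):
    """Percent identity over two pre-aligned, equal-length sequences (gaps as '-').

    Assumption: identity = identical non-gap columns / alignment length.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to equal length")
    matches = sum(a == b and a != "-" for a, b in zip(aligned_a, aligned_b))
    return 100.0 * matches / len(aligned_a)


# Toy data, hypothetical and NOT from the Sequence Listing.
TOY_REFERENCE = "MKFLH"
TOY_SPEC = "F3Y, H5D/E/K"
TOY_CANDIDATE = "MKYLE"

print(count_listed_mutations(TOY_CANDIDATE, TOY_SPEC))
print(percent_identity(TOY_REFERENCE, TOY_CANDIDATE))
```

In this toy case the candidate carries both listed substitutions while sharing three of five aligned positions with the reference.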
The gatekeeper of the first step in the secretory pathway, cotranslational translocation across the ER membrane, is the Sec translocon, which acts as a fate-determining channel for nascent polypeptides. As detailed in the examples below, the inventors provide a method for improving the secretion of proteins from eukaryotic cells, and corresponding novel proteins that have improved secretion capability in eukaryotic cells, and fusion proteins and nanoparticles comprising the polypeptides, all of which can be used, for example, as scaffolds for multivalent antigen presentation to generate improved vaccines.
In one embodiment, the polypeptide comprises or consists of an amino acid sequence at least 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical to the amino acid sequence of SEQ ID NO:1 (I3-01 wild type), wherein 1, 2, 3, 4, 5, 6, 7, 8, 9, or all 10 of the following mutations relative to SEQ ID NO:1 are present in the polypeptide: F32Y, H37D/E/K/N/Q/R, F43Q, F168D/E/K/N/Q/R/S/T/Y, K169D/E/N/Q, L173D/E/N/Q/S, A174S, S179D/E, K183D/E, and/or T185D/E/K/N/Q/S. In one non-limiting example of this embodiment, the polypeptide comprises or consists of an amino acid sequence at least 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical to the amino acid sequence of SEQ ID NO:7-14.
In another embodiment, the polypeptide comprises or consists of an amino acid sequence at least 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical to the amino acid sequence of SEQ ID NO:2 (O43-38 tetramer wild type), wherein 1, 2, 3, 4, 5, 6, 7, 8, or all 9 of the following mutations relative to SEQ ID NO:2 are present in the polypeptide: M138D/E/K/N/Q/R/S/T, L139D/N/S, A141S, V142R/T, A143S, N146D/E/K/R, R147N, H172D/E/K/N/Q, and/or E173D/K. In one such embodiment, the polypeptide comprises or consists of an amino acid sequence at least 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical to the amino acid sequence of SEQ ID NO:24-25.
In a further embodiment, the polypeptide comprises or consists of an amino acid sequence at least 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical to the amino acid sequence of SEQ ID NO:3 (O43-38 trimer wild type), wherein 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, or all 21 of the following mutations relative to SEQ ID NO:3 are present in the polypeptide: R17D/E/K/N/Q/S/T, N19D/E, S20D/E/K/N, V21D/T, V22D/E/Q/S/T, L23D/E/K/N/Q/R/S, A26S, K27N/Q, A30S, V31N/S/T, F32R/Y, L33D/E/K/N/Q/R/S/T, H37D/E/K/N/R, F43Q, W167D/E/K/N/Q/R/S/T/Y, F168D/E/K/N/Q/R/S/T/Y, K169D/E/N, L173D/E/N/Q/R/S, A174S, S179D/E/K/N/Q/R, and/or K183D/E/N/Q. In one such embodiment, the polypeptide comprises or consists of an amino acid sequence at least 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical to the amino acid sequence of SEQ ID NO:26-28.
In one embodiment, the polypeptide comprises or consists of an amino acid sequence at least 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical to the amino acid sequence of SEQ ID NO:4 (I53_dn5A wild type), wherein 1, 2, 3, 4, 5, 6, 7, 8, 9, or all 10 of the following mutations relative to SEQ ID NO:4 are present in the polypeptide: R17T, W18D/E/K/N/Q/R/S/T/Y, N19E, E21D, L28D/E/K/N/Q/R/S/T/Y, L31D/E/K/N/Q/S/T, K32D/E/N/Q, T118D/E/N/Q/S, L120D/E/K/N/Q/R/S/T, and/or T121D/E/K/N/S. In one such embodiment, the polypeptide comprises or consists of an amino acid sequence at least 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical to the amino acid sequence of SEQ ID NO:15-23.
In another embodiment, the polypeptide comprises or consists of an amino acid sequence at least 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical to the amino acid sequence of SEQ ID NO:5 or 6 (hMPV wild type), wherein 1, 2, 3, 4, or all 5, of the following mutations relative to SEQ ID NO:5 or 6 are present in the polypeptide: A107D, V112R, T114E, V118R, and/or G264D. In one such embodiment, the polypeptide comprises or consists of an amino acid sequence at least 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical to the amino acid sequence of SEQ ID NO:5 or 6, wherein the polypeptide comprises a set of mutations relative to SEQ ID NO:5 or 6 selected from the group consisting of:
(a) T114E+V118R
(b) A107D+V112R+T114E+V118R
(c) A107D+V112R
(d) A107D+V112R+T114E+V118R; and
(e) A107D+V112R+T114E+V118R+G264D.
In various embodiments, the polypeptides comprise or consist of an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical to the amino acid sequence selected from SEQ ID NO: 29, 31, 33, 35, 37, 39, 41, 43, and 45. In other embodiments, some or all of the residues in parentheses are absent. In further embodiments, some or all of the residues in parentheses are present.
As disclosed herein, the polypeptides of the disclosure may be present in fusion proteins and nanoparticles. Thus, in another embodiment, the disclosure provides fusion proteins comprising:
(a) the polypeptide according to any embodiment of the disclosure; and
(b) a second functional polypeptide.
The second functional polypeptide may have any suitable function, including but not limited to therapeutic polypeptides, diagnostic polypeptides, detectable polypeptides, etc. In one embodiment, the second functional polypeptide comprises an immunogenic portion of a polypeptide antigen. An immunogenic portion of any suitable polypeptide antigen may be used, including but not limited to viral antigens. In one embodiment, the second functional polypeptide comprises an immunogenic portion of an amino acid sequence at least 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical to the amino acid sequence of SEQ ID NO:5 or 6 (hMPV wild type), wherein 1, 2, 3, 4, or all 5, of the following mutations relative to SEQ ID NO:5 or 6 are present in the polypeptide: A107D, V112R, T114E, V118R, and/or G264D;
wherein residues in parentheses are optional and may be present or may be absent in whole or in part. In one embodiment, the second functional polypeptide comprises or consists of an amino acid sequence at least 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical to the amino acid sequence of SEQ ID NO:5 or 6, wherein the polypeptide comprises a set of mutations relative to SEQ ID NO:5 or 6 selected from the group consisting of:
(a) T114E+V118R
(b) A107D+V112R+T114E+V118R
(c) A107D+V112R
(d) A107D+V112R+T114E+V118R; and
(e) A107D+V112R+T114E+V118R+G264D.
In another embodiment, the fusion protein comprises or consists of an amino acid sequence at least 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical to the amino acid sequence of SEQ ID NO:30, 32, 34, 36, 38, 40, 42, 44, or 46 wherein residues in parentheses are optional and may be present or may be absent in whole or in part. In one embodiment, the residues SGR present in the second optional sequence from the N-terminus of SEQ ID NO: 30, 32, 34, 36, 38, 40, 42, 44, or 46 are present.
In another embodiment, the disclosure provides nanoparticles comprising a plurality of the polypeptides or fusion proteins of any embodiment or combination of embodiments herein. In one embodiment, the nanoparticle comprises
(a) a plurality of polypeptides comprising an amino acid sequence at least 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical to the amino acid sequence of SEQ ID NO:1 (I3-01 wild type), wherein 1, 2, 3, 4, 5, 6, 7, 8, 9, or all 10 of the following mutations relative to SEQ ID NO:1 are present in the polypeptide: F32Y, H37D/E/K/N/Q/R, F43Q, F168D/E/K/N/Q/R/S/T/Y, K169D/E/N/Q, L173D/E/N/Q/S, A174S, S179D/E, K183D/E, and/or T185D/E/K/N/Q/S, or fusion proteins thereof. In one non-limiting example of this embodiment, the polypeptide comprises or consists of an amino acid sequence at least 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical to the amino acid sequence of SEQ ID NO:7-14, or a fusion protein thereof.
In another embodiment, the nanoparticles comprise
(a) a plurality of first polypeptides comprising an amino acid sequence at least 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical to the amino acid sequence of SEQ ID NO:2 (O43-38 tetramer wild type), wherein 1, 2, 3, 4, 5, 6, 7, 8, or all 9 of the following mutations relative to SEQ ID NO:2 are present in the polypeptide: M138D/E/K/N/Q/R/S/T, L139D/N/S, A141S, V142R/T, A143S, N146D/E/K/R, R147N, H172D/E/K/N/Q, and/or E173D/K, or fusion proteins thereof, wherein the plurality of first polypeptides self-interact to form a first multimeric substructure; and
(b) a plurality of second polypeptides comprising an amino acid sequence at least 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical to the amino acid sequence of SEQ ID NO:3 (O43-38 trimer wild type), wherein 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, or all 21 of the following mutations relative to SEQ ID NO:3 are present in the polypeptide: R17D/E/K/N/Q/S/T, N19D/E, S20D/E/K/N, V21D/T, V22D/E/Q/S/T, L23D/E/K/N/Q/R/S, A26S, K27N/Q, A30S, V31N/S/T, F32R/Y, L33D/E/K/N/Q/R/S/T, H37D/E/K/N/R, F43Q, W167D/E/K/N/Q/R/S/T/Y, F168D/E/K/N/Q/R/S/T/Y, K169D/E/N, L173D/E/N/Q/R/S, A174S, S179D/E/K/N/Q/R, and/or K183D/E/N/Q, or fusion proteins thereof, wherein the plurality of second polypeptides self-interact to form a second multimeric substructure;
wherein multiple copies of the first multimeric substructure and the second multimeric substructure interact with each other at one or more non-covalent protein-protein interfaces.
In one embodiment, the first polypeptides comprise or consist of an amino acid sequence at least 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical to the amino acid sequence of SEQ ID NO:24-25. In another embodiment, the second polypeptides comprise or consist of an amino acid sequence at least 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical to the amino acid sequence of SEQ ID NO:26-28.
In another embodiment, the nanoparticle comprises
(a) a plurality of first polypeptides comprising an amino acid sequence at least 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical to the amino acid sequence of SEQ ID NO:4 (I53_dn5A), wherein 1, 2, 3, 4, 5, 6, 7, 8, 9, or all 10 of the following mutations relative to SEQ ID NO:4 are present in the polypeptide: R17T, W18D/E/K/N/Q/R/S/T/Y, N19E, E21D, L28D/E/K/N/Q/R/S/T/Y, L31D/E/K/N/Q/S/T, K32D/E/N/Q, T118D/E/N/Q/S, L120D/E/K/N/Q/R/S/T, and/or T121D/E/K/N/S, or fusion proteins thereof, wherein the plurality of first polypeptides self-interact to form a first multimeric substructure; and
(b) a plurality of second polypeptides comprising or consisting of SEQ ID NO:47 that self-interact to form a second multimeric substructure;
wherein multiple copies of the first multimeric substructure and the second multimeric substructure interact with each other at one or more non-covalent protein-protein interfaces. In one such embodiment, the first polypeptides comprise or consist of an amino acid sequence at least 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical to the amino acid sequence of SEQ ID NO:15-23.
In all of these embodiments of nanoparticles, the plurality of component polypeptides may comprise one or more fusion proteins. When one or more fusion proteins are present, one or more of the fusion proteins comprise a second functional polypeptide as described above. Such second functional polypeptides may include, but are not limited to, an immunogenic portion of a polypeptide antigen, wherein the polypeptide antigen includes but is not limited to the polypeptide comprising an immunogenic portion of an amino acid sequence at least 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical to the amino acid sequence of SEQ ID NO:5 or 6 (hMPV wild type), wherein 1, 2, 3, 4, or all 5 of the following mutations relative to SEQ ID NO:5 or 6 are present in the polypeptide: A107D, V112R, T114E, V118R, and/or G264D;
wherein residues in parentheses are optional and may be present or may be absent in whole or in part. In one embodiment, the second functional polypeptide comprises or consists of an amino acid sequence at least 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical to the amino acid sequence of SEQ ID NO:5 or 6, wherein the polypeptide comprises a set of mutations relative to SEQ ID NO:5 or 6 selected from the group consisting of:
(a) T114E+V118R
(b) A107D+V112R+T114E+V118R
(c) A107D+V112R
(d) A107D+V112R+T114E+V118R; and
(e) A107D+V112R+T114E+V118R+G264D, or
wherein the polypeptide comprises or consists of an amino acid sequence at least 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical to the amino acid sequence of SEQ ID NO:29, 31, 33, 35, 37, 39, 41, 43, or 45.
In another embodiment, the disclosure provides compositions comprising a plurality of nanoparticles according to any embodiment or combination of embodiments described herein. The compositions may be used for any of the uses described herein, including but not limited to use as vaccines when loaded with immunogenic portions of a polypeptide antigen.
In another embodiment, the disclosure provides a synthetic (“degreased”) nanoparticle, comprising a cryptic transmembrane domain, wherein one or more of the hydrophobic amino acids of the cryptic transmembrane domain have been substituted with a polar amino acid. In one embodiment, the amino acid substitution is within a 19-residue sliding window for transmembrane insertion potential (dG_ins); windows of dG_ins less than or equal to +2.7 kcal/mol are confirmed to be local minima within +/−9 residues, and the cutoff of +2.7 kcal/mol is the signature of the cryptic transmembrane domain. In another embodiment, the synthetic nanoparticle comprises a polypeptide comprising the amino acid sequence of SEQ ID NO:13.
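By way of non-limiting illustration, the sliding-window scan described above may be sketched as follows. The window length (19 residues), the cutoff (+2.7 kcal/mol), and the +/−9 neighborhood are taken from the text; the per-residue values in PLACEHOLDER_SCALE are hypothetical three-tier numbers chosen only so the sketch runs, and they do not represent any published insertion-potential predictor. Summing per-residue terms is likewise a simplifying assumption, as is the reading of "local minima within +/−9 residues" as a comparison among windows whose start positions lie within 9 of one another.

```python
WINDOW = 19        # residues per window, per the text
CUTOFF = 2.7       # kcal/mol, per the text
NEIGHBORHOOD = 9   # local-minimum check spans +/-9 window start positions

# HYPOTHETICAL per-residue contributions (kcal/mol), for illustration only.
PLACEHOLDER_SCALE = {
    **dict.fromkeys("FILMVW", -0.5),   # treated here as hydrophobic
    **dict.fromkeys("ACGPSTY", 0.5),   # treated here as intermediate
    **dict.fromkeys("DEHKNQR", 2.0),   # treated here as strongly polar
}


def window_scores(sequence, scale=PLACEHOLDER_SCALE, window=WINDOW):
    """Return one placeholder dG_ins value per window start (0-based)."""
    per_residue = [scale[residue] for residue in sequence]
    return [
        sum(per_residue[start:start + window])
        for start in range(len(sequence) - window + 1)
    ]


def cryptic_tm_windows(sequence, cutoff=CUTOFF, neighborhood=NEIGHBORHOOD):
    """Return (1-based start, score) for windows at or below the cutoff that are
    also the minimum among windows starting within +/-neighborhood positions.
    Ties are kept, so a flat plateau would report several windows."""
    scores = window_scores(sequence)
    hits = []
    for index, score in enumerate(scores):
        if score > cutoff:
            continue
        low = max(0, index - neighborhood)
        high = min(len(scores), index + neighborhood + 1)
        if score <= min(scores[low:high]):
            hits.append((index + 1, score))
    return hits


# Toy sequence (hypothetical): a 19-residue leucine stretch flanked by aspartates.
toy = "D" * 15 + "L" * 19 + "D" * 15
print(cryptic_tm_windows(toy))
```

With the placeholder values, only the window that exactly covers the leucine stretch is reported; neighboring windows that also fall below the cutoff are suppressed by the local-minimum check.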
In one embodiment, the synthetic nanoparticle is a polypeptide. In other embodiments, the synthetic nanoparticle comprises a signal peptide and/or a tag. In another embodiment, the synthetic nanoparticle comprises a one-component or homomeric nanoparticle. In one such embodiment, the synthetic nanoparticle comprises an expressed sequence as shown and described herein.
In another embodiment, the synthetic nanoparticle comprises variant I3-01 amino acid sequences. In one such embodiment, the synthetic nanoparticle comprises a polar amino acid substitution at position 25, position 35, position 171, position 177, or position 180, or at any combination of two or more of those positions. In a further embodiment, the synthetic nanoparticle further comprises an agent to be secreted (“secreted agent”). In one such embodiment, the secreted agent is selected from:
In one embodiment, the polypeptide comprises an antigen or an immunogenic portion of an antigen. In another embodiment, the antigen or immunogenic portion is of viral origin. In one embodiment, the virus is human metapneumovirus (hMPV).
In another embodiment, the synthetic nanoparticle comprises a two-component nanoparticle. In one such embodiment, the synthetic nanoparticle comprises a trimer, a tetramer, or a pentamer. In another embodiment, the synthetic nanoparticle is selected from: I53_dn5, O43-38, and I53-50. In another embodiment, the synthetic nanoparticle is I53_dn5, wherein the pentameric subunit I53_dn5A of the synthetic nanoparticle comprises a polar amino acid substitution at at least one of position 16, position 29, position 116, position 118, or position 119, or at any combination of two or more of those positions.
In one embodiment, the synthetic nanoparticle is O43-38, wherein the tetrameric subunit O43-38tet of the synthetic nanoparticle comprises a polar amino acid substitution at position 29, position 141, position 19, position 21, or position 31, or at any combination of two or more of those positions.
In another aspect the disclosure provides nucleic acids encoding the polypeptide, fusion proteins, or nanoparticles of any embodiment or combination of embodiments of the disclosure. The nucleic acid sequence may comprise single stranded or double stranded RNA or DNA in genomic or cDNA form, mRNA, or DNA-RNA hybrids, each of which may include chemically or biochemically modified, non-natural, or derivatized nucleotide bases. Such nucleic acid sequences may comprise additional sequences useful for promoting expression and/or purification of the encoded polypeptide, including but not limited to polyA sequences, modified Kozak sequences, and sequences encoding epitope tags, export signals, and secretory signals, nuclear localization signals, and plasma membrane localization signals. It will be apparent to those of skill in the art, based on the teachings herein, what nucleic acid sequences will encode the polypeptides of the disclosure.
In a further aspect, the disclosure provides expression vectors comprising the nucleic acid of any aspect of the disclosure operatively linked to a suitable control sequence. “Expression vector” includes vectors that operatively link a nucleic acid coding region or gene to any control sequences capable of effecting expression of the gene product. “Control sequences” operably linked to the nucleic acid sequences of the disclosure are nucleic acid sequences capable of effecting the expression of the nucleic acid molecules. The control sequences need not be contiguous with the nucleic acid sequences, so long as they function to direct the expression thereof. Thus, for example, intervening untranslated yet transcribed sequences can be present between a promoter sequence and the nucleic acid sequences and the promoter sequence can still be considered “operably linked” to the coding sequence. Other such control sequences include, but are not limited to, polyadenylation signals, termination signals, and ribosome binding sites. Such expression vectors can be of any type, including but not limited to plasmid and viral-based expression vectors. The control sequence used to drive expression of the disclosed nucleic acid sequences in a mammalian system may be constitutive (driven by any of a variety of promoters, including but not limited to, CMV, SV40, RSV, actin, EF) or inducible (driven by any of a number of inducible promoters including, but not limited to, tetracycline, ecdysone, steroid-responsive). The expression vector must be replicable in the host organisms either as an episome or by integration into host chromosomal DNA. In various embodiments, the expression vector may comprise a plasmid, viral-based vector, or any other suitable expression vector.
In another aspect, the disclosure provides host cells that comprise the polypeptide, fusion protein, nanoparticle, composition, nucleic acid, and/or expression vector (i.e., episomal or chromosomally integrated) disclosed herein, wherein the host cells can be either prokaryotic or eukaryotic. The cells can be transiently or stably engineered to incorporate the expression vector of the disclosure, using techniques including but not limited to bacterial transformations, calcium phosphate co-precipitation, electroporation, or liposome mediated-, DEAE dextran mediated-, polycationic mediated-, or viral mediated transfection.
In another embodiment, the disclosure provides pharmaceutical compositions comprising:
(a) the polypeptide, fusion protein, nanoparticle, composition, nucleic acid, expression vector, and/or host cell of any embodiment or combination of embodiments herein; and
(b) a pharmaceutically acceptable carrier.
The pharmaceutical compositions of the disclosure can be used, for example, in the methods of the disclosure described below. The pharmaceutical composition may comprise in addition to the polypeptide or other active agent of the disclosure (a) a lyoprotectant; (b) a surfactant; (c) a bulking agent; (d) a tonicity adjusting agent; (e) a stabilizer; (f) a preservative; and/or (g) a buffer.
In some embodiments, the buffer in the pharmaceutical composition is a Tris buffer, a histidine buffer, a phosphate buffer, a citrate buffer or an acetate buffer. The pharmaceutical composition may also include a lyoprotectant, e.g., sucrose, sorbitol or trehalose. In certain embodiments, the pharmaceutical composition includes a preservative, e.g., benzalkonium chloride, benzethonium, chlorhexidine, phenol, m-cresol, benzyl alcohol, methylparaben, propylparaben, chlorobutanol, o-cresol, p-cresol, chlorocresol, phenylmercuric nitrate, thimerosal, benzoic acid, and various mixtures thereof. In other embodiments, the pharmaceutical composition includes a bulking agent, like glycine. In yet other embodiments, the pharmaceutical composition includes a surfactant, e.g., polysorbate-20, polysorbate-40, polysorbate-60, polysorbate-65, polysorbate-80, polysorbate-85, poloxamer-188, sorbitan monolaurate, sorbitan monopalmitate, sorbitan monostearate, sorbitan monooleate, sorbitan trilaurate, sorbitan tristearate, sorbitan trioleate, or a combination thereof. The pharmaceutical composition may also include a tonicity adjusting agent, e.g., a compound that renders the formulation substantially isotonic or isoosmotic with human blood. Exemplary tonicity adjusting agents include sucrose, sorbitol, glycine, methionine, mannitol, dextrose, inositol, sodium chloride, arginine and arginine hydrochloride. In other embodiments, the pharmaceutical composition additionally includes a stabilizer, e.g., a molecule which, when combined with a protein of interest, substantially prevents or reduces chemical and/or physical instability of the protein of interest in lyophilized or liquid form. Exemplary stabilizers include sucrose, sorbitol, glycine, inositol, sodium chloride, methionine, arginine, and arginine hydrochloride.
The pharmaceutical composition and the compositions may further comprise one or more other active agents suitable for an intended use.
In another aspect, the disclosure provides methods of delivering a secreted agent from a cell, comprising administering or admixing the cell with the nucleic acid molecule and/or the expression vector of any embodiment or combination of embodiments herein and secreting the nanoparticle or synthetic nanoparticle.
In another aspect, the disclosure provides vaccines comprising the nanoparticle, composition, pharmaceutical composition, synthetic nanoparticle, nucleic acid, expression vector, and/or cell of any embodiment or combination of embodiments herein.
In a further aspect, the disclosure provides methods to vaccinate a subject against a virus, the method comprising administering the nanoparticle, composition, pharmaceutical composition, synthetic nanoparticle(s) or the vaccine(s) described herein to the subject. The subject may be any suitable subject, including but not limited to a mammalian subject such as a human subject. In one embodiment, the method comprises
(a) obtaining the nanoparticle, composition, pharmaceutical composition, synthetic nanoparticles, the compositions, or the vaccines described herein; and,
(b) administering the synthetic nanoparticles, the compositions, or the vaccines described herein to the subject.
In another embodiment, the administration elicits an immune response in the subject, such that the subject is protected against infection.
The disclosure also provides kits, comprising one or more components selected from the group consisting of the polypeptide, fusion protein, nanoparticle, composition, synthetic nanoparticle(s), the nucleic acid molecule(s), the expression vector(s), the cell(s), the composition(s), or the vaccine(s) described herein.
In another aspect, the disclosure provides computer-implemented methods for designing a secreted peptide, using any suitable methods as described herein. In one embodiment, the methods comprise:
generating a 3D structure of a protein of interest with a 19-residue sliding window for transmembrane insertion potential (dG_ins);
wherein windows of dG_ins less than or equal to +2.7 kcal/mol are confirmed to be local minima within +/−9 residues, and the cutoff of +2.7 kcal/mol is the signature of a cryptic transmembrane domain;
designing one or more peptide sequences based on the generated 3D structure and predicting mutations at each position within that domain, wherein allowed residues are all polar, excluding histidine, such that the final allowable residues are amino acids D, E, K, R, Q, N, S, T, Y; and side chains of other residues within an 8-Angstrom shell are allowed to adopt different rotamers (“repack” to one of skill in the art) but not mutate to other residues (“design” to one of skill in the art).
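The window scan described in the steps above can be sketched as follows. This is a minimal illustration in Python, not the Degreaser source. The function name `find_cryptic_tm_windows`, the 0-based start index, and the handling of ties are assumptions made for the sketch. The scorer `toy_dg_ins` is a placeholder with invented per-residue values so that the example runs; it is not the dG_ins model used in the disclosure, and any real scorer would be passed in instead.

```python
from typing import Callable, List, Tuple

WINDOW = 19   # residues per sliding window (from the text)
CUTOFF = 2.7  # kcal/mol; windows at or below this value are candidates
RADIUS = 9    # a candidate must be a local minimum within +/- 9 residues


def toy_dg_ins(segment: str) -> float:
    """Placeholder scorer with invented values; NOT the real dG_ins model."""
    hydrophobic = set("AILMFVW")
    return sum(-0.3 if aa in hydrophobic else 0.5 for aa in segment)


def find_cryptic_tm_windows(
    sequence: str,
    dg_ins: Callable[[str], float],
    window: int = WINDOW,
    cutoff: float = CUTOFF,
    radius: int = RADIUS,
) -> List[Tuple[int, float]]:
    """Return (0-based start index, dG_ins) for each candidate window."""
    n_windows = len(sequence) - window + 1
    if n_windows <= 0:
        return []
    # Score every window once, then look for low-scoring local minima.
    scores = [dg_ins(sequence[i:i + window]) for i in range(n_windows)]
    hits = []
    for i, score in enumerate(scores):
        if score > cutoff:
            continue
        lo, hi = max(0, i - radius), min(n_windows, i + radius + 1)
        # Assumption: ties count as minima, so a flat plateau reports every start.
        if score <= min(scores[lo:hi]):
            hits.append((i, score))
    return hits
```

With the placeholder scorer, a 19-residue run of leucine flanked by serine is reported as a single candidate window that starts at the first leucine.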
In one embodiment, for each mutation or set of mutations, the score of the overall energy of the structure is generated and wherein
(a) if the new score is higher than the original score by a threshold amount of 15 REU (dscore), the degreaser variant is discarded and not further evaluated; or
(b) if the new score is within the tolerance, but the change in dG_ins is less than +0.27 kcal/mol (ddG_ins), the mutation placed at that position is rejected and disallowed at that position, and the position is subjected to mutation again; or
(c) if the new score is within the tolerance and the ddG_ins is greater than +0.27 kcal/mol, the mutation is accepted, the structure is optionally output, and the metrics of that mutation are written to the final output file.
Each position within such a domain is thusly evaluated and mutated, and each domain within the sequence is thusly evaluated and mutated. The final outputs may be written to the end of the output structure file, examples of which are shown in Tables 1 and 2.
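The three-way decision in (a) to (c) can be written as a small function. The sketch below is illustrative only; the names `Outcome` and `classify` are hypothetical, and the treatment of values that fall exactly on a threshold (a dscore of exactly 15 REU, or a ddG_ins of exactly +0.27 kcal/mol) is an assumption, since the text does not address those cases.

```python
from enum import Enum

DSCORE_TOLERANCE = 15.0  # REU (from the text)
DDG_INS_MIN = 0.27       # kcal/mol (from the text)


class Outcome(Enum):
    DISCARD_VARIANT = "discard variant"
    REJECT_MUTATION = "reject mutation and mutate the position again"
    ACCEPT = "accept mutation"


def classify(dscore: float, ddg_ins: float) -> Outcome:
    """Apply rules (a) to (c) to one mutation or set of mutations."""
    # (a) The energy rose by more than the tolerance.
    # Assumption: a dscore of exactly 15 REU is still within tolerance.
    if dscore > DSCORE_TOLERANCE:
        return Outcome.DISCARD_VARIANT
    # (c) Within tolerance and dG_ins rose by more than the minimum.
    if ddg_ins > DDG_INS_MIN:
        return Outcome.ACCEPT
    # (b) Within tolerance but too small a change in dG_ins.
    # Assumption: a ddG_ins of exactly +0.27 kcal/mol is rejected.
    return Outcome.REJECT_MUTATION
```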
Many proteins, including but not limited to viral glycoprotein antigens, must be expressed as secreted proteins in eukaryotic cells. This requirement can derive from many different causes, including but not limited to a requirement for post-translational modifications including but not limited to N-linked glycosylation, disulfide bond formation, etc. However, the yield of secreted protein from eukaryotic cells varies widely for reasons that are not fully understood by those of skill in the art, and some proteins altogether fail to secrete at appreciable levels. Here we describe the identification of cryptic transmembrane domains in a variety of protein sequences that accounts for their poor secretion from eukaryotic cells. We further describe that eliminating these cryptic transmembrane domains through the mutation of hydrophobic residues to polar residues improves the yield of secreted protein. We disclose a general computational method for the identification of cryptic transmembrane domains and their removal through mutation without disrupting a protein's overall structure. We further disclose examples of both designed nanoparticle proteins and viral glycoprotein antigens whose secretion was improved using the method.
A general computational method to predict putative transmembrane domains and redesign them. Across all domains of life, membrane proteins are interpreted by a protein complex known as the translocon. In eukaryotes, Sec61 and its associated chaperones recognize proteins destined for the secretory pathway, plasma membrane resident and extracellularly secreted, via an amino-terminal signal peptide. As the protein is translated, segments of high hydrophobicity partition into the ER membrane.
We have found that several designed protein nanoparticle components, although solubly and stably expressed in bacterial systems, were incompatible with eukaryotic secretion. This differential expression in eukaryotic cells was not correlated with bacterial expression levels. Initial attempts to rationally redesign sequences by structural examination did not afford secretable nanoparticle components. Thus, a model-guided design method was needed in order to improve secretion of these proteins.
A general computational method for designing protein sequences for improved secretion from eukaryotic cells. We wrote code that describes the amino acid- and position-specific contribution to transmembrane insertion at each position within a given segment of a protein. We integrated this code into the Rosetta™ macromolecular modeling and design suite to enable simultaneous design away from high hydrophobicity and toward native stability, such that mutations introduced to remove cryptic transmembrane segments do not destabilize the protein's native structure. We refer to this design protocol as the Degreaser™. Initial input parameters for the degreaser were empirically determined by visual inspection of a range of outputs, with the intention of minimally perturbing the existing designed interfaces.
Characterization of Degreaser™ variants. The Degreaser predicted several variants for each protein input. Each variant generated had increased predicted transmembrane insertion potential (dG_ins), confirming the intended behavior of the Degreaser. Several variants were generated for each input structure, which were then visually inspected. The initial set of proteins examined were: I3-01, a one-component icosahedral particle that was designed using the trimeric 1wa3-wt protein as a starting point; I53-dn5A, the pentameric component of a two-component icosahedral nanoparticle, designed from PDB 2jfb; and the tetrameric and trimeric components of the two-component octahedral nanoparticle O43-38, designed starting from PDBs 1e4c and 1wa3, respectively. 1wa3-wt was solubly secreted from HEK293F suspension cells when appended to an IgK secretion signal; the nanoparticle components were not appreciably secreted.
a. Definition: “Degreaser™” refers to the program that was written and compiled in C++ as part of the Rosetta™ macromolecular modeling package in order to use standard Rosetta features such as PDB handling and other scoring metrics.
b. Definition: a “degreased” protein sequence can be said to have been evaluated by the Degreaser, and to have been experimentally validated to have improved secretion from a eukaryotic cell. Candidates that were evaluated by the degreaser but not experimentally evaluated for improved secretion from a eukaryotic cell, or other candidates that were not evaluated, can be called “not degreased.”
c. Definition: a “degreaser variant” refers to a candidate output from the degreaser before it is/was classified as “degreased” or “not degreased.”
d. The core of the code is briefly outlined here. The input 3D structure of interest is evaluated with a 19-residue sliding window for transmembrane insertion potential (dG_ins). Windows of dG_ins less than or equal to +2.7 kcal/mol are confirmed to be local minima within +/−9 residues, and the cutoff of +2.7 kcal/mol is the signature of a ‘cryptic transmembrane domain.’ Once all such domains are recognized, the program uses the Rosetta™ Packer to make mutations at each position within that domain. The allowed residues are all polar, excluding histidine, such that the final allowable residues are “DEKRQNSTY.” (SEQ ID NO: 62) After the Packer makes a change to a residue in the domain, side chains of other residues within an 8-Angstrom shell are allowed to adopt different rotamers (“repack” to one of skill in the art) but not mutate to other residues (“design” to one of skill in the art). For each mutation or set of mutations, the Rosetta score, or overall energy of the structure, is evaluated, as well as the new dG_ins. If the new score is higher than the original score by a threshold amount of 15 REU (dscore), the degreaser variant is discarded and not further evaluated. If the new score is within the tolerance, but the change in dG_ins is less than +0.27 kcal/mol (ddG_ins), the mutation placed at that position is rejected and disallowed at that position, and the position is subjected to mutation again. If the new score is within the tolerance and the ddG_ins is greater than +0.27 kcal/mol, the mutation is accepted, the structure is optionally output, and the metrics of that mutation are written to the final output file. Each position within such a domain is thusly evaluated and mutated, and each domain within the sequence is thusly evaluated and mutated. The final outputs are written to the end of the output structure file, examples of which are shown in Tables 1 and 2.
e. All degreaser variants were inspected by looking at the mutant's 3D structure in PyMol. Some outputs that appeared unrealistic as would be known to one of skill in the art, such as the incorporation of charged residues into the hydrophobic core of a protein, were removed from the variant list. Furthermore, only a select number of candidates for each scaffold were chosen for experimental evaluation.
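Two pieces of the outline in item d, the set of allowed substitutions and the 8-Angstrom repack shell, can be illustrated with a short Python sketch. This is not the Rosetta Packer and does not call Rosetta; `candidate_substitutions` and `repack_shell` are hypothetical names. The sketch assumes each residue is represented by a single coordinate (for example a C-beta position), because the text does not state how the shell distance is measured.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]

ALLOWED = "DEKRQNSTY"  # polar residues excluding histidine (from the text)
SHELL_RADIUS = 8.0     # Angstroms (from the text)


def candidate_substitutions(native: str) -> List[str]:
    """Allowed residues that differ from the native residue at a position."""
    return [aa for aa in ALLOWED if aa != native]


def repack_shell(coords: Sequence[Point], mutated: int,
                 radius: float = SHELL_RADIUS) -> List[int]:
    """Indices of residues within `radius` of the mutated residue.

    Assumption: one representative point per residue; these neighbors may
    change rotamer (repack) but keep their identity (no design).
    """
    center = coords[mutated]
    return [
        i for i, point in enumerate(coords)
        if i != mutated and math.dist(center, point) <= radius
    ]
```

A native leucine yields all nine candidates, while a native lysine yields eight, since substituting a residue with itself is not a mutation.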
Definition: “pCMV” refers to a pcDNA3.1-based expression vector.
Definition: “IgK signal peptide” refers to the amino acid sequence “METDTLLLWVLLLWVPGSTGD (SEQ ID NO: 48)” and “IgK-mini-FLAG” refers to the amino acid sequence “METDTLLLWVLLLWVPGSTGDYKDEK (SEQ ID NO: 49)”.
Definition: “His tag” refers to the amino acid sequence “HHHHHH (SEQ ID NO: 50)”.
Definition: “myc tag” refers to the amino acid sequence “EQKLISEEDL (SEQ ID NO: 51)”.
Unless otherwise specified, all constructs experimentally evaluated for secretion from a eukaryotic cell contain an IgK signal peptide or IgK-mini-FLAG at the amino terminus and a myc tag immediately followed by a His tag at the carboxy terminus.
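The default construct layout defined above can be expressed as a simple concatenation. In the Python sketch below the tag and signal sequences are copied from the definitions; the function name `build_construct` is hypothetical, and the absence of any linker residues between elements is an assumption, as the text states only the order of the elements.

```python
IGK_SIGNAL = "METDTLLLWVLLLWVPGSTGD"          # SEQ ID NO: 48
IGK_MINI_FLAG = "METDTLLLWVLLLWVPGSTGDYKDEK"  # SEQ ID NO: 49
HIS_TAG = "HHHHHH"                            # SEQ ID NO: 50
MYC_TAG = "EQKLISEEDL"                        # SEQ ID NO: 51


def build_construct(protein: str, leader: str = IGK_SIGNAL) -> str:
    """Leader at the amino terminus; myc tag then His tag at the carboxy terminus.

    Assumption: elements are joined directly, with no linker residues.
    """
    return leader + protein + MYC_TAG + HIS_TAG
```

Passing `leader=IGK_MINI_FLAG` gives the IgK-mini-FLAG form of the same construct.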
For I3-01, degreaser variants were generated by two-round PCR amplification. In brief, primers annealing to 5′ and 3′ regions of the multiple cloning site in the pCMV expression vector encoding I3-01 were designed to be universal. Then, for each variant, a primer was designed to incorporate the mutation(s) of interest. The first round of amplification generated a 100- to 200-base pair “megaprimer,” which was then used in a second round of amplification to generate a linear, double-stranded DNA fragment encoding the degreaser variant of interest. These mutation-bearing DNA sequences were ligated by Gibson assembly into PCR-linearized vector. All sequences were validated by forward and reverse sequencing reads upstream and downstream of the gene of interest, respectively.
For other degreaser variants, human codon-optimized sequences were synthesized by Genscript or IDT, then cloned into existing vectors by Gibson assembly. hMPV F proteins and degreased hMPV F protein variants were synthesized with the hMPV F native signal peptide rather than “IgK” or “IgK-mini-FLAG.”
Plasmids of pCMV harboring degreaser variants were transformed into NEB 5-alpha high-efficiency chemically competent cells per the manufacturer's instructions. Cultures were inoculated in TB or LB media containing suitable antibiotics. Plasmids were prepared with Qiagen Plasmid Miniprep kits according to the manufacturer's instructions.
Purified plasmids were transfected into HEK293F suspension cell culture using PEI, per the manufacturer's instructions. Cells were harvested three, four, or five days after transfection. Medium was separated from cells by centrifugation at 1,500×g.
Definition: “anti” refers to an antibody raised against a particular epitope; e.g. an “anti-myc” antibody binds to myc-tagged polypeptides.
Definition: “TBS” refers to Tris-buffered saline, and is pH 8.0 unless otherwise specified.
Cell and supernatant fractions were treated with 0.5% Triton X-100 containing >2.5 U/μL of Benzonase™ nuclease for 10 minutes at 37° C. Samples were then diluted for SDS-PAGE into 50 mM Tris pH 6.8, 2% SDS, 10% glycerol, and at least 1 mM DTT. Samples in SDS buffer were incubated at 95° C. for five minutes before being loaded onto pre-cast 4-20% Criterion™ gels (BIO-RAD). Gels were run at 250V for 26 minutes, then transferred onto nitrocellulose membranes (BIO-RAD) from a Trans-blot Turbo kit according to manufacturer's instructions. Transferred membranes were optionally stained with Ponceau™ S per manufacturer's instructions. Membranes were then blocked with 3% blotting-grade blocker (BIO-RAD) in TBS supplemented with 0.1% Tween-20. Anti-myc antibody, mouse monoclonal (Cell Signaling Technologies), was diluted 1 in 20,000 in the same blocking buffer and incubated with the membrane. After incubation and wash with TBS with 0.1% Tween-20, anti-mouse IgG HRP-conjugate was diluted 1 in 20,000, and StrepTactin™ anti-ladder was diluted 1 in 50,000 in fresh blocking buffer and incubated with the membrane. After incubation and wash, the membranes were visualized with Clarity ECL substrate (BIO-RAD) per the manufacturer's instructions on a BIO-RAD GelDoc™ Imager.
Western blotting was the main assay used to detect improvements to secretion levels. Comparing the ratio of secreted protein to total protein controlled for potential expression differences among variants, although those differences were minimal, likely because each variant carried only one mutation. Semi-quantitative measurements could be made using ImageJ software to analyze the raw blot images by densitometry. For each scaffold tested (that is, I3-01, O43-38 tetramer, O43-38 trimer, and I53-dn5A pentamer), at least one variant significantly (>50%) improved secretion yields. The degreased variant for a given scaffold is not necessarily the variant that had the highest dG_ins; in those cases, the poor secretion of the variants with the highest dG_ins could be due to destabilization of the protein or other unforeseen effects.
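The secreted-to-total comparison can be illustrated with a short calculation on densitometry values. The Python sketch below is a hedged example, not the analysis actually performed: it assumes that total protein is the sum of the supernatant and cell band intensities, and that improvement is expressed relative to the parent construct. The function names are hypothetical.

```python
def secreted_fraction(supernatant: float, cell: float) -> float:
    """Fraction of signal found in the supernatant band.

    Assumption: total protein = supernatant intensity + cell intensity.
    """
    total = supernatant + cell
    if total <= 0:
        raise ValueError("band intensities must sum to a positive value")
    return supernatant / total


def relative_improvement(variant: float, parent: float) -> float:
    """Fractional change of a variant's secreted fraction over the parent's."""
    if parent <= 0:
        raise ValueError("parent secreted fraction must be positive")
    return (variant - parent) / parent
```

Under these assumptions, a variant whose secreted fraction is 0.5 against a parent at 0.25 has a relative improvement of 1.0, which would exceed the >50% criterion mentioned above.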
After cell culture supernatant was filtered through a 0.45 μm filter, 40 μL of Ni-NTA slurry was added to 1 mL of supernatant. This mixture was incubated, then resin was sedimented by centrifugation. Three washes of increasing imidazole concentration (10 mM, 20 mM, and 50 mM) were used to remove unwanted contaminants. Finally, the protein of interest was eluted with 500 mM imidazole in a Tris buffer.
Later constructs were purified with Ni Excel™ Sepharose (GE Healthcare) according to manufacturer's instructions.
Purification of protein from cell culture supernatant also served to increase the concentration of protein in the samples analyzed. The transition was made from Ni-NTA resin to Ni Excel™ Sepharose after poor yields were obtained with Ni-NTA, which was attributed to EDTA present in cell culture media that may strip Ni ions from the resin.
Samples were prepared for negative stain EM by diluting to 0.05-0.075 mg/mL using a Tris-based buffer, and 6.0 μL was incubated on a glow-discharged, copper, carbon-coated grid for 1 min before quickly immersing the grid in a 60 μL drop of water. The water was blotted off within seconds by Whatman™ No. 1 filter paper, and the grid was immediately dipped into a 6.0 μL drop of stain (2% w/v uranyl formate). The stain was immediately blotted away and within seconds the grid was dipped into another 6.0 μL drop of stain, which was left on the grid for 30 seconds. At the end of this time, the stain was blotted away and the grid was allowed to air dry for 5 minutes prior to imaging. Images were recorded on a FEI Morgagni 268 transmission electron microscope equipped with a Gatan US4000 CCD camera, using Leginon™ software for data collection at a nominal magnification of 22,000× with a defocus range of −1 μm to −4 μm.
TEM was the primary assay used to determine preservation of original protein architecture, especially in the case of I3-01, as it is secreted as a full nanoparticle. This method was preferable to other assays as it is a fast and definitive readout for assembly versus no assembly. As shown in
For secreted individual nanoparticle components, assembly competency could not be directly assessed by TEM of those proteins, as was possible with I3-01. Therefore, purified components from cell culture supernatant were mixed at a 1:1 ratio with the appropriate second component in order to form nanoparticle assemblies. The second component was typically produced in bacterial culture as previously described. Assembly reactions were then purified by size-exclusion chromatography. Most variants demonstrated good assembly competency. An exception was the assembly of degreased O43-38 tetramer with degreased O43-38 trimer, indicating that the mutations made to both components of this architecture interfered with assembly of the nanoparticle.
Filtered supernatants containing degreaser variants were bound to Nunc MaxiSorp™ 96-well plates in a two-fold dilution series. Antibodies specific to a tag or known epitope of interest were first applied, followed by a secondary anti-human antibody conjugated to HRP. For nanoparticle proteins and degreased nanoparticle proteins, protein yield was determined colorimetrically using the substrate TMB and absorbances were collected at 450 nm. For hMPV F proteins and degreased hMPV F protein variants, protein yield was determined colorimetrically using the substrate ABTS and absorbances were collected at 405 nm.
Design and experimental evaluation of degreased hMPV F genetically fused to nanoparticle proteins. In addition to proteins in which cryptic transmembrane domains have been introduced by mutation, computational design, or directed evolution, some naturally occurring proteins also contain cryptic transmembrane domains. For example, many viral fusion glycoproteins have long stretches of hydrophobic amino acids that contain the “fusion peptides” the glycoproteins insert into host cell membranes during the membrane fusion process. One non-limiting example of such a protein is hMPV F (e.g., the Arg/2/02 isolate; Genbank ABD27846.1), which has three strongly predicted transmembrane domains at positions 103-125, 256-278, and 514-530. Only the region from residues 514-530 is known to traverse the viral membrane; residues 103-125 comprise the fusion peptide, while residues 256-278 have not been previously reported to interact with membranes. We used the degreaser to make several degreased variants of prefusion hMPV F (“115-BV”; Battles et al., Nat. Comm. 2017) and expressed them as genetic fusions to the nanoparticle components I53-50 and I53_dn5. We found that several of these variants, and in particular hMPV F-50A_14, which contains four degreaser mutations, secreted more efficiently from mammalian cells than corresponding non-degreased constructs. This non-limiting example demonstrates that the degreaser protocol may be used to improve the secretion of naturally occurring proteins that contain cryptic transmembrane domains.
Index refers to the amino acid position of the first amino acid in the potential transmembrane domain. Sequence refers to the sequence of the domain or variant at that position. dG_ins is the predicted transmembrane potential, with lower numbers more likely to be poorly secreting. Score is the Rosetta-calculated energy. ddG_ins and dscore are the differences in dG_ins and score relative to the unperturbed structure, respectively. Note that some variants were manually designed, and thus have no Rosetta score evaluated for them, though their dG_ins can still be calculated.
Table 2. A curated list of I53-dn5 pentamer degreaser variants that were experimentally characterized. The ddG_ins and dscore values for each variant in this Table and Table 1 (except the manually added ones) fit the aforementioned criteria, indicating that the program works as intended. Finally, the program may combine the top three individual candidate mutations with respect to ddG_ins, and if the triple mutant passes the defined score threshold, it is also reported as a degreaser variant.
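The combination step described above, in which the top three individual mutations by ddG_ins are merged and kept only if the triple mutant passes the score threshold, can be sketched in Python as follows. The names `Mutation`, `top_three`, `apply_mutations`, and `triple_mutant` are hypothetical. The callable `dscore_of` stands in for an external energy evaluation and is not a Rosetta API. Taking at most one mutation per position is an assumption, since three substitutions at the same residue could not coexist.

```python
from typing import Callable, List, NamedTuple, Optional


class Mutation(NamedTuple):
    position: int     # 1-based residue number
    new_residue: str  # one-letter code
    ddg_ins: float    # kcal/mol change relative to the unperturbed structure


def top_three(accepted: List[Mutation]) -> List[Mutation]:
    """Three highest-ddG_ins mutations, one per position (assumption)."""
    chosen: List[Mutation] = []
    seen = set()
    for m in sorted(accepted, key=lambda m: m.ddg_ins, reverse=True):
        if m.position not in seen:
            chosen.append(m)
            seen.add(m.position)
        if len(chosen) == 3:
            break
    return chosen


def apply_mutations(sequence: str, mutations: List[Mutation]) -> str:
    residues = list(sequence)
    for m in mutations:
        residues[m.position - 1] = m.new_residue
    return "".join(residues)


def triple_mutant(sequence: str, accepted: List[Mutation],
                  dscore_of: Callable[[str], float],
                  tolerance: float = 15.0) -> Optional[str]:
    """Return the triple-mutant sequence if it passes the score threshold."""
    picks = top_three(accepted)
    if len(picks) < 3:
        return None
    candidate = apply_mutations(sequence, picks)
    # dscore_of is a hypothetical stand-in for the structure energy evaluation.
    return candidate if dscore_of(candidate) <= tolerance else None
```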
From the foregoing, it will be appreciated that, although specific embodiments of the disclosure have been described herein for purposes of illustration, various modifications may be made without deviating from the spirit and scope of the invention. Accordingly, the invention is not limited except as by the appended claims.
This application claims priority to U.S. Provisional Application Ser. No. 62/977,036 filed Feb. 14, 2020, incorporated by reference herein in its entirety.
This invention was made with government support under Grant No. HDTRA1-18-1-0001, awarded by the Defense Threat Reduction Agency and Grant Nos. HHSN272201700059C and R01 GM120553, awarded by the National Institutes of Health. The government has certain rights in the invention.
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/US2021/017856 | 2/12/2021 | WO |
Number | Date | Country | |
---|---|---|---|
62977036 | Feb 2020 | US |