In the past few decades, discoveries about the causes of various human diseases at the molecular and cellular levels, combined with technical advances in genetic engineering and pharmaceutical sciences, have enabled treatment of many life-threatening illnesses by administering therapeutic proteins to patients. Depending on the target disease as well as the nature and mechanism of action of the therapeutic protein, delivery of the protein to a targeted tissue or organ site varies dramatically in its specific route, in protein stability or bioavailability, and therefore in its ultimate effectiveness, despite the significant effort that has been devoted to such research. To this date, effective delivery remains the major obstacle to achieving the desired therapeutic outcome by administration of therapeutic proteins. Thus, there exists an urgent need for developing new and improved strategies for delivering therapeutic proteins for the purpose of medical treatment. This invention addresses this and other related needs.
This invention provides a novel approach to achieve effective delivery of proteins of interest, e.g., therapeutic proteins. Thus, in a first aspect, this invention provides a modified Cry protein such as a polypeptide comprising the amino acid sequence shown in a modified SEQ ID NO:1, with two or more, three, four, five, six, seven, eight, nine, ten or more, or all of the amino acids at residues 391, 395, 423, 430, 432, 433, 436, 461, 462, 463, and 466 of SEQ ID NO:1 replaced with charged residues such as lysine or arginine, and the polypeptide forms crystals upon being expressed in a host cell. Optionally, at least one, two, or three amino acids at residues 533, 535, and 536 of SEQ ID NO:1 are further modified, for instance, substituted with alanine. Additionally, the polypeptide may comprise any three-domain Cry protein with a similar structure to Cry3Aa (SEQ ID NO:1), with two or more, three, four, five, six, seven, eight, nine, ten or more, or all of the aspartic acid and glutamic acid residues in the region corresponding to domain II of Cry3Aa replaced with lysine or arginine, and the polypeptide forms crystals upon being expressed in a host cell. This invention provides additional recombinant polypeptides for the same or similar uses by modifying other Cry proteins in the same or similar manner described above and herein. The aspartate and glutamate residues in other Cry proteins that can be modified (e.g., substituted with lysine or arginine) to achieve a similar charge profile are identified as follows:
In addition, the polypeptide of this invention further includes a fragment of a modified Cry protein (e.g., a modified Cry3Aa protein) generally corresponding to its domain II, such as the 389-471 fragment of SEQ ID NO:2, which the present inventors have discovered to be a soluble peptide that acts in a manner similar to that of a cell-penetrating peptide and therefore possesses the capability of effectively transporting its fusion partner, a protein of interest (e.g., a protein with a detectable label such as a fluorescent moiety or a biologically active protein such as a therapeutic protein), into target cells.
In some embodiments, the polypeptide of this invention is linked to a heterologous moiety. This moiety in some instances may be a peptide, such as a reporter protein (which is capable of generating a detectable signal, e.g., a fluorophore peptide such as mCherry), or a therapeutic protein (which is capable of conferring a therapeutic effect when administered), or a transcription factor (which is capable of modulating gene expression, e.g., OCT4 for cell reprogramming), or a gene-editing protein (which is capable of recognizing and cleaving specific DNA sequence, e.g., the RNA-guided CRISPR-associated (Cas) protein and Cre recombinase), or an enzyme (which is capable of recognizing and cleaving specific peptide sequence or structural element or synthetic substrate, such as sortase and cytosine deaminase), thus forming a fusion protein of the modified Cry protein (e.g., modified SEQ ID NO:1 such as SEQ ID NO:2 or a fragment thereof, for example, the 389-471 segment of SEQ ID NO:2) and the peptide, although in other instances the heterologous moiety may be a non-peptide moiety such as a detectable label (e.g., a composition comprising a radioisotope, fluorescent dye, or electron-dense reagent) or a solid substrate/support. In some embodiments, the polypeptide is a fusion protein comprising the modified SEQ ID NO:1 and the therapeutic protein, in which all amino acids at residues 391, 395, 423, 430, 432, 433, 436, 461, 462, 463, and 466 of SEQ ID NO:1 have been replaced with lysine. In some embodiments, the polypeptide as a fusion protein has additional modifications in SEQ ID NO:1, for example, mutations including insertions, deletions, or substitutions at one, optionally two or all three residues 533, 535, and 536. 
In some embodiments, the fusion protein includes the modified SEQ ID NO:1 comprising amino acids at residues 391, 395, 423, 430, 432, 433, 436, 461, 462, 463, and 466 having been replaced with lysine, and amino acids at residues 533, 535, and 536 having been replaced with alanine, plus a therapeutic protein, such as the p16 protein or the p53 protein. In some embodiments, the fusion protein includes a fragment of a modified Cry protein (e.g., the 389-471 segment of SEQ ID NO:2) and its fusion partner, a heterologous peptide, which may be a protein capable of exerting a detectable signal or a desired biological activity, exemplified above and herein.
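The substitution pattern recited above can be illustrated with a minimal sketch. It assumes 1-based residue numbering relative to the wild-type sequence and uses a hypothetical placeholder string in place of SEQ ID NO:1, which is not reproduced here; the names `LYSINE_POSITIONS`, `ALANINE_POSITIONS`, `substitute`, and `modified_cry` are illustrative only.

```python
# Residue positions quoted from the text (1-based, relative to SEQ ID NO:1).
LYSINE_POSITIONS = (391, 395, 423, 430, 432, 433, 436, 461, 462, 463, 466)
ALANINE_POSITIONS = (533, 535, 536)


def substitute(sequence: str, positions, new_residue: str) -> str:
    """Return a copy of `sequence` with `new_residue` at each 1-based position."""
    residues = list(sequence)
    for pos in positions:
        if not 1 <= pos <= len(residues):
            raise ValueError(f"position {pos} is outside the sequence")
        residues[pos - 1] = new_residue  # convert 1-based position to 0-based index
    return "".join(residues)


def modified_cry(sequence: str) -> str:
    """Apply the lysine substitutions, then the alanine substitutions."""
    with_lysines = substitute(sequence, LYSINE_POSITIONS, "K")
    return substitute(with_lysines, ALANINE_POSITIONS, "A")


# Hypothetical placeholder (poly-glycine), NOT the actual SEQ ID NO:1.
placeholder = "G" * 600
variant = modified_cry(placeholder)
print(variant.count("K"), variant.count("A"))
```

With the placeholder input, the variant carries eleven lysines and three alanines at the listed positions, and its length is unchanged because only substitutions are applied.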
In a second aspect, the present invention provides a polynucleotide sequence encoding the polypeptide or fusion protein of this invention as described above and herein. In some embodiments, the polynucleotide sequence is present in an expression cassette, which is typically a recombinantly produced nucleotide structure comprising a promoter (for example, a heterologous promoter) operably linked to the polynucleotide sequence encoding the polypeptide. In some embodiments, the expression cassette may be present in the form of a polynucleotide vector, such as a plasmid or a viral vector. In a related aspect, this invention provides a host cell comprising the polypeptide described above and herein, a host cell comprising the polynucleotide sequence encoding the polypeptide, and a host cell comprising the expression cassette or vector that contains the polynucleotide sequence encoding the polypeptide. In some cases, the host cell is a bacterial cell or one derived from a bacterium, especially a cell of a Bacillus sp. bacterium, such as Bacillus subtilis (Bs) or Bacillus thuringiensis (Bt) cell. In some embodiments, the bacterium is E. coli.
In a third aspect, the present invention provides a method for recombinantly producing the polypeptide or fusion protein of this invention. The method includes the steps of (i) introducing the polynucleotide sequence encoding the polypeptide of this invention as described above and herein into a host cell; and (ii) culturing the cell under conditions permissible for the expression of the polypeptide. The polynucleotide sequence encoding the polypeptide may be in the form of an expression cassette or a vector such as a plasmid. In some embodiments, the host cell expressing the polypeptide of this invention is a bacterial cell, especially of Bacillus sp. such as a Bacillus subtilis (Bs) cell or Bacillus thuringiensis (Bt) cell. Another bacterial strain, such as E. coli, may also be used. In some cases, the method of recombinantly producing the polypeptide further includes a step (iii) of purifying the polypeptide after it has been expressed by the host cell, for example, when the polypeptide is in the crystal form. Typically, the fusion protein assumes a crystalline form or crystalized form upon its expression within the host cells. It may be purified in the crystal form; or it may be purified and then solubilized if necessary.
In a fourth aspect, the present invention provides a composition comprising the polypeptide or fusion protein described above or herein and a mammalian cell. In some embodiments, the polypeptide is crystalized. In some embodiments, the fusion protein is a soluble protein comprising a segment of a modified Cry protein (e.g., the 389-471 segment of SEQ ID NO:2) and its fusion partner, a heterologous protein of desirable properties such as in detectability or a specific biological activity as described above and herein. In some embodiments, the mammalian cell is a cancer cell. In some embodiments, the mammalian cell is an epithelial cell, fibroblast cell, neuronal cell, or immune cell.
In a fifth aspect, the present invention provides a method for delivering an effector protein (such as a therapeutic protein) into a mammalian cell. The method includes the step of contacting the polypeptide or fusion protein of this invention as described above and herein with the mammalian cell, wherein the polypeptide is a fusion protein of a modified Cry protein (e.g., the modified SEQ ID NO:1) and the effector protein, and the polypeptide is crystalized. In the alternative, the fusion protein is a soluble protein comprising a segment of a modified Cry protein (e.g., the 389-471 segment of SEQ ID NO:2) and its fusion partner, a heterologous protein of desirable properties such as in the detectability or a specific biological activity as described above and herein, with the segment of modified Cry protein acting in a manner similar to that of a cell-penetrating peptide to facilitate and enhance the efficiency of transportation of the heterologous protein into target cells.
The term “Cry protein,” as used herein, refers to any one protein among a class of crystalline three-domain Cry proteins produced by strains of Bacillus thuringiensis (http://www.lifesci.sussex.ac.uk/home/Neil_Crickmore/Bt/). Some examples of “Cry proteins” include, but are not limited to, Cry1Aa, Cry2Aa, Cry3Aa, Cry4Aa, Cry5B, Cry7Ca1, Cry8Ea1, Cry10Aa, and Cry11Aa. Their amino acid sequences and polynucleotide coding sequences are known (set forth in SEQ ID NOs:1 and 4-11). Their GenBank Accession Numbers are:
In addition to the wild-type Cry proteins, the term “Cry protein” also encompasses functional variants, which (1) share an amino acid sequence identity of at least 80%, 81%, 82%, 83%, 84%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 99.5% to the polypeptide sequence of any one of the three-domain Cry proteins listed in http://www.lifesci.sussex.ac.uk/home/Neil_Crickmore/Bt/; and (2) retain the ability to spontaneously form crystals within host cells as can be confirmed by known methods such as electron micrograph (see description in, e.g., Park et al., Appl Environ Microbiol, 1998, 64, 3932-3938; Schnepf et al., Microbiol Mol Biol Rev, 1998, 62, 775-806; Whiteley and Schnepf, Annu Rev Microbiol, 1986, 40, 549-576; and Nair et al., PLoS One, 2015, 10, e0127669).
The term “nucleic acid” or “polynucleotide” refers to deoxyribonucleic acids (DNA) or ribonucleic acids (RNA) and polymers thereof in either single- or double-stranded form. Unless specifically limited, the term encompasses nucleic acids containing known analogues of natural nucleotides that have similar binding properties as the reference nucleic acid and are metabolized in a manner similar to naturally occurring nucleotides. Unless otherwise indicated, a particular nucleic acid sequence also implicitly encompasses conservatively modified variants thereof (e.g., degenerate codon substitutions), alleles, orthologs, SNPs, and complementary sequences as well as the sequence explicitly indicated. Specifically, degenerate codon substitutions may be achieved by generating sequences in which the third position of one or more selected (or all) codons is substituted with mixed-base and/or deoxyinosine residues (Batzer et al., Nucleic Acid Res. 19:5081 (1991); Ohtsuka et al., J. Biol. Chem. 260:2605-2608 (1985); and Rossolini et al., Mol. Cell. Probes 8:91-98 (1994)). The term nucleic acid is used interchangeably with gene, cDNA, and mRNA encoded by a gene.
The term “gene” means the segment of DNA involved in producing a polypeptide chain. It may include regions preceding and following the coding region (leader and trailer) as well as intervening sequences (introns) between individual coding segments (exons).
The term “amino acid” refers to naturally occurring and synthetic amino acids, as well as amino acid analogs and amino acid mimetics that function in a manner similar to the naturally occurring amino acids. Naturally occurring amino acids are those encoded by the genetic code, as well as those amino acids that are later modified, e.g., hydroxyproline, γ-carboxyglutamate, and O-phosphoserine. Amino acid analogs refers to compounds that have the same basic chemical structure as a naturally occurring amino acid, i.e., an α carbon that is bound to a hydrogen, a carboxyl group, an amino group, and an R group, e.g., homoserine, norleucine, methionine sulfoxide, methionine methyl sulfonium. Such analogs have modified R groups (e.g., norleucine) or modified peptide backbones, but retain the same basic chemical structure as a naturally occurring amino acid. “Amino acid mimetics” refers to chemical compounds having a structure that is different from the general chemical structure of an amino acid, but that functions in a manner similar to a naturally occurring amino acid.
There are various known methods in the art that permit the incorporation of an unnatural amino acid derivative or analog into a polypeptide chain in a site-specific manner, see, e.g., WO 02/086075.
Amino acids may be referred to herein by either the commonly known three letter symbols or by the one-letter symbols recommended by the IUPAC-IUB Biochemical Nomenclature Commission. Nucleotides, likewise, may be referred to by their commonly accepted single-letter codes.
“Conservatively modified variants” applies to both amino acid and nucleic acid sequences. With respect to particular nucleic acid sequences, “conservatively modified variants” refers to those nucleic acids that encode identical or essentially identical amino acid sequences, or where the nucleic acid does not encode an amino acid sequence, to essentially identical sequences. Because of the degeneracy of the genetic code, a large number of functionally identical nucleic acids encode any given protein. For instance, the codons GCA, GCC, GCG and GCU all encode the amino acid alanine. Thus, at every position where an alanine is specified by a codon, the codon can be altered to any of the corresponding codons described without altering the encoded polypeptide. Such nucleic acid variations are “silent variations,” which are one species of conservatively modified variations. Every nucleic acid sequence herein that encodes a polypeptide also describes every possible silent variation of the nucleic acid. One of skill will recognize that each codon in a nucleic acid (except AUG, which is ordinarily the only codon for methionine, and UGG, which is ordinarily the only codon for tryptophan) can be modified to yield a functionally identical molecule. Accordingly, each silent variation of a nucleic acid that encodes a polypeptide is implicit in each described sequence.
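The notion of a silent variation can be illustrated as follows. This is a minimal sketch that uses only a partial codon table covering the codons named above (in RNA notation, so tryptophan is written UGG); the function names and the two short example sequences are hypothetical.

```python
# Partial codon table (RNA alphabet): only the codons discussed in the text.
CODON_TABLE = {
    "GCA": "Ala", "GCC": "Ala", "GCG": "Ala", "GCU": "Ala",
    "AUG": "Met",  # ordinarily the only codon for methionine
    "UGG": "Trp",  # ordinarily the only codon for tryptophan (TGG in DNA notation)
}


def synonymous_codons(codon: str) -> list:
    """Return every codon in the table that encodes the same amino acid."""
    amino_acid = CODON_TABLE[codon]
    return sorted(c for c, aa in CODON_TABLE.items() if aa == amino_acid)


def translate(rna: str) -> list:
    """Translate an in-frame RNA string using the partial table."""
    return [CODON_TABLE[rna[i:i + 3]] for i in range(0, len(rna), 3)]


original = "AUGGCAUGG"  # Met-Ala-Trp (hypothetical example)
silent = "AUGGCCUGG"    # GCA -> GCC: a silent variation at the alanine codon

print(synonymous_codons("GCA"))
print(translate(original) == translate(silent))
```

The two example sequences differ at one nucleotide yet translate to the same peptide, whereas the methionine and tryptophan codons have no synonymous alternatives in the table.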
As to amino acid sequences, one of skill will recognize that individual substitutions, deletions or additions to a nucleic acid, peptide, polypeptide, or protein sequence which alters, adds or deletes a single amino acid or a small percentage of amino acids in the encoded sequence is a “conservatively modified variant” where the alteration results in the substitution of an amino acid with a chemically similar amino acid. Conservative substitution tables providing functionally similar amino acids are well known in the art. Such conservatively modified variants are in addition to and do not exclude polymorphic variants, interspecies homologs, and alleles of the invention.
The following eight groups each contain amino acids that are conservative substitutions for one another:
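The groups themselves are not reproduced in this text. As a sketch only, the example below assumes the eight-group scheme conventionally listed with this wording (an assumption, not taken from this document) and shows how a substitution could be checked against it; `CONSERVATIVE_GROUPS` and `is_conservative` are illustrative names.

```python
# ASSUMPTION: the eight groups conventionally used with this wording;
# the list is not given in the text itself. One-letter amino acid codes.
CONSERVATIVE_GROUPS = (
    frozenset("AG"),    # alanine, glycine
    frozenset("DE"),    # aspartic acid, glutamic acid
    frozenset("NQ"),    # asparagine, glutamine
    frozenset("RK"),    # arginine, lysine
    frozenset("ILMV"),  # isoleucine, leucine, methionine, valine
    frozenset("FYW"),   # phenylalanine, tyrosine, tryptophan
    frozenset("ST"),    # serine, threonine
    frozenset("CM"),    # cysteine, methionine
)


def is_conservative(original: str, replacement: str) -> bool:
    """True if the two different residues share at least one group."""
    if original == replacement:
        return False  # no substitution took place
    return any(original in g and replacement in g for g in CONSERVATIVE_GROUPS)


print(is_conservative("D", "E"))  # aspartate -> glutamate
print(is_conservative("K", "R"))  # lysine -> arginine
print(is_conservative("D", "K"))  # aspartate -> lysine
```

Under this assumed grouping, exchanging lysine for arginine is conservative, whereas replacing aspartate or glutamate with lysine is not, since the two residues fall in different groups.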
In the present application, amino acid residues are numbered according to their relative positions from the left most residue, which is numbered 1, in an unmodified wild-type polypeptide sequence.
As used herein, the terms “identical” or percent “identity,” in the context of describing two or more polynucleotide or amino acid sequences, refer to two or more sequences or subsequences that are the same or have a specified percentage of amino acid residues or nucleotides that are the same (for example, a Cry protein or a modified Cry protein sequence of this invention has at least 80% identity, preferably 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identity, to a reference sequence, e.g., the amino acid sequence of a corresponding wild-type Cry protein), when compared and aligned for maximum correspondence over a comparison window, or designated region as measured using one of the following sequence comparison algorithms or by manual alignment and visual inspection. Such sequences are then said to be “substantially identical.” With regard to polynucleotide sequences, this definition also refers to the complement of a test sequence. Preferably, the identity exists over a region that is at least about 50 amino acids or nucleotides in length, or more preferably over a region that is 75-100 amino acids or nucleotides in length.
For sequence comparison, typically one sequence acts as a reference sequence, to which test sequences are compared. When using a sequence comparison algorithm, test and reference sequences are entered into a computer, subsequence coordinates are designated, if necessary, and sequence algorithm program parameters are designated. Default program parameters can be used, or alternative parameters can be designated. The sequence comparison algorithm then calculates the percent sequence identities for the test sequences relative to the reference sequence, based on the program parameters. For sequence comparison of nucleic acids and proteins, the BLAST and BLAST 2.0 algorithms and the default parameters discussed below are used.
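Once two sequences have been aligned, a percent identity can be computed by counting matching positions. The sketch below is illustrative only: it assumes the alignment is supplied as two equal-length strings with "-" marking gaps, and it assumes one particular convention for the denominator (all aligned columns, with gap columns counted as non-matching); programs such as BLAST may apply a different convention. The example peptides are hypothetical.

```python
def percent_identity(aligned_test: str, aligned_reference: str) -> float:
    """Percent of aligned columns in which both sequences carry the same residue.

    ASSUMPTION: columns containing a gap count toward the denominator
    but never count as matches; columns that are gaps in both are skipped.
    """
    if len(aligned_test) != len(aligned_reference):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_test, aligned_reference):
        if a == "-" and b == "-":
            continue  # uninformative column
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


# Hypothetical eight-residue peptides differing at one position.
identity = percent_identity("MKTAYIAK", "MKTAHIAK")
print(identity, identity >= 80.0)
```

Here seven of eight positions match, so the test sequence would satisfy an "at least 80% identity" threshold relative to the reference.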
A “comparison window”, as used herein, includes reference to a segment of any one of the number of contiguous positions selected from the group consisting of from 20 to 600, usually about 50 to about 200, more usually about 100 to about 150 in which a sequence may be compared to a reference sequence of the same number of contiguous positions after the two sequences are optimally aligned. Methods of alignment of sequences for comparison are well-known in the art. Optimal alignment of sequences for comparison can be conducted, e.g., by the local homology algorithm of Smith & Waterman, Adv. Appl. Math. 2:482 (1981), by the homology alignment algorithm of Needleman & Wunsch, J. Mol. Biol. 48:443 (1970), by the search for similarity method of Pearson & Lipman, Proc. Nat'l. Acad. Sci. USA 85:2444 (1988), by computerized implementations of these algorithms (GAP, BESTFIT, FASTA, and TFASTA in the Wisconsin Genetics Software Package, Genetics Computer Group, 575 Science Dr., Madison, Wis.), or by manual alignment and visual inspection (see, e.g., Current Protocols in Molecular Biology (Ausubel et al., eds. 1995 supplement)).
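As an illustration of the global alignment approach of Needleman & Wunsch cited above, the following minimal sketch fills a dynamic-programming matrix and traces back one optimal alignment. The scoring values (match +1, mismatch -1, linear gap penalty -1) and the example peptides are assumptions chosen for readability; production tools use substitution matrices and affine gap penalties.

```python
def needleman_wunsch(a: str, b: str, match=1, mismatch=-1, gap=-1):
    """Return (score, aligned_a, aligned_b) for one optimal global alignment."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap  # prefix of a aligned entirely against gaps
    for j in range(1, m + 1):
        score[0][j] = j * gap  # prefix of b aligned entirely against gaps

    def pair(i, j):
        return match if a[i - 1] == b[j - 1] else mismatch

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + pair(i, j),  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,             # gap in b
                score[i][j - 1] + gap,             # gap in a
            )

    # Trace back from the bottom-right corner to recover one alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + pair(i, j):
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


best, top, bottom = needleman_wunsch("MKTAYIAK", "MKTAIAK")
print(best)
print(top)
print(bottom)
```

The second peptide lacks one residue, so the optimal alignment contains seven matches and a single gap.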
Examples of algorithms that are suitable for determining percent sequence identity and sequence similarity are the BLAST and BLAST 2.0 algorithms, which are described in Altschul et al., (1990) J. Mol. Biol. 215: 403-410 and Altschul et al. (1997) Nucleic Acids Res. 25: 3389-3402, respectively. Software for performing BLAST analyses is publicly available at the National Center for Biotechnology Information website, ncbi.nlm.nih.gov. The algorithm involves first identifying high scoring sequence pairs (HSPs) by identifying short words of length W in the query sequence, which either match or satisfy some positive-valued threshold score T when aligned with a word of the same length in a database sequence. T is referred to as the neighborhood word score threshold (Altschul et al., supra). These initial neighborhood word hits act as seeds for initiating searches to find longer HSPs containing them. The word hits are then extended in both directions along each sequence for as far as the cumulative alignment score can be increased. Cumulative scores are calculated using, for nucleotide sequences, the parameters M (reward score for a pair of matching residues; always >0) and N (penalty score for mismatching residues; always <0). For amino acid sequences, a scoring matrix is used to calculate the cumulative score. Extension of the word hits in each direction is halted when: the cumulative alignment score falls off by the quantity X from its maximum achieved value; the cumulative score goes to zero or below, due to the accumulation of one or more negative-scoring residue alignments; or the end of either sequence is reached. The BLAST algorithm parameters W, T, and X determine the sensitivity and speed of the alignment. The BLASTN program (for nucleotide sequences) uses as defaults a word size (W) of 28, an expectation (E) of 10, M=1, N=−2, and a comparison of both strands.
For amino acid sequences, the BLASTP program uses as defaults a word size (W) of 3, an expectation (E) of 10, and the BLOSUM62 scoring matrix (see Henikoff and Henikoff, Proc. Natl. Acad. Sci. USA 89:10915 (1992)).
The BLAST algorithm also performs a statistical analysis of the similarity between two sequences (see, e.g., Karlin and Altschul, Proc. Nat'l. Acad. Sci. USA 90:5873-5877 (1993)). One measure of similarity provided by the BLAST algorithm is the smallest sum probability (P(N)), which provides an indication of the probability by which a match between two nucleotide or amino acid sequences would occur by chance. For example, a nucleic acid is considered similar to a reference sequence if the smallest sum probability in a comparison of the test nucleic acid to the reference nucleic acid is less than about 0.2, more preferably less than about 0.01, and most preferably less than about 0.001.
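The seed-and-extend idea described above can be sketched for nucleotide sequences as follows. This is a deliberately simplified illustration and not the BLAST implementation: it seeds on exact word matches only (no neighborhood threshold T), performs ungapped extension with the reward M and penalty N, and halts on the X drop-off or the end of a sequence (the zero-score stopping rule is omitted). The word size of 4, the drop-off of 5, the function names, and the example sequences are all assumptions chosen to keep the example small.

```python
def find_seeds(query: str, subject: str, w: int) -> list:
    """Return (query_index, subject_index) pairs where a word of length w matches exactly."""
    index = {}
    for j in range(len(subject) - w + 1):
        index.setdefault(subject[j:j + w], []).append(j)
    seeds = []
    for i in range(len(query) - w + 1):
        for j in index.get(query[i:i + w], []):
            seeds.append((i, j))
    return seeds


def extend_one_way(query, subject, i, j, step, m, n, x):
    """Ungapped extension from (i, j) in direction `step`; return (best_gain, best_length)."""
    score = best = 0
    length = best_length = 0
    while 0 <= i < len(query) and 0 <= j < len(subject):
        score += m if query[i] == subject[j] else n
        length += 1
        if score > best:
            best, best_length = score, length
        elif best - score >= x:
            break  # score fell off by X from its maximum: stop extending
        i += step
        j += step
    return best, best_length


def extend_seed(query, subject, i, j, w, m=1, n=-2, x=5):
    """Extend a seed in both directions; return (score, matched query segment)."""
    right, right_len = extend_one_way(query, subject, i + w, j + w, +1, m, n, x)
    left, left_len = extend_one_way(query, subject, i - 1, j - 1, -1, m, n, x)
    return w * m + left + right, query[i - left_len:i + w + right_len]


query = "GGACGTACGG"    # hypothetical sequences
subject = "TTACGTACTT"
seeds = find_seeds(query, subject, w=4)
print(seeds)
print(extend_seed(query, subject, 2, 2, w=4))
```

In this example the seed at position (2, 2) extends two matching residues to the right and none to the left, giving a six-residue segment pair with a score of 6.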
An indication that two nucleic acid sequences or polypeptides are substantially identical is that the polypeptide encoded by the first nucleic acid is immunologically cross reactive with the antibodies raised against the polypeptide encoded by the second nucleic acid, as described below. Thus, a polypeptide is typically substantially identical to a second polypeptide, for example, where the two peptides differ only by conservative substitutions. Another indication that two nucleic acid sequences are substantially identical is that the two molecules or their complements hybridize to each other under stringent conditions, as described below. Yet another indication that two nucleic acid sequences are substantially identical is that the same primers can be used to amplify the sequence.
“Polypeptide,” “peptide,” and “protein” are used interchangeably herein to refer to a polymer of amino acid residues. All three terms apply to amino acid polymers in which one or more amino acid residue is an artificial chemical mimetic of a corresponding naturally occurring amino acid, as well as to naturally occurring amino acid polymers and non-naturally occurring amino acid polymers. As used herein, the terms encompass amino acid chains of any length, including full-length proteins, wherein the amino acid residues are linked by covalent peptide bonds.
The term “recombinant” when used with reference, e.g., to a cell, or a nucleic acid, protein, or vector, indicates that the cell, nucleic acid, protein or vector, has been modified by the introduction of a heterologous nucleic acid or protein or the alteration of a native nucleic acid or protein, or that the cell is derived from a cell so modified. Thus, for example, recombinant cells express genes that are not found within the native (non-recombinant) form of the cell or express native genes that are otherwise abnormally expressed, under expressed or not expressed at all.
A “promoter” is defined as an array of nucleic acid control sequences that direct transcription of a polynucleotide sequence. As used herein, a promoter includes necessary polynucleotide sequences near the start site of transcription, such as, in the case of a polymerase II type promoter, a TATA element. A promoter also optionally includes distal enhancer or repressor elements, which can be located as much as several thousand base pairs from the start site of transcription. A “constitutive” promoter is a promoter that is active under most environmental and developmental conditions. An “inducible” promoter is a promoter that is active under environmental or developmental regulation. The term “operably linked” refers to a functional linkage between a polynucleotide expression control sequence (such as a promoter, or array of transcription factor binding sites) and a second polynucleotide sequence, wherein the expression control sequence directs transcription of the polynucleotide sequence corresponding to the second sequence.
An “expression cassette” is a nucleic acid construct, generated recombinantly or synthetically, with a series of specified polynucleotide elements that permit transcription of a particular polynucleotide sequence in a host cell. An expression cassette may be part of a plasmid, viral genome, or nucleic acid fragment. Typically, an expression cassette includes a polynucleotide to be transcribed, operably linked to a promoter.
The term “heterologous” as used in the context of describing the relative location of two elements, refers to the two elements such as polynucleotide sequences (e.g., a promoter or a protein/polypeptide-encoding sequence) or polypeptide sequences (e.g., a modified Cry protein of this invention or a fusion protein comprising such a modified Cry protein) that are not naturally found in the same relative positions. Thus, a “heterologous promoter” of a gene refers to a promoter that is not naturally operably linked to that gene. Similarly, a “heterologous polypeptide” or “heterologous polynucleotide” to a modified Cry protein or its encoding sequence is one derived from an origin other than this particular Cry protein in the wild-type version, or one derived from the wild-type Cry protein but the fusion of a modified Cry protein (or its coding sequence) with a heterologous polypeptide (or polynucleotide sequence) does not result in a longer polypeptide or polynucleotide sequence that can be found naturally in the corresponding wild-type Cry protein (or its coding sequence).
A “label,” “detectable label,” or “detectable moiety” is a composition detectable by radiological, spectroscopic, photochemical, biochemical, immunochemical, chemical, or other physical means. For example, useful labels include radioisotopes such as 32P, fluorescent dyes, electron-dense reagents, enzymes (e.g., as commonly used in an ELISA), biotin, digoxigenin, or haptens and proteins that can be made detectable, e.g., by incorporating a radioactive component into a polypeptide or used to detect antibodies specifically reactive with the polypeptide. Typically a detectable label is a heterologous moiety attached to a probe or a molecule (e.g., a protein or nucleic acid) with defined binding characteristics (e.g., a polypeptide with a known binding specificity or a polynucleotide), so as to allow the presence of the probe/molecule (and therefore its binding target) to be readily detectable. The heterologous nature of the label ensures that it has an origin different from that of the probe or molecule that it labels, such that the probe/molecule attached with the detectable label does not constitute a naturally occurring composition.
A “host cell” is a cell that contains an expression vector and supports the replication or expression of the expression vector. Host cells may be prokaryotic cells such as E. coli or Bacillus thuringiensis, or eukaryotic cells such as yeast, insect, amphibian, or mammalian cells such as CHO, HeLa and the like, e.g., cultured cells, explants, and cells in vivo.
The term “about” as used herein denotes a range of +/−10% of a reference value. For example, “about 10” defines a range of 9 to 11.
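The arithmetic behind this definition can be written out as a short sketch. The function names are illustrative, and treating the endpoints as included in the range is an assumption, since the definition does not say whether the bounds are inclusive.

```python
def about_range(reference, tolerance_percent=10):
    """Return the (lower, upper) bounds of "about `reference`"."""
    delta = abs(reference) * tolerance_percent / 100
    return (reference - delta, reference + delta)


def is_about(value, reference) -> bool:
    """True if `value` lies within +/-10% of `reference` (bounds assumed inclusive)."""
    lower, upper = about_range(reference)
    return lower <= value <= upper


print(about_range(10))
print(is_about(9.5, 10), is_about(12, 10))
```

For a reference value of 10, the bounds are 9 and 11, matching the example given in the definition.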
There has been growing interest in devising new and more effective methods for administration of therapeutic proteins for the purpose of treating medical conditions and disorders. By generating modified Cry proteins capable of self-crystallization, and fragments thereof capable of enhancing protein uptake across the cell membrane, the present inventors have developed an innovative and effective strategy to deliver therapeutic proteins.
A. General Recombinant Technology
Basic texts disclosing general methods and techniques in the field of recombinant genetics include Sambrook and Russell, Molecular Cloning, A Laboratory Manual (3rd ed. 2001); Kriegler, Gene Transfer and Expression: A Laboratory Manual (1990); and Ausubel et al., eds., Current Protocols in Molecular Biology (1994).
For nucleic acids, sizes are given in either kilobases (kb) or base pairs (bp). These are estimates derived from agarose or acrylamide gel electrophoresis, from sequenced nucleic acids, or from published DNA sequences. For proteins, sizes are given in kilodaltons (kDa) or amino acid residue numbers. Protein sizes are estimated from gel electrophoresis, from sequenced proteins, from derived amino acid sequences, or from published protein sequences.
Oligonucleotides that are not commercially available can be chemically synthesized, e.g., according to the solid phase phosphoramidite triester method first described by Beaucage & Caruthers, Tetrahedron Lett. 22: 1859-1862 (1981), using an automated synthesizer, as described in Van Devanter et al., Nucleic Acids Res. 12: 6159-6168 (1984). Purification of oligonucleotides is performed using any art-recognized strategy, e.g., native acrylamide gel electrophoresis or anion-exchange HPLC as described in Pearson & Reanier, J. Chrom. 255: 137-149 (1983).
The sequence of a gene of interest, such as the polynucleotide sequence encoding a modified Cry protein or fusion protein thereof, and synthetic oligonucleotides can be verified after cloning or subcloning using, e.g., the chain termination method for sequencing double-stranded templates of Wallace et al., Gene 16: 21-26 (1981).
B. Coding Sequence for a Modified Cry Protein
Polynucleotide sequences encoding modified Cry proteins, their fragments, or fusion proteins of this invention can be readily constructed by modifying a wild-type Cry protein to obtain a variant or fragment and optionally combining the coding sequences for the fusion partners, such as a Cry3Aa protein and p53 or p16 protein. The sequences for Cry proteins and enzymes are generally known and may be obtained from a commercial supplier.
In addition to the use of full length wild-type Cry proteins for constructing the modified Cry proteins or fusion proteins of this invention, fragments of Cry proteins and/or variants of Cry proteins may also be useful. A DNA sequence encoding a Cry protein can be modified to generate fragments or variants of the Cry protein. So long as the fragments and variants retain the ability to spontaneously form crystals when expressed in a host cell, especially a Bacillus bacterial cell, they can be used for producing the fusion proteins and confer on the fusion proteins the ability to undergo spontaneous crystallization. In some cases, soluble fragments corresponding to domain II of a Cry protein (including a modified version) such as P3AP are developed for use in the enhanced delivery of proteins having desired activity or function to target cells. Typically, the variants bear a high percentage of sequence identity (e.g., at least 80, 85, 90, 95, 97, 98, 99% or higher) to the wild-type Cry protein sequence, whereas the fragments may be substantially shorter than the full length Cry protein, such as having some amino acids (e.g., 10-300 or 20-200 or 50-100 amino acids) removed from the N- or C-terminus of the full length Cry protein. For example, a useful Cry3Aa fragment may be as short as the first 290 amino acids from the N-terminus, encompassing Domain I of the protein. Other examples of such fragments include a Cry protein fragment having its first 57 amino acids from the N-terminus removed and a Cry protein fragment having its C-terminal 18 amino acids removed. The ability of a modified Cry protein or a fusion protein thereof to undergo spontaneous crystallization can be verified by electron microscopy, whereas the desired biological activity attributable to the fusion partner or the “cargo” protein (e.g., a cancer-suppressing protein such as p53 or p16) can be confirmed by established assays for each specific “cargo” protein.
Surprisingly, the present inventors discovered during their studies that the presence of a modified Cry protein having multiple lysines introduced into domain II in a fusion protein affords a significant increase in the cellular uptake and endosomal escape of the fusion protein, thus permitting more effective delivery of therapeutic proteins to target cells (e.g., cancer cells). The inventors further revealed that modification to the Cry protein in domain III is able to confer to the fusion protein significantly enhanced solubility, providing additional enhancement to therapeutic efficacy of the “cargo” protein.
In addition, the present inventors discovered that a fragment of a modified Cry3Aa protein, such as a fragment of Pos3Aa (SEQ ID NO:2) generally corresponding to domain II of the protein, i.e., starting at residue 374, 375, 376, 377, 378, 379, 380, 381, 382, 383, 384, 385, 386, 387, 388, 389, 390, 391, 392, 393, or 394 and ending at residue 466, 467, 468, 469, 470, 471, 472, 473, 474, 475, 476, 477, 478, 479, 480, 481, 482, 483, 484, 485, or 486 of SEQ ID NO:2, although a soluble peptide, is capable of efficiently crossing the cell membrane and therefore can be used to effectively deliver a protein of interest (e.g., a protein with desired biological function or activity) into target cells when the peptide is fused to the protein. P3AP, the 389-471 segment of SEQ ID NO:2, is an exemplary peptide of the present invention useful for such applications.
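The residue ranges above can be read as any pairing of a listed start residue with a listed end residue. The following Python sketch, offered only as an illustration, enumerates those pairings and extracts a fragment using 1-based, inclusive residue numbering. The function names are hypothetical, and the placeholder sequence is not SEQ ID NO:2; the actual sequence must be taken from the sequence listing.

```python
from itertools import product

START_RESIDUES = range(374, 395)  # 374..394 inclusive, as listed in the text
END_RESIDUES = range(466, 487)    # 466..486 inclusive, as listed in the text


def fragment(seq: str, start: int, end: int) -> str:
    """Return residues start..end of seq (1-based, inclusive numbering)."""
    if not (1 <= start <= end <= len(seq)):
        raise ValueError("residue range outside the sequence")
    return seq[start - 1:end]


def domain_ii_fragment_bounds():
    """All (start, end) pairings of the listed start and end residues."""
    return list(product(START_RESIDUES, END_RESIDUES))


if __name__ == "__main__":
    placeholder = "A" * 500  # stand-in only; NOT the real SEQ ID NO:2
    bounds = domain_ii_fragment_bounds()
    print(len(bounds))  # 21 start positions x 21 end positions
    p3ap_like = fragment(placeholder, 389, 471)  # bounds stated for P3AP
    print(len(p3ap_like))  # 471 - 389 + 1 residues
```

With the stated bounds, P3AP spans 83 residues, and it is one of 441 start/end pairings covered by the listed ranges.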
In some cases, a peptide linker or spacer is used between the coding sequences for the modified Cry protein or its fragment and its fusion partner, a heterologous protein or an effector protein. Such heterologous protein may be of any nature and any size, although in some cases it is one within the molecular weight range of about 2-200 kDa, or about 5-100 or 10-100 kDa, or about 15-75 kDa. In particular, proteins that are hard to produce (e.g., p16 protein) or prone to aggregation when produced in E. coli or any other protein production system (e.g., p53 protein) may still be suitable for use in this delivery method. One purpose of the linker is to ensure the proper reading frame for the fusion protein such that the coding sequences for both the modified Cry protein or fragment and the heterologous protein are in frame. Another purpose is to provide an appropriate spatial relationship between the modified Cry protein and the heterologous protein, such that each may retain its original functionality: the modified Cry protein is able to cause self-crystallization of the fusion protein, and the heterologous protein remains active in its desirable biological activity (e.g., cancer-suppressing capacity). Also, one or more linkers may be placed at the very beginning and/or the very end of the open reading frame, so as to facilitate proper start and termination of the coding sequence translation. Such linkage amino acid sequences are usually short and typically no longer than 100 or 50 amino acids, such as between 1 to 100, 1 or 2 to 50, 2 or 3 to 25, 3 or 4 to 10 amino acids.
C. Sequence Modification for Preferred Codon Usage in a Host Organism
The polynucleotide sequence encoding a modified Cry protein or a fragment thereof or fusion protein of this invention can be further altered to coincide with the preferred codon usage of a particular host. For example, the preferred codon usage of one strain of bacterial cells can be used to derive a polynucleotide that encodes a recombinant polypeptide of the invention and includes the codons favored by this strain. The frequency of preferred codon usage exhibited by a host cell can be calculated by averaging the frequency of preferred codon usage in a large number of genes expressed by the host cell (e.g., a calculation service is available from the web site of the Kazusa DNA Research Institute, Japan). This analysis is preferably limited to genes that are highly expressed by the host cell.
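The codon usage calculation described above can be sketched in Python as follows. This is a simplified illustration under stated assumptions: it uses the standard genetic code, pools codon counts across the supplied genes rather than averaging per-gene frequencies, and picks the single most frequent codon for each amino acid. The function names and the toy gene sequences are hypothetical; in practice the input would be a large set of highly expressed genes from the chosen host.

```python
from collections import Counter, defaultdict

# Standard genetic code, built from the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def codon_counts(genes):
    """Pool codon counts over in-frame coding sequences (assumed to start in frame)."""
    counts = Counter()
    for gene in genes:
        gene = gene.upper()
        for i in range(0, len(gene) - len(gene) % 3, 3):
            codon = gene[i:i + 3]
            if codon in CODON_TABLE:  # skip codons with ambiguous bases
                counts[codon] += 1
    return counts


def preferred_codons(genes):
    """Map each amino acid to its most frequent codon in the gene set."""
    by_amino_acid = defaultdict(dict)
    for codon, n in codon_counts(genes).items():
        by_amino_acid[CODON_TABLE[codon]][codon] = n
    return {aa: max(c, key=c.get) for aa, c in by_amino_acid.items() if aa != "*"}


def recode(protein, preferred):
    """Back-translate a protein using the preferred codon for each residue."""
    return "".join(preferred[aa] for aa in protein)


if __name__ == "__main__":
    toy_genes = [  # hypothetical sequences, for illustration only
        "ATG AAA AAA AAG GAA TAA".replace(" ", ""),
        "ATG AAA GAA GAA GAG TAA".replace(" ", ""),
    ]
    preferred = preferred_codons(toy_genes)
    print(recode("MKE", preferred))
```

Always choosing the single most frequent codon is only one possible strategy; sampling codons in proportion to host usage is a common alternative, and the text above does not prescribe either.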
At the completion of modification, the coding sequences are verified by sequencing and are then subcloned into an appropriate expression vector for recombinant production of a modified Cry protein or its fragment or a fusion protein thereof.
Following verification of the coding sequence, a modified Cry protein/its fragment or fusion protein of this invention can be produced using routine techniques in the field of recombinant genetics, relying on the polynucleotide sequences encoding the modified Cry protein/its fragment or fusion protein disclosed herein.
A. Expression Systems
To obtain high level expression of a nucleic acid encoding a recombinant protein of this invention, one typically subclones a polynucleotide encoding the protein in the correct reading frame into an expression vector that contains a strong promoter to direct transcription, a transcription/translation terminator and a ribosome binding site for translational initiation. Suitable bacterial promoters are well known in the art and described, e.g., in Sambrook and Russell, supra, and Ausubel et al., supra. Bacterial expression systems for expressing the polypeptide are available in, e.g., E. coli, Bacillus sp., Salmonella, and Caulobacter. Kits for such expression systems are commercially available.
The promoter used to direct expression of a heterologous nucleic acid depends on the particular application. The promoter is optionally positioned about the same distance from the heterologous transcription start site as it is from the transcription start site in its natural setting. As is known in the art, however, some variation in this distance can be accommodated without loss of promoter function. In some cases, a constitutive promoter is used, whereas in other cases an inducible promoter rather than a constitutive promoter is preferred.
In addition to the promoter, the expression vector typically includes a transcription unit or expression cassette that contains all the additional elements required for the expression of the polypeptide of this invention in host cells. A typical expression cassette thus contains a promoter operably linked to the nucleic acid sequence encoding the recombinant protein and signals required for efficient polyadenylation of the transcript, ribosome binding sites, and translation termination. The nucleic acid sequence encoding the recombinant protein may be linked to a cleavable signal peptide sequence to promote secretion of the polypeptide by the transformed cell. Such signal peptides include, among others, the signal peptides from tissue plasminogen activator, insulin, and neuron growth factor, and juvenile hormone esterase of Heliothis virescens. Additional elements of the cassette may include enhancers and, if genomic DNA is used as the structural gene, introns with functional splice donor and acceptor sites.
In addition to a promoter sequence, the expression cassette should also contain a transcription termination region downstream of the coding sequence to provide for efficient termination. The termination region may be obtained from the same gene as the promoter sequence or may be obtained from different genes.
The particular expression vector used to transport the genetic information into the cell is not particularly critical. Any of the conventional vectors used for expression in eukaryotic or prokaryotic cells may be used, especially those suitable for expression in cells of Bacillus sp. such as Bt and Bs. Standard bacterial expression vectors include plasmids such as pBR322 based plasmids, pSKF, pET23D, and fusion expression systems such as GST and LacZ.
The elements that are typically included in expression vectors also include a replicon that functions in bacteria such as Bacillus sp. and E. coli, a gene encoding antibiotic resistance to permit selection of bacteria that harbor recombinant plasmids, and unique restriction sites in nonessential regions of the plasmid to allow insertion of coding sequences. The particular antibiotic resistance gene chosen is not critical; any of the many resistance genes known in the art are suitable. Similar to antibiotic resistance selection markers, metabolic selection markers based on known metabolic pathways may also be used as a means for selecting transformed host cells.
B. Transfection Methods
Standard transfection methods are used to produce bacterial, mammalian, yeast, insect, or plant cell lines that express large quantities of a recombinant protein of this invention, which are then purified using standard techniques (see, e.g., Colley et al., J. Biol. Chem. 264: 17619-17622 (1989); Guide to Protein Purification, in Methods in Enzymology, vol. 182 (Deutscher, ed., 1990)). Transformation of eukaryotic and prokaryotic cells is performed according to standard techniques (see, e.g., Morrison, J. Bact. 132: 349-351 (1977); Clark-Curtiss & Curtiss, Methods in Enzymology 101: 347-362 (Wu et al., eds, 1983)).
Any of the well-known procedures for introducing foreign nucleotide sequences into host cells may be used. These include the use of calcium phosphate transfection, polybrene, protoplast fusion, electroporation, liposomes, microinjection, plasmid vectors, viral vectors and any of the other well-known methods for introducing cloned genomic DNA, cDNA, synthetic DNA, or other foreign genetic material into a host cell (see, e.g., Sambrook and Russell, supra). It is only necessary that the particular genetic engineering procedure used be capable of successfully introducing at least one gene into the host cell capable of expressing the recombinant protein of this invention.
C. Purification of Modified Cry Proteins and Fusion Proteins
Once the expression of a modified Cry protein or its fragment or a fusion protein thereof in transfected host cells is confirmed, e.g., via electron microscopy for detecting protein crystals or an immunoassay such as Western blotting analysis, the host cells are then cultured at an appropriate scale for the purpose of purifying the recombinant protein.
When the modified Cry proteins or fusion proteins of the present invention are produced recombinantly by transformed bacteria in large amounts, for example after promoter induction, the proteins are present in crystalline form or as insoluble aggregates within the host cells. Thus, one can readily isolate the crystals from the cell lysate based on their distinct density by utilizing techniques such as centrifugation and density gradient separation followed by one or more rinsing steps to further remove contaminants from the protein crystals.
There are several protocols that are suitable for purification of protein inclusion bodies. For example, purification of aggregate proteins (hereinafter referred to as inclusion bodies) typically involves the extraction, separation and/or purification of inclusion bodies by disruption of bacterial cells, e.g., by incubation in a buffer of about 100-150 μg/ml lysozyme and 0.1% Nonidet P40, a non-ionic detergent. The cell suspension can be ground using a Polytron grinder (Brinkman Instruments, Westbury, N.Y.). Alternatively, the cells can be sonicated on ice. Additional methods of lysing bacteria are described in Ausubel et al. and Sambrook and Russell, both supra, and will be apparent to those of skill in the art.
The cell suspension is generally centrifuged and the pellet containing the inclusion bodies resuspended in buffer which does not dissolve but washes the inclusion bodies, e.g., 20 mM Tris-HCl (pH 7.2), 1 mM EDTA, 150 mM NaCl and 2% Triton-X 100, a non-ionic detergent. It may be necessary to repeat the wash step to remove as much cellular debris as possible. The remaining pellet of inclusion bodies may be resuspended in an appropriate buffer (e.g., 20 mM sodium phosphate, pH 6.8, 150 mM NaCl). Other appropriate buffers will be apparent to those of skill in the art.
Following the washing step, the inclusion bodies are solubilized by the addition of a solvent that is both a strong hydrogen acceptor and a strong hydrogen donor (or a combination of solvents each having one of these properties). The proteins that formed the inclusion bodies may then be renatured by dilution or dialysis with a compatible buffer. Suitable solvents include, but are not limited to, urea (from about 4 M to about 8 M), formamide (at least about 80%, volume/volume basis), and guanidine hydrochloride (from about 4 M to about 8 M). Some solvents that are capable of solubilizing aggregate-forming proteins, such as SDS (sodium dodecyl sulfate) and 70% formic acid, may be inappropriate for use in this procedure due to the possibility of irreversible denaturation of the proteins, accompanied by a lack of immunogenicity and/or activity. Although guanidine hydrochloride and similar agents are denaturants, this denaturation is not irreversible and renaturation may occur upon removal (by dialysis, for example) or dilution of the denaturant, allowing re-formation of the immunologically and/or biologically active protein of interest. After solubilization, the protein can be separated from other bacterial proteins by standard separation techniques. For further description of purifying recombinant polypeptides from bacterial inclusion body, see, e.g., Patra et al., Protein Expression and Purification 18: 182-190 (2000).
While the Cry fusion protein crystals tend to remain insoluble at lower or neutral pHs, placing them in alkaline solutions with pH at or greater than 10 or 11 can often effectively dissolve the protein. Once dissolved, the protein can then be analyzed by gel separation (e.g., on an SDS gel) and immunoassays to confirm its identity based on the appropriate molecular weight and immunoreactivity.
D. Crosslinking Cry Fusion Proteins
Crosslinking is a commonly used technique for a broad range of goals, such as to stabilize protein tertiary and quaternary structure for analysis; to capture and identify unknown protein interactors or interaction domains; to conjugate an enzyme or tag to an antibody or other purified protein; to immobilize antibodies or other proteins for assays or affinity-purification; and to attach peptides to larger “carrier” proteins to facilitate handling/storage. The present inventors have observed that crosslinking tends to further enhance the desirable properties of the Cry fusion protein crystals such as thermostability and tolerance to organic solvents. Thus, in some cases there is a preference to further crosslink a Cry fusion protein upon its recombinant production and purification.
Despite the complexity of protein structure, including composition with 20 different amino acids, only a small number of protein functional groups comprise selectable targets for practical crosslinking methods. In fact, just four protein chemical targets account for the vast majority of crosslinking and chemical modification techniques: (1) primary amines (-NH2): this group exists at the N-terminus of each polypeptide chain and in the side chain of lysine (Lys, K) residues; (2) carboxyls (-COOH): this group exists at the C-terminus of each polypeptide chain and in the side chains of aspartic acid (Asp, D) and glutamic acid (Glu, E); (3) sulfhydryls (-SH): this group exists in the side chain of cysteine (Cys, C). Often, as part of a protein's secondary or tertiary structure, cysteines are joined together between their side chains via disulfide bonds (-S-S-); and (4) carbonyls (-CHO): these aldehyde groups can be created by oxidizing carbohydrate groups in glycoproteins. For each of these protein functional-group targets, there exist one to several types of reactive groups that are capable of targeting them and have been used as the basis for synthesizing crosslinking and modification reagents. Crosslinkers are selected on the basis of their chemical reactivities (i.e., specificity for particular functional groups) and other chemical properties that facilitate their use in different specific applications.
After a fusion protein of the present invention, e.g., a Pos3Aa-p53 fusion protein, is recombinantly produced in host cells (such as Bacillus subtilis cells or Bacillus thuringiensis cells) in a crystalline form and then properly purified, it can be chemically crosslinked to further enhance the protein's properties such as thermostability and tolerance to organic solvents. Well-known chemical crosslinking reagents can be used for this purpose in accordance with the established procedures. Some examples of suitable crosslinking reagents include glutaraldehyde, bis(sulfosuccinimidyl)suberate (BS3), and phenol-formaldehyde, as well as DSG (disuccinimidyl glutarate) for Lys-to-Lys cross-linking, Sulfo-EMCS (N-ε-maleimidocaproyl-oxysulfosuccinimide ester) for Lys-to-Cys cross-linking, and BMH (bismaleimidohexane) for Cys-to-Cys cross-linking.
The present invention also provides pharmaceutical compositions comprising an effective amount of a modified Cry fusion protein for achieving the intended effect by the biological activity attributable to the fusion partner or “cargo” protein of the modified Cry protein or a fragment thereof (e.g., P3AP), therefore useful in both prophylactic and therapeutic applications depending on the specific target disease and therapeutic protein. Pharmaceutical compositions of the invention are suitable for use in a variety of drug delivery systems. Suitable formulations for use in the present invention are found in Remington's Pharmaceutical Sciences, Mack Publishing Company, Philadelphia, Pa., 17th ed. (1985). For a brief review of methods for drug delivery, see, Langer, Science 249: 1527-1533 (1990).
The pharmaceutical compositions of the present invention can be administered by various routes, e.g., oral, subcutaneous, transdermal, transnasal, intramuscular, intravenous, or intraperitoneal. The routes of administering the pharmaceutical compositions include systemic or local delivery to a subject suffering from a target disease at daily doses of about 0.01-5000 mg, preferably 5-500 mg, of a Cry fusion protein for a 70 kg adult human. The appropriate dose may be administered in a single daily dose or as divided doses presented at appropriate intervals, for example as two, three, four, or more subdoses per day.
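The dose figures above are stated per day for a 70 kg adult. As an arithmetic illustration only, the Python sketch below converts such a daily dose to a per-kilogram value and splits it into subdoses. The equal split across subdoses and the function names are assumptions made for this sketch; the text does not specify how a daily dose is divided.

```python
REFERENCE_WEIGHT_KG = 70.0  # reference adult weight used in the text


def dose_per_kg(daily_dose_mg: float, weight_kg: float = REFERENCE_WEIGHT_KG) -> float:
    """Convert a daily dose in mg to mg per kg of body weight per day."""
    if weight_kg <= 0:
        raise ValueError("weight must be positive")
    return daily_dose_mg / weight_kg


def split_into_subdoses(daily_dose_mg: float, n_subdoses: int) -> list:
    """Divide a daily dose into n equal subdoses (equal split is an assumption)."""
    if n_subdoses < 1:
        raise ValueError("at least one subdose is required")
    return [daily_dose_mg / n_subdoses] * n_subdoses


if __name__ == "__main__":
    # Preferred range from the text: 5-500 mg per day for a 70 kg adult.
    for dose in (5.0, 500.0):
        print(dose, "mg/day is", round(dose_per_kg(dose), 3), "mg/kg/day")
    print(split_into_subdoses(500.0, 4))  # four subdoses of a 500 mg daily dose
```

On this arithmetic, the preferred range of 5-500 mg per day corresponds to roughly 0.07-7.1 mg/kg per day at the 70 kg reference weight.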
For preparing pharmaceutical compositions containing a Cry fusion protein, inert and pharmaceutically acceptable carriers are used. The pharmaceutical carrier can be either solid or liquid. Solid form preparations include, for example, powders, tablets, dispersible granules, capsules, cachets, and suppositories. A solid carrier can be one or more substances that can also act as diluents, flavoring agents, solubilizers, lubricants, suspending agents, binders, or tablet disintegrating agents; it can also be an encapsulating material.
In powders, the carrier is generally a finely divided solid that is in a mixture with the finely divided active component, e.g., a Cry fusion protein comprising a therapeutic protein such as p53 or p16. In tablets, the active ingredient (the Cry fusion protein) is mixed with the carrier having the necessary binding properties in suitable proportions and compacted in the shape and size desired.
For preparing pharmaceutical compositions in the form of suppositories, a low-melting wax such as a mixture of fatty acid glycerides and cocoa butter is first melted and the active ingredient is dispersed therein by, for example, stirring. The molten homogeneous mixture is then poured into convenient-sized molds and allowed to cool and solidify.
Powders and tablets preferably contain between about 5% to about 70% by weight of the active ingredient. Suitable carriers include, for example, magnesium carbonate, magnesium stearate, talc, lactose, sugar, pectin, dextrin, starch, tragacanth, methyl cellulose, sodium carboxymethyl cellulose, a low-melting wax, cocoa butter, and the like.
The pharmaceutical compositions can include the formulation of the active ingredient of a Cry fusion protein with encapsulating material as a carrier providing a capsule in which the recombinant polypeptide (with or without other carriers) is surrounded by the carrier, such that the carrier is thus in association with the polypeptide. In a similar manner, cachets can also be included. Tablets, powders, cachets, and capsules can be used as solid dosage forms suitable for oral administration.
Liquid pharmaceutical compositions include, for example, solutions suitable for oral or parenteral administration, suspensions, and emulsions suitable for oral administration. Sterile water solutions of the active component (e.g., a Cry fusion protein) or sterile solutions of the active component in solvents comprising water, buffered water, saline, PBS, ethanol, or propylene glycol are examples of liquid compositions suitable for parenteral administration. The compositions may contain pharmaceutically acceptable auxiliary substances as required to approximate physiological conditions, such as pH adjusting and buffering agents, tonicity adjusting agents, wetting agents, detergents, and the like.
Sterile solutions can be prepared by dissolving the active component (e.g., a Cry fusion protein) in the desired solvent system, and then passing the resulting solution through a membrane filter to sterilize it or, alternatively, by dissolving the sterile compound in a previously sterilized solvent under sterile conditions. The resulting aqueous solutions may be packaged for use as is, or lyophilized, the lyophilized preparation being combined with a sterile aqueous carrier prior to administration. The pH of the preparations typically will be between 3 and 11, more preferably from 5 to 9, and most preferably from 7 to 8.
The pharmaceutical compositions containing the Cry fusion protein can be administered for prophylactic and/or therapeutic treatments. In therapeutic applications, compositions are administered to a patient already suffering from a target condition/disease in an amount sufficient to prevent, cure, reverse, or at least partially slow or arrest the symptoms of the condition and its complications. An amount adequate to accomplish this is defined as a “therapeutically effective dose.” Amounts effective for this use will depend on the severity of the disease or condition and the weight and general state of the patient, but generally range from about 0.1 mg to about 2,000 mg of the recombinant polypeptide per day for a 70 kg patient, with dosages of from about 5 mg to about 500 mg of the recombinant polypeptide per day for a 70 kg patient being more commonly used.
In prophylactic applications, pharmaceutical compositions containing a Cry fusion protein are administered to a patient susceptible to or otherwise at risk of developing a target disease or disorder in an amount sufficient to delay or prevent the onset of the symptoms. Such an amount is defined to be a “prophylactically effective dose.” In this use, the precise amounts of the recombinant polypeptide again depend on the patient's state of health and weight, but generally range from about 0.1 mg to about 2,000 mg of the recombinant polypeptide for a 70 kg patient per day, more commonly from about 5 mg to about 500 mg for a 70 kg patient per day.
Single or multiple administrations of the compositions can be carried out with dose levels and pattern being selected by the treating physician. In any event, the pharmaceutical formulations should provide a quantity of a Cry fusion protein sufficient to effectively achieve the intended therapeutic effects, e.g., inhibit cancer cell proliferation, invasion and/or metastasis in the patient, either therapeutically or prophylactically.
The invention also provides kits for prophylactic or therapeutic applications by administering a Cry fusion protein according to the method of the present invention. The kits typically include a first container that contains a pharmaceutical composition having an effective amount of a Cry fusion protein, for example, having a therapeutic protein with anti-cancer activity as the fusion partner with a modified Cry protein or a fragment thereof, optionally with a second container containing a second therapeutically active agent, for example, another anti-cancer agent. In some cases, the kits will also include informational material containing instructions on how to dispense the pharmaceutical composition, including description of the type of patients who may be treated (e.g., a person suffering from a condition or disease suitable for treatment by the fusion partner or “cargo” protein in the Cry fusion protein of this invention), the schedule (e.g., administration dose and frequency), route of administration, and the like.
The following examples are provided by way of illustration only and not by way of limitation. Those of skill in the art will readily recognize a variety of non-critical parameters that could be changed or modified to yield essentially the same or similar results.
Proteins perform essential biological functions in cells, such as gene regulation, signal transduction and enzyme catalysis, making them potential candidates/targets for drug research and development. Highlighting the tremendous potential of the protein therapeutics market, seven of the top ten drugs sold globally in 2018 were monoclonal antibodies, with numerous other proteins being explored for the treatment of various diseases. While almost all currently approved protein-based drugs on the market act on extracellular targets, many diseases are caused by the dysfunction of intracellular proteins. The lack of protein-based therapeutics directed at intracellular targets in the clinic is due in large part to the instability of these therapeutics and their low efficiency of penetration into cells. Moreover, entrapment by endosomes and lysosomes after cell entry can dramatically hinder the efficacy of these proteins. To address these issues, multiple strategies, including cell membrane deformation, hypertonic buffer treatment, cell penetrating peptides and nano/micro-carriers, have been explored to mediate the internalization of protein therapeutics into mammalian cells. Considering the protection afforded to cargo proteins and the versatility for in vivo application, nano/micro-carriers are arguably one of the most promising approaches for protein delivery. Various materials, such as cationic polymers, lipids, inorganic materials and proteins, have been utilized to manufacture nano/micro-carriers with different chemical and physical properties. Compared to other materials, protein-based particles have distinct features well suited for intracellular drug delivery, including good biocompatibility, biodegradability, and ease of modification.
The development of a novel protein delivery platform based on Cry3Aa protein, which naturally forms sub-micrometer-sized protein crystals within the bacterium Bacillus thuringiensis (Bt), was previously reported. In mouse studies, the purified Cry3Aa crystals exhibited great biocompatibility and biodegradability, and minimal toxicity. More importantly, this platform appears to stabilize its cargo proteins in the form of Cry3Aa-cargo fusion protein crystals as demonstrated by the extended lifetime of the protein compared to free protein. It was further shown that these protein crystals could be specifically taken up by phagocytic macrophages. As proof of concept, an antimicrobial peptide has been encapsulated with high loading efficiency into Cry3Aa crystals and successfully delivered to macrophages for treating intracellular parasite infection in a mouse model of cutaneous leishmaniasis. To expand the application of this platform to non-phagocytic cells, a positively-charged mutant of Cry3Aa (Pos3Aa), which retains its crystal-forming ability in Bt cells, was identified. In this disclosure, the Pos3Aa protein crystals are demonstrated to be readily taken up by various types of non-phagocytic cells, such as cancer cells and fibroblasts, with high efficiency of endosomal escape. Successful intracellular delivery of a fluorescent mCherry protein and two tumor suppressor proteins p53 and p16 was achieved via this Pos3Aa platform. Significantly, Pos3Aa-mediated p53 and p16 delivery restored their anti-cancer activities in p53 or p16-deficient cancer cells, indicating that Pos3Aa-based protein crystals can be an effective platform for intracellular protein delivery.
Positively charged amino acids (lysine and arginine) have been shown to be crucial for the efficient cellular uptake of cell-penetrating peptides or supercharged proteins by mammalian cells. As mentioned above, it was discovered that wild-type Cry3Aa protein harbors a large proportion of negatively charged surface residues in domain II, which presumably correlates with the poor internalization of Cry3Aa protein crystals into non-phagocytic cells (
Pos3Aa protein crystals were produced in Spo− 407-OA Bt cells, purified by sucrose gradient centrifugation and characterized for the subsequent experiments. Interestingly, Pos3Aa crystals have a zeta potential value of −14.9 mV, indicating a near-neutral surface of these crystals instead of a positively charged one. One possible explanation, based on the structure of Pos3Aa crystals, is that the lysine clusters are regularly distributed over limited regions of the crystal surface due to the crystal packing, while other regions remain neutral or negatively charged (
To evaluate the cellular uptake efficiency of Pos3Aa protein crystals, Alexa488-labeled Pos3Aa crystals (Alexa488-Pos3Aa) were incubated with A549 cells, primary pulmonary fibroblasts (PPFs) and PC12 cells for 24 h and the internalization of crystals was examined using fluorescent confocal microscopy. As indicated in
Genetic fusion of cargo proteins to cell penetrating peptides (CPPs) is one of the most commonly used methods to mediate intracellular protein delivery. Despite the typically efficient endocytic uptake of CPPs, it appears that CPP-tagged proteins frequently fail to exhibit significant cellular activities. Arguably the major obstacle is inefficient endosomal escape, which is also a challenge for micro/nano-carriers, leading to the entrapment of delivered proteins in intracellular vesicles. To assess the ability of Pos3Aa protein crystals to escape from endo/lysosomes, A549 cells, PPFs and PC12 cells were incubated with Alexa488-Pos3Aa crystals for 24 hours and subsequently stained with LysoTracker Red DND-99 to visualize the endocytic vesicles. The co-localization analysis indicated that a large percentage of Pos3Aa crystals were localized in the cytoplasm, suggesting the occurrence of endosomal escape (
4. Intracellular Delivery of Fluorescent mCherry Protein
To evaluate the possibility of applying Pos3Aa crystals as a carrier for protein delivery, the ability of Pos3Aa to deliver a model cargo protein, mCherry, into mammalian cells was explored. mCherry protein was genetically fused to the C-terminus of Pos3Aa, and the corresponding Pos3Aa-mCherry fusion protein crystals were produced in Bt cells and purified for subsequent experiments. Two CPP-tagged mCherry proteins, TAT-mCherry and polyarginine(R9)-mCherry, were also produced for comparison. As shown in the confocal images, Pos3Aa-mCherry crystals showed highly efficient cellular uptake by A549 cells (FIG. 7A). The delivery efficiency was then quantified using flow cytometry. Cells treated with Pos3Aa-mCherry crystals exhibited a 40-fold change in the mean mCherry fluorescence intensity, whereas treatment with TAT-mCherry and R9-mCherry proteins only caused 1.25-fold and 1.7-fold changes, respectively (
Given the promising results of mCherry delivery, delivery of bioactive proteins that hold potential as valuable therapeutics was then tested. Transcription factors are a class of key regulatory proteins controlling eukaryotic gene expression in many biological processes, such as development and tumorigenesis. The dysfunction of transcription factors is a driver of numerous diseases, and these proteins are therefore considered to be attractive therapeutic targets. One typical example is p53 protein, whose encoding gene TP53 is found mutated or deleted in nearly half of human cancers. p53 is activated upon cellular stress signals, such as oncogenic stress and DNA damage, resulting in the expression of downstream genes involved in cell-cycle arrest, DNA repair and apoptosis. Restoring p53 function in cancer cells could be a potent alternative approach to cancer therapy, but the poor stability and low cell penetration efficiency of native p53 protein limit its direct intracellular delivery.
Herein, the successful delivery of p53 protein using the Pos3Aa platform is reported. Pos3Aa-p53 fusion protein was expressed in Bt Spo− 407OA cells, and the resultant Pos3Aa-p53 fusion protein crystals were purified by sucrose gradient centrifugation. Live cell imaging showed that Pos3Aa-p53 crystals can be readily internalized by the p53-deficient breast cancer MDA-MB-231 cells (
The anticancer activity of Pos3Aa-p53 protein crystals was then tested using an MTS assay. As indicated in
One consequence of p53 deficiency in cancer cells is their resistance to anticancer drugs that induce p53-dependent apoptosis. It was thus hypothesized that the restoration of p53 function by Pos3Aa-p53 crystals might increase the cellular sensitivity to these anticancer drugs. To test this, two classic drugs, fluorouracil (5-FU) and doxorubicin (Dox), were chosen as model compounds for the subsequent validation. It has been demonstrated that loss of p53 function significantly reduces the susceptibility of cancer cells to 5-FU, whereas a p53-independent mechanism is involved in Dox-induced cell death. It was therefore expected that Pos3Aa-p53 treatment would make the p53-deficient MDA-MB-231 cells more sensitive to 5-FU but not to Dox. A pre-treatment experiment was first performed. Cells were pretreated with 500 nM Pos3Aa-p53 crystals for 24 h. After that, cells with or without pretreatment were seeded into 96-well plates at the same seeding concentration and treated with graded doses of 5-FU or Dox. As shown in
Given that the release of cargo proteins from the Cry-cargo fusion protein crystals is based on the solubilization of these crystals, a triple mutant of Pos3Aa (T533A, G535A, D536A) protein (Pos3Aa™) was generated to improve the solubility of Pos3Aa-cargo protein crystals. When compared with Pos3Aa-mCherry crystals, Pos3Aa™-mCherry protein crystals exhibited better solubility (more crystals can be solubilized) (
This Pos3Aa triple mutant protein was then applied to deliver another tumor suppressor protein, p16. Frequent deletions or mutations of the INK4 gene, which encodes the cyclin-dependent kinase inhibitor p16 protein, have been reported in around half of human cancers. p16 protein contributes to the regulation of cell cycle progression by binding to CDK4/6, inhibiting cyclin D-CDK4/6 complex formation and CDK4/6-mediated phosphorylation of Rb family members. The resultant hypophosphorylated Rb family members bind to E2Fs, a family of transcription factors controlling proliferation-associated genes, prevent their nuclear import, and consequently lead to G1 cell cycle arrest. It was hypothesized that direct delivery of p16 protein into p16-deficient cancer cells might arrest the cells at G1 phase and thus inhibit cell growth. To evaluate this possibility, Pos3Aa™-mCherry-p16 (Pos3Aa™-mCh-p16) fusion protein crystals were produced in Bt cells and purified using sucrose gradient centrifugation. An mCherry-tagged p16 protein (mCh-p16) was produced at the same time for comparison. p16-Deficient squamous carcinoma UM-SCC-22A cells were incubated with Pos3Aa™-mCh-p16 crystals for 24, 48 and 96 h to assess the cellular uptake of those crystals. As indicated in
7. Intracellular Delivery of mCherry Protein by a Pos3Aa-Derived Peptide
The Pos3Aa crystal platform has been shown to offer advantages in the intracellular delivery of functional proteins and their sustained release. Some applications, such as delivery of Cas9 protein for gene editing, however, require transient and fast action of these bioactive proteins. Hence, inspired by the rapid action of cell-penetrating peptides, a peptide derived from domain II of Pos3Aa protein was identified and named P3AP (
The ability of cell-penetrating peptides to translocate through cell membranes is generally accompanied by cytotoxicity. To assess the cytotoxicity of P3AP peptide, MDA-MB-231 cells were incubated with different concentrations of mCherry, TAT-mCherry, R9-mCherry, and P3AP-mCherry for 72 h, and the cell viabilities were determined using MTS/PMS reagent. As indicated in
Proteins are potential candidates/targets for drug research and development. While almost all currently approved protein-based drugs on the market act on extracellular targets, many diseases are caused by the dysfunction of intracellular proteins. The instability and low cell penetration efficiency of proteins, however, limit the development of protein-based therapeutics. Moreover, entrapment by endo/lysosomes dramatically hinders the efficacy of protein therapeutics. To overcome these issues, the present inventors have developed a protein delivery platform based on an engineered Cry3Aa protein (Pos3Aa) with the following noted advantages. First, efficient endosomal escape: a large percentage of Pos3Aa protein crystals were localized in the cytoplasm after cellular uptake by A549 cells, primary fibroblasts (PPFs) and PC12 cells, indicating the occurrence of endosomal escape. Second, much more efficient cellular uptake than conventional cell-penetrating peptides: cells treated with Pos3Aa-mCherry crystals exhibited a 40-fold increase in the mean mCherry fluorescence intensity, whereas treatment with TAT-mCherry and R9-mCherry proteins caused only 1.25-fold and 1.7-fold changes, respectively. Third, the capability of delivering bio-functional proteins: two tumor suppressor proteins, p53 and p16, can be efficiently delivered into cancer cells by this Pos3Aa platform. Significantly, Pos3Aa-mediated p53 and p16 delivery restored their anti-cancer activities in p53/p16-deficient cancer cells. Lastly, intracellular delivery of proteins by a new Pos3Aa-derived peptide, P3AP: the latest results indicate that, in addition to Pos3Aa crystals, a P3AP peptide derived from domain II of Pos3Aa protein can be used to efficiently deliver proteins into cells with minimal cytotoxicity.
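The uptake comparison summarized above can be illustrated with a short calculation. The following is a minimal sketch, not the inventors' analysis pipeline: it assumes each reported fold change is the ratio of the mean mCherry fluorescence intensity of treated cells to that of a common reference population (the reference is not specified in this section), and the names `fold_change`, `reported_fold_change` and `relative_to_cpp` are hypothetical.

```python
# Fold changes in mean mCherry fluorescence intensity, as reported in the text.
reported_fold_change = {
    "Pos3Aa-mCherry": 40.0,
    "TAT-mCherry": 1.25,
    "R9-mCherry": 1.7,
}


def fold_change(treated_mfi, reference_mfi):
    """Ratio of treated to reference mean fluorescence intensity.

    Assumption: both values are measured on the same instrument settings.
    """
    if reference_mfi <= 0:
        raise ValueError("reference MFI must be positive")
    return treated_mfi / reference_mfi


# How many times larger the crystal-mediated change is than each CPP's change.
relative_to_cpp = {
    name: reported_fold_change["Pos3Aa-mCherry"] / value
    for name, value in reported_fold_change.items()
    if name != "Pos3Aa-mCherry"
}

for name, ratio in relative_to_cpp.items():
    print(f"Pos3Aa-mCherry vs {name}: {ratio:.1f}x")
```

Under this assumption, the reported values correspond to a change roughly 32 times that of TAT-mCherry and roughly 23.5 times that of R9-mCherry.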
All patents, patent applications, and other publications, including GenBank Accession Numbers, cited in this application are incorporated by reference in the entirety for all purposes.
DSIDQLPPETKKKPLKKGYSHQLNYVMCFLMQGSRGTIPVLTWTHKSVDFFNMIDSKKITQLPLVKAYKLQSGAS
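Because the charge profile of the engineered protein is central to this disclosure, a simple residue count for the sequence listed above may be useful. The sketch below is illustrative only: the identity of this sequence is not stated in this section, the helper name `charge_profile` is hypothetical, and the net value is a crude count of lysine and arginine minus aspartate and glutamate that ignores histidine, the termini, and pH-dependent effects.

```python
from collections import Counter

# Sequence copied verbatim from the line above.
SEQUENCE = (
    "DSIDQLPPETKKKPLKKGYSHQLNYVMCFLMQGSRGTIPVLTWTHKSVDFFNMIDSKKITQLPLVKAYKLQSGAS"
)


def charge_profile(sequence):
    """Count basic (K, R) and acidic (D, E) residues in a one-letter sequence."""
    counts = Counter(sequence.upper())
    basic = counts["K"] + counts["R"]
    acidic = counts["D"] + counts["E"]
    return {
        "length": len(sequence),
        "basic": basic,
        "acidic": acidic,
        "net": basic - acidic,  # simplified net charge estimate
    }


profile = charge_profile(SEQUENCE)
print(profile)
```

By this simplified count, the 75-residue sequence contains 11 basic and 5 acidic residues.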
This application claims priority to U.S. Patent Application No. 62/834,605, filed Apr. 16, 2019, the contents of which are hereby incorporated by reference in the entirety for all purposes.
Filing Document | Filing Date | Country | Kind
---|---|---|---
PCT/CN2020/084939 | 4/15/2020 | WO |

Number | Date | Country
---|---|---
62834605 | Apr 2019 | US