This invention relates to microbial consortia and their use in production of multi-protein complexes.
Protein purification is conducted routinely in areas ranging from biochemical characterization of cellular pathways (Goering et al., 2016; Lu et al., 2015; Shimizu and Ueda, 2010) to in vitro, cell-free assays (Caschera and Noireaux, 2016; Niederholtmeyer et al., 2015; Pardee et al., 2014; Takahashi et al., 2015; Tsuji et al., 2016). While the classical approach works well for the synthesis of one protein species, the preparation of multi-protein complexes, especially in the case of metabolic pathways (Lopez-Gallego and Schmidt-Dannert, 2010) and mRNA translation machinery (TraM) (Shimizu and Ueda, 2010), remains difficult due to the large number of protein species and the stringent requirements on protein ratios (Li et al., 2014; Matsubayashi and Ueda, 2014). TraM consists of 34 proteins: 11 IET proteins (3 Initiation factors, 4 Elongation factors, 3 Termination/Release factors and the Ribosome Recycling Factor) and 23 AAT proteins (tRNA-Amino acyl-transferases) (Shimizu and Ueda, 2010). Pure TraM proteins are traditionally prepared by purifying each protein individually, or a few proteins at a time, and then mixing them to assemble the functional TraM (Shimizu and Ueda, 2010; Wang et al., 2012).
There is a need in the art for new methods of providing the proteins required for in vitro translation. The present invention addresses these and other needs.
The present invention provides a microbial culture (referred to here as a microbial consortium) comprising a plurality of microbial strains, each strain comprising a different recombinant plasmid including a gene encoding a different protein involved in translation of mRNA, wherein the protein expression level of each protein is controlled to a pre-defined level, such that the proteins are capable of forming a multi-protein complex which translates an mRNA molecule into a polypeptide in a reaction mixture.
The amount of each protein can be determined by: (a) the density of the microbial strain in the culture; (b) the copy number of the plasmid comprising the gene encoding the protein; (c) the sequence of the ribosomal binding site in the gene encoding the protein; or (d) a combination of (a), (b) and (c). Each protein in the multi-protein complex may include a tag to facilitate isolation of the protein (e.g., a polyhistidine tag).
In a typical embodiment, each gene has the same promoter (e.g., a PT7/lacO hybrid promoter) and the microbial culture comprises E. coli. Each microbial strain may comprise a single plasmid including a gene encoding a protein involved in translation of mRNA. Alternatively, at least one strain comprises more than one plasmid including a gene encoding a protein involved in translation of mRNA.
The proteins in the multi-protein complex may comprise initiation factors, elongation factors, termination/release factors, a ribosome recycling factor and tRNA-Amino acyl-transferases. In some embodiments, the initiation factors are translational initiation factor 1, translational initiation factor 2, and translational initiation factor 3; the elongation factors are translational elongation factor G, translational elongation factor Tu, translational elongation factor Ts, and translational elongation factor 4; the termination/release factors are translational release factor 1, translational release factor 2, and translational release factor 3; and the tRNA-Amino acyl-transferases are Val-tRNA synthetase, Met-tRNA synthetase, Ile-tRNA synthetase, Thr-tRNA synthetase, Lys-tRNA synthetase, Glu-tRNA synthetase, Ala-tRNA synthetase, Asp-tRNA synthetase, Asn-tRNA synthetase, Leu-tRNA synthetase, Arg-tRNA synthetase, Cys-tRNA synthetase, Trp-tRNA synthetase, Phe-tRNA synthetase B, Pro-tRNA synthetase, Ser-tRNA synthetase, Phe-tRNA synthetase A, Gln-tRNA synthetase, Tyr-tRNA synthetase, Met-tRNA formyltransferase, Gly-tRNA synthetase B, His-tRNA synthetase, and Gly-tRNA synthetase A.
The invention also provides methods of making a multi-protein complex as described above. The methods comprise (a) providing a microbial culture comprising a plurality of microbial strains, each strain comprising a different recombinant plasmid including a gene encoding a different protein involved in translation of mRNA, wherein the protein expression level of each protein is controlled to a pre-defined level, such that the proteins are capable of forming a multi-protein complex; and (b) simultaneously isolating the proteins from the microbial culture, thereby forming the multi-protein complex.
The invention further provides methods of translating an mRNA molecule into a polypeptide. The methods comprise: (a) providing a microbial culture comprising a plurality of microbial strains, each strain comprising a different recombinant plasmid comprising a gene encoding a different protein involved in translation of mRNA, wherein the protein expression level of each protein is controlled to a pre-defined level, such that the proteins are capable of forming a multi-protein complex which translates an mRNA molecule into a polypeptide in a reaction mixture; (b) simultaneously isolating the proteins from the microbial culture, thereby forming the multi-protein complex; (c) forming a reaction mixture comprising the multi-protein complex, amino acids, ribosomes, and the mRNA molecule or a DNA molecule encoding the mRNA; (d) incubating the reaction mixture under conditions suitable for translation of the mRNA molecule into a polypeptide; and (e) isolating the polypeptide.
“Operably linked” indicates that two or more DNA segments are joined together such that they function in concert for their intended purposes. For example, coding sequences are operably linked to a promoter in the correct reading frame such that transcription initiates in the promoter and proceeds through the coding segment(s) to the terminator.
A “polynucleotide” is a single- or double-stranded polymer of deoxyribonucleotide or ribonucleotide bases typically read from the 5′ to the 3′ end. Polynucleotides include RNA and DNA, and may be isolated from natural sources, synthesized in vitro, or prepared from a combination of natural and synthetic molecules. When the term is applied to double-stranded molecules it is used to denote overall length and will be understood to be equivalent to the term “base pairs”.
A “polypeptide” or “protein” is a polymer of amino acid residues joined by peptide bonds, whether produced naturally or synthetically. Polypeptides of less than about 75 amino acid residues are also referred to here as peptides or oligopeptides.
The term “promoter” is used herein for its art-recognized meaning to denote a portion of a gene containing DNA sequences that provide for the binding of RNA polymerase and initiation of transcription of an operably linked coding sequence. Promoter sequences are typically found in the 5′ non-coding regions of genes.
The terms “identical” or percent “identity,” in the context of two or more nucleic acids or polypeptide sequences, (e.g., two proteins of the invention and polynucleotides that encode them) refer to two or more sequences or subsequences that are the same or have a specified percentage of amino acid residues or nucleotides that are the same, when compared and aligned for maximum correspondence, as measured using one of the following sequence comparison algorithms or by visual inspection.
The phrase “substantially identical,” in the context of two nucleic acids or polypeptides of the invention, refers to two or more sequences or subsequences that have at least 60%, 65%, 70%, 75%, 80%, or 90-95% nucleotide or amino acid residue identity, when compared and aligned for maximum correspondence, as measured using one of the following sequence comparison algorithms or by visual inspection. Preferably, the substantial identity exists over a region of the sequences that is at least about 50 residues in length, more preferably over a region of at least about 100 residues, and most preferably the sequences are substantially identical over at least about 150 residues. In a most preferred embodiment, the sequences are substantially identical over the entire length of the coding regions.
For sequence comparison, typically one sequence acts as a reference sequence, to which test sequences are compared. When using a sequence comparison algorithm, test and reference sequences are input into a computer, subsequence coordinates are designated, if necessary, and sequence algorithm program parameters are designated. The sequence comparison algorithm then calculates the percent sequence identity for the test sequence(s) relative to the reference sequence, based on the designated program parameters.
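As a purely illustrative sketch of the percent identity calculation described above, the following Python function counts identical residues over the columns of a pairwise alignment that has already been computed. The function name, the use of "-" as the gap symbol, and the choice of the number of aligned columns as the denominator are assumptions made for illustration; alignment programs differ in how they define the denominator.

```python
def percent_identity(aligned_ref: str, aligned_test: str) -> float:
    """Percent identity between two already-aligned sequences.

    Assumptions (illustrative only): "-" marks a gap, and the denominator
    is the number of alignment columns in which at least one sequence has
    a residue. A residue aligned to a gap counts as a non-identical column.
    """
    if len(aligned_ref) != len(aligned_test):
        raise ValueError("aligned sequences must have equal length")

    compared = 0
    matches = 0
    for ref_char, test_char in zip(aligned_ref, aligned_test):
        if ref_char == "-" and test_char == "-":
            continue  # column carries no information
        compared += 1
        if ref_char == test_char:
            matches += 1

    if compared == 0:
        raise ValueError("no aligned columns to compare")
    return 100.0 * matches / compared
```

For example, the aligned pair "ACGT-A" and "ACCTTA" has six columns, four of which are identical.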
Optimal alignment of sequences for comparison can be conducted, e.g., by the local homology algorithm of Smith & Waterman, Adv. Appl. Math. 2:482 (1981), by the homology alignment algorithm of Needleman & Wunsch, J. Mol. Biol. 48:443 (1970), by the search for similarity method of Pearson & Lipman, Proc. Nat'l. Acad. Sci. USA 85:2444 (1988), by computerized implementations of these algorithms (GAP, BESTFIT, FASTA, and TFASTA in the Wisconsin Genetics Software Package, Genetics Computer Group, 575 Science Dr., Madison, Wis.), or by visual inspection (see generally, Current Protocols in Molecular Biology, F. M. Ausubel et al., eds., Current Protocols, a joint venture between Greene Publishing Associates, Inc. and John Wiley & Sons, Inc., (1995 Supplement) (Ausubel)).
Examples of algorithms that are suitable for determining percent sequence identity and sequence similarity are the BLAST and BLAST 2.0 algorithms, which are described in Altschul et al. (1990) J. Mol. Biol. 215: 403-410 and Altschul et al. (1997) Nucleic Acids Res. 25: 3389-3402, respectively. Software for performing BLAST analyses is publicly available through the National Center for Biotechnology Information (http://www.ncbi.nlm.nih.gov/). This algorithm involves first identifying high scoring sequence pairs (HSPs) by identifying short words of length W in the query sequence, which either match or satisfy some positive-valued threshold score T when aligned with a word of the same length in a database sequence. T is referred to as the neighborhood word score threshold (Altschul et al., supra). These initial neighborhood word hits act as seeds for initiating searches to find longer HSPs containing them. The word hits are then extended in both directions along each sequence for as far as the cumulative alignment score can be increased. Cumulative scores are calculated using, for nucleotide sequences, the parameters M (reward score for a pair of matching residues; always >0) and N (penalty score for mismatching residues; always <0). For amino acid sequences, a scoring matrix is used to calculate the cumulative score. Extension of the word hits in each direction is halted when: the cumulative alignment score falls off by the quantity X from its maximum achieved value; the cumulative score goes to zero or below, due to the accumulation of one or more negative-scoring residue alignments; or the end of either sequence is reached. The BLAST algorithm parameters W, T, and X determine the sensitivity and speed of the alignment. The BLASTN program (for nucleotide sequences) uses as defaults a wordlength (W) of 11, an expectation (E) of 10, M=5, N=−4, and a comparison of both strands.
For amino acid sequences, the BLASTP program uses as defaults a wordlength (W) of 3, an expectation (E) of 10, and the BLOSUM62 scoring matrix (see Henikoff & Henikoff, Proc. Natl. Acad. Sci. USA 89:10915 (1992)).
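The word-hit extension step described above for nucleotide sequences can be illustrated with a minimal, ungapped sketch in Python. It is not the BLAST implementation; it only shows how a seed word hit may be extended in one direction using the reward M and the penalty N, and halted by the drop-off X, by a cumulative score of zero or below, or by the end of either sequence. The function name and the default value chosen for X are hypothetical; M=5 and N=−4 are the BLASTN defaults stated above.

```python
def extend_right(query, subject, q_start, s_start, word_len,
                 match=5, mismatch=-4, x_drop=20):
    """Extend a seed word hit to the right without gaps.

    Returns (best_score, best_length), where best_length counts residues
    from the start of the seed. x_drop=20 is a hypothetical value for X.
    """
    def pair_score(a, b):
        return match if a == b else mismatch

    # Score the seed word itself.
    score = sum(pair_score(query[q_start + k], subject[s_start + k])
                for k in range(word_len))
    best_score, best_len = score, word_len

    i, j = q_start + word_len, s_start + word_len
    while i < len(query) and j < len(subject):  # stop at the end of either sequence
        score += pair_score(query[i], subject[j])
        if score > best_score:
            best_score, best_len = score, i - q_start + 1
        elif score <= 0 or best_score - score >= x_drop:
            break  # score hit zero or fell off by X from its maximum
        i += 1
        j += 1
    return best_score, best_len
```

Extension to the left would follow the same logic with decreasing indices, and the two results would be combined to give the full high scoring pair.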
In addition to calculating percent sequence identity, the BLAST algorithm also performs a statistical analysis of the similarity between two sequences (see, e.g., Karlin & Altschul, Proc. Nat'l. Acad. Sci. USA 90:5873-5877 (1993)). One measure of similarity provided by the BLAST algorithm is the smallest sum probability (P(N)), which provides an indication of the probability by which a match between two nucleotide or amino acid sequences would occur by chance. For example, a nucleic acid is considered similar to a reference sequence if the smallest sum probability in a comparison of the test nucleic acid to the reference nucleic acid is less than about 0.1, more preferably less than about 0.01, and most preferably less than about 0.001.
A further indication that two nucleic acid sequences or polypeptides of the invention are substantially identical is that the polypeptide encoded by the first nucleic acid is immunologically cross reactive with the polypeptide encoded by the second nucleic acid, as described below. Thus, a polypeptide is typically substantially identical to a second polypeptide, for example, where the two peptides differ only by conservative substitutions. Another indication that two nucleic acid sequences are substantially identical is that the two molecules hybridize to each other under stringent conditions, as described below.
The present invention provides a new approach to produce a desired multi-protein complex (e.g., one useful for in vitro translation of mRNAs or TraM) by exploiting microbial consortia (i.e., associations of multiple strains of microorganisms living in a single culture). The invention is based on the design principle of distributing metabolic burden from protein synthesis across multiple microbial strains. Different bacterial strains are engineered to express distinct proteins in a single culture (referred to as TraM one shot or TraMOS). Subsequently, all the proteins are purified using a single affinity chromatography step.
As explained in detail below, the relative amount of each protein in the complex is regulated such that the complex efficiently produces the desired final product (e.g., a translated polypeptide in the case of TraMOS).
The proteins of the invention can be made using standard methods well known to those of skill in the art. Recombinant expression in a variety of microbial host cells, including E. coli, or other prokaryotic hosts is well known in the art.
Polynucleotides encoding the desired proteins in the complex, recombinant expression vectors, and host cells containing the recombinant expression vectors, as well as methods of making such vectors and host cells by recombinant methods are well known to those of skill in the art.
The polynucleotides may be synthesized or prepared by techniques well known in the art. Nucleotide sequences encoding the desired proteins may be synthesized, and/or cloned, and expressed according to techniques well known to those of ordinary skill in the art. In some embodiments, the polynucleotide sequences will be codon optimized for a particular recipient using standard methodologies. For example, a DNA construct encoding a protein can be codon optimized for expression in microbial hosts, e.g., bacteria.
Examples of useful bacteria include, but are not limited to, Escherichia, Enterobacter, Azotobacter, Erwinia, Bacillus, Pseudomonas, Klebsiella, Proteus, Salmonella, Serratia, Shigella, Rhizobia, Vitreoscilla, and Paracoccus. The nucleic acid encoding the desired protein is operably linked to appropriate expression control sequences for each host. For E. coli this includes a promoter such as the T7, trp, or lambda promoters, a ribosome binding site and preferably a transcription termination signal. The proteins may also be expressed in other cells, such as mammalian, insect, plant, or yeast cells.
Once expressed, the recombinant proteins can be purified according to standard procedures of the art, including ammonium sulfate precipitation, affinity columns, column chromatography, gel electrophoresis and the like. In a typical embodiment, the recombinantly produced proteins are expressed as a fusion protein that has a “tag” at one end which facilitates purification of the proteins. Suitable tags include affinity tags such as a polyhistidine tag which will bind to metal ions such as nickel or cobalt ions. Other suitable tags are known to those of skill in the art, and include, for example, epitope tags. Epitope tags are generally incorporated into recombinantly expressed proteins to enable the use of a readily available antibody to detect or isolate the protein.
The following examples are offered to illustrate, but not to limit the claimed invention.
E. coli Strain and Plasmids
The E. coli BL21 (DE3)-pLysS strain was used to construct the consortia that express fluorescent proteins. BL21 (DE3) was used to construct the consortia that express TraM proteins. Genomic DNA from E. coli MG1655 was prepared using the Wizard Genomic DNA Purification Kit (Promega). The pET15b (Novagen), pLysS (Novagen), and pSC101 (Manen and Caro, 1991) plasmids were used to create the new plasmids pIURAH, pIURCM and pIURKL, respectively (see Supplementary Information Section 2 for details). The three plasmids carry an NsiI/PacI cloning site downstream of a PT7/lacO hybrid promoter. pIURAH contains the AmpR/ColE1 replication origin and expresses lacI, pIURCM contains the CmR/p15A replication origin and expresses T7 lysozyme, and pIURKL contains KmR/pSC101 replication origin (
The CFP, GFP, mOrange and mCherry genes were amplified using specific primers that insert a 6x-His tag sequence at the C-terminus. The amplicons were cloned into XbaI/NcoI-digested pET15b plasmid using Gibson Assembly (New England Biolabs), yielding the C.CFP-, C.GFP-, C.mOrange- and C.mCherry-pET15b plasmids. mOrange was cloned into NsiI/PacI-digested pIURKL using Gibson Assembly (yielding C.mOrange-pIURKL). The RBS sequence of C.GFP-pET15b was modified by digesting the plasmid with XbaI/NcoI and inserting a PCR product (generated using primers that introduced a weaker RBS) by Gibson Assembly, to produce C.GFPweak-pET15b.
Analysis of the Consortia that Express Fluorescent Proteins
The plasmids expressing each fluorescent protein (C.CFP-, C.GFP-, C.GFPweak-, C.mOrange- and C.mCherry-pET15b) were independently transformed into BL21 (DE3)-pLysS. The resulting strains were AmpR/CmR. The C.CFP-, C.GFP-, and C.mCherry-pET15b plasmids were co-transformed with the unmodified pIURKL into BL21 (DE3)-pLysS. C.mOrange-pIURKL was co-transformed with the unmodified pET15b into BL21 (DE3)-pLysS cells. These strains (AmpR/CmR/KmR) were used to construct consortium L (
Premixed consortia were inoculated in triplicate at 1/250 dilution in 5 mL M9 media supplemented with 0.1% casamino acids, 0.1% glucose, and carbenicillin/chloramphenicol. After 2 hrs, cultures were induced with 1 mM IPTG for 6 hrs. Cells were collected and lysed in CelLytic B Buffer (Sigma Aldrich) supplemented with 0.02% v/v Benzonase (Novagen). Cell debris was removed by centrifugation (20,000×g for 15 min at 4° C.) and the supernatant was stored for purification. The supernatant was applied to 100 μL of Ni-NTA resin (Life Technologies) previously equilibrated with a binding buffer (50 mM Tris-HCl pH 7.5, 100 mM NaCl and 30 mM Imidazole). The resin was washed with 1 mL of wash buffer (binding buffer supplemented with 1% Tween 20) and 1 mL of binding buffer. Proteins were eluted in elution buffer (50 mM Tris-HCl pH 7.5, 100 mM NaCl and 250 mM Imidazole). Total protein concentration was quantified using the 660 nm Protein Assay (Thermo Scientific). Fluorescence intensities of CFP, GFP, mOrange, and mCherry were determined using a NanoQuant plate (Tecan) and an m1000Pro Infinite reader.
The 34 TraM genes (Table S2) were cloned from E. coli MG1655 genomic DNA, using specific primers to introduce either an N- or a C-terminal 6x-His tag, as well as NsiI and PacI restriction sites. The genes were amplified by PCR. C-terminally tagged TraM genes were reamplified using the proper forward primer and a universal reverse primer (TramCend_Cloner). All fragments were cloned using Gibson Assembly (New England Biolabs) into pIUR plasmids, which were digested with NsiI and PacI. All TraM genes were amplified using one set of primers except asnS-N (1425 bp), which was amplified using a primer set for base pairs 1-742 and another primer set for base pairs 716-1425. These two fragments were fused together in Gibson Assembly reactions to clone the full-length gene in pIUR plasmids. All positive clones were confirmed by DNA sequencing, and western blots were used to confirm the identity of the proteins expressed in IPTG-induced BL21 (DE3).
1Tg strains were created by simultaneous transformation of a pIURAH or pIURKL plasmid coding for a single TraM gene, plus unmodified pIURCM and unmodified pIURKL or pIURAH, accordingly, into BL21 (DE3) competent cells (Table S3). 2Tg strains were generated by co-transformation of pIURAH and pIURKL plasmids coding for TraM genes, plus unmodified pIURCM. Finally, 3Tg strains were created by co-transforming the three pIUR plasmids, each coding for a TraM gene. All strains were confirmed by expression of the target proteins, which were analyzed by western blot using an anti-His antibody. All strains were selected on LB-agar plates supplemented with the three antibiotics and stored as glycerol stocks.
To determine the growth rates of the 1Tg, 2Tg, and 3Tg strains, we first grew the strains overnight at 37° C. in LB supplemented with antibiotics. The overnight cultures were inoculated at 1/200 dilution into 96-well plates containing 200 μL of LB with antibiotics. The plate, covered with a plastic lid, was incubated for 1 hr at 37° C. in a plate reader with shaking cycles of 20 sec ON, 40 sec OFF, and then water or IPTG (0.5 mM final concentration) was added. OD600 was recorded over 8 hrs. Growth rates were calculated using the program GrowthRates (Hall et al., 2014).
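To illustrate how a growth rate can be obtained from OD600 readings such as those described above, the following minimal Python sketch fits a straight line to ln(OD600) versus time by ordinary least squares. It is not the algorithm of the GrowthRates program, whose internals are not described here; the function names are hypothetical, and the sketch assumes that the supplied readings already lie within the exponential growth phase.

```python
import math

def growth_rate_per_hour(times_h, od600):
    """Slope of ln(OD600) versus time (1/h), by ordinary least squares.

    Assumes all readings are positive and taken during exponential growth.
    """
    if len(times_h) != len(od600) or len(times_h) < 2:
        raise ValueError("need at least two paired time/OD600 readings")

    log_od = [math.log(value) for value in od600]
    n = len(times_h)
    mean_t = sum(times_h) / n
    mean_y = sum(log_od) / n
    sxx = sum((t - mean_t) ** 2 for t in times_h)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_h, log_od))
    return sxy / sxx

def doubling_time_h(rate_per_hour):
    """Doubling time in hours for a given exponential growth rate."""
    return math.log(2) / rate_per_hour
```

Comparing the fitted rates of water-treated and IPTG-treated wells of the same strain would give a simple measure of the growth burden imposed by induction.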
Buffers for purification of TraMOS proteins were prepared following previous work (Shimizu and Ueda, 2010), with slight modifications. Buffer A: 50 mM HEPES pH 7.5, 1 M Ammonium chloride, 10 mM Magnesium chloride; Buffer B: 50 mM HEPES pH 7.5, 500 mM Imidazole, 10 mM Magnesium chloride; Buffer HT: 50 mM HEPES pH 7.5, 100 mM potassium chloride, 10 mM Magnesium chloride, 7 mM 2-mercaptoethanol; Buffer HT+: 50 mM HEPES pH 7.5, 100 mM potassium chloride, 50 mM potassium glutamate, 10 mM Magnesium chloride, 7 mM 2-mercaptoethanol. 2-mercaptoethanol was freshly prepared before use in all cases.
The 1Tg strains coding for the 11 initiation, elongation, and termination factors were grown overnight at 37° C. in 3 mL of LB media supplemented with carbenicillin/chloramphenicol/kanamycin. Each strain was individually inoculated at 1/250 dilution in a flask containing 600 mL LB with antibiotics, and grown for 90 min at 37° C. before induction with 0.5 mM IPTG for 4 hrs. Cells were collected by centrifugation and stored at −80° C. overnight. The next day, each cell pellet was resuspended at 5 mL per g of cells in a binding buffer (Buffer A:Buffer B 97.5:2.5 with 7 mM 2-mercaptoethanol). Cells were lysed by sonication and cell debris was removed by centrifugation (20,000×g, 15 min, 4° C.). The supernatant was applied to a 1 mL HisTrap FF column (GE Healthcare Life Sciences) previously equilibrated with 10 volumes of the binding buffer. Each column was washed with 10 volumes of the binding buffer and 10 volumes of a wash buffer (Buffer A:Buffer B 95:5 plus 7 mM 2-mercaptoethanol), and then eluted with 7 mL of an elution buffer (Buffer A:Buffer B 20:80 plus 7 mM 2-mercaptoethanol). Each elution fraction was dialyzed for 6 hrs against Buffer HT, followed by overnight dialysis against Buffer HT supplemented with 20% glycerol. Proteins were then concentrated by ultrafiltration using Amicon Ultra-4 Centrifugal Filter Units 3,000 MWCO (EMD Millipore). The protein concentration of each factor was determined using the 660 nm Protein Assay. Control IET was prepared by combining all the factors at the concentrations shown in Table S6. Control AAT is a mixture of all the tRNA-amino acyl transferases from E. coli (Sigma Aldrich).
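The Buffer A:Buffer B ratios given above determine the imidazole and ammonium chloride concentrations of the binding, wash, and elution buffers. The following Python sketch works through that arithmetic. It assumes, as listed for the buffer recipes, that Buffer A contains 1 M ammonium chloride and no imidazole, that Buffer B contains 500 mM imidazole and no ammonium chloride, and that the small volume of 2-mercaptoethanol added can be neglected; the function and variable names are hypothetical.

```python
# Components that differ between the two stock buffers (mM).
BUFFER_A = {"imidazole_mM": 0.0, "ammonium_chloride_mM": 1000.0}
BUFFER_B = {"imidazole_mM": 500.0, "ammonium_chloride_mM": 0.0}

def blend(parts_a, parts_b):
    """Concentrations after mixing Buffer A and Buffer B by volume parts."""
    total = parts_a + parts_b
    if total <= 0:
        raise ValueError("total parts must be positive")
    return {
        name: (BUFFER_A[name] * parts_a + BUFFER_B[name] * parts_b) / total
        for name in BUFFER_A
    }

binding = blend(97.5, 2.5)   # 12.5 mM imidazole, 975 mM ammonium chloride
wash = blend(95.0, 5.0)      # 25 mM imidazole, 950 mM ammonium chloride
elution = blend(20.0, 80.0)  # 400 mM imidazole, 200 mM ammonium chloride
```

HEPES (50 mM) and magnesium chloride (10 mM) are the same in both stock buffers, so their concentrations are unchanged by blending.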
Each strain required to establish a consortium was grown overnight from glycerol stocks in LB media supplemented with the antibiotics at 37° C. Details on the design of the strains and establishment of consortia are described in Supplementary Information, Section 3. The overnight cultures were used to establish consortia by mixing the strains at the indicated ratios (each ratio represents the percentage of the strain in the total volume of the mix). The consortia were then inoculated 1/500 into 600 mL LB with antibiotics and grown for 90 minutes before induction for 4 hrs with 0.5 mM IPTG, except the 15-strain consortia, which were inoculated 1/200, grown for 90 minutes and induced for 5 hrs with 0.5 mM IPTG. TraM proteins from the cultures were purified as described above, with the exception that the final overnight dialysis step was performed against Buffer HT+. Protein identification and quantification were performed by the Proteomics Core Facility, Genome Center at University of California, Davis. Samples were digested with trypsin, and peptides were analyzed using Q-Exactive liquid chromatography tandem mass spectrometry (LC-MS/MS). Results were analyzed using X!tandem against a customized database that includes the total BL21 (DE3) and the 6x-His-tagged TraM proteins.
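The mixing and inoculation arithmetic described above can be sketched in Python as follows. The strain names, the percentages, and the total mix volume used in the usage note are hypothetical placeholders and are not the ratios of any consortium described herein; the only assumptions taken from the text are that each ratio is the percentage of a strain's overnight culture in the total volume of the mix and that the mix is diluted by the stated factor into the culture volume.

```python
def mix_volumes_mL(percent_by_strain, total_mix_mL):
    """Volume of each overnight culture needed for a premixed consortium.

    percent_by_strain maps a strain name to its percentage of the total
    mix volume; the percentages must sum to 100.
    """
    total_percent = sum(percent_by_strain.values())
    if abs(total_percent - 100.0) > 1e-6:
        raise ValueError("strain percentages must sum to 100")
    return {strain: total_mix_mL * percent / 100.0
            for strain, percent in percent_by_strain.items()}

def inoculum_mL(culture_mL, dilution_factor):
    """Volume of premixed consortium for a 1/dilution_factor inoculation."""
    return culture_mL / dilution_factor
```

For example, a hypothetical three-strain mix of 50%, 30% and 20% prepared in 2 mL would use 1.0, 0.6 and 0.4 mL of the respective overnight cultures, and a 1/500 inoculation into 600 mL corresponds to 1.2 mL of the mix.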
Proteins were separated by SDS-PAGE using 8-16% Mini-PROTEAN TGX precast gels (Bio-Rad). For western blot, proteins were transferred to nitrocellulose membranes using the Trans-Blot Turbo Transfer System (Bio-Rad). For the quantification of total protein amount, gels were stained using Coomassie Brilliant Blue Electrophoresis Gel Stain (G-Biosciences). Nitrocellulose membranes were stained using Ponceau-S Membrane Stain (G-Biosciences), imaged and subsequently blocked with 5% non-fat dry milk in TBS-T buffer (TBS plus 0.1% Tween-20). Membranes were exposed to either Mouse Anti-6x-His Epitope Tag HIS.H8 or Rat Anti-FLAG Epitope Tag L5 to detect His-tagged or FLAG-tagged proteins, respectively. Following washes with TBS-T plus 0.1% BSA, membranes were exposed to HRP-conjugated secondary antibodies Goat anti-Mouse IgG or Goat anti-Rat IgG for His-tagged or FLAG-tagged proteins, respectively. Membranes were developed using Clarity Western ECL Substrate (Bio-Rad). Gels and membranes were imaged using a PXi Imaging system (Syngene).
Overnight cultures of the BL21 (DE3) strain were diluted 1:1000 in fresh LB containing 0.4 mM IPTG. After growing at 30° C. for 6 h, bacteria were collected and washed twice with PBS (4,000×g, 10 min, 4° C.). The bacterial pellet was resuspended in sonication buffer (10 mM Tris-acetate pH 7.6, 14 mM Magnesium acetate, 60 mM Potassium gluconate, 1 mM DTT) to a final concentration of 1 g/mL. The resuspended bacteria were lysed by sonication. Cell lysates were centrifuged at 12,000×g for 20 min at 4° C. The supernatant was incubated at 37° C. for 30 minutes. The resulting whole cell extract (WCE) was aliquoted and stored at −80° C.
The 2× reaction buffer contained amino acid mix 110 mM (each amino acid 5.4 mM), tRNA (Roche) 108 UA260/mL, ATP 7.5 mM, GTP 5 mM, CTP 2.5 mM, UTP 2.5 mM, Creatine phosphate 100 mM, Folinic acid 60 μg/mL, HEPES-KOH pH 7.6 100 mM, Potassium glutamate 700 mM, Magnesium Acetate 36 mM, Spermidine 2 mM, DTT 10 mM, BSA 1 mg/mL, Creatine Kinase (Roche) 162 μg/mL, Myokinase (Sigma Aldrich) 100 μg/mL, Diphosphonucleotide Kinase (Sigma Aldrich) 8.16 μg/mL, T7 RNAP (New England Biolabs) 400 U/μL, RNAse inhibitor (New England Biolabs) 0.8 U/μL. The amino acid mixture was prepared as described in a previous work (Caschera and Noireaux, 2015). Reactions (final volume 5 μL) were established by combining 2× reaction buffer, cell-free systems, 1.3 μM ribosomes (New England Biolabs), and 2-5 ng of plasmid DNA. When reactions were conducted using the S12 WCE, T7 RNAP was not included in the 2× reaction buffer, and ribosomes were not added. After mixing, reactions were incubated for 4 h at 37° C., and measured using the NanoQuant plate as described above.
In vitro transcription/translation reactions (final volume 5 μL) were performed using either C.mCherry-pET15b or WTCHGSN-pET15b plasmids (Supplementary Information Section 4) and incubated for 3 hrs at 37° C. Next, the reactions were supplemented with 2 μl of PBS and either 1 μl of FITC-Casein (AnaSpec)+1 μl of PBS or 1 μl of FITC-Casein+1 μl of papain (Sigma Aldrich). Final concentration of FITC-Casein was 0.04 μg/μL. Final concentration of papain was 0.4 ng/μL. Each reaction was allowed to proceed for 2 hrs at 37° C. and measured for FITC fluorescence intensities using NanoQuant plate as described above. Data was normalized using the fluorescence intensities of the control (FITC-Casein in PBS without CFS). Reactions in 384-well plates were set up in a similar way, except that plates were covered with film and placed in the plate reader to measure FITC-fluorescence intensities using a 2 h kinetic cycle at 37° C. with measurement at every 5 min. Fluorescence data was normalized using the data at time=0. For the screening of the library (details in Supplementary Information Section 4), different plasmids were added to replicate wells.
Details of the models are described in Supplementary Information, Sections 1 (fluorescent protein consortia) and 5 (TraMOS predictive model). Code was written in MATLAB. All statistical analyses were performed using GraphPad Prism 5.0 software.
The preparation of multi-protein complexes requires tight control over the expression level of each protein in the consortium, in order to match their working concentrations in the final product. For coarse-grained regulation of protein amount, the cell number of each bacterial strain is controlled through its relative density in the consortium. For fine-grained regulation of protein amount, transcription and translation levels are controlled using synthetic genetic constructs. To simplify the genetic constructs, we use a single regulatory circuit based on the PT7/lacO hybrid promoter to activate protein expression by T7 RNAP and inhibit it by LacI. In addition, the transcription rate is controlled using plasmids with different copy numbers, whereas the translation rate is modulated by altering the ribosomal binding site (RBS) sequence of the target gene.
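To make the interplay of the three control mechanisms described above concrete, the following Python sketch uses a deliberately simplified proportional model. It is not the mathematical model of the Supplementary Information. It assumes, purely for illustration, that the output of one cell is proportional to the product of plasmid copy number and relative RBS strength, and that the amount of each protein in the consortium is proportional to that per-cell output multiplied by the relative density of the strain. All names and numerical values are hypothetical.

```python
def per_cell_output(copy_number, rbs_strength):
    """Relative protein output of one cell (assumed proportional model)."""
    return copy_number * rbs_strength

def density_fractions(target_amounts, per_cell):
    """Relative strain densities that would give the target protein ratios.

    Under the assumed model, amount_i is proportional to
    fraction_i * per_cell_i, so fraction_i is proportional to
    target_i / per_cell_i, normalized so the fractions sum to 1.
    """
    raw = {name: target_amounts[name] / per_cell[name]
           for name in target_amounts}
    total = sum(raw.values())
    return {name: value / total for name, value in raw.items()}

# Hypothetical example: two strains, equal target amounts, plasmids that
# differ four-fold in copy number, identical RBS strength.
outputs = {
    "strain_high_copy": per_cell_output(copy_number=40, rbs_strength=1.0),
    "strain_low_copy": per_cell_output(copy_number=10, rbs_strength=1.0),
}
fractions = density_fractions(
    {"strain_high_copy": 1.0, "strain_low_copy": 1.0}, outputs)
```

In this hypothetical case the low-copy strain would need to make up 80% of the consortium and the high-copy strain 20% to yield equal protein amounts. In practice, differences in growth rate and expression burden between strains would also need to be taken into account.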
To define the control mechanisms, we designed consortia composed of four strains, each expressing one of four different fluorescent proteins CFP (cyan fluorescent protein), GFP (green fluorescent protein), mCherry and mOrange (
The mathematical model suggested that protein levels in the consortia can be controlled by changing the relative density of each strain in the consortia (
Next, we extended the control mechanisms to produce multi-protein complexes, using TraM as a model multi-protein complex. To start, we designed three plasmids with compatible replication origins and distinct copy numbers, each carrying a hybrid PT7/lacO promoter, cloning sites, and a T7 RNAP terminator sequence (
As initial attempts resulted in TraMOS with low expression activities (see Supplementary Information Section 3.1.1 for details), we reduced the complexity of the TraM system by creating sub-consortia based on common functions of the proteins: the IET consortium with 11 strains, each coding for one of the IET genes (Supplementary Information Section 3.1.2); and the AAT consortium with 23 strains, each expressing a single AAT gene (Supplementary Information Section 3.1.3). Based on reported concentrations of the proteins in an optimized system (Kazuta et al., 2014) (Table S4), we designed the consortia to achieve comparable expression levels of each TraM factor, taking into consideration the plasmid copy number, predicted translation rates, and relative densities of the strains (Tables S2 and S6). The established consortia were used to co-purify either the 11 IET (TraMOS IET III) or the 23 AAT (TraMOS AAT III) proteins from single bacterial co-cultures. In parallel, we prepared an IET mixture from individually purified IET proteins, termed Control IET. We then tested the GFP expression activity using the protein mixtures. Indeed, the TraMOS assembled from separate TraMOS IET III and TraMOS AAT III cultures gave rise to GFP expression (
To further improve TraMOS IET, we created three additional IET consortia, termed IET IV, V and VI, in which the relative densities of bacterial strains were adjusted (Supplementary Information Section 3.1.2). When TraMOS IETs were combined with Control AAT (
The above divide-and-conquer strategy generated the necessary insights into setting up full TraMOS consortia A and B, each with 34 bacterial strains combined in a single culture. The overall density of IET strains in TraMOS A was lower than that in TraMOS B (Supplementary Information Section 3.1.4). Both TraMOS exhibited expression activities (
Reproducible Preparation of TraMOS Using Bacterial Consortia with Reduced Strain Number
A microbial-consortia approach for purifying multi-protein complexes would be less susceptible to experimental errors if the consortia had a lower number of bacterial strains. To this end, we first created 17 strains coding for two TraM genes (2Tg) and 11 strains expressing three TraM genes (3Tg) simultaneously (Table S3). Then, we used these strains to establish two new consortia (
Reproducibility is a critical, yet non-trivial, aspect of a multi-protein purification approach based on microbial consortia. To assess it, we produced TraMOS replicates from 18- and 15-strain consortia. Next, we identified and quantified the protein composition of the TraMOS using mass spectrometry (
By reducing the time and cost associated with preparing multi-protein complexes, our approach essentially enables high-throughput applications of TraMOS without investment into additional purification equipment. Here, we utilized TraMOS to test translation activity from a set of different plasmids expressing GFP with variable RBS sequences. It has been shown by biophysical modeling and experimental data that the sequence comprising 35 nucleotides up- and down-stream from the initiation codon affects the translation rate (Espah Borujeni et al., 2014; Mutalik et al., 2013). The RBS Calculator predicted that the translation rates of the four variants presented here are different. Using bacterial S12 whole cell extract (WCE) to test in vitro transcription/translation activity, we observed significant differences in expression activities of two variants (Ngo1 and Ngo1RBS) relative to the negative control (
In addition, we demonstrate the utility of TraMOS by incorporating it into a screening assay of protease inhibitors. Cysteine proteases, important in parasite pathogenesis, are inhibited by a family of small peptides, including the Trypanosoma cruzi inhibitor chagasin (Redzynia et al., 2009). Chagasin binds to the protease, blocking its active site with three loops, BC, DE and FG (
Our work has a wide impact on cell-free synthetic biology by enabling the production of pure translation machinery through a simple and fast method. The approach is compatible with the existing equipment of most labs that perform protein purification routinely, allowing easy implementation of TraMOS and democratizing access to this system for high-throughput cell-free applications. Furthermore, our work establishes a microbial-consortia based approach for the purification of multi-protein complexes, which may be generalized to the production of other systems, such as the 28-enzyme system for purine nucleotide synthesis (Schultheisz et al., 2008) and the seven-enzyme system for production of an anti-malaria artemisinin precursor, amorpha-4,11-diene (Chen et al., 2013). Application of our strategy to other multi-protein complexes will require further adjustment of purification conditions (buffer composition or alternative tags). Finally, to enable autonomous control of protein expression in synthetic bacterial consortia, we may incorporate inter-strain communication (Großkopf and Soyer, 2014) that responds to quorum sensing signals or nutrients (Scott and Hasty, 2016).
Arai, T., Matsuoka, S., Cho, H. Y., Yukawa, H., Inui, M., Wong, S. L., and Doi, R. H. (2007). Synthesis of Clostridium cellulovorans minicellulosomes by intercellular complementation. Proceedings of the National Academy of Sciences of the United States of America 104, 1456-1460.
Brenner, K., You, L., and Arnold, F. H. (2008). Engineering microbial consortia: a new frontier in synthetic biology. Trends in biotechnology 26, 483-489.
Caschera, F., and Noireaux, V. (2015). Preparation of amino acid mixtures for cell-free expression systems. BioTechniques 58, 40-43.
Caschera, F., and Noireaux, V. (2016). Compartmentalization of an all-E. coli Cell-Free Expression System for the Construction of a Minimal Cell. Artificial life 22, 185-195.
Chen, X., Zhang, C., Zou, R., Zhou, K., Stephanopoulos, G., and Too, H. P. (2013). Statistical Experimental Design Guided Optimization of a One-Pot Biphasic Multienzyme Total Synthesis of Amorpha-4,11-diene. PLoS ONE 8, e79650.
Chen, Y., Kim, J. K., Hirning, A. J., Josić, K., and Bennett, M. R. (2015). Emergent genetic oscillations in a synthetic microbial consortium. Science 349, 986.
Espah Borujeni, A., Channarasappa, A. S., and Salis, H. M. (2014). Translation rate is controlled by coupled trade-offs between site accessibility, selective RNA unfolding and sliding at upstream standby sites. Nucleic Acids Res 42, 2646-2659.
Goering, A. W., Li, J., McClure, R. A., Thomson, R. J., Jewett, M. C., and Kelleher, N. L. (2016). In Vitro Reconstruction of Nonribosomal Peptide Biosynthesis Directly from DNA Using Cell-Free Protein Synthesis. ACS Synthetic Biology.
Goers, L., Freemont, P., and Polizzi, K. M. (2014). Co-culture systems and technologies: taking synthetic biology to the next level. Journal of the Royal Society, Interface/the Royal Society 11.
Großkopf, T., and Soyer, O. S. (2014). Synthetic microbial communities. Current Opinion in Microbiology 18, 72-77.
Hall, B. G., Acar, H., Nandipati, A., and Barlow, M. (2014). Growth rates made easy. Molecular biology and evolution 31, 232-238.
Kazuta, Y., Matsuura, T., Ichihashi, N., and Yomo, T. (2014). Synthesis of milligram quantities of proteins using a reconstituted in vitro protein synthesis system. Journal of Bioscience and Bioengineering 118, 554-557.
Li, J., Gu, L., Aach, J., and Church, G. M. (2014). Improved Cell-Free RNA and Protein Synthesis System. PLoS ONE 9, e106232.
Lopez-Gallego, F., and Schmidt-Dannert, C. (2010). Multi-enzymatic synthesis. Curr Opin Chem Biol 14, 174-183.
Lu, F., Smith, P. R., Mehta, K., and Swartz, J. R. (2015). Development of a synthetic pathway to convert glucose to hydrogen using cell free extracts. International Journal of Hydrogen Energy 40, 9113-9124.
Manen, D., and Caro, L. (1991). The replication of plasmid pSC101. Molecular Microbiology 5, 233-237.
Matsubayashi, H., and Ueda, T. (2014). Purified cell-free systems as standard parts for synthetic biology. Current Opinion in Chemical Biology 22, 158-162.
Mutalik, V. K., Guimaraes, J. C., Cambray, G., Lam, C., Christoffersen, M. J., Mai, Q. A., Tran, A. B., Paull, M., Keasling, J. D., Arkin, A. P., et al. (2013). Precise and reliable gene expression via standard transcription and translation initiation elements. Nature methods 10, 354-360.
Niederholtmeyer, H., Sun, Z. Z., Hori, Y., Yeung, E., Verpoorte, A., Murray, R. M., and Maerkl, S. J. (2015). Rapid cell-free forward engineering of novel genetic ring oscillators. eLife 4, e09771.
Pandey, K. (2013). Macromolecular inhibitors of malarial cysteine proteases—An invited review. Journal of Biomedical Science and Engineering 6, 11.
Pardee, K., Green, Alexander A., Ferrante, T., Cameron, D. E., DaleyKeyser, A., Yin, P., and Collins, James J. (2014). Paper-Based Synthetic Gene Networks. Cell 159, 940-954.
Redzynia, I., Ljunggren, A., Bujacz, A., Abrahamson, M., Jaskolski, M., and Bujacz, G. (2009). Crystal structure of the parasite inhibitor chagasin in complex with papain allows identification of structural requirements for broad reactivity and specificity determinants for target proteases. FEBS Journal 276, 793-806.
Rosano, G. L., and Ceccarelli, E. A. (2014). Recombinant protein expression in Escherichia coli: advances and challenges. Frontiers in Microbiology 5.
Salis, H. M. (2011). The Ribosome Binding Site Calculator. In Methods in enzymology, V. Christopher, ed. (Academic Press), pp. 19-42.
Schultheisz, H. L., Szymczyna, B. R., Scott, L. G., and Williamson, J. R. (2008). Pathway Engineered Enzymatic de Novo Purine Nucleotide Synthesis. ACS Chemical Biology 3, 499-511.
Scott, S. R., and Hasty, J. (2016). Quorum Sensing Communication Modules for Microbial Consortia. ACS Synth Biol.
Shimizu, Y., and Ueda, T. (2010). PURE Technology. In Cell-Free Protein Production: Methods and Protocols, Y. Endo, K. Takai, and T. Ueda, eds. (Totowa, NJ: Humana Press), pp. 11-21.
Shong, J., Jimenez Diaz, M. R., and Collins, C. H. (2012). Towards synthetic microbial consortia for bioprocessing. Curr Opin Biotechnol 23, 798-802.
Takahashi, M. K., Hayes, C. A., Chappell, J., Sun, Z. Z., Murray, R. M., Noireaux, V., and Lucks, J. B. (2015). Characterizing and prototyping genetic networks with cell-free transcription-translation reactions. Methods (San Diego, Calif.) 86, 60-72.
Teague, B. P., and Weiss, R. (2015). Synthetic communities, the sum of parts. Science 349, 924.
Tsuji, G., Fujii, S., Sunami, T., and Yomo, T. (2016). Sustainable proliferation of liposomes compatible with inner RNA replication. Proceedings of the National Academy of Sciences 113, 590-595.
Wang, H. H., Huang, P.-Y., Xu, G., Haas, W., Marblestone, A., Li, J., Gygi, S. P., Forster, A. C., Jewett, M. C., and Church, G. M. (2012). Multiplexed in Vivo His-Tagging of Enzyme Pathways for in Vitro Single-Pot Multienzyme Catalysis. ACS Synthetic Biology 1, 43-52.
Wu, G., Yan, Q., Jones, J. A., Tang, Y. J., Fong, S. S., and Koffas, M. A. G. (2016). Metabolic Burden: Cornerstones in Synthetic Biology and Metabolic Engineering Applications. Trends in biotechnology 34, 652-664.
Design and Experimental Analysis of Fluorescent Protein Consortia
In classical preparation of multi-protein complexes, proteins are individually purified, and then combined to achieve their required concentrations. Conversely, the one-shot approach enables co-expression and co-purification of all the proteins without subsequent combining steps. Therefore, it is important to modulate the expression level of each protein in the consortium. This way, the purification yield of each factor will match the required concentration of each protein.
To start, we created bacterial consortia expressing four different fluorescent proteins (each tagged with 6x-His at the C-end). The design of these consortia accounted for variables controlling protein expression, including the relative densities of each strain and the rates at which the proteins are transcribed and translated. These variables were incorporated into a mathematical model that was used to predict protein expression levels (see Section 1.2).
First, three consortia were designed to modulate protein yield through relative strain densities. In these consortia, densities of CFP and GFP strains were one order of magnitude lower (consortium A), equal (consortium B), or one order of magnitude higher (consortium C) when compared to the densities of mCherry and mOrange strains (
According to the model, protein levels in the consortia can be controlled by changing the relative density of each strain in the consortia (
Next, we experimentally established consortium L by cloning mOrange in a low copy number plasmid, and consortium W by modifying the RBS sequence controlling GFP expression (
In addition, we investigated if purification procedures can disrupt the ratio between expression levels of each protein. To this end, we analyzed yields of each fluorescent protein from consortia A, B, and C following purification with the one-shot procedure (
Mathematical Modeling of Bacterial Consortia That Express Fluorescent Proteins
We formulate a system of ordinary differential equations to model the production of fluorescent proteins by the consortia (
Here, kc represents the consumption rate constant of the nutrient (nM cell−1), kg the basal growth rate of the bacteria (min−1), xi the density of bacterial strain i (cell), S the nutrient concentration (nM), Pi the concentration of fluorescent protein i (nM), ks the synthesis rate constant (nM min−1), and kd the degradation rate constant (min−1). ks is adjusted based on the known differences between the genetic constructs. Specifically, the concentration of the high copy number plasmid is ten times higher than that of the low copy number plasmid (ref. 1), and the initiation rate of the modified RBS is eight times lower than that of the original RBS (see Section 1.1 and
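The model equations themselves are not reproduced in this text. As a minimal sketch, assuming one plausible form that is consistent with the variable definitions above (it is not necessarily the form used in the model), growth can be taken as proportional to the remaining nutrient, nutrient consumption as proportional to growth, and protein synthesis as proportional to strain density. The function name `simulate_consortium`, the functional forms, and all numerical values are illustrative assumptions.

```python
def simulate_consortium(x0, ks, kg=0.02, kc=1e-6, kd=0.001, s0=1000.0,
                        dt=0.1, t_end=600.0):
    """Forward-Euler integration of an ASSUMED consortium model.

    dx_i/dt = kg * x_i * S / s0          (growth slows as nutrient is used up)
    dS/dt   = -kc * sum_i dx_i/dt        (kc: nutrient consumed per new cell)
    dP_i/dt = ks_i * x_i / x_ref - kd * P_i
    x_ref is the total initial density; it only keeps ks_i in nM/min.
    """
    x = list(x0)
    p = [0.0] * len(x0)
    s = s0
    x_ref = sum(x0)
    for _ in range(int(round(t_end / dt))):
        growth = [kg * xi * max(s, 0.0) / s0 for xi in x]
        dp = [k * xi / x_ref - kd * pi for k, xi, pi in zip(ks, x, p)]
        s -= kc * sum(growth) * dt
        x = [xi + g * dt for xi, g in zip(x, growth)]
        p = [pi + d * dt for pi, d in zip(p, dp)]
    return x, p


# Consortium A-like design: CFP and GFP strains seeded at one tenth the
# density of the mCherry and mOrange strains (all values hypothetical).
names = ["CFP", "GFP", "mCherry", "mOrange"]
_, proteins = simulate_consortium([1e5, 1e5, 1e6, 1e6], [1.0, 1.0, 1.0, 1.0])
for name, level in zip(names, proteins):
    print(f"{name}: {level:.3g} nM")
```

Under these assumptions the protein level of a strain scales with the product of its initial density and its synthesis rate constant, so a tenfold lower seeding density and a tenfold lower plasmid copy number have the same effect on yield.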
Creation of Compatible Plasmids pIURAH, pIURCM and pIURKL for Cloning of TraM Genes
For the development of TraMOS, we utilized the backbones of pET15b (Novagen), pLysS (Novagen), and a pSC101 plasmid (ref. 1) to create three plasmids with the same promoter region, cloning site, and transcription termination, but different selection markers and replication origins (
First, pET15b (AmpicillinR, ColE1 replication origin, constitutive lacI expression) was digested with XhoI and XbaI to remove the RBS and 6x-His tag coding sequence. The His-tag was removed because a subset of TraM genes were to be tagged on the C-end, but the original configuration of pET15b only allowed N-end 6x-His tag cloning. Next, using Gibson cloning, we ligated a new cloning site restoring the RBS sequence and adding restriction sites for NsiI and PacI restriction enzymes. The resulting vector formed the first plasmid of our pIUR series, termed pIURAH (pIUR AmpR, High copy number).
Next, pLysS plasmid (ChloramphenicolR, p15A replication origin, expressing T7 lysozyme) was digested using SalI and XhoI, while pSCTet-T7 plasmid (KanamycinR, SC101 replication origin) was digested using BglI and AvrII. Then, the fragment containing promoter, cloning site, and terminator was amplified from pIURAH using primer pairs that contained complementary regions to the digested plasmids pLysS or pSC101. The amplified fragment was then inserted into the digested plasmids through Gibson cloning. This way, we created pIURCM (pIUR CmR, Medium copy number) from pLysS and pIURKL (pIUR KmR, Low copy number) from pSC101. Each of the plasmids contained the features of the original plasmids plus the hybrid PT7/lacO, the RBS sequence upstream of the unique NsiI and PacI sites, and the T7 terminator region.
As a result, we constructed plasmids with high, medium, and low copy number (pIURAH, pIURCM and pIURKL, respectively) with compatible replication origins, so they can be simultaneously maintained inside a single cell. Each plasmid has the same regulatory region and cloning site, facilitating the insertion of the TraM genes by Gibson cloning.
34-Strain TraMOS
All 34 strains were generated by co-transforming BL21(DE3) using pIURAH, pIURCM and pIURKL. Each strain of this consortium coded for a single TraM gene that was cloned into either pIURAH or pIURKL (1Tg strains, Table S3). For example, strain 1Tg metG expressed the methionyl-tRNA amino acyl transferase from the pIURAH plasmid plus the non-modified (empty) pIURCM and pIURKL. Strain 1Tg aspS expressed aspartyl-tRNA amino acyl transferase from the pIURKL plasmid plus non-modified pIURAH and pIURCM. Consequently, all 34 strains carried the three plasmids. A summary of the steps taken to optimize the 34-strain consortia is described in the next sections (also shown in
Creation of TraMOS I, TraMOS II and TraMOS III
TraMOS I was designed using a fixed density for each strain, set according to whether its plasmid was high or low copy number: the relative density of each strain in the consortium was 0.22% for high copy number or 2.17% for low copy number. We also predicted the translation initiation rates (TIR) of each gene cloned in pIURAH, pIURKL and pIURCM using The RBS Calculator (Table S6). We used these predicted rates to correct the strain volumes when the predicted TIR was lower than 10000 au and the gene was carried on a low copy number plasmid (Table S6). This initial approach generated TraMOS I with very low expression activity (not shown). To understand the issue, we analyzed the protein composition of TraMOS I by mass spectrometry (not shown). The results were used to correct the relative densities of the strains, using the concentrations reported in a previous work (ref. 3). Based on the results, we established four new subconsortia: IET TraMOS II (11 strains), AAT1 TraMOS II, AAT2 TraMOS II (each with 8 different AAT strains) and AAT3 TraMOS II (7 strains) (Table S6). Again, these preparations yielded very low in vitro translation activities.
To identify the problem and to optimize the consortia, we took several steps to understand the functionality of the translation factors. For IET factors, we purified them separately and created a Control IET that was functional. For a comparative analysis, we ran the Control IET mixture and TraMOS II fraction on SDS-PAGE and quantified the bands corresponding to each factor. This way, using the Control IET as the target, we measured the amount of each protein in TraMOS II and used the data to calculate the initial relative densities of the strains in the subsequent consortia (Table S6).
Because the AAT proteins have very similar molecular weights, we could not apply the above strategy to these factors. Instead, we measured the activity of each enzyme using a colorimetric method (ref. 4). This method relies on the generation of pyrophosphate from ATP, which is a required step in the aminoacylation of tRNA catalyzed by the enzyme. Pyrophosphate is then converted to free inorganic phosphate (Pi). Therefore, the levels of Pi represent a direct measurement of AAT activity. Using tRNA and the specific amino acid, we determined the activity of all the enzymes in the three subconsortia (
With these new insights, we developed two subconsortia, TraMOS IET III and TraMOS AAT III (Table S6), as presented in the main text. These preparations generated moderate expression activities when compared to the Control IET and AAT (
Creation of Optimized IET Subconsortia
For the optimization of IET subconsortia, we compared TraMOS IET III with the Control IET by SDS-PAGE. IET IV was then designed based on the quantification of bands on SDS-PAGE for each factor, which guided the readjustment of relative strain densities. In addition, we designed TraMOS IET V and VI because initiation and elongation factors (particularly EF-G, EF-Ts and EF-Tu) are required at a higher ratio relative to termination factors. Using the design of TraMOS IET IV as a starting point, we increased the relative initial densities of the initiation factor strains by 50% and decreased those of the EF strains by the same factor to produce TraMOS IET V. Similarly, we designed TraMOS IET VI by increasing the densities of both initiation and elongation factor strains by 25%, while reducing those of the termination factor strains by 50% (Table S6). Of these three preparations, IET VI resulted in the highest activity (
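The group-wise adjustment described above can be sketched as follows. This is an illustration only: the helper `rescale_groups`, the three-strain starting design, and the renormalization of the result to 100% are assumptions and do not reproduce the values of Table S6.

```python
def rescale_groups(densities, groups, factors):
    """Scale each strain's relative density by its group's factor,
    then renormalize so the consortium sums to 100% (an assumption)."""
    scaled = {s: d * factors.get(groups[s], 1.0) for s, d in densities.items()}
    total = sum(scaled.values())
    return {s: 100.0 * v / total for s, v in scaled.items()}


# Hypothetical starting design (percent of inoculum), one strain per group.
start = {"IF1": 20.0, "EF-Tu": 50.0, "RF1": 30.0}
groups = {"IF1": "initiation", "EF-Tu": "elongation", "RF1": "termination"}

# IET VI-style adjustment: +25% initiation and elongation, -50% termination.
design_vi = rescale_groups(
    start, groups,
    {"initiation": 1.25, "elongation": 1.25, "termination": 0.5},
)
for strain, pct in design_vi.items():
    print(f"{strain}: {pct:.1f}%")
```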
Creation of Optimized AAT Subconsortia
The optimization of AAT subconsortia was approached differently. Based on the requirements of each AAT factor in a previous work (ref. 5), we adjusted the relative volumes of the strains based on their activities and protein-gel quantification (the latter whenever possible, considering that some of the AAT factors cannot be separated in SDS-PAGE due to similarities in their molecular weights). The resulting subconsortium was termed TraMOS AAT IV. We also designed another subconsortium using the same method (TraMOS AAT V), but replaced the strains coding for 6 AAT factors in low copy number plasmids with strains coding for these genes in high copy number plasmids. Finally, we created another subconsortium (TraMOS AAT VI), in which we utilized the same strains as in TraMOS AAT V, but with adjusted composition. For this, the relative densities of strains in TraMOS AAT VI were calculated based on the required protein levels, plasmid copy number, and TIR. Specifically, we first estimated the relative protein concentration of each factor in the PURE system (RPure). Following this step, we calculated a value T for each factor by multiplying its relative plasmid copy number (100 for high and 10 for low) by its predicted TIR. We then normalized these values using the maximal T (corresponding to glyS-C in a high copy number plasmid). Finally, we calculated the relative strain density for the consortium by correcting the density in TraMOS AAT III by the factor RPure/T. With this information, we experimentally established the consortia (Table S6). According to our results, TraMOS AAT VI resulted in the highest activity, approximately 1.5 times that of the control (
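The RPure/T correction described above can be illustrated with a short calculation. The relative copy numbers (100 and 10) come from the text. The function name `corrected_densities`, the input layout, and every numerical value for densities, RPure and TIR are hypothetical placeholders rather than the values of Table S6.

```python
COPY_NUMBER = {"high": 100, "low": 10}  # relative copy numbers given in the text


def corrected_densities(strains):
    """strains maps name -> (density_in_AAT_III, r_pure, copy_class, tir).

    T = copy number x predicted TIR, normalized to the largest T;
    the new density is the AAT III density multiplied by RPure / T.
    """
    t_raw = {n: COPY_NUMBER[c] * tir for n, (_, _, c, tir) in strains.items()}
    t_max = max(t_raw.values())
    result = {}
    for name, (density, r_pure, _, _) in strains.items():
        t_norm = t_raw[name] / t_max
        result[name] = density * r_pure / t_norm
    return result


# Hypothetical inputs: (density in AAT III, RPure, plasmid class, TIR in au).
example = {
    "glyS": (2.0, 1.0, "high", 50000.0),
    "aspS": (2.0, 0.5, "low", 20000.0),
}
for name, value in corrected_densities(example).items():
    print(f"{name}: corrected relative density {value:.2f}")
```

A strain whose gene sits on a low copy number plasmid with a weak RBS has a small normalized T, so its density is raised to compensate for its lower expected expression per cell.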
Establishment of Functional 34-Strain Consortia
Finally, we established 34-strain consortia A and B by preparing IET IV and AAT VI subconsortia with the optimized relative densities and strains, and then combining them at IET IV:AAT VI ratios of 30:1 (34-strain TraMOS A) or 60:1 (34-strain TraMOS B). The resulting consortia were inoculated into LB media, followed by induction and one-shot purification of TraMOS (Table S7).
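As a minimal sketch of how such a ratio translates into overall strain fractions, the following assumes that each subconsortium is described by fractions summing to 1 and that the two subconsortia are mixed by parts at equal culture density. The helper `combine_subconsortia` and the two-strain subconsortia are hypothetical and do not reproduce Table S7.

```python
def combine_subconsortia(iet, aat, iet_parts, aat_parts=1.0):
    """Mix two subconsortia at iet_parts:aat_parts and return the
    fraction of every strain in the combined consortium."""
    total = iet_parts + aat_parts
    combined = {s: f * iet_parts / total for s, f in iet.items()}
    combined.update({s: f * aat_parts / total for s, f in aat.items()})
    return combined


# Hypothetical within-subconsortium fractions (each sums to 1).
iet = {"EF-Tu": 0.7, "IF1": 0.3}
aat = {"alaS": 0.6, "metG": 0.4}

tramos_a = combine_subconsortia(iet, aat, iet_parts=30)  # 30:1
tramos_b = combine_subconsortia(iet, aat, iet_parts=60)  # 60:1
for strain in tramos_a:
    print(f"{strain}: A={tramos_a[strain]:.4f}  B={tramos_b[strain]:.4f}")
```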
18-Strain TraMOS
We first created strains that simultaneously expressed two TraM genes. To do this, we co-transformed the BL21(DE3) strain with pIURAH and pIURKL plasmids that each expressed a TraM gene, together with the empty plasmid pIURCM (2Tg strains). The composition of the 18-strain consortia is shown in Table S8. We utilized the design of the 34-strain consortium to guide the design of the 2Tg strains. Specifically, the TraM genes expressed in strains at the highest densities in the 34-strain consortium were combined into a single 2Tg strain. For example, in the 34-strain consortium, the two strains required at the highest densities are 1Tg EF-Tu (high copy number) and 1Tg EF-Ts (low copy number). Therefore, one strain, 2Tg IET 2, was created carrying both EF-Tu and EF-Ts genes in high and low copy number plasmids, respectively. Following this logic, we created the remaining 16 2Tg strains (Table S8). We also considered grouping the genes functionally whenever possible. Therefore, we combined all the IET factors in five 2Tg IET strains and 22 AAT factors in eleven 2Tg AAT strains. One strain (2Tg IET 4) coded for both EF-4 (in a low copy number plasmid) and the alaS AAT gene (in a high copy number plasmid). Using these strains, we established a 17-strain consortium that resulted in a non-functional mixture. The activity was restored, however, following supplementation with purified EF-G (
15-Strain TraMOS
To further decrease the number of strains in the consortia, we created strains coding for three TraM genes simultaneously (3Tg strains) by co-transforming BL21(DE3) bacteria with pIURAH, pIURCM, and pIURKL plasmids, each expressing one TraM gene. We designed the 3Tg strains based on the design of the 18-strain consortia and grouped initiation, elongation, termination or AAT factors together whenever possible (Table S9). This way, we designed strains that expressed the three initiation factors (3Tg IET), elongation factors Tu-Ts-G (3Tg EF), and release factors (3Tg RF). We also created a fourth strain coding for EF-4 (required at lower concentration compared to the other elongation factors), RRF, and EF-G (3Tg E4RRF). In addition, we designed eight 3Tg AAT strains, each coding for three distinct AAT genes, except for 3Tg AAT 6, which coded for alaS in both pIURAH and pIURCM (since alaS is the AAT required at higher levels). We were not able to obtain colonies for the 3Tg IET strains. In addition, strain 3Tg AAT 8 that expressed cysS, glyS, and thrS presented a very low growth rate upon induction and low expression level of glyS (
Chagasin Library Design and Development
Cys-protease inhibitors from parasites such as Trypanosoma cruzi or Plasmodium falciparum are implicated in pathogenesis (ref. 6). Interaction of the inhibitors with the protease is mediated through a number of amino acids in three loops in the inhibitor (termed BC, DE and FG) with amino acids surrounding the protease's active site (ref. 7) (
The WT chagasin DNA sequence (derived from the amino acid sequence Q966X9.1) was synthesized by incorporating a strong RBS sequence (designed to maximize translation rate), an octapeptide FLAG-tag sequence at the C-end, and a synthetic terminator, T7U-T7 TΦ (ref. 8). The synthesized fragment was inserted into pET15b plasmid (digested with XbaI/EcoRI) using Gibson Assembly, generating the plasmid WTCHGSN-pET15b (GenBank accession no. KX765180). We produced chagasin both in vivo and in vitro, as demonstrated by western blot using anti-FLAG antibody (
Using WTCHGSN-pET15b as the template, we generated four PCR fragments using degenerate primers, covering overlapping regions of the full-length chagasin gene. Two fragments covered the BC loop, each with one of the two possible variants (Thr31 or Gly31), one fragment introduced mutations in loop DE, and the fourth carried mutations in loop FG. All these fragments, together with the XbaI/HindIII-digested WTCHGSN-pET15b plasmid, were combined in a single Gibson Assembly reaction to randomly generate chagasin variants. The resulting library was transformed into E. coli, obtaining approximately 10^4 clones after a single transformation event. We randomly sequenced 24 clones and observed that the sequences were highly variable with the expected mutations at the target positions (
Mathematical Model for 34- and 18-Strain TraMOS
Predicting quantitative outputs from design inputs is an important feature of engineered systems. For an engineered consortium, a model that uses design inputs such as plasmid copy number would be a valuable tool for the a priori design of a system that yields specific protein concentrations. To this end, we create a set of equations that models the inter- and intra-cellular interaction in order to lay a foundation for predicting the protein yields of engineered, multi-strain consortia.
To begin, we compared quantified TraM protein levels in three biological replicates of 34- and 18-strain consortia from mass spectrometry (
To predict protein yields from knowledge of the way the consortium is engineered, the model includes processes at both the population and molecular levels. To begin, the model predicts how individual strains grow while competing for resources with other strains in the consortia (Eqn. 4). The number of cells, N, for the ith strain in the consortium grows exponentially at rate r. However, further growth is inhibited as the total number of cells in the consortium reaches the culture's carrying capacity, K.
On the molecular level, each cell carries multiple copies of the gene expressed by each strain, Di (Eqn 5). The number of genes present in the consortium is determined by the plasmid copy number engineered into each strain, Ci, and is directly proportional to the number of cells for each strain. Finally, the protein output of each strain, Pi, is determined by a synthesis rate, αi, and a degradation rate, Δi, which incorporate multiple cellular processes such as transcription and translation (Eqn 6). The synthesis of protein is dependent on the number of gene copies present and the length of the gene. Degradation is solely dependent on the amount of protein.
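Eqns 4 to 6 are not reproduced in this text. The sketch below assumes one form that matches the verbal description: logistic growth with a shared carrying capacity, gene copies equal to Ci times Ni, and synthesis proportional to gene copies divided by gene length. The inverse dependence on length, the fixed-step Euler integrator, the function name `simulate_tramos`, and all parameter values are assumptions made for illustration.

```python
def simulate_tramos(n0, r, copies, lengths, alpha, delta, k=1.0,
                    dt=0.01, t_end=200.0):
    """Fixed-step Euler integration of an ASSUMED form of Eqns 4-6.

    dN_i/dt = r_i * N_i * (1 - sum_j N_j / K)        (Eqn 4, as described)
    D_i     = C_i * N_i                              (Eqn 5, as described)
    dP_i/dt = alpha_i * D_i / L_i - delta_i * P_i    (Eqn 6, assumed form)
    """
    n = list(n0)
    p = [0.0] * len(n0)
    for _ in range(int(round(t_end / dt))):
        crowding = 1.0 - sum(n) / k
        d = [c * ni for c, ni in zip(copies, n)]
        dn = [ri * ni * crowding for ri, ni in zip(r, n)]
        dp = [a * di / length - dl * pi
              for a, di, length, dl, pi in zip(alpha, d, lengths, delta, p)]
        n = [ni + x * dt for ni, x in zip(n, dn)]
        p = [pi + x * dt for pi, x in zip(p, dp)]
    return n, p


# Two hypothetical strains: high copy (100) and low copy (10) plasmids.
cells, protein = simulate_tramos(
    n0=[0.004, 0.006], r=[0.5, 0.5], copies=[100, 10],
    lengths=[1000.0, 2000.0], alpha=[1.0, 1.0], delta=[1.0, 1.0],
)
print("final cell fractions:", cells)
print("steady-state protein:", protein)
```

With equal growth rates the strains keep their inoculated proportions, and each protein settles at αi Ci Ni / (Li Δi) under the assumed form.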
The growth rates r for the strains following IPTG induction are calculated based on experimental results (
Measuring the in vivo synthesis and degradation for each protein is not feasible for the TraMOS system. Instead, we train the model in silico using the average mass spectrometry data for the 34-strain consortium. Using MATLAB's stiff ODE solver, we first set αi and Δi to one and use the relative initial cell density (as a percentage of the initial inoculum with OD600 of 0.01) as the initial condition for each strain, Ni(0). We then iterate the model for each strain to simulate the growth and protein production of the consortium over time.
Using the protein concentration achieved at steady state, we then create a prediction for the protein output of the 34-strain consortium not taking into account differences in synthesis and degradation. Comparing these values to actual mass spectrometry values, we quantitatively determine the synthesis rate that would achieve perfect correlation (r=1) between predictive and measured protein outputs while leaving degradation rates equal to 1 (
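The fitting step can be sketched under the assumption that the steady-state protein output is linear in αi when Δi is fixed at 1, as it is when synthesis is first order in gene copies. In that case, dividing each measured value by the prediction made with αi = 1 gives a synthesis rate that reproduces the measurement exactly. The function names and all numbers below are hypothetical.

```python
import math


def fit_synthesis_rates(predicted_unit_alpha, measured):
    """alpha_i = measured_i / prediction obtained with alpha_i = 1."""
    return [m / p for m, p in zip(measured, predicted_unit_alpha)]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Hypothetical steady-state predictions (alpha = 1) and measured counts.
predicted = [0.04, 0.003, 0.01]
measured = [120.0, 45.0, 5.0]

alphas = fit_synthesis_rates(predicted, measured)
refit = [a * p for a, p in zip(alphas, predicted)]
print("correlation before fit:", pearson(predicted, measured))
print("correlation after fit:", pearson(refit, measured))
```

Because the fitted rates match the training data by construction, the informative test is how well they transfer to a consortium that was not used for fitting.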
To further test the validity of this approach, we extend the model to the 18-strain consortia, using the synthesis rates previously calculated. Here, the growth rate, ri, is recalculated for each 2Tg strain. The 18-strain model uses the previously described equation for modeling population dynamics of the strain (Eqn 4). Similarly, the number of genes for each protein and the total protein yield use the same equations for the 2Tg as for the 1Tg strain. However, now each 2Tg strain is modeled with two gene copy equations, D1i and D2i (Eqn 7 and 8), that are both directly proportional to the cell number of the strain. Furthermore, there are two protein yield equations, P1i and P2i, which use the calculated synthesis rates and the DNA copies of their respective genes (Eqn 9 and 10).
Using the same in silico method as described above, the predicted protein output at steady state is compared to measured values of the 18-strain consortium (
This model lays a foundation for predicting protein yields from engineered, multi-strain consortia. For TraMOS, where the proportions of proteins relative to one another are key to the activity of the whole, this model is a valuable tool in future optimization and modification of the consortia.
Table S1. List of oligonucleotides used in this study. For the primers used in chagasin mutagenesis (Information Section 4 and
Table S2. Features of TraM genes. TraM genes are divided into two main functional categories, IETs (Initiation, Elongation and Termination factors), and AATs (tRNA-amino acyl transferases). Location of the 6x-His-tag is shown for each TraM gene (-N, N-end; -C, C-end). EcoGene database accession numbers are shown. Translation initiation rates (TIR) are calculated using The RBS Calculator. Purity of each factor is quantified from protein gels stained with Coomassie brilliant blue.
aTIR (translation initiation rate)
Table S4. Purified proteins for the preparation of Control IET. Protein purification yields and requirements for the assembly of Control IET.
aProtein concentration for each factor in our conditions
bBased on requirements defined in Kazuta, Y., Matsuura, T., Ichihashi, N., et al. (2014) Journal of Bioscience and Bioengineering. 118, 554-557
Table S5. Identified proteins and quantified counts from 34-strain TraMOS. (mean±SEM, n=3).
Table S7. Detailed strain composition of 34-strain TraMOS A and B consortia. See Supplementary Information Section 3.1.4 and
Table S9. Detailed strain composition of 15-strain TraMOS consortia. See Supplementary Information Section 3.3 and
Table S10. Cys-protease inhibitors used in multiple sequence alignment. 3PNR_B corresponds to the PbICP inhibitor crystallized with the Cys-protease falcipain-2 (ref. 7). See Fig. S10B for details.
Trypanosoma cruzi
Trypanosoma cruzi strain CL Brener
Leishmania major strain Friedlin
Trypanosoma brucei gambiense DAL972
Trypanosoma brucei brucei TREU927
Leishmania infantum JPCM5
Leishmania panamensis
Leishmania braziliensis MHOM/BR/75/M2904
Entamoeba nuttalli P19
Entamoeba histolytica HM-1:IMSS
Leishmania mexicana MHOM/GT/2001/U1103
Pseudomonas aeruginosa
Pseudomonas mendocina
Bacillus cereus
Plasmodium berghei
It is understood that the examples and embodiments described herein are for illustrative purposes only and that various modifications or changes in light thereof will be suggested to persons skilled in the art and are to be included within the spirit and purview of this application and scope of the appended claims. All publications, patents, and patent applications cited herein are hereby incorporated by reference in their entirety for all purposes.
The present patent application claims benefit of priority to U.S. Provisional Patent Application No. 62/455,941, filed Feb. 7, 2017, which is incorporated by reference for all purposes.
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/US2018/017102 | 2/6/2018 | WO | 00 |
Number | Date | Country | |
---|---|---|---|
62455941 | Feb 2017 | US |