The Sequence Listing XML associated with this application is provided in XML format and is hereby incorporated by reference into the specification. The name of the XML file containing the sequence listing is 3915-P1247USPNP_Seq_List_20230424.xml. The XML file is 258,462 bytes; was created on Apr. 24, 2023; and is being submitted electronically via Patent Center with the filing of the specification.
The development of microbial platforms for industrial chemical production frequently requires optimizing the expression levels of multiple genes. The advent of CRISPR-Cas tools that can be used to rapidly program gene expression promises to accelerate pathway engineering for the efficient production of high-value compounds. The application of CRISPR-Cas tools for transcriptional repression (CRISPRi) in bacterial metabolic engineering is well-established. By comparison, the development of CRISPR-Cas tools for programmable transcriptional activation (CRISPRa) has lagged due to the paucity of effective transcriptional activators and the complexity of the rules governing CRISPRa-directed transcription from bacterial promoters. Despite these challenges, the potential for using CRISPRa to program gene expression has been demonstrated through successful implementation in E.coli, M.xanthus, K.oxytoca, and S.enterica. Determining how to strategically port CRISPRa systems into other microbes could significantly improve available metabolic engineering capabilities.
Pseudomonas putida is a gram-negative soil bacterium that has recently received attention as a potential chassis for bioproduction due to desirable metabolic capabilities and the capacity to survive harsh bioprocessing conditions. P.putida has high reducing power and the ability to metabolize a broad range of feedstocks, from glucose to the toxic products of aromatic lignin degradation. The successful implementation of CRISPR genome editing and CRISPRi in P.putida shows that CRISPR gene targeting can be effective in P.putida and provides a starting point to assess whether gene activation with a CRISPRa system can be achieved.
CRISPR-Cas transcriptional control typically uses the catalytically inactive Cas9 protein (dCas9) with programmable guide RNAs that recognize DNA targets through Watson-Crick base pairing. Recently, a variant of the transcriptional activator SoxS (R93A/S101A) that can be linked to a programmable CRISPR-Cas DNA binding domain to activate gene expression in E.coli was identified and optimized. SoxS interacts with an interface on the α-subunit of RNA polymerase (RpoA) that is widely conserved throughout bacterial species, including in P.putida, suggesting that the CRISPRa system that was developed in E.coli should also be effective in P.putida and other bacteria. However, in contrast to the relative permissiveness of CRISPRi (and CRISPRa in eukaryotes), CRISPRa in bacteria is known to be sensitive to several features of target promoters, including the precise distance from the transcription start site and the intervening sequence composition. Accordingly, it is not known to what extent the rules characterized in one bacterial species are generalizable in others.
Despite the advances in the art of CRISPR-based modification of gene expression, there remains a need for efficient systems and methods for implementing programmed transcriptional activation in desirable bacterial species. The present disclosure addresses these and related needs.
Accordingly, in an aspect of the present disclosure there is provided an engineered bacterium comprising genetic elements supporting programmable transcriptional activation and/or repression. In some embodiments, the genetic elements comprise at least one heterologous nucleic acid construct. In certain embodiments, the at least one heterologous nucleic acid construct comprises a first nucleic acid sequence encoding an endonuclease that lacks endonuclease activity. In some embodiments, the endonuclease is selected from dCas9, dCas12, dCasX, dCasPhi, dCas3 (Cascade), and the like. In an embodiment, the at least one heterologous nucleic acid construct comprises a second nucleic acid sequence encoding a transcriptional activator. In some embodiments, the transcriptional activator comprises an RNA-binding protein (RBP) fused to an effector domain. In certain embodiments, the effector domain is selected from SoxS, TetD, PspF, AsiA, N-terminus of RpoA (aNTD), and SoxS-family activators. In some embodiments of the present disclosure, the RNA-binding protein is selected from MCP, PCP, Com, LambdaN22Plus, Qbeta. In certain embodiments, the effector domain comprises SoxS. In an embodiment, the SoxS is engineered to reduce or abolish DNA-binding capacity. In some embodiments, the SoxS is engineered to comprise a mutation. In certain embodiments, the mutation in SoxS is at R93 and/or S101. In an embodiment, the SoxS mutation comprises R93A and/or S101A.
In some embodiments of the present disclosure, the at least one heterologous nucleic acid construct comprises a third nucleic acid sequence encoding a scaffold RNA (scRNA). In an embodiment, the scRNA comprises a 3′ MS2 hairpin loop that interacts with a transcriptional activator. In some embodiments, the scRNA comprises a 5′ domain comprising a guide sequence that hybridizes to a target sequence. In an embodiment, the target sequence is proximal to a PAM and/or a promoter sequence of an endogenous gene of the engineered bacterium.
In another embodiment of the present disclosure, the at least one heterologous nucleic acid construct comprises a fourth nucleic acid sequence. In some embodiments, the fourth nucleic acid sequence comprises an open reading frame of at least one gene of interest. In some embodiments, the at least one gene of interest is operatively linked to a promoter sequence. In an embodiment, the at least one gene of interest is linked to a PAM sequence. In some embodiments, the at least one gene of interest is operatively linked to a promoter sequence and a PAM sequence. In some embodiments, the target sequence is proximal to the promoter sequence and/or the PAM sequence. In certain embodiments of the present disclosure, the open reading frame encodes a gene product that results in production of an aromatic compound.
In an aspect of the present disclosure, the at least one heterologous nucleic acid construct comprises the first, second, third, and fourth sequences distributed in any combination on two vectors. In another aspect, the at least one heterologous nucleic acid construct comprises the first, second, third, and fourth sequences distributed on a single vector. In some embodiments, the vector is optionally pBBR1, pRK2, pRSF1010, pBAV1, and the like, or is a derivative thereof. In some embodiments, the at least one heterologous nucleic acid construct is integrated into the genome of the engineered bacterium. In some embodiments, the first, second, third, and fourth sequences each comprise or are operatively linked to a promoter operable in the engineered bacterium. In some embodiments, the engineered bacterium is Pseudomonas putida or Acinetobacter baylyi.
In an embodiment of the present disclosure, the engineered bacterium is Pseudomonas putida, and the target sequence is between about 60 and about 120 bases upstream (5′) of a transcriptional start site (TSS) of the endogenous gene or open reading frame. In an embodiment, the target sequence is about 15 to about 25 bases upstream (5′) of a transcriptional start site (TSS) of the endogenous gene or open reading frame. In some embodiments, the target sequence corresponds with the J1, J3, J5, or J6 promoter, or portions thereof. In some embodiments of the present disclosure, the promoter sequence resides in the intervening sequence between the target sequence and the transcriptional start site (TSS) of the endogenous gene or open reading frame. In certain embodiments, the promoter sequence is a synthetic 5′-upstream sequence containing an appropriate NGG PAM at an optimal position, wherein the optimal position is selected from about 75 to 85 nucleotides, about 78 to 83 nucleotides, and about 81 nucleotides upstream of the TSS.
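As a purely illustrative sketch of the positional criteria described above, the following Python function scans a 5′-upstream sequence for NGG PAM motifs and reports how far each lies from the TSS. The function name, the dictionary layout, and the convention that distance is counted from the first base of the PAM to the TSS are assumptions made for illustration; they are not defined by the present disclosure, and only the supplied strand is scanned.

```python
import re


def find_pam_sites(upstream, window=(60, 120), optimal=(75, 85)):
    """List NGG PAM motifs in `upstream` that fall inside `window`.

    `upstream` is assumed to be written 5'->3' and to end immediately
    before the TSS, so its last base is position -1 (an assumed convention).
    """
    upstream = upstream.upper()
    n = len(upstream)
    sites = []
    # A lookahead pattern reports overlapping motifs such as "AGGG".
    for match in re.finditer(r"(?=([ACGT]GG))", upstream):
        # Distance from the first PAM base to the TSS (assumed convention).
        distance = n - match.start()
        if window[0] <= distance <= window[1]:
            sites.append({
                "pam": match.group(1),
                "distance": distance,
                "in_optimal_range": optimal[0] <= distance <= optimal[1],
            })
    return sites


# Hypothetical 130-base upstream region with one PAM 81 bases from the TSS
# and one PAM 30 bases from the TSS (outside the 60-120 base window).
example = "A" * 49 + "TGG" + "A" * 48 + "CGG" + "A" * 27
for site in find_pam_sites(example):
    print(site)
```

The default window (60 to 120 bases) and optimal range (75 to 85 bases) mirror the values recited above and can be changed through the `window` and `optimal` arguments.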
In certain aspects of the present disclosure, the genetic elements are under control of a small-molecule inducible promoter. In some embodiments, the small molecule inducer is selected from m-toluic acid, salicylic acid, benzoic acid, and related compounds. In some embodiments, the small-molecule inducible promoter is XylS/Pm, derived from P.putida mt-2.
In an aspect of the present disclosure, there is provided a bacterium engineered to produce p-aminophenylalanine (p-AF) or p-aminocinnamic acid (p-ACA). In some embodiments, the bacterium comprises an open reading frame encoding PAL. In certain embodiments, the PAL is derived from Arabidopsis thaliana. In some embodiments, the PAL is derived from Rhodotorula glutinis. In some embodiments, the bacterium comprises an open reading frame encoding PapABC. In some embodiments, the open reading frame encoding PapABC is derived from Pseudomonas fluorescens. In some embodiments, the bacterium comprises an open reading frame encoding AroGL. In an embodiment, the open reading frame encoding AroGL is derived from E.coli.
In yet another aspect of the present disclosure there is provided a bacterium engineered to produce tetrahydrobiopterin (BH4) or derivatives thereof. In some embodiments, the bacterium comprises an open reading frame encoding GTPCH. In some embodiments, the open reading frame encoding GTPCH is derived from E.coli. In some embodiments, the bacterium comprises an open reading frame encoding PTPS/SR. In an embodiment, the open reading frame encoding PTPS/SR is derived from M.alpina.
In yet another aspect of the present disclosure, there is provided a system for production of aromatic compounds or compounds with aromatic metabolites or intermediates. In some embodiments, the system comprises an engineered bacterium comprising genetic elements supporting programmable transcriptional activation and/or repression. In some embodiments, the system further comprises a suitable growth medium.
In a related aspect, the present disclosure also pertains to a method of producing aromatic compounds or compounds with aromatic metabolites or intermediates. In some embodiments, the method comprises providing an engineered bacterium comprising genetic elements supporting programmable transcriptional activation and/or repression; and a suitable substrate permitting production of the compounds. In some embodiments, the compound is p-AF, and/or p-ACA. In some embodiments, the substrate is selected from glucose, glycerol, p-coumaric acid, and other substrates from lignocellulosic biomass.
The foregoing aspects and many of the attendant advantages of this invention will become more readily appreciated as the same become better understood by reference to the following detailed description, when taken in conjunction with the accompanying drawings, wherein:
CRISPR-Cas transcriptional programming in bacteria is an emerging tool to regulate gene expression for metabolic pathway engineering. The present disclosure provides methods of CRISPR-Cas transcriptional activation (CRISPRa) in P.putida using a system previously developed in E.coli. The present disclosure provides a methodology to transfer CRISPRa to a new host by first optimizing expression levels for the CRISPRa system components, and then applying rules for effective CRISPRa based on a systematic characterization of promoter features. Using the optimized system disclosed herein, the inventors regulated biosynthesis in the biopterin and mevalonate pathways. The present disclosure demonstrates that multiple genes can be activated simultaneously by targeting multiple promoters or by targeting a single promoter in a multi-gene operon. The optimized CRISPRa approach provided herein can activate endogenous promoters for P.putida and inducible CRISPRa can be obtained by expressing dCas9 from inducible promoters. The present disclosure facilitates new metabolic engineering strategies in P.putida and paves the way for CRISPR-Cas transcriptional programming in other bacterial species.
In accordance with the foregoing, in one aspect the disclosure provides an engineered Pseudomonas bacterium containing genetic elements supporting programmable transcriptional activation and/or repression. In some embodiments, the engineered Pseudomonas bacterium comprises at least one heterologous nucleic acid construct.
In some embodiments, the at least one heterologous nucleic acid construct comprises a first sequence encoding an endonuclease that lacks endonuclease activity. In some embodiments, the endonuclease is dCas9, dCas12, dCasX, dCasPhi, dCas3 (Cascade), and the like.
In some embodiments, the at least one heterologous nucleic acid construct comprises a second sequence encoding a transcriptional activator. In some embodiments, the transcriptional activator comprises an RNA-binding protein (RBP) fused to an effector domain of a transcriptional activator. In some embodiments, the transcriptional activator is selected from SoxS, TetD, PspF, AsiA, N-terminus of RpoA (aNTD), and SoxS-family activators (e.g., AraC-XylS superfamily), and the like. In some embodiments, the RNA-binding protein can be selected from MCP, PCP, Com, LambdaN22Plus, Qbeta. In some embodiments, the SoxS is derived from E.coli. In some embodiments, the SoxS is engineered to reduce or abolish DNA-binding capacity. In some embodiments, the SoxS is engineered to contain a mutation, e.g., substitution, at residue R93 and/or S101, e.g., R93A and/or S101A, and the like.
In some embodiments, the at least one heterologous nucleic acid construct comprises a third sequence encoding a scaffold RNA (scRNA). In some embodiments, the scRNA comprises a 3′ MS2 hairpin loop that interacts with a transcriptional activator. In some embodiments, the scRNA comprises a 5′ domain comprising a guide sequence that hybridizes to a target sequence. In some embodiments, the target sequence is proximal to a protospacer adjacent motif (PAM) and/or promoter sequence of an endogenous gene of the Pseudomonas bacterium. In some embodiments, the at least one heterologous nucleic acid construct comprises a fourth sequence comprising an open reading frame of a gene of interest (GOI) operatively linked to a promoter sequence and/or PAM sequence, and wherein the target sequence is proximal to the promoter sequence and/or PAM sequence.
In some embodiments, the at least one heterologous nucleic acid construct comprises the first, second, third, and fourth sequences distributed in any combination on two vectors. In some embodiments, the at least one heterologous nucleic acid construct comprises the first, second, third, and fourth sequences distributed on a single vector. In some embodiments, the vector is optionally pBBR1, pRK2, pRSF1010, and the like, or is a derivative thereof.
In some embodiments, the at least one heterologous nucleic acid construct is integrated into the genome of the Pseudomonas bacterium. In some embodiments, the first, second, third, and fourth sequences each comprise or are operatively linked to a promoter operable in the Pseudomonas bacterium. In some embodiments, the Pseudomonas bacterium is P.putida. In some embodiments, the target sequence is between 60 and 120 bases upstream (5′) of the transcriptional start site of the endogenous gene or open reading frame. In some embodiments, the target sequence is 15-25 bases. In some embodiments, the target sequence corresponds with the J1 or J3 promoter, or portion thereof.
In some embodiments, a promoter sequence resides in the intervening sequence between the target sequence and the transcriptional start site (TSS) of the endogenous gene or open reading frame. In some embodiments, the promoter sequence is a synthetic 5′-upstream sequence containing an appropriate NGG PAM at an optimal position (e.g., 75-85 nt, e.g., 78-83 nt, e.g., 81 nt upstream of the TSS). In some embodiments, the genetic elements are under control of a small-molecule inducible promoter. In some embodiments, the small molecule inducer is selected from m-toluic acid, salicylic acid, benzoic acid, and related compounds. In some embodiments, the small-molecule inducible promoter is XylS/Pm, e.g., derived from P.putida mt-2. In some embodiments, the at least one heterologous nucleic acid construct comprises a fourth sequence comprising an open reading frame of a gene of interest, wherein the open reading frame encodes a gene product that results in production of an aromatic compound.
In some embodiments, the Pseudomonas bacterium is engineered to produce p-aminocinnamic acid (pACA) from glucose. In some embodiments, the Pseudomonas bacterium comprises an open reading frame encoding PAL, optionally wherein the PAL is derived from Arabidopsis thaliana or Rhodotorula glutinis. In some embodiments, the Pseudomonas bacterium comprises an open reading frame encoding PapABC (4-amino-4-deoxychorismate synthase (PapA), 4-amino-4-deoxychorismate mutase (PapB) and 4-amino-4-deoxyprephenate dehydrogenase (PapC)), e.g., derived from Pseudomonas fluorescens, to facilitate p-AF synthesis. In some embodiments, the Pseudomonas bacterium comprises an open reading frame encoding AroGL, e.g., derived from E.coli, to facilitate chorismate flux upcycling.
In some embodiments, the Pseudomonas bacterium is engineered to produce tetrahydrobiopterin (BH4) or derivatives thereof. In some embodiments, the Pseudomonas bacterium comprises an open reading frame encoding GTP cyclohydrolase I (GTPCH), e.g., derived from E.coli. In some embodiments, the Pseudomonas bacterium comprises an open reading frame encoding PTPS (pyruvoyltetrahydropterin synthase) /SR (sepiapterin reductase), e.g., derived from M.alpina.
In another aspect, the disclosure provides a system for production of aromatic compounds or compounds with aromatic metabolites or intermediates, comprising the engineered Pseudomonas bacterium disclosed herein and a growth medium.
In another aspect, the disclosure provides a method of producing aromatic compounds or compounds with aromatic metabolites or intermediates, comprising providing the engineered Pseudomonas bacterium disclosed herein and a suitable substrate and permitting production of the compounds. In some embodiments, the compound is p-AF and/or p-ACA and the substrate is glucose.
Unless specifically defined herein, all terms used herein have the same meaning as they would to one skilled in the art of the present disclosure. Practitioners are particularly directed to Sambrook J., et al. (eds.), Molecular Cloning: A Laboratory Manual, 3rd ed., Cold Spring Harbor Press, Plainview, New York (2001); Ausubel, F.M., et al. (eds.), Current Protocols in Molecular Biology, John Wiley & Sons, New York (2010); Coligan, J.E., et al. (eds.), Current Protocols in Immunology, John Wiley & Sons, New York (2010); Mirzaei, H. and Carrasco, M. (eds.), Modern Proteomics - Sample Preparation, Analysis and Practical Applications in Advances in Experimental Medicine and Biology, Springer International Publishing, 2016; Comai, L, et al., (eds.), Proteomics: Methods and Protocols in Methods in Molecular Biology, Springer International Publishing, 2017; Mali P, Esvelt KM, and Church GM. Cas9 as a versatile tool for engineering biology. Nat Methods. 2013 Oct;10(10):957-63; and Dominguez AA, Lim WA, and Qi LS. Beyond editing: repurposing CRISPR-Cas9 for precision genome regulation and interrogation. Nat Rev Mol Cell Biol. 2016 Jan;17(1):5-15, for definitions and terms of art.
For convenience, certain terms employed herein, in the specification, examples and appended claims are provided here. The definitions are provided to aid in describing particular embodiments and are not intended to limit the claimed invention, because the scope of the disclosure is limited only by the claims.
A nucleic acid is a polymer of monomer units or “residues”. The monomer subunits, or residues, of the nucleic acids each contain a nitrogenous base (i.e., nucleobase), a five-carbon sugar, and a phosphate group. The identity of each residue is typically indicated herein with reference to the identity of the nucleobase (or nitrogenous base) structure of each residue. Canonical nucleobases include adenine (A), guanine (G), thymine (T), uracil (U) (in RNA instead of thymine (T) residues) and cytosine (C). However, the nucleic acids of the present disclosure can include any modified nucleobase, nucleobase analogs, and/or non-canonical nucleobase, as are well-known in the art. Modifications to the nucleic acid monomers, or residues, encompass any chemical change in the structure of the nucleic acid monomer, or residue, that results in a noncanonical subunit structure. Such chemical changes can result from, for example, epigenetic modifications (such as to genomic DNA or RNA), or damage resulting from radiation, chemical, or other means. Illustrative and nonlimiting examples of noncanonical subunits, which can result from a modification, include uracil (for DNA), 5-methylcytosine, 5-hydroxymethylcytosine, 5-formylcytosine, 5-carboxycytosine, β-glucosyl-5-hydroxymethylcytosine, 8-oxoguanine, 2-amino-adenosine, 2-amino-deoxyadenosine, 2-thiothymidine, pyrrolo-pyrimidine, 2-thiocytidine, or an abasic lesion. An abasic lesion is a location along the deoxyribose backbone that lacks a base. Known analogs of natural nucleotides hybridize to nucleic acids in a manner similar to naturally occurring nucleotides, such as peptide nucleic acids (PNAs) and phosphorothioate DNA.
The five-carbon sugar to which the nucleobases are attached can vary depending on the type of nucleic acid. For example, the sugar is deoxyribose in DNA and is ribose in RNA. In some instances herein, the nucleic acid residues can also be referred to with respect to the nucleoside structure, such as adenosine, guanosine, 5-methyluridine, uridine, and cytidine. Moreover, alternative nomenclature for the nucleoside also includes indicating a “ribo” or “deoxyribo” prefix before the nucleobase to indicate the type of five-carbon sugar. For example, “ribocytosine” as occasionally used herein is equivalent to a cytidine residue because it indicates the presence of a ribose sugar in the RNA molecule at that residue. A nucleic acid polymer can be or comprise a deoxyribonucleotide (DNA) polymer or a ribonucleotide (RNA) polymer. The nucleic acids can also be or comprise a PNA polymer, or a combination of any of the polymer types described herein (e.g., contain residues with different sugars).
The terms “polynucleotide”, “nucleotide”, “nucleotide sequence”, “nucleic acid” and “oligonucleotide” are used interchangeably. They refer to a polymeric form of nucleotides of any length, either deoxyribonucleotides or ribonucleotides, or analogs thereof. Polynucleotides may have any three-dimensional structure, and may perform any function, known or unknown. The following are non-limiting examples of polynucleotides: coding or non-coding regions of a gene or gene fragment, loci (locus) defined from linkage analysis, exons, introns, messenger RNA (mRNA), transfer RNA, ribosomal RNA, short interfering RNA (siRNA), short-hairpin RNA (shRNA), micro-RNA (miRNA), ribozymes, cDNA, recombinant polynucleotides, branched polynucleotides, plasmids, vectors, isolated DNA of any sequence, isolated RNA of any sequence, nucleic acid probes, and primers. A polynucleotide may comprise one or more modified nucleotides, such as methylated nucleotides and nucleotide analogs. If present, modifications to the nucleotide structure may be imparted before or after assembly of the polymer. The sequence of nucleotides may be interrupted by non-nucleotide components. A polynucleotide may be further modified after polymerization, such as by conjugation with a labeling component.
As used herein, the term “polypeptide” or “protein” refers to a polymer in which the monomers are amino acid residues that are joined together through amide bonds. When the amino acids are alpha-amino acids, either the L-optical isomer or the D-optical isomer can be used, the L-isomers being preferred. The term polypeptide or protein as used herein encompasses any amino acid sequence and includes modified sequences such as glycoproteins. The term polypeptide is specifically intended to cover naturally occurring proteins, as well as those that are recombinantly or synthetically produced.
One of skill will recognize that individual substitutions, deletions or additions to a peptide, polypeptide, or protein sequence which alters, adds, or deletes a single amino acid or a percentage of amino acids in the sequence is a “conservatively modified variant” where the alteration results in the substitution of an amino acid with a chemically similar amino acid. Conservative amino acid substitution tables providing functionally similar amino acids are well known to one of ordinary skill in the art. The following six groups are examples of amino acids that are considered to be conservative substitutions for one another:
Reference to sequence identity addresses the degree of similarity of two polymeric sequences, such as protein or nucleic acid sequences. Determination of sequence identity can be readily accomplished by persons of ordinary skill in the art using accepted algorithms and/or techniques. Sequence identity is typically determined by comparing two optimally aligned sequences over a comparison window, where the portion of the peptide or polynucleotide sequence in the comparison window may comprise additions or deletions (i.e., gaps) as compared to the reference sequence (which does not comprise additions or deletions) for optimal alignment of the two sequences. The percentage is calculated by determining the number of positions at which the identical amino-acid residue or nucleic acid base occurs in both sequences to yield the number of matched positions, dividing the number of matched positions by the total number of positions in the window of comparison and multiplying the result by 100 to yield the percentage of sequence identity. Various software driven algorithms are readily available, such as BLAST N or BLAST P to perform such comparisons.
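By way of a non-limiting illustration, the percentage calculation described above can be sketched in Python for two sequences that have already been optimally aligned. The function name and the use of “-” as the gap character are assumptions made for this example; producing the alignment itself (e.g., with BLAST N or BLAST P) is outside the scope of the sketch.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity of two pre-aligned sequences of equal length.

    Matched positions are divided by the total number of positions in the
    comparison window (gapped positions included) and multiplied by 100.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("comparison window is empty")
    a, b = aligned_a.upper(), aligned_b.upper()
    # A position counts as matched only when both sequences carry the
    # same residue; a gap never counts as a match.
    matched = sum(1 for x, y in zip(a, b) if x == y and x != gap)
    return 100.0 * matched / len(a)


# Hypothetical six-position window: four matches, one mismatch, one gap.
print(round(percent_identity("ACGT-A", "ACCTTA"), 1))  # 66.7
```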
The use of the term “or” in the claims is used to mean “and/or” unless explicitly indicated to refer to alternatives only or the alternatives are mutually exclusive, although the disclosure supports a definition that refers to only alternatives and “and/or.”
Following long-standing patent law, the words “a” and “an,” when used in conjunction with the word “comprising” in the claims or specification, denote one or more, unless specifically noted.
Unless the context clearly requires otherwise, throughout the description and the claims, the words “comprise,” “comprising,” and the like, are to be construed in an inclusive sense as opposed to an exclusive or exhaustive sense; that is to indicate, in the sense of “including, but not limited to.” Words using the singular or plural number also include the plural and singular number, respectively. Additionally, the words “herein,” “above,” and “below,” and words of similar import, when used in this application, shall refer to this application as a whole and not to any particular portions of the application. The word “about” indicates a number within range of minor variation above or below the stated reference number. For example, “about” can refer to a number within a range of 10%, 9%, 8%, 7%, 6%, 5%, 4%, 3%, 2%, or 1% above or below the indicated reference number.
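As a simple illustration of the “about” ranges described above, the following Python sketch checks whether a value falls within a chosen percentage above or below a reference number. The function name and the default tolerance of 10% are assumptions made for this example only.

```python
def is_about(value, reference, tolerance_pct=10.0):
    """Return True if `value` lies within `tolerance_pct` percent
    above or below `reference`."""
    allowed = abs(reference) * tolerance_pct / 100.0
    return abs(value - reference) <= allowed


# Hypothetical check against a reference position of 81 nucleotides.
print(is_about(85, 81, tolerance_pct=5))   # True: 4 <= 4.05
print(is_about(90, 81, tolerance_pct=10))  # False: 9 > 8.1
```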
“Promoter” as used herein means a synthetic or naturally-derived molecule which is capable of conferring, activating or enhancing expression of a nucleic acid in a cell. A promoter may comprise one or more specific transcriptional regulatory sequences to further enhance expression and/or to alter the spatial expression and/or temporal expression of same. A promoter may also comprise distal enhancer or repressor elements, which may be located as much as several thousand base pairs, or anywhere in the genome, from the start site of transcription. A promoter may be derived from sources including viral, bacterial, fungal, plants, insects, and animals. A promoter may regulate the expression of a gene component constitutively, or differentially with respect to cell, the tissue or organ in which expression occurs or, with respect to the developmental stage at which expression occurs, or in response to external stimuli such as physiological stresses, hormones, toxins, drugs, pathogens, metal ions, or inducing agents.
“Protospacer sequence” or “protospacer segment” as used interchangeably herein refers to a DNA sequence targeted by the Cas9 nuclease or Cpf1 nuclease in the CRISPR bacterial adaptive immune system. In the CRISPR/Cas9 system, the protospacer sequence is typically followed by a protospacer-adjacent motif (PAM); the PAM is at the 3′-end. In the CRISPR/Cpf1 system, the PAM is followed by the protospacer sequence; the PAM is at the 5′-end.
“Protospacer adjacent motif” or “PAM” as used herein refers to a DNA sequence immediately following the DNA sequence targeted by the Cas9 or immediately before the DNA sequence targeted by the Cpf1 nuclease in the CRISPR bacterial adaptive immune system.
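The relative placement of PAM and protospacer for the two nucleases can be illustrated with a short Python sketch. The PAM motifs used here (NGG for Cas9 and TTTV for Cpf1), the 20-base protospacer length, and all names in the code are assumptions chosen for illustration and are not taken from the definitions above; only the supplied strand is scanned.

```python
import re

# Assumed PAM motifs and orientation, for illustration only:
# Cas9: protospacer followed by NGG; Cpf1: TTTV followed by protospacer.
PAM_RULES = {
    "Cas9": {"pattern": "[ACGT]GG", "pam_follows_protospacer": True},
    "Cpf1": {"pattern": "TTT[ACG]", "pam_follows_protospacer": False},
}


def find_protospacers(seq, nuclease, length=20):
    """Return (protospacer, PAM) pairs found on the given strand."""
    rule = PAM_RULES[nuclease]
    seq = seq.upper()
    hits = []
    # Lookahead so that overlapping PAM motifs are all reported.
    for match in re.finditer("(?=(" + rule["pattern"] + "))", seq):
        pam = match.group(1)
        pam_start = match.start()
        pam_end = pam_start + len(pam)
        if rule["pam_follows_protospacer"]:
            start, end = pam_start - length, pam_start
        else:
            start, end = pam_end, pam_end + length
        # Skip PAMs too close to either end to fit a full protospacer.
        if start >= 0 and end <= len(seq):
            hits.append((seq[start:end], pam))
    return hits


# Hypothetical sequences: a 20-base target followed by TGG, and TTTA
# followed by a 20-base target.
print(find_protospacers("ACGT" * 5 + "TGG", "Cas9"))
print(find_protospacers("TTTA" + "CG" * 10, "Cpf1"))
```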
Disclosed are materials, compositions, and components that can be used for, can be used in conjunction with, can be used in preparation for, or are products of the disclosed methods and compositions. It is understood that, when combinations, subsets, interactions, groups, etc., of these materials are disclosed, each of various individual and collective combinations is specifically contemplated, even though specific reference to each and every single combination and permutation of these compounds may not be explicitly disclosed. This concept applies to all aspects of this disclosure including, but not limited to, steps in the described methods. Thus, specific elements of any foregoing embodiments can be combined or substituted for elements in other embodiments. For example, if there are a variety of additional steps that can be performed, it is understood that each of these additional steps can be performed with any specific method steps or combination of method steps of the disclosed methods, and that each such combination or subset of combinations is specifically contemplated and should be considered disclosed. Additionally, it is understood that the embodiments described herein can be implemented using any suitable material such as those described elsewhere herein or as known in the art.
CRISPR-Cas system has been repurposed for several applications in the field of synthetic biology including transcriptional modifications based on catalytically-dead Cas9 (dCas9), guided-RNA, and other prosthetic machinery. CRISPR interference (CRISPRi) can be achieved by having dCas9 physically blocking the RNA polymerase function. On the other hand, CRISPR activation (CRISPRa) requires an auxiliary component to recruit/stabilize RNA polymerase to the proper position to elevate the expression of designated genes. Combined CRISPR activation/repression (CRISPRa/i) circuits can be programmed at the engineered guide-RNA(s) containing complementary sequences to the DNA target, single-guide-RNA (sgRNA) for CRISPRi, and scaffold-RNA (scRNA) for CRISPRa. Therefore, CRISPRa/i can provide a programmable environment towards genome-scale engineering to accelerate chemical production optimization.
Moreover, the CRISPRa/i tool was recently demonstrated to be applicable in Pseudomonas putida, the emerging bacterial chassis that has different industrially relevant traits suitable for metabolic engineering applications. Hence, the accelerated genetic engineering platform will be applied to P.putida and other microbes of interest to explore biosynthesis space beyond a benchmark E.coli host. The biosynthetic pathway of p-aminocinnamic acid was chosen for manipulation due to difficulties observed in the E.coli system.
Successful acceleration of genetic manipulation tools will provide a novel platform for strain engineering towards bioproduction of desired chemicals in multiple organisms of choice. By reducing the time needed to move from individual gene engineering at the genome level to multiple gene perturbations based on a CRISPRa/i program, access to novel strains will be significantly improved.
CRISPR-based transcriptional activation is enabled by appending the transcription factor that recruits/stabilizes RNA polymerase to the bacterial promoter and increases transcription rate. The initial screening covered diverse protein candidates and SoxS was found to be the best performer among several candidates. The novel CRISPRa tool was also demonstrated simultaneously with CRISPRi to activate/repress multiple fluorescent reporters. Further, the engineering of SoxS with R93A and S101A mutations led to significant improvement in activation which also allowed upregulation of endogenous promoters with moderate success rate. The present disclosure demonstrates successful use of CRISPRa tool in P.putida with thorough methodology to enable bioproduction of biopterins and mevalonic acid. Even though the genetic tools available in E.coli were shown to be somewhat compatible in P.putida, key limitations existed in the plasmid-borne expression system used in E.coli, which is incompatible with other bacterial strains distantly related to Escherichia genus. Therefore, broad-host-range plasmids were used instead, and an initial CRISPRa activity was demonstrated, and the expression system was optimized in both plasmid-borne and genome-integrated manners in which the latter was demonstrated to be superior in comparison (Example 1,
As CRISPRa recruits/stabilizes RNA polymerase to the designated promoter region, there are specific requirements for CRISPRa to work effectively. Several factors, e.g., distance to the transcriptional start site (TSS) and promoter strengths, influence the efficiency of CRISPRa, and these factors were investigated in the P.putida system for optimization (Example 1,
Despite the CRISPRa activity of synthetic reporters observed, the activation of endogenous promoters remains challenging. The inability to engineer the endogenous promoter led to unoptimized traits for CRISPRa, e.g., limited availability of the PAM sites and out-of-range promoter strengths.
To examine the endogenous CRISPRa capability in P.putida, fluorescent protein fusion reporters were constructed with the native/endogenous promoters and CRISPRa activity tested with all available scRNAs filtered by working distance-to-TSS. Out of 10 promoters tested, 4 can be activated with at least 1.5-fold increase in the fluorescent output (
The inventors have successfully ported CRISPRa and characterized effective CRISPRa in P.putida. Further, the data presented herein establish that endogenous genes can be activated in P.putida using fluorescent protein fusion reporter.
Although characterization of CRISPRa in P.putida was an important step, it was equally critical to characterize the repression counterpart (CRISPRi). CRISPRi was previously reported to function in P.putida and can be used in various applications. To this end, the inventors tested CRISPRi efficiency with 17 different sgRNAs targeting an sfGFP reporter integrated into the P.putida genome. It was observed that distance is not the only factor governing CRISPRi efficiency and that RNA folding energetics play an important role. To further investigate the polar effect of interfering with an adjacent gene in an operon, CRISPRi was also tested in sfGFP-mRFP and mRFP-sfGFP operons. Targeting one gene in the operon is likely to affect the expression level of the adjacent gene regardless of orientation. (Data not shown)
Furthermore, to demonstrate the simultaneous activity of CRISPRa and CRISPRi, the inventors constructed plasmid-borne and genome-integrated dual-fluorescent-protein reporters and introduced them into P.putida. The CRISPRa and CRISPRi circuits were shown to be functional both individually and simultaneously on these promoters (
Other than its ability to activate and repress genes of interest, the CRISPRa/i program also encompasses fine-tuning of the perturbation levels using various strategies. The degree of activation and repression can be tuned in respect to concentration of each CRISPRa/i machinery (
Based on these studies and the data presented herein, the inventors have characterized CRISPRi in P.putida, including the polar effect of targeting a multi-gene operon, and have demonstrated that CRISPRa and CRISPRi work simultaneously in both plasmid-borne and genome-integrated platforms, that the CRISPRa/i level can be tuned in various ways, that the levels of CRISPR components affect the level of CRISPRa/i, and that scRNA truncation decreases CRISPRa magnitude.
To prove the ability to accelerate strain engineering processes in bacteria, p-aminocinnamic acid (pACA) production pathway was chosen as an exemplary pathway. pACA is a non-native chemical in biologically-derived chemical repertoire which can be achieved by coupling the p-aminophenylalanine (pAF) biosynthetic pathway, retrieved from Pseudomonas fluorescens, with phenylalanine ammonia lyase, available in plant and fungal chemistry. With further chemical modification or bioconversions, pACA can be converted into p-aminostyrene (pAS) which is a precursor of derivatized polystyrene, containing a functional group for further modification or functionalization with other polymers. The biosynthesis of pACA directly from common feedstocks, e.g., glucose, has not been reported which may suggest that production of pACA in bacteria is problematic. Assuming that high concentration of pACA could be deleterious to the host, a growth experiment using E.coli, the standard chassis, and P.putida, known for resistance to aromatic compounds was performed. It is obvious that E.coli cannot tolerate high concentrations of pACA while P.putida growth is less affected. See
In E.coli, pAF can be produced by CRISPRa control of two operons: aroG*L from E.coli and papABC from P.fluorescens. The whole cassette was successfully ported into P.putida compatible plasmid, and it was shown that pAF can be produced efficiently. Next, the pal-tyrB operon, Pal from Arabidopsis thaliana and tyrB from E.coli, was supplied to enable the pACA production from pAF on the second plasmid and trace amounts of pACA in the supernatant were observed. See
Prior to genome-wide perturbations, the heterologous gene expressions were optimized to ease the downstream engineering. Three approaches, to both express pACA biosynthetic pathway and perturb P.putida endogenous metabolism, were developed (
Therefore, a second approach was taken to reduce the number of plasmids to one, with three operons incorporated alongside a multi-gRNA program. Accordingly, a large plasmid (13 kb) incorporating the three operons was successfully constructed, and it demonstrated elevated production of pACA. However, an additional colony morphology with larger colony size was observed in the P.putida transformation, which may suggest instability of the oversized plasmid. Restriction digest and sequencing suggested that part of the plasmid had been deleted, plausibly by transient recombination activity in P.putida. To mitigate this problem, a smaller backbone architecture for the broad-host-range pBBR1 plasmid was used.
The third approach was to move the whole heterologous gene cassettes into the P.putida genome, leaving only the compact gRNA program on the plasmid. Initially, the first two operons for producing pAF were moved into the genome, and pAF production by this approach was drastically reduced. Comparison of protein expression from the multi-copy plasmid and the single-copy genome-integrated cassette showed that the expression capacity of the genome-integrated cassette is several orders of magnitude lower than that of the plasmid. When the weak promoter of the integrated cassettes was replaced with one of moderate strength, the pAF level increased significantly and was sufficient for pACA production. However, pACA production by this approach was lower than with the large-plasmid approach. At this point, both approaches are suitable for pACA production.
With the pACA production platform established, the inventors used genome-scale manipulation to optimize the process. The available P.putida KT2440 genome-scale model (GSM) was modified, using the MetaCyc database, to include the pACA production pathway (papABC and pal reactions from chorismate to pACA). By iteratively changing the chemical reactions corresponding to single-gene perturbations, gene candidates whose upregulation or downregulation leads to higher pACA production were recommended. The 12 genes recommended for upregulation were mainly related to aromatic amino acid biosynthesis or central carbon metabolism. The 31 genes recommended for downregulation were mostly related to nucleic acid biosynthesis and amino acid biosynthesis. Eight additional reactions that potentially compete with pACA biosynthesis were also identified and included among the downregulation candidates.
To test the ability to perturb endogenous gene candidates, CRISPRa/i activity was investigated using GFP-fusion reporters constructed by appending the sfGFP gene to the coding sequence of potential targets. The sfGFP sequence was fused 60 bp after the start codon for CRISPRa, similar to reported literature, and 300 bp after the start codon for CRISPRi to leave space for the CRISPRi target in the coding sequence. All scRNAs at a proper distance from the reported or predicted TSS were screened. Out of 11 promoters (PP_0578 and PP_0579 are under the same promoter), 7 promoters were activated with >1.5-fold activation (
For CRISPRi candidates, all sgRNAs were analyzed with the Wayfinder algorithm and screened based on RNA folding energetics. The two best sgRNA candidates for each promoter will be tested experimentally for both CRISPRi efficiency and growth defect. Out of 38 CRISPRi target promoters (PP_0420 and PP_0421 are under the same promoter), the first 5 promoters tested showed > 1.5-fold repression (
In summary, pACA was selected to demonstrate strain engineering acceleration. The direct bioproduction of pACA from glucose in bacteria was demonstrated in P.putida CRISPRa platform. Phenylalanine ammonia lyase (Pal) from R. glutinis was observed to outperform A. thaliana Pal in pACA conversion.
Further, different approaches to delivering heterologous genes and multi-gRNA cassettes were tested. The two-plasmid system appeared to be relatively burdensome, whereas the large single-plasmid system provided the highest pACA production but suffered from instability. Genome integration of the pAF/pACA pathway was also tested. CRISPR-controlled expression of the integrated pAF pathway yielded a significantly decreased amount of pAF compared to the plasmid version, plausibly due to the change in copy number. Increasing the promoter strength elevated pAF production, and pACA production was enabled with plasmid-borne Pal expression. Finally, an adjusted genome-scale model (GSM) was used to recommend targets for CRISPRa/i perturbations. Twelve CRISPRa targets and 31 CRISPRi targets were identified using the GSM. Eight additional genes were identified as belonging to potential competing pathways for pACA production. Seven of the eleven CRISPRa candidates were found to be activatable. The five CRISPRi candidates tested were all found to be repressible.
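The 1.5-fold criterion used above to call a promoter activatable or repressible can be illustrated with a minimal sketch. The promoter names and fluorescence values below are hypothetical placeholders rather than data from the present disclosure, and the use of replicate means, an off-target guide as the control, and an inclusive (at least 1.5-fold) comparison are assumptions made for illustration only.

```python
from statistics import mean

FOLD_THRESHOLD = 1.5  # cutoff stated in the text; inclusive comparison is an assumption


def fold_change(on_target, control):
    """Ratio of mean reporter signal with an on-target guide to the control signal."""
    return mean(on_target) / mean(control)


def classify(promoters, mode):
    """Return the names of promoters that pass the 1.5-fold cutoff.

    mode="activation": on-target / control must reach the cutoff (CRISPRa).
    mode="repression": control / on-target must reach the cutoff (CRISPRi).
    """
    hits = []
    for name, (on_target, control) in promoters.items():
        fc = fold_change(on_target, control)
        if mode == "repression":
            fc = 1.0 / fc
        if fc >= FOLD_THRESHOLD:
            hits.append(name)
    return hits


# Hypothetical triplicate fluorescence readings (arbitrary units):
# name -> (on-target guide replicates, off-target control replicates)
example = {
    "promoter_A": ([300.0, 320.0, 310.0], [100.0, 110.0, 105.0]),
    "promoter_B": ([120.0, 118.0, 125.0], [100.0, 102.0, 98.0]),
}

activated = classify(example, "activation")
print(activated)
```

With these placeholder values only promoter_A passes the activation cutoff; the same function with mode="repression" can be applied to CRISPRi reporter data.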
Production of valuable chemical compounds using engineered biological hosts is a promising route with many chemical advantages, but accommodating, avoiding, or taking advantage of endogenous metabolism and its accompanying regulation can be a major obstacle to industrially relevant bioproduction. Often, overcoming this obstacle requires wide-ranging alterations of endogenous metabolism, and new tools have emerged to understand the effects of such changes. Large-scale observation of strain engineering effects using -omics technologies, combined with genome-scale modeling, and design-build-test-learn (DBTL) approaches enhanced by machine learning, hold great promise for rapid improvement of production strains through well-targeted changes to endogenous metabolism.
The present disclosure demonstrates that the combinatorial, orthogonal, and tunable features of CRISPR-based expression control can be effectively leveraged and is well-matched with the framework of DBTL cycles for incremental strain improvement.
Aromatic compounds are a promising but challenging class of bioproducts due to their connection to the host’s central metabolism through the aromatic amino acid precursor chorismate. Development of aromatic compound-producing strains is particularly attractive when using renewable, non-edible lignocellulosic feedstocks. The products and intermediates can pose challenges, however, due to toxicity and solubility concerns. For example, p-aminocinnamic acid (p-ACA) can be used as a precursor for p-aminostyrene, but p-ACA production in E.coli is accompanied by toxic effects, even though its immediate precursor p-aminophenylalanine (p-AF) can safely accumulate in that host. The present disclosure demonstrates that Pseudomonas putida is a more suitable host, free from these toxic effects.
The heterologous contributions to the p-ACA production pathway start with a feedback-resistant AroG and overexpression of AroL aimed at boosting levels of the endogenous precursor chorismate. From there, the Pseudomonas fluorescens enzymes PapA, PapB, and PapC produce p-aminophenyl pyruvate, which becomes p-AF through endogenous transaminase activity, sometimes supplemented by additional expression of E.coli TyrB. Finally, a phenylalanine ammonia lyase enzyme (PAL), either from Arabidopsis thaliana or Rhodotorula glutinis, converts p-AF to p-ACA. Production of p-ACA from this strain is likely to be enhanced by boosting metabolic flux into the pathway and by limiting loss of flux to side products, and we aim to rapidly design this enhancement using the machine-learning-based DBTL approach.
The modularity and orthogonality of not only heterologous enzyme expression by CRISPRa, but of a whole array of endogenous CRISPRa/i interventions, can be used to introduce the variation driving such DBTL improvement, especially in a host like P.putida. Additionally, the data presented herein suggest substantial freedom to expand this array of scRNAs/gRNAs to arbitrary size. This expansion relates less to autoregulatory architectures and more to a wide-ranging, single-layered control circuit. If circuit expansion becomes overly burdensome at some point, however, either through expression burden or through changes to endogenous metabolism, autoregulation can be added to the circuit as easily as adding another gRNA. The present disclosure thus describes a non-model bacterial strain producing p-ACA under the control of CRISPR-based expression.
As originally implemented in E.coli, the p-AF production pathway consisted of the P. fluorescens papABC operon under tet-inducible control, along with a feedback-resistant E.coli aroG and aroL in their own operon, which is also tet-inducible. Inducing this pathway in the DH10B strain routinely produced up to 800 µM p-AF, but using an A.thaliana PAL2 enzyme to extend pathway flux to the products downstream of p-AF proved difficult, probably because the toxic effect of p-ACA on E.coli arrests the growth of any cells producing it. Toxic amounts of extracellular p-ACA, the result of a spike into the media, are shown in
In contrast to E.coli, p-ACA has little effect on P.putida growth, even up to 20 mM extracellular (close to its solubility limit). Therefore, the inventors chose to port the existing pAF pathway into P.putida to extend it to more valuable downstream aromatic products.
From this baseline of p-AF production, p-ACA was produced first with the A. thaliana PAL2, which was later replaced with R. glutinis PAL. The challenging aspect of this process was balancing the burden contributed by plasmid-based genetics against the need to express enough enzyme to produce detectable amounts of metabolite. The pathway including PAL was large enough to present difficulties fitting onto one stable plasmid, P.putida is severely burdened by a second plasmid, and strengthening the base CRISPRa promoters in preparation for genomic integration (and its reduction in copy number relative to plasmid) proved difficult. Despite these challenges, surprisingly, even suboptimal expression was enough to produce small amounts of p-ACA, a first from a bacterial host.
While the plasmid-based, CRISPRa-controlled p-AF pathway was producing up to 1.3 mM p-AF extracellularly, the inventors sought to expand the circuit, through both: additional enzymes (namely, PAL); and additional scRNAs and gRNAs with endogenous targets. Since the pBBR1 plasmid is already large, initially the inventors focused on integrating the enzyme genes, driven by their synthetic CRISPRa promoters, while keeping the arbitrarily large scRNA/gRNA array on the plasmid for ease of adjustment. Given the eventual goal of several DBTL cycles optimizing the effects of these adjusted CRISPRa/i interventions, this ease of adjustment was an important design factor balancing the substantial burden of carrying a plasmid in P.putida. This burden is mitigated somewhat by limiting the size of the plasmid, and the inventors integrated as much of the heterologous genes as possible, aided by the trans-acting nature of scRNAs.
Because integration would lead to a reduction in DNA copy number from the medium-copy plasmid to the single-copy genome, and this reduction in gene dosage would reduce overall expression levels, even when activated by CRISPRa, the inventors sought to use stronger base promoters within the synthetic CRISPRa promoters. As anticipated, upon integration of papABC and aroGL, the lower expression level was found to be not producing enough enzyme to produce measurable p-AF in the culture supernatant.
Guided by a small promoter strength library controlling RFP expression by CRISPRa, J23110, J23106, and J23105 base promoters were cloned, combined with low leak upstream sequences (between the spacer target and the base promoter). The challenge with this methodology was that the integration process required an initial step of plasmid cloning in E.coli. E.coli does not tolerate even low-copy plasmids like pSC101**, pBBR1, and pGNW, specifically, when combined with the stronger base promoters and the substantial size of the output genes. This challenge was also not resolved by cloning-specific E.coli strains.
To address this problem, a cloning workflow was devised in which In-Fusion reactions were co-transformed with a “helper” plasmid consisting of dCas9 and a gRNA targeted to repress any output of a J23110 promoter. Due to sequence similarities between J23110, J23106, and J23105, it was reasoned that the same helper plasmid would sufficiently repress any of these promoters. This helper plasmid was maintained throughout the plasmid-cloning phase of integration, until eventual transformation into P.putida, before which it was restriction digested into nonreplicable linear fragments. Combined with a highly competent pir+ cloning strain, this strategy resulted in successful cloning of the plasmid-based phase of the integration workflow, and successful transformation of the integration plasmid into the P.putida recipient strain. Production of p-AF and p-ACA by strains with J23110 and J23105 base promoters driving papABC and aroGL expression is shown in
An alternative strategy used a second (pRK2 origin) plasmid, into which A. thaliana PAL2 under control of the J6 synthetic promoter was cloned, with optional inclusion of E.coli TyrB in the same operon. The burden of the second plasmid was not well tolerated by P.putida, resulting in diminished growth rate and greatly reduced p-AF production. Despite the low concentration of its substrate, activation of PAL2 in this system resulted in a minuscule amount of p-ACA production, demonstrating the viability of even this suboptimal strategy. Because the extracellular p-ACA titers were so low, however, optimization of the base pathway was continued before implementing endogenous CRISPRa/i or iterating DBTL cycles, with the aim of gaining more certainty in quantifying the p-ACA production differences arising from these interventions.
Within the two-plasmid system, the A.thaliana PAL2 was replaced with a PAL enzyme from the yeast Rhodotorula glutinis, despite the system’s low production of pAF. It has been reported that Rg-PAL shows more promiscuous activity than A. thaliana’s similar enzyme PAL4, which is more specific to the native substrate phenylalanine. It was rationalized, therefore, that Rg-PAL might have more activity on the heterologous substrate p-AF than At-PAL2. Even with low amounts of that substrate, Rg-PAL indeed produced a substantial increase in p-ACA.
Assuming a proportional increase in p-ACA production as in p-AF production, this new, Rg-PAL-containing pathway worked into a one-plasmid system would theoretically predict p-ACA titers reaching into the millimolar range, even before flux optimization by endogenous CRISPRa/i. Upon building multiple versions of one-plasmid p-ACA production strains, it was found that p-ACA titers were not quite so dramatic, but still an improvement over the two-plasmid system.
The strategies for constructing this system were either a large plasmid containing scRNAs, papABC, aroGL, and Rg-PAL, but excluding tyrB; or a smaller plasmid containing only scRNAs and Rg-PAL, used in one of the strains with papABC and aroGL integrated and driven by either the J23110 or the J23105 base promoter. Avoiding the burden of maintaining and replicating the second plasmid resulted in up to four-fold improvement of p-ACA titer. Hence, the one-plasmid strain was chosen as the basis for production improvement through endogenous CRISPRa/i during DBTL cycles.
To confirm that the identified p-ACA was the product, and that its accumulation in culture supernatant was stable, numerous follow-up experiments were performed verifying the peak location within the chromatogram, the lack of consumption of extracellular p-ACA by growing P.putida cultures, and the lack of p-ACA toxicity even in a strain with the putative catabolic pathway knocked out.
Not only does HPLC indicate p-ACA production, but it also reveals several side products whose titers are increased by CRISPR activation of papABC/aroGL or of either PAL. These metabolites are produced by heterologous enzymes acting on endogenous substrates, or by endogenous enzymes acting on heterologous substrates, especially the pathway intermediates. Independent activation of each promoter was utilized to determine which side products are associated with each heterologous gene expression. Metabolites with peaks occurring at 4.8 minutes and 6.1 minutes in the HPLC chromatogram are produced by PapABC and/or AroGL, while a metabolite with a 17.7-minute peak is produced by PAL. Interestingly, they all undergo significant CRISPR activation even in the weak (J23117) integrated strain, leading to the conclusion that even in that system there is enough enzyme to affect overall metabolism, and suggesting that reducing side pathway activity through endogenous CRISPRa/i could result in p-ACA production even from that nonproducing strain. Regardless of whether one can achieve production from the J23117 integrated strain, the aim was to use endogenous CRISPRa/i to improve a strain that has already demonstrated production: either using the two-plasmid system or one of the one-plasmid alternatives.
Once a base strain was optimized (to the point where it is stable, produces a reasonably quantifiable amount of p-ACA, and easily accepts changes to the scRNA/gRNA program), the inventors sought to improve production through iterative improvement of the CRISPR program responsible for modulating endogenous metabolism. The detection of side products in the heterologous-only pathway suggests the potential for substantial improvement because there is metabolic flux adjacent to the desired pathway. Whether the detected side products arise from endogenous enzymes or endogenous substrates, they can provide clues for rational selection of endogenous CRISPRa/i targets. For example, it was determined through spike-in experiments that the 6.1-minute peak corresponds to p-aminobenzoic acid, likely resulting from endogenous PapC acting on a heterologous intermediate and competing with PapB for that substrate. It is reasonable to expect, then, that PapC is a high-priority target for CRISPRi. Such a knockdown would relieve some of the competition with PapB and redirect metabolic flux into the heterologous pathway. Thus, a wide array of endogenous CRISPRa/i interventions that work in combination to improve product titer is envisioned herein.
The present disclosure provides systems and methods for predicting the size of endogenous CRISPRa/i scRNA/gRNA arrays for optimizing p-ACA production. Other factors that are likely to limit size include competition for dCas9 binding and the metabolic effects of the interventions themselves. To investigate the former’s effect, and perhaps to quantify the burden-limited size of a CRISPRa/i circuit, the effects of an arbitrary number of off-target scRNAs/gRNAs on both a CRISPR-activated reporter gene and a different constitutive reporter gene will be determined. The former reporter will determine the circuit size’s effect on CRISPR functionality, while the latter will determine the circuit size’s effect on overall expression capacity (and normalize this effect out of the CRISPR-specific effect).
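The two-reporter normalization described above can be sketched as follows. This is an illustrative calculation only: the function name, the use of simple ratios relative to a strain carrying no off-target guides, and the example values are assumptions and are not taken from the present disclosure.

```python
def circuit_size_effect(activated, constitutive, activated_ref, constitutive_ref):
    """Split the effect of added off-target guides into two components.

    activated / constitutive: reporter signals in the strain carrying N off-target guides.
    activated_ref / constitutive_ref: the same reporters with no off-target guides.

    Returns (capacity, crispr_specific):
      capacity        - fraction of overall expression capacity that remains,
                        read from the constitutive reporter.
      crispr_specific - change in CRISPR-activated output after dividing out
                        the capacity change.
    """
    capacity = constitutive / constitutive_ref
    raw_activation = activated / activated_ref
    crispr_specific = raw_activation / capacity
    return capacity, crispr_specific


# Hypothetical reporter signals (arbitrary units) for a strain with added off-target guides
capacity, crispr_specific = circuit_size_effect(
    activated=400.0,
    constitutive=400.0,
    activated_ref=1000.0,
    constitutive_ref=500.0,
)
print(capacity, crispr_specific)
```

In this hypothetical case the CRISPR-activated reporter falls to 40% of its reference value, of which a drop to 80% is attributable to reduced overall expression capacity; the remaining factor is the CRISPR-specific component.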
The present disclosure demonstrates the utility and portability of CRISPR-based control with pathway enzymes as outputs instead of reporter proteins. The present disclosure not only demonstrates production of p-AF on par with tet-inducible control, it also demonstrates p-ACA production in a bacterial host, made possible by porting the circuit to a host better suited to the pathway chemistry. Compared to TF control, orthogonal CRISPR-activatable promoters allow for more independent control of individual operons and endogenous targets, while still retaining the ability to use dCas9 or MCP-SoxS expression as a master regulator. This independent control can be used to tune enzyme stoichiometry within heterologous pathways and to rationally prioritize endogenous CRISPRa/i targets based on observed side products.
Another potential pitfall of large CRISPR-controlled circuits is competition between RNAs for binding to dCas9, with a recent report noting a ten-fold reduction in CRISPRi efficacy when co-expressing 5-10 gRNAs, though CRISPRa efficacy may be more resistant. To increase the achievable circuit size, autoregulation of CRISPR activity is an option and, importantly, can be controlled by CRISPR itself. In addition, dCas9 (and MCP) affinity can be equalized between different scRNAs/gRNAs. Observations of very low CRISPRa fold-activation at high base promoter strengths could form the basis of a small autoregulatory boost to the levels of unbound shared components when those levels are reduced by binding competition.
Publications cited herein and the subject matter for which they are cited are hereby specifically incorporated by reference in their entireties. A listing of bacterial strains and plasmids used in the present disclosure can be found in Tables 1 and 2. Sequence identifiers for all the sequences disclosed herein are listed in Tables 4, 5, 7, and 8.
The following examples are set forth so as to provide those of ordinary skill in the art with a complete disclosure and description of how to make and use the present invention and are not intended to limit the scope of what the inventors regard as their invention nor are they intended to represent that the experiments below are all or the only experiments performed.
The examples disclose the inventors’ development of CRISPRa for programming heterologous gene expression in a Pseudomonas bacterial strain, e.g., P.putida KT2440. These efforts establish a framework for the further development of CRISPRa tools for programming gene expression in industrially promising bacteria.
Elements of the disclosure are included in Kiattisewee, C., Dong, et al., (2021) Portable bacterial CRISPR transcriptional activation enables metabolic engineering in Pseudomonas putida. Metabolic engineering, 66, 283-295, incorporated herein by reference in its entirety. Briefly, genetic components were constructed and experimental approaches established to permit CRISPRa machinery to be expressed and utilized in P.putida. By investigating promoter features that impact CRISPRa, such as guide RNA target sites and promoter strengths, designs permitting 30- to 100-fold activation of heterologous reporter gene expression were identified. CRISPRa was coupled with CRISPRi for multi-gene programming and endogenous gene activation. Using an inducible system derived from P.putida, an inducible CRISPRa/CRISPRi platform with low leakage in the uninduced state was developed. Further it was demonstrated that CRISPRa can drive the expression of heterologous genes to produce desirable metabolic products including biopterin derivatives and mevalonate. Using this approach, the inventors demonstrated that the inducible CRISPRa system can generate 40-fold increases in mevalonate production, achieving titers comparable to those from a previously reported IPTG-inducible system. Taken together, this work and the data generated herein provide a toolbox of components and validated workflows for implementing CRISPRa to program heterologous gene expression in P.putida.
Plasmids pBBR1-MCS2(pBBR1-KmR), pBBR1-MCS5(pBBR1-GmR) (Kovach et al., 1995), pTNS1, pUC18T-miniTn7T-GmR (Choi and Schweizer, 2006. mini-Tn7 insertion in bacteria with single attTn7 sites: example Pseudomonas aeruginosa. Nat. Protoc. 1, 153-161), pRK2013, pFLP2, and P.putida KT2440 were a gift from the Harwood lab at the University of Washington. pRK2-AraE (Cook et al., 2018 Genetic tools for reliable gene expression and recombineering in Pseudomonas putida. J. Ind. Microbiol. Biotechnol. 45, 517-527) was a gift from the Pfleger lab at the University of Wisconsin-Madison (Addgene #110141). pMVA2RBS035 (Jervis et al., 2019. Machine Learning of Designed Translational Control Allows Predictive Pathway Optimization in Escherichia coli. ACS Synth. Biol. 8, 127-136) was a gift from the Scrutton lab at the University of Manchester (Addgene #121051). S.pyogenes dCas9 (Sp-dCas9) was expressed from the endogenous Sp.pCas9 promoter and the MCP-SoxS (R93A, S101A) (abbreviated MCP-SoxS) transcriptional activator fusion protein was expressed from the BBa_J23107 promoter (Fontana et al., 2020a. Effective CRISPRa-mediated control of gene expression in bacteria must overcome strict target site requirements. Nat. Commun. 11, 1618) (http://parts.igem.org). The modified single guide RNAs (sgRNA) (Dong et al., 2018. Synthetic CRISPR-Cas gene activators for transcriptional reprogramming in bacteria. Nat. Commun. 9, 2489), scaffold RNAs b2.1xMS2 (scRNAs), were expressed from the BBa_J23119 promoter in the pBBR1-GmR plasmid, unless specified. 20 bp scRNA/sgRNA target sequences are provided in Table 5. mRFP1 and sfGFP reporters were expressed from the weak BBa_J23117 minimal promoter (http://parts.igem.org), unless specified, either by integrating into the genome or in the pBBR1-GmR plasmid together with the scRNA(s). All plasmids were constructed and propagated in E.coli NEB turbo cells (New England Biolabs). All P.putida strains were constructed from the wild-type strain KT2440. 
See Tables 1 and 2 for a complete list of bacterial strains and plasmid constructs used in the present disclosure. Exemplary plasmids used in the present disclosure are listed in Table 3.
P. putida KT2440
All PCR fragments were amplified with Phusion DNA Polymerase (Thermo-Fisher Scientific) for Infusion Cloning (Takara Bio). Transformants were cultured or selected either on Lysogeny Broth (LB) or agar plates, with appropriate antibiotics, used in the following concentrations: 100 µg/mL Carbenicillin, 25 µg/mL Chloramphenicol, 30 µg/mL Kanamycin, 30 µg/mL Gentamicin. Successful constructs were confirmed by Sanger sequencing (GENEWIZ). Details for cloning strategies are well known in the art. The various constructs used for the cloning strategies are described in Tables 2-4, below. sgRNA/scRNA target sequences are provided in Table 5.
Pseudomonas putida genome integrations were performed using tri-parental conjugation for the mini-Tn7 method (Choi and Schweizer, 2006. mini-Tn7 insertion in bacteria with single attTn7 sites: example Pseudomonas aeruginosa. Nat. Protoc. 1, 153-161) or electroporation for the pGNW2 method (Wirth et al., 2019. Accelerated genome engineering of Pseudomonas putida by I-SceI-mediated recombination and CRISPR-Cas9 counterselection. Microb Biotechnol). Plasmid transformations into P.putida were performed either by electroporation (Choi and Schweizer, 2006. mini-Tn7 insertion in bacteria with single attTn7 sites: example Pseudomonas aeruginosa. Nat. Protoc. 1, 153-161) or heat-shock of CaCl2 chemically competent cells (Zhao et al., 2013. [CaCl2-heat shock preparation of competent cells of three Pseudomonas strains and related transformation conditions]. Ying Yong Sheng Tai Xue Bao 24, 788-94).
Fluorescence measurements of reporter gene expression were carried out either by flow cytometry or plate reader. Single colonies from LB plates were inoculated in 500 µL of EZ-RDM (Teknova) supplemented with the appropriate antibiotics and grown in 96-deep-well plates at 30° C. with shaking overnight at 225 rpm. For small-molecule induction, overnight cultures were diluted 100-fold into a new culture with appropriate antibiotics and inducers, then shaken overnight at 30° C., 225 rpm. For flow cytometry, overnight cultures were diluted 1:50 in Dulbecco’s phosphate-buffered saline (PBS) and analyzed on a MACSQuant VYB flow cytometer with the MACSQuantify 2.8 software (Miltenyi Biotec) using the methods and instrument settings as described (Dong et al., 2018. Synthetic CRISPR-Cas gene activators for transcriptional reprogramming in bacteria. Nat Commun 9, 2489). For plate reader measurements, 150 µL of overnight culture was transferred into a flat, clear-bottomed black 96-well plate. OD600 and fluorescence values were measured in a Biotek Synergy HTX plate reader and analyzed using the BioTek Gen5 2.07.17 software. For mRFP1 detection, the excitation wavelength was 540 nm and the emission wavelength was 600 nm. For sfGFP detection, the excitation wavelength was 485 nm and the emission wavelength was 528 nm. Data were plotted using Prism (GraphPad).
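By way of non-limiting illustration, the following Python sketch shows one way plate reader values of the kind described above could be reduced to a fold-activation figure. The disclosure does not specify the normalization; this sketch assumes that fluorescence is divided by OD600 for each well and that fold-activation is the ratio of the mean normalized signal with an on-target scRNA to that with an off-target control. The function names and the numerical values are hypothetical and are not measured data.

```python
from statistics import mean


def normalized_fluorescence(fluorescence, od600):
    """Divide raw fluorescence by OD600 for a single well (assumed normalization)."""
    if od600 <= 0:
        raise ValueError("OD600 must be positive")
    return fluorescence / od600


def fold_activation(on_target_wells, off_target_wells):
    """Ratio of mean normalized signal: on-target scRNA over off-target control.

    Each argument is a list of (fluorescence, OD600) tuples, one per replicate.
    """
    on = mean(normalized_fluorescence(f, od) for f, od in on_target_wells)
    off = mean(normalized_fluorescence(f, od) for f, od in off_target_wells)
    return on / off


# Hypothetical replicate readings, for illustration only (not measured data).
on_target = [(5400.0, 1.8), (5600.0, 2.0), (5200.0, 1.6)]
off_target = [(200.0, 2.0), (180.0, 1.8), (220.0, 2.2)]

print(f"fold-activation: {fold_activation(on_target, off_target):.1f}")
```

Blank subtraction or other background corrections could be added before the division if the instrument settings call for them.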
For mevalonate production experiments, the GC-MS method was adapted from prior methods (Pfleger et al., 2007. Microbial sensors for small molecules: Development of a mevalonate biosensor. Metab. Eng. 9, 30-38). Single colonies from LB plates were inoculated in 500 µL of EZ-RDM (Teknova) supplemented with the appropriate antibiotics and grown in 96-deep-well plates at 30° C. with shaking overnight at 225 rpm. Overnight cultures were subcultured by 1:100 dilution into 3 mL of EZ-RDM media with 1% glucose as the carbon source, supplemented with the appropriate antibiotics, and shaken at 225 rpm for 72 hours at 30° C. After 72 hours, 560 µL of cell suspension was acidified with 140 µL of 0.5 M HCl and vortexed. 700 µL ethyl acetate was added and samples were then vortexed again vigorously for 3 minutes and centrifuged at maximum speed in a benchtop centrifuge (15,000 rcf) for 10 min. The organic phase was then transferred into GC-MS vials for analysis. GC-MS analysis was performed using an Agilent 5973 instrument with a temperature program as follows. The inlet temperature was 250° C. (splitless mode). The column flow was kept at 1 mL/min in HP-5MS (Agilent). The temperature cycle started at 80° C. and was followed by a gradient of 20° C./min to 260° C., a second gradient of 40° C./min to 300° C., and a hold at 300° C. for 2 min. m/z = 71, the second most abundant peak corresponding to mevalonolactone, was used for quantitation (Pfleger et al., 2007. Microbial sensors for small molecules: Development of a mevalonate biosensor. Metab. Eng. 9, 30-38). A calibration curve was generated using freshly-prepared D,L-mevalonolactone (Sigma) dissolved in ethyl acetate. The calculated concentration was adjusted by the addition of HCl. Data were plotted using Prism (GraphPad).
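As a non-limiting illustration of the arithmetic implied by the extraction and GC-MS steps above, the following Python sketch computes (i) a volumetric correction for the dilution introduced when 560 µL of culture is acidified with 140 µL of HCl and (ii) the total oven time of the stated temperature program. The sketch assumes that the adjustment for HCl addition is a simple dilution factor; the function names and the example concentration are hypothetical.

```python
def dilution_corrected(measured_conc, sample_ul=560.0, acid_ul=140.0):
    """Scale a measured concentration back to the undiluted culture.

    Assumes the only correction needed is the volumetric dilution from
    adding acid to the sample: (560 + 140) / 560 = 1.25.
    """
    return measured_conc * (sample_ul + acid_ul) / sample_ul


def oven_runtime_min(start_c, ramps, final_hold_min):
    """Total oven time for a program given as (rate in C/min, target in C) ramps."""
    total = 0.0
    temp = start_c
    for rate_c_per_min, target_c in ramps:
        total += (target_c - temp) / rate_c_per_min
        temp = target_c
    return total + final_hold_min


# Hypothetical measured value (mg/L) read from the calibration curve.
print(dilution_corrected(320.0))

# Program from the text: 80 C start, 20 C/min to 260 C, 40 C/min to 300 C, 2 min hold.
print(oven_runtime_min(80.0, [(20.0, 260.0), (40.0, 300.0)], 2.0))
```

Under these assumptions the dilution factor is 1.25 and the oven program runs for 12 minutes in total.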
For the biopterin production experiments, single colonies from LB plates were inoculated in 500 µL of EZ-RDM supplemented with the appropriate antibiotics and grown in 96-deep-well plates at 30° C. with shaking overnight. Each sample was then sub-cultured at 100-fold dilution in 5 mL of EZ-RDM supplemented with the appropriate antibiotics and grown in 14 mL culture tubes at 30° C. with shaking for 24 hours. The cultures were spun down and pteridine concentrations were determined by measuring the OD340 and comparing the results to a standard calibration curve prepared with purchased reagents (Cayman Chemical). The HPLC-MS measurements were performed as described (Ehrenworth et al., 2015. Pterin-Dependent Mono-oxidation for the Microbial Synthesis of a Modified Monoterpene Indole Alkaloid. ACS Synth. Biol. 4, 1295-1307). A detailed HPLC-MS protocol is provided in the Supplementary Methods. Data were plotted using Prism (GraphPad).
The first challenge to enable a CRISPRa system in P.putida is to express the components from E.coli in P.putida. The bacterial CRISPRa system developed in E.coli consists of three components, dCas9, MCP-SoxS, and scRNA (Dong et al., 2018. Synthetic CRISPR-Cas gene activators for transcriptional reprogramming in bacteria. Nat Commun 9, 2489), delivered in a p15A plasmid that is present at ~10 copies/cell (Shetty et al., 2008. Engineering BioBrick vectors from BioBrick parts. J. Biol. Eng. 2, 5) (
E. coli SoxS activator domain was used because it recognizes a motif on RpoA that is conserved between E.coli and P.putida (Dong et al., 2018. Synthetic CRISPR-Cas gene activators for transcriptional reprogramming in bacteria. Nat Commun 9, 2489), and there is no direct homolog of SoxS in P.putida (Park et al., 2006. Regulation of superoxide stress in Pseudomonas putida KT2440 is different from the SoxR paradigm in Escherichia coli. Biochem. Biophys. Res. Commun. 341, 51-56). To test this system in P.putida, the three CRISPRa components need to be expressed at levels sufficient to activate the target gene without dCas9 expression being so high that cellular functions are inhibited (Depardieu and Bikard, 2020. Gene silencing with CRISPRi in bacteria and optimization of dCas9 expression levels. Methods 172, 61-75; Zhang and Voigt, 2018. Engineered dCas9 with reduced toxicity in bacteria: implications for genetic circuit design. Nucleic Acids Res. 46, 11115-11125). Components from two E.coli plasmid constructs, a CRISPRa system plasmid and a reporter plasmid, were first moved directly into two P.putida expression plasmids, pBBR1 and pRK2 (each present at 25-30 copies/cell according to (Cook et al., 2018. Genetic tools for reliable gene expression and recombineering in Pseudomonas putida. J. Ind. Microbiol. Biotechnol. 45, 517-527)) (
P. putida strains with the initial implementation of the CRISPRa system grew poorly on both agar and liquid media (
The highest level of activation (~5-fold) was observed with the scRNA and reporter both expressed from a single pBBR1-GmR backbone, while plasmids with either the pRK2 origin or the KmR marker yielded weaker activation (~2-fold) (
To improve the fold-activation of CRISPRa in P.putida, the criteria for effective CRISPRa observed for E.coli (Fontana et al., 2020a. Effective CRISPRa-mediated control of gene expression in bacteria must overcome strict target site requirements. Nat. Commun. 11, 1618) were investigated. Specifically, factors known to affect CRISPRa efficiency in E.coli include i) the distance of target sequence to transcription start-site (TSS), ii) the sequence composition of the 20 bp scRNA targeting sequence, iii) the basal minimal promoter strength, and iv) the 5′-proximal sequence composition between target sequence and minimal promoter (
In E.coli, the most effective CRISPRa target sites are in the region of -60 to -100 bp before the TSS, with sharp peaks of activity every 10 bases, separated by regions of inactivity (Fontana et al., 2020a. Effective CRISPRa-mediated control of gene expression in bacteria must overcome strict target site requirements. Nat. Commun. 11, 1618). An integrated reporter that can be targeted at multiple sites (J1-sfGFP, previously characterized in E.coli (Fontana et al., 2020a. Effective CRISPRa-mediated control of gene expression in bacteria must overcome strict target site requirements. Nat. Commun. 11, 1618)) was constructed, and plasmids with scRNAs targeting different sites were delivered as shown in
Next, the 20 bp target sequence that is recognized by the scRNA was examined. The experiments described above were performed with the J1 promoter, which contains an array of 20 base target sites. An alternative promoter, termed J3, that has a different set of 20 base target sites was tested. Multiple target sites in the J3 promoter were tested and it was found that the J306 site, located 81 bases upstream of the TSS, yielded the highest fold-activation (
To test whether the different basal expression levels were due to differences in the 20 base target sites or to other features of the promoters, hybrid promoters in which the 20 base J106 target site in J1 was replaced by J306 (J1(306)) and vice versa (J3(106)) were constructed and tested similarly. A low basal expression was observed only with the hybrid J3(106) promoter (
The promoters tested comprise a 35 base minimal promoter that binds the sigma subunit of RNA polymerase and an upstream 170 base sequence region with scRNA target sites. The 35 bp minimal promoter sequence is also a key factor that governs the dynamic range of CRISPRa. In E.coli, it was observed that minimal promoter strength and the sigma factor regulating the promoter have large effects on CRISPRa (Fontana et al., 2020a. Effective CRISPRa-mediated control of gene expression in bacteria must overcome strict target site requirements. Nat. Commun. 11, 1618). However, the alternative sigma factor regulons in P.putida are less characterized compared to those in E.coli. Therefore, the sigma-70 regulon, the house-keeping sigma factor, that covers the majority of E.coli and P.putida endogenous promoters (Fujita et al., 1995) was selected for further investigation.
To test the effects of promoter strength, 11 minimal 35 base promoters from the Anderson promoter collection (BBa_J231XX, parts.igem.org) were introduced into the J3-mRFP reporter (
CRISPRa-mediated expression from the Anderson promoter series followed trends similar to that previously observed in E.coli (Fontana et al., 2020a. Effective CRISPRa-mediated control of gene expression in bacteria must overcome strict target site requirements. Nat. Commun. 11, 1618). When the promoter strengths are extremely weak (BBa_J23109 and BBa_J23113), the CRISPRa fold-activation dropped significantly to 3.1-fold and 1.4-fold compared to 27-fold with the moderately weak BBa_J23117 minimal promoter. As promoter strength increases from BBa_J23117 to the strong BBa_J23110 promoter, CRISPRa fold-activation decreases because basal expression increases ~10-fold, while the maximal CRISPRa output varies by <4-fold (
The last factor tested was the intervening sequence between the 20 base target site and the 35 base minimal promoter, termed the 5′-proximal sequence. This sequence is 26 bp long when using an optimal target site located at -81 bp from the TSS. A pooled library of reporter gene plasmids with variable 26 base 5′ proximal sequences was constructed using a randomized oligo pool. Each reporter retains the same 20 base J306 scRNA target site and the BBa_J23117 minimal promoter. This library was transformed into a P.putida reporter strain and a large number of single colonies were functionally characterized without sequencing each colony. The random 5′-proximal sequences led to a broad range of mRFP levels from CRISPRa (
The variation in CRISPRa outputs with different promoter features suggests that a set of distinct and orthogonal heterologous promoters could be developed for tunable control of gene expression. Promoters with orthogonal 20 base target sequences, together with different 5′ proximal sequences, minimal promoters, and target site positions could be used to access a broad range of CRISPRa-mediated gene expression levels. Further, systematically varying the 5′-proximal sequence could allow for the identification of promoters with lower basal expression and higher dynamic range of activation, similar to the case of the 5′-PS5 sequence mentioned above. The present disclosure contemplates constructing combinatorial libraries of multi-gene programs to explore how independently tuning gene expression levels in metabolic pathways affects product titers.
To compare CRISPRa in P.putida to that in E.coli, a correlation plot of mRFP expression from CRISPRa strains with different promoter sequence variations was constructed (
With an optimized CRISPRa system in P.putida, several strategies to enable more sophisticated control over gene expression programs were explored. Multi-gene CRISPRa/CRISPRi programs were constructed, and endogenous gene activation was demonstrated. Further, an inducible CRISPRa system for tunable, dynamically regulated expression was developed. These strategies will enable the construction of multi-gene programs to rewire metabolic networks for optimal biosynthesis in P.putida.
With optimized expression levels and a delivery strategy for the CRISPRa system in P.putida in place, it was tested whether CRISPRa and CRISPRi can be used together to activate and repress multiple genes. This strategy has been previously successful in E.coli (Dong et al., 2018. Synthetic CRISPR-Cas gene activators for transcriptional reprogramming in bacteria. Nat Commun 9, 2489). A dual-reporter plasmid with weakly expressed mRFP (J3-BBa_J23117-mRFP) and highly expressed sfGFP (J3(106)-BBa_J23111-sfGFP) was constructed. A dual scRNA/sgRNA cassette was inserted in this plasmid with a J306 scRNA for mRFP activation and an sgRNA that targets within the sfGFP open reading frame (ORF) for repression. This plasmid was delivered to a P.putida strain with integrated dCas9/MCP-SoxS, and simultaneous activation of mRFP (6.6-fold) and repression of sfGFP (13-fold) were observed (
To determine if CRISPRa can be used to activate multiple genes simultaneously, a dual-reporter plasmid with weakly expressed mRFP (J3-BBa_J23117-mRFP) and weakly expressed sfGFP (J3(106)-BBa_J23117-sfGFP) was constructed. A dual scRNA cassette with scRNAs that target mRFP and sfGFP for activation was inserted into this plasmid, and the plasmid was delivered to a P.putida strain with integrated dCas9/MCP-SoxS. Simultaneous activation of mRFP (19-fold) and sfGFP (69-fold) was observed (
Additionally, simultaneous CRISPRa/CRISPRi and dual CRISPRa on multi-gene reporters with integrated genomic reporters were also demonstrated. The general trends were similar to what is observed with plasmid-based reporters (
To determine if CRISPRa can activate endogenous promoters, a set of endogenous genes with appropriate upstream scRNA target sites was identified. Thousands of reported TSSs for P.putida (D′Arrigo et al., 2016. Genome-wide mapping of transcription start sites yields novel insights into the primary transcriptome of Pseudomonas putida. Environ. Microbiol. 18, 3466-3481) were analyzed and ten promoters with potentially activatable target sites located at the proper distance from the TSS were selected. Specifically, NGG protospacer adjacent motifs (PAMs), which are required for recognition of Sp-dCas9/guide-RNA complex (Qi et al., 2013. Repurposing CRISPR as an RNA-guided platform for sequence-specific control of gene expression. Cell 152, 1173-83), at distances corresponding to the J105-J112 target sites (
Greater than 1.5-fold activation was observed at 4 of the 10 promoters tested, with the highest fold-activation (2.8-fold) from scRNA G2 targeting the katG (PP_3668) promoter (
This success rate and the magnitude of gene activation at endogenous targets in P. putida were similar to those observed previously in E.coli (Fontana et al., 2020a. Effective CRISPRa-mediated control of gene expression in bacteria must overcome strict target site requirements. Nat. Commun. 11, 1618). To predictably activate any endogenous gene, it will be necessary to further elucidate the rules for effective CRISPRa. Accurate annotations of TSSs and PAM-flexible dCas9 variants to precisely target the optimal distance upstream of the endogenous gene may improve activation (Fontana et al., 2020a. Effective CRISPRa-mediated control of gene expression in bacteria must overcome strict target site requirements. Nat. Commun. 11, 1618). Alternative bacterial activation domains are also available with different properties (Ho et al., 2020. Programmable CRISPR-Cas transcriptional activation in bacteria. Mol. Syst. Biol. 16, e9427; Liu et al., 2019. Engineered CRISPRa enables programmable eukaryote-like gene activation in bacteria. Nat. Commun. 10, 3693), and it may be possible to combine multiple activators as has been previously reported in eukaryotic systems (Chavez et al., 2015. Highly efficient Cas9-mediated transcriptional programming. Nat Methods 12, 326-8; Konermann et al., 2015. Genome-scale transcriptional activation by an engineered CRISPR-Cas9 complex. Nature 517, 583-8).
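As a non-limiting illustration of the target-site selection discussed above, the following Python sketch scans the sequence upstream of an annotated TSS for NGG PAMs that fall within the -100 to -60 window reported as most effective in E.coli. The coordinate convention is an assumption made for this sketch only: the input is the strand read 5′ to 3′ toward the TSS, the last base is position -1, and a site is indexed by the first base of its PAM. The convention used to define the J105-J112 positions may differ, and only one strand is scanned. The function name and the demonstration sequence are hypothetical.

```python
def find_ngg_sites(upstream, window=(-100, -60)):
    """Return (position, 20-nt protospacer, PAM) tuples for NGG PAMs in the window.

    `upstream` is the sequence immediately 5' of the TSS, so its last base is -1.
    `position` is that of the PAM's first base (assumed convention).
    """
    upstream = upstream.upper()
    n = len(upstream)
    sites = []
    # Start at 20 so a full 20-nt protospacer exists 5' of the PAM.
    for i in range(20, n - 2):
        if upstream[i + 1:i + 3] == "GG":
            position = i - n
            if window[0] <= position <= window[1]:
                sites.append((position, upstream[i - 20:i], upstream[i:i + 3]))
    return sites


# Hypothetical 120-base upstream region with a single TGG PAM at -80.
demo = "A" * 20 + "ACGT" * 5 + "TGG" + "A" * 77

for position, protospacer, pam in find_ngg_sites(demo):
    print(position, protospacer, pam)
```

A fuller search would also scan the reverse complement and apply the 10-base periodicity of active positions, neither of which is attempted here.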
To tune expression levels with CRISPRa and CRISPRi, the CRISPR system components were placed under the control of a small-molecule inducible promoter. dCas9 and/or MCP-SoxS were expressed using XylS-Pm, an inducible promoter system from the P.putida mt-2 toluene degradation pathway (Wirth et al., 2019. Accelerated genome engineering of Pseudomonas putida by I-SceI-mediated recombination and CRISPR-Cas9 counterselection. Microb Biotechnol). XylS-Pm provides a higher dynamic range compared to the widely-used LacI-Ptrc system (
Using a strong reporter (J3-BBa_J23110-mRFP) that can be either activated or repressed, it was demonstrated that the extent of CRISPRa or CRISPRi could be tuned with different inducer levels. A reporter with either an activating scRNA or a repressing sgRNA was delivered to the inducible dCas9 strain (PPC08), and 3-fold activation with CRISPRa or 7-fold repression with CRISPRi was observed at 1 mM m-toluic acid (
Having characterized the promoter features necessary for effective CRISPRa in P.putida, the application of CRISPRa to metabolic pathway engineering was tested. The J3-BBa_J23117 promoter described in the previous section was used to place genes of interest under the control of a CRISPRa system. In a strain with integrated dCas9/MCP-SoxS (PPC01), transcriptional units controlled by J3-BBa_J23117 can be activated by the cognate J306 scRNA (
BH4 is an important cofactor in aromatic amino acid biosynthesis that can be produced from a three-enzyme pathway (
The major product of the biopterin pathway in P.putida is BH2, in contrast to S. cerevisiae where fully oxidized biopterin is the major product (Ehrenworth et al., 2015. Pterin-Dependent Mono-oxidation for the Microbial Synthesis of a Modified Monoterpene Indole Alkaloid. ACS Synth. Biol. 4, 1295-1307). The finding that BH2 is the major product suggests that the reducing potential of P.putida prevented BH2 from further oxidation. In E.coli, BH2 is the major product but the ratio of BH2:biopterin is significantly lower than in P.putida (
Next, it was determined whether CRISPRa could be used to produce mevalonic acid, a precursor to terpenoid natural products including fine chemicals, biofuels, and therapeutics (Anthony et al., 2009. Optimization of the mevalonate-based isoprenoid biosynthetic pathway in Escherichia coli for production of the anti-malarial drug precursor amorpha-4,11-diene. Metab. Eng. 11, 13-19; Jervis et al., 2019. Machine Learning of Designed Translational Control Allows Predictive Pathway Optimization in Escherichia coli. ACS Synth. Biol. 8, 127-136; Peralta-Yahya et al., 2011. Identification and microbial production of a terpene-based advanced biofuel. Nat. Commun. 2, 483). Mevalonate has previously been produced in P.putida using two genes, mvaE and mvaS, expressed in a single operon under the control of LacI-Ptrc (
To determine if an inducible CRISPRa system could effectively regulate mevalonate production, a strain with toluic acid-inducible CRISPRa machinery (dCas9, MCP-SoxS, or both) was tested. In the absence of inducer 84 ± 11 mg/L mevalonate from the inducible dCas9 strain was observed. With inducer added to this strain (0.01 to 1.0 mM), a similar mevalonate level to that with constitutively expressed dCas9 was observed (345 to 397 mg/L and 402 ± 21 mg/L, respectively) (
To validate the ability of CRISPRa to modulate multiple genetic constructs, the functionality of CRISPRa in pACA production with an increasing number of scRNAs (3 to 6 scRNAs, with J306, J506, and J606 as the first set of 3 scRNAs) was tested. The additional scRNAs (J106, hAAV, and J206 as the 4th to 6th scRNA, respectively) are off-target, i.e., they have no target anywhere on the plasmid or in the P.putida genome. Two expression strategies were tested: A) expressing the pACA pathway and scRNAs on a plasmid; or B) moving the pACA pathway to the genome but keeping the scRNAs on the plasmid. When the pACA pathway was delivered on the plasmid, a minimal decrease in pACA level was observed, with production at 6 scRNAs equal to ~75% of pACA production at 3 scRNAs (
Since CRISPRa was shown to be functional in both E.coli and P.putida, it was contemplated that CRISPRa will be functional in a broad range of bacteria. Acinetobacter baylyi ADP1, which has been reported to use lignocellulosic biomass, was selected to demonstrate the transferability/portability of CRISPRa. Guided by the P.putida CRISPRa portability studies, which showed that dCas9 and activator expression from the genome is more reliable, ADP1 was engineered into a CRISPR-enabled strain (CKAB029,
Apart from SoxS, multiple CRISPRa systems have been recently reported (Dong C et al. Synthetic CRISPR-Cas gene activators for transcriptional reprogramming in bacteria. Nat Commun. 2018;9(1):2489; Liu Y et al. Engineered CRISPRa enables programmable eukaryote-like gene activation in bacteria. Nat Commun. 2019;10(1):3693; Ho et al. Programmable CRISPR-Cas transcriptional activation in bacteria. Mol Syst Biol. 2020 Jul;16(7):e9427). Among these systems, PspF-mediated CRISPRa is the most distinct from that of SoxS as it works on sigma54 promoters instead of the sigma70 family (Liu Y et al. Engineered CRISPRa enables programmable eukaryote-like gene activation in bacteria. Nat Commun. 2019;10(1):3693). Here, it was demonstrated that PspF-mediated CRISPRa is functional in P.putida. PspF-AN22 was integrated into a dCas9/MCP-SoxS-bearing strain to enable PspF CRISPRa (CKPP038,
A key challenge of the CRISPR-Cas9 system is the availability of a PAM at the proper position. To bypass the PAM requirement, engineered dCas9 proteins with expanded PAM sequences have been used instead of the original dCas9 (Fontana J. et al. Effective CRISPRa-mediated control of gene expression in bacteria must overcome strict target site requirements. Nat Commun. 2020;11(1):1618). The ability to expand the targetable sites further with dxCas9-NG and dSpRY variants has been recently reported (Kiattisewee C et al. Expanding the Scope of Bacterial CRISPR Activation with PAM-Flexible dCas9 Variants. ACS Synth Biol. 2022, 4103-4112.) (
In this work, the inventors ported a CRISPRa system from E.coli to P.putida successfully. The expression methods of dCas9, MCP-SoxS, and scRNA were optimized in P.putida and criteria for effective CRISPRa target sites in P.putida were defined. Based on the data and the methods disclosed herein, it is contemplated that a similar process of optimizing expression systems will enable effective CRISPRa-regulated gene expression in a wide range of bacterial species to enable complex CRISPR-based transcriptional programming in other industrially relevant microbes.
As reported previously in E.coli and in many eukaryotic systems, CRISPRa and CRISPRi can be used to target multiple genes simultaneously for activation or repression. Further, the CRISPRa system can be induced with small molecules, which will enable dynamic control of heterologous pathway activation. In P.putida, CRISPRa was applied to metabolic pathway engineering for tetrahydrobiopterin and mevalonate biosynthesis, providing proof-of-concept that CRISPRa-mediated gene regulation can be used to activate heterologous biosynthetic pathways.
Based on the present disclosure, the inventors contemplate an inducible CRISPR-Cas transcriptional control system to enable the rapid exploration of large combinatorial spaces of gene expression levels. A key advantage of CRISPR-Cas-mediated control is that, in principle, each gene of interest can be targeted by an orthogonal guide RNA and its expression level can be independently tuned. Endogenous genes can be targeted in this manner for both activation and repression to redirect metabolic flux towards the desired pathway precursors and activate heterologous pathways in a controlled manner to achieve optimal expression levels to maximize the production of desired biosynthetic products. The present disclosure therefore contemplates that the design principles disclosed herein can be used to rewire metabolic networks to enable more efficient biosynthetic production pathways for valuable chemical products.
E. coli MG1655
E. coli and P.putida culture and engineering were generally performed in LB media. Pseudomonas isolation agar (Difco) was used in the tri-parental mating for mini-Tn7 cloning (Choi, K.-H., Schweizer, H.P., 2006. mini-Tn7 insertion in bacteria with single attTn7 sites: example Pseudomonas aeruginosa. Nat. Protoc. 1, 153-161). Fluorescent protein reporter gene activation and metabolic engineering experiments were performed in EZ rich-defined media (Teknova) with 0.2% glucose as the carbon source, unless specified. Appropriate concentrations of antibiotics were included for plasmid maintenance: 100 µg/mL for carbenicillin, 25 µg/mL for chloramphenicol, 30 µg/mL for gentamicin, 30 µg/mL for kanamycin. For two-plasmid transformations in this work, the antibiotic concentration was reduced by half to 15 µg/mL each of gentamicin and kanamycin. IPTG was prepared in water as a 1 M stock solution prior to use. m-Toluic acid was prepared as a 0.5 M stock solution in 50% DMSO/water. Biopterin, BH2, and BH4 (Cayman Chemical) were stored in DMSO and diluted into water prior to use. D,L-mevalonolactone (Sigma) was freshly prepared in ethanol as a 20 mg/mL solution before dilution in ethyl acetate.
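As a non-limiting illustration, the following Python sketch applies the standard relation C1·V1 = C2·V2 to the stock solutions listed above, for example to find the volume of 0.5 M m-toluic acid stock needed for a 1 mM final concentration. The function name is hypothetical, and the 500 µL culture volume in the example is chosen only for illustration.

```python
def stock_volume_ul(stock_mm, final_mm, final_volume_ul):
    """Volume of stock (µL) giving `final_mm` in `final_volume_ul` (C1*V1 = C2*V2)."""
    if stock_mm <= 0 or final_mm < 0 or final_volume_ul <= 0:
        raise ValueError("concentrations and volume must be positive")
    if final_mm > stock_mm:
        raise ValueError("final concentration cannot exceed the stock concentration")
    return final_mm * final_volume_ul / stock_mm


# m-Toluic acid: 0.5 M (500 mM) stock, 1 mM final, illustrative 500 µL culture.
volume = stock_volume_ul(500.0, 1.0, 500.0)
print(f"{volume:.2f} µL of stock")

# The stock is in 50% DMSO/water, so the DMSO carried into the culture is:
dmso_percent = 50.0 * volume / 500.0
print(f"{dmso_percent:.2f}% (v/v) DMSO")
```

The same helper applies to the 1 M IPTG stock by passing 1000.0 as the stock concentration in mM.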
Plasmids used in this study can be separated into genome integration plasmids and replicable plasmids. Integration plasmids were made based on pUC18T-mini-Tn7T-Gm. Replicable plasmids were constructed from pBBR1-MCS2 (named pBBR1-KmR in this study), pBBR1-MCS5 (named pBBR1-GmR in this study), and pRK2-AraE (Table 1). The AraE cassette from pRK2-AraE (bearing GmR marker) was replaced with multiple-cloning-site regions of pBBR1 to generate pRK2-GmR. pRK2-KmR was made by replacing the GmR marker and AraE cassette with KmR marker and its multiple-cloning-site from pBBR1-KmR. The CRISPRa components and genes of interest were incorporated into each backbone at the multiple-cloning-site region. The detailed methodology for construction of each backbone is provided below. Table 2 shows the list of strains and plasmids used in each figure. Plasmids descriptions are listed in Table 3.
MCP-SoxS(R93A/S101A) was used in this study and is abbreviated as MCP-SoxS. Both dCas9 and MCP-SoxS were obtained from the pCD442 plasmid (Fontana, J., et al. 2020. Effective CRISPRa-mediated control of gene expression in bacteria must overcome strict target site requirements. Nat. Commun. 11, 1618). The 1xMS2 scRNA.b2 was used in this study with variable 20 bp target sequences (Dong, C., et al. 2018. Synthetic CRISPR-Cas gene activators for transcriptional reprogramming in bacteria. Nat Commun 9, 2489). The full sequences of the sgRNA and scRNA are provided in the DNA sequences section. Each plasmid carrying an sgRNA or scRNA has the 20 bp target sequence indicated in Table 3. sgRNA/scRNA target sequences are provided in Table 5.
For pPPC001, the dCas9/MCP-SoxS cassette was amplified from pCD442 and inserted into pUC18T-miniTn7T-Gm with KpnI/SacI. For pPPC005, the dCas9/MCP-SoxS coding sequence was amplified from pCD442 and inserted together with XylS-Pm as a promoter of dCas9, amplified from pS448-CsR (Wirth et al., 2019. Accelerated genome engineering of Pseudomonas putida by I-SceI-mediated recombination and CRISPR-Cas9 counterselection. Microb Biotechnol. 13, 223-249). Further modification of the MCP-SoxS promoter was achieved by digestion of pPPC001/pPPC005 with PstI/Bsp120I, and XylS-Pm was inserted into the corresponding site to give pPPC006/007, respectively. See
For integration plasmids with the reporter gene included, J1-BBa_J23117-sfGFP was amplified from pJF076Sa (Fontana et al., 2020) and inserted into the KpnI/SacI site of pUC18T-miniTn7T-Gm. Then, dCas9/MCP-SoxS was added to the HindIII site to give pPPC002. For pPPC003.N, the J1(+N)-BBa_J23117-sfGFP fragments were amplified from pJF155.1-12 (Fontana et al., 2020. Effective CRISPRa-mediated control of gene expression in bacteria must overcome strict target site requirements. Nat. Commun. 11, 1618) instead of pJF076Sa. In the case of pPPC004 with an additional BBa_J23111-mRFP reporter, the corresponding reporter fragment was amplified from pJF143.J3.J23111 (Fontana et al., 2020. Effective CRISPRa-mediated control of gene expression in bacteria must overcome strict target site requirements. Nat. Commun. 11, 1618), which carries the BBa_J23111 promoter instead of BBa_J23117, and inserted into pPPC002 at the Mph1103I cut site.
Replicable plasmids
pRK2-GmR was made by digesting pRK2-AraE (containing the GmR marker) with AatII/BspTI and inserting the multiple-cloning-site (MCS) from pBBR1 into the pRK2 backbone. pRK2-KmR was made by digesting pRK2-AraE with SacI/BspTI and inserting the KmR and MCS fragments from pBBR1-KmR. Further modification of these plasmids then followed the general manipulation at the MCS. See
scRNA (or sgRNA) was inserted into the replicable plasmid at the SacI/KpnI site of the MCS. Then, the reporter fragment was inserted at the Mph1103I region. The dCas9/MCP-SoxS cassette was inserted into Mph1103I. For pRK2-GmR and pRK2-KmR, the scRNA fragment was amplified from pPPC013 and inserted into pRK2 backbones at NotI/Bsp120I site due to conflicting SacI/KpnI cut sites in the pRK2 backbone.
To change the scRNA target sequence, the existing scRNA cassette was excised with SpeI/BspTI and the new scRNA fragment was inserted. To express multiple scRNAs from the same plasmid, additional scRNA (or sgRNA) cassettes can be inserted at the BspTI site. To generate a new scRNA fragment, any existing scRNA construct can be amplified with a forward primer binding at the promoter region, oCK079_GCTCAGTCCTAGGTATAATACTAGT. To introduce a new 20 base target sequence, a forward primer with the same overhang can be used, oCK287_TAGGTATAATACTAGTNNNNNNNNNNNNNNNNNNNNGTTTTAGAGCTAGAAATAGCAAGT, where the variable 20 nt in oCK287 can be replaced with the desired target sequence.
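As a non-limiting illustration of the primer design described above, the following Python sketch assembles an oCK287-style forward primer by placing a chosen 20-nt target between the fixed flanking sequences given in the text. The function and constant names are hypothetical, and the example target is a placeholder rather than a sequence from Table 5.

```python
# Fixed flanks taken from the oCK287 sequence in the text, written without whitespace.
PROMOTER_OVERHANG = "TAGGTATAATACTAGT"
SCAFFOLD_START = "GTTTTAGAGCTAGAAATAGCAAGT"


def build_target_primer(target):
    """Return overhang + 20-nt target + scaffold start as one forward primer."""
    target = target.upper()
    if len(target) != 20:
        raise ValueError("target must be exactly 20 nt")
    if set(target) - set("ACGT"):
        raise ValueError("target may contain only A, C, G, T")
    return PROMOTER_OVERHANG + target + SCAFFOLD_START


# Placeholder target for illustration only (not a real scRNA target).
primer = build_target_primer("ACGT" * 5)
print(primer, len(primer))
```

The overhang matches the 3′ end of oCK079, so fragments made with either primer share the same junction at the promoter.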
To insert J1-mRFP reporter cassettes, the PCR fragment was amplified using pJF076Sa (Fontana et al., 2020. Effective CRISPRa-mediated control of gene expression in bacteria must overcome strict target site requirements. Nat. Commun. 11, 1618) as a template and cloned into the Mph1103I site. The J3-mRFP variants were constructed in the same manner with pJF143.J3 (Fontana et al., 2020) as the PCR template. To insert other genes of interest under control of the J1 or J3 promoters, several approaches are available. An AatII site was introduced upstream of the J1 and J3 sequences, a KpnI site was added at the end of the strong RBS, and an XhoI site was added between the stop codon and the terminator. The desired cassette can be cloned into the Mph1103I site directly or inserted at the aforementioned sites. Biopterin pathway genes were inserted at AatII/XhoI using pCK015 (J3-GTPCH-J3-PTPS-J3-SR) and pCK014 (J3-GTPCH-J3-PTPS) as templates for pPPC027-028. LacI-Ptrc was inserted at AatII/XhoI and mvaES was inserted at KpnI/XhoI, using pMVA2RBS035 as the PCR template, to give pPPC029-030, respectively.
pCK014 and pCK015 are analogs of pPPC028 and pPPC027, respectively, built on the pSC101** origin for E.coli experiments, and can be doubly transformed with pCK005.AAV and pCD581 (Fontana et al., 2020. Effective CRISPRa-mediated control of gene expression in bacteria must overcome strict target site requirements. Nat. Commun. 11, 1618). The gtpch gene was amplified from the E.coli MG1655 genome. ptps and sr from M.alpina were synthesized by GeneArt (Thermo-Fisher) with codon optimization for expression in E.coli using Gene Designer (Atum). Each J3-CDS was individually added into the reporter cassette. Then, an additional J3-CDS construct was inserted into the existing one at the EcoRV site (altered from AatII due to the presence of the cut site in the sr gene) to give pCK014 and pCK015.
To alter the scRNA 20-base target sequence, a single-fragment PCR was used to change the existing 20 bp target of J106 in pPPC016 to the desired J306 using oCK237/oCK279 (Table 4). Then, the fragment was treated with DpnI, gel purified, and circularized with Infusion. The same method was used for converting J306 to J106 with oJF365/oJF366.
To generate a library of different 26 bp sequences upstream of a minimal promoter, a fragment with a randomized 26 bp region (5′-PS-BBa_J23117-mRFP) was constructed with the oJF447 (SEQ ID NO: 35) and oCK219 (SEQ ID NO: 34) primers (Table 4). pPPC020 bearing J306 scRNA was linearized by PCR with oJF448 (SEQ ID NO: 36)/oCK084 (SEQ ID NO: 37) and treated with DpnI to remove the parent vector. Then the linearized pPPC016 backbone fragment and a randomized 26 bp library fragment were assembled with Infusion.
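To give a sense of scale for the randomized 26 bp region, the following Python sketch computes the theoretical sequence diversity and the fraction of it that a given number of clones could cover. It assumes all four bases are equally likely at each position, which is a simplification, and the function names and the clone count in the example are hypothetical.

```python
import random

BASES = "ACGT"
REGION_LENGTH = 26  # length of the randomized region stated in the text


def theoretical_diversity(length: int = REGION_LENGTH) -> int:
    """Number of distinct sequences of the given length (4 ** length)."""
    return len(BASES) ** length


def random_region(rng: random.Random, length: int = REGION_LENGTH) -> str:
    """Draw one randomized region, assuming unbiased base incorporation."""
    return "".join(rng.choice(BASES) for _ in range(length))


def sampled_fraction(n_clones: int, length: int = REGION_LENGTH) -> float:
    """Upper bound on the fraction of sequence space covered by n_clones."""
    return n_clones / theoretical_diversity(length)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the example is repeatable
    print(theoretical_diversity())      # 4 ** 26
    print(random_region(rng))           # one example 26-mer
    print(sampled_fraction(1_000_000))  # hypothetical library of 10^6 clones
```

Because 4^26 is roughly 4.5 × 10^15, any practical library samples only a very small fraction of the possible sequences.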
pPPC023 was constructed similarly to pPPC022, as described above. Five oJF447 (SEQ ID NO: 35) variants with known 26 bp sequences (provided in the DNA Sequences section) were used to generate 5′-PSN-BBa_J23117-mRFP fragments (PS1 to PS5) for insertion into the linearized backbone.
For the plasmid-based dual reporter for multi-gene CRISPRa with two strongly expressed fluorescent reporters, a J3(106)-BBa_J23117-sfGFP cassette was inserted at the AatII site of J3-BBa_J23117-mRFP (pPPC020) to generate pPPC024. The plasmid-based dual reporter for CRISPRi/a with weakly expressed mRFP and strongly expressed sfGFP was constructed by delivering J3(106)-BBa_J23111-sfGFP to pPPC020 to generate pPPC025. Multiple sgRNA/scRNA cassettes were delivered as described above in the Replicable plasmids section.
The genomically-integrated dual reporter strains were constructed by sequentially integrating separate mRFP and sfGFP reporters at different genomic sites. Plasmids pGNW2-pp1 and pGNW2-pp2 were constructed from pGNW2 (Wirth et al., 2019. Accelerated genome engineering of Pseudomonas putida by I-SceI-mediated recombination and CRISPR-Cas9 counterselection. Microb Biotechnol) by addition of prophage 1 (pp1) or prophage 2 (pp2) regions into the XbaI/EcoRI site. Flanking homology sites (HR1 and HR2) were separated by an Mph1103I site for insertion of the desired heterologous gene. J3-BBa_J23117-mRFP was inserted into pGNW2-pp1 at the Mph1103I site to construct pPPC031. sfGFP constructs with different promoters were cloned into pGNW2-pp2 at the Mph1103I site to generate pPPC032-034.
The J3-BBa_J23117 reporter (pPPC020) was modified into an endogenous promoter reporter by replacing the J3-BBa_J23117 promoter with an intergenic region from each gene of interest. The intergenic region contained 60 bases from the ORF of interest on the 3′ end. On the 5′ end, the intergenic region extended 60 bp into the next upstream ORF, following a previously reported strategy (Zaslaver et al., 2006. A comprehensive library of fluorescent transcriptional reporters for Escherichia coli. Nat. Methods 3, 623-628). The mRFP cassette, along with its original strong RBS, was included downstream of the 60 bp fragment of the ORF of interest. Complete sequences are provided below.
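The boundaries of such a promoter fragment can be illustrated with a short Python sketch. It assumes that both ORFs lie on the forward strand, that they do not overlap, and that coordinates are 0-based and half-open; the function name, arguments, and toy sequence are hypothetical and do not correspond to any sequence in this disclosure.

```python
FLANK = 60  # bases taken from each neighboring ORF, as stated in the text


def promoter_fragment(genome: str, upstream_orf_end: int, orf_start: int,
                      flank: int = FLANK) -> str:
    """Return the intergenic region plus `flank` bases into each adjacent ORF.

    upstream_orf_end: index just past the last base of the upstream ORF.
    orf_start: index of the first base of the ORF of interest.
    Assumes forward-strand, non-overlapping ORFs (a simplification).
    """
    if upstream_orf_end > orf_start:
        raise ValueError("ORFs overlap; this sketch does not handle that case")
    start = upstream_orf_end - flank
    end = orf_start + flank
    if start < 0 or end > len(genome):
        raise ValueError("fragment would extend past the end of the sequence")
    return genome[start:end]


if __name__ == "__main__":
    # Toy sequence: 100 nt upstream ORF, 40 nt intergenic gap, 100 nt ORF.
    toy = "A" * 100 + "C" * 40 + "G" * 100
    fragment = promoter_fragment(toy, upstream_orf_end=100, orf_start=140)
    print(len(fragment))  # 60 + 40 + 60 bases
```

Genes on the reverse strand would additionally require reverse-complementing the extracted fragment, which this sketch omits.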
The chemically competent cell preparation was adapted from a prior method (Zhao et al., 2013. [CaCl2-heat shock preparation of competent cells of three Pseudomonas strains and related transformation conditions]. Ying Yong Sheng Tai Xue Bao 24, 788-94). From an overnight culture seeded from a single colony of a P.putida strain in LB, the cell suspension was 100-fold diluted into 50 mL LB without antibiotic in a 250 mL Erlenmeyer flask. The culture was incubated at 30° C. to OD600 = 0.8 - 1.0, transferred to 2 × 50 mL conical tubes, and placed on ice for 5 minutes. The cell suspension was centrifuged at 4° C. for 10 min at 5000 rpm. After discarding the supernatant, the cells were washed with an ice-cold solution of 50 mM MgCl2 + 10 mM CaCl2 twice. The final pellets were resuspended in 15% glycerol + 100 mM CaCl2 solution to give chemically competent cells. The competent cells can be stored at -80° C. for a month with negligible loss of activity.
For transformation, 50 ng of a P.putida compatible plasmid was added to 100 µL of CaCl2 chemically competent cells in a 1.5 mL microcentrifuge tube. Cells were mixed gently and incubated on ice for 30 minutes. The incubated competent cells were subjected to heat shock at 42° C. for 3 minutes and cooled on ice for another 5 minutes. Then, 900 µL of LB was added to the competent cells and cultures were shaken at 30° C. for 1.5 hours. After outgrowth, the cells were spun down at 10,000 rcf at room temperature for 1 minute. After discarding ~900 µL of supernatant, cells were resuspended in the residual medium for plating on a pre-warmed agar plate with appropriate antibiotic selection.
The LC-MS quantification was adapted from a prior method (Ehrenworth et al., 2015. Pterin-Dependent Mono-oxidation for the Microbial Synthesis of a Modified Monoterpene Indole Alkaloid. ACS Synth. Biol. 4, 1295-1307). LC-MS analysis was completed using an Agilent 1100/1260 series system equipped with a 1260 ALS autosampler and a 6120 Single Quadrupole LC-MS with a Poroshell 120 SB-Aq 3.0 mm × 100 mm × 2.7 µm column and an electrospray ion source. LC conditions: solvent A, 150 mM acetic acid with 0.1% formic acid; solvent B, methanol with 0.1% formic acid. Gradient (given as A:B:flow rate in mL/min): 4 min ramp from 95%:5%:0.2 to 70%:30%:0.2, 6 min ramp to 40%:60%:0.2, 2 min ramp to 2%:98%:0.2, 2 min ramp to 2%:98%:0.5, 4 min at 2%:98%:0.5, 1 min ramp to 95%:5%:0.5, 7 min at 95%:5%:0.5, and 1.5 min post time. MS acquisition (positive ion mode) included 25% scan from m/z 100-600, 25% scan from m/z 230-260, 25% scan from m/z 145-165, and 25% selective ion monitoring (SIM) for BH4 (m/z 242.1), dihydrobiopterin (BH2, m/z 240.1), and biopterin (m/z 238.1). Retention times were determined using commercially available standards (BH4, BH2, and biopterin from Cayman Chemical).
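The gradient above can be written out as a timetable to make the cumulative run time explicit. The following Python sketch is illustrative only: the step values are transcribed from the gradient description above, while the data layout and function name are assumptions rather than any instrument's method format.

```python
# Starting conditions: (%A, %B, flow in mL/min), from the text.
START = (95, 5, 0.2)

# Each step: (duration in min, %A at end, %B at end, flow at end in mL/min).
STEPS = [
    (4, 70, 30, 0.2),
    (6, 40, 60, 0.2),
    (2, 2, 98, 0.2),
    (2, 2, 98, 0.5),
    (4, 2, 98, 0.5),
    (1, 95, 5, 0.5),
    (7, 95, 5, 0.5),
]
POST_TIME_MIN = 1.5


def gradient_timetable():
    """Return rows of (cumulative time in min, %A, %B, flow in mL/min)."""
    rows = [(0.0, *START)]
    elapsed = 0.0
    for duration, pct_a, pct_b, flow in STEPS:
        elapsed += duration
        rows.append((elapsed, pct_a, pct_b, flow))
    return rows


if __name__ == "__main__":
    for time_min, pct_a, pct_b, flow in gradient_timetable():
        print(f"{time_min:5.1f} min  A {pct_a:3d}%  B {pct_b:3d}%  {flow} mL/min")
    total = gradient_timetable()[-1][0] + POST_TIME_MIN
    print(f"total per injection: {total} min")
```

Summing the steps gives a 26 min gradient, or 27.5 min per injection including the post time.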
Single colonies from LB plates were inoculated in 500 µL of EZ-RDM (Teknova) supplemented with the appropriate antibiotics and grown in 96-deep-well plates at 30° C. with shaking overnight. The OD600 of each overnight replicate was measured in a 1-cm cuvette, and the cultures were then diluted to OD600 = 0.1 (30- to 50-fold dilution). Then, 200 µL of each diluted culture was grown in a flat-bottom microplate at 30° C. in a Biotek Synergy HTX plate reader for 16 hours with continuous slow orbital shaking.
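The dilution step can be illustrated with a short calculation. The Python sketch below assumes that OD600 scales linearly with cell density over the measured range; the function names and the 1000 µL final volume in the example are hypothetical.

```python
TARGET_OD600 = 0.1  # starting OD600 stated in the text


def fold_dilution(measured_od600: float,
                  target_od600: float = TARGET_OD600) -> float:
    """Fold dilution needed to bring a culture down to the target OD600."""
    if measured_od600 <= target_od600:
        raise ValueError("culture is already at or below the target OD600")
    return measured_od600 / target_od600


def mixing_volumes(measured_od600: float, final_volume_ul: float,
                   target_od600: float = TARGET_OD600):
    """Return (culture volume, fresh medium volume) in µL for the dilution."""
    culture_ul = final_volume_ul / fold_dilution(measured_od600, target_od600)
    return culture_ul, final_volume_ul - culture_ul


if __name__ == "__main__":
    # Overnight cultures at OD600 of 3 to 5 correspond to 30- to 50-fold dilutions.
    for od in (3.0, 4.0, 5.0):
        culture, medium = mixing_volumes(od, final_volume_ul=1000.0)
        print(f"OD600 {od}: {fold_dilution(od):.0f}-fold, "
              f"{culture:.1f} µL culture + {medium:.1f} µL medium")
```

Under this assumption, the stated 30- to 50-fold range corresponds to overnight OD600 readings of roughly 3 to 5.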
As described in the preceding examples, CRISPRa can be ported to P.putida when accompanied by select optimizations, e.g., optimizations to the expression of dCas9, MCP-SoxS, and scRNA. The CRISPRa system was successfully targeted to endogenous targets. Multiple genes can be activated simultaneously by targeting multiple promoters or by targeting a single promoter in a multi-gene operon. Ultimately, the disclosed approach can be used for chemical production in P.putida.
This example describes implementation of additional modifications to P.putida to facilitate production of p-aminocinnamic acid (pACA), as determined by HPLC.
Initial attempts in E.coli revealed that producing pACA from glucose is difficult. P.putida possesses resistance towards aromatic compounds in general and, thus, may serve as an advantageous platform for production of various products. For example, growth resistance experiments demonstrated that P.putida can tolerate a higher concentration of pACA.
To optimize the production of pACA in P.putida, several changes were determined to be significantly beneficial. In one instance, changing the key enzyme from At-PAL (from the plant Arabidopsis thaliana) to Rg-PAL (from the yeast Rhodotorula glutinis) led to significant improvement. Additionally, a change to the expression system was beneficial. The expression system was changed from a two-plasmid system to a one-plasmid system, which led to significant improvement in production. However, this change comes with a tradeoff of plasmid burden due to the large plasmid size. Another optimization was to implement a genomically integrated system. However, it is not routine to simply place a heterologous cassette onto the genome. Instead, several rounds of optimization of the minimal promoter are necessary to achieve base expression that is sufficient to be CRISPR-activatable to relevant levels while avoiding overexpression upon activation, which leads to instability. A stable integration was identified that led to stable production of pACA. For CRISPRa-controlled expression, optimization of the scRNA expression level is also necessary.
Constitutive dCas9 and MCP-SoxS were previously integrated into P.putida KT2440 to make CKPP002. This strain, or its derivative IFPP006 with integrated papABC and aroGL, was transformed by electroporation with a pBBR1 plasmid containing either scRNAs only or pathway genes and scRNAs. Two-plasmid production strains including the additional pRK2 plasmid were doubly transformed in series, using competent cells containing the papABC/aroGL plasmid. A control strain for standard curve diluent was transformed with similar plasmids not containing pathway genes. Single colonies were picked in triplicate and used to inoculate 2 mL of MOPS EZ-Rich defined media (Teknova), supplemented with appropriate antibiotics, in 14 mL polypropylene culture tubes. Cultures were grown at 30° C. and shaken at 200 rpm for 24 hours.
Culture supernatants were filtered by centrifuging at 14,000 × g for 20 minutes using an Amicon® Ultracel-10 centrifuge filter (Millipore). Filtered supernatants were supplemented with 0.2% trifluoroacetic acid (TFA) and assessed using an Agilent HPLC with a diode array detector set at 210 nm. p-AF, p-ACA, and other components were separated using a ZORBAX Eclipse Plus phenyl-hexyl column (Agilent) with water plus 0.2% TFA as solvent A and methanol plus 0.2% TFA as solvent B.
In E.coli, p-AF production can occur, but p-ACA is toxic.
Various optimizations were implemented.
While illustrative embodiments have been illustrated and described, it will be appreciated that various changes can be made therein without departing from the spirit and scope of the invention.
This application claims the benefit of U.S. Provisional Application No. 63/335143, filed Apr. 26, 2022, the disclosure of which is incorporated herein by reference in its entirety.
This invention was made with government support under Grant Nos. CBET 1844152 and EF-1935087 and MCB 1817623, awarded by the National Science Foundation and Grant No. EERE DE-EE0008927, awarded by the U.S. Department of Energy. The government has certain rights in the invention.
Number | Date | Country
---|---|---
63335143 | Apr 2022 | US