Disclosed herein are methodologies and kits for dynamic targeted hypermutation that harness the enzymatic activity of a polynucleic acid-binding protein fused to a nucleobase-editing enzyme to specifically target mutations across a region of interest. These methodologies and kits facilitate the rapid creation of diverse DNA libraries in vivo or in vitro.
Mutagenesis is central to the generation of diverse target gene libraries. Previously described in vitro mutagenesis methodologies allow precise control over sites of mutation; however, they are laborious and time-consuming. Moreover, previously described methodologies directed at the generation of large, diverse libraries in vivo generally act globally on the organism (i.e., they indiscriminately alter DNA sequences in living systems, resulting in undesired off-target mutations). The off-target mutations caused by global mutagenesis result in two major drawbacks in the context of directed evolution. First, they increase the chances of false positives whereby an off-target mutation increases the fitness of an organism and enables “cheating” of the selection process. Second, they result in undesired toxicity due to the off-target mutation of critical genes. These drawbacks require users to carefully optimize global mutagenesis such that the mutation rate is maximized while cellular toxicity is minimized. The careful balance between the number of mutations and cell death constrains mutation rates, ultimately limiting library size and resulting in a lower chance of finding an improved variant and/or a less active final product of the directed evolution process.
Lab-timescale evolution relies on the generation of large mutational libraries to rapidly explore biomolecule sequence landscapes. Although numerous in vitro mutagenesis techniques are available, in vivo mutagenesis is limited (Wong et al., Comb. Chem. High Throughput Screen. 2006 May; 9(4): 271-88.). Global mutagenesis methods are capable of increasing mutation rates in vivo but unfortunately introduce extensive off-target mutations in essential and cheating genes.
In some aspects the disclosure relates to dynamic targeted hypermutation (DTH), a novel methodology for specifically targeting mutations across a gene of interest. This methodology facilitates the rapid creation of diverse DNA libraries in vivo or in vitro such that increased mutation rates are constrained to the target DNA of interest.
In some aspects the disclosure relates to nucleobase-editing fusion proteins capable of introducing nucleobase mutations in a pre-existing polynucleic acid sequence. In some embodiments, a nucleobase-editing fusion protein comprises a processive polynucleic acid-binding protein fused to a nucleobase-editing enzyme.
In some embodiments, the processive polynucleic acid-binding protein of the nucleobase-editing fusion protein comprises the amino acid sequence of an RNA polymerase, a DNA polymerase, a DNA methyltransferase, a DNA glycosylase, or a DNA helicase. In some embodiments, the processive polynucleic acid-binding protein of the nucleobase-editing fusion proteins comprises the amino acid sequence of T7 RNA polymerase or a functional variant thereof.
In some embodiments, the nucleobase-editing enzyme comprises the amino acid sequence of an Apobec protein, a TadA protein, an AMPD protein, a CDA protein, an ADAT protein, an ADAR protein, or a GDA protein. In some embodiments, the nucleobase-editing enzyme comprises the amino acid sequence of an Apobec protein. In some embodiments, the Apobec protein is rApobec1 or a functional variant thereof. In some embodiments, the nucleobase-editing enzyme comprises the amino acid sequence of a TadA protein. In some embodiments, the TadA protein is E. coli TadA comprising an A106V mutation and/or a D108N mutation or a protein homolog comprising a homologous mutation(s).
In some aspects, the disclosure relates to methods of performing dynamic targeted hypermutation. In some embodiments, the method comprises contacting at least one polynucleic acid with at least one non-naturally occurring nucleobase-editing fusion protein, wherein: (a) each of the at least one non-naturally occurring nucleobase-editing fusion proteins comprises a processive polynucleic acid-binding protein fused to a nucleobase-editing enzyme; (b) each of the at least one polynucleic acid comprises a target region; and (c) the contacting of the at least one polynucleic acid with the at least one non-naturally occurring nucleobase-editing fusion protein generates mutations at a rate exceeding background mutation rates only in the target region of the at least one polynucleic acid of (b), wherein the background mutation rate of the at least one polynucleic acid of (b) is determined in the absence of the non-naturally occurring nucleobase-editing fusion protein.
In some embodiments, the processive polynucleic acid-binding protein of at least one of the at least one non-naturally occurring nucleobase-editing fusion proteins comprises the amino acid sequence of an RNA polymerase, a DNA polymerase, a DNA methyltransferase, a DNA glycosylase, or a DNA helicase. In some embodiments, the processive polynucleic acid-binding protein of at least one of the at least one non-naturally occurring nucleobase-editing fusion proteins comprises the amino acid sequence of T7 RNA polymerase or a functional variant thereof.
In some embodiments, the nucleobase-editing enzyme of at least one of the at least one non-naturally occurring nucleobase-editing fusion proteins comprises the amino acid sequence of an Apobec protein, a TadA protein, an AMPD protein, a CDA protein, an ADAT protein, an ADAR protein, or a GDA protein. In some embodiments, the nucleobase-editing enzyme of at least one of the at least one non-naturally occurring nucleobase-editing fusion proteins comprises the amino acid sequence of an Apobec protein. In some embodiments, the Apobec protein is rApobec1 or a functional variant thereof. In some embodiments, the nucleobase-editing enzyme comprises the amino acid sequence of a TadA protein. In some embodiments, the TadA protein is E. coli TadA comprising an A106V mutation and/or a D108N mutation or a protein homolog comprising a homologous mutation(s).
In some embodiments, each of the at least one polynucleic acid comprises, from 5′ to 3′: a promoter region that is bound by at least one of the at least one non-naturally occurring nucleobase-editing fusion proteins in a sequence-specific manner; the target region; and a terminator region comprising a terminator array.
In some embodiments, the terminator array comprises four or more terminators, optionally four or more T7 UUCG terminators.
In some embodiments, the promoter region of at least one of the at least one polynucleic acids comprises the sequence of SEQ ID NO: 21, SEQ ID NO: 22, and/or SEQ ID NO: 23.
In some embodiments, the contacting of the at least one polynucleic acid with the at least one non-naturally occurring nucleobase-editing fusion protein occurs in a living cell.
In some embodiments, at least one of the at least one non-naturally occurring nucleobase-editing fusion proteins is encoded for on a plasmid, wherein the plasmid has a copy number of less than 10. In some embodiments, at least one of the at least one non-naturally occurring nucleobase-editing fusion proteins is conditionally expressed in the living cell.
In some embodiments, the living cell contains a modified genome comprising: (a) an integration of a polynucleic acid sequence encoding for and driving the expression of at least one non-naturally occurring nucleobase-editing fusion protein; and/or (b) an integration of a polynucleic acid sequence comprising, from 5′ to 3′: a promoter region that is bound by at least one of the at least one non-naturally occurring nucleobase-editing fusion proteins in a sequence-specific manner; the target region; and a terminator region comprising a terminator array.
In some embodiments, the living cell contains a modified genome and a plasmid that facilitates expression of a T7 inhibitor, wherein the modified genome of the living cell comprises: (a) an integration of a polynucleic acid sequence encoding for and driving the expression of the non-naturally occurring nucleobase-editing fusion protein, wherein the sequence driving the expression of the fusion protein comprises a sequence bound by LacI repressor that inhibits transcription of the fusion protein when LacI is bound; and/or (b) a deletion of genomic sequence encoding for uracil DNA glycosylase. In some embodiments, the T7 inhibitor is T7 lysozyme.
In some embodiments, the living cell is treated to increase the expression and/or activity of the uracil DNA glycosylase inhibitor, ugi.
In some aspects, the disclosure relates to kits for performing dynamic targeted hypermutation. In some embodiments, a kit comprises: (a) a polypeptide comprising the amino acid sequence of a non-naturally occurring nucleobase-editing fusion protein comprising a processive polynucleic acid-binding protein fused to a nucleobase-editing enzyme; and (b) a polynucleic acid sequence comprising, from 5′ to 3′: a promoter region that is bound by the non-naturally occurring nucleobase-editing fusion protein of (a) in a sequence-specific manner; a cloning site; and a terminator region comprising a terminator array.
In some embodiments, a kit comprises: (a) a polynucleic acid sequence encoding for and driving the expression of a non-naturally occurring nucleobase-editing fusion protein comprising a processive polynucleic acid-binding protein fused to a nucleobase-editing enzyme; and (b) a polynucleic acid sequence comprising, from 5′ to 3′: a promoter region that is bound by the non-naturally occurring nucleobase-editing fusion protein of (a) in a sequence-specific manner; a cloning site; and a terminator region comprising a terminator array.
In some embodiments, the processive polynucleic acid-binding protein of the non-naturally occurring nucleobase-editing fusion protein comprises the amino acid sequence of an RNA polymerase, a DNA polymerase, a DNA methyltransferase, a DNA glycosylase, or a DNA helicase. In some embodiments, the processive polynucleic acid-binding protein of the non-naturally occurring nucleobase-editing fusion proteins comprises the amino acid sequence of T7 RNA polymerase or a functional variant thereof.
In some embodiments, the nucleobase-editing enzyme of the nucleobase-editing fusion protein comprises the amino acid sequence of an Apobec protein, a TadA protein, an AMPD protein, a CDA protein, an ADAT protein, an ADAR protein, or a GDA protein. In some embodiments, the nucleobase-editing enzyme of the nucleobase-editing fusion protein comprises the amino acid sequence of an Apobec protein. In some embodiments, the Apobec protein is rApobec1 or a functional variant thereof. In some embodiments, the nucleobase-editing enzyme comprises the amino acid sequence of a TadA protein. In some embodiments, the TadA protein is E. coli TadA comprising an A106V mutation and/or a D108N mutation or a protein homolog comprising a homologous mutation(s).
In some embodiments, the terminator array comprises four or more terminators, optionally four or more T7 UUCG terminators.
In some embodiments, the promoter region comprises the sequence of SEQ ID NO: 21, SEQ ID NO: 22, and/or SEQ ID NO: 23.
These and other aspects of the invention are further described below.
The following drawings form part of the present specification and are included to further demonstrate certain aspects of the present disclosure, which can be better understood by reference to one or more of these drawings in combination with the detailed description of specific embodiments presented herein. It is to be understood that the data illustrated in the drawings in no way limit the scope of the disclosure.
Traditional in vivo mutagenesis strategies, which are especially important for studying and using evolution in living systems, rely on exposing organisms to exogenous mutagens (e.g., high-energy light or chemicals (Cupples C. G. and Miller J. H., Proc. Natl. Acad. Sci. U.S.A. 1989 July; 86(14): 5345-49; Tessman et al., Science. 1965 Apr. 23; 148(3669): 507-8)) or expressing mutagenic enzymes in organisms with deficient repair machinery (e.g., XL1-Red (Greener et al., Mol. Biotechnol. 1997 April; 7(2): 189-95) or the MP6 plasmid (Badran A. H. and Liu D. R., Nat. Commun. 2015 Oct. 7; 6: 8425)). These global mutagenesis strategies can yield high mutation rates and diverse genetic landscapes. However, the extensive occurrence of mutations throughout the genome is problematic for many experiments, especially directed evolution.
As described herein, Dynamic Targeted Hypermutation (DTH) involves the implementation of a nucleobase-editing enzyme to create genetic diversity in a specific target region of a polynucleic acid sequence. In some embodiments, the methodology facilitates continuous directed evolution in a living system. By mutating specific regions of a polynucleic acid in a targeted fashion, these methodologies reduce off-target mutations that result in cell death or “cheating” of the selection scheme in the directed evolution platform.
In some aspects, the disclosure relates to nucleobase-editing fusion proteins. The nucleobase-editing enzymes described herein are capable of altering nucleobases of (or introducing nucleobase mutations in) a pre-existing polynucleic acid sequence (as distinguished from the introduction of mutations during polynucleic acid synthesis, which leaves the parent strand unchanged). In some embodiments, the nucleobase-editing fusion protein can introduce mutations in the 5′ to 3′ direction of a polynucleic acid sequence. In some embodiments, the nucleobase-editing fusion protein can introduce mutations in the 3′ to 5′ direction of a polynucleic acid sequence. In some embodiments, the nucleobase-editing fusion protein can introduce mutations in both the 5′ to 3′ and the 3′ to 5′ directions of a polynucleic acid sequence. In some embodiments, a nucleobase-editing fusion protein comprises a polynucleic acid-binding protein fused to a nucleobase-editing enzyme.
As used herein, the term “polynucleic acid-binding protein” refers to a protein that binds to specific polynucleic acid sequences. Examples of polynucleic acid-binding proteins are known to those having skill in the art and include, but are not limited to, polymerases, ligases, reverse transcriptases, nucleases, methyltransferases, glycosylases, helicases, transcription factors, and transcription repressors.
In some embodiments, the polynucleic acid-binding protein is a processive enzyme. The term “processive enzyme” as used herein refers to an enzyme that catalyzes consecutive reactions without releasing its substrate (e.g., in the context of a polymerase, processivity relates to the average number of nucleotides added by the polymerase enzyme per association event with the template strand). Examples of processive enzymes include, but are not limited to, RNA polymerases, DNA polymerases, DNA methyltransferases, DNA glycosylases, and DNA helicases. In some embodiments, the processive enzyme is an RNA polymerase, a DNA polymerase, a DNA methyltransferase, a DNA glycosylase, a DNA helicase, or a functional variant thereof. In some embodiments, the processive enzyme is an RNA polymerase. Examples of RNA polymerases are known to those having skill in the art and include, but are not limited to, T7 RNA polymerase, T3 RNA polymerase, and SP6 RNA polymerase. In some embodiments, the processive enzyme is T7 RNA polymerase or a functional variant thereof.
As used herein, the term “nucleobase-editing enzyme” refers to an enzyme that catalyzes the conversion of a nucleobase to a different nucleobase. Examples of nucleobase-editing enzymes are known to those having skill in the art and include, but are not limited to, Apobec proteins (conversion of cytosine to uracil), TadA proteins (conversion of adenosine to inosine), AMPD proteins (conversion of adenosine to inosine), CDA proteins (conversion of cytidine to uridine), ADAT proteins (conversion of adenosine to inosine), ADAR proteins (conversion of adenosine to inosine), ADA proteins (conversion of adenosine to inosine), and GDA proteins (conversion of guanine to xanthine). In some embodiments, the nucleobase-editing enzyme is selected from the group consisting of an Apobec protein, a TadA protein, an AMPD protein, a CDA protein, an ADAT protein, an ADAR protein, a GDA protein, and a functional variant thereof.
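By way of illustration only, the effect of a cytosine-deaminating enzyme confined to a target region can be sketched in Python. This is a toy model under stated assumptions and not a description of any disclosed embodiment: the function name, the example sequence, the window coordinates, and the per-base probability are all hypothetical, and the C-to-T substitution stands in for cytosine-to-uracil deamination followed by replication, in which uracil pairs like thymine.

```python
import random


def deaminate_target(seq, start, end, rate, rng):
    """Return a copy of seq in which each C inside [start, end) becomes T
    with probability `rate`. Bases outside the window are left untouched.

    Assumption: C->T models cytosine-to-uracil deamination after replication.
    """
    out = []
    for i, base in enumerate(seq):
        if start <= i < end and base == "C" and rng.random() < rate:
            out.append("T")
        else:
            out.append(base)
    return "".join(out)


# Hypothetical 16-nt sequence with a target window covering positions 4..11.
rng = random.Random(0)  # seeded so repeated runs give the same result
parent = "ACCGTCCAGCTTCCGA"
mutant = deaminate_target(parent, 4, 12, 0.5, rng)
print(parent)
print(mutant)
```

In this sketch, any difference between the two printed sequences falls inside the window, which mirrors the idea that elevated mutation rates are constrained to the target region.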
As used herein, the term “Apobec protein” refers to a protein family of deaminases, capable of mutagenizing DNA and/or RNA through the conversion of cytosine to uracil. Apobec proteins have been identified in various species and are known to those having skill in the art. For example, human genes encoding an Apobec protein include APOBEC1, APOBEC2, APOBEC3A, APOBEC3B, APOBEC3C, APOBEC3D (or APOBEC3E), APOBEC3F, APOBEC3G, APOBEC3H, APOBEC4, and Activation-Induced cytidine deaminase. The ability of Apobec proteins to mutagenize DNA and/or RNA varies. For example, some Apobec proteins appear to lack deaminase activity (e.g., APOBEC2). Others are highly mutagenic (e.g., APOBEC3G and rApobec1). The term “Apobec protein” as used herein encompasses all known and currently identifiable Apobec proteins and functional variants thereof. In some embodiments, the Apobec protein is rApobec1 or a functional variant thereof.
As used herein, the term “TadA protein” refers to a family of tRNA-specific adenosine deaminases. TadA proteins have been identified in various species and are known to those having skill in the art. For example, human genes encoding a TadA protein include ADAT1 and ADAT2. E. coli TadA and mouse ADA are additional examples. In some embodiments, the TadA protein is ADAT1, ADAT2, E. coli TadA, ADA, or a functional variant thereof.
As used herein, the term “AMPD protein” refers to a family of adenosine deaminases. AMPD proteins have been identified in various species and are known to those having skill in the art. For example, human genes encoding an AMPD protein include AMPD1, AMPD2 and AMPD3. In some embodiments, the AMPD protein is AMPD1, AMPD2, AMPD3, or a functional variant thereof.
As used herein, the term “CDA protein” refers to a family of cytidine deaminases. CDA proteins have been identified in various species and are known to those having skill in the art. For example, human genes encoding a CDA protein include CDA. In some embodiments, the CDA protein is human CDA or a functional variant thereof.
As used herein, the term “ADAR protein” refers to a family of adenosine deaminases. ADAR proteins have been identified in various species and are known to those having skill in the art. For example, human genes encoding an ADAR protein include ADAR1 and ADAR2. In some embodiments, the ADAR protein is ADAR1, ADAR2 or a functional variant thereof.
As used herein, the term “GDA protein” refers to a family of guanine deaminases. GDA proteins have been identified in various species and are known to those having skill in the art. For example, human genes encoding a GDA protein include GDA. In some embodiments, the GDA protein is human GDA or a functional variant thereof.
The term “functional variant” includes polypeptides which are at least about 70% identical, at least about 80% identical, at least about 90% identical, at least about 95% identical, at least about 98% identical, at least about 99% identical, at least about 99.5% identical, or at least about 99.9% identical to a protein's native amino acid sequence (i.e., wild-type amino acid sequence) and which retain functionality.
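As a minimal sketch of how a percent identity might be computed, the following Python function counts identical positions in two sequences that have already been aligned. Several assumptions apply and are not taken from the disclosure: the alignment itself is performed elsewhere, gaps are written as "-", identity is normalized by the alignment length (other conventions normalize by the shorter sequence), and the example peptides are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of alignment columns in which both sequences carry the same residue.

    Assumption: inputs are pre-aligned, equal-length strings with '-' for gaps.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("aligned sequences must not be empty")
    matches = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_a)


# Hypothetical wild-type and variant peptides differing at one of ten positions.
wild_type = "MKTAYIAKQR"
variant = "MKTAYLAKQR"
print(percent_identity(wild_type, variant))
```

Under this convention, a variant would then be compared against whichever identity threshold (e.g., at least about 90%) is relevant to the embodiment at hand.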
The term “functional variant” also includes polypeptides which are shorter or longer than a protein's native amino acid sequence by about 5 amino acids, by about 10 amino acids, by about 15 amino acids, by about 20 amino acids, by about 30 amino acids, by about 40 amino acids, by about 50 amino acids, by about 75 amino acids, by about 100 amino acids or more and which retain functionality.
In the context of a processive polynucleic acid-binding protein, the term “retain functionality” refers to a functional variant's ability to catalyze consecutive reactions without releasing its substrate at least about 5%, 10%, 20%, 25%, 30%, 40%, 50%, 60%, 70%, 75%, 80%, 90%, 100%, or more than 100% as efficiently as the respective non-variant (i.e., wild-type) processive polynucleic acid-binding protein. Methods of measuring and comparing processivity are known to those skilled in the art.
In the context of a nucleobase-editing enzyme, the term “retain functionality” refers to a functional variant's ability to catalyze the conversion of a nucleobase to a different nucleobase at least about 5%, 10%, 20%, 25%, 30%, 40%, 50%, 60%, 70%, 75%, 80%, 90%, 100%, or more than 100% as efficiently as the respective non-variant (i.e., wild-type) protein. Methods of measuring and comparing nucleobase conversion rates are known to those having skill in the art.
As used herein, the term “fusion protein” refers to the coupling of two or more polypeptides/peptides. In some embodiments, a fusion protein comprises two or more polypeptides/peptides that are covalently coupled in a single polypeptide chain. Covalently connected fusion proteins typically are produced genetically through the in-frame fusing of the nucleotide sequences encoding for each of the said polypeptides/peptides. Expression of the fused coding sequence results in the generation of a single protein without any translational terminator between each of the polypeptides/peptides. In some embodiments, a fusion protein comprises two or more polypeptides/peptides that are coupled through non-covalent association, such as through dimerization domains like FKBP and FRB, which dimerize upon the addition of the small molecule rapamycin (DeRose et al., Pflugers Arch. 2013 March; 465(3): 409-17). For example, in some embodiments, the polynucleic acid-binding protein is covalently coupled to FKBP and the nucleobase-editing enzyme is covalently coupled to FRB, such that the two can dimerize (non-covalent association) in the presence of rapamycin. Examples of other dimerizing domains or adaptor proteins that facilitate non-covalent association are known to those having skill in the art.
The nucleobase-editing fusion proteins described and encompassed herein comprise a polynucleic acid-binding protein fused to a nucleobase-editing enzyme. In some embodiments, the nucleobase-editing enzyme is C-terminal to the polynucleic acid-binding protein. In other embodiments, the nucleobase-editing enzyme is N-terminal to the polynucleic acid-binding protein.
In some embodiments, the nucleobase-editing fusion protein comprises more than one nucleobase-editing enzyme and/or more than one polynucleic acid-binding protein, which can be arranged in any manner. For example, a nucleobase-editing fusion protein comprising two nucleobase-editing enzymes (“E”) and one polynucleic acid-binding protein (“B”) may be structured from N-terminus to C-terminus as follows: (i) E-B-E; (ii) E-E-B; or (iii) B-E-E.
In some embodiments, one or more proteins or protein domains are positioned between the fused polynucleic acid-binding protein and the nucleobase-editing enzyme. In some embodiments, the polynucleic acid-binding protein is fused to the nucleobase-editing enzyme through a linker. As used herein, the term “linker” refers to a flexible molecule used to connect two molecules of interest together. In some embodiments, the linker is a hydrophilic linker (e.g., PEG linker). In some embodiments, the linker is a peptide linker. In some embodiments, the peptide linker is an XTEN linker (Schellenberger et al., Nat. Biotechnol. 2009 December; 27(12): 1186-90) or a (GGS)n linker.
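The domain orders and peptide linkers described above can be enumerated with a short Python sketch. It is illustrative only: the function names are hypothetical, the labels “E” and “B” follow the notation used above for a nucleobase-editing enzyme and a polynucleic acid-binding protein, and placing a (GGS)n linker between every pair of adjacent domains (including two adjacent enzymes) is an assumption made for the example rather than a requirement of the disclosure.

```python
from itertools import permutations


def domain_orders(domains):
    """Return the distinct N-terminus-to-C-terminus orders of the domain labels."""
    return sorted(set(permutations(domains)))


def ggs_linker(n):
    """Amino acid sequence of a (GGS)n peptide linker."""
    return "GGS" * n


def describe(order, n):
    """Render one order with a (GGS)n linker between adjacent domains (assumed)."""
    return f"-(GGS){n}-".join(order)


# Two nucleobase-editing enzymes ("E") and one binding protein ("B").
orders = domain_orders(["E", "E", "B"])
for order in orders:
    print(describe(order, 3))
print(ggs_linker(3))
```

Because the two “E” labels are interchangeable, the six raw permutations collapse to the three arrangements listed above (E-B-E, E-E-B, and B-E-E).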
In some embodiments, the polynucleic acid-binding protein and the nucleobase-editing enzyme are fused via one or more of the following: (i) a cysteine-cysteine disulfide bond; (ii) intein splicing; and (iii) a covalent linkage from an unnatural amino acid (e.g., alkyne-azide “click” reactions, olefin metathesis, or oxime ligation). In some embodiments, the polynucleic acid-binding protein and the nucleobase-editing enzyme are fused through exposure to cross-linking reagents that react with amino acid side chains, such as perfluoro-aromatic stapling reagents, NHS esters, isothiocyanates, or aldehydes.
In some aspects, the disclosure relates to methods of performing dynamic targeted hypermutation. In some embodiments, the method comprises contacting at least one polynucleic acid with at least one non-naturally occurring nucleobase-editing fusion protein as described above, wherein: (a) each of the at least one non-naturally occurring nucleobase-editing fusion proteins comprises a polynucleic acid-binding protein fused to a nucleobase-editing enzyme; (b) each of the at least one polynucleic acid comprises a target region; and (c) the contacting of the at least one polynucleic acid with the at least one non-naturally occurring nucleobase-editing fusion protein generates mutations at a rate exceeding background mutation rates only in the target region of the at least one polynucleic acid of (b), wherein the background mutation rate of the at least one polynucleic acid of (b) is determined in the absence of the non-naturally occurring nucleobase-editing fusion protein.
As used herein, the term “nucleic acid” refers to a compound comprising a nucleobase and an acidic moiety (e.g., a nucleoside, a nucleotide, or a polymer of nucleotides). As used herein, the terms “polynucleic acid” and “polynucleic acid molecule” are used interchangeably and refer to polymeric nucleic acids (e.g., nucleic acid molecules comprising three or more nucleotides that are linked to each other via a phosphodiester linkage).
Polynucleic acid molecules have various forms. In some embodiments, the polynucleic acid molecule is DNA. In some embodiments, the polynucleic acid molecule is double-stranded DNA. For example, in some embodiments, the DNA is genomic DNA. In some embodiments, the DNA is plasmid DNA. In other embodiments, the polynucleic acid molecule is single-stranded DNA. In some embodiments, the polynucleic acid molecule is RNA. In some embodiments, the polynucleic acid molecule is double-stranded RNA. In other embodiments, the polynucleic acid molecule is single-stranded RNA. In some embodiments, the polynucleic acid is a hybrid between DNA and RNA.
The term “target region” as used herein refers to the polynucleic acid sequence that one seeks to mutagenize. In some embodiments, the target region comprises a gene-coding polynucleic acid sequence. In some embodiments, the gene-coding polynucleic acid sequence encodes for an entire gene or sets of entire genes (e.g., a bacterial operon). In other embodiments, the gene-coding polynucleic acid sequence encodes for a portion of a gene (e.g., a polynucleic acid sequence encoding for a protein domain). As used herein the term “portion of a gene” refers to a polynucleic acid sequence comprising at least 10%, at least 20%, at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, or at least 90% of a gene-coding polynucleic acid sequence.
In some embodiments, the target region comprises a non-coding nucleic acid sequence. In some embodiments, the non-coding nucleic acid sequence comprises the sequence of a regulatory element, an intron, a non-coding functional RNA, a repeat sequence, or a telomere. In some embodiments, the regulatory element is selected from the group consisting of an operator, an enhancer, a silencer, a promoter, a terminator, and an insulator. In some embodiments, the target region comprises a gene-coding and a non-coding segment of DNA.
The length of a target region may vary. For example, in some embodiments, the target region is greater than 10,000 nucleotides or base pairs in length, such as at least 20,000, at least 25,000, at least 30,000, at least 40,000, at least 50,000, at least 60,000, at least 70,000, at least 80,000, at least 90,000, at least 100,000, or more nucleotides or base pairs in length. In other embodiments, the target region is between 100 and 10,000 nucleotides or base pairs in length, such as 100-200, 200-500, 500-1,000, or 1,000-5,000 nucleotides or base pairs in length. In other embodiments, the target region is less than 100 nucleotides or base pairs in length.
In some embodiments, a nucleobase-editing fusion protein generates mutations at a rate exceeding background mutation rates only in the target region (i.e., in polynucleic acid regions outside of the target region, the conversion of cytosine bases to uracil bases remains at background levels). In other embodiments, mutation rates outside of the target region (i.e., background mutation rates) are increased less than 100 percent, less than 90 percent, less than 80 percent, less than 70 percent, less than 60 percent, less than 50 percent, less than 40 percent, less than 30 percent, less than 20 percent, or less than 10 percent in the presence of the nucleobase-editing fusion protein relative to the rate in the absence of the nucleobase-editing fusion protein. Processes contributing to background mutation rates include the spontaneous deamination of cytosine to uracil through hydrolysis and errors in replication or transcription. Methods of measuring mutation rates are known to those having skill in the art.
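The comparison of off-target mutation rates in the presence and absence of the fusion protein reduces to a simple percent-increase calculation, sketched below in Python. The function names, the units, and the numerical rates are hypothetical and chosen only to illustrate the arithmetic; they are not measured values.

```python
def percent_increase(rate_with_fusion, rate_without_fusion):
    """Percent increase of a mutation rate relative to its background value."""
    if rate_without_fusion <= 0:
        raise ValueError("background rate must be positive")
    return 100.0 * (rate_with_fusion - rate_without_fusion) / rate_without_fusion


def within_threshold(rate_with_fusion, rate_without_fusion, max_percent):
    """True if the off-target increase is less than max_percent."""
    return percent_increase(rate_with_fusion, rate_without_fusion) < max_percent


# Hypothetical off-target rates (mutations per base pair per generation).
background = 1.0e-9
with_fusion = 1.2e-9
print(round(percent_increase(with_fusion, background), 1))
print(within_threshold(with_fusion, background, 50))
```

With these example numbers the off-target rate rises by roughly 20 percent, which would satisfy a "less than 50 percent" criterion but not a "less than 10 percent" criterion.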
In some embodiments, the at least one polynucleic acid comprises, from 5′ to 3′: a promoter region that is bound by at least one of the at least one non-naturally occurring nucleobase-editing fusion proteins in a sequence-specific manner; the target region; and a terminator region comprising a terminator array.
In some embodiments, the promoter region of at least one of the at least one polynucleic acids comprises the sequence of SEQ ID NO: 21, SEQ ID NO: 22, and/or SEQ ID NO: 23.
In some embodiments, the terminator array comprises four or more terminators, such as at least four, at least five, at least six, at least seven, at least eight, at least nine, or at least ten terminators. In some embodiments, Rho-independent terminators are used, which can be one or more types of naturally occurring terminators, such as T7 and rrnB, or one or more types of engineered high-efficiency terminators, such as T0. In some embodiments, when using a nucleobase-editing fusion protein containing T7 RNA polymerase, the terminator array comprises at least four, at least five, at least six, at least seven, at least eight, at least nine, or at least ten T7 UUCG terminators.
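The 5′ to 3′ arrangement of promoter region, target region, and terminator array described above can be pictured as a simple coordinate map. The Python sketch below is illustrative only: the feature lengths are hypothetical placeholders, no actual promoter or terminator sequences are used, and placing the terminators directly adjacent to one another without spacers is an assumption of the example.

```python
from dataclasses import dataclass


@dataclass
class Feature:
    name: str
    start: int  # 0-based, inclusive
    end: int    # 0-based, exclusive


def layout_construct(promoter_len, target_len, terminator_len, n_terminators=4):
    """Lay out promoter, target, and a terminator array in 5' to 3' order."""
    if n_terminators < 4:
        raise ValueError("terminator array should contain four or more terminators")
    features = []
    pos = 0
    for name, length in [("promoter", promoter_len), ("target", target_len)]:
        features.append(Feature(name, pos, pos + length))
        pos += length
    # Assumption: terminators abut one another with no spacer sequence.
    for i in range(n_terminators):
        features.append(Feature(f"terminator_{i + 1}", pos, pos + terminator_len))
        pos += terminator_len
    return features


# Hypothetical lengths in base pairs.
for feature in layout_construct(20, 1000, 50):
    print(feature.name, feature.start, feature.end)
```

A map of this kind makes explicit which coordinates belong to the target region, i.e., the span between the promoter and the first terminator where elevated mutation rates are intended.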
In some embodiments, the contacting of the at least one polynucleic acid with the at least one non-naturally occurring nucleobase-editing fusion protein occurs in a living cell. In some embodiments, the living cell is a cell of a multicellular organism. In some embodiments, the living cell is a unicellular organism. In some embodiments, the unicellular organism is a bacterium. In some embodiments, the bacterium is E. coli.
In some embodiments, the nucleobase-editing fusion protein is encoded for on a plasmid contained within a living cell, wherein the plasmid has a copy number of less than 10. In some embodiments, the copy number is less than 9, less than 8, less than 7, less than 6, less than 5, less than 4, less than 3, or less than 2.
In some embodiments, the living cell contains a modified genome comprising an integration of a polynucleic acid sequence encoding for and driving the expression of the non-naturally occurring nucleobase-editing fusion protein. In some embodiments, the expression of the non-naturally occurring nucleobase-editing fusion protein is driven by a promoter comprising the sequence of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, and/or SEQ ID NO: 24.
In some embodiments, the living cell contains a modified genome comprising an integration of a polynucleic sequence comprising, from 5′ to 3′: a promoter region that is bound by at least one of the at least one non-naturally occurring nucleobase-editing fusion proteins in a sequence-specific manner; the target region; and a terminator region comprising a terminator array.
In some embodiments, the living cell contains a modified genome comprising: an integration of a polynucleic acid sequence encoding for and driving the expression of the non-naturally occurring nucleobase-editing fusion protein; and an integration of a polynucleic sequence comprising, from 5′ to 3′: a promoter region that is bound by at least one of the at least one non-naturally occurring nucleobase-editing fusion proteins in a sequence-specific manner; the target region; and a terminator region comprising a terminator array.
In some embodiments, the expression of at least one of the at least one non-naturally occurring nucleobase-editing fusion proteins can be conditionally controlled. Examples of inducible expression systems that facilitate conditional gene expression are known to those having skill in the art. For example, some inducible expression systems comprise promoters that are chemically regulated (e.g., alcohol-regulated, tetracycline-regulated, steroid-regulated, or metal-regulated). Other inducible expression systems comprise promoters that are physically regulated (e.g., temperature-regulated or light-regulated).
In some embodiments, the living cell contains a modified genome and a plasmid that facilitates expression of a T7 inhibitor, wherein the modified genome of the living cell comprises: (a) an integration of a polynucleic acid sequence encoding for and driving the expression of the non-naturally occurring nucleobase-editing fusion protein, wherein the sequence driving the expression of the fusion protein comprises a sequence bound by LacI repressor that inhibits transcription of the fusion protein when LacI is bound; and (b) a deletion of genomic sequence encoding for uracil deglycosylase. In some embodiments, the T7 inhibitor is T7 lysozyme. As used herein, the term “inhibits transcription” refers to a decrease in the expression of the non-naturally occurring nucleobase-editing fusion protein by about 5%, 10%, 20%, 25%, 30%, 40%, 50%, 60%, 70%, 75%, 80%, 90%, or more than 95% relative to the level of expression in the absence of LacI. Methods of measuring and comparing expression levels are known to those skilled in the art.
In some embodiments, the living cell is treated to increase the expression and/or activity of the uracil deglycosylase inhibitor, ugi (Savva R. and Pearl L. H., Nat. Struct. Biol. 1995 September; 2(9): 752-57). For example, in some embodiments, a plasmid encoding for an expressible uracil deglycosylase inhibitor is delivered to the living cell, and the expression of the uracil deglycosylase inhibitor is stimulated.
In some aspects, the invention relates to kits for performing targeted dynamic hypermutation. In some embodiments, the kit comprises: (a) a polypeptide comprising the amino acid sequence of a non-naturally occurring nucleobase-editing fusion protein comprising a processive polynucleic acid-binding protein fused to a nucleobase-editing enzyme; and (b) a polynucleic acid sequence comprising, from 5′ to 3′: a promoter region that is bound by the non-naturally occurring nucleobase-editing fusion protein of (a) in a sequence-specific manner; a cloning site; and a terminator region comprising a terminator array.
In other embodiments, the kit comprises: (a) a polynucleic acid sequence encoding for and driving the expression of a non-naturally occurring nucleobase-editing fusion protein comprising a processive polynucleic acid-binding protein fused to a nucleobase-editing enzyme; and (b) a polynucleic acid sequence comprising, from 5′ to 3′: a promoter region that is bound by the non-naturally occurring nucleobase-editing fusion protein of (a) in a sequence-specific manner; a cloning site; and a terminator region comprising a terminator array.
In some embodiments, at least one component in the kit is provided in a desiccated or lyophilized form. In other embodiments, at least one component of the kit is provided in a solubilized form. In some embodiments, the kit further comprises at least one buffer. In some embodiments at least one of the at least one buffers is a reaction buffer.
The term “cloning site,” as used herein refers to a segment of DNA that facilitates the cloning of a polynucleic acid comprising a target region. In some embodiments, the cloning site is a multiple cloning site comprising endonuclease restriction sites for restriction-mediated cloning. In some embodiments the cloning site is a TA cloning site. In some embodiments, the cloning site comprises a nucleic acid sequence that facilitates homologous recombination.
In some embodiments, the kit also comprises competent cells for use in the cloning of the target region. For example, in some embodiments, the competent cells are chosen from the list consisting of TOP10, OmniMax, PIR1, PIR2, INVαF′, INV110, BL21, Mach1, DH10Bac, DH10B, DH12S, DH5α, Stbl2, Stbl3, Stbl4, XL1-Blue, XL2-Blue, and related strains.
The implementation of dynamic targeted hypermutation (DTH) depends on the action of a polynucleic acid-binding protein fused to a nucleobase-editing enzyme, such as an RNA polymerase combined with a cytidine deaminase. To demonstrate the DTH methodology, RNA polymerase from a bacteriophage (T7) was fused to cytidine deaminase from Rattus norvegicus (rApobec1) to form various rApo1−T7 constructs. These constructs specifically bind to a sequence of DNA called the T7 promoter, which is positioned adjacent to the target sequence of DNA (TABLE 1). Various constructs were engineered and tested in multiple reporter assays (TABLE 1 and TABLE 2).
When rApo1−T7 initiates transcription at the promoter site, the DNA of the target sequence is exposed by the action of the T7 RNA polymerase and altered by the rApo1 domain. Since the T7 polymerase of rApo1−T7 is processive, it continues to travel along the DNA target sequence until it reaches a terminating sequence at the end of the DNA target sequence. Importantly, data disclosed herein demonstrate that rApo1−T7 has a high mutation rate and low toxicity relative to global methods (mutagenic plasmid [MP6], which is the current gold standard for in vivo global mutagenesis methods).
Additional components can provide further constraints so that mutations are limited to a defined stretch of DNA (see Examples 2-4). These constraints (and their underlying importance to the implementation of DTH) have not been demonstrated previously.
Targeted mutagenesis is defined as the constraint of mutations to a defined stretch of DNA. In other words, mutations should not appear outside of the target region. In the implementation of rApo1−T7 demonstrated in Example 1, one might expect that the mutation frequency upstream of the T7 promoter would be very low. However, preventing mutations downstream of the target region could be a tremendous challenge. Previous data have shown that monomeric RNA polymerases can be quite processive and carry out transcription for exceptionally long stretches of DNA, in excess of 20 kb in the case of T7 RNA polymerase (Rong et al., J. Biol. Chem. 1998 Apr. 24; 273(17): 10253-60; Thiel et al., J. Gen. Virol. 2001 June; 82(Pt 6): 1273-81). Effective termination of transcription is further complicated by the context-dependent nature of termination efficiency (Mairhofer et al., ACS Synth. Biol. 2015 Mar. 20; 4(3): 265-73).
An unsuccessful termination event for rApo1−T7 can result in the incorporation of undesired mutations throughout many kilobases of DNA downstream of the target region. These undesired mutations are typically catastrophic in the context of directed evolution, as these changes can produce numerous variants outside of the gene of interest that overcome a selection scheme in a living organism (i.e., “cheaters”). Previous attempts at technologies similar to DTH have failed to address or entirely ignored undesired mutagenesis downstream of the target region or elsewhere in the genome. Therefore, experiments were designed to test the possibility of off-target mutagenesis, and if necessary eliminate it.
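As a rough illustration of why an array of several terminators can suppress read-through far more than a single terminator, the following minimal sketch treats each terminator as an independent stop with a fixed efficiency. Both the independence assumption and the efficiency value are hypothetical simplifications introduced here for illustration; as noted above, real termination efficiency is context-dependent, so this is not a model of the measured data.

```python
def readthrough_probability(per_terminator_efficiency: float, n_terminators: int) -> float:
    """Probability that a polymerase passes every terminator in an array.

    Assumes (hypothetically) that each terminator acts independently and
    stops the polymerase with the same probability.
    """
    if not 0.0 <= per_terminator_efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    if n_terminators < 0:
        raise ValueError("number of terminators must be non-negative")
    return (1.0 - per_terminator_efficiency) ** n_terminators


if __name__ == "__main__":
    efficiency = 0.9  # hypothetical per-terminator efficiency, not a measured value
    for n in range(1, 5):
        print(n, readthrough_probability(efficiency, n))
```

Under these assumptions, read-through falls geometrically with the number of terminators, which is the qualitative behavior a terminator array is intended to exploit.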
A start codon reversion drug resistance assay was designed in which two drug resistance genes were positioned in series, both of which lacked a start codon (
Additional experiments were performed using LacO operons recruiting the Lac repressor to interfere with T7 processivity and promote termination; however, these limited attempts were unsuccessful.
Similar to termination, it was found that expression levels of rApo1−T7 can result in untargeted mutagenesis if left unchecked. In preliminary implementations of DTH using rApo1−T7, significant cytotoxicity was observed even when rApo1−T7 was expressed under limiting conditions through a common promoter (such as an arabinose-inducible promoter with glucose suppression). In the context of directed evolution, such widespread changes result in the regular appearance of “cheaters.”
Thus, experiments were designed to limit the expression of rApo1−T7 by alternative strategies beyond traditional promoters. In the most successful implementation, the combined effects of reducing promoter strength (TABLE 3; Registry of Standard Biological Parts, parts.igem.org/Promoters/Catalog/Anderson; Camsund et al., J. Biol. Eng. 2014 Jan. 27; 8(1): 4) and limiting copy number of the rApo1−T7 gene were critical for limiting cytotoxicity when utilizing rApo1−T7 in E. coli. Expression of rApo1−T7 constructs under medium copy number conditions was highly toxic. Moreover, use of split T7 to increase mutagenesis and reduce toxicity failed because T7 polymerase activity of the split constructs was unacceptably low.
ctgatagctagctcagtcctagggattatgctagc
ctgacagctagctcagtcctaggtataatgctagc
ctgatagctagctcagtcctagggattatgctagc
ctgatggctagctcagtcctagggattatgctagc
No previously described mutagenesis methodology has demonstrated conditional control that allows users to conveniently turn on and shut off targeted mutation accumulation in a living organism. While an inducible MP6 system has been disclosed (Badran A. H. and Liu D. R., Nat. Commun. 2015 Oct. 7; 6: 8425), it carries out mutagenesis globally. Commonly used non-inducible mutagenesis methods in living organisms are designed to continuously carry out global mutagenesis, which forces users to isolate the final libraries of evolved genes of interest from mutagenic organisms and subsequently transfer these libraries to a non-mutagenic organism for downstream sequencing and characterization. Conditional control of mutagenesis would allow users to switch off targeted mutagenesis after a desired period of time, effectively eliminating the need to isolate and transfer evolved libraries from one organism to another.
The results disclosed herein demonstrate that the activity of rApo1−T7 can be conditionally tuned by chemically inducing the expression of LacI-repressed rApo1−T7 with IPTG, such that higher expression levels of T7 polymerase correlate with increased levels of mutagenesis (
General Methods: All PCR reactions for restriction cloning and recombineering targeting cassettes were performed using Q5 High Fidelity DNA Polymerase (New England Biolabs). Primers were ordered from Life Technologies and g-blocks were ordered from Integrated DNA Technologies.
Chemicals: Kanamycin monosulfate was purchased as a solid from Alfa Aesar (J61272). Tetracycline hydrochloride was purchased as a solid from Calbiochem (58346). Fosfomycin was purchased as a solid from Alfa Aesar (J66602). Rifampicin was purchased as a solid from TCI (R0079). Ampicillin was purchased as a solid sodium salt from Fisher Bioreagents (BP1761-25). Streptomycin sulfate was purchased as a solid from MP Biomedicals (100556). Chloramphenicol was purchased as a solid from Alfa Aesar (B20841). Tetrazolium chloride was purchased as a solid from Aldrich (T8877). L-rhamnose was purchased as a solid from Sigma-Aldrich (W373011). L-arabinose was purchased as a solid from Chem Impex (01654). Isopropyl β-D-1-thiogalactopyranoside (IPTG) was purchased as a solid from Sigma-Aldrich (16758-1G). Antifoam 204 was purchased as a liquid from Sigma (A8311-50ML). LB was purchased as a solid from Difco (244620). Agar was purchased as a solid from Alfa Aesar (A10752). Cycloheximide was purchased as a solid from Chem Impex (00083). Ethyl methanesulfonate (EMS) was purchased from Sigma Aldrich (M0880-1G).
Cloning: All plasmids were generated by restriction cloning. Ligation reactions were performed using Quick Ligase (New England Biolabs). All DNA cloning was performed in DH10B cells (Invitrogen). The rApo1 gene was amplified from pET28b-BE1 (Komor et al., Nature. 2016 May 19; 533(7603): 420-24) and the T7 RNA polymerase gene was amplified from pTara (Wycuff D. R. and Matthews K. S., Anal. Biochem. 2000 Jan. 1; 277(1): 67-73). Mutation assay reporter plasmids utilizing the single-copy BAC origin and the terminator arrays of the UUCG-T7 derivative of the T7 terminator (Mairhofer et al., ACS Synth. Biol. 2015 Mar. 20; 4(3): 265-73) were generated by serial insertion of the annealed oligos NheI-UUCG-BamHI S and NheI-UUCG-BamHI AS.
General recombineering: The E. coli genome was edited using seamless lambda red recombineering with ccdB counterselection as previously described (Wang et al., Nucleic Acids Res. 2014 March; 42(5): e37). Cells were first transformed with the temperature-sensitive psc101-gbaA recombineering plasmid and plated on LB agar with 10 μg/mL tetracycline and incubated for 24 hr at 30° C. Colonies were picked and grown in LB with 10 μg/mL tetracycline overnight at 30° C. (18-21 hrs). The overnight cultures were diluted 25-fold in LB with 10 μg/mL tetracycline and grown at 30° C. for about 2 hours until they reached an OD600 of 0.3-0.4. The ccdA antitoxin and recombineering machinery were then induced by adding arabinose and rhamnose to a final concentration of 2 mg/mL each and then growing the cultures at 37° C. for 40 minutes to an OD600 of ˜0.6. The cultures were then placed on ice, washed twice with ice-cold sterile ddH2O, resuspended in ˜25 μL of ice-cold sterile ddH2O and electroporated with ˜200 ng of the appropriate kan-ccdB targeting cassette (1.8 kV, 5.8 ms, 0.1 cm cuvette, BioRad Micropulser). The cells were then recovered in SOC with 2 mg/mL arabinose at 30° C. for 2 hours, then plated on LB agar plates with 50 μg/mL kanamycin and 2 mg/mL arabinose and incubated for 24 hours at 30° C. Colonies that appeared had incorporated the kan-ccdB targeting cassette and were picked and grown in LB with 50 μg/mL kanamycin and 2 mg/mL arabinose at 30° C. overnight (18-21 hours). The cultures were then diluted 25-fold in LB with 50 μg/mL kanamycin and 2 mg/mL arabinose and grown at 30° C. for about 2 hours until they reached an OD600 of 0.3-0.4. The recombineering machinery was then induced by adding rhamnose to a final concentration of 2 mg/mL and then growing the cultures at 37° C. for 40 minutes to an OD600 of ˜0.6.
The cultures were then placed on ice, washed twice with ice-cold sterile ddH2O, resuspended in ˜25 μL of ice-cold sterile ddH2O and electroporated with ˜200 ng of the final targeting cassette intended to replace the kan-ccdB cassette integrated in the genome (1.8 kV, 5.8 ms, 0.1 cm cuvette, BioRad Micropulser). The cells were then recovered in SOC with 2 mg/mL arabinose at 30° C. for 2 hours, then were washed once with LB to remove the arabinose and cease production of the ccdA antitoxin. The cultures were then plated on LB agar plates at various dilutions with 100 μg/mL streptomycin and incubated for 24 hours at 37° C. Without the ccdA antitoxin, the ccdB toxin kills cells that have not replaced the integrated kan-ccdB cassette with the final targeting cassette. The colonies that grow should have the final targeting cassette integrated, but were screened by PCR or sequencing to confirm integration, as some colonies have simply inactivated the ccdB toxin. Once a clone with the desired change was found, the temperature-sensitive psc101-gbaA recombineering plasmid was cured by plating on LB agar with 100 μg/mL streptomycin and incubating at 42° C. for 18-21 hours, then streaking a colony from the plate on LB agar with 100 μg/mL streptomycin and incubating at 42° C. for another 18-21 hours. The colonies from the second plate were grown in LB with 100 μg/mL streptomycin at 37° C. to be used or to make glycerol stocks. The colonies were also incubated in LB with 10 μg/mL tetracycline at 30° C. to ensure tetracycline sensitivity and confirm that the recombineering plasmid was cured.
ung Deletion: In order to prevent dU→dC repair and increase the mutagenesis rate, uracil DNA glycosylase (ung) was deleted in several of the strains used in this work (Duncan B. K., J. Bacteriol. 1985 November; 164(2): 689-95). Deletion of ung was accomplished through lambda red recombineering, using a kan-ccdB targeting cassette that was amplified from R6K-kan-ccdB using primers 5′ Ung kanccdB and 3′ Ung kanccdB. Once the kan-ccdB targeting cassette replaced the ung gene, the kan-ccdB cassette was deleted using the annealed oligos delUng S and delUng AS as the targeting cassette to generate a markerless ung deletion.
Increasing lacI expression: The expression of the lacI repressor in DH10B cells was increased by replacing the endogenous PlacI promoter with the strong Ptac promoter using lambda red recombineering. A kan-ccdB targeting cassette was amplified from R6K-kan-ccdB using primers 5′ pLacI::kanccdB and 3′ pLacI::kanccdB and used to replace the endogenous PlacI promoter with the kan-ccdB cassette. The kan-ccdB cassette was replaced with Ptac using the annealed oligos pLacI::pTac S and pLacI::pTac AS.
Deleting the motAB and csgABCDEFG operons to decrease biofilm formation: Deletions of the motAB operon (Pratt L. A. and Kolter R., Mol. Microbiol. 1998 October; 30(2): 285-93) and the csgABCDEFG operon (Prigent-Combaret et al., Environ. Microbiol. 2000 August; 2(4): 450-64) have been shown to produce strains of E. coli that are deficient in biofilm formation. To minimize inlet line contamination and clogs in bioreactor experiments due to biofilms, the motAB and csgABCDEFG operons were deleted using one-step DIRex lambda red recombineering (Nasvall J., PLoS One. 2017 Aug. 30; 12(8): e0184126). The motAB targeting half-cassettes were amplified from R6K-AmilCP-kan-ccdB using primers delmotDF and AmilCP-KanR and from R6K-kan-ccdB-AmilCP using primers delmotDR and KanF-AmilCP. The csgABCDEFG targeting half-cassettes were amplified from R6K-AmilCP-kan-ccdB using primers delcsgDF and AmilCP-KanR and from R6K-kan-ccdB-AmilCP using primers delcsgDR and KanF-AmilCP. The motAB or csgABCDEFG half-cassettes were co-electroporated to replace motAB or csgABCDEFG with a kan-ccdB cassette flanked by large AmilCP inverted repeats nested between short 30 bp direct repeats. The repeat architecture leads to a high rate of spontaneous excision that was selected for using ccdB counterselection to obtain markerless deletions of motAB and csgABCDEFG.
Deactivated rApo1: The E63Q mutant of rApo1 cytidine deaminase has been shown to be catalytically dead (Navaratnam et al., Cell. 1995 Apr. 21; 81(2): 187-95). Lambda red recombineering was used to generate strains with deactivated rApo1 and deactivated rApo1−T7 using a kan-ccdB targeting cassette that was amplified from R6K-kan-ccdB using primers 5′ drApoI::kanccdB and 3′ drApoI::kanccdB. Once the kan-ccdB targeting cassette replaced the E63 codon, the kan-ccdB cassette was replaced with a glutamine codon using the annealed oligos drApoI S and drApoI AS as the targeting cassette to generate an E63Q mutant.
Insertion of rApo1 and MutaT7 into the E. coli genome: rApo1 and MutaT7 were inserted into the genome at the seam of the large Δ(araA-leu)7697 deletion in DH10B E. coli using lambda red recombineering. A kan-ccdB targeting cassette was amplified from R6K-kan-ccdB using primers dAraLeu7697 kanccdB F and dAraLeu7697 kanccdB R and used to insert the kan-ccdB cassette between 62,378 bp and 62,379 bp in the DH10B genome (Durfee et al., J. Bacteriol. 2008 April; 190(7): 2597-606). Then targeting cassettes containing rApo1 or MutaT7 were amplified from BBa_J23114_lacO rApo1 and BBa_J23114_lacO MutaT7, respectively, using primers dAraLeu7697-rApoI and dAraLeu7697-T7 and were used to replace kan-ccdB with rApo1 or MutaT7.
Replacement of promoter BBa_J23114 with PAllacO-Tenth: The BBa_J23114 promoter from the Anderson Collection (parts.igem.org/Promoters/Catalog/Anderson) that controlled the expression of rApo1 or MutaT7 from the DH10B genome was replaced with the promoter PAllacO-Tenth which was intended to be a weaker version of the PAllacO promoter (Camsund et al., J. Biol. Eng. 2014 Jan. 27; 8(1): 4). A kan-ccdB targeting cassette was amplified from R6K-kan-ccdB using primers 5′ prApoI::kanccdB and 3′ prApoI::kanccdB and used to replace BBa_J23114 with a kan-ccdB cassette. The kan-ccdB cassette was replaced with PAllacO-Tenth using the targeting cassette amplified from the pAllacO-tenth gblock using primers PAllacO-1 F and PAllacO-1 R.
Mutation assay: To test mutagenesis rates, the control and mutagenic strains (StrepR) carrying reporter plasmids (AmpR) were streaked out on LB agar with 100 μg/mL streptomycin and 100 μg/mL ampicillin and grown at 37° C. for 24 hours. Single colonies were then picked in triplicate for each sample and grown in 5 mL LB with 100 μg/mL streptomycin, 100 μg/mL ampicillin and 25 mM arabinose (with 10 μg/mL chloramphenicol if the strain contained MP6) at 37° C., 250 r.p.m. for 24 hours. Next, 1 mL aliquots of each overnight culture were pelleted at 6000×g for 3 minutes and resuspended in 1 mL LB to remove arabinose. Then 50 μL of each resuspension was plated on LB agar plates with 50 μg/mL tetrazolium chloride and 200 μg/mL kanamycin, 20 μg/mL tetracycline, 100 μg/mL fosfomycin or 100 μg/mL rifampicin unless otherwise stated. 50 μL of a 100,000-fold dilution of each culture was also plated on LB agar with 100 μg/mL streptomycin, 100 μg/mL ampicillin and 50 μg/mL tetrazolium chloride. After incubating the plates at 37° C. for 48 hours, they were imaged by inverting the plates onto transparencies and scanning on a document scanner at a resolution of 400 d.p.i. The colonies were then counted using the software OpenCFU (3.9.0) (Geissmann Q., PLoS One. 2013; 8(2): e54072), with minimum colony radius set to 3, the maximum colony radius set to 50 and the regular threshold set to 4.
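By way of illustration only, colony counts from the selective and non-selective plates described above could be converted into a frequency of resistant cells as sketched below. The calculation itself (resistant colony-forming units per mL divided by viable colony-forming units per mL) is an assumption made for this sketch and is not stated in the protocol; the function names are hypothetical. The default plated volume (50 μL) and the 100,000-fold dilution of the viability plate are taken from the assay description.

```python
def cfu_per_ml(colonies: int, plated_volume_ml: float, dilution_factor: float = 1.0) -> float:
    """Colony-forming units per mL of the original culture."""
    if plated_volume_ml <= 0:
        raise ValueError("plated volume must be positive")
    return colonies * dilution_factor / plated_volume_ml


def resistance_frequency(resistant_colonies: int,
                         viable_colonies: int,
                         plated_volume_ml: float = 0.05,   # 50 uL plated, per the assay
                         viable_dilution: float = 1e5) -> float:  # 100,000-fold dilution
    """Fraction of viable cells that form colonies on the selective plate.

    Assumed calculation for illustration; the selective plate is undiluted.
    """
    resistant = cfu_per_ml(resistant_colonies, plated_volume_ml)
    viable = cfu_per_ml(viable_colonies, plated_volume_ml, viable_dilution)
    if viable == 0:
        raise ValueError("no viable colonies counted")
    return resistant / viable


if __name__ == "__main__":
    # Hypothetical counts, not data from the experiments described herein.
    print(resistance_frequency(resistant_colonies=200, viable_colonies=100))
```

Because the same volume is plated on both plate types, the frequency reduces to the resistant count divided by the product of the viable count and the dilution factor.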
Chemical mutagens: Mutagenesis with ethyl methanesulfonate (EMS) was performed as previously described (Cupples C. G. and Miller J. H., Proc. Natl. Acad. Sci. U.S.A. 1989 July; 86(14): 5345-49). An overnight culture of each sample was subcultured and grown until it reached a density of 2-3×10⁸ cells per mL (log phase). 5 mL aliquots of cells were chilled on ice, washed twice with sodium phosphate buffer (pH 7) and resuspended in 1 mL of 1× PBS in a 1.5 mL Eppendorf tube. EMS was added while cold by pipetting 14 μL of EMS into 1 mL of resuspended cells. Tubes were sealed and mixed at 1000 r.p.m. for 60 minutes at 37° C. The cells were then washed twice with LB and resuspended in 1 mL of LB. Immediately after washing, a viability measurement was performed by plating 50 μL of a 10,000-fold dilution of mutagen-treated and mock-treated cultures on LB agar with 100 μg/mL streptomycin, 100 μg/mL ampicillin and 50 μg/mL tetrazolium chloride. For mutation rate assessment, 500 μL of each resuspension was inoculated into 5 mL of LB with 100 μg/mL streptomycin and 100 μg/mL ampicillin. The cultures were grown at 37° C. for 20 hours, then 50 μL of each culture was plated on LB agar with 50 μg/mL tetrazolium chloride and 100 μg/mL rifampicin. 50 μL of a 100,000-fold dilution of each culture was also plated on LB agar with 100 μg/mL streptomycin, 100 μg/mL ampicillin and 50 μg/mL tetrazolium chloride. After 48 hours of incubation, plates were imaged on a document scanner at a resolution of 400 d.p.i., and colonies were subsequently counted using the software OpenCFU (3.9.0) (Geissmann Q., PLoS One. 2013; 8(2): e54072), with minimum colony radius set to 3, the maximum colony radius set to 50 and the regular threshold set to 4.
Continuous culture of T7 promoter+antisense T7 promoter reporter plasmid and sequencing: The T7 promoter+antisense T7 promoter reporter plasmid was continuously cultured in the MutaT7-csg+ mot+ strain in a 70 mL culture in a round-bottomed flask that was slowly stirred in a 37° C. mineral oil bath. The culture was aerated through a needle that was connected to a standard aquarium pump and LB with 100 μg/mL streptomycin, 100 μg/mL ampicillin and 0.5% isopropanol (as antifoaming agent) was fed into the culture via a needle connected to a peristaltic pump at a rate of ˜0.5 volumes/hour. Fractions were collected every 3 days for 12 days. Each fraction was plated for single colonies on LB agar with 100 μg/mL ampicillin and 10 clones from each fraction were Sanger sequenced by colony PCR with primers 1493 and 1494.
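As a rough guide to what the stated feed rate implies, the following sketch estimates the doubling time and the number of generations elapsed in a continuous culture. It assumes, hypothetically, that the culture behaves as an ideal chemostat at steady state, in which the specific growth rate equals the dilution rate; this assumption is not stated in the protocol, and the feed rate above is itself approximate.

```python
import math


def doubling_time_hours(dilution_rate_per_hour: float) -> float:
    """Doubling time when growth rate equals dilution rate (steady-state assumption)."""
    if dilution_rate_per_hour <= 0:
        raise ValueError("dilution rate must be positive")
    return math.log(2) / dilution_rate_per_hour


def generations_elapsed(dilution_rate_per_hour: float, duration_hours: float) -> float:
    """Number of population doublings over the culture period under the same assumption."""
    return duration_hours / doubling_time_hours(dilution_rate_per_hour)


if __name__ == "__main__":
    rate = 0.5        # volumes/hour, approximate value from the protocol
    hours = 12 * 24   # 12 days of continuous culture
    print(doubling_time_hours(rate), generations_elapsed(rate, hours))
```

Under this assumption, roughly 200 generations would elapse over the 12-day culture; the actual number depends on how closely the culture approached steady state.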
Continuous culture of T7 promoter+filler DNA and T7 promoter+terminators reporter plasmids and sequencing: The T7 promoter+filler DNA and T7 promoter+terminators reporter plasmids were continuously cultured in the Δung (negative control), MutaT7 and MP6 strains in 20 mL cultures in a previously described multiplex bioreactor setup (Miller et al., J. Vis. Exp. 2013 Feb. 23; (72): e50262). The reactor was stored in a 37° C. warm room and was aerated and stirred with aquarium pumps. LB with 100 μg/mL streptomycin, 100 μg/mL ampicillin, 100 μg/mL cycloheximide, 0.01% (v/v) antifoam 204 and 150 μg/mL arabinose (+10 μg/mL chloramphenicol in the case of the MP6 strain) was pumped into each reaction vessel at a rate of 0.87 volumes/hour. Fractions were collected every 3 days. Each fraction was plated on LB agar with 100 μg/mL streptomycin and 100 μg/mL ampicillin and 12 single colonies from each plate were grown in 5 mL LB with 100 μg/mL ampicillin. DNA was isolated from each overnight culture using the Qiaprep 96 Turbo Miniprep Kit and quantified using a PicoGreen assay. 1 ng of each sample was prepared using the Illumina NexteraXT Sample Preparation kit. Samples were barcoded and pooled prior to sequencing on an Illumina MiSeq 300v2 cartridge to obtain 2×150 base pair paired-end reads. Sequencing reads were aligned against the respective plasmid sequences using bwa mem 0.7.10-r789 [RRID:SCR_010910]. Allele pileups were generated using samtools v.0.1.19 mpileup [RRID:SCR_002105] with flags -d 10000000 --excl-flags 2052, and allele counts/frequencies were extracted (Li H., Bioinformatics. 2011 Nov. 1; 27(21): 2987-93; Li et al., Bioinformatics. 2009 Aug. 15; 25(16): 2078-79). Only positions with greater than 10-fold coverage in all replicates of each sample were included in the analysis. Fixed variant alleles (present at greater than 85% frequency) for each sample are reported.
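The coverage and frequency filters described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the analysis code actually used: the in-memory data layout (one dictionary of per-position base counts per replicate, with 0-based positions into the reference string) and the function names are hypothetical, and parsing of the pileup output is omitted. Only the two thresholds (greater than 10-fold coverage in all replicates, greater than 85% allele frequency) come from the description.

```python
from typing import Dict, List, Set, Tuple

Pileup = Dict[int, Dict[str, int]]  # position -> {base: read count}; hypothetical layout


def covered_in_all(replicates: List[Pileup], min_coverage: int = 10) -> Set[int]:
    """Positions whose depth exceeds min_coverage in every replicate."""
    if not replicates:
        return set()
    shared = set(replicates[0])
    for pileup in replicates[1:]:
        shared &= set(pileup)
    return {pos for pos in shared
            if all(sum(p[pos].values()) > min_coverage for p in replicates)}


def fixed_variants(reference: str,
                   replicates: List[Pileup],
                   min_coverage: int = 10,
                   min_frequency: float = 0.85) -> List[Tuple[int, int, str, str, float]]:
    """Return (replicate index, position, ref base, alt base, frequency) for fixed alleles."""
    calls = []
    keep = covered_in_all(replicates, min_coverage)
    for rep_index, pileup in enumerate(replicates):
        for pos in sorted(keep):
            counts = pileup[pos]
            depth = sum(counts.values())
            ref_base = reference[pos]
            for base, n in counts.items():
                # Report only non-reference alleles above the frequency threshold.
                if base != ref_base and n / depth > min_frequency:
                    calls.append((rep_index, pos, ref_base, base, n / depth))
    return calls


if __name__ == "__main__":
    # Hypothetical toy data, not sequencing results from the experiments described herein.
    ref = "ACGT"
    rep1 = {1: {"C": 1, "T": 19}, 2: {"G": 5}}
    rep2 = {1: {"C": 20}, 2: {"G": 30}}
    print(fixed_variants(ref, [rep1, rep2]))
```

In the toy data, position 2 is discarded because one replicate has only 5-fold coverage, and the C to T change at position 1 is reported for the first replicate only.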
Sanger sequencing was also performed on a PCR amplicon from 96 clones of Δung (negative control) and MutaT7 after 15 days of continuous culture carrying T7 promoter+terminators reporter plasmid. Primers 2165 and 1197 were used to amplify and Sanger sequence the KanR gene.
Monomeric RNA polymerases possess inherently high promoter specificity (Rong et al., J. Biol. Chem. 1998 Apr. 24; 273(17): 10253-60) and high processivity during transcription (Thiel et al., J. Gen. Virol. 2001 June; 82(6): 1273-81). Cytidine deaminases are potent DNA-damaging enzymes that act on ssDNA substrates formed during transcription (Thiel et al., J. Gen. Virol. 2001 June; 82(6): 1273-81; Ramiro et al., Nat. Immunol. 2003 May; 4(5): 452-56). It was envisioned that merging the unique features of these two enzyme classes by creating a fusion “mutaT7” protein consisting of a cytidine deaminase (rApo1) fused to T7 RNA polymerase (T7-pol) would facilitate the targeting of mutations to any DNA region lying downstream of a T7 promoter (
T7 promoter-dependent KanR mutagenesis by mutaT7 shows that one can target mutagenesis to a desired DNA region. Since T7-pol is highly processive, it was anticipated that mutations would also be introduced further downstream of the T7 promoter. The presence in the reporter plasmid of a tetracycline-resistance (TetR) gene with an inactive ACG start codon, separated by an ˜1.6 kbp spacer DNA from the KanR gene, provided a mechanism to assay such processivity. High levels of mutaT7-dependent Tet resistance were observed only in reporter strains having the T7 promoter, consistent with targeting and processive introduction of mutations across a lengthy DNA region. Once again, global mutagens generated Tet-resistant colonies in all reporter plasmids.
Targeted mutagenesis using the processive mutaT7 chimera requires not just recruitment to a DNA locus but also termination upon reaching the end of the DNA region of interest. To address termination, KanR/TetR reporter plasmids were used in which the DNA spacer was replaced with one or more T7 terminators and then assayed for both Kan and Tet resistance. Four or more copies of the T7 terminator were sufficient to prevent mutagenesis beyond the DNA of interest (
An important advantage of targeted mutagenesis is the ability to attain much larger viable library sizes by avoiding off-target, toxic mutations in essential genes outside the DNA region of interest. Based on the apparently low off-target mutagenesis rate of mutaT7, one might expect that E. coli carrying mutaT7 would have significantly higher viability than bacteria treated with global mutagens. Indeed, consistent with prior work (Badran A. H. and Liu D. R., Curr. Opin. Chem. Biol. 2015 February; 24: 1-10), very low viability was observed in all populations treated with global mutagens, whereas populations expressing mutaT7 possessed viability similar to untreated cells (
The assays in
Next-generation sequencing was then performed on the entire episomal reporter plasmid DNA sequence of 36 clones drawn from the same E. coli population as in
One disadvantage of mutaT7 is its limited mutational spectrum consistent with the use of a cytidine deaminase as the mutagenic component. Indeed, the sequencing results described above indicate that C to T transitions are exclusively obtained in the sense strand of targeted DNA using a single T7 promoter. It was hypothesized that the mutational spectrum could be doubled by installing a second T7 promoter that would recruit mutaT7 to the 3′-end of the DNA of interest. Installation of an antisense T7 promoter leads to the appearance of both G to A and C to T transitions throughout the target gene (
In summary, the processively acting mutaT7 chimera is capable of selectively targeting mutations to large, yet well-defined, regions of DNA in a living system with minimal human intervention. Moreover, the availability of T7 variants with altered transcription rates (Bonner et al., J. Biol. Chem. 1994 Oct. 7; 269(40): 25120-28) likely provides the opportunity to fine-tune mutation rates. Utilizing other base editing enzymes in place of cytidine deaminase, such as the adenosine deaminases (Gaudelli et al., Nature. 2017 Nov. 23; 551(7681): 464-71), can significantly widen the mutational spectrum of mutaT7 and further enable the creation of rich, diverse DNA libraries in vivo with minimal off-target effects. The ubiquitous applicability and high specificity of T7 RNA polymerase in a large number of diverse organisms (Lieber et al., Eur. J. Biochem., 1998 Oct. 1; 217(1): 387-94) will enable implementation of targeted mutagenesis in a broad range of evolutionary and synthetic biology settings.
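The strand logic behind the dual-promoter observation discussed above (C to T changes when mutaT7 is recruited by a sense T7 promoter, plus G to A changes when an antisense T7 promoter is added) can be illustrated with a short sketch. The example deliberately converts every cytidine on the deaminated strand, which exaggerates the actual mutation rate; the toy sequence and function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (uppercase A/C/G/T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def deaminate_sense(seq: str) -> str:
    """Cytidine deamination on the sense strand, read on the sense strand: C -> T."""
    return seq.replace("C", "T")


def deaminate_antisense(seq: str) -> str:
    """Cytidine deamination on the antisense strand, reported on the sense strand.

    A C -> T change on the antisense strand appears as G -> A on the sense strand.
    """
    antisense = reverse_complement(seq)
    edited = antisense.replace("C", "T")
    return reverse_complement(edited)


if __name__ == "__main__":
    target = "GATTACA"  # hypothetical toy sequence
    print(deaminate_sense(target))
    print(deaminate_antisense(target))
```

The same enzymatic reaction therefore yields two distinct transition types on the reported strand, depending only on which strand is exposed during transcription.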
General: All PCR reactions for restriction cloning and recombineering targeting cassettes were performed using Q5 High Fidelity DNA Polymerase (New England Biolabs). All colony PCR reactions for sequencing were performed using OneTaq Quick-Load 2× Master Mix with Standard Buffer (New England Biolabs). Primers were obtained from Life Technologies. Gene blocks were obtained from Integrated DNA Technologies.
Reagents: The following reagents were obtained as indicated: kanamycin monosulfate, fosfomycin, agar, and chloramphenicol (Alfa Aesar J61272, J66602, A10752, and B20841, respectively); tetracycline hydrochloride (CalBioChem 58346); rifampicin (TCI R0079); ampicillin (Fisher Bioreagents BP1760-25); streptomycin sulfate (MP Biomedical 100556); tetrazolium chloride, L-rhamnose, antifoam-204, and ethyl methanesulfonate (Sigma-Aldrich T8877, W373011, A8311, and M0880, respectively); L-arabinose and cycloheximide (Chem-Impex 01654 and 00083, respectively); lysogeny broth (LB; Difco 244620); anhydrous sodium phosphate dibasic and monobasic sodium phosphate (Mallinckrodt 7917 and 7892, respectively); potassium chloride and isopropyl β-D-1-thiogalactopyranoside (Sigma P9333 and 16758, respectively); magnesium sulfate (Macron 6070-12); o-nitrophenyl-β-galactoside and egg-white lysozyme (VWR 0789 and 0663, respectively); PopCulture lysis reagent (EMD Millipore 71092-4); 2-mercaptoethanol (Bio-Rad 161-0710); and trimethoprim (Matrix Scientific 058373).
Cloning and Recombineering: All plasmids were generated by restriction cloning. Ligation reactions were performed using Quick Ligase (New England Biolabs). All DNA cloning was performed in DH10B cells (Invitrogen). The rApo1 gene was amplified from pET28b-BE1 (Komor et al., Nature. 2016 May 19; 533(7603): 420-24) and the T7 RNA polymerase gene was amplified from pTara (Wycuff and Matthews, Anal. Biochem. 2000 Jan. 1; 277(1): 67-73). Mutation assay reporter plasmids utilizing the single-copy BAC origin and the terminator arrays of the UUCG-T7 derivative of the T7 terminator (Mairhofer et al., ACS Synth. Biol. 2015 Mar. 20; 4(3): 265-73) were generated by serial insertion of the annealed oligos NheI-UUCG-BamHI S and NheI-UUCG-BamHI AS (TABLE 10). The folA gene was amplified from DH10B genomic DNA. All E. coli strains used in this work were engineered using lambda red recombineering strategies described in detail below.
Mutation Assay: To assess mutagenesis rates, the control (Δung, rApo1, drApo1, and drApo1−T7; TABLE 8) and mutagenic strains (MutaT7 and MP6; TABLE 8) (StrepR) carrying reporter plasmids (AmpR) were streaked on LB agar with 100 μg/mL streptomycin and 100 μg/mL ampicillin and grown at 37° C. for 24 h in order to obtain clones. Single colonies were picked in triplicate for each sample and used to inoculate 5 mL LB with 100 μg/mL streptomycin, 100 μg/mL ampicillin, and 25 mM arabinose (with 10 μg/mL chloramphenicol for the MP6 strain, TABLE 8), then shaken at 250 r.p.m. and 37° C. for 24 h to accumulate mutations during growth. 1 mL aliquots of each culture were pelleted at 6000×g for 3 min and resuspended in 1 mL LB to remove arabinose. Each resuspension was plated on LB agar plates with 50 μg/mL tetrazolium chloride (a metabolic contrast dye for visualizing colonies) and the antibiotics indicated below to analyze mutation rates and viability:
Plates were incubated at 37° C. for 48 h, then imaged by inverting the plates onto transparencies and scanning on a document scanner at a resolution of 400 dots per inch (d.p.i.). The colonies were then counted using the software OpenCFU (3.9.0) (Geissmann, PLoS One. 2013; 8(2): e54072), with the minimum colony radius set to 3, the maximum colony radius set to 50, and the regular threshold set to 4.
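By way of illustration only, colony counts such as those exported from OpenCFU can be converted into titers and resistance frequencies. The following is a minimal sketch in Python; the function names, and the assumption that the plated volume and dilution factor of each plate are recorded alongside each count, are hypothetical and are not taken from the assay as performed.

```python
def cfu_per_ml(colonies: int, dilution_factor: float, plated_volume_ml: float) -> float:
    """Titer of the undiluted culture, in colony-forming units per mL.

    dilution_factor is the fold-dilution applied before plating
    (1.0 for an undiluted sample); plated_volume_ml is the volume spread.
    """
    if dilution_factor <= 0 or plated_volume_ml <= 0:
        raise ValueError("dilution factor and plated volume must be positive")
    return colonies * dilution_factor / plated_volume_ml


def resistance_frequency(resistant_cfu_per_ml: float, viable_cfu_per_ml: float) -> float:
    """Fraction of viable cells that form colonies on the selective plate."""
    if viable_cfu_per_ml <= 0:
        raise ValueError("viable titer must be positive")
    return resistant_cfu_per_ml / viable_cfu_per_ml
```

Dividing the titer on a selective plate by the titer on a non-selective plate of the same culture gives a per-cell frequency that can be compared across strains with different viabilities.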
The same assay as above was also used to assess the mutation rate of the ugi rApo1, ugi MutaT7, and ugi drApo1−T7 strains (TABLE 8), except that instead of arabinose either 0 mM or 1 mM isopropyl β-D-1-thiogalactopyranoside (IPTG) was added to the liquid overnight cultures as a control or to induce mutagenesis, respectively.
Chemical Mutagenesis with ethyl methanesulfonate (EMS): Mutagenesis with EMS was performed as previously described (Cupples and Miller, Proc. Natl. Acad. Sci. U.S.A. 1989 July; 86(14): 5345-49). An overnight culture of each sample was subcultured and grown until it reached a density of 2-3×10^8 cells per mL (log phase). 5 mL aliquots of cells were chilled on ice, washed twice with sodium phosphate buffer (pH=7), and resuspended in 1 mL of 1×PBS in a 1.5 mL Eppendorf tube. EMS was added while cold by pipetting 14 μL of EMS into 1 mL of resuspended cells. The tubes were sealed and mixed at 1000 r.p.m. for 60 min at 37° C. The cells were then washed twice with LB and resuspended in 1 mL of LB. Immediately after washing, a viability measurement was performed by plating 50 μL of a 10,000-fold dilution of each culture on LB agar with 100 μg/mL streptomycin, 100 μg/mL ampicillin, and 50 μg/mL tetrazolium chloride. After 48 h of incubation, plates were imaged on a document scanner as described above. The number of live ampicillin-resistant colonies was counted after EMS treatment in CFU/mL to measure the viability after mutagen treatment (
Continuous Culturing and Sequencing of the Dual T7 Promoter Reporter Plasmid: The dual T7 promoter reporter plasmid was continuously cultured in the MutaT7-csg+ mot+ strain (TABLE 8) in a 70 mL culture in a round-bottomed flask that was slowly stirred in a 37° C. mineral oil bath. The culture was aerated through a needle that was connected to a standard aquarium pump and LB with 100 μg/mL streptomycin, 100 μg/mL ampicillin, and 0.5% isopropanol (as an antifoaming agent) was fed into the culture via a needle connected to a peristaltic pump at a rate of ˜0.5 volumes/h. Fractions were collected every 3 d for 12 d. Each fraction was plated for single colonies on LB agar with 100 μg/mL ampicillin and 10 clones from each fraction were Sanger-sequenced by colony PCR with the primers 1493 and 1494 (TABLE 10).
Continuous Culturing and Sequencing of the T7 Promoter+Filler DNA and T7 Promoter+Terminators Reporter Plasmids: The T7 promoter+filler DNA and T7 promoter+terminators reporter plasmids were continuously cultured in the Δung (negative control), MutaT7, and MP6 strains (TABLE 8) in 20 mL cultures using a previously described multiplex bioreactor setup (Miller et al., J. Vis. Exp. 2013 Feb. 23; (72): e50262). The reactor was stored in a 37° C. warm room and was aerated and stirred with aquarium pumps. LB with 100 μg/mL streptomycin, 100 μg/mL ampicillin, 100 μg/mL cycloheximide, 0.01% (v/v) antifoam-204, and 150 μg/mL arabinose (+10 μg/mL chloramphenicol in the case of the MP6 strain (TABLE 8)) was pumped into each reaction vessel at a rate of 0.87 volumes/h. Fractions were collected every 3 d. Each fraction was plated on LB agar with 100 μg/mL streptomycin and 100 μg/mL ampicillin and 12 single colonies from each plate were grown in 5 mL LB with 100 μg/mL ampicillin. DNA was isolated from each overnight culture using the Qiaprep 96 Turbo Miniprep Kit and quantified using the PicoGreen assay.
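For orientation, the feed rate of a continuous culture can be related to the number of generations elapsed. The sketch below assumes ideal steady-state chemostat behavior, in which the specific growth rate equals the dilution rate; that assumption is not stated in the procedure above, and the function names are hypothetical.

```python
import math


def doubling_time_h(dilution_rate_per_h: float) -> float:
    """Doubling time in hours, assuming growth rate equals dilution rate."""
    if dilution_rate_per_h <= 0:
        raise ValueError("dilution rate must be positive")
    return math.log(2) / dilution_rate_per_h


def generations_elapsed(dilution_rate_per_h: float, days: float) -> float:
    """Approximate generations accumulated over the given number of days."""
    return days * 24.0 / doubling_time_h(dilution_rate_per_h)
```

Under this idealization, `generations_elapsed(0.87, 15)` estimates the generations accumulated over 15 d at the 0.87 volumes/h feed rate described above; real cultures may deviate from it, for example during start-up or if biofilms form.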
Library Construction and Next Generation Sequencing: Libraries were prepared using a miniaturized version of Nextera XT. Briefly, 0.5 ng of input DNA was subjected to a 1/12 scale reaction of Illumina Nextera XT performed on a TTP Labtech Mosquito HV using combinatorial dual indexing (Vfinal=4 μL). Completed libraries were size selected using SPRI beads at 0.7× volume and pooled before sequencing on an Illumina MiSeq using 150 nt paired end reads (v2 chemistry). Sequencing reads were aligned against respective plasmid sequences using bwa mem (v. 0.7.12-r1039) (Li, arXiv preprint arXiv. 16 Mar. 2013; 1303.3997), with flag -t 16, and sorted and indexed bam files were generated using samtools (v 1.3) (Li et al., Bioinformatics. 2009 Aug. 15; 25(16): 2078-79). These bam files were processed using samtools mpileup with flags --excl-flags 2052 and -d 10000000 and the same plasmid reference sequences used for mapping (Li et al., Bioinformatics. 2011 Nov. 1; 27(21): 2987-93). Read coverages and allele counts and frequencies were tabulated at each position of the reference sequence in each sample for downstream analysis. Only positions with greater than 10-fold coverage in all replicates of each sample were included in the analysis. Fixed variant alleles (present at greater than 85% frequency) for each sample are reported. Sanger sequencing was also performed on a PCR amplicon from 96 clones of Δung (negative control) and MutaT7 strains (TABLE 8) after 15 d of continuous culture carrying the T7 promoter+terminators reporter plasmid. The primers 2165 and 1197 (TABLE 10) were used to amplify and Sanger sequence the KanR gene.
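The coverage filter and the fixed-variant threshold described above can be sketched as follows. This is an illustrative sketch only: it assumes the per-position allele counts have already been parsed from the mpileup output into dictionaries, and the data layout and function names are hypothetical rather than the scripts actually used.

```python
from typing import Dict, List, Tuple

AlleleCounts = Dict[str, int]        # base -> read count at one position
Replicate = Dict[int, AlleleCounts]  # 0-based reference position -> counts

MIN_COVERAGE = 10   # positions need greater than 10-fold coverage
FIXED_FREQ = 0.85   # alleles above this frequency are called fixed


def covered_positions(replicates: List[Replicate], ref_len: int) -> List[int]:
    """Positions whose depth exceeds MIN_COVERAGE in every replicate."""
    return [
        pos for pos in range(ref_len)
        if all(sum(rep.get(pos, {}).values()) > MIN_COVERAGE for rep in replicates)
    ]


def fixed_variants(
    replicate: Replicate, reference: str, positions: List[int]
) -> List[Tuple[int, str, str, float]]:
    """(position, reference base, variant base, frequency) for fixed variants."""
    calls = []
    for pos in positions:
        counts = replicate.get(pos, {})
        depth = sum(counts.values())
        for base, n in counts.items():
            if depth > 0 and base != reference[pos] and n / depth > FIXED_FREQ:
                calls.append((pos, reference[pos], base, n / depth))
    return calls
```

Applying the coverage filter across all replicates before calling variants keeps poorly covered positions from contributing spurious high-frequency alleles in a single replicate.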
Lambda Red Recombineering: The E. coli genome was edited using seamless lambda red recombineering with ccdB counterselection, as previously described (Wang et al., Nucleic Acids Res. 2014 March; 42(5): e37). Cells were transformed with the temperature-sensitive psc101-gbaA recombineering plasmid, plated on LB agar with 10 μg/mL tetracycline, and incubated for 24 h at 30° C. Colonies were selected and grown in LB containing 10 μg/mL tetracycline overnight at 30° C. (18-21 h). Overnight cultures were diluted 25-fold in LB with 10 μg/mL tetracycline and grown at 30° C. for ˜2 h until attaining an OD600 of 0.3-0.4. The ccdA antitoxin and recombineering machinery were then induced by adding arabinose and rhamnose to a final concentration of 2 mg/mL each and then growing the cultures at 37° C. for 40 min to an OD600 of ˜0.6. The cultures were then placed on ice, washed twice with ice-cold sterile ddH2O, resuspended in ˜25 μL of ice-cold sterile ddH2O, and electroporated with ˜200 ng of the appropriate kan-ccdB targeting cassette (1.8 kV, 5.8 ms, 0.1 cm cuvette, BioRad Micropulser). The cells were then recovered in super optimal broth with catabolite repression (SOC) with 2 mg/mL arabinose at 30° C. for 2 h, then plated on LB agar plates with 50 μg/mL kanamycin and 2 mg/mL arabinose and incubated for 24 h at 30° C. Colonies that grew under these conditions had incorporated the kan-ccdB targeting cassette and were picked and grown in LB with 50 μg/mL kanamycin and 2 mg/mL arabinose at 30° C. for 18-21 h. The cultures were then diluted 25-fold in LB with 50 μg/mL kanamycin and 2 mg/mL arabinose and grown at 30° C. for ˜2 h until they reached an OD600 of 0.3-0.4. The recombineering machinery was then induced by adding rhamnose to a final concentration of 2 mg/mL and then growing the cultures at 37° C. for 40 min to an OD600 of ˜0.6. 
The cultures were then placed on ice, washed twice with ice-cold sterile ddH2O, resuspended in ˜25 μL of ice-cold sterile ddH2O, and electroporated with ˜200 ng of the final targeting cassette intended to replace the kan-ccdB cassette currently integrated in the genome (1.8 kV, 5.8 ms, 0.1 cm cuvette, Bio-Rad Micropulser). The cells were then recovered in SOC with 2 mg/mL arabinose at 30° C. for 2 h, and then were washed once with LB to remove the arabinose and prevent continued production of the ccdA antitoxin. The cultures were then plated on LB agar plates at various dilutions with 100 μg/mL streptomycin and incubated for 24 h at 37° C. Without the ccdA antitoxin, the ccdB toxin will kill cells that have not replaced the integrated kan-ccdB cassette with the final targeting cassette. The colonies that grow should have the final targeting cassette integrated, but were screened by PCR or sequencing to confirm cassette integration, as some colonies may simply inactivate the ccdB toxin. Once a clone with the desired change was found, the temperature-sensitive psc101-gbaA recombineering plasmid was cured by plating on LB agar with 100 μg/mL streptomycin, incubating at 42° C. for 18-21 h, streaking a colony from the plate on LB agar with 100 μg/mL streptomycin, and incubating at 42° C. for another 18-21 h. The colonies from the second plate were grown in LB with 100 μg/mL streptomycin at 37° C. to generate glycerol stocks. The colonies were also incubated in LB with 10 μg/mL tetracycline at 30° C. to ensure tetracycline sensitivity and confirm that the recombineering plasmid was successfully cured. The various strains used in this work (TABLE 8) were generated using the primers in TABLE 10.
Deleting the motAB and csgABCDEFG operons through DIRex lambda red recombineering to decrease biofilm formation in bioreactor experiments: Deletions of the motAB operon (Pratt and Kolter, Mol. Microbiol. 1998 October; 30(2): 285-93) and the csgABCDEFG operon (Prigent-Combaret et al., Environ. Microbiol. 2000 August; 2(4): 450-64) have been shown to produce strains of E. coli that are deficient in biofilm formation. To minimize inlet line contamination and clogs in bioreactor experiments owing to biofilms, the motAB and csgABCDEFG operons were deleted using one-step DIRex lambda red recombineering (Näsvall, PLoS One. 2017 Aug. 30; 12(8): e0184126). The motAB targeting half-cassettes were amplified from R6K-AmilCP-kan-ccdB using the primers delmotDF and AmilCP-KanR and from R6K-kan-ccdB-AmilCP using the primers del-motDR and KanF-AmilCP (TABLE 10). The motAB half-cassettes were co-electroporated to replace motAB with a kan-ccdB cassette flanked by large AmilCP inverted repeats nested between short 30 bp direct repeats. The repeat architecture leads to a high rate of spontaneous excision that was selected for using ccdB counterselection to obtain a markerless deletion of motAB. This procedure was then repeated to delete the csgABCDEFG operon. The csgABCDEFG targeting half-cassettes were amplified from R6K-AmilCP-kan-ccdB using the primers delcsgDF and AmilCP-KanR and from R6K-kan-ccdB-AmilCP using the primers delcsgDR and KanF-AmilCP (TABLE 10).
Separation of rApo1−T7 fusion (rApo1+T7) through DIRex lambda red recombineering: In order to generate a non-fusion control strain in which rApo1 (or drApo1) and T7 are expressed separately from the same operon under the PA1lacO-Tenth promoter, one-step DIRex lambda red recombineering was used to insert a stop codon at the end of the rApo1 gene. The rApo1Stop targeting half-cassettes were amplified from R6K-AmilCP-kan-ccdB using the primers rApo1StopDF and AmilCP-KanR and from R6K-kan-ccdB-AmilCP using the primers rApo1StopDR and KanF-AmilCP (TABLE 10). The rApo1Stop half-cassettes were co-electroporated to insert a stop codon after rApo1 followed by a kan-ccdB cassette flanked by large AmilCP inverted repeats nested between short 30 bp direct repeats. Excision of the AmilCP-kan-ccdB-AmilCP cassette was selected for using ccdB counterselection to obtain a markerless insertion of a stop codon after rApo1.
Mutation Assay and Sequencing with the T7 Promoter+rpsL Reporter Plasmid: To assess the locations and types of mutations observed, the drApo1−T7 negative control strain and MutaT7 and MP6 mutagenic strains (TABLE 8) (StrepR) carrying the T7 promoter+rpsL reporter plasmid (AmpR) were streaked on LB agar with 100 μg/mL ampicillin and grown at 37° C. for 24 h in order to obtain clones. Single colonies were picked in triplicate for each sample and used to inoculate 5 mL LB with 100 μg/mL ampicillin and 25 mM arabinose (with 10 μg/mL chloramphenicol for the MP6 strain, TABLE 8), then shaken at 250 r.p.m. and 37° C. for 24 h to accumulate mutations during growth. 1 mL aliquots of each culture were pelleted at 6000×g for 3 min and resuspended in 1 mL LB to remove arabinose. 50 μL of a 100-fold dilution of each resuspension was plated on LB Lennox agar plates (pH 8.0) with 500 μg/mL streptomycin, 100 μg/mL ampicillin, and 50 μg/mL tetrazolium chloride. 48 colonies from each plate were picked for colony PCR using the primers 2062 and 1197 (TABLE 10). The amplicons were Sanger-sequenced using the primer 1197 (TABLE 10).
LacZα Activity Assay for Quantifying T7 and MutaT7 Processivity: In order to determine if the fusion of rApo1 to the N-terminus of T7 RNA polymerase affected the processivity and/or activity of the T7 RNA polymerase, the expression of the lacZα fragment from T7 promoters of varying upstream distances was measured via the cleavage of o-nitrophenyl-β-galactoside (oNPG) using an assay adapted from a previous publication (Schaefer et al., Anal. Biochem. 2016 Mar. 29; 503: 56-57). LacZα reporter plasmids C1A through C1F (ChlorR) were transformed into the ung+, drApo1−T7 ung+ and drApo1+T7 ung+ strains (TABLE 8) and plated on LB agar with 25 μg/mL chloramphenicol and grown at 37° C. for 24 h in order to obtain clones. Colonies of each reporter/strain combination were picked in triplicate and grown in 200 μL LB with 25 μg/mL chloramphenicol and 1 mM IPTG in a parafilm-wrapped 96-well plate that was shaken at 220 r.p.m. at 30° C. for 22 h. IPTG was added to induce the expression of the lacZω fragment from the genome that complements the lacZα fragment, and to increase the expression of drApo1−T7 and T7 from the PA1lacO-Tenth promoter. 80 μL of each overnight culture was mixed with 120 μL Bgal mix (60 mM sodium phosphate dibasic, 40 mM sodium phosphate monobasic, 10 mM potassium chloride, 1 mM magnesium sulfate, 26 mM 2-mercaptoethanol, 166 μg/mL egg-white lysozyme, 1.0 mg/mL oNPG, and 6.7% PopCulture lysis reagent) in a black, clear-bottomed 96-well plate. The OD600 and OD420 of each well were measured every 2 min over the course of 1 h in a Biotek Synergy H1 hybrid plate reader, with each read followed by double orbital shaking at 559 r.p.m. at 30° C. The oNPG cleavage activity of each well was calculated by measuring the slope of the linear region of each OD420 trace, dividing by the initial OD600 reading, and multiplying by 1000. The mean and standard deviation of each set of triplicates were calculated.
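The activity calculation described above (slope of the linear region of the OD420 trace, divided by the initial OD600 reading, multiplied by 1000) can be sketched as follows. The sketch assumes that the bounds of the linear region are chosen by the user and that the sample standard deviation is intended; neither is specified above, and all function names are hypothetical.

```python
import statistics
from typing import List, Sequence, Tuple


def linear_slope(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Ordinary least-squares slope of ys against xs."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two paired points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx


def onpg_activity(
    times_min: Sequence[float],
    od420: Sequence[float],
    od600_initial: float,
    start: int,
    stop: int,
) -> float:
    """Slope of OD420 over [start, stop), normalized to initial OD600, times 1000."""
    slope = linear_slope(times_min[start:stop], od420[start:stop])
    return slope / od600_initial * 1000.0


def summarize_triplicate(activities: List[float]) -> Tuple[float, float]:
    """Mean and sample standard deviation of replicate activities."""
    return statistics.mean(activities), statistics.stdev(activities)
```

Normalizing the slope to the initial OD600 corrects for differences in cell density between wells, so that activities from different reporter/strain combinations can be compared directly.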
Episomal folA Directed Evolution Assay to Assess False Positive Frequency: To assess the effect that targeted versus global mutagenesis has on the false positive frequency of a directed evolution experiment, a model drug resistance evolution experiment was designed where the rate of true positive evolution corresponds to the frequency that drug resistance-conferring mutations appear in an episomal copy of a drug-sensitive gene. To create this system, the folA+T7 promoter plasmid (AmpR)—which contains the complete, endogenous folA promoter and coding sequence for dihydrofolate reductase followed by a T7 promoter pointing in the reverse direction—was transformed into MutaT7 and MP6 mutagenic strains (TABLE 8) (StrepR). These strains were streaked on LB agar with 100 μg/mL ampicillin and grown at 37° C. for 24 h in order to obtain clones. Single colonies were picked in triplicate for each sample and used to inoculate 5 mL LB with 100 μg/mL ampicillin and 25 mM arabinose (with 10 μg/mL chloramphenicol for the MP6 strain (TABLE 8)), then shaken at 250 r.p.m. and 37° C. for 24 h to accumulate mutations during growth. 1 mL aliquots of each culture were pelleted at 6000×g for 3 min and resuspended in 1 mL LB to remove arabinose. 50 μL of a 100-fold dilution of each resuspension was plated on LB agar plates with 5 μg/mL trimethoprim (TMP) and 50 μg/mL tetrazolium chloride. 13-15 colonies from each plate were picked for colony PCR. Episomal folA was amplified and Sanger sequenced using the primers Alof-T7 S and 1197 (TABLE 10).
Bacterial growth assay measuring trimethoprim drug resistance: Isolates were grown to stationary phase following overnight incubation at 37° C. in LB with 100 μg/mL ampicillin. Cultures were diluted 1:100 into a plate containing LB broth with increasing concentrations of TMP ranging from 1 μM to 1 mM. Growth of diluted samples was determined by measuring OD600 every 5 min in a Biotek Synergy H1 hybrid plate reader followed by orbital shaking at 282 r.p.m. and incubation at 37° C. Maximal growth rate was determined by performing “Max V” calculation in Gen5 software, using a 5-point segment of each growth curve corresponding to the highest linear slope. Upon determining maximum growth rate within each sample, growth rates were normalized to the highest growth rate within each sample series yielding the relative growth rate at each TMP concentration (
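The growth-rate normalization described above can be approximated outside of the Gen5 software. The sketch below takes the steepest least-squares slope over any 5-point window of the raw OD600 trace and normalizes each series to its fastest-growing condition. It is an assumption that this reproduces the "Max V" calculation, and the function names are hypothetical.

```python
from typing import Dict, Sequence


def linear_slope(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Ordinary least-squares slope of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx


def max_growth_rate(times: Sequence[float], od600: Sequence[float], window: int = 5) -> float:
    """Steepest slope found over any run of `window` consecutive reads."""
    if len(times) != len(od600) or len(times) < window:
        raise ValueError("need paired reads at least one window long")
    return max(
        linear_slope(times[i:i + window], od600[i:i + window])
        for i in range(len(times) - window + 1)
    )


def relative_growth_rates(rates_by_tmp: Dict[float, float]) -> Dict[float, float]:
    """Normalize each rate to the highest rate in the same sample series."""
    top = max(rates_by_tmp.values())
    if top <= 0:
        raise ValueError("no growth detected in this series")
    return {conc: rate / top for conc, rate in rates_by_tmp.items()}
```

Keying the normalized rates by TMP concentration gives, for each isolate, a dose-response series that can be compared across isolates.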
Traditional in vivo mutagenesis strategies rely on exogenous mutagens (e.g., high energy light or chemicals) (Cupples and Miller, Proc. Natl. Acad. Sci. U.S.A. 1989 July; 86(14): 5345-49; Tessman et al., Science. 1965 Apr. 23; 148(3669): 507-8) or expressing mutagenic enzymes (e.g., XL1-Red (Greener et al., Mol. Biotechnol. 1997 April; 7(2): 189-95) or the MP6 plasmid (Badran A. H. and Liu D. R., Nat. Commun. 2015 Oct. 7; 6: 8425)). These global mutagenesis strategies can yield high mutation rates and diverse genetic landscapes. However, extensive mutations throughout the genome are problematic in many contexts, especially in directed evolution experiments (
Targeted in vivo mutagenesis strategies have the potential to overcome these deficiencies. DNA-damaging enzymes fused to deactivated Cas9 nucleases can edit bases at specific genetic loci (Komor et al., Nature. 2016 May 19; 533(7603): 420-24; Nishida et al., Science. 2016 Sep. 16; 353(6305): pii: aaf8729; Komor et al., Sci. Adv. 2017 Aug. 30; 3(8): eaao4774; Gaudelli et al., Nature. 2017 Nov. 23; 551(7681): 464-71; Kim et al., Nat. Biotechnol. 2017 Apr. 10; 35(5): 475-480), but require many gRNAs to tile mutagenic enzymes throughout a target DNA that may be multi-kb in length (Hess et al., Nat. Methods. 2016 December; 13(12): 1036-42; Ma et al., Nat. Methods. 2016 December; 13(12): 1029-35). Moreover, the guide RNAs must be redesigned after each evolution round introduces new mutations in the target DNA. Another example is the use of an error-prone Pol I variant to selectively mutagenize genes on ColE1 plasmids, although this method is limited to Escherichia coli and can target mutations within only a few kb of the ColE1 origin (Camps et al., Proc. Natl. Acad. Sci. U.S.A. 2003 Aug. 8; 100(17): 9727-9732; Allen et al., Nucleic Acids Res. 2011 May 26; 39(16): 7020-7033). Error-prone replication mediated by the Ty1 retrotransposon specifically in yeast can also selectively mutate <5 kb genetic cargoes inserted into the retrotransposon (Crook et al., Nat. Commun. 2016 Oct. 17; 7: 13051). Other targeted mutation methods in yeast include oligo-mediated genome engineering (DiCarlo et al., ACS Synth. Biol. 2013 Dec. 20; 2(12): 741-749), which can be labor-intensive, and an orthogonal replication system (Ravikumar et al., Nat. Chem. Biol. 2014 Feb. 2; 10(3): 175-177), which was developed specifically in yeast.
It was hypothesized herein that a processive, DNA-traversing biomolecule tethered to a DNA-damaging enzyme could provide a generalizable solution to the problem of targeting mutations across large, yet still well-defined, DNA regions. Monomeric RNA polymerases possess inherently high promoter specificity (Rong et al., J. Biol. Chem. 1998 Apr. 24; 273(17): 10253-60) and processivity (Thiel et al., J. Gen. Virol. 2001 June; 82(6): 1273-81). Cytidine deaminases are potent DNA-damaging enzymes that can act on single-stranded DNA substrates during transcription (Ramiro et al., Nat. Immunol. 2003 May; 4(5): 452-56). We envisioned that a chimeric “MutaT7” protein consisting of a cytidine deaminase (rApo1) fused to T7 RNA polymerase (T7-pol) would, therefore, allow us to target mutations specifically to any DNA region lying downstream of a T7 promoter (
To begin, a lacZ expression assay (Schaefer et al., Anal. Biochem. 2016 Mar. 29; 503: 56-57) was used to show that T7-Pol tolerated an rApo1 N-terminal fusion and still efficiently transcribed tens of kilobases (
Targeted mutagenesis was assayed using a codon reversion assay based on reporter plasmids either having or lacking a T7 promoter sequence upstream of silent drug resistance genes with ACG triplets in place of ATG start codons (
T7 promoter-dependent KanR mutagenesis by MutaT7 shows that mutagenesis can be targeted to a desired DNA region near a T7 promoter. Because T7-pol is highly processive, it was anticipated that mutations would also be introduced further downstream of the T7 promoter. MutaT7 processivity was assayed by inserting a tetracycline-resistance (TetR) gene with an inactive, ACG start codon ˜1.6 kb downstream of the KanR gene (
Targeted mutagenesis using the processive MutaT7 chimera requires not just recruitment to a DNA locus, but also termination at the end of targeted DNA. To address termination, KanR/TetR reporter plasmids were used in which the silent, start codon-defective resistance genes were separated by one or more T7 terminators (
To further assess whether MutaT7 induces mutagenesis specifically on the target DNA, the evolution of resistance to rifampicin (Garibyan et al., DNA Repair. 2003 May; 2(5): 593-8) and fosfomycin (Nilsson et al., Antimicrob. Agents Chemother. 2003 September; 47(9): 2850-58) was evaluated. Resistance can derive from diverse genomic mutations such that the appearance of resistant colonies correlates with off-target mutation rates in the genome (Badran and Liu, Nat. Commun. 2015 Oct. 7; 6: 8425; Garibyan et al., DNA Repair. 2003 May; 2(5): 593-8), analogous to cheating parasites in directed evolution schemes. Selection on either rifampicin- or fosfomycin-treated plates revealed that MutaT7-expressing samples displayed drug resistance frequencies comparable to background. In contrast, high frequencies of antibiotic resistance were observed in all global mutagenesis samples (
Additional experiments were directed at using MutaT7 to evolve ectopically expressed folA gene variants that confer trimethoprim resistance. The folA gene encodes dihydrofolate reductase, and folA mutations are just one of many potential routes to trimethoprim resistance (Acar and Goldstein, Rev. Infect. Dis. 1982 Mar.-Apr. 4; 4(2): 270-275). Either global mutagenesis or MutaT7 was used to mutagenize E. coli carrying a T7-targeted episomal copy of folA. Sanger-sequencing was then performed on colonies that grew on trimethoprim plates. 29 of 44 trimethoprim-resistant colonies mutagenized using MutaT7 had a mutation known to confer resistance (Herrington et al., J. Basic Microbiol. 2002; 42(3): 172) in the episomal folA promoter (TABLE 9,
DNA sequencing was then used to better understand the processivity and targeting of MutaT7 mutagenesis. An E. coli population expressing MutaT7 and the episomal KanR/TetR reporter plasmid was allowed to drift in the absence of selection pressure for 15 days prior to isolation of episomal DNA from clones (
Another benefit of targeted mutagenesis is the capacity to attain much larger library sizes by avoiding toxic mutations in essential, off-target genes. On the basis of the apparently low off-target mutagenesis rate of MutaT7, it was hypothesized that E. coli carrying MutaT7 would have significantly higher viability than bacteria treated with global mutagens. Indeed, consistent with prior work (Badran and Liu, Nat. Commun. 2015 Oct. 7; 6: 8425), a very low viability was observed in all populations treated with global mutagens. In contrast, populations expressing MutaT7 possessed viability similar to untreated cells (
Next, Illumina sequencing was used to identify mutations anywhere in the episomal reporter DNA sequence obtained from clones of the E. coli populations in
A disadvantage of MutaT7 is its limited mutational spectrum and an apparent strand bias observed in the sequencing results showing that C to T transitions were predominantly obtained in the sense strand using a single T7 promoter (
It was also observed that repair of deoxyuridine must be prevented to observe significant mutagenesis with MutaT7 (also observed with other cytidine deaminase-based systems (Badran and Liu, Nat. Commun. 2015 Oct. 7; 6: 8425; Komor et al., Nature. 2016 May 19; 533(7603): 420-24)). Although Δung cells were used to address this issue in the aforementioned experiments, a more flexible alternative is to co-express MutaT7 with the uracil glycosylase inhibitor (UGI; a protein that can inhibit UNG activity in many prokaryotes and eukaryotes (Badran and Liu, Nat. Commun. 2015 Oct. 7; 6: 8425; Komor et al., Nature. 2016 May 19; 533(7603): 420-24; Serrano-Heras et al., Nucleic Acids Res. 2007 Aug. 13; 35(16): 5393-5401)). Such co-expression resulted in a high rate of mutagenesis similar to that achieved using Δung cells (
In summary, the processively-acting MutaT7 chimera can selectively direct mutations to large, yet well-defined, regions of DNA in vivo. Utilizing other base editing enzymes (Gaudelli et al., Nature. 2017 Nov. 23; 551(7681): 464-71) in concert with cytidine deaminase will significantly widen the mutational spectrum of MutaT7 and further enable the creation of rich and diverse DNA libraries in vivo. Moreover, DNA-modifiers fused to T7 could facilitate targeted epigenetic studies (DeNizio et al., Curr. Opin. Chem. Biol. 2018 Feb. 13; 45: 10-17). The ubiquitous applicability of T7-pol in diverse organisms (McBride et al., Proc. Natl. Acad. Sci. U.S.A. 1994 Jul. 19; 91(15): 7301-7305; Lieber et al., Eur. J. Biochem., 1998 Oct. 1; 217(1): 387-94; Weinstock et al., Nat. Methods. 2016 Aug. 29; 13(10): 849-851; Dower and Rosbash, RNA. 2002 May; 8(5): 686-697) suggests that MutaT7 will prove useful in a broad range of evolutionary and synthetic biology settings.
The following strains were constructed using lambda red recombineering as described in Example 8.
To assess mutagenesis rates, the control (tadA-Only) and mutagenic strains (tadA-XTEN-T7 and tadA-GGS-T7) (StrepR) carrying reporter plasmids (BAC-KanStop-TetStop or BAC-T7-KanStop-TetStop,
Plates were incubated at 37° C. for 48 h, then imaged by inverting the plates onto transparencies and scanning on a document scanner at a resolution of 400 dots per inch. The colonies were then counted using the software OpenCFU (3.9.0) (Geissmann, PLoS One. 2013; 8(2): e54072), with the minimum colony radius set to 3, the maximum colony radius set to 50, and the regular threshold set to 4.
To show that other types of mutations can be introduced using other DNA-damaging agents fused to T7, a previously reported variant of tadA (Gaudelli et al., Nature. 2017 Nov. 23; 551(7681): 464-71, the entirety of which is incorporated herein by reference) was fused to T7 using two different linker sequences (GGS and XTEN) and placed under the control of an IPTG-inducible promoter (PA1lacO-Tenth). This variant of tadA is able to make A to G mutations in DNA.
Mutagenesis assays were then carried out using these tadA-T7 E. coli strains and reporter plasmids that have defective resistance genes. For these assays, reporter plasmids were used that have defective kanamycin (KanR) and tetracycline (TetR) resistance genes (each having premature TAG stop codons). The BAC-KanStop-TetStop reporter plasmid lacks a T7 promoter, and should thus not be targeted by the tadA-XTEN-T7 or tadA-GGS-T7 fusion enzymes. The BAC-T7-KanStop-TetStop reporter plasmid has a T7 promoter preceding the defective KanR and TetR genes, which should allow tadA-XTEN-T7 or tadA-GGS-T7 fusion enzymes to mutate these genes, occasionally mutating the TAG stop codon to TGG and thus conferring antibiotic resistance (
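The logic of the reversion reporters can be illustrated with a short sketch that applies a single base transition at each position of a codon and translates the result using the standard genetic code. The function names are hypothetical. The sketch considers only edits read on the coding strand and is meant to illustrate the reporter design, not to predict observed mutation frequencies.

```python
from typing import List, Tuple

BASES = "TCAG"
# Standard genetic code, with codons ordered T, C, A, G at each position.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"


def translate(codon: str) -> str:
    """One-letter amino acid for a DNA codon ('*' marks a stop codon)."""
    i, j, k = (BASES.index(base) for base in codon)
    return AMINO_ACIDS[16 * i + 4 * j + k]


def single_transitions(codon: str, src: str, dst: str) -> List[Tuple[str, str]]:
    """All codons reachable by changing one `src` base to `dst`, with translations."""
    results = []
    for pos, base in enumerate(codon):
        if base == src:
            mutant = codon[:pos] + dst + codon[pos + 1:]
            results.append((mutant, translate(mutant)))
    return results
```

For example, `single_transitions("TAG", "A", "G")` returns the TGG (tryptophan) codon that restores the reading frame of the KanStop and TetStop reporters, and `single_transitions("ACG", "C", "T")` returns the ATG start codon restored by cytidine deaminase activity in the start codon-defective reporters described earlier.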
Without the T7 promoter on the reporter plasmid, only a low level of resistance-conferring mutations was observed across all conditions, including with the tadA-Only control strain, which expresses the tadA enzyme alone (
Furthermore, low levels of rifampicin resistance were observed across all conditions (
DNA mutagenesis is an important and necessary step in all directed evolution methodologies, which are heavily utilized by academic and industrial labs around the world. Mutagenic technologies are particularly vital for research labs developing biomolecular drugs with novel actions or improved potency, as the identification of biomolecules with improved therapeutic properties inherently relies on some form of directed evolution. The recent implementation of biologics has further increased the demand for new and improved antibodies, vaccines, and recombinant proteins. As progress in biologic development is constrained by currently available methodologies for performing directed evolution, there is a widespread vested interest in more efficient and cost-effective mutagenic methods.
All of the features disclosed in this specification may be combined in any combination. Each feature disclosed in this specification may be replaced by an alternative feature serving the same, equivalent, or similar purpose. Thus, unless expressly stated otherwise, each feature disclosed is only an example of a generic series of equivalent or similar features.
From the above description, one skilled in the art can easily ascertain the essential characteristics of the present disclosure, and without departing from the spirit and scope thereof, can make various changes and modifications of the disclosure to adapt it to various usages and conditions. Thus, other embodiments are also within the claims.
While several inventive embodiments have been described and illustrated herein, those of ordinary skill in the art will readily envision a variety of other means and/or structures for performing the function and/or obtaining the results and/or one or more of the advantages described herein, and each of such variations and/or modifications is deemed to be within the scope of the inventive embodiments described herein. More generally, those skilled in the art will readily appreciate that all parameters, dimensions, materials, and configurations described herein are meant to be exemplary and that the actual parameters, dimensions, materials, and/or configurations will depend upon the specific application or applications for which the inventive teachings is/are used. Those skilled in the art will recognize, or be able to ascertain using no more than routine experimentation, many equivalents to the specific inventive embodiments described herein. It is, therefore, to be understood that the foregoing embodiments are presented by way of example only and that, within the scope of the appended claims and equivalents thereto, inventive embodiments may be practiced otherwise than as specifically described and claimed. Inventive embodiments of the present disclosure are directed to each individual feature, system, article, material, kit, and/or method described herein. In addition, any combination of two or more such features, systems, articles, materials, kits, and/or methods, if such features, systems, articles, materials, kits, and/or methods are not mutually inconsistent, is included within the inventive scope of the present disclosure.
All definitions, as defined and used herein, should be understood to control over dictionary definitions, definitions in documents incorporated by reference, and/or ordinary meanings of the defined terms.
All references, patents and patent applications disclosed herein are incorporated by reference with respect to the subject matter for which each is cited, which in some cases may encompass the entirety of the document.
The indefinite articles “a” and “an,” as used herein in the specification and in the claims, unless clearly indicated to the contrary, should be understood to mean “at least one.”
The phrase “and/or,” as used herein in the specification and in the claims, should be understood to mean “either or both” of the elements so conjoined, i.e., elements that are conjunctively present in some cases and disjunctively present in other cases. Multiple elements listed with “and/or” should be construed in the same fashion, i.e., “one or more” of the elements so conjoined. Other elements may optionally be present other than the elements specifically identified by the “and/or” clause, whether related or unrelated to those elements specifically identified. Thus, as a non-limiting example, a reference to “A and/or B”, when used in conjunction with open-ended language such as “comprising” can refer, in one embodiment, to A only (optionally including elements other than B); in another embodiment, to B only (optionally including elements other than A); in yet another embodiment, to both A and B (optionally including other elements); etc.
As used herein in the specification and in the claims, “or” should be understood to have the same meaning as “and/or” as defined above. For example, when separating items in a list, “or” or “and/or” shall be interpreted as being inclusive, i.e., the inclusion of at least one, but also including more than one, of a number or list of elements, and, optionally, additional unlisted items. Only terms clearly indicated to the contrary, such as “only one of” or “exactly one of,” or, when used in the claims, “consisting of,” will refer to the inclusion of exactly one element of a number or list of elements. In general, the term “or” as used herein shall only be interpreted as indicating exclusive alternatives (i.e. “one or the other but not both”) when preceded by terms of exclusivity, such as “either,” “one of,” “only one of,” or “exactly one of.” “Consisting essentially of,” when used in the claims, shall have its ordinary meaning as used in the field of patent law.
As used herein in the specification and in the claims, the phrase “at least one,” in reference to a list of one or more elements, should be understood to mean at least one element selected from any one or more of the elements in the list of elements, but not necessarily including at least one of each and every element specifically listed within the list of elements and not excluding any combinations of elements in the list of elements. This definition also allows that elements may optionally be present other than the elements specifically identified within the list of elements to which the phrase “at least one” refers, whether related or unrelated to those elements specifically identified. Thus, as a non-limiting example, “at least one of A and B” (or, equivalently, “at least one of A or B,” or, equivalently “at least one of A and/or B”) can refer, in one embodiment, to at least one, optionally including more than one, A, with no B present (and optionally including elements other than B); in another embodiment, to at least one, optionally including more than one, B, with no A present (and optionally including elements other than A); in yet another embodiment, to at least one, optionally including more than one, A, and at least one, optionally including more than one, B (and optionally including other elements); etc.
It should also be understood that, unless clearly indicated to the contrary, in any methods claimed herein that include more than one step or act, the order of the steps or acts of the method is not necessarily limited to the order in which the steps or acts of the method are recited.
In the claims, as well as in the specification above, all transitional phrases such as “comprising,” “including,” “carrying,” “having,” “containing,” “involving,” “holding,” “composed of,” and the like are to be understood to be open-ended, i.e., to mean including but not limited to. Only the transitional phrases “consisting of” and “consisting essentially of” shall be closed or semi-closed transitional phrases, respectively, as set forth in the United States Patent Office Manual of Patent Examining Procedure, Section 2111.03. It should be appreciated that embodiments described in this document using an open-ended transitional phrase (e.g., “comprising”) are also contemplated, in alternative embodiments, as “consisting of” and “consisting essentially of” the feature described by the open-ended transitional phrase. For example, if the disclosure describes “a composition comprising A and B”, the disclosure also contemplates the alternative embodiments “a composition consisting of A and B” and “a composition consisting essentially of A and B”.
This application claims the benefit under 35 U.S.C. § 119(e) of U.S. Provisional Application No. 62/644,736, filed on Mar. 19, 2018, and entitled “Methods and Kits for Dynamic Hypermutation,” which is incorporated herein by reference in its entirety for all purposes.
This invention was made with Government support under Grant No. GM119162 awarded by the National Institutes of Health (NIH). The Government has certain rights in this invention.
Number | Date | Country |
---|---|---|
62644736 | Mar 2018 | US |