Genome-wide rationally-designed mutations leading to enhanced lysine production in E. coli

Information

  • Patent Grant
  • 11078458
  • Patent Number
    11,078,458
  • Date Filed
    Tuesday, January 26, 2021
  • Date Issued
    Tuesday, August 3, 2021
Abstract
The present disclosure relates to various types of variants in E. coli coding and noncoding regions leading to enhanced lysine production for, e.g., supplements and nutraceuticals.
Description
FIELD OF THE INVENTION

The present disclosure relates to mutations in genes in E. coli leading to enhanced lysine production.


INCORPORATION BY REFERENCE

Submitted with the present application is an electronically filed sequence listing via EFS-Web as an ASCII formatted sequence listing, entitled “INSC046US2_seglist”, created Jan. 12, 2020, and 83,198 bytes in size. The sequence listing is part of the specification filed Jan. 26, 2021 and is incorporated by reference in its entirety.


BACKGROUND OF THE INVENTION

In the following discussion certain articles and methods will be described for background and introductory purposes. Nothing contained herein is to be construed as an “admission” of prior art. Applicant expressly reserves the right to demonstrate, where appropriate, that the articles and methods referenced herein do not constitute prior art under the applicable statutory provisions.


The amino acid lysine is an α-amino acid that is used in the biosynthesis of proteins and is a metabolite of E. coli, S. cerevisiae, plants, humans and other mammals, as well as algae. Lysine contains an α-amino group and an α-carboxylic acid group, and has a chemical formula of C6H14N2O2. One of nine essential amino acids in humans, lysine is required for growth and tissue repair and has a role as a micronutrient, a nutraceutical, an agricultural feed supplement, an anticonvulsant, as well as a precursor for the production of peptides. Because of these roles as, e.g., a supplement and nutraceutical, there has been a growing effort to produce lysine on a large scale.


Accordingly, there is a need in the art for organisms that produce enhanced amounts of lysine where such organisms can be harnessed for large scale lysine production. The disclosed nucleic acid sequences from E. coli satisfy this need.


SUMMARY OF THE INVENTION

This Summary is provided to introduce a selection of concepts in a simplified form that are further described below in the Detailed Description. This Summary is not intended to identify key or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter. Other features, details, utilities, and advantages of the claimed subject matter will be apparent from the following written Detailed Description including those aspects illustrated in the accompanying drawings and defined in the appended claims.


The present disclosure provides variant E. coli genes and non-coding sequences that produce enhanced amounts of lysine in culture including double and triple combinations of variant sequences. Thus, in some embodiments, the present disclosure provides any one of SEQ ID Nos. 2-42.


These aspects and other features and advantages of the invention are described below in more detail.





BRIEF DESCRIPTION OF THE DRAWINGS

The foregoing and other features and advantages of the present invention will be more fully understood from the following detailed description of illustrative embodiments taken in conjunction with the accompanying drawings in which:



FIGS. 1A and 1B are graphic depictions of the lysine pathway in E. coli, highlighting the enzymes in the pathway targeted for rationally-designed editing. FIG. 1B is a continuation of FIG. 1A.



FIG. 2 enumerates the biological target, edit outcome, edit type and scale for the initial 200,000 edits made to the E. coli lysine pathway.



FIG. 3A is an exemplary engine vector for creating edits in E. coli. FIG. 3B is an exemplary editing vector for creating edits in E. coli.





It should be understood that the drawings are not necessarily to scale, and that like reference numbers refer to like features.


DETAILED DESCRIPTION

All of the functionalities described in connection with one embodiment of the methods, devices or instruments described herein are intended to be applicable to the additional embodiments of the methods, devices and instruments described herein except where expressly stated or where the feature or function is incompatible with the additional embodiments. For example, where a given feature or function is expressly described in connection with one embodiment but not expressly mentioned in connection with an alternative embodiment, it should be understood that the feature or function may be deployed, utilized, or implemented in connection with the alternative embodiment unless the feature or function is incompatible with the alternative embodiment.


The practice of the techniques described herein may employ, unless otherwise indicated, conventional techniques and descriptions of molecular biology (including recombinant techniques), cell biology, biochemistry, and genetic engineering technology, which are within the skill of those who practice in the art. Such conventional techniques and descriptions can be found in standard laboratory manuals such as Green and Sambrook, Molecular Cloning: A Laboratory Manual, 4th ed., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., (2014); Current Protocols in Molecular Biology, Ausubel, et al. eds., (2017); Neumann, et al., Electroporation and Electrofusion in Cell Biology, Plenum Press, New York, 1989; and Chang, et al., Guide to Electroporation and Electrofusion, Academic Press, California (1992), all of which are herein incorporated in their entirety by reference for all purposes.


Note that as used herein and in the appended claims, the singular forms “a,” “an,” and “the” include plural referents unless the context clearly dictates otherwise. Thus, for example, reference to “a cell” refers to one or more cells, and reference to “the system” includes reference to equivalent steps, methods and devices known to those skilled in the art, and so forth.


Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. All publications mentioned herein are incorporated by reference for the purpose of describing and disclosing devices, formulations and methodologies that may be used in connection with the presently described invention.


Where a range of values is provided, it is understood that each intervening value, between the upper and lower limit of that range and any other stated or intervening value in that stated range is encompassed within the invention. The upper and lower limits of these smaller ranges may independently be included in smaller ranges, and are also encompassed within the invention, subject to any specifically excluded limit in the stated range. Where the stated range includes one or both of the limits, ranges excluding either or both of those included limits are also included in the invention.


In the following description, numerous specific details are set forth to provide a more thorough understanding of the present invention. However, it will be apparent to one of skill in the art that the present invention may be practiced without one or more of these specific details. In other instances, features and procedures well known to those skilled in the art have not been described in order to avoid obscuring the invention. The terms used herein are intended to have the plain and ordinary meaning as understood by those of ordinary skill in the art.


The term DNA “control sequences” refers collectively to promoter sequences, polyadenylation signals, transcription termination sequences, upstream regulatory domains, origins of replication, internal ribosome entry sites, nuclear localization sequences, enhancers, and the like, which collectively provide for the replication, transcription and translation of a coding sequence in a recipient cell. Not all of these types of control sequences need to be present so long as a selected coding sequence is capable of being replicated, transcribed and—for some components—translated in an appropriate host cell.


The term “CREATE cassette” or “editing cassette” refers to a gRNA linked to a donor DNA or HA. Methods and compositions for designing and synthesizing CREATE editing cassettes are described in U.S. Pat. Nos. 10,240,167; 10,266,849; 9,982,278; 10,351,877; 10,364,442; 10,435,715; and 10,465,207; and U.S. Ser. Nos. 16/550,092, filed 23 Aug. 2019; 16/551,517, filed 26 Aug. 2019; 16/773,618, filed 27 Jan. 2020; and 16/773,712, filed 27 Jan. 2020, all of which are incorporated by reference herein in their entirety.


As used herein the term “donor DNA” or “donor nucleic acid” refers to nucleic acid that is designed to introduce a DNA sequence modification (insertion, deletion, substitution) into a locus (e.g., a target genomic DNA sequence or cellular target sequence) by homologous recombination using nucleic acid-guided nucleases. For homology-directed repair, the donor DNA must have sufficient homology to the regions flanking the “cut site” or site to be edited in the genomic target sequence. The length of the homology arm(s) will depend on, e.g., the type and size of the modification being made. In many instances and preferably, the donor DNA will have two regions of sequence homology (e.g., two homology arms) to the genomic target locus. Preferably, an “insert” region or “DNA sequence modification” region—the nucleic acid modification that one desires to be introduced into a genome target locus in a cell—will be located between two regions of homology. The DNA sequence modification may change one or more bases of the target genomic DNA sequence at one specific site or multiple specific sites. A change may include changing 1, 2, 3, 4, 5, 10, 15, 20, 25, 30, 35, 40, 50, 75, 100, 150, 200, 300, 400, or 500 or more base pairs of the genomic target sequence. A deletion or insertion may be a deletion or insertion of 1, 2, 3, 4, 5, 10, 15, 20, 25, 30, 40, 50, 75, 100, 150, 200, 300, 400, or 500 or more base pairs of the genomic target sequence.


The terms “guide nucleic acid” or “guide RNA” or “gRNA” refer to a polynucleotide comprising 1) a guide sequence capable of hybridizing to a genomic target locus, and 2) a scaffold sequence capable of interacting or complexing with a nucleic acid-guided nuclease.


“Homology” or “identity” or “similarity” refers to sequence similarity between two peptides or, more often in the context of the present disclosure, between two nucleic acid molecules. The term “homologous region” or “homology arm” refers to a region on the donor DNA with a certain degree of homology with the target genomic DNA sequence. Homology can be determined by comparing a position in each sequence which may be aligned for purposes of comparison. When a position in the compared sequence is occupied by the same base or amino acid, then the molecules are homologous at that position. A degree of homology between sequences is a function of the number of matching or homologous positions shared by the sequences.
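As an illustration of the position-by-position comparison described above, the following minimal Python sketch computes a percent identity for two sequences that are assumed to be already aligned to the same length. The function name and the example sequences are hypothetical and are not taken from the sequence listing; a real analysis would first align the sequences with a suitable alignment algorithm.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of aligned positions occupied by the same base or residue.

    Both inputs are assumed to be pre-aligned (same length); a gap
    character '-' never counts as a match.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be pre-aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(
        1
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a == b and a != "-"
    )
    return 100.0 * matches / len(seq_a)


# Hypothetical 10-nt homology arm compared with a target region
# that differs at a single position.
homology_arm = "ATGGCTAAAC"
target_region = "ATGGCTTAAC"
print(percent_identity(homology_arm, target_region))
```

Here nine of the ten positions match, so the degree of homology in this toy case is 90 percent.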


“Operably linked” refers to an arrangement of elements where the components so described are configured so as to perform their usual function. Thus, control sequences operably linked to a coding sequence are capable of effecting the transcription, and in some cases, the translation, of a coding sequence. The control sequences need not be contiguous with the coding sequence so long as they function to direct the expression of the coding sequence. Thus, for example, intervening untranslated yet transcribed sequences can be present between a promoter sequence and the coding sequence and the promoter sequence can still be considered “operably linked” to the coding sequence. In fact, such sequences need not reside on the same contiguous DNA molecule (i.e. chromosome) and may still have interactions resulting in altered regulation.


As used herein, the terms “protein” and “polypeptide” are used interchangeably. Proteins may or may not be made up entirely of amino acids.


A “promoter” or “promoter sequence” is a DNA regulatory region capable of binding RNA polymerase and initiating transcription of a polynucleotide or polypeptide coding sequence such as messenger RNA, ribosomal RNA, small nuclear or nucleolar RNA, guide RNA, or any kind of RNA transcribed by any class of RNA polymerase (I, II or III). Promoters may be constitutive or inducible, and in some embodiments the transcription of at least one component of the nucleic acid-guided nuclease editing system (and often of at least three components of the system) is under the control of an inducible promoter. A number of gene regulation control systems have been developed for the controlled expression of genes in plant, microbe, and animal cells, including mammalian cells. These include the pL promoter (induced by heat inactivation of the cI857 repressor), the pPhIF promoter (induced by the addition of 2,4-diacetylphloroglucinol (DAPG)), the pBAD promoter (induced by the addition of arabinose to the cell growth medium), and the rhamnose inducible promoter (induced by the addition of rhamnose to the cell growth medium). Other systems include the tetracycline-controlled transcriptional activation system (Tet-On/Tet-Off, Clontech, Inc. (Palo Alto, Calif.); Bujard and Gossen, PNAS, 89(12):5547-5551 (1992)), the Lac Switch Inducible system (Wyborski et al., Environ Mol Mutagen, 28(4):447-58 (1996); DuCoeur et al., Strategies 5(3):70-72 (1992); U.S. Pat. No. 4,833,080), the ecdysone-inducible gene expression system (No et al., PNAS, 93(8):3346-3351 (1996)), the cumate gene-switch system (Mullick et al., BMC Biotechnology, 6:43 (2006)), and the tamoxifen-inducible gene expression system (Zhang et al., Nucleic Acids Research, 24:543-548 (1996)), as well as others.


As used herein the term “selectable marker” refers to a gene introduced into a cell, which confers a trait suitable for artificial selection. General use selectable markers are well-known to those of ordinary skill in the art. Drug selectable markers such as ampicillin/carbenicillin, kanamycin, nourseothricin N-acetyl transferase, chloramphenicol, erythromycin, tetracycline, gentamicin, bleomycin, streptomycin, rifampicin, puromycin, hygromycin, blasticidin, and G418 may be employed. In other embodiments, selectable markers include, but are not limited to sugars such as rhamnose. “Selective medium” as used herein refers to cell growth medium to which has been added a chemical compound or biological moiety that selects for or against selectable markers.


The term “specifically binds” as used herein includes an interaction between two molecules, e.g., an engineered peptide antigen and a binding target, with a binding affinity represented by a dissociation constant of about 10⁻⁷ M, about 10⁻⁸ M, about 10⁻⁹ M, about 10⁻¹⁰ M, about 10⁻¹¹ M, about 10⁻¹² M, about 10⁻¹³ M, about 10⁻¹⁴ M or about 10⁻¹⁵ M.


The terms “target genomic DNA sequence”, “cellular target sequence”, or “genomic target locus” refer to any locus in vitro or in vivo, or in a nucleic acid (e.g., genome) of a cell or population of cells, in which a change of at least one nucleotide is desired using a nucleic acid-guided nuclease editing system. The cellular target sequence can be a genomic locus or extrachromosomal locus.


The term “variant” may refer to a polypeptide or polynucleotide that differs from a reference polypeptide or polynucleotide but retains essential properties. A typical variant of a polypeptide differs in amino acid sequence from another reference polypeptide. Generally, differences are limited so that the sequences of the reference polypeptide and the variant are closely similar overall and, in many regions, identical. A variant and reference polypeptide may differ in amino acid sequence by one or more modifications (e.g., substitutions, additions, and/or deletions). A variant of a polypeptide may be a conservatively modified variant. A substituted or inserted amino acid residue may or may not be one encoded by the genetic code (e.g., a non-natural amino acid). A variant of a polypeptide may be naturally occurring, such as an allelic variant, or it may be a variant that is not known to occur naturally.


A “vector” is any of a variety of nucleic acids that comprise a desired sequence or sequences to be delivered to and/or expressed in a cell. Vectors are typically composed of DNA, although RNA vectors are also available. Vectors include, but are not limited to, plasmids, fosmids, phagemids, virus genomes, synthetic chromosomes, and the like. As used herein, the phrase “engine vector” refers to a vector comprising a coding sequence for a nuclease to be used in the nucleic acid-guided nuclease systems and methods of the present disclosure. In E. coli, the engine vector also comprises the λ Red recombineering system or an equivalent thereto, which repairs the double-stranded breaks resulting from the cut by the nuclease. Engine vectors also typically comprise a selectable marker. As used herein the phrase “editing vector” refers to a vector comprising a donor nucleic acid, optionally including an alteration to the cellular target sequence that prevents nuclease binding at a PAM or spacer in the cellular target sequence after editing has taken place, and a coding sequence for a gRNA. The editing vector may also and preferably does comprise a selectable marker and/or a barcode. In some embodiments, the engine vector and editing vector may be combined; that is, all editing and selection components may be found on a single vector. Further, the engine and editing vectors comprise control sequences operably linked to, e.g., the nuclease coding sequence, recombineering system coding sequences (if present), donor nucleic acid, guide nucleic acid(s), and selectable marker(s).


Library Design Strategy and Nuclease-Directed Genome Editing


Lysine is naturally synthesized in E. coli along the diaminopimelate (DAP) biosynthetic pathway. See, e.g., FIG. 1. Strain engineering strategies for increasing lysine production in E. coli and other industrially-relevant production hosts such as Corynebacterium glutamicum have historically focused on the genes in the DAP pathway as obvious targets for mutagenesis and over-expression. Beyond this short list of genes encoding the lysine biosynthetic enzymes, it is likely that additional loci throughout the E. coli genome may also contribute appreciably (if less directly) to improved lysine yields in an industrial production setting. For this reason, targeted mutagenesis strategies which enable a broader query of the entire genome are also of significant value to the lysine metabolic engineer.


The variants presented in this disclosure are the result of nucleic acid-guided nuclease editing of 200,000 unique and precise designs at specified loci around the genome in a wildtype strain of E. coli MG1655 harboring an engine plasmid such as that shown in FIG. 3A (such transformed MG1655 strain is referred to herein as E. coli strain EC83). The resulting lysine production levels were then used to conduct additional nucleic acid-guided nuclease editing in two engineered strains of MG1655 to produce double- and triple-variant engineered strains. The first engineered strain is strain MG1655 with a single mutation comprising dapA E84T (SEQ ID No. 1), the lysine production for which was approximately 500-fold over wildtype lysine production in MG1655. The second engineered strain is strain MG1655 with a double mutation comprising dapA E84T (SEQ ID No. 1) and dapA J23100 (a mutation in the E. coli dapA promoter, SEQ ID No. 2), the lysine production for which was approximately 10,000-fold over wildtype lysine production. See, e.g., FIG. 2 for a summary of the types of edits included in the 200,000 editing vectors used to generate the variants. The engine plasmid comprises a coding sequence for the MAD7 nuclease under the control of the inducible pL promoter, the λ Red operon recombineering system under the control of the inducible pBAD promoter (inducible by the addition of arabinose in the cell growth medium), the cI857 gene under the control of a constitutive promoter, as well as a selection marker and an origin of replication. As described above, the λ Red recombineering system repairs the double-stranded breaks resulting from the cut by the MAD7 nuclease. At 30° C., the cI857 repressor protein actively represses the pL promoter (which drives the expression of the MAD7 nuclease and of the editing or CREATE cassette on the editing vector, such as the exemplary editing vector shown in FIG. 3B); however, at 42° C., the cI857 repressor protein unfolds or degrades and can no longer repress the pL promoter, leading to active transcription of the coding sequence for the MAD7 nuclease and the editing (e.g., CREATE) cassette.



FIG. 3B depicts an exemplary editing plasmid comprising the editing (e.g., CREATE) cassette (crRNA, spacer and HA) driven by a pL promoter, a selection marker, and an origin of replication.


Mutagenesis libraries specifically targeting the genes in the DAP pathway—along with a number of genes whose enzymes convert products feeding into the DAP pathway—were designed for saturation mutagenesis. Additionally, to more deeply explore the rest of the genome for new targets involved in lysine biosynthesis, libraries were designed to target all annotated loci with either premature stop codons (for a knock-out phenotype) or insertion of a set of five synthetic promoter variants (for expression modulation phenotypes).
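The two library types described above (saturation mutagenesis of a gene and premature stop codons for knock-outs) can be sketched at the protein level as follows. This is only an illustrative enumeration under stated assumptions: the gene name "geneX", the short peptide, and the design-name notation (e.g., "geneX K2A", "geneX R3Stop") are hypothetical, and the sketch does not reproduce how the actual 200,000 designs were generated.

```python
# The 20 standard amino acids, one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def saturation_designs(protein: str, gene: str) -> list[str]:
    """Every single-residue substitution (19 alternatives per position)."""
    designs = []
    for position, wild_type in enumerate(protein, start=1):
        for substitute in AMINO_ACIDS:
            if substitute != wild_type:
                designs.append(f"{gene} {wild_type}{position}{substitute}")
    return designs


def premature_stop_designs(protein: str, gene: str, positions: list[int]) -> list[str]:
    """One knock-out design per requested 1-based residue position."""
    return [f"{gene} {protein[p - 1]}{p}Stop" for p in positions]


# Hypothetical three-residue peptide; real genes are far longer.
peptide = "MKR"
library = saturation_designs(peptide, "geneX")
print(len(library))                                   # 3 positions x 19 substitutions
print(premature_stop_designs(peptide, "geneX", [3]))
```

Because each position contributes 19 substitutions, the size of a saturation library grows linearly with the length of the targeted protein.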


The 200,000 nucleic acid mutations or edits described herein were generated using MAD7, along with a gRNA and donor DNA. A nucleic acid-guided nuclease such as MAD7 is complexed with an appropriate synthetic guide nucleic acid in a cell and can cut the genome of the cell at a desired location. The guide nucleic acid helps the nucleic acid-guided nuclease recognize and cut the DNA at a specific target sequence. By manipulating the nucleotide sequence of the guide nucleic acid, the nucleic acid-guided nuclease may be programmed to target any DNA sequence for cleavage as long as an appropriate protospacer adjacent motif (PAM) is nearby. In certain aspects, the nucleic acid-guided nuclease editing system may use two separate guide nucleic acid molecules that combine to function as a guide nucleic acid, e.g., a CRISPR RNA (crRNA) and trans-activating CRISPR RNA (tracrRNA). In other aspects, the guide nucleic acid may be a single guide nucleic acid that includes both the crRNA and tracrRNA sequences.


Again, the resulting lysine production levels from the single variants were used to conduct additional nucleic acid-guided nuclease editing in two engineered strains of MG1655 to produce double- and triple-variant engineered strains. The first engineered strain is strain MG1655 with a single mutation comprising dapA E84T (SEQ ID No. 1), the lysine production for which was approximately 500-fold over wildtype lysine production in MG1655. The second engineered strain is strain MG1655 with a double mutation comprising dapA E84T (SEQ ID No. 1) and dapA J23100 (a mutation in the E. coli dapA promoter, SEQ ID NO. 2), the lysine production for which was approximately 10,000-fold over wildtype lysine production.


A guide nucleic acid comprises a guide sequence, where the guide sequence is a polynucleotide sequence having sufficient complementarity with a target sequence to hybridize with the target sequence and direct sequence-specific binding of a complexed nucleic acid-guided nuclease to the target sequence. The degree of complementarity between a guide sequence and the corresponding target sequence, when optimally aligned using a suitable alignment algorithm, is about or more than about 50%, 60%, 75%, 80%, 85%, 90%, 95%, 97.5%, 99%, or more. Optimal alignment may be determined with the use of any suitable algorithm for aligning sequences. In some embodiments, a guide sequence is about or more than about 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 35, 40, 45, 50, 75, or more nucleotides in length. In some embodiments, a guide sequence is less than about 75, 50, 45, 40, 35, 30, 25, or 20 nucleotides in length. Preferably the guide sequence is 10-30 or 15-20 nucleotides long, or 15, 16, 17, 18, 19, or 20 nucleotides in length.
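The degree of complementarity described above can be illustrated with a short Python sketch. It assumes an ungapped comparison of a guide sequence and a target strand of equal length, both written 5' to 3'; this simplification stands in for the "suitable alignment algorithm" mentioned in the text. The function names and the example 20-nucleotide guide are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence (U is treated as T)."""
    dna = seq.upper().replace("U", "T")
    return "".join(COMPLEMENT[base] for base in reversed(dna))


def percent_complementarity(guide: str, target_strand: str) -> float:
    """Percent of guide positions that base-pair with the target strand.

    Both sequences are given 5'->3' and must have equal length;
    no gaps or bulges are modelled.
    """
    if len(guide) != len(target_strand):
        raise ValueError("guide and target strand must have equal length")
    if not guide:
        raise ValueError("guide must not be empty")
    # A perfectly complementary guide equals the reverse complement
    # of the strand it hybridizes to.
    ideal_guide = reverse_complement(target_strand)
    dna_guide = guide.upper().replace("U", "T")
    paired = sum(g == i for g, i in zip(dna_guide, ideal_guide))
    return 100.0 * paired / len(guide)


guide = "ACGTACGTACGTACGTACGT"            # hypothetical 20-nt guide sequence
target = reverse_complement(guide)         # strand the guide hybridizes to
print(percent_complementarity(guide, target))
print(percent_complementarity("C" + guide[1:], target))  # one mismatch
```

With a 20-nucleotide guide, each mismatch lowers the complementarity by 5 percentage points in this simple model.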


In the methods to generate the 200,000 member library, the guide nucleic acids were provided as a sequence to be expressed from a plasmid or vector comprising both the guide sequence and the scaffold sequence as a single transcript under the control of an inducible promoter. The guide nucleic acids are engineered to target a desired target sequence by altering the guide sequence so that the guide sequence is complementary to a desired target sequence, thereby allowing hybridization between the guide sequence and the target sequence. In general, to generate an edit in the target sequence, the gRNA/nuclease complex binds to a target sequence as determined by the guide RNA, and the nuclease recognizes a protospacer adjacent motif (PAM) sequence adjacent to the target sequence. The target sequences for the genome-wide mutagenesis here encompassed 200,000 unique and precise designs at specified loci throughout the E. coli genome.


The guide nucleic acid may be, and in the processes generating the variants reported herein was, part of an editing cassette that also encoded the donor nucleic acid. The target sequences are associated with a protospacer adjacent motif (PAM), which is a short nucleotide sequence recognized by the gRNA/nuclease complex. The precise preferred PAM sequence and length requirements for different nucleic acid-guided nucleases vary; however, PAMs typically are 2-7 base-pair sequences adjacent or in proximity to the target sequence and, depending on the nuclease, can be 5′ or 3′ to the target sequence.
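To make the relationship between a PAM and its adjacent target sequence concrete, the following Python sketch scans one strand of a sequence for a degenerate PAM and reports the spacer immediately 3′ of it. The motif "TTTV", the 5′ placement of the PAM, and the spacer length are placeholders chosen for illustration and are assumptions; they are not stated in this disclosure as the requirements of any particular nuclease. Only a subset of IUPAC ambiguity codes is handled, and the reverse strand is not searched.

```python
import re

# Subset of IUPAC nucleotide codes expressed as regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "N": "[ACGT]", "Y": "[CT]", "R": "[AG]", "V": "[ACG]",
}


def find_targets(sequence: str, pam: str = "TTTV", spacer_len: int = 20):
    """Return (pam_start, pam_found, spacer) for each PAM on the given strand.

    The PAM is assumed to lie 5' of the spacer (an illustrative assumption).
    Hits without a full-length spacer downstream are skipped.
    """
    seq = sequence.upper()
    pattern = "".join(IUPAC[code] for code in pam.upper())
    hits = []
    # A lookahead is used so that overlapping PAM occurrences are all found.
    for match in re.finditer(f"(?=({pattern}))", seq):
        spacer_start = match.start() + len(pam)
        spacer = seq[spacer_start:spacer_start + spacer_len]
        if len(spacer) == spacer_len:
            hits.append((match.start(), match.group(1), spacer))
    return hits


# Hypothetical 18-nt sequence with one PAM followed by a 10-nt spacer.
example = "GG" + "TTTA" + "ACGTACGTAC" + "GG"
print(find_targets(example, pam="TTTV", spacer_len=10))
```

For a nuclease whose PAM lies 3′ of the target sequence, the same scan would instead take the spacer from the bases preceding each match.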


In certain embodiments, the genome editing of a cellular target sequence both introduces the desired DNA change to the cellular target sequence and removes, mutates, or renders inactive a protospacer adjacent motif (PAM) region in the cellular target sequence. Rendering the PAM at the cellular target sequence inactive precludes additional editing of the cell genome at that cellular target sequence, e.g., upon subsequent exposure to a nucleic acid-guided nuclease complexed with a synthetic guide nucleic acid in later rounds of editing. Thus, cells having the desired cellular target sequence edit and an altered PAM can be selected for by using a nucleic acid-guided nuclease complexed with a synthetic guide nucleic acid complementary to the cellular target sequence. Cells that did not undergo the first editing event will be cut, resulting in a double-stranded DNA break, and thus will not continue to be viable. The cells containing the desired cellular target sequence edit and PAM alteration will not be cut, as these edited cells no longer contain the necessary PAM site, and will continue to grow and propagate.
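The selection logic described above can be expressed as a toy model in Python: a cell is "cut" if its genome still contains the intact PAM immediately followed by the spacer, and only uncut cells survive. The genomes, the PAM "TTTA", and the exact string matching are hypothetical simplifications chosen for illustration; real nuclease recognition tolerates degenerate PAMs and some mismatches.

```python
def is_cut(genome: str, pam: str, spacer: str) -> bool:
    """True if the intact PAM + spacer site is still present (toy exact match)."""
    return (pam + spacer) in genome


def survivors(population: dict[str, str], pam: str, spacer: str) -> list[str]:
    """Names of cells whose target site can no longer be recognized and cut."""
    return [
        name
        for name, genome in population.items()
        if not is_cut(genome, pam, spacer)
    ]


spacer = "ACGTACGTAC"  # hypothetical target sequence
population = {
    # Unedited cell: PAM intact, so the nuclease cuts and the cell is lost.
    "unedited": "GG" + "TTTA" + spacer + "GG",
    # Edited cell: the donor also changed the PAM (TTTA -> TCTA).
    "edited": "GG" + "TCTA" + spacer + "GG",
}
print(survivors(population, pam="TTTA", spacer=spacer))
```

In this model only the cell carrying the altered PAM remains, which mirrors how the PAM alteration enriches for edited cells.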


As for the nuclease component of the nucleic acid-guided nuclease editing system, a polynucleotide sequence encoding the nucleic acid-guided nuclease can be codon optimized for expression in particular cell types, such as archaeal, prokaryotic or eukaryotic cells. The choice of nucleic acid-guided nuclease to be employed depends on many factors, such as what type of edit is to be made in the target sequence and whether an appropriate PAM is located close to the desired target sequence. Nucleases of use in the methods described herein include but are not limited to Cas 9, Cas 12/Cpf1, MAD2, or MAD7 or other MADzymes. As with the guide nucleic acid, the nuclease is encoded by a DNA sequence on a vector (e.g., the engine vector; see FIG. 3A) and is under the control of an inducible promoter. In some embodiments, such as in the methods described herein, the inducible promoter may be separate from but the same as the inducible promoter controlling transcription of the guide nucleic acid; that is, a separate inducible promoter drives the transcription of the nuclease and guide nucleic acid sequences but the two inducible promoters may be the same type of inducible promoter (e.g., both are pL promoters). Alternatively, the inducible promoter controlling expression of the nuclease may be different from the inducible promoter controlling transcription of the guide nucleic acid; that is, e.g., the nuclease may be under the control of the pBAD inducible promoter, and the guide nucleic acid may be under the control of the pL inducible promoter.


Another component of the nucleic acid-guided nuclease system is the donor nucleic acid comprising homology to the cellular target sequence. In some embodiments, the donor nucleic acid is on the same polynucleotide (e.g., editing vector or editing cassette) as the guide nucleic acid. The donor nucleic acid is designed to serve as a template for homologous recombination with a cellular target sequence nicked or cleaved by the nucleic acid-guided nuclease as a part of the gRNA/nuclease complex. A donor nucleic acid polynucleotide may be of any suitable length, such as about or more than about 20, 25, 50, 75, 100, 150, 200, 500, or 1000 nucleotides in length. In certain preferred aspects, the donor nucleic acid can be provided as an oligonucleotide of between 20-300 nucleotides, more preferably between 50-250 nucleotides. The donor nucleic acid comprises a region that is complementary to a portion of the cellular target sequence (e.g., a homology arm). When optimally aligned, the donor nucleic acid overlaps with (is complementary to) the cellular target sequence by, e.g., about 20, 25, 30, 35, 40, 50, 60, 70, 80, 90 or more nucleotides. In many embodiments, the donor nucleic acid comprises two homology arms (regions complementary to the cellular target sequence) flanking the mutation or difference between the donor nucleic acid and the cellular target sequence. The donor nucleic acid comprises at least one mutation or alteration compared to the cellular target sequence, such as an insertion, deletion, modification, or any combination thereof compared to the cellular target sequence. Various types of edits were introduced herein, including site-directed mutagenesis, saturation mutagenesis, promoter swaps and ladders, knock-in and knock-out edits, SNP or short tandem repeat swaps, and start/stop codon exchanges.
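The layout of a donor nucleic acid with two homology arms flanking the edit can be sketched in Python as follows. The reference sequence, coordinates, and arm length are hypothetical values used only to show the structure (left arm, modification, right arm); the sketch does not include PAM alterations, primer sites, or other cassette elements.

```python
def build_donor(reference: str, edit_start: int, edit_end: int,
                replacement: str, arm_len: int = 30) -> str:
    """Donor = left homology arm + replacement + right homology arm.

    reference[edit_start:edit_end] (0-based, end-exclusive) is the region
    to be changed. A substitution keeps the length, an insertion uses
    edit_start == edit_end, and a deletion uses an empty replacement.
    """
    if not 0 <= edit_start <= edit_end <= len(reference):
        raise ValueError("edit coordinates fall outside the reference")
    if edit_start < arm_len or edit_end + arm_len > len(reference):
        raise ValueError("not enough flanking sequence for the homology arms")
    left_arm = reference[edit_start - arm_len:edit_start]
    right_arm = reference[edit_end:edit_end + arm_len]
    return left_arm + replacement + right_arm


# Hypothetical 13-nt reference with 5-nt arms around a 3-nt target site.
reference = "AAAAACCCGGGGG"
print(build_donor(reference, 5, 8, "TTT", arm_len=5))  # substitution
print(build_donor(reference, 5, 8, "", arm_len=5))     # deletion
print(build_donor(reference, 5, 5, "T", arm_len=5))    # insertion
```

Because the arms are copied directly from the reference, the donor differs from the cellular target sequence only in the region between them.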


In addition to the donor nucleic acid, an editing cassette may comprise one or more primer sites. The primer sites can be used to amplify the editing cassette by using oligonucleotide primers; for example, if the primer sites flank one or more of the other components of the editing cassette. In addition, the editing cassette may comprise a barcode. A barcode is a unique DNA sequence that corresponds to the donor DNA sequence such that the barcode can identify the edit made to the corresponding cellular target sequence. The barcode typically comprises four or more nucleotides. In some embodiments, the editing cassettes comprise a collection or library of gRNAs and donor nucleic acids representing, e.g., gene-wide or genome-wide libraries of gRNAs and donor nucleic acids. The library of editing cassettes is cloned into vector backbones where, e.g., each different donor nucleic acid is associated with a different barcode.
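The one-to-one association between barcodes and donor nucleic acids can be illustrated with a small Python sketch. It also works through the arithmetic of how long a barcode must be to give every design in a library its own sequence: with four bases there are 4^L possible barcodes of length L. The function names and donor labels are hypothetical, and the sketch assigns barcodes in simple lexicographic order; it makes no claim about how barcodes were chosen for the library described herein.

```python
import itertools


def min_barcode_length(n_designs: int) -> int:
    """Smallest length L such that 4**L >= n_designs."""
    length = 1
    while 4 ** length < n_designs:
        length += 1
    return length


def assign_barcodes(donors: list[str], length: int = 4) -> dict[str, str]:
    """Map a distinct barcode to each donor (barcode -> donor)."""
    if len(donors) > 4 ** length:
        raise ValueError("barcode length too short for this many donors")
    barcodes = ("".join(p) for p in itertools.product("ACGT", repeat=length))
    return dict(zip(barcodes, donors))


# 4**8 = 65,536 is too few for 200,000 designs; 4**9 = 262,144 suffices.
print(min_barcode_length(200_000))

lookup = assign_barcodes(["donor_1", "donor_2"], length=4)
print(lookup)
# Sequencing a barcode then identifies the corresponding edit:
print(lookup["AAAC"])
```

In practice barcodes are usually longer than this minimum so that sequencing errors do not convert one valid barcode into another.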


Variants of interest include those listed in Table 1 below:









TABLE 1

Variants

SEQ ID No.       Mutant                                       NCBI Gene ID     Phenotype FOWT  Phenotype FIOPC
---------------  -------------------------------------------  ---------------  --------------  ---------------
SEQ ID No. 1*    Single edit: dapA E84T                       946952           500             0
SEQ ID No. 2**   Single edit: dapA J21300                     946952           1000            2
SEQ ID No. 3*    Triple edit: dapA E84T/J21300 + lysC V339P   946952 + 948531  13,500          27
SEQ ID No. 4**   Triple edit: dapA E84T/J21300 + garD J23101  946952 + 947641  13,000          26
SEQ ID No. 5**   Triple edit: dapA E84T/J21300 + yicL J23100  946952 + 948176  13,400          26.8
SEQ ID No. 6*    Triple edit: dapA E84T/J21300 + lysP R15***  946952 + 946667  14,600          29.2
SEQ ID No. 7**   Triple edit: dapA E84T/J21300 + mgSA J23100  946952 + 945574  13,300          26.6
SEQ ID No. 8*    Triple edit: dapA E84T/J21300 + pck E100Q    946952 + 945667  13,400          26.8
SEQ ID No. 9**   Double edit: dapA J21300 + amyA J23100       946952 + 946434  804.620         1.609
SEQ ID No. 10*   Double edit: dapA J21300 + amyA P15***       946952 + 946434  784.779         1.570
SEQ ID No. 11*   Double edit: dapA J21300 + cysN L5***        946952 + 947219  1320.758        2.642
SEQ ID No. 12**  Double edit: dapA J21300 + dosP J23100       946952 + 945815  1067.701        2.135
SEQ ID No. 13**  Double edit: dapA J21300 + emrE J23100       946952 + NA      1016.806        2.034
SEQ ID No. 14**  Double edit: dapA J21300 + focB J23100       946952 + 949032  913.339         1.827
SEQ ID No. 15**  Double edit: dapA J21300 + glnD J23100       946952 + 944863  1397.503        2.795
SEQ ID No. 16*   Double edit: dapA J21300 + glnE V15***       946952 + 947552  1085.446        2.171
SEQ ID No. 17**  Double edit: dapA J21300 + hicB J23100       946952 + 946001  758.057         1.516
SEQ ID No. 18**  Double edit: dapA J21300 + maeB J23100       946952 + 946947  946.484         1.893
SEQ ID No. 19*   Double edit: dapA J21300 + marA Y107D        946952 + 947613  798.469         1.597
SEQ ID No. 20*   Double edit: dapA J21300 + metL R241E        946952 + 948433  726.648         1.453
SEQ ID No. 21*   Double edit: dapA J21300 + mfd Y5***         946952 + 945681  983.267         1.967
SEQ ID No. 22*   Double edit: dapA J21300 + nupX R5***        946952 + 946655  884.027         1.768
SEQ ID No. 23*   Double edit: dapA J21300 + pck H232G         946952 + 945667  1409.458        2.819
SEQ ID No. 24**  Double edit: dapA J21300 + phoB J23100       946952 + 945046  781.383         1.563


SEQ ID No. 25**
Double edit: dapA J21300 +
946952 + 946975
1633.414
3.267



purM J23100


SEQ ID No. 26*
Double edit: dapA J21300 +
946952 + NA   
834.477
1.669



rlmL F5***


SEQ ID No. 27*
Double edit: dapA J21300 +
946952 + 946557
793.985
1.588



wzxB K5***


SEQ ID No. 28**
Double edit: dapA J21300 +
946952 + 946148
1554.101
3.108



ydgl J23100


SEQ ID No. 29**
Double edit: dapA J21300 +
946952 + 946274
778.514
1.557



ydjE J23100


SEQ ID No. 30**
Double edit: dapA J21300 +
946952 + 948176
854.283
1.709



yicL J23100


SEQ ID No. 31**
Double edit: dapA J21300 +
946952 + 945462
979.740
1.959



yliE J23100


SEQ ID No. 32**
Double edit: dapA J21300 +
946952 + 949126
858.181
1.716



yohF J23100


SEQ ID No. 33*
Double edit: dapA J21300 +
946952 + 948741
781.981
1.564



ytfP N15***


SEQ ID No. 34*
Double edit: dapA J21300 +
946952 + 947613
728.433
1.457



marA R94*


SEQ ID No. 35*
Double edit: dapA J21300 +
946952 + 947613
733.943
1.468



marA Y107K


SEQ ID No. 36*
Double edit: dapA J21300 +
946952 + 948433
726.648
1.453



metL P240D


SEQ ID No. 37*
Double edit: dapA J21300 +
946952 + 948433
708.124
1.416



metL V235C


SEQ ID No. 38*
Double edit: dapA J21300 +
946952 + 945667
718.020
1.436



pck G64D


SEQ ID No. 39**
Double edit: dapA J21300 +
946952 + 946673
727.174
1.454



setB J23100


SEQ ID No. 40**
Double edit: dapA J21300 +
946952 + 945992
701.255
1.403



ydfO J23100


SEQ ID No. 41**
Double edit: dapA J21300 +
946952 + 946436
716.198
1.432



ydgD J23100


SEQ ID No. 42**
Double edit: dapA J21300 +
946952 + 945319
731.562
1.463



yejG J23100





In the table, * denotes an amino acid sequence (e.g., a change to the coding region of the protein); ** denotes a nucleic acid sequence (e.g., a change to the promoter region or other noncoding region); “NCBI Gene ID” is the NCBI accession number; “Phenotype FOWT” is fold over wild type (MG1655) in minimal medium; and “Phenotype FIOPC” is fold improved over the positive control, which is MG1655 with the dapA E84T single variant. J231XX denotes a promoter swap at a given locus, and *** following a residue position denotes a hit from the genome-wide knock-out library in which a triple stop was inserted at the given position in the locus. Note that the fold over wild type was equal to or greater than 13,000-fold for all triple edits (SEQ ID Nos. 3-8) and as high as approximately 1,600-fold in the double mutant dapA J21300 + purM J23100 (SEQ ID No. 25).
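The two phenotype columns appear to be related by a fixed ratio. For the double- and triple-edit rows, the listed FIOPC matches the listed FOWT divided by the 500-fold value reported for the dapA E84T positive control. The sketch below checks that reading against a few rows copied from Table 1; the ratio is inferred from the table values rather than stated in the text, and the names used are illustrative.

```python
# FOWT of the positive control (MG1655 dapA E84T), as listed in Table 1.
POSITIVE_CONTROL_FOWT = 500.0


def fiopc_from_fowt(fowt: float) -> float:
    """Fold improvement over the positive control, assuming both
    strains are expressed as fold over the same wild-type baseline."""
    return fowt / POSITIVE_CONTROL_FOWT


# (FOWT, FIOPC) pairs copied from Table 1 for a few variants.
table_rows = {
    "SEQ ID No. 6": (14600.0, 29.2),
    "SEQ ID No. 9": (804.620, 1.609),
    "SEQ ID No. 11": (1320.758, 2.642),
    "SEQ ID No. 25": (1633.414, 3.267),
}

for seq_id, (fowt, listed) in table_rows.items():
    computed = round(fiopc_from_fowt(fowt), 3)
    print(seq_id, computed, listed)
```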






EXAMPLES

Mutagenesis libraries specifically targeting the genes of the DAP pathway, along with a number of genes whose enzymes convert products feeding into the DAP pathway, were designed for saturation mutagenesis. Additionally, to more deeply explore the rest of the E. coli genome for new targets involved in lysine biosynthesis, libraries were designed to target all annotated loci with either premature stop codons (for a knock-out phenotype) or with an insertion of a set of five synthetic promoter variants (for expression modulation phenotypes). Then, the resulting lysine production levels from the single variants were used to conduct additional nucleic acid-guided nuclease editing in two engineered strains of MG1655 to produce double- and triple-variant engineered strains. The first engineered strain is strain MG1655 with a single mutation comprising dapA E84T (SEQ ID No. 1), the lysine production for which was approximately 500-fold over wildtype lysine production in MG1655. The second engineered strain is strain MG1655 with a double mutation comprising dapA E84T (SEQ ID No. 1) and dapA J23100 (a mutation in the E. coli dapA promoter, SEQ ID NO. 2), the lysine production for which was approximately 10,000-fold over wildtype lysine production. All libraries were screened at shallow sampling for lysine production via mass spectrometry as described below.
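To make the scale of a saturation mutagenesis library concrete, the following sketch enumerates every single-residue substitution at one position, using the same naming pattern as the variants in Table 1 (for example E84T). It is an illustration only: the function name is hypothetical, and the actual library design in the disclosure may cover positions or codons differently.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, one-letter codes


def saturation_variants(wild_type: str, position: int) -> list[str]:
    """List all 19 substitutions of one residue, e.g. 'E84T'."""
    if len(wild_type) != 1 or wild_type not in AMINO_ACIDS:
        raise ValueError(f"unknown residue: {wild_type!r}")
    return [f"{wild_type}{position}{aa}" for aa in AMINO_ACIDS if aa != wild_type]


variants = saturation_variants("E", 84)
print(len(variants))   # 19 substitutions at this position
print(variants[:3])
```

Each targeted position therefore contributes 19 amino acid variants, so a library covering every residue of a protein of length n would contain on the order of 19 × n designs.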


Editing Cassette and Backbone Amplification and Assembly


Editing Cassette Preparation: 5 nM oligonucleotides synthesized on a chip were amplified using Q5 polymerase in 50 μL volumes. The PCR conditions were 95° C. for 1 minute; 8 rounds of 95° C. for 30 seconds/60° C. for 30 seconds/72° C. for 2.5 minutes; with a final hold at 72° C. for 5 minutes. Following amplification, the PCR products were subjected to SPRI cleanup, where 30 μL SPRI mix was added to the 50 μL PCR reactions and incubated for 2 minutes. The tubes were subjected to a magnetic field for 2 minutes, the liquid was removed, and the beads were washed 2× with 80% ethanol, allowing 1 minute between washes. After the final wash, the beads were allowed to dry for 2 minutes, 50 μL 0.5× TE pH 8.0 was added to the tubes, and the beads were vortexed to mix. The slurry was incubated at room temperature for 2 minutes, then subjected to the magnetic field for 2 minutes. The eluate was removed and the DNA quantified.
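The timings and volumes in this step lend themselves to a quick arithmetic check. The sketch below totals the programmed hold times of the first PCR and computes the bead-to-sample ratio of the SPRI cleanup. It is a rough estimate only: instrument ramp times are ignored, and the helper name `program_seconds` is hypothetical.

```python
def program_seconds(initial_s: int, cycles: int, step_s: list[int], final_s: int) -> int:
    """Total programmed hold time of a thermocycler run, ignoring ramping."""
    return initial_s + cycles * sum(step_s) + final_s


# First amplification: 95 C 1 min; 8 x (95 C 30 s / 60 C 30 s / 72 C 2.5 min); 72 C 5 min.
first_pcr_s = program_seconds(60, 8, [30, 30, 150], 300)
print(first_pcr_s / 60)  # minutes of programmed hold time

# SPRI cleanup: 30 uL bead mix added to a 50 uL reaction.
spri_ratio = 30 / 50
print(spri_ratio)  # bead-to-sample volume ratio
```

Under these assumptions the first amplification holds for 34 minutes in total, and the cleanup uses a 0.6× bead-to-sample ratio.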


Following quantification, a second amplification procedure was carried out using a dilution of the eluate from the SPRI cleanup. PCR was performed under the following conditions: 95° C. for 1 minute; 18 rounds of 95° C. for 30 seconds/72° C. for 2.5 minutes; with a final hold at 72° C. for 5 minutes. Amplicons were checked on a 2% agarose gel and pools with the cleanest output(s) were identified. Amplification products appearing to have heterodimers or chimeras were not used.


Backbone Preparation: A 10-fold serial dilution series of purified backbone was performed, and each of the diluted backbone series was amplified under the following conditions: 95° C. for 1 minute; then 30 rounds of 95° C. for 30 seconds/60° C. for 1.5 minutes/72° C. for 2.5 minutes; with a final hold at 72° C. for 5 minutes. After amplification, the amplified backbone was subjected to SPRI cleanup as described above in relation to the cassettes. The backbone was eluted into 100 μL ddH2O and quantified before nucleic acid assembly.


Isothermal Nucleic Acid Assembly: 150 ng backbone DNA was combined with 100 ng cassette DNA. An equal volume of 2× Gibson Master Mix was added, and the reaction was incubated for 45 minutes at 50° C. After assembly, the assembled backbone and cassettes were subjected to SPRI cleanup, as described above.
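The text gives the assembly inputs by mass; converting them to molar amounts shows the cassette-to-backbone ratio those masses imply. The sketch below uses the common approximation of about 650 g/mol per base pair of double-stranded DNA. The fragment lengths are hypothetical placeholders, since the text does not state the backbone or cassette sizes.

```python
AVG_DALTONS_PER_BP = 650.0  # common approximation for double-stranded DNA


def ng_to_pmol(mass_ng: float, length_bp: int) -> float:
    """Convert a mass of dsDNA to picomoles of molecules."""
    return mass_ng * 1000.0 / (length_bp * AVG_DALTONS_PER_BP)


# Masses from the text; lengths are assumed values for illustration only.
BACKBONE_BP = 4000   # hypothetical backbone length
CASSETTE_BP = 200    # hypothetical cassette length

backbone_pmol = ng_to_pmol(150, BACKBONE_BP)
cassette_pmol = ng_to_pmol(100, CASSETTE_BP)
molar_ratio = cassette_pmol / backbone_pmol
print(round(backbone_pmol, 4), round(cassette_pmol, 4), round(molar_ratio, 1))
```

Because the cassette is much shorter than the backbone, a smaller mass of cassette still supplies a molar excess of insert; the exact excess depends on the real fragment lengths.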


Transformation of Editing Vector Library into E. cloni®


Transformation: 20 μL of the prepared editing vector Gibson Assembly reaction was added to 30 μL chilled water along with 10 μL E. cloni® (Lucigen, Middleton, Wis.) supreme competent cells. An aliquot of the transformed cells was spot plated to check the transformation efficiency, where >100× coverage was required to continue. The transformed E. cloni® cells were outgrown in 25 mL SOB+100 μg/mL carbenicillin (carb). Glycerol stocks were generated from the saturated culture by adding 500 μL 50% glycerol to 1000 μL saturated overnight culture. The stocks were frozen at −80° C. This step is optional, providing a ready stock of the cloned editing library. Alternatively, Gibson or another assembly of the editing cassettes and the vector backbone can be performed before each editing experiment.
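Two quantities in this step can be checked with simple arithmetic: the library coverage implied by a transformant count, and the final glycerol concentration of the stocks. In the sketch below the library size and transformant count are hypothetical example numbers; only the >100× threshold and the glycerol volumes come from the text.

```python
def library_coverage(transformants: float, library_size: int) -> float:
    """Average number of independent transformants per library member."""
    return transformants / library_size


def final_percent(stock_percent: float, stock_ul: float, culture_ul: float) -> float:
    """Final concentration after mixing a stock solution into a culture."""
    return stock_percent * stock_ul / (stock_ul + culture_ul)


# Hypothetical example: 5,000 cassette designs and 750,000 transformants.
coverage = library_coverage(750_000, 5_000)
proceed = coverage > 100          # threshold stated in the text

# 500 uL of 50% glycerol added to 1000 uL of culture.
glycerol_percent = final_percent(50, 500, 1000)
print(coverage, proceed, round(glycerol_percent, 1))
```

With the volumes given, the frozen stocks contain roughly 16.7% glycerol.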


Creation of New Cell Line Transformed With Engine Vector:


Transformation: 1 μL of the engine vector DNA (comprising a coding sequence for MAD7 nuclease under the control of the pL inducible promoter, a chloramphenicol resistance gene, and the λ Red recombineering system) was added to 50 μL EC83 strain E. coli cells. The transformed cells were plated on LB plates with 25 μg/mL chloramphenicol (chlor) and incubated overnight to accumulate clonal isolates. The next day, a colony was picked, grown overnight in LB+25 μg/mL chlor, and glycerol stocks were prepared from the saturated overnight culture by adding 500 μL 50% glycerol to 1000 μL culture. The stocks of EC1 comprising the engine vector were frozen at −80° C.


Preparation of Competent Cells:


A 1 mL aliquot of a freshly-grown overnight culture of EC83 cells transformed with the engine vector was added to a 250 mL flask containing 100 mL LB/SOB+25 μg/mL chlor medium. The cells were grown to 0.4-0.7 OD, and cell growth was halted by transferring the culture to ice for 10 minutes. The cells were pelleted at 8000×g in a JA-18 rotor for 5 minutes, washed 3× with 50 mL ice cold ddH2O or 10% glycerol, and pelleted at 8000×g in JA-18 rotor for 5 minutes. The washed cells were resuspended in 5 mL ice cold 10% glycerol and aliquoted into 200 μL portions. Optionally at this point the glycerol stocks could be stored at −80° C. for later use.


Screening of Edited Libraries for Lysine Production:


Library stocks were diluted and plated onto 245×245 mm LB agar plates (Teknova) containing 100 μg/mL carbenicillin (Teknova) and 25 μg/mL chloramphenicol (Teknova) using sterile glass beads. Libraries were diluted an appropriate amount to yield ˜2000-3000 colonies on the plates. Plates were incubated ˜16 h at 30° C. and then stored at 4° C. until use. Colonies were picked using a QPix™ 420 (Molecular Devices) and deposited into sterile 1.2 mL square 96-well plates (Thomas Scientific) containing 300 μL of overnight growth medium (EZ Rich Defined Medium, w/o lysine (Teknova), 100 μg/mL carbenicillin and 25 μg/mL chloramphenicol). Plates were sealed (AirPore sheets (Qiagen)) and incubated for ˜19 h in a shaker incubator (Climo-Shaker ISF1-X (Kuhner), 30° C., 85% humidity, 250 rpm). Plate cultures were then diluted 20-fold (15 μL culture into 285 μL medium) into new 96-well plates containing lysine production medium (20 g/L ammonium sulfate (Teknova), 200 mM MOPS buffer (Teknova), 3 mg/L Iron(II) sulfate heptahydrate (Sigma), 3 mg/L Manganese (II) sulfate monohydrate (Sigma), 0.5 mg/L Biotin (Sigma), 1 mg/L Thiamine hydrochloride (Sigma), 0.7 g/L Potassium chloride (Teknova), 20 g/L glucose (Teknova), 5 g/L Potassium phosphate monobasic (Sigma), 1 mL/L Trace metal mixture (Teknova), 1 mM Magnesium sulfate (Teknova), 100 μg/mL carbenicillin and 25 μg/mL chloramphenicol). Production plates were incubated for 24 h in a shaker incubator (Climo-Shaker ISF1-X (Kuhner), 30° C., 85% humidity, 250 rpm).
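The production medium recipe is given per liter, so the amounts needed for a screen scale with the number of plates. The sketch below scales the mass-per-liter components to the medium volume needed for a given number of 96-well plates at 285 μL per well. The plate count is a hypothetical example, no overage is included, and components specified in molar or volumetric units (MOPS, magnesium sulfate, trace metals, antibiotics) are left out for brevity.

```python
# Components of the lysine production medium listed in g/L in the text.
GRAMS_PER_LITER = {
    "ammonium sulfate": 20.0,
    "potassium chloride": 0.7,
    "glucose": 20.0,
    "potassium phosphate monobasic": 5.0,
}

WELLS_PER_PLATE = 96
MEDIUM_UL_PER_WELL = 285  # 15 uL culture is added to 285 uL medium


def medium_liters(n_plates: int) -> float:
    """Medium volume consumed by n_plates, without any overage."""
    return n_plates * WELLS_PER_PLATE * MEDIUM_UL_PER_WELL / 1_000_000


def grams_needed(n_plates: int) -> dict[str, float]:
    """Mass of each g/L component for the volume computed above."""
    volume_l = medium_liters(n_plates)
    return {name: g_per_l * volume_l for name, g_per_l in GRAMS_PER_LITER.items()}


# Hypothetical screen of 10 production plates.
volume = medium_liters(10)
masses = grams_needed(10)
print(round(volume, 4), {k: round(v, 3) for k, v in masses.items()})
```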


Production plates were centrifuged (Centrifuge 5920R, Eppendorf) at 3,000 g for 10 min to pellet cells. The supernatants from production plates were diluted 100-fold into water (5 μL of supernatant with 495 μL of water) in 1.2 mL square 96-well plates. Samples were thoroughly mixed and then diluted a further 10-fold into a 50:50 mixture of acetonitrile and water (20 μL sample with 180 μL of the acetonitrile/water mixture) in a 96-well plate (polypropylene, 335 μL/well, conical bottom (Thomas Scientific)). Plates were heat sealed and thoroughly mixed.
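The serial dilutions in the sample preparation multiply together, which matters when converting a measured concentration back to the concentration in the culture supernatant. The sketch below computes each dilution factor from the stated volumes; the function names and the example reading are illustrative.

```python
def dilution_factor(sample_ul: float, diluent_ul: float) -> float:
    """Fold dilution when sample_ul is mixed with diluent_ul."""
    return (sample_ul + diluent_ul) / sample_ul


first_step = dilution_factor(5, 495)     # supernatant into water
second_step = dilution_factor(20, 180)   # into 50:50 acetonitrile/water
total_dilution = first_step * second_step


def supernatant_concentration(measured: float) -> float:
    """Back-calculate the undiluted supernatant concentration."""
    return measured * total_dilution


# Hypothetical reading of 2.5 (arbitrary concentration units) in the diluted sample.
print(first_step, second_step, total_dilution, supernatant_concentration(2.5))
```

The two steps together give a 1,000-fold dilution of the supernatant before injection.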


Lysine concentrations were determined using a RapidFire high-throughput mass spectrometry system (Agilent) coupled to a 6470 Triple Quad mass spectrometer (Agilent). The RapidFire conditions were as follows: Pump 1: 80% acetonitrile (LC/MS grade, Fisher), 20% water (LC/MS grade, Fisher), 1.5 mL/min, Pump 2: 100% water, 1.25 mL/min, Pump 3: 5% acetonitrile, 95% water, 1.25 mL/min. RapidFire method: Aspirate: 600 ms, Load/wash: 2000 ms, Extra wash: 0 ms, Elute: 3000 ms, Re-equilibrate: 500 ms. 10 μL injection loop.
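The RapidFire method timings set a lower bound on throughput. The sketch below sums the stated phase durations to estimate the time per sample and per 96-well plate. It ignores plate handling and any instrument overhead between injections, so real run times would be somewhat longer.

```python
# RapidFire method phases in milliseconds, as stated in the text.
RAPIDFIRE_PHASES_MS = {
    "aspirate": 600,
    "load_wash": 2000,
    "extra_wash": 0,
    "elute": 3000,
    "re_equilibrate": 500,
}

cycle_ms = sum(RAPIDFIRE_PHASES_MS.values())
plate_seconds = cycle_ms * 96 / 1000
print(cycle_ms, round(plate_seconds / 60, 1))  # ms per sample, minutes per plate
```

On these figures each sample takes about 6.1 seconds, or just under 10 minutes per 96-well plate.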


Mass Spectrometry Conditions for Lysine Detection:


Precursor ion: 147.1 m/z, Product ion (quantifying): 84 m/z, Dwell: 20, Fragmentor: 80, Collision energy: 20, Cell accelerator voltage: 4, Polarity: positive

Precursor ion: 147.1 m/z, Product ion (qualifying): 130 m/z, Dwell: 20, Fragmentor: 80, Collision energy: 8, Cell accelerator voltage: 4, Polarity: positive

Source conditions: Gas Temp: 300° C., Gas Flow: 10 L/min, Nebulizer: 45 psi, Sheath gas temp: 350° C., Sheath gas flow: 11 L/min, Capillary voltage: 3000V (positive), Nozzle voltage: 1500V (positive)
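The precursor value can be cross-checked against the chemical formula of lysine, C6H14N2O2. The sketch below computes the monoisotopic mass of the protonated molecule from standard atomic masses. The assignment of the 130 and 84 m/z product ions to loss of ammonia, and of ammonia plus formic acid, is a common interpretation that is assumed here for illustration; the text itself does not assign the fragments.

```python
# Monoisotopic atomic masses in daltons (standard reference values).
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146}
PROTON = 1.00727647


def mono_mass(formula: dict[str, int]) -> float:
    """Monoisotopic mass of a molecule given its element counts."""
    return sum(MONOISOTOPIC[element] * count for element, count in formula.items())


lysine = {"C": 6, "H": 14, "N": 2, "O": 2}
precursor_mz = mono_mass(lysine) + PROTON                    # [M+H]+

# Assumed neutral losses (not stated in the text):
after_nh3 = precursor_mz - mono_mass({"N": 1, "H": 3})       # loss of NH3
after_nh3_hcooh = after_nh3 - mono_mass({"C": 1, "H": 2, "O": 2})  # then HCOOH

print(round(precursor_mz, 4), round(after_nh3, 4), round(after_nh3_hcooh, 4))
```

The computed values round to the 147.1, 130, and 84 m/z settings listed above.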


Data was analyzed using MassHunter Quantitative Analysis software (Agilent) with a standard curve of lysine used for quantitation of lysine in the samples. Each 96-well plate of samples contained 4 replicates of the wildtype strain and 4 replicates of the dapA E84T positive control strain to calculate the relative lysine yield of samples compared to the controls. Hits from the primary screen were re-tested in quadruplicate using a similar protocol as described above.
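The quantitation logic, a standard curve followed by normalization to on-plate controls, can be sketched as follows. This is a simplified stand-in for what the MassHunter software does, assuming a linear response; all peak areas and standard concentrations are invented example numbers, and the function names are hypothetical.

```python
from statistics import mean


def fit_line(x: list[float], y: list[float]) -> tuple[float, float]:
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


# Hypothetical lysine standards (concentration) and their peak areas.
std_conc = [0.0, 1.0, 2.0, 4.0]
std_area = [10.0, 110.0, 210.0, 410.0]
slope, intercept = fit_line(std_conc, std_area)


def quantify(area: float) -> float:
    """Convert a peak area to a concentration using the standard curve."""
    return (area - intercept) / slope


def fold_over_control(sample_areas: list[float], control_areas: list[float]) -> float:
    """Mean sample concentration relative to the mean of control replicates."""
    return mean(quantify(a) for a in sample_areas) / mean(quantify(a) for a in control_areas)


# Four hypothetical wild-type replicates and one hypothetical sample well.
wildtype_areas = [20.0, 22.0, 18.0, 20.0]
fold = fold_over_control([1010.0], wildtype_areas)
print(round(slope, 3), round(intercept, 3), round(fold, 1))
```

Running the same calculation against the four dapA E84T replicates instead of the wild-type replicates would give the fold improvement over the positive control.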


While this invention is satisfied by embodiments in many different forms, as described in detail in connection with preferred embodiments of the invention, it is understood that the present disclosure is to be considered as exemplary of the principles of the invention and is not intended to limit the invention to the specific embodiments illustrated and described herein. Numerous variations may be made by persons skilled in the art without departure from the spirit of the invention. The scope of the invention will be measured by the appended claims and their equivalents. The abstract and the title are not to be construed as limiting the scope of the present invention, as their purpose is to enable the appropriate authorities, as well as the general public, to quickly determine the general nature of the invention. In the claims that follow, unless the term “means” is used, none of the features or elements recited therein should be construed as means-plus-function limitations pursuant to 35 U.S.C. § 112, ¶6.

Claims
  • 1. An engineered E. coli cell comprising the following variant sequences: a promoter sequence having the nucleic acid SEQ ID NO: 2 driving transcription of a dapA gene, and further comprising one of the following proteins expression of which is not driven by a promoter sequence having a nucleic acid of SEQ ID NO: 2: a mfdY5 protein having the amino acid sequence of SEQ ID NO: 21, a nupXR5 protein having the amino acid sequence of SEQ ID NO: 22, a pck protein having the amino acid sequence of SEQ ID NO: 23, a rlmL protein having the amino acid sequence of SEQ ID NO: 26, a wzxB protein having the amino acid sequence of SEQ ID NO: 27, a ytfP protein having the amino acid sequence of SEQ ID NO: 33, a marA protein having the amino acid sequence of SEQ ID NO: 34, a marA protein having the amino acid sequence of SEQ ID NO: 35, a metL protein having the amino acid sequence of SEQ ID NO: 36, a metL protein having the amino acid sequence of SEQ ID NO: 37, or a pck protein having the amino acid sequence of SEQ ID NO: 38.
  • 2. The engineered E. coli cell of claim 1 comprising the promoter sequence having the nucleic acid SEQ ID NO: 2 driving transcription of the dapA gene and further comprising the mfdY5 protein having the amino acid sequence of SEQ ID NO: 21.
  • 3. The engineered E. coli cell of claim 1 comprising the promoter sequence having the nucleic acid SEQ ID NO: 2 driving transcription of the dapA gene and further comprising the nupXR5 protein having the amino acid sequence of SEQ ID NO: 22.
  • 4. The engineered E. coli cell of claim 1 comprising the promoter sequence having the nucleic acid SEQ ID NO: 2 driving transcription of the dapA gene and further comprising the pck protein having the amino acid sequence of SEQ ID NO: 23.
  • 5. The engineered E. coli cell of claim 1 comprising the promoter sequence having the nucleic acid SEQ ID NO: 2 driving transcription of the dapA gene and further comprising the rlmL protein having the amino acid sequence of SEQ ID NO: 26.
  • 6. The engineered E. coli cell of claim 1 comprising the promoter sequence having the nucleic acid SEQ ID NO: 2 driving transcription of the dapA gene and further comprising the wzxB protein having the amino acid sequence of SEQ ID NO: 27.
  • 7. The engineered E. coli cell of claim 1 comprising the promoter sequence having the nucleic acid SEQ ID NO: 2 driving transcription of the dapA gene and further comprising the ytfP protein having the amino acid sequence of SEQ ID NO: 33.
  • 8. The engineered E. coli cell of claim 1 comprising the promoter sequence having the nucleic acid SEQ ID NO: 2 driving transcription of the dapA gene and further comprising the marA protein having the amino acid sequence of SEQ ID NO: 34.
  • 9. The engineered E. coli cell of claim 1 comprising the promoter sequence having the nucleic acid SEQ ID NO: 2 driving transcription of the dapA gene and further comprising the marA protein having the amino acid sequence of SEQ ID NO: 35.
  • 10. The engineered E. coli cell of claim 1 comprising the promoter sequence having the nucleic acid SEQ ID NO: 2 driving transcription of the dapA gene and further comprising the metL protein having the amino acid sequence of SEQ ID NO: 36.
  • 11. The engineered E. coli cell of claim 1 comprising the promoter sequence having the nucleic acid SEQ ID NO: 2 driving transcription of the dapA gene and further comprising the metL protein having the amino acid sequence of SEQ ID NO: 37.
  • 12. The engineered E. coli cell of claim 1 comprising the promoter sequence having the nucleic acid SEQ ID NO: 2 driving transcription of the dapA gene and further comprising the pck protein having the amino acid sequence of SEQ ID NO: 38.
  • 13. An engineered E. coli cell comprising the following variant sequences: a promoter sequence having the nucleic acid SEQ ID NO: 2 driving transcription of a dapA gene, and further comprising one of the following: a promoter sequence having the nucleic acid sequence of SEQ ID NO: 24 driving expression of a phoB protein; a promoter sequence having the nucleic acid sequence of SEQ ID NO: 25 driving expression of a purM protein; a promoter sequence having the nucleic acid sequence of SEQ ID NO: 28 driving expression of a ydgl protein; a promoter sequence having the nucleic acid sequence of SEQ ID NO: 29 driving expression of a ydgE protein; a promoter sequence having the nucleic acid sequence of SEQ ID NO: 30 driving expression of a yicL protein; a promoter sequence having the nucleic acid sequence of SEQ ID NO: 31 driving expression of a yliE protein; a promoter sequence having the nucleic acid sequence of SEQ ID NO: 32 driving expression of a yohF protein; a promoter sequence having the nucleic acid sequence of SEQ ID NO: 39 driving expression of a setB protein; a promoter sequence having the nucleic acid sequence of SEQ ID NO: 40 driving expression of a ydfO protein; a promoter sequence having the nucleic acid sequence of SEQ ID NO: 41 driving expression of a ydgD protein; or a promoter sequence having the nucleic acid sequence of SEQ ID NO: 42 driving expression of a yejG protein.
  • 14. The engineered E. coli cell of claim 13 comprising the promoter sequence having the nucleic acid SEQ ID NO: 2 driving transcription of the dapA gene and further comprising the promoter sequence having the nucleic acid sequence of SEQ ID NO: 24 driving expression of the phoB protein.
  • 15. The engineered E. coli cell of claim 13 comprising the promoter sequence having the nucleic acid SEQ ID NO: 2 driving transcription of the dapA gene and further comprising the promoter sequence having the nucleic acid sequence of SEQ ID NO: 25 driving expression of the purM protein.
  • 16. The engineered E. coli cell of claim 13 comprising the promoter sequence having the nucleic acid SEQ ID NO: 2 driving transcription of the dapA gene and further comprising the promoter sequence having the nucleic acid sequence of SEQ ID NO: 28 driving expression of the ydgl protein.
  • 17. The engineered E. coli cell of claim 13 comprising the promoter sequence having the nucleic acid SEQ ID NO: 2 driving transcription of the dapA gene and further comprising the promoter sequence having the nucleic acid sequence of SEQ ID NO: 29 driving expression of the ydgE protein.
  • 18. The engineered E. coli cell of claim 13 comprising the promoter sequence having the nucleic acid SEQ ID NO: 2 driving transcription of the dapA gene and further comprising the promoter sequence having the nucleic acid sequence of SEQ ID NO: 30 driving expression of the yicL protein.
  • 19. The engineered E. coli cell of claim 13 comprising the promoter sequence having the nucleic acid SEQ ID NO: 2 driving transcription of the dapA gene and further comprising the promoter sequence having the nucleic acid sequence of SEQ ID NO: 31 driving expression of the yliE protein.
  • 20. The engineered E. coli cell of claim 13 comprising the promoter sequence having the nucleic acid SEQ ID NO: 2 driving transcription of the dapA gene and further comprising the promoter sequence having the nucleic acid sequence of SEQ ID NO: 32 driving expression of the yohF protein.
  • 21. The engineered E. coli cell of claim 13 comprising the promoter sequence having the nucleic acid SEQ ID NO: 2 driving transcription of the dapA gene and further comprising the promoter sequence having the nucleic acid sequence of SEQ ID NO: 39 driving expression of the setB protein.
  • 22. The engineered E. coli cell of claim 13 comprising the promoter sequence having the nucleic acid SEQ ID NO: 2 driving transcription of the dapA gene and further comprising the promoter sequence having the nucleic acid sequence of SEQ ID NO: 40 driving expression of the ydfO protein.
  • 23. The engineered E. coli cell of claim 13 comprising the promoter sequence having the nucleic acid SEQ ID NO: 2 driving transcription of the dapA gene and further comprising the promoter sequence having the nucleic acid sequence of SEQ ID NO: 41 driving expression of the ydgD protein.
  • 24. The engineered E. coli cell of claim 13 comprising the promoter sequence having the nucleic acid SEQ ID NO: 2 driving transcription of the dapA gene and further comprising the promoter sequence having the nucleic acid sequence of SEQ ID NO: 42 driving expression of the yejG protein.
RELATED APPLICATIONS

This application is a continuation of U.S. Ser. No. 16/904,827, filed 18 Jun. 2020, entitled “Genome-Wide Rationally-Designed Mutations Leading to Enhanced Lysine Production in E. Coli”, which claims priority to U.S. Provisional Application No. 62/865,075, filed 21 Jun. 2019, entitled “Genome-Wide Rationally-Designed Mutations Leading to Enhanced Lysine Production in E. Coli”, incorporated by reference herein in its entirety.

US Referenced Citations (147)
Number Name Date Kind
4833080 Brent et al. May 1989 A
4959317 Sauer et al. Sep 1990 A
5464764 Capecchi et al. Nov 1995 A
5487992 Capecchi et al. Jan 1996 A
5627059 Capecchi et al. May 1997 A
5631153 Capecchi et al. May 1997 A
5654182 Wahl et al. Aug 1997 A
5677177 Wahl et al. Oct 1997 A
5710381 Atwood et al. Jan 1998 A
5792943 Craig Aug 1998 A
5885836 Wahl et al. Mar 1999 A
5888732 Hartley et al. Mar 1999 A
6074605 Meserol et al. Jun 2000 A
6127141 Kopf Oct 2000 A
6143527 Pachuk et al. Nov 2000 A
6150148 Nanda et al. Nov 2000 A
6204061 Capecchi et al. Mar 2001 B1
6277608 Hartley et al. Aug 2001 B1
6391582 Ying et al. May 2002 B2
6482619 Rubinsky et al. Nov 2002 B1
6509156 Stewart et al. Jan 2003 B1
6654636 Dev et al. Nov 2003 B1
6689610 Capecchi et al. Feb 2004 B1
6746441 Hofmann et al. Jun 2004 B1
6774279 Dymecki Aug 2004 B2
6916632 Chesnut et al. Jul 2005 B2
6956146 Wahl et al. Oct 2005 B2
7029916 Dzekunov et al. Apr 2006 B2
7112715 Chambon et al. Sep 2006 B2
7141425 Dzekunov et al. Nov 2006 B2
7422889 Sauer et al. Sep 2008 B2
8110122 Alburty et al. Feb 2012 B2
8110360 Serber et al. Feb 2012 B2
8153432 Church et al. Apr 2012 B2
8332160 Platt et al. Dec 2012 B1
8569041 Church et al. Oct 2013 B2
8584535 Page et al. Nov 2013 B2
8584536 Page et al. Nov 2013 B2
8667839 Kimura Mar 2014 B2
8667840 Lee et al. Mar 2014 B2
8677839 Page et al. Mar 2014 B2
8677840 Page et al. Mar 2014 B2
8697359 Zhang et al. Apr 2014 B1
8726744 Alburty et al. May 2014 B2
8758623 Alburty et al. Jun 2014 B1
8921332 Choulika et al. Dec 2014 B2
8926977 Miller et al. Jan 2015 B2
8932850 Chang et al. Jan 2015 B2
9029109 Hur et al. May 2015 B2
D731634 Page et al. Jun 2015 S
9063136 Talebpour et al. Jun 2015 B2
9260505 Weir et al. Feb 2016 B2
9361427 Hillson Jun 2016 B2
9499855 Hyde et al. Nov 2016 B2
9534989 Page et al. Jan 2017 B2
9546350 Dzekunov et al. Jan 2017 B2
9593359 Page et al. Mar 2017 B2
9738918 Alburty et al. Aug 2017 B2
9776138 Innings et al. Oct 2017 B2
9790490 Zhang et al. Oct 2017 B2
9896696 Begemann et al. Feb 2018 B2
9982279 Gill et al. May 2018 B1
9988624 Serber et al. Jun 2018 B2
10011849 Gill et al. Jul 2018 B1
10017760 Gill et al. Jul 2018 B2
10266851 Chen Apr 2019 B2
20030059945 Dzekunov et al. Mar 2003 A1
20030073238 Dzekunov et al. Apr 2003 A1
20030104588 Orwar et al. Jun 2003 A1
20040110253 Kappler et al. Jun 2004 A1
20040115784 Dzekunov et al. Jun 2004 A1
20040171156 Hartley et al. Sep 2004 A1
20050064584 Bargh Mar 2005 A1
20050118705 Rabbitt et al. Jun 2005 A1
20060001865 Bellalou et al. Jan 2006 A1
20060224192 Dimmer et al. Oct 2006 A1
20070042427 Gerdes et al. Feb 2007 A1
20070105206 Lu et al. May 2007 A1
20070231873 Ragsdale Oct 2007 A1
20070249036 Ragsdale et al. Oct 2007 A1
20080138877 Dzekunov et al. Jun 2008 A1
20100055790 Simon Mar 2010 A1
20100076057 Sontheimer et al. Mar 2010 A1
20110002812 Asogawa et al. Jan 2011 A1
20110003303 Pagano et al. Jan 2011 A1
20110009807 Kjeken et al. Jan 2011 A1
20110065171 Dzekunov et al. Mar 2011 A1
20110213288 Choi et al. Sep 2011 A1
20110236962 Loebbert et al. Sep 2011 A1
20120156786 Bebee Jun 2012 A1
20130005025 Church et al. Jan 2013 A1
20130196441 Rubinsky et al. Aug 2013 A1
20140068797 Doudna et al. Mar 2014 A1
20140121728 Dhillon et al. May 2014 A1
20140199767 Barrangou et al. Jul 2014 A1
20140273226 Wu et al. Sep 2014 A1
20140350456 Caccia Nov 2014 A1
20150072413 Zenhausern et al. Mar 2015 A1
20150098954 Hyde et al. Apr 2015 A1
20150159174 Frendewey et al. Jun 2015 A1
20150176013 Musunuru et al. Jun 2015 A1
20150191719 Hudson et al. Jul 2015 A1
20150225732 Williams et al. Aug 2015 A1
20150297887 Dhillon et al. Oct 2015 A1
20160024529 Carstens et al. Jan 2016 A1
20160053272 Wurtzel et al. Feb 2016 A1
20160053304 Wurtzel et al. Feb 2016 A1
20160076093 Shendure et al. Mar 2016 A1
20160102322 Ravinder et al. Apr 2016 A1
20160168592 Church et al. Jun 2016 A1
20160272961 Lee Sep 2016 A1
20160281047 Chen et al. Sep 2016 A1
20160281053 Sorek et al. Sep 2016 A1
20160289673 Huang et al. Oct 2016 A1
20160298074 Dai Oct 2016 A1
20160298134 Chen et al. Oct 2016 A1
20160310943 Woizenko et al. Oct 2016 A1
20160313306 Ingber et al. Oct 2016 A1
20160354487 Zhang et al. Dec 2016 A1
20160367991 Cepheid Dec 2016 A1
20170002339 Barrangou et al. Jan 2017 A1
20170022499 Lu et al. Jan 2017 A1
20170029805 Li et al. Feb 2017 A1
20170051310 Doudna et al. Feb 2017 A1
20170073705 Chen et al. Mar 2017 A1
20170218355 Buie et al. Mar 2017 A1
20170191123 Kim et al. Jul 2017 A1
20170211078 Kamineni et al. Jul 2017 A1
20170240922 Gill et al. Aug 2017 A1
20170283761 Corso Oct 2017 A1
20170307606 Hallock Oct 2017 A1
20170349874 Jaques et al. Dec 2017 A1
20170369870 Gill et al. Dec 2017 A1
20180023045 Hallock et al. Jan 2018 A1
20180028567 Li et al. Feb 2018 A1
20180051327 Blainey et al. Feb 2018 A1
20180052176 Holt et al. Feb 2018 A1
20180073013 Lorenz et al. Mar 2018 A1
20180112235 Li et al. Apr 2018 A1
20180142196 Coppeta et al. May 2018 A1
20180155665 Zenhausern et al. Jun 2018 A1
20180169148 Adair et al. Jun 2018 A1
20180179485 Borenstein et al. Jun 2018 A1
20180200342 Bikard et al. Jul 2018 A1
20180230460 Gill et al. Aug 2018 A1
20190017072 Ditommaso et al. Jan 2019 A1
20190169605 Masquelier et al. Jun 2019 A1
Foreign Referenced Citations (40)
Number Date Country
2397122 Sep 2000 CN
2135626 Dec 2009 EP
2240238 Oct 2010 EP
2395087 Dec 2011 EP
3030652 Jun 2016 EP
1766004 Aug 2016 EP
3199632 Aug 2017 EP
2459696 Nov 2017 EP
WO 2003057819 Jul 2001 WO
WO2002010183 Feb 2002 WO
WO 2003087341 Oct 2003 WO
WO 2009091578 Jul 2009 WO
WO 2010079430 Jul 2010 WO
WO 2011072246 Jun 2011 WO
WO2011143124 Nov 2011 WO
WO 2012012779 Jan 2012 WO
WO2013142578 Sep 2013 WO
WO 2013176772 Nov 2013 WO
WO2014018423 Jan 2014 WO
WO2014144495 Sep 2014 WO
WO 2015021270 Feb 2015 WO
WO 2016003485 Jan 2016 WO
WO 2016054939 Apr 2016 WO
WO2016110453 Jul 2016 WO
WO 2016145290 Sep 2016 WO
WO2017053902 Mar 2017 WO
WO 2017078631 May 2017 WO
WO2017083722 May 2017 WO
WO2017106414 Jun 2017 WO
WO2017161371 Sep 2017 WO
WO2017174329 Oct 2017 WO
WO2017186718 Nov 2017 WO
WO2017216392 Dec 2017 WO
WO2017223330 Dec 2017 WO
WO 2018015544 Jan 2018 WO
WO2018031950 Feb 2018 WO
WO2018071672 Apr 2018 WO
WO2018083339 May 2018 WO
WO 2018191715 Oct 2018 WO
WO2019006436 Jan 2019 WO
Non-Patent Literature Citations (81)
Entry
International Search Report and Written Opinion for International Application No. PCT/US20/38345, dated Nov. 23, 2020, p. 143.
Wang, et al., “Evolving the L-lysine high-producing strain of Escherichia coli using a newly developed high-throughput screening method”, doi:10.1007/s10295-016-1803-1; J. Ind Microbial Biotechnol (2016) 43; 1227-1235.
Bao, et al., “Genome-scale engineering of Saccharomyces cerevisiae with single-nucleotide precision”, Nature Biotechnology, doi:10.1038/nbt.4132, pp. 1-6 (May 7, 2018).
Dicarlo, et al., “Genome engineering in Saccharomyces cervisiae using CRISPR-Case systems”, Nucleic Acids Research, 41(7):4336-43 (2013).
Eklund, et al., “Altered target site specificity variants of the I-Ppol His-Cys bis homing endonuclease” Nucleic Acids Research, 35(17):5839-50 (2007).
Garst, et al., “Genome-wide mapping of mutations at single-nucleotide resolution for protein, metabolic and genome engineering”, Nature Biotechnology, 35(1):48-59 (2017).
Boles, et al., “Digital-to-biological converter for on-demand production of biologics”, Nature Biotechnology, doi:10.1038/nbt.3859 (May 29, 2017).
Hsu, et al., “DNA targeting specificity of RNA-guided Cas9 nucleases”, Nature Biotechnology, 31(9):827-32 (2013).
Jiang, et al., “RNA-guided editing of bacterial genomes using CRISPR-Cas systems”, Nature Biotechnology, 31(3):233-41 (2013).
Jinek, et al., “A Programmable Dual-RNA-Guided DNA Endonuclease in Adaptive Bacterial Immunity”, Science, 337:816-20 (2012).
Pines, et al., “Codon Compression Algorithms for Saturation Mutagenesis”, ACS Synthetic Biology, 4:604-14 (2015).
Verwaal, et al., “CRISPR/Cpf1 enables fast and simple genome editing of Saccharomyces cerevisiae”, Yeast, 35:201-11 (2018).
Lian, et al., “Combinatorial metabolic engineering using an orthogonal tri-functional CRISPR system”, Nature Communications, doi:10.1038/s41467-017-01695-x, pp. 1-9 (2017).
Roy, et al., “Multiplexed precision genome editing with trackable genomic barcodes in yeast”, Nature Biotechnology, doi:10.1038/nbt.4137, pp. 1-16 (2018).
Bessa et al., “Improved gap repair cloning in yeast: treatment of the gapped vector with Taq DNA polymerase avoids vector self-ligation,” Yeast, 29(10):419-23 (2012).
Boch, “TALEs of genome targeting,” Nature Biotechnology vol. 29, pp. 135-136 (2011).
Campbell et al., “Targeting protein function: the expanding toolkit for conditional disruption,” Biochem J., 473(17):2573-2589 (2016).
Casini et al., “Bricks and blueprints: methods and standards for DNA assembly,” Nat Rev Mol Cell Biol., (9):568-76 (2015).
Chica et al., “Semi-rational approaches to engineering enzyme activity: combining the benefits of directed evolution and rational design,” Current Opinion in Biotechnology, 16(4): 378-384 (2005).
Cramer et al., “Functional association between promoter structure and transcript alternative splicing,” PNAS USA, 94(21):11456-60 (1997).
Dalphin et al., “Transterm: A Database of Translational Signals,” Nucl. Acids Res., 24(1): 216-218 (1996).
Datsenko and Wanner, “One-step inactivation of chromosomal genes in Escherichia coli K-12 using PCR products”, PNAS USA, 97(12):6640-5 (2000).
De Kok et al., “Rapid and reliable DNA assembly via ligase cycling reaction,” ACS Synth Biol., 3(2):97-106 (2014).
Desmet et al., “Human Splicing Finder: an online bioinformatics tool to predict splicing signals,” Nucleic Acids Res., 37(9):e67 (2009).
Divina et al., “Ab Initio prediction of mutation-induced cryptic splice-site activation and exon skipping,” European Journal of Human Genetics, 17:759-765 (2009).
Dong, “Establishment of a highly efficient virus-inducible CRISPR/Cas9 system in insect cells,” Antiviral Res., 130:50-7 (2016).
Durai et al., “Zinc finger nucleases: custom-designed molecular scissors for genome engineering of plant and mammalian cells”, Nucleic Acids Res., 33(18):5978-90 (2005).
Engler et al., “A One Pot, One Step, Precision Cloning Method with High Throughput Capability,” PLoS One, 3(11):e3647 (2008).
Epinat et al., “A novel engineered meganuclease induces homologous recombination in eukaryotic cells, e.g., yeast and mammalian cells”, Nucleic Acids Research, 31(11):2952-2962 (2003).
Faber et al., “Genome-wide prediction of splice-modifying SNPs in human genes using a new analysis pipeline called AASsites,” BMC Bioinformatics, 12(suppl 4):S2 (2011).
Farasat et al., “A Biophysical Model of CRISPR/Cas9 Activity for Rational Design of Genome Editing and Gene Regulation,” PLoS Comput Biol., 12(1):e1004724 (2016).
Adamo, et al., “Flow-through comb electroporation device for delivery of macromolecules”, Analytical Chemistry, 85(3):1637-41 (2015).
Greger et al., “Balancing transcriptional interference and initiation on the GAL7 promoter of Saccharomyces cerevisiae,” PNAS, 97(15):8415-20 (2000).
Juan et al., “Histone deacetylases specifically down-regulate p53-dependent gene activation,” Journal of Biological Chemistry, 275(27):20436-20443 (2000).
Kadonaga et al., “Regulation of RNA polymerase II transcription by sequence-specific DNA binding factors”, Cell, 116(2):247-57 (2004).
Lee et al., “Targeted chromosomal deletions in human cells using zinc finger nucleases”, Genome Res., 20 (1): 81-9 (2009).
Lefevre et al., “Alanine-stretch scanning mutagenesis: a simple and efficient method to probe protein structure and function,” Nucleic Acids Research, vol. 25(2):447-448 (1997).
Liu et al., “A chemical-inducible CRISPR-Cas9 system for rapid control of genome editing”, Nature Chemical Biology, 12:980-987 (2016).
Miller et al., “A TALE nuclease architecture for efficient genome editing”, Nature Biotechnology, 29 (2): 143-8 (2011).
Mittelman et al., “Zinc-finger directed double-strand breaks within CAG repeat tracts promote repeat instability in human cells”, PNAS USA, 106 (24): 9607-12 (2009).
Mullick et al., “The cumate gene-switch: a system for regulated expression in mammalian cells”, BMC Biotechnology, 6:43 (2006).
Nalla et al., “Automated splicing mutation analysis by information theory,” Hum. Mutat., 25:334-342 (2005).
No et al., “Ecdysone-inducible gene expression in mammalian cells and transgenic mice,” PNAS, 93(8):3346-3351 (1996).
Ohtsuka, “Lantibiotics: mode of action, biosynthesis and bioengineering,” Curr Pharm Biotechnol, 10(2):244-51 (2009).
Patron, “DNA assembly for plant biology: techniques and tools,” Curr Opinion Plant Biol., 19:14-9 (2014).
Sands et al., “Overview of Post Cohen-Boyer Methods for Single Segment Cloning and for Multisegment DNA Assembly,” Curr Protoc Mol Biol., 113:3.26.1-3.26.20 (2016).
Shivange, “Advances in generating functional diversity for directed protein evolution”, Current Opinion in Chemical Biology, 13 (1): 19-25 (2009).
Udo, “An Alternative Method to Facilitate cDNA Cloning for Expression Studies in Mammalian Cells by Introducing Positive Blue White Selection in Vaccinia Topoisomerase I-Mediated Recombination,” PLoS One, 10(9):e0139349 (2015).
Urnov et al., “Genome editing with engineered zinc finger nucleases”, Nature Reviews Genetics, 11:636-646 (2010).
West et al., “Molecular Dissection of Mammalian RNA Polymerase II Transcriptional Termination,” Mol Cell. 29(5):600-10 (2008).
West et al., “Transcriptional Termination Enhances Protein Expression in Human Cells,” Mol Cell., 33(3):354-364 (2009).
Yoshioka, et al., “Development for a mono-promoter-driven CRISPR/CAS9 system in mammalian cells”, Scientific Reports, Jul. 3, 2015, p. 1-8.
Remaut, et al., “Plasmid vectors for high-efficiency expression controlled by the PL promoter of coliphage lambda”, Laboratory of Molecular Biology, Apr. 15, 1981, p. 81-93.
International Search Report and Written Opinion for International Application No. PCT/US19/46515, dated Oct. 28, 2019, p. 1-11.
International Search Report and Written Opinion for International Application No. PCT/US19/49735, dated Nov. 18, 2019, p. 1-13.
International Search Report and Written Opinion for International Application No. PCT/US19/46526, dated Dec. 18, 2019, p. 1-17.
International Search Report and Written Opinion for International Application No. PCT/US18/34779, dated Nov. 26, 2018, p. 1-39.
International Search Report and Written Opinion for International Application No. PCT/US19/57250, dated Feb. 25, 2020, p. 1-16.
International Search Report and Written Opinion for International Application No. PCT/US20/24341, dated Jun. 19, 2020, p. 1-9.
International Search Report and Written Opinion for International Application No. PCT/US19/47135, dated Jun. 11, 2020, p. 1-15.
International Search Report and Written Opinion for International Application No. PCT/US20/19379, dated Jul. 22, 2020, p. 1-10.
International Search Report and Written Opinion for International Application No. PCT/US20/36064, dated Sep. 18, 2020, p. 1-16.
International Search Report and Written Opinion for International Application No. PCT/US20/40389, dated Oct. 13, 2020, p. 1-12.
Arnak, et al., “Yeast Artificial Chromosomes”, John Wiley & Sons, Ltd., doi:10.1002/9780470015902.a0000379.pub3, pp. 1-10 (2012).
Woo, et al., “Dual roles of yeast Rad51 N-terminal domain in repairing DNA double-strand breaks”, Nucleic Acids Research, doi:10.1093/nar/gkaa587, vol. 48, No. 15, pp. 8474-8489 (2020).
International Search Report and Written Opinion for International Application No. PCT/US2018/040519, dated Sep. 26, 2018, p. 1-8.
International Search Report and Written Opinion for International Application No. PCT/US2018/053608, dated Dec. 13, 2018, p. 1-9.
International Search Report and Written Opinion for International Application No. PCT/US2018/053670, dated Jan. 3, 2019, p. 1-13.
International Search Report and Written Opinion for International Application No. PCT/US2018/053671, dated Nov. 23, 2018, p. 1-12.
International Search Report and Written Opinion for International Application No. PCT/US2019/023342, dated Jun. 6, 2019, p. 1-12.
International Search Report and Written Opinion for International Application No. PCT/US2019/026836, dated Jul. 2, 2019, p. 1-10.
International Search Report and Written Opinion for International Application No. PCT/US2019/028821, dated Aug. 2, 2019, p. 1-14.
International Search Report and Written Opinion for International Application No. PCT/US2019/028883, dated Aug. 16, 2019, p. 1-12.
International Search Report and Written Opinion for International Application No. PCT/US2019/030085, dated Jul. 23, 2019, p. 1-14.
Non-Final Office Action for U.S. Appl. No. 16/024,816 dated Sep. 4, 2018, p. 1-10.
Final Office Action for U.S. Appl. No. 16/024,816 dated Nov. 26, 2018, p. 1-12.
First Office Action Interview Pilot Program Pre-Interview Communication for U.S. Appl. No. 16/024,831, dated Feb. 12, 2019, p. 1-37.
First Office Action Interview Pilot Program Pre-Interview Communication for U.S. Appl. No. 16/360,404 dated Jul. 1, 2019, p. 1-27.
First Office Action Interview Pilot Program Pre-Interview Communication for U.S. Appl. No. 16/360,423 dated Jul. 1, 2019, p. 1-27.
Non-Final Office Action for U.S. Appl. No. 16/399,988 dated Jul. 31, 2019, p. 1-20.
First Office Action Interview Pilot Program Pre-Interview Communication for U.S. Appl. No. 16/454,865 dated Aug. 16, 2019, p. 1-36.
Related Publications (1)
Number Date Country
20210155894 A1 May 2021 US
Provisional Applications (1)
Number Date Country
62865075 Jun 2019 US
Continuations (1)
Number Date Country
Parent 16904827 Jun 2020 US
Child 17159137 US