Genetically encoded biosensors for detection of polyketides

Information

  • Patent Grant
  • 12091718
  • Patent Number
    12,091,718
  • Date Filed
    Wednesday, October 26, 2022
  • Date Issued
    Tuesday, September 17, 2024
Abstract
The present disclosure relates to high-throughput detection of polyketides using genetically encoded biosensors.
Description
SEQUENCE LISTING

A Sequence Listing conforming to the rules of WIPO Standard ST.26 is hereby incorporated by reference. Said Sequence Listing has been filed as an electronic document via PatentCenter in ASCII format encoded as XML. The electronic document, created on Apr. 23, 2023, is entitled “10620-025US2_ST26.xml”, and is 116,858 bytes in size.


FIELD

The present disclosure relates to high-throughput detection of polyketides using genetically encoded biosensors.


BACKGROUND

Polyketides are a large group of diverse molecules that display broad and potent biological activities. Access to large quantities of polyketides and analogues thereof is critical for the discovery of new biological activities, the optimization of pharmacological properties, and probe discovery and development. Biosynthetic approaches to polyketide production offer enormous potential and several benefits compared to traditional chemical approaches. The scaffolds of many polyketides are constructed by type I polyketide synthases (PKSs). These are large multifunctional protein complexes organized in a modular fashion. Each module is responsible for the selection and installation of a ketide into the polyketide. The number, identity, and order of modules describe the structure of the corresponding polyketide. These scaffolds are often further elaborated by tailoring enzymes to afford the mature, biologically active natural product. Accordingly, these systems offer the potential for the synthesis of large quantities of polyketides via microbial fermentation and combinatorial synthesis of analogues by mixing and matching modules and tailoring enzymes. However, the sheer size, mechanistic diversity, and poor understanding of how specificity and catalysis are controlled by type I PKSs render rational design of new pathways difficult. For example, many hybrid PKSs designed to produce polyketide analogues fail or are less active than wild-type machinery. Consequently, the full synthetic potential of type I PKSs has yet to be realized. Synthetic biology and directed evolution offer an opportunity to overcome these challenges by testing the functions of large libraries of variants. Yet, the ability of synthetic biology and directed evolution approaches to be applied to polyketides is extremely limited because there are no generally applicable high-throughput tools available for screening polyketides, particularly those encoded by type I PKSs.
Regulatory proteins such as transcription factors have been used as effective devices for sensitive and specific detection of various small molecules. Engineered transcription factors have been described for sensing several small molecules, including dicarboxylic acids, alcohols, and a lactone, but none have been reported for the complex products of type I PKSs.


The biosensor systems, cells, and methods disclosed herein address these and other needs.


SUMMARY

Described herein is a platform technology that comprises genetically-encoded biosensors and methods for detection of polyketides using mutated MphR gene sequences. Such biosensors provide a scalable, economic, high-throughput, and broadly applicable means to specifically identify a target polyketide of interest from a complex mixture of molecules.


In one aspect, disclosed herein is a biosensor system comprising:

    • a nucleic acid encoding a genetically modified MphR gene sequence, wherein the nucleic acid comprises at least one genetic mutation when compared to the wild-type MphR gene sequence; and
    • a reporter gene whose transcription is under the control of a promoter region which is regulated by the MphR transcription factor.


In one aspect, disclosed herein is a genetically modified host cell comprising:

    • a nucleic acid encoding a genetically modified MphR gene sequence, wherein the nucleic acid comprises at least one genetic mutation when compared to the wild-type MphR gene sequence; and
    • a reporter gene whose transcription is under the control of a promoter region which is regulated by the MphR transcription factor.


In one aspect, provided herein is a method for detecting a polyketide, comprising:

    • introducing into a cell:
    • i. a nucleic acid encoding a genetically modified MphR gene sequence, wherein the nucleic acid comprises at least one genetic mutation when compared to the wild-type MphR gene sequence; and
    • ii. a reporter gene whose transcription is under the control of a promoter region which is regulated by the MphR transcription factor,
    • and
    • detecting the polyketide based on the differential expression of the reporter gene in comparison to a cell comprising a wild-type MphR gene sequence.
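The comparison in the final step above can be sketched numerically. The following is a minimal illustration, assuming reporter output has been measured as fluorescence for a cell carrying the modified MphR gene sequence and for a cell carrying the wild-type sequence. The function names, the background subtraction, and the 2-fold cutoff are assumptions chosen for illustration and are not specified in this disclosure.

```python
def fold_change(signal_variant, signal_wt, background=0.0):
    """Ratio of background-corrected reporter signals (variant / wild type)."""
    corrected_wt = signal_wt - background
    if corrected_wt <= 0:
        raise ValueError("wild-type signal must exceed background")
    return (signal_variant - background) / corrected_wt


def is_detected(signal_variant, signal_wt, background=0.0, cutoff=2.0):
    """Call a detection when the variant signal is at least `cutoff`-fold
    the wild-type signal. The cutoff is an arbitrary example value."""
    return fold_change(signal_variant, signal_wt, background) >= cutoff
```

In practice the cutoff would be chosen from replicate measurements rather than fixed in advance.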


In one aspect, provided herein is a method of screening for genetic mutations in a target gene, comprising:

    • introducing into a cell:
    • i. a nucleic acid encoding a genetically modified MphR gene sequence, wherein the nucleic acid comprises at least one genetic mutation when compared to the wild-type MphR gene sequence; and
    • ii. a reporter gene whose transcription is under the control of a promoter region which is regulated by the MphR transcription factor,
    • introducing at least one mutation into a target gene; and
    • identifying a cell comprising the target gene mutation based on the differential expression of the reporter gene in comparison to a cell comprising the wild-type target gene.





BRIEF DESCRIPTION OF THE DRAWINGS

The accompanying figures, which are incorporated in and constitute a part of this specification, illustrate several aspects described below.



FIGS. 1A-1B. The MphR biosensor. (FIG. 1A) Structures of selected polyketides that are detected by wild-type (WT) MphR. Erythromycin A (ErA) is the natural ligand. (FIG. 1B) Artificial MphR-GFP reporter system. In the presence of ErA, MphR changes conformation and stops inhibiting transcription from the PmphR operator, thus turning on reporter expression.



FIGS. 2A-2C. Engineered MphR variants with improved sensitivity towards erythromycin A (ErA) and sensitivity of amino acid changes compared to ribosome binding site mutations. (FIG. 2A) Sensitivity of original clones A3, E7, and H4 towards erythromycin A. (FIG. 2B) Sensitivity of wild-type MphR and amino-acid change-only mutations towards erythromycin A. (FIG. 2C) Sensitivity of wild-type MphR and RBS-only mutations towards erythromycin A.



FIG. 3A. Erythromycin, clarithromycin, azithromycin, roxithromycin sensitivity with wild-type (WT) MphR.



FIG. 3B. Erythromycin, clarithromycin, azithromycin, roxithromycin sensitivity with M2D6-E7RBS MphR.



FIG. 3C. Erythromycin, clarithromycin, azithromycin, roxithromycin sensitivity with M2D6 MphR.



FIGS. 4A-4C. MphR is a robust macrolide glycosylation sensor. (FIG. 4A) WT MphR detects erythromycin A (ErA) but not the aglycone, 6dEB. (FIG. 4B) Structures of the 12-membered macrolide YC-17 and macrolactone (aglycone) 10-DML. (FIG. 4C) Left, the MphR variant D3 detects YC-17 at concentrations ~100-fold lower than WT MphR; Right, neither WT nor D3 MphR is activated by the aglycone 10-DML.



FIGS. 5A-5B. Biosynthesis of clarithromycin via an engineered O-methyltransferase (OMT). (FIG. 5A) An OMT with the requisite regioselectivity allows the single-step preparation of clarithromycin from ErA. (FIG. 5B) Role of naturally occurring OMTs that target polyketide sugar residues.



FIGS. 6A-6B. Clarithromycin selective MphR sensor. (FIG. 6A) Wild-type (WT) MphR does not discriminate ErA/clarithromycin across a 1000-fold concentration range. (FIG. 6B) MphR M1B10 provides higher GFP signal with clarithromycin vs. erythromycin A (ErA) across entire range of concentrations.



FIG. 7. Existing 18-step route to solithromycin compared to a biosynthetic route.



FIGS. 8A-8B. Biosensor-guided engineering of a solithromycin precursor. (FIG. 8A) Two genetic changes afford I, in low yield. (FIG. 8B) Biosensor-guided screening of large libraries of variants identifies prototype pathways/strains with improved product titers.



FIGS. 9A-9D. O-methyltransferase (OMT) scaffolds for directed evolution. (FIG. 9A) Phyre2-generated homology model for EryG; 93% of residues were modeled at >90% confidence. Residues involved in the SAM binding site (V88, G89, F90, G91, L92, G93, A94, D112, L113, G139, S140, A141, L157). Sticks: putative macrolide (ErA) binding residues (I188, G215, W221, W252, W256, K278, R279, L281, T282, S285, G286, K288, F296), determined by comparison to known acceptor binding sites for related OMTs. (FIG. 9B) Computationally predicted internal cavities of EryG using CAVER Analyst 1.0 (Outer probe 3.00 Å, Inner probe 1.90 Å). SAM binding site and putative erythromycin A (ErA) binding site are shown. (FIG. 9C) DnrK (PDB: 1TW3) acceptor binding site shown as sticks (E298, L299, R302, M303, F306, L307, Y341). Macrolide ligand shown space filled. (FIG. 9D) MycF (PDB: 4X7U) acceptor binding site shown as sticks (L32, Y49, M132, L134, Y137, V141).



FIGS. 10A-10D. Glycosylation pathways and combinatorial biosynthesis. (FIG. 10A) Reactions catalyzed by glycosyltransferases (GTs). (FIG. 10B) Genes responsible for the biosynthesis of a given polyketide are usually clustered on microbial genomes. (FIG. 10C) Feeding non-native aglycones into heterologous host with non-native NDP-sugar and GT genes. (FIG. 10D) Overall reaction catalyzed by DesVII/VIII is shown in the grey box, along with the natural aglycone substrates for this enzyme.



FIGS. 11A-11B. Dose-response curves of several selected clones compared to the wild-type biosensor. Multiple MphR mutants displayed increased sensitivity to erythromycin A versus MphR-WT. Clones generated by error prone PCR (epPCR) (FIG. 11A) typically performed better than clones generated by multi-site mutagenesis (FIG. 11B).



FIGS. 12A-12C. Dose-response curves of MphR-A16T/T154M/M155K compared to the wild-type biosensor induced by erythromycin A, clarithromycin, azithromycin and roxithromycin. (FIG. 12A) MphR-WT responses to erythromycin A and semi-synthetic analogs. (FIG. 12B) MphR-A16T/T154M/M155K responses to erythromycin A and semi-synthetic analogs. Coding of macrolides show potential or actual points of semi-synthetic modification. (FIG. 12C) Structures for erythromycin A (compound 1), clarithromycin (compound 2), azithromycin (compound 3), and roxithromycin (compound 4).



FIG. 13. Late-stage erythromycin A biosynthesis. 6dEB, produced by DEBS1-3, is modified by a suite of enzymes to yield erythromycin D. Biosynthesis from erythromycin D to erythromycin A proceeds via biosynthetic intermediate erythromycin C (filled arrows), or by the shunt pathway via intermediate erythromycin B (dashed arrows). The eryK-catalyzed C-12 hydroxylations and eryG-catalyzed mycarosyl O′-methylations are shown in the figure.



FIGS. 14A-14B. Dose-response curves of the wild-type sensor (FIG. 14A) and the erythromycin A specific sensor MphR-P4L/W107L/H193R (FIG. 14B) in the context of discriminating between erythromycins A (compound 1) and B (compound 5). Clone MphR-P4L/W107L/H193R is capable of significant activation by erythromycin A solely, unlike the general wild-type macrolide biosensor.



FIG. 15. Plasmid map for pMLGFP.



FIG. 16. Plasmid map for pJZ12.



FIG. 17. Sensitivity of the smRBS1A1 clone versus the wild-type (WT) biosensor with erythromycin A.



FIG. 18. Sensitivity of clones E7-RBS, smRBS1A1, pikB1, and wild-type (WT) with pikromycin.



FIG. 19A. Clarithromycin/erythromycin A selectivity with R122T MphR.



FIG. 19B. Clarithromycin/erythromycin A selectivity with the M9C4 clone.



FIG. 19C. Clarithromycin/erythromycin A selectivity with wild-type (WT) MphR.



FIG. 19D. Clarithromycin/erythromycin A selectivity with the E7-M9C4 clone.



FIG. 20. MphR clone “PikB1” can detect a solithromycin biosynthetic intermediate.



FIGS. 21A-21C. Characterization of YC-17, narbomycin, and pikromycin selective MphR Clones. (FIG. 21A) YC-17 sensitivity of B1 clone vs. WT. (FIG. 21B) Narbomycin sensitivity of G7 clone vs. WT. (FIG. 21C) Pikromycin sensitivity of B1 clone vs. WT.



FIG. 22A. The E7-RBS clone shows increased detection of the erythromycin producing strain, Aeromicrobium erythreum, compared to the wild-type (WT) biosensor.



FIG. 22B. Agar plate detection of the E7-RBS clone shows increased detection of the erythromycin producing strain, Aeromicrobium erythreum, compared to the WT biosensor.



FIG. 23. Plasmid map for WT-pMLCmR.



FIG. 24. Analysis of the control of expression of the chloramphenicol (Cm) resistance gene using pMLCmR.



FIG. 25. Analysis of antibiotic sensitivities of the E7-M9C4 pMLCmR clone.



FIG. 26A. Analysis of wild-type (WT) MphR using a range of ErA/Clarithromycin concentrations. This shows that the WT biosensor does not discriminate between these two polyketides and cannot be used to determine the concentration of clarithromycin in the presence of ErA.



FIG. 26B. Analysis of MphR mutant M9C4 using a range of ErA/Clarithromycin concentrations. This shows that the M9C4 biosensor does discriminate between these two polyketides and can be used to determine the concentration of clarithromycin in the presence of ErA.





DETAILED DESCRIPTION OF THE INVENTION

Described herein is a platform technology that comprises genetically-encoded biosensors and methods for detection of polyketides using mutated MphR gene sequences. Such biosensors provide a scalable, economic, high-throughput, and broadly applicable means to specifically identify a target polyketide of interest from a complex mixture of molecules.


Reference will now be made in detail to the embodiments of the invention, examples of which are illustrated in the drawings and the examples. This invention may, however, be embodied in many different forms and should not be construed as limited to the embodiments set forth herein.


Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. The following definitions are provided for the full understanding of terms used in this specification.


Terminology

Terms used throughout this application are to be construed with ordinary and typical meaning to those of ordinary skill in the art. However, Applicant desires that the following terms be given the particular definition as defined below.


As used in the specification and claims, the singular forms “a,” “an,” and “the” include plural references unless the context clearly dictates otherwise. For example, the term “a cell” includes a plurality of cells, including mixtures thereof.


As used herein, the terms “may,” “optionally,” and “may optionally” are used interchangeably and are meant to include cases in which the condition occurs as well as cases in which the condition does not occur.


The terms “about” and “approximately” are defined as being “close to” as understood by one of ordinary skill in the art. In one non-limiting embodiment the terms are defined to be within 10%. In another non-limiting embodiment, the terms are defined to be within 5%. In still another non-limiting embodiment, the terms are defined to be within 1%.


The term “nucleic acid” as used herein means a polymer composed of nucleotides, e.g. deoxyribonucleotides or ribonucleotides.


The terms “ribonucleic acid” and “RNA” as used herein mean a polymer composed of ribonucleotides.


The terms “deoxyribonucleic acid” and “DNA” as used herein mean a polymer composed of deoxyribonucleotides.


The term “oligonucleotide” denotes single- or double-stranded nucleotide multimers of from about 2 to up to about 100 nucleotides in length. Suitable oligonucleotides may be prepared by the phosphoramidite method described by Beaucage and Carruthers, Tetrahedron Lett., 22:1859-1862 (1981), or by the triester method according to Matteucci, et al., J. Am. Chem. Soc., 103:3185 (1981), both incorporated herein by reference, or by other chemical methods using either a commercial automated oligonucleotide synthesizer or VLSIPS™ technology. When oligonucleotides are referred to as “double-stranded,” it is understood by those of skill in the art that a pair of oligonucleotides exist in a hydrogen-bonded, helical array typically associated with, for example, DNA. In addition to the 100% complementary form of double-stranded oligonucleotides, the term “double-stranded,” as used herein is also meant to refer to those forms which include such structural features as bulges and loops, described more fully in such biochemistry texts as Stryer, Biochemistry, Third Ed., (1988), incorporated herein by reference for all purposes.


The term “polynucleotide” refers to a single or double stranded polymer composed of nucleotide monomers. In some embodiments, the polynucleotide is composed of nucleotide monomers of generally greater than 100 nucleotides in length and up to about 8,000 or more nucleotides in length.


The term “polypeptide” refers to a compound made up of a single chain of D- or L-amino acids or a mixture of D- and L-amino acids joined by peptide bonds.


The term “promoter” or “regulatory element” refers to a region or sequence determinants located upstream or downstream from the start of transcription and which are involved in recognition and binding of RNA polymerase and other proteins to initiate transcription. Promoters need not be of bacterial origin; for example, promoters derived from viruses or from other organisms can be used in the compositions, systems, or methods described herein.


The term “recombinant” refers to a human manipulated nucleic acid (e.g. polynucleotide) or a copy or complement of a human manipulated nucleic acid (e.g. polynucleotide), or if in reference to a protein (i.e., a “recombinant protein”), a protein encoded by a recombinant nucleic acid (e.g. polynucleotide). In embodiments, a recombinant expression cassette comprising a promoter operably linked to a second nucleic acid (e.g. polynucleotide) may include a promoter that is heterologous to the second nucleic acid (e.g. polynucleotide) as the result of human manipulation (e.g., by methods described in Sambrook et al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y., (1989) or Current Protocols in Molecular Biology Volumes 1-3, John Wiley & Sons, Inc. (1994-1998)). In another example, a recombinant expression cassette may comprise nucleic acids (e.g. polynucleotides) combined in such a way that the nucleic acids (e.g. polynucleotides) are extremely unlikely to be found in nature. For instance, human manipulated restriction sites or plasmid vector sequences may flank or separate the promoter from the second nucleic acid (e.g. polynucleotide). One of skill will recognize that nucleic acids (e.g. polynucleotides) can be manipulated in many ways and are not limited to the examples above.


The terms “identical” or percent “identity,” in the context of two or more nucleic acids or polypeptide sequences, refer to two or more sequences or subsequences that are the same or have a specified percentage of amino acid residues or nucleotides that are the same (i.e., about 60% identity, preferably 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or higher identity over a specified region when compared and aligned for maximum correspondence over a comparison window or designated region) as measured using the BLAST or BLAST 2.0 sequence comparison algorithms with default parameters described below, or by manual alignment and visual inspection (see, e.g., NCBI web site or the like). Such sequences are then said to be “substantially identical.” This definition also refers to, or may be applied to, the complement of a test sequence. The definition also includes sequences that have deletions and/or additions, as well as those that have substitutions. As described below, the preferred algorithms can account for gaps and the like. Preferably, identity exists over a region that is at least about 10 amino acids or 20 nucleotides in length, or more preferably over a region that is 10-50 amino acids or 20-50 nucleotides in length. As used herein, percent (%) amino acid sequence identity is defined as the percentage of amino acids in a candidate sequence that are identical to the amino acids in a reference sequence, after aligning the sequences and introducing gaps, if necessary, to achieve the maximum percent sequence identity. Alignment for purposes of determining percent sequence identity can be achieved in various ways that are within the skill in the art, for instance, using publicly available computer software such as BLAST, BLAST-2, ALIGN, ALIGN-2 or Megalign (DNASTAR) software. Appropriate parameters for measuring alignment, including any algorithms needed to achieve maximal alignment over the full-length of the sequences being compared can be determined by known methods.
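As a minimal illustration of the percent identity definition above, the following sketch counts identical positions in two sequences that are assumed to have been aligned already, with gaps written as "-". The function name and the choice of denominator (the number of aligned columns) are assumptions made for this example; alignment tools such as BLAST report identity over the aligned region and may use other conventions.

```python
def percent_identity(aligned_ref: str, aligned_test: str) -> float:
    """Percent identity over a pre-computed pairwise alignment.

    Assumption: both strings have equal length and use '-' for gaps;
    the denominator is the number of aligned columns.
    """
    if len(aligned_ref) != len(aligned_test):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_ref:
        raise ValueError("alignment is empty")
    # A column counts as a match only when both residues agree and
    # neither is a gap character.
    matches = sum(
        1 for r, t in zip(aligned_ref, aligned_test)
        if r == t and r != "-"
    )
    return 100.0 * matches / len(aligned_ref)
```

For example, two aligned 4-mers that differ at one position share 75% identity under this convention.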


For sequence comparisons, typically one sequence acts as a reference sequence, to which test sequences are compared. When using a sequence comparison algorithm, test and reference sequences are entered into a computer, subsequence coordinates are designated, if necessary, and sequence algorithm program parameters are designated. Preferably, default program parameters can be used, or alternative parameters can be designated. The sequence comparison algorithm then calculates the percent sequence identities for the test sequences relative to the reference sequence, based on the program parameters.


One example of an algorithm that is suitable for determining percent sequence identity and sequence similarity is the BLAST and BLAST 2.0 algorithms, which are described in Altschul et al. (1990) J. Mol. Biol. 215:403-410, and Altschul et al. (1997) Nuc. Acids Res. 25:3389-3402, respectively. Software for performing BLAST analyses is publicly available through the National Center for Biotechnology Information. This algorithm involves first identifying high scoring sequence pairs (HSPs) by identifying short words of length W in the query sequence, which either match or satisfy some positive-valued threshold score T when aligned with a word of the same length in a database sequence. T is referred to as the neighborhood word score threshold (Altschul et al. (1990) J. Mol. Biol. 215:403-410). These initial neighborhood word hits act as seeds for initiating searches to find longer HSPs containing them. The word hits are extended in both directions along each sequence for as far as the cumulative alignment score can be increased. Cumulative scores are calculated using, for nucleotide sequences, the parameters M (reward score for a pair of matching residues; always >0) and N (penalty score for mismatching residues; always <0). For amino acid sequences, a scoring matrix is used to calculate the cumulative score. Extension of the word hits in each direction is halted when: the cumulative alignment score falls off by the quantity X from its maximum achieved value; the cumulative score goes to zero or below, due to the accumulation of one or more negative-scoring residue alignments; or the end of either sequence is reached. The BLAST algorithm parameters W, T, and X determine the sensitivity and speed of the alignment. The BLASTN program (for nucleotide sequences) uses as defaults a wordlength (W) of 11, an expectation (E) of 10, M=5, N=−4 and a comparison of both strands. For amino acid sequences, the BLASTP program uses as defaults a wordlength of 3, an expectation (E) of 10, and the BLOSUM62 scoring matrix (see Henikoff and Henikoff (1992) Proc. Natl. Acad. Sci. USA 89:10915) alignments (B) of 50, expectation (E) of 10, M=5, N=−4, and a comparison of both strands.
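The extension rule described above can be sketched for nucleotide sequences as follows. This is a simplified, ungapped, rightward-only illustration and not the BLAST implementation: the function name is hypothetical, the seed is assumed to be an exact word match, the reward and penalty use the M=5 and N=−4 values quoted above, and the drop-off value X=20 is an arbitrary placeholder.

```python
def extend_right(query, subject, q_start, s_start, seed_len,
                 match=5, mismatch=-4, x_drop=20):
    """Ungapped rightward extension of a seed hit.

    Returns (best_score, best_length), where best_length counts aligned
    positions from the start of the seed.
    """
    # Score the seed itself (assumed to be an exact word match).
    score = seed_len * match
    best_score, best_len = score, seed_len
    i, j = q_start + seed_len, s_start + seed_len
    length = seed_len
    # Stop at the end of either sequence.
    while i < len(query) and j < len(subject):
        score += match if query[i] == subject[j] else mismatch
        length += 1
        if score > best_score:
            best_score, best_len = score, length
        # Stop when the score has fallen X below its maximum,
        # or has dropped to zero or below.
        if best_score - score >= x_drop or score <= 0:
            break
        i += 1
        j += 1
    return best_score, best_len
```

A full implementation would also extend to the left of the seed and, for amino acid sequences, look up scores in a substitution matrix instead of using fixed match and mismatch values.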


The BLAST algorithm also performs a statistical analysis of the similarity between two sequences (see, e.g., Karlin and Altschul (1993) Proc. Natl. Acad. Sci. USA 90:5873-5877). One measure of similarity provided by the BLAST algorithm is the smallest sum probability (P(N)), which provides an indication of the probability by which a match between two nucleotide or amino acid sequences would occur by chance. For example, a nucleic acid is considered similar to a reference sequence if the smallest sum probability in a comparison of the test nucleic acid to the reference nucleic acid is less than about 0.2, more preferably less than about 0.01.


The phrase “codon optimized” as it refers to genes or coding regions of nucleic acid molecules for the transformation of various hosts, refers to the alteration of codons in the gene or coding regions of polynucleic acid molecules to reflect the typical codon usage of a selected organism without altering the polypeptide encoded by the DNA. Such optimization includes replacing at least one, or more than one, or a significant number, of codons with one or more codons that are more frequently used in the genes of that selected organism.
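A minimal sketch of codon optimization as defined above: each codon is swapped for a synonymous codon preferred by the host, so the encoded polypeptide is unchanged. The standard genetic code table is built from its usual compact form; the preferred-codon dictionary is a hypothetical placeholder and does not represent measured codon usage for any organism. Function names are likewise illustrative.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
_BASES = "TCAG"
_AMINO_ACIDS = (
    "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(_BASES, repeat=3), _AMINO_ACIDS)
}

# Hypothetical host preferences, for illustration only.
EXAMPLE_PREFERRED = {"L": "CTG", "K": "AAA", "G": "GGC"}


def translate(cds):
    """Translate a DNA coding sequence ('*' marks a stop codon)."""
    cds = cds.upper()
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def codon_optimize(cds, preferred):
    """Replace codons with preferred synonymous codons.

    Codons for amino acids absent from `preferred` are left unchanged.
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    # Guard against a preference table that would change the protein.
    for aa, codon in preferred.items():
        if CODON_TABLE.get(codon) != aa:
            raise ValueError(f"{codon} does not encode {aa}")
    return "".join(
        preferred.get(CODON_TABLE[cds[i:i + 3]], cds[i:i + 3])
        for i in range(0, len(cds), 3)
    )
```

Translating the sequence before and after optimization is a simple check that the polypeptide has not been altered.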


Nucleic acid is “operably linked” when it is placed into a functional relationship with another nucleic acid sequence. For example, DNA for a presequence or secretory leader is operably linked to DNA for a polypeptide if it is expressed as a preprotein that participates in the secretion of the polypeptide; a promoter or enhancer is operably linked to a coding sequence if it affects the transcription of the sequence; or a ribosome binding site is operably linked to a coding sequence if it is positioned so as to facilitate translation. Generally, “operably linked” means that the DNA sequences being linked are near each other, and, in the case of a secretory leader, contiguous and in reading phase. However, operably linked nucleic acids (e.g. enhancers and coding sequences) do not have to be contiguous. Linking is accomplished by ligation at convenient restriction sites. If such sites do not exist, the synthetic oligonucleotide adaptors or linkers are used in accordance with conventional practice. In embodiments, a promoter is operably linked with a coding sequence when it is capable of affecting (e.g. modulating relative to the absence of the promoter) the expression of a protein from that coding sequence (i.e., the coding sequence is under the transcriptional control of the promoter).


“Ribosome binding site” or “RBS” is also called the Shine Dalgarno sequence and generally has a sequence complementary to the 3′ terminal of 16S rRNA. The ribosomal binding site is found in bacterial and archaeal messenger RNA, and is generally located about 8 bases upstream of the start codon AUG. In particular, the RBS sequence which appears at high frequency is AGGAGG or AAGGAGG (hereinafter these sequences are referred to as “consensus RBS sequences”), or a sequence homologous with “consensus RBS sequence”. Although these sequences appear at various sites of genes, it is understood that the RBS sequences appear at high frequency in regions upstream of start codons. Also included in the term “RBS” is the RBS sequence from the MphR gene as disclosed herein (“AGAAGG”). Other functional RBS sequences can also be used in place of the specific sequences disclosed herein. When discussing nucleotide mutations in the RBS, the first A is labeled as nucleotide “1” and the final G is labeled as nucleotide “6”. Alternatively, the mutations may sometimes be referred to by their relative position to the ATG start codon. The basic structure of a prokaryote gene consists of a promoter which starts the synthesis of mRNA, a ribosome binding site which participates in the binding between mRNA and ribosomes and in the translation initiation, a start codon, a translation stop codon and a terminator which terminates the synthesis of mRNA. The AUG codon is the most appropriate as a start codon. Since the start codons and coding regions are determined usually based upon a DNA sequence, in the present specification, the sequences of start codons and stop codons and sequences involved in the binding of ribosomes and mRNA are expressed as DNA sequences appropriately as well as RNA sequences, unless mentioned specifically.
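The RBS numbering convention above (first A is nucleotide 1, final G is nucleotide 6 of “AGAAGG”) can be illustrated with a short sketch that lists substitutions in a variant RBS. The function name and the output notation (wild-type base, position, new base, e.g. "A3G") are assumptions made for this example, and only substitutions of equal length are handled.

```python
WT_RBS = "AGAAGG"  # RBS of the MphR gene as given in the text above


def describe_rbs_mutations(variant, wild_type=WT_RBS):
    """List substitutions relative to the wild-type RBS, numbered from 1."""
    variant = variant.upper()
    if len(variant) != len(wild_type):
        raise ValueError("variant must be the same length as the wild type")
    return [
        f"{wt}{pos}{mut}"
        for pos, (wt, mut) in enumerate(zip(wild_type, variant), start=1)
        if wt != mut
    ]
```

Under this notation, changing the MphR RBS to the consensus sequence AGGAGG corresponds to a single substitution at nucleotide 3.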


The term “gene” or “gene sequence” refers to the coding sequence or control sequence, or fragments thereof. A gene may include any combination of coding sequence and control sequence, or fragments thereof. Thus, a “gene” as referred to herein may be all or part of a native gene. A polynucleotide sequence as referred to herein may be used interchangeably with the term “gene”, or may include any coding sequence, non-coding sequence or control sequence, fragments thereof, and combinations thereof. The term “gene” or “gene sequence” includes, for example, control sequences upstream of the coding sequence (for example, the ribosome binding site).


MphR Biosensors


Described herein is a platform technology that comprises genetically-encoded biosensors and methods to create them for detection of a class of small molecules called polyketides. Such biosensors provide a scalable, economic, high-throughput, and broadly applicable means to specifically identify a target polyketide of interest from complex mixtures of molecules. Polyketides are used extensively as drugs to treat human, animal, and plant diseases.


Examples of polyketides include, but are not limited to, macrolides, polyenes, enediynes, and aromatic polyketides. In some embodiments, the polyketide is a macrolide. In some embodiments, the polyketide is a 12-membered macrolide. In some embodiments, the polyketide is a 14-membered macrolide.


Due to their widespread use, polyketides are often produced in bacteria via genetic engineering. Detection of polyketides in microbial hosts remains a significant challenge, however, and this limits the throughput and success of engineering approaches aimed at improving yields of polyketides and accessing new molecules. Thus, the main application of the present invention relates to the production of antibiotics, anticancer drugs, insecticides, anti-parasitics, anti-fungals, anti-cholesterol agents, and immunosuppressants in microbial hosts. Because the biosensors can be employed in a wide variety of contexts, other commercial applications include but are not limited to: (1) discovery of polyketide producing genes from collections of genomes; (2) identification and quantification of polyketide-based drugs, contaminants, and other molecules in environmental, clinical, and other research samples; and (3) isolation or removal of target polyketide compounds from complex mixtures.


The sensor is based on the MphR gene, which encodes a transcription factor. The natural role of wild-type (WT) MphR is to activate the expression of resistance genes in response to binding the polyketide antibiotic, erythromycin A (ErA, FIG. 1A). Upon binding ErA, the MphR protein undergoes a conformational change that causes it to leave its cognate operator DNA sequence, thereby allowing RNA polymerase to transcribe the gene and produce the gene product. By placing the MphR gene sequence and its operator DNA into an artificial vector, MphR can be used to drive the expression of reporter proteins that produce fluorescent, luminescent, or chromogenic signals in the presence of ErA (FIG. 1B). However, compared to ErA, much higher concentrations of other polyketides, even those structurally related to ErA, are required to elicit strong reporter signals using WT MphR (FIG. 3A). Moreover, most polyketides are not detected by WT MphR at all. These features have severely restricted the utility of MphR as a biosensor for high-throughput analysis of polyketides. Disclosed herein is a panel of MphR variants that are utilized for the detection of specific, target polyketides. Such tailored biosensors enable a suite of high-throughput approaches to be applied to the engineering of polyketide biosynthesis in microbes.
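The concentration dependence described above (reporter signal rising with ligand concentration, with some ligands requiring much higher concentrations) is often summarized with a Hill-type dose-response curve. The sketch below is offered only as an illustration: the Hill model itself, the function name, and all parameter values are assumptions and are not taken from this disclosure.

```python
def hill_response(conc, basal, maximum, ec50, hill_n=1.0):
    """Reporter signal predicted by a Hill-type dose-response curve.

    conc and ec50 share the same concentration units; basal and maximum
    are the signals without ligand and at saturating ligand.
    """
    if conc < 0 or ec50 <= 0:
        raise ValueError("conc must be >= 0 and ec50 must be > 0")
    # Fraction of the maximal response reached at this concentration.
    fraction = conc ** hill_n / (ec50 ** hill_n + conc ** hill_n)
    return basal + (maximum - basal) * fraction
```

In this picture, a more sensitive variant has a lower ec50 and therefore gives a stronger signal at the same ligand concentration, while a ligand that is poorly detected corresponds to a high ec50.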


In one embodiment, the operator DNA sequence is 5′-AATATAACCGACGTGACTGTTACATTTAGG-3′ (SEQ ID NO:27).
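
As a minimal sketch, the operator sequence given above can be located on either strand of a construct with a short Python routine. The function name `find_operator` and the toy construct are hypothetical and serve only to illustrate the search; they are not part of the disclosed vectors.

```python
# Operator sequence as given for SEQ ID NO:27 (5' to 3').
MPHR_OPERATOR = "AATATAACCGACGTGACTGTTACATTTAGG"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_operator(construct: str, operator: str = MPHR_OPERATOR):
    """Return (0-based index, strand) for the first operator match, or None.

    The given strand is searched first ("+"), then the reverse
    complement of the operator is searched on the same string ("-").
    """
    construct = construct.upper()
    idx = construct.find(operator)
    if idx != -1:
        return idx, "+"
    idx = construct.find(reverse_complement(operator))
    if idx != -1:
        return idx, "-"
    return None


# Hypothetical toy construct: arbitrary flanks around the operator.
toy = "GGCC" + MPHR_OPERATOR + "TTAA"
hit = find_operator(toy)
```

A search of this kind may be useful as a quick check that a reporter construct still carries an intact operator after cloning.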


The genetically-encoded biosensors described here are unique in several aspects: (1) biosensors that respond to a broad variety of polyketides are not currently known; (2) biosensors that can discriminate between very closely related polyketide structures have not been described; (3) a strategy to engineer the ligand specificity and/or amount of MphR was developed that is efficient, novel, and non-obvious; and (4) other high-throughput analytical methods/tools to detect most polyketides are not available. Accordingly, high-throughput engineering approaches such as directed gene or enzyme evolution and synthetic biology have not been applied to the vast majority of polyketides due to the lack of suitable screening tools. Such strategies are critical to overcome the poor understanding of how to design and construct biosynthetic or chemical routes to new and existing antibiotics. In contrast, the biosensor-guided approach described herein can be applied to engineering the biosynthesis of a broad range of polyketides in potentially any microbial host, and could be generalized to other classes of natural products such as peptides, alkaloids, and terpenes. The invention disclosed herein can enable production of polyketide products rapidly and at lower cost than existing manufacturing routes, thus maximizing the return on investment and providing incentive to develop new antibiotics.


The biosensor platform is simple (consisting of two genes—one encodes the genetically modified MphR gene sequence and the other encodes a marker/reporter gene (for example, GFP) under the control of the MphR responsive promoter), scalable (genetically encoded so that the host microbe synthesizes all the parts), economic, ultra-high-throughput (millions of potential polyketide producing strains can be assayed using the biosensor), and can be easily adapted to target polyketides of interest (directed evolution is a powerful strategy to engineer the ligand specificity of proteins).


MphR is a repressor protein that controls the transcription of a gene cassette responsible for resistance to macrolide antibiotics via phosphorylation of the desosamine 2′-hydroxy group of ErA. Interestingly, MphR is also de-repressed by other macrolide antibiotics, including josamycin, oleandomycin, narbomycin, methymycin and pikromycin. This promiscuity provides a platform for creating tailored MphR variants for applications related to polyketide synthetic biology and directed evolution beyond those offered by the wild-type biosensor. For example, sensors may recognize a wide variety of polyketides, sensors may distinguish biosynthetic intermediates to allow specific detection of the desired mature product, and the binding affinity and dynamic range of a given biosensor can be tailored for specific applications.


In one aspect, disclosed herein is a biosensor system comprising:

    • a nucleic acid encoding a genetically modified MphR gene sequence, wherein the nucleic acid comprises at least one genetic mutation when compared to the wild-type MphR gene sequence; and
    • a reporter gene whose transcription is under the control of a promoter region which is regulated by the MphR transcription factor.


In some embodiments, the biosensor system further comprises a nucleic acid encoding an MphA gene sequence. In some embodiments, the biosensor system further comprises a nucleic acid encoding a portion of the mrx gene. In some embodiments, the biosensor system further comprises a nucleic acid encoding an MphA gene sequence and a portion of the mrx gene.


In one embodiment, the nucleic acid encoding the genetically modified MphR gene sequence and the reporter gene are located on one recombinant DNA vector.


In one embodiment, the reporter gene is a gene coding for chloramphenicol acetyltransferase, beta-galactosidase, luciferase or green fluorescent protein (GFP). In one embodiment, the reporter gene is a gene coding for green fluorescent protein (GFP). In one embodiment, the reporter gene is a gene coding for chloramphenicol acetyltransferase.


In some embodiments, the MphR mutation confers improved sensitivity for detecting erythromycin A. In one embodiment, the MphR genetic mutation encodes a nucleotide change in the ribosome binding site sequence. In one embodiment, the MphR genetic mutation encodes a nucleotide change in the ribosome binding site sequence selected from A1G, A1T, A1C, G2T, G2A, A3C, A3G, A4T, G5T, G6T, or a combination thereof. In one embodiment, the MphR genetic mutation encodes a nucleotide change in the ribosome binding site sequence selected from A1G, A4T, or a combination thereof. In one embodiment, the MphR genetic mutation encodes an A1G nucleotide change in the ribosome binding site sequence. In one embodiment, the MphR genetic mutation encodes an A4T nucleotide change in the ribosome binding site sequence.


In one embodiment, the MphR genetic mutation encodes a nucleotide change in the ribosome binding site sequence selected from A1T, G2T, A3C, or a combination thereof. In one embodiment, the MphR genetic mutation encodes a nucleotide change in the ribosome binding site sequence selected from A1C, G2T, A3G, or a combination thereof. In one embodiment, the MphR genetic mutation encodes a nucleotide change in the ribosome binding site sequence selected from G2A, G5T, or a combination thereof.
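
The ribosome binding site substitutions above follow the usual "wild-type base, position, new base" naming. A minimal sketch of how such names could be applied to a sequence is given here. The six-base wild-type string `AGAAGG` is inferred only from the wild-type letters appearing in the mutation names (A1, G2, A3, A4, G5, G6) and is an assumption for illustration; the helper `apply_rbs_mutations` is likewise hypothetical.

```python
import re

# Wild-type bases at RBS positions 1-6, inferred from the mutation names
# (A1, G2, A3, A4, G5, G6). This string is an assumption for illustration,
# not a sequence quoted from the disclosure.
WT_RBS = "AGAAGG"

_MUTATION = re.compile(r"^([ACGT])(\d+)([ACGT])$")


def apply_rbs_mutations(mutations, wild_type=WT_RBS):
    """Apply substitutions such as 'A4T' to the wild-type RBS string."""
    bases = list(wild_type)
    for name in mutations:
        match = _MUTATION.match(name)
        if match is None:
            raise ValueError(f"unrecognized mutation name: {name}")
        old, pos, new = match.group(1), int(match.group(2)), match.group(3)
        if not 1 <= pos <= len(bases):
            raise ValueError(f"position out of range in {name}")
        # Each name is checked against the wild-type base at that position.
        if wild_type[pos - 1] != old:
            raise ValueError(f"{name} does not match wild-type base")
        bases[pos - 1] = new
    return "".join(bases)


double_mutant = apply_rbs_mutations(["A1G", "A4T"])
```

Validating each name against the wild-type base catches transcription errors in mutation lists before any construct is designed.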


In one embodiment, the MphR genetic mutation encodes the amino acid change selected from T17R, T27G, Q65M, T27A, M59E, M59S, R22H, K35N, T49I, L89V, D98N, E109D, R122T, K132N, A151T, H184Q, or a combination thereof. In one embodiment, the MphR genetic mutation encodes the amino acid change selected from T17R, T27G, Q65M, T27A, M59E, M59S, R22H, K35N, or a combination thereof. In one embodiment, the MphR genetic mutation encodes the amino acid change selected from T49I, L89V, D98N, E109D, R122T, K132N, A151T, H184Q, or a combination thereof.


In some embodiments, the MphR mutation confers improved selectivity for detecting erythromycin A in comparison to other polyketides. In one embodiment, the MphR genetic mutation encodes the amino acid change selected from A16T, T154M, M155K, or a combination thereof. In one embodiment, the MphR genetic mutation encodes an A4T nucleotide change in the ribosome binding site sequence and an amino acid change selected from A16T, T154M, M155K, or a combination thereof.


In some embodiments, the MphR mutation confers improved selectivity for detecting erythromycin A in comparison to structurally similar precursors. In one embodiment, the MphR genetic mutation encodes the amino acid change selected from P4L, W107L, H193R, or a combination thereof.


In some embodiments, the MphR mutation confers improved sensitivity for detecting pikromycin. In one embodiment, the MphR genetic mutation encodes the amino acid change S106F.


In some embodiments, the MphR mutation confers improved sensitivity for detecting narbomycin. In one embodiment, the MphR genetic mutation encodes the amino acid change selected from V33L, A34S, R51C, or a combination thereof.


In some embodiments, the MphR mutation confers improved sensitivity for detecting clarithromycin. In one embodiment, the MphR genetic mutation encodes the amino acid change selected from T49I, L89V, D98N, E109D, or a combination thereof. In one embodiment, the MphR genetic mutation encodes the amino acid change R122T. In one embodiment, the MphR genetic mutation encodes the amino acid change selected from R122T, K132N, A151T, H184Q, or a combination thereof. In one embodiment, the MphR genetic mutation encodes an A4T nucleotide change in the ribosome binding site sequence and an amino acid change selected from R122T, K132N, A151T, H184Q, or a combination thereof.


In one aspect, disclosed herein is a genetically modified host cell comprising: a nucleic acid encoding a genetically modified MphR gene sequence, wherein the nucleic acid comprises at least one genetic mutation when compared to the wild-type MphR gene sequence; and a reporter gene whose transcription is under the control of a promoter region which is regulated by the MphR transcription factor.


In one embodiment, the nucleic acid encoding the genetically modified MphR gene sequence and the reporter gene are located on one recombinant DNA vector.


In one embodiment, the reporter gene is a gene coding for chloramphenicol acetyltransferase, beta-galactosidase, luciferase or green fluorescent protein (GFP). In one embodiment, the reporter gene is a gene coding for green fluorescent protein (GFP). In one embodiment, the reporter gene is a gene coding for chloramphenicol acetyltransferase.


In one embodiment, the cell is E. coli. In one embodiment, the cell is Streptomyces. In one embodiment, the cell is Streptomyces venezuelae. In one embodiment, the cell is Saccharopolyspora erythraea.


In some embodiments, disclosed herein is a genetically modified MphR gene sequence comprising at least one mutation in the nucleotide sequence upstream of the ATG start codon of the MphR gene sequence, wherein the mutation confers increased sensitivity for detection of erythromycin A in comparison to the wild type MphR transcription factor.


In some embodiments, disclosed herein is a genetically modified MphR gene sequence comprising at least one mutation in the ribosome binding site sequence of the MphR gene sequence, wherein the mutation confers increased sensitivity for detection of erythromycin A in comparison to the wild type MphR transcription factor.


In some embodiments, disclosed herein is a genetically modified MphR gene sequence comprising at least one mutation in the MphR protein sequence, wherein the mutation confers increased sensitivity for detection of erythromycin A in comparison to the wild type MphR transcription factor.


In some embodiments, disclosed herein is a genetically modified MphR gene sequence comprising at least one mutation in the MphR protein sequence, wherein the mutation confers increased selectivity for detection of erythromycin A in comparison to other polyketides.


In some embodiments, disclosed herein is a genetically modified MphR gene sequence comprising at least one mutation in the MphR protein sequence, wherein the mutation confers increased selectivity for detection of erythromycin A in comparison to structurally similar precursors.


In some embodiments, disclosed herein is a genetically modified MphR gene sequence comprising at least one mutation in the MphR protein sequence, wherein the mutation confers increased sensitivity for detection of pikromycin in comparison to the wild type MphR transcription factor.


In some embodiments, disclosed herein is a genetically modified MphR gene sequence comprising at least one mutation in the nucleotide sequence upstream of the ATG start codon of the MphR gene sequence, wherein the mutation confers increased sensitivity for detection of pikromycin in comparison to the wild type MphR transcription factor.


In some embodiments, disclosed herein is a genetically modified MphR gene sequence comprising at least one mutation in the ribosome binding site sequence of the MphR gene sequence, wherein the mutation confers increased sensitivity for detection of pikromycin in comparison to the wild type MphR transcription factor.


In some embodiments, disclosed herein is a genetically modified MphR gene sequence comprising at least one mutation in the MphR protein sequence, wherein the mutation confers increased sensitivity for detection of narbomycin in comparison to the wild type MphR transcription factor.


In some embodiments, disclosed herein is a genetically modified MphR gene sequence comprising at least one mutation in the nucleotide sequence upstream of the ATG start codon of the MphR gene sequence, wherein the mutation confers increased sensitivity for detection of narbomycin in comparison to the wild type MphR transcription factor.


In some embodiments, disclosed herein is a genetically modified MphR gene sequence comprising at least one mutation in the ribosome binding site sequence of the MphR gene sequence, wherein the mutation confers increased sensitivity for detection of narbomycin in comparison to the wild type MphR transcription factor.


In some embodiments, disclosed herein is a genetically modified MphR gene sequence comprising at least one mutation in the MphR protein sequence, wherein the mutation confers increased sensitivity for detection of YC-17 in comparison to the wild type MphR transcription factor.


In some embodiments, disclosed herein is a genetically modified MphR gene sequence comprising at least one mutation in the nucleotide sequence upstream of the ATG start codon of the MphR gene sequence, wherein the mutation confers increased sensitivity for detection of YC-17 in comparison to the wild type MphR transcription factor.


In some embodiments, disclosed herein is a genetically modified MphR gene sequence comprising at least one mutation in the ribosome binding site sequence of the MphR gene sequence, wherein the mutation confers increased sensitivity for detection of YC-17 in comparison to the wild type MphR transcription factor.


In one aspect, disclosed herein is a biosensor system comprising:

    • a nucleic acid encoding a genetically modified MphR transcription factor, wherein the nucleic acid comprises at least one genetic mutation when compared to the wild-type MphR gene sequence; and
    • a reporter gene whose transcription is under the control of a promoter region which is regulated by the MphR transcription factor.


In one aspect, disclosed herein is a genetically modified host cell comprising:

    • a nucleic acid encoding a genetically modified MphR transcription factor, wherein the nucleic acid comprises at least one genetic mutation when compared to the wild-type MphR gene sequence; and
    • a reporter gene whose transcription is under the control of a promoter region which is regulated by the MphR transcription factor.


In one aspect, provided herein is a method for detecting a polyketide, comprising:

    • introducing into a cell:
    • i. a nucleic acid encoding a genetically modified MphR transcription factor, wherein the nucleic acid comprises at least one genetic mutation when compared to the wild-type MphR gene sequence; and
    • ii. a reporter gene whose transcription is under the control of a promoter region which is regulated by the MphR transcription factor,
    • and
    • detecting the polyketide based on the differential expression of the reporter gene in comparison to a cell comprising a wild-type MphR transcription factor.


In one aspect, provided herein is a method of screening for genetic mutations in a target gene, comprising:

    • introducing into a cell:
    • i. a nucleic acid encoding a genetically modified MphR transcription factor, wherein the nucleic acid comprises at least one genetic mutation when compared to the wild-type MphR gene sequence; and
    • ii. a reporter gene whose transcription is under the control of a promoter region which is regulated by the MphR transcription factor,
    • introducing at least one mutation into a target gene; and
    • identifying a cell comprising the target gene mutation based on the differential expression of the reporter gene in comparison to a cell comprising the wild-type target gene.


      MphR Biosensors: Methods


In one aspect, provided herein is a method for detecting a polyketide, comprising:

    • introducing into a cell:
    • i. a nucleic acid encoding a genetically modified MphR gene sequence, wherein the nucleic acid comprises at least one genetic mutation when compared to the wild-type MphR gene sequence; and
    • ii. a reporter gene whose transcription is under the control of a promoter region which is regulated by the MphR transcription factor,
    • and
    • detecting the polyketide based on the differential expression of the reporter gene in comparison to a cell comprising a wild-type MphR gene sequence.


In one embodiment, the nucleic acid encoding the genetically modified MphR gene sequence and the reporter gene are located on one recombinant DNA vector.


In one embodiment, the reporter gene is a gene coding for chloramphenicol acetyltransferase, beta-galactosidase, luciferase or green fluorescent protein (GFP). In one embodiment, the reporter gene is a gene coding for green fluorescent protein (GFP).


In some embodiments, the MphR mutation confers improved sensitivity for detecting erythromycin A. In one embodiment, the MphR genetic mutation encodes a nucleotide change in the ribosome binding site sequence. In one embodiment, the MphR genetic mutation encodes a nucleotide change in the ribosome binding site sequence selected from A1G, A1T, A1C, G2T, G2A, A3C, A3G, A4T, G5T, G6T, or a combination thereof. In one embodiment, the MphR genetic mutation encodes a nucleotide change in the ribosome binding site sequence selected from A1G, A4T, or a combination thereof. In one embodiment, the MphR genetic mutation encodes an A1G nucleotide change in the ribosome binding site sequence. In one embodiment, the MphR genetic mutation encodes an A4T nucleotide change in the ribosome binding site sequence.


In one embodiment, the MphR genetic mutation encodes a nucleotide change in the ribosome binding site sequence selected from A1T, G2T, A3C, or a combination thereof. In one embodiment, the MphR genetic mutation encodes a nucleotide change in the ribosome binding site sequence selected from A1C, G2T, A3G, or a combination thereof. In one embodiment, the MphR genetic mutation encodes a nucleotide change in the ribosome binding site sequence selected from G2A, G5T, or a combination thereof.


In one embodiment, the MphR genetic mutation encodes the amino acid change selected from T17R, T27G, Q65M, T27A, M59E, M59S, R22H, K35N, T49I, L89V, D98N, E109D, R122T, K132N, A151T, H184Q, or a combination thereof. In one embodiment, the MphR genetic mutation encodes the amino acid change selected from T17R, T27G, Q65M, T27A, M59E, M59S, R22H, K35N, or a combination thereof. In one embodiment, the MphR genetic mutation encodes the amino acid change selected from T49I, L89V, D98N, E109D, R122T, K132N, A151T, H184Q, or a combination thereof.


In some embodiments, the MphR mutation confers improved selectivity for detecting erythromycin A in comparison to other polyketides. In one embodiment, the MphR genetic mutation encodes the amino acid change selected from A16T, T154M, M155K, or a combination thereof. In one embodiment, the MphR genetic mutation encodes an A4T nucleotide change in the ribosome binding site sequence and an amino acid change selected from A16T, T154M, M155K, or a combination thereof.


In some embodiments, the MphR mutation confers improved selectivity for detecting erythromycin A in comparison to structurally similar precursors. In one embodiment, the MphR genetic mutation encodes the amino acid change selected from P4L, W107L, H193R, or a combination thereof.


In some embodiments, the MphR mutation confers improved sensitivity for detecting pikromycin. In one embodiment, the MphR genetic mutation encodes the amino acid change S106F.


In some embodiments, the MphR mutation confers improved sensitivity for detecting narbomycin. In one embodiment, the MphR genetic mutation encodes the amino acid change selected from V33L, A34S, R51C, or a combination thereof.


In some embodiments, the MphR mutation confers improved sensitivity for detecting clarithromycin. In one embodiment, the MphR genetic mutation encodes the amino acid change selected from T49I, L89V, D98N, E109D, or a combination thereof. In one embodiment, the MphR genetic mutation encodes the amino acid change R122T. In one embodiment, the MphR genetic mutation encodes the amino acid change selected from R122T, K132N, A151T, H184Q, or a combination thereof. In one embodiment, the MphR genetic mutation encodes an A4T nucleotide change in the ribosome binding site sequence and an amino acid change selected from R122T, K132N, A151T, H184Q, or a combination thereof.


In one embodiment, the cell is E. coli. In one embodiment, the cell is Streptomyces. In one embodiment, the cell is Streptomyces venezuelae.


In one aspect, provided herein is a method of screening for genetic mutations in a target gene, comprising:

    • introducing into a cell:
    • i. a nucleic acid encoding a genetically modified MphR gene sequence, wherein the nucleic acid comprises at least one genetic mutation when compared to the wild-type MphR gene sequence; and
    • ii. a reporter gene whose transcription is under the control of a promoter region which is regulated by the MphR transcription factor,
    • introducing at least one mutation into a target gene; and
    • identifying a cell comprising the target gene mutation based on the differential expression of the reporter gene in comparison to a cell comprising the wild-type target gene.


In one embodiment, the reporter gene is a gene coding for chloramphenicol acetyltransferase, beta-galactosidase, luciferase or green fluorescent protein (GFP). In one embodiment, the reporter gene is a gene coding for green fluorescent protein (GFP).


In some embodiments, the MphR mutation confers improved sensitivity for detecting erythromycin A. In one embodiment, the MphR genetic mutation encodes a nucleotide change in the ribosome binding site sequence. In one embodiment, the MphR genetic mutation encodes a nucleotide change in the ribosome binding site sequence selected from A1G, A1T, A1C, G2T, G2A, A3C, A3G, A4T, G5T, G6T, or a combination thereof. In one embodiment, the MphR genetic mutation encodes a nucleotide change in the ribosome binding site sequence selected from A1G, A4T, or a combination thereof. In one embodiment, the MphR genetic mutation encodes an A1G nucleotide change in the ribosome binding site sequence. In one embodiment, the MphR genetic mutation encodes an A4T nucleotide change in the ribosome binding site sequence.


In one embodiment, the MphR genetic mutation encodes a nucleotide change in the ribosome binding site sequence selected from A1T, G2T, A3C, or a combination thereof. In one embodiment, the MphR genetic mutation encodes a nucleotide change in the ribosome binding site sequence selected from A1C, G2T, A3G, or a combination thereof. In one embodiment, the MphR genetic mutation encodes a nucleotide change in the ribosome binding site sequence selected from G2A, G5T, or a combination thereof.


In one embodiment, the MphR genetic mutation encodes the amino acid change selected from T17R, T27G, Q65M, T27A, M59E, M59S, R22H, K35N, T49I, L89V, D98N, E109D, R122T, K132N, A151T, H184Q, or a combination thereof. In one embodiment, the MphR genetic mutation encodes the amino acid change selected from T17R, T27G, Q65M, T27A, M59E, M59S, R22H, K35N, or a combination thereof. In one embodiment, the MphR genetic mutation encodes the amino acid change selected from T49I, L89V, D98N, E109D, R122T, K132N, A151T, H184Q, or a combination thereof.


In some embodiments, the MphR mutation confers improved selectivity for detecting erythromycin A in comparison to other polyketides. In one embodiment, the MphR genetic mutation encodes the amino acid change selected from A16T, T154M, M155K, or a combination thereof. In one embodiment, the MphR genetic mutation encodes an A4T nucleotide change in the ribosome binding site sequence and an amino acid change selected from A16T, T154M, M155K, or a combination thereof.


In some embodiments, the MphR mutation confers improved selectivity for detecting erythromycin A in comparison to structurally similar precursors. In one embodiment, the MphR genetic mutation encodes the amino acid change selected from P4L, W107L, H193R, or a combination thereof.


In some embodiments, the MphR mutation confers improved sensitivity for detecting pikromycin. In one embodiment, the MphR genetic mutation encodes the amino acid change S106F.


In some embodiments, the MphR mutation confers improved sensitivity for detecting narbomycin. In one embodiment, the MphR genetic mutation encodes the amino acid change selected from V33L, A34S, R51C, or a combination thereof.


In some embodiments, the MphR mutation confers improved sensitivity for detecting clarithromycin. In one embodiment, the MphR genetic mutation encodes the amino acid change selected from T49I, L89V, D98N, E109D, or a combination thereof. In one embodiment, the MphR genetic mutation encodes the amino acid change R122T. In one embodiment, the MphR genetic mutation encodes the amino acid change selected from R122T, K132N, A151T, H184Q, or a combination thereof. In one embodiment, the MphR genetic mutation encodes an A4T nucleotide change in the ribosome binding site sequence and an amino acid change selected from R122T, K132N, A151T, H184Q, or a combination thereof.


EXAMPLES

The following examples are set forth below to illustrate the systems, cells, methods, compositions and results according to the disclosed subject matter. These examples are not intended to be inclusive of all aspects of the subject matter disclosed herein, but rather to illustrate representative systems, cells, methods, compositions and results. These examples are not intended to exclude equivalents and variations of the present invention which are apparent to one skilled in the art.


Example 1: MphR Biosensors with Improved Sensitivity for Erythromycin A (ErA)

The sensitivity of biosensors often requires tailoring to meet specific needs. For example, if a certain polyketide is expected to be found inside microbial cells at concentrations between 0 and 100 μM, then a biosensor is required that displays a linear detection response within the same range. The wild-type MphR gene was subjected to a directed evolution approach in order to identify MphR gene mutations and variants with improved sensitivity towards ErA. A library of MphR gene mutations and variants was created by error-prone PCR (epPCR). Because many mutations could lead to misfolded MphR variants or those that do not bind to the operator, flow cytometry was first used to remove variants that are always ‘ON’ in the absence of ligand. Next, individual ‘OFF’ variants were tested in wells of microplates to identify the variants most improved at low concentrations of ErA. Then, using promising individual variants, GFP fluorescence was measured in the presence of varying concentrations of ErA and the data were fit to the Hill equation to provide several parameters for describing selected MphR variants: dynamic range (GFPmax-GFPmin), K1/2 (ligand concentration resulting in half-maximal induction), cooperativity (Hill coefficient), linear range of detection, and Z′-factor (a score of 0.5 or higher indicates an excellent screen). Three variants (H4, A3, and E7) displayed improvements in sensitivity (FIG. 2 and Table 9).
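
The Z′-factor referred to above is conventionally computed from replicate positive- and negative-control wells as Z′ = 1 − 3(σp + σn)/|μp − μn|, where μ and σ are the mean and standard deviation of each control. The following is a minimal sketch of that calculation; the fluorescence readings are hypothetical and are not data from this disclosure.

```python
from statistics import mean, stdev


def z_prime(positive, negative):
    """Z'-factor from replicate control readings (sample standard deviations)."""
    separation = abs(mean(positive) - mean(negative))
    if separation == 0:
        raise ValueError("control means are identical")
    return 1.0 - 3.0 * (stdev(positive) + stdev(negative)) / separation


# Hypothetical GFP readings (arbitrary units): induced vs. uninduced wells.
induced = [58000, 60000, 59000, 61000]
uninduced = [4800, 5200, 5000, 5000]
score = z_prime(induced, uninduced)
```

A wide separation between the control means combined with low well-to-well variation pushes the score toward 1, which is why the statistic is often used to judge whether a plate assay is suitable for screening.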


Additional mutations in the MphR gene sequence that provided increased sensitivity to erythromycin A (ErA) were also identified. The MphR macrolide resistance cassette operates as an analog converter of macrolide concentration to antibiotic resistance, as explained above and elsewhere (Noguchi N, et al. Regulation of Transcription of the mph(A) Gene for Macrolide 2′-Phosphotransferase I in Escherichia Coli; Characterization of the Regulatory Gene mphR(A). Journal of Bacteriology. 2000; 182(18):5052-5058) (Zheng J, et al. Structure and Function of the Macrolide Biosensor Protein, MphR(A), With and Without Erythromycin. Journal of Molecular Biology. 2009; 387(5):1250-60). Refactoring the MphR cassette as a two-plasmid system with a GFP reporter (Gardner L, et al. Photochemical Control of Bacterial Signal Processing Using a Light-activated Erythromycin. Molecular Biosystems. 2011; 7(9):2554-7) created a biosensor capable of detecting a range of macrolides. Previous literature reports various induction ranges for MphR-based biosensors depending on the plasmid construct. Church and coworkers reported K1/2 values of 22 and 97 μM erythromycin A for low and high copy number plasmids, respectively, using a GFP reporter (Rogers J, et al. Nucleic Acids Research. 2015; 43(15):7648-7660). Eberz and coworkers reported an apparent induction range of 0 (minimum luminescence) to 20 (maximum luminescence) μM erythromycin A, with an approximate half-maximal induction at 10 μM, using the LuxABCDE luminescence reporter system (Mohrle V, et al. Anal. Bioanal. Chem. 2007 July; 388(5-6):1117-25). In the experiments conducted herein, a previously reported MphR-based biosensor (MphR-WT) (Gardner L, et al. Photochemical Control of Bacterial Signal Processing Using a Light-activated Erythromycin. Molecular Biosystems. 2011; 7(9):2554-7) had a K1/2 of only 2.73 μM erythromycin A (Table 1) using a GFP reporter.
Error-prone and multi-site saturation mutagenesis of the MphR gene was performed in order to improve sensitivity to erythromycin A.


Plasmid pMLGFP (See FIG. 15 and sequence below) (Gardner L, et al. Photochemical Control of Bacterial Signal Processing Using a Light-activated Erythromycin. Molecular Biosystems. 2011; 7(9):2554-7) containing the MphR gene was utilized to make mutants of the MphR protein. Three- and five-site saturation mutagenesis libraries of the MphR gene that targeted residues of the ligand binding domain were generated using the QuikChange Multi Site-Directed mutagenesis kit (Agilent) and designated QCMS3 and QCMS5, respectively. A third library was generated via error-prone PCR (epPCR) with an average of two amino acid mutations per library clone. Libraries were transformed into E. coli TOP10 cells with plasmid pJZ12 (See FIG. 16 and sequence below) containing genes MphA and mrx and subjected to an initial round of negative sorting in the absence of added ligand via Fluorescence Activated Cell Sorting (FACS) to eliminate variants that constitutively express GFP. Pools of negatively-selected mutants were then plated on LB-agar plates and individual colonies were screened in 96-well microtiter plates in the presence of no ligand and 1 μM erythromycin A. Several clones showed initial improvements in erythromycin A sensitivity versus MphR-WT.
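
The microtiter-plate step described above compares each clone with and without ligand. One way such plate data could be triaged is sketched here, ranking clones by fold induction at 1 μM erythromycin A relative to the wild-type control. The fold-induction criterion, the function `rank_hits`, and all readings are assumptions made for illustration; the selection criterion actually used is not specified in the text.

```python
def rank_hits(readings, wt_name="WT"):
    """Rank clones by fold induction (with ligand / without ligand).

    readings maps clone name -> (gfp_no_ligand, gfp_with_ligand).
    Returns (name, fold) pairs for clones whose fold induction exceeds
    that of the wild-type control, best first.
    """
    folds = {name: on / off for name, (off, on) in readings.items()}
    wt_fold = folds[wt_name]
    hits = [
        (name, fold)
        for name, fold in folds.items()
        if name != wt_name and fold > wt_fold
    ]
    return sorted(hits, key=lambda item: item[1], reverse=True)


# Hypothetical plate data (arbitrary fluorescence units).
plate = {
    "WT": (5000, 10000),
    "clone_1": (5000, 30000),
    "clone_2": (5200, 9000),
    "clone_3": (4000, 16000),
}
hits = rank_hits(plate)
```

Clones that pass a filter of this kind would then be carried forward to full dose-response measurements.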


The best performing clones from each library were selected for further analysis. Dose-response experiments revealed clones with improved performance features compared to MphR-WT for erythromycin A sensitivity (FIG. 11(A-B) and Table 1). The QCMS3, QCMS5, and epPCR libraries all yielded clones with higher sensitivity to low concentrations of erythromycin A, with the greatest improvements coming from the epPCR library. Clone MphR-G76C, containing the mutation G76C in the MphR protein, showed a sensitivity increase that shifted its linear range of detection into nanomolar concentrations, approaching an order of magnitude sensitivity increase versus MphR-WT.









TABLE 1
Biosensor Performance Features for MphR Mutations.

| Clone | K1/2 (μM) | Cooperativity | Dynamic range (GFPmax-GFPmin) | Linear range of detection (μM) |
| --- | --- | --- | --- | --- |
| G76C | 0.42 ± 0.01 | 1.80 ± 0.01 | 59000 | 0.1-0.6 |
| V90I | 0.55 ± 0.01 | 2.84 ± 0.42 | 58600 | 0.1-1 |
| T17R | 0.93 ± 0.03 | 3.16 ± 0.13 | 59300 | 0.3-1 |
| T27G/Q65M | 1.55 ± 0.09 | 2.92 ± 0.17 | 60200 | 0.6-2 |
| T27A/M59E | 1.15 ± 0.09 | 2.59 ± 0.04 | 54800 | 0.1-2 |
| WT | 2.73 ± 0.72 | 4.44 ± 1.52 | 54800 | 0.9-5 |

In Table 1, Hill functions were used to derive biosensor transfer functions. K1/2 is the inducer concentration at half-maximal induction. Cooperativity is derived from the Hill function and indicates cooperative ligand binding between protein monomers of the MphR dimer. Dynamic range is the maximal GFP response minus the minimum GFP response, which in all cases was the response with no ligand. The linear range of detection is the portion of the dose-response curve for which a linear fit gives R2=0.95 or higher.
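As an illustration of how the parameters in Table 1 relate to a dose-response curve, the following is a minimal sketch of a Hill-type transfer function. The function name, the argument names, and the baseline `gfp_min` are assumptions made for illustration (Table 1 reports only the difference GFPmax-GFPmin), and treating the Cooperativity column as the Hill exponent is likewise an assumption. The fitting procedure actually used is not described here.

```python
def hill_response(ligand_um, k_half_um, hill_coeff, dynamic_range, gfp_min=0.0):
    """Hill-type transfer function: reporter signal versus ligand concentration.

    gfp_min is a hypothetical baseline (signal with no ligand); Table 1 lists
    only the dynamic range, GFPmax - GFPmin.
    """
    if ligand_um <= 0:
        return gfp_min
    ratio = (ligand_um / k_half_um) ** hill_coeff
    return gfp_min + dynamic_range * ratio / (1.0 + ratio)


# Parameters for clone G76C and for WT, taken from Table 1.
g76c = dict(k_half_um=0.42, hill_coeff=1.80, dynamic_range=59000)
wt = dict(k_half_um=2.73, hill_coeff=4.44, dynamic_range=54800)

# Compare the modelled responses of the two sensors at a few doses (uM).
for dose in (0.1, 0.42, 1.0, 2.73):
    print(dose, round(hill_response(dose, **g76c)), round(hill_response(dose, **wt)))
```

By construction, the modelled signal at a ligand concentration equal to K1/2 is half of the dynamic range above baseline, which matches the definition of K1/2 given above.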


Importantly, several of these sensors have linear detection ranges capable of detecting titers of erythromycin A heterologously produced in shake-flask E. coli cultures. As this has remained a preferred method for the production of erythromycin A and erythromycin A derivatives resulting from precursor-directed mutasynthesis (Sundermann U, et al. Enzyme-directed Mutasynthesis: a Combined Experimental and Theoretical Approach to Substrate Recognition of a Polyketide Synthase. ACS Chemical Biology. 2013; 8(2): 443-50) or domain-swapping biosynthesis (Jiang M., Pfeifer, B. Metabolic and Pathway Engineering to Influence Native and Altered Erythromycin Production Through E. coli. Metabolic Engineering. 2013; 19:42-9), MphR biosensors can be used in high-throughput approaches to the continued improvement of heterologous erythromycin A biosynthetic engineering.


After further analysis of these clones via DNA sequencing, the ribosome binding sites (RBS) of A3 and E7 were found to be mutated compared to the wild-type MphR sequence. Clone H4 also had mutations in other portions of the sequence and thus was omitted from further analysis here. This suggested that the RBS mutations in these variants, rather than the amino acid changes identified, are responsible for the improved sensitivity to erythromycin. To confirm this, new versions of A3 and E7 were constructed that included either only the RBS mutations or only the amino acid changes of each clone. Subsequent analysis revealed that the RBS mutations alone were responsible for the improvement in sensitivity to erythromycin (FIG. 2; Tables 2 and 3).









TABLE 2
Sensitivity of wild-type MphR and ribosome binding site (RBS)-only mutations towards erythromycin A

| | WT | WT A3-RBS | WT E7-RBS |
| --- | --- | --- | --- |
| K1/2 (μM) | 1.9 ± 0.03 | 0.52 ± 0.02 | 0.64 ± 0.02 |

TABLE 3
Sensitivity of wild-type MphR and amino-acid change-only mutations towards erythromycin A

| | WT-AA | A3-AA | E7-AA |
| --- | --- | --- | --- |
| K1/2 (μM) | 1.9 ± 0.03 | 1.9 ± 0.02 | 2.2 ± 0.03 |


Example 2. Engineering Sensitivity Towards Erythromycin Via Ribosome Binding Site (RBS) Mutagenesis of MphR

The finding that mutations to the ribosome binding site (RBS) of clones A3 and E7 were responsible for modulating sensitivity prompted the inventors to make a dedicated library of RBS mutations to search for biosensors with improved sensitivities. Screening the “smRBS” library and analysis of the best performing clones revealed three clones (see below) with significantly improved sensitivity towards erythromycin. The best clone, smRBS1A1, outperforms each mutant previously described (FIG. 17; Table 4). In addition, the sensitivity of smRBS1A1 towards pikromycin was improved 2-fold compared to the wild-type MphR. Thus, the RBS mutations discovered by screening against erythromycin can impact sensitivity towards other polyketides (FIG. 18; Table 5).









TABLE 4
Sensitivity of smRBS mutants with erythromycin A.

| Clone | RBS | K1/2 (μM) | DR (GFP) | LRD (μM) | Hill coefficient |
| --- | --- | --- | --- | --- | --- |
| MphR-WT | AGAAGGT | 1.88 ± 0.03 | 66000 | 0.9-5 | 3.6 ± 0.3 |
| smRBS1A1 | TTCAGGT | 0.19 ± 0.02 | 66000 | 0.01-0.7 | 1.7 ± 0.1 |
| smRBS1G6 | CTGAGGT | 0.91 ± 0.04 | 64000 | 0.3-2 | 5.4 ± 1.2 |
| smRBS2E1 | AAAGGTT | 1.44 ± 0.08 | 63000 | 0.3-3 | 3.9 ± 0.5 |

‘DR’ is the dynamic range, GFPmax-GFPmin; ‘LRD’ is the linear range of detection.

TABLE 5
E7-RBS, smRBS1A1, pikB1, and WT with pikromycin

| Clone | K1/2 (μM) | Hill coefficient | Dyn. Range (RFU) |
| --- | --- | --- | --- |
| WT | 97 ± 2 | 2.9 ± 0.3 | 26800 ± 400 |
| E7-RBS | 50 ± 20 | 2.3 ± 0.1 | 40000 ± 5000 |
| smRBS1A1 | 48 ± 5 | 2.5 ± 0.2 | 37000 ± 6000 |
| pikB1 | 0.81 ± 0.02 | 1.8 ± 0.2 | 64000 ± 2000 |
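The fold improvements in sensitivity discussed in this example can be checked directly from the K1/2 values in Tables 4 and 5. The short sketch below does this arithmetic; the helper name `fold_improvement` is illustrative only and is not part of the disclosure.

```python
def fold_improvement(k_half_reference, k_half_variant):
    """Ratio of reference K1/2 to variant K1/2; values above 1 mean the variant is more sensitive."""
    return k_half_reference / k_half_variant


# K1/2 values (uM) from Table 4 (erythromycin A) and Table 5 (pikromycin).
ery_fold = fold_improvement(1.88, 0.19)  # MphR-WT vs smRBS1A1
pik_fold = fold_improvement(97, 48)      # WT vs smRBS1A1

print(f"erythromycin A: {ery_fold:.1f}-fold")  # prints "erythromycin A: 9.9-fold"
print(f"pikromycin: {pik_fold:.1f}-fold")      # prints "pikromycin: 2.0-fold"
```

The pikromycin ratio is consistent with the 2-fold improvement stated in the text of this example.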









Example 3. MphR Biosensors with Improved Selectivity Towards ErA

In many cases, it is necessary to determine the presence and concentration of a given polyketide in the presence of other structurally related molecules. Accordingly, the selectivity of MphR requires tailoring towards target molecules. To test the capacity of random mutations to alter the ligand specificity of MphR, the initial goal was to find variants that were more selective with erythromycin A compared to clarithromycin, azithromycin, and roxithromycin. A library of MphR gene mutations and variants was created by error-prone PCR (epPCR) and flow cytometry was first used to remove variants that are always ‘ON’ in the absence of erythromycin A and the presence of clarithromycin and azithromycin. Next, individual ‘OFF’ variants were tested in wells of microplates to identify the variants most improved at low concentrations of erythromycin A. Thus, some of the ‘OFF’ library members were duplicated and each screened in the presence of erythromycin A or a mixture of clarithromycin, azithromycin, and roxithromycin. Several variants were not activated by clarithromycin, azithromycin, and roxithromycin but were strongly activated by erythromycin A (FIG. 3). One variant, M2D6, was chosen for quantitative analysis, which confirmed that the ligand specificity of this variant was very different from that of the WT MphR (FIG. 3 and Table 11).


To confirm previous reports of the broad inducer tolerance of the MphR biosensor (Eberz 2007), erythromycin A and several clinically useful semi-synthetic macrolides were screened versus MphR-WT. In liquid culture, dose-dependent MphR-WT activations for erythromycin A (compound 1), clarithromycin (compound 2), azithromycin (compound 3), and roxithromycin (compound 4) were obtained (FIG. 12) and the induction parameters with each compound were compared (Table 6).


Clarithromycin is a semi-synthetic analog of erythromycin A that differs by a single methoxy group in place of a hydroxyl group at the C-6 carbon of the polyketide core macrolactone. Azithromycin is an erythromycin analog synthesized by an oxime-mediated nitrogen insertion and ring expansion at C-9 of the polyketide backbone. Roxithromycin replaces the C-9 ketone of erythromycin A with an imine-linked polyether. Clarithromycin, azithromycin and roxithromycin are semi-synthetic products of microbially produced erythromycin A. Distinction between erythromycin A and these modified analogs has thus far relied on inherently low-throughput techniques such as LC-MS, HPLC and NMR.


Biosensors capable of selective detection of specific macrolides from laboratory, industrial or environmental samples are useful in improving biotransformations, increasing final titers by detecting biosynthetic bottlenecks, and identifying macrolide contaminants.


Clone MphR-A16T/T154M/M155K (Clone M2D6) demonstrated exceptional selectivity for erythromycin A versus the three semi-synthetic analogs. Dose-response analysis revealed that MphR-A16T/T154M/M155K maintained a K1/2 of 5.54 μM for erythromycin A, but displayed little to no activation by clarithromycin, azithromycin and roxithromycin. As summarized in Table 6 and FIG. 12, MphR-A16T/T154M/M155K proved to be a much more selective biosensor than its wild-type counterpart with the compounds tested.









TABLE 6
K1/2 values of MphR-WT and MphR-A16T/T154M/M155K with erythromycin A, clarithromycin, azithromycin and roxithromycin.

| K1/2 (μM) | 1 | 2 | 3 | 4 |
| --- | --- | --- | --- | --- |
| WT | 2.03 ± 0.10 | 2.69 ± 0.14 | 0.60 ± 0.02 | 67.16 ± 3.41 |
| A16T/T154M/M155K | 5.54 ± 0.53 | 20.10 ± 0.28 | N.C. | N.C. |

In Table 6, each column is headed by the number of the compound to which its K1/2 values correspond (erythromycin A (1), clarithromycin (2), azithromycin (3) and roxithromycin (4)). MphR-A16T/T154M/M155K demonstrated much higher selectivity for erythromycin A versus its semi-synthetic counterparts compared to the wild-type biosensor.


MphR-A16T/T154M/M155K's ability to discriminate between closely related compounds that differ structurally by as little as a methyl substituent demonstrates the power of mutagenesis and high-throughput screening (HTS) for developing tailored biosensors. Biosensors with specific ligand activation selectivities as demonstrated here are useful tools for monitoring reaction conversions in the production of erythromycin A analogs and in screening environmental samples for specific macrolide contaminants.


The RBS mutations from the erythromycin-sensitive variant E7 were transferred to the MphR variant M2D6, which was previously engineered to be specific for erythromycin A. This new variant, MphR M2D6-E7RBS, displayed 2-fold enhanced sensitivity towards erythromycin A, but with negligible change in sensitivity towards semi-synthetic derivatives (analogues) (FIG. 3; Table 7).









TABLE 7
E7RBS-M2D6 compared to WT and M2D6

| Ligand | Clone | K1/2 (μM) | Dynamic range | Selectivity (K1/2ErA/K1/2analogue) |
| --- | --- | --- | --- | --- |
| Erythromycin (ErA) | WT | 1.98 | 67000 | |
| Erythromycin (ErA) | M2D6 | 4.84 | 39000 | |
| Erythromycin (ErA) | M2D6-E7RBS | 2.63 | 49000 | |
| Clarithromycin | WT | 2.00 | 64000 | 0.99 |
| Clarithromycin | M2D6 | 21.51 | 7000 | 0.23 |
| Clarithromycin | M2D6-E7RBS | 12.67 | 16000 | 0.21 |
| Azithromycin | WT | 0.60 | 28000 | N.C. |
| Azithromycin | M2D6 | N.C. | 0 | N.C. |
| Azithromycin | M2D6-E7RBS | N.C. | 0 | N.C. |
| Roxithromycin | WT | 74.08 | 32000 | N.C. |
| Roxithromycin | M2D6 | N.C. | 0 | N.C. |
| Roxithromycin | M2D6-E7RBS | N.C. | 0 | N.C. |
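The selectivity values in Table 7 follow from the definition given in its header, K1/2 for ErA divided by K1/2 for the analogue. The sketch below recomputes the clarithromycin entries from the tabulated K1/2 values; the function and variable names are illustrative assumptions, not part of the disclosure.

```python
def selectivity(k_half_era, k_half_analogue):
    """Selectivity as defined in Table 7: K1/2 (ErA) divided by K1/2 (analogue)."""
    return k_half_era / k_half_analogue


# (ErA K1/2, clarithromycin K1/2) in uM, from Table 7.
clones = {
    "WT": (1.98, 2.00),
    "M2D6": (4.84, 21.51),
    "M2D6-E7RBS": (2.63, 12.67),
}

for name, (era, clari) in clones.items():
    print(name, round(selectivity(era, clari), 2))
```

Under this definition, values below 1 indicate that a higher concentration of the analogue than of ErA is needed to reach half-maximal induction.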









Example 4. Biosensors for Detection of Macrolide Glycosylation

The ability of MphR or variants thereof to discriminate between closely related polyketides provides opportunities to report the activity of enzymes which catalyze the transformation of a polyketide not detected by MphR into a product that is detected by MphR. For example, MphR may specifically recognize the sugar residues attached to detected polyketides. Thus, MphR likely does not detect the corresponding aglycones. To test this, the aglycone 6-deoxyerythronolide B (6dEB) was produced via an engineered E. coli strain and purified by flash chromatography. The identity of the compound was confirmed by comparison of the 13C/1H-NMR spectral data to that published, by high-resolution mass analysis (6dEB calc. [M+Na]+ m/z=409.25664; 6dEB obs. [M+Na]+ m/z=409.25525), and by comparison to authentic biosynthetic and synthetic standards. Next, the ability of 6dEB to activate GFP expression under control of WT MphR was tested. As predicted, the aglycone failed to activate GFP expression, whereas the corresponding glycoside erythromycin A is a good activator (FIG. 4). To extend this to other systems, the ability of MphR to detect macrolide antibiotics from S. venezuelae was examined. The mono-glycosylated 12-membered macrolide YC-17 was detected by WT MphR whereas its corresponding aglycone (10-deoxymethynolide, 10-DML) was not (FIG. 4). Because the only structural difference between YC-17 and 10-DML is the desosamine sugar, these data confirm the ability of MphR to report macrolactone glycosylation. MphR libraries were also screened in the presence of YC-17 to identify variants that could detect the macrolide at lower concentrations than WT MphR. Indeed, one particular mutant detected YC-17 at concentrations up to 100-fold lower than that of the WT MphR while maintaining the same dynamic range as the WT sensor (FIG. 4).
Whereas the desosamine moiety is likely a specificity-conferring factor for MphR, it is clear that directed evolution can be used to alter the ligand specificity of MphR towards otherwise poorly detected macrolides. These methods can be used for directed evolution to expand the recognition capabilities of MphR towards other sugar residues.
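The agreement between the calculated and observed [M+Na]+ values reported above for 6dEB can be expressed as a mass error in parts per million. The following sketch shows the arithmetic; the function name is an illustrative assumption.

```python
def ppm_error(observed_mz, calculated_mz):
    """Mass accuracy in parts per million (negative when observed is below calculated)."""
    return (observed_mz - calculated_mz) / calculated_mz * 1e6


# [M+Na]+ values for 6dEB quoted in the text of this example.
error = ppm_error(409.25525, 409.25664)
print(f"{error:.1f} ppm")  # prints "-3.4 ppm"
```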


Example 5. Expanding the Synthetic Scope of Polyketide Glycosylation Machinery by Directed Evolution

The stringent substrate specificity of natural product glycosyltransferases (GTs) severely restricts the scope of polyketide glycodiversification strategies. Directed evolution is used to expand the specificity of macrolide GTs. The specificity of MphR towards desosaminylated macrolides can be leveraged as a sensor to report glycosylation and identify GT variants with improved activity and substrate specificity. Libraries of GT variants can be challenged with diverse substrates and screened via the MphR biosensor. By testing the function of many GT variants using MphR, potentially any GT can be engineered. The described methods can produce variant GTs with broad specificities beyond those originally screened for, and provide new tools for glycoside synthesis and a new approach for engineering natural product GTs.


Anthracyclines (e.g. doxorubicin), enediynes (e.g. calicheamicin), avermectins (e.g. avermectin B1a), polyenes (e.g. nystatin A1), and perhaps most notably, macrolides are examples of glycosylated polyketides. The sugars of macrolide antibiotics such as erythromycin A are absolutely essential for the ability of macrolides to inhibit protein synthesis at the ribosome and the corresponding aglycone is not an effective antibiotic. In fact, altering the glycosylation pattern of macrolides can even change the biological activity from antimicrobial to anti-viral or anti-parasitic. Glycosylated polyketides have also been used as probes to perturb biological function. Classical chemical approaches for the synthesis of glycoconjugates are challenging since regio- and stereochemical control of glycosidic linkage formation requires multiple protection/deprotection steps, typically resulting in poor yields. On the other hand, biosynthetic approaches for glycoconjugate synthesis are an attractive alternative to traditional chemical synthesis, since enzymes are usually highly regio- and stereoselective and do not require complex protection strategies. Moreover, approaches that involve enzymes are particularly promising given the potential to produce multi-gram scale quantities of natural products via bacterial fermentation, at low cost, and with minimal use of organic solvents. Accordingly, biosynthetic pathways responsible for the synthesis of glycosylated polyketides have been intensively investigated as tools for the production of glycosides. Glycosylation, which is often rate limiting, is achieved through the transfer of a sugar moiety from an activated glycosyl-donor, usually in the form of a nucleotide diphosphate (NDP)-sugar, and is catalyzed by glycosyltransferases (GTs) (FIG. 10(A)). The GT and the genes required for production of the NDP-sugar are frequently grouped together in a module within the gene cluster (FIG. 10(B)).
Conveniently, the polyketide synthase (PKS) genes are usually also grouped together (FIG. 10(B)). This convenient (yet superficial) modularity of biosynthetic pathways lends itself to the ‘design-build-test’ mantra of synthetic biology. Thus, mixing and matching various NDP-sugar pathways and GTs between heterologous or native hosts has been explored in an effort to produce non-natural hybrid natural product glycosides. Perhaps the most potentially versatile combinatorial biosynthesis strategy in this respect involves feeding aglycones into a heterologous host that is engineered to express a non-native GT and the enzymes for synthesis of a non-native NDP-sugar (FIG. 10(C)). This takes advantage of fast-growing, genetically tractable heterologous hosts such as E. coli. Yet, most hybrid glycosylation pathways suffer from poor bioconversion yields and limited substrate scope. For example, an engineered Streptomyces venezuelae system, in which a non-native TDP-olivose biosynthesis pathway was introduced, produced <10% yield of the desired glycosides after aglycone feeding to the culture. The key factor limiting the scope and efficiency of engineered glycosylation pathways is the poor activity and narrow substrate scope of natural product GTs. In fact, only a small number of GTs display substrate specificity sufficiently broad for generating libraries of glycosides. Moreover, GTs can be remarkably sensitive to relatively minor structural modifications to both the aglycone and NDP-sugar. The specificity of the macrolide GT DesVII (along with its required accessory protein, DesVIII) exemplifies this major limitation (FIG. 10(D)). The relatively large number of GT crystal structures that are now available has proven insufficient to enable rational redesign of GT substrate specificity. Thus, the molecular determinants that control substrate specificity are unknown.
This is particularly frustrating given the structural modularity of natural product GTs whereby the N- and C-terminal domains of GTs house the acceptor and NDP-donor binding sites, respectively. These domains could be exchanged between various GTs to construct chimeric enzymes for the synthesis of hybrid glycosides. However, this has yet to be realized, likely due to the poor understanding of inter-domain communication and catalysis in GTs. Directed evolution offers an opportunity to overcome these limitations (FIG. 10(C)). However, macrolide GTs have yet to be engineered by directed evolution or rational redesign. The closest example involved engineering the oleandomycin GT OleD by screening the ability of OleD mutants to glucosylate 4-methylumbelliferone. Activity/specificity towards macrolides was not and could not be targeted in this study. The critical issue is the lack of high-throughput screens/selections for polyketide GTs. The current methods disclose how to utilize genetically modified MphR for screening libraries of GT variants for production of polyketide glycosides. Non-limiting examples of these MphR biosensors are disclosed herein.


Example 6. Biosensors for Detection of Erythromycin A C6 O-Methylation

Erythromycin A is one of the most widely prescribed macrolide antibiotics. Yet, its poor bioavailability and limited spectrum of activity have spurred tremendous efforts to alter the structure of erythromycin A and have resulted in the development of several generations of novel antibiotics. For example, the second generation macrolide antibiotic 6-O-methylerythromycin (clarithromycin, FIG. 5(A)) has been remarkably successful due to its enhanced antibacterial activity, improved pharmacokinetic properties, and expanded spectrum of activity. Unfortunately, like other 14-membered macrolides, clarithromycin has poor activity against macrolide-resistant bacteria. Newer generation macrolides such as solithromycin (See FIG. 7) may address the problem of resistance but also depend on the 6-O-methylation for activity. The simple C6 O-methylation of erythromycin A prevents hemi-ketal formation with the C9-ketone in the acidic environment of the stomach. However, this simple semi-synthetic modification requires six steps to transform erythromycin A to clarithromycin (FIG. 5(A)). The industrial process for production of clarithromycin therefore involves microbial fermentation of erythromycin A, extraction, and chemical synthesis. The methods described herein are used to provide an engineered microbial strain that produces clarithromycin directly, resulting in a faster, cheaper, and “greener” world supply of this pharmaceutical. Moreover, such a production strain could be coupled with other biosynthetic transformations to rapidly produce new clarithromycin analogues for further drug discovery efforts.


For example, an O-methyltransferase (OMT) could afford clarithromycin in a single step from erythromycin A (FIG. 5(A)). OMTs are a diverse group of enzymes distributed throughout all domains of life and catalyze a simple SN2-like substitution using the cofactor S-adenosyl-L-methionine (SAM). The diverse target substrates of OMTs include nucleotide-sugars, carboxylic acids, phenols, and natural products. Yet, there are no known examples of OMTs that methylate the C6-hydroxyl group of erythromycin A. However, many OMTs target hydroxyls of sugar residues on polyketides and macrolides (FIG. 5(B)). Indeed, methylation of the cladinose residue of erythromycin A is catalyzed by EryG, an OMT from the erythromycin A gene cluster (FIG. 5(A)). Although some OMTs can methylate several positions, most OMTs seem to be regioselective with respect to the acceptor hydroxyl. Thus, example approaches to an OMT for the conversion of erythromycin A to clarithromycin are to engineer the regioselectivity of EryG or manipulate the substrate specificity of another candidate. In support of this, natural product OMTs, including macrolide OMTs, are known to display acceptor promiscuity (a good starting point for directed evolution), and the specificity of OMTs has been changed. Moreover, the regioselectivity of phenylpropanoid and flavone OMTs has been altered via site-directed mutagenesis, iterative saturation mutagenesis, and error-prone PCR. Notably, although there are >50 structures of OMTs in the Protein Data Bank (PDB), many with bound SAM, only a few include the bound acceptor, thus precluding the effective use of structural based approaches to OMT redesign. The recently described structures of two OMTs involved in the biosynthesis of mycinamicin (FIG. 5(B)) correctly predicted that these OMTs use alternative macrolides and also enabled relaxation of specificity via mutagenesis. 
These demonstrations cumulatively highlight additional examples of engineering the regio- and substrate specificity of OMTs.


A genetic selection to identify OMT variants from large combinatorial libraries of OMT mutants can be used. Directed evolution and selections are known strategies for dramatically altering enzyme regio- and substrate specificity. The key challenge is that screening/selection methods with the requisite throughput or general applicability are not available for natural product OMTs. There are no reported ultra-high-throughput screens for methyltransferases. Most polyketides are not chromophores or fluorophores and do not offer a spectrophotometric change upon methylation that could be monitored. Moreover, methylation typically does not provide a suitable phenotype that can be leveraged for a screen or selection. Mass spectrometry is suitable for screening relatively small libraries of variants when the requisite instrumentation and expertise are available. Regardless, the ability of high-throughput mass spectrometry to quantify polyketides in complex mixtures and to distinguish congeners is unproven. Moreover, identification of suitable OMTs for the biosynthesis of clarithromycin might require the ability to screen hundreds of thousands of variants (if not more), a throughput that is well out of the range of liquid chromatography. To address this need, an MphR sensor is generated that is activated by clarithromycin but not erythromycin A. Given that OMT libraries expressed in E. coli are fed with erythromycin A, and E. coli is not able to modify the structure of erythromycin A, the sensor must be selective for clarithromycin in the presence of erythromycin A, and the MphR reporter signal should be low (ideally zero) in the presence of erythromycin A.


Directed evolution has been used here to alter the ligand specificity of MphR. A library of MphR variants was created by error-prone PCR (epPCR). Reasoning that many mutations could lead to misfolded variants or those that do not bind to the operator, and that variants are required that are not activated by ErA, fluorescence-activated cell sorting (FACS) was first used to remove those variants that were constitutively ‘ON’ in the presence of ErA. To test the capacity of random mutations to alter the ligand specificity of MphR, the initial goal was to find variants that were more selective for clarithromycin compared to erythromycin A. Thus, some of the ‘OFF’ library members were duplicated and each screened in the presence of clarithromycin and erythromycin A. Several variants were identified that showed higher GFP reporter signals in the presence of clarithromycin compared to erythromycin A. One particular clone, “M1B10” (comprising amino acid changes T49I, L89V, D98N, E109D), was selected for further analysis. GFP fluorescence was measured in the presence of varying concentrations of erythromycin A or clarithromycin (0.1-150 μM) and showed that the selectivity of this MphR variant was now shifted towards clarithromycin. For example, at 10 μM ligand, the fluorescence response with clarithromycin is 10-fold higher than with erythromycin A (FIG. 6). Remarkably, the dynamic range (GFPmax-GFPmin) of M1B10 is still ˜50% that of the WT MphR.


MphR M1B10 was subsequently replaced by the variant “M9C4.” MphR WT was subjected to structure-guided mutagenesis (R122T mutation), followed by error-prone PCR on the R122T mutant, yielding the variant “M9C4”. This variant is the most clarithromycin/erythromycin selective biosensor reported to date. At 10 μM ligand, the fluorescence response with clarithromycin is 29-fold higher than with erythromycin A. The RBS of the variant E7 was included (E7_M9C4), further improving sensitivity (FIG. 19; Table 8). The sensitivity of M9C4 was tested using mixtures (e.g. 0:10 through 10:0) of ErA/clarithromycin at a fixed total concentration of 10 μM. The data showed that M9C4 could be used to determine the concentration of clarithromycin in the presence of erythromycin A (ErA) in the linear range of 0-10 μM, whereas the WT biosensor was not effective (FIG. 26).









TABLE 8
M9C4 clarithromycin specific biosensor

| Mutation | Ligand | Dynamic range (RFUmax-RFUmin) | K1/2 (μM) | Selectivity (K1/2ErA/K1/2Clarithromycin) | Hill coefficient |
| --- | --- | --- | --- | --- | --- |
| WT | ErA | 52125 | 1.51 | 0.92 | 3.52 |
| | Clari | 52749 | 1.64 | | 2.30 |
| R122T | ErA | 3666 | 47.09 | 1.94 | 2.39 |
| | Clari | 5751 | 24.22 | | 3.03 |
| M9C4 | ErA | 11342 | 68.32 | 6.74 | 2.03 |
| | Clari | 33326 | 10.14 | | 1.49 |
| E7_M9C4 | ErA | 15318 | 29.33 | 6.01 | 1.95 |
| | Clari | 46345 | 4.88 | | 1.49 |
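The mixture experiment described in this example (ErA/clarithromycin ratios from 0:10 through 10:0 at a fixed total of 10 μM) can be laid out as a simple dilution series. The sketch below generates such a series; the choice of eleven evenly spaced ratios and the function name are assumptions, since the exact ratios tested are not listed.

```python
def mixture_series(total_um=10, steps=10):
    """Return (ErA, clarithromycin) concentrations in uM for ratios 0:steps through steps:0."""
    series = []
    for era_parts in range(steps + 1):
        era = total_um * era_parts / steps
        clari = total_um * (steps - era_parts) / steps
        series.append((era, clari))
    return series


# Every pair sums to the fixed total, so only the ligand ratio changes between wells.
for era, clari in mixture_series():
    print(f"ErA {era:g} uM + clarithromycin {clari:g} uM")
```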









Example 7. Identification of Enzymes for Synthesis of Clarithromycin

The objective here is to utilize MphR variants that recognize semi-synthetic polyketide analogues to identify enzymes for their chemo-enzymatic synthesis. MphR-based sensors can be used to identify and enrich novel polyketide tailoring enzymes by sensing the production of the desired product in vivo. An MphR variant specific for 6-O-methylerythromycin (clarithromycin) is generated and in vivo selections are performed to identify novel O-methyltransferases (OMTs) that enable the in vivo production of this valuable semi-synthetic derivative. Such enzymatic activity is difficult or impossible to identify without a genetically encoded biosensor and this approach could afford an array of other semi-synthetic derivatives.


Several candidate OMTs have been identified for directed evolution. EryG is a candidate given it already recognizes the desired substrate, albeit in a different conformation than required. EryG has been expressed in E. coli and displays some macrolide promiscuity. Given a crystal structure for EryG is not available, Phyre2 and I-TASSER were used to generate homology models. The conserved SAM-binding site was identified by Phyre2 and I-TASSER, while the putative macrolide-binding site was identified by comparison to known OMT sequences and acceptor-bound structures (FIG. 9(A)). Furthermore, the server CAVER predicted a cavity that agreed with a manual approach (FIG. 9(A)). DnrK is an OMT involved in daunorubicin biosynthesis (FIG. 9(B)). The structure shows that the large hydrophobic acceptor substrate binds into a deep hydrophobic binding pocket (FIG. 9(C)). The fact that (1) hydrophobic binding pockets often render enzymes highly evolvable, (2) DnrK uses a simple proximity driven mechanism, and (3) the acceptor-binding site is known, makes DnrK a candidate for redesign. Finally, the MycF structure shows that the macrolactone is located in a hydrophobic region at the opening of the active site funnel and makes no specific contacts with MycF (FIG. 9(D)). Consistent with this, MycF has been shown to display macrolactone promiscuity.


With a clarithromycin-sensor in place, approaches for the discovery of novel OMT activity using EryG, MycF, and DnrK as scaffolds can be pursued. epPCR libraries of these enzymes are generated in addition to multi-site saturation mutagenesis at residues lining each acceptor-binding pocket (FIG. 9(A)-(D)). Mutation rates as high as 3-4 amino acid mutations per gene and multi-site saturation of 6-7 simultaneous residues can be searched using MphR-based selections. Given the breadth of OMT acceptor substrates and variety of catalytic mechanisms, the sequences of most OMTs are highly divergent, even though most OMTs belong to the same superfamily of SAM-dependent MTs and share similar overall topologies. Thus, SCHEMA structure-guided recombination to prepare protein chimera libraries from all three scaffolds can be used. Initial candidate OMTs could support conversion of μM concentrations of clarithromycin in the timeframe of a culture growth and this feature was used to drive the evolution of MphR variants with the requisite selectivity and sensitivity. The gfp reporter gene of the current MphR plasmid system is replaced with a selection marker (e.g. chloramphenicol). The elegance of the in vivo biosensor is that the OMT selection process is made more selective simply by decreasing the concentration of clarithromycin. Thus, once activity of an OMT variant is identified that exceeds the activation threshold for the sensor, this variant is used to parent the next library and is subjected to selection using a lower (sub-activating) concentration of clarithromycin and/or less incubation time. Thus, each round enriches OMTs with better kcat and/or Km.
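To give a sense of scale for the multi-site saturation libraries mentioned above, the sketch below counts the possible variants when 6 or 7 positions are randomized at once. The use of NNK degenerate codons (32 codons per site) is an assumption made for illustration, since the codon scheme is not stated; the function names are likewise hypothetical.

```python
def protein_variants(sites, amino_acids=20):
    """Number of distinct amino acid combinations at `sites` saturated positions."""
    return amino_acids ** sites


def dna_library_size(sites, codons_per_site=32):
    """Number of distinct DNA sequences for `sites` randomized codons.

    codons_per_site=32 corresponds to NNK degenerate codons, an assumption here.
    """
    return codons_per_site ** sites


for n in (6, 7):
    print(n, protein_variants(n), dna_library_size(n))
```

Six saturated sites already correspond to 6.4 × 10^7 amino acid combinations, which is consistent with the emphasis above on genetic selections rather than plate-based screening for libraries of this kind.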


Once activity is isolated and sufficiently robust to achieve in vivo conversion, OMT variants are expressed and purified for biochemical characterization. A genetic selection could enrich OMTs that methylate the C6-OH of erythromycin A, but also other hydroxyl groups. Thus, HPLC-ELSD coupled with MS is used to determine if other products are present. However, other regiospecificities could prove useful sources of new products. Once regiospecificity of the OMT is established, full characterization (e.g. kcat, Km, stability) is determined by HPLC-ELSD, using erythromycin A, SAM, and clarithromycin as a product standard. Moreover, SAM-analogues are utilized to determine whether the evolved OMTs can be used to alkyl-diversify macrolides.


Example 8. Biosensors for Production of an Advanced Solithromycin Precursor

Cempra, Inc (Chapel Hill) have completed Phase III clinical trials for solithromycin and a New Drug Application (NDA) is in progress for the treatment of community-acquired bacterial pneumonia. Solithromycin is chemically synthesized via a lengthy 19-step sequence of reactions (FIG. 7). To streamline the synthesis of this promising new antibiotic, an engineered biosensor can provide the advanced precursor I by simple microbial fermentation, in one step, using a known enzymatic pathway (FIG. 7). The precursor is then converted to solithromycin via a proposed chemo-enzymatic route (6 steps) or by known organic chemistry (11 steps), thus eliminating up to 10 chemical steps. Crucially, providing I biosynthetically circumvents some of the most inefficient chemistry (installing the double bond). The logic of polyketide biosynthesis is understood, such that an artificial biosynthetic pathway for I has been designed based on validated genetic modifications to the biosynthetic gene cluster for erythromycin A (FIG. 8). Yet, such modified pathways usually produce low product titers insufficient for large-scale fermentation. A biosensor for detection of I would enable screening many thousands of enzyme/pathway variants for production of I (FIG. 7).


Precursor I can be produced in an E. coli strain because: (1) a plasmid system for expressing entire polyketide gene clusters in E. coli can be used and has been demonstrated to support erythromycin A production; (2) suitable E. coli strains for expression of such genes, including BAP1, can be used; and (3) the natural production host cannot provide the growth speed, technical amenability, and scalability offered by E. coli. Additionally, the necessary genetic manipulations in E. coli can be performed by those skilled in the art.


The artificial pathway is constructed in pieces via commercial gene synthesis and inserted into E. coli BAP1. The prototype strain is tested by examining I in lysed cells and/or culture supernatant directly by LC-MS analysis. Notably, I is not toxic to E. coli. Subsequently, baseline I production, expected to be ˜1 mg/L culture broth, is determined by LC-MS. The MphR variant is capable of detecting I produced by the strain by measuring the GFP reporter signal. The unnatural DH/KR insertion (FIG. 8) is likely to be responsible for the poor product titer of this pathway. Accordingly, a library of variants is constructed using standard molecular biology techniques in which the composition of the linkers surrounding this insertion is varied. Top performing library members are identified by screening thousands of clones on agar plates under a UV lamp. The hits are copied and re-assayed in microplates, allowing quantification in a microplate reader. The DNA sequences of the most productive library members are then obtained.


Given the known polyketide product titers of in vivo systems, a sensor that can detect I in the linear range 0-100 μM, with a ˜50 μM K1/2 and fold-activation similar to WT MphR (with erythromycin A), is useful. Because the initial artificial pathway can produce I, albeit in poor yield, further mutations that provide significant improvements in titer (e.g. >10-fold compared to the initial strain) can provide critical proof-of-principle that biosensor-guided engineering is a viable alternative to traditional chemical synthesis of the precursor. Then, more elaborate libraries of variants can be generated and screened over multiple generations to furnish further mutations and improvements. Ultimately, product titers >1 g/L are typically needed for commercial viability of the production process.


The ability of the MphR clone “PikB1” to detect a solithromycin biosynthetic intermediate (see structure below) was determined. This biosensor can detect the intermediate at concentrations as low as 0.1 μM (FIG. 20; WT K1/2 70.9 ± 4.6 μM; PikB1 K1/2 1.46 ± 0.16), making it suitable for identifying mutant strains capable of producing the intermediate in engineered microbes. Moreover, this intermediate can be accessed by simple genetic modifications to the genome of the erythromycin producing strain. Thus, biosensors like these can improve the productivity of other modified producing strains that produce valuable biosynthetic intermediates that can be used to access highly diversified antibiotics through semi-synthesis.




embedded image


Example 9. Engineering MphR Biosensors that Discriminate Between Late Stage Macrolides in Erythromycin A Biosynthesis

Erythromycin A is a macrolide produced by the organized biosynthesis of type I polyketide synthase (PKS) and several late-stage tailoring enzymes. 6-Deoxyerythronolide B Synthase (DEBS) is organized as three giant polypeptides (DEBS1-3) that assemble the macrolactone 6-deoxyerythronolide B (6dEB). 6dEB is further tailored by P450 monooxygenases, glycosyltransferases, and a methyltransferase to yield the final product, erythromycin A (FIG. 13).


Recently reported titers of one cell biosynthesis of erythromycin A in E. coli are ˜1 mg/L (Zhang H, et al. Complete Biosynthesis of Erythromycin A and Designed Analogs Using E. coli as a Heterologous Host. Cell Chemistry & Biology. 2010; 17(11):1232-40). The impressive coordination of 26 heterologous proteins to produce a foreign natural product notwithstanding, this yield can be seen as suboptimal, since the aglycone precursor, 6dEB, is routinely produced in E. coli shake-flask cultures exceeding 100 mg/L (Boghigian B A, et al. Multi-factorial Engineering of Heterologous Polyketide Production in Escherichia coli Reveals Complex Pathway Interactions. Biotechnology and Bioengineering. 2011; 108(6): 1360-71). Rather than solely produce the single macrolide erythromycin A, heterologous biosynthesis results in mixtures of erythromycins A, B, C and D.


Typical erythromycin A biosynthesis occurs via the erythromycin C pathway. A P450 hydroxylation catalyzed by eryK converts erythromycin D to erythromycin C. Subsequently, the methyltransferase eryG catalyzes the S-adenosylmethionine (SAM) dependent methylation of erythromycin C to yield erythromycin A. Erythromycin B is generally regarded as an undesired shunt product of a competing alternative pathway that reverses the order of hydroxylation and methylation of erythromycin D so that eryG methylation occurs first (Montemiglio, L C, et al. Redirecting P450 EryK Specificity by Rational Site-directed Mutagenesis. Biochemistry. 2013; 52 (21) 3678-87; Savino, C, et al. Investigating the Structural Plasticity of a Cytochrome P450: Three-dimensional Structures of P450 EryK and Binding to its Physiological Substrate. Journal of Biological Chemistry. 2009; 284 (42) 29170-9).


Biosensor-guided screening of natural or heterologous erythromycin A biosynthesis would rely on the ability of the biosensors to report the true concentration of erythromycin A without falsely over-reporting yield due to off-target activation by a late-stage biosynthetic intermediate. MphR-WT was assayed for its ability to detect the late-stage biosynthetic intermediates of erythromycin biosynthesis, erythromycins B and C. Compared to erythromycin A, erythromycins B and C activate MphR-WT in a nearly identical manner (FIG. 14, Table 9).


Successful application of the method above revealed MphR-P4L/W107L/H193R, a clone with enhanced erythromycin A selectivity versus erythromycin B. Compared to MphR-WT, MphR-P4L/W107L/H193R demonstrated no detectable or calculable activation by erythromycin B but retained significant erythromycin A sensitivity (FIG. 14, Table 9).









TABLE 9

Performance features of the wild-type sensor with erythromycins A and B.

| MphR-WT | K1/2 (μM) | Cooperativity | Dynamic range (GFPmax-GFPmin) | Linear range of detection (μM) |
|---|---|---|---|---|
| 1 (ErA) | 1.49 | 3.39 | 52400 | 0.5-2.5 |
| 5 (ErB) | 1.72 | 1.99 | 55800 | 0.3-2.5 |
















TABLE 10

Performance features of the P4L/W107L/H193R sensor with erythromycins A and B.

| MphR-P4L/W107L/H193R | K1/2 (μM) | Cooperativity | Dynamic range (GFPmax-GFPmin) | Linear range of detection (μM) |
|---|---|---|---|---|
| 1 (ErA) | 1.27 | 2.04 | 3800 | 0.3-2.5 |
| 5 (ErB) | N.C. | N.C. | N.C. | N.C. |









As seen in Tables 9 and 10, MphR-P4L/W107L/H193R displays a clear selectivity shift towards erythromycin A from B, while maintaining nearly the same performance features as the wild-type sensor, except dynamic range. MphR-P4L/W107L/H193R can be used as a biosensor capable of distinguishing erythromycin A from its structurally similar precursors. Sensors capable of HTS allow contemporary techniques that leverage giant library sizes to improve true erythromycin A titers. In addition to usefulness as an erythromycin A detector with less off-target activation, MphR-P4L/W107L/H193R also serves as a sensor for the detection of P450 monooxygenase eryK-catalyzed C-12 hydroxylation of erythromycin A's core. MphR-P4L/W107L/H193R and newly developed sensors of this type provide the tools necessary for high-throughput screening of late-stage tailoring enzymes in the erythromycin biosynthetic pathway.
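The parameters reported in Tables 9 and 10 can be read as the terms of a dose-response curve. The following is a minimal sketch, assuming that the reported "Cooperativity" is a Hill coefficient and that the GFP signal above baseline follows a standard Hill function; the source does not state the fitting model, and the function name `hill_response` is hypothetical.

```python
def hill_response(conc_um, k_half_um, hill_n, dynamic_range):
    """GFP signal above baseline predicted by a Hill curve (assumed model)."""
    if conc_um <= 0:
        return 0.0
    ratio = (conc_um / k_half_um) ** hill_n
    return dynamic_range * ratio / (1.0 + ratio)


# Values copied from Tables 9 and 10: K1/2 (uM), cooperativity, GFPmax-GFPmin.
SENSORS = {
    ("MphR-WT", "ErA"): (1.49, 3.39, 52400),
    ("MphR-WT", "ErB"): (1.72, 1.99, 55800),
    ("MphR-P4L/W107L/H193R", "ErA"): (1.27, 2.04, 3800),
    # ErB with the triple mutant is N.C. in Table 10, so no curve is defined.
}

for (sensor, ligand), (k_half, n, span) in SENSORS.items():
    # Evaluate each curve at 1 uM and 2.5 uM, both inside the reported linear ranges.
    low = hill_response(1.0, k_half, n, span)
    high = hill_response(2.5, k_half, n, span)
    print(f"{sensor} + {ligand}: {low:.0f} at 1 uM, {high:.0f} at 2.5 uM")
```

Under this assumed model the signal at a concentration equal to K1/2 is half the dynamic range, which gives a quick consistency check on any fitted parameter set.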


Example 10. Engineered MphR Biosensors

A summary of non-limiting examples of MphR biosensor mutations is provided in Table 11 below. A number of the mutations were discussed in the examples above. Additional mutations are shown in Table 11 that provide increased pikromycin sensitivity. Further mutations are shown in Table 11 that improved narbomycin sensitivity.




embedded image


embedded image









TABLE 11

MphR Mutations

| Label | Mutation | Goal | Effect | Quantification |
|---|---|---|---|---|
| A3 | nt: A1G; aa: G76C | erythromycin A sensitivity | erythromycin A sensitivity | 3.6 times more sensitive vs. WT |
| E7 | nt: A4T; aa: V90I | erythromycin A sensitivity | erythromycin A sensitivity | 3.0 times more sensitive vs. WT |
| smRBS1A1 | nt: A1T/G2T/A3C | erythromycin A sensitivity | erythromycin A sensitivity | 9.9 times more sensitive vs. WT |
| QCMS3D6 | T17R | erythromycin A sensitivity | erythromycin A sensitivity | 2.4 times more sensitive vs. WT |
| QCMS3F8 | T17A/M59S | erythromycin A sensitivity | erythromycin A sensitivity | 1.6 times more sensitive vs. WT |
| QCMS5B4 | T27G/Q65M | erythromycin A sensitivity | erythromycin A sensitivity | 1.5 times more sensitive vs. WT |
| QCMS5D7 | T27A/M59E | erythromycin A sensitivity | erythromycin A sensitivity | 2.0 times more sensitive vs. WT |
| D3 (pikB1) | S106F | pikromycin sensitivity | pikromycin sensitivity | 118 times more sensitive vs. WT |
| D3 (pikB1) | S106F | Solithromycin precursor I sensitivity | Solithromycin precursor I sensitivity | 52 times more sensitive vs. WT |
| D3 (pikB1) | S106F | YC-17 sensitivity | YC-17 sensitivity | 40 times more sensitive vs. WT |
| YCA11 | S31R | YC-17 sensitivity | YC-17 sensitivity | 8.5 times more sensitive vs. WT |
| Nbn. YCG11 | L39F | YC-17 and narbomycin sensitivity | YC-17 and narbomycin sensitivity | 2.9 times more sensitive vs. WT |
| NbnD11 | V33L | narbomycin sensitivity | narbomycin sensitivity | 2.6 times higher activation ratio at 5 μM than WT |
| NbnE1 | A34S | narbomycin sensitivity | narbomycin sensitivity | 2.3 times higher activation ratio at 5 μM than WT |
| NbnG7 | R51C | narbomycin sensitivity | narbomycin sensitivity | 1.7 times higher activation ratio at 5 μM than WT |
| M2D6 | A16T/T154M/M155K | erythromycin A selectivity versus clarithromycin, azithromycin, and roxithromycin | erythromycin A selectivity versus clarithromycin, azithromycin, and roxithromycin | 20 times less sensitive for clarithromycin. No calculable activation with azithromycin and roxithromycin |
| M2D7 | P4L/W107L/H193R | erythromycin A selectivity versus erythromycin B | erythromycin A selectivity versus erythromycin B | No calculable activation with erythromycin B |
| C9 | A34S/Y103N/L189F | erythromycin C selectivity versus erythromycins A and B | erythromycin C selectivity versus erythromycins A and B | 6.8 and 13 times less sensitive to erythromycins A and B versus the WT |
| V66P | V66P | erythromycin A sensitivity | always on as tested | Compared at 100 μM erythromycin |
| V66R | V66R | erythromycin A sensitivity | always off as tested | Compared at 100 μM erythromycin |
| V66G | V66G | erythromycin A sensitivity | ~same activation as wild-type | Compared at 100 μM erythromycin |
| V66I | V66I | erythromycin A sensitivity | always off as tested | Compared at 100 μM erythromycin |
| V66D | V66D | erythromycin A sensitivity | always off as tested | Compared at 100 μM erythromycin |
| M1B10 | T49I/L89V/D98N/E109D | clarithromycin selectivity versus erythromycin A | clarithromycin selectivity versus erythromycin A | 29.2 and 6.4 times less sensitive to erythromycin A and clarithromycin versus the WT |
| M9C4 | R122T K132N A151T H184Q | clarithromycin selectivity versus erythromycin A | clarithromycin selectivity versus erythromycin A | 45.2 and 6.2 times less sensitive to erythromycin A and clarithromycin versus the WT |
| E7_M9C4 | nt: A4T; aa: R122T K132N A151T H184Q | clarithromycin selectivity versus erythromycin A and clarithromycin sensitivity | clarithromycin selectivity versus erythromycin A and clarithromycin sensitivity | 19.4 and 3 times less sensitive to erythromycin A and clarithromycin versus the WT |





Numbering of the nt (nucleotide) mutations corresponds to the ribosome binding site sequence. For example, the RBS sequence for the MphR gene is AGAAGG. Thus, the first A is the “1” position and the final G is the “6” position of the RBS.
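The RBS numbering convention can be illustrated with a short sketch that applies the nt mutations listed in Table 11 to the RBS sequence AGAAGG. The helper name `apply_rbs_mutations` is hypothetical and the parsing assumes the "old base, position, new base" notation used in the table.

```python
WT_RBS = "AGAAGG"  # RBS sequence for the MphR gene, as stated in the text


def apply_rbs_mutations(spec, rbs=WT_RBS):
    """Apply mutations such as 'A1T/G2T/A3C' using 1-based RBS positions."""
    bases = list(rbs)
    for token in spec.split("/"):
        old, position, new = token[0], int(token[1:-1]), token[-1]
        # Guard against a label that does not match the wild-type base.
        if rbs[position - 1] != old:
            raise ValueError(f"{token}: wild-type base is {rbs[position - 1]}")
        bases[position - 1] = new
    return "".join(bases)


for label, spec in [("A3", "A1G"), ("E7", "A4T"), ("smRBS1A1", "A1T/G2T/A3C")]:
    print(label, spec, apply_rbs_mutations(spec))
```

Checking the wild-type base before substitution makes a mislabeled position fail loudly rather than silently producing the wrong RBS.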






Some of the mutations were further characterized for YC-17, narbomycin, and pikromycin selective MphR clones (FIG. 21; Tables 12-14).









TABLE 12

Selected sensitivity mutants with YC-17

| | WT | A11 | pikB1 | G11 |
|---|---|---|---|---|
| K1/2 | 19.6 ± 0.6 | 2.3 ± 0.1 | 0.49 ± 0.05 | 6.7 ± 0.2 |
















TABLE 13

Selected sensitivity mutants with Narbomycin

| | WT | D11 |
|---|---|---|
| Activation ratio (5 μM/0 μM) | 4 | 11 |
















TABLE 14

Selected sensitivity mutants with Pikromycin

| | WT | pikB1 |
|---|---|---|
| K1/2 | 96.6 ± 2.7 | 0.81 ± 0.03 |
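The fold changes in sensitivity can be related to the K1/2 values in Tables 12 and 14. The sketch below assumes that fold sensitivity is the ratio of the wild-type K1/2 to the mutant K1/2; the source does not state how the Table 11 values were computed, and the function name `fold_sensitivity` is hypothetical. Error terms are ignored.

```python
# K1/2 values copied from Table 12 (YC-17) and Table 14 (pikromycin).
K_HALF_YC17 = {"WT": 19.6, "A11": 2.3, "pikB1": 0.49, "G11": 6.7}
K_HALF_PIKROMYCIN = {"WT": 96.6, "pikB1": 0.81}


def fold_sensitivity(k_half, reference="WT"):
    """Ratio of the reference K1/2 to each mutant K1/2 (assumed definition)."""
    ref = k_half[reference]
    return {name: ref / value for name, value in k_half.items() if name != reference}


for ligand, table in [("YC-17", K_HALF_YC17), ("pikromycin", K_HALF_PIKROMYCIN)]:
    for clone, fold in fold_sensitivity(table).items():
        print(f"{ligand} {clone}: {fold:.1f}-fold lower K1/2 than WT")
```

Because the calculation is a ratio, it does not depend on the concentration unit, which Tables 12 and 14 do not state.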









Example 11. Screening Erythromycin Producing Strains

An erythromycin producing strain, Aeromicrobium erythreum (Reeves A R, et al. Engineering precursor flow for increased erythromycin production in Aeromicrobium erythreum. Metabolic Engineering. 2004; 6(4): 300-12; Miller E S, et al. Description of the erythromycin-producing bacterium Arthrobacter sp. strain NRRL B-3381 as Aeromicrobium erythreum gen. nov., sp. nov. International Journal of Systematic Bacteriology. 1991; 41: 363-368), and a knock-out mutant (KO) were grown in wells of a 96-well microtiter plate. Culture supernatants were removed and transferred to another microplate that contained cultures of either the MphR mutant E7-RBS or the wild-type biosensor. Fluorescence analysis revealed the unequivocal detection of only those wells containing the producing strain, and demonstrated the superior dynamic range of the engineered vs. wild-type biosensor (FIG. 22).


A similar method using biosensor strains immobilized on agar plates reveals the sensitivity of the engineered biosensor and demonstrates the ability to screen culture collection supernatants in high-throughput via agar plates (FIG. 22).
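A microplate comparison of this kind reduces to a signal-to-background ratio between wells that received supernatant from the producing strain and wells that received supernatant from the KO strain. The sketch below is illustrative only: the fluorescence readings are hypothetical placeholders, not data from FIG. 22, and the function name is an assumption.

```python
from statistics import mean


def signal_to_background(producer_wells, knockout_wells):
    """Mean fluorescence of producer wells divided by mean of KO wells."""
    return mean(producer_wells) / mean(knockout_wells)


# Hypothetical fluorescence readings (arbitrary units), for illustration only.
readings = {
    "wild-type biosensor": {"producer": [1200, 1350, 1280], "KO": [400, 420, 380]},
    "E7-RBS biosensor": {"producer": [9000, 9400, 9100], "KO": [450, 430, 470]},
}

for sensor, wells in readings.items():
    ratio = signal_to_background(wells["producer"], wells["KO"])
    print(f"{sensor}: producer/KO fluorescence ratio = {ratio:.1f}")
```

A larger ratio for one biosensor over the other, measured on the same supernatants, is one simple way to express a difference in dynamic range.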


Example 12. Growth Selection for Erythromycin Producing Strains

Wild-type (WT) MphR was used to control expression of the chloramphenicol (Cm) resistance gene via the plasmid pMLCmR (FIG. 23). In this way, colonies should only grow in the presence of Cm when clarithromycin or erythromycin A is also provided. The following data indicate that when Cm is provided, colonies grow when erythromycin A (ErA) or clarithromycin is provided (FIG. 24; bottom middle, bottom right), but not in their absence (top middle). Thus, MphR can be used in a growth selection format, significantly expanding the throughput of analysis.


A similar trend was observed when the engineered MphR E7-M9C4 was used in place of the wild-type MphR. However, using this clarithromycin-selective MphR variant, at 5 μM polyketide, colonies grew when clarithromycin was provided but not in the presence of erythromycin, thus highlighting the improved sensitivity of this mutant, in comparison to the wild-type biosensor (FIG. 25). Furthermore, comparison of colony growth at 0.5 μM vs. 5 μM polyketide highlights the expected dose response of the selection system.












SEQUENCES















Provided herein is the gene sequence of the wild-type MphR gene:


DNA sequence-Wild-type MphR


ATGCCCCGCCCCAAGCTCAAGTCCGATGACGAGGTACTCGAGGCCGCCAC


CGTAGTGCTGAAGCGTTGCGGTCCCATAGAGTTCACGCTCAGCGGAGTAG


CAAAGGAGGTGGGGCTCTCCCGCGCAGCGTTAATCCAGCGCTTCACCAAC


CGCGATACGCTGCTGGTGAGGATGATGGAGCGCGGCGTCGAGCAGGTGCG


GCATTACCTGAATGCGATACCGATAGGCGCAGGGCCGCAAGGGCTCTGGG


AATTTTTGCAGGTGCTCGTTCGGAGCATGAACACTCGCAACGACTTCTCG


GTGAACTATCTCATCTCCTGGTACGAGCTCCAGGTGCCGGAGCTACGCAC


GCTTGCGATCCAGCGGAACCGCGCGGTGGTGGAGGGGATCCGCAAGCGAC


TGCCCCCAGGTGCTCCTGCGGCAGCTGAGTTGCTCCTGCACTCGGTCATC


GCTGGCGCGACGATGCAGTGGGCCGTCGATCCGGATGGTGAGCTAGCTGA


TCATGTGCTGGCTCAGATCGCTGCCATCCTGTGTTTAATGTTTCCCGAAC


ACGACGATTTCCAACTCCTCCAGGCACATGCGTAA (SEQ ID NO: 1)





Also provided herein is the amino acid sequence of the wild-type MphR protein:


Amino acid sequence-Wild-type MphR


MPRPKLKSDDEVLEAATVVLKRCGPIEFTLSGVAKEVGLSRAALIQRFTN


RDTLLVRMMERGVEQVRHYLNAIPIGAGPQGLWEFLQVLVRSMNTRNDFS


VNYLISWYELQVPELRTLAIQRNRAVVEGIRKRLPPGAPAAAELLLHSVI


AGATMQWAVDPDGELADHVLAQIAAILCLMFPEHDDFQLLQAHA (SEQ


ID NO: 2)
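The amino acid substitutions named in Table 11 can be applied to the wild-type protein sequence (SEQ ID NO: 2) programmatically. The following is a minimal sketch; the function name `apply_substitutions` is hypothetical, and the parser assumes the "wild-type residue, position, new residue" notation with mutations separated by a slash or a space.

```python
import re

# Wild-type MphR protein, copied from SEQ ID NO: 2 (194 residues).
WT_MPHR = (
    "MPRPKLKSDDEVLEAATVVLKRCGPIEFTLSGVAKEVGLSRAALIQRFTN"
    "RDTLLVRMMERGVEQVRHYLNAIPIGAGPQGLWEFLQVLVRSMNTRNDFS"
    "VNYLISWYELQVPELRTLAIQRNRAVVEGIRKRLPPGAPAAAELLLHSVI"
    "AGATMQWAVDPDGELADHVLAQIAAILCLMFPEHDDFQLLQAHA"
)


def apply_substitutions(spec, protein=WT_MPHR):
    """Return the protein carrying substitutions such as 'P4L/W107L/H193R'."""
    residues = list(protein)
    for token in re.split(r"[/\s]+", spec.strip()):
        match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", token)
        if match is None:
            raise ValueError(f"cannot parse mutation {token!r}")
        old, position, new = match.group(1), int(match.group(2)), match.group(3)
        # Confirm the label agrees with the wild-type residue before editing.
        if protein[position - 1] != old:
            raise ValueError(f"{token}: wild-type residue is {protein[position - 1]}")
        residues[position - 1] = new
    return "".join(residues)


print(apply_substitutions("S106F")[100:110])  # residues 101-110 of the pikB1 variant
```

The wild-type residue check is a cheap guard: a label whose first letter does not match the residue at that position raises an error instead of producing an unintended variant.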





Provided herein are the gene sequences of the MphR mutations (see Table 11) (mutated nucleotides are underlined) (the sequences directly below only contain the coding sequences; for additional sequence upstream of ATG, see SEQ ID NO: 28-57).





epA3


ATGCCCCGCCCCAAGCTCAAGTCCGATGACGAGGTACTCGAGGCCGCCAC


CGTAGTGCTGAAGCGTTGCGGTCCCATAGAGTTCACGCTCAGCGGAGTAG


CAAAGGAGGTGGGGCTCTCCCGCGCAGCGTTAATCCAGCGCTTCACCAAC


CGCGATACGCTGCTGGTGAGGATGATGGAGCGCGGCGTCGAGCAGGTGCG


GCATTACCTGAATGCGATACCGATATGCGCAGGGCCGCAAGGGCTCTGGG


AATTTTTGCAGGTGCTCGTTCGGAGCATGAACACTCGCAACGACTTCTCG


GTGAACTATCTCATCTCCTGGTACGAGCTCCAGGTGCCGGAGCTACGCAC


GCTTGCGATCCAGCGGAACCGCGCGGTGGTGGAGGGGATCCGCAAGCGAC


TGCCCCCAGGTGCTCCTGCGGCAGCTGAGTTGCTCCTGCACTCGGTCATC


GCTGGCGCGACGATGCAGTGGGCCGTCGATCCGGATGGTGAGCTAGCTGA


TCATGTGCTGGCTCAGATCGCTGCCATCCTGTGTTTAATGTTTCCCGAAC


ACGACGATTTCCAACTCCTCCAGGCACATGCGTAA (SEQ ID NO: 3)





epE7


ATGCCCCGCCCCAAGCTCAAGTCCGATGACGAGGTACTCGAGGCCGCCAC


CGTAGTGCTGAAGCGTTGCGGTCCCATAGAGTTCACGCTCAGCGGAGTAG


CAAAGGAGGTGGGGCTCTCCCGCGCAGCGTTAATCCAGCGCTTCACCAAC


CGCGATACGCTGCTGGTGAGGATGATGGAGCGCGGCGTCGAGCAGGTGCG


GCATTACCTTAATGCGATACCGATAGGCGCAGGGCCGCAAGGGCTCTGGG


AATTTTTGCAGGTGCTCATTCGGAGCATGAACACTCGCAACGACTTCTCG


GTGAACTATCTCATCTCCTGGTACGAGCTCCAGGTGCCGGAGCTACGCAC


GCTTGCGATCCAGCGGAACCGCGCGGTGGTGGAGGGGATCCGCAAGCGAC


TGCCCCCAGGTGCTCCTGCGGCAGCTGAGTTGCTCCTGCACTCGGTCATC


GCTGGCGCGACGATGCAGTGGGCCGTCGATCCGGATGGTGAGCTAGCTGA


TCATGTGCTGGCTCAGATCGCTGCCATCCTGTGTTTAATGTTTCCCGAAC


ACGACGATTTCCAACTCCTCCAGGCACATGCGTAA (SEQ ID NO: 4)





epH4


ATGCCCCGCCCCAAGCTCAAGTCCGATGACGAGGTACTCGAGGCCGCCAC


CGTAGTGCTGAAGCATTGCGGTCCCATAGAGTTCACGCTCAGCGGAGTAG


CAAATGAGGTGGGGCTCTCCCGCGCAGCGTTAATCCAGCGCTTCACCAAC


CGCGATACGCTGCTGGTGAGGATGATGGAGCGCGGCGTCGAGCAGGTGCG


GCATTACCTGAATGCGATACCGATAGGCGCAGGGCCGCAAGGGCTCTGGG


AATTTTTGCAGGTGCTCGTTCGGAGCATGAACACTCGCAACGACTTCTCG


GTGAACTATCTCATCTCTTGGTACGAGCTCCAGGTGCCGGAGCTACGCAC


GCTTGCGATCCAGCGGAACCGCGCGGTGGTGGAGGGGATCCGCAAGCGAC


TGCCCCCAGGTGCTCCTGCGGCAGCTGAGTTGCTCCTGCACTCGGTCATC


GCTGGCGCGACGATGCAGTGGGCCGTCGATCCGGATGGTGAGCTAGCTGA


TCATGTGCTGGCTCAGATCGCTGCCATCCTGTGTTTAATGTTTCCCGAAC


ACGACGATTTCCAACTCCTCCAGGCACATGCGTAA (SEQ ID NO: 5)





QCMS3D6


ATGCCCCGCCCCAAGCTCAAGTCCGATGACGAGGTACTCGAGGCCGCCAG


GGTAGTGCTGAAGCGTTGCGGTCCCATAGAGTTCACGCTCAGCGGAGTAG


CAAAGGAGGTGGGGCTCTCCCGCGCAGCGTTAATCCAGCGCTTCACCAAC


CGCGATACGCTGCTGGTGAGGATGATGGAGCGCGGCGTCGAGCAGGTGCG


GGATTACCTGAATGCGATACCGATAGGCGCAGGGCCGCAAGGGCTCTGGG


AATTTTTGCAGGTGCTCGTTCGGAGCATGAACACTCGCAACGACTTCTCG


GTGAACTATCTCATCTCCTGGTACGAGCTCCAGGTGCCGGAGCTACGCAC


GCTTTGCGATCCAGCGGAACCGCGCGGTGGTGGAGGGGATCCGCAAGCGA


CTGCCCCCAGGTGCTCCTGCGGCAGCTGAGTTGCTCCTGCACTCGGTCAT


CGCTGGCGCGACGATGCAGTGGGCCGTCGATCCGGATGGTGAGCTAGCTG


ATCATGTGCTGGCTCAGATCGCTTGCCATCCTGTGTTTTAATGTTTCCCG


AACAcGAcGATTTCCAACTCCTCCAGGCACATGCGTAA (SEQ ID


NO: 6)





QCMS3F8


ATGCCCCGCCCCAAGCTCAAGTCCGATGACGAGGTACTCGAGGCCGCCGC



GGTAGTGCTGAAGCGTTGCGGTCCCATAGAGTTCACGCTCAGCGGAGTAG



CAAAGGAGGTGGGGCTCTCCCGCGCAGCGTTAATCCAGCGCTTCACCAAC


CGCGATACGCTGCTGGTGAGGATGAGTGAGCGCGGCGTCGAGCAGGTGCG


GCATTACCTGAATGCGATACCGATAGGCGCAGGGCCGCAAGGGCTCTGGG


AATTTTTGCAGGTGCTCGTTCGGAGCATGAACACTCGCAACGACTTCTCG


GTGAACTATCTCATCTCCTGGTACGAGCTCCAGGTGCCGGAGCTACGCAC


GCTTGCGATCCAGCGGAACCGCGCGGTGGTGGAGGGGATCCGCAAGCGAC


TGCCCCCAGGTGCTCCTGCGGCAGCTGAGTTGCTCCTGCACTCGGTCATC


GCTGGCGCGACGATGCAGTGGGCCGTCGATCCGGATGGTGAGCTAGCTGA


TCATGTGCTGGCTCAGATCGCTGCCATCCTGTGTTTAATGTTTCCCGAAC


ACGACGATTTCCAACTCCTCCAGGCACATGCGTAA (SEQ ID NO: 7)





QCMS5B4


ATGCCCCGCCCCAAGCTCAAGTCCGATGACGAGGTACTCGAGGCCGCCGG



TGTAGTGCTGAAGCGTTGCGGTCCCATAGAGTTCACGCTCAGCGGAGTAG



CAAAGGAGGTGGGGCTCTCCCGCGCAGCGTTAATCCAGCGCTTCACCAAC


CGCGATACGCTGCTGGTGAGGATGATGGAGCGCGGCGTCGAGATGGTTCG


GCATTACCTGAATGCGATACCGATAGGCGCAGGGCCGCAAGGGCTCTGGG


AATTTTTGCAGGTGCTCGTTCGGAGCATGAACACTCGCAACGACTTCTCG


GTGAACTATCTCATCTCCTGGTACGAGCTCCAGGTGCCGGAGCTACGCAC


GCTTGCGATCCAGCGGAACCGCGCGGTGGTGGAGGGGATCCGCAAGCGAC


TGCCCCCAGGTGCTCCTGCGGCAGCTGAGTTGCTCCTGCACTCGGTCATC


GCTGGCGCGACGATGCAGTGGGCCGTCGATCCGGATGGTGAGCTAGCTGA


TCATGTGCTGGCTCAGATCGCTGCCATCCTGTGTTTAATGTTTCCCGAAC


AcGACGATTTCCAACTCCTCCAGGCACATGCGTAA (SEQ ID NO: 8)





QCMS5D7


ATGCCCCGCCCCAAGCTCAAGTCCGATGACGAGGTACTCGAGGCCGCCGC



TGTAGTGCTGAAGCGTTGCGGTCCCATAGAGTTCACGCTCAGCGGAGTAG



CAAAGGAGGTGGGGCTCTCCCGCGCAGCGTTAATCCAGCGCTTCACCAAC


CGCGATACGCTGCTGGTGAGGATGGAGGAGCGCGGCGTCGAGCAGGTGCG


GCATTACCTGAATGCGATACCGATAGGCGCAGGGCCGCAAGGGCTCTGGG


AATTTTTGCAGGTGCTCGTTCGGAGCATGAACACTCGCAACGACTTCTCG


GTGAACTATCTCATCTCCTGGTACGAGCTCCAGGTGCCGGAGCTACGCAC


GCTTGCGATCCAGCGGAACCGCGCGGTGGTGGAGGGGATCCGCAAGCGAC


TGCCCCCAGGTGCTCCTGCGGCAGCTGAGTTGCTCCTGCACTCGGTCATC


GCTGGCGCGACGATGCAGTGGGCCGTCGATCCGGATGGTGAGCTAGCTGA


TCATGTGCTGGCTCAGATCGCTGCCATCCTGTGTTTAATGTTTCCCGAAC


ACGACGATTTCCAACTCCTCCAGGCACATGCGTAA (SEQ ID NO: 9)





D3 (pikB1)


ATGCCCCGCCCCAAGCTCAAGTCCGATGACGAGGTACTCGAGGCCGCCAC


CGTAGTGCTGAAGCGTTGCGGTCCCATAGAGTTCACGCTCAGCGGAGTAG


CAAAGGAGGTGGGGCTCTCCCGCGCAGCGTTAATCCAGCGCTTCACCAAC


CGCGATACGCTGCTGGTGAGGATGATGGAGCGCGGCGTCGAGCAGGTGCG


GCATTACCTGAATGCGATACCGATAGGCGCAGGGCCGCAAGGGCTCTGGG


AATTTTTGCAGGTGCTCGTTCGGAGCATGAACACTCGCAACGACTTCTCG


GTGAACTATCTCATCTTCTGGTACGAGCTCCAGGTGCCGGAGCTACGCAC


GCTTGCGATCCAGCGGAACCGCGCGGTGGTGGAGGGGATCCGCAAGCGAC


TGCCCCCAGGTGCTCCTGCGGCAGCTGAGTTGCTCCTGCACTCGGTCATC


GCTGGCGCGACGATGCAGTGGGCCGTCGATCCGGATGGTGAGCTAGCTGA


TCATGTGCTGGCTCAGATCGCTGCCATCCTGTGTTTAATGTTTCCCGAAC


ACGACGATTTCCAACTCCTCCAGGCACATGCGTAA (SEQ ID


NO: 10)
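Where the underlining of mutated nucleotides is not visible, the changed codons can be located by comparing a mutant coding sequence with the wild-type sequence codon by codon. The sketch below does this for nucleotides 301-350 of SEQ ID NO: 1 and SEQ ID NO: 10; the function name `codon_changes` is hypothetical, and the standard genetic code is assumed.

```python
from itertools import product

# Standard genetic code in TCAG order (an assumption about the expression host).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def codon_changes(wt_dna, mut_dna, first_codon=1):
    """List (codon number, WT codon, mutant codon, label) for differing codons."""
    changes = []
    usable = min(len(wt_dna), len(mut_dna)) // 3 * 3  # ignore a trailing partial codon
    for i in range(0, usable, 3):
        wt_codon, mut_codon = wt_dna[i:i + 3], mut_dna[i:i + 3]
        if wt_codon != mut_codon:
            number = first_codon + i // 3
            label = f"{CODON_TABLE[wt_codon]}{number}{CODON_TABLE[mut_codon]}"
            changes.append((number, wt_codon, mut_codon, label))
    return changes


# Nucleotides 301-350 (codon 101 onward) of wild-type MphR and D3 (pikB1).
WT_FRAGMENT = "GTGAACTATCTCATCTCCTGGTACGAGCTCCAGGTGCCGGAGCTACGCAC"
PIKB1_FRAGMENT = "GTGAACTATCTCATCTTCTGGTACGAGCTCCAGGTGCCGGAGCTACGCAC"

for change in codon_changes(WT_FRAGMENT, PIKB1_FRAGMENT, first_codon=101):
    print(change)
```

The comparison is positional, so it is only meaningful for sequences of equal length in the same reading frame; insertions or deletions would need an alignment step first.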





YCA11


ATGCCCCGCCCCAAGCTCAAGTCCGATGACGAGGTACTCGAGGCCGCCAC


CGTAGTGCTGAAGCGTTGCGGTCCCATAGAGTTCACGCTCAGAGGAGTAG


CAAAGGAGGTGGGGCTCTCCCGCGCTGCGTTAATCCAGCGCTTCACCAAC


CGCGATACGCTGCTGGTGAGGATGATGGAGCGCGGCGTCGAGCAGGTGCG


ACATTACCTGAATGCGATACCGATAGGCGCAGGGCCGCAAGGGCTCTGGG


AATTTTTGCAGGTGCTCGTTCGGAGCATGAACACTCGCAACGACTTCTCG


GTGAACTATCTCATCTCCTGGTACGAGCTCCAGGTGCCGGAGCTACGCAC


GCTTGCGATCCAGCGGAACCGCGCGGTGGTGGAGGGGATCCGCAAGCGAC


TGCCCCCAGGTGCTCCTGCGGCAGCTGAGTTGCTCCTGCACTCGGTCATC


GCTGGCGCGACGATGCAGTGGGCCGTCGATCCGGATGGTGAGCTAGCTGA


TCATGTGCTGGCTCAGATCGCTGCCATCCTGTGTTTAATGTTTCCCGAAC


ACGACGATTTCCAACTCCTCCAGGCACATGCGTAA (SEQ ID


NO: 11)





Nbn.YCG11


ATGCCCCGCCCCAAGCTCAAGTCCGATGACGAGGTACTCGAGGCCGCCAC


CGTAGTGCTGAAGCGTTGCGGTCCCATAGAGTTCACGCTCAGCGGAGTAG


CAAAGGAGGTGGGGTTCTCCCGCGCAGCGTTAATCCAGCGCTTCACCAAC


CGCGATACGCTGCTGGTGAGGATGATGGAGCGCGGCGTCGAGCAGGTGCG


GCATTACCTGAATGCGATACCGATAGGCGCAGGGCCGCAAGGGCTCTGGG


AATTTTTGCAGGTGCTCGTTCGGAGCATGAACACTCGCAACGACTTCTCG


GTGAACTATCTCATCTCCTGGTACGAGCTCCAGGTGCCGGAGCTACGCAC


GCTTGCGATCCAGCGGAACCGCGCGGTGGTGGAGGGGATCCGCAAGCGAC


TGCCCCCAGGTGCTCCTGCGGCAGCTGAGTTGCTCCTGCACTCGGTCATC


GCTGGCGCGACGATGCAGTGGGCCGTCGATCCGGATGGTGAGCTAGCTGA


TCATGTGCTGGCTCAGATCGCTGCCATCCTGTGTTTAATGTTTCCCGAAC


ACGACGATTTCCAACTCCTCCAGGCACATGCGTAA (SEQ ID


NO: 12)





NbnD11


ATGCCCCGCCCCAAGCTCAAGTCCGATGACGAGGTACTCGAGGCCGCCAC


CGTAGTGCTGAAGCGTTGCGGTCCCATAGAGTTCACGCTCAGCGGACTAG


CAAAGGAGGTGGGGCTCTCCCGCGCAGCGTTAATCCAGCGCTTCACCAAC


CGCGATACGCTGCTGGTGAGGATGATGGAGCGCGGCGTCGAGCAGGTGCG


GCATTACCTGAATGCGATACCGATAGGCGCAGGGCCGCAAGGGCTCTGGG


AATTTTTGCAGGTGCTCGTTCGGAGCATGAACACTCGCAACGACTTCTCG


GTGAACTATCTCATCTCCTGGTACGAGCTCCAGGTGCCGGAGCTACGCAC



CCTTGCGATCCAGCGGAACCGCGCGGTGGTGGAGGGGATCCGCAAGCGAC



TGCCCCCAGGTGCTCCTGCGGCAGCTGAGTTGCTCCTGCACTCGGTCATC


GCTGGCGCGACGATGCAGTGGGCCGTCGATCCGGATGGTGAGCTAGCTGA


TCATGTGCTGGCTCAGATCTCTGCCATCCTGTGTTTAATGTTTCCCGAAC


ACGACGATTTCCAACTCCTCCAGGCACATGCGTAA (SEQ ID


NO: 13)





NbnE1


ATGCCCCGCCCCAAGCTCAAGTCCGATGACGAGGTACTCGAGGCCGCCAC


CGTAGTGCTGAAGCGTTGCGGTCCCATTGAGTTCACGCTCAGCGGAGTAT


CAAAGGAGGTGGGGCTCTCCCGCGCAGCGTTAATCCAGCGCTTCACCAAC


CGCGATACGCTGCTGGTGAGGATGATGGAGCGCGGCGTCGAGCAGGTGCG


GCATTACCTGAATGCGATACCGATAGGCGCAGGGCCGCAAGGGCTCTGGG


AATTTTTGCAGGTGCTCGTTCGGAGCATGAACACTCGCAACGACTTCTCG


GTGAACTATCTCATCTCCTGGTACGAGCTCCAGGTGCCGGAGCTACGCAC


GCTTGCGATCCAGCGGAACCGCGCGGTGGTGGAGGGGATCCGCAAGCGAC


TGCCCCCAGGTGCTCCTGCGGCAGCAGAGTTGCTCCTGCACTCGGTCATC


GCTGGCGCGACGATGCAGTGGGCCGTCGATCCGGATGGTGAGCTAGCTGA


TCATGTGCTGGCTCAGATCGCTGCCATCCTGTGTTTAATGTTTCCCGAAC


ACGACGATTTCCAACTCCTCCAGGCACATGCGTAA (SEQ ID


NO: 14)





NbnG7


ATGCCCCGCCCCAAGCTCAAGTCCGATGACGAGGTACTCGAGGCCGCCAC


CGTAGTGCTGAAGCGTTGCGGTCCCATAGAGTTCACGCTCAGCGGAGTAG


CAAAGGAGGTGGGGCTCTCCCGCGCAGCGTTAATCCAGCGCTTCACCAAC



TGCGATACGCTGCTGGTGAGGATGATGGAGCGCGGCGTCGAGCAGGTGCG



GCATTACCTGAATGCGATACCGATAGGCGCAGGGCCGCAAGGGCTCTGGG


AATTTTTGCAGGTGCTCGTTCGGAGCATGAACACTCGCAACGACTTCTCG


GTGAACTATCTCATCTCCTGGTACGAGCTCCAGGTGCCGGAGCTACGCAC


GCTTGCGATCCAGCGGAACCGCGCGGTGGTGGAGGGGATCCGCAAGCGAC


TGCCCCCAGGTGCTCCTGCGGCAGCTGAGTTGCTCCTGCACTCGGTCATC


GCTGGCGCGACGATGCAGTGGGCCGTCGATCCGGATGGTGAGCTAGCTGA


TCATGTGCTGGCTCAGATCGCTGCCATCCTGTGTTTAATGTTTCCCGAAC


ACGACGATTTCCAACTCCTCCAGGCACATGCGTAA (SEQ ID


NO: 15)





M2D6


ATGCCCCGCCCCAAGCTCAAGTCCGATGACGAGGTTCTCGAGGCCACCAC


CGTAGTGCTGAAGCGTTGCGGTCCCATAGAGTTCACGCTCAGTGGAGTGG


CAAAGGAGGTGGGGCTCTCCCGCGCAGCGTTAATCCAGCGCTTCACCAAC


CGCGATACGCTGCTGGTGAGGATGATGGAGCGCGGCGTCGAGCAGGTGCG


GCATTACCTGAATGCGATACCGATAGGCGCAGGGCCGCAAGGGCTCTGGG


AATTTTTGCAGGTGCTCGTTCGGAGCATGAACACTCGCAACGACTTCTCG


GTGAACTATCTCATCTCCTGGTACGAGCTCCAGGTGCCGGAGCTACGCAC


GCTTGCGATCCAGCGGAACCGCGCGGTAGTGGAGGGGATCCGCAAGCGAC


TGCCCCCAGGTGCTCCTGCGGCAGCTGAGTTGCTCCTGCACTCGGTCATC


GCTGGCGCGATGAAGCAGTGGGCCGTCGATCCGGATGGTGAGCTAGCTGA


TCATGTGCTGGCTCAGATCGCTGCCATCCTGTGTTTAATGTTTCCCGAAC


ACGACGATTTCCAACTCCTCCAGGCACATGCGTAA (SEQ ID


NO: 16)





M2D7


ATGCCCCGCCTCAAGCTCAAGTCCGATGACGAGGTACTCGAGGCCGCCAC


CGTAGTGCTGAAGCGTTGCGGTCCCATAGAGTTCACGCTCAGCGGAGTAG


CAAAGGAGGTGGGGCTCTCCCGCGCAGCGTTAATCCAGCGCTTCACCAAC


CGCGATACGCTGCTGGTGAGGATGATGGAGCGCGGCGTCGAGCAGGTGCG


GCATTACCTGAATGCGATACCGATAGGCGCAGGGCCGCAAGGGCTCTGGG


AATTTTTGCAGGTGCTCGTTCGGAGCATGAACACTCGCAACGACTTCTCG


GTGAACTATCTCATCTCCTTGTACGAGCTCCAGGTGCCGGAGCTACGCAC


GCTTGCGATCCAGCGGAACCGCGCGGTGGTGGAGGGGATCCGCAAGCGAC


TGCCCCCAGGTGCTCCTGCGGCAGCTGAGTTGCTCCTGCACTCGGTCATC


GCTGGCGCGACGATGCAGTGGGCCGTCGATCCGGATGGTGAGCTAGCTGA


TCATGTGCTGGCTCAGATCGCTGCCATCCTGTGTTTAATGTTTCCCGAAC


ACGACGATTTCCAACTCCTCCAGGCACGTGCGTAA (SEQ ID


NO: 17)





C9


ATGCCCCGCCCCAAGCTCAAGTCCGATGACGAGGTACTCGAGGCCGCCAC


CGTAGTGCTGAAGCGTTGCGGTCCCATAGAGTTCACGCTCAGCGGAGTAT


CAAAGGAGGTGGGGCTCTCCCGCGCAGCGTTAATCCAGCGCTTCACCAAC


CGCGATACGCTGCTGGTGAGGATGATGGAGCGCGGCGTCGAGCAGGTGCG


GCATTACCTGAATGCGATACCGATAGGCGCAGGGCCGCAAGGGCTCTGGG


AATTTTTGCAGGTGCTCGTTCGGAGCATGAACACTCGCAACGACTTCTCG


GTGAACAATCTCATCTCCTGGTACGAGCTCCAGGTGCCGGAGCTACGCAC


GCTTGCGATCCAGCGGAACCGCGCGGTGGTGGAGGGGATCCGCAAGCGAC


TGCCCCCAGGTGCTCCTGCGGCAGCTGAGTTGCTCCTGCACTCGGTCATC


GCTGGCGCGACGATGCAGTGGGCCGTCGATCCGGATGGTGAGCTAGCTGA


TCATGTGCTGGCTCAGATCGCTGCCATCCTGTGTTTAATGTTTCCCGAAC


ACGACGATTTCCAATTCCTCCAGGCACATGCGTAA (SEQ ID


NO: 18)





V66P


ATGCCCCGCCCCAAGCTCAAGTCCGATGACGAGGTACTCGAGGCCGCCAC


CGTAGTGCTGAAGCGTTGCGGTCCCATAGAGTTCACGCTCAGCGGAGTAG


CAAAGGAGGTGGGGCTCTCCCGCGCAGCGTTAATCCAGCGCTTCACCAAC


CGCGATACGCTGCTGGTGAGGATGATGGAGCGCGGCGTCGAGCAGCCACG


GCATTACCTGAATGCGATACCGATAGGCGCAGGGCCGCAAGGGCTCTGGG


AATTTTTGCAGGTGCTCGTTCGGAGCATGAACACTCGCAACGACTTCTCG


GTGAACTATCTCATCTCCTGGTACGAGCTCCAGGTGCCGGAGCTACGCAC


GCTTGCGATCCAGCGGAACCGCGCGGTGGTGGAGGGGATCCGCAAGCGAC


TGCCCCCAGGTGCTCCTGCGGCAGCTGAGTTGCTCCTGCACTCGGTCATC


GCTGGCGCGACGATGCAGTGGGCCGTCGATCCGGATGGTGAGCTAGCTGA


TCATGTGCTGGCTCAGATCGCTGCCATCCTGTGTTTAATGTTTCCCGAAC


ACGACGATTTCCAACTCCTCCAGGCACATGCGTAA (SEQ ID


NO: 19)





V66R


ATGCCCCGCCCCAAGCTCAAGTCCGATGACGAGGTACTCGAGGCCGCCAC


CGTAGTGCTGAAGCGTTGCGGTCCCATAGAGTTCACGCTCAGCGGAGTAG


CAAAGGAGGTGGGGCTCTCCCGCGCAGCGTTAATCCAGCGCTTCACCAAC


CGCGATACGCTGCTGGTGAGGATGATGGAGCGCGGCGTCGAGCAGAGGCG


GCATTACCTGAATGCGATACCGATAGGCGCAGGGCCGCAAGGGCTCTGGG


AATTTTTGCAGGTGCTCGTTCGGAGCATGAACACTCGCAACGACTTCTCG


GTGAACTATCTCATCTCCTGGTACGAGCTCCAGGTGCCGGAGCTACGCAC


GCTTGCGATCCAGCGGAACCGCGCGGTGGTGGAGGGGATCCGCAAGCGAC


TGCCCCCAGGTGCTCCTGCGGCAGCTGAGTTGCTCCTGCACTCGGTCATC


GCTGGCGCGACGATGCAGTGGGCCGTCGATCCGGATGGTGAGCTAGCTGA


TCATGTGCTGGCTCAGATCGCTGCCATCCTGTGTTTAATGTTTCCCGAAC


ACGACGATTTCCAACTCCTCCAGGCACATGCGTAA (SEQ ID


NO: 20)





V66G


ATGCCCCGCCCCAAGCTCAAGTCCGATGACGAGGTACTCGAGGCCGCCAC


CGTAGTGCTGAAGCGTTGCGGTCCCATAGAGTTCACGCTCAGCGGAGTAG


CAAAGGAGGTGGGGCTCTCCCGCGCAGCGTTAATCCAGCGCTTCACCAAC


CGCGATACGCTGCTGGTGAGGATGATGGAGCGCGGCGTCGAGCAGGGACG


GGCATTACCTGAATGCGATACCGATAGCGCAGGGCCGCAAGGGCTCTGGG


AATTTTTGCAGGTGCTCGTTCGGAGCATGAACACTCGCAACGACTTCTCG


GTGAACTATCTCATCTCCTGGTACGAGCTCCAGGTGCCGGAGCTACGCAC


GCTTGCGATCCAGCGGAACCGCGCGGTGGTGGAGGGGATCCGCAAGCGAC


TGCCCCCAGGTGCTCCTGCGGCAGCTGAGTTGCTCCTGCACTCGGTCATC


GCTGGCGCGACGATGCAGTGGGCCGTCGATCCGGATGGTGAGCTAGCTGA


TCATGTGCTGGCTCAGATCGCTGCCATCCTGTGTTTAATGTTTCCCGAAC


ACGACGATTTCCAACTCCTCCAGGCACATGCGTAA (SEQ ID


NO: 21)





V66I


ATGCCCCGCCCCAAGCTCAAGTCCGATGACGAGGTACTCGAGGCCGCCAC


CGTAGTGCTGAAGCGTTGCGGTCCCATAGAGTTCACGCTCAGCGGAGTAG


CAAAGGAGGTGGGGCTCTCCCGCGCAGCGTTAATCCAGCGCTTCACCAAC


CGCGATACGCTGCTGGTGAGGATGATGGAGCGCGGCGTCGAGCAGATCCG


GCATTACCTGAATGCGATACCGATAGGCGCAGGGCCGCAAGGGCTCTGGG


AATTTTTGCAGGTGCTCGTTCGGAGCATGAACACTCGCAACGACTTCTCG


GTGAACTATCTCATCTCCTGGTACGAGCTCCAGGTGCCGGAGCTACGCAC


GCTTGCGATCCAGCGGAACCGCGCGGTGGTGGAGGGGATCCGCAAGCGAC


TGCCCCCAGGTGCTCCTGCGGCAGCTGAGTTGCTCCTGCACTCGGTCATC


GCTGGCGCGACGATGCAGTGGGCCGTCGATCCGGATGGTGAGCTAGCTGA


TCATGTGCTGGCTCAGATCGCTGCCATCCTGTGTTTAATGTTTCCCGAAC


ACGACGATTTCCAACTCCTCCAGGCACATGCGTAA (SEQ ID


NO: 22)





V66D


ATGCCCCGCCCCAAGCTCAAGTCCGATGACGAGGTACTCGAGGCCGCCAC


CGTAGTGCTGAAGCGTTGCGGTCCCATAGAGTTCACGCTCAGCGGAGTAG


CAAAGGAGGTGGGGCTCTCCCGCGCAGCGTTAATCCAGCGCTTCACCAAC


CGCGATACGCTGCTGGTGAGGATGATGGAGCGCGGCGTCGAGCAGGACCG


GCATTACCTGAATGCGATACCGATAGGCGCAGGGCCGCAAGGGCTCTGGG


AATTTTTGCAGGTGCTCGTTCGGAGCATGAACACTCGCAACGACTTCTCG


GTGAACTATCTCATCTCCTGGTACGAGCTCCAGGTGCCGGAGCTACGCAC


GCTTGCGATCCAGCGGAACCGCGCGGTGGTGGAGGGGATCCGCAAGCGAC


TGCCCCCAGGTGCTCCTGCGGCAGCTGAGTTGCTCCTGCACTCGGTCATC


GCTGGCGCGACGATGCAGTGGGCCGTCGATCCGGATGGTGAGCTAGCTGA


TCATGTGCTGGCTCAGATCGCTGCCATCCTGTGTTTAATGTTTCCCGAAC


ACGACGATTTCCAACTCCTCCAGGCACATGCGTAA (SEQ ID


NO: 23)





M1B10


ATGCCCCGCCCCAAGCTCAAGTCCGATGACGAGGTACTCGAGGCCGCCAC


CGTAGTGCTGAAGCGTTGCGGTCCCATAGAGTTCACGCTCAGCGGAGTAG


CAAAGGAGGTGGGGCTCTCCCGCGCAGCGTTAATCCAGCGCTTCATCAAC


CGCGATACGCTGCTGGTGAGGATGATGGAGCGCGGCGTCGAGCAGGTGCG


GCATTACCTGAATGCGATACCGATAGGCGCAGGGCCGCAAGGGCTCTGGG


AATTTTTGCAGGTGTTCGTTCGGAGCATGAACACTCGCAACAACTTCTCG


GTGAACTATCTCATCTCCTGGTACGATCTCCAGGTGCCGGAGCTACGCAC


GCTTGCGATCCAGCGGAACCGCGCGGTGGTGGAGGGGATCCGCAAGCGAC


TGCCCCCAGGTGCTCCTGCGGCAGCTGAGTTGCTCCTGCACTCGGTCATC


GCTGGCGCGACGATGCAGTGGGCCGTCGATCCGGATGGTGAGCTAGCTGA


TCATGTGCTGGCTCAGATCGCTGCCATCCTGTGTTTAATGTTTCCCGAAC


ACGACGATTTCCAACTCCTCCAGGCACATGCGTAA (SEQ ID


NO: 24)





M9C4


ATGCCCCGCCCCAAGCTCAAGTCCGATGACGAGGTACTCGAGGCCGCCAC


CGTAGTGCTGAAGCGTTGCGGTCCCATAGAGTTCACGCTCAGCGGAGTAG


CAAAGGAGGTGGGACTCTCCCGCGCAGCGTTAATCCAGCGCTTCACCAAC


CGCGATACGCTGCTGGTGAGGATGATGGAGCGCGGCGTCGAGCAGGTGCG


GCATTACCTGAATGCGATACCGATAGGCGCAGGGCCGCAAGGGCTCTGGG


AATTTTTGCAGGTGCTCGTTCGGAGCATGAACACTCGCAACGACTTCTCG


GTGAACTATCTCATCTCCTGGTACGAGCTCCAGGTGCCGGAGCTACGCAC


GCTTGCGATCCAGACTAACCGCGCGGTGGTGGAGGGGATCCGCAATCGAC


TGCCCCCAGGTGCTCCTGCGGCAGCTGAGTTGCTCCTGCACTCGGTCATC



ACTGGCGCGACGATGCAGTGGGCCGTCGATCCGGATGGTGAGCTAGCTGA



TCATGTGCTGGCTCAGATCGCTGCCATCCTGTGTTTAATGTTTCCCGAAC


AAGACGATTTCCAACTCCTCCAGGCACATGCGTAA (SEQ ID


NO: 58)








E7_M9C4


ATGCCCCGCCCCAAGCTCAAGTCCGATGACGAGGTACTCGAGGCCGCCAC


CGTAGTGCTGAAGCGTTGCGGTCCCATAGAGTTCACGCTCAGCGGAGTAG


CAAAGGAGGTGGGACTCTCCCGCGCAGCGTTAATCCAGCGCTTCACCAAC


CGCGATACGCTGCTGGTGAGGATGATGGAGCGCGGCGTCGAGCAGGTGCG


GCATTACCTGAATGCGATACCGATAGGCGCAGGGCCGCAAGGGCTCTGGG


AATTTTTGCAGGTGCTCGTTCGGAGCATGAACACTCGCAACGACTTCTCG


GTGAACTATCTCATCTCCTGGTACGAGCTCCAGGTGCCGGAGCTACGCAC


GCTTGCGATCCAGACTAACCGCGCGGTGGTGGAGGGGATCCGCAATCGAC


TGCCCCCAGGTGCTCCTGCGGCAGCTGAGTTGCTCCTGCACTCGGTCATC



ACTGGCGCGACGATGCAGTGGGCCGTCGATCCGGATGGTGAGCTAGCTGA



TCATGTGCTGGCTCAGATCGCTGCCATCCTGTGTTTAATGTTTCCCGAAC


AAGACGATTTCCAACTCCTCCAGGCACATGCGTAA (SEQ ID NO: 59)









Provided herein are the nucleic acid sequences for the plasmid vectors disclosed above:

    • Plasmid pMLGFP:
    • LOCUS pMLGFP 3957 bp DNA circular
    • SOURCE
      • ORGANISM
    • COMMENT This file is created by Vector NTI
    • COMMENT VNTDATE|493119689|
    • COMMENT VNTDBDATE|508971571|
    • COMMENT VNTNAME|pMLGFP|
    • COMMENT VNTAUTHORNAME|zh|
    • FEATURES Location/Qualifiers
      • misc_feature 1796..1953
        • /vntifkey=“21”
        • /label=Terminator
      • CDS 2233..3093
        • /vntifkey=“4”
        • /label=Amp
      • rep_origin 3238..3911
        • /vntifkey=“33”
        • /label=pBR322\ori
      • CDS complement(103..687)
        • /vntifkey=“4”
        • /label=MphR
      • promoter complement(716..752)
        • /vntifkey=“30”
        • /label=PlacIQ
      • RBS 697..702
        • /vntifkey=“32”
        • /label=RBS
      • promoter 759..842
        • /vntifkey=“30”
        • /label=lacpromoter
      • promoter 843..880
        • /vntifkey=“30”
        • /label=PmphR
      • CDS 901..1617
        • /vntifkey=“4”
        • /label=GFP
      • RBS 887..892
        • /vntifkey=“32”
        • /label=RBS
    • BASE COUNT 1017 a 972 c 992 g 976 t
    • ORIGIN










(SEQ ID NO: 25)










   1
tctagtgtac agtgatcaag acttcgatac caccgaccgt accggtacta atcgacgacg






  61
gtcgtgttcg tcgcctgccg cagggactct gcacacctcc gtttacgcat gtgcctggag





 121
gagttggaaa tcgtcgtgtt cgggaaacat taaacacagg atggcagcga tctgagccag





 181
cacatgatca gctagctcac catccggatc gacggcccac tgcatcgtcg cgccagcgat





 241
gaccgagtgc aggagcaact cagctgccgc aggagcacct gggggcagtc gcttgcggat





 301
cccctccacc accgcgcggt tccgctggat cgcaagcgtg cgtagctccg gcacctggag





 361
ctcgtaccag gagatgagat agttcaccga gaagtcgttg cgagtgttca tgctccgaac





 421
gagcacctgc aaaaattccc agagcccttg cggccctgcg cctatcggta tcgcattcag





 481
gtaatgccgc acctgctcga cgccgcgctc catcatcctc accagcagcg tatcgcggtt





 541
ggtgaagcgc tggattaacg ctgcgcggga gagccccacc tcctttgcta ctccgctgag





 601
cgtgaactct atgggaccgc aacgcttcag cactacggtg gcggcctcga gtacctcgtc





 661
atcggacttg agcttggggc ggggcatcag tgttcacctt ctgtatgggt tggggggcgc





 721
tatcatgcca taccgcgaaa ggttttgcac catctagagc gcaacgcaat taatgtgagt





 781
tagctcactc attaggcacc ccaggcttta cactttatgc ttccggctcg tatgttgtgt





 841
gggattgaat ataaccgacg tgactgttac atttaggtgg gctaacagga ggaaactagt





 901
atgagtaaag gagaagaact tttcactgga gttgtcccaa ttcttgttga attagatggt





 961
gatgttaatg ggcacaaatt ttctgtcagt ggagagggtg aaggtgatgc aacatacgga





1021
aaacttaccc ttaaatttat ttgcactact ggaaaactac ctgttccatg gccaacactt





1081
gtcactactt tctcttatgg tgttcaatgc ttttcccgtt atccggatca tatgaaacgg





1141
catgactttt tcaagagtgc catgcccgaa ggttatgtac aggaacgcac tatatctttc





1201
aaagatgacg ggaactacaa gacgcgtgct gaagtcaagt ttgaaggtga tacccttgtt





1261
aatcgtatcg agttaaaagg tattgatttt aaagaagatg gaaacattct cggacacaaa





1321
ctcgagtaca actataactc acacaatgta tacatcacgg cagacaaaca aaagaatgga





1381
atcaaagcta acttcaaaat tcgccacaac attgaagatg gatccgttca actagcagac





1441
cattatcaac aaaatactcc aattggcgat ggccctgtcc ttttaccaga caaccattac





1501
ctgtcgacac aatctgccct ttcgaaagat cccaacgaaa agcgtgacca catggtcctt





1561
cttgagtttg taactgctgc tgggattaca catggcatgg atgagctcta caaataagct





1621
tgggcccgaa caaaaactca tctcagaaga ggatctgaat agcgccgtcg accatcatca





1681
tcatcatcat tgagtttaaa cggtctccag cttggctgtt ttggcggatg agagaagatt





1741
ttcagcctga tacagattaa atcagaacgc agaagcggtc tgataaaaca gaatttgcct





1801
ggcggcagta gcgcggtggt cccacctgac cccatgccga actcagaagt gaaacgccgt





1861
agcgccgatg gtagtgtggg gtctccccat gcgagagtag ggaactgcca ggcatcaaat





1921
aaaacgaaag gctcagtcga aagactgggc ctttcgtttt atctgttgtt tgtcggtgaa





1981
cgctctcctg agtaggacaa atccgccggg agcggatttg aacgttgcga agcaacggcc





2041
cggagggtgg cgggcaggac gcccgccata aactgccagg catcaaatta agcagaaggc





2101
catcctgacg gatggccttt ttgcgtttct acaaactctt tttgtttatt tttctaaata





2161
cattcaaata tgtatccgct catgagacaa taaccctgat aaatgcttca ataatattga





2221
aaaaggaaga gtatgagtat tcaacatttc cgtgtcgccc ttattccctt ttttgcggca





2281
ttttgccttc ctgtttttgc tcacccagaa acgctggtga aagtaaaaga tgctgaagat





2341
cagttgggtg cacgagtggg ttacatcgaa ctggatctca acagcggtaa gatccttgag





2401
agttttcgcc ccgaagaacg ttttccaatg atgagcactt ttaaagttct gctatgtggc





2461
gcggtattat cccgtgttga cgccgggcaa gagcaactcg gtcgccgcat acactattct





2521
cagaatgact tggttgagta ctcaccagtc acagaaaagc atcttacgga tggcatgaca





2581
gtaagagaat tatgcagtgc tgccataacc atgagtgata acactgcggc caacttactt





2641
ctgacaacga tcggaggacc gaaggagcta accgcttttt tgcacaacat gggggatcat





2701
gtaactcgcc ttgatcgttg ggaaccggag ctgaatgaag ccataccaaa cgacgagcgt





2761
gacaccacga tgcctgtagc aatggcaaca acgttgcgca aactattaac tggcgaacta





2821
cttactctag cttcccggca acaattaata gactggatgg aggcggataa agttgcagga





2881
ccacttctgc gctcggccct tccggctggc tggtttattg ctgataaatc tggagccggt





2941
gagcgtgggt ctcgcggtat cattgcagca ctggggccag atggtaagcc ctcccgtatc





3001
gtagttatct acacgacggg gagtcaggca actatggatg aacgaaatag acagatcgct





3061
gagataggtg cctcactgat taagcattgg taactgtcag accaagttta ctcatatata





3121
ctttagattg atttaaaact tcatttttaa tttaaaagga tctaggtgaa gatccttttt





3181
gataatctca tgaccaaaat cccttaacgt gagttttcgt tccactgagc gtcagacccc





3241
gtagaaaaga tcaaaggatc ttcttgagat cctttttttc tgcgcgtaat ctgctgcttg





3301
caaacaaaaa aaccaccgct accagcggtg gtttgtttgc cggatcaaga gctaccaact





3361
ctttttccga aggtaactgg cttcagcaga gcgcagatac caaatactgt ccttctagtg





3421
tagccgtagt taggccacca cttcaagaac tctgtagcac cgcctacata cctcgctctg





3481
ctaatcctgt taccagtggc tgctgccagt ggcgataagt cgtgtcttac cgggttggac





3541
tcaagacgat agttaccgga taaggcgcag cggtcgggct gaacgggggg ttcgtgcaca





3601
cagcccagct tggagcgaac gacctacacc gaactgagat acctacagcg tgagctatga





3661
gaaagcgcca cgcttcccga agggagaaag gcggacaggt atccggtaag cggcagggtc





3721
ggaacaggag agcgcacgag ggagcttcca gggggaaacg cctggtatct ttatagtcct





3781
gtcgggtttc gccacctctg acttgagcgt cgatttttgt gatgctcgtc aggggggcgg





3841
agcctatgga aaaacgccag caacgcggcc tttttacggt tcctggcctt ttgctggcct





3901
tttgctcaca tgttctttcc tgcgttatcc cctgattctg tggataaccg tattacc
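The MphR feature above is annotated as complement(103..687), meaning that its coding strand is the reverse complement of the listed strand. As an illustrative sketch only (the helper name `reverse_complement` is hypothetical and not part of the disclosure), the following Python snippet reverse-complements bases 661..687 of the pMLGFP sequence as listed above, which correspond to the 3′ end of that feature on the listed strand:

```python
# Illustrative helper; the name is an assumption, not taken from the disclosure.
_COMPLEMENT = str.maketrans("acgtACGT", "tgcaTGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string, ignoring spaces."""
    return seq.replace(" ", "").translate(_COMPLEMENT)[::-1]


# Bases 661..687 of pMLGFP, copied from the listing above.
fragment = "atcggacttg agcttggggc ggggcat"
print(reverse_complement(fragment))
```

The printed string, atgccccgccccaagctcaagtccgat, matches the first 27 bases of the MphR coding sequences listed earlier (ATGCCCCGCCCCAAGCTCAAGTCCGAT), consistent with the complement annotation.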








    • Plasmid pJZ12:

    • LOCUS pJZ12 5131 bp DNA circular

    • SOURCE
      • ORGANISM

    • COMMENT This file is created by Vector NTI

    • COMMENT VNTDATE|493491327|

    • COMMENT VNTDBDATE|508971571|

    • COMMENT VNTNAME|pJZ12|

    • COMMENT VNTAUTHORNAME|zh|

    • FEATURES Location/Qualifiers
      • CDS 582..1772
        • /vntifkey=“4”
        • /label=TetR
      • rep_origin 4713..412
        • /vntifkey=“33”
        • /label=rep(p15A)
      • CDS 2945..3850
        • /vntifkey=“4”
        • /label=mphA
      • CDS 3847..4649
        • /vntifkey=“4”
        • /label=mrx\incomplete\CDS

    • BASE COUNT 980 a 1521 c 1515 g 1115 t

    • ORIGIN













(SEQ ID NO: 26)










   1
tcattccgct gttatggccg cgtttgtctc attccacgcc tgacactcag ttccgggtag






  61
gcagttcgct ccaagctgga ctgtatgcac gaaccccccg ttcagtccga ccgctgcgcc





 121
ttatccggta actatcgtct tgagtccaac ccggaaagac atgcaaaagc accactggca





 181
gcagccactg gtaattgatt tagaggagtt agtcttgaag tcatgcgccg gttaaggcta





 241
aactgaaagg acaagttttg gtgactgcgc tcctccaagc cagttacctc ggttcaaaga





 301
gttggtagct cagagaacct tcgaaaaacc gccctgcaag gcggtttttt cgttttcaga





 361
gcaagagatt acgcgcagac caaaacgatc tcaagaagat catcttatta atcagataaa





 421
atatttctag atttcagtgc aatttatctc ttcaaatgta gcacctgaag tcagccccat





 481
acgatataag ttgtaattct catgtttgac agcttatcat cgataagctt taatgcggta





 541
gtttatcaca gttaaattgc taacgcagtc aggcaccgtg tatgaaatct aacaatgcgc





 601
tcatcgtcat cctcggcacc gtcaccctgg atgctgtagg cataggcttg gttatgccgg





 661
tactgccggg cctcttgcgg gatatcgtcc attccgacag catcgccagt cactatggcg





 721
tgctgctagc gctatatgcg ttgatgcaat ttctatgcgc acccgttctc ggagcactgt





 781
ccgaccgctt tggccgccgc ccagtcctgc tcgcttcgct acttggagcc actatcgact





 841
acgcgatcat ggcgaccaca cccgtcctgt ggatcctcta cgccggacgc atcgtggccg





 901
gcatcaccgg cgccacaggt gcggttgctg gcgcctatat cgccgacatc accgatgggg





 961
aagatcgggc tcgccacttc gggctcatga gcgcttgttt cggcgtgggt atggtggcag





1021
gccccgtggc cgggggactg ttgggcgcca tctccttgca tgcaccattc cttgcggcgg





1081
cggtgctcaa cggcctcaac ctactactgg gctgcttcct aatgcaggag tcgcataagg





1141
gagagcgtcg accgatgccc ttgagagcct tcaacccagt cagctccttc cggtgggcgc





1201
ggggcatgac tatcgtcgcc gcacttatga ctgtcttctt tatcatgcaa ctcgtaggac





1261
aggtgccggc agcgctctgg gtcattttcg gcgaggaccg ctttcgctgg agcgcgacga





1321
tgatcggcct gtcgcttgcg gtattcggaa tcttgcacgc cctcgctcaa gccttcgtca





1381
ctggtcccgc caccaaacgt ttcggcgaga agcaggccat tatcgccggc atggcggccg





1441
acgcgctggg ctacgtcttg ctggcgttcg cgacgcgagg ctggatggcc ttccccatta





1501
tgattcttct cgcttccggc ggcatcggga tgcccgcgtt gcaggccatg ctgtccaggc





1561
aggtagatga cgaccatcag ggacagcttc aaggatcgct cgcggctctt accagcctaa





1621
cttcgatcac tggaccgctg atcgtcacgg cgatttatgc cgcctcggcg agcacatgga





1681
acgggttggc atggattgta ggcgccgccc tataccttgt ctgcctcccc gcgttgcgtc





1741
gcggtgcatg gagccgggcc acctcgacct gaatggaagc cggcggcacc tcgctaacgg





1801
attcaccact ccaagaattg gagccaatca attcttgcgg agaactgtga atgcgcaaac





1861
caacccttgg cagaacatat ccatcgcgtc cgccatctcc agcagccgca cgcggcgcat





1921
ctcgggcagc gttgggtcct ggccacgggt gcgcatgatc gtgctcctgt cgttgaggac





1981
ccggctaggc tggcggggtt gccttactgg ttagcagaat gaatcaccga tacgcgagcg





2041
aacgtgaagc gactgctgct gcaaaacgtc tgcgacctga gcaacaacat gaatggtctt





2101
cggtttccgt gtttcgtaaa gtctggaaac gcggaagtcc cctacgtgct gctgaagttg





2161
cccgcaacag agagtggaac cggtacccgg ggatcctcta gagtcgacct gcaggagatg





2221
ctggctgaac gcggagtgaa tgtcgatcac tccacgattt accgctgggt tcagcgttat





2281
gcgcctgaaa tggaaaaacg gctgcgctgg tactggcgta acccttccga tctttgcccg





2341
tggcacatgg atgaaaccta cgtgaaggtc aatggccgct gggcgtatct gtaccgggcc





2401
gtcgacagcc ggggccgcac tgtcgatttt tatctctcct cccgtcgtaa cagcaaagct





2461
gcataccggt ttctgggtaa aatcctcaac aacgtgaaga agtggcagat cccgcgattc





2521
atcaacacgg ataaagcgcc cgcctatggt cgcgcgcttg ctctgctcaa acgcgaaggc





2581
cggtgcccgt ctgacgttga acaccgacag attaagtacc ggaacaacgt gattgaatgc





2641
gatcatggca aactgaaacg gataatcggc gccacgctgg gatttaaatc catgaagacg





2701
gcttacgcca ccatcaaagg tattgaggtg atgcgtgcac tacgcaaagg ccaggcctca





2761
gcattttatt atggtgatcc cctgggcgaa atgcgcctgg taagcagagt ttttgaaatg





2821
taaggccttt gaataagaca aaaggctgcc tcatcgctaa ctttgcaaca gtgccggatt





2881
gaatataacc gacgtgactg ttacatttag gtggctaaac ccgtcaagcc ctcaggagtg





2941
aatcatgacc gtagtcacga ccgccgatac ctcccaactg tacgcacttg cagcccgaca





3001
tgggctcaag ctccatggcc cgctgactgt caatgagctt gggctcgact ataggatcgt





3061
gatcgccacc gtcgacgatg gacgtcggtg ggtgctgcgc atcccgcgcc gagccgaggt





3121
aagcgcgaag gtcgaaccag aggcgcgggt gctggcaatg ctcaagaatc gcctgccgtt





3181
cgcggtgccg gactggcgcg tggccaacgc cgagctcgtt gcctatccca tgctcgaaga





3241
ctcgactgcg atggtcatcc agcctggttc gtccacgccc gactgggtcg tgccgcagga





3301
ctcggaggtc ttcgcggaga gcttcgcgac cgcgctcgcc gccctgcatg ccgtccccat





3361
ttccgccgcc gtggatgcgg ggatgctcat ccgtacaccg acgcaggccc gtcagaaggt





3421
ggccgacgac gttgaccgcg tccgacgcga gttcgtggtg aacgacaagc gcctccaccg





3481
gtggcagcgc tggctcgacg acgattcgtc gtggccagat ttctccgtgg tggtgcatgg





3541
cgatctctac gtgggccatg tgctcatcga caacacggag cgcgtcagcg ggatgatcga





3601
ctggagcgag gcccgcgttg atgaccctgc catcgacatg gccgcgcacc ttatggtctt





3661
tggtgaagag gggctcgcga agctcctcct cacgtatgaa gcggccggtg gccgggtgtg





3721
gccgcggctc gcccaccaca tcgcggagcg ccttgcgttc ggggcggtca cctacgcact





3781
cttcgccctc gactcgggta acgaagagta cctcgctgcg gcgaaggcgc agctcgccgc





3841
agcggaatga gcgaacgtcg atatagcccg ctcgcgacgc tgttcgcggc gacctttctc





3901
ttccggatcg gcaacgcggt ggcggccctc gcgcttccat ggttcgtcct gtctcataca





3961
aagagcgcgg cctgggcggg cgccacggcc gctagcagcg tcatcgcgac catcatcggc





4021
gcgtgggttg gtggtggcct cgtcgatcgg ttcgggcgcg cgcccgtcgc attgatctcg





4081
ggtgtggtgg gcggcgtggc catggcgagc atcccactgc tcgatgccgt tggcgccctc





4141
tcgaacactg ggctgatcgc ttgcgtggtg ctcggtgccg cgttcgacgc acccggtatg





4201
gccgcgcagg acagtgagct gcccaaactc ggccacgtcg ccgggctctc cgttgagcgc





4261
gtctcgtcac tgaaagcggt gatcgggaac gtcgcgattc taggtggccc ggcccttggg





4321
ggggccgcaa tcggcctgct tggcgctgcg ccaacgctcg ggctgacggc gttctgctcc





4381
gtccttgcag gtctgctcgg cgcgtgggtg cttcccgcgc gtgccgctcg gacgatgacc





4441
acgacggcga ctctctccat gcgcgccggc gtcgcttttc tctggagcga acccctgctg





4501
cgccctctct ttggtatagt gatgatcttc gtgggcatcg ttggcgccaa cggcagcgtc





4561
atcatgcctg cgctgtttgt agatgcagga cgccaagtag cagagctcgg gctgttctcc





4621
tcaatgatgg gggctggtgg tctccttggc tgtccctcct gttcagctac tgacggggtg





4681
gtgcgtaacg gcaaaagcac cgccggacat cagcgctagc ggagtgtata ctggcttact





4741
atgttggcac tgatgagggt gtcagtgaag tgcttcatgt ggcaggagaa aaaaggctgc





4801
accggtgcgt cagcagaata tgtgatacag gatatattcc gcttcctcgc tcactgactc





4861
gctacgctcg gtcgttcgac tgcggcgagc ggaaatggct tacgaacggg gcggagattt





4921
cctggaagat gccaggaaga tacttaacag ggaagtgaga gggccgcggc aaagccgttt





4981
ttccataggc tccgcccccc tgacaagcat cacgaaatct gacgctcaaa tcagtggtgg





5041
cgaaacccga caggactata aagataccag gcgtttcccc ctggcggctc cctcgtgcgc





5101
tctcctgttc ctgcctttcg gtttaccggt g
















DNA sequences with upstream nucleotide sequences















WT


(SEQ ID NO: 28)



AGAAGGTGAACACTGATGCCCCGCCCCAAGCTCAAGTCCGATGACGAGGTACTCGA






GGCCGCCACCGTAGTGCTGAAGCGTTGCGGTCCCATAGAGTTCACGCTCAGCGGAG





TAGCAAAGGAGGTGGGGCTCTCCCGCGCAGCGTTAATCCAGCGCTTCACCAACCGC





GATACGCTGCTGGTGAGGATGATGGAGCGCGGCGTCGAGCAGGTGCGGCATTACCT





GAATGCGATACCGATAGGCGCAGGGCCGCAAGGGCTCTGGGAATTTTTGCAGGTGC





TCGTTCGGAGCATGAACACTCGCAACGACTTCTCGGTGAACTATCTCATCTCCTGGT





ACGAGCTCCAGGTGCCGGAGCTACGCACGCTTGCGATCCAGCGGAACCGCGCGGTG





GTGGAGGGGATCCGCAAGCGACTGCCCCCAGGTGCTCCTGCGGCAGCTGAGTTGCT





CCTGCACTCGGTCATCGCTGGCGCGACGATGCAGTGGGCCGTCGATCCGGATGGTG





AGCTAGCTGATCATGTGCTGGCTCAGATCGCTGCCATCCTGTGTTTAATGTTTCCCGA





ACACGACGATTTCCAACTCCTCCAGGCACATGCGTAA





epA3


(SEQ ID NO: 29)




GGAAGGTGAACACTGATGCCCCGCCCCAAGCTCAAGTCCGATGACGAGGTACTCGA






GGCCGCCACCGTAGTGCTGAAGCGTTGCGGTCCCATAGAGTTCACGCTCAGCGGAG





TAGCAAAGGAGGTGGGGCTCTCCCGCGCAGCGTTAATCCAGCGCTTCACCAACCGC





GATACGCTGCTGGTGAGGATGATGGAGCGCGGCGTCGAGCAGGTGCGGCATTACCT





GAATGCGATACCGATATGCGCAGGGCCGCAAGGGCTCTGGGAATTTTTGCAGGTGC





TCGTTCGGAGCATGAACACTCGCAACGACTTCTCGGTGAACTATCTCATCTCCTGGT





ACGAGCTCCAGGTGCCGGAGCTACGCACGCTTGCGATCCAGCGGAACCGCGCGGTG





GTGGAGGGGATCCGCAAGCGACTGCCCCCAGGTGCTCCTGCGGCAGCTGAGTTGCT





CCTGCACTCGGTCATCGCTGGCGCGACGATGCAGTGGGCCGTCGATCCGGATGGTG





AGCTAGCTGATCATGTGCTGGCTCAGATCGCTGCCATCCTGTGTTTAATGTTTCCCGA





ACACGACGATTTCCAACTCCTCCAGGCACATGCGTAA





epE7


(SEQ ID NO: 30)



AGATGGTGAACACTGATGCCCCGCCCCAAGCTCAAGTCCGATGACGAGGTACTCGA






GGCCGCCACCGTAGTGCTGAAGCGTTGCGGTCCCATAGAGTTCACGCTCAGCGGAG





TAGCAAAGGAGGTGGGGCTCTCCCGCGCAGCGTTAATCCAGCGCTTCACCAACCGC





GATACGCTGCTGGTGAGGATGATGGAGCGCGGCGTCGAGCAGGTGCGGCATTACCT






TAATGCGATACCGATAGGCGCAGGGCCGCAAGGGCTCTGGGAATTTTTGCAGGTGC






TCATTCGGAGCATGAACACTCGCAACGACTTCTCGGTGAACTATCTCATCTCCTGGT





ACGAGCTCCAGGTGCCGGAGCTACGCACGCTTGCGATCCAGCGGAACCGCGCGGTG





GTGGAGGGGATCCGCAAGCGACTGCCCCCAGGTGCTCCTGCGGCAGCTGAGTTGCT





CCTGCACTCGGTCATCGCTGGCGCGACGATGCAGTGGGCCGTCGATCCGGATGGTG





AGCTAGCTGATCATGTGCTGGCTCAGATCGCTGCCATCCTGTGTTTAATGTTTCCCGA





ACACGACGATTTCCAACTCCTCCAGGCACATGCGTAA





WT A3-RBS


(SEQ ID NO: 31)




GGAAGGTGAACACTGATGCCCCGCCCCAAGCTCAAGTCCGATGACGAGGTACTCGA






GGCCGCCACCGTAGTGCTGAAGCGTTGCGGTCCCATAGAGTTCACGCTCAGCGGAG





TAGCAAAGGAGGTGGGGCTCTCCCGCGCAGCGTTAATCCAGCGCTTCACCAACCGC





GATACGCTGCTGGTGAGGATGATGGAGCGCGGCGTCGAGCAGGTGCGGCATTACCT





GAATGCGATACCGATAGGCGCAGGGCCGCAAGGGCTCTGGGAATTTTTGCAGGTGC





TCGTTCGGAGCATGAACACTCGCAACGACTTCTCGGTGAACTATCTCATCTCCTGGT





ACGAGCTCCAGGTGCCGGAGCTACGCACGCTTGCGATCCAGCGGAACCGCGCGGTG





GTGGAGGGGATCCGCAAGCGACTGCCCCCAGGTGCTCCTGCGGCAGCTGAGTTGCT





CCTGCACTCGGTCATCGCTGGCGCGACGATGCAGTGGGCCGTCGATCCGGATGGTG





AGCTAGCTGATCATGTGCTGGCTCAGATCGCTGCCATCCTGTGTTTAATGTTTCCCGA





ACACGACGATTTCCAACTCCTCCAGGCACATGCGTAA





WT E7-RBS


(SEQ ID NO: 32)



AGATGGTGAACACTGATGCCCCGCCCCAAGCTCAAGTCCGATGACGAGGTACTCGA






GGCCGCCACCGTAGTGCTGAAGCGTTGCGGTCCCATAGAGTTCACGCTCAGCGGAG





TAGCAAAGGAGGTGGGGCTCTCCCGCGCAGCGTTAATCCAGCGCTTCACCAACCGC





GATACGCTGCTGGTGAGGATGATGGAGCGCGGCGTCGAGCAGGTGCGGCATTACCT





GAATGCGATACCGATAGGCGCAGGGCCGCAAGGGCTCTGGGAATTTTTGCAGGTGC





TCGTTCGGAGCATGAACACTCGCAACGACTTCTCGGTGAACTATCTCATCTCCTGGT





ACGAGCTCCAGGTGCCGGAGCTACGCACGCTTGCGATCCAGCGGAACCGCGCGGTG





GTGGAGGGGATCCGCAAGCGACTGCCCCCAGGTGCTCCTGCGGCAGCTGAGTTGCT





CCTGCACTCGGTCATCGCTGGCGCGACGATGCAGTGGGCCGTCGATCCGGATGGTG





AGCTAGCTGATCATGTGCTGGCTCAGATCGCTGCCATCCTGTGTTTAATGTTTCCCGA





ACACGACGATTTCCAACTCCTCCAGGCACATGCGTAA





WT H4-RBS


(SEQ ID NO: 33)



AGAAGGCGAACACTGATGCCCCGCCCCAAGCTCAAGTCCGATGACGAGGTACTCGA






GGCCGCCACCGTAGTGCTGAAGCGTTGCGGTCCCATAGAGTTCACGCTCAGCGGAG





TAGCAAAGGAGGTGGGGCTCTCCCGCGCAGCGTTAATCCAGCGCTTCACCAACCGC





GATACGCTGCTGGTGAGGATGATGGAGCGCGGCGTCGAGCAGGTGCGGCATTACCT





GAATGCGATACCGATAGGCGCAGGGCCGCAAGGGCTCTGGGAATTTTTGCAGGTGC





TCGTTCGGAGCATGAACACTCGCAACGACTTCTCGGTGAACTATCTCATCTCCTGGT





ACGAGCTCCAGGTGCCGGAGCTACGCACGCTTGCGATCCAGCGGAACCGCGCGGTG





GTGGAGGGGATCCGCAAGCGACTGCCCCCAGGTGCTCCTGCGGCAGCTGAGTTGCT





CCTGCACTCGGTCATCGCTGGCGCGACGATGCAGTGGGCCGTCGATCCGGATGGTG





AGCTAGCTGATCATGTGCTGGCTCAGATCGCTGCCATCCTGTGTTTAATGTTTCCCGA





ACACGACGATTTCCAACTCCTCCAGGCACATGCGTAA





QCMS3D6


(SEQ ID NO: 34)



AGAAGGTGAACACTGATGCCCCGCCCCAAGCTCAAGTCCGATGACGAGGTACTCGA






GGCCGCCAGGGTAGTGCTGAAGCGTTGCGGTCCCATAGAGTTCACGCTCAGCGGAG





TAGCAAAGGAGGTGGGGCTCTCCCGCGCAGCGTTAATCCAGCGCTTCACCAACCGC





GATACGCTGCTGGTGAGGATGATGGAGCGCGGCGTCGAGCAGGTGCGGCATTACCT





GAATGCGATACCGATAGGCGCAGGGCCGCAAGGGCTCTGGGAATTTTTGCAGGTGG





TCGTTCGGAGCATGAACACTCGCAACGACTTCTCGGTGAACTATCTCATCTCCTGGT





ACGAGCTCCAGGTGCCGGAGCTACGCACGCTTGCGATCCAGCGGAACCGCGCGGTG





GTGGAGGGGATCCGCAAGCGACTGCCCCCAGGTGCTCCTGCGGCAGCTGAGTTGCT





CCTGCACTCGGTCATCGCTGGCGCGACGATGCAGTGGGCCGTCGATCCGGATGGTG





AGCTAGCTGATCATGTGCTGGCTCAGATCGCTGCCATCCTGTGTTTAATGTTTCCCGA





ACACGACGATTTCCAACTCCTCCAGGCACATGCGTAA





QCMS3F8


(SEQ ID NO: 35)



AGAAGGTGAACACTGATGCCCCGCCCCAAGCTCAAGTCCGATGACGAGGTACTCGA






GGCCGCCGCGGTAGTGCTGAAGCGTTGCGGTCCCATAGAGTTCACGCTCAGCGGAG





TAGCAAAGGAGGTGGGGCTCTCCCGCGCAGCGTTAATCCAGCGCTTCACCAACCGC





GATACGCTGCTGGTGAGGATGAGTGAGCGCGGCGTCGAGCAGGTGCGGCATTACCT





GAATGCGATACCGATAGGCGCAGGGCCGCAAGGGCTCTGGGAATTTTTGCAGGTGC





TCGTTCGGAGCATGAACACTCGCAACGACTTCTCGGTGAACTATCTCATCTCCTGGT





ACGAGCTCCAGGTGCCGGAGCTACGCACGCTTGCGATCCAGCGGAACCGCGCGGTG





GTGGAGGGGATCCGCAAGCGACTGCCCCCAGGTGCTCCTGCGGCAGCTGAGTTGCT





CCTGCACTCGGTCATCGCTGGCGCGACGATGCAGTGGGCCGTCGATCCGGATGGTG





AGCTAGCTGATCATGTGCTGGCTCAGATCGCTGCCATCCTGTGTTTAATGTTTCCCGA





ACACGACGATTTCCAACTCCTCCAGGCACATGCGTAA





QCMS5B4


(SEQ ID NO: 36)



AGAAGGTGAACACTGATGCCCCGCCCCAAGCTCAAGTCCGATGACGAGGTACTCGA






GGCCGCCGGTGTAGTGCTGAAGCGTTGCGGTCCCATAGAGTTCACGCTCAGCGGAG





TAGCAAAGGAGGTGGGGCTCTCCCGCGCAGCGTTAATCCAGCGCTTCACCAACCGC





GATACGCTGCTGGTGAGGATGATGGAGCGCGGCGTCGAGATGGTTCGGCATTACCT





GAATGCGATACCGATAGGCGCAGGGCCGCAAGGGCTCTGGGAATTTTTGCAGGTGC





TCGTTCGGAGCATGAACACTCGCAACGACTTCTCGGTGAACTATCTCATCTCCTGGT





ACGAGCTCCAGGTGCCGGAGCTACGCACGCTTGCGATCCAGCGGAACCGCGCGGTG





GTGGAGGGGATCCGCAAGCGACTGCCCCCAGGTGCTCCTGCGGCAGCTGAGTTGCT





CCTGCACTCGGTCATCGCTGGCGCGACGATGCAGTGGGCCGTCGATCCGGATGGTG





AGCTAGCTGATCATGTGCTGGCTCAGATCGCTGCCATCCTGTGTTTAATGTTTCCCGA





ACACGACGATTTCCAACTCCTCCAGGCACATGCGTAA





QCMS5D7


(SEQ ID NO: 37)



AGAAGGTGAACACTGATGCCCCGCCCCAAGCTCAAGTCCGATGACGAGGTACTCGA






GGCCGCCGCTGTAGTGCTGAAGCGTTGCGGTCCCATAGAGTTCACGCTCAGCGGAGT





AGCAAAGGAGGTGGGGCTCTCCCGCGCAGCGTTAATCCAGCGCTTCACCAACCGCG





ATACGCTGCTGGTGAGGATGGAGGAGCGCGGCGTCGAGCAGGTGCGGCATTACCTG





AATGCGATACCGATAGGCGCAGGGCCGCAAGGGCTCTGGGAATTTTTGCAGGTGCT





CGTTCGGAGCATGAACACTCGCAACGACTTCTCGGTGAACTATCTCATCTCCTGGTA





CGAGCTCCAGGTGCCGGAGCTACGCACGCTTGCGATCCAGCGGAACCGCGCGGTGG





TGGAGGGGATCCGCAAGCGACTGCCCCCAGGTGCTCCTGCGGCAGCTGAGTTGCTC





CTGCACTCGGTCATCGCTGGCGCGACGATGCAGTGGGCCGTCGATCCGGATGGTGA





GCTAGCTGATCATGTGCTGGCTCAGATCGCTGCCATCCTGTGTTTAATGTTTCCCGAA





CACGACGATTTCCAACTCCTCCAGGCACATGCGTAA





pikB1/D3


(SEQ ID NO: 38)



AGAAGGTGAACACTGATGCCCCGCCCCAAGCTCAAGTCCGATGACGAGGTACTCGA






GGCCGCCACCGTAGTGCTGAAGCGTTGCGGTCCCATAGAGTTCACGCTCAGCGGAG





TAGCAAAGGAGGTGGGGCTCTCCCGCGCAGCGTTAATCCAGCGCTTCACCAACCGC





GATACGCTGCTGGTGAGGATGATGGAGCGCGGCGTCGAGCAGGTGCGGCATTACCT





GAATGCGATACCGATAGGCGCAGGGCCGCAAGGGCTCTGGGAATTTTTGCAGGTGC





TCGTTCGGAGCATGAACACTCGCAACGACTTCTCGGTGAACTATCTCATCTTCTGGT





ACGAGCTCCAGGTGCCGGAGCTACGCACGCTTGCGATCCAGCGGAACCGCGCGGTG





GTGGAGGGGATCCGCAAGCGACTGCCCCCAGGTGCTCCTGCGGCAGCTGAGTTGCT





CCTGCACTCGGTCATCGCTGGCGCGACGATGCAGTGGGCCGTCGATCCGGATGGTG





AGCTAGCTGATCATGTGCTGGCTCAGATCGCTGCCATCCTGTGTTTAATGTTTCCCGA





ACACGACGATTTCCAACTCCTCCAGGCACATGCGTAA





YCA11 (Three mutations upstream of the RBS [2 in promoter])


(SEQ ID NO: 39)


TGGTGCAAAACCTTTCGCGGTATGACATGATAGCGCCTCCCAGCCCATACAGAAGG





TGAACACTGATGCCCCGCCCCAAGCTCAAGTCCGATGACGAGGTACTCGAGGCCGC





CACCGTAGTGCTGAAGCGTTGCGGTCCCATAGAGTTCACGCTCAGAGGAGTAGCAA





AGGAGGTGGGGCTCTCCCGCGCTGCGTTAATCCAGCGCTTCACCAACCGCGATACGC





TGCTGGTGAGGATGATGGAGCGCGGCGTCGAGCAGGTGCGACATTACCTGAATGCG





ATACCGATAGGCGCAGGGCCGCAAGGGCTCTGGGAATTTTTGCAGGTGCTCGTTCG





GAGCATGAACACTCGCAACGACTTCTCGGTGAACTATCTCATCTCCTGGTACGAGCT





CCAGGTGCCGGAGCTACGCACGCTTGCGATCCAGCGGAACCGCGCGGTGGTGGAGG





GGATCCGCAAGCGACTGCCCCCAGGTGCTCCTGCGGCAGCTGAGTTGCTCCTGCACT





CGGTCATCGCTGGCGCGACGATGCAGTGGGCCGTCGATCCGGATGGTGAGCTAGCT





GATCATGTGCTGGCTCAGATCGCTGCCATCCTGTGTTTAATGTTTCCCGAACACGAC





GATTTCCAACTCCTCCAGGCACATGCGTAA





Nbn.YCG11 (Two mutations [1 in promoter])


(SEQ ID NO: 40)


TGGTGCAAAACCTTTCGCGATATGGCATGATAGCGCCCCCCAACCCATACAGAAGG





TGAACTCTGATGCCCCGCCCCAAGCTCAAGTCCGATGACGAGGTACTCGAGGCCGC





CACCGTAGTGCTGAAGCGTTGCGGTCCCATAGAGTTCACGCTCAGCGGAGTAGCAA





AGGAGGTGGGGTTCTCCCGCGCAGCGTTAATCCAGCGCTTCACCAACCGCGATACG





CTGCTGGTGAGGATGATGGAGCGCGGCGTCGAGCAGGTGCGGCATTACCTGAATGC





GATACCGATAGGCGCAGGGCCGCAAGGGCTCTGGGAATTTTTGCAGGTGCTCGTTC





GGAGCATGAACACTCGCAACGACTTCTCGGTGAACTATCTCATCTCCTGGTACGAGC





TCCAGGTGCCGGAGCTACGCACGCTTGCGATCCAGCGGAACCGCGCGGTGGTGGAG





GGGATCCGCAAGCGACTGCCCCCAGGTGCTCCTGCGGCAGCTGAGTTGCTCCTGCAC





TCGGTCATCGCTGGCGCGACGATGCAGTGGGCCGTCGATCCGGATGGTGAGCTAGC





TGATCATGTGCTGGCTCAGATCGCTGCCATCCTGTGTTTAATGTTTCCCGAACACGA





CGATTTCCAACTCCTCCAGGCACATGCGTAA





NbnD11


(SEQ ID NO: 41)



AGAAGGTGAACACTGATGCCCCGCCCCAAGCTCAAGTCCGATGACGAGGTACTCGA






GGCCGCCACCGTAGTGCTGAAGCGTTGCGGTCCCATAGAGTTCACGCTCAGCGGACT





AGCAAAGGAGGTGGGGCTCTCCCGCGCAGCGTTAATCCAGCGCTTCACCAACCGCG





ATACGCTGCTGGTGAGGATGATGGAGCGCGGCGTCGAGCAGGTGCGGCATTACCTG





AATGCGATACCGATAGGCGCAGGGCCGCAAGGGCTCTGGGAATTTTTGCAGGTGCT





CGTTCGGAGCATGAACACTCGCAACGACTTCTCGGTGAACTATCTCATCTCCTGGTA





CGAGCTCCAGGTGCCGGAGCTACGCACCCTTGCGATCCAGCGGAACCGCGCGGTGG





TGGAGGGGATCCGCAAGCGACTGCCCCCAGGTGCTCCTGCGGCAGCTGAGTTGCTC





CTGCACTCGGTCATCGCTGGCGCGACGATGCAGTGGGCCGTCGATCCGGATGGTGA





GCTAGCTGATCATGTGCTGGCTCAGATCTCTGCCATCCTGTGTTTAATGTTTCCCGAA





CACGACGATTTCCAACTCCTCCAGGCACATGCGTAA





NbnE1 (One mutation between the RBS and start codon)


(SEQ ID NO: 42)



AGAAGGTGGACACTGATGCCCCGCCCCAAGCTCAAGTCCGATGACGAGGTACTCGA






GGCCGCCACCGTAGTGCTGAAGCGTTGCGGTCCCATTGAGTTCACGCTCAGCGGAGT





ATCAAAGGAGGTGGGGCTCTCCCGCGCAGCGTTAATCCAGCGCTTCACCAACCGCG





ATACGCTGCTGGTGAGGATGATGGAGCGCGGCGTCGAGCAGGTGCGGCATTACCTG





AATGCGATACCGATAGGCGCAGGGCCGCAAGGGCTCTGGGAATTTTTGCAGGTGCT





CGTTCGGAGCATGAACACTCGCAACGACTTCTCGGTGAACTATCTCATCTCCTGGTA





CGAGCTCCAGGTGCCGGAGCTACGCACGCTTGCGATCCAGCGGAACCGCGCGGTGG





TGGAGGGGATCCGCAAGCGACTGCCCCCAGGTGCTCCTGCGGCAGCAGAGTTGCTC





CTGCACTCGGTCATCGCTGGCGCGACGATGCAGTGGGCCGTCGATCCGGATGGTGA





GCTAGCTGATCATGTGCTGGCTCAGATCGCTGCCATCCTGTGTTTAATGTTTCCCGAA





CACGACGATTTCCAACTCCTCCAGGCACATGCGTAA





NbnG7 (One mutation in promoter)


(SEQ ID NO: 43)


TGGTGCAAAACCTTTCGCGGTATGTCATGATAGCGCCCCCCAACCCATACAGAAGG





TGAACACTGATGCCCCGCCCCAAGCTCAAGTCCGATGACGAGGTACTCGAGGCCGC





CACCGTAGTGCTGAAGCGTTGCGGTCCCATAGAGTTCACGCTCAGCGGAGTAGCAA





AGGAGGTGGGGCTCTCCCGCGCAGCGTTAATCCAGCGCTTCACCAACTGCGATACG





CTGCTGGTGAGGATGATGGAGCGCGGCGTCGAGCAGGTGCGGCATTACCTGAATGC





GATACCGATAGGCGCAGGGCCGCAAGGGCTCTGGGAATTTTTGCAGGTGCTCGTTC





GGAGCATGAACACTCGCAACGACTTCTCGGTGAACTATCTCATCTCCTGGTACGAGC





TCCAGGTGCCGGAGCTACGCACGCTTGCGATCCAGCGGAACCGCGCGGTGGTGGAG





GGGATCCGCAAGCGACTGCCCCCAGGTGCTCCTGCGGCAGCTGAGTTGCTCCTGCAC





TCGGTCATCGCTGGCGCGACGATGCAGTGGGCCGTCGATCCGGATGGTGAGCTAGC





TGATCATGTGCTGGCTCAGATCGCTGCCATCCTGTGTTTAATGTTTCCCGAACACGA





CGATTTCCAACTCCTCCAGGCACATGCGTAA





M2D6


(SEQ ID NO: 44)



AGAAGGTGAACACTGATGCCCCGCCCCAAGCTCAAGTCCGATGACGAGGTTCTCGA






GGCCACCACCGTAGTGCTGAAGCGTTGCGGTCCCATAGAGTTCACGCTCAGTGGAGT






GGCAAAGGAGGTGGGGCTCTCCCGCGCAGCGTTAATCCAGCGCTTCACCAACCGCG






ATACGCTGCTGGTGAGGATGATGGAGCGCGGCGTCGAGCAGGTGCGGCATTACCTG





AATGCGATACCGATAGGCGCAGGGCCGCAAGGGCTCTGGGAATTTTTGCAGGTGCT





CGTTCGGAGCATGAACACTCCTCAACGACTTCTCGGTGAACTATCTCATCTCCTGGTA





CGAGCTCCAGGTGCCGGAGCTACGCACGCTTGCGATCCAGCGGAACCGCGCGGTAG





TGGAGGGGATCCGCAAGCGACTGCCCCCAGGTGCTCCTGCGGCAGCTGAGTTGCTC





CTGCACTCGGTCATCGCTGGCGCGATGAAGCAGTGGGCCGTCGATCCGGATGGTGA





GCTAGCTGATCATGTGCTGGCTCAGATCGCTGCCATCCTGTGTTTAATGTTTCCCGAA





CACGACGATTTCCAACTCCTCCAGGCACATGCGTAA





M2D7


(SEQ ID NO: 45)



AGAAGGTGAACACTGATGCCCCGCCTCAAGCTCAAGTCCGATGACGAGGTACTCGA






GGCCGCCACCGTAGTGCTGAAGCGTTGCGGTCCCATAGAGTTCACGCTCAGCGGAG





TAGCAAAGGAGGTGGGGCTCTCCCGCGCAGCGTTAATCCAGCGCTTCACCAACCGC





GATACGCTGCTGGTGAGGATGATGGAGCGCGGCGTCGAGCAGGTGCGGCATTACCT





GAATGCGATACCGATAGGCGCAGGGCCGCAAGGGCTCTGGGAATTTTTGCAGGTGC





TCGTTCGGAGCATGAACACTCGCAACGACTTCTCGGTGAACTATCTCATCTCCTTGT





ACGAGCTCCAGGTGCCGGAGCTACGCACGCTTGCGATCCAGCGGAACCGCGCGGTG





GTGGAGGGGATCCGCAAGCGACTGCCCCCAGGTGCTCCTGCGGCAGCTGAGTTGCT





CCTGCACTCGGTCATCGCTGGCGCGACGATGCAGTGGGCCGTCGATCCGGATGGTG





AGCTAGCTGATCATGTGCTGGCTCAGATCGCTGCCATCCTGTGTTTAATGTTTCCCGA





ACACGACGATTTCCAACTCCTCCAGGCACGTGCGTAA





C9


(SEQ ID NO: 46)



AGAAGGTGAACACTGATGCCCCGCCCCAAGCTCAAGTCCGATGACGAGGTACTCGA






GGCCGCCACCGTAGTGCTGAAGCGTTGCGGTCCCATAGAGTTCACGCTCAGCGGAG





TATCAAAGGAGGTGGGGCTCTCCCGCGCAGCGTTAATCCAGCGCTTCACCAACCGC





GATACGCTGCTGGTGAGGATGATGGAGCGCGGCGTCGAGCAGGTGCGGCATTACCT





GAATGCGATACCGATAGGCGCAGGGCCGCAAGGGCTCTGGGAATTTTTGCAGGTGC





TCGTTCGGAGCATGAACACTCGCAACGACTTCTCGGTGAACAATCTCATCTCCTGGT





ACGAGCTCCAGGTGCCGGAGCTACGCACGCTTGCGATCCAGCGGAACCGCGCGGTG





GTGGAGGGGATCCGCAAGCGACTGCCCCCAGGTGCTCCTGCGGCAGCTGAGTTGCT





CCTGCACTCGGTCATCGCTGGCGCGACGATGCAGTGGGCCGTCGATCCGGATGGTG





AGCTAGCTGATCATGTGCTGGCTCAGATCGCTGCCATCCTGTGTTTAATGTTTCCCGA





ACACGACGATTTCCAATTCCTCCAGGCACATGCGTAA





V66P


(SEQ ID NO: 47)



AGAAGGTGAACACTGATGCCCCGCCCCAAGCTCAAGTCCGATGACGAGGTACTCGA






GGCCGCCACCGTAGTGCTGAAGCGTTGCGGTCCCATAGAGTTCACGCTCAGCGGAG





TAGCAAAGGAGGTGGGGCTCTCCCGCGCAGCGTTAATCCAGCGCTTCACCAACCGC





GATACGCTGCTGGTGAGGATGATGGAGCGCGGCGTCGAGCAGCCACGGCATTACCT





GAATGCGATACCGATAGGCGCAGGGCCGCAAGGGCTCTGGGAATTTTTGCAGGTGC





TCGTTCGGAGCATGAACACTCGCAACGACTTCTCGGTGAACTATCTCATCTCCTGGT





ACGAGCTCCAGGTGCCGGAGCTACGCACGCTTGCGATCCAGCGGAACCGCGCGGTG





GTGGAGGGGATCCGCAAGCGACTGCCCCCAGGTGCTCCTGCGGCAGCTGAGTTGCT





CCTGCACTCGGTCATCGCTGGCGCGACGATGCAGTGGGCCGTCGATCCGGATGGTG





AGCTAGCTGATCATGTGCTGGCTCAGATCGCTGCCATCCTGTGTTTAATGTTTCCCGA





ACACGACGATTTCCAACTCCTCCAGGCACATGCGTAA





V66R


(SEQ ID NO: 48)



AGAAGGTGAACACTGATGCCCCGCCCCAAGCTCAAGTCCGATGACGAGGTACTCGA






GGCCGCCACCGTAGTGCTGAAGCGTTGCGGTCCCATAGAGTTCACGCTCAGCGGAG





TAGCAAAGGAGGTGGGGCTCTCCCGCGCAGCGTTAATCCAGCGCTTCACCAACCGC





GATACGCTGCTGGTGAGGATGATGGAGCGCGGCGTCGAGCAGAGGCGGCATTACCT





GAATGCGATACCGATAGGCGCAGGGCCGCAAGGGCTCTGGGAATTTTTGCAGGTGC





TCGTTCGGAGCATGAACACTCGCAACGACTTCTCGGTGAACTATCTCATCTCCTGGT





ACGAGCTCCAGGTGCCGGAGCTACGCACGCTTGCGATCCAGCGGAACCGCGCGGTG





GTGGAGGGGATCCGCAAGCGACTGCCCCCAGGTGCTCCTGCGGCAGCTGAGTTGCT





CCTGCACTCGGTCATCGCTGGCGCGACGATGCAGTGGGCCGTCGATCCGGATGGTG





AGCTAGCTGATCATGTGCTGGCTCAGATCGCTGCCATCCTGTGTTTAATGTTTCCCGA





ACACGACGATTTCCAACTCCTCCAGGCACATGCGTAA





V66G


(SEQ ID NO: 49)



AGAAGGTGAACACTGATGCCCCGCCCCAAGCTCAAGTCCGATGACGAGGTACTCGA






GGCCGCCACCGTAGTGCTGAAGCGTTGCGGTCCCATAGAGTTCACGCTCAGCGGAG





TAGCAAAGGAGGTGGGGCTCTCCCGCGCAGCGTTAATCCAGCGCTTCACCAACCGC





GATACGCTGCTGGTGAGGATGATGGAGCGCGGCGTCGAGCAGGGACGGCATTACCT





GAATGCGATACCGATAGGCGCAGGGCCGCAAGGGCTCTGGGAATTTTTGCAGGTGC





TCGTTCGGAGCATGAACACTCGCAACGACTTCTCGGTGAACTATCTCATCTCCTGGT





ACGAGCTCCAGGTGCCGGAGCTACGCACGCTTGCGATCCAGCGGAACCGCGCGGTG





GTGGAGGGGATCCGCAAGCGACTGCCCCCAGGTGCTCCTGCGGCAGCTGAGTTGCT





CCTGCACTCGGTCATCGCTGGCGCGACGATGCAGTGGGCCGTCGATCCGGATGGTG





AGCTAGCTGATCATGTGCTGGCTCAGATCGCTGCCATCCTGTGTTTAATGTTTCCCGA





ACACGACGATTTCCAACTCCTCCAGGCACATGCGTAA





V66I


(SEQ ID NO: 50)



AGAAGGTGAACACTGATGCCCCGCCCCAAGCTCAAGTCCGATGACGAGGTACTCGA






GGCCGCCACCGTAGTGCTGAAGCGTTGCGGTCCCATAGAGTTCACGCTCAGCGGAG





TAGCAAAGGAGGTGGGGCTCTCCCGCGCAGCGTTAATCCAGCGCTTCACCAACCGC





GATACGCTGCTGGTGAGGATGATGGAGCGCGGCGTCGAGCAGATCCGGCATTACCT





GAATGCGATACCGATAGGCGCAGGGCCGCAAGGGCTCTGGGAATTTTTGCAGGTGC





TCGTTCGGAGCATGAACACTCGCAACGACTTCTCGGTGAACTATCTCATCTCCTGGT





ACGAGCTCCAGGTGCCGGAGCTACGCACGCTTGCGATCCAGCGGAACCGCGCGGTG





GTGGAGGGGATCCGCAAGCGACTGCCCCCAGGTGCTCCTGCGGCAGCTGAGTTGCT





CCTGCACTCGGTCATCGCTGGCGCGACGATGCAGTGGGCCGTCGATCCGGATGGTG





AGCTAGCTGATCATGTGCTGGCTCAGATCGCTGCCATCCTGTGTTTAATGTTTCCCGA





ACACGACGATTTCCAACTCCTCCAGGCACATGCGTAA





V66D


(SEQ ID NO: 51)



AGAAGGTGAACACTGATGCCCCGCCCCAAGCTCAAGTCCGATGACGAGGTACTCGA






GGCCGCCACCGTAGTGCTGAAGCGTTGCGGTCCCATAGAGTTCACGCTCAGCGGAG





TAGCAAAGGAGGTGGGGCTCTCCCGCGCAGCGTTAATCCAGCGCTTCACCAACCGC





GATACGCTGCTGGTGAGGATGATGGAGCGCGGCGTCGAGCAGGACCGGCATTACCT





GAATGCGATACCGATAGGCGCAGGGCCGCAAGGGCTCTGGGAATTTTTGCAGGTGC





TCGTTCGGAGCATGAACACTCGCAACGACTTCTCGGTGAACTATCTCATCTCCTGGT





ACGAGCTCCAGGTGCCGGAGCTACGCACGCTTGCGATCCAGCGGAACCGCGCGGTG





GTGGAGGGGATCCGCAAGCGACTGCCCCCAGGTGCTCCTGCGGCAGCTGAGTTGCT





CCTGCACTCGGTCATCGCTGGCGCGACGATGCAGTGGGCCGTCGATCCGGATGGTG





AGCTAGCTGATCATGTGCTGGCTCAGATCGCTGCCATCCTGTGTTTAATGTTTCCCGA





ACACGACGATTTCCAACTCCTCCAGGCACATGCGTAA





M1B10


(SEQ ID NO: 52)



AGAAGGCGAACACTGATGCCCCGCCCCAAGCTCAAGTCCGATGACGAGGTACTCGA






GGCCGCCACCGTAGTGCTGAAGCGTTGCGGTCCCATAGAGTTCACGCTCAGCGGAG





TAGCAAAGGAGGTGGGGCTCTCCCGCGCAGCGTTAATCCAGCGCTTCATCAACCGC





GATACGCTGCTGGTGAGGATGATGGAGCGCGGCGTCGAGCAGGTGCGGCATTACCT





GAATGCGATACCGATAGGCGCAGGGCCGCAAGGGCTCTGGGAATTTTTGCAGGTGT





TCGTTCGGAGCATGAACACTCGCAACAACTTCTCGGTGAACTATCTCATCTCCTGGT





ACGATCTCCAGGTGCCGGAGCTACGCACGCTTGCGATCCAGCGGAACCGCGCGGTG





GTGGAGGGGATCCGCAAGCGACTGCCCCCAGGTGCTCCTGCGGCAGCTGAGTTGCT





CCTGCACTCGGTCATCGCTGGCGCGACGATGCAGTGGGCCGTCGATCCGGATGGTG





AGCTAGCTGATCATGTGCTGGCTCAGATCGCTGCCATCCTGTGTTTAATGTTTCCCGA





ACACGACGATTTCCAACTCCTCCAGGCACATGCGTAA





smRBS 1A1


(SEQ ID NO: 53)




TTCAGGTGAACACTGATGCCCCGCCCCAAGCTCAAGTCCGATGACGAGGTACTCGA






GGCCGCCACCGTAGTGCTGAAGCGTTGCGGTCCCATAGAGTTCACGCTCAGCGGAG





TAGCAAAGGAGGTGGGGCTCTCCCGCGCAGCGTTAATCCAGCGCTTCACCAACCGC





GATACGCTGCTGGTGAGGATGATGGAGCGCGGCGTCGAGCAGGTGCGGCATTACCT





GAATGCGATACCGATAGGCGCAGGGCCGCAAGGGCTCTGGGAATTTTTGCAGGTGC





TCGTTCGGAGCATGAACACTCGCAACGACTTCTCGGTGAACTATCTCATCTCCTGGT





ACGAGCTCCAGGTGCCGGAGCTACGCACGCTTGCGATCCAGCGGAACCGCGCGGTG





GTGGAGGGGATCCGCAAGCGACTGCCCCCAGGTGCTCCTGCGGCAGCTGAGTTGCT





CCTGCACTCGGTCATCGCTGGCGCGACGATGCAGTGGGCCGTCGATCCGGATGGTG





AGCTAGCTGATCATGTGCTGGCTCAGATCGCTGCCATCCTGTGTTTAATGTTTCCCGA





ACACGACGATTTCCAACTCCTCCAGGCACATGCGTAA





smRBS 1G7


(SEQ ID NO: 54)




CTGAGGTGAACACTG[start codon]CCCCGCCCCAAGCTCAAGTCCGATGACGAGGTACTCGA






GGCCGCCACCGTAGTGCTGAAGCGTTGCGGTCCCATAGAGTTCACGCTCAGCGGAG





TAGCAAAGGAGGTGGGGCTCTCCCGCGCAGCGTTAATCCAGCGCTTCACCAACCGC





GATACGCTGCTGGTGAGGATGATGGAGCGCGGCGTCGAGCAGGTGCGGCATTACCT





GAATGCGATACCGATAGGCGCAGGGCCGCAAGGGCTCTGGGAATTTTTGCAGGTGC





TCGTTCGGAGCATGAACACTCGCAACGACTTCTCGGTGAACTATCTCATCTCCTGGT





ACGAGCTCCAGGTGCCGGAGCTACGCACGCTTGCGATCCAGCGGAACCGCGCGGTG





GTGGAGGGGATCCGCAAGCGACTGCCCCCAGGTGCTCCTGCGGCAGCTGAGTTGCT





CCTGCACTCGGTCATCGCTGGCGCGACGATGCAGTGGGCCGTCGATCCGGATGGTG





AGCTAGCTGATCATGTGCTGGCTCAGATCGCTGCCATCCTGTGTTTAATGTTTCCCGA





ACACGACGATTTCCAACTCCTCCAGGCACATGCGTAA





smRBS 2E1


(SEQ ID NO: 55)



AAAAGGTGAACACTG[start codon]CCCCGCCCCAAGCTCAAGTCCGATGACGAGGTACTCGA






GGCCGCCACCGTAGTGCTGAAGCGTTGCGGTCCCATAGAGTTCACGCTCAGCGGAG





TAGCAAAGGAGGTGGGGCTCTCCCGCGCAGCGTTAATCCAGCGCTTCACCAACCGC





GATACGCTGCTGGTGAGGATGATGGAGCGCGGCGTCGAGCAGGTGCGGCATTACCT





GAATGCGATACCGATAGGCGCAGGGCCGCAAGGGCTCTGGGAATTTTTGCAGGTGC





TCGTTCGGAGCATGAACACTCGCAACGACTTCTCGGTGAACTATCTCATCTCCTGGT





ACGAGCTCCAGGTGCCGGAGCTACGCACGCTTGCGATCCAGCGGAACCGCGCGGTG





GTGGAGGGGATCCGCAAGCGACTGCCCCCAGGTGCTCCTGCGGCAGCTGAGTTGCT





CCTGCACTCGGTCATCGCTGGCGCGACGATGCAGTGGGCCGTCGATCCGGATGGTG





AGCTAGCTGATCATGTGCTGGCTCAGATCGCTGCCATCCTGTGTTTAATGTTTCCCGA





ACACGACGATTTCCAACTCCTCCAGGCACATGCGTAA





M9C4


(SEQ ID NO: 56)



AGAAGGTGAACACTG[start codon]CCCCGCCCCAAGCTCAAGTCCGATGACGAGGTACTCGA






GGCCGCCACCGTAGTGCTGAAGCGTTGCGGTCCCATAGAGTTCACGCTCAGCGGAG





TAGCAAAGGAGGTGGGACTCTCCCGCGCAGCGTTAATCCAGCGCTTCACCAACCGC





GATACGCTGCTGGTGAGGATGATGGAGCGCGGCGTCGAGCAGGTGCGGCATTACCT





GAATGCGATACCGATAGGCGCAGGGCCGCAAGGGCTCTGGGAATTTTTGCAGGTGC





TCGTTCGGAGCATGAACACTCGCAACGACTTCTCGGTGAACTATCTCATCTCCTGGT





ACGAGCTCCAGGTGCCGGAGCTACGCACGCTTGCGATCCAGACTAACCGCGCGGTG





GTGGAGGGGATCCGCAATCGACTGCCCCCAGGTGCTCCTGCGGCAGCTGAGTTGCTC





CTGCACTCGGTCATCACTGGCGCGACGATGCAGTGGGCCGTCGATCCGGATGGTGA





GCTAGCTGATCATGTGCTGGCTCAGATCGCTGCCATCCTGTGTTTAATGTTTCCCGAA





CAAGACGATTTCCAACTCCTCCAGGCACATGCGTAA





E7_M9C4


(SEQ ID NO: 57)



AGATGGTGAACACTG[start codon]CCCCGCCCCAAGCTCAAGTCCGATGACGAGGTACTCGA






GGCCGCCACCGTAGTGCTGAAGCGTTGCGGTCCCATAGAGTTCACGCTCAGCGGAG





TAGCAAAGGAGGTGGGACTCTCCCGCGCAGCGTTAATCCAGCGCTTCACCAACCGC





GATACGCTGCTGGTGAGGATGATGGAGCGCGGCGTCGAGCAGGTGCGGCATTACCT





GAATGCGATACCGATAGGCGCAGGGCCGCAAGGGCTCTGGGAATTTTTGCAGGTGC





TCGTTCGGAGCATGAACACTCGCAACGACTTCTCGGTGAACTATCTCATCTCCTGGT





ACGAGCTCCAGGTGCCGGAGCTACGCACGCTTGCGATCCAGACTAACCGCGCGGTG





GTGGAGGGGATCCGCAATCGACTGCCCCCAGGTGCTCCTGCGGCAGCTGAGTTGCTC





CTGCACTCGGTCATCACTGGCGCGACGATGCAGTGGGCCGTCGATCCGGATGGTGA





GCTAGCTGATCATGTGCTGGCTCAGATCGCTGCCATCCTGTGTTTAATGTTTCCCGAA





CAAGACGATTTCCAACTCCTCCAGGCACATGCGTAA





Mutated nucleotides are underlined


RBS region is shown in bold


Start codon is shown boxed








    • pMLCmR, E7_M9C4_pMLCmR

    • MphR sequence same as WT and E7 mutant (above)





In some embodiments, the MphR gene sequence may be codon optimized, without changing the resulting polypeptide sequence. In some embodiments, the codon optimization includes replacing at least one, or more than one, or a significant number, of codons.
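The idea of codon optimization without changing the polypeptide can be illustrated with a minimal Python sketch. It is not the method used in this disclosure: the `PREFERRED` codon choices are hypothetical and picked only for illustration, and the helper names `translate` and `codon_optimize` are assumptions. The input is the run of six codons that follows the start codon in the listings above.

```python
from itertools import product

# Standard genetic code, built from the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Hypothetical codon preferences (assumption, not taken from the disclosure).
PREFERRED = {"P": "CCG", "R": "CGT", "K": "AAA", "L": "CTG"}


def translate(dna):
    """Translate an in-frame coding sequence into one-letter amino acids."""
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def codon_optimize(dna, preferred=PREFERRED):
    """Swap each codon for a preferred synonymous codon, if one is listed."""
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    out = []
    for i in range(0, len(dna), 3):
        codon = dna[i:i + 3]
        aa = CODON_TABLE[codon]
        new_codon = preferred.get(aa, codon)
        # Guard against a preference table that would change the protein.
        if CODON_TABLE[new_codon] != aa:
            raise ValueError(f"{new_codon} does not encode {aa}")
        out.append(new_codon)
    return "".join(out)


# Six codons immediately downstream of the start codon in the listings above.
original = "CCCCGCCCCAAGCTCAAG"
optimized = codon_optimize(original)

# The nucleotide sequence changes, but the encoded peptide does not.
assert translate(optimized) == translate(original)
print(original, "->", optimized, "encodes", translate(optimized))
```

In practice, codon preferences would come from a usage table for the chosen host; the check that the translation is unchanged is the part that corresponds to the paragraph above.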


In some embodiments, the MphR gene sequence is substantially identical to the wild-type MphR sequence (SEQ ID NO:1). In some embodiments, the MphR gene is about 60% identical, preferably 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or higher, over a specified region when compared and aligned for maximum correspondence with the wild-type sequence.


In some embodiments, the MphR gene sequence is substantially identical to the wild-type MphR sequence (SEQ ID NO:28) (which includes gene sequences upstream of the start codon). In some embodiments, the MphR gene is about 60% identical, preferably 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or higher, over a specified region when compared and aligned for maximum correspondence with the wild-type sequence.
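As a rough illustration of percent identity over a specified region, the following Python sketch compares two sequences that are already aligned to the same length. The function name `percent_identity`, the use of "-" for gaps, and the choice of alignment length as the denominator are assumptions made for this example; alignment tools such as BLAST may define the denominator differently. The two 15-nucleotide inputs are the region upstream of the start codon as it appears in the V66D and smRBS 1A1 listings above.

```python
def percent_identity(a, b):
    """Percent of aligned columns with identical, non-gap residues.

    Assumes `a` and `b` are pre-aligned (equal length, '-' for gaps) and
    uses the full alignment length as the denominator.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    if not a:
        raise ValueError("sequences must not be empty")
    matches = sum(1 for x, y in zip(a, b) if x == y and x != "-")
    return 100.0 * matches / len(a)


# Region upstream of the start codon, copied from the listings above.
upstream_v66d = "AGAAGGTGAACACTG"
upstream_smrbs_1a1 = "TTCAGGTGAACACTG"

identity = percent_identity(upstream_v66d, upstream_smrbs_1a1)
print(f"{identity:.1f}% identical over {len(upstream_v66d)} nucleotides")

# Check the result against the lowest threshold recited in the text.
print("meets 60% threshold:", identity >= 60.0)
```

Here 12 of the 15 positions match, so the two regions are 80% identical over this window. Over the full gene the value would be much closer to 100%, which is why the region of comparison has to be specified.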


Throughout this application, various publications are referenced. The disclosures of these publications in their entireties are hereby incorporated by reference into this application in order to more fully describe the state of the art to which this pertains. The references disclosed are also individually and specifically incorporated by reference herein for the material contained in them that is discussed in the sentence in which the reference is relied upon.


Those skilled in the art will appreciate that numerous changes and modifications can be made to the preferred embodiments of the invention and that such changes and modifications can be made without departing from the spirit of the invention. It is, therefore, intended that the appended claims cover all such equivalent variations as fall within the true spirit and scope of the invention.

Claims
  • 1. A biosensor system comprising: a nucleic acid encoding a genetically modified MphR gene sequence, wherein the nucleic acid comprises at least one genetic mutation when compared to the wild-type MphR gene sequence; and a reporter gene whose transcription is under the control of a promoter region which is regulated by a MphR transcription factor; wherein the MphR genetic mutation encodes an amino acid change selected from T17R, T27G, T49I, Q65M, T27A, M59E, M59S, R22H, K35N, T49I, L89V, D98N, E109D, R122T, K132N, A151T, H184Q, or a combination thereof, in the ligand binding domain.
  • 2. The biosensor system of claim 1, wherein the nucleic acid encoding the genetically modified MphR gene sequence and the reporter gene are located on one recombinant DNA vector.
  • 3. The biosensor system of claim 1, wherein the reporter gene is a gene coding for chloramphenicol acetyltransferase, beta-galactosidase, luciferase or green fluorescent protein (GFP).
  • 4. The biosensor system of claim 3, wherein the reporter gene is a gene coding for green fluorescent protein (GFP).
  • 5. The biosensor system of claim 1, wherein the mutation confers improved sensitivity for detecting erythromycin A.
  • 6. The biosensor system of claim 1, wherein the mutation confers improved selectivity for detecting erythromycin A in comparison to other polyketides.
  • 7. A genetically modified host cell comprising: a nucleic acid encoding the biosensor of claim 1.
  • 8. A method for detecting a polyketide, comprising: introducing into a cell: i. a nucleic acid encoding a genetically modified MphR gene sequence, wherein the nucleic acid comprises at least one genetic mutation when compared to the wild-type MphR gene sequence; and ii. a reporter gene whose transcription is under the control of a promoter region which is regulated by a MphR transcription factor; and detecting the polyketide based on the differential expression of the reporter gene in comparison to a cell comprising a wild-type MphR gene sequence; wherein the MphR genetic mutation encodes an amino acid change selected from T17R, T27G, T49I, Q65M, T27A, M59E, M59S, R22H, K35N, T49I, L89V, D98N, E109D, R122T, K132N, A151T, H184Q, or a combination thereof, in the ligand binding domain.
CROSS REFERENCE TO RELATED APPLICATIONS

This application is a continuation of U.S. patent application Ser. No. 16/099,749, filed Nov. 8, 2018, which is a national stage application filed under 35 U.S.C. § 371 of PCT/US2017/031962 filed May 10, 2017, which claims the benefit of U.S. Provisional Patent Application Ser. No. 62/334,204 filed May 10, 2016, the disclosures of which are expressly incorporated herein by reference.

STATEMENT REGARDING FEDERALLY SPONSORED RESEARCH

This invention was made with Government Support under Grant No. GM104258 awarded by the National Institutes of Health. The Government has certain rights to the invention.

US Referenced Citations (1)
Number Name Date Kind
20040209270 Eberz Oct 2004 A1
Foreign Referenced Citations (3)
Number Date Country
2005033287 Aug 2005 WO
2014158594 Oct 2014 WO
2014093402 Jul 2015 WO
Non-Patent Literature Citations (27)
Entry
Pray, L. (2008) Eukaryotic genome complexity. Nature Education (Year: 2008).
Zheng et al. Structure and Function of the Macrolide Biosensor Protein, MphR(A), with and without Erythromycin. J. Mol. Biol. (2009) 387, 1250-1260 (Year: 2009).
International Search Report and Written Opinion in PCT/US2017/031962. Mailed Aug. 17, 2017. 10 pages.
Brakhage, Axel A. et al. Use of Reporter Genes to Identify Recessive trans-Acting Mutations Specifically Involved in the Regulation of Aspergillus nidulans Penicillin Biosynthesis Genes. Journal of Bacteriology, May 1995, p. 2781-2788.
Feng, Tingting et al. Insights into Resistance Mechanism of the Macrolide Biosensor Protein MphR (A) Binding to Macrolide Antibiotic Erythromycin by Molecular Dynamics Simulation. J Comput Aided Mol Des, Aug. 6, 2015. 14 pages.
Fu, Y et al. Study of Transcriptional Regulation Using a Reporter Gene Assay. Methods Mol Biol. 2006; 313:257-64.
Altschul et al. (1997) Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nuc. Acids Res. 25:3389-3402.
Altschul et al. (1990) Basic local alignment search tool. J. Mol. Biol. 215:403-410.
Beaucage and Carruthers, Deoxynucleoside phosphoramidites—a new class of key intermediates for deoxypolynucleotide synthesis. Tetrahedron Lett., 1981, 22:1859-1862.
Boghigian BA, et al. Multi-factorial Engineering of Heterologous Polyketide Production in Escherichia coli Reveals Complex Pathway Interactions. Biotechnology and Bioengineering. 2011; 108(6): 1360-71.
Gardner L, et al. Photochemical Control of Bacterial Signal Processing Using a Light-activated Erythromycin. Molecular Biosystems. 2011;7(9):2554-7.
Henikoff and Henikoff (1992) Amino acid substitution matrices from protein blocks. Proc. Natl. Acad. Sci. USA 89:10915.
Jiang M., Pfeifer, B. Metabolic and Pathway Engineering to Influence Native and Altered Erythromycin Production Through E. coli. Metabolic Engineering. 2013;19:42-9.
Karlin and Altschul (1993) Applications and statistics for multiple high-scoring segments in molecular sequences. Proc. Natl. Acad. Sci. USA 90:5873-5877.
Matteucci, et al., Synthesis of deoxyoligonucleotides on a polymer support. J. Am. Chem. Soc., 103:3185, 1981.
Miller ES, et al. Description of the erythromycin-producing bacterium Arthrobacter sp. strain NRRL B-3381 as Aeromicrobium erythreum gen. nov., sp. nov. International Journal of Systematic Bacteriology. 1991;41:363-368.
Mohrle, V. et al. Biosensor-guided screening for macrolides. Anal. Bioanal. Chem. Jul. 2007;388(5-6):1117-25.
Montemiglio, LC, et al. Redirecting P450 EryK Specificity by Rational Site-directed Mutagenesis. Biochemistry. 2013; 52(21) 3678-87.
Noguchi N, et al. Regulation of Transcription of the mph(A) Gene for Macrolide 2′-Phosphotransferase I in Escherichia coli; Characterization of the Regulatory Gene mphR(A). Journal of Bacteriology. 2000; 182(18):5052-5058.
Reeves AR, et al. Engineering precursor flow for increased erythromycin production in Aeromicrobium erythreum. Metabolic Engineering. 2004;6(4): 300-12.
Rogers, J. et al., Synthetic biosensors for precise gene control and real-time monitoring of metabolites, Nucleic Acids Research, 2015, vol. 43, No. 15, 7648-7660.
Savino, C, et al. Investigating the Structural Plasticity of a Cytochrome P450: Three-dimensional Structures of P450 EryK and Binding to its Physiological Substrate. Journal of Biological Chemistry. 2009;284(42) 29170-9.
Sundermann U, et al. Enzyme-directed Mutasynthesis: a Combined Experimental and Theoretical Approach to Substrate Recognition of a Polyketide Synthase. ACS Chemical Biology. 2013;8(2):443-50.
Zhang H, et al. Complete Biosynthesis of Erythromycin A and Designed Analogs Using E. coli as a Heterologous Host. Cell Chemistry & Biology. 2010;17(11):1232-40.
Zheng J, et al. Structure and Function of the Macrolide Biosensor Protein, MphR(A), With and Without Erythromycin. Journal of Molecular Biology. 2009;387(5):1250-60.
Meinhardt, Sarah, et al. “Rheostats and toggle switches for modulating protein function.” PloS one 8.12 (2013): e83502.
Miller, M., Y. Bromberg, and L. Swint-Kruse. “Computational predictors fail to identify amino acid substitution effects at rheostat positions.” Scientific reports 7.1 (2017): 41329.
Related Publications (1)
Number Date Country
20230323481 A1 Oct 2023 US
Provisional Applications (1)
Number Date Country
62334204 May 2016 US
Continuations (1)
Number Date Country
Parent 16099749 US
Child 18049805 US