The Sequence Listing, which is a part of the present disclosure, includes a computer-readable form comprising nucleotide and/or amino acid sequences of the present invention (file name “019731-US-NP_Sequence Listing_ST25.txt” created on 27 Apr. 2022; 183,773 bytes). The subject matter of the Sequence Listing is incorporated herein by reference in its entirety.
The present disclosure generally relates to engineered probiotics and proteins for selective sensing.
Among the various aspects of the present disclosure is the provision of systems and methods for detecting aromatic compounds. An aspect of the present disclosure provides for a method of protein engineering (e.g., a regulator protein) comprising: mutagenizing specific amino acids in and around a ligand-binding site of a protein, wherein the mutagenizing enables changes in ligand-protein binding specificity while maintaining protein-DNA interaction and thus downstream gene expression control; and/or linking ligand-protein binding to output response. An aspect of the present disclosure provides for an engineered molecular sensor. In some embodiments, the sensor comprises an engineered regulator protein or enzyme (e.g., TrpR, TyrR, TynA, FeaR) comprising a ligand-protein binding site; a reporter (comprising a signaling moiety, e.g., GFP); and/or a promoter (e.g., PtyrP) capable of inducing or repressing the reporter in the presence or absence of a target aromatic compound bound to the engineered molecular sensor. In some embodiments, the engineered regulator protein or the ligand-protein binding site or a sequence encoding the engineered regulator protein or the ligand-protein binding site of the engineered regulator protein is at least about 80% identical to a WT regulator protein or a WT regulator ligand-protein binding site; or a sequence encoding the engineered regulator protein or ligand binding site thereof is at least about 80% identical to a sequence encoding the WT regulator protein or the WT regulator ligand-protein binding site; or functional fragment thereof, the engineered regulator protein having ligand binding activity. In some embodiments, the engineered molecular sensor is optionally genomically integrated into a microorganism, optionally selected from a probiotic or is a purified cell-free sensor. In some embodiments, the engineered regulator protein or enzyme is an engineered TrpR, TyrR, TynA, or FeaR protein. 
In some embodiments, the engineered regulator protein is an engineered TyrR and is an aromatic amino acid-specific sensor, wherein the aromatic amino acid-specific sensor is a phenylalanine (Phe)- or a tyrosine (Tyr)-specific sensor. In some embodiments, the engineered regulator protein or enzyme is an engineered FeaR or TynA and is an aromatic amine-specific sensor, wherein the aromatic amine-specific sensor is a dopamine (DA)-, phenylethylamine (PEA)-, tyramine (Tyra)-, or tryptamine (Trypta)-specific sensor. In some embodiments, the engineered regulator protein is an engineered TrpR and is a tryptophan (Trp)-, 5-hydroxytryptophan (5-HTP)-, or tryptamine (Trypta)-specific sensor. In some embodiments, the engineered regulator protein or enzyme is an engineered TrpR, TyrR, TynA, or FeaR, and/or wherein the engineered TrpR, TyrR, TynA, or FeaR comprises at least one mutation to a ligand-protein binding site having specific amino acid binding activity. In some embodiments, PEA induces reporter expression; Phe induces reporter expression; or Tyr represses or induces reporter expression. In some embodiments, carboxylic acids 3,4-dihydroxyphenylacetic acid (DOPAC), phenylacetic acid (PAA), 4-hydroxyphenylacetic acid (HPPA), or indole-3-acetic acid (IAA) do not induce reporter expression or wherein IAA induces reporter expression. In some embodiments, the engineered regulator protein is an engineered TyrR and the engineered TyrR comprises one or more of the following mutations: E274Q+T14V mutations in TyrR inducing reporter expression in the presence of Phe; E274Q+D103S mutations in TyrR inducing reporter expression in the presence of Phe; or an R10F mutation in TyrR repressing reporter expression in the presence of Tyr.
In some embodiments, TyrR protein sequence is at least about 80% identical to SEQ ID NO: 124 or TyrR ligand binding site is at least about 80% identical to residues 7-274 of SEQ ID NO: 124; a polypeptide encoded by SEQ ID NO: 11 or the WT ligand binding site sequence; or functional fragment or conservative substitution thereof of TyrR having ligand binding functional activity. Yet another aspect of the present disclosure provides for a TyrR-based selective or specific sensor specifically detecting phenylalanine (Phe) or tyrosine (Tyr) comprising: a TyrR or functional mutant or variant thereof; and/or a reporter gene (comprising a signaling moiety, e.g., GFP) operably linked to an inducible promoter (e.g., PtyrP promoter). In some embodiments, the inducible promoter comprises an activating response to Phe or a repressing response to Tyr or both Phe and Tyr. In some embodiments, TyrR is selected from: wild type (WT) TyrR (SEQ ID NO: 124) or having at least about 80% identity to WT TyrR (SEQ ID NO: 124) and the reporter gene overexpresses if Phe is present; TyrR or TyrR having at least about 80% identity to TyrR comprising the mutation E274Q and the reporter gene overexpresses in the presence of Phe and Phe+Tyr; TyrR or TyrR having at least about 80% identity to TyrR comprising the mutation E274Q+T14V and the reporter gene overexpresses in the presence of Phe and Phe+Tyr and the reporter gene does not overexpress with Tyr in the absence of Phe; TyrR or TyrR having at least about 80% identity to TyrR comprising the mutation E274Q+D103S and the reporter gene overexpresses in the presence of Phe and Phe+Tyr and the reporter gene does not overexpress with Tyr in the absence of Phe; or TyrR or TyrR having at least about 80% identity to TyrR comprising the mutation R10F and the reporter gene is repressed with Tyr or Tyr+Phe and the reporter gene does not overexpress in the absence of Tyr.
In some embodiments, the sensor is a target aromatic compound-inducible sensor or target aromatic compound-repressible sensor selected from: a Phe-inducible TyrR system (e.g., E274Q, E274Q+T14V, E274Q+D103H, E274Q+D103S), or Tyr-repressible TyrR system (e.g., R10F). In some embodiments, the TyrR-based sensor is for use to kinetically diagnose or treat disorders that cause Phe-dysregulation without interference from intestinal Tyr. In some embodiments, the TyrR-based sensor comprises E274Q+T14V or E274Q+D103S variants of TyrR. In some embodiments, the TyrR-based sensor is sensitive to Phe in the presence or absence of Tyr and does not respond to Tyr alone. In some embodiments, the TyrR-based sensor comprises R10F variant of TyrR and wherein the TyrR-based sensor exhibits about a 12-fold repression in the presence of Tyr independent of the presence of Phe. In some embodiments, the TyrR sensor has no significant response to Phe alone. In some embodiments, the TyrR-based sensor wherein a significant response is: under about 4000 au, under about 3000 au, or preferably at or under about 2000 au for Phe-inducible sensor; or above about 2000 au or preferably at or above 2000 au for a Phe-repressible sensor. 
In some embodiments, the engineered regulator protein or enzyme comprises an engineered FeaR and/or TynA and the engineered FeaR or TynA comprises one or more of the following mutations: a G494S mutation in TynA inducing reporter expression in the presence of PEA; a G494S mutation in TynA and A81T mutation in FeaR (e.g., G494S*) inducing reporter expression in the presence of PEA; a G494S mutation in TynA and A81S mutation in FeaR inducing reporter expression in the presence of PEA; an A81T or A81S mutation in FeaR inducing reporter expression in the presence of PEA and Tyra; an A81T mutation in FeaR inducing reporter expression in the presence of PEA; an A81T mutation in FeaR and S414M mutation in TynA inducing reporter expression in the presence of PEA; an A81T mutation in FeaR and G415H mutation in TynA inducing reporter expression in the presence of PEA; an A81L, A81P, A81I, or A81N mutation in FeaR inducing reporter expression in the presence of PEA; an S414M or G415H mutation in TynA inducing reporter expression in the presence of Tyra; a G415H mutation in TynA inducing reporter expression in the presence of Tyra; a G415H mutation in TynA inducing reporter expression in the presence of Tyra and PEA; an M83Y mutation in FeaR inducing reporter expression in the presence of PEA; an M83N mutation in FeaR inducing reporter expression in the presence of Tyra; a FeaR-KA mutant inducing reporter expression in the presence of Trypta; a tynA-KA or tynA-KP inducing reporter expression in the presence of Trypta; an I109N tynA-KA inducing reporter expression in the presence of Trypta; an I109N FeaR-KA inducing reporter expression in the presence of Trypta; a Q76, Q116, L108(T), W110(S), or W110(C) FeaR mutation; a D413 or Y496 TynA mutation; or PtynA-MG, TynA-KP and FeaR-KA inducing reporter expression in the presence of Trypta and not inducing expression in the presence of dopamine (DA), phenylethylamine (PEA), or tyramine (Tyra).
In some embodiments, the FeaR or TynA protein sequence or ligand binding site of FeaR or TynA is at least about 80% identical to a polypeptide encoded by the WT FeaR sequence of SEQ ID NO: 36 or WT TynA sequence of SEQ ID NO: 32 or WT ligand binding site or functional fragment or conservative substitution thereof and has ligand-binding activity. Yet another aspect of the present disclosure provides for a selective sensor for specifically detecting target aromatic compounds (e.g., Phe, Tyr, PEA, Tyra) comprising: a molecular sensor or an engineered microorganism (e.g., an E. coli strain) capable of expressing native or non-native FeaR and TynA (or functional fragments, conservative substitutions, mutants, or variants thereof); and/or a promoter-reporter system comprising a reporter gene under the control of a promoter capable of being induced or repressed by one or more target aromatic compounds (e.g., Phe, Tyr, PEA, Tyra) or enzymatic-reaction product thereof (e.g., aromatic aldehyde, aromatic carboxylic acid) and/or producing an output response (e.g., increased expression or repression). In some embodiments, the output response is a repression of the reporter (e.g., signaling or detection moiety, such as GFP) expression. In some embodiments, the output response is an overexpression of the reporter (e.g., signaling or detection moiety, such as GFP). In some embodiments, the selective sensor comprises one or more enzymes or transcription factors. Yet another aspect of the present disclosure provides for a selective sensor for specifically detecting aromatic amines (e.g., PEA, Tyra, DA, Trypta) or aromatic aldehydes thereof comprising: a TynA and FeaR protein (or a functional mutant or variant thereof); and/or a reporter gene (comprising a signaling moiety, e.g., GFP) operably linked to an inducible promoter (e.g., PtynA promoter). In some embodiments, the selective sensor induces expression of PtynA in response to aromatic aldehydes, but not aromatic amines. 
In some embodiments, the TynA protein or mutant variant thereof is selective to a specific aromatic amine. In some embodiments, TynA converts periplasmic amines (e.g., PEA, Tyra, DA, Trypta) to aldehydes which are imported into the cytoplasm; and in the cytoplasm, FeaR induces expression from the PtynA promoter expressing a reporter gene (e.g., detectable signal moiety) when in the presence of aldehydes. In some embodiments, the selective sensor is selective to PEA and Tyra. In some embodiments, either TynA or FeaR or both are engineered for selectivity. In some embodiments, the sensor comprises G494S mutation in TynA and/or optionally A81T mutation in the FeaR protein (i.e., double mutant sensor (G494S*)). In some embodiments, the selective sensor is a Tyra-specific variant comprising a G415H mutation in TynA. In some embodiments, G415H mutation in TynA induces expression of a reporter by about 200-fold and/or about 79-fold in response to Tyra and PEA, respectively, with minimal response to DA and Trypta. In some embodiments, A81T FeaR mutation with WT TynA is PEA and Tyra non-selective; S414M TynA and WT FeaR is slightly Tyra selective; S414M TynA and A81T FeaR is PEA selective; G415H TynA and WT FeaR is Tyra selective; or G415H TynA and A81T FeaR is PEA selective. In some embodiments, FeaR A81L, A81P, A81I, and A81N are PEA-specific variants. In some embodiments, Tyra-selective sensors (S414M or G415H TynA) can be transformed into PEA-specific sensors by introducing A81T FeaR; sensor sensitivity (G494S TynA) can be improved by A81T or A81S FeaR mutations; and/or PEA-Tyra selective sensors (A81T or A81S FeaR) can become PEA-specific sensors when combined with G494S TynA.
In some embodiments, the engineered regulator protein is an engineered TrpR and the engineered TrpR comprises one or more of the following mutations: the TrpR variant is OD or O1, repressing reporter expression in the presence of Trp and 5-HTP, but not Trypta, wherein the protein sequence or ligand binding site sequence of TrpR is at least about 80% identical to a polypeptide encoded by the WT TrpR sequence of SEQ ID NO: 113 or WT ligand binding site sequence or functional fragment or conservative substitution thereof and/or has ligand-binding activity. In some embodiments, the reference nucleotide sequence encoding TrpR is SEQ ID NO: 113 and the reference amino acid sequence for TrpR is SEQ ID NO: 122; the reference nucleotide sequence encoding TyrR is SEQ ID NO: 11 and the reference amino acid sequence for TyrR is SEQ ID NO: 124; the reference nucleotide sequence encoding FeaR is SEQ ID NO: 36 and the reference amino acid sequence for FeaR is SEQ ID NO: 125; or the reference nucleotide sequence encoding TynA is SEQ ID NO: 32 and the reference amino acid sequence for TynA is SEQ ID NO: 123. In some embodiments, the engineered regulator protein or enzyme (e.g., TyrR, FeaR, TynA, or TrpR) or the ligand binding site of the engineered regulator protein or enzyme comprises at least 80% identity to a polypeptide encoded by SEQ ID NO: 32, SEQ ID NO: 36, SEQ ID NO: 11, or SEQ ID NO: 113; at least 80% identity to SEQ ID NO: 122 (TrpR WT), SEQ ID NO: 123 (tynA WT), SEQ ID NO: 124 (tyrR WT), or SEQ ID NO: 125 (feaR WT); or at least 80% identity to D413 to Y496 of SEQ ID NO: 123, C7 to E274 of SEQ ID NO: 124, W12 to S118 of SEQ ID NO: 125, A81 to W110 of SEQ ID NO: 125, Q76 to Q116 of SEQ ID NO: 125, K72 to T83 of SEQ ID NO: 122, or a functional fragment or conservative substitution thereof.
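The "at least about 80% identical" criterion above can be illustrated with a minimal sketch. This passage does not specify how identity is computed, so the sketch assumes the simplest convention: two sequences that have already been aligned to equal length (with "-" marking gaps), with identity counted over all columns that are not gaps in both sequences. The function names and the toy sequences in the usage note are hypothetical and are not taken from the Sequence Listing.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences of equal length.

    Assumption (not from the disclosure): '-' marks an alignment gap, and a
    column counts toward the denominator unless both sequences have a gap.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" and b == "-":
            continue  # column carries no information for either sequence
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


def meets_identity_threshold(aligned_a: str, aligned_b: str,
                             threshold: float = 80.0) -> bool:
    """True if the two aligned sequences are at least `threshold` % identical."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

For example, with the toy aligned sequences "ACDEFGHIKL" and "ACDEFGHIKV", nine of ten positions match, so the pair would satisfy an 80% threshold under this convention. A real comparison against a reference such as SEQ ID NO: 124 would first require a pairwise alignment, which this sketch does not perform.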
In some embodiments, the KA value of the engineered regulator protein is less than the KA value of the wild type regulator protein in the presence of a target aromatic compound; the KA value of the engineered TyrR sensor is between about 0.05 mM and about 0.3 mM optionally in human intestines, serum, or urine; or the KA value of the engineered TynA-FeaR is between about 0.001 mM and about 0.1 mM optionally in plasma or food. In some embodiments, the presence of a target aromatic compound induces or represses reporter gene expression of the engineered regulator protein or enzyme compared to the wild type regulator protein or enzyme. In some embodiments, the engineered regulator protein or enzyme has a selectivity, induction, or repression response that is greater than wild type. In some embodiments, the regulator protein or enzyme or the regulator protein or enzyme binding site is modified to increase selectivity. In some embodiments, the engineered molecular sensor is a ligand-specific biosensor for phenylalanine, tyrosine, phenylethylamine, or tyramine. In some embodiments, the engineered molecular sensor is directly transferred into probiotic organisms or purified for cell-free sensor application. Yet another embodiment provides for a selective sensor for specifically detecting target aromatic compounds (e.g., Phe, Tyr) comprising: an engineered microorganism (e.g., an E. coli strain) comprising the engineered molecular sensor capable of expressing native or non-native TrpR (or functional mutants or variants thereof); and/or a promoter-reporter system comprising a reporter gene (e.g., GFP) under the control of a promoter (e.g., Ptrp) capable of being induced or repressed by one or more target aromatic compounds and producing an output response (e.g., increased expression or repression of the reporter gene). 
In some embodiments, the TrpR is a TrpR variant OD or O1, each with a synthetic Ptrp promoter, the selective sensor having strong repression in the presence of Trp and 5-HTP, but not Trypta. In some embodiments, a genomic copy of wild-type trpR is knocked out from the engineered microorganism selected from an E. coli strain (e.g., EcN). In some embodiments, the WT TrpR system has strong repression of GFP expression with fold repressions of 120-fold, 20-fold, and 7-fold in response to Trp, 5-HTP, and/or Trypta, respectively. In some embodiments, the engineered TrpR variants maintain strong repression in the presence of Trp and 5-HTP, but not Trypta. In some embodiments, the OD variant demonstrates about 5-fold and about 7-fold repression in response to Trp and/or 5-HTP, respectively. In some embodiments, the O1 variant demonstrates about 60-fold and about 15-fold repression in response to Trp and 5-HTP, respectively. In some embodiments, the target aromatic compounds are one or more of tryptophan (Trp), 5-hydroxytryptophan (5-HTP), or tryptamine (Trypta). Yet another aspect of the present disclosure provides for an artificial DNA construct comprising, as operably associated components in the 5′ to 3′ direction of transcription: (a) a promoter functional in a microorganism (e.g., transgenic microorganism, wild type microorganism); (b) a first polynucleotide comprising a nucleotide sequence encoding (i) a first polypeptide having TrpR activity; (ii) a second polypeptide having TyrR activity; or (iii) a first polypeptide having FeaR activity and a second polynucleotide comprising a nucleotide sequence encoding a second polypeptide having TynA activity; (c) a reporter gene (e.g., GFP); and/or (d) a transcriptional termination sequence. In some embodiments, the microorganism is capable of expressing native or non-native (i) TrpR; (ii) TyrR; or (iii) FeaR and TynA (or functional mutants or variants thereof).
In some embodiments, the microorganism specifically expresses or represses reporter gene expression compared to a microorganism not comprising the artificial DNA construct in the presence or absence of aromatic compounds. Yet another aspect of the present disclosure provides for a microbial sensor selected from an engineered wild type or transgenic microorganism transformed with the artificial DNA construct. In some embodiments, the wild type or transgenic microorganism is selected from Escherichia coli Nissle 1917 (EcN), DH10B, or E. coli MG1655. In some embodiments, the selective sensor is selective for an aromatic compound and/or the aromatic compound is an aromatic amino acid selected from phenylalanine (Phe), tyrosine (Tyr), and/or tryptophan (Trp). In some embodiments, the selective sensor is selective for an aromatic compound and the aromatic compound is an aromatic amine neurochemical selected from dopamine (DA), phenylethylamine (PEA), tyramine (Tyra), tryptamine (Trypta), serotonin, epinephrine, or norepinephrine. In some embodiments, a TrpR-based sensor is selective for Trp; a TyrR-based sensor is selective for Phe and/or Tyr; and/or a TynA-FeaR sensor system is selective for aromatic amines. Yet another aspect of the present disclosure provides for a ligand-specific sense-and-respond system, comprising purified sensors or engineered proteins or probiotics for specific sensing of aromatic compounds (e.g., amino acids, aromatic amines, aromatic neurochemicals) comprising: providing an orthogonal DNA-TF binding system with accompanying selectivity changes; changing ligand-TF binding specificity by leveraging differential multimerization patterns of TyrR without affecting DNA-TF binding interaction; or a “dual-control knob” strategy to improve the specificity and sensitivity of substrate-enzyme and ligand-TF interaction while maintaining DNA-TF binding interaction.
In some embodiments, target ligands are structurally similar and ligand-protein binding controls downstream functions such as reporter gene expression. In some embodiments, the ligand-specific sense-and-respond system comprises an engineered microorganism. Yet another aspect of the present disclosure provides for a method of using the engineered molecular sensor, comprising obtaining or having obtained a biological sample from a subject and/or contacting the biological sample with the engineered molecular sensor. In some embodiments, the subject has an aromatic compound-associated disease, disorder, or condition. In some embodiments, elevated levels of Phe detected by the sensor indicate the subject has phenylketonuria. In some embodiments, elevated levels of Tyr detected by the sensor indicate the subject has type 2 tyrosinemia. In some embodiments, elevated levels of PEA detected by the sensor indicate the subject has a psychological disorder. In some embodiments, the presence of Tyra detected by the sensor indicates catecholamine release and/or an increase in blood pressure. In some embodiments, the presence of Trypta detected by the sensor causes serotonin release and/or stimulation of gastrointestinal motility. Yet another aspect of the present disclosure provides for a method of using the engineered molecular sensor, comprising monitoring food quality or diagnosing or treating metabolic, digestive, and/or neurological disorders. In some embodiments, the presence of PEA, Tyra, and/or Trypta in food detected by the sensor indicates microbial contamination. In some embodiments, the sensor dynamically identifies microbial contamination in consumable products, manages various debilitating neurological disorders, or normalizes dysregulated metabolites associated with metabolic disorders. In some embodiments, the sensor recognizes or is selective for aromatic metabolites associated with various metabolic or neurological disorders or medical conditions. 
In some embodiments, the aromatic compounds are selected from phenylalanine (Phe) or tyrosine (Tyr). In some embodiments, the aromatic compounds are neurochemicals. In some embodiments, the neurochemicals are selected from aromatic neurotransmitters or neuromodulators. In some embodiments, the neurochemicals are selected from dopamine (DA), phenylethylamine (PEA), tyramine (Tyra), tryptamine (Trypta), serotonin, epinephrine, or norepinephrine. In some embodiments, the subject has or is suspected of having a medical condition associated with elevation, presence, or absence of aromatic compounds. In some embodiments, the sensor differentiates metabolites with divergent functions even having structural similarity. In some embodiments, the sensor modulates the specificity of ligand-protein binding while maintaining protein-DNA interactions and thus downstream gene expression control. Yet another aspect of the present disclosure provides for a method of using the engineered molecular sensor, the method comprising monitoring food quality, diagnosing or treating metabolic, digestive, and/or neurological disorders in probiotics or ex vivo wearable, paper-based or cell-free systems, or dynamically regulating enzymatic pathways for microbial metabolic engineering using the engineered molecular sensor. Yet another aspect of the present disclosure provides for a method of protein engineering (e.g., a regulator protein or enzyme) comprising: mutagenizing specific amino acids in and around a ligand-binding site of a protein or enzyme (e.g., TrpR, TyrR, FeaR, TynA), wherein the mutagenizing enables changes in ligand-protein binding specificity while maintaining protein-DNA interaction and thus downstream gene expression control; and/or linking ligand-protein binding to output response (e.g., promoter-reporter gene system).
Other objects and features will be in part apparent and in part pointed out hereinafter.
Those of skill in the art will understand that the drawings, described below, are for illustrative purposes only. The drawings are not intended to limit the scope of the present teachings in any way.
The present disclosure is based, at least in part, on the development of the first ligand-specific or selective sensors for aromatic compounds such as aromatic neurochemicals or aromatic amino acids.
Examples of aromatic amino acids can be phenylalanine (Phe) and tyrosine (Tyr). Examples of aromatic neurochemicals (e.g., neurotransmitters, neuromodulators) can be aromatic amines, such as phenylethylamine (PEA) and tyramine (Tyra), which are structurally similar and have been implicated in a variety of medical conditions.
As described herein, biosensors with high selectivity for aromatic compounds, such as phenylalanine, tyrosine, phenylethylamine, or tyramine, were developed and demonstrated for the first time.
Applications for this technology can include:
1) developing medically-relevant probiotic sensors with high specificity, which is critical to differentiate metabolites with divergent functions and create smart probiotics for accurate diagnostic tools; and
2) developing autonomous microbes that dynamically identify microbial contamination in consumable products, manage various debilitating neurological disorders, and normalize dysregulated metabolites associated with metabolic disorders.
Publicly available or purchased computational tools include: the MOTIF Search webtool (genome.jp), Robetta Web Server, Chem3D 16.0 (PerkinElmer, Waltham, Mass.), Rosetta Ligand Docking Server hosted by ROSIE, KEGG database, https://www.hmpdacc.org/HMRGD/, Interactive tree of life (iTOL) phylogeny tree annotation tool, HMP database, Greengenes, NCBI, and PyMOL 2.4.1 (Schrödinger, Inc., New York, N.Y.).
Engineered Microorganism
Many probiotics have been engineered for therapeutic applications, including treating various cancers, inflammatory disorders, metabolic disorders, and bacterial infections. Many of these probiotics constitutively express the proteins, raising concerns of off-target effects and genetic stability. Several engineered probiotics partially mitigate this concern by using sensors that limit protein expression to targeted biogeographical regions. For example, probiotics have been designed to regulate the therapeutic expression through sensors that detect oxygen content. However, they should also be engineered with biosensors that recognize the disease state by measuring the concentrations of relevant metabolites. By using disease-specific sensors, therapeutics can be delivered with both geographical and temporal precision. In addition, these sensors expand the potential applications of the engineered probiotics, which can be used to diagnose numerous disorders.
To develop these smart probiotics, the toolbox of sensors needs to be expanded for common probiotic microbes. Synthetic sensors can be created using several protein design approaches. However, sensors can also be obtained more efficiently by mining sensors that naturally exist in microbes. These sensors can often be directly transferred into probiotic organisms for use in various applications. However, most natural sensors recognize multiple structurally similar, but functionally diverse, metabolites often found in close proximity. To accurately correlate the concentration of a chemical with biological outcomes, engineered microbes need to differentiate between different chemicals. In addition, sensors often have an inadequate sensitivity to the ligand or lack orthogonality with native regulatory pathways in the probiotic, limiting their utility.
Specific sensing is critical for many probiotic sensor applications. Chronically elevated levels of structurally similar phenylalanine (Phe) and tyrosine (Tyr) are associated with the distinct disorders phenylketonuria and type 2 tyrosinemia, respectively. Sensors for both metabolites have been generated in different E. coli strains. However, the sensors are based on the wild-type version of the multi-ligand responsive TyrR transcription factor from E. coli and have limited selectivity and low dynamic ranges, making them unsuitable for differentiating between the two disorders. Similarly, the structurally similar amines phenylethylamine (PEA), tyramine (Tyra), and tryptamine (Trypta) are all commonly found and produced in the gut, but contribute to distinct biological outcomes. For example, extreme levels of PEA have been associated with a variety of psychological disorders, the presence of Tyra leads to catecholamine release and an increase in blood pressure, and the presence of Trypta causes serotonin release and the stimulation of gastrointestinal motility. Additionally, the presence of PEA, Tyra, and Trypta in food are indicators of microbial contamination, and eating foods with high levels of Tyra can lead to poisoning. Currently, there are no biosensors with high ligand specificity for these medically relevant chemicals, which can be employed in probiotic microbes.
A variety of protein engineering methods have been demonstrated for optimizing the selectivity and sensitivity of protein and RNA sensors. These approaches include directed evolution, rational design, and computational de novo design. Although each method has advantages, rational engineering can uniquely be performed using both basic knowledge of the protein structure and small library sizes, allowing for rapid construction and screening of libraries. In addition, when the structure of the protein is unknown, conserved or essential residues can also be identified through computational simulations or by aligning the sequence of the sensor with other proteins in the same family. Despite advances in protein engineering, creating ligand-specific sense-and-respond systems remains challenging due to the need to couple subtle protein conformational changes with differential protein-DNA interactions and gene expression control, especially when the target ligands are structurally similar.
Here is described the generation of sensors for a variety of disease-relevant aromatic metabolites in probiotic Escherichia coli Nissle 1917 (EcN). These metabolites include aromatic amino acids Phe, Tyr, and tryptophan (Trp), which are required to synthesize proteins and diverse essential metabolites, and aromatic amine neurochemicals dopamine (DA), PEA, Tyra, and Trypta, which are associated with many health issues. To generate sensors for these aromatic metabolites, three sensor systems were identified and engineered: the TrpR (Trp) sensor, the TyrR (Phe and Tyr) sensor, and the TynA-FeaR (aromatic amine) sensor system. Multiple engineered TrpR sensors were first characterized, which were previously created to be orthogonal to the wild-type E. coli-native system, and how the mutations impact the selectivity of the sensors was assessed. Next, the ligand selectivity of the TyrR and TynA-FeaR sensor systems was engineered and their sensitivity was tuned by rationally selecting and individually mutating amino acids in TyrR, TynA, and FeaR. This method of protein engineering quickly generates multiple small libraries that can be efficiently screened. Altogether, the first sensors selective for Phe, Tyr, PEA, or Tyra were generated. In engineering FeaR, novel insights into the uncharacterized structure of FeaR are also provided and residues important for ligand binding were identified for the first time. This work provides sensors with diverse clinical applications in engineering smart probiotics as well as a generalizable approach to modulating the specificity of ligand-protein binding while maintaining protein-DNA interactions and thus downstream gene expression control.
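The idea of rationally selecting a few residues and individually mutating each one can be sketched as a small library enumeration. This is an illustration only, not the screening procedure used in the disclosure: the function name is hypothetical, the sequence in the usage note is a toy string rather than TyrR, TynA, or FeaR, and the sketch assumes every selected position is substituted with each of the other 19 standard amino acids.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids


def single_site_library(sequence: str, positions: list[int]) -> dict[str, str]:
    """Enumerate single-substitution variants at the chosen 1-based positions.

    Keys use the conventional mutation notation (wild-type residue, position,
    new residue, e.g. 'R2F'); values are the full variant sequences.
    """
    variants: dict[str, str] = {}
    for pos in positions:
        if not 1 <= pos <= len(sequence):
            raise ValueError(f"position {pos} is outside the sequence")
        wild_type = sequence[pos - 1]
        for residue in AMINO_ACIDS:
            if residue == wild_type:
                continue  # skip the unmutated residue
            name = f"{wild_type}{pos}{residue}"
            variants[name] = sequence[:pos - 1] + residue + sequence[pos:]
    return variants
```

Calling `single_site_library("MRLEA", [2, 4])` on the toy sequence yields 38 variants (19 per position), which shows why a handful of selected residues gives libraries small enough to screen efficiently. Double mutants such as those written with a "+" in this disclosure could be built by applying the function again to a chosen single mutant.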
Microorganism
The disclosed system uses a microorganism (e.g., probiotic (bacteria, yeast)) to display biomolecules or binding agents.
Generally, one of the criteria that bacteria must meet to be regarded as probiotics is that they must be able to survive and thrive under the conditions of the gastrointestinal tract (GIT) and confer their beneficial effects. A preferred microorganism does not colonize the gut and is thus easily controlled.
For example, the microorganism can be Escherichia coli Nissle 1917 (EcN), DH10B, or E. coli MG1655. Other microorganisms that can be engineered to incorporate sensors can be probiotics known in the art (see e.g., Bober Synthetic Biology Approaches to Engineer Probiotics and Members of the Human Microbiota for Biomedical Applications. Annu Rev Biomed Eng. 2018; 20:277-300; Mathipa and Thantsha Gut Pathog (2017) 9:28). Probiotics can include bacteria from the genera Streptococcus, Enterococcus, Pediococcus, Weissella, or Lactococcus, but the most commonly used belong to Lactobacillus and Bifidobacterium spp.
The microorganism can be yeast, such as a Saccharomyces (e.g., Saccharomyces cerevisiae, Saccharomyces boulardii CNCM I-745). Because S. boulardii does not colonize the human gut and can pass through the gut of a host, it can be engineered to act as a sensor.
Binding Agents/Biomolecules
Binding agents, such as biomolecules (e.g., a ligand-binding moiety) can be displayed on the surface of the engineered microorganism. As such, the engineered microorganism can be designed to have a binding affinity to a specific aromatic compound with discriminating specificity.
The binding moiety can be a protein of the TrpR (Trp) sensor, the TyrR (Phe and Tyr) sensor, or the TynA-FeaR (aromatic amine) sensor system. Here, it is shown that the sensitivity of these sensors can be tuned by rationally selecting and individually mutating amino acids in TyrR, TynA, and FeaR. The engineered sensors have been shown to be selective for Phe, Tyr, PEA, or Tyra.
KA is the half-maximal constant. KA values are greater than 0 mM and can be, for example, between about 0.001 mM and about 0.3 mM. For example, for a TyrR sensor in human intestines, serum, or urine, a KA of 0.05-0.3 mM is considered sufficient. As another example, in plasma or food, a KA of 0.001-0.1 mM is considered sufficient.
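As an illustrative sketch only, a half-maximal constant can be related to reporter output through a Hill-type dose-response function. The Hill form, the function names, and the default parameter values (Hill coefficient n, basal output, maximal output) are assumptions made for illustration and are not taken from the present disclosure; only the KA range of about 0.001 mM to about 0.3 mM comes from the text above.

```python
def hill_response(ligand_mm, ka_mm, n=1.0, basal=0.0, max_out=1.0):
    """Return a modeled reporter output for a ligand concentration in mM.

    Assumed Hill model (hypothetical, for illustration): the output is
    halfway between basal and max_out when ligand_mm equals ka_mm.
    """
    if ligand_mm < 0 or ka_mm <= 0:
        raise ValueError("ligand must be >= 0 mM and KA must be > 0 mM")
    fraction = ligand_mm ** n / (ka_mm ** n + ligand_mm ** n)
    return basal + (max_out - basal) * fraction


def ka_in_disclosed_range(ka_mm, low=0.001, high=0.3):
    """Check whether a KA (mM) lies within the range stated in the text."""
    return low <= ka_mm <= high


if __name__ == "__main__":
    # At a ligand concentration equal to KA the modeled output is half-maximal.
    print(hill_response(0.1, 0.1))
    print(ka_in_disclosed_range(0.05))
```

Under this assumed model, a lower KA shifts the half-maximal response to lower ligand concentrations, which is one way to reason about matching a sensor to the concentration range expected in a given sample type.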
TrpR (Trp) Sensor.
First, multiple engineered TrpR sensors, which were previously created to be orthogonal to the wild-type E. coli-native system, were characterized, and how the mutations impact the selectivity of the sensors was assessed.
Transcriptional Regulatory Protein TyrR (Phe and Tyr) Sensor.
The ligand selectivity of the TyrR sensor was engineered and its sensitivity was tuned by rationally selecting and individually mutating amino acids in TyrR.
As shown in Example 1, the Phe-specific sensors have the potential to be applied to kinetically diagnose and treat disorders that cause Phe-dysregulation without interference from intestinal Tyr.
TyrR protein is involved in transcriptional regulation of aromatic amino acid biosynthesis and transport. TyrR modulates the expression of at least 8 unlinked operons. Seven of these operons are regulated in response to changes in the concentration of the three aromatic amino acids (phenylalanine, tyrosine and tryptophan). These amino acids are suggested to act as co-effectors which bind to the TyrR protein to form an active regulatory protein. In most cases TyrR causes negative regulation, but positive effects on the tyrP gene have been observed at high phenylalanine concentrations. The native tyrR gene (E. coli) is autogenously regulated by a mechanism that gives similar rates of expression of tyrR irrespective of the concentration of the aromatic amino acids.
Monoamine Oxidase TynA-FeaR Regulator (Aromatic Amine) Sensor System.
The ligand selectivity of the TynA-FeaR sensor system was engineered and its sensitivity was tuned by rationally selecting and individually mutating amino acids in TynA and FeaR.
TynA-FeaR system: a sensor plasmid, which consisted of constitutively expressed TynA and FeaR from E. coli MG1655, and a reporter plasmid, which consisted of GFP under the control of PtynA promoter (see e.g.,
Target Aromatic Compounds
As described herein, the present disclosure provides for a biosensor that specifically interacts with aromatic compounds.
For example, the aromatic compound can be an aromatic amino acid selected from phenylalanine (Phe), tyrosine (Tyr), or tryptophan (Trp).
As another example, the aromatic compound can be an aromatic amine neurochemical selected from dopamine (DA), phenylethylamine (PEA), tyramine (Tyra), tryptamine (Trypta), serotonin, epinephrine, or norepinephrine.
Reporters for Artificial Sensors/Signaling System
The engineered microorganism can be engineered to incorporate reporters for use in artificial sensing and signaling.
The engineered microorganisms described herein can comprise a reporter (or sensor). The reporter can be a sensor protein to detect pH, an immune receptor, or reporter proteins (e.g., GFP, RFP (e.g., mCherry), BFP, luminescence protein (e.g., luciferase)).
For example, the engineered microorganism can comprise a promoter-reporter system and/or a dual-control knob system.
Biosensors/Detectors
The present disclosure also provides for a method of constructing engineered microorganism biosensors that selectively recognize and react to a wide range of sensible target aromatic compounds in vivo. The engineered microorganism can include a reporter. The three sensor systems described here include the TrpR (Trp) sensor, the TyrR (Phe and Tyr) sensor, and the TynA-FeaR (aromatic amine) sensor system.
As described herein, the engineered microorganisms can be used for sensing applications. For example, a portion of a functional protein can be used to induce or repress a transcriptional response.
The promoter PtyrP upregulates gene expression via the phenylalanine-binding transcriptional dual regulator TyrR.
As described in Example 1, six E. coli-native promoters were characterized using genomic expression or plasmid-based overexpression of TyrR to identify the best promoter in EcN. PtyrP was selected as a sensor as it displayed an activating response to Phe and a repressing response to Tyr, allowing for clear differentiation between the presence of each amino acid, especially with overexpression of TyrR (see e.g.,
As described herein, engineered microorganisms can be used for detecting or sensing aromatic compounds. For example, a detection moiety can be used (e.g., GFP, RFP, BFP, luminescence protein). As such, every engineered microorganism that senses a target aromatic compound will either overexpress the detection moiety (“light up”) or have its expression repressed.
As a microbial biosensor targeting gut health, the disclosed engineered E. coli can overcome limitations of conventional approaches that depend on very limited existing sensing mechanisms in nature.
Molecular Engineering
The following definitions and methods are provided to better define the present invention and to guide those of ordinary skill in the art in the practice of the present invention. Unless otherwise noted, terms are to be understood according to conventional usage by those of ordinary skill in the relevant art.
The term “transfection,” as used herein, refers to the process of introducing nucleic acids into cells by non-viral methods. The term “transduction,” as used herein, refers to the process whereby foreign DNA is introduced into another cell via a viral vector.
The terms “heterologous DNA sequence”, “exogenous DNA segment”, or “heterologous nucleic acid,” as used herein, each refers to a sequence that originates from a source foreign to the particular host cell or, if from the same source, is modified from its original form. Thus, a heterologous gene in a host cell includes a gene that is endogenous to the particular host cell but has been modified through, for example, the use of DNA shuffling or cloning. The terms also include non-naturally occurring multiple copies of a naturally occurring DNA sequence. Thus, the terms refer to a DNA segment that is foreign or heterologous to the cell, or homologous to the cell but in a position within the host cell nucleic acid in which the element is not ordinarily found. Exogenous DNA segments are expressed to yield exogenous polypeptides. A “homologous” DNA sequence is a DNA sequence that is naturally associated with a host cell into which it is introduced.
Expression vector, expression construct, plasmid, or recombinant DNA construct is generally understood to refer to a nucleic acid that has been generated via human intervention, including by recombinant means or direct chemical synthesis, with a series of specified nucleic acid elements that permit transcription or translation of a particular nucleic acid in, for example, a host cell. The expression vector can be part of a plasmid, virus, or nucleic acid fragment. Typically, the expression vector can include a nucleic acid to be transcribed operably linked to a promoter.
An “expression vector”, otherwise known as an “expression construct”, is generally a plasmid or virus designed for gene expression in cells. The vector is used to introduce a specific gene into a target cell, and can commandeer the cell's mechanism for protein synthesis to produce the protein encoded by the gene. Expression vectors are the basic tools in biotechnology for the production of proteins. The vector is engineered to contain regulatory sequences that act as enhancer and/or promoter regions and lead to efficient transcription of the gene carried on the expression vector. The goal of a well-designed expression vector is the efficient production of protein, and this may be achieved by the production of a significant amount of stable messenger RNA, which can then be translated into protein. The expression of a protein may be tightly controlled, such that the protein is only produced in significant quantity when necessary through the use of an inducer; in some systems, however, the protein may be expressed constitutively. As described herein, Escherichia coli is used as the host for protein production, but other cell types may also be used.
In molecular biology, an “inducer” is a molecule that regulates gene expression. An inducer can function in two ways, such as:
(i) By disabling repressors. The gene is expressed because an inducer binds to the repressor. The binding of the inducer to the repressor prevents the repressor from binding to the operator. RNA polymerase can then begin to transcribe operon genes.
(ii) By binding to activators. Activators generally bind poorly to activator DNA sequences unless an inducer is present. An activator binds to an inducer and the complex binds to the activation sequence and activates the target gene. Removing the inducer stops transcription. Because a small inducer molecule is required, the increased expression of the target gene is called induction.
Repressor proteins bind to the DNA strand and prevent RNA polymerase from being able to attach to the DNA and synthesize mRNA. Inducers bind to repressors, causing them to change shape and preventing them from binding to DNA. Therefore, they allow transcription, and thus gene expression, to take place.
For a gene to be expressed, its DNA sequence must be copied (in a process known as transcription) to make a smaller, mobile molecule called messenger RNA (mRNA), which carries the instructions for making a protein to the site where the protein is manufactured (in a process known as translation). Many different types of proteins can affect the level of gene expression by promoting or preventing transcription. In prokaryotes (such as bacteria), these proteins often act on a portion of DNA known as the operator at the beginning of the gene. The promoter is where RNA polymerase, the enzyme that copies the genetic sequence and synthesizes the mRNA, attaches to the DNA strand.
Some genes are modulated by activators, which have the opposite effect on gene expression as repressors. Inducers can also bind to activator proteins, allowing them to bind to the operator DNA where they promote RNA transcription. Ligands that bind to deactivate activator proteins are not, in the technical sense, classified as inducers, since they have the effect of preventing transcription.
A “promoter” is generally understood as a nucleic acid control sequence that directs transcription of a nucleic acid. An inducible promoter is generally understood as a promoter that mediates transcription of an operably linked gene in response to a particular stimulus. A promoter can include necessary nucleic acid sequences near the start site of transcription, such as, in the case of a polymerase II type promoter, a TATA element. A promoter can optionally include distal enhancer or repressor elements, which can be located as much as several thousand base pairs from the start site of transcription.
A “ribosome binding site”, or “ribosomal binding site (RBS)”, refers to a sequence of nucleotides upstream of the start codon of an mRNA transcript that is responsible for the recruitment of a ribosome during the initiation of translation. Generally, RBS refers to bacterial sequences, although internal ribosome entry sites (IRES) have been described in mRNAs of eukaryotic cells or viruses that infect eukaryotes. Ribosome recruitment in eukaryotes is generally mediated by the 5′ cap present on eukaryotic mRNAs.
A “transcribable nucleic acid molecule” as used herein refers to any nucleic acid molecule capable of being transcribed into an RNA molecule. Methods are known for introducing constructs into a cell in such a manner that the transcribable nucleic acid molecule is transcribed into a functional mRNA molecule that is translated and therefore expressed as a protein product. Constructs may also be constructed to be capable of expressing antisense RNA molecules, in order to inhibit translation of a specific RNA molecule of interest. For the practice of the present disclosure, conventional compositions and methods for preparing and using constructs and host cells are well known to one skilled in the art (see e.g., Sambrook and Russel (2006) Condensed Protocols from Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory Press, ISBN-10: 0879697717; Ausubel et al. (2002) Short Protocols in Molecular Biology, 5th ed., Current Protocols, ISBN-10: 0471250929; Sambrook and Russel (2001) Molecular Cloning: A Laboratory Manual, 3d ed., Cold Spring Harbor Laboratory Press, ISBN-10: 0879695773; Elhai, J. and Wolk, C. P. 1988. Methods in Enzymology 167, 747-754).
The “transcription start site” or “initiation site” is the position surrounding the first nucleotide that is part of the transcribed sequence, which is also defined as position +1. With respect to this site all other sequences of the gene and its controlling regions can be numbered. Downstream sequences (i.e., further protein encoding sequences in the 3′ direction) can be denominated positive, while upstream sequences (mostly of the controlling regions in the 5′ direction) are denominated negative.
“Operably-linked” or “functionally linked” refers preferably to the association of nucleic acid sequences on a single nucleic acid fragment so that the function of one is affected by the other. For example, a regulatory DNA sequence is said to be “operably linked to” or “associated with” a DNA sequence that codes for an RNA or a polypeptide if the two sequences are situated such that the regulatory DNA sequence affects expression of the coding DNA sequence (i.e., that the coding sequence or functional RNA is under the transcriptional control of the promoter). Coding sequences can be operably-linked to regulatory sequences in sense or antisense orientation. The two nucleic acid molecules may be part of a single contiguous nucleic acid molecule and may be adjacent. For example, a promoter is operably linked to a gene of interest if the promoter regulates or mediates transcription of the gene of interest in a cell.
A “construct” is generally understood as any recombinant nucleic acid molecule such as a plasmid, cosmid, virus, autonomously replicating nucleic acid molecule, phage, or linear or circular single-stranded or double-stranded DNA or RNA nucleic acid molecule, derived from any source, capable of genomic integration or autonomous replication, comprising a nucleic acid molecule where one or more nucleic acid molecule has been operably linked.
A construct of the present disclosure can contain a promoter operably linked to a transcribable nucleic acid molecule operably linked to a 3′ transcription termination nucleic acid molecule. In addition, constructs can include but are not limited to additional regulatory nucleic acid molecules from, e.g., the 3′-untranslated region (3′ UTR). Constructs can include but are not limited to the 5′ untranslated regions (5′ UTR) of an mRNA nucleic acid molecule which can play an important role in translation initiation and can also be a genetic component in an expression construct. These additional upstream and downstream regulatory nucleic acid molecules may be derived from a source that is native or heterologous with respect to the other elements present on the promoter construct.
The term “transformation” refers to the transfer of a nucleic acid fragment into the genome of a host cell, resulting in genetically stable inheritance. Host cells containing the transformed nucleic acid fragments are referred to as “transgenic” cells, and organisms comprising transgenic cells are referred to as “transgenic organisms”.
“Transformed,” “transgenic,” and “recombinant” refer to a host cell or organism such as a bacterium, cyanobacterium, animal, or a plant into which a heterologous nucleic acid molecule has been introduced. The nucleic acid molecule can be stably integrated into the genome as generally known in the art and disclosed (Sambrook 1989; Innis 1995; Gelfand 1995; Innis & Gelfand 1999). Known methods of PCR include, but are not limited to, methods using paired primers, nested primers, single specific primers, degenerate primers, gene-specific primers, vector-specific primers, partially mismatched primers, and the like. The term “untransformed” refers to normal cells that have not been through the transformation process.
“Wild-type” refers to a virus or organism found in nature without any known mutation. A “wild type” organism can be genetically engineered to modulate native or non-native gene expression resulting in a transgenic or engineered organism.
Design, generation, and testing of the variant nucleotides, and their encoded polypeptides, having the above-required percent identities and retaining a required activity of the expressed protein is within the skill of the art. For example, directed evolution and rapid isolation of mutants can be according to methods described in references including, but not limited to, Link et al. (2007) Nature Reviews 5(9), 680-688; Sanger et al. (1991) Gene 97(1), 119-123; Ghadessy et al. (2001) Proc Natl Acad Sci USA 98(8) 4552-4557. Thus, one skilled in the art could generate a large number of nucleotide and/or polypeptide variants having, for example, at least 95-99% identity to the reference sequence described herein and screen such for desired phenotypes according to methods routine in the art.
Nucleotide and/or amino acid sequence identity percent (%) is understood as the percentage of nucleotide or amino acid residues that are identical with nucleotide or amino acid residues in a candidate sequence in comparison to a reference sequence when the two sequences are aligned. To determine percent identity, sequences are aligned and if necessary, gaps are introduced to achieve the maximum percent sequence identity. Sequence alignment procedures to determine percent identity are well known to those of skill in the art. Often publicly available computer software such as BLAST, BLAST2, ALIGN2, or Megalign (DNASTAR) software is used to align sequences. Those skilled in the art can determine appropriate parameters for measuring alignment, including any algorithms needed to achieve maximal alignment over the full-length of the sequences being compared. When sequences are aligned, the percent sequence identity of a given sequence A to, with, or against a given sequence B (which can alternatively be phrased as a given sequence A that has or comprises a certain percent sequence identity to, with, or against a given sequence B) can be calculated as: percent sequence identity = (X/Y) × 100, where X is the number of residues scored as identical matches by the sequence alignment program's or algorithm's alignment of A and B and Y is the total number of residues in B. If the length of sequence A is not equal to the length of sequence B, the percent sequence identity of A to B will not equal the percent sequence identity of B to A. For example, the percent identity to a reference sequence (e.g., binding pocket, entire protein, or functional fragment thereof) can be at least 80% or about 80%, about 81%, about 82%, about 83%, about 84%, about 85%, about 86%, about 87%, about 88%, about 89%, about 90%, about 91%, about 92%, about 93%, about 94%, about 95%, about 96%, about 97%, about 98%, about 99%, or about 100%.
Substitution refers to the replacement of one amino acid with another amino acid in a protein or the replacement of one nucleotide with another in DNA or RNA. Insertion refers to the insertion of one or more amino acids in a protein or the insertion of one or more nucleotides in DNA or RNA. Deletion refers to the deletion of one or more amino acids in a protein or the deletion of one or more nucleotides in DNA or RNA. Generally, substitutions, insertions, or deletions can be made at any position so long as the required activity is retained.
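As a minimal sketch of the calculation described above, reading it as X divided by Y, multiplied by 100, the following Python function computes percent identity from two sequences that have already been aligned. The function name, the use of "-" as the gap character, and the toy sequences are assumptions for illustration; the alignment itself would come from a program such as those named above.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity of sequence A to sequence B from a pairwise alignment.

    X = aligned columns where A and B carry the same residue (gaps excluded).
    Y = total number of residues in B (gap characters excluded).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    x = sum(1 for a, b in zip(aligned_a, aligned_b) if a == b and a != gap)
    y = sum(1 for b in aligned_b if b != gap)
    if y == 0:
        raise ValueError("sequence B contains no residues")
    return 100.0 * x / y


if __name__ == "__main__":
    a = "MKT-LV"  # hypothetical sequence A (5 residues, one gap)
    b = "MKTALV"  # hypothetical sequence B (6 residues)
    print(percent_identity(a, b))  # identity of A to B
    print(percent_identity(b, a))  # identity of B to A
```

Because Y counts the residues of whichever sequence is treated as B, swapping the arguments changes the result when the two sequences differ in length, consistent with the asymmetry noted in the text.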
So-called conservative exchanges can be carried out in which the amino acid which is replaced has a similar property as the original amino acid, for example, the exchange of Glu by Asp, Gln by Asn, Val by Ile, Leu by Ile, and Ser by Thr. For example, amino acids with similar properties can be Aliphatic amino acids (e.g., Glycine, Alanine, Valine, Leucine, Isoleucine), hydroxyl or sulfur/selenium-containing amino acids (e.g., Serine, Cysteine, Selenocysteine, Threonine, Methionine); Cyclic amino acids (e.g., Proline); Aromatic amino acids (e.g., Phenylalanine, Tyrosine, Tryptophan); Basic amino acids (e.g., Histidine, Lysine, Arginine); or Acidic and their Amide (e.g., Aspartate, Glutamate, Asparagine, Glutamine). Deletion is the replacement of an amino acid by a direct bond. Positions for deletions include the termini of a polypeptide and linkages between individual protein domains. Insertions are introductions of amino acids into the polypeptide chain, a direct bond formally being replaced by one or more amino acids. An amino acid sequence can be modulated with the help of art-known computer simulation programs that can produce a polypeptide with, for example, improved activity or altered regulation. On the basis of these artificially generated polypeptide sequences, a corresponding nucleic acid molecule coding for such a modulated polypeptide can be synthesized in-vitro using the specific codon-usage of the desired host cell.
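The property groups listed above can be encoded directly to check whether a proposed exchange is conservative in the sense described. The following Python sketch uses one-letter amino acid codes (U for selenocysteine); the dictionary layout and function names are assumptions for illustration, and the grouping simply mirrors the list in the text rather than any substitution matrix.

```python
# Property groups as listed in the text, in one-letter amino acid codes.
PROPERTY_GROUPS = {
    "aliphatic": set("GAVLI"),
    "hydroxyl_or_sulfur_selenium": set("SCUTM"),
    "cyclic": set("P"),
    "aromatic": set("FYW"),
    "basic": set("HKR"),
    "acidic_and_amide": set("DENQ"),
}


def property_group(residue):
    """Return the name of the property group containing the residue."""
    residue = residue.upper()
    for name, members in PROPERTY_GROUPS.items():
        if residue in members:
            return name
    raise ValueError(f"unrecognized residue: {residue!r}")


def is_conservative(original, replacement):
    """True if the exchange stays within one property group."""
    if original.upper() == replacement.upper():
        return False  # not an exchange at all
    return property_group(original) == property_group(replacement)


if __name__ == "__main__":
    # The exchanges named in the text: Glu/Asp, Gln/Asn, Val/Ile, Leu/Ile, Ser/Thr.
    for pair in ["ED", "QN", "VI", "LI", "ST"]:
        print(pair, is_conservative(pair[0], pair[1]))
```

A check of this kind could be used to pre-filter candidate mutations when building small libraries, although whether a given exchange preserves activity still has to be determined experimentally.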
“Highly stringent hybridization conditions” are defined as hybridization at 65° C. in a 6×SSC buffer (i.e., 0.9 M sodium chloride and 0.09 M sodium citrate). Given these conditions, a determination can be made as to whether a given set of sequences will hybridize by calculating the melting temperature (Tm) of a DNA duplex between the two sequences. If a particular duplex has a melting temperature lower than 65° C. in the salt conditions of a 6×SSC, then the two sequences will not hybridize. On the other hand, if the melting temperature is above 65° C. in the same salt conditions, then the sequences will hybridize. In general, the melting temperature for any hybridized DNA:DNA sequence can be determined using the following formula: Tm = 81.5° C. + 16.6(log10[Na+]) + 0.41(% G/C content) − 0.63(% formamide) − (600/l), where l is the length of the hybrid in base pairs. Furthermore, the Tm of a DNA:DNA hybrid is decreased by 1-1.5° C. for every 1% decrease in nucleotide identity (see e.g., Sambrook and Russel, 2006).
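The melting temperature calculation and the 65° C. comparison can be sketched in Python as follows. This sketch assumes the standard form of the formula, in which G/C content is entered as a percentage and the final term is 600 divided by the duplex length in base pairs; the function names and the example inputs are hypothetical, and the sodium ion concentration supplied for a given buffer is left to the user.

```python
import math


def melting_temp_c(na_molar, gc_percent, formamide_percent, length_bp):
    """Estimate the Tm (deg C) of a DNA:DNA duplex.

    Assumed form: Tm = 81.5 + 16.6*log10[Na+] + 0.41*(%GC)
                       - 0.63*(%formamide) - 600/length
    """
    if na_molar <= 0 or length_bp <= 0:
        raise ValueError("sodium concentration and length must be positive")
    return (81.5
            + 16.6 * math.log10(na_molar)
            + 0.41 * gc_percent
            - 0.63 * formamide_percent
            - 600.0 / length_bp)


def hybridizes_at(tm_c, hybridization_temp_c=65.0):
    """Apply the rule in the text: hybridization is expected when Tm is above 65 C."""
    return tm_c > hybridization_temp_c


if __name__ == "__main__":
    # Hypothetical duplex: 600 bp, 50% G/C, no formamide, 1.0 M sodium ion.
    tm = melting_temp_c(na_molar=1.0, gc_percent=50.0,
                        formamide_percent=0.0, length_bp=600)
    print(round(tm, 1), hybridizes_at(tm))
```

To account for mismatches, the text's correction of 1-1.5° C. per 1% decrease in nucleotide identity could be subtracted from the returned value before the comparison.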
Host cells can be transformed using a variety of standard techniques known to the art (see e.g., Sambrook and Russel (2006) Condensed Protocols from Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory Press, ISBN-10: 0879697717; Ausubel et al. (2002) Short Protocols in Molecular Biology, 5th ed., Current Protocols, ISBN-10: 0471250929; Sambrook and Russel (2001) Molecular Cloning: A Laboratory Manual, 3d ed., Cold Spring Harbor Laboratory Press, ISBN-10: 0879695773; Elhai, J. and Wolk, C. P. 1988. Methods in Enzymology 167, 747-754). Such techniques include, but are not limited to, viral infection, calcium phosphate transfection, liposome-mediated transfection, microprojectile-mediated delivery, receptor-mediated uptake, cell fusion, electroporation, and the like. The transformed cells can be selected and propagated to provide recombinant host cells that comprise the expression vector stably integrated in the host cell genome.
Exemplary nucleic acids that may be introduced to a host cell include, for example, DNA sequences or genes from another species, or even genes or sequences which originate with or are present in the same species, but are incorporated into recipient cells by genetic engineering methods. The term “exogenous” is also intended to refer to genes that are not normally present in the cell being transformed, or perhaps simply not present in the form, structure, etc., as found in the transforming DNA segment or gene, or genes which are normally present and that one desires to express in a manner that differs from the natural expression pattern, e.g., to over-express. Thus, the term “exogenous” gene or DNA is intended to refer to any gene or DNA segment that is introduced into a recipient cell, regardless of whether a similar gene may already be present in such a cell. The type of DNA included in the exogenous DNA can include DNA that is already present in the cell, DNA from another individual of the same type of organism, DNA from a different organism, or a DNA generated externally, such as a DNA sequence containing an antisense message of a gene, or a DNA sequence encoding a synthetic or modified version of a gene.
Host strains developed according to the approaches described herein can be evaluated by a number of means known in the art (see e.g., Studier (2005) Protein Expr Purif. 41(1), 207-234; Gellissen, ed. (2005) Production of Recombinant Proteins: Novel Microbial and Eukaryotic Expression Systems, Wiley-VCH, ISBN-10: 3527310363; Baneyx (2004) Protein Expression Technologies, Taylor & Francis, ISBN-10: 0954523253).
Methods of down-regulating or silencing genes are known in the art. For example, expressed protein activity can be down-regulated or eliminated using antisense oligonucleotides (ASOs), protein aptamers, nucleotide aptamers, and RNA interference (RNAi) (e.g., small interfering RNAs (siRNA), short hairpin RNA (shRNA), and micro RNAs (miRNA)) (see e.g., Rinaldi and Wood (2017) Nature Reviews Neurology 14, describing ASO therapies; Fanning and Symonds (2006) Handb Exp Pharmacol. 173, 289-303G, describing hammerhead ribozymes and small hairpin RNA; Helene, et al. (1992) Ann. N.Y. Acad. Sci. 660, 27-36; Maher (1992) BioEssays 14(12): 807-15, describing targeting deoxyribonucleotide sequences; Lee et al. (2006) Curr Opin Chem Biol. 10, 1-8, describing aptamers; Reynolds et al. (2004) Nature Biotechnology 22(3), 326-330, describing RNAi; Pushparaj and Melendez (2006) Clinical and Experimental Pharmacology and Physiology 33(5-6), 504-510, describing RNAi; Dillon et al. (2005) Annual Review of Physiology 67, 147-173, describing RNAi; Dykxhoorn and Lieberman (2005) Annual Review of Medicine 56, 401-423, describing RNAi). RNAi molecules are commercially available from a variety of sources (e.g., Ambion, TX; Sigma Aldrich, MO; Invitrogen). Several siRNA molecule design programs using a variety of algorithms are known to the art (see e.g., Cenix algorithm, Ambion; BLOCK-iT™ RNAi Designer, Invitrogen; siRNA Whitehead Institute Design Tools, Bioinformatics & Research Computing). Traits influential in defining optimal siRNA sequences include G/C content at the termini of the siRNAs, Tm of specific internal domains of the siRNA, siRNA length, position of the target sequence within the CDS (coding region), and nucleotide content of the 3′ overhangs.
Genome Editing
As described herein, expression of various signals (e.g., mutants, proteins) can be modulated (e.g., reduced, eliminated, or enhanced) using genome editing. Processes for genome editing are well known; see e.g. Adli 2018 Nature Communications 9 (1911). Except as otherwise noted herein, therefore, the process of the present disclosure can be carried out in accordance with such processes.
For example, genome editing can comprise CRISPR/Cas9, CRISPR-Cpf1, TALEN, or ZFNs. Adequate blockage of various signals by genome editing can result in engineered sensors for detection of aromatic compounds, for example.
As an example, clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR-associated (Cas) systems are a new class of genome-editing tools that target desired genomic sites in mammalian cells. Recently published type II CRISPR/Cas systems use Cas9 nuclease that is targeted to a genomic site by complexing with a synthetic guide RNA that hybridizes to a 20-nucleotide DNA sequence immediately preceding an NGG motif recognized by Cas9 (thus, a (N)20NGG target DNA sequence). This results in a double-strand break three nucleotides upstream of the NGG motif. The double-strand break instigates either non-homologous end-joining, which is error-prone and conducive to frameshift mutations that knock out gene alleles, or homology-directed repair, which can be exploited with the use of an exogenously introduced double-strand or single-strand DNA repair template to knock in or correct a mutation in the genome. Thus, genome editing, for example, using CRISPR/Cas systems, could be a useful tool for targeting cells by the removal or addition of various signals (e.g., to activate (e.g., CRISPRa), upregulate, downregulate).
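The (N)20NGG target pattern and the position of the double-strand break can be illustrated with a short Python sketch that scans one strand of a DNA sequence for candidate Cas9 sites. The function name, the returned tuple layout, and the example sequence are hypothetical; the sketch searches only the strand supplied (not the reverse complement) and makes no attempt to score guide quality or off-target risk.

```python
import re

# A lookahead is used so that overlapping candidate sites are all reported.
_SITE = re.compile(r"(?=([ACGT]{20})([ACGT]GG))")


def find_cas9_sites(dna):
    """Return a list of (start, protospacer, pam, cut_index) tuples.

    start      : 0-based index of the first protospacer nucleotide
    protospacer: the 20-nucleotide sequence preceding the PAM
    pam        : the NGG motif
    cut_index  : the break lies between cut_index - 1 and cut_index,
                 i.e. three nucleotides upstream of the NGG motif
    """
    dna = dna.upper()
    sites = []
    for match in _SITE.finditer(dna):
        start = match.start()
        pam_start = start + 20
        sites.append((start, match.group(1), match.group(2), pam_start - 3))
    return sites


if __name__ == "__main__":
    example = "GATTACAGATTACAGATTACTGGAC"  # hypothetical sequence
    for site in find_cas9_sites(example):
        print(site)
```

Each reported protospacer corresponds to the 20-nucleotide sequence to which a guide RNA would hybridize, and the cut index marks where a repair template for homology-directed repair would be centered.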
For example, the methods as described herein can comprise a method for altering a target polynucleotide sequence in a cell comprising contacting the polynucleotide sequence with a clustered regularly interspaced short palindromic repeats-associated (Cas) protein.
Examples of engineering a probiotic, for example, can include inserting a functional gene with a viral vector.
Any vector known in the art can be used. For example, the vector can be a viral vector selected from retrovirus, lentivirus, herpes, adenovirus, adeno-associated virus (AAV), rabies, Ebola, or hybrids thereof.
Aromatic Compound-Associated Diseases, Disorders, or Conditions
The sensors, systems, and methods described herein can be used for the detection or monitoring of aromatic compound-associated diseases, disorders, or conditions.
For example, elevated levels of structurally similar Phe and Tyr are associated with the distinct disorders phenylketonuria and type 2 tyrosinemia, respectively. As another example, extreme levels of PEA have been associated with a variety of psychological disorders, the presence of Tyra leads to catecholamine release and an increase in blood pressure, and the presence of Trypta causes serotonin release and the stimulation of gastrointestinal motility. Additionally, the presence of PEA, Tyra, and Trypta in food are indicators of microbial contamination, and eating foods with high levels of Tyra can lead to poisoning. Currently, there are no biosensors with high ligand specificity for these chemicals.
Serotonin is a chemical that the body produces naturally and that is needed for the nerve cells and brain to function. Too much serotonin in the body, however, can result in serotonin syndrome, with signs and symptoms that range from mild (shivering or diarrhea) to severe (muscle rigidity, fever, or seizures). Severe serotonin syndrome can cause death if not treated. Serotonin syndrome is usually caused by taking drugs or medications that affect serotonin levels, and stopping the drug(s) or medication(s) causing it is the main treatment.
Many of these metabolites, including DA, PEA, Tyra, and Trypta can be found in the intestines. As such, selective detection of such compounds is needed.
Formulation
The agents and compositions described herein can be formulated by any conventional manner using one or more pharmaceutically acceptable carriers or excipients as described in, for example, Remington's Pharmaceutical Sciences (A. R. Gennaro, Ed.), 21st edition, ISBN: 0781746736 (2005), incorporated herein by reference in its entirety. Such formulations will contain a therapeutically effective amount of a biologically active agent described herein, which can be in purified form, together with a suitable amount of carrier so as to provide the form for proper administration to the subject.
The term “formulation” refers to preparing a drug in a form suitable for administration to a subject, such as a human. Thus, a “formulation” can include pharmaceutically acceptable excipients, including diluents or carriers.
The term “pharmaceutically acceptable” as used herein can describe substances or components that do not cause unacceptable losses of pharmacological activity or unacceptable adverse side effects. Examples of pharmaceutically acceptable ingredients can be those having monographs in United States Pharmacopeia (USP 29) and National Formulary (NF 24), United States Pharmacopeial Convention, Inc, Rockville, Md., 2005 (“USP/NF”), or a more recent edition, and the components listed in the continuously updated Inactive Ingredient Search online database of the FDA. Other useful components that are not described in the USP/NF, etc. may also be used.
The term “pharmaceutically acceptable excipient,” as used herein, can include any and all solvents, dispersion media, coatings, antibacterial and antifungal agents, isotonic, or absorption delaying agents. The use of such media and agents for pharmaceutically active substances is well known in the art (see generally Remington's Pharmaceutical Sciences (A.R. Gennaro, Ed.), 21st edition, ISBN: 0781746736 (2005)). Except insofar as any conventional media or agent is incompatible with an active ingredient, its use in the therapeutic compositions is contemplated. Supplementary active ingredients can also be incorporated into the compositions.
A “stable” formulation or composition can refer to a composition having sufficient stability to allow storage at a convenient temperature, such as between about 0° C. and about 60° C., for a commercially reasonable period of time, such as at least about one day, at least about one week, at least about one month, at least about three months, at least about six months, at least about one year, or at least about two years.
The formulation should suit the mode of administration. The agents of use with the current disclosure can be formulated by known methods for administration to a subject using several routes which include, but are not limited to, parenteral, pulmonary, oral, topical, intradermal, intratumoral, intranasal, inhalation (e.g., in an aerosol), implanted, intramuscular, intraperitoneal, intravenous, intrathecal, intracranial, intracerebroventricular, subcutaneous, epidural, ophthalmic, transdermal, buccal, and rectal. The individual agents may also be administered in combination with one or more additional agents or together with other biologically active or biologically inert agents. Such biologically active or inert agents may be in fluid or mechanical communication with the agent(s) or attached to the agent(s) by ionic, covalent, van der Waals, hydrophobic, hydrophilic, or other physical forces.
Controlled-release (or sustained-release) preparations may be formulated to extend the activity of the agent(s) and reduce dosage frequency. Controlled-release preparations can also be used to affect the time of onset of action or other characteristics, such as blood levels of the agent, and consequently, affect the occurrence of side effects. Controlled-release preparations may be designed to initially release an amount of an agent(s) that produces the desired therapeutic effect, and gradually and continually release other amounts of the agent to maintain the level of therapeutic effect over an extended period of time. In order to maintain a near-constant level of an agent in the body, the agent can be released from the dosage form at a rate that will replace the amount of agent being metabolized or excreted from the body. The controlled-release of an agent may be stimulated by various inducers, e.g., change in pH, change in temperature, enzymes, water, or other physiological conditions or molecules.
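The rate-matching idea above can be illustrated with a short calculation. The following is a minimal sketch only, assuming simple linear elimination, where the release rate that holds a steady level equals clearance multiplied by the target concentration. The function name and the numerical values are hypothetical and are not taken from the present disclosure.

```python
def steady_state_release_rate(target_conc_mg_per_l, clearance_l_per_h):
    """Release rate (mg/h) that balances elimination at steady state.

    Assumes linear elimination, so that amount removed per hour
    = clearance * concentration. This is an illustrative assumption.
    """
    if target_conc_mg_per_l < 0 or clearance_l_per_h < 0:
        raise ValueError("inputs must be non-negative")
    return target_conc_mg_per_l * clearance_l_per_h


# Hypothetical example values: 2 mg/L target level, 5 L/h clearance.
rate = steady_state_release_rate(2.0, 5.0)
print(f"required release rate: {rate} mg/h")
```

If elimination of a given agent is not linear, the same balance applies in principle, but the elimination term would need a different model.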
Agents or compositions described herein can also be used in combination with other therapeutic modalities, as described further below. Thus, in addition to the therapies described herein, one may also provide to the subject other therapies known to be efficacious for treatment of the disease, disorder, or condition.
Administration
Agents and compositions described herein can be administered according to methods described herein in a variety of means known to the art. The agents and composition can be used therapeutically either as exogenous materials or as endogenous materials. Exogenous agents are those produced or manufactured outside of the body and administered to the body. Endogenous agents are those produced or manufactured inside the body by some type of device (biologic or other) for delivery within or to other organs in the body.
As discussed above, administration can be parenteral, pulmonary, oral, topical, intradermal, intratumoral, intranasal, inhalation (e.g., in an aerosol), implanted, intramuscular, intraperitoneal, intravenous, intrathecal, intracranial, intracerebroventricular, subcutaneous, epidural, ophthalmic, transdermal, buccal, and rectal.
Agents and compositions described herein can be administered in a variety of methods well known in the art. Administration can include, for example, methods involving oral ingestion, direct injection (e.g., systemic or stereotactic), implantation of cells engineered to secrete the factor of interest, drug-releasing biomaterials, polymer matrices, gels, permeable membranes, osmotic systems, multilayer coatings, microparticles, implantable matrix devices, mini-osmotic pumps, implantable pumps, injectable gels and hydrogels, liposomes, micelles (e.g., up to 30 μm), nanospheres (e.g., less than 1 μm), microspheres (e.g., 1-100 μm), reservoir devices, a combination of any of the above, or other suitable delivery vehicles to provide the desired release profile in varying proportions. Other methods of controlled-release delivery of agents or compositions will be known to the skilled artisan and are within the scope of the present disclosure.
Delivery systems may include, for example, an infusion pump which may be used to administer the agent or composition in a manner similar to that used for delivering insulin or chemotherapy to specific organs or tumors. Typically, using such a system, an agent or composition can be administered in combination with a biodegradable, biocompatible polymeric implant that releases the agent over a controlled period of time at a selected site. Examples of polymeric materials include polyanhydrides, polyorthoesters, polyglycolic acid, polylactic acid, polyethylene vinyl acetate, and copolymers and combinations thereof. In addition, a controlled release system can be placed in proximity of a therapeutic target, thus requiring only a fraction of a systemic dosage.
Agents can be encapsulated and administered in a variety of carrier delivery systems. Examples of carrier delivery systems include microspheres, hydrogels, polymeric implants, smart polymeric carriers, and liposomes (see generally, Uchegbu and Schatzlein, eds. (2006) Polymers in Drug Delivery, CRC, ISBN-10: 0849325331). Carrier-based systems for molecular or biomolecular agent delivery can: provide for intracellular delivery; tailor biomolecule/agent release rates; increase the proportion of biomolecule that reaches its site of action; improve the transport of the drug to its site of action; allow colocalized deposition with other agents or excipients; improve the stability of the agent in vivo; prolong the residence time of the agent at its site of action by reducing clearance; decrease the nonspecific delivery of the agent to nontarget tissues; decrease irritation caused by the agent; decrease toxicity due to high initial doses of the agent; alter the immunogenicity of the agent; decrease dosage frequency; improve taste of the product; or improve shelf life of the product.
Kits
Also provided are kits. Such kits can include an agent or composition described herein and, in certain embodiments, instructions for administration. Such kits can facilitate performance of the methods described herein. When supplied as a kit, the different components of the composition can be packaged in separate containers and admixed immediately before use. Components include, but are not limited to engineered probiotics, engineered protein regulators, and components for making and using same. Such packaging of the components separately can, if desired, be presented in a pack or dispenser device which may contain one or more unit dosage forms containing the composition. The pack may, for example, comprise metal or plastic foil such as a blister pack. Such packaging of the components separately can also, in certain instances, permit long-term storage without losing activity of the components.
Kits may also include reagents in separate containers such as, for example, sterile water or saline to be added to a lyophilized active component packaged separately. For example, sealed glass ampules may contain a lyophilized component and, in a separate ampule, sterile water or sterile saline, each of which has been packaged under a neutral non-reacting gas, such as nitrogen. Ampules may consist of any suitable material, such as glass, organic polymers (e.g., polycarbonate or polystyrene), ceramic, metal, or any other material typically employed to hold reagents. Other examples of suitable containers include bottles that may be fabricated from similar substances as ampules and envelopes that may consist of foil-lined interiors, such as aluminum or an alloy. Other containers include test tubes, vials, flasks, bottles, syringes, and the like. Containers may have a sterile access port, such as a bottle having a stopper that can be pierced by a hypodermic injection needle. Other containers may have two compartments that are separated by a readily removable membrane that upon removal permits the components to mix. Removable membranes may be glass, plastic, rubber, and the like.
In certain embodiments, kits can be supplied with instructional materials. Instructions may be printed on paper or another substrate, and/or may be supplied as an electronic-readable medium or video. Detailed instructions may not be physically associated with the kit; instead, a user may be directed to an Internet website specified by the manufacturer or distributor of the kit.
A control sample or a reference sample as described herein can be a sample from a healthy subject or sample, a wild-type subject or sample, or from populations thereof. A reference value can be used in place of a control or reference sample, which was previously obtained from a healthy subject, a group of healthy subjects, or a wild-type subject or sample. A control sample or a reference sample can also be a sample with a known amount of a detectable compound or a spiked sample.
Compositions and methods described herein utilizing molecular biology protocols can be according to a variety of standard techniques known to the art (see e.g., Sambrook and Russell (2006) Condensed Protocols from Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory Press, ISBN-10: 0879697717; Ausubel et al. (2002) Short Protocols in Molecular Biology, 5th ed., Current Protocols, ISBN-10: 0471250929; Sambrook and Russell (2001) Molecular Cloning: A Laboratory Manual, 3d ed., Cold Spring Harbor Laboratory Press, ISBN-10: 0879695773; Elhai, J. and Wolk, C. P. 1988. Methods in Enzymology 167, 747-754; Studier (2005) Protein Expr Purif. 41(1), 207-234; Gellissen, ed. (2005) Production of Recombinant Proteins: Novel Microbial and Eukaryotic Expression Systems, Wiley-VCH, ISBN-10: 3527310363; Baneyx (2004) Protein Expression Technologies, Taylor & Francis, ISBN-10: 0954523253).
Definitions and methods described herein are provided to better define the present disclosure and to guide those of ordinary skill in the art in the practice of the present disclosure. Unless otherwise noted, terms are to be understood according to conventional usage by those of ordinary skill in the relevant art.
In some embodiments, numbers expressing quantities of ingredients, properties such as molecular weight, reaction conditions, and so forth, used to describe and claim certain embodiments of the present disclosure are to be understood as being modified in some instances by the term “about.” In some embodiments, the term “about” is used to indicate that a value includes the standard deviation of the mean for the device or method being employed to determine the value. In some embodiments, the numerical parameters set forth in the written description and attached claims are approximations that can vary depending upon the desired properties sought to be obtained by a particular embodiment. In some embodiments, the numerical parameters should be construed in light of the number of reported significant digits and by applying ordinary rounding techniques. Notwithstanding that the numerical ranges and parameters setting forth the broad scope of some embodiments of the present disclosure are approximations, the numerical values set forth in the specific examples are reported as precisely as practicable. The numerical values presented in some embodiments of the present disclosure may contain certain errors necessarily resulting from the standard deviation found in their respective testing measurements. The recitation of ranges of values herein is merely intended to serve as a shorthand method of referring individually to each separate value falling within the range. Unless otherwise indicated herein, each individual value is incorporated into the specification as if it were individually recited herein. The recitation of discrete values is understood to include ranges between each value.
In some embodiments, the terms “a” and “an” and “the” and similar references used in the context of describing a particular embodiment (especially in the context of certain of the following claims) can be construed to cover both the singular and the plural, unless specifically noted otherwise. In some embodiments, the term “or” as used herein, including the claims, is used to mean “and/or” unless explicitly indicated to refer to alternatives only or the alternatives are mutually exclusive.
The terms “comprise,” “have” and “include” are open-ended linking verbs. Any forms or tenses of one or more of these verbs, such as “comprises,” “comprising,” “has,” “having,” “includes” and “including,” are also open-ended. For example, any method that “comprises,” “has” or “includes” one or more steps is not limited to possessing only those one or more steps and can also cover other unlisted steps. Similarly, any composition or device that “comprises,” “has” or “includes” one or more features is not limited to possessing only those one or more features and can cover other unlisted features.
All methods described herein can be performed in any suitable order unless otherwise indicated herein or otherwise clearly contradicted by context. The use of any and all examples, or exemplary language (e.g., “such as”) provided with respect to certain embodiments herein is intended merely to better illuminate the present disclosure and does not pose a limitation on the scope of the present disclosure otherwise claimed. No language in the specification should be construed as indicating any non-claimed element essential to the practice of the present disclosure.
Groupings of alternative elements or embodiments of the present disclosure disclosed herein are not to be construed as limitations. Each group member can be referred to and claimed individually or in any combination with other members of the group or other elements found herein. One or more members of a group can be included in, or deleted from, a group for reasons of convenience or patentability. When any such inclusion or deletion occurs, the specification is herein deemed to contain the group as modified thus fulfilling the written description of all Markush groups used in the appended claims.
All publications, patents, patent applications, and other references cited in this application are incorporated herein by reference in their entirety for all purposes to the same extent as if each individual publication, patent, patent application, or other reference was specifically and individually indicated to be incorporated by reference in its entirety for all purposes. Citation of a reference herein shall not be construed as an admission that such is prior art to the present disclosure.
Having described the present disclosure in detail, it will be apparent that modifications, variations, and equivalent embodiments are possible without departing from the scope of the present disclosure defined in the appended claims. Furthermore, it should be appreciated that all examples in the present disclosure are provided as non-limiting examples.
The following non-limiting examples are provided to further illustrate the present disclosure. It should be appreciated by those of skill in the art that the techniques disclosed in the examples that follow represent approaches the inventors have found function well in the practice of the present disclosure, and thus can be considered to constitute examples of modes for its practice. However, those of skill in the art should, in light of the present disclosure, appreciate that many changes can be made in the specific embodiments that are disclosed and still obtain a like or similar result without departing from the spirit and scope of the present disclosure.
This work represents a considerable achievement and includes the first ligand-specific or ligand-selective sensors for aromatic amino acids such as phenylalanine (Phe) and tyrosine (Tyr) as well as neurochemicals such as phenylethylamine (PEA) and tyramine (Tyra), which are structurally similar and have been implicated in a variety of medical conditions. Importantly, it lays the groundwork for developing medically relevant probiotic sensors with high specificity, which is critical to differentiate metabolites with divergent functions and create smart probiotics for accurate diagnostic tools.
Microbes have evolved diverse sensor systems that detect metabolites of great medical importance. These sensors have the potential to be utilized in probiotic microbes to provide diagnostic information and deliver therapeutics with temporal and geographical precision. However, microbial sensors found in nature are often promiscuous toward several structurally similar aromatic amino acids, common neurotransmitters, or other neuromodulators, limiting their practical applications. Although similar, these metabolites have significantly different functions. For example, chronically elevated levels of structurally similar Phe and Tyr are associated with the distinct disorders phenylketonuria and type 2 tyrosinemia, respectively. Extreme PEA levels have been associated with a variety of psychological disorders, while the presence of Tyra leads to catecholamine release and an increase in blood pressure. Due to such differences in associated diseases and functions, specific sensing is critical for probiotic sensor applications. Despite advances in protein engineering for specific ligand-protein interactions, engineering ligand-specific sense-and-respond systems remains challenging, especially when the target ligands are structurally similar and when ligand-protein binding should control downstream functions such as gene expression. This is mainly due to the challenge in coupling subtle protein conformational changes caused by binding of similar ligands with differential DNA interactions.
In this work, two promiscuous sensors are characterized that recognize aromatic metabolites associated with various metabolic and neurological disorders. Common methods of protein engineering require extensive structural knowledge of the proteins or massive library sizes. In contrast, to improve ligand selectivity, the responsible proteins, including TyrR, TynA, and FeaR, were rationally engineered by identifying and individually mutagenizing specific amino acids in and around the ligand-binding sites of these sensors. From these three case studies, this simple and generalizable method of protein engineering is shown to be effective and time-efficient, to require only small library sizes and a basic understanding of the protein structure, and to enable changes in ligand-protein binding specificity while maintaining protein-DNA interaction and thus downstream gene expression control.
Protein engineering for ligand-protein interaction has been extensively performed. However, linking ligand-protein binding to output response remains challenging. Current approaches include designing proteins that fluoresce upon ligand binding or employing ligand-binding transcription factors (TFs). While the former can be used to create sensors, the latter can generate sensors or signal-responsive controllers for gene expression. Because this TF-based system requires the maintenance or engineering of DNA-protein interaction in addition to engineering ligand-protein binding, engineering the TF-based system has been challenging, especially when the target ligands are structurally similar. To address this issue, demonstrated herein is an approach to change ligand-TF binding specificity by leveraging differential multimerization patterns of TyrR without affecting DNA-TF binding interaction (see e.g.,
The ligand binding site for TynA comprises at least a functional portion of SEQ ID NO: 32, having catalytic residues D413 and Y496. The ligand binding site for FeaR comprises at least a functional portion of SEQ ID NO: 36, wherein the ligand binding region is from W12 to S118, covered by the magenta region in
In addition, the computational and experimental analyses of FeaR provide novel insights into the otherwise uncharacterized structure of FeaR. The location of the FeaR ligand-binding pocket and potentially critical residues in ligand binding are identified herein for the first time. This approach allows for the generation of a highly efficient, specific, and sensitive sensor for PEA. In addition, residues that can be mutagenized in future work to generate novel ligand-selective sensors for several additional aromatic neurotransmitters and neuromodulators are also identified herein.
The novel ligand-selective sensors generated in this work will have diverse applications in synthetic biology. They can be applied to engineer autonomous microbes that dynamically identify microbial contamination in consumable products, manage various debilitating neurological disorders, and normalize dysregulated metabolites associated with metabolic disorders. In addition, the protein engineering methods demonstrated here can be widely applied to develop enzymes and sensors with applications in bioenergy, commodity chemicals, bioremediation, and healthcare.
Summary
Microbial biosensors have diverse applications in metabolic engineering and medicine. Specific and accurate quantification of chemical concentrations allows for adaptive regulation of enzymatic pathways and temporally precise expression of diagnostic reporters. Although biosensors should differentiate structurally similar ligands with distinct biological functions, such specific sensors are rarely found in nature and are challenging to create. Using E. coli Nissle 1917, generally regarded as a safe microbe, two biosensor systems that promiscuously recognize aromatic amino acids or neurochemicals were characterized. To improve the sensors' selectivity and sensitivity, rational protein engineering was applied by identifying and mutagenizing amino acid residues, and ligand-specific biosensors for phenylalanine, tyrosine, phenylethylamine, and tyramine were successfully demonstrated (see e.g.,
Introduction
Microbial biosensors can be utilized to kinetically regulate and quantify the products of metabolic pathways, diagnose diseases in vivo in probiotics, and analyze ex vivo samples through wearable, paper-based, and cell-free systems. Synthetic biological sensors can be created using several protein design approaches. However, sensors can also be obtained more efficiently by mining sensors that naturally exist in microbes. These sensors can often be directly transferred into probiotic organisms or purified for cell-free sensor applications. However, most natural sensors recognize multiple structurally similar, but functionally diverse, metabolites often found in close proximity. To accurately correlate the concentration of a chemical with biological outcomes, sensors need to differentiate between different chemicals and recognize the relevant chemical with the precise sensitivity.
Specific sensing is critical for many microbial sensor applications. A variety of protein engineering methods have been demonstrated for optimizing the selectivity and sensitivity of protein and RNA sensors. These approaches include directed evolution, rational design, and computational de novo design. Although each method has advantages, rational engineering can uniquely be performed using only basic knowledge of the protein structure and small library sizes, allowing for rapid construction and screening of libraries. In addition, when the structure of the protein is unknown, conserved or essential residues can also be identified through computational simulations or by aligning the sequence of the sensor with other proteins in the same family. Despite advances in protein engineering, creating ligand-specific sense-and-respond systems remains challenging due to the need to couple subtle protein conformational changes with differential protein-DNA interactions and gene expression control, especially when the target ligands are structurally similar.
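As a minimal sketch of the alignment-based identification of conserved residues mentioned above, the following Python example flags alignment columns in which one residue reaches a chosen frequency. The sequences are hypothetical placeholders (not TyrR, TynA, or FeaR family members), and the function name and threshold parameter are illustrative assumptions. The example assumes the sequences have already been aligned by a separate multiple sequence alignment step.

```python
from collections import Counter


def conserved_positions(aligned, threshold=1.0):
    """Return (column, residue) pairs for 1-based alignment columns in which
    the most common residue reaches `threshold` frequency. Gap-dominated
    columns are skipped."""
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must be aligned to equal length")
    conserved = []
    for col in range(length):
        column = [seq[col] for seq in aligned]
        residue, count = Counter(column).most_common(1)[0]
        if residue != "-" and count / len(aligned) >= threshold:
            conserved.append((col + 1, residue))
    return conserved


# Hypothetical pre-aligned toy sequences; '-' marks a gap.
toy_alignment = [
    "MKAD-YW",
    "MRADGYW",
    "MKSD-YF",
]
print(conserved_positions(toy_alignment))
```

Columns reported by such a screen are candidates to leave untouched or to probe individually, depending on whether the goal is to preserve function or to alter ligand recognition.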
The aromatic amino acids phenylalanine (Phe) and tyrosine (Tyr) are common microbial metabolic engineering products and precursors derived from the same pathway. In addition, chronically elevated levels of structurally similar Phe and Tyr are associated with the distinct disorders phenylketonuria and type 2 tyrosinemia, respectively (see e.g., TABLE 1).
Sensors for both metabolites have been generated in different E. coli strains. However, the sensors are based on the wild-type version of the multi-ligand responsive TyrR transcription factor (TF) from E. coli and have limited selectivity and low dynamic ranges. Similarly, the structurally similar amines phenylethylamine (PEA), tyramine (Tyra), and tryptamine (Trypta) are all commonly found in food and in the gut but contribute to distinct biological outcomes (see e.g., TABLE 1). For example, extreme levels of PEA have been associated with a variety of psychological disorders, the presence of Tyra leads to catecholamine release and an increase in blood pressure, and the presence of Trypta causes serotonin release and the stimulation of gastrointestinal motility. Additionally, the presence of PEA, Tyra, and Trypta in food is an indicator of microbial contamination, and eating foods with high levels of Tyra can lead to poisoning. Currently, there are no biosensors with high ligand specificity for these chemicals.
Described herein is the generation of sensors for the aromatic metabolites Phe, Tyr, PEA, and Tyra using Escherichia coli Nissle 1917 (EcN) as a host. To generate sensors for these aromatic metabolites, two sensor systems were identified and engineered: the TyrR (Phe and Tyr) sensor and the TynA-FeaR (aromatic amine) sensor system. The ligand selectivity of the TyrR and TynA-FeaR sensor systems was engineered and their sensitivity was tuned by rationally selecting and individually mutating amino acids in TyrR, TynA, and FeaR. This method of rational protein engineering quickly generates multiple small libraries that can be efficiently screened, making it an attractive approach for specificity engineering. Altogether, sensors specific for Phe, Tyr, PEA, or Tyra were generated. In engineering FeaR, insights were provided into the uncharacterized structure of FeaR and residues important for ligand binding were identified. Herein are provided sensors with diverse applications in microbial biosensing as well as a generalizable approach to modulating the specificity of ligand-protein binding while maintaining protein-DNA interactions and thus downstream gene expression control.
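The notions of dynamic range and ligand selectivity used above can be made concrete with a short calculation. The following is a minimal sketch, assuming dynamic range is expressed as fold induction over an uninduced baseline and selectivity as the ratio of the target response to the strongest off-target response. Both definitions, the function names, and all readings are illustrative assumptions and are not measurements or metrics taken from the present disclosure.

```python
def fold_induction(signal_with_ligand, signal_without_ligand):
    """Reporter signal with ligand divided by the uninduced baseline."""
    if signal_without_ligand <= 0:
        raise ValueError("baseline signal must be positive")
    return signal_with_ligand / signal_without_ligand


def selectivity(fold_by_ligand, target):
    """Target fold induction divided by the largest off-target fold induction."""
    off_target = max(v for k, v in fold_by_ligand.items() if k != target)
    return fold_by_ligand[target] / off_target


# Hypothetical reporter readings in arbitrary fluorescence units.
baseline = 100.0
readings = {"PEA": 2000.0, "Tyra": 1800.0, "Trypta": 400.0}

folds = {ligand: fold_induction(value, baseline) for ligand, value in readings.items()}
for ligand, fold in folds.items():
    print(f"{ligand}: {fold:.1f}-fold induction")
print(f"PEA selectivity over best off-target: {selectivity(folds, 'PEA'):.2f}")
```

With these placeholder numbers the sensor responds strongly to PEA but almost as strongly to Tyra, which is the kind of promiscuous profile that selectivity engineering aims to change.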
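To illustrate why individually mutating selected residues yields small libraries, the following Python sketch enumerates every single-residue substitution at a chosen set of positions and applies a mutation written in the conventional notation (e.g., A81T). The sequence, the positions, and the function names are hypothetical placeholders chosen for illustration and do not correspond to the sequences of the Sequence Listing.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # 20 standard residues, one-letter codes


def single_site_library(sequence, positions):
    """Return labels such as 'A3T' for all 19 substitutions at each 1-based position."""
    labels = []
    for pos in positions:
        wild_type = sequence[pos - 1]
        for residue in AMINO_ACIDS:
            if residue != wild_type:
                labels.append(f"{wild_type}{pos}{residue}")
    return labels


def apply_mutation(sequence, label):
    """Apply a label like 'A3T' after checking the expected wild-type residue."""
    wild_type, pos, new = label[0], int(label[1:-1]), label[-1]
    if sequence[pos - 1] != wild_type:
        raise ValueError(f"expected {wild_type} at position {pos}, found {sequence[pos - 1]}")
    return sequence[:pos - 1] + new + sequence[pos:]


toy_sequence = "MKALEQW"  # hypothetical 7-residue peptide
library = single_site_library(toy_sequence, positions=[3, 5])
print(len(library), "variants")  # 19 substitutions at each of 2 positions
print(apply_mutation(toy_sequence, "A3T"))
```

Because each position contributes at most 19 variants, the library grows linearly with the number of positions examined, in contrast to combinatorial libraries, which grow exponentially.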
Results
Developing Specific Phenylalanine and Tyrosine Sensors
Phe and Tyr are structurally similar metabolites utilized throughout the body for many processes as additives in food and animal feed and as precursors for many chemicals and pharmaceuticals (see e.g.,
First, six E. coli-native promoters were characterized using genomic expression or plasmid-based overexpression of TyrR to identify the best promoter in EcN (see e.g.,
The E274Q TyrR mutation was previously shown to prevent TyrR from forming hexamers, which occurs when TyrR is bound to Tyr. Since hexamerization is required for Tyr-mediated repression of PtyrP, this mutation should render TyrR nonresponsive to Tyr but maintain Phe-dependent induction. When the mutation was inserted into the plasmid-expressed TyrR, Tyr-dependent repression of PtyrP was mitigated and the Tyr-dominating response of the TF was eliminated (see e.g.,
Next, the Phe-sensitivity of the E274Q mutant was tuned to create sensors that better align with variable physiological concentrations and improve the utility of the sensors for therapeutic and diagnostic applications. Eight amino acid positions with known roles in TyrR-mediated promoter induction were selected (see e.g.,
TyrR was also engineered for Tyr specificity. Because TyrR binds to Phe and Tyr using two separate binding pockets, the libraries designed to identify less sensitive Phe sensors were used to identify variants completely devoid of Phe-activity, while maintaining Tyr activity (see e.g.,
Engineering Ligand-Specific Aromatic Amine Sensors
The body harbors a variety of aromatic amines with diverse neurological functions. Many of these metabolites, including dopamine (DA), PEA, Tyra, and Trypta, can be found in the intestines. Many commensal microbes can convert these amines into aldehydes through monoamine oxidases (see e.g.,
To develop sensors for aromatic amines, the TynA-FeaR system was first characterized in EcN. A sensor plasmid, which consisted of constitutively expressed TynA and FeaR from E. coli MG1655, and a reporter plasmid, which consisted of GFP under the control of PtynA, were constructed (see e.g.,
Due to their superior induction and roles in numerous processes, PEA and Tyra were selected as targets for engineering selectivity in the TynA-FeaR system. This system presents an interesting opportunity to optimize ligand selectivity using two different knobs of control, because either TynA or FeaR can be engineered for selectivity. However, the protein structure and mode of ligand binding of FeaR have not been elucidated. In contrast, the structure of TynA has been characterized. Thus, efforts first focused on improving the amine selectivity of TynA. Several residues in TynA have been identified as catalytically essential, including D413 and Y496 (see e.g.,
To quickly screen variants from the libraries for a response to each ligand, a growth-based assay was developed by incorporating the carbenicillin resistance gene, bla, and the sucrose counterselection gene, sacB, onto the reporter plasmid under the control of PtynA (see e.g.,
Two PEA-specific variants were successfully identified (see e.g.,
Two variants with improved selectivity to Tyra were also identified (see e.g.,
Characterization and Engineering of FeaR
Because the A81T FeaR mutation appeared in the G494S* sensor, the importance of the 81st residue for ligand binding was explored next. A protein motif search indicated that positions 12-185 of FeaR form the ligand-binding domain of this AraC-like TF, and positions 218-298 form its DNA-binding domain, with a helix-turn-helix motif at positions 258-298. Since no experimentally derived structures of FeaR are known to exist, the structure was predicted computationally by comparative modeling. The algorithm utilized the moderate homology of FeaR to CuxR, another AraC-like TF with a previously solved structure. Visualization of the predicted structure revealed that A81 is part of a solvent-accessible beta-barrel, and the alanine side chain is oriented toward the interior of the barrel (see e.g.,
To further elucidate the role of the A81T mutation in ligand selectivity, the A81T mutation was inserted into the FeaR-TynA sensor plasmid with wild-type TynA. The mutation had an insignificant impact on the maximum response of the sensor to PEA and Tyra but significantly reduced the responses to DA and Trypta (see e.g.,
Noting the PEA selectivity given by the A81T mutation and the smallest size of the PEA-aldehyde (no OH group), followed by that of the Tyra- (1 OH), DA- (2 OHs), and Trypta- (additional 5-membered ring) aldehydes (see e.g.,
FeaR recognized Tyra-aldehyde at a range of size and hydropathy index values (see e.g.,
Structural simulations of the wild-type (promiscuous), A81T (Tyra- and PEA-responsive), A81L (PEA-specific), and A81H (non-functional) variants were performed to understand how the mutations may alter the structure and thus ligand binding of FeaR. Residues of increasing size (A<T<L) protruded further into and occupied more space in the ligand-binding pocket, which is consistent with the observed size effect. All three mutations also shifted the positions of several side chains within the beta-barrel relative to the wild-type structure (see e.g.,
Given their significant rotations, it was hypothesized that residues M83, L108, and W110 may also play important roles in ligand binding, potentially by making contacts with alternative functional groups in the ligand. It was hypothesized that mutagenizing these residues may reveal a sensor for DA, Tyra, or Trypta. Each residue was individually mutated and multiple ligand-specific sensors for PEA or Tyra were identified (see e.g.,
To further confirm the specificity of the best performing sensors, the response of each to the four amines and the four respective carboxylic acids, 3,4-dihydroxyphenylacetic acid (DOPAC), phenylacetic acid (PAA), 4-hydroxyphenylacetic acid (HPAA), and indole-3-acetic acid (IAA) was tested in minimal medium with casamino acids and LB medium (see e.g.,
Together, this work provides insights into the structure and activity of FeaR. In addition, by engineering FeaR, the best performing PEA- and Tyra-specific sensors are provided for future applications. Future work could include random protein engineering approaches, such as error-prone PCR-based directed evolution, to develop additional ligand-specific sensors for the larger DA and Trypta amines. In addition, further exploration of the relevant regulatory pathways is required to uncouple the activity of the sensors from resource availability.
Discussion
The ligand-specific sensors developed here have the potential for diverse applications, including (1) monitoring food quality, (2) diagnosing or treating metabolic, digestive, and neurological disorders in probiotics or ex vivo wearable, paper-based and cell-free systems, and (3) dynamically regulating enzymatic pathways for microbial metabolic engineering. The high degree of ligand specificity shown by the engineered sensors allows them to effectively differentiate between diverse structurally similar aromatic metabolites. Demonstrated herein is an efficient and effective method of rational protein engineering by individually performing saturation mutagenesis on logically selected amino acid residues. The generalizability of this method is shown by applying it to three protein systems, including both enzymes and TFs. The simplicity of the approach and small library sizes make it an attractive first step in sensor engineering. Although this protein engineering method requires basic structural knowledge of the protein of interest, the scope of the required information is less than that required by fully computation-based engineering approaches. Also shown herein is how protein simulations can be used to identify important residues in uncharacterized proteins.
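The small library sizes noted above can be illustrated with a short sketch. It enumerates the single-substitution variants produced when each selected residue is mutated individually to the other 19 standard amino acids, using the FeaR positions discussed earlier (A81, M83, L108, and W110). The function and variable names are hypothetical and chosen only for illustration; they are not taken from the disclosure.

```python
# Illustrative sketch only: helper names are hypothetical.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, one-letter codes


def saturation_library(sites):
    """Return single-substitution variant names such as 'A81T'.

    `sites` maps a 1-based residue position to its wild-type one-letter code.
    Each position is mutated on its own to the 19 other amino acids.
    """
    variants = []
    for position, wild_type in sorted(sites.items()):
        for residue in AMINO_ACIDS:
            if residue != wild_type:
                variants.append(f"{wild_type}{position}{residue}")
    return variants


# FeaR positions named in the text.
fear_sites = {81: "A", 83: "M", 108: "L", 110: "W"}
library = saturation_library(fear_sites)

# Individual saturation grows linearly with the number of positions,
# whereas a fully combinatorial library grows exponentially.
individual_size = len(library)                      # 4 positions x 19 substitutions
combinatorial_size = len(AMINO_ACIDS) ** len(fear_sites)
print(individual_size, combinatorial_size)
```

Under these assumptions, four positions give 76 single-substitution variants, compared with 160,000 sequences for a fully combinatorial library at the same positions.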
Protein engineering for specific ligand-protein interactions has been extensively performed, especially for facilitating chemical drug screening. However, coupling ligand-protein binding with an output response has remained challenging. Current approaches include designing proteins that fluoresce upon ligand binding or employing ligand-binding TFs. While the former can be used to develop sensitive sensors, the latter can generate sensors as well as controllers for downstream functions such as gene expression. Because this TF-based system requires the maintenance or engineering of DNA-protein interaction in addition to engineering ligand-protein binding, developing ligand-specific sensors using this system has been challenging, especially when the target ligands are structurally similar. To address this challenging issue, two approaches were demonstrated: optimizing ligand-TF binding specificity and sensitivity by leveraging differential multimerization patterns of TyrR without affecting DNA-TF binding interactions (see e.g.,
The feaR-tynA system represents a unique way to develop sensors using two different control knobs. As shown in
Altogether, specific microbial sensors for Phe, Tyr, PEA, and Tyra were generated. This work provides ligand-specific sensors that can be applied to create probiotics for diverse applications. In addition, the generalizable protein engineering techniques demonstrated here can be used to quickly and effectively engineer enzymes and TFs for ligand specificity and sensitivity. Although sensors were specifically developed with potential applications primarily in medicine and food quality, these enzymes and sensors can be similarly engineered to produce fuels, pharmaceuticals, and commodity chemicals in response to the levels of metabolites or the products in a dynamic way. This work represents a considerable achievement in the challenging goal of engineering ligand-specific sense-and-respond systems for medically relevant chemicals through coupling protein conformational changes caused by ligand binding with DNA interactions.
Methods
Plasmids, Strains, and Reagents
All plasmids, strains, and genetic parts used in this study are summarized in TABLE 4, TABLE 5, and TABLE 6, respectively.
It is noted that TrpR and Ptrp constructs are either WT sequences from E. coli or from the following paper: Ellefson, J. W., Ledbetter, M. P. & Ellington, A. D. Directed evolution of a synthetic phylogeny of programmable Trp repressors. Nat Chem Biol 14, 361-367 (2018). https://doi.org/10.1038/s41589-018-0006-7.
E. coli strains used in this work.
All plasmids were assembled in E. coli DH10B using the Gibson Assembly or Golden Gate Assembly methods. EcN was transformed with the purified and sequence-verified plasmids for testing. The EcN strain used in this work lacks its native plasmids (obtained from DSMZ). The tynA, feaR, and tyrR genes and PtynA, PtyrP, PtyrR, Pmtr, PtyrB, ParoF, and ParoP promoters were obtained from E. coli MG1655 genomic DNA. pHJY23 containing the inactivated tynA gene (see e.g.,
Plasmid DNA was isolated using the PureLink Quick Plasmid Miniprep Kit (Invitrogen, Waltham, Mass., USA), and PCR products were extracted from electrophoresis gels using the Zymoclean Gel DNA Recovery Kit (Zymo Research, Irvine, Calif., USA). Enzymes were purchased from New England Biolabs (Ipswich, Mass., USA). Chemicals were purchased from Sigma-Aldrich (St. Louis, Mo., USA) or Gold Biotechnology (Olivette, Mo., USA). All sequencing was performed by Genewiz (South Plainfield, N.J., USA). Primers were purchased from Integrated DNA Technologies (Coralville, Iowa, USA).
Aromatic Amino Acid Sensing Assay
Everything but (Eb) medium was prepared by supplementing M9 minimal medium with 1 mM MgSO4, 100 µM CaCl2, 0.4% w/v glucose, and all non-aromatic amino acids (0.8 mM alanine, 5 mM arginine, 0.4 mM asparagine, 0.4 mM aspartate, 0.1 mM cysteine, 0.6 mM glutamate, 0.6 mM glutamine, 0.8 mM glycine, 0.2 mM histidine, 0.4 mM isoleucine, 0.8 mM leucine, 0.4 mM lysine, 0.2 mM methionine, 0.4 mM proline, 10 mM serine, 0.4 mM threonine, and 0.6 mM valine). Single colonies of EcN containing the relevant sensor plasmids were transferred to 5 mL Eb medium in 14 mL round bottom tubes and incubated overnight at 250 rpm and 37° C. Experimental cultures were prepared by diluting overnight cultures 200× into 0.6 mL fresh Eb medium supplemented with the respective ligands (Phe and Tyr) in 2 mL 96-deep well plates (Eppendorf, Hamburg, Germany). Cultures were grown for 8 h at 37° C. and 250 rpm before being sampled for fluorimetry or flow cytometry analysis. All media were supplemented with the relevant antibiotics for plasmid maintenance (34 µg/mL chloramphenicol and 100 µg/mL spectinomycin).
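As a worked example of the dilution arithmetic, the sketch below computes the volume of overnight culture needed to reach a given fold dilution in a fixed volume of fresh medium. It assumes the fold dilution is defined as total volume divided by inoculum volume; the function name is hypothetical and the calculation is illustrative only.

```python
def inoculum_volume_ul(fresh_medium_ml, fold):
    """Volume of overnight culture (in microliters) to add to fresh medium.

    Assumes fold = (inoculum + fresh medium) / inoculum, so that
    inoculum = fresh medium / (fold - 1).
    """
    if fold <= 1:
        raise ValueError("fold must be greater than 1")
    return fresh_medium_ml * 1000.0 / (fold - 1)


# 200x dilution into 0.6 mL of Eb medium, as in the assay described above.
volume = inoculum_volume_ul(0.6, 200)
print(f"{volume:.2f} uL of overnight culture into 0.6 mL fresh medium")
```

For a 200× dilution into 0.6 mL, this corresponds to roughly 3 µL of overnight culture per well; the same helper can be called with other fold values.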
Aromatic Amine Sensing Assays
For fluorescence-based quantification, single colonies were transferred to 5 mL LB medium (VWR, Radnor, Pa., USA) in 14 mL round bottom tubes and incubated overnight at 250 rpm and 37° C. Unless otherwise specified, experimental cultures were prepared by diluting overnight cultures 100× into 0.6 mL M9 minimal medium supplemented with 1 mM MgSO4, 100 µM CaCl2, and 2% w/v glycerol as well as the respective ligands (DA, PEA, Tyra, and Trypta) in 2 mL 96-deep well plates. Cultures were grown for 24 h at 37° C. and 250 rpm. After 5 h and 24 h of incubation, samples were obtained from the cultures for flow cytometry analysis.
For growth-based library screening, bla (encoding β-lactamase) and sacB (encoding levansucrase) were incorporated downstream of gfp in the pHJY028 reporter plasmid (see e.g., TABLE 4). For optimization, ribosome binding site libraries were designed for bla and sacB using the RBS Calculator (see e.g., TABLE 6). The optimization resulted in reporter pCX008. To test sensor variants using the pCX008 reporter, single colonies were transferred to 0.6 mL LB medium in 2 mL 96-deep well plates and incubated overnight at 250 rpm and 37° C. The overnight cultures were diluted 50× into 0.6 mL M9 minimal medium supplemented with 1 mM MgSO4, 100 µM CaCl2, and 2% w/v glycerol in 2 mL 96-deep well plates. For positive selection, cultures were supplemented with 300 µg/mL carbenicillin. Cultures were then incubated for 2 h at 250 rpm and 37° C. After the 2 h incubation, 2 µL of cells were plated onto M9 agar plates supplemented with 2% glycerol and either 300 µg/mL carbenicillin and 1 mM of the amine of interest (positive selection) or 5% (w/v) sucrose and 1 mM of the non-desired amines (negative selection). Plates were incubated overnight at 37° C.
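The logic of the dual selection can be sketched as follows. Because bla and sacB are both placed under the control of PtynA, a variant that is induced by the amine of interest is expected to grow on carbenicillin, while a variant that is not induced by the non-desired amines is expected to grow on sucrose. The class, function, category labels, and example data below are hypothetical and serve only to illustrate how plate outcomes might be interpreted.

```python
from dataclasses import dataclass


@dataclass
class PlateResult:
    variant: str
    grows_on_positive: bool  # carbenicillin + amine of interest
    grows_on_negative: bool  # sucrose + non-desired amines


def classify(result):
    """Interpret growth on the two selection plates (illustrative labels)."""
    if result.grows_on_positive and result.grows_on_negative:
        # Induced by the target amine, not induced by the others.
        return "selective candidate"
    if result.grows_on_positive:
        # Induced by the target, but sacB is also induced by non-desired amines.
        return "promiscuous"
    if result.grows_on_negative:
        # No induction by the target (no bla) and none by the other amines.
        return "non-responsive"
    return "off-target or no growth"


# Hypothetical plate outcomes, not experimental data.
results = [
    PlateResult("variant_1", True, True),
    PlateResult("variant_2", True, False),
    PlateResult("variant_3", False, True),
]
labels = {r.variant: classify(r) for r in results}
print(labels)
```

Only variants in the first category would be carried forward for fluorescence-based characterization under this interpretation.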
Fluorimetry
Culture samples (200 µL) were collected and transferred to 96-well assay microplates (black, clear flat bottom; Greiner Bio-One). The fluorescence and culture absorbance (Abs) were measured using a Tecan microplate reader (Infinite M200 Pro) as previously described. The fluorescence of GFP was measured with an excitation at 483 nm and emission at 530 nm. The Abs of the samples was measured at 600 nm. The measured fluorescence was normalized by dividing by the Abs and subtracting the same ratio obtained from non-fluorescent wild-type cells (Equation 1).
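The normalization described above can be written as a short function: the fluorescence of a sample is divided by its Abs, and the same ratio measured for non-fluorescent wild-type cells is subtracted. The function name and the example readings are hypothetical and are given only to make the arithmetic concrete.

```python
def normalized_fluorescence(fluorescence, absorbance, wt_fluorescence, wt_absorbance):
    """Fluorescence/Abs of the sample minus the same ratio for wild-type cells."""
    if absorbance <= 0 or wt_absorbance <= 0:
        raise ValueError("absorbance values must be positive")
    return fluorescence / absorbance - wt_fluorescence / wt_absorbance


# Hypothetical plate-reader readings (arbitrary units), not experimental data.
value = normalized_fluorescence(
    fluorescence=1200.0, absorbance=0.5,
    wt_fluorescence=100.0, wt_absorbance=0.25,
)
print(value)
```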
Flow Cytometry
Flow cytometry was performed as previously described. Culture samples were collected and diluted to a final OD600 of ˜0.005-0.01 in 200 µL filtered phosphate-buffered saline supplemented with 2 mg/ml kanamycin in 96-well assay microplates (U-bottom, REF-353910 from BD Biosciences, San Jose, Calif., USA). The fluorescence of the samples was measured using a Millipore Guava EasyCyte High Throughput flow cytometer with a 488 nm excitation laser and 512/18 nm emission filter. Cytometry data was gated by forward and side scatter. FlowJo (TreeStar Inc.) was used to obtain the arithmetic mean of the fluorescence distribution. The averages of the arithmetic means were calculated from three biological replicates. The average fluorescence of the non-fluorescent wild-type cell was subtracted from each sample to obtain the final fluorescence (au) values (Equation 2). To obtain the relative fluorescence in
Hill Equation Fitting
The Hill equation (Equations 4 and 5) was used to fit lines to the fluorescence data. The model was fit to the experimentally collected data by minimizing the root mean square error (RMSE; Equation 6). Fitted values are listed in TABLE 2.
For repressible constructs:
For inducible constructs:
where
F=Calculated fluorescence
Fmax=Maximum fluorescence
Fmin=Minimum fluorescence
KA=Half maximal constant
n=Hill coefficient
[L]=Ligand concentration
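For reference, a commonly used form of the Hill equation that is consistent with the symbols defined above is sketched here. The exact parameterization is an assumption and may differ from Equations 4 and 5 as fitted.

```latex
% Repressible constructs: fluorescence falls from F_max toward F_min as [L] increases
F = F_{\min} + \left(F_{\max} - F_{\min}\right)\frac{K_A^{\,n}}{K_A^{\,n} + [L]^{n}}

% Inducible constructs: fluorescence rises from F_min toward F_max as [L] increases
F = F_{\min} + \left(F_{\max} - F_{\min}\right)\frac{[L]^{n}}{K_A^{\,n} + [L]^{n}}
```

In both forms, the calculated fluorescence is midway between Fmin and Fmax when [L] equals KA.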
where, in Equation 6,
F=Calculated fluorescence
Fexp=Actual experimental fluorescence
N=Number of data points
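A minimal sketch of such a fit is shown below. It assumes the standard Hill forms for inducible and repressible constructs, computes the RMSE as the square root of the mean squared difference between F and Fexp over the N data points, and searches a coarse grid of KA and n values. Fixing Fmin and Fmax to the extremes of the data, the grid search itself, and all function names are simplifying assumptions made for illustration; the optimizer actually used is not specified here. The data in the example are synthetic.

```python
import math


def hill_inducible(ligand, f_min, f_max, k_a, n):
    """Fluorescence rises from f_min toward f_max with ligand concentration."""
    ratio = (ligand / k_a) ** n
    return f_min + (f_max - f_min) * ratio / (1.0 + ratio)


def hill_repressible(ligand, f_min, f_max, k_a, n):
    """Fluorescence falls from f_max toward f_min with ligand concentration."""
    ratio = (ligand / k_a) ** n
    return f_min + (f_max - f_min) / (1.0 + ratio)


def rmse(predicted, observed):
    """Root mean square error over N paired data points."""
    total = sum((p - o) ** 2 for p, o in zip(predicted, observed))
    return math.sqrt(total / len(observed))


def fit_hill(model, ligands, observed, k_a_grid, n_grid):
    """Grid search for the K_A and n that minimize the RMSE.

    Simplification: f_min and f_max are fixed to the data extremes.
    """
    f_min, f_max = min(observed), max(observed)
    best = None
    for k_a in k_a_grid:
        for n in n_grid:
            predicted = [model(x, f_min, f_max, k_a, n) for x in ligands]
            error = rmse(predicted, observed)
            if best is None or error < best["rmse"]:
                best = {"rmse": error, "f_min": f_min, "f_max": f_max,
                        "k_a": k_a, "n": n}
    return best


# Synthetic dose-response data generated from known parameters.
ligands = [0, 5, 10, 25, 50, 100, 250, 1000, 1e6]
observed = [hill_inducible(x, 100.0, 1000.0, 50.0, 2.0) for x in ligands]
fit = fit_hill(hill_inducible, ligands, observed,
               k_a_grid=[10 * i for i in range(1, 11)],
               n_grid=[1.0, 1.5, 2.0, 2.5, 3.0])
print(fit)
```

A continuous optimizer over all four parameters would generally be preferable for real data; the grid search is used here only to keep the sketch dependency-free.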
FeaR Protein Modeling
FeaR structural motifs were annotated using the MOTIF Search web tool. Structural predictions were made using the comparative modeling function of the Robetta web server with the wild-type or mutant FeaR amino acid sequences as inputs. Predicted structures were used as the basis for ligand docking simulations. Three-dimensional conformers of the aldehydes of DA, PEA, Tyra, and Trypta were generated using Chem3D 16.0 (PerkinElmer, Waltham, Mass.). Ligand conformer and protein structure .pdb files were uploaded to the Rosetta Ligand Docking Server hosted by ROSIE, with the ligand of interest initially centered at the coordinates of the 81st residue's side chain. All protein structures were visualized using PyMOL 2.4.1 (Schrödinger, Inc., New York, N.Y.). Values for amino acid size and hydropathy index were obtained from literature for the analysis in
Quantification and Statistical Analysis
All statistical details of experiments, including significance criteria and sample size can be found in the figure legends. No sample size calculations were performed during the design of experiments. No samples were excluded.
This example describes design and development of a tryptamine biosensor.
Tryptamine-specific biosensors were first designed by genetic parts swapping. Three types of tynA and feaR genes from E. coli K-12 MG1655 (MG), Klebsiella aerogenes (KA), and Klebsiella pneumoniae (KP), and two types of PtynA sequences were chosen and tested in 18 different combinations (see e.g.,
Next, significant enhancement of the fluorescence intensity induced by tryptamine was shown in the engineered sensor systems containing FeaR-KA (see e.g.,
This application claims priority from U.S. Provisional Application Ser. No. 63/180,176 filed on Apr. 27, 2021, which is incorporated herein by reference in its entirety.
This invention was made with government support under AT009741 awarded by the National Institutes of Health, CBET-1350498 and DGE-1745038 awarded by the National Science Foundation, N00014-17-1-2611 and N00014-19-1-2357 awarded by the Office of Naval Research, and 2020-33522-32319 awarded by the United States Department of Agriculture. The government has certain rights in the invention.
Number | Date | Country | |
---|---|---|---|
63180176 | Apr 2021 | US |