A Sequence Listing accompanies this application and is submitted as an ASCII text file of the sequence listing named “702581_02339_Sequence_Listing.xml” which is 369,446 bytes in size and was created on May 1, 2023. The sequence listing is electronically submitted via EFS-Web with the application and is incorporated herein by reference in its entirety.
The technical field relates to sensors for detecting molecules and metals in aqueous solution. In particular, the technical field relates to low-cost, programmable, and rapid sensors for detecting molecules such as toxins, drugs, contaminants and the like, and metals such as zinc, lead, copper and the like in aqueous solutions.
Cell-free biosensing is emerging as a low-cost, easy-to-use and field-deployable diagnostic technology that can be applied to detect a range of chemical contaminants related to human and environmental health [1]. At their core, these systems consist of two layers: a sensing layer that includes an RNA or protein-based biosensor that can detect a chemical target, and an output layer that includes a reporter construct. By genetically wiring the sensing layer to the output layer, a signal can be generated when the target compound binds to the biosensor and activates the expression of the reporter.
The present invention relates to compositions, systems, kits, and methods for detecting analytes and target molecules. The compositions, systems, kits, and methods utilize regulated in vitro transcription in order to detect an analyte or a target molecule in a sample via toehold-mediated strand displacement circuits.
The disclosed compositions, systems, kits, and methods typically comprise and/or utilize one or more components selected from: (a) an RNA polymerase; (b) an allosteric transcription factor (aTF), wherein the aTF binds an analyte or target molecule as a ligand; (c) an engineered transcription template; (d) a dsDNA signal gate molecule; (e) a dsDNA fuel gate molecule; and/or any combination thereof. The engineered transcription template typically comprises a promoter sequence for the RNA polymerase and an operator sequence for the aTF. The promoter sequence and operator sequence are operably linked to a sequence encoding an RNA, wherein the aTF modulates transcription of the encoded RNA when the aTF binds the analyte or target molecule as a ligand. The RNA that is transcribed from the engineered transcription template displaces a DNA strand of the dsDNA signal gate which generates a detectable signal.
In some embodiments of the cell-free sensor systems disclosed herein, polymerase strand recycling components (PSR) are included. By way of example, such components include at least a fuel gate. In embodiments comprising PSR components, the RNA that is transcribed from the engineered transcription template additionally or alternatively displaces a DNA strand (e.g., the “RecycleD” strand) of a double-stranded (ds) DNA fuel gate by hybridizing to the “waste” strand of the dsDNA fuel gate, thereby freeing the RecycleD strand of the double-stranded fuel gate. The RecycleD molecule may then displace a DNA strand of the dsDNA signal gate to form a double-stranded molecule which (a) generates a detectable signal, and (b) is used as a template for promoter-independent T7 RNA polymerase transcription, releasing the RecycleD molecules to displace additional DNA signal gates to achieve signal amplification. Any of the logic gates/logic gate assay cell-free systems disclosed herein (e.g., as exemplified in the figures and detailed description) may include polymerase strand recycling components (PSR) as described herein, to amplify signal and/or to improve assay sensitivity.
The disclosure provides compositions, systems, and/or kits for detecting an analyte comprising as components one or more of (a) an RNA polymerase; (b) an allosteric transcription factor (aTF), wherein the analyte is a ligand to which the aTF binds; (c) an engineered transcription template; (d) a double-stranded DNA (dsDNA) signal gate molecule; and (e) a dsDNA fuel gate molecule. In some embodiments, the engineered transcription template comprises a promoter sequence for the RNA polymerase and an operator sequence for the aTF operably linked to a sequence encoding an RNA. In embodiments of the analyte detection systems disclosed herein, the aTF modulates transcription of the encoded RNA when the aTF binds the analyte as a ligand. The transcribed RNA displaces a strand of the dsDNA fuel gate molecule generating a free single-stranded DNA molecule which displaces a strand of the dsDNA signal gate molecule to produce a detectable signal and a new hybrid signal/fuel gate. The RNA polymerase transcribes the hybrid signal/fuel gate to release the fuel gate strand of DNA which can then displace a strand of another dsDNA signal gate molecule, thus creating a positive feedback loop of signal amplification. 
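By way of illustration only, the signal amplification described above can be sketched as a toy counting model. The sketch below is not a kinetic model of the disclosed system and is not taken from the disclosure; the function name `simulate_signal`, the notion of discrete "rounds," and the assumptions that each free RecycleD strand opens exactly one signal gate per round and is always released by the polymerase are hypothetical simplifications. It contrasts a system in which the displacing strand is recycled with one in which each strand is consumed after a single displacement.

```python
def simulate_signal(free_strands: int, signal_gates: int,
                    rounds: int, recycling: bool) -> int:
    """Count opened (dequenched) signal gates after a number of rounds.

    Toy assumptions (hypothetical, for illustration only):
      - each available strand opens at most one signal gate per round;
      - with recycling, transcription of the hybrid releases every
        strand so it is available again in the next round;
      - without recycling, a strand stays bound after one displacement.
    """
    opened = 0
    available = free_strands
    for _ in range(rounds):
        newly_opened = min(available, signal_gates - opened)
        opened += newly_opened
        if not recycling:
            # Strands remain hybridized to the top strand and are used up.
            available -= newly_opened
        # With recycling, 'available' is unchanged: strands are released.
    return opened


if __name__ == "__main__":
    print(simulate_signal(2, 10, 3, recycling=False))
    print(simulate_signal(2, 10, 3, recycling=True))
```

In this idealized picture, the number of opened signal gates without recycling is capped by the number of free strands, whereas with recycling it continues to grow each round until the signal gates are exhausted.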
In some aspects, the dsDNA fuel gate molecule is a double-stranded nucleic acid comprising a waste strand and a RecycleD strand. In some further aspects, the dsDNA signal gate molecule is a fluorescently labeled double-stranded DNA molecule comprising a top strand having a fluorophore conjugated at its 3′ end and a bottom strand having a quencher conjugated at its 5′ end that quenches the fluorophore in the fluorescently labeled double-stranded DNA molecule. The RecycleD strand displaces the bottom strand of the fluorescently labeled double-stranded DNA molecule, which results in dequenching of the fluorophore to generate the detectable signal. Hybridization of the RecycleD strand to the top strand generates a double-stranded DNA molecule comprising a 3′ toehold on the top strand, wherein the RNA polymerase transcribes the top strand, displaces the RecycleD strand, and generates a fluorescent DNA/RNA hybrid. The released RecycleD strand can then displace the bottom strand of an additional signal gate, generating a positive feedback loop of signal amplification. In some aspects, the top strand is longer than the bottom strand and the transcribed RNA comprises a sequence that is complementary to the full length of the top strand. In further aspects, the waste strand or the bottom strand comprises one or more non-natural modifications that prevent the strand from being utilized as a template for transcription (e.g., 2′-O-methylation). In some aspects, the compositions, systems, and/or kits further comprise a non-labeled double-stranded DNA molecule comprising a top strand that comprises a nucleotide sequence that is identical to the nucleotide sequence of the top strand of the labeled double-stranded DNA molecule, and in other aspects, the top strand of the non-labeled double-stranded DNA molecule is longer than the bottom strand of the non-labeled double-stranded DNA molecule.
In some aspects, the bottom strand of the non-labeled double-stranded DNA molecule is shorter in length than the length of the bottom strand of the fluorescently labeled double-stranded DNA molecule.
In another aspect, the transcribed RNA does not form and/or is designed not to form an intramolecular secondary structure. In some aspects, the RNA polymerase is selected from T7 RNA polymerase, T3 RNA polymerase, SP6 RNA polymerase, and Syn5 RNA polymerase or the RNA polymerase is an engineered polymerase. In some aspects, the aTF represses, blocks, or inhibits transcription from the engineered transcription template when the aTF binds the operator, and in other aspects, the aTF activates transcription from the engineered transcription template when the aTF binds the operator. In some aspects, in the presence of the analyte as a ligand the aTF does not bind to the operator sequence or binds to the operator at a lower affinity than in the absence of the analyte as a ligand, while in other aspects, in the absence of the analyte as a ligand the aTF does not bind to the operator sequence or binds to the operator at a lower affinity than in the presence of the analyte as a ligand. In some aspects, the aTF belongs to the TetR, MarR, or ArsR/SmtB class or family of transcription factors or the aTF is an engineered aTF, and in other aspects, the aTF is selected from the group consisting of TetR, MphR, QacR, OtrR, CtcS, SAR2349, MobR, SmtB, CadC, CsoR, AdcR, TtgR, and HucR.
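By way of illustration only, the four aTF configurations described above (repressor or activator; operator binding lost in the presence or in the absence of the ligand) can be summarized as a simple Boolean sketch. The function name `transcription_on` and its parameter names are hypothetical, and the sketch treats operator binding as all-or-nothing, whereas the disclosure also contemplates binding at a lower affinity.

```python
def transcription_on(ligand_present: bool,
                     ligand_releases_atf: bool,
                     atf_is_repressor: bool) -> bool:
    """Return True if the template is transcribed in this idealized model.

    ligand_releases_atf: True if the aTF leaves the operator when the
        analyte is bound; False if the aTF binds the operator only when
        the analyte is bound.
    atf_is_repressor: True if operator-bound aTF blocks transcription;
        False if operator-bound aTF activates transcription.
    """
    # Step 1: is the aTF on the operator?
    if ligand_releases_atf:
        atf_bound = not ligand_present
    else:
        atf_bound = ligand_present
    # Step 2: what does an operator-bound aTF do to transcription?
    return (not atf_bound) if atf_is_repressor else atf_bound


if __name__ == "__main__":
    # Print the full truth table for all eight combinations.
    for releases in (True, False):
        for repressor in (True, False):
            for ligand in (False, True):
                print(releases, repressor, ligand,
                      transcription_on(ligand, releases, repressor))
```

In this sketch, a repressor that is released by its ligand and an activator that binds the operator only with its ligand both yield transcription when the analyte is present, while the other two combinations yield transcription when the analyte is absent.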
In other aspects, the analyte that is a ligand for the aTF is an antibiotic, and in some further aspects, the analyte that is a ligand for the aTF is a member of the macrolide-family of antibiotics or member of the tetracycline-family of antibiotics. In other aspects, the analyte is a quaternary amine or salts thereof. In other aspects, the analyte is a metal or a cation thereof, and in further aspects, the metal or the cation thereof is Zn, Pb, Cu, Cd, Ni, As, or Mn. In other aspects, the analyte is selected from salicylate, 3-hydroxy benzoic acid, naringenin, and uric acid.
In other aspects, the compositions, systems, and/or kits further comprise (f) one or more components for preparing a reaction mixture for RNA transcription.
In some aspects, the compositions, systems, and/or kits disclosed herein are adapted to detect and/or quantify a range of compounds related to environmental contamination and human health.
The disclosure further provides a method for detecting an analyte in a sample, the method comprising contacting the sample with any one of the compositions, systems, and/or kits disclosed herein and detecting a signal.
The patent or application file contains at least one drawing executed in color. Copies of this patent or patent application publication with color drawing(s) will be provided by the Office upon request and payment of the necessary fee.
The present invention is described herein using several definitions, as set forth below and throughout the application.
Unless otherwise specified or indicated by context, the terms “a”, “an”, and “the” mean “one or more.” For example, “a component,” “a composition,” “a system,” “a kit,” “a method,” “a protein,” “a vector,” “a domain,” “a binding site,” and “an RNA” should be interpreted to mean “one or more components,” “one or more compositions,” “one or more systems,” “one or more kits,” “one or more methods,” “one or more proteins,” “one or more vectors,” “one or more domains,” “one or more binding sites,” and “one or more RNAs,” respectively.
As used herein, “about,” “approximately,” “substantially,” and “significantly” will be understood by persons of ordinary skill in the art and will vary to some extent on the context in which they are used. If there are uses of these terms which are not clear to persons of ordinary skill in the art given the context in which they are used, “about” and “approximately” will mean plus or minus ≤10% of the particular term and “substantially” and “significantly” will mean plus or minus >10% of the particular term.
As used herein, the terms “include” and “including” have the same meaning as the terms “comprise” and “comprising” in that these latter terms are “open” transitional terms that do not limit claims only to the recited elements succeeding these transitional terms. The term “consisting of,” while encompassed by the term “comprising,” should be interpreted as a “closed” transitional term that limits claims only to the recited elements succeeding this transitional term. The term “consisting essentially of,” while encompassed by the term “comprising,” should be interpreted as a “partially closed” transitional term which permits additional elements succeeding this transitional term, but only if those additional elements do not materially affect the basic and novel characteristics of the claim.
As used herein, the terms “regulation” and “modulation” may be utilized interchangeably and may include “promotion” and “induction.” For example, a transcription factor that regulates or modulates expression of a target gene may promote and/or induce expression of the target gene. In addition, the terms “regulation” and “modulation” may be utilized interchangeably and may include “inhibition” and “reduction.” For example, a transcription factor that regulates or modulates expression of a target gene may inhibit and/or reduce expression of the target gene.
As used herein, the term “sample” may include “biological samples” and “non-biological samples.” Biological samples may include samples obtained from a human or non-human subject. Biological samples may include but are not limited to, blood samples and blood product samples (e.g., serum or plasma), urine samples, saliva samples, fecal samples, perspiration samples, and tissue samples. Non-biological samples may include but are not limited to aqueous samples (e.g., watershed samples) and surface swab samples.
The terms “polynucleotide,” “polynucleotide sequence,” “nucleic acid” and “nucleic acid sequence” refer to a nucleotide, oligonucleotide, polynucleotide (which terms may be used interchangeably), or any fragment thereof. These phrases also refer to DNA or RNA of genomic, natural, or synthetic origin (which may be single-stranded or double-stranded and may represent the sense or the antisense strand).
The terms “nucleic acid” and “oligonucleotide,” as used herein, may refer to polydeoxyribonucleotides (containing 2-deoxy-D-ribose), polyribonucleotides (containing D-ribose), and to any other type of polynucleotide that is an N glycoside of a purine or pyrimidine base. There is no intended distinction in length between the terms “nucleic acid”, “oligonucleotide” and “polynucleotide”, and these terms will be used interchangeably. These terms refer only to the primary structure of the molecule. Thus, these terms include double- and single-stranded DNA, as well as double- and single-stranded RNA. For use in the present methods, an oligonucleotide also can comprise nucleotide analogs in which the base, sugar, or phosphate backbone is modified as well as non-purine or non-pyrimidine nucleotide analogs.
Oligonucleotides can be prepared by any suitable method, including direct chemical synthesis by a method such as the phosphotriester method of Narang et al., 1979, Meth. Enzymol. 68:90-99; the phosphodiester method of Brown et al., 1979, Meth. Enzymol. 68:109-151; the diethylphosphoramidite method of Beaucage et al., 1981, Tetrahedron Letters 22:1859-1862; and the solid support method of U.S. Pat. No. 4,458,066, each incorporated herein by reference. A review of synthesis methods of conjugates of oligonucleotides and modified nucleotides is provided in Goodchild, 1990, Bioconjugate Chemistry 1(3): 165-187, incorporated herein by reference.
Regarding polynucleotide sequences, the terms “percent identity” and “% identity” refer to the percentage of residue matches between at least two polynucleotide sequences aligned using a standardized algorithm. Such an algorithm may insert, in a standardized and reproducible way, gaps in the sequences being compared in order to optimize alignment between two sequences, and therefore achieve a more meaningful comparison of the two sequences. Percent identity for a nucleic acid sequence may be determined as understood in the art. (See, e.g., U.S. Pat. No. 7,396,664, which is incorporated herein by reference in its entirety). A suite of commonly used and freely available sequence comparison algorithms is provided by the National Center for Biotechnology Information (NCBI) Basic Local Alignment Search Tool (BLAST), which is available from several sources, including the NCBI, Bethesda, Md., at its website. The BLAST software suite includes various sequence analysis programs including “blastn,” that is used to align a known polynucleotide sequence with other polynucleotide sequences from a variety of databases. Also available is a tool called “BLAST 2 Sequences” that is used for direct pairwise comparison of two nucleotide sequences. “BLAST 2 Sequences” can be accessed and used interactively at the NCBI website. The “BLAST 2 Sequences” tool can be used for both blastn and blastp (discussed above).
Regarding polynucleotide sequences, percent identity may be measured over the length of an entire defined polynucleotide sequence, for example, as defined by a particular SEQ ID number, or may be measured over a shorter length, for example, over the length of a fragment taken from a larger, defined sequence, for instance, a fragment of at least 20, at least 30, at least 40, at least 50, at least 70, at least 100, or at least 200 contiguous nucleotides. Such lengths are exemplary only, and it is understood that any fragment length supported by the sequences shown herein, in the tables, figures, or Sequence Listing, may be used to describe a length over which percentage identity may be measured.
Regarding polynucleotide sequences, “variant,” “mutant,” or “derivative” may be defined as a nucleic acid sequence having at least 50% sequence identity to the particular nucleic acid sequence over a certain length of one of the nucleic acid sequences using blastn with the “BLAST 2 Sequences” tool available at the National Center for Biotechnology Information's website. (See Tatiana A. Tatusova, Thomas L. Madden (1999), “Blast 2 sequences—a new tool for comparing protein and nucleotide sequences”, FEMS Microbiol Lett. 174:247-250). Such a pair of nucleic acids may show, for example, at least 60%, at least 70%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% or greater sequence identity over a certain defined length.
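By way of illustration only, the percent identity calculation and the 50% variant threshold described above can be sketched for two sequences that have already been aligned (gaps written as "-"). This sketch does not perform the alignment and is not an implementation of blastn or of the “BLAST 2 Sequences” tool; the function names `percent_identity` and `is_variant`, and the choice to divide matches by the number of alignment columns, are assumptions made for the example.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns in which both sequences share a residue.

    Assumes the inputs are already aligned to equal length, with '-'
    marking gaps. Columns that are gaps in both sequences are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(x, y) for x, y in zip(aligned_a.upper(), aligned_b.upper())
               if not (x == "-" and y == "-")]
    if not columns:
        raise ValueError("no alignment columns to compare")
    matches = sum(1 for x, y in columns if x == y and x != "-")
    return 100.0 * matches / len(columns)


def is_variant(aligned_a: str, aligned_b: str,
               threshold: float = 50.0) -> bool:
    """Apply an 'at least threshold percent identity' test."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    # Hypothetical sequences chosen only to exercise the function.
    print(percent_identity("ACGTACGT", "ACGTTCGT"))
    print(is_variant("AAAA", "AATT"))
```

The `threshold` parameter may be raised (e.g., to 90.0 or 95.0) to apply the higher identity levels recited above over the same defined length.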
Nucleic acid sequences that do not show a high degree of identity may nevertheless encode similar amino acid sequences due to the degeneracy of the genetic code where multiple codons may encode for a single amino acid. It is understood that changes in a nucleic acid sequence can be made using this degeneracy to produce multiple nucleic acid sequences that all encode substantially the same protein. For example, polynucleotide sequences as contemplated herein may encode a protein and may be codon-optimized for expression in a particular host. In the art, codon usage frequency tables have been prepared for a number of host organisms including humans, mouse, rat, pig, E. coli, plants, and other host cells.
A “recombinant nucleic acid” is a sequence that is not naturally occurring or has a sequence that is made by an artificial combination of two or more otherwise separated segments of sequence. This artificial combination is often accomplished by chemical synthesis or, more commonly, by the artificial manipulation of isolated segments of nucleic acids, e.g., by genetic engineering techniques known in the art. The term recombinant includes nucleic acids that have been altered solely by addition, substitution, or deletion of a portion of the nucleic acid. Frequently, a recombinant nucleic acid may include a nucleic acid sequence operably linked to a promoter sequence. Such a recombinant nucleic acid may be part of a vector that is used, for example, to transform a cell.
The nucleic acids disclosed herein may be “substantially isolated or purified.” The term “substantially isolated or purified” refers to a nucleic acid that is removed from its natural environment, and is at least 60% free, preferably at least 75% free, and more preferably at least 90% free, even more preferably at least 95% free from other components with which it is naturally associated.
The term “amplification reaction” refers to any chemical reaction, including an enzymatic reaction, which results in increased copies of a template nucleic acid sequence or results in transcription of a template nucleic acid. Amplification reactions include reverse transcription, the polymerase chain reaction (PCR), including Real Time PCR (see U.S. Pat. Nos. 4,683,195 and 4,683,202; PCR Protocols: A Guide to Methods and Applications (Innis et al., eds, 1990)), and the ligase chain reaction (LCR) (see Barany et al., U.S. Pat. No. 5,494,810). Exemplary “amplification reaction conditions” or “amplification conditions” typically comprise either two-step or three-step cycles. Two-step cycles have a high temperature denaturation step followed by a hybridization/elongation (or ligation) step. Three-step cycles comprise a denaturation step followed by a hybridization step followed by a separate elongation step.
The term “hybridization,” as used herein, refers to the formation of a duplex structure by two single-stranded nucleic acids due to complementary base pairing. Hybridization can occur between fully complementary nucleic acid strands or between “substantially complementary” nucleic acid strands that contain minor regions of mismatch. Conditions under which hybridization of fully complementary nucleic acid strands is strongly preferred are referred to as “stringent hybridization conditions” or “sequence-specific hybridization conditions”. Stable duplexes of substantially complementary sequences can be achieved under less stringent hybridization conditions; the degree of mismatch tolerated can be controlled by suitable adjustment of the hybridization conditions. Those skilled in the art of nucleic acid technology can determine duplex stability empirically considering a number of variables including, for example, the length and base pair composition of the oligonucleotides, ionic strength, and incidence of mismatched base pairs, following the guidance provided by the art (see, e.g., Sambrook et al., 1989, Molecular Cloning—A Laboratory Manual, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York; Wetmur, 1991, Critical Review in Biochem. and Mol. Biol. 26(3/4):227-259; and Owczarzy et al., 2008, Biochemistry, 47: 5336-5353, which are incorporated herein by reference).
The term “primer,” as used herein, refers to an oligonucleotide capable of acting as a point of initiation of DNA synthesis under suitable conditions. Such conditions include those in which synthesis of a primer extension product complementary to a nucleic acid strand is induced in the presence of four different nucleoside triphosphates and an agent for extension (for example, a DNA polymerase or reverse transcriptase) in an appropriate buffer and at a suitable temperature.
A primer is preferably a single-stranded DNA. The appropriate length of a primer depends on the intended use of the primer but typically ranges from about 6 to about 225 nucleotides, including intermediate ranges, such as from 15 to 35 nucleotides, from 18 to 75 nucleotides and from 25 to 150 nucleotides. Short primer molecules generally require cooler temperatures to form sufficiently stable hybrid complexes with the template. A primer need not reflect the exact sequence of the template nucleic acid, but must be sufficiently complementary to hybridize with the template. The design of suitable primers for the amplification of a given target sequence is well known in the art and described in the literature cited herein.
Primers can incorporate additional features which allow for the detection or immobilization of the primer but do not alter the basic property of the primer, that of acting as a point of initiation of DNA synthesis. For example, primers may contain an additional nucleic acid sequence at the 5′ end which does not hybridize to the target nucleic acid, but which facilitates cloning or detection of the amplified product, or which enables transcription of RNA (for example, by inclusion of a promoter) or translation of protein (for example, by inclusion of a 5′-UTR, such as an Internal Ribosome Entry Site (IRES) or a 3′-UTR element, such as a poly(A)n sequence, where n is in the range from about 20 to about 200). The region of the primer that is sufficiently complementary to the template to hybridize is referred to herein as the hybridizing region.
As used herein, a primer is “specific” for a target sequence if, when used in an amplification reaction under sufficiently stringent conditions, the primer hybridizes primarily to the target nucleic acid. Typically, a primer is specific for a target sequence if the primer-target duplex stability is greater than the stability of a duplex formed between the primer and any other sequence found in the sample. One of skill in the art will recognize that various factors, such as salt conditions as well as base composition of the primer and the location of the mismatches, will affect the specificity of the primer, and that routine experimental confirmation of the primer specificity will be needed in many cases. Hybridization conditions can be chosen under which the primer can form stable duplexes only with a target sequence. Thus, the use of target-specific primers under suitably stringent amplification conditions enables the selective amplification of those target sequences that contain the target primer binding sites.
As used herein, a “polymerase” refers to an enzyme that catalyzes the polymerization of nucleotides. “DNA polymerase” catalyzes the polymerization of deoxyribonucleotides. Known DNA polymerases include, for example, Pyrococcus furiosus (Pfu) DNA polymerase, E. coli DNA polymerase I, T7 DNA polymerase and Thermus aquaticus (Taq) DNA polymerase, among others. “RNA polymerase” catalyzes the polymerization of ribonucleotides. The foregoing examples of DNA polymerases are also known as DNA-dependent DNA polymerases. RNA-dependent DNA polymerases also fall within the scope of DNA polymerases. Reverse transcriptase, which includes viral polymerases encoded by retroviruses, is an example of an RNA-dependent DNA polymerase. Known examples of RNA polymerase (“RNAP”) include, for example, RNA polymerases of bacteriophages (e.g., T3 RNA polymerase, T7 RNA polymerase, SP6 RNA polymerase, Syn5 RNA polymerase), and E. coli RNA polymerase, among others. The foregoing examples of RNA polymerases are also known as DNA-dependent RNA polymerases. The polymerase activity of any of the above enzymes can be determined by means well known in the art.
Also contemplated for use in the disclosed compositions, systems, kits, and methods are engineered RNA polymerases. For example, an engineered polymerase may be a non-naturally occurring RNA polymerase whose amino acid sequence has been engineered to include one or more of an insertion, a deletion, or a substitution relative to the amino acid sequence of a naturally occurring or wild-type RNA polymerase.
The term “promoter” refers to a cis-acting DNA sequence that directs RNA polymerase and other trans-acting transcription factors to initiate RNA transcription from the DNA template that includes the cis-acting DNA sequence.
As used herein, “an engineered transcription template” or “an engineered expression template” refers to a non-naturally occurring nucleic acid that serves as a substrate for transcribing at least one RNA. As used herein, “expression template” and “transcription template” have the same meaning and are used interchangeably. Engineered transcription templates include nucleic acids composed of DNA or RNA. Suitable sources of DNA for use in a nucleic acid for an expression template include genomic DNA, cDNA and RNA that can be converted into cDNA. Genomic DNA, cDNA and RNA can be from any biological source, such as a tissue sample, a biopsy, a swab, sputum, a blood sample, a fecal sample, a urine sample, a scraping, among others. The genomic DNA, cDNA and RNA can be from host cell or virus origins and from any species, including extant and extinct organisms.
“Transformation” or “transfection” describes a process by which exogenous nucleic acid (e.g., DNA or RNA) is introduced into a recipient cell. Transformation or transfection may occur under natural or artificial conditions according to various methods well known in the art, and may rely on any known method for the insertion of foreign nucleic acid sequences into a prokaryotic or eukaryotic host cell. The method for transformation or transfection is selected based on the type of host cell being transformed and may include, but is not limited to, bacteriophage or viral infection or non-viral delivery. Methods of non-viral delivery of nucleic acids include lipofection, nucleofection, microinjection, electroporation, heat shock, particle bombardment, biolistics, virosomes, liposomes, immunoliposomes, polycation or lipid:nucleic acid conjugates, naked DNA, artificial virions, and agent-enhanced uptake of DNA. Lipofection is described in, e.g., U.S. Pat. Nos. 5,049,386; 4,946,787; and 4,897,355, and lipofection reagents are sold commercially (e.g., Transfectam™ and Lipofectin™). Cationic and neutral lipids that are suitable for efficient receptor-recognition lipofection of polynucleotides include those of Felgner, WO 91/17424; WO 91/16024. Delivery can be to cells (e.g., in vitro or ex vivo administration) or target tissues (e.g., in vivo administration). The term “transformed cells” or “transfected cells” includes stably transformed or transfected cells in which the inserted DNA is capable of replication either as an autonomously replicating plasmid or as part of the host chromosome, as well as transiently transformed or transfected cells which express the inserted DNA or RNA for limited periods of time.
The polynucleotide sequences contemplated herein may be present in expression vectors. For example, the vectors may comprise a polynucleotide encoding an ORF of a protein operably linked to a promoter. “Operably linked” refers to the situation in which a first nucleic acid sequence is placed in a functional relationship with a second nucleic acid sequence. For instance, a promoter is operably linked to a coding sequence if the promoter affects the transcription or expression of the coding sequence. Operably linked DNA sequences may be in close proximity or contiguous and, where necessary to join two protein coding regions, in the same reading frame. Vectors contemplated herein may comprise a heterologous promoter operably linked to a polynucleotide that encodes a protein. A “heterologous promoter” refers to a promoter that is not the native or endogenous promoter for the protein or RNA that is being expressed.
As used herein, “expression” refers to the process by which a polynucleotide is transcribed from a DNA template (such as into mRNA or another RNA transcript) and/or the process by which a transcribed mRNA is subsequently translated into peptides, polypeptides, or proteins. Transcripts and encoded polypeptides may be collectively referred to as “gene product.”
The term “vector” refers to some means by which nucleic acid (e.g., DNA) can be introduced into a host organism or host tissue. There are various types of vectors including plasmid vectors, bacteriophage vectors, cosmid vectors, bacterial vectors, and viral vectors. As used herein, a “vector” may refer to a recombinant nucleic acid that has been engineered to express a heterologous polypeptide (e.g., the fusion proteins disclosed herein). The recombinant nucleic acid typically includes cis-acting elements for expression of the heterologous polypeptide.
In the methods contemplated herein, a host cell may be transiently or non-transiently transfected (i.e., stably transfected) with one or more vectors described herein. A cell transfected with one or more vectors described herein may be used to establish a new cell line comprising one or more vector-derived sequences. In the methods contemplated herein, a cell may be transiently transfected with the components of a system as described herein (such as by transient transfection of one or more vectors), and modified through the activity of a complex, in order to establish a new cell line comprising cells containing the modification but lacking any other exogenous sequence.
Fuel gates of the present disclosure comprise double-stranded nucleic acid molecules, and may comprise DNA:DNA duplexes, DNA:RNA duplexes, or RNA:RNA duplexes. In some embodiments, one or both nucleic acid molecules of the fuel gate duplex comprise one or more modifications, such as, without limitation, one or more non-natural nucleic acids or one or more methyl groups. In some embodiments, the two nucleic acid strands of the fuel gate comprise a complementary region or hybridization region having at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% complementarity that allows the two strands to hybridize with each other. In some embodiments, the hybridization region is about 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 base pairs in length. In some embodiments, the complementary region is about 15 base pairs in length and has 100% complementarity for all 15 nucleotides. In some embodiments, the double-stranded fuel gate comprises, in addition to the complementary region, one or more overhangs (e.g., 5′ overhangs).
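By way of illustration only, the degree of complementarity between the hybridization regions of two DNA strands of a fuel gate can be checked with a short sketch such as the following. The function names `reverse_complement` and `percent_complementarity` and the 15-nucleotide example sequence are hypothetical and are not taken from the Sequence Listing; the sketch handles unmodified DNA bases (A, C, G, T) only and assumes both regions are written 5′ to 3′ and are of equal length.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def percent_complementarity(region_a: str, region_b: str) -> float:
    """Percent of positions at which region_b pairs with region_a.

    Both regions are written 5' to 3'; region_b is compared with the
    reverse complement of region_a, position by position.
    """
    if len(region_a) != len(region_b) or not region_a:
        raise ValueError("regions must be non-empty and of equal length")
    expected = reverse_complement(region_a)
    matches = sum(1 for x, y in zip(expected, region_b.upper()) if x == y)
    return 100.0 * matches / len(region_a)


if __name__ == "__main__":
    # Hypothetical 15-nt hybridization region, for illustration only.
    strand_one = "ATGCGTACCTGAGTC"
    strand_two = reverse_complement(strand_one)
    print(len(strand_one), percent_complementarity(strand_one, strand_two))
```

A perfectly matched pair of 15-nucleotide regions gives 100% complementarity in this sketch, and a single mismatch in a 15-nucleotide region gives roughly 93%, which remains within the "at least 90%" range recited above.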
Considerations and aspects for fuel gate design comprise the following, without limitation: in some embodiments, the toehold from the fuel gate should be about ½ the length of the toehold from Flurophore/Quencher (signal gate). In some embodiments, the toehold of a fuel gate/signal gate hybrid is about ½ the length of the toe hold (e.g., overhangs) of a fuel gate duplex and about half the length of the toe hold of a signal gate duplex. (See e.g.,
As used herein, the terms “protein” or “polypeptide” or “peptide” may be used interchangeably to refer to a polymer of amino acids. Typically, a “polypeptide” or “protein” is defined as a longer polymer of amino acids, of a length typically of greater than 50, 60, 70, 80, 90, or 100 amino acids. A “peptide” is defined as a short polymer of amino acids, of a length typically of 50, 40, 30, 20 or fewer amino acids.
A “protein” as contemplated herein typically comprises a polymer of naturally or non-naturally occurring amino acids (e.g., alanine, arginine, asparagine, aspartic acid, cysteine, glutamine, glutamic acid, glycine, histidine, isoleucine, leucine, lysine, methionine, phenylalanine, proline, serine, threonine, tryptophan, tyrosine, and valine). The proteins contemplated herein may be further modified in vitro or in vivo to include non-amino acid moieties. These modifications may include but are not limited to acylation (e.g., O-acylation (esters), N-acylation (amides), S-acylation (thioesters)), acetylation (e.g., the addition of an acetyl group, either at the N-terminus of the protein or at lysine residues), formylation, lipoylation (e.g., attachment of a lipoate, a C8 functional group), myristoylation (e.g., attachment of myristate, a C14 saturated acid), palmitoylation (e.g., attachment of palmitate, a C16 saturated acid), alkylation (e.g., the addition of an alkyl group, such as a methyl at a lysine or arginine residue), isoprenylation or prenylation (e.g., the addition of an isoprenoid group such as farnesol or geranylgeraniol), amidation at the C-terminus, glycosylation (e.g., the addition of a glycosyl group to either asparagine, hydroxylysine, serine, or threonine, resulting in a glycoprotein, as distinct from glycation, which is regarded as a nonenzymatic attachment of sugars), polysialylation (e.g., the addition of polysialic acid), glypiation (e.g., glycosylphosphatidylinositol (GPI) anchor formation), hydroxylation, iodination (e.g., of thyroid hormones), and phosphorylation (e.g., the addition of a phosphate group, usually to serine, tyrosine, threonine or histidine).
The proteins disclosed herein may include “wild type” proteins and variants, mutants, and derivatives thereof. As used herein the term “wild type” is a term of the art understood by skilled persons and means the typical form of an organism, strain, gene or characteristic as it occurs in nature as distinguished from mutant or variant forms. As used herein, a “variant,” “mutant,” or “derivative” refers to a protein molecule having an amino acid sequence that differs from a reference protein or polypeptide molecule. A variant or mutant may have one or more insertions, deletions, or substitutions of an amino acid residue relative to a reference molecule. A variant or mutant may include a fragment of a reference molecule. For example, a mutant or variant molecule may have one or more insertions, deletions, or substitutions of at least one amino acid residue relative to a reference polypeptide.
Regarding proteins, a “deletion” refers to a change in the amino acid sequence that results in the absence of one or more amino acid residues. A deletion may remove at least 1, 2, 3, 4, 5, 10, 20, 50, 100, 200, or more amino acid residues. A deletion may include an internal deletion and/or a terminal deletion (e.g., an N-terminal truncation, a C-terminal truncation or both of a reference polypeptide). A “variant,” “mutant,” or “derivative” of a reference polypeptide sequence may include a deletion relative to the reference polypeptide sequence.
Regarding proteins, a “fragment” is a portion of an amino acid sequence which is identical in sequence to but shorter in length than a reference sequence. A fragment may comprise up to the entire length of the reference sequence, minus at least one amino acid residue. For example, a fragment may comprise from 5 to 1000 contiguous amino acid residues of a reference polypeptide. In some embodiments, a fragment may comprise at least 5, 10, 15, 20, 25, 30, 40, 50, 60, 70, 80, 90, 100, 150, 250, or 500 contiguous amino acid residues of a reference polypeptide. Fragments may be preferentially selected from certain regions of a molecule. The term “at least a fragment” encompasses the full-length polypeptide. A fragment may include an N-terminal truncation, a C-terminal truncation, or both truncations relative to the full-length protein. A “variant,” “mutant,” or “derivative” of a reference polypeptide sequence may include a fragment of the reference polypeptide sequence.
Regarding proteins, the words “insertion” and “addition” refer to changes in an amino acid sequence resulting in the addition of one or more amino acid residues. An insertion or addition may refer to 1, 2, 3, 4, 5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 150, 200, or more amino acid residues. A “variant,” “mutant,” or “derivative” of a reference polypeptide sequence may include an insertion or addition relative to the reference polypeptide sequence. A variant of a protein may have N-terminal insertions, C-terminal insertions, internal insertions, or any combination of N-terminal insertions, C-terminal insertions, and internal insertions.
Regarding proteins, the phrases “percent identity” and “% identity” refer to the percentage of residue matches between at least two amino acid sequences aligned using a standardized algorithm. Methods of amino acid sequence alignment are well-known. Some alignment methods take into account conservative amino acid substitutions. Such conservative substitutions, explained in more detail below, generally preserve the charge and hydrophobicity at the site of substitution, thus preserving the structure (and therefore function) of the polypeptide. Percent identity for amino acid sequences may be determined as understood in the art. (See, e.g., U.S. Pat. No. 7,396,664, which is incorporated herein by reference in its entirety). A suite of commonly used and freely available sequence comparison algorithms is provided by the National Center for Biotechnology Information (NCBI) Basic Local Alignment Search Tool (BLAST), which is available from several sources, including the NCBI, Bethesda, Md., at its website. The BLAST software suite includes various sequence analysis programs including “blastp,” which is used to align a known amino acid sequence with other amino acid sequences from a variety of databases.
Regarding proteins, percent identity may be measured over the length of an entire defined polypeptide sequence, for example, as defined by a particular SEQ ID number, or may be measured over a shorter length, for example, over the length of a fragment taken from a larger, defined polypeptide sequence, for instance, a fragment of at least 15, at least 20, at least 30, at least 40, at least 50, at least 70 or at least 150 contiguous residues. Such lengths are exemplary only, and it is understood that any fragment length supported by the sequences shown herein, in the tables, figures or Sequence Listing, may be used to describe a length over which percentage identity may be measured.
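By way of non-limiting illustration, percent identity over an existing pairwise alignment can be computed as sketched below. The function name `percent_identity`, the gap character, and the choice of denominator (all alignment columns except gap-to-gap columns) are assumptions made for this sketch; alignment programs such as blastp may count the denominator differently.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of compared alignment columns that hold identical residues.

    Both inputs are rows of one pairwise alignment, so they must have the
    same length. Columns where both rows hold a gap are skipped; a column
    with a gap in only one row counts as a mismatch (an assumption of this
    sketch, not a rule taken from the specification).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    columns = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap and b == gap:
            continue  # uninformative column
        columns += 1
        if a == b:
            matches += 1  # identical residues (a gap never equals a residue)
    if columns == 0:
        raise ValueError("no aligned columns to compare")
    return 100.0 * matches / columns
```

To measure identity over a shorter window, such as a fragment of at least 15 contiguous residues, the same function may be applied to the corresponding slice of the two aligned rows.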
Regarding proteins, the amino acid sequences of variants, mutants, or derivatives as contemplated herein may include conservative amino acid substitutions relative to a reference amino acid sequence. For example, a variant, mutant, or derivative protein may include conservative amino acid substitutions relative to a reference molecule. “Conservative amino acid substitutions” are substitutions of one amino acid for a different amino acid where the substitution is predicted to interfere least with the properties of the reference polypeptide. In other words, conservative amino acid substitutions substantially conserve the structure and the function of the reference polypeptide. The following table provides a list of exemplary conservative amino acid substitutions which are contemplated herein:
Conservative amino acid substitutions generally maintain (a) the structure of the polypeptide backbone in the area of the substitution, for example, as a beta sheet or alpha helical conformation, (b) the charge or hydrophobicity of the molecule at the site of the substitution, and/or (c) the bulk of the side chain. Non-conservative amino acid substitutions typically disrupt (a) the structure of the polypeptide backbone in the area of the substitution, for example, as a beta sheet or alpha helical conformation, (b) the charge or hydrophobicity of the molecule at the site of the substitution, and/or (c) the bulk of the side chain.
The proteins, mutants, variants, or derivatives disclosed herein may have one or more functional or biological activities exhibited by a reference polypeptide (e.g., one or more functional or biological activities exhibited by wild-type protein).
In some embodiments of the disclosed compositions, systems, kits, and methods, the components may be substantially isolated or purified. The term “substantially isolated or purified” refers to components that are removed from their natural environment, and are at least 60% free, preferably at least 75% free, and more preferably at least 90% free, even more preferably at least 95% free from other components with which they are naturally associated.
Detection of Analytes and Target Molecules Using Regulated In Vitro Transcription with DNA Strand Displacement Circuits
Disclosed herein are compositions, systems, kits, and methods that relate to the detection of analytes and target molecules using regulated in vitro transcription. The disclosed compositions, systems, kits, and methods include and utilize components as described herein including components for forming DNA strand displacement circuits.
Disclosed herein is a generalizable strategy to enhance and expand the function of cell-free biosensors by introducing an information processing layer that can manipulate responses from the sensing layer before final signal generation (
To create a more generalized information processing layer in a cell-free context, the inventors leveraged the development of toehold-mediated DNA strand displacement (TMSD)—a computationally powerful and versatile DNA nanotechnology platform that can be used for information processing in vitro [11]. In TMSD, single-stranded DNA (ssDNA) inputs interact with double-stranded DNA (dsDNA) ‘gates’ that are designed to exchange strands and produce ssDNA outputs. By configuring the DNA gates into different network architectures, a range of operations can be performed such as signal restoration [12], signal amplification [13] and logic computation [14, 15], much like a general chemical computational architecture [16]. The well-characterized thermodynamics of DNA base pairing enable large programmable networks to be built from relatively simple building blocks. In addition, the kinetics of these reactions can be precisely tuned by changing the strength of the ‘toeholds’— single-stranded regions within the DNA gates that initiate the strand displacement process [17]. TMSD has been used to create a range of devices including in vitro oscillators [18], catalytic amplifiers [19], autonomous molecular motors [20, 21] and reprogrammable DNA nanostructures [22, 23]. Furthermore, TMSD circuits capable of sophisticated molecular computations such as complex arithmetic [24] and even molecular neural networks that recognize chemical patterns [25] have been designed. Thus, there is a great potential in utilizing TMSD-based information processing to enhance and expand cell-free biosensor function.
The features of TMSD circuits have motivated the development of diagnostics that can detect nucleic acid targets such as microRNAs [26, 27] and human pathogens [28]. These circuits work by programming DNA gates to directly match the sequence complementarity of the desired nucleic acid input, which triggers strand exchange upon binding. However, there are currently no similar general design rules for triggering TMSD circuits with small molecules of varying size, shape and chemical properties. To leverage the power of TMSD circuits for small molecule chemical detection, an interface is needed that can convert the binding event of a chemical target to changes in nucleic acid sequence or structure that can trigger TMSD cascades. Allosteric transcription factors (aTFs) naturally create such an interface by activating the transcription of a programmable RNA sequence upon detection of a compound. However, there are significant challenges in creating an interface that allows aTFs and TMSD circuits to function together in situ, such as interference between RNA polymerase (RNAP) and nucleic acid gates [29], the lack of detailed experimental characterization of RNA-mediated TMSD circuits [30] and the need to develop design principles that insulate circuit function from the complexities of RNA folding.
As described herein, the inventors have overcome these challenges by interfacing the sensing layers of a previously developed cell-free biosensing platform called ROSALIND [7] with TMSD circuits to expand the platform's capabilities. The novel platform comprises a highly processive phage RNAP, an aTF and a DNA template that together regulate the synthesis of an invading RNA strand that can activate fluorescence from a DNA signal gate—a dsDNA consisting of a quencher strand and a fluorophore strand with a toehold region. In this way, this new platform combines TMSD with the biochemistry of aTFs and in vitro transcription (IVT) to enable TMSD circuits to serve as downstream signal processing units to a chemical ligand sensing reaction.
The inventors first show that the design of the DNA gate can be optimized to enable T7 RNAP-driven in vitro transcription (IVT) and TMSD within the same reaction.
Next, the inventors systematically develop design principles for optimizing the secondary structure of the synthesized RNA to tune the reaction kinetics of TMSD, notably improving the biosensing response speed. The inventors also apply this principle to interface TMSD with several different aTFs to create biosensors for their cognate ligands.
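By way of non-limiting illustration, a candidate transcript can be crudely screened for intramolecular secondary structure by finding the longest run of consecutively paired nucleotides in any hairpin stem it could form. The sketch below is a simple heuristic and is not the design procedure used by the inventors. It considers only Watson-Crick pairs, ignores bulges, wobble pairs, and thermodynamics, and the names `longest_stem` and `has_no_long_stem`, the minimum loop size, and the default cutoff are assumptions introduced here. A dedicated folding package would be needed for a quantitative prediction.

```python
# Watson-Crick RNA base pairs only (G-U wobble is deliberately ignored here).
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def longest_stem(rna: str, min_loop: int = 3) -> int:
    """Length of the longest perfect hairpin stem the sequence can form.

    A stem is a run of pairs (i, j), (i+1, j-1), ... in which every pair
    encloses at least `min_loop` unpaired nucleotides. DNA-style input is
    accepted by reading T as U.
    """
    seq = rna.upper().replace("T", "U")
    n = len(seq)
    best = 0
    for i in range(n):
        for j in range(i + min_loop + 1, n):
            k = 0
            # Extend the stem inward while the loop constraint still holds.
            while i + k + min_loop < j - k and (seq[i + k], seq[j - k]) in PAIRS:
                k += 1
            best = max(best, k)
    return best


def has_no_long_stem(rna: str, max_paired: int = 3) -> bool:
    """True if no hairpin stem exceeds `max_paired` consecutive pairs."""
    return longest_stem(rna) <= max_paired
```

For example, under these assumptions the sequence GGGGAAAACCCC contains a four-pair stem and is rejected at the default cutoff, whereas GGGAAAACCC, with a three-pair stem, is accepted.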
The inventors then showcase programmability of the platform by building twelve different circuits that implement seven different logic functions (NOT, OR, AND, NOR, IMPLY, NIMPLY, NAND). Importantly, this required the development of additional RNA-level design principles such as fine-tuning of transcription efficiency and optimization of RNA secondary structure to efficiently interface RNA inputs with DNA-based TMSD circuits.
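By way of non-limiting illustration, the Boolean behavior of the seven named logic functions can be tabulated as sketched below, where each input represents the presence (True) or absence (False) of a ligand and the output represents whether a signal is expected. The sketch captures only the truth tables and says nothing about how the corresponding circuits are built; the names `LOGIC`, `logic_not`, and `truth_table` are assumptions introduced here.

```python
from itertools import product

# Two-input logic functions named in the text; NOT takes a single input.
LOGIC = {
    "OR": lambda a, b: a or b,
    "AND": lambda a, b: a and b,
    "NOR": lambda a, b: not (a or b),
    "NAND": lambda a, b: not (a and b),
    "IMPLY": lambda a, b: (not a) or b,   # false only when a is present and b is absent
    "NIMPLY": lambda a, b: a and not b,   # true only when a is present and b is absent
}


def logic_not(a: bool) -> bool:
    """Single-input NOT: signal expected only when the input is absent."""
    return not a


def truth_table(name: str) -> dict:
    """Map every (input_a, input_b) combination to the expected output."""
    gate = LOGIC[name]
    return {(a, b): bool(gate(a, b)) for a, b in product((False, True), repeat=2)}
```

A table of this kind can serve as the expected-output reference against which the measured response of each circuit is compared.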
Finally, the inventors address a current limitation of cell-free biosensors by using a model-driven approach to design and build a multi-layer TMSD circuit that acts like an analog-to-digital converter to create a series of binary outputs that encode the concentration range of the target molecule being sensed. Taken together, this work demonstrates that the combination of TMSD and cell-free biosensing reactions can implement molecular computations to enhance the speed and utility of biosensors.
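By way of non-limiting illustration, the idea of encoding a concentration range as a series of binary outputs can be sketched as a thermometer code, in which each output turns on once the concentration reaches its threshold. The threshold values, the units, and the names `thermometer_code` and `concentration_bin` are assumptions introduced here and do not describe the disclosed circuit.

```python
from bisect import bisect_right


def thermometer_code(concentration: float, thresholds: list) -> list:
    """One bit per threshold, in ascending threshold order.

    A bit is 1 when the concentration is at or above that threshold.
    """
    return [1 if concentration >= t else 0 for t in sorted(thresholds)]


def concentration_bin(concentration: float, thresholds: list) -> int:
    """Index of the concentration range, equal to the number of bits set."""
    return bisect_right(sorted(thresholds), concentration)
```

With hypothetical thresholds of 2, 5, and 10 (arbitrary units), a concentration of 6 yields the outputs 1, 1, 0, placing the sample in the range between the second and third thresholds.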
Applications of the disclosed technology include, but are not limited to: (i) Chemical testing; (ii) Chemical screening; (iii) Water quality testing; (iv) Environmental sensing; (v) Health marker sensing in human fluids (blood, urine, saliva, breast milk, etc.); (vi) Micronutrient diagnostics in water, soils, plants and animals; (vii) Drug testing; (viii) Drug discovery; (ix) Heavy metal testing; (x) Contaminant testing; (xi) Diagnostics; (xii) High-throughput screening; (xiii) Research (transcription factor screening, protein engineering); (xiv) Food testing; (xv) Beverage testing; (xvi) Agriculture; (xvii) Aquaculture; and (xviii) Animal health.
The advantages of the disclosed technology include, but are not limited to: (i) speed, where the methods can be performed within minutes; (ii) low cost, where the cost for performing the methods is less than a few dollars to pennies per sample; (iii) robustness, where the methods can be performed using a variety of samples; (iv) reproducibility, where the technology utilizes biochemically defined reactions; (v) ease of use, where the methods may be performed using handheld and portable components; (vi) methods are performed in vitro and do not involve replicating components (e.g., cells); and (vii) extensibility and adaptability, where the methods may be performed to detect a variety of target molecules and analytes.
The disclosed compositions, systems, kits, and methods may be utilized to detect an analyte or a target molecule in a sample. In some embodiments, the disclosed compositions, systems, kits, and methods comprise or utilize one or more components selected from: (a) an RNA polymerase; (b) an allosteric transcription factor (aTF), wherein the analyte or target molecule is a ligand to which the aTF binds; (c) an engineered transcription template; (d) a dsDNA signal gate molecule (e.g., a dsDNA molecule comprising a quencher strand hybridized to a fluorophore strand with a toehold region); (e) a dsDNA fuel gate molecule (e.g., a dsDNA molecule comprising a waste strand and a RecycleD strand); and/or a combination thereof. The transcription template typically comprises a promoter sequence for the RNA polymerase and an operator sequence for the aTF. The promoter sequence and operator sequence are operably linked to a sequence encoding an RNA, wherein the aTF modulates transcription of the encoded RNA when the aTF binds the analyte or target molecule as a ligand. The RNA that is transcribed from the transcription template may displace a strand of the dsDNA fuel gate molecule by hybridizing to the waste strand, and the released RecycleD strand displaces a strand of the dsDNA signal gate whereby a signal is generated (e.g., a fluorescent signal), thereby indicating that the analyte or target molecule is present.
In some embodiments of the disclosed compositions, systems, or kits, the transcribed RNA displaces a nucleotide strand of a reporter molecule which comprises a fluorescently labeled double-stranded DNA signal gate molecule as disclosed herein. In other embodiments of the disclosed compositions, systems, or kits, the compositions, systems, or kits further comprise a second engineered transcription template, in which the second engineered transcription template comprises a promoter sequence for the RNA polymerase operably linked to a sequence encoding a second RNA. In these embodiments, the second transcribed RNA displaces a nucleotide strand of a reporter molecule which comprises a fluorescently labeled double-stranded DNA signal gate molecule as disclosed herein.
Suitable RNA polymerases for inclusion or use in the disclosed compositions, systems, kits, and methods may include, but are not limited to, RNA polymerases derived from bacteriophages. Suitable RNA polymerases may include but are not limited to T7 RNA polymerase, T3 RNA polymerase, SP6 RNA polymerase, and Syn5 RNA polymerase. Suitable RNA polymerases may include engineered RNA polymerases as contemplated herein. In some embodiments, the RNA polymerase is T7 RNA polymerase.
In the disclosed compositions, systems, kits, and methods, the allosteric transcription factor (aTF) modulates transcription from the engineered transcription template. In some embodiments, the aTF modulates transcription from the engineered transcription template when the aTF binds the operator sequence. In some embodiments, the aTF represses transcription from the engineered transcription template when the aTF binds the operator sequence. In other embodiments, the aTF activates, derepresses, and/or augments transcription from the engineered transcription template when the aTF binds the operator sequence.
In the disclosed compositions, systems, kits, and methods, the allosteric transcription factor (aTF) binds the analyte or target molecule as a ligand. In some embodiments, in the absence of the analyte or target molecule as a ligand the aTF binds to the operator sequence, and/or in the presence of the analyte or target molecule as a ligand the aTF does not bind to the operator sequence or binds to the operator sequence at a lower affinity than in the absence of the analyte or target molecule as a ligand. In other embodiments, in the presence of the analyte or target molecule as a ligand the aTF binds to the operator sequence, and/or in the absence of the analyte or target molecule as a ligand the aTF does not bind to the operator sequence or binds to the operator sequence at a lower affinity than in the presence of the analyte or target molecule as a ligand.
Allosteric transcription factors (aTFs) are known in the art. Suitable aTFs for the disclosed compositions, systems, kits, and methods may include, but are not limited to, prokaryotic aTFs. Suitable aTFs may include but are not limited to TetR, MphR, QacR, OtrR, CtcS, SAR2349, MobR, and SmtB. The TetR family of aTFs includes TetR, MphR, and QacR. The MarR family of aTFs includes OtrR, CtcS, SAR2349, and MobR. Suitable aTFs may also include the ArsR/SmtB family of aTFs.
Suitable aTFs may include engineered aTFs. For example an engineered aTF is a non-naturally occurring aTF having an amino acid sequence which has been engineered to include one or more of an insertion, a deletion, or a substitution relative to the amino acid sequence of a naturally occurring or wild-type aTF.
In some embodiments of the disclosed compositions, systems, kits, and methods, the analyte or target molecule that is a ligand for the aTF is a member of the tetracycline family of antibiotics. Suitable analytes/target molecules as ligands for the aTF may include, but are not limited to, tetracycline, anhydrotetracycline, oxytetracycline, chlortetracycline, and doxycycline.
In some embodiments of the disclosed systems and methods, the target molecule that is the ligand for the aTF is a member of the macrolide-family of antibiotics. Suitable target molecules/ligands for the aTF may include, but are not limited to erythromycin, azithromycin, and clarithromycin.
In some embodiments of the disclosed compositions, systems, kits, and methods, the analyte or target molecule that is a ligand for the aTF is a quaternary amine or salt thereof. Suitable quaternary amines may include but are not limited to alkyldimethylbenzylammonium salts.
In some embodiments of the disclosed compositions, systems, kits, and methods, the analyte that is a ligand for the aTF is a metal or a cation thereof. Suitable metals or cations thereof may include but are not limited to heavy metals and cations thereof. Suitable metals or cations thereof may include but are not limited to Zn, Pb, Cu, Cd, Ni, As, Mn (or Zn2+, Pb2+, Cu+, Cu2+, Cd2+, Ni2+, As3+, As5+, and Mn2+).
In some embodiments of the disclosed compositions, systems, kits, and methods, the analyte that is a ligand for the aTF is selected from salicylate, 3-hydroxybenzoic acid, naringenin, and uric acid.
In the disclosed compositions, systems, kits, and methods, the RNA that is transcribed from the engineered transcription template additionally or alternatively displaces the RecycleD strand from the dsDNA fuel gate molecule which then allows the RecycleD strand to bind to a reporter molecule, and the binding to the reporter molecule results in a detectable signal being generated from the reporter molecule. Suitable reporter molecules may include dsDNA molecules which may be referred to as dsDNA signal gate molecules. In some embodiments of the disclosed compositions, systems, kits, and methods, the reporter molecule is a fluorescently labeled dsDNA molecule (e.g., which functions as an output gate) comprising a fluorophore and a quencher that quenches the fluorophore in the fluorescently labeled double-stranded nucleic acid, and a toehold region. In these embodiments, the RNA that is transcribed from the engineered transcription template displaces the RecycleD strand from the dsDNA fuel gate molecule which then displaces one of the strands of the fluorescently labeled double-stranded nucleic acid reporter molecule which results in dequenching of the fluorophore to generate the detectable signal.
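By way of non-limiting illustration, the two-step cascade described above (transcribed RNA releases the RecycleD strand from the fuel gate, and the RecycleD strand then opens the signal gate) can be sketched as a pair of irreversible bimolecular reactions integrated with a simple Euler scheme. The rate constants, concentrations, units, and the function name `simulate_cascade` are assumptions introduced here and are not measured values. Reversibility, leak reactions, and any recycling of the RecycleD strand are deliberately left out.

```python
def simulate_cascade(rna0, fuel0, signal0, k1, k2, k_tx=0.0, dt=1.0, steps=3600):
    """Euler integration of a minimal two-step strand displacement cascade.

    Reaction 1: RNA + fuel gate        -> RNA:waste + RecycleD      (rate k1)
    Reaction 2: RecycleD + signal gate -> fluorescent duplex + quencher strand (rate k2)

    k_tx is an optional constant RNA synthesis rate. Concentrations might be
    in nM and time in seconds; k1 * conc * dt should stay well below 1 for
    the explicit Euler step to remain stable. Returns the fluorescent
    product concentration after each step.
    """
    rna, fuel, recycled, signal, fluor = rna0, fuel0, 0.0, signal0, 0.0
    trace = []
    for _ in range(steps):
        v1 = k1 * rna * fuel         # flux through reaction 1
        v2 = k2 * recycled * signal  # flux through reaction 2
        rna += (k_tx - v1) * dt
        fuel -= v1 * dt
        recycled += (v1 - v2) * dt
        signal -= v2 * dt
        fluor += v2 * dt
        trace.append(fluor)
    return trace
```

In this simplified model the fluorescent product can never exceed the initial amount of fuel gate, since each RecycleD strand is released once and consumed once.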
In some embodiments of the disclosed compositions, systems, kits, and methods, RNA produced by off-target transcription by T7 RNA polymerase (T7 RNAP), i.e., promoter-independent T7 RNAP transcription, is used to recycle circuit inputs, creating a catalytic amplification effect by which one circuit input can activate multiple outputs to increase sensitivity. The molecule that enables this recycling can be referred to as the dsDNA fuel gate molecule. In some embodiments, the dsDNA fuel gate molecule (e.g., a dsDNA molecule comprising a waste strand and a RecycleD strand) generates a free single-stranded DNA molecule (e.g., a RecycleD strand) which displaces a strand of the dsDNA signal gate molecule to produce a detectable signal and a new hybrid signal/fuel gate, wherein the RNA polymerase transcribes the hybrid signal/fuel gate to release the RecycleD strand, which can then displace a strand of an additional dsDNA signal gate molecule, thus creating a positive feedback loop of signal amplification.
In some embodiments, suitable reporter molecules may include but are not limited to fluorescently labeled double-stranded DNA molecules (e.g., which function as an output gate) comprising a top strand having a fluorophore conjugated at its 3′-end and a bottom strand having a quencher conjugated at its 5′ end that quenches the fluorophore in the fluorescently labeled double-stranded DNA molecule and a toehold region. In these embodiments, the RecycleD strand comprises a sequence that is complementary to the full length of the top strand and the RecycleD strand displaces the bottom strand, which results in dequenching of the fluorophore to generate the detectable signal. Typically these reporter molecules are configured such that the top strand is longer than the bottom strand (e.g., by about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15 nucleotides or more). In this configuration, displacement of the bottom strand by the RecycleD strand is thermodynamically favored because the RecycleD strand comprises a sequence that permits additional base-pairing between the RecycleD strand and the top strand that is not present between the top strand and the bottom strand. In some embodiments, the top strand could comprise the quencher and the bottom strand the fluorophore.
In some embodiments, the fuel molecule is a dsDNA molecule comprising a waste strand and a RecycleD strand. In some embodiments, the transcribed RNA is complementary to the waste strand and displaces the RecycleD strand. In such embodiments, the RecycleD strand is complementary to the top strand of the dsDNA signal gate molecule and displaces the bottom strand comprising the quencher. Therefore, binding of the transcribed RNA to the waste strand must be thermodynamically favored compared to binding of the RecycleD strand to the waste strand. Likewise, the binding of the RecycleD strand to the top strand of the dsDNA signal gate must be thermodynamically favored compared to the binding of the top strand to the bottom strand. Furthermore, in some embodiments, the full length of the RecycleD strand hybridizes to the top strand, but leaves one or more nucleotides unhybridized, i.e., a toehold, at the 3′ end of the top strand. In some embodiments, the toehold is 4-8 nucleotides in length. In some embodiments, the waste strand, the bottom strand, or both comprise methylated nucleotides to prevent leakage of transcription.
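By way of non-limiting illustration, the sequence-level relationships described above can be checked with a short script. The sketch below verifies that a bottom strand pairs with one end of a top strand, that the remaining single-stranded overhang of the top strand (taken here as the toehold) falls within a chosen length range, and that an invading DNA strand is complementary to the full length of the top strand. All strands are written 5′ to 3′. The function names, the returned keys, and the example sequences are hypothetical, and counting base pairs is only a crude proxy for thermodynamic favorability.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(dna: str) -> str:
    """Reverse complement of a DNA strand written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(dna.upper()))


def check_signal_gate(top: str, bottom: str, invader: str, toehold_range=(4, 8)) -> dict:
    """Check a top/bottom duplex and an invading strand against simple rules."""
    top, bottom, invader = top.upper(), bottom.upper(), invader.upper()
    footprint = reverse_complement(bottom)   # region of the top strand covered by the bottom strand
    start = top.find(footprint)
    bottom_pairs = bool(bottom) and start >= 0
    # The duplex should sit at one end of the top strand, leaving a single overhang.
    one_overhang = bottom_pairs and (start == 0 or start + len(footprint) == len(top))
    toehold = len(top) - len(bottom)         # unpaired top-strand nucleotides
    invader_full = invader == reverse_complement(top)
    low, high = toehold_range
    return {
        "toehold": toehold,
        "bottom_pairs": bottom_pairs,
        "one_overhang": one_overhang,
        "invader_full": invader_full,
        "ok": one_overhang and invader_full and low <= toehold <= high,
    }
```

For a hypothetical 20-nucleotide top strand paired with a 14-nucleotide bottom strand, the check reports a 6-nucleotide toehold; an invader complementary to the full top strand gains those 6 additional base pairs relative to the bottom strand, which is the sense in which displacement is favored.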
Optionally, the disclosed systems and methods further may comprise a non-labeled double-stranded DNA molecule (e.g., which functions as a threshold gate) comprising a top strand that comprises a nucleotide sequence that is identical to the nucleotide sequence of the top strand of the labeled double-stranded DNA molecule. Typically, the top strand of the non-labeled double-stranded DNA molecule is longer than the bottom strand of the non-labeled double-stranded DNA molecule (e.g., by about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15 nucleotides or more). Optionally, the bottom strand of the non-labeled double-stranded DNA molecule is shorter in length than the length of the bottom strand of the fluorescently labeled double-stranded DNA molecule (e.g., by about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15 nucleotides or more), such that displacement of the bottom strand of the non-labeled double-stranded DNA molecule is favored thermodynamically versus displacement of the bottom strand of the fluorescently labeled double-stranded DNA molecule.
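By way of non-limiting illustration, the effect of a threshold gate can be sketched in the idealized limit where the non-labeled gate consumes the invading strand completely before any of it reaches the labeled signal gate. Real behavior depends on the relative reaction rates of the two gates; this stoichiometric limit, the function name `thresholded_signal`, and the example amounts are assumptions introduced here.

```python
def thresholded_signal(invader: float, threshold_gate: float, signal_gate: float) -> float:
    """Signal produced under ideal stoichiometric thresholding.

    The threshold gate absorbs invading strand first; only the surplus can
    open signal gates, and the signal is capped by the amount of signal gate.
    All amounts share the same (arbitrary) concentration unit.
    """
    for name, value in (("invader", invader),
                        ("threshold_gate", threshold_gate),
                        ("signal_gate", signal_gate)):
        if value < 0:
            raise ValueError(f"{name} must be non-negative")
    surplus = max(0.0, invader - threshold_gate)
    return min(signal_gate, surplus)
```

Under these assumptions, 30 units of invading strand against 50 units of threshold gate produce no signal, whereas 80 units produce 30 units of signal.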
In some embodiments of the disclosed compositions, systems, kits, and methods, multiple aTFs and/or multiple engineered transcription templates may be included and/or utilized. For example, multiple aTFs and/or multiple engineered transcription templates may be included and/or utilized in order to create logic gates.
The compositions, systems, kits, and methods disclosed herein further may include or utilize additional components, such as additional components for performing RNA transcription. Additional components may include but are not limited to one or more of ribonucleoside triphosphates, an aqueous buffer system that includes a reducing agent such as dithiothreitol (DTT), divalent cations such as Mg2+, spermidine, an inorganic pyrophosphatase, an RNase inhibitor, crowding agents, and monovalent salts (e.g., NaCl and K-glutamate).
The components of the disclosed compositions, systems, kits, and methods may be mixed. For example, the components of the disclosed compositions, systems, kits, and methods may be mixed as an aqueous solution and/or may be dried or lyophilized to prepare a dried mixture which may be reconstituted (e.g., to perform the methods disclosed herein).
The disclosed compositions, systems, and kits, and the components thereof may be utilized in methods for detecting an analyte or target molecule in a sample (e.g., by performing an RNA transcription reaction). The methods may include contacting one or more components of the disclosed compositions, systems, and kits with the sample and detecting a detectable signal, thereby detecting the analyte or target molecule in the sample.
The compositions, systems, methods, and kits disclosed herein are exemplified by the embodiments below. These exemplary embodiments are not intended to be limiting.
1. A first embodiment comprises a composition, system, or kit for detecting an analyte comprising one or more of the following components: (a) an RNA polymerase; (b) an allosteric transcription factor (aTF), wherein the analyte is a ligand to which the aTF binds; (c) an engineered transcription template; and (d) a dsDNA signal gate molecule, wherein the engineered transcription template comprises a promoter sequence for the RNA polymerase and an operator sequence for the aTF operably linked to a sequence encoding an RNA, wherein the aTF modulates transcription of the encoded RNA when the aTF binds the analyte as a ligand and wherein the transcribed RNA displaces a strand of the dsDNA signal gate molecule and a detectable signal is generated.
2. The composition, system, or kit of embodiment 1, wherein the dsDNA signal gate molecule is a fluorescently labeled double-stranded nucleic acid comprising a fluorophore and a quencher that quenches the fluorophore in the fluorescently labeled double-stranded nucleic acid and the transcribed RNA displaces one of the strands of the fluorescently labeled double-stranded nucleic acid which results in dequenching of the fluorophore to generate the detectable signal.
3. The composition, system, or kit of any of the previous embodiments, wherein the reporter molecule is a fluorescently labeled double-stranded DNA molecule comprising a top strand having a fluorophore conjugated at its 3′-end and a bottom strand having a quencher conjugated at its 5′ end that quenches the fluorophore in the fluorescently labeled double-stranded DNA molecule and the transcribed RNA displaces the bottom strand of the fluorescently labeled double-stranded DNA molecule which results in dequenching of the fluorophore to generate the detectable signal.
4. The composition, system, or kit of any of the previous embodiments, wherein the top strand is longer than the bottom strand and wherein the transcribed RNA comprises a sequence that is complementary to the full length of the top strand.
5. The composition, system, or kit of any of the previous embodiments, wherein the top strand comprises one or more non-natural modifications that prevent the top strand from being utilized as a template for transcription (e.g., 2′-O-methylation).
6. The composition, system, or kit of any of the previous embodiments, wherein the system further comprises a non-labeled double-stranded DNA molecule comprising a top strand that comprises a nucleotide sequence that is identical to the nucleotide sequence of the top strand of the labeled double-stranded DNA molecule.
7. The composition, system, or kit of any of the previous embodiments, wherein the top strand of the non-labeled double-stranded DNA molecule is longer than the bottom strand of the non-labeled double-stranded DNA molecule.
8. The composition, system, or kit of any of the previous embodiments, wherein the bottom strand of the non-labeled double-stranded DNA molecule is shorter in length than the length of the bottom strand of the fluorescently labeled double-stranded DNA molecule.
9. The composition, system, or kit of any of the previous embodiments, wherein the transcribed RNA does not form and/or is designed not to form an intramolecular secondary structure, and optionally an intramolecular structure comprising more than 3 consecutively paired nucleotides.
10. The composition, system, or kit of any of the previous embodiments, wherein the RNA polymerase is selected from T7 RNA polymerase, T3 RNA polymerase, SP6 RNA polymerase, and Syn5 RNA polymerase or the RNA polymerase is an engineered polymerase.
11. The composition, system, or kit of any of the previous embodiments, wherein the RNA polymerase is an engineered RNA polymerase.
12. The composition, system, or kit of any of the previous embodiments, wherein the aTF represses, blocks, or inhibits transcription from the engineered transcription template when the aTF binds the operator.
13. The composition, system, or kit of any of the previous embodiments, wherein the aTF activates transcription from the engineered transcription template when the aTF binds the operator.
14. The composition, system, or kit of any of the previous embodiments, wherein in the absence of the analyte as a ligand the aTF binds to the operator sequence.
15. The composition, system, or kit of any of the previous embodiments, wherein in the presence of the analyte as a ligand the aTF does not bind to the operator or binds to the operator at a lower affinity than in the absence of the analyte as a ligand.
16. The composition, system, or kit of any of the previous embodiments, wherein in the presence of the analyte as a ligand the aTF binds to the operator sequence.
17. The composition, system, or kit of any of the previous embodiments, wherein in the absence of the analyte as a ligand the aTF does not bind to the operator or binds to the operator at a lower affinity than in the presence of the analyte as a ligand.
18. The composition, system, or kit of any of the previous embodiments, wherein the aTF belongs to the TetR, MarR, or ArsR/SmtB class or family of transcription factors or the aTF is an engineered aTF.
19. The composition, system, or kit of any of the previous embodiments, wherein the aTF is selected from the group consisting of TetR, MphR, QacR, OtrR, CtcS, SAR2349, MobR, SmtB, CadC, CsoR, AdcR, TtgR, and HucR.
20. The composition, system, or kit of any of the previous embodiments, wherein the analyte that is a ligand for the aTF is a member of the tetracycline-family of antibiotics.
21. The composition, system, or kit of any of the previous embodiments, wherein the analyte that is a ligand for the aTF is a member of the macrolide-family of antibiotics.
22. The composition, system, or kit of any of the previous embodiments, wherein the analyte is a quaternary amine or salts thereof.
23. The composition, system, or kit of any of the previous embodiments, wherein the analyte is a metal or a cation thereof.
24. The composition, system, or kit of any of the previous embodiments, wherein the metal or the cation thereof is Zn, Pb, Cu, Cd, Ni, As, or Mn.
25. The composition, system, or kit of any of the previous embodiments, wherein the analyte is selected from salicylate, 3-hydroxy benzoic acid, naringenin, and uric acid.
26. The composition, system, or kit of any of the previous embodiments, further comprising (d) one or more components for preparing a reaction mixture for RNA transcription.
27. The composition, system, or kit of any of the previous embodiments, wherein the components are mixed and form an aqueous solution for performing RNA transcription.
28. The composition, system, or kit of any of the previous embodiments, wherein the components are mixed and form a dried mixture which may be reconstituted to form a reaction mixture for performing RNA transcription.
29. A method for detecting an analyte in a sample, the method comprising contacting the sample with one or more components of the composition, system, or kit of any of the foregoing embodiments and detecting a signal.
30. The composition, system, kit or method of any of the foregoing embodiments comprising and/or utilizing a plurality of RNA output sequences that are adapted to displace multiple DNA strands in a dsDNA signal gate molecule, optionally wherein the composition, system, kit or method exhibits improved reaction kinetics, for example as illustrated in
31. The composition, system, kit or method of any of the foregoing embodiments which enables molecular computation between the sensing events and the reporting events optionally as illustrated in
32. The composition, system, kit or method of any of the foregoing embodiments comprising and/or utilizing a non-labelled dsDNA gate with a longer toehold than the dsDNA signal gate molecule, which functions as a kinetic “comparator” circuit to delay the temporal response of the reaction, optionally as illustrated in
33. The composition, system, kit or method of embodiment 32, comprising and/or utilizing a plurality of comparator circuits in series which function to act as a genetic “analog-to-digital converter” (ADC) to enable target input quantification, optionally as illustrated in
34. The composition, system, kit or method of embodiment 32 or 33, which is adapted to detect and/or quantify a range of compounds related to environmental contamination and human health, optionally as illustrated in
35. The composition, system, kit or method of any of embodiments 32-34, in which the genetic ADC circuit receives an analog input concentration of a target compound and converts it to a digital binary output to indicate the concentration range of that compound.
36. The composition, system, kit or method of any of embodiments 32-35, wherein the target compound is zinc, optionally as illustrated in
37. A method for predicting the functional characteristics of any of a composition, system, kit or method disclosed herein comprising utilizing one or more ordinary differential equations (ODEs) as disclosed herein.
38. The composition, system, or kit of any of embodiments 1-28, or 30-36, or the method of embodiment 29 or 37, wherein the composition, system, kit, or method is configured to detect the presence and/or absence of at least two analytes.
39. The composition, system, kit, or method of embodiment 38, comprising a first aTF and a second aTF, wherein the first aTF binds a first ligand, and wherein the second aTF binds a second ligand.
40. The composition, system, kit, or method of embodiment 39, comprising a first engineered transcription template and a second engineered transcription template, wherein the first engineered transcription template comprises a first operator for the first aTF, and wherein the second engineered transcription template comprises a second operator for the second aTF.
41. The composition, system, kit, or method of embodiment 40, wherein the first engineered transcription template encodes a first RNA, and wherein the second engineered transcription template encodes a second RNA, wherein (a) the first RNA and the second RNA are different, or (b) the first RNA and the second RNA are the same.
42. The composition, system, kit, or method of embodiment 41, comprising a first dsDNA signal gate molecule and a second dsDNA signal gate molecule, wherein one strand of the first dsDNA signal gate molecule is complementary to the first encoded RNA, and wherein one strand of the second dsDNA signal gate molecule is complementary to the second encoded RNA.
43. The composition, system, kit, or method of any one of embodiments 38-42, wherein the first and second aTFs bind the first and second operators on the first and second engineered transcription templates, respectively, in the presence of the first and second ligands.
44. The composition, system, kit, or method of any one of embodiments 38-42, wherein the first and second aTFs bind the first and second operators on the first and second engineered transcription templates, respectively, in the absence of the first and second ligands.
45. The composition, system, kit, or method of any one of embodiments 38-42, wherein the first aTF binds the first operator on the first engineered transcription template in the presence of the first ligand, and wherein the second aTF binds the second operator on the second engineered transcription template in the absence of the second ligand.
46. The composition, system, kit, or method of any one of embodiments 38-41, comprising a dsDNA signal gate molecule, wherein one strand of the dsDNA signal gate molecule comprises a first region complementary to the first encoded RNA, and a second region complementary to the second encoded RNA.
47. The composition, system, kit, or method of embodiment 46, wherein the first and second aTFs bind the first and second operators on the first and second engineered transcription templates, respectively, in the presence of the first and second ligands.
48. The composition, system, kit, or method of embodiment 46, wherein the first and second aTFs bind the first and second operators on the first and second engineered transcription templates, respectively, in the absence of the first and second ligands.
49. The composition, system, kit, or method of embodiment 46, wherein the first aTF binds the first operator on the first engineered transcription template in the presence of the first ligand, and wherein the second aTF binds the second operator on the second engineered transcription template in the absence of the first ligand.
50. The composition, system, kit, or method of embodiment 38, comprising a first engineered transcription template encoding a first RNA, and an unregulated transcription template encoding a second RNA; wherein the unregulated transcription template comprises a promoter sequence for RNA polymerase and wherein the encoded second RNA is different than the first encoded RNA.
51. The composition, system, kit, or method of embodiment 50, wherein the first encoded RNA hybridizes to the second encoded RNA.
52. A composition, system, or kit for detecting an analyte comprising one or more of the following components: (a) an RNA polymerase; (b) an allosteric transcription factor (aTF), wherein the analyte is a ligand to which the aTF binds; (c) an engineered transcription template; (d) a dsDNA signal gate molecule; and (e) a dsDNA fuel molecule, wherein the engineered transcription template comprises a promoter sequence for the RNA polymerase and an operator sequence for the aTF operably linked to a sequence encoding an RNA, wherein the aTF modulates transcription of the encoded RNA when the aTF binds the analyte as a ligand, and wherein the transcribed RNA displaces a strand of the dsDNA fuel molecule, generating a free single-stranded DNA molecule which displaces a strand of the dsDNA signal gate molecule, and a detectable signal is generated.
53. The composition, system, or kit of embodiment 52, wherein the dsDNA fuel molecule is a double-stranded nucleic acid comprising a waste strand and a RecycleD strand.
54. The composition, system, or kit of embodiment 52 or 53, wherein the dsDNA signal gate molecule is a fluorescently labeled double-stranded DNA molecule comprising a top strand having a fluorophore conjugated at its 3′-end and a bottom strand having a quencher conjugated at its 5′ end that quenches the fluorophore in the fluorescently labeled double-stranded DNA molecule and the RecycleD strand displaces the bottom strand of the fluorescently labeled double-stranded DNA molecule which results in dequenching of the fluorophore to generate the detectable signal and hybridization of the RecycleD strand to the top strand generates a double-stranded DNA molecule comprising a 3′ toehold on the top strand wherein the RNA polymerase transcribes the top strand and displaces the RecycleD strand and generates a fluorescent DNA/RNA hybrid.
55. The composition, system, or kit of any one of embodiments 52-54, wherein the top strand is longer than the bottom strand and wherein the transcribed RNA comprises a sequence that is complementary to the full length of the top strand.
56. The composition, system, or kit of any one of embodiments 52-55, wherein the waste strand or the bottom strand comprises one or more non-natural modifications that prevent the strand from being utilized as a template for transcription (e.g., 2′-O-methylation).
57. The composition, system, or kit of any one of embodiments 52-56, which is adapted to detect and/or quantify one or more compounds, or a range of compounds related to environmental contamination and/or human health.
The following examples are illustrative and should not be interpreted to limit the scope of the claimed subject matter.
Engineering TMSD to be Compatible with In Vitro Transcription
To interface aTFs with TMSD, we first sought to directly interface unregulated IVT reactions with the DNA gates used to generate signals in TMSD. This required us to validate that a single-stranded RNA can strand-displace a DNA signal gate. We adapted the design and the sequence of the DNA signal gate from a previous work [31] and created the gate by annealing two ssDNA strands: (1) a fluorophore strand consisting of a 24-nucleotide (nt) ssDNA modified with a 6-FAM fluorophore on its 5′ end and (2) a quencher strand consisting of a 16-nt ssDNA strand complementary to the fluorophore strand and modified with an Iowa Black quencher on its 3′ end (
We tested the strand displacement efficiency of this DNA signal gate by adding purified InvadeR to the reaction and monitoring fluorescence, which was standardized to an external fluorescein standard (
We next sought to determine if InvadeR can be transcribed in situ in the presence of the DNA signal gate to generate a fluorescent output. Following the ROSALIND platform design, we chose reaction conditions that use the fast phage polymerase, T7 RNAP. We configured the DNA template encoding InvadeR to consist of the minimal 17-base pair (bp) T7 promoter sequence followed by two initiating guanines and the InvadeR sequence. To begin, we tested whether adding T7 RNAP along with other in vitro transcription (IVT) reagents could interfere with the DNA signal gate. To our surprise, we observed an increase in fluorescence in the absence of a DNA template when only T7 RNAP, IVT buffer and NTPs were added to the DNA signal gate (
Together, these results revealed several important design features required to interface TMSD with IVT reactions. In particular, the use of a 5′ toehold is an important design requirement of the DNA signal gate to prevent promoter-independent transcription by T7 RNAP.
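The toehold geometry described above can be illustrated with a minimal sketch. It assumes strands are written 5′ to 3′ and that the shorter strand is fully complementary to a contiguous region of the longer strand. The sequence used is a hypothetical placeholder, not a sequence from the Sequence Listing, and the function names are illustrative only.

```python
# Minimal sketch: locate the single-stranded toehold of a two-strand DNA gate.
# All sequences below are hypothetical placeholders (not from the Sequence Listing).

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def toehold_lengths(top, bottom):
    """Return unpaired lengths at the 5' and 3' ends of the top strand.

    Assumes the bottom strand is perfectly complementary to one contiguous
    region of the top strand.
    """
    site = revcomp(bottom)
    start = top.upper().find(site)
    if start < 0:
        raise ValueError("bottom strand is not complementary to the top strand")
    return {"5prime": start, "3prime": len(top) - start - len(site)}


def has_3prime_overhang(top, bottom):
    """Flag gates whose toehold sits on the 3' end of the top strand, the
    orientation reported above to be prone to promoter-independent
    transcription by T7 RNAP."""
    return toehold_lengths(top, bottom)["3prime"] > 0


# Hypothetical 24-nt top strand with a 16-nt bottom strand.
TOP = "ACGTTGCAATCGGATCCTAGGCAT"
BOTTOM_5P_TOEHOLD = revcomp(TOP[8:])   # leaves an 8-nt 5' toehold
BOTTOM_3P_TOEHOLD = revcomp(TOP[:16])  # leaves an 8-nt 3' toehold

if __name__ == "__main__":
    print(toehold_lengths(TOP, BOTTOM_5P_TOEHOLD))
    print(has_3prime_overhang(TOP, BOTTOM_3P_TOEHOLD))
```

A check of this kind only screens gate orientation; it does not predict displacement kinetics.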
Interfacing In Vitro Transcription with TMSD Outputs
We next sought to use TMSD to directly track the RNA outputs generated by T7 RNAP-driven IVT in situ. In particular, we focused on optimizing the design of InvadeR for rapid, robust signal generation. Based on our observations in the differences between InvadeR and InvadeD (
To test this hypothesis, we designed three different variants of InvadeR that can strand-invade the DNA signal gate optimized in the previous section (
Next, we tested the strand displacement reaction kinetics of the variants transcribed in situ. Fifty nM of the DNA template encoding each InvadeR variant was added to a reaction mixture containing IVT buffer, T7 RNAP, NTPs and the DNA signal gate, and their fluorescence activation was measured (
While the qualitative ordering of fluorescence kinetics and predicted thermodynamic stabilities of the InvadeR variants held, we did observe some discrepancies between the quantitative predicted secondary structure stabilities and the reaction kinetics of the strengthened variants. For instance, although NUPACK predicts lower minimum free energy values for the strengthened variants than for variant 1, the strengthened variants show faster reaction kinetics than variant 1 (
Together, these results show that both secondary structure and transcription efficiency impact the ability of RNA strands transcribed in situ to invade DNA signal gates and that these design principles can be leveraged to enhance reaction speed.
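As a rough illustration of screening an invading RNA for intramolecular structure, the following sketch scans a sequence for the longest run of consecutively paired nucleotides that could form a hairpin stem. This is a crude heuristic and not a substitute for a thermodynamic prediction such as that provided by NUPACK. The inclusion of G-U wobble pairs, the minimum loop size of 3 nt, and the example sequences are assumptions made for illustration.

```python
# Crude heuristic sketch: longest possible hairpin stem in an RNA sequence.
# Not a thermodynamic model; pairing rules and loop size are assumptions.

PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def longest_stem(rna, min_loop=3):
    """Return the largest number of consecutive base pairs that the sequence
    could form with itself, leaving at least `min_loop` unpaired nucleotides
    in the hairpin loop."""
    rna = rna.upper().replace("T", "U")
    best = 0
    for i in range(len(rna)):
        for j in range(i + min_loop + 1, len(rna)):
            k = 0
            # Extend the stem inward from the outermost pair (i, j).
            while i + k < j - k - min_loop and (rna[i + k], rna[j - k]) in PAIRS:
                k += 1
            best = max(best, k)
    return best


def exceeds_stem_limit(rna, max_paired=3):
    """True if the sequence could form more than `max_paired` consecutive pairs."""
    return longest_stem(rna) > max_paired


if __name__ == "__main__":
    print(longest_stem("GGGGAAAACCCC"))  # hypothetical hairpin-forming sequence
    print(longest_stem("AAAAAAAAAA"))    # hypothetical unstructured sequence
```

The default limit of 3 consecutive pairs mirrors the optional limit recited in embodiment 9; other cutoffs may be more appropriate for a given design.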
Interfacing Cell-Free Biosensors with TMSD Outputs
Next, we sought to determine whether the transcription of InvadeR can be regulated with an aTF, thus creating a ligand-responsive biosensor that uses TMSD outputs. This required us to insert an aTF operator sequence in between the T7 promoter and InvadeR sequence to allow an aTF to regulate transcription. We previously demonstrated that the spacing between the minimal 17-bp T7 promoter sequence and the aTF operator sequence is important for efficient regulation of IVT in ROSALIND reactions [7]. To test if this spacing remained important in the TMSD platform, we used TetR as our model aTF to determine the optimum spacing for efficient repression in the presence of TetR and efficient transcription in the absence of TetR [41]. We used the native sequence that follows the canonical T7 RNAP promoter as a spacer in 2-bp increments from 0 to 10-bp, immediately followed by the tetO sequence and InvadeR sequence (
Using the 2-bp spacer, we next determined whether TetR can be de-repressed with its cognate ligand, anhydrotetracycline (aTc) to allow transcription of InvadeR (
Due to the rapid speed of TMSD reactions [17], we hypothesized that the ligand-mediated induction speed of the InvadeR output would be much faster than the previously used fluorescence-activating RNA aptamer output (
Overall, these results demonstrate that an aTF-based biosensor can be successfully interfaced with TMSD outputs, leading to immediate improvements in reaction speed.
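The template layout used in the spacer screen (promoter, spacer, operator, InvadeR) can be sketched as simple string assembly. The sequences below are assumptions for illustration: the 17-bp promoter and the downstream spacer are taken from the commonly cited T7 class III promoter consensus, the operator is a commonly cited tetO sequence, and the InvadeR sequence is an arbitrary placeholder. None of them should be read as the sequences of the Sequence Listing.

```python
# Minimal sketch: build transcription templates with 0 to 10 bp of spacer
# (2-bp increments) between the T7 promoter and the aTF operator.
# All sequences are assumptions or placeholders, not the disclosed sequences.

T7_PROMOTER_17BP = "TAATACGACTCACTATA"       # assumed minimal promoter (17 bp)
NATIVE_DOWNSTREAM = "GGGAGACCAC"             # assumed native sequence after the promoter
TET_OPERATOR = "TCCCTATCAGTGATAGAGA"         # assumed tetO sequence
INVADER_PLACEHOLDER = "CTACGTCAGGATTCGACTTA"  # hypothetical InvadeR-encoding region


def build_spacer_variants(promoter=T7_PROMOTER_17BP,
                          downstream=NATIVE_DOWNSTREAM,
                          operator=TET_OPERATOR,
                          payload=INVADER_PLACEHOLDER,
                          step=2):
    """Return {spacer_length: template} for spacer lengths 0, 2, ..., len(downstream)."""
    return {
        n: promoter + downstream[:n] + operator + payload
        for n in range(0, len(downstream) + 1, step)
    }


if __name__ == "__main__":
    for spacer_len, template in build_spacer_variants().items():
        print(spacer_len, template)
```

Each variant differs from the next only in the length of the spacer, so differences in repression or transcription between variants can be attributed to the spacing.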
Optimizing Invading RNA Designs for Different Biosensor Families
Having demonstrated the ability to regulate InvadeR with TetR, we next sought to determine whether the system is compatible with different families of aTFs to create biosensors for various classes of chemical contaminants. In addition to TetR, we chose TtgR [44] and SmtB [45] as representative aTFs of the MarR family [46] and SmtB/ArsR family [47], respectively. We placed the cognate operator sequence of each aTF 2-bp downstream of the T7 promoter and immediately upstream of the InvadeR sequences (
We were immediately successful in adapting the system to TetR and TtgR (
Together, these results demonstrate that the modularity of the ROSALIND platform is extensible to the TMSD platform (See Data Availability section for more information). They also reinforce that the secondary structures of the invading RNA strands play a critical role in determining reaction speed and revealed several design principles to improve the TMSD response speed.
Performing Logic Gate Computation with Cascaded TMSD Circuits
The interface of cell-free biosensors with TMSD creates a potentially powerful molecular computation paradigm for engineering devices that can perform programmed tasks in response to specific chemical inputs. This is especially true since TMSD circuits are much easier to program than protein-based circuits as a result of their simpler design rules [49], computational models that accurately predict their behavior [36, 37] and the emerging suite of design tools [24, 50]. We, therefore, sought to leverage these features of TMSD circuits to create an information processing layer for cell-free biosensors that could be used to expand their function.
As the first step towards this goal, we began with simple logic computation to process two different chemical ligands as inputs to the system. Previously, DNA-based TMSD circuits have demonstrated several approaches to building AND and OR logic gates. They typically involve engineering specific interactions between independent sequence domains to trigger a cascade of TMSD reactions—the final output strand then can interact with the DNA signal gate to activate fluorescence under the desired logic conditions with DNA inputs [12, 14, 15, 24]. We therefore thought to adapt this DNA-based logic gate architecture to build RNA-based TMSD circuits that can take chemical inputs, instead of nucleic acid inputs, to perform logic gate computation.
We started by constructing basic components of logic gates, namely OR, AND and NOT gates (
Next, we designed an AND gate by adapting a recently reported DNA AND gate design [30]. In this architecture, the AND gate includes three domains: domain 1 complementary to InvadeR 1 controlled by one aTF (blue), domain 2 complementary to InvadeR 2 controlled by the second aTF (orange) and domain 3 complementary to the DNA signal gate (green) (
While AND and OR are important logic components, more complex logic gate computation requires a NOT circuit, which blocks signal in the presence of a ligand input. Such computation is a basic component of several logic gates including NOR and NAND, but it has not been extensively explored or applied in the context of TMSD circuits. To achieve signal inversion, we designed an RNA NOT gate that is capable of sequestering InvadeR away from the DNA signal gate (
These results demonstrate that with additional RNA-level design considerations, DNA-based TMSD logic gate architectures can be adapted to accommodate RNA strands whose transcription is induced by small molecule inputs in situ, thereby establishing a basis for building cascaded TMSD circuits for more complex logic gate computation.
Layering Gate Components to Perform Complex Logic Computation
With the three basic logic components established (OR, AND, NOT), we next sought to layer these components to enable more complex logic gate computations that form the basis of more sophisticated circuits, including NOR, NAND, IMPLY and NIMPLY.
We began with NOR, an inversion of the OR gate that only generates signal when all inputs are absent, by combining two RNA NOT gates each regulated by TetR or SmtB (
Next, we focused on the A IMPLY B architecture, which has a truth table whose output is always on except under the condition in which A is present and B is absent. The ZnSO4 IMPLY tetracycline gate was built by layering the tetracycline-induced DNA OR gate with the ZnSO4-induced RNA NOT gate (
We then constructed a NAND gate which combines NOT and AND gates to produce signals in all conditions except when both inputs are present. We explored two design options for the NAND gate: (1) inversion of an AND gate output strand (A NAND B=NOT (A AND B)) and (2) the combination of two RNA NOT gates being integrated as inputs into a DNA OR gate (NOT (A AND B)=NOT A OR NOT B). The first design scheme requires the AND gate output strand to form a NOT gate hairpin structure upon being strand-displaced by both InvadeR strands. This poses a sequence constraint where the two domains on the AND gate need to be complementary to each other. Because of this complexity, we instead chose to build the NAND gate using the second design option (
Finally, we designed the A NIMPLY B gate, which combines AND and NOT gates to implement A AND NOT B logic, producing an output only when input A is present alone. The specific NIMPLY gate design shown in
Together, these results show that basic logic gate components can be combined and layered to perform more complex molecular computation using small molecules as inputs to the system. Specifically, the novel development of an RNA NOT gate architecture enabled the constructions of four different logic gates, namely NOR, IMPLY, NAND and NIMPLY.
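The Boolean behavior of the layered gates described above can be summarized with a short sketch that composes NOR, IMPLY, NAND and NIMPLY from OR, AND and NOT primitives and compares each against its textbook definition. The sketch models logic only, with inputs A and B standing for the presence of two ligands. Treating the NOR gate as NOT A AND NOT B is an interpretation of the two-NOT-gate design, and the sketch says nothing about reaction kinetics or signal leak.

```python
# Minimal sketch: truth tables for gates layered from OR, AND and NOT.
# Inputs are booleans meaning "ligand present"; output True means "signal on".
from itertools import product


def NOT(a):
    return not a


def OR(a, b):
    return a or b


def AND(a, b):
    return a and b


# Layered constructions following the descriptions in the text.
LAYERED = {
    "NOR": lambda a, b: AND(NOT(a), NOT(b)),     # two NOT gates, signal only if neither input
    "IMPLY": lambda a, b: OR(NOT(a), b),          # NOT gate on A layered with an OR gate
    "NAND": lambda a, b: OR(NOT(a), NOT(b)),      # design option (2): NOT A OR NOT B
    "NIMPLY": lambda a, b: AND(a, NOT(b)),        # A AND NOT B
}

# Textbook definitions used as a reference.
REFERENCE = {
    "NOR": lambda a, b: not (a or b),
    "IMPLY": lambda a, b: not (a and not b),
    "NAND": lambda a, b: not (a and b),
    "NIMPLY": lambda a, b: a and not b,
}


def truth_table(gate):
    """Return {(A, B): output} over all four input combinations."""
    return {(a, b): bool(gate(a, b)) for a, b in product((False, True), repeat=2)}


if __name__ == "__main__":
    for name, gate in LAYERED.items():
        print(name, truth_table(gate))
```

The agreement between the NAND construction and its reference definition is the identity NOT (A AND B) = NOT A OR NOT B that motivates the second design option.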
Using a TMSD Circuit Processing Layer to Quantify Biosensor Outputs
Two-input logic gate computations with small molecule inputs are a powerful demonstration of the platform's programmability for information processing. To demonstrate a practical application of such an information processing layer, we next chose to focus on quantifying biosensor outputs. In typical cell-free biosensing systems, the sensor layer is wired to the output layer (
To construct a genetic ADC circuit, we first needed to create a comparator circuit—a building block of ADCs that produces a “True” binary output when the input is above a pre-defined threshold. ADC circuits can then be built by creating a series of comparators, each with different thresholds. Previously, this concept of thresholding was implemented in in vitro DNA-only TMSD circuits to act as a low-level noise filter [12, 24]. Thresholding can be implemented in TMSD because the reaction kinetics of strand displacement can be precisely increased by lengthening DNA gate toehold regions [17] (
Our first step was to build a similar thresholding circuit but using input RNA strands generated in situ. The DNA threshold gate was designed to contain two strands: an identical strand to the fluorophore strand of the signal gate and a shortened complementary strand to allow a longer 8-nt toehold compared to the 4-nt toehold of the signal gate (
Next, we sought to create a series of biosensing TMSD comparator circuits to act as an ADC for ligand concentration. Specifically, we prepared a strip of reactions where each tube contains a different amount of the threshold gate. By adding the same input ligand concentration to each tube and observing the output at a specific time point, a user can observe which tubes in the series were activated to obtain semi-quantitative information about ligand concentration (
We first built a model for the system to determine the feasibility of the approach using the same set of ODEs used in
While simple, this demonstration represents the potential of TMSD circuits as an information processing layer to expand the functionality of cell-free biosensors where the circuits transform an analog input signal into a digital readout to increase ease of interpretation and information content of the output signals.
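The thresholding and ADC behavior described above can be illustrated with a simplified kinetic sketch. It is not the ODE model of this disclosure. It assumes a constant transcription rate standing in for the ligand-dependent input, two irreversible bimolecular strand displacement reactions, and hypothetical rate constants chosen only so that the longer-toehold threshold gate reacts about 100-fold faster than the signal gate. All numerical values and function names are assumptions.

```python
# Simplified sketch of a TMSD comparator and ADC (hypothetical parameters).
# Species (nM): free RNA, threshold gate, signal gate, fluorescent product.


def simulate(k_tx, threshold0, signal0=500.0, k_thr=1e-3, k_sig=1e-5,
             t_end=1800.0, dt=0.05):
    """Forward-Euler integration of
        d[RNA]/dt = k_tx - k_thr*[RNA][Thr] - k_sig*[RNA][Sig]
        d[Thr]/dt = -k_thr*[RNA][Thr]
        d[Sig]/dt = -k_sig*[RNA][Sig]
        d[Fluo]/dt = k_sig*[RNA][Sig]
    Returns the fluorescent product (nM) at t_end (seconds).
    k_tx is in nM/s; k_thr and k_sig are in 1/(nM*s)."""
    rna, thr, sig, fluo = 0.0, threshold0, signal0, 0.0
    for _ in range(int(t_end / dt)):
        to_thr = k_thr * rna * thr * dt  # RNA consumed by the threshold gate
        to_sig = k_sig * rna * sig * dt  # RNA consumed by the signal gate
        rna = max(rna + k_tx * dt - to_thr - to_sig, 0.0)
        thr = max(thr - to_thr, 0.0)
        sig = max(sig - to_sig, 0.0)
        fluo += to_sig
    return fluo


# One "tube" per threshold gate amount (nM), mimicking a strip of reactions.
THRESHOLDS = (0.0, 500.0, 1000.0, 2000.0)


def adc_readout(k_tx, thresholds=THRESHOLDS, signal0=500.0):
    """Return one ON/OFF value per tube; a tube is ON when at least half of
    the signal gate has been converted at the readout time."""
    return [simulate(k_tx, t, signal0) >= 0.5 * signal0 for t in thresholds]


if __name__ == "__main__":
    for rate in (0.2, 1.0, 5.0):
        print(rate, adc_readout(rate))
```

In this sketch a higher transcription rate activates at least as many tubes as a lower one, so the number of ON tubes at the readout time serves as a coarse digital estimate of the input level.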
In this study, we show that nucleic acid strand displacement circuits can be interfaced with IVT to act as an information processing layer for cell-free biosensors. We found that the speed of DNA strand displacement outputs led to a significant enhancement of output signal generation speed, with visible outputs being produced in ~10 minutes compared to ~50 minutes for fluorescent RNA aptamer outputs (
While simple in concept, we found that the combination of TMSD with cell-free biosensing reactions did not work immediately. This was due to several factors including the incompatibility of 3′ toehold overhangs in DNA gates with T7 RNAP-driven IVT reactions [50]. A careful analysis of the issues determined that this incompatibility is due to undesired transcription of these 3′ toehold overhangs by T7 RNAP, which can be solved by changing toehold overhangs to be on the 5′ ends (
One of the major limitations of the platform is its cost. Despite the significant decrease in cost of DNA synthesis, chemically modified oligos with purification can still cost ~$100 USD or more, though a single batch can be used to make hundreds of reactions. Furthermore, DNA gates often need to be gel-purified after hybridization to eliminate any unbound ssDNA strands, which can be time-intensive and laborious. This challenge can be partially solved by designing DNA signal gate sequences to minimize fluorophore quenching by the base adjacent to the modification [57]. Additionally, invading RNA strands can be designed to minimize intra- and intermolecular interactions to ensure that all TMSD reactions go to completion to maximize a fluorescent signal from the amount of a DNA signal gate used. We note, however, that the cost of chemical dyes for fluorescence-activating RNA aptamer reporting systems is not insignificant, and the advantages provided by the TMSD system such as the improved response speed and computational power outcompete its limitations.
The key feature of this study was demonstrating the potential of TMSD circuits to expand the function of cell-free biosensors by acting as additional information processing layers. While a similar approach was recently developed to interface aTF-based biosensing with TMSD through endonuclease-mediated TMSD cascades [58], no programmable molecular computation beyond simple contaminant detection was presented. As in natural organisms, information processing layers significantly expand the function of cell-free sensors by enabling systems to manipulate output signals, perform logic operations and make decisions. As a demonstration, we modeled, designed and validated several layered TMSD circuits capable of performing complex logic gate computation with chemical inputs (
To further highlight the platform's capability for information processing, we developed a genetic ADC circuit that can be used to estimate an input ligand concentration at a semi-quantitative level (
We believe that this platform opens the door to enabling other types of molecular computation in cell-free systems. For example, an amplification circuit such as a catalytic hairpin assembly [59] could be applied to ROSALIND with TMSD for amplifying signals and making a sensor ultrasensitive. Beyond thresholding, other operations demonstrated in DNA seesaw gate architectures could be ported to this platform for various computations [24]. For instance, logic gate operations can be extended to develop a general strategy to fix aTF ligand promiscuity [7]. In addition, since virtually any aTF that functions in an in vitro context can be used [7], multiple DNA gates with different reporters could be added for multiplexing. The fundamental role that ADC circuits play in interfacing analog and digital electronic circuitry also holds promise for adopting additional electronic circuit designs to biochemical reactions.
Together, these results show that establishing an interface between small molecule biosensing and TMSD circuits is a promising first step towards creating a general molecular computation platform to enhance and expand the function of cell-free biosensing technologies.
Materials and Methods
Strains and Growth Medium
E. coli strain K12 (NEB Turbo Competent E. coli, New England Biolabs #C2984) was used for routine cloning. E. coli strain Rosetta 2(DE3)pLysS (Novagen #71401) was used for recombinant protein expression. Luria Broth supplemented with the appropriate antibiotic(s) (100 μg/mL carbenicillin, 100 μg/mL kanamycin and/or 34 μg/mL chloramphenicol) was used as the growth media.
DNA Gate Preparation
DNA signal gates used in this study were synthesized by Integrated DNA Technologies as modified oligos. They were generated by denaturing a 6-FAM (fluorescein) modified oligonucleotide and the complementary Iowa Black® FQ quencher modified oligonucleotide (
Plasmids and Genetic Parts Assembly
DNA oligonucleotides for cloning and sequencing were synthesized by Integrated DNA Technologies. Genes encoding aTFs were synthesized either as gBlocks (Integrated DNA Technologies) or gene fragments (Twist Bioscience). Protein expression plasmids were cloned using Gibson Assembly (NEB Gibson Assembly Master Mix, New England Biolabs #E2611) into a pET-28c plasmid backbone and were designed to overexpress recombinant proteins as C-terminus His-tagged fusions. A construct for expressing SmtB additionally incorporated a recognition sequence for cleavage and removal of the His-tag using TEV protease. Gibson assembled constructs were transformed into NEB Turbo cells, and isolated colonies were purified for plasmid DNA (QIAprep Spin Miniprep Kit, Qiagen #27106). Plasmid sequences were verified with Sanger DNA sequencing (Quintara Biosciences) using the primers listed in
All transcription templates except for the templates encoding InvadeR variant 1 in
All plasmids and DNA templates were stored at 4° C. until usage. A listing of the sequences of the oligos and plasmids described in this document is provided in
RNA Expression and Purification
InvadeR variants used for the purified oligo binding assays were first expressed by an overnight IVT at 37° C. from a transcription template encoding a cis-cleaving Hepatitis D ribozyme on the 3′ end of the InvadeR sequence with the following components: IVT buffer (40 mM Tris-HCl pH 8, 8 mM MgCl2, 10 mM DTT, 20 mM NaCl, and 2 mM spermidine), 11.4 mM NTPs pH 7.5, 0.3 U thermostable inorganic pyrophosphatase (#M0296S, New England Biolabs), 100 nM transcription template, 50 ng of T7 RNAP and MilliQ ultrapure H2O to a total volume of 500 μL. The overnight IVT reactions were then ethanol-precipitated and purified by resolving them on a 20% urea-PAGE-TBE gel, isolating the band of expected size (26-29 nt) and eluting at 4° C. overnight in MilliQ ultrapure H2O. The eluted InvadeR variants were ethanol-precipitated, resuspended in MilliQ ultrapure H2O, quantified using the Qubit RNA BR Assay Kit (Invitrogen #Q10211) and stored at −20° C. until usage.
aTF Expression and Purification
aTFs were expressed and purified as previously described [7]. Briefly, sequence-verified pET-28c plasmids were transformed into the Rosetta 2(DE3)pLysS E. coli strain. 1-2 L of cell cultures were grown in Luria Broth at 37° C., induced with 0.5 mM of IPTG at an optical density (600 nm) of ~0.5 and grown for 4 additional hours at 37° C. Cultures were then pelleted by centrifugation and were either stored at −80° C. or resuspended in lysis buffer (10 mM Tris-HCl pH 8, 500 mM NaCl, 1 mM TCEP, and protease inhibitor (complete EDTA-free Protease Inhibitor Cocktail, Roche)) for purification. Resuspended cells were then lysed on ice through ultrasonication, and insoluble materials were removed by centrifugation. Clarified supernatant containing TetR was then purified using His-tag affinity chromatography with a Ni-NTA column (HisTrap FF 5 mL column, GE Healthcare Life Sciences) followed by size exclusion chromatography (Superdex HiLoad 26/600 200 pg column, GE Healthcare Life Sciences) using an AKTAxpress fast protein liquid chromatography (FPLC) system. Clarified supernatants containing TtgR and SmtB were purified using His-tag affinity chromatography with a gravity column charged with Ni-NTA Agarose (Qiagen #30210). The eluted fractions from the FPLC (for TetR) or from the gravity column (for TtgR and SmtB) were concentrated and buffer exchanged (25 mM Tris-HCl, 100 mM NaCl, 1 mM TCEP, 50% glycerol v/v) using centrifugal filtration (Amicon Ultra-0.5, Millipore Sigma). Protein concentrations were determined using the Qubit Protein Assay Kit (Invitrogen #Q33212). The purity and size of the proteins were validated on an SDS-PAGE gel (Mini-PROTEAN TGX and Mini-TETRA cell, Bio-Rad). Purified proteins were stored at −20° C.
In Vitro Transcription (IVT) Reactions
Homemade IVT reactions were set up by adding the following components listed at their final concentration: IVT buffer (40 mM Tris-HCl pH 8, 8 mM MgCl2, 10 mM DTT, 20 mM NaCl, and 2 mM spermidine), 11.4 mM NTPs pH 7.5, 0.3 U thermostable inorganic pyrophosphatase (#M0296S, New England Biolabs), transcription template, DNA gate(s) and MilliQ ultrapure H2O to a total volume of 20 μL. Regulated IVT reactions additionally included a purified aTF at the indicated concentration and were equilibrated at 37° C. for ~10 minutes. Immediately prior to plate reader measurements, 2 ng of T7 RNAP and, optionally, a ligand at the indicated concentration were added to the reaction. Reactions were then characterized on a plate reader as described in “Plate reader quantification and micromolar equivalent fluorescein (MEF) standardization”.
RNA Extraction from IVT Reactions
For RNA products shown on the gel images of
Freeze-Drying
Prior to lyophilization, PCR tube caps were punctured with a pin to create three holes. Lyophilization of ROSALIND reactions was then performed by assembling the components of IVT (see above) with the addition of 50 mM sucrose and 250 mM D-mannitol. Assembled reaction tubes were immediately transferred into a pre-chilled aluminum block and placed in a −80° C. freezer for 10 minutes to allow slow-freezing. Following the slow-freezing, reaction tubes were wrapped in Kimwipes and aluminum foil, submerged in liquid nitrogen and then transferred to a FreeZone 2.5 L Bench Top Freeze Dry System (Labconco) for overnight freeze-drying with a condenser temperature of −85° C. and 0.04 millibar pressure. Unless rehydrated immediately, freeze-dried reactions were packaged as follows. The reactions were placed in a vacuum-sealable bag with a desiccant (Dri-Card Desiccants, Uline #S-19582), purged with Argon using an Argon canister (ArT Wine Preserver, Amazon #8541977939) and immediately vacuum-sealed (KOIOS Vacuum Sealer Machine, Amazon #TVS-2233). The vacuum-sealed bag then was placed in a light-protective bag (Mylar open-ended food bags, Uline #S-11661), heat-sealed (Metronic 8 inch Impulse Bag Sealer, Amazon #8541949845) and stored in a cool, shaded area until usage.
Plate Reader Quantification and Micromolar Equivalent Fluorescein (MEF) Standardization
A NIST traceable standard (Invitrogen #F36915) was used to convert arbitrary fluorescence measurements to micromolar equivalent fluorescein (MEF). Serial dilutions from a 50 μM stock were prepared in 100 mM sodium borate buffer at pH 9.5, including a 100 mM sodium borate buffer blank (total of 12 samples). The samples were prepared in technical and experimental triplicate (12 samples×9 replicates=108 samples total), and fluorescence values were read at an excitation wavelength of 495 nm and emission wavelength of 520 nm for 6-FAM (Fluorescein)-activated fluorescence, or at an excitation wavelength of 472 nm and emission wavelength of 507 nm for 3WJdB-activated fluorescence on a plate reader (Synergy H1, BioTek). Fluorescence values for a fluorescein concentration in which a single replicate saturated the plate reader were excluded from analysis. The remaining replicates (9 per sample) were then averaged at each fluorescein concentration, and the average fluorescence value of the blank was subtracted from all values. Linear regression was then performed for concentrations within the linear range of fluorescence (0-3.125 μM fluorescein) between the measured fluorescence values in arbitrary units and the concentration of fluorescein to identify the conversion factor. For each plate reader, excitation, emission and gain setting, we found a linear conversion factor that was used to correlate arbitrary fluorescence values to MEF (
To characterize reactions, 19 μL of each reaction was loaded onto a 384-well optically-clear, flat-bottom plate using a multichannel pipette, covered with a plate seal and measured on a plate reader (Synergy H1, BioTek). Kinetic analysis of 6-FAM (Fluorescein)-activated fluorescence was performed by reading the plate at 1-minute intervals with excitation and emission wavelengths of 495 nm and 520 nm, respectively, for two hours at 37° C. Kinetic analysis of 3WJdB-activated fluorescence was performed by reading the plate at 3-minute intervals with excitation and emission wavelengths of 472 nm and 507 nm, respectively, for four hours at 37° C. Arbitrary fluorescence values were then converted to MEF by dividing by the appropriate calibration conversion factor.
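The calibration and conversion steps described above can be sketched in Python as follows. This is a minimal illustration rather than the analysis code used for the reported data: the function names, the use of NaN to mark saturated readings, and the synthetic fluorescence values (a 2000 a.u. per μM response on a 100 a.u. background) are assumptions made for the example.

```python
import numpy as np

def mef_conversion_factor(conc_uM, fluorescence, linear_max_uM=3.125):
    """Slope (arbitrary units per uM fluorescein) of the calibration line.

    conc_uM      : fluorescein concentrations; the 0 entry is the buffer blank.
    fluorescence : array of shape (n_concentrations, n_replicates); a saturated
                   reading is marked as NaN (an assumption of this sketch).
    """
    conc = np.asarray(conc_uM, dtype=float)
    fl = np.asarray(fluorescence, dtype=float)
    # Exclude any concentration at which even one replicate saturated.
    valid = np.isfinite(fl).all(axis=1)
    # Average the replicates, then subtract the averaged blank.
    mean_fl = fl.mean(axis=1)
    corrected = mean_fl - mean_fl[conc == 0].mean()
    # Fit only within the linear range of the standard curve.
    use = valid & (conc <= linear_max_uM)
    slope, _intercept = np.polyfit(conc[use], corrected[use], 1)
    return slope

def to_mef(raw_fluorescence, slope):
    """Convert arbitrary fluorescence to MEF by dividing by the conversion factor."""
    return np.asarray(raw_fluorescence, dtype=float) / slope

# Synthetic standard curve: two-fold dilutions from 50 uM plus a blank (12 samples),
# 9 replicates each, with readings above 20 uM treated as saturated.
conc = np.array([50.0 / 2**i for i in range(11)] + [0.0])
readings = np.tile((100.0 + 2000.0 * conc)[:, None], (1, 9))
readings[conc > 20.0, 0] = np.nan

slope = mef_conversion_factor(conc, readings)
mef_value = float(to_mef(4000.0, slope))
print(slope, mef_value)
```

A separate conversion factor would be computed this way for each plate reader, excitation, emission and gain setting.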
Except for the data in
Fluorescence Data Normalization (
Data shown in
where x is a given time point (0≤x≤120)
Background subtraction was performed to account for the non-zero fluorescence observed for the quenched DNA signal gate. Once all data were normalized according to the formula above, n=3 replicates per condition were averaged, and the corresponding standard deviation value per condition was calculated.
Gel Image Analysis
See Data availability section for uncropped, unprocessed gel images presented in
Tap and Lake Water Sampling
For ZnSO4-spiked tap water from Evanston, IL, two bottles of approximately 50 mL of the water samples were collected from a drinking fountain. One of the bottles was then filtered at 0.22 μm using a Steriflip-GP sterile vacuum filtration system (Millipore Sigma Cat. #SCGP00525). Both the filtered and unfiltered water samples were spiked using either 10 mM, 1 mM or 0.1 mM ZnSO4 solution that had been diluted from the 2 M ZnSO4 solution stock (Sigma Cat. #83265). Upon rehydration, fluorescence measurements of the reactions were performed by a plate reader (see “Plate reader quantification and MEF standardization”). For ZnSO4-spiked Lake Michigan water from Evanston, IL, the same sampling method was applied.
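The dilution arithmetic behind preparing the 10 mM, 1 mM and 0.1 mM ZnSO4 working solutions from the 2 M stock follows C1·V1 = C2·V2. The sketch below is illustrative only: the 1000 μL final volume, the ten-fold serial scheme and the function name are assumptions, not details taken from the procedure above.

```python
def stock_volume_uL(stock_mM, target_mM, final_volume_uL):
    """Volume of stock needed so that C1*V1 = C2*V2 holds for the final solution."""
    if target_mM > stock_mM:
        raise ValueError("target concentration exceeds stock concentration")
    return target_mM * final_volume_uL / stock_mM

FINAL_UL = 1000.0  # hypothetical working volume

# First dilution directly from the 2 M (2000 mM) stock.
v_10mM = stock_volume_uL(2000.0, 10.0, FINAL_UL)
# Ten-fold serial steps avoid pipetting sub-microliter volumes of the 2 M stock.
v_1mM = stock_volume_uL(10.0, 1.0, FINAL_UL)
v_0p1mM = stock_volume_uL(1.0, 0.1, FINAL_UL)

for label, vol in (("10 mM from 2 M", v_10mM),
                   ("1 mM from 10 mM", v_1mM),
                   ("0.1 mM from 1 mM", v_0p1mM)):
    print(f"{label}: {vol:.1f} uL source + {FINAL_UL - vol:.1f} uL water")
```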
Statistics and Reproducibility
The number of replicates and types of replicates performed are described in the legend to each figure. Individual data points are shown, and where relevant, the average±standard deviation is shown; this information is provided in each figure legend. The type of statistical analysis performed in
Data Availability
All data presented in this document are deposited in Mendeley Data (doi: 10.17632/hr3j3yztxb.1). All plasmids used in this manuscript are available in Addgene with the identifiers 140371, 140374, 140391 and 140395.
Code Availability
The Python code used in
Supplementary Methods
ODE Model of TMSD Thresholding Circuit
Here, we use the kinetic rates of T7 RNAP-DNA binding, SmtB-smtO binding, SmtB-Zn binding, TMSD reactions and T7 RNAP-mediated IVT reactions to simulate ROSALIND reactions. The following variables will be used:
In this model, we assume:
1. One-to-one binding of T7 RNAP and T7 promoter on the DNA template.
2. The DNA template can be bound to either RNAP or TF, but not both.
3. One-to-one binding of SmtB tetramer and smtO on the DNA template (see footnote a).
4. One-to-one binding of SmtB tetramer and a zinc ion (see footnote a).
5. SmtB tetramer can be bound to either smtO on the DNA template or zinc, but not both.
6. All TMSD reactions are irreversible.
7. Fraying within each gate is ignored.
With these assumptions, we have the following reactions and ODEs in the system:
Reactions:
ODEs:
This set of ODEs was then solved using the ODE solver function odeint from the scipy.integrate package in Python 3.7.6, using the rate parameters shown below. For the unregulated reactions shown in
Parameters:
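As a minimal sketch of how a model of this kind can be integrated with odeint, the example below uses a deliberately reduced scheme: constant-rate transcription of InvadeR from the template, and two irreversible TMSD reactions in which InvadeR is consumed by a fast threshold gate or by a slower signal gate. The reduced scheme, the species names and every rate constant and concentration in it are placeholder assumptions chosen for illustration; they are not the reactions, ODEs or parameters of the model described above, and RNAP, aTF and zinc binding are omitted entirely.

```python
import numpy as np
from scipy.integrate import odeint

def tmsd_threshold_rhs(y, t, k_tx, template, k_thr, k_sig):
    """Right-hand side for the reduced thresholding scheme (concentrations in uM)."""
    rna, thr_gate, sig_gate, signal = y
    v_tx = k_tx * template            # InvadeR production by transcription
    v_thr = k_thr * rna * thr_gate    # irreversible capture by the threshold gate
    v_sig = k_sig * rna * sig_gate    # irreversible opening of the signal gate
    return [v_tx - v_thr - v_sig, -v_thr, -v_sig, v_sig]

def simulate(threshold_gate_uM, t_end_s=7200.0, n_points=241):
    """Integrate for two hours; all parameter values below are hypothetical."""
    t = np.linspace(0.0, t_end_s, n_points)
    y0 = [0.0, threshold_gate_uM, 5.0, 0.0]   # RNA, threshold gate, signal gate, signal
    params = (0.02, 0.05, 1.0, 0.01)          # k_tx (1/s), template (uM), k_thr, k_sig (1/uM/s)
    return t, odeint(tmsd_threshold_rhs, y0, t, args=params)

t, no_threshold = simulate(0.0)
_, with_threshold = simulate(10.0)
print("signal without threshold gate (uM):", no_threshold[-1, 3])
print("signal with 10 uM threshold gate (uM):", with_threshold[-1, 3])
```

In this toy scheme the threshold gate outcompetes the signal gate for InvadeR because its rate constant is set 100-fold higher, so signal stays low until the threshold gate is exhausted.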
Polymerase Strand Recycling (PSR) is a DNA toehold-mediated strand displacement circuit that exploits promoter-independent transcription by T7 RNA polymerase. In this reaction, off-target transcription by T7 RNA polymerase (T7 RNAP) can be used to recycle circuit inputs, creating a catalytic amplification effect by which one circuit input can activate multiple outputs to increase sensitivity. We showed that we can generate higher fluorescent signal from the combination of controlled T7 RNAP off-target transcription and toehold-mediated strand displacement than for circuits that solely rely on strand displacement. This work establishes incorporating T7 RNAP side-reactions as an intended design choice into engineered TMSD circuits and has the potential to enhance existing molecular diagnostics systems as well as increase circuit design possibilities by creating an intermediate layer in cell-free systems that amplifies signal.
A cell-free biosensor (ROSALIND) typically activates when a target compound (input) binds to a protein transcription factor (sensor layer) that is configured to activate expression of a reporter construct (output layer). The single-stranded RNA output is capable of activating toehold-mediated strand displacement (TMSD) circuits that generate signal. Adding a downstream amplification layer before signal generation can recycle the single-stranded RNA output and decrease the amount of input required to fully activate a DNA signal gate.
Advantages to the disclosed technology include, but are not limited to, using T7 RNAP as a recycler for DNA strands to achieve signal recycling and amplification; the need for smaller amounts of one or more of the following components to achieve a detectable signal: (1) InvadeR; (2) DNA template; (3) aTF; (4) ligand. The system, including PSR (e.g., including a fuel gate and T7 RNAP as disclosed herein), achieves more sensitive detection of small molecule ligands as compared to the same system lacking the PSR components. The system including the PSR components as disclosed herein provides for greater sensitivity and lower detection limits (detection of lower concentrations and/or amounts) of small molecule ligands as compared to the same system without the PSR components.
In some embodiments, the sensitivity of embodiments including PSR is increased by greater than 10 fold, 20 fold, 30 fold, 40 fold, 50 fold, 60 fold, 70 fold, 80 fold, 90 fold, 100 fold, 150 fold, or greater than 200 fold compared to embodiments without PSR. In some embodiments, the amount of DNA template input can be reduced by a factor of 10 fold, 20 fold, 30 fold, 40 fold, 50 fold, 60 fold, 70 fold, 80 fold, 90 fold, 100 fold, 150 fold, or greater than 200 fold compared to embodiments without PSR. In some embodiments, the amount of aTF input can be reduced by a factor of 10 fold, 20 fold, 30 fold, 40 fold, 50 fold, 60 fold, 70 fold, 80 fold, 90 fold, 100 fold, 150 fold, or greater than 200 fold compared to embodiments without PSR.
In some embodiments, the TMSD circuit design incorporating T7 RNAP promoter-independent transcription activities comprises one or more chemically modified nucleic acids in the TMSD circuit component design.
In the presence of in vitro transcription components, T7 RNAP can non-specifically bind to the toehold region of the DNA signal gate. When the toehold is on the 3′ end of the gate, this leads to transcription of unwanted RNA side products that can displace the quencher strand. This process is blocked when the toehold is on the 5′ end of the gate.
In some embodiments, the ligand binds to the allosteric transcription factor (aTF) allowing T7 RNA polymerase to transcribe the InvadeR strand which is complementary to the waste strand of the fuel gate and binds to the waste strand and displaces the RecycleD strand (
Using the embodiments described herein, DNA template inputs needed for the amplification system can be reduced by 100- to 200-fold (
High leak signal proves to be a significant challenge in building a catalytic amplifier in the presence of T7 RNAP. However, methylation of the quencher strand of the signal gate reduces leak from the T7 RNAP amplifier and improves signal/noise ratio (
The present disclosure reveals that T7 RNAP activity is sensitive to the quality and source of T7 RNAP (
In some embodiments, toehold lengths of the fuel gate may be between about 1-15 nt, between about 2-10 nt, or between about 4-8 nt. In some embodiments, the double stranded region of the fuel gate may be greater than 15 nt, such as 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, or more nucleotides.
A three-way fuel gate is disclosed herein, wherein the InvadeR strand displaces the DNA fuel gate and exposes a 3′ toehold for T7 RNAP to bind. In this system, T7 RNAP transcribes promoter-independently from the 3′ toehold and releases RecycleD, which then activates the signal gate and is recycled by a second round of T7 RNAP promoter-independent transcription to amplify signal.
The present disclosure shows that tetracycline sensor sensitivity can be increased by PSR. As illustrated in
By way of example, and not by way of limitation, in some embodiments, the use of a fuel gate will reduce the amount of transcription factor needed for a ROSALIND reaction and increase the sensitivity of small molecule sensors, including, for example, lead and zinc small molecule sensors. For example, T7 RNAP may be able to recycle the RNA output with promoter-independent transcription and amplify the fluorescent signal of aTF-regulated transcription. For example, the Environmental Protection Agency's limit for lead contamination is below the sensitivity limit of some small molecule sensors. The inventors predict that using the embodiments of the present disclosure, small molecule sensors will be able to detect lower levels of ligands, such as, without limitation lead, at levels mandated by regulatory agencies or below (see e.g.,
In the foregoing description, it will be readily apparent to one skilled in the art that varying substitutions and modifications may be made to the invention disclosed herein without departing from the scope and spirit of the invention. The invention illustratively described herein suitably may be practiced in the absence of any element or elements, limitation or limitations which is not specifically disclosed herein. The terms and expressions which have been employed are used as terms of description and not of limitation, and there is no intention that in the use of such terms and expressions of excluding any equivalents of the features shown and described or portions thereof, but it is recognized that various modifications are possible within the scope of the invention. Thus, it should be understood that although the present invention has been illustrated by specific embodiments and optional features, modification and/or variation of the concepts herein disclosed may be resorted to by those skilled in the art, and that such modifications and variations are considered to be within the scope of this invention.
Citations to a number of patent and non-patent references may be made herein. The cited references are incorporated by reference herein in their entireties. In the event that there is an inconsistency between a definition of a term in the specification as compared to a definition of the term in a cited reference, the term should be interpreted based on the definition in the specification.
This application is a continuation-in-part of PCT/US2022/018133 filed on Feb. 28, 2022, which claims the benefit of U.S. Application No. 63/154,247 filed Feb. 26, 2021, and U.S. Application No. 63/254,824 filed Oct. 12, 2021. This application claims priority to U.S. Application No. 63/337,267 filed May 2, 2022. The entire contents of each of the above-referenced applications are incorporated herein by reference.
This invention was made with government support under NSF1452441 and NSF1929912 awarded by the National Science Foundation, and T32GM008449 awarded by the National Institutes of Health. The government has certain rights in the invention.
Number | Date | Country | |
---|---|---|---|
63254824 | Oct 2021 | US | |
63154247 | Feb 2021 | US | |
63337267 | May 2022 | US |
Relation | Number | Date | Country |
---|---|---|---|
Parent | PCT/US2022/018133 | Feb 2022 | US |
Child | 18311030 | | US |