This application is filed with a Sequence Listing in electronic form as a Sequence Listing XML, “NEB-478-US” created on Jan. 17, 2025, and having a size of 10,585 bytes. The contents of the Sequence Listing XML are incorporated by reference herein in their entirety.
Reverse transcriptases (RTs) are multi-functional enzymes that typically have multiple enzymatic activities, including an RNA-dependent DNA polymerization activity, a DNA-dependent DNA polymerization activity, and an RNase H activity that catalyzes the cleavage of RNA in RNA-DNA hybrids. These enzymes, which are used to synthesize complementary DNA (cDNA) using RNA as a template, were first identified in RNA viruses. Subsequently, reverse transcriptases have been isolated and purified directly from virus particles, cells, and tissues (e.g., see Kacian et al., 1971, Biochim. Biophys. Acta 46:365-83; Yang et al., 1972, Biochem. Biophys. Res. Comm. 47:505-11; Gerard et al., 1975, J. Virol. 15:785-97; Liu et al., 1977, Arch. Virol. 55:187-200; Kato et al., 1984, J. Virol. Methods 9:325-39; Luke et al., 1990, Biochem. 29:1764-69 and Le Grice et al., 1991, J. Virol. 65:7004-07). A variety of RTs are commercially available and are routinely used in research and diagnostic applications such as PCR tests and RNA sequencing. These include naturally occurring RTs such as Moloney murine leukemia virus (MMLV) RT and derivatives engineered to have improvements, such as greater thermostability. However, there remains a need for improved RTs that work under challenging conditions. For example, the presence of certain sample constituents, such as salts and substances present in medical and environmental samples, can reduce the efficiency of RTs. Reduced RT efficiency leads to a reduced amount of copied DNA generated and even to false positive results in quantitative PCR tests. Thus, it would be desirable to obtain new RTs that can synthesize DNA under a variety of conditions, such as in the presence of substances that inhibit activity of some RTs.
Provided herein are reverse transcriptases comprising an amino acid sequence that is at least 90% identical to the amino acid sequence of SEQ ID NO:1 or SEQ ID NO:2. In some embodiments, the reverse transcriptase has an improvement in one or more properties as compared to other known reverse transcriptases. For example, the present reverse transcriptase may be more efficient than other reverse transcriptases, particularly in the presence of substances that typically inhibit reverse transcriptases. Also provided are variants of the reverse transcriptases and fusion proteins of the reverse transcriptases, which have reverse transcriptase activity. Kits, reaction mixes and methods that include the reverse transcriptase are also provided.
These and other features of the present teachings are set forth herein.
The skilled artisan will understand that the drawings, described below, are for illustration purposes only. The drawings are not intended to limit the scope of the present teachings in any way.
Provided herein are engineered reverse transcriptase enzymes (RTs) that can have improvements in one or more properties as compared to one or more known RTs. For example, the present RTs are believed to be more efficient in the presence of sample constituents that can inhibit reverse transcriptase activity (e.g., salts) relative to commercially available RTs. It is also believed that the described RTs can produce copied DNA in two-step RT-qPCR within short reaction times relative to commercially available RTs.
Inhibitory compounds can exist in RNA and/or DNA samples even after purification, causing reduced efficiency of cDNA synthesis and possibly resulting in false RT-PCR and RT-qPCR results. As described in Example 3, an embodiment of the engineered reverse transcriptases described herein has surprisingly robust polynucleotide extension activity in the presence of a variety of compounds that inhibit known reverse transcriptases, such as salts, chemicals, sample-processing reagents, dyes, and compounds naturally present in environmental and animal samples.
Although embodiments of the disclosure are explained in detail, it is to be understood that other embodiments are contemplated. Accordingly, it is not intended that the invention is limited in its scope to the details of construction and arrangement of components set forth in the following description or illustrated in the drawings. The invention is capable of other embodiments and of being practiced or carried out in various ways.
It must also be noted that, as used in the specification and the appended claims, the singular forms “a,” “an” and “the” include plural referents unless the context clearly dictates otherwise. For example, reference to a sheet or portion is intended also to include the manufacturing of a plurality of sheets or portions. Reference to a sheet containing “a” constituent is intended to include other constituents in addition to the one named.
All publications cited herein are incorporated by reference herein.
Also, in describing the embodiments, terminology will be resorted to for the sake of clarity. It is intended that each term contemplates its broadest meaning as understood by those skilled in the art and includes all technical equivalents which operate in a similar manner to accomplish a similar purpose.
Ranges can be expressed herein as from “about” or “approximately” one particular value and/or to “about” or “approximately” another particular value. When such a range is expressed, another embodiment includes from the one particular value and/or to the other particular value. “Comprising” or “containing” or “including” mean that at least the named compound, element, particle, or method step is present in the composition or article or method, but does not exclude the presence of other compounds, materials, particles, or method steps, even if the other such compounds, materials, particles, or method steps have the same function as what is named.
The term “non-naturally occurring” used in reference to a polypeptide or composition described herein means that the polypeptide or composition does not exist in nature. A “non-naturally occurring” composition can differ from naturally occurring compositions in one or more of the following respects: (a) having components that are not combined in nature; (b) having components in concentrations not found in nature; (c) omitting one or more components otherwise found in naturally occurring compositions; (d) having a form not found in nature, e.g., dried, freeze dried, crystalline, aqueous; and (e) having one or more additional components beyond those found in nature (e.g., buffering agents, a detergent, a dye, a solvent or a preservative). The RT compositions, reaction mixtures, and mixtures formed when performing the described methods are examples of non-naturally occurring compositions; the RTs described herein are examples of non-naturally occurring polypeptides.
The term “position,” when used in reference to an amino acid, means the place such an amino acid occupies in the primary sequence of a polypeptide numbered from its amino terminus to its carboxy terminus. A position in one primary sequence can correspond to a position in a second primary sequence, for example, where the two positions are opposite one another when the two primary sequences are aligned using an alignment algorithm (e.g., BLAST (Journal of Molecular Biology. 215 (3): 403-410) using default parameters (e.g., expect threshold 0.05, word size 3, max matches in a query range 0, matrix BLOSUM62, Gap existence 11 extension 1, and conditional compositional score matrix adjustment) or custom parameters). An amino acid position in one sequence can correspond to a position within a functionally equivalent motif or structural motif that can be identified within one or more other sequence(s) in a database by alignment of the motifs. Analogously, with reference to a nucleotide, “position” means the place such nucleotide occupies in the nucleotide sequence of an oligonucleotide or polynucleotide numbered from its 5′ end to its 3′ end.
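As an illustration only, the correspondence between positions in two aligned primary sequences can be sketched as follows. The sketch assumes that the two rows of a pairwise alignment (for example, as exported from an alignment tool) are already available as equal-length strings with "-" marking gaps. The function name `map_positions` and the toy sequences are hypothetical and are not taken from the Sequence Listing.

```python
def map_positions(aligned_a: str, aligned_b: str) -> dict[int, int]:
    """Map 1-based positions in sequence A to corresponding positions in B.

    Both arguments are rows of one pairwise alignment: equal length,
    with '-' marking a gap. Positions are counted from the amino
    terminus, skipping gap characters.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned rows must have equal length")
    mapping: dict[int, int] = {}
    pos_a = pos_b = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a != "-":
            pos_a += 1
        if res_b != "-":
            pos_b += 1
        # Only columns with a residue in both rows define a correspondence.
        if res_a != "-" and res_b != "-":
            mapping[pos_a] = pos_b
    return mapping


# Hypothetical toy alignment (not a real RT sequence).
ROW_A = "MKT-LLVG"
ROW_B = "MKTAL-VG"
CORRESPONDENCE = map_positions(ROW_A, ROW_B)
```

In this toy alignment, position 4 of the first sequence corresponds to position 5 of the second, and position 5 of the first sequence has no counterpart because it is opposite a gap.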
This disclosure relates to RTs designed using computational approaches. These engineered RTs differ from the well-studied MMLV RT in that they have C- and N-terminal truncations, amino acid insertions, and/or at least about 20% of amino acids of the RTs differ from corresponding positions in the MMLV reverse transcriptase. The MMLV RT has been crystallized (see, e.g., Das et al., Structure. 2004 12:819-29), the structure-functional relationships in MMLV RT have been studied (see, e.g., Cote et al., Virus Res. 2008 134:186-202, Georgiadis et al., Structure. 1995 3:879-92 and Crowther et al., Proteins 2004 57:15-26) and many mutations in MMLV RT are known (see, e.g., Yasukawa et al., J. Biotechnol. 2010 150:299-306, Arezi et al., Nucleic Acids Res. 2009 37:473-81 and Konishi et al., Biochem. Biophys. Res. Commun. 2014 454:269-74, among many others).
As used herein, the term “reverse transcriptase” means a DNA polymerase that can copy first-strand cDNA from an RNA template, and in some cases, a DNA template, via its polymerase domain. Such enzymes are commonly referred to as RNA-directed DNA polymerases and have IUBMB activity EC 2.7.7.49. In some cases, a reverse transcriptase can copy a complementary DNA strand using either single-stranded RNA or DNA as a template. The RTs described herein have reverse transcriptase activity. The RTs described herein can use RNA and/or DNA as a template. Thus, the RTs described herein can copy first-strand DNA from RNA template and can copy second-strand DNA from the resulting cDNA, e.g., for applications such as library preparation and RT-PCR.
The RTs described herein may have template switching activity. The term “template switching” means a reverse transcription reaction in which the reverse transcriptase switches template from an RNA molecule to a synthetic oligonucleotide (which usually contains two or three Gs at its 3′ end), thereby copying the sequence of the synthetic oligonucleotide onto the end of the cDNA. Template switching is described, for example, in Matz et al., Nucl. Acids Res. 1999 27:1558-1560 and Wu et al., Nat Methods. 2014 11:41-6. In template switching, a primer hybridizes to an RNA template. This primer serves as a primer for an RT that copies the RNA template to form a complementary DNA product. In copying the RNA template, the RT commonly travels beyond the 5′ end of the RNA template to add non-template nucleotides to the 3′ end of the DNA product (typically Cs). Upon addition of an oligonucleotide that has ribonucleotides or deoxyribonucleotides that are complementary to the non-template nucleotides added onto the DNA product (e.g., a “template switching” oligonucleotide that typically has two or three Gs at its 3′ end), the RT will jump templates from the RNA template to the oligonucleotide template, thereby producing a DNA product that has the complement of the template switching oligonucleotide at its 3′ end. Example 10 describes exemplary template switching activity of an RT disclosed herein. The template switching activity of an RT described herein is useful in applications where it is desirable to add a sequence to a DNA product, such as for mRNA sequencing, targeted RNA sequencing, rare transcript detection, single-cell RNA sequencing, cDNA library construction, and diagnostic applications.
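The sequence-level outcome of the template switching steps described above can be illustrated with a minimal, purely in-silico sketch. It assumes ideal behavior (complete copying of the RNA template, a fixed number of non-template Cs, and perfect pairing with the oligonucleotide), which a real reaction need not exhibit. The function names, the tail length of three, and the toy sequences are hypothetical.

```python
# Watson-Crick complements written as DNA bases; U is accepted so that
# RNA templates and oligonucleotides containing ribonucleotides can be copied.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "U": "A", "T": "A"}


def copy_to_dna(template: str) -> str:
    """Return the DNA strand (5'->3') complementary to a 5'->3' template."""
    return "".join(COMPLEMENT[base] for base in reversed(template))


def template_switch(rna: str, tso: str, n_tail: int = 3) -> str:
    """Model the cDNA (5'->3') produced by an idealized template switch.

    rna    : RNA template, 5'->3'
    tso    : template switching oligonucleotide, 5'->3', ending in n_tail Gs
    n_tail : number of non-template Cs added past the 5' end of the RNA
             (an assumed, fixed value for this sketch)
    """
    if not tso.endswith("G" * n_tail):
        raise ValueError("oligonucleotide must end with the assumed G tract")
    first_strand = copy_to_dna(rna)          # copy of the RNA template
    tailed = first_strand + "C" * n_tail     # non-template Cs at the 3' end
    # The 3' Gs of the oligonucleotide pair with the C tail; the RT then
    # copies the remainder of the oligonucleotide onto the DNA product.
    return tailed + copy_to_dna(tso[:-n_tail])


# Hypothetical toy sequences.
CDNA = template_switch("AUGGCU", "ACGTGGG")
```

Under these assumptions, the 3′ end of the modeled DNA product is the complement of the whole template switching oligonucleotide, consistent with the description above.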
The RTs described herein may have RNase H activity or may lack RNase H activity. The term “RNase H activity” means an enzymatic activity that hydrolyzes RNA in an RNA/DNA hybrid. Many reverse transcriptases have an RNase H activity that can be inactivated by truncation or by substitution. The RTs described herein can lack RNase H activity. For example, the RT of SEQ ID NO:1 contains amino acids at locations expected to reduce or abolish RNase H activity (e.g., G at position 502, Q at position 540, and N at position 561) and likewise the RT of SEQ ID NO:2 has G at position 500, Q at position 538 and N at position 559. These amino acids can be mutated to increase RNase H activity. For example, the following substitutions could restore RNase H activity: for SEQ ID NO:1, D at position 502, E at position 540, and D at position 561 and likewise for SEQ ID NO:2, D at position 500, E at position 538 and D at position 559.
Provided herein is a reverse transcriptase (RT) comprising (a) an amino acid sequence having at least 90% identity with SEQ ID NO:1, or (b) an amino acid sequence having at least 90% identity with SEQ ID NO:2.
There is also provided a reverse transcriptase (RT) consisting of an amino acid sequence containing an N-terminal start sequence (e.g., a methionine) and (a) an amino acid sequence having at least 90% identity with SEQ ID NO:1, or (b) an amino acid sequence having at least 90% identity with SEQ ID NO:2. It is understood that a polypeptide consisting of a particular sequence can further contain one or more amino acids required for translation of the amino acid sequence into a polypeptide. Thus, in some embodiments, in which the RT consists of a defined amino acid sequence, it is understood that the RT may further contain a start codon, e.g., methionine, to permit its expression.
In some embodiments, the RT comprises an amino acid sequence that has at least 90% sequence identity to SEQ ID NO: 1 (e.g., at least 93%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% sequence identity to SEQ ID NO: 1) but less than 100% sequence identity to SEQ ID NO: 1. In an embodiment the RT comprises an amino acid sequence that has at least 95% sequence identity to SEQ ID NO:1. In an embodiment, the RT comprises an amino acid sequence that is identical to SEQ ID NO:1.
In some embodiments, the RT comprises an amino acid sequence that has at least 90% sequence identity (e.g., at least 93%, at least 95%, at least 96%, at least 97%, at least 98% or at least 99% sequence identity) to SEQ ID NO: 1 (i.e., the RT comprises a variant of SEQ ID NO: 1) wherein one or more of the following amino acid substitutions are present: a Q to R at position 190, a D to R at position 445, an E to K at position 509, and an E to K at position 588, wherein the positions correspond to positions in SEQ ID NO:1. In an embodiment, the RT has R at position 190, and optionally R at position 445 and/or K at position 509 and/or K at position 588. In an embodiment, the RT has R at position 445 and optionally has R at position 190 and/or K at position 509 and/or K at position 588. In an embodiment, the RT has K at position 509 and optionally R at position 190 and/or R at position 445 and/or K at position 588. In an embodiment, the RT has K at position 588 and optionally R at position 190 and/or R at position 445 and/or K at position 509. In an embodiment, the RT comprises an amino acid sequence having at least 90% identity with SEQ ID NO:1, wherein the amino acid sequence has an R at position 190, an R at position 445, a K at position 509, and a K at position 588, wherein the positions correspond to positions in SEQ ID NO:1. For example, the RT of SEQ ID NO:3 contains each of these amino acid substitutions.
In an embodiment, the RT comprises an amino acid sequence that has 10 or fewer, 9 or fewer, 8 or fewer, 7 or fewer, 6 or fewer, 5 or fewer, 4 or fewer, 3 or fewer, or 2 or fewer amino acid substitutions relative to SEQ ID NO:1. In an embodiment, the amino acid substitutions are conservative substitutions.
In some embodiments, the RT comprises an amino acid sequence that has at least 90% sequence identity to SEQ ID NO: 2 (e.g., at least 93%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% sequence identity to SEQ ID NO: 2) but less than 100% sequence identity to the amino acid sequence of SEQ ID NO: 2. In an embodiment the RT comprises an amino acid sequence that has at least 95% sequence identity to SEQ ID NO:2. In an embodiment, the RT comprises an amino acid sequence that is identical to SEQ ID NO:2.
In some embodiments, the RT comprises an amino acid sequence that has at least 90% sequence identity (e.g., at least 93%, at least 95%, at least 96%, at least 97%, at least 98% or at least 99% sequence identity) to SEQ ID NO: 2 (i.e., the RT comprises a variant of SEQ ID NO: 2) wherein one or more of the following amino acid substitutions are present: a Q to R at position 190, an E to R at position 444, an E to K at position 507, and a D to K at position 586, wherein the positions correspond to positions in SEQ ID NO:2. In an embodiment, the RT has R at position 190, and optionally R at position 444 and/or K at position 507 and/or K at position 586. In an embodiment, the RT has R at position 444 and optionally has R at position 190 and/or K at position 507 and/or K at position 586. In an embodiment, the RT has K at position 507 and optionally R at position 190 and/or R at position 444 and/or K at position 586. In an embodiment, the RT has K at position 586 and optionally R at position 190 and/or R at position 444 and/or K at position 507. In an embodiment, the RT comprises an amino acid sequence having at least 90% identity with SEQ ID NO:2, wherein the amino acid sequence has an R at position 190, an R at position 444, a K at position 507, and a K at position 586, wherein the positions correspond to positions in SEQ ID NO:2. For example, the RT of SEQ ID NO:4 contains each of these amino acid substitutions.
In an embodiment, the RT comprises an amino acid sequence that has 10 or fewer, 9 or fewer, 8 or fewer, 7 or fewer, 6 or fewer, 5 or fewer, 4 or fewer, 3 or fewer, or 2 or fewer amino acid substitutions relative to SEQ ID NO:2. In an embodiment, the amino acid substitutions are conservative substitutions.
Mutations to MMLV RT are well documented as described above, and such mutations can be incorporated into an RT amino acid sequence described above to produce a variant RT. Examples of such mutations that could be made to an amino acid sequence having at least 90% identity with SEQ ID NO:1 (e.g., at least 93%, at least 95%, at least 96%, at least 97%, at least 98% or at least 99% sequence identity to SEQ ID NO: 1) include any one or more of: K97S, I102V, T115S, T115N, R120K, H181K, D186Q, F187W, R188P, R188T, V189P, V189Q, E193Q, L204M, L205I, P208E, P208Q, E210Q, G216W, G216D, A236L, I249T, Q254T, R261M, K262R, T283S, A284V, A295G, T309Q, H317D, Q318E, E322K, L339I, K366Q, R367Q, V369I, T413V, D426F, L442I, G450V, L474V, V518I, A527T, F603M, V608I, H612K, and K618R. Similarly, examples of such mutations that could be made to an amino acid sequence having at least 90% identity with SEQ ID NO: 2 (e.g., at least 93%, at least 95%, at least 96%, at least 97%, at least 98% or at least 99% sequence identity to SEQ ID NO: 2), include any one or more of: K97S, I102V, S115N, R120K, H181K, P186Q, F187W, R188P, R188T, A189P, A189Q, L204M, V205I, A208E, A208Q, K210Q, G216W, G216D, A236L, K254T, R261M, K262R, T283S, A284V, A295G, V309Q, H317D, Q318E, E322K, L339I, K366Q, R367Q, V369I, A413V, D426F, L442I, A449V, S473V, V516I, A525T, H601M, V606I, and H610K.
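The substitution notation used above (e.g., K97S, meaning that S replaces K at position 97) lends itself to a simple in-silico check before a variant is constructed. The following is a minimal sketch, assuming one-letter amino acid codes and 1-based positions; the function name `apply_substitutions` and the toy sequence are hypothetical and do not reproduce any sequence of the Sequence Listing.

```python
import re

# One-letter original residue, 1-based position, one-letter new residue.
_SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def apply_substitutions(sequence: str, substitutions: list[str]) -> str:
    """Return a variant of `sequence` carrying the listed substitutions.

    Each substitution is checked against the starting sequence so that a
    mismatch between the notation and the sequence is reported rather
    than silently applied.
    """
    residues = list(sequence)
    seen_positions: set[int] = set()
    for token in substitutions:
        match = _SUBSTITUTION.match(token)
        if match is None:
            raise ValueError(f"not a substitution of the form K97S: {token!r}")
        original, position, replacement = (
            match.group(1), int(match.group(2)), match.group(3))
        if not 1 <= position <= len(sequence):
            raise ValueError(f"position out of range: {token!r}")
        if sequence[position - 1] != original:
            raise ValueError(f"sequence does not have {original} at {position}")
        if position in seen_positions:
            # Alternatives at one position (e.g., R188P and R188T) are
            # mutually exclusive within a single variant.
            raise ValueError(f"position {position} substituted twice")
        seen_positions.add(position)
        residues[position - 1] = replacement
    return "".join(residues)


# Hypothetical six-residue toy sequence.
VARIANT = apply_substitutions("MKQLDE", ["Q3R", "D5R"])
```

Here the toy variant carries R at positions 3 and 5. A token that does not start with a one-letter residue code, or that names a residue not present at the stated position, is rejected.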
Any substitution of an amino acid in SEQ ID NO:1 or SEQ ID NO:2 (that is, in a “variant” of SEQ ID NO:1 or SEQ ID NO:2) can be a conservative substitution. The term “conservative substitution” means replacement of an amino acid in a polypeptide by one with similar characteristics; such substitutions are not likely to change the shape of the polypeptide chain, e.g., substituting one hydrophobic amino acid for another. For example, a non-polar amino acid (e.g., A, V, L, I, M, W, and F (and optionally C, G, and P)) may substitute for another non-polar amino acid, a polar amino acid (e.g., N, Q, S, T, and Y) may substitute for another polar amino acid (e.g., C, D, E, H, K, N, P, Q, R, S, and T), a positively charged amino acid (H, K, and R) may substitute for another positively charged amino acid, and a negatively charged amino acid (e.g., D and E) may substitute for another negatively charged amino acid. A substitute amino acid may be a natural amino acid (e.g., replacing another natural amino acid or a non-natural amino acid). A substitute amino acid may be a non-natural amino acid (e.g., replacing a natural amino acid or another non-natural amino acid). Examples of non-natural amino acids include norleucine, ornithine, norvaline, homoserine, and other amino acid analogs such as those described in Ellman et al., Meth. Enzym. 202:301-336 (1991).
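For illustration, the exemplary groupings above can be expressed as a simple lookup. This sketch is one possible reading only: it uses the narrower exemplary lists (non-polar A, V, L, I, M, W, F; polar N, Q, S, T, Y; positively charged H, K, R; negatively charged D, E), treats C, G, and P as non-polar only on request, and does not attempt to capture the broader second list of polar residues. The names used are hypothetical.

```python
# Exemplary residue classes taken from the narrower lists in the text.
RESIDUE_CLASSES = {
    "non-polar": frozenset("AVLIMWF"),
    "polar": frozenset("NQSTY"),
    "positively charged": frozenset("HKR"),
    "negatively charged": frozenset("DE"),
}
OPTIONAL_NON_POLAR = frozenset("CGP")


def is_conservative(original: str, replacement: str,
                    include_optional_non_polar: bool = False) -> bool:
    """Return True if both residues fall in the same exemplary class."""
    for name, members in RESIDUE_CLASSES.items():
        if name == "non-polar" and include_optional_non_polar:
            members = members | OPTIONAL_NON_POLAR
        if original in members and replacement in members:
            return True
    return False
```

Under this reading, a D to E or L to I substitution is classified as conservative, whereas a Q to R substitution is not; a G to A substitution is classified as conservative only when the optional non-polar residues are included.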
As is described above, the RTs described herein have reverse transcriptase activity. In some embodiments, the RT has increased reverse transcriptase activity, and/or has improved tolerance to one or more reaction conditions such as high salt concentration, temperature, or the presence of reaction or sample components that inhibit reverse transcriptase activity, as compared to an RT known in the art.
Also provided are RTs that are fusion proteins comprising an exogenous amino acid sequence fused to an RT provided herein and described above. As used herein, the term “fusion protein” means a non-naturally occurring polypeptide containing two or more amino acid segments that are not joined in their naturally occurring states, e.g., a polymerase domain of an RT described herein (where such polymerase domain has RT activity) and an amino acid sequence that is not joined to the polymerase domain in its naturally occurring state (an “exogenous” amino acid sequence). A fusion protein can be constructed for a variety of purposes, such as for ease of purification (a “purification tag,” e.g., poly-His, chitin binding domain, maltose binding protein, glutathione S-transferase (GST), alpha mating factor or SNAP-Tag® (New England Biolabs, Ipswich, MA)); for detection (e.g., a fluorescent protein for direct detection, an enzyme for indirect detection such as horseradish peroxidase); for protein translocation within a cell, tissue or organism (e.g., nuclear localization signal, mitochondrial targeting sequence, endoplasmic reticulum signal sequence, peroxisomal targeting signal, lysosomal targeting signal, organ-targeting signal); for protein interaction with other targets (e.g., DNA binding domain, which can be non-specific or specific); for chemical modification (e.g., to introduce a modification site), and the like. DNA binding domains have been shown to increase the processivity of other polymerases (see, e.g., US 2016/0160193); exemplary DNA binding domains include: Sso7d, BD007, BD023, BD009, BD062, BD093, BD109, BD006, and BD012.
Accordingly, in some embodiments there is provided a fusion protein comprising the polymerase domain of an RT as defined herein (e.g., a polymerase domain having reverse transcriptase activity, comprising an amino acid sequence having at least 90% identity with SEQ ID NO: 1 or SEQ ID NO:2, or a polymerase domain having a reverse transcriptase activity comprising an amino acid sequence that is a portion of an amino acid sequence having at least 90% identity with SEQ ID NO:1 or SEQ ID NO:2) and any one or more of a purification tag; a detection moiety; a protein translocation sequence; a protein interaction sequence; and a sequence comprising a chemical modification or comprising a site that can be chemically modified. Thus, in some embodiments, the exogenous amino acid sequence comprises any one or more of a purification tag; a detection moiety; a protein translocation sequence; a protein interaction sequence; and a sequence comprising a chemical modification or comprising a site that can be chemically modified.
An RT described herein can be joined with one or more of such domains at its N-terminus, C-terminus, and/or the middle portion located anywhere between the N- and C-terminus, or at more than one location. Segments of a fusion protein can optionally be separated by a linker. Polypeptide components of a fusion protein can be joined by one or more peptide bonds, disulfide linkages, and/or other covalent bonds. In an embodiment, the RT is a fusion protein comprising a polymerase domain that comprises an RT amino acid sequence described herein, and an exogenous amino acid sequence as defined herein. In some embodiments, there is provided a reverse transcriptase that comprises: (a) an amino acid sequence that is at least 90% identical to the amino acid sequence of SEQ ID NO:1 or SEQ ID NO: 2; and (b) a purification tag. In some embodiments, the reverse transcriptase fusion protein comprises (i) a polymerase domain comprising an amino acid sequence selected from: an amino acid sequence having at least 90% identity with SEQ ID NO:1; or an amino acid sequence having at least 90% identity with SEQ ID NO:2, or a portion of the selected amino acid sequence having reverse transcriptase activity; and (ii) an exogenous amino acid sequence.
For a particular fusion protein comprising a polymerase domain that is a segment of SEQ ID NO:1 or SEQ ID NO:2 (or a segment of a variant of SEQ ID NO:1 or SEQ ID NO:2 having an amino acid sequence with at least 90%, 92%, at least 93%, at least 95%, at least 96%, at least 97%, at least 98% or at least 99% sequence identity to SEQ ID NO:1 or SEQ ID NO:2, respectively), it is understood that the % sequence identity is determined with respect to the number of amino acids in the polymerase domain (and not with respect to the number of amino acids in the full length SEQ ID NO:1 or SEQ ID NO:2 unless the polymerase domain corresponds to the full length of SEQ ID NO:1 or SEQ ID NO:2 or the variant thereof).
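As a worked illustration of this convention, percent identity can be computed over the residues of the polymerase domain rather than over the full-length reference. The sketch below assumes the domain and the reference have already been aligned (equal-length rows, "-" for gaps) and counts identical aligned residues divided by the number of residues in the domain. Alignment tools may use other denominators, so this is one reading of the convention, and the names and toy sequences are hypothetical.

```python
def percent_identity_over_domain(aligned_domain: str,
                                 aligned_reference: str) -> float:
    """Percent identity relative to the number of residues in the domain.

    Both arguments are rows of one pairwise alignment (equal length,
    '-' for gaps). Flanking reference residues that lie opposite gaps
    in the domain row do not enter the denominator.
    """
    if len(aligned_domain) != len(aligned_reference):
        raise ValueError("aligned rows must have equal length")
    domain_length = sum(1 for res in aligned_domain if res != "-")
    if domain_length == 0:
        raise ValueError("domain row contains no residues")
    identical = sum(
        1 for dom, ref in zip(aligned_domain, aligned_reference)
        if dom != "-" and dom == ref
    )
    return 100.0 * identical / domain_length


# Hypothetical toy rows: a 10-residue domain with one mismatch, aligned
# to a longer reference whose flanking residues are opposite gaps.
IDENTITY = percent_identity_over_domain("--ACDEFGHIKL--", "MMACDEYGHIKLPP")
```

In this toy case, nine of the ten domain residues match, so the sketch reports 90% identity even though the reference is fourteen residues long.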
The RT fusion proteins described herein have reverse transcriptase activity. In some embodiments, the RT fusion protein has increased reverse transcriptase activity, and/or has improved tolerance to one or more reaction conditions such as high salt concentration, temperature, or the presence of reaction or sample components that inhibit reverse transcriptase activity, as compared to an RT known in the art and/or as compared to a fusion protein comprising an RT known in the art (e.g., an RT fusion protein containing an exogenous amino acid sequence of equivalent function).
Also provided by the present disclosure are compositions including an RT described herein. Such a composition can include one or more RTs and one or more substances selected for purposes such as storage stability (including a substance such as a solid support, gel, or solution); detection of presence, concentration, or activity of the reverse transcriptase; and/or for performing a method using the reverse transcriptase (e.g., providing the RT with other components for polynucleotide extension of a target nucleic acid (e.g., RNA template or DNA template) and/or for a template switching reaction, referenced in some instances as a reaction mixture).
A composition can therefore contain components for polynucleotide extension (e.g., extension of nucleic acid target), such as dNTPs. Compositions containing dNTPs can include one, two, three or all four of dATP, dTTP, dGTP and dCTP, and can include one or more modified dNTPs, such as forms that are resistant to, or susceptible to, a particular enzymatic or chemical conversion (e.g., deaminase-resistant), or that are detectable. Examples of modified dNTPs include alpha-phosphorothioate dNTPs, dUTP, dITP, and labeled dNTPs such as, e.g., fluorescein- or cyanine-dye family dNTPs and radiolabeled dNTPs.
An RT composition can include any of (including one or more of) a buffering agent (e.g., HEPES, MES, MOPS, TAPS, tricine, Tris, ACES, ADA, BES, Bicine, CAPS, carbonic acid/bicarbonic acid, CHES, citric acid, DIPSO, EPPS, histidine, MOPSO, phosphoric acid, PIPES, POPSO, TAPSO, triethanolamine); an excipient; a salt, such as a cationic salt generally required for polynucleotide extension activity of an RT (e.g., NaCl, MnCl2, MgCl2, MgSO4, CaCl2); a protein (e.g., albumin; an enzyme, such as a uracil DNA glycosylase (UDG), a polymerase such as Taq polymerase, e.g., for performing qPCR procedures; a nucleic acid binding protein, e.g., Gp32); a dye (e.g., for detecting the presence, concentration or activity of the RT, including detecting cDNA molecules generated such as in qPCR methods); a stabilizer; an inhibitor (e.g., RNase inhibitor, such as human placental RNase inhibitors, porcine liver RNase inhibitors, mouse RNase inhibitor, 2′-cytidine monophosphate free acid (2′-CMP), aluminon, adenosine 5′-pyrophosphate); a detergent (for example, ionic, non-ionic, and/or zwitterionic detergents, a poloxamer); a reducing agent (e.g., dithiothreitol, beta mercaptoethanol); a polynucleotide such as one or more oligonucleotides (e.g., template switching oligos), primers and/or control polynucleotides; a cell (e.g., intact, digested, or any cell-free extract); a biological sample; an aptamer (e.g., for controlling the activity of an RT); a crowding agent (e.g., polyethylene glycol, e.g., PEG6000); a sugar (e.g., a mono, di, tri, tetra, or higher saccharide); a starch; cellulose; a glass-forming agent (e.g., for lyophilization); a lipid; an oil; aqueous media; a support (e.g., a matrix such as a bead, filter paper, slide) and/or (non-naturally occurring) combinations thereof.
Combinations can include, for example, two or more of the listed components (e.g., a salt and a buffering agent) or a plurality of a single listed component (e.g., two different salts or two different sugars). Those skilled in the art will be able to identify additional components for their use of the disclosed RT in a composition (or method or kit, as described below).
Thus, in an embodiment, a composition includes an RT described herein. In various embodiments, the composition can include one or more of: dNTPs, a buffering agent, an oligonucleotide (e.g., a primer, a template switching oligonucleotide), a cationic salt (e.g., a divalent salt and optionally a monovalent salt), a DNA polymerase, an RNA template, a DNA template, a DNA binding protein (e.g., T4 GP32, for example when performing a template switching reaction) and an aptamer. In an embodiment, the RT is associated with (e.g., covalently or non-covalently attached to) a solid support. In an embodiment, the composition is lyophilized, dried, and/or frozen.
The RTs described herein can be used for any purpose in which their activity is necessary or desired, for example to copy RNA into DNA and/or copy DNA into DNA via the DNA polymerase activity of an RT; to employ the template switching activity for any purpose; or both. The RTs described herein can have reverse transcriptase activity; template switching activity; or both reverse transcriptase and template switching activity. In an embodiment, the RT has reverse transcriptase activity. In an embodiment, the RT has reverse transcriptase and template switching activity. Accordingly, the RTs described herein can be useful, e.g., for first strand cDNA synthesis, second strand cDNA synthesis, RT-PCR, RT-qPCR, RNA-seq library preparation, 5′RACE. When performing some methods, additional enzymes can be necessary or beneficial for high performance. For example, Examples 6-9 describe RT-qPCR methods that employ Taq polymerase (Luna® Universal Probe qPCR Master Mix) for the quantitative PCR.
Accordingly, provided herein are methods, which involve incubating a reaction mixture comprising (i) a reverse transcriptase described herein, (ii) a primer, (ii) dNTPs, and (iii) a target nucleic acid, under conditions suitable for copying of the template (polynucleotide extension of a primer, resulting in copying of the nucleic acid template to produce a copied DNA product).
Also provided herein are methods for template switching employing an RT described herein. As background, conventional cDNA construction strategies sometimes underrepresent the 5′ end sequences of the mRNA. A known approach to solving this problem is template switching using a chimeric DNA: RNA oligo and a reverse transcriptase having template switching activity to improve 5′ transcript coverage in a sequence-independent manner and thereby obtain better coverage of the sequences present in the mRNA library. Thus, a described RT can be used for this among other template switching applications. In an embodiment, an RT described herein can be combined with a DNA binding protein such as T4 Gene 32 Protein in a template switching reaction (see, e.g., Example 10).
The RTs described herein are useful for polynucleotide extension of nucleic acid templates using a primer, also referred to herein as copying a target nucleic acid to generate copied DNA. As generally used herein, the term “nucleic acid” means a polymeric form of nucleotides of any length, such as deoxyribonucleotides or ribonucleotides, or analogs thereof. For example, a nucleic acid can be DNA, RNA or the DNA product of RNA subjected to reverse transcription. Non-limiting examples of nucleic acids include coding or non-coding regions of a gene or gene fragment, loci (locus) defined from linkage analysis, exons, introns, messenger RNA (mRNA), transfer RNA, ribosomal RNA, ribozymes, cDNA, recombinant nucleic acids, branched nucleic acids, plasmids, vectors, isolated DNA of any sequence, isolated RNA of any sequence, nucleic acid probes, and primers. Other examples of nucleic acids include, without limitation, cDNA, aptamers, and peptide nucleic acids. Nucleic acid can contain modified nucleotides, such as methylated nucleotides and nucleotide analogs (“analogous” forms of purines and pyrimidines are well known in the art). If present, modifications to the nucleotide structure can be imparted before or after assembly of the polymer. Nucleic acid can be a single-stranded, double-stranded, partially single-stranded, or partially double-stranded DNA or RNA, depending on the application. As used herein, the term “polynucleotide extension” means the synthesis of DNA catalyzed by an RT resulting in polymerization of individual nucleoside triphosphates using a primer as a point of initiation. A primer is hybridized to a target nucleic acid (RNA or DNA) to form a primer-template complex. The primer-template complex is contacted with the RT and nucleoside triphosphates (dNTPs) in a suitable environment to permit the addition of nucleotides to the 3′ end of the primer, thereby producing a copied DNA product complementary to at least a portion of the target nucleic acid.
As used herein, the term “primer” means an oligonucleotide that is capable of, upon forming a duplex with a polynucleotide template, acting as a point of initiation of DNA synthesis and being extended from its 3′ end along the template so that an extended duplex is formed. The sequence of nucleotides added during the extension process is determined by the sequence of the template polynucleotide. Primers are of a length compatible with their use in synthesis of primer extension products, and can be in the range of between 8 to 100 nucleotides in length, such as 10 to 75, 15 to 60, 15 to 40, 18 to 30, 20 to 40, 21 to 50, 22 to 45, 25 to 40, and so on, more typically in the range of between 18 to 40, 20 to 35, 21 to 30 nucleotides long, and any length between the stated ranges. Primers are usually single stranded. Primers have a 3′ hydroxyl.
One or more sequence-specific primers can be used for RT-PCR (e.g., quantitative RT-PCR) and other similar analyses. Primers containing oligo-dT and random primers can be used when making a cDNA library, e.g., to be sequenced, used for gene expression analysis (e.g., RACE). Template switching oligonucleotides (described above) can be used for template switching applications such as adding tags to polynucleotide strands. In conjunction with a template switching oligonucleotide, cDNA is synthesized with a known sequence of choice attached to the 3′ end. The resulting cDNA can be amplified by PCR or serve as template for, e.g., 5′ RACE (rapid amplification of cDNA ends) or second strand cDNA synthesis.
A primer used for polynucleotide extension using an RT can contain a feature (e.g., chemical moiety, sequence, modified nucleotide, etc.) for the detection or immobilization of the primer so long as such feature(s) do not destroy the ability of the primer to act as a point of initiation of DNA synthesis. For example, primers can contain an additional nucleic acid sequence at the 5′ end that does not hybridize to the target nucleic acid, but that facilitates cloning or sequencing of the amplified product. A primer can include conventional nucleotides, unconventional nucleotides (e.g., ribonucleotides or labeled nucleotides), nucleotide analogs, and mixtures thereof, as suitable for a particular application. A primer can have a detectable tag (e.g., a fluorescent tag).
Conditions suitable for polynucleotide extension are known in the art (see, e.g., Sambrook et al., supra; see also Ausubel et al., Short Protocols in Molecular Biology (4th ed., John Wiley & Sons 1999)). In general, a reaction mixture for carrying out polynucleotide extension using an RT may include a buffering agent, dNTPs, a divalent cation (e.g., Mg2+, Mn2+, Co2+, Cd2+), a monovalent cation (e.g., a salt such as KCl and/or NaCl), one or more primers, and optionally can include an antibody, antibody-like molecule, an aptamer, or other entity to inhibit the RT or another reaction component under selected conditions (such as temperature or salt concentration). The design of aptamers is well known (see, e.g., Byun J., Life (Basel), 2021 Feb. 28; 11(3):193). A reaction mixture can have a pH range of about 7.5-8.8 and polynucleotide extension can be carried out at a temperature range of about 40° C.-65° C. as described herein below. The RTs described herein are active in a greater temperature range, and particularly at higher temperatures than some commercially available RTs (see, e.g., Example 11).
Target nucleic acids, which can be RNA templates or DNA templates, are substrates for the RTs described herein. An RNA template used herein can be any type of RNA template, e.g., total RNA, polyA+ RNA, capped RNA, enriched RNA. An RNA template can be from any source, e.g., bacteria, mammals, an in vitro transcription reaction, etc., processes for the making of which are known. For example, Examples 2-4 describe use of in vitro transcribed RNA templates; Examples 5 and 6 describe use of cultured cell derived RNA. The RNA template can contain RNA molecules that are at least 1 kb in length, e.g., at least 3 kb, at least 5 kb, at least 8 kb, at least 10 kb, at least 16 kb, at least 20 kb. For example, Example 4 describes using polyA RNA templates of 1, 4, and 8 kb for first strand cDNA synthesis, and Example 9 describes using 16 kb RNA template in a two-step RT-qPCR method. The starting amount of RNA template is determined by the particular application and can be in the range of, for example, 1 pg to 100 ng (e.g., 5 pg to 50 ng) for one-step RT-PCR, or in the range of, for example, 0.5 pg to 5 μg (e.g., 1 pg to 1 μg) for two-step RT-PCR. For example, Example 5 describes using 5 pg-50 ng of Jurkat RNA for performing one step RT-qPCR; Example 7 describes using 1 pg-1 μg for two step RT-PCR; and other starting amounts of RNA templates are described in further examples as well as in published literature. A DNA template used herein can be any type of DNA template, e.g., cDNA resulting from first strand synthesis, e.g., by an RT described herein, double-stranded DNA, hybrid.
The RTs described herein can be used in generating copied DNA products from nucleic acids obtained from diverse sources, such as people, animals and plants, environments (e.g., soils, waters, vehicles, homes, hospitals, airports), food products, forensic and archaeological materials. Thus, as used herein, the term “sample” means a natural or man-made substance suspected of containing a target nucleic acid, such as a biological fluid, cell, tissue, or fraction thereof, food or environmental substance that can contain or be contaminated by a target nucleic acid. A sample can be derived from a prokaryote or eukaryote and therefore can include cells from, for example, animals, plants, or fungi as well as viruses. Accordingly, a sample includes a specimen obtained from one or more individuals or can be derived from such a specimen. For example, a sample can be a tissue section obtained by biopsy, or cells that are placed in or adapted to tissue culture. Exemplary samples include biological specimens such as a cheek swab, nasopharyngeal swab, throat swab, nasopharynx flush through, amniotic fluid, skin biopsy, organ biopsy, tumor biopsy, blood, urine, saliva, semen, sputum, cerebrospinal fluid, tears, mucus, and the like. A sample can be further fractionated, if desired, to a fraction containing particular cell types. For example, a blood sample can be fractionated into serum or into fractions containing particular types of blood cells. If desired, a sample can be a combination of samples from an individual such as a combination of a tissue and fluid, or a combination of samples from more than one individual (e.g., pooled samples, maternal sample containing fetal nucleic acid). Prior to analysis, a sample can be processed to preserve the integrity of nucleic acid targets. Such methods include the use of appropriate buffers and/or inhibitors, including nuclease, protease, and phosphatase inhibitors, which preserve or minimize changes in the molecules in the sample, including tissue fixatives (e.g., in the case of FFPE preserved tissues).
A sample can contain more than one target nucleic acid, either naturally or due to combination prior to analysis. Thus, the methods employing an RT described herein can be used to detect more than one target nucleic acid simultaneously (e.g., at least two, more than two, at least five, more than five, at least ten, or more than ten). For example, Example 8 describes a 5-plex two-step RT-qPCR method. A sample can contain a substance that can inhibit reverse transcriptase activity of some RTs under reaction conditions in which an RT described herein has activity, such as an amount of salt, chemical, dyes, lysis reagents, environment and animal sample components and other substances.
In some embodiments, the method further involves quantifying the amount of the copies of the target nucleic acid.
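By way of illustration only, quantification by qPCR is commonly performed against a standard curve of Cq versus the base-10 logarithm of input amount. The following is a minimal sketch of that general approach, not a description of the analysis used in the Examples. The function names and the dilution-series values are hypothetical and were chosen only to make the arithmetic easy to follow; the efficiency relation E = 10^(-1/slope) - 1 is the standard one for such curves.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def amplification_efficiency(slope):
    """Efficiency from the standard-curve slope (1.0 means 100%)."""
    return 10 ** (-1.0 / slope) - 1.0


def quantity_from_cq(cq, slope, intercept):
    """Interpolate an input amount from a measured Cq."""
    return 10 ** ((cq - intercept) / slope)


# Hypothetical 10-fold dilution series (input in pg) and Cq values;
# these are illustrative numbers, not data from the Examples.
inputs_pg = [1e6, 1e5, 1e4, 1e3, 1e2]
cqs = [15.0, 18.32, 21.64, 24.96, 28.28]

log_inputs = [math.log10(q) for q in inputs_pg]
slope, intercept = fit_line(log_inputs, cqs)
efficiency = amplification_efficiency(slope)
estimated_pg = quantity_from_cq(21.64, slope, intercept)

print(f"slope={slope:.3f}, efficiency={efficiency:.1%}, estimate={estimated_pg:.0f} pg")
```

A slope near -3.32 corresponds to an efficiency near 100%, i.e., approximate doubling of product per cycle.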
In some embodiments, the RT is tolerant of one or more of a high salt condition, a chemical, a dye, a lysis reagent, an environmental component, an animal sample component, or other sample contaminant or reaction component, as is described herein. In some embodiments, the RT is salt-tolerant.
In some embodiments, the RT is in a composition (e.g., reaction mixture) comprising one or more salts. For example, the salt(s) may be selected from potassium (e.g. KCl), magnesium (e.g. MgCl2), lithium (e.g. LiCl), and sodium (e.g. NaCl, Na citrate, Na acetate, or Na heparin). For example, in some embodiments, the RT is in a composition (e.g., reaction mixture) comprising any one or more salt selected from potassium (e.g. KCl) at a concentration of up to 400 mM, optionally 20 mM to 400 mM, 40 mM to 300 mM, 150 mM to 250 mM (e.g. 250 mM KCl); magnesium (e.g. MgCl2) at a concentration of up to 12 mM, optionally 1 mM to 12 mM, 1.5 mM to 6 mM, or 8 mM to 12 mM; lithium (e.g. LiCl) at a concentration of up to 250 mM, optionally 150-250 mM; NaCl at a concentration up to 250 mM, optionally 150-250 mM; Na citrate at a concentration of up to 50 mM, optionally 25-50 mM; Na acetate at a concentration of up to 175 mM, optionally 125-175 mM; and Na heparin at a concentration of up to 0.01 mM, optionally 0.0025 to 0.01 mM. As demonstrated herein, an exemplary RT of the invention has reverse transcriptase activity in the presence of any of these salt concentrations.
In some embodiments, the RT composition (e.g., reaction mixture) comprises a buffer. For example, the composition (e.g., reaction mixture) is at a pH in the range of 7.5 to 8.5 (e.g. pH 8.5). An RT of the invention has reverse transcriptase activity within this pH range.
In some embodiments, the reaction mixture can be incubated at a temperature in the range of 35° C. to 65° C., e.g., at a temperature of 42° C. to 65° C., optionally at a temperature of about 55° C. or in the range of 55° C. to 65° C. In some embodiments, the reaction mixture can be incubated at a temperature of at least 42° C., 45° C., 48° C., 51° C., 54° C., 57° C., 60° C., or 65° C.; or at a temperature in the range of 42° C. to 45° C., 48° C. to 51° C., 51° C. to 54° C., 54° C. to 57° C., 57° C. to 60° C., or 60° C. to 65° C. Example 1 demonstrates that exemplary RTs described herein have RT activity at 58° C. and 62° C. Example 11 describes the activity of an exemplary RT at temperatures between 42° C. and 65° C.
As is described herein, an RT disclosed herein is capable of synthesizing high yields of cDNA product from an RNA template (such as an RNA template of a size of at least 1 kb, at least 4 kb, at least 8 kb, at least 16 kb) in short reaction times, such as a time period selected from 10 minutes, 9 minutes, 8 minutes, less than 10 minutes, less than 9 minutes, and less than 8 minutes.
In some embodiments, the RT has reverse transcriptase activity in a reaction mixture comprising KCl at a concentration of 40 mM to 300 mM (e.g. 40-120 mM) and MgCl2 at a concentration of 1.5 mM to 6 mM; optionally wherein the pH is in the range of 7.5 to 8.5 and/or the reaction temperature is about 55° C.
For example, a general-purpose reaction mixture suitable for generating long cDNA (e.g., 8-16 kb) using an RT of the invention comprises KCl at a concentration of about 120 mM and MgCl2 at a concentration of about 3 mM; and optionally further comprises about 0.02% Tween-20 and/or has a pH of 8.5. In some specific embodiments, an RT as disclosed herein is capable of synthesizing high yields of a long (e.g. about 8 kb or about 16 kb) cDNA product from an RNA template within 10 minutes in this general purpose reaction mixture at about 55° C.
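By way of illustration only, the volumes of stock solutions needed to assemble the general-purpose reaction mixture described above can be computed from the dilution relation C1·V1 = C2·V2. In the following minimal sketch, the final concentrations (about 120 mM KCl, about 3 mM MgCl2, about 0.02% Tween-20) are taken from the text; the stock concentrations, the 20 μl reaction volume, and the function name are assumptions made for the example only.

```python
def stock_volumes(final_conc, stock_conc, reaction_volume_ul):
    """Return the volume (in microliters) of each stock, using C1*V1 = C2*V2."""
    volumes = {}
    for name, final in final_conc.items():
        stock = stock_conc[name]
        if final > stock:
            raise ValueError(f"stock for {name} is more dilute than the target")
        volumes[name] = final * reaction_volume_ul / stock
    return volumes


# Final concentrations from the general-purpose mixture described in the text.
final_conc = {"KCl_mM": 120.0, "MgCl2_mM": 3.0, "Tween20_pct": 0.02}

# Hypothetical stock concentrations and reaction volume (assumptions).
stock_conc = {"KCl_mM": 1000.0, "MgCl2_mM": 100.0, "Tween20_pct": 1.0}
reaction_volume_ul = 20.0

volumes = stock_volumes(final_conc, stock_conc, reaction_volume_ul)
for name, vol in volumes.items():
    print(f"{name}: {vol:.2f} ul")
```

The remaining volume would be made up by buffer, dNTPs, primer, template, enzyme, and water, which are not modeled here.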
In some embodiments, the RT composition (e.g., reaction mixture) comprises one or more commonly used chemical inhibitors, dyes, or cell lysis substances; for example, any one or more of Tween (e.g. Tween 20), ethanol, isopropanol, DMSO, SDS, urea, guanidine isothiocyanate, DTT, hydrogen peroxide, SYBR® Green II, SYBR Gold, formalin, Solulyse reagent, BugBuster® reagent, and Luna Cell Ready reagent. For example, the RT composition (e.g., reaction mixture) may comprise Tween 20 at a concentration of up to 0.04 mM (e.g. 0.01-0.04 mM), ethanol at a concentration of up to 0.1 mM (e.g. 0.1 mM), isopropanol at a concentration of up to 0.1 mM (e.g. 0.1 mM), DMSO at a concentration of up to 0.15 mM (e.g. 0.05-0.15 mM), SDS at a concentration of up to 5e-05 mM (e.g. 1e-05 to 5e-05 mM), urea at a concentration of up to 2000 mM (e.g. 1000-2000 mM), guanidine isothiocyanate at a concentration of up to 75 mM (e.g. 75 mM), DTT at a concentration of up to 100 mM (e.g. 10-100 mM), and/or hydrogen peroxide at a concentration of up to 50 mM (e.g. 25-50 mM). For example, the RT composition (e.g., reaction mixture) may comprise SYBR Green II dye at a concentration of up to 50 mM (e.g. 10-50 mM), SYBR Gold dye at a concentration of up to 20 mM (e.g. 10-20 mM), formalin at a concentration of up to 0.001 mM (e.g. 0.0002-0.001 mM), Solulyse reagent at a concentration of up to 0.2 mM (e.g. 0.1-0.2 mM), BugBuster reagent at a concentration of up to 0.02 mM (e.g. 0.01-0.02 mM), and/or Luna Cell Ready reagent at a concentration of up to 0.02 mM (e.g. 0.1-0.2 mM). As demonstrated herein, an exemplary RT of the invention has reverse transcriptase activity in the presence of these components at the recited concentrations/concentration ranges.
In some embodiments, the RT composition (e.g., reaction mixture) comprises one or more environmental or animal sample components, optionally tannic acid, hemin, hematin, humic acid, melanin, hemoglobin, and myoglobin. For example, the RT composition (e.g., reaction mixture) may comprise tannic acid at a concentration of up to 0.01 mM (e.g. 2e-05 to 0.01 mM), hemin at a concentration of up to 10 mM (e.g. 5-10 mM), hematin at a concentration of up to 20 mM (e.g. 2-20 mM), humic acid at a concentration of up to 1 mM (e.g. 0.5-1 mM), melanin at a concentration of up to 1 mM (e.g. 0.5-1 mM), hemoglobin at a concentration of up to 5 mM (e.g. 1-5 mM), and/or myoglobin at a concentration of up to 5 mM (e.g. 1-5 mM). As demonstrated herein, an exemplary RT of the invention has reverse transcriptase activity in the presence of these components at the recited concentrations/concentration ranges.
Also provided by the present disclosure are kits for using an RT described herein. A kit can include one or more RTs described herein together with one or more other components useful for carrying out a method involving RNA polynucleotide extension, and/or DNA polynucleotide extension, and/or template switching.
A kit can therefore contain components for polynucleotide extension (e.g., amplification of a nucleic acid target), and/or template switching (e.g., amplification of a nucleic acid target with addition of sequences via a template switching oligonucleotide such as for sequencing library preparation), such as dNTPs. A kit containing dNTPs can include one, two, three, or all four of dATP, dTTP, dGTP and dCTP, and can include one or more modified dNTPs, such as forms that are resistant to, or susceptible to, a particular enzymatic or chemical conversion, or that are detectable. Examples of modified dNTPs include alpha-phosphorothioate dNTPs, dUTP, dITP, labeled dNTPs such as, e.g., fluorescein- or cyanine-dye family dNTPs. A kit can include radiolabeled dNTPs, such as 3H-dTTP.
A kit can include a reaction mixture in any convenient form, such as in solution, concentrated form, dried form, disposed in, on, or within a solid support (e.g., a tube, plate, pellet, membrane, bead). In an embodiment, one or more components is associated with (e.g., covalently or non-covalently attached to) a solid support. Such a reaction mixture can contain components useful for enabling use of the RT in a particular assay format, e.g., to promote a particular aspect of the RT enzymatic activity, a molecular interaction, a stability profile, and other desirable properties. Accordingly, a reaction mixture can contain one or more salts, detergents (ionic, non-ionic, zwitterionic), poloxamers, preservatives, inhibitors of unwanted activities, crowding agents, reducing agents (e.g., DTT), catalysts, dyes, and other substances. In an embodiment, the reaction mixture is suitable for receiving and amplifying a nucleic acid template in the presence of the RT and one or more primers.
In an embodiment, the RT is in a form selected from: dried, lyophilized, and in solution, wherein the solution is optionally glycerol-free. In some embodiments, a kit includes one or more oligonucleotides that bind to a predetermined nucleic acid template. In the case of oligonucleotide primers, the primers can be for an RNA template or DNA template for polynucleotide extension. In some embodiments, the kit includes one or more oligo dT primers. In some embodiments, a kit does not include primers or includes a limited number of primers, in instances where the kit user provides primers appropriate for their selected target. In some embodiments, a kit includes a control primer (e.g., rActin control). In some embodiments, a kit includes target-specific primers (e.g., to detect a pathogen). In some embodiments, a kit includes one or more of an oligo dT primer, a random primer, and a target specific primer such as a gene-specific primer. If desired, a primer can be exonuclease-resistant and/or chemically modified.
A kit can include an aptamer, e.g., for binding to an RT to control the conditions under which the RT has activity (e.g., to reduce off-target binding/amplification), for binding to another component in the kit (e.g., another enzyme such as a DNA polymerase), or to control the activity of a sample constituent (e.g., to inhibit RNase).
A kit can include a probe, e.g., for detecting copied DNA, such as when performing qPCR. A kit can also include instructions for practicing a desired method (e.g., amplifying a target nucleic acid, detecting a target nucleic acid, sequencing library preparation) via any communication means. For example, the instructions can be printed (e.g., on paper or plastic), and/or electronic (e.g., provided on a device such as a portable drive, or remotely accessible such as on a web application, phone application, video, or voice transmission), and/or via demonstration.
A kit can include one or more other enzymes, as suitable for a particular purpose. For example, a kit can contain a UDG enzyme for reducing carryover contamination in PCR assays, which can lead to false positives. A UDG can be employed in a workflow to degrade amplified DNA that contains Us, leaving native target nucleic acids intact. Thus, UDG treatment is often performed as a treatment prior to DNA amplification. A kit can include a DNA polymerase (e.g., Taq polymerase) for amplification of target nucleic acids, e.g., for DNA target amplification in RT-PCR.
A kit can include a control, such as a control polynucleotide, which can be a plasmid or linear nucleic acid, depending on the application.
Components of a kit can be provided in a single container or compartment (e.g., for a single step use) or multiple containers or compartments (e.g., for combining, for sequential use, for parallel use, or another desired workflow). A kit can include a sample collection container, which optionally can contain a reagent, e.g., for stabilizing the sample (e.g., a poloxamer) or preparing it for assay.
A kit can contain an RT described herein and a reaction mixture. The reaction mixture can contain a buffering agent and can include a cationic salt as well as other components described above. For example, a kit can further include one or more components selected from dNTPs, a primer, an aptamer, and a detergent.
A kit can contain an RT that is thermostable at a particular temperature, as described herein above such as in Example 11, e.g., to enable use of reaction mixtures that will contain an RT during a heating process (e.g., heat lysis of cells, heat denaturation of nucleic acids).
Embodiment 1. An engineered reverse transcriptase comprising: an amino acid sequence having at least 90% identity with SEQ ID NO:1; or an amino acid sequence having at least 90% identity with SEQ ID NO:2.
Embodiment 2. The reverse transcriptase of embodiment 1, wherein the amino acid sequence has at least 95% sequence identity with SEQ ID NO:1 or SEQ ID NO:2.
Embodiment 3. The reverse transcriptase of any prior embodiment, wherein the amino acid sequence is identical to SEQ ID NO:1 or SEQ ID NO:2.
Embodiment 4. A reverse transcriptase of embodiment 1, comprising: an amino acid sequence having at least 90% identity with SEQ ID NO:1, wherein the amino acid sequence has the following amino acid substitutions: an R at position 191, an R at position 446, a K at position 510, and a K at position 589, wherein the positions correspond to positions in SEQ ID NO:1 preceded by an M.
Embodiment 5. A reverse transcriptase of embodiment 1, comprising: an amino acid sequence having at least 90% identity with SEQ ID NO:2, wherein the amino acid sequence has the following amino acid substitutions: a Q to R at position 191, an E to R at position 445, an E to K at position 508, and a D to K at position 587, wherein the positions correspond to positions in SEQ ID NO:2 preceded by an M.
Embodiment 6. The reverse transcriptase of any prior embodiment, wherein the reverse transcriptase is a fusion protein comprising: (i) a polymerase domain comprising the amino acid sequence; and (ii) an exogenous sequence.
Embodiment 7. The reverse transcriptase of embodiment 6, wherein the exogenous sequence comprises a purification tag.
Embodiment 8. A composition comprising a reverse transcriptase of any prior embodiment.
Embodiment 9. The composition of embodiment 8, further comprising one or more of: dNTPs, a buffering agent, an oligonucleotide, and an aptamer.
Embodiment 10. The composition of embodiment 8, wherein the reverse transcriptase is associated with a solid support.
Embodiment 11. The composition of embodiment 8, wherein the composition is in a form selected from lyophilized, dried, and in solution.
Embodiment 12. A kit comprising: (i) a reverse transcriptase of any of embodiments 1-7; and (ii) a reaction mixture.
Embodiment 13. The kit of embodiment 12, wherein the reaction mixture comprises a buffering agent.
Embodiment 14. The kit of any prior embodiment, wherein the reaction mixture further comprises a cationic salt.
Embodiment 15. The kit of any prior embodiment, further comprising one or more components selected from dNTPs, a primer, an aptamer, a detergent, a template switching oligonucleotide.
Embodiment 16. The kit of any prior embodiment, wherein the reverse transcriptase is in a form selected from lyophilized, dried, and in solution.
Embodiment 17. The kit of any prior embodiment, wherein one or more components is associated with a solid support.
Embodiment 18. A method comprising: incubating a reaction mixture comprising (i) a reverse transcriptase of any of embodiments 1-7, (ii) a primer, (iii) a target nucleic acid, and (iv) dNTPs, under conditions suitable for polynucleotide extension of the target nucleic acid to generate copied DNA.
Embodiment 19. The method of any embodiment herein, wherein the incubating is done at a temperature of between 55° C.-65° C.
Embodiment 20. The method of any embodiment herein, wherein the primer is selected from an oligo (dT) primer, a random primer, a gene-specific primer.
Embodiment 21. The method of any embodiment herein, wherein the target nucleic acid is an RNA template.
Embodiment 22. The method of any embodiment herein, wherein the target nucleic acid is a DNA template.
Embodiment 23. The method of embodiment 18, wherein the target nucleic acid is an RNA template, and further comprising incubating the generated copied DNA with an enzyme to generate second copied DNA, optionally wherein the enzyme is a reverse transcriptase of any of embodiments 1-7.
Embodiment 24. The method of embodiment 18, wherein generation of copied DNA is detected using a quantitative method.
Embodiment 25. The method of embodiment 21, wherein the RNA template is of a size of at least about 1 kb, and full length copied DNA is generated within a time selected from: 10 minutes, 9 minutes, and 8 minutes.
Embodiment 26. A reverse transcriptase comprising: an amino acid sequence having at least 90% identity with SEQ ID NO:1; or an amino acid sequence having at least 90% identity with SEQ ID NO:2.
Embodiment 27. The reverse transcriptase of embodiment 26, wherein the amino acid sequence has at least 95% sequence identity with SEQ ID NO:1 or SEQ ID NO:2.
Embodiment 28. The reverse transcriptase of Embodiment 26 or 27, wherein the amino acid sequence is identical to SEQ ID NO:1 or SEQ ID NO:2.
Embodiment 29. The reverse transcriptase of embodiment 26, comprising:
Embodiment 30. The reverse transcriptase of embodiment 29, wherein the amino acid sequence contains
Embodiment 31. The reverse transcriptase of embodiment 26, comprising:
Embodiment 32. The reverse transcriptase of embodiment 31, wherein the amino acid sequence contains
Embodiment 33. The reverse transcriptase of any preceding embodiment, wherein the reverse transcriptase is a fusion protein comprising: (i) a polymerase domain comprising the amino acid sequence; and (ii) an exogenous amino acid sequence.
Embodiment 34. A reverse transcriptase fusion protein comprising:
Embodiment 35. The reverse transcriptase of embodiment 33 or 34, wherein the exogenous amino acid sequence comprises a purification tag.
Embodiment 36. A composition comprising a reverse transcriptase of any preceding embodiment.
Embodiment 37. The composition of embodiment 36, further comprising one or more of: dNTPs, a buffering agent, an oligonucleotide, an aptamer, a template switching oligonucleotide, a DNA polymerase, an RNA template and a DNA template.
Embodiment 38. The composition of embodiment 36, wherein the reverse transcriptase is associated with a solid support.
Embodiment 39. The composition of embodiment 36, wherein the composition is in a form selected from lyophilized, dried, and in solution.
Embodiment 40. A kit comprising:
Embodiment 41. The kit of embodiment 40, wherein the reverse transcriptase is in a form selected from lyophilized, dried, and in solution.
Embodiment 42. The kit of embodiment 40 or embodiment 41, wherein one or more components is associated with a solid support.
Embodiment 43. A method comprising:
Embodiment 44. The method of embodiment 43, wherein the incubating is done at a temperature of between 55° C.-65° C.
Embodiment 45. The method of any embodiment herein, wherein the primer is selected from an oligo (dT) primer, a random primer, a target-specific primer.
Embodiment 46. The method of any embodiment herein, wherein the target nucleic acid is RNA.
Embodiment 47. The method of any embodiment herein, wherein the target nucleic acid is DNA.
Embodiment 48. The method of embodiment 43, wherein the target nucleic acid is RNA, and further comprising incubating the generated copied DNA with an enzyme to generate second copied DNA, optionally wherein the enzyme is a reverse transcriptase of any of embodiments 1-7.
Embodiment 49. The method of embodiment 43, further comprising quantifying the amount of the copies of the target nucleic acid.
Embodiment 50. The method of embodiment 46, wherein the RNA template is of a size of at least about 1 kb, and full length copied DNA is generated within a time period selected from 10 minutes, 9 minutes, 8 minutes, less than 10 minutes, less than 9 minutes, and less than 8 minutes.
Embodiment 51. The method of embodiment 50, wherein the full length copied DNA is generated within 9 minutes.
Embodiment 52. The method of embodiment 50, wherein the full length copied DNA is generated within 8 minutes.
Embodiment 53. The RT of embodiment 30, wherein the RT comprises SEQ ID NO:3.
Embodiment 54. The RT of Embodiment 32, wherein the RT comprises SEQ ID NO:4.
The skilled artisan will understand that the figures, described above, and examples, described below, are for illustration purposes only. Neither the figures nor the examples are intended to limit the scope of the disclosed teachings in any way.
Engineered reverse transcriptases ERT1 (SEQ ID NO:1), ERT2 (SEQ ID NO:2), an exemplary variant of ERT1 (ERT3 (SEQ ID NO:3)), and an exemplary variant of ERT2 (ERT4 (SEQ ID NO:4)) were expressed as fusion proteins containing the following sequence at their N termini: MGKIEESKHHHHHHGS (SEQ ID NO:8). For clarity, this sequence was used for ease of purification and can be omitted or replaced with an alternative purification tag (many purification tags are described herein above and/or described in the literature). The fusion proteins are used in all experiments that follow.
Recombinantly expressed and purified engineered reverse transcriptase fusion proteins ERT1 and ERT2 and their respective variants, ERT3 and ERT4, efficiently synthesize cDNA product under stringent reaction conditions. A commercially available reverse transcriptase (CAV RT1) (M3025, New England Biolabs, Inc.) and ERT1-4 were expressed, purified, and assayed for cDNA synthesis via extension of a fluorescently labeled DNA primer annealed to an in vitro transcribed 450 bp RNA template. Assays were performed in 50 mM Tris-HCl [pH 8.5], 250 mM KCl (high salt condition), 3 mM MgCl2, and 0.02% Tween-20. Briefly, reactions containing 100 nM primer:template were initiated with 80 nM enzyme, incubated at 58° C. for 10 minutes, and stopped by heat inactivation at 95° C. for 5 minutes. Results were visualized by capillary electrophoresis and analyzed by PeakScanner software, and full-length product was reported relative to total integrated signal. Under these stringent conditions, ERT1-4 all displayed significantly more cDNA synthesis than CAV RT1.
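By way of illustration only, reporting full-length product relative to total integrated signal amounts to dividing the integrated area of the full-length peak by the summed area of all peaks in the trace. The following is a minimal sketch of that calculation. The peak list, the size tolerance, and the function name are hypothetical; the sketch does not reproduce the PeakScanner export format or any result from this Example.

```python
def full_length_fraction(peaks, full_length_nt, tolerance_nt=2.0):
    """Fraction of total integrated signal found in the full-length peak.

    peaks: list of (size_nt, area) tuples from an electropherogram.
    """
    total = sum(area for _, area in peaks)
    if total <= 0:
        raise ValueError("total integrated signal must be positive")
    full = sum(
        area for size, area in peaks
        if abs(size - full_length_nt) <= tolerance_nt
    )
    return full / total


# Hypothetical trace: unextended primer, a stalled product, and
# full-length product on a 450-nucleotide template (illustrative only).
example_peaks = [(25.0, 1000.0), (200.0, 500.0), (450.0, 8500.0)]
fraction = full_length_fraction(example_peaks, full_length_nt=450.0)
print(f"full-length product: {fraction:.1%} of total signal")
```

Normalizing to total signal makes the value comparable between reactions that differ in loading or injection efficiency.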
As shown in Table 1 and
The engineered Reverse Transcriptase 1 (ERT1) can efficiently synthesize full-length cDNA product within 10 minutes at 55° C. In vitro transcribed poly(A) 1 kb or 4 kb RNA templates were used to investigate full-length cDNA synthesis in a variety of buffers with pH ranging from 7.5 to 8.5, KCl concentration ranging from 40 mM to 300 mM, and MgCl2 concentration from 1.5 to 6 mM. After first-strand cDNA synthesis, an aliquot of the cDNA products was used to make full-length ds cDNA in the presence of a 5′ specific primer. Equal volume of ds cDNA was analyzed on an agarose gel.
Results are shown in
Tolerance of ERT1 to various salts that can inhibit RT reactions, including concentrations of KCl, NaCl, MgCl2, LiCl, Na citrate, Na acetate, and Na heparin, was examined. Results are shown in
Tolerance of ERT1 to various substances sometimes present in samples to be treated with an RT, including Tween 20, ethanol, isopropanol, DMSO, SDS, urea, guanidine isothiocyanate, DTT and hydrogen peroxide, was examined. Results are shown in
Tolerance of ERT1 to various dyes and cell lysis substances, including SYBR Green II, SYBR Gold, formalin, Solulyse reagent, BugBuster reagent, and Luna Cell Ready reagent, was examined. Results are shown in
Tolerance of ERT1 to various environmental and animal sample components, including tannic acid, hemin, hematin, humic acid, melanin, hemoglobin, and myoglobin, was examined. Results are shown in
ERT1 can efficiently synthesize 1 kb, 4 kb as well as 8 kb full-length cDNA product within 10 minutes at 55° C. In vitro transcribed poly(A) 1 kb, 4 kb or 8 kb RNA templates were used to investigate full-length cDNA synthesis in a variety of buffers with pH ranging from 8 to 8.5, KCl concentration ranging from 40 mM to 120 mM, and MgCl2 concentration from 1.5 mM to 6 mM (see
ERT1 was used in a One-Step RT-qPCR reaction with probe-based detection. Results are shown in
ERT1 was used in a Two-Step RT-qPCR reaction (25° C./2 minutes, 55° C./10 minutes, 95° C./1 minute), followed by probe-based detection. Results are shown in
ERT1 was used in an 8-minute Two-Step RT-qPCR reaction (25° C./1 minute, 60° C./5 minutes, 95° C./2 minutes), followed by probe-based detection. Results are shown in
ERT1 was used in an 8-minute Two-Step RT-qPCR reaction (25° C./1 minute, 60° C./5 minutes, 95° C./2 minutes), followed by probe-based 5-plex detection. cDNA synthesis was performed with the RNA input across 5-log (1 μg to 100 pg). The targets were actin, glyceraldehyde-3-phosphate dehydrogenase (GAPDH), ribosomal protein L32 (L32g), succinate dehydrogenase complex, subunit A (SDHA), and SMG1 nonsense mediated mRNA decay associated PI3K related kinase (SMG). Amplicons were relatively small, with the longest being GAPDH at 226 bp. 1 μl cDNA from each reaction was then quantitated by qPCR using Luna Universal Probe qPCR Master Mix (M3004, New England Biolabs, Inc.). Cqs were plotted against the RNA input across 5-log. Cq average and efficiency were also calculated through the 5-log and shown in
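The efficiency calculation mentioned above is not spelled out in the text. A common approach, assumed here, is to fit a line to Cq versus log10 of RNA input and convert the slope to an amplification efficiency with E = 10^(-1/slope) - 1. The sketch below uses made-up Cq values for a 10-fold dilution series; the function name `standard_curve` is hypothetical.

```python
import math


def standard_curve(inputs_ng, cqs):
    """Least-squares fit of Cq against log10(input); returns slope, intercept, efficiency."""
    xs = [math.log10(v) for v in inputs_ng]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(cqs) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, cqs))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # An efficiency of 1.0 corresponds to perfect doubling in every cycle.
    efficiency = 10 ** (-1.0 / slope) - 1.0
    return slope, intercept, efficiency


# Hypothetical data: 1 ug down to 100 pg in 10-fold steps, Cq values invented for illustration.
inputs_ng = [1000.0, 100.0, 10.0, 1.0, 0.1]
cqs = [15.00, 18.32, 21.64, 24.96, 28.28]
slope, intercept, efficiency = standard_curve(inputs_ng, cqs)
print(f"slope={slope:.2f}, efficiency={efficiency * 100:.1f}%")
```

A slope near -3.32 corresponds to an efficiency near 100%, so the fitted slope gives a quick check of whether cDNA yield scales linearly with RNA input across the dilution series.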
ERT1 was used in an 8-minute Two-Step RT-qPCR reaction (25° C./1 minute, 60° C./5 minutes, 95° C./2 minutes), followed by dye-based detection. cDNA synthesis was performed with the RNA input across 4-log (1 μg to 1 ng). 1 μl cDNA was then quantitated by qPCR using Luna Universal qPCR Master Mix (M3003, New England Biolabs, Inc.). Typical amplification curves for the amplicon HerC are shown in
Template switching reactions were performed using components from the NEBNext® Single Cell/Low Input cDNA Synthesis & Amplification Module (E6421, New England Biolabs). In this protocol, reverse transcription is performed using a reverse transcription primer containing a 3′ poly(T) sequence and a 5′ tail sequence that allows for annealing of a PCR primer. A template switching oligo (TSO) is used to incorporate a known sequence to the 3′ end of the cDNA for PCR priming when template switching is performed. Amplification of full-length cDNA products is only observed when the sequence of the TSO is incorporated into the cDNA strand.
Reactions were performed using 4 ng of Universal Human Reference RNA (740000, Agilent) containing SIRV-Set 4 spike-in RNAs (141, Lexogen), enriched for poly(A)-containing RNA using the NEBNext® High Input Poly(A) mRNA Isolation Module (E3370, New England Biolabs). Reverse transcription reactions were performed in 1× Template Switching RT Buffer (B0466, New England Biolabs) containing Murine RNase Inhibitor (M0314, New England Biolabs), T4 Gene 32 Protein (M0300, New England Biolabs), NEBNext® Single Cell RT Primer Mix (E6422, New England Biolabs, 5′-AAGCAGTGGTATCAACGCAGAGTACTTTTTTTTTTTV-3′) (SEQ ID NO:5), NEBNext® Template Switching Oligo (E6424, New England Biolabs, 5′-GCTAATCATTGCAAGCAGTGGTATCAACGCAGAGTACATrGrGrG-3′) (SEQ ID NO:6), and reverse transcriptase. Template Switching RT Enzyme Mix (M0466, New England Biolabs) served as a positive control for template switching activity (TS Control). Reverse transcription reactions were performed for 90 minutes at the indicated temperature, followed by heat denaturation at 85° C. for 5 minutes. Amplification of cDNA was performed for 10 cycles following manufacturer protocol using primers with the sequence 5′-AAGCAGTGGTATCAACGCAGAGT-3′ (SEQ ID NO:7). cDNA products were analyzed using Agilent® High Sensitivity D5000 ScreenTape® System (5067, Agilent) on an Agilent 4200 TapeStation® (G2991, Agilent).
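The priming logic described above can be checked directly from the listed sequences. The sketch below is an illustration rather than part of the protocol: it confirms that the amplification primer (SEQ ID NO:7) matches the 5′ tail of the RT primer (SEQ ID NO:5) and is also contained in the template switching oligo (SEQ ID NO:6), so a cDNA that has incorporated the TSO sequence carries a primer site at both ends. The helper names are hypothetical, and the `r` prefix is assumed to mark ribonucleotides.

```python
# Sequences copied from the text above (5' to 3').
RT_PRIMER = "AAGCAGTGGTATCAACGCAGAGTACTTTTTTTTTTTV"
TSO = "GCTAATCATTGCAAGCAGTGGTATCAACGCAGAGTACATrGrGrG"
PCR_PRIMER = "AAGCAGTGGTATCAACGCAGAGT"

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def strip_ribo_marks(seq):
    """Drop the 'r' prefix assumed to mark ribonucleotides (rG -> G)."""
    return seq.replace("r", "")


def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


tso_dna = strip_ribo_marks(TSO)

# 5' end of the cDNA: the RT primer tail begins with the PCR primer sequence.
tail_matches = RT_PRIMER.startswith(PCR_PRIMER)

# 3' end of the cDNA after template switching: the complement of the TSO,
# which contains the site the same PCR primer anneals to.
cdna_3prime_end = reverse_complement(tso_dna)
switch_site_present = reverse_complement(PCR_PRIMER) in cdna_3prime_end

print(tail_matches, switch_site_present)
```

Because the second primer site exists only when the TSO sequence has been copied into the cDNA, amplification with this single primer acts as a readout of template switching, consistent with the statement that full-length products are observed only in that case.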
As shown in
Experiments were performed to determine whether template switching activity of ERT1 is altered by the presence of T4 Gene 32 Protein (GP32). As is depicted in
Sequencing libraries were prepared from cDNA using NEBNext Ultra II FS DNA Library Prep Kit for Illumina (E7805, New England Biolabs) and sequenced on an Illumina NextSeq® 500 (paired-end, 75 bases). Reads were aligned to the hg38 reference genome using RNA STAR and transcript coverage was calculated from the 1,000 most-abundant transcripts using the CollectRnaSeqMetrics (Picard) tool. Shown is the average of two replicates.
Temperature dependence of ERT1 and commercially available RT (CAV RT1) was determined by measuring full-length cDNA product formation. Synthesis reactions were assembled with 300 nM 1 kb RNA, 450 nM of DNA primer, 500 μM dNTPs, 1 U/μl RNase Inhibitor (NEB M0314) and 1× optimal buffer. Reactions were initiated with 37.5 nM of enzyme and incubated for 30 minutes at 42° C., 43.6° C., 46.3° C., 50.7° C., 55.9° C., 60.2° C., 63.1° C., or 65° C. Full-length cDNA products were measured with 450 nM of a FAM-IABkFQ molecular beacon that emits fluorescence (excitation: 480/20 nm, emission: 540/20 nm) upon annealing to the 3′ terminus of the full-length cDNA product. Activity was calculated from duplicate readings and normalized according to the maximum fluorescence reading of each enzyme. As shown in
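The normalization described above can be sketched as follows. The fluorescence values are invented, and averaging the duplicate readings before normalizing is an assumption, since the text states only that activity was calculated from duplicates and normalized to each enzyme's maximum reading. The function name `normalized_activity` is hypothetical.

```python
def normalized_activity(readings):
    """Average replicate readings per temperature and scale to the enzyme's maximum.

    readings: dict mapping incubation temperature (deg C) to a tuple of
    replicate fluorescence readings for one enzyme.
    """
    means = {temp: sum(reps) / len(reps) for temp, reps in readings.items()}
    peak = max(means.values())
    if peak <= 0:
        raise ValueError("maximum fluorescence must be positive")
    return {temp: mean / peak for temp, mean in means.items()}


# Hypothetical duplicate readings for one enzyme at three of the tested temperatures.
example_readings = {
    42.0: (100.0, 120.0),
    55.9: (400.0, 440.0),
    65.0: (200.0, 220.0),
}
profile = normalized_activity(example_readings)
for temp in sorted(profile):
    print(f"{temp:5.1f} C  {profile[temp]:.2f}")
```

Normalizing each enzyme to its own maximum compares the shape of the temperature profiles rather than absolute yields, so the resulting curves show where each enzyme works best, not which enzyme makes more product.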
While the present invention has been described with reference to the specific embodiments thereof, it should be understood by those skilled in the art that various changes may be made, and equivalents may be substituted without departing from the true spirit and scope of the invention. In addition, many modifications may be made to adapt a particular situation, material, composition of matter, process, process step or steps, to the objective, spirit, and scope of the present invention. All such modifications are intended to be within the scope of the claims appended hereto.
This application claims the benefit of provisional application Ser. No. 63/622,710, filed on Jan. 19, 2024, which application is incorporated herein in its entirety.
Number | Date | Country
---|---|---
63622710 | Jan 2024 | US