Synthetic oligonucleotides are crucial to many aspects of biotechnological research in both academic and industrial settings. Despite the high demand for longer, cheaper, and error-free oligonucleotides, current industrial leaders have not addressed the many limitations of traditional chemical synthesis methods developed decades ago. This is especially true for de novo RNA oligonucleotide synthesis, which remains largely inaccessible to those heavily invested in furthering genome engineering technologies, RNA-based diagnostics and therapeutics, RNA-based sequencing technologies, high-density nucleic acid-based information storage, and biological computing (Kaczmarek, Kowalski, and Anderson 2017). While there have been some improvements to the methods employed for the chemical synthesis of RNA oligonucleotides, the overall chemistry has changed very little since the late 1970s (Hughes and Ellington 2017). Chemical synthesis is plagued by many complicated reaction steps that require both harsh chemical reagents and biologically incompatible organic solvents. These reaction conditions lead to the depurination of the nucleotide bases, unexpected insertions or deletions from the overall sequence, and the preemptive irreversible capping of the oligonucleotide resulting in unwanted truncated products. This greatly increases the overall error-rate of synthesis, limits the maximum length of RNA oligonucleotides to less than 120 nucleotides, and requires longer lead-times to obtain appreciable yields of a desired product. Furthermore, the chemical synthesis of RNA oligonucleotides is extremely costly; compared to the current cost of DNA oligonucleotide synthesis at $0.1 per base (Carlson 2018), the cost of RNA synthesis is nearly 100-fold greater, not accounting for long or complex RNA oligonucleotides. Addressing the current limitations of RNA oligonucleotide synthesis is therefore important.
Described herein are compounds, enzymes, compositions, systems, kits, and methods for the controlled de novo synthesis of RNA oligonucleotides using enzymatic catalysis. For example, provided herein are methods for preparing RNA oligonucleotides via controlled, template-independent addition of nucleotides to an initiator oligonucleotide's 3′-terminus via enzymatic catalysis (also known as terminal transferase activity). Single nucleotides can be iteratively added by a compatible polymerase (e.g., a poly(N) polymerase such as a poly(U) polymerase) until a desired RNA oligonucleotide sequence is synthesized.
The present disclosure is based on the discovery that certain polymerase enzymes can effectively catalyze template-independent terminal transferase reactions with a variety of modified and unmodified nucleotides. In one aspect, provided herein are methods for the synthesis of RNA oligonucleotides, wherein a poly(N) polymerase incorporates one or more nucleotides onto the 3′-terminus of an initiator oligonucleotide. It has been found that certain polymerases, such as poly(U) and poly(A) polymerases, among others, can catalyze terminal transferase reactions with a diversity of nucleotides, including modified nucleotides. After a poly(N) polymerase incorporates one or more nucleotides via terminal transferase activity, the process can be repeated in one or more iterative steps, optionally with different nucleotides, until a desired RNA oligonucleotide sequence is obtained. Also provided herein are new poly(N) polymerase enzymes (e.g., mutant poly(U) polymerases) that are useful in the methods described herein.
In one aspect, provided herein are methods for preparing an RNA oligonucleotide comprising combining an initiator oligonucleotide, a poly(N) polymerase (e.g., poly(U) polymerase), and one or more modified nucleotides under conditions sufficient for the addition of at least one modified nucleotide to the 3′ end of the initiator oligonucleotide, thereby synthesizing an RNA oligonucleotide. The method may further comprise adding one or more additional nucleotides (modified or unmodified) to the resulting RNA oligonucleotide in iterative steps until a desired RNA oligonucleotide sequence is obtained. Also provided herein are compounds (e.g., modified nucleotides) that are useful in the methods described herein.
Other methods for controlled de novo RNA oligonucleotide synthesis are provided herein. For example, in another aspect, provided herein are methods wherein modified nucleotides (i.e., “reversible terminator oligonucleotides”, e.g., 2′- or 3′-O-protected nucleotides) are incorporated that reversibly alter the binding affinity of the polymerase (e.g., a poly(N) polymerase, such as poly(U)) to the extended initiator oligonucleotide. This change in binding affinity results in the termination of further nucleotide addition, producing a (n+1) oligonucleotide product that can be further extended after the modified group is restored to its natural state (e.g., a 2′- or 3′-OH group) via a mild deprotection chemistry. An “(n+1) oligonucleotide” is a product wherein a single nucleotide has been added to the initiator sequence. These methods are exemplified in the generic scheme shown in
In another aspect, provided herein are methods for RNA oligonucleotide synthesis employing non-hydrolyzable nucleotides. In these methods, the rate at which a polymerase incorporates nucleotides at the 3′-terminus of an initiator oligonucleotide is controlled by introducing a non-hydrolyzable nucleotide that competes for the enzyme's active site. These methods are exemplified in the generic scheme shown in
In addition, provided herein are methods of ligating two oligonucleotides to yield an RNA oligonucleotide. In certain embodiments, the methods comprise providing a first oligonucleotide, wherein the first oligonucleotide comprises a 5′-triphosphate group; providing a second oligonucleotide; providing a poly(U) polymerase; and combining the first and second oligonucleotides and the poly(U) polymerase under conditions sufficient for the ligation of the first oligonucleotide to the 3′ end of the second oligonucleotide. This embodiment is possible due to the discovery that 5′-triphosphate nucleotides with oligonucleotides at the 3′-position are viable substrates for poly(N) polymerases (e.g., wild-type and mutated poly(U) polymerases) described herein.
In addition, RNA oligonucleotides produced by these methods can undergo reverse transcription (RT) to yield complementary DNA (e.g., cDNA) that is amplifiable by a DNA polymerase via the polymerase chain reaction (PCR).
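By way of non-limiting illustration, the sequence relationship between an RNA oligonucleotide and its first-strand cDNA can be sketched as follows. The function name `reverse_transcribe` and the lookup table are hypothetical names chosen for this example. The sketch assumes standard Watson-Crick pairing over the four unmodified bases only; modified nucleotides and enzyme fidelity are not modeled.

```python
# Watson-Crick pairing of each RNA base with its DNA complement.
RNA_TO_CDNA = {"A": "T", "U": "A", "G": "C", "C": "G"}


def reverse_transcribe(rna_5_to_3: str) -> str:
    """Return the first-strand cDNA (written 5'->3') complementary to an RNA."""
    rna = rna_5_to_3.upper()
    unknown = set(rna) - set(RNA_TO_CDNA)
    if unknown:
        raise ValueError(f"unsupported RNA symbols: {sorted(unknown)}")
    # Complement each base, then reverse so the result also reads 5'->3'.
    return "".join(RNA_TO_CDNA[base] for base in reversed(rna))


if __name__ == "__main__":
    print(reverse_transcribe("AUGGCU"))  # prints AGCCAT
```

The resulting DNA string is the kind of sequence that would then serve as the template for PCR amplification.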
Also provided herein are RNA oligonucleotides and DNA oligonucleotides produced by any method described herein.
The methods described herein involve enzymatic catalysis. Because enzymatic catalysis occurs under biologically compatible reaction conditions, unwanted degradation of the RNA molecule currently experienced during chemical synthesis can be eliminated. The present methods improve upon the current state of de novo RNA oligonucleotide synthesis, which is performed with phosphoramidite chemistry, often under harsh reaction conditions. The harsh reaction conditions of chemical RNA oligonucleotide synthesis make it difficult and expensive to produce long RNA oligonucleotides, for example, >100 nucleotides in length. If appreciable yields of a long RNA oligonucleotide are produced via chemical synthesis, it is possible that the error-rate of the oligonucleotide is very high. By using an enzymatic approach described herein, many of the current issues associated with long RNA oligonucleotide synthesis are solved. Applications of the methods described herein include the direct synthesis of RNA as well as material generation for nucleic acid nanotechnology, genome engineering techniques, and novel RNA and DNA therapeutics. In certain embodiments, methods described herein have the capacity to be miniaturized in a microfluidic format or performed in a highly parallelized manner such as micro-droplet printing. The methods provided herein can also be carried out in solid phase.
Also provided herein are compositions and kits comprising one or more of the poly(N) polymerases and/or nucleotides described herein.
The details of certain embodiments of the invention are set forth in the Detailed Description of Certain Embodiments, as described below. Other features, objects, and advantages of the invention will be apparent from the Definitions, Examples, Figures, and Claims.
As used herein, the term “polymerase” generally refers to an enzyme that is capable of synthesizing RNA or DNA oligonucleotides. In some embodiments, a polymerase is capable of synthesizing an oligonucleotide in a template-dependent manner. In other embodiments, a polymerase is capable of synthesizing an oligonucleotide in a template-independent manner. In some embodiments, a polymerase is an RNA polymerase. In some embodiments, a polymerase is a DNA polymerase. In some embodiments, a polymerase is a reverse transcriptase. A polymerase may be derived from any source, e.g., recombinant polymerase, bacterial polymerase. In some embodiments, a polymerase is a poly(N) polymerase. In some embodiments, a polymerase is a poly(U), poly(A), poly(C), or poly(G) polymerase. In some embodiments, a polymerase is capable of adding a nucleotide to the 3′ end of an oligonucleotide, e.g., an initiator oligonucleotide. In some embodiments, a polymerase selectively adds a single nucleotide species, e.g., a nucleotide comprising a uracil base in the case of poly(U) polymerases, to the 3′ end of an oligonucleotide, e.g., an initiator oligonucleotide.
As used herein, the term “RNA oligonucleotide” generally refers to a polymer of nucleotides, ribonucleotides, or analogs thereof. An RNA oligonucleotide can have any sequence. As used herein, an RNA oligonucleotide may have any three-dimensional structure, and may perform any function, known or unknown to one of skill in the art. An RNA oligonucleotide may be naturally occurring or synthetic. In some embodiments, an RNA oligonucleotide may be a messenger RNA (mRNA), a transfer RNA, a ribosomal RNA, a short interfering RNA (siRNA), a short-hairpin RNA (shRNA), a micro-RNA (miRNA), a ribozyme, a recombinant oligonucleotide, a branched oligonucleotide, an isolated or synthetic RNA oligonucleotide of any sequence, a probe, and/or a primer. In some embodiments, an RNA oligonucleotide comprises nucleotides comprising naturally occurring bases, e.g., adenine or uracil. In some embodiments, an RNA oligonucleotide comprises non-naturally occurring or modified nucleotides, e.g., nucleotides comprising sugar modifications or base modifications, e.g., purine or pyrimidine modifications. In some embodiments, an RNA oligonucleotide comprises a combination of naturally occurring, non-naturally occurring, and modified nucleotides. In some embodiments, a nucleotide may comprise at least one modified backbone or linkage, e.g., a phosphorothioate backbone or linkage. In some embodiments, an RNA oligonucleotide is single-stranded. In other embodiments, an RNA oligonucleotide is double-stranded. In some embodiments, an RNA oligonucleotide is synthesized via template-independent synthesis. In some embodiments, an RNA oligonucleotide is at least 5, at least 10, at least 20, at least 50, at least 100, at least 200, at least 300, at least 400, or at least 500 nucleotides in length.
As used herein, the term “DNA oligonucleotide” generally refers to a polymer of DNA nucleotides, deoxyribonucleotides, or analogs thereof. As used herein, a DNA oligonucleotide may have any three-dimensional structure, and may perform any function, known or unknown to one of skill in the art. A DNA oligonucleotide may be naturally occurring or synthetic. In some embodiments, a DNA oligonucleotide may be an exon, an intron, a cDNA sequence, a recombinant oligonucleotide, a branched oligonucleotide, a plasmid, a vector, and/or an isolated DNA of any sequence. In some embodiments, a DNA oligonucleotide comprises DNA nucleotides comprising naturally occurring bases, e.g., adenine, cytosine, guanine, or thymine. In some embodiments, a DNA oligonucleotide comprises non-naturally occurring or modified DNA nucleotides, e.g., DNA nucleotides comprising sugar modifications, purine or pyrimidine modifications. In some embodiments, a DNA oligonucleotide comprises a combination of naturally occurring, non-naturally occurring, and modified DNA nucleotides. In some embodiments, a DNA nucleotide may comprise at least one modified backbone or linkage, e.g., a phosphorothioate backbone or linkage. In some embodiments, a DNA oligonucleotide is single-stranded. In other embodiments, a DNA oligonucleotide is double-stranded. In some embodiments, a DNA oligonucleotide is synthesized via reverse transcription. In some embodiments, a DNA oligonucleotide is at least 5, at least 10, at least 20, at least 50, at least 100, at least 200, at least 300, at least 400, or at least 500 DNA nucleotides in length.
As used herein, the term “nucleotide” or “ribonucleotide” generally refers to a nucleotide monomer that comprises a ribose sugar, a phosphate group, and a nucleobase. A nucleotide may be naturally occurring, non-naturally occurring, or modified. In some embodiments, a nucleotide comprises a nucleobase or base, e.g., a purine or pyrimidine base. In some embodiments, a base is a naturally occurring base, e.g., adenine, cytosine, guanine, thymine, uracil, or inosine. In some embodiments, a nucleotide may comprise a non-naturally occurring nucleobase. In some embodiments, a nucleotide may comprise a modified nucleobase. In some embodiments, a nucleotide may comprise a modification of the ribose sugar, e.g., at the 2′ position, e.g., 2′-F, 2′-O-alkyl, 2′-amino, or 2′-azido. In some embodiments, a nucleotide is a non-hydrolyzable nucleotide, e.g., may comprise a modified triphosphate group. In certain embodiments, the modified nucleotide is a reversible terminator oligonucleotide, e.g., a 2′- or 3′-OH-protected nucleotide.
As used herein, the term “initiator oligonucleotide” generally refers to a short, single-stranded RNA oligonucleotide that is capable of initiating template-independent synthesis. An initiator oligonucleotide is, in certain embodiments, less than 20 nucleotides in length. In some embodiments, an initiator oligonucleotide is less than 20, less than 18, less than 15, less than 12, less than 10, less than 8, or less than 5 nucleotides in length. In some embodiments, an initiator oligonucleotide is 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 nucleotides in length. In some embodiments, an initiator oligonucleotide is labeled at its 5′ end, e.g., labeled with a fluorophore. In some embodiments, an initiator oligonucleotide is attached to a substrate at its 5′ end. In some embodiments, a substrate may be a glass surface, a bead, a biomolecule, or any conceivable substrate suitable for template-independent synthesis.
As used herein, the term “template-independent” generally refers to the synthesis of an RNA oligonucleotide that does not require a template DNA oligonucleotide. Template-independent synthesis will generally comprise the use of an initiator oligonucleotide and a polymerase, e.g., a poly(N) polymerase. Oligonucleotides, e.g., RNA oligonucleotides, synthesized using template-independent synthesis are generally synthesized by adding nucleotides to the 3′ end of an existing oligonucleotide, e.g., an initiator oligonucleotide.
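By way of non-limiting illustration, the iterative, template-independent extension of an initiator oligonucleotide using reversible terminator nucleotides can be sketched as a simple state model. The class `GrowingStrand` and the function `synthesize` are hypothetical names introduced for this example. The model is a bookkeeping abstraction only: it assumes exactly one protected nucleotide is added per cycle and that deprotection always succeeds, and it does not represent enzyme kinetics, yields, or reaction conditions.

```python
from dataclasses import dataclass


@dataclass
class GrowingStrand:
    """Toy model of an initiator being extended at its 3' end."""

    sequence: str          # written 5'->3'
    blocked: bool = False  # True while the 3'-terminal nucleotide is protected

    def add_protected_nucleotide(self, base: str) -> None:
        # Models one enzymatic addition of a reversible terminator nucleotide.
        if self.blocked:
            raise RuntimeError("3' end is protected; deprotect before extending")
        if len(base) != 1 or base not in "ACGU":
            raise ValueError(f"unsupported base: {base!r}")
        self.sequence += base
        self.blocked = True  # further addition stops at the (n+1) product

    def deprotect(self) -> None:
        # Models the mild deprotection step that restores the free hydroxyl.
        self.blocked = False


def synthesize(initiator: str, target_extension: str) -> str:
    """Extend the initiator by one base per add/deprotect cycle."""
    strand = GrowingStrand(initiator)
    for base in target_extension:
        strand.add_protected_nucleotide(base)
        strand.deprotect()
    return strand.sequence


if __name__ == "__main__":
    print(synthesize("UUUUU", "ACGU"))  # prints UUUUUACGU
```

Each pass through the loop corresponds to one (n+1) extension followed by deprotection; attempting a second addition before deprotection raises an error, mirroring the termination behavior described for reversible terminators.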
Definitions of specific functional groups and chemical terms are described in more detail below. The chemical elements are identified in accordance with the Periodic Table of the Elements, CAS version, Handbook of Chemistry and Physics, 75th Ed., inside cover, and specific functional groups are generally defined as described therein. Additionally, general principles of organic chemistry, as well as specific functional moieties and reactivity, are described in Organic Chemistry, Thomas Sorrell, University Science Books, Sausalito, 1999; Smith and March, March's Advanced Organic Chemistry, 5th Edition, John Wiley & Sons, Inc., New York, 2001; Larock, Comprehensive Organic Transformations, VCH Publishers, Inc., New York, 1989; and Carruthers, Some Modern Methods of Organic Synthesis, 3rd Edition, Cambridge University Press, Cambridge, 1987.
Compounds described herein can comprise one or more asymmetric centers, and thus can exist in various stereoisomeric forms, e.g., enantiomers and/or diastereomers. For example, the compounds described herein can be in the form of an individual enantiomer, diastereomer, or geometric isomer, or can be in the form of a mixture of stereoisomers, including racemic mixtures and mixtures enriched in one or more stereoisomer. Isomers can be isolated from mixtures by methods known to those skilled in the art, including chiral high pressure liquid chromatography (HPLC) and the formation and crystallization of chiral salts; or preferred isomers can be prepared by asymmetric syntheses. See, for example, Jacques et al., Enantiomers, Racemates and Resolutions (Wiley Interscience, New York, 1981); Wilen et al., Tetrahedron 33:2725 (1977); Eliel, E. L. Stereochemistry of Carbon Compounds (McGraw-Hill, N Y, 1962); and Wilen, S. H., Tables of Resolving Agents and Optical Resolutions p. 268 (E. L. Eliel, Ed., Univ. of Notre Dame Press, Notre Dame, Ind. 1972). The invention additionally encompasses compounds as individual isomers substantially free of other isomers, and alternatively, as mixtures of various isomers.
Unless otherwise stated, structures depicted herein are also meant to include compounds that differ only in the presence of one or more isotopically enriched atoms. For example, compounds having the present structures except for the replacement of hydrogen by deuterium or tritium, replacement of 19F with 18F, or the replacement of 12C with 13C or 14C are within the scope of the disclosure. Such compounds are useful, for example, as analytical tools or probes in biological assays.
When a range of values is listed, it is intended to encompass each value and sub-range within the range. For example, “C1-6 alkyl” is intended to encompass C1, C2, C3, C4, C5, C6, C1-6, C1-5, C1-4, C1-3, C1-2, C2-6, C2-5, C2-4, C2-3, C3-6, C3-5, C3-4, C4-6, C4-5, and C5-6 alkyl.
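By way of non-limiting illustration, the values and sub-ranges encompassed by a listed range can be enumerated programmatically. The function name `expand_carbon_range` is a hypothetical name chosen for this example; the ordering of the output simply mirrors the ordering used in the preceding sentence.

```python
def expand_carbon_range(low: int, high: int) -> list[str]:
    """List every single value and every sub-range covered by C{low}-{high}."""
    if low < 1 or high < low:
        raise ValueError("expected 1 <= low <= high")
    singles = [f"C{n}" for n in range(low, high + 1)]
    # Sub-ranges in the order used in the text: C1-6, C1-5, ..., C1-2, C2-6, ...
    ranges = [
        f"C{start}-{end}"
        for start in range(low, high)
        for end in range(high, start, -1)
    ]
    return singles + ranges


if __name__ == "__main__":
    members = expand_carbon_range(1, 6)
    print(len(members))        # prints 21
    print(", ".join(members))
```

For “C1-6”, the sketch yields the six single values followed by the fifteen sub-ranges, 21 entries in total.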
The term “alkyl” refers to a radical of a straight-chain or branched saturated hydrocarbon group having from 1 to 10 carbon atoms (“C1-10 alkyl”). In some embodiments, an alkyl group has 1 to 9 carbon atoms (“C1-9 alkyl”). In some embodiments, an alkyl group has 1 to 8 carbon atoms (“C1-8 alkyl”). In some embodiments, an alkyl group has 1 to 7 carbon atoms (“C1-7 alkyl”). In some embodiments, an alkyl group has 1 to 6 carbon atoms (“C1-6 alkyl”). In some embodiments, an alkyl group has 1 to 5 carbon atoms (“C1-5 alkyl”). In some embodiments, an alkyl group has 1 to 4 carbon atoms (“C1-4 alkyl”). In some embodiments, an alkyl group has 1 to 3 carbon atoms (“C1-3 alkyl”). In some embodiments, an alkyl group has 1 to 2 carbon atoms (“C1-2 alkyl”). In some embodiments, an alkyl group has 1 carbon atom (“C1 alkyl”). In some embodiments, an alkyl group has 2 to 6 carbon atoms (“C2-6 alkyl”). Examples of C1-6 alkyl groups include methyl (C1), ethyl (C2), propyl (C3) (e.g., n-propyl, isopropyl), butyl (C4) (e.g., n-butyl, tert-butyl, sec-butyl, iso-butyl), pentyl (C5) (e.g., n-pentyl, 3-pentanyl, amyl, neopentyl, 3-methyl-2-butanyl, tertiary amyl), and hexyl (C6) (e.g., n-hexyl). Additional examples of alkyl groups include n-heptyl (C7), n-octyl (C8), and the like. Unless otherwise specified, each instance of an alkyl group is independently unsubstituted (an “unsubstituted alkyl”) or substituted (a “substituted alkyl”) with one or more substituents (e.g., halogen, such as F). In certain embodiments, the alkyl group is an unsubstituted C1-10 alkyl (such as unsubstituted C1-6 alkyl, e.g., —CH3 (Me), unsubstituted ethyl (Et), unsubstituted propyl (Pr, e.g., unsubstituted n-propyl (n-Pr), unsubstituted isopropyl (i-Pr)), unsubstituted butyl (Bu, e.g., unsubstituted n-butyl (n-Bu), unsubstituted tert-butyl (tert-Bu or t-Bu), unsubstituted sec-butyl (sec-Bu), unsubstituted isobutyl (i-Bu))).
In certain embodiments, the alkyl group is a substituted C1-10 alkyl (such as substituted C1-6 alkyl, e.g., —CF3, Bn).
The term “haloalkyl” is a substituted alkyl group, wherein one or more of the hydrogen atoms are independently replaced by a halogen, e.g., fluoro, bromo, chloro, or iodo. In some embodiments, the haloalkyl moiety has 1 to 8 carbon atoms (“C1-8 haloalkyl”). In some embodiments, the haloalkyl moiety has 1 to 6 carbon atoms (“C1-6 haloalkyl”). In some embodiments, the haloalkyl moiety has 1 to 4 carbon atoms (“C1-4 haloalkyl”). In some embodiments, the haloalkyl moiety has 1 to 3 carbon atoms (“C1-3 haloalkyl”). In some embodiments, the haloalkyl moiety has 1 to 2 carbon atoms (“C1-2 haloalkyl”). Examples of haloalkyl groups include —CHF2, —CH2F, —CF3, —CH2CF3, —CF2CF3, —CF2CF2CF3, —CCl3, —CFCl2, —CF2Cl, and the like.
The term “heteroalkyl” refers to an alkyl group, which further includes at least one heteroatom (e.g., 1, 2, 3, or 4 heteroatoms) selected from oxygen, nitrogen, or sulfur within (i.e., inserted between adjacent carbon atoms of) and/or placed at one or more terminal position(s) of the parent chain.
The term “alkenyl” refers to a radical of a straight-chain or branched hydrocarbon group having from 2 to 10 carbon atoms and one or more carbon-carbon double bonds (e.g., 1, 2, 3, or 4 double bonds). In some embodiments, an alkenyl group has 2 to 9 carbon atoms (“C2-9 alkenyl”). In some embodiments, an alkenyl group has 2 to 8 carbon atoms (“C2-8 alkenyl”). In some embodiments, an alkenyl group has 2 to 7 carbon atoms (“C2-7 alkenyl”). In some embodiments, an alkenyl group has 2 to 6 carbon atoms (“C2-6 alkenyl”). In some embodiments, an alkenyl group has 2 to 5 carbon atoms (“C2-5 alkenyl”). In some embodiments, an alkenyl group has 2 to 4 carbon atoms (“C2-4 alkenyl”). In some embodiments, an alkenyl group has 2 to 3 carbon atoms (“C2-3 alkenyl”). In some embodiments, an alkenyl group has 2 carbon atoms (“C2 alkenyl”). The one or more carbon-carbon double bonds can be internal (such as in 2-butenyl) or terminal (such as in 1-butenyl). Examples of C2-4 alkenyl groups include ethenyl (C2), 1-propenyl (C3), 2-propenyl (C3), 1-butenyl (C4), 2-butenyl (C4), butadienyl (C4), and the like. Examples of C2-6 alkenyl groups include the aforementioned C2-4 alkenyl groups as well as pentenyl (C5), pentadienyl (C5), hexenyl (C6), and the like. Additional examples of alkenyl include heptenyl (C7), octenyl (C8), octatrienyl (C8), and the like. Unless otherwise specified, each instance of an alkenyl group is independently unsubstituted (an “unsubstituted alkenyl”) or substituted (a “substituted alkenyl”) with one or more substituents. In certain embodiments, the alkenyl group is an unsubstituted C2-10 alkenyl. In certain embodiments, the alkenyl group is a substituted C2-10 alkenyl. In an alkenyl group, a C═C double bond for which the stereochemistry is not specified (e.g., —CH═CHCH3) may be an (E)- or (Z)-double bond.
The term “heteroalkenyl” refers to an alkenyl group, which further includes at least one heteroatom (e.g., 1, 2, 3, or 4 heteroatoms) selected from oxygen, nitrogen, or sulfur within (i.e., inserted between adjacent carbon atoms of) and/or placed at one or more terminal position(s) of the parent chain.
The term “alkynyl” refers to a radical of a straight-chain or branched hydrocarbon group having from 2 to 10 carbon atoms and one or more carbon-carbon triple bonds (e.g., 1, 2, 3, or 4 triple bonds) (“C2-10 alkynyl”). In some embodiments, an alkynyl group has 2 to 9 carbon atoms (“C2-9 alkynyl”). In some embodiments, an alkynyl group has 2 to 8 carbon atoms (“C2-8 alkynyl”). In some embodiments, an alkynyl group has 2 to 7 carbon atoms (“C2-7 alkynyl”). In some embodiments, an alkynyl group has 2 to 6 carbon atoms (“C2-6 alkynyl”). In some embodiments, an alkynyl group has 2 to 5 carbon atoms (“C2-5 alkynyl”). In some embodiments, an alkynyl group has 2 to 4 carbon atoms (“C2-4 alkynyl”). In some embodiments, an alkynyl group has 2 to 3 carbon atoms (“C2-3 alkynyl”). In some embodiments, an alkynyl group has 2 carbon atoms (“C2 alkynyl”). The one or more carbon-carbon triple bonds can be internal (such as in 2-butynyl) or terminal (such as in 1-butynyl). Examples of C2-4 alkynyl groups include, without limitation, ethynyl (C2), 1-propynyl (C3), 2-propynyl (C3), 1-butynyl (C4), 2-butynyl (C4), and the like. Examples of C2-6 alkynyl groups include the aforementioned C2-4 alkynyl groups as well as pentynyl (C5), hexynyl (C6), and the like. Additional examples of alkynyl include heptynyl (C7), octynyl (C8), and the like. Unless otherwise specified, each instance of an alkynyl group is independently unsubstituted (an “unsubstituted alkynyl”) or substituted (a “substituted alkynyl”) with one or more substituents. In certain embodiments, the alkynyl group is an unsubstituted C2-10 alkynyl. In certain embodiments, the alkynyl group is a substituted C2-10 alkynyl.
The term “heteroalkynyl” refers to an alkynyl group, which further includes at least one heteroatom (e.g., 1, 2, 3, or 4 heteroatoms) selected from oxygen, nitrogen, or sulfur within (i.e., inserted between adjacent carbon atoms of) and/or placed at one or more terminal position(s) of the parent chain.
The term “carbocyclyl” or “carbocyclic” refers to a radical of a non-aromatic cyclic hydrocarbon group having from 3 to 14 ring carbon atoms (“C3-14 carbocyclyl”) and zero heteroatoms in the non-aromatic ring system. In some embodiments, a carbocyclyl group has 3 to 10 ring carbon atoms (“C3-10 carbocyclyl”). In some embodiments, a carbocyclyl group has 3 to 8 ring carbon atoms (“C3-8 carbocyclyl”). In some embodiments, a carbocyclyl group has 3 to 7 ring carbon atoms (“C3-7 carbocyclyl”). In some embodiments, a carbocyclyl group has 3 to 6 ring carbon atoms (“C3-6 carbocyclyl”). In some embodiments, a carbocyclyl group has 4 to 6 ring carbon atoms (“C4-6 carbocyclyl”). In some embodiments, a carbocyclyl group has 5 to 6 ring carbon atoms (“C5-6 carbocyclyl”). In some embodiments, a carbocyclyl group has 5 to 10 ring carbon atoms (“C5-10 carbocyclyl”). Exemplary C3-6 carbocyclyl groups include, without limitation, cyclopropyl (C3), cyclopropenyl (C3), cyclobutyl (C4), cyclobutenyl (C4), cyclopentyl (C5), cyclopentenyl (C5), cyclohexyl (C6), cyclohexenyl (C6), cyclohexadienyl (C6), and the like. Exemplary C3-8 carbocyclyl groups include, without limitation, the aforementioned C3-6 carbocyclyl groups as well as cycloheptyl (C7), cycloheptenyl (C7), cycloheptadienyl (C7), cycloheptatrienyl (C7), cyclooctyl (C8), cyclooctenyl (C8), bicyclo[2.2.1]heptanyl (C7), bicyclo[2.2.2]octanyl (C8), and the like. Exemplary C3-10 carbocyclyl groups include, without limitation, the aforementioned C3-8 carbocyclyl groups as well as cyclononyl (C9), cyclononenyl (C9), cyclodecyl (C10), cyclodecenyl (C10), octahydro-1H-indenyl (C9), decahydronaphthalenyl (C10), spiro[4.5]decanyl (C10), and the like. 
As the foregoing examples illustrate, in certain embodiments, the carbocyclyl group is either monocyclic (“monocyclic carbocyclyl”) or polycyclic (e.g., containing a fused, bridged or spiro ring system such as a bicyclic system (“bicyclic carbocyclyl”) or tricyclic system (“tricyclic carbocyclyl”)) and can be saturated or can contain one or more carbon-carbon double or triple bonds. “Carbocyclyl” also includes ring systems wherein the carbocyclyl ring, as defined above, is fused with one or more aryl or heteroaryl groups wherein the point of attachment is on the carbocyclyl ring, and in such instances, the number of carbons continues to designate the number of carbons in the carbocyclic ring system. Unless otherwise specified, each instance of a carbocyclyl group is independently unsubstituted (an “unsubstituted carbocyclyl”) or substituted (a “substituted carbocyclyl”) with one or more substituents. In certain embodiments, the carbocyclyl group is an unsubstituted C3-14 carbocyclyl. In certain embodiments, the carbocyclyl group is a substituted C3-14 carbocyclyl.
In some embodiments, “carbocyclyl” is a monocyclic, saturated carbocyclyl group having from 3 to 14 ring carbon atoms (“C3-14 cycloalkyl”). In some embodiments, a cycloalkyl group has 3 to 10 ring carbon atoms (“C3-10 cycloalkyl”). In some embodiments, a cycloalkyl group has 3 to 8 ring carbon atoms (“C3-8 cycloalkyl”). In some embodiments, a cycloalkyl group has 3 to 6 ring carbon atoms (“C3-6 cycloalkyl”). In some embodiments, a cycloalkyl group has 4 to 6 ring carbon atoms (“C4-6 cycloalkyl”). In some embodiments, a cycloalkyl group has 5 to 6 ring carbon atoms (“C5-6 cycloalkyl”). In some embodiments, a cycloalkyl group has 5 to 10 ring carbon atoms (“C5-10 cycloalkyl”). Examples of C5-6 cycloalkyl groups include cyclopentyl (C5) and cyclohexyl (C6). Examples of C3-6 cycloalkyl groups include the aforementioned C5-6 cycloalkyl groups as well as cyclopropyl (C3) and cyclobutyl (C4). Examples of C3-8 cycloalkyl groups include the aforementioned C3-6 cycloalkyl groups as well as cycloheptyl (C7) and cyclooctyl (C8). Unless otherwise specified, each instance of a cycloalkyl group is independently unsubstituted (an “unsubstituted cycloalkyl”) or substituted (a “substituted cycloalkyl”) with one or more substituents. In certain embodiments, the cycloalkyl group is an unsubstituted C3-14 cycloalkyl. In certain embodiments, the cycloalkyl group is a substituted C3-14 cycloalkyl.
The term “heterocyclyl” or “heterocyclic” refers to a radical of a 3- to 14-membered non-aromatic ring system having ring carbon atoms and 1 to 4 ring heteroatoms, wherein each heteroatom is independently selected from nitrogen, oxygen, and sulfur (“3-14 membered heterocyclyl”). In heterocyclyl groups that contain one or more nitrogen atoms, the point of attachment can be a carbon or nitrogen atom, as valency permits. A heterocyclyl group can either be monocyclic (“monocyclic heterocyclyl”) or polycyclic (e.g., a fused, bridged or spiro ring system such as a bicyclic system (“bicyclic heterocyclyl”) or tricyclic system (“tricyclic heterocyclyl”)), and can be saturated or can contain one or more carbon-carbon double or triple bonds. Heterocyclyl polycyclic ring systems can include one or more heteroatoms in one or both rings. “Heterocyclyl” also includes ring systems wherein the heterocyclyl ring, as defined above, is fused with one or more carbocyclyl groups wherein the point of attachment is either on the carbocyclyl or heterocyclyl ring, or ring systems wherein the heterocyclyl ring, as defined above, is fused with one or more aryl or heteroaryl groups, wherein the point of attachment is on the heterocyclyl ring, and in such instances, the number of ring members continues to designate the number of ring members in the heterocyclyl ring system. Unless otherwise specified, each instance of heterocyclyl is independently unsubstituted (an “unsubstituted heterocyclyl”) or substituted (a “substituted heterocyclyl”) with one or more substituents. In certain embodiments, the heterocyclyl group is an unsubstituted 3-14 membered heterocyclyl. In certain embodiments, the heterocyclyl group is a substituted 3-14 membered heterocyclyl.
In some embodiments, a heterocyclyl group is a 5-10 membered non-aromatic ring system having ring carbon atoms and 1-4 ring heteroatoms, wherein each heteroatom is independently selected from nitrogen, oxygen, and sulfur (“5-10 membered heterocyclyl”). In some embodiments, a heterocyclyl group is a 5-8 membered non-aromatic ring system having ring carbon atoms and 1-4 ring heteroatoms, wherein each heteroatom is independently selected from nitrogen, oxygen, and sulfur (“5-8 membered heterocyclyl”). In some embodiments, a heterocyclyl group is a 5-6 membered non-aromatic ring system having ring carbon atoms and 1-4 ring heteroatoms, wherein each heteroatom is independently selected from nitrogen, oxygen, and sulfur (“5-6 membered heterocyclyl”). In some embodiments, the 5-6 membered heterocyclyl has 1-3 ring heteroatoms selected from nitrogen, oxygen, and sulfur. In some embodiments, the 5-6 membered heterocyclyl has 1-2 ring heteroatoms selected from nitrogen, oxygen, and sulfur. In some embodiments, the 5-6 membered heterocyclyl has 1 ring heteroatom selected from nitrogen, oxygen, and sulfur.
The term “aryl” refers to a radical of a monocyclic or polycyclic (e.g., bicyclic or tricyclic) 4n+2 aromatic ring system (e.g., having 6, 10, or 14 π electrons shared in a cyclic array) having 6-14 ring carbon atoms and zero heteroatoms provided in the aromatic ring system (“C6-14 aryl”). In some embodiments, an aryl group has 6 ring carbon atoms (“C6 aryl”; e.g., phenyl). In some embodiments, an aryl group has 10 ring carbon atoms (“C10 aryl”; e.g., naphthyl such as 1-naphthyl and 2-naphthyl). In some embodiments, an aryl group has 14 ring carbon atoms (“C14 aryl”; e.g., anthracyl). “Aryl” also includes ring systems wherein the aryl ring, as defined above, is fused with one or more carbocyclyl or heterocyclyl groups wherein the radical or point of attachment is on the aryl ring, and in such instances, the number of carbon atoms continues to designate the number of carbon atoms in the aryl ring system. Unless otherwise specified, each instance of an aryl group is independently unsubstituted (an “unsubstituted aryl”) or substituted (a “substituted aryl”) with one or more substituents. In certain embodiments, the aryl group is an unsubstituted C6-14 aryl. In certain embodiments, the aryl group is a substituted C6-14 aryl.
The term “heteroaryl” refers to a radical of a 5-14 membered monocyclic or polycyclic (e.g., bicyclic, tricyclic) 4n+2 aromatic ring system (e.g., having 6, 10, or 14 π electrons shared in a cyclic array) having ring carbon atoms and 1-4 ring heteroatoms provided in the aromatic ring system, wherein each heteroatom is independently selected from nitrogen, oxygen, and sulfur (“5-14 membered heteroaryl”). In heteroaryl groups that contain one or more nitrogen atoms, the point of attachment can be a carbon or nitrogen atom, as valency permits. Heteroaryl polycyclic ring systems can include one or more heteroatoms in one or both rings. “Heteroaryl” includes ring systems wherein the heteroaryl ring, as defined above, is fused with one or more carbocyclyl or heterocyclyl groups wherein the point of attachment is on the heteroaryl ring, and in such instances, the number of ring members continues to designate the number of ring members in the heteroaryl ring system. “Heteroaryl” also includes ring systems wherein the heteroaryl ring, as defined above, is fused with one or more aryl groups wherein the point of attachment is either on the aryl or heteroaryl ring, and in such instances, the number of ring members designates the number of ring members in the fused polycyclic (aryl/heteroaryl) ring system. In polycyclic heteroaryl groups wherein one ring does not contain a heteroatom (e.g., indolyl, quinolinyl, carbazolyl, and the like), the point of attachment can be on either ring, i.e., either the ring bearing a heteroatom (e.g., 2-indolyl) or the ring that does not contain a heteroatom (e.g., 5-indolyl).
In some embodiments, a heteroaryl group is a 5-10 membered aromatic ring system having ring carbon atoms and 1-4 ring heteroatoms provided in the aromatic ring system, wherein each heteroatom is independently selected from nitrogen, oxygen, and sulfur (“5-10 membered heteroaryl”). In some embodiments, a heteroaryl group is a 5-8 membered aromatic ring system having ring carbon atoms and 1-4 ring heteroatoms provided in the aromatic ring system, wherein each heteroatom is independently selected from nitrogen, oxygen, and sulfur (“5-8 membered heteroaryl”). In some embodiments, a heteroaryl group is a 5-6 membered aromatic ring system having ring carbon atoms and 1-4 ring heteroatoms provided in the aromatic ring system, wherein each heteroatom is independently selected from nitrogen, oxygen, and sulfur (“5-6 membered heteroaryl”). In some embodiments, the 5-6 membered heteroaryl has 1-3 ring heteroatoms selected from nitrogen, oxygen, and sulfur. In some embodiments, the 5-6 membered heteroaryl has 1-2 ring heteroatoms selected from nitrogen, oxygen, and sulfur. In some embodiments, the 5-6 membered heteroaryl has 1 ring heteroatom selected from nitrogen, oxygen, and sulfur. Unless otherwise specified, each instance of a heteroaryl group is independently unsubstituted (an “unsubstituted heteroaryl”) or substituted (a “substituted heteroaryl”) with one or more substituents. In certain embodiments, the heteroaryl group is an unsubstituted 5-14 membered heteroaryl. In certain embodiments, the heteroaryl group is a substituted 5-14 membered heteroaryl.
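By way of non-limiting illustration, the 4n+2 π-electron count recited in the aryl and heteroaryl definitions above can be checked arithmetically. The following Python sketch is not part of the original disclosure; the function name `is_huckel_count` is hypothetical, and the check covers only the electron count, not planarity or cyclic conjugation.

```python
def is_huckel_count(pi_electrons: int) -> bool:
    """Return True if pi_electrons equals 4n + 2 for some integer n >= 0.

    Illustrative helper only: it tests the electron count and says nothing
    about whether a real ring system is planar or fully conjugated.
    """
    return pi_electrons >= 2 and (pi_electrons - 2) % 4 == 0


# Counts between 2 and 14 that satisfy the 4n + 2 rule.
huckel_counts = [count for count in range(2, 15) if is_huckel_count(count)]
print(huckel_counts)
```

The counts 6, 10, and 14 named in the definitions correspond to n = 1, 2, and 3, respectively.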
A group is optionally substituted unless expressly provided otherwise. The term “optionally substituted” refers to a group which may be substituted or unsubstituted. In certain embodiments, alkyl, alkenyl, alkynyl, heteroalkyl, heteroalkenyl, heteroalkynyl, carbocyclyl, heterocyclyl, aryl, and heteroaryl groups are optionally substituted. In general, the term “substituted” means that at least one hydrogen present on a group is replaced with a permissible substituent, e.g., a substituent which upon substitution results in a stable compound, e.g., a compound which does not spontaneously undergo transformation such as by rearrangement, cyclization, elimination, or other reaction. Unless otherwise indicated, a “substituted” group has a substituent at one or more substitutable positions of the group, and when more than one position in any given structure is substituted, the substituent is either the same or different at each position. The term “substituted” is contemplated to include substitution with all permissible substituents of organic compounds, and includes any of the substituents described herein that results in the formation of a stable compound. The present invention contemplates any and all such combinations in order to arrive at a stable compound. For purposes of this invention, heteroatoms such as nitrogen may have hydrogen substituents and/or any suitable substituent as described herein that satisfies the valencies of the heteroatoms and results in the formation of a stable moiety. The invention is not intended to be limited in any manner by the exemplary substituents described herein.
Exemplary substituents include, but are not limited to, halogen, —CN, —NO2, —N3, —SO2H, —SO3H, —OH, —ORaa, —ON(Rbb)2, —N(Rbb)2, —N(Rbb)3+X−, —N(ORcc)Rbb, —SH, —SRaa, —SSRcc, —C(═O)Raa, —CO2H, —CHO, —C(ORcc)3, —CO2Raa, —OC(═O)Raa, —OCO2Raa, —C(═O)N(Rbb)2, —OC(═O)N(Rbb)2, —NRbbC(═O)Raa, —NRbbCO2Raa, —NRbbC(═O)N(Rbb)2, —C(═NRbb)Raa, —C(═NRbb)ORaa, —OC(═NRbb)Raa, —OC(═NRbb)ORaa, —C(═NRbb)N(Rbb)2, —OC(═NRbb)N(Rbb)2, —NRbbC(═NRbb)N(Rbb)2, —C(═O)NRbbSO2Raa, —NRbbSO2Raa, —SO2N(Rbb)2, —SO2Raa, —SO2ORaa, —OSO2Raa, —S(═O)Raa, —OS(═O)Raa, —Si(Raa)3, —OSi(Raa)3, —C(═S)N(Rbb)2, —C(═O)SRaa, —C(═S)SRaa, —SC(═S)SRaa, —SC(═O)SRaa, —OC(═O)SRaa, —SC(═O)ORaa, —SC(═O)Raa, —P(═O)(Raa)2, —P(═O)(ORcc)2, —OP(═O)(Raa)2, —OP(═O)(ORcc)2, —P(═O)(N(Rbb)2)2, —OP(═O)(N(Rbb)2)2, —NRbbP(═O)(Raa)2, —NRbbP(═O)(ORcc)2, —NRbbP(═O)(N(Rbb)2)2, —P(Rcc)2, —P(ORcc)2, —P(Rcc)3+X−, —P(ORcc)3+X−, —P(Rcc)4, —P(ORcc)4, —OP(Rcc)2, —OP(Rcc)3+X−, —OP(ORcc)2, —OP(ORcc)3+X−, —OP(Rcc)4, —OP(ORcc)4, —B(Raa)2, —B(ORcc)2, —BRaa(ORcc), C1-10 alkyl, C1-10 perhaloalkyl, C2-10 alkenyl, C2-10 alkynyl, heteroC1-10 alkyl, heteroC2-10 alkenyl, heteroC2-10 alkynyl, C3-10 carbocyclyl, 3-14 membered heterocyclyl, C6-14 aryl, and 5-14 membered heteroaryl, wherein each alkyl, alkenyl, alkynyl, heteroalkyl, heteroalkenyl, heteroalkynyl, carbocyclyl, heterocyclyl, aryl, and heteroaryl is independently substituted with 0, 1, 2, 3, 4, or 5 Rdd groups; wherein X− is a counterion;
or two geminal hydrogens on a carbon atom are replaced with the group ═O, ═S, ═NN(Rbb)2, ═NNRbbC(═O)Raa, ═NNRbbC(═O)ORaa, ═NNRbbS(═O)2Raa, ═NRbb, or ═NORcc;
each instance of Raa is, independently, selected from C1-10 alkyl, C1-10 perhaloalkyl, C2-10 alkenyl, C2-10 alkynyl, heteroC1-10 alkyl, heteroC2-10 alkenyl, heteroC2-10 alkynyl, C3-10 carbocyclyl, 3-14 membered heterocyclyl, C6-14 aryl, and 5-14 membered heteroaryl, or two Raa groups are joined to form a 3-14 membered heterocyclyl or 5-14 membered heteroaryl ring;
each instance of Rbb is, independently, selected from hydrogen, —OH, —ORaa, —N(Rcc)2, —CN, —C(═O)Raa, —C(═O)N(Rcc)2, —CO2Raa, —SO2Raa, —C(═NRcc)ORaa, —C(═NRcc)N(Rcc)2, —SO2N(Rcc)2, —SO2Rcc, —SO2ORcc, —SORaa, —C(═S)N(Rcc)2, —C(═O)SRcc, —C(═S)SRcc, —P(═O)(Raa)2, —P(═O)(ORcc)2, —P(═O)(N(Rcc)2)2, C1-10 alkyl, C1-10 perhaloalkyl, C2-10 alkenyl, C2-10 alkynyl, heteroC1-10 alkyl, heteroC2-10 alkenyl, heteroC2-10 alkynyl, C3-10 carbocyclyl, 3-14 membered heterocyclyl, C6-14 aryl, and 5-14 membered heteroaryl, or two Rbb groups are joined to form a 3-14 membered heterocyclyl or 5-14 membered heteroaryl ring;
each instance of Rcc is, independently, selected from hydrogen, C1-10 alkyl, C1-10 perhaloalkyl, C2-10 alkenyl, C2-10 alkynyl, heteroC1-10 alkyl, heteroC2-10 alkenyl, heteroC2-10 alkynyl, C3-10 carbocyclyl, 3-14 membered heterocyclyl, C6-14 aryl, and 5-14 membered heteroaryl, or two Rcc groups are joined to form a 3-14 membered heterocyclyl or 5-14 membered heteroaryl ring;
each instance of Rdd is, independently, halogen, —CN, —NO2, —N3, —SO2H, —SO3H, —OH, —OC1-6 alkyl, —ON(C1-6 alkyl)2, —N(C1-6 alkyl)2, —N(C1-6 alkyl)3+X−, —NH(C1-6 alkyl)2+X−, —NH2(C1-6 alkyl)+X−, —NH3+X−, —N(OC1-6 alkyl)(C1-6 alkyl), —N(OH)(C1-6 alkyl), —NH(OH), —SH, —SC1-6 alkyl, —SS(C1-6 alkyl), —C(═O)(C1-6 alkyl), —CO2H, —CO2(C1-6 alkyl), —OC(═O)(C1-6 alkyl), —OCO2(C1-6 alkyl), —C(═O)NH2, —C(═O)N(C1-6 alkyl)2, —OC(═O)NH(C1-6 alkyl), —NHC(═O)(C1-6 alkyl), —N(C1-6 alkyl)C(═O)(C1-6 alkyl), —NHCO2(C1-6 alkyl), —NHC(═O)N(C1-6 alkyl)2, —NHC(═O)NH(C1-6 alkyl), —NHC(═O)NH2, —C(═NH)O(C1-6 alkyl), —OC(═NH)(C1-6 alkyl), —OC(═NH)OC1-6 alkyl, —C(═NH)N(C1-6 alkyl)2, —C(═NH)NH(C1-6 alkyl), —C(═NH)NH2, —OC(═NH)N(C1-6 alkyl)2, —OC(═NH)NH(C1-6 alkyl), —OC(═NH)NH2, —NHC(═NH)N(C1-6 alkyl)2, —NHC(═NH)NH2, —NHSO2(C1-6 alkyl), —SO2N(C1-6 alkyl)2, —SO2NH(C1-6 alkyl), —SO2NH2, —SO2(C1-6 alkyl), —SO2O(C1-6 alkyl), —OSO2(C1-6 alkyl), —SO(C1-6 alkyl), —Si(C1-6 alkyl)3, —OSi(C1-6 alkyl)3, —C(═S)N(C1-6 alkyl)2, —C(═S)NH(C1-6 alkyl), —C(═S)NH2, —C(═O)S(C1-6 alkyl), —C(═S)SC1-6 alkyl, —SC(═S)SC1-6 alkyl, —P(═O)(OC1-6 alkyl)2, —P(═O)(C1-6 alkyl)2, —OP(═O)(C1-6 alkyl)2, —OP(═O)(OC1-6 alkyl)2, C1-6 alkyl, C1-6 perhaloalkyl, C2-6 alkenyl, C2-6 alkynyl, heteroC1-6 alkyl, heteroC2-6 alkenyl, heteroC2-6 alkynyl, C3-10 carbocyclyl, C6-10 aryl, 3-10 membered heterocyclyl, or 5-10 membered heteroaryl;
or two geminal Rdd substituents can be joined to form ═O or ═S.
The term “halo” or “halogen” refers to fluorine (fluoro, —F), chlorine (chloro, —Cl), bromine (bromo, —Br), or iodine (iodo, —I).
The term “hydroxyl” or “hydroxy” refers to the group —OH. The term “substituted hydroxyl” or “substituted hydroxy,” by extension, refers to a hydroxyl group wherein the oxygen atom directly attached to the parent molecule is substituted with a group other than hydrogen, and includes groups selected from —ORaa, —ON(Rbb)2, —OC(═O)SRaa, —OC(═O)Raa, —OCO2Raa, —OC(═O)N(Rbb)2, —OC(═NRbb)Raa, —OC(═NRbb)ORaa, —OC(═NRbb)N(Rbb)2, —OS(═O)Raa, —OSO2Raa, —OSi(Raa)3, —OP(Rcc)2, —OP(Rcc)3+X−, —OP(ORcc)2, —OP(ORcc)3+X−, —OP(═O)(Raa)2, —OP(═O)(ORcc)2, and —OP(═O)(N(Rbb)2)2, wherein X−, Raa, Rbb, and Rcc are as defined herein.
The term “amino” refers to the group —NH2. The term “substituted amino,” by extension, refers to a monosubstituted amino, a disubstituted amino, or a trisubstituted amino. In certain embodiments, the “substituted amino” is a monosubstituted amino or a disubstituted amino group. The term “monosubstituted amino” refers to an amino group wherein the nitrogen atom directly attached to the parent molecule is substituted with one hydrogen and one group other than hydrogen, and includes groups selected from —NH(Rbb), —NHC(═O)Raa, —NHCO2Raa, —NHC(═O)N(Rbb)2, —NHC(═NRbb)N(Rbb)2, —NHSO2Raa, —NHP(═O)(ORcc)2, and —NHP(═O)(N(Rbb)2)2, wherein Raa, Rbb, and Rcc are as defined herein, and wherein Rbb of the group —NH(Rbb) is not hydrogen. The term “disubstituted amino” refers to an amino group wherein the nitrogen atom directly attached to the parent molecule is substituted with two groups other than hydrogen, and includes groups selected from —N(Rbb)2, —NRbbC(═O)Raa, —NRbbCO2Raa, —NRbbC(═O)N(Rbb)2, —NRbbC(═NRbb)N(Rbb)2, —NRbbSO2Raa, —NRbbP(═O)(ORcc)2, and —NRbbP(═O)(N(Rbb)2)2, wherein Raa, Rbb, and Rcc are as defined herein, with the proviso that the nitrogen atom directly attached to the parent molecule is not substituted with hydrogen. The term “trisubstituted amino” refers to an amino group wherein the nitrogen atom directly attached to the parent molecule is substituted with three groups, and includes groups selected from —N(Rbb)3 and —N(Rbb)3+X−, wherein Rbb and X− are as defined herein.
The term “thio” or “thiol” refers to the group —SH. The term “substituted thio” or “substituted thiol,” by extension, refers to a thiol group wherein the sulfur atom directly attached to the parent molecule is substituted with a group other than hydrogen. In certain embodiments, the substituent present on a sulfur atom is a sulfur protecting group (also referred to as a “thiol protecting group”). Sulfur protecting groups include, but are not limited to, —Raa, —N(Rbb)2, —C(═O)SRaa, —C(═O)Raa, —CO2Raa, —C(═O)N(Rbb)2, —C(═NRbb)Raa, —C(═NRbb)ORaa, —C(═NRbb)N(Rbb)2, —S(═O)Raa, —SO2Raa, —Si(Raa)3, —P(Rcc)2, —P(Rcc)3+X−, —P(ORcc)2, —P(ORcc)3+X−, —P(═O)(Raa)2, —P(═O)(ORcc)2, and —P(═O)(N(Rbb)2)2, wherein Raa, Rbb, and Rcc are as defined herein.
The term “acyl” refers to a group having the general formula —C(═O)RX1, —C(═O)ORX1, —C(═O)—O—C(═O)RX1, —C(═O)SRX1, —C(═O)N(RX1)2, —C(═S)RX1, —C(═S)N(RX1)2, —C(═S)O(RX1), —C(═S)S(RX1), —C(═NRX1)RX1, —C(═NRX1)ORX1, —C(═NRX1)SRX1, and —C(═NRX1)N(RX1)2, wherein RX1 is hydrogen; halogen; substituted or unsubstituted hydroxyl; substituted or unsubstituted thiol; substituted or unsubstituted amino; substituted or unsubstituted acyl, cyclic or acyclic, substituted or unsubstituted, branched or unbranched aliphatic; cyclic or acyclic, substituted or unsubstituted, branched or unbranched heteroaliphatic; cyclic or acyclic, substituted or unsubstituted, branched or unbranched alkyl; cyclic or acyclic, substituted or unsubstituted, branched or unbranched alkenyl; substituted or unsubstituted alkynyl; substituted or unsubstituted aryl, substituted or unsubstituted heteroaryl, aliphaticoxy, heteroaliphaticoxy, alkyloxy, heteroalkyloxy, aryloxy, heteroaryloxy, aliphaticthioxy, heteroaliphaticthioxy, alkylthioxy, heteroalkylthioxy, arylthioxy, heteroarylthioxy, mono- or di-aliphaticamino, mono- or di-heteroaliphaticamino, mono- or di-alkylamino, mono- or di-heteroalkylamino, mono- or di-arylamino, or mono- or di-heteroarylamino; or two RX1 groups taken together form a 5- to 6-membered heterocyclic ring. Exemplary acyl groups include aldehydes (—CHO), carboxylic acids (—CO2H), ketones, acyl halides, esters, amides, imines, carbonates, carbamates, and ureas. 
Acyl substituents include, but are not limited to, any of the substituents described herein, that result in the formation of a stable moiety (e.g., aliphatic, alkyl, alkenyl, alkynyl, heteroaliphatic, heterocyclic, aryl, heteroaryl, acyl, oxo, imino, thiooxo, cyano, isocyano, amino, azido, nitro, hydroxyl, thiol, halo, aliphaticamino, heteroaliphaticamino, alkylamino, heteroalkylamino, arylamino, heteroarylamino, alkylaryl, arylalkyl, aliphaticoxy, heteroaliphaticoxy, alkyloxy, heteroalkyloxy, aryloxy, heteroaryloxy, aliphaticthioxy, heteroaliphaticthioxy, alkylthioxy, heteroalkylthioxy, arylthioxy, heteroarylthioxy, acyloxy, and the like, each of which may or may not be further substituted).
The term “amino acid” refers to a molecule containing both an amino group and a carboxyl group. Amino acids include alpha-amino acids and beta-amino acids, the structures of which are depicted below. In certain embodiments, an amino acid is an alpha amino acid.
Suitable amino acids include, without limitation, natural alpha-amino acids such as D- and L-isomers of the 20 common naturally occurring alpha-amino acids found in peptides (e.g., A, R, N, C, D, Q, E, G, H, I, L, K, M, F, P, S, T, W, Y, V, as provided below), unnatural alpha-amino acids, natural beta-amino acids (e.g., beta-alanine), and unnatural beta-amino acids. Exemplary natural alpha-amino acids include L-Alanine (A), L-Arginine (R), L-Asparagine (N), L-Aspartic acid (D), L-Cysteine (C), L-Glutamic acid (E), L-Glutamine (Q), Glycine (G), L-Histidine (H), L-Isoleucine (I), L-Leucine (L), L-Lysine (K), L-Methionine (M), L-Phenylalanine (F), L-Proline (P), L-Serine (S), L-Threonine (T), L-Tryptophan (W), L-Tyrosine (Y), and L-Valine (V). Exemplary unnatural alpha-amino acids include D-Arginine, D-Asparagine, D-Aspartic acid, D-Cysteine, D-Glutamic acid, D-Glutamine, D-Histidine, D-Isoleucine, D-Leucine, D-Lysine, D-Methionine, D-Phenylalanine, D-Proline, D-Serine, D-Threonine, D-Tryptophan, D-Tyrosine, D-Valine, Di-vinyl, α-methyl-Alanine (Aib), α-methyl-Arginine, α-methyl-Asparagine, α-methyl-Aspartic acid, α-methyl-Cysteine, α-methyl-Glutamic acid, α-methyl-Glutamine, α-methyl-Histidine, α-methyl-Isoleucine, α-methyl-Leucine, α-methyl-Lysine, α-methyl-Methionine, α-methyl-Phenylalanine, α-methyl-Proline, α-methyl-Serine, α-methyl-Threonine, α-methyl-Tryptophan, α-methyl-Tyrosine, α-methyl-Valine, Norleucine, terminally unsaturated alpha-amino acids and bis alpha-amino acids (e.g., modified cysteine, modified lysine, modified tryptophan, modified serine, modified threonine, modified proline, modified histidine, modified alanine, and the like). There are many known unnatural amino acids, any of which may be included in the peptides of the present invention. See, for example, S. Hunt, The Non-Protein Amino Acids: In Chemistry and Biochemistry of the Amino Acids, edited by G. C. Barrett, Chapman and Hall, 1985.
In certain embodiments, the substituent present on the nitrogen atom is a nitrogen protecting group (also referred to herein as an “amino protecting group”). Nitrogen protecting groups include, but are not limited to, —OH, —ORaa, —N(Rcc)2, —C(═O)Raa, —C(═O)N(Rcc)2, —CO2Raa, —SO2Raa, —C(═NRcc)Raa, —C(═NRcc)ORaa, —C(═NRcc)N(Rcc)2, —SO2N(Rcc)2, —SO2Rcc, —SO2ORcc, —SORaa, —C(═S)N(Rcc)2, —C(═O)SRcc, —C(═S)SRcc, C1-10 alkyl (e.g., aralkyl, heteroaralkyl), C2-10 alkenyl, C2-10 alkynyl, heteroC1-10 alkyl, heteroC2-10 alkenyl, heteroC2-10 alkynyl, C3-10 carbocyclyl, 3-14 membered heterocyclyl, C6-14 aryl, and 5-14 membered heteroaryl groups, wherein each alkyl, alkenyl, alkynyl, heteroalkyl, heteroalkenyl, heteroalkynyl, carbocyclyl, heterocyclyl, aralkyl, aryl, and heteroaryl is independently substituted with 0, 1, 2, 3, 4, or 5 Rdd groups, and wherein Raa, Rbb, Rcc and Rdd are as defined herein. Nitrogen protecting groups are well known in the art and include those described in detail in Protecting Groups in Organic Synthesis, T. W. Greene and P. G. M. Wuts, 3rd edition, John Wiley & Sons, 1999, incorporated herein by reference.
For example, protecting groups (e.g., nitrogen or oxygen protecting groups) such as amide groups (e.g., —C(═O)Raa) include, but are not limited to, formamide, acetamide, chloroacetamide, trichloroacetamide, trifluoroacetamide, phenylacetamide, 3-phenylpropanamide, picolinamide, 3-pyridylcarboxamide, N-benzoylphenylalanyl derivative, benzamide, p-phenylbenzamide, o-nitrophenylacetamide, o-nitrophenoxyacetamide, acetoacetamide, (N′-dithiobenzyloxyacylamino)acetamide, 3-(p-hydroxyphenyl)propanamide, 3-(o-nitrophenyl)propanamide, 2-methyl-2-(o-nitrophenoxy)propanamide, 2-methyl-2-(o-phenylazophenoxy)propanamide, 4-chlorobutanamide, 3-methyl-3-nitrobutanamide, o-nitrocinnamide, N-acetylmethionine derivative, o-nitrobenzamide and o-(benzoyloxymethyl)benzamide.
Protecting groups (e.g., nitrogen or oxygen protecting groups) such as carbamate groups (e.g., —C(═O)ORaa) include, but are not limited to, methyl carbamate, ethyl carbamate, 9-fluorenylmethyl carbamate (Fmoc), 9-(2-sulfo)fluorenylmethyl carbamate, 9-(2,7-dibromo)fluorenylmethyl carbamate, 2,7-di-t-butyl-[9-(10,10-dioxo-10,10,10,10-tetrahydrothioxanthyl)]methyl carbamate (DBD-Tmoc), 4-methoxyphenacyl carbamate (Phenoc), 2,2,2-trichloroethyl carbamate (Troc), 2-trimethylsilylethyl carbamate (Teoc), 2-phenylethyl carbamate (hZ), 1-(1-adamantyl)-1-methylethyl carbamate (Adpoc), 1,1-dimethyl-2-haloethyl carbamate, 1,1-dimethyl-2,2-dibromoethyl carbamate (DB-t-BOC), 1,1-dimethyl-2,2,2-trichloroethyl carbamate (TCBOC), 1-methyl-1-(4-biphenylyl)ethyl carbamate (Bpoc), 1-(3,5-di-t-butylphenyl)-1-methylethyl carbamate (t-Bumeoc), 2-(2′- and 4′-pyridyl)ethyl carbamate (Pyoc), 2-(N,N-dicyclohexylcarboxamido)ethyl carbamate, t-butyl carbamate (BOC or Boc), 1-adamantyl carbamate (Adoc), vinyl carbamate (Voc), allyl carbamate (Alloc), 1-isopropylallyl carbamate (Ipaoc), cinnamyl carbamate (Coc), 4-nitrocinnamyl carbamate (Noc), 8-quinolyl carbamate, N-hydroxypiperidinyl carbamate, alkyldithio carbamate, benzyl carbamate (Cbz), p-methoxybenzyl carbamate (Moz), p-nitrobenzyl carbamate, p-bromobenzyl carbamate, p-chlorobenzyl carbamate, 2,4-dichlorobenzyl carbamate, 4-methylsulfinylbenzyl carbamate (Msz), 9-anthrylmethyl carbamate, diphenylmethyl carbamate, 2-methylthioethyl carbamate, 2-methylsulfonylethyl carbamate, 2-(p-toluenesulfonyl)ethyl carbamate, [2-(1,3-dithianyl)]methyl carbamate (Dmoc), 4-methylthiophenyl carbamate (Mtpc), 2,4-dimethylthiophenyl carbamate (Bmpc), 2-phosphonioethyl carbamate (Peoc), 2-triphenylphosphonioisopropyl carbamate (Ppoc), 1,1-dimethyl-2-cyanoethyl carbamate, m-chloro-p-acyloxybenzyl carbamate, p-(dihydroxyboryl)benzyl carbamate, 5-benzisoxazolylmethyl carbamate, 2-(trifluoromethyl)-6-chromonylmethyl carbamate (Tcroc), m-nitrophenyl carbamate, 3,5-dimethoxybenzyl carbamate, o-nitrobenzyl carbamate, 3,4-dimethoxy-6-nitrobenzyl carbamate, phenyl(o-nitrophenyl)methyl carbamate, t-amyl carbamate, S-benzyl thiocarbamate, p-cyanobenzyl carbamate, cyclobutyl carbamate, cyclohexyl carbamate, cyclopentyl carbamate, cyclopropylmethyl carbamate, p-decyloxybenzyl carbamate, 2,2-dimethoxyacylvinyl carbamate, o-(N,N-dimethylcarboxamido)benzyl carbamate, 1,1-dimethyl-3-(N,N-dimethylcarboxamido)propyl carbamate, 1,1-dimethylpropynyl carbamate, di(2-pyridyl)methyl carbamate, 2-furanylmethyl carbamate, 2-iodoethyl carbamate, isobornyl carbamate, isobutyl carbamate, isonicotinyl carbamate, p-(p′-methoxyphenylazo)benzyl carbamate, 1-methylcyclobutyl carbamate, 1-methylcyclohexyl carbamate, 1-methyl-1-cyclopropylmethyl carbamate, 1-methyl-1-(3,5-dimethoxyphenyl)ethyl carbamate, 1-methyl-1-(p-phenylazophenyl)ethyl carbamate, 1-methyl-1-phenylethyl carbamate, 1-methyl-1-(4-pyridyl)ethyl carbamate, phenyl carbamate, p-(phenylazo)benzyl carbamate, 2,4,6-tri-t-butylphenyl carbamate, 4-(trimethylammonium)benzyl carbamate, and 2,4,6-trimethylbenzyl carbamate.
Protecting groups (e.g., nitrogen or oxygen protecting groups) such as sulfonamide groups (e.g., —S(═O)2Raa) include, but are not limited to, p-toluenesulfonamide (Ts), benzenesulfonamide, 2,3,6-trimethyl-4-methoxybenzenesulfonamide (Mtr), 2,4,6-trimethoxybenzenesulfonamide (Mtb), 2,6-dimethyl-4-methoxybenzenesulfonamide (Pme), 2,3,5,6-tetramethyl-4-methoxybenzenesulfonamide (Mte), 4-methoxybenzenesulfonamide (Mbs), 2,4,6-trimethylbenzenesulfonamide (Mts), 2,6-dimethoxy-4-methylbenzenesulfonamide (iMds), 2,2,5,7,8-pentamethylchroman-6-sulfonamide (Pmc), methanesulfonamide (Ms), β-trimethylsilylethanesulfonamide (SES), 9-anthracenesulfonamide, 4-(4′,8′-dimethoxynaphthylmethyl)benzenesulfonamide (DNMBS), benzylsulfonamide, trifluoromethylsulfonamide, and phenacylsulfonamide.
Other protecting groups (e.g., nitrogen or oxygen protecting groups) include, but are not limited to, phenothiazinyl-(10)-acyl derivative, N′-p-toluenesulfonylaminoacyl derivative, N′-phenylaminothioacyl derivative, N-benzoylphenylalanyl derivative, N-acetylmethionine derivative, 4,5-diphenyl-3-oxazolin-2-one, N-phthalimide, N-dithiasuccinimide (Dts), N-2,3-diphenylmaleimide, N-2,5-dimethylpyrrole, N-1,1,4,4-tetramethyldisilylazacyclopentane adduct (STABASE), 5-substituted 1,3-dimethyl-1,3,5-triazacyclohexan-2-one, 5-substituted 1,3-dibenzyl-1,3,5-triazacyclohexan-2-one, 1-substituted 3,5-dinitro-4-pyridone, N-methylamine, N-allylamine, N-[2-(trimethylsilyl)ethoxy]methylamine (SEM), N-3-acetoxypropylamine, N-(1-isopropyl-4-nitro-2-oxo-3-pyrrolin-3-yl)amine, quaternary ammonium salts, N-benzylamine, N-di(4-methoxyphenyl)methylamine, N-5-dibenzosuberylamine, N-triphenylmethylamine (Tr), N-[(4-methoxyphenyl)diphenylmethyl]amine (MMTr), N-9-phenylfluorenylamine (PhF), N-2,7-dichloro-9-fluorenylmethyleneamine, N-ferrocenylmethylamino (Fcm), N-2-picolylamino N′-oxide, N-1,1-dimethylthiomethyleneamine, N-benzylideneamine, N-p-methoxybenzylideneamine, N-diphenylmethyleneamine, N-[(2-pyridyl)mesityl]methyleneamine, N—(N′,N′-dimethylaminomethylene)amine, N,N′-isopropylidenediamine, N-p-nitrobenzylideneamine, N-salicylideneamine, N-5-chlorosalicylideneamine, N-(5-chloro-2-hydroxyphenyl)phenylmethyleneamine, N-cyclohexylideneamine, N-(5,5-dimethyl-3-oxo-1-cyclohexenyl)amine, N-borane derivative, N-diphenylborinic acid derivative, N-[phenyl(pentaacylchromium- or tungsten)acyl]amine, N-copper chelate, N-zinc chelate, N-nitroamine, N-nitrosoamine, amine N-oxide, diphenylphosphinamide (Dpp), dimethylthiophosphinamide (Mpt), diphenylthiophosphinamide (Ppt), dialkyl phosphoramidates, dibenzyl phosphoramidate, diphenyl phosphoramidate, benzenesulfenamide, o-nitrobenzenesulfenamide (Nps), 2,4-dinitrobenzenesulfenamide, pentachlorobenzenesulfenamide, 2-nitro-4-methoxybenzenesulfenamide, triphenylmethylsulfenamide, and 3-nitropyridinesulfenamide (Npys). In certain embodiments, a protecting group (e.g., nitrogen or oxygen protecting group) is benzyl (Bn), tert-butyloxycarbonyl (BOC), carbobenzyloxy (Cbz), 9-fluorenylmethyloxycarbonyl (Fmoc), trifluoroacetyl, triphenylmethyl, acetyl (Ac), benzoyl (Bz), p-methoxybenzyl (PMB), 3,4-dimethoxybenzyl (DMPM), p-methoxyphenyl (PMP), 2,2,2-trichloroethyloxycarbonyl (Troc), triphenylmethyl (Tr), tosyl (Ts), brosyl (Bs), nosyl (Ns), mesyl (Ms), triflyl (Tf), or dansyl (Ds).
As used herein, the term “salt” refers to any and all salts, and encompasses pharmaceutically acceptable salts. The term “pharmaceutically acceptable salt” refers to those salts which are, within the scope of sound medical judgment, suitable for use in contact with the tissues of humans and lower animals without undue toxicity, irritation, allergic response, and the like, and are commensurate with a reasonable benefit/risk ratio. Pharmaceutically acceptable salts are well known in the art. For example, Berge et al. describe pharmaceutically acceptable salts in detail in J. Pharmaceutical Sciences, 1977, 66, 1-19, incorporated herein by reference. Pharmaceutically acceptable salts of the compounds of this invention include those derived from suitable inorganic and organic acids and bases. Examples of pharmaceutically acceptable, nontoxic acid addition salts are salts of an amino group formed with inorganic acids, such as hydrochloric acid, hydrobromic acid, phosphoric acid, sulfuric acid, and perchloric acid or with organic acids, such as acetic acid, oxalic acid, maleic acid, tartaric acid, citric acid, succinic acid, or malonic acid or by using other methods known in the art such as ion exchange. Other pharmaceutically acceptable salts include adipate, alginate, ascorbate, aspartate, benzenesulfonate, benzoate, bisulfate, borate, butyrate, camphorate, camphorsulfonate, citrate, cyclopentanepropionate, digluconate, dodecylsulfate, ethanesulfonate, formate, fumarate, glucoheptonate, glycerophosphate, gluconate, hemisulfate, heptanoate, hexanoate, hydroiodide, 2-hydroxy-ethanesulfonate, lactobionate, lactate, laurate, lauryl sulfate, malate, maleate, malonate, methanesulfonate, 2-naphthalenesulfonate, nicotinate, nitrate, oleate, oxalate, palmitate, pamoate, pectinate, persulfate, 3-phenylpropionate, phosphate, picrate, pivalate, propionate, stearate, succinate, sulfate, tartrate, thiocyanate, p-toluenesulfonate, undecanoate, valerate salts, and the like. 
Salts derived from appropriate bases include alkali metal, alkaline earth metal, ammonium, and N+(C1-4 alkyl)4 salts. Representative alkali or alkaline earth metal salts include sodium, lithium, potassium, calcium, magnesium, and the like. Further pharmaceutically acceptable salts include, when appropriate, nontoxic ammonium, quaternary ammonium, and amine cations formed using counterions such as halide, hydroxide, carboxylate, sulfate, phosphate, nitrate, lower alkyl sulfonate, and aryl sulfonate.
The accompanying drawings, which constitute a part of this specification, illustrate several embodiments of the invention and together with the description, serve to explain the principles of the invention.
Described herein are methods for the de novo synthesis of RNA oligonucleotides using enzymatic catalysis. For example, provided herein are methods for the synthesis of RNA oligonucleotides wherein a terminal transferase enzyme (e.g., a poly(N) polymerase) incorporates one or more nucleotides onto an initiator oligonucleotide. For example, provided herein are methods for preparing RNA oligonucleotides wherein a poly(U) polymerase incorporates one or more modified nucleotides onto an initiator oligonucleotide via terminal transferase activity.
In one aspect, provided herein are methods wherein modified nucleotides (i.e., 2′- or 3′-modified reversible terminator nucleotides) are incorporated that reversibly alter the binding affinity of the polymerase (e.g., a poly(U) polymerase) for the extended initiator oligonucleotide, thereby producing an (n+1) extended RNA oligonucleotide which can be deprotected and then further extended.
In another aspect, provided herein are methods for RNA oligonucleotide synthesis wherein non-hydrolyzable nucleotides are used to control the rate at which a polymerase (e.g., a poly(U) polymerase) incorporates hydrolyzable nucleotides onto an initiator oligonucleotide.
In another aspect, provided herein are methods of ligating two oligonucleotides using a poly(N)polymerase described herein (e.g., a wild-type or mutated poly(U) polymerase described herein) to yield a desired RNA oligonucleotide.
Additionally, RNA oligonucleotides produced by these methods can undergo reverse transcription (RT) to yield complementary DNA (cDNA) that is amplifiable by a high-fidelity DNA polymerase via the polymerase chain reaction (PCR). Also provided herein are RNA oligonucleotides and DNA oligonucleotides produced by any method described herein.
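By way of non-limiting illustration, the sequence relationship between an RNA oligonucleotide and its first-strand cDNA can be sketched in Python as follows. The sketch is not part of the original disclosure; the function name `first_strand_cdna` and the example sequence are hypothetical, and only standard Watson-Crick pairing of unmodified A, C, G, and U is assumed.

```python
# Watson-Crick pairing from an RNA template base to the DNA base opposite it.
RNA_TO_DNA_COMPLEMENT = {"A": "T", "U": "A", "G": "C", "C": "G"}


def first_strand_cdna(rna: str) -> str:
    """Return the first-strand cDNA (written 5'->3') for an RNA written 5'->3'.

    Assumes an unmodified RNA containing only A, C, G, and U.
    """
    rna = rna.upper()
    unknown = set(rna) - set(RNA_TO_DNA_COMPLEMENT)
    if unknown:
        raise ValueError(f"unsupported RNA symbols: {sorted(unknown)}")
    # The cDNA is antiparallel to the template, so read the RNA 3'->5'.
    return "".join(RNA_TO_DNA_COMPLEMENT[base] for base in reversed(rna))


print(first_strand_cdna("AUGC"))
```

For the hypothetical input 5′-AUGC-3′, the sketch returns 5′-GCAT-3′, i.e., the reverse complement expressed in DNA bases.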
Also provided herein are modified nucleotides that are useful in the methods described herein, as well as poly(N) polymerase enzymes (e.g., mutant poly(U) polymerases) that are useful in the methods described herein.
Also provided herein are compositions and kits comprising one or more of the poly(N) polymerases and/or nucleotides described herein. In another aspect, provided herein are reaction mixtures and systems for carrying out the methods described herein.
Provided herein are methods for the synthesis of RNA oligonucleotides wherein a terminal transferase (e.g., a poly(N) polymerase) incorporates one or more modified nucleotides onto an initiator oligonucleotide. In certain embodiments, provided herein is a method for template-independent synthesis of an RNA oligonucleotide, the method comprising:
(a) providing an initiator oligonucleotide, wherein the initiator oligonucleotide is single-stranded RNA;
(b) providing a poly(N) polymerase; and
(c) combining the initiator oligonucleotide, the poly(N) polymerase, and one or more modified nucleotides under conditions sufficient for the addition of at least one modified nucleotide to the 3′ end of the initiator oligonucleotide.
In certain embodiments, the poly(N) polymerase is a poly(U) polymerase. Therefore, in certain embodiments, provided herein is a method for template-independent synthesis of an RNA oligonucleotide, the method comprising:
(a) providing an initiator oligonucleotide, wherein the initiator oligonucleotide is single-stranded RNA;
(b) providing a poly(U) polymerase;
(c) combining the initiator oligonucleotide, the poly(U) polymerase, and one or more modified nucleotides under conditions sufficient for the addition of at least one modified nucleotide to the 3′ end of the initiator oligonucleotide.
Once the one or more modified nucleotides are added to the initiator oligonucleotide, one or more additional nucleotides (modified or unmodified) can be added subsequently in order to synthesize a desired RNA oligonucleotide. Therefore, in certain embodiments, the method further comprises adding one or more natural or modified nucleotides to the 3′ end of the resulting RNA oligonucleotide (i.e., the RNA oligonucleotide formed in step (c)) until a desired RNA sequence is obtained. In certain embodiments, one or more additional modified nucleotides are added. In certain embodiments, the method further comprises:
(d) repeating steps (a)-(c) until a desired RNA sequence is obtained.
As described herein, the enzyme incorporating one or more nucleotides is an RNA polymerase, such as a poly(N) polymerase. Provided herein are poly(N) polymerases, e.g., mutant (i.e., mutated) poly(U) polymerases, that are useful in the methods described herein.
In certain embodiments, the poly(N) polymerase is a poly(U) polymerase, a poly(A) polymerase, a poly(C) polymerase, or a poly(G) polymerase. The RNA polymerase may be a wild-type polymerase, or a mutant (i.e., mutated), variant, or homolog thereof. In certain embodiments, the poly(N) polymerase is a wild-type polymerase. In certain embodiments, the polymerase is a mutant of a poly(N) polymerase. In certain embodiments, the polymerase is a variant of a poly(N) polymerase. In certain embodiments, a mutant or variant is approximately 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98%, or 99% identical to the wild-type polymerase. In certain embodiments, the polymerase is a homolog of a poly(N) polymerase.
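By way of illustration only, percent identity between a variant and a wild-type polymerase can be computed from a pairwise alignment roughly as sketched below. The function name, the gap handling, and the example sequences are assumptions made for this sketch; conventions for the denominator differ between alignment tools, and the disclosure does not prescribe one.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over two pre-aligned sequences of equal length.

    Assumption for this sketch: '-' marks a gap, columns where both
    sequences have a gap are ignored, and every other column counts
    toward the denominator.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" and b == "-":
            continue  # column carries no information
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


# Hypothetical 10-residue fragments differing at one position.
print(percent_identity("MKTAYIAKQR", "MKTAYIAKQA"))  # 90.0
```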
In certain embodiments, the poly(N) polymerase is a poly(U) polymerase. In certain embodiments, the poly(U) polymerase is wild-type Schizosaccharomyces pombe poly(U) polymerase, or a mutant thereof, or a homolog thereof. In certain embodiments, the poly(U) polymerase is wild-type Schizosaccharomyces pombe poly(U) polymerase. In certain embodiments, the poly(U) polymerase is a mutant of Schizosaccharomyces pombe poly(U) polymerase. In certain embodiments, the poly(U) polymerase is a variant of Schizosaccharomyces pombe poly(U) polymerase. In certain embodiments, the poly(U) polymerase is a homolog of Schizosaccharomyces pombe poly(U) polymerase.
In certain embodiments, the poly(N) polymerase is a poly(A) polymerase. In certain embodiments, the poly(A) polymerase is wild-type Saccharomyces cerevisiae poly(A) polymerase, or a mutant thereof. In certain embodiments, the poly(N) polymerase is wild-type Saccharomyces cerevisiae poly(A) polymerase. In certain embodiments, the poly(N) polymerase is a mutant of Saccharomyces cerevisiae poly(A) polymerase. In certain embodiments, the poly(N) polymerase is a variant of Saccharomyces cerevisiae poly(A) polymerase. In certain embodiments, the poly(N) polymerase is a homolog of Saccharomyces cerevisiae poly(A) polymerase.
As described herein, in certain embodiments, the poly(N) polymerase is a mutant of a poly(N) polymerase (i.e., mutated poly(N) polymerase). In certain embodiments, the poly(N) polymerase is a Schizosaccharomyces pombe poly(U) (S. pombe poly(u)) polymerase comprising mutations at one or more positions selected from H336, N171, and T172.
In certain embodiments, the poly(N) polymerase is a Schizosaccharomyces pombe poly(U) (S. pombe poly(u)) polymerase comprising an H336 mutation (i.e., wherein the amino acid H at position 336 is replaced with another amino acid). In certain embodiments, the poly(N) polymerase is an S. pombe poly(u) polymerase comprising an H336 mutation selected from the group consisting of H336A, H336C, H336D, H336E, H336F, H336G, H336I, H336K, H336L, H336M, H336T, H336V, H336W, H336Y, H336N, H336P, H336Q, H336R, and H336S. In certain embodiments, the H336 mutation is the only mutation. In certain embodiments, the poly(N) polymerase comprises one or more additional mutations and is about 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98%, or 99% identical to SEQ ID NO:3. In certain embodiments, the poly(N) polymerase is identical to SEQ ID NO:3, but includes one H336 mutation selected from the group consisting of H336A, H336C, H336D, H336E, H336F, H336G, H336I, H336K, H336L, H336M, H336T, H336V, H336W, H336Y, H336N, H336P, H336Q, H336R, and H336S.
In certain embodiments, the poly(N) polymerase is an S. pombe poly(u) polymerase comprising an H336R mutation. In certain embodiments, the H336R mutation is the only mutation. In certain embodiments, the poly(N) polymerase comprises one or more additional mutations and is about 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98%, or 99% identical to SEQ ID NO:3. In certain embodiments, the poly(N) polymerase is identical to SEQ ID NO:3, but includes one mutation: H336R.
In certain embodiments, the poly(N) polymerase is an S. pombe poly(u) polymerase comprising an N171 mutation. In certain embodiments, the poly(N) polymerase is an S. pombe poly(u) polymerase comprising an N171 mutation selected from the group consisting of N171E, N171L, N171Q, N171S, N171M, N171D, N171G, N171C, N171A, N171W, N171T, N171I, N171V, N171P, N171R, N171H, and N171K. In certain embodiments, the N171 mutation is the only mutation. In certain embodiments, the poly(N) polymerase comprises one or more additional mutations and is about 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98%, or 99% identical to SEQ ID NO:3. In certain embodiments, the poly(N) polymerase is identical to SEQ ID NO:3, but includes one N171 mutation selected from the group consisting of N171E, N171L, N171Q, N171S, N171M, N171D, N171G, N171C, N171A, N171W, N171T, N171I, N171V, N171P, N171R, N171H, and N171K.
In certain embodiments, the poly(N) polymerase is an S. pombe poly(u) polymerase comprising an N171A mutation. In certain embodiments, the N171A mutation is the only mutation. In certain embodiments, the poly(N) polymerase comprises one or more additional mutations and is about 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98%, or 99% identical to SEQ ID NO:3. In certain embodiments, the poly(N) polymerase is identical to SEQ ID NO:3, but includes one mutation: N171A.
In certain embodiments, the poly(N) polymerase is an S. pombe poly(u) polymerase comprising an N171T mutation. In certain embodiments, the N171T mutation is the only mutation. In certain embodiments, the poly(N) polymerase comprises one or more additional mutations and is about 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98%, or 99% identical to SEQ ID NO:3. In certain embodiments, the poly(N) polymerase is identical to SEQ ID NO:3, but includes one mutation: N171T.
In certain embodiments, the poly(N) polymerase is an S. pombe poly(u) polymerase comprising a T172 mutation. In certain embodiments, the poly(N) polymerase is an S. pombe poly(u) polymerase comprising a T172 mutation selected from the group consisting of T172E, T172L, T172Q, T172S, T172M, T172D, T172G, T172C, T172A, T172W, T172T, T172I, T172V, T172P, T172R, T172H, and T172K. In certain embodiments, the T172 mutation is the only mutation. In certain embodiments, the poly(N) polymerase comprises one or more additional mutations and is about 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98%, or 99% identical to SEQ ID NO:3. In certain embodiments, the poly(N) polymerase is identical to SEQ ID NO:3, but includes one T172 mutation selected from the group consisting of T172E, T172L, T172Q, T172S, T172M, T172D, T172G, T172C, T172A, T172W, T172T, T172I, T172V, T172P, T172R, T172H, and T172K.
In certain embodiments, the poly(N) polymerase is an S. pombe poly(u) polymerase comprising H336 and N171 mutations. In certain embodiments, the poly(N) polymerase is an S. pombe poly(u) polymerase comprising an H336 mutation selected from the group consisting of H336A, H336C, H336D, H336E, H336F, H336G, H336I, H336K, H336L, H336M, H336T, H336V, H336W, H336Y, H336N, H336P, H336Q, H336R, and H336S; and an N171 mutation selected from the group consisting of N171E, N171L, N171Q, N171S, N171M, N171D, N171G, N171C, N171A, N171W, N171T, N171I, N171V, N171P, N171R, N171H, and N171K. In certain embodiments, the H336 and N171 mutations are the only mutations. In certain embodiments, the poly(N) polymerase comprises one or more additional mutations and is about 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98%, or 99% identical to SEQ ID NO:3. In certain embodiments, the poly(N) polymerase is identical to SEQ ID NO:3, but includes one H336 mutation selected from the group consisting of H336A, H336C, H336D, H336E, H336F, H336G, H336I, H336K, H336L, H336M, H336T, H336V, H336W, H336Y, H336N, H336P, H336Q, H336R, and H336S; and one N171 mutation selected from the group consisting of N171E, N171L, N171Q, N171S, N171M, N171D, N171G, N171C, N171A, N171W, N171T, N171I, N171V, N171P, N171R, N171H, and N171K.
In certain embodiments, the poly(N) polymerase is an S. pombe poly(u) polymerase comprising H336R and N171A mutations. In certain embodiments, the H336R and N171A mutations are the only mutations. In certain embodiments, the poly(N) polymerase comprises one or more additional mutations and is about 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98%, or 99% identical to SEQ ID NO:3. In certain embodiments, the poly(N) polymerase is identical to SEQ ID NO:3, but includes two mutations: H336R and N171A.
In certain embodiments, the poly(N) polymerase is an S. pombe poly(u) polymerase comprising H336R and N171T mutations. In certain embodiments, the H336R and N171T mutations are the only mutations. In certain embodiments, the poly(N) polymerase comprises one or more additional mutations and is about 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98%, or 99% identical to SEQ ID NO:3. In certain embodiments, the poly(N) polymerase is identical to SEQ ID NO:3, but includes two mutations: H336R and N171T.
In certain embodiments, the poly(N) polymerase is an S. pombe poly(u) polymerase comprising H336 and T172 mutations. In certain embodiments, the poly(N) polymerase is an S. pombe poly(u) polymerase comprising an H336 mutation selected from the group consisting of H336A, H336C, H336D, H336E, H336F, H336G, H336I, H336K, H336L, H336M, H336T, H336V, H336W, H336Y, H336N, H336P, H336Q, H336R, and H336S; and a T172 mutation selected from the group consisting of T172E, T172L, T172Q, T172S, T172M, T172D, T172G, T172C, T172A, T172W, T172T, T172I, T172V, T172P, T172R, T172H, and T172K. In certain embodiments, the H336 and T172 mutations are the only mutations. In certain embodiments, the poly(N) polymerase comprises one or more additional mutations and is about 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98%, or 99% identical to SEQ ID NO:3. In certain embodiments, the poly(N) polymerase is identical to SEQ ID NO:3, but includes one H336 mutation selected from the group consisting of H336A, H336C, H336D, H336E, H336F, H336G, H336I, H336K, H336L, H336M, H336T, H336V, H336W, H336Y, H336N, H336P, H336Q, H336R, and H336S; and one T172 mutation selected from the group consisting of T172E, T172L, T172Q, T172S, T172M, T172D, T172G, T172C, T172A, T172W, T172T, T172I, T172V, T172P, T172R, T172H, and T172K. In certain embodiments, the H336 mutation is H336R.
In certain embodiments, the poly(N) polymerase is an S. pombe poly(u) polymerase comprising H336, N171, and T172 mutations. In certain embodiments, the poly(N) polymerase is an S. pombe poly(u) polymerase comprising an H336 mutation selected from the group consisting of H336A, H336C, H336D, H336E, H336F, H336G, H336I, H336K, H336L, H336M, H336T, H336V, H336W, H336Y, H336N, H336P, H336Q, H336R, and H336S; an N171 mutation selected from the group consisting of N171E, N171L, N171Q, N171S, N171M, N171D, N171G, N171C, N171A, N171W, N171T, N171I, N171V, N171P, N171R, N171H, and N171K; and a T172 mutation selected from the group consisting of T172E, T172L, T172Q, T172S, T172M, T172D, T172G, T172C, T172A, T172W, T172T, T172I, T172V, T172P, T172R, T172H, and T172K. In certain embodiments, the H336, N171, and T172 mutations are the only mutations. In certain embodiments, the poly(N) polymerase comprises one or more additional mutations and is about 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98%, or 99% identical to SEQ ID NO:3. In certain embodiments, the poly(N) polymerase is identical to SEQ ID NO:3, but includes one H336 mutation selected from the group consisting of H336A, H336C, H336D, H336E, H336F, H336G, H336I, H336K, H336L, H336M, H336T, H336V, H336W, H336Y, H336N, H336P, H336Q, H336R, and H336S; one N171 mutation selected from the group consisting of N171E, N171L, N171Q, N171S, N171M, N171D, N171G, N171C, N171A, N171W, N171T, N171I, N171V, N171P, N171R, N171H, and N171K; and one T172 mutation selected from the group consisting of T172E, T172L, T172Q, T172S, T172M, T172D, T172G, T172C, T172A, T172W, T172T, T172I, T172V, T172P, T172R, T172H, and T172K. In certain embodiments, the H336 mutation is H336R. In certain embodiments, the N171 mutation is N171A or N171T.
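By way of illustration only, the substitution notation used above (wild-type residue, 1-based position, replacement residue, e.g., H336R) can be parsed and applied to a sequence as sketched below. All function names are hypothetical, and the four-residue sequence is a toy stand-in; SEQ ID NO:3 is not reproduced here.

```python
import re

# The twenty canonical amino acids, one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
_SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_substitution(label: str):
    """Split a label such as 'H336R' into ('H', 336, 'R')."""
    match = _SUBSTITUTION.match(label)
    if match is None:
        raise ValueError(f"not a substitution label: {label!r}")
    wild, position, new = match.group(1), int(match.group(2)), match.group(3)
    if wild not in AMINO_ACIDS or new not in AMINO_ACIDS:
        raise ValueError(f"unknown amino acid in {label!r}")
    return wild, position, new


def apply_substitutions(sequence: str, labels):
    """Return a copy of `sequence` carrying each substitution in `labels`."""
    residues = list(sequence)
    for label in labels:
        wild, position, new = parse_substitution(label)
        if not 1 <= position <= len(residues):
            raise ValueError(f"{label}: position outside the sequence")
        if sequence[position - 1] != wild:
            raise ValueError(f"{label}: wild-type residue does not match")
        residues[position - 1] = new
    return "".join(residues)


def single_substitutions(wild: str, position: int):
    """All labels replacing `wild` at `position` with a different residue."""
    return [f"{wild}{position}{aa}" for aa in AMINO_ACIDS if aa != wild]


toy_sequence = "MNTH"  # toy stand-in, NOT SEQ ID NO:3
print(apply_substitutions(toy_sequence, ["N2A", "H4R"]))  # MATR
print(len(single_substitutions("H", 336)))  # 19
```

A check of this kind also catches labels whose stated wild-type residue does not match the reference sequence at the stated position.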
As described herein, one or more modified nucleotides can be incorporated into an oligonucleotide in order to synthesize a desired RNA oligonucleotide. Modified nucleotides can be incorporated in order to prepare custom RNA or DNA oligonucleotides. In other embodiments, modified nucleotides can be incorporated to block the incorporation of subsequent nucleotides (i.e., via the use of “reversible terminators” as described herein). Provided herein are modified nucleotides that are useful in the methods described herein, as well as in other applications (e.g., chemical oligonucleotide synthesis, therapeutic applications, etc.).
A “modified nucleotide” is a nucleotide monomer (e.g., comprising a ribose sugar, a phosphate group, and a nucleobase) comprising one or more non-natural modifications. In certain embodiments, for example, the modified nucleotide is the structural equivalent of a naturally occurring RNA or DNA nucleotide (i.e., guanine (G), uracil (U), adenine (A), cytosine (C)) but comprising one or more non-natural modifications. In certain embodiments, the modified nucleotide is the equivalent of a naturally occurring nucleotide, wherein one or more positions are substituted, or wherein one or more substituents or groups are removed or replaced. In certain embodiments, the modified nucleotide comprises a modified sugar, a modified base, a modified phosphate, or any combination thereof. “Modified nucleotides” include the 2′- and 3′-reversible terminator nucleotides described herein.
The following formula is intended to show possible modification sites on a nucleotide. Other modifications are contemplated. In certain embodiments, a modified nucleotide is of the following formula:
or a salt thereof, wherein:
“Base” (also “B” herein) is a natural or non-natural nucleotide base; and
R and R′ are independently hydrogen or a natural or non-natural sugar substituent.
In certain embodiments, a modified nucleotide is of the following formula:
or a salt thereof, wherein:
X is O or S;
Y is O or S;
“Base” (also “B” herein) is a natural or non-natural nucleotide base; and
R and R′ are independently hydrogen or a natural or non-natural sugar substituent.
In certain embodiments, Y is O. In certain embodiments, Y is S. In certain embodiments, X is O. In certain embodiments, X is S.
In certain embodiments, a modified nucleotide is a base-modified nucleotide. “Base-modified” encompasses nucleotides, wherein a G, U, A, or C base is substituted or modified, or wherein a G, U, A, or C base is replaced by a different group (e.g., hypoxanthine).
Non-limiting examples of modified bases include, but are not limited to, 5-methylcytosine, pyridin-4-one, pyridin-2-one, phenyl, pseudouracil, 3-methyl uracil, dihydrouridine, naphthyl, aminophenyl, 5-alkylcytidines, 5-alkyluridines, 5-halouridines, 6-azapyrimidines, 6-alkylpyrimidines, propyne, queuosine, 2-thiouridine, 4-thiouridine, 4-acetylcytidine, 5-(carboxyhydroxymethyl)uridine, 5-carboxymethylaminomethyl-2-thiouridine, 5-carboxymethylaminomethyluridine, β-D-galactosylqueuosine, 1-methyladenosine, 1-methylinosine, 2,2-dimethylguanosine, 3-methylcytidine, 2-methyladenosine, 2-methylguanosine, N6-methyladenosine, 7-methylguanosine, 5-methoxyaminomethyl-2-thiouridine, 5-methylaminomethyluridine, 5-methylcarbonylmethyluridine, 5-methyloxyuridine, 5-methyl-2-thiouridine, 2-methylthio-N6-isopentenyladenosine, β-D-mannosylqueuosine, uridine-5-oxyacetic acid, 2-thiocytidine, and threonine derivatives.
Other non-limiting examples of bases include, but are not limited to, natural or non-natural pyrimidines or purines, such as N1-methyl-adenine, N6-methyl-adenine, 8-azido-adenine, N,N-dimethyl-adenosine, aminoallyl-adenosine, 5-methyl-uridine, pseudouridine, N1-methyl-pseudouridine, 5-hydroxy-methyl-uridine, 2-thio-uridine, 4-thio-uridine, hypoxanthine, xanthine, 5-methyl-cytidine, 5-hydroxy-methyl-cytidine, 6-thio-guanine, and N7-methyl-guanine.
In certain embodiments, the base-modified nucleotide is selected from the group consisting of N1-methyladenosine-5′-triphosphate, N6-methyladenosine-5′-triphosphate, N6-methyl-2-aminoadenosine-5′-triphosphate, 5-methyluridine-5′-triphosphate, N1-methylpseudouridine-5′-triphosphate, pseudouridine-5′-triphosphate, 5-hydroxymethyluridine-5′-triphosphate, 5-methylcytidine-5′-triphosphate, 5-hydroxymethylcytidine-5′-triphosphate, N7-methylguanosine-triphosphate, 8-azidoadenosine-5′-triphosphate, inosine-5′-triphosphate, 2-thiouridine-5′-triphosphate, 6-thioguanosine-5′-triphosphate, 4-thiouridine-5′-triphosphate, and xanthosine-5′-triphosphate.
In certain embodiments, the modified nucleotide is a sugar-modified nucleotide. “Sugar-modified” nucleotides encompass nucleotides wherein the ribose or deoxyribose moiety is substituted, or wherein the ribose or deoxyribose is replaced by a different sugar moiety. In certain embodiments, the ribose or deoxyribose is modified (e.g., substituted) at the 1′, 2′, 3′, 4′, and/or 5′ position. In some embodiments, a nucleotide may be modified at the 2′ position. In some embodiments, a nucleotide may be modified at the 3′ position.
In certain embodiments, the 2′ and/or 3′ position of a sugar is substituted with a natural or non-natural “sugar substituent” R or R′. In certain embodiments, R and R′ are independently selected from the group consisting of hydrogen, halogen, —CN, —NO2, —N3, optionally substituted alkyl, optionally substituted alkenyl, optionally substituted alkynyl, optionally substituted aryl, optionally substituted heteroaryl, optionally substituted carbocyclyl, optionally substituted heterocyclyl, optionally substituted acyl, optionally substituted hydroxyl, optionally substituted amino, and optionally substituted thiol.
In certain embodiments, R and/or R′ are independently —ORP, wherein each instance of RP is independently an oxygen protecting group, optionally substituted acyl, or an amino acid. In certain embodiments, R and/or R′ comprise a reactive moiety for bioconjugation (e.g., click chemistry handle, e.g., azide or alkyne), a fluorophore, catalytic protein, oligonucleotide, or reporting tag.
In some embodiments, the 2′ position of a sugar, e.g., ribose, may be modified with a halogen, e.g., a fluorine group; an alkyl group, e.g., methyl or ethyl group; a methoxy group; an amino group; a thio group; an aminopropyl group; a dimethylaminoethyl group; a dimethylaminopropyl group; a dimethylaminoethyloxyethyl group; an azido group; a silyl group; a cyclic alkyl group; or an N-methylacetamido group.
In certain embodiments, the 2′ position of a sugar, e.g., ribose, is modified with a hydroxyl (—OH), hydrogen (—H), fluoro (—F), amine (—NH2), azido (—N3), thiol (—SH), methoxy (—OCH3), or methoxyethoxy (—OCH2CH2OCH3).
In certain embodiments, the 2′ position may also be substituted with redox-active, fluorogenic or intrinsically fluorescent moieties, natural and non-natural amino acids, peptides, proteins, mono- or oligosaccharides, functional/ligand binding glycans, and polymers or large molecules such as polyethylene glycol (PEG).
In some embodiments, the 3′ position of a sugar, e.g., ribose, may be modified with a halogen, e.g., a fluorine group; an alkyl group, e.g., methyl or ethyl group; a methoxy group; an amino group; a thio group; an aminopropyl group; a dimethylaminoethyl group; a dimethylaminopropyl group; a dimethylaminoethyloxyethyl group; an azido group; a silyl group; a cyclic alkyl group; or an N-methylacetamido group.
In certain embodiments, the 3′ position of a sugar, e.g., ribose, is modified with a hydroxyl (—OH), hydrogen (—H), fluoro (—F), amine (—NH2), azido (—N3), thiol (—SH), methoxy (—OCH3), or methoxyethoxy (—OCH2CH2OCH3).
In certain embodiments, the 3′ position may also be substituted with redox-active, fluorogenic or intrinsically fluorescent moieties, natural and non-natural amino acids, peptides, proteins, mono- or oligosaccharides, functional/ligand binding glycans, and polymers or large molecules such as polyethylene glycol (PEG).
In certain embodiments, the sugar-modified nucleotide is modified at the 2′-position. For example, in certain embodiments, the sugar-modified nucleotide is a 2′-F, 2′-O-alkyl, 2′-amino, or 2′-azido modified nucleotide.
In certain embodiments, the sugar-modified nucleotide is a 2′-F modified nucleotide. In certain embodiments, the sugar-modified nucleotide is selected from the group consisting of 2′-fluoro-2′-deoxyadenosine-5′-triphosphate, 2′-fluoro-2′-deoxycytidine-5′-triphosphate, 2′-fluoro-2′-deoxyguanosine-5′-triphosphate, and 2′-fluoro-2′-deoxyuridine-5′-triphosphate.
In certain embodiments, the sugar-modified nucleotide is a 2′-O-alkyl modified nucleotide. In certain embodiments, the sugar-modified nucleotide is selected from the group consisting of 2′-O-methyladenosine-5′-triphosphate, 2′-O-methylcytidine-5′-triphosphate, 2′-O-methylguanosine-5′-triphosphate, 2′-O-methyluridine-5′-triphosphate, and 2′-O-methylinosine-5′-triphosphate.
In certain embodiments, the sugar-modified nucleotide is a 2′-amino modified nucleotide. In certain embodiments, the sugar-modified nucleotide is selected from the group consisting of 2′-amino-2′-deoxycytidine-5′-triphosphate, 2′-amino-2′-deoxyuridine-5′-triphosphate, 2′-amino-2′-deoxyadenosine-5′-triphosphate, and 2′-amino-2′-deoxyguanosine-5′-triphosphate.
In certain embodiments, the sugar-modified nucleotide is a 2′-azido modified nucleotide. In certain embodiments, the sugar-modified nucleotide is selected from the group consisting of 2′-azido-2′-deoxycytidine-5′-triphosphate, 2′-azido-2′-deoxyuridine-5′-triphosphate, 2′-azido-2′-deoxyadenosine-5′-triphosphate, and 2′-azido-2′-deoxyguanosine-5′-triphosphate.
In certain embodiments the modified nucleoside triphosphate is an irreversible terminator, also known as a capping nucleotide, such as 3′-O-methyl-NTP, 3′-O-methyl-dNTP, 3′-azido-dNTP, 3′-azido-NTP, 3′-amine-dNTP, 3′-amine-NTP, etc.
In certain embodiments, the sugar-modified nucleotide is a 2′-modified reversible terminator RNA nucleotide (e.g., 2′-O-protected reversible terminator nucleotide). 2′-modified reversible terminator nucleotides are described herein. In certain embodiments, the 2′-modified reversible terminator nucleotide also comprises a modified base moiety.
In certain embodiments, the sugar-modified nucleotide is a 3′-modified reversible terminator RNA nucleotide (e.g., 3′-O-protected reversible terminator nucleotide). 3′-modified reversible terminator nucleotides are described herein. In certain embodiments, the 3′-modified reversible terminator nucleotide also comprises a modified base moiety.
Other modifications to the sugar are contemplated. These modifications include, but are not limited to, replacing the ring's oxygen with a sulfur. In certain embodiments, a bridge is introduced between the 2′-carbon and the 4′-carbon (e.g., to limit ring conformation). In some embodiments, a modified nucleotide is a bridged nucleotide, e.g., a locked nucleic acid (LNA) nucleotide, a constrained ethyl (cEt) nucleotide, or an ethylene bridged nucleic acid (ENA) nucleotide.
In some embodiments, a nucleotide, e.g., a modified nucleotide, may comprise a modified phosphate group, e.g., a phosphorothioate. Non-limiting examples of modified phosphate groups include phosphorothioates, phosphotriesters, methyl phosphonates, alkyl, heterocyclic, amide, morpholino, peptide nucleic acids (PNA), and other known phosphorus-containing groups. In certain embodiments, the modification is to the alpha (α) phosphate of the triphosphate. In certain embodiments, the nucleotide is an (α) thiophosphonate. In certain embodiments, the modification is to the beta (β) and/or gamma (γ) phosphates of the triphosphate.
In certain embodiments, a nucleotide modified with a fluorophore can be used to verify the success of each iterative incorporation event, thereby producing, in some embodiments, virtually error-free RNA oligonucleotides. In certain embodiments, a modified nucleotide comprises a fluorophore.
Modified nucleotides may comprise more than one modification. For example, a modified nucleotide may comprise a base modification and a sugar modification.
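By way of illustration only, the modification sites discussed above (base, 2′ and 3′ sugar positions, and the alpha phosphate) can be recorded in a simple data structure so that combined modifications are easy to enumerate. The class, its field names, and its default values are assumptions of this sketch; treating a fluorophore as a separate field is a simplification, since a fluorophore may in practice be attached through the base or the sugar.

```python
from dataclasses import dataclass
from typing import List, Optional

NATURAL_RNA_BASES = {"A", "C", "G", "U"}


@dataclass(frozen=True)
class ModifiedNucleotide:
    """Hypothetical record of where a nucleotide departs from natural RNA."""

    base: str                          # "A", "C", "G", "U", or a modified base name
    sugar_2prime: str = "OH"           # natural ribose carries a 2'-OH
    sugar_3prime: str = "OH"           # natural ribose carries a 3'-OH
    alpha_phosphate: str = "O"         # "S" would denote a phosphorothioate
    fluorophore: Optional[str] = None  # label name, if any

    def modifications(self) -> List[str]:
        """List which parts of the nucleotide are modified."""
        found = []
        if self.base not in NATURAL_RNA_BASES:
            found.append("base")
        if self.sugar_2prime != "OH" or self.sugar_3prime != "OH":
            found.append("sugar")
        if self.alpha_phosphate != "O":
            found.append("phosphate")
        if self.fluorophore is not None:
            found.append("fluorophore")
        return found


# A base modification combined with a 2'-fluoro sugar modification.
example = ModifiedNucleotide(base="pseudouracil", sugar_2prime="F")
print(example.modifications())  # ['base', 'sugar']
```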
RNA Oligonucleotide Synthesis with Reversible Terminators
Also provided herein are methods of synthesizing RNA oligonucleotides using reversible terminator nucleotides. A “reversible terminator nucleotide” is a nucleotide comprising a non-natural chemical moiety at the 2′- and/or 3′-position that is capable of being removed. After addition of the reversible terminator nucleotide to the initiator oligonucleotide, the non-natural chemical moiety at the 2′- and/or 3′-position blocks the incorporation of a second nucleotide, e.g., by interfering with the binding of the oligonucleotide to the polymerase. The non-natural chemical moiety at the 2′- and/or 3′-position can then be removed, leaving the 3′-position open to the addition of an additional nucleotide. In certain embodiments, the method allows for the controlled addition of one nucleotide at a time, also referred to as “(n+1)” addition. In certain embodiments, the reversible terminator nucleotide is protected at the 2′- and/or 3′-hydroxyl groups (i.e., “2′- and/or 3′-O-protected reversible terminator nucleotides”).
Provided herein are methods for template-independent synthesis of RNA oligonucleotides, the methods comprising:
(a) providing an initiator oligonucleotide, wherein the initiator oligonucleotide is single-stranded RNA;
(b) providing a poly(N) polymerase;
(c) combining the initiator oligonucleotide, the poly(N) polymerase, and a reversible terminator nucleotide under conditions sufficient for the addition of the reversible terminator nucleotide to the 3′ end of the initiator oligonucleotide;
(d) deprotecting the RNA oligonucleotide formed in step (c) at the protected position (e.g., 2′ and/or 3′ position) of the reversible terminator nucleotide; and
(e) optionally, repeating steps (a)-(d) until a desired RNA sequence is obtained.
In certain embodiments, the poly(N) polymerase is a poly(U) polymerase. Provided herein are methods for template-independent synthesis of RNA oligonucleotides, the methods comprising:
(a) providing an initiator oligonucleotide, wherein the initiator oligonucleotide is single-stranded RNA;
(b) providing a poly(U) polymerase;
(c) combining the initiator oligonucleotide, the poly(U) polymerase, and a 2′- and/or 3′-O-protected reversible terminator nucleotide under conditions sufficient for the addition of the 2′- and/or 3′-O-protected reversible terminator nucleotide to the 3′ end of the initiator oligonucleotide;
(d) deprotecting the RNA oligonucleotide formed in step (c) at the protected 2′- and/or 3′-O-position of the 2′- and/or 3′-O-protected reversible terminator nucleotide; and
(e) optionally, repeating steps (a)-(d) until a desired RNA sequence is obtained.
As described herein, 2′-O-protected reversible terminator nucleotides can also be used. In certain embodiments, the poly(N) polymerase is a poly(U) polymerase. Provided herein are methods for template-independent synthesis of RNA oligonucleotides, the methods comprising:
(a) providing an initiator oligonucleotide, wherein the initiator oligonucleotide is single-stranded RNA;
(b) providing a poly(U) polymerase;
(c) combining the initiator oligonucleotide, the poly(U) polymerase, and a 2′-O-protected reversible terminator nucleotide under conditions sufficient for the addition of the 2′-O-protected reversible terminator nucleotide to the 3′ end of the initiator oligonucleotide;
(d) deprotecting the RNA oligonucleotide formed in step (c) at the protected 2′-O-position of the 2′-O-protected reversible terminator nucleotide; and
(e) optionally, repeating steps (a)-(d) until a desired RNA sequence is obtained.
As described herein, 3′-O-protected reversible terminator nucleotides can also be used. Provided herein are methods for template-independent synthesis of RNA oligonucleotides, the methods comprising:
(a) providing an initiator oligonucleotide, wherein the initiator oligonucleotide is single-stranded RNA;
(b) providing a poly(U) polymerase;
(c) combining the initiator oligonucleotide, the poly(U) polymerase, and a 3′-O-protected reversible terminator nucleotide under conditions sufficient for the addition of the 3′-O-protected reversible terminator nucleotide to the 3′ end of the initiator oligonucleotide;
(d) deprotecting the RNA oligonucleotide formed in step (c) at the protected 3′-O-position of the 3′-O-protected reversible terminator nucleotide; and
(e) optionally, repeating steps (a)-(d) until a desired RNA sequence is obtained.
Any poly(N) polymerase described herein can be used in the reversible terminator methods described above. In certain embodiments, a mutant poly(U) polymerase described herein is used to incorporate the reversible terminator nucleotide. In certain embodiments, a mutant poly(U) polymerase described herein is used to incorporate a 3′-reversible terminator nucleotide described herein.
An RNA oligonucleotide of any particular sequence can be synthesized using the methods described herein.
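By way of illustration only, the controlled “(n+1)” cycle recited in the methods above (addition of one reversible terminator nucleotide, followed by deprotection before the next addition) can be modeled as a small state machine. The class and function names are hypothetical, and the model is idealized: it assumes that every addition and every deprotection goes to completion and does not model the polymerase, reaction conditions, or side products.

```python
from dataclasses import dataclass


@dataclass
class GrowingStrand:
    """Hypothetical model of an initiator being extended at its 3' end."""

    sequence: str
    blocked: bool = False  # True while the 3'-terminal nucleotide is protected


def add_reversible_terminator(strand: GrowingStrand, base: str) -> None:
    """Add exactly one nucleotide; the new 3' end is left protected."""
    if strand.blocked:
        raise RuntimeError("3' end is protected; deprotect before adding")
    strand.sequence += base
    strand.blocked = True


def deprotect(strand: GrowingStrand) -> None:
    """Remove the protecting moiety so the next addition can proceed."""
    strand.blocked = False


def synthesize(initiator: str, extension: str) -> str:
    """Extend `initiator` by `extension`, one add/deprotect cycle per base."""
    strand = GrowingStrand(initiator)
    for base in extension:
        add_reversible_terminator(strand, base)
        deprotect(strand)
    return strand.sequence


print(synthesize("AAAA", "GCU"))  # AAAAGCU
```

In this model, a second addition attempted without an intervening deprotection raises an error, which mirrors the blocking role of the protecting moiety.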
Some of the methods described herein employ reversible terminator nucleotides. A “reversible terminator nucleotide” is a modified nucleotide that comprises a non-natural chemical moiety at the 2′- and/or 3′-position that is capable of being removed. In certain embodiments, the reversible terminator nucleotide is protected at the 2′-O- and/or 3′-O-positions with an oxygen protecting group. Also provided herein are new reversible terminator nucleotides (e.g., 2′-modified reversible terminator nucleotides and 3′-modified reversible terminator nucleotides).
In certain embodiments, the 2′-modified reversible terminator nucleotide is protected at the 2′-O-position with an oxygen protecting group (“2′-O-protected reversible terminator nucleotide”). In certain embodiments, the 3′-modified reversible terminator nucleotide is protected at the 3′-O-position with an oxygen protecting group (“3′-O-protected reversible terminator nucleotide”).
For example, in certain embodiments, the reversible terminator nucleotide (i.e., 2′- and/or 3′-O-protected reversible terminator nucleotide) is of the following formula:
or a salt thereof, wherein:
each instance of RP is hydrogen, an oxygen protecting group, optionally substituted acyl, or an amino acid, or two RP are joined together with the intervening atoms to form optionally substituted heterocyclyl; provided that at least one RP is an oxygen protecting group, optionally substituted acyl, or an amino acid; and
“Base” (also “B” herein) is a natural or non-natural nucleotide (e.g., modified) base. Other portions of the nucleotide can be modified as described above and herein.
For example, in certain embodiments, the reversible terminator nucleotide (i.e., 2′- and/or 3′-O-protected reversible terminator nucleotide) is of the following formula:
or a salt thereof, wherein:
Y is O or S;
X is O or S;
each instance of RP is hydrogen, an oxygen protecting group, optionally substituted acyl, or an amino acid, or two RP are joined together with the intervening atoms to form optionally substituted heterocyclyl; provided that at least one RP is an oxygen protecting group, optionally substituted acyl, or an amino acid; and
“Base” (also “B” herein) is a natural or non-natural nucleotide (e.g., modified) base.
In certain embodiments, a 3′-modified reversible terminator nucleotide (i.e., 3′-O-protected reversible terminator nucleotide) is of the following formula:
or a salt thereof, wherein:
Y is O or S;
X is O or S;
RP is an oxygen protecting group;
R is hydrogen or a natural or non-natural sugar substituent described herein; and
“Base” (also “B” herein) is a natural or non-natural nucleotide (e.g., modified) base.
Optionally, in certain embodiments, a linking group connects the 2′ carbon to the 4′ carbon (e.g., through the group R′).
In certain embodiments, a 3′-modified reversible terminator nucleotide is a locked or bridged nucleotide. In certain embodiments, a 3′-modified reversible terminator nucleotide (i.e., 3′-O-protected reversible terminator nucleotide) is of the following formula:
or a salt thereof, wherein:
Y is O or S;
X is O or S;
RP is an oxygen protecting group, optionally substituted acyl, or an amino acid;
R is hydrogen or a natural or non-natural sugar substituent described herein; and
“Base” (also “B” herein) is a natural or non-natural nucleotide (e.g., modified) base.
In certain embodiments, Y is O. In certain embodiments, Y is S. In certain embodiments, X is O. In certain embodiments, X is S.
As described herein, “Base” (also “B” herein) can be any naturally or non-naturally occurring nucleobase. Naturally occurring bases include G, U, A, and C. Non-natural (e.g., modified) bases include substituted or modified variants of G, U, A, and C. Non-limiting examples of modified bases include, but are not limited to, 5-methylcytosine, pyridin-4-one, pyridin-2-one, phenyl, pseudouracil, 3-methyl uracil, dihydrouridine, naphthyl, aminophenyl, 5-alkylcytidines, 5-alkyluridines, 5-halouridines, 6-azapyrimidines, 6-alkylpyrimidines, propyne, queuosine, 2-thiouridine, 4-thiouridine, 4-acetylcytidine, 5-(carboxyhydroxymethyl)uridine, 5-carboxymethylaminomethyl-2-thiouridine, 5-carboxymethylaminomethyluridine, β-D-galactosylqueuosine, 1-methyladenosine, 1-methylinosine, 2,2-dimethylguanosine, 3-methylcytidine, 2-methyladenosine, 2-methylguanosine, N6-methyladenosine, 7-methylguanosine, 5-methoxyaminomethyl-2-thiouridine, 5-methylaminomethyluridine, 5-methylcarbonylmethyluridine, 5-methyloxyuridine, 5-methyl-2-thiouridine, 2-methylthio-N6-isopentenyladenosine, β-D-mannosylqueuosine, uridine-5-oxyacetic acid, 2-thiocytidine, and threonine derivatives.
Other non-limiting examples of bases include, but are not limited to, natural or non-natural pyrimidine or purine; and may include, but are not limited to, N1-methyl-adenine, N6-methyl-adenine, 8′-azido-adenine, N,N-dimethyl-adenosine, aminoallyl-adenosine, 5′-methyl-uridine, pseudouridine, N1-methyl-pseudouridine, 5′-hydroxy-methyl-uridine, 2′-thio-uridine, 4′-thio-uridine, hypoxanthine, xanthine, 5′-methyl-cytidine, 5′-hydroxy-methyl-cytidine, 6′-thio-guanine, and N7-methyl-guanine.
In certain embodiments, the nucleotide sugar is substituted with a natural or non-natural “sugar substituent” R. In certain embodiments, R is hydrogen, halogen, —CN, —NO2, —N3, optionally substituted alkyl, optionally substituted alkenyl, optionally substituted alkynyl, optionally substituted aryl, optionally substituted heteroaryl, optionally substituted carbocyclyl, optionally substituted heterocyclyl, optionally substituted acyl, optionally substituted hydroxyl, optionally substituted amino, or optionally substituted thiol. In certain embodiments, R is hydrogen. In certain embodiments, R is halogen. In certain embodiments, R is —CN. In certain embodiments, R is —NO2. In certain embodiments, R is —N3. In certain embodiments, R is optionally substituted alkyl. In certain embodiments, R is optionally substituted alkenyl. In certain embodiments, R is optionally substituted alkynyl. In certain embodiments, R is optionally substituted aryl. In certain embodiments, R is optionally substituted heteroaryl. In certain embodiments, R is optionally substituted carbocyclyl. In certain embodiments, R is optionally substituted heterocyclyl. In certain embodiments, R is optionally substituted acyl. In certain embodiments, R is optionally substituted hydroxyl. In certain embodiments, R is optionally substituted amino. In certain embodiments, R is optionally substituted thiol.
In certain embodiments, R is —ORP, wherein RP is an oxygen protecting group, optionally substituted acyl, or an amino acid.
In certain embodiments, R comprises a reactive moiety for bioconjugation (e.g., click chemistry handle, e.g., azide or alkyne), a fluorophore, catalytic protein, oligonucleotide, or reporting tag.
In some embodiments, R is halogen, e.g., a fluorine group; an alkyl group, e.g., methyl or ethyl group; a methoxy group; an amino group; a thio group; an aminopropyl group; a dimethylaminoethyl; a dimethylaminopropyl group; a dimethylaminoethyloxyethyl group; an azido group; a silyl group; a cyclic alkyl group; or a N-methylacetamido group.
In certain embodiments, R is hydroxyl (—OH), hydrogen (—H), fluoro (—F), amine (—NH2), azido (—N3), thiol (—SH), methoxy (—OCH3), or methoxyethoxy (—OCH2CH2OCH3).
In certain embodiments, R comprises a redox-active, fluorogenic, or intrinsically fluorescent moiety; a natural or non-natural amino acid; a peptide; a protein; a mono- or oligosaccharide; a functional/ligand-binding glycan; or a polymer or large molecule such as polyethylene glycol (PEG).
As defined herein, each RP is independently an oxygen protecting group, optionally substituted acyl, or an amino acid. In certain embodiments, RP is an oxygen protecting group. In certain embodiments, RP is optionally substituted acyl. In certain embodiments, RP is an amino acid. In certain embodiments, RP is an oxygen protecting group, optionally substituted acyl, or an amino acid that can be cleaved by an esterase.
In certain embodiments, the reversible terminator nucleotide is capable of being deprotected under photochemical conditions. Therefore, the reversible terminator RNA oligonucleotide, in certain embodiments, is protected at the 2′-O- and/or 3′-O-positions with a photolabile oxygen protecting group. In certain embodiments, a 2′-modified reversible terminator nucleotide is protected at the 2′-O position with a photolabile protecting group. In certain embodiments, a 3′-modified reversible terminator nucleotide is protected at the 3′-O position with a photolabile protecting group.
In certain embodiments a 2′- or 3′-O-protecting group (e.g., RP) is of one of the following formulae:
In certain embodiments a 2′- or 3′-O-protecting group (e.g., RP) is of one of the following formulae:
In certain embodiments a 2′- or 3′-O-protecting group (e.g., RP) is of one of the following formulae:
In certain embodiments, a 2′- or 3′-O-protecting group (e.g., RP) is of one of the following formulae:
In certain embodiments a 2′- or 3′-O-protecting group (e.g., RP) is an amino acid of the following formula:
In certain embodiments, each instance of RP is independently alkyl, silyl, allyl, azidomethyl, benzyl, coumarinyl, or carbonate.
In certain embodiments, a 2′-modified reversible terminator nucleotide is a 2′-O-alkyl, 2′-O-silyl, 2′-O-allyl, 2′-O-azidomethyl, 2′-O-benzyl, 2′-O-coumarinyl, or a 2′-O-carbonate modified nucleotide. In certain embodiments, the 2′-modified reversible terminator nucleotide is a 2′-O-carbonate modified nucleotide selected from 2′-O-allyloxycarbonyl and 2′-O-(2-oxo-2H-chromen-4-yl)methyloxycarbonyl.
In certain embodiments, the 2′-O-protected reversible terminator is a 2′-O-allyl-NTP or 2′-O-azidomethyl-NTP.
In certain embodiments, a 3′-modified reversible terminator nucleotide is a 3′-O-alkyl, 3′-O-silyl, 3′-O-allyl, 3′-O-azidomethyl, 3′-O-benzyl, 3′-O-coumarinyl, or a 3′-O-carbonate modified nucleotide. In certain embodiments, the 3′-modified reversible terminator nucleotide is a 3′-O-carbonate modified nucleotide selected from 3′-O-allyloxycarbonyl and 3′-O-(2-oxo-2H-chromen-4-yl)methyloxycarbonyl.
In certain embodiments, the 3′-O-protected reversible terminator is a 3′-O-allyl-NTP, 3′-O-azidomethyl-NTP, 3′-O-allyl carbonate-NTP, 3′-O-allyl carbonate-dNTP, 3′-O-azidomethyl carbonate-NTP, or 3′-O-azidomethyl carbonate-dNTP.
In certain embodiments, the 3′-O-protected reversible terminator is a 3′-O-allyl-NTP, 3′-(O-allyl-carbonate)-dNTP (e.g., 3′-(O-allyl-carbonate)-dATP, etc.), 3′-(O-azidomethyl carbonate)-dNTP, 3′-(O-acetate)-dNTP, 3′-(O-acyl amino acids)-dNTP, 3′-(O-3-methylcoumarin)-dNTP, 3′-(O-4-methylcoumarin carbonate)-dNTP, 3′-(O-2-nitrobenzyl)-dNTP, 3′-(O-2-nitrobenzyl carbonate)-dNTP, 3′-(O-TMS)-dNTP, or 3′-(O-Teoc)-dNTP.
Other certain embodiments of reversible terminator nucleotides, including certain embodiments of 3′-protected nucleotides, are shown in
As described herein, reversible terminator oligonucleotides may be protected with oxygen protecting groups (e.g, RP groups). Oxygen protecting groups are well known in the art and include those described in detail in Protecting Groups in Organic Synthesis, T. W. Greene and P. G. M. Wuts, 3rd edition, John Wiley & Sons, 1999, incorporated herein by reference. Exemplary oxygen protecting groups include, but are not limited to, methyl, methoxylmethyl (MOM), methylthiomethyl (MTM), t-butylthiomethyl, (phenyldimethylsilyl)methoxymethyl (SMOM), benzyloxymethyl (BOM), p-methoxybenzyloxymethyl (PMBM), (4-methoxyphenoxy)methyl (p-AOM), guaiacolmethyl (GUM), t-butoxymethyl, 4-pentenyloxymethyl (POM), siloxymethyl, 2-methoxyethoxymethyl (MEM), 2,2,2-trichloroethoxymethyl, bis(2-chloroethoxy)methyl, 2-(trimethylsilyl)ethoxymethyl (SEMOR), tetrahydropyranyl (THP), 3-bromotetrahydropyranyl, tetrahydrothiopyranyl, 1-methoxycyclohexyl, 4-methoxytetrahydropyranyl (MTHP), 4-methoxytetrahydrothiopyranyl, 4-methoxytetrahydrothiopyranyl S,S-dioxide, 1-[(2-chloro-4-methyl)phenyl]-4-methoxypiperidin-4-yl (CTMP), 1,4-dioxan-2-yl, tetrahydrofuranyl, tetrahydrothiofuranyl, 2,3,3a,4,5,6,7,7a-octahydro-7,8,8-trimethyl-4,7-methanobenzofuran-2-yl, 1-ethoxyethyl, 1-(2-chloroethoxy)ethyl, 1-methyl-1-methoxyethyl, 1-methyl-1-benzyloxyethyl, 1-methyl-1-benzyloxy-2-fluoroethyl, 2,2,2-trichloroethyl, 2-trimethylsilylethyl, 2-(phenylselenyl)ethyl, t-butyl, allyl, p-chlorophenyl, p-methoxyphenyl, 2,4-dinitrophenyl, benzyl (Bn), p-methoxybenzyl, 3,4-dimethoxybenzyl, o-nitrobenzyl, p-nitrobenzyl, p-halobenzyl, 2,6-dichlorobenzyl, p-cyanobenzyl, p-phenylbenzyl, 2-picolyl, 4-picolyl, 3-methyl-2-picolyl N-oxido, diphenylmethyl, p,p′-dinitrobenzhydryl, 5-dibenzosuberyl, triphenylmethyl, α-naphthyldiphenylmethyl, p-methoxyphenyldiphenylmethyl, di(p-methoxyphenyl)phenylmethyl, tri(p-methoxyphenyl)methyl, 4-(4′-bromophenacyloxyphenyl)diphenylmethyl, 4,4′,4″-tris(4,5-dichlorophthalimidophenyl)methyl, 
4,4′,4″-tris(levulinoyloxyphenyl)methyl, 4,4′,4″-tris(benzoyloxyphenyl)methyl, 3-(imidazol-1-yl)bis(4′,4″-dimethoxyphenyl)methyl, 1,1-bis(4-methoxyphenyl)-1′-pyrenylmethyl, 9-anthryl, 9-(9-phenyl)xanthenyl, 9-(9-phenyl-10-oxo)anthryl, 1,3-benzodithiolan-2-yl, benzisothiazolyl S,S-dioxido, trimethylsilyl (TMS), triethylsilyl (TES), triisopropylsilyl (TIPS), dimethylisopropylsilyl (IPDMS), diethylisopropylsilyl (DEIPS), dimethylthexylsilyl, t-butyldimethylsilyl (TBDMS), t-butyldiphenylsilyl (TBDPS), tribenzylsilyl, tri-p-xylylsilyl, triphenylsilyl, diphenylmethylsilyl (DPMS), t-butylmethoxyphenylsilyl (TBMPS), formate, benzoylformate, acetate, chloroacetate, dichloroacetate, trichloroacetate, trifluoroacetate, methoxyacetate, triphenylmethoxyacetate, phenoxyacetate, p-chlorophenoxyacetate, 3-phenylpropionate, 4-oxopentanoate (levulinate), 4,4-(ethylenedithio)pentanoate (levulinoyldithioacetal), pivaloate, adamantoate, crotonate, 4-methoxycrotonate, benzoate, p-phenylbenzoate, 2,4,6-trimethylbenzoate (mesitoate), methyl carbonate, 9-fluorenylmethyl carbonate (Fmoc), ethyl carbonate, 2,2,2-trichloroethyl carbonate (Troc), 2-(trimethylsilyl)ethyl carbonate (TMSEC), 2-(phenylsulfonyl) ethyl carbonate (Psec), 2-(triphenylphosphonio) ethyl carbonate (Peoc), isobutyl carbonate, vinyl carbonate, allyl carbonate, t-butyl carbonate (BOC or Boc), p-nitrophenyl carbonate, benzyl carbonate, p-methoxybenzyl carbonate, 3,4-dimethoxybenzyl carbonate, o-nitrobenzyl carbonate, p-nitrobenzyl carbonate, S-benzyl thiocarbonate, 4-ethoxy-1-napththyl carbonate, methyl dithiocarbonate, 2-iodobenzoate, 4-azidobutyrate, 4-nitro-4-methylpentanoate, o-(dibromomethyl)benzoate, 2-formylbenzenesulfonate, 2-(methylthiomethoxy)ethyl, 4-(methylthiomethoxy)butyrate, 2-(methylthiomethoxymethyl)benzoate, 2,6-dichloro-4-methylphenoxyacetate, 2,6-dichloro-4-(1,1,3,3-tetramethylbutyl)phenoxyacetate, 2,4-bis(1,1-dimethylpropyl)phenoxyacetate, chlorodiphenylacetate, isobutyrate, monosuccinoate, 
(E)-2-methyl-2-butenoate, o-(methoxyacyl)benzoate, α-naphthoate, nitrate, alkyl N,N,N′,N′-tetramethylphosphorodiamidate, alkyl N-phenylcarbamate, borate, dimethylphosphinothioyl, alkyl 2,4-dinitrophenylsulfenate, sulfate, methanesulfonate (mesylate), benzylsulfonate, and tosylate (Ts). In certain embodiments, an oxygen protecting group is silyl. In certain embodiments, an oxygen protecting group is t-butyldiphenylsilyl (TBDPS), t-butyldimethylsilyl (TBDMS), triisopropylsilyl (TIPS), triphenylsilyl (TPS), triethylsilyl (TES), trimethylsilyl (TMS), triisopropylsiloxymethyl (TOM), acetyl (Ac), benzoyl (Bz), allyl carbonate, 2,2,2-trichloroethyl carbonate (Troc), 2-trimethylsilylethyl carbonate, methoxymethyl (MOM), 1-ethoxyethyl (EE), 2-methoxy-2-propyl (MOP), 2,2,2-trichloroethoxyethyl, 2-methoxyethoxymethyl (MEM), 2-trimethylsilylethoxymethyl (SEM), methylthiomethyl (MTM), tetrahydropyranyl (THP), tetrahydrofuranyl (THF), p-methoxyphenyl (PMP), triphenylmethyl (Tr), methoxytrityl (MMT), dimethoxytrityl (DMT), allyl, p-methoxybenzyl (PMB), t-butyl, benzyl (Bn), or pivaloyl (Piv).
In certain embodiments, the 3′-reversible terminator is a 3′-O-amino acid (e.g., comprising any standard or non-standard amino acid). In certain embodiments, the amino acid can be removed using an esterase.
As generally defined herein, R″ is hydrogen, halogen, —CN, —NO2, —N3, optionally substituted alkyl, optionally substituted alkenyl, optionally substituted alkynyl, optionally substituted aryl, optionally substituted heteroaryl, optionally substituted carbocyclyl, optionally substituted heterocyclyl, optionally substituted acyl, optionally substituted hydroxyl, optionally substituted amino, or optionally substituted thiol. In certain embodiments, R″ is hydrogen. In certain embodiments, R″ is halogen. In certain embodiments, R″ is —CN. In certain embodiments, R″ is —NO2. In certain embodiments, R″ is —N3. In certain embodiments, R″ is optionally substituted alkyl. In certain embodiments, R″ is optionally substituted alkenyl. In certain embodiments, R″ is optionally substituted alkynyl. In certain embodiments, R″ is optionally substituted aryl. In certain embodiments, R″ is optionally substituted heteroaryl. In certain embodiments, R″ is optionally substituted carbocyclyl. In certain embodiments, R″ is optionally substituted heterocyclyl. In certain embodiments, R″ is optionally substituted acyl. In certain embodiments, R″ is optionally substituted hydroxyl. In certain embodiments, R″ is optionally substituted amino. In certain embodiments, R″ is optionally substituted thiol.
In certain embodiments, R″ comprises a reactive moiety for bioconjugation (e.g., click chemistry handle, e.g., azide or alkyne), a fluorophore, catalytic protein, oligonucleotide, or reporting tag.
As generally defined herein, R′″ is hydrogen, halogen, —CN, —NO2, —N3, optionally substituted alkyl, optionally substituted alkenyl, optionally substituted alkynyl, optionally substituted aryl, optionally substituted heteroaryl, optionally substituted carbocyclyl, optionally substituted heterocyclyl, optionally substituted acyl, optionally substituted hydroxyl, optionally substituted amino, optionally substituted thiol, or an oxygen protecting group. In certain embodiments, R′″ is hydrogen. In certain embodiments, R′″ is halogen. In certain embodiments, R′″ is —CN. In certain embodiments, R′″ is —NO2. In certain embodiments, R′″ is —N3. In certain embodiments, R′″ is optionally substituted alkyl. In certain embodiments, R′″ is optionally substituted alkenyl. In certain embodiments, R′″ is optionally substituted alkynyl. In certain embodiments, R′″ is optionally substituted aryl. In certain embodiments, R′″ is optionally substituted heteroaryl. In certain embodiments, R′″ is optionally substituted carbocyclyl. In certain embodiments, R′″ is optionally substituted heterocyclyl. In certain embodiments, R′″ is optionally substituted acyl. In certain embodiments, R′″ is optionally substituted hydroxyl. In certain embodiments, R′″ is optionally substituted amino. In certain embodiments, R′″ is optionally substituted thiol.
In certain embodiments, R′″ comprises a reactive moiety for bioconjugation (e.g., click chemistry handle, e.g., azide or alkyne), a fluorophore, catalytic protein, oligonucleotide, or reporting tag.
As generally defined herein, RN is hydrogen, optionally substituted alkyl, optionally substituted acyl, or a nitrogen protecting group. In certain embodiments, RN is hydrogen. In certain embodiments, RN is optionally substituted alkyl. In certain embodiments, RN is optionally substituted acyl. In certain embodiments, RN is a nitrogen protecting group.
In certain embodiments, RN comprises a reactive moiety for bioconjugation (e.g., click chemistry handle, e.g., azide or alkyne), a fluorophore, catalytic protein, oligonucleotide, or reporting tag.
RNA Oligonucleotide Synthesis with Non-Hydrolyzable RNA Nucleotides
Also provided herein are methods for RNA oligonucleotide synthesis employing non-hydrolyzable nucleotides. As described herein, the rate at which a polymerase can incorporate nucleotides (i.e., hydrolyzable nucleotides) at the 3′-terminus of an initiator oligonucleotide can be controlled by introducing a non-hydrolyzable nucleotide that competes for the enzyme's active site. The non-hydrolyzable nucleotide is not incorporated, and the rate of incorporation of the hydrolyzable nucleotide is directly impacted by the ratio of the hydrolyzable nucleotide to the non-hydrolyzable nucleotide through competitive inhibition. Ultimately, the number of incorporations of a nucleotide is determined by the concentration of the non-hydrolyzable nucleotide in the reaction mixture. After a poly(N) polymerase incorporates one or more nucleotides via terminal transferase activity, the process can be repeated in one or more iterative steps, optionally with different nucleotides, until a desired RNA oligonucleotide sequence is obtained.
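By way of non-limiting illustration, the dependence of incorporation rate on the non-hydrolyzable nucleotide can be sketched with the textbook rate law for competitive inhibition. This assumes, for illustration only, that the polymerase follows simple Michaelis-Menten kinetics, which is not asserted herein; the function name `incorporation_rate` and the parameters `vmax`, `km`, and `ki` are hypothetical placeholders rather than measured values.

```python
def incorporation_rate(s: float, i: float, vmax: float, km: float, ki: float) -> float:
    """Rate of hydrolyzable-nucleotide incorporation under competitive inhibition.

    s    -- concentration of the hydrolyzable nucleotide
    i    -- concentration of the non-hydrolyzable (competing) nucleotide
    vmax -- maximal rate at saturating s with no competitor
    km   -- Michaelis constant for the hydrolyzable nucleotide
    ki   -- inhibition constant for the non-hydrolyzable nucleotide
    All concentrations and constants must share the same units.
    """
    if s < 0 or i < 0:
        raise ValueError("concentrations must be non-negative")
    if vmax <= 0 or km <= 0 or ki <= 0:
        raise ValueError("vmax, km and ki must be positive")
    # The competitor raises the apparent Km by the factor (1 + i/ki).
    return vmax * s / (km * (1.0 + i / ki) + s)
```

Under this model, raising `i` at fixed `s` lowers the rate without changing `vmax`, so the ratio of the two nucleotides acts as the control variable.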
Provided herein are methods for template-independent synthesis of RNA oligonucleotides, the method comprising:
(a) providing an initiator oligonucleotide, wherein the initiator oligonucleotide is single-stranded RNA;
(b) providing a poly(N) polymerase;
(c) combining the initiator oligonucleotide, the poly(N) polymerase, one or more nucleotides, and one or more non-hydrolyzable nucleotides under conditions sufficient for addition of at least one hydrolyzable nucleotide to the 3′ end of the initiator oligonucleotide, wherein the concentration of the non-hydrolyzable nucleotides is sufficient to inhibit the rate of addition of the one or more nucleotides by the poly(N) polymerase.
As described herein, the poly(N) polymerase is, in certain embodiments, a poly(U) polymerase. Provided herein are methods for template-independent synthesis of RNA oligonucleotides, the method comprising:
(a) providing an initiator oligonucleotide, wherein the initiator oligonucleotide is single-stranded RNA;
(b) providing a poly(U) polymerase;
(c) combining the initiator oligonucleotide, the poly(U) polymerase, one or more nucleotides, and one or more non-hydrolyzable nucleotides under conditions sufficient for addition of at least one hydrolyzable nucleotide to the 3′ end of the initiator oligonucleotide, wherein the concentration of the non-hydrolyzable nucleotides is sufficient to inhibit the rate of addition of the one or more nucleotides by the poly(U) polymerase.
In certain embodiments, the concentration of non-hydrolyzable nucleotide is such that 1-100 of the nucleotides are incorporated. In certain embodiments, the concentration of non-hydrolyzable nucleotide is such that 1-50 of the nucleotides are incorporated. In certain embodiments, the concentration of non-hydrolyzable nucleotide is such that 1-20 of the nucleotides are incorporated. In certain embodiments, the concentration of non-hydrolyzable nucleotide is such that less than 100, less than 90, less than 80, less than 70, less than 60, less than 50, less than 40, less than 30, less than 20, less than 10, less than 5, less than 4, less than 3, or less than 2 of the hydrolyzable nucleotides are incorporated.
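By way of non-limiting illustration, a concentration of non-hydrolyzable nucleotide corresponding to a desired number of incorporations can be estimated under strong simplifying assumptions: a fixed reaction time, no depletion of the hydrolyzable nucleotide, a mean number of incorporations proportional to the reaction rate, and simple competitive inhibition. None of these assumptions is asserted herein, and the function name `inhibitor_for_target` and all parameters are hypothetical. Under those assumptions the mean count falls as n = n0 / (1 + [I]/Ki,app), where Ki,app = Ki(1 + [S]/Km).

```python
def inhibitor_for_target(
    n_target: float, n_uninhibited: float, s: float, km: float, ki: float
) -> float:
    """Estimate the competitor concentration giving a target mean incorporation count.

    n_target      -- desired mean number of incorporations per strand
    n_uninhibited -- mean number observed with no competitor, same time and s
    s, km, ki     -- substrate concentration, Michaelis constant, inhibition constant
    Returns the competitor concentration in the same units as ki.
    """
    if not 0 < n_target <= n_uninhibited:
        raise ValueError("target must be positive and at most the uninhibited count")
    if s < 0 or km <= 0 or ki <= 0:
        raise ValueError("s must be non-negative; km and ki must be positive")
    # Apparent inhibition constant at the chosen substrate concentration.
    ki_app = ki * (1.0 + s / km)
    return ki_app * (n_uninhibited / n_target - 1.0)
```

For example, with hypothetical values `s = km = 10`, `ki = 5`, and an uninhibited mean of 100 incorporations, reducing the mean to 10 would call for a competitor concentration of 90 in the same units. Any such estimate would need to be confirmed empirically.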
Once the one or more nucleotides are added to the initiator oligonucleotide, one or more additional nucleotides can be added subsequently in order to synthesize a desired RNA oligonucleotide. Therefore, in certain embodiments, the method further comprises adding one or more natural or modified nucleotides to the 3′ end of the resulting RNA oligonucleotide (i.e., the RNA oligonucleotide formed in step (c)) until a desired RNA sequence is obtained. In certain embodiments, the method further comprises:
(d) repeating steps (a)-(c) until a desired RNA sequence is obtained.
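By way of non-limiting illustration, the iterative character of steps (a)-(d) can be sketched by decomposing a desired 3′ extension into runs of identical nucleotides, each run corresponding to one round of step (c) with a single nucleotide type. The sketch idealizes each round as adding an exact number of nucleotides, whereas in practice the count is governed by the reaction conditions. The function name `plan_cycles` is hypothetical, and only the four natural bases are handled.

```python
from itertools import groupby
from typing import List, Tuple


def plan_cycles(extension: str) -> List[Tuple[str, int]]:
    """Split a desired extension (written 5'->3') into (base, count) rounds."""
    invalid = set(extension) - set("ACGU")
    if invalid:
        raise ValueError(f"unsupported bases: {sorted(invalid)}")
    # groupby collects consecutive identical characters into one run.
    return [(base, len(list(run))) for base, run in groupby(extension)]


def apply_cycles(initiator: str, cycles: List[Tuple[str, int]]) -> str:
    """Rebuild the product by appending each planned run to the 3' end."""
    product = initiator
    for base, count in cycles:
        product += base * count
    return product
```

Applying the planned rounds in order to the initiator reproduces the initiator followed by the desired extension.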
Methods provided herein employ non-hydrolyzable nucleotides. “Non-hydrolyzable” nucleotides are nucleotides capable of binding to an RNA polymerase, but incapable of undergoing the enzyme-catalyzed addition (i.e., terminal transferase reaction) to an initiator oligonucleotide (e.g., to the 3′ end of the initiator oligonucleotide). In certain embodiments, the non-hydrolyzable nucleotide is a phosphate-modified nucleotide (i.e., comprises a modified triphosphate group). Also provided herein are non-hydrolyzable nucleotides useful in the methods described herein.
For example, in certain embodiments, a non-hydrolyzable nucleotide is of the following formula:
or a salt thereof, wherein:
each Y is independently —O—, —NRN—, —C(RC)2—, or —S—; provided that at least one Y is not —O—;
R and R′ are independently hydrogen or sugar substituents as defined herein;
“Base” is a natural or non-natural (e.g., modified) nucleotide base as defined herein;
RN is hydrogen, optionally substituted alkyl, or a nitrogen protecting group;
and each instance of RC is independently hydrogen, halogen, or optionally substituted alkyl. In certain embodiments, —NRN— is —NH—. In certain embodiments, —C(RC)2— is —CH2—.
In certain embodiments, the non-hydrolyzable nucleotide comprises a modified triphosphate group. In certain embodiments, the non-hydrolyzable nucleotide is selected from the group consisting of uridine-5′-[(α,β)-imido]triphosphate, adenosine-5′-[(α,β)-imido]triphosphate, guanosine-5′-[(α,β)-methyleno]triphosphate, cytidine-5′-[(α,β)-methyleno]triphosphate, adenosine-5′-[(β,γ)-imido]triphosphate, guanosine-5′-[(β,γ)-imido]triphosphate, and uridine-5′-[(β,γ)-imido]triphosphate. The triphosphate group may comprise any other modifications.
In certain embodiments, the non-hydrolyzable nucleotide is a 3′-modified nucleotide. In certain embodiments, the non-hydrolyzable nucleotide is selected from the group consisting of 3′-O-methyladenosine-5′-triphosphate and 3′-O-methyluridine-5′-triphosphate.
The non-hydrolyzable nucleotide may further comprise any other nucleotide modifications described above and herein.
The terminal transferase reactions described herein (i.e., step (c) of any of the methods described herein) are carried out in the presence of a polymerase enzyme (e.g., a poly(N) polymerase). In certain embodiments, step (c) is carried out in the presence of one or more additional enzymes. In certain embodiments, step (c) is carried out in the presence of a mixture of two or more different enzymes. The mixture of enzymes may comprise more than one distinct poly(N) polymerase (e.g., 2 or 3 different poly(N) polymerases). The mixture of poly(N) polymerase enzymes may include both wild-type and mutated poly(N) polymerases (e.g., mutated poly(U) polymerases provided herein).
In certain embodiments, step (c) is carried out in the presence of one or more phosphatases in addition to the poly(N) polymerase. In certain embodiments, step (c) is carried out in the presence of a yeast inorganic pyrophosphatase (PPI-ase) in addition to the poly(N) polymerase.
In certain embodiments, the terminal transferase reaction in step (c) is carried out in the presence of one or more additional additives. In certain embodiments, step (c) is carried out in the presence of a crowding agent. In certain embodiments, the crowding agent is polyethylene glycol (PEG) or Ficoll. In certain embodiments, the crowding agent is polyethylene glycol (PEG). In certain embodiments, step (c) is carried out in the presence of an RNase inhibitor. In certain embodiments, step (c) is carried out in the presence of a non-hydrolyzable nucleotide.
The methods described herein use initiator oligonucleotides. The initiator oligonucleotides may be of any sequence and can be any number of nucleotides in length. In certain embodiments, the initiator oligonucleotide is 20 nucleotides or less in length. In certain embodiments, the initiator oligonucleotide is 5-20 nucleotides in length. In certain embodiments, the initiator oligonucleotide is more than 20 nucleotides in length.
In certain embodiments, the initiator oligonucleotide is a poly-rN oligonucleotide. In certain embodiments, the initiator oligonucleotide is a poly-rU, poly-rC, poly-rG, or poly-rA.
The initiator oligonucleotide may also be covalently linked to a solid support. In certain embodiments, the oligonucleotide is cleaved from the solid support after a desired RNA oligonucleotide sequence is obtained. Therefore, in certain embodiments, the initiator oligonucleotide is covalently linked to a solid support through a cleavable linker.
The initiator oligonucleotides can comprise other modifications, such as fluorophores. In certain embodiments, the initiator oligonucleotide comprises a 5′-fluorophore. In certain embodiments, the fluorophore is Cy5 or FAM. The initiator oligonucleotide may also comprise one or more additional functional groups or handles for bioconjugation. In certain embodiments, the initiator oligonucleotide is functionalized with biotin.
In certain embodiments, the initiator oligonucleotide comprises a 5′-phosphate (e.g., 5′-mono-, di-, or triphosphate). In certain embodiments, the initiator oligonucleotide comprises a 5′-monophosphate. In certain embodiments, the initiator oligonucleotide comprises a 5′-diphosphate. In certain embodiments, the initiator oligonucleotide comprises a 5′-triphosphate.
In certain embodiments, the initiator oligonucleotide comprises a 5′-capping group (i.e., 5′ cap).
In certain embodiments, the 5′ cap can be a mono-nucleotide (1-nt), di-nucleotide (2-nt), tri-nucleotide (3-nt), or N-nucleotide (i.e., of any oligonucleotide length that would be useful). The 5′ cap may also comprise a combination of one or more natural and non-natural (e.g., modified) nucleoside bases, including those described herein.
In certain embodiments, the 5′ cap is a guanine cap. In certain embodiments, the 5′ cap is a 7-methylguanylate cap (m7G). In certain embodiments, the guanine or m7G cap includes a guanine nucleotide connected to the oligonucleotide via a 5′ to 5′ triphosphate linkage. In certain embodiments, the 5′ cap includes methylation of the 2′-hydroxyl groups of the first and/or second ribose sugars at the 5′ end of the oligonucleotide.
In certain embodiments, the 5′ cap is a 5′-trimethylguanosine cap or a 5′-monomethylphosphate cap. In other embodiments, the 5′ cap is an NAD+, NADH, or 3′-dephospho-coenzyme A cap.
In certain embodiments, the initiator oligonucleotide comprises a primer site for reverse transcription of the synthesized RNA oligonucleotide. In certain embodiments, the initiator oligonucleotide comprises a primer site for PCR amplification.
Splicing RNA Fragments Together with 5′-Triphosphorylated Oligonucleotides
In certain embodiments, the methods provided herein can be applied to the splicing (i.e., ligation) of oligonucleotide fragments by a template-independent polymerase, using a 5′-triphosphate group, to create a long RNA molecule (e.g., >100-nt in length).
A 5′-triphosphate oligonucleotide, either synthesized as an initiator oligonucleotide or obtained as the product of controlled, template-independent enzymatic synthesis (e.g., a method described herein), can be a substrate of a polymerase such as poly(U) polymerase or a mutated variant thereof (e.g., a mutated variant described herein). In certain embodiments, the poly(U) polymerase or mutated variant thereof accepts large 3′-modifications. In some instances, the 3′-modification is a series of nucleic acids (i.e., an oligonucleotide) instead of a single nucleoside triphosphate or a protecting group.
Thus, provided herein are methods for the synthesis of RNA oligonucleotides, the methods comprising:
(a) providing a first oligonucleotide, wherein the first oligonucleotide comprises a 5′-triphosphate group;
(b) providing a second oligonucleotide;
(c) providing a poly(U) polymerase;
(d) combining the first and second oligonucleotides and the poly(U) polymerase under conditions sufficient for the ligation of the first oligonucleotide to the 3′ end of the second oligonucleotide.
In certain embodiments, the second oligonucleotide is a 3′-OH oligonucleotide.
In certain embodiments, the 5′-triphosphate oligonucleotide is modified to include phosphorothioate at the alpha (α)-phosphate.
In certain embodiments, the first oligonucleotide (e.g., the 5′-triphosphorylated oligonucleotide) includes one or more modifications to the nucleobases, sugars, or backbone of the oligonucleotide. In certain embodiments, the second oligonucleotide includes one or more modifications to the nucleobases, sugars, or backbone of the oligonucleotide.
In certain embodiments, template-independent ligations occur in reaction conditions that enhance ligation activity, such as the addition of crowding agents, etc. as described herein.
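By way of non-limiting illustration, the orientation of the ligation described above can be sketched as a sequence operation: the first (5′-triphosphate) oligonucleotide is joined to the 3′ end of the second oligonucleotide, so the product read 5′ to 3′ is the second sequence followed by the first. The class `Oligo`, its fields, and the function `ligate` are hypothetical names; the end-group labels are illustrative strings, and modifications and reaction conditions are not modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Oligo:
    """Minimal oligonucleotide record; the sequence is written 5'->3'."""

    sequence: str
    five_prime: str = "OH"         # illustrative labels: "OH", "P", "PP", "PPP"
    three_prime_free: bool = True  # True if a free 3'-OH is available


def ligate(first: Oligo, second: Oligo) -> Oligo:
    """Join `first` (5'-triphosphate donor) to the 3' end of `second` (acceptor)."""
    if first.five_prime != "PPP":
        raise ValueError("first oligonucleotide must carry a 5'-triphosphate")
    if not second.three_prime_free:
        raise ValueError("second oligonucleotide must have a free 3'-OH")
    # The product keeps the acceptor's 5' end and the donor's 3' end.
    return Oligo(
        sequence=second.sequence + first.sequence,
        five_prime=second.five_prime,
        three_prime_free=first.three_prime_free,
    )
```

Because the product retains whatever 5′ end group the second oligonucleotide carried, a product whose 5′ end is itself a triphosphate could, in this model, serve as the first oligonucleotide in a further round.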
The methods provided herein can be applied to the synthesis of DNA oligonucleotides. After a desired RNA oligonucleotide is obtained via a method described herein, one or more additional steps of reverse transcription and/or amplification can be carried out to yield DNA (e.g., cDNA, ssDNA, double stranded DNA). The end result is a method for controlled, template-independent synthesis of DNA oligonucleotides.
Therefore, in certain embodiments, a method provided herein further comprises a step of:
(f) performing reverse transcription on the resulting RNA oligonucleotide using a reverse transcription priming site, primer, and a reverse transcriptase enzyme to produce a complementary single-stranded DNA oligonucleotide. In certain embodiments, the reverse transcriptase enzyme is a high-fidelity reverse transcriptase enzyme.
In certain embodiments, a method provided herein further comprises a step of:
(g) amplifying the complementary single-stranded DNA oligonucleotide or cDNA produced from the synthesized RNA oligonucleotide via reverse transcription in step (f) with a DNA polymerase to produce double-stranded DNA. In certain embodiments, the DNA polymerase is a high-fidelity DNA polymerase.
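By way of non-limiting illustration, the sequence relationships produced by steps (f) and (g) can be sketched as follows, assuming an unmodified RNA of the four natural bases and error-free enzymes. The function names `first_strand_cdna` and `second_strand` are hypothetical; priming sites, primers, and enzyme fidelity are not modeled.

```python
# Base-pairing tables: RNA template -> DNA complement, and DNA -> DNA complement.
_RNA_TO_DNA = str.maketrans("ACGU", "TGCA")
_DNA_TO_DNA = str.maketrans("ACGT", "TGCA")


def first_strand_cdna(rna: str) -> str:
    """Return the cDNA (5'->3') complementary to an RNA written 5'->3'."""
    invalid = set(rna) - set("ACGU")
    if invalid:
        raise ValueError(f"unsupported RNA bases: {sorted(invalid)}")
    # Complement each base, then reverse so the result also reads 5'->3'.
    return rna.translate(_RNA_TO_DNA)[::-1]


def second_strand(cdna: str) -> str:
    """Return the DNA strand (5'->3') complementary to the cDNA."""
    invalid = set(cdna) - set("ACGT")
    if invalid:
        raise ValueError(f"unsupported DNA bases: {sorted(invalid)}")
    return cdna.translate(_DNA_TO_DNA)[::-1]
```

In this model the second strand has the same sequence as the synthesized RNA with T in place of U, so the double-stranded DNA encodes the sequence that was written enzymatically.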
Oligonucleotide-based therapeutics are an emerging modality in the rationally designed and personalized, postgenomic era of medicine (Khvorova et al. 2017). Composed of short sequences of natural and/or non-natural, modified nucleic acid building blocks, oligonucleotide therapeutics can be specifically tailored to affect a target with maximum efficacy while retaining an optimal pharmacokinetic profile (Deleavey et al. 2012). This is largely defined by the chemical and structural architecture of the oligonucleotide therapeutic, which can include a combination of carefully chosen modifications to the sugar rings, nucleobases, and phosphate backbone as well as the overall three-dimensional structure of the oligonucleotide (Cummins et al. 1995, Eckstein 2014, Watts et al. 2008, Wilson et al. 2006). Both the chemical composition and the sequence in which the nucleic acid building blocks are assembled confer the global properties of the oligonucleotide therapeutic; a slight rearrangement or a different chemical moiety could potentially improve its therapeutic ability (Khvorova et al. 2017, Koch et al. 2014, Bohr et al. 2017). This is a clear, key advantage over traditional small molecule therapeutics, where a major redesign may be necessary for optimization. While there is a diverse array of successful oligonucleotide therapeutic classes, such as short (<50-nt) antisense oligonucleotides (ASOs) (Goyal et al. 2018, Uhlmann et al. 1990), short-interfering RNA (siRNA) (Dana et al. 2017), and microRNA (miRNA) (Rupaimoole et al. 2017), as well as longer (>100-nt) messenger RNA (mRNA) (Pardi et al. 2018) and long-noncoding RNA (lncRNA) (Arun et al. 2018), one unifying theme among them is that their production, especially at large scales, is vastly limited by the current state of oligonucleotide synthesis technology (Ma et al. 2012).
The synthesis of DNA and RNA oligonucleotides using phosphoramidite chemistry has been a staple of scientific research since the 1970s (Beaucage et al. 1992). Phosphoramidite chemistry is exceedingly reliable and inexpensive for the synthesis of short, uncomplicated DNA oligonucleotides composed of the four natural nucleobases. However, aside from the advent of massively parallelized synthesis and automation technology, there have been only minor, incremental improvements to the core methodologies of phosphoramidite-based oligonucleotide synthesis (Beaucage et al. 1992, Kosuri et al. 2014). This is especially true for the chemical synthesis of RNA and heavily modified oligonucleotides, which is still very costly and low-yielding, and often requires multiple post-synthesis purifications that greatly increase the lead time to isolate appreciable quantities of the desired product (Baronti et al. 2018). Furthermore, phosphoramidite chemistry is not particularly conducive to a large repertoire of chemical modifications, a necessity for many current oligonucleotide therapeutics (Khvorova et al. 2017), as organic solvents and harsh conditions complicate synthesis schemes by requiring additional protecting groups for labile moieties that may confer unique properties onto the oligonucleotide, for example for delivery or ligand binding purposes (Baronti et al. 2018). In vitro transcription (IVT) strategies have ameliorated some of these limitations, particularly in the production of long RNA oligonucleotides (>120-nt), which is currently impossible with phosphoramidite chemistry (Pardi et al. 2018, Milligan et al. 1987, Sahin et al. 2014). However, IVT does not allow for site-specific labeling of the oligonucleotide, requiring the user to swap out particular bases in addition to using a DNA template for proper enzymatic catalysis.
Unfortunately, the combination of high costs, difficult synthesis, and limited access to diverse nucleic acid building blocks has hindered researchers from developing innovative oligonucleotide therapeutics to combat debilitating diseases. One potential solution to the aforementioned limitations of oligonucleotide synthesis and oligonucleotide-based therapeutic development is to completely circumvent the use of phosphoramidite chemistry. Currently, there is great interest in utilizing a class of polymerases known as nucleotidyl transferases, which catalyze the addition of a nucleoside monophosphate to the 3′-end of a short initiator sequence, to synthesize oligonucleotides de novo (Perkel 2019, Pratt et al. 2008). Many nucleotidyl transferases do not require the use of a template sequence, and their reactions can be carried out under aqueous conditions, avoiding many of the negative aspects of chemical oligonucleotide synthesis, including nucleobase depurination, unwanted insertions or deletions, and the accumulation of irreversibly capped truncation products. Some notable nucleotidyl transferases capable of template-independent de novo oligonucleotide synthesis include, but are not limited to, Terminal deoxynucleotidyl Transferase (TdT) (Motea et al. 2010), Cid1 poly(U) polymerase (PuP) (Munoz-Tello et al. 2012), poly(A) polymerase (PaP) (Balbo et al. 2007), poly(G) polymerase (PgP), poly(C) polymerase (PcP), CCA-adding enzyme (Cho et al. 2007), polymerase Mu (μ) (Dominguez et al. 2000), and polymerase Theta (Θ) (Thomas et al. 2019). Of the aforementioned nucleotidyl transferases, only terminal deoxynucleotidyl transferase has been used in a successful demonstration of enzymatic oligonucleotide synthesis (Palluk et al. 2019).
However, only applications in DNA data storage are so far possible, as terminal deoxynucleotidyl transferase is difficult to control, has a strong preference for natural deoxynucleoside triphosphates, and is exceedingly biased toward certain nucleobase and initiator combinations over others, attributes that can be computationally corrected post-synthesis to retrieve stored data (Ceze et al. 2019, Anavy et al. 2019, Lee et al. 2019). Thus, the development of an enzymatic oligonucleotide synthesis platform that can (1) extend the growing sequence by a single base (n+1) with a reversibly blocked modified nucleoside triphosphate, (2) incorporate an array of modified nucleoside triphosphates that confer therapeutic or other value to the oligonucleotide, and (3) be scaled to industrially relevant outputs is of great importance.
Several methods for the controlled de novo synthesis of RNA oligonucleotides via enzymatic catalysis have been developed. Engineered and wild-type polymerases with the ability to efficiently incorporate natural and modified ribonucleotide triphosphates (rNTPs) without a template sequence can be used to iteratively add nucleotides to the 3′-OH of an initiator oligonucleotide sequence. Their addition can be through either single or multiple incorporation events. The biologically compatible reaction conditions needed for enzyme functionality greatly reduce the susceptibility of RNA oligonucleotides to the degradation that is normally associated with chemical synthesis. These methods can be integrated into a microfluidic or array-based format to synthesize many RNA oligonucleotides in parallel with high cost efficiency. An RNA oligonucleotide synthesized with this method can be produced with a low error rate and will be biologically compatible for downstream biotechnological applications.
The DNA/RNA-directed polymerase, polymerase μ, and the RNA-directed polymerases, poly(A) polymerase (PAP) and poly(U) polymerase (PUP), are three examples of polymerases that are compatible with the aforementioned RNA synthesis schemes. However, any other polymerase or enzyme with the capacity to add nucleotides to the 3′-terminus of an initiator oligonucleotide without the need of a template sequence could be used, such as CCA-adding enzymes. This includes possible functional mutants that display a similar or increased capacity for controlled de novo RNA synthesis.
Some possible applications of this invention include the following: (1) the cost-efficient and high-fidelity de novo synthesis of RNA oligonucleotides longer than 100-nt; (2) the use of synthesized RNA oligonucleotides as a cheap and high-quality source of biological material such as synthetic transfer RNA, ribosomal RNA, self-folding RNA structures, novel ribozymes, protein binding complexes, RNA therapeutics, CRISPR/Cas9 guide RNA, and RNA sequencing probes (such as padlock probes for in situ sequencing); (3) the production of useful, PCR-amplifiable DNA oligonucleotides or gene sequences via conversion by reverse transcription; and (4) the controlled enzymatic synthesis of DNA oligonucleotides or gene sequences under biologically compatible reaction conditions, for which enzymes used for RNA synthesis such as Pol μ (a DNA/RNA-directed polymerase) are additionally candidates.
Controlling rNTP Incorporation Speed with Impeding Reaction Conditions
RNA oligonucleotide synthesis can be controlled by selecting reaction components, such as non-hydrolyzable or incompatible nucleotides, that heavily impede natural nucleotide incorporation rates and maximize products of the desired length (
Modified rNTP Incorporation Reversibility Prevents Additional Incorporation Events
RNA oligonucleotide synthesis can also be controlled by incorporating modified nucleotides that temporarily alter the binding affinity of the polymerase to the initiator oligonucleotide in order to limit the extension reaction to just one incorporation event (n+1) (
Polymerase Enzymes for Controlled RNA Synthesis
Polymerases from Family X, such as Terminal deoxynucleotidyl Transferase (TdT), polymerase Mu (Pol μ), polymerase Beta (Pol β), and polymerase Lambda (Pol λ), are candidates for the controlled, template-independent synthesis of RNA oligonucleotides (Fowler and Suo 2006). These highly specialized polymerases have been shown to be key driving forces in critical DNA repair pathways such as non-homologous end joining (NHEJ) and in the generation of antibody and T-cell receptor diversity during V(D)J recombination (Moon et al. 2007, 2014; Nick McElhinny and Ramsden 2004; Bertocci et al. 2006). The involvement of Family X polymerases in such biological processes is attributed to their precision in incorporating natural nucleotides in a template-dependent manner while maintaining the ability to indiscriminately add nucleotides to a primer sequence independently of a template when variability is necessary (J. F. Ruiz et al. 2001; Dominguez et al. 2000; Motea and Berdis 2010). This unique capacity is ideal for the enzymatic synthesis of RNA in that there is a large margin of natural flexibility associated with the Family X polymerases without the need for protein evolution schemes. It has been previously shown that TdT has the ability to incorporate ribonucleotides in addition to DNA nucleotides (Roychoudhury 1972). Family X polymerases could be further engineered to be more compatible with the proposed RNA synthesis schemes.
Polymerase Mu (Pol μ)
Pol μ is a Family X polymerase that, under optimal reaction conditions, has been shown to efficiently incorporate both deoxyribonucleotide triphosphates (dNTPs) and rNTPs into DNA, RNA, and DNA-RNA hybrid oligonucleotide substrates (José F. Ruiz et al. 2003; Agrawal et al. 2003). Several studies of the enzyme's primary structure, catalytic pocket, and various catalytic states have determined the amino acid residues associated with rNTP binding and the kinetics of their incorporation (Moon et al. 2014; Jamsen et al. 2017; Moon et al. 2017). Interestingly, wild-type Pol μ has the ability to incorporate an rNTP without distorting the oligonucleotide primer or nucleotide structures while keeping the geometry of the active site in a normal configuration; such distortion is a phenomenon that may greatly affect the ability of other Family X polymerases to accommodate rNTPs in any useful capacity or speed (Moon et al. 2017). In addition, it is worth noting that Pol μ has been cited to be less discriminatory in its preference towards the DNA substrate compared to other Family X polymerases (Moon et al. 2015).
In one study, the expression of two tumor-associated human Pol μ point mutations, (G174S) and (R175H), produced enzymes with decreased efficiency and fidelity in NHEJ. These mutants were cited to randomly incorporate nucleotides despite having a template sequence guiding the DNA repair process, resulting in significant alterations to the expected error rates (Sastre-Moreno et al. 2017). Other groups have demonstrated that removing large portions of Pol μ, such as the N-terminal BRCT domain (which is typically associated with other DNA repair pathway core factors), results in active enzymes (Moon et al. 2014). These truncated variants were shown to retain wild-type activity and the ability to bind non-hydrolyzable nucleotides, but potentially have more physical space to incorporate modified or bulky nucleotides. Wild-type Pol μ is characterized as a primarily template-dependent polymerase; however, the point mutation (R387K) to human wild-type Pol μ resulted in an enzyme with significantly increased template-independent activity (Andrade et al. 2009). This point mutation (R387K) is exceedingly important to the feasibility of using Pol μ in template-independent de novo RNA oligonucleotide synthesis and, again, reiterates the immense value of its flexibility. Pol μ is currently the only known polymerase in Family X and beyond that displays both template-dependent and template-independent activities (Domínguez et al. 2000; Juarez et al. 2006). In addition to a wild-type or mutagenized Pol μ, there are other polymerases that could potentially be used to synthesize long RNA oligonucleotides de novo.
The 3′-tailing of single-stranded RNA with ribonucleotides is important in two different contexts: (1) natural biological or biochemical processes and (2) studying these processes in vivo or in vitro (Proudfoot 2011; Strauss et al. 2012). For the latter, several groups have employed wild-type RNA polymerases such as Saccharomyces cerevisiae and Escherichia coli poly(A) polymerase (PAP) and Schizosaccharomyces pombe Cid1 poly(U) polymerase (PUP) to directly label the 3′-terminus of RNA oligonucleotides in vitro in a template-independent manner (G. Martin and Keller 1998; Munoz-Tello, Gabus, and Thore 2012; Kwak and Wickens 2007; Winz et al. 2012). Under optimal conditions, it was shown that these families of enzymes, in particular PAP, can accept modified ribonucleotides with modifications at the 2′- and 3′-positions of the sugar as well as the 8-position of the adenine base (Winz et al. 2012). While the overall incorporation efficiency varied with the modified position, nucleotide base, and enzyme tested, 1-3 nucleotide incorporation events took place on average, yielding an RNA oligonucleotide with a 3′-terminal or internal azide functional group to which a dye could be attached via bioorthogonal click chemistry (Winz et al. 2012).
In addition to studying the mechanism of modified nucleotide incorporation, other groups have examined both the biochemical and structural mechanisms for substrate binding and catalysis of PAP (Georges Martin et al. 2004; Bard et al. 2000). Through these studies, which included site-directed mutagenesis of many residues in the catalytic pocket and exhaustive analysis of steady-state kinetics, it was clear that incorporation was heavily biased towards ATP over the other nucleotide bases (Georges Martin et al. 2004). However, another group determined that a single point mutation to a bacterial PAP (R215A) resulted in a complete reversal of this bias, allowing for random incorporation of all of the nucleotide bases (Just et al. 2008). This result was similar for another template-independent RNA polymerase, CCA-adding enzyme (Just et al. 2008; Xiong and Steitz 2004). These studies make RNA polymerases such as PAP exceedingly compatible with the execution of scheme II for controlled de novo RNA oligonucleotide synthesis. Both the wild-type and mutagenized RNA polymerases appear to be flexible enough to incorporate U, A, G, or C sugar-modified nucleotides (2′-, 3′-, or both) that can be deprotected or altered under mild reaction conditions to restore enzyme binding affinity.
Human polymerase μ R387K was expressed and purified as described in the materials and methods. Because it was unknown under which reaction conditions polymerase μ R387K functioned optimally, several reaction parameters were initially evaluated. It was found that an incubation temperature of 37° C. and reaction buffer conditions of 10 mM magnesium acetate, 50 mM potassium acetate, and 20 mM Tris-acetate yielded sufficient enzymatic activity in terms of dNTP incorporation. In addition, polymerase μ activity was evaluated after reactions were supplemented with common divalent metal cofactors (Mn2+, Mg2+, Co2+, etc.), and it was found that a combination of Mn2+ and Mg2+ at a concentration of 0.25 mM yielded the highest rate of ssDNA generation at ~650 RFU/minute, whereas 0.25 mM Co2+ yielded the lowest rate at ~100 RFU/minute (
2. S. cerevisiae Poly(A) Polymerase Incorporates 2′-Modified ATP Nucleotides and 2′-Blocked Reversible Terminators
The efficiency of S. cerevisiae poly(A) polymerase (Thermo 74225Z25KU) for incorporating several 2′-modified ATP nucleotides was evaluated. The modified nucleotides evaluated included the following: 2′-F-rATP (Trilink N-1007), 2′-Azido-rATP (Trilink N-1045), 2′-Amino-rATP (Trilink N-1046), and 2′-O-Methyl rATP (Trilink N-1015). Extension reactions were supplemented with the appropriate buffers, which included 0.5 mM Mn2+, 200 pmol of the initiator RNA oligonucleotide, 2.5 mM modified nucleotide, and 900 units of enzyme. Reactions were incubated at 37° C. for 60 minutes before being analyzed with denaturing gel electrophoresis. According to the gel (
3. S. pombe Cid1 poly(U) polymerase Incorporates Natural Nucleotides Universally
The efficiency of S. pombe Cid1 poly(U) polymerase (NEB M0337) for incorporating natural ribonucleotides was evaluated. For kinetic analysis, extension reactions were supplemented with the appropriate buffers, 10 pmol of the labeled initiator RNA oligonucleotide (5′-Cy5-poly rU-15-mer), 1.0 mM natural nucleotide (either ATP, UTP, GTP, or CTP), 1×SYBR Green II for RNA (Thermo), and 2 units of poly(U) polymerase. Reactions were incubated at 37° C. for 30 minutes and monitored in real time. 2 μL of each extension reaction was then analyzed using a 15% TBE-Urea gel and imaged on a Typhoon FLA 9500 system with EX: 649 nm and EM: 666 nm. It is clear from the gel that poly(U) polymerase has the ability to incorporate all four natural ribonucleotides as compared to the control; however, there is some bias in terms of how many extensions will occur (
4. S. pombe Cid1 poly(U) polymerase Incorporates 2′-Modified Nucleotides Universally
Knowing that S. pombe Cid1 poly(U) polymerase has the ability to incorporate all four natural ribonucleotides universally, it was sought to determine if this universality can be extended to 2′-modified nucleotides. Using the same reaction parameters, poly(U) polymerase was incubated with 2.5 mM 2′-O-Methyl-rATP, rUTP, rCTP, or rGTP at 37° C. for 60 minutes. 2 μL of each extension reaction was then analyzed using a 15% TBE-Urea gel and imaged on a Typhoon FLA 9500 system with EX: 649 nm and EM: 666 nm. The gel indicates that S. pombe Cid1 poly(U) polymerase incorporates the 2′-modified nucleotides universally and extends the initiator oligonucleotide by only +1-2 nucleotides with very high efficiency (
5. S. pombe Poly(U) Polymerase is Minimally Affected by Initiator Oligonucleotide Sequence Composition and Secondary Structure/Hairpins
Many terminal transferases are extremely sensitive to the sequence composition of the initiator oligonucleotide, where a different base at the 3′-OH terminus can greatly affect the rate at which a nucleotide is incorporated. Thus, it was investigated whether S. pombe poly(U) polymerase is affected in this manner by performing extension reactions for all four natural ribonucleotides in the presence of two 5′-labeled initiator oligonucleotides with differing compositions. Reactions were performed with 1 mM ribonucleotide and 10 pmol of the 5′-labeled initiator oligonucleotide. 2 μL of each extension reaction was then analyzed using a 15% TBE-Urea gel and imaged on a Typhoon FLA 9500 system with EX: 649 nm and EM: 666 nm for the 5′-Cy5-poly rA-15-mer and EX: 495 nm and EM: 520 nm for the other 5′-labeled initiator oligonucleotide. It was found that S. pombe poly(U) polymerase is minimally affected by initiator oligonucleotide sequence composition bias as indicated by denaturing gel electrophoresis (
6. S. pombe Poly(U) Polymerase Activity is Enhanced by the Addition of Inorganic Pyrophosphatase
A consequence of high terminal transferase activity is the fast accumulation of inorganic pyrophosphate, a known inhibitor of DNA- and RNA-directed polymerases. In order to reduce the accumulation of inorganic pyrophosphate, an inorganic pyrophosphatase (PPi-ase) can be employed to cleave the pyrophosphate into two single phosphates while the reaction progresses. Therefore, it was sought to determine if the supplementation of pyrophosphatase can enhance the terminal transferase activity of S. pombe poly(U) polymerase. Reactions were incubated for 30 minutes at 37° C. with 1 mM of each ribonucleotide, 10 pmol of 5′-Cy5-poly-rU-15-mer initiator oligonucleotide, and 0.1 units of Yeast Inorganic Pyrophosphatase (New England Biolabs M2403). 2 μL of each extension reaction was then analyzed using a 15% TBE-Urea gel and imaged on a Typhoon FLA 9500 system with EX: 649 nm and EM: 666 nm. It was found that the supplementation of inorganic pyrophosphatase enhances the rate at which S. pombe poly(U) polymerase synthesizes RNA and increases the maximum length of the synthesized RNA (
7. S. pombe Poly(U) Polymerase Activity can Naturally Incorporate Base-Modified Ribonucleotides
An application of the methods provided herein is the synthesis of biologically active molecules such as synthetic transfer RNA (tRNA) or ribosomal RNA (rRNA). Often the ribonucleotide bases that comprise tRNA and rRNA are naturally base-modified to, for example, induce secondary structure in vivo for optimal functionality. Additionally, the incorporation of modified bases into RNA oligonucleotides can greatly enhance their stability and protect against unwanted nuclease digestion. Thus, it was sought to determine if S. pombe poly(U) polymerase has the ability to incorporate pseudouridine, one of the most commonly found modified ribonucleotide bases in tRNA and rRNA. Reactions were incubated for 30 minutes at 37° C. with 2 mM, 1 mM, or 0.5 mM rUTP or pseudouridine triphosphate (Trilink N-1019), 10 pmol of 5′-Cy5-poly-rU-15-mer initiator oligonucleotide, and 0.1 units of Yeast Inorganic Pyrophosphatase. 2 μL of each extension reaction was then analyzed using a 15% TBE-Urea gel and imaged on a Typhoon FLA 9500 system with EX: 649 nm and EM: 666 nm. It was found that S. pombe poly(U) polymerase has the innate ability to incorporate pseudouridine, producing base-modified RNA oligonucleotides approximately 30-45 nt in length (
8. RNA Synthesis by S. pombe Poly(U) Polymerase can be Controlled with Competitive Inhibitor Nucleotides
As exemplified in
9. RNA Synthesis by S. pombe Poly(U) Polymerase can be Controlled with 2′-O-Blocked Reversible Terminator Nucleoside Triphosphates
As exemplified by
10. S. pombe Poly(U) Polymerase Efficiently Incorporates 2′-Reversible Terminator Nucleoside Triphosphates with All Four Natural Nucleobases
Previously, it was determined that S. pombe poly(U) polymerase has the ability to incorporate 2′-modified nucleoside triphosphates (2′-methoxy) bearing the four natural RNA nucleobases (A, U, G, C) with relatively equal efficiency (
11. Active S. pombe Poly(U) Polymerase can be Expressed in Bacteria with an N-Terminus His6-Tag for Large-Scale Production and Purification
The primary sequence for S. pombe poly(U) polymerase (UniProtKB: O13833) was modified by adding the amino acids “MGSSHHHHHHSSGLVPRGSH” to the N-terminus of the enzyme. These amino acids form an N-terminus His6-tag with the appropriate linkers. Using the protocol outlined in the materials and methods section, N-terminus His6-tagged S. pombe poly(U) polymerase was expressed, purified, and concentrated to a small volume. Denaturing gel electrophoresis indicated that N-terminus His6-tagged S. pombe poly(U) polymerase was properly expressed and isolated from bacterial lysates. With the N-terminus His6-tag, the expected molecular weight of S. pombe poly(U) polymerase is approximately 45 kDa, which corresponded to a strong band on the gel (
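As a rough check on how much the added tag contributes to the expected molecular weight, the mass of the 20-residue tag sequence given above can be estimated from average amino acid residue masses. The sketch below is illustrative only: the function name is hypothetical, the table covers just the residues present in the tag, and the masses are standard average values rather than measurements from this work.

```python
# Illustrative estimate of the mass added by the N-terminal tag.
# Average residue masses (Da) for the residues present in the tag only;
# any other residue letter raises KeyError.
AVERAGE_RESIDUE_MASS_DA = {
    "G": 57.0519,
    "S": 87.0782,
    "P": 97.1167,
    "V": 99.1326,
    "L": 113.1594,
    "M": 131.1926,
    "H": 137.1411,
    "R": 156.1875,
}
WATER_DA = 18.0153  # one water per free peptide (N-terminal H plus C-terminal OH)


def peptide_mass_da(sequence: str) -> float:
    """Average mass of a linear peptide: sum of residue masses plus one water."""
    return sum(AVERAGE_RESIDUE_MASS_DA[residue] for residue in sequence) + WATER_DA


HIS_TAG = "MGSSHHHHHHSSGLVPRGSH"  # tag sequence quoted in the text
tag_mass_da = peptide_mass_da(HIS_TAG)
print(f"{len(HIS_TAG)} residues, about {tag_mass_da / 1000:.2f} kDa")
```

Under these assumptions the tag adds roughly 2 kDa, a small shift relative to the approximately 45 kDa band expected for the tagged enzyme.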
12. Controlled RNA Oligonucleotide Synthesis Using S. pombe Poly(U) Polymerase can be Carried Out on Solid-Phase Surfaces
Controlled RNA oligonucleotide synthesis can be readily performed using bulk solutions; however, after the extension and deblocking steps in each synthesis cycle the growing oligonucleotide must be purified to remove interfering components. Multiple purifications, while having efficient recovery with modern methods, ultimately lead to major sample loss after several cycles of synthesis. Thus, performing oligonucleotide synthesis on a solid-phase support such as functionalized beads, wells, slides, etc. is significantly more conducive to the synthesis of longer oligonucleotide fragments and large, industrially relevant quantities of material. In order to evaluate the ability of S. pombe poly(U) polymerase to extend oligonucleotides anchored to a surface, an initiator oligonucleotide bearing a 5′-amine group and internal Cy5 dye was first used to attach a 5′-Biotin-PEG-NHS linker (EZ-Link #A35389 Thermo). The efficiency of the labeling reaction was determined to be >90% via analysis with a 15% TBE-urea gel (the addition of the bulky PEG group will make the oligonucleotide run differently as compared to the non-labeled oligonucleotide) (
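The sample loss that motivates solid-phase synthesis compounds multiplicatively across purification steps. The following sketch illustrates the arithmetic; the per-step recovery, the number of cycles, and the assumption of two purifications per cycle (one after extension, one after deblocking) are hypothetical values chosen for illustration and are not measurements from the examples herein.

```python
# Illustrative arithmetic: cumulative recovery over repeated purifications
# in bulk-solution synthesis. All numeric inputs below are hypothetical.

def cumulative_recovery(per_step_recovery: float,
                        cycles: int,
                        purifications_per_cycle: int = 2) -> float:
    """Fraction of starting material remaining after all purification steps."""
    if not 0.0 <= per_step_recovery <= 1.0:
        raise ValueError("per_step_recovery must be between 0 and 1")
    return per_step_recovery ** (cycles * purifications_per_cycle)


# Even with an assumed 95% recovery per purification, most material is lost
# over a modest number of cycles.
for n_cycles in (5, 10, 20):
    remaining = cumulative_recovery(0.95, n_cycles)
    print(f"{n_cycles} cycles: {remaining:.1%} of material remaining")
```

Because the loss is exponential in the number of purification steps, removing those steps by washing a surface-bound oligonucleotide becomes more valuable as the target length increases.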
13. A Reusable Solid-Phase Support with Covalent Linker can be Used for S. pombe Poly(U) Polymerase Mediated Controlled Enzymatic RNA Oligonucleotide Synthesis
A major contributing factor to the overall cost of controlled enzymatic RNA oligonucleotide synthesis is the oligonucleotide initiator sequence. In previous examples of bulk synthesis, the oligonucleotide initiator sequence is consumed and typically non-reusable. Additionally, it is difficult to remove the oligonucleotide initiator sequence from the final product if desired. To overcome this problem, the site-specific cleavage of riboinosines (rI) and/or deoxyinosines (dI) in single-stranded RNA, DNA, or a combination thereof, by Endonuclease V can be used to remove unwanted initiator sequence from the final oligonucleotide product. Endonuclease V is highly specific for riboinosines (rI) and deoxyinosines (dI) and will not destroy other bases in the oligonucleotide initiator sequence. Expanding this concept to solid-phase oligonucleotide synthesis, which is more conducive to long RNA oligonucleotide synthesis and industrially relevant synthesis scales than bulk solution synthesis, a reusable set of beads, wells, slides, etc. can be produced for repeated, and potentially unlimited, synthesis runs. A brief overview of this process is given in
In order to determine if this scheme works as intended, an initiator oligonucleotide was synthesized with a 5′-amine group and a deoxyinosine (dI) that would yield two equally sized fragments under Endonuclease V digestion. The initiator oligonucleotide was anchored onto the surface of amine-functionalized silica beads by introducing a dual-NHS-PEG9 linker that would react with the 5′-amine of the oligonucleotide and the amine on the silica beads. Derivatized beads were incubated with Endonuclease V (expressed and purified as described in the materials and methods section) for 1 hour at 37° C. Additionally, the same digestion reaction was conducted in bulk phase for comparison. For both solid-phase and bulk digestion reactions, control reactions that did not contain Endonuclease V were included. After the 1-hour incubation, digestion reactions were analyzed using a 15% TBE-urea gel under denaturing conditions. Gels were stained with 1×SYBR GelStar nucleic acid stain for 15 minutes with shaking at room temperature. It was observed that Endonuclease V worked as intended, producing digestion fragments in both the bulk and solid-phase reactions (
To demonstrate that the described solid-support system is functional for enzymatic synthesis and that initiator oligonucleotide is reusable via the deoxyinosine nucleobase remaining intact on the surface post Endonuclease V digestion, washed silica beads harboring digested initiator oligonucleotide were incubated with S. pombe poly(U) polymerase and natural rNTPs (uncontrolled extension) as well as the 2′-O-allyl-ATP reversible terminator (controlled extension) using optimized reaction conditions. For comparison, beads with newly anchored, undigested initiator oligonucleotide were extended similarly. All beads were then washed with 10 mM Tris-HCl (pH 6.5) and allowed to incubate in the presence of Endonuclease V for 1 hour at 37° C. The efficiency of extension and cleavage from the surface was then analyzed using a 15% TBE-urea gel under denaturing conditions and stained with 1×SYBR GelStar nucleic acid stain for 15 minutes shaking at room temperature. The gel indicated positive extension and cleavage for both the reused and newly derivatized beads under all conditions (
New nucleoside triphosphates for the enzymatic synthesis of RNA oligonucleotides and modified oligonucleotides have been developed. RNA oligonucleotides and modified oligonucleotides can be used in various applications including oligonucleotide therapeutics. Nucleoside triphosphates are reversibly terminated with a blocking group at the 3′-position of the sugar ring, conferring only (n+1) extension of the growing oligonucleotide; extension reactions do not produce a free hydroxyl group (-OH) at the 3′-position where further extension may be possible. The blocking group can be removed with a mild, biocompatible deprotection agent. This strategy complements virtual blocking at the 2′-position or base, where the growing oligonucleotide is sterically blocked rather than chemically blocked (these are also known as “virtual terminators”).
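The extend-then-deblock logic of reversible termination can be summarized as a simple state model. The sketch below is a conceptual illustration only; the class and function names are hypothetical, every step is assumed to go to completion, and no enzyme kinetics or chemistry is modeled.

```python
# Conceptual model of (n+1) extension with a reversible 3'-blocking group.
# Names are hypothetical; every step is assumed to be 100% efficient.
from dataclasses import dataclass


@dataclass(frozen=True)
class Strand:
    sequence: str          # written 5'->3'
    blocked: bool = False  # True while the 3'-terminus carries the blocking group


def extend(strand: Strand, base: str) -> Strand:
    """Add one reversibly terminated nucleotide; the new 3'-end is blocked."""
    if strand.blocked:
        raise ValueError("3'-terminus is blocked; deblock before extending")
    return Strand(strand.sequence + base, blocked=True)


def deblock(strand: Strand) -> Strand:
    """Remove the blocking group, restoring an extendable 3'-terminus."""
    return Strand(strand.sequence, blocked=False)


def synthesize(initiator: str, target: str) -> Strand:
    """Run one extend/deblock cycle per base of the target sequence."""
    strand = Strand(initiator)
    for base in target:
        strand = extend(strand, base)  # exactly one incorporation (n+1)
        strand = deblock(strand)       # mild deprotection step
    return strand


product = synthesize(initiator="UUUUU", target="ACGU")
print(product.sequence)  # UUUUUACGU
```

In this model a blocked strand cannot be extended again until it is deblocked, which is the property that limits each cycle to a single incorporation event.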
In some instances, the 3′-blocking strategy requires a compatible enzyme (e.g., a mutagenized poly(U) polymerase described herein) that accommodates the blocking chemical domain. There are a number of chemical domains that can be used for the 3′-blocking strategy, as listed below. Some examples include, but are not limited to, 3′-O-allyl triphosphates (3′-O-allyl-NTP) and 3′-O-azidomethyl triphosphates (3′-O-azidomethyl-NTP).
The new 3′-reversibly terminated nucleoside triphosphates can have the advantage of including multiple modifications that confer therapeutic or other functional value to the overall oligonucleotide. Nucleoside triphosphates may have single or multiple modifications in addition to the 3′-reversible terminating group. Modifications can be introduced site-specifically into the oligonucleotide without additional protecting groups. These nucleoside triphosphates require a compatible enzyme for their incorporation, which may hold a unique sequence or set of mutant codons for each modified nucleotide used. Modifications may manifest as chemical handles, ligand binding domains, a way to confer oligonucleotide nuclease resistance, a way to produce stereopure thiophosphonate oligonucleotides, a way to confer a propensity to form desired oligonucleotide secondary structure, or a way to confer resistance to forming undesired oligonucleotide secondary structure, etc. This includes:
i. Modifications to the 2′-domain of the furanose ring, which may be, but is not limited to, a hydroxyl (-OH), hydrogen (-H), fluoro (-F), amine (-NH2), azido (-N3), thiol (-SH), methoxy (-OCH3), methoxyethoxy (-OCH2CH2OCH3), redox-active, fluorogenic or intrinsically fluorescent moieties, natural and non-natural amino acids, peptides, proteins, mono- or oligosaccharides, functional/ligand binding glycans, and large/bulky groups such as polyethylene glycol (PEG).
ii. Modifications to the alpha (α) phosphate of the triphosphate, where either a thiophosphonate R or S isomer is site-specifically introduced into the oligonucleotide to produce a stereopure oligonucleotide.
iii. Modifications to the beta (β) and gamma (γ) phosphates of the triphosphate, where either or both modifications confer value to the enzymatic oligonucleotide synthesis scheme; for example, to prevent or limit unwanted pyrophosphorolysis as a consequence of pyrophosphate generation.
iv. Modifications to the furanose ring, which may include, but are not limited to, replacing the ring's oxygen with a sulfur or introducing a bridge between the 2′-oxygen and the 4′-carbon that limits ring conformation.
v. Modifications to the nucleobase, where the base is a natural or non-natural pyrimidine or purine, and may include, but is not limited to, N1-methyl-adenine, N6-methyl-adenine, 8′-azido-adenine, N,N-dimethyl-adenosine, aminoallyl-adenosine, 5′-methyl-uridine, pseudouridine, N1-methyl-pseudouridine, 5′-hydroxy-methyl-uridine, 2′-thio-uridine, 4′-thio-uridine, hypoxanthine, xanthine, 5′-methyl-cytidine, 5′-hydroxy-methyl-cytidine, 6′-thio-guanine, and N7-methyl-guanine.
At the completion of synthesis, oligonucleotides may be irreversibly capped with a final 3′-blocked nucleoside triphosphate that may confer further functional or therapeutic value. This also may require the use of a compatible enzyme (e.g., mutated poly(U) polymerase enzyme) and may be a modification group described herein. Additionally, both the 3′- and 2′-domains for the furanose ring may be irreversibly blocked with the same or different groups.
In addition to mono-nucleoside triphosphate, di-nucleoside triphosphates, tri-nucleoside triphosphates, and N-nucleoside triphosphates (where N=a triphosphorylated oligonucleotide of N length) can be used as substrates for incorporation using a compatible enzyme catalyst, effectively making an oligonucleotide ligase.
In some cases, the addition of a new nucleoside triphosphate may introduce a cleavable handle that can be acted upon by chemical or biological means for post-synthesis processing and purification. For example, oligonucleotides bearing a hypoxanthine (inosine) group may be site-specifically cleaved by Endonuclease V. This is particularly useful for solid-phase synthesis of oligonucleotides, where the bound oligonucleotide initiator can be reused indefinitely.
Enzymatic Oligonucleotide Synthesis with Wild-Type and Mutated Poly(N) Polymerases
Gel electrophoresis analysis of H336 mutants' capacity to incorporate the natural nucleotide GTP—“G” and CTP—“C” is shown in
Gel electrophoresis analysis of poly(U) polymerase mutant H336R capacity to incorporate an array of natural and analogue nucleotides in comparison to the wild-type poly(U) polymerase is shown in
Uncontrolled incorporation of 2′-methoxy-adenosine triphosphate (2′-O-Me-ATP) by various S. pombe poly(U) polymerase mutants, specifically at position H336 is shown in
Uncontrolled incorporation of 2′-fluoro-adenosine triphosphate (2′-F-ATP) by various S. pombe poly(U) polymerase mutants, specifically at position N171 is shown in
Controlled incorporation (capping) of 3′-methoxy-adenosine triphosphate (3′-O-Me-ATP) by various S. pombe poly(U) polymerase mutants, specifically at position N171 is shown in
Controlled incorporation of the reversible terminator 3′-O-allyl Adenosine Triphosphate (3′-(O-allyl)-ATP) by various S. pombe Mutants is shown in
Controlled incorporation of the reversible terminator 3′-O-allyl carbonate deoxyadenosine triphosphate (3′-(O-allyl carbonate)-dATP) by the poly(U) polymerase double mutant H336R-N171A is shown in
Reaction calibration assessment of purified poly(U) polymerase stock H336R with reversible terminator 3′-O-allyl Adenosine Triphosphate (3′-(O-allyl)-ATP) is shown in
Demonstration of controlled enzymatic synthesis is shown in
Exemplary structures of 3′-reversible terminator nucleotides for enzymatic incorporation are shown in
Select 3′ protecting groups where the furanyl ring bears oxygen are shown in
Exemplary scheme for the preparation of a 3′ azidomethyl ether for a nucleotide triphosphate is shown in
Exemplary scheme for the preparation of a 3′ azidomethyl ether for a locked nucleotide triphosphate is shown in
Exemplary scheme for the preparation of a 3′ allyl ether for a nucleotide triphosphate is shown in
Exemplary scheme for the preparation of a 3′ azidomethyl ether for a locked nucleotide triphosphate is shown in
Saccharomyces cerevisiae poly(A) polymerase
Schizosaccharomyces pombe poly(U) polymerase
The primary sequences of wild-type or mutant enzymes of interest were codon optimized for E. coli expression using a custom optimization algorithm and ordered as gBlocks® (IDT) with 20-nt overlap sequences for Gibson Assembly into the pET-28-c-(+) His-tag expression vector (EMD Millipore 69866-3). Using forward and reverse primers from IDT, the gBlocks® were PCR amplified with Phusion High Fidelity (HF) polymerase (NEB M05030). The PCR thermocycling was performed as follows: initial denaturation at 98° C. for 30 seconds, followed by 18 cycles of denaturation at 98° C. for 10 seconds, annealing at 68° C. for 10 seconds, and extension at 72° C. for 60 seconds, before a final extension of 5 minutes at 72° C. PCR reactions were purified and concentrated using a QIAquick PCR Purification Kit (Qiagen 28106).
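The thermocycling program above can be written down as data, which makes the cycle structure explicit and lets the total programmed hold time be checked by simple arithmetic. The following is a minimal sketch; the `Step` class and `program_hold_seconds` function are hypothetical names, and the calculation ignores ramp times between temperatures, which depend on the instrument.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    """One programmed hold in a thermocycler protocol."""
    name: str
    temp_c: float
    seconds: int


def program_hold_seconds(initial: Step, cycle: list, n_cycles: int, final: Step) -> int:
    """Sum of programmed hold times only (ramp times are not included)."""
    per_cycle = sum(step.seconds for step in cycle)
    return initial.seconds + n_cycles * per_cycle + final.seconds


# Values taken from the gBlocks amplification protocol in the text.
INITIAL = Step("initial denaturation", 98, 30)
CYCLE = [
    Step("denaturation", 98, 10),
    Step("annealing", 68, 10),
    Step("extension", 72, 60),
]
FINAL = Step("final extension", 72, 300)

total = program_hold_seconds(INITIAL, CYCLE, 18, FINAL)
print(f"{total} s ({total / 60:.1f} min)")  # 1770 s (29.5 min)
```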
The pET-28-c-(+) expression vector was prepared for gBlocks® insertion by digesting the circular DNA with 40 U of NdeI (NEB R0111) per 500 ng vector at 37° C. for 90 minutes. The linear DNA was separated from undigested material with 2% agarose gel electrophoresis and extracted by incubating agarose containing the bands corresponding to the linear DNA in Buffer QG (Qiagen 19063) at 55° C. rotating at 1000 RPM for 2 hours. The resultant mixture was cleaned and concentrated with the QIAquick PCR Purification Kit. The PCR amplified insert and vector sequences were combined at a ratio of 1:3 with 0.1 pmol of total material and assembled with Gibson Assembly Master Mix (NEB E5510S) at 50° C. for 1 hour. T7 Express chemically competent E. coli (NEB C2566I) were transformed with the fully assembled plasmid as per manufacturer's instructions and positive transformants were selected on LB-kanamycin plates (50 μg/mL kanamycin).
Bacterial colonies were sequenced (Genewiz, T7—Forward Primer, T7 Term—Reverse Primer) and those with perfect matches were grown in liquid LB-kanamycin media (50 μg/mL kanamycin) overnight, diluted 1:400 in fresh liquid LB-kanamycin, and induced with 1 mM IPTG (Sigma 16758) at approximately OD600=0.8. The induced liquid cultures were incubated overnight at 15° C., shaking at 250 RPM. Cultures were then pelleted at 3500×g for 10 minutes and then His-Tag purified using a HisTalon Resin Kit as per manufacturer's instructions (Clontech 635654). The eluted enzyme samples were then buffer exchanged into an optimal 2× protein storage buffer using 15-mL filter columns (Millipore) at the appropriate MWCO by centrifugation at 5000×g for 15 minutes at 4° C. This process was repeated twice. On the third spin, samples were spun for 30 minutes in order to concentrate the protein into a smaller volume.
Two small aliquots were taken to determine the overall protein concentration using a Reducing Agent Compatible MicroBCA kit (Thermo 23252) and to determine the size of the His-Tag purified protein using a 16% Tris-Gly denaturing gel (Thermo XP00165) with a 10-250 kDa protein ladder (Thermo 26619). Post gel-electrophoresis, gels were stained with Coomassie Orange Fluor (Thermo C33250) for 20 minutes at room temperature under gentle agitation and visualized using a GelDoc Image Station (Biorad). The remaining concentrated stock of protein was diluted 1:2 with sterile glycerol and stored at −20° C.
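The assembly step specifies molar amounts (0.1 pmol total at a 1:3 ratio), whereas purified DNA is usually quantified by mass. A short worked conversion is sketched below. It rests on several assumptions that are not stated in the text: the ratio is read in the order the parts are named (insert:vector), an average mass of about 650 g/mol per base pair of double-stranded DNA is used, and the fragment lengths (1200 bp and 5400 bp) are hypothetical placeholders. The function names are likewise illustrative.

```python
BP_MASS_G_PER_MOL = 650.0  # approximate average mass of one dsDNA base pair


def split_pmol(total_pmol: float, part_a: float, part_b: float):
    """Split a total molar amount between two fragments at ratio part_a:part_b."""
    whole = part_a + part_b
    return total_pmol * part_a / whole, total_pmol * part_b / whole


def pmol_to_ng(pmol: float, length_bp: int) -> float:
    """Mass (ng) of a dsDNA fragment: pmol x bp x 650 g/mol / 1000."""
    return pmol * length_bp * BP_MASS_G_PER_MOL / 1000.0


# Assumption: 1:3 is insert:vector; swap the arguments if vector:insert was meant.
insert_pmol, vector_pmol = split_pmol(0.1, 1, 3)

INSERT_BP = 1200   # hypothetical insert length
VECTOR_BP = 5400   # hypothetical linearized vector length

insert_ng = pmol_to_ng(insert_pmol, INSERT_BP)
vector_ng = pmol_to_ng(vector_pmol, VECTOR_BP)
print(f"insert: {insert_pmol:.3f} pmol = {insert_ng:.1f} ng")
print(f"vector: {vector_pmol:.3f} pmol = {vector_ng:.1f} ng")
```

With real fragment lengths substituted for the placeholders, the same two functions give the mass of each part to pipette into the assembly reaction.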
Single or multiple amino acids may be mutagenized for improvement in either of the RNA oligonucleotide synthesis schemes by rational design or by high-throughput methods such as error-prone PCR. Plasmids carrying the target protein were harvested and purified from sequence-verified liquid bacterial cultures grown overnight in LB-kanamycin media at 37° C. using a MiniPrep Kit (Qiagen 27104). Oligonucleotide primers were ordered from IDT and were designed to PCR amplify the protein expression plasmid while simultaneously mutagenizing the plasmid at the predetermined location, yielding linearized DNA. Using the reagents from the Q5 Site-Directed Mutagenesis Kit (NEB E0554S), the protein expression plasmid was PCR amplified using the Q5 Hot Start High-Fidelity 2× Master Mix with the following thermocycling conditions: initial denaturation at 98° C. for 30 seconds, followed by 25 cycles of denaturation at 98° C. for 10 seconds, annealing at 68° C. for 10 seconds, and extension at 72° C. for 120 seconds, before a final extension of 2 minutes at 72° C. 1 μL of the resulting PCR amplification reaction was then treated with the kit's enzyme reaction cocktail to re-circularize the protein expression plasmid while digesting away the unsubstituted plasmid sequences remaining in the reaction mixture. After bacterial transformation and sequence verification, colonies with perfect sequence matches were used to express and analyze the site-directed mutant protein with methods previously outlined. The resultant purified mutagenized protein was concentrated and buffer exchanged into the appropriate 2× storage buffer as previously mentioned, diluted 1:2 with sterile glycerol, and stored at −20° C.
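Mutants in this description are named in the usual wild-type residue, position, replacement residue form (e.g., H336R, or H336R-N171A for a double mutant). When checking sequencing results against an intended design, it can help to apply such labels to a protein sequence in silico. The following is a minimal sketch; `apply_mutation` and `apply_mutations` are hypothetical helpers, positions are assumed to be 1-based, and the five-residue sequence in the usage example is a placeholder, not any enzyme sequence from this disclosure.

```python
import re

# One-letter wild-type residue, 1-based position, one-letter replacement.
MUTATION_RE = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def apply_mutation(protein: str, mutation: str) -> str:
    """Apply a single substitution such as 'H336R' to a protein sequence."""
    match = MUTATION_RE.match(mutation)
    if match is None:
        raise ValueError(f"unrecognized mutation label: {mutation!r}")
    wild_type, position, replacement = match.group(1), int(match.group(2)), match.group(3)
    index = position - 1
    if not 0 <= index < len(protein):
        raise ValueError(f"position {position} is outside the sequence")
    if protein[index] != wild_type:
        raise ValueError(
            f"expected {wild_type} at position {position}, found {protein[index]}"
        )
    return protein[:index] + replacement + protein[index + 1:]


def apply_mutations(protein: str, label: str) -> str:
    """Apply a hyphen-separated label such as 'H336R-N171A' in order."""
    for mutation in label.split("-"):
        protein = apply_mutation(protein, mutation)
    return protein


if __name__ == "__main__":
    toy = "MKHNL"  # placeholder sequence for illustration only
    print(apply_mutations(toy, "H3R-N4A"))
```

The wild-type residue check catches off-by-one numbering errors before a primer is ordered.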
Initial Activity Screen with Natural rNTPs
Expressed proteins with terminal transferase activity were screened by determining the rate of RNA generation in terms of total RNA concentration and the length/distribution of RNA produced by the protein after incubation with natural rNTPs. In order to measure the rate of RNA generation, a 10 μL bulk extension reaction consisting of 10 pmol of a short 5′-Cy5 labeled initiator oligonucleotide (15-20 nt), 100 μM of rNTPs, 0.25 mM divalent cation cofactor (such as Co2+, Mg2+, Mn2+, Zn2+, or combinations thereof), 1× Reaction Buffer, 1× SYBR Dye (GelStar (Lonza 50535), Qubit ssDNA Dye (Thermo Q10212), or SYBR Green II RNA gel stain (Thermo S7564)), and 1 μL of purified enzyme was monitored on a plate reader (EX:598 nm, EM:522 nm) over 30 minutes at 37° C., taking signal reads every 1 minute in triplicate (N=3). Using a custom R script, the rate of RNA generation and, subsequently, the enzyme initial activity (Vo) were determined from the slope of the best-fit curve of the average RFU plotted as a function of time. The length of the RNA produced in these reactions was determined by comparing products to a 100-nt ssDNA ladder (Simplex Biosciences) using a 15% TBE-Urea denaturing gel (Thermo EC6885) following the manufacturer's protocol. Approximately 8 μL of the initial activity screen reaction volume was loaded onto the gels and run at 185 V for 60 minutes unless otherwise specified. Gels were then stained with a solution of 1× GelStar Nucleic Acid stain or SYBR Green II RNA gel stain for 15 minutes with gentle agitation. The resultant gel was then imaged on a Typhoon FLA 9500 system (GE Healthcare Life Sciences) using imaging parameters for SYBR Gold. For extension reactions using initiator oligonucleotides labeled with a 5′-fluorophore such as FAM, Cy5, Cy3, etc., gels were not stained and were imaged directly using the appropriate parameters.
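The rate calculation described above (averaging triplicate RFU reads at each time point and taking the slope of a best-fit line) can be sketched in a few lines. The analysis in the text was performed with a custom R script that is not reproduced here; the Python below is an independent illustration that assumes an ordinary least-squares straight-line fit over the supplied time points. The function names and the RFU values in the usage example are hypothetical.

```python
from statistics import mean


def mean_trace(replicates):
    """Average replicate RFU traces point by point (all traces equal length)."""
    return [mean(values) for values in zip(*replicates)]


def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def initial_rate(times_min, replicates):
    """Slope (RFU per minute) of the mean RFU trace versus time."""
    slope, _intercept = fit_line(times_min, mean_trace(replicates))
    return slope


if __name__ == "__main__":
    # Made-up triplicate reads at 1-minute intervals, for illustration only.
    times = [0, 1, 2, 3, 4]
    reads = [
        [100, 150, 200, 250, 300],
        [110, 160, 210, 260, 310],
        [90, 140, 190, 240, 290],
    ]
    print(f"initial rate: {initial_rate(times, reads):.1f} RFU/min")
```

In practice the fit would usually be restricted to the early, approximately linear portion of the trace, since the signal flattens as substrate is consumed.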
Enzyme Activity Assay—Uncontrolled Extension with Natural & Analogue Nucleotides
Uncontrolled extension reactions consisted of 5 pmol of initiator oligonucleotide, 1 mM natural or analogue nucleotide, 1× poly(U) polymerase reaction buffer (10 mM NaCl, 10 mM Tris-HCl, 10 mM MgCl2, 1 mM DTT, pH 7.9 at 25° C.), and 1 μg of purified enzyme. Natural and analogue nucleotides were either purchased from commercial sources or custom synthesized in-house. Reactions were incubated at 37° C. for 30 minutes and immediately analyzed by gel electrophoresis using a 15% TBE-Urea denaturing gel (Thermo EC6885) as per manufacturer's instructions. The length of the oligonucleotide produced in these reactions was determined by comparing products to a 100-nt ssDNA ladder (Simplex Biosciences). Gels were then stained with a solution of 1× GelStar Nucleic Acid stain or SYBR Green II RNA gel stain for 15 minutes with gentle agitation. The resultant gel was then imaged on a Typhoon FLA 9500 system (GE Healthcare Life Sciences) using imaging parameters for SYBR Gold. For extension reactions using initiator oligonucleotides labeled with a 5′-fluorophore such as FAM, Cy5, Cy3, etc., gels were not stained and were imaged directly using the appropriate parameters.
Enzyme Activity Assay—Controlled Extension with Natural & Analogue Reversible Terminator Nucleotides
Controlled extension reactions consisted of 5 pmol of initiator oligonucleotide, 1 mM blocked reversible terminator nucleotides, 1× poly(U) polymerase reaction buffer (10 mM NaCl, 10 mM Tris-HCl, 10 mM MgCl2, 1 mM DTT, pH 7.9 at 25° C.), and 1 μg of purified enzyme. Reactions were incubated at 37° C. for 1 minute and immediately analyzed by gel electrophoresis using a 15% TBE-Urea denaturing gel (Thermo EC6885) as per manufacturer's instructions. Success of the (N+1) event was determined by comparison with a blank extension reaction in which no nucleotide or enzyme was supplemented. Gels were then stained with a solution of 1× GelStar Nucleic Acid stain or SYBR Green II RNA gel stain for 15 minutes with gentle agitation. The resultant gel was then imaged on a Typhoon FLA 9500 system (GE Healthcare Life Sciences) using imaging parameters for SYBR Gold. For extension reactions using initiator oligonucleotides labeled with a 5′-fluorophore such as FAM, Cy5, Cy3, etc., gels were not stained and were imaged directly using the appropriate parameters.
Both uncontrolled and controlled extension reactions can be performed using surface-bound initiator oligonucleotide. The surface-bound initiator oligonucleotide was obtained from IDT with a 5′-amine C6 spacer group and an internal Cy5 fluorophore. This oligonucleotide was then biotinylated and PEG-ylated using an EZ Link NHS-PEG12-Biotin kit (Thermo A35389) as per manufacturer's instructions and then cleaned and concentrated using an Oligonucleotide Clean and Concentrator Spin-column Kit (Zymo D4060). Derivatized initiator oligonucleotide was then bound to the surface of a streptavidin-coated PCR plate (BioTez, Germany) by incubating the oligonucleotide in 2× Binding and Wash buffer (10 mM Tris-HCl, 2 M NaCl, 1 mM EDTA, pH 7.5 at 25° C.) for 1 hour with gentle agitation (300 RPM) in the plate wells. Wells were then aspirated and washed once with 1× Binding and Wash Buffer. Extension reaction cocktails were made up as previously described and incubated with surface-bound oligonucleotide for a predetermined time (30 minutes for uncontrolled and 1 minute for controlled) shaking at 900 RPM at 37° C. Wells were then washed again using 1× Binding and Wash Buffer. To remove the extended oligonucleotide from the surface, wells were incubated with stripping solution (95% formamide, 10 mM EDTA, pH 6.0 at 25° C.) at 65° C. for 5 minutes. Oligonucleotide suspended in the stripping solution was then cleaned and purified using an oligonucleotide spin-column and eluted into 6 μL diH2O. Surface extension reactions were then analyzed using gel electrophoresis as previously described.
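The reaction composition above is given as final amounts and concentrations; turning it into pipetting volumes is a matter of C1V1 = C2V2 arithmetic. The sketch below shows that arithmetic only. The total reaction volume (10 μL) and every stock concentration in it (5 pmol/μL initiator, 10 mM nucleotide, 10× buffer, 1 μg/μL enzyme) are assumptions chosen for illustration and are not specified in the text; the function names are hypothetical.

```python
def volume_for_conc(final_conc: float, stock_conc: float, reaction_ul: float) -> float:
    """C1*V1 = C2*V2: stock volume (uL) needed for a final concentration.

    Both concentrations must be in the same units (e.g., mM, or 'x' for buffer).
    """
    return final_conc * reaction_ul / stock_conc


def volume_for_amount(amount: float, stock_per_ul: float) -> float:
    """Stock volume (uL) that delivers a fixed amount (e.g., pmol or ug)."""
    return amount / stock_per_ul


def controlled_extension_mix(reaction_ul: float = 10.0) -> dict:
    """Volumes (uL) per reaction, using ASSUMED stock concentrations."""
    mix = {
        "initiator, 5 pmol": volume_for_amount(5.0, 5.0),           # assumed 5 pmol/uL stock
        "blocked nucleotide, 1 mM": volume_for_conc(1.0, 10.0, reaction_ul),  # assumed 10 mM stock
        "reaction buffer, 1x": volume_for_conc(1.0, 10.0, reaction_ul),       # assumed 10x stock
        "enzyme, 1 ug": volume_for_amount(1.0, 1.0),                # assumed 1 ug/uL stock
    }
    water = reaction_ul - sum(mix.values())
    if water < 0:
        raise ValueError("components exceed the reaction volume; use more concentrated stocks")
    mix["water"] = water
    return mix


if __name__ == "__main__":
    for component, volume in controlled_extension_mix().items():
        print(f"{component}: {volume:.1f} uL")
```

Substituting the actual stock concentrations and reaction volume used at the bench gives the corresponding volumes; the error branch flags combinations that cannot fit in the chosen volume.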
In the claims, articles such as “a,” “an,” and “the” may mean one or more than one unless indicated to the contrary or otherwise evident from the context. Claims or descriptions that include “or” between one or more members of a group are considered satisfied if one, more than one, or all of the group members are present in, employed in, or otherwise relevant to a given product or process unless indicated to the contrary or otherwise evident from the context. The invention includes embodiments in which exactly one member of the group is present in, employed in, or otherwise relevant to a given product or process. The invention includes embodiments in which more than one, or all of the group members are present in, employed in, or otherwise relevant to a given product or process.
Furthermore, the invention encompasses all variations, combinations, and permutations in which one or more limitations, elements, clauses, and descriptive terms from one or more of the listed claims are introduced into another claim. For example, any claim that is dependent on another claim can be modified to include one or more limitations found in any other claim that is dependent on the same base claim. Where elements are presented as lists, e.g., in Markush group format, each subgroup of the elements is also disclosed, and any element(s) can be removed from the group. It should be understood that, in general, where the invention, or aspects of the invention, is/are referred to as comprising particular elements and/or features, certain embodiments of the invention or aspects of the invention consist, or consist essentially of, such elements and/or features. For purposes of simplicity, those embodiments have not been specifically set forth in haec verba herein.
It is also noted that the terms “comprising” and “containing” are intended to be open and permit the inclusion of additional elements or steps. Where ranges are given, endpoints are included. Furthermore, unless otherwise indicated or otherwise evident from the context and understanding of one of ordinary skill in the art, values that are expressed as ranges can assume any specific value or sub-range within the stated ranges in different embodiments of the invention, to the tenth of the unit of the lower limit of the range, unless the context clearly dictates otherwise.
This application refers to various issued patents, published patent applications, journal articles, and other publications, all of which are incorporated herein by reference. If there is a conflict between any of the incorporated references and the instant specification, the specification shall control. In addition, any particular embodiment of the present invention that falls within the prior art may be explicitly excluded from any one or more of the claims. Because such embodiments are deemed to be known to one of ordinary skill in the art, they may be excluded even if the exclusion is not set forth explicitly herein. Any particular embodiment of the invention can be excluded from any claim, for any reason, whether or not related to the existence of prior art.
Those skilled in the art will recognize or be able to ascertain using no more than routine experimentation many equivalents to the specific embodiments described herein. The scope of the present embodiments described herein is not intended to be limited to the above Description, but rather is as set forth in the appended claims. Those of ordinary skill in the art will appreciate that various changes and modifications to this description may be made without departing from the spirit or scope of the present invention, as defined in the following claims.
This application claims priority under 35 U.S.C. § 119(e) to U.S. Provisional Application, U.S. Ser. No. 62/745,136, filed Oct. 12, 2018, the entire contents of which is incorporated herein by reference.
This invention was made with government support under grant number DE-FG02-02ER63445, awarded by the U.S. Department of Energy. The government has certain rights in the invention.
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/US2019/055870 | 10/11/2019 | WO | 00 |
Number | Date | Country | |
---|---|---|---|
62745136 | Oct 2018 | US |