This disclosure includes a Sequence Listing submitted electronically in ASCII format under the file name “NEB-440_ST25”. This Sequence Listing is incorporated herein in its entirety by this reference.
Argonaute proteins have an ability to bind small single-stranded 5′-phosphorylated nucleic acids which provide base-pairing specificity for targeting complementary single-stranded targets. Eukaryotic Argonautes (eAgo) are essential components of certain RNA-induced gene silencing processes, in which eAgo associates with a single-stranded RNA guide to form an RNA-induced silencing complex (RISC). The RISC is then directed to the complementary sequence on mRNA molecules where Argonaute catalyzes the nucleolytic cleavage of single-stranded mRNA in a guide-specific manner, thus, resulting in a reduction of target gene expression. RISC also can interact with a variety of Argonaute-associated proteins to induce cleavage-independent mechanisms of gene regulation.
Despite the fact that prokaryotes lack RNA interference pathways, many bacterial and archaeal organisms also possess Argonaute proteins implying a different biological role and/or mechanism of action for these proteins within a cell. Multiple recent studies suggest that prokaryotic Argonautes in vivo function as defense systems against foreign genetic elements. Prokaryotic Agos (pAgos) represent a very diverse group of proteins and based on the presence or absence of the basic domains can be divided into two major groups—the short pAgos and the long pAgos. All known active pAgos belong to a long Ago group and, similar to eAgos, consist of four essential domains, N-terminal, PAZ, MID and PIWI. In contrast to eAgos, which use RNA guides to exclusively target RNA, different bacterial Agos have been shown to bind either RNA or DNA guides and to cleave either RNA or DNA targets, whereas some archaeal Agos exclusively utilize DNA guides for cleavage of DNA targets.
In addition to a guided cleavage of complementary targets, many pAgos exhibit non-specific nuclease activity when they are not associated with the guides. TtAgo co-purifies with DNA sequences that are preferentially derived from its own expression plasmid, but only if the Argonaute is catalytically active. Based on these and similar studies, the non-specific activity of pAgos was implicated in cellular function required for guide processing. While the physiological mechanism for DNA guide processing in vivo still remains ambiguous, the most recent study of mesophilic bacterial Argonaute CbAgo shows that CbAgo nucleolytic activity cooperates with cellular double-strand break repair machinery in generation of small DNAs that later can be used as guides by this Argonaute.
The present disclosure relates, in some embodiments, to a non-naturally occurring composition comprising a helicase, a first Argonaute (e.g., a mesophilic Argonaute) bound to a first guide (e.g., an engineered guide), and optionally, a second Argonaute (e.g., a mesophilic Argonaute) bound to a second guide (e.g., an engineered guide). Each guide may be engineered or programmed to bind (e.g., base pair) with specific complementary sequences in a target polynucleotide. A non-naturally occurring composition may comprise a first Argonaute and the first guide at an Argonaute:guide molar concentration ratio of 2:1 to 1:2 or an Argonaute:guide molar concentration ratio equal to or lower than 1:1.4. A non-naturally occurring composition may comprise a second Argonaute and the second guide at an Argonaute:guide molar concentration ratio of 2:1 to 1:2 or an Argonaute:guide molar concentration ratio equal to or lower than 1:1.4. In some embodiments, a composition may further comprise a double-stranded polynucleotide (e.g., a target DNA). A double-stranded polynucleotide may comprise a nucleotide sequence complementary to a first guide and/or complementary to a second guide.
In some embodiments, a composition may comprise an Argonaute (e.g., a first Argonaute bound to a first guide and/or a second Argonaute bound to a second guide) selected from an Aquifex aeolicus Argonaute, a Microcystis aeruginosa Argonaute, a Clostridium bartlettii Argonaute, an Exiguobacterium Argonaute, an Anoxybacillus flavithermus Argonaute, a Halogeometricum borinquense Argonaute, a Halorubrum lacusprofundi Argonaute, an Aromatoleum aromaticum Argonaute, a Synechococcus Argonaute, a Clostridium butyricum Argonaute (CbAgo), a Clostridium disporicum Argonaute (CdAgo), a Clostridium perfringens Argonaute (CpAgo), a Clostridium sartagoforme Argonaute (CsAgo), a Clostridium saudiense Argonaute (CaAgo), an Intestinibacter bartlettii Argonaute (IbAgo) and, in each case, homologues thereof (e.g., Argonautes having at least 90% amino acid sequence identity thereto). For example, an Argonaute (e.g., a first Argonaute bound to a first guide) may be CaAgo, CbAgo, CdAgo, CpAgo, CsAgo or IbAgo and the Argonaute bound to the second guide may independently be CaAgo, CbAgo, CdAgo, CpAgo, CsAgo or IbAgo. A helicase may be selected from an EcoRecQ DNA helicase from Escherichia coli, a CpeRecQ from Clostridium perfringens, a Cbu RecQ from Clostridium butyricum, a DNA helicase from T4-like bacteriophage (e.g., T4 gp41, T4 gp41 associated with T4 gp59, T4 UvsW, T4 Dda and Slur07 Dda), a T7 bacteriophage gp4 DNA helicase, RecBCD-family helicases from Escherichia coli, a modified RecBCD helicase (e.g., RecBexo- helicase, RecBexo-C, RecBexo-CD, RecΔB, RecΔBC, RecΔBCD), a UvrD/PcrA family helicase (e.g., E. coli EcoUvrD), an E. coli Rep, an M. tuberculosis PcrA, an M. leprae PcrA, and an Escherichia coli Tra helicase. Each guide (e.g., engineered guide) may be independently (of other guides in a composition) 12-60 nucleotides in length.
The present disclosure relates, in some embodiments, to methods of forming a double strand break in a double-stranded polynucleotide at a target position in the polynucleotide. For example, a method may comprise contacting (a) a double-stranded polynucleotide having a first target sequence on a first strand of the polynucleotide and a second target sequence on the opposite strand, (b) a helicase, (c) an Argonaute with a first bound guide having a sequence complementary to the first target sequence, and (d) an Argonaute with a second bound guide having a sequence complementary to the second target sequence under conditions that permit hybridization of complementary sequences and cleavage of the first strand by the (c) Argonaute and cleavage of the second strand by the (d) Argonaute to produce a double strand break in the polynucleotide. A method of forming a double-strand break may be performed at a wide range of suitable temperatures. For example, a method of forming a double strand break in a double-stranded polynucleotide at a target position in the polynucleotide may include contacting reaction components disclosed above and herein at a temperature of 25° C. to 45° C. A method of forming a double strand break in a double-stranded polynucleotide may comprise forming at least a first fragment of the polynucleotide and a second fragment of the polynucleotide. For example, the number of fragments of the starting polynucleotide formed may be a function of the number and position of the sequence(s) complementary to the guide(s). A first guide may have a sequence complementary to the first target sequence in the polynucleotide.
A method of forming a double strand break in a double-stranded polynucleotide may comprise forming one or more fragments of the polynucleotide with one or more such fragments comprising a blunt end, an overhang from 1 to 50 nucleotides in length (which overhang may be a 5′ or a 3′ overhang), and/or an overhang from 51 to 100 nucleotides in length (which overhang may be a 5′ or a 3′ overhang).
A method may comprise, for example, (a) contacting a helicase, an Argonaute, a guide DNA bound to the Argonaute, and a polynucleotide comprising a target sequence that is complementary to at least part of the guide DNA, to produce a reaction mix; and/or (b) incubating the reaction mix at a temperature of 25° C. to 45° C., wherein the nucleic acid is cleaved. Contacting, in this context, may further comprise contacting the helicase, the Argonaute, the guide DNA bound to the Argonaute, the polynucleotide, a second Argonaute, and a second guide bound to the second Argonaute, wherein the polynucleotide further comprises a second target sequence that is complementary to at least part of the second guide DNA.
The present disclosure relates, in some embodiments, to compositions, methods, systems, and kits that modify argonaute's use of single-stranded DNA guides for sequence-specific cleavage of complementary DNA targets to yield pAgos adapted to serve as programmable DNA endonucleases. In some embodiments, cleavage may occur at a physiological temperature (e.g., at temperatures from 25° C. to 45° C.), optionally in the presence of a helicase and/or a chemical agent. Examples of chemical agents include alkali, dimethylsulfoxide (DMSO), and formamide. Conditions may otherwise be selected to permit or favor cleavage of and/or to destabilize Watson-Crick bonding in double stranded DNA, for example, by contacting the substrate DNA with media having a low or high ionic strength, with air, and/or with glass.
The CRISPR/Cas9 system is a widely used enzymatic tool for programmable DNA cleavage. Cas9 nuclease is programmed with a single RNA guide and can invade ds DNA structure. It generates double-strand breaks at a guide-specific DNA target. Also, the system functions at physiological temperatures, so it has been successfully adapted for genome editing in vivo (Wang H et al., Annu. Rev. Biochem. 2016, 85:227-264).
In contrast to CRISPR-Cas9, mesophilic pAgos have been observed to act poorly on double-stranded targets due to their poor ability to invade duplex DNA. Characterized pAgos that function at 30-75° C. temperatures show low levels of endonucleolytic activity on ds DNA preferentially targeting negatively supercoiled plasmids and/or DNA sections with low G/C content. These preferences are consistent with DNA duplex destabilization aiding pAgos to access targets on ds DNA. At a temperature that causes thermal DNA denaturation (e.g., >87° C.), hyperthermophilic pAgos (e.g., PfAgo from the archaeon Pyrococcus furiosus) work in vitro as programmable DNA-guided DNA-cleaving endonucleases. After DNA melting takes place, the double-strand cleavage by PfAgo proceeds by way of two independent strand-nicking events catalyzed by two PfAgo monomers, each loaded with a guide complementary to one DNA strand.
Putative pAgo proteins were screened for cleavage activity at 37-65° C. temperature and candidates were identified that are active at 37° C. (
Recently, ds DNA cleavage by mesophilic SeAgo was tested during ongoing transcription of the target region, which was expected to transiently melt ds DNA, but the approach had no observed effect on target cleavage by SeAgo under the conditions tested. Taking a different approach, ds DNA cleavage by argonaute CbAgo was explored during the ongoing unwinding of DNA strands by mesophilic DNA helicases. Applicants disclose here that, surprisingly, mesophilic argonautes are capable of cleaving ds DNA at or near physiological temperatures (e.g., from 25° C. to 45° C.) in the presence of one or more helicases. Initially, CbAgo was combined with either EcoRecQ (Abcam, Inc., Cambridge, Mass., USA) or Slur07 Dda (McLab, South San Francisco, Calif., USA) and both combinations were found to be proficient in cleaving short ds DNA tailed with a single-stranded fork structure (
Highly processive and highly efficient DNA helicases were evaluated for compatibility with CbAgo. For example, E. coli RecBCD is a very fast (e.g., 1,000 to 2,000 bp s−1) and very processive (e.g., ˜30,000 bp) DNA helicase which prefers unwinding blunt-ended DNA. Wild type RecBCD is a heterotrimer consisting of three subunits, RecB, RecC and RecD and has multiple enzymatic activities: ATP-dependent DNA unwinding activity, ATP-dependent dsDNA and ssDNA exonuclease activity, ATP-stimulated ssDNA exonuclease activity and ATPase activity. Wild type RecBCD possesses a strong nuclease activity associated with the RecB subunit. RecB is organized into a 100-kDa N-terminal helicase domain and a 30-kDa C-terminal exonuclease/endonuclease domain. The helicase and nuclease domains can function independently from each other, suggesting that the nucleolytic activity of the RecB subunit can be eliminated without losing DNA unwinding function. In one embodiment, a truncated RecB helicase variant, lacking C-terminal amino acids 930-1180, was constructed and referred to as RecΔB. In another embodiment, a full-length nuclease deficient RecB helicase variant, referred to as RecBexo-, was constructed. Both RecB variants were combined with the RecC subunit to form rapid and processive DNA helicases, referred to as RecΔBC and RecBexo-C.
In the presence of RecΔBC DNA helicase CbAgo was capable of cleaving up to 25-30% of targets on blunt-ended DNA (
In contrast to RecΔBC, DNA unwinding by RecBexo-C helicase permits CbAgo to efficiently access single-stranded targets on otherwise inaccessible double-stranded substrates. Coupling CbAgo cleavage with DNA strand unwinding by RecBexo-C DNA helicase for the first time allowed argonaute's properties on double-stranded substrates to be evaluated at 37° C. temperature. A detailed analysis of CbAgo cleavage products was carried out using high-throughput capillary gel electrophoresis to identify the most effectual pre-arrangement of the guides for ds target cleavage. An array of 21-nt long DNA guides was used to evaluate cleavage efficiency of individual strands within a double-stranded DNA. In the presence of RecBexo-C helicase, CbAgo loaded with 13 different guides was shown to efficiently cleave targeted DNA strand up to 80-100% in 16 minutes at 37° C. (
According to some embodiments, individual DNA strands are cleaved independently by two Ago/guide complexes. Cleavage products may be custom-designed (including fragment length, cleavage location, and overhang length, and overhang character (e.g., 5′- or 3′-single-stranded extensions)) by appropriate selection of the guide used to target each DNA strand. In some embodiments, cleavage products may be (further) designed or controlled by adjusting reaction conditions including, for example, molar ratio of Argonaute to guide, molar ratio of Argonaute to substrate, molar ratio of helicase to substrate, reaction temperature, reaction time, salt(s) present (if any) and concentration of such salt(s), pH, buffer(s) present (if any) and concentration of such buffer(s), and combinations thereof. For concurrent cleavage of opposite DNA strands, two guide pairs were initially tested. CbAgo loaded with one guide pair was expected to yield DNA fragments tailed with 5 nt-long 3′-ss extensions (
In some embodiments, guides that target opposite DNA strands may partially complement each other. Complementary regions may be identified by aligning nucleotide sequences of two guides which are employed concurrently for double-stranded cleavage. For example, the two 21-nt long guides complement each other by 20 nucleotides if they are arranged to yield a blunt-ended ds cut. In case of either 5′- or 3′-staggered cuts which create ss overhangs of 13 nucleotides in length, the 21-nt long guides complement each other by 9 or 11 nucleotides respectively. Sequence alignment also reveals that guides complement each other through 3′-terminal sequences if they are pre-arranged to yield cleavage products tailed with 3′-ss extensions. To generate cleavage products tailed with 5′-ss extensions, the guides must complement each other through their 5′-terminal sequences.
The 5′-end of the guide may play a key role in target recognition and cleavage by Ago nuclease. Nucleotides 2-8 of the guide counted from a 5′-phosphorylated end are termed the “seed” region. In an Ago/guide complex, the bases of the seed region are solvent exposed, therefore they can readily base pair with a matching sequence on the target strand. If targeted sequence matches the guide sequence over more than 15 nucleotides, then Ago readily cleaves targeted sequence at the scissile phosphate which is located between nucleotides 10 and 11 counting from the 5′-phosphate on the guide sequence. When two opposing CbAgo:guide complexes are formed at a 1:2 molar concentration ratio and then combined in the same cleavage reaction, free guides may hybridize to each other through complementary regions and form inactive double-stranded structures. However, in some embodiments, the free guides may also base pair with the seed region of the already formed CbAgo/guide complex, thus, turning into unintentional targets. Without limiting any embodiment to any particular mechanism of action, CbAgo may cleave such unintentional target, for example, where CbAgo/guide complex is capable of base pairing with the free guide by more than 15 nucleotides. Close inspection of selected CbAgo reactions performed with guides sharing lengthy complementary regions revealed slow cleavage of ds targets during early time points followed by a rapid acceleration in cleavage at the later time points, thus, indicating that single-stranded free guides possibly were cleaved prior to double-stranded DNA targets. In some embodiments, CbAgo cleavage of ds DNA may be inhibited, possibly severely inhibited, in the presence of free guides which are highly complementary to each other across the 5′-terminal sequences. Without limiting any particular embodiment to any specific mechanism of action, this inhibition may be due to the formation of uncleavable double-guided complexes.
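By way of non-limiting illustration, the positional rules described above (a seed region at guide nucleotides 2-8 and a scissile phosphate between guide nucleotides 10 and 11, counted from the 5′-phosphorylated end) may be sketched as follows. The function names, the example guide sequence, and the requirement of an exact full-length match are assumptions made only for this sketch; as noted above, cleavage may occur when the match extends over more than 15 nucleotides.

```python
# Illustrative sketch only: locate the expected cut on a target strand.
# Assumes plain DNA letters and an exact, full-length guide/target match.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def seed_region(guide):
    """Guide nucleotides 2-8, counted from the 5' end (1-based)."""
    return guide[1:8]


def cut_index(target, guide):
    """Index i such that target[:i] and target[i:] are the two products.

    The guide pairs antiparallel to the target, so guide nucleotide 1 pairs
    with the 3'-most nucleotide of the target site. The cut falls between the
    target nucleotides paired with guide nucleotides 11 and 10.
    Returns None when the target does not contain the site.
    """
    site = reverse_complement(guide)
    start = target.upper().find(site)
    if start < 0:
        return None
    return start + len(guide) - 10


# Hypothetical 21-nt guide and a target strand carrying its site.
guide = "TGAGGTAGTAGGTTGTATAGT"
target = "CCCC" + reverse_complement(guide) + "GGGG"
cut = cut_index(target, guide)
left_product, right_product = target[:cut], target[cut:]
```

In this sketch the right-hand product begins with the ten target nucleotides that pair with guide nucleotides 1-10, which is one way to check a guide design against an intended cleavage position.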
The present disclosure provides methods for effective pre-arrangement of the guides for ds DNA cleavage by CbAgo in the presence of RecBexo-C DNA helicase. For example, efficient methods for double-strand cleavage by CbAgo may include targeting ds DNA with guide pairs programmed to yield 3′-ss staggered breaks. Efficient methods for double-strand cleavage by CbAgo may include targeting ds DNA with guide pairs programmed to produce 5′-ss staggered breaks, wherein the CbAgo:guide molar concentration ratio is equal to or lower than 1:1.4.
CbAgo loaded with several particular guides (e.g., guides T1-C1 or T2-A1) cleaved single-stranded target imbedded within a double-stranded DNA (
The present disclosure provides mesophilic Ago which can rapidly cleave ds targets during concurrent DNA unwinding by DNA helicase RecBexo-C deficient in nuclease activity. As elaborated in EXAMPLES 17-18, in the presence of RecBexo-C helicase CbAgo efficiently cleaves linear ds DNAs ranging from 300 bp up to 25 kb in length. According to some embodiments, CbAgo/RecBexo-C combination cleaves ds targets located as far as 11-12 kb away from the end of linear DNA. But also, the CbAgo/RecBexo-C can be used to cleave close to the end of linear DNA, thus yielding DNA fragments with sequence-specific ss extensions of any desirable length that are ready for ligation without further enzymatic treatment. EXAMPLE 19 discloses methods for assembly of natural linear DNA molecules by using CbAgo/RecBexo-C programmable DNA endonuclease (
According to some embodiments, an argonaute and a helicase, for example, CbAgo and RecBexo-C, may be combined for efficient mesophilic DNA-guided DNA-cleaving programmable endonuclease activity. These may be used in vitro for development of new synthetic biology tools that require or benefit from sequence-specific nicking/cleavage of double-stranded DNA at otherwise inaccessible locations.
Aspects of the present disclosure can be further understood in light of the embodiments, section headings, figures, descriptions and examples, none of which should be construed as limiting the entire scope of the present disclosure in any way. Accordingly, the claims set forth below should be construed in view of the full breadth and spirit of the disclosure.
Each of the individual embodiments described and illustrated herein has discrete components and features which can be readily separated from or combined with the features of any of the other several embodiments without departing from the scope or spirit of the present teachings. Any recited method can be carried out in the order of events recited or in any other order which is logically possible.
Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this disclosure belongs. Still, certain terms are defined herein with respect to embodiments of the disclosure and for the sake of clarity and ease of reference.
Sources of commonly understood terms and symbols may include: standard treatises and texts such as Kornberg and Baker, DNA Replication, Second Edition (W.H. Freeman, New York, 1992); Lehninger, Biochemistry, Second Edition (Worth Publishers, New York, 1975); Strachan and Read, Human Molecular Genetics, Second Edition (Wiley-Liss, New York, 1999); Eckstein, editor, Oligonucleotides and Analogs: A Practical Approach (Oxford University Press, New York, 1991); Gait, editor, Oligonucleotide Synthesis: A Practical Approach (IRL Press, Oxford, 1984); Singleton, et al., Dictionary of Microbiology and Molecular Biology, 2d ed., John Wiley and Sons, New York (1994), and Hale & Markham, The Harper Collins Dictionary of Biology, Harper Perennial, N.Y. (1991) and the like.
As used herein and in the appended claims, the singular forms “a” and “an” include plural referents unless the context clearly dictates otherwise. For example, the term “a protein” refers to one or more proteins, i.e., a single protein and multiple proteins. It is further noted that the claims can be drafted to exclude any optional element. As such, this statement is intended to serve as antecedent basis for use of such exclusive terminology as “solely,” “only” and the like in connection with the recitation of claim elements or use of a “negative” limitation. As used herein, “can” and “may” are intended to convey an optional, possible, and/or permissive condition of or for operability.
Numeric ranges are inclusive of the numbers defining the range. All numbers should be understood to encompass the midpoint of the integer above and below the integer i.e., the number 2 encompasses 1.5-2.5. The number 2.5 encompasses 2.45-2.55 etc. When sample numerical values are provided, each alone may represent an intermediate value in a range of values and together may represent the extremes of a range unless specified.
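By way of non-limiting illustration, the convention stated above (a recited number encompasses half a unit of its last recited digit on either side) may be sketched as follows. The function name is hypothetical, and passing the number as text so that its recited precision is preserved is an assumption of this sketch.

```python
from decimal import Decimal


def encompassed_range(value):
    """Return (low, high) for a number written as text, e.g. "2" or "2.5".

    The half-width is five units in the decimal place just below the last
    recited digit, so "2" gives 1.5-2.5 and "2.5" gives 2.45-2.55.
    """
    number = Decimal(value)
    exponent = number.as_tuple().exponent  # 0 for "2", -1 for "2.5"
    half_step = Decimal(5).scaleb(exponent - 1)
    return number - half_step, number + half_step


low, high = encompassed_range("2.5")
```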
In the context of the present disclosure, “Argonaute” refers to an endonuclease that catalyzes cleavage of a single stranded nucleic acid governed by the sequence of a bound guide and may comprise an N-terminal domain (e.g., facilitating release of a target nucleic acid after cleavage), a PAZ domain (e.g., which may hold the 3′ end of the guide pending hybridization of the guide with a complementary sequence), a MID domain (e.g., which binds a short, single-stranded oligonucleotide guide), and/or a PIWI domain (e.g., having a metal-dependent, RNase H-like endonuclease with activity conditioned on whether the PAZ domain is bound to the 3′ end of the guide and/or whether the guide is hybridized to a complementary target sequence). In some embodiments, an Argonaute may be a naturally occurring protein. In some embodiments, an Argonaute may be a non-naturally occurring protein. An Argonaute may have an amino acid sequence having at least 80%, at least 90%, at least 95%, or 100% sequence identity to a wild type Argonaute polypeptide (e.g., Argonaute from Thermus thermophilus). Examples of Argonautes include, without limitation, Argonautes from Aquifex aeolicus, Microcystis aeruginosa, Clostridium bartlettii, Exiguobacterium, Anoxybacillus flavithermus, Halogeometricum borinquense, Halorubrum lacusprofundi, Aromatoleum aromaticum, Thermus thermophilus, Synechococcus (e.g., Synechococcus elongatus), Thermosynechococcus elongatus, C. butyricum (CbAgo), C. disporicum (CdAgo), C. perfringens (CpAgo), C. sartagoforme (CsAgo), C. saudiense (CaAgo), I. bartlettii (IbAgo), and/or an Argonaute listed in Table 2 or Table 3.
Argonautes include eukaryotic (e.g., mouse AGO2) and prokaryotic Argonautes. Argonaute may comprise an amino acid change relative to a reference sequence (e.g., a naturally occurring sequence) such as a deletion, insertion, substitution, variant, mutation, fusion, chimera, or any combination thereof. This term refers to any modified (e.g., shortened, mutated, lengthened) polypeptide sequence or homologue of the Argonaute. An Argonaute can be enzymatically inactive, partially active, constitutively active, fully active, inducibly active and/or more active, (e.g., more than the wild type homologue of the protein or polypeptide). A “thermostable” Argonaute is a protein that remains catalytically active for at least 5 minutes or 10 minutes at elevated temperatures such as above 45° C., 50° C. or 55° C. An Argonaute catalytically active at physiological temperatures (e.g., 25-45° C.) may be referred to as a “mesophilic Argonaute”. With its guide bound to a complementary target sequence, an Argonaute creates a break in the phosphodiester backbone of the complementary target nucleic acid. In the case of double-stranded substrates, a break is only created in the strand which is complementary to the guide nucleic acid. As disclosed herein, a break in the other strand may be introduced using a second Argonaute with a second guide.
In the context of the present disclosure, “buffer” and “buffering agent” refer to a chemical entity or composition that itself resists and, when present in a solution, allows such solution to resist changes in pH when such solution is contacted with a chemical entity or composition having a higher or lower pH (e.g., an acid or alkali). Examples of suitable non-naturally occurring buffering agents that may be used in disclosed compositions, kits, and methods include, for example, Tris, HEPES, TAPS, MOPS, tricine, or MES.
In the context of the present disclosure, “double-strand break” refers to any breakage of the phosphate backbones of each strand in a double stranded polynucleotide (e.g., RNA, DNA, RNA/DNA hybrids). Double stranded polynucleotides may have any helical (e.g., α, β, ζ) or non-helical conformation. Double strand breaks include breaks that leave blunt ends (Class I) or overhangs (Classes II and III). Class II overhangs may be from 1-50 nucleotides in length (e.g., 1-4 nts, 5-12 nts, 13-25 nts, 26-50 nts) with the stringency of conditions needed for separation of such overhangs generally increasing with length of the overhang. Overhangs of 1-4 nts may be generated by restriction endonucleases. Overhangs of 5-12 nts may be dissociated, for example, at 37° C. Overhangs of 13 nts or beyond may require higher temperatures and/or chemical denaturants. Class III overhangs may be over 50 nucleotides in length. A double-strand break in a double-stranded polynucleotide may produce two fragments of the original double-stranded polynucleotide. A double strand break may not result in two separate fragments, for example, if an overhang is long and/or conditions permit the overlapping regions of each strand to remain base paired. In such cases, the overhang may be destabilized (e.g., by heat, salt, and/or pH adjustments) to release the overlapping strands from one another.
In the context of the present disclosure, “guide” refers to a single-stranded oligonucleotide (a) capable of binding (e.g., hybridizing to) a polynucleotide having a complementary sequence, (b) capable of binding an Argonaute, and (c) comprising (i) at least 12 nucleotides (e.g., 12-60 nucleotides), (ii) at least 50% deoxyribonucleotides (e.g., at least 60%, at least 70%, at least 80%, at least 90%, or 100% deoxyribonucleotides), (iii) up to 50% ribonucleotides (e.g., up to 10%, up to 25%, up to 35%, up to 45%, up to 50% ribonucleotides), (iv) optionally, a phosphorylated 5′ end, (v) optionally, a nucleotide sugar modification, and (vi) optionally, a nucleotide substitution. In some embodiments, a guide may comprise a phosphorylated 5′ end or another chemical modification at its 5′ end. A guide may be engineered or synthetic with a sequence selected to complement a desired target sequence. A guide may be capable of directing an Argonaute polypeptide:guide DNA complex to a target polynucleotide. A DNA guide may be an oligonucleotide or polynucleotide that is synthetic or from a natural source such as genomic DNA, cDNA, extrachromosomal DNA, microbial DNA or viral DNA (e.g., the natural source differing from the Argonaute such that the guide and Argonaute together form a non-naturally occurring combination). The guide DNA is generally single stranded when used with Argonaute although it may be derived from dsDNA.
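By way of non-limiting illustration, the classification of double-strand breaks defined above (Class I blunt ends, Class II overhangs of 1-50 nucleotides, Class III overhangs of more than 50 nucleotides) may be sketched as follows. The coordinate convention, in which both cut positions are counted along the top strand in its 5′ to 3′ direction, and the function name are assumptions of this sketch.

```python
def describe_break(top_cut, bottom_cut):
    """Classify a double-strand break from the two nick positions.

    top_cut / bottom_cut: number of base pairs to the left of the nick on
    the top and bottom strand respectively, counted along the top strand
    in its 5'->3' direction (hypothetical convention for this sketch).
    Returns (break class, overhang length in nt, overhang polarity).
    """
    offset = bottom_cut - top_cut
    length = abs(offset)
    if length == 0:
        return ("Class I", 0, None)  # blunt ends
    # A bottom-strand nick to the right of the top-strand nick leaves
    # protruding 5' ends; the opposite arrangement leaves 3' ends.
    polarity = "5'" if offset > 0 else "3'"
    break_class = "Class II" if length <= 50 else "Class III"
    return (break_class, length, polarity)


# Two nicks placed five base pairs apart, bottom nick to the left.
example = describe_break(105, 100)
```

As a check on the convention, a restriction-type cut such as G^AATTC on the top strand with the matching cut on the bottom strand corresponds to `describe_break(1, 5)`, a 4-nt 5′ overhang.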
In some embodiments, a guide length suitable for Argonaute cleavage of dsDNA (e.g., in the presence of a helicase or single strand binding protein) may comprise at least 12 nucleotides, for example, having a size range of 12-60 nucleotides, 14-50 nucleotides, 15-40 nucleotides, 16-35 nucleotides, 15-24 nucleotides, or 16-21 nucleotides. In some embodiments, a guide DNA may be greater than 21 nucleotides or at least 24 nucleotides in length. In some embodiments, a guide may be 16-21 nucleotides in length (e.g., 16, 17, 18, 19, 20 or 21 nucleotides).
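By way of non-limiting illustration, the length and composition criteria recited above for a guide (at least 12 nucleotides and at least 50% deoxyribonucleotides) may be checked as sketched below. The notation, in which upper-case A, C, G, T denote deoxyribonucleotides and lower-case a, c, g, u denote ribonucleotides, is a hypothetical convention adopted only for this sketch, and the default upper bound of 60 nucleotides reflects the exemplary 12-60 range rather than a requirement.

```python
def check_guide(guide, min_len=12, max_len=60):
    """Return a list of problems found; an empty list means none were found.

    Upper-case letters are read as deoxyribonucleotides and lower-case
    letters as ribonucleotides (convention assumed for this sketch).
    """
    problems = []
    if any(ch not in "ACGTacgu" for ch in guide):
        problems.append("unrecognized nucleotide symbol")
    if len(guide) < min_len:
        problems.append(f"shorter than {min_len} nt")
    if len(guide) > max_len:
        problems.append(f"longer than {max_len} nt")
    deoxy = sum(ch in "ACGT" for ch in guide)
    if guide and deoxy / len(guide) < 0.5:
        problems.append("fewer than 50% deoxyribonucleotides")
    return problems


# Hypothetical all-DNA 21-nt guide: no problems are reported.
report = check_guide("TGAGGTAGTAGGTTGTATAGT")
```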
In some embodiments, a guide may comprise a nucleotide sugar modification or a nucleotide substitution. In some embodiments, a nucleotide sugar modification comprises a 2′ sugar modification and may be selected from the group consisting of a 2′-O—CH3, a 2′-F, and a 2′-MOE modification. In some embodiments, a nucleotide substitution comprises one selected from the group consisting of a locked nucleic acid (LNA), an unlocked nucleic acid (UNA), deoxyuridine, pseudouridine, 5-methylcytosine, 2-aminopurine, 2,6-diaminopurine, deoxyinosine, 5-hydroxybutynl-2′-deoxyuridine, 8-aza-7-deazaguanosine, and 5-nitroindole. In some embodiments, a guide molecule comprises a sugar modification and a nucleotide substitution.
The nucleotide sequence of a guide may or may not be degenerate. For example, guides with degenerate sequences may be useful for targeting nucleic acid sequences that are not fully known and/or for targeting more than one variant in a population of polynucleotides.
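By way of non-limiting illustration, a degenerate guide sequence may be expanded into the individual sequences it represents as sketched below. The use of the standard IUPAC nucleotide ambiguity codes to write degenerate positions is an assumption of this sketch, as the disclosure does not specify a notation, and the function name is hypothetical.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes for DNA.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_degenerate(guide):
    """List every concrete sequence encoded by a degenerate guide."""
    pools = [IUPAC[base] for base in guide.upper()]
    return ["".join(combo) for combo in product(*pools)]


# One degenerate position (R = A or G) gives two concrete sequences.
variants = expand_degenerate("ACRT")
```

The number of concrete sequences grows multiplicatively with each degenerate position, which may be relevant when estimating the effective concentration of any single guide species in a degenerate pool.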
In the context of the present disclosure, “helicase” refers to a motor protein that moves linearly along double stranded nucleic acids unwinding or otherwise separating the component strands along base paired nucleosides. A helicase may or may not form a ring structure surrounding a nucleic acid substrate. A helicase may or may not unwind molecules that comprise partially single stranded nucleic acids (“a helicases”). Examples of helicases include, without limitation, RecQ-family helicases (e.g., EcoRecQ DNA helicase from Escherichia coli (WP_096324295.1), CpeRecQ from Clostridium perfringens (WP_011590145.1), Cbu RecQ from Clostridium butyricum (WP_003411240.1)); DNA helicases from T4-like bacteriophages (e.g., T4 gp41, T4 gp41 associated with T4 gp59, T4 UvsW, T4 Dda and Slur07 Dda); T7 bacteriophage gp4 DNA helicase; RecBCD-family helicases (e.g., E. coli RecBCD DNA helicase); modified RecBCD helicases (e.g., RecBexo- helicase, RecBexo-C, RecBexo-CD, RecΔB, RecΔBC, RecΔBCD); UvrD/PcrA family helicases, e.g., E. coli EcoUvrD, E. coli Rep, M. tuberculosis PcrA, M. leprae PcrA; and/or E. coli Tra helicase. A helicase may unwind, for example, linear, nicked circular, and/or supercoiled circular DNA.
In the context of the present disclosure, “non-naturally occurring” refers to a polynucleotide, polypeptide, carbohydrate, lipid, or composition that does not exist in nature. Such a polynucleotide, polypeptide, carbohydrate, lipid, or composition may differ from naturally occurring polynucleotides, polypeptides, carbohydrates, lipids, or compositions in one or more respects. For example, a polymer (e.g., a polynucleotide, polypeptide, or carbohydrate) may differ in the kind and arrangement of the component building blocks (e.g., nucleotide sequence, amino acid sequence, or sugar molecules). A polymer may differ from a naturally occurring polymer with respect to the molecule(s) to which it is linked. For example, a “non-naturally occurring” protein may differ from naturally occurring proteins in its secondary, tertiary, or quaternary structure, by having a chemical bond (e.g., a covalent bond including a peptide bond, a phosphate bond, a disulfide bond, an ester bond, an ether bond, and others) to a polypeptide (e.g., a fusion protein), a lipid, a carbohydrate, or any other molecule. Similarly, a “non-naturally occurring” polynucleotide or nucleic acid may contain one or more other modifications (e.g., an added label or other moiety) to the 5′-end, the 3′ end, and/or between the 5′- and 3′-ends (e.g., methylation) of the nucleic acid. A “non-naturally occurring” composition may differ from naturally occurring compositions in one or more of the following respects: (a) having components that are not combined in nature, (b) having components in concentrations not found in nature, (c) omitting one or more components otherwise found in naturally occurring compositions, (d) having a form not found in nature, e.g., dried, freeze dried, crystalline, aqueous, and (e) having one or more additional components beyond those found in nature (e.g., ATP, TTP, CTP, GTP, a buffer, a detergent, a dye, a solvent or a preservative).
In the context of the present disclosure, “oligonucleotide” refers to a polymer of nucleotides comprising naturally occurring nucleotides, non-naturally occurring nucleotides, derivatized nucleotides, or a combination thereof. As used herein, the term “complementarity” refers to the ability of nucleotides, or analogues thereof, to form Watson-Crick base pairs. Complementary nucleotide sequences will form Watson-Crick base pairs and non-complementary nucleotide sequences will not.
In the context of the present disclosure, “single-stranded DNA binding protein” refers to a protein that binds to ssDNA. The genomes of most organisms, including bacteria (e.g., E. coli), viruses (e.g., herpes viruses) and mammals, encode at least one SSB. SSBs of interest include, but are not limited to, ET SSB, E. coli recA, T7 gene 2.5 product (gp2.5), T4 gene 32 product (gp32), E. coli SSB, replication protein A (RPA) from archaeal and eukaryotic organisms, Nanoarchaeum equitans SSB-like protein, UvrD, RadA, Rad51, phage lambda RedB or Rac prophage RecT. An SSB may be thermostable or mesolabile. An SSB may have at least 80%, at least 90%, at least 95%, or 100% sequence identity to a wild type SSB.
In the context of the present disclosure, a “substitution” at a position in a comparator amino acid sequence refers to any difference at that position relative to the corresponding position in a reference sequence, including a deletion, an insertion, and a different amino acid, where the comparator and reference sequences are at least 80% identical to each other. A substitution in a comparator sequence, in addition to being different than the reference sequence, may differ from all corresponding positions in naturally occurring sequences that are at least 80% identical to the comparator sequence.
In the context of the present disclosure, “target” refers to a nucleic acid having a nucleic acid sequence, which may be a chromosomal sequence or an extrachromosomal sequence (e.g., an episomal sequence, a minicircle sequence, a plasmid, a mitochondrial sequence, a chloroplast sequence, etc.). A target nucleic acid can be a dsDNA or ssDNA; a target nucleic acid may also be an RNA.
All publications, patents, and patent applications mentioned in this specification are herein incorporated by reference to the same extent as if each individual publication, patent, or patent application was specifically and individually indicated to be incorporated by reference. All reagents referenced, if unavailable elsewhere, may be obtained from the indicated source and/or New England Biolabs, Inc. (Ipswich, Mass.).
The present disclosure provides, in some embodiments, compositions for producing a break in double-stranded DNA (e.g., linear DNA, open circular DNA, negatively supercoiled DNA, and/or positively supercoiled DNA). In some embodiments, compositions may produce a break in double-stranded DNA at temperatures from, for example, 25° C. to 45° C. (e.g., 25° C. to 30° C., 30° C. to 35° C., 35° C. to 40° C., or 40° C. to 45° C.). A composition may comprise, for example, a helicase, an Argonaute bound to a first guide, and optionally, an Argonaute bound to a second guide. In some embodiments, a single Argonaute with a single guide may produce a double-stranded break. For example, if a palindromic ds sequence (e.g., 5′-CGTAATTCGTACGAATTACG-3′/3′-GCATTAAGCATGCTTAATGC-5′; SEQ ID NO:94/SEQ ID NO:94) is targeted, an Argonaute having a guide 5′-pCGTAATTCGTACGAATTACG-3′ (SEQ ID NO:94) (a) may bind a target's top strand resulting in cleavage of the top strand and (b) may bind complementary bottom strand resulting in cleavage of the bottom strand, thereby producing a double-stranded break. Double-stranded DNA with longer and/or more complex repeating sequences may also be cleaved with an Argonaute bound to a single guide sequence.
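By way of non-limiting illustration, whether a single guide may suffice for a given target can be checked by comparing the target sequence with its own reverse complement. The following Python sketch is offered under stated assumptions: the function names are hypothetical and not part of the disclosure, and only the 20-nt example sequence is taken from the text above.

```python
# Hypothetical helper names; only the example site comes from the text.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def is_palindromic(seq: str) -> bool:
    """True when the duplex reads the same 5'->3' on both strands."""
    return seq.upper() == reverse_complement(seq)


site = "CGTAATTCGTACGAATTACG"  # palindromic example from the text
print(is_palindromic(site))          # True
print(is_palindromic("CGTAATTCGA"))  # False
```

When the check returns True, a guide with the same 5′→3′ sequence is complementary to both strands of the target, which is the situation described for the palindromic example.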
Compositions for cleaving complex and/or non-palindromic/non-repeating target sequences may comprise a helicase, an Argonaute bound to a first guide, and an Argonaute bound to a second guide. The first and second guides may be selected to target sequences on opposite strands such that the resulting single-stranded breaks are on opposite strands and close enough to one another to constitute a double-strand break. In some embodiments, a composition may include or exclude a single-stranded DNA binding protein.
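As a non-limiting illustration of how two single-stranded breaks on opposite strands relate to the ends of the resulting fragments, the following Python sketch classifies the ends from the two cut positions. It is a sketch under stated assumptions: both cut positions are expressed in top-strand coordinates, a value of i means the bond between positions i and i+1 is cleaved, and the function name and the example coordinates are hypothetical.

```python
def overhang(top_cut: int, bottom_cut: int) -> tuple[str, int]:
    """Classify the ends left by two nicks on opposite strands.

    Both cuts are given in top-strand coordinates (1-based): a value of i
    means the bond between positions i and i+1 is cleaved. The bottom strand
    runs antiparallel to the top strand.
    """
    offset = top_cut - bottom_cut
    if offset > 0:
        # The top strand extends past the bottom-strand cut: 3' tails.
        return ("3' overhang", offset)
    if offset < 0:
        # The bottom-strand cut lies downstream of the top-strand cut: 5' tails.
        return ("5' overhang", -offset)
    return ("blunt", 0)


# Hypothetical coordinates chosen only to illustrate an 8-nt 3' extension.
print(overhang(170, 162))  # ("3' overhang", 8)
```

Under this convention both fragments carry the same type of extension, and its length equals the distance between the two cut positions.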
A composition, according to some embodiments, may include or exclude components beyond a helicase and an Argonaute. For example, a composition may include one or more polynucleotides that are actual or potential substrates for programmed cleavage (e.g., plasmids, phage, vectors, genomic DNA, organellar DNA, library DNA), detection agents, nucleotide triphosphates (e.g., ATP, GTP, CTP, TTP or modified versions thereof), buffers, salts, detergents, and/or crowding agents. A composition may include one or more other proteins, examples of which include polymerases, ligases, nucleases, helicase loading proteins (e.g., T4 gp59 protein for T4 gp41 helicase or MutL protein for EcoUvrD helicase), and/or helicase processivity enhancers (e.g., RepD for PcrA helicase).
In some embodiments, a helicase may be in a molar excess relative to Argonaute. In some embodiments, an Argonaute and a helicase may be present in a reaction at a molar ratio of 1 (Argonaute):1 (helicase) to 1 (Argonaute):100 (helicase), e.g., 1 (Argonaute):5 (helicase) to 1 (Argonaute):30 (helicase), e.g., about 1 (Argonaute): about 15 (helicase), although ratios outside of these ranges can be used.
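For illustration only, the helicase concentration implied by a given Argonaute:helicase molar ratio can be computed as below. This is a minimal sketch: the function name and the 0.25 μM Argonaute concentration are assumptions chosen for the example, not values required by the disclosure.

```python
def helicase_concentration(ago_um: float, helicase_per_ago: float) -> float:
    """Helicase concentration (uM) for an Argonaute concentration and a 1:N ratio."""
    if helicase_per_ago <= 0:
        raise ValueError("ratio must be positive")
    return ago_um * helicase_per_ago


AGO_UM = 0.25  # hypothetical Argonaute concentration, in uM
for ratio in (1, 5, 15, 30, 100):
    print(f"1:{ratio} -> {helicase_concentration(AGO_UM, ratio):.2f} uM helicase")
```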
In some embodiments, off-target activity may be reduced or absent from compositions and methods of the disclosure. Selection of the length, complexity, and/or G:C content of the guide sequence(s), for example, may result in compositions and methods for cleaving DNA with little or no off-target activity.
The present disclosure provides, in some embodiments, methods of forming a double strand break in a double-stranded polynucleotide at a (pre-selected) target position in the polynucleotide. According to some embodiments, forming a double strand break in a double-stranded polynucleotide may occur at temperatures from, for example, 25° C. to 45° C. A method may comprise, for example, contacting (a) a double-stranded polynucleotide having a first target sequence on a first strand of the polynucleotide and a second target sequence on the opposite strand, (b) a helicase, (c) an Argonaute with a first bound guide (e.g., having a sequence complementary to the first target sequence), and (d) an Argonaute (which may be the same as or different from the Argonaute in (c)) with a second bound guide (e.g., having a sequence complementary to the second target sequence) under conditions that permit hybridization of complementary sequences and cleavage of the first strand by the (c) Argonaute and cleavage of the second strand by the (d) Argonaute to produce a double strand break in the polynucleotide, the double strand break forming at least a first fragment of the polynucleotide and a second fragment of the polynucleotide. Pre-selecting a position for cleavage may include, in some embodiments, obtaining the nucleotide sequence of at least a portion of each strand of the polynucleotide, selecting for each strand a cleavage site in the obtained sequence, identifying a sequence for a first guide complementary to at least a portion of the obtained sequence for one strand and sufficient to contact an Argonaute bound to such first guide and the cleavage site.
In some embodiments, a method may further comprise detecting at least one of the fragments of the polynucleotide, for example, by gel-electrophoresis, capillary electrophoresis, mass spectrometry, fluorescence (e.g., where a fluorescent tag and/or a quencher is/are attached to the polynucleotide or a fragment), and/or sequencing.
Some specific example embodiments may be illustrated by one or more of the examples provided herein.
T7 Express lysY/Iq E. coli (New England Biolabs, Inc., Ipswich, Mass., USA) carrying expression plasmid was grown at 37° C. in 1 L of LB medium containing 0.02 mg/ml kanamycin until OD600 reached 0.6. Protein expression was induced with 0.2 mM IPTG and the culture was grown overnight at 16° C. The cells were harvested by centrifugation, resuspended in 45 ml buffer A (20 mM Tris-HCl, pH 7.5, 300 mM NaCl, 10 mM imidazole) supplemented with protease inhibitor cocktail (Complete mini, EDTA-free; Roche Diagnostics GmbH, Mannheim, Germany) and disrupted by sonication. The cell-free lysate was loaded on a 5 ml HisTrap Nickel HP column (GE-Healthcare, Chicago, Ill., USA) and proteins were eluted with 20-250 mM imidazole gradient. The fractions containing Ago protein were pooled, diluted 2-fold with buffer B (20 mM Tris-HCl, pH 7.5, 0.1 mM EDTA), loaded on 5 ml HiTrap Heparin HP column (GE-Healthcare, Chicago, Ill., USA) and proteins were eluted with 0.15-1 M NaCl gradient. The fractions containing Ago protein were pooled, diluted with buffer B to 150 mM NaCl concentration, loaded on 5 ml HiTrap Capto S Sepharose HP column (GE-Healthcare, Chicago, Ill., USA) and eluted with 0.15-0.5 M NaCl gradient. The fractions containing Ago were pooled and loaded on 5 ml Bioscale CHTP column (Bio-Rad, Hercules, Calif., USA). The proteins were eluted with 0.02-0.3 M KPO4 gradient. The purified protein was concentrated by dialysis against 20 mM Tris-HCl (pH 7.5), 300 mM NaCl, 0.1 mM EDTA, 0.1 mM TCEP and 50% (vol/vol) glycerol and stored at −80° C. The purified Argonaute proteins were >95% homogeneous as determined by Coomassie-stained SDS-PAGE.
T7 Express E. coli (New England Biolabs, Inc., Ipswich, Mass., USA) carrying RecBexo- expression plasmid was grown at 37° C. until OD600=0.6. Protein expression was induced with 0.2 mM IPTG and the culture was grown overnight at 16° C. The cells were harvested by centrifugation, resuspended in 150 ml buffer A (20 mM potassium phosphate, pH 7.0, 300 mM NaCl, 20 mM imidazole, 5% glycerol) and disrupted by sonication. The clarified lysate was loaded on a HisTrap Nickel HP column (5 ml) and proteins were eluted using 20-250 mM imidazole gradient in buffer A. The fractions containing RecBexo- protein were diluted 2-fold with buffer B (20 mM potassium phosphate, pH 7.0, 0.1 mM EDTA, 5% glycerol) to bring NaCl concentration to 100 mM and loaded on HiTrap Heparin HP column (5 ml). The proteins were eluted using 0.1-0.8 M NaCl gradient in buffer B. The fractions containing RecBexo- protein were diluted 2-fold with buffer B and loaded on Bioscale CHTP column (5 ml). The proteins were eluted with 0.02-0.4 M KPO4 gradient in buffer B containing 250 mM NaCl. The purified protein was concentrated by dialysis against 10 mM Tris-HCl, pH 7.4, 300 mM NaCl, 0.1 mM EDTA, 0.1 mM DTT and 50% glycerol. The protein was >95% homogeneous as determined by Coomassie-stained SDS-PAGE.
T7 Express E. coli carrying RecC expression plasmid was grown and expressed as described above for RecBexo- protein. The cells were resuspended in 80 ml of 20 mM potassium phosphate buffer, pH 7.0 containing 450 mM NaCl, 20 mM imidazole and disrupted by sonication. The clarified lysate was loaded on a HisTrap Nickel HP column (5 ml) and proteins were eluted with 20-250 mM imidazole gradient in the buffer that contained 300 mM NaCl. The pooled RecC protein was diluted 3-fold with 20 mM potassium phosphate buffer, pH 7.0, 0.1 mM EDTA, 5% glycerol to bring NaCl concentration to 100 mM and was loaded on HiTrap Heparin HP column (5 ml). RecC protein did not bind to the Heparin column at 100 mM NaCl. The flow through fraction was further diluted to 75 mM NaCl concentration and loaded on HiTrap Q HP column (5 ml). RecC protein was eluted with 0.75-1.0 M NaCl gradient and concentrated by dialysis against 10 mM Tris-HCl, pH 7.4, 300 mM NaCl, 0.1 mM EDTA, 0.1 mM DTT and 50% glycerol. The protein was >90% homogeneous as determined by Coomassie-stained SDS-PAGE.
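The salt adjustments between chromatography steps follow simple dilution arithmetic. The Python sketch below illustrates it for the RecC preparation, in which a pool at 300 mM NaCl is diluted 3-fold to 100 mM and the flow-through is then brought to 75 mM. It assumes the diluent contains no NaCl; the function names and the 10 ml pool volume are hypothetical.

```python
def diluted_concentration(stock_mm: float, fold: float) -> float:
    """Concentration (mM) after an n-fold dilution with a salt-free diluent."""
    return stock_mm / fold


def fold_to_reach(stock_mm: float, target_mm: float) -> float:
    """Fold dilution needed to bring stock_mm down to target_mm."""
    if target_mm <= 0 or target_mm > stock_mm:
        raise ValueError("target must be positive and not exceed the stock")
    return stock_mm / target_mm


def diluent_volume(sample_ml: float, fold: float) -> float:
    """Volume of diluent (ml) to add to sample_ml for an n-fold dilution."""
    return sample_ml * (fold - 1)


print(diluted_concentration(300, 3))       # 100.0 mM after 3-fold dilution
print(round(fold_to_reach(100, 75), 2))    # 1.33-fold to reach 75 mM
print(diluent_volume(10, 3))               # 20 ml of diluent for a 10 ml pool
```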
The codon optimized genes encoding Argonaute proteins were ordered in pET29a expression vector from GenScript (Piscataway, N.J., USA). Analytical amounts of twenty Argonaute proteins were synthesized from pET29a plasmids using PURExpress In Vitro Protein Synthesis kit (New England Biolabs, Inc., Ipswich, Mass., USA). For large scale expression and purification of CbAgo, CpAgo, CdAgo, IbAgo, CsAgo and CaAgo, the respective genes were subcloned into pET28c expression vector in frame with the N-terminal 6×His tag. Argonaute protein expression and purification procedures are provided in EXAMPLE 1.
Exonuclease V encoding plasmid construct (New England Biolabs, Inc., Ipswich, Mass.) was used as a template to construct a nuclease-deficient mutant of RecB DNA helicase. A full-length RecBexo- mutant was created by introducing three mutations within the catalytic site of the nuclease domain. The recB coding sequence was amplified as three overlapping PCR fragments B1 (3080 bp), B2 (200 bp) and B3 (340 bp). To introduce the E1020A mutation, the GAG codon was replaced by a GCG codon in the overlapping primers used for amplification of fragments B1 and B2. Similarly, codon GAC was changed to GCC and codon AAA was changed to GCA in the overlapping primers for amplification of fragments B2 and B3, which resulted in the D1080A and K1082A mutations (see TABLE 1 for complete primer sequences). The three recB fragments were directly assembled into pET28c vector in frame with the N-terminal 6×His tag employing NEBuilder HiFi DNA Assembly Cloning Kit (New England Biolabs, Inc., Ipswich, Mass.). The fragment of recB gene encoding the N-terminal 1-929 amino acids was amplified by PCR and assembled into pET28c vector in frame with the N-terminal 6×His tag to create a RecΔB deletion mutant. Wild type RecC encoding gene was individually sub-cloned into pET28c vector in frame with the N-terminal 6×His tag. RecΔB, RecBexo- and RecC protein expression and purification is provided in EXAMPLE 2 and EXAMPLE 3.
All substrate oligonucleotides (DNA or RNA) and 5′-phosphorylated guides were purchased from Integrated DNA Technologies (Coralville, Iowa, USA). Nucleotide sequences for all substrates and guides can be found in TABLE 1.
To test activity of in vitro expressed pAgo proteins, 1 μl of PURExpress sample was mixed with 250 nM guide G-1 (21 nt) in 10 μl of buffer containing 20 mM Bis-Tris propane, pH 8.0, 50 mM NaCl, 2 mM MgCl2, 0.1% (v/v) Triton X-100 and incubated for 20 minutes at 37° C. to form a pAgo/guide complex. The pAgo/G-1 complex was then combined with 50 nM 5′-FAM labeled ss DNA target T-1 in a 20 μl reaction and incubated for 1 hour at either 37° C. or 65° C. The reactions were terminated by adding an equal volume of stop buffer (95% Formamide, 0.025% Bromophenol Blue, 0.025% Xylene Cyanol, 5 mM EDTA) and heating the samples for 5 minutes at 95° C. The cleavage products were separated by gel electrophoresis on 15% denaturing polyacrylamide gel containing 7.5M urea and 24% formamide and visualized using Typhoon 9400 Scanner (GE Healthcare, Chicago, Ill., USA).
Activity assays performed with purified Ago proteins were carried out with 17 nt long guides G-2 (DNA guide) and G-3 (RNA guide). Argonaute cleavage reactions were performed at 5:5:1 Ago:guide:target molar concentration ratio. For guide loading, 250 nM Ago was combined with 250 nM guide in 10 μl buffer containing 20 mM Bis-Tris propane, pH 8.0, 50 mM NaCl, 2 mM MgCl2, 0.1% (v/v) Triton X-100 and incubated for 20 minutes at 37° C. The Ago/guide mixture was added to a 20 μl cleavage reaction containing 50 nM 5′-FAM labeled ss substrate (either T-2 or T-3), and the reaction was incubated for 1 hour at 37° C. Reactions were terminated by adding 50 mM EDTA. Cleavage products at a 4 nM final concentration were separated by capillary electrophoresis (CE) on an Applied Biosystems 3730xl DNA Analyzer (Applied Biosystems, Waltham, Mass., USA). The quantitative analysis of fluorescent peaks was performed using PeakScanner Software v1.0 (Thermo Fisher Scientific, Inc., Waltham, Mass., USA) and fragment analysis software for in-house use at New England Biolabs as previously described (Hunt E A et al., PLoS One, 2018 Aug. 29; 13(8):e0203073; Greenough L et al., Nucl. Acids Res., 2016, 44 (2), e15).
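The three codon replacements described above can be checked against the standard genetic code. The following Python sketch is illustrative only: it uses a partial codon table containing just the six codons named in the text, and the variable and function names are hypothetical.

```python
# Partial standard genetic code: only the codons named in the text.
CODON_TO_AA = {
    "GAG": "E", "GAC": "D", "AAA": "K",   # wild-type codons
    "GCG": "A", "GCC": "A", "GCA": "A",   # replacement (alanine) codons
}

# (mutation name, wild-type codon, replacement codon) as described in the text
CHANGES = [
    ("E1020A", "GAG", "GCG"),
    ("D1080A", "GAC", "GCC"),
    ("K1082A", "AAA", "GCA"),
]


def codon_change_matches(name: str, old: str, new: str) -> bool:
    """True if old/new codons encode the residues named in the mutation."""
    wild_type, mutant = name[0], name[-1]
    return CODON_TO_AA[old] == wild_type and CODON_TO_AA[new] == mutant


def base_changes(old: str, new: str) -> int:
    """Number of nucleotide positions that differ between two codons."""
    return sum(a != b for a, b in zip(old, new))


for name, old, new in CHANGES:
    print(name, codon_change_matches(name, old, new), base_changes(old, new))
```

The E1020A and D1080A replacements each require a single base change, whereas K1082A requires two.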
To evaluate guide efficiency, CbAgo cleavage was examined over time by incubating a 53-nt long ss-phiX174 DNA substrate with CbAgo loaded with thirteen different 21 nt long guides for 1, 16 and 64 minutes at 37° C.
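For illustration, a set of thirteen 21-nt guides tiled along a 53-nt single-stranded target can be generated as sketched below, assuming (as described elsewhere in this disclosure) that adjacent guides are shifted by one nucleotide. The target shown is a placeholder and is not the φX174 sequence; the starting offset and the function names are likewise hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def tiled_guides(target: str, guide_len: int = 21, n_guides: int = 13,
                 first_start: int = 0) -> list[str]:
    """Guides (5'->3') complementary to successive 1-nt-shifted target windows."""
    guides = []
    for shift in range(n_guides):
        start = first_start + shift
        window = target[start:start + guide_len]
        if len(window) < guide_len:
            raise ValueError("guide window runs past the end of the target")
        # A guide pairs antiparallel with its target, hence the reverse complement.
        guides.append(reverse_complement(window))
    return guides


TARGET = ("GATTACA" * 8)[:53]  # placeholder 53-nt target, NOT the real substrate
for number, guide in enumerate(tiled_guides(TARGET, first_start=10), start=1):
    print(number, guide)
```

Thirteen 21-nt windows shifted by one nucleotide span 33 nt in total, so the full set fits within a 53-nt substrate for any starting offset from 0 to 20.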
A 3619-3858 nt segment of phage φX174 DNA was amplified by PCR to generate either 5′-FAM-labeled or 5′-FAM/ROX-labeled 239 bp DNA substrate. A 3393-4012 nt segment of phage φX174 DNA was amplified by PCR to generate either 5′-FAM-labeled or 5′-FAM/ROX-labeled 619 bp DNA substrate. A 3536-3858 nt segment of phage φX174 DNA was amplified by PCR with 5′FAM- and 5′ROX-labeled primers to generate a 5′-FAM/ROX-labeled 322 bp DNA substrate. The PCR products were purified using Monarch® PCR and DNA Cleanup kit (New England Biolabs, Inc., Ipswich, Mass.) and DNA concentration was quantified using a NanoDrop spectrophotometer (ThermoFisher Scientific, Inc., Waltham, Mass., USA).
For one-strand cleavage experiments, the CbAgo/guide complex was formed by incubating 0.5 μM CbAgo and 1 μM guide for 15 minutes at 37° C. in 10 μl of 1× CutSmart buffer (50 mM Potassium Acetate, pH 7.9, 20 mM Tris-Acetate, 10 mM Magnesium Acetate, 100 μg/ml BSA). The CbAgo/guide complex at a 0.25/0.5 μM final concentration was combined with the 50 nM ds DNA substrate and the DNA helicase indicated elsewhere in 20 μl of 1× CutSmart buffer (New England Biolabs, Inc., Ipswich, Mass., USA) supplemented with 5 mM ATP and incubated at 37° C.
For two-strand cleavage experiments, two separate 5 μl reactions, each containing CbAgo (0.5 μM) and one guide (1 μM), were carried out to form two individual CbAgo/guide complexes that target opposite strands on the ds substrate. Both CbAgo/guide complexes, each at a 0.125/0.25 μM final concentration, were combined with the 50 nM ds DNA substrate and the DNA helicase indicated elsewhere in 20 μl of 1× CutSmart buffer supplemented with 5 mM ATP and incubated at 37° C. The cleavage reactions were terminated by addition of 50 mM EDTA, and DNA samples were diluted to 4 nM final concentration and analyzed by CE as described in EXAMPLE 6. For time course experiments, the cleavage reaction volume was increased to 50 μl and 5 μl samples were withdrawn from the reaction at the indicated time points.
Cleavage reactions of φX174, pAd2_BsaBI and pAd2_AvrII DNAs were carried out using 0.2 μg of DNA, which were linearized with the indicated restriction enzymes following manufacturer's recommendations (New England Biolabs, Inc., Ipswich, Mass.). The plasmids pAd2_BsaBI and pAd2_AvrII were created by Dr. Richard Morgan (New England Biolabs, Inc., Ipswich, Mass.) by cloning either a 22404 bp BsaBI fragment or 19428 bp AvrII fragment of Adenovirus2 genomic DNA into pUC19 vector.
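The stated final concentrations follow from the volumes of pre-formed complex carried into the 20 μl reaction. The short Python sketch below reproduces them; the function name is hypothetical, and the calculation assumes that the entire loading reaction is transferred.

```python
def final_concentration(stock_um: float, added_ul: float, total_ul: float) -> float:
    """Concentration (uM) after adding added_ul of stock to a total_ul reaction."""
    return stock_um * added_ul / total_ul


# Two-strand cleavage: each 5 ul complex (0.5 uM CbAgo, 1 uM guide) into 20 ul.
print(final_concentration(0.5, 5, 20))   # 0.125 uM CbAgo per complex
print(final_concentration(1.0, 5, 20))   # 0.25 uM guide per complex

# One-strand cleavage: a 10 ul complex (0.5 uM CbAgo) into 20 ul.
print(final_concentration(0.5, 10, 20))  # 0.25 uM CbAgo
```

With two complexes present, the total CbAgo concentration in the two-strand reaction equals that of the one-strand reaction (0.25 μM).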
A schematic overview of an assay for monitoring helicase activity is shown in
DNA oligonucleotides carrying m6A residue within a GATC site were synthesized at Organic Synthesis facility of New England Biolabs (Ipswich, Mass., USA). Double-stranded fork substrate for helicase activity assay was created by annealing two complementary ss oligonucleotides (nucleotide sequences are shown in TABLE 1). DNA unwinding reactions were carried out for 15 minutes at 37° C. in 10 μl of 1× CutSmart buffer containing 0.1 μM fork substrate, 5 μM trap oligo, 5 mM ATP, 2 units DpnI and 1 μl of serially diluted helicase. Reactions were terminated by adding 50 mM EDTA. Cleavage products were analyzed by CE as described above.
A 332 bp 5′FAM/ROX-labeled DNA1 was generated by PCR using a 3681-4012 nt segment of φX174 DNA as a template. DNA1 was cleaved with CbAgo loaded with guides T2+B1 to generate a 5′-ROX labeled 15 bp cleavage product and a 5′-FAM labeled 309 bp cleavage product. A 300 bp 5′FAM/ROX-labeled DNA2 was generated by PCR using an 806-1105 nt segment of pUC19 DNA as a template. DNA2 was cleaved with CbAgo loaded with guides T5+B5 to generate a 5′-ROX labeled 14 bp cleavage product and a 5′-FAM labeled 278 bp cleavage product. All CbAgo cleavage products were flanked with 8 nt long 3′-ss extensions. The 15 bp and 14 bp fragments were discarded by column purification with Monarch® PCR and DNA Cleanup Kit (New England Biolabs, Inc., Ipswich, Mass.).
A 29 bp ds “Bridge” oligonucleotide flanked with 8-nt long 3′-ss extensions on both ends was created by combining two complementary 5′-phosphorylated ss oligonucleotides (1 nmol each) in 100 μl of 10 mM Tris-HCl buffer, pH 7.5 and heating for 5 min at 95° C. followed by slow cooling down to room temperature. DNA fragment ligation was carried out with 400 units of T4 DNA ligase (New England Biolabs, Inc., Ipswich, Mass., USA) in 10 μl of T4 DNA ligase buffer. First, 0.3 pmols of 309 bp DNA1 fragment and 0.3 pmols of 29 bp synthetic Bridge oligonucleotide were ligated for 15 minutes at 25° C. The reaction was then supplemented with 0.3 pmols of 278 bp DNA2 fragment and ligation continued for another 15 minutes at 25° C. Ligation products were analyzed by CE.
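The fragment sizes quoted above are mutually consistent if each “bp” value is taken to count only the duplex portion of a product and the 8-nt extension region is counted once. That reading is an assumption made for this illustration; the Python sketch below, with a hypothetical function name, checks the arithmetic under it.

```python
def long_fragment_bp(substrate_bp: int, short_bp: int, overhang_nt: int) -> int:
    """Duplex length of the long product, given the short product and overhang.

    Assumes substrate = short duplex + overhang region + long duplex.
    """
    return substrate_bp - short_bp - overhang_nt


print(long_fragment_bp(332, 15, 8))  # 309 (DNA1)
print(long_fragment_bp(300, 14, 8))  # 278 (DNA2)
```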
Candidate Argonaute proteins were identified in a series of steps. First, three known Argonaute proteins, TtAgo (UniProt ID Q746M7), NgAgo (UniProt ID L0AJX6), and PfAgo (UniProt ID Q8U3D2), were aligned using MUSCLE multiple sequence alignment software (Edgar R C, 2004, Nucleic Acids Res. 32(5):1792-1797). The resulting multiple sequence alignment was used as an input to PSI-BLAST to search against the UniProt database. The expected threshold was set at 1×10⁻⁴, and PSI-BLAST was run for multiple iterations until convergence. All bacterial homologs found were extracted based on taxonomic classification. Only proteins containing PAZ and PIWI domains were considered for further analysis. The presence of catalytic PIWI and PAZ domains in the homologs found was checked by running an HMM search using domain profiles available in the PFAM database (PFAM PF02171 and PF02170 for PIWI and PAZ domain, respectively). The PAZ domain profile in PFAM is built mainly on sequences of eukaryotic proteins and resulted in very few hits when run against bacterial proteins. Therefore, the HMM profile for the bacterial PAZ domain was generated from scratch using HMMER software (Eddy S R, 2011, PLOS Comp. Biol., 7:e1002195) based on sequences of known Argonaute proteins. Proteins originating from known thermophilic organisms were discarded. Additionally, only proteins that share less than 90% sequence identity to each other were selected for further analysis. Finally, proteins that do not contain aspartates at conserved PIWI catalytic sites were excluded. The remaining list of forty-five bacterial Argonautes is shown in TABLE 2.
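The post-search filtering steps may be pictured with the following Python sketch. It is a simplified illustration and not the procedure actually used: the record fields, the greedy redundancy filter, the identity measure over pre-aligned rows, and the reading of the aspartate criterion as “at least one aspartate at the catalytic positions” are all assumptions, and the example records are invented toy data.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    aligned_seq: str         # row of a multiple sequence alignment ('-' = gap)
    has_paz: bool            # e.g. from an HMM search with a PAZ profile
    has_piwi: bool           # e.g. from an HMM search with a PIWI profile
    thermophilic_host: bool
    catalytic_residues: str  # residues at the conserved PIWI catalytic columns


def identity(a: str, b: str) -> float:
    """Fraction of identical residues over columns ungapped in both rows."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    return sum(x == y for x, y in pairs) / len(pairs) if pairs else 0.0


def select(candidates: list[Candidate], max_identity: float = 0.90) -> list[Candidate]:
    """Apply the filters in the order given in the text."""
    stage = [c for c in candidates if c.has_paz and c.has_piwi]
    stage = [c for c in stage if not c.thermophilic_host]
    kept: list[Candidate] = []
    for c in stage:  # greedy: keep a protein only if <90% identical to all kept
        if all(identity(c.aligned_seq, k.aligned_seq) < max_identity for k in kept):
            kept.append(c)
    # Assumed reading: require at least one aspartate at the catalytic positions.
    return [c for c in kept if "D" in c.catalytic_residues]


toy = [  # invented records for illustration only
    Candidate("A", "MKDLEDAAAA", True, True, False, "DED"),
    Candidate("B", "MKDLEDAAAA", True, True, False, "DED"),   # duplicate of A
    Candidate("C", "MRTLQNGGGG", True, True, False, "AEA"),   # no aspartate
    Candidate("D", "MKTLEDCCCC", True, True, True, "DED"),    # thermophile
    Candidate("E", "MHDVEDWWWW", False, True, False, "DED"),  # no PAZ domain
    Candidate("F", "MSDIEDYYYY", True, True, False, "DDE"),
]
print([c.name for c in select(toy)])  # ['A', 'F']
```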
Butyrivibrio sp. INlla16
Clostridium butyricum
Clostridium disporicum
Clostridium perfringens WAL-14572
Clostridium sartagoforme
Dorea longicatena
Eubacteriaceae bacterium
Fusicatenibacter saccharivorans
Kurthia massiliensis
Methylomicrobium buryatense
Prochlorothrix hollandica
Pseudoalteromonas luteoviolacea
Pseudobutyrivibrio xylanivorans
Pseudomonas luteola
Deinococcus sp. RL
Exiguobacterium sp. AB2
Oscillatoria acuminata
Paenibacillus borealis
From the compiled list of 45 bacterial Agos, 20 candidates residing in mesophilic bacterial hosts were selected for screening of catalytically active proteins. The names of selected pAgos, bacterial hosts and NCBI GenBank Accession numbers are listed in TABLE 3. The quick examination of pAgo activity was performed employing proteins expressed with PURExpress® In Vitro Protein Synthesis kit (New England Biolabs, Inc., Ipswich, Mass.). SDS-PAGE analysis confirmed detectable levels of soluble protein for 19 pAgo candidates; the exception was CbAgo protein from C. butyricum, which was found in an insoluble fraction (data not shown). Ten out of twenty tested pAgo candidates revealed DNA-guided DNA cleavage activity at 37° C. and a total of 13 candidates were active at 65° C. (
Aromatoleum aromaticum EbN1
Butyrivibrio species INlla16
Clostridium butyricum
Clostridium disporicum
Clostridium perfringens
Clostridium sartagoforme
Clostridium saudiense
Dorea longicatena
Eubacteriaceae bacterium
Fischerella thermalis
Fusicatenibacter saccharivorans
Intestinibacter bartlettii
Kurthia massiliensis
Lyngbya species PCC 8106
Methylomicrobium buryatense
Microcystis aeruginosa
Prochlorothrix hollandica
Pseudomonas luteola
Pseudoalteromonas luteoviolacea
Pseudobutyrivibrio xylanivorans
Six pAgos were purified and characterized with the goal of finding an Argonaute exhibiting the highest cleavage activity at 37° C. Five selected pAgos originated from hosts assigned to the Clostridium genus. The host for IbAgo also originally was assigned as Clostridium bartlettii, but later the species was re-assigned as Intestinibacter bartlettii (Song Y. L. et al. 2004, Anaerobe, 10(3), 179-184; Gerritsen, J. et al., 2014, Int. J. Syst. Evol. Microbiol., 64(Pt5), 1600-1616). A previously described high-throughput capillary gel electrophoresis-based activity assay was used (Hunt E A et al., PLoS One, 2018 Aug. 29; 13(8):e0203073; Greenough L et al., Nucl. Acids Res., 2016, 44 (2), e15) to rapidly characterize the purified pAgos for guide preference (DNA vs RNA guide) and target preference (DNA vs RNA target). The obtained results are summarized in TABLE 4. All tested pAgos preferred DNA guides and DNA targets over RNA guides and RNA targets. CpAgo was the only Argonaute capable of DNA-guided cleavage of RNA and RNA-guided cleavage of DNA, albeit at significantly reduced rates when compared to the DNA-guided DNA-target cleavage. The majority of investigated pAgos were capable of cleaving DNA at temperatures spanning from 30° C. up to 75° C. The highest activity was observed at 55-65° C., except for IbAgo which was the most active at 45° C. (data not shown). This suggests that the investigated pAgos still prefer higher temperatures for optimal catalysis even though they are adapted to function in mesophilic hosts. The efficiency of DNA target cleavage at 37° C. was evaluated at different time points ranging from 5 to 120 minutes. The results presented in
Clostridium perfringens
Clostridium disporicum
Clostridium butyricum
Clostridium sartagoforme
Clostridium saudiense
Clostridium bartlettii
An array of 13 different guides of 21 nt in length were designed to hybridize with a 53-nt long ss DNA substrate at positions that were shifted by one nucleotide to the right with respect to each other (
A variety of proteins which have an ability either to intercalate into ds DNA or to stabilize ss DNA regions during various cell processes, such as DNA replication, DNA recombination and DNA repair, were tested in combination with CbAgo for ds DNA cleavage at 37° C. The tested proteins include E. coli RecA protein (New England Biolabs, Inc., Ipswich, Mass.) which catalyzes the pairing of ssDNA with complementary regions of dsDNA; Spy dCas9 protein (New England Biolabs, Inc., Ipswich, Mass.), an inactive mutant of Cas9 nuclease, which retains ability to intercalate duplex DNA for programmable DNA binding activity; and three single-stranded DNA binding proteins, T4 gp32, extreme thermostable single-stranded DNA binding protein ET SSB (T4 gp32 and ET SSB, New England Biolabs, Inc., Ipswich, Mass., USA), and E. coli SSB (ThermoFisher Scientific, Inc., Waltham, Mass., USA), which can bind to ss DNA regions thus preventing hybridization of complementary DNA strands. However, none of the above-listed accessory proteins was capable of increasing CbAgo cleavage activity on ds DNA targets at 37° C.
The possibility was explored that unwinding of ds DNA substrates by mesophilic helicases might help CbAgo to access single-stranded targets at 37° C. The commercially available mesophilic DNA helicases, EcoRecQ (Abcam, Inc., Cambridge, Mass., USA), EcoUvrD (MyBioSource, Inc., San Diego, Calif., USA), T4 gp41 (GoldBio Inc., St. Louis, Mo., USA) and T4 DNA Helicase (McLab, Inc., South San Francisco, Calif., USA) first were screened for their ability to unwind forked ds DNA. A schematic overview of the assay which was used to test these helicases for DNA unwinding activity is shown in
A capillary electrophoresis-based assay was employed to test if EcoRecQ and Slur07Dda DNA helicases can help CbAgo in cleaving ds DNA targets at 37° C. A schematic overview of the assay is shown in
In summary, the obtained results confirmed that DNA helicases have a potential to act as accessory proteins in helping mesophilic Argonautes to access the single-stranded targets within a duplex DNA at 37° C.
RecQ helicases are highly conserved among bacteria throughout evolution as they are important for the maintenance of genome stability. Using the known protein sequence of EcoRecQ as a reference, we have identified putative RecQ-like DNA helicases in Clostridium butyricum (WP_003411240.1) and Clostridium perfringens (WP_011590145.1) which also are the hosts for CbAgo and CpAgo, respectively. Detailed protein sequence comparison showed that CbuRecQ and CpeRecQ share strong similarity with EcoRecQ at the N-terminal protein region where the conserved helicase, RecQ-Ct and HRDC domains reside (Morozov V. et al., Trends Biochem. Sci., 1997, 22(11), 417-418). However, CbuRecQ and CpeRecQ proteins have an additional C-terminal domain of unknown function that is missing in EcoRecQ. The genes encoding EcoRecQ, CpeRecQ and CbuRecQ helicases were ordered in pET28c vector from GenScript (Piscataway, N.J., USA). The purified RecQ proteins were compared for DNA unwinding activity. The activity assay confirmed that both putative proteins, CpeRecQ and CbuRecQ, were functional DNA helicases. The comparison of three RecQ proteins indicated that EcoRecQ and CpeRecQ possess similar specific activity (
As seen from the results presented in
A series of experiments were carried out to test double-stranded cleavage activity of CbAgo in the presence of Slur07 Dda, EcoRecQ and CpeRecQ DNA helicases. In addition, the CbAgo cleavage was tested employing DNA helicase mixtures made up by combining either EcoRecQ or CpeRecQ with Slur07 Dda helicase. Cleavage efficiency was compared using two sets of four double-stranded substrates. In one set, the substrates carried a common 50-60 bp duplex region (
Among the set of substrates with 50-60 bp duplex portion (
CbAgo activity declined when duplex DNA portion was increased from 50-60 bp to 80 bp suggesting that RecQ and Dda helicases were not able to efficiently unwind 80 bp DNA duplex (
The crystal structure of the RecBCD enzyme reveals four conserved residues (Glu1020, Asp1067, Asp1080 and Lys1082) in the active site of the RecB nuclease domain (Singleton M R et al., Nature, 2004, 432, 187-193. doi: 10.1038/nature02988). A nuclease-deficient variant, referred to as RecBexo-, was constructed by replacing three catalytic residues of the nuclease domain, E1020, D1080 and K1082, each with an alanine residue. The purified RecBexo- and RecC subunits were mixed at 1:1 stoichiometry to a final concentration of 10 μM to reconstitute RecBexo-C helicase.
DNA unwinding activity of either RecBexo- or RecBexo-C DNA helicases was analyzed using the unwinding assay described in EXAMPLE 8 and shown in
A 322 bp long ds DNA fragment labeled with a 5′-FAM on one strand and a 5′-ROX on the opposite strand was amplified by PCR using a 3536-3858 nt locus of φX174 phage DNA as a template. The generated substrate in the middle encompassed a 53 nt sequence segment which was used to characterize the effect of guide sequence on the CbAgo cleavage of single-stranded DNA (
Additional guides, B1, B2, B3 and B4, were designed to target the opposite strand of 322 bp DNA fragment (
Because individual DNA strands are cleaved independently by two CbAgo/guide complexes, the two guides can be arranged to produce custom-designed cleavage products tailed with either 5′- or 3′-single-stranded extensions of varying length. The concurrent cleavage of opposite DNA strands was explored using a 322 bp ds DNA fragment labeled with a 5′-FAM on one strand and a 5′-ROX on the opposite strand. Two guide pairs were initially selected to test the efficiency of double-strand cleavage by CbAgo in the presence of RecBexo-C helicase. CbAgo loaded with guide pair T2+B1 was expected to produce cleavage products tailed with 5 nt-long 3′-ss extensions (
To further explore the observed bias, CbAgo cleavage of 322 bp 5′FAM/ROX-labeled DNA was compared in a series of experiments using a set of thirteen previously designed guides, which were programmed to cleave the 5′FAM-labeled DNA strand at positions shifted by one nucleotide along the substrate sequence. For cleavage of the 5′ROX-labeled strand, either guide B1 or guide B4 was used. In one series of experiments, the use of the 13 guides in combination with guide B1 allowed generation of CbAgo cleavage products tailed with 3′-single-stranded extensions varying from 1 nt up to 13 nt in length (
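The relationship between the two cut positions and the resulting single-stranded extension can be sketched as follows. This is a minimal illustration only, not part of the described method: it assumes both cut positions are expressed in the coordinates of one strand (arbitrarily called the "top" strand here, read 5′ to 3′), and the function name `overhang` and the coordinate value `BOTTOM_CUT` are hypothetical.

```python
def overhang(top_cut: int, bottom_cut: int):
    """Return (extension type, length in nt) for a staggered double-strand cut.

    Assumption: both cut positions are given in top-strand coordinates
    (top strand read 5'->3'); a cut at position p falls after nucleotide p.
    If the top strand is cut downstream of the bottom strand, the products
    carry 3'-single-stranded extensions; if upstream, 5'-extensions.
    """
    if top_cut == bottom_cut:
        return ("blunt", 0)
    if top_cut > bottom_cut:
        return ("3'", top_cut - bottom_cut)
    return ("5'", bottom_cut - top_cut)


# Hypothetical coordinate for the fixed cut made with the second guide.
BOTTOM_CUT = 100

# Thirteen guides whose cut sites are shifted by one nucleotide each.
top_cuts = range(BOTTOM_CUT + 1, BOTTOM_CUT + 14)
lengths = [overhang(t, BOTTOM_CUT)[1] for t in top_cuts]
print(lengths)  # extension lengths from 1 nt to 13 nt
```

Under this convention, shifting one cut site in single-nucleotide steps relative to a fixed cut on the opposite strand yields the 1 nt to 13 nt series of extensions described above.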
The linearized φX174 phage DNA was used to explore whether CbAgo can rapidly locate and cleave targets on substantially longer ds DNA during strand unwinding progression catalyzed by RecBexo-C helicase. The schematic overview of the guide pairs employed on the circular φX174 DNA is shown in
Cleavage of DNA longer than 5-6 kb at 37° C. was tested with a CbAgo/RecBexo-C combination. Two plasmids, pAd2_BsaBI (22.114 kb in length) and pAd2_AvrII (25.091 kb in length), were linearized with AscI and SrfI, respectively, and two guide pairs were designed to target each DNA at the midpoint (
A method allowing seamless assembly of CbAgo-cleaved DNA fragments was developed (
Assembly of 332 bp and 300 bp 5′-FAM/ROX labeled PCR fragments (referred to as DNA1 and DNA2, respectively) was evaluated. On DNA1, the CbAgo target site was selected close to the 5′-ROX labeled end, whereas on DNA2 the CbAgo target site was selected close to the 5′-FAM-labeled end. The guide pairs T1+B2 and T5+B5 complemented DNA1 and DNA2, respectively, and both guide pairs were arranged to create 8 nt-long 3′-ss extensions on CbAgo cleavage products. In this arrangement, the CbAgo cleavage of either DNA produced throwaway terminal fragments, a 5′-ROX labeled 15 bp fragment and a 5′-FAM labeled 14 bp fragment, which were eliminated by column purification (
RecB is organized into a 100-kDa N-terminal helicase domain and a 30-kDa C-terminal exonuclease/endonuclease domain. The C-terminal domain functions independently as an endo- and exonuclease and is responsible for all nuclease activities associated with RecBCD. A truncated RecB variant comprising the helicase domain (RecB1-929) has ATP-dependent DNA unwinding activity, but no longer exhibits nuclease activity. Based on the above information, a truncated RecB1-929 C helicase that lacks nuclease activity was tested for its ability to assist CbAgo in cleaving double-stranded targets at 37° C.
The RecB1-929 variant, referred to here as RecΔB, was created as described in EXAMPLE 5. The purified RecΔB subunit was mixed with the RecC subunit at 1:1 stoichiometry to a final concentration of 10 μM to reconstitute the RecΔBC helicase. RecΔBC was evaluated as a CbAgo partner in the cleavage of an 80 bp blunt-ended DNA substrate at 37° C. In the presence of RecΔBC DNA helicase, only 25% of the substrate was cleaved by the CbAgo/guide B12 complex in the first 15 minutes (
CbAgo cleavage of 239 bp and 619 bp ds DNA was evaluated either in the presence of RecΔBC alone or in the presence of two helicase mixtures, RecΔBC+CpeRecQ and RecΔBC+CbuRecQ. The 239 bp and 619 bp FAM-labeled DNA substrates were generated by PCR amplification using phage φX174 DNA as a template as described in EXAMPLE 7. For single-strand cleavage experiments, CbAgo was loaded with guide T2 during a 15-minute incubation at 37° C. The CbAgo cleavage reaction was assembled in 20 μl of 1× CutSmart buffer containing 50 nM DNA substrate, CbAgo/T2 at a final concentration of 0.25/0.25 μM, 0.25 μM RecΔBC and 5 mM ATP. When applied, either CpeRecQ or CbuRecQ was added to the reaction at a final concentration of 5.5 μM or 6.5 μM, respectively, and the reactions were incubated for 1 hour at 37° C. Single-strand cleavage results presented in
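For orientation, the stated final concentrations in the 20 μl reaction can be converted into absolute amounts and into molar excess over the DNA substrate. The sketch below is only an arithmetic illustration of the figures given above; the dictionary keys and the helper name `pmol_in_reaction` are hypothetical labels, and "RecDeltaBC" stands for RecΔBC.

```python
REACTION_UL = 20.0  # reaction volume in microliters, as stated above

# Final concentrations in nM, taken from the reaction description.
final_nm = {
    "DNA substrate": 50,
    "CbAgo": 250,
    "guide T2": 250,
    "RecDeltaBC": 250,
    "CpeRecQ": 5500,   # added only when applied
    "CbuRecQ": 6500,   # added only when applied
}


def pmol_in_reaction(conc_nm: float, volume_ul: float = REACTION_UL) -> float:
    """nM x ul = fmol; divide by 1000 to express the amount in pmol."""
    return conc_nm * volume_ul / 1000


amounts_pmol = {name: pmol_in_reaction(c) for name, c in final_nm.items()}
molar_excess = {name: c / final_nm["DNA substrate"] for name, c in final_nm.items()}

for name in final_nm:
    print(f"{name}: {amounts_pmol[name]:.1f} pmol, "
          f"{molar_excess[name]:.0f}x relative to substrate")
```

On these numbers, the CbAgo/T2 complex and RecΔBC are each present at a 5-fold molar excess over the DNA substrate, whereas the RecQ helicases, when added, are in more than 100-fold excess.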
In summary, the results indicated that RecΔBC DNA helicase did not support CbAgo programmable cleavage of ds DNA under the conditions tested. The unwinding activity of RecΔBC was increased only in the presence of a high concentration of CpeRecQ DNA helicase. For rapid and processive DNA unwinding activity, the RecB subunit must form a complex with the RecC subunit. Most likely, deletion of the 30-kDa C-terminal nuclease domain destabilized interactions between the RecΔB and RecC subunits, thus making the RecΔBC complex structurally unstable and causing the RecΔB helicase to act alone. Reconstitution of RecB1-929 with RecC and RecD subunits leads to processive unwinding of a 4.3 kb linearized ds DNA plasmid. Possibly, the RecD subunit can be used to form a stable RecΔBCD complex and thus to increase DNA unwinding activity to levels that allow efficient cleavage of long ds DNAs by CbAgo. Also, data presented in EXAMPLE 15 show that the full-length RecBexo-C variant carrying a catalytically inactive C-terminal nuclease domain exhibits very rapid and processive DNA unwinding activity that is sufficient for CbAgo cleavage of 22-25 kb long double-stranded DNAs. Potentially, a RecB variant with a partially deleted nuclease domain can be created that forms a stable complex with the RecC subunit but no longer exhibits nuclease activity.
CbAgo is capable of cleaving ss DNA over a wide range of temperatures, from 30° C. up to 75° C. However, the E. coli RecBCD enzyme was shown to unwind DNA duplexes in the 20-37° C. temperature range. To determine the optimal temperature range for the CbAgo/RecBexo-C programmable endonuclease, double-strand DNA cleavage reactions were carried out at 25, 30, 37, 42, 45 and 50° C. The 322 bp FAM/ROX-labeled DNA substrate was cleaved simultaneously on both strands with CbAgo loaded with guides T2 and B1 in the presence of RecBexo-C DNA helicase as described in EXAMPLE 7. The cleavage efficiency of each DNA strand was evaluated after incubation for 4, 8, 16 and 32 minutes at the indicated temperature. The results presented in
Modified 322 bp DNA substrates were generated by PCR as described in EXAMPLE 7. DNA amplification with dNTPs including 5-methyl-dCTP or 5-hydroxymethyl-dCTP allowed 5-methylcytosine-containing substrates (mC) or 5-hydroxymethylcytosine-containing substrates (hmC), respectively, to be generated. To generate a beta-glucosyl-5-hydroxymethylcytosine-containing substrate (ghmC), the hmC substrate was glucosylated using T4-BGT glucosyltransferase in the presence of UDP-Glucose. The target DNA sequence complementary to the B2 guide carried two cytosine residues that were subject to modification; however, the cytosines were positioned at a distance of either 3 nt or 6 nt from the CbAgo cleavage site (
CbAgo functions at a range of temperatures (e.g., 30-75° C.) which do not (alone) facilitate separation of ds DNA into single strands. In this example, ds DNA was subjected to physical denaturation (e.g., high temperature) in the presence of a chemical denaturant (e.g., formamide) to generate single-stranded targets for CbAgo cleavage. To generate denatured DNA substrate, 2.5 pmol/5 μl of either 322 bp or 619 bp DNA was combined with 2 μl 100% formamide in a 10 μl volume and incubated for 10 minutes at 85° C., then quickly placed in an ice/ethanol bath for 1 minute. CbAgo was pre-loaded with guides T2 and B1 at a 1:1 molar concentration ratio in two separate 15 μl reactions, each containing 1.5 μl 10× CutSmart buffer, 1.5 μl 5 μM guide and 1.5 μl 5 μM CbAgo. The two CbAgo/guide complexes, 12.5 μl each, were combined with 15 μl 1× CutSmart buffer, and the cleavage reaction was initiated at 37° C. by addition of 10 μl denatured DNA substrate (50 nM final concentration). The formamide concentration in the CbAgo cleavage reaction was maintained at 4% final concentration to slow down DNA strand reannealing at 37° C. 5 μl samples were removed from the reaction at the indicated time points and quenched with 50 mM EDTA. DNA samples were analyzed by capillary electrophoresis as described in EXAMPLE 6. Data presented in
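The volumes given for the formamide-denaturation protocol can be checked with a short dilution calculation. The sketch below only reproduces arithmetic implied by the stated volumes and stock concentrations; the helper name `dilute` and the variable names are hypothetical.

```python
def dilute(stock_conc: float, stock_vol_ul: float, total_vol_ul: float) -> float:
    """Concentration after bringing stock_vol_ul of stock to total_vol_ul."""
    return stock_conc * stock_vol_ul / total_vol_ul


# Guide loading: 1.5 ul of 5 uM guide and 1.5 ul of 5 uM CbAgo in 15 ul.
loading_nm = dilute(5000, 1.5, 15.0)        # nM of CbAgo (and of guide)

# Cleavage reaction: two 12.5 ul complexes + 15 ul buffer + 10 ul substrate.
total_ul = 2 * 12.5 + 15.0 + 10.0           # total reaction volume in ul

complex_nm = dilute(loading_nm, 12.5, total_ul)   # each CbAgo/guide complex
substrate_nm = 2.5 * 1000 / total_ul              # 2.5 pmol in total_ul -> nM

# Formamide: 2 ul of 100% in a 10 ul denaturation mix, then diluted further.
formamide_denaturation_pct = dilute(100.0, 2.0, 10.0)
formamide_final_pct = dilute(formamide_denaturation_pct, 10.0, total_ul)

print(f"total volume: {total_ul} ul")
print(f"each CbAgo/guide complex: {complex_nm} nM")
print(f"DNA substrate: {substrate_nm} nM")
print(f"formamide: {formamide_denaturation_pct}% -> {formamide_final_pct}%")
```

The calculation is consistent with the stated 50 nM substrate and 4% formamide in a 50 μl reaction, and it gives 125 nM for each CbAgo/guide complex.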
CbAgo cleavage efficiency on formamide-denatured, 616 bp ROX-labeled DNA was compared using increasing CbAgo/guide concentrations. CbAgo/T2 and CbAgo/B1 complexes at 125:125 nM, 250:250 nM or 500:500 nM Ago:guide concentration were each combined with 50 nM formamide-denatured, 616 bp ROX-labeled DNA, and cleavage was monitored over time. The percentage of cleaved DNA at the 2-minute time point was 1.5-fold higher for reactions with 500 nM CbAgo than for reactions with 125 nM CbAgo (