The present disclosure relates in general to methods and kits for improved nucleic acid sequencing.
The present invention is in the technical field of genomics. More particularly, the present invention is in the technical field of nucleic acid sequencing. Nucleic acid sequencing can provide information for a wide variety of biomedical applications, including diagnostics, prognostics, pharmacogenomics, and forensic biology. Sequencing may involve basic low throughput methods including Maxam-Gilbert sequencing (chemically modified nucleotide) and Sanger sequencing (chain-termination) methods, or high throughput next-generation methods including massively parallel pyrosequencing, sequencing by synthesis, sequencing by ligation, semiconductor sequencing, and others. For most sequencing methods, a sample, such as a nucleic acid target, needs to be processed prior to introduction into a sequencing instrument. For example, a sample may be fragmented, amplified or attached to an identifier. Unique identifiers are often used to identify the origin of a particular sample. Most sequencing methods generate relatively short sequencing reads, ranging from tens of bases to hundreds of bases in length, and cannot generate complete haplotype phase information due to limited sequencing read length.
The present invention provides methods and kits for tracking nucleic acid target origin by barcode tagging when the targets are broken into smaller pieces. A plurality of nucleic acid sequences which are used as barcodes may be clonally amplified or clonally synthesized on a solid support (e.g., bead, microparticle, slide, plate or flowcell). The design of barcode sequences in this invention allows the creation of billions of different barcodes, and each barcode sequence contains features for improving sequencing accuracy. Nucleic acid targets, with or without modification, are captured in vitro by these clonally localized nucleic acid barcode templates on the solid support. Transposase and transposable DNA are used to facilitate the fragmentation and barcode tagging of the nucleic acid targets. Hundreds, thousands or millions of nucleic acid targets can be processed simultaneously in a massively parallel fashion. Each of the targets can be locally captured by a unique group of barcodes in an open bulk reaction without additional partitioning, such as with wells, microwells, holes, tubes, spots, nanochannels, droplets, emulsion droplets, capsules, or any other suitable container for compartmentalizing fractions of a sample. These captured targets can be broken into smaller fragments, and a target-specific barcode sequence is tagged onto each fragment to identify its original target. These nucleic acid target tracking methods can be used for a variety of applications in both whole genome sequencing and targeted sequencing.
The methods and kits presented herein provide several advantages over existing methods, such as Illumina's synthetic long read and 10× Genomics's linked read. For example, this invention provides millions to billions or more of barcodes, which significantly improves the tagging capacity and specificity. The barcode design provides features that reduce sequencing errors arising from long stretches of the same type of nucleotide, i.e., homopolymer sequences, and that filter out low quality reads, thereby improving sequencing quality. Barcodes can be clonally synthesized directly or amplified clonally or semi-clonally using known chemistries (e.g., emulsion PCR, bridge PCR) on a solid surface. The transposase-based fragmentation method simplifies the sample preparation procedure. Unlike existing methods, the barcode tagging reaction in this invention can be performed in an open bulk solution without additional partitioning with wells, microwells, holes, tubes, spots, nanochannels, droplets, emulsion droplets, or capsules. The procedure is easy to automate or scale up for high throughput sample preparation. This invention provides a barcode tagging method not only for long nucleic acid samples, for applications such as haplotype phasing, structural variation detection and copy number study, but also for short nucleic acid samples, to track sample uniqueness.
Transposases in all the figures are illustrated as a tetramer in the transpososome based on the MuA transposition system.
Most commercially available sequencing technologies have limited sequencing read length. Second generation sequencing technologies in particular can sequence only several hundred bases and rarely reach a thousand bases. However, the nucleic acid sequence of a gene can span from several kilobases to tens or hundreds of kilobases, which means that a sequencing read length of tens of kilobases is necessary to successfully determine the haplotypes of all genes. This disclosure provides methods and kits for processing nucleic acid targets into smaller pieces while keeping their origin information with target-specific barcode tags. The processed DNA samples can be used to generate libraries for sequencing applications. The sequencing data can be assembled into full or tandem long reads for haplotype phasing. The methods and kits presented herein provide several advantages over existing methods, such as Illumina's synthetic long read and 10× Genomics's linked read. For example, this disclosure provides millions to billions or more of barcodes, which improves sequencing accuracy by improving the tagging capacity and specificity. Also, unlike existing methods, the barcode tagging reaction in this disclosure may be performed in an open bulk solution without further partitioning with wells, microwells, holes, tubes, spots, nanochannels, droplets, emulsion droplets, capsules, etc. The procedure is easy to automate or scale up for high throughput sample preparation.
Barcoding methods have been widely used in high throughput sequencing applications for sample identification. Barcode designs with completely random or degenerate nucleotide sequences are used for molecular tagging of individual nucleic acids and PCR amplicons. By “barcode” it is meant in general a label that can be associated with (e.g., attached to) a target and convey information (e.g., identity) about that target. By “random” or “degenerate” it is meant a nucleic acid sequence in which one or more positions contain a number of possible bases (e.g., any 2 or 3 or 4 out of A, T, G, C, U). Generally, a barcode can be any nucleic acid sequence between 4 and 100 bases in length, preferably 6 to 25 bases, and most commonly 6 to 8 bases. The methods and kits disclosed herein include an improved barcode design that not only offers maximum barcoding capacity, but also improves sequencing accuracy and provides identification for both molecules and samples (e.g., different samples from different patients) at the same time. This disclosure provides a barcode design which contains two or more random nucleotide segments interspersed with predetermined non-homopolymer nucleotide segments (called homopolymer breakers). Each random sequence segment may contain 3 to 9 degenerate bases, preferably 3 to 7 degenerate bases. The length of each random segment may be the same or different. Each homopolymer breaker may be 2 to 9 known bases in length. In one embodiment, the barcode has two random degenerate nucleotide sequences with one known homopolymer breaker in between. In another embodiment (
For certain sequencing technologies, such as Illumina's sequencing by synthesis (SBS) technology, if the base is the same for all the molecules at one particular sequencing flow step, this interferes with the signal processing pipeline and tends to lead to a higher error rate. In some cases, to avoid all barcode sequences having the same breaker segment, more than one barcode sequence design can be used together. The designs may have the same barcode structure but different breaker sequences, so that at a particular sequencing flow step at least two different nucleotide bases are present. In one embodiment in
A “barcode template”, which contains a barcode sequence, flanked by at least one handle sequence at one end or two handle sequences at both ends (
The barcode templates (
In some cases, a single stranded barcode template polynucleotide can be directly clonally synthesized on a solid support, such as, with reverse synthesis and split-and-pool method (Macosko et al., 2015) without clonal amplification.
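By way of non-limiting illustration, the barcode structure described above (random segments interspersed with homopolymer breakers) and its barcoding capacity can be sketched in Python as follows. The function names, the segment lengths of 5 bases and the breaker sequences "AC" and "GT" are hypothetical choices made only for this example; they fall within the ranges recited above but are not taken from any particular embodiment.

```python
import random

BASES = "ACGT"


def make_barcode(random_lengths, breakers, rng):
    """Build one barcode: random segments interleaved with fixed breakers.

    random_lengths: lengths of the random (degenerate) segments.
    breakers: known non-homopolymer segments, one between each pair of
              adjacent random segments (hypothetical sequences).
    rng: a random.Random instance, so results can be reproduced.
    """
    if len(breakers) != len(random_lengths) - 1:
        raise ValueError("need one breaker between each pair of random segments")
    parts = []
    for i, n in enumerate(random_lengths):
        parts.append("".join(rng.choice(BASES) for _ in range(n)))
        if i < len(breakers):
            parts.append(breakers[i])
    return "".join(parts)


def barcode_capacity(random_lengths):
    """Number of distinct barcodes when every random position allows 4 bases."""
    return 4 ** sum(random_lengths)


def is_non_homopolymer(segment):
    """A breaker must contain at least two different bases."""
    return len(set(segment)) > 1


def breaker_variants_are_diverse(variants):
    """Check that, across alternative designs, every breaker position shows
    at least two different bases (variants must have equal length)."""
    return all(len(set(column)) >= 2 for column in zip(*variants))


if __name__ == "__main__":
    rng = random.Random(0)
    print(make_barcode([5, 5, 5], ["AC", "GT"], rng))
    print(barcode_capacity([5, 5, 5]))
```

Under these assumptions, three random segments of 5 bases each give 4^15 (about 1.07 billion) possible barcodes, which is consistent with the capacity of millions to billions of barcodes discussed above.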
Capture Nucleic Acid-Transpososome Complexes with Clonally Barcoded Solid Support without Additional Partition for Barcode Tagging of the Nucleic Acid
The present disclosure provides methods and kits that capture nucleic acid targets, which are bound by transpososomes, to a clonally barcoded solid support. The captured nucleic acid target may then be fragmented and tagged with barcode sequences on the barcoded solid support.
The term “transposase” as used herein refers to an enzyme that is a component of a functional nucleic acid-protein complex capable of transposition and that mediates transposition. The term “transposase” also refers to integrases from retrotransposons or of retroviral origin. It refers to wild type enzymes, mutant enzymes, and fusion enzymes with a tag, such as a GST tag or a 6×His-tag.
The term “transposon”, as used herein, refers to a nucleic acid segment that is recognized by a transposase or an integrase enzyme and is an essential component of a functional nucleic acid-protein complex capable of transposition. It refers to both wild type and mutant transposon.
A “transposon end sequence” as used herein refers to the nucleotide sequences at the distal ends of a transposon. The transposon end sequences are responsible for identifying the transposon for transposition; they are the DNA sequences required to form a transpososome and to perform a transposition reaction.
A “transposable DNA” as used herein refers to a nucleic acid segment that contains at least one transposon unit.
The term “transpososome” as used herein refers to a transposase enzyme non-covalently bound to a double stranded nucleic acid (i.e., transposon).
A “transposition reaction” as used herein refers to a reaction where a transposon inserts into a target nucleic acid. Primary components in a transposition reaction are a transposon, a transposase or an integrase enzyme, and its target nucleic acid.
The term “transpososome-nucleic acid complex” or “nucleic acid-transpososome complex” as used herein refers to strand transfer complex (STC), a stable nucleic acid-protein complex of transpososome and its target nucleic acid into which transposons insert.
A “transposase binding region” as used herein refers to the nucleotide sequences that are always within the transposon end sequence where a transposase specifically binds when mediating transposition. The transposase binding region may comprise more than one site for binding transposase subunits.
A “transposon joining strand” as used herein means the strand of the double stranded transposon DNA that is joined by the transposase to the target nucleic acid at the insertion site.
A “transposon complementary strand” as used herein means the complementary strand of the transposon joining strand in the double stranded transposon DNA.
The method and materials of the disclosure are exemplified by employing in vitro MuA transposition (Haapa et al. 1999 and Savilahti et al. 1995). Other transposition systems can be used, e.g., Ty1 (Devine and Boeke, 1994), Tn7 (Craig, 1996), Tn10 and IS10 (Kleckner et al. 1996), Mariner transposase (Lampe et al., 1996), Tc1 (Vos et al., 1996, 10(6), 755-61), Tn5 (Park et al., 1992), P element (Kaufman and Rio, 1992) and Tn3 (Ichikawa and Ohtsubo, 1990), bacterial insertion sequences (Ohtsubo and Sekine, 1996), retroviruses (Varmus and Brown 1989), and retrotransposon of yeast (Boeke, 1989).
In the present disclosure, a transposable DNA may comprise only one transposon end sequence (
In some cases, the 3′ end of the transposon complementary strand may be shorter than the 5′ end of joining strand of the transposable DNA (
A method for fragmenting and barcoding nucleic acid samples is described as following (
Both Tn5 transpososome and MuA transpososome have been previously described to simultaneously fragment DNA and introduce adaptors at high frequency in vitro, creating sequencing libraries for next-generation DNA sequencing (Adey et al 2010, Caruccio et al 2011, and Kavanagh et al 2013). These specific protocols remove any phasing or contiguity information as a result of the fragmentation of the DNA. In these protocols, after the DNA reacts with transpososomes, a column purification, a heat treatment step, a protease treatment or an incubation with an SDS solution was necessary to release the transposase from the transpososome-DNA complex so that the DNA becomes fragmented. Before such a release step, however, the DNA strings bound with transpososomes, known as strand transfer complexes, are very stable under natural conditions (Surette et al 1987, Mizuuchi et al 1992, Savilahti et al 1995, Burton and Baker 2003, Au et al 2004, Amini et al 2014), and so is the DNA string with transpososome (603) in
The DNA strings with transpososomes are incubated with barcoded solid support (604) as described in
Many double stranded nucleic acid targets may react with transposable DNA and transposase simultaneously in various concentrations to generate many complexes. When many nucleic acid-transpososome complexes (703,
This disclosure provides a method to encapsulate nucleic acid targets bound with transpososomes and clonally barcoded beads or microparticles in water-in-oil emulsion droplets, and further generate barcode tagged nucleic acid fragments.
The DNA strings with transpososomes, i.e. the contiguous nucleic acid-transpososome complexes, which are generated as described previously in this disclosure (
It should be noted that partitions can be used in connection with these or other embodiments. The term “partition,” as used herein, may be a verb or a noun. When used as a verb (e.g., “to partition,” or “partitioning”), the term generally refers to the fractionation (e.g., subdivision) of a species or sample (e.g., a polynucleotide) between vessels that can be used to sequester one fraction (or subdivision) from another. Such vessels are referred to using the noun “partition.” Partitioning may be performed, for example, using microfluidics, dilution, dispensing, vortexing, filtering and the like. A partition may be, for example, a well, a microwell, a hole, a droplet (e.g., a droplet in an emulsion), a continuous phase of an emulsion, a test tube, a spot, a capsule, a bead, a surface of a bead in dilute solution, or any other suitable container for sequestering one fraction of a sample from another. A partition may also comprise another partition.
Encapsulating Nucleic Acid-Transpososome Complexes with Clonal Barcode Oligonucleotide Pools in Water-In-Oil Emulsion Droplets
This disclosure provides a method to encapsulate nucleic acid targets bound with transpososomes and clonal barcode oligonucleotide pools in water-in-oil emulsion droplets, and further generate barcode tagged nucleic acid fragments.
The DNA strings with transpososomes, i.e. the contiguous nucleic acid-transpososome complexes, which are generated as described previously in this disclosure (
Capture Nucleic Acid with Immobilized Clonally Barcoded Transpososomes for Barcode Tagging of Nucleic Acid without Additional Partition
This disclosure provides methods to capture nucleic acid targets with immobilized clonally barcoded transpososome complexes, fragment the captured nucleic acid and attach the barcode sequence to the fragments without additional partition.
The barcode template used for this application contains both barcode sequence and a transposase binding region. In one embodiment, the barcode template may have the structure as the
A method for clonal barcode tagging and fragmentation of a nucleic acid sample is described as follows. A solid support (1302) with double stranded barcode templates including a transposase binding region (1301) may be incubated with transposase (1303, 1403) and nucleic acid target (1305, 1405) simultaneously or separately. In one embodiment, transposase (1303) may be incubated with a barcoded solid support (1302), bind to the TBR of the barcode templates and form transpososomes on the solid support (1304, 1404). The nucleic acid target may be captured by the immobilized transpososomes (1306, 1406). After a heat treatment step, such as at about 65° C. to about 75° C. for approximately 5-10 minutes, a protease treatment or incubation with a protein denaturing agent, e.g. SDS solution, guanidine hydrochloride, urea, etc., the transposase will be released from the solid support and the fragmented nucleic acid target is exposed (1307, 1407). An additional reaction with a DNA polymerase may be performed to fill in the gaps generated during the transposition reaction.
In an open bulk reaction in which many different nucleic acid targets are present, a solid support or solid supports with many different clonal barcode templates, prepared according to the procedure previously described in this disclosure, are used to clonally capture each nucleic acid target. Limited dilution of the nucleic acid targets may be used. However, no additional partitions with wells, microwells, spots, nanochannels, droplets, emulsion droplets or capsules are necessary. The solid support (1402) may be separated as individual bead or microparticle (
Transposases can be pre-loaded on the barcoded solid support in the method depicted in
Capture Nucleic Acid with Immobilized Clonal Barcodes for Barcode Tagging of Nucleic Acid without Additional Partition
The method in the previous section uses transposition reaction for capturing nucleic acid targets to a clonally barcoded solid support without additional partition. Alternatively, nucleic acid targets can be captured to a clonally barcoded solid support via primer extension reaction with or without strand displacement. The distal end of immobilized barcode template may contain a string of degenerate nucleotides ranging from 6 bases to 20 bases, which can be used as a random primer and annealed to nucleic acid target for target capture. Further primer extension reaction using a DNA polymerase with or without strand displacement function will create a copy or copies of portions of targeted nucleic acid with barcode attached.
The barcode tagged fragments (706, 1407) are immobilized on the solid support. They may be released from the solid support in many ways. In one embodiment, a cleavable link or a rare restriction site may be included in the oligonucleotide sequence which is attached to the solid support. With a cleavage reaction or a restriction enzyme digestion, the barcode tagged fragments can be released from the solid support. In some cases, a primer extension may be performed to make a copy or copies of the barcode tagged fragments (
Assemble Barcode Sequencing Reads into Long Reads
This disclosure provides methods and kits to clonally barcode tag nucleic acid samples in an open bulk reaction without the sophisticated compartmentalization or partitioning schemes required by other methods. The barcode tagged fragments may be from a whole genome sample. The sequencing reads generated from these barcode tagged fragments may be used to assemble the whole genome as a haploid sequencing method.
The sequencing reads generated from these barcode tagged fragments contain the barcode information, which can be used to identify the target origin of these fragments. Short sequencing reads with the same barcode can be grouped together; each group comprises many short tandem reads spread along the original nucleic acid target. They provide useful long range linkage information for haplotype phasing. The longer the original nucleic acid targets are, the longer the tandem reads will be and the more useful they are for phasing applications. An analysis pipeline can be developed for full genome assembly or structural variation analysis using these barcode reads, for both de novo sequencing and resequencing. In one case, all the sequencing reads may be used for standard shotgun assembly analysis to establish many initial contigs first. The barcode information can then be used to phase the initial contigs into much longer contigs. One of the embodiments in this disclosure is to generate the barcoded solid support with clonal amplification. Even with the limited dilution method, more than one barcode template may be clonally amplified on the same bead or microparticle, or at close locations on the slide or flowcell. It is also possible that one barcode template may be clonally amplified on more than one solid support or solid support surface area, creating replicated barcode solid supports. However, because the barcode templates designed in this disclosure can generate millions to billions or more of different barcodes, the level of polyclonal and duplicated barcode solid supports generated in the process will not significantly interfere with the overall assembly of the barcode tagged reads.
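By way of non-limiting illustration, the grouping of short reads by barcode into tandem reads may be sketched as follows. The `Read` record, its field names and the helper functions are hypothetical and assume that each read has already been aligned and that its barcode has already been extracted; an actual analysis pipeline may differ.

```python
from collections import defaultdict
from typing import NamedTuple


class Read(NamedTuple):
    """Hypothetical minimal record for an aligned, barcode-tagged read."""
    barcode: str
    contig: str
    start: int
    length: int


def group_by_barcode(reads):
    """Group reads sharing a barcode and order each group along the reference.

    Reads with the same barcode are assumed to come from the same original
    nucleic acid target, so each sorted group is one string of tandem reads.
    """
    groups = defaultdict(list)
    for read in reads:
        groups[read.barcode].append(read)
    for barcode in groups:
        groups[barcode].sort(key=lambda r: (r.contig, r.start))
    return dict(groups)


def linked_span(group):
    """Return (contig, first_start, last_end) covered by one barcode group,
    or None if the group is empty or spreads over more than one contig."""
    if not group:
        return None
    contigs = {r.contig for r in group}
    if len(contigs) != 1:
        return None
    start = min(r.start for r in group)
    end = max(r.start + r.length for r in group)
    return (contigs.pop(), start, end)


if __name__ == "__main__":
    reads = [
        Read("BC1", "chr6", 5000, 150),
        Read("BC2", "chr6", 100, 150),
        Read("BC1", "chr6", 1000, 150),
    ]
    for barcode, group in group_by_barcode(reads).items():
        print(barcode, linked_span(group))
```

The span of each barcode group indicates the long range linkage available for phasing. A group that maps to more than one contig is returned as `None` here, since it may reflect a polyclonal barcode or a structural variation and would need separate handling.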
Targeted Sequencing with Barcode Tagged Fragments
This disclosure also provides methods to use these barcode tagged fragments for targeted sequencing application according to the following.
In one case, the region of interest, such as HLA genes or CYP2D6 gene, may be amplified as long range PCR products. These long range PCR products can be used as DNA targets directly with the barcode tagging methods described in this disclosure. The tandem long reads generated from the described method can phase back these long range PCR fragments accordingly.
In some cases, a whole genomic DNA sample may be barcode tagged using the methods described in this disclosure first. In one embodiment, these barcode tagged genomic DNA fragments may be released from the solid support as priming extension products or cleaved from the solid support biochemically (
In another embodiment, these barcode tagged genomics DNA fragments stay on the solid support (
These barcode tagging methods may be used for phasing the targeted gene, genes, or exome. These barcode tagging methods may also be used as a tool for differentiating the duplicated reads in the targeted sequencing application. This method improves sequencing assay detection limit on heterogeneous samples, e.g., somatic mutation detection in a cancer biopsy sample or circulating tumor cell/DNA.
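By way of non-limiting illustration, one possible way to differentiate duplicated reads using the barcode is sketched below. It assumes, as a simplification not stated in this disclosure, that two reads are duplicates only when they share the same barcode, contig, start position and strand; the tuple layout and the function name are hypothetical.

```python
def collapse_duplicates(reads):
    """Keep the first read for each (barcode, contig, start, strand) key.

    reads: iterable of tuples (barcode, contig, start, strand, sequence).
    Reads at the same position but with different barcodes are kept,
    because they are assumed to derive from different original molecules.
    """
    seen = set()
    unique = []
    for read in reads:
        key = read[:4]
        if key not in seen:
            seen.add(key)
            unique.append(read)
    return unique


if __name__ == "__main__":
    reads = [
        ("AAA", "chr1", 100, "+", "ACGTACGT"),
        ("AAA", "chr1", 100, "+", "ACGTACGA"),  # same molecule, amplified copy
        ("CCC", "chr1", 100, "+", "ACGTACGT"),  # same position, other molecule
    ]
    print(len(collapse_duplicates(reads)))
```

Without the barcode, the third read in the example would be indistinguishable from an amplification duplicate of the first; keeping it preserves independent evidence from a second molecule, which is relevant for detecting low frequency variants in heterogeneous samples.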
An embodiment of the present disclosure is a barcode template that comprises a barcode sequence and two handle sequences flanking the barcode sequence. The barcode sequence comprises one or more segments of random nucleotide sequence with one or more segments of known nucleotide sequence. In some embodiments, each handle sequence is between about 10 nucleotides and about 100 nucleotides in length. In other embodiments, the handle sequences comprise sequences for priming and/or hybridization. Further, the handle sequences may comprise transposon end sequences. In some instances, the barcode sequence is between about 6 nucleotides and about 100 nucleotides in length. The known sequence in the barcode sequence is between about 2 nucleotides and about 50 nucleotides in length. The known sequence in the barcode sequence may be used as a quality filter to remove error-prone sequencing reads.
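By way of non-limiting illustration, the use of the known sequence as a quality filter may be sketched as follows. The sketch assumes a barcode layout of random segments separated by known segments at fixed offsets, and it rejects any barcode read whose length or known segments do not match exactly; the function names, segment lengths and known sequences are hypothetical, and a practical filter might tolerate some mismatches.

```python
def breaker_offsets(random_lengths, breakers):
    """Return (offset, breaker) pairs giving where each known segment
    is expected to start within the barcode."""
    offsets = []
    pos = 0
    # zip stops after the last breaker; the final random segment has none.
    for n, breaker in zip(random_lengths, breakers):
        pos += n
        offsets.append((pos, breaker))
        pos += len(breaker)
    return offsets


def passes_breaker_filter(barcode_read, random_lengths, breakers):
    """True if the read has the expected length and every known segment
    appears, unchanged, at its expected position."""
    expected_len = sum(random_lengths) + sum(len(b) for b in breakers)
    if len(barcode_read) != expected_len:
        return False
    return all(
        barcode_read[off:off + len(breaker)] == breaker
        for off, breaker in breaker_offsets(random_lengths, breakers)
    )


if __name__ == "__main__":
    layout = ([5, 5, 5], ["AC", "GT"])
    print(passes_breaker_filter("ACGTAACTTGCAGTGGCAT", *layout))
    print(passes_breaker_filter("ACGTAATTTGCAGTGGCAT", *layout))
```

In this sketch a substitution inside a known segment, or an insertion or deletion that changes the barcode length, causes the read to be discarded, on the assumption that an error in the known segment indicates a higher risk of errors in the adjacent random segments.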
Another embodiment of the present disclosure is a method of clonally barcode tagging nucleic acid targets comprising: providing a solid support having clonal barcode templates immobilized thereon; providing a transposable DNA, wherein said transposable DNA has the 5′ end of its transposon joining strand ligatable to the 3′ end of said immobilized barcode template; applying nucleic acid targets to said transposable DNA and transposase to form DNA-transpososome strings in solution; hybridizing the DNA-transpososome strings with said solid support having barcode templates, wherein said 5′ ends of the joining strand of the transposable DNA ligate to the barcode templates, without any additional compartmentalization; and applying a heat treatment, a protease or a protein denaturing agent, e.g. SDS solution, guanidine hydrochloride, urea, etc., to release said transposase from said transpososomes. In some embodiments, the transposable DNA has one transposon end sequence from wildtype or mutant Tn5 or MuA transposon DNA, and said transposase is one of wildtype or mutant Tn5 or MuA transposase. The 5′ end of the transposon joining strand of said transposable DNA has a phosphate suitable for ligation. The 3′ end of the transposon complementary strand of said transposable DNA has a protruding end, and the protruding end comprises nucleotide sequences complementary to said barcode template on the solid support, so that said transposable DNA can hybridize to said barcode template and the 3′ end of the barcode template is ligatable with the 5′ end of the transposon joining strand directly or after modification with an enzyme. The length of said protruding end is about 1 base, about 3 bases, about 5 bases, about 10 bases, about 15 bases, about 20 bases, about 25 bases, about 30 bases or as long as the length of the immobilized oligonucleotide on the solid support.
The number of said nucleic acid molecules is at least about 10², 10³, 10⁴, 10⁵, or 10⁶, wherein said DNA-transpososome strings are diluted in the reaction solution before hybridizing to said solid support. The hybridization reaction may be performed with further compartmentalization in plates, microwells or nanochannels.
Another embodiment of the present disclosure is a method of clonally barcode tagging nucleic acid targets comprising: providing a solid support having clonal barcode templates immobilized thereon, wherein the end of said barcode templates distal to the solid support has a transposon binding region, and wherein said barcode templates on the solid support are double stranded at the transposable DNA end; applying transposase and nucleic acid targets to said solid support with immobilized barcode templates to form DNA-transpososome strings on the surface of the solid support without any additional compartmentalization; and applying a heat treatment, a protease or a protein denaturing agent to release said transposase from the transpososomes. The transposon binding region is from wildtype or mutant Tn5 or MuA transposon DNA, and said transposase is one of wildtype or mutant Tn5 or MuA transposase. The number of the nucleic acid targets is at least about 10², 10³, 10⁴, 10⁵, or 10⁶. The nucleic acid targets are diluted in the reaction solution before reaction with said immobilized barcode templates and transposase.
Another embodiment of the present disclosure is a method of generating a library of barcode tagged DNA fragments comprising: providing said clonally barcode tagged nucleic acid targets on a solid support; after heat, protease or protein denaturing agent treatment, treating the immobilized barcode tagged fragments with a DNA polymerase to fill in the gaps created in the transposition reaction; and releasing the barcode tagged DNA fragments with a primer extension reaction. In some embodiments, the primer has a nucleotide sequence that is the same as a portion of or the whole transposon joining strand sequence in said transpososome. The released barcode tagged fragments are a sequencing ready library when sequencing library adapter sequences are included in said primer sequence and said barcode template sequence. Alternatively, the released barcode tagged fragments are further amplified with primers containing library adapter sequences to generate a sequencing ready library. The library contains a sample specific index introduced in said primer extension reaction or said amplification reaction; therefore, libraries from different samples can be pooled together for sequencing. The sequencing reads of said barcode tagged nucleic acid fragments are grouped into strings of tandem reads from the same nucleic acid targets, which are suitable for haplotype phasing. A cleavage reaction to release the immobilized barcode templates from the solid support is another embodiment.
Another embodiment of the present disclosure is a method of generating a library of targeted gene, genes or exome with barcode tagged nucleic acid fragments comprising providing said released barcode tagged nucleic acid fragments; performing primer extension reaction with first set of primers for targeted gene, genes or exome; and performing amplification reaction with a common primer containing a portion of said barcode template sequence and a second set of primers for target gene, genes, or exome; wherein said second set of primers are nested in the product of said first set of primers. The adapter sequence for sequencing library is added during the amplification step.
Another embodiment of the present disclosure is a method of generating a library of targeted gene, genes or exome with barcode tagged nucleic acid fragments comprising providing said released barcode tagged nucleic acid fragments; performing amplification with a common primer containing a portion of said barcode template sequence and first set of primers for targeted gene, genes or exome; performing amplification with a common primer containing a portion of said barcode template sequence and a second set of primers for target gene, genes, or exome; and the second set of primers are nested in the product of said first set of primers. The adapter sequence for sequencing library is added during the amplification step.
Another embodiment of the present disclosure is a method of generating a library of targeted gene, genes or exome with barcode tagged nucleic acid fragments comprising providing said clonally barcode tagged nucleic acid targets on a solid support; performing primer extension reaction with first set of primers for targeted gene, genes or exome; performing amplification with a common primer containing a portion of said barcode template sequence and a second set of primers for target gene, genes, or exome; and the second set of primers are nested in the product of said first set of primers. The adapter sequence for sequencing library is added during the amplification step.
Another embodiment of the present disclosure is a method of generating a library of targeted gene, genes or exome with barcode tagged nucleic acid fragments comprising providing said clonally barcode tagged nucleic acid targets on a solid support; performing amplification with a common primer containing a portion of said barcode template sequence and first set of primers for targeted gene, genes or exome; performing amplification with a common primer containing a portion of said barcode template sequence and a second set of primers for target gene, genes, or exome; and the second set of primers are nested in the product of the first set of primers. The adapter sequence for sequencing library is added during said amplification step. In some embodiments, the library contains a sample specific index introduced in the primer extension reaction or said amplification reaction; therefore, libraries from different samples can be pooled together for sequencing.
Another embodiment of the present disclosure is a method of clonally barcode tagging nucleic acid targets comprising providing beads or microparticles having clonal barcode templates immobilized thereupon; providing a transposable DNA; applying nucleic acid targets to said transposable DNA and transposase to form DNA-transpososome strings in solution; encapsulating said DNA-transpososome strings, the beads or microparticles having barcode templates, and aqueous reaction reagents into water-in-oil emulsion droplets; applying a heat treatment to release said transposase from said transpososomes to break said nucleic acid targets into fragments in the emulsion droplets; and driving the nucleic acid fragments onto said barcode templates on said beads or microparticles.
Another embodiment of the present disclosure is a method of clonally barcode tagging nucleic acid targets comprising providing a transposable DNA; applying nucleic acid targets to said transposable DNA and transposase to form nucleic acid-transpososome complexes in solution; encapsulating said nucleic acid-transpososome complexes and aqueous reaction reagents into water-in-oil emulsion droplets as target droplets; providing clonal barcode templates in water-in-oil droplets as barcode droplets; merging the target droplets with the barcode droplets one by one; applying a heat treatment to said merged droplets to release said transposase from said transpososomes to break said DNA targets into fragments inside the emulsion droplets; and attaching said barcode to said DNA fragments in the droplets.
Although the invention has been explained with respect to an embodiment, it is to be understood that many other possible modifications and variations can be made without departing from the spirit and scope of the invention as herein described.
Further, in general with regard to the processes, systems, methods, etc. described herein, it should be understood that, although the steps of such processes, etc. have been described as occurring according to a certain ordered sequence, such processes could be practiced with the described steps performed in an order other than the order described herein. It further should be understood that certain steps could be performed simultaneously, that other steps could be added, or that certain steps described herein could be omitted. In other words, the descriptions of processes herein are provided for the purpose of illustrating certain embodiments, and should in no way be construed so as to limit the claimed invention.
Moreover, it is to be understood that the above description is intended to be illustrative and not restrictive. Many embodiments and applications other than the examples provided would be apparent to those of skill in the art upon reading the above description. The scope of the invention should be determined, not with reference to the above description, but should instead be determined with reference to the appended claims, along with the full scope of equivalents to which such claims are entitled. It is anticipated and intended that future developments will occur in the arts discussed herein, and that the disclosed systems and methods will be incorporated into such future embodiments. In sum, it should be understood that the invention is capable of modification and variation and is limited only by the following claims.
Lastly, all defined terms used in the application are intended to be given their broadest reasonable constructions consistent with the definitions provided herein. All undefined terms used in the claims are intended to be given their broadest reasonable constructions consistent with their ordinary meanings as understood by those skilled in the art unless an explicit indication to the contrary is made herein. In particular, use of the singular articles such as “a,” “the,” “said,” etc. should be read to recite one or more of the indicated elements unless a claim recites an explicit limitation to the contrary.
This example describes a specific barcode design based on the concept described in
P7 oligonucleotide (5′-CAAGCAGAAGACGGCATACGAGAT-3′) was synthesized with an amine group at the 5′ end and a six-carbon linker (C6) between the amine and the other nucleotides (Integrated DNA Technologies, Coralville, Iowa). This oligonucleotide was conjugated to Dynabeads® M-270 Carboxylic Acid beads per manufacturer's protocol. Barcode templates 301, 302 and 303 were synthesized separately and pooled in equal molarity. They were clonally amplified to beads conjugated with P7 oligonucleotides according to the BEAMing protocol (Diehl et al, 2005) using P7 as forward primer and P5 (5′-AATGATACGGCGACCACCGAGATCTACAC-3′) as reverse primer. Clonally amplified beads were collected. Barcode templates on the beads were further amplified off the beads using P5 and P7 primers and sequenced on a MiniSeq instrument to evaluate the system performance.
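For readers analyzing such amplicons, the following minimal Python sketch shows one way to recover the sequence lying between the two primer sites. It uses the P7 and P5 sequences given above. The read orientation (P7 at the 5′ end, reverse complement of P5 at the 3′ end), the toy insert and the function names are assumptions made for illustration, and the internal layout of barcode templates 301, 302 and 303 is not modeled.

```python
P7 = "CAAGCAGAAGACGGCATACGAGAT"
P5 = "AATGATACGGCGACCACCGAGATCTACAC"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def extract_insert(amplicon: str):
    """Return the sequence between the P7 site and the P5 site, or None.

    Assumption: the amplicon is read in the P7-forward orientation, so it
    starts with P7 and ends with the reverse complement of P5.
    """
    p5_rc = reverse_complement(P5)
    if not amplicon.startswith(P7):
        return None
    end = amplicon.rfind(p5_rc)
    if end < len(P7):  # also covers the not-found case (-1)
        return None
    return amplicon[len(P7):end]


# Hypothetical toy insert standing in for a barcode template body.
toy_insert = "ACGTTGCAACGT"
amplicon = P7 + toy_insert + reverse_complement(P5)
recovered = extract_insert(amplicon)
```

An exact-match search is used here for simplicity; a practical pipeline would typically tolerate sequencing errors in the primer sites.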
This example describes specifically designed MuA transposable DNAs and their transposition functionality with a C-terminal His-tagged MuA transposase. One of the MuA transposable DNA designs (
In some cases, the linker oligonucleotides (2203) may not be annealed to the transposable DNA during the transposition reaction. They can be used in the capture reaction only when the transposable DNA ends need to attach to barcode templates.
This example describes a method of barcode tagging of DNA with barcoded beads in an open bulk reaction without additional partition. Modified barcode templates 301, 302 and 303 using a different universal sequence for handle 1 were pooled in equal molarity, and clonally amplified to beads conjugated with P7 oligonucleotides according to the BEAMing protocol (Diehl et al, 2005). Beads with single stranded DNA were collected directly after the BEAMing reaction as barcoded beads. 1 ng of E. coli genomic DNA was tagmented by incubating with 0.05 µM MuA transposable DNA as design B in
This example demonstrates the contiguity of barcode tagged DNA sequencing reads. A barcode tagged E. coli DNA library described in Example 3 was sequenced on a NextSeq 500 instrument with 73-cycle Read 1 sequencing for genomic DNA insert and 18-cycle Index 1 sequencing for barcode sequences (
This application is a continuation patent application of U.S. application Ser. No. 16/077,295, filed Aug. 10, 2018, which is a 371 U.S. National Stage of PCT International Application No. PCT/US2017/020297, filed Mar. 1, 2017, which claims the benefit of and priority to U.S. Provisional Application No. 62/301,967, filed Mar. 1, 2016, the contents of each of which are incorporated herein by reference in their entireties.
Number | Date | Country |
---|---|---|
62301967 | Mar 2016 | US |

 | Number | Date | Country |
---|---|---|---|
Parent | 16077295 | Aug 2018 | US |
Child | 18149397 | | US |