This invention relates generally to methods for sequencing DNA, and in particular, to methods for using sequence by ligation to sequence DNA immobilized on miniaturized, high density bead-based arrays.
Profiling mRNA populations is an effective way to investigate cellular or tissue responses and classify into cellular types. Modalities currently available for assessing genome-wide gene expression have proven to be insensitive for the comprehensive assessment of important but rare mRNA molecules such as transcription factors. Serial analysis of gene expression (SAGE) utilizes Sanger dideoxy chemistry to sequence concatenated, PCR amplified cDNA tags. There are several limitations to SAGE, such as the high cost per tag, and skewing of mRNA counts attributable to PCR amplification of tag products prior to cloning and sequencing.
Polymerase colony (polony) bead DNA sequencing is an inexpensive, accurate, rapid approach to sequencing DNA. Polony multiplex analysis of gene expression (PMAGE) using polony bead DNA sequencing permits accurate quantitative assessment of mRNA expression. In general, the method relies on a biochemical procedure, sequencing by degenerate fluorescent ligation, to tag each bead with a fluorophore encoding the identity of a base within the template. As with SAGE tags, PMAGE tags can be quantified, subjected to rigorous statistical analysis, and assigned their cognate gene identities by standard database algorithms. This approach has the further advantage of yielding orders of magnitude more data than SAGE at a fraction of the cost.
However, it may be challenging under certain circumstances to achieve accurate quantitative assessments of mRNA abundance because of difficulties in attaining high yields of data, attaining a high degree of accuracy in the sequencing chemistry, and decreasing systematic bias of tag data.
Polony bead DNA sequencing may be conventionally performed on an array of beads (200-1000 nm in diameter, coated with oligonucleotides) embedded in an acrylamide matrix. While this serves the essential purpose of immobilizing the beads, it may also introduce significant complications. The gel may interfere with access to the bead-bound DNA templates by sequencing reagents due to limited diffusion. The susceptibility of acrylamide gel to attack by alkali or to dehydration may limit the reagents that can be used during sequencing cycles (e.g., alcohols, alkaline denaturants, etc. cannot be used). During the course of a sequencing run, fluorescent reagents and contaminants may stick to the gel, causing loss of reads. The acrylamide layer is not absolutely flat, and the beads within the gel are not uniformly in the same focal plane; hence, focusing on the beads during sequencing may be problematic, resulting in a loss of yield and sequencing fidelity. In addition, fluorescence background may accumulate on DNA-bearing beads as the sequencing run progresses. The accumulation is the result of covalent addition of fluorescent species to free 3′ hydroxyl ends of templates and un-extended bead-bound amplification primers. Further, increasing concentrations of beads may lead to clumping, which can confound data acquisition due to clumps of beads landing on different focal planes.
Current protocols in Polony DNA sequencing and other methods of sequencing DNA by ligation include a ‘capping’ step where the array is incubated in the presence of terminal deoxytransferase and dideoxynucleotides. While the expectation is that the enzyme will add a dideoxynucleotide to the 3′ end of each DNA strand, some ends may not be capped and may still be free to participate in polymerization and ligation reactions, which over several sequencing cycles can result in the development of significant background signal.
When performing sequencing by fluorescent nonamer ligation for expression profiling, large deviations from expected tag counts can occur because of sequence-specific systematic biases in the efficiency of the ligation reaction. These biases, while reproducible from run to run, can cause the frequency of a significant number of tags to be under-represented during analysis.
The present invention is based in part on the discovery of novel materials and methods for increasing bead density on the array, improving signal, reducing background, enhancing the efficiency of ligation, and improving the accuracy of reported tag counts when performing DNA sequencing.
Certain aspects of the present invention are directed to bead-based arrays. In accordance with certain exemplary embodiments, an array is provided having a plurality of beads and a solid support. Each bead of the plurality has oligonucleotides immobilized on the surface of the bead. Each bead is connected to the solid support by at least one molecule on the bead which can bind to a corresponding molecule on a surface of the solid support.
Other aspects of the present invention are directed to methods for making bead-based arrays. In accordance with certain exemplary embodiments, a method for producing an array includes providing a plurality of beads and a solid support, and connecting the plurality of beads to the solid support by binding at least one molecule on a bead to a corresponding molecule on a surface of the solid support. Each bead of the plurality has oligonucleotides immobilized on the surface of the bead.
Other aspects of the present invention are directed to methods for using bead-based arrays. In accordance with certain exemplary embodiments, a method is provided for DNA sequencing. A plurality of beads is provided, wherein each bead has immobilized on the surface thereof single-stranded template DNA having a 3′ terminus. A first oligonucleotide having a 5′ overhanging sequence is annealed to the 3′ terminus of the template DNA. Then, a second oligonucleotide having a 3′ blocking moiety is annealed to the 5′ overhanging sequence, and the second oligonucleotide is ligated to the template DNA. An array is formed by connecting the plurality of beads to a solid support. Finally, DNA sequencing is performed on the array.
The patent or application file contains at least one drawing executed in color. Copies of this patent or patent application publication with color drawing(s) will be provided by the Office upon request and payment of the necessary fee. The foregoing and other features and advantages of the present invention will be more fully understood from the following detailed description of illustrative embodiments taken in conjunction with the accompanying drawings.
Embodiments of the present invention include aspects of methods and techniques of immobilizing polymerase colonies (polonies) onto miniaturized, high density bead-based arrays for DNA sequencing described in, for example PCT/US05/06425 and U.S. patent application Ser. No. 11/505,073, incorporated herein by reference in their entirety for all purposes.
Embodiments of the present invention also include aspects of methods and techniques for utilizing a sequence by ligation approach on multiplexed polonies to achieve the sequencing of millions of oligonucleotide cDNA tags per experiment, for example Shendure et al. (2005) Science, Vol. 309. No. 5741, pp. 1728-1732, incorporated herein by reference in its entirety for all purposes. This technique is called PMAGE (Polony Multiplex Analysis of Gene Expression).
Embodiments of the present invention are particularly directed to bead-based arrays where beads are directly coupled to a solid support. According to one embodiment, the beads are immobilized to a substrate surface via a tether molecule, including molecules having linker moieties, that are covalently attached to both the substrate surface and the bead. According to the present invention the bead can include a molecule which can bind to a corresponding molecule on the surface of the solid support thereby connecting the bead to the solid support. According to one embodiment, the molecule on the bead binds non-covalently to the corresponding molecule on the surface of the solid support. Examples of non-covalent binding molecules include streptavidin and biotin. The beads are arranged in a uniform layer of preferably one bead thickness. These bead-based arrays can then be used in and/or with the methods described in the above DNA sequencing methods, i.e. PCT/US05/06425, U.S. patent application Ser. No. 11/505,073, Shendure et al. (2005) Science, Vol. 309. No. 5741, pp. 1728-1732, the entire disclosures of which are incorporated herein by reference and for all purposes. The support can be glass, metal, ceramic, or plastic solid phase and the like. The support can be either flat or contoured.
In certain embodiments, polony beads are bound directly to a glass support by linkage of the 3′ ends of the oligonucleotides immobilized on the beads to activated —NH2 groups on the siliconized glass support. Linkages can include amino ester linkages, asymmetric linkages, e.g., N-(κ-maleimidoundecanoyloxy)sulfosuccinimidyl (KMUS), N-(ε-maleimidocaproyloxy)succinimidyl (EMCS), N-hydroxysuccinimidyl (for RNH2 functional groups), and iodoacetamidyl (for RSH functional groups), and cleavable or reversible linkages such as dithiol or nitrobenzyl, and the like. In certain embodiments, a glass support is provided, such as, for example, a glass slide or a glass coverslip, which is coated with a hydrophilic polymer. The hydrophilic polymer acts as a tether molecule between the bead and the support. The hydrophilic polymer, or tether molecule, can be polyethylene glycol, poly(N-vinyl lactams), polysaccharides, polyacrylates, polyacrylamides, polyalkylene oxides, and copolymers of any of them. Each polymer is covalently attached at one end to the glass support, and bears a linker functional group at the other end. Beads having oligonucleotides with a 3′-NH2 group as described herein are bound to the glass substrate by covalently attaching the 3′-NH2 group to the linker functional group. The linker can include amino esters, bis(sulfosuccinimidyl) suberate (BS3), N-hydroxysuccinimidyl (NHS), N-(κ-maleimidoundecanoyloxy)sulfosuccinimidyl (KMUS), N-(ε-maleimidocaproyloxy)succinimidyl (EMCS), iodoacetamide, dithiol, nitrobenzyl, and mixtures of any of them.
In accordance with another aspect of the invention, methods for producing an array are provided. Polony beads, each having a polony of oligonucleotides immobilized on the surface of the bead, are directly coupled or connected to the surface of a solid support by binding at least one molecule on a bead to a corresponding molecule on a surface of the solid support. In certain embodiments, this can be done by reacting a tether molecule as described herein with the surface of the solid support and with an oligonucleotide immobilized on a bead to covalently attach the bead to the glass support. In other embodiments, a glass support is provided having tether molecules already covalently attached to the surface of the glass support. The tether molecules can be polyethylene glycol, poly(N-vinyl lactams), polysaccharides, polyacrylates, polyacrylamides, polyalkylene oxides, and copolymers of any of them. The tether molecules each bear a linker functional group which can be reacted with an oligonucleotide immobilized on a bead to covalently attach the bead to the glass support. The linker functional group can include amino esters, bis(sulfosuccinimidyl)suberate (BS3), N-hydroxysuccinimidyl (NHS), N-(κ-maleimidoundecanoyloxy)sulfosuccinimidyl (KMUS), N-(ε-maleimidocaproyloxy)succinimidyl (EMCS), iodoacetamide, dithiol, nitrobenzyl, and mixtures of any of them.
In certain embodiments, a method of sequence-specific enrichment for amplified beads from a mixed population of empty and amplified beads is provided by ligation of a capture probe and coupling of the capture probe to the glass support to which the beads are to be attached.
In accordance with another aspect of the invention, methods to improve signal and reduce background in polony ligation-mediated DNA sequencing are provided. Certain embodiments are directed to methods for capping the 3′ termini of single-stranded template DNA that are bound to beads to block further participation in ligation reactions. According to embodiments of the present invention, a first oligonucleotide having a 5′ overhanging sequence is annealed to the 3′ terminus of the single-stranded template DNA. A second oligonucleotide complementary to the 5′ overhanging sequence and containing a 3′ blocking moiety is annealed to the first oligonucleotide. The second oligonucleotide is ligated to the template DNA. The 3′ blocking moiety can include an amino-modifier, a dideoxycytidine, a non-ribose, a covalent blocking group, a steric blocking group, or a reversible blocking group and the like. The 3′ amino modifier can be used to attach the DNA-coated beads to glass by NHS ester chemistry. The oligonucleotides may be composed of degenerate bases, such as 5′-/5Phos/NNNNNNNNN/3AmM/-3′.
Other embodiments are directed to reducing background by removing free forward primer from the beads after coupling to the array in a sequence-specific manner. According to certain embodiments, one method includes the steps of annealing a protecting oligonucleotide of complementary sequence to the 3′ end of the DNA strands which should not be removed, incubating the DNA on the beads with Exonuclease I under conditions suitable for 3′ to 5′ exonucleolysis, and inactivating and removing the Exonuclease I.
Another aspect of the invention is directed to enhancing the efficiency of the ligation reaction, either for sequencing by ligation or for capping the 3′ termini of single-stranded template DNA, by addition of polyethylene glycol to the ligation reaction.
In another aspect of the invention, methods to improve the accuracy of reported tag counts when performing DNA sequencing by fluorescent nonamer ligation are provided. In certain embodiments, the ligation protocol comprises the step of incrementally increasing the ligation reaction temperature from about 20° C. to about 40° C. In another embodiment, macromolecular crowding agents are included in the ligation step. The crowding agents can include polyethylene glycol. In another embodiment, chemical additives which can decrease the Tm (melting temperature) difference between A/T and G/C basepairs are included in the ligation step. The chemical additives can include betaine. In another embodiment, nucleotide analogs which can decrease the Tm (melting temperature) difference between A/T and G/C basepairs are incorporated into either the query nonamers during synthesis or the template DNA during amplification. The nucleotide analogs can include 2-aminopurine, 2,6-diaminopurine, bromo-deoxyuridine, deoxyinosine, 5-nitroindole, and locked nucleic acids.
In particular, certain embodiments provide polony beads bound directly to a glass support by linkage of the 3′ ends of the oligonucleotides immobilized on the beads to activated —NH2 groups on the siliconized glass support. The advantages of creating a bead array immobilized directly on a glass substrate are manifold. For instance, the present invention provides for the ability to make a highly dense array in a monolayer. The maximum number of beads that can be successfully arrayed directly on glass is approximately 60,000,000 (65,000 beads per frame, 930 frames per array), which provides as many as 30× more DNA sequences to be obtained from a single slide. The present invention provides for improved chemistry and increased signal to noise ratio due to direct accessibility of reagents to the beads. The present invention also provides for a wider range and greater volume of potential reagents that can be used, including reagents which are incompatible with an acrylamide matrix. The bead array can be imaged more easily because it is in the form of a monolayer and is relatively flat, and the autofocusing procedure will be more automatable.
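The approximate capacity stated above follows from multiplying the per-frame bead count by the number of frames. A minimal sketch of that arithmetic is given here for illustration only; the function name is hypothetical, and every frame is assumed to be fully populated. The 30× comparison is taken from the text and is not derived here.

```python
def beads_per_array(beads_per_frame: int, frames_per_array: int) -> int:
    """Total beads on one array, assuming every frame is fully populated."""
    return beads_per_frame * frames_per_array


# Values quoted in the text: 65,000 beads per frame, 930 frames per array.
total = beads_per_array(65_000, 930)
print(total)  # 60450000, i.e. approximately 60 million beads
```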
Certain embodiments are directed to ligation methods for capping the 3′ termini of single-stranded template DNA that are bound to beads to block further participation in ligation reactions. Advantages of “Capping-by-ligation” include not only improved signal in polony ligation-mediated DNA sequencing by effectively reducing background, but also the ability to attach 3′ amino modifiers with which to attach the DNA-coated beads to glass by NHS ester chemistry. Preferred embodiments include the addition of polyethylene glycol in the ligation reaction, which significantly increases the efficiency of all ligation reactions where one of the oligonucleotides is bound to a solid support.
This invention is further illustrated by the following examples, which should not be construed as limiting. The contents of all references, patents and published patent applications cited throughout this application are hereby incorporated by reference in their entirety for all purposes.
Total RNA was prepared using Trizol reagent (GibcoBRL). Each library described here was constructed from a total of 10 micrograms of total RNA. We pooled RNA from the tissues of 45 male mice per library (wildtype and αMHC403/+ in SvEv background) to minimize inter-animal variability in RNA expression.
Prepare Dynabeads (mRNA Direct Kit, Dynal), washing solutions, 1st strand mix (keep on ice), before thawing RNA.
Prepare fresh solutions per library. Glycogen (Roche) or BSA (NEB) is used in the following solutions to reduce clumping of Dynabeads. 2×BW buffer (10 mM Tris-HCl pH 7.5, 1 mM EDTA, 2.0 M NaCl). The following volumes represent the requirements for 1 library.
Prepare 37° C. and 16° C. water baths for the immediately subsequent steps.
Wash 2×0.5 ml of washing buffer with Buffer A.
Resuspend beads in 1st strand synthesis mix:
Place tube at 37° C. for 2 min, then add 3 μl of Superscript RT. Incubate at 37° C. for 1 hr, mix beads every 10 min. by gentle flicking or slow vortexing. After incubation, place tube on ice to terminate the reaction.
On ice add the components of the 2nd strand synthesis, in order shown, to the first strand reaction:
E. coli DNA ligase
E. coli DNA pol I
E. coli RNAse H
Incubate at 16° C. for 2 hrs, mix beads every 10 min.
After incubation, terminate the reaction by adding 45 μl of 0.5 M EDTA.
Draw off the supernatant, then add pre-heated 0.5 ml of Buffer C, mix well, and heat at 75° C. for 12 min. (with intermittent mixing) to inactivate E. coli DNA polymerase. Remove supernatant and quickly wash again with 0.5 ml of Buffer C.
Then wash 4× Buffer D, 2×200 μl of 1× Buffer4+BSA (transfer to new tubes after first wash).
Take 5 μl of the last wash for checking the integrity of cDNA by RT PCR.
Resuspend beads in following mix and incubate at 37° C. for 1 hr:
After incubation, wash beads with 2×500 μl of Buffer C (preheated to 37° C., wash quickly before SDS precipitates), then wash 4×200 μl Buffer D (beads can be stored overnight at this stage).
Ligating Adapter A to cDNA
This ligation of linkers A and B is performed in molar excess of linkers to minimize the formation of library to library dimers (from 2.25 picomoles in conventional SAGE to 100 pmoles in this reaction).
Release of cDNA-Tags Using Tagging Enzyme AcuI
The PCR reaction should be stopped at the minimum number of cycles needed to amplify the complex library template, before primers are exhausted. Once primers are exhausted, due to the great similarity of the templates, library molecules are more likely to anneal with another single-stranded library molecule than with their exact complementary partner.
The library can be confirmed by TA cloning and Sanger sequencing the PCR product. It is imperative that any confirmatory PCR reaction be performed with dedicated reagents and pipettors, and that these PCR products not come in contact with library or emulsion PCR preparation areas. A few molecules of amplified and concentrated library PCR product are theoretically sufficient to contaminate an entire library.
Performed as previously described in Shendure et al. (2005) Science, Vol. 309. No. 5741, pp. 1728-1732.
SENSE CONSTRUCT: 91 bases, ten base cDNA sequence tag
EmulSAGEA1 and A2 are annealed to form the Forward adapter, and EmulSAGEB1 and B2 are annealed to form the Reverse Adapter.
Unique library-specific sequences are 5′ proximal to the emulsion “Free Forward” primer, therefore not introducing a PCR based bias between libraries:
Beads loaded with DualBiotForProxl:
Use of an acrylamide matrix for polony array construction interferes with access to bead-bound DNA templates by sequencing reagents due to limiting diffusion. Susceptibility of acrylamide to attack by alkali or dehydration also excludes use of certain reagents (e.g. alcohols, alkaline denaturants, and others) during sequencing cycles. Additionally, beads within the matrix are not uniformly located in a single focal plane, resulting in diminished performance of microscopy based data acquisition with lower yield and signal coherence. Immobilizing polony beads directly to glass resolved these limitations. To increase the number of beads per glass array, to align beads on a monolayer, and to increase the permeability of chemistry reagents, an approach is provided to cross-link amino groups on oligonucleotide-coated beads to aminosilylated glass cover-slips. Glass cover slips are aminosilylated. Then oligonucleotides (both loaded forward primers and amplified templates) on polony beads are capped with primary amines. Reactive amines on oligonucleotide-coated beads and on glass coverslips are bridged with bivalent amino-ester crosslinkers. Capping also serves the dual purpose of decreasing background fluorescence in ligation-based sequencing.
Amplified polony beads
Acetone (water-free)
3AmM capping oligo:
3AmM 9N capping oligo:
Bridging oligo for bead loading oligo:
Bridging oligo for amplified template:
1. Thoroughly wash and dry cover slips.
2. Prepare 2% solution of Aminosilane
1. BS3 is very moisture sensitive. Store at 4° C. in a desiccator. To avoid condensation on the product, allow it to equilibrate to room temperature for several minutes before using.
2. Make fresh solution of 5 mM BS3 in PBS (2.86 mg/mL), using the minimum volume that can be accurately weighed.
3. Resuspend beads in 10 μl of 5 mM BS3 quickly (this is time-sensitive) and drop onto coverslip. Place an inverted Teflon-coated slide on top to facilitate beads settling down to the coverslip surface. The time from suspension of beads in BS3 to laying on the slide should be no more than 12 seconds. Incubate for 45 min. at room temp. If removing from a slide holder, let the coverslip/slide settle for 10 min. before inverting and removing the coverslip.
4. The shearing force from removing the slide from the coverslip can disrupt the beads from the coverslip. Carefully separate the coverslip from the slide. Place into the quench solution (20 mM Tris pH 7.5). Gently rinse away extra beads in the quench solution.
5. Rinse the coverslip in water; it is now ready to assemble into the flow cell.
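The 2.86 mg/mL figure in step 2 above can be checked by converting molarity to mass concentration. The following sketch is illustrative only. It assumes a molecular weight of 572.43 g/mol for BS3 (a commonly listed value for the disodium salt, which should be confirmed against the supplier's datasheet), and the function names are hypothetical. It also shows how to compute the PBS volume for whatever mass was actually weighed, consistent with using the minimum volume that can be accurately weighed.

```python
BS3_MW_G_PER_MOL = 572.43  # assumed value; confirm against the supplier's datasheet


def mg_per_ml(conc_mm: float, mw_g_per_mol: float = BS3_MW_G_PER_MOL) -> float:
    """Convert a millimolar concentration to mg/mL.

    mmol/L * g/mol = mg/L, and dividing by 1000 gives mg/mL.
    """
    return conc_mm * mw_g_per_mol / 1000.0


def pbs_volume_ml(weighed_mg: float, conc_mm: float = 5.0,
                  mw_g_per_mol: float = BS3_MW_G_PER_MOL) -> float:
    """Volume of PBS (mL) needed to bring a weighed mass to the target molarity."""
    return weighed_mg / mg_per_ml(conc_mm, mw_g_per_mol)


print(round(mg_per_ml(5.0), 2))  # 2.86
```

Computing the volume from the weighed mass, rather than trying to weigh an exact target mass, avoids prolonged handling of a moisture-sensitive reagent.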
This protocol decreases the formation of secondary bead to bead interactions that can cause loss of yield. The linker is not free in solution and so cannot link beads to each other before they attach to glass. This makes the protocol more robust to variations in handling and easier to practice.
amplified polony beads
Codelink-treated coverslip. The coverslip is coated with polyethylene glycol. One end of the polyethylene glycol molecule is covalently attached to the coverslip. The other end of the polyethylene glycol molecule bears a linker functional group such as N-hydroxysuccinimide.
Sequencing was performed as previously described (1) with the following modifications:
Tag sequences which form stable hairpin loops with the template sequence can be under-represented in datasets. This phenomenon is likely attributable to inaccessibility of sequencing oligos. To inhibit hairpin formation, blocking primers are added in the annealing step, such that the blocking primer anneals to the opposite end of the library template from the anchoring primer. Anchor primers were hybridized in the flowcell with the addition of the corresponding blocking primers in equimolar quantities.
Query primers were used as previously described, but 6-FAM was used in the place of FRET for its superior signal in the plus positions and minus 5-6 position. For example, to query the plus 5 position, the following degenerate nonamer mix was used:
Ligation with query primers was preceded by priming the flowcell with PEG-containing Quick ligase buffer (NEB). Query primers were ligated in the flowcell (8 μM query primer mix (2 μM each subpool), 6000 U T4 DNA ligase (NEB), 1× Quick ligase buffer (NEB)). Including PEG for macromolecular crowding increases the kinetics of the ligation reaction, thus increasing signal.
We used multiple ligation temperatures, implemented as a stepped gradient, to improve annealing of all degenerate query nonamers, given their broadly disparate melting temperatures. By incrementally increasing the ligation reaction temperature from 18° C. to 37° C. (close to the maximum permissible temperature for T4 DNA ligase), a greater subset of degenerate sequences will hybridize to the template and thus fluorescently tag the bead by ligation. Our protocol is for ligation at 18° C. for 5 min, then at 25° C. for 5 min, then at 30° C. for 5 min, then at 37° C. for 5 min to more thoroughly cover the Tm space of all degenerate query nonamers. At the end of the reaction, excess query primer was washed out at room temperature with Wash 1 E for 5 min.
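The stepped ligation described above can be written down as a simple schedule. The sketch below is one possible representation, not the instrument control code used in the example. It reads each hold time as 5 minutes, and the names and the 37° C. ceiling parameter are illustrative assumptions. The function checks that the temperature rises from step to step and returns the total ligation time.

```python
from typing import List, Tuple

# (temperature in degrees C, hold time in minutes), taken from the protocol text
STEPPED_LIGATION: List[Tuple[float, float]] = [(18, 5), (25, 5), (30, 5), (37, 5)]


def total_ligation_minutes(steps: List[Tuple[float, float]],
                           max_temp_c: float = 37.0) -> float:
    """Validate a stepped schedule and return its total duration in minutes."""
    temps = [temp for temp, _ in steps]
    # Each step must be warmer than the one before it.
    if any(later <= earlier for earlier, later in zip(temps, temps[1:])):
        raise ValueError("temperatures must increase from step to step")
    # Assumed ceiling for the ligase; adjust for the enzyme actually used.
    if max(temps) > max_temp_c:
        raise ValueError("a step exceeds the assumed ligase temperature ceiling")
    return sum(minutes for _, minutes in steps)


print(total_ligation_minutes(STEPPED_LIGATION))  # 20
```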
After completing all sequencing cycles, the array is hybridized with library-specific fluorescent oligonucleotides (See end of Note S 1 for primer sequences). 4 μl of each 100 mM primer is mixed with 192 μl of 6×SSPE. Primer hybridization is performed as previously described.
Stripping was performed as previously described in Shendure et al. (2005) Science, Vol. 309. No. 5741, pp. 1728-1732.
As an additional measure to reduce background, free forward primer can be removed from beads after they have been coupled to the array in a sequence-specific manner using Exonuclease I to selectively degrade single-stranded DNA strands.
Steps for removing forward primer are as follows:
Begin by hybridizing a ‘protecting’ oligonucleotide complementary in sequence to the 3′ end of the strands which should not be degraded. Then incubate with Exonuclease I under conditions suitable for 3′->5′ exonucleolysis of all strands to which the protecting oligonucleotide has not annealed. Since Exonuclease I can only initiate digestion from a single-stranded 3′ end, it will not degrade those strands annealed to a ‘protecting’ oligonucleotide. Following Exonuclease I treatment, inactivate and remove the enzyme by incubating in 6M guanidinium HCl for 1 minute at RT followed by several rinses in dH2O and Wash 1.
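The selection logic of the Exonuclease I step can be illustrated with a toy model. The sketch below treats a strand as protected only if the protecting oligonucleotide is a perfect reverse complement of the strand's 3′ end. This is a deliberate simplification that ignores partial annealing and reaction kinetics, and the sequences and function names are hypothetical, not those of the example.

```python
from typing import List

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def is_protected(strand: str, protecting_oligo: str) -> bool:
    """True if the protecting oligo anneals flush to the strand's 3' end.

    Simplification: only a perfect match covering the 3' terminus counts.
    """
    return strand.upper().endswith(reverse_complement(protecting_oligo))


def surviving_strands(strands: List[str], protecting_oligo: str) -> List[str]:
    """Strands expected to survive digestion from a single-stranded 3' end."""
    return [s for s in strands if is_protected(s, protecting_oligo)]


# Hypothetical sequences: an amplified template and an un-extended primer.
template = "ACGTACGTTTGACCA"
free_primer = "ACGTACGT"
print(surviving_strands([template, free_primer], "TGGTCA"))  # ['ACGTACGTTTGACCA']
```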
The accuracy of reported tag counts is improved by modifying the ligation protocol to address underlying reaction inefficiencies.
“Stepped temperature” ligation reactions are performed to more thoroughly cover the Tm space of all degenerate query nonamers used, which ranges from 16° C. (AAAAAAAAA) to 50° C. (GGGGGGGGG) according to the equation:
By incrementally increasing the ligation reaction temperature from 20° C. to 40° C. (the maximum permissible temperature for T4 DNA ligase), a greater subset of degenerate sequence will hybridize to the template and thus fluorescently-tag the bead by ligation. In addition, macromolecular crowding agents, including polyethylene glycol, can be used to increase the kinetics of the reaction, resulting in more signal per bead. Chemical additives, including betaine, can be used to decrease the Tm differential between A/T and G/C basepairs. Nucleotide analogs, including 2-aminopurine, 2,6-diaminopurine, bromo-deoxyuridine, deoxyinosine, 5-nitroindole, and locked nucleic acids, can be incorporated into either the query nonamers during synthesis or the template during amplification to modulate the Tm differential.
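To give a rough sense of how the degenerate nonamer pool is distributed between the all-A/T and all-G/C extremes, the following sketch counts how many of the 4^9 possible nonamers contain each number of G/C bases. It does not reproduce the Tm equation referenced above; G/C count is used only as a coarse proxy for melting temperature, and the function name is hypothetical.

```python
from math import comb
from typing import Dict


def nonamer_counts_by_gc(length: int = 9) -> Dict[int, int]:
    """Number of fully degenerate oligos of the given length with exactly k G/C bases.

    Choose k positions for G/C (comb(length, k) ways); each of those positions has
    2 choices (G or C) and each remaining position has 2 choices (A or T).
    """
    return {k: comb(length, k) * 2 ** length for k in range(length + 1)}


counts = nonamer_counts_by_gc()
print(sum(counts.values()))  # 262144, i.e. 4**9
for k in sorted(counts):
    print(k, counts[k])
```

Most sequences fall in the middle of the G/C range and only 512 sit at each extreme, which is consistent with using several intermediate temperature steps rather than only the two end points.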
In light of the foregoing disclosure of the invention and description of various embodiments, those skilled in this area of technology will readily understand that various modifications and adaptations can be made without departing from the scope and spirit of the invention. All such modifications and adaptations are intended to be covered by the following claims.
This application is a continuation of PCT Application No. PCT/US07/79936 designating the United States and filed Sep. 28, 2007; which claims the benefit of the filing date of U.S. Provisional Patent Application No. 60/847,752 filed on Sep. 28, 2006; each of which is hereby incorporated herein by reference in its entirety for all purposes.
This invention was made with U.S. Government support under grant number HG003170 awarded by the National Institutes of Health. The Government has certain rights in the invention.
Number | Date | Country |
---|---|---|
60847752 | Sep 2006 | US |
Relation | Number | Date | Country |
---|---|---|---|
Parent | PCT/US2007/079936 | Sep 2007 | US |
Child | 12412471 | | US |