The present invention relates generally to flow cytometry and, more particularly, to a method for generating address/capture tags for use in multiplexed flow-cytometry based assays.
Single nucleotide polymorphisms (SNPs) are the most frequent form of sequence variation among individuals (Cooper et al., 1985; Cooper and Krawczak, 1990). These sites are present at high density in the genome and are highly conserved, making them powerful tools for the mapping and diagnosis of disease-related alleles. As sequencing and mapping of the human genome near completion, the detection and analysis of SNPs for applications ranging from disease gene mapping to diagnostics will be a major objective for genome research (Schaffer and Hawkins, 1998; Brookes, 1999). Such applications could involve the screening of hundreds to hundreds of thousands of SNPs in thousands to tens of thousands of samples. There is at present a pressing need for SNP scoring methods that are robust, high throughput, and cost efficient.
A variety of assay configurations has been developed to score SNPs, including hybridization (Wang et al., 1998), ligation (Landegren et al., 1988), polymerase (Syvanen et al., 1990), and nuclease (Lee et al., 1993; Lyamichev et al., 1999). These assays have been adapted to a number of analysis platforms including electrophoresis (Pastinen et al., 1996), microplates (Tobe et al., 1996), mass spectrometry (Braun et al., 1997), and flat arrays (Wang et al., 1998). The ideal method for large-scale SNP scoring would use a robust assay chemistry combined with a flexible analysis platform, enabling the multiplexed analysis of many SNPs per sample in a highly automated manner.
Polymerase-mediated single-base extension of oligonucleotide primers, or minisequencing (Syvanen, 1999), has proven to be a straightforward and robust tool for SNP genotyping. This approach involves the annealing of a primer directly upstream of the site of interest and single-base extension by DNA polymerase using labeled dideoxynucleotide triphosphates (ddNTPs). Minisequencing is attractive because it requires only a single primer per SNP and uses polymerase specificity to interrogate base identity. Minisequencing assays have been adapted to a variety of assay platforms, including electrophoresis (Tully et al., 1996), microplates (Shumaker et al., 1996), oligonucleotide arrays (Pastinen et al., 1997), and homogeneous fluorescence assays (Chen and Kwok, 1999); however, each of these configurations has limitations that preclude high-throughput, multiplexed, and automated analysis.
Flow cytometry is capable of sensitive and quantitative fluorescence measurements of individual particles without the need to separate free from particle-bound label. Analysis rates are very high (hundreds to thousands of particles per second), and multiple fluorescence and light scatter signals can be detected simultaneously. These features make flow cytometry an extremely powerful analytical tool for the analysis of cellular and macromolecular assemblies (Nolan and Sklar, 1998).
Accordingly, it is an object of the present invention to provide a flow cytometric assay that combines minisequencing with genomic analysis using multiplexing microsphere arrays to enable high-throughput SNP scoring.
Another object of the invention is to provide a method for designing address/capture tags that are capable of high specificity in directing a specific assay to a specific microsphere population in a multiplexed assay.
Additional objects, advantages and novel features of the invention will be set forth in part in the description which follows, and in part will become apparent to those skilled in the art upon examination of the following or may be learned by practice of the invention. The objects and advantages of the invention may be realized and attained by means of the instrumentalities and combinations particularly pointed out in the appended claims.
To achieve the foregoing and other objects, and in accordance with the purposes of the present invention, as embodied and broadly described herein, the method for identifying a set of sequences useful as address/capture tags includes: generating a chosen number of random DNA sequences having a chosen length; rejecting all reverse complementary sequences from the chosen number of random DNA sequences, the remaining sequences forming a first group of sequences; rejecting all sequences from the first group of sequences having common subsequences with a subsequence length greater than a chosen number of bases, the remaining sequences forming a second group of sequences; rejecting all sequences in the second group of sequences which can form stable hairpins, the remaining sequences forming a third group of sequences; and rejecting all sequences in the third group of sequences which can form stable dimers, the remaining sequences forming a fourth group of sequences; whereby a set of sequences is identified such that the sequences, if synthesized, would hybridize to their respective complements with a high degree of specificity.
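The first two steps of the method above can be sketched in a few lines. This is only an illustrative sketch under stated assumptions: the function names, the fixed random seed, and the reading of "rejecting all reverse complementary sequences" as dropping self-complementary sequences and the later member of any reverse-complement pair are hypothetical choices, not details given in the specification.

```python
import random

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def random_sequences(count, length, seed=0):
    """Generate `count` distinct random DNA sequences of the chosen length."""
    if count > 4 ** length:
        raise ValueError("count exceeds the number of distinct sequences")
    rng = random.Random(seed)  # seeded so that a run can be repeated
    seqs = set()
    while len(seqs) < count:
        seqs.add("".join(rng.choice("ACGT") for _ in range(length)))
    return sorted(seqs)


def reject_reverse_complements(seqs):
    """Drop self-complementary sequences and any sequence whose reverse
    complement has already been kept (assumed interpretation)."""
    kept = []
    seen = set()
    for seq in seqs:
        rc = reverse_complement(seq)
        if rc == seq or rc in seen:
            continue
        kept.append(seq)
        seen.add(seq)
    return kept
```

The later rejection steps (common subsequences, hairpins, dimers) would be applied in turn to the list returned by `reject_reverse_complements`.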
Preferably, the method includes the steps of determining the melting temperature of each sequence in the fourth group of sequences; rejecting all sequences that melt below a selected temperature, forming thereby a fifth group of sequences; and synthesizing a desired number of the sequences in the fifth group of sequences and complements thereof.
It is preferred that the selected melting temperature is between 50° C. and 70° C. and, more preferably, that the selected melting temperature is about 60° C.
It is also preferred that the method includes the step of rejecting all runs of bases greater than a chosen number of bases.
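As a hedged illustration of the run-rejection step, the sketch below discards any sequence containing a stretch of identical consecutive bases longer than a chosen limit. The function names and the `max_run` parameter are hypothetical, and reading "runs of bases" as homopolymer runs is an assumption.

```python
from itertools import groupby


def longest_run(seq):
    """Length of the longest stretch of identical consecutive bases."""
    return max((len(list(group)) for _, group in groupby(seq)), default=0)


def reject_long_runs(seqs, max_run):
    """Keep only sequences whose longest run does not exceed max_run."""
    return [seq for seq in seqs if longest_run(seq) <= max_run]
```

For example, with `max_run=3` the sequence `AAAAC` would be rejected and `ACGT` kept.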
Benefits and advantages of the invention include a great increase in the number of assays that can be reliably performed simultaneously using flow cytometry.
The accompanying drawings, which are incorporated in and form a part of the specification, illustrate the embodiments of the present invention and, together with the description, serve to explain the principles of the invention. In the drawings:
Briefly, the present invention includes a method for the construction of a collection of double-stranded DNA sequences manifesting specificity of binding. Each double-strand thereof consists of a pair of reverse complementary sequences. Binding specificity means that under reasonable experimental conditions the binding between the single strands arising from the double-strand sequences of the collection will be restricted to the reverse complementary pairs of sequences. The motivation for generating such sequences is that they enable large numbers of experiments to be tagged with one strand from a sequence and localized, for example on microbeads, using the complementary strand.
First, many potential tag sequences (oligomers) are generated. These sequences are then investigated for interactions that appear stable enough to create problems in the assay. In practical terms, this is accomplished by calculating the stability of any unfavorable interaction and expressing it in terms of a ΔG value, then omitting those oligomers that are likely to be involved in such interactions. Finally, the abbreviated collection of potential sequences is sorted by predicted melting temperature (Tm) (Kaderali, 2001), and a subset is chosen that has a narrow window of Tm's. This facilitates efficient capture at a temperature that is equally favorable for all tags.
As an example, chosen complementary pairs will melt at 60° C., whereas all other pairs of strands will melt below 30° C. Between these two temperatures, the desired binding specificity is manifest. The selected sequences and their complementary sequences are then synthesized.
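The Tm-sorting step described above can be sketched as a sliding window over the sorted candidates. This is a minimal sketch, not the procedure actually used: `gc_tm` is a crude GC-content estimate swapped in for the nearest-neighbor prediction cited in the text (Kaderali, 2001), and the function names and `subset_size` parameter are hypothetical.

```python
def gc_tm(seq):
    """Crude GC-content estimate of duplex Tm in degrees C.

    A stand-in only; the text relies on a more accurate predicted Tm.
    """
    gc = seq.count("G") + seq.count("C")
    return 64.9 + 41.0 * (gc - 16.4) / len(seq)


def narrowest_tm_window(seqs, subset_size, tm=gc_tm):
    """Return the subset_size sequences whose Tm values span the
    narrowest range, found by sliding a window over the Tm-sorted list."""
    if subset_size < 1 or subset_size > len(seqs):
        raise ValueError("subset_size must be between 1 and len(seqs)")
    ranked = sorted(seqs, key=tm)
    temps = [tm(seq) for seq in ranked]
    starts = range(len(ranked) - subset_size + 1)
    best = min(starts, key=lambda i: temps[i + subset_size - 1] - temps[i])
    return ranked[best:best + subset_size]
```

Any Tm predictor can be passed through the `tm` argument, so a nearest-neighbor model could replace `gc_tm` without changing the selection logic.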
The microsphere-based flow cytometric minisequencing assay of the present invention was demonstrated for SNP analysis.
Reference will now be made in detail to the present preferred embodiments of the invention, examples of which are illustrated in the accompanying drawings.
A. Materials and Methods:
1. Oligonucleotides. The DNA oligonucleotides were synthesized on an automated Applied Biosystems Model 394 oligonucleotide synthesizer using biotin-phosphoramidite and biotin- or amino-amino CPG from Glen Research (Sterling, Va.) or ordered from commercial sources. All the synthesized oligonucleotides were desalted, and their concentrations were measured by absorbance at 260 nm.
2. PCR amplification and sequencing of genomic DNA. Genomic DNA was prepared from blood samples of 30 individuals employed by Los Alamos National Laboratory (LANL). All samples were obtained with informed consent as approved by the LANL Institutional Review Board. These samples had been previously sequenced on an automated DNA sequencer (PE Applied Biosystems, Foster City, Calif.) by standard methods. PCR amplification of an HLA-DPB1 exon II 320-bp fragment containing the Glu-69 SNP target was performed using the primers UG19 and UG21 described in Recheldi et al. (1993). Amplification of a 255-bp fragment from exon II of the HLA DPA1 gene used the primers described in Wang et al. (1999). Before minisequencing, the PCR-amplified template was treated with shrimp alkaline phosphatase (SAP, 1 unit, USB) and exonuclease I (ExoI, 1 unit, USB) in SAP reaction buffer (USB) in a total volume of 10 µl at 37° C. for 1 h, followed by an inactivation step of 72° C. for 15 min. One microliter of the ExoI/SAP-treated PCR product was used for each minisequencing reaction.
3. Preparation of microspheres. Streptavidin-coated and carboxylated microspheres (3.1 or 6.2 µm in diameter) were purchased from Spherotech, Inc. (Libertyville, Ill.). Avidin-coated or carboxylated multiplexing microspheres were purchased from Luminex Corp. (Austin, Tex.). In some cases, avidin (ExtraAvidin, Sigma, St. Louis, Mo.) or amino-bearing oligonucleotides were covalently attached to carboxylated microspheres using ethylenediaminocarbodiimide (EDAC, Pierce, Rockford, Ill.). Avidin (5 mg/ml) or amino-oligonucleotide (100 nM) and EDAC (10 mg/ml) were added, and the mixture was incubated for 30 min. Biotinylated oligonucleotides (100 nM) were bound to avidin- or streptavidin-coated microspheres (1 × 10^7/ml) by incubation in TE buffer for at least 1 h at RT. The microspheres were washed by two cycles of centrifugation and resuspension to remove unbound oligonucleotides.
4. Capture/address tags. A random, insertion-deletion code (Varshamov and Tenengol'ts, 1965; Hazelwinkel, 1988, the teachings of which are hereby incorporated by reference herein), consisting of 1024 length-20 DNA sequences, was designed. In this code, no subsequence common to any two code words contained more than 14 letters. These subsequences are not necessarily contiguous, and Needleman-Wunsch sequence alignment was used to find the length of the longest common subsequence, with matching letters contributing unity and mismatches and insertions/deletions contributing zero to the alignment score (Needleman and Wunsch, 1970, the teachings of which are hereby incorporated by reference herein). The rationale for implementing this code was that minimal cross-hybridization could occur between the reverse complement of one code word and another code word when the code words have only short subsequences in common. Sixteen of these code words were synthesized (see Table 1). This subset was derived from the code after further vetting with the Oligo program (Molecular Biology Insights, Cascade, Colo.). The salient tests included duplex melting temperature, hairpin formation, matching to repetitive sequences in the DNA database, and cross-hybridization of capture tags.
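The alignment score described above (match = 1, mismatch and insertion/deletion = 0) equals the length of the longest common subsequence, which can be computed with a standard dynamic program. The sketch below is illustrative only: the function names are hypothetical, and the greedy construction in `build_code` is an assumption, since the text does not state how the 1024 code words were assembled.

```python
def lcs_length(a, b):
    """Length of the longest (not necessarily contiguous) common
    subsequence of a and b, i.e. the best alignment score when matches
    score 1 and mismatches and gaps score 0."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def build_code(candidates, max_common):
    """Greedily accept a candidate only if it shares no common subsequence
    longer than max_common with any word already accepted (assumed scheme)."""
    code = []
    for cand in candidates:
        if all(lcs_length(cand, word) <= max_common for word in code):
            code.append(cand)
    return code
```

With length-20 candidates, `build_code(candidates, 14)` would enforce the 14-letter limit stated above.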
5. Minisequencing assay. Minisequencing reactions were carried out in Thermosequenase buffer (Amersham Life Sciences, Cleveland, Ohio) in the presence of biotinylated or capture-tagged minisequencing primers (25 nM each), one FITC-labeled ddNTP (NEN/DuPont, Herts, UK), three nonfluorescent ddNTPs (5 mM each), Thermosequenase (1 unit, Amersham), and DNA template. The reaction was cycled 99 times at 94° C. for 10 s and at 60° C. for 10 s. After the minisequencing reaction, avidin- or address-tagged microspheres (5 × 10^6) were added to each tube and incubated at room temperature for 1 h to capture the minisequencing primers. The hybridized bead mix was then diluted into 500 µl TE/BSA (50 mM Tris-HCl, pH 8.0, 0.5 mM EDTA, 0.5% (w/v) bovine serum albumin) for fluorescence measurement by flow cytometry.
6. Fluorescence detection by flow cytometry. Flow cytometric measurements of microsphere fluorescence were made on a Becton-Dickinson FACSCalibur (San Jose, Calif.) using CellQuest acquisition and analysis software. In some cases, multiplex samples were analyzed using the FlowMetrix O/R acquisition system (Luminex Corp.) interfaced with FACSCalibur. The samples were illuminated at 488 nm (15 mW), and forward-angle light scatter, 90° light scatter, and fluorescence signals were acquired. Linear amplifiers were used for all measurements. Particles were gated on forward angle and 90° light scatter, and the mean fluorescence channel numbers were recorded. The background fluorescence signal from unlabeled microspheres was subtracted from all samples. Mean fluorescence values were converted to mean equivalent soluble fluorophore units using Quantum 24 FITC Standard Microspheres from Flow Cytometry Standards Corp. (San Juan, Puerto Rico).
B. Results
A single biotinylated oligonucleotide annealed immediately adjacent to the SNP site is extended one base using DNA polymerase and fluorescent ddNTPs. The present assay configuration involves four parallel reactions, each with a different fluorescent ddNTP and three other nonfluorescent ddNTPs. The use of Thermosequenase, a thermostable DNA polymerase that efficiently incorporates ddNTPs, allows the minisequencing reactions to be cycled, thus amplifying the signal. After extension, the biotinylated primers were captured onto streptavidin- or avidin-coated microspheres, and the number of incorporated fluorescent ddNTPs was measured by flow cytometry.
The polymorphism at amino acid position 69 in exon II of the HLA DPB1 locus was analyzed by the method of the present invention. This site is associated with immune hypersensitivity to the metal beryllium (Recheldi et al., 1993). A 320-bp fragment containing the site of interest was amplified from 30 different human genomic samples that had been sequenced previously, but had been coded to provide a “blind” test. A biotinylated minisequencing primer (18-mer) was designed to anneal immediately adjacent to this site. Four parallel reactions were set up containing the synthetic template, primer, polymerase, one of the four fluorescein-labeled ddNTPs, and the remaining three unlabeled ddNTPs. The reactions were cycled 99 times before the addition of the avidin capture beads. After capture of the primers, the samples were diluted 100-fold and analyzed by flow cytometry.
As shown in
The ability to interrogate an individual template molecule with many primers through thermal cycling is important to the sensitivity of the minisequencing approach. Preliminary experiments indicated that maximal signal was achieved after between 50 and 100 cycles. Using 99 cycles, we determined that using ~250 pM template (50 pg/µl of a 320-bp PCR product) allowed the genotype to be scored accurately (
A key advantage of the flow cytometric method is the ability to perform multiplexed analyses using soluble arrays of differently stained microspheres (Fulton et al., 1997; Kettman et al., 1998). To adapt minisequencing to multiplexed microspheres, we first designed a set of address and capture oligonucleotide 20-mers. The sequences were designed to hybridize to only their respective complements and not to any other address or capture sequence. As presented in
The multiplexed SNP scoring method of the present invention is demonstrated by genotyping common HLA DPA1 alleles. Variation in this region also appears to contribute to CBD susceptibility, especially in conjunction with the Glu69 allele (Wang et al., 1999). HLA alleles can be defined by the nucleotide base identity at several variable sites. For the alleles considered here (Table 2), there are eight SNP sites that can define alleles. Some of these sites are linked, so that a subset of the SNP sites can be used to identify individual alleles (Marsh and Bodmer, 1995). Minisequencing primers were designed to interrogate these eight SNPs, choosing a combination of Tm-matched upper and lower strand primers with the lowest tendency toward intramolecular hairpins and dimerization with themselves or any of the other primers. The close proximity of some SNP targets required a careful choice of primers to avoid competition for primer hybridization sites. For example, sites C37P3 and C38P3 are only three bases apart, necessitating the use of an upper strand primer to interrogate the first site and a lower strand primer for the second (
Note
C: codon number, P: position in a codon.
Presented in
C. Discussion
The primer single-base extension method, also known as minisequencing, has been adapted to flow cytometry to enable multiplexed SNP analysis suitable for high-throughput applications. Using fluorescently stained microspheres bearing unique address tags, we were able to perform multiplexed primer extension with fluorescent ddNTPs on several SNPs simultaneously and subsequently capture primers onto microspheres for analysis by flow cytometry.
Flow-cytometry-based minisequencing has several advantages over other methods used for SNP scoring. First, because flow cytometry provides intrinsic resolution between free and particle-bound fluorophore, samples can be analyzed without any separation or wash steps. Second, flow cytometry is a very sensitive method of fluorescence detection. Most commercial instruments can easily measure a few thousand fluorescent molecules per particle. In the present assay, this sensitivity enables the analysis of DNA template at subnanomolar concentrations. Third, efficiency is improved by performing hybridization and primer extension in solution. Hybridization on a surface is much slower than hybridization in solution (Zammatteo et al., 1997). In preliminary experiments, it was found that minisequencing using an immobilized primer was much less efficient than with a soluble primer (data not presented). By performing hybridization and extension in solution, followed by capture on microspheres for analysis, the assay sensitivity and speed were further improved. Finally, because flow cytometry is a multiparameter detection platform, it is possible to measure several features of a particle simultaneously. For example, it is possible to label each of the four ddNTPs with a different fluorophore, as is the case for dye-terminator sequencing, and detect them simultaneously in a single reaction.
Note.
C: codon; P: position; Cap: capture probe; U or L: upper or lower primer.
The accuracy of the new genotyping method is conferred by the high fidelity of the DNA polymerase that fluorescently labels the capture-tagged primer. Minisequencing has been widely tested using a variety of detection platforms and has been found to be very robust (Syvanen, 1999). The design of multiplexed minisequencing assays requires considerations similar to those required for successful multiplex PCR, namely, avoiding primer heterodimers and false priming. Exon 2 of the DPA1 gene proved particularly challenging, because some of the allele-defining sites were close together (
Perhaps the most important advantage of the present flow-cytometry-based method is the ability to configure multiplexed SNP-scoring assays using soluble arrays of dyed microspheres. In this case, we performed a multiplexed analysis of the eight SNPs that define common alleles of the HLA DPA1 gene, another risk factor in chronic beryllium disease (Wang et al., 1999). The key to our implementation of the multiplexed analysis is the use of address-tagged microspheres and capture-tagged primers to target SNP-specific primers to identifiable microsphere subsets. As presented in
Sets of up to 100 dyed microspheres will soon be available commercially (Luminex Corp.). Each of the 100 microsphere subsets could be addressed to code for a unique primer, allowing the analysis of 100 SNPs in a single reaction. Because they can be readily prepared on the lab bench without any specialized equipment, microsphere arrays are much more flexible than two-dimensional microarrays on chips or slides. These features, combined with recent advances in automated sample handling (Nolan et al., 1995; Edwards et al., 1999), make flow cytometry an extremely attractive platform for high-throughput genotyping. In summary, a rapid and sensitive microsphere-based minisequencing assay has been developed for the multiplexed analysis of single nucleotide polymorphisms using flow cytometry. Incubations can be carried out in very small volumes (~10 µl), subjected to thermocycling to amplify signal, and analyzed without a wash step at a rate of greater than one sample per minute. The optimal reaction conditions have been determined for the case where template is limited and sensitivity is most important as well as for the case where template is not limiting and speed is most important. Flow cytometers are widely available in core facilities in many universities and medical schools and in industry. The present invention makes it possible to rapidly screen large numbers of samples with a minimum of start-up costs and development time. Moreover, flow cytometry is also compatible with hybridization- and ligation-based assays (Fulton et al., 1997; Cai et al., 1998; Iannone et al., 2000), making it a versatile platform for a variety of genomic analyses.
The foregoing description of the invention has been presented for purposes of illustration and description and is not intended to be exhaustive or to limit the invention to the precise form disclosed, and obviously many modifications and variations are possible in light of the above teaching. The embodiments were chosen and described in order to best explain the principles of the invention and its practical application to thereby enable others skilled in the art to best utilize the invention in various embodiments and with various modifications as are suited to the particular use contemplated. It is intended that the scope of the invention be defined by the claims appended hereto.
This patent application claims the benefit of provisional application Ser. No. 60/210,759 which was filed on Jun. 8, 2000.
This invention was made with government support under Contract No. W-7405-ENG-36 awarded by the U.S. Department of Energy to The Regents of The University of California. The government has certain rights in the invention.
Number | Date | Country
---|---|---
60210759 | Jun 2000 | US

Relation | Number | Date | Country
---|---|---|---
Parent | 09877819 | Jun 2001 | US
Child | 10992971 | Nov 2004 | US