Claims
- 1. A method for identifying, classifying, or quantifying one or more nucleic acids in an input population comprising a plurality of nucleic acids having different nucleotide sequences, said method comprising:
(a) probing said input population with one or more recognition means, each recognition means recognizing a different target nucleotide subsequence or a different set of target nucleotide subsequences;
(b) generating one or more subpopulations from said input population probed by said recognition means, each subpopulation being produced from a nucleic acid in said input population by recognition of one or more target nucleotide subsequences in said nucleic acid by said recognition means and comprising a representation of (i) the length between occurrences of target nucleotide subsequences in said nucleic acid, and (ii) the identities of said target nucleotide subsequences in said nucleic acid or the identities of said sets of target nucleotide subsequences among which are included the target nucleotide subsequences in said nucleic acid;
(c) partitioning said subpopulations to produce one or more partitioned fractions;
(d) constructing one or more libraries from said partitioned fractions, wherein said libraries comprise specific nucleic acid fragments distributed in an array;
(e) pooling said specific fragments from said one or more libraries into one or more pools, wherein the pooling retains a map of a fragment's location in the array, and wherein the pooling results in all nucleic acid sequences that are present in the subpopulations being present in the pools;
(f) partitioning said pooled fragments to provide one or more selected subsets;
(g) identifying at least one specific nucleic acid fragment from said subsets;
(h) deconvoluting said subsets, thereby mapping the specific fragment to its original location; and
(i) sequencing one or more fragments in said subset to provide a nucleic acid sequence;
thereby identifying, classifying, or quantifying one or more nucleic acids.
- 2. The method of claim 1, wherein step (c) comprises two or more partitioning means.
- 3. The method of claim 2, wherein said partitioning means is selected from the group consisting of gel electrophoresis; high pressure liquid chromatography; size selection; separation based on physical and/or biochemical properties including molecular weight, molecular size, terminal nucleotide sequences, exact migratory pattern, and the like; elution; gel slicing; nucleotide subsequence probing; restriction digest; ligating an adapter oligonucleotide to one or more ends of the fragment; hybridizing; and amplification.
- 4. The method of claim 1 further comprising dividing said sample of nucleic acids into a plurality of portions and performing the steps of claim 1 individually on a plurality of said portions, wherein a different one or more recognition means are used with each portion.
- 5. The method of claim 1, wherein said cDNA population is derived from the 5′ ends of the RNA molecules.
- 6. The method of claim 1, wherein said cDNA population is derived from the interior regions of the RNA molecules.
- 7. The method of claim 1, wherein said cDNA population is derived from the 3′ ends of the RNA molecules.
- 8. The method of claim 1, wherein said partitioning step optionally comprises hybridizing a probe nucleic acid sequence to the population of nucleic acids.
- 9. The method of claim 1, further comprising ligating adapter oligonucleotides to the termini of the digested cDNA molecules, thereby producing ligation products.
- 10. The method of claim 9, further comprising amplifying the ligation products.
- 11. The method of claim 10, further comprising separating the amplified products.
- 12. The method of claim 11, wherein said separating is by gel electrophoresis.
- 13. The method of claim 11, wherein the first nucleic acid sequence is identified by comparing the size of one or more digestion products produced by a member of the subpopulation of nucleic acid sequences to the sizes of fragments generated by the same restriction enzyme or enzymes in said reference nucleic acid or nucleic acids.
- 14. The method of claim 11, further comprising
recovering one or more size-separated digestion products; reamplifying the recovered products; and separating the reamplified products.
- 15. The method of claim 14, wherein said separating is by gel electrophoresis or liquid chromatography.
- 16. The method of claim 15, wherein the first nucleic acid sequence is identified by comparing the size of one or more digestion products produced by a member of the subpopulation of nucleic acid sequences to the sizes of fragments generated by the same restriction enzyme in said reference nucleic acid sequences.
- 17. The method of claim 9, further comprising:
inserting the ligation product into a cloning vector to form a vector-insert; transforming the vector-insert into a suitable host; culturing the host under conditions allowing for replication of the vector-insert; recovering the vector-insert from said host; and digesting the vector-insert with one or more restriction enzymes, thereby releasing said insert; and comparing the size of the insert to sizes of fragments generated by the same restriction enzyme or enzymes in said reference nucleic acid or nucleic acids.
- 18. The method of claim 1, wherein at least a portion of the first nucleic acid sequence is determined and compared to the nucleotide sequence of one or more reference nucleic acid sequences.
- 19. The method of claim 1, wherein the determining step comprises hybridizing the first nucleic acid sequence to one or more of the reference nucleic acid sequences.
- 20. A method for equalizing the representation of nucleic acid sequences in a population of nucleic acid sequences, the method comprising, in order, the steps of:
providing a population of cDNA molecules derived from a population of RNA molecules, wherein said cDNA population comprises a first nucleic acid sequence and a second nucleic acid sequence having a nucleic acid sequence distinct from the first nucleic acid sequence, and wherein said first nucleic acid sequence is present at a higher level in said population than said second nucleic acid sequence;
partitioning said cDNA population into one or more subpopulations of nucleic acid sequences, wherein said partitioning comprises digesting the cDNA population with one or more restriction enzymes; and
lowering the level of said first nucleic acid sequence relative to the level of said second nucleic acid sequence in the subpopulation of nucleic acid sequences, thereby equalizing the representation of nucleic acid sequences in said population of nucleic acid sequences.
- 21. A method for producing a population of nucleic acid molecules enriched for 5′ regions of mRNA molecules for analysis by the method of claim 1, the method comprising:
providing a population of RNA molecules, said population including RNA molecules having a 5′ terminal Gppp cap structure and a 5′ terminal phosphate group; contacting said population of RNA molecules with a phosphatase under conditions that result in removal of the 5′ terminal phosphate group while leaving the 5′ terminal Gppp cap structure intact; inactivating said phosphatase; contacting the population of RNA molecules with a pyrophosphatase under conditions that result in the removal of the 5′ terminal Gppp and the formation of a 5′ phosphate group; annealing an oligonucleotide in the presence of an RNA ligase to form a hybrid molecule; and forming a cDNA from said oligonucleotide.
- 22. A method of identifying an RNA sequence in a sample comprising a plurality of RNA sequences, the method comprising:
synthesizing cDNA copies of a plurality of RNA species to form a cDNA sample; determining the size of one or more of said cDNA molecules in said cDNA sample; comparing the size of said sample with the size of a reference nucleic acid according to the method of claim 1; and thereby identifying the cDNA sequence.
- 23. The method of claim 22, wherein said cDNA molecules are digested with one or more restriction enzymes prior to the determining step.
- 24. The method of claim 23, further comprising ligating adapter oligonucleotides to the termini of the digested cDNA molecules prior to the determining step.
- 25. The method of claim 22, wherein said identifying step comprises comparing the size of one or more digestion products produced by one or more said cDNA molecules to a reference nucleic acid or nucleic acids.
- 26. A method of identifying an RNA sequence in a population of RNA sequences, the method comprising:
(a) removing 5′ terminal pppG from RNAs in said population to form a population of RNAs having terminal 5′ phosphate groups;
(b) ligating a linker oligonucleotide to the terminal 5′ phosphate groups of RNA molecules in said population of RNAs;
(c) synthesizing complementary cDNA molecules from said population of RNA molecules to form a cDNA sample;
(d) digesting said complementary cDNA molecules with at least one restriction enzyme;
(e) ligating an adapter molecule to the digested cDNA molecules;
(f) amplifying the molecules produced in step (e);
(g) identifying the amplified molecules of step (f); and
(h) comparing the amplified molecules to one or more reference nucleic acids.
- 27. The method of claim 1, wherein the partitioning step optionally comprises one or more processes selected from:
a) isolating nucleic acid sequences from different cell types;
b) separating the nucleic acid sequences in the subpopulation by physical properties;
c) amplification of a specific subpopulation of nucleic acid sequences;
d) amplifying 5′ terminal sequences of the nucleic acid sequences;
e) amplifying interior sequences of the nucleic acid sequences;
f) amplifying 3′ terminal sequences of the nucleic acid sequences;
g) partitioned subtraction screening;
h) length selection by lariat formation;
i) use of identical primers;
j) use of shortened primers;
k) use of intermediate annealing temperature; and
l) use of modified cycle times.
- 28. A method of identifying a novel nucleic acid sequence, the method comprising:
(a) probing said input population with one or more recognition means, each recognition means recognizing a different target nucleotide subsequence or a different set of target nucleotide subsequences;
(b) generating one or more subpopulations from said input population probed by said recognition means, each subpopulation being produced from a nucleic acid in said input population by recognition of one or more target nucleotide subsequences in said nucleic acid by said recognition means and comprising a representation of (i) the length between occurrences of target nucleotide subsequences in said nucleic acid, and (ii) the identities of said target nucleotide subsequences in said nucleic acid or the identities of said sets of target nucleotide subsequences among which are included the target nucleotide subsequences in said nucleic acid;
(c) partitioning said subpopulations to produce one or more partitioned fractions;
(d) constructing one or more libraries from said partitioned fractions, wherein said libraries comprise specific nucleic acid fragments distributed in an array;
(e) pooling said specific fragments from said one or more libraries into one or more pools, wherein the pooling retains a map of a fragment's location in the array, and wherein the pooling results in all nucleic acid sequences that are present in the subpopulations being present in the pools;
(f) partitioning said pooled fragments to provide one or more selected subsets;
(g) normalizing the population to provide one or more subpopulations of nucleic acid sequences;
(h) identifying at least one specific nucleic acid fragment from said subsets;
(i) deconvoluting said subsets, thereby mapping the specific fragment to its original location; and
(j) sequencing one or more fragments in said subset to provide a nucleic acid sequence;
thereby identifying a novel nucleic acid sequence.
- 29. The method of claim 28, wherein the normalizing comprises partitioning.
- 30. The method of claim 29, wherein the partitioning comprises one or more processes selected from:
a) isolating nucleic acid sequences from different cell types;
b) separating the nucleic acid sequences in the subpopulation by physical properties;
c) amplification of a specific subpopulation of nucleic acid sequences;
d) amplifying 5′ terminal sequences of the nucleic acid sequences;
e) amplifying interior sequences of the nucleic acid sequences;
f) amplifying 3′ terminal sequences of the nucleic acid sequences;
g) partitioned subtraction screening;
h) length selection by lariat formation;
i) use of identical primers;
j) use of shortened primers;
k) use of intermediate annealing temperature; and
l) use of modified cycle times.
- 31. The method of claim 28, further comprising the steps of:
assembling the plurality of nucleic acid sequences to provide an assembled sequence; and determining whether the assembled sequence is absent in a reference set of one or more reference nucleic acid sequences; whereby if the assembled sequence is absent from the reference set, the assembled sequence is a novel nucleic acid sequence.
- 32. A method of screening a population of nucleic acid molecules to identify a novel sequence, the method comprising:
providing a population of nucleic acid sequences; normalizing said population into one or more subpopulations of nucleic acid sequences, wherein said normalizing is selected from the group consisting of restriction endonuclease digestion, size-based fragment partitioning, terminal nucleotide sequence, and fragment migratory pattern; identifying a first nucleic acid sequence in the subpopulation of nucleic acid sequences; and comparing the first nucleic acid sequence to a reference nucleic acid sequence or sequences, wherein the absence of the first nucleic acid sequence in the reference nucleic acid sequence or nucleic acid sequences indicates the first nucleic acid sequence is a novel nucleic acid sequence.
- 33. The method of screening as in claim 32, wherein the normalization step comprises processes selected from the group consisting of partitioned subtraction screening, length selection by lariat formation, use of identical primers, use of shortened primers, use of intermediate annealing temperature, use of modified cycle times, and use of a 5′-capped end.
- 34. The method of claim 1, wherein the input population comprises multiple input sources.
- 35. The method of claim 34, wherein nucleic acids from individual input sources are labeled.
- 36. The method of claim 35, wherein different input sources are selected from the group consisting of: tissue type, cell type, treatment condition, disease state, and organism type.
- 37. The method of claim 1, wherein nucleic acids are processed using multiplexing.
- 38. The method of claim 28, wherein the input population comprises multiple input sources.
- 39. The method of claim 28, wherein nucleic acids are processed using multiplexing.
- 40. The method of claim 32, wherein the input population comprises multiple input sources.
- 41. The method of claim 32, wherein nucleic acids are processed using multiplexing.
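By way of illustration only, the pooling and deconvoluting steps recited in claims 1 and 28 (steps (d), (e), and the deconvoluting step) can be sketched with a simple row-and-column pooling scheme. The sketch is an assumption made for exposition and is not the claimed implementation: the function names `build_pools` and `deconvolute`, the well coordinates, and the fragment identifiers are hypothetical, and row/column pooling is only one of several schemes that would retain a map of each fragment's location in the array.

```python
from collections import defaultdict


def build_pools(array):
    """Pool an arrayed library by row and by column.

    `array` maps a (row, column) well position to a fragment identifier.
    Every fragment lands in exactly one row pool and one column pool, so
    every arrayed fragment is present in the pools, and the pool labels
    together retain the map back to the well.
    """
    row_pools = defaultdict(set)
    col_pools = defaultdict(set)
    for (row, col), fragment in array.items():
        row_pools[row].add(fragment)
        col_pools[col].add(fragment)
    return dict(row_pools), dict(col_pools)


def deconvolute(fragment, row_pools, col_pools):
    """Return candidate (row, column) wells for an identified fragment.

    A well is a candidate when the fragment was detected in both its row
    pool and its column pool. If the same fragment occupies several wells,
    the list can also contain wells that do not actually hold it.
    """
    rows = [r for r, members in row_pools.items() if fragment in members]
    cols = [c for c, members in col_pools.items() if fragment in members]
    return [(r, c) for r in rows for c in cols]


# Hypothetical 2 x 3 array of fragments (illustrative identifiers only).
plate = {
    ("A", 1): "frag-001", ("A", 2): "frag-002", ("A", 3): "frag-003",
    ("B", 1): "frag-004", ("B", 2): "frag-005", ("B", 3): "frag-006",
}
row_pools, col_pools = build_pools(plate)
print(deconvolute("frag-005", row_pools, col_pools))  # [('B', 2)]
```

In this sketch six wells are screened as five pools (two rows and three columns) rather than six individual samples. The saving grows with array size, at the cost of ambiguity whenever one fragment occupies more than one well.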
RELATED APPLICATIONS
[0001] This application claims priority to provisional application U.S. Ser. No. 60/115,109, filed Jan. 8, 1999, and non-provisional application U.S. Ser. No. 09/417,386, filed Oct. 13, 1999, which are incorporated herein by reference in their entirety.
Provisional Applications (1)
| Number | Date | Country |
| 60115109 | Jan 1999 | US |

Continuations (1)
|  | Number | Date | Country |
| Parent | 09417386 | Oct 1999 | US |
| Child | 10407519 | Apr 2003 | US |