Claims
- 1. A method of reducing the complexity of a first nucleic acid sample to produce a second nucleic acid sample comprising:
selecting a collection of target sequences by a method comprising:
identifying fragments that are in a selected size range when a genome is digested with a selected enzyme or enzyme combination; identifying sequences of interest present on the fragments in the selected size range; and selecting as target sequences fragments that are in the selected size range and comprise a sequence of interest;
fragmenting said first nucleic acid sample to produce sample fragments; ligating at least one adaptor to the sample fragments; and generating said second nucleic acid sample by amplifying a subset of sample fragments wherein said collection of target sequences is enriched in said second sample.
- 2. The method of claim 1 wherein amplification is by PCR.
- 3. The method of claim 1 wherein amplification is by PCR using a primer that is complementary to the adaptor.
- 4. The method of claim 2 wherein 20 to 50 cycles of PCR are used to amplify the fragments.
- 5. The method of claim 1 wherein the subset of sample fragments is targeted for enrichment by selecting the method of fragmentation.
- 6. The method of claim 1 wherein the step of fragmenting said first nucleic acid sample comprises digestion with at least one restriction enzyme.
- 7. The method of claim 6 wherein at least one restriction enzyme has a six base recognition sequence.
- 8. The method of claim 1 wherein the step of fragmenting said first nucleic acid sample comprises digestion with a type IIs endonuclease.
- 9. The method of claim 1 wherein the sequences enriched in said second nucleic acid sample comprise at least 0.01% of said first nucleic acid sample.
- 10. The method of claim 1 wherein the sequences enriched in said second nucleic acid sample comprise at least 0.5% of said first nucleic acid sample.
- 11. The method of claim 1 wherein the sequences enriched in said second nucleic acid sample comprise at least 3% of said first nucleic acid sample.
- 12. The method of claim 1 wherein the sequences enriched in said second nucleic acid sample comprise at least 12% of said first nucleic acid sample.
- 13. The method of claim 1 wherein the sequences enriched in said second nucleic acid sample comprise at least 50% of said first nucleic acid sample.
- 14. The method of claim 1 wherein said first nucleic acid sample is genomic DNA, DNA, cDNA derived from RNA or cDNA derived from mRNA.
- 15. The method of claim 1 wherein the target sequences are 800 base pairs long or less.
- 16. The method of claim 1 wherein the target sequences are 1000 base pairs long or less.
- 17. The method of claim 1 wherein the target sequences are 1200 base pairs long or less.
- 18. The method of claim 1 wherein the target sequences are 1500 base pairs long or less.
- 19. The method of claim 1 wherein the target sequences are 2000 base pairs long or less.
- 20. The method of claim 1 wherein the subset of sample fragments enriched in said second nucleic acid sample is comprised of fragments that are about 2000 base pairs long or less.
- 21. The method of claim 1 wherein the subset of sample fragments enriched in said second nucleic acid sample is comprised of fragments that are about 3000 base pairs long or less.
- 22. The method of claim 1 wherein said fragmenting, ligating and amplifying steps are done in a single tube.
- 23. The method of claim 1 wherein two or more adaptors are ligated to the fragments so that some of the fragments have a first adaptor sequence ligated to one end and a second adaptor sequence ligated to the other end.
- 24. The method of claim 23 wherein said amplification is by PCR with a first primer that recognizes the first adaptor and a second primer that recognizes the second adaptor.
- 25. The method of claim 1 wherein the adaptor comprises a complementary region comprising a first common sequence and the complement of said first common sequence and a region comprising a second common sequence on one strand and a third common sequence on the other strand wherein said second and third common sequences will not base pair under standard hybridization conditions.
- 26. The method of claim 25 wherein amplification is by PCR using a first primer to said second common sequence and a second primer to said third common sequence.
- 27. The method of claim 1 wherein said sequences of interest comprise sequence variations.
- 28. The method of claim 27 wherein said sequence variations are single nucleotide polymorphisms.
- 29. The method of claim 28 wherein at least one of the single nucleotide polymorphisms is associated with a phenotype.
- 30. The method of claim 28 wherein at least one of the single nucleotide polymorphisms is associated with a disease.
- 31. The method of claim 28 wherein at least one of the single nucleotide polymorphisms is associated with the efficacy of a drug.
- 32. The method of claim 28 wherein at least one of the single nucleotide polymorphisms is associated with a haplotype.
- 33. The method of claim 1 wherein a computer is used for one or more steps.
- 34. A method for selecting a collection of target sequences comprising:
identifying fragments that are in a selected size range when a genome is digested with a selected enzyme or enzyme combination; identifying sequences of interest present on the fragments in the selected size range; and selecting as target sequences fragments that are in the selected size range and comprise a sequence of interest.
- 35. A collection of target sequences selected according to the method of claim 34.
- 36. The collection of target sequences of claim 35 wherein more than 80% of the target sequences are 800 base pairs long or less.
- 37. The collection of target sequences of claim 35 wherein more than 80% of the target sequences are 1000 base pairs long or less.
- 38. The collection of target sequences of claim 35 wherein more than 80% of the target sequences are 1200 base pairs long or less.
- 39. The collection of target sequences of claim 35 wherein more than 80% of the target sequences are 1500 base pairs long or less.
- 40. The collection of target sequences of claim 35 wherein more than 80% of the target sequences are 2000 base pairs long or less.
- 41. The method of claim 34 wherein said sequences of interest comprise sequence variations.
- 42. The method of claim 41 wherein said sequence variations are single nucleotide polymorphisms.
- 43. The method of claim 42 wherein at least one of the single nucleotide polymorphisms is associated with a phenotype.
- 44. The method of claim 42 wherein at least one of the single nucleotide polymorphisms is associated with a disease.
- 45. The method of claim 42 wherein at least one of the single nucleotide polymorphisms is associated with the efficacy of a drug.
- 46. The method of claim 42 wherein at least one of the single nucleotide polymorphisms is associated with a haplotype.
- 47. The method of claim 34 wherein a computer is used for one or more steps.
- 48. A method for analyzing a first nucleic acid sample comprising:
fragmenting said first nucleic acid sample to produce fragments; ligating an adaptor to the fragments; generating a second nucleic acid sample by amplifying the fragments wherein a subset of fragments comprising a collection of target sequences is enriched in said second sample; providing a nucleic acid array; hybridizing said second nucleic acid sample to said array; generating a hybridization pattern resulting from said hybridization; and analyzing said hybridization pattern.
- 49. The method of claim 48 wherein said method for analyzing a first nucleic acid sample comprises determining whether the first nucleic acid sample contains sequence variations.
- 50. The method of claim 49 wherein said sequence variations are single nucleotide polymorphisms.
- 51. The method of claim 50 wherein at least one of the single nucleotide polymorphisms is associated with a phenotype.
- 52. The method of claim 50 wherein at least one of the single nucleotide polymorphisms is associated with a disease.
- 53. The method of claim 50 wherein at least one of the single nucleotide polymorphisms is associated with the efficacy of a drug.
- 54. The method of claim 50 wherein at least one of the single nucleotide polymorphisms is associated with a haplotype.
- 55. The method of claim 48 wherein the nucleic acid array is designed to interrogate one or more sequences in said collection of target sequences.
- 56. The method of claim 48 wherein the step of analyzing said hybridization pattern determines the presence or absence of DNA sequence variation in the first nucleic acid sample.
- 57. The method of claim 48 wherein some of the sequences enriched in said second nucleic acid sample are first determined by a computer.
- 58. The method of claim 48 wherein said collection of target sequences is selected by a method comprising:
identifying fragments that are in a selected size range when a genome is digested with a selected enzyme or enzyme combination; identifying sequences of interest present on the fragments in the selected size range; and selecting as target sequences fragments that are in the selected size range and comprise a sequence of interest.
- 59. The method of claim 58 wherein a computer is used for one or more steps.
- 60. The method of claim 48 wherein a second adaptor is ligated to the fragments so that some of the fragments have a first adaptor sequence ligated to one end and a second adaptor sequence ligated to the other end.
- 61. The method of claim 60 wherein said amplification is by PCR with a first primer that recognizes the first adaptor and a second primer that recognizes the second adaptor.
- 62. The method of claim 48 wherein the adaptor comprises a complementary region comprising a first common sequence and the complement of said first common sequence and a region comprising a second common sequence on one strand and a third common sequence on the other strand wherein said second and third common sequences will not base pair under standard hybridization conditions.
- 63. The method of claim 62 wherein amplification is by PCR using a first primer to said second common sequence and a second primer to said third common sequence.
- 64. A method of genotyping an individual comprising:
providing a first nucleic acid sample from said individual; fragmenting said first nucleic acid sample to produce fragments; ligating at least one adaptor sequence to the population of fragments; and generating a second nucleic acid sample from said first nucleic acid sample wherein said second nucleic acid sample is enriched for a subset of fragments and said subset of fragments comprises sequences from a collection of target sequences comprising a collection of single nucleotide polymorphisms; providing an array comprising probes to interrogate for the presence or absence of different alleles in said collection of single nucleotide polymorphisms; hybridizing said second nucleic acid sample to said array; generating a hybridization pattern resulting from said hybridization; and analyzing the hybridization pattern to determine which alleles are present for at least one of the single nucleotide polymorphisms.
- 65. The method of claim 64 wherein at least one of the single nucleotide polymorphisms is associated with a phenotype.
- 66. The method of claim 64 wherein at least one of the single nucleotide polymorphisms is associated with a disease.
- 67. The method of claim 64 wherein at least one of the single nucleotide polymorphisms is associated with the efficacy of a drug.
- 68. The method of claim 64 wherein at least one of the single nucleotide polymorphisms is associated with a haplotype.
- 69. A method for screening for DNA sequence variations in a population of individuals comprising:
providing a first nucleic acid sample from each of said individuals; providing a second nucleic acid sample by a method comprising:
fragmenting said first nucleic acid sample to produce fragments;
ligating at least one adaptor sequence to the population of fragments; and generating a second nucleic acid sample from said first nucleic acid sample wherein said second nucleic acid sample is enriched for a subset of fragments and said subset of fragments comprises sequences from said collection of target sequences;
providing a plurality of nucleic acid arrays wherein said arrays comprise probes designed to interrogate for DNA sequence variations; hybridizing each of said second nucleic acid samples to one of said plurality of arrays; generating a plurality of hybridization patterns resulting from said hybridizations; and analyzing the hybridization patterns to determine the presence or absence of sequence variation in the population of individuals.
- 70. The method of claim 69 wherein said sequence variation is one or more single nucleotide polymorphisms.
- 71. The method of claim 70 wherein at least one of the single nucleotide polymorphisms is associated with a phenotype.
- 72. The method of claim 70 wherein at least one of the single nucleotide polymorphisms is associated with a disease.
- 73. The method of claim 70 wherein at least one of the single nucleotide polymorphisms is associated with the efficacy of a drug.
- 74. The method of claim 70 wherein at least one of the single nucleotide polymorphisms is associated with a haplotype.
- 75. The method of claim 69 wherein the nucleic acid array is designed to interrogate sequence variations in said collection of target sequences.
- 76. A kit for genotyping at least one individual comprising:
buffer, one or more restriction enzymes, at least one adaptor, a ligase, one or more primers and instructions for the use of the kit.
- 77. A kit for genotyping at least one individual comprising an array designed to interrogate sequence variation in a collection of target sequences wherein said target sequences are selected by a method comprising:
identifying fragments that are in a selected size range when a genome is digested with a selected enzyme or enzyme combination; identifying sequences of interest present on the fragments in the selected size range; and selecting as target sequences fragments that are in the selected size range and comprise a sequence of interest.
- 78. A method for analyzing a first nucleic acid sample comprising:
obtaining a second nucleic acid sample that is enriched for a subset of fragments said subset comprising selected target sequences by a method comprising:
selecting one or more target sequences that may be present in the first nucleic acid sample; fragmenting the first nucleic acid sample so that the selected target sequences are present on fragments of a specific size range; ligating at least one adaptor sequence to the fragments; generating said second nucleic acid sample by amplifying the fragments so that the fragments containing the selected target sequences are enriched in the amplified product;
providing a nucleic acid array; hybridizing said second nucleic acid sample to said array; and analyzing a hybridization pattern resulting from said hybridization.
- 79. The method of claim 78 wherein said second nucleic acid sample comprises at least 0.01% of said first nucleic acid sample.
- 80. The method of claim 78 wherein said second nucleic acid sample comprises at least 0.5% of said first nucleic acid sample.
- 81. The method of claim 78 wherein said second nucleic acid sample comprises at least 3% of said first nucleic acid sample.
- 82. The method of claim 78 wherein said second nucleic acid sample comprises at least 12% of said first nucleic acid sample.
- 83. The method of claim 78 wherein said second nucleic acid sample comprises at least 50% of said first nucleic acid sample.
- 84. The method of claim 78 wherein said first nucleic acid sample is DNA, genomic DNA, cDNA derived from RNA or cDNA derived from mRNA.
- 85. The method of claim 78 wherein the fragments are amplified by polymerase chain reaction (PCR).
- 86. The method of claim 78 wherein the step of fragmenting the first nucleic acid sample comprises digestion with at least one restriction enzyme.
- 87. The method of claim 78 wherein the step of fragmenting the first nucleic acid sample comprises digestion with at least one type IIs endonuclease.
- 88. The method of claim 78 wherein said adaptor sequence comprises a primer template sequence.
- 89. The method of claim 78 wherein said adaptor sequence comprises a tag sequence.
- 90. The method of claim 78 wherein said method for analyzing a first nucleic acid sample comprises determining whether the first nucleic acid sample contains sequence variations.
- 91. The method of claim 90 wherein said sequence variations are single nucleotide polymorphisms.
- 92. The method of claim 91 wherein at least one of the single nucleotide polymorphisms is associated with a phenotype.
- 93. The method of claim 91 wherein at least one of the single nucleotide polymorphisms is associated with a disease.
- 94. The method of claim 91 wherein at least one of the single nucleotide polymorphisms is associated with the efficacy of a drug.
- 95. The method of claim 91 wherein at least one of the single nucleotide polymorphisms is associated with a haplotype.
- 96. The method of claim 78 wherein said target sequences are selected by a method comprising:
identifying fragments that are in a selected size range when a genome is digested with a selected enzyme or enzyme combination; identifying sequences of interest present on the fragments in the selected size range; and selecting as target sequences fragments that are in the selected size range and comprise a sequence of interest.
- 97. The method of claim 96 wherein a computer is used for one or more steps.
- 98. A solid support comprising a plurality of probes attached to said solid support wherein said probes are designed to interrogate sequence variation in a collection of target sequences and said collection of target sequences is selected by a method comprising:
identifying fragments that are in a selected size range when a genome is digested with a selected enzyme or enzyme combination; identifying sequences of interest present on the fragments in the selected size range; and selecting as target sequences fragments that are in the selected size range and comprise a sequence of interest.
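The target-selection steps recited in claims 34, 58, 77, 96 and 98 (identify fragments in a selected size range after digestion, identify sequences of interest on those fragments, and select fragments meeting both conditions) can be sketched computationally, consistent with the claims reciting that a computer may be used for one or more steps. The following is only an illustrative sketch under stated assumptions, not the applicant's implementation: the function names `digest`, `select_targets` and `retained_fraction`, the toy genome, the SNP positions and the size range are all hypothetical. EcoRI (recognition sequence GAATTC, cutting after the first G) is used merely as an example of an enzyme with a six base recognition sequence. The sketch assumes complete digestion of a single linear strand and does not model adaptor ligation or the size bias of PCR.

```python
import re
from typing import List, Tuple

Fragment = Tuple[int, int]  # half-open interval [start, end) on the genome


def digest(genome: str, site: str, cut_offset: int) -> List[Fragment]:
    """In silico complete digest: cut at every occurrence of `site`.

    `cut_offset` is the position of the cut relative to the start of the
    recognition site (1 for EcoRI, which cuts G^AATTC).
    """
    # A lookahead is used so that overlapping sites would also be found.
    pattern = f"(?={re.escape(site)})"
    cuts = [m.start() + cut_offset for m in re.finditer(pattern, genome)]
    bounds = [0] + cuts + [len(genome)]
    return [(a, b) for a, b in zip(bounds, bounds[1:]) if b > a]


def select_targets(
    fragments: List[Fragment],
    snp_positions: List[int],
    min_len: int,
    max_len: int,
) -> List[Tuple[int, int, List[int]]]:
    """Keep fragments in the size range that carry a sequence of interest."""
    targets = []
    for start, end in fragments:
        if not (min_len <= end - start <= max_len):
            continue  # outside the selected size range
        snps = [p for p in snp_positions if start <= p < end]
        if snps:  # fragment comprises at least one sequence of interest
            targets.append((start, end, snps))
    return targets


def retained_fraction(targets, genome_length: int) -> float:
    """Fraction of the starting sequence represented by the selected targets."""
    return sum(end - start for start, end, _ in targets) / genome_length


# Hypothetical 80-base toy genome containing three EcoRI sites.
TOY_GENOME = "ACGT" * 3 + "GAATTC" + "ACGTACGT" + "GAATTC" + "AC" * 20 + "GAATTC" + "TT"
TOY_SNPS = [5, 40, 75]  # hypothetical 0-based SNP coordinates

fragments = digest(TOY_GENOME, "GAATTC", 1)
targets = select_targets(fragments, TOY_SNPS, min_len=10, max_len=20)
print(fragments)  # [(0, 13), (13, 27), (27, 73), (73, 80)]
print(targets)    # [(0, 13, [5])]
print(retained_fraction(targets, len(TOY_GENOME)))  # 0.1625
```

In this toy case only one of the four fragments is both within the size range and carries a SNP, so the selected targets represent 16.25% of the starting sequence. Changing the enzyme or the size range changes which SNPs fall on selected fragments, which is the design lever the claims describe.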
RELATED APPLICATIONS
[0001] The present application is a continuation-in-part of U.S. application Ser. No. 09/916,135, filed Jul. 25, 2001, the disclosure of which is incorporated herein by reference in its entirety.