Claims
- 1. A method for detecting genes which act together in a coordinated manner and are clustered together in a genome, said method comprising the steps of:
a) preparing, from isolated genomic DNA, a large-insert library of DNA fragments; b) determining the DNA sequence of at least part of some of the fragments in the large-insert library to form a plurality of Gene Sequence Tags (GSTs); c) comparing, under computer control, the DNA sequence of the GST with sequences in a database containing genes, gene fragments, DNA sequences or amino acid sequences known to be part of a cluster of genes that act together in a coordinated manner and that are clustered together on a chromosome to identify a GST that has similar structure to a gene, gene fragment, DNA sequence or amino acid sequence known to be part of a cluster of genes that act together in a coordinated manner; and d) using the GST having similar structure to a gene, gene fragment, DNA sequence or amino acid sequence known to be part of a cluster of genes that act together in a coordinated manner to detect a DNA fragment from the large-insert library, which DNA fragment from the large insert library contains the GST and genes which act together in a coordinated manner and are clustered together on a chromosome.
- 2. A method for detecting genes which act together in a coordinated manner and are clustered together in a genome, said method comprising the steps of:
a) preparing, from isolated genomic DNA, a small insert library of DNA fragments of the genomic DNA and a large insert library of DNA fragments of the genomic DNA; b) determining the DNA sequence of at least part of some of the fragments in the small insert library to form a plurality of Gene Sequence Tags (GSTs); c) comparing, under computer control, the DNA sequence of the GSTs or the amino acid sequence corresponding to the DNA sequence of the GSTs with sequences in a database containing genes, gene fragments, DNA, or amino acid sequences known to be part of a cluster of genes that act together in a coordinated manner and are clustered together on a chromosome to identify a GST that has similar structure to a gene, gene fragment, DNA or amino acid sequence known to be part of a cluster of genes that act together in a coordinated manner; and d) using the GST having similar structure to a gene, gene fragment, DNA or amino acid sequence known to be part of a cluster of genes that act together in a coordinated manner to detect a DNA fragment from the large insert library, which DNA fragment from the large insert library contains the GST and genes which act together in a coordinated manner and are clustered together on a chromosome.
- 3. The method of claim 2 wherein step d) involves identifying, from the small insert library, the DNA fragment containing the GST having similar structure to a gene, gene fragment, DNA or amino acid sequence known to be part of a cluster of genes that act together in a coordinated manner, and using the DNA fragment of the small insert library or a portion thereof as a hybridization probe to screen the large insert library to detect a large insert DNA fragment containing genes that act together in a coordinated manner.
- 4. The method according to claim 2 comprising the further step of:
e) determining the sequence of the large insert DNA fragment from step d).
- 5. The method according to claim 2 wherein step b) further comprises the additional step of translating the DNA sequence of the GSTs to generate corresponding amino acid sequences, and wherein in step c) comparing is done on the basis of the amino acid sequence.
- 6. The method according to claim 2 wherein in step c) the identification of GSTs that have similar structure to genes, gene fragments, DNA or amino acid sequences known to be part of a cluster of genes that act together in a coordinated manner is done by computer assisted homology analysis.
- 7. The method according to claim 2 wherein the genomic DNA is obtained from a microorganism.
- 8. The method according to claim 7, wherein the microorganism is a prokaryotic microorganism.
- 9. The method according to claim 8 wherein the microorganism is of a genus selected from Nocardia, Geodermatophilus, Actinoplanes, Micromonospora, Nocardioides, Saccharothrix, Amycolatopsis, Kutzneria, Saccharomonospora, Saccharopolyspora, Kitasatospora, Streptomyces, Microbispora, Streptosporangium, and Actinomadura.
- 10. The method according to claim 8 wherein the microorganism is of a genus selected from Stigmatella, Myxococcus and Polyangium.
- 11. The method according to claim 2 wherein the genomic DNA is drawn from a population of uncultured microorganisms.
- 12. The method according to claim 2 wherein the genomic DNA is derived from a cultured microorganism.
- 13. The method according to claim 2 wherein the DNA fragments in the small insert library are between about 1.5 kilobase pairs (kbp) and about 10 kbp.
- 14. The method according to claim 13, wherein the DNA fragments in the small insert library are between about 1.5 kbp and about 5 kbp.
- 15. The method according to claim 13, wherein the DNA fragments in the small insert library are between about 1.5 kbp and about 3 kbp.
- 16. The method according to claim 2, wherein the DNA fragments in the large insert library are between about 10 kbp and about 300 kbp.
- 17. The method according to claim 16 wherein the DNA fragments in the large insert library are between about 30 kbp to about 50 kbp.
- 18. The method according to claim 2 wherein the genes which act together in a coordinated manner and are clustered together in a genome are associated with a pathogenicity island.
- 19. The method according to claim 2 wherein the genes which act together in a coordinated manner and are clustered together in a genome are associated with degradation of a compound.
- 20. The method according to claim 2 wherein the genes which act together in a coordinated manner and are clustered together in a genome are associated with conferring resistance to a therapeutic drug.
- 21. A high-throughput method for identifying a gene or gene cluster involved in the biosynthesis of a microbial natural product comprising:
a) preparing, from isolated genomic DNA, a large insert library of DNA fragments of about 30 kbp to about 300 kbp; b) determining the DNA sequence of at least part of some of the fragments in the small insert library to form a plurality of Gene Sequence Tags (GSTs); c) comparing, under computer control, the DNA sequence of the GSTs or the amino acid sequence corresponding to the GSTs with sequences in a database containing genes, gene fragments, DNA sequences or amino acid sequences known to be involved in the biosynthesis of microbial natural products to identify a GST that has a similar structure to a gene, gene fragment, DNA sequence or amino acid sequence known to be involved in the biosynthesis of microbial natural products; and d) using the GST having similar structure to a gene, gene fragment, DNA or amino acid sequence known to be involved in the biosynthesis of microbial natural products, or portions thereof, to identify a DNA fragment from the large insert library, which DNA fragment contains the GST and a gene or gene cluster involved in the biosynthesis of a microbial natural product.
- 22. A high throughput method for identifying a gene or gene cluster involved in the biosynthesis of a microbial natural product comprising:
a) preparing, from isolated genomic DNA, a small insert library of DNA fragments of the genomic DNA and a large insert library of DNA fragments of the genomic DNA; b) determining the DNA sequence of at least part of some of the fragments in the small insert library to form a plurality of Gene Sequence Tags (GSTs); c) comparing, under computer control, the DNA sequence of the GSTs or the amino acid sequence corresponding to the DNA sequence of the GSTs with sequences in a database containing genes, gene fragments, DNA or amino acid sequences known to be involved in the biosynthesis of microbial natural products to identify a GST that has a similar structure to a gene, gene fragment, DNA or amino acid sequence known to be involved in the biosynthesis of microbial natural products; and d) using the GST having similar structure to a gene, gene fragment, DNA or amino acid sequence known to be involved in the biosynthesis of microbial natural products, or portions thereof, to identify a DNA fragment from the large insert library, which DNA fragment contains the GST and a gene or gene cluster involved in the biosynthesis of a microbial natural product.
- 23. The method according to claim 22 wherein the microorganism from which the genomic DNA was obtained was not known to produce the natural product biosynthesis of which involves the gene cluster identified.
- 24. The method according to claim 22 wherein the gene or gene cluster is involved in the biosynthesis of an enediyne.
- 25. The method according to claim 22 wherein the gene or gene cluster is involved in the biosynthesis of a lipopeptide.
- 26. The method according to claim 22 wherein the gene or gene cluster is involved in the biosynthesis of a macrolide.
- 27. The method according to claim 22 wherein the gene or gene cluster is a polyketide synthase gene or a cluster of genes including a polyketide synthase gene.
- 28. The method according to claim 27 wherein the polyketide synthase gene is a modular Type 1 polyketide synthase gene.
- 29. The method according to claim 22 wherein the gene or gene cluster is involved in the biosynthesis of an orthosomycin compound.
- 30. The method of claim 29 wherein the orthosomycin is an everninomicin compound or an avilamycin compound.
- 31. The method according to claim 22 wherein the gene or gene cluster is involved in the biosynthesis of a glycosylated lipopeptide or an acidic lipopeptide.
- 32. The method according to claim 22 wherein the gene or gene cluster is involved in the biosynthesis of a benzodiazepine compound.
- 33. A method for scanning the genome of a microorganism to identify a gene cluster involved in the biosynthesis of a lipopeptide, said method comprising:
a) providing genomic DNA from a microorganism; b) preparing a randomly generated small insert library of DNA fragments of about 1.5 kbp to about 10 kbp of the genomic DNA, and a randomly generated large insert library of DNA fragments of the genomic DNA of about 10 kbp to about 300 kbp; c) sequencing at least part of some of the fragments in the small insert library to form a plurality of Gene Sequence Tags (GSTs) of about 300 base pairs (bp) to about 700 bp, translating the DNA sequences of the GSTs into the corresponding amino acid sequence and providing the amino acid sequence of the GSTs in computer readable form; d) comparing, under computer control, the amino acid sequences of the GSTs with sequences in a database containing amino acid sequences known to be involved in the biosynthesis of lipopeptides to identify a GST that has a similar structure to an amino acid sequence known to be involved in the biosynthesis of lipopeptides; and e) using the GST of step d) as a hybridization probe to screen the large insert library of genomic DNA to detect a DNA fragment containing a gene cluster involved in the biosynthesis of a lipopeptide.
- 34. A method for scanning the genome of a microorganism to identify a gene cluster involved in the biosynthesis of an enediyne, said method comprising:
a) providing genomic DNA from a microorganism; b) preparing a randomly generated small insert library of DNA fragments of about 1.5 kbp to about 10 kbp of the genomic DNA, and a randomly generated large insert library of DNA fragments of the genomic DNA of about 10 kbp to about 300 kbp; c) sequencing at least part of some of the fragments in the small insert library to form a plurality of gene sequence tags (GSTs) of about 300 bp to about 700 bp, translating the DNA sequence of the GSTs into the corresponding amino acid sequence and providing the amino acid sequences of the GSTs in computer readable form; d) comparing, under computer control, the amino acid sequences of the GSTs with sequences in a database containing amino acid sequences known to be involved in the biosynthesis of enediynes to identify a GST that has a similar structure to an amino acid sequence known to be involved in the biosynthesis of enediynes; and e) using the GST of step d) as a hybridization probe to screen the large insert library of genomic DNA to detect a DNA fragment containing a gene cluster involved in the biosynthesis of an enediyne.
- 35. A method for scanning the genome of a microorganism to identify a gene cluster involved in the biosynthesis of an orthosomycin, said method comprising:
a) providing genomic DNA from a microorganism; b) preparing a randomly generated small insert library of DNA fragments of about 1.5 kbp to about 10 kbp of the genomic DNA, and a randomly generated large insert library of DNA fragments of the genomic DNA of about 10 kbp to about 300 kbp; c) sequencing at least part of the fragments in the small insert library to form a plurality of gene sequence tags (GSTs) of about 300 bp to about 700 bp, translating the DNA sequence of the GSTs into the corresponding amino acid sequence and providing the amino acid sequences of the GSTs in computer readable form; d) comparing, under computer control, the amino acid sequences of the GSTs with sequences in a database containing amino acid sequences known to be involved in the biosynthesis of orthosomycins to identify a GST that has a similar structure to an amino acid sequence known to be involved in the biosynthesis of orthosomycins; and e) using the GST of step d) as a hybridization probe to screen the large insert library of genomic DNA to detect a DNA fragment containing a gene cluster involved in the biosynthesis of an orthosomycin.
- 36. A method for scanning the genome of a microorganism to identify a polyketide synthase gene or a gene cluster including a polyketide synthase gene, said method comprising:
a) providing genomic DNA from a microorganism; b) preparing a randomly generated small insert library of DNA fragments of about 1.5 kbp to about 10 kbp of the genomic DNA, and a randomly generated large insert library of DNA fragments of the genomic DNA of about 10 kbp to about 300 kbp; c) sequencing at least part of some of the fragments in the small insert library to form a plurality of Gene Sequence Tags (GSTs) of about 300 base pairs (bp) to about 700 bp, translating the DNA sequences of the GSTs into the corresponding amino acid sequence and providing the amino acid sequence of the GSTs in computer readable form; d) comparing, under computer control, the amino acid sequences of the GSTs with sequences in a database containing amino acid sequences known to be associated with a polyketide synthase to identify a GST that has a similar structure to an amino acid sequence known to be associated with a polyketide synthase; and e) using the GST of step d) as a hybridization probe to screen the large insert library of genomic DNA to detect a DNA fragment containing a polyketide synthase gene or a gene cluster including a polyketide synthase gene.
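Illustrative note (not part of the claims): the translating and comparing steps recited in claims 5 and 33 to 36 can be sketched in code. The following is a minimal sketch only, assuming the standard genetic code and using a simple shared-tripeptide score as a stand-in for the computer assisted homology analysis of claim 6. The function names, the scoring scheme, the default threshold of 0.5 and the toy reference sequence in the usage note are all hypothetical and do not come from the claims.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order (assumption:
# the standard table is adequate for the organism being scanned).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(dna):
    """Return the reverse complement of a DNA string."""
    return dna.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def six_frame_translations(dna):
    """Translate a GST in three forward and three reverse reading frames."""
    dna = dna.upper()
    frames = []
    for strand in (dna, reverse_complement(dna)):
        for offset in range(3):
            codons = (strand[i:i + 3] for i in range(offset, len(strand) - 2, 3))
            # Codons containing ambiguous bases are rendered as "X".
            frames.append("".join(CODON_TABLE.get(c, "X") for c in codons))
    return frames


def kmers(seq, k=3):
    """Return the set of length-k substrings of a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def similarity(query, reference, k=3):
    """Fraction of the query's k-mers that also occur in the reference."""
    q, r = kmers(query, k), kmers(reference, k)
    return len(q & r) / len(q) if q else 0.0


def best_hit(gst_dna, database, threshold=0.5, k=3):
    """Return (name, score) of the best database match across six frames,
    or (None, score) when the best score falls below the threshold."""
    best = (None, 0.0)
    for frame in six_frame_translations(gst_dna):
        for name, protein in database.items():
            score = similarity(frame, protein, k)
            if score > best[1]:
                best = (name, score)
    return best if best[1] >= threshold else (None, best[1])
```

As a usage example, calling `best_hit("ATGAAAACCGCGTATATTGCGAAACAGCGT", {"toy_reference": "MKTAYIAKQR"})` compares each of the six translated frames with the single toy reference entry. A GST flagged this way would then serve as the probe for screening the large insert library in the final step of the claimed methods; a real workflow would more likely use an alignment-based search tool than this simplified score.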
Parent Case Info
[0001] This application is a continuation-in-part of U.S. Ser. No. 09/910,813 filed Jul. 24, 2001; and a continuation-in-part of U.S. Ser. No. 10/152,886 filed May 21, 2002 which claims benefit of provisional application No. 60/291,959 filed May 21, 2001 and U.S. Ser. No. 60/334,604 filed Dec. 3, 2001; and a continuation-in-part of U.S. Ser. No. 09/976,059 filed Oct. 15, 2001 which claims benefit of provisional applications 60/239,924 filed Oct. 13, 2000 and U.S. Ser. No. 60/233,296 filed Apr. 12, 2001; and a continuation-in-part of U.S. Ser. No. 10/205,032 filed Jul. 26, 2002 which claims the benefit of provisional application U.S. Ser. No. 60/307,629, filed Jul. 26, 2001; and a continuation-in-part of U.S. Ser. No. 10/132,134 filed Apr. 26, 2002 which claims benefit of provisional application No. 60/286,346 filed Apr. 26, 2001, each of which is hereby incorporated by reference in its entirety including any drawings, and from each of which priority is claimed. This application claims benefit under 35 U.S.C. 119 of provisional application U.S. Ser. No. 60/372,789 filed on Apr. 17, 2002, which is also incorporated by reference in its entirety.
Provisional Applications (8)
| Number | Date | Country |
| 60291959 | May 2001 | US |
| 60334604 | Dec 2001 | US |
| 60239924 | Oct 2000 | US |
| 60307629 | Jul 2001 | US |
| 60286346 | Apr 2001 | US |
| 60296744 | Jun 2001 | US |
| 60372789 | Apr 2002 | US |
| 60342133 | Dec 2001 | US |
Continuation in Parts (6)
| Relation | Number | Date | Country |
| Parent | 09910813 | Jul 2001 | US |
| Child | 10232370 | Sep 2002 | US |
| Parent | 10152886 | May 2002 | US |
| Child | 10232370 | Sep 2002 | US |
| Parent | 09976059 | Oct 2001 | US |
| Child | 10232370 | Sep 2002 | US |
| Parent | 10205032 | Jul 2002 | US |
| Child | 10232370 | Sep 2002 | US |
| Parent | 10132134 | Apr 2002 | US |
| Child | 10232370 | Sep 2002 | US |
| Parent | 10166087 | Jun 2002 | US |
| Child | 10232370 | Sep 2002 | US |