Claims
- 1. A method for identifying a nucleic acid or a polypeptide sequence that may be a target for a drug comprising the following steps:
- 2. A method for identifying a nucleic acid or a polypeptide sequence that may be essential for the growth or viability of an organism comprising the following steps:
- 3. The method of claim 1 or claim 2, wherein the drug is an anti-microbial drug.
- 4. The method of claim 1 or claim 2, wherein the first nucleic acid or a polypeptide sequence is derived from a pathogen.
- 5. The method of claim 4, wherein the pathogen is a microorganism.
- 6. The method of claim 1 or claim 2, wherein the microorganism is Mycobacterium tuberculosis (MTB).
- 7. The method of claim 1 or claim 2, wherein the plurality of sequences used to identify a second sequence comprises a database of the gene sequences of an entire genome of an organism.
- 8. The method of claim 1 or claim 2, wherein the plurality of sequences used to identify a second sequence comprises a database of the gene sequences derived from a pathogen.
- 9. The method of claim 1 or claim 2, wherein the "phylogenetic profile" method algorithm comprises
- 10. The method of claim 9, wherein the phylogenetic profile is in the form of a vector, matrix or phylogenetic tree.
- 11. The method of claim 9, comprising determining the significance of homology between the proteins by computing a probability (p) value threshold.
- 12. The method of claim 11, wherein the probability is set with respect to the value 1/NM, based on the total number of sequence comparisons that are to be performed, wherein N is the number of proteins in the first organism's genome and M in all other genomes.
- 13. The method of claim 9, wherein the presence or absence is determined by calculating an evolutionary distance.
- 14. The method of claim 13, wherein the evolutionary distance is calculated by:
- 15. The method of claim 14, wherein the conditional probability matrix is defined by a Markov process with substitution rates, over a fixed time interval.
- 16. The method of claim 14, where the conversion from an amino acid substitution matrix to a conditional probability matrix is represented by:
- 17. The method of claim 16, where Pj's are the abundances of amino acid j and are computed by solving a plurality of linear equations given by the normalization condition that:
- 18. The method of claim 1 or claim 2, wherein the "physiologic linkage" method algorithm identifies proteins and nucleic acids that participate in a common functional pathway.
- 19. The method of claim 1 or claim 2, wherein the "physiologic linkage" method algorithm identifies proteins and nucleic acids that participate in the synthesis of a common structural complex.
- 20. The method of claim 1 or claim 2, wherein the "physiologic linkage" method algorithm identifies proteins and nucleic acids that participate in a common metabolic pathway.
- 21. The method of claim 1 or claim 2, wherein the "domain fusion" method algorithm comprises
- 22. The method of claim 21, wherein the aligning is performed by an algorithm selected from the group consisting of a Smith-Waterman algorithm, Needleman-Wunsch algorithm, a BLAST algorithm, a FASTA algorithm, and a PSI-BLAST algorithm.
- 23. The method of claim 21, wherein the multiple distinct non-homologous polypeptides are obtained by translating a nucleic acid sequence from a genome database.
- 24. The method of claim 21, wherein the plurality of proteins have a known function.
- 25. The method of claim 21, wherein at least one of the multiple distinct non-homologous polypeptides has a known function.
- 26. The method of claim 21, wherein at least one of the multiple distinct non-homologous polypeptides has an unknown function.
- 27. The method of claim 21, wherein the alignment is based on the degree of homology of the multiple distinct non-homologous polypeptides to the plurality of proteins.
- 28. The method of claim 21, further comprising determining the significance of the aligned and identified second primary amino acid sequence by computing a probability (p) value threshold.
- 29. The method of claim 28, wherein the probability threshold is set with respect to the value 1/NM, based on the total number of sequence comparisons that are to be performed, wherein N is the number of proteins in a first organism's genome and M in all other genomes.
- 30. The method of claim 21, further comprising filtering excessive functional links between one first primary amino acid sequence of multiple distinct non-homologous polypeptides and an excessive number of other distinct non-homologous polypeptides for any alignment found between the first primary amino acid sequences of the distinct non-homologous polypeptides and at least one of the second primary amino acid sequences of the plurality of proteins.
- 31. A computer program product, stored on a computer-readable medium, for identifying a nucleic acid or a polypeptide sequence that may be a target for a drug, the computer program product comprising instructions for causing a computer system to be capable of:
- 32. A computer program product, stored on a computer-readable medium, for identifying a nucleic acid or a polypeptide sequence that may be essential for the growth or viability of an organism, the computer program product comprising instructions for causing a computer system to be capable of:
- 33. A computer system, comprising:
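
By way of non-limiting illustration of claims 10 to 12 above, the following sketch shows one way a phylogenetic profile in vector form might be built using a significance threshold of 1/NM. It is a minimal sketch under stated assumptions, not the claimed method itself: the threshold is taken to be exactly 1/(N x M), the best homology p-value per genome is assumed to have been computed elsewhere, and the function names, genome labels and numbers are hypothetical.

```python
from typing import Dict, List


def significance_threshold(n_first_genome: int, m_other_genomes: int) -> float:
    """Return 1/(N*M), where N is the number of proteins in the first
    organism's genome and M the number in all other genomes.
    Assumption: the threshold is set equal to 1/NM, not merely scaled by it."""
    if n_first_genome <= 0 or m_other_genomes <= 0:
        raise ValueError("N and M must be positive")
    return 1.0 / (n_first_genome * m_other_genomes)


def phylogenetic_profile(
    best_p_values: Dict[str, float],
    genomes: List[str],
    threshold: float,
) -> List[int]:
    """Build a profile vector with one entry per genome:
    1 if a homolog was found with p-value below the threshold, else 0.
    Genomes with no reported hit are treated as p = 1.0 (absent)."""
    return [1 if best_p_values.get(g, 1.0) < threshold else 0 for g in genomes]


if __name__ == "__main__":
    # Hypothetical sizes: 4,000 proteins in the first genome, 20,000 elsewhere.
    threshold = significance_threshold(4000, 20000)
    # Hypothetical best p-values for one protein against three genomes.
    hits = {"genome_A": 1e-12, "genome_B": 1e-3}
    print(phylogenetic_profile(hits, ["genome_A", "genome_B", "genome_C"], threshold))
```

In this sketch, proteins whose profile vectors are identical or similar across genomes would be candidates for a functional link; how similarity between vectors is scored is left open here.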
Cross Reference to Related Applications
[0001] The present application is a continuation-in-part application ("CIP") of Patent Cooperation Treaty (PCT) International Application Serial No. PCT/US00/02246, filed in the U.S. receiving office on January 28, 2000, and this application claims the benefit of priority under 35 U.S.C. § 119(e) of U.S. Provisional Application Nos. 60/165,124 and 60/165,086, both filed November 12, 1999, and U.S. Provisional Application No. 60/179,531, filed February 1, 2000. International Application Serial No. PCT/US00/02246 claims the benefit of priority under 35 U.S.C. § 119(e) of U.S. Provisional Application Serial No. 60/117,844, filed January 29, 1999, U.S. Provisional Application Serial No. 60/118,206, filed February 1, 1999, U.S. Provisional Application Serial No. 60/126,593, filed March 26, 1999, U.S. Provisional Application Serial No. 60/134,093, filed May 14, 1999, and U.S. Provisional Application Serial No. 60/134,092, filed May 14, 1999. Each of the aforementioned applications is explicitly incorporated herein by reference in its entirety and for all purposes.
Provisional Applications (3)

| Number | Date | Country |
| --- | --- | --- |
| 60/165,124 | Nov 1999 | US |
| 60/179,531 | Feb 2000 | US |
| 60/165,086 | Nov 1999 | US |

Continuation in Parts (1)

|  | Number | Date | Country |
| --- | --- | --- | --- |
| Parent | PCT/US00/02246 | Jan 2000 | US |
| Child | 09712363 | Nov 2000 | US |