Claims
- 1. A method of identifying at least one protein-protein relationship comprising:
a. compiling a database of sequences; b. comparing a reference sequence to at least one sequence in the database; c. identifying conserved residues between the reference sequence and at least one sequence in the database; d. comparing the conserved residues between the reference sequence and at least one sequence in the database; and e. identifying the protein-protein relationship based on the comparison.
- 2. The method of claim 1, wherein the database contains nucleic acid sequences.
- 3. The method of claim 1, wherein the database contains amino acid sequences.
- 4. The method of claim 1, wherein the database contains open reading frame sequences.
- 5. The method of claim 1, wherein the database contains open reading frame sequences from prokaryotes and eukaryotes.
- 6. The method of claim 1, wherein the database contains open reading frame sequences from bacteria.
- 7. The method of claim 1, wherein the database contains open reading frame sequences from E. coli.
- 8. The method of claim 1, wherein comparing the reference sequence to the database includes the algorithm BLAST, FASTA or its equivalent.
- 9. The method of claim 1, wherein the identifying of conserved residues includes the algorithm ClustalW, PileUp or its equivalent.
- 10. The method of claim 1, wherein the identifying of conserved residues includes a pairwise comparison of the reference sequence and the database sequences.
- 11. The method of claim 10, wherein the identifying of conserved residues further comprises scoring the conserved residues using BLOSUM, PAM, Dayhoff or its equivalent.
- 12. The method of claim 1, wherein the comparing of conserved residues includes measuring Euclidean distances.
- 13. The method of claim 1, wherein the comparing of conserved residues includes measuring absolute correlation of the conserved residues.
- 14. A method of identifying at least one protein-protein relationship comprising:
a. compiling a database of sequences; b. comparing a reference sequence to at least one sequence in the database; c. identifying conserved residues between the reference sequence and at least one sequence in the database; d. comparing the conserved residues between the reference sequence and at least one sequence in the database; e. grouping the conserved residues; and f. identifying the protein-protein relationship based on the grouping.
- 15. The method of claim 14, wherein the database contains nucleic acid sequences.
- 16. The method of claim 14, wherein the database contains amino acid sequences.
- 17. The method of claim 14, wherein the database contains open reading frame sequences.
- 18. The method of claim 14, wherein the database contains open reading frame sequences from prokaryotes and eukaryotes.
- 19. The method of claim 14, wherein the database contains open reading frame sequences from bacteria.
- 20. The method of claim 14, wherein the database contains open reading frame sequences from E. coli.
- 21. The method of claim 14, wherein comparing the reference sequence to the database includes the algorithm BLAST, FASTA or its equivalent.
- 22. The method of claim 14, wherein the identifying of conserved residues includes the algorithm ClustalW, PileUp or its equivalent.
- 23. The method of claim 14, wherein the identifying of conserved residues includes a pairwise comparison of the reference sequence and the database sequences.
- 24. The method of claim 23, wherein the identifying of conserved residues further comprises scoring the residues using BLOSUM, PAM, Dayhoff or its equivalent.
- 25. The method of claim 14, wherein the comparing of conserved residues includes measuring Euclidean distances.
- 26. The method of claim 14, wherein the comparing of conserved residues includes measuring absolute correlation of the conserved residues.
- 27. The method of claim 14, wherein the grouping includes combining based on Euclidean distance and absolute correlation measurements of the conserved residues.
- 28. A method of identifying at least one protein-protein relationship comprising:
a. compiling a database of sequences; b. comparing a reference sequence to at least one sequence in the database; c. identifying conserved residues between the reference sequence and at least one sequence in the database; d. forming a positional vector containing the conserved residues; e. grouping the positional vectors into evolutionary clusters; f. compiling an evolutionary profile based on the evolutionary clusters; and g. identifying the protein-protein relationship based on the evolutionary profiles.
- 29. The method of claim 28, wherein the database contains nucleic acid sequences.
- 30. The method of claim 28, wherein the database contains amino acid sequences.
- 31. The method of claim 28, wherein the database contains open reading frame sequences.
- 32. The method of claim 28, wherein the database contains open reading frame sequences from prokaryotes and eukaryotes.
- 33. The method of claim 28, wherein the database contains open reading frame sequences from bacteria.
- 34. The method of claim 28, wherein the database contains open reading frame sequences from E. coli.
- 35. The method of claim 28, wherein comparing the reference sequence to the database includes the algorithm BLAST, FASTA or its equivalent.
- 36. The method of claim 28, wherein the identifying of conserved residues includes the algorithm ClustalW, PileUp or its equivalent.
- 37. The method of claim 28, wherein the identifying of conserved residues includes a pairwise comparison of the reference sequence and the database sequence.
- 38. The method of claim 37, wherein the identifying of conserved residues further comprises scoring the residues using BLOSUM, PAM, Dayhoff or its equivalent.
- 39. The method of claim 28, wherein the forming of positional vectors includes compiling conserved residues at each position within the reference sequence.
- 40. The method of claim 28, wherein the grouping of positional vectors includes measuring Euclidean distances.
- 41. The method of claim 28, wherein the grouping of positional vectors includes measuring absolute correlation of conserved residues.
- 42. The method of claim 28, wherein the grouping includes combining positional vectors based on Euclidean distances and absolute correlation of conserved residues.
- 43. The method of claim 28, wherein the compiling of evolutionary profiles includes a pairwise comparison of each position of the evolutionary cluster.
- 44. The method of claim 43, further comprising using the algorithm BLOSUM, PAM, Dayhoff or its equivalent.
- 45. A method of identifying at least one protein-protein relationship comprising:
a. compiling a database of sequences; b. comparing a reference sequence to at least one sequence in the database; c. identifying conserved residues between the reference sequence and at least one sequence in the database; d. compiling the conserved residues across the reference sequence and the database sequences into a positional vector; e. calculating a score for each positional vector; f. grouping the positional vectors into evolutionary clusters based on the score; g. comparing each conserved residue between the reference sequence and at least one sequence in the database of the evolutionary cluster; h. forming an evolutionary profile based on the evolutionary clusters; and i. based on comparing each evolutionary profile, identifying the protein-protein relationship.
- 46. The method of claim 45, wherein the database contains nucleic acid sequences.
- 47. The method of claim 45, wherein the database contains amino acid sequences.
- 48. The method of claim 45, wherein the database contains open reading frame sequences.
- 49. The method of claim 45, wherein the database contains open reading frame sequences from prokaryotes and eukaryotes.
- 50. The method of claim 45, wherein the database contains open reading frame sequences from bacteria.
- 51. The method of claim 45, wherein the database contains open reading frame sequences from E. coli.
- 52. The method of claim 45, wherein comparing the reference sequence to the database includes the algorithm BLAST, FASTA or its equivalent.
- 53. The method of claim 45, wherein the identifying of conserved residues includes the algorithm ClustalW, PileUp or its equivalent.
- 54. The method of claim 45, wherein calculating the score for each positional vector includes a pairwise comparison of the reference sequence and the database sequences.
- 55. The method of claim 54, wherein calculating the score for each positional vector further comprises comparing conserved residues using BLOSUM, PAM, Dayhoff or its equivalent.
- 56. The method of claim 45, wherein the grouping of positional vectors includes measuring Euclidean distances.
- 57. The method of claim 45, wherein the grouping of positional vectors includes measuring absolute correlation of the conserved residues.
- 58. The method of claim 45, wherein the grouping includes combining the positional vectors based on Euclidean distance and absolute correlation of the conserved residues.
- 59. The method of claim 45, wherein the comparing of conserved residues includes a pairwise comparison of each residue at each position of the evolutionary cluster whereby each database sequence and reference sequence is compared to each other.
- 60. The method of claim 59, further comprising using the algorithm BLOSUM, PAM, Dayhoff or its equivalent.
- 61. A method of identifying at least one protein-protein relationship comprising:
a. compiling a database of sequences; b. comparing a reference sequence to at least one sequence in the database; c. identifying conserved residues between the reference sequence and at least one sequence in the database; d. compiling the conserved residues across the reference sequence and at least one sequence in the database into a positional vector; e. calculating a score for each positional vector; f. grouping the positional vectors into evolutionary clusters based on the score; g. comparing each conserved residue between the reference sequence and at least one sequence in the database of the evolutionary cluster; h. establishing a score at each conserved residue position across the evolutionary cluster; i. forming an evolutionary profile based on the scores of the evolutionary clusters; and j. based on the evolutionary profile, identifying the protein-protein relationship.
- 62. The method of claim 61, wherein the database contains nucleic acid sequences.
- 63. The method of claim 61, wherein the database contains amino acid sequences.
- 64. The method of claim 61, wherein the database contains open reading frame sequences.
- 65. The method of claim 61, wherein the database contains open reading frame sequences from prokaryotes and eukaryotes.
- 66. The method of claim 61, wherein the database contains open reading frame sequences from bacteria.
- 67. The method of claim 61, wherein the database contains open reading frame sequences from E. coli.
- 68. The method of claim 61, wherein comparing the reference sequence to the database includes the algorithm BLAST, FASTA or its equivalent.
- 69. The method of claim 61, wherein the identifying of conserved residues includes the algorithm ClustalW, PileUp or its equivalent.
- 70. The method of claim 61, wherein calculating the score for each positional vector includes a pairwise comparison of the reference sequence and the database sequences.
- 71. The method of claim 61, wherein calculating the score for each positional vector further comprises comparing conserved residues using BLOSUM, PAM, Dayhoff or its equivalent.
- 72. The method of claim 61, wherein the grouping of positional vectors includes measuring Euclidean distances.
- 73. The method of claim 61, wherein the grouping of positional vectors includes measuring absolute correlation of the conserved residues.
- 74. The method of claim 61, wherein the grouping includes combining the positional vectors based on Euclidean distance and absolute correlation of the conserved residues.
- 75. The method of claim 61, wherein the comparing of conserved residues includes a pairwise comparison of each residue at each position of the evolutionary cluster whereby each database sequence and reference sequence is compared to each other.
- 76. The method of claim 75, further comprising using the algorithm BLOSUM, PAM, Dayhoff or its equivalent.
- 77. A method of identifying at least one protein-protein relationship comprising:
a) compiling a database of sequences; b) comparing a reference sequence to at least one sequence in the database; c) identifying conserved residues between the reference sequence and at least one sequence in the database; d) compiling conserved residues based on location in structure; e) forming an evolutionary cluster based on the compiled residues; f) comparing each conserved residue between the reference sequence and at least one sequence in the database of the evolutionary cluster; g) establishing a score at each conserved residue position across the evolutionary cluster; h) forming an evolutionary profile based on the scores of the evolutionary clusters; and i) based on the evolutionary profile, identifying the protein-protein relationship.
- 78. The method of claim 77, wherein the database contains nucleic acid sequences.
- 79. The method of claim 77, wherein the database contains amino acid sequences.
- 80. The method of claim 77, wherein the database contains open reading frame sequences.
- 81. The method of claim 77, wherein the database contains open reading frame sequences from prokaryotes and eukaryotes.
- 82. The method of claim 77, wherein the database contains open reading frame sequences from bacteria.
- 83. The method of claim 77, wherein the database contains open reading frame sequences from E. coli.
- 84. The method of claim 77, wherein comparing the reference sequence to the database includes the algorithm BLAST, FASTA or its equivalent.
- 85. The method of claim 77, wherein the identifying of conserved residues includes the algorithm ClustalW, PileUp or its equivalent.
- 86. The method of claim 77, wherein the compiling of conserved residues is based on the location between the conserved residues measured in Ångstroms.
- 87. The method of claim 86, wherein the location distance between residues is 3 to 7 Ångstroms.
- 88. The method of claim 77, wherein the comparing of conserved residues includes a pairwise comparison of each residue at each position of the evolutionary cluster whereby each database sequence and reference sequence is compared to each other.
- 89. The method of claim 88, further comprising using the algorithm BLOSUM, PAM, Dayhoff or its equivalent.
- 90. A method of identifying at least one protein-protein relationship comprising:
a. compiling a database of sequences; b. comparing a reference sequence to at least one sequence in the database; c. identifying at least one segment of a sequence within a set of sequences of the database; d. assembling a set of segments to create an assembled-sequence; e. identifying conserved residues between the reference sequence and at least one assembled-sequence in a set of assembled-sequences; f. comparing the conserved residues between the reference sequence and at least one assembled-sequence in the set of assembled-sequences; and g. identifying the protein-protein relationship based on the comparison.
- 91. A system of identifying at least one protein-protein relationship comprising:
a database of a plurality of sequences; a Comparison module used to compare a reference sequence to at least one sequence in the database of sequences; an Identification module to identify conserved residues between the reference sequence and at least one sequence in the database of sequences; a Calculation module used to compare the conserved residues between the reference sequence and at least one sequence in the database of sequences; and a Selector module to identify the protein-protein relationship based on the comparison.
- 92. The system of claim 91, wherein the database contains nucleic acid sequences.
- 93. The system of claim 91, wherein the database contains amino acid sequences.
- 94. The system of claim 91, wherein the database contains open reading frame sequences.
- 95. The system of claim 91, wherein the database contains open reading frame sequences from prokaryotes and eukaryotes.
- 96. The system of claim 91, wherein the database contains open reading frame sequences from bacteria.
- 97. The system of claim 91, wherein the database contains open reading frame sequences from E. coli.
- 98. The system of claim 91, wherein the comparison module further comprises using the algorithm BLAST, FASTA or its equivalent.
- 99. The system of claim 91, wherein the identification module further comprises identifying of conserved residues using the algorithm ClustalW, PileUp or its equivalent.
- 100. The system of claim 91 further comprising a profiler module to calculate an evolutionary profile.
- 101. The system of claim 91 further comprising a storage module to store evolutionary profiles.
- 102. A computer readable medium, which when executed by a microprocessor, performs a method of identifying at least one protein-protein relationship comprising:
a. compiling a database of sequences; b. comparing a reference sequence to at least one sequence in the database; c. identifying conserved residues between the reference sequence and at least one sequence in the database; d. comparing the conserved residues between the reference sequence and at least one sequence in the database; and e. identifying the protein-protein relationship based on the comparison.
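
By way of illustration only, the positional-vector, Euclidean distance, absolute correlation, and grouping steps recited in the claims above might be sketched as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: the identity-based `score` function stands in for a BLOSUM, PAM, or Dayhoff matrix; the thresholds `max_dist` and `min_corr` are hypothetical; joining positions when either measure passes its threshold is one possible reading of "combining"; and all function names are invented for the example. The input sequences are assumed to be already aligned to the reference.

```python
import math
from itertools import combinations


def score(a, b):
    """Toy residue score (assumption): 1.0 for identity, 0.0 otherwise.

    A real pipeline would look the pair up in a substitution matrix.
    """
    return 1.0 if a == b and a != "-" else 0.0


def positional_vectors(reference, aligned):
    """Build one vector per reference position.

    Entry k of vector i scores residue i of aligned sequence k
    against residue i of the reference.
    """
    return [[score(reference[i], seq[i]) for seq in aligned]
            for i in range(len(reference))]


def euclidean(u, v):
    """Euclidean distance between two positional vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


def abs_correlation(u, v):
    """Absolute Pearson correlation; 0.0 if either vector is constant."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((x - mu) * (y - mv) for x, y in zip(u, v))
    su = math.sqrt(sum((x - mu) ** 2 for x in u))
    sv = math.sqrt(sum((y - mv) ** 2 for y in v))
    if su == 0 or sv == 0:
        return 0.0
    return abs(cov / (su * sv))


def group_positions(vectors, max_dist=0.5, min_corr=0.9):
    """Single-linkage grouping of positions via union-find.

    Two positions are linked when their vectors are close in Euclidean
    distance OR strongly (anti-)correlated (hypothetical thresholds).
    """
    parent = list(range(len(vectors)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i, j in combinations(range(len(vectors)), 2):
        if (euclidean(vectors[i], vectors[j]) <= max_dist
                or abs_correlation(vectors[i], vectors[j]) >= min_corr):
            parent[find(j)] = find(i)

    clusters = {}
    for i in range(len(vectors)):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


# Positions 0 and 2 vary in opposite patterns; positions 1 and 3 never vary.
vectors = positional_vectors("ACDE", ["ACEE", "ACEE", "GCDE", "GCDE"])
clusters = group_positions(vectors)
```

In this toy alignment, positions 0 and 2 are perfectly anti-correlated and so fall into one group through the absolute correlation measure, while the invariant positions 1 and 3 are grouped through their zero Euclidean distance. Scoring each group across the sequences would then yield a profile of the kind the later claims recite.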
CROSS-REFERENCE TO RELATED APPLICATION
[0001] This application claims priority under 35 U.S.C. §119 to U.S. Provisional Application No. 60/300,586 entitled “Characterizing Nucleic Acid and Amino Acid Sequences In Silico” filed Jun. 22, 2001, the entire content of which is hereby incorporated by reference in its entirety for all purposes.
Provisional Applications (1)

| Number | Date | Country |
|---|---|---|
| 60300586 | Jun 2001 | US |