Claims
- 1. A method of integrating a plurality of biological/chemical databases, each of which includes records for a plurality of biological/chemical objects, the method comprising:
identifying a set of records in the plurality of biological/chemical databases that relates to a single biological/chemical object; establishing an entity in a data structure that corresponds to the single biological/chemical object, the entity including a plurality of aliases, a respective one of which refers to a respective record in the set of records in the plurality of biological/chemical databases; and repeatedly performing the identifying and the establishing for a plurality of sets of records in the plurality of biological/chemical databases to establish a plurality of entities in the data structure.
- 2. A method according to claim 1 further comprising:
linking the plurality of entities in the data structure based upon relationships therebetween to provide an entity-relationship model of the plurality of biological/chemical databases.
- 3. A method according to claim 2 further comprising:
traversing the plurality of entities that are linked in the entity-relationship model in response to a query to thereby obtain query results that are based on the entity-relationship model of the plurality of biological/chemical databases.
- 4. A method according to claim 3 wherein the traversing comprises:
traversing the plurality of entities that are linked in the entity-relationship model from a starting entity to an ending entity in response to a query that specifies the starting entity and the ending entity to thereby identify relationships between the starting entity and the ending entity that are based on the entity-relationship model of the plurality of biological/chemical databases.
- 5. A method according to claim 3 wherein the traversing comprises:
traversing the plurality of entities that are linked in the entity-relationship model from a starting entity to a plurality of ending entities in response to a query that specifies the starting entity to thereby identify relationships between the starting entity and the plurality of ending entities that are based on the entity-relationship model of the plurality of biological/chemical databases.
- 6. A method according to claim 3 wherein the traversing comprises:
traversing the plurality of entities that are linked in the entity-relationship model in response to a query and in response to at least one path rule to thereby obtain query results that are based on the entity-relationship model of the plurality of biological/chemical databases.
- 7. A method according to claim 6 wherein the at least one path rule specifies a type of path to use in traversing through the plurality of entities, a type of path not to use in traversing through the plurality of entities, a type of ending entity that can be included in the query results, a type of ending entity that is not to be included in the query results, a type or class of relationship to be used in traversing through the plurality of entities, a type or class of relationship that is not to be used in traversing through the plurality of entities and/or a confidence level to be achieved in traversing through the plurality of entities.
- 8. A method according to claim 6 further comprising storing the query and the path rule for reuse.
- 9. A method according to claim 2 further comprising:
storing the query results that are based on the entity-relationship model of the plurality of biological/chemical databases as at least one new relationship in the entity-relationship model of the plurality of biological/chemical databases to thereby store knowledge that was derived from the query in the entity-relationship model of the plurality of biological/chemical databases.
- 10. A method according to claim 2 further comprising:
assigning a confidence level to at least one of the relationships in the entity-relationship model of the plurality of biological/chemical databases.
- 11. A method according to claim 10 further comprising:
traversing the plurality of entities that are linked in the entity-relationship model in response to a query to thereby obtain query results that are based on the entity-relationship model of the plurality of biological/chemical databases including the at least one confidence level that is assigned.
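As a rough illustration of claims 1, 2 and 10, the following Python sketch groups records from several databases into entities whose aliases refer back to the source records, and then links two entities with a relationship that carries a confidence level. The claims do not prescribe how records are matched, so the shared `object_key` field used for grouping is an assumption, as are the class names, the database names and the record identifiers.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Alias:
    """Reference to one record in one source database."""
    database: str
    record_id: str


@dataclass
class Entity:
    """One biological/chemical object, known through several aliases."""
    entity_id: int
    aliases: set = field(default_factory=set)


@dataclass(frozen=True)
class Relationship:
    """Link between two entities, with an assigned confidence level."""
    source_id: int
    target_id: int
    kind: str
    confidence: float


def establish_entities(records):
    """Build one entity per object from (database, record_id, object_key) tuples.

    Grouping by `object_key` is a hypothetical stand-in for the
    identifying step; real matching logic would replace it.
    """
    entities = {}
    for database, record_id, object_key in records:
        if object_key not in entities:
            entities[object_key] = Entity(entity_id=len(entities))
        entities[object_key].aliases.add(Alias(database, record_id))
    return entities


# Hypothetical records: two databases describe the same object "obj-1".
records = [
    ("db_a", "rec-1", "obj-1"),
    ("db_b", "rec-7", "obj-1"),
    ("db_a", "rec-2", "obj-2"),
]
entities = establish_entities(records)

# Linking step: one relationship with an assumed confidence of 0.8.
link = Relationship(
    entities["obj-1"].entity_id,
    entities["obj-2"].entity_id,
    "interacts_with",
    0.8,
)
```

In this sketch the entity for `obj-1` ends up with two aliases, one per source database, while `obj-2` has a single alias.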
- 12. A method of integrating a new biological/chemical database with a plurality of biological/chemical databases, each of which includes records for a plurality of biological/chemical objects, the method comprising:
providing a data structure including a plurality of entities, a respective one of which corresponds to a single biological/chemical object, at least some of the entities including a plurality of aliases, a respective one of which refers to at least one record in a respective one of the plurality of biological/chemical databases that relates to the single biological/chemical object; identifying records in the new biological/chemical database that correspond to at least one of the entities in the data structure; and adding aliases to the at least one of the entities of the data structure that refer to the records in the new biological/chemical database to thereby integrate the new biological/chemical database into the plurality of biological/chemical databases.
- 13. A method according to claim 12 wherein the identifying comprises:
identifying a record in the new biological/chemical database that corresponds to two or more entities in the data structure; and merging the two or more entities in the data structure into a new entity that includes aliases that correspond to the records in the two or more entities in the data structure as well as the record in the new biological/chemical database that corresponds to the two or more entities in the data structure.
- 14. A method according to claim 13 wherein the new biological/chemical database is an updated version of one of the plurality of biological/chemical databases, the method further comprising:
identifying at least one record in the one of the plurality of biological/chemical databases that has been deleted from the updated version of the one of the plurality of biological/chemical databases; removing the at least one record in the one of the plurality of biological/chemical databases that has been deleted; and removing aliases that are associated with the at least one record that has been removed.
- 15. A method according to claim 14 further comprising:
splitting at least one entity in the data structure based upon the aliases that were removed.
- 16. A method according to claim 12 further comprising:
identifying records in the new biological/chemical database that do not correspond to at least one of the entities in the data structure; and adding at least one new entity to the data structure that corresponds to the records in the new biological/chemical database that do not correspond to at least one of the entities in the data structure.
- 17. A method according to claim 12 wherein the providing comprises:
providing a data structure including a plurality of entities, a respective one of which corresponds to a single biological/chemical object, at least some of the entities including a plurality of aliases, a respective one of which refers to at least one record in a respective one of the plurality of biological/chemical databases that relates to the single biological/chemical object, and further including a plurality of relationships that link the plurality of entities in the data structure based upon relationships therebetween to provide an entity-relationship model of the plurality of biological/chemical databases.
- 18. A method according to claim 16 further comprising:
linking the at least one new entity to at least one of the entities in the data structure based upon relationships therebetween to provide an entity-relationship model of the plurality of biological/chemical databases and the new biological/chemical database.
- 19. A method according to claim 17 further comprising:
traversing the plurality of entities that are linked in the entity-relationship model in response to a query to thereby obtain query results that are based on the entity-relationship model of the plurality of biological/chemical databases and the new biological/chemical database.
- 20. A method according to claim 18 further comprising:
traversing the plurality of entities that are linked in the entity-relationship model in response to a query to thereby obtain query results that are based on the entity-relationship model of the plurality of biological/chemical databases and the new biological/chemical database.
- 21. A method according to claim 19 further comprising:
storing the query results that are based on the entity-relationship model of the plurality of biological/chemical databases and the new biological/chemical database as at least one new relationship in the entity-relationship model of the plurality of biological/chemical databases and the new biological/chemical database to thereby store knowledge that was derived from the query in the entity-relationship model of the plurality of biological/chemical databases and the new biological/chemical database.
- 22. A method according to claim 12 further comprising:
maintaining an image of the data structure prior to the adding.
- 23. A method according to claim 22 further comprising:
comparing the image of the data structure prior to the adding and the data structure including the aliases, to obtain discovery.
- 24. A method according to claim 12 wherein the new biological/chemical database does not include an entity-relationship data structure.
- 25. A method according to claim 24 further comprising:
generating an entity-relationship structure for the new biological/chemical database.
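The integration steps of claims 12, 13, 16, 22 and 23 might be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: entities are modelled as a plain dictionary from entity id to a set of `(database, record_id)` aliases, the matching entity ids for each new record are supplied by the caller, and the function names `integrate` and `diff_images` are hypothetical.

```python
import copy


def integrate(entities, new_records):
    """Add aliases from a new database, merging or creating entities as needed.

    `new_records` is a list of (alias, matching_entity_ids) pairs. How the
    matching ids are determined is outside the scope of this sketch.
    """
    redirect = {}  # id of a merged-away entity -> id of the entity replacing it

    def resolve(entity_id):
        while entity_id in redirect:
            entity_id = redirect[entity_id]
        return entity_id

    next_id = max(entities, default=-1) + 1
    for alias, matches in new_records:
        targets = sorted({resolve(m) for m in matches})
        if len(targets) == 1:
            # Record corresponds to one existing entity: add an alias.
            entities[targets[0]].add(alias)
            continue
        # Zero matches creates a new entity; two or more are merged into one.
        merged = {alias}
        for target in targets:
            merged |= entities.pop(target)
            redirect[target] = next_id
        entities[next_id] = merged
        next_id += 1
    return entities


def diff_images(before, after):
    """Compare the saved image with the updated structure."""
    shared = set(before) & set(after)
    return {
        "added": sorted(set(after) - set(before)),
        "removed": sorted(set(before) - set(after)),
        "changed": sorted(e for e in shared if before[e] != after[e]),
    }


entities = {0: {("db_a", "a1")}, 1: {("db_b", "b1")}, 2: {("db_a", "a2")}}
image = copy.deepcopy(entities)  # image of the data structure prior to the adding
integrate(entities, [
    (("db_new", "n1"), [0, 1]),  # bridges entities 0 and 1, so they merge
    (("db_new", "n2"), [2]),     # extends entity 2 with a new alias
    (("db_new", "n3"), []),      # no match, so a new entity is added
])
report = diff_images(image, entities)
```

Here `report` lists entities 3 and 4 as added, 0 and 1 as removed, and 2 as changed. The merge of entities 0 and 1 is the kind of difference that the comparison of claim 23 could surface. Deleting records and splitting entities (claims 14 and 15) are not covered by this sketch.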
- 26. A method of querying a plurality of biological/chemical databases, each of which includes records for a plurality of biological/chemical objects, the method comprising:
providing a data structure including a plurality of entities that are linked in an entity-relationship model, a respective one of which corresponds to a single biological/chemical object, at least some of the entities including a plurality of aliases, a respective one of which refers to a record in a respective one of the plurality of biological/chemical databases that relates to a single biological/chemical object; and traversing the plurality of entities that are linked in the entity-relationship model in response to a query to thereby obtain query results that are based on the records in the plurality of biological/chemical databases.
- 27. A method according to claim 26 wherein the traversing comprises:
traversing the plurality of entities that are linked in the entity-relationship model from a starting entity to an ending entity in response to a query that specifies the starting entity and the ending entity to thereby identify relationships between the starting entity and the ending entity that are based on the entity-relationship model of the plurality of biological/chemical databases.
- 28. A method according to claim 26 wherein the traversing comprises:
traversing the plurality of entities that are linked in the entity-relationship model from a starting entity to a plurality of ending entities in response to a query that specifies the starting entity to thereby identify relationships between the starting entity and the plurality of ending entities that are based on the entity-relationship model of the plurality of biological/chemical databases.
- 29. A method according to claim 26 wherein the traversing comprises:
traversing the plurality of entities that are linked in the entity-relationship model in response to a query and in response to at least one path rule to thereby obtain query results that are based on the entity-relationship model of the plurality of biological/chemical databases.
- 30. A method according to claim 29 wherein the at least one path rule specifies a type of path to use in traversing through the plurality of entities, a type of path not to use in traversing through the plurality of entities, a type of ending entity that can be included in the query results, a type of ending entity that is not to be included in the query results, a type of relationship that is to be used in traversing through the plurality of entities, a type of relationship not to be used in traversing through the plurality of entities and/or a confidence level to be achieved in traversing through the plurality of entities.
- 31. A method according to claim 29 further comprising storing the query and the path rule for reuse.
- 32. A method according to claim 26 further comprising:
storing the query results that are based on the entity-relationship model of the plurality of biological/chemical databases as at least one new relationship in the entity-relationship model of the plurality of biological/chemical databases to thereby store knowledge that was derived from the query in the entity-relationship model of the plurality of biological/chemical databases.
- 33. A method according to claim 26 further comprising:
assigning a confidence level to at least one of the relationships in the entity-relationship model of the plurality of biological/chemical databases.
- 34. A method according to claim 33 further comprising:
traversing the plurality of entities that are linked in the entity-relationship model in response to a query to thereby obtain query results that are based on the entity-relationship model of the plurality of biological/chemical databases including the at least one confidence level that is assigned.
- 35. A method according to claim 26 wherein the traversing is followed by:
displaying at least some of the entities that are traversed during the traversing.
- 36. A method according to claim 35 wherein the displaying comprises:
displaying at least some of the relationships among the entities that are traversed during the traversing.
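The traversal of claims 26 to 30 can be pictured as a path search over the linked entities. The sketch below is one possible reading, assuming directed relationships stored as `(source, target, kind, confidence)` tuples and a breadth-first search. The `allowed_kinds`, `min_confidence` and `max_hops` parameters are hypothetical ways of expressing path rules, and treating the confidence level as a per-relationship minimum is an assumption rather than something the claims specify.

```python
from collections import deque


def traverse(edges, start, end=None, allowed_kinds=None,
             min_confidence=0.0, max_hops=4):
    """Return paths (lists of entity names) found from `start`.

    If `end` is given, only paths that finish at `end` are returned;
    otherwise every reachable entity yields a path.
    """
    # Apply the path rules while building the adjacency map.
    adjacency = {}
    for source, target, kind, confidence in edges:
        if allowed_kinds is not None and kind not in allowed_kinds:
            continue
        if confidence < min_confidence:
            continue
        adjacency.setdefault(source, []).append(target)

    results = []
    queue = deque([[start]])
    while queue:
        path = queue.popleft()
        node = path[-1]
        if len(path) > 1 and (end is None or node == end):
            results.append(path)
        if node == end or len(path) - 1 >= max_hops:
            continue  # stop extending at the ending entity or the hop limit
        for target in adjacency.get(node, []):
            if target not in path:  # do not revisit entities on this path
                queue.append(path + [target])
    return results


# Hypothetical relationships among three entities.
edges = [
    ("gene_x", "protein_y", "encodes", 0.95),
    ("protein_y", "compound_z", "binds", 0.6),
    ("gene_x", "compound_z", "co_cited", 0.2),
]

all_paths = traverse(edges, "gene_x", "compound_z")
confident = traverse(edges, "gene_x", "compound_z", min_confidence=0.5)
open_ended = traverse(edges, "gene_x", min_confidence=0.5)
```

With no rules, both the direct and the two-hop path from `gene_x` to `compound_z` are returned. Raising `min_confidence` to 0.5 drops the low-confidence `co_cited` relationship and leaves only the two-hop path. Omitting `end` corresponds to the query of claim 28, which specifies only the starting entity.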
- 37. A system for integrating a plurality of biological/chemical databases, each of which includes records for a plurality of biological/chemical objects, the system comprising:
means for identifying a plurality of sets of records in the plurality of biological/chemical databases, wherein a respective set of records relates to a respective single biological/chemical object; and means for establishing a plurality of entities in a data structure, wherein a respective entity corresponds to a respective one of the single biological/chemical objects, the entities including a plurality of aliases, a respective one of which refers to a respective record in the respective set of records in the plurality of biological/chemical databases.
- 38. A system according to claim 37 further comprising:
means for linking the plurality of entities in the data structure based upon relationships therebetween to provide an entity-relationship model of the plurality of biological/chemical databases.
- 39. A system according to claim 38 further comprising:
means for traversing the plurality of entities that are linked in the entity-relationship model in response to a query to thereby obtain query results that are based on the entity-relationship model of the plurality of biological/chemical databases.
- 40. A system according to claim 39 wherein the means for traversing comprises:
means for traversing the plurality of entities that are linked in the entity-relationship model from a starting entity to an ending entity in response to a query that specifies the starting entity and the ending entity to thereby identify relationships between the starting entity and the ending entity that are based on the entity-relationship model of the plurality of biological/chemical databases.
- 41. A system according to claim 39 wherein the means for traversing comprises:
means for traversing the plurality of entities that are linked in the entity-relationship model from a starting entity to a plurality of ending entities in response to a query that specifies the starting entity to thereby identify relationships between the starting entity and the plurality of ending entities that are based on the entity-relationship model of the plurality of biological/chemical databases.
- 42. A system according to claim 39 wherein the means for traversing comprises:
means for traversing the plurality of entities that are linked in the entity-relationship model in response to a query and in response to at least one path rule to thereby obtain query results that are based on the entity-relationship model of the plurality of biological/chemical databases.
- 43. A system according to claim 42 wherein the at least one path rule specifies a type of path to use in traversing through the plurality of entities, a type of path not to use in traversing through the plurality of entities, a type of ending entity that can be included in the query results, a type of ending entity that is not to be included in the query results, a type or class of relationship to be used in traversing through the plurality of entities, a type or class of relationship that is not to be used in traversing through the plurality of entities and/or a confidence level to be achieved in traversing through the plurality of entities.
- 44. A system according to claim 42 further comprising means for storing the query and the path rule for reuse.
- 45. A system according to claim 38 further comprising:
means for storing the query results that are based on the entity-relationship model of the plurality of biological/chemical databases as at least one new relationship in the entity-relationship model of the plurality of biological/chemical databases to thereby store knowledge that was derived from the query in the entity-relationship model of the plurality of biological/chemical databases.
- 46. A system according to claim 38 further comprising:
means for assigning a confidence level to at least one of the relationships in the entity-relationship model of the plurality of biological/chemical databases.
- 47. A system according to claim 46 further comprising:
means for traversing the plurality of entities that are linked in the entity-relationship model in response to a query to thereby obtain query results that are based on the entity-relationship model of the plurality of biological/chemical databases including the at least one confidence level that is assigned.
- 48. A system for integrating a new biological/chemical database with a plurality of biological/chemical databases, each of which includes records for a plurality of biological/chemical objects, the system comprising:
a data structure including a plurality of entities, a respective one of which corresponds to a single biological/chemical object, at least some of the entities including a plurality of aliases, a respective one of which refers to at least one record in a respective one of the plurality of biological/chemical databases that relates to the single biological/chemical object; means for identifying records in the new biological/chemical database that correspond to at least one of the entities in the data structure; and means for adding aliases to the at least one of the entities of the data structure that refer to the records in the new biological/chemical database to thereby integrate the new biological/chemical database into the plurality of biological/chemical databases.
- 49. A system according to claim 48 wherein the means for identifying comprises:
means for identifying a record in the new biological/chemical database that corresponds to two or more entities in the data structure; and means for merging the two or more entities in the data structure into a new entity that includes aliases that correspond to the records in the two or more entities in the data structure as well as the record in the new biological/chemical database that corresponds to the two or more entities in the data structure.
- 50. A system according to claim 49 wherein the new biological/chemical database is an updated version of one of the plurality of biological/chemical databases, the system further comprising:
means for identifying at least one record in the one of the plurality of biological/chemical databases that has been deleted from the updated version of the one of the plurality of biological/chemical databases; means for removing the at least one record in the one of the plurality of biological/chemical databases that has been deleted; and means for removing aliases that are associated with the at least one record that has been removed.
- 51. A system according to claim 50 further comprising:
means for splitting at least one entity in the data structure based upon the aliases that were removed.
- 52. A system according to claim 48 further comprising:
means for identifying records in the new biological/chemical database that do not correspond to at least one of the entities in the data structure; and means for adding at least one new entity to the data structure that corresponds to the records in the new biological/chemical database that do not correspond to at least one of the entities in the data structure.
- 53. A system according to claim 48 wherein the data structure includes a plurality of entities, a respective one of which corresponds to a single biological/chemical object, at least some of the entities including a plurality of aliases, a respective one of which refers to at least one record in a respective one of the plurality of biological/chemical databases that relates to the single biological/chemical object, and further including a plurality of relationships that link the plurality of entities in the data structure based upon relationships therebetween to provide an entity-relationship model of the plurality of biological/chemical databases.
- 54. A system according to claim 52 further comprising:
means for linking the at least one new entity to at least one of the entities in the data structure based upon relationships therebetween to provide an entity-relationship model of the plurality of biological/chemical databases and the new biological/chemical database.
- 55. A system according to claim 53 further comprising:
means for traversing the plurality of entities that are linked in the entity-relationship model in response to a query to thereby obtain query results that are based on the entity-relationship model of the plurality of biological/chemical databases and the new biological/chemical database.
- 56. A system according to claim 54 further comprising:
means for traversing the plurality of entities that are linked in the entity-relationship model in response to a query to thereby obtain query results that are based on the entity-relationship model of the plurality of biological/chemical databases and the new biological/chemical database.
- 57. A system according to claim 55 further comprising:
means for storing the query results that are based on the entity-relationship model of the plurality of biological/chemical databases and the new biological/chemical database as at least one new relationship in the entity-relationship model of the plurality of biological/chemical databases and the new biological/chemical database to thereby store knowledge that was derived from the query in the entity-relationship model of the plurality of biological/chemical databases and the new biological/chemical database.
- 58. A system according to claim 48 further comprising:
means for maintaining an image of the data structure before the aliases are added.
- 59. A system according to claim 58 further comprising:
means for comparing the image of the data structure before the aliases are added and the data structure including the aliases, to obtain discovery.
- 60. A system according to claim 48 wherein the new biological/chemical database does not include an entity-relationship data structure.
- 61. A system according to claim 60 further comprising:
means for generating an entity-relationship structure for the new biological/chemical database.
- 62. A system for querying a plurality of biological/chemical databases, each of which includes records for a plurality of biological/chemical objects, the system comprising:
a data structure including a plurality of entities that are linked in an entity-relationship model, a respective one of which corresponds to a single biological/chemical object, at least some of the entities including a plurality of aliases, a respective one of which refers to a record in a respective one of the plurality of biological/chemical databases that relates to a single biological/chemical object; and means for traversing the plurality of entities that are linked in the entity-relationship model in response to a query to thereby obtain query results that are based on the records in the plurality of biological/chemical databases.
- 63. A system according to claim 62 wherein the means for traversing comprises:
means for traversing the plurality of entities that are linked in the entity-relationship model from a starting entity to an ending entity in response to a query that specifies the starting entity and the ending entity to thereby identify relationships between the starting entity and the ending entity that are based on the entity-relationship model of the plurality of biological/chemical databases.
- 64. A system according to claim 63 wherein the means for traversing comprises:
means for traversing the plurality of entities that are linked in the entity-relationship model from a starting entity to a plurality of ending entities in response to a query that specifies the starting entity to thereby identify relationships between the starting entity and the plurality of ending entities that are based on the entity-relationship model of the plurality of biological/chemical databases.
- 65. A system according to claim 63 wherein the means for traversing comprises:
means for traversing the plurality of entities that are linked in the entity-relationship model in response to a query and in response to at least one path rule to thereby obtain query results that are based on the entity-relationship model of the plurality of biological/chemical databases.
- 66. A system according to claim 65 wherein the at least one path rule specifies a type of path to use in traversing through the plurality of entities, a type of path not to use in traversing through the plurality of entities, a type of ending entity that can be included in the query results, a type of ending entity that is not to be included in the query results, a type or class of relationship that is to be used in traversing through the plurality of entities, a type or class of relationship not to be used in traversing through the plurality of entities and/or a confidence level to be achieved in traversing through the plurality of entities.
- 67. A system according to claim 65 further comprising means for storing the query and the path rule for reuse.
- 68. A system according to claim 62 further comprising:
means for storing the query results that are based on the entity-relationship model of the plurality of biological/chemical databases as at least one new relationship in the entity-relationship model of the plurality of biological/chemical databases to thereby store knowledge that was derived from the query in the entity-relationship model of the plurality of biological/chemical databases.
- 69. A system according to claim 62 further comprising:
means for assigning a confidence level to at least one of the relationships in the entity-relationship model of the plurality of biological/chemical databases.
- 70. A system according to claim 69 further comprising:
means for traversing the plurality of entities that are linked in the entity-relationship model in response to a query to thereby obtain query results that are based on the entity-relationship model of the plurality of biological/chemical databases including the at least one confidence level that is assigned.
- 71. A system according to claim 62 further comprising:
means for displaying at least some of the entities that are traversed during the traversing.
- 72. A system according to claim 71 wherein the means for displaying comprises:
means for displaying at least some of the relationships among the entities that are traversed during the traversing.
- 73. A computer program product that is configured to integrate a plurality of biological/chemical databases, each of which includes records for a plurality of biological/chemical objects, the computer program product comprising a computer usable storage medium having computer-readable program code embodied in the medium, the computer-readable program code comprising:
computer-readable program code that is configured to identify a set of records in the plurality of biological/chemical databases that relates to a single biological/chemical object; computer-readable program code that is configured to establish an entity in a data structure that corresponds to the single biological/chemical object, the entity including a plurality of aliases, a respective one of which refers to a respective record in the set of records in the plurality of biological/chemical databases; and computer-readable program code that is configured to repeatedly access the computer-readable program code that is configured to identify and the computer-readable program code that is configured to establish, to process a plurality of sets of records in the plurality of biological/chemical databases and thereby establish a plurality of entities in the data structure.
- 74. A computer program product according to claim 73 further comprising:
computer-readable program code that is configured to link the plurality of entities in the data structure based upon relationships therebetween to provide an entity-relationship model of the plurality of biological/chemical databases.
- 75. A computer program product according to claim 74 further comprising:
computer-readable program code that is configured to traverse the plurality of entities that are linked in the entity-relationship model in response to a query to thereby obtain query results that are based on the entity-relationship model of the plurality of biological/chemical databases.
- 76. A computer program product according to claim 75 wherein the computer-readable program code that is configured to traverse comprises:
computer-readable program code that is configured to traverse the plurality of entities that are linked in the entity-relationship model from a starting entity to an ending entity in response to a query that specifies the starting entity and the ending entity to thereby identify relationships between the starting entity and the ending entity that are based on the entity-relationship model of the plurality of biological/chemical databases.
- 77. A computer program product according to claim 75 wherein the computer-readable program code that is configured to traverse comprises:
computer-readable program code that is configured to traverse the plurality of entities that are linked in the entity-relationship model from a starting entity to a plurality of ending entities in response to a query that specifies the starting entity to thereby identify relationships between the starting entity and the plurality of ending entities that are based on the entity-relationship model of the plurality of biological/chemical databases.
- 78. A computer program product according to claim 75 wherein the computer-readable program code that is configured to traverse comprises:
computer-readable program code that is configured to traverse the plurality of entities that are linked in the entity-relationship model in response to a query and in response to at least one path rule to thereby obtain query results that are based on the entity-relationship model of the plurality of biological/chemical databases.
- 79. A computer program product according to claim 78 wherein the at least one path rule specifies a type of path to use in traversing through the plurality of entities, a type of path not to use in traversing through the plurality of entities, a type of ending entity that can be included in the query results, a type of ending entity that is not to be included in the query results, a type or class of relationship to be used in traversing through the plurality of entities, a type or class of relationship that is not to be used in traversing through the plurality of entities and/or a confidence level to be achieved in traversing through the plurality of entities.
- 80. A computer program product according to claim 78 further comprising computer-readable program code that is configured to store the query and the path rule for reuse.
- 81. A computer program product according to claim 75 further comprising:
computer-readable program code that is configured to store the query results that are based on the entity-relationship model of the plurality of biological/chemical databases as at least one new relationship in the entity-relationship model of the plurality of biological/chemical databases to thereby store knowledge that was derived from the query in the entity-relationship model of the plurality of biological/chemical databases.
- 82. A computer program product according to claim 75 further comprising:
computer-readable program code that is configured to assign a confidence level to at least one of the relationships in the entity-relationship model of the plurality of biological/chemical databases.
- 83. A computer program product according to claim 82 further comprising:
computer-readable program code that is configured to traverse the plurality of entities that are linked in the entity-relationship model in response to a query to thereby obtain query results that are based on the entity-relationship model of the plurality of biological/chemical databases including the at least one confidence level that is assigned.
- 84. A computer program product that is configured to integrate a new biological/chemical database with a plurality of biological/chemical databases, each of which includes records for a plurality of biological/chemical objects, the computer program product comprising a computer usable storage medium having computer-readable program code embodied in the medium, the computer-readable program code comprising:
a data structure including a plurality of entities, a respective one of which corresponds to a single biological/chemical object, at least some of the entities including a plurality of aliases, a respective one of which refers to at least one record in a respective one of the plurality of biological/chemical databases that relates to the single biological/chemical object; computer-readable program code that is configured to identify records in the new biological/chemical database that correspond to at least one of the entities in the data structure; and computer-readable program code that is configured to add aliases to the at least one of the entities of the data structure that refer to the records in the new biological/chemical database to thereby integrate the new biological/chemical database into the plurality of biological/chemical databases.
- 85. A computer program product according to claim 84 wherein the computer-readable program code that is configured to identify comprises:
computer-readable program code that is configured to identify a record in the new biological/chemical database that corresponds to two or more entities in the data structure; and computer-readable program code that is configured to merge the two or more entities in the data structure into a new entity that includes aliases that correspond to the records in the two or more entities in the data structure as well as the record in the new biological/chemical database that corresponds to the two or more entities in the data structure.
- 86. A computer program product according to claim 85 wherein the new biological/chemical database is an updated version of one of the plurality of biological/chemical databases, the computer program product further comprising:
computer-readable program code that is configured to identify at least one record in the one of the plurality of biological/chemical databases that has been deleted from the updated version of the one of the plurality of biological/chemical databases; computer-readable program code that is configured to remove the at least one record in the one of the plurality of biological/chemical databases that has been deleted; and computer-readable program code that is configured to remove aliases that are associated with the at least one record that has been removed.
- 87. A computer program product according to claim 86 further comprising:
computer-readable program code that is configured to split at least one entity in the data structure based upon the aliases that were removed.
- 88. A computer program product according to claim 84 further comprising:
computer-readable program code that is configured to identify records in the new biological/chemical database that do not correspond to at least one of the entities in the data structure; and computer-readable program code that is configured to add at least one new entity to the data structure that corresponds to the records in the new biological/chemical database that do not correspond to at least one of the entities in the data structure.
- 89. A computer program product according to claim 84 wherein the data structure includes a plurality of entities, a respective one of which corresponds to a single biological/chemical object, at least some of the entities including a plurality of aliases, a respective one of which refers to at least one record in a respective one of the plurality of biological/chemical databases that relates to the single biological/chemical object, and further including a plurality of relationships that link the plurality of entities in the data structure based upon relationships therebetween to provide an entity-relationship model of the plurality of biological/chemical databases.
- 90. A computer program product according to claim 88 further comprising:
computer-readable program code that is configured to link the at least one new entity to at least one of the entities in the data structure based upon relationships therebetween to provide an entity-relationship model of the plurality of biological/chemical databases and the new biological/chemical database.
- 91. A computer program product according to claim 89 further comprising:
computer-readable program code that is configured to traverse the plurality of entities that are linked in the entity-relationship model in response to a query to thereby obtain query results that are based on the entity-relationship model of the plurality of biological/chemical databases and the new biological/chemical database.
- 92. A computer program product according to claim 90 further comprising:
computer-readable program code that is configured to traverse the plurality of entities that are linked in the entity-relationship model in response to a query to thereby obtain query results that are based on the entity-relationship model of the plurality of biological/chemical databases and the new biological/chemical database.
- 93. A computer program product according to claim 91 further comprising:
computer-readable program code that is configured to store the query results that are based on the entity-relationship model of the plurality of biological/chemical databases and the new biological/chemical database as at least one new relationship in the entity-relationship model of the plurality of biological/chemical databases and the new biological/chemical database to thereby store knowledge that was derived from the query in the entity-relationship model of the plurality of biological/chemical databases and the new biological/chemical database.
- 94. A computer program product according to claim 84 further comprising:
computer-readable program code that is configured to maintain an image of the data structure before the aliases are added.
- 95. A computer program product according to claim 94 further comprising:
computer-readable program code that is configured to compare the image of the data structure before the aliases are added and the data structure including the aliases, to obtain discovery.
- 96. A computer program product according to claim 84 wherein the new biological/chemical database does not include an entity-relationship data structure.
- 97. A computer program product according to claim 96 further comprising:
computer-readable program code that is configured to generate an entity-relationship structure for the new biological/chemical database.
- 98. A computer program product that is configured to query a plurality of biological/chemical databases, each of which includes records for a plurality of biological/chemical objects, the computer program product comprising a computer usable storage medium having computer-readable program code embodied in the medium, the computer-readable program code comprising:
computer-readable program code that is configured to provide a data structure including a plurality of entities that are linked in an entity-relationship model, a respective one of which corresponds to a single biological/chemical object, at least some of the entities including a plurality of aliases, a respective one of which refers to a record in a respective one of the plurality of biological/chemical databases that relates to a single biological/chemical object; and computer-readable program code that is configured to traverse the plurality of entities that are linked in the entity-relationship model in response to a query to thereby obtain query results that are based on the records in the plurality of biological/chemical databases.
- 99. A computer program product according to claim 98 wherein the computer-readable program code that is configured to traverse comprises:
computer-readable program code that is configured to traverse the plurality of entities that are linked in the entity-relationship model from a starting entity to an ending entity in response to a query that specifies the starting entity and the ending entity to thereby identify relationships between the starting entity and the ending entity that are based on the entity-relationship model of the plurality of biological/chemical databases.
- 100. A computer program product according to claim 98 wherein the computer-readable program code that is configured to traverse comprises:
computer-readable program code that is configured to traverse the plurality of entities that are linked in the entity-relationship model from a starting entity to a plurality of ending entities in response to a query that specifies the starting entity to thereby identify relationships between the starting entity and the plurality of ending entities that are based on the entity-relationship model of the plurality of biological/chemical databases.
- 101. A computer program product according to claim 98 wherein the computer-readable program code that is configured to traverse comprises:
computer-readable program code that is configured to traverse the plurality of entities that are linked in the entity-relationship model in response to a query and in response to at least one path rule to thereby obtain query results that are based on the entity-relationship model of the plurality of biological/chemical databases.
- 102. A computer program product according to claim 101 wherein the at least one path rule specifies a type of path to use in traversing through the plurality of entities, a type of path not to use in traversing through the plurality of entities, a type of ending entity that can be included in the query results, a type of ending entity that is not to be included in the query results, a type or class of relationship that is to be used in traversing through the plurality of entities, a type or class of relationship not to be used in traversing through the plurality of entities and/or a confidence level to be achieved in traversing through the plurality of entities.
- 103. A computer program product according to claim 101 further comprising computer-readable program code that is configured to store the query and the path rule for reuse.
- 104. A computer program product according to claim 98 further comprising:
computer-readable program code that is configured to store the query results that are based on the entity-relationship model of the plurality of biological/chemical databases as at least one new relationship in the entity-relationship model of the plurality of biological/chemical databases to thereby store knowledge that was derived from the query in the entity-relationship model of the plurality of biological/chemical databases.
- 105. A computer program product according to claim 98 further comprising:
computer-readable program code that is configured to assign a confidence level to at least one of the relationships in the entity-relationship model of the plurality of biological/chemical databases.
- 106. A computer program product according to claim 105 further comprising:
computer-readable program code that is configured to traverse the plurality of entities that are linked in the entity-relationship model in response to a query to thereby obtain query results that are based on the entity-relationship model of the plurality of biological/chemical databases including the at least one confidence level that is assigned.
- 107. A computer program product according to claim 98 further comprising:
computer-readable program code that is configured to display at least some of the entities that are traversed during the traversing.
- 108. A computer program product according to claim 107 wherein the computer-readable program code that is configured to display comprises:
computer-readable program code that is configured to display at least some of the relationships among the entities that are traversed during the traversing.
- 109. A bioinformatics data processing system comprising:
a data processing engine that is configured to build an entity-relationship model of a plurality of independent biological/chemical databases, each of which includes records for a plurality of biological/chemical objects, the entity-relationship model comprising:
a plurality of entities, a respective entity of which corresponds to a single biological/chemical object, at least some of the entities including a plurality of aliases, a respective one of which directly or indirectly refers to at least one record in a respective one of the plurality of biological/chemical databases that relates to the single biological/chemical object; and a plurality of relationships that link the plurality of entities in the entity-relationship model based upon relationships therebetween.
- 110. A system according to claim 109 further comprising:
a metadata database that is configured to store therein the entity-relationship model of the plurality of independent biological/chemical databases.
- 111. A system according to claim 109 further comprising:
a loader that is configured to load an independent entity-relationship model of each of the independent biological/chemical databases into the data processing engine.
- 112. A system according to claim 111 wherein the loader is configured to load an independent entity-relationship model of each of the independent biological/chemical databases into the data processing engine in a typeless format.
- 113. A system according to claim 111 in combination with the plurality of independent biological/chemical databases.
- 114. A system according to claim 109 further comprising:
a query tool that is configured to traverse the plurality of entities that are linked in the entity-relationship model in response to a query to thereby obtain query results that are based on the entity-relationship model of the plurality of biological/chemical databases.
- 115. A system according to claim 114 wherein the query tool is a Web-based query tool.
- 116. A system according to claim 109 further comprising:
a virtual experiment tool that is configured to conduct virtual experiments on the entity-relationship model of a plurality of independent biological/chemical databases.
- 117. A system according to claim 109 further comprising:
a discovery tool that is configured to discover biological/chemical knowledge from the entity-relationship model of a plurality of independent biological/chemical databases.
- 118. A system according to claim 109 wherein the data processing engine runs on a plurality of data processing systems that are configured in a peer-to-peer configuration.
- 119. A bioinformatics data structure comprising:
an entity-relationship model of a plurality of independent biological/chemical databases, each of which includes records for a plurality of biological/chemical objects, the entity-relationship model comprising:
a plurality of entities, a respective entity of which corresponds to a single biological/chemical object, at least some of the entities including a plurality of aliases, a respective one of which directly or indirectly refers to at least one record in a respective one of the plurality of biological/chemical databases that relates to the single biological/chemical object; and a plurality of relationships that link the plurality of entities in the entity-relationship model based upon relationships therebetween.
- 120. A data structure according to claim 119 further comprising:
an independent entity-relationship model of each of the independent biological/chemical databases.
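The entity, alias, merge and path-rule language in the claims above can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the claimed implementation: the class and method names (`EntityGraph`, `integrate_record`, `paths`), the database names (`DB_A`, `DB_B`, `DB_C`), the record identifiers, and the numeric confidence scale are all hypothetical. An alias is modeled as a `(database, record_id)` pair, a new record that matches two entities causes them to be merged, and traversal is a breadth-first search filtered by relationship type and confidence level.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Entity:
    """One biological/chemical object; aliases are (database, record_id) pairs."""
    name: str
    aliases: set = field(default_factory=set)


class EntityGraph:
    """Hypothetical in-memory entity-relationship model."""

    def __init__(self):
        self.entities = {}     # entity name -> Entity
        self.alias_index = {}  # (database, record_id) -> entity name
        self.links = {}        # entity name -> [(neighbor, rel_type, confidence)]

    def add_entity(self, name, aliases=()):
        self.entities[name] = Entity(name)
        self.links.setdefault(name, [])
        for alias in aliases:
            self.add_alias(name, alias)

    def add_alias(self, name, alias):
        self.entities[name].aliases.add(alias)
        self.alias_index[alias] = name

    def link(self, a, b, rel_type, confidence=1.0):
        # Relationships are stored in both directions.
        self.links[a].append((b, rel_type, confidence))
        self.links[b].append((a, rel_type, confidence))

    def integrate_record(self, alias, matches):
        """Add a record from a new database.

        `matches` lists existing aliases the record corresponds to. No match
        creates a new entity; several matching entities are merged into one.
        """
        owners = sorted({self.alias_index[m] for m in matches if m in self.alias_index})
        if not owners:
            name = f"{alias[0]}:{alias[1]}"
            self.add_entity(name, [alias])
            return name
        keep = owners[0]
        for other in owners[1:]:
            self._merge(keep, other)
        self.add_alias(keep, alias)
        return keep

    def _merge(self, keep, other):
        # Move aliases and links from `other` onto `keep`.
        for alias in self.entities.pop(other).aliases:
            self.add_alias(keep, alias)
        for nbr, rel, conf in self.links.pop(other):
            if nbr != keep:
                self.links[keep].append((nbr, rel, conf))
        # Re-point remaining links at `keep`; drop links that became self-loops.
        for name, edges in self.links.items():
            self.links[name] = [
                (keep if n == other else n, r, c)
                for n, r, c in edges
                if not (name == keep and n == other)
            ]

    def paths(self, start, end, allowed_types=None, min_confidence=0.0, max_hops=4):
        """Breadth-first search for simple paths that satisfy the path rules."""
        found, queue = [], deque([[start]])
        while queue:
            path = queue.popleft()
            if path[-1] == end:
                found.append(path)
                continue
            if len(path) > max_hops:
                continue
            for nbr, rel, conf in self.links[path[-1]]:
                if nbr in path or conf < min_confidence:
                    continue
                if allowed_types is not None and rel not in allowed_types:
                    continue
                queue.append(path + [nbr])
        return found


# Usage with made-up records: two entities turn out to be the same object.
g = EntityGraph()
g.add_entity("gene-1", [("DB_A", "a1")])
g.add_entity("gene-1-dup", [("DB_B", "b7")])
g.add_entity("protein-1", [("DB_A", "a2")])
g.add_entity("disease-1", [("DB_B", "b9")])
g.link("gene-1", "protein-1", "encodes", 0.9)
g.link("gene-1-dup", "disease-1", "associated_with", 0.6)

merged = g.integrate_record(("DB_C", "c3"), [("DB_A", "a1"), ("DB_B", "b7")])
all_paths = g.paths("protein-1", "disease-1")
strict_paths = g.paths("protein-1", "disease-1", min_confidence=0.8)
```

In this sketch the merge is what makes the path from `protein-1` to `disease-1` reachable, and raising `min_confidence` above the weaker link's level removes it again. Splitting entities after alias removal, and storing query results as new relationships, are not covered here.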
CROSS REFERENCE TO PROVISIONAL APPLICATIONS
[0001] This application is related to and claims the benefit of Provisional Application Serial No. 60/296,018 to Levy and Segaran, filed Jun. 5, 2001, entitled Cell: A Cross-Referenced Ontological Database for Biological Data; and Provisional Application Serial No. 60/356,616 to Gardner and Wilbanks, filed Feb. 13, 2002, entitled Ontology Networks, a New Foundation for Discovery, both of which are assigned to the assignee of the present application, the disclosures of both of which are hereby incorporated herein by reference in their entirety as if set forth fully herein.
Provisional Applications (2)
| Number | Date | Country |
|---|---|---|
| 60296018 | Jun 2001 | US |
| 60356616 | Feb 2002 | US |