Claims
- 1. A computer-readable structure, encoded on a computer-readable medium, for organizing database elements corresponding to proteins in tissues obtained from a selected organism, the structure comprising records for storing different types of data relating to respective said proteins, each of said records having at least an identification field for identifying a corresponding one of said proteins, a parameter field for indicating a selected characteristic of the corresponding protein, a location field for indicating the relative location in the organism from which the corresponding protein was obtained; and an abundance field for indicating the relative amount of the corresponding protein obtained from said location.
- 2. The computer-readable structure of claim 1, wherein said records are configured for automated searching and extraction of selected said data therein in response to queries for proteins having similar said data in at least a selected one of said identification field, said parameter field, said location field and said abundance field.
- 3. The computer-readable structure of claim 1, wherein said parameter field comprises data selected from the group consisting of isoelectric point, molecular weight, mass spectrometry data, molecular signature of fragments of the corresponding protein, at least partial amino acid sequence, and coordinates in a global map.
- 4. The computer-readable structure of claim 3, wherein said parameter field comprises data relating to protein condition response variables selected from the group consisting of sex of said organism, time of day, developmental stage of said tissue from which said protein was obtained, subcellular fraction, selected disease state, chemical exposure and stimulus exposure.
- 5. The computer-readable structure of claim 1, wherein said parameter field comprises data relating to protein condition response variables selected from the group consisting of sex of said organism, time of day, developmental stage of said tissue from which said protein was obtained, and a selected disease.
- 6. A computer program product for extracting selected data relating to multiple proteins in multiple tissue samples from a database comprising:
a computer-readable medium; a user interface module for guiding a user to generate at least one query to retrieve selected data from said database, said database comprising database elements corresponding to proteins in tissues obtained from a selected organism, the structure comprising records for storing different types of data relating to respective said proteins, each of said records having at least an identification field for identifying a protein, a parameter field for indicating a selected characteristic of the corresponding protein, a location field for indicating the location in the organism from which the corresponding protein was obtained; and an abundance field for indicating the relative amount of the corresponding protein obtained from said location; and a database search module communicatively coupled to said user interface module and operable to locate and retrieve said database elements that correspond to said query.
- 7. A computer program product as claimed in claim 6, wherein said user interface module is operable to generate a graphical user interface screen having prompts corresponding to the fields in said records, said database search module being operable to apply boolean logic to combine information entered by said user in response to at least one of said prompts to locate and retrieve said database elements in accordance with said information.
- 8. A computer program product as claimed in claim 7, wherein said user interface module is operable to allow said user to request retrieval of database elements that do not correspond to selected information entered by said user, and said database search module is operable to locate said database records that do not correspond to said selected information.
- 9. A computer program product as claimed in claim 6, wherein said database search module is operable to continue to locate and retrieve said database elements satisfying said query and not to time-out until at least one of two conditions occurs, the conditions comprising a complete search of said database and a manual command to terminate searching entered by said user.
- 10. A method for identifying a protein marker that indicates a condition via a change in its abundance comprising:
determining the abundances of said protein marker in a plurality of substantially the same biological samples that have at least one different selected characteristic; accessing a Protein Index database comprising entries for providing data relating to proteins including said protein marker; and comparing said abundances to said entries in said Protein Index database to determine whether the protein is a marker for the condition.
- 11. The method of claim 10, wherein said comparing step facilitates at least one of a plurality of investigations for diagnosing a disease in the organism from which said samples were obtained, monitoring the effectiveness of a therapy applied to said organism, finding a mode of therapeutic action for said organism, screening for toxicity of a compound provided to said organism, screening for biological activity of a candidate pharmaceutical provided to said organism, and determining therapeutic treatment options.
- 12. The method of claim 11, further comprising the step of determining the degree with which said abundances of different proteins are altered.
- 13. The method of claim 10, wherein said Protein Index database comprises a protein index corresponding to at least one of a plurality of subjects selected from the group consisting of a human being, an agricultural animal, a pest, a wild animal, a companion animal, an agricultural plant, an ornamental plant, a weed, a wild plant, and a microorganism.
- 14. The method of claim 13, wherein said Protein Index database can correspond to a location selected from the group consisting of a particular species, a selected population, an individual organism, an ecosystem, a particular organ, a type of tissue, a type of cell, and a subcellular particle.
- 15. The method of claim 10, wherein said Protein Index database comprises protein indices corresponding to respective subjects selected from the group consisting of a human being, an agricultural animal, a pest, a wild animal, a companion animal, an agricultural plant, an ornamental plant, a weed, a wild plant, and a microorganism.
- 16. The method of claim 15, wherein said protein indices can each correspond to a location selected from the group consisting of a particular species, a selected population, an individual organism, an ecosystem, a particular organ, a type of tissue, a type of cell, and a subcellular particle.
- 17. The method of claim 10, wherein said Protein Index database comprises protein indices corresponding to respective subjects selected from the group consisting of a normal protein index, an abnormal protein index, and a treated protein index.
- 18. The method of claim 10, wherein said selected different characteristic is selected from the group consisting of a disease of said organism from which said samples were taken, a treatment, a toxin, a chemical compound, a biological activity, and a pharmaceutical.
- 19. The method of claim 10, further comprising the step of updating said Protein Index database to generate an improved Protein Index.
- 20. The method of claim 19, wherein said updating step comprises the steps of:
determining the protein abundances of additional biological samples; collecting information regarding the biological and medical properties of the additional samples; and adding said information to the Protein Index database.
- 21. The method of claim 20, wherein said adding step comprises the step of using said information to determine improved acceptable ranges of abundance for each protein in said improved Protein Index.
- 22. The method of claim 20, wherein said determining step comprises the step of obtaining protein abundance data from multiple biological samples from different individuals, and said adding step comprises the step of compiling said protein abundance data to determine a range of acceptable abundances for the particular protein at a particular location.
- 23. The method of claim 10, wherein said accessing step comprises the step of using said Protein Index database in conjunction with at least one database of proteins describing one of protein Molecular Anatomy, Pathology and Molecular Effects of Drugs.
- 24. The method of claim 10, wherein said accessing step comprises the step of using said Protein Index database in conjunction with a database of the nucleotide sequences for a location selected from the group consisting of the same individual, population, species and ecosystem.
- 25. A method for obtaining proteomics information comprising:
generating a query to retrieve selected data relating to a protein from a computer-readable protein index database for organizing database elements corresponding to a protein in a biological sample obtained from a selected organism, said protein index database comprising records for storing different types of data relating to respective said proteins; locating respective ones of said records in said protein index database that satisfy protein characteristics requested via said query; and generating an output corresponding to respective ones of said records.
- 26. A computer program product for extracting selected data relating to a protein in a tissue sample from a database comprising:
a computer-readable medium for storing said database; a user interface module for guiding a user to generate at least one query to retrieve selected data from said database, said database comprising database elements corresponding to proteins in tissue obtained from a selected organism, the structure comprising records for storing different types of data relating to respective said proteins; and a database search module communicatively coupled to said user interface module and operable to locate and retrieve said database elements that correspond to said query.
- 27. A method for identifying component-specific proteins from a Protein Index database comprising information relating to a plurality of proteins, the method comprising the steps of:
a) generating a first list of all said proteins indicated in said Protein Index database as being located in a first specimen of a selected component; b) generating a second list of all said proteins indicated in said Protein Index database as being located in a second specimen of said selected component; c) subtracting from said first list all of said proteins common to both said first list and said second list; and d) repeating steps b and c for components 3 to n, where n is the total number of components in said Protein Index database, to obtain said component-specific proteins corresponding to those remaining in said first list after common said proteins are subtracted therefrom.
- 28. The method of claim 27, wherein said component corresponds to one of a tissue, a cell, a subcellular particle and an organ.
- 29. The method of claim 27, wherein said component is a tissue and said Protein Index database comprises greater than 50 tissue-specific proteins.
- 30. A method for identifying selected proteins from a Protein Index database comprising information relating to a plurality of proteins, the method comprising the steps of:
generating a list of said proteins indicated in said Protein Index database as being located in tumor; and comparing said list with said information in said Protein Index database to identify at least one of said proteins that is tissue-specific and located in said tumor.
- 31. A method for determining tissue damage using a body fluid sample and a Protein Index database comprising information relating to a plurality of tissue-specific proteins, the method comprising the steps of:
obtaining a list of proteins in said body fluid sample; and comparing said list with said information to determine if said list comprises one of said tissue-specific proteins which would not occur in said body fluid sample under normal conditions, or would occur at a higher or lower amount under normal conditions than indicated in said list.
- 32. The method of claim 31, wherein said body fluid is selected from the group consisting of blood, urine, serum, plasma, feces, saliva, sputum, tears, sweat, cerebral spinal fluid, and pleural fluid.
- 33. A method for determining the location and abundance combination for a selected protein comprising:
accessing a Protein Index database comprising abundance data for each protein at respective locations included in said Protein Index database; identifying said abundance data in said Protein Index database which relates to said selected protein; and generating combinations of said locations and corresponding said abundance data for said selected protein.
- 34. A method for determining how a particular biological effect affects a selected protein abundance in different locations comprising:
a) accessing a Protein Index database for the abundance of each protein-location combination for the selected protein for at least one biological state; b) determining the abundance of each protein-location combination for the selected protein for the different biological state; and c) comparing the abundances obtained for said at least one biological state and said different biological state to determine which of said protein-location combinations have significantly altered abundances.
- 35. The method of claim 34, wherein said determining step is performed by performing at least one of accessing said Protein Index database for the abundance of each protein-location combination for said selected protein for said different biological state, and experimentally determining the abundance of each protein-location combination for the selected protein for said different biological state.
- 36. The method of claim 34, wherein said biological state is selected from the group consisting of age, gender, disease state, temperature exposure, diet, time since last meal, chemical exposure, hormone exposure, pharmaceutical exposure, poison exposure, starvation, dehydration, state of alertness, different time of day, menstrual cycle, physical injury, recovery from surgery, stress, electrical shock, and response to selected stimuli.
- 37. A method for determining whether a particular biological effect is limited to a particular location in an organism and which other locations are affected comprising:
a) accessing a Protein Index database to obtain information stored therein relating to the abundance of a protein in a selected location, said protein being known or suspected to be altered in abundance due to the particular biological effect; b) determining from said Protein Index database the abundance of the same said protein in other locations in the organism; and c) comparing the abundance data obtained by steps a and b to determine the extent to which the protein abundance is altered in other locations in the same organism.
- 38. The method of claim 37, wherein one of steps a and b comprises the step of experimentally measuring the abundance of said protein in said selected location and said other locations in the organism, respectively.
- 39. The method of claim 37, wherein the effects of a disease state on distant organs are determined.
- 40. The method of claim 37, further comprising the step of determining whether a protein in one location is in a different variant form.
- 41. The method of claim 37, wherein said different variant form is selected from the group consisting of different glycosylation, different phosphorylation, different post translational modification, different cleavage, alternatively spliced, and complexed to a different associated protein.
- 42. A method for finding sets of co-regulated proteins within the same tissue, different tissues or between different tissues comprising the steps of:
a) accessing a Protein Index database comprising information relating to a plurality of proteins for a subset of said information relating to a first biological state; b) accessing said Protein Index database for a subset of said information relating to a second biological state; c) accessing said Protein Index database for a subset of said information relating to a third biological state; d) generating a list of protein-location combinations with altered abundances between the first biological state and the second biological state; e) generating a list of protein-location combinations with altered abundances between a combination selected from the group consisting of the first biological state and the second biological state, the first biological state and the third biological state, and the second biological state and the third biological state; and f) determining which protein-location combinations are consistently altered in the same direction with other protein-location combinations in steps d and e, said sets of protein-location combinations being designated as sets of co-regulated proteins.
- 43. The method of claim 42, further comprising the steps of:
g) accessing the Protein Index for a fourth biological state; and h) generating lists of protein-location combinations with altered abundances in the same direction between the fourth biological state and one of the first biological state, the second biological state and the third biological state.
- 44. The method of claim 43, further comprising the steps f and g for biological states 5, . . . , n wherein n is an integer number.
- 45. The method of claim 42, wherein said Protein Index database is used in conjunction with data from at least one of a protein binding study and a yeast 2-hybrid system.
- 46. The method of claim 42, wherein said Protein Index database is used for one of a regulatory homology determination and a structural relationship determination.
- 47. A method for determining the similarity of an in vitro or testing system to an in vivo system comprising:
a) accessing a Protein Index database comprising a subset of information relating to proteins for the in vivo system; b) accessing a subset of information relating to a first selected system from a Protein Index database, said selected system being one of an in vitro system and a testing system; and c) comparing the number and similarities of abundances for each protein-location combination generated from each said subset; wherein the greater the number and amount of similarities, the more similar the first selected system is to the in vivo system.
- 48. The method of claim 47, further comprising the step of repeating steps b and c with respect to a second selected system to determine which of the first selected system and the second selected system is better wherein the selected system with more similarities and fewer differences is considered the better of the two selected systems.
- 49. The method of claim 47, wherein the in vitro system is a cell line.
- 50. The method of claim 47, wherein the testing system is from a different species than the in vivo system.
- 51. The method of claim 47, wherein toxicity or biological activity of a composition is determined.
- 52. A method for interpreting genomic nucleic acid sequence information to determine if an open reading frame constitutes an exon comprising:
sequencing a protein; deducing genomic nucleic acid sequences therefrom; and comparing the deduced sequences to a database of genomic sequences; wherein open reading frames corresponding to protein sequences constitute true exons.
- 53. The method of claim 52, wherein the molecular weight and pI of the whole protein or plural molecular weights and plural pI from digestion fragments of the whole protein are determined rather than the amino acid sequences.
- 54. A method for determining splicing sites for a nucleic acid comprising:
sequencing a protein, deducing genomic nucleic acid sequences therefrom, and comparing the deduced sequences to a database of genomic sequences, wherein possible and alternative splicing sites, which are capable of producing a mRNA capable of expressing the protein sequence, represent true or possible splicing sites.
- 55. The method of claim 54, wherein the molecular weight and pI of the whole protein or plural molecular weights and plural pI from digestion fragments of the whole protein are determined rather than the amino acid sequences.
- 56. A method for determining which genomic sequences determine a phenotype comprising:
determining a single polymorphic nucleotide profile for plural individuals; determining protein abundance differences for the same individuals; determining which single polymorphic nucleotide changes are associated with which protein abundance differences; and determining which phenotypes are associated with which protein abundance differences in the same individuals; wherein certain patterns of single polymorphic nucleotide changes correlate to a phenotype.
- 57. The method of claim 56, wherein plural single polymorphic nucleotide changes correspond to one protein abundance difference.
- 58. The method of claim 56, wherein single polymorphic nucleotide changes that do not correspond to any change in protein abundance are not used to determine a pattern of single polymorphic nucleotide changes that correlates to a phenotype.
- 59. A Protein Index database having a plurality of records with each record having a plurality of independently searchable features, wherein said features include protein identification information, protein properties, origin and quantity information, wherein said database includes records encompassing at least 100 proteins and at least 10 different locations in a single species.
- 60. The database of claim 59, wherein said origin is selected from the group consisting of organ, tissue, cell and organelle.
- 61. The database of claim 59, wherein said database is generated from biological samples taken from a single individual or genetically identical individuals.
- 62. The database of claim 59, wherein said database is generated from a statistically significant number of genetically different individuals wherein an abundance range for each protein-location combination is determined.
- 63. The database of claim 59, wherein said database is generated from surgically removed tissue.
- 64. A method for determining the proteome of an individual comprising the steps of:
taking a protein-containing sample from each of at least five tissues from an individual; and determining the presence and relative abundance of at least ten proteins from each of said tissues.
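By way of illustration only, the record layout recited in claim 1 and the Boolean and negated queries of claims 7 and 8 might be sketched in Python as follows. The names `ProteinRecord` and `search`, the dictionary keys, and all sample values are hypothetical placeholders and are not taken from the claims.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class ProteinRecord:
    """Hypothetical record holding the four fields recited in claim 1."""
    identification: str  # identifies the protein
    parameter: dict      # selected characteristics, e.g. pI or molecular weight
    location: str        # where in the organism the protein was obtained
    abundance: float     # relative amount obtained from that location


Predicate = Callable[[ProteinRecord], bool]


def search(records: Iterable[ProteinRecord],
           must: Iterable[Predicate] = (),
           must_not: Iterable[Predicate] = ()) -> List[ProteinRecord]:
    """Return records satisfying every `must` test and none of the `must_not` tests."""
    must = list(must)
    must_not = list(must_not)
    return [r for r in records
            if all(p(r) for p in must) and not any(p(r) for p in must_not)]


# Invented sample data, for demonstration only.
records = [
    ProteinRecord("P1", {"pI": 5.4, "mw_kda": 42.0}, "liver", 0.80),
    ProteinRecord("P1", {"pI": 5.4, "mw_kda": 42.0}, "kidney", 0.10),
    ProteinRecord("P2", {"pI": 7.1, "mw_kda": 66.5}, "liver", 0.35),
]

# AND-combined query: found in liver AND isoelectric point below 6.0.
hits = search(records, must=[lambda r: r.location == "liver",
                             lambda r: r.parameter["pI"] < 6.0])

# Negated query in the sense of claim 8: records that do NOT match "liver".
non_liver = search(records, must_not=[lambda r: r.location == "liver"])
```

Keeping one record per protein-location pair, as assumed here, lets the same protein carry a different abundance at each location.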
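The list subtraction of claim 27 can be read as a set difference. The sketch below assumes that reading, with the selected component compared against every other component in the index; the function name `component_specific`, the mapping layout, and the sample data are hypothetical.

```python
from typing import Dict, Set


def component_specific(index: Dict[str, Set[str]], component: str) -> Set[str]:
    """Return the proteins listed for `component` and for no other component."""
    remaining = set(index[component])      # step a: the first list
    for other, proteins in index.items():  # steps b to d: every other component
        if other == component:
            continue
        remaining -= proteins              # step c: drop proteins common to both lists
    return remaining


# Invented sample index mapping component name to protein identifiers.
index = {
    "liver": {"P1", "P2", "P3"},
    "kidney": {"P2", "P4"},
    "heart": {"P3", "P5"},
}

liver_specific = component_specific(index, "liver")
```

With this sample data only `P1` remains for liver, because `P2` also appears in kidney and `P3` also appears in heart.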
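Claims 34 and 42 compare abundances of protein-location combinations across biological states. A minimal sketch follows, under the assumption that "significantly altered" means a change of at least a chosen fold threshold; the claims do not define such a threshold. The names `altered` and `co_regulated`, the parameter `min_fold`, and the sample data are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

Key = Tuple[str, str]      # (protein, location)
State = Dict[Key, float]   # abundance of each protein-location combination


def altered(a: State, b: State, min_fold: float = 2.0) -> Dict[Key, int]:
    """Map each shared combination changing by at least `min_fold` to +1 (up) or -1 (down)."""
    changes: Dict[Key, int] = {}
    for key in a.keys() & b.keys():
        if a[key] <= 0 or b[key] <= 0:
            continue  # a ratio is undefined for non-positive abundances
        ratio = b[key] / a[key]
        if ratio >= min_fold:
            changes[key] = 1
        elif ratio <= 1 / min_fold:
            changes[key] = -1
    return changes


def co_regulated(comparisons: List[Dict[Key, int]]) -> List[Set[Key]]:
    """Group combinations altered in every comparison by their pattern of directions."""
    common = set.intersection(*(set(c) for c in comparisons))
    groups: Dict[Tuple[int, ...], Set[Key]] = defaultdict(set)
    for key in common:
        groups[tuple(c[key] for c in comparisons)].add(key)
    return [g for g in groups.values() if len(g) > 1]


# Invented abundances for three biological states.
state1 = {("P1", "liver"): 1.0, ("P2", "liver"): 2.0, ("P3", "kidney"): 4.0, ("P4", "kidney"): 1.0}
state2 = {("P1", "liver"): 3.0, ("P2", "liver"): 5.0, ("P3", "kidney"): 1.0, ("P4", "kidney"): 1.1}
state3 = {("P1", "liver"): 0.4, ("P2", "liver"): 0.9, ("P3", "kidney"): 9.0, ("P4", "kidney"): 1.0}

d = altered(state1, state2)   # step d: first state versus second state
e = altered(state1, state3)   # step e: first state versus third state
sets = co_regulated([d, e])   # step f: combinations moving together in both lists
```

In this sample, `P1` and `P2` in liver rise together in the first comparison and fall together in the second, so they form one co-regulated set.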
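The comparison of claims 47 and 48 could be approximated by counting the protein-location combinations whose abundances agree between an in vivo system and a candidate system. The sketch below is one possible scoring under an assumed relative tolerance; the claims do not specify how similarity is quantified. The function `similarity`, the `tolerance` parameter, and all data are hypothetical.

```python
from typing import Dict, Tuple

Key = Tuple[str, str]  # (protein, location)


def similarity(in_vivo: Dict[Key, float],
               candidate: Dict[Key, float],
               tolerance: float = 0.25) -> int:
    """Count combinations present in both systems whose abundances differ by at most `tolerance` (relative)."""
    score = 0
    for key, reference in in_vivo.items():
        if key not in candidate or reference <= 0:
            continue
        if abs(candidate[key] - reference) / reference <= tolerance:
            score += 1
    return score


# Invented abundances for an in vivo system and two candidate systems.
in_vivo = {("P1", "liver"): 1.0, ("P2", "liver"): 2.0, ("P3", "liver"): 4.0}
candidates = {
    "cell_line": {("P1", "liver"): 1.1, ("P2", "liver"): 2.2, ("P3", "liver"): 1.0},
    "animal_model": {("P1", "liver"): 3.0, ("P2", "liver"): 2.0},
}

scores = {name: similarity(in_vivo, data) for name, data in candidates.items()}
better = max(scores, key=scores.get)  # claim 48: the system with more similarities
```

A fuller scoring might also penalize differences, since claim 48 refers to both more similarities and fewer differences.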
Parent Case Info
[0001] This application is a continuation-in-part of U.S. Ser. No. ______ filed Jan. 4, 2001 and entitled “Reference Database” (Attorney's Docket 41333), which is a continuation-in-part of U.S. Serial No. 654,133, filed Sep. 1, 2000.
Continuations (1)
| | Number | Date | Country |
| --- | --- | --- | --- |
| Parent | 09756285 | Jan 2001 | US |
| Child | 10235649 | Sep 2002 | US |