The present invention contains subject matter related to Japanese Patent Application JP 2005-266728, filed in the Japanese Patent Office on Sep. 14, 2005, the entire contents of which are incorporated herein by reference.
1. Field of the Invention
The present invention relates to an information processing apparatus, method, system, and program, and a recording medium. More particularly, the present invention relates to an information processing apparatus, method, system, and program, and a recording medium, which are intended to digitize the relation between intermolecular interactions and cellular functions.
2. Description of Related Art
There are many diseases involving gene defects. They include genetic metabolic diseases induced by a single gene defect as well as cancerous diseases induced by a plurality of gene defects which have accumulated with time. Analyzing whether a specific gene (and its product) is normal or abnormal is important in understanding the origin of a disease and establishing the plan for medical treatment.
This has been generally achieved by the technique which involves investigating the copy number of a specific gene of interest, confirming the degree of transcription of the gene, performing DNA sequencing on the amplified product of RT-PCR of the gene, detecting mutation by the thus obtained base sequence, and confirming immunohistochemically the localization of the gene at the protein level or the change in expression of the gene. This technique has helped accumulate a large amount of knowledge, some of which is used as an essential method for clinical test.
The copy number of the gene may be investigated by using the Southern blotting technique, which involves treating a sample with a restriction enzyme, transferring the treated sample to a nitrocellulose membrane by electrophoresis, and hybridizing the transferred sample to find a specific base sequence. The degree of transcription of the gene may be investigated by using the Northern blotting technique, which involves separating RNA by gel electrophoresis, transferring the separated RNA to a nylon membrane, hybridizing the transferred RNA with a labeled probe, and detecting the desired molecules.
These classic techniques of the first generation are followed by the new techniques of the second generation, which are designed to examine a very large number of genes and proteins comprehensively at one time. They have been developed for the human genome project, which needs to examine a large number of genes comprehensively at one time. Nowadays, comprehensive analyses are carried out not only for genes (genome) but also for RNAs (transcriptome), proteins (proteome), and metabolites (metabolome). Many methods have been devised to utilize the resulting data for disease diagnosis and medical treatment.
Many methods have been proposed for studying, by computer analysis, how a change in mRNA affects a disease, using comprehensive data originating from the transcriptome, which is the collection of mRNAs (that is, all transcription products) in a cell. Among them is a method for characterizing cancer and devising its medical treatment by analyzing the expression profile of mRNA and other molecular data. For example, refer to the following Patent Documents 1 to 9.
Patent Document 1: Japanese Patent Laid-open No. 2005-34151
Patent Document 2: Japanese Patent Laid-open No. 2004-329211
Patent Document 3: JP-A-2005-514051
Patent Document 4: JP-A-2005-512557
Patent Document 5: JP-A-2005-514359
Patent Document 6: JP-A-2005-518522
Patent Document 7: JP-A-2005-500832
Patent Document 8: JP-A-2005-503779
Patent Document 9: JP-A-2005-508199
There has been disclosed a technique for drawing a graph that shows nodes representing proteins and edges representing their interactions and then visualizing it three-dimensionally by using a parameter called spring force. (For example, see Patent Document 10: Japanese Patent Laid-open No. 2004-118819.)
There has also been disclosed a technique for visualizing by means of nodes and links a table that shows interactions and their intensity between objects such as proteins. (For example, see Patent Document 11: Japanese Patent Laid-open No. 2004-30034.)
Unfortunately, the techniques disclosed in Patent Documents 1 to 9 above are designed to analyze comprehensive data, extract their features, and find their relation to specific diseases. They do not give any information about how a set of data on increases or decreases in gene expression relates to specific diseases. Even though they relate the presence of gene expression clusters to clinical data, they do not indicate the significance of the relation. Therefore, they do not permit one to distinguish meaningful data fluctuation from meaningless data fluctuation that arises from sample preparation for comprehensive data.
In addition, the techniques disclosed in Patent Documents 10 and 11 make intermolecular interactions visual but have no way of digitizing and predicting intermolecular interactions.
The present invention was completed in view of the foregoing. It is intended to digitize the relation between intermolecular interactions and cellular functions.
The first embodiment of the present invention is directed to an information processing apparatus which includes acquisition means, arithmetic means, and output control means. The acquisition means acquires the amount of the molecules for detection which have been produced by control cells and sample cells. The arithmetic means receives from the acquisition means the information about the amount of the molecules for detection which have been produced by the control cells and the sample cells, thereby calculating the score that indicates whether one of the two molecules for detection promotes or suppresses the other and also indicates whether the cellular function is promoted or suppressed depending on the combination of cellular functions for the mutual promotion or suppression between the two molecules for detection. The output control means controls the output of the score which has been calculated by the arithmetic means for the cellular function.
In the information processing apparatus, the acquisition means acquires the amount of the molecules for detection which have been produced by the control cells and the sample cells, according to the amount of the nucleic acid which has been expressed in response to the molecules for detection which have been collected from the control cells and the sample cells.
In the information processing apparatus, the combination of the two molecules for detection is classified into the following five categories according to the interrelation between the two molecules; the first category applicable to two molecules which suppress each other, the second category applicable to two molecules the first one of which promotes the second one and the second one of which suppresses the first one, the third category applicable to two molecules which promote each other, the fourth category applicable to two molecules only one of which promotes the other, and the fifth category applicable to two molecules only one of which suppresses the other.
In the information processing apparatus, the arithmetic means calculates the score for each cellular function as follows. For each pair of molecules for detection belonging to the first to third categories out of the five categories, it gives a score, based on the amount of the molecules for detection which have been produced in the control cells and the sample cells, to the cellular functions relating to the mutual promotion or suppression between the two molecules, multiplies the score by a prescribed factor, and accumulates the resulting values for each cellular function.
In the information processing apparatus, the prescribed factor is established such that it takes on the largest value for the cellular function relating to the first category of the first to third categories out of the five categories and it also takes on the smallest value for the cellular function relating to the third category of the first to third categories out of the five categories.
In the information processing apparatus, the prescribed factor is larger than 1 when the two molecules for detection have a molecular bond.
The information processing apparatus further includes storage means that stores in a table form the information about the combination of the two molecules for detection which are classified into any of the five categories and the cellular function relating to the mutual promotion or suppression of the two molecules for detection.
The information processing apparatus further includes estimating means that estimates the score for the cellular function when there is any change in the amount of the molecules for detection which have been produced in the control cells and the sample cells after it has been acquired by the acquisition means.
The information processing apparatus further includes network building means that builds a network for the information about the interrelation of the molecules for detection, so that the estimating means calculates the effect of change in the amount of the molecules for detection which have been produced on other molecules based on the network which has been built by the network building means, thereby estimating the score for the cellular function.
The information processing apparatus further includes analyzing means that analyzes the change with time of the cellular function based on the score for the cellular function, with its output being controlled by the output control means.
The second embodiment of the present invention is directed to an information processing method or an information processing program which includes the steps of acquiring the amount of the molecules for detection which have been produced in the control cells and the sample cells, receiving the information about the amount of the molecules for detection which have been produced in the control cells and the sample cells, thereby calculating the score that indicates whether one of the two molecules for detection promotes or suppresses the other and also indicates whether the cellular function is promoted or suppressed depending on the combination of cellular functions for the mutual promotion or suppression between the two molecules for detection.
The third embodiment of the present invention is directed to an information processing system which includes an analyzing unit that analyzes the amount of the molecules for detection which have been produced in the control cells and the sample cells and an information processing apparatus that analyzes the information about the cellular function relating to the mutual promotion or suppression of the two molecules for detection. The information processing apparatus has acquisition means, arithmetic means, and output control means. The acquisition means acquires the amount of the molecules for detection which have been produced by control cells and sample cells. The arithmetic means receives from the acquisition means the information about the amount of the molecules for detection which have been produced by the control cells and the sample cells, thereby calculating the score that indicates whether one of the two molecules for detection promotes or suppresses the other and also indicates whether the cellular function is promoted or suppressed depending on the combination of cellular functions for the mutual promotion or suppression between the two molecules for detection. The output control means controls the output of the score which has been calculated by the arithmetic means for the cellular function.
The information processing system includes the steps of acquiring the amount of the molecules for detection which have been produced in the control cells and the sample cells, receiving the information about the amount of the molecules for detection which have been produced in the control cells and the sample cells, thereby calculating the score that indicates whether one of the two molecules for detection promotes or suppresses the other and also indicates whether the cellular function is promoted or suppressed depending on the combination of cellular functions for the mutual promotion or suppression between the two molecules for detection.
The network denotes any setup which consists of at least two apparatus connected to each other so that information can be transmitted from one apparatus to the other. The apparatus capable of communication through the network may be those which are independent from one another or those which are constituent units of one apparatus.
The term “communication” means wireless communication, wired communication, or a mixture thereof. In the latter case, wireless communication may be carried out in some sections and wired communication may be carried out in other sections. Another mode of communication may be such that wired communication is carried out from the first apparatus to the second apparatus and wireless communication is carried out from the second apparatus to the first apparatus.
The above-mentioned information processing apparatus according to the present invention is able to analyze the cellular function relating to the molecules for detection. It is also able to classify the overall relation between molecules, thereby digitizing the relation between the intermolecular interaction and the cellular function.
The following is a detailed description of the embodiments according to the present invention. This description is intended to ensure that the embodiments according to the present invention conform to the specification and drawings therein. The embodiments may include those which have the constituents of the present invention which are not shown in the specification or the drawings therein. This does not necessarily mean that such embodiments do not correspond to the constituents of the present invention. Conversely, even though some embodiments may be written as conforming to the constituents of the present invention, it does not necessarily mean that such embodiments do not conform to other constituents than the constituents.
The information processing apparatus according to the present invention has acquisition means (such as the arithmetic unit 21 shown in
The combination of the two molecules for detection is classified into the following five categories according to the interrelation between the two molecules: the first category (such as NN-type) applicable to two molecules which suppress each other, the second category (such as PN-type) applicable to two molecules the first one of which promotes the second one and the second one of which suppresses the first one, the third category (such as PP-type) applicable to two molecules which promote each other, the fourth category (such as P-type) applicable to two molecules only one of which promotes the other, and the fifth category (such as N-type) applicable to two molecules only one of which suppresses the other.
The information processing apparatus may additionally have storage means (such as the protein information database 3) that stores in a table form (shown in
The information processing apparatus may additionally have inferring means (such as the target molecule inferring unit 27 shown in
The information processing apparatus may additionally have network building means (such as the network building unit 26 shown in
The information processing apparatus may additionally have analyzing means (such as the result analyzing unit 6 shown in
The information processing method or program according to the present invention includes the step of acquiring the amount of the molecules for detection which have been produced in the control cells (such as normal cells) and the sample cells (the step being represented by Step S3 in
The information processing system according to the present invention includes an analyzing unit (such as the mRNA expression analyzing unit 2 shown in
The embodiment of the present invention will be described with reference to the accompanying drawings.
The protein information analyzing system includes the chip forming unit 1, the mRNA expression analyzing unit 2, the protein information database 3, the protein information analyzing unit 4, the result display unit 5, and the result analyzing unit 6. It may also have the protein kit 7.
The chip forming unit 1 yields a DNA chip (or DNA microarray) which has as the probe a nucleic acid with the base sequence structure complementary to the molecule (protein) for detection.
The mRNA expression analyzing unit 2 is so designed as to drop the control target and the detection target onto the DNA chip which has been prepared by the chip forming unit 1, thereby determining the amount of the molecule (protein) for detection in each case. The control target is produced by the mRNA collected from the normal cell (control cell), and the detection target is obtained by reverse transcription (for duplication) of the complementary DNA (cDNA) from the mRNA collected from the sample cell. In other words, the mRNA expression analyzing unit 2 performs hybridization, which utilizes the reaction to form the complementary strands (double strands) between nucleic acids each having the complementary base sequence, and then determines, by fluorescence intensity analysis with an intercalator, the amount of the molecule (protein) for detection which has been expressed in the normal cell and the amount of the molecule (protein) for detection which has been expressed in the sample cell, and supplies the thus obtained result to the protein information analyzing unit 4.
The foregoing units may be replaced by the protein kit 7, which is designed to detect comprehensively the molecules (proteins) for detection by using protein chips.
The protein information database 3 stores information about the protein to be used for processing by the protein information analyzing unit 4. The protein information database 3 may be connected, by wire or wireless (e.g., through the Internet or LAN or WAN network), directly to the protein information analyzing unit 4. It may also be installed inside the protein information analyzing unit 4.
The combination of two different protein molecules can be classified into five categories according to their intermolecular interactions. The protein information database 3 stores information about the classification of the combination of two protein molecules belonging to each category. (The classification is referred to as molecule set.)
As shown in
The NN-type denotes a combination in which two molecules suppress each other. The two molecules in the NN-type combination function as the molecular switch, with one representing “ON” if it dominates over the other quantitatively and functionally, and the other representing “OFF”.
The PN-type denotes a combination in which the first molecule promotes the second molecule and the second molecule suppresses the first molecule. In other words, two molecules perform contradictory functions (promotion and suppression) on each other. In this case, the information about the molecule for promotion converges on a certain value with oscillation as the result of negative feedback.
The PP-type denotes a combination in which two molecules promote each other. While two molecules are promoting each other, the information about the two molecules is amplified as the result of positive feedback.
The P-type denotes a combination in which one molecule promotes the other. The N-type denotes a combination in which one molecule suppresses the other.
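By way of illustration only, the five-way classification described above may be sketched as follows. The function name `classify_pair` and the string labels "promote" and "suppress" are hypothetical and are introduced here solely for explanation; they do not appear in the embodiment.

```python
from typing import Optional

VALID_EFFECTS = {"promote", "suppress", None}


def classify_pair(a_on_b: Optional[str], b_on_a: Optional[str]) -> str:
    """Return the category label for the effects two molecules have on each other.

    a_on_b: effect of the first molecule on the second ("promote", "suppress", or None).
    b_on_a: effect of the second molecule on the first.
    """
    if a_on_b not in VALID_EFFECTS or b_on_a not in VALID_EFFECTS:
        raise ValueError("effect must be 'promote', 'suppress', or None")
    effects = {a_on_b, b_on_a}
    if effects == {"suppress"}:
        return "NN"  # the two molecules suppress each other
    if effects == {"promote", "suppress"}:
        return "PN"  # one promotes the other, which in turn suppresses it
    if effects == {"promote"}:
        return "PP"  # the two molecules promote each other
    if effects == {"promote", None}:
        return "P"   # only one molecule promotes the other
    if effects == {"suppress", None}:
        return "N"   # only one molecule suppresses the other
    raise ValueError("the two molecules do not interact")


print(classify_pair("promote", "suppress"))
```

The order of the two arguments does not affect the result, since each category is defined by the pair of effects rather than by which molecule is named first.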
Tables 1 to 9 show the NN-type combination (molecule set) of molecules.
Tables 10 to 35 show the PN-type combination (molecule set) of molecules.
The protein information database 3 also stores information about how the intermolecular interaction between two molecules affects the cellular function.
How the intermolecular interaction between two molecules affects the cellular function is inferred in the following way.
For a molecule set of NN-type, the following steps are taken to infer how the intermolecular interaction between two molecules affects the cellular function. The first step is to select the cellular function to be affected simultaneously by two proteins of the molecule set of interest. It is assumed that when the cellular function X is promoted by protein A and suppressed by protein B, and protein A is on and protein B is off, the molecule set of protein A and protein B promotionally acts on the cellular function X. Conversely, it is assumed that when the cellular function X is suppressed by protein A and promoted by protein B, and protein A is off and protein B is on, the molecule set of protein A and protein B promotionally acts on the cellular function X.
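By way of illustration only, the association described above between an NN-type molecule set and the cellular functions it affects may be held in table form as sketched below. The record type `NNFunctionEntry`, the helper `functions_promoted`, and the entries "function X" and "function W" are hypothetical placeholders and do not reproduce the actual contents of the protein information database 3.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class NNFunctionEntry:
    """One table row: an NN-type pair and a cellular function it acts on."""
    protein_a: str
    protein_b: str
    cellular_function: str
    promoted_when_on: str  # the protein of the pair whose "on" state promotes the function


def functions_promoted(entries: List[NNFunctionEntry], on_protein: str) -> List[str]:
    """List the cellular functions promoted when `on_protein` is the "on" member."""
    return [e.cellular_function for e in entries if e.promoted_when_on == on_protein]


# Hypothetical rows: function X is promoted by protein A and suppressed by protein B;
# function W is suppressed by protein A and promoted by protein B.
TABLE = [
    NNFunctionEntry("protein A", "protein B", "function X", promoted_when_on="protein A"),
    NNFunctionEntry("protein A", "protein B", "function W", promoted_when_on="protein B"),
]

print(functions_promoted(TABLE, "protein A"))
```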
The protein information database 3 stores information about how the intermolecular interaction (between two molecules) affects the cellular function with respect to each protein of the molecule set of NN-type. As its example,
In the case of molecule set of NN-type involving TP53 and ABCC1 shown in
In the case of molecule set of NN-type involving TP53 and telomerase shown in
In the case of molecule set of NN-type involving TP53 and FGF2 shown in
In the case of molecule set of NN-type involving TP53 and TERT shown in
In the case of molecule set of NN-type involving TP53 and HSPA4 shown in
In the case of molecule set of NN-type involving TP53 and TXNRD1 shown in
In the same way, the protein information database 3 also stores, for molecule sets of NN-type involving proteins other than TP53, information showing which cellular function is promoted when which protein is on.
For the relation between the cellular function and interaction between two molecules in the molecule set of PN-type, the molecule set of PN-type is assumed to positively act (POS) on the cellular function Y which is promoted by protein A and suppressed by protein B when promotive action is brought from protein A to protein B. Likewise, it is assumed to negatively act (NEG) on the cellular function Z which is suppressed by protein A and promoted by protein B.
The protein information database 3 stores information about the relation between the cellular function and the interaction between two molecules for proteins involved in the molecule set of PN-type. This is illustrated in FIGS. 4 to 9, which show the relation between the cellular function and the molecule set of PN-type involving TP53.
In the case of molecule set of PN-type involving ADP and TP53 as shown in
In the case of molecule set of PN-type involving catenin and TP53 as shown in
In the same way as mentioned above, proteins involved in the molecule sets of PN-type shown in
The relation between the cellular function and interaction between two molecules is inferred in the same way as mentioned above for the molecule set of PP-type, that is, two proteins are regarded as “POS” (for promotion) and “NEG” (for suppression), respectively, if they promote and suppress the specifically selected cellular function on which they act simultaneously.
The protein information database 3 stores information about the relation between the interaction of two molecules and the cellular function for the molecule set of PP-type involving various proteins. FIGS. 10 to 14 show, as examples, how the cellular function is affected by the molecule set of PP-type involving the TP53 molecule.
The molecule set of PP-type involving TP53 and PTEN as shown in
The molecule set of PP-type involving TP53 and GADD45A as shown in
In the same way as mentioned above, proteins involved in the molecule sets of PP-type shown in
Now, the description of
The protein information analyzing unit 4 includes the protein expression ratio arithmetic unit 21, the point accumulating unit 22, the factor setting unit 23, the operation input acquisition unit 24, the database building and processing unit 25, the network building unit 26, the target molecule inferring unit 27, and the result output unit 28.
The protein expression ratio arithmetic unit 21 receives from the mRNA expression analyzing unit 2 (or the protein kit 7) information about the amount of target protein expressed in normal cells and information about the amount of target protein expressed in sample cells. It compares the amount of target protein expressed in normal cells with the amount of target protein expressed in sample cells and calculates the increase or decrease of the amount of target protein expressed. It supplies the thus obtained value as the protein index to the point accumulating unit 22.
The point accumulating unit 22 receives the value of protein index from the protein expression ratio arithmetic unit 21 and calculates the accumulated value of scores for individual cellular functions by using the value of protein index of two proteins constituting the molecule set of NN-type, PN-type, and PP-type stored in the protein information database 3 and the value of the factor set up in the factor setting unit 23. If the point accumulating unit 22 gives a positive value of score for the cellular function, it means that the cell for detection promotes the cellular function; otherwise, it means that the cell for detection suppresses the cellular function.
The point accumulating unit 22 performs arithmetic process in the following manner for the molecule sets of NN-type, PN-type, and PP-type.
Association with cellular function and scoring are carried out as follows for the molecule set of NN-type involving INS and IFNG.
The point accumulating unit 22 receives from the protein expression ratio arithmetic unit 21 the values of protein index for the two proteins constituting the molecule set of NN-type and then calculates the absolute value of the difference between the two values. Subsequently, it assigns the absolute value to be positive or negative according to whether each cellular function is promoted or suppressed, and multiplies it by the factor set up by the factor setting unit 23, thereby giving the score of the cellular function associated with the molecule set.
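By way of illustration only, the NN-type scoring described above may be sketched as follows. The function name `nn_score`, its parameters, and the numerical values are hypothetical. The sign rule follows the description: the score is positive when the protein that promotes the cellular function has the larger protein index, and negative otherwise.

```python
def nn_score(index_a: float, index_b: float, promoted_by_a: bool, factor: float) -> float:
    """Score one cellular function for an NN-type molecule set (proteins A and B).

    promoted_by_a: True if the function is promoted by A (and suppressed by B),
                   False if it is promoted by B (and suppressed by A).
    """
    magnitude = abs(index_a - index_b)       # absolute value of the difference
    if magnitude == 0:
        return 0.0                           # neither protein dominates
    a_dominates = index_a > index_b
    sign = 1.0 if a_dominates == promoted_by_a else -1.0
    return sign * magnitude * factor


# Hypothetical protein indexes and factor, for explanation only.
index_ins, index_ifng, factor = 1.5, -0.5, 3.0
print(nn_score(index_ins, index_ifng, promoted_by_a=True, factor=factor))   # function promoted by INS
print(nn_score(index_ins, index_ifng, promoted_by_a=False, factor=factor))  # function promoted by IFNG
```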
The cellular function associated with INS and IFNG for the molecule set of NN-type is classified into two categories as shown in
The point accumulating unit 22 receives the protein index of INS and the protein index of IFNG from the protein expression ratio arithmetic unit 21. If the difference between the two indexes for the cellular function promoted by INS is larger than 0, it adds a positive sign to the absolute value of the difference. If the difference between the two indexes is smaller than 0, then it adds a negative sign to the absolute value of the difference. It multiplies the signed value by the factor set up in the factor setting unit 23. The resulting value is the score of the cellular function controlled by the two proteins (shown in
Association with cellular function and scoring are carried out as follows for the molecule set of PN-type involving INS and JUN.
The point accumulating unit 22 receives from the protein expression ratio arithmetic unit 21 the value of protein index for either of the two proteins constituting the molecule set of PN-type which is promoted. It makes the value positive or negative according to whether the cellular function is promoted or suppressed and then multiplies it by the factor set up by the factor setting unit 23, thereby giving the score of the cellular function associated with the molecule set.
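By way of illustration only, the PN-type scoring described above may be sketched as follows. The function name `pn_score` and the numerical values are hypothetical. It is assumed here that the sign is applied to the protein index as given, without first taking its absolute value; the description does not state this explicitly.

```python
def pn_score(promoted_index: float, acts_positively: bool, factor: float) -> float:
    """Score one cellular function for a PN-type molecule set.

    promoted_index:  protein index of the protein that is promoted by its partner.
    acts_positively: True if the molecule set acts positively (POS) on the function,
                     False if it acts negatively (NEG).
    """
    sign = 1.0 if acts_positively else -1.0
    return sign * promoted_index * factor  # assumption: the index is used as-is


# Hypothetical protein index and factor, for explanation only.
print(pn_score(0.8, acts_positively=True, factor=2.0))
print(pn_score(0.8, acts_positively=False, factor=2.0))
```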
The cellular function associated with INS and JUN for the molecule set of PN-type is classified into two categories as shown in
Association with cellular function and scoring are carried out as follows for the molecule set of PP-type involving TNF and TP53.
The point accumulating unit 22 receives from the protein expression ratio arithmetic unit 21 the values of protein index for the two proteins constituting the molecule set of PP-type and then calculates their product. Subsequently, it assigns the product to be positive or negative according to whether each cellular function is promoted or suppressed, and multiplies it by the factor set up by the factor setting unit 23, thereby giving the score of the cellular function associated with the molecule set.
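By way of illustration only, the PP-type scoring described above may be sketched as follows. The function name `pp_score` and the numerical values are hypothetical. As an assumption, the sign is applied to the product of the two protein indexes as computed.

```python
def pp_score(index_a: float, index_b: float, acts_positively: bool, factor: float) -> float:
    """Score one cellular function for a PP-type molecule set (proteins A and B).

    acts_positively: True if the function is promoted (POS), False if suppressed (NEG).
    """
    product = index_a * index_b            # product of the two protein indexes
    sign = 1.0 if acts_positively else -1.0
    return sign * product * factor


# Hypothetical protein indexes and factor, for explanation only.
print(pp_score(1.5, 2.0, acts_positively=True, factor=1.0))
print(pp_score(1.5, 2.0, acts_positively=False, factor=1.0))
```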
The cellular function associated with TNF and TP53 for the molecule set of PP-type is classified into two categories as shown in
The protein index is calculated in the following manner which is explained with reference to
In
The combination of 19 proteins (shown in
The combination of 19 proteins (shown in
The combination of 19 proteins (shown in
As mentioned above, the point accumulating unit 22 calculates the score of the cellular function (as explained with reference to FIGS. 15 to 18) for the molecule sets of NN-type, PN-type, and PP-type, and then accumulates the scores of individual cellular functions and supplies the results to the result output unit 28 and the target molecule inferring unit 27.
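By way of illustration only, the accumulation of scores for individual cellular functions may be sketched as follows. The names `accumulate` and `interpret`, the cellular function names, and the score values are hypothetical. The interpretation follows the description: a positive accumulated score is read as promotion of the cellular function, and any other value as suppression.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def accumulate(contributions: Iterable[Tuple[str, float]]) -> Dict[str, float]:
    """Sum the per-molecule-set scores for each cellular function."""
    totals: Dict[str, float] = defaultdict(float)
    for cellular_function, score in contributions:
        totals[cellular_function] += score
    return dict(totals)


def interpret(total: float) -> str:
    """Positive accumulated score: promoted; otherwise: suppressed."""
    return "promoted" if total > 0 else "suppressed"


# Hypothetical contributions from NN-type, PN-type, and PP-type molecule sets.
contributions = [
    ("cell proliferation", 6.0),
    ("apoptosis", -6.0),
    ("cell proliferation", -1.5),
    ("apoptosis", 3.0),
]
totals = accumulate(contributions)
for name, total in totals.items():
    print(name, total, interpret(total))
```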
The factor setting unit 23 sets up the factor for score accumulation to be executed by the point accumulating unit 22. The factor should preferably be set up such that it takes the largest value for NN-type and the smallest value for PP-type. If there is a molecular bond between two molecules involved in the molecule set, the factor should be multiplied by a prescribed value larger than 1. These factors are previously obtained by experiment and experience; they may be set up in the factor setting unit 23 or may be changed by the user through processing in the operation input acquisition unit 24.
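By way of illustration only, the setting of the factor may be sketched as follows. The numerical values and the names `BASE_FACTOR`, `BOND_MULTIPLIER`, and `factor_for` are hypothetical; the description states only that the factor is preferably largest for NN-type and smallest for PP-type, and that it is multiplied by a prescribed value larger than 1 when the two molecules have a molecular bond.

```python
# Hypothetical values chosen only to satisfy NN > PN > PP.
BASE_FACTOR = {"NN": 3.0, "PN": 2.0, "PP": 1.0}

# Hypothetical multiplier; the description requires only that it be larger than 1.
BOND_MULTIPLIER = 1.5


def factor_for(set_type: str, has_molecular_bond: bool = False) -> float:
    """Return the factor for a molecule set of the given type."""
    factor = BASE_FACTOR[set_type]
    if has_molecular_bond:
        factor *= BOND_MULTIPLIER
    return factor


print(factor_for("NN"))
print(factor_for("PP", has_molecular_bond=True))
```

Keeping the values in a single table makes it straightforward for the user to change them through the operation input acquisition unit 24.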
The operation input acquisition unit 24 is an input device such as keyboard, mouse, touch pad, and touch panel, which receives inputs in response to the user's operation. It permits the user to change the setting of the factor in the factor setting unit 23, to change the value of protein index in the simulation by the target molecule inferring unit 27 (mentioned later), and to update the database in the database building and processing unit 25. It supplies the entry to the factor setting unit 23, the target molecule inferring unit 27, and the database building and processing unit 25.
The database building and processing unit 25 updates and supplements various kinds of information stored in the protein information database 3 according to the user's input (which is supplied from the operation input acquisition unit 24) or a database supplied externally through the network interface (not shown).
The target molecule inferring unit 27 performs simulation to infer the target molecule on the basis of score for each cellular function obtained from processing by the point accumulating unit 22.
The target molecule inferring unit 27 simulates the change of score for cellular function which occurs when the protein index of specific molecule changes in the expression of mRNA of DAOY (cultured cell of human medulloblastoma), which was explained above with reference to
The value in
As shown in
The foregoing suggests that any treatment (with an anticancer agent, for example) to suppress the function of proteins (AKT1 and IL6) causes at least the cultured cell of human medulloblastoma (DAOY) to die rather than proliferate. Finding the combination of proteins that gives the most remarkable effect will help in the search for candidate target molecules for an anticancer agent.
The target molecule inferring unit 27 may also be designed such that it performs simulation to infer the target molecule based on the protein network model built up by the network building unit 26.
The network building unit 26 builds up the molecule network based on the information stored in the protein information database 3.
Any increase or decrease of the index of a certain protein in the protein network model built up by the network building unit 26 affects the index of other proteins connected with the network. The target molecule inferring unit 27 simulates the change of index of individual proteins in the network to infer how an increase or decrease of protein index at one node affects the protein index at other nodes (adjacent to the node in which the protein index has changed), on the assumption that the effect in the first adjacent node is 50%, the effect in the second adjacent node is 30%, the effect in the third adjacent node is 10%, and so on. The protein network model built up by the network building unit 26 consists of more than one molecule network (similar to that shown in
The target molecule inferring unit 27 repeats the process of accumulating the score of cellular function by using the result of the simulation which has been carried out by means of the molecule network, thereby inferring the target molecule.
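The attenuation of an index change across the network may be sketched as follows. This is a minimal sketch under stated assumptions: the network is represented as a hypothetical adjacency dictionary, the distance of a node is taken as the shortest number of hops from the changed node, no effect is assumed beyond the third adjacent node (the text gives only 50%, 30%, and 10%), and the sign of each interaction (promotion or suppression) is ignored.

```python
from collections import deque

# Attenuation per hop taken from the text: 50%, 30%, 10%.
# Values beyond the third adjacent node are not given, so this sketch
# assumes no effect there.
ATTENUATION = {1: 0.5, 2: 0.3, 3: 0.1}


def propagate_index_change(network, source, delta):
    """Return the assumed change of protein index at every affected node.

    network: dict of protein name -> iterable of adjacent protein names
    source:  protein whose index is changed directly
    delta:   change applied to the index of the source protein
    """
    changes = {source: delta}
    distance = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        hops = distance[node] + 1
        if hops not in ATTENUATION:
            continue  # beyond the third adjacent node: assumed unaffected
        for neighbor in network.get(node, ()):
            if neighbor in distance:
                continue  # already reached by a route at least as short
            distance[neighbor] = hops
            changes[neighbor] = delta * ATTENUATION[hops]
            queue.append(neighbor)
    return changes
```

Breadth-first traversal is used here so that each node is assigned the attenuation of its shortest route; a node reachable by more than one route would need a separate rule, which this sketch does not attempt.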
The result output unit 28 receives an accumulated score of cellular function from the point accumulating unit 22 or receives the result of inference of the target molecule from the target molecule inferring unit 27, and then delivers it to either or both of the result display unit 5 and the result analyzing unit 6.
The result display unit 5 consists of a display device such as a CRT or an LCD. It displays the result of accumulated score of cellular function or the result of inference of the target molecule which has been received from the result output unit 28. The user will be able to perform input operation to infer the target molecule by reference to the result of accumulated score for cellular function which is displayed on the result display unit 5.
The result analyzing unit 6 accumulates the result of accumulated score for cellular functions or the result of inference of the target molecule (which has been received from the result output unit 28) and then performs analysis according to need.
To be concrete, the result analyzing unit 6 accumulates chronologically the result of accumulated score for cellular functions of the same test subject and analyses the chronological change, so that it permits one to correctly judge whether or not the target protein has decreased as the result of medication to the test subject during the specific period. Moreover, it also permits one to confirm the effect (increase or decrease in expression) on other proteins or the effect on other cellular function by medication in that period.
This description is based on the assumption that the result analyzing unit 6 is independent of the protein information analyzing unit 4. However, the former may be included in the latter.
The protein analyzing system according to the present invention permits one to analyze in a simple manner any system anomaly of disease caused by anomalous molecule network in cells (such as cancer).
In other words, the protein analyzing system according to the present invention classifies interactions between two molecules into five categories: NN-type, in which two proteins suppress each other; PN-type, in which the first protein promotes the second and the second suppresses the first; PP-type, in which two proteins promote each other; P-type, in which the first protein only promotes the second; and N-type, in which the first protein only suppresses the second. It then calculates and accumulates the score for the cellular function associated with each pair of proteins falling under any of these categories, thereby digitizing the cellular function. Combining this result with the variation of cellular function makes it possible to infer the system structure of cells.
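The five categories may be represented, for example, as follows. This is a sketch only: the enumeration, the function name, and the string labels "promote" and "suppress" are illustrative assumptions and are not part of this specification.

```python
from enum import Enum


class Interaction(Enum):
    NN = "NN"  # the two proteins suppress each other
    PN = "PN"  # the first promotes the second, the second suppresses the first
    PP = "PP"  # the two proteins promote each other
    P = "P"    # the first only promotes the second
    N = "N"    # the first only suppresses the second


def classify(first_on_second, second_on_first=None):
    """Classify a pair from the effect each protein has on the other.

    Each argument is "promote", "suppress", or None (no effect).
    The caller is assumed to order the pair so that the first protein
    is the promoting (or the only acting) one.
    """
    table = {
        ("suppress", "suppress"): Interaction.NN,
        ("promote", "suppress"): Interaction.PN,
        ("promote", "promote"): Interaction.PP,
        ("promote", None): Interaction.P,
        ("suppress", None): Interaction.N,
    }
    key = (first_on_second, second_on_first)
    if key not in table:
        raise ValueError(f"pair ordering or effects not covered: {key}")
    return table[key]
```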
The present invention makes it possible to analyze the relation between the cellular function and the intermolecular action of proteins instead of merely paying attention to a single molecule. Therefore, it permits one to investigate the change in cellular function which occurs when the amount of specific proteins expressed fluctuates. This capability may be used to simulate a combination to restore the normal state by changing the anomalous cellular function (resulting from cancer, for example). In this way it is possible to infer the target molecule important for medical treatment.
The target molecule important for medical treatment will be inferred by means of the molecule network consisting of nodes representing proteins and links representing interactions classified into five categories mentioned above. In this way it is possible to infer the target molecule more accurately.
Once a correct target molecule is inferred, it would be very useful in establishing an adequate way of medication to restore the anomalous system resulting from diseases.
In what follows, the process for analysis by means of the protein analyzing system will be described with reference to
In Step S1, the chip forming unit 1 prepares a DNA chip to determine the amount of protein expressed (for analysis).
One DNA chip has more than one probe, so that it can determine the amount of more than one protein expressed.
In Step S2, the mRNA expression analyzing unit 2 carries out hybridization for the normal cell and the sample cell. To be concrete, this step is carried out as follows. A target for control and a target for detection are dropped onto the DNA chip which has been prepared in Step S1. The target for control is produced by using complementary DNA (cDNA) which has been replicated by reverse transcription from mRNA collected from normal cells. The target for detection is produced by using complementary DNA (cDNA) which has been replicated by reverse transcription from mRNA collected from sample cells. The probe and target are bound together (hybridized) through the reaction to form complementary strands (double strands) between the nucleic acids having complementary base sequences.
In Step S3, the mRNA expression analyzing unit 2 calculates the amount of target protein expressed in normal cells and the amount of target protein expressed in sample cells, and then it sends the result to the protein expression ratio arithmetic unit 21 of the protein information analyzing unit 4.
The detailed procedure for Step S3 is as follows. The DNA chip on which hybridization has occurred is cleaned, and an intercalator which emits fluorescence upon irradiation with exciting light is added. The intercalator enters between the probe and the target only if they are hybridized; it does not enter between them if they are not hybridized. Upon irradiation with exciting light, the intercalator emits fluorescence, which is subsequently condensed by an objective lens or the like and separated from the exciting light by a prism. The condensed and separated fluorescence enters a photodiode for image analysis and calculation of the amount of target protein expressed.
In Step S4, the protein expression ratio arithmetic unit 21 of the protein information analyzing unit 4 calculates an increase or decrease in the amount of target protein expressed in the sample cells in comparison with the amount of target protein in the normal cells. Subsequently, it sends the result of calculations to the point accumulating unit 22. In other words, the protein expression ratio arithmetic unit 21 calculates the protein index on the basis of the amount of target protein (control) expressed in the normal cells and the amount of target protein expressed in the sample cells, both of which have been supplied from the mRNA expression analyzing unit 2. Then, it sends the result to the point accumulating unit 22.
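The calculation in Step S4 may be illustrated as follows. The exact formula of the protein index is not restated in this passage, so this sketch simply assumes the index to be the ratio of the amount expressed in the sample cells to the amount expressed in the normal cells; the function name is hypothetical.

```python
def protein_index(control_amount, sample_amount):
    """Assumed index: expression in sample cells relative to normal cells.

    A value above 1 means an increase in the sample cells and a value
    below 1 means a decrease (under the ratio assumption of this sketch).
    """
    if control_amount <= 0:
        raise ValueError("control expression must be positive")
    return sample_amount / control_amount
```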
In Step S5, the point accumulating unit 22 calculates the point accumulation (mentioned later) according to the flow sheet shown in
In Step S6, the result output unit 28 sends the result obtained in Step S5 to either or both of the result display unit 5 and the result analyzing unit 6.
In Step S7, the operating input acquisition unit 24 decides whether or not an instruction has been given to execute the target molecule inferring process.
In Step S8, the target molecule inferring unit 27 performs the process to infer the target molecule according to the flow sheets shown in
In Step S9, which is performed if it is decided in Step S7 that no instruction has been issued to execute the process of inferring the target molecule or after completion of the processing in Step S8, the result analyzing unit 6 decides whether or not an instruction has been given to analyze the result of analysis of proteins which has been supplied from the result output unit 28 of the protein information analyzing unit 4.
In Step S10, the result analyzing unit 6 chronologically analyzes the result of protein analysis if it is judged in Step S9 that an instruction has been issued to analyze the result of protein analysis. The procedure for analysis includes accumulating chronologically the result of accumulation of the score for cellular function of the same test subject, analyzing the chronological changes, confirming whether or not the target protein has decreased as the result of medication to the test subject during the prescribed period, and confirming the effect on other proteins due to medication in the prescribed period or the effect on other cellular functions.
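The chronological comparison in Step S10 may be sketched as follows. This is a minimal sketch: the record layout (pairs of date and value for one test subject) and the function name are assumptions, and the change is taken simply as the last value minus the first value within the period.

```python
from datetime import date


def change_over_period(records, start, end):
    """Return the last-minus-first value of the records dated in [start, end].

    records: iterable of (date, value) pairs for one test subject, where
             value is, for example, the index of the target protein or the
             accumulated score of one cellular function
    A negative result means the value has decreased during the period.
    """
    in_period = sorted((d, v) for d, v in records if start <= d <= end)
    if len(in_period) < 2:
        raise ValueError("need at least two measurements in the period")
    return in_period[-1][1] - in_period[0][1]
```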
In Step S11, which is performed if it is judged in Step S9 that no instruction has been issued to analyze the result of protein analysis or after completion of the processing in Step S10, it is decided whether or not an instruction has been issued to terminate the processing.
If it is decided in Step S11 that no instruction to terminate the processing has been received from the user, the process returns to Step S7 and the steps after S7 are repeated. If it is judged in Step S11 that an instruction to terminate the processing has been received from the user, the process ends.
The foregoing processing gives the score of cellular function in response to the amount of expression for individual molecule sets classified according to interactions between two molecules. The score of cellular function permits one to infer the target molecule and to analyze chronologically the result of protein analysis.
In what follows, the process for point accumulation to be performed in Step S5 shown in
In Step S41, the point accumulating unit 22 extracts one of the molecule sets of NN-type, PN-type, or PP-type, which involves the proteins whose expression has been detected.
In Step S42, the point accumulating unit 22 extracts the factor, which has been set up by the factor setting unit 23, according to the classification of the molecule sets (NN-type, PN-type, or PP-type) and the presence or absence of the molecular bond.
In Step S43, the point accumulating unit 22 detects whether each of cellular functions corresponding to the molecule sets is promoted or suppressed, by referencing the information about the relation between the molecule set and the cellular function shown in FIGS. 3 to 14 which is stored in the protein information database 3.
Each of the cellular functions is associated with the molecule set as explained with reference to FIGS. 15 to 17.
In Step S44, the point accumulating unit 22 multiplies by a factor the value as the base of the score (said value being obtained as explained with reference to
In Step S45, the point accumulating unit 22 decides whether or not the score has been added to all the molecule sets. If it is judged in Step S45 that the score is not yet added to all the molecule sets, the step returns to Step S41 and subsequent steps are repeated.
If it is decided in Step S45 that the score has been added to all the molecule sets, the point accumulating unit 22 performs accumulation for each cellular function in Step S46, and the step returns to Step S5 and proceeds to Step S6 (shown in
The above-mentioned process accumulates the score for each cellular function, thereby allowing one to know which cellular function is promoted or suppressed in the sample cells.
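Steps S41 to S46 may be sketched as follows. This is an illustrative sketch only: the factor values, the dictionary layout of a molecule set, and the function name are assumptions, and the base value of the score is assumed to have been computed beforehand from the protein indices of the pair. The actual factors are those set up by the factor setting unit 23.

```python
from collections import defaultdict

# Hypothetical factors keyed by (interaction type, molecular bond present).
# The real values are set up by the factor setting unit 23 and are not
# given in this passage.
FACTORS = {
    ("NN", True): 1.0, ("NN", False): 0.5,
    ("PN", True): 1.0, ("PN", False): 0.5,
    ("PP", True): 1.0, ("PP", False): 0.5,
}


def accumulate_points(molecule_sets):
    """Accumulate a score for each cellular function.

    Each molecule set is a dict with:
      type:      "NN", "PN", or "PP"
      bound:     True if a molecular bond is present
      base:      base value of the score (assumed to be precomputed)
      functions: dict of cellular function -> +1 (promoted) or -1 (suppressed)
    """
    totals = defaultdict(float)
    for mset in molecule_sets:                            # S41, repeated via S45
        factor = FACTORS[(mset["type"], mset["bound"])]   # S42
        score = factor * mset["base"]                     # S44
        for function, direction in mset["functions"].items():  # S43
            totals[function] += direction * score         # S46
    return dict(totals)
```

A positive total suggests that the cellular function is promoted in the sample cells and a negative total that it is suppressed, under the sign convention assumed in this sketch.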
In what follows, the target molecule inferring process 1 to be performed in Step S8 shown in
The target molecule inferring process 1 infers the target molecule based only on the changed value of the protein index, without using the molecule network.
In Step S71, the operating input acquisition unit 24 decides whether or not it has received an input for the changed value of the protein index. If the operating input acquisition unit 24 decides in Step S71 that it has not yet received an input for the changed value of the protein index, it repeats the process in Step S71 until it judges that it has received an input for the changed value of the protein index.
In Step S72, the operating input acquisition unit 24 sends the value of the protein index entered to the target molecule inferring unit 27 if it is judged in Step S71 that it has received an input for the changed value of the protein index. The target molecule inferring unit 27 sends the changed value of the protein index entered to the point accumulating unit 22, thereby causing the point accumulating unit 22 to accumulate the point by using the changed protein index as explained with reference to
In Step S73, the result of calculation by the point accumulating unit 22 is supplied to the result output unit 28 and the target molecule inferring unit 27.
In Step S74, the operating input acquisition unit 24 decides whether or not it has received an input for the changed value of different protein index. If the operating input acquisition unit 24 decides in Step S74 that it has received an input for the changed value of different protein index, the process returns to Step S72 and the subsequent processes are repeated. If the operating input acquisition unit 24 judges in Step S74 that it has not yet received an input for the changed value of different protein index, the process returns to Step S8 shown in
The foregoing process performs point accumulation by using the changed protein index as explained with reference to
In what follows, the target molecule inferring process 2 to be performed in Step S8 shown in
The target molecule inferring process 2 infers the target molecule by simulating the changed value of protein index for a plurality of molecules by using the molecule network.
In Step S101, the operating input acquisition unit 24 decides whether or not it has received an input for the changed value of the protein index. If the operating input acquisition unit 24 decides in Step S101 that it has not yet received an input for the changed value of the protein index, it repeats the process in Step S101 until it judges that it has received an input for the changed value of the protein index.
In Step S102, the operating input acquisition unit 24 sends the value of the protein index entered to the target molecule inferring unit 27 if it is decided in Step S101 that it has received an input for the changed value of the protein index. The target molecule inferring unit 27 sends the changed value of the protein index entered to the network building unit 26, thereby causing the network building unit 26 to calculate the variation of the protein index at individual nodes that occurs when the value of the prescribed protein index is changed in the molecule network built up by the network building unit 26. The network building unit 26 calculates the variation of the protein index at each node based on the changed value of the protein index supplied, and sends the result to the point accumulating unit 22.
It is desirable to have a means for considering the presence of nodes under influence of more than one route for the increase or decrease of the protein index at a certain node.
In Step S103, the point accumulating unit 22 accumulates the point by using the protein index after simulation in the same way as explained with reference to
In Step S104, the result of calculation by the point accumulating unit 22 is supplied to the result output unit 28 and the target molecule inferring unit 27.
In Step S105, the operating input acquisition unit 24 decides whether or not it has accepted the input of the changed value of the different protein index. If it is decided in Step S105 that the input of the changed value of the different protein index has been accepted, the process returns to Step S102 and the subsequent process is repeated. If it is decided in Step S105 that the input of the changed value of the different protein index has not been accepted, the process returns to Step S8 shown in
The above-mentioned process accumulates the score for cellular function once again by using the result of simulation by means of the molecule network, thereby allowing one to infer the target molecule.
A series of processes mentioned above may be implemented by means of hardware or software. At least part of the above-mentioned process may be carried out by means of the personal computer 101 shown in
In
The CPU 111, the ROM 112, and the RAM 113 are connected to one another through the internal bus 114, which is connected to the input/output interface 115.
The input/output interface 115 is connected to the input device 116 such as keyboard and mouse, the output device 117 such as display and speaker, the memory unit 118 such as hard disk, and the communication unit 119 such as modem and terminal adaptor. The communication unit 119 performs communications through various networks including telephone circuit and CATV.
The input/output interface 115 is connected to the drive 120 according to need. The drive 120 may be equipped with the removable medium 121, such as magnetic disc, optical disc, magneto-optical disc, and semiconductor memory. The computer program read out from the removable medium 121 by the drive 120 is installed in the memory unit 118 according to need.
In the case where software is used for processing, the program constituting the software is installed from the network or recording medium.
The recording medium may be the ROM 112 in which the program is recorded or the hard disk included in the memory unit 118. In this case the ROM 112 and the hard disk are built into the personal computer delivered to the user. The program may also be recorded in the removable medium 121, which is distributed to the user separately from the computer proper.
In this specification, the steps for the program recorded in the recording medium may be carried out chronologically in the order listed; however, they may also be carried out in parallel or independently.
In this specification, the term “system” denotes an entire apparatus including a plurality of devices.
Incidentally, the embodiments of the present invention are not limited to those mentioned above; they may be modified variously without departing from the scope of the present invention.
Number | Date | Country | Kind |
---|---|---|---|
2005-266728 | Sep 2005 | JP | national |