The technology described herein relates to methods for determining metabolites that can be used as agents and/or targets for the therapeutic treatment of disease. The levels of one or more metabolites identified using these methods can be manipulated to increase or decrease the endogenous and/or intracellular levels of these metabolites by, for example, administration of the metabolites themselves, inhibition/activation of relevant enzymes, and/or inhibitors/activators of specific transporters.
Today the search for disease cures centers on identifying key molecular determinants of the disease. If such molecules—and the roles they play—can be identified, then regulation of their concentration, or inhibition of their function, may be successful routes to a disease therapy. In the complex biochemical interplay that underlies most disease conditions, many molecules play more than one role—sometimes a useful role as well as a detrimental role—and many molecules are created and altered as the biochemical machinery performs its task. Molecules that are created during metabolic processes—metabolites—may prove useful targets in developing many disease therapies.
Elucidating the metabolic changes exhibited by cancer cells is important not only for diagnostic purposes, but also to more deeply understand the molecular basis of carcinogenesis, which could lead to novel therapeutic approaches. Certain metabolic processes may play fundamental roles in cancer progression by regulating the expression of oncogenes or modulating various signal transduction systems. The significance of other metabolic phenotypes observed in cancer is more controversial, such as the shift in energy production from oxidative phosphorylation (respiration) to aerobic glycolysis, which is known as the Warburg effect. The prevailing view recently has been that the Warburg effect is a consequence of the cancer process (secondary events due to hypoxic tumor conditions) rather than a mechanistic determinant, as originally hypothesized. Recently, however, a different picture of the role of metabolic changes in tumorigenesis has emerged. For example, the dichloroacetate-induced reversion from a cytoplasm-based glycolysis to a mitochondria-located glucose oxidation inhibits cancer growth. This suggests that a glycolytic shift is a fundamental requirement for cancer progression.
Changes in intracellular concentrations of certain metabolites can influence the rate of cancer cell growth. A metabolite can exert this effect by acting as a signaling molecule, a role that does not preclude other important cellular functions. For instance, diacylglycerol, a lipid that confers specific structural and dynamic properties to biological membranes and serves as a building block for more complex lipids, is also an essential second messenger in mammalian cells whose dysregulation contributes to cancer progression. Similarly, structural components of cell membranes, such as the sphingolipids ceramide and sphingosine, are also second messengers with antagonizing roles in cell proliferation and apoptosis. Pyridine nucleotides constitute yet another example, having well characterized functions as electron carriers in metabolic redox reactions and roles in signaling pathways. In particular, NAD+ modulates the activity of sirtuins, a recently discovered family of deacetylases that may contribute to breast cancer tumorigenesis. Arginine is yet another metabolite involved in numerous biosynthetic pathways that also has a fundamental role in tumor development, apoptosis, and angiogenesis.
Cellular metabolites can also be involved in the control of cell proliferation by directly regulating gene expression. Signaling pathway-independent modulation of gene expression by metabolites can occur in several ways. For example, metabolites can bind to regulatory regions of certain mRNAs (riboswitches), inducing allosteric changes that regulate the transcription or translation of the RNA transcript; however, this type of direct metabolite-RNA interaction has not yet been detected in humans. In another example, transcription factors can be activated upon metabolite binding (e.g., binding of steroid hormones to the estrogen receptor transcription factor induces gene expression events leading to breast cancer progression). In yet another example, metabolites can be involved in epigenetic processes such as post-translational modification of histones that regulate gene expression by changing chromatin structure. The modulation of the rate of histone acetylation by nuclear levels of acetyl-CoA is an example of metabolic control over chromatin structure that involves epigenetic changes linked to cell proliferation and carcinogenesis.
Manipulation of specific metabolic pathways has been the basis of several anticancer therapies that have been proposed based on experimental evidence, that are subject to validation in clinical trials, and/or that are currently in use. An exemplary anticancer therapy that was proposed based on experimental evidence is the inactivation of the metabolic enzyme KIAA1363, which decreased the rate of tumor growth in vivo. Several anticancer treatments that exploit the antiproliferative action of ceramide are examples of therapies based on the pharmacological manipulation of a metabolic pathway that are currently in clinical trials. A metabolite-based therapy that has been used since 1970 for acute lymphoblastic leukemia, and has also been applied to ovarian cancer and other tumors, consists of depleting circulating asparagine by administration of the bacterial enzyme L-asparaginase.
To date, however, the search for metabolites that have a direct connection to a particular disease state has been haphazard. Rather than making reasonable predictions of the metabolites that are likely to be involved in a particular disease, researchers still rely on fortuitous discoveries.
In general, preventive and therapeutic anticancer approaches based on the pharmacological manipulation of metabolism aim to increase or decrease the intracellular levels of certain metabolites by, for example, administration of either the metabolites themselves, inhibitors/activators of relevant enzymes, and/or inhibitors/activators of specific transporters.
A method for identifying one or more metabolites associated with a disease, the method comprising: obtaining a set of gene-expression data from diseased cells of an individual with the disease; obtaining a reference set of gene-expression data from control cells; assigning an expression status to each gene in the gene expression data that encodes a gene product, wherein the expression status for each gene is one of: up-regulated in the diseased cells relative to the control cells; down-regulated in the diseased cells relative to the control cells; expressed by both the diseased cells and the control cells at statistically indistinguishable levels; and not expressed by both the diseased cells and the control cells; determining the effects of gene products on metabolite levels for each metabolite in a list of human metabolites: identify a set of gene products that have an effect on the metabolite; using the expression status for the gene that encodes each gene product that has an effect on the metabolite, predict whether an intracellular level of the metabolite in the diseased cells is increased or decreased relative to its level in control cells; identifying one or more of: those metabolites whose intracellular level is predicted to be lower in diseased cells than in control cells; and those metabolites whose intracellular level is predicted to be higher in diseased cells than in control cells, as associated with the disease.
A method for identifying one or more metabolites associated with a disease, the method comprising: comparing gene expression data from diseased cells to gene expression data from control cells in order to deduce genes that are differentially-regulated in the diseased cells relative to the control cells; based on enzyme function and pathway data for all human metabolites that utilize the genes that are differentially-regulated in the disease cells, identifying one or more metabolites whose intracellular levels are lower in diseased cells than in control cells, and thereby associating the one or more metabolites with the disease.
A method for identifying one or more metabolites associated with a disease, the method comprising: comparing gene expression data from diseased cells to gene expression data from control cells in order to deduce genes that are differentially-regulated in the diseased cells relative to the control cells; based on enzyme function and pathway data for all human metabolites that utilize the genes that are differentially-regulated in the disease cells, identifying one or more metabolites whose intracellular levels are higher in diseased cells than in control cells, and thereby associating the one or more metabolites with the disease.
A method of determining a metabolite-based disease therapy, the method comprising: identifying one or more metabolites associated with the disease, by the methods described herein, and administering said one or more metabolites to an individual with the disease.
A method of treating an individual with a disease, the method comprising: administering to the individual a metabolite identified as associated with the disease by the methods described herein, in an amount sufficient to produce a therapeutic effect.
A method of determining a metabolite-based disease therapy, the method comprising: identifying one or more metabolites associated with the disease, by the methods described herein; and administering one or more drugs to change the levels of said one or more metabolites to an individual with the disease.
The present technology further comprises computer systems configured to carry out the methods described herein in whole or in part, and to provide results of said methods to a user, as for example on a display or in the form of a printout.
The present technology further comprises computer-readable media, encoded with computer-executable instructions for carrying out the methods described herein in whole or in part, when operated on by a suitably configured computer.
When it is stated that a computer system is configured to carry out a method in whole or in part, or that a computer-readable medium is configured with instructions for carrying out a method in whole or in part, it is understood to mean that one or more steps of the method may be carried out other than by the computer or computer system. For example, gene expression data may be obtained manually and read into the computer, or written on to a computer-readable medium.
The details of one or more embodiments of the invention are set forth in the accompanying drawings and the description herein. Other features, objects, and advantages of the invention will be apparent from the description and drawings, and from the claims.
Like reference symbols in the various drawings indicate like elements.
In some embodiments, a metabolomics-based system, such as a computer-based system, that utilizes various data such as metabolic data, can be used to identify one or more metabolites associated with a disease that may have potential as agents and/or targets for therapeutic treatment. The system described here can use a combination of gene-expression data and the relationships between metabolites and gene products to make predictions on the levels of metabolites in diseased cells compared to control cells.
By ‘gene product’ as used herein, is meant molecules, in particular biochemical molecules such as oligonucleotides (DNA, RNA, etc.) or proteins, resulting from the expression of a gene. A measurement of the amount of gene product can be used to infer how active a gene is. Abnormal amounts of gene product can be correlated with diseases, such as the overactivity of oncogenes which can cause cancer, the overexpression of Interleukin-10 which can induce symptoms in virus-induced asthma, and the underexpression of certain genes in early Parkinson's disease. Exemplary gene products of particular interest herein include small molecule transporters, and enzymes, because of their respective involvement in metabolic pathways.
Computational analysis of gene-expression data acquired from both diseased and control cells can determine gene products that are over or under expressed in diseased cells. Data indicative of the relationships between metabolites and gene products, such as data determined from biochemical pathways, enzyme function prediction, and the like, can be used to relate the effect of differential expression on metabolite levels. Considering the relationships and the gene-expression data, predictions can be made on the effect of a disease state on the endogenous and/or intracellular level of metabolites. As used herein, it is to be understood that “intracellular” includes any material that can penetrate a cell membrane, and therefore includes synthetic (non-naturally occurring) species such as pharmaceuticals. “Endogenous” includes those materials expressed, synthesized, or otherwise made naturally within cells.
The metabolites that are predicted to exist at different levels in diseased cells (relative to control cells, such as from a healthy individual) can be further evaluated as potential agents and/or targets for therapeutic treatments. For example, metabolites that exist at decreased levels in cancer cells, relative to control cells, can be potential agents for anticancer therapies. In that case, one or more metabolites can be supplemented to raise the cellular levels of each of these metabolites to within normal physiological ranges, for the purpose of restoring normal cell function. Similarly, metabolites that exist at increased levels in cancer cells can be targets for anticancer therapies. In this example, activation or inhibition of key enzymes could be used to lower cellular levels of each of these metabolites to within normal physiological levels. In either case, the systems and methods described herein can be used to identify which metabolites, from the larger group of known physiological metabolites, are likely to be agents and/or targets for therapeutic treatments.
Cellular metabolites can be produced and/or consumed by enzymes, bind to regulatory regions of mRNA, activate transcription factors, and/or regulate gene expression through post-translational modification. In diseased cells, certain genes can be over/under expressed leading to increased/decreased levels of one or more metabolites. In some circumstances, it may be possible to restore normal cell function in a diseased cell by returning one or more metabolite levels back to a normal range. In circumstances where a metabolite exists at a lower level in diseased cells, relative to control cells, raising the level of metabolite may have therapeutic value. Conversely, lowering the metabolite level in diseased cells exhibiting increased metabolite levels may also have therapeutic value. One method for determining possible therapeutic agents and/or targets would be to compare the actual intracellular levels of every human metabolite as they exist in normal and diseased states. Metabolites that exist in differential levels between the diseased and control cells could be candidates for further testing to determine their therapeutic value. Currently, however, there is no feasible way to implement such large-scale biochemical assays. As an alternative, gene expression studies, known to individuals skilled in the art, coupled with information relating to biochemical pathways (e.g., gene product function, enzyme function, and the like), can be utilized to predict metabolites that may exist at increased/decreased levels in diseased cells, relative to control cells. These predicted metabolites can be further evaluated, using methods known to individuals skilled in the art, to determine their value as agents and/or targets of therapeutic treatments.
Referring now to
In operation 120, the metabolomics-based system can obtain gene-expression data from studies performed on control cells. For example, gene-expression data can be obtained from previously performed gene expression studies of non-diseased cells that are similar in type to the cells from which the data in operation 110 was acquired. In other embodiments, studies can be performed on non-diseased cells, of a similar type, to obtain the gene-expression data. In operation 130, a differential analysis of the gene-expression data, obtained during operations 110 and 120, can be performed for the purpose of assigning an expression status to each of the genes. For example, genes can be assigned a status such as up-regulated in the diseased cells, down-regulated in the diseased cells, similarly expressed in both the diseased and control cells, or not expressed in both the diseased and control cells.
In operation 140, the effects of gene products on metabolite levels are determined from, for example, existing databases, computational enzyme-function prediction, or the like. In some embodiments, gene products and associated metabolites can be assigned to steps in metabolic pathways. Information from databases can be retrieved and analyzed to identify metabolite/gene product interactions found in the database. In other techniques, the function of, and metabolites related to, proteins with currently unknown function can be inferred using, for example, similarity to proteins with known functions. These relationships can then be used to determine the effect that a particular gene product has on a metabolite. For example, if the gene product (e.g., an enzyme) is determined to catalyze the production of a certain metabolite, it can be deduced that the gene product causes an increase in the intracellular level of the metabolite. Conversely, if the gene product is determined to transport the metabolite out of the intracellular space (e.g., into storage vesicles), it can be deduced that the gene product causes a decrease in the intracellular level of the metabolite. In some embodiments, this information can be determined during operation 140. In other embodiments, some or all of this information can be determined at a previous time and retrieved during operation 140.
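The deduction described for operation 140 can be sketched in code. This is a minimal illustration only, assuming each relationship has already been reduced to a (gene product, metabolite, role) triple; the `Role` names, the `IMPORTS` role, and the `build_effect_table` helper are hypothetical and do not come from any particular database.

```python
from enum import Enum
from typing import Dict, Iterable, Tuple


class Role(Enum):
    """How a gene product acts on a metabolite (hypothetical labels)."""
    PRODUCES = "produces"   # enzyme catalyzes production of the metabolite
    CONSUMES = "consumes"   # enzyme uses the metabolite as a substrate
    IMPORTS = "imports"     # assumed: transporter moves it into the cell
    EXPORTS = "exports"     # transporter moves it out of the intracellular space


# +1: the gene product tends to raise the intracellular level; -1: lower it.
EFFECT_OF_ROLE = {
    Role.PRODUCES: +1,
    Role.CONSUMES: -1,
    Role.IMPORTS: +1,
    Role.EXPORTS: -1,
}


def build_effect_table(
    relationships: Iterable[Tuple[str, str, Role]]
) -> Dict[str, Dict[str, int]]:
    """Map each metabolite to {gene product: +1 or -1}."""
    table: Dict[str, Dict[str, int]] = {}
    for gene_product, metabolite, role in relationships:
        table.setdefault(metabolite, {})[gene_product] = EFFECT_OF_ROLE[role]
    return table


# Illustrative relationships only; not taken from a real pathway database.
relationships = [
    ("enzyme A", "metabolite X", Role.PRODUCES),
    ("enzyme F", "metabolite X", Role.CONSUMES),
    ("transporter T", "metabolite Y", Role.EXPORTS),
]
effects = build_effect_table(relationships)
```

This sketch keeps a single effect per gene product and metabolite pair, so a gene product that catalyzes a reversible reaction would need a richer representation.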
In operation 150, the results of the previously described operations can be used to identify metabolites that are predicted to exist in increased/decreased levels in diseased cells relative to control cells. For example, the metabolomics-based system can create a genetic-metabolic matrix including all metabolites and their known relationships to gene products. An example of such a matrix can be found in
For example, metabolite X may be known to be produced by enzyme A (which is decreased in diseased cells) and consumed by enzyme F (which is increased in diseased cells), where the relationships between metabolite X and enzymes A and F were determined during operation 140 and the differential levels of enzymes A and F in diseased cells, compared to control cells, were determined during an analysis of gene-expression data, such as during operation 130. From the relationships between metabolites and gene products and the expression status of the genes that code for these gene products, the metabolomics-based system can predict the levels of metabolites in diseased cells relative to control cells. For example, the metabolite X described previously, because it is produced at lower levels in the diseased cells (due to the decreased expression of the gene that produces enzyme A) and consumed at higher levels in the diseased cells (due to the increased expression of the gene that produces enzyme F), can be predicted to exist at lower levels in the diseased cells. Information indicative of the level of metabolites in diseased cells compared to control cells is stored during operation 160 for display and/or future evaluation as potential agents and/or targets for therapeutic treatments.
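The metabolite X example can be expressed as a small prediction rule. The sketch below is one possible reading, not the definitive rule set: it assumes each gene product's effect is encoded as +1 (raises the level) or -1 (lowers it), and it assumes, as a simplification, that conflicting evidence yields an "unknown" prediction. The function name `predict_level` is hypothetical.

```python
from typing import Dict


def predict_level(effects: Dict[str, int], status: Dict[str, str]) -> str:
    """Predict a metabolite's level in diseased cells relative to control cells.

    effects: gene product -> +1 (raises the level) or -1 (lowers it)
    status:  gene product -> "Gup", "Gdown", "Gsimilar", or "Gnone"
    """
    votes = set()
    for gene_product, effect in effects.items():
        s = status.get(gene_product)
        if s == "Gup":        # more of this gene product in diseased cells
            votes.add(effect)
        elif s == "Gdown":    # less of this gene product in diseased cells
            votes.add(-effect)
        # Gsimilar, Gnone, or missing: no evidence either way
    if votes == {+1}:
        return "higher"
    if votes == {-1}:
        return "lower"
    return "unknown"          # assumed handling of no or conflicting evidence


# Metabolite X: produced by enzyme A (decreased), consumed by enzyme F (increased).
effects_x = {"enzyme A": +1, "enzyme F": -1}
status = {"enzyme A": "Gdown", "enzyme F": "Gup"}
prediction_x = predict_level(effects_x, status)
```

Both pieces of evidence point the same way here (less production, more consumption), so the sketch predicts a lower level, in line with the example in the text.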
In some embodiments, the metabolomics-based system can be used to identify agents and/or targets for anti-cancer therapies. For example, studies of ovarian cancer cells and normal ovarian cells can be used to predict metabolites that exist in different levels in the cancer cells (relative to normal cells). One or more of the metabolites, predicted to exist in differential levels, can then be evaluated as agents and/or targets for potential anti-cancer therapies. Metabolites that exist at decreased levels in cancer cells can be supplemented to raise intracellular levels to a near normal range, while metabolites that exist at increased levels can be targets for therapies that decrease the intracellular levels of the metabolites. Some therapies may involve only a single metabolite, while other therapies may involve multiple metabolites concurrently. In cases where multiple metabolites are involved concurrently, some metabolites may be supplemented, while other metabolites levels may be decreased. In one example, a metabolomics-based system such as described herein was used to predict that Seleno-L-methionine exists at decreased levels in ovarian cancer cells (e.g., Hey-A8 and Hey-A8 MDR cells). Subsequently, supplementation of Seleno-L-methionine was shown in vitro to inhibit the growth of Hey-A8 and Hey-A8 MDR cells.
In some embodiments, the metabolomics-based system can be used to identify metabolites that may have potential as agents and/or targets for therapeutic treatment. In one embodiment described herein, analysis of expression data, acquired through gene expression studies of diseased and control cells, can be used to identify genes that are expressed at different levels in diseased cells and control cells. This information can be combined with, for example, knowledge of biochemical pathways (e.g., the relationships between metabolites and gene products) and/or the predicted function of gene products (whose function is not known) to predict the relative level of metabolites in diseased cells compared to the level found in control cells.
For example, the knowledge that enzyme A (which produces metabolite X) is expressed at a lower level in a diseased cell and that enzyme B (which consumes metabolite X) is expressed at a higher rate in the diseased cell could lead one to predict that the level of metabolite X found in the diseased cell would be lower than the level in a normal, non-diseased cell. This prediction could indicate that metabolite X is a potential agent for therapeutic treatment. In this case, where a metabolite is predicted to exist at lower levels in a diseased cell, the metabolite itself could be supplemented to raise the physiological levels of the metabolite up to a normal range. Conversely, where a metabolite is predicted to exist at higher levels in a diseased cell, the metabolite could be a target for other therapies that lower the levels of the metabolite (e.g., activation or inhibition of key enzymes). In either case, the system described here can be used to identify metabolites, from the larger group of known physiological metabolites, which could be further evaluated, by other techniques, as agents and/or targets for therapeutic treatments.
To determine gene products that are expressed at different levels in diseased and control cells, gene expression studies (using methods known to individuals skilled in the art) can be performed on diseased and control cells. Based on the results of the expression studies, each gene can be classified into one of four possible groups: Gup, indicating that the gene is up-regulated in diseased cells relative to control cells; Gdown, indicating that the gene is down-regulated in diseased cells relative to control cells; Gsimilar, indicating that the levels in both diseased and control cells were statistically indistinguishable; and Gnone, indicating that the gene was not expressed in either of the control or diseased cells. Exemplary information that can be used to classify genes includes data (e.g., signal intensities, presence calls, and the like) obtained through DNA microarray technology, serial analysis of gene expression (SAGE) technology, PCR based technologies, and the like.
Referring now to
In some embodiments, the gene-expression data obtained from studies of the diseased and control cells can be utilized, in operation 220, to assign an “on” or “off” status to each gene's set of expression data. This status can be assigned to every gene in each of the diseased and normal cells. In this way, each gene will have a status for the diseased and the non-diseased states. For example, the mean fraction of presence calls generated by the Affymetrix MICROARRAY SUITE 5.0 software can be used to assign a status of “on” or “off” to each gene in each expression study. In some embodiments, for genes where the mean fraction of presence calls labeled as “marginal” or “absent” in the corresponding probe sets is at least 80%, an “off” status is provisionally assigned to the gene; otherwise, an “on” status is assigned to the gene. This process is repeated until all genes have a provisional assignment of “on” or “off” for both of the studied conditions (e.g., control cells and diseased cells).
For example, gene A, whose expression levels were measured in both the study of the control cells and diseased cells, can be assigned a status for each state, where the status of the gene A in the non-diseased state is independent of the status of gene A in the diseased state, and vice versa. In other words, gene A in the diseased state can be assigned a status of “on” based on the results of the expression study of the diseased cells, while gene A in the non-diseased state can be assigned a status of “off” based on the results of the expression study of the control cells.
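The provisional "on"/"off" assignment of operation 220 can be sketched as follows. The sketch assumes that presence calls are encoded as "P" (present), "M" (marginal), and "A" (absent), and that the "mean fraction" is the average, over a gene's probe sets, of the per-probe-set fraction of marginal or absent calls; both are assumptions about details the text leaves open. The function name `on_off_status` is hypothetical.

```python
from typing import List, Sequence


def on_off_status(calls_by_probe_set: Sequence[Sequence[str]],
                  threshold: float = 0.8) -> str:
    """Return "off" if the mean fraction of marginal/absent calls across the
    gene's probe sets is at least `threshold`, otherwise "on"."""
    fractions: List[float] = []
    for calls in calls_by_probe_set:
        not_present = sum(1 for c in calls if c in ("M", "A"))
        fractions.append(not_present / len(calls))
    mean_fraction = sum(fractions) / len(fractions)
    return "off" if mean_fraction >= threshold else "on"


# Gene A is scored independently in each condition (illustrative calls only).
gene_a_control = [["A", "A", "A", "A", "M"], ["A", "A", "P", "A", "A"]]
gene_a_diseased = [["P", "P", "A"], ["P", "P", "P"]]

status_control = on_off_status(gene_a_control)
status_diseased = on_off_status(gene_a_diseased)
```

With these illustrative calls, gene A is "off" in the control cells and "on" in the diseased cells, matching the independence of the two assignments described above.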
In operation 230, for all genes that have been assigned either an “on” or “off” status for both the control and the diseased states, each gene can be initially assigned an expression status of Gup, Gdown, Gsimilar, or Gnone, based on the previously assigned statuses of the diseased and non-diseased states. A gene is assigned a Gup expression status, indicating that the gene is up-regulated in diseased cells relative to control cells, if the status of the gene in the control cells is “off” and the status of the gene in the diseased cells is “on”. A gene is assigned a Gdown expression status, indicating that the gene is down-regulated in diseased cells relative to control cells, if the status of the gene in the control cells is “on” and the status of the gene in the diseased cells is “off”. A gene is assigned a Gsimilar expression status, indicating that the levels of the gene in both diseased and control cells were statistically indistinguishable, if the status of the gene in control cells is “on” and the status of the gene in the diseased cells is “on”. A gene is assigned a Gnone expression status if the status of the gene in the control cells and the diseased cells is “off”.
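The initial assignment of operation 230 amounts to a four-entry lookup on the pair (control status, diseased status). A minimal sketch, with the table and function names chosen here for illustration:

```python
# (status in control cells, status in diseased cells) -> initial expression status
INITIAL_STATUS = {
    ("off", "on"): "Gup",        # expressed only in diseased cells
    ("on", "off"): "Gdown",      # expressed only in control cells
    ("on", "on"): "Gsimilar",    # expressed in both
    ("off", "off"): "Gnone",     # expressed in neither
}


def initial_status(control: str, diseased: str) -> str:
    """Look up the initial expression status for one gene."""
    return INITIAL_STATUS[(control, diseased)]


# Gene A from the earlier example: "off" in control cells, "on" in diseased cells.
gene_a_status = initial_status("off", "on")
```

Genes that land in Gsimilar or Gnone are the ones that the later differential-expression test may re-assign.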
In operation 240, additional tests can be applied to each of the genes with either a Gsimilar or Gnone expression status for the purpose of potentially re-assigning their status. For example, differential expression (e.g., differences between the expression levels of the genes in control cells and the diseased cells, as measured during the expression studies) can be used to re-assign the expression status of genes that were previously assigned Gsimilar or Gnone expression statuses. For genes classified as either Gsimilar or Gnone, if the signal intensities in the diseased and control samples exhibit a statistically significant difference (e.g., in at least 40% of the corresponding probe sets, as evaluated by an ANOVA two-tailed test with P<0.005), the genes can be re-assigned the expression status of Gup or Gdown, depending on whether the gene is up-regulated in the diseased sample or down-regulated in the diseased sample, respectively. The expression statuses of the genes can be used later by the metabolomics-based system to predict the levels of metabolites in diseased cells compared to the levels in control cells. In alternate embodiments, each gene can be initially assigned an expression status (as in operation 230) and further re-assigned a new status (as in operation 240) before assigning a status to additional genes. While some exemplary criteria used to assign an expression status were described here, it remains within the scope of the method to utilize other criteria, in addition or in the alternative to those described here, to assign one or more expression statuses to genes. For example, different statistical tests, at different confidence levels, can be utilized to assign one of more or fewer than four expression statuses. In another example, genes may be annotated with quantitative information indicative of differential expression.
A gene could be annotated with information that includes the percentage change between the non-diseased and diseased states of the cell (e.g., the gene is expressed at a 47% higher rate in the diseased cells than in the control cells, the gene is expressed at a 37% lower rate in the diseased cells than in the control cells, or the like). In yet another example, genes that are assigned an expression status can also be assigned confidence information (e.g., the gene is expressed at a higher rate in the diseased cells than in the control cells at a 58% confidence level, or the like).
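The re-assignment test of operation 240 can be sketched as below. The sketch assumes the per-probe-set P values have already been computed elsewhere (for example by an ANOVA), and it assumes that the direction of regulation is taken from the sign of the summed mean intensity differences over the significant probe sets; the text does not specify how direction is decided, so that choice and the name `reassign_status` are hypothetical.

```python
from typing import Sequence, Tuple


def reassign_status(status: str,
                    probe_results: Sequence[Tuple[float, float]],
                    alpha: float = 0.005,
                    min_fraction: float = 0.4) -> str:
    """Possibly re-assign a Gsimilar or Gnone gene to Gup or Gdown.

    probe_results holds one (p_value, diseased_mean - control_mean) pair per
    probe set of the gene.
    """
    if status not in ("Gsimilar", "Gnone") or not probe_results:
        return status
    significant = [diff for p, diff in probe_results if p < alpha]
    if len(significant) / len(probe_results) < min_fraction:
        return status                      # too few significant probe sets
    net = sum(significant)                 # assumed direction rule
    if net > 0:
        return "Gup"
    if net < 0:
        return "Gdown"
    return status


# Three of five probe sets are significant and higher in diseased cells.
results = [(0.001, 2.0), (0.002, 1.5), (0.004, 0.8), (0.30, -0.1), (0.90, 0.0)]
new_status = reassign_status("Gsimilar", results)
```

The `alpha` and `min_fraction` defaults mirror the exemplary P<0.005 and 40% criteria, and both can be changed to reflect other statistical tests or confidence levels.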
In some embodiments, information determined about genes (e.g., which status of Gup, Gdown, Gsimilar, and Gnone the genes are assigned) is used to estimate the potential effects of the differential expression, if any, on the endogenous and/or intracellular levels of metabolites. To do so, connections can be determined between gene products and metabolites. One such source of data connecting gene products and metabolites is information about metabolic pathways. Information regarding human metabolic pathways is available, for example, from existing databases, in the form of pathway maps. The pathway maps can be available as graphical images and also as markup language files that facilitate the parsing of relevant biological data. The biochemical reactions, including, for example, information about substrates, products, direction/reversibility, and associated enzyme-coding genes, can be extracted from the metabolic pathway maps and organized in such a way as to assist in predicting how differential gene expression affects endogenous and/or intracellular metabolite levels.
In some embodiments, such as the one described herein, the markup language files can be retrieved from a database, and necessary information extracted from these files when it is needed to estimate the potential effects of the differential expression on the endogenous and/or intracellular levels of metabolites. In other embodiments, this retrieval and extraction of data can be done at an earlier time and the results of this retrieval and extraction can be used for more than one set of predictions. Put another way, the files can be downloaded and the data can be extracted one or more times (e.g., weekly, monthly, on an on-demand basis, or the like), stored, and retrieved for later use by the metabolomics-based system to identify potential therapeutic agents and/or targets. However obtained, this data can be combined with gene-expression data from diseased and control cells to construct a genetic-metabolic matrix (e.g., during operation 140), an example of which is depicted in
In some examples, particular metabolites are excluded from the genetic-metabolic matrix. Reasons to exclude a metabolite from the matrix can include, for example, that the metabolite is non-physiological, that the metabolite is ubiquitous, or that the metabolite participates in reactions that are mainly catalyzed by orphan human enzymes (well defined enzyme activities for which no sequence is known). Exemplary non-physiological metabolites (e.g., ecgonine and parathion) can include metabolites that only participate in reactions pertaining to the biosynthesis of secondary metabolites, the biodegradation and metabolism of xenobiotics, and the like. Ubiquitous metabolites (e.g., H2O, ATP, NAD(+)(P), O2, or the like) often carry out generic roles in many reactions and can be defined as those that are involved as substrate or product in twenty (20) or more reactions. Referring to the third exclusion category previously mentioned (the metabolite participates in reactions that are mainly catalyzed by an orphan human enzyme), the number of reactions where a metabolite m acts as a substrate or product in human metabolic pathways can be defined as Nrm,human and the number of reactions where the metabolite m acts as a substrate or product in reference (e.g., non organism specific) metabolic pathways can be defined as Nrm,ref. If Nrm,human/Nrm,ref<0.5, then the metabolite m can belong to the third exclusion category (e.g., the metabolite participates in reactions that are mainly catalyzed by orphan human enzymes). The metabolites determined to be part of the third exclusionary category may be excluded because the reactions are due to orphan enzymes, the reactions only occur in other organisms, or the reactions occur in humans but have not yet been detected. 
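The three exclusion categories can be sketched as a single filter. The sketch assumes that the non-physiological flag is supplied from elsewhere, and that the count used for the ubiquity test is the number of reactions in the reference pathways; the text does not say which count applies, so that choice, like the function name `exclusion_reason`, is an assumption.

```python
from typing import Optional


def exclusion_reason(n_human: int,
                     n_ref: int,
                     non_physiological: bool = False,
                     ubiquity_cutoff: int = 20,
                     orphan_ratio: float = 0.5) -> Optional[str]:
    """Return why a metabolite is left out of the matrix, or None to keep it.

    n_human: reactions using the metabolite in human metabolic pathways
    n_ref:   reactions using the metabolite in reference pathways
    """
    if non_physiological:
        return "non-physiological"
    if n_ref >= ubiquity_cutoff:           # assumed: ubiquity judged on n_ref
        return "ubiquitous"
    if n_ref > 0 and n_human / n_ref < orphan_ratio:
        return "mainly orphan human enzymes"
    return None


# A metabolite used by four reference reactions, one of them in human pathways.
reason = exclusion_reason(n_human=1, n_ref=4)
```

A ratio of 1/4 falls below the 0.5 cutoff, so this illustrative metabolite falls into the third exclusion category.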
For example, the metabolite 1-alkyl-sn-glycero-3-phosphate is excluded because out of four enzymes that use it as substrate or product, two, EC 2.3.1.105 and EC 1.1.1.101, are orphans in human, and one, EC 2.7.1.93, has only been found in rabbits. The metabolomics-based system can use the methods described herein (e.g., during operation 150) to generate a matrix such as the one depicted in
In some embodiments, the metabolomics-based system can utilize information indicative of relationships between metabolites and gene products together with gene-expression data to predict the levels of metabolites in diseased cells relative to control cells. For example, based on information contained in a genetic-metabolic matrix annotated with differential gene-expression data, the system can predict which metabolites are expected to exist at higher levels in diseased cells, which metabolites are expected to exist at lower levels in diseased cells, and which metabolites have levels that cannot be predicted relative to control cells. Based on the rules applied, these predictions can also include a confidence level indicating the degree of confidence associated with the prediction. In this way, metabolites that are predicted to exist at different levels in diseased cells, relative to control cells, can be ranked by the confidence level of the prediction, such that future testing of the metabolites as therapeutic agents and/or targets can be prioritized accordingly.
Referring to
Referring to the embodiment depicted by
In another embodiment, depicted by
Referring now to
In some embodiments, the process 500 can perform operation 530 and combine the information indicative of the effects of gene products on metabolic levels, obtained during operation 510, with the information obtained during operation 520 that is indicative of genes that are expressed differently in diseased cells, relative to control cells. The result of this combining can, for example, be a genetic-metabolic matrix annotated with the differential expression status data, such as the matrix depicted in
Exemplary rules, employed by the metabolomics-based system (e.g., during operation 550), for predicting the cumulative effect of differential gene expression on the metabolite levels in a cell can be based on the supposition that lower levels of enzymes catalyzing the production of a metabolite and/or higher levels of enzymes catalyzing the consumption of a metabolite each have the effect of decreasing the level of metabolite found in the cell. Conversely, higher levels of enzymes catalyzing the production of a metabolite and/or lower levels of enzymes catalyzing the consumption of a metabolite each have the predicted effect of increasing the level of metabolite found in the cell. The same can be true for gene products other than enzymes, such as small molecule transporters. Increased levels of transporters that move metabolites out of the intracellular environment tend to decrease intracellular level of these metabolites, while increased levels of transporters that move metabolites into the intracellular environment tend to increase the intracellular levels. Decreasing the latter transporters would have the opposite effect.
In some embodiments, the greater the number and/or percentage of gene products that have similar effects on the level of the metabolite, the greater the confidence in the prediction. For example, assume that metabolite A is produced by four enzymes, all of which show decreased expression in diseased cells, and is consumed by three enzymes, all of which show increased expression in diseased cells. Also assume that metabolite B is produced by four enzymes, three of which show decreased expression and one of which shows normal expression in diseased cells, and is consumed by three enzymes, all of which show increased expression in diseased cells. Since all seven enzymes (100%) related to metabolite A have the effect of decreasing the level of metabolite A (e.g., there are fewer enzymes that produce it and more that consume it), the confidence level can be high that metabolite A is present at lower quantities in the diseased cells. Regarding metabolite B, 86% (6 out of 7) of the considered gene products have the effect of decreasing the level of metabolite B. In this example, it may still be predicted that metabolite B is found at lower levels in the diseased cells, but the confidence in that prediction may be lower.
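By way of non-limiting illustration, the percentage used in the metabolite A and metabolite B example can be computed as sketched here. The function name and the status labels ("up", "down", "similar") are hypothetical and chosen only for this illustration; the sketch computes the fraction of gene products pushing the metabolite level downward and is not the confidence measure of any particular embodiment.

```python
def decreasing_fraction(producer_status, consumer_status):
    """Fraction of considered gene products whose expression change
    tends to lower the metabolite level (illustrative helper)."""
    # A producer that is down-regulated, or a consumer that is
    # up-regulated, is counted as lowering the metabolite level.
    lowering = sum(s == "down" for s in producer_status)
    lowering += sum(s == "up" for s in consumer_status)
    total = len(producer_status) + len(consumer_status)
    return lowering / total


# Metabolite A: four producers down, three consumers up.
fraction_a = decreasing_fraction(["down"] * 4, ["up"] * 3)
# Metabolite B: three producers down, one unchanged, three consumers up.
fraction_b = decreasing_fraction(["down"] * 3 + ["similar"], ["up"] * 3)

print(f"A: {fraction_a:.0%}  B: {fraction_b:.0%}")  # A: 100%  B: 86%
```

Both metabolites would still be predicted as lowered in this sketch; the fraction could then serve as one possible ranking key when prioritizing metabolites for testing.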
In some embodiments, the metabolomics-based system can perform an operation, such as the operation 550 described in connection with
Referring again to
Conversely, a metabolite can be included in a group Mdown (e.g., predicted to have decreased levels in diseased cells) when both of the following two tests are true. First, there is at least one gene encoding for a gene product able to decrease the intracellular level of the metabolite whose expression status is Gup or Gsimilar, there is no gene encoding for a gene product able to decrease the intracellular level of the metabolite whose expression status is Gdown (down-regulated in diseased cells), and there is no gene encoding for a gene product able to increase the intracellular level of the metabolite whose expression status is Gup (up-regulated in diseased cells) or Gsimilar (significantly expressed at similar levels in diseased and control cells). Second, either or both of the following apply. There is at least one gene encoding for a gene product able to increase the intracellular level of the metabolite whose expression status is Gdown (down-regulated in diseased cells) and/or there is at least one gene encoding for a gene product able to decrease the intracellular level of the metabolite whose expression status is Gup (up-regulated in diseased cells).
Referring again to
All remaining considered metabolites, which are not assigned a status of Mup or Mdown, can be included in group Munknown, indicating that there is currently no prediction as to whether the level of the metabolite in the cell is increased or decreased in diseased cells, relative to control cells. In this way, the methodology attempts to consider, as much as is practical, the entire proteome complement of enzymes that produce and consume a metabolite.
In some embodiments, the metabolites included in the groups Mup and Mdown can be further screened for use in therapeutic treatments. For example, supplementation of a particular metabolite (e.g., one determined to be included in group Mdown) to raise the intracellular level to a normal physiological level may be of therapeutic value. For certain compounds that are lowered in cancer cells, restoration to levels closer to normal could be achieved by directly administering the deficient metabolite. On the other hand, for metabolites whose levels are increased in cancer cells, reversion to normal levels could involve activation or inhibition of key enzymes. In either case, the approach described herein can identify likely agents and/or targets. In some embodiments, the gene-expression data, the relationships between gene-products and metabolites, the genetic-metabolic matrices, the expression status of one or more genes, and/or metabolites that have potential as agents and/or targets can be stored in electronic form on a computer-readable medium for use with a computer. Additionally, the metabolomics-based methods for identifying potential agents and/or targets for further research can be performed on one or more computers, as depicted in
Referring now to
Working memory 620 can store an operating system 622, one or more genetic-metabolic matrices 624, and/or one or more metabolites 625 that may be potential agents and/or targets for therapeutic treatment. The computer system 600 can also include a graphical user interface 626 and instructions for processing machine readable data including one or more protein function inference tools 628, one or more gene-expression data analysis tools 630, one or more genetic-metabolic matrix tools 632, one or more metabolite prediction tools 634, and one or more file format interconverters 636.
The computer system 600 may be any of the varieties of laptop or desktop personal computer, or workstation, or a networked or mainframe computer or super-computer, which would be available to one of ordinary skill in the art. For example, computer system 600 may be an IBM-compatible personal computer, a Silicon Graphics, Hewlett-Packard, Fujitsu, NEC, Sun or DEC workstation, or may be a supercomputer of the type formerly popular in academic computing environments. Computer system 600 may also support multiple processors as, for example, in a Silicon Graphics “Origin” system, or a cluster of connected processors.
The operating system 622 may be any suitable variety that runs on any of computer systems 600. For example, in one embodiment, operating system 622 is selected from the UNIX family of operating systems, for example, Ultrix from DEC, AIX from IBM, or IRIX from Silicon Graphics. It may also be a LINUX operating system. In other embodiments, operating system 622 may be a VAX VMS system. In still other embodiments, the operating system 622 can be a DOS operating system or a Windows operating system, such as Windows 3.1, Windows NT, Windows 95, Windows 98, Windows 2000, Windows XP, or Windows Vista. In yet other embodiments, operating system 622 is a Macintosh operating system such as MacOS 7.5.x, MacOS 8.0, MacOS 8.1, MacOS 8.5, MacOS 8.6, MacOS 9.x and MacOS X.
The graphical user interface (“GUI”) 626 is preferably used for displaying genetic-metabolic matrices (e.g., the genetic-metabolic matrix 624), and/or listing metabolites that are potential agents and/or targets for therapeutic treatments, on user interface 606. User-interface 606 may comprise input and output devices such as a keyboard, mouse, touch-screen, display screen, trackpad, scanner, printer, or projector.
The network interface 608 may optionally be used to access one or more metabolic databases and/or sets of gene-expression data stored in the memory of one or more other computers. One or more aspects of the metabolomics-based methods described herein may be carried out with commercially available programs which run on, or with computer programs that are developed specially for the purpose and implemented on, computer system 600. Exemplary commercially available programs can include spreadsheet software (e.g., Excel), pathway analysis software (e.g., Ingenuity, Spotfire, or the like), and microarray data processing software (e.g., dChip). Alternatively, the metabolomics-based methods may be performed with one or more stand-alone programs each of which carries out one or more operations of the metabolomics-based system.
In this example, it is shown that the changes in concentration of some metabolites that occur in cancer cells could have an active role in the progress of the disease rather than being a side effect of it. The reversion to a metabolic phenotype more similar to the normal state was explored to determine the possible therapeutic value. For certain compounds that are lowered in cancer cells, restoration to levels closer to normal can be achieved by directly administering the deficient metabolite. On the other hand, for metabolites whose levels are increased in cancer cells, reversion could involve, for example, activation or inhibition of key enzymes, an approach that is more difficult to implement. For that reason, it was decided to focus on the former case. It would be ideal to compare the actual intracellular levels of every human metabolite in normal and diseased states to identify those that are lowered in cancer cells. However, direct large-scale biochemical assays are currently unfeasible. Metabolite profiling based on NMR or mass spectrometry techniques, although very powerful, requires costly instruments and is not free of problems and limitations. In silico methods based on linking enzymes to upregulated microarray-detected transcripts and mapping to metabolic pathways have been applied to the qualitative reconstruction of the metabolome of cancer cells, and some predictions have been successfully validated by biochemical experiments. Here, the metabolomics-based method was implemented using CoMet, a fully automated and general computational metabolomics approach to predict the human metabolites whose intracellular levels are more likely to be altered in cancer cells, based on methods described herein. CoMet is further described in: A. K. Arakaki, R. Mezencev, N. Bowen, Y. Huang, J. McDonald and J. Skolnick, “Identification of metabolites with anticancer properties by Computational Metabolomics” Molecular Cancer, 2008:7: 57, incorporated herein by reference.
The metabolites predicted to be lowered in cancer compared to normal cells were prioritized as potential anticancer agents. The methodology was applied to a leukemia cell line, and several human metabolites were discovered that, either alone or in combination, exhibited various degrees of antiproliferative activity.
Human T-acute lymphoblastic leukemia Jurkat cells procured from ATCC were grown in RPMI-1640 medium (Mediatech) supplemented with 10% FBS (Gibco), 2 mmol/L L-glutamine (Mediatech), 100 IU/mL penicillin, 100 μg/mL streptomycin, and 0.25 μg/mL amphotericin B (all from Mediatech) at 37° C. in an atmosphere of 5% CO2, 95% air, and 80% relative humidity. The Jurkat cells were allowed to reach 600,000 cells per mL of suspension culture, and about 10^6 cells from two biological replicates were used for the isolation of total cellular RNA.
Total RNA was extracted from cell lines using Trizol (Invitrogen). RNA quality was verified on the Bioanalyzer RNA Pico Chip (Agilent Technologies). Total RNA from the above extractions was processed using the RiboAmp OA or HS kit (Arcturus) in conjunction with the IVT Labeling Kit from Affymetrix, to produce an amplified, biotin-labeled mRNA suitable for hybridizing to GeneChip Probe Arrays (Affymetrix). Labeled mRNA was hybridized to GeneChip Human Genome U133 Plus 2.0 Arrays in the GeneChip Hybridization Oven 640, further processed with the GeneChip Fluidics Station 450, and scanned with the GeneChip Scanner. Affymetrix .CEL files were processed using the Affymetrix Expression Console (EC) Software Version 1.1. Files were processed using the default MAS5 3′ expression workflow, which includes scaling all probes to a target intensity (TGT) of 500. The spiked-in report controls used were AFFX-BioB, AFFX-BioC, AFFX-BioDn, and AFFX-CreX. Affymetrix .CEL files for three normal lymphoblast samples, used as a normal reference against which to compare the Jurkat cell expression data, were directly retrieved from the Gene Expression Omnibus (samples GSM113678, GSM113802, and GSM113803 of untreated GM15851 cells from the Series GSE5040).
One source of biological information was the Kyoto Encyclopedia of Genes and Genomes (KEGG) of Jul. 5, 2007. The enzyme function annotation for human genes was obtained from the KEGG GENES database, the chemical information about human metabolites from the KEGG LIGAND database, and the metabolic pathway data from the KEGG PATHWAY database. The enzyme function annotations from KEGG were complemented with high confidence predictions made by EFICAz, an approach for enzyme function inference that significantly increased annotation coverage and that is further described in: A. K. Arakaki, W. Tian, and J. Skolnick, “High accuracy multi-genome scale reannotation of enzyme function by EFICAz” BMC Genomics 2006:7: 315. For the mapping between microarray probe identifiers and Entrez GeneID identifiers, the Affymetrix HG-U133 Plus 2.0 NetAffx Annotation file of May 31, 2007 was used.
The first step in the methodology for the identification of metabolites with anticancer activity consisted of the classification of each enzyme-coding human gene into one of four possible groups: Gup (upregulated in cancer cells), Gdown (downregulated in cancer cells), Gsimilar (expressed in both normal and cancer cells at levels that are statistically indistinguishable), and Gnone (not expressed in either normal or cancer cells). Two types of data were used for the classification: the log base 2 signal intensities and the presence calls of the corresponding probe sets, as reported by the Affymetrix Microarray Suite Software 5.0 (MAS 5.0). First, an “off” status was provisionally assigned to each gene in each of the two studied conditions (normal and cancer) if the mean fraction of presence calls labeled as “marginal” or “absent” in the corresponding probe sets was at least 80%; otherwise an “on” status was assigned. Then, each gene was temporarily classified into the Gup, Gdown, Gsimilar, or Gnone group, according to its on/off status in normal and cancer conditions. Finally, genes in the temporary Gsimilar or Gnone groups were transferred to the Gup or Gdown groups if they fulfilled the following criterion for differential expression: the signal intensities in normal and cancer samples exhibited a statistically significant difference in at least 40% of the corresponding probe sets, as evaluated by a two-tailed ANOVA test with P<0.005.
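The classification just described can be sketched as follows. This is a minimal illustration only: the function names are hypothetical, the mapping from on/off status to the provisional groups (for example, "off" in normal and "on" in cancer giving Gup) is an assumption inferred from the group definitions, and the direction of a reclassified gene is assumed here to follow the sign of the mean log2 difference (cancer minus normal) over the significant probe sets. The per-probe-set P values are taken as precomputed inputs.

```python
from statistics import mean


def on_off(calls_per_probe_set, threshold=0.80):
    """Provisional status for one gene in one condition.

    calls_per_probe_set: one list of MAS 5.0 calls ('P', 'M' or 'A')
    per probe set, with one call per sample.
    """
    fractions = [
        sum(call in ("M", "A") for call in calls) / len(calls)
        for calls in calls_per_probe_set
    ]
    return "off" if mean(fractions) >= threshold else "on"


def classify_gene(normal_calls, cancer_calls, p_values, log2_diffs,
                  alpha=0.005, min_fraction=0.40):
    """Return 'Gup', 'Gdown', 'Gsimilar' or 'Gnone' for one gene."""
    normal, cancer = on_off(normal_calls), on_off(cancer_calls)
    # Assumed mapping from on/off status to the temporary group.
    if normal == "off" and cancer == "on":
        group = "Gup"
    elif normal == "on" and cancer == "off":
        group = "Gdown"
    elif normal == "on" and cancer == "on":
        group = "Gsimilar"
    else:
        group = "Gnone"
    # Transfer Gsimilar/Gnone genes when enough probe sets differ.
    if group in ("Gsimilar", "Gnone") and p_values:
        significant = [d for p, d in zip(p_values, log2_diffs) if p < alpha]
        if len(significant) / len(p_values) >= min_fraction:
            # Assumption: direction taken from the mean log2 difference.
            group = "Gup" if mean(significant) > 0 else "Gdown"
    return group


# Hypothetical gene with two probe sets, three normal and two cancer samples.
example = classify_gene(
    normal_calls=[["P", "P", "P"], ["P", "P", "A"]],
    cancer_calls=[["P", "P"], ["P", "P"]],
    p_values=[0.001, 0.2],
    log2_diffs=[1.5, 0.1],
)
print(example)  # Gup
```

In this hypothetical gene, both conditions are "on", so the gene starts in Gsimilar; one of its two probe sets (50%) is significant with a positive difference, so it is transferred to Gup.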
The second step in the methodology was an in silico estimation of the effect that the differentially expressed enzyme-encoding genes could have exerted on the intracellular levels of metabolites. First, all the human metabolic pathways were retrieved from the KEGG PATHWAY database, a compilation of maps representing the molecular interactions and reaction networks for different types of biological processes. For the biological process labeled as Metabolism there were eleven groups of pathways: 1) Carbohydrate Metabolism, 2) Energy Metabolism, 3) Lipid Metabolism, 4) Nucleotide Metabolism, 5) Amino Acid Metabolism, 6) Metabolism of Other Amino Acids, 7) Glycan Biosynthesis and Metabolism, 8) Biosynthesis of Polyketides and Nonribosomal Peptides, 9) Metabolism of Cofactors and Vitamins, 10) Biosynthesis of Secondary Metabolites, and 11) Xenobiotics Biodegradation and Metabolism. The pathway maps were available as graphical images and also as KEGG Markup Language (KGML) files that facilitate the parsing of relevant biological data. Thus, the biochemical reactions were extracted from the KGML human metabolic pathway maps, including information about substrates, products, direction/reversibility, and associated enzyme-coding genes.
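As a hedged illustration of this extraction step, the sketch here parses a small hand-written fragment modeled on the KGML layout using the Python standard library. The fragment, its identifiers, and the function name are illustrative assumptions rather than data from the files actually used, and the element and attribute names should be checked against the retrieved KGML files.

```python
import xml.etree.ElementTree as ET

# Hand-written fragment modeled on KGML; identifiers are illustrative.
KGML_FRAGMENT = """
<pathway name="path:hsa00010" org="hsa" number="00010">
  <entry id="13" name="hsa:226 hsa:229" type="gene" reaction="rn:R01070"/>
  <entry id="40" name="cpd:C05378" type="compound"/>
  <entry id="41" name="cpd:C00111" type="compound"/>
  <entry id="42" name="cpd:C00118" type="compound"/>
  <reaction name="rn:R01070" type="reversible">
    <substrate name="cpd:C05378"/>
    <product name="cpd:C00111"/>
    <product name="cpd:C00118"/>
  </reaction>
</pathway>
"""


def extract_reactions(kgml_text):
    """Return one record per reaction element: substrates, products,
    reversibility and the genes linked to the reaction."""
    root = ET.fromstring(kgml_text.strip())
    # Map each reaction identifier to the genes annotated with it.
    genes_by_reaction = {}
    for entry in root.iter("entry"):
        if entry.get("type") == "gene" and entry.get("reaction"):
            for reaction_id in entry.get("reaction").split():
                genes = genes_by_reaction.setdefault(reaction_id, set())
                genes.update(entry.get("name").split())
    records = []
    for reaction in root.iter("reaction"):
        ids = reaction.get("name").split()
        genes = set().union(*(genes_by_reaction.get(i, set()) for i in ids))
        records.append({
            "reaction": ids,
            "reversible": reaction.get("type") == "reversible",
            "substrates": [s.get("name") for s in reaction.findall("substrate")],
            "products": [p.get("name") for p in reaction.findall("product")],
            "genes": sorted(genes),
        })
    return records


reactions = extract_reactions(KGML_FRAGMENT)
print(reactions[0])
```

For a reversible reaction, each listed compound could be treated as both produced and consumed by the associated genes when the records are entered into a genetic-metabolic matrix; that treatment is a design choice of this sketch, not a statement of the method.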
This information was combined with gene-expression data from normal and cancer cells to construct a genetic-metabolic matrix that linked each of 1,477 metabolites with the specific human genes encoding for enzymes that consume and/or produce each metabolite. The differential expression status given by the four-group classification described in the previous section was stored for each gene. The following were excluded from the genetic-metabolic matrix: i) 209 non-physiological metabolites, here defined as those that only participate in reactions that belong to the “Biosynthesis of Secondary Metabolites” and the “Xenobiotics Biodegradation and Metabolism” groups of metabolic pathways, e.g., ecgonine or parathion; ii) 197 metabolites that are considered ubiquitous and often carry out generic roles in many reactions, here defined as those that are involved as substrate or product in ten or more reactions, e.g., H2O, ATP, NAD(+)(P) or O2; and iii) 289 metabolites that participate in reactions that are mainly catalyzed by orphan human enzymes. To determine metabolites belonging to the third category, the number of reactions where a metabolite m acts as substrate or product in human metabolic pathways was defined as Nrm,human, and the corresponding number in reference (non-organism-specific) metabolic pathways was defined as Nrm,ref. If Nrm,human/Nrm,ref<0.5, then the metabolite m was included in the third exclusion category. The absent reactions in human pathways may be due to orphan enzymes, reactions that only occur in other organisms, or reactions that may occur in humans but have not yet been detected. For example, the metabolite 1-alkyl-sn-glycero-3-phosphate was excluded because, out of four enzymes that use it as substrate or product, two (EC 2.3.1.105 and EC 1.1.1.101) are orphans in human, and one (EC 2.7.1.93) has only been found in rabbit. The total number of metabolites remaining in the genetic-metabolic matrix after the three types of exclusions was 982.
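The three exclusion tests can be sketched as a single filter. The function name, the argument names, and the order in which the tests are applied are assumptions made for illustration; the cutoffs (ten reactions, ratio below 0.5) are those stated in this example. The pathway group assigned to the sample metabolite is illustrative, and each of its four enzymes is counted as one reaction, which is a simplification.

```python
SECONDARY_GROUPS = {
    "Biosynthesis of Secondary Metabolites",
    "Xenobiotics Biodegradation and Metabolism",
}


def exclusion_reason(pathway_groups, n_reactions, n_human, n_ref,
                     ubiquitous_cutoff=10, orphan_ratio=0.5):
    """Return the exclusion category for a metabolite, or None to keep it.

    pathway_groups: pathway groups containing the metabolite's reactions.
    n_reactions: reactions in which it is a substrate or product.
    n_human, n_ref: reaction counts in human and reference pathways.
    """
    groups = set(pathway_groups)
    if groups and groups <= SECONDARY_GROUPS:
        return "non-physiological"
    if n_reactions >= ubiquitous_cutoff:
        return "ubiquitous"
    if n_ref > 0 and n_human / n_ref < orphan_ratio:
        return "orphan-enzyme"
    return None


# 1-alkyl-sn-glycero-3-phosphate: one of four reference reactions
# remains in human pathways, giving a ratio of 0.25.
print(exclusion_reason({"Lipid Metabolism"}, 4, 1, 4))  # orphan-enzyme
```

Because a metabolite may satisfy more than one test, the order of the checks affects only the label reported, not whether the metabolite is excluded.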
In this example, a set of rules was used to scan the genetic-metabolic matrix for metabolites whose intracellular levels in cancer cells are likely to differ from those in normal cells. The rules were based on the supposition that lower levels of enzymes catalyzing the production of a metabolite and transporters moving the metabolite into the intracellular space (and/or higher levels of enzymes catalyzing the consumption of the metabolite and transporters moving the metabolite out of the intracellular space) imply a decreased level of such metabolite, and vice versa (see
In the methodology, a given metabolite was predicted to have decreased levels in cancer cells when: 1) both of the following applied: 1.1) there was no gene encoding for an enzyme able to catalyze the production of the metabolite whose differential expression status was Gup (upregulated in cancer cells) or Gsimilar (significantly expressed at similar levels in normal and cancer cells) and 1.2) there was no gene encoding for an enzyme able to catalyze the consumption of the metabolite whose differential expression status was Gdown (downregulated in cancer cells), and 2) either or both of the following applied: 2.1) there was at least one gene encoding for an enzyme able to catalyze the production of the metabolite whose differential expression status was Gdown (downregulated in cancer cells) and 2.2) there was at least one gene encoding for an enzyme able to catalyze the consumption of the metabolite whose differential expression status was Gup (upregulated in cancer cells). Similarly, a metabolite was predicted to have increased levels in cancer cells when: 1) both of the following applied: 1.1) there was no gene encoding for an enzyme able to catalyze the consumption of the metabolite whose differential expression status was Gup or Gsimilar and 1.2) there was no gene encoding for an enzyme able to catalyze the production of the metabolite whose differential expression status was Gdown, and 2) either or both of the following applied: 2.1) there was at least one gene encoding for an enzyme able to catalyze the consumption of the metabolite whose differential expression status was Gdown and 2.2) there was at least one gene encoding for an enzyme able to catalyze the production of the metabolite whose differential expression status was Gup.
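The two rule sets above can be written compactly as a single function over the expression statuses of the producing and consuming genes. This is a sketch under stated assumptions: the function name and return labels are hypothetical, genes in the Gnone group are simply ignored, and transporters are not modeled.

```python
def predict_level(producer_status, consumer_status):
    """Predict the level of one metabolite in cancer cells.

    producer_status / consumer_status: expression groups ('Gup', 'Gdown',
    'Gsimilar', 'Gnone') of the genes encoding enzymes that produce or
    consume the metabolite.
    """
    producers, consumers = set(producer_status), set(consumer_status)

    lowered = (
        not (producers & {"Gup", "Gsimilar"})          # rule 1.1
        and "Gdown" not in consumers                   # rule 1.2
        and ("Gdown" in producers or "Gup" in consumers)  # rules 2.1 / 2.2
    )
    increased = (
        not (consumers & {"Gup", "Gsimilar"})
        and "Gdown" not in producers
        and ("Gdown" in consumers or "Gup" in producers)
    )
    if lowered:
        return "lowered"
    if increased:
        return "increased"
    return "unknown"


print(predict_level(["Gdown", "Gnone"], ["Gup"]))     # lowered
print(predict_level(["Gup"], ["Gdown"]))              # increased
print(predict_level(["Gdown", "Gsimilar"], ["Gup"]))  # unknown
```

The two conditions cannot both hold for the same metabolite, so the order in which they are checked does not change the result.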
The in silico metabolomics methods described herein were used to compare two Jurkat cell samples to three normal GM15851 lymphoblast cell samples, which resulted in 104 metabolites predicted to be lowered in the cancer cells (TABLE 1) and 78 metabolites predicted to be increased in the cancer cells (TABLE 2), out of 982 metabolites considered in the analysis (TABLE 4). A search of the literature for experimental evidence identified that 13 of the 982 analyzed metabolites exhibit anticancer activity in Jurkat cells. TABLE 3 shows that 2 of the 13 metabolites were predicted to be lowered in Jurkat cells: thymidine, an antineoplastic agent, and prostaglandin D2, which induces apoptosis without inhibiting the viability of normal T lymphocytes. Only 1 of the 13 proven anticancer agents in Jurkat cells belonged to the group of 78 metabolites predicted to be increased in these cancer cells: the apoptotic agent 2-methoxy-estradiol-17β. The remaining 10 known anticancer molecules active in Jurkat cells (testosterone, melatonin, sphingolipid GD3, 2′-deoxyguanosine, 2′-deoxyadenosine, 2′-deoxyinosine, nicotinamide, methylglyoxal, linoleic acid, and cAMP) were included in the set of 800 metabolites whose intracellular levels were predicted to be essentially the same in both Jurkat and normal cells. The fraction of metabolites with known anticancer activity among the compounds predicted to be lowered in Jurkat cells (2 of 104 or 0.019) is higher than that corresponding to the rest of the compounds [11 non-predicted ones have literature-validated anticancer properties; (1+10)/(78+800)=0.013]. However, the significance of this difference cannot be assessed with adequate statistical power due to the small size of the sample. Another complication is the fact that negative results tend to be underreported, thereby making it difficult to obtain unbiased statistics about metabolites that lack anticancer properties.
The ligand descriptors in the third column of Table 2 include generic descriptors that refer to classes of molecules, e.g., a peptide. Many of the most general descriptors are discarded from subsequent analyses.
Based on criteria such as low molecular weight, commercial availability, and affordability, nine metabolites predicted to be lowered in Jurkat cells were selected to test their effect on the proliferation of that cell line (TABLE 3). The effect of a 72 hour treatment on the growth of Jurkat cells was examined using the following metabolites (at a concentration of 100 μM): riboflavin, tryptamine, 3-sulfino-L-alanine, menaquinone, dehydroepiandrosterone (the non-sulfated version of the predicted metabolite dehydroepiandrosterone sulfate), α-hydroxystearic acid (one of the possible compounds compatible with the predicted generic metabolite α-hydroxy fatty acid), hydroxyacetone, seleno-L-methionine, and 5,6-dimethylbenzimidazole (the aglycone of the predicted metabolite α-ribazole).
Growth inhibition of Jurkat cells was evaluated by a resazurin-based in vitro toxicology assay kit (Sigma). Metabolites dehydroepiandrosterone (dehydroisoandrosterone, Acros Organics), 5,6-dimethylbenzimidazole (Aldrich), hydroxyacetone (Sigma), menaquinone (Supelco), riboflavin (Sigma) and tryptamine (Sigma) were solubilized in DMSO (Sigma); 3-sulfino-L-alanine (L-cysteinesulfinic acid, Aldrich) and seleno-L-methionine (Sigma) were solubilized in sterile deionized water, and stock solutions (40 mmol/L) were stored frozen at −80° C. prior to use. Aliquots of 100 μL of cells in phenol red free RPMI 1640 medium (Sigma) supplemented with 5% FBS, 2 mmol/L L-glutamine, 100 IU/mL penicillin, 100 μg/mL streptomycin, and 0.25 μg/mL amphotericin B were inoculated into 96-well black-walled plates at a density of 250,000 cells/mL (Jurkat) or 200,000 cells/mL (OVCAR-3) and incubated for 24 hours at 37° C. in 5% CO2, 95% air, and 80% relative humidity prior to the addition of the metabolites to be tested. Stock solutions of metabolites were diluted 200 times with complete growth medium and added to the appropriate microtiter wells in 4 replicates per metabolite, while 100 μL of complete medium was added to the control and blank wells. Following metabolite addition, the plates were incubated for an additional 72 hours, after which 20 μL of TOX-8 reagent was added to metabolite treatment, control and blank wells and incubation continued for an additional 3 hours. The increase in fluorescence was measured in a microplate fluorimeter at 590 nm using an excitation wavelength of 560 nm. The emission of control wells, after the subtraction of a blank, was taken as 100% and the results for metabolite treatments were expressed as percentage of the control. Two biological replicates for each cell line were used for cell proliferation assays.
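The arithmetic implied by this protocol can be sketched as follows. Two points are assumptions of the sketch rather than statements of the protocol: that 100 μL of diluted stock was added to each 100 μL well (mirroring the 100 μL of medium added to control wells), and that the blank was subtracted from treated wells as well as from control wells. The fluorescence readings are invented solely to exercise the function.

```python
from statistics import mean


def final_concentration_um(stock_um, dilution_factor, added_ul, well_ul):
    """Concentration in the well after adding diluted stock (in uM)."""
    diluted = stock_um / dilution_factor
    return diluted * added_ul / (added_ul + well_ul)


def percent_of_control(treated, control, blank):
    """Blank-corrected fluorescence of treated wells as % of control."""
    blank_mean = mean(blank)
    return 100.0 * (mean(treated) - blank_mean) / (mean(control) - blank_mean)


# 40 mmol/L stock = 40,000 uM; diluted 200-fold; 100 uL added to 100 uL.
concentration = final_concentration_um(40_000, 200, 100, 100)
print(concentration)  # 100.0 (uM)

# Invented readings for four replicate wells per condition.
growth = percent_of_control(
    treated=[1200, 1100, 1300, 1200],
    control=[5100, 4900, 5000, 5000],
    blank=[200, 200, 200, 200],
)
print(round(growth, 1))  # 20.8 (% of control)
```

Under these assumptions the computed concentration agrees with the 100 μM treatment concentration stated for the tested metabolites.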
Positive results were additionally verified by counting of viable cells using Vi-CELL XR cell counter (Beckman Coulter) and trypan blue dye exclusion method for Jurkat.
Although the antiproliferative activity exhibited by the nine tested metabolites predicted to be lowered in Jurkat cells strongly supports our hypothesis, the possibility still exists that most endogenous metabolites inhibit the growth of Jurkat cells, independent of the intracellular level status predicted by the metabolomics-based system described here. Therefore, we tested metabolites whose intracellular levels in Jurkat cells were predicted to be increased (bilirubin, androsterone, homovanillic acid, vanillylmandelic acid, N-acetyl-L-aspartate, and taurocholic acid) or unchanged (pantothenic acid, citric acid, folic acid, β-D-galactose, and cholesterol) compared with normal lymphoblasts. We analyzed the effect on the growth of Jurkat cells of a 72 hour treatment with each of the eleven human metabolites at a concentration of 100 μM.
While the inhibitory activity of riboflavin, tryptamine and hydroxyacetone on Jurkat cells was moderate (all above 70% growth compared to control), others like menaquinone and DHEA exhibited an important inhibitory effect (11.3% and 16.7% growth compared to the control, respectively). Only 2/11 tested metabolites predicted not to be lowered in Jurkat cells unexpectedly exhibited antiproliferative activity, while the growth inhibition exerted by each of the remaining tested metabolites was less than 10% and statistically insignificant (
If the nine novel antiproliferative compounds described herein are considered together with the two metabolites whose anticancer activity in Jurkat cells was previously known, the fraction of anticancer metabolites among the 104 compounds predicted to be lowered in Jurkat cells is considerably higher [(9+2)/104=0.106] than that corresponding to the rest of the compounds [(2+11)/878=0.015]. The positive association between lowered metabolite levels in Jurkat cells as predicted by CoMet and antiproliferative activity of the metabolite in that cell line is highly significant (Fisher's exact test two-tailed p-value=8.7×10^−6). Furthermore, when the effect of these metabolites on growth inhibition was tested in Jurkat and human lymphoblast cells cultured in identical conditions, a pattern of selectivity of the antiproliferative effect towards the cancer cell line became evident. In an extreme case, DHEA at a concentration of 50 μM inhibited the growth of Jurkat cells but stimulated the proliferation of lymphoblasts.
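The comparison of fractions corresponds to a 2×2 contingency table: 11 active and 93 other metabolites among the 104 predicted to be lowered, and 13 active and 865 other metabolites among the remaining 878. The sketch here recomputes the two fractions and evaluates a two-sided Fisher's exact test directly from the hypergeometric distribution using only the standard library. The function name is hypothetical, and the two-sided convention used (summing all tables no more probable than the observed one) is one common choice that is assumed here, since the text does not state which convention was applied.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def probability(x):
        # Hypergeometric probability of x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    lowest = max(0, col1 - row2)
    highest = min(col1, row1)
    observed = probability(a)
    # Sum every table that is no more likely than the observed one.
    return sum(
        p for p in (probability(x) for x in range(lowest, highest + 1))
        if p <= observed * (1 + 1e-9)
    )


fraction_lowered = (9 + 2) / 104
fraction_rest = (2 + 11) / 878
p_value = fisher_exact_two_sided(11, 93, 13, 865)

print(round(fraction_lowered, 3), round(fraction_rest, 3))  # 0.106 0.015
print(f"{p_value:.2e}")
```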
Since the results on Jurkat cells were encouraging, a more demanding test was performed in order to evaluate the range of applicability of the in silico metabolomics methods described herein, and the general validity of the correlation between predicted lowered concentration of a metabolite in cancer cells and its anticancer activity. A comparative analysis of the potency of drugs used in current chemotherapy tested on the National Cancer Institute cell lines revealed that leukemia cell lines are the most sensitive ones, while the most resilient cell lines originate from ovarian tissue. Therefore, the OVCAR-3 cell line was chosen to test.
A methodology similar to that of Example 1 was used to identify one or more metabolites associated with the OVCAR-3 cell line that may have potential as agents and/or targets for therapeutic treatment. The OVCAR-3 cell line is derived from malignant ascites of a patient with progressive adenocarcinoma of the ovary after failed cisplatin therapy. Gene expression data from three OVCAR-3 cell samples were obtained and compared to expression data from three human immortalized ovarian surface epithelial (IOSE) cell samples (samples GSM154124 and GSM154125 in GEO). Based on this information, CoMet predicted 132 metabolites to be lowered and 120 metabolites to be increased in OVCAR-3 cancer cells. Two of the 132 metabolites predicted to be lowered in OVCAR-3, 2-methoxyestradiol and calcitriol, and two of the 730 predicted to be unchanged, 3′,3,5-triiodo-L-thyronine and all-trans-retinoic acid, had previously been demonstrated to exhibit anticancer activity in OVCAR-3 cells.
Growth inhibition of OVCAR-3 cells was evaluated by a resazurin-based in vitro toxicology assay kit (Sigma). Metabolites dehydroepiandrosterone (dehydroisoandrosterone, Acros Organics), 5,6-dimethylbenzimidazole (Aldrich), hydroxyacetone (Sigma), menaquinone (Supelco), riboflavin (Sigma) and tryptamine (Sigma) were solubilized in DMSO (Sigma); 3-sulfino-L-alanine (L-cysteinesulfinic acid, Aldrich) and seleno-L-methionine (Sigma) were solubilized in sterile deionized water, and stock solutions (40 mmol/L) were stored frozen at −80° C. prior to use. Aliquots of 100 μL of cells in phenol red free RPMI 1640 medium (Sigma) supplemented with 5% FBS, 2 mmol/L L-glutamine, 100 IU/mL penicillin, 100 μg/mL streptomycin, and 0.25 μg/mL amphotericin B were inoculated into 96-well black-walled plates at a density of 200,000 cells/mL and incubated for 24 hours at 37° C. in 5% CO2, 95% air, and 80% relative humidity prior to the addition of the metabolites to be tested. Stock solutions of metabolites were diluted 200 times with complete growth medium and added to the appropriate microtiter wells in 4 replicates per metabolite, while 100 μL of complete medium was added to the control and blank wells. Following metabolite addition, the plates were incubated for an additional 72 hours, after which 20 μL of TOX-8 reagent was added to metabolite treatment, control and blank wells and incubation continued for an additional 2 hours. The increase in fluorescence was measured in a microplate fluorimeter at 590 nm using an excitation wavelength of 560 nm. The emission of control wells, after the subtraction of a blank, was taken as 100% and the results for metabolite treatments were expressed as percentage of the control. Two biological replicates for each cell line were used for cell proliferation assays. Positive results were additionally verified by counting of viable cells using a Vi-CELL XR cell counter (Beckman Coulter) and by an SRB-based assay for OVCAR-3 cells.
The growth inhibitory effects of some of the predicted compounds may seem relatively low, and the tested concentration of 100 μmol/L may seem too high, compared with most anticancer drugs of synthetic or natural origin. However, this concentration is not unreasonably high for metabolic compounds, since many metabolites can be found at similar levels in the cytosol and/or extracellular fluids. Also, several of the newly found antiproliferative metabolites exhibited synergistic interactions with one another, which is consistent with the systematic approach of the methods in that the prediction was performed on the entire metabolome and not on individual metabolites or pathways. This observation raises the intriguing question of what the result would be if concentrations close to those observed in normal cells could be achieved in the cancer cell for most of the metabolites, i.e., a reversion to a normal-like metabolic profile, at least for those metabolites that exhibit the ability to inhibit the growth of the cancer cell. In addition, some active metabolites might be considered as completely novel lead compounds for further drug design and development, with the advantage of a reduced initial toxicity.
The mode of action of the newly found antiproliferative metabolites has not been investigated, and it is even possible that some of them exert their effect through completely novel mechanisms; however, for most metabolites a possible mode of action can be suggested based on their effect on other cancer cells or on the known properties of closely related molecules. For example, 5,6-dichlorobenzimidazole, a bioisosteric derivative of the active metabolite 5,6-dimethylbenzimidazole, induces differentiation of malignant erythroblasts by inhibiting RNA polymerase II. The tested metabolite tryptamine is an effective inhibitor of HeLa cell growth via the competitive inhibition of tryptophanyl-tRNA synthetase and the consequent inhibition of protein biosynthesis. 9-hydroxystearic acid, an isomer of the active metabolite α-hydroxystearic acid, arrests HT29 colon cancer cells in the G0/G1 phase of the cell cycle via overexpression of p21, induces differentiation of HT29 cells by inhibition of histone deacetylase 1, and interrupts the transduction of the mitogenic signal. Menaquinone (vitamin K2), the most efficient compound among the metabolites tested in Jurkat, has been previously reported to induce G0/G1 arrest, differentiation, and apoptosis in acute myelomonocytic leukemia HL-60 cells. However, considering the great difference between acute lymphoblastic and myelomonocytic leukemias in their etiology, pathogenesis, prognosis, and treatment response, the finding of growth inhibition of Jurkat cells by menaquinone is novel and may even have a different underlying mechanism.
Several factors not accounted for in the methodology can influence the actual intracellular levels of a metabolite and constitute possible sources of error that could affect the predictions. First, the initial input to the methods comes from microarray data; however, the gene expression levels inferred from microarray experiments are subject to several sources of variation due to biological or technical causes.
Second, the analysis depends on the mapping of genes, but this mapping is imperfect because: i) errors have been detected in the gene mappings provided by the microarray manufacturer, ii) not all genes are represented in a microarray, e.g., only 14,500 human genes are represented in the Affymetrix GeneChip Human Genome U133A 3.0 Array employed herein, although the most conservative estimates indicate that there are at least 18,000 protein-coding genes in the human genome, and iii) alternatively spliced genes can generate catalytically inactive forms of an enzyme and, although tools exist to determine the relation between single probes and the intron/exon structure of a target transcript in its known variants, there is no comprehensive repository providing the catalytic activity/inactivity status of different enzyme forms generated by alternative splicing.
Third, the significant number of functionally uncharacterized gene products in fully sequenced genomes, together with the errors and omissions in current biological databases, can bias the results when microarray probes are used to infer affected biological functions. For example, the upper-bound estimate of the fraction of enzyme-coding genes in the human genome is approximately 20%; however, the fraction of human genes currently annotated as enzymes is only 16%. Moreover, it is estimated that almost 30% of the enzyme activities that have been assigned an EC number are orphans, i.e., they have been experimentally measured in an organism but are not associated with any gene or protein sequence, either in databases or in the literature.
Fourth, the levels of mRNA estimated by microarray experiments may not closely reflect the actual protein levels. Specifically, large-scale analyses have shown a weak correlation between mRNA and protein abundance, a phenomenon that has been attributed to translational regulation, differences in protein in vivo half-lives, and experimental error or noise in both protein and RNA determinations.
Fifth, the qualitative treatment of metabolic flux is a simplification; however, quantitative approaches such as flux balance analysis require knowledge of the regulatory effects of covalent modifications and the kinetic constants associated with the enzymes involved in the system under study, a wealth of information that currently is both incomplete and not accurate enough to generate large-scale models.
Sixth, similarly, the very limited information available about both the subcellular location where the metabolic conversions take place and the transport of metabolites between different intracellular or extracellular compartments prevents us from considering these factors in our methodology, although their influence on the in vivo levels of metabolites is evident. Information about transporter genes can be incorporated into the in silico metabolomics method, and algorithms to make use of it can be developed for qualitative metabolic flux predictions.
Finally, a factor that could confound the hypothetical correlation between lowered metabolites in cancer and their potential as therapeutic agents is the existence of moonlighting activities related to growth control exhibited by several metabolic enzymes.
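To make concrete what a qualitative treatment of metabolic flux can look like, and how transporter information might enter it, the following Python sketch applies a toy sign-based rule. It is an assumption made purely for illustration and is not the CoMet algorithm: the scoring rule, the function name, and the optional transporter arguments are all hypothetical.

```python
from enum import Enum


class Change(Enum):
    LOWERED = -1
    UNCHANGED = 0
    INCREASED = 1


def predict_metabolite_change(producing, consuming,
                              importers=(), exporters=()):
    """Toy qualitative rule (an assumption, not the CoMet algorithm).

    Each argument is a sequence of expression calls for the genes that
    produce, consume, import, or export the metabolite:
    +1 = up-regulated, 0 = unchanged, -1 = down-regulated.
    No kinetic constants or flux magnitudes are used, only signs.
    """
    gain = sum(producing) + sum(importers)
    loss = sum(consuming) + sum(exporters)
    score = gain - loss
    if score > 0:
        return Change.INCREASED
    if score < 0:
        return Change.LOWERED
    return Change.UNCHANGED
```

For instance, a metabolite whose only producing enzyme is down-regulated and whose only consuming enzyme is up-regulated, predict_metabolite_change([-1], [1]), is called LOWERED under this rule. Because only the direction of each expression change is considered, the rule illustrates both the appeal of the qualitative approach (it needs no kinetic data) and its limitation (opposing changes of very different magnitude cancel out).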
By applying a fully automated method for in silico metabolomics to two different cancer cell lines, nine metabolites have been discovered that, alone or in combination, exhibit significant antiproliferative activity in at least one of the two cell lines. The rationale behind the findings can be described by this premise: some metabolites that have lowered levels in a cancer cell relative to normal cells contribute to the progress of the disease. The results strongly indicate that many other metabolites with important roles in carcinogenesis can be discovered or identified by the methods described herein.
In this example, only cell proliferation assays have been performed, but it can be speculated that some metabolites may also exhibit other anticancer properties, such as antimetastatic or antiangiogenic properties, that would not be evident as inhibition of cell growth in vitro. If the antiproliferative activities observed in cancer cell lines have a therapeutic value, different combined strategies can be devised where sets of predicted metabolites are concurrently selected according to their association with the same or different metabolic pathways, i.e., a strategy can be employed where multiple drug leads target a single pathway or, conversely, where each drug lead acts specifically on a different pathway.
The ligand descriptors in the third column of Table 4 include generic descriptors that refer to classes of molecules, e.g., a peptide. Many of the most general descriptors are discarded from subsequent analyses.
The foregoing description is intended to illustrate various aspects of the instant technology. It is not intended that the examples presented herein limit the scope of the appended claims. The invention now being fully described, it will be apparent to one of ordinary skill in the art that many changes and modifications can be made thereto without departing from the spirit or scope of the appended claims.
Number | Date | Country | Kind |
---|---|---|---|
60979932 | Oct 2007 | US | national |
60980954 | Oct 2007 | US | national |
60989233 | Nov 2007 | US | national |
This application claims the benefit of priority under 35 U.S.C. §119(e) to U.S. provisional application Ser. Nos. 60/979,932, filed Oct. 15, 2007, and 60/980,954, filed Oct. 18, 2007, and 60/989,233, filed Nov. 20, 2007, all of which are incorporated herein by reference in their entirety.
Filing Document | Filing Date | Country | Kind | 371c Date |
---|---|---|---|---|
PCT/US08/80002 | 10/15/2008 | WO | 00 | 5/13/2011 |