The invention relates to methods for selecting a therapeutic indication for a pharmaceutical as well as methods of treating various diseases and disorders with a pharmaceutical.
Repositioning helps fully explore the indications of marketed drugs and clinical candidates (Ashburn, et al. Nat Rev Drug Discov 2004; 3(8):673-83); however, most successful cases of drug repositioning are based on serendipity rather than systematic analysis (Sardan, et al. Brief Bioinform 2011). In silico methodologies have helped in mining a drug's off-target effects (Xie, et al. PLoS Comput Biol 2007; 3(11):e2173-7; Campillos, et al. Science 2008; 321(5886):263-6; Keiser, et al. Nature 2009; 462(7270):175-81; Yang, et al. PLoS Comput Biol 2009; 5(7):e1000441; and Luo, et al. Nucleic Acids Res 2011), off-system effects (such as off-target related gene expression perturbation) (Suthram, et al. PLoS Comput Biol 2010; 6(2):e1000662; Yang, et al. PLoS Comput Biol 2011; 7(3):e1002016; Iorio, et al. Proc Natl Acad Sci USA 2010; 107(33):14621-6; and Hu, et al. PLoS One 2009; 4(8):e6536) and off-phenotypes (i.e., adverse drug reactions (Pouliot, et al. Clin Pharmacol Ther 2011; and Tatonetti, et al. Clin Pharmacol Ther 2011) or new indications), providing new hypotheses for in vitro assays (MacDonald, et al. Nat Chem Biol 2006; 2(6):329-37), animal model testing, or clinical trials to reposition the drug. The above strategies primarily focus on using preclinical information. Clinical therapeutic effects, however, are not always consistent with preclinical outcomes (Buchan, et al. Drug Discov Today 2011).
Thus, methods are needed for identifying new or unsuspected indications for existing pharmaceuticals. Additionally, methods are needed for validating or invalidating a first therapeutic indication of a pharmaceutical as well as for selecting at least one therapeutic agent for treatment or prevention of a disease and/or disorder.
In one embodiment, methods are provided for selecting a new therapeutic indication for at least one first pharmaceutical comprising the steps of:
As used herein “pharmaceutical” means any active ingredient capable of treating or preventing at least one disease, trait and/or phenotype. The pharmaceutical compositions of the invention are prepared using techniques and methods known to those skilled in the art. Some of the methods commonly used in the art are described in Remington's Pharmaceutical Sciences (Mack Publishing Company).
As used herein “druggable” means a characteristic that allows a compound or composition to be developed into a drug. For example, a druggable compound or composition could have at least one of the following characteristics: capable of being formulated for administration to a mammal, capable of reaching its target once administered to a mammal, and/or capable of affecting at least one target. Similarly, the term “biopharmable” refers to a large molecule such as, but not limited to, proteins, antibodies, antibody fragments, domain antibodies, single chain antibodies, bispecific antibodies, and any combination or variations thereof, aptamers, fusion proteins, synthetic polypeptides, recombinant polypeptides, vaccines, DNA therapies, and/or RNAi, that can be administered to a mammal.
By the term “treating” and grammatical variations thereof as used herein, is meant therapeutic therapy. In reference to a particular condition, treating means: (1) to ameliorate or prevent the condition of one or more of the biological manifestations of the condition, (2) to interfere with (a) one or more points in the biological cascade that leads to or is responsible for the condition or (b) one or more of the biological manifestations of the condition, (3) to alleviate one or more of the symptoms, effects or side effects associated with the condition or treatment thereof, or (4) to slow the progression of the condition or one or more of the biological manifestations of the condition. Prophylactic therapy is also contemplated thereby. The skilled artisan will appreciate that “prevention” is not an absolute term. In medicine, “prevention” is understood to refer to the prophylactic administration of a drug to substantially diminish the likelihood or severity of a condition or biological manifestation thereof, or to delay the onset of such condition or biological manifestation thereof. Prophylactic therapy is appropriate, for example, when a subject is considered at high risk for developing cancer, such as when a subject has a strong family history of cancer or when a subject has been exposed to a carcinogen.
As used herein “reposition” and “repositioning” and grammatical variations thereof refers to a disease, trait and/or phenotype for which a pharmaceutical may have a use beyond the first disease, trait and/or phenotype for which the pharmaceutical had identified activity.
As used herein the term “amplification” and grammatical variations thereof refers to the presence of one or more extra gene copies in a chromosome complement. In certain embodiments a gene encoding a Ras protein may be amplified in a cell. Amplification of the HER2 gene has been correlated with certain types of cancer. Amplification of the HER2 gene has been found in human salivary gland and gastric tumor-derived cell lines, gastric and colon adenocarcinomas, and mammary gland adenocarcinomas. Semba et al., Proc. Natl. Acad. Sci. USA, 82:6497-6501 (1985); Yokota et al., Oncogene, 2:283-287 (1988); Zhou et al., Cancer Res., 47:6123-6125 (1987); King et al., Science, 229:974-976 (1985); Kraus et al., EMBO J., 6:605-610 (1987); van de Vijver et al., Mol. Cell. Biol., 7:2019-2023 (1987); Yamamoto et al., Nature, 319:230-234 (1986).
As used herein “overexpressed” and “overexpression” of a protein or polypeptide and grammatical variations thereof means that a given cell produces an increased amount of a certain protein relative to a normal cell of the same type. By way of example, a protein may be overexpressed by a diseased cell relative to a normal cell. Additionally, a mutant protein may be overexpressed compared to wild type protein in a cell. As is understood in the art, expression levels of a polypeptide in a cell can be normalized to a housekeeping gene such as actin. In some instances, a certain polypeptide may be underexpressed in a cell compared with a normal or standard cell.
As used herein “drug-side effect (SE) association” refers to an association of at least one side effect with at least one pharmaceutical.
As used herein “disease-side effect (SE) association” refers to an association of at least one, and possibly more than one, side effect that may be induced by at least one pharmaceutical or a class of drugs intended for treatment of a certain disease or disorder.
In one embodiment, methods are provided for selecting a new therapeutic indication for at least one first pharmaceutical comprising the steps of:
In one aspect the invention includes counting the number of drugs inducing or not inducing a SE when treating or not treating a disease, and generating a confusion matrix.
In one aspect the drug-side effect (SE) association is determined by the pharmaceutical label of said first pharmaceutical. In one aspect the drug-side effect (SE) association is determined by the SIDER database.
In one aspect the association strength is determined using Matthews correlation coefficient (MCC). In one aspect the association strength is determined using sensitivity (sn). In one aspect the association strength of (c) is determined using specificity (sp).
In another embodiment, methods are provided for treating a mammal in need of treatment with at least one of the diseases listed as New Indication in Table 1 with the corresponding pharmaceutical listed as Drug of Table 1. In one aspect the mammal is a human.
The invention is further described by the following non-limiting examples.
Here we show that the clinical side-effects (SEs) produced by a drug provide a human phenotypic profile for the drug, and this profile can suggest additional indications. The rationale behind this methodology is compelling in that both therapeutic effects and side effects are behavioral or physiological changes in response to the drug, and may be associated with each other via a known or unknown mechanism-of-action (MOA). We extracted 3,175 SE-disease relationships by combining the SE-drug relationships from drug labels and the drug-disease relationships from PharmGKB. Many relationships provide explicit repurposing hypotheses; for example, drugs causing hypoglycemia as an SE are potential candidates for diabetes, and a different dose or formulation may produce a clinically beneficial effect in at least a subpopulation that shows the side effect. Based on these 3,175 relationships, an indication prediction model was constructed. The model was subsequently tested on 4,200 clinical candidates across 101 major human diseases. 36% of the disease models achieved an AUC higher than 0.7, including depression, anxiety disorders, stomach neoplasms, non-small-cell lung carcinoma, lymphoma, leukemia and type II diabetes. The MOA for each SE-disease association was also investigated to rationally interpret the prediction result. This study suggests that clinical pharmacologists should pay closer attention to the SEs observed in clinical trials, not just to evaluate the potentially harmful side effects, but also to rationally explore the repositioning potential based on this “clinical human phenotypic assay”.
Recently, a systematic analysis pointed out that phenotypic screening exceeded target-based approaches in discovering first-in-class small-molecule drugs (Swinney, et al. Nat Rev Drug Discov 2011; 10(7):507-19). Clinical phenotypic information comes from actual patient data, which mimics a phenotypic “screen” of drug effects on humans, and can directly help rational drug repositioning. For example, one study sought to suggest new indications for drugs based on their existing therapeutic effects (Chiang, et al. Clin Pharmacol Ther 2009; 86(5):507-10). In our study, however, we utilize the rich information in clinical side-effects (SEs), which are usually regarded as unwanted effects of drugs, to suggest new indications for a drug. For instance, hypotension is an unfavorable SE of many drugs. However, such drugs may act as candidate anti-hypertension drugs if this SE is exploited by controlling the dosing, improving the formulation, choosing the subpopulation, etc.
The rationale for this strategy is that SEs and indications are both measurable behavioral or physiologic changes in response to the treatment. If drugs treating the same disease share the same SE, there might be some underlying mechanism-of-action (MOA) linking this disease and the SE, and this SE could serve as a phenotypic “marker” of the therapeutic effect for this disease. Furthermore, both therapeutic and side effects are observations on human subjects rather than animal models, so there are fewer translational issues. This suggests that understanding more about the conditions under which, and the extent to which, the SE was produced may warrant additional experiments as to MOA, and perhaps eventual repositioning. In this study, we systematically examined the connections between SEs and indications, and quantitatively measured the power of using these connections to predict new indications.
The methodology of Drug Repositioning based on the Side-Effectome (DRoSEf) is discussed in this study. The basic hypothesis is that if the SEs associated with a drug D are also induced by many of the drugs treating disease X, then drug D should be evaluated as a candidate for treating X. We constructed a database of disease-SE associations. Clinical pharmacologists who observe an SE in their clinical trial can query the database for diseases for which there are drugs that have the same side effect. This would suggest alternative indications for the drug in the clinical trial. Biologists can also investigate the underlying MOA for the relationship between SEs and disease so as to better understand the pathogenesis and the therapeutic process of the disease. Using this approach, we predicted new indications for marketed drugs and for 4,200 candidate drugs that were in clinical trials, with the prediction performance quantitatively measured.
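The database query described above can be illustrated with a minimal sketch. The drug, disease and SE names below are placeholders rather than data from the study, and the function name `candidate_diseases` and the `min_fraction` cutoff are assumptions introduced only for illustration; the study itself scores associations with sensitivity, specificity and MCC.

```python
# Toy data (hypothetical placeholders, not results from the study).
DISEASE_DRUGS = {
    "disease_X": {"drug_A", "drug_B", "drug_C"},
    "disease_Y": {"drug_E", "drug_F"},
}
DRUG_SES = {
    "drug_A": {"se_1", "se_2"},
    "drug_B": {"se_1"},
    "drug_C": {"se_1", "se_3"},
    "drug_E": {"se_3"},
    "drug_F": {"se_4"},
}


def candidate_diseases(query_ses, disease_drugs, drug_ses, min_fraction=0.5):
    """Return (disease, se, fraction) tuples for which at least
    `min_fraction` of the drugs treating the disease list the SE.

    `min_fraction` is an illustrative cutoff, not a value from the study.
    """
    hits = []
    for disease, drugs in sorted(disease_drugs.items()):
        if not drugs:
            continue  # no drugs known for this disease, nothing to compare
        for se in sorted(query_ses):
            n_listing = sum(1 for d in drugs if se in drug_ses.get(d, set()))
            fraction = n_listing / len(drugs)
            if fraction >= min_fraction:
                hits.append((disease, se, fraction))
    # Strongest shared-SE evidence first.
    return sorted(hits, key=lambda h: -h[2])


# A hypothetical drug D observed to cause se_1 and se_4 in a trial.
suggestions = candidate_diseases({"se_1", "se_4"}, DISEASE_DRUGS, DRUG_SES)
```

In this toy example, every drug treating disease_X lists se_1, so disease_X would be ranked first as an indication to evaluate for drug D.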
Both disease-drug associations and drug-SE associations are required to infer disease-SE associations. We extracted the indications of drugs from PharmGKB to provide the disease-drug associations (Altman, et al. Nat Genet 2007; 39(4):426). There are multiple resources for SE information. The SEs printed on the drug label, however, provide consistent and reliable data, as these are identified from large clinical trials and the drug label is approved and standardized by regulatory agencies. The SIDER database (Kuhn, et al. Mol Syst Biol 2010; 6:343), which had been used to predict drug off-targets (Campillos, et al. Science 2008; 321(5886):263-6), provides drug-SE relationships extracted from drug labels for 888 approved drugs using text-mining (Yang, et al. Bioinformatics 2009; 25(17):2244-50 and Agarwal, et al. Nat Rev Drug Discov 2009; 8(11):865-78). The relationships among drugs and SEs were extracted from this database. We used only the binary fact of the SE's presence on the drug label. We then inferred disease-SE associations by counting the number of drugs inducing or not inducing an SE when treating or not treating a disease, generating a confusion matrix as shown in
We investigated a few of the 3,175 associations to understand what these associations implied and how they can be used to suggest new indications. Some of the associations have an explicit explanation based on the current knowledge of the MOA (Table 2). 1) The SE positive ANA indicates the presence of autoimmune antibodies and appears to be associated with stroke. It is an SE shared by drugs treating stroke, mainly ticlopidine and several angiotensin-converting enzyme (ACE) inhibitors. Stroke itself is associated with severe immune suppression (Vogelgesang, et al. J Neuroimmunol 2011; 231(1-2):105-10). Thus, conceivably, drugs that are associated with an increased immune response in terms of positive ANA may help stroke patients, though of course an autoimmune response is not desirable. Overall, 50% of the drugs treating stroke listed this SE, whereas only 2% of the drugs not listed as treating stroke listed it. These 2% of drugs were called false positive drugs (Table 2). Several statins are associated with positive ANA, but are not indicated for stroke. A meta-analysis of 120,000 patients across 42 trials showed that statin therapy provides protection against all-cause mortality and nonhemorrhagic strokes (O'Regan, et al. Am J Med 2008; 121(1):24-33). Ramipril, which is associated with positive ANA, also shows a 32% risk reduction for stroke [11909785]. The immune related mechanism of action (MOA) in stroke has only been recognized recently (Vogelgesang, et al.); however, based on identification of the stroke-positive ANA association by DRoSEf, it is likely that the immune related SEs of these drugs could have suggested their potential use for stroke regardless of whether this MOA was recognized. 2) Cytomegalovirus infection is a sign of a weakened immune system (Dechanet, et al. J Infect Dis 1999; 179(1):1-8). 
Drugs that reduce immune response are often used to prevent transplant rejection; thus, drugs that list increased cytomegalovirus (CMV) infections as an SE may be good candidates for treating transplant patients. Methotrexate, an antineoplastic drug, lists CMV infections as an SE. It has been indicated for preventing transplant rejection [8956122]. 3) DRoSEf suggests that drugs that list porphyria as an SE may act as antidiabetics. There is a significant negative association between porphyria and diabetes (Andersson, et al. J Intern Med 1999; 245(2):193-7 and Yalouris, et al. Br Med J (Clin Res Ed) 1987; 295(6608):1237-8), with the MOA unknown. For example, in a study of 328 Swedish patients with porphyria, the 16 patients that developed diabetes all had their porphyria symptoms resolved. Valproic acid is a mood-stabilizing drug that lists porphyria. A recent study found it effective in lowering blood glucose levels in Wfs1 knockout mice (Terasmaa, et al. J Physiol Biochem 2011). Pyrazinamide is an anti-tuberculosis agent that lists porphyria. Interestingly, tuberculosis was found to be correlated with diabetes (Cantalice, et al. J Bras Pneumol 2007; 33(6):691-8 and Nijland, et al. Clin Infect Dis 2006; 43(7):848-54). In mice, naproxen may be a valuable tool to delay or prevent the development of type II diabetes from a pre-diabetic condition (Kendig, et al. Biochem Pharmacol 2008; 76(2):216-24). Estradiol was also found to have an antidiabetic effect (Kumar, et al. Endocrinology 2011). In a double-blinded, randomized placebo controlled clinical trial on women with type II diabetes, oral estradiol significantly decreased fasting glucose [19339356]. 4) Drugs that list delusions as a side effect may help with depression. Cabergoline, an ergot derivative that causes delusions, is a dopamine agonist that has an antidepressant-like property (Chiba, et al. Psychopharmacology (Berl) 2010; 211(3):291-301). 
The dopamine receptor agonist pergolide has shown antidepressant effects in Parkinson's disease patients (Picillo, et al. Parkinsonism Relat Disord 2009; 15 Suppl 4:S81-S84 and Quan, et al. Neuroscience 2011; 182:88-97). 5) Hyperacusis is a medical condition associated with hypersensitivity to certain frequency ranges of sounds. Phenytoin is a known anticonvulsant with hyperacusis as a listed side effect, and DRoSEf suggests a potential utility for treating depression. In fact, a small clinical trial found equivalent therapeutic effects between phenytoin and fluoxetine in treating depression [15889944], the latter drug being a first-line antidepressant agent. Modafinil is a drug for narcolepsy and is also potentially effective in combination with fluoxetine to treat depression (Abolfazli, et al. Depress Anxiety 2011; 28(4):297-302). 6) Constitutional symptoms are a listed SE for many antineoplastic drugs. The anti-HIV drug nevirapine also lists constitutional symptoms as an SE. Nevirapine has previously been suggested as a treatment for human hormone-refractory prostate carcinoma (Landriscina, et al. Prostate 2009; 69(7):744-54) and other neoplasms (Landriscina, et al. Int J Cancer 2008; 122(12):2842-50).
a Drugs not listed as treating the disease (2nd column) but listing the SE (3rd column).
In fact, 27% of the “false positive” drug-disease associations suggested by DRoSEf have at least one clinical trial article listed in PubMed. However, not all 3,175 associations have an obvious MOA explanation based on current knowledge. Based on these 3,175 associations, we further built Naïve Bayes models to predict the 145 indication endpoints using their associated SEs as the features. The average AUCs of 10-fold cross validations for each of the 145 diseases were calculated using Weka (Hall, et al. SIGKDD Explorations 2009; 11(1):10-18), 92% of which were above 0.8.
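As a rough illustration of how binary SE features can feed a Naïve Bayes indication model, the following sketch implements a small Bernoulli Naïve Bayes classifier by hand. It is not the Weka implementation used in the study; the Laplace smoothing parameter `alpha`, the function names and the toy feature matrix are all assumptions made for the example.

```python
import math


def train_bernoulli_nb(X, y, alpha=1.0):
    """Fit a Bernoulli Naive Bayes model.

    X: list of rows, each a list of 0/1 SE indicators for one drug.
    y: list of 0/1 labels (1 = drug is indicated for the disease).
    alpha: Laplace smoothing constant (an assumption of this sketch).
    """
    n_features = len(X[0])
    model = {}
    for c in (0, 1):
        rows = [x for x, label in zip(X, y) if label == c]
        prior = (len(rows) + alpha) / (len(X) + 2 * alpha)
        probs = [
            (sum(r[j] for r in rows) + alpha) / (len(rows) + 2 * alpha)
            for j in range(n_features)
        ]
        model[c] = (math.log(prior), probs)
    return model


def predict_proba(model, x):
    """Return the posterior probability that the drug treats the disease."""
    log_scores = {}
    for c, (log_prior, probs) in model.items():
        s = log_prior
        for xj, p in zip(x, probs):
            s += math.log(p if xj else 1.0 - p)
        log_scores[c] = s
    m = max(log_scores.values())  # subtract the max for numerical stability
    weights = {c: math.exp(s - m) for c, s in log_scores.items()}
    return weights[1] / (weights[0] + weights[1])


# Toy matrix: two SE features, four drugs (placeholder data).
X = [[1, 0], [1, 1], [0, 0], [0, 1]]
y = [1, 1, 0, 0]
nb_model = train_bernoulli_nb(X, y)
p_indicated = predict_proba(nb_model, [1, 0])
```

Here the first SE is listed by both indicated drugs and by neither non-indicated drug, so a new drug listing it receives a posterior above 0.5.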
Based on these 3,175 associations, a disease-SE network was constructed (
Moreover, among the 7% of the drugs that have not been indicated for OCD in PharmGKB but list priapism as an SE, some have been reported to treat OCD in the literature. For example, ziprasidone has been used as a coadjuvant treatment in resistant OCD (Iglesias, et al. Actas Esp Psiquiatr 2006; 34(4):277-9); quetiapine was reported to be effective in treating OCD (Alexander, et al. Aust N Z J Psychiatry 2009; 43(12):1185; Vulink, et al. J Clin Psychiatry 2009; 70(7):1001-8; and Savas, et al. Clin Drug Investig 2008; 28(7):439-42); there is a case report of oxcarbazepine's therapeutic effect in OCD (McMeekin H. J S C Med Assoc 2002; 98(8):316-20); the symptoms of OCD could be decreased after olanzapine treatment (van, et al. J Clin Psychopharmacol 2008; 28(2):214-8); a trend toward a non-obsessive response was seen under nefazodone treatment (Nelson EC. Ann Clin Psychiatry 1994; 6(4):249-53); and a strong reduction of OCD could be observed after using clozapine (Peters B, de H L. Prog Neuropsychopharmacol Biol Psychiatry 2009; 33(8):1576-7). Sildenafil was also among the 7% of the drugs that cause ‘priapism’. Given the association of OCD and priapism, and the central nervous system penetration of sildenafil (Schultheiss, et al. World J Urol 2001; 19(1):46-50), the drug could be considered for OCD. A possible MOA is that nitric oxide modulates the neurotransmitters implicated in OCD (Umathe, et al. Nitric Oxide 2009; 21(2):140-7), and the inhibition of PDE5 protein by sildenafil may lead to a sustained release of nitric oxide (Ghofrani, et al. Nat Rev Drug Discov 2006; 5(8):689-702).
The above analysis requires knowing the SEs from a drug's label before we predict new indications. We also wanted to investigate whether we could predict SEs based on compound structure and then predict new indications based on those SEs. We hypothesized that such a prediction “chain” would provide mechanistic explanations of the compound's new indication based on the disease-SE association and the structural information of the compound. To present a quantitative framework of DRoSEf's performance, we recruited all small molecule candidates or marketed drugs from Genego® MetaBase (Ekins, et al. Expert Opin Drug Metab Toxicol 2005; 1(2):303-24). This provided a data source of 4,200 molecules in addition to the 888 SIDER drugs. These 4,200 molecules are indicated for at least one of 101 diseases from the 145 disease set. MetaBase also uses MeSH disease terms, thus making comparisons to the MeSH indications from PharmGKB straightforward.
DRoSEf requires the side-effect profile for each molecule to predict new indications. However, such information is difficult to obtain for most of the 4,200 molecules because they are generally clinical candidates without FDA approved drug labels, and have few or no SEs published from their clinical trials in a standardized way. Quantitative structure-activity relationship (QSAR) models have been used to predict target binding of a ligand (Nidhi, et al. J Chem Inf Model 2006; 46(3):1124-33). We hypothesized that QSAR models could also be used to predict SEs, and that the use of the predicted SEs as an intermediate towards predicting a disease indication would help explain the underpinnings of the disease indication without lowering the quality of the disease indication prediction. This approach goes from drug Structure to Side Effect and then to a disease indication (StruSEf). For side effect j (SEj), we recruited the positive set (φjpos, drugs causing SEj) and the negative set (φjneg, drugs not reported to induce SEj) from the 888 SIDER drug set (
The ROC curves of the prediction performance for 101 disease endpoints were generated. Some of the disease endpoints had only a few positive drugs from the MetaBase set, and their AUC value might not accurately reflect the true performance. We therefore focus on the diseases that have more than 30 compounds with that specific indication in MetaBase. Table 3 lists the diseases with AUC greater than 0.70. The AUCs for neuropsychiatric diseases are higher than those for neoplasms and other disease endpoints, which may be due to a higher number of SEs (and thus a better characterized side effect profile) for these diseases. We then evaluated the extent to which structural similarity information contributed to this performance. In fact, when we do not use SE information at all and rely only on chemical structure, only 18% of the 101 disease endpoints achieve AUCs above 0.7, while 36% of disease endpoints had AUCs higher than 0.7 from StruSEf. Moreover, 74% of endpoints achieved a higher AUC in StruSEf than when using chemical structural information alone. Only 22% of the variance in the AUCs of StruSEf was explained by “chemical structure only” across the 101 endpoints. This again indicates that the side effect intermediate is adding value to the prediction.
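For readers unfamiliar with the AUC values quoted above, the following sketch computes the area under the ROC curve for one disease endpoint as the probability that a randomly chosen positive drug is scored above a randomly chosen negative drug (ties counting one half). The function name `roc_auc` and the toy scores are assumptions for illustration; the study's own scoring is not reproduced here.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    scores: predicted scores, one per drug.
    labels: 1 if the drug is indicated for the disease, else 0.
    """
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count one half
    return wins / (len(pos) * len(neg))


# Toy example with four drugs (placeholder scores).
example_auc = roc_auc([0.9, 0.8, 0.4, 0.3], [1, 0, 1, 0])
```

The pairwise form is quadratic in the number of drugs, which is acceptable for a sketch; a rank-based computation would be preferable for thousands of molecules.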
MetaBase includes 203 molecules indicated for hypertension. However, there are molecules other than the 203 above that have not yet been reported to treat hypertension that achieved a relatively high Θ score based on SEs (corresponding to the rightmost part of the blue line in
Among the top investigational molecules in
Blonanserin acts as an antagonist of the 5-HT2 receptor. A study demonstrated that the increase in blood pressure is due to stimulation of postjunctional 5-HT2 receptors [3368008].
Some hypertension specific SEs are associated with the MOA of various classes of antihypertensive drugs. Thus, we hypothesize that other drugs that are not known as treatments for hypertension but show those side effects might well be acting, in part, through that specific MOA. For example, pemphigus is reported to be induced by angiotensin-converting enzyme (ACE) inhibitors (Ong, et al. Australas J Dermatol 2000; 41(4):242-6.), and cold extremities could be induced by antihypertensives, especially by β-adrenergic blockers (Feleke, et al. Acta Med Scand 1983; 213(5):381-5.). In the prediction results from StruSEf, we also found that ACE inhibitors are significantly enriched in the drugs predicted as pemphigus positive (Fisher's exact p=1.4E-3), whereas β-adrenergic blockers have a significantly higher frequency in drugs predicted as cold extremities positive (p=0.02).
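The enrichment test mentioned above can be sketched as a one-sided Fisher's exact test on a 2x2 table of drug class membership versus predicted SE status. Whether the reported p-values were one-sided or two-sided is not stated, so the one-sided form here is an assumption, as are the function name and the toy counts.

```python
from math import comb


def fisher_enrichment_p(a, b, c, d):
    """One-sided Fisher's exact p-value, P(X >= a), for the 2x2 table

                     SE predicted   SE not predicted
        in class          a                b
        not in class      c                d

    computed from the hypergeometric distribution with fixed margins.
    """
    row1 = a + b          # drugs in the class (e.g., ACE inhibitors)
    col1 = a + c          # drugs predicted positive for the SE
    n = a + b + c + d     # all drugs
    total = comb(n, col1)
    upper = min(row1, col1)
    tail = sum(
        comb(row1, k) * comb(n - row1, col1 - k) for k in range(a, upper + 1)
    )
    return tail / total


# Toy counts (placeholders, not the study's data).
p_toy = fisher_enrichment_p(3, 1, 1, 3)
```

A small p-value indicates that the drug class is over-represented among drugs predicted positive for the SE, relative to what fixed margins would produce by chance.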
This study proposes systematic and rational drug repositioning based on SEs (DRoSEf), and demonstrates its applicability via prediction of the repositioning potential for 4,200 (candidate) drugs across 101 diseases. Based on the fact that the methodology could recall the known indications for the drug molecules, we suggested unknown indications for both marketed drugs (DRoSEf) and clinical candidates (StruSEf). Afterwards, these suggestions were evaluated by mining proof-of-concept evidence from the literature. The concept of DRoSEf suggests that clinical pharmacologists pay closer attention to the SEs observed in clinical trials, and thus explore additional indications for their drugs based on understanding the connections between an SE and the therapeutic effect of the drug. The examples raised in this study are only for demonstrating the principle of this methodology, and may not necessarily be effective or practical for real repurposing practice. Furthermore, many factors should be considered when applying this methodology in practice, such as the unmet medical need for the disease, the fraction of the population showing the side effect, the CNS penetration of the molecule and whether the therapeutic effect is significant enough in comparison to current treatments. Moreover, the previous therapeutic effect could now become a potential side effect as well, and will need to be carefully considered in the risk-benefit profile. Hopefully, in a few cases, this could all be managed by choosing a suitable formulation, dose, and subpopulation.
SEs can be used to predict targets [18621671]. However, the basic principle of DRoSEf is to mimic a “phenotypic assay” rather than a target-based assay to screen compounds for a disease indication, though DRoSEf itself can suggest targets, such as ACE and the β-adrenergic receptor in the case study for hypertension. It has been reported that phenotypic screening exceeded target-based approaches in the discovery of first-in-class drugs [21701501]. DRoSEf leverages “assays” with direct human phenotypes. Our study demonstrated that phenotypic features from humans work well for suggesting new indications, and may even outperform assays run on in vitro or animal models, which face translational challenges. Its application relies on the MOA linking the SEs to the disease, although many of these MOAs are unknown or complicated. In this study, we did not consider the absolute frequency of the SEs or their relative frequency or significance compared to placebo. In SIDER, only 37.9% of the drug-SE pairs have frequency information associated with them; thus, to maximize the number of drugs covered, we used only the presence or absence of each SE. SEs with higher frequencies, like nausea and vomiting, are usually described in detail on the drug label. However, the frequencies for most of the informative SEs are unknown. Some SEs are regarded as rare, but are still implicated in the pathogenesis of a particular disease. In fact, they might expose an extreme phenotype of a known or unknown MOA. For example, porphyria is a rare inherited disease [11117426]. Patients with this inherited disease show a decrease in the risk of porphyria on becoming diabetic (26;27). This may suggest why antidiabetic drugs are usually reported to worsen porphyria, but this may only affect people with an inherited genetic mutation for porphyria, and this subset of the population may act as the “model” for screening anti-diabetes drugs, with porphyria as the screening endpoint. 
Thus, a drug that increases porphyria in this subpopulation with the mutation may well be a good diabetes drug in a different, larger population. So the off-phenotype of a drug in a subpopulation might suggest its use for a broader population. Besides mimicking a human phenotypic “screen” to help “fish” for positive candidates for repositioning, DRoSEf may also suggest unrecognized disease pathogenesis; for example, studying porphyria may lead to a better understanding of diabetes.
A limitation of StruSEf is the number (888) of drugs that have available side effects. The models and accuracy would improve if we were able to obtain side effects on a larger number of drugs. Moreover, predictions of indications for the 4,200 MetaBase drugs would also be better if we had some side effect information from their early stage clinical trials rather than relying on just their structures. Even if we had to rely on structures for preclinical molecules, it would help if the structure based side effect models were trained on more than the 888 drugs from SIDER. Constructing a larger database of disease-SE associations by mining drug labels and additional literature should improve accuracy. On the other hand, the prediction performance could also be an underestimate. Molecules that have not yet been reported to treat a disease may well be capable of treating that disease, and in many cases (the false positive drugs as shown in Table 2) clinical trials have already shown a positive effect. These molecules are currently regarded as false positives, which decreases the computed AUC value. However, even with this imperfect SE information and potentially underestimated prediction performance, 36% of the disease endpoints achieved AUCs higher than 0.7, which is generally higher than the disease prediction performance using the QSAR model alone.
Using multiple SE features to predict the disease endpoint could also improve sensitivity over individual features. Although there are explicit individual disease-SE associations, not all of them have sufficient predictive power. For instance, not all drugs treating anemia are annotated with polycythemia in SIDER, so the sensitivity of this feature is limited. Including multiple features could enhance sensitivity: if a true positive is not recalled by one feature, it still has a chance of being recalled by others. In the case of hypertension, the drug candidate “pranlukast” cannot be recalled through the feature pemphigus, but can be recalled via “cold extremities” (
The numerous examples of SEs that correspond to, and may even have led to, clinical trials for new indications make it clear that clinical side effect observations provide powerful human clinical data that clinicians already use implicitly to repurpose drugs. DRoSEf systematizes this process, provides numerous predictions based on the underlying MOA of the SE in the disease's pathogenesis, and benefits from the fact that side effects are human phenotypic data, which obviates translation issues.
The disease-SE associations were computed from the disease-drug associations and the drug-SE associations, which were extracted from the PharmGKB (18) and SIDER (19) databases, respectively. PharmGKB uses MeSH terms to describe diseases (Hansen, et al. Clin Pharmacol Ther 2009; 86(2):183-9). For side effects from SIDER, we only use them as present or absent in association with a drug, and do not consider their frequencies explicitly, as only 37.9% of the drugs had side effect frequencies associated with them. Let true positives (tpij) be the number of drugs that are indicated for disease i and list SE j; false positives (fpij) be the number of drugs that are not indicated for disease i and list SE j; true negatives (tnij) be the number of drugs that are not indicated for disease i and do not list SE j; and false negatives (fnij) be the number of drugs that are indicated for disease i and do not list SE j. We calculated the sensitivity (snij), specificity (spij) and Matthews correlation coefficient or MCC (mccij) of using SE j to predict disease i using the standard formulas below:
$$sn_{ij} = \frac{tp_{ij}}{tp_{ij} + fn_{ij}}, \qquad sp_{ij} = \frac{tn_{ij}}{tn_{ij} + fp_{ij}},$$
$$mcc_{ij} = \frac{tp_{ij}\, tn_{ij} - fp_{ij}\, fn_{ij}}{\sqrt{(tp_{ij} + fp_{ij})(tp_{ij} + fn_{ij})(tn_{ij} + fp_{ij})(tn_{ij} + fn_{ij})}}.$$
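The counts and measures defined above can be sketched in Python as follows. This is a minimal illustration rather than the implementation used in the study: the function name `association_stats`, the set-based inputs, and the toy drug identifiers are all hypothetical, and the convention of returning 0.0 when a denominator is zero is an assumption.

```python
import math

def association_stats(indicated, lists_se, all_drugs):
    """Contingency counts and sn/sp/MCC for one disease i and one SE j.

    indicated: set of drugs indicated for disease i
    lists_se:  set of drugs that list SE j
    all_drugs: set of every drug considered (superset of both)
    """
    tp = len(indicated & lists_se)          # indicated and list the SE
    fp = len(lists_se - indicated)          # list the SE, not indicated
    fn = len(indicated - lists_se)          # indicated, do not list the SE
    tn = len(all_drugs) - tp - fp - fn      # neither
    sn = tp / (tp + fn) if (tp + fn) else 0.0
    sp = tn / (tn + fp) if (tn + fp) else 0.0
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Assumption: report 0.0 when the MCC denominator vanishes.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return tp, fp, tn, fn, sn, sp, mcc

# Toy data (hypothetical drug identifiers).
all_drugs = {f"d{n}" for n in range(1, 11)}
indicated = {"d1", "d2", "d3", "d4"}
lists_se = {"d1", "d2", "d3", "d5"}
tp, fp, tn, fn, sn, sp, mcc = association_stats(indicated, lists_se, all_drugs)
```

In this toy case three of the four indicated drugs list the SE, so the sensitivity is 0.75.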
For binary variables, the MCC is equivalent to the Pearson correlation coefficient. The two-sided Fisher's exact test p value (pij) was also calculated. A disease-SE association was considered non-informative if (pij>0.05|mccij<0.15|spij<0.75|tpij<2), where | denotes logical OR. This threshold yielded 3,175 informative associations covering 145 MeSH disease phenotypes and 584 SEs. The associations in Table 2 were selected based on the following criteria: the MCC is among the top 150 of all 3,175 associations; tpij>3; and the association between the disease and the SE has an explanation according to the knowledge of the authors. In
We calculated several structural descriptors (log P, molecular weight, number of hydrogen bond donors and acceptors, number of rotatable bonds and SCFP6 fingerprint) for the 888 SIDER drugs. We attempted to train 584 SE models with the multiple Laplacian-modified Bayesian method (Nidhi, et al. J Chem Inf Model 2006; 46(3):1124-33) using the features above; 566 SE models were successfully trained.
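To illustrate the kind of model involved, the sketch below assumes the commonly described Laplacian-corrected naive Bayesian formulation, in which each feature receives the weight log((A + 1) / (N·P + 1)), where N is the number of training molecules containing the feature, A is the number of those that are positive for the SE, and P is the overall fraction of positives. A molecule's score is the sum of the weights of its features. Treating fingerprint bits as integer feature IDs, the function names, and the toy data are all assumptions for illustration; the descriptors and software actually used may differ.

```python
import math
from collections import Counter

def train_laplacian_bayes(fingerprints, labels):
    """Return per-feature weights for one binary SE model.

    fingerprints: list of sets of feature IDs (one set per molecule)
    labels:       list of 0/1 flags (1 = molecule lists the SE)
    """
    base_rate = sum(labels) / len(labels)   # P: overall fraction of positives
    n_with = Counter()                      # N: molecules containing the feature
    a_with = Counter()                      # A: positives containing the feature
    for feats, y in zip(fingerprints, labels):
        for f in feats:
            n_with[f] += 1
            a_with[f] += y
    return {f: math.log((a_with[f] + 1) / (n_with[f] * base_rate + 1))
            for f in n_with}

def score(weights, feats):
    """Sum the weights of the features present; unseen features add 0."""
    return sum(weights.get(f, 0.0) for f in feats)

# Toy training set (hypothetical feature IDs and labels).
fingerprints = [{1, 2}, {1, 3}, {2, 3}, {3, 4}]
labels = [1, 1, 0, 0]
weights = train_laplacian_bayes(fingerprints, labels)
```

A molecule would then be called positive for the SE when its score exceeds a chosen cutoff; the source does not state the cutoff, so none is fixed here.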
We evaluated 5,534 clinical candidates or marketed drugs from GeneGo MetaBase (as of January 2011). We considered only molecules that included SMILES strings and that listed a disease indication matching at least one of the 145 from the SIDER set, and we excluded molecules that duplicated drugs in the SIDER set. This left us with 4,200 small molecules for an independent test set. Each of these molecules was assigned at least one of the 101 disease MeSH terms that match the 145 MeSH diseases.
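The three selection rules above could be expressed along the following lines. The record layout (keys `id`, `smiles`, `indications`), the function name, and the toy records are hypothetical, and detecting duplicates by a shared identifier is an assumption, since the text does not say how duplicates of SIDER drugs were recognized.

```python
def build_test_set(candidates, sider_ids, disease_terms):
    """Keep candidates with a structure, a matching indication, and no SIDER duplicate."""
    kept = []
    for mol in candidates:
        if not mol.get("smiles"):
            continue                                   # rule 1: needs a SMILES string
        matched = set(mol.get("indications", ())) & disease_terms
        if not matched:
            continue                                   # rule 2: needs a matching disease term
        if mol["id"] in sider_ids:
            continue                                   # rule 3: drop SIDER duplicates (assumed ID match)
        kept.append({**mol, "indications": sorted(matched)})
    return kept

# Toy records (hypothetical).
candidates = [
    {"id": "m1", "smiles": "CCO", "indications": ["Hypertension", "Gout"]},
    {"id": "m2", "smiles": "", "indications": ["Hypertension"]},
    {"id": "m3", "smiles": "CCN", "indications": ["Gout"]},
    {"id": "m4", "smiles": "CCC", "indications": ["Hypertension"]},
]
test_set = build_test_set(candidates, sider_ids={"m4"}, disease_terms={"Hypertension"})
```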
The endpoint of our prediction is whether or not the compound should be considered for a clinical trial for treating disease i just based on side effect information. For each disease i, we computed its side-effectome profile vector from the SIDER data,
$$DS_i = [ds_{i1}, ds_{i2}, \ldots, ds_{ij}], \quad j \in [1, 566],\ i \in [1, 101],$$
where dsij quantifies the association of disease i and SE j. The vectors were generated using seven different metrics, i.e.,
$$ds_{ij} \in \{b_{ij},\ mcc_{ij},\ mcc_{ij}^{4},\ se_{ij},\ se_{ij}^{4},\ sp_{ij},\ sp_{ij}^{4}\},$$
where bij=0 if (pij>0.05|mccij<0.15|spij<0.75|tpij<2), else bij=1. Here seij denotes the sensitivity, written snij above. We raised mcc, se and sp to the fourth power in an effort to enhance the signal of high values.
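As a rough illustration, the seven candidate values of dsij for a single disease-SE pair could be computed as below. The function names and the example numbers are hypothetical, the p value is taken as a precomputed input, and the count in the last filter condition is assumed to be the true-positive count.

```python
def is_informative(p, mcc, sp, tp):
    """Informativeness filter: fails if any one of the four conditions holds."""
    return not (p > 0.05 or mcc < 0.15 or sp < 0.75 or tp < 2)

def association_metrics(p, mcc, se, sp, tp):
    """Return the seven candidate ds_ij values for one disease/SE pair."""
    return {
        "b": 1 if is_informative(p, mcc, sp, tp) else 0,
        "mcc": mcc,
        "mcc4": mcc ** 4,
        "se": se,
        "se4": se ** 4,
        "sp": sp,
        "sp4": sp ** 4,
    }

# Hypothetical values for one pair.
metrics = association_metrics(p=0.01, mcc=0.5, se=0.6, sp=0.9, tp=5)
```

Raising to the fourth power compresses small values far more than large ones (0.5 becomes 0.0625, 0.9 becomes about 0.656), which is the signal enhancement described. An even power also discards the sign of a negative MCC; the text does not say how negative values were handled.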
For each molecule k without known SEs, we estimated its side-effectome profile vector SMk by computing it using each of the 566 pre-trained SE models,
$$SM_k = [sm_{1k}, sm_{2k}, \ldots, sm_{jk}], \quad j \in [1, 566],\ k \in [1, 4200],$$
where smjk=1 if the molecule k was predicted as possibly causing SE j, else smjk=0. We calculated the association Θik between disease i and molecule k as the dot product of the two vectors,
$$\Theta_{ik} = DS_i \cdot SM_k = \sum_{j=1}^{566} ds_{ij}\, sm_{jk}.$$
We computed Θik using each of the seven metrics, and for each metric we further computed an AUC for each of the 101 endpoints. The metric seij4 (seij raised to the fourth power) performed best among all metrics in terms of the mean AUC across all 101 disease endpoints. Thus, the reported AUC values are based on the seij4 metric.
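The scoring and evaluation steps can be sketched as follows: each molecule is scored against a disease by the dot product of the disease profile and the molecule's predicted binary SE profile, and the AUC for that disease is the probability that a molecule known to treat it outscores one that is not, with ties counted as one half. The function names, the four-element vectors, and the labels are hypothetical, and the pairwise AUC estimator is one standard choice that the text does not specify.

```python
def association_score(ds_i, sm_k):
    """Theta_ik: dot product of a disease profile and a molecule's SE profile."""
    return sum(d * s for d, s in zip(ds_i, sm_k))

def auc(scores, labels):
    """Pairwise (Mann-Whitney) AUC; ties between a positive and a negative count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy disease profile over four SEs and four molecules (hypothetical).
ds_i = [0.4, 0.1, 0.0, 0.3]
molecules = [
    [1, 0, 0, 1],   # predicted to cause SE 1 and SE 4
    [0, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 0],
]
labels = [1, 1, 0, 0]   # 1 = molecule is known to treat the disease
scores = [association_score(ds_i, sm_k) for sm_k in molecules]
disease_auc = auc(scores, labels)
```

Repeating this for every disease and averaging the resulting AUCs gives the per-metric mean AUC used to compare the seven metrics.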
Filing Document | Filing Date | Country | Kind | 371c Date |
---|---|---|---|---|
PCT/US12/51257 | 8/17/2012 | WO | 00 | 2/19/2014 |

Number | Date | Country |
---|---|---|
61525467 | Aug 2011 | US |