This application claims the benefit of Korean Patent Application No. 10-2014-0173233, filed on Dec. 4, 2014 in the Korean Intellectual Property Office, the entire disclosure of which is hereby incorporated by reference.
1. Field
The present disclosure relates to methods of screening reactions or pathways induced by a compound.
2. Description of the Related Art
In the field of pharmaceutics, a methodology for predicting decomposition products of a drug has been developed. However, the methodology is simply for predicting compounds produced by organic chemical reaction mechanisms. In order to study the effects of compounds on cells, prediction of reactions or biological pathways in the cells that may be affected by the compounds is important.
Provided are methods of screening for biochemical reactions induced by a compound in a cell or organism.
Additional aspects will be set forth in part in the description which follows and, in part, will be apparent from the description, or may be learned by practice of the presented exemplary embodiments.
These and/or other aspects will become apparent and more readily appreciated from the following description of the exemplary embodiments, taken in conjunction with the accompanying drawings in which:
Reference will now be made in detail to exemplary embodiments, examples of which are illustrated in the accompanying drawings, wherein like reference numerals refer to like elements throughout. In this regard, the present exemplary embodiments may have different forms and should not be construed as being limited to the descriptions set forth herein. Accordingly, the exemplary embodiments are merely described below, by referring to the figures, to explain aspects. As used herein, the term “and/or” includes any and all combinations of one or more of the associated listed items. Expressions such as “at least one of,” when preceding a list of elements, modify the entire list of elements and do not modify the individual elements of the list.
Provided herein is a method of screening or predicting biochemical reactions induced or affected in a cell by a specific compound when used to treat the cell. The term “specific compound” is used herein to refer to a compound of interest (e.g., a target compound or test compound), such as a potential therapeutic compound. According to an embodiment, the method includes selecting two or more biochemical reactions for a specific compound from biochemical reactions known or determined to be associated with the specific compound by comparing one or more functional regions of the compound with one or more functional regions in a transformation library; identifying a set of genes, proteins, and/or metabolites differentially expressed in a cell, in response to a contact between the specific compound and the cell; selecting one or more biological pathways associated with the set of differentially expressed genes, proteins, and/or metabolites from known biological pathways; and determining the biochemical reaction included in the selected biological pathway, among the selected biochemical reactions, as a biochemical reaction induced in the cell by treatment with the test compound. Each of the steps may be performed by at least one computer processor. The at least one processor may be operably connected to a memory device.
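The final determining step can be sketched as a simple filter: a candidate reaction is kept only if it belongs to at least one selected pathway. The following is a minimal sketch under stated assumptions, not the claimed implementation; the function name and all reaction and pathway identifiers are hypothetical.

```python
def reactions_in_selected_pathways(candidate_reactions, selected_pathways,
                                   pathway_reactions):
    """Return (reaction, [pathways]) for candidates found in a selected pathway."""
    induced = []
    for reaction in candidate_reactions:
        hits = [p for p in selected_pathways
                if reaction in pathway_reactions.get(p, set())]
        if hits:
            induced.append((reaction, hits))
    return induced


# Hypothetical identifiers, for illustration only.
candidates = ["rxn_A", "rxn_B", "rxn_C"]
selected = ["pw_1", "pw_2"]
pathway_reactions = {
    "pw_1": {"rxn_B", "rxn_X"},
    "pw_2": {"rxn_C"},
    "pw_3": {"rxn_A"},  # pw_3 was not selected, so rxn_A is dropped
}
induced = reactions_in_selected_pathways(candidates, selected, pathway_reactions)
print(induced)
```

Here rxn_A is predicted from structure alone but is discarded because no pathway supported by the expression data contains it.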
The term “biochemical reaction” used herein refers to conversion of one molecule to another molecule, which is mediated by an enzyme in a cell. In the biochemical reaction, an enzyme binds to a substrate and catalyzes conversion of the substrate to a product (e.g., metabolite). The biochemical reaction may be classified or identified according to KEGG reaction ID, EC number, reaction definition, types of a reactant and a product, or a combination thereof. The term “biological pathway” used herein refers to a series of reactions (e.g., chemical or biochemical reactions) between molecules in a cell that leads to a certain product or a change in a cell. The biological pathway may be a metabolic pathway, a gene regulation pathway, or a signal transduction pathway.
The input receiving module 806 may instruct the processor(s) 804 to receive input of the identity of compound X along with data describing a chemical conversion of compound X. The functional and linker region(s) determination module 808 may instruct the processor(s) 804 to determine one or more functional regions and one or more linker regions in compound X. The transformation library searching module 810 may instruct the processor(s) 804 to search a transformation library to find functional region(s) similar or identical to the one or more functional regions of compound X by scanning the transformation library 818 stored in the memory 802. The group assigning module 812 may instruct the processor(s) 804 to assign compound X to one or more groups of the transformation library showing high similarity with the one or more functional regions. The metabolite similarity score computing module 814 may instruct the processor(s) 804 to compute a metabolite similarity score of compound X for one or more reactions in the one or more assigned groups of the transformation library. The reaction identification module 816 may instruct the processor(s) 804 to identify reactions having a high metabolite similarity score.
The method provided herein involves selecting two or more biochemical reactions for a specific compound from among known biochemical reactions. The known biochemical reactions may include reactions known or determined to be associated with the specific compound. The selecting two or more biochemical reactions for a specific compound may be achieved via comparing one or more functional regions of the compound with one or more functional regions in a transformation library. The step of the selecting two or more biochemical reactions for a specific compound from among known biochemical reactions may include, for example, receiving input of the identity of the compound (e.g., data describing the molecular structure or molecular formula of the compound) along with data describing a chemical conversion of the compound; determining one or more functional regions and one or more linker regions in the compound; searching (scanning) a transformation library to find functional region(s) similar or identical to the one or more functional regions in the transformation library; assigning the compound to one or more groups of the transformation library showing high similarity with the one or more functional regions; computing a metabolite similarity score of the compound for one or more reactions in the one or more assigned groups of the transformation library; and identifying reactions having a high metabolite similarity score. The similarity between functional regions of the compound and those of the transformation library may be determined by using a similarity measure for comparing chemical structures. The similarity measure may be the Tanimoto (or Jaccard) coefficient.
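As an illustration of the similarity measure named above, the Tanimoto (Jaccard) coefficient of two binary feature sets is the size of their intersection divided by the size of their union. The sketch below assumes each functional region has already been reduced to a set of structural feature keys; the feature names are hypothetical, and a cheminformatics toolkit would normally supply real fingerprints.

```python
def tanimoto(features_a, features_b):
    """Tanimoto (Jaccard) coefficient of two feature sets, in [0, 1]."""
    a, b = set(features_a), set(features_b)
    union = a | b
    if not union:
        return 0.0  # convention chosen here for two empty sets
    return len(a & b) / len(union)


# Hypothetical substructure keys for two functional regions.
region_1 = {"primary_amine", "carboxyl", "alpha_carbon"}
region_2 = {"carboxyl", "alpha_carbon", "hydroxyl"}
score = tanimoto(region_1, region_2)
print(score)
```

Two of the four distinct features are shared, so the coefficient is 2/4.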
The data describing a chemical conversion (chemical conversion information) for the compound may include at least one of transformation region information and reaction conversion information. Transformation region information may be information (data) describing the structure of the transformation regions of the compound. The transformation region may be atoms or the chemical group or structure undergoing a change in bond connectivity or bond order. Reaction conversion information may be, for example, attributes governing the transformation. The data describing a chemical conversion for the compound may be represented by using simplified molecular-input line-entry system (SMILES) formulas for the compound and the product to which the compound is converted via the chemical conversion.
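A chemical conversion expressed as a pair of SMILES formulas can be held in a small record, as in the following sketch. The class and field names are hypothetical. The example pair assumes the well-known decarboxylation of ornithine to putrescine (NCCCCN), and the "reactant>>product" string follows the common reaction SMILES convention rather than anything prescribed in this disclosure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChemicalConversion:
    """A compound and the product it is converted to, both as SMILES."""
    reactant_smiles: str
    product_smiles: str

    def reaction_smiles(self):
        # Reaction SMILES convention: reactants and products separated by '>>'.
        return f"{self.reactant_smiles}>>{self.product_smiles}"


# Ornithine to putrescine (loss of CO2); used here only as an illustration.
conversion = ChemicalConversion("NCCC[C@H](N)C(O)=O", "NCCCCN")
print(conversion.reaction_smiles())
```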
Determining one or more functional regions and one or more linker regions in the compound may include identifying one or more transformation regions for the compound based on the data describing a chemical conversion of the compound; extracting information (data) describing the structure of the one or more transformation regions of the compound; and identifying one or more functional regions and one or more linker regions in the compound from the extracted information describing one or more transformation regions. The transformation region for the reactant molecule may be identified by comparing it to corresponding product molecule.
The functional region(s) of the compound can be determined based on the chemical conversion information. A functional region of a compound is the region that participates in or is necessary for a given reaction to take place. A functional region may include at least one transformation region (e.g., a chemical group or structure that changes as a result of the chemical conversion reaction) alone or together with at least one other region of interest (e.g., region that participates in some way, perhaps indirectly, in a chemical reaction). The linker region(s) may include the residual part of the compound outside of the identified functional group(s) in the compound. Thus, the functional regions can be determined, for instance, by comparing the compound of interest before a chemical conversion reaction to the resulting compound after the chemical conversion reaction to determine which regions of the molecule change and by applying other general knowledge in the art to determine which other parts of the compound are involved in the reaction.
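The comparison of a reactant with its product can be sketched on a toy bond table. The sketch assumes atoms carry matching labels in the reactant and the product (an atom mapping), ignores hydrogens, and treats a missing bond as order 0. The atom labels, the decarboxylation-like example, and the choice of regions of interest are all hypothetical.

```python
def transformation_atoms(reactant_bonds, product_bonds):
    """Atoms whose bond connectivity or bond order differs between the two."""
    changed = set()
    for pair in set(reactant_bonds) | set(product_bonds):
        if reactant_bonds.get(pair, 0) != product_bonds.get(pair, 0):
            changed.update(pair)
    return changed


def bond(a, b, order):
    """Helper: an unordered atom pair mapped to its bond order."""
    return (frozenset((a, b)), order)


# Toy decarboxylation: the Ca-C1 bond breaks and C1-O2 becomes a double bond.
reactant = dict([bond("Ca", "C1", 1), bond("C1", "O1", 2), bond("C1", "O2", 1),
                 bond("Ca", "N", 1), bond("Ca", "Cb", 1)])
product = dict([bond("C1", "O1", 2), bond("C1", "O2", 2),
                bond("Ca", "N", 1), bond("Ca", "Cb", 1)])

changed = transformation_atoms(reactant, product)
regions_of_interest = {"N", "O1"}          # assumed, e.g. from prior knowledge
functional = changed | regions_of_interest
all_atoms = set().union(*reactant, *product)
linker = all_atoms - functional            # residual part of the compound
print(sorted(changed), sorted(linker))
```

The functional region is the union of the changed atoms and any additional regions of interest; whatever remains is treated as the linker region.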
The structure or formula of functional region of the test compound is used by the processor to search or scan a transformation library. The transformation library is a library or database of information or data describing reactions categorized into a plurality of groups of one or more reactions undergoing similar chemical conversions represented by one or more functional regions and associated information (e.g., a list of at least one enzyme catalyzing the reaction(s) and the extracted functional region(s) and linker region(s) of the reaction(s)). The transformation library is generated by grouping similar reactions based on the functional region similarity. The transformation library may include a collection of previously reported bio-molecular conversions. The transformation library may include a plurality of groups of chemical reactions. Each of the groups may include chemical reactions undergoing similar chemical conversion, a list of enzyme(s) catalyzing each of the reactions, and the functional and linker region(s) of each of the reactions. Each of the groups has representative functional region(s). In the present specification, the transformation library may also be referred to as “a reaction rule set.”
In the transformation library, the systematic arrangement of groups may make them useful in terms of group assignment and deriving metabolite similarity score. The similar chemical transformation may be identified by matching up the at least one functional region among the reactions.
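One possible in-memory layout for the transformation library, together with group assignment, is sketched below. The class names, the feature-set representation of regions, the EC numbers attached to the toy reactions, and the similarity threshold of 0.5 are assumptions made for illustration; the disclosure does not quantify "high similarity."

```python
from dataclasses import dataclass, field


@dataclass
class Reaction:
    reaction_id: str
    enzymes: list                 # enzymes catalyzing the reaction
    linker_features: frozenset    # features of the reaction's linker region


@dataclass
class Group:
    name: str
    representative_features: frozenset   # representative functional region
    reactions: list = field(default_factory=list)


def tanimoto(a, b):
    union = set(a) | set(b)
    return len(set(a) & set(b)) / len(union) if union else 0.0


def assign_groups(compound_features, library, threshold=0.5):
    """Names of groups whose representative region is similar enough, best first."""
    scored = [(tanimoto(compound_features, g.representative_features), g.name)
              for g in library]
    return [name for s, name in sorted(scored, reverse=True) if s >= threshold]


# Hypothetical library with two groups.
library = [
    Group("g1", frozenset({"NH2", "COOH"}),
          [Reaction("rxn_1", ["EC 4.1.1.17"], frozenset({"CH2", "CH2CH2"}))]),
    Group("g2", frozenset({"OH", "CH"}),
          [Reaction("rxn_2", ["EC 1.1.1.1"], frozenset({"CH3"}))]),
]
assigned = assign_groups({"NH2", "COOH", "CH"}, library)
print(assigned)
```

Only g1 is assigned: its representative region shares two of three features with the compound (about 0.67), whereas g2 shares one of four (0.25).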
After identifying reactions in the transformation library involving the same or similar structures as the functional regions of the compound of interest, a metabolite similarity score can be computed. As mentioned above, the reactions from the transformation library are grouped together with the functional regions involved in the reactions. The metabolite similarity score of the compound of interest with respect to one or more reactions in the assigned one or more groups may be computed by comparing the extracted information describing one or more functional regions of the compound of interest with the information describing one or more functional regions of the assigned one or more groups of the transformation library; comparing the extracted information describing one or more linker regions of the compound of interest with the information describing one or more linker regions of the one or more reactions of the assigned one or more groups of the transformation library; and computing a similarity score based on the comparison of the one or more functional regions and the one or more linker regions of the compound of interest with those of the transformation library.
The similarity between the regions of the compound of interest and the regions identified in the transformation library is termed metabolite similarity, and a score representing the similarity obtained by the comparison is called the metabolite similarity score. The reactions of the assigned group(s) of the transformation library that produce a high metabolite similarity score are selected as likely candidate biochemical reactions for the test compound in the screening method.
Computing a metabolite similarity score is illustrated in greater detail as follows. The compound may be represented as a function of two components, the functional region(s) and the linker region(s). In Equation 1 below, A is an input compound, α is the functional region(s) of compound A, and β is the linker region(s) of compound A.
A=f(α,β) <Equation 1>
The metabolite similarity (MS) score between an input compound A and a representative reaction within a group in the transformation library is defined by Equation 2 below:
where Aα is the functional region(s) of compound A, Aβ is the linker region(s) of compound A, α is the representative functional region(s) of a group in the transformation library, β is the linker region(s) of a reaction in the group, T(Aα, α) is the chemical similarity (Tanimoto coefficient) between the functional region(s) of compound A and the representative functional region(s) of the group, T(Aβ, β) is the chemical similarity (Tanimoto coefficient) between the linker region(s) of compound A and the linker region(s) of a reaction in the group, and a1 and a2 are weighting factors for the functional and linker regions, respectively. Similarity may also be assessed through other equivalent metrics such as, but not limited to, root mean square deviation or equivalence overlap for structural similarity; Dice or cosine coefficients for chemical similarity; or feature-based measures.
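One form consistent with the definitions above, assumed here for illustration rather than taken from Equation 2, is a weighted sum of the two Tanimoto terms, MS = a1·T(Aα, α) + a2·T(Aβ, β). The sketch below ranks reactions under that assumption; the weights, reaction identifiers, and similarity values are hypothetical.

```python
def metabolite_similarity(t_functional, t_linker, a1=0.5, a2=0.5):
    """Assumed weighted-sum form: a1 * T(A_alpha, alpha) + a2 * T(A_beta, beta)."""
    return a1 * t_functional + a2 * t_linker


def rank_reactions(similarities, a1=0.5, a2=0.5):
    """Sort reactions by MS score, highest first.

    `similarities` maps a reaction id to (functional, linker) Tanimoto values.
    """
    scored = [(metabolite_similarity(tf, tl, a1, a2), rid)
              for rid, (tf, tl) in similarities.items()]
    return sorted(scored, reverse=True)


# Hypothetical Tanimoto values for two reactions in an assigned group.
similarities = {"rxn_1": (0.9, 0.4), "rxn_2": (0.6, 0.8)}
ranked = rank_reactions(similarities, a1=0.7, a2=0.3)
print(ranked)
```

Weighting the functional region more heavily (a1 greater than a2) favors rxn_1 here even though rxn_2 has the more similar linker region.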
The method involves identifying a set of genes, proteins, and/or metabolites differentially expressed in a cell in response to treating or contacting the cell with the specific compound of interest. The set of differentially expressed genes, proteins, or metabolites may be obtained from a database of one or more gene expression profiles and/or metabolite profiles; experimental data providing one or more gene expression profiles and/or metabolite profiles; or a combination thereof. The database including at least one gene expression profile and/or metabolite profile may be a public database, for example, Gene Expression Omnibus (GEO) by NCBI, ArrayExpress, Stanford Microarray Database, KEGG, PubChem, MetaCyc, ChEBI, PDB, UniProt, GenBank, or Human Metabolome Database (HMDB). The gene expression profile and metabolite profile may be analyzed in cells during or after contact with the specific compound. The set of differentially expressed genes, proteins, or metabolites may be selected using known methods. For example, the gene or metabolite expression profile can be statistically processed to determine which genes or metabolites are differentially expressed to a statistically significant degree, and the set of differentially expressed genes, proteins, or metabolites can be selected according to the results of the statistical analysis.
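As one example of such statistical processing, the sketch below selects genes by combining a log2 fold-change cutoff with a Welch t statistic cutoff. This is an assumed, simplified criterion rather than the method used in the examples; the cutoffs, gene names, and expression values are hypothetical, and expression values are assumed to be positive and on a linear scale.

```python
import math
from statistics import mean, variance


def welch_t(x, y):
    """Welch's t statistic for two samples with possibly unequal variances."""
    se = math.sqrt(variance(x) / len(x) + variance(y) / len(y))
    diff = mean(x) - mean(y)
    if se == 0:
        return math.copysign(math.inf, diff) if diff else 0.0
    return diff / se


def select_degs(treated, control, min_abs_log2fc=1.0, min_abs_t=4.0):
    """Genes passing both the fold-change and the t-statistic cutoffs."""
    degs = []
    for gene, t_vals in treated.items():
        c_vals = control[gene]
        log2fc = math.log2(mean(t_vals) / mean(c_vals))
        if abs(log2fc) >= min_abs_log2fc and abs(welch_t(t_vals, c_vals)) >= min_abs_t:
            degs.append(gene)
    return degs


# Hypothetical replicate measurements (treated versus untreated cells).
treated = {"geneA": [8.1, 7.9, 8.0], "geneB": [5.0, 5.2, 4.8]}
control = {"geneA": [2.0, 2.1, 1.9], "geneB": [5.1, 4.9, 5.0]}
degs = select_degs(treated, control)
print(degs)
```

In practice a p-value with multiple-testing correction would usually replace the fixed t cutoff; the fixed cutoff keeps the sketch free of external libraries.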
Biological pathways associated with the set of differentially expressed genes, proteins, and/or metabolites may be selected based on a value representing a relationship between the set and each of the pathways, wherein the value is computed from a database having information about known biological pathways. The database having information about known biological pathways may be a public database, for example, KEGG, BioCarta pathway, MetaCyc, Database for Annotation, Visualization and Integrated Discovery (DAVID), AmiGO, or PANTHER. The value representing a relationship between the set and each of the pathways may be, for example, a p-value. The p-value statistically represents the degree of association of the set with the corresponding pathway, and a smaller p-value indicates that the input genes, proteins, or metabolites are more significantly enriched in the corresponding pathway.
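An enrichment p-value of this kind is commonly computed as the upper tail of a hypergeometric distribution: the probability of seeing at least k pathway members among n differentially expressed genes when K of N background genes belong to the pathway. The sketch below is a plain hypergeometric tail; it is not DAVID's EASE score, which is a modified Fisher exact test. The function name and the counts are hypothetical.

```python
from math import comb


def enrichment_p_value(k, n, K, N):
    """P(X >= k) for X ~ Hypergeometric(population N, successes K, draws n)."""
    total = comb(N, n)
    upper = min(n, K)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


# Hypothetical counts: 10 background genes, 5 in the pathway,
# 5 differentially expressed genes, all 5 of them in the pathway.
p = enrichment_p_value(k=5, n=5, K=5, N=10)
print(p)
```

Pathways would then be selected by comparing each p-value with a chosen significance cutoff.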
According to an aspect of another exemplary embodiment, a method of screening or predicting biological pathways induced or affected by a specific compound when used to treat cells includes identifying a product transformed from a specific compound (e.g., a metabolite of the specific compound) through the biochemical reaction determined by the method of screening biochemical reactions induced in a cell by treatment with a specific compound; screening biochemical reactions for the identified product according to the method of screening biochemical reactions (e.g., using the product as the compound of interest in the previously-described screening method) to determine the biochemical reaction induced in the cell by treatment with the identified product; selecting the biological pathway from among known biological pathways, wherein the selected pathway comprises the biochemical reaction determined by the method of screening biochemical reactions, followed by the biochemical reaction induced in the cell by treatment with the identified product; and determining the selected biological pathway as a biological pathway induced in the cell by treatment with the specific compound.
Selecting biological pathways including the screened reactions may be performed by using information stored in a database having information about biological pathways, where the database may be a public database, for example, KEGG, BioCarta pathway, MetaCyc, Database for Annotation, Visualization and Integrated Discovery (DAVID), AmiGO, or PANTHER.
In the method of screening biological pathways, the identifying and screening steps may be repeated at least twice.
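The repeated identifying and screening steps can be sketched as a loop that follows each predicted product into the next round of screening and then looks for pathways containing the resulting run of reactions. The sketch assumes a `screen` callable that returns a single (reaction, product) pair or None, and it reads "followed by" as consecutive positions in an ordered pathway; both are simplifying assumptions, and all identifiers are hypothetical.

```python
def build_chain(compound, screen, steps):
    """Screen a compound, then its product, and so on, for up to `steps` rounds."""
    chain, current = [], compound
    for _ in range(steps):
        hit = screen(current)          # (reaction_id, product) or None
        if hit is None:
            break
        reaction_id, current = hit
        chain.append(reaction_id)
    return chain


def pathways_containing(chain, pathway_reactions):
    """Pathways whose ordered reaction list contains `chain` as a contiguous run."""
    n = len(chain)
    matches = []
    for name, reactions in pathway_reactions.items():
        if n and any(reactions[i:i + n] == chain
                     for i in range(len(reactions) - n + 1)):
            matches.append(name)
    return matches


# Toy screening table standing in for the reaction screening method.
toy_screen = {"X": ("r1", "Y"), "Y": ("r2", "Z")}.get
chain = build_chain("X", toy_screen, steps=2)
pathways = {"pw_a": ["r0", "r1", "r2"], "pw_b": ["r1", "r3", "r2"]}
selected = pathways_containing(chain, pathways)
print(chain, selected)
```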
The present invention will be described in further detail with reference to the following examples. These examples are for illustrative purposes only and are not intended to limit the scope of the present invention.
EXAMPLE 1
1.1. Prediction of Reactions in Cells Using a Natural Enzyme Substrate
Reactions in a cell induced by ornithine, which is known to be a natural substrate of ornithine decarboxylase (ODC), were predicted. The structure of ornithine was converted to the formula NCCC[C@H](N)C(O)=O in the simplified molecular-input line-entry system (SMILES). By using a method of selecting biochemical reactions for a compound from among known biochemical reactions, 32 candidate products that are predicted to be produced from ornithine were obtained as shown in Table 1. Specifically, a reaction rule set was prepared by classifying all enzyme reactions occurring in a cell according to the Enzyme Commission number (EC number) thereof and grouping the classified enzyme reactions based on the functional region similarity. Thus, a given reaction rule in the reaction rule set includes a plurality of groups of EC numbers, each of which has a plurality of enzyme reactions. Meanwhile, transformation regions and regions of interest for ornithine were determined from data describing chemical conversions of ornithine. Based on the transformation regions and regions of interest for ornithine, the applicable reaction rules were chosen from among the reaction rules in the reaction rule set. By applying the chosen reaction rules to ornithine, candidate products of ornithine were obtained.
1.2. Prediction of Reactions in Cells for a Synthetic Substrate
Reactions in a cell induced by 1-aminooxy-3-aminopropane, which is a synthetic substrate of ODC, were predicted. The structure of 1-aminooxy-3-aminopropane was converted to the formula NCCCON in SMILES. By using a method of selecting biochemical reactions for a compound, as described in Example 1.1, 13 candidate products predicted to be produced from 1-aminooxy-3-aminopropane were obtained as shown in Table 2.
EXAMPLE 2
2.1. Prediction of Reactions Using a Method of Determining Biochemical Reactions for a Compound
Dapsone is a known substrate of N-acetyltransferase (NAT2) or pyruvate kinase (Gelber, R. et al., The polymorphic acetylation of dapsone in man, Clin. Pharmacol. Ther. 12(2):225-238, 1971; Cho S C et al., DDS, 4,4′-diaminodiphenylsulfone, extends organismic lifespan, PNAS, 2010 Nov. 9; 107(45):19326-31). The structure of dapsone was converted to the formula NC1=CC=C(C=C1)S(=O)(=O)C1=CC=C(N)C=C1 in SMILES. By using a method of selecting biochemical reactions for a compound, as described in Example 1.1, candidate products predicted to be produced from dapsone were obtained as shown in Table 3.
2.2. Selection By Using Experimental Data From Among Predicted Reactions
Genes that are differentially expressed (differentially expressed genes, DEGs) in response to dapsone were experimentally determined. Information about the biochemical pathways in which the DEGs are involved was obtained by a functional annotation tool of bioinformatics resources (DAVID Bioinformatics Resources version 6.7 operated by the National Institute of Allergy and Infectious Diseases (NIAID), Bethesda, Md.; see Huang D W, Sherman B T, Lempicki R A. “Systematic and integrative analysis of large gene lists using DAVID Bioinformatics Resources,” Nature Protoc. 2009; 4(1):44-57; Huang D W, Sherman B T, Lempicki R A. “Bioinformatics enrichment tools: paths toward the comprehensive functional analysis of large gene lists,” Nucleic Acids Res. 2009; 37(1):1-13).
The results imply that, among the reactions predicted in Example 2.1, the ATP creatine N-phosphotransferase reaction is more likely to be induced by dapsone treatment of a cell than the acetyl-CoA:arylamine N-acetyltransferase reaction.
It should be understood that the exemplary embodiments described herein should be considered in a descriptive sense only and not for purposes of limitation. Descriptions of features or aspects within each embodiment should typically be considered as available for other similar features or aspects in other embodiments.
All references, including publications, patent applications, and patents, cited herein are hereby incorporated by reference to the same extent as if each reference were individually and specifically indicated to be incorporated by reference and were set forth in its entirety herein.
The use of the terms “a” and “an” and “the” and “at least one” and similar referents in the context of describing the invention (especially in the context of the following claims) are to be construed to cover both the singular and the plural, unless otherwise indicated herein or clearly contradicted by context. The use of the term “at least one” followed by a list of one or more items (for example, “at least one of A and B”) is to be construed to mean one item selected from the listed items (A or B) or any combination of two or more of the listed items (A and B), unless otherwise indicated herein or clearly contradicted by context. The terms “comprising,” “having,” “including,” and “containing” are to be construed as open-ended terms (i.e., meaning “including, but not limited to,”) unless otherwise noted. Recitation of ranges of values herein are merely intended to serve as a shorthand method of referring individually to each separate value falling within the range, unless otherwise indicated herein, and each separate value is incorporated into the specification as if it were individually recited herein. All methods described herein can be performed in any suitable order unless otherwise indicated herein or otherwise clearly contradicted by context. The use of any and all examples, or exemplary language (e.g., “such as”) provided herein, is intended merely to better illuminate the invention and does not pose a limitation on the scope of the invention unless otherwise claimed. No language in the specification should be construed as indicating any non-claimed element as essential to the practice of the invention.
Preferred embodiments of this invention are described herein, including the best mode known to the inventors for carrying out the invention. Variations of those preferred embodiments may become apparent to those of ordinary skill in the art upon reading the foregoing description. The inventors expect skilled artisans to employ such variations as appropriate, and the inventors intend for the invention to be practiced otherwise than as specifically described herein. Accordingly, this invention includes all modifications and equivalents of the subject matter recited in the claims appended hereto as permitted by applicable law. Moreover, any combination of the above-described elements in all possible variations thereof is encompassed by the invention unless otherwise indicated herein or otherwise clearly contradicted by context.
Number | Date | Country | Kind |
---|---|---|---|
10-2014-0173233 | Dec 2014 | KR | national |