This application claims the benefit of Korean Patent Application No. 10-2012-0076803, filed on Jul. 13, 2012, in the Korean Intellectual Property Office, the entire disclosure of which is hereby incorporated by reference.
1. Field
The present disclosure relates to methods and apparatuses for analyzing gene information, such as a genome of an individual, for treatment selection.
2. Description of the Related Art
The genome is the entire gene information of an organism. Various techniques for sequencing the genome of an individual, such as a DeoxyriboNucleic Acid (DNA) chip, a Next Generation Sequencing (NGS) technique, a Next NGS (NNGS) technique, and so forth, have been developed. Analysis of gene information, such as a nucleic acid sequence and protein, is widely used to find a gene associated with a disease, such as diabetes or cancer, or to identify a correlation between a genetic variation and an individual expression characteristic. In particular, gene information collected from individuals is important for identifying a genetic characteristic of an individual associated with the progression of different symptoms or diseases. Thus, gene information, such as a nucleic acid sequence and protein of an individual, is core data for obtaining current and future disease-related information in order to prevent diseases or to select an optimal therapy at an initial stage of a disease. Techniques for accurately analyzing gene information of individuals by using genome detecting devices, such as a DNA chip and a microarray for detecting Single Nucleotide Polymorphism (SNP), Copy Number Variation (CNV), and so forth, have been researched.
Provided are a method and an apparatus for analyzing gene information, such as the genome of an individual, for treatment selection, as well as a computer-readable recording medium storing a computer-readable program for executing the method.
According to an aspect of the present invention, a method of analyzing gene information for treatment selection includes: acquiring information about a gene network in which genes are classified into a plurality of subgroups based on functional correlations between the genes; extracting gene subgroups that include a gene targeted by at least one drug to be used in treatment from among the plurality of subgroups included in the gene network; and generating at least one index based on gene information included in the extracted subgroups to visualize the extracted subgroups, wherein one or more of the steps of the method are performed using a gene analyzing apparatus.
According to another aspect of the present invention, an apparatus for analyzing gene information for treatment selection includes: a data acquisition unit for acquiring information about a gene network in which genes are classified into a plurality of subgroups based on functional correlations between the genes; a subgroup extracting unit for extracting gene subgroups that include a gene targeted by at least one drug to be used in treatment from among the plurality of subgroups included in the gene network; and an index generating unit for generating at least one index based on gene information included in the extracted subgroups to visualize the extracted subgroups.
These and/or other aspects will become apparent and more readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings in which:
Reference will now be made in detail to the following embodiments, examples of which are illustrated in the accompanying drawings.
In particular, the apparatus 10 may be a processor. This processor may be implemented by an array having a plurality of logic gates or a combination of a microprocessor and a memory storing programs executable by the microprocessor. In addition, it will be understood by those of ordinary skill in the art that the apparatus 10 may also be implemented by another type of hardware.
The apparatus 10 may be used as a device for helping medical practitioners in patient diagnosis and treatment selection by visualizing gene information associated with a gene causing a disease, such as cancer or tumor, from among genome data of an individual in relation to drug use, such as an anticancer drug. In addition, information provided by the apparatus 10 may be used for research, such as the development of new medicines, diagnostic markers, and so forth.
In general, the genome of an individual is all of the gene information that the individual has, and recently, the complete genomes of human beings and other organisms have been revealed following the development of sequencing technologies. Gene information included in the genome, such as a nucleic acid sequence, protein expression, and so forth, is essential for finding out biological action mechanisms. Genome analysis is widely used to understand various biological phenomena, such as the cause of a specific disease such as diabetes or cancer, a genetic variation, an individual expression characteristic, and so forth.
Recently, functional correlations between genes included in the genome have gradually been revealed in genome research, making it possible to analyze a gene network among genes. Such analysis is useful because almost all physiological symptoms occurring in a living organism are due to interactions of several genes rather than a single gene.
Referring to
Even though information about a gene network is known, research on methods of analyzing the gene network in association with various medical treatments, such as drug therapy, has rarely been conducted. In particular, for the case where a prescription of a certain type of anticancer drug is considered, only techniques for measuring an alteration in a single gene or a set of genes of an individual cancer patient (an alteration in a cancer patient's cell relative to a normal cell) have been introduced. However, for the case where a prescription of two or more types of anticancer drugs is considered, techniques for measuring such an alteration while taking correlations between the anticancer drugs into account have not been introduced.
When a prescription of two or more types of anticancer drugs is considered, selecting the anticancer drugs by individually measuring an alteration in a gene set for each type of anticancer drug may be of little use, because the full efficacy of two types of anticancer drugs may be difficult to anticipate when they have the same or similar mechanisms. Thus, when a customized therapy of two or more types of anticancer drugs is considered, it may first be determined whether a genetic alteration of a patient is related to the efficacy of each anticancer drug, and at the same time it may be measured whether the mechanisms of the two or more types of anticancer drugs are similar. In other words, when several anticancer drugs are used, it may be measured whether several kinds of oncogenes are related to the pathways of the several anticancer drugs, and if they are, correlations between the several anticancer drugs may first be identified for the optimal joint use of the anticancer drugs.
Unlike the existing apparatuses for analyzing gene information, the apparatus 10 may index correlations between several oncogenes related to several anticancer drugs in a gene network, numerically analyze the indexes, and provide the numerical result. That is, the apparatus 10 may numerically analyze and provide a relationship between several gene sets (subgroups or subnets) instead of numerically analyzing an alteration in a single gene or a single set of genes as in the existing apparatuses.
An operation and function of the apparatus 10 will now be described in more detail. Referring back to
The subgroup extracting unit 120 extracts subgroups having a gene corresponding to an action of at least one drug to be used from among the plurality of subgroups included in the gene network acquired by the data acquisition unit 110.
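The extraction step described above can be sketched as a simple filter. This is only an illustrative sketch: the function name `extract_subgroups`, the dictionary layouts, and the placeholder gene and drug names are assumptions made for the example and do not come from the specification.

```python
from typing import Dict, List, Set


def extract_subgroups(
    subgroups: Dict[str, Set[str]],
    drug_targets: Dict[str, Set[str]],
    drugs: List[str],
) -> Dict[str, Set[str]]:
    """Keep only the subgroups that contain a gene targeted by a listed drug."""
    # Collect every gene targeted by any drug on the user's list.
    targeted: Set[str] = set()
    for drug in drugs:
        targeted |= drug_targets.get(drug, set())
    # A subgroup is extracted if it shares at least one gene with that set.
    return {name: genes for name, genes in subgroups.items() if genes & targeted}


# Hypothetical gene network with three subgroups and two drugs.
subgroups = {
    "subnet_1": {"g1", "g2", "g3"},
    "subnet_2": {"g4", "g5"},
    "subnet_3": {"g6", "g7"},
}
drug_targets = {"drug_A": {"g2"}, "drug_B": {"g7", "g9"}}

extracted = extract_subgroups(subgroups, drug_targets, ["drug_A", "drug_B"])
print(sorted(extracted))
```

In this sketch a subgroup is selected as soon as it shares one gene with the combined target list; a stricter criterion could be substituted without changing the surrounding flow.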
A user of the apparatus 10, e.g., a medical practitioner, may input a list of anticancer drugs to be prescribed for a certain cancer patient by using the apparatus 10. Alternatively, the user of the apparatus 10 may input a list of drugs to research correlations between subgroups corresponding to certain drugs. Although not shown in
Referring back to
The at least one index generated by the index generating unit 130 includes indexes for evaluating at least one of a genetic alteration level of each of the extracted subgroups, correlations between the extracted subgroups, and the number of genes included in the extracted subgroups.
An index for evaluating a genetic alteration level of each of the extracted subgroups is estimated by the index generating unit 130 based on genetic alteration levels of genes included in the extracted subgroups.
The index for evaluating a genetic alteration level of each of the extracted subgroups may correspond to an index for indicating the extracted subgroups with different colors according to a genetic alteration level of each of the extracted subgroups.
The genetic alteration level of each of the extracted subgroups may be estimated based on the statistical probability that genes having a genetic alteration, from among the genes included in the individual genome, are included in each of the extracted subgroups. This probability may be estimated by using generally known methods such as the Geneset Analysis, Geneset Enrichment Analysis, and Fisher Exact Test.
For example, the index generating unit 130 may generate an index of a genetic alteration level of each of the extracted subgroups by using Equation 1.
In Equation 1, p denotes a probability indicating a genetic alteration level of an extracted subgroup, N denotes the total number of genes in the gene network, k denotes the number of genes having an alteration in a cancer, M denotes the number of genes included in all extracted subgroups, and x denotes the number of genes included in the extracted subgroups from among the genes having an alteration in the cancer.
Equation 1 indicates the probability p that x or more genes having a genetic alteration are included in the extracted subgroups when k genes having a genetic alteration are selected from among the N genes. Equation 1 is known as the Fisher Exact Test.
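As an illustration of the probability just described, the following sketch computes a one-sided hypergeometric tail, which is the usual form of the one-sided Fisher Exact Test. Equation 1 itself is not reproduced in this text, so the exact form used here is an assumption inferred from the definitions of N, k, M, and x; the function name `alteration_p_value` is hypothetical.

```python
from math import comb


def alteration_p_value(N: int, k: int, M: int, x: int) -> float:
    """Probability that x or more of the k altered genes fall in the M subgroup genes.

    N: total number of genes in the gene network
    k: number of genes having an alteration
    M: number of genes included in the extracted subgroups
    x: number of altered genes observed in the extracted subgroups
    """
    upper = min(k, M)  # the overlap cannot exceed either set size
    # Assumed form: upper tail of the hypergeometric distribution.
    tail = sum(comb(M, i) * comb(N - M, k - i) for i in range(x, upper + 1))
    return tail / comb(N, k)


# Toy numbers: 10 genes in the network, 3 altered, 4 in the subgroups.
p_all = alteration_p_value(10, 3, 4, 3)  # all 3 altered genes lie in the subgroups
p_any = alteration_p_value(10, 3, 4, 0)  # 0 or more, which is certain
print(p_all, p_any)
```

A smaller p indicates that the observed overlap is less likely to arise by chance, which under this reading corresponds to a higher genetic alteration level of the subgroup.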
However, it will be understood by those of ordinary skill in the art that the index generating unit 130 may estimate the index for evaluating a genetic alteration level of each of the extracted subgroups by using other similar algorithms as described above, such as the Geneset Analysis and Geneset Enrichment Analysis, instead of Equation 1.
Referring back to
A distance may be calculated using the number of genes functionally connected to each other between the extracted subgroups. In more detail, a distance may be calculated based on a result obtained by comparing the number of genes functionally connected to each other between the extracted subgroups with the number of genes functionally connected to each other between subgroups randomly sampled from the gene network.
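The comparison against randomly sampled subgroups can be sketched as an empirical test. This is a sketch under stated assumptions, not the method of Equation 2: it counts connecting gene pairs between two subgroups and reports the fraction of random subgroup pairs of the same sizes that have at least as many connections. The adjacency-dictionary representation and the names `cross_links` and `empirical_proximity` are hypothetical.

```python
import random
from typing import Dict, Set


def cross_links(network: Dict[str, Set[str]], a: Set[str], b: Set[str]) -> int:
    """Count gene pairs (one in a, one in b) that are functionally connected."""
    return sum(1 for g in a for n in network.get(g, set()) if n in b and n != g)


def empirical_proximity(
    network: Dict[str, Set[str]],
    a: Set[str],
    b: Set[str],
    n_samples: int = 200,
    seed: int = 0,
) -> float:
    """Fraction of random subgroup pairs at least as connected as (a, b).

    A small value suggests the two subgroups are closer than expected by chance.
    """
    rng = random.Random(seed)
    genes = sorted(network)  # sorted for reproducible sampling
    observed = cross_links(network, a, b)
    hits = 0
    for _ in range(n_samples):
        ra = set(rng.sample(genes, len(a)))
        rb = set(rng.sample(genes, len(b)))
        if cross_links(network, ra, rb) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_samples + 1)


# Hypothetical network of 20 genes: A and B fully interconnected, rest a chain.
network: Dict[str, Set[str]] = {f"g{i}": set() for i in range(20)}


def link(u: str, v: str) -> None:
    network[u].add(v)
    network[v].add(u)


A = {"g0", "g1", "g2"}
B = {"g3", "g4", "g5"}
for u in A:
    for v in B:
        link(u, v)
for i in range(5, 19):
    link(f"g{i}", f"g{i + 1}")

links = cross_links(network, A, B)
score = empirical_proximity(network, A, B)
print(links, score)
```

The empirical fraction is used here only because it needs no distributional assumptions; a parametric comparison could be used instead where the random-sampling distribution is known.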
Referring to
By way of further illustration, the distance between the two subgroups may be estimated using Equation 2.
In Equation 2, x denotes the number of genes connected from a subnet A to a subnet B,
Referring to
In Equation 3, êI denotes a distance, |V′| denotes the total number of genes included in a subnet 1 of
In Equation 3, w0, w1, and w2 denote weights. For example, in a relationship between the genes included in the two subgroups, a weight of 2 may be assigned to the genes (e0) commonly included in the two subgroups, a weight of 1 may be assigned to the directly connected genes (e1), and a weight of 0.5 may be assigned to the genes (e2) connected by sharing a single gene. That is, Equation 3 may be used by defining w0=2, w1=1, and w2=0.5. However, it will be understood by those of ordinary skill in the art that these weight values are given only for convenience of description and may be easily modified to suit the usage environment.
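The weighting scheme can be illustrated with the following sketch, which computes only the weighted sum w0·e0 + w1·e1 + w2·e2. Equation 3 is not reproduced in this text, so any normalization it applies (for example by subnet size) is deliberately omitted. The way e1 and e2 are counted here (as gene pairs, with e2 restricted to pairs linked through one gene outside both subgroups) and the name `weighted_link_score` are assumptions made for the example.

```python
from typing import Dict, Set


def weighted_link_score(
    network: Dict[str, Set[str]],
    a: Set[str],
    b: Set[str],
    w0: float = 2.0,
    w1: float = 1.0,
    w2: float = 0.5,
) -> float:
    """Weighted count of relationships between two subgroups (unnormalized)."""
    shared = a & b
    only_a, only_b = a - shared, b - shared
    outside = set(network) - (a | b)  # genes belonging to neither subgroup

    e0 = len(shared)  # genes commonly included in both subgroups
    e1 = 0            # directly connected pairs (assumed counting rule)
    e2 = 0            # pairs connected via one outside gene (assumed counting rule)
    for ga in only_a:
        neighbours_a = network.get(ga, set())
        for gb in only_b:
            if gb in neighbours_a:
                e1 += 1
            elif neighbours_a & network.get(gb, set()) & outside:
                e2 += 1
    return w0 * e0 + w1 * e1 + w2 * e2


# Hypothetical network: s is shared, a1-b1 are directly linked,
# and a2-b2 are linked only through the outside gene m.
network = {
    "s": set(),
    "a1": {"b1"},
    "b1": {"a1"},
    "a2": {"m"},
    "m": {"a2", "b2"},
    "b2": {"m"},
}
subnet_a = {"s", "a1", "a2"}
subnet_b = {"s", "b1", "b2"}

result = weighted_link_score(network, subnet_a, subnet_b)
print(result)  # 2*1 + 1*1 + 0.5*1 = 3.5
```

Keeping the weights as parameters reflects the statement above that the values 2, 1, and 0.5 are examples that may be changed.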
Referring to
Through the illustrations of
Although estimation of distances is illustrated in the current embodiment as described with reference to
In addition, although only the number of genes connected to each other by sharing a single gene (i.e., genes connected to each other by way of a single intervening gene) existing outside subgroups is used in
Referring back to
The visualization processor 140 of
According to another embodiment, the visualization processor 140 may process the visualization in the context of the entire gene network from which the subgroups have been extracted (e.g.,
A result processed by the visualization processor 140 may be output through a user interface unit (not shown), such as a display screen, and provided to a user, such as a therapist.
In operation 801, the data acquisition unit 110 acquires information about a gene network in which genes included in an individual genome are classified into a plurality of subgroups according to functional correlations between the genes.
In operation 802, the subgroup extracting unit 120 extracts subgroups having a gene corresponding to an action of at least one drug to be used from among the plurality of subgroups included in the gene network acquired by the data acquisition unit 110.
In operation 803, the index generating unit 130 generates at least one index based on gene information included in the subgroups extracted by the subgroup extracting unit 120 to visualize the extracted subgroups.
As described above, according to one or more of the above embodiments of the present invention, information about a gene group causing a disease (e.g., cancer) from among a gene network of a genome of an individual may be visualized with regard to a drug therapy to help a therapist select an effective treatment. In addition, information about gene groups having a genetic alteration, information about correlations between gene groups, and so forth may be provided for an individual patient to help a therapist write an effective prescription. Furthermore, the information may also be used for genetic alteration research, such as development of new medicines, diagnostic markers, and so forth.
The embodiments of the present invention can be written as computer programs and can be implemented in general-use digital computers that execute the programs using a computer-readable recording medium. In addition, a structure of data used in the embodiments of the present invention may be recorded on the computer-readable recording medium through various means. Examples of the computer-readable recording medium include storage media such as magnetic storage media (e.g., ROM, floppy disks, hard disks, etc.) and optical recording media (e.g., CD-ROMs or DVDs).
In addition, other embodiments of the present invention can also be implemented through computer readable code/instructions in/on a medium, e.g., a computer readable medium, to control at least one processing element to implement any above described embodiment. The medium can correspond to any medium/media permitting the storage and/or transmission of the computer readable code.
The computer readable code can be recorded/transferred on a medium in a variety of ways, with examples of the medium including recording media, such as magnetic storage media (e.g., ROM, floppy disks, hard disks, etc.) and optical recording media (e.g., CD-ROMs, or DVDs), and transmission media such as Internet transmission media. Thus, the medium may be such a defined and measurable structure including or carrying a signal or information, such as a device carrying a bitstream according to one or more embodiments of the present invention. The media may also be a distributed network, so that the computer readable code is stored/transferred and executed in a distributed fashion. Furthermore, the processing element could include a processor or a computer processor, and processing elements may be distributed and/or included in a single device.
All references, including publications, patent applications, and patents, cited herein are hereby incorporated by reference to the same extent as if each reference were individually and specifically indicated to be incorporated by reference and were set forth in its entirety herein.
The use of the terms “a” and “an” and “the” and “at least one” and similar referents in the context of describing the invention (especially in the context of the following claims) are to be construed to cover both the singular and the plural, unless otherwise indicated herein or clearly contradicted by context. The use of the term “at least one” followed by a list of one or more items (for example, “at least one of A and B”) is to be construed to mean one item selected from the listed items (A or B) or any combination of two or more of the listed items (A and B), unless otherwise indicated herein or clearly contradicted by context. The terms “comprising,” “having,” “including,” and “containing” are to be construed as open-ended terms (i.e., meaning “including, but not limited to,”) unless otherwise noted. Recitation of ranges of values herein are merely intended to serve as a shorthand method of referring individually to each separate value falling within the range, unless otherwise indicated herein, and each separate value is incorporated into the specification as if it were individually recited herein. All methods described herein can be performed in any suitable order unless otherwise indicated herein or otherwise clearly contradicted by context. The use of any and all examples, or exemplary language (e.g., “such as”) provided herein, is intended merely to better illuminate the invention and does not pose a limitation on the scope of the invention unless otherwise claimed. No language in the specification should be construed as indicating any non-claimed element as essential to the practice of the invention.
Preferred embodiments of this invention are described herein, including the best mode known to the inventors for carrying out the invention. Variations of those preferred embodiments may become apparent to those of ordinary skill in the art upon reading the foregoing description. The inventors expect skilled artisans to employ such variations as appropriate, and the inventors intend for the invention to be practiced otherwise than as specifically described herein. Accordingly, this invention includes all modifications and equivalents of the subject matter recited in the claims appended hereto as permitted by applicable law. Moreover, any combination of the above-described elements in all possible variations thereof is encompassed by the invention unless otherwise indicated herein or otherwise clearly contradicted by context.
Number | Date | Country | Kind |
---|---|---|---|
10-2012-0076803 | Jul 2012 | KR | national |