The present invention relates to a device and a method for predicting the risk for a disease combined with the genetic risk for an associated phenotype, and more particularly, to a device and a method for predicting the risk for a disease.
Single nucleotide polymorphism (SNP) markers of genes related to various diseases or phenotypes have been found through genome sequencing, genome-wide association studies (GWAS), and the like.
Accordingly, various methods have been reported for predicting the possibility that a subject to be tested will develop disease and non-disease phenotypes by using genetic markers reported in existing studies (Reference: Schrodi S J et al. Front. Genet. 2014), and domestic and overseas genetic testing agencies have tested the genetic risks for diseases and phenotypes by using similar methods.
However, except for genetic diseases that are caused entirely by genetic effects, the occurrence of most common diseases (cancer, diabetes, etc.) is influenced by lifestyle and by the presence or absence of other diseases in addition to genetic factors. Therefore, to accurately predict the possibility of disease occurrence, a comprehensive analysis of lifestyle and environmental factors is important in addition to the genetic factors. Recently, studies on disease prediction and diagnosis that consider both genetic and non-genetic factors have been reported, and the paper by Goldstein B A et al. (Front. Genet. 2014) relates to a method for combining a clinical risk score and a genetic risk score for coronary heart disease using a log-link function. However, this method is applicable because a Framingham risk score (considering gender, age, cholesterol level, blood pressure, diabetes, smoking status, etc.) for clinically measuring and judging the risk of coronary heart disease has been pre-established, and it has the disadvantage that the values required for calculating the Framingham risk score must be measured with medical equipment, for example during a health checkup or medical examination.
In order to overcome this disadvantage, Korean Patent Publication No. 2019-0077997 discloses a method for predicting the possibility of occurrence of a disease and a phenotype by collecting lifestyle and environmental factors affecting the disease occurrence through a clinical questionnaire and combining the collected factors with genetic factors.
However, collecting the phenotypes associated with the disease (lifestyle, environmental factors, and presence or absence of other related diseases) through a clinical questionnaire has disadvantages: the subjects to be tested may answer the questionnaire items according to their own subjective judgment criteria, which causes inaccurate results, and the analysis is impossible when the subjects do not answer the clinical questionnaire. Therefore, it is necessary to develop a method that compensates for the shortcomings of existing methods of predicting a disease by combining genetic factors and related phenotype factors for the disease.
An object of the present invention is to provide a device and a method for predicting the risk for a disease combined with the genetic risk for an associated phenotype by obtaining a disease genetic risk for a disease and a phenotypic genetic risk for a phenotype (related diseases, lifestyle, etc.) associated with the disease based on information on a genotype of a user and obtaining a comprehensive genetic risk for the disease based on the disease genetic risk and the phenotypic genetic risk.
One aspect of the present invention provides a device for predicting the risk of a disease combined with the genetic risk for an associated phenotype, the device including: a disease risk obtaining unit configured to obtain a disease genetic risk for the disease based on genetic marker information associated with the disease occurrence and information on a genotype of a user; a phenotypic risk obtaining unit configured to obtain a phenotypic genetic risk for a phenotype based on the genetic marker information on the phenotype associated with the disease occurrence and the information on the genotype of the user; and a comprehensive risk obtaining unit configured to obtain a comprehensive genetic risk for the disease based on the disease genetic risk obtained by the disease risk obtaining unit and the phenotypic genetic risk obtained by the phenotypic risk obtaining unit.
The comprehensive risk obtaining unit may obtain the comprehensive genetic risk based on the disease genetic risk and the phenotypic genetic risk by using a ratio of genetic factors affecting the disease occurrence.
The disease risk obtaining unit may convert the obtained disease genetic risk to a relative value as compared with disease genetic risks of other users in an affiliated group of the user, the phenotypic risk obtaining unit may convert the obtained phenotypic genetic risk to a relative value as compared with phenotypic genetic risks of other users in an affiliated group of the user, and the comprehensive risk obtaining unit may obtain the comprehensive genetic risk based on the disease genetic risk converted to the relative value and the phenotypic genetic risk converted to the relative value and convert the obtained comprehensive genetic risk to a relative value as compared with comprehensive genetic risks of other users in the affiliated group of the user.
The phenotypic risk obtaining unit may obtain the phenotypic genetic risk based on the genetic marker information on the phenotype and the information on the genotype of the user by using a predefined effective size by state of the phenotype.
The phenotypic risk obtaining unit may obtain a genetic risk based on the genetic marker information on the phenotype and the information on the genotype of the user, convert the obtained genetic risk to a relative value as compared with the genetic risks for the phenotypes of other users in the affiliated group of the user, obtain the state of the phenotype based on the genetic risk converted to the relative value, and determine an effective size corresponding to the obtained state as the phenotypic genetic risk.
When there is a plurality of phenotypes associated with the disease occurrence, the phenotypic risk obtaining unit may obtain a genetic risk for each of the plurality of phenotypes and may obtain a phenotypic genetic risk based on the obtained plurality of genetic risks.
Another aspect of the present invention provides a method for predicting the risk of a disease combined with the genetic risk for an associated phenotype, the method including: obtaining a disease genetic risk for the disease based on genetic marker information associated with the disease occurrence and information on a genotype of a user; obtaining a phenotypic genetic risk for a phenotype based on the genetic marker information on the phenotype associated with the disease occurrence and the information on the genotype of the user; and obtaining a comprehensive genetic risk for the disease based on the obtained disease genetic risk and the obtained phenotypic genetic risk.
The obtaining of the comprehensive genetic risk may be performed by obtaining the comprehensive genetic risk based on the disease genetic risk and the phenotypic genetic risk by using a ratio of genetic factors affecting the disease occurrence.
The obtaining of the disease genetic risk may be performed by converting the obtained disease genetic risk to a relative value as compared with disease genetic risks of other users in an affiliated group of the user, the obtaining of the phenotypic genetic risk may be performed by converting the obtained phenotypic genetic risk to a relative value as compared with phenotypic genetic risks of other users in an affiliated group of the user, and the obtaining of the comprehensive genetic risk may be performed by obtaining the comprehensive genetic risk based on the disease genetic risk converted to the relative value and the phenotypic genetic risk converted to the relative value and converting the obtained comprehensive genetic risk to a relative value as compared with comprehensive genetic risks of other users in the affiliated group of the user.
The obtaining of the phenotypic genetic risk may be performed by obtaining the phenotypic genetic risk based on the genetic marker information on the phenotype and the information on the genotype of the user by using a predefined effective size by state of the phenotype.
The obtaining of the phenotypic genetic risk may be performed by obtaining a genetic risk based on the genetic marker information on the phenotype and the information on the genotype of the user, converting the obtained genetic risk to a relative value as compared with the genetic risks for the phenotypes of other users in the affiliated group of the user, obtaining the state of the phenotype based on the genetic risk converted to the relative value, and determining an effective size corresponding to the obtained state as the phenotypic genetic risk.
The obtaining of the phenotypic genetic risk may be performed by obtaining a genetic risk for each of the plurality of phenotypes and obtaining a phenotypic genetic risk based on the obtained plurality of genetic risks, when there is a plurality of phenotypes associated with the disease occurrence.
Yet another aspect of the present invention provides a computer program which is stored in a computer readable recording medium to execute any one method for predicting the risk of the disease combined with the genetic risk for the associated phenotype in a computer.
According to the device and the method for predicting the risk for the disease combined with the genetic risk for the associated phenotype of the present invention, a comprehensive genetic risk for the disease is obtained based on the genetic risk for the disease and the genetic risk for the phenotype associated with the disease occurrence, and thus, more objective and accurate disease prediction is possible.
Hereinafter, preferred exemplary embodiments of a device and a method for predicting the risk for a disease combined with the genetic risk for an associated phenotype according to the present invention will be described in detail with reference to the accompanying drawings.
First, a device for predicting the risk of a disease combined with the genetic risk of an associated phenotype according to a preferred exemplary embodiment of the present invention will be described with reference to
Referring to
To this end, the disease risk prediction device 100 may include a storage unit 110, a disease risk obtaining unit 130, a phenotypic risk obtaining unit 150, and a comprehensive risk obtaining unit 170.
The storage unit 110 serves to store programs and data required for the operation of the disease risk prediction device 100, and may be divided into a program area and a data area.
The program area may store a program for controlling the overall operation of the disease risk prediction device 100, an operating system (OS) for booting the disease risk prediction device 100, an application program required for the operation of the disease risk prediction device 100 such as obtainment of the disease genetic risk, obtainment of the phenotypic genetic risk, and obtainment of the comprehensive genetic risk, and the like.
The data area is an area in which data generated according to the use of the disease risk prediction device 100 is stored, and may store genetic marker information associated with the disease occurrence, information on a phenotype associated with the disease occurrence, genetic marker information on the phenotype, information on an effective size by state of the phenotype, genetic risk information (disease genetic risk, phenotypic genetic risk, comprehensive genetic risk, etc.) of a user for the disease, personal information of the user, and the like.
The disease risk obtaining unit 130 obtains the disease genetic risk for the disease based on the genetic marker information associated with the disease occurrence and the information on the genotype of the user.
Here, the genetic marker information associated with the disease occurrence consists of genetic markers associated with the disease occurrence and is divided for each disease and stored in the storage unit 110. For example, when the disease is “liver cancer”, genetic marker information associated with the occurrence of the liver cancer is shown in Table 1.
For example, the disease risk obtaining unit 130 may obtain the disease genetic risk as shown in Table 2 below based on the genetic marker information associated with the occurrence of the disease (liver cancer) shown in [Table 1] above and the information on the genotype of the user.
Here, the disease genetic risk is calculated using a genetic risk score method that arithmetically adds the number of risk factors, but the present invention is not limited thereto, and according to an exemplary embodiment, various risk calculation methods may be used, such as a genetic risk score, a weighted genetic risk score, a machine learning method, and a linear regression method. In addition, the disease risk obtaining unit 130 may convert the obtained disease genetic risk to a relative value as compared with disease genetic risks of other users in an affiliated group (e.g., the same country, the same residence area, the same age range, etc.) of the user. The disease genetic risk is converted to the relative value in order to normalize it (0 to 1) to a relative ranking, because the absolute value of the genetic risk score may change greatly according to the number of genetic markers used, and the calculation methods of the disease genetic risk and the phenotypic genetic risk may differ from each other.
For example, the disease risk obtaining unit 130 may convert the obtained disease genetic risk to a percentage (e.g., the top 33%) representing a relative position as compared with the disease genetic risks of other users.
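The score calculation and the conversion to a relative value described above can be sketched as follows. This is a minimal illustration, not the embodiment itself: the marker identifiers, risk alleles, genotypes, and group scores are hypothetical placeholders rather than values from the tables, and the function names `genetic_risk_score` and `top_fraction` are assumptions introduced for the example. The relative value is expressed as the fraction of the affiliated group scoring at least as high as the user, so a small value corresponds to a position near the top.

```python
from typing import Dict, List


def genetic_risk_score(genotypes: Dict[str, str], risk_alleles: Dict[str, str]) -> int:
    """Unweighted score: count the risk alleles the user carries across the markers."""
    score = 0
    for marker, risk_allele in risk_alleles.items():
        genotype = genotypes.get(marker, "")  # a missing marker contributes 0
        score += genotype.count(risk_allele)
    return score


def top_fraction(user_score: float, group_scores: List[float]) -> float:
    """Fraction (0 to 1) of the group whose score is at least the user's score."""
    at_or_above = sum(1 for s in group_scores if s >= user_score)
    return at_or_above / len(group_scores)


# Hypothetical markers and risk alleles (placeholders, not taken from Table 1).
risk_alleles = {"rs_example_1": "A", "rs_example_2": "G", "rs_example_3": "T"}
# Hypothetical genotype of the user for the same markers.
user = {"rs_example_1": "AA", "rs_example_2": "AG", "rs_example_3": "CC"}

score = genetic_risk_score(user, risk_alleles)  # 2 + 1 + 0 risk alleles
# Hypothetical scores of the affiliated group, including the user.
group = [0, 1, 1, 2, 2, 3, 3, 4, 5, 6]
relative = top_fraction(score, group)
print(score, relative)  # 3 0.5
```

A weighted genetic risk score would replace the allele count with a sum of per-marker weights; the ranking step would stay the same.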
The phenotypic risk obtaining unit 150 obtains the phenotypic genetic risk for the phenotype based on the genetic marker information on the phenotype associated with the disease occurrence and the information on the genotype of the user.
Here, the information on the phenotype associated with the disease occurrence includes the phenotypes associated with the disease occurrence, genetic marker information on each phenotype, information on an effective size by state of the phenotype, and the like, and is divided for each disease and stored in the storage unit 110. For example, the information on the phenotype related to the occurrence of the liver cancer is shown in [Table 3] below, and the genetic marker information for the phenotypes shown in [Table 3] is as shown in Table 4 to Table 9 below.
At this time, when there is a plurality of phenotypes associated with the disease occurrence, the phenotypic risk obtaining unit 150 may obtain a genetic risk for each of the plurality of phenotypes and obtain phenotypic genetic risks based on the obtained plurality of genetic risks. For example, when the disease is liver cancer, as shown in [Table 3], since the number of phenotypes associated with the occurrence of the liver cancer is six, the phenotypic risk obtaining unit 150 may obtain a genetic risk for each of the six phenotypes and obtain phenotypic genetic risks based on the obtained six genetic risks. In addition, the phenotypic risk obtaining unit 150 may convert the obtained phenotypic genetic risk to a relative value as compared with phenotypic genetic risks of other users in an affiliated group (e.g., the same country, the same residence area, the same age range, etc.) of the user. At this time, the reason for converting the phenotypic genetic risk to the relative value is to normalize (0 to 1) the phenotypic genetic risk to a relative ranking because an absolute value of the genetic risk score may be greatly changed according to the number of genetic markers to be used, and the calculation methods of the disease genetic risk and the phenotypic genetic risk may be different from each other. Here, when there is a plurality of phenotypes associated with the disease occurrence, the phenotypic risk obtaining unit 150 may convert the genetic risks obtained for the plurality of phenotypes to relative values as compared with the corresponding genetic risks of other users, respectively, and obtain phenotypic genetic risks based on the plurality of genetic risks converted to the relative values.
More specifically, the phenotypic risk obtaining unit 150 may obtain the phenotypic genetic risk based on the genetic marker information on the phenotype and the information on the genotype of the user by using a predefined effective size by state of the phenotype.
That is, the phenotypic risk obtaining unit 150 may obtain a genetic risk based on the genetic marker information on the phenotype and the information on the genotype of the user, convert the obtained genetic risk to the relative value as compared with the genetic risks for the phenotypes of other users in the affiliated group of the user, obtain the state of the phenotype based on the genetic risk converted to the relative value, and determine an effective size corresponding to the obtained state as the phenotypic genetic risk.
For example, when the disease is liver cancer, the phenotypic risk obtaining unit 150 may obtain the phenotypic genetic risk as shown in Table 10 below based on the information on the genotype of the user and Table 3 to Table 9 above.
Here, the phenotypic genetic risk is calculated using a genetic risk score method that arithmetically adds the number of risk factors, but the present invention is not limited thereto, and according to an exemplary embodiment, various risk calculation methods may be used, such as a genetic risk score, a weighted genetic risk score, a machine learning method, a linear regression method, and the like.
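The steps of determining a phenotype state from the relative value and looking up the effective size for that state can be sketched as follows. Every concrete value here is an assumption made for illustration: the state thresholds, the state names, the phenotype names, and the effective sizes are hypothetical and are not taken from Table 3 to Table 10. The description also does not specify how the values for a plurality of phenotypes are combined, so the arithmetic mean used below is only one possible choice.

```python
from statistics import mean
from typing import Dict

# Hypothetical effective sizes by state for two placeholder phenotypes.
EFFECT_SIZE: Dict[str, Dict[str, float]] = {
    "phenotype_a": {"high": 2.0, "medium": 1.0, "low": 0.5},
    "phenotype_b": {"high": 1.5, "medium": 1.0, "low": 0.8},
}


def phenotype_state(top_frac: float) -> str:
    """Map a relative value (top fraction, 0 to 1) to a state.

    The 0.25 and 0.75 cut-offs are assumptions chosen for this sketch.
    """
    if top_frac <= 0.25:
        return "high"
    if top_frac <= 0.75:
        return "medium"
    return "low"


def phenotypic_genetic_risk(top_fractions: Dict[str, float]) -> float:
    """Look up the effective size for each phenotype's state and combine them.

    Combining by arithmetic mean is an assumption, not the stated method.
    """
    sizes = [
        EFFECT_SIZE[name][phenotype_state(frac)]
        for name, frac in top_fractions.items()
    ]
    return mean(sizes)


# Hypothetical relative values of the user for each phenotype.
user_top_fractions = {"phenotype_a": 0.10, "phenotype_b": 0.60}
risk = phenotypic_genetic_risk(user_top_fractions)
print(risk)  # 1.5
```

As described above, the resulting phenotypic genetic risk may then itself be converted to a relative value within the affiliated group before it is combined with the disease genetic risk.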
The comprehensive risk obtaining unit 170 may obtain a comprehensive genetic risk for the disease based on the disease genetic risk obtained by the disease risk obtaining unit 130 and the phenotypic genetic risk obtained by the phenotypic risk obtaining unit 150.
At this time, the comprehensive risk obtaining unit 170 may obtain the comprehensive genetic risk based on the disease genetic risk converted to the relative value and the phenotypic genetic risk converted to the relative value.
More specifically, the comprehensive risk obtaining unit 170 may obtain the comprehensive genetic risk based on the disease genetic risk and the phenotypic genetic risk by using a ratio of genetic factors affecting the disease occurrence.
Here, the ratio of genetic factors affecting the disease occurrence represents a proportion at which the genetic factors affect the occurrence of a specific disease. For example, when the disease is “liver cancer” and the proportion of the genetic factors affecting the occurrence of the liver cancer is “61%”, the ratio of the genetic factors may be “0.61.”
For example, the comprehensive risk obtaining unit 170 may obtain the comprehensive genetic risk for the disease through the following Equation 1.
Comprehensive genetic risk=a*disease genetic risk+(1-a)*phenotypic genetic risk [Equation 1]
Here, a represents a ratio of genetic factors that affect the disease occurrence. The disease genetic risk and the phenotypic genetic risk may also use values converted to the relative values.
For example, when the disease is “liver cancer”, the disease genetic risk converted to the relative value is “the top 33%”, and the phenotypic genetic risk converted to the relative value is “the top 7%”, the comprehensive genetic risk for the liver cancer is as shown in Equation 2 below.
0.61*0.33+(1-0.61)*0.07=0.23 [Equation 2]
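As a check on the arithmetic, Equation 1 can be written as a short function and evaluated with the liver cancer values used above (a = 0.61, disease genetic risk 0.33, phenotypic genetic risk 0.07). The function name and the range check on a are assumptions added for the sketch.

```python
def comprehensive_genetic_risk(a: float, disease_risk: float, phenotypic_risk: float) -> float:
    """Equation 1: weighted sum of the disease and phenotypic genetic risks.

    `a` is the ratio of genetic factors affecting the disease occurrence.
    """
    if not 0.0 <= a <= 1.0:
        raise ValueError("a must be between 0 and 1")
    return a * disease_risk + (1.0 - a) * phenotypic_risk


# Values from the liver cancer example: 0.2013 + 0.0273 = 0.2286
value = comprehensive_genetic_risk(0.61, 0.33, 0.07)
print(round(value, 2))  # 0.23
```

Because both inputs are relative values between 0 and 1 and the weights sum to 1, the result also lies between 0 and 1.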
In addition, the comprehensive risk obtaining unit 170 may convert the obtained comprehensive genetic risk to a relative value as compared with comprehensive genetic risks of other users in the affiliated group of the user.
For example, the comprehensive risk obtaining unit 170 may convert the obtained comprehensive genetic risk to a relative ranking as compared with the comprehensive genetic risks of other users and grade the disease risk as follows based on the obtained relative ranking.
Top 1% to 25%: Disease risk grade “high”
Top 26% to 50%: Disease risk grade “mid-high”
Top 51% to 75%: Disease risk grade “mid-low”
Top 76% to 100%: Disease risk grade “low”
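The grading above can be sketched as a simple lookup. The function name is hypothetical, and the handling of non-integer rankings (for example, treating 25.5% as "mid-high") is an assumption, since the ranges above are stated only for whole percentages.

```python
def disease_risk_grade(top_percent: float) -> str:
    """Map a relative ranking (top N%, 0 < N <= 100) to a disease risk grade."""
    if not 0 < top_percent <= 100:
        raise ValueError("top_percent must be in the range (0, 100]")
    if top_percent <= 25:
        return "high"
    if top_percent <= 50:
        return "mid-high"
    if top_percent <= 75:
        return "mid-low"
    return "low"


print(disease_risk_grade(7))   # high
print(disease_risk_grade(26))  # mid-high
print(disease_risk_grade(76))  # low
```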
Then, a method for predicting the risk of a disease combined with the genetic risk of an associated phenotype according to a preferred exemplary embodiment of the present invention will be described with reference to
Referring to
Next, the disease risk prediction device 100 obtains a phenotypic genetic risk for a phenotype based on genetic marker information on a phenotype associated with the disease occurrence and the information on the genotype of the user (S130).
At this time, when there is a plurality of phenotypes associated with the disease occurrence, the disease risk prediction device 100 may obtain a genetic risk for each of the plurality of phenotypes and obtain phenotypic genetic risks based on the obtained plurality of genetic risks.
In addition, the disease risk prediction device 100 may convert the obtained phenotypic genetic risk to a relative value as compared with phenotypic genetic risks of other users in an affiliated group (e.g., the same country, the same residence area, the same age range, etc.) of the user. Here, when there is a plurality of phenotypes related to the disease occurrence, the disease risk prediction device 100 may convert the genetic risks obtained for the plurality of phenotypes to relative values as compared with the corresponding genetic risks of other users, respectively, and obtain the phenotypic genetic risks based on the plurality of genetic risks converted to the relative values.
More specifically, the disease risk prediction device 100 may obtain the phenotypic genetic risk based on the genetic marker information on the phenotype and the information on the genotype of the user by using a predefined effective size by state of the phenotype. That is, the disease risk prediction device 100 may obtain a genetic risk based on the genetic marker information on the phenotype and the information on the genotype of the user, convert the obtained genetic risk to the relative value as compared with the genetic risks for the phenotypes of other users in the affiliated group of the user, obtain the state of the phenotype based on the genetic risk converted to the relative value, and determine an effective size corresponding to the obtained state as the phenotypic genetic risk.
Thereafter, the disease risk prediction device 100 may obtain a comprehensive genetic risk for the disease based on the obtained disease genetic risk and the obtained phenotypic genetic risk (S150).
At this time, the disease risk prediction device 100 may obtain the comprehensive genetic risk based on the disease genetic risk converted to the relative value and the phenotypic genetic risk converted to the relative value.
More specifically, the disease risk prediction device 100 may obtain the comprehensive genetic risk based on the disease genetic risk and the phenotypic genetic risk by using a ratio of genetic factors affecting the disease occurrence.
In addition, the disease risk prediction device 100 may convert the obtained comprehensive genetic risk to a relative value as compared with comprehensive genetic risks of other users in the affiliated group of the user.
The present invention can be implemented as computer readable codes in a computer readable recording medium. The computer readable recording medium includes all kinds of recording devices for storing data which may be read by a computer. Examples of the computer readable recording medium include a ROM, a RAM, a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, and the like.
While the preferred exemplary embodiments of the present invention have been illustrated and described above, the present invention is not limited to the aforementioned specific preferred exemplary embodiments, various modifications may be made by a person with ordinary skill in the technical field to which the present invention pertains without departing from the subject matters of the present invention that are claimed in the claims, and these modifications are included in the scope of the claims.
100: Disease risk prediction device
110: Storage unit
130: Disease risk obtaining unit
150: Phenotypic risk obtaining unit
170: Comprehensive risk obtaining unit
Number | Date | Country | Kind |
---|---|---|---|
10-2019-0096589 | Aug 2019 | KR | national |
The present application is the 371 National Stage Application of International Patent Application Serial No. PCT/KR2019/010258, filed Aug. 13, 2019, which claims the benefit of Korean Patent Application No. 10-2019-0096589 filed on Aug. 8, 2019, the entire disclosures of which are incorporated herein by reference.
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/KR2019/010258 | 8/13/2019 | WO |