The present invention relates to means for diagnosing, prognosing, treating and/or preventing obesity and/or type II diabetes in humans.
More precisely, the present invention provides means and methods for risk assessment and/or diagnosis and/or prognosis of obesity and/or type II diabetes in humans, based on the detection of nucleic acid biomarkers belonging to, or associated with, a set of SNPs (for “single nucleotide polymorphisms”) in the fatso (FTO) gene.
The present invention also provides means and methods for identifying a SNP haplotype associated with obesity and/or type II diabetes susceptibility in humans, as well as for selecting pharmaceutical agents useful in prevention and/or treatment of obesity and/or type II diabetes in humans.
Obesity is a condition in which the natural energy reserve, stored in the fatty tissue of humans and other mammals, is increased to a point where it is associated with certain health conditions or increased mortality.
Obesity is both an individual clinical condition and, increasingly, a serious public health problem. Excessive body weight is now commonly known to predispose to various diseases, particularly cardiovascular diseases, sleep apnea, osteoarthritis, and diabetes (mellitus) type II. More precisely, obesity, especially central obesity (male-type or waist-predominant obesity), is an important risk factor for the “metabolic syndrome” (“syndrome X”), the clustering of a number of diseases and risk factors that heavily predispose to cardiovascular diseases. These risk factors are diabetes (mellitus) type II, high blood pressure, and high blood cholesterol and triglyceride levels (combined hyperlipidemia). An inflammatory state is also present which, together with the above, has been implicated in the high prevalence of atherosclerosis, and a prothrombotic state may further worsen cardiovascular risk.
In the clinical setting, obesity is typically evaluated by measuring BMI (for “body mass index”), waist circumference, and evaluating the presence of risk factors and comorbidities. In epidemiological studies, BMI alone is used as an indicator of prevalence and incidence of obesity. BMI is calculated by dividing the subject's weight in kilograms by the square of his/her height in metres:
BMI = weight (kg) / [height (m)]^2 or BMI = weight (lbs.) × 703 / [height (inches)]^2
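As a minimal sketch of the calculation above, the two forms of the formula can be written as follows. The function names are illustrative only and do not appear in the original text.

```python
def bmi_metric(weight_kg, height_m):
    """BMI from weight in kilograms and height in metres."""
    return weight_kg / height_m ** 2


def bmi_imperial(weight_lb, height_in):
    """BMI from weight in pounds and height in inches (conversion factor 703)."""
    return weight_lb * 703 / height_in ** 2


if __name__ == "__main__":
    # The same person expressed in both unit systems gives nearly the same BMI.
    print(round(bmi_metric(70.0, 1.75), 2))
    print(round(bmi_imperial(154.32, 68.9), 2))
```

The factor 703 is a rounded unit conversion, so the two forms agree only to within a small rounding difference.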
Generally, it is considered that:
Factors that have been suggested to contribute to the development of obesity include, not only overeating, but also:
Obesity is generally considered to result from a combination of genetic and non-genetic factors. In this respect, the causative gene(s) is(are) still to be identified.
Today, obesity is seen as the biggest health problem facing developed and emerging countries.
Among all the means that have been made available for combating obesity, bariatric surgery is being increasingly used. This technique consists of placing a silicone ring around the top of the stomach to help restrict the amount of food eaten in a sitting. Other, more invasive surgical techniques that cut into or reroute part of the digestive tract have also been used. However, all of these surgeries come with risk to the patient, and they guarantee neither successful weight loss nor reduced morbidity and mortality.
As a consequence, there is a need in the art for new drugs that would be really efficient for combating obesity. In this regard, identifying the gene(s) that is(are) involved in obesity onset, and thus that is(are) promising candidate therapeutic target(s), is one of the more crucial concerns of scientists and medical staffs.
It is precisely this need that the present invention aims at satisfying, by disclosing the most significant association reported so far between a genetic factor and obesity. Indeed, the present invention is based on the finding that several SNPs (for “single nucleotide polymorphisms”) in the fatso (FTO) locus are highly and significantly associated with early-onset and severe obesity, as well as with obesity-related type II diabetes, in the European population.
SNPs represent one of the most common forms of genetic variation. These polymorphisms appear when a single nucleotide in the genome is altered (such as via substitution, addition or deletion). Each version of the sequence with respect to the polymorphic site is referred to as an “allele” of the polymorphic site. SNPs tend to be evolutionarily stable from generation to generation and, as such, can be used to study specific genetic abnormalities throughout a population. If a SNP occurs in a protein-coding region, it can lead to the expression of a variant, sometimes defective, form of the protein that may lead to the development of a genetic disease. Some SNPs occur in non-coding regions but may nevertheless result in differential or defective splicing, or in altered protein expression levels. SNPs can therefore serve as effective indicators of a genetic disease. SNPs can also be used as diagnostic tools for identifying individuals with a predisposition for a disease, genotyping the individual suffering from the disease, and facilitating drug development based on the insight revealed regarding the role of target proteins in the pathogenesis process.
For the avoidance of doubt, the methods of the invention do not involve diagnosis practised on the human body. The methods of the invention are preferably conducted on a sample that has previously been removed from the individual. The kits of the invention, described hereunder, may include means for extracting the sample from the individual.
The methods of the invention allow the accurate evaluation of risk for an individual's health due to obesity and/or type II diabetes at or before disease onset, thus reducing or minimizing the negative effects of obesity and/or type II diabetes. In particular, the present invention allows a better prediction of the risk of obesity and/or type II diabetes and, therefore, of subsequent complications. The methods of the invention can be applied in persons who are free of clinical symptoms and signs of obesity and/or type II diabetes, in those who already have obesity and/or type II diabetes, in those who have family history of obesity and/or type II diabetes, or in those who have elevated level or levels of risk factors of obesity and/or type II diabetes.
In the context of the present invention, a “biomarker” (also herein referred to as a “marker”) is a genetic marker indicative of obesity and/or type II diabetes in humans, that is to say a nucleic acid sequence which is specifically and significantly involved in obesity and/or type II diabetes onset. In the context of the invention, such a marker may also be called an “obesity and/or type II diabetes risk SNP marker” or a “risk SNP marker” or a “risk marker” or a “SNP marker”.
Typically, the genetic markers used in the invention are particular alleles at “polymorphic sites” associated with obesity and/or type II diabetes. A nucleotide position in the genome at which more than one sequence is possible in a population is referred to as a “polymorphic site”. Where a polymorphic site is a single nucleotide in length, the site is commonly called an “SNP”. For example, if at a particular chromosomal location, one member of a population has an adenine and another member of the population has a thymine at the same position, then this position is a polymorphic site and, more specifically, the polymorphic site is an SNP. Polymorphic sites may be several nucleotides in length due to, e.g., insertions, deletions, conversions, substitutions, duplications, or translocations. Each version of the sequence with respect to the polymorphic site is referred to as an “allele” of the polymorphic site. Thus, in the previous example, the SNP allows for both an adenine allele and a thymine allele. These alleles are “variant” alleles. Nucleotide sequence variants, either in coding or in non-coding regions, can result in changes in the sequence of the encoded polypeptide, thus affecting the properties thereof (altered activity, altered distribution, altered stability, etc.). Alternatively, nucleotide sequence variants, either in coding or in non-coding regions, can result in changes affecting transcription of a gene or translation of its mRNA. In all cases, the alterations may be qualitative or quantitative or both.
Those skilled in the art will readily recognize that the analysis of the nucleotides present in one or several of the SNP markers disclosed herein in an individual's nucleic acid can be done by any method or technique capable of determining nucleotides present in a polymorphic site. For instance, one may detect biomarkers in the methods of the present invention by performing sequencing, mini-sequencing, hybridisation, restriction fragment analysis, oligonucleotide ligation assay, allele-specific PCR, or a combination thereof. Of course, this list is merely illustrative and in no way limiting. Those skilled in the art may use any appropriate method to achieve such detection.
As is well understood in the art, the nucleotides present in SNP markers can be determined from either nucleic acid strand or from both strands.
The biomarkers used in the context of the invention are “associated with” the FTO gene, which means that said biomarkers are structurally associated with the FTO gene, e.g., the biomarkers are either in the FTO locus, or in close proximity thereto, and/or that said biomarkers are functionally associated with the FTO gene, e.g., the biomarkers interact with or affect the FTO gene or the expression product thereof.
Preferably, the biomarkers used in the methods and kits of the present invention are selected from the group of single nucleotide polymorphisms (SNPs) listed in any one of Tables 2, 3, and 6 to 9 below (see part II in the Examples below). Yet preferably, some of the SNPs listed in any one of Tables 2, 3, and 6 to 9 that are of highly significant predictive value are selected from rs9940128, rs1421085, rs1121980, rs17817449, rs3751812, rs11075990, rs9941349, rs7206790, rs8047395, rs10852521, rs1477196, and rs4783819. In this group, the SNPs rs9940128, rs1421085, rs1121980, rs3751812, rs7206790, rs8047395, and rs17817449 are of particular interest. Yet more preferably, one will use at least the SNP rs1421085 or rs17817449.
Alternatively, the biomarkers may be polymorphic sites associated with at least one SNP selected from the group listed in anyone of Tables 2, 3, and 6 to 9 below. As defined above, the terms “associated with” mean that said biomarkers are structurally and/or functionally associated with said SNP(s). More specifically, the terms “associated with” mean that said biomarkers are in high linkage disequilibrium with said SNPs, i.e., they present a correlation termed r2 of at least 0.6 and/or a D′ of 0.5 with said SNPs in the HapMap European dataset and/or in the population experimentally analyzed by the Inventors as shown below.
Yet alternatively, the biomarkers may be polymorphic sites being in complete linkage disequilibrium with at least one SNP selected from the group listed in anyone of Tables 2, 3, and 6 to 9 below.
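By way of illustration, the pairwise linkage disequilibrium measures D′ and r2 referred to above can be computed from haplotype and allele frequencies with the standard definitions. The sketch below is not taken from the original text: the function names, the input frequencies and the reading of “and/or” as an inclusive “or” are assumptions made for the example.

```python
def ld_stats(p_ab, p_a, p_b):
    """Return (D, D', r2) for two biallelic sites.

    p_ab: frequency of the haplotype carrying allele A at site 1 and allele B at site 2
    p_a, p_b: frequencies of allele A and allele B (each strictly between 0 and 1)
    """
    if not (0 < p_a < 1 and 0 < p_b < 1):
        raise ValueError("both sites must be polymorphic")
    d = p_ab - p_a * p_b
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, d_prime, r2


def in_high_ld(r2, d_prime, r2_min=0.6, d_prime_min=0.5):
    """Thresholds from the text; 'and/or' is read here as an inclusive 'or' (assumption)."""
    return r2 >= r2_min or d_prime >= d_prime_min


if __name__ == "__main__":
    # Hypothetical frequencies, for illustration only.
    print(ld_stats(0.3, 0.4, 0.5))
```

Complete linkage disequilibrium in the sense of r2 corresponds to r2 = 1, which in this sketch occurs when the two sites carry the same allele frequencies and the alleles always travel together.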
Thus, a first aspect of the present invention concerns an in vitro method for risk assessment and/or diagnosis and/or prognosis of obesity and/or type II diabetes in a human subject, comprising at least:
a) detecting, in a nucleic acid sample from said human subject, at least one biomarker associated with the FTO gene; and
b) comparing the biomarker data obtained in step a) from said human subject to biomarker data from healthy and/or diseased people to make risk assessment and/or diagnosis and/or prognosis of obesity and/or type II diabetes in said human subject.
By “risk assessment”, it is meant herein that the present invention makes it possible to estimate or evaluate the risk of a human subject to develop obesity and/or type II diabetes (one could also say “predisposition or susceptibility assessment”). In this respect, an individual “at risk” of obesity and/or type II diabetes is an individual who has at least one at-risk allele or haplotype with one or more “obesity and/or type II diabetes risk SNP markers”. In addition, an “at-risk” individual may also have at least one risk factor known to contribute to the development of obesity and/or type II diabetes, including for instance:
The prediction or risk generally implies that the risk is either increased or reduced.
There is no limitation on the type of nucleic acid sample that may be used in the context of the present invention. In this respect, one may use, e.g., a DNA sample, a genomic DNA sample, an RNA sample, a cDNA sample, an hnRNA sample, or an mRNA sample.
The “diseased” people referred to in the methods of the invention are people suffering from obesity and/or type II diabetes.
According to various embodiments, the method described above is useful for:
identifying human subjects at risk for developing obesity and/or type II diabetes;
diagnosing obesity and/or type II diabetes in a human subject;
selecting efficient and safe therapy to a human subject having obesity and/or type II diabetes;
monitoring the effect of a therapy administered to a human subject having obesity and/or type II diabetes;
predicting the effectiveness of a therapy to treat obesity and/or type II diabetes in a human subject in need of such treatment;
selecting efficient and safe preventive therapy to a human subject at risk for developing obesity and/or type II diabetes;
monitoring the effect of a preventive therapy administered to a human subject at risk for developing obesity and/or type II diabetes;
predicting the effectiveness of a therapy to prevent obesity and/or type II diabetes in a human subject at risk.
The terms “treatment” and “therapy” refer not only to ameliorating symptoms associated with obesity and/or type II diabetes, but also preventing or delaying the onset of the disease, and/or also lessening the severity or frequency of symptoms of the disease, and/or also preventing or delaying the occurrence of another episode of the disease.
A second aspect of the present invention relates to an in vitro method for identifying a SNP haplotype associated with obesity and/or type II diabetes susceptibility in a human subject, wherein said method comprises at least:
a) detecting, in a nucleic acid sample from said human subject, at least one SNP of the FTO gene, wherein said at least one SNP is indicative of obesity and/or type II diabetes susceptibility; and
b) identifying said SNP haplotype in said human subject, wherein said SNP haplotype comprises said at least one SNP detected in step a).
As it is well known in the art, a “haplotype” refers to any combination of genetic markers. A haplotype can comprise two or more alleles. The haplotypes (or “at-risk haplotypes”) described herein are found more frequently and significantly in individuals at risk of obesity and/or type II diabetes than in individuals without obesity and/or type II diabetes risk. Therefore, these haplotypes have predictive value for detecting obesity and/or type II diabetes risk, or a susceptibility to obesity and/or type II diabetes in an individual. An “at-risk haplotype” is thus intended to embrace one or a combination of haplotypes described herein over the markers that show high and significant correlation to obesity and/or type II diabetes.
Detecting haplotypes can be accomplished by methods well known in the art for detecting sequences at polymorphic sites.
Preferably, the SNP(s) detected in step a) is(are) selected from the group listed in anyone of Tables 2, 3, and 6 to 9 below.
A third aspect of the present invention provides a test kit for using in an in vitro method to make risk assessment and/or diagnosis and/or prognosis of obesity and/or of type II diabetes in a human subject, wherein said test kit comprises appropriate means for:
a) assessing type and/or level of at least one biomarker associated with the FTO gene in a nucleic acid sample from said human subject; and
b) comparing the biomarker data assessed in a) from said human subject to biomarker data from healthy and/or diseased people to make risk assessment and/or diagnosis and/or prognosis of obesity and/or of type II diabetes in said human subject.
A fourth aspect of the present invention is related to a test kit for using in an in vitro method for identifying a SNP haplotype associated with obesity and/or type II diabetes susceptibility in a human subject, comprising appropriate means for:
a) detecting at least one SNP of the FTO gene in a nucleic acid sample from said human subject, wherein said at least one SNP is indicative of obesity and/or type II diabetes susceptibility; and
b) identifying SNP haplotype in said human subject, wherein said SNP haplotype comprises said at least one SNP detected in a).
The terms “test kit” and “kit” are synonymous and may be used interchangeably.
In the context of the present invention, when reference is made to test kits, the terms “appropriate means” refer to any technical means useful for achieving the indicated purpose. As non-limiting examples of such appropriate means, one can cite reagents and/or materials and/or protocols and/or instructions and/or software, etc. All the kits of the present invention may comprise appropriate packaging and instructions for use in the methods herein disclosed. The kits may further comprise appropriate buffer(s) and polymerase(s) such as thermostable polymerases, for example Taq polymerase. Such kits may also comprise control primers and/or probes.
According to preferred embodiments, the test kits of the invention may comprise at least:
a) one isolated PCR primer pair consisting of a forward primer and a reverse primer, for specifically amplifying nucleic acids of interest; and/or
b) one isolated primer for specifically extending nucleic acids of interest; and/or
c) one isolated nucleic acid probe specifically binding to nucleic acids of interest; and/or
d) one isolated antibody specifically binding protein(s) encoded by nucleic acid(s) of interest; and/or
e) one microarray or multiwell plate comprising at least one of a) to d) above.
By “nucleic acids of interest”, it is meant herein the nucleic acid regions or segments containing the biomarkers that are indicative of obesity and/or type II diabetes. In this respect, the nucleic acids of interest may be larger than the biomarkers or they may be limited to the biomarkers.
“Probes” and “primers” are oligonucleotides that hybridize in a base-specific manner to a complementary strand of nucleic acid molecules. By “base-specific manner”, it is meant that the two sequences must have a degree of nucleotide complementarity sufficient for the primer or the probe to hybridize. Accordingly, the primer or probe sequence is not required to be perfectly complementary to the sequence of the template. Non-complementary bases or modified bases can be interspersed into the primer or probe, provided that base substitutions do not inhibit hybridization.
A probe or primer usually comprises a region of nucleic acid that hybridizes to at least about 8, preferably about 10, 12, 15, more preferably about 20, 25, 30, 35, and in some cases, about 40, 50, 60, 70 consecutive nucleotides of the nucleic acid template.
The primers and probes are typically at least 70% identical to the contiguous or complementary nucleic acid sequence (which is the “template”). Identity is preferably of at least 80%, 90%, 95%, and more preferably, of 98%, 99%, 99.5%, 99.8%.
Advantageously, the primers and probes further comprise a label, e.g., radioisotope, fluorescent compound, enzyme, or enzyme co-factor.
A fifth aspect of the present invention is directed to a method for selecting pharmaceutical agents useful in prevention and/or treatment of obesity and/or type II diabetes in a human subject, comprising at least:
a) administering the candidate agents to a model living system containing the human FTO gene;
b) determining the effect of said candidate agents on biological mechanisms involving said FTO gene and/or the expression product thereof; and
c) selecting the agents having an altering effect on said biological mechanisms, wherein the selected agents are considered useful in prevention and/or treatment of obesity and/or type II diabetes in a human subject.
By “pharmaceutical agent”, it is referred to either biological agents or chemical agents or both, provided they can be considered as useful in prevention and/or treatment of obesity and/or type II diabetes in a human subject. Examples of biological agents are nucleic acids, including siRNAs; polypeptides, including toxins, enzymes, antibodies, either polyclonal antibodies or monoclonal antibodies; combinations of nucleic acids and polypeptides, and the like. Examples of chemical agents are chemical molecules, chemical molecular complexes, chemical moieties, and the like (e.g., radioisotopes, etc.).
In a sixth aspect, the present invention concerns the use of a model living system containing the human FTO gene for studying pathophysiology and/or molecular mechanisms involved in obesity and/or type II diabetes.
Where reference is made herein to a “model living system”, it is preferably referred to a non-human transgenic animal, or a cultured microbial, insect or mammalian cell, or a mammalian tissue or organ. More preferably, said model living system will express or overexpress the human FTO gene.
A seventh aspect of the present invention relates to an in vitro method for haplotyping the FTO gene in a human subject, comprising at least:
a) detecting, in a nucleic acid sample from said human subject, the nucleotides present at each allelic position of an “obesity and/or type II diabetes susceptibility haplotype”, which haplotype includes at least one of the SNPs listed in anyone of Tables 2, 3, and 6 to 9, or a polymorphism in linkage disequilibrium therewith; and
b) assigning said human subject a particular haplotype according to the nucleotides detected in a).
In a preferred embodiment, this method further comprises the step of determining the risk of said human subject for developing obesity and/or type II diabetes according to the particular haplotype assigned in step b).
The nucleotides present at each allelic position may be detected in step a) of the above method using any appropriate techniques. For instance, this detection may be performed using enzymatic amplification, such as polymerase chain reaction or allele-specific amplification, of said nucleic acid sample. Alternatively, said detection may be done using sequencing.
Besides, the SNPs and haplotypes disclosed herein allow patient stratification. The subgroups of individuals identified as having increased or decreased risk of developing obesity and/or type II diabetes can be used, inter alia, for targeted clinical trial programs and pharmacogenetic therapies wherein knowledge of polymorphisms is used to help identify patients most suited to therapy with particular pharmaceutical agents.
The SNPs and haplotypes described herein represent a valuable information source helping to characterise individuals in terms of, for example, their identity and susceptibility to disease onset/development or susceptibility to treatment with particular drugs.
Therefore, an eighth aspect of the present invention is directed to a method for selecting human subjects for participation in a clinical trial to assess the efficacy of a therapy for treating and/or preventing obesity and/or type II diabetes, comprising at least:
a) grouping the human subjects according to the particular FTO gene haplotype that each human subject belongs to; and
b) selecting at least one human subject from at least one of the haplotype groups obtained in a) for inclusion in said clinical trial.
In this method, the particular FTO gene haplotype is advantageously determined in vitro by detecting, in a nucleic acid sample from each human subject, the nucleotides present at each allelic position of an “obesity and/or type II diabetes susceptibility haplotype”, which haplotype includes at least one of the SNPs listed in anyone of Tables 2, 3, and 6 to 9, or a polymorphism in linkage disequilibrium therewith.
A ninth aspect of the present invention provides a test kit for in vitro haplotyping the FTO gene in a human subject according to the method as described above, wherein said test kit comprises appropriate means for:
a) detecting, in a nucleic acid sample from said human subject, the nucleotides present at each allelic position of an “obesity and/or type II diabetes susceptibility haplotype”, which haplotype includes at least one SNP selected from the group listed in anyone of Tables 2, 3, and 6 to 9, or a polymorphism in linkage disequilibrium therewith; and
b) assigning said human subject a particular haplotype according to the nucleotides detected in a).
In addition, the present invention concerns, in a tenth aspect, the use of a test kit as described above for stratifying human subjects into particular haplotype groups.
Advantageously, this test kit is further used for selecting at least one human subject from at least one of the haplotype groups for inclusion in a clinical trial to assess the efficacy of a therapy for treating and/or preventing obesity and/or type II diabetes.
In an eleventh aspect, the present invention is related to a test kit for in vitro determining the identity of at least one SNP selected from the group listed in anyone of Tables 2, 3, and 6 to 9 in the human FTO gene, comprising appropriate means for such determination.
The present invention is illustrated by the non-limiting following figures:
A) The linkage disequilibrium is presented as a 2 by 2 matrix where dark grey represents very high linkage disequilibrium (r2) and white absence of correlation between SNPs.
B) For each of the SNPs, the log10 of the p-value for the class III obesity (880 individuals) vs. controls (2700) analysis is shown.
FTO expression in human cDNA from adipose tissue (BioChain Institute, USA), pancreatic islets, FACS-purified beta cells (provided by the Human Pancreatic Cell Core Facility, University Hospital, Lille, France) and multiple tissue cDNA panel (BD Biosciences Clontech) where 1: FTO negative control, 2: GAPDH, 3: GAPDH negative control, 4: GAPDH+FTO, 5: GAPDH+FTO negative control, 6: molecular weight markers 50 bp, 150 bp, 300 bp, 500 bp, 750 bp and 1 kb, 7: adipose tissue, 8: adipose tissue RT minus control, 9: pancreatic islets, 10: pancreatic islets RT minus control, 11: heart, 12: brain, 13: placenta, 14: lung, 15: liver, 16: skeletal muscle, 17: kidney, 18: pancreas, 19: pancreatic beta cells. Glyceraldehyde-3-phosphate dehydrogenase (GAPDH) was used as internal control. Beta cell purity was confirmed by immunochemistry (98% insulin-positive cells) and PCR (absence of amplification with chymotrypsin primers, specific for exocrine cells, and presence of amplification with Pdx1 primers, specific for beta cells). FTO primers used were 5′-TGCCATCCTTGCCTCGCTCA-3′ (SEQ ID No.1) and 5′-TGGGGGCTGAATGGCTCACA-3′ (SEQ ID No.2). These two primers were high-performance liquid chromatography purified. 1 μg of adipose tissue, pancreatic islets and beta cells RNA was randomly reverse transcribed using M-MLV Reverse Transcriptase (Promega, USA) according to instructions. PCR was performed using the FastStart Taq DNA polymerase kit (Roche, Germany) according to instructions with 1.25 mmol/l MgCl2, 0.4 μmol/l of each primer, and 5 μl single strand cDNA, using the hot-start PCR method modified as follows: 95° C. for 4 min, 40 cycles of 95° C. for 30 s, 68° C. for 2 min, and then 68° C. for 3 min. PCR products were separated on 2% (wt/vol) agarose gel and visualized using ethidium bromide and ultraviolet trans-illumination.
Other embodiments and advantages of the present invention will be understood upon reading the following Examples.
I.1: Statistical analyses
a) Association tests. Logistic regression was used to test association in case-controls under a multiplicative model and Pearson chi-square for the general association model.
The p-values for replication are one-sided for testing the specific hypothesis of increased frequency of allele C (resp. G) in SNPs rs1421085 (resp. rs17817449) in obese children and adults.
Association testing of both SNPs in family-based cohorts was performed using the transmission disequilibrium test (TDT), which compares the number of transmissions of the at-risk allele, from heterozygous parents to affected offspring, to its expectation. A McNemar χ2 test assesses the significance.
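The TDT statistic described above can be sketched as follows, in its usual McNemar form. The transmission counts are hypothetical, and the p-value shown is the standard two-sided value for a χ2 with 1 degree of freedom; this is an illustration under those assumptions, not the software used for the analyses reported here.

```python
import math


def tdt(transmitted, untransmitted):
    """McNemar-type TDT from heterozygous parents.

    transmitted:   number of times the at-risk allele was transmitted
    untransmitted: number of times it was not transmitted
    Returns (chi2, p_value) with 1 degree of freedom.
    """
    total = transmitted + untransmitted
    if total == 0:
        raise ValueError("no informative transmissions")
    chi2 = (transmitted - untransmitted) ** 2 / total
    # Survival function of a chi-square with 1 df: erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


if __name__ == "__main__":
    # Hypothetical counts: 60 transmissions versus 40 non-transmissions.
    print(tdt(60, 40))
```

Under the null hypothesis of no linkage or association, each heterozygous parent transmits the at-risk allele with probability one half, so the expected count is half of the informative transmissions.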
Fisher's method was used for combining the p-values of the different studies: twice the negative sum of the natural logarithms of n p-values follows a χ2 distribution with 2n degrees of freedom.
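A minimal sketch of Fisher's combination is given below. It relies on the closed-form survival function of a χ2 distribution with an even number of degrees of freedom, so that only the standard library is needed. The input p-values are hypothetical.

```python
import math


def fisher_combine(p_values):
    """Combine independent p-values with Fisher's method.

    Returns (statistic, combined_p), where statistic = -2 * sum(ln p)
    is referred to a chi-square distribution with 2n degrees of freedom.
    """
    if not p_values or any(not (0.0 < p <= 1.0) for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")
    n = len(p_values)
    statistic = -2.0 * sum(math.log(p) for p in p_values)
    half = statistic / 2.0
    # For 2n degrees of freedom: sf(x) = exp(-x/2) * sum_{k<n} (x/2)^k / k!
    term, total = 1.0, 1.0
    for k in range(1, n):
        term *= half / k
        total += term
    return statistic, math.exp(-half) * total


if __name__ == "__main__":
    # Hypothetical p-values from two studies.
    print(fisher_combine([0.05, 0.05]))
```

With a single p-value the combined value equals the input, which is a convenient sanity check on the formula.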
b) Genetic model. The proportion of BMI variance explained in adult founders of our familial study populations (parents of French obese children) and in children from the Leipzig cohort, was estimated. The BMI was normalized and expressed in SDS.
The QTL liability threshold model with a quantitative liability trait L (mean 0 and SD 1 in the whole population) and a threshold T, above which an individual is classified as affected, was used. The trait L follows a mixture of three normal distributions N(μ_g, σ_R), where μ_g is the genotype-specific mean of L (taking the values −a, 0 and a) and σ_R^2 is the proportion of residual variance that is not due to the locus. For obesity, the trait L can be identified with BMI, as obesity is defined as having a BMI over a certain threshold. With these parameters, it was possible to express the disease risk in terms of Genotype Risk Ratios, GRR_i = P(affected | G=i) / P(affected | G=0). The variance due to the locus under investigation was directly derived from the value a and from f, the frequency of the at-risk allele in the population: σ_α^2 = 2·f·(1−f)·a^2 (Eq 1). The percentage of variance explained by the variant (σ_α^2) was derived from the linear regression model, and the value a was then obtained by inverting Eq 1. Then, the GRRs corresponding to a prevalence of 10%, used for common obesity, were iteratively calculated.
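The calculation outlined above can be sketched as follows, under stated assumptions: genotypes are in Hardy-Weinberg proportions, the genotype means are shifted by a common constant so that the population mean of L is exactly 0, and the threshold T is found by bisection. None of these implementation details is given in the original text, and the input values in the example are hypothetical.

```python
import math


def normal_sf(x):
    """Upper-tail probability of a standard normal distribution."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def genotype_risk_ratios(f, var_explained, prevalence):
    """Return (T, [GRR_0, GRR_1, GRR_2]) for 0, 1 and 2 copies of the at-risk allele."""
    # Invert Eq 1: var_explained = 2 f (1 - f) a^2.
    a = math.sqrt(var_explained / (2.0 * f * (1.0 - f)))
    sigma_r = math.sqrt(1.0 - var_explained)  # residual SD, total variance being 1
    freqs = [(1 - f) ** 2, 2 * f * (1 - f), f ** 2]  # Hardy-Weinberg (assumption)
    means = [a * (g - 2 * f) for g in range(3)]  # spacing a, centred on 0 (assumption)

    def prevalence_at(t):
        return sum(p * normal_sf((t - m) / sigma_r) for p, m in zip(freqs, means))

    lo, hi = -10.0, 10.0
    for _ in range(200):  # bisection: prevalence decreases as T increases
        mid = (lo + hi) / 2.0
        if prevalence_at(mid) > prevalence:
            lo = mid
        else:
            hi = mid
    t = (lo + hi) / 2.0
    risks = [normal_sf((t - m) / sigma_r) for m in means]
    return t, [r / risks[0] for r in risks]


if __name__ == "__main__":
    # Hypothetical inputs: allele frequency 0.4, 1% of variance explained, 10% prevalence.
    print(genotype_risk_ratios(0.4, 0.01, 0.10))
```

When the variant explains no variance, all three genotype risk ratios equal 1 and T reduces to the 90th percentile of the standard normal distribution for a 10% prevalence.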
Initial case-control genotyping was done using the Applied Biosystems SNPlex™ Technology, based on the Oligonucleotide Ligation Assay (OLA) combined with multiplex PCR target amplification (http://www.appliedbiosystems.com). The chemistry of the assay relies on a set of universal core reagent kits and a set of SNP-specific ligation probes, allowing multiplex genotyping of 48 SNPs simultaneously in a single sample. A quality control measure was included by using specific internal controls for each step of the assay (according to the manufacturer's instructions). Allelic discrimination was performed through capillary electrophoresis analysis using an Applied Biosystems 3730xl DNA Analyzer and GeneMapper3.7 software. Duplicate samples were assayed with a concordance rate of 100%.
High-throughput genotyping for the variants rs1421085 and rs17817449 in replication samples was performed using the TaqMan® SNP Genotyping Assays (Applied Biosystems, Foster City, Calif. USA). The PCR primers and TaqMan probes were designed by Primer Express and optimized according to the manufacturer's protocol.
All SNPs were in Hardy-Weinberg equilibrium (p>0.05). The call rates were higher than 95% in all groups of cases and controls from all populations except in Swiss obese individuals.
Call rates and HWE test p-values are displayed in Table 1 below.
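For illustration, a Hardy-Weinberg equilibrium check of the kind referred to above can be sketched with a χ2 goodness-of-fit test with 1 degree of freedom. The original text does not state which HWE test was used, so the choice of test is an assumption, and the genotype counts are hypothetical.

```python
import math


def hwe_chi2(n_aa, n_ab, n_bb):
    """Chi-square test of Hardy-Weinberg proportions from genotype counts.

    Returns (chi2, p_value) with 1 degree of freedom.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2.0 * n)  # frequency of allele A
    q = 1.0 - p
    if p == 0.0 or q == 0.0:
        raise ValueError("monomorphic marker: the test is undefined")
    expected = [n * p * p, 2 * n * p * q, n * q * q]
    observed = [n_aa, n_ab, n_bb]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    p_value = math.erfc(math.sqrt(chi2 / 2.0))  # chi-square survival function, 1 df
    return chi2, p_value


if __name__ == "__main__":
    # Hypothetical genotype counts.
    print(hwe_chi2(360, 480, 160))
```

A p-value above 0.05, as reported for all SNPs here, means the observed genotype counts do not depart significantly from Hardy-Weinberg proportions.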
An unusually high frequency of the C (resp. G) allele was observed in the controls of the Swiss study, which is the control sample with the highest rate of missing genotypes. This may be due either to the presence of undetected obese individuals in this sample of anonymous donors, or to a correlation between call rate and allele frequency (differential call rate). However, the samples with the highest call rate and displaying no difference in missing rate between cases and controls (French adult obesity and German childhood obesity) showed the usual range of allele frequency difference (0.41 to 0.51). Thus, the observed association is unlikely to be due to a genotype-dependent difference in call rate between cases and controls.
Besides usual duplicates, 535 obese children and 329 class III obese adults were genotyped both in the case-control and in the familial studies. The concordance rates between these two genotyping techniques were 100% for both SNPs in both studies.
39 SNPs were genotyped in 6833 individuals. They capture 100% of the SNPs with a MAF (Minor Allele Frequency) higher than 1% in a region spanning from position 5234790 bp (rs1861868) to position 52386696 bp (rs13337696).
73% of the individuals (N=5037) were successfully genotyped for the 39 SNPs and 88% (6030 individuals) for at least 38 SNPs. The average call rate was 99%.
BMI was calculated and the z-score of BMI was determined according to Cole's method (Cole et al., 1990).
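The z-score computation can be sketched as follows, on the assumption that Cole's method refers to the LMS transformation, in which an age- and sex-specific power (L), median (M) and coefficient of variation (S) are read from reference tables. The reference values used in the example are hypothetical placeholders, not values from any published growth reference, and the function names are illustrative.

```python
import math


def bmi(weight_kg, height_m):
    """Body mass index: weight (kg) divided by squared height (m)."""
    return weight_kg / (height_m ** 2)


def lms_z_score(value, L, M, S):
    """z-score of a measurement under the LMS transformation.

    L, M and S are the age- and sex-specific Box-Cox power, median and
    coefficient of variation taken from a reference table (assumed here).
    """
    if value <= 0 or M <= 0 or S <= 0:
        raise ValueError("value, M and S must be positive")
    if abs(L) < 1e-12:
        # Limiting case of the transformation when L tends to zero.
        return math.log(value / M) / S
    return ((value / M) ** L - 1.0) / (L * S)


if __name__ == "__main__":
    # Hypothetical reference values for one age/sex stratum.
    L_ref, M_ref, S_ref = -1.5, 17.0, 0.12
    child_bmi = bmi(48.0, 1.45)
    print(f"BMI={child_bmi:.2f}, z={lms_z_score(child_bmi, L_ref, M_ref, S_ref):.2f}")
```

A measurement equal to the reference median M yields a z-score of zero regardless of L and S.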
Model Selection:
A systematic analysis of all possible combinations of 1 to k polymorphisms was performed to select the most informative and parsimonious haplotype configuration in terms of predicting disease status. Because the SNPs are in strong linkage disequilibrium (LD), the likelihood was estimated from haplotype analyses for combinations of more than 1 polymorphism. The likelihood generated by the program THESIAS was transformed into a Bayesian Information Criterion (BIC) value for each haplotype model, and the minimum BIC value obtained over all models explored was then subtracted from it, giving a rescaled BIC value for each haplotype model. The models with a rescaled BIC≦2 are considered equivalent to the most informative model, and among these models, the most parsimonious one, with the fewest polymorphisms, is considered the best model.
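The selection rule described above can be sketched as follows. This is an illustration under stated assumptions rather than the procedure actually run: the log-likelihoods would come from THESIAS (not called here), the standard definition BIC = −2·lnL + k·ln(n) is assumed, the equivalence threshold of 2 is an assumption, and all names and numerical values are hypothetical.

```python
import math
from typing import NamedTuple, Tuple


class HaplotypeModel(NamedTuple):
    snps: Tuple[str, ...]     # polymorphisms included in the model
    log_likelihood: float     # maximised log-likelihood (e.g. from THESIAS)
    n_params: int             # number of free parameters


def bic(log_likelihood, n_params, n_samples):
    """Bayesian Information Criterion (standard definition, assumed)."""
    return -2.0 * log_likelihood + n_params * math.log(n_samples)


def select_best_model(models, n_samples, threshold=2.0):
    """Return (best_model, rescaled_bic_by_model).

    Rescaled BIC = BIC minus the minimum BIC over all models. Models with
    rescaled BIC <= threshold are treated as equivalent, and the one with
    the fewest polymorphisms is retained.
    """
    scores = {m: bic(m.log_likelihood, m.n_params, n_samples) for m in models}
    minimum = min(scores.values())
    rescaled = {m: s - minimum for m, s in scores.items()}
    equivalent = [m for m in models if rescaled[m] <= threshold]
    # Fewest SNPs first; ties broken by the smaller rescaled BIC.
    best = min(equivalent, key=lambda m: (len(m.snps), rescaled[m]))
    return best, rescaled


if __name__ == "__main__":
    # Hypothetical likelihoods, for illustration only.
    candidates = [
        HaplotypeModel(("snpA",), -1000.0, 1),
        HaplotypeModel(("snpA", "snpB"), -992.5, 3),
        HaplotypeModel(("snpB",), -1010.0, 1),
    ]
    best, rescaled = select_best_model(candidates, n_samples=1000)
    print(best.snps, {m.snps: round(v, 2) for m, v in rescaled.items()})
```

In this hypothetical example the two-SNP model has the lowest BIC, but the single-SNP model lies within the threshold and is therefore retained as the more parsimonious choice.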
Haplotype Clustering:
HapCluster was used to perform a stochastic search for a case-rich cluster of haplotypes that are similar in the vicinity of a putative risk-enhancing variant. Haplotypes within the cluster are predicted to carry a risk-enhancing allele. The algorithm returns a Bayes factor to summarise the evidence for a causal variant, and a sample from the posterior distribution for its location. The current version, freely available at www.daimi.au.dk/~mailund/HapCluster/, allows an allelic model, suitable for additive effects, and accepts unphased genotype data. Both these enhancements to the algorithm described in Waldron et al. (2006) were employed.
48 SNPs in different intergenic regions were initially selected in order to estimate the distribution of neutral SNPs in French Caucasian case-control obesity data sets. Surprisingly, the SNP rs1121980, located on chromosome 16q12.2, was found to be strongly associated with severe class III (BMI >40 kg/m2) adult obesity (OR=1.55 [1.39-1.73], p-value=5.3×10^−16).
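For illustration, an odds ratio with its confidence interval, in the format reported above, can be computed from allele counts as sketched below. It is assumed here that the odds ratio is allelic and that the interval is a 95% Woolf (log-scale) interval; neither is stated in the text, and the allele counts in the example are hypothetical.

```python
import math


def allelic_odds_ratio(case_risk, case_other, ctrl_risk, ctrl_other, z=1.96):
    """Allelic odds ratio with a Woolf (log-scale) confidence interval.

    Arguments are allele counts (not genotype counts) in cases and
    controls; z=1.96 corresponds to a 95% interval. Returns
    (odds_ratio, lower, upper).
    """
    counts = (case_risk, case_other, ctrl_risk, ctrl_other)
    if any(c <= 0 for c in counts):
        raise ValueError("all allele counts must be positive")
    odds_ratio = (case_risk * ctrl_other) / (case_other * ctrl_risk)
    # Standard error of ln(OR).
    se = math.sqrt(sum(1.0 / c for c in counts))
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper


if __name__ == "__main__":
    # Hypothetical allele counts, for illustration only.
    or_, lo, hi = allelic_odds_ratio(200, 100, 100, 100)
    print(f"OR={or_:.2f} [{lo:.2f}-{hi:.2f}]")
```

An interval that excludes 1 corresponds to a nominally significant association at the chosen level.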
It appeared that this SNP is actually located within the first intron of a newly described gene named fatso or FTO (Peters et al., 1999), which has nine predicted exons in humans and encompasses a large genomic region of 410,507 bp on the NCBI 36.1 human genome assembly. Additional SNPs were tested in a 60-kb region (30 kb on each side of this SNP) that spans the LD block where rs1121980 lies. This region encompasses part of the first intron, the second exon and the first part of the second intron of the FTO gene. SNPs tagging all the frequent markers (MAF >0.05) with an r2 >0.7, as well as SNPs located in potentially functional elements (transcription factor binding sites or other regulatory elements and regions conserved between species) and in LD (r2 >0.8) with the initial SNP rs1121980, were selected. Twenty-five SNPs were eventually selected, and twenty-three were successfully genotyped. The case-control sample comprised 896 class III obese adults (BMI >40 kg/m2) and 2,700 non-obese French Caucasian controls (BMI <27 kg/m2). Both obese adult individuals and controls have been previously described (Meyre et al., 2005).
Results are shown in Table 2 below. Strong association of several SNPs with class III obesity (1.9×10^−16≦p≦5×10^−9) was found. Interestingly, three of the five most significantly associated SNPs, rs17817449, rs3751812 and rs1421085, were putatively functional, based both on the phastCons conservation score calculated on 11 vertebrate species (Siepel et al., 2005) and on the Regulatory Potential score calculated on 7 species (King et al., 2005). Information for genotyped SNPs is displayed in Tables 2 and 3 below.
Position (bp) | SNP |
---|---|
52358454 | rs1421085 |
52370867 | rs17817449 |
52375960 | rs3751812 |
For each SNP, Table 3 above reports the physical position in bp using NCBI assembly Build 35, the phastCons conservation score calculated on 11 vertebrate species and the Regulatory Potential score calculated on 7 species. A star is added when the SNP inserts or deletes a Transcription Factor Binding Site (using the SNP inspector Tool from Genomatix Suite). The three SNPs having the highest scores, and therefore most likely to be functional, are indicated in bold.
It was also tested whether the association observed in the whole region reflected one unique signal or whether any other SNP or haplotype displayed association on its own, and it was concluded that the at-risk alleles were nearly perfect proxies of each other. Thus, at least these three SNPs are likely to mirror one unique association of a haplotype combining derived alleles (from NCBI) with a frequency of 40% in controls.
As recently outlined (Ott, 2004), the replication of association data in additional samples is necessary to exclude spurious conclusions, especially when the pre-study odds for the implication of a gene are low, which is the case for fatso. SNPs rs1421085 and rs17817449 were chosen to carry out these analyses because they display very strong evidence of association and are putatively functional. All the p-values were one-sided in these analyses.
Allele frequencies of the selected SNPs were first compared between 1,010 non-obese French individuals (Hercberg et al., 1998) (SUVIMAX cohort, BMI <27 kg/m2) and 736 obese children (mean age=11 y, BMI >97th percentile), and significant association with early onset obesity was found (OR=1.28 [1.11-1.47], p=2×10^−5 and OR=1.25 [1.09-1.44], p=5×10^−4 for rs1421085 and rs17817449, respectively). Then, 532 non-obese young French adults (Vu-Hong et al., 2006) (Haguenau cohort, median age=21 y, BMI <25 kg/m2) and 505 French obese children with a BMI >97th percentile (Le Fur et al., 2002) from Saint Vincent de Paul Hospital were analyzed. Again, a similar trend for association with early onset obesity was found (OR=1.47 [1.23-1.75], p=1.17×10^−5 and OR=1.52 [1.28-1.81], p=1.82×10^−6 for rs1421085 and rs17817449, respectively). Finally, 700 lean children (mean age=11.7 y, BMI between 16th and 85th percentile) and 283 obese children (mean age=11.7 y, BMI >90th percentile), both of German Caucasian origin (Korner et al., 2007), were genotyped.
Association was again confirmed for both SNPs (OR=1.69 [1.38-2.06], p=3.46×10^−7, and OR=1.65 [1.35-2.01], p=1.23×10^−6 for rs1421085 and rs17817449, respectively). Table 4 below shows the effect size estimation.
557 Swiss class III obese adults and 541 anonymous Swiss donors were also genotyped, and the initial association between fatso and obesity was further replicated (OR=1.26 [1.07-1.49], p=0.0032 and OR=1.21 [1.02-1.43], p=0.01 for rs1421085 and rs17817449, respectively). Of note, although allele frequencies in Swiss obese subjects were consistent with the initial observations in French obese subjects (MAF=0.50), the Swiss blood donor cohort, which was not tested for obesity, displayed higher allele frequencies (f=0.46 vs. 0.41), which may be explained by the presence of obese individuals in this anonymous group.
For each status, overall significance was assessed using Fisher's method, which combines the p-values of each independent analysis. The number of effective tests (Nyholt, 2004) was used at each step (16.72 and 1.2, respectively) to correct for multiple testing while accounting for the correlation between SNPs. The meta-analysis combining evidence of association for obesity gave very significant results: p-value=1.67×10^−26 and p=1.07×10^−24 for SNPs rs1421085 and rs17817449, respectively. In order to exclude a potential undetected stratification effect, these 2 SNPs were genotyped in the parents and sibs of both French obese children and class III obese adults. An over-transmission of the obesity “at risk” C allele of SNP rs1421085 (respectively, G allele of rs17817449) to both obese children and adults was observed (% transmitted=57%, p-value=1×10^−4 and % transmitted=66%, p-value=0.00045 in obese children for rs1421085 and rs17817449, respectively; % transmitted=57%, p-value=2.5×10^−4 and % transmitted=62%, p-value=0.005 in obese adults for rs1421085 and rs17817449, respectively). An additional cohort of Swedish descent comprising 154 families discordant for severe obesity (with at least one class III obese and one lean sib) was further analyzed, and over-transmission of the same allele to obese offspring was also observed (% transmitted=61%, p-value=0.05 for both SNPs). The overall significance of these three combined family-based studies is 2.8×10^−6.
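Fisher's method, used above to combine the p-values of independent analyses, can be sketched as follows. The statistic −2·Σ ln(p) follows a chi-square distribution with 2k degrees of freedom for k independent tests, for which a closed-form tail probability exists. The input p-values in the example are hypothetical, and the multiple-testing correction and one-sided conventions applied in the study are not reproduced, so the output is not expected to match the figures reported above.

```python
import math


def fisher_combined_p(p_values):
    """Combine independent p-values with Fisher's method.

    X = -2 * sum(ln p_i) is chi-square distributed with 2k degrees of
    freedom under the null hypothesis. For even degrees of freedom the
    tail probability is exp(-X/2) * sum_{i<k} (X/2)^i / i!.
    """
    p_values = list(p_values)
    if not p_values or any(not (0.0 < p <= 1.0) for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")
    half_stat = -sum(math.log(p) for p in p_values)  # equals X / 2
    term = 1.0
    total = 1.0
    for i in range(1, len(p_values)):
        term *= half_stat / i
        total += term
    return math.exp(-half_stat) * total


if __name__ == "__main__":
    # Hypothetical p-values from three independent studies.
    print(f"combined p = {fisher_combined_p([0.01, 0.02, 0.2]):.3g}")
```

With a single input the function returns that p-value unchanged, which provides a simple sanity check of the closed-form tail.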
Moreover, in founders of the French Childhood Obesity families dataset, a very strong association with BMI corrected for age and sex was found for both SNPs (β=0.19 [0.09-0.29], p=8×10^−5 and β=0.17 [0.07-0.27], p=4×10^−4 for rs1421085 and rs17817449, respectively). All replication results are displayed in Table 5 below and genotype counts are shown in Table 1 above.
Using haplotype clustering methods (Molitor et al., 2003), a fine-mapping analysis was performed to restrict the localization of the underlying causal variant. 39 SNPs spanning 100 kb, which include the 47-kb block as well as adjacent blocks, were genotyped in 6933 individuals, including 2446 controls and 1935 obese adults and children (Table 6 below). This design covers, with r2 >0.8, all the HapMap SNPs displaying a MAF higher than 1% in this region.
The distribution of posterior location probability (
Actually, it appears that all the markers in the interval chr16:52344480-5240000 which are in high LD (r2 >0.7) with rs1421085 in European populations are of interest in the context of the present invention (Table 7).
Thus, in spite of a very high LD in this region, significant difference in association with obesity status was found along this region. The posterior probability distribution is in agreement with the fine-scale recombination data as retrieved from HapMap (www.hapmap.org) (
96 individuals were sequenced in the 20 kb region. This allowed the identification of 66 new SNPs (not yet reported in dbSNP, the “Single Nucleotide Polymorphism database”), which are set forth in Table 8A hereunder according to their position. These SNPs had not been identified so far, at least because the number of individuals used for the human genome sequence assembly is not large enough to ensure statistical power to detect all frequent genetic variations. Using 96 individuals gave here, for the first time, sufficient power to discover frequent SNPs (MAF >0.05). 62 dbSNPs validated through the above-described re-sequencing procedure are listed in Table 8B below. 26 dbSNPs not found through the above-described re-sequencing procedure are listed in Table 8C below.
Because this is the largest sequencing study performed so far in this region (96 individuals), the Inventors were able both to identify new SNPs (i.e., listed neither in dbSNP nor in HapMap) and to confirm (or discard) previously identified SNPs (in dbSNP, HapMap or any other public database). It is noteworthy that all the SNPs (confirmed and new) are within the scope of the present invention, as they are in strong linkage disequilibrium (high r2 and/or D′) with the defined at-risk SNPs (including rs1421085).
Table 9 below shows the results of the case-control analysis on 2400 controls (part of the controls used in the obesity studies described above) and 2200 type II diabetes patients of French Caucasian origin. The analysis was performed under the additive model.
Fatso (FTO) function is mostly unknown. Mice heterozygous for an FTO-syntenic Fused toes (Ft) mutation are characterized by partial syndactyly of forelimbs and massive thymic hyperplasia, indicating that programmed cell death is affected. Homozygous Ft/Ft embryos die at mid-gestation and show severe malformations of craniofacial structures. However, this physical inactivation involves several genes in the region, and thus these phenotypes are not necessarily related to FTO itself. In humans, a small chromosomal duplication has been identified in a large region of chromosome 16q12.2 which includes the fatso (FTO) locus (Stratakis et al., 2000). Besides mental retardation, dysmorphic facies, and digital anomalies, the authors also report obesity as a primary symptom. Fatso (FTO) locus variation was also recently reported to be modestly associated with the metabolic syndrome in French Canadian hypertensive families (Seda et al., 2005).
FTO gene expression was examined in several human tissues, especially those of interest for obesity such as brain and adipose tissue, and the human fatso gene was found to be expressed in all eleven tested tissues as shown in
Here, it is shown that several potentially functional SNPs in the fatso (FTO) locus are highly associated with early onset and severe obesity in European populations. The calculated Population Attributable Risk of 0.22, which is explained by the high frequency of the at-risk haplotype, argues for a putatively important effect on population corpulence. It appears to be the most significant association reported so far for obesity (Lyon et al., 2007). Also, it is shown here that the same SNPs are highly associated with type II diabetes.
It was recently shown that, although most research findings in genetic studies may be accidental, multiple replication of strong associations greatly enhances the positive predictive value of research findings, even if the pre-study odds are low (Moonesinghe et al., 2007). In this regard, fatso appears to be a gene with a strong contribution to obesity as well as type II diabetes, despite its as yet unknown role in glucose homeostasis.
Number | Date | Country | Kind |
---|---|---|---|
07108984.1 | May 2007 | EP | regional |
Filing Document | Filing Date | Country | Kind | 371c Date |
---|---|---|---|---|
PCT/EP2008/054031 | 4/3/2008 | WO | 00 | 10/1/2009 |
Number | Date | Country | |
---|---|---|---|
60909826 | Apr 2007 | US |