The present invention relates to genetic markers associated with Paget's disease of bone (PDB) and/or a predisposition/susceptibility thereto. Accordingly, the invention provides nucleotide sequences as well as associated proteins/peptides and/or compositions and methods, for use in preventing and/or treating PDB. The invention also extends to uses and methods related to the detection and/or diagnosis of PDB and/or a susceptibility/predisposition thereto.
Paget's disease of bone (PDB) is a common disorder of the skeleton that affects up to 2% of individuals of European ancestry aged 55 years and above1. The disease is characterized by focal areas of increased and disorganized bone remodeling that can cause bone pain, bone deformity, pathological fracture, deafness and secondary osteoarthritis2. Genetic factors are important contributors to PDB risk, and between 15% and 40% of individuals with PDB have an affected first-degree relative3. Mutations affecting the ubiquitin-associated domain of SQSTM1 have been identified in about 10% of individuals with what is termed ‘sporadic’ PDB and in about 40% of individuals with familial PDB4,5. Despite extensive research efforts and the identification of several susceptibility loci by linkage analysis6-8, the remaining genes that predispose to PDB have yet to be identified.
The present invention is based on the identification of a number of genetic markers that are associated with susceptibility to Paget's disease of bone (PDB). In particular, the inventors have determined that variations in sequence at several chromosomal loci are associated with the development of PDB and therefore provide markers for disease and/or a susceptibility/predisposition thereto.
Accordingly, this invention provides details of markers and nucleotide sequences as well as associated proteins/peptides and/or compositions and methods, for use in treating, preventing and/or detecting/diagnosing PDB and/or a susceptibility/predisposition thereto.
It should be understood that the phrase “associated with PDB disease” encompasses any link or correlation between the presence and/or absence of one or more nucleic acid sequence(s) or gene(s) and the symptoms, progression and/or development of PDB as well as a susceptibility/predisposition thereto.
One of skill will appreciate that the term “variant” as applied to the nucleic acid sequences and/or genes described herein, may encompass any variation in a nucleic acid sequence relative to, for example, a wild-type sequence or reference sequence obtained from an individual not suffering from or predisposed to PDB. Variations in a sequence may manifest as the addition, deletion, substitution and/or inversion of one or more nucleotides within a sequence. Additionally, or alternatively, the sequence variations may take the form of one or more polymorphism(s), where, for example, individual nucleotides are substituted for nucleotides not present in the wild-type or reference sequence. In other embodiments, a “variant” nucleic acid sequence may result from the occurrence of one or more mutations within the sequence. Again, as one of skill will appreciate, a mutated nucleic acid sequence may comprise one or more nucleotide inversions, additions, deletions and/or substitutions.
Each of the genetic loci identified as being associated with PDB is discussed in more detail below.
In one embodiment, the invention concerns the finding that chromosomal locus 1p13.3 is associated with PDB. Furthermore, the inventors have discovered that variations within the sequence of this region of chromosome 1 are associated with PDB. In one embodiment, the variant sequences associated with PDB are located within a 14-Kb linkage disequilibrium (LD) block, 87 Kb upstream of the CSF1 gene.
The CSF1 gene encodes macrophage colony-stimulating factor (M-CSF) which plays a critical role in osteoclast formation and survival. Accordingly, and without wishing to be bound by theory, the inventors suggest that the CSF1 gene and its product (M-CSF) are also associated with PDB. In particular, the inventors hypothesise that variations in sequences adjacent to (i.e. upstream or downstream of) the CSF1 gene, for example variations within the 1p13.3 locus, may modulate the function, expression and/or activity of CSF1. In turn, this may result in a modulation of the function, expression and/or activity of the M-CSF protein which may further modulate (for example increase or decrease) osteoclast formation, differentiation and/or survival.
The variant 1p13.3 sequences provided by this invention may comprise one or more polymorphisms. In one embodiment, the polymorphisms are selected from the group consisting of rs10494112, rs499345 and rs484959 (see Table 1/Table 1 (Example 2) for details).
As such, the present invention establishes a correlation/association between variant 1p13.3 sequences (including variant CSF1 sequences) and PDB.
The invention also concerns the finding that chromosomal locus 10p13 is associated with PDB. Furthermore, the inventors have discovered that variations within the sequence of this region of chromosome 10 are associated with PDB. In one embodiment, the variant sequences associated with PDB are located within a 30-Kb region of the OPTN gene.
The OPTN gene encodes optineurin and as a result of the observations presented herein, represents a new candidate gene for PDB. Accordingly, in addition to establishing an association between 10p13 sequence variants and instances of PDB, the inventors have identified an association between modulated (for example increased and/or decreased) OPTN (or Optineurin) function, activity and/or expression and PDB.
Without wishing to be bound by theory, the inventors note that optineurin is a ubiquitously expressed cytoplasmic protein having an ubiquitin binding domain, similar to that present in the protein NEMO. Optineurin negatively regulates TNF-α-induced NF-κB activation by interacting with ubiquitylated RIP proteins. Furthermore, the inventors recognise that studies have shown that Optineurin interacts with myosin VI, suggesting it plays a role in vesicular trafficking between the Golgi apparatus and plasma membrane. Given the fact that mutations affecting the VCP protein (also involved in vesicular trafficking), cause inclusion body myopathy with early onset Paget's disease and frontotemporal dementia (IBMPFD) syndrome, modulated optineurin function, activity and/or expression, may play a role in regulating bone metabolism through effects on NF-κB signalling and vesicular trafficking.
Accordingly, variations in the sequence of the 10p13 locus, including, for example variations in the OPTN sequence and/or sequences adjacent thereto, may modulate the function, expression and/or activity of OPTN. In turn, this may result in a modulation of the function, expression and/or activity of the Optineurin protein which may further modulate (for example increase or decrease) osteoclast formation, differentiation and/or survival.
The variant OPTN sequences provided by this invention may comprise one or more polymorphisms. In one embodiment, the variant OPTN sequences comprise the polymorphism rs1561570 (see Table 1/Table 1 (Example 2) for details).
In view of the above, the present invention establishes a correlation/association between sequence variations within the 10p13 locus and PDB.
The invention also concerns the finding that chromosomal locus 18q21 is associated with PDB. In particular, the inventors have discovered that variations in the sequence of this region of chromosome 18 (and in particular 18q21.33) are associated with PDB. In one embodiment, the variant sequences associated with PDB may be located within a 22-Kb region in proximity to the TNFRSF11A gene.
TNFRSF11A encodes the receptor activator of NF-κB (RANK) which plays an important role in the differentiation and formation of osteoclasts. The inventors note that mice comprising a targeted disruption of this gene exhibit severe osteopetrosis resulting from an almost complete lack of osteoclasts. Furthermore, loss-of-function mutations in TNFRSF11A cause osteoclast-poor osteopetrosis in humans. Moreover, mutations affecting the signal-peptide region of RANK cause the PDB-like syndromes of familial expansile osteolysis, early onset familial PDB and expansile skeletal hyperphosphatasia. In view of the above, the inventors suggest that TNFRSF11A and/or RANK function, activity and/or expression is important in the genetic regulation of bone metabolism.
Accordingly, variations in the TNFRSF11A sequence and/or sequences adjacent thereto (for example, upstream or downstream sequences) may modulate the function, expression and/or activity of TNFRSF11A. In turn, this may result in a modulation of the function, expression and/or activity of RANK which may further modulate (for example increase or decrease) osteoclast formation, differentiation and/or survival.
As such, in addition to establishing an association between 18q21 sequence variants and instances of PDB, the inventors have identified an association between modulated (for example increased and/or decreased) TNFRSF11A and/or RANK function, activity and/or expression and PDB.
The variant 18q21 sequences provided by this invention may comprise one or more polymorphisms. In one embodiment, the polymorphisms may be selected from the group consisting of rs2957128 and rs3018362 (see Table 1/Table 1 (Example 2) for details).
Accordingly, the present invention establishes a correlation/association between 18q21 sequence variants and PDB.
The invention also concerns the finding that chromosomal locus 8q22.3 is associated with PDB. In particular, the inventors have discovered that variations in the sequence of this region of chromosome 8 (and in particular 8q22.3) are associated with PDB. In one embodiment, the variant sequences associated with PDB may be located in a region spanning approximately 400 kb and containing three known genes (RIMS2, TM7SF4, and DPYS). In particular, sequence variants associated with PDB cluster within an 18-kb linkage disequilibrium (LD) block spanning the entire Transmembrane 7 superfamily member 4 gene (TM7SF4).
TM7SF4 encodes dendritic cell-specific transmembrane protein (DC-STAMP) and, without wishing to be bound by theory, is a strong functional candidate gene for PDB since DC-STAMP is involved in osteoclast differentiation and is required for the fusion of osteoclast precursors to form mature osteoclasts. RANKL-induced DC-STAMP expression is essential for osteoclast formation, and connective tissue growth factor CCN2 stimulates osteoclast fusion through interaction with DC-STAMP. Since osteoclasts from patients with PDB are larger in size and contain more nuclei than normal osteoclasts, it is possible that the genetic variants that predispose to PDB at this locus (8q22.3) do so by enhancing TM7SF4 expression (causing gain-of-function).
Accordingly, variations in the TM7SF4 sequence and/or sequences adjacent thereto (for example, upstream or downstream sequences) may modulate the function, expression and/or activity of TM7SF4.
As such, in addition to establishing an association between 8q22.3 sequence variants and instances of PDB, the inventors have identified an association between modulated (for example increased and/or decreased) TM7SF4 function, activity and/or expression and PDB.
The variant 8q22.3 sequences provided by this invention, may comprise one or more polymorphisms. In one embodiment, the polymorphisms may comprise rs2458413 (see Table 1/Table 1 (Example 2) for details).
Accordingly, the present invention establishes a correlation/association between 8q22.3 sequence variants and PDB.
In one embodiment, the invention concerns the finding that chromosomal locus 7q33 is associated with PDB. Furthermore, the inventors have discovered that variations within the sequence of this region of chromosome 7 are associated with PDB. In one embodiment, the variant sequences associated with PDB are located within a ~500 kb region containing three known genes (CNOT4, NUP205, and SLC13A4) and two predicted protein coding transcripts (PL-5283 and FAM180A).
The variant 7q33 sequences provided by this invention, may comprise one or more polymorphisms. In one embodiment, the polymorphisms comprise rs4294134, located within the 22nd intron of the gene NUP205. This gene encodes a protein called nucleoporin 205 kDa which is one of the main components of the nuclear pore complex involved in the regulation of transport between the cytoplasm and nucleus7.
As such, the present invention establishes a correlation/association between variant 7q33 sequences (including variant NUP205 sequences) and PDB.
The invention also concerns the finding that chromosomal locus 14q32.12 is associated with PDB. Furthermore, the inventors have discovered that variations within the sequence of this region of chromosome 14 are associated with PDB. In particular, the inventors have identified a 62 kb PDB associated region of chromosome 14 bounded by two recombination hotspots and the gene RIN3, which encodes Ras and Rab interactor 3.
Without wishing to be bound by theory, the inventors hypothesise that given the importance of small GTPases in vesicular trafficking and osteoclast function14, RIN3 may be involved in bone resorption. The inventors also note that mutations affecting VCP, a protein also involved in vesicular trafficking, cause inclusion body myopathy with early-onset Paget's disease and frontotemporal dementia (IBMPFD)15.
Accordingly, in addition to establishing an association between 14q32.12 sequence variants and instances of PDB, the inventors have identified an association between modulated (for example increased and/or decreased) RIN3 function, activity and/or expression and PDB.
Accordingly, variations in the sequence of the 14q32.12 locus, including, for example variations in the RIN3 sequence and/or sequences adjacent thereto, may modulate the function, expression and/or activity of RIN3. In turn, this may result in a modulation of the function, expression and/or activity of the RIN3 protein which may further modulate (for example increase or decrease) osteoclast formation, differentiation and/or survival.
The variant sequences provided by this embodiment of the invention may comprise one or more polymorphisms. In one embodiment, the variant 14q32.12 sequences comprise the polymorphism rs10498635 (see Table 1/Table 1 (Example 2) for details).
In view of the above, the present invention establishes a correlation/association between sequence variations within the 14q32.12 locus and PDB.
The invention also concerns the finding that chromosomal locus 15q24.1 is associated with PDB. In particular, the inventors have discovered that variations in the sequence of this region of chromosome 15 (and in particular 15q24.1) are associated with PDB. In one embodiment, the variant sequences associated with PDB may be located in a ~200 kb region bounded by two recombination hot spots and containing the promyelocytic leukaemia (PML) gene.
Without wishing to be bound by theory, the inventors hypothesise that the association between PDB and PML could be mediated by an effect on TGF-β signaling.
Accordingly, variations in the PML sequence and/or sequences adjacent thereto (for example, upstream or downstream sequences) may modulate the function, expression and/or activity of PML. In turn, this may result in a modulation of the function, expression and/or activity of PML which may further modulate (for example increase or decrease) osteoclast formation, differentiation and/or survival via effects on TGF-β signalling.
As such, in addition to establishing an association between 15q24.1 sequence variants and instances of PDB, the inventors have identified an association between modulated (for example increased and/or decreased) PML function, activity and/or expression and PDB.
The variant 15q24.1 sequences provided by this invention, may comprise one or more polymorphisms. In one embodiment, the polymorphisms comprise rs5742915, which results in a phenylalanine to leucine amino acid change at codon 645 (F645L) of the PML protein (see Table 1/Table 1 (Example 2) for details).
Accordingly, the present invention establishes a correlation/association between 15q24.1 sequence variants and PDB.
The invention also concerns the finding that chromosomal locus 6p22.3 is associated with PDB. Furthermore, the inventors have discovered that variations within the sequence of this region of chromosome 6, near the Prolactin gene PRL, are associated with PDB.
Without wishing to be bound by theory, it is suggested that Prolactin may affect bone metabolism by reducing sex hormone levels. Further, studies have shown that prolactin decreases the ratio of RANKL/OPG in human fetal osteoblast cells (Seriwatanachai D et al Cell Biol International 2008) and, in animal models, prolactin was found to inhibit osteoclastic activity (Takahashi, H. et al Zoological Science, 2008).
Accordingly, in addition to establishing an association between 6p22.3 sequence variants and instances of PDB, the inventors have identified an association between modulated (for example increased and/or decreased) PRL function, activity and/or expression and PDB.
Accordingly, variations in the sequence of the 6p22.3 locus, including, for example variations in the PRL sequence and/or sequences adjacent thereto, may modulate the function, expression and/or activity of PRL. In turn, this may result in a modulation of the function, expression and/or activity of the prolactin protein which may further modulate (for example increase or decrease) bone metabolism and osteoclast activity.
The variant sequences provided by this embodiment of the invention may comprise one or more polymorphisms. In one embodiment, the variant 6p22.3 sequences comprise the polymorphism rs1341239 (see Table 1/Table 1 (Example 2) for details).
In view of the above, the present invention establishes a correlation/association between sequence variations within the 6p22.3 locus and PDB.
The invention also concerns the finding that chromosomal locus Xq24 is associated with PDB. Furthermore, the inventors have discovered that variations within the sequence of this region of chromosome X are associated with PDB. In one embodiment, the variant sequences associated with PDB may be located within the SLC25A43 gene encoding a member of the mitochondrial carrier family of proteins.
Accordingly, in addition to establishing an association between Xq24 sequence variants and instances of PDB, the inventors have identified a possible association between modulated (for example increased and/or decreased) SLC25A43 function, activity and/or expression and PDB.
Accordingly, variations in the sequence of the Xq24 locus, including, for example variations in the SLC25A43 sequence and/or sequences adjacent thereto, may modulate the function, expression and/or activity of SLC25A43. In turn, this may result in a modulation of the function, expression and/or activity of the solute carrier family 25, member 43 protein which may further modulate (for example increase or decrease) osteoclast formation, differentiation and/or survival.
The variant sequences provided by this embodiment of the invention may comprise one or more polymorphisms. In one embodiment, the variant Xq24 sequences comprise the polymorphism rs5910578 (see Table 1/Table 1 (Example 2) for details).
In view of the above, the present invention establishes a correlation/association between sequence variations within the Xq24 locus and PDB.
As stated, the invention relates to variant nucleic acid and/or gene sequences that are associated with PDB. For convenience, it is reiterated that the term “variant” relates to nucleic acid (or gene) sequences which, when compared to corresponding wild-type/reference sequences, or sequences derived from subjects not suffering from or susceptible/predisposed to PDB, comprise one or more nucleotide variations and/or mutations comprising nucleotide additions, deletions, inversions and/or substitutions.
In particular, the invention relates to the CSF1, OPTN, TNFRSF11A, TM7SF4, RIMS2, DPYS, CNOT4, NUP205, SLC13A4, RIN3, PML, PRL and SLC25A43 genes and their products (including M-CSF, Optineurin, RANK, DC-STAMP, nucleoporin and Ras and Rab interactor 3) which have, for the first time, been associated with PDB. Hereinafter, these genes (i.e. those described in detail above) will be collectively referred to as “PDB associated genes”. Similarly, the products of each of these PDB associated genes will be referred to as “PDB associated proteins”. One of skill in this field will understand that a wild-type or reference sequence may comprise those deposited in the SNP database http://www.ncbi.nlm.nih.gov/snp as summarised in Table 1/Table 1 (Example 2), for the most strongly associated genetic markers. Additionally, or alternatively, wild-type or reference sequences for each of the genetic loci described herein may be obtained from the genome database at: http://www.ncbi.nlm.nih.gov/sites/entrez?db=Genome&itool=toolbar.
In view of the above, it should be understood that subjects in which the expression, function and/or activity of one or more PDB associated gene(s) and/or protein(s) is/are modulated, may be suffering from PDB or may be at altered risk of (i.e. susceptible or predisposed to) developing PDB.
In addition to identifying an association between certain genes, their protein products and PDB, the inventors have also identified and characterised a number of chromosomal loci linked to PDB. In particular, the inventors have determined that the loci 1p13.3, 10p13, 18q21, 8q22.3, 7q33, 14q32.12, 15q24.1, 6p22.3 and Xq24 are associated with PDB. For convenience, these chromosomal loci will be collectively referred to as “PDB associated chromosomal loci”.
In one embodiment, the inventors have established that variations in the nucleic acid sequence of these parts of chromosomes 1, 10, 18, 8, 7, 14, 15, 6 and X are associated with PDB. In some cases, these sequence variations may affect the function, expression and/or activity of the PDB associated genes provided by this invention. For example, the sequence variations may reside in regulatory sequences associated with the expression, function or activity of the PDB associated genes—such variations leading to the modulated expression, function and/or activity of PDB associated genes.
In one embodiment, the sequence variations take the form of one or more polymorphisms (i.e. SNPs) and the details of certain, specific SNPs are detailed herein (see Table 1/Table 1 (Example 2)). However, one of skill will appreciate that the invention relates to other SNPs which are themselves associated with these specific SNPs. For example, the invention also relates to SNPs which are identified as being in linkage disequilibrium (in other words SNPs which are proximal/close to and linked) with any one of the SNPs detailed in Table 1. Accordingly, the data presented in, for example,
In view of the above, a further aspect of this invention provides a method of diagnosing PDB in a subject or detecting or identifying an altered risk of developing PDB in a subject, said method comprising the step of identifying any modulation in the function, expression and/or activity of a (or one or more of) PDB associated gene/protein, in a sample provided by a subject, wherein modulated function, expression and/or activity indicates a positive diagnosis of PDB and/or an altered risk of developing PDB.
Additionally, or alternatively, the method of diagnosing PDB in a subject or detecting or identifying an altered risk of developing PDB in a subject, may comprise the step of, identifying or detecting in a sample, provided by said subject, sequence variations within one or more of the PDB associated chromosomal loci detailed above.
In one embodiment, the sequence variations may take the form of one or more SNPs at a locus selected from the group consisting of:
(i) position 110,154,000 on chromosome 1;
(ii) position 110,163,205 on chromosome 1;
(iii) position 110,167,606 on chromosome 1;
(iv) position 105,428,608 on chromosome 8;
(v) position 13,195,732 on chromosome 10;
(vi) position 58,211,715 on chromosome 18;
(vii) position 58,233,073 on chromosome 18;
(viii) position 134,943,668 on chromosome 7;
(ix) position 92,173,062 on chromosome 14;
(x) position 72,123,686 on chromosome 15;
(xi) position 22,412,183 on chromosome 6; and
(xii) position 118,451,730 on chromosome X.
It should be noted that the nucleotide positions are based on NCBI human genome build 36 and thus, over time, these positions/co-ordinates may alter (although the “rs” notation (see below) will still apply and one of skill would easily be able to determine the new position).
In one embodiment, the variations at each of the above described loci may comprise the addition, deletion, insertion, inversion, substitution and/or mutation of one or more nucleotides. In a further embodiment, the variations are single nucleotide polymorphisms (SNPs).
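By way of illustration only, the build 36 positions listed as (i) to (xii) above could be held in a simple lookup so that an observed variant position can be checked against them. The following is a minimal sketch; the names `PDB_SNP_POSITIONS`, `normalise_chrom` and `is_listed_position` are hypothetical and introduced solely for this example, and the co-ordinates would need converting if a different genome build were used.

```python
# Build 36 co-ordinates copied from items (i)-(xii) of the list above.
# All identifiers in this sketch are hypothetical, for illustration only.
PDB_SNP_POSITIONS = {
    ("1", 110154000),
    ("1", 110163205),
    ("1", 110167606),
    ("8", 105428608),
    ("10", 13195732),
    ("18", 58211715),
    ("18", 58233073),
    ("7", 134943668),
    ("14", 92173062),
    ("15", 72123686),
    ("6", 22412183),
    ("X", 118451730),
}


def normalise_chrom(chrom):
    """Strip an optional 'chr' prefix and upper-case the chromosome name."""
    name = str(chrom).strip()
    if name.lower().startswith("chr"):
        name = name[3:]
    return name.upper()


def is_listed_position(chrom, position):
    """Return True if (chrom, position) is one of the listed build 36 positions."""
    return (normalise_chrom(chrom), int(position)) in PDB_SNP_POSITIONS


if __name__ == "__main__":
    print(is_listed_position("chr1", 110154000))
    print(is_listed_position("1", 110154001))
```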
In a further embodiment, the sequence variations within the PDB associated chromosomal loci may comprise one or more SNPs which are in linkage disequilibrium with SNPs occurring at each of the loci listed as (i)-(vii) above—examples of such SNPs being identified in
In a yet further embodiment, the methods described herein may comprise the step of detecting one or more of the specific SNPs selected from the group consisting of:
Further details regarding each of these exemplary SNPs are given in Table 1/Table 1 (Example 2).
In one embodiment, the methods of detecting or diagnosing PDB and/or a predisposition or susceptibility thereto, comprise the step of analysing the nucleotides present at each of the alleles associated with SNPs (i)-(xiii) above and determining a risk allele score. One of skill will appreciate that a risk allele score reflects the number of risk alleles present at any one SNP. A risk allele score can be 0 (no risk alleles present), 1 (1 risk allele present) or 2 (2 risk alleles present). A risk allele score can be determined for one or more of the SNPs listed above. The total risk allele score (i.e. the sum of the scores calculated for one or more of the SNPs listed above) may then be used to predict the disease risk in a patient.
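The scoring described above may be sketched as follows. This is an illustrative sketch only, not a definitive implementation: the function names are hypothetical, each genotype is assumed to be supplied as two allele letters, and the second SNP in the example (`hypothetical_snp`, with risk allele "T") is a placeholder introduced purely for illustration. The risk allele "G" for rs10494112 is taken from this description. A SNP with a single allele copy (for example an X-linked SNP in a male subject) would need separate handling, which the sketch does not attempt.

```python
def snp_risk_score(genotype, risk_allele):
    """Count how many of the two alleles in a genotype match the risk allele.

    Returns 0, 1 or 2, mirroring the per-SNP risk allele score.
    """
    alleles = [allele.upper() for allele in genotype]
    if len(alleles) != 2:
        raise ValueError("expected a genotype with exactly two alleles")
    return sum(1 for allele in alleles if allele == risk_allele.upper())


def total_risk_score(genotypes, risk_alleles):
    """Sum the per-SNP scores over every tested SNP with a known risk allele."""
    return sum(
        snp_risk_score(genotypes[snp], risk_alleles[snp])
        for snp in genotypes
        if snp in risk_alleles
    )


if __name__ == "__main__":
    # "G" for rs10494112 comes from the description; the second entry is a
    # hypothetical placeholder and does not correspond to any real SNP.
    risk_alleles = {"rs10494112": "G", "hypothetical_snp": "T"}
    genotypes = {"rs10494112": "AG", "hypothetical_snp": "TT"}
    print(total_risk_score(genotypes, risk_alleles))
```

The total returned by `total_risk_score` corresponds to the summed score that would then be compared against a risk table.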
In one embodiment, the method of detecting or diagnosing PDB and/or a predisposition or susceptibility thereto, comprises the step of identifying the presence or absence of risk alleles at the SNP denoted rs10494112, wherein the presence of one or more risk alleles represents a disease risk. It should be understood that the “risk allele” at this SNP is “G”.
In a further embodiment, methods for detecting or diagnosing PDB and/or a predisposition or susceptibility thereto and which comprise the step of identifying the presence or absence of risk alleles at the SNP denoted rs10494112, may further comprise the steps of identifying the presence or absence of risk alleles at one or more other SNPs selected from the group consisting of:
wherein the presence of risk alleles at one or more of SNP locations (a)-(f) above, represents a disease risk.
It should be understood that the risk alleles at each of the above-mentioned SNP locations (i)-(xiii) are A, G, A, T, A, A, G, C, C, T and C respectively.
Based on the methods described above, the inventors have produced the following system, whereby one calculates the number of risk alleles detected in a sample provided by a subject being tested, adds the number of risk alleles for each SNP tested and uses the following table to predict disease risk in that subject, wherein the scores highlighted in grey represent an increased disease risk.
One of skill will appreciate that the results of any of the methods described herein may be compared to the results of control or reference values obtained from healthy subjects or subjects known not to be susceptible or predisposed to PDB. For example, where an assay determines the level of modulation of activity, function and/or expression of a PDB associated gene, the results may be compared to the level of expression, function and/or activity of the same PDB associated gene in a reference or control sample.
One of skill will appreciate that, in the early stages, PDB is an asymptomatic disease and thus diagnosis of early stage disease and/or identifying individuals who would benefit from early treatment (as a prophylactic, delaying or preventative measure) is difficult. Accordingly, the present invention provides methods which may be used to predict the likelihood of an individual developing PDB and recommend that individual for early treatment. Accordingly, the methods described herein may be used as a predictive test.
In addition, the methods described herein may be used in combination with existing PDB diagnostic methods—including those relying on variant SQSTM1 sequences. In this way, while existing tests may be able to predict disease (or at least an increased risk of disease) in about 10% of those who are predisposed/susceptible to PDB or who will go on to develop PDB, by combining these existing tests with the new methods described herein, it may be possible to significantly increase the number of people identified as likely to get PDB by about 80% (to about 90% of the population actually susceptible to PDB).
Detecting the modulated function, expression and/or activity of a PDB associated gene/protein and/or one or more variants the sequence of loci 1p13.3, 10p13, 18q21, 8q22.3, 7q33, 14q32.12, 15q24.1, 6p22.3 and Xq24 in a sample from a subject, indicates that that subject might have PDB or might be predisposed or susceptible to developing PDB. Accordingly, the subject to be tested may be known to be suffering from PDB or may be suspected of having PDB. In other embodiments, the subject may be healthy (i.e. with no symptoms of PDB) or may be suspected of being susceptible or predisposed to developing PDB—perhaps because of familial cases.
The methods provided by this invention may require the use of a sample comprising nucleic acid or from which nucleic acid can be obtained. The sample may be provided by (or obtained from) a subject to be tested and may take the form of a tissue biopsy, scraping or swab. Furthermore, the sample may comprise a fluid, for example a bodily fluid, such as whole blood, serum, sweat, plasma, semen, urine, lymph amniotic fluid, tissue and/or glandular secretions and/or saliva.
The Methods which may be used to detect modulated expression, function and/or activity of PDB associated gene(s), or variant nucleic acid sequences at each of the genetic loci described herein, are well known to one of skill in this field and may include, polymerase chain reaction (PCR) based techniques utilising, for example using genomic DNA as template or reverse transcriptase (RT)-PCR (see below) techniques in combination with real-time PCR (otherwise known as quantitative PCR). Other useful techniques may include restriction fragment length polymorphism analysis (RFLP) and/or hybridisation techniques using probes and/or primers designed to hybridise under conditions of high, medium and/or low stringency, to sequences within these loci. Suitable probes and/or primers (i.e. oligonucleotide sequences) are described herein (see below). Further information on such techniques may be found in Molecular Cloning: A Laboratory Manual (Third Edition) By Sambrook, MacCallum & Russell, Pub. CSHL; ISBN 978-087969577-4—incorporated herein by reference.
For example, PCR techniques useful in the detection of variant nucleic acid sequences may require the use of short oligonucleotide primers designed to hybridise to sequences proximal to (for example 3′ and 5′ (upstream or downstream)) of a nucleic acid sequence of interest—for example a sequence potentially harbouring a variant sequence. Once oligonucleotides have been hybridised to a nucleic acid sequence, the nucleic acid sequence between the primers is enzymatically amplified via the PCR. The amplified nucleic acid may then be sequenced to determine whether or not it comprises a variant sequence. Additionally, or alternatively, the amplified nucleic acid may be contacted with one or more restriction enzymes—this technique is particularly useful if a variant nucleic acid sequence is known either to remove a particular restriction site or create a restriction site. The presence or absence of a variant nucleic acid sequence may be detected via analysis of the resulting restriction fragment length polymorphism (RFLP) profile. When analysing RFLP profiles, the results may be compared to standard or control profiles obtained by contacting nucleic acid sequences obtained from health patients (i.e. patients not suffering from or predisposed to PDB), with the same restriction enzymes.
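The effect of a variant on an RFLP profile may be illustrated computationally. The following sketch assumes a linear amplified fragment and a single restriction enzyme; the two short sequences are hypothetical and are not taken from any PDB associated locus. The recognition site GAATTC, cut after the first G, is that of the well known enzyme EcoRI and is used here only as a familiar example. The function name `fragment_lengths` is likewise hypothetical.

```python
def fragment_lengths(sequence, site, cut_offset):
    """Return the fragment lengths produced by cutting a linear sequence.

    `site` is the recognition sequence and `cut_offset` is the number of
    bases from the start of the site to the cut position on the given strand.
    """
    seq = sequence.upper()
    recognition = site.upper()
    cuts = []
    start = seq.find(recognition)
    while start != -1:
        cuts.append(start + cut_offset)
        start = seq.find(recognition, start + 1)
    bounds = [0] + cuts + [len(seq)]
    return [end - begin for begin, end in zip(bounds, bounds[1:])]


if __name__ == "__main__":
    # Hypothetical amplified sequences: the variant carries a single
    # substitution (A to G) that removes the GAATTC site.
    reference = "ATGCGAATTCGGTA"
    variant = "ATGCGAGTTCGGTA"
    print(fragment_lengths(reference, "GAATTC", 1))
    print(fragment_lengths(variant, "GAATTC", 1))
```

In this illustration the reference sequence is cut into two fragments while the variant remains uncut, which is the kind of difference that would be visible when the digestion products are compared with a control profile.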
In addition to the above, altered electophoretic mobility may be used to detect alterations in nucleic acid sequences. For example, small sequence deletions and insertions may be visualised by high resolution gel electrophoresis—nucleic acid sequences with different sequences migrating through agarose gels (denaturing or non-denaturing and/or gradient gels) at different speeds/rates.
One of skill will appreciate that relative levels of mRNA expression may be used as a means of determining the level of expression, activity and/or function of a particular gene (such as, for example, the PDB associated genes described herein). By way of example, modulation (i.e. an increase or decrease) in the amount of mRNA encoding the genes now found to be associated with PDB may indicate modulated gene expression, function and/or activity and may further indicate that a subject is suffering from, or predisposed/susceptible to, PDB. More specifically, real-time PCR may be used to determine the level of expression of any of the PDB associated genes described herein. Typically, and in order to quantify the level of expression of a particular nucleic acid sequence, RT-PCR may be used to reverse transcribe the relevant mRNA to complementary DNA (cDNA). Preferably, the reverse transcriptase protocol may use primers designed to specifically amplify an mRNA sequence of interest (in this case mRNA encoding all or part of a PDB associated gene). Thereafter, PCR may be used to amplify the cDNA generated by reverse transcription. Typically, the cDNA is amplified using primers designed to specifically hybridise with a certain sequence and the nucleotides used for PCR may be labelled with fluorescent or radiolabelled compounds.
One of skill in the art will be familiar with the technique of using labelled nucleotides to allow quantification of the amount of DNA produced during a PCR. Briefly, and by way of example, the amount of labelled amplified nucleic acid may be determined by monitoring the amount of incorporated labelled nucleotide during the cycling of the PCR.
Further information regarding the PCR based techniques described herein may be found in, for example, PCR Primer: A Laboratory Manual, Second Edition Edited by Carl W. Dieffenbach & Gabriela S. Dveksler: Cold Spring Harbour Laboratory Press and Molecular Cloning: A Laboratory Manual by Joseph Sambrook & David Russell: Cold Spring Harbour Laboratory Press.
Other techniques that may be used to determine the level of PDB associated gene expression in a sample, include, for example, northern and/or Southern blot techniques. A northern blot may be used to determine the amount of a particular mRNA present in a sample and as such, could be used to determine the amount of PDB associated gene expression. Briefly, total or messenger (m)RNA may be extracted from any of the samples described above using techniques known to the skilled artisan. The extracted RNA may then be subjected to electrophoresis. A nucleic acid probe, designed to hybridise (i.e. complementary to) an RNA sequence of interest—in this case the mRNA encoding all or part of a PDB associated gene, may then be used to detect and quantify the amount of a particular mRNA present in a sample.
Additionally, or alternatively, a level of PDB associated gene expression may be identified by way of microarray analysis. Such a method would involve the use of a DNA microarray which comprises nucleic acid derived from PDB associated genes. To identify a level of PDB associated gene expression, one of skill in the art may extract the nucleic acid, preferably the mRNA, from a sample and subject it to an amplification protocol such as RT-PCR to generate cDNA. Preferably, primers specific for a certain mRNA sequence (in this case sequences encoding PDB associated genes) may be used.
The amplified PDB associated gene cDNA may be subjected to a further amplification step, optionally in the presence of labelled nucleotides (as described above). Thereafter, the optionally labelled amplified cDNA may be contacted with the microarray under conditions which permit binding with the DNA of the microarray. In this way, it may be possible to identify a level of PDB associated gene expression.
In addition, other techniques such as deep sequencing and/or pyrosequencing may be used to detect PDB associated sequences and/or variant 1p13.3, 10p13, 18q21, 8q22.3, 7q33, 14q32.12, 15q24.1, 6p22.3 and Xq24 sequences, in any of the samples described above. Further information on these techniques may be found in “Applications of next-generation sequencing technologies in functional genomics”, Olena Morozovaa and Marco A. Marra, Genomics Volume 92, Issue 5, November 2008, Pages 255-264 and “Pyrosequencing sheds light on DNA sequencing”, Ronaghi, Genome Research, Vol. 11, 2001, pages 3-11.
In other embodiments, samples provided by subjects to be tested may be analysed or probed for the levels of each of the PDB associated proteins described herein. For example, immunological detection techniques such as, for example, enzyme linked immunosorbent assays (ELISAs) or Western blot and/or immunoblot techniques may be used. Such techniques may require the use of binding agents or antibodies specific to, or selective for, the various PDB associated gene products (or fragments thereof) described herein. Further information on such techniques may be found in Using Antibodies: A Laboratory Manual By Harlow & Lane, Pub. CSHL, ISBN 978-087969544-6 and Antibodies: A Laboratory Manual by Harlow & Lane, CSHL, ISBN 978-087969314-5—both of which are incorporated herein by reference.
In one embodiment, binding agents (for example antibodies) having affinity/specificity/selectivity to/for any of the PDB associated proteins (or epitopes thereof) may be coated onto the surface of a suitable substrate (for example a microtitre plate). Thereafter, the coated substrate may be contacted with a sample to be tested for the presence or absence of PDB associated proteins. Binding between any PDB associated proteins present in the sample and the binding agents coated onto the surface of the substrate may be detected by means of a secondary binding agent having specificity for a PDB associated protein. Secondary antibodies useful in the present invention may optionally be conjugated to moieties which permit them to be detected (referred to hereinafter as “detectable moieties”). For example, the secondary antibodies may be conjugated to an enzyme capable of reporting a level via a colourimetric or chemiluminescent reaction. Such conjugated enzymes may include but are not limited to Horse Radish Peroxidase (HRP) and Alkaline Phosphatase (AlkP). Additionally, or alternatively, the secondary antibodies may be conjugated to a fluorescent molecule, for example a fluorophore such as FITC, rhodamine or Texas Red. Other types of molecule which may be conjugated to binding agents include radiolabelled moieties.
Other techniques which exploit the use of agents capable of binding PDB associated proteins, for example antibodies, include, for example, techniques such as western blot or dot blot. A western blot may involve subjecting a sample to electrophoresis so as to separate or resolve the components, for example the proteinaceous components, of the sample. The resolved components/proteins may then be transferred to a substrate, such as nitrocellulose.
In order to identify any PDB associated proteins present in a sample, the substrate (for example nitrocellulose substrate) to which the resolved components and/or proteins have been transferred, may be contacted with a binding agent capable of binding PDB associated proteins under conditions which permit binding between any PDB associated proteins present in the sample (or transferred to the substrate) and the agents capable of binding the PDB associated proteins.
Advantageously, the agents capable of binding the PDB associated proteins may be conjugated to a detectable moiety.
Additionally, the substrate may be contacted with a further binding agent having affinity for the binding agent(s) capable of binding PDB associated proteins. Advantageously, the further binding agent may be conjugated to a detectable moiety.
In certain embodiments, any of the samples described above may be used as a source of PDB associated protein. Additionally or alternatively, the PDB associated protein may be isolated or purified from the sample, or produced in recombinant form.
Other immunological techniques which may be used to identify a level of PDB associated protein in a sample (particularly tissue or biopsy samples) include, for example, immunohistochemistry wherein PDB associated protein binding agents, are contacted with a sample such as those described above, under conditions which permit binding between any PDB associated protein present in the sample and the binding agent. Typically, prior to contacting the sample with the binding agent, the sample is treated with, for example a detergent such as Triton X100. Such a technique may be referred to as “direct” immunohistochemical staining.
Alternatively, the sample to be tested may be subjected to an indirect immunohistochemical staining protocol wherein, after the sample has been contacted with a PDB associated protein binding agent, a further binding agent (a secondary binding agent) which is specific for, has affinity for, or is capable of binding the PDB associated protein antigen binding agent, is used to detect PDB associated protein/binding agent complexes.
The skilled person will understand that in both direct and indirect immunohistochemical techniques, the binding agent or secondary binding agent may be conjugated to a detectable moiety. Preferably, the binding agent or secondary binding agent is conjugated to a moiety capable of reporting a level of bound binding agent or secondary binding agent via a colourimetric or chemiluminescent reaction.
In order to identify the levels of PDB associated protein present in a sample, one may compare the results of any of the immunological or molecular (i.e. PCR, RT-PCR) techniques described herein with results obtained from the same procedures using control or reference samples obtained from subjects not suffering from PDB and/or not predisposed or susceptible thereto. By way of example, a sample revealing an increased or decreased level of PDB associated gene activity, expression or function and/or an increased or decreased level of PDB associated protein, relative to that detected in a corresponding reference or control sample, may have been provided by a subject with PDB or susceptible or predisposed thereto.
In a further aspect, the present invention provides compounds, for example polynucleotides and/or polypeptides (proteins, peptides or amino acids), useful in the treatment or prevention of PDB.
Accordingly, in one embodiment, the present invention relates to polynucleotide and/or polypeptide fragments for use in treating or preventing PDB.
In a further aspect, the invention relates to the use of polynucleotide and/or polypeptide fragments for the manufacture of a medicament for treating or preventing PDB.
In a yet further aspect, the present invention provides a method of treating or preventing PDB, said method comprising the step of administering to a subject in need thereof, a therapeutically effective amount of a polynucleotide and/or polypeptide compound.
One of skill will appreciate that a “polynucleotide” compound comprises a chain of DNA or RNA nucleotides—also known as an oligonucleotide. Where the polynucleotide sequences find application in the treatment or prevention of PDB, they should be designed to restore wild-type gene expression, function and/or activity and/or to correct any aberrant (i.e. increased or decreased) gene expression, function and/or activity.
The methods, uses, medicaments and/or compositions provided by this invention may comprise polynucleotides and/or polypeptides, wherein said polynucleotides and/or polypeptides modulate the expression, activity and/or function of the CSF1, OPTN, TNFRSF11A, TM7SF4, RIMS2, DPYS, CNOT4, NUP205, SLC13A4, RIN3, PML, PRL and/or SLC25A43 genes described herein.
By way of example, the polynucleotide compounds of the present invention may comprise all or part of the sequence of the PDB associated genes described herein. Such sequences may be used in gene therapy techniques to restore wild-type PDB associated gene expression, function and/or activity in subjects suffering from PDB or susceptible/predisposed thereto. Accordingly, the polynucleotide sequences for use in the treatment or prevention of PDB may comprise sequences derived from wild-type, or normally functioning, PDB associated genes.
In one embodiment, the polynucleotide sequences for use in the compositions and medicaments of this invention comprise the CSF1, OPTN, TNFRSF11A, TM7SF4, RIMS2, DPYS, CNOT4, NUP205, SLC13A4, RIN3, PML, PRL and/or SLC25A43 gene sequences or fragments or portions thereof. Accordingly, the present invention provides polynucleotide sequences derived from the CSF1, OPTN, TNFRSF11A, TM7SF4, RIMS2, DPYS, CNOT4, NUP205, SLC13A4, RIN3, PML, PRL and/or SLC25A43 genes, for use in treating and/or preventing PDB. Compositions comprising polynucleotide sequences provided by this invention may be administered to subjects suffering from PDB or to those who are at risk of developing PDB.
In other embodiments, polynucleotide sequences of this invention may comprise antisense sequences which may be used to, for example, suppress aberrant gene expression. Where, for example, one or more of the genes CSF1, OPTN, TNFRSF11A, TM7SF4, RIMS2, DPYS, CNOT4, NUP205, SLC13A4, RIN3, PML, PRL and/or SLC25A43 is aberrantly expressed due to, for example, a variation in the gene sequence itself or some other sequence within the chromosomal loci described herein, an antisense oligonucleotide may be used to modulate, preferably suppress or ablate, the aberrant gene expression. Antisense oligonucleotides may comprise DNA and/or RNA. In the case of RNA based antisense oligonucleotide sequences, the oligonucleotide may take the form of a small/short interfering and/or silencing RNA, such molecules being referred to hereinafter as siRNA. One of skill will appreciate that RNA molecules of this type may be modified in some way so as to be nuclease resistant.
By analysing the sequence of the various loci of this invention, one of skill may utilise algorithms such as, for example, BIOPREDsi, to determine or computationally predict nucleic acid sequences that have an optimal knockdown effect for a particular gene sequence. Accordingly, the skilled person may easily and without burden prepare and test a library of different oligonucleotides to determine whether or not any are capable of modulating the expression, function and/or activity of any of the CSF1, OPTN, TNFRSF11A, TM7SF4, RIMS2, DPYS, CNOT4, NUP205, SLC13A4, RIN3, PML, PRL and/or SLC25A43 genes.
Polypeptide sequences provided by this invention may take the form of proteins, polypeptides or amino acids, comprising sequences derived from or comprising the PDB associated genes described herein. For example, the medicaments, uses, methods and/or compositions provided by this invention may comprise polypeptides designed to modulate or mimic the expression, function or activity of a PDB associated protein. More specifically, where the expression, activity or function of a PDB associated protein is aberrant, resulting in PDB or a susceptibility/predisposition thereto, polypeptides comprising wild-type or normal PDB associated protein sequences may be used to treat or prevent PDB.
One of skill will readily understand that genes homologous to the human PDB associated genes provided by this invention, may be found in a number of different species, including, for example, other mammalian (particularly rodent) species.
Homologous genes may exhibit as little as approximately 20 or 30% sequence homology or identity; however, in other cases, homologous genes may exhibit at least 40, 50, 60, 65, 70, 75, 80, 85, 90, 91, 92, 93, 94, 95, 96, 97, 98 or 99% homology to the various nucleotide sequences given above. As such, homologous genes from other species are to be included within the scope of this invention.
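The percentage figures referred to above may be illustrated, purely by way of example, by counting identical positions between two pre-aligned sequences. The following is a minimal sketch: the two sequences are invented, the function name `percent_identity` is hypothetical, and the choice of denominator (all aligned columns in which at least one sequence has a residue) is an assumption, since several conventions for calculating percentage identity are in use.

```python
# Hypothetical example: percentage identity between two sequences that have
# already been aligned to the same length, with "-" marking gap positions.

def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Return the percentage of compared columns holding the same residue.
    Columns that are gaps in both sequences are ignored; a column with a
    gap in one sequence counts as a mismatch (an assumed convention)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" and b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared

reference_fragment = "ATGGCCAAGT"  # invented reference-like fragment
homologue_fragment = "ATGGCTAAGA"  # invented homologue-like fragment

print(percent_identity(reference_fragment, homologue_fragment))
```

In practice the alignment itself would be produced by a dedicated alignment tool before any such calculation is made.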
Furthermore, using the various nucleic acid and amino acid sequences of the genes/proteins described herein, one of skill in the art could readily identify related sequences in other species, such as other mammals etc. For example, nucleic acid obtained from a particular species may be analysed using any of the probes described herein (see below), for homologous, identical or closely related sequences.
One of skill in this field will readily understand that for the various nucleic acid sequences and polypeptides described herein, natural variations due to, for example, polymorphisms, may exist between genes and proteins isolated from any given species. These variants may manifest as proteins and/or genes that exhibit one or more amino acid/nucleic acid substitutions, additions, deletions and/or inversions relative to a reference sequence (for example any of the sequences described above). As such, it is to be understood that all such variants, especially those that are functional or display the desired activity, are to be included within the scope of this invention.
In other embodiments, the invention relates to derivatives of any of the sequences described herein. The term “derivatives” may encompass gene or peptide sequences which, relative to those described herein, comprise one or more amino acid substitutions, deletions, additions and/or inversions.
Additionally, or alternatively, analogues of the various peptides described herein may be produced by introducing one or more conservative amino acid substitutions into the primary sequence. One of skill in this field will understand that the term “conservative substitution” is intended to embrace the act of replacing one or more amino acids of a protein or peptide with an alternate amino acid with similar properties and which does not substantially alter the physico-chemical properties and/or structure or function of the native (or wild type) protein. Analogues of this type are also encompassed within the scope of this invention.
As is well known in the art, the degeneracy of the genetic code permits substitution of one or more bases in a codon without changing the primary amino acid sequence. Consequently, although the sequences described in this application are known to encode certain proteins (each of which is described herein), the degeneracy of the code may be exploited to yield variant nucleic acid sequences which encode the same primary amino acid sequences.
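The effect of codon degeneracy described above may be illustrated by the following minimal sketch, in which two different nucleotide sequences translate to the same primary amino acid sequence. The sequences are invented for illustration, the function name `translate` is hypothetical, and the codon table is deliberately partial: it contains only the standard genetic code assignments needed for this example.

```python
# Hypothetical example: two distinct coding sequences that encode the same
# peptide. Only a small subset of the standard genetic code is included.

PARTIAL_CODON_TABLE = {
    "ATG": "M",                                      # methionine
    "CTG": "L", "TTA": "L",                          # two of the leucine codons
    "TCT": "S", "AGC": "S",                          # two of the serine codons
}

def translate(coding_sequence: str) -> str:
    """Translate a DNA coding sequence codon by codon.
    Raises KeyError for codons absent from the partial table."""
    if len(coding_sequence) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    codons = (coding_sequence[i:i + 3] for i in range(0, len(coding_sequence), 3))
    return "".join(PARTIAL_CODON_TABLE[codon] for codon in codons)

sequence_a = "ATGCTGTCT"  # ATG CTG TCT
sequence_b = "ATGTTAAGC"  # ATG TTA AGC

print(translate(sequence_a), translate(sequence_b))
print("same peptide:", translate(sequence_a) == translate(sequence_b))
```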
The present invention also provides polynucleotide sequences comprising nucleotides which are complementary to nucleotide sequences (preferably contiguous sequences) adjacent to and/or comprising, the variant sequences described herein. In one embodiment, the polynucleotide sequences are primer or probe sequences which may otherwise be referred to as oligonucleotides. In one embodiment, the probe or primer oligonucleotides provided by this invention may comprise, for example 5-50, 6-40, 7-30, 8-20 nucleotides. In one embodiment, and particularly where the oligonucleotide has application as a probe, the oligonucleotide may comprise a nucleotide complementary to a SNP (i.e. complementary to the minor allele). One of skill will appreciate that primer sequences comprising nucleotides complementary to sequences upstream and/or downstream of any of the SNPs described herein, may be used in PCR based techniques to amplify sections of nucleic acid comprising one or more SNP or SNP loci. Oligonucleotide sequences useful as probes or primers may be used to detect the presence or absence of certain SNP sequences in nucleic acid samples provided by subjects.
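By way of example only, the design of a probe complementary to a SNP-containing sequence may be sketched as follows. The flanking sequences and the allele are invented and do not correspond to any SNP described herein; the function names `reverse_complement` and `allele_probe` are hypothetical. The sketch simply returns the reverse complement of the target strand, so that the probe contains the nucleotide complementary to the chosen allele.

```python
# Hypothetical example: build an oligonucleotide probe complementary to a
# target sequence carrying a given SNP allele. All sequences are invented.

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(sequence: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return sequence.translate(COMPLEMENT)[::-1]

def allele_probe(left_flank: str, allele: str, right_flank: str) -> str:
    """Return a probe complementary to left_flank + allele + right_flank,
    written 5' to 3'."""
    target = left_flank + allele + right_flank
    return reverse_complement(target)

left_flank = "ACGTTGCA"   # invented sequence upstream of the SNP
right_flank = "GGATCCAT"  # invented sequence downstream of the SNP
minor_allele = "T"        # invented minor allele

probe = allele_probe(left_flank, minor_allele, right_flank)
print(probe, len(probe))
```

The resulting 17-nucleotide probe falls within the 8-20 nucleotide range mentioned above; a real probe would additionally be checked for melting temperature, secondary structure and specificity.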
In a further aspect, the present invention provides a method of treating and/or preventing PDB, said method comprising the step of obtaining a sample of nucleic acid (or a sample from which nucleic acid may be extracted or prepared) and subjecting the sample to the methods of detecting the presence or absence of a gene and/or polymorphism associated with PDB described herein, wherein subjects identified as suffering from or predisposed/susceptible to PDB are administered a medicament or composition to treat and/or prevent PDB. The medicament and/or composition may comprise any of the polynucleotides and/or polypeptides provided by this invention.
The polynucleotide and/or polypeptide sequences described herein may be isolated in that they are substantially free of any other biological material. Furthermore, the invention relates to recombinant polypeptide sequences generated using vectors comprising, for example, the CSF1, OPTN, TNFRSF11A, TM7SF4, RIMS2, DPYS, CNOT4, NUP205, SLC13A4, RIN3, PML, PRL and/or SLC25A43 genes (or fragments or portions thereof) and host cells into which the vectors are introduced.
In one embodiment, the invention provides vectors, for example, natural or synthetic vectors, adapted to receive and, in some cases express, genes or gene fragments. Such vectors may include plasmids or expression cassettes. As such, the vectors encompassed by this invention may include plasmid constructs comprising any of the polynucleotide sequences, including the antisense oligonucleotide sequences, described herein.
Vectors of the type described above may be introduced into a suitable cell for the generation of a recombinant product of any of the PDB associated genes (or fragment thereof) described herein. Accordingly, the present invention also extends to host cells modified to comprise any of the vectors described herein. Again, further information relating to the use of cell transformation protocols (for example heat-shock, electroporation and/or chemical transformation) may be found in Molecular Cloning: A Laboratory Manual (Third Edition) By Sambrook, MacCallum & Russell, Pub. CSHL; ISBN 978-087969577-4—incorporated herein by reference.
A further aspect of this invention relates to methods of identifying compounds which modulate the expression/function and/or activity of PDB associated genes/proteins—such compounds being of use in the treatment and prevention of PDB. In one embodiment, such a method may comprise the steps of:
(a) contacting a PDB associated gene/protein with an agent to be tested; and
(b) identifying or detecting any modulation in the expression, function and/or activity of the PDB associated gene/protein.
The step of contacting a PDB associated gene/protein with a test agent, may comprise the use of a cell, for example a mammalian cell, expressing (either naturally or through recombinant manipulation) a PDB associated gene/protein. The techniques which may be used to detect modulated PDB associated gene/protein expression are discussed in detail above.
The method provided by this aspect of the invention may easily be adapted to provide micro-array or high throughput assays capable of analysing large numbers of agents for modulatory activity on the expression, function or activity of a PDB associated gene/protein.
In a further aspect, the invention provides antibodies having affinity for or selective/specific for a polypeptide (or an epitope thereof) encoded by a variant 1p13.3, 10p13, 18q21, 8q22.3, 7q33, 14q32.12, 15q24.1, 6p22.3 and Xq24 sequence. Polyclonal and/or monoclonal antibodies are easily produced and/or purified using routine laboratory techniques. It should be understood that the term antibody also includes epitope binding fragments or derivatives such as, for example, single chain antibodies, diabodies, triabodies, minibodies and/or single domain antibodies. The term antibodies also encompasses Fab, (Fab)2 and/or other epitope binding fragments.
Accordingly, any of the polynucleotides, polypeptides, antibodies or test agents found to be potentially useful in the treatment or prevention of PDB may be formulated as sterile pharmaceutical compositions comprising a pharmaceutically acceptable carrier or excipient. Such carriers or excipients are well known to one of skill in the art and may include, for example, water, saline, phosphate buffered saline, dextrose, glycerol, ethanol, ion exchangers, alumina, aluminium stearate, lecithin, serum proteins, such as serum albumin, buffer substances such as phosphates, glycine, sorbic acid, potassium sorbate, partial glyceride mixtures of saturated vegetable fatty acids, water, salts or electrolytes, such as protamine sulphate, disodium hydrogen phosphate, potassium hydrogen phosphate, sodium chloride, zinc salts, colloidal silica, magnesium trisilicate, polyvinyl pyrrolidone, cellulose-based substances, polyethylene glycol, sodium carboxymethylcellulose, polyacrylates, waxes, polyethylene-polypropylene-block polymers and wool fat and the like, or combinations thereof.
A further aspect of the invention relates to kits for identifying and/or determining whether or not a variant PDB associated gene/protein and/or 1p13.3, 10p13, 18q21, 8q22.3, 7q33, 14q32.12, 15q24.1, 6p22.3 and Xq24 sequence is present or absent in, for example, a nucleic acid sample. A kit according to this aspect of the invention may include one or more pairs of oligonucleotide primers useful for amplifying a nucleotide sequence of interest. For example, the nucleotide sequence of interest may comprise one or more nucleic acid sites known to harbour variant 1p13.3, 10p13, 18q21, 8q22.3, 7q33, 14q32.12, 15q24.1, 6p22.3 and Xq24 sequence associated with PDB. The kit may comprise a polymerizing agent, for example, a thermo-stable nucleic acid polymerase such as one disclosed in U.S. Pat. No. 4,889,818 or 6,077,664. Furthermore, the kit may comprise an elongation oligonucleotide that hybridizes to sequence adjacent or proximal to a site potentially harbouring a variant 1p13.3, 10p13, 18q21, 8q22.3, 7q33, 14q32.12, 15q24.1, 6p22.3 and Xq24 sequence. Where the kit includes an elongation oligonucleotide, it may also include chain elongating nucleotides, such as dATP, dTTP, dGTP, dCTP, and dITP and/or analogs thereof. In addition, the kit provided by this aspect of this invention may optionally include terminating nucleotides such as, for example, ddATP, ddTTP, ddGTP, ddCTP. The kit may include one or more oligonucleotide primer pairs, a polymerizing agent, chain elongating nucleotides, at least one elongation oligonucleotide, and one or more chain terminating nucleotides. Kits may also optionally include reaction and/or storage buffers, vials or other storage/reaction vessels, microtiter plates and instructions for use.
The present invention will now be described in detail with reference to the following Figures which show:
Manhattan plot of association test results of GWAS stage data showing chromosomal position of 2,487,078 genotyped or imputed SNPs plotted against genomic-control adjusted −log10 P. The red horizontal line represents the threshold for genome wide significance (P<5×10−8).
The genome-wide association study was conducted in a discovery sample of 750 cases of predominantly British descent with clinical and radiological evidence of PDB in whom mutations of the SQSTM1 gene had been excluded by DNA sequencing. These comprised subjects who had participated in the PRISM study (n=597), a randomized trial of two different treatment strategies for PDB38; clinic-based subjects from the UK with sporadic PDB (n=55); and subjects with a family history of PDB derived from the UK (n=20), Australia (n=66), New Zealand (n=8) and Italy (n=4). Details of the 1,002 control subjects have previously been described9; in brief, they comprised healthy subjects of Scottish descent with no clinical evidence of PDB. For the replication study, we conducted genotyping in an additional 500 PDB cases without SQSTM1 mutations who were diagnosed according to standard techniques. These comprised subjects with sporadic PDB who had been recruited from hospital clinics in the UK (n=226), Italy (n=20) and Spain (n=200); subjects with sporadic PDB who had participated in the PRISM study (n=43); and subjects with a positive family history of PDB who had been recruited from hospital clinics in Australia (n=10) and the UK (n=1). The 535 replication controls comprised subjects from the UK who had been referred for investigation of osteoporosis but who had been found to have normal bone density on examination by dual energy X-ray absorptiometry (n=248), spouses of participants of the PRISM study who were not known to be affected by PDB (n=252) and clinic-based controls from Spain (n=35). All study participants were of European descent. The studies were approved by ethics review committees at the relevant institutions and all participants provided informed consent. The discovery sample had 96% power to detect disease-associated alleles with MAF=0.2 and a genotype relative risk of 1.6, assuming a multiplicative model and a disease with population prevalence of 2%.
Genotyping of PDB cases was performed at the genetics core of the Wellcome Trust Clinical Research Facility using Illumina HumanHap300-Duo BeadChip v2. Genotyping of the controls had been previously performed by Illumina Inc. using HumanHap300 v1 and HumanHap240S arrays9. Genotypes for cases and controls were called using BeadStudio v3.2 (Illumina, Inc.) by following the manufacturer's recommended protocol. Genotype data for control subjects were provided after applying the quality-control measures described previously9. For the cases, we used a no-call threshold of 0.15 in BeadStudio and quality-control metrics such as cluster separation, AB T mean (the mean normalized theta values of the heterozygote cluster) and AB R mean (the mean normalized intensity of the heterozygote cluster) to exclude badly performing SNPs. Samples with a call rate of less than 90% were excluded (n=30). The data were then subjected to further quality-control measures using PLINK39 to exclude SNPs with a call rate of less than 95%, those with Hardy-Weinberg equilibrium P values of less than 1.0×10−4 in controls and those with a minor allele frequency of less than 1%. This left a total of 294,663 SNPs common to cases and controls with at least 95% call rates in each set. Samples with excess heterozygosity (1 case), non-European ancestry (21 cases and 1 control) and related subjects (6 cases) were excluded before analysis, leaving a final total of 692 cases and 1,001 controls with an average (±standard deviation) genotype call rate of 99.63±1.0. The genotype cluster plots for all SNPs showing association with PDB at P<1.0×10−4 were visually inspected in BeadStudio. Population ancestry was determined using multidimensional scaling analysis of the IBS distances matrix of all individuals after combining genotype data from the HapMap project (release 22) samples of European (CEU), Asian (CHB and JPT) and African (YRI) ancestry. 
For this analysis, we first removed SNPs in areas of extended LD (chr. 2: positions 134.0-138.0; chr. 6: 25.0-34.0; chr. 8: 8.0-12.0; chr. 11: 45.0-57.0)40 and those with r2>0.2 within a 150-SNP window. SNPs with call rate <99%, MAF <5% and Hardy-Weinberg equilibrium P<1.0×10 in cases or controls were also excluded, leaving a total of 63,528 SNPs. The genome-wide average IBS distances matrix for all pairs of individuals was then calculated based on the total 63,528 SNPs using PLINK and was then used for multidimensional scaling analysis.
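The pairwise identity-by-state (IBS) distances referred to above were calculated using PLINK. Purely to illustrate the underlying idea, the following minimal sketch computes an IBS distance matrix from genotypes coded as the number of copies (0, 1 or 2) of one allele. The genotypes are invented, the function names are hypothetical, and the distance convention used here (one minus the proportion of alleles shared IBS over non-missing SNPs) is an assumption that should not be taken as a description of PLINK's exact implementation. The multidimensional scaling step is not shown.

```python
# Hypothetical example: pairwise identity-by-state (IBS) distances between
# individuals. Genotypes are coded 0, 1 or 2 (copies of one allele), with
# None marking a missing call. All data are invented.

def ibs_distance(genotypes_1, genotypes_2) -> float:
    """Return 1 - (alleles shared IBS) / (2 * number of SNPs called in both)."""
    shared = 0
    informative = 0
    for g1, g2 in zip(genotypes_1, genotypes_2):
        if g1 is None or g2 is None:
            continue  # skip SNPs missing in either individual
        shared += 2 - abs(g1 - g2)  # 2, 1 or 0 alleles shared IBS
        informative += 1
    if informative == 0:
        raise ValueError("no SNPs called in both individuals")
    return 1.0 - shared / (2 * informative)

def ibs_distance_matrix(samples):
    """Return the symmetric matrix of pairwise IBS distances."""
    n = len(samples)
    return [
        [0.0 if i == j else ibs_distance(samples[i], samples[j]) for j in range(n)]
        for i in range(n)
    ]

samples = [
    [0, 1, 2, 0],     # individual A
    [0, 1, 2, 0],     # individual B, identical to A at every SNP
    [2, 1, 0, None],  # individual C, opposite homozygote at two SNPs
]

matrix = ibs_distance_matrix(samples)
for row in matrix:
    print([round(value, 3) for value in row])
```

A matrix of this kind is the input to multidimensional scaling, in which individuals of similar ancestry are expected to cluster together.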
Genotyping of replication samples was performed by Sequenom using the MassARRAY iPLEX platform. DNA from cases and controls was distributed into 384-well plates so that each plate had the same number of cases and controls to minimize genotyping bias due to variations between runs. We included 100 samples from stage 1 as a quality-control measure. The concordance rate between Illumina and Sequenom platforms was >99.9%. Replication samples with call rate <95% were excluded (19 cases and 15 controls), leaving a total of 481 cases and 520 controls with an average genotype call rate of 99.61%. The call rate of all the genotyped SNPs was >95%.
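The SNP-level quality-control thresholds applied to the discovery data above (call rate of at least 95%, Hardy-Weinberg equilibrium P of at least 1.0×10−4 in controls and minor allele frequency of at least 1%) may be illustrated by the following minimal sketch. It is not the PLINK implementation: the Hardy-Weinberg test shown is a simple one-degree-of-freedom chi-square approximation chosen for brevity, the minor allele frequency is computed from the supplied control counts as a simplification, the function names are hypothetical and the genotype counts are invented.

```python
# Hypothetical example: apply call-rate, Hardy-Weinberg and minor allele
# frequency filters to a single SNP. Counts are invented; the chi-square
# test is a simplification used for illustration only.
import math

def minor_allele_frequency(n_aa: int, n_ab: int, n_bb: int) -> float:
    """Return the frequency of the rarer allele from genotype counts."""
    freq_a = (2 * n_aa + n_ab) / (2 * (n_aa + n_ab + n_bb))
    return min(freq_a, 1.0 - freq_a)

def hwe_chi_square_p(n_aa: int, n_ab: int, n_bb: int) -> float:
    """Return the P value of a 1-df chi-square test of Hardy-Weinberg
    equilibrium (observed versus expected genotype counts)."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of the chi-square distribution with 1 degree of freedom
    return math.erfc(math.sqrt(chi2 / 2.0))

def passes_qc(n_called: int, n_samples: int, n_aa: int, n_ab: int, n_bb: int,
              min_call_rate: float = 0.95, min_hwe_p: float = 1.0e-4,
              min_maf: float = 0.01) -> bool:
    """Return True if the SNP meets all three thresholds."""
    call_rate = n_called / n_samples
    return (call_rate >= min_call_rate
            and hwe_chi_square_p(n_aa, n_ab, n_bb) >= min_hwe_p
            and minor_allele_frequency(n_aa, n_ab, n_bb) >= min_maf)

# Invented control genotype counts for two SNPs
print(passes_qc(1000, 1001, 360, 480, 160))  # counts close to HWE proportions
print(passes_qc(1000, 1001, 500, 0, 500))    # no heterozygotes at all
```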
Genotypes were imputed using MACH41 for untyped variants located within 2.0 Mb of SNPs identified in stage 1 as having genome-wide significant association with PDB. The HapMap CEU genotype data from release 22 were used as a reference. To avoid spurious association caused by inaccurate imputation of SNPs located at both ends of the imputed segments, analysis was restricted to SNPs located in the middle 2 Mb of each 4-Mb imputed segment. We used 200 rounds of Markov chain iterations to estimate allele dosage and the most likely genotypes of individuals in the stage 1 data. Imputation quality was assessed by estimating the correlation (r2) between imputed and true genotypes. SNPs with r2<0.3 were excluded before further analysis. Analysis of imputed data was performed using logistic regression implemented in mach2dat42 in which the imputed allelic dosage was used to account for uncertainty in imputed genotypes.
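The two post-imputation filters described above can be sketched as a single predicate. This is only an illustration: it assumes that each 4-Mb imputed segment is centred on the stage 1 index SNP, so that the middle 2 Mb corresponds to positions within 1 Mb of that SNP, and the function and parameter names are hypothetical.

```python
def keep_imputed_snp(pos_bp, rsq, index_pos_bp,
                     half_window_bp=1_000_000, min_rsq=0.3):
    """Return True if an imputed SNP passes both filters.

    pos_bp:        base-pair position of the imputed SNP
    rsq:           estimated imputation quality (r2) for that SNP
    index_pos_bp:  position of the genome-wide significant stage 1 SNP
                   (assumed here to be the centre of the imputed segment)
    """
    in_middle_2mb = abs(pos_bp - index_pos_bp) <= half_window_bp
    good_quality = rsq >= min_rsq  # SNPs with r2 < 0.3 are excluded
    return in_middle_2mb and good_quality
```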
Statistical analyses were performed using PLINK (Version 1.07)39. In stage 1, genotyped SNPs were tested for association with PDB using a stratified CMH test. Samples were stratified based on their genome-wide IBS similarity so that individuals assigned to one cluster were not genetically different (P>0.001, obtained from a pairwise population concordance test). The quantile-quantile plot and genomic control factor (λ) were used to assess overdispersion of the test statistics and were calculated using the statistical package R version 2.7.2 (see URLs) based on the 90% least significant SNPs as described previously10. Stepwise logistic regression was used to test for independent effects of an individual SNP, where the allelic dosage of the conditioning SNP was entered as a covariate in the regression model along with the population clusters identified by IBS-sharing analysis described above in order to adjust for population substructure. Haplotype analysis was performed by logistic regression, which looked at the presence or absence of the test haplotype and included the population clusters as a covariate in the model. Haplotypes were phased using the expectation-maximization algorithm implemented in PLINK, and only haplotypes with a frequency of ≧1% were analyzed. The cutoff point for genome-wide significance was set as P<1.7×10−7 (0.05/294,663 total SNPs) for stage 1, and P<3×10−3 (0.05/16 total SNPs) for the replication stage. For the combined analysis, we set the threshold for significance as P<5×10−8 as recently proposed43. The replication and combined datasets were analyzed as described above except that the replication dataset was considered as a separate cluster when population clusters were used in a stratified CMH test or as a covariate in logistic regression models. The population attributable risk (PAR) for markers showing association with PDB was calculated according to the following formula:
PAR=p(OR−1)/[p(OR−1)+1]
where p is the frequency of the risk allele in controls and OR is the risk allele odds ratio. The cumulative PAR was calculated as follows:
$$\text{Cumulative PAR} = 1 - \prod_{i=1}^{n} \left(1 - \text{PAR}_i\right)$$
where n is the number of variants and PARi is the individual PAR for the ith SNP. URLs. R, http://www.r-project.org/.
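The two PAR formulas above translate directly into code. The following is a minimal sketch; the function names are illustrative, and the inputs in the test values are arbitrary numbers rather than data from the study.

```python
from math import prod


def par(p, odds_ratio):
    """Population attributable risk for one marker.

    p:          frequency of the risk allele in controls
    odds_ratio: risk allele odds ratio
    Implements PAR = p(OR - 1) / [p(OR - 1) + 1].
    """
    x = p * (odds_ratio - 1.0)
    return x / (x + 1.0)


def cumulative_par(pars):
    """Cumulative PAR over n variants: 1 - product of (1 - PAR_i)."""
    return 1.0 - prod(1.0 - value for value in pars)
```

Because the cumulative figure multiplies the complements of the individual PARs, it is bounded by 1 regardless of how many markers are combined.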
In this study, we sought to identify genetic variants that predispose to PDB in individuals without SQSTM1 mutations by using a genome-wide association approach. In the discovery population (stage 1), we genotyped 750 cases and 1,002 controls9 using Illumina arrays. In the replication population (stage 2), we genotyped the most significant SNPs identified from stage 1 in an independent set of 500 cases and 535 controls using the Sequenom MassARRAY iPLEX platform. Details of the subjects used in the discovery and replication stages of the study are provided in the Online Methods.
We used a multidimensional scaling analysis of an identity-by-state (IBS) sharing matrix of all individuals plus HapMap samples to assess population ancestry (
In total, 76 SNPs with P values of 1×10−4 or less were identified in the discovery dataset (Table 3). From these, we selected those SNPs with P<1.0×10−6 and also those with P<1.0×10−5 in which an additional SNP within 50 kb attained a P value of 1.0×10−3 or less for further analysis in the replication group. Following application of quality-control measures on the replication dataset, genotype data were obtained for 481 cases and 520 controls for the 16 selected SNPs (Table 4). Eight SNPs showed significant association with PDB in the replication stage after correction for multiple testing (P<3×10−3), resulting in the identification of eight SNPs for which the P values attained genome-wide significance in the combined dataset (Table 4). The distribution of minor alleles and the direction of associations were similar in both the discovery and replication datasets (Table 1 and Table 4). Although all samples used in the replication stage were from individuals of European ancestry, confounding due to population substructure is possible given the multiple nationalities of the replication cohorts. To address this issue, we tested for association in replication samples from individuals of British descent only (256 cases and 488 controls) using the CMH test. This yielded results that were qualitatively similar to those obtained from the entire replication cohort (Table 5). Linkage disequilibrium (LD) patterns in the associated regions were also similar across the study samples and were similar to those observed in HapMap CEU samples (
Significant association with PDB was observed on chromosome 1p13.3 for three SNPs (rs10494112, rs499345 and rs484959); the strongest signal was with rs484959 (combined P=5.38×10−24; Table 1 and Table 2). These SNPs were weakly correlated (r2<0.36) with other genotyped SNPs and are located in a 14-kb LD block 87 kb upstream of CSF1 (
A second locus showing significant association with PDB was situated on chromosome 10p13. Three SNPs (rs1561570, rs825411 and rs2095388), all located within a 30-kb region, were analyzed in both stages of the study and the strongest signal was observed for rs1561570 (combined P=6.09×10−13; Table 1 and Table 4). These three SNPs are weakly correlated with other genotyped SNPs (highest r2<0.37). rs825411 is not in LD with rs1561570 (r2=0.04) (
The 10p13 locus is marked by two recombination hot spots and contains only one known gene, OPTN (
The third region showing a significant association with PDB was located on chromosome 18q21.33 near TNFRSF11A, which encodes the receptor activator of NF-κB (RANK). Four SNPs within a 300-kb region reached genome-wide significance in the combined analysis (rs663354, rs2980996, rs2957128 and rs3018362). Regression analysis accounting for the genotypic-additive effect of the four SNPs showed that only rs2957128 (P=0.047) and rs3018362 (P=0.022) had independent effects (Table 7). Analysis of haplotypes formed by alleles of rs2957128 and rs3018362 showed that a risk haplotype ‘AA’ was consistently over-represented in PDB cases compared with controls in the combined sample of cases and controls (P=8.71×10−14, OR=1.55; Table 8). These two SNPs are moderately correlated (r2=0.55) and are located in adjacent LD blocks about 5 kb downstream of TNFRSF11A (
The TNFRSF11A gene product RANK plays a critical role in osteoclast differentiation and function. Mice with targeted disruption of Tnfrsf11a exhibit severe osteopetrosis due to complete absence of osteoclasts26, and loss-of-function mutations in TNFRSF11A cause osteoclast-poor osteopetrosis in humans27. Mutations affecting the signal peptide region of RANK cause the PDB-like syndromes of familial expansile osteolysis, early-onset familial PDB and expansile skeletal hyperphosphatasia28-30. Mutations of TNFRSF11A have not so far been identified in individuals with classical PDB28,31, although this region of chromosome 18q22 has been linked to PDB in some families32. It is also interesting to note that rs3018362 and rs884205, located downstream of TNFRSF11A, have recently been associated with bone mineral density and fracture risk33-35. The allele of rs3018362 that was associated with PDB was also associated with reduced bone mineral density, raising the possibility that this allele may be associated with increased bone turnover. rs884205 was not directly genotyped in our study, but it is moderately correlated with both rs3018362 (r2=0.53) and rs2957128 (r2=0.52). Imputation analysis showed evidence for association of rs884205 with PDB (imputed P value=5.93×10−11), confirming the importance of TNFRSF11A in the genetic regulation of bone metabolism.
The three loci on chromosomes 1p13, 10p13 and 18q21 identified in this study appear to have independent roles, as we found no evidence to suggest that the associated SNPs within these loci interacted with each other to influence susceptibility to PDB (P>0.33 for all interlocus pairwise interactions; Table 9). These data are consistent with a multiplicative model for association with PDB. The cumulative population attributable risk for the SNPs showing independent association with PDB was 70%. Additionally, the risk of PDB increased with an increasing number of risk allele scores (ORper-risk allele=1.34, 95% CI 1.29-1.40, P=5.81×10−45), with individuals carrying ten or more risk alleles having a sixfold increase in PDB risk compared to those with the median number of risk alleles (Table 10).
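The risk allele score and the multiplicative model mentioned above can be sketched as follows. The SNP identifiers and allele letters used in any call are placeholders supplied by the caller, since the risk alleles themselves are not listed in this passage, and the odds ratio returned by `relative_odds` is a model-based value that need not equal the per-category estimates reported in Table 10.

```python
def risk_allele_score(genotypes, risk_alleles):
    """Count risk alleles carried across a set of SNPs.

    genotypes:    dict mapping SNP ID to a two-letter genotype, e.g. "AG"
    risk_alleles: dict mapping SNP ID to its risk allele (hypothetical input)
    """
    score = 0
    for snp, risk in risk_alleles.items():
        score += genotypes[snp].count(risk)  # 0, 1 or 2 copies per SNP
    return score


def relative_odds(score, reference_score, per_allele_or):
    """Odds ratio implied by a multiplicative (log-additive) model.

    Each extra risk allele multiplies the odds by per_allele_or, so the
    odds relative to the reference score is per_allele_or raised to the
    difference in allele counts.
    """
    return per_allele_or ** (score - reference_score)
```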
It is likely that other genomic regions also contribute to PDB because the present study was powered only to detect variants with a moderate effect size (risk allele OR>1.6). A quantile-quantile plot showing the distribution of P values after removal of all genome-wide significant SNPs and correlated markers showed an excess in the number of SNPs with low P values compared to what is expected by chance (
In summary, we have demonstrated that common genetic variants at loci close to CSF1, OPTN, TM7SF4 and TNFRSF11A are independently associated with PDB. Further studies are now warranted to explore the mechanisms responsible for these associations.
7.25 × 10−12
9.14 × 10−12
1.13 × 10−10
1.15 × 10−13
4.21 × 10−10
1.35 × 10−10
aGenomic-control adjusted P value from stratified Cochran-Mantel-Haenszel test.
9.14 × 10−12
1.13 × 10−10
1.15 × 10−13
7.25 × 10−12
4.21 × 10−10
1.35 × 10−10
1.67 × 10−14
3.02 × 10−16
5.38 × 10−24
6.09 × 10−13
1.86 × 10−11
5.27 × 10−13
aGenomic-control adjusted P values from association testing using stratified Cochran-Mantel-Haenszel test.
aReplication samples of British descent (256 cases and 488 controls) were tested for association with Paget's disease using Cochran-Mantel-Haenszel test.
aResults are based on the combined dataset for the SNPs showing genome wide significant association with PDB.
4.34 × 10−11
aOdds ratio and P value were calculated using logistic regression adjusting for population clusters.
aPair-wise SNP interaction for SNPs showing independent association from each locus. Analysis was performed using the combined dataset. OR is odds ratio for interaction.
aRisk allele score of the six SNPs showing independent association with PDB in the combined dataset (chr.1: rs10494112, rs484959; chr. 10: rs1561570 and rs825411; chr. 18: rs2957128 and rs3018362). Allele scores were normally distributed in cases and controls. Individuals carrying low-frequency scores (allele scores labeled 0 and 1 and 11 and 12) were combined together.
bORs are relative to the median number of risk alleles in the controls (five risk alleles).
This study describes an extension to our previously reported GWAS of PDB in which we used genotype data from 692 PDB cases from our previously described study1, and extended the case group by genotyping an additional 57 PDB cases. The additional cases were selected from recently recruited subjects in the PRISM study23, a randomised trial of two different treatment strategies for PDB patients from the UK. We also increased the size of the control group by using genotype data from 2,930 subjects from the British 1958 Birth Cohort genotyped by the Wellcome Trust Case-Control Consortium7. This control group represents a better match to our PDB cases than the previous controls, which were recruited from Scotland1, since, like the PRISM participants, they were recruited from all over the UK. The extended sample size used in this study provided 90% power to detect a disease-associated allele with MAF=0.2 and a genotype relative risk of 1.4, assuming a multiplicative model and a disease with a population prevalence of 2%. This represents a substantial increase in power compared to our previous study1, where we had 20% power to detect alleles with a genotype relative risk of 1.4.
GWAS stage genotyping and quality control. Genotyping and quality control for the 692 PDB cases were performed using Illumina HumanHap300-Duo arrays as described previously1. The additional 57 PDB cases were genotyped using Illumina Human660W Quad version 1 arrays and quality control measures were applied as previously described1. Briefly, SNPs with call rate <95% were excluded and samples with call rate <90% (n=1); excess heterozygosity (n=1); and non-European ancestry (n=6; Supplementary
The replication study groups were derived from clinic-based PDB patients and gender-matched controls selected from the same region. Patients with SQSTM1 mutations were excluded and all study participants provided informed consent. The first replication cohort comprised 175 PDB patients from the UK, 8 PDB cases from Sydney, Australia, and 215 PDB cases from Western Australia. These patients were of British descent and were matched with 485 unaffected British controls. The second replication cohort (Italian replication cohort 1) comprised 354 PDB cases and 390 unaffected controls enrolled from various referral centres in Italy who took part in the GenPage project24. The third replication cohort (Italian replication cohort 2) comprised 205 Italian PDB cases and 238 unaffected controls enrolled from referral centres in Northern, Central and Southern Italy as previously described25. The fourth replication cohort comprised 246 sporadic PDB patients recruited from various referral centres in Belgium and these were matched with 263 controls with no clinical evidence of PDB as previously described8. The fifth replication cohort comprised 85 PDB patients and 93 controls recruited from various centres in the Netherlands as described8,26. The sixth replication cohort comprised 186 sporadic PDB cases recruited from the Salamanca region in the Castilla-Leon region of Spain and 202 unaffected controls from the same region.
Genotyping of replication samples was performed by Sequenom (Hamburg, Germany) using the MassARRAY iPLEX platform. To minimize genotyping bias due to variations between runs, DNA from cases and controls from the six different replication cohorts was distributed into 384-well plates so that each plate had the same number of cases and controls. We included 4000 known genotypes as a quality control measure and the concordance rate between the genotype calls was >99.8%. We removed 64 samples due to low call rate (<90%) and the call rate for all genotyped SNPs was >95%.
Genome-wide genotype imputation for autosomal SNPs was performed using MACH27 and the HapMap European (CEU) phased haplotype data from release 22 were used as a reference. We excluded SNPs with poor imputation quality based on the estimated correlation between imputed and true genotypes (r2<0.3). Additionally, a subset (2%) of known genotypes was masked during imputation and the imputed genotypes were then compared with the true genotypes; the average per-allele imputation error rate was 2.9%. Imputed SNPs were tested for association using ProbABEL software28 implementing a logistic regression model in which the allelic dosage of the imputed SNP was used to adjust for uncertainty in imputed genotypes.
Statistical analyses were performed using PLINK (Version 1.07)29 and R (v2.11.1). In the GWAS stage, genotyped SNPs were tested for association with PDB using the standard allelic (1 d.f.) χ2 statistic. We also performed association testing using regression models in which we adjusted for gender and population clusters (as determined by multidimensional scaling analysis), but the results were essentially identical to those obtained from the standard allelic test reported here (data not shown). The genomic inflation factor λ was calculated based on the 90% least significant SNPs as described previously30. The observed test statistic values were corrected using the genomic control method (λ=1.05; Supplementary
PAR=p(OR−1)/[p(OR−1)+1];
where p is the frequency of the risk allele in controls and OR is the risk allele odds ratio. The cumulative PAR was calculated as follows:
$$\text{Cumulative PAR} = 1 - \prod_{i=1}^{n} \left(1 - \text{PAR}_i\right);$$
where n is the number of variants and PARi is the individual PAR for the ith SNP. The proportion of familial risk attributable to the identified loci was calculated as previously described33 assuming a multiplicative model of association and a sibling relative risk λs=7.0 as estimated from previous epidemiological studies20. Regional association plots were generated using the locuszoom tool34.
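The calculation of the proportion of familial risk is only cited here, not spelled out. One formulation that is commonly used for this purpose under a multiplicative model is sketched below; it is an assumption and may not be the exact method of the cited reference. Each locus contributes a familial relative risk λi = (p·r² + q)/(p·r + q)², where p is the risk allele frequency, q = 1 − p and r is the per-allele relative risk, and the proportion explained is the sum of log λi divided by log λs. All function names are hypothetical.

```python
from math import log


def locus_lambda(p, rr):
    """Familial relative risk attributable to one locus (assumed formula).

    p:  risk allele frequency
    rr: per-allele relative risk under a multiplicative model
    """
    q = 1.0 - p
    return (p * rr ** 2 + q) / (p * rr + q) ** 2


def proportion_familial_risk(loci, lambda_s=7.0):
    """Fraction of log(lambda_s) accounted for by the given loci.

    loci:     iterable of (risk allele frequency, relative risk) pairs
    lambda_s: overall sibling relative risk (7.0 in the text above)
    """
    explained = sum(log(locus_lambda(p, rr)) for p, rr in loci)
    return explained / log(lambda_s)
```

Working on the log scale makes the contributions of independent loci additive, which matches the multiplicative model assumed in the text.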
eQTL Analysis.
SNPs showing genome wide significant association with PDB (or those in strong LD; D′≧0.8) were tested for association with cis-allelic expression of gene transcripts located in the associated regions using publicly available eQTL data22,35-38. Only cis-acting allelic associations located within 250 kb of either 5′ or 3′ end of the associated gene with expression P-value <1×10−5 were considered. To avoid false detection, we excluded expression data if the gene probe contained a polymorphic SNP or was located in a highly repetitive sequence.
In Example 1 we describe the identification of susceptibility alleles for PDB at the CSF1, OPTN, and TNFRSF11A loci by a genome wide association study involving 692 PDB cases and 1,001 controls with replication cohort of 481 cases and 520 controls1. In order to identify additional susceptibility loci for the disease, we performed an extended GWAS involving a total of 749 PDB cases of British descent in whom SQSTM1 mutations had been excluded and 2,930 British controls derived from the 1958 Birth Cohort7 with replication in a further 1,474 cases and 1,671 controls from six independent populations.
After applying quality control measures and excluding samples of non-European ancestry, the extended cohort (henceforth referred to as the GWAS stage) comprised 741 cases and 2,699 controls with genotype information for 290,115 SNPs, providing a 4-fold increase in power to detect loci of moderate effect size (odds ratio≧1.4) compared with our previous study1. We then genotyped the highest ranking SNPs identified from the GWAS stage in six independent replication cohorts of SQSTM1-negative PDB cases and matched controls from the UK, Australia, Italy, Spain, the Netherlands, and Belgium. Details of the study groups are provided in the online methods section and Table 2 (Example 2). To increase SNP coverage, we performed genome wide SNP imputation for the GWAS stage samples using phased haplotype data from the HapMap project as a reference. The results of association testing of genotyped and imputed SNPs (total 2,487,078 SNPs) from the GWAS stage are shown in
In the second stage of this study we analysed the highest ranking SNPs observed in the GWAS stage (P values of 5×10−5 or less) for replication after excluding those in linkage disequilibrium (LD; r2>0.8 or D′>0.95) with the highest ranking SNP from each region. A total of 27 SNPs were genotyped in the replication cohorts which consisted of 1,474 PDB cases from six different geographic regions and 1,671 unaffected controls from the same regions that were matched with the cases by gender as described in the online methods section and Table 2 (Example 2). A meta-analysis of data from the GWAS stage and individual replication cohorts was performed under fixed and random effects models and the results are summarised in Table 3 (Example 2). This strengthened the association with PDB for the CSF1, OPTN, and TNFRSF11A loci which were identified in our previous study1 and confirmed the association with 8q22.3 locus which was suggestively associated with PDB in our previous GWAS and was confirmed to be associated with PDB in a small study of Belgian and Dutch subjects8. Furthermore, three additional genome wide significant loci on 7q33, 14q32.12, and 15q24.1 were identified in the combined data set (P<5×10−8; Table 1 (Example 2)). To assess if the reported associations were confounded by age, age of onset or recruitment centre, we performed a regression analysis using case-only data from the GWAS stage to test if any of these factors were associated with the top hits using linear regression models. The results of this analysis showed no evidence to suggest that the reported association is confounded by age (P>0.11), age of onset (P>0.10) or recruitment centre (P>0.44).
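A fixed-effects meta-analysis with the heterogeneity statistics quoted in the following paragraphs (Phet is derived from Cochran's Q; I2 is the percentage of variation attributed to heterogeneity) can be sketched as below. The source does not state the weighting scheme, so inverse-variance weighting of per-cohort log odds ratios is assumed, and the function names are illustrative.

```python
from math import erfc, exp, log, sqrt


def se_from_ci(lower, upper, z=1.959964):
    """Standard error of log(OR) recovered from a 95% confidence interval."""
    return (log(upper) - log(lower)) / (2.0 * z)


def fixed_effect_meta(log_ors, ses):
    """Inverse-variance fixed-effects pooling of per-cohort log odds ratios.

    Returns the pooled log OR, pooled OR, standard error, two-sided P value,
    Cochran's Q, its degrees of freedom and I2 (in percent).
    """
    weights = [1.0 / se ** 2 for se in ses]
    wsum = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, log_ors)) / wsum
    pooled_se = sqrt(1.0 / wsum)
    q = sum(w * (b - pooled) ** 2 for w, b in zip(weights, log_ors))
    df = len(log_ors) - 1
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    z = pooled / pooled_se
    p_value = erfc(abs(z) / sqrt(2.0))  # two-sided normal tail probability
    return {"log_or": pooled, "or": exp(pooled), "se": pooled_se,
            "p": p_value, "q": q, "df": df, "i2": i2}
```

A random-effects analysis would additionally estimate a between-cohort variance from Q and add it to each cohort's variance before weighting; that step is omitted from this sketch.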
The strongest signal on 8q22.3 was with rs2458413 (combined P-value=7.38×10−17; OR=1.4). There was no significant heterogeneity between the study groups (I2=44.3%; Phet=0.10; Table 1 (Example 2),
The first new locus for PDB susceptibility was on 7q33 tagged by rs4294134 (combined P-value=8.45×10−10; OR=1.45). The direction of association was similar in all study cohorts and analysis of the combined data set showed no evidence for heterogeneity between study groups (I2=0%; Phet=0.83; Table 1 (Example 2),
The second new susceptibility locus was located on 14q32.12 and was tagged by rs10498635. This SNP showed borderline evidence of association with PDB in our previous study (P=9.69×10−8)1 but reached genome-wide significance in the present study (combined P-value=2.55×10−11; OR=1.44). Association testing showed no evidence for heterogeneity between the study groups (I2=0.0%; Phet=0.62; Table 1 (Example 2),
The third new susceptibility locus was located on 15q24.1 and the strongest association was with rs5742915 (combined P-value=1.60×10−14; OR=1.34; I2=0.0%; Phet=0.56; Table 1 (Example 2),
We were also able to replicate our previously reported association between variants at the CSF1, OPTN, and TNFRSF11A loci and PDB in the present study. The results of meta-analysis of the combined data set for these loci are shown in Table 1 (Example 2) and Supplementary
We next wanted to determine if the identified loci on 15q24.1, 7q33 and 14q32.12 interacted with each other or with the previously identified loci on 1p13.3, 8q22.3, 10p13 and 18q21.33 to affect the risk of PDB. Pair-wise interaction analysis showed weak evidence for interaction between 7q33 (rs4294134) with 8q22.3 (rs2458413; P=0.03) and 10p13 (rs1561570; P=0.02). However, these interactions were not significant after adjusting for multiple testing and none of the other loci showed evidence for interaction (P>0.05), suggesting a multiplicative model of association with PDB risk. In order to estimate the effect size of the identified loci on the development of PDB, we calculated the proportion of familial risk explained by the genome wide significant loci in the replication sample assuming a sibling relative risk for PDB of 7.020. This showed that the proportion of familial risk explained was ~13%, which is much greater than observed for other bone diseases like osteoporosis21. We also estimated the cumulative population attributable risk of these loci in the replication cohort and found it to be 86%, and we found that the risk of PDB increased with increasing number of risk allele scores defined by the seven loci (ORper-risk allele=1.44, 95% CI=1.38-1.51, P=5.4×10−57). When allele scores were weighted according to their estimated effect size, we found that subjects in the top 10% of the allele score distribution (D10; n=315) had a 10.1-fold (95% CI; 7.0-14.6) increase in risk of developing PDB compared to those in the bottom 10% of the distribution (D1; n=315) from the replication dataset (
In addition to the loci mentioned above, additional variants were identified that showed suggestive evidence for association with PDB. For example a locus on chromosome Xq24 showed borderline evidence for association with PDB (rs5910578 within SLC25A43 gene; combined P=1.26×10−7; OR=1.34; Phet=0.44; I2=0.0%) as did another locus on chromosome 6p22.3 (rs1341239 near PRL gene; combined P=3.83×10−6; OR=1.20; Phet=0.63; I2=0.0%; Table 3 (Example 2)). Given that we observed 6 genotyped variants with P<1×10−5 in the GWAS stage after removal of confirmed SNPs and associated variants when we only expect 3 by chance (Supplementary
This study has been successful in identifying seven loci that contribute substantially to the risk of developing PDB. The identified loci have relatively large effect sizes compared with other common diseases such as osteoporosis and rheumatoid arthritis. This indicates that susceptibility to PDB is most probably mediated by inheritance of a relatively small number of genes with large effect sizes as opposed to a large number of genes with small effect sizes as seen in other complex diseases. Many of the susceptibility variants lie within or close to genes that are known to play important roles in regulating osteoclast differentiation and function whereas other variants lie within genes not previously implicated in the regulation of bone metabolism. Whilst further work will be required to identify the functional variants, the present study has provided new insights into the genetic architecture of PDB and has identified several genes that previously were not suspected to play a role in bone metabolism. Finally, the large effect size of the variants identified means that it may be possible in the future to identify people at risk of developing PDB by genetic profiling.
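Chr. | SNP | Allele | GWAS stage P | GWAS stage OR (95% CI) | Replication P | Replication OR (95% CI) | Combined P |
---|---|---|---|---|---|---|---|
7 | rs4294134 | G | 1.20 × 10−05 | 1.50 (1.25-1.79) | 2.29 × 10−05 | 1.42 (1.20-1.66) | 8.45 × 10−10 |
14 | rs10498635 | C | 1.51 × 10−05 | 1.45 (1.23-1.71) | 5.64 × 10−07 | 1.42 (1.29-1.63) | 2.55 × 10−11 |
15 | rs5742915 | C | 1.40 × 10−07 | 1.38 (1.22-1.54) | 3.99 × 10−08 | 1.32 (1.20-1.46) | 1.60 × 10−14 |

Chr. | OR (95% CI) | P | OR (95% CI) | Phet | I2 (%) | Gene |
---|---|---|---|---|---|---|
7 | 1.45 (1.29-1.63) | 8.45 × 10−10 | 1.45 (1.29-1.63) | 0.83 | 0.0 | NUP205 |
14 | 1.44 (1.29-1.60) | 2.55 × 10−11 | 1.44 (1.29-1.60) | 0.62 | 0.0 | RIN3 |
15 | 1.34 (1.25-1.45) | 1.60 × 10−14 | 1.34 (1.25-1.45) | 0.56 | 0.0 | PML |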
1As a quality control measure, a subset of PDB samples used in the GWAS stage was genotyped on two different platforms and the genotype concordance rate was calculated. For control samples, which were genotyped by WTCCC using the Illumina Human 1.2M-Duo chip, the same samples were also genotyped using Affymetrix v6.0 and the genotype concordance rate was calculated for the 88,146 overlapping SNPs (out of 290,115 SNPs used in GWAS analysis). Only genotypes from the Illumina Human 1.2M chip were used for GWAS analysis.
Number | Date | Country | Kind |
---|---|---|---|
1006197.6 | Apr 2010 | GB | national |
Filing Document | Filing Date | Country | Kind | 371c Date |
---|---|---|---|---|
PCT/GB2011/000581 | 4/13/2011 | WO | 00 | 12/21/2012 |