Osteosarcoma, a common bone malignancy, is an aggressive cancer characterized by early metastasis and high mortality. In dogs, osteosarcoma typically afflicts middle-aged large and giant breeds. Osteosarcoma is common in both humans and dogs, resulting in a major impact on human and canine health.
The invention is premised on the identification of germ-line risk markers (e.g., SNPs) that can be used singly or together (e.g., forming a haplotype) to predict elevated risk of osteosarcoma in subjects, e.g., canine subjects. As described herein, a genome-wide association study (GWAS) was performed in Greyhounds, Rottweilers and Irish Wolfhounds, and germ-line risk markers that correlate with canine osteosarcoma were identified. These germ-line risk markers were confirmed to correlate with canine osteosarcoma in a second, larger sample set. Accordingly, aspects of the invention provide methods for identifying subjects that are at elevated risk of developing osteosarcoma or subjects having otherwise undiagnosed osteosarcoma. Subjects are identified based on the presence of one or more germ-line risk markers shown to be associated with the presence of osteosarcoma, in accordance with the invention. Prognostic and theranostic methods utilizing one or more germ-line risk markers are also described herein.
In some aspects, the disclosure relates to a method, comprising a) analyzing genomic DNA from a canine subject for the presence of a single nucleotide polymorphism (SNP) selected from:
In some embodiments, the SNP is selected from BICF2P133066, BICF2P1421479, BICF2S2308696, BICF2P508906, BICF2P508905, BICF2S23216058, BICF2S23216058, BICF2P266591, BICF2P1332375, BICF2S23231062, BICF2S22945043, BICF2P326880, BICF2P893664, BICF2P1420547, BICF2P698281, BICF2S22919383, BICF2S22947803, BICF2S22947803, BICF2S22959094, BICF2S23228287, BICF2S23036972, BICF2P51623, BICF2P1346510, BICF2P1323908, BICF2P1137984, BICF2P1115364, BICF2P58266, BICF2P627162, BICF2P1422910, BICF2P162782, BICF2P162782, BICF2P1342901, BICF2P868731, BICF2P768889, BICF2P1052528, BICF2P408119, BICF2P1468011, BICF2P219326, BICF2P1462759, BICF2P307386, BICF2P1010170, BICF2S23038485, BICF2G630672865, BICF2G630672813, BICF2P1369145, BICF2G630672770, BICF2P81989, BICF2P916235, BICF2G630672753, BICF2P1177075, BICF2P411325, BICF2P1210630, TIGRP2P407733, BICF2P341331, BICF2P318350, BICF2S2335735, BICF2P1003572, BICF2P1104551, BICF2S23550277, BICF2P870378, BICF2P866460, BICF2P1303772, BICF2S23738710, BICF2P344455, BICF2P825177, BICF2S23324500, BICF2S23544574, BICF2P119783, BICF2S23758510, BICF2S23724888, BICF2P1129874, BICF2S23535303, BICF2S23520119, G326F32S322, BICF2S23238674, BICF2P645758, BICF2P189890, BICF2P819174, BICF2P162666, BICF2P1366853, BICF2P775251, BICF2S23746532, BICF2P1162557, BICF2S23538747, BICF2S23538670, BICF2S23218055, BICF2P680751, BICF2S23510137, BICF2P849639, BICF2S22945333, BICF2S2298851, TIGRP2P238123, TIGRP2P238132, BICF2P1466354, BICF2P440326, BICF2P874005, BICF2P928021, BICF2P1182592, BICF2P1378069, TIGRP2P238162, TIGRP2P253880, BICF2P461252, BICF2P879737, BICF2P163146, BICF2S23259485, TIGRP2P253975, BICF2S23760612, TIGRP2P254013, TIGRP2P254028, BICF2S23750273, BICF2P228579, TIGRP2P254054, BICF2P531896, TIGRP2P254060, BICF2P766570, BICF2P1014267, BICF2P1006929, BICF2P1299781, BICF2P672676, BICF2S23761559, BICF2P15617, BICF2P439160, TIGRP2P254095, TIGRP2P254109, BICF2P477812, BICF2P1238318, BICF2P1354921, BICF2S23741435, BICF2P37118, TIGRP2P254175, BICF2P1123483, 
TIGRP2P254184, BICF2P825842, BICF2P243632, BICF2P1139856, BICF2P1376844, TIGRP2P254212, TIGRP2P254216, and TIGRP2P254223.
In some embodiments, the SNP is selected from BICF2P133066, BICF2S2308696, BICF2P508906, BICF2P508905, BICF2S23216058, BICF2S23216058, BICF2P266591, BICF2P1332375, BICF2S23231062, BICF2S22945043, BICF2P326880, BICF2P893664, BICF2P1420547, BICF2P698281, BICF2S22919383, BICF2S22947803, BICF2S22947803, BICF2S22959094, BICF2S23228287, BICF2S23036972, BICF2P51623, BICF2P1346510, BICF2P1323908, BICF2P1137984, BICF2P1115364, BICF2P58266, BICF2P627162, BICF2P1422910, BICF2P162782, BICF2P162782, BICF2P1342901, BICF2P868731, BICF2P768889, BICF2P1052528, BICF2P408119, BICF2P1468011, BICF2P219326, BICF2P1462759, BICF2P307386, BICF2P1010170, BICF2P229090, BICF2S23516022, and BICF2S22922837. In some embodiments, the SNP is BICF2P133066.
In some embodiments, the SNP is two or more SNPs. In some embodiments, the SNP is three or more SNPs.
Other aspects of the disclosure relate to a method, comprising (a) analyzing genomic DNA from a canine subject for the presence of a risk haplotype selected from:
a risk haplotype having chromosome coordinates chr11:44392734-44414985,
a risk haplotype having chromosome coordinates chr8:35433142-35454649,
a risk haplotype having chromosome coordinates chr13:14549973-14645634,
a risk haplotype having chromosome coordinates chr25:21831580-21921256,
a risk haplotype having chromosome coordinates chr14:48831824-49203827,
a risk haplotype having chromosome coordinates chr5:16071171-16152955,
a risk haplotype having chromosome coordinates chr19:33963105-34145310,
a risk haplotype having chromosome coordinates chr16:43665149-43737129,
a risk haplotype having chromosome coordinates chr15:63767963-63800415,
a risk haplotype having chromosome coordinates chr16:40883517-41081510,
a risk haplotype having chromosome coordinates chr25:43476429-43528145,
a risk haplotype having chromosome coordinates chr1:112977233-113081800,
a risk haplotype having chromosome coordinates chr3:5162058-6465753,
a risk haplotype having chromosome coordinates chr7:64631053-64703475,
a risk haplotype having chromosome coordinates chr1:115582915-116790630,
a risk haplotype having chromosome coordinates chr2:19212450-19542015,
a risk haplotype having chromosome coordinates chr1:122033806-122051988,
a risk haplotype having chromosome coordinates chr35:18326079-18345318,
a risk haplotype having chromosome coordinates chr9:47647012-47668054,
a risk haplotype having chromosome coordinates chr38:11252518-11739329,
a risk haplotype having chromosome coordinates chr21:46231985-46363479,
a risk haplotype having chromosome coordinates chr17:14465884-14482152,
a risk haplotype having chromosome coordinates chr32:25136302-25156153,
a risk haplotype having chromosome coordinates chr36:29637804-29663408,
a risk haplotype having chromosome coordinates chr15:37986345-39974762,
a risk haplotype having chromosome coordinates chr1:29405587-29914411,
a risk haplotype having chromosome coordinates chr26:32374093-32428448,
a risk haplotype having chromosome coordinates chr25:29658978-29767164,
a risk haplotype having chromosome coordinates chr26:3529343-3550075,
a risk haplotype having chromosome coordinates chr5:14720254-15466603,
a risk haplotype having chromosome coordinates chr18:4266743-5854451,
a risk haplotype having chromosome coordinates chr1:16768869-18150476,
a risk haplotype having chromosome coordinates chr9:18896060-19633155, and
a risk haplotype having chromosome coordinates chr11:44390633-44406002; and
(b) identifying a canine subject having the risk haplotype as a subject at elevated risk of developing osteosarcoma or having an undiagnosed osteosarcoma.
In some embodiments, the risk haplotype is selected from a risk haplotype having chromosome coordinates chr11:44392734-44414985, a risk haplotype having chromosome coordinates chr8:35433142-35454649, a risk haplotype having chromosome coordinates chr1:115582915-116790630, a risk haplotype having chromosome coordinates chr2:19212450-19542015, a risk haplotype having chromosome coordinates chr1:122033806-122051988, a risk haplotype having chromosome coordinates chr35:18326079-18345318, a risk haplotype having chromosome coordinates chr9:47647012-47668054, a risk haplotype having chromosome coordinates chr38:11252518-11739329, a risk haplotype having chromosome coordinates chr5:14720254-15466603, and a risk haplotype having chromosome coordinates chr18:4266743-5854451. In some embodiments, the risk haplotype is selected from a risk haplotype having chromosome coordinates chr11:44392734-44414985, a risk haplotype having chromosome coordinates chr1:115582915-116790630, and a risk haplotype having chromosome coordinates chr5:14720254-15466603. In some embodiments, the risk haplotype is the risk haplotype having chromosome coordinates chr11:44392734-44414985.
In some embodiments, the mutation is two or more mutations. In some embodiments, the mutation is three or more mutations. In some embodiments, the genomic region is two or more genomic regions. In some embodiments, the genomic region is three or more genomic regions.
In yet another aspect, the disclosure relates to a method, comprising (a) analyzing genomic DNA from a canine subject for the presence of a mutation in a gene selected from:
one or more genes located within a risk haplotype having chromosome coordinates chr11:44392734-44414985,
one or more genes located within a risk haplotype having chromosome coordinates chr8:35433142-35454649,
one or more genes located within a risk haplotype having chromosome coordinates chr13:14549973-14645634,
one or more genes located within a risk haplotype having chromosome coordinates chr25:21831580-21921256,
one or more genes located within a risk haplotype having chromosome coordinates chr14:48831824-49203827,
one or more genes located within a risk haplotype having chromosome coordinates chr5:16071171-16152955,
one or more genes located within a risk haplotype having chromosome coordinates chr19:33963105-34145310,
one or more genes located within a risk haplotype having chromosome coordinates chr16:43665149-43737129,
one or more genes located within a risk haplotype having chromosome coordinates chr15:63767963-63800415,
one or more genes located within a risk haplotype having chromosome coordinates chr16:40883517-41081510,
one or more genes located within a risk haplotype having chromosome coordinates chr25:43476429-43528145,
one or more genes located within a risk haplotype having chromosome coordinates chr1:112977233-113081800,
one or more genes located within a risk haplotype having chromosome coordinates chr3:5162058-6465753,
one or more genes located within a risk haplotype having chromosome coordinates chr7:64631053-64703475,
one or more genes located within a risk haplotype having chromosome coordinates chr1:115582915-116790630,
one or more genes located within a risk haplotype having chromosome coordinates chr2:19212450-19542015,
one or more genes located within a risk haplotype having chromosome coordinates chr1:122033806-122051988,
one or more genes located within a risk haplotype having chromosome coordinates chr35:18326079-18345318,
one or more genes located within a risk haplotype having chromosome coordinates chr9:47647012-47668054,
one or more genes located within a risk haplotype having chromosome coordinates chr38:11252518-11739329,
one or more genes located within a risk haplotype having chromosome coordinates chr21:46231985-46363479,
one or more genes located within a risk haplotype having chromosome coordinates chr17:14465884-14482152,
one or more genes located within a risk haplotype having chromosome coordinates chr32:25136302-25156153,
one or more genes located within a risk haplotype having chromosome coordinates chr36:29637804-29663408,
one or more genes located within a risk haplotype having chromosome coordinates chr15:37986345-39974762,
one or more genes located within a risk haplotype having chromosome coordinates chr1:29405587-29914411,
one or more genes located within a risk haplotype having chromosome coordinates chr26:32374093-32428448,
one or more genes located within a risk haplotype having chromosome coordinates chr25:29658978-29767164,
one or more genes located within a risk haplotype having chromosome coordinates chr26:3529343-3550075,
one or more genes located within a risk haplotype having chromosome coordinates chr5:14720254-15466603,
one or more genes located within a risk haplotype having chromosome coordinates chr18:4266743-5854451,
one or more genes located within a risk haplotype having chromosome coordinates chr1:16768869-18150476,
In some embodiments, the gene is selected from one or more genes located within a risk haplotype having chromosome coordinates chr11:44392734-44414985, one or more genes located within a risk haplotype having chromosome coordinates chr8:35433142-35454649, one or more genes located within a risk haplotype having chromosome coordinates chr1:115582915-116790630, one or more genes located within a risk haplotype having chromosome coordinates chr2:19212450-19542015, one or more genes located within a risk haplotype having chromosome coordinates chr1:122033806-122051988, one or more genes located within a risk haplotype having chromosome coordinates chr35:18326079-18345318, one or more genes located within a risk haplotype having chromosome coordinates chr9:47647012-47668054, one or more genes located within a risk haplotype having chromosome coordinates chr38:11252518-11739329, one or more genes located within a risk haplotype having chromosome coordinates chr5:14720254-15466603, and one or more genes located within a risk haplotype having chromosome coordinates chr18:4266743-5854451. In some embodiments, the gene is selected from one or more genes located within a risk haplotype having chromosome coordinates chr11:44392734-44414985, one or more genes located within a risk haplotype having chromosome coordinates chr1:115582915-116790630, and one or more genes located within a risk haplotype having chromosome coordinates chr5:14720254-15466603. In some embodiments, the gene is one or more genes located within the risk haplotype having chromosome coordinates chr11:44392734-44414985.
In some embodiments, the gene is selected from CDKN2B-AS, OTX2, BMPER, GRIK4, EN1, MARCO, MTMR7, SGCZ, CCL20, CD3EAP, ERCC1, ERCC2, FOSB, PPP1R13L, FER, MAN2A1, PJA2, CHST9, ADCK4, AKT2, AXL, BLVRB, C19orf47, C19orf54, CNTD2, CYP2A7, CYP2B6, CYP2S1, DLL3, EGLN2, FBL, FCGBP, GMFG, HIPK4, HNRNPUL1, ITPKC, LEUTX, LTBP4, MAP3K10, MED29, NUMBL, PLD3, PLEKHG2, PSMC4, RAB4B, SAMD4B, SERTAD1, SERTAD3, SHKBP1, SNRPA, SPTBN4, SUPT5H, TIMM50, KIAA1462, C19orf40, CEP89, RHPN2, BLMH, TMIGD1, FAM5C, NELL1, EMCN, AMDHD1, CCDC38, CDK17, ELKS, FGD6, HAL, LTA4H, METAP2, NDUFA12, NEDD1, NR2C1, NTN4, SNRPF, USP44, VEZT, EYA4, TCF21, ARVCF, C22orf25, COMT, XKR6, FBRSL1, BLID, C7orf72, COBL, DDC, FIGNL1, GRB10, IKZF1, VWC2, ZPBP, BCL2, KIAA1468, PHLPP1, PIGN, RNF152, TNFRSF11A, ZCCHC2, ABCA5, KCNJ16, KCNJ2, MAP2K6, CDKN2A, and CDKN2B. In some embodiments, the gene is selected from CDKN2B-AS, OTX2, BMPER, EN1, DLL3, KIAA1462, FAM5C, NELL1, EMCN, TCF21, BLID, VWC2, BCL2, and TNFRSF11A. In some embodiments, the gene is selected from CDKN2B-AS, OTX2, ADCK4, AKT2, AXL, BLVRB, C19orf47, C19orf54, CNTD2, CYP2A7, CYP2B6, CYP2S1, DLL3, EGLN2, FBL, FCGBP, GMFG, HIPK4, HNRNPUL1, ITPKC, LEUTX, LTBP4, MAP3K10, MED29, NUMBL, PLD3, PLEKHG2, PSMC4, RAB4B, SAMD4B, SERTAD1, SERTAD3, SHKBP1, SNRPA, SPTBN4, SUPT5H, TIMM50, KIAA1462, C19orf40, CEP89, RHPN2, BLMH, TMIGD1, FAM5C, BLID, C7orf72, COBL, DDC, FIGNL1, GRB10, IKZF1, VWC2, and ZPBP. In some embodiments, the gene is selected from CDKN2B-AS, ADCK4, AKT2, AXL, BLVRB, C19orf47, C19orf54, CNTD2, CYP2A7, CYP2B6, CYP2S1, DLL3, EGLN2, FBL, FCGBP, GMFG, HIPK4, HNRNPUL1, ITPKC, LEUTX, LTBP4, MAP3K10, MED29, NUMBL, PLD3, PLEKHG2, PSMC4, RAB4B, SAMD4B, SERTAD1, SERTAD3, SHKBP1, SNRPA, SPTBN4, SUPT5H, TIMM50, and BLID. In some embodiments, the gene is selected from CDKN2B-AS, CDKN2A, and CDKN2B.
In some embodiments, the mutation is two or more mutations. In some embodiments, the mutation is three or more mutations. In some embodiments, the gene is two or more genes. In some embodiments, the gene is three or more genes.
In some embodiments of any method provided herein, the genomic DNA is obtained from a bodily fluid or tissue sample of the subject. In some embodiments of any method provided herein, the genomic DNA is obtained from a blood or saliva sample of the subject. In some embodiments of any method provided herein, the genomic DNA is analyzed using a single nucleotide polymorphism (SNP) array. In some embodiments of any method provided herein, the genomic DNA is analyzed using a bead array. In some embodiments of any method provided herein, the genomic DNA is analyzed using a nucleic acid sequencing assay.
In some embodiments of any method described herein, the canine subject is a descendant of a Greyhound, Rottweiler or Irish Wolfhound. In some embodiments, the canine subject is a Greyhound, Rottweiler or Irish Wolfhound.
Yet another aspect of the disclosure relates to a method, comprising (a) analyzing genomic DNA in a sample from a subject for presence of a mutation in a gene selected from:
one or more genes located within a risk haplotype having chromosome coordinates chr11:44392734-44414985 or an orthologue of such a gene,
one or more genes located within a risk haplotype having chromosome coordinates chr8:35433142-35454649 or an orthologue of such a gene,
one or more genes located within a risk haplotype having chromosome coordinates chr13:14549973-14645634 or an orthologue of such a gene,
one or more genes located within a risk haplotype having chromosome coordinates chr25:21831580-21921256 or an orthologue of such a gene,
one or more genes located within a risk haplotype having chromosome coordinates chr14:48831824-49203827 or an orthologue of such a gene,
one or more genes located within a risk haplotype having chromosome coordinates chr5:16071171-16152955 or an orthologue of such a gene,
one or more genes located within a risk haplotype having chromosome coordinates chr19:33963105-34145310 or an orthologue of such a gene,
one or more genes located within a risk haplotype having chromosome coordinates chr16:43665149-43737129 or an orthologue of such a gene,
one or more genes located within a risk haplotype having chromosome coordinates chr15:63767963-63800415 or an orthologue of such a gene,
one or more genes located within a risk haplotype having chromosome coordinates chr16:40883517-41081510 or an orthologue of such a gene,
one or more genes located within a risk haplotype having chromosome coordinates chr25:43476429-43528145 or an orthologue of such a gene,
one or more genes located within a risk haplotype having chromosome coordinates chr1:112977233-113081800 or an orthologue of such a gene,
one or more genes located within a risk haplotype having chromosome coordinates chr3:5162058-6465753 or an orthologue of such a gene,
one or more genes located within a risk haplotype having chromosome coordinates chr7:64631053-64703475 or an orthologue of such a gene,
one or more genes located within a risk haplotype having chromosome coordinates chr1:115582915-116790630 or an orthologue of such a gene,
one or more genes located within a risk haplotype having chromosome coordinates chr2:19212450-19542015 or an orthologue of such a gene,
one or more genes located within a risk haplotype having chromosome coordinates chr1:122033806-122051988 or an orthologue of such a gene,
one or more genes located within a risk haplotype having chromosome coordinates chr35:18326079-18345318 or an orthologue of such a gene,
one or more genes located within a risk haplotype having chromosome coordinates chr9:47647012-47668054 or an orthologue of such a gene,
one or more genes located within a risk haplotype having chromosome coordinates chr38:11252518-11739329 or an orthologue of such a gene,
one or more genes located within a risk haplotype having chromosome coordinates chr21:46231985-46363479 or an orthologue of such a gene,
one or more genes located within a risk haplotype having chromosome coordinates chr17:14465884-14482152 or an orthologue of such a gene,
one or more genes located within a risk haplotype having chromosome coordinates chr32:25136302-25156153 or an orthologue of such a gene,
one or more genes located within a risk haplotype having chromosome coordinates chr36:29637804-29663408 or an orthologue of such a gene,
one or more genes located within a risk haplotype having chromosome coordinates chr15:37986345-39974762 or an orthologue of such a gene,
one or more genes located within a risk haplotype having chromosome coordinates chr1:29405587-29914411 or an orthologue of such a gene,
one or more genes located within a risk haplotype having chromosome coordinates chr26:32374093-32428448 or an orthologue of such a gene,
one or more genes located within a risk haplotype having chromosome coordinates chr25:29658978-29767164 or an orthologue of such a gene,
one or more genes located within a risk haplotype having chromosome coordinates chr26:3529343-3550075 or an orthologue of such a gene,
one or more genes located within a risk haplotype having chromosome coordinates chr5:14720254-15466603 or an orthologue of such a gene,
one or more genes located within a risk haplotype having chromosome coordinates chr18:4266743-5854451 or an orthologue of such a gene,
one or more genes located within a risk haplotype having chromosome coordinates chr1:16768869-18150476 or an orthologue of such a gene,
one or more genes located within a risk haplotype having chromosome coordinates chr9:18896060-19633155 or an orthologue of such a gene, and
one or more genes located within a risk haplotype having chromosome coordinates chr11:44390633-44406002 or an orthologue of such a gene; and
(b) identifying a subject having the mutation as a subject at elevated risk of developing osteosarcoma or having an undiagnosed osteosarcoma.
In some embodiments, the subject is a human subject. In some embodiments, the subject is a canine subject.
In some embodiments, the genomic DNA is obtained from a bodily fluid or tissue sample of the subject. In some embodiments, the genomic DNA is obtained from a blood or saliva sample of the subject. In some embodiments, the genomic DNA is analyzed using a single nucleotide polymorphism (SNP) array. In some embodiments, the genomic DNA is analyzed using a bead array. In some embodiments, the genomic DNA is analyzed using a nucleic acid sequencing assay.
In some embodiments, the gene is two or more genes. In some embodiments, the gene is three or more genes. In some embodiments, the mutation is two or more mutations. In some embodiments, the mutation is three or more mutations.
Osteosarcomas arise from mesenchymal stem cells, metastasize readily, and have widespread genetic abnormalities. Osteosarcoma in dogs is a spontaneously occurring disease with a global tumor gene expression signature indistinguishable from tumors from human pediatric patients and, while age of onset is higher in dogs, the clinical progression is remarkably similar. Both human and canine osteosarcomas most commonly arise at the ends of the long bones of the limbs and metastasize readily, usually to the lungs.
Aspects of the invention relate to germ-line risk markers (such as single nucleotide polymorphisms (SNPs), risk haplotypes, and mutations in genes) and various methods of use and/or detection thereof. The invention is premised, in part, on the results of a case-control GWAS of 304 Greyhounds, 155 Irish Wolfhounds, and 145 Rottweilers performed to identify germ-line risk markers associated with osteosarcoma. The study is described herein. Briefly, SNPs were identified that correlate with the presence of osteosarcoma in Greyhounds, Irish Wolfhounds, and/or Rottweilers. Significant SNPs were identified on chromosomes 1, 2, 3, 5, 7, 8, 9, 11, 13, 14, 15, 16, 17, 18, 19, 21, 25, 26, 32, 35, 36, and 38. These SNPs are listed in Table 1. Additionally, risk haplotypes having chromosomal regions on chromosomes 1, 2, 3, 5, 7, 8, 9, 11, 13, 14, 15, 16, 17, 18, 19, 21, 25, 26, 32, 35, 36, and 38 were identified that significantly correlated with osteosarcoma in Greyhounds, Irish Wolfhounds, and/or Rottweilers (chr11:44392734-44414985, chr8:35433142-35454649, chr13:14549973-14645634, chr25:21831580-21921256, chr14:48831824-49203827, chr5:16071171-16152955, chr19:33963105-34145310, chr16:43665149-43737129, chr15:63767963-63800415, chr16:40883517-41081510, chr25:43476429-43528145, chr1:112977233-113081800, chr3:5162058-6465753, chr7:64631053-64703475, chr1:115582915-116790630, chr2:19212450-19542015, chr1:122033806-122051988, chr35:18326079-18345318, chr9:47647012-47668054, chr38:11252518-11739329, chr21:46231985-46363479, chr17:14465884-14482152, chr32:25136302-25156153, chr36:29637804-29663408, chr15:37986345-39974762, chr1:29405587-29914411, chr26:32374093-32428448, chr25:29658978-29767164, chr26:3529343-3550075, chr5:14720254-15466603, chr18:4266743-5854451, chr1:16768869-18150476, chr9:18896060-19633155, and chr11:44390633-44406002). These germ-line risk markers were also found to correlate with canine osteosarcoma in a study involving a second, larger sample set. 
Additional regions were also identified in a third, follow-on study.
Accordingly, aspects of the invention provide methods that involve detecting one or more of the identified germ-line risk markers in a subject, e.g., a canine subject, in order to (a) identify a subject at elevated risk of developing osteosarcoma, or (b) identify a subject having osteosarcoma that is as yet undiagnosed. The methods can be used for prognostic purposes and for diagnostic purposes. Identifying canine subjects having an elevated risk of developing osteosarcoma is useful in a number of applications. For example, canine subjects identified as at elevated risk may be excluded from a breeding program and/or conversely canine subjects that do not carry the germ-line risk markers may be included in a breeding program. As another example, canine subjects identified as at elevated risk may be monitored, including monitored more regularly, for the appearance of osteosarcoma and/or may be treated prophylactically (e.g., prior to the development of the tumor) or therapeutically. Canine subjects carrying one or more of the germ-line risk markers may also be used to further study the progression of osteosarcoma and optionally to study the efficacy of various treatments.
In addition, in view of the clinical and histological similarity between canine osteosarcoma and human osteosarcoma, the germ-line risk markers identified in accordance with the invention may also be risk markers and/or mediators of cancer occurrence and progression in human osteosarcoma. Accordingly, the invention provides diagnostic and prognostic methods for use in canine subjects, animals more generally, and human subjects, as well as animal models of human disease and treatment, among others.
The germ-line risk markers of the invention can be used to identify subjects at elevated risk of developing osteosarcoma. An elevated risk means a lifetime risk of developing such a cancer that is higher than the risk of developing the same cancer in (a) a population that is unselected for the presence or absence of the germ-line risk marker (i.e., the general population) or (b) a population that does not carry the germ-line risk marker.
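By way of illustration only, the comparison underlying this definition of elevated risk can be sketched as a ratio of lifetime risks. The function names and the counts used below are hypothetical and are not taken from the study described herein; the reference group may be either the unselected population or the population that does not carry the marker.

```python
def lifetime_risk(affected: int, total: int) -> float:
    """Fraction of a group that develops the cancer over a lifetime."""
    if total <= 0:
        raise ValueError("total must be positive")
    return affected / total


def relative_risk(carrier_affected: int, carrier_total: int,
                  reference_affected: int, reference_total: int) -> float:
    """Risk in marker carriers divided by risk in the reference group.

    The reference group is either (a) the unselected general population
    or (b) the population that does not carry the marker.
    """
    reference = lifetime_risk(reference_affected, reference_total)
    if reference == 0:
        raise ValueError("reference risk is zero; the ratio is undefined")
    return lifetime_risk(carrier_affected, carrier_total) / reference


def is_elevated(carrier_affected: int, carrier_total: int,
                reference_affected: int, reference_total: int) -> bool:
    """True when carriers have a higher lifetime risk than the reference group."""
    return relative_risk(carrier_affected, carrier_total,
                         reference_affected, reference_total) > 1.0


# Hypothetical counts, for illustration only.
print(relative_risk(50, 100, 25, 100))
```

A ratio greater than 1 corresponds to an elevated risk in the sense used above. In practice a statistical test or confidence interval would accompany such a ratio; that step is omitted from this sketch.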
Aspects of the invention include various methods, such as prognostic and diagnostic methods, related to osteosarcoma. Osteosarcoma is an aggressive malignant neoplasm arising from primitive transformed cells of mesenchymal origin. Osteosarcoma is the most common histological form of primary bone cancer in both dogs and humans. Osteosarcoma typically arises from the proximal humerus, the distal radius, the distal femur, and/or the tibia. Other sites include the ribs, the mandible, the spine, and the pelvis. In some instances, osteosarcoma may arise from soft tissues (extraskeletal osteosarcoma). The tumor causes a great deal of pain, and can even lead to fracture of the affected bone. Metastasis of osteosarcoma tumors is very common and is usually to the lungs. It is to be understood that the invention provides methods for detecting germ-line risk markers regardless of the location of the osteosarcoma.
Currently available methods for diagnosis of osteosarcoma include X-ray, CT scan, PET scan, bone scan, MRI and bone biopsy. A bone biopsy may be, e.g., a needle biopsy or an open biopsy. Such methods for diagnosis may be used alone or in combination and may also be used to stage the cancer.
Osteosarcoma can be staged using, for example, the TNM system. This system uses three different codes to describe the size and location of the tumor, whether it has spread to the lymph nodes around the tumor, and whether it can be found in other parts of the body.
In the TNM system, “T” plus a letter or number (0 to 4) is used to describe the size and location of the tumor. The tumor stages for osteosarcoma are in the following table.
The “N” in the TNM system stands for node and is used to describe whether cancer has spread to regional lymph nodes. Lymph node stages are in the following table.
The “M” in the TNM system is used for cancer that has spread, or metastasized, to other parts of the body. The stages for metastatic osteosarcoma are in the following table.
Tumor Grade
The TNM system also incorporates the tumor grade. The grade is generally determined by looking at cancer cells under a microscope. Tumor grades are in the following table.
Stages I to IV
After the T, N, and M categories of the osteosarcoma have been identified, this information can be combined with the tumor grade to assign a stage (I to IV) to the osteosarcoma. Stages are in the following table.
Another staging system used is the Musculoskeletal Tumor Society (MSTS) staging system, which was developed by Enneking at the University of Florida. The MSTS staging system characterizes nonmetastatic malignant bone tumors by grade (low-grade [stage I] versus high-grade [stage II]) and further subdivides these stages according to the local anatomic extent (intracompartmental [A] versus extracompartmental [B]). For bone tumors, the compartmental status is determined by whether the tumor extends through the cortex of the involved bone. The majority of high-grade osteosarcomas are extracompartmental. Subjects with distant metastases are categorized as stage III.
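The MSTS rules summarized above can be expressed, purely for illustration, as a small decision function. The function name and its boolean parameters are hypothetical; the sketch encodes only the three criteria stated in the preceding paragraph (grade, compartmental extent, and distant metastasis) and is not a substitute for clinical staging.

```python
def msts_stage(high_grade: bool, extracompartmental: bool,
               distant_metastases: bool) -> str:
    """Return an MSTS stage label from the three criteria described above."""
    # Distant metastases place the subject in stage III regardless of
    # grade or local extent.
    if distant_metastases:
        return "III"
    # Low-grade tumors are stage I; high-grade tumors are stage II.
    grade = "II" if high_grade else "I"
    # Intracompartmental tumors are "A"; tumors extending through the
    # cortex of the involved bone (extracompartmental) are "B".
    extent = "B" if extracompartmental else "A"
    return grade + extent


# A high-grade, extracompartmental tumor without distant metastases.
print(msts_stage(high_grade=True, extracompartmental=True,
                 distant_metastases=False))  # prints "IIB"
```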
Thus, in some embodiments, the prognostic or diagnostic methods of the invention may further comprise performing a diagnostic assay known in the art for identification and staging of osteosarcoma (e.g., X-ray, CT scan, PET scan, bone scan, MRI and/or bone biopsy).
Germ-Line Risk Markers
Aspects of the invention relate to germ-line risk markers and use and detection thereof in various methods. In general terms, a germ-line marker is a mutation in the genome of a subject that can be passed on to the offspring of the subject. Germ-line markers may or may not be risk markers. Germ-line markers are generally found in the majority, if not all, of the cells in a subject. Germ-line markers are generally inherited from one or both parents of the subject (i.e., were present in the germ cells of one or both parents). Germ-line markers as used herein also include de novo germ-line mutations, which are spontaneous mutations that occur at the single-cell stage during development. This is distinct from a somatic marker, which is a mutation in the genome of a subject that occurs after the single-cell stage during development. Somatic mutations are considered to be spontaneous mutations. Somatic mutations generally originate in a single cell or subset of cells in the subject.
A germ-line risk marker as described herein includes a SNP, a risk haplotype, or a mutation in a gene. Further discussion of each type of germ-line risk marker is provided herein. It is to be understood that a germ-line risk marker may also indicate or predict the presence of a somatic mutation in a genomic location in close proximity to the germ-line risk marker, as germ-line risk markers may correlate with a higher risk of secondary somatic mutations.
As used herein, a mutation is one or more changes in the nucleotide sequence of the genome of the subject. The terms mutation, alteration, variation, and polymorphism are used interchangeably herein. As used herein, mutations include, but are not limited to, point mutations, insertions, deletions, rearrangements, inversions and duplications. Mutations also include, but are not limited to, silent mutations, missense mutations, and nonsense mutations.
In some embodiments, a germ-line risk marker is a single nucleotide polymorphism (SNP). A SNP is a mutation that occurs at a single nucleotide location on a chromosome. The nucleotide located at that position may differ between individuals in a population and/or paired chromosomes in an individual. In some embodiments, a germ-line risk marker is a SNP selected from Table 1. In some embodiments, a germ-line risk marker is a SNP selected from Table 1 or Table 5. Table 1 provides the risk nucleotide identity for each SNP (see “allele” column). The risk nucleotide is the nucleotide identity that is associated with elevated risk of developing osteosarcoma or having an undiagnosed osteosarcoma. The position (i.e., the chromosome coordinates) and SNP ID for each SNP in Table 1 are based on the CanFam 2.0 genome assembly (see, e.g., Lindblad-Toh K, Wade C M, Mikkelsen T S, Karlsson E K, Jaffe D B, Kamal M, Clamp M, Chang J L, Kulbokas E J 3rd, Zody M C, et al.: Genome sequence, comparative analysis and haplotype structure of the domestic dog. Nature 2005, 438:803-819). The first base pair in each chromosome is labeled 0 and the position of the SNP is then the number of base pairs from the first base pair (for example, the SNP on chromosome 11 at position 44405676 is located 44405676 base pairs from the first base pair of chromosome 11).
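The zero-based position convention described above can be illustrated with a minimal Python sketch. The function names are hypothetical and are not part of the disclosure; the sketch assumes only that the first base pair of a chromosome is labeled 0, as stated above.

```python
def offset_from_first_base(zero_based_pos: int) -> int:
    """Number of base pairs between the first base pair (labeled 0) and the position."""
    if zero_based_pos < 0:
        raise ValueError("position must be non-negative")
    return zero_based_pos


def to_one_based(zero_based_pos: int) -> int:
    """Convert a zero-based position to the one-based numbering used by some tools."""
    if zero_based_pos < 0:
        raise ValueError("position must be non-negative")
    return zero_based_pos + 1


# Example from the text: the SNP on chromosome 11 at position 44405676.
snp_pos = 44405676
print(offset_from_first_base(snp_pos))  # 44405676
print(to_one_based(snp_pos))            # 44405677
```

The one-based conversion is included only because some downstream tools number the first base pair as 1; whether a conversion is needed depends on the tool, which is an assumption outside this description.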
In some embodiments, the SNP may be one or more of:
In some embodiments, a SNP may be used in the methods described herein. In some embodiments, the method comprises:
a) analyzing genomic DNA from a canine subject for the presence of a SNP selected from:
b) identifying the canine subject having one or more of the SNPs as a subject (a) at elevated risk of developing osteosarcoma or (b) having an undiagnosed osteosarcoma.
In some embodiments, the SNP is selected from BICF2P133066, BICF2P1421479, BICF2S2308696, BICF2P508906, BICF2P508905, BICF2S23216058, BICF2S23216058, BICF2P266591, BICF2P1332375, BICF2S23231062, BICF2S22945043, BICF2P326880, BICF2P893664, BICF2P1420547, BICF2P698281, BICF2S22919383, BICF2S22947803, BICF2S22947803, BICF2S22959094, BICF2S23228287, BICF2S23036972, BICF2P51623, BICF2P1346510, BICF2P1323908, BICF2P1137984, BICF2P1115364, BICF2P58266, BICF2P627162, BICF2P1422910, BICF2P162782, BICF2P162782, BICF2P1342901, BICF2P868731, BICF2P768889, BICF2P1052528, BICF2P408119, BICF2P1468011, BICF2P219326, BICF2P1462759, BICF2P307386, BICF2P1010170, BICF2S23038485, BICF2G630672865, BICF2G630672813, BICF2P1369145, BICF2G630672770, BICF2P81989, BICF2P916235, BICF2G630672753, BICF2P1177075, BICF2P411325, BICF2P1210630, TIGRP2P407733, BICF2P341331, BICF2P318350, BICF2S2335735, BICF2P1003572, BICF2P1104551, BICF2S23550277, BICF2P870378, BICF2P866460, BICF2P1303772, BICF2S23738710, BICF2P344455, BICF2P825177, BICF2S23324500, BICF2S23544574, BICF2P119783, BICF2S23758510, BICF2S23724888, BICF2P1129874, BICF2S23535303, BICF2S23520119, G326F32S322, BICF2S23238674, BICF2P645758, BICF2P189890, BICF2P819174, BICF2P162666, BICF2P1366853, BICF2P775251, BICF2S23746532, BICF2P1162557, BICF2S23538747, BICF2S23538670, BICF2S23218055, BICF2P680751, BICF2S23510137, BICF2P849639, BICF2S22945333, BICF2S2298851, TIGRP2P238123, TIGRP2P238132, BICF2P1466354, BICF2P440326, BICF2P874005, BICF2P928021, BICF2P1182592, BICF2P1378069, TIGRP2P238162, TIGRP2P253880, BICF2P461252, BICF2P879737, BICF2P163146, BICF2S23259485, TIGRP2P253975, BICF2S23760612, TIGRP2P254013, TIGRP2P254028, BICF2S23750273, BICF2P228579, TIGRP2P254054, BICF2P531896, TIGRP2P254060, BICF2P766570, BICF2P1014267, BICF2P1006929, BICF2P1299781, BICF2P672676, BICF2S23761559, BICF2P15617, BICF2P439160, TIGRP2P254095, TIGRP2P254109, BICF2P477812, BICF2P1238318, BICF2P1354921, BICF2S23741435, BICF2P37118, TIGRP2P254175, BICF2P1123483, 
TIGRP2P254184, BICF2P825842, BICF2P243632, BICF2P1139856, BICF2P1376844, TIGRP2P254212, TIGRP2P254216, or TIGRP2P254223.
In some embodiments, the SNP is selected from BICF2P133066, BICF2S2308696, BICF2P508906, BICF2P508905, BICF2S23216058, BICF2S23216058, BICF2P266591, BICF2P1332375, BICF2S23231062, BICF2S22945043, BICF2P326880, BICF2P893664, BICF2P1420547, BICF2P698281, BICF2S22919383, BICF2S22947803, BICF2S22947803, BICF2S22959094, BICF2S23228287, BICF2S23036972, BICF2P51623, BICF2P1346510, BICF2P1323908, BICF2P1137984, BICF2P1115364, BICF2P58266, BICF2P627162, BICF2P1422910, BICF2P162782, BICF2P162782, BICF2P1342901, BICF2P868731, BICF2P768889, BICF2P1052528, BICF2P408119, BICF2P1468011, BICF2P219326, BICF2P1462759, BICF2P307386, BICF2P1010170, BICF2P229090, BICF2S23516022, or BICF2S22922837.
In some embodiments, the SNP is BICF2P133066.
It is to be understood that any number of SNPs (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 or more SNPs) may be detected and/or used to identify a subject.
In some embodiments, a germ-line risk marker is a risk haplotype. A risk haplotype, as used herein, is a chromosomal region containing at least one mutation that correlates with the presence of or likelihood of developing osteosarcoma in a subject. A risk haplotype is detected or identified and/or may be defined by one or more mutations. For example, a risk haplotype may be a chromosomal region with boundaries that are defined by two or more SNPs that are in linkage disequilibrium and correlate with the presence of or likelihood of developing osteosarcoma in a subject. Such SNPs may themselves be disease-causative or may, alternatively or additionally, be indicators of other mutations (either germ-line mutations or somatic mutations) present in the chromosomal region of the risk haplotype that correlate with or cause osteosarcoma in a subject. Thus, other mutations within the risk haplotype may correlate with presence of or likelihood of developing osteosarcoma in a subject and are contemplated for use in the methods herein. Accordingly, in some embodiments, methods described herein comprise use and/or detection of a risk haplotype. In some embodiments, the risk haplotype is selected from:
a risk haplotype having chromosome coordinates chr11:44392734-44414985,
a risk haplotype having chromosome coordinates chr8:35433142-35454649,
a risk haplotype having chromosome coordinates chr13:14549973-14645634,
a risk haplotype having chromosome coordinates chr25:21831580-21921256,
a risk haplotype having chromosome coordinates chr14:48831824-49203827,
a risk haplotype having chromosome coordinates chr5:16071171-16152955,
a risk haplotype having chromosome coordinates chr19:33963105-34145310,
a risk haplotype having chromosome coordinates chr16:43665149-43737129,
a risk haplotype having chromosome coordinates chr15:63767963-63800415,
a risk haplotype having chromosome coordinates chr16:40883517-41081510,
a risk haplotype having chromosome coordinates chr25:43476429-43528145,
a risk haplotype having chromosome coordinates chr1:112977233-113081800,
a risk haplotype having chromosome coordinates chr3:5162058-6465753,
a risk haplotype having chromosome coordinates chr7:64631053-64703475,
a risk haplotype having chromosome coordinates chr1:115582915-116790630,
a risk haplotype having chromosome coordinates chr2:19212450-19542015,
a risk haplotype having chromosome coordinates chr1:122033806-122051988,
a risk haplotype having chromosome coordinates chr35:18326079-18345318,
a risk haplotype having chromosome coordinates chr9:47647012-47668054,
a risk haplotype having chromosome coordinates chr38:11252518-11739329,
a risk haplotype having chromosome coordinates chr21:46231985-46363479,
a risk haplotype having chromosome coordinates chr17:14465884-14482152,
a risk haplotype having chromosome coordinates chr32:25136302-25156153,
a risk haplotype having chromosome coordinates chr36:29637804-29663408,
a risk haplotype having chromosome coordinates chr15:37986345-39974762,
a risk haplotype having chromosome coordinates chr1:29405587-29914411,
a risk haplotype having chromosome coordinates chr26:32374093-32428448,
a risk haplotype having chromosome coordinates chr25:29658978-29767164,
a risk haplotype having chromosome coordinates chr26:3529343-3550075,
a risk haplotype having chromosome coordinates chr5:14720254-15466603,
a risk haplotype having chromosome coordinates chr18:4266743-5854451,
a risk haplotype having chromosome coordinates chr1:16768869-18150476,
a risk haplotype having chromosome coordinates chr9:18896060-19633155, or
a risk haplotype having chromosome coordinates chr11:44390633-44406002.
In some embodiments, the risk haplotype is selected from:
a risk haplotype having chromosome coordinates chr11:39643190-45990018,
a risk haplotype having chromosome coordinates chr24:27409719-29194396, and
a risk haplotype having chromosome coordinates chr35:11233053-12732906. The chromosome coordinates in the previous sentence are from the CanFam3 genome assembly (see, e.g., UCSC Genome Browser).
In some embodiments, the risk haplotype is selected from:
a risk haplotype having chromosome coordinates chr11:37000000-44000000,
a risk haplotype having chromosome coordinates chr24:27000000-33000000, and
a risk haplotype having chromosome coordinates chr35:10000000-14000000. The chromosome coordinates in the previous sentence are from the CanFam3 genome assembly (see, e.g., UCSC Genome Browser).
Any chromosomal coordinates described herein are meant to be inclusive (i.e., include the boundaries of the chromosomal coordinates). In some embodiments, the risk haplotype may include additional chromosomal regions flanking those chromosomal regions described above, e.g., an additional 0.1, 0.5, 1, 2, 3, 4 or 5 Mb. In some embodiments, the risk haplotype may be a shorter chromosomal region than those chromosomal regions described above, e.g., 0.1, 0.5, or 1 Mb shorter than the chromosomal regions described above.
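The inclusive-boundary convention and the optional flanking extension can be sketched in Python as follows. The `Region` class and its methods are hypothetical helpers, not part of the disclosure. The sketch also assumes that a flank is added on each side of the region, which the text above does not specify.

```python
from dataclasses import dataclass

MB = 1_000_000  # base pairs per megabase


@dataclass(frozen=True)
class Region:
    chrom: str
    start: int
    end: int

    @classmethod
    def parse(cls, text: str) -> "Region":
        """Parse a coordinate string such as 'chr11:44392734-44414985'."""
        chrom, span = text.replace(" ", "").split(":")
        start, end = (int(x) for x in span.split("-"))
        if start > end:
            raise ValueError("start must not exceed end")
        return cls(chrom, start, end)

    def contains(self, chrom: str, pos: int) -> bool:
        """Inclusive test: both boundaries count as inside the region."""
        return chrom == self.chrom and self.start <= pos <= self.end

    def with_flank(self, flank_mb: float) -> "Region":
        """Extend the region by flank_mb megabases on each side (assumed), not below 0."""
        flank = int(round(flank_mb * MB))
        return Region(self.chrom, max(0, self.start - flank), self.end + flank)


region = Region.parse("chr11:44392734-44414985")
print(region.contains("chr11", 44392734))  # True: the boundary itself is included
print(region.contains("chr11", 44414986))  # False: one base pair past the end
print(region.with_flank(0.1))
```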
Any mutation of any size located within or spanning the chromosomal boundaries of a risk haplotype is contemplated herein for detection of a risk haplotype, e.g., a SNP, a deletion, an inversion, a translocation, or a duplication. In some embodiments, the risk haplotype is detected by analyzing the chromosomal region of the risk haplotype for the presence of a SNP. In some embodiments, a SNP in a risk haplotype is a SNP described in Table 1 having chromosome coordinates within the risk haplotype. It is to be understood that other SNPs not listed in Table 1 but located within the risk haplotype coordinates on chromosome 1, 2, 3, 5, 7, 8, 9, 11, 13, 14, 15, 16, 17, 18, 19, 21, 25, 26, 32, 35, 36, or 38 above are also contemplated herein. In some embodiments, if the subject is a human subject, then human chromosome coordinates that correspond to canine chromosome coordinates provided herein are contemplated for use in a method described herein.
In some embodiments, a risk haplotype can be used in the methods described herein. In some embodiments, the method comprises:
a) analyzing genomic DNA from a canine subject for the presence of a risk haplotype selected from:
a risk haplotype having chromosome coordinates chr11:44392734-44414985,
a risk haplotype having chromosome coordinates chr8:35433142-35454649,
a risk haplotype having chromosome coordinates chr13:14549973-14645634,
a risk haplotype having chromosome coordinates chr25:21831580-21921256,
a risk haplotype having chromosome coordinates chr14:48831824-49203827,
a risk haplotype having chromosome coordinates chr5:16071171-16152955,
a risk haplotype having chromosome coordinates chr19:33963105-34145310,
a risk haplotype having chromosome coordinates chr16:43665149-43737129,
a risk haplotype having chromosome coordinates chr15:63767963-63800415,
a risk haplotype having chromosome coordinates chr16:40883517-41081510,
a risk haplotype having chromosome coordinates chr25:43476429-43528145,
a risk haplotype having chromosome coordinates chr1:112977233-113081800,
a risk haplotype having chromosome coordinates chr3:5162058-6465753,
a risk haplotype having chromosome coordinates chr7:64631053-64703475,
a risk haplotype having chromosome coordinates chr1:115582915-116790630,
a risk haplotype having chromosome coordinates chr2:19212450-19542015,
a risk haplotype having chromosome coordinates chr1:122033806-122051988,
a risk haplotype having chromosome coordinates chr35:18326079-18345318,
a risk haplotype having chromosome coordinates chr9:47647012-47668054,
a risk haplotype having chromosome coordinates chr38:11252518-11739329,
a risk haplotype having chromosome coordinates chr21:46231985-46363479,
a risk haplotype having chromosome coordinates chr17:14465884-14482152,
a risk haplotype having chromosome coordinates chr32:25136302-25156153,
a risk haplotype having chromosome coordinates chr36:29637804-29663408,
a risk haplotype having chromosome coordinates chr15:37986345-39974762,
a risk haplotype having chromosome coordinates chr1:29405587-29914411,
a risk haplotype having chromosome coordinates chr26:32374093-32428448,
a risk haplotype having chromosome coordinates chr25:29658978-29767164,
a risk haplotype having chromosome coordinates chr26:3529343-3550075,
a risk haplotype having chromosome coordinates chr5:14720254-15466603,
a risk haplotype having chromosome coordinates chr18:4266743-5854451,
a risk haplotype having chromosome coordinates chr1:16768869-18150476,
a risk haplotype having chromosome coordinates chr9:18896060-19633155, and
a risk haplotype having chromosome coordinates chr11:44390633-44406002; and
b) identifying a canine subject having the risk haplotype as a subject (a) at elevated risk of developing osteosarcoma or (b) having an undiagnosed osteosarcoma.
In some embodiments, the risk haplotype is selected from a risk haplotype having chromosome coordinates chr11:44392734-44414985, chr8:35433142-35454649, chr1:115582915-116790630, chr2:19212450-19542015, chr1:122033806-122051988, chr35:18326079-18345318, chr9:47647012-47668054, chr38:11252518-11739329, chr5:14720254-15466603, or chr18:4266743-5854451.
In some embodiments, the risk haplotype is selected from a risk haplotype having chromosome coordinates chr11:44392734-44414985, chr1:115582915-116790630, or chr5:14720254-15466603.
In some embodiments, the risk haplotype is the risk haplotype having chromosome coordinates chr11:44392734-44414985.
In some embodiments, the risk haplotype is the risk haplotype having chromosome coordinates chr11:44390633-44406002.
In some embodiments, the risk haplotype is a risk haplotype having chromosome coordinates chr11:44390000-44410000.
In some embodiments, the method comprises:
a) analyzing genomic DNA from a canine subject for the presence of a risk haplotype selected from:
a risk haplotype having chromosome coordinates chr11:39643190-45990018,
a risk haplotype having chromosome coordinates chr24:27409719-29194396, and
a risk haplotype having chromosome coordinates chr35:11233053-12732906; and
b) identifying a canine subject having the risk haplotype as a subject (a) at elevated risk of developing osteosarcoma or (b) having an undiagnosed osteosarcoma. The chromosome coordinates in the previous sentence are from the CanFam3 genome assembly (see, e.g., UCSC Genome Browser).
In some embodiments, the method comprises:
a) analyzing genomic DNA from a canine subject for the presence of a risk haplotype selected from:
a risk haplotype having chromosome coordinates chr11:37000000-44000000,
a risk haplotype having chromosome coordinates chr24:27000000-33000000, and
a risk haplotype having chromosome coordinates chr35:10000000-14000000; and
b) identifying a canine subject having the risk haplotype as a subject (a) at elevated risk of developing osteosarcoma or (b) having an undiagnosed osteosarcoma. The chromosome coordinates in the previous sentence are from the CanFam3 genome assembly (see, e.g., UCSC Genome Browser).
It is to be understood that any number of mutations (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 or more mutations) can exist within each risk haplotype. It is also to be understood that not all mutations within the risk haplotype must be detected in order to determine that the risk haplotype is present. For example, one mutation may be used to detect the presence of a risk haplotype. In another example, two or more mutations may be used to detect and/or confirm the presence of a risk haplotype. It is also to be understood that subject identification may involve any number of risk haplotypes (e.g., 1, 2, 3, 4, or 5 risk haplotypes).
In some embodiments, the presence of a risk haplotype is determined by detecting one or more SNPs within the chromosomal coordinates of the risk haplotype. In some embodiments, the presence of the risk haplotype is detected by analyzing the genomic DNA for the presence of one or more SNPs in Table 1 within the chromosomal coordinates of the risk haplotype.
It is to be understood that any number of SNPs (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 or more SNPs) in any number of risk haplotypes (e.g., 1, 2, 3, 4, or 5 risk haplotypes) may be used. In some embodiments, a subset or all SNPs in Table 1 located within a risk haplotype are used to detect the presence of the risk haplotype.
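As a rough illustration of detecting a risk haplotype through SNPs located within its coordinates, the following Python sketch counts marker SNPs inside a region at which a subject carries the risk nucleotide. The `Marker` type, the `haplotype_present` function, the `min_hits` threshold, and all SNP identifiers, positions, alleles and genotypes in the example are hypothetical placeholders. They are not taken from Table 1, and the sketch is not the analysis method of the disclosure.

```python
from typing import Dict, List, NamedTuple, Tuple


class Marker(NamedTuple):
    snp_id: str       # hypothetical identifier
    chrom: str
    pos: int
    risk_allele: str  # placeholder risk nucleotide


def haplotype_present(
    region: Tuple[str, int, int],
    markers: List[Marker],
    genotypes: Dict[str, str],
    min_hits: int = 1,
) -> bool:
    """Return True if at least min_hits markers inside the (inclusive) region
    carry their risk allele in the subject's genotype (e.g. 'AG')."""
    chrom, start, end = region
    hits = 0
    for m in markers:
        in_region = m.chrom == chrom and start <= m.pos <= end
        if in_region and m.risk_allele in genotypes.get(m.snp_id, ""):
            hits += 1
    return hits >= min_hits


# Region coordinates from the text; everything else below is made up for illustration.
region = ("chr11", 44392734, 44414985)
markers = [
    Marker("hypothetical_snp_1", "chr11", 44400000, "A"),
    Marker("hypothetical_snp_2", "chr11", 44410000, "C"),
    Marker("hypothetical_snp_3", "chr8", 35440000, "T"),
]
genotypes = {
    "hypothetical_snp_1": "AG",
    "hypothetical_snp_2": "TT",
    "hypothetical_snp_3": "TT",
}
print(haplotype_present(region, markers, genotypes))              # True
print(haplotype_present(region, markers, genotypes, min_hits=2))  # False
```

The default of a single hit mirrors the statement above that one mutation may be used to detect a risk haplotype; a higher `min_hits` corresponds to using two or more mutations for confirmation.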
In some embodiments, a germ-line risk marker is a mutation in a gene. As used herein, a gene includes both coding and non-coding nucleotide sequences. As such, a gene includes any regulatory sequences (e.g., any promoters, enhancers, or suppressors, either adjacent to or far from the coding sequence) and any coding sequences. In some embodiments, a gene includes a nucleotide sequence that encodes a microRNA. In some embodiments, the gene is contained within, near, or spanning the boundaries of a risk haplotype as described herein. In some embodiments, a mutation, such as a SNP, is contained within or near the gene. In some embodiments, the gene is within 1000 Kb, 900 Kb, 800 Kb, 700 Kb, 600 Kb, 500 Kb, 400 Kb, 300 Kb, 200 Kb, or 100 Kb of a SNP as described herein. In some embodiments, the mutation is present in a gene selected from:
one or more genes located within a risk haplotype having chromosome coordinates chr11:44392734-44414985,
one or more genes located within a risk haplotype having chromosome coordinates chr8:35433142-35454649,
one or more genes located within a risk haplotype having chromosome coordinates chr13:14549973-14645634,
one or more genes located within a risk haplotype having chromosome coordinates chr25:21831580-21921256,
one or more genes located within a risk haplotype having chromosome coordinates chr14:48831824-49203827,
one or more genes located within a risk haplotype having chromosome coordinates chr5:16071171-16152955,
one or more genes located within a risk haplotype having chromosome coordinates chr19:33963105-34145310,
one or more genes located within a risk haplotype having chromosome coordinates chr16:43665149-43737129,
one or more genes located within a risk haplotype having chromosome coordinates chr15:63767963-63800415,
one or more genes located within a risk haplotype having chromosome coordinates chr16:40883517-41081510,
one or more genes located within a risk haplotype having chromosome coordinates chr25:43476429-43528145,
one or more genes located within a risk haplotype having chromosome coordinates chr1:112977233-113081800,
one or more genes located within a risk haplotype having chromosome coordinates chr3:5162058-6465753,
one or more genes located within a risk haplotype having chromosome coordinates chr7:64631053-64703475,
one or more genes located within a risk haplotype having chromosome coordinates chr1:115582915-116790630,
one or more genes located within a risk haplotype having chromosome coordinates chr2:19212450-19542015,
one or more genes located within a risk haplotype having chromosome coordinates chr1:122033806-122051988,
one or more genes located within a risk haplotype having chromosome coordinates chr35:18326079-18345318,
one or more genes located within a risk haplotype having chromosome coordinates chr9:47647012-47668054,
one or more genes located within a risk haplotype having chromosome coordinates chr38:11252518-11739329,
one or more genes located within a risk haplotype having chromosome coordinates chr21:46231985-46363479,
one or more genes located within a risk haplotype having chromosome coordinates chr17:14465884-14482152,
one or more genes located within a risk haplotype having chromosome coordinates chr32:25136302-25156153,
one or more genes located within a risk haplotype having chromosome coordinates chr36:29637804-29663408,
one or more genes located within a risk haplotype having chromosome coordinates chr15:37986345-39974762,
one or more genes located within a risk haplotype having chromosome coordinates chr1:29405587-29914411,
one or more genes located within a risk haplotype having chromosome coordinates chr26:32374093-32428448,
one or more genes located within a risk haplotype having chromosome coordinates chr25:29658978-29767164,
one or more genes located within a risk haplotype having chromosome coordinates chr26:3529343-3550075,
one or more genes located within a risk haplotype having chromosome coordinates chr5:14720254-15466603,
one or more genes located within a risk haplotype having chromosome coordinates chr18:4266743-5854451,
one or more genes located within a risk haplotype having chromosome coordinates chr1:16768869-18150476,
one or more genes located within a risk haplotype having chromosome coordinates chr9:18896060-19633155, or
one or more genes located within a risk haplotype having chromosome coordinates chr11:44390633-44406002.
The mapped genes located within or near the risk haplotypes on chromosome 1, 2, 3, 5, 7, 8, 9, 11, 13, 14, 15, 16, 17, 18, 19, 21, 25, 26, 32, 35, 36, and 38 are described in Tables 2 and 3. The Ensembl gene identifiers are based on the CanFam 2.0 genome assembly (see, e.g., Lindblad-Toh K, Wade C M, Mikkelsen T S, Karlsson E K, Jaffe D B, Kamal M, Clamp M, Chang J L, Kulbokas E J 3rd, Zody M C, et al.: Genome sequence, comparative analysis and haplotype structure of the domestic dog. Nature 2005, 438:803-819). The Ensembl gene ID provided for each gene can be used to determine the nucleotide sequence of the gene, as well as associated transcript and protein sequences, by inputting the Ensembl ID into the Ensembl database (Ensembl release 70).
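The distance criterion mentioned above (a gene within, e.g., 100 Kb to 1000 Kb of a SNP) can be sketched in Python as a simple interval check. The function names and the gene interval in the example are hypothetical and do not come from Tables 2 and 3. The sketch assumes the SNP and the gene are on the same chromosome and measures distance to the nearest gene boundary, which is one possible reading of "within".

```python
KB = 1_000  # base pairs per kilobase


def distance_to_gene(snp_pos: int, gene_start: int, gene_end: int) -> int:
    """Distance in base pairs from a SNP to the nearest gene boundary (0 if inside)."""
    if gene_start > gene_end:
        raise ValueError("gene_start must not exceed gene_end")
    if gene_start <= snp_pos <= gene_end:
        return 0
    if snp_pos < gene_start:
        return gene_start - snp_pos
    return snp_pos - gene_end


def gene_within(snp_pos: int, gene_start: int, gene_end: int, window_kb: int) -> bool:
    """True if the gene lies within window_kb kilobases of the SNP."""
    return distance_to_gene(snp_pos, gene_start, gene_end) <= window_kb * KB


# SNP position from the text (chromosome 11); the gene interval is a made-up placeholder.
snp_pos = 44405676
gene_start, gene_end = 44600000, 44650000
print(distance_to_gene(snp_pos, gene_start, gene_end))  # 194324
print(gene_within(snp_pos, gene_start, gene_end, 200))  # True
print(gene_within(snp_pos, gene_start, gene_end, 100))  # False
```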
In some embodiments, a mutation in a gene is used in the methods described herein. In some embodiments, the method comprises:
(a) analyzing genomic DNA from a canine subject for the presence of a mutation in a gene selected from:
one or more genes located within a risk haplotype having chromosome coordinates chr11:44392734-44414985,
one or more genes located within a risk haplotype having chromosome coordinates chr8:35433142-35454649,
one or more genes located within a risk haplotype having chromosome coordinates chr13:14549973-14645634,
one or more genes located within a risk haplotype having chromosome coordinates chr25:21831580-21921256,
one or more genes located within a risk haplotype having chromosome coordinates chr14:48831824-49203827,
one or more genes located within a risk haplotype having chromosome coordinates chr5:16071171-16152955,
one or more genes located within a risk haplotype having chromosome coordinates chr19:33963105-34145310,
one or more genes located within a risk haplotype having chromosome coordinates chr16:43665149-43737129,
one or more genes located within a risk haplotype having chromosome coordinates chr15:63767963-63800415,
one or more genes located within a risk haplotype having chromosome coordinates chr16:40883517-41081510,
one or more genes located within a risk haplotype having chromosome coordinates chr25:43476429-43528145,
one or more genes located within a risk haplotype having chromosome coordinates chr1:112977233-113081800,
one or more genes located within a risk haplotype having chromosome coordinates chr3:5162058-6465753,
one or more genes located within a risk haplotype having chromosome coordinates chr7:64631053-64703475,
one or more genes located within a risk haplotype having chromosome coordinates chr1:115582915-116790630,
one or more genes located within a risk haplotype having chromosome coordinates chr2:19212450-19542015,
one or more genes located within a risk haplotype having chromosome coordinates chr1:122033806-122051988,
one or more genes located within a risk haplotype having chromosome coordinates chr35:18326079-18345318,
one or more genes located within a risk haplotype having chromosome coordinates chr9:47647012-47668054,
one or more genes located within a risk haplotype having chromosome coordinates chr38:11252518-11739329,
one or more genes located within a risk haplotype having chromosome coordinates chr21:46231985-46363479,
one or more genes located within a risk haplotype having chromosome coordinates chr17:14465884-14482152,
one or more genes located within a risk haplotype having chromosome coordinates chr32:25136302-25156153,
one or more genes located within a risk haplotype having chromosome coordinates chr36:29637804-29663408,
one or more genes located within a risk haplotype having chromosome coordinates chr15:37986345-39974762,
one or more genes located within a risk haplotype having chromosome coordinates chr1:29405587-29914411,
one or more genes located within a risk haplotype having chromosome coordinates chr26:32374093-32428448,
one or more genes located within a risk haplotype having chromosome coordinates chr25:29658978-29767164,
one or more genes located within a risk haplotype having chromosome coordinates chr26:3529343-3550075,
one or more genes located within a risk haplotype having chromosome coordinates chr5:14720254-15466603,
one or more genes located within a risk haplotype having chromosome coordinates chr18:4266743-5854451,
one or more genes located within a risk haplotype having chromosome coordinates chr1:16768869-18150476,
one or more genes located within a risk haplotype having chromosome coordinates chr9:18896060-19633155, and
one or more genes located within a risk haplotype having chromosome coordinates chr11:44390633-44406002; and
(b) identifying a canine subject having the mutation as a subject (a) at elevated risk of developing osteosarcoma or (b) having an undiagnosed osteosarcoma.
In some embodiments, the gene is selected from:
one or more genes located within a risk haplotype having chromosome coordinates chr11:44392734-44414985,
one or more genes located within a risk haplotype having chromosome coordinates chr8:35433142-35454649,
one or more genes located within a risk haplotype having chromosome coordinates chr1:115582915-116790630,
one or more genes located within a risk haplotype having chromosome coordinates chr2:19212450-19542015,
one or more genes located within a risk haplotype having chromosome coordinates chr1:122033806-122051988,
one or more genes located within a risk haplotype having chromosome coordinates chr35:18326079-18345318,
one or more genes located within a risk haplotype having chromosome coordinates chr9:47647012-47668054,
one or more genes located within a risk haplotype having chromosome coordinates chr38:11252518-11739329,
one or more genes located within a risk haplotype having chromosome coordinates chr5:14720254-15466603, or
one or more genes located within a risk haplotype having chromosome coordinates chr18:4266743-5854451.
In some embodiments, the gene is selected from:
one or more genes located within a risk haplotype having chromosome coordinates chr11:44392734-44414985,
one or more genes located within a risk haplotype having chromosome coordinates chr1:115582915-116790630, and
one or more genes located within a risk haplotype having chromosome coordinates chr5:14720254-15466603.
In some embodiments, the gene is one or more genes located within the risk haplotype having chromosome coordinates chr11:44392734-44414985.
In some embodiments, the gene is selected from CDKN2B-AS, OTX2, BMPER, GRIK4, EN1, MARCO, MTMR7, SGCZ, CCL20, CD3EAP, ERCC1, ERCC2, FOSB, PPP1R13L, FER, MAN2A1, PJA2, CHST9, ADCK4, AKT2, AXL, BLVRB, C19orf47, C19orf54, CNTD2, CYP2A7, CYP2B6, CYP2S1, DLL3, EGLN2, FBL, FCGBP, GMFG, HIPK4, HNRNPUL1, ITPKC, LEUTX, LTBP4, MAP3K10, MED29, NUMBL, PLD3, PLEKHG2, PSMC4, RAB4B, SAMD4B, SERTAD1, SERTAD3, SHKBP1, SNRPA, SPTBN4, SUPT5H, TIMM50, KIAA1462, C19orf40, CEP89, RHPN2, BLMH, TMIGD1, FAM5C, NELL1, EMCN, AMDHD1, CCDC38, CDK17, ELK3, FGD6, HAL, LTA4H, METAP2, NDUFA12, NEDD1, NR2C1, NTN4, SNRPF, USP44, VEZT, EYA4, TCF21, ARVCF, C22orf25, COMT, XKR6, FBRSL1, BLID, C7orf72, COBL, DDC, FIGNL1, GRB10, IKZF1, VWC2, ZPBP, BCL2, KIAA1468, PHLPP1, PIGN, RNF152, TNFRSF11A, ZCCHC2, ABCA5, KCNJ16, KCNJ2, MAP2K6, CDKN2A, and CDKN2B.
In some embodiments, the gene is selected from CDKN2B-AS, OTX2, BMPER, EN1, DLL3, KIAA1462, FAM5C, NELL1, EMCN, TCF21, BLID, VWC2, BCL2, and TNFRSF11A.
In some embodiments, the gene is selected from CDKN2B-AS, OTX2, ADCK4, AKT2, AXL, BLVRB, C19orf47, C19orf54, CNTD2, CYP2A7, CYP2B6, CYP2S1, DLL3, EGLN2, FBL, FCGBP, GMFG, HIPK4, HNRNPUL1, ITPKC, LEUTX, LTBP4, MAP3K10, MED29, NUMBL, PLD3, PLEKHG2, PSMC4, RAB4B, SAMD4B, SERTAD1, SERTAD3, SHKBP1, SNRPA, SPTBN4, SUPT5H, TIMM50, KIAA1462, C19orf40, CEP89, RHPN2, BLMH, TMIGD1, FAM5C, BLID, C7orf72, COBL, DDC, FIGNL1, GRB10, IKZF1, VWC2, and ZPBP.
In some embodiments, the gene is selected from CDKN2B-AS, ADCK4, AKT2, AXL, BLVRB, C19orf47, C19orf54, CNTD2, CYP2A7, CYP2B6, CYP2S1, DLL3, EGLN2, FBL, FCGBP, GMFG, HIPK4, HNRNPUL1, ITPKC, LEUTX, LTBP4, MAP3K10, MED29, NUMBL, PLD3, PLEKHG2, PSMC4, RAB4B, SAMD4B, SERTAD1, SERTAD3, SHKBP1, SNRPA, SPTBN4, SUPT5H, TIMM50, and BLID.
In some embodiments, the gene is selected from CDKN2B-AS, CDKN2A, and CDKN2B. In some embodiments, the gene is selected from CDKN2B-AS, CDKN2A, CDKN2B, and MTAP.
Any number of mutations (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 or more mutations) in any number of genes (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 or more genes) are contemplated.
The genes described herein can also be used to identify a subject at elevated risk of or having undiagnosed osteosarcoma, where the subject is any of a variety of animal subjects including but not limited to human subjects. In some embodiments, the method comprises:
(a) analyzing genomic DNA in a sample from a subject for presence of a mutation in a gene selected from:
one or more genes located within a risk haplotype having chromosome coordinates chr11:44392734-44414985 or an orthologue of such a gene,
one or more genes located within a risk haplotype having chromosome coordinates chr8:35433142-35454649 or an orthologue of such a gene,
one or more genes located within a risk haplotype having chromosome coordinates chr13:14549973-14645634 or an orthologue of such a gene,
one or more genes located within a risk haplotype having chromosome coordinates chr25:21831580-21921256 or an orthologue of such a gene,
one or more genes located within a risk haplotype having chromosome coordinates chr14:48831824-49203827 or an orthologue of such a gene,
one or more genes located within a risk haplotype having chromosome coordinates chr5:16071171-16152955 or an orthologue of such a gene,
one or more genes located within a risk haplotype having chromosome coordinates chr19:33963105-34145310 or an orthologue of such a gene,
one or more genes located within a risk haplotype having chromosome coordinates chr16:43665149-43737129 or an orthologue of such a gene,
one or more genes located within a risk haplotype having chromosome coordinates chr15:63767963-63800415 or an orthologue of such a gene,
one or more genes located within a risk haplotype having chromosome coordinates chr16:40883517-41081510 or an orthologue of such a gene,
one or more genes located within a risk haplotype having chromosome coordinates chr25:43476429-43528145 or an orthologue of such a gene,
one or more genes located within a risk haplotype having chromosome coordinates chr1:112977233-113081800 or an orthologue of such a gene,
one or more genes located within a risk haplotype having chromosome coordinates chr3:5162058-6465753 or an orthologue of such a gene,
one or more genes located within a risk haplotype having chromosome coordinates chr7:64631053-64703475 or an orthologue of such a gene,
one or more genes located within a risk haplotype having chromosome coordinates chr1:115582915-116790630 or an orthologue of such a gene,
one or more genes located within a risk haplotype having chromosome coordinates chr2:19212450-19542015 or an orthologue of such a gene,
one or more genes located within a risk haplotype having chromosome coordinates chr1:122033806-122051988 or an orthologue of such a gene,
one or more genes located within a risk haplotype having chromosome coordinates chr35:18326079-18345318 or an orthologue of such a gene,
one or more genes located within a risk haplotype having chromosome coordinates chr9:47647012-47668054 or an orthologue of such a gene,
one or more genes located within a risk haplotype having chromosome coordinates chr38:11252518-11739329 or an orthologue of such a gene,
one or more genes located within a risk haplotype having chromosome coordinates chr21:46231985-46363479 or an orthologue of such a gene,
one or more genes located within a risk haplotype having chromosome coordinates chr17:14465884-14482152 or an orthologue of such a gene,
one or more genes located within a risk haplotype having chromosome coordinates chr32:25136302-25156153 or an orthologue of such a gene,
one or more genes located within a risk haplotype having chromosome coordinates chr36:29637804-29663408 or an orthologue of such a gene,
one or more genes located within a risk haplotype having chromosome coordinates chr15:37986345-39974762 or an orthologue of such a gene,
one or more genes located within a risk haplotype having chromosome coordinates chr1:29405587-29914411 or an orthologue of such a gene,
one or more genes located within a risk haplotype having chromosome coordinates chr26:32374093-32428448 or an orthologue of such a gene,
one or more genes located within a risk haplotype having chromosome coordinates chr25:29658978-29767164 or an orthologue of such a gene,
one or more genes located within a risk haplotype having chromosome coordinates chr26:3529343-3550075 or an orthologue of such a gene,
one or more genes located within a risk haplotype having chromosome coordinates chr5:14720254-15466603 or an orthologue of such a gene,
one or more genes located within a risk haplotype having chromosome coordinates chr18:4266743-5854451 or an orthologue of such a gene,
one or more genes located within a risk haplotype having chromosome coordinates chr1:16768869-18150476 or an orthologue of such a gene,
one or more genes located within a risk haplotype having chromosome coordinates chr9:18896060-19633155 or an orthologue of such a gene, and
one or more genes located within a risk haplotype having chromosome coordinates chr11:44,390,633-44,406,002 or an orthologue of such a gene; and
(b) identifying a subject having the mutation as a subject (a) at elevated risk of developing osteosarcoma or (b) having an undiagnosed osteosarcoma. In some embodiments, the subject is a human subject. In some embodiments, the subject is a canine subject. An orthologue of a gene may be, e.g., a human gene as identified in Table 2 or 3. In some embodiments, an orthologue of a gene has a sequence that is 70%, 75%, 80%, 85%, 90%, 95%, or 99% or more homologous to a sequence of the gene.
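By way of illustration only, the following minimal sketch shows one way a variant position could be checked against the risk haplotype intervals recited above. It includes only a few of the listed intervals, and the names `Region`, `parse_region`, `RISK_HAPLOTYPES`, and `overlapping_haplotypes` are hypothetical. Inclusive interval ends are an assumption, since the coordinate convention is not stated here.

```python
from typing import List, NamedTuple


class Region(NamedTuple):
    chrom: str
    start: int
    end: int


def parse_region(text: str) -> Region:
    """Parse a string such as 'chr11:44392734-44414985'.

    Thousands separators (as in 'chr11:44,390,633-44,406,002') are tolerated.
    """
    chrom, span = text.split(":")
    start, end = span.replace(",", "").split("-")
    return Region(chrom, int(start), int(end))


# A subset of the risk haplotype intervals listed above (not the full list).
RISK_HAPLOTYPES = [
    parse_region(r)
    for r in (
        "chr11:44392734-44414985",
        "chr8:35433142-35454649",
        "chr13:14549973-14645634",
        "chr11:44,390,633-44,406,002",
    )
]


def overlapping_haplotypes(chrom: str, pos: int) -> List[Region]:
    """Return every listed interval that contains the position.

    Assumption: both interval ends are inclusive.
    """
    return [
        r for r in RISK_HAPLOTYPES
        if r.chrom == chrom and r.start <= pos <= r.end
    ]
```

A position may fall within more than one listed interval, because two of the chr11 intervals overlap, so the function returns a list rather than a single match.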
Some methods provided herein comprise analyzing genomic DNA. In some embodiments, analyzing genomic DNA comprises carrying out a nucleic acid-based assay, such as a sequencing-based assay or a hybridization-based assay. In some embodiments, the genomic DNA is analyzed using a single nucleotide polymorphism (SNP) array. In some embodiments, the genomic DNA is analyzed using a bead assay. Methods of genetic analysis are known in the art. Examples of genetic analysis methods and commercially available tools are described below.
Affymetrix: The Affymetrix SNP 6.0 array contains over 1.8 million SNP and copy number probes on a single array. The method utilizes a simple restriction enzyme digestion of 250 ng of genomic DNA, followed by linker-ligation of a common adaptor sequence to every fragment, a tactic that allows multiple loci to be amplified using a single primer complementary to this adaptor. Standard PCR then amplifies a predictable size range of fragments, which converts the genomic DNA into a sample of reduced complexity as well as increases the concentration of the fragments that reside within this predicted size range. The target is fragmented, labeled with biotin, hybridized to microarrays, stained with streptavidin-phycoerythrin and scanned. To support this method, Affymetrix Fluidics Stations and integrated GS-3000 Scanners can be used.
Illumina Infinium: Examples of commercially available Infinium array options include the 660W-Quad (>660,000 probes), the 1MDuo (over 1 million probes), and the custom iSelect (up to 200,000 SNPs selected by user). Samples begin the process with a whole genome amplification step, then 200 ng is transferred to a plate to be denatured and neutralized, and finally plates are incubated overnight to amplify. After amplification the samples are enzymatically fragmented using end-point fragmentation. Precipitation and resuspension clean up the DNA before hybridization onto the chips. The fragmented, resuspended DNA samples are then dispensed onto the appropriate BeadChips and placed in the hybridization oven to incubate overnight. After hybridization the chips are washed and labeled nucleotides are added to extend the primers by one base. The chips are immediately stained and coated for protection before scanning. Scanning is done with one of the two Illumina iScan™ Readers, which use a laser to excite the fluorophore of the single-base extension product on the beads. The scanner records high-resolution images of the light emitted from the fluorophores. All plates and chips are barcoded and tracked with an internally derived laboratory information management system. The data from these images are analyzed to determine SNP genotypes using Illumina's BeadStudio. To support this process, Biomek F/X, three Tecan Freedom Evos, and two Tecan Genesis Workstation 150s can be used to automate all liquid handling steps throughout the sample and chip prep process.
Illumina BeadArray: The Illumina Bead Lab system is a multiplexed array-based format. Illumina's BeadArray Technology is based on 3-micron silica beads that self-assemble in microwells on either of two substrates: fiber optic bundles or planar silica slides. When randomly assembled on one of these two substrates, the beads have a uniform spacing of ˜5.7 microns. Each bead is covered with hundreds of thousands of copies of a specific oligonucleotide that act as the capture sequences in one of Illumina's assays. BeadArray technology is utilized in Illumina's iScan System.
Sequenom: During pre-PCR, either of two Packard Multiprobes is used to pool oligonucleotides, and a Tomtec Quadra 384 is used to transfer DNA. A Cartesian nanodispenser is used for small-volume transfer in pre-PCR, and another in post-PCR. Beckman Multimeks, equipped with either a 96-tip head or a 384-tip head, are used for more substantial liquid handling of mixes. Two Sequenom pin-tools are used to dispense nanoliter volumes of analytes onto target chips for detection by mass spectrometry. Sequenom Compact mass spectrometers can be used for genotype detection.
In some embodiments, methods provided herein comprise analyzing genomic DNA using a nucleic acid sequencing assay. Methods of genome sequencing are known in the art. Examples of genome sequencing methods and commercially available tools are described below.
Illumina Sequencing: 89 GAIIx Sequencers are used for sequencing of samples. Library construction is supported with 6 Agilent Bravo plate-based automation, Stratagene MX3005p qPCR machines, Matrix 2-D barcode scanners on all automation decks and 2 Multimek Automated Pipettors for library normalization.
454 Sequencing: Roche® 454 FLX-Titanium instruments are used for sequencing of samples. Library construction capacity is supported by Agilent Bravo automation deck, Biomek FX and Janus PCR normalization.
SOLiD Sequencing: SOLiD v3.0 instruments are used for sequencing of samples. Sequencing set-up is supported by a Stratagene MX3005p qPCR machine and a Beckman SC Quanter for bead counting.
ABI Prism® 3730 XL Sequencing: ABI Prism® 3730 XL machines are used for sequencing samples. Automated Sequencing reaction set-up is supported by 2 Multimek Automated Pipettors and 2 Deerac Fluidics Equator systems. PCR is performed on 60 Thermo-Hybaid 384-well systems.
Ion Torrent: Ion PGM™ or Ion Proton™ machines are used for sequencing samples. Ion library kits (Invitrogen) can be used to prepare samples for sequencing.
Other Technologies: Examples of other commercially available platforms include Helicos Heliscope Single-Molecule Sequencer, Polonator G.007, and Raindance RDT 1000 Rainstorm.
The invention contemplates that elevated risk of developing osteosarcoma is associated with an altered expression pattern of a gene located at, within, or near a risk haplotype, such as a gene located in Table 2 or 3. The invention therefore contemplates methods that involve measuring the mRNA or protein levels for these genes and comparing such levels to control levels, including for example predetermined thresholds.
mRNA Assays
The art is familiar with various methods for analyzing mRNA levels. Examples of mRNA-based assays include but are not limited to oligonucleotide microarray assays, quantitative RT-PCR, Northern analysis, and multiplex bead-based assays.
Expression profiles of cells in a biological sample (e.g., blood or a tumor) can be carried out using an oligonucleotide microarray analysis. As an example, this analysis may be carried out using a commercially available oligonucleotide microarray or a custom designed oligonucleotide microarray comprising oligonucleotides for all or a subset of the transcripts described herein. The microarray may comprise any number of the transcripts, as the invention contemplates that elevated risk may be determined based on the analysis of single differentially expressed transcripts or a combination of differentially expressed transcripts. The transcripts may be those that are up-regulated in tumors carrying a germ-line risk marker (compared to a tumor that does not carry the germ-line risk marker), or those that are down-regulated in tumors carrying a germ-line risk marker (compared to a tumor that does not carry the germ-line risk marker), or a combination of these. The number of transcripts measured using the microarray therefore may be 1, 2, 3, 4, 5, 6, 7, 8, 9, or more transcripts encoded by a gene in Table 2 or 3. It is to be understood that such arrays may however also comprise positive and/or negative control transcripts such as housekeeping genes that can be used to determine if the array has been degraded and/or if the sample has been degraded or contaminated. The art is familiar with the construction of oligonucleotide arrays.
Commercially available gene expression systems include Affymetrix GeneChip microarrays as well as all of Illumina standard expression arrays, including two GeneChip 450 Fluidics Stations and a GeneChip 3000 Scanner, Affymetrix High-Throughput Array (HTA) System composed of a GeneStation liquid handling robot and a GeneChip HT Scanner providing automated sample preparation, hybridization, and scanning for 96-well Affymetrix PEGarrays. These systems can be used in the cases of small or potentially degraded RNA samples. The invention also contemplates analyzing expression levels from fixed samples (as compared to freshly isolated samples). The fixed samples include formalin-fixed and/or paraffin-embedded samples. Such samples may be analyzed using the whole genome Illumina DASL assay. High-throughput gene expression profile analysis can also be achieved using bead-based solutions, such as Luminex systems.
Other mRNA detection and quantitation methods include multiplex detection assays known in the art, e.g., xMAP® bead capture and detection (Luminex Corp., Austin, Tex.).
Another exemplary method is a quantitative RT-PCR assay which may be carried out as follows: mRNA is extracted from cells in a biological sample (e.g., blood or a tumor) using the RNeasy kit (Qiagen). Total mRNA is used for subsequent reverse transcription using the SuperScript III First-Strand Synthesis SuperMix (Invitrogen) or the SuperScript VILO cDNA synthesis kit (Invitrogen). 5 μl of the RT reaction is used for quantitative PCR using SYBR Green PCR Master Mix and gene-specific primers, in triplicate, using an ABI 7300 Real Time PCR System.
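As a non-limiting illustration, triplicate threshold-cycle (Ct) values from such a quantitative PCR run could be converted to a relative expression level as sketched below. The comparative Ct (2^-ΔΔCt) calculation, the use of a reference (housekeeping) gene, and the assumption of approximately 100% amplification efficiency are assumptions of this sketch and are not specified in the assay description above. The function name `relative_expression` is hypothetical.

```python
from statistics import mean
from typing import Sequence


def relative_expression(
    target_ct_sample: Sequence[float],
    reference_ct_sample: Sequence[float],
    target_ct_control: Sequence[float],
    reference_ct_control: Sequence[float],
) -> float:
    """Fold change of a target transcript in a sample relative to a control.

    Uses the comparative Ct calculation (2 ** -ddCt), which assumes that
    amplification efficiency is close to 100% for both primer pairs.
    Each argument holds the replicate Ct values (e.g., a triplicate).
    """
    # Normalize the target gene to the reference gene within each condition.
    delta_sample = mean(target_ct_sample) - mean(reference_ct_sample)
    delta_control = mean(target_ct_control) - mean(reference_ct_control)
    # Compare the normalized sample value with the normalized control value.
    delta_delta = delta_sample - delta_control
    return 2.0 ** (-delta_delta)
```

The resulting fold change could then be compared to a control level or predetermined threshold of the kind contemplated herein.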
mRNA detection binding partners include oligonucleotide or modified oligonucleotide (e.g. locked nucleic acid) probes that hybridize to a target mRNA. Probes may be designed using the sequences or sequence identifiers listed in Table 2 or 3. Methods for designing and producing oligonucleotide probes are well known in the art (see, e.g., U.S. Pat. No. 8,036,835; Rimour et al. GoArrays: highly dynamic and efficient microarray probe design. Bioinformatics (2005) 21 (7): 1094-1103; and Wernersson et al. Probe selection for DNA microarrays using OligoWiz. Nat Protoc. 2007; 2(11):2677-91).
Protein Assays
The art is familiar with various methods for measuring protein levels. Protein levels may be measured using protein-based assays such as but not limited to immunoassays, Western blots, Western immunoblotting, multiplex bead-based assays, and assays involving aptamers (such as SOMAmer™ technology) and related affinity agents.
A brief description of an exemplary immunoassay is provided here. A biological sample is applied to a substrate having bound to its surface protein-specific binding partners (i.e., immobilized protein-specific binding partners). The protein-specific binding partner (which may be referred to as a “capture ligand” because it functions to capture and immobilize the protein on the substrate) may be an antibody or an antigen-binding antibody fragment such as Fab, F(ab)2, Fv, single chain antibody, Fab and sFab fragment, F(ab′)2, Fd fragments, scFv, and dAb fragments, although it is not so limited. Other binding partners are described herein. Proteins present in the biological sample bind to the capture ligands, and the substrate is washed to remove unbound material. The substrate is then exposed to soluble protein-specific binding partners (which may be identical to the binding partners used to immobilize the protein). The soluble protein-specific binding partners are allowed to bind to their respective proteins immobilized on the substrate, and then unbound material is washed away. The substrate is then exposed to a detectable binding partner of the soluble protein-specific binding partner. In one embodiment, the soluble protein-specific binding partner is an antibody having some or all of its Fc domain. Its detectable binding partner may be an anti-Fc domain antibody. As will be appreciated by those in the art, if more than one protein is being detected, the assay may be configured so that the soluble protein-specific binding partners are all antibodies of the same isotype. In this way, a single detectable binding partner, such as an antibody specific for the common isotype, may be used to bind to all of the soluble protein-specific binding partners bound to the substrate.
It is to be understood that the substrate may comprise capture ligands for one or more proteins, including two or more, three or more, four or more, five or more, etc. up to and including all of the proteins encoded by the genes in Table 2 provided by the invention.
Other examples of protein detection and quantitation methods include multiplexed immunoassays as described for example in U.S. Pat. Nos. 6,939,720 and 8,148,171, and published US Patent Application No. 2008/0255766, and protein microarrays as described for example in published US Patent Application No. 2009/0088329.
Protein detection binding partners include protein-specific binding partners. Protein-specific binding partners can be generated using the sequences or sequence identifiers listed in Table 2. In some embodiments, binding partners may be antibodies. As used herein, the term “antibody” refers to a protein that includes at least one immunoglobulin variable domain or immunoglobulin variable domain sequence. For example, an antibody can include a heavy (H) chain variable region (abbreviated herein as VH), and a light (L) chain variable region (abbreviated herein as VL). In another example, an antibody includes two heavy (H) chain variable regions and two light (L) chain variable regions. The term “antibody” encompasses antigen-binding fragments of antibodies (e.g., single chain antibodies, Fab and sFab fragments, F(ab′)2, Fd fragments, Fv fragments, scFv, and dAb fragments) as well as complete antibodies. Methods for making antibodies and antigen-binding fragments are well known in the art (see, e.g. Sambrook et al, “Molecular Cloning: A Laboratory Manual” (2nd Ed.), Cold Spring Harbor Laboratory Press (1989); Lewin, “Genes IV”, Oxford University Press, New York, (1990), and Roitt et al., “Immunology” (2nd Ed.), Gower Medical Publishing, London, New York (1989), WO2006/040153, WO2006/122786, and WO2003/002609).
Binding partners also include non-antibody proteins or peptides that bind to or interact with a target protein, e.g., through non-covalent bonding. For example, if the protein is a ligand, a binding partner may be a receptor for that ligand. In another example, if the protein is a receptor, a binding partner may be a ligand for that receptor. In yet another example, a binding partner may be a protein or peptide known to interact with a protein. Methods for producing proteins are well known in the art (see, e.g. Sambrook et al, “Molecular Cloning: A Laboratory Manual” (2nd Ed.), Cold Spring Harbor Laboratory Press (1989) and Lewin, “Genes IV”, Oxford University Press, New York, (1990)) and can be used to produce binding partners such as ligands or receptors.
Binding partners also include aptamers and other related affinity agents. Aptamers include oligonucleic acid or peptide molecules that bind to a specific target. Methods for producing aptamers to a target are known in the art (see, e.g., published US Patent Application No. 2009/0075834, U.S. Pat. Nos. 7,435,542, 7,807,351, and 7,239,742). Other examples of affinity agents include SOMAmer™ (Slow Off-rate Modified Aptamer, SomaLogic, Boulder, Colo.) modified nucleic acid-based protein binding reagents.
Binding partners also include any molecule capable of demonstrating selective binding to any one of the target proteins disclosed herein, e.g., peptoids (see, e.g., Reyna J Simon et al., “Peptoids: a modular approach to drug discovery” Proceedings of the National Academy of Sciences USA, (1992), 89(20), 9367-9371; U.S. Pat. No. 5,811,387; and M. Muralidhar Reddy et al., Identification of candidate IgG biomarkers for Alzheimer's disease via combinatorial library screening. Cell 144, 132-142, Jan. 7, 2011).
Detectable binding partners may be directly or indirectly detectable. A directly detectable binding partner may be labeled with a detectable label such as a fluorophore. An indirectly detectable binding partner may be labeled with a moiety that acts upon (e.g., an enzyme or a catalytic domain) or a moiety that is acted upon (e.g., a substrate) by another moiety in order to generate a detectable signal. Exemplary detectable labels include, e.g., enzymes, radioisotopes, haptens, biotin, and fluorescent, luminescent and chromogenic substances. These various methods and moieties for detectable labeling are known in the art.
Any of the methods provided herein can be performed on a device, e.g., an array. Suitable arrays are described herein and known in the art. Accordingly, a device, e.g., an array, for detecting any of the germ-line risk markers (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, or more germ-line risk markers, or at least 10, at least 20, at least 30, at least 40, at least 50, or more germ-line risk markers, or up to 5, up to 10, up to 15, up to 20, up to 25, up to 30, up to 35, up to 40, up to 45, up to 50, up to 75 or up to 100 germ-line risk markers) described herein is also contemplated.
Reagents for use in any of the methods provided herein can be in the form of a kit. Accordingly, a kit for detecting any of the germ-line risk markers (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, or more germ-line risk markers, or at least 10, at least 20, at least 30, at least 40, at least 50, or more germ-line risk markers, or up to 5, up to 10, up to 15, up to 20, up to 25, up to 30, up to 35, up to 40, up to 45, up to 50, up to 75 or up to 100 germ-line risk markers) described herein is also contemplated. In some embodiments, the kit comprises reagents for detecting any of the germ-line risk markers described herein, e.g., reagents for use in a method described herein. Suitable reagents are described herein and are known in the art.
Some of the methods provided herein involve measuring a level or determining the identity of a germ-line risk marker in a biological sample and then comparing that level or identity to a control in order to identify a subject having an elevated risk of developing osteosarcoma or having as yet undiagnosed osteosarcoma. The control may be a control level or identity that is a level or identity of the same germ-line marker in a control tissue, control subject, or a population of control subjects.
The control may be (or may be derived from) a normal subject (or normal subjects). A normal subject, as used herein, refers to a subject that is healthy, such as a subject experiencing none of the symptoms associated with osteosarcoma. The control population may be a population of normal subjects.
In other instances, the control may be (or may be derived from) a subject (a) having a similar cancer to that of the subject being tested and (b) who is negative for the germ-line risk marker.
It is to be understood that the methods provided herein do not require that a control level or identity be measured every time a subject is tested. Rather, it is contemplated that control levels or identities of germ-line risk markers are obtained and recorded and that any test level is compared to such a pre-determined level or identity (or threshold).
In some embodiments, a control is a nucleotide other than the risk nucleotide as described in Table 1.
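For illustration, comparing an observed genotype to a recorded risk nucleotide could be sketched as follows. The SNP identifiers and risk nucleotides in `RISK_NUCLEOTIDE` are hypothetical placeholders and are not taken from Table 1. The function names and the two-letter genotype encoding (e.g., 'AG') are likewise assumptions of this sketch.

```python
from typing import Dict

# Hypothetical placeholders only; actual risk nucleotides are those of Table 1.
RISK_NUCLEOTIDE: Dict[str, str] = {
    "SNP_EXAMPLE_1": "A",
    "SNP_EXAMPLE_2": "G",
}


def risk_allele_count(snp_id: str, genotype: str) -> int:
    """Count copies (0, 1, or 2) of the risk nucleotide in a diploid genotype.

    The genotype is assumed to be encoded as two letters, such as 'AG'.
    Any nucleotide other than the recorded risk nucleotide is treated
    as the control.
    """
    risk = RISK_NUCLEOTIDE[snp_id]
    return sum(1 for allele in genotype.upper() if allele == risk)


def carries_risk_marker(snp_id: str, genotype: str) -> bool:
    """Return True if at least one copy of the risk nucleotide is present."""
    return risk_allele_count(snp_id, genotype) > 0
```

Because the risk nucleotides are recorded once in a lookup table, a test genotype can be compared to the pre-determined identity without re-measuring a control for every subject.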
The methods provided herein detect and optionally measure (and thus analyze) levels of particular germ-line risk markers in biological samples. Biological samples, as used herein, refer to samples taken or obtained from a subject. These biological samples may be tissue samples or they may be fluid samples (e.g., bodily fluid). Examples of biological fluid samples are whole blood, plasma, serum, urine, sputum, phlegm, saliva, tears, and other bodily fluids. In some embodiments, the biological sample is a whole blood or saliva sample. In some embodiments, the biological sample is a tumor, a fragment of a tumor, or a tumor cell(s). In some embodiments, the biological sample is a bone sample or bone biopsy.
In some embodiments, the biological sample may comprise a polynucleotide (e.g., genomic DNA or mRNA) derived from a tissue sample or fluid sample of the subject. In some embodiments, the biological sample may comprise a polypeptide (e.g., a protein) derived from a tissue sample or fluid sample of the subject. In some embodiments, the biological sample may be manipulated to extract a polynucleotide or polypeptide. In some embodiments, the biological sample may be manipulated to amplify a polynucleotide sample. Methods for extraction and amplification are well known in the art.
Methods of the invention are intended for canine subjects. In some embodiments, canine subjects include, for example, those with a higher incidence of osteosarcoma as determined by breed. For example, the canine subject may be an Irish Wolfhound, Greyhound, German Shepherd, Rottweiler, Great Pyrenees, St. Bernard, Leonberger, Newfoundland, Doberman Pinscher or Great Dane, or a descendant of an Irish Wolfhound, Greyhound, German Shepherd, Rottweiler, Great Pyrenees, St. Bernard, Leonberger, Newfoundland, Doberman Pinscher or Great Dane. In some embodiments, the canine subject may be a Greyhound, an Irish Wolfhound, or a Rottweiler, or a descendant of a Greyhound, an Irish Wolfhound, or a Rottweiler. As used herein, a “descendant” includes any blood relative in the line of descent, e.g., first generation, second generation, third generation, fourth generation, etc., of a canine subject. Such a descendant may be a pure-bred canine subject, e.g., a descendant of two Greyhounds, or a mixed-breed canine subject, e.g., a descendant of both a Greyhound and a non-Greyhound. Breed can be determined, e.g., using commercially available genetic tests (see, e.g., Wisdom Panel).
Methods of the invention may be used in a variety of other subjects including but not limited to human subjects.
Methods of computational analysis of genomic and expression data are known in the art. Examples of available computational programs are: Genome Analysis Toolkit (GATK, Broad Institute, Cambridge, Mass.), Expressionist Refiner module (Genedata AG, Basel, Switzerland), GeneChip Robust Multichip Averaging (GC-RMA) algorithm, PLINK (Purcell et al, 2007), GCTA (Yang et al, 2011), the EIGENSTRAT method (Price et al 2006), and EMMAX (Kang et al, 2010). In some embodiments, methods described herein include a step comprising computational analysis.
Other aspects of the invention relate to use of the diagnostic methods in connection with a breeding program. A breeding program is a planned, intentional breeding of a group of animals to reduce detrimental or undesirable traits and/or increase beneficial or desirable traits in offspring of the animals. Thus, a subject identified using the methods described herein as not having a germ-line risk marker of the invention may be included in a breeding program to reduce the risk of developing osteosarcoma in the offspring of said subject. Alternatively, a subject identified using the methods described herein as having a germ-line risk marker of the invention may be excluded from a breeding program. In some embodiments, methods of the invention comprise exclusion from a breeding program of a subject identified as being at elevated risk of developing osteosarcoma or having undiagnosed osteosarcoma, or inclusion in a breeding program of a subject identified as not being at elevated risk of developing osteosarcoma or having undiagnosed osteosarcoma.
Other aspects of the invention relate to diagnostic or prognostic methods that comprise a treatment step (also referred to as “theranostic” methods due to the inclusion of the treatment step). Any treatment for osteosarcoma is contemplated. In some embodiments, treatment comprises one or more of surgery, chemotherapy, and radiation.
In some embodiments, treatment comprises amputation or limb-salvage surgery. Amputation includes removal of a region of or the entirety of a limb containing the osteosarcoma. Limb-salvage surgery includes removal of the bone containing the osteosarcoma and a region of healthy bone and/or tissue surrounding the osteosarcoma (e.g., about an inch around the osteosarcoma). The removed bone is then replaced. The replacement can be, for example, a synthetic rod or plate (prostheses), a piece of bone (graft) taken from the subject's own body (autologous transplant), or a piece of bone removed from a donor body (such as a cadaver) and frozen until needed for transplant (allogeneic transplant).
In some embodiments, treatment comprises administration of an effective amount of mifamurtide, methotrexate, cisplatin, carboplatin, doxorubicin, adriamycin, ifosfamide, mesna, BCD (bleomycin, cyclophosphamide, dactinomycin), etoposide, muramyl tripeptide (MTP), alendronate and/or pamidronate. In some embodiments, treatment comprises administration of an effective amount of a chemosensitizer such as suramin.
In some embodiments, treatment comprises administration of an effective amount of ADXS-HER2 (Advaxis). ADXS-HER2 comprises a live, attenuated strain of Listeria containing multiple copies of a plasmid that encodes a fusion protein sequence including a fragment of the LLO (listeriolysin O) molecule joined to HER2.
In some embodiments, treatment comprises apSTAR (autologous patient specific tumor antigen response) Veterinary Cancer Laser System (IMULAN BioTherapeutics, LLC and Veterinary Cancer Therapeutics, LLC). Also known as laser-assisted immunotherapy, apSTAR is a cancer treatment for solid tumors that utilizes an autologous vaccine-like approach to stimulate immune responses. apSTAR combines laser-induced in situ tumor devitalization with an immunoadjuvant for local immunostimulation.
In some embodiments, treatment comprises surgery to remove the primary tumor(s) followed by administration of an effective amount of an adjuvant chemotherapy to remove metastatic cells. In some embodiments, treatment further comprises additional adjuvant therapy, such as administration of suramin.
In some embodiments, treatment is palliative treatment. In some embodiments, palliative treatment comprises radiation and/or administration of an effective amount of an analgesic (e.g., a non-steroidal anti-inflammatory drug, NSAID).
It is to be understood that any treatment described herein may be used alone or may be used in combination with any other treatment described herein. In some embodiments, treatment comprises surgery and at least one other therapy, such as chemotherapy or radiation.
In some embodiments, a subject identified as being at elevated risk of developing osteosarcoma or having undiagnosed osteosarcoma is treated. In some embodiments, the method comprises selecting a subject for treatment on the basis of the presence of one or more germ-line risk markers as described herein. In some embodiments, the method comprises treating a subject with osteosarcoma characterized by the presence of one or more germ-line risk markers as defined herein.
As used herein, “treat” or “treatment” includes, but is not limited to, preventing or reducing the development of a cancer, reducing the symptoms of cancer, suppressing or inhibiting the growth of a cancer, preventing metastasis and/or invasion of an existing cancer, promoting or inducing regression of the cancer, inhibiting or suppressing the proliferation of cancerous cells, reducing angiogenesis and/or increasing the amount of apoptotic cancer cells.
An effective amount is a dosage of a therapy sufficient to provide a medically desirable result, such as treatment of cancer. The effective amount will vary with the location of the cancer being treated, the age and physical condition of the subject being treated, the severity of the condition, the duration of the treatment, the nature of any concurrent therapy, the specific route of administration and the like factors within the knowledge and expertise of the health practitioner.
Administration of a treatment may be accomplished by any method known in the art (see, e.g., Harrison's Principles of Internal Medicine, McGraw Hill Inc.). Administration may be local or systemic. Administration may be parenteral (e.g., intravenous, subcutaneous, or intradermal) or oral. Compositions for different routes of administration are well known in the art (see, e.g., Remington's Pharmaceutical Sciences by E. W. Martin). Dosage will depend on the subject and the route of administration. Dosage can be determined by the skilled artisan.
Osteosarcoma in dogs is a spontaneously occurring disease with a global tumor gene expression signature indistinguishable from tumors from human pediatric patients and, while age of onset is higher in dogs, the clinical progression is remarkably similar. Both human and canine osteosarcomas most commonly arise at the ends of the long bones of the limbs and metastasize readily, usually to the lungs. Unlike human osteosarcoma, canine osteosarcoma is primarily a heritable disease that mainly affects large dogs. Particular dog breeds show more than 10-fold increased risk, including the Greyhound (mortality from osteosarcoma=26%), Rottweiler (mortality from osteosarcoma=17%) and Irish Wolfhound (mortality from osteosarcoma=21%) [ref. 6-8].
Mapping disease genes using genome wide association study (GWAS) in dog breeds, each effectively a genetic isolate only a few hundred years old, requires approximately 10x fewer markers and samples than in human populations. However, population structure, cryptic relatedness and extensive regions of near fixation in breeds complicate GWAS analysis, and to date just a handful of studies have successfully mapped risk factors for complex, multigenic canine disorders. As described herein, novel methods for analyzing breed populations were used to identify genomic loci explaining the majority of the osteosarcoma phenotype variance in three breed populations, and to uncover novel genes and pathways potentially underlying this poorly understood disease.
304 Greyhounds (Grey; 118 Unaffected (U)+186 Affected (A)), 155 Irish wolfhounds (IWH; 68 U+87 A), 145 Rottweilers (Rott; 59 U+86 A) and 14 non-racing AKC registered greyhounds (AKC Grey) were genotyped on the Illumina canineHD SNP arrays (169,011 SNPs with call rate>90%, mean call rate=99.87%). Unaffected canines were those with no detectable osteosarcoma while affected canines were those with osteosarcoma diagnosis confirmed by a licensed veterinarian.
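As a minimal sketch of the per-SNP quality thresholds mentioned in this example (call rate>90%, and the MAF>0.05 cutoff used in the association analysis), the following Python functions compute both quantities for one SNP. The genotype coding (0, 1 or 2 copies of one allele, with None for a failed call) and all function names are assumptions made for illustration; they are not taken from the genotyping pipeline actually used.

```python
from typing import Optional, Sequence

# Hypothetical coding: count of one allele (0, 1 or 2); None marks a failed call.
Genotype = Optional[int]


def call_rate(genotypes: Sequence[Genotype]) -> float:
    """Fraction of samples with a non-missing genotype at one SNP."""
    if not genotypes:
        return 0.0
    called = sum(g is not None for g in genotypes)
    return called / len(genotypes)


def minor_allele_frequency(genotypes: Sequence[Genotype]) -> float:
    """Minor allele frequency among called genotypes (0.0 if none were called)."""
    called = [g for g in genotypes if g is not None]
    if not called:
        return 0.0
    freq = sum(called) / (2 * len(called))
    return min(freq, 1.0 - freq)


def passes_filters(
    genotypes: Sequence[Genotype],
    min_call_rate: float = 0.90,
    min_maf: float = 0.05,
) -> bool:
    """True if the SNP exceeds both the call-rate and the MAF threshold."""
    return (
        call_rate(genotypes) > min_call_rate
        and minor_allele_frequency(genotypes) > min_maf
    )
```

Under this sketch, a SNP whose MAF falls below 0.05 in a breed would be counted as fixed in that breed.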
Each of the three breeds comprises a distinct population, with the AKC Grey clustering near their racing brethren (
In each breed a substantial portion of the genome was fixed (with minor allele frequency (MAF)<0.05): 5.6% in the Grey, 5.8% in the Rott and 12.1% in IWH (
Association between germ-line variants with MAF>0.05 and osteosarcoma was tested in each of the three breeds independently, rigorously controlling for the complex population structure in breeds by: (1) excluding one dog from each matched phenotype pair with GR>0.25, preferentially retaining younger cases and older controls; and (2) controlling for cryptic relatedness using a mixed model approach with the top principal component as a covariate [ref. 11 and 12]. The final dataset included 267 Greys (153 A+114 U; 105,934 SNPs with MAF>0.05), 135 Rotts (80 A+55 U; 99,144 SNPs) and 141 IWH (76 A+65 U). After finding no significant associations in the full set of IWH, the analysis next focused on an age-stratified dataset (28 A<6 years old and 62 U>6 years old, 84,385 SNPs). All identified SNPs either had a significant association (exceeding 95% confidence intervals defined empirically using 1000 random permutations;
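The pruning step (1) above can be sketched as follows. This is an illustrative Python outline only: the Dog record, the dictionary of pairwise GR values and the greedy visiting order are hypothetical choices, and the mixed model of step (2) is not reproduced here.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Set, Tuple


@dataclass(frozen=True)
class Dog:
    dog_id: str
    affected: bool
    age_years: float


def prune_related(
    dogs: Iterable[Dog],
    relatedness: Dict[Tuple[str, str], float],
    threshold: float = 0.25,
) -> List[Dog]:
    """Drop one dog from each same-phenotype pair whose GR exceeds the threshold.

    Among affected pairs the younger dog is kept; among unaffected pairs the
    older dog is kept. Pairs are visited from most to least related (an
    assumption made here so that the result is deterministic).
    """
    by_id = {d.dog_id: d for d in dogs}
    removed: Set[str] = set()
    ordered = sorted(relatedness.items(), key=lambda kv: (-kv[1], kv[0]))
    for (a_id, b_id), gr in ordered:
        if gr <= threshold or a_id in removed or b_id in removed:
            continue
        a, b = by_id[a_id], by_id[b_id]
        if a.affected != b.affected:
            continue  # only matched-phenotype pairs are pruned
        if a.affected:
            drop = a if a.age_years > b.age_years else b  # keep the younger case
        else:
            drop = a if a.age_years < b.age_years else b  # keep the older control
        removed.add(drop.dog_id)
    return [d for d in by_id.values() if d.dog_id not in removed]
```

For example, given two affected dogs aged 5 and 9 with GR=0.5, the 9-year-old would be excluded and the 5-year-old retained.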
In each of the breeds, 20-40% of the phenotype variance was explained by the handful of loci with genome-wide significant associations (1 locus in Greys, 2 in IWHs, 6 loci in Rotts) [ref. 10]. Including all regions with p<0.0005 increased the phenotype variance explained to 57% in the Grey (14 loci), 53% in the IWH (4 loci) and 85% in the Rotts (15 loci). Surprisingly, none of the regions of association overlaps between the breeds, in contrast to the pattern observed for Mendelian canine traits [ref. 14], and meta-analysis of the three breeds also yielded no significant associations.
By examining fixed genomic regions, one potential shared risk locus was identified: the risk allele tagging the top associated Grey locus is found at exceptionally high frequency in both the Rotts (97%) and IWH (95%), as compared to 51%+/-24% for 28 other dog breeds and 61% for the unaffected AKC Greys. This locus contains two well characterized tumor suppressors, CDKN2A (encodes p16INK4a and p19ARF) and CDKN2B (p15INK4b), and the antisense non-coding gene CDKN2B-AS/ANRIL (
GRAIL (Gene Relationships Across Implicated Loci) was used to identify non-random connectivity between genes in associated loci described herein [ref. 18], finding enrichment for relevant descriptors including “bone” (13 loci), “differentiation” (13 loci), “development” (9 loci) and “notch” (7 loci). Notch signaling is critical to osteosarcoma invasion and metastasis [ref. 19]. In 12 of 26 genic loci, GRAIL identified highly connected candidate genes (p<0.05) with intriguing relevance to osteosarcoma (Table 4,
Osteoblast differentiation enhancer FAM5C (Rott) [ref. 25] is connected by GRAIL to NELL1 (Rott), a regulator of osteoblast differentiation and ossification; TNFRSF11A (IWH), an essential mediator of osteoclast development; and the pro-apoptotic gene BLID (IWH).
GRAIL was also used to analyze regions in which the racing and osteosarcoma unaffected AKC Greys differed, defining the most differentiated SNPs using emmax (p<1×10−9) and then clumping them into 68 LD defined regions in PLINK (median size 387 kb, 5.1% of genome). GRAIL analysis of the results detected strong interconnectivity between a number of genes involved in “RNA” related cellular mechanisms, including small nucleolar RNAs in 6 distinct genomic regions (SNORA79, SNORA39, SNORA59A, SNORA6, SNORD87, SNORA62 and SNORD17, SNHG6) and genes related to hormones, catenin complexes and telomerase. Pathway analysis using INRICH (Lee et al. INRICH: Interval-based Enrichment Analysis for Genome Wide Association Studies. Bioinformatics. 2012 Jul. 1; 28(13):1797-9.) on the same set of regions yielded a single significant gene set enrichment after permutation: genes with the MIR-512-5P binding cis regulatory motif GCTGAGT (p=7e-05, pcorr=0.03, regulating genes DDX6, CTNNB1, CHD9, XKR6, STC1, NUDT18, ERP29, GNAZ, GRK6).
Fixed regions longer than 250 kb comprised a large proportion of the genome in each breed (Grey: 2.8%; Rott: 12.9%; IWH: 7.6%) encompassing genes linked to bone development and osteosarcoma, including RB1 (IWH), FOS (Rott), RUNX2 (Rott), CCNB1 (IWH), COL11A2 (Grey) and POSTN (IWH and Grey) [ref. 27]. In total, 72.2 Mb (3.3%) of the genome was fixed in all three breeds (N=492 regions, mean size=147 kb). These shared regions were enriched for microRNAs associated with pathogenesis and progression of osteosarcoma (p=0.017, pcorr=0.042, MIR150, MIR335, MIR340, MIR663, MIR650) [ref. 28]. When the potentially selected RRVs were examined, INRICH enrichment (pcorr=0.035) was detected for putative “driver” genes of human osteosarcoma (WASF3; KIAA1279; AIFM2; CLCC1) [ref. 29].
To formally test whether the GWAS loci and RRVs are enriched for the same pathways, the INRICH results from the GWAS were combined with the INRICH results for the RRVs from each breed using the Fisher method. The same analysis was performed with RRVs from 28 other breeds as a control (
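For illustration, the Fisher method for combining independent p-values can be sketched in a few lines of Python. The function name is hypothetical and the p-values in the usage note are placeholders, not results from this study. The statistic -2·Σ ln(p_i) is compared to a chi-square distribution with 2k degrees of freedom, whose upper tail has a closed form for even degrees of freedom.

```python
import math
from typing import Sequence


def fisher_combined_p(p_values: Sequence[float]) -> float:
    """Combine independent p-values with Fisher's method.

    The statistic X = -2 * sum(ln p_i) follows a chi-square distribution with
    2k degrees of freedom under the null; for even degrees of freedom the
    upper tail is exp(-X/2) * sum_{i<k} (X/2)^i / i!.
    """
    if not p_values:
        raise ValueError("at least one p-value is required")
    if any(not (0.0 < p <= 1.0) for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")
    k = len(p_values)
    half = -sum(math.log(p) for p in p_values)  # equals X / 2
    term = 1.0
    total = 1.0
    for i in range(1, k):
        term *= half / i
        total += term
    return math.exp(-half) * total
```

As a usage example, combining a hypothetical GWAS enrichment p-value of 0.05 with a hypothetical RRV enrichment p-value of 0.05 for the same gene set gives a combined p-value of about 0.0175.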
Somatic tumor DNA was compared to blood-derived germ-line DNA in a subset of 7 affected Greys and 7 affected Rotts using array-based comparative genomic hybridization (aCGH) with a new, dense 180,000-feature Agilent canine CGH microarray (~13 kb resolution). It was found that 99.7% of autosomal loci (162,337/162,858) had either a gain or loss in at least one dog (log2 tumor:reference signal intensity ratio>+/−0.2). On average, 49.6%+/−11.0% of the loci were altered in each Grey tumor and 56.1%+/−10.8% in each Rott tumor. Particular probes were enriched for changes; the fraction of probes altered in all 7 Rotts (N=8087, 4.95%), all 7 Greys (N=8781, 5.35%) or all 14 dogs (N=1603, 0.98%) was much higher than expected by random chance (pbinomial=2.71%, 1.3% and 0.04% respectively). Putative human osteosarcoma driver genes were among those with universal CGH loss in Greys (ARHGAP22, ARID5B, RCBTB1), Rotts (LHFP), and both breeds (AIFM2, TSC22D1) [ref. 29]. Comparing the genes affected by these high frequency alterations to genes altered in human osteosarcoma cell lines highlights the similarities between dog and human osteosarcoma.
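The probe-level calls described above can be sketched as follows, assuming each tumor is represented as a list of log2 tumor:reference ratios in the same probe order. The function names are hypothetical, gains and losses are treated alike, and the expected fraction is computed as the product of the per-dog altered fractions under an independence assumption; the exact calculation behind the reported pbinomial values is not stated in this example and may differ.

```python
import math
from typing import Sequence


def is_altered(log2_ratio: float, threshold: float = 0.2) -> bool:
    """A probe counts as a gain or loss if |log2 ratio| exceeds the threshold."""
    return abs(log2_ratio) > threshold


def fraction_altered(log2_ratios: Sequence[float]) -> float:
    """Fraction of probes altered in a single tumor."""
    return sum(is_altered(r) for r in log2_ratios) / len(log2_ratios)


def fraction_altered_in_all(per_dog: Sequence[Sequence[float]]) -> float:
    """Observed fraction of probes altered in every tumor (rows are dogs)."""
    n_probes = len(per_dog[0])
    shared = sum(
        all(is_altered(dog[i]) for dog in per_dog) for i in range(n_probes)
    )
    return shared / n_probes


def expected_fraction_in_all(per_dog: Sequence[Sequence[float]]) -> float:
    """Fraction expected by chance if alterations were independent across dogs."""
    return math.prod(fraction_altered(dog) for dog in per_dog)
```

An observed fraction well above the expected fraction would indicate that the same probes are recurrently altered across tumors.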
It was then tested whether the 7 gene sets identified by combining GWAS and RRV pathways (
Additionally, an allele frequency comparison between the osteosarcoma-prone racing greyhounds and AKC greyhounds, which rarely get osteosarcoma, identified candidate germline osteosarcoma risk variants (
It was also found that there was highly significant overlap in the set of genes altered in canine osteosarcoma tumors and two human osteosarcoma cell lines (
The correlations described in this example were confirmed in a second study involving a larger sample set.
Osteosarcoma is an aggressive tumor of the bone that often metastasizes to the lung. Advances in chemotherapy have increased survival to about 60-70% but patients who present with pulmonary metastases, relapse or do not respond to chemotherapy continue to have a very poor prognosis. Increased understanding of disease etiology could improve therapy by subgrouping patients for treatment based on the underlying biology and also by suggesting mechanisms of tumor development that could be targeted. This is the first GWAS of osteosarcoma reported for any species.
Osteosarcoma in dogs is, both clinically and molecularly, remarkably similar to its human counterpart, but particularly high rates of osteosarcoma occur in some breeds. Here, just a few hundred dogs and ~100,000 markers were used to explain the majority of phenotype variance within each breed. It was discovered that canine osteosarcoma has a complex genetic architecture, with up to 15 loci associated within a breed, far more than observed in other GWAS mapped canine diseases published thus far. Through comprehensive analysis of inherited genetic variation in these breeds combined with somatic alterations in osteosarcoma tumors, a number of genes were identified that affect bone growth and differentiation as well as pathways for transformation and metastasis. The study herein confirms that osteosarcoma is heterogeneous in dogs, but highlights that among all risk factors identified some, e.g., CFA 11 (chr11:44392734-44414985), may be important in most of the affected individuals.
No apparent sharing of GWAS loci was identified between breeds, despite relatively recent shared genetic ancestry. Part of the explanation for this might be that while a large number of genes for osteosarcoma are present in the dog population as a whole, only a few make it into each breed. Through random chance each breed may inherit a different set of genetic risk factors resulting in mostly breed specific risk factors. As a few key risk factors become common in each breed they may then be sufficient to drive the disease development, suggesting that key pathways receive a substantial number of hits within a breed. This could allow dissection of functional pathways by examining different breeds.
Selection may further contribute to the enrichment of disease risk factors within breeds as osteosarcoma tends to affect large dogs. In humans, the tumor most commonly arises in conjunction with the adolescent growth spurt. This suggests that pathways for tissue growth and in particular osteogenesis may be involved in tumor development and this was also supported by the study herein. In general, dog breeds have been generated by breeding towards desirable traits and away from undesirable characteristics within a more or less closed gene pool. This artificial selection has resulted in fixed regions within a breed where all individuals carry the same haplotype. It is possible that selection for size and rapid growth in some breeds has resulted in the fixation of alleles that increase not only bone growth but also the risk of osteosarcoma development. This is evidenced by the top locus described herein, the candidate region on CFA11 identified by association in greyhounds. On closer examination, it was noted that the greyhound risk haplotype occurred in almost all Rottweilers and Irish Wolfhounds in the study, regardless of whether they were affected or free of disease, but not in AKC Greyhounds, a breed not predisposed to osteosarcoma.
The shared risk haplotype on CFA11 (chr11:44392734-44414985) encompasses sequence downstream of ANRIL, a long non-coding RNA regulating the expression of the CDKN2A/B locus, which encodes tumor suppressors p16INK4a, p19ARF and p15INK4b. H3K27Ac histone marks in an osteosarcoma cell line indicate the presence of an active enhancer element in the haplotype sequence, suggesting that SNPs in this region may influence expression of ANRIL in blood (Cunnington et al 2010). Human osteosarcomas display deletion of the orthologous 9p21 locus in 5-21% of cases (reviewed in Martin et al 2012). Correspondingly, mice where the CDKN2A region has been deleted are known to be tumor-prone (Serrano et al 1996), and more recently it was shown that mice that have the CDKN2A/B locus intact but where 70 kb encompassing part of ANRIL has been deleted show increased risk of developing sarcomas (Visel et al). Furthermore, absence of p16INK4a expression has been correlated with decreased survival in pediatric osteosarcoma patients (Maitra et al 2001). Taking these observations together, we hypothesize that the risk haplotype carries enhancer elements in the ANRIL region, which result in increased expression of ANRIL and thereby cause the down regulation of the CDKN2A/B genes, resulting in susceptibility to the initial steps of tumor development. Interestingly, another cancer GWAS in dogs also indicates association with this CFA11 region. Shearin et al report association of a haplotype spanning the MTAP gene and part of CDKN2A with risk of histiocytic sarcoma in Bernese Mountain Dogs (Shearin et al 2012).
2.5 Mb around the greyhound GWAS peak on chromosome 11 (chr11:44392734-44414985) was targeted for dense sequencing (15 dogs) and fine-mapping (180 cases and 115 controls). Imputation and association testing of sequenced variants narrowed the peak of association in greyhounds dramatically to a 20 kb risk haplotype (chr11:44390000-44410000), telomeric of the genes CDKN2A and CDKN2B, that is nearly fixed in both the rottweilers (98% in cases and 96% in controls) and Irish wolfhounds (95% in cases and 92% in controls). The top haplotype (vertical solid lines) mapped to a locus downstream of the non-coding gene ANRIL on human chromosome 9 (hg19). Potential markers of function in the region included H3K27 acetylation in osteoblasts and DNase hypersensitivity clusters (assayed from 125 cell types), most notably in regions that align between the dog and human genomes in a Multiz alignment of 46 species and are constrained across mammals as measured by Genomic Evolutionary Rate Profiling (GERP) [refs. ENCODE Nature 2012, Davydov PLoS Comput Biol 2010, Meyer Nucleic Acids Res. 2012 Nov. 15, Rosenbloom Nucleic Acids Res. 2012].
The top haplotype genomic region was tiled with luciferase probes to assay function of seven sections (A-G,
Human chromosome 9 genomic region fragments A to G (
Other genomic variants, such as SNPs and chromosomal regions, within or near CFA11(chr11:44392734-44414985) were found to be associated with osteosarcoma. These variants are listed in Table 5. The chr11:44405676 variant was identified as the top variant based on functional data. The correlations described in this example were confirmed in a second study involving a larger sample set.
280 US leonberger dogs and 71 European (EU) leonberger dogs were included in this study. There were 138 cases and 213 controls total (182 US cases, 98 US controls, 40 EU cases, and 31 EU controls). Outliers, duplicates and uncertain phenotypes were removed. The call rate for SNPs and individuals was >95%, the MAF was >5%, and the Hardy-Weinberg p was >1E-6 in controls (
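As a rough illustration of the Hardy-Weinberg filter listed above, the following Python sketch computes a p-value for one SNP from genotype counts in controls using the one-degree-of-freedom chi-square goodness-of-fit test. The choice of the chi-square approximation (rather than an exact test) and the function name are assumptions; the test actually used in the study is not specified here.

```python
import math


def hwe_p_value(n_aa: int, n_ab: int, n_bb: int) -> float:
    """Chi-square (1 df) test of Hardy-Weinberg equilibrium from genotype counts."""
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotypes supplied")
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    if p == 0.0 or q == 0.0:
        return 1.0  # monomorphic SNP: no deviation can be measured
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper tail of the chi-square distribution with one degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2.0))
```

A SNP would be retained under the threshold stated above when the returned value exceeds 1E-6 in controls.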
Regions on chromosomes 11, 24, and 35 had a large number of significant SNPs (
Larger regions were determined based on sweeps of the chromosomal regions. These larger regions are shown in Table 7.
Without further elaboration, it is believed that one skilled in the art can, based on the above description, utilize the present invention to its fullest extent. The specific embodiments are, therefore, to be construed as merely illustrative, and not limitative of the remainder of the disclosure in any way whatsoever. All publications cited herein are incorporated by reference for the purposes or subject matter referenced herein.
The indefinite articles “a” and “an,” as used herein in the specification and in the claims, unless clearly indicated to the contrary, should be understood to mean “at least one.”
From the above description, one skilled in the art can easily ascertain the essential characteristics of the present invention, and without departing from the spirit and scope thereof, can make various changes and modifications of the invention to adapt it to various usages and conditions. Thus, other embodiments are also within the claims.
This application claims the benefit of the filing date of U.S. Provisional Application No. 61/785,051, filed Mar. 14, 2013, the entire contents of which are incorporated by reference herein.
This invention was made with U.S. Government support under U54 HG003067 awarded by the National Institutes of Health. The U.S. Government has certain rights in the invention. The research was also generously supported and funded by the Swedish government and Uppsala University.
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/US14/27247 | 3/14/2014 | WO | 00 |
Number | Date | Country | |
---|---|---|---|
61785051 | Mar 2013 | US |