The present disclosure relates to a method of assaying the risk of, or susceptibility to, fibroadenoma development in a human subject. The disclosed method can be, or at least form part of, a diagnosis to confirm the occurrence of fibroadenoma in the subject. More particularly, the disclosed method detects one or more mutations residing within the MED12 gene of the subject, the presence of which has shown significant association with the occurrence of fibroadenomas in human subjects.
Fibroadenomas (FAs) are benign breast tumors that represent the most frequently occurring breast tumors in women under the age of 30 years1,2. Often observed in adolescent girls and young adult women, fibroadenomas are clinically known to be hormone-dependent and to fluctuate in size according to periods of pregnancy and menopause3. A study of 265,402 women in China reported a fibroadenoma incidence of 241 per 100,000 among women under 35 years and 165 per 100,000 among women aged 35-39 years4. Histologically, fibroadenomas comprise an admixture of stromal and epithelial cells5. Although benign, fibroadenomas are reported to be associated with an approximately two-fold increase in the risk of developing invasive breast carcinoma in 20 years6. The diagnosis of fibroadenoma is typically achieved by biopsy, and patients with larger lesions are often subjected to surgery, which can incur cost, anxiety and, in rare cases, procedure-related complications. At present, little is known about the genetic abnormalities that underlie fibroadenoma, particularly when compared to breast carcinoma, where much recent progress has been made in the characterization of its mutational landscape8,9. For example, previous targeted mutational screens of TP53 in fibroadenomas have been equivocal. One study reported that one out of eight (12.5%) fibroadenomas exhibited a non-silent TP53 mutation10, whereas another study reported no somatic TP53 mutations in fibroadenomas from women who remained unaffected by breast cancer after an average follow-up of ten years11. A single PIK3CA mutation has also been reported from a screen of ten fibroadenoma tumors12. Based upon the findings of these studies, it appears that certain genetic makeups may predispose an individual to a greater risk of FA development. Therefore, early identification of such genetic attributes, or methods allowing one to discover the likelihood of genetic deficiencies, is greatly desired.
The present disclosure aims to provide a method of assaying the risk of breast fibroadenoma occurrence in a human subject, preferably a female subject, by genotyping a specific allele or gene of the subject.
A further object of the present disclosure is to provide a method capable of serving as a confirmatory diagnosis, or forming at least part of a confirmatory diagnosis, of the occurrence of fibroadenoma in a human subject.
A further object of the present disclosure is to offer a method of assaying susceptibility and/or confirming diagnosis of fibroadenoma development in a human subject by detecting one or more mutations located in the MED12 gene using any genotyping approach known in the art.
Another object of the present disclosure is to provide a method of detecting mutations resulting particularly in non-synonymous substitutions in the encoded mediator complex subunit 12 (MED12). The mutations to be detected are associated with a higher risk of fibroadenoma occurrence in a female subject.
At least one of the preceding objects is met, in whole or in part, by the present invention, in which one of the embodiments of the present invention involves a method of assaying susceptibility and/or confirming diagnosis of breast fibroadenomas development in a human subject. The method essentially comprises the steps of performing a nucleic acid-based assay to analyze an isolated polynucleotide encoding at least exon 2 of MED12 gene from a sample acquired from the human subject; and regarding the human subject with greater susceptibility and/or confirming diagnosis of breast fibroadenomas development by detecting a mutation in the isolated polynucleotide. Preferably, the mutation is a splice site mutation located at position −8 of exon 2 of the MED12 gene, a missense mutation located at codon 44 of cDNA of the MED12 gene or a missense mutation located at codon 36 of cDNA of the MED12 gene.
In several preferred embodiments, the sample comprises stromal tissues which may acquire the mutation through one or more somatic events progressively acquired in the subject.
Some embodiments of the disclosed method preferably detect the missense mutation located at nucleotide position 107 of the cDNA of the MED12 gene, which falls within codon 36.
For a number of embodiments, the missense mutation is located at nucleotide position 130 and/or 131 of the cDNA of the MED12 gene, which fall within codon 44. More preferably, the missense mutation results in p.G44A, p.G44C, p.G44D, p.G44R, p.G44S, or p.G44V in a polypeptide translated from the MED12 gene.
In several preferred embodiments, the disclosed method may include additional steps of detecting at least one mutation located at PIK3CA and/or TP53 gene of the subject upon detecting a mutation in the isolated polynucleotide encoding at least exon 2 of MED12 gene; and regarding developed fibroadenoma in the subject as benign state in the absence of detectable mutation located at PIK3CA and/or TP53 gene.
The present invention may be embodied in other specific forms without departing from its structures, methods, or other essential characteristics as broadly described herein and claimed hereinafter. The described embodiments are to be considered in all respects only as illustrative, and not restrictive. The scope of the invention is, therefore, indicated by the appended claims, rather than by the description provided hereinafter. All changes that come within the meaning and range of equivalency of the claims are to be embraced within their scope.
Unless specified otherwise, the terms “comprising” and “comprise” as used herein, and grammatical variants thereof, are intended to represent “open” or “inclusive” language such that they include recited elements but also permit inclusion of additional, un-recited elements.
As used herein, the phrase “in embodiments” means in some embodiments but not necessarily in all embodiments.
As used herein, the terms “approximately” or “about”, in the context of concentrations of components, conditions, other measurement values, etc., mean +/−5% of the stated value, or +/−4% of the stated value, or +/−3% of the stated value, or +/−2% of the stated value, or +/−1% of the stated value, or +/−0.5% of the stated value, or +/−0% of the stated value.
The term “polynucleotide” or “nucleic acid” as used herein designates mRNA, RNA, cRNA, cDNA or DNA. The term typically refers to oligonucleotides greater than 30 nucleotide residues in length.
The term “primer” used herein throughout the specification refers to an oligonucleotide which, when paired with a strand of DNA, is capable of initiating the synthesis of a primer extension product in the presence of a suitable polymerizing agent. The primer is preferably single-stranded for maximum efficiency in amplification but can alternatively be double-stranded. A primer must be sufficiently long to prime the synthesis of extension products in the presence of the polymerization agent. Primers can be “substantially complementary” to the sequence on the template to which it is designed to hybridize and serve as a site for the initiation of synthesis. By “substantially complementary”, it is meant that the primer is sufficiently complementary to hybridize with a target polynucleotide. Preferably, the primer contains no mismatches with the template to which it is designed to hybridize but this is not essential. For example, non-complementary nucleotide residues can be attached to the 5′ end of the primer, with the remainder of the primer sequence being complementary to the template. Alternatively, non-complementary nucleotide residues or a stretch of non-complementary nucleotide residues can be interspersed into a primer, provided that the primer sequence has sufficient complementarity with the sequence of the template to hybridize therewith and thereby form a template for synthesis of the extension product of the primer.
The term “gene” as used herein may refer to a DNA sequence with functional significance. It can be a native nucleic acid sequence, or a recombinant nucleic acid sequence derived from a natural source or a synthetic construct. The term “gene” may also be used to refer to, for example and without limitation, a cDNA and/or an mRNA encoded by or derived from, directly or indirectly, a genomic DNA sequence.
One aspect of the present disclosure refers to a method of assaying susceptibility and/or confirming diagnosis of breast fibroadenoma development in a human subject, preferably a female human subject. Essentially, the method of assaying comprises the steps of performing a nucleic acid-based assay to analyze an isolated polynucleotide encoding at least exon 2 of MED12 gene from a sample acquired from the human subject; and regarding the human subject with greater susceptibility and/or confirming diagnosis of breast fibroadenoma development by detecting a mutation in the isolated polynucleotide. The sample applicable for the disclosed method can be any biological sample having extractable or accessible genetic material suspected of carrying the mutations of interest, either acquired or constitutional, that are detectable in the nucleic acid-based assay. The sample can be, but is not limited to, biopsy tissue or a blood sample of the subject. More preferably, the sample or the biopsy tissue comprises stromal tissue, which the inventors of the present disclosure have found to be prone to being adversely affected to a significant extent, likely owing to dysregulated extracellular matrix organization, by the mutations of interest located in exon 2 of the MED12 gene. The sample may be subjected to pre-treatment to isolate the preferred tissue type prior to extracting the genetic material for analysis.
Further, the polynucleotide to be reacted or analyzed in the nucleic acid-based assay can be directly or indirectly derived from the genetic material extracted or obtained from the sample of the subject. Polynucleotides can be acquired directly from the sample source by, for example and without limitation, digesting or cutting the targeted gene segment using restriction enzymes that recognize specific restriction sites located adjacent to the portion of interest. Alternatively, the polynucleotides can be amplicons generated from the extracted genetic material through PCR or any similar known approach. These amplicons are then subjected to the nucleic acid-based assay to identify possible mutations that may result in the occurrence of fibroadenomas.
According to several preferred embodiments, the nucleic acid-based assay performed to identify and/or detect the mutations comprises sequencing the polynucleotide. More specifically, the sequencing approach used in the present disclosure to effect the detection can be Sanger sequencing and/or ultra-deep targeted amplicon sequencing, which provide precise and reliable results in identifying the mutations of interest in exon 2 of the MED12 gene. The primer pair of Seq ID No. 1 and Seq ID No. 2 listed in Table 1 below is one embodiment of the primers usable in the present disclosure to sequence exon 2 of the MED12 gene. The sequencing process provides a reliable read of the polynucleotide sequence, from which it can be inferred whether the tested subject, or at least the sample, carries the FA-associated mutations. Furthermore, Seq ID No. 3-14 are the sequences of primer pairs that can be used to perform the ultra-deep targeted amplicon sequencing. The details of the Sanger sequencing and/or ultra-deep targeted amplicon sequencing utilizing the listed primer pairs are further elaborated in the examples incorporated hereafter. Skilled artisans will appreciate that the disclosed method can be conducted using other known sequencing procedures or approaches, equivalent or otherwise, to detect the presence of the mutations of interest in the analyzed polynucleotides, and such modifications shall not depart from the scope of the present disclosure.
Other known processes implementable to identify or assist in identifying these mutations can be any one of, but not limited to, temperature gradient gel electrophoresis, capillary electrophoresis, amplification-refractory mutation system-polymerase chain reaction (ARMS-PCR), dynamic allele-specific hybridization (DASH), target capture for next generation sequencing (NGS), high-density oligonucleotide SNP arrays or Restriction fragment length polymorphism (RFLP).
Pursuant to the preferred embodiments, the disclosed method analyzes and/or identifies multiple potential mutations residing in exon 2 of the MED12 gene concurrently. One of the mutations of interest is a splice site mutation located at position −8 of exon 2 of the MED12 gene. More specifically, the splice site mutation is an intronic T>A substitution located 8 bp upstream of exon 2 of the MED12 gene in the genomic DNA. This splice site mutation results in an aberrant splice acceptor site, leading to retention of the last six bases of intron 1 of the MED12 gene in the mRNA transcribed therefrom.
Another mutation to be identified by the disclosed method, in a number of embodiments, is a missense mutation located at codon 36 of cDNA of the MED12 gene. Preferably, the missense mutation is located at nucleotide position 107 of the cDNA of the MED12 gene, within codon 36, causing a non-synonymous substitution of the encoded amino acid. More preferably, the disclosed method seeks to detect the presence of any mutation resulting in p.L36R or p.L36P. The corresponding cDNA mutations resulting in the non-synonymous substitutions p.L36R and p.L36P are c.107T>G and c.107T>C, respectively. Considering the degeneracy of the genetic code, mutations other than c.107T>G and c.107T>C may result in the same non-synonymous substitutions, p.L36R and p.L36P.
According to other preferred embodiments, the disclosed method also aims to identify a missense mutation located at codon 44 of cDNA of the MED12 gene. Specifically, the missense mutation is located at nucleotide position 130 and/or 131 of the cDNA of the MED12 gene, within codon 44, giving rise to a non-synonymous substitution of the encoded amino acid such as p.G44A, p.G44C, p.G44D, p.G44R, p.G44S, or p.G44V in a polypeptide translated from the MED12 gene.
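By way of non-limiting illustration, the relationship between a single-base cDNA substitution and the resulting amino acid change can be sketched as follows. The sketch uses the standard genetic code. The reference codons CTG (codon 36) and GGT (codon 44) are assumptions introduced for illustration and are not recited in this description; the function name `protein_change` is likewise hypothetical.

```python
# Standard genetic code, built from the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

# ASSUMPTION: reference codons for MED12 codons 36 and 44 (not stated in the text).
REFERENCE_CODONS = {36: "CTG", 44: "GGT"}


def protein_change(cdna_pos, alt_base):
    """Return the protein-level change for a single-base cDNA substitution."""
    codon_number = (cdna_pos - 1) // 3 + 1   # 1-based codon index
    offset = (cdna_pos - 1) % 3              # 0-based position within the codon
    ref_codon = REFERENCE_CODONS[codon_number]
    alt_codon = ref_codon[:offset] + alt_base + ref_codon[offset + 1:]
    return f"p.{CODON_TABLE[ref_codon]}{codon_number}{CODON_TABLE[alt_codon]}"


if __name__ == "__main__":
    for pos, alt in [(107, "G"), (107, "C"), (130, "A"), (130, "C"),
                     (130, "T"), (131, "A"), (131, "C"), (131, "T")]:
        print(f"c.{pos}>{alt}: {protein_change(pos, alt)}")
```

Under these assumed reference codons, the six possible single-base substitutions at positions 130 and 131 yield the six codon 44 changes named above, one each.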
It is important to note that inventors of the present disclosure found that the aforesaid mutations may subsequently upregulate or activate other genes associated with extracellular matrix organization, estrogen signaling, and TGFβ and Wnt signaling. Up-regulation or uncontrolled activation of these genes or gene products shall hence promote development of FA in the subject.
In several preferred embodiments, the disclosed method may include additional steps of detecting at least one mutation located at PIK3CA and/or TP53 gene of the subject upon detecting a mutation in the isolated polynucleotide encoding at least exon 2 of MED12 gene; and regarding developed fibroadenoma in the subject as benign state in the absence of detectable mutation located at PIK3CA and/or TP53 gene.
Another aspect of the present disclosure may include use of Seq. ID No. 1 and 2 in the preparation of a platform for nucleic-acid based assay for assaying susceptibility and/or confirming diagnosis of breast fibroadenomas development in a female human subject.
Likewise, in further aspect, the present disclosure may include use of Seq. ID No. 3 and 4, Seq. ID No. 5 and 6, Seq. ID No. 7 and 8, Seq. ID No. 9 and 10, Seq. ID No. 11 and 12, and/or Seq. ID No. 13 and 14 in preparation of a platform for nucleic-acid based assay for assaying susceptibility and/or confirming diagnosis of breast fibroadenomas development in a female human subject. Preferably, the use of Seq. ID No. 3 and 4, Seq. ID No. 5 and 6, Seq. ID No. 7 and 8, Seq. ID No. 9 and 10, Seq. ID No. 11 and 12, and/or Seq. ID No. 13 and 14 in the mentioned platform facilitates or materializes identification of a missense mutation located at codon 44 of cDNA of MED12 gene or a missense mutation located at codon 36 of cDNA of MED12 gene. Presence of at least one of these mutations in the breast tissue, more preferably stromal cells of the breast tissue, has been shown to associate with greater risk in developing FA in relation to those with the wild type allele by the present disclosure.
The following example is intended to further illustrate the invention, without any intent for the invention to be limited to the specific embodiments described therein.
A total of 98 fibroadenoma tumors were included in this study, of which 12 were from fresh frozen tumors and a further 86 from archival FFPE (formalin-fixed paraffin-embedded) samples. Tumors and whole-blood were obtained from patients undergoing surgical excision of fibroadenoma or from the SingHealth Tissue Repository, with signed informed consent. Archival samples were obtained from the Department of Pathology of Singapore General Hospital. Clinicopathological information for subjects (age and tumor size) was reviewed retrospectively.
Genomic DNA (gDNA) from fresh frozen tissue was extracted and purified using the Qiagen Blood and Cell Culture DNA kit. In the case of FFPE samples, the Qiagen FFPE DNA kit was used on freshly sectioned FFPE tissue. Genomic DNA yield and quality were determined using Picogreen™ fluorometric analysis as well as visual inspection of agarose gel electrophoresis images.
Native genomic DNA was fragmented with the Covaris™ S2 (Covaris) system using recommended settings. Sequencing adaptor ligation was performed using the Truseq Paired-End Genomic DNA kit (Illumina). For enrichment of coding sequences, the present disclosure used the SureSelectXT™ Human All Exon v3 (50 Mb) kit (Agilent Technologies) according to manufacturer's recommended protocol. Exome-enriched libraries were then sequenced on Illumina's HiSeq 2000 sequencing platform to generate 76 bp paired-end reads. Bioinformatics analysis, comprising sequence alignment, variant calling and identification of candidate somatic variants, was performed as described in previous work28. For point mutations, at least 10 variant reads in the tumor and a total read depth of 10 in the normal sample were required. In the case of indels, support of at least 20 variant reads amounting to at least 10% of total reads was required. Indels overlapping simple repeat regions were also discarded. All remaining candidate variants were visually inspected in a genome browser to identify missed probable germline variants or those in regions of anomalous alignment. The variant calling pipeline missed the p.Glu33_Asp34insProGln aberrant splice site variant in Sample004 as it was not in the exome capture kit manufacturer's target region file. It was later identified from a systematic visual inspection of MED12 in a genome browser, as MED12 was the only gene recurrently mutated in multiple samples. All candidate somatic variants were confirmed by Sanger sequencing.
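The read-support criteria described above may be sketched, without limitation, as two simple predicates. The function and parameter names are hypothetical, and the normal-sample depth of 10 is read here as a minimum; this is an assumption, not a statement of the pipeline actually used.

```python
def passes_point_mutation_filter(tumor_variant_reads, normal_total_depth):
    """At least 10 variant reads in the tumor and (assumed: at least) 10 reads in the normal."""
    return tumor_variant_reads >= 10 and normal_total_depth >= 10


def passes_indel_filter(variant_reads, total_reads, overlaps_simple_repeat):
    """At least 20 variant reads making up at least 10% of total reads, outside simple repeats."""
    if overlaps_simple_repeat or total_reads <= 0:
        return False
    return variant_reads >= 20 and variant_reads / total_reads >= 0.10


if __name__ == "__main__":
    print(passes_point_mutation_filter(12, 40))   # enough support in both samples
    print(passes_indel_filter(25, 400, False))    # 25/400 is below the 10% requirement
```

Candidates passing these filters would still be subject to the visual inspection and Sanger confirmation described above.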
The present disclosure used the following PCR primer pair to identify mutations in MED12 exon 2; forward primer: TGTTCTACACGGAACCCTCCTC, reverse primer: CTGGGCAAATGCCAATGAGAT, Tm: 54.6 C, 56.3 C, product length: 373 bp. PCR amplification was conducted using neat DNA and Platinum™ Taq Polymerase (Life Technologies). PCR cycling regime included one cycle at 95° C. for 10 min, 35 cycles at 95° C. for 30 s, 58° C. for 30 s and 72° C. for 1 min, and one cycle at 72° C. for 10 min. BigDye Terminator v.3.1 kit (Applied Biosystems) was used for bi-directional sequencing on generated PCR amplicons and products were fractionated employing ABI PRISM 3730 Genetic Analyzer (Applied Biosystems). Sequencing traces were aligned to reference sequences using Lasergene 10.1 (DNASTAR) and were visually analyzed.
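As a non-limiting worked example, the programmed hold time of the cycling regime above, and the GC fraction of the two primers, can be computed as follows. Ramp times between temperatures are instrument-dependent and are not included; the helper names are hypothetical.

```python
FORWARD_PRIMER = "TGTTCTACACGGAACCCTCCTC"
REVERSE_PRIMER = "CTGGGCAAATGCCAATGAGAT"

INITIAL_DENATURATION_S = 10 * 60   # one cycle at 95 C for 10 min
CYCLES = 35
CYCLE_STEPS_S = (30, 30, 60)       # 95 C for 30 s, 58 C for 30 s, 72 C for 1 min
FINAL_EXTENSION_S = 10 * 60        # one cycle at 72 C for 10 min


def total_hold_time_seconds():
    """Sum of programmed hold times only (ramping between temperatures excluded)."""
    return INITIAL_DENATURATION_S + CYCLES * sum(CYCLE_STEPS_S) + FINAL_EXTENSION_S


def gc_fraction(sequence):
    """Fraction of G and C bases in a primer sequence."""
    return sum(base in "GC" for base in sequence.upper()) / len(sequence)


if __name__ == "__main__":
    print(total_hold_time_seconds() / 60, "minutes of programmed hold time")
    for name, seq in (("forward", FORWARD_PRIMER), ("reverse", REVERSE_PRIMER)):
        print(name, len(seq), "nt, GC fraction", round(gc_fraction(seq), 2))
```

The programmed steps amount to 600 s + 35 × 120 s + 600 s = 5,400 s, that is, 90 minutes before ramping overhead.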
For sensitive detection of low-frequency variants in MED12 exon 2, the present disclosure further used ultra-deep targeted amplicon sequencing. Six PCR amplicons were designed and tiled across exon 2 of MED12 using the primer pairs listed in Table 2 below.
The present disclosure then used Fluidigm's Access Array System to generate and pool the amplicons according to manufacturer's instructions. For each sample, 50 ng of genomic DNA was used as template. Sequencing library preparation of the pooled amplicons was performed using the TruSeq HT DNA Sample Preparation Kit (Illumina) according to manufacturer's instructions. Sequencing was performed on the Illumina MiSeq next-generation sequencing platform for 150 cycles using the MiSeq Reagent kit v3.
Bioinformatics analysis of sequencing reads was performed as follows. Briefly, undetermined (‘N’) base calls at the ends of reads were trimmed. Following this, the 5′ end of each read was trimmed by 25 bases to eliminate the possibility of primer inclusion. The Burrows-Wheeler Alignment26 (BWA) tool (0.6.2) was used to align the resulting reads to the reference human genome (hg19). For more sensitive detection of insertions and deletions (indels), the present disclosure also ran a separate alignment process using modified settings (o=2, e=30, d=30, O=0, E=0, L=0). Indels were identified through manual inspection, whereas automated detection of point mutations was performed using the samtools27 (0.1.18) mpileup tool. Variant calls were restricted to regions covered by amplicons generated from the primer pairs provided in Table 2. Variant allele frequencies were calculated for each position in the targeted region, and those that exceeded a threshold of 5% were considered candidate variants. In order to minimize the possibility of PCR-induced artifacts, variants were only considered valid if present in at least two amplicons. Candidate variants had at least 21,620 sequencing reads overlapping them, with an average coverage of 184,526 reads.
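The read trimming and candidate-variant criteria above may be illustrated by the following sketch. It does not reproduce the BWA or samtools steps. The function names are hypothetical, and two points are assumptions: the variant allele frequency is pooled across amplicons, and a variant counts as "present" in an amplicon when that amplicon has at least one variant read.

```python
def trim_read(read, primer_trim=25):
    """Strip undetermined 'N' calls from both ends, then drop 25 bases from the 5' end."""
    return read.strip("N")[primer_trim:]


def variant_allele_frequency(alt_reads, total_reads):
    """Fraction of reads at a position that support the variant allele."""
    return alt_reads / total_reads if total_reads else 0.0


def is_candidate_variant(per_amplicon_counts, threshold=0.05, min_amplicons=2):
    """per_amplicon_counts maps amplicon name -> (alt_reads, total_reads).

    ASSUMPTION: pooled frequency must exceed the threshold, and the variant
    must be seen (alt_reads > 0) in at least `min_amplicons` amplicons.
    """
    alt = sum(a for a, _ in per_amplicon_counts.values())
    total = sum(t for _, t in per_amplicon_counts.values())
    supporting = sum(1 for a, _ in per_amplicon_counts.values() if a > 0)
    return variant_allele_frequency(alt, total) > threshold and supporting >= min_amplicons


if __name__ == "__main__":
    counts = {"amplicon_2": (1200, 20000), "amplicon_3": (900, 15000)}
    print(is_candidate_variant(counts))
```

A stricter reading would require each supporting amplicon to exceed the 5% threshold individually; the description above does not settle which reading applies.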
To ascertain the sensitivity of our assay, positive control samples containing spiked-in validated mutant MED12 at allele frequencies (15%, 10%, 5%, 3%) were generated via serial dilution. The present disclosure accurately detected variants in positive control samples at allele frequencies down to 3%. The present disclosure also calculated alternate (nonreference) allele frequencies across all positions in our target region in order to estimate the likelihood of error from sequencing and alignment artifacts. The mean alternate allele frequency was 0.281% with a standard deviation of 1.09%. Thus, our detection threshold of 5% exceeds four standard deviations from the estimated background error rate.
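The margin between the detection threshold and the background error estimate can be checked with the short calculation below, which uses only the figures stated above; the variable names are illustrative.

```python
MEAN_BACKGROUND_PCT = 0.281   # mean alternate allele frequency, in percent
SD_BACKGROUND_PCT = 1.09      # standard deviation, in percent
THRESHOLD_PCT = 5.0           # detection threshold, in percent


def standard_deviations_above_mean(threshold, mean, sd):
    """Distance of the threshold from the background mean, in standard deviations."""
    return (threshold - mean) / sd


if __name__ == "__main__":
    four_sd_bound = MEAN_BACKGROUND_PCT + 4 * SD_BACKGROUND_PCT
    z = standard_deviations_above_mean(THRESHOLD_PCT, MEAN_BACKGROUND_PCT, SD_BACKGROUND_PCT)
    print(f"mean + 4 SD = {four_sd_bound:.3f}%")
    print(f"threshold lies {z:.2f} SD above the mean")
```

The mean plus four standard deviations is 0.281% + 4 × 1.09% = 4.641%, which lies below the 5% threshold.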
To identify genes with recurrent somatic mutations across multiple samples, the present disclosure first sequenced the exomes of eight fresh frozen FA tumors together with matched whole-blood to a mean coverage of 124×, with an average of 87% of bases covered by at least 20 reads in each sample.
Consistent with FA being a benign tumor, samples had an average of only seven somatic mutations. Almost all genes were found to be mutated only once. These included tumor suppressors such as NF1 and RB1. The only gene that was recurrently mutated was MED12 (mediator complex subunit 12), which is a member of the Mediator Complex, a multiprotein complex that is widely involved in transcriptional regulation of gene expression. Four out of the eight FA samples sequenced (50%) contained somatic mutations in exon 2 of MED12 as presented in Table 4.
In order to further ascertain the prevalence of MED12 exon 2 mutations in FA, the present disclosure performed ultra-deep targeted amplicon sequencing of MED12 exon 2 in 90 additional FA samples (4 fresh frozen tissue samples and 86 archival samples). This confirmed a strikingly high MED12 exon 2 mutation frequency in FA of 59%. The frequencies of the various detected mutations are summarized in Table 5 and
Out of the 98 FA samples sequenced, 41 (42%) had point mutations in codon 44 (20 p.G44D, 12 p.G44S, 3 p.G44R, 3 p.G44V, 2 p.G44C, 1 p.G44A). A single point mutation (1.1%) was also found in codon 43 (p.Q43P) and four (4.1%) in codon 36 (3 p.L36R, 1 p.L36P). Additionally, seven (7.8%) samples were found to have insertions or deletions that were expected to preserve the reading frame, and one (1.1%) further sample harbored a frameshift deletion. The present disclosure also identified four samples with an intronic T>A substitution 8 bp upstream of exon 2 that resulted in an aberrant splice acceptor site, causing the last six bases of intron 1 to be retained13. Several lines of evidence indicate the MED12 exon 2 mutations are somatic. The present disclosure performed Sanger sequencing on eight MED12 mutant fresh-frozen samples with available whole-blood and confirmed that all eight mutations were somatic as indicated in
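As a consistency check, the per-category counts reported above can be tallied against the overall mutation frequency. The sketch assumes that each mutated sample is counted in exactly one category, which is not stated explicitly; the dictionary keys are illustrative labels.

```python
TOTAL_SAMPLES = 98

CODON_44_COUNTS = {"p.G44D": 20, "p.G44S": 12, "p.G44R": 3,
                   "p.G44V": 3, "p.G44C": 2, "p.G44A": 1}

MUTATED_SAMPLES_BY_CATEGORY = {
    "codon 44 point mutation": sum(CODON_44_COUNTS.values()),
    "codon 43 point mutation": 1,
    "codon 36 point mutation": 4,
    "in-frame insertion or deletion": 7,
    "frameshift deletion": 1,
    "intronic splice acceptor substitution": 4,
}


def overall_mutation_percentage():
    """Percentage of samples with any MED12 exon 2 mutation (one category per sample assumed)."""
    return 100 * sum(MUTATED_SAMPLES_BY_CATEGORY.values()) / TOTAL_SAMPLES


if __name__ == "__main__":
    print(sum(MUTATED_SAMPLES_BY_CATEGORY.values()), "mutated samples")
    print(f"{overall_mutation_percentage():.1f}% of {TOTAL_SAMPLES} samples")
```

Under that assumption the categories sum to 58 of 98 samples, in line with the reported frequency of 59%.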
The MED12 gene lies on chromosome X, and in females, one copy is normally silenced by epigenetic inactivation17. To confirm that mutant MED12 transcripts are expressed, and are not suppressed by X-inactivation, the present disclosure performed Sanger sequencing on complementary DNA (cDNA) generated by reverse-transcribing messenger RNA (mRNA) from eight fresh frozen samples that were determined to harbor MED12 exon 2 mutations by targeted amplicon sequencing. Particularly, the present disclosure sequenced the cDNA of seven MED12-mutant samples with available fresh frozen tissue. The present disclosure converted 100 ng of RNA to cDNA with SuperScript III First-Strand Synthesis SuperMix from Invitrogen according to manufacturer's recommended protocol. The present disclosure performed PCR and sequenced the MED12 region between exon 1 and 3 with primers from Mäkinen et al.13; forward primer: CTTCGGGATCTTGAGCTACG, reverse primer: GATCTTGGCAGGATTGAAGC, product length: 199 bp. PCR amplification, sequencing and fractionation were performed as described above for Sanger sequencing of genomic DNA. The present disclosure was able to unambiguously identify the correct MED12 mutations in the cDNA of all but one sample, as illustrated in
Due to the biphasic nature of FA and relatively low variant allele frequencies observed in MED12 mutations (14.1%), it was suspected that MED12 mutations may be present in either the epithelial or stromal compartments. To confirm this, the present disclosure performed LCM (laser capture microdissection) on one sample (Sample006) and Sanger sequenced the individual compartments.
Briefly, fresh frozen tissue from Sample006 was embedded in Optimal Cutting Temperature (OCT) compound (Tissue-Tek, Sakura Finetek), and sections (8 μm thick) were cut in a Microtome-cryostat (Leica), mounted onto Arcturus® PEN membrane glass slides (Life Technologies), and then stored at −80° C. till required. Slides were dehydrated & stained with Arcturus® Histogene® following manufacturer's recommendations. The stained slide was loaded onto the laser capture microscope stage (ArcturusXT™ Laser Capture Microdissection (LCM) System). A Capsure™ Macro LCM cap (Life Technologies) was then placed automatically over the chosen area of the tissue. Once the cells of interest that were highlighted by the software were verified by the user, the machine automatically dissected out the highlighted cells of interest using a near infrared laser or UV pulse that transferred them onto the Capsure™ Macro LCM Cap.
The DNA was extracted directly from LCM caps using the Qiagen FFPE DNA Tissue kit following manufacturer's protocol with the following modifications. Each sample cap was incubated with the lysis buffer (ATL & Proteinase K) in a 500 μl microcentrifuge tube at 60° C. for 5 hrs, followed by enzyme deactivation at 90° C. for 10 minutes. The eluted DNA was used directly for PCR & BigDye® sequencing.
Results show that MED12 mutations are only found in the stromal compartment, and that epithelial portions of the FA tumor contained only wild-type as in
Total RNA was extracted from 10 fresh frozen fibroadenoma tumors using Trizol (Invitrogen) and purified using the RNeasy mini kit (Qiagen). 10 μg of purified total RNA was then labelled according to standard Affymetrix protocol and then hybridized to Affymetrix GeneChip Human Genome U133 Plus 2.0 microarrays. Scanning of the microarrays was performed using the Affymetrix GeneChip Scanner 7G. CEL files were loaded into the R statistical environment (version 2.15.2) using the simpleaffy package31 and preprocessed using the robust multi-array average (RMA) algorithm32 with quantile normalization. Mapping of Affymetrix probe sets to genes was performed using the BrainArray custom CDF33 (chip definition file) version 17. Differentially expressed genes between mutant MED12 and wild-type MED12 samples were identified based on empirical Bayes moderated t-statistics calculated using the limma package34. A list of genes differentially expressed over 1.5 fold in either direction and with a p-value less than 0.05 is presented in Table 7. P-values were not significant after adjusting for multiple hypotheses due to the limited sample size. The microarray data has been deposited in the Gene Expression Omnibus35 (GEO accession ID: GSE55594).
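The selection criterion for differentially expressed genes described above (more than 1.5-fold in either direction with a p-value below 0.05) can be sketched as follows. The sketch does not compute the moderated t-statistic; the p-value is taken as an input, and the function names are hypothetical. Because RMA reports expression on a log2 scale, fold change is expressed here as a difference of mean log2 values.

```python
import math


def log2_fold_change(mutant_log2_values, wildtype_log2_values):
    """Difference of group means on the log2 scale (mutant minus wild-type)."""
    mutant_mean = sum(mutant_log2_values) / len(mutant_log2_values)
    wildtype_mean = sum(wildtype_log2_values) / len(wildtype_log2_values)
    return mutant_mean - wildtype_mean


def is_differentially_expressed(log2_fc, p_value, min_fold=1.5, max_p=0.05):
    """True when the change exceeds min_fold in either direction and p < max_p."""
    return abs(log2_fc) > math.log2(min_fold) and p_value < max_p


if __name__ == "__main__":
    fc = log2_fold_change([8.0, 9.0], [7.0, 8.0])
    print(fc, is_differentially_expressed(fc, p_value=0.01))
```

A 1.5-fold change corresponds to a log2 difference of about 0.585, so the same cut-off applies symmetrically to upregulated and downregulated genes.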
To characterize transcriptional changes associated with aberrant MED12, the present disclosure generated and compared the gene expression profiles of six MED12 mutated fibroadenoma samples against four MED12 wild-type fibroadenomas. Due to the limited sample size and fibroepithelial nature of fibroadenomas, the present disclosure used GSEA19 (Gene Set Enrichment Analysis) in order to identify potentially dysregulated pathways. Genes were rank-ordered by fold-change between MED12-mutant and wild-type fibroadenomas and subjected to GSEA against MSigDB19 (Molecular Signatures Database) curated (c2) gene sets. Particularly, the present disclosure integrated our gene expression data with publicly-available gene expression data of uterine leiomyoma (UL) tumors (GEO accession ID: GSE30673). Lists of genes upregulated two-fold and four-fold were obtained by calculating fold-change of averages between mutant MED12 (n=8) and wild-type UL samples (n=2). To calculate whether the overlap between genes upregulated in MED12-mutant fibroadenoma and MED12-mutant UL is significant, the present disclosure used the Gene Set Enrichment Analysis (GSEA) tool19. Briefly, genes in the fibroadenoma dataset were rank-ordered according to log fold-change. The GSEA algorithm then examines where genes upregulated in UL fall in the rank-ordered list, and generates an enrichment score corresponding to how enriched a gene set is in either extreme end of the rank-ordered list as can be seen in
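The enrichment score described above can be illustrated with a simplified running-sum statistic. This is an unweighted, Kolmogorov-Smirnov-style sketch and not the GSEA tool itself, which by default weights each hit by its ranking metric and assesses significance by permutation; the function name and the toy gene labels are hypothetical.

```python
def enrichment_score(ranked_genes, gene_set):
    """Unweighted running-sum enrichment score over a rank-ordered gene list.

    Walks down the list, stepping up for genes in the set and down otherwise,
    and returns the running sum's largest deviation from zero (sign preserved).
    """
    hits = [gene in gene_set for gene in ranked_genes]
    n_hits = sum(hits)
    n_misses = len(hits) - n_hits
    if n_hits == 0 or n_misses == 0:
        return 0.0  # score is undefined when the set covers none or all of the list
    running, best = 0.0, 0.0
    for is_hit in hits:
        running += 1.0 / n_hits if is_hit else -1.0 / n_misses
        if abs(running) > abs(best):
            best = running
    return best


if __name__ == "__main__":
    ranked = ["geneA", "geneB", "geneC", "geneD"]      # most upregulated first
    print(enrichment_score(ranked, {"geneA", "geneB"}))  # set concentrated at the top
    print(enrichment_score(ranked, {"geneC", "geneD"}))  # set concentrated at the bottom
```

A score near +1 indicates that the gene set is concentrated at the upregulated end of the list, and a score near −1 that it is concentrated at the downregulated end.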
A list of candidate mutant MED12 target genes was obtained from the core-enriched genes (
In order to study relative pathway activity on the level of individual samples, the present disclosure used the Gene Set Variation Analysis (GSVA) method36. Using a non-parametric approach, GSVA transforms a gene by sample matrix into a gene set by sample matrix, facilitating the identification of differential activation of functionally related genes. Empirical Bayes moderated t-statistics34 were then calculated and gene sets with p-values<0.05 were considered to have significantly differential activity between mutant MED12 and wild-type MED12 samples. GSVA was performed on two groups of gene sets. MSigDB c2 gene sets associated with breast cancer and estrogen signalling were considered, as shown in
Among others, genes upregulated in MED12-mutant fibroadenomas are associated with ER+ breast cancers, estrogen stimulus in ER+ breast cancer cells, extracellular matrix (ECM) regulation and TGFβ signalling as revealed in Table 8 and 9. As the top GSEA results suggested an association between MED12 mutations and activated estrogen signalling, the present disclosure performed GSVA (Gene Set Variation Analysis) on our microarrays to detect differential pathway activity between samples.
Given the similarity of the MED12 mutation spectrum in FAs and ULs, the present disclosure hypothesized the integration of FA and UL molecular data might allow further pinpointing of genes and pathways. Indeed, GSEA on our FA dataset against a previously-published set of genes upregulated in MED12-mutated ULs revealed a strong similarity of upregulated genes in MED12-mutated FAs and ULs. Specifically, genes upregulated in MED12-mutant FA samples were significantly enriched for genes upregulated over two-fold in MED12-mutant UL (enrichment score=0.61, p=0) as shown in
Moreover, a previous study reported that one out of eight (12.5%) fibroadenomas exhibited a non-silent TP53 mutation10, whereas another study reported no somatic TP53 mutations in fibroadenomas from women who remained unaffected by breast cancer after an average follow-up of ten years11. A single PIK3CA mutation has also been reported from a screen of ten fibroadenoma tumors12. It is likely that those cases that harbour PIK3CA and TP53 mutations may actually indicate more aggressive phyllodes tumors37 (a subtype of fibroepithelial tumors) rather than true benign fibroadenomas. Therefore, the presence of a single genetic alteration, in the absence of others such as TP53 or PIK3CA, may be a more accurate biomarker for benign fibroadenoma. Therefore, early identification of such genetic attributes, or methods allowing one to discover the likelihood of genetic deficiencies, is greatly desired.
Candidate aberrant MED12 target genes were also enriched for genes downregulated in liver cancer with activated beta-catenin (CTNNB1). As MED12 plays a vital role in transducing Wnt/beta-catenin signaling20, this observation is consistent with MED12 mutations resulting in aberrant beta-catenin signalling, which is involved in regulating focal adhesion. Accordingly, GO (gene ontology) analysis showed that genes upregulated in mutant MED12 samples were over-represented with those expressed in the extracellular region as shown in Table 11.
It is crucial to note that the present disclosure can be implemented to detect constitutional mutations of the MED12 gene, more particularly mutations located in exon 2 of the MED12 gene, even though the subjects studied in the foregoing examples may have acquired these mutations as somatic mutations. It is also to be understood that the present invention may be embodied in other specific forms and is not limited to the sole embodiment described above. However, modifications and equivalents of the disclosed concepts, such as those which readily occur to one skilled in the art, are intended to be included within the scope of the claims appended hereto.
Number | Date | Country | Kind |
---|---|---|---|
10201402277X | May 2014 | SG | national |
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/SG2015/050107 | 5/12/2015 | WO | 00 |