1. Field of the Invention
The invention relates to a method for detecting genomic rearrangements in BRCA1 and BRCA2 genes and loci at high resolution using Molecular Combing and relates to a method of determining a predisposition to diseases or disorders associated with these rearrangements including predisposition to ovarian cancer or breast cancer.
2. Description of the Related Art
Breast cancer is the most common malignancy in women, affecting approximately 10% of the female population. Incidence rates are increasing annually and it is estimated that about 1.4 million women will be diagnosed with breast cancer annually worldwide and about 460,000 will die from the disease. Germline mutations in the hereditary breast and ovarian cancer susceptibility genes BRCA1 (MIM#113705) and BRCA2 (MIM#600185) are highly penetrant (King et al., 2003), (Nathanson et al., 2001). Screening is important for genetic counseling of individuals with a positive family history and for early diagnosis or prevention in mutation carriers. When a BRCA1 or BRCA2 mutation is identified, predictive testing is offered to all family members older than 18 years. If a woman tests negative, her risk reverts to that of the general population. If she tests positive, a personalized surveillance protocol is proposed: it includes mammographic screening from an early age, and possibly prophylactic surgery. Chemoprevention of breast cancer with anti-estrogens is also currently being tested in clinical trials and may be prescribed in the future.
Most deleterious mutations consist of either small frameshifts (insertions or deletions) or point mutations that give rise to premature stop codons, missense mutations in conserved domains, or splice-site mutations resulting in aberrant transcript processing (Szabo et al., 2000). However, mutations also include more complex rearrangements, including deletions and duplications of large genomic regions that escape detection by traditional PCR-based mutation screening combined with DNA sequencing (Mazoyer, 2005).
Techniques capable of detecting these complex rearrangements include Southern blot analysis combined with long-range PCR or the protein truncation test (PTT), quantitative multiplex PCR of short fluorescent fragments (QMPSF) (Hofmann et al., 2002), real-time PCR, fluorescent DNA microarray assays, multiplex ligation-dependent probe amplification (MLPA)(Casilli et al., 2002), (Hofmann et al., 2002) and high-resolution oligonucleotide array comparative genomic hybridization (aCGH) (Rouleau et al., 2007), (Staaf et al., 2008). New approaches that provide both prescreening and quantitative information, such as qPCR-HRM and EMMA, have recently been developed and genomic capture combined with massively parallel sequencing has been proposed for simultaneous detection of small mutations and large rearrangements affecting 21 genes involved in breast and ovarian cancer (Walsh et al., 2010).
Molecular Combing is a powerful FISH-based technique for direct visualization of single DNA molecules that are attached, uniformly and irreversibly, to specially treated glass surfaces (Herrick and Bensimon, 2009), (Schurra and Bensimon, 2009). This technology considerably improves the structural and functional analysis of DNA across the genome and is capable of visualizing the entire genome at high resolution (in the kb range) in a single analysis. Molecular Combing is particularly suited to the detection of genomic imbalances such as mosaicism, loss of heterozygosity (LOH), copy number variations (CNV), and complex rearrangements such as translocations and inversions (Caburet et al., 2005), thus extending the spectrum of mutations potentially detectable in breast cancer genes. Molecular Combing has been successfully employed for the detection of large rearrangements in BRCA1 (Gad et al., 2001), (Gad et al., 2002a), (Gad et al., 2003) and BRCA2 (Gad et al., 2002b), using a first-generation “color bar coding” screening approach. However, these techniques lack resolution and cannot precisely detect large rearrangements in and around BRCA1 and BRCA2.
In contrast to the prior art techniques, as disclosed herein, the inventors provide a novel Genomic Morse Code Molecular Combing procedure that provides for high resolution visual inspection of genomic DNA samples, precise mapping of mutated exons, precise measurement of mutation size with robust statistics, simultaneous detection of BRCA1 and BRCA2 genetic structures or rearrangements, detection of genetic inversions or translocations, and substantial elimination of problems associated with repetitive DNA sequences such as Alu sequences in BRCA1 and BRCA2 loci.
The BRCA1 and BRCA2 genes are involved, with high penetrance, in breast and ovarian cancer susceptibility. About 2% to 4% of breast cancer patients with a positive family history who are negative for BRCA1 and BRCA2 point mutations can be expected to carry large genomic alterations (deletion or duplication) in one of the two genes, especially BRCA1. However, large rearrangements are missed by direct sequencing. Molecular Combing is a powerful FISH-based technique for direct visualization of single DNA molecules, allowing the entire genome to be examined at high resolution in a single analysis. A novel predictive genetic test based on Molecular Combing is disclosed herein. For that purpose, specific BRCA1 and BRCA2 “Genomic Morse Codes” (GMC) were designed, covering coding and non-coding regions and including large genomic portions flanking both genes. The GMC is a series of colored signals distributed along a specific portion of the genomic DNA; these signals arise from hybridization with the probes of the invention. The concept behind the GMC has been previously defined in WIPO patent application WO/2008/028931 (which is incorporated by reference), and relates to a method of detecting the presence of at least one domain of interest on a macromolecule to be tested.
A measurement strategy is disclosed for the GMC signals, and has been validated by testing 6 breast cancer patients with a positive family history and 10 control patients. Large rearrangements, corresponding to deletions and duplications of one or several exons and with sizes ranging from 3 kb to 40 kb, were detected on both genes (BRCA1 and BRCA2). Importantly, the developed GMC allowed unambiguous localization of several tandem repeat duplications on both genes, and precise mapping of large rearrangements in the problematic Alu-rich 5′-region of BRCA1. This newly developed Molecular Combing genetic test is a valuable tool for the screening of large rearrangements in BRCA1 and BRCA2 and can optionally be combined in clinical settings with an assay that allows the detection of point mutations.
A substantial technical improvement compared to the prior color bar coding approach is disclosed here that is based on the design of second-generation high-resolution BRCA1 and BRCA2 Genomic Morse Codes (GMC). Importantly, repetitive sequences were eliminated from the DNA probes, thus reducing background noise and permitting robust measurement of the color signal lengths within the GMC. Both GMC were statistically validated on samples from 10 healthy controls and then tested on six breast cancer patients with a positive family history of breast cancer. Large rearrangements were detected, with a resolution similar to the one obtained with aCGH (1-3 kb). The detected mutations demonstrate the robustness of this technology, even for the detection of problematic mutations, such as tandem repeat duplications or mutations located in genomic regions rich in repetitive elements. The developed Molecular Combing platform permits simultaneous detection of large rearrangements in BRCA1 and BRCA2, and provides novel genetic tests and test kits for breast and ovarian cancer.
The patent or application file contains at least one drawing executed in color.
9: examples of Alu sequences excluded from the BRCA1 (A) and BRCA2 (B) GMCs.
Physical mapping: the creation of a map defining the position of particular elements, mutations or markers on genomic DNA, employing molecular biology techniques. Physical mapping does not require previous sequencing of the analyzed genomic DNA.
FISH: Fluorescence in situ hybridization.
Molecular Combing: a FISH-based technique for direct visualization of single DNA molecules that are attached, uniformly and irreversibly, to specially treated glass surfaces.
Predictive genetic testing: a screening procedure involving direct analysis of DNA molecules isolated from human biological samples (e.g., blood), used to detect gene mutations associated with disorders that appear after birth, often later in life. These tests can be helpful to people who have a family member with a genetic disorder, but who have no features of the disorder themselves at the time of testing. Predictive testing can identify mutations that increase a person's chances of developing disorders with a genetic basis, such as certain types of cancer.
Polynucleotides: This term encompasses naturally occurring DNA and RNA polynucleotide molecules (also designated as sequences) as well as DNA or RNA analogs with modified structure, for example, that increases their stability. Genomic DNA used for Molecular Combing will generally be in an unmodified form as isolated from a biological sample. Polynucleotides, generally DNA, used as primers may be unmodified or modified, but will be in a form suitable for use in amplifying DNA. Similarly, polynucleotides used as probes may be unmodified or modified polynucleotides capable of binding to a complementary target sequence. This term encompasses polynucleotides that are fragments of other polynucleotides such as fragments having 5, 10, 15, 20, 30, 40, 50, 75, 100, 200 or more contiguous nucleotides.
BRCA1 locus: This locus encompasses the coding portion of the human BRCA1 gene (gene ID: 672, Reference Sequence NM_007294) located on the long (q) arm of chromosome 17 at band 21, from base pair 41,196,311 to base pair 41,277,499, with a size of 81 kb (reference genome Build GRCh37/hg19), as well as its introns and flanking sequences. The following flanking sequences have been included in the BRCA1 GMC: the 102 kb upstream of the BRCA1 gene (from 41,277,500 to 41,379,500) and the 24 kb downstream of the BRCA1 gene (from 41,196,310 to 41,172,310). Thus the BRCA1 GMC covers a genomic region of 207 kb.
BRCA2 locus: This locus encompasses the coding portion of the human BRCA2 gene (gene ID: 675, Reference Sequence NM_000059.3) located on the long (q) arm of chromosome 13 at position 12.3 (13q12.3), from base pair 32,889,617 to base pair 32,973,809, with a size of 84 kb (reference genome Build GRCh37/hg19), as well as its introns and flanking sequences. The following flanking sequences have been included in the BRCA2 GMC: the 32 kb upstream of the BRCA2 gene (from 32,857,616 to 32,889,616) and the 56 kb downstream of the BRCA2 gene (from 32,973,810 to 33,029,810). Thus the BRCA2 GMC covers a genomic region of 172 kb.
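The region sizes quoted in the two locus definitions above can be checked with simple coordinate arithmetic. The following Python sketch is illustrative only: the dictionary layout and the helper names `span_kb` and `gmc_size_kb` are assumptions introduced here, while the coordinates are those stated above (GRCh37/hg19).

```python
# Coordinates (GRCh37/hg19) as quoted in the locus definitions above.
# The data layout and the helper names are illustrative assumptions.
LOCI = {
    "BRCA1": {
        "gene": (41_196_311, 41_277_499),
        "upstream": (41_277_500, 41_379_500),
        "downstream": (41_172_310, 41_196_310),
    },
    "BRCA2": {
        "gene": (32_889_617, 32_973_809),
        "upstream": (32_857_616, 32_889_616),
        "downstream": (32_973_810, 33_029_810),
    },
}


def span_kb(start, end):
    """Length of an interval in kb, rounded to the nearest kb."""
    return round(abs(end - start) / 1000)


def gmc_size_kb(locus):
    """Sum of the gene and both flanking regions, in kb."""
    return sum(span_kb(*interval) for interval in LOCI[locus].values())


for name in LOCI:
    print(name, gmc_size_kb(name), "kb")
```

With the stated coordinates, the sums reproduce the 207 kb (BRCA1) and 172 kb (BRCA2) totals given in the definitions.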
Germline rearrangements: genetic mutations involving gene rearrangements occurring in any biological cells that give rise to the gametes of an organism that reproduces sexually, to be distinguished from somatic rearrangements occurring in somatic cells.
Point mutations: genetic mutations that cause the replacement of a single base nucleotide with another nucleotide of the genetic material, DNA or RNA. Often the term point mutation also includes insertions or deletions of a single base pair.
Frameshift mutations: genetic mutations caused by indels (insertions or deletions) of a number of nucleotides that is not evenly divisible by three from a DNA sequence. Due to the triplet nature of gene expression by codons, the insertion or deletion can change the reading frame (the grouping of the codons), resulting in a completely different translation from the original.
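The divisibility rule in the frameshift definition can be expressed as a one-line check. This is a minimal sketch; the function name `is_frameshift` is an assumption introduced for illustration.

```python
def is_frameshift(indel_length_nt):
    """Return True if an indel of this length shifts the reading frame.

    An insertion or deletion shifts the frame when its length is not
    evenly divisible by three (the codon size).
    """
    if indel_length_nt <= 0:
        raise ValueError("indel length must be a positive number of nucleotides")
    return indel_length_nt % 3 != 0


print(is_frameshift(1), is_frameshift(3), is_frameshift(4))
```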
Tandem repeat duplications: mutations characterized by a stretch of DNA that is duplicated to produce two or more adjacent copies, resulting in tandem repeats.
Tandem repeat array: a stretch of DNA consisting of two or more adjacent copies of a sequence resulting in gene amplification. A single copy of this sequence in the repeat array is called a repeat unit. Gene amplifications occurring naturally are usually not completely conservative, i.e. in particular the extremities of the repeated units may be rearranged, mutated and/or truncated. In the present invention, two or more adjacent sequences with more than 90% homology are considered a repeat array consisting of equivalent repeat units. Unless otherwise specified, no assumptions are made on the orientation of the repeat units within a tandem repeat array.
Complex Rearrangements: any gene rearrangement that can be distinguished from simple deletions or duplications. Examples are translocations or inversions.
Probe: This term is used in its usual sense for a polynucleotide of the invention that hybridizes to a complementary polynucleotide sequence (target) and thus serves to identify the complementary sequence. Generally, a probe will be tagged with a marker, such as a chemical or radioactive marker that permits it to be detected once bound to its complement. The probes described herein are generally tagged with a visual marker, such as a fluorescent dye having a particular color such as blue, green or red dyes. Probes according to the invention are selected to recognize particular portions or segments of BRCA1 or BRCA2, their exons or flanking sequences. For BRCA1, probes generally range in length between 200 bp and 5,000 bp. For BRCA2, probes generally range in length between 200 bp and 6,000 bp. The name and the size of probes of the invention are described in
Detectable label or marker: any molecule that can be attached to a polynucleotide and whose position can be determined by means such as fluorescence microscopy, enzyme detection, radioactivity, etc., or as described in US application no. US2010/0041036A1 published on 18 Feb. 2010.
Primer: This term has its conventional meaning as a nucleic acid molecule (also designated sequence) that serves as a starting point for polynucleotide synthesis. In particular, primers may be 20 to 40 nucleotides in length and may comprise nucleotides which do not base pair with the target, provided that sufficient nucleotides at their 3′ end, especially at least 20, hybridize with said target. The primers of the invention which are described herein are used to produce probes for BRCA1 or BRCA2; for example, a pair of primers is used to produce a PCR amplicon from a bacterial artificial chromosome as template DNA. The sequences of the primers used herein are referenced as SEQ ID 1 to SEQ ID 130 in Table 8. In some cases (details in Table 1), the primers contained additional sequences at their 5′ end for ease of cloning. These additional sequences are SEQ ID 134 (containing a poly-A and a restriction site for AscI) for forward primers and SEQ ID 135 (containing a poly-A and a restriction site for PacI) for reverse primers.
Tables 1, 2 and 8 describe representative primer sequences and the corresponding probe coordinates.
Genomic Morse Code(s): A GMC is a series of “dots” (DNA probes with specific sizes and colors) and “dashes” (uncolored spaces with specific sizes located between the DNA probes), designed to physically map a particular genomic region. The GMC of a specific gene or locus is characterized by a unique colored “signature” that can be distinguished from the signals derived from the GMCs of other genes or loci. The design of DNA probes for high resolution GMC requires specific bioinformatics analysis and the physical cloning of the genomic regions of interest in plasmid vectors. The low resolution color bar coding (CBC) approach was established without any bioinformatics analysis or cloning procedure.
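The “dots” and “dashes” of a GMC can be pictured as an ordered list of colored probes and uncolored gaps. The Python sketch below is only one possible representation; the class names, the helper functions and the three-element example motif (names, colors and sizes) are hypothetical and are not taken from the probe tables.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Probe:
    """A "dot": a colored DNA probe of known size."""
    name: str
    color: str
    size_kb: float


@dataclass(frozen=True)
class Gap:
    """A "dash": an uncolored space between two probes."""
    size_kb: float


def total_length_kb(code):
    """Expected length of the whole code (probes plus gaps), in kb."""
    return sum(element.size_kb for element in code)


def signature(code):
    """Compact color signature: first letter of each probe color, '_' for a gap."""
    symbols = []
    for element in code:
        if isinstance(element, Probe):
            symbols.append(element.color[0].upper())
        else:
            symbols.append("_")
    return "".join(symbols)


# Hypothetical three-element motif; names, colors and sizes are invented.
motif = [Probe("probe_a", "blue", 3.0), Gap(2.0), Probe("probe_b", "green", 4.5)]
print(signature(motif), total_length_kb(motif))
```

Keeping probe and gap sizes explicit in such a structure is what allows measured signal lengths to be compared against expected values.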
Repetitive nucleotidic sequences: the BRCA1 and BRCA2 gene loci contain repetitive sequences of different types: SINE, LINE, LTR and Alu. The repetitive sequences which are present in high quantity in the genome sequence but are absent from the probes, i.e. were removed from the BRCA1 and BRCA2 GMCs of the invention, are mainly Alu sequences, having lengths of about 300 bp (see Figure S1, S1, S2 and S3 for more details). This means that the percentage of the remaining Alu sequences within the DNA probes compared to the percentage present in the reference genome is less than 10% and preferably less than 2%. Accordingly, a polynucleotide is said to be “free of repetitive nucleotidic sequences” when at least one type of repetitive sequence (e.g., Alu, SINE, LINE or LTR) selected from the types cited above is not contained in the considered probe, meaning that said probe contains less than 10%, preferably less than 2%, of that type compared to the percentage present in the reference genome. Examples of Alu repeats found in the BRCA1 and BRCA2 genes are given in
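One way to read the “free of repetitive nucleotidic sequences” criterion is as a ratio between the repeat percentage in a probe and the repeat percentage in the corresponding reference sequence. The sketch below assumes that reading; the function name and the example probe percentages are hypothetical.

```python
def is_free_of_repeats(probe_repeat_pct, reference_repeat_pct, threshold=0.10):
    """Check the repeat criterion under the ratio interpretation.

    probe_repeat_pct:     percentage of the probe made up of a repeat type
    reference_repeat_pct: percentage of the reference region made up of it
    threshold:            0.10 for "less than 10%", 0.02 for "less than 2%"
    """
    if reference_repeat_pct <= 0:
        raise ValueError("reference repeat percentage must be positive")
    return probe_repeat_pct / reference_repeat_pct < threshold


# Hypothetical probe with 2% Alu content against a reference with 35% Alu.
print(is_free_of_repeats(2.0, 35.0))                  # 10% criterion
print(is_free_of_repeats(2.0, 35.0, threshold=0.02))  # preferred 2% criterion
```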
The term “intragenic large rearrangement” as used herein refers to deletion and duplication events that can be observed in a gene sequence, said sequence comprising in a restricted view introns and exons; and in an extended view introns, exons, the 5′ region of said gene and the 3′ region of said gene. The intragenic large rearrangement can also cover any gain or loss of genomic material with a consequence in the expression of the gene of interest.
The term “locus” as used herein refers to a specific position of a gene or other sequence of interest on a chromosome. For BRCA1 and BRCA2, this term refers to the BRCA1 and BRCA2 genes together with their introns and flanking sequences.
The term “nucleic acid” as used herein means a polymer or molecule composed of nucleotides, e.g., deoxyribonucleotides or ribonucleotides, or compounds produced synthetically such as PNA which can hybridize with naturally occurring nucleic acids in a sequence specific manner analogous to that of two naturally occurring nucleic acids, e.g., can participate in Watson-Crick base pairing interactions. Nucleic acids may be single- or double-stranded or partially duplex.
The terms “ribonucleic acid” and “RNA” as used herein mean a polymer or molecule composed of ribonucleotides.
The terms “deoxyribonucleic acid” and “DNA” as used herein mean a polymer or molecule composed of deoxyribonucleotides.
The term “sample” as used herein relates to a material or mixture of materials, typically, although not necessarily, in fluid form, containing one or more components of interest. For Molecular Combing, the sample will contain genomic DNA from a biological source, for diagnostic applications usually from a patient. The invention concerns means, especially polynucleotides, and methods suitable for in vitro implementation on samples.
The terms “nucleoside” and “nucleotide” are intended to include those moieties that contain not only the known purine and pyrimidine bases, but also other heterocyclic bases that have been modified. Such modifications include methylated purines or pyrimidines, acylated purines or pyrimidines, alkylated riboses or other heterocycles. In addition, the terms “nucleoside” and “nucleotide” include those moieties that contain not only conventional ribose and deoxyribose sugars, but other sugars as well. Modified nucleosides or nucleotides also include modifications on the sugar moiety, e.g., wherein one or more of the hydroxyl groups are replaced with halogen atoms or aliphatic groups, or are functionalized as ethers, amines, or the like.
The term “stringent conditions” as used herein refers to conditions that are compatible to produce binding pairs of nucleic acids, e.g., surface bound and solution phase nucleic acids, of sufficient complementarity to provide for the desired level of specificity in the assay while being less compatible to the formation of binding pairs between binding members of insufficient complementarity to provide for the desired specificity. Stringent assay conditions are the summation or combination (totality) of both hybridization and wash conditions.
“Stringent hybridization” and “stringent hybridization wash conditions” in the context of nucleic acid hybridization (e.g., as required for Molecular Combing or for identifying probes useful for GMC) are sequence dependent, and are different under different experimental parameters. Stringent hybridization conditions that can be used to identify nucleic acids within the scope of the invention can include for example hybridization in a buffer comprising 50% formamide, 5×SSC, and 1% SDS at 42° C., or hybridization in a buffer comprising 5×SSC and 1% SDS at 65° C., both with a wash of 0.2×SSC and 0.1% SDS at 65° C. Exemplary stringent hybridization conditions can also include a hybridization in a buffer of 40% formamide, 1M NaCl, and 1% SDS at 37° C., and a wash in 1×SSC at 45° C. Alternatively, hybridization to filter-bound DNA in 0.5 M NaHPO4, 7% sodium dodecyl sulfate (SDS), 1 mM EDTA at 65° C., and washing in 0.1×SSC/0.1% SDS at 68° C. can be employed. Yet additional stringent hybridization conditions include hybridization at 60° C. or higher and 3×SSC (450 mM sodium chloride/45 mM sodium citrate) or incubation at 42° C. in a solution containing 30% formamide, 1 M NaCl, 0.5% sodium sarcosine, 50 mM MES, pH 6.5. Those of ordinary skill will readily recognize that alternative but comparable hybridization and wash conditions can be utilized to provide conditions of similar stringency.
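The SSC strengths quoted above scale linearly: since 3×SSC corresponds to 450 mM sodium chloride and 45 mM sodium citrate, 1×SSC corresponds to 150 mM and 15 mM. A minimal Python sketch of that conversion follows; the constant and function names are assumptions introduced for illustration.

```python
# 1x SSC composition, derived from the 3x SSC figures quoted above
# (450 mM sodium chloride / 45 mM sodium citrate).
SSC_1X_MM = {"sodium chloride": 150.0, "sodium citrate": 15.0}


def ssc_concentrations_mm(fold):
    """Return the salt concentrations (mM) of an n-fold SSC solution."""
    if fold <= 0:
        raise ValueError("fold must be positive")
    return {salt: conc * fold for salt, conc in SSC_1X_MM.items()}


print(ssc_concentrations_mm(3))
print(ssc_concentrations_mm(5))
```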
A probe or primer located in a given genomic locus means a probe or a primer which hybridizes to the sequence in this locus of the human genome. Generally, probes are double stranded and thus contain a strand that is identical to and another that is reverse complementary to the sequence of the given locus. A primer is single stranded and unless otherwise specified or indicated by the context, its sequence is identical to that of the given locus. When specified, the sequence may be reverse complementary to that of the given locus. In certain embodiments, it is the stringency of the wash conditions that determines whether a nucleic acid is specifically hybridized to a surface bound nucleic acid. Wash conditions used to identify nucleic acids may include for example a salt concentration of about 0.02 molar at pH 7 and a temperature of at least about 50° C. or about 55° C. to about 60° C.; or a salt concentration of about 0.15 M NaCl at 72° C. for about 15 minutes; or a salt concentration of about 0.2×SSC at a temperature of at least about 50° C. or about 55° C. to about 60° C. for about 15 to about 20 minutes; or, the hybridization complex is washed twice with a solution with a salt concentration of about 2×SSC containing 0.1% SDS at room temperature for 15 minutes and then washed twice with 0.1×SSC containing 0.1% SDS at 68° C. for 15 minutes; or, equivalent conditions. Stringent conditions for washing can also be for example 0.2×SSC/0.1% SDS at 42° C.
A specific example of stringent assay conditions is rotating hybridization at 65° C. in a salt based hybridization buffer with a total monovalent cation concentration of 1.5 M followed by washes of 0.5×SSC and 0.1×SSC at room temperature.
Stringent assay conditions are hybridization conditions that are at least as stringent as the above representative conditions, where a given set of conditions is considered to be at least as stringent if substantially no additional binding complexes that lack sufficient complementarity to provide for the desired specificity are produced in the given set of conditions as compared to the above specific conditions, where by “substantially no additional” is meant less than about 5-fold more, typically less than about 3-fold more. Other stringent hybridization conditions are known in the art and may be employed, as appropriate.
“Sensitivity” describes the ability of an assay to detect the nucleic acid of interest in a sample. For example, an assay has high sensitivity if it can detect a small concentration of the nucleic acid of interest in a sample. Conversely, a given assay has low sensitivity if it only detects a large concentration of the nucleic acid of interest in a sample. A given assay's sensitivity is dependent on a number of parameters, including specificity of the reagents employed (such as types of labels, types of binding molecules, etc.), assay conditions employed, detection protocols employed, and the like. In the context of Molecular Combing and GMC hybridization, sensitivity of a given assay may be dependent upon one or more of: the nature of the surface immobilized nucleic acids, the nature of the hybridization and wash conditions, the nature of the labeling system, the nature of the detection system, etc.
Design of High-Resolution BRCA1 and BRCA2 Genomic Morse Codes
Molecular Combing has already been used to detect large rearrangements in the BRCA1 and BRCA2 genes, but the hybridization DNA probes originally used were part of a low resolution “color bar coding” screening approach and were composed of cosmids, PACs and long-range PCR products only partially covering the BRCA1 and BRCA2 loci. Of importance, the DNA probes also contained repetitive sequences that are particularly abundant at the two loci (Gad et al., 2001), (Gad et al., 2002b). As a consequence, detection of the probes often resulted in the superposition of individual colored signals (e.g., yellow spots resulting from superposition of green and red signals) and in strong background noise, undermining the quality of the images and preventing the development of a robust strategy to measure signal lengths. Such a low resolution screening approach did not allow the unambiguous visualization of complex mutations, such as tandem repeat duplications (Schurra and Bensimon, 2009), (Herrick and Bensimon, 2009).
The inventors found that high-resolution Genomic Morse Codes (GMC) that were designed by covering more of the BRCA1 and BRCA2 genomic regions and by removing the disturbing repetitive sequences from the DNA probes resolved the problems associated with the prior color bar coding approach.
To visualize the repetitive sequences, dot-plot alignments of the BAC clones used for DNA probe cloning were first performed, based on the Genome Reference Consortium GRCh37 genome assembly (also called hg19, April 2009 release). Based on RepeatMasker analysis (www.repeatmasker.org), the percentages of Alu repetitive DNA in the BRCA1- and BRCA2-encoding BACs were 35% and 17%, respectively (data not shown). This resulted in a dark dot-plot matrix dense in repetitive sequences for BRCA1 (1.6 Alu sequences per 1 kb of DNA, compared to an average in the human genome of only 0.25 Alu/kb), and a brighter dot-plot matrix for BRCA2 (0.64 Alu/kb of DNA) (
35 genomic regions in the BRCA1 locus and 27 regions in the BRCA2 locus that had significantly fewer repetitive sequences were identified and were used to design and clone DNA hybridization probes compatible with the visualization process associated with Molecular Combing. The name, size and color of the DNA hybridization probes, and the exons covered by the probes, are shown in
To facilitate Genomic Morse Code recognition and measurement, signals located on the genes were grouped together in specific patterns called “motifs”. An electronic reconstruction of the designed BRCA1 and BRCA2 Genomic Morse Codes is shown in
Validation of BRCA1 and BRCA2 Genomic Morse Code Signals in Control Patients
The newly designed Genomic Morse Codes were first validated on genomic DNA isolated from 10 randomly chosen control patients. Typical visualized signals and measured motif lengths for one control donor are reported in
Detection of Known BRCA1 Large Rearrangements in Breast Cancer Patients
Molecular Combing was then applied to 6 samples from patients with a severe family history of breast cancer and known to bear large rearrangements either on BRCA1 or BRCA2 (preliminary screening performed by MLPA or QMPSF). Importantly, the Molecular Combing analysis was a blind test: for each patient, the identity of the mutation was revealed to the operator only after the test had been completed on all the samples. 6 different large rearrangements were identified (see Table 5). Importantly, all 6 known mutations had recently been characterized by aCGH and break-point sequencing (Rouleau et al., 2007) and were correctly identified and characterized by Molecular Combing. Complete characterization of the 3 most significant known BRCA1 large rearrangements is reported in
Duplication of Exon 13 (BRCA1)
By visual inspection via Molecular Combing, this mutation appears as a partial tandem duplication of the blue signal S7B1 (
Deletion from Exon 8 to Exon 13 (BRCA1)
By visual inspection, the mutation appeared as a deletion of the blue signal S7B1, including a large genomic portion between signals S7B1 and S8B1 (
Deletion of the 5′ Region to Exon 2 (BRCA1)
By visual inspection, the mutation appeared as a deletion of the green signal S10B1, as well as a large genomic portion of the 5′ region upstream of BRCA1, including S11B1 and S12B1 (
The results reported herein disclose and exemplify the development of a novel genetic test based on Molecular Combing for the detection of large rearrangements in the BRCA1 and BRCA2 genes. Large rearrangements represent 10-15% of deleterious germline mutations in the BRCA1 gene and 1-7% in the BRCA2 gene (Mazoyer, 2005). Specific high-resolution GMC were designed and were tested on a series of 16 biological samples; the robustness of the associated measurement strategy was statistically validated on 10 control samples, and 6 different large rearrangements were detected and characterized in samples from patients with a severe family history of breast cancer. The robustness of the newly designed GMC, devoid of repetitive sequences, is supported by the fact that our Molecular Combing method confirmed the results obtained with high-resolution zoom-in aCGH (11 k) on the same samples (Rouleau et al., 2007), with a resolution in the 1-2 kb range.
Tandem repeat duplications are the most difficult large rearrangements to detect. In contrast to other techniques, such as aCGH and MLPA, the capacity of Molecular Combing to visualize hybridized DNA probes at high resolution permits precise mapping and characterization of tandem repeat duplications, as shown here in case 1 (BRCA1 Dup Ex 13). aCGH can be used to determine the presence and size of duplications, but not the exact location and orientation of tandem repeat duplications. In PCR-based techniques such as MLPA, duplications are considered to be present when the ratio between the number of duplicated exons in the sample carrying a mutation and the number of exons in the control sample is at least 1.5, reflecting the presence of 3 copies of a specific exon in the mutated sample and 2 copies in the wild-type sample. The ratio of 1.5 is difficult to demonstrate unambiguously by MLPA, which often gives false-positive signals, as observed in case 1 (BRCA1 Dup Ex 13). The limits of MLPA have been underlined in several recent studies (Cavalieri et al., 2008), (Staaf et al., 2008). MLPA is limited to coding sequences and can also give false-negative scores, due to the restricted coverage of the 21 probes (Cavalieri et al., 2008). In addition, MLPA provides only limited information on the location of deletion or duplication breakpoints in the usually very large intronic or affected flanking regions, thus necessitating laborious mapping for sequence characterization of the rearrangements. Staaf et al. recently suggested that MLPA should be regarded as a screening tool that needs to be complemented by other means of mutation characterization, such as aCGH (Staaf et al., 2008). We propose Molecular Combing as a replacement technology for MLPA or aCGH, as it unambiguously identifies and visualizes duplications.
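The dosage criterion described above for MLPA-type assays reduces to a simple copy-number ratio: 3 copies in a sample carrying a heterozygous duplication against 2 copies in the control gives 1.5. The sketch below illustrates only that arithmetic; the function names are hypothetical and it does not model any real MLPA analysis software.

```python
def copy_ratio(sample_copies, control_copies=2):
    """Ratio of exon copies in the tested sample to copies in the control."""
    if control_copies <= 0:
        raise ValueError("control copy number must be positive")
    return sample_copies / control_copies


def is_duplication_call(ratio, threshold=1.5):
    """A duplication is considered present when the ratio is at least 1.5."""
    return ratio >= threshold


# Heterozygous duplication (3 copies) versus wild type (2 copies).
print(copy_ratio(3), is_duplication_call(copy_ratio(3)))
print(copy_ratio(2), is_duplication_call(copy_ratio(2)))
```

Such a ratio indicates only that extra copies exist; it carries no information on where the extra copy lies or how it is oriented, which is the limitation discussed above.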
Another advantage of Molecular Combing as disclosed herein was its capacity to cover non-coding regions, including the 5′ region of the BRCA1 gene and the genomic region upstream of BRCA1 that comprises the NBR2 gene, the ψBRCA1 pseudogene and the NBR1 gene. Recent studies show that it is very difficult to design exploitable PCR or aCGH probes in this rearrangement-prone genomic region (Rouleau et al., 2007), (Staaf et al., 2008), because of the presence of duplicated regions and the high density of Alu repeats. Genomic rearrangements typically arise from unequal homologous recombination between short interspersed nuclear elements (SINEs), including Alu repeats, long interspersed nuclear elements (LINEs), or simple repeat sequences.
Molecular Combing permits precise physical mapping within these difficult regions, as shown here in cases 3 and 2 (BRCA1 Del Ex 2), where we measured mutation sizes of 38.5 kb and 37.1 kb, respectively. As cases 3 and 2 belong to the same family, the detected mutation was the same in both cases, as confirmed by aCGH (Rouleau et al., 2007). The measurement difference of 1.4 kb between these two cases is acceptable, being within the 1-2 kb resolution range of the molecular combing assay. The mutation was originally described by Puget et al., who determined the mutation size (37 kb) with a first-generation molecular combing “color bar coding” screening method (Puget et al., 2002). Size estimated with aCGH was in the 40.4-58.1 kb range, because of the low density of exploitable oligonucleotide sequences in this genomic region and the reduced sensitivity of some oligonucleotides due to sequence homology (Rouleau et al., 2007). Molecular combing can therefore be used for the analysis of hard-to-sequence genomic regions that contain large numbers of repetitive elements. Here we demonstrate that the high concentration of Alu sequences in BRCA1 does not represent an obstacle for molecular combing.
Detection of Previously Uncharacterized BRCA1 Large Rearrangements in Breast Cancer Patients
Further samples were tested, and we characterized by Molecular Combing rearrangements which other techniques had failed to accurately describe. One such example is detailed below.
Triplication of Exons 1a, 1b and 2 of BRCA1 and a Portion of NBR2.
We analyzed sample #7 (provided by the Institut Claudius Régaud, Toulouse, France) by Molecular Combing, using the set of probes described in
Such a triplication has not been reported in this genomic region yet. This may be due to the previous lack of relevant technologies to detect the mutation. Therefore, we designed tests specific to this mutation. These tests may be used to screen for this triplication or to confirm this triplication in samples where a rearrangement is suspected in this region. There are several types of possible tests, such as PCR, quantitative PCR (qPCR), MLPA, aCGH, sequencing . . . .
Results of quantification techniques, which provide a number of copies of a given sequence (qPCR, MLPA, aCGH, . . . ), will not provide direct assessment of the tandem nature of the additional copies of the sequence. The triplication reported here may be suspected when sequences within exons 1a, 1b and/or 2 of BRCA1 and/or the sequences between these exons are present in multiple (more than two per diploid genome) copies. Generally speaking, when these results are above the threshold determined for a duplicated sequence (which has three copies in total of the duplicated sequence), the sample should be suspected to bear a triplication on a single allele (rather than duplications of the sequence in two separate alleles). Confirmation of the triplication and its tandem nature may be obtained either through a PCR test or through a Molecular Combing test as described in this and the examples section.
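The copy-number reasoning above can be illustrated with a short sketch. It is an illustration only: the function name `dosage_ratio` and the per-allele copy counts are hypothetical and are not part of the described tests. It shows that a triplication on one allele and a duplication on each of the two alleles give the same total copy number, which is why a quantification result alone cannot establish the tandem nature of the additional copies.

```python
from fractions import Fraction

def dosage_ratio(allele_a_copies, allele_b_copies, control_total=2):
    """Ratio of total sequence copies in a sample to a diploid control.

    allele_a_copies / allele_b_copies: hypothetical number of copies of the
    sequence carried by each allele (1 = not rearranged).
    """
    return Fraction(allele_a_copies + allele_b_copies, control_total)

wild_type = dosage_ratio(1, 1)                  # 2 copies in total
duplication_one_allele = dosage_ratio(2, 1)     # 3 copies in total
triplication_one_allele = dosage_ratio(3, 1)    # 4 copies in total
duplication_both_alleles = dosage_ratio(2, 2)   # also 4 copies in total

# The last two configurations are indistinguishable by copy number alone.
same_signal = triplication_one_allele == duplication_both_alleles
```

A result above the duplication ratio of 3/2 therefore only raises the suspicion of a triplication; the PCR or Molecular Combing confirmation remains necessary.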
As this is a more direct method, we detail some PCR designs here, in the example sections. The man skilled in the art may adapt these tests through common, generally known, molecular biology methods, e.g. by modifying primer locations within the sequence ranges mentioned, and/or modifying experimental conditions (annealing temperature, elongation time, . . . for PCR). Also, these tests may be included in “multiplex” tests where other mutations are also sought. For example, one or several pair(s) of primers designed to detect the triplication and described below may be used simultaneously with one or several other pair(s) of primers targeting distinct amplicons. In addition to these adaptations, several common variants exist for the molecular tests described. Nevertheless, these variants remain functionally identical to the described tests and the adaptation of our designs to these variants is easily achievable by the man skilled in the art. For example, sequencing may be replaced by targeted resequencing, where the region of interest is isolated from other genomic regions before the sequencing step, so as to increase coverage in the region of interest. As another example, semi-quantitative PCR, where DNA quantity after amplification is assessed by common agarose electrophoresis, may replace QMPSF.
These results demonstrate that the developed Molecular Combing platform is a valuable tool for genetic screening of tandem repeat duplications, CNVs, and other complex rearrangements in BRCA1 and BRCA2, such as translocations and inversions, particularly in high-risk breast cancer families.
A prominent application of the developed molecular diagnostic tool is as a predictive genetic test. However, the methods and tools disclosed herein may be applied as or in a companion diagnostic test, for instance, for the screening of BRCA-mutated cells in the context of the development of PARP inhibitors. Such a genetic test can be applied not only to clinical blood samples, but also to circulating cells and heterogeneous cell populations, such as tumor tissues.
Preliminary Patient Screening
The Genomic Morse Code was validated on 10 samples from patients with no deleterious mutations detected in BRCA1 or BRCA2 (control patients). The genetic test was validated on 6 samples from patients with positive family history of breast cancer and known to bear large rearrangements affecting either BRCA1 or BRCA2. Total human genomic DNA was obtained from EBV-immortalized lymphoblastoid cell lines. Preliminary screening for large rearrangements was performed with the QMPSF assay (Quantitative Multiplex PCR of Short Fluorescent Fragments) in the conditions described by Casilli et al and Tournier et al (Casilli et al., 2002) or by means of MLPA (Multiplex Ligation-Dependent Probe Amplification) using the SALSA MLPA kits P002 (MRC Holland, Amsterdam, The Netherlands) for BRCA1 and P045 (MRC-Holland) for BRCA2. All 16 patients gave their written consent for BRCA1 and BRCA2 analysis.
Sample Preparation
Total human genomic DNA was obtained from EBV-immortalized lymphoblastoid cell lines. A 45-μL suspension of 10^6 cells in PBS was mixed with an equal volume of 1.2% Nusieve GTG agarose (Lonza, Basel, Switzerland) prepared in 1×PBS, previously equilibrated at 50° C. The plugs were left to solidify for 30 min at 4° C.; cell membranes were then solubilised and proteins digested by an overnight incubation at 50° C. in 250 μL of 0.5 M EDTA pH 8.0, 1% Sarkosyl (Sigma-Aldrich, Saint Louis, Mo., USA) and 2 mg/mL proteinase K (Eurobio, Les Ulis, France), and the plugs were washed three times at room temperature in 10 mM Tris, 1 mM EDTA pH 8.0. The plugs were then either stored at 4° C. in 0.5 M EDTA pH 8.0 or used immediately. Stored plugs were washed three times for 30 minutes in 10 mM Tris, 1 mM EDTA pH 8.0 prior to use.
Probe Preparation
All BRCA1 and BRCA2 probes were cloned into pCR2.1-Topo or pCR-XL-Topo (Invitrogen) plasmids by TOPO cloning, using PCR amplicons as inserts. Amplicons were obtained using bacterial artificial chromosomes (BACs) as template DNA. The following BACs were used: for BRCA1, the 207-kb BAC RP11-831F13 (ch17: 41172482-41379594, InVitrogen, USA); and for BRCA2, the 172-kb BAC RP11-486017 (ch13: 32858070-33030569, InVitrogen, USA). See Tables 1 and 2 for primer sequences and probe coordinates. Primer sequences are referenced as SEQ ID 1 to SEQ ID 130. In some cases (as detailed in Table 1), additional artificial sequences were added to the 5′ end of the primer for ease of cloning. These artificial sequences are SEQ ID 134 (ForwardPrimerPrefix) for forward primers and SEQ ID 135 (ReversePrimerPrefix) for reverse primers, both containing a poly-A and a restriction site for, respectively, AscI and PacI.
SEQ ID 131 (BRCA1-1A), SEQ ID 132 (BRCA1-1B) and SEQ ID 133 (BRCA1-SYNT1) are examples of probe sequences.
Whole plasmids were used as templates for probe labeling by random priming. Briefly, for biotin (Biota) labeling, 200 ng of template was labeled with the DNA Bioprime kit (Invitrogen) following the manufacturer's instructions, in an overnight labeling reaction. For Alexa-488 (A488) or digoxigenin (Dig) labeling, the same kit and protocol were used, but the dNTP mixture was modified to include the relevant labeled dNTP, namely Dig-11-dUTP (Roche Diagnostics, Meylan, France) or A488-7-OBEA dCTP (Invitrogen) and its unlabelled equivalent, both at 100 μM, and all other dNTPs at 200 μM. Labeled probes were stored at −20° C. For each coverslip, 5 μL of each labeled probe (1/10th of a labeling reaction product) was mixed with 10 μg of human Cot-1 and 10 μg of herring sperm DNA (both from Invitrogen) and precipitated in ethanol. The pellet was then resuspended in 22 μL of 50% formamide, 30% Blocking Aid (Invitrogen), 1×SSC, 2.5% Sarkosyl, 0.25% SDS, and 5 mM NaCl.
Genomic DNA Combing and Probe Hybridization
Genomic DNA was stained by 1 h incubation in 40 mM Tris, 2 mM EDTA containing 3 μM Yoyo-1 (Invitrogen, Carlsbad, Calif., USA) in the dark at room temperature. The plug was then transferred to 1 mL of 0.5 M MES pH 5.5, incubated at 68° C. for 20 min to melt the agarose, and then incubated at 42° C. overnight with 1.5 U beta agarase I (New England Biolabs, Ipswich, Mass., USA). The solution was transferred to a combing vessel already containing 1 ml of 0.5 M MES pH 5.5, and DNA combing was performed with the Molecular Combing System on dedicated coverslips (Combicoverslips) (both from Genomic Vision, Paris, France).
Combicoverslips with combed DNA were then baked for 4 h at 60° C. The coverslips were either stored at −20° C. or used immediately for hybridisation. The quality of combing (linearity and density of DNA molecules) was estimated under an epi-fluorescence microscope equipped with an FITC filter set and a 40× air objective. For this purpose, a freshly combed coverslip was mounted in 20 μL of a 1 mL ProLong-gold solution containing 1 μL of Yoyo-1 solution (both from Invitrogen). Prior to hybridisation, the coverslips were dehydrated by successive 3-minute incubations in 70%, 90% and 100% ethanol baths and then air-dried for 10 min at room temperature. The probe mix (20 μL; see Probe Preparation) was spread on the coverslip, and then left to denature for 5 min at 90° C. and to hybridise overnight at 37° C. in a hybridizer (Dako). The coverslip was washed three times for 5 min in 50% formamide, 1×SSC, then 3×3 min in 2×SSC.
Detection was performed with two or three successive layers of fluorophore- or streptavidin-conjugated antibodies, depending on the modified nucleotide employed in the random priming reaction (see above). For the detection of biotin-labeled probes, the antibodies used were Streptavidin-A594 (InVitrogen, Molecular Probes) for the 1st and 3rd layers and biotinylated goat anti-Streptavidin (Vector Laboratories) for the 2nd layer; for the detection of A488-labelled probes, the antibodies used were rabbit anti-A488 (InVitrogen, Molecular Probes) for the 1st layer and goat anti-rabbit A488 (InVitrogen, Molecular Probes) for the 2nd layer; for the detection of digoxigenin-labeled probes, the antibodies used were mouse anti-Dig (Jackson Immunoresearch) for the 1st layer, rat anti-mouse AMCA (Jackson Immunoresearch) for the 2nd layer and goat anti-mouse A350 (InVitrogen, Molecular Probes) for the 3rd layer.
A 20 minute incubation step was performed at 37° C. in a humid chamber for each layer, and three successive 3 minutes washes in 2×SSC, 0.1% Tween at room temperature between layers. Three additional 3 minutes washes in PBS and dehydration by successive 3 minutes washes in 70%, 90% and 100% ethanol were performed before mounting the coverslip.
Image Acquisition
Image acquisition was performed with a customized automated fluorescence microscope (Image Xpress Micro, Molecular Devices, Sunnyvale, Calif., USA) at 40× magnification, and image analysis and signal measurement were performed with the software ImageJ (http://_rsbweb.nih.gov/ij) and JMeasure (Genomic Vision, Paris, France). Hybridisation signals corresponding to the BRCA1 and BRCA2 probes were selected by an operator on the basis of specific patterns made by the succession of probes. For all motif signals belonging to the same DNA fibre, the operator set the ends of the segment and determined its identity and length (kb), on a 1:1 scale image. The data were then output as a spreadsheet. In the final analysis, only intact motif signals were considered, confirming that no fibre breakage had occurred within the BRCA1 or BRCA2 motifs.
Statistical Analysis
Molecular Combing allows DNA molecules to be stretched uniformly with a physical distance to contour length correlation of 1 μm, equivalent to 2 kb (Michalet et al., 1997). As a consequence, in the absence of large rearrangements, the derived stretching factor (SF) has a value close to 2 kb/μm (±0.2).
All 7 BRCA1 motifs (g1b1-g7b1) and all 5 BRCA2 motifs (g1b2-g5b2) were measured in all 20 biological samples. The mean size of all motifs measured in the 10 healthy controls, including the associated statistical analysis, is reported in Table S1. The size of all motifs measured in the 6 breast cancer patients, including the associated statistical analysis, is reported in Table S2. For each motif, the following values were determined: the number of measured images (n), the theoretical calculated length (calculated (kb)), the mean measured length (μ (kb)), the standard deviation (SD (kb)), the coefficient of variation (CV (%)), the difference between μ and calculated (delta), and the stretching factor (SF=(calculated/μ)×2) (Michalet et al., 1997). In the absence of mutations, delta values are comprised between −1.9 kb and 1.9 kb, and SF values are comprised between 1.8 and 2.2. The presence of a large rearrangement on BRCA1 or BRCA2 was first identified by visual inspection of the corresponding GMC. From numerous datasets, we established that in the presence of large rearrangements in both BRCA1 and BRCA2, delta≧2 kb (for duplications) or delta≦−2 kb (for deletions), and the corresponding SF≧2.3 kb/μm (for deletions) or SF≦1.7 kb/μm (for duplications). To confirm the presence of a large rearrangement, the motif (−s) of interest was (were) first measured on a total population of images (typically between 20 and 40), comprising wild-type (wt) and mutated (mt) alleles. In the presence of large rearrangements, and in order to measure the mutation size, the images were then divided into two groups, corresponding to the wt and the mt alleles. Within each of the two groups of n images, the following values were calculated: μ (kb), SD (kb), CV (%). The μ value of the wild-type allele was then compared with the μ value of the mutated allele. To this end, we calculated the standard error of the mean (SEM=SD/√n) and the 95% confidence interval (95% CI=μ±2×SEM).
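The per-motif values listed above can be computed as in the following minimal sketch. The function names, the example measurements and the use of the sample standard deviation are assumptions made for illustration; only the formulas for delta and SF and the numerical thresholds are taken from the description. A motif that satisfies neither the mutation-free ranges nor both rearrangement criteria is reported here as "inconclusive", which is a choice of this sketch and not a rule stated in the text.

```python
import statistics

def motif_statistics(measured_kb, calculated_kb):
    """Return n, mu, SD, CV, delta and SF for one motif."""
    n = len(measured_kb)
    mu = statistics.mean(measured_kb)
    sd = statistics.stdev(measured_kb)   # sample SD (assumption)
    cv = 100.0 * sd / mu                 # coefficient of variation in %
    delta = mu - calculated_kb           # positive when the motif is longer
    sf = (calculated_kb / mu) * 2.0      # stretching factor in kb/um
    return {"n": n, "mu": mu, "sd": sd, "cv": cv, "delta": delta, "sf": sf}

def classify(delta, sf):
    """Apply the delta and SF thresholds quoted in the description."""
    if -1.9 <= delta <= 1.9 and 1.8 <= sf <= 2.2:
        return "no large rearrangement"
    if delta >= 2.0 and sf <= 1.7:
        return "duplication"
    if delta <= -2.0 and sf >= 2.3:
        return "deletion"
    return "inconclusive"

# Hypothetical measurements (kb) for a motif with a calculated length of 20 kb.
stats_dup = motif_statistics([26.0, 27.0, 25.0, 26.0], 20.0)
call_dup = classify(stats_dup["delta"], stats_dup["sf"])

# Hypothetical measurements (kb) for a motif with a calculated length of 30 kb.
stats_del = motif_statistics([20.0, 21.0, 19.0, 20.0], 30.0)
call_del = classify(stats_del["delta"], stats_del["sf"])
```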
The mutation size was then calculated as a difference between the mean size of the two alleles: mutation size=μ(BRCA1mt)−μ(BRCA1wt). The related error was calculated according to following formula:
error=(((μmt+2×SEMmt)−(μwt−2×SEMwt))−((μmt−2×SEMmt)−(μwt+2×SEMwt)))/2.
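As a worked illustration of the two formulas above, the following sketch computes the mutation size and its error from the mean, SD and number of images of each allele group. The function names and the numerical inputs are hypothetical. Expanding the error formula shows that it is equivalent to 2×(SEMmt+SEMwt), that is, half the width of the interval obtained by combining the two 95% confidence intervals.

```python
import math

def sem(sd, n):
    """Standard error of the mean, SEM = SD / sqrt(n)."""
    return sd / math.sqrt(n)

def mutation_size_and_error(mu_mt, sd_mt, n_mt, mu_wt, sd_wt, n_wt):
    """Mutation size (kb) and error (kb) following the formulas above."""
    sem_mt = sem(sd_mt, n_mt)
    sem_wt = sem(sd_wt, n_wt)
    size = mu_mt - mu_wt
    # Largest and smallest differences compatible with the two 95% CIs.
    upper = (mu_mt + 2 * sem_mt) - (mu_wt - 2 * sem_wt)
    lower = (mu_mt - 2 * sem_mt) - (mu_wt + 2 * sem_wt)
    error = (upper - lower) / 2
    return size, error

# Hypothetical values: mutated allele 26 kb (SD 2, n 16), wild type 20 kb (SD 1.5, n 9).
size, error = mutation_size_and_error(26.0, 2.0, 16, 20.0, 1.5, 9)
```

With these hypothetical inputs both SEM values equal 0.5 kb, so the result would be reported as 6.0±2.0 kb.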
Part 1. Previous Application of Molecular Combing on Characterization of BRCA1 and BRCA2 Large Rearrangements: Design of Low Resolution Color Bar Codes (CBCs)
Molecular Combing has already been used by Gad et al. (Gad GenChrCan 2001, Gad JMG 2002) to detect large rearrangements in the BRCA1 and BRCA2 genes. The hybridization DNA probes originally used were part of a low resolution “color bar coding” screening approach composed of cosmids, PACs and long-range PCR products. Some probes were small and ranged from 6 to 10 kb, covering a small fraction of the BRCA1 and BRCA2 loci. Other probes were very large (PAC 103014 measuring 120 kb for BRCA1 and BAC 486017 measuring 180 kb for BRCA2) and covered the whole loci, including all the repetitive sequences. Thus, no bioinformatic analysis to identify potentially disturbing repetitive sequences was ever performed. More importantly, no repetitive sequence was ever excluded from the design of the CBCs. This often resulted in incomplete characterizations of the screened mutations (see Part 3). As a consequence, detection of the probes often resulted in the superposition of individual colored signals (e.g., yellow/white spots resulting from superposition of different colored signals) and in strong background noise, undermining the quality of the images and preventing the development of a robust strategy to measure the signal lengths. In addition, no DNA probe was isolated and cloned in an insert vector. The BRCA1 Color Bar Code (CBC) was composed of only 7 DNA probes (Gad, et al, Genes Chromosomes and cancer 31:75-84 (2001)), whereas the BRCA2 CBC was composed of only 8 DNA probes (Gad, et al, J Med Genet (2002)). This low number of DNA probes did not allow high resolution physical mapping.
Importantly, such a low resolution screening approach did not allow the unambiguous visualization of complex mutations, such as tandem repeat duplications or triplications. In contrast, full characterization of tandem repeat duplications and triplications is possible with the high-resolution GMC (see Example 1). Moreover, the accurate physical mapping of all the mutated exons was often problematic, requiring additional laborious sequencing experiments. This often resulted in incomplete characterizations of the screened mutations (see Part 3).
Part 2. New Application of Molecular Combing on Characterization of BRCA1 and BRCA2 Large Rearrangements: Design of High Resolution Genomic Morse Codes (GMCs) and Development of a Genetic Test.
An important point of novelty for the present invention is the design and cloning of high-resolution Genomic Morse Codes (GMC) for both BRCA1 and BRCA2 genomic regions. The BRCA1 GMC is composed of 35 DNA probes (
Comparative
Comparative
35 genomic regions in BRCA1 and 27 regions in BRCA2 devoid of repetitive sequences were identified, and were used to design and clone the corresponding DNA hybridization probes. All the details of the employed DNA hybridization probes (name, size, coordinates, color and the nature of the covered exons) are listed above. The cloned DNA probes allow the accurate physical mapping of deleted exons and permit the simultaneous detection of large rearrangements in BRCA1 and BRCA2. The above described improvement in resolution, permitted the inventors to translate their observations into the development of a robust predictive genetic test for breast and ovarian cancer (see example 1).
Part 3: High Resolution GMCs Allow the Unambiguous Detection and Visualization of Complex Mutation (e.g.: Tandem Repeat Duplications and Triplications) that can't be Characterized by Low Resolution CBCs
The following are selected examples of complex mutations that could not be characterized (or only partially) by low resolution CBC, but could be precisely and unambiguously characterized by high resolution GMC:
3.1 BRCA1 Dup Ex 18-20
CBC:
The image generated by Gad et al (case IC171712 in FIG. 1 of Gad et al, Oncogene 2001) has a low resolution and the nature and particularly the identity of the deleted exons cannot be defined by visual inspection. As a consequence, the size of the mutation has not been determined, confirming that the generated images were problematic for measurements.
By visual inspection, this mutation appears as a tandem duplication of the red signal S5B1. After measurement, the mutation was estimated to have a size of 6.7±1.2 kb, restricted to a portion of the genome that encodes for exons 18 to 20. The estimated mutation size is fully in line with the 8.7 kb reported in the literature (Staaf, 2008). Details on the measurement and statistical analysis can be found in Example 1.
Comparative
CBC:
The image generated by Gad et al (case IC657 in FIG. 1 of Gad et al, Oncogene 2001) has a low resolution and the nature of the deleted exons cannot be unambiguously defined by visual inspection. The size of the mutation after measurement was 20.0±9.6 kb, having an important standard deviation.
GMC: (See
By visual inspection, the mutation clearly appeared as a deletion of the blue signal S7B1, including a large genomic portion between signals S7B1 and S8B1. After measurement, the mutation was estimated to have a size of 20±2.8 kb, having a smaller error.
3.3 BRCA1 Dup Ex 13 (6.1 kb)
CBC:
No microscopy image related to this mutation has ever been provided. The estimated mutation size was 5.8±1.8 kb (case IARC3653 in FIG. 3 of Gad et al, Oncogene 2001), but is not supported by visual inspection.
GMC: (see
By visual inspection via Molecular Combing, this mutation appears as a partial tandem duplication of the blue signal S7B1. After measurement, the mutation was estimated to have a size of 6.1±1.6 kb, restricted to a portion of the DNA probe BRCA1-8 that encodes exon 13. The estimated mutation size is fully in line with the 6.1 kb reported in the literature (Puget, 1999), and according to the Breast Cancer Information Core database, this mutation belongs to the 10 most frequent mutations in BRCA1 (Szabo, 2000). Therefore, there is perfect correlation between the images and the measurements, and correlation with values present in the literature.
3.4 Tandem repeat triplication of exons 1a, 1b and 2 of BRCA1 and a portion of NBR2.
No tandem triplication has ever been reported using the CBC.
By visual inspection via Molecular Combing, two alleles of the BRCA1 gene were identified in a sample provided by the Institut Claudius Regaud, Toulouse, France, differing in the length of the motif g7b1 which extends from the end of the S9B1 probe to the opposite end of the S11B1 probe. The mutation appeared to be a triplication involving portions of the SYNT1 and the S10B1 probe, as confirmed in probe color swapping experiments. This triplication of a DNA segment with a size comprised between 5 and 10 kb, and probably between 6 and 8 kb, involves exons 1a, 1b and 2 of the BRCA1 gene and possibly part of the 5′ extremity of the NBR2 gene.
The CBC would have at best detected this mutation as an increase of the length of a single probe, and thus would not have been able to characterize the mutation as a tandem triplication. In contrast to Molecular Combing, none of the current molecular diagnostic technologies, such as MLPA or aCGH, could assess whether the duplication or triplication is in tandem (within BRCA1) or dispersed (out of BRCA1). This observation makes a clear difference in terms of risk evaluation, since there is no evidence that repeated genomic portions out of the BRCA1 locus are clinically significant. Molecular Combing highlights that the mutation occurs within the BRCA1 gene, thus being of clinical significance.
The following important advantages of GMC compared to CBC are evident from the examples above:
Tests Specific to Detect a Triplication in the 5′ Region of BRCA1
PCR tests to detect unambiguously the triplication described above or a close triplication may distinguish non triplicated from triplicated alleles through either one of two ways:
The organization of the sequences in a triplication may be used to design primer pairs such that the PCR amplification is only possible in a tandem repeat. If one of the primers is located in the amplified sequence and is in the same orientation as the BRCA gene (5′ to 3′) and the other is the reverse complement of a sequence within the amplified sequence located upstream of the first primer (i.e. the direction from the location of the first to the second primer is the same as the direction from the 3′ to the 5′ end of the BRCA gene), the PCR in a non-mutated sample will not be possible as the orientation of the primers does not allow it. Conversely, in a triplicated sample, the first primer hybridizing on a repeat unit is oriented correctly relative to the second primer hybridizing in the repeat unit immediately downstream of the first primer's repeat unit. Thus, the PCR is possible. In a triplicated sample, two PCR fragments should be obtained using a pair of primers designed this way. In a sample with a duplication, only one fragment would appear. The size of the smaller PCR fragment (or the only fragment in the case of a duplication), s, is the sum of the following distances:
This measurement thus provides a location range for both breakpoints, the downstream breakpoint being at a distance smaller than or equal to s from the location of the downstream primer (in the downstream direction) and the upstream breakpoint at a distance smaller than or equal to s from the location of the upstream primer (in the upstream direction). Besides, since the size of the triplicated sequence (L) is the sum of U+D and the distance between the two primers, L may be readily deduced from the size of the PCR fragment.
The size of the larger fragment is the sum of L and the size of the smaller fragment. Thus, by subtracting the size of the smaller fragment from the size of the larger one, the size of the triplicated sequence is readily assessable in a second, independent assessment. This reduces the uncertainty on the location of the breakpoints. Thus, a test designed this way will allow a precise characterization of the triplication. Given the location of the triplication identified here, primer pairs used to detect the triplication could include combinations of one or several of the following downstream and upstream primers (the primer designated as the downstream primer is in the direct orientation relative to the BRCA1 gene, while the upstream primer is reverse complementary to the first strand of the BRCA1 gene). In choosing a combination of primers, in addition to the prescriptions below, one must choose the primer locations so the downstream primer is located downstream of the upstream primer:
A downstream primer may be located:
An upstream primer may be located:
An example of such a combination is the primer pair consisting of primers BRCA1-Synt1-R (SEQ ID 126) and BRCA1-A3A-F (SEQ ID 25);
The combinations above are not meant to be exhaustive and the man skilled in the art may well choose other locations for the upstream and downstream primers, provided the orientation and relative location of the primers are chosen as described. Several combinations of primers may be used in separate experiments or in a single experiment (in which case all of the “upstream” primers must be located upstream of all of the “downstream” primers). If more than three primers are used simultaneously (multiplex PCR), the number of PCR fragments obtained will vary depending on the exact location of the breakpoint (no PCR fragment at all will appear in non-mutated samples) and the characterization of the mutation will be difficult. Therefore, it is advisable to perform additional experiments with separate primer pairs if at least one fragment is observed in the multiplex PCR.
Importantly, with the design described in the preceding paragraphs, the orientation of the triplicated sequence is of minor importance: indeed, in a triplication, at least two of the repeat units will share the same orientation and at least one PCR fragment should be amplified. This holds true for a duplication, as in the case of an inverted repeat, a PCR fragment would be obtained from one of the primers hybridizing in two separate locations with reverse (facing) orientations, while a direct tandem repeat would generate a PCR fragment from the two primers as described above.
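The fragment-size arithmetic of this first PCR design, for direct tandem repeats, can be sketched as follows. The coordinates, the function name and the neglect of primer lengths are assumptions of this sketch. U and D are interpreted here as the distance from the upstream primer to the upstream breakpoint and the distance from the downstream primer to the downstream breakpoint, which is an assumption consistent with the relation stated above (L equals U+D plus the distance between the two primers). Positions increase in the 5′ to 3′ direction of the BRCA1 gene.

```python
def expected_fragments(unit_start, unit_end, downstream_primer, upstream_primer,
                       copies):
    """Expected PCR fragment sizes for outward-facing primers.

    unit_start, unit_end: hypothetical breakpoints of the repeated unit.
    downstream_primer: position of the primer in direct orientation.
    upstream_primer: position of the reverse-complementary primer, located
        upstream of the other primer within the repeated unit.
    copies: number of direct tandem copies (1 = non-mutated allele).
    """
    if not unit_start <= upstream_primer < downstream_primer <= unit_end:
        raise ValueError("primers must lie in the unit, upstream one first")
    L = unit_end - unit_start            # size of the repeated sequence
    D = unit_end - downstream_primer     # primer to downstream breakpoint
    U = upstream_primer - unit_start     # upstream breakpoint to primer
    s = U + D                            # smallest fragment (primers ignored)
    # One fragment per pair of copies bridged: none for a single copy.
    return [s + k * L for k in range(copies - 1)]

# Hypothetical 7 kb unit with primers 4 kb apart.
normal = expected_fragments(1000, 8000, 6000, 2000, copies=1)
duplicated = expected_fragments(1000, 8000, 6000, 2000, copies=2)
triplicated = expected_fragments(1000, 8000, 6000, 2000, copies=3)
unit_size = triplicated[1] - triplicated[0]   # second estimate of L
```

In this hypothetical case the non-mutated allele gives no fragment, the duplication gives one and the triplication gives two, and the difference between the two fragments returns the size of the repeated unit.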
Another type of PCR test to reveal the triplication and its tandem nature requires the amplification of a fraction of or of the entire repeat array, using primer pairs spanning the repeated sequence (both primers remaining outside the amplified sequence), or spanning a breakpoint (one primer is within and the other outside the amplified sequence) or entirely included in the amplified sequence. These tests will generate a PCR fragment of given size in a normal sample, while in a sample with a triplication on one allele, one or more additional PCR fragments will appear, including one the size of the “normal” fragment plus twice the size of the repeat sequence. If a mutation is present, these tests will often lead to results that can have several interpretations. If a single experiment is performed and reveals a mutation, a (series of) complementary test(s) may be performed following the designs presented herein to confirm the correct interpretation. Given the location of the triplication identified here, primer pairs used to detect the triplication could include a combination of one or several of the following primers, with at least one downstream and one upstream primer. The primer designated as the downstream primer is reverse complementary relative to the BRCA1 gene sequence, while the upstream primer is in direct orientation relative to the BRCA1 gene. In choosing a combination of primers, in addition to the prescriptions below, one must choose the primer locations so the downstream primer is located downstream of the upstream primer:
A downstream primer may be located:
An upstream primer may be located:
Specific Embodiments of the Invention Include the Following:
1. A nucleic acid composition for detecting simultaneously one or more large or complex mutations or genetic rearrangements in the locus BRCA1 or BRCA2 comprising at least two colored-labeled probes containing more than 200 nucleotides and specific of each said gene, said probes being visually detectable at high resolution and free of repetitive nucleotidic sequences.
2. A nucleic acid composition according to embodiment 1 for detecting simultaneously one or more large or complex mutations or genetic rearrangements in the locus BRCA1 or BRCA2 comprising at least three colored-labeled probes containing more than 200 nucleotides and specific of each said gene, said probes being visually detectable at high resolution and free of repetitive nucleotidic sequences.
3. A nucleic acid composition according to embodiments 1 or 2 for detecting simultaneously one or more large or complex mutations or genetic rearrangements in BRCA1 or BRCA2 gene comprising at least three color-labeled probes containing more than 600 nucleotides and specific of each said gene, said probes being visually detectable at high resolution and free of repetitive nucleotidic sequences.
4. A composition according to embodiments 1, 2 or 3, wherein the probes are all together visualized on a monostranded-DNA fiber or on a polynucleotidic sequence of interest or on a genome to be tested.
5. A composition according to embodiments 1, 2, 3 or 4 comprising at least five color-labeled signal probes specific of BRCA1 or BRCA2 locus allowing detection of the following mutations: duplication, deletion, inversion, insertion, translocation or large rearrangement.
6. A composition according to embodiments 1 to 4 comprising at least seven color-labeled signal probes specific of BRCA1 or BRCA2 locus allowing detection of the following mutations: duplication, deletion, inversion, insertion, translocation or large rearrangement.
7. A composition according to embodiments 1 to 4 comprising at least nine color-labeled signal probes specific of BRCA1 or BRCA2 locus allowing detection of the following mutations: duplication, triplication, deletion, inversion, insertion, translocation or large rearrangement.
8. A composition according to embodiments 1 to 7 comprising at least fourteen color-labeled signal probes specific of BRCA1 or BRCA2 locus allowing detection of the following mutations: duplication, triplication, deletion, inversion, insertion, translocation or large rearrangement.
9. A composition according to embodiments 1 to 8 comprising at least eighteen color-labeled signal probes specific of BRCA1 or BRCA2 locus allowing detection of the following mutations: duplication, triplication, deletion, inversion, insertion, translocation or large rearrangement.
10. A composition according to embodiments 1 to 9 wherein the genetic rearrangement or mutation detected is more than 1.5 kilobase (kb).
11. A predictive genetic test of susceptibility to breast or ovarian cancer in a subject involving the detection (presence or absence) and optionally the characterization of one or more specific large genetic rearrangements or mutations in the coding or non-coding sequences of the BRCA1 or BRCA2 locus, the rearrangement being visualized by any of the compositions according to embodiments 1 to 10.
12. A method of detecting the sensitivity of a subject to a therapeutic procedure comprising the identification of one or more genetic rearrangements or mutations in the coding or non-coding sequences of the BRCA1 or BRCA2 gene or locus by visualizing said genetic rearrangement by molecular combing using any of the compositions according to embodiments 1 to 10.
13. A method of detection of at least one large genetic rearrangement or mutation by the molecular combing technique in a fluid or circulating cells or a tissue of a biological sample comprising the steps of:
a) contacting the genetic material to be tested with at least two color-labeled probes according to embodiments 1 to 10;
b) visualizing with high resolution the hybridization of step a); and optionally
c) comparing the result of step b) to the result obtained with a standardized genetic material carrying no rearrangement or mutation in the BRCA1 or BRCA2 gene or locus.
14. A composition comprising:
two or more oligonucleotide probes according to embodiments 1 to 10;
probes complementary to said oligonucleotide probes;
probes that hybridize to said probes of embodiments 1 to 10 under stringent conditions;
probes amplified by PCR using pairs of primers described in Tables 1 or 2 (SEQ ID 1 to SEQ ID 130); or
probes comprising BRCA1-1A (SEQ ID NO: 131), BRCA1-1B (SEQ ID NO: 132), or BRCA1-SYNT1 (SEQ ID NO: 133).
15. A set of primers selected from the group of primers consisting of SEQ ID 1 to SEQ ID 70 and SEQ ID 125 to SEQ ID 130 for BRCA1.
16. A set of primers selected from the group of primers consisting of SEQ ID 71 to SEQ ID 124 for BRCA2.
17. An isolated or purified probe produced by amplifying BRCA1 or BRCA2 coding, intron or flanking sequences using a primer pair of embodiment 15 or 16.
18. An isolated or purified probe comprising a polynucleotide sequence of SEQ ID NO: 131 (BRCA1-1A), SEQ ID NO: 132 (BRCA1-1B) or SEQ ID NO: 133 (SYNT1), or that hybridizes to SEQ ID NO: 131 or to SEQ ID NO: 132 or to SEQ ID NO: 133 under stringent conditions.
19. A composition comprising at least two polynucleotides each of which binds to a portion of the genome containing a BRCA1 and/or BRCA2 gene, wherein each of said at least two polynucleotides contains at least 200 contiguous nucleotides and contains less than 10% of Alu repetitive nucleotidic sequences.
20. The composition of embodiment 19, wherein said at least two polynucleotides bind to a portion of the genome containing BRCA1.
21. The composition of embodiment 19, wherein said at least two polynucleotides bind to a portion of the genome containing BRCA2.
22. The composition of embodiment 19, wherein each of said at least two polynucleotides contains at least 500 up to 6,000 contiguous nucleotides and contains less than 10% of Alu repetitive nucleotidic sequences.
23. The composition of embodiment 19, wherein the at least two polynucleotides are each tagged with a detectable label or marker.
24. The composition of embodiment 19, comprising at least two polynucleotides that are each tagged with a different detectable label or marker.
25. The composition of embodiment 19, comprising at least three polynucleotides that are each tagged with a different detectable label or marker.
26. The composition of embodiment 19, comprising at least four polynucleotides that are each tagged with a different detectable label or marker.
27. The composition of embodiment 19, comprising three to ten polynucleotides that are each independently tagged with the same or different visually detectable markers.
28. The composition of embodiment 19, comprising eleven to twenty polynucleotides that are each independently tagged with the same or different visually detectable markers.
29. The composition of embodiment 19, comprising at least two polynucleotides each tagged with one of at least two different detectable labels or markers.
30. A method for detecting a duplication, triplication, deletion, inversion, insertion, translocation or large rearrangement in a BRCA1 or BRCA2 locus, BRCA1 or BRCA2 gene, BRCA1 or BRCA2 flanking sequence or intron, comprising: isolating a DNA sample, molecularly combing said sample, contacting the molecularly combed DNA with the composition of embodiment 5 as a probe for a time and under conditions sufficient for hybridization to occur, visualizing the hybridization of the composition of embodiment 5 to the DNA sample, and comparing said visualization with that obtained from a control sample of a normal or standard BRCA1 or BRCA2 locus, BRCA1 or BRCA2 gene, BRCA1 or BRCA2 flanking sequence or intron that does not contain a rearrangement or mutation.
31. The method of embodiment 30, wherein said probe is selected to detect a rearrangement or mutation of more than 1.5 kb.
32. The method of embodiment 30, further comprising predicting or assessing a predisposition to ovarian or breast cancer based on the kind of genetic rearrangement or mutation detected in a coding or noncoding BRCA1 or BRCA2 locus sequence.
33. The method of embodiment 30, further comprising determining the sensitivity of a subject to a therapeutic treatment based on the kind of genetic rearrangement or mutation detected in a coding or noncoding BRCA1 or BRCA2 locus sequence.
34. A kit for detecting a duplication, deletion, triplication, inversion, insertion, translocation or large rearrangement in a BRCA1 or BRCA2 locus, BRCA1 or BRCA2 gene, BRCA1 or BRCA2 flanking sequence or intron comprising at least two polynucleotides each of which binds to a portion of the genome containing a BRCA1 or BRCA2 gene, wherein each of said at least two polynucleotides contains at least 200 contiguous nucleotides and is free of repetitive nucleotidic sequences, wherein said at least two polynucleotides are tagged with visually detectable markers and are selected to identify a duplication, deletion, inversion, insertion, translocation or large rearrangement in a particular segment of a BRCA1 or BRCA2 locus, BRCA1 or BRCA2 gene, BRCA1 or BRCA2 flanking sequence or intron; and optionally a standard describing a hybridization profile for a subject not having a duplication, deletion, inversion, insertion, translocation or large rearrangement in a BRCA1 or BRCA2 locus, BRCA1 or BRCA2 gene, BRCA1 or BRCA2 flanking sequence or intron; one or more elements necessary to perform Molecular Combing, instructions for use, and/or one or more packaging materials.
35. The kit of embodiment 34, wherein said at least two polynucleotides are selected to identify a duplication, deletion, inversion, insertion, translocation or large rearrangement in a particular segment of a BRCA1 or BRCA2 locus, BRCA1 or BRCA2 gene, BRCA1 or BRCA2 flanking sequence or intron associated with ovarian cancer or breast cancer.
36. The kit of embodiment 34, wherein said at least two polynucleotides are selected to identify a duplication, deletion, inversion, insertion, translocation or large rearrangement in a particular segment of a BRCA1 or BRCA2 locus, BRCA1 or BRCA2 gene, BRCA1 or BRCA2 flanking sequence or intron associated with a kind of ovarian cancer or breast cancer sensitive to a particular therapeutic agent, drug or procedure.
37. A method for detecting an amplification of a genomic sequence spanning the 5′ end of the BRCA1 gene and consisting of at least three copies of the sequence in a sample containing genomic DNA. Accordingly, the invention relates in particular to a method for in vitro detecting in a sample containing genomic DNA, a repeat array of multiple tandem copies of a repeat unit consisting of genomic sequence spanning the 5′ end of the BRCA1 gene wherein said repeat array consists of at least three copies of the repeat unit and said method comprises:
38. A method of embodiment 37, where the amplified sequence is at least 2 kb long.
39. A method of embodiment 37, where the amplified sequence is at least 5 kb long.
40. A method of embodiment 37, where the amplified sequence is at most 20 kb long.
41. A method of embodiment 37, where the amplified sequence is at most 10 kb long.
42. A method of embodiment 37, where the amplified sequence is at least 2 kb and at most 20 kb long.
43. A method of embodiment 37, where the amplified sequence is at least 5 kb and at most 10 kb long.
44. A method of any one of embodiments 37 to 43 where the amplified sequence comprises at least one of exons 1a, 1b and 2 of the BRCA1 gene.
45. A method of any one of embodiments 37 to 43 where the amplified sequence comprises exons 1a, 1b and 2 of the BRCA1 gene.
46. A method of any one of embodiments 37-45 where the detection of the gene amplification is achieved by quantifying copies of a sequence included in the amplified region.
47. A method of any one of embodiments 37-46 where the detection of the gene amplification is achieved by measuring the size of a genomic sequence encompassing the amplified sequence.
48. A method of any one of embodiments 37-47 where the detection of the gene amplification is achieved by making use of polymerase chain reaction or other DNA amplification techniques.
49. A method of any one of embodiments 37 to 48 where the detection of the gene amplification is achieved by quantitative polymerase chain reaction.
50. A method of any one of embodiments 37-48 where the detection of the gene amplification is achieved by multiplex, ligation-dependent probe amplification (MLPA).
51. A method of any one of embodiments 37-48 where the detection of the gene amplification is achieved by array-based comparative genomic hybridization (aCGH).
52. A method of any one of embodiments 37-48 where the detection of the gene amplification is achieved by quick multiplex PCR of short fragments (QMPSF).
53. A method of any one of embodiments 37-48 wherein the downstream and upstream primers are respectively selected from the group of:
for a downstream primer:
54. A method of any one of embodiments 37-48 using two or more primers chosen from BRCA1-A3A-F (SEQ ID 25), BRCA1-A3A-R (SEQ ID 26), BRCA1-Synt1-F (SEQ ID 125) and BRCA1-Synt1-R (SEQ ID 126) or their reverse complementary sequences.
55. A method of any one of embodiments 37-48 using the Synt 1 probe (SEQ ID NO: 133).
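Embodiments 37 and 47 refer to detecting a repeat array of at least three copies by measuring the size of a genomic sequence encompassing the amplified sequence. The arithmetic behind such a size-based readout can be sketched as follows. This is only an illustrative sketch: the function names, the assumption that each extra copy lengthens the segment by exactly one repeat unit, and the example lengths are hypothetical and are not taken from the embodiments.

```python
def estimate_copy_number(measured_kb: float, reference_kb: float, unit_kb: float) -> int:
    """Estimate the number of tandem copies of a repeat unit.

    Assumption (not from the source): the reference segment carries one copy,
    and every additional copy adds exactly unit_kb to the measured length, so
    copies = 1 + (measured - reference) / unit, rounded to the nearest integer.
    """
    if unit_kb <= 0:
        raise ValueError("unit_kb must be positive")
    extra_copies = round((measured_kb - reference_kb) / unit_kb)
    # A segment shorter than the reference cannot have fewer than zero copies.
    return max(0, 1 + extra_copies)


def is_repeat_array(measured_kb: float, reference_kb: float, unit_kb: float,
                    min_copies: int = 3) -> bool:
    """Return True when the estimate reaches the 'at least three copies' threshold."""
    return estimate_copy_number(measured_kb, reference_kb, unit_kb) >= min_copies


# Hypothetical lengths: a 20 kb reference segment and a 6 kb repeat unit.
print(estimate_copy_number(32.0, 20.0, 6.0))  # two extra units -> 3 copies
print(is_repeat_array(26.3, 20.0, 6.0))       # about one extra unit -> 2 copies -> False
```

Rounding to the nearest integer is a simplification; a real measurement would need an error model for the length estimate before a copy number is called.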
(1) 12 bases (aaaaggcgcgcc) containing the restriction site sequence for AscI (GGCGCGCC) have been added for cloning purposes.
(2) 12 bases (aaaattaattaa) containing the restriction site sequence for PacI (TTAATTAA) have been added for cloning purposes.
(3) Coordinates relative to BAC RP11-831F13, according to NCBI Build 36.1 (hg18).
(4) B = blue, G = green, R = red.
(3) Coordinates relative to BAC RP11-486017, according to NCBI Build 36.1 (hg18).
(4) B = blue, G = green, R = red.
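The two 12-base tails noted above can be checked for their restriction sites with a few lines of code. This is a minimal sketch: the tail and site sequences are those given in the notes, while the helper names, the choice to place the tail at the 5' end of the primer, and the placeholder primer sequence are assumptions made for illustration only.

```python
# Tail and recognition-site sequences as given in the table notes.
ASCI_TAIL = "aaaaggcgcgcc"
PACI_TAIL = "aaaattaattaa"
SITES = {"AscI": "GGCGCGCC", "PacI": "TTAATTAA"}


def tail_has_site(tail: str, site: str) -> bool:
    """Check that a 12-base tail contains the given recognition site."""
    return len(tail) == 12 and site.upper() in tail.upper()


def add_tail(tail: str, primer: str) -> str:
    """Prepend the cloning tail to a primer (5' placement is an assumption)."""
    return tail + primer


# Placeholder primer, NOT one of the sequences of Tables 1 or 2.
placeholder_primer = "ACGTACGTACGTACGTAC"

print(tail_has_site(ASCI_TAIL, SITES["AscI"]))
print(tail_has_site(PACI_TAIL, SITES["PacI"]))
print(add_tail(ASCI_TAIL, placeholder_primer))
```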
Homo sapiens
The present application is a continuation of U.S. Ser. No. 13/665,404, filed Oct. 31, 2012, which claims priority to U.S. Provisional Application No. 61/553,906, filed Oct. 31, 2011, the entire contents of which are incorporated herein by reference. On Oct. 30, 2012, International Application PCT/IB/2012/002422 was also filed with the same title, the entire contents of which are incorporated herein by reference.
Number | Date | Country |
---|---|---|
61553906 | Oct 2011 | US |

Relation | Number | Date | Country |
---|---|---|---|
Parent | 13665404 | Oct 2012 | US |
Child | 14528616 | | US |