The present invention concerns the fields of medicine and tumour molecular biology. In particular, the invention relates to means and methods for diagnosis and prognosis of BRCA1-like tumours using tumour classification based on specific genomic copy number alterations (CNA) by techniques such as array CGH.
Breast cancer is the most common cancer in developed countries and one of the leading causes of death in women; one out of every nine women will be affected by breast cancer (Weir et al., 2003, J Natl Cancer Inst. 95(17):1276-99; ACS 2003-2004). Approximately 10-15% of breast cancer cases show a positive family history for breast cancer (Anton-Culver et al., 1996, Genet Epidemiol. 13(2):193-205), and of those approximately 25-50% are due to a mutation in the breast cancer predisposition genes BRCA1 and BRCA2 (Narod and Foulkes, 2004, Nat Rev Cancer. 4(9):665-76). Women carrying a mutation in BRCA1/2 have a lifetime risk of breast cancer of up to 80% (Ford 1998, Easton 1995, Antoniou 2003, King 2003). Identification of a BRCA1/2 mutation in a patient may not only influence the treatment of the patient (e.g. radiation, bilateral prophylactic mastectomy (Tercyak 2006), or oophorectomy) and surveillance (tumour prevention), but also allows pre-symptomatic mutation screening of family members.
Based on family history and age of onset, breast cancer patients are eligible for DNA screening for pathogenic mutations in BRCA1/2. Diagnostics currently include mutation scanning and sequencing of gene fragments in germ line DNA. All of these techniques have disadvantages, and some mutations remain undetected (Van der Hout 2006). It is estimated that in 20-30% of the BRCA1-linked families no mutation is found (Narod 1995, Ford 1998). Additionally, the detection of variants of unknown clinical significance complicates counselling and clinical management. Therefore, an additional tool that would indicate BRCA1 and BRCA2 involvement in breast cancer would be an asset to the current clinical diagnostics.
Numerous studies show specific genetic characteristics on which tumours can be categorised in subclasses. Due to the diversity of tumours, in many cases multiple features, i.e. characteristics, are needed to be able to distinguish between these subclasses. An objective method that would be able to discriminate between tumour types could help counselees and clinicians in their decision of treatment (van 't Veer 2002 Nature. 415(6871):530-6; Hannemann 2006, Breast Cancer Res. 8(5):R61). For hereditary BRCA1 cancer, previous publications from us and others show that these tumours develop distinct genetic alterations on which they can be recognised and be distinguished from non-hereditary, i.e. sporadic, tumours. Various methods using expression profiling (Hedenfalk 2001) or comparative genomic hybridisation (CGH) (Wessels 2002, Van Beers 2005, Jonsson 2005) show specific genetic alterations for these tumour groups. Although tumour mRNA has led to many molecular portraits, fresh frozen tissue is not often available especially when family screening includes diseased relatives. Formalin fixation and embedding in paraffin (FFPE) on the other hand is the common procedure for all hospitals to store tumour tissue. To perform CGH studies, we have previously shown that paraffin embedded tumours are of adequate quality (Van Beers 2006). The enhanced resolution of a micro array, compared with metaphase CGH (Wessels 2002), can improve the sensitivity and specificity of the detection of BRCA1 or BRCA2 tumours using CGH technology. Additionally, it will also provide a better estimate of the location of the chromosomal breakpoints of the genetic aberrations.
Due to the large numbers of families and individuals that are eligible for DNA screening, it is not feasible to extend the current diagnostic assays beyond the currently offered tests and to screen all family members. Also, prediction models that calculate the risk of carrying a mutation from family history often perform poorly, especially for small families. An objective pre-screening test, based on the tumour only, that would indicate the involvement of BRCA1 in a family could help to select only those patients who need more extensive analysis. Genomic profiling of tumours using comparative genomic hybridisation could be such a strategy; however, this approach has not previously been validated in a diagnostic setting.
It is an object of the present invention to provide for a method and means for prognostic and/or diagnostic genomic profiling of tumours for BRCA1 involvement.
The term “hybridisation” refers to the binding of two single stranded nucleic acids via complementary base pairing. The terms “hybridizing specifically to”, “specific hybridization”, and “selectively hybridize to,” as used herein refer to the binding, duplexing, or hybridizing of a nucleic acid molecule preferentially to a particular nucleotide sequence under stringent conditions. The term “stringent conditions” refers to conditions under which a probe will hybridize preferentially to its target subsequence, and to a lesser extent to, or not at all to, other sequences in a mixed population (e.g., a cell lysate or DNA preparation from a tissue biopsy). “Stringent hybridization” and “stringent hybridization wash conditions” in the context of nucleic acid hybridization (e.g., as in array, Southern or northern hybridizations) are sequence dependent, and are different under different environmental parameters. An extensive guide to the hybridization of nucleic acids is found in, e.g., Tijssen (1993) Laboratory Techniques in Biochemistry and Molecular Biology—Hybridization with Nucleic Acid Probes part I, Ch. 2, “Overview of principles of hybridization and the strategy of nucleic acid probe assays,” Elsevier, N.Y. (“Tijssen”). Generally, highly stringent hybridization and wash conditions are selected to be about 5° C. lower than the thermal melting point (Tm) for the specific sequence at a defined ionic strength and pH. The Tm is the temperature (under defined ionic strength and pH) at which 50% of the target sequence hybridizes to a perfectly matched probe. Very stringent conditions are selected to be equal to the Tm for a particular probe. An example of stringent hybridization conditions for hybridization of complementary nucleic acids which have more than 100 complementary residues on an array or on a filter in a Southern or northern blot is 42° C. using standard hybridization solutions (see, e.g., Sambrook and Russell (2001) Molecular Cloning: A Laboratory Manual (3rd ed.) Vol. 1-3, Cold Spring Harbor Laboratory, Cold Spring Harbor Press, NY, and detailed discussion, below), with the hybridization being carried out overnight. An example of highly stringent wash conditions is 0.15 M NaCl at 72° C. for about 15 minutes. An example of stringent wash conditions is a 0.2×SSC wash at 65° C. for 15 minutes (see, e.g., Sambrook supra. for a description of SSC buffer). Often, a high stringency wash is preceded by a low stringency wash to remove background probe signal. An example of a medium stringency wash for a duplex of, e.g., more than 100 nucleotides, is 1×SSC at 45° C. for 15 minutes. An example of a low stringency wash for a duplex of, e.g., more than 100 nucleotides, is 4× to 6×SSC at 40° C. for 15 minutes.
The term “nucleic acid” or “polynucleotide” as used herein refers to a deoxyribonucleotide or ribonucleotide in either single- or double-stranded form. The term encompasses nucleic acids containing known analogues of natural nucleotides which have similar or improved binding properties, for the purposes desired, as the reference nucleic acid. The term also includes nucleic acids which are metabolized in a manner similar to naturally occurring nucleotides or at rates that are improved for the purposes desired. The term also encompasses nucleic-acid-like structures with synthetic backbones. DNA backbone analogues provided by the invention include phosphodiester, phosphorothioate, phosphorodithioate, methylphosphonate, phosphoramidate, alkyl phosphotriester, sulfamate, 3′-thioacetal, methylene(methylimino), 3′-N-carbamate, morpholino carbamate, and peptide nucleic acids (PNAs); see Oligonucleotides and Analogues, a Practical Approach, edited by F. Eckstein, IRL Press at Oxford University Press (1991); Antisense Strategies, Annals of the New York Academy of Sciences, Volume 600, Eds. Baserga and Denhardt (NYAS 1992); Milligan (1993) J. Med. Chem. 36:1923-1937; Antisense Research and Applications (1993, CRC Press). PNAs contain non-ionic backbones, such as N-(2-aminoethyl)glycine units. Phosphorothioate linkages are described in WO 97/03211; WO 96/39154; Mata (1997) Toxicol. Appl. Pharmacol. 144:189-197. Other synthetic backbones encompassed by the term include methyl-phosphonate linkages or alternating methylphosphonate and phosphodiester linkages (Strauss-Soukup (1997) Biochemistry 36: 8692-8698), and benzylphosphonate linkages (Samstag (1996) Antisense Nucleic Acid Drug Dev 6: 153-156).
The terms “array”, “micro-array”, “nucleic acid array” and “biochip” are used herein interchangeably. They refer to an arrangement, on a substrate surface, of multiple nucleic acid molecules of predetermined identity, of which preferably the sequences are known. Each nucleic acid molecule is immobilized to a “discrete spot” (i.e., a defined location or assigned position) on the substrate surface. The term “micro-array” more specifically refers to an array that is miniaturized so as to require microscopic examination for visual evaluation. The arrays used in the methods of the invention are preferably microarrays. The nucleic acid array as used herein is a plurality of target elements, each target element comprising one or more nucleic acid molecules (probes) immobilized on one or more solid surfaces to which sample nucleic acids can be hybridized. The nucleic acids of a probe can contain sequence(s) from specific genes or clones, e.g. from specific genomic regions described in Table 1. Other probes may contain, for instance, reference sequences. The probes of the arrays may be arranged on the solid surface at different densities. The probe densities will depend upon a number of factors, such as the nature of the label, the solid support, and the like. One of skill will recognize that each probe may comprise a mixture of nucleic acids of different lengths and sequences. Thus, for example, a probe may contain more than one copy of a cloned piece of DNA or RNA, and each copy may be broken into fragments of different lengths. The length and complexity of the nucleic acid fixed onto the target element is not critical to the invention. One of skill can adjust these factors to provide optimum hybridization and signal production for a given hybridization procedure, and to provide the required resolution among different genes or genomic locations.
The term “probe” or “nucleic acid probe”, as used herein, is defined to be one or more nucleic acid fragments whose specific hybridization to a sample can be detected. The probe may be unlabelled or labelled as described below so that its binding to the target or sample can be detected. The probe is produced from a source of nucleic acids from one or more particular (preselected) portions of a chromosome, e.g., one or more clones, an isolated whole chromosome or chromosome fragment, or a collection of polymerase chain reaction (PCR) amplification products. The probes of the present invention are produced from nucleic acids found in the regions described herein.
The probe may also be isolated nucleic acids immobilized on a solid surface (e.g., nitrocellulose, glass, quartz, fused silica slides), as in an array. In some embodiments, the probe may be a member of an array of nucleic acids as described, for instance, in WO 96/17958. Techniques capable of producing high density arrays can also be used for this purpose (see, e.g., Fodor (1991) Science 767-773; Johnston (1998) Curr. Biol. 8: R171-R174; Schummer (1997) Biotechniques 23: 1087-1092; Kern (1997) Biotechniques 23: 120-124; U.S. Pat. No. 5,143,854). One of skill will recognize that the precise sequence of the particular probes described herein can be modified to a certain degree to produce probes that are “substantially identical” to the disclosed probes, but retain the ability to specifically bind to (i.e., hybridize specifically to) the same targets or samples as the probe from which they were derived (see discussion above). Such modifications are specifically covered by reference to the individual probes described herein.
As used herein, a “test nucleic acid sample” or “test nucleic acids” refers to nucleic acids comprising sequences whose quantity or degree of representation (e.g., copy number) or sequence identity is being assayed. Similarly, “test genomic acids” or a “test genomic sample” refers to genomic nucleic acids comprising sequences whose quantity or degree of representation (e.g., copy number) or sequence identity is being assayed.
As used herein, a “reference nucleic acid sample” or “reference nucleic acids” refers to nucleic acids comprising sequences whose quantity or degree of representation (e.g., copy number) or sequence identity serves as a reference to which one or more test samples are compared and more preferably the quantity or degree of representation (e.g., copy number) or sequence identity of the reference sample is known.
The term “sample” as used herein relates to a material or mixture of materials, containing one or more components of interest. Samples include, but are not limited to, samples obtained from an organism and may be directly obtained from a source (e.g., such as a biopsy or from a tumor) or indirectly obtained e.g., after culturing and/or one or more processing steps.
The term “genome” refers to all nucleic acid sequences (coding and non-coding) and elements present in each cell type, preferably each somatic cell type, of a subject. The term genome also applies to any naturally occurring or induced variation of these sequences that may be present in a mutant or disease variant of any cell type, including tumour cells. The terms “genomic DNA” and “genomic nucleic acid” are used herein interchangeably. They refer to nucleic acid isolated from a nucleus of one or more cells, and include nucleic acid derived from (i.e., isolated from, amplified from, cloned from as well as synthetic versions of) genomic DNA. For example, the human genome consists of approximately 3.0×10⁹ base pairs of DNA organised into distinct chromosomes. The genome of a normal diploid somatic human cell consists of 22 pairs of autosomes (chromosomes 1 to 22) and either chromosomes X and Y (males) or a pair of X chromosomes (females) for a total of 46 chromosomes. A genome of a cancer cell may contain variable numbers of each chromosome in addition to deletions, rearrangements and amplification of any subchromosomal region or DNA sequence.
As used herein, the terms “genomic locus” and “genomic region” refer to a defined portion of a genome. Likewise the terms “chromosomal locus” and “chromosomal region” refer to a defined portion of a chromosome. For practical purposes the terms “genomic locus”, “genomic region”, “chromosomal region” and “chromosomal locus” are used interchangeably herein. In the methods of the invention, each nucleic acid probe immobilised to a discrete spot on an array has a sequence that is specific to (or characteristic of) a particular genomic region. In an array-based comparative genomic hybridisation experiment, the ratio of intensity of two differentially labelled test and reference samples at a given spot on the array reflects the genome copy number ratio of the two samples at a particular genomic region.
If a surface-bound polynucleotide or probe “corresponds to” a genomic region, the polynucleotide usually contains a sequence of nucleic acids that is unique to that genomic region. Accordingly, a surface-bound polynucleotide that corresponds to a particular genomic region usually specifically hybridizes to a labelled nucleic acid made from that genomic region, relative to labelled nucleic acids made from other genomic regions.
“CGH” or “Comparative Genomic Hybridisation” refers generally to techniques for identification of chromosomal alterations (such as in cancer cells, for example). Using CGH, ratios between tumour or test sample and normal or reference sample enable the detection of chromosomal amplifications and deletions of regions.
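As an illustration of how a test/reference intensity ratio at one array spot can be turned into a gain or loss call, the following minimal Python sketch may be helpful. It is not part of the described method: the function names, the example intensities and the log2 cut-off of 0.2 are all hypothetical and chosen only for the example.

```python
import math


def log2_ratio(test_intensity: float, reference_intensity: float) -> float:
    """Return log2(test / reference) for the signals measured at one spot."""
    if test_intensity <= 0 or reference_intensity <= 0:
        raise ValueError("intensities must be positive")
    return math.log2(test_intensity / reference_intensity)


def call_spot(ratio_log2: float, threshold: float = 0.2) -> str:
    """Label a spot as 'gain', 'loss' or 'no change'.

    The threshold is a hypothetical cut-off, not a value taken from the text.
    """
    if ratio_log2 > threshold:
        return "gain"
    if ratio_log2 < -threshold:
        return "loss"
    return "no change"


# Hypothetical (test, reference) intensities for three spots.
spots = {
    "spot_A": (1500.0, 1000.0),
    "spot_B": (500.0, 1000.0),
    "spot_C": (1000.0, 1000.0),
}
calls = {name: call_spot(log2_ratio(t, r)) for name, (t, r) in spots.items()}
```

In practice the raw intensities would typically be background-corrected and normalised before any such call is made; those steps are omitted here.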
The terms “tumour” and “cancer” in an animal (e.g., a human) refer to the presence of cells possessing characteristics such as atypical growth or morphology, including uncontrolled proliferation, immortality, metastatic potential, rapid growth and proliferation rate, and certain characteristic morphological features. Often, cancer cells will be in the form of a tumour, but such cells may also exist in isolation from one another within an animal. “Tumour” includes both benign and malignant neoplasms.
As used herein, “BRCA1-associated tumour” means a tumour having cells containing a mutation of the BRCA1 locus.
As used herein, “non BRCA1/2 HBOC tumours” refer to tumours in a group of patients with a high risk for BRCA1-associated breast cancer (patients from Hereditary Breast and Ovarian Cancer families) but with a negative screen result for BRCA1 and BRCA2 mutation. Such patients are from families that include at least two breast cancer cases and one ovarian cancer case; these families are referred to as HBOC families (Hereditary Breast and Ovarian Cancer).
The present invention is based in part on the discovery that certain chromosomal copy number aberrations (CNA) in tumour cells make it possible to distinguish between BRCA1-associated tumours and sporadic tumours. These aberrations in chromosomal copy number comprise a set of at least 10 chromosomal regions 1p21-34, 3p21-31 (which is herein understood to mean 3p21, more preferably 3p21.1-21.31), 3q22-27, 5q13-15, 5q21-23, 6p22-23, 10p14, 12q21-23, 13q32-33 (which is herein understood to mean 13q31-33), 14q22-24 (Table 1) and some smaller regions as indicated by a list of BAC clones (Table 2). Methods wherein the copy number of at least a subset of these genomic regions is determined are useful for diagnosis and/or prognosis of breast cancer, as well as ovarian cancer and other types of tumours.
Genomic instability is a hallmark of solid tumours, and virtually no solid tumour exists that does not show some alterations of the genome. In some cases these chromosomal abnormalities are characteristic for the specific type of tumour and may thus serve as a marker for differentiation between tumour-types.
In a first aspect therefore, the present invention relates to a method for classifying a sample of cells as comprising cells from a BRCA1-associated tumour or from a sporadic tumour. The method comprises detecting the number of copies per cell in genomic DNA in the sample at at least three genomic locations selected from the group consisting of 1p21-34, 3p21, 3q22-27, 5q13-15, 5q21-23, 6p22-23, 10p14, 12q21-23, 13q31-33, and 14q22-24. Preferably in the method, an increase in the number of copies per cell of DNA in genomic locations selected from the group consisting of 1p21-34, 3q22-27, 6p22-23, 10p14, and 13q31-33, and/or a decrease in the number of copies per cell of DNA in genomic locations selected from the group consisting of 3p21, 5q13-15, 5q21-23, 12q21-23, and 14q22-24, compared to the number of copies per cell in non-cancer cells, classifies the cell sample as from a BRCA1-associated tumour.
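The direction of change expected at each genomic location can be made concrete with a short Python sketch. It is only an illustration under stated assumptions: it counts the locations whose call agrees with the BRCA1-associated direction and applies a hypothetical minimum count (`min_concordant`). The text does not prescribe this particular decision rule, and the function names and output labels are invented for the example.

```python
# Directions of change associated with BRCA1-associated tumours, as listed in the text.
GAIN_REGIONS = {"1p21-34", "3q22-27", "6p22-23", "10p14", "13q31-33"}
LOSS_REGIONS = {"3p21", "5q13-15", "5q21-23", "12q21-23", "14q22-24"}


def count_concordant_regions(calls: dict) -> int:
    """Count locations whose call ('gain', 'loss' or 'no change') matches
    the direction listed for BRCA1-associated tumours."""
    if len(calls) < 3:
        raise ValueError("at least three genomic locations are required")
    unknown = set(calls) - GAIN_REGIONS - LOSS_REGIONS
    if unknown:
        raise ValueError(f"unknown genomic locations: {sorted(unknown)}")
    concordant = 0
    for region, call in calls.items():
        if region in GAIN_REGIONS and call == "gain":
            concordant += 1
        elif region in LOSS_REGIONS and call == "loss":
            concordant += 1
    return concordant


def has_brca1_like_pattern(calls: dict, min_concordant: int = 3) -> bool:
    """Hypothetical rule: flag the sample when enough locations are concordant."""
    return count_concordant_regions(calls) >= min_concordant


example_calls = {"3q22-27": "gain", "5q13-15": "loss", "13q31-33": "gain", "10p14": "no change"}
result = has_brca1_like_pattern(example_calls)
```

A weighted or statistically trained classifier could replace the simple count; the sketch only shows how the gain and loss lists map onto per-location calls.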
These locations may be detected individually, or in combination. Thus, for example, in some embodiments, 3, 4, 5, 6, 7, 8, 9, or 10 of the above-listed chromosomal locations may be detected. Most preferably all 10 of the above-listed chromosomal locations are detected (as also listed in Table 1). In a preferred method of the invention the detected genomic locations are selected from the group consisting of at least 5q13-15, 3q22-27, 13q31-33, 12q21-23, 10p14, 3p21, 14q22-24, 6p22-23, and 5q21-23. In a more preferred method of the invention the detected genomic locations are selected from the group consisting of at least 5q13-15, 3q22-27, 13q31-33, 12q21-23, 10p14, 3p21, 14q22-24 and 6p22-23. In a further more preferred method of the invention the detected genomic locations are selected from the group consisting of at least 5q13-15, 3q22-27, 13q31-33, 12q21-23, 10p14, 3p21 and 14q22-24. In yet a further preferred method of the invention the detected genomic locations are selected from the group consisting of at least 5q13-15, 3q22-27, 13q31-33, 12q21-23, 10p14 and 3p21. In again a further preferred method of the invention the detected genomic locations are selected from the group consisting of at least 5q13-15, 3q22-27, 13q31-33, 12q21-23 and 10p14. In still a further preferred method of the invention the detected genomic locations are selected from the group consisting of at least 5q13-15, 3q22-27, 13q31-33 and 12q21-23. In the most preferred method of the invention the detected genomic locations are selected from the group consisting of at least 5q13-15, 3q22-27 and 13q31-33.
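The successively smaller groups of locations listed above are nested, so they can be represented as prefixes of a single ordered list. The Python sketch below shows this; the ordering is inferred from the groups as written (with 1p21-34 placed last because it appears only in the full set of 10), and the names `REGION_ORDER` and `region_panel` are hypothetical.

```python
# Order inferred from the nested groups in the text; 1p21-34 occurs only in the full set.
REGION_ORDER = [
    "5q13-15", "3q22-27", "13q31-33", "12q21-23", "10p14",
    "3p21", "14q22-24", "6p22-23", "5q21-23", "1p21-34",
]


def region_panel(size: int) -> list:
    """Return the first `size` locations of the inferred order (3 to 10)."""
    if not 3 <= size <= len(REGION_ORDER):
        raise ValueError("panel size must be between 3 and 10")
    return REGION_ORDER[:size]


smallest_panel = region_panel(3)
full_panel = region_panel(10)
```

Each group in the text is recited as containing "at least" the listed locations, so a panel returned here is a minimum set rather than an exhaustive one.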
The methods of the invention may further comprise detecting the number of copies per cell of genomic DNA in the cell sample at at least one or two genomic locations selected from 5q13-15, 3q22-27 and 13q31-33, wherein a decrease in the number of copies per cell of DNA in these genomic locations, compared to the number of copies per cell in non-cancer cells, classifies the cell sample as from a sporadic tumour.
The above-listed genomic locations of interest in the present invention are bounded by BAC probes as listed in Table 1.
Single or low-copy number probes that detect DNA within the above genomic locations are particularly useful for use in the invention. A list of exemplary BAC clones that may be used to detect or generate probes to detect the various genomic locations is provided in Table 2. However, it should be understood that this list is not intended to limit the invention and other probes within the genomic locations can also be used. Also, the term “probe” should be understood in its broadest sense to include any nucleic acid molecule that by hybridisation with a complementary sequence in the given genomic location is capable of detecting this location. Cytogenetic banding or chromosome banding is a technique well known to the skilled person, and e.g. Cheung et al. (2001) Nature 409:953-958; Furey and Haussler (2003) Human Molecular Genetics 12:1037-1044; and Speicher and Carter (2005) Nature Genetics 6:782-792 describe how the chromosome banding is mapped on the genome. Probe molecules for use in the methods of the invention thus range from synthetic oligonucleotide probes and/or (amplification) primers to artificial chromosomes of more than 1 Mb, depending on the particular technique that is used for determination of copy number as described below. Probes useful in the methods described here are available from a number of sources. For instance, P1 clones are available from the DuPont P1 library (Shepard, et al., Proc. Natl. Acad. Sci. USA, 92: 2629 (1994)), and available commercially from Genome Systems. Various libraries spanning entire chromosomes are also available commercially (Clonetech, South San Francisco, Calif.), or from the Los Alamos National Laboratory. The present inventors used the human 3600 BAC/PAC genomic clone set, covering the full human genome at 1 Mb spacing as may be obtained from the Wellcome Trust Sanger Institute (http://www.sanger.ac.uk/).
Information on this clone set can be obtained at the BAC/PAC Resources Center Web Site (http://bacpac.chori.org). Preferred probes for use in the methods of the invention comprise at least 10, 12, 15, 18, 20, 22, 30, 50 or 100 contiguous nucleotides of a (human genomic) sequence that is present in a BAC clone listed in Tables 1 and 2. More preferred nucleic acid probes comprise a sequence that is unique in the genome, preferably the human genome.
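The length criterion for preferred probes (a run of contiguous nucleotides shared with a clone sequence) can be checked with a few lines of Python. This is a toy sketch under stated assumptions: the sequences are invented, both strands of the clone are considered, and the function names are hypothetical. It does not test whether a sequence is unique in the genome, which would require a search against a full reference genome.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def shares_contiguous_run(probe: str, clone_sequence: str, min_length: int = 20) -> bool:
    """True if the probe contains at least `min_length` contiguous nucleotides
    that also occur in the clone sequence on either strand."""
    probe = probe.upper()
    strands = (clone_sequence.upper(), reverse_complement(clone_sequence))
    for start in range(len(probe) - min_length + 1):
        window = probe[start:start + min_length]
        if any(window in strand for strand in strands):
            return True
    return False


# Invented toy sequences, far shorter than a real BAC insert.
toy_clone = "ACGTACGTTAGCCGATAGGCTA"
forward_hit = shares_contiguous_run("TTAGCCGATAGG", toy_clone, min_length=10)
reverse_hit = shares_contiguous_run(reverse_complement("TTAGCCGATAGG"), toy_clone, min_length=10)
```

The `min_length` argument corresponds to the thresholds named in the text (10, 12, 15, 18, 20, 22, 30, 50 or 100 nucleotides).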
Techniques for the preparation and manipulation of nucleic acid probes are well-known in the art (see, for example, Sambrook and Russell (2001) “Molecular Cloning: A Laboratory Manual (3rd edition), Cold Spring Harbor Laboratory, Cold Spring Harbor Laboratory Press, New York; P. Tijssen “Hybridisation with Nucleic Acid Probes-Laboratory Techniques in Biochemistry and Molecular Biology (Parts I and II)”, 1993, Elsevier Science; “PCR Strategies”, 1995, M. A. Innis (Ed.), Academic Press: New York, N.Y.; and “Short Protocols in Molecular Biology”, 2002, F. M. Ausubel (Ed.), 5th Ed., John Wiley & Sons). Nucleic acid probes may be obtained and manipulated by cloning into various vehicles. They may be screened and re-cloned or amplified from any source of genomic DNA. Nucleic acid probes may be derived from genomic clones including mammalian and human artificial chromosomes (MACs and HACs, respectively, which can contain inserts from about 5 to 400 kilobases (kb)), satellite artificial chromosomes or satellite DNA-based artificial chromosomes (SATACs), yeast artificial chromosomes (YACs; 0.2-1 Mb in size), bacterial artificial chromosomes (BACs; up to 300 kb); P1 artificial chromosomes (PACs; about 70-100 kb) and the like. MACs and HACs have been described (see e.g. W. Roush, Science, 1997, 276: 38-39; M. A. Rosenfeld, Nat. Genet. 1997, 15: 333-335; F. Ascenzioni et al., Cancer Lett. 1997, 118: 135-142; Y Kuroiwa et al., Nat. Biotechnol. 2000, 18: 1086-1090; J. E. Meija et al., Am. J. Hum. Genet. 2001, 69: 315-326; and C. Auriche et al., EMBO Rep. 2001, 2: 102-107). SATACs can be produced by induced de novo chromosome formation in cells of different mammalian species (see e.g. P. E. Warburton and D. Kiplin, Nature, 1997, 386: 553-555; E. Csonka et al., J. Cell. Sci. 2000, 113: 3207-3216; and G. Hadlaczky, Curr. Opin. Mol. Ther. 2001, 3: 125-132). 
Nucleic acid probes may alternatively be derived from YACs, which have been used for many years for the stable propagation of genomic fragments of up to one million base pairs in size (see e.g J. M. Feingold et al., Proc. Natl. Acad. Sci. USA, 1990, 87:8637-8641; G. Adam et al., Plant J., 1997, 11: 1349-1358; R. M. Tucker and D. T. Burke, Gene, 1997, 199: 25-30; and M. Zeschnigk et al., Nucleic Acids Res., 1999, 27: E30). BACs may also be used to produce nucleic acid probes for use in the practice of the present invention. BACs, which are based on the E. coli F factor plasmid system, offer the advantage of being easy to manipulate and purify in microgram quantities (see e.g. S. Asakawa et al., Gene, 1997, 191: 69-79; and Y. Cao et al., Genome Res. 1999, 9: 763-774). PACs are bacteriophage P1-derived vectors (see, for example, P. A. Ioannou et al., Nature Genet., 1994, 6: 84-89; J. Boren et al., Genome Res. 1996, 6: 1123-1130; H. G. Nothwang et al., Genomics, 1997, 41: 370-378; L. H. Reid et al., Genomics, 1997, 43: 366-375; and P. Y. Woon et al., Genomics, 1998, 50: 306-316). Nucleic acid probes may also be obtained and manipulated by cloning into other cloning vehicles such as, for example, recombinant viruses, cosmids, or plasmids. Alternatively, nucleic acid sequences used as array-immobilised nucleic acid probes may be synthesised in vitro by chemical techniques well-known in the art. These methods have been described (see e.g. Nucleic Acids Res. 1997, 25: 3440-3444; M. J. Blommers et al., Biochemistry, 1994, 33: 7886-7896; and K. Frenkel et al., Free Radic. Biol. Med. 1995, 19: 373-380). An alternative to custom arraying of nucleic acid probes is to rely on commercially available arrays and micro-arrays. Such arrays have been developed, for example, by Vysis Corporation (Downers Grove, Ill.), Spectral Genomics Inc. (Houston, Tex.), and Affymetrix Inc. (Santa Clara, Calif.).
In a preferred embodiment of the method of the invention, detection of numbers of copies per cell in genomic DNA is carried out quantitatively or semi-quantitatively. It is not necessary to determine the exact copy number of the genomic regions, as detection of an aberration from the copy number in non-cancer cells, i.e. gain or loss of nucleic acid material, is sufficient. Thus, it is understood that detection of copy number includes estimation of copy numbers. Therefore, a semi-quantitative or a relative measure usually suffices. In addition, quantitative techniques may be used to determine the copy number per cell. The skilled person knows both quantitative and semi-quantitative techniques to determine copy number, e.g. semi-quantitative PCR analysis or quantitative real-time PCR.
Polymerase Chain Reaction (PCR) per se is not a quantitative technique; however, PCR-based methods have been developed that are quantitative or semi-quantitative in that they give a reasonable estimate of original copy numbers within certain limits. Examples are quantitative PCR, preferably quantitative real-time PCR (known as RT-PCR, RQ-PCR, QRT-PCR or RTQ-PCR). In addition, many techniques give estimates of relative copy numbers as calculated relative to a reference, e.g. many array techniques. Absolute copy number estimates may be obtained by in situ hybridization techniques (ISH), e.g. fluorescence in situ hybridization (FISH) or chromogenic in situ hybridization (CISH) techniques. Hereafter, non-limiting examples are given of techniques that may be used for the analysis of copy numbers.
Techniques that permit the analysis of copy numbers of individual genomic locations are well known in the art. For example, fluorescence in-situ hybridization (FISH) can be used to study copy numbers of individual genetic loci or particular regions on a chromosome (Pinkel et al., Proc. Natl. Acad. Sci. U.S.A. 85, 9138-42 (1988)). Comparative genomic hybridization (CGH) (Kallioniemi et al. Science 258, 818-21 (1992)) may also be used (Houldsworth et al. Am J Pathol 145, 1253-60 (1994)) to probe for copy number changes of chromosomal regions.
Copy number of genomic locations may also be determined using quantitative PCR such as real-time PCR (see, e.g., Suzuki et al., Cancer Res. 60:5405-9 (2000)). For example, quantitative microsatellite analysis (QUMA) can be performed for rapid measurement of relative DNA sequence copy number. In QUMA, the copy number of a test locus relative to a pooled reference is assessed using quantitative, real-time PCR amplification of loci carrying simple sequence repeats. Use of simple sequence repeats is advantageous because of the large numbers that are mapped precisely. Additional protocols for quantitative PCR are provided in Innis et al. (1990) PCR Protocols, A Guide to Methods and Applications, Academic Press, Inc. N.Y. Other semi-quantitative methods to determine specific DNA copy numbers are Multiplex Ligation-dependent Probe Amplification (MLPA) (Schouten et al. (2002) Nucleic Acids Res 30(12):e57; Sellner and Taylor (2004) Human Mutation 23(5):413-419) and Multiplex Amplification and Probe Hybridization (MAPH) (Sellner and Taylor (2004) supra).
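One common way to derive a relative copy number from real-time PCR data is the comparative threshold-cycle (Ct) calculation, sketched below in Python. This calculation is not described in the text and is offered only as an assumption-laden illustration: it presumes close to 100% amplification efficiency (a doubling per cycle), a control locus of stable copy number, and a diploid reference sample. All names and Ct values are hypothetical.

```python
def relative_copy_number(
    ct_target_test: float,
    ct_control_test: float,
    ct_target_ref: float,
    ct_control_ref: float,
    reference_copies: float = 2.0,
) -> float:
    """Estimate copies per cell of a target locus in a test sample.

    Assumes a doubling of product per cycle and that the reference sample
    carries `reference_copies` copies of the target locus (2 for diploid DNA).
    """
    delta_test = ct_target_test - ct_control_test  # target vs control locus, test sample
    delta_ref = ct_target_ref - ct_control_ref     # target vs control locus, reference sample
    return reference_copies * 2.0 ** -(delta_test - delta_ref)


# Hypothetical Ct values: the target amplifies one cycle later in the test sample.
single_copy_loss = relative_copy_number(25.0, 24.0, 24.0, 24.0)
# Hypothetical Ct values: the target amplifies one cycle earlier in the test sample.
two_copy_gain = relative_copy_number(23.0, 24.0, 24.0, 24.0)
```

Since only gain or loss relative to non-cancer cells needs to be established, an estimate of this kind is usually sufficient even when amplification efficiency deviates somewhat from the assumed value.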
However, preferably in the methods of the invention copy numbers of genomic locations are determined by hybridizations that are performed on a solid support. For example, probes that selectively hybridize to specific chromosomal regions can be spotted onto a surface. Conveniently, the spots are placed in an ordered pattern, or array, and the placement of the probes on the array is recorded to facilitate later correlation of results. The nucleic acid samples are then hybridized to the array. Thus, in the methods of the invention, copy numbers of genomic locations are preferably analysed in an array-based approach, e.g. using comparative genomic hybridisation. Any of a variety of arrays may be used in the practice of the present invention. Investigators can either rely on commercially available arrays or generate their own. Methods of making and using arrays are well known in the art (see, for example, S. Kern and G. M. Hampton, Biotechniques, 1997, 23:120-124; M. Schummer et al., Biotechniques, 1997, 23:1087-1092; S. Solinas-Toldo et al., Genes, Chromosomes & Cancer, 1997, 20: 399-407; M. Johnston, Curr. Biol. 1998, 8: R171-R174; D. D. Bowtell, Nature Gen. 1999, Supp. 21:25-32; S. J. Watson and H. Akil, Biol Psychiatry. 1999, 45: 533-543; W. M. Freeman et al., Biotechniques. 2000, 29: 1042-1046 and 1048-1055; D. J. Lockhart and E. A. Winzeler, Nature, 2000, 405: 827-836; M. Cuzin, Transfus. Clin. Biol. 2001, 8:291-296; P. P. Zarrinkar et al., Genome Res. 2001, 11: 1256-1261; M. Gabig and G. Wegrzyn, Acta Biochim. Pol. 2001, 48: 615-622; and V. G. Cheung et al., Nature, 2001, 409: 953-958; see also, for example, U.S. Pat. Nos.
5,143,854; 5,434,049; 5,556,752; 5,632,957; 5,700,637; 5,744,305; 5,770,456; 5,800,992; 5,807,522; 5,830,645; 5,856,174; 5,959,098; 5,965,452; 6,013,440; 6,022,963; 6,045,996; 6,048,695; 6,054,270; 6,258,606; 6,261,776; 6,277,489; 6,277,628; 6,365,349; 6,387,626; 6,458,584; 6,503,711; 6,516,276; 6,521,465; 6,558,907; 6,562,565; 6,576,424; 6,587,579; 6,589,726; 6,594,432; 6,599,693; 6,600,031; and 6,613,893). Arrays comprise a plurality of nucleic acid probes immobilised to discrete spots (i.e., defined locations or assigned positions) on a substrate surface. Substrate surfaces for use in the present invention can be made of any of a variety of rigid, semi-rigid or flexible materials that allow direct or indirect attachment (i.e., immobilisation) of nucleic acid probes to the substrate surface. Suitable materials include, but are not limited to: cellulose (see, for example, U.S. Pat. No. 5,068,269), cellulose acetate (see, for example, U.S. Pat. No. 6,048,457), nitrocellulose, glass (see, for example, U.S. Pat. No. 5,843,767), quartz or other crystalline substrates such as gallium arsenide, silicones (see, for example, U.S. Pat. No. 6,096,817), various plastics and plastic copolymers (see, for example, U.S. Pat. Nos. 4,355,153; 4,652,613; and 6,024,872), various membranes and gels (see, for example, U.S. Pat. No. 5,795,557), and paramagnetic or supramagnetic microparticles (see, for example, U.S. Pat. No. 5,939,261). When fluorescence is to be detected, arrays comprising cyclo-olefin polymers may preferably be used (see, for example, U.S. Pat. No. 6,063,338). The presence of reactive functional chemical groups (such as, for example, hydroxyl, carboxyl, amino groups and the like) on the material can be exploited to directly or indirectly attach nucleic acid probes to the substrate surface. Methods for immobilizing nucleic acid probes to substrate surfaces to form an array are well-known in the art.
More than one copy of each nucleic acid probe may be spotted on the array (for example, in duplicate or in triplicate). This arrangement may, for example, allow assessment of the reproducibility of the results obtained (see below). Related nucleic acid probes may also be grouped in probe elements on an array. For example, a probe element may include a plurality of related nucleic acid probes of different lengths but comprising substantially the same sequence. Alternatively, a probe element may include a plurality of related nucleic acid probes that are fragments of different lengths resulting from digestion of more than one copy of a cloned piece of DNA. An array may contain a plurality of probe elements. Probe elements on an array may be arranged on the substrate surface at different densities. Array-immobilised nucleic acid probes may be nucleic acids that contain sequences from genes (e.g., from a genomic library), including, for example, sequences that collectively cover a substantially complete genome or a subset of a genome. The sequences of the nucleic acid probes are those for which comparative copy number information is desired. For example, to obtain DNA sequence copy number information across an entire genome, an array comprising nucleic acid probes covering a whole genome or a substantially complete genome is used. However, in preferred embodiments of the method of the present invention the relevant genomic locations have already been established and there is no need for genome-wide experiments. In such instances the array may contain specific nucleic acid sequences that originate from a discrete set of genes or genomic locations as indicated above and whose copy number in association with the type of tumour is to be tested. Additionally, the array may comprise nucleic acid sequences as positive or negative controls (i.e., the nucleic acid sequences may be derived from karyotypically normal genomes).
Alternatively, the samples can be placed in separate wells or chambers and hybridized in their respective wells or chambers. It is understood in the context of the invention that an array of separate wells or chambers is also comprised within the general term “array” herein. The art has developed robotic equipment permitting the automated delivery of reagents to separate reaction chambers, including “chip” and microfluidic techniques, which allow the amount of the reagents used per reaction to be sharply reduced. Chip and microfluidic techniques are taught in, for example, U.S. Pat. No. 5,800,690, Orchid, “Running on Parallel Lines” New Scientist, Oct. 25, 1997, McCormick, et al., Anal. Chem. 69:2626-30 (1997), and Turgeon, “The Lab of the Future on CD-ROM?” Medical Laboratory Management Report. December 1997, p. 1. Automated hybridizations on chips or in a microfluidic environment are contemplated methods of practicing the invention. Although microfluidic environments are one embodiment of the invention, they are not the only defined spaces suitable for performing hybridizations in a fluid environment. Other such spaces include standard laboratory equipment, such as the wells of microtiter plates, Petri dishes, centrifuge tubes, or the like.
A preferred embodiment of the invention therefore includes analysing tumour cell samples by array-based comparative genomic hybridisation (aCGH). More specifically, certain methods of the invention comprise steps of: providing a sample of tumour DNA; analysing the tumour DNA by array-based comparative genomic hybridisation to obtain tumour genomic information; and, based on the tumour genomic information obtained, classifying the tumour as a BRCA1-related tumour or a sporadic tumour. The analysis step in the methods of the invention can be performed using any of a variety of methods, means and variations thereof for carrying out array-based comparative genomic hybridisation. Array-based CGH methods are known in the art and have been described in numerous scientific publications as well as in patents (see, for example, U.S. Pat. Nos. 5,635,351; 5,665,549; 5,721,098; 5,830,645; 5,856,097; 5,965,362; 5,976,790; 6,159,685; 6,197,501; 6,335,167; and EP 1 134 293 and EP 1 026 260; van Beers et al., Brit. J. Cancer, 2006; 20. Joosse et al., BMC Cancer. 2007, 7:43; D. Pinkel et al., Nat. Genet. 1998, 20: 207-211; J. R. Pollack et al., Nat. Genet. 1999, 23: 41-46; C. S. Cooper, Breast Cancer Res. 2001, 3: 158-175). In the practice of the present invention, these methods as well as other methods known in the art for carrying out array-based comparative genomic hybridisation may be used as described or modified such that they allow for tumour genomic information to be obtained. Tumour genomic information includes e.g. gain and loss of genetic material, chromosomal abnormalities and genome copy number changes at multiple genomic loci.
The method of the invention encompasses all kinds of tumours; however, in a preferred embodiment of the method of the invention, a BRCA1-related tumour or a sporadic tumour is a breast tumour or an ovarian tumour. Most preferably, a BRCA1-related tumour or a sporadic tumour is a breast tumour.
Test and reference nucleic acid samples for use in the methods of the present invention may be isolated from a biological sample comprising tumour or reference cells by any suitable method of DNA isolation or extraction. Methods of DNA extraction are well known in the art. A classical DNA isolation protocol is based on extraction using organic solvents such as a mixture of phenol and chloroform, followed by precipitation with ethanol (see e.g. Sambrook and Russell, 2001, supra). Other methods include: salting out DNA extraction, the trimethylammonium bromide salts DNA extraction method and the guanidinium thiocyanate DNA extraction method. There are also numerous different and versatile kits that can be used to extract DNA from bodily fluids and that are commercially available from, for example, BD Biosciences Clontech (Palo Alto, Calif.), Epicentre Technologies (Madison, Wis.), Gentra Systems, Inc. (Minneapolis, Minn.), MicroProbe Corp. (Bothell, Wash.), Organon Teknika (Durham, N.C.), and Qiagen Inc. (Valencia, Calif.). User Guides that describe in great detail the protocol to be followed are usually included in all these kits. Sensitivity, processing time and cost may be different from one kit to another. One of ordinary skill in the art can easily select the kit(s) most appropriate for a particular situation.
In the methods of the invention, the reference sample preferably is a nucleic acid sample that is representative for the normal (i.e. non-breast tumour/non-cancer cell) copy numbers of the complement of the genomic regions that are tested in the method in question. The reference may e.g. be derived from a genomic sample from a normal and/or healthy individual or from a pool of such individuals. Preferably the reference nucleic acid sample is from female individuals. It is also preferred that the reference nucleic acid sample does not comprise tumour DNA. A preferred reference nucleic acid sample consists of pooled genomic DNAs isolated from a tissue sample (e.g. lymphocytes) from a number (e.g. at least 4-10) of apparently healthy women. In another preferred embodiment, the reference nucleic acid sample may comprise an artificially-generated population of nucleic acids designed to approximate the level of nucleic acid sequences derived from each genomic region, or fragments thereof, of which the copy number is determined in the tumour samples. In yet another embodiment, the reference nucleic acid sample may be derived from normal cell lines or cell line samples.
In the methods of the invention the extracted test and/or reference nucleic acids may be labelled with a detectable agent or moiety before being analysed by hybridisation. Preferably, the detectable agent is selected such that it generates a signal which can be measured and whose intensity is related (e.g., proportional) to the amount of labelled nucleic acids present in the sample being analysed. In array-based hybridisation methods of the invention, the detectable agent is also preferably selected such that it generates a localised signal, thereby allowing resolution of the signal from each spot on the array.
Methods for labelling nucleic acid fragments are well-known in the art. For a review of labelling protocols, label detection techniques and recent developments in the field, see, for example, L. J. Kricka, Ann Clin. Biochem. 2002, 39: 114-129; R. P. van Gijlswijk et al., Expert Rev. Mol. Diagn. 2001, 1: 81-91; and S. Joos et al., J. Biotechnol. 1994, 35: 135-153. Standard nucleic acid labelling methods include: incorporation of radioactive agents, direct attachment of fluorescent dyes or of enzymes, chemical modifications of nucleic acid fragments making them detectable immunochemically or by other affinity reactions, and enzyme-mediated labelling methods, such as random priming, nick translation, PCR and tailing with terminal transferase. A preferred, more recently developed nucleic acid labelling system is ULS (Universal Linkage System), which is based on the reaction of monoreactive cisplatin derivatives with the N7 position of guanine moieties in DNA (see, for example, R. J. Heetebrij et al., Cytogenet. Cell. Genet. 1999, 87: 47-52). Other suitable labelling systems include e.g. psoralen-biotin, photoreactive azido derivatives, and DNA alkylating agents.
Any of a wide variety of detectable agents can be used in the practice of the present invention. Suitable detectable agents include, but are not limited to: various ligands, radionuclides (such as for example, 32P, 35S, 3H, 14C, 125I, 131I, and the like); fluorescent dyes (for specific exemplary fluorescent dyes, see below); chemiluminescent agents (such as, for example, acridinium esters, stabilised dioxetanes and the like); microparticles (such as, for example, quantum dots, nanocrystals, phosphors and the like); enzymes (such as, for example, those used in an ELISA, i.e., horseradish peroxidase, beta-galactosidase, luciferase, alkaline phosphatase); colorimetric labels (such as, for example, dyes, colloidal gold and the like); magnetic labels (such as, for example, Dynabeads™); and biotin, digoxigenin or other haptens and proteins for which antisera or monoclonal antibodies are available.
In particularly preferred embodiments, the test and/or reference nucleic acids to be analysed by hybridisation are fluorescently labelled. Suitable fluorescent dyes for use in the present invention include e.g. Cy-3, Cy-5, Texas red, FITC, Spectrum Red, Spectrum Green, phycoerythrin, rhodamine, fluorescein, and equivalents, analogues or derivatives thereof. Favorable properties of fluorescent labelling agents to be used in the practice of the invention include high molar absorption coefficient, high fluorescence quantum yield, and photostability. Preferred labelling fluorophores exhibit absorption and emission wavelengths in the visible (i.e., between 400 and 750 nm) rather than in the ultraviolet range of the spectrum (i.e., lower than 400 nm). Preferred fluorescent dyes include Cy-3 and Cy-5 (i.e., 3- and 5-N,N′-diethyltetramethylindo-dicarbocyanine, respectively). Cy-3 and Cy-5 also present the advantage of forming a matched pair of fluorescent labels that are compatible with most fluorescence detection systems for array-based instruments (see below). Another preferred matched pair of fluorescent dyes comprises Spectrum Red and Spectrum Green. The term “differentially labelled” is used to specify that two samples of nucleic acid segments are labelled with a first detectable agent and a second detectable agent that produce distinguishable signals, whereby e.g. the first sample is the test sample and the second sample is the reference sample. Detectable agents that produce distinguishable signals include matched pairs of fluorescent dyes. Matched pairs of fluorescent dyes are known in the art and include, for example, rhodamine and fluorescein, Cy-3™ and Cy-5™, and Spectrum Red™ and Spectrum Green™.
Hybridization and wash protocols suitable for use with the methods of the invention are described, e.g., in Sambrook and Russell, 2001, supra, P. Tijssen “Hybridisation with Nucleic Acid Probes-Laboratory Techniques in Biochemistry and Molecular Biology (Part II)”, Elsevier Science, 1993; and “Nucleic Acid Hybridisation”, M. L. M. Anderson (Ed.), 1999, Springer Verlag: New York, N.Y. Preferred hybridization protocols for CGH are those of Pinkel et al. (1998) Nature Genetics 20:207-211 or of Kallioniemi (1992) Proc. Natl. Acad Sci USA 89:5321-5325. Methods of optimizing hybridization conditions are well known to those of skill in the art (see, e.g., Tijssen, 1993, supra). In order to create competitive hybridisation conditions, the array may be contacted simultaneously with the (differentially) labelled nucleic acid fragments of the test and reference samples. This may be done by, for example, mixing the test and reference samples to form a hybridisation mixture and contacting the array with the mixture.
The specificity of hybridisation may further be enhanced by inhibiting repetitive sequences. In certain preferred embodiments, repetitive sequences (e.g., Alu, L1 and satellite sequences, MRE sequences and simple homo- or oligo-nucleotide tracts) present in the nucleic acid fragments are removed or their hybridisation capacity is disabled. Removing repetitive sequences from a mixture or disabling their hybridisation capacity can be accomplished using any of a variety of methods well-known to those skilled in the art. These methods include, but are not limited to, removing repetitive sequences by hybridisation to specific nucleic acid sequences immobilised to a solid support (see e.g. O. Brison et al., Mol. Cell. Biol. 1982, 2: 578-587); suppressing the production of repetitive sequences by PCR amplification using adequate PCR primers; inhibiting the hybridisation capacity of highly repeated sequences by self-reassociation (see e.g. R. J. Britten et al., Methods of Enzymology, 1974, 29: 363-418); or removing repetitive sequences using hydroxyapatite (which is commercially available, for example, from Bio-Rad Laboratories, Richmond, Va.). Preferably, the hybridisation capacity of highly repeated sequences is competitively inhibited by including, in the hybridisation mixture, unlabelled blocking nucleic acids. The unlabelled blocking nucleic acids, which are mixed with the test and reference samples before the contacting step, act as a competitor and prevent the labelled repetitive sequences from binding to the highly repetitive sequences of the nucleic acid probes, thus decreasing hybridisation background. In certain preferred embodiments, the unlabelled blocking nucleic acids are Human Cot-1 DNA. Human Cot-1 DNA is commercially available, for example, from Gibco/BRL Life Technologies (Gaithersburg, Md.).
In another aspect the invention therefore relates to a set of at least three nucleic acid probes for use in the above described methods of the invention. In the set preferably each probe specifically hybridises to a different genomic location selected from the group consisting of 5q13-15, 3q22-27, 13q31-33, 12q21-23, 10p14, 3p21, 14q22-24, 6p22-23, and 5q21-23. More preferably in the set each probe specifically hybridises to a different genomic location selected from the group consisting of 5q13-15, 3q22-27, 13q31-33, 12q21-23, 10p14, 3p21, 14q22-24 and 6p22-23. Further preferred in the set each probe specifically hybridises to a different genomic location selected from the group consisting of 5q13-15, 3q22-27, 13q31-33, 12q21-23, 10p14, 3p21 and 14q22-24. Yet further preferred in the set each probe specifically hybridises to a different genomic location selected from the group consisting of 5q13-15, 3q22-27, 13q31-33, 12q21-23, 10p14 and 3p21. Again further preferred in the set each probe specifically hybridises to a different genomic location selected from the group consisting of 5q13-15, 3q22-27, 13q31-33, 12q21-23 and 10p14. Still further preferred in the set each probe specifically hybridises to a different genomic location selected from the group consisting of 5q13-15, 3q22-27, 13q31-33 and 12q21-23. Most preferably, in the set each probe specifically hybridises to a different genomic location selected from the group consisting of 5q13-15, 3q22-27 and 13q31-33. In these sets, the nucleic acid probes for detection of genomic locations are as defined above and/or may be obtained in methods as described above.
In another aspect the invention relates to a BAC clone, which BAC clone is selected from the group of BAC clones as listed in Table 1 or Table 2. Preferably the invention relates to a set of at least three BAC clones selected from the group of BAC clones as listed in Table 1 or Table 2. More preferably, the set comprises at least 4, 5, 6, 7, 8, 9, 10, 15, 20, 30, 50, 60, 80 or 100 BAC clones selected from the group of BAC clones as listed in Table 1 or Table 2. Even more preferably, the set of BAC clones represent at least three, more preferably at least 4, 5, 6, 7, 8, 9 or 10 of the above-listed genomic locations. Most preferably, the set of BAC clones represent all 10 of the above-listed chromosomal locations (as are also listed in Table 1).
In yet another aspect the invention pertains to an array comprising a set of at least three nucleic acid probes for use in the above described methods of the invention as defined above. More preferably the array comprises distinct nucleic acid probes and/or distinct BAC clones that specifically hybridise to at least 4, 5, 6, 7, 8, 9, 10 of the above-listed genomic locations. More preferably the array comprises nucleic acid probes that comprise a sequence that is unique in the above-listed genomic locations. Most preferably the array comprises distinct probes and/or distinct BAC clones for all 10 of the above-listed chromosomal locations (as are also listed in Table 1). It is understood herein that an array that comprises distinct nucleic acid probes and/or distinct BAC clones that specifically hybridise to at least three genomic locations is an array that allows at least three (different) genomic locations to be analysed individually. Thus, preferably a set of nucleic acid probes and/or BAC clones is arranged on the array in a positionally-addressable manner. An array is herein understood as any solid support onto which the probes are immobilised, whereby preferably the probes and/or BAC clones are immobilised onto the solid support in a positionally-addressable manner. Preferably, the distinct BAC clones that are comprised on the array are selected from the group of BAC clones as listed in Table 1 or Table 2.
In a further aspect the invention relates to kits for use in the diagnostic applications described above. The kits of the invention may comprise any or all of the reagents to perform the methods described herein. In the diagnostic applications such kits may include any or all of the following: assay reagents, buffers, nucleic acids such as hybridization probes and/or primers that specifically bind to at least one of the genomic locations described herein, as well as arrays comprising such nucleic acids. In addition, the kits may include instructional materials containing directions (i.e., protocols) for the practice of the methods of this invention. While the instructional materials typically comprise written or printed materials, they are not limited to such. Any medium capable of storing such instructions and communicating them to an end user is contemplated by this invention. Such media include, but are not limited to, electronic storage media (e.g., magnetic discs, tapes, cartridges, chips), optical media (e.g., CD ROM), and the like. Such media may include addresses to internet sites that provide such instructional materials.
In yet a further aspect the invention relates to a diagnostic and/or prognostic method for indicating the involvement of a BRCA1 deficiency in the development of a tumour in a subject. The method preferably comprises the use of a method as defined hereinabove, a set of nucleic acid probes as defined hereinabove, an array as defined hereinabove, or a kit as defined hereinabove. The diagnostic and/or prognostic method may further be used (on its own or as an additional tool) to identify BRCA1-mutation carrying families where the BRCA1-relation is still unclear. The diagnostic and/or prognostic method may also be used to select, within a high-risk family, the individual most likely to carry a mutation for intensive DNA screening, and/or to guide DNA diagnostics. Additionally, the diagnostic and/or prognostic method may be used to give indications for the significance of unclassified variants. The methods of the invention provide a reliable test for indicating the involvement of a BRCA1 deficiency in the development of individual tumours and as such support decision making in genetic counselling and clinical management, e.g. of the treatment of breast tumours.
In this document and in its claims, the verb “to comprise” and its conjugations is used in its non-limiting sense to mean that items following the word are included, but items not specifically mentioned are not excluded. In addition, reference to an element by the indefinite article “a” or “an” does not exclude the possibility that more than one of the element is present, unless the context clearly requires that there be one and only one of the elements. The indefinite article “a” or “an” thus usually means “at least one”.
All patent and literature references cited in the present specification are hereby incorporated by reference in their entirety.
The following examples are offered for illustrative purposes only, and are not intended to limit the scope of the present invention in any way.
1.1.1 Patients and Sample selection
This study was performed on three breast cancer groups: 1) 28 breast tumours from patients with a verified pathogenic BRCA1 germline mutation, with a mean age at diagnosis of 39 years (range: 27-61); 2) 48 sporadic breast tumours from patients without a family history of breast cancer, with a mean age at diagnosis of 45 years (range: 32-60), which were randomly selected from the institutes' archive, but with the same percentage of P53-negative and P53-positive samples as the BRCA1-associated tumour group (Table 3); and 3) 48 tumours from HBOC families (at least two breast and one primary ovarian cancer) that were subjected to routine diagnostic testing (described by Van der Hout 2006) and had a negative test result for mutations in both BRCA1 and BRCA2, with a mean age at diagnosis of 48 years (range: 20-61). Both the BRCA1-associated and the sporadic tumour groups consisted of invasive grade II-III ductal carcinomas. Patient characteristics for all three groups are described in the supplementary table patient information. All sample material was formalin-fixed, paraffin-embedded (FFPE) tissue, and extracted DNA had to be of sufficient quality as described before (Van Beers 2006). All experiments involving human tissues were conducted with permission of the institutes' medical ethical advisory board.
Sample DNA was isolated from FFPE tumour tissues as follows. Ten 10 μm slices containing at least 70% tumour cells were cleared of paraffin (2×5 min xylene, 2×30s 100% ethanol, 30s 90% ethanol, 30s 70% ethanol, and rinsed with H2O), treated with 1M sodium acetate at 37° C. overnight, and sections of interest (>70% tumour cells) were scraped in 200 μl buffer ATL (Qiagen, cat.no. 51304). 27 μl proteinase K (15 μg/μl, Roche, cat no 3115879001) was added immediately, and the same amount was added again at the end of the day and at the beginning and the end of the next day; samples were kept shaking at 37° C. throughout the digestion. The following day, 40 μl RNase A (20 μg/μl, Sigma, cat.no.RS500) was added to the sample, vortexed, and incubated for 2 minutes at room temperature. 400 μl buffer AL (Qiagen, cat.no. 51304) was added and incubated for 10 minutes at 70° C. 420 μl 100% ethanol was added and vortexed. The sample mixture was spun on a spin column (Qiagen, cat.no. 51304) for 1 minute at 8000 rpm. The column was washed with the following reagents sequentially and spun for 1 minute at 8000 rpm after each: 500 μl AW1, 500 μl AW2, and twice with 80% ethanol. The column was spun dry for 3 minutes at 14,000 rpm. The sample was eluted with 50 μl AE buffer by spinning for 1 minute at 8000 rpm.
Reference DNA was isolated from lymphocytes from six apparently healthy women and pooled. Lymphocytes were purified by adding lysis buffer (155 mM NH4Cl, 10 mM KHCO3, 1 mM EDTA) at four times the blood volume, followed by centrifugation at 3000 rpm for 10 minutes at 4° C. The supernatant was removed and the cell pellet re-suspended in lysis buffer at five times the original blood volume. These steps were repeated until all erythrocytes were removed and the supernatant was a clear solution. DNAzol (Invitrogen, cat.no. 10503-027), at 1/10 of the initial blood volume, was added to the cell pellet and mixed by pipetting until a clear solution was left. 100% ethanol, at ½ of the DNAzol volume, was added; DNA was removed from the solution, washed in 70% ethanol and dissolved in Tris-EDTA buffer. DNA was sonicated until the average length was 300-800 bp.
To test the suitability of sample DNA for array-CGH, a quality PCR was performed as described before (Van Beers 2006). This multiplex PCR contains primers that produce bands of 100, 200, 300, and 400 bp. Which of these bands are produced depends on the quality of the DNA. DNA from which at least the 200 bp fragment can be amplified is of sufficient quality for array-CGH.
As described before (Joosse 2007), hybridisations were done on microarrays containing 3.5 k BAC/PAC derived DNA segments covering the whole genome with an average spacing of 1 Mb. The whole library was spotted in triplicate on every slide (Code Link Activated Slides, Amersham Biosciences, Prod. No. 300011 00).
Data processing of the scanned microarray slide included signal intensity measurement in ImaGene Software followed by median pin tip (i.e. subarray) normalisation. Intensity ratios (Cy5/Cy3) were log2-transformed and triplicate spot measurements were averaged. Chromosomal breakpoints and aberrations were calculated using CGH-segmentation (Picard 2005).
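The processing order described above can be illustrated with a minimal sketch. It is not the ImaGene procedure itself: the function name, the input layout and, in particular, the choice to normalise by dividing each ratio by the median ratio of its subarray are assumptions made for illustration only.

```python
import numpy as np

def normalised_log2_ratios(cy5, cy3, subarray, clone):
    """Return one averaged log2(Cy5/Cy3) value per clone.

    cy5, cy3 : spot intensities (1-D sequences of equal length)
    subarray : pin-tip (subarray) label of each spot
    clone    : clone identifier of each spot (replicates share an identifier)
    """
    cy5 = np.asarray(cy5, dtype=float)
    cy3 = np.asarray(cy3, dtype=float)
    subarray = np.asarray(subarray)
    clone = np.asarray(clone)

    ratio = cy5 / cy3
    # Assumed normalisation: divide by the median ratio of the spot's subarray.
    for s in np.unique(subarray):
        mask = subarray == s
        ratio[mask] = ratio[mask] / np.median(ratio[mask])

    log_ratio = np.log2(ratio)
    # Average the replicate spots of each clone.
    return {c.item(): float(log_ratio[clone == c].mean())
            for c in np.unique(clone)}
```

The returned dictionary maps each clone identifier to its mean normalised log2 ratio, which is the kind of per-clone value that a segmentation step would take as input.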
To build a class predictor based on the log2 ratios of our CGH experiments, the shrunken centroids algorithm (Tibshirani, 2002) was used. For calculating the squared distances δk of the sample x* to each class k, we have applied equal priors (πk=1/K). As the shrinkage increases, the number of BAC clones with a non-zero d′ik, i.e. the clones that divide the tumour groups, decreases, and the squared distances consequently also become relatively small.
An analogy to Gaussian linear discriminant analysis was therefore not applied. The different dynamic ranges and the differences in CGH profiles between samples give a wide variability in discriminant scores. This variability affects the scores of both classes, and the scores can therefore be scaled towards zero by subtracting the smallest discriminant score from both:
δ′k(x*) = δk(x*) − minj δj(x*), where the minimum is taken over the classes j.
Sample x* is assigned to the class k for which δ′k(x*) = 0. The largest scaled score, maxk δ′k(x*), is the distance to the improbable class for sample x*. We use maxk δ′k(x*) here as a ‘likelihood score’ for the probable class.
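The scaling and classification rule above can be sketched as follows. This is a minimal illustration, assuming that the shrunken class centroids and the pooled within-class standard deviations have already been estimated; the function names, argument names and the default value of s0 are hypothetical. The distance follows the standardised squared distance of the shrunken centroids method, and with equal priors the prior term is the same for every class and is therefore omitted.

```python
import numpy as np

def scaled_scores(x, centroids, s, s0=0.0):
    """Scaled discriminant scores of one sample against each class.

    x         : log2 ratios of the sample for the selected clones, shape (p,)
    centroids : shrunken class centroids, shape (K, p)
    s         : pooled within-class standard deviation per clone, shape (p,)
    s0        : positive constant added to s (hypothetical default of 0)
    """
    x = np.asarray(x, dtype=float)
    centroids = np.asarray(centroids, dtype=float)
    s = np.asarray(s, dtype=float)
    # Standardised squared distance of the sample to each class centroid.
    delta = (((x - centroids) / (s + s0)) ** 2).sum(axis=1)
    # Shift the scores so that the nearest class scores exactly zero.
    return delta - delta.min()

def classify(x, centroids, s):
    """Return (index of the predicted class, likelihood score)."""
    scaled = scaled_scores(x, centroids, s)
    predicted = int(np.argmin(scaled))   # class whose scaled score is 0
    likelihood = float(scaled.max())     # distance to the improbable class
    return predicted, likelihood
```

A larger likelihood score means that the sample lies much further from the improbable class than from the class to which it is assigned.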
The class predictor was built on 18 randomly selected BRCA1-mutated and 32 sporadic breast tumours, and validated on an independent set of 10 BRCA1-mutated and 16 sporadic tumours. To be able to test the HBOC tumour group, reference intervals were based on 95% of the training and validation scores. For legibility, scores for the sporadic tumour group are shown as negative and BRCA1 scores as positive.
Hypermethylation of the BRCA1 promoter was determined by using methylation MLPA according to the manufacturer's protocol (MRC-Holland, ME001), with a PCR of 30 cycles. Two microlitres of MLPA-PCR product were added to 9.8 μl Hi-Di formamide (AB, 4311320) and 0.2 μl ROX-500 (AB, 401734), and analysed on a 3730 DNA sequencer (AB).
LOH at the BRCA1 locus was determined using 5 markers: D17S579, D17S588, D17S1322, D17S1323, and THRA1. Primers and PCR program are described in Tables 4 and 5. One microlitre of PCR product was added to 14.9 μl Hi-Di formamide and 0.1 μl ROX-350 (AB, 401735), and analysed using the 3730 DNA Analyzer.
Microarray data have been deposited in NCBI's Gene Expression Omnibus and are accessible through GEO Series accession numbers GSE9021 (BRCA1-associated tumours) and GSE9114 (sporadic tumours).
In total, we have obtained array-CGH profiles of 28 BRCA1-related, 48 sporadic and 48 HBOC breast tumours. We here report the chromosomal aberrations and their locations, the differences between the tumour groups, and the discriminating power of a class predictor based on our CGH results.
To analyse chromosomal aberrations, we determined breakpoint locations and estimated copy number levels using the CGH-segmentation algorithm (Picard 2005). Based on the estimated copy number levels, the frequency of gain and loss for all BAC clones was calculated using fixed log2 ratio thresholds of 0.2 and −0.2, respectively.
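The frequency calculation described above can be sketched as follows. The function name and the input layout (one row per tumour, one column per BAC clone, holding the segmented log2 levels) are assumptions made for illustration, as is the use of strict inequalities at the thresholds, which the text does not specify.

```python
import numpy as np

def gain_loss_frequency(levels, gain_threshold=0.2, loss_threshold=-0.2):
    """Fraction of tumours showing gain or loss at each clone.

    levels : segmented log2 ratios, shape (n_tumours, n_clones)
    """
    levels = np.asarray(levels, dtype=float)
    # Assumption: values exactly at a threshold count as neither gain nor loss.
    gain = (levels > gain_threshold).mean(axis=0)
    loss = (levels < loss_threshold).mean(axis=0)
    return gain, loss
```

Each returned array has one entry per clone, so that the gain and loss frequencies can be plotted along the genome for each tumour group separately.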
We have used nearest Shrunken Centroids (SC) as the classification method to discriminate between two breast cancer types: tumours with a germline BRCA1 mutation (class 1) and sporadic tumours (class 2). Two thirds of the samples of each set were randomly selected, i.e. 18 BRCA1 and 32 sporadic tumours, for the SC analysis. Based on leave-one-out cross-validation (LOOCV) (supplementary data LOOCV), the analysis was performed using Δ=1.3 (Van Beers 2006, formula 5) and 191 features were selected as discriminatory. The remaining one third of the samples was used as external validation for the class predictor. All samples of the BRCA1 group (n=10) were predicted as BRCA1-like and all sporadic samples (n=16) were classified correctly as sporadic tumours. Features that were selected as most characteristic for BRCA1 breast tumours were abundant in regions of chromosome 3q22-27 (gain), 5q12-14 (loss), 6p23-22 (gain), 10p15-14 (gain), 12p13 (gain), 12q21-23 (loss), and 13q31-34 (gain). Features that were specific for the sporadic tumour set were abundant in regions of chromosome 3q22-26 (loss) and 13q31-33 (loss). BAC clones that were selected using the SC method are listed in Tables 1 and 2.
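The random selection of two thirds of each tumour group for training, with the remaining third held out for validation, can be sketched as follows. The function name and the seed are hypothetical, and rounding the training size down is an assumption chosen because it reproduces the group sizes reported above (18/10 and 32/16).

```python
import numpy as np

def two_thirds_split(n_samples, rng):
    """Randomly split sample indices into training (2/3) and validation (1/3)."""
    order = rng.permutation(n_samples)
    n_train = (2 * n_samples) // 3   # assumption: training size rounded down
    return order[:n_train], order[n_train:]

rng = np.random.default_rng(0)       # hypothetical seed, for reproducibility
brca1_train, brca1_valid = two_thirds_split(28, rng)
sporadic_train, sporadic_valid = two_thirds_split(48, rng)
```

Splitting each group separately keeps the proportion of BRCA1-mutated to sporadic tumours the same in the training and validation sets.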
Robustness of our class predictor was further tested by 15 random selections of two thirds of the data set for training a class predictor, each validated on the remaining one third of the samples. All fifteen random selections resulted in a performance of 100% as can be seen in
In our tumour groups, BRCA1-mutated tumours are generally ER, PR and HER2/neu negative (also known as triple negative), whereas only 19% of the sporadic cases are triple negative (supplementary table sample information). To investigate the relation of ER, PR, HER2/neu, or P53-status with chromosomal aberrations, and thus their influence on our class predictor, hierarchical cluster analysis (Eisen 1998) was performed on the array CGH results of the 28 BRCA1-associated and the 48 sporadic breast tumours. Tumours sharing the same receptor or P53-status did not reside in clusters as can be seen in
Forty-eight patients from HBOC families (at least two patients with breast carcinoma and at least one case of primary ovarian carcinoma) were selected and analysed using aCGH. Applying the class predictor, we found 2 samples to be BRCA1-like, 42 samples were predicted as sporadic cancer, and 4 samples could not be assigned with certainty to a class as they fell outside the 95% reference intervals.
To find evidence for BRCA1 involvement in the breast cancer cases that we classified as BRCA1-like, we performed additional tests that were not included in the original diagnostic setting (Van der Hout 2006). Hypermethylation of the BRCA1 promoter was determined for all BRCA1-associated, sporadic, and HBOC samples using MLPA-methylation (MRC-Holland, ME001). Only case HR015, which was classified as BRCA1-like, was hypermethylated at the BRCA1 promoter. Additional analyses also showed hypermethylation at the BRCA1 promoter in the ovarian tumour of the same patient but, interestingly, not in her lymphocytes. Loss of heterozygosity (LOH) of BRCA1 was observed in both samples, HR015 and HR019. Because exon 11 is the largest exon of BRCA1 (it codes for 61% of the protein) and is approximately 3.4 kb long, it is not routinely sequenced in diagnostics but is screened for truncating mutations by the protein truncation test (PTT) (Hogervorst 1995). We sequenced exon 11 and found no mutations. For case HR019, neither BRCA1 methylation nor unclassified variants were identified; however, loss of one BRCA1 allele was observed in the tumour.
We show that BRCA1-associated breast tumours develop rearranged genomes with specific genomic aberrations that differ significantly from those of sporadic breast tumours. Based on our array-CGH data, we identified the most significant differences between these two tumour groups and built a class predictor with a sensitivity and specificity of 100% using the Shrunken Centroids method (Tibshirani 2002). Aberrations are seen less frequently in sporadic breast tumours than in BRCA1-associated tumours. Many of the identified regions specific for BRCA1-related tumours have been published before (Tirkkonen 1997, Wessels 2002, Van Beers 2005, Jonsson 2005, Johannsdottir 2006) but have been poorly correlated with combined receptor and P53-status. It has been reported that BRCA1 tumours are in general ER, PR, and HER2/neu-negative (Lakhani 2002). Furthermore, it has been shown that specific genetic alterations are associated with receptor and P53-status (Loo 2004, Fridlyand 2006). The differences in chromosomal aberrations between ER-negative and ER-positive breast carcinomas are located at 4p16, 5q23-35, 8p23-21, 10p12, 10q25, 17q11, 19q13, and 21q22 (Loo 2004). The differences in chromosomal aberrations between P53-positive and P53-negative breast tumours are located at 3p, 4q, 5q, 8q, 15q, and 17q (Fridlyand 2006). To prevent a possible influence of P53-status on the separation of our tumour groups, equal distributions of P53-negative and P53-positive tumours were used to build our class predictor. Because ER-negativity is dominant in BRCA1-associated tumours, we did not have equal numbers of ER-negative tumours in both tumour classes, and we therefore discuss here whether receptor status may have influenced our class predictor. Although the ER and P53-specific chromosomal regions could be confirmed in our CGH data (data not shown), only a small region of 5q was present in our class predictor, indicating that ER and P53-status do not strongly influence the classifier.
Wessels et al. previously classified BRCA1-associated and sporadic tumours using classical CGH with an accuracy of 84%; in that study, loss in 3p and 5q and gain in 3q were identified as discriminatory aberrations. As described by Fridlyand et al., TP53-mutant tumours show loss in chromosome 3p. This chromosomal region is part of the classifier of Wessels et al., which could have contributed to false positives. Chromosomal regions 3q and 5q are present in our current class predictor, but 3p is not. This suggests that an equal distribution of P53-status across both tumour groups could have helped to achieve a better specificity.
In unsupervised cluster analyses of the array-CGH results, tumours from the BRCA1-related and sporadic tumour groups sharing the same receptor or P53-status do not reside in clusters (
Other studies have reported BRCA1-status prediction based on clinical and pathological review (Lakhani 2005, Van der Groep 2006). These studies show that the investigated protein expression patterns could not all be clearly related to dysfunctional BRCA1, which illustrates the difficulty of pathological review with the currently available markers.
Applying our classification technique to breast tumours from non-BRCA1/2 families, we identified 2 out of 48 tumours as BRCA1-like. Because all tumours were formalin-fixed and paraffin-embedded, we could not investigate BRCA1 RNA expression. However, further analyses of genomic DNA showed hypermethylation and LOH of the BRCA1 gene in one of these cases, strongly indicating BRCA1 dysfunction. Cancer formation due to BRCA1 mutation is generally accompanied by loss of the wild-type allele, i.e. LOH, which was also found in the second BRCA1-like HBOC tumour. However, no novel or previously described mutations in the BRCA1 gene could be identified in this tumour after sequencing exon 11. One explanation for the lack of evidence so far for BRCA1 involvement in tumour formation could be that this tumour arose sporadically but nevertheless suffers from BRCA1 dysfunction (Turner 2007). This patient's family history differed from that of an average BRCA1-involved family (breast and ovarian cancer) and also included brain cancer, colon cancer, and leukaemia. Additionally, the tumour was ER and PR-positive, which is uncommon for BRCA1-related tumours (Lakhani 2002). This unresolved BRCA1-like case should be analysed more intensively when new techniques and knowledge become available.
According to the Evans scoring (Evans 2004), these two BRCA1-like tumours had a calculated chance of carrying a BRCA1 mutation of 20% and 11.8%, respectively, which is surprisingly low compared with the tumours with an Evans score >50% that were not classified as BRCA1-like. This suggests that risk prediction based on family history is imperfect; moreover, sporadic tumours that have dysfunctional BRCA1 (Turner 2007) obviously cannot be predicted using family-based models.
The variation in discriminant scores within the BRCA1-associated and the sporadic tumour groups is partly due to technical variance between log2 ratios; in addition, overestimation of the tumour percentage, which can cause a suppressed tumour profile, can lead to a false classification. Therefore, we applied reference intervals that are based on 95% of our data. Four of the tested HBOC breast carcinomas fell outside the 95% reference intervals of our classifier. Since discriminant scores outside the 95% reference intervals become too small to be reliable, we refrained from classifying these cases.
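The idea of withholding a classification can be sketched as follows. The text does not specify how the 95% reference intervals were computed, so this sketch assumes, purely for illustration, a single signed discriminant score per tumour and an empirical central 95% interval (2.5th to 97.5th percentile) of the training scores of each class. The function names, the scores, and the rule of leaving a tumour unassigned unless it falls in exactly one interval are all hypothetical.

```python
# Illustrative sketch only: withhold classification outside 95% reference intervals.
from statistics import quantiles


def reference_interval(scores):
    """Central 95% interval (2.5th to 97.5th percentile) of training scores."""
    # n=40 yields 39 cut points spaced 2.5% apart; the first and last
    # are the 2.5th and 97.5th percentiles.
    cuts = quantiles(scores, n=40, method="inclusive")
    return cuts[0], cuts[-1]


def classify(score, intervals):
    """Return the class whose interval contains the score, else 'unassigned'."""
    hits = [label for label, (low, high) in intervals.items()
            if low <= score <= high]
    return hits[0] if len(hits) == 1 else "unassigned"


# Hypothetical training scores (positive: BRCA1-like, negative: sporadic-like).
brca1_training_scores = [2.1, 2.5, 2.6, 2.8, 2.9, 3.0, 3.1, 3.4]
sporadic_training_scores = [-3.4, -3.1, -3.0, -2.9, -2.8, -2.6, -2.5, -2.1]

intervals = {
    "BRCA1-like": reference_interval(brca1_training_scores),
    "sporadic": reference_interval(sporadic_training_scores),
}

for score in (2.9, -3.0, 0.3):
    print(score, classify(score, intervals))
```

In this toy setting a score close to zero lies outside both intervals and is left unassigned, which corresponds to the cases for which no class was reported.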
Although further validation in a large series is required, we conclude that current diagnostics does find most hereditary BRCA1-associated breast tumours. However, since we could still find additional BRCA1-related breast tumours, our approach may also be used as an additional tool to identify BRCA1-mutation carrying families in which the BRCA1 relation is still unclear. In the future, it may be possible to include this test in the diagnostic routine to select, within a high-risk family, the individual most likely to carry a mutation for intensive DNA screening, and thereby to guide DNA diagnostics. Additionally, it may give indications of the significance of unclassified variants (Tischkowitz submitted). Our method outperforms pathological reviewing and all other available methods on tumour material in predicting BRCA1 association in clinical samples.
Number | Date | Country | Kind |
---|---|---|---|
07118284.4 | Oct 2007 | EP | regional |

Filing Document | Filing Date | Country | Kind | 371c Date |
---|---|---|---|---|
PCT/NL08/50641 | 10/9/2008 | WO | 00 | 7/19/2010 |