Methods for diagnosing cancer and monitoring treatment efficacy based on detecting the presence of increased levels of expression of satellite correlated genes.
Genome-wide sequencing approaches have revealed an increasing set of transcribed non-coding sequences, including “pervasive transcription” of heterochromatic regions of the genome linked to transcriptional silencing and chromosomal integrity (J. Berretta, A. Morillon, EMBO Rep 10, 973 (September, 2009); A. Jacquier, Nat Rev Genet 10, 833 (December, 2009)). In the mouse, heterochromatin is composed of centric (minor) and pericentric (major) satellite repeats that are required for formation of the mitotic spindle complex and faithful chromosome segregation (M. Guenatri, D. Bailly, C. Maison, G. Almouzni, J Cell Biol 166, 493 (Aug. 16, 2004)), whereas human satellite repeats have been divided into multiple classes with similar functions (J. Jurka et al., Cytogenet Genome Res 110, 462 (2005)). Bidirectional transcription of satellites in yeast maintains silencing of centromeric DNA through the Dicer-mediated RNA-induced transcriptional silencing (RITS) pathway and through a recently identified Dicer-independent pathway (M. Halic, D. Moazed, Cell 140, 504 (February 19)), although centromeric satellite silencing mechanisms in mammals are less well defined (A. A. Aravin, G. J. Hannon, J. Brennecke, Science 318, 761 (Nov. 2, 2007)). Accumulation of satellite transcripts in mouse and human cell lines results from defects in DICER1 (C. Kanellopoulou et al., Genes Dev 19, 489 (Feb. 15, 2005); T. Fukagawa et al., Nat Cell Biol 6, 784 (August, 2004)) and from DNA demethylation, heat shock, or the induction of apoptosis (H. Bouzinba-Segard, A. Guais, C. Francastel, Proc Natl Acad Sci USA 103, 8709 (Jun. 6, 2006); R. Valgardsdottir et al., Nucleic Acids Res 36, 423 (February, 2008)). Stress-induced transcription of satellites in cultured cells has also been linked to the activation of retroelements encoding RNA polymerase activity such as LINE-1 (D. Ugarkovic, EMBO Rep 6, 1035 (November, 2005); D. M. Carone et al., Chromosoma 118, 113 (February, 2009)).
Despite these in vitro models, the global expression of repetitive ncRNAs in primary tumors has not been analyzed, due to the bias of microarray platforms toward annotated coding sequences and the specific exclusion of repeat sequences from standard analytic programs.
The present invention is based, at least in part, on the identification of massive expression of satellite repeats in tumor cells, and of increased levels of satellite correlated genes, e.g., in tumor cells including circulating tumor cells (CTCs). Described herein are methods for diagnosing cancer, e.g., solid malignancies of epithelial origin such as pancreatic, lung, breast, prostate, renal, ovarian or colon cancer, based on the presence of increased levels of those satellite correlated genes.
Thus, in a first aspect, the present invention provides in vitro methods of detecting the presence of cancer in a subject. The methods include determining an expression level of one or more Satellite Correlated Genes selected from the group consisting of HSP90BB (heat shock protein 90 kDa alpha (cytosolic), class B member 2, pseudogene (HSP90AB2P)); NR_003133 (Homo sapiens guanylate binding protein 1, interferon-inducible pseudogene 1 (GBP1P1), non-coding RNA); BX649144 (Tubulin tyrosine ligase (TTL)); DERP7 (transmembrane protein 45A (TMEM45A)); MGC4836 (Homo sapiens similar to hypothetical protein (L1H 3 region)); BC037952 (cDNA clone); AK056558 (cDNA clone); NM_001001704 (FLJ44796 hypothetical); ODF2L (outer dense fiber of sperm tails 2-like (ODF2L)); BC041426 (C12orf55 chromosome 12 open reading frame 55 (C12orf55)); REXO1L1 (RNA exonuclease 1 homolog (S. cerevisiae)-like 1 (REXO1L1)); AK026100 (FLJ22447 hypothetical LOC400221 (FLJ22447)); AK026825 (transmembrane protein 212 (TMEM212)); KENAE1 (Homo sapiens mRNA for Kenae1 (AB024691)); HESRG (ESRG hypothetical LOC790952 (ESRG)); AK095450 (LOC285540 hypothetical LOC285540); FLJ36492 (CCR4-NOT transcription complex, subunit 1 (CNOT1)); AK124194 (FLJ42200 protein); AK096196 (hypothetical LOC100129434); AK131313 (Zinc finger protein 91 pseudogene (LOC441666)); FLJ11292 (hypothetical protein FLJ11292); CCDC122 (coiled-coil domain containing 122 (CCDC122)); and BC070093 (cDNA clone) in a sample comprising a test cell from the subject to obtain a test value; and comparing the test value to a reference value. A test value that is significantly above the reference value indicates that the subject has cancer.
In some embodiments, the reference value is a level of the Satellite Correlated Gene in a normal cell. In some embodiments, the normal cell is a cell of the same type as the test cell in the same subject. In some embodiments, the normal cell is a cell of the same type as the test cell in a subject who does not have cancer. In some embodiments, the cell is in a tissue sample.
In some embodiments, the sample is known or suspected to comprise tumor cells, e.g., a blood sample known or suspected of comprising circulating tumor cells (CTCs), or a biopsy sample known or suspected of comprising tumor cells.
In some embodiments, the methods further include diagnosing a subject with cancer based on the presence of a test value that is significantly above the reference value; identifying the subject as having cancer based on the presence of a test value that is significantly above the reference value; selecting a subject for treatment based on the presence of a test value that is significantly above the reference value; treating a subject for cancer (e.g., administering a treatment for cancer to the subject) based on the presence of a test value that is significantly above the reference value; or selecting a subject for further diagnostic testing (e.g., imaging, biopsy, etc) based on the presence of a test value that is significantly above the reference value.
In a further aspect, the invention provides in vitro methods for evaluating the efficacy of a treatment for cancer in a subject. The methods include determining a level of one or more Satellite Correlated Genes selected from the group consisting of HSP90BB (heat shock protein 90 kDa alpha (cytosolic), class B member 2, pseudogene (HSP90AB2P)); NR_003133 (Homo sapiens guanylate binding protein 1, interferon-inducible pseudogene 1 (GBP1P1), non-coding RNA); BX649144 (Tubulin tyrosine ligase (TTL)); DERP7 (transmembrane protein 45A (TMEM45A)); MGC4836 (Homo sapiens similar to hypothetical protein (L1H 3 region)); BC037952 (cDNA clone); AK056558 (cDNA clone); NM_001001704 (FLJ44796 hypothetical); ODF2L (outer dense fiber of sperm tails 2-like (ODF2L)); BC041426 (C12orf55 chromosome 12 open reading frame 55 (C12orf55)); REXO1L1 (RNA exonuclease 1 homolog (S. cerevisiae)-like 1 (REXO1L1)); AK026100 (FLJ22447 hypothetical LOC400221 (FLJ22447)); AK026825 (transmembrane protein 212 (TMEM212)); KENAE1 (Homo sapiens mRNA for Kenae1 (AB024691)); HESRG (ESRG hypothetical LOC790952 (ESRG)); AK095450 (LOC285540 hypothetical LOC285540); FLJ36492 (CCR4-NOT transcription complex, subunit 1 (CNOT1)); AK124194 (FLJ42200 protein); AK096196 (hypothetical LOC100129434); AK131313 (Zinc finger protein 91 pseudogene (LOC441666)); FLJ11292 (hypothetical protein FLJ11292); CCDC122 (coiled-coil domain containing 122 (CCDC122)); and BC070093 (cDNA clone) in a first sample from the subject to obtain a first value; administering a treatment for cancer to the subject; determining a level of the one or more (i.e., the same) Satellite Correlated Genes in a subsequent sample obtained from the subject at a later time, to obtain a treatment value; and comparing the first value to the treatment value. A treatment value that is below the first value indicates that the treatment was effective, whereas no change or an increase indicates that the treatment was ineffective.
In some embodiments, the first and subsequent samples are known or suspected to comprise tumor cells, e.g., blood samples known or suspected of comprising circulating tumor cells (CTCs), or biopsy samples known or suspected of comprising tumor cells.
In some embodiments, the treatment includes administration of a surgical intervention, chemotherapy, radiation therapy, or a combination thereof.
In some embodiments of the methods described herein, the subject is a human.
In some embodiments of the methods described herein, the cancer is a solid tumor of epithelial origin, e.g., pancreatic, lung, breast, prostate, renal, ovarian or colon cancer.
In some embodiments of the methods described herein, the methods include determining a level of one or more, e.g., two, three, or four, of AK056558; BC037952; HSP90BB; and/or AK096196. In some embodiments of the methods described herein, the methods include determining a level of one or both of HSP90BB and/or AK056558. In some embodiments, the methods include determining a level of HSP90BB. In some embodiments, the methods include determining a level of AK096196. In some embodiments, the methods include determining a level of AK056558. In some embodiments, the methods include determining a level of BC037952.
In some embodiments, determining a level of one or more Satellite Correlated Genes comprises determining a level of a transcript. In some embodiments, determining a level of a transcript comprises contacting the sample with an oligonucleotide probe that binds specifically to the transcript. In some embodiments, the probe is labeled.
In some embodiments, “determining a level” comprises detecting the presence or absence, e.g., the presence of a level above the limit of detection of the assay being used.
In some embodiments, the present methods can be used for determining the likelihood that a subject has cancer.
Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. Methods and materials are described herein for use in the present invention; other, suitable methods and materials known in the art can also be used. The materials, methods, and examples are illustrative only and not intended to be limiting. All publications, patent applications, patents, sequences, database entries, and other references mentioned herein are incorporated by reference in their entirety. In addition, PCT/US2011/055108 is specifically incorporated herein by reference in its entirety, and in some embodiments methods described herein can be used in conjunction with methods described in that application. In case of conflict, the present specification, including definitions, will control.
Other features and advantages of the invention will be apparent from the following detailed description and figures, and from the claims.
The present invention is based, at least in part, on the identification of a massive generation of satellite RNAs in human and mouse cancers, and a number of satellite correlated genes. Thus the present methods are useful in the early detection of cancer, and can be used to predict clinical outcomes.
Diagnosing Cancer Using Satellite Correlated Genes as Biomarkers
The methods described herein can be used to diagnose the presence of, and monitor the efficacy of a treatment for, cancer, e.g., solid tumors of epithelial origin, e.g., pancreatic, lung, breast, prostate, renal, ovarian or colon cancer, in a subject.
As used herein, the term “hyperproliferative” refers to cells having the capacity for autonomous growth, i.e., an abnormal state or condition characterized by rapidly proliferating cell growth. Hyperproliferative disease states may be categorized as pathologic, i.e., characterizing or constituting a disease state, or may be categorized as non-pathologic, i.e., a deviation from normal but not associated with a disease state. The term is meant to include all types of cancerous growths or oncogenic processes, metastatic tissues or malignantly transformed cells, tissues, or organs, irrespective of histopathologic type or stage of invasiveness. A “tumor” is an abnormal growth of hyperproliferative cells. “Cancer” refers to pathologic disease states, e.g., characterized by malignant tumor growth.
As demonstrated herein, the presence of cancer, e.g., solid tumors of epithelial origin, e.g., as defined by the ICD-O (International Classification of Diseases for Oncology) code (revision 3), section (8010-8790), e.g., early stage cancer, is associated with the presence of massive levels of satellite transcripts due to an increase in transcription and processing of satellite repeats in pancreatic cancer cells, and with increased levels of SCG expression in circulating tumor cells. Thus the methods can include the detection of expression levels of satellite repeats in a sample comprising cells known or suspected of being tumor cells, e.g., cells from solid tumors of epithelial origin, e.g., pancreatic, lung, breast, prostate, renal, ovarian or colon cancer cells. Alternatively or in addition, the methods can include the detection of increased levels of SCG in a sample, e.g., a sample known or suspected of including tumor cells, e.g., circulating tumor cells (CTCs), e.g., using a microfluidic device as described herein.
Cancers of epithelial origin can include pancreatic cancer (e.g., pancreatic adenocarcinoma or intraductal papillary mucinous neoplasm (IPMN, pancreatic mass)), lung cancer (e.g., non-small cell lung cancer), prostate cancer, breast cancer, renal cancer, ovarian cancer, or colon cancer. For example, the present methods can be used to distinguish between benign IPMN, for which surveillance is the standard treatment, and malignant IPMN, which requires resection, a procedure associated with significant morbidity and a small but significant possibility of death. In some embodiments, in a subject diagnosed with IPMN, the methods described herein can be used for surveillance/monitoring of the subject, e.g., the methods can be repeated at selected intervals (e.g., every 3, 6, 12, or 24 months) to determine whether a benign IPMN has become a malignant IPMN warranting surgical intervention. In addition, in some embodiments the methods can be used to distinguish bronchioloalveolar carcinomas from reactive processes (e.g., postpneumonic reactive processes) in samples from subjects suspected of having non-small cell lung cancer. In some embodiments, in a sample from a subject who is suspected of having breast cancer, the methods can be used to distinguish ductal hyperplasia from atypical ductal hyperplasia and ductal carcinoma in situ (DCIS). The two latter categories receive resection/radiation; the former does not require intervention. In some embodiments, in subjects suspected of having prostate cancer, the methods can be used to distinguish between atypical small acinar proliferation and malignant cancer. In some embodiments, in subjects suspected of having bladder cancer, the methods can be used to detect, e.g., transitional cell carcinoma (TCC), e.g., in urine specimens. In some embodiments, in subjects diagnosed with Barrett's Esophagus (Sharma, N Engl J Med. 2009, 24; 361(26):2548-56. Erratum in: N Engl J Med. 2010 Apr. 15; 362(15):1450), the methods can be used for distinguishing dysplasia in Barrett's esophagus from a reactive process. The clinical implications are significant, as a diagnosis of dysplasia demands a therapeutic intervention. Other embodiments include, but are not limited to, diagnosis of well differentiated hepatocellular carcinoma, ampullary and bile duct carcinoma, glioma vs. reactive gliosis, melanoma vs. dermal nevus, low grade sarcoma, and pancreatic endocrine tumors, inter alia.
Therefore, included herein are methods for diagnosing cancer, e.g., tumors of epithelial origin, e.g., pancreatic, lung, breast, prostate, renal, ovarian or colon cancer, in a subject. In some embodiments, the methods include obtaining a sample from a subject, and evaluating the presence and/or level of SCG and/or satellites in the sample, and comparing the presence and/or level with one or more references, e.g., a control reference that represents a normal level of SCG or satellites, e.g., a level in an unaffected subject or a normal cell from the same subject, and/or a disease reference that represents a level of SCG or satellites associated with cancer, e.g., a level in a subject having pancreatic, lung, breast, prostate, renal, ovarian or colon cancer.
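The comparison of a test value against a control reference, as described above, can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name, the z-score rule, and the cutoff of 2.0 are hypothetical, since the specification does not prescribe a particular statistical test for deciding that a level is significantly above a reference.

```python
from statistics import mean, stdev

def exceeds_reference(test_value, control_values, z_cutoff=2.0):
    """Return True if test_value lies more than z_cutoff sample standard
    deviations above the mean of the control (normal) reference values.

    The z-score rule and its default cutoff are assumptions made for
    illustration; any suitable statistical test could be substituted.
    """
    if len(control_values) < 2:
        raise ValueError("need at least two control values to estimate spread")
    mu = mean(control_values)
    sigma = stdev(control_values)
    if sigma == 0:
        # Degenerate reference: any value above the constant level counts.
        return test_value > mu
    return (test_value - mu) / sigma > z_cutoff

# Hypothetical normalized SCG transcript levels (arbitrary units).
controls = [1.0, 1.2, 0.9, 1.1, 0.8]
print(exceeds_reference(5.0, controls))  # well above the control range
print(exceeds_reference(1.1, controls))  # within the control range
```

A disease reference could be handled in the same way by comparing the test value against levels measured in subjects known to have cancer.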
The present methods can also be used to determine the stage of a cancer, e.g., whether a sample includes cells that are from a precancerous lesion, an early stage tumor, or an advanced tumor. For example, the present methods can be used to determine whether a subject has a precancerous pancreatic, breast, or prostate lesion. Where the markers used are SCG transcript or encoded proteins, increasing levels are correlated with advancing stage.
Samples
In some embodiments of the present methods, the sample is or includes blood, serum, and/or plasma, or a portion or subfraction thereof, e.g., free RNA in serum or RNA within exosomes in blood. In some embodiments, the sample comprises (or is suspected of comprising) CTCs. In some embodiments, the sample is or includes urine or a portion or subfraction thereof. In some embodiments, the sample includes known or suspected tumor cells, e.g., is a biopsy sample, e.g., a fine needle aspirate (FNA), endoscopic biopsy, or core needle biopsy; in some embodiments the sample comprises cells from the pancreas, lung, breast, prostate, kidney, ovary, or colon of the subject. In some embodiments, the sample comprises lung cells obtained from a sputum sample or from the lung of the subject by brushing, washing, bronchoscopic biopsy, transbronchial biopsy, or FNA, e.g., bronchoscopic, fluoroscopic, or CT-guided FNA (such methods can also be used to obtain samples from other tissues). In some embodiments, the sample is frozen, fixed and/or permeabilized, e.g., is a formalin-fixed paraffin-embedded (FFPE) sample.
Methods of Detection
Any methods known in the art can be used to detect and/or quantify levels of a biomarker as described herein. For example, the level of an SCG mRNA (transcript) can be evaluated using methods known in the art, e.g., Northern blot, RNA in situ hybridization (RNA-ISH), RNA expression assays, e.g., microarray analysis, RT-PCR, RNA sequencing (e.g., using random primers or oligo(dT) primers), deep sequencing, cloning, and amplifying the transcript, e.g., using quantitative real time polymerase chain reaction (qRT-PCR). Analytical techniques to determine RNA expression are known. See, e.g., Sambrook et al., Molecular Cloning: A Laboratory Manual, 3rd Ed., Cold Spring Harbor Press, Cold Spring Harbor, N.Y. (2001).
In some embodiments, where the SCG is a coding transcript (see Table 6), the level of an SCG-encoded protein is detected. The presence and/or level of a protein can be evaluated using methods known in the art, e.g., using quantitative immunoassay methods such as enzyme linked immunosorbent assays (ELISAs), immunoprecipitations, immunofluorescence, immunohistochemistry, enzyme immunoassay (EIA), radioimmunoassay (RIA), and Western blot analysis.
In some embodiments, the methods include contacting an agent that selectively binds to a biomarker, e.g., to an SCG transcript/mRNA or protein (such as an oligonucleotide probe, an antibody or antigen-binding portion thereof) with a sample, to evaluate the level of the biomarker in the sample. In some embodiments, the agent bears a detectable label. The term “labeled,” with regard to an agent encompasses direct labeling of the agent by coupling (i.e., physically linking) a detectable substance to the agent, as well as indirect labeling of the agent by reactivity with a detectable substance. Examples of detectable substances are known in the art and include chemiluminescent, fluorescent, radioactive, or colorimetric labels. For example, detectable substances can include various enzymes, prosthetic groups, fluorescent materials, luminescent materials, bioluminescent materials, and radioactive materials. Examples of suitable enzymes include horseradish peroxidase, alkaline phosphatase, beta-galactosidase, or acetylcholinesterase; examples of suitable prosthetic group complexes include streptavidin/biotin and avidin/biotin; examples of suitable fluorescent materials include umbelliferone, fluorescein, fluorescein isothiocyanate, rhodamine, dichlorotriazinylamine fluorescein, dansyl chloride, quantum dots, or phycoerythrin; an example of a luminescent material includes luminol; examples of bioluminescent materials include luciferase, luciferin, and aequorin, and examples of suitable radioactive material include 125I, 131I, 35S or 3H. In general, where a protein is to be detected, antibodies can be used. Antibodies can be polyclonal, or more preferably, monoclonal. An intact antibody, or an antigen-binding fragment thereof (e.g., Fab or F(ab′)2) can be used.
In some embodiments, high throughput methods, e.g., protein or gene chips as are known in the art (see, e.g., Ch. 12, “Genomics,” in Griffiths et al., Eds. Modern Genetic Analysis, 1999, W. H. Freeman and Company; Ekins and Chu, Trends in Biotechnology, 1999; 17:217-218; MacBeath and Schreiber, Science 2000, 289(5485):1760-1763; Simpson, Proteins and Proteomics: A Laboratory Manual, Cold Spring Harbor Laboratory Press; 2002; Hardiman, Microarrays Methods and Applications: Nuts & Bolts, DNA Press, 2003), can be used to detect the presence and/or level of satellites or SCG.
In some embodiments, the methods include using a modified RNA in situ hybridization technique using a branched-chain DNA assay to directly detect and evaluate the level of biomarker mRNA in the sample (see, e.g., Luo et al., U.S. Pat. No. 7,803,541B2, 2010; Canales et al., Nature Biotechnology 24(9):1115-1122 (2006); Nguyen et al., Single Molecule in situ Detection and Direct Quantification of miRNA in Cells and FFPE Tissues, poster available at panomics.com/index.php?id=product_87). A kit for performing this assay is commercially available from Affymetrix (ViewRNA).
Detection of SCG Transcripts in CTCs
In some embodiments, microfluidic (e.g., “lab-on-a-chip”) devices can be used in the present methods. Such devices have been successfully used for microfluidic flow cytometry, continuous size-based separation, and chromatographic separation. In general, methods in which expression of SCG transcripts is detected in circulating tumor cells (CTCs) can be used for the early detection of cancer, e.g., early detection of tumors of epithelial origin, e.g., pancreatic, lung, breast, prostate, renal, ovarian or colon cancer.
The devices can be used for separating CTCs from a mixture of cells, or preparing an enriched population of CTCs. In particular, such devices can be used for the isolation of CTCs from complex mixtures such as whole blood.
A variety of approaches can be used to separate CTCs from a heterogeneous sample. For example, a device can include an array of multiple posts arranged in a hexagonal packing pattern in a microfluidic channel upstream of a block barrier. The posts and the block barrier can be functionalized with different binding moieties. For example, the posts can be functionalized with anti-EPCAM antibody to capture circulating tumor cells (CTCs); see, e.g., Nagrath et al., Nature 450:1235-1239 (2007), optionally with downstream block barriers functionalized with binding moieties to capture SCG nucleic acids or proteins, or satellites. See, e.g., (13-15) and the applications and references listed herein.
Processes for enriching specific particles from a sample are generally based on sequential processing steps, each of which reduces the number of undesired cells/particles in the mixture, but one processing step may suffice in some embodiments. Devices for carrying out various processing steps can be separate or integrated into one microfluidic system. The devices include devices for cell/particle binding, devices for cell lysis, devices for arraying cells, and devices for particle separation, e.g., based on size, shape, and/or deformability or other criteria. In certain embodiments, processing steps are used to reduce the number of cells prior to introducing them into the device or system. In some embodiments, the devices retain at least 75%, e.g., 80%, 90%, 95%, 98%, or 99% of the desired cells compared to the initial sample mixture, while enriching the population of desired cells by a factor of at least 100, e.g., by 1000, 10,000, 100,000, or even 1,000,000 relative to one or more non-desired cell types.
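The retention and enrichment figures given above can be illustrated with a short worked calculation in Python. This is a sketch only: the function names and the cell counts are hypothetical, and the enrichment factor is assumed here to be the ratio of desired to non-desired cells after processing divided by the same ratio before processing.

```python
def retention(desired_in, desired_out):
    """Fraction of the desired cells in the input that appear in the output."""
    if desired_in <= 0:
        raise ValueError("desired_in must be positive")
    return desired_out / desired_in

def enrichment_factor(desired_in, undesired_in, desired_out, undesired_out):
    """Desired-to-undesired ratio after processing divided by the same
    ratio before processing (an assumed definition for illustration)."""
    if min(desired_in, undesired_in, undesired_out) <= 0:
        raise ValueError("desired_in, undesired_in and undesired_out must be positive")
    ratio_before = desired_in / undesired_in
    ratio_after = desired_out / undesired_out
    return ratio_after / ratio_before

# Hypothetical counts: 100 CTCs among 5,000,000 leukocytes before processing;
# 90 CTCs and 450 leukocytes recovered after processing.
r = retention(100, 90)
e = enrichment_factor(100, 5_000_000, 90, 450)
print(f"retention = {r:.0%}, enrichment = {e:,.0f}-fold")
```

With these hypothetical counts the device would retain 90% of the desired cells while enriching them roughly 10,000-fold relative to leukocytes, which falls within the ranges stated above.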
Some devices for the separation of particles rely on size-based separation with or without simultaneous cell binding. Some size-based separation devices include one or more arrays of obstacles that cause lateral displacement of CTCs and other components of fluids, thereby offering mechanisms of enriching or otherwise processing such components. The array(s) of obstacles for separating particles according to size typically define a network of gaps, wherein a fluid passing through a gap is divided unequally into subsequent gaps. Both sieve and array sized-based separation devices can incorporate selectively permeable obstacles as described above with respect to cell-binding devices.
Devices including an array of obstacles that form a network of gaps can include, for example, a staggered two-dimensional array of obstacles, e.g., such that each successive row is offset by less than half of the period of the previous row. The obstacles can also be arranged in different patterns. Examples of possible obstacle shapes and patterns are discussed in more detail in WO 2004/029221.
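The staggered arrangement described above can be made concrete by computing obstacle center coordinates. The following Python sketch is illustrative only; the function name, parameter names, and dimensions are assumptions and are not taken from WO 2004/029221.

```python
def staggered_posts(n_rows, n_cols, period, row_spacing, offset_fraction):
    """Return (x, y) centers of obstacles in a staggered two-dimensional array.

    Each successive row is shifted along x by offset_fraction * period
    relative to the previous row, with offset_fraction below one half.
    The shift wraps around modulo the period so the array stays bounded.
    All names and units are hypothetical.
    """
    if not 0 < offset_fraction < 0.5:
        raise ValueError("offset_fraction must be greater than 0 and less than 0.5")
    posts = []
    for row in range(n_rows):
        shift = (row * offset_fraction * period) % period
        for col in range(n_cols):
            posts.append((col * period + shift, row * row_spacing))
    return posts

# Hypothetical geometry: 3 rows of 2 posts, 10-unit period, 8-unit row
# spacing, each row offset by one quarter of the period.
for x, y in staggered_posts(3, 2, 10.0, 8.0, 0.25):
    print(x, y)
```

Choosing the offset as a fraction of the period makes the constraint in the text (less than half a period per row) explicit as a single validated parameter.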
In some embodiments, the device can provide separation and/or enrichment of CTCs using array-based size separation methods, e.g., as described in U.S. Pat. Pub. No. 2007/0026413. In general, the devices include one or more arrays of selectively permeable obstacles that cause lateral displacement of large particles such as CTCs and other components suspended in fluid samples, thereby offering mechanisms of enriching or otherwise processing such components, while also offering the possibility of selectively binding other, smaller particles that can penetrate into the voids in the dense matrices of nanotubes that make up the obstacles. Devices that employ such selectively permeable obstacles for size, shape, or deformability based enrichment of particles, including filters, sieves, and enrichment or separation devices, are described in International Publication Nos. 2004/029221 and 2004/113877, Huang et al. Science 304:987-990 (2004), U.S. Publication No. 2004/0144651, U.S. Pat. Nos. 5,837,115 and 6,692,952, and U.S. Application Nos. 60/703,833, 60/704,067, and Ser. No. 11/227,904; devices useful for affinity capture, e.g., those described in International Publication No. 2004/029221 and U.S. application Ser. No. 11/071,679; devices useful for preferential lysis of cells in a sample, e.g., those described in International Publication No. 2004/029221, U.S. Pat. No. 5,641,628, and U.S. Application No. 60/668,415; devices useful for arraying cells, e.g., those described in International Publication No. 2004/029221, U.S. Pat. No. 6,692,952, and U.S. application Ser. Nos. 10/778,831 and 11/146,581; and devices useful for fluid delivery, e.g., those described in U.S. application Ser. Nos. 11/071,270 and 11/227,469. Two or more devices can be combined in series, e.g., as described in International Publication No. WO 2004/029221. All of the foregoing are incorporated by reference herein.
In some embodiments, a device can contain obstacles that include binding moieties, e.g., monoclonal anti-EpCAM antibodies or fragments thereof, that selectively bind to particular cell types, e.g., cells of epithelial origin, e.g., tumor cells. All of the obstacles of the device can include these binding moieties; alternatively, only a subset of the obstacles include them. Devices can also include additional modules, e.g., a cell counting module or a detection module, which are in fluid communication with the microfluidic channel device. For example, the detection module can be configured to visualize an output sample of the device.
In one example, a detection module can be in fluid communication with a separation or enrichment device. The detection module can operate using any method of detection disclosed herein, or other methods known in the art. For example, the detection module includes a microscope, a cell counter, a magnet, a biocavity laser (see, e.g., Gourley et al., J. Phys. D: Appl. Phys., 36: R228-R239 (2003)), a mass spectrometer, a PCR device, an RT-PCR device, a microarray, a device for performing RNA in situ hybridization, or a hyperspectral imaging system (see, e.g., Vo-Dinh et al., IEEE Eng. Med. Biol. Mag., 23:40-49 (2004)). In some embodiments, a computer terminal can be connected to the detection module. For instance, the detection module can detect a label that selectively binds to cells, proteins, or nucleic acids of interest, e.g., SCG transcripts or encoded proteins.
In some embodiments, the microfluidic system includes (i) a device for separation or enrichment of CTCs; (ii) a device for lysis of the enriched CTCs; and (iii) a device for detection of SCG transcripts or encoded proteins.
In some embodiments, a population of CTCs prepared using a microfluidic device as described herein is used for analysis of expression of SCG transcripts or proteins using known molecular biological techniques, e.g., as described above and in Sambrook, Molecular Cloning: A Laboratory Manual, Third Edition (Cold Spring Harbor Laboratory Press; 3rd edition (Jan. 15, 2001)); and Short Protocols in Molecular Biology, Ausubel et al., eds. (Current Protocols; 5th edition (Nov. 5, 2002)).
In general, devices for detection and/or quantification of expression of SCG transcripts or encoded proteins in an enriched population of CTCs are described herein and can be used for the early detection of cancer, e.g., tumors of epithelial origin, e.g., early detection of pancreatic, lung, breast, prostate, renal, ovarian or colon cancer.
Methods of Monitoring Disease Progress or Treatment Efficacy
In some embodiments, once it has been determined that a person has cancer, or has an increased risk of developing cancer, then a treatment, e.g., as known in the art, can be administered. The efficacy of the treatment can be monitored using the methods described herein; an additional sample can be evaluated after (or during) treatment, e.g., after one or more doses of the treatment are administered, and a decrease in the level of expression of SCG transcripts or encoded protein, or in the number of SCG transcript or protein-expressing cells in a sample, would indicate that the treatment was effective, while no change or an increase in the level of SCG transcript or protein-expressing cells would indicate that the treatment was not effective. The methods can be repeated multiple times during the course of treatment, and/or after the treatment has been concluded, e.g., to monitor potential recurrence of disease.
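The decision rule described above, in which a decrease indicates efficacy and no change or an increase indicates lack of efficacy, can be sketched in Python as follows. The function name and the optional tolerance parameter (a margin below which a decrease is treated as no change, e.g., to allow for assay noise) are assumptions added for illustration and are not part of the described methods.

```python
def assess_treatment(first_value, treatment_value, tolerance=0.0):
    """Classify treatment response from two SCG level measurements.

    Returns "effective" when the treatment value is lower than the first
    value by more than `tolerance`; otherwise (no change or an increase)
    returns "not effective". The tolerance is a hypothetical parameter.
    """
    if tolerance < 0:
        raise ValueError("tolerance must be non-negative")
    if first_value - treatment_value > tolerance:
        return "effective"
    return "not effective"

# Hypothetical SCG transcript levels (arbitrary units).
print(assess_treatment(8.0, 3.0))                  # decrease
print(assess_treatment(8.0, 8.0))                  # no change
print(assess_treatment(8.0, 9.5))                  # increase
print(assess_treatment(8.0, 7.9, tolerance=0.5))   # decrease within tolerance
```

The same function could be applied to each sample collected during or after treatment to follow the response over time.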
In some embodiments, e.g., for subjects who have been diagnosed with a benign condition that could lead to cancer, subjects who have been successfully treated for a cancer, or subjects who have an increased risk of cancer, e.g., due to a genetic predisposition or environmental exposure to cancer-causing agents, the methods can be repeated at selected intervals, e.g., at 3, 6, 12, or 24 month intervals, to monitor the disease in the subject for early detection of progression to malignancy or development of cancer in the subject.
The invention is further described in the following examples, which do not limit the scope of the invention described in the claims.
The next generation digital gene expression (DGE) application from Helicos BioSciences (D. Lipson et al., Nat Biotechnol 27, 652 (July, 2009)) was utilized to compare expression of tumor markers in primary cancers and their derived metastatic precursors. We first determined DGE profiles of primary mouse pancreatic ductal adenocarcinoma (PDAC) generated through tissue-targeted expression of activated Kras and loss of Tp53 (KrasG12D, Tp53lox/+) (N. Bardeesy et al., Proc Natl Acad Sci USA 103, 5947 (Apr. 11, 2006)). These tumors are histopathological and genetic mimics of human PDAC, which exhibits virtually universal mutant KRAS (>90% of cases) and loss of TP53 (50-60%).
Mice with pancreatic cancer of different genotypes were bred as previously described in the Bardeesy laboratory (Bardeesy et al., Proc Natl Acad Sci USA 103, 5947 (2006)). Normal wild type mice were purchased from Jackson laboratories. Animals were euthanized as per animal protocol guidelines. Pancreatic tumors and normal tissue were extracted sterilely and then flash frozen with liquid nitrogen. Tissues were stored at −80° C. Cell lines were generated fresh for animals AH367 and AH368 as previously described (Aguirre et al., Genes Dev 17, 3112 (2003)) and established cell lines were cultured in RPMI-1640+10% FBS+1% Pen/Strep (Gibco/Invitrogen). Additional mouse tumors from colon and lung were generously provided by Kevin Haigis (Massachusetts General Hospital) and Kwok-Kin Wong (Dana Farber Cancer Institute).
Fresh frozen tissue was pulverized with a sterile pestle in a microfuge tube on dry ice. Cell lines were cultured and fresh frozen in liquid nitrogen prior to nucleic acid extraction. RNA and DNA from cell lines and fresh frozen tumor and normal tissues were all processed in the same manner. RNA was extracted using the TRIzol® Reagent (Invitrogen) per manufacturer's specifications. DNA from tissue and cell lines was extracted using the QIAamp Mini Kit (QIAGEN) per manufacturer's protocol.
Purified RNA was subjected to Digital Gene Expression (DGE) sample preparation and analysis on the HeliScope™ Single Molecule Sequencer from Helicos BioSciences. This method has been previously described (Lipson et al., Nat Biotechnol 27, 652 (2009)). Briefly, single stranded cDNA was reverse transcribed from RNA with a dTU25V primer and the Superscript III cDNA synthesis kit (Invitrogen). RNA was digested and single stranded cDNA was purified using a solid phase reversible immobilization (SPRI) technique with Agencourt® AMPure® magnetic beads. Single stranded cDNA was denatured and then a poly-A tail was added to the 3′ end using terminal transferase (New England Biolabs).
Purified DNA was subjected to the DNA sequencing sample preparation protocol from Helicos that has been previously described (D. Pushkarev, N. F. Neff, S. R. Quake, Nat Biotechnol 27, 847 (2009)). Briefly, genomic DNA was sheared with a Covaris S2 acoustic sonicator, producing fragments averaging 200 bp and ranging from 100-500 bp. Sheared DNA was then cleaned with SPRI. DNA was then denatured and a poly-A tail was added to the 3′ end using terminal transferase.
Tailed cDNA or DNA were then hybridized to the sequencing flow cell followed by “Fill and Lock” and single molecule sequencing. Gene expression sequence reads were then aligned to the known human or mouse transcriptome libraries using the DGE program. Genomic DNA sequence reads were aligned to the mouse genome and counted to determine copy number of the major mouse satellite (CNV).
The first mouse pancreatic tumor analyzed, AH284, was remarkable in that DGE sequences displayed a 48-52% discrepancy with the annotated mouse transcriptome, compared with a 3-4% difference for normal liver transcripts from the same mouse. Nearly all the discrepant sequences mapped to the pericentric (major) mouse satellite repeat. The satellite transcript accounted for ~49% (495,421 tpm) of all cellular transcripts in the tumor, compared with 0.02-0.4% (196-4,115 tpm) in normal pancreas or liver (Table 1).
Satellite sequence reads were found in both sense and anti-sense directions and were absent from poly-A purified RNA. Tumor AH284 therefore contained massive amounts of a non-polyadenylated dsRNA element, quantitatively determined as >100-fold increased over that present in normal tissue from the same animal. By way of comparison, the levels of satellite transcripts in tumor tissues were about 8,000-fold higher than the abundant mRNA Gapdh. A second independent pancreatic tumor nodule from the same mouse showed a lower, albeit still greatly elevated, level of satellite transcript (4.5% of total cellular transcripts).
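The relationship between the tpm values and the percentages and fold differences quoted above can be checked with a short calculation. The following is a minimal sketch; the helper names are illustrative, and only the tpm values are taken from the text.

```python
def tpm_to_percent(tpm: float) -> float:
    """Convert transcripts per million (tpm) to percent of all transcripts."""
    return tpm / 1_000_000 * 100


def fold_change(sample: float, reference: float) -> float:
    """Ratio of a sample value to a reference value in the same units."""
    if reference <= 0:
        raise ValueError("reference must be positive")
    return sample / reference


tumor_tpm = 495_421               # satellite tpm in tumor AH284
normal_tpm_range = (196, 4_115)   # satellite tpm in normal pancreas or liver

print(f"tumor satellite share: {tpm_to_percent(tumor_tpm):.1f}% of transcripts")
for normal_tpm in normal_tpm_range:
    ratio = fold_change(tumor_tpm, normal_tpm)
    print(f"fold over normal at {normal_tpm} tpm: {ratio:.0f}")
```

Even against the upper end of the normal range (4,115 tpm), the ratio exceeds 100, consistent with the >100-fold increase stated above.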
Analysis of 4 additional pancreatic tumors from (KrasG12D, Tp53lox/+) mice and 4 mice with an alternative pancreatic tumorigenic genotype (KrasG12D, SMAD4lox/lox) revealed increased satellite expression in 6/8 additional tumors (range 1-15% of all cellular transcripts). In 2/3 mouse colon cancer tumors (KrasG12D, APClox/+) and 2/2 lung cancers (KrasG12D, Tp53lox/lox), satellite expression level ranged from 2-16% of all cellular transcripts. In total, 12/15 (80%) independent mouse tumors had greatly increased levels of satellite expression, compared to normal mouse tissues (
Of note, the composite distribution of all RNA reads among coding, ribosomal and other non-coding transcripts showed significant variation between primary tumors and normal tissues (
Northern blot analysis of mouse primary pancreatic tumors was carried out as follows. Northern blotting was performed using the NorthernMax-Gly Kit (Ambion). Total RNA (10 μg) was mixed with an equal volume of Glyoxal Load Dye (Ambion) and incubated at 50° C. for 30 min. After electrophoresis in a 1% agarose gel, RNA was transferred onto BrightStar-Plus membranes (Ambion) and crosslinked with ultraviolet light. The membrane was prehybridized in ULTRAhyb buffer (Ambion) at 68° C. for 30 min. The mouse RNA probe (1100 bp) was prepared using the MAXIscript Kit (Ambion) and was nonisotopically labeled using the BrightStar Psoralen-Biotin Kit (Ambion) according to the manufacturer's instructions. Using 0.1 nM probe, the membrane was hybridized in ULTRAhyb buffer (Ambion) at 68° C. for 2 hours. The membrane was washed with a Low Stringency wash at room temperature for 10 min, followed by two High Stringency washes at 68° C. for 15 min. For nonisotopic chemiluminescent detection, the BrightStar BioDetect Kit was used according to the manufacturer's instructions.
The results demonstrated that the major satellite-derived transcripts range from 100 bp to 2.5 kb (
To determine whether genomic amplification of satellite repeats also contributes toward the exceptional abundance of these transcripts in mouse pancreatic tumors, the index AH284 tumor was analyzed using next generation DNA digital copy number variation (CNV) analysis as described above for genomic DNA sequencing.
The results, shown in Table 3, indicated that satellite DNA comprised 18.8% of all genome-aligned reads in this tumor, compared with 2.3% of genomic sequences in matched normal liver. The major satellite repeat has previously been estimated at approximately 3% of the normal mouse genome (J. H. Martens et al., EMBO J 24, 800 (Feb. 23, 2005)). Thus, in this tumor with >100-fold increased expression of satellite repeats, approximately 8-fold gene amplification of the repeats may contribute to their abnormal expression.
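The approximately 8-fold figure follows directly from the two percentages reported above, as the following minimal sketch shows. The variable names are illustrative; the percentages are those given in the text.

```python
tumor_satellite_pct = 18.8   # % of genome-aligned reads, tumor AH284
normal_satellite_pct = 2.3   # % of genome-aligned reads, matched normal liver

# Ratio of satellite read fractions, used as an estimate of relative copy number.
copy_number_fold = tumor_satellite_pct / normal_satellite_pct
print(f"satellite DNA fold amplification: {copy_number_fold:.1f}")
```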
To test whether human tumors also overexpress satellite ncRNAs, we extended the DGE analysis to specimens of human pancreatic cancer. Human pancreatic tumor tissues were obtained as excess discarded human material per IRB protocol from the Massachusetts General Hospital. Gross tumor was excised and fresh frozen in liquid nitrogen prior to nucleic acid extraction. Normal pancreas RNA was obtained from two commercial vendors, Clontech and Ambion. The samples were prepared and analyzed as described above in Example 1.
Analysis of 15 PDACs showed a median 21-fold increased expression of total satellite transcripts compared with normal pancreas. A cohort of non-small cell lung cancer, renal cell carcinoma, ovarian cancer, and prostate cancer also had significant levels of total satellite and HSATII satellite transcripts. Other normal human tissues, including fetal brain, brain, colon, fetal liver, liver, lung, kidney, placenta, prostate, and uterus have somewhat higher levels of total satellite expression (Table 4,
Subdivision of human satellite among the multiple classes revealed major differences between tumors and all normal tissues. While mouse satellite repeats are broadly subdivided into major and minor satellites, human satellites have been classified more extensively. Of all human satellites, the greatest expression fold differential is evident for the pericentromeric satellite HSATII (mean 2,416 tpm; 10.3% of satellite reads), which is undetectable in normal human pancreas (
The most abundant class of normally expressed human satellites, alpha (ALR) (Okada et al., Cell 131, 1287 (Dec. 28, 2007)) is expressed at 294 tpm in normal human pancreas, but comprises on average 12,535 tpm in human pancreatic adenocarcinomas (43-fold differential expression; 60.3% of satellite reads). Thus, while the overexpression of human ALR repeats is comparable to that of mouse major satellite repeats, it is the less abundant HSATII (49-fold above GAPDH), which shows exceptional specificity for human PDAC. The co-expression of LINE-1 with satellite transcripts in human pancreatic tumors is also striking, with a mean 16,089 tpm (range 358-38,419).
Beyond ALR repeats, the satellite expression profiles of normal pancreas and PDAC are strikingly different; for instance, normal pancreatic tissue has a much higher representation of GSATII, TAR1 and SST1 classes (26.4%, 10.6%, and 8.6% of all satellite reads, respectively), while these were a small minority of satellite reads in pancreatic cancers. In contrast, cancers express high levels of HSATII satellites (4,000 per 10^6 transcripts; 15% of satellite reads), a subtype whose expression is undetectable in normal pancreas (
The generation of comprehensive DGE profiles for 25 different mouse tissues of different histologies and genetic backgrounds made it possible to correlate the expression of cellular transcripts with that of satellites across a broad quantitative range. To identify such co-regulated genes, all annotated transcripts quantified by DGE were subjected to linear regression analysis, and transcripts with the highest correlation coefficients to satellite expression were rank ordered.
All mouse sample reads were aligned to a custom-made library for the mouse major satellite (sequence from the UCSC Genome Browser). Human samples were aligned to a custom-made reference library for all satellite repeats and LINE-1 variants generated from the Repbase library (Pushkarev et al., Nat Biotechnol 27, 847 (2009)). In addition, all samples were subjected to the DGE program for transcriptome analysis. Reads were normalized per 10^6 genomic aligned reads for all samples.
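The normalization step can be sketched as follows. This is an illustrative example only: the function name and the read counts are hypothetical and do not come from the data described herein.

```python
def reads_per_million(counts: dict[str, int], total_aligned_reads: int) -> dict[str, float]:
    """Scale raw read counts to reads per 10^6 aligned reads."""
    if total_aligned_reads <= 0:
        raise ValueError("total_aligned_reads must be positive")
    scale = 1_000_000 / total_aligned_reads
    return {name: count * scale for name, count in counts.items()}


# Hypothetical raw counts for one sample with 2,000,000 aligned reads.
example_counts = {"ALR": 1_200, "HSATII": 300, "LINE-1": 2_500}
normalized = reads_per_million(example_counts, total_aligned_reads=2_000_000)
for name, value in normalized.items():
    print(f"{name}: {value:.0f} per million")
```

Normalizing to a common denominator allows samples sequenced to different depths to be compared directly.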
For linear correlation of mouse major satellite to transcriptome, all tissues and cell lines were rank ordered according to level of major satellite. All annotated genes were then subjected to linear regression analysis across all tissues. Genes were then ordered according to the Pearson coefficient for linear regression and plotted by Matlab.
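The ranking procedure described above can be sketched in Python rather than Matlab. The following is a minimal illustration and not the original analysis code: the function names, gene labels, and expression values are hypothetical, and the 0.85 cutoff is the correlation threshold used in this example.

```python
import math


def pearson_r(x: list[float], y: list[float]) -> float:
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("profiles must have equal length of at least 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant profile carries no correlation signal
    return sxy / math.sqrt(sxx * syy)


def rank_satellite_correlated(satellite, expression, r_cutoff=0.85):
    """Return (gene, r) pairs with r above the cutoff, highest r first."""
    scored = [(gene, pearson_r(satellite, values)) for gene, values in expression.items()]
    hits = [(gene, r) for gene, r in scored if r > r_cutoff]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


# Hypothetical normalized levels across five samples, ordered by satellite level.
satellite = [200.0, 4_000.0, 45_000.0, 150_000.0, 495_000.0]
expression = {
    "gene_a": [1.0, 3.0, 40.0, 160.0, 480.0],    # rises with satellite level
    "gene_b": [50.0, 48.0, 52.0, 49.0, 51.0],    # roughly flat
    "gene_c": [300.0, 250.0, 120.0, 60.0, 5.0],  # falls as satellite level rises
}
hits = rank_satellite_correlated(satellite, expression)
print(hits)
```

Only the gene whose profile rises with satellite level passes the cutoff in this example; the flat and inversely related profiles are excluded.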
Analysis of a set of 297 genes with highest linear correlation (R>0.85) revealed 190 annotated cellular mRNAs and a subset of transposable elements (
A subset of cellular mRNAs showed a very high degree of correlation with the levels of satellite repeat expression across diverse mouse tumors (referred to herein as “Satellite Correlated Genes (SCGs)”). Linearly correlated genes with R>0.85 were mapped using the DAVID program (Dennis, Jr. et al., Genome Biol 4, P3 (2003); Huang et al., Nat Protoc 4, 44 (2009)). These genes were then analyzed with the Functional Annotation clustering program and the UP TISSUE database to classify each of these mapped genes. Germ/Stem cell genes included genes expressed highly in testis, egg, trophoblast, and neural stem cells. Neural genes included genes expressed highly in brain, spinal cord, and specialized sensory neurons including olfactory, auditory, and visual perception. HOX and Zinc Finger proteins were classified using the INTERPRO database.
Analysis of 190 annotated transcripts using the DAVID gene ontology program identified 120 (63%) of these transcripts as being associated with neural cell fates and 50 (26%) linked with germ/stem cells pathways (Table 5).
In addition, significant enrichment was evident for transcriptional regulators, including HOX related (9, 5%) and zinc finger proteins (16, 8%). This gene set could not be matched to any known gene signature in the GSEA database (Subramanian et al., Proc Natl Acad Sci USA 102, 15545 (Oct. 25, 2005)), but the ontology analysis points towards a neuroendocrine phenotype. Neuroendocrine differentiation has been described in a variety of epithelial malignancies, including pancreatic cancer (Tezel et al., Cancer 89, 2230 (Dec. 1, 2000)), and is best characterized in prostate cancer where it is correlated with more aggressive disease (Cindolo et al., Urol Int 79, 287 (2007)). A striking increase in the number of carcinoma cells staining for the characteristic neuroendocrine marker chromogranin A, as a function of higher satellite expression in mouse PDACs (
A parallel analysis in human pancreatic cancers and normal tissues using ALR, the most abundant human satellite, yielded a total of 539 SCGs. Of these, 206 could be mapped by the DAVID gene ontology program, with a similar enrichment of germ/stem and neural cell fates (Table 6). Together, these observations suggest that, as in the mouse genetic model, tumor-associated derepression of satellite-derived repeats is highly correlated with increased expression of a subset of cellular mRNAs.
The list of SCGs with utility as biomarkers was further refined by selecting human SCGs with a minimum 20-fold differential between cancer and normal tissue and a minimum expression of 500 reads per million. The results are shown in Table 7.
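[Table 7: human SCGs meeting the biomarker criteria; only fragments of the gene descriptions ("Homo sapiens guanylate", "Homo sapiens similar", "Homo sapiens mRNA") survive in this text.]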
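The two refinement criteria stated above for Table 7 (a minimum 20-fold differential and a minimum expression of 500 reads per million) can be expressed as a simple filter. The sketch below is illustrative only. It assumes that the expression minimum applies to the cancer sample and that a gene undetectable in normal tissue satisfies the fold criterion; neither assumption is stated in the text, and the gene labels and values are hypothetical.

```python
def is_biomarker_candidate(cancer_rpm: float, normal_rpm: float,
                           min_fold: float = 20.0, min_rpm: float = 500.0) -> bool:
    """Apply the fold-differential and minimum-expression criteria."""
    if cancer_rpm < min_rpm:
        return False
    if normal_rpm <= 0:
        # Assumption: undetectable in normal tissue counts as passing the fold test.
        return True
    return cancer_rpm / normal_rpm >= min_fold


# Hypothetical (cancer, normal) expression in reads per million.
candidates = {
    "scg_1": (2_400.0, 0.0),    # undetectable in normal tissue
    "scg_2": (900.0, 100.0),    # 9-fold: fails the fold criterion
    "scg_3": (300.0, 2.0),      # below 500 reads per million
    "scg_4": (1_000.0, 50.0),   # exactly 20-fold
}
selected = [name for name, (cancer, normal) in candidates.items()
            if is_biomarker_candidate(cancer, normal)]
print(selected)
```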
Candidate SCGs identified using the Helicos RNA sequencing criteria described above (and listed in Table 7) were further evaluated using Affymetrix QUANTIGENE probes. Total RNA from 4 primary pancreatic ductal adenocarcinomas (PDAC) was analyzed using the QUANTIGENE Plex RNA assay. The results are shown below (Table 8).
Based on these data, Affymetrix VIEWRNA probes were developed for testing in formalin fixed paraffin embedded (FFPE) primary tumor specimens for HSP90BB and AK096196. These probes were tested using the RNA in situ hybridization (RNA-ISH) assay at MGH on FFPE material. Positive staining was seen on human cancer subcutaneous xenografts made in Nu/Nu mice using colon cancer cell line HCT-116. HSP90BB was further tested in primary human PDAC specimens from the MGH; the results, shown in
It is to be understood that while the invention has been described in conjunction with the detailed description thereof, the foregoing description is intended to illustrate and not limit the scope of the invention, which is defined by the scope of the appended claims. Other aspects, advantages, and modifications are within the scope of the following claims.
This application is a Continuation application of U.S. patent application Ser. No. 14/353,708, filed on Apr. 23, 2014, which is a National Stage application under 35 U.S.C. §371 of international application number PCT/US2012/061576, filed on Oct. 24, 2012, which claims priority under 35 U.S.C. §119 to U.S. Provisional Patent Application No. 61/550,617, filed on Oct. 24, 2011. The entire contents of the foregoing are incorporated herein by reference.
This invention was made with Government support under Grant No. CA129933 awarded by the National Institutes of Health. The Government has certain rights in the invention.
Number | Date | Country
---|---|---
61550617 | Oct 2011 | US

Relation | Number | Date | Country
---|---|---|---
Parent | 14353708 | Apr 2014 | US
Child | 15496234 | | US