Osteosarcoma (OS) is the most common malignant bone tumor of children, adolescents, and young adults, representing approximately 1% of newly diagnosed cancers in adults and 3-5% in children (1,2). With current treatment regimens, patients with non-metastatic OS have five-year survival rates above 65%, whereas the ~25% of patients presenting with metastases have a five-year survival of less than 20% (3,4). As such, early detection of OS prior to metastasis could significantly improve outcomes.
Early detection is especially needed in individuals who are predisposed to OS either genetically or through iatrogenic exposures. OS occurs at increased rates in several monogenic hereditary cancer syndromes such as retinoblastoma (RB1) (5), Li-Fraumeni syndrome (TP53), Bloom syndrome (RECQL2), Werner syndrome (RECQL3), and Rothmund-Thomson syndrome (RECQL4). OS also occurs with increased frequency in children exposed to radiation or alkylating agents, in Diamond-Blackfan anaemia patients, and in adults with bone disorders such as Paget's disease. Combined genetic predisposition and exposure to DNA damaging agents confers particularly high risk; for example, relative risk for children with hereditary retinoblastoma increased from ~69 without such treatments to ~302 for radiotherapy and ~539 for radiotherapy plus chemotherapy in the largest treatment-stratified analysis (5).
The need for OS biomarkers is reflected in a large yet inconclusive literature. Early studies focusing on bone markers such as alkaline phosphatase showed highly variable increases in OS patients (6). Later proteomic studies revealed two as-yet uncharacterized OS-associated proteins (7), whereas studies of miRNAs showed variable results (8-10). A recent study identified 56 miRs that were upregulated in pre-treatment OS patient plasma (11); however, among the top candidates (miR-21, miR-221, and miR-106a), levels increased by only ~2.4- to 8-fold and sensitivity was at best ~85%. An alternative approach is to detect aneuploidy via cell-free DNA (cfDNA) whole genome sequencing, yet at present this has limited sensitivity (12) due to the dilution of tumor cfDNA with non-tumor cfDNA. Currently no biomarkers have been shown to reliably detect naïve pre-symptomatic OS in predisposed individuals (13).
Beyond OS, many other cancers lack biomarkers with sufficient sensitivity, specificity, and low cost to enable medically beneficial and feasible cancer screening in cancer-predisposed patients and the general public. For example, a recently developed cancer screening approach (CancerSEEK) involves targeted sequencing and detection of single nucleotide variants (SNVs) and small insertions or deletions (INDELs) in ~500 cancer-related genes (e.g., oncogenes and tumor suppressor genes) in circulating tumor DNA (ctDNA). However, this approach cannot detect cancers that lack SNVs or INDELs in the pre-selected gene panel, such as the ~50% of OS with causative chromosome structure changes, and may be prohibitively expensive to deploy on a population-wide basis. Similarly, circulating tumor cells (CTCs) can be a good indicator of cancer, but CTC screens may fail to detect small incipient tumors and can require complex and expensive technology. miRNA biomarkers have also been explored yet have low or inconsistent sensitivity in different studies.
Beyond early cancer detection, there is a need for more sensitive monitoring of cancer treatment responses and relapse. In many cancers, response and relapse are monitored by imaging, which might not detect the smallest, most treatable lesions, or by reappearance of symptoms, which might occur only after a tumor has advanced. As with cancer screening, cancer treatment response and relapse may be monitored by ctDNA sequencing, CTC detection, and circulating miRNAs, yet the same sensitivity and cost drawbacks as noted for screening may apply. Thus, there is a need to develop simpler, faster, less expensive, and more sensitive approaches for early cancer detection and for monitoring of therapy response and relapse. Optimally, a new approach will detect a range of cancers, so that it may be widely deployed and positive results may be followed by appropriate targeted diagnostics and therapeutic interventions.
Provided herein is a novel type of blood test for the detection of a novel marker of osteosarcoma and other cancers. Specifically, the test enables the detection of an increased level of a specific category of human genomic DNA, the repetitive element (RE) DNAs, in serum or plasma of people with cancers (including serum extracellular vesicle-associated repetitive element DNAs as candidate osteosarcoma biomarkers). The increase may be assessed either in terms of the total RE DNA levels or in terms of the RE DNA relative to other sequences in the same nucleic acid preparations.
The detection of the increased RE DNAs in serum or plasma requires that the RE DNAs are isolated by specific methods that have not previously been used for this or related purposes, and which exploit the novel finding that RE DNAs co-purify with extracellular vesicles (EVs) using specific biochemical methods. The novel method involves a) specific methods to isolate/enrich EVs and associated material and/or a method to isolate RE DNA; b) a specific method to isolate ‘small nucleic acids’ from the EV preparations; and c) quantitation of i) RE DNA sequences or ii) the ratio of RE DNA to other nucleic acid sequences. When quantitating the ratio of RE DNA to other nucleic acid sequences, the method may involve concurrent analysis of EV-associated RE DNA and non-RE RNA sequences.
One aspect is the combined use of the EV isolation step and the small nucleic acid isolation step. Either omitting the EV isolation step or using a standard DNA isolation method after EV isolation would not selectively enrich for tumor-associated RE DNAs (or would enrich to a lesser extent) and would not enable detection of higher RE DNA levels (total or relative to non-RE sequences) in individuals with cancer. The combined use of an EV isolation step plus a small nucleic acid isolation step has not previously been used to detect differentially represented DNA sequences in fluids from cancer-bearing versus normal individuals. The utility of the test employing this aspect was demonstrated by the finding that the test can discriminate serum samples from individuals with vs without osteosarcoma (
Another aspect is the quantitation of the proportion of RE DNA relative to the total sequences in the EV-associated nucleic acid preparation, which can improve cancer detection sensitivity. The total sequences in the nucleic acid preparation include RE and non-RE genomic DNA (gDNA) sequences as well as RNA sequences. The “RE DNA proportion test” examines the proportion of RE DNA sequences among total sequences, which may be determined by a) reverse transcription of RNA into cDNA, b) co-amplification of EV-associated gDNA and the reverse-transcribed cDNA, and c) detection of RE DNA and non-RE DNA. As examples, the proportion of RE DNA a) may be defined by massively parallel sequencing and expressed as read counts-per-million (CPM), or b) may be defined by capture of the amplified total sequences by a limiting quantity of immobilized capture probes complementary to the DNA amplification primers, followed by probing the captured sequences with fluorescent oligodeoxynucleotides complementary to RE or non-RE sequences of interest. The latter approach measures the proportion of the total sequences comprised of each RE or non-RE sequence, rather than absolute levels, because the capture probes are present in limiting quantity relative to the amplification products and thus capture representative proportions of sequences of interest (as illustrated in
An additional aspect is the quantitation of the ratio of one or more RE DNA sequences to certain down-regulated (or under-represented) non-RE sequences, which can further improve cancer detection. The non-RE sequences may consist of RNAs that co-purify with EVs and whose proportion of total reads declines as RE DNAs increase in cancer patient serum or plasma (
Thus, the test may be formatted in several ways as described above (PCR to directly quantitate RE DNA levels; sequencing or hybridization to define the RE DNA proportion; or comparing the RE DNA and non-RE sequence ratio) to improve utility as may be appropriate to different applications. The test may also be used in several ways, including a) to detect a new osteosarcoma (or other cancer) before the cancer would be detected through the usual clinical presentation, such as in a cancer-screening regimen in individuals who are predisposed to osteosarcoma (or other cancers); b) to monitor therapy response, which is reflected in altered levels of the EV-associated RE DNAs and can be a prognostic indicator; and/or c) to monitor tumor recurrence prior to its clinical appearance, in patients who are effectively treated but at risk for relapse. Of note, the test can be used in combination with other markers of OS or other cancers. An example of such a marker, named herein ‘p90,’ was elevated in serum density gradient EV fractions in each of 7 OS patients versus 7 controls (
A variation of the method in which the tumor RE DNA is enriched based on tumor specific epigenetic features can also be carried out to provide further enrichment and/or sensitivity.
Provided herein are circulating biomarkers that distinguish OS patients from healthy controls, and a liquid biopsy for early detection of OS and other cancers. Liquid biopsies may detect circulating tumor components including cfDNA, tumor cells, and extracellular vesicles (EVs) (14-1), a category that includes exosomes, shedding vesicles, microparticles, retroviral-like particles, ectosomes, microvesicles, oncosomes, and apoptotic bodies (20,21). EVs are released by most if not all cells (22) and carry components of their cell of origin such as proteins, lipids, metabolites, and various types of RNA (23,24). Among the different types of EVs, exosomes and oncosomes are more highly produced by cancer cells than by normal cells, are often present at increased levels at cancer diagnosis, may further increase during tumor progression (15), and carry cargo that reflects metastatic progression and treatment response (25,26). Moreover, EV preparations may contain exosomal as well as non-exosomal tumor components. Therefore, cancer biomarkers, such as OS biomarkers, in serum-derived EV preparations were isolated.
To identify EV-associated OS biomarkers, the abundance of nucleic acid sequences in OS patient versus control serum EV preparations was compared. Specifically, small nucleic acids extracted from EV preparations were sequenced and examined for differential representation of unique as well as repetitive element sequences, which are often produced and may be released by cancer cells (27), including by OS cells (28). Next, it was evaluated whether the same sequences were differentially represented in different patient cohorts and by different EV isolation and analytic methods. Through these approaches, circulating EV-associated repetitive element DNA sequences were identified that were more abundant in OS sera compared to healthy sera in two patient cohorts. Moreover, EV-associated repetitive element DNA sequences comprised an increased proportion of total sequences in the small nucleic acid preparations, and the ratio of certain repetitive element DNA sequences versus certain non-repetitive element sequences was increased.
In describing and claiming the invention, the following terminology will be used in accordance with the definitions set forth below. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. Any methods and materials similar or equivalent to those described herein can be used in the practice or testing of the present invention. Specific and preferred values listed below for radicals, substituents, and ranges are for illustration only; they do not exclude other defined values or other values within defined ranges for the radicals and substituents.
As used herein, the articles “a” and “an” refer to one or to more than one, i.e., to at least one, of the grammatical object of the article. By way of example, “an element” means one element or more than one element.
The term “about,” as used herein, means approximately, in the region of, roughly, or around. When the term “about” is used in conjunction with a numerical range, it modifies that range by extending the boundaries above and below the numerical values set forth. In general, the term “about” is used herein to modify a numerical value above and below the stated value by a variance of 20%.
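As a worked illustration of this definition, the short Python sketch below computes the bounds implied by "about" for a single value and for a range, using the 20% variance stated above. The function names are hypothetical and are introduced only for illustration.

```python
def about_value(value, variance=0.20):
    """Bounds implied by 'about <value>': the value plus or minus 20% of itself."""
    delta = abs(value) * variance
    return (value - delta, value + delta)


def about_range(low, high, variance=0.20):
    """Bounds implied by 'about <low> to <high>': each boundary is extended
    outward by 20% of its own magnitude."""
    return (low - abs(low) * variance, high + abs(high) * variance)


# "about 50" (for example, a 50 ul serum volume) spans 40 to 60.
print(about_value(50))
# "about 140 to 300" extends below 140 and above 300.
print(about_range(140, 300))
```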
As used herein, the terms “determining”, “assessing”, “assaying”, “measuring” and “detecting” refer to both quantitative and qualitative determinations, and as such, the term “determining” is used interchangeably herein with “assaying,” “measuring,” and the like. Where a quantitative determination is intended, the phrase “determining an amount” of an analyte and the like is used. Where a qualitative and/or quantitative determination is intended, the phrase “determining a level” of an analyte or “detecting” an analyte is used.
By “reference” or “control” is meant a standard of comparison. For example, the marker level(s) present in a patient sample may be compared to the level of the marker in a corresponding healthy cell or tissue or in a diseased cell or tissue. As used herein, the term “sample” includes a biologic sample such as any tissue, cell, fluid, or other material derived from an organism.
As used herein a subject is any mammal, including humans, companion animals including cats and dogs, and livestock, including horses, pigs and cows.
The terms “treat,” “treating,” and “treatment,” as used herein, refer to therapeutic or preventative measures such as those described herein. The methods of “treatment” employ administration to a patient of a treatment regimen in order to prevent, cure, delay, reduce the severity of, or ameliorate one or more symptoms of the disease or disorder or recurring disease or disorder, or in order to prolong the survival of a patient beyond that expected in the absence of such treatment. Treatment for cancer includes active surveillance (during active surveillance, the tumor is monitored, and treatment would begin if it started causing any symptoms or problems or showed an alteration in the level of serum markers as described herein), surgery, radiation (such as external-beam radiation, including conventional radiation therapy, intensity modulated radiation therapy (IMRT), 3-dimensional conformal radiation therapy, stereotactic radiosurgery, fractionated stereotactic radiation therapy, or proton radiation therapy), immunotherapy, and chemotherapy.
The term “effective amount,” as used herein, refers to that amount of an agent, which is sufficient to effect treatment, prognosis or diagnosis of cancer, when administered to a patient. A therapeutically effective amount will vary depending upon the patient and disease condition being treated, the weight and age of the patient, the severity of the disease condition, the manner of administration and the like, which can readily be determined by one of ordinary skill in the art.
Other terms used in the fields of recombinant nucleic acid technology, microbiology, immunology, antibody engineering and molecular and cell biology as used herein will be generally understood by one of ordinary skill in the applicable arts. Techniques and procedures may be generally performed according to conventional methods well known in the art and as described in various general and more specific references that are cited and discussed throughout the present specification. See, e.g., Sambrook et al., 2001, Molecular Cloning: A Laboratory Manual, 3rd ed., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., which is incorporated herein by reference for any purpose. Unless specific definitions are provided, the nomenclature utilized in connection with, and the laboratory procedures and techniques of, analytical chemistry, synthetic organic chemistry, and medicinal and pharmaceutical chemistry described herein are those well-known and commonly used in the art. Standard techniques may be used for chemical syntheses, chemical analyses, pharmaceutical preparation, formulation, and delivery, and treatment of patients.
By “biologic sample” is meant any tissue, cell, fluid (such as blood or serum), or other material derived from an organism.
As used herein, the terms “comprising,” “having,” and “including” and the like are used in reference to compositions, methods, and respective component(s) thereof, that are present in a given embodiment, yet open to the inclusion of one or more unspecified elements. The term “including” is used herein to mean, and is used interchangeably with, the phrase “including but not limited to.”
As used herein the term “consisting essentially of” refers to those elements required for a given embodiment. The term permits the presence of additional elements that do not materially affect the basic and novel or functional characteristic(s) of that embodiment of the invention. The term “consisting of” refers to compositions, methods, and respective components thereof as described herein, which are exclusive of any element not recited in that description of the embodiment.
The assays described herein provide numerous advantages. The RE DNA Quantitation assay requires a minimal volume of serum, as low as 50 μl (or lower); it can conveniently be added to any blood test where serum or plasma is obtained at a screening exam. The entire procedure can be completed in as little as one day and is cost-effective. Two variations of the assay that measure the proportion of repetitive element DNA sequences relative to total sequences involve additional steps yet have greater specificity.
The RE DNA Quantitation assay provides high sensitivity and specificity as demonstrated by ROC curves (Cambier et al,
A higher sensitivity version of the RE DNA assay measures the proportion of RE DNA relative to total sequences in the same nucleic acid preparation. The high sensitivity of this RE DNA Proportion Test is shown in (
In another iteration, the ratio of proportions of certain RE DNA and certain non-RE sequences is evaluated by a) defining the proportions of such sequences as described for the RE DNA Proportion Test (above) and then b) defining the ratio of these proportions. The higher sensitivity of this RE DNA Ratio Test results from the more consistently increased ratio of RE DNA and decreased levels of certain non-RE RNAs (
Each of the above RE DNA assays may be used in combination with assays of other OS serum markers to strengthen the diagnostic sensitivity, such as the recently discovered and reproducibly increased p90 protein in density gradient EV fractions (
The tests provided herein quantitate the abundance or proportional representation of specific repetitive elements DNAs that are co-purified with EV preparations from a small volume of serum (e.g., 50 ul). The steps include:
1. Serum or plasma isolation. Serum is obtained in serum separator tubes and separated from cells or clot by standard methods. Plasma (blood drawn in EDTA tubes, then separated from cells) might also be used if plasma is converted back to a serum-like state.
2. EV preparation. A standardized amount of serum (e.g., 50 μl) is subjected to either a) polyethylene glycol (PEG) precipitation (Rider et al. Sci Rep. 2016 Apr 12; 6:23978. doi: 10.1038/srep23978), b) size exclusion chromatography (SEC) (Nordin et al. Nanomedicine. 2015 May; 11(4):879-83. doi: 10.1016/j.nano.2015.01.003. Epub 2015 Feb 4), or c) density gradient centrifugation (
3. Small nucleic acid isolation. Nucleic acids are extracted from PEG precipitations, from SEC void volumes, or from density gradient EV fractions using commercial small RNA enrichment kits (sera-MiR (SBI) or miRNeasy micro (Qiagen)). This is a novel step for assay of a DNA species and preferentially isolates osteosarcoma and other cancer related RE DNAs.
4. The RE DNA Quantitation Test. A proportion of nucleic acids isolated from osteosarcoma (or other cancer) and control PEG, SEC, or density gradient preparations is subjected to qPCR (using primers determined by PCR primer-design programs, reported in the literature, or otherwise designed by the inventors) to quantitate the abundance of specific RE DNA sequences, and comparison is made between osteosarcoma (or other cancer) and control reference samples.
5. The RE DNA Proportion Test. A proportion of nucleic acids isolated from osteosarcoma (or other cancer) and control PEG, SEC, or density gradient preparations is subjected to adaptor ligation (using a T4 RNA ligase-like enzyme that ligates adaptors to the 5′ and 3′ ends of DNA as well as RNA), reverse transcription (using primers that are complementary to the adaptors), and PCR-based amplification (using PCR primers that are complementary to the ligated adaptors), and determination of the proportional representation of RE sequences of interest either a) by massively parallel next generation sequencing or b) by hybridization-based capture of PCR product to immobilized probes followed by hybridization of fluorescent probes to the captured RE sequences of interest. Other available approaches to measure the proportions may also be used.
6. The RE DNA Ratio Test. A proportion of nucleic acids isolated from osteosarcoma (or other cancer) preparations is amplified as in (5), followed by determination of the proportional representation of both RE sequences of interest and non-RE sequences of interest (by massively parallel next generation sequencing or by hybridization-based capture of PCR product and secondary hybridization as in (5)), and calculation of the ratio of the RE and non-RE sequences.
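The arithmetic behind the three tests in steps 4 to 6 can be sketched in Python as follows. This is a minimal illustration under stated assumptions rather than the procedure actually used: the qPCR comparison assumes equal serum input and ideal doubling per cycle, the sequence names (`RE_A`, `RE_B`, `nonRE_X`, `other`) are hypothetical placeholders, and the read counts and Ct values are made up for illustration and are not data from the study.

```python
def fold_change_from_ct(ct_patient, ct_control):
    """Quantitation Test sketch: relative abundance of an RE DNA target from
    qPCR threshold cycles. Assumes equal input and perfect doubling per cycle."""
    return 2.0 ** (ct_control - ct_patient)


def counts_per_million(read_counts):
    """Proportion Test sketch: express each sequence's read count as
    counts-per-million (CPM) of all reads in the same preparation."""
    total = sum(read_counts.values())
    if total == 0:
        raise ValueError("library contains no reads")
    return {name: count * 1_000_000 / total for name, count in read_counts.items()}


def re_to_non_re_ratio(cpm, re_names, non_re_names):
    """Ratio Test sketch: summed CPM of the RE sequences of interest divided
    by summed CPM of the non-RE sequences of interest."""
    re_total = sum(cpm[name] for name in re_names)
    non_re_total = sum(cpm[name] for name in non_re_names)
    if non_re_total == 0:
        raise ValueError("non-RE reference sequences have zero reads")
    return re_total / non_re_total


# Illustrative, made-up read counts (hypothetical names, not study data).
patient = {"RE_A": 3000, "RE_B": 1000, "nonRE_X": 500, "other": 995500}
control = {"RE_A": 1000, "RE_B": 500, "nonRE_X": 1500, "other": 997000}

patient_cpm = counts_per_million(patient)
control_cpm = counts_per_million(control)

# A target detected 3 cycles earlier in the patient sample is 8-fold higher.
print(fold_change_from_ct(ct_patient=25.0, ct_control=28.0))
# Proportion of one RE sequence in each preparation, in CPM.
print(patient_cpm["RE_A"], control_cpm["RE_A"])
# RE to non-RE ratio in each preparation.
print(re_to_non_re_ratio(patient_cpm, ["RE_A", "RE_B"], ["nonRE_X"]))
print(re_to_non_re_ratio(control_cpm, ["RE_A", "RE_B"], ["nonRE_X"]))
```

In this toy example the ratio separates the two samples more strongly than the RE proportion alone, because the RE proportion rises while the non-RE proportion falls.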
Also provided herein are kits for EV-associated DNA and non-RE nucleic acid enrichment, wherein said kit comprises components for sample gathering, EV isolation/preparation components, and/or components to quantitate RE DNA and non-RE sequences, including chips that may be used to capture PCR products and devices to read the fluorescent signals based on secondary hybridization, along with instructions for use and optionally a control sample.
As EV-associated RE DNAs were shown to be elevated in two very different tumors affecting very different patient populations (osteosarcoma and breast cancer (see, for example,
The assay described herein, validated for osteosarcoma, would serve for cancer screening in genetically predisposed children, such as those with retinoblastoma, Li-Fraumeni, Bloom, Werner, or Rothmund-Thomson syndromes; in children exposed to radiation or alkylating agents; in patients with Diamond-Blackfan anaemia; in adults with bone disorders such as Paget's disease (i.e., use in osteosarcoma predisposition syndromes); and in individuals at risk for breast or ovarian cancer due to inherited mutations (such as in BRCA1 or BRCA2).
As osteosarcoma is characterized by high chromosomal instability, there are no predominant tumor suppressors or oncogenes to focus on as biomarkers. The discovery of specific repetitive element DNAs bypasses the need for gene-specific biomarkers. Additionally, the assay/diagnostic test is a unique and inexpensive assay for relapse of osteosarcoma and other cancers.
The early detection of malignancy based on the RE DNA test is specifically herein linked to treatment with appropriate modalities, which may include surgery, chemotherapy (e.g., with methotrexate, doxorubicin, cisplatin or carboplatin, ifosfamide, cyclophosphamide, etoposide, or gemcitabine), radiation therapy, bone marrow transplant, immunotherapy, hormone therapy, targeted drug therapy, cryoablation, or radiofrequency ablation. An example is the treatment of triple negative breast cancer as well as other cancers (Byrum, A. K., Vindigni, A. & Mosammaparast, N. Defining and Modulating ‘BRCAness’. Trends Cell Biol 29, 740-751, doi:10.1016/j.tcb.2019.06.005 (2019), incorporated herein by reference), e.g., prostate (Wedge, D. C. et al. Sequencing of prostate cancers identifies new cancer genes, routes of progression and drug targets. Nat Genet 50, 682-692, doi:10.1038/s41588-018-0086-z (2018), incorporated by reference), colon (Yaeger, R. et al. Clinical Sequencing Defines the Genomic Landscape of Metastatic Colorectal Cancer. Cancer Cell 33, 125-136 e123, doi:10.1016/j.ccell.2017.12.004 (2018), incorporated by reference), pancreatic adenocarcinoma (Cancer Genome Atlas Research Network. Integrated Genomic Characterization of Pancreatic Ductal Adenocarcinoma. Cancer Cell 32, 185-203.e113, doi:10.1016/j.ccell.2017.07.007 (2017), incorporated by reference), and Ewing's sarcoma (Brenner, J. C. et al. PARP-1 inhibition as a targeted strategy to treat Ewing's sarcoma. Cancer Res 72, 1608-1613, doi:10.1158/0008-5472.CAN-11-3648 (2012), incorporated by reference), that often have a high degree of BRCAness (representing a defect in homologous recombination repair, due to or mimicking BRCA1 or BRCA2 loss, or in replication fork protection (RFP), with increased genomic instability) with a combination of DNA damaging chemotherapy (such as with platinum-based compounds) or radiotherapy combined with PARP inhibitors (Byrum, A. K., Vindigni, A. & Mosammaparast, N. Defining and Modulating ‘BRCAness’. Trends Cell Biol 29, 740-751, doi:10.1016/j.tcb.2019.06.005 (2019)). As the two cancers already identified as having increased serum EV-associated RE DNA (osteosarcoma and triple negative breast cancer) both have high levels of BRCAness, the unique combination of RE DNA screening and BRCAness-directed therapies is an embodiment of the invention.
The following example is intended to further illustrate certain particularly preferred embodiments of the invention and is not intended to limit the scope of the invention in any way.
Patients and Samples
This study was reviewed and approved by the institutional review board at Children's Hospital Los Angeles (approval no. CCI-13-00223) and at Henan Luoyang Orthopedic Hospital (approval no. 2015-01). All participants gave written informed consent. Parents or legally authorized persons gave informed consent on behalf of all minors, and subjects above 14 years old gave assent. All analyses were conducted in accordance with relevant guidelines and regulations. Blood samples were collected during a clinically indicated venipuncture from previously untreated patients with a primary diagnosis of OS and from volunteer subjects with no known medical conditions, i.e., healthy controls. Control sera for the validation cohort were obtained from local volunteer subjects and from Innovative Research Inc. (Novi, MI, USA). Blood was drawn in serum separator collection tubes (SST), allowed to clot for 30 min at room temperature in a vertical position, and then centrifuged at 1,000×g for 10 min at 4° C. Serum was collected, immediately aliquoted, and stored at −80° C.
Discovery Cohort
EV isolation, nucleic acid extraction, and sequencing. Serum EVs were isolated using ExoQuick (System Biosciences Inc. (SBI), Mountain View California, USA) and aliquots frozen. One aliquot was used for NTA analyses and on confirmation of high EV purity aliquots were thawed and nucleic acid extracted using SeraMir (SBI) without DNase treatment, according to manufacturer instructions. The sequencing library was constructed using TailorMix miRNA Sample Preparation (SeqMatic) with a selection of small nucleic acids from 140 to 300 bases. 5′-RNA adapters and 3′-DNA adapters (SeqMatic, personal communication) were directly ligated to nucleic acid substrates, followed by PCR amplification. Libraries were sequenced to generate single-end 50 bp reads on MiSeq 500 platform (Illumina).
Validation Cohort
EV isolation and nucleic acid extraction. Serum was cleared by centrifugation at 3,000×g for 15 min at 4° C. For polyethylene glycol (PEG) precipitation, 50-200 μl of cleared serum was combined with an equal volume of freshly prepared 16% PEG 6000 (Sigma-Aldrich) in 1 M NaCl, to give a final PEG concentration of 8%, incubated for 30 min on ice, and centrifuged in a tabletop microfuge at 16,000×g for 2 min at room temperature (Eppendorf, model 5424 R using an FA-45-24-11 fixed angle rotor), and the pellet was resuspended in a volume of PBS equal to the starting serum volume. For size exclusion chromatography (SEC), ~300 μl of cleared supernatant was centrifuged at for 30 min at 4° C. in a fixed angle rotor and loaded onto a glass Econo-column (BioRad, 10 cm height, 1.5 cm diameter) packed with Sephacryl S-300 High Resolution (GE Healthcare) and pre-washed with 0.32% sodium citrate in PBS. The cleared serum was allowed to enter the resin by gravity flow and the eluate was collected in 20 fractions of 15 drops (~500 μl) on a Model 2110 Fraction Collector (BioRad). For each fraction, the protein concentration and the presence of EVs were characterized by the Bradford method (BioRad) and nanoparticle tracking analysis (see below), respectively. EV fractions were concentrated on a 100 kDa Amicon ultra centrifugal filter (Millipore) from 2×500 μl to a final volume of ~100 μl. Immunoaffinity capture of CD81+ or CD9+ EVs was carried out using the Exo-FLOW™ Exosomes Purification Kit (SBI, Mountain View California). Briefly, 200 μl of cleared serum was precipitated with 200 μl of 16% PEG 6000 as above, and the pellet was re-suspended in 200 μl of PBS. 50 μl of this EV preparation was incubated with 20 μl of anti-CD81 or anti-CD9 pre-coated magnetic beads (9.1 μm) on a rotating rack at 4° C. overnight. CD81+ or CD9+ EVs were eluted from the beads in the Exosome Elution Buffer at 25° C. for 30 min.
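The PEG dilution described above follows from simple volume arithmetic, sketched below in Python. The helper name is hypothetical, and the calculation assumes that the volumes are additive and that the serum itself contributes no PEG; the NaCl figure likewise ignores the salt already present in serum.

```python
def final_concentration(stock_concentration, stock_volume, sample_volume):
    """Concentration of a stock component after mixing the stock with a sample
    that contains none of that component (volumes assumed additive)."""
    total_volume = stock_volume + sample_volume
    if total_volume <= 0:
        raise ValueError("total volume must be positive")
    return stock_concentration * stock_volume / total_volume


# Equal volumes of serum and 16% PEG 6000 in 1 M NaCl (volumes in ul).
serum_ul = 100
peg_stock_ul = 100

print(final_concentration(16.0, peg_stock_ul, serum_ul))  # PEG, percent
print(final_concentration(1.0, peg_stock_ul, serum_ul))   # added NaCl, molar
```

Because the stock and serum volumes are equal, the result is independent of the starting serum volume anywhere in the 50-200 ul range.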
Nucleic acids were extracted from 20 to 200 ul of EV preparations (or from 50 ul of serum) using the miRNeasy Micro kit (Qiagen) according to the manufacturer's instructions and suspended in 14 ul of RNase/DNase-free H2O (depending on the initial volumes of serum). For samples intended for reverse transcription, a spike-in control (C. elegans miR-39-3p) miRNA mimic (Qiagen) was added (1.6×10^9 copies) after the lysis step. Nucleic acid size and concentration were analyzed on an RNA Pico 6000 chip using an Agilent Bioanalyzer (Agilent, Palo Alto, CA, USA), equipped with Expert 2100 software, which generated an electrophoretic profile and the corresponding ‘pseudo’ gel of the sample. After separation, nucleic acid sizes were normalized to a 25 bp RNA marker. Samples showing nucleic acids of >200 bp were eliminated from the study.
Particle Size and Concentration Measurement by Nanoparticle Tracking Analysis
EV preparations were analyzed by nanoparticle tracking using a NanoSight NS300 (Malvern, Worcestershire, U.K.) configured with a high sensitivity sCMOS camera (OrcaFlash2.8, Hamamatsu C11440, NanoSight Ltd). In brief, each sample was mixed by vortexing, and subsequently diluted in particle-free PBS to obtain a concentration within the recommended measurement range (10^8-10^9 particles/mL), corresponding to dilutions from 1:100 to 1:500. After optimization, settings were kept constant between measurements. Ambient temperature was recorded manually and did not exceed 25° C. Approximately 20-40 particles were in the field of view for each measurement. Three videos of 30 s duration were recorded for each sample. Experiment videos were analyzed using NTA 3.2 Dev Build 3.2.16 software (Malvern).
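The dilution step described above can be illustrated with a minimal Python sketch. The function names, the list of candidate dilution factors, and the example stock concentration are assumptions made for illustration and are not values taken from the study.

```python
def choose_dilution(stock_per_ml, low=1e8, high=1e9,
                    candidates=(100, 200, 300, 400, 500)):
    """Return the smallest candidate dilution factor that brings the diluted
    concentration into [low, high] particles/mL, or None if none fits."""
    for factor in candidates:
        diluted = stock_per_ml / factor
        if low <= diluted <= high:
            return factor
    return None


def stock_concentration(measured_per_ml, factor):
    """Back-calculate the undiluted concentration from a diluted NTA reading."""
    return measured_per_ml * factor


# Hypothetical stock of 5e10 particles/mL: a 1:100 dilution gives 5e8 particles/mL.
print(choose_dilution(5e10))
```

A result of None would indicate that the sample lies outside the range reachable with the listed dilutions and that a different dilution series should be considered.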
Single Particle Interferometric Reflectance Imaging Sensing (SP-IRIS)
EVs from PEG and immunocapture preparations were analyzed on the ExoView R100 platform (Nanoview Biosciences, MA). Briefly, EVs within these preparations were immunocaptured on a multiplexed microarray chip with CD9, CD81, CD63, and CD41a antibody spots, as well as negative control IgG antibody spots to determine the level of non-specific binding, and then probed for CD9, CD81, CD63, and CD41a surface markers with respective additional fluorescent antibodies. EVs from PEG preparation and eluted EVs from immunoaffinity capture were diluted in solution A (Nanoview Biosciences, MA). The samples were incubated on the ExoView Tetraspanin Chip (EV-TC-TTS-01) placed in a sealed 24-well plate for 16 h at room temperature. The chips were then washed three times in 1 ml PBST for 3 min each on an orbital shaker. Then, chips were incubated with ExoView Tetraspanin Labeling ABs (EV-TC-AB-01) that consist of anti-CD81 Alexa-555, anti-CD63 Alexa-488, and anti-CD9 Alexa-647. The antibodies were diluted 1:5000 in PBST with 2% BSA. The chips were incubated with 250 μL of the labeling solution for 2 h. The chips were then washed once in PBST and three times in PBS, followed by a rinse in filtered deionized water, and dried. Immunocaptured EVs on the microarray chip were imaged on a single-EV basis with the ExoView R100 reader using the nScan2 2.9 acquisition software. The data were then analyzed using the NanoViewer 2.9 software (Nanoview Biosciences, MA), which counts and sizes fluorescent nanoparticles immunocaptured on the antibody spots. For exosome analysis, the size window was selected to include particle sizes from 50-200 nm.
Reverse Transcription (RT) and qPCR
Equal volumes of nucleic acid prepared as above were reverse transcribed using iScript™ cDNA Synthesis Kit (Bio-Rad) in 20 ul volume according to the manufacturer's protocol. 0.5 ul of the samples produced with or without the RT step were analyzed in 10 ul qPCR reactions with iQ™ Green Supermix (Bio-Rad) on an ABI 7900 Fast Real-Time PCR System (Applied Biosystems) with the following cycling parameters: 94° C., 30 sec; 59° C., 15 sec; 68° C., 25 sec for 35 cycles. Relative sequence abundance was determined by the ΔΔCt method. In most PCR runs, a negative control with no nucleic acid template was added and never generated PCR product. PCR primers were designed manually to have a melting temperature of 58° C. and to generate amplicons of ˜100 bp or as previously described for HSATI (62) and HSATII (37) (
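The ΔΔCt calculation referred to above can be sketched as follows. This is a minimal illustration that assumes perfect doubling per PCR cycle and uses the spike-in control as the reference; the function names and the example Ct values are hypothetical and are not data from the study.

```python
def delta_delta_ct(ct_target_sample, ct_ref_sample,
                   ct_target_control, ct_ref_control):
    """ddCt = (target - reference) in the sample minus the same in the control."""
    d_sample = ct_target_sample - ct_ref_sample
    d_control = ct_target_control - ct_ref_control
    return d_sample - d_control


def relative_abundance(ct_target_sample, ct_ref_sample,
                       ct_target_control, ct_ref_control):
    """Relative abundance 2^(-ddCt), assuming 100% amplification efficiency."""
    ddct = delta_delta_ct(ct_target_sample, ct_ref_sample,
                          ct_target_control, ct_ref_control)
    return 2.0 ** (-ddct)


# Hypothetical Ct values: the target appears 3 cycles earlier in the sample
# than in the control while the spike-in reference is unchanged.
print(relative_abundance(24.0, 20.0, 27.0, 20.0))
```

With amplification efficiencies below 100%, the base of 2 would overestimate the fold change, so an efficiency-corrected base may be preferable where standard curves are available.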
DNase I and RNase A Treatment of EV Preparations
Intact EV preparations were treated with DNase I (Qiagen) in RDD buffer for 15 min at room temperature, and the DNase was then inactivated for 10 min at 70° C. Intact EV preparations were treated with RNase A (Thermo Scientific) at a final concentration of 0.4 ug/ul, with or without NaCl at a final concentration of 1 M, for 10 min at 37° C. (39), and the RNase was then inactivated with RNase inhibitor (Takara) at a final concentration of 2 u/ul.
RNA-Seq Data Processing, Alignment and Analysis
Fastq files were aligned to GRCh38 (for analysis 1) or hg19 (for analysis 2) using STAR with the parameters recommended for TEtranscripts (i.e., allowing for up to 100 alignments per read) (34), and the resulting BAM files were processed using TEtranscripts to quantify both non-repetitive element and repetitive element abundance.
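The exact command lines are not given in the text. The following Python sketch only assembles plausible argument lists for the two tools, using the STAR multi-mapping options (--outFilterMultimapNmax and --winAnchorMultimapNmax) set to 100 and the documented TEtranscripts options. All file paths, the thread count, the project name, and the helper function names are assumptions for illustration.

```python
import shlex


def star_command(genome_dir, fastq, out_prefix, max_multimap=100, threads=4):
    """Build a STAR argument list that keeps up to max_multimap alignments per read."""
    return [
        "STAR",
        "--runThreadN", str(threads),
        "--genomeDir", genome_dir,
        "--readFilesIn", fastq,
        "--outFileNamePrefix", out_prefix,
        "--outSAMtype", "BAM", "Unsorted",
        "--outFilterMultimapNmax", str(max_multimap),
        "--winAnchorMultimapNmax", str(max_multimap),
    ]


def tetranscripts_command(treatment_bams, control_bams, gene_gtf, te_gtf, project):
    """Build a TEtranscripts argument list comparing treatment and control BAM files."""
    return [
        "TEtranscripts", "--format", "BAM", "--mode", "multi",
        "-t", *treatment_bams,
        "-c", *control_bams,
        "--GTF", gene_gtf,
        "--TE", te_gtf,
        "--project", project,
    ]


# Hypothetical paths; the commands are printed rather than executed.
star = star_command("star_index_hg19", "OS1.fastq", "OS1_")
te = tetranscripts_command(["OS1_Aligned.out.bam"], ["HC1_Aligned.out.bam"],
                           "genes.gtf", "repeats.gtf", "os_vs_control")
print(" ".join(shlex.quote(arg) for arg in star))
print(" ".join(shlex.quote(arg) for arg in te))
```

Building the argument lists separately from execution makes it straightforward to log the commands that were actually run for each genome build.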
Statistical Analysis
Groups were compared using a two-tailed, unpaired Mann-Whitney U test (*P<0.05, **P<0.01, ***P<0.001, ****P<0.0001). All analyses were performed using Prism 8 software (GraphPad).
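For readers without access to Prism, the two-tailed Mann-Whitney U test can be illustrated with a small pure-Python sketch. It computes an exact p-value by enumerating every possible assignment of ranks to the first group, which is practical only for small groups. It is an illustration of the test and is not claimed to reproduce the Prism implementation; the function names and example values are hypothetical.

```python
from itertools import combinations


def midranks(values):
    """Assign 1-based ranks, averaging the ranks of tied values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1  # mean of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks


def mann_whitney_exact(x, y):
    """Return (U statistic for x, exact two-sided p-value)."""
    n1, n2 = len(x), len(y)
    ranks = midranks(list(x) + list(y))
    offset = n1 * (n1 + 1) / 2
    u_observed = sum(ranks[:n1]) - offset
    mean_u = n1 * n2 / 2
    observed_dev = abs(u_observed - mean_u)
    total = extreme = 0
    # Enumerate every way of choosing which pooled ranks belong to group x.
    for idx in combinations(range(n1 + n2), n1):
        u = sum(ranks[i] for i in idx) - offset
        total += 1
        if abs(u - mean_u) >= observed_dev - 1e-9:
            extreme += 1
    return u_observed, extreme / total


# Hypothetical, fully separated groups of four samples each.
u, p = mann_whitney_exact([1.0, 2.0, 3.0, 4.0], [5.0, 6.0, 7.0, 8.0])
print(u, p)
```

For two fully separated groups of four, only 2 of the 70 possible rank assignments are as extreme as the observed one, so the smallest attainable two-sided p-value is 2/70.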
To identify OS biomarkers, nucleic acid sequences associated with EV preparations from sera of OS patients and healthy controls were compared. Initial analyses were performed on a discovery cohort of treatment-naïve OS patients from Children's Hospital Los Angeles (CHLA) and Henan Luoyang Orthopedic Hospital (HLOH), comprised of males and females between 5 and 29 years old and presenting with different OS types. Control cohorts were comprised of healthy siblings of hereditary retinoblastoma patients who had not developed retinoblastoma (hereditary retinoblastoma controls; HRCs) and unrelated approximately age-matched healthy individuals (healthy controls; HC) (Table 1). EV preparations were made with the commercial ExoQuick kit based on polyethylene glycol (PEG) precipitation, with recognition that EV as well as non-EV components are isolated (29,30). Nanoparticle tracking analysis of each sample revealed similar size distributions of control and OS EVs between 50 and 150 nm, which is characteristic of exosomes (
To detect differentially represented EV-associated RNA and DNA sequences, nucleic acids were extracted from OS and control EV preparations using SeraMir small RNA enrichment kit (SBI) without DNase treatment and a sequencing library was built by addition of a 5′-RNA adapter and a 3′-DNA adapter followed by PCR amplification and sequencing. Comparison of uniquely mapped sequences using DESeq2 (31) identified 107 significantly over-represented genes and 587 significantly under-represented genes (>2-fold change, p.adj<0.05) in OS samples (
While these analyses revealed consistent over-representation of repetitive element sequences in OS serum EV preparations, the identities of the most over-represented elements were uncertain since programs that are not specifically designed for repetitive element detection may erroneously map repetitive element reads (33). To more accurately define the differential repetitive element representation, sequences were evaluated with TEtranscripts, which maps repetitive element sequences more accurately and quantitatively than non-dedicated programs (34). Using default settings with reads mapped to the GRCh38 genome, TEtranscripts confirmed that OS samples had far more significantly under-represented than over-represented sequences in comparison to control samples (
HSATI:Satellite:Satellite
HSATII:Satellite:Satellite
Charlie3:hAT-Charlie:DNA
Because GRCh38 contains numerous alternative assemblies that are enriched for repetitive elements that might siphon repetitive element reads, adds synthetic centromeric repeat sequences, and hard-masks certain centromeric and genomic repeat arrays (35), it was considered whether these features might affect the ability to detect differential representation of unique or repetitive element sequences. To address this possibility, TEtranscripts analysis was re-performed with reads aligned to hg19, which lacks the GRCh38 alternative assemblies. This identified 15 significantly over-represented repeat elements, of which Human Satellite II (HSATII) had highest fold change (log2(2.73), p.adj=0.002), and one significantly under-represented element in OS versus control sequences (Table 2, Analysis 2 and
To illustrate the significant over-representation of RE DNA sequences as a proportion of all sequences in the above analysis, the read counts per million (CPM) of HSATII and LINE1 family sequences were plotted for each sample (
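The counts-per-million normalization can be sketched as below. The sketch assumes that the denominator is the total number of counted reads in the sample; the function name and the example counts are hypothetical and are not data from the study.

```python
def counts_per_million(counts):
    """Scale raw read counts so that the counts in one sample sum to one million."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample has no counted reads")
    return {name: count * 1e6 / total for name, count in counts.items()}


# Hypothetical sample with one million counted reads in total.
sample = {"HSATII": 50, "L1P1": 150, "other": 999800}
cpm = counts_per_million(sample)
print(cpm["HSATII"], cpm["L1P1"])
```

Because CPM is a proportion of each sample's own read total, it allows samples sequenced to different depths to be placed on a common scale.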
In addition to identifying 15 over-represented RE DNAs, the TEtranscripts analysis identified 138 under-represented DNAs (Table 2, Analysis 2 and
It was next examined whether the increased representation of repetitive elements was evident in a validation set of mostly distinct samples. The validation cohort consisted of treatment-naïve OS patients from CHLA and HLOH including males and females between 7 and 46 years old and presenting with various OS types as well as approximately age-matched healthy individuals (Table 1). The validation cohort was independent of the discovery cohort except for re-analysis of OS1 and OS3, which were the only samples with a sufficient quantity to re-test. To assess the repetitive element over-representation, EV-associated nucleic acids were isolated and evaluated using methods that differed from the discovery cohort analyses: EVs were isolated by PEG6000 precipitation (36) instead of ExoQuick, nucleic acids were extracted using the miRNeasy micro-RNA extraction kit (Qiagen) instead of SeraMir, and repetitive elements were examined by reverse transcription and quantitative PCR (RT-qPCR) instead of sequencing. Similar to the discovery cohort, EV concentrations were not significantly higher in sera from OS patients compared to controls (data not shown).
RT-qPCR was used to analyze four representative repetitive element categories including the HSATI and HSATII satellite sequences that were most differentially overrepresented in TEtranscripts Analyses 1 and 2 (Table 2), the LINE1 P1 family member (L1P1) that showed the highest fold change in the RepeatMasker analysis (
For each sample, RT-qPCR was performed on the same proportion of total EV nucleic acid extracted from the same serum volume and was normalized against a spike-in RNA. The analyses confirmed the over-representation of HSATI, HSATII, L1P1, and Charlie 3 sequences in OS EV preparations (12.42-fold, p=0.0040; 3.33-fold, p=0.062; 3.56-fold, p=0.016; 12.6-fold, p=0.0007; respectively) (
As the instant nucleic acid isolation and analysis methods could detect RNA as well as DNA sequences, the nucleic acid origin of the over-represented repetitive element sequences was examined by performing qPCR without reverse transcription. With this approach, the HSATI, HSATII, L1P1 and Charlie 3 amplification signals were significantly overrepresented in OS versus control EVs (22.18-fold, p=0.0015; 3.7-fold, p<0.0001; 2.86-fold, p=0.0015; 10.29-fold, p=0.011; respectively) (
To further evaluate the abundance of repetitive element DNAs and control for possible artefactual generation of RT-independent products, PEG-precipitated EV preparations from four OS and four control sera were treated with DNase I or RNase A prior to nucleic acid extraction. RNase A treatments were performed in 1 M NaCl in order to cleave single-stranded RNA as well as in the absence of NaCl in order to cleave single-stranded and double-stranded RNA and RNA strands in RNA-DNA hybrids (39). After these treatments, nucleic acids were extracted with the miRNeasy Micro kit and HSATI and HSATII abundance were assessed by qPCR. In these analyses, DNase I treatment eliminated 97-99% of HSATI and 80-99% of HSATII signals in both OS and control samples, whereas RNase A treatments had no significant effect (
To assess whether OS serum EV preparations might also have an increased abundance of repetitive element RNAs, PEG-precipitated EVs were prepared, treated with DNase I, and the remaining nucleic acids extracted and examined by RT-qPCR. In these samples, no amplification signal was detected for HSATI or Charlie 3, while HSATII and L1P1 products were reduced ˜32-64-fold compared to non-DNase I treated samples and showed no significant difference in control and OS samples (
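The relationship between a qPCR Ct shift and the fold reductions and percentage losses reported for the nuclease treatments can be sketched as follows. The sketch assumes perfect doubling per cycle; the function names and example Ct values are hypothetical.

```python
import math


def fold_reduction(ct_untreated, ct_treated):
    """Fold loss of template implied by a later Ct after nuclease treatment."""
    return 2.0 ** (ct_treated - ct_untreated)


def percent_eliminated(ct_untreated, ct_treated):
    """Percentage of signal removed, derived from the fold reduction."""
    return 100.0 * (1.0 - 1.0 / fold_reduction(ct_untreated, ct_treated))


def ct_shift_for_fold(fold):
    """Number of cycles by which Ct increases for a given fold reduction."""
    return math.log2(fold)


# A hypothetical 5-cycle delay corresponds to a 32-fold reduction (about 96.9% removed).
print(fold_reduction(20.0, 25.0), percent_eliminated(20.0, 25.0))
# A 64-fold reduction corresponds to a 6-cycle delay.
print(ct_shift_for_fold(64.0))
```

Under this assumption, a ~32-64-fold reduction corresponds to a delay of about 5-6 cycles, and a 99% loss of signal corresponds to a delay of about 6.6 cycles.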
Co-Purification of OS-Associated Repetitive Element DNAs with EVs in Size Exclusion Chromatography but not Exosome Immunoaffinity Capture
To further evaluate if repetitive element DNAs that were more abundant in OS patient PEG-precipitations (here termed ‘OS-associated’ repetitive element DNAs) are associated with EVs, it was examined whether they co-purified with EVs prepared by size exclusion chromatography (SEC) and exosome immunoaffinity capture. SEC yields purer EV populations (40,41) with lower protein contamination compared to PEG precipitation (42,43), whereas exosome immunoaffinity capture uses well-characterized surface markers CD9 or CD81 to highly purify intact exosomes (41,44).
It was first examined if repetitive element DNAs co-purify with SEC-isolated EVs from two control and two OS samples. Nanoparticle tracking analyses revealed that the control and OS EVs both eluted from size exclusion columns solely in fractions 6 and 7 (
It was next assessed whether OS-associated repetitive element DNAs co-purified with EVs in exosome CD9 or CD81 immunoaffinity capture. In pilot studies, it was confirmed that the immunocapture approach enriched for exosomes by single particle interferometric reflectance imaging sensing (SP-IRIS) using an ExoView instrument (45). SP-IRIS analyses showed that a similar number of EV particles eluted from CD9 immunoaffinity capture from control (21,006 particles, n=1) and OS sera (21,264±374 (SEM) particles, n=2), that the eluted particles could be re-immunocaptured on the microarray-based solid phase chip coated with antibodies to exosomal surface markers (
The finding that repetitive element DNAs were increased in OS patient PEG and SEC EV preparations yet not tightly bound to CD9+ or CD81+ exosomes raised the possibility that repetitive element DNAs might be more abundant in total cfDNA of OS patients and were proportionately present as contaminants in OS and control EV nucleic acid preparations. To examine this possibility, nucleic acids were extracted directly from equal volumes of OS and control sera using the same miRNeasy micro-RNA extraction kit as used for EV preparations and the repetitive element abundance was examined by qPCR. This revealed that L1P1 and Charlie 3 were significantly more abundant whereas HSATI and HSATII were present at similar levels in OS and control cfDNA samples (
The methods were further verified in breast cancer (
Sensitive biomarkers are needed to detect incipient OS tumors and enable lifesaving interventions in predisposed individuals. Prior studies identified a variety of potential OS biomarkers yet none had sufficient sensitivity to enable reliable OS detection (12,13). To identify new OS biomarkers, nucleic acid sequences associated with circulating EVs in OS patients were investigated. This revealed an over-representation of diverse repetitive element sequences, among which human satellites HSATI and HSATII were the most significantly increased upon mapping to the GRCh38 and hg19 genome builds. The over-represented repetitive element sequences were confirmed in a validation cohort and found to reflect repetitive element DNAs that co-purified with circulating EVs but were not tightly bound to CD9+ or CD81+ exosomes. HSATI and HSATII were distinguished from other repetitive elements in that they were enriched in serum EV preparations but not in total cfDNA, implying that they segregated into distinct complexes in the circulation of OS patients.
The detection of increased repetitive element DNAs in OS patient sera was enabled by a novel screening approach. First, in the discovery cohort, serum EVs were prepared using a precipitation method that concentrates exosomes as well as other EVs and non-vesicular constituents (29,30), which enlarged the population of biomarker candidates. Second, EVs or nucleic acid preparations were not treated with DNase, which enabled isolation of DNAs as well as RNAs and further diversified the potential biomarker pool. Third, nucleic acids were isolated using small RNA preparation kits that also captured repetitive element DNAs, and then a sequencing library was built by direct ligation of adapters to extracted DNAs as well as RNAs, using an activity with properties similar to T4 RNA ligase (47), which allowed the discovery of differentially represented DNA as well as RNA species. Finally, repetitive element sequences were evaluated that include diverse satellite and non-satellite categories that may comprise more than two-thirds of the human genome (48). Repetitive elements are often ignored in human sequencing studies because of the complexity involved in properly aligning short sequencing reads to highly repetitive regions as well as poor understanding of their functional relevance (33). However, the paucity of over-represented single copy genes in OS patient EV preparations prompted consideration of whether repetitive element sequences might be over-represented.
To examine differential repetitive element sequence representation, RepeatMasker was initially used to align sequence reads against the Repbase library of known repeats (32). This revealed an over-representation of all repetitive element categories in OS serum EV preparations, with the LINE1 family member L1P1 as the most significantly overrepresented species (
The increased abundance of selected repetitive element sequences was validated in a second patient cohort. In the validation set, their increased abundance was observed via qPCR, without reverse-transcription and in a DNase-sensitive manner, implying differential representation of repetitive element DNAs. Repetitive element DNA sequences were similarly increased in OS samples from USA (CHLA) and China (HLOH) and did not correlate with the OS grade, suggesting that these elements are produced independently of OS type. EV-associated repetitive element DNAs showed a high sensitivity and specificity for sera of patients with an OS diagnosis, with a significant AUC >0.9 for HSATI, HSATII and L1P1. However, the sensitivity was diminished by omitting the EV preparation step, particularly for HSATI and HSATII (
The results show that EV-associated repetitive element DNAs are among the most sensitive markers of OS identified to date, with ROC curve AUCs of 0.90 for HSATI and 0.97 for HSATII (
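The ROC curve AUC can be read as the probability that a randomly chosen OS sample has a higher marker level than a randomly chosen control sample, with ties counted as one half. A minimal Python sketch of that calculation is given below; the function names, threshold, and example values are hypothetical and are not data from the study.

```python
def roc_auc(case_scores, control_scores):
    """AUC as the fraction of case/control pairs in which the case scores higher."""
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


def sensitivity_specificity(case_scores, control_scores, threshold):
    """Sensitivity and specificity when scores at or above threshold are called positive."""
    true_pos = sum(1 for s in case_scores if s >= threshold)
    true_neg = sum(1 for s in control_scores if s < threshold)
    return true_pos / len(case_scores), true_neg / len(control_scores)


# Hypothetical relative abundances for three cases and three controls.
cases = [3.0, 4.0, 5.0]
controls = [1.0, 2.0, 3.5]
print(roc_auc(cases, controls))
print(sensitivity_specificity(cases, controls, threshold=3.0))
```

The pairwise count used here equals the Mann-Whitney U statistic divided by the product of the two group sizes, which links the AUC to the group comparison described in the statistical analysis.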
A final question raised by our findings is whether similar EV-associated repetitive element DNAs are increased in the circulation of patients with other cancers. Notably, centromeric and pericentric repetitive element RNA sequences, particularly alpha satellites and satellite II and III sequences, were reported to be overexpressed in testicular, liver, ovarian, and lung cancers compared to corresponding normal tissues (57). The pericentric human satellite II (HSATII) RNA was reported to be the most differentially expressed satellite subfamily in pancreatic cancer tissue and was also overexpressed in lung, kidney, ovarian, colon and prostate cancers (58, 59). Moreover, HSATII was one of the six most up-regulated satellite sequences in a study comparing fresh bone and OS samples by RNA-seq (28). LINE-1 was also overexpressed in pancreatic and prostate tumor samples (60, 61). However, although LINE1 and other repetitive element RNAs were detected in cancer cell-derived EVs in culture (27), their up-regulation has not been reported for circulating cell-free RNA in cancer patients. Thus, circulating repetitive element DNAs are enriched in additional cancer types and can be quantified/detected by the assays/methods described herein.
The invention is described with reference to various specific and preferred embodiments and techniques. However, it should be understood that many variations and modifications may be made while remaining within its scope. All referenced publications, patents and patent documents are intended to be incorporated by reference, as though individually incorporated by reference.
This application claims the benefit of priority of U.S. Provisional Patent Application No. 63/093,004, filed on Oct. 16, 2020, the benefit of priority of which is claimed hereby, and which is incorporated by reference herein in its entirety.
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/US2021/071910 | 10/16/2021 | WO | |

Number | Date | Country | |
---|---|---|---|
63093004 | Oct 2020 | US | |