This disclosure relates generally to cancer detection and prognosis, and more specifically to apparatus and methods for early cancer detection and prognosis using a nanosensor with Raman spectroscopy.
The global cancer burden has continued to grow, with an estimated 18 million cancer cases and 9.6 million deaths caused by cancer in 2018. The majority of cancers can be cured when the disease is diagnosed while it is still confined to the organ of origin. Therefore, rapid screening for the onset of cancer is key to curing cancer.
Unfortunately, many types of cancers only get diagnosed at advanced stages because the diagnostic intervention is often related to symptoms. Therefore, cancer screening, a strategy for early cancer intervention, needs to be employed to detect disease markers before the presentation of noticeable symptoms.
Many cancers have a better prognosis if they are diagnosed early, which has led to the development of many cancer screening protocols that aim to detect symptomatic as well as occult cancers. The existing successful screening technologies involve identifying precursor lesions, endoscopic biopsies, mammography, colonoscopy and/or cervical cytology. However, imaging-based screening suffers from limited sensitivity, and biopsy-based tests have limitations in the assessment of cancer development, prognosis and genotyping because of tumor heterogeneity and cancer evolution. Also, accessing tumors at a very early stage is clinically challenging, morbid and expensive. The trauma, challenges and invasiveness of accessing the tumor in the initial stages can be eliminated by alternative blood-based screening technology. The idea of a simple blood test to identify malignant changes has fueled interest in the ongoing search for blood-based rapid cancer screening markers and methodologies.
Liquid biopsies, such as blood, urine, stool, mucus or other body fluids, have been used widely in medical diagnosis and treatment response assessments. For instance, all types of cancers, Alzheimer's disease and various infectious diseases, as well as their treatment, can be analysed with liquid biopsy. For some medical conditions, liquid biopsies contain far fewer biomarkers than tissue biopsies. This is particularly true for diseases originating from deep body organs or the brain, for instance, ovarian cancer, brain cancer and Alzheimer's disease. Only very small amounts of biomarker molecules are released into the systems of the human body from these hard-to-reach organs. Therefore, the applicability of liquid biopsy to disease diagnosis and treatment assessment depends on the sensitivity of the detection device. The detection device should have a sufficiently low Limit of Detection (LoD) to report the presence of biomarkers, even at very low concentration. Trace-level detection is necessary for any sensor to be effective for rapid screening at the onset of cancer because of the low availability of biomarkers in the very early stages of tumorigenesis.
The advantage of liquid biopsy is its accessibility, low risk and non-invasiveness, which allow for repeated sample collection. Liquid biopsy is often considered the key to early diagnosis and screening of disease. Unfortunately, limited by LoD, many liquid-biopsy-based tests often show low sensitivity or specificity and thus have to be used as a secondary and supplementary tool for diagnosis. For instance, recent research has shown that circulating tumour DNA (ctDNA) and circulating tumour cells (CTCs) from blood can identify early-stage and asymptomatic cancer. However, due to their unsatisfactory accuracy, these assays have to be used together with imaging for a confirmed diagnosis. Since imaging techniques cannot detect tumours that are less than about 7 millimetres in size, the purpose of using a blood test for the early diagnosis of onset tumours is completely lost using conventional techniques. In clinical settings, ctDNA and CTC assays are only used for therapy monitoring.
Accordingly, there is a need for new liquid-based tests for the diagnosis of cancer.
In accordance with one broad aspect, at least one example embodiment described in accordance with the teachings herein provides a method of providing a cancer assessment for a patient, the method comprising: isolating a volume of a fluid from a fluid sample of the patient, the volume of fluid including at least one biomarker; adding at least a portion of the volume of fluid to a nanosensor, the nanosensor comprising nanoparticles configured to capture the at least one biomarker and amplify signals emitted by the at least one biomarker during Raman spectroscopy; performing Raman spectroscopy on the volume of fluid on the nanosensor to produce a sample Raman spectrum, the sample Raman spectrum having amplified signals indicating the presence of the at least one biomarker on the nanosensor; processing the sample Raman spectrum using data from template Raman spectra from known cancer samples having cancer characteristics to detect whether the sample comprises one or more of the cancer characteristics; and based on the detected one or more cancer characteristics, providing the cancer assessment of the patient.
In at least one embodiment, the one or more cancer characteristics of the sample are detected based on determining which correlation values obtained by correlating the amplified signals of the sample Raman spectrum to template Raman spectra from the known cancer samples having the cancer characteristics are larger than a correlation threshold.
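By way of a non-limiting illustration only, the correlation-threshold detection described above may be sketched as follows. The sketch assumes that the sample and template spectra share a common wavenumber axis and that a Pearson correlation coefficient is used; the function names, the toy spectra and the threshold value of 0.9 are hypothetical and are not taken from the embodiments described herein.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length intensity lists."""
    if len(a) != len(b):
        raise ValueError("spectra must share the same wavenumber axis")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a flat spectrum carries no shape information
    return cov / sqrt(var_a * var_b)


def detect_characteristics(sample, templates, threshold=0.9):
    """Return {label: correlation} for templates whose correlation exceeds the threshold."""
    scores = {label: pearson(sample, spectrum) for label, spectrum in templates.items()}
    return {label: r for label, r in scores.items() if r > threshold}


# Hypothetical toy spectra (intensity per wavenumber bin), not measured data.
templates = {
    "characteristic_A": [0.0, 1.0, 4.0, 1.0, 0.0, 0.0],
    "characteristic_B": [0.0, 0.0, 1.0, 0.0, 3.0, 1.0],
}
sample = [0.1, 1.1, 3.9, 0.9, 0.2, 0.0]
detected = detect_characteristics(sample, templates)
```

In this toy case only the template labelled characteristic_A is retained, because the sample shares its dominant peak position. In practice the threshold would be chosen empirically.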
In at least one embodiment, the one or more cancer characteristics of the sample are detected by: performing feature extraction on the Raman sample spectral data to extract feature values; performing classification by applying the feature values to at least one set of classification models determined for the at least one biomarker to detect the one or more cancer characteristics; and providing the cancer assessment by incorporating each of the detected cancer characteristics, wherein the classification models are determined using the template Raman spectra from the known cancer samples.
In at least one embodiment, the feature extraction is performed using Principal Component Analysis, Multivariate Curve Resolution Analysis or a combination thereof.
In at least one embodiment, the classification model is one of Partial Least Squares Discriminant Analysis (PLSDA), Support Vector Machine Discriminant Analysis (SVMDA), Artificial Neural Network (ANN) analysis, t-distributed stochastic neighbor embedding (t-SNE) and Random Forest classification.
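By way of a non-limiting illustration only, the feature extraction and classification steps may be sketched as follows. The sketch makes several simplifying assumptions: Principal Component Analysis is reduced to its first component, computed by power iteration, and a nearest-centroid rule stands in for the classification models named above (it is not PLSDA, SVMDA, ANN or Random Forest). All function names, labels and spectra are hypothetical.

```python
from math import sqrt


def center(rows):
    """Subtract the per-column mean; return (centered rows, column means)."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows], means


def first_component(centered, iterations=200):
    """Leading principal axis via power iteration on X^T X."""
    dim = len(centered[0])
    v = [float(j + 1) for j in range(dim)]  # arbitrary non-zero starting vector
    for _ in range(iterations):
        proj = [sum(x * vi for x, vi in zip(row, v)) for row in centered]  # X v
        w = [sum(p * row[j] for p, row in zip(proj, centered)) for j in range(dim)]  # X^T (X v)
        norm = sqrt(sum(x * x for x in w))
        if norm == 0:
            break
        v = [x / norm for x in w]
    return v


def project(row, means, axis):
    """Feature value: score of one spectrum on the principal axis."""
    return sum((x - m) * a for x, m, a in zip(row, means, axis))


def fit(spectra, labels):
    """Build a model from template spectra: column means, axis, per-label score centroids."""
    centered, means = center(spectra)
    axis = first_component(centered)
    scores = {}
    for row, label in zip(spectra, labels):
        scores.setdefault(label, []).append(project(row, means, axis))
    centroids = {label: sum(s) / len(s) for label, s in scores.items()}
    return means, axis, centroids


def classify(sample, model):
    """Assign the label whose score centroid is nearest to the sample's score."""
    means, axis, centroids = model
    score = project(sample, means, axis)
    return min(centroids, key=lambda label: abs(centroids[label] - score))


# Hypothetical template spectra (four wavenumber bins each), not measured data.
training = [[1.0, 5.0, 1.0, 0.0], [1.0, 6.0, 1.0, 0.0],
            [1.0, 1.0, 1.0, 4.0], [1.0, 1.0, 1.0, 5.0]]
labels = ["characteristic_A", "characteristic_A", "reference", "reference"]
model = fit(training, labels)
prediction = classify([1.0, 5.5, 1.0, 0.5], model)
```

The sign of the principal axis is arbitrary, which does not affect the result because the template scores and the sample score are computed on the same axis.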
In at least one embodiment, the cancer characteristic is a cancer type, a cancer stage, cancer progression, cancer metastasis, cancer potential for metastasis, prediction of a benign or malignant tumor, prediction of tumor location, or a combination thereof.
In at least one embodiment, the biomarker is extracellular vesicles.
In at least one embodiment, the biomarker is extracellular vesicles associated with circulating cancer initiating cells (CICs) or cancer stem cells.
In at least one embodiment, the biomarker is cell-free nucleic acid associated with cancer initiating cells (CICs) or cancer stem cells.
In at least one embodiment, the cell-free nucleic acid is cell-free DNA.
In at least one embodiment, the cell-free DNA is molecularly modified by one of methylation, oxidation and acetylation.
In at least one embodiment, the biomarker is immune cells.
In at least one embodiment, the biomarker is one or more of T cells, NK cells and myeloid derived suppressor cells.
In at least one embodiment, the biomarker is one or more of CD 4+ T cells, NK cells and β cells.
In at least one embodiment, the biomarker is serum.
In at least one embodiment, the fluid is obtained by density gradient centrifugation.
In at least one embodiment, the volume of fluid is about 10 μL or more.
In at least one embodiment, the fluid is blood plasma and the volume of the blood plasma is about 10 μL or more.
In at least one embodiment, the fluid is buffy coat and the volume of the buffy coat is about 10 μL or more.
In at least one embodiment, after adding the fluid to the nanosensor, the fluid remains on the nanosensor for an incubation period.
In at least one embodiment, the incubation period is in a range of about 1 minute to about 2 minutes.
In at least one embodiment, providing the cancer assessment includes providing a type of the cancer, a location of the cancer, a stage of the cancer, a metastatic potential of the cancer, a therapy efficacy of the cancer or a monitoring of minimal residual disease.
In at least one embodiment, providing the cancer assessment includes early cancer diagnosis.
In at least one embodiment, providing the cancer assessment includes determining whether a tumor is benign or malignant.
In at least one embodiment, providing the cancer assessment includes, when the tumor is benign, determining whether the tumor has potential for malignancy.
In at least one embodiment, providing the cancer assessment includes determining whether a tumor is primary or metastatic.
In at least one embodiment, providing the cancer assessment includes determining whether a primary tumor has potential for metastasis.
In at least one embodiment, providing the cancer assessment includes determining a progression of the cancer.
In at least one embodiment, providing the cancer assessment includes determining a nodal metastasis of the cancer.
In at least one embodiment, providing the cancer assessment includes determining a clinical metastasis of the cancer.
In at least one embodiment, providing the cancer assessment includes determining a stage and/or grade of the cancer.
In at least one embodiment, providing the cancer assessment includes a prediction of patient survival.
In at least one embodiment, providing the cancer assessment includes providing a prognosis for the patient.
In at least one embodiment, providing the cancer assessment includes providing an early diagnosis of cancer.
In at least one embodiment, providing the cancer assessment includes providing the early diagnosis of hard-to-detect cancers including brain cancer, ovarian cancer, kidney cancer, pancreatic cancer, liver cancer, lung cancer, or gastrointestinal cancer.
In at least one embodiment, providing the cancer assessment includes determining a presence of an aggressive brain cancer including glioblastoma.
In at least one embodiment, providing the cancer assessment includes providing a location of the tumor.
In at least one embodiment, providing the cancer assessment includes determining a metastatic state of cancer to brain from a cancer site, the cancer site including lung tissue, breast tissue, colon tissue, kidney tissue, thyroid tissue and skin tissue.
In at least one embodiment, determining the metastatic state is by risk assessment based on a molecular phenotype of the tumor, the molecular phenotype including human epidermal growth factor receptor 2 (HER2), epidermal growth factor receptor (EGFR) and/or isocitrate dehydrogenase (IDH).
In at least one embodiment, determining the metastatic state of cancer to brain from a cancer site includes determining the metastatic state of breast cancer based on a molecular phenotype of the tumor.
In at least one embodiment, the molecular phenotype of the tumor is HER2 positive or HER2 negative.
In at least one embodiment, determining the metastatic state of cancer to brain from a cancer site includes determining the metastatic state of lung cancer based on a molecular phenotype of the tumor.
In at least one embodiment, the molecular phenotype of the tumor is EGFR.
In at least one embodiment, providing the cancer assessment includes determining whether a metastatic state of the cancer is localised metastasis or widespread metastasis from primary cancer sites.
In at least one embodiment, providing the cancer assessment includes determining a presence of a gynaecological cancer, the gynaecological cancer being one of ovarian cancer, cervical cancer, or uterine cancer.
In at least one embodiment, providing the cancer assessment includes monitoring cancer recurrence during or after therapy, the therapy including radiation therapy, immunotherapy and/or chemotherapy.
In at least one embodiment, providing the cancer assessment includes monitoring cancer recurrence after surgery.
In at least one embodiment, providing the cancer assessment includes determining a presence of minimal residual disease.
In at least one embodiment, the sample Raman spectrum includes second amplified signals indicating the presence of a second biomarker on the nanosensor and the method further comprises: performing further data processing on the sample Raman spectrum to compare the second amplified signals to a second template Raman spectrum to determine a second correlation between the sample Raman spectrum and the second template Raman spectrum, the second template Raman spectrum being of a known cancer characteristic; and based on both the correlation between the sample Raman spectrum and the template Raman spectrum and the second correlation between the sample Raman spectrum and the second template Raman spectrum, providing the diagnosis of the cancer in the patient.
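By way of a non-limiting illustration only, the use of two correlations may be sketched as follows. The embodiments described herein do not specify how the two correlations are combined; the rule below (each correlation compared against a common threshold) is an assumption made for illustration, and the function name, threshold and outcome labels are hypothetical.

```python
def combined_assessment(first_corr, second_corr, threshold=0.9):
    """Combine two template correlations into one hypothetical outcome label.

    first_corr:  correlation of the sample spectrum with the first template
    second_corr: correlation of the sample spectrum with the second template
    """
    first_hit = first_corr > threshold
    second_hit = second_corr > threshold
    if first_hit and second_hit:
        return "both biomarkers match"
    if first_hit or second_hit:
        return "one biomarker matches"
    return "no match"


# Hypothetical correlation values, not measured data.
outcome = combined_assessment(0.95, 0.93)
```

Other combination rules, such as a weighted score or a trained classifier taking both correlations as inputs, would serve equally well for illustration.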
In accordance with another broad aspect, at least one example embodiment described in accordance with the teachings herein provides a computing device for providing a cancer assessment for a patient for a sample that is a volume of a fluid sample from the patient that includes at least one biomarker; wherein the computing device comprises: a data store comprising program instructions for obtaining Raman spectral data of the sample and performing cancer assessment of the sample using the sample Raman spectral data; and a processing unit that is operatively coupled to the data store and when executing the program instructions is configured to: acquire sample Raman spectral data from the fluid sample where the Raman spectral data is obtained after adding at least a portion of the volume of the fluid sample to a nanosensor that comprises nanoparticles configured to capture the at least one biomarker, the sample Raman spectral data having amplified signals indicating the presence of the at least one biomarker on the nanosensor; process the sample Raman spectral data using data from template Raman spectra from known cancer samples having cancer characteristics to detect whether the sample comprises one or more of the cancer characteristics; and provide the cancer assessment of the patient based on the detected one or more cancer characteristics.
In at least one embodiment, the processing unit is further configured to perform any of the steps of the methods described in accordance with the teachings herein.
These and other features and advantages of the present application will become apparent from the following detailed description taken together with the accompanying drawings. It should be understood, however, that the detailed description and the specific examples, while indicating preferred embodiments of the application, are given by way of illustration only, since various changes and modifications within the spirit and scope of the application will become apparent to those skilled in the art from this detailed description.
For a better understanding of the various embodiments described herein, and to show more clearly how these various embodiments may be carried into effect, reference will be made, by way of example, to the accompanying drawings which show at least one example embodiment, and which are now described. The drawings are not intended to limit the scope of the teachings described herein.
Further aspects and features of the example embodiments described herein will appear from the following description taken together with the accompanying drawings.
Various apparatuses, methods and compositions are described below to provide an example of at least one embodiment of the claimed subject matter. No embodiment described below limits any claimed subject matter and any claimed subject matter may cover apparatuses and methods that differ from those described below. The claimed subject matter is not limited to apparatuses, methods and compositions having all of the features of any one apparatus, method or composition described below or to features common to multiple or all of the apparatuses, methods or compositions described below. It is possible that an apparatus, method or composition described below is not an embodiment of any claimed subject matter. Any subject matter that is disclosed in an apparatus, method or composition described herein that is not claimed in this document may be the subject matter of another protective instrument, for example, a continuing patent application, and the applicant(s), inventor(s) and/or owner(s) do not intend to abandon, disclaim, or dedicate to the public any such invention by its disclosure in this document.
Furthermore, it will be appreciated that for simplicity and clarity of illustration, where considered appropriate, reference numerals may be repeated among the figures to indicate corresponding or analogous elements. In addition, numerous specific details are set forth in order to provide a thorough understanding of the example embodiments described herein. However, it will be understood by those of ordinary skill in the art that the example embodiments described herein may be practiced without these specific details. In other instances, well-known methods, procedures, and components have not been described in detail so as not to obscure the example embodiments described herein. Also, the description is not to be considered as limiting the scope of the example embodiments described herein.
It should be noted that terms of degree such as “substantially”, “about” and “approximately” as used herein mean a reasonable amount of deviation of the modified term such that the end result is not significantly changed. These terms of degree should be construed as including a deviation of the modified term, such as 1%, 2%, 5%, or 10%, for example, if this deviation does not negate the meaning of the term it modifies.
Furthermore, the recitation of any numerical ranges by endpoints herein includes all numbers and fractions subsumed within that range (e.g. 1 to 5 includes 1, 1.5, 2, 2.75, 3, 3.90, 4, and 5). It is also to be understood that all numbers and fractions thereof are presumed to be modified by the term “about” which means a variation up to a certain amount of the number to which reference is being made, such as 1%, 2%, 5%, or 10%, for example, if the end result is not significantly changed.
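By way of a non-limiting illustration only, the terms of degree and endpoint ranges defined above may be expressed as simple checks. The sketch assumes a relative deviation of up to 10%, which is one of the example values given above; the function names are hypothetical.

```python
def is_about(value, nominal, tolerance=0.10):
    """True if value deviates from nominal by no more than the relative tolerance."""
    return abs(value - nominal) <= tolerance * abs(nominal)


def in_range(value, low, high):
    """True if value lies within the range, endpoints included."""
    return low <= value <= high


# "about 10" with a 10% deviation covers 9 to 11.
within = is_about(10.5, 10)
outside = is_about(11.5, 10)

# "1 to 5" includes the endpoints and every number or fraction in between.
examples = [1, 1.5, 2, 2.75, 3, 3.90, 4, 5]
all_included = all(in_range(x, 1, 5) for x in examples)
```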
It should also be noted that, as used herein, the wording “and/or” is intended to represent an inclusive-or. That is, “X and/or Y” is intended to mean X, Y or X and Y, for example. As a further example, “X, Y, and/or Z” is intended to mean X or Y or Z or any combination thereof. Also, the expression of A, B and C means various combinations including A; B; C; A and B; A and C; B and C; or A, B and C.
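By way of a non-limiting illustration only, the inclusive-or reading of "and/or" corresponds to every non-empty combination of the listed items, which may be enumerated as follows. The function name is hypothetical.

```python
from itertools import combinations


def inclusive_or(options):
    """List every non-empty combination of the options, smallest combinations first."""
    return [combo
            for size in range(1, len(options) + 1)
            for combo in combinations(options, size)]


combos = inclusive_or(["A", "B", "C"])
# A; B; C; A and B; A and C; B and C; A, B and C
```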
The example embodiments of the devices, systems or methods described in accordance with the teachings herein may be implemented as a combination of hardware and software. For example, the embodiments described herein may be implemented, at least in part, by using one or more computer programs, executing on one or more programmable devices comprising at least one processing element and at least one storage element (i.e. at least one volatile memory element and at least one non-volatile memory element). The hardware may comprise input devices including at least one of a touch screen, a keyboard, a mouse, buttons, keys, sliders and the like, as well as one or more of a display, a speaker, a printer, and the like depending on the implementation of the hardware.
It should also be noted that there may be some elements that are used to implement at least part of the embodiments described herein that may be implemented via software that is written in a high-level procedural language or an object-oriented programming language. The program code may be written in MATLAB, Julia, Python, C, C++ or any other suitable programming language and may comprise modules or classes, as is known to those skilled in object-oriented programming. Alternatively, or in addition thereto, some of these elements implemented via software may be written in assembly language, machine language or firmware as needed. In either case, the language may be a compiled or interpreted language.
At least some of these software programs may be stored on a computer readable medium such as, but not limited to, a ROM, a magnetic disk, an optical disc, a USB key and the like that is readable by a device having a processor, an operating system and the associated hardware and software that is necessary to implement the functionality of at least one of the embodiments described herein. The software program code, when read by the device, configures the device to operate in a new, specific and predefined manner in order to perform at least one of the methods described herein.
Furthermore, at least some of the programs associated with the devices, systems and methods of the embodiments described herein may be capable of being distributed in a computer program product comprising a computer readable medium that bears computer usable instructions, such as program code, for one or more processing units. The medium may be provided in various forms, including non-transitory forms such as, but not limited to, one or more diskettes, compact disks, tapes, chips, and magnetic and electronic storage. In alternative embodiments, the medium may be transitory in nature such as, but not limited to, wire-line transmissions, satellite transmissions, internet transmissions (e.g. downloads), media, digital and analog signals, and the like. The computer useable instructions may also be in various formats, including compiled and non-compiled code.
The following description is not intended to limit or define any claimed or as yet unclaimed subject matter. Subject matter that may be claimed may reside in any combination or sub-combination of the elements or process steps disclosed in any part of this document including its claims and figures. Accordingly, it will be appreciated by a person skilled in the art that an apparatus, system or method disclosed in accordance with the teachings herein may embody any one or more of the features contained herein and that the features may be used in any particular combination or sub-combination that is physically feasible and realizable for its intended purpose.
Recently, there has been a growing interest in developing new technologies of early cancer detection and prognosis. Specifically, there has been a growing interest in identifying new biomarkers in physiological (e.g. blood) samples from humans for early cancer detection and prognosis.
Currently, there are at least two types of biomarkers that can be detected from blood samples. These biomarkers are generally referred to as cell-component biomarkers and immune-cell biomarkers.
Cell-component biomarkers currently include, but are not limited to, extracellular vesicles of circulating cancer initiating cells. Currently, extracellular vesicles may be used to diagnose a type of a cancer, its primary location, its stage, and/or whether the cancer is metastatic. Additionally, extracellular vesicles may be used to provide real-time (e.g., same day test results) monitoring of treatment effect and prognosis for asymptomatic individuals.
Cell-component biomarkers also currently include, but are not limited to, cell-free DNA. Cell-free DNA may also be used to diagnose a type of a cancer, its primary location, its stage, and/or whether the cancer is metastatic. Additionally, cell-component biomarkers may be used to diagnose tumor burden, perform therapy monitoring, determine prognosis after therapy and to provide a therapy efficacy assessment.
Cell-component biomarkers also currently include, but are not limited to, methylated cell-free DNA. Methylated cell-free DNA may be used to diagnose the origin of a metastatic tumour, the metastatic potential of the primary tumour and the possible type of metastatic site.
Cell-component biomarkers also currently include, but are not limited to, methylated tumour DNA, which may be used to diagnose the aggressiveness of the tumour, the type of tumour, the metastatic potential and the composition of the tumour such as, for example, where there is a presence of cancer stem cells (CSCs) and metastatic CSCs in the tumour microenvironment.
Immune-cell biomarkers include but are not limited to T cells and other types of cells from the immune system, such as but not limited to Natural Killer (NK) cells, myeloid derived suppressor cells (MDSCs) and tumor associated macrophages.
Generally, there are three major types of T cells: CD 3+ (total T cells), CD 4+ (i.e. naive T cells/unactivated) and CD 8+ (i.e. cytotoxic T cells). The presence of high-density CD 8+ T cells is correlated with cancer stage and the ability of the patient to respond to immunotherapy. Increases in the number of CD 3+ and CD 4+ T cells generally result in a reduction of monocytes, which thereby promotes tumor growth, migration and invasion.
Currently, in the prior art, no definitive biomarker exists to predict the outcome of immune system activity/immune infiltration potential of cancer during clinical decision making.
However, in accordance with the teachings herein, methods of establishing cell-component biomarkers using standard templates are described. To establish the cell-component biomarkers, the cell-component biomarkers are first isolated (e.g. from patient tissue biopsy or a cancer cell line). Following isolation, Raman spectral profiles are collected to build a standard template.
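By way of a non-limiting illustration only, building a standard template from collected Raman spectral profiles may be sketched as follows. The embodiments described herein do not specify how the profiles are combined; the sketch assumes that each profile is scaled to a unit maximum and that the profiles are then averaged point by point. The function names and the toy profiles are hypothetical.

```python
def normalise(spectrum):
    """Scale a spectrum so that its strongest peak has intensity 1."""
    peak = max(spectrum)
    if peak <= 0:
        raise ValueError("spectrum has no positive intensity")
    return [x / peak for x in spectrum]


def build_template(profiles):
    """Average several normalised profiles, bin by bin, into one template spectrum."""
    if not profiles:
        raise ValueError("at least one profile is required")
    length = len(profiles[0])
    if any(len(p) != length for p in profiles):
        raise ValueError("profiles must share the same wavenumber axis")
    normalised = [normalise(p) for p in profiles]
    return [sum(column) / len(normalised) for column in zip(*normalised)]


# Hypothetical profiles of one isolated biomarker at different overall intensities.
profiles = [
    [0.0, 2.0, 4.0, 2.0],
    [0.0, 1.0, 2.0, 1.0],
    [0.0, 3.0, 6.0, 3.0],
]
template = build_template(profiles)
```

Normalising before averaging keeps a single bright acquisition from dominating the template; other pre-processing steps, such as baseline removal, are omitted from this sketch.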
The establishment of immune-cell standard templates (e.g. templates for immunoprofiling) is described herein as follows: T cells isolated from a buffy coat from normal blood are used to establish the surface-enhanced Raman scattering (SERS) characteristics of the T cells. For example, CD 4+ T cells and CD 8+ T cells are distinctly different in size and have different intracellular calcium dynamics and different cytokine secretion, thereby providing the potential to differentiate them using their SERS signatures.
Templates of cells from the immune system such as, but not limited to, NK cells, myeloid derived suppressor cells (MDSCs) and tumor associated macrophages can be established in the same manner. Here, the presence and concentration of each of these immune cell types in the blood can be correlated with the cancer stage, metastatic potential, therapy efficacy and the development of adoptive immunity to cancer.
Referring now to the figures,
In
In at least one embodiment, at step 103, a volume of the blood plasma (e.g. cell-free blood plasma) produced from step 102 is dropped onto a first nanosensor for detection of cell-component biomarkers. In at least one embodiment, the volume of the blood plasma comprises cell component biomarkers such as extracellular vesicles of circulating cancer stem cells, cell-free DNA, methylated cell-free DNA, methylated tumor DNA, exosomes, DNA of cancer stem cells or combinations thereof.
In at least one embodiment, the volume of blood plasma added to the nanosensor is about 10 microliters (μL).
The nanosensors used herein for surface enhanced Raman scattering for biomolecule detection include one or more nanoparticles (e.g. nanoprobes). The nanoparticles are made of materials such as but not limited to gold, silver, platinum, titanium, silicon, aluminum, nickel, and/or graphite. The nanoparticles described herein may also be referred to as quantum dots (i.e. nanoparticles with particle size less than about 5 nm).
Unlike other types of quantum dots, the quantum dots of the nanosensors described herein are generally non-toxic, making them particularly suitable for biomedical applications. In addition, the dots are free of contamination and generally do not react or interfere with target molecules.
The quantum dots of the nanosensors described herein are smaller and have a unique structure (e.g. have a high vacancy density of crystalline nanoparticles) compared to conventional quantum dots, which translates to higher detection sensitivity and pushes the limit of detection to a lower concentration. Therefore, previously undetectable traces (undetectable because of their low concentration) may be detected using quantum dots of the nanosensors described herein.
In at least one embodiment, the nanosensor may amplify extremely weak signals of biomarkers such as but not limited to extracellular vesicles of circulating cancer stem cells (CSCs) and/or the other biomarkers noted herein. The nanosensor includes one or more self-assembled, three-dimensional nanoprobes that can detect a biomolecular configuration of extremely low levels of extracellular vesicles associated with CSCs (e.g. 10 extracellular vesicles in 10 μL of solution), for example. In at least one embodiment, the nanosensors (i.e. nanoprobes) promote trapping of extracellular vesicles, which results in an overall increase in surface area of the extracellular vesicles, and permit drainage of fluids, improving the interaction between the extracellular vesicles and the sensor surface. The nanosensor is a three-dimensional and porous network of nanoprobes. The interconnected crisscross of nanosensors can act as a trapping and screening device. In addition to biocomponents getting trapped by the nanosensor, fluids are easily drained from the nanosensor, thereby improving the interaction of the biocomponents with the nanosensors.
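By way of a non-limiting illustration only, the example detection level given above (10 extracellular vesicles in 10 μL of solution) may be converted into more familiar concentration units as follows. The function names are hypothetical; the Avogadro constant is the exact SI value.

```python
AVOGADRO = 6.02214076e23  # particles per mole (exact SI value)


def particles_per_ml(count, volume_ul):
    """Number concentration in particles per millilitre."""
    return count / volume_ul * 1000.0


def molar_concentration(count, volume_ul):
    """Molar concentration in mol/L for a given particle count and volume in microlitres."""
    litres = volume_ul * 1e-6
    return count / AVOGADRO / litres


number_conc = particles_per_ml(10, 10)   # 1000 vesicles per mL
molar_conc = molar_concentration(10, 10)  # roughly 1.7e-18 mol/L, i.e. attomolar scale
```

On these figures, 10 vesicles in 10 μL corresponds to 1 vesicle per microlitre, which is on the order of one attomole per litre.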
Generally, the nanosensors used in the methods and systems described herein have a three-dimensional (3D) structure comprising self-assembled closed rings and bridges which causes nanoparticles to aggregate together. Between the self-assembled closed rings and bridges are pores that are interconnected. The ring size is roughly the wavelength of a laser beam used to create the sensor.
In at least one embodiment, the nanoparticles contain rich crystalline defects. In at least one embodiment, the nanoparticle size is tuneable from about 100 nm to about 1 nm. For example, in at least one embodiment, the nanoparticle size may be less than about 1 nm.
In at least one embodiment, the nanosensors used in the methods and systems described herein are fabricated by the methods described in U.S. Provisional Patent Application No. 63/059,079 filed 30 Jul. 2020 entitled “ULTRASHORT LASER SYNTHESIS OF NANOPARTICLES OF ISOTOPES”, the contents of which are incorporated herein by reference.
In one embodiment, at step 104, optionally, a volume of the buffy coat produced from step 102 may be dropped onto a second nanosensor to detect immune-cell biomarkers. In at least one embodiment, the volume of buffy coat comprises immune cell biomarkers such as circulating tumor cells, circulating stem cells, and/or immune cells such as but not limited to T cells (e.g. CD 3+, CD 4+ and/or CD 8+), natural killer (NK) cells, β cells, myeloid derived suppressor cells or combinations thereof.
In at least one embodiment, the volume of the buffy coat added to the nanosensor is about 10 μL.
In accordance with the teachings herein, the nanosensors that are used allow for the detection of immune cell biomarkers and/or cell component biomarkers in low concentrations that were previously not detectable. This is advantageous as the detection of one or more cell-component biomarkers can be used to diagnose one or more cancer characteristics such as the type of a cancer, the cancer primary location, the cancer stage, whether the cancer is metastatic, or a combination of these features. On the other hand, the detection of one or more immune cell biomarkers can be used to determine the cancer stage and/or the ability of a patient to respond to immunotherapy.
Optionally, at a step 105, the remaining products produced during step 102 may be discarded.
At step 106, in at least one embodiment, after a first incubation time, the first nanosensor including the volume of blood plasma is scanned under a Raman microscope to obtain a Raman spectrum.
In at least one embodiment, the first incubation time is in a range of about 1 to about 2 minutes. The length of the first incubation time is based on the time needed for the biomarkers to adsorb onto the surface of the nanosensors. The first incubation time may be determined empirically.
Also at step 106, in at least one embodiment, after a second incubation time, the second nanosensor including the volume of buffy coat is scanned under a Raman microscope to obtain a Raman spectrum.
In at least one embodiment, the second incubation time is in a range of about 1 to about 2 minutes. Again, the length of the second incubation time is based on the time needed for the biomarkers to adsorb onto the surface of the nanosensors. The second incubation time may be determined empirically.
At step 107, one or more Raman spectra of the volume of cell-free plasma on the nanosensor are generated by a Raman spectroscopy system.
At step 108, one or more Raman spectra of the volume of buffy coat on the nanosensor are generated by a Raman spectroscopy system.
One example of a Raman spectroscopy system for obtaining Raman spectra of the samples on a nanosensor of steps 107 and/or 108 of method 100 is provided in
The system 200 includes a computing device 202, an excitation pathway including a laser 204, a waveplate 206, a beam steerer 208, a beam expander 210, a Rayleigh filter 212, filter and waveplates 214, and a moveable stage 216. The nanosensor is on the moveable stage 216. The system 200 further includes a return pathway including a microscope 218, the filter and waveplates 214, a focus lens 220, an adjustable slit 222, a collimator 224, a diffraction grating 226, a focus lens 228 and a CCD camera 230. In at least one embodiment, the system 200 is a confocal Raman microscope system with, for example, an excitation wavelength of 785 nm. This microscope system provides all of the components shown in
During use, the computing device 202 includes one or more software programs (see e.g.
The excitation light pulses excite molecules in the sample 217, and the excited molecules emit scattered photons that are at different energies and different frequencies. There is also a change in the electric dipole-electric dipole polarization and a resulting Raman scattering which is proportional to the polarization change. The scattered photons, referred to as a Raman signal, are then focused and purified by the optical elements in the return pathway before reaching the CCD camera 230. The CCD camera 230 then records the images and transmits the images to the computing device 202 so that it can perform processing, as described below.
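The energy difference between the excitation light and the scattered photons is conventionally reported as a Raman shift in wavenumbers (cm−1). The following is a minimal sketch of that conversion, assuming wavelengths are expressed in nanometres; the function names are illustrative only and are not part of the system 200.

```python
def raman_shift_cm1(excitation_nm: float, scattered_nm: float) -> float:
    """Raman shift in cm^-1 between the excitation and scattered wavelengths (nm)."""
    # 1 / wavelength gives a wavenumber; the factor 1e7 converts nm^-1 to cm^-1.
    return 1e7 / excitation_nm - 1e7 / scattered_nm


def scattered_wavelength_nm(excitation_nm: float, shift_cm1: float) -> float:
    """Wavelength (nm) of Stokes-scattered light for a given Raman shift (cm^-1)."""
    return 1e7 / (1e7 / excitation_nm - shift_cm1)


# Example: where a 1445 cm^-1 band appears for 785 nm excitation.
wavelength = scattered_wavelength_nm(785.0, 1445.0)
print(f"{wavelength:.1f} nm")
```

With a 785 nm excitation wavelength, a band at 1445 cm−1 therefore falls at roughly 885 nm on the detector side of the return pathway.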
For example, referring now to
The computing device 202 generally comprises a processing unit 250, an Analog to Digital Converter (ADC) 256, a data store 252, a display 254 and an input/output interface 258 which may be coupled to various peripheral components such as the laser 204 and the CCD camera 230, or a prepackaged Raman microscope which includes these hardware components. The computing device 202 may also include a power unit (not shown) or be connected to a power source to receive power needed to operate its components.
The processing unit 250 is operatively coupled to the other components of the computing device 202 for controlling various operations and performing certain functions, such as setting or modifying stimulus parameters for the laser 204 (i.e. wavelength and intensity), the data acquisition process (i.e. controlling image acquisition, etc.) and assessing patient samples for various aspects of cancer as described herein.
The processing unit 250 can be any suitable processor, controller or digital signal processor that can provide sufficient processing power depending on the operational requirements of the computing device 202 as is known by those skilled in the art. For example, the processing unit 250 may be a high performance processor. In alternative embodiments, the processing unit 250 may include more than one processor with each processor being configured to perform different dedicated tasks.
The data store 252 includes volatile and non-volatile memory elements such as, but not limited to, one or more of RAM, ROM, one or more hard drives, one or more flash drives or some other suitable data storage elements. The data store 252 may be used to store an operating system and programs as is commonly known by those skilled in the art. For instance, the operating system provides various basic operational processes for the processing unit 250 and the programs include various operational and user programs so that a user can interact with the computing device 202 to perform Raman imaging of a sample and subsequent cancer assessment of the sample, to determine and/or update models (e.g. through training) used in the cancer assessment or a combination of both of these operations.
The data store 252 may also include software code for implementing various components for Raman imaging, model training and various aspects of cancer assessment in accordance with the teachings herein as well as storing values for various operational parameters that are used for Raman imaging. For example, the data store 252 can include programs for implementing an input/output module 252a, a cancer assessment module 252b, a Raman imaging module 252c, and a training module 252d. It should be noted that there may be other embodiments in which the software modules may be organized differently; however, the same functions as described herein are performed.
The input/output module 252a can include program instructions for receiving acquired Raman image data and user control data. The input/output module 252a can also include program instructions for outputting and/or storing raw Raman image data, preprocessed Raman image data, and cancer assessment data.
The cancer assessment module 252b includes program instructions for obtaining and preprocessing Raman spectral data for a patient sample, performing feature extraction on the preprocessed Raman spectral data and providing feature extraction values to various models including classifiers for detecting various cancer characteristics that can be used for providing a cancer assessment for the patient sample based on one or more of the biomarkers described herein. The cancer characteristics may include cancer type, cancer stage, cancer metastasis, potential cancer metastasis or any combination thereof. If the analysis is done based on two or more biomarkers then the results from each biomarker can be combined to provide a cancer assessment with improved accuracy compared to basing the analysis on only one biomarker. The operation of the cancer assessment module 252b is further described with reference to
The Raman imaging module 252c is used to control the generation of excitation laser pulses by the laser 204 and the recording of resulting images when the patient sample 217 is on the moveable stage 216. The Raman images may then be directly processed by the cancer assessment module 252b and optionally stored on the data store 252.
The training module 252d includes program instructions using training data to determine the models (e.g. classifiers) used by the cancer assessment module 252b as well as update these models over time as more training data is acquired. The training data may be obtained using the hardware setup of
The display 254 can be any suitable device for displaying images and various types of information such as an LCD monitor or touchscreen display. The input/output interface 258 can be various ports such as one or more of USB, Firewire, serial and parallel ports, for example, that may be coupled with various peripheral devices used for the input or output of data. Examples of these devices include the laser 204 and CCD camera 230 and may also include, but are not limited to, a keyboard, a mouse, a trackpad, a touch interface, and a printer or any combination thereof. The Analog to Digital Converter 256 may be needed to convert any analog data that is received by the input/output interface 258 into digital data.
Returning back to
In at least one embodiment, the correlation analysis of the one or more obtained Raman spectra with the template Raman spectrum includes a comparison of signature peaks of Raman bands of the obtained Raman spectrum with signature peaks of Raman bands of the template Raman spectrum. The template Raman spectrum includes an average spectrum of a plurality of individual Raman spectra from samples of known cancer characteristics.
Raman spectra of plasma/cfDNA/exosomes, for example, contain rich information. This information is typically in the form of signature peaks. For example, the peaks at a particular wavenumber typically represent a type of biomolecule. Some representative Raman assignments are provided here as an example. For instance, a peak at 782 cm−1 is assigned to the ring breathing mode of DNA/RNA bases, a peak at 1445 cm−1 is assigned to CH2 bending modes of proteins, a peak at 1670 cm−1 is assigned to stretching modes of C═C in lipids, and a peak at 2850 cm−1 is assigned to CH2 symmetric stretch in lipids. The Raman assignments are well established in the literature. It should be understood that specific signature peaks of the spectra can be selected depending on the application of the data.
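As an illustration only, the representative assignments listed above can be held in a small lookup table so that a measured peak position is matched to the nearest tabulated band. The table below contains only the four example bands named in this paragraph, and the matching tolerance of 10 cm−1 is an assumption made for the sketch rather than a value taken from this disclosure.

```python
# Representative assignments quoted in the text (wavenumber in cm^-1).
PEAK_ASSIGNMENTS = {
    782.0: "ring breathing mode of DNA/RNA bases",
    1445.0: "CH2 bending modes of proteins",
    1670.0: "C=C stretching modes in lipids",
    2850.0: "CH2 symmetric stretch in lipids",
}


def assign_peak(wavenumber, tolerance=10.0):
    """Return the assignment of the nearest tabulated band, or None if none is close.

    The default tolerance (in cm^-1) is a hypothetical value chosen for illustration.
    """
    nearest = min(PEAK_ASSIGNMENTS, key=lambda ref: abs(ref - wavenumber))
    if abs(nearest - wavenumber) <= tolerance:
        return PEAK_ASSIGNMENTS[nearest]
    return None


print(assign_peak(1447.0))  # matched to the 1445 cm^-1 band
print(assign_peak(1000.0))  # no tabulated band within tolerance
```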
In at least one embodiment, the correlation analysis may include use of one or more correlation analysis tools. Some of the tools for performing correlation analysis may include: Pearson's correlation analysis, Spearman correlation test, Heat maps and/or plotting correlation matrix of eigenvalues using artificial neural network. In at least one embodiment, during the correlation analysis, ten measurements from one sample are obtained.
In at least one embodiment, a large number of peaks in any given spectrum may be selected. The number of peaks may be determined by the molecular vibrations in each biomarker. Generally, a minimum of three peaks are considered for every biomarker used. In at least one embodiment, the number of peaks used may be in a range of 3 to 10. In at least one embodiment, there may be overlap of many peaks in the Raman spectra of relatively large macromolecules like proteins, lipids and/or nucleic acids. Thus, many inaccuracies may be introduced into the analysis when the spectra are analyzed qualitatively and certain peaks are assigned to specific biochemical components. These errors are introduced from visual inspection and guessing to determine biochemical components from changes in intensity as there always is a possibility of combinational contribution from several components contributing to one peak. Also, there is a possibility of loss of important information from omitted regions of spectra. To overcome these errors, chemometric methods of multivariate data analysis can be employed, as is described with reference to
In at least one embodiment, a diagnosis may be made if there is high similarity between the sample Raman spectrum and the template Raman spectrum. For instance, in at least one embodiment, if the correlation analysis of the obtained Raman spectrum (from the patient sample) with the template Raman spectrum demonstrates correlation of 65% or above, a diagnosis may be made based on the underlying cancer characteristics of the template Raman spectrum.
The establishment of immune-cell standard templates (e.g. templates for immunoprofiling) is described herein as follows: T cells isolated from a buffy coat from normal blood are used to establish the surface-enhanced Raman scattering (SERS) characteristics of the T cells. For example, CD4+ T cells and CD8+ T cells are distinctly different in size, have different intracellular calcium dynamics and different cytokine secretion, thereby providing the potential to differentiate them using their SERS signatures.
Templates of cells from the immune system such as, but not limited to, NK cells, myeloid derived suppressor cells (MDSCs) and tumor associated macrophages can be established in the same manner. Here, the presence and concentration of each of these immune cell types in the blood can be correlated with the cancer stage, metastatic potential, therapy efficacy and the development of adoptive immunity to cancer.
In at least one embodiment, the obtained Raman spectrum/spectra may be compared to more than one template simultaneously to make a multiplex diagnosis. In at least one embodiment, the comparison of the obtained Raman spectrum/spectra to the template(s) may be completed by a computer program. For instance, in at least one embodiment, Python™ and/or Matlab® modules may be used for data comparison and data interpretation.
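The template comparison described above may be sketched in Python as follows. This is a minimal sketch under stated assumptions: the template is formed by point-by-point averaging of spectra sampled on a common wavenumber axis, Pearson's correlation is used as the correlation tool, and the "65% or above" criterion is interpreted as a correlation coefficient of at least 0.65. The function names and the toy intensities are hypothetical.

```python
import math


def mean_spectrum(spectra):
    """Average several equal-length spectra point by point to form a template."""
    n = len(spectra)
    return [sum(values) / n for values in zip(*spectra)]


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length intensity lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def match_templates(spectrum, templates, threshold=0.65):
    """Return {template name: correlation} for every template meeting the threshold."""
    scores = {name: pearson(spectrum, t) for name, t in templates.items()}
    return {name: r for name, r in scores.items() if r >= threshold}


# Toy intensities for illustration only; real spectra have many more points.
templates = {
    "breast": mean_spectrum([[1.0, 5.0, 2.0, 8.0, 3.0], [1.2, 4.8, 2.1, 7.9, 3.0]]),
    "lung": [8.0, 2.0, 6.0, 1.0, 5.0],
}
sample = [1.1, 5.2, 1.9, 8.1, 2.9]
print(match_templates(sample, templates))
```

Because every template is scored in the same pass, the same call supports the multiplex comparison against more than one template.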
Referring now to
At acts 302 and 304, the Raman spectrum of the patient sample is obtained and preprocessed. This may be done using the hardware setup of
At act 306, feature extraction is performed on the preprocessed Raman spectrum to obtain values for features that are used by the models in later acts of method 300. The feature extraction results in a reduction in the dimensionality of the data. The feature extraction may be performed using several different techniques. For instance, at act 306, Principal Component Analysis (PCA) may be used as a non-parametric approach that does not need any explicit background model. In principal component analysis, the input Raman spectral data is decomposed and the output that is generated is the principal components. The first principal component can be defined as the direction that maximizes the variance. The ith principal component is the direction, orthogonal to the preceding principal components, that maximizes the remaining variance. In short, principal components are the eigenvectors of the covariance matrix of the input Raman spectral data. One can determine how many principal components are to be used for analysis and ignore the rest of the components. Known techniques may be used to perform the PCA to produce a certain number of principal components such as 10 principal components, for example. In at least one embodiment described herein, the components that capture the largest share of the variance in the data are used (e.g. usually three to five components are selected, which represent at least 70% of the spectral data). The principal components are the features that are used as inputs to the classifiers. The principal components that are calculated can be further used for solving the problem of classification of one or more components (e.g. extracellular vesicles) that are detected after the sample is on the nanosensor and the Raman image is captured.
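The PCA step can be illustrated with a short NumPy sketch that follows the description above: the spectra are centered, the eigenvectors of the covariance matrix are computed, and the fewest components whose cumulative explained variance reaches 70% are retained. The function name, the use of NumPy and the random stand-in data are assumptions made for illustration and are not a statement of the implementation used in act 306.

```python
import numpy as np


def pca_features(spectra, min_variance=0.70):
    """Project spectra onto the fewest principal components explaining min_variance.

    spectra: array-like of shape (n_samples, n_channels).
    Returns (scores, explained), where explained holds the variance fractions kept.
    """
    X = np.asarray(spectra, dtype=float)
    Xc = X - X.mean(axis=0)                  # center each wavenumber channel
    cov = np.cov(Xc, rowvar=False)           # covariance matrix of the channels
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigh returns eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]        # reorder so the largest variance comes first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    explained = eigvals / eigvals.sum()
    # Smallest k whose cumulative explained variance reaches the requested fraction.
    k = int(np.searchsorted(np.cumsum(explained), min_variance) + 1)
    scores = Xc @ eigvecs[:, :k]             # feature values passed to the classifiers
    return scores, explained[:k]


# Random data standing in for 20 preprocessed spectra of 50 channels each.
rng = np.random.default_rng(0)
scores, explained = pca_features(rng.normal(size=(20, 50)))
print(scores.shape, float(explained.sum()))
```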
Alternatively, at act 306, Multivariate Curve Resolution analysis (MCR) may be used to provide an estimation of contributions of pure ingredients with mixed measurements (for example, a mixture of various types of cfDNA in a plasma sample) providing information on component profiles to generate scientifically meaningful information, which in this case are the features that are then provided to the classifiers. In this way, interpretation of results from complex and large data gathered from Raman spectra may be easily understood.
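One common way to carry out MCR is alternating least squares (MCR-ALS) with non-negativity constraints, in which the measured data matrix is factored into contributions and pure-component profiles. The disclosure does not state which MCR algorithm is used, so the following NumPy sketch is an assumption offered for illustration; the function name, the random initialization and the fixed iteration count are likewise hypothetical.

```python
import numpy as np


def mcr_als(D, n_components, n_iter=100, seed=0):
    """Factor D (samples x channels) into non-negative C and S with D close to C @ S.

    C (samples x k) holds the contribution of each pure component to each sample;
    S (k x channels) holds the estimated pure-component spectral profiles.
    """
    D = np.asarray(D, dtype=float)
    rng = np.random.default_rng(seed)
    S = rng.random((n_components, D.shape[1]))   # random initial spectral profiles
    for _ in range(n_iter):
        # Least-squares solve for C with S fixed, then clip to keep contributions >= 0.
        C = np.clip(np.linalg.lstsq(S.T, D.T, rcond=None)[0].T, 0.0, None)
        # Least-squares solve for S with C fixed, then clip to keep spectra >= 0.
        S = np.clip(np.linalg.lstsq(C, D, rcond=None)[0], 0.0, None)
    return C, S


# Synthetic mixtures of two non-negative "pure" profiles, for illustration only.
rng = np.random.default_rng(1)
D = rng.random((12, 2)) @ rng.random((2, 40))
C, S = mcr_als(D, n_components=2)
print(C.shape, S.shape)
```

In this sketch the columns of C play the role of the MCR component values that would be provided to the classifiers as features.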
MCR analysis generally provides more accurate information on the pure components in a mixture compared to PC analysis. Therefore, by applying it to the extracellular vesicles, one can derive the variation in the pure components of extracellular vesicles. On the other hand, PC analysis is faster than MCR analysis. However, it should be understood that the choice of whether to perform PC analysis or MCR analysis depends on which type of analysis was used in generating/training the classifiers (e.g. classification models). For example, if PC analysis was used to generate/train the cancer type classifier used in act 308, then PC analysis is performed at act 306 and the extracted feature values are input to act 308. Meanwhile, if MCR analysis was used to generate/train the cancer stage classifier used in act 310, then MCR analysis is performed at act 306 and the extracted feature values are input to act 310. Accordingly, it is possible that both PC and MCR analysis may be performed in act 306 if PC and MCR analysis were each used in determining/training at least one of the classifiers used in acts 308 to 314.
At act 308, the extracted feature values from act 306 are provided as input to a cancer type classifier which provides probabilities for different cancer type classes that are used to detect whether at least one type of cancer is present in the patient sample. This detection result is then provided to act 316 where it is incorporated into the cancer assessment that is performed. It should be noted that if no cancer type is detected at act 308, then there is no need to perform acts 310 to 314.
At act 310, the extracted feature values from act 306 are provided as input to a cancer stage classifier which provides probabilities for different cancer stage classes that are used to detect what the cancer stage is if there is cancer present in the patient sample. This detection result is then provided to act 316 where it is incorporated into the cancer assessment that is performed.
At act 312, the extracted feature values from act 306 are provided as input to a metastasis classifier which provides probabilities for whether the cancer is metastasized or not when there is cancer present in the patient sample. This detection result is then provided to act 316 where it is incorporated into the cancer assessment that is performed.
At act 314, the extracted feature values from act 306 are provided as input to a potential metastasis classifier which provides probabilities for potential cancer metastasis when there is cancer present in the patient sample. This detection result is then provided to act 316 where it is incorporated into the cancer assessment that is performed.
It should be noted that one or more of acts 308 to 314 may be optional in some embodiments.
At act 316, the detection results from acts 308 to 314 which were performed (as some might be optional) are used to provide a cancer assessment, which might be considered to be a cancer diagnosis for the patient based on the analysis of the patient sample. For instance, the cancer assessment might be that the patient has a certain cancer type, a certain cancer stage, whether the cancer has metastasized and the potential for cancer metastasis to occur. For example, the patient may have breast cancer that is at stage 4 which has metastasized and, based on further potential metastasis, the patient may be given a probability of survival of 5 years.
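The flow of acts 308 to 316 can be summarized by the following sketch. It assumes that each classifier is available as a callable that maps the extracted feature values to a dictionary of class probabilities, and that the cancer type classifier reports a hypothetical class label "none" when no cancer is detected; neither convention is specified in this disclosure.

```python
def assess(features, type_clf, stage_clf=None, metastasis_clf=None, potential_clf=None):
    """Run the classifiers in order and collect their results into one assessment.

    Each classifier is a callable returning {class_label: probability}.
    The label "none" (no cancer detected) is a hypothetical convention.
    """
    def most_likely(probabilities):
        return max(probabilities, key=probabilities.get)

    assessment = {"cancer_type": most_likely(type_clf(features))}   # act 308
    if assessment["cancer_type"] == "none":
        return assessment          # no cancer detected, so acts 310 to 314 are skipped
    optional = {
        "cancer_stage": stage_clf,                # act 310
        "metastasized": metastasis_clf,           # act 312
        "metastatic_potential": potential_clf,    # act 314
    }
    for key, clf in optional.items():
        if clf is not None:                       # optional acts may be omitted
            assessment[key] = most_likely(clf(features))
    return assessment                             # combined at act 316


# Stand-in classifiers returning fixed probabilities, for illustration only.
type_clf = lambda f: {"none": 0.1, "breast": 0.7, "lung": 0.2}
stage_clf = lambda f: {"stage 1": 0.2, "stage 4": 0.8}
print(assess([0.3, 1.2], type_clf, stage_clf))
```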
The classification models that are used in acts 308 to 314 can be based on a machine learning model. For example, the machine learning model may be based on Partial Least Square Discriminant Analysis (PLSDA), Support Vector Machine Discriminant Analysis (SVMDA), or an Artificial Neural Network topology. Alternatively, other types of machine learning techniques might be used such as, but not limited to, Convolutional Neural Networks, and the Random Forest method, for example.
In an alternative embodiment, the classification models may be based on a deep learning model in which case the feature extraction step of act 306 does not need to be performed since the deep learning model can perform feature extraction while it determines a classification result.
However, it is possible that each of the classifiers used in acts 308 to 314 are based on different types of models (some examples of which were listed previously) that may be obtained using the training method 400 shown in
Classifiers that are based on PLSDA determine a probability of a sample belonging to each class along with a classification threshold. This is done by fitting a Gaussian distribution to all calculated class probabilities.
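One way such a Gaussian-based probability can be obtained is sketched below. The sketch assumes that the PLSDA model has already produced a predicted value for each training sample, fits one Gaussian to the predicted values of class members and another to those of all other samples, and assumes equal prior probabilities for the two groups. The function names and the numeric values are hypothetical.

```python
from statistics import NormalDist


def fit_class_model(member_scores, other_scores):
    """Fit one Gaussian to class members' predicted values and one to all other samples."""
    return NormalDist.from_samples(member_scores), NormalDist.from_samples(other_scores)


def class_probability(score, member_dist, other_dist):
    """Probability that a sample with this predicted value belongs to the class.

    Equal prior probabilities for the two groups are assumed.
    """
    p_member = member_dist.pdf(score)
    p_other = other_dist.pdf(score)
    return p_member / (p_member + p_other)


# Hypothetical predicted values for samples inside and outside the class.
member, other = fit_class_model([0.9, 1.1, 0.8, 1.0, 1.2], [0.1, -0.1, 0.2, 0.0, -0.2])
print(class_probability(0.95, member, other))
```

Under these assumptions, the classification threshold is the predicted value at which the probability equals 0.5.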
Classifiers that are based on SVMDA use a supervised method for classification, regression and outlier detection. The SVMDA is an effective tool in high dimensional spaces. Features are separated in different domains and a probability of each sample belonging to a certain class is calculated. These probabilities are used to estimate the most likely class for each sample (e.g. prediction probability).
Classifiers that use an Artificial Neural Network (ANN) employ a sequential model to build a solution to a problem layer by layer. A “hard sigmoid” activation function may be used for each computation node, with binary cross entropy as the loss, to predict the probability of origin of cancer, with an Adam optimizer algorithm, for example. The cancer origin may be used to determine the cancer type, e.g. lung cancer versus breast cancer. The Adam algorithm utilizes a method of adaptive learning rates and obtains learning rates for individual parameters to optimize the ANN. Alternatively, the ANN may also be useful, for example, in classifiers that are used to obtain information on the trajectory of cancer prognosis. The ANN may be implemented using Python's keras, numpy or sklearn packages, for example.
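The three ingredients named above can be written out in plain Python to show what each one computes. This is an illustrative sketch rather than the keras implementation: the hard sigmoid uses one common definition, clip(0.2x + 0.5, 0, 1), while some libraries use a slope of 1/6 instead, and the Adam update is shown for a single scalar parameter with the commonly used default hyperparameters.

```python
import math


def hard_sigmoid(x):
    """Piecewise-linear approximation of the sigmoid: clip(0.2 * x + 0.5, 0, 1)."""
    return max(0.0, min(1.0, 0.2 * x + 0.5))


def binary_cross_entropy(y_true, y_pred, eps=1e-7):
    """Loss for one sample; the prediction is clipped away from 0 and 1 so log stays finite."""
    p = min(max(y_pred, eps), 1.0 - eps)
    return -(y_true * math.log(p) + (1.0 - y_true) * math.log(1.0 - p))


class Adam:
    """Adam update for a single scalar parameter."""

    def __init__(self, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
        self.lr, self.beta1, self.beta2, self.eps = lr, beta1, beta2, eps
        self.m = 0.0   # running mean of the gradient
        self.v = 0.0   # running mean of the squared gradient
        self.t = 0     # number of steps taken

    def step(self, param, grad):
        self.t += 1
        self.m = self.beta1 * self.m + (1.0 - self.beta1) * grad
        self.v = self.beta2 * self.v + (1.0 - self.beta2) * grad * grad
        m_hat = self.m / (1.0 - self.beta1 ** self.t)   # bias-corrected estimates
        v_hat = self.v / (1.0 - self.beta2 ** self.t)
        # The step size adapts per parameter through the sqrt(v_hat) term.
        return param - self.lr * m_hat / (math.sqrt(v_hat) + self.eps)


print(hard_sigmoid(1.0), binary_cross_entropy(1.0, hard_sigmoid(1.0)))
```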
It should be noted that the method 300 can be performed for each biomarker that is used in performing the cancer assessment, which may increase the accuracy of the cancer assessment. Each biomarker will have its own set of classification models. For example, if two biomarkers are used, e.g. a first biomarker and a second biomarker, then there will be a first set of classifiers having a first cancer type classification model, a first cancer stage classification model, a first metastasis classification model and a first metastasis potential classification model that all correspond to the first biomarker and are used to provide a first set of detection results, and a second set of classifiers having a second cancer type classification model, a second cancer stage classification model, a second metastasis classification model and a second metastasis potential classification model that all correspond to the second biomarker and are used to provide a second set of detection results. The first and second detection results are then combined at act 316 to provide a more accurate cancer assessment. In at least one embodiment, any combination of the features for cell free DNA, immune biomarkers, exosomes and methylation markers may be used as the training dataset. The combination of biomarkers improves the accuracy of the detection process when compared to using a single biomarker. The first and second biomarkers for the classification model can be cell free DNA, immune biomarkers, exosomes or methylation markers.
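The disclosure does not specify how the detection results from different biomarkers are combined at act 316. Purely as an illustration, the sketch below assumes that each biomarker's classifier reports probabilities over the same class labels and combines them by simple averaging; the function name and the probability values are hypothetical.

```python
def combine_results(*per_biomarker):
    """Average the class probabilities reported for each biomarker (assumed rule)."""
    labels = per_biomarker[0].keys()
    n = len(per_biomarker)
    return {label: sum(result[label] for result in per_biomarker) / n for label in labels}


# Hypothetical cancer type probabilities from two biomarkers.
first = {"breast": 0.6, "lung": 0.4}    # e.g. from cell free DNA
second = {"breast": 0.8, "lung": 0.2}   # e.g. from exosomes
combined = combine_results(first, second)
print(max(combined, key=combined.get))
```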
Accordingly, in at least one embodiment, the cancer assessment at act 316 can be comprehensive depending on the types of biomarkers that are used for performing the method 300. For example, in at least one embodiment, the cancer assessment may include: cancer location, cancer stage, cancer characteristics such as but not limited to metastatic potential, therapy efficacy and the development of adoptive immunity to cancer. Any of the biomarkers described herein can be used to detect these cancer characteristics although the specificity and accuracy can change depending on the particular biomarker used to detect a given cancer characteristic. In at least one embodiment, the comprehensive cancer analysis may include a prognosis for a cancer patient.
Referring now to
Acts 402 and 404 are performed in a similar fashion as acts 302 and 304 of method 300. Act 406 involves using PC analysis or MCR analysis, which can be performed as described previously. However, only one of PC or MCR analysis is performed based on the type of model that is being generated/trained. This may be determined based on the nature of the underlying biomarker for which the model is used.
It should be noted that acts 402 to 406 are performed on several samples to obtain training data which is then used at act 408 for generating/training the classification model. For example, ten or more known cancer samples may be used for training.
Act 408 then involves using the training data from acts 402 to 406. For each sample, there are several features that are used for the training depending on the model structure. For example, when the model is based on PLSDA, the features can be provided by the MCR analysis in which case the first five MCR components can be used.
At act 410, an accuracy assessment of the generated/trained model may be performed, which might be done by running several classifiers in series by applying preprocessing and feature extraction to Raman spectra of known cancer samples. The classifier giving the highest classification probability is used to determine the type of cancer of the Raman spectrum. The Raman spectra associated with this type of cancer can be removed and the analysis can be repeated for classifying the remaining cancer types.
At act 412, if a number of classification models have been generated/trained then the most accurate one is selected as the optimal model. For example, the model may be further optimized by performing 2,000 iterations for feature extraction and using the feature values for the training dataset. Alternatively, a minimum of 10 spectra may be used in the training data for statistical purposes to ensure reproducibility. Each of these spectra was acquired three times and averaged.
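The selection of the most accurate model at act 412 can be sketched as follows. The sketch assumes that each candidate model is a callable returning a predicted class label for one set of feature values and that accuracy is measured as the fraction of known samples classified correctly; the minimum of 10 spectra is taken from the description above, while the function names, the candidate models and the data are hypothetical.

```python
def accuracy(model, features, labels):
    """Fraction of known samples for which the model predicts the correct label."""
    correct = sum(1 for x, y in zip(features, labels) if model(x) == y)
    return correct / len(labels)


def select_best(models, features, labels, min_spectra=10):
    """Return (name of most accurate model, {name: accuracy}) over the known samples."""
    if len(labels) < min_spectra:
        raise ValueError(f"at least {min_spectra} spectra are needed, got {len(labels)}")
    scores = {name: accuracy(model, features, labels) for name, model in models.items()}
    return max(scores, key=scores.get), scores


# Toy one-feature samples and two stand-in threshold classifiers.
features = [[0.1], [0.2], [0.3], [0.4], [0.45], [0.6], [0.7], [0.8], [0.9], [1.0]]
labels = ["lung"] * 5 + ["breast"] * 5
models = {
    "threshold_0.5": lambda x: "breast" if x[0] > 0.5 else "lung",
    "threshold_0.25": lambda x: "breast" if x[0] > 0.25 else "lung",
}
best, scores = select_best(models, features, labels)
print(best, scores)
```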
As shown in
Raman spectroscopic analysis was undertaken to investigate the phonon modes of these complex nanosensors. As shown in
The nanosensors were classified on the basis of small probes (mean size 7.69 nm), medium probes (mean size 10.7 nm) and large probes (mean size 16.85 nm).
As noted above, in at least one embodiment, cell-component biomarkers present in blood plasma include, but are not limited to, extracellular vesicles of circulating cancer initiating cells (CICs). Experimental data for extracellular vesicles of CICs is given below.
Specifically, SERS profiling of extracellular vesicles of CICs revealed unique features for different types of cancers, which were attributed to variation in the contents of the extracellular vesicles of CICs. Unique features derived from SERS profiles enabled prediction of localization of cancer. This approach demonstrated ultra-high sensitivity by trace level detection of as few as 10 extracellular vesicles in 10 μl of solution. The sensor described herein was also tested with samples derived from clinical specimens of breast cancer cells, lung cancer cells and colorectal cancer cells, and detection of as few as 10 extracellular vesicles per 10 microliters was achieved. Specifically, template matching was performed for detection. The signature Raman peaks of the template remained visible down to a dilution of 10 extracellular vesicles in 10 microliters. On further dilution, the peaks did not appear.
Raman signature peaks of the Raman spectra were used to identify breast cancer (this is used as a standard template for breast cancer diagnosis). This is a typical description of a biomolecule (for example, Bovine Serum Albumin (BSA)). The Raman assignments were obtained from the literature.
Early cancer diagnosis was demonstrated because the nanosensor amplifies the extremely weak signals of extracellular vesicles of CSCs in circulation. The nanosensor was synthesized with self-assembled, three-dimensional nano-probes that detect the biomolecular configuration of extremely low levels of extracellular vesicles associated with CSCs (e.g. 10 extracellular vesicles in 10 μl of solution of spiked samples). The nanoprobes of the nanosensor promote trapping of extracellular vesicles, have an increased surface area, and permit drainage of fluids, improving the interaction between the extracellular vesicles and the surface. Increased surface area enables improvement in adsorption of the analytes under investigation. By synthesis of the probes at nanoscale, the effective surface area increases, which in turn enables more molecules to adsorb, resulting in significant amplification of Raman signals. Signature profiles of the extracellular vesicles derived from lung, breast and colorectal CSCs were obtained by a surface-enhanced Raman scattering (SERS) technique. The profile templates were compared with 2 μL plasma samples obtained from lung cancer, breast cancer and colorectal cancer patients (12 each). The extracellular vesicles of CSCs in circulation were shown to act as an independent marker for accurate and early diagnosis. Predictions of the localization of cancer with very high sensitivity (92% to 100%) and specificity (94% to 100%) were made directly from the plasma samples without the need for isolation of extracellular vesicles. The extracellular vesicles of CSCs in circulation demonstrated the potential for early prognosis of cancer in real time. For instance, the method relies on direct detection of extracellular vesicles from plasma (e.g. performing measurements to obtain Raman spectra and then applying template matching to perform the detection), without the need for preprocessing/isolation.
In at least one embodiment, each test process took less than about 10 minutes after receiving the plasma samples. Predictions of the location of cancer simultaneously for multiple cancer types were made with plasma samples derived from cancer patients without any need for extracellular vesicles isolation.
Amplification of the Extracellular Vesicle Signatures with Nanosensor
Raman signals of the circulating extracellular vesicles derived from cancer cells and their CSC counterparts demonstrated substantially enhanced signal response. As per
As per
Limit of detection of the circulating extracellular vesicles was evaluated to demonstrate whether the technique provided ultra-sensitive detection. The experiment was started with 50 circulating extracellular vesicles in 10 μl of solution and the concentration was reduced to 5 circulating extracellular vesicles in 10 μl of solution. As per
Identification of Circulating Extracellular Vesicles from CSCs
SERS profiling of circulating extracellular vesicles of breast cancer cells as well as circulating extracellular vesicles derived from breast CSC was undertaken.
To visualize the variation in the contents of circulating extracellular vesicles, multivariate analyses (principal component analysis (PCA) and multivariate curve resolution (MCR) analysis) were undertaken. PCA is a technique that can be used to analyse data from multiple inter-correlated variables. As shown in
A similar trend was also observed with the circulating extracellular vesicles of lung cancer cells and circulating extracellular vesicles of lung CSCs.
The experimental results from circulating extracellular vesicles derived from cancer cells versus CSCs clearly demonstrated that the nanosomal cargo (i.e. the contents of the extracellular vesicles) was significantly different. Extracellular vesicles (e.g. exosomes) contain many biomolecules such as proteins, lipids, nucleic acids and other cellular components that may contain important information about parent cells. These biomolecules carried by exosomes are of importance in intercellular communication. These contents are called exosomal cargo. The cargo may vary based on physiological and pathological conditions, which is confirmed by this study. Our experiments clearly indicate that the Raman profiles of extracellular vesicles of cancer cells and CSCs show variation in the Raman bands representing nanosomal cargo.
Nanosensor Assisted Cancer Prognosis of the Localization of Cancer from Circulating Extracellular Vesicles of CSC
The performance of nanosensor assisted SERS profiling of the circulating extracellular vesicles derived from three types of CSCs (breast, lung and colorectal) was evaluated. As per
When the scatter plot of PC1 vs PC2 is plotted, and if it shows two distinct clusters then it means that two different types of extracellular vesicles were detected. Sometimes a three-dimensional scatter plot of PC1 vs PC2 vs PC3 is also plotted for better visualization.
The three-dimensional plot of PC1 vs PC2 vs PC3 demonstrated three separate clusters. Signature peak intensities were plotted in a heat map to visualize variation in the nanosomal contents. Circulating extracellular vesicles of breast CSC demonstrated much higher expression of almost all biomolecules and low expression of a couple of proteins and lipids as compared to the circulating extracellular vesicles of the other two types of CSCs. Circulating extracellular vesicles of lung CSC demonstrated very high expression of proteins and lipids and very low expression of proline, hydroxyproline and proteins. Circulating extracellular vesicles of colorectal CSC demonstrated high expression of proline, hydroxyproline and proteins and moderate expression of other biomolecules.
Prediction of the location of the cancer was attempted with nanosensor-based analysis of circulating extracellular vesicles of CSCs. For this purpose, a hierarchical model was developed. First, MCR analysis was undertaken.
As shown in
Specifically, the model demonstrated 100% sensitivity and 98.46% specificity for lung cancer cross validation results and 100% sensitivity and 100% specificity for breast and colorectal cancer cross validation. For prediction with patient plasma, the model was applied in two stages, as follows.
First, lung CSC extracellular vesicles were treated as one class and remaining breast and colorectal cancer extracellular vesicles were treated as one class. Using PLSDA, the classification showed 83.33% sensitivity and 91.66% specificity of classification to predict lung cancer tissue of origin.
In the second stage, breast cancer and colorectal cancer patient plasma were classified. Prediction of tissue of origin as breast cancer classification showed 90.90% sensitivity and 100% specificity. Prediction of tissue of origin as colorectal cancer showed 100% sensitivity and 90.91% specificity.
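The two-stage scheme described above (lung versus the rest, then breast versus colorectal) can be sketched in Python as follows. This is only an illustration of the hierarchy: a nearest-centroid rule is swapped in for the PLSDA classifier named in the text, and the two-dimensional feature vectors and function names are hypothetical.

```python
import math


def centroid(rows):
    """Mean feature vector of a list of samples."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def nearest(sample, centroids):
    """Label of the centroid closest to the sample (Euclidean distance)."""
    return min(centroids, key=lambda label: math.dist(sample, centroids[label]))


def train_hierarchical(training):
    """Build stage 1 (lung vs other) and stage 2 (breast vs colorectal) models."""
    stage1 = {
        "lung": centroid(training["lung"]),
        "other": centroid(training["breast"] + training["colorectal"]),
    }
    stage2 = {
        "breast": centroid(training["breast"]),
        "colorectal": centroid(training["colorectal"]),
    }
    return stage1, stage2


def predict(sample, model):
    """Apply stage 1 first; only non-lung samples reach stage 2."""
    stage1, stage2 = model
    if nearest(sample, stage1) == "lung":
        return "lung"
    return nearest(sample, stage2)


# Hypothetical two-feature training vectors (not data from the study).
training = {
    "lung": [[0.0, 0.0], [0.0, 1.0]],
    "breast": [[10.0, 0.0], [10.0, 1.0]],
    "colorectal": [[10.0, 10.0], [11.0, 10.0]],
}
model = train_hierarchical(training)
predictions = [predict(s, model) for s in ([0.5, 0.2], [9.5, 0.5], [10.0, 9.0])]
print(predictions)
```

Splitting the decision into two binary stages keeps each classifier simple, at the cost that a stage 1 error cannot be corrected in stage 2.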
Applicability of Plasma without Enrichment or Extracellular Vesicles Isolation for Rapid Diagnosis with the Nanosensor Assisted Prognosis
As shown in
For visualization of signature Raman assignments, a heat map of signature bands was plotted. The similarity of the Raman profiles was evident from the spectra as well as the heat maps. MCR was undertaken to solve the problem of mixtures, since MCR analysis resolves a mixture into its pure components. First, five MCR components were compared for similarities and variations by applying an artificial neural network (ANN)-based model. ANN is one of a variety of machine learning approaches. An ANN is a brain-inspired system intended to learn patterns (e.g. spectral data patterns) that are too complex to extract manually and teach a machine to recognize. Layer by layer, the ANN extracts features of spectra to identify classifying parameters. It should be understood that the feature extraction described herein is not limited to ANN. Rather, any standard deep learning/machine learning methodology/algorithm can be alternatively applied for feature extraction.
Plasma samples from breast, lung and colorectal cancer patients were then tested with the cancer diagnosis-chip for localization of cancer. This was achieved by building a hierarchical support vector machine discriminant analysis (SVMDA) and partial least squares discriminant analysis (PLSDA) model with a training algorithm designed based on Raman spectral features of the extracellular vesicles of CSC. However, again, it should be understood that the feature extraction is not limited to SVMDA, since any standard deep learning/machine learning methodology/algorithm can be alternatively applied for feature extraction.
Specifically, the SVMDA was applied by calling an SVMDA module (e.g. function) in a Matlab-based software. The model details are as follows: SVM type: C-SVC; and SVM kernel type: radial basis function.
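For reference, the radial basis function kernel named above is commonly defined as K(x, y) = exp(-gamma * ||x - y||^2). The short Python sketch below evaluates this kernel for two feature vectors. The gamma values and the example vectors are hypothetical, since the text does not state the kernel parameters used.

```python
import math


def rbf_kernel(x, y, gamma=0.1):
    """Radial basis function kernel: exp(-gamma * squared Euclidean distance)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


# Identical vectors give the maximum kernel value of 1.
same = rbf_kernel([1.0, 2.0], [1.0, 2.0])
# Vectors at squared distance 2 with gamma = 0.5 give exp(-1).
apart = rbf_kernel([0.0, 0.0], [1.0, 1.0], gamma=0.5)
print(same, apart)
```

The kernel value decays from 1 towards 0 as two spectra become less alike, which is what allows a C-SVC model to draw non-linear class boundaries.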
SVM demonstrated specificity of 95.8% for detection of lung cancer and 83.3% sensitivity.
In a second stage, breast cancer and colorectal cancer patient plasma was classified. Prediction of tissue of origin as breast cancer classification showed 90.90% sensitivity and 100% specificity. Prediction of tissue of origin as colorectal cancer showed 100% sensitivity and 90.91% specificity.
Prediction of Localization of Tumor with Clinical Plasma Samples
As shown in
The following biomarkers may be used to screen for cancer onset before any symptoms occur:
By combining the epigenetic profile of cfDNA and the epigenetic and proteomic profiling of exosomes, it is possible to accurately screen the onset of cancer in a healthy population.
The methodology for determining cancer onset, in accordance with the teachings herein, involves establishing a spectroscopic template for differential methylation percentage and validation using the cell culture derived DNA. The template may be further applied to DNA isolated from the solid tumor and the corresponding ctDNA, which will confirm that ctDNA methylation is correlated. The healthy plasma is then spiked with cell culture derived DNA to analyse the limit of detection in a complex sample. The spectroscopic data may then be subjected to multivariate analysis to investigate the diagnostic sensitivity, specificity, and analytical limit of detection of the proposed method. Finally, the plasma samples may be analysed using the multivariate analysis components for the diagnosis, prognosis and prediction of cancer metastasis using ctDNA methylation patterns, cfDNA concentration and the exosome concentration.
The application of three-dimensional nanosensors amplifies the signals associated with circulating tumor DNA (ctDNA) methylation, thereby enabling direct detection from plasma. The main problem associated with utilizing ctDNA in a clinical setting is its presence in low concentration. The three-dimensional nanosensors possess single-molecule sensitivity, which is ideal for the amplification of differentially methylated sites in ctDNA (see
The following biomarkers can identify the probability of a primary tumor to metastasize to distant organs and predict the organ of metastasis. Additionally, these markers can be utilized to detect the presence of metastasis as low as 1,000 cells and pinpoint the origin of metastasis with a high specificity.
In accordance with the teachings herein, a method is provided that involves establishing a spectroscopic template for differential methylation percentage and validation using cell culture derived DNA. The template may further be applied to DNA isolated from solid tumor and the corresponding ctDNA, which may confirm that ctDNA methylation is correlated with tumor stage, tumor size and metastasis. The spectroscopic data may then be subjected to multivariate analysis to investigate the diagnostic sensitivity, specificity, analytical limit of detection of the proposed method. Finally, the plasma samples may be analyzed using the multivariate analysis components for the diagnosis, prognosis and prediction of cancer metastasis using ctDNA methylation patterns.
One hallmark of cancer is aberrant change in the DNA structure. DNA damage is prominently correlated with carcinogenesis. The damage to DNA is associated with reactive oxygen species (ROS), following which DNA conformational changes occur. Such changes at the nanometer scale are extremely difficult to diagnose without artificial amplification of DNA. However, artificial amplification can introduce spectral artefacts. Therefore, quantum superstructure assisted profiling of the DNA structural changes in cancer was undertaken without the need for DNA amplification. Raman profiling of these changes will lead to timely identification of the disease. 10 μL of DNA solution was added directly to the quantum probes and Raman spectra were obtained from DNA of breast, lung and colorectal cancer cells as well as from DNA of their CSCs.
Raman bands in the range of 600 cm-1 to 700 cm-1 are assigned to the sugar-base conformation dependent ribose vibration of guanine. The orientation-sensitive band at 685 cm-1 was shifted to a lower wavenumber for DNA of cancer cells as well as CSCs. The peak shift was attributed to the disruption in base-stacking resulting in the deformation around lesion sites in cancer. The bands in the region of 800 cm-1 to 1100 cm-1 demonstrate sensitivity to the secondary DNA structure as well as the geometry of the backbone. The band assigned to the phosphodiester backbone and deoxyribose, which is located at 890 cm-1, was shifted to 875 cm-1 for breast CSC and 870 cm-1 for lung CSC. For cancer cells, it showed a slight shift to 875 cm-1 for colorectal cancer and 884 cm-1 for lung cancer, and did not show any shift for breast cancer. The shift was attributed to scissions of DNA; more scission was observed for CSCs than for their cancer cell counterparts. The intensity of this peak was much higher for CSC DNA as compared to cancer cell DNA. This was attributed to the unstacking of DNA bases in CSCs. This was also supported by the substantially increased intensity for the band at 789 cm-1 (C, T and backbone). This could be because of the un-pairing of the paired C, T in CSCs. The opposite trend was observed for the band at 1333 cm-1 (assigned to A, G). The intensity of this band was increased substantially for cancer cell DNA as compared to CSC DNA, demonstrating structural damage in cancer cells. The band at 1085 cm-1 was observed to be shifted towards a lower wavenumber for all types of DNA, indicating ROS-induced backbone damage. This damage was much higher in cancer cells as compared to CSCs. The ratio of the peak intensity at 1387 cm-1 to that at 1334 cm-1 can provide information about DNA aggregation. It was observed that DNA from CSCs showed a much higher value of this ratio as compared to cancer cells. This is an indication of the perturbation of the local environment around purine bases in CSCs.
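The 1387 cm-1 to 1334 cm-1 intensity ratio discussed above can be computed from a measured spectrum as in the following Python sketch. The wavenumber grid, the intensity values and the function names are hypothetical and serve only to show the arithmetic; the intensity is read at the sampled point nearest to each target band.

```python
def intensity_at(wavenumbers, intensities, target):
    """Intensity at the sampled wavenumber closest to the target band."""
    index = min(range(len(wavenumbers)), key=lambda i: abs(wavenumbers[i] - target))
    return intensities[index]


def peak_ratio(wavenumbers, intensities, numerator=1387.0, denominator=1334.0):
    """Ratio of the band intensities I(numerator) / I(denominator)."""
    return (intensity_at(wavenumbers, intensities, numerator)
            / intensity_at(wavenumbers, intensities, denominator))


# Hypothetical spectra sampled at six wavenumbers (cm-1).
wavenumbers = [1330.0, 1334.0, 1340.0, 1385.0, 1387.0, 1390.0]
csc_dna = [0.30, 0.40, 0.35, 0.70, 0.80, 0.75]
cancer_cell_dna = [0.50, 0.80, 0.60, 0.35, 0.40, 0.38]

csc_ratio = peak_ratio(wavenumbers, csc_dna)
cancer_ratio = peak_ratio(wavenumbers, cancer_cell_dna)
print(csc_ratio, cancer_ratio)
```

In this made-up example the CSC spectrum gives the larger ratio, mirroring the direction of the trend reported in the text.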
Comprehensive MCSC Analysis of CSC DNA and Cancer Cell DNA Assisted with Quantum Superstructures
For identification of the type of cancer, Raman profiles of cancer stem cell DNA were analyzed. Existing technologies focused on analysis of genomic mutations alone have failed to identify the tissue of origin, because many common gene mutations are shared amongst many cancer types. A few studies have proposed assays combining genetic alteration analysis with protein analysis. But such studies demonstrated very low specificity in identification of cancer location for hard-to-detect cancers like breast, colorectal and lung cancer. In this study, a methodology for localization of cancer has been identified by combining MCSC from cancer cell DNA along with CSC DNA. As shown
Breast Cancer—As per
Colorectal Cancer—Colorectal CSC demonstrated higher intensity of the peaks at 600 cm-1 assigned to nucleotide conformation, 658 cm-1 assigned to G+T ring breathing and G backbone in RNA, 670 cm-1 assigned to ring breathing mode G, 723 cm-1 assigned to DNA, 792 cm-1 assigned to ring breathing mode cytosine, uracil, 820 cm-1 assigned to O—P—O stretching of DNA, 914 cm-1 assigned to RNA, 974 cm-1 assigned to RNA, 1095 cm-1 assigned to nucleic acid, 1150 cm-1 assigned to cytosine, guanine and 1342 cm-1 assigned to polynucleotide chain (DNA purine bases), whereas colorectal cancer DNA showed an intense peak at 875 cm-1 assigned to phosphodiester, deoxyribose. PCA demonstrated distinct clustering of cancer DNA and CSC DNA.
Lung Cancer—The analysis of lung cancer cell DNA and lung CSC DNA demonstrated variation in the signature Raman bands in CSC DNA. Lung cancer DNA demonstrated higher intensity at 600 cm-1 assigned to nucleotide conformation, 1060 cm-1 assigned to PO2− stretch, 1252 cm-1 assigned to guanine, cytosine, 1282 cm-1 assigned to T, A, 1320 cm-1 assigned to CH3CH2 wagging modes present in DNA purine bases, 1451 cm-1 assigned to deoxyribose and 1490 cm-1 assigned to DNA, whereas lung CSC showed higher intensity at 687 cm-1 assigned to ring breathing mode G, 733 cm-1 assigned to DNA, 787 cm-1 assigned to ring breathing mode cytosine, uracil, 818 cm-1 assigned to O—P—O stretching of DNA, and 1180 cm-1 assigned to cytosine, guanine. Higher peak intensity was attributed to larger amounts of the molecules. The principal component analysis (PCA) showed two clusters of cancer cell DNA and CSC DNA on opposite sides of the axis. This was attributed to the negative correlation between cancer cell DNA and CSC DNA.
Quantum superstructure assisted MCSC was instrumental in Raman profiling of DNA derived from cancer cells and CSCs, and demonstrated the ability of the superstructures for ultrasensitive detection. Quantum superstructures were able to capture minute spectral differences between the DNA derived from different cell types of the same cancer. This ultra-sensitive ability of the quantum superstructures was attributed to their molecular level detection capacity. CSCs are an aggressive, rare subset of the bulk tumor with tumor initiating capabilities. Thus, applying MCSC of CSC DNA to patient plasma has potential for prognosis of distant tissue invasion and tumor relapse as well as treatment monitoring.
Similarity Analysis Between DNA Derived from Cancer Cells, CSC and Tumor DNA
The cancer stem cell (CSC) state is a functional state of a tumor responsible for self-renewal, proliferation and apoptosis resistance. As CSCs are at the apex of the tumor hierarchy, information about similarity between the DNA of CSCs and tumor DNA will provide information on many tumor properties for prediction of therapeutic response. Due to the genetic heterogeneity in cancer, different cancer types will demonstrate variation in CSC profiles. For this purpose, MCSC of DNA derived from CSCs versus DNA derived from bulk tumor cells was undertaken. Similarly, MCSC of non-CSC DNA versus tumor DNA was also undertaken. Analysis of the similarities between the tumor DNA and the DNA derived from CSCs should be taken into account in the development of new therapeutic approaches.
As shown in
MCR components were used for generation of heat maps and bi-plots. The heat maps provided information on the variances associated with each MCR component. From the heat maps, it is evident that the CSC DNA demonstrated less variance as compared to cancer cell DNA versus tumor DNA. Similarly for colorectal cancer, CSC DNA showed significant similarity to tumor DNA. For lung cancer, cancer cell DNA and CSC DNA did not demonstrate any significant variation when compared to tumor DNA.
Similarity Analysis Between DNA Derived from CSC and Tumor DNA for Early Stage, Intermediate Stage and Advanced Stage Cancers
Liquid profiling by MCSC of cancer cell DNA, CSC DNA and tumor DNA demonstrated that CSC DNA showed significantly lower variance as compared to cancer cell DNA. This led us to undertake similarity analysis between CSC DNA and tumor DNA to investigate the variance from a cancer development point of view. A tumor grade wise comparison was then undertaken. 10 μL of DNA derived from patient blood plasma and from patient tumor was directly dropped on the quantum superstructures and Raman spectra were captured. A pair-plot of early stage tumor DNA (grade I), intermediate stage tumor DNA (grade II&III) and advanced stage tumor DNA (grade IV) was generated using multivariate curve resolution analysis. As per
The correlation between tumor DNA and CSC DNA has potential to predict the cancer stages and therefore to support survival analysis for patients. As tumor stage can provide information on disease prognosis, analysis with CSC DNA is important. The presence of more CSC DNA will lead to poor prognosis; hence, detection of higher similarity of CSC DNA to tumor DNA can potentially provide accurate prognosis. Here, the potential of quantum superstructure-based MCSC for biomolecular similarity analysis of CSC DNA and tumor DNA for cancer prognosis was demonstrated.
Applicability of Patient Blood Plasma without DNA Isolation for Cancer Diagnosis
Evaluation of the similarities between the cell free DNA (cfDNA) and tumor DNA was undertaken. This analysis was undertaken to gain insights on the concordance in the liquid profiling by MCSC of two types of bioactive substances. Quantum superstructure-based Raman profiling demonstrated substantial similarities between cfDNA and tumor DNA, enabling the use of cfDNA as a detection marker. It was hypothesized that the cfDNA from plasma has the ability to demonstrate features of tumor DNA. To test the hypothesis, 10 μL of DNA solution derived from patient blood plasma and patient tumor was directly dropped on the quantum superstructures and Raman spectra were obtained. Multivariate curve resolution alternating least squares (MCR-ALS) analysis was undertaken. As per
Breast cancer MCR-ALS analysis demonstrated similar composition of MCR component scores for cfDNA and tumor DNA. Similar scores demonstrated similarity between the components. On analysis of the loadings, it was evident that comp 1 showed dominance of A+G+T whereas comp 2 had a majority of the peaks assigned to T+G+C. Comp 8 mostly showed the peaks assigned to A and comp 10 had dominance of the peaks assigned to G. For colorectal cfDNA and tumor DNA, evidence of the single clustering of all cfDNA as well as tumor DNA samples confirmed similar features. The MCR loadings for comp 1 showed dominance of A+T, comp 5 had more peaks assigned to A+G, comp 8 showed mostly peaks for A whereas comp 10 had more peaks assigned to T. Lung cancer MCR components were divided into comp 1 (adenine—A), comp 3 (cytosine+thymine—C+T), comp 9 (cytosine+guanine—C+G) and comp 10 (adenine+thymine—A+T).
Covariance matrix was plotted. As shown in
Diagnosis of Cancer and Non-Cancer Directly from Plasma
Quantum superstructure-based MCSC of DNA extracted from cancer cells (breast, lung and colorectal cancer) and their respective CSCs, along with DNA from a non-cancer fibroblast (NIH3T3) cell line, was used as training data. Support vector machine discriminant analysis (SVMDA) was undertaken for this purpose. SVMDA is a non-linear method which undertakes calibration and application of a support vector machine classification model. Patient plasma was used for validation of the model. Classification of raw plasma samples was achieved without the need for isolation of cell free DNA, with 97% sensitivity and 83% specificity. The misclassification error, i.e. the proportion of cancer cases incorrectly classified as non-cancer, was only 5%. The F1 score, the harmonic mean of precision and recall, is a measure of test accuracy. An F1 score of 0.96 was achieved for this analysis, showing very high accuracy of diagnosis. The precision of the model, which is the proportion of positive calls that are true positives, was 0.97. The Matthews correlation coefficient for the cancer and non-cancer classification with patient blood plasma was 0.732.
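The performance figures quoted above all derive from the four cells of a binary confusion matrix. The following Python sketch shows the standard definitions of sensitivity, specificity, precision, F1 score and the Matthews correlation coefficient. The counts used in the example are hypothetical and are not the counts from the patient cohort, which the text does not report.

```python
import math


def binary_metrics(tp, fn, tn, fp):
    """Standard diagnostic metrics from true/false positive/negative counts."""
    sensitivity = tp / (tp + fn)            # recall on cancer cases
    specificity = tn / (tn + fp)            # recall on non-cancer cases
    precision = tp / (tp + fp)              # positive predictive value
    f1 = 2 * tp / (2 * tp + fp + fn)        # harmonic mean of precision and recall
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
        "mcc": mcc,
    }


# Hypothetical counts: 10 cancer and 10 non-cancer plasma samples.
metrics = binary_metrics(tp=8, fn=2, tn=9, fp=1)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Unlike the F1 score, the Matthews correlation coefficient uses all four cells, so it is less flattering when the non-cancer class is small or poorly classified.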
It was observed that the sensitivity of diagnosis improved substantially on including MCSC of both CSC DNA as well as cancer cell DNA. This combination of both DNAs enabled us to tackle the heterogeneity of cfDNA. Patient's plasma data was interpreted based on cell culture DNA. With this approach, insufficient patient data for training the SVMDA can be overcome by easily available cell culture DNA eliminating the limitation of accuracy and reliability.
Identification of Tissue of Origin Directly from Blood Plasma
Identification of tissue of origin is of crucial importance in decision making on site-specific therapy. Here, the ability of the quantum superstructure-assisted liquid profiling was tested by MCSC for identification of the tissue of origin. For undertaking tissue specific molecular Raman profiling, CSC DNA was employed in addition to tumor DNA to locate the original location of cancer. By generating a data bank of Raman profiles of various types of tumor DNA and CSC DNA, there is a potential to design a rapid testing platform which can locate the cancer with very high sensitivity and specificity.
As shown in
Quantum superstructure assisted liquid profiling by MCSC of raw patient blood plasma, achieved identification of tissue of origin of hard to detect cancers. Existing limitation of current cfDNA-based technologies to identify the tissue of origin was eliminated in this study by undertaking ultra-sensitive comprehensive profiling of the DNA structure. This was possible due to the ability of quantum superstructures to extract information on cancer-specific cfDNA directly from patient blood plasma which preserved the integrity of DNA structure resulting in very high sensitivity and specificity.
Early Detection of Brain Cancer with Exosomes in Patient Serum
Ability of the Nanosensor to Predict the Malignancy of Brain Tumor with a Small Amount of Patient Serum
Preliminary Results for Methylation of ctDNA
Turning now to
DNA methylation is a crucial diagnostic marker for cancer metastasis. Studies have shown that methylation levels in DNA are positively correlated with cancer metastatic potential and are highly tissue specific. Here, the quantum hyperstructures for the detection of methylation levels in DNA have been applied.
Further, the presence of the PO2 peak also provides information on any structural damage to DNA. Based on the SERS spectral data, the spectral features which vary the most with varying methylation percentage were monitored. It was concluded from
Next, the quantum hyperstructures were applied to quantify DNA global methylation. Here, a pre-calibrated DNA was used with differential methylation percentage to establish a calibration curve between SERS intensity and methylation percentage (
Principal component analysis was performed to differentiate between the different methylated samples. From the established PCA model, the seven different methylated groups are distinguishable from each other with high specificity and sensitivity. Quantitative analysis of DNA methylation is essential for the diagnosis of cancer metastasis. There exists a linear correlation between the normalized SERS intensity of the DNA methylation markers and methylation %. The correlation equation was calculated as Y=−0.002050*X+0.7661 with an R2 value of 0.7025. The correlation equation is further used to quantify DNA methylation in genomic DNA isolated from preclinical cancer models. These results showcase a quantitative correlation between global DNA methylation levels and the spectral features identified as methylation markers in this study.
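The calibration equation above can be applied in both directions, as in the following Python sketch. The slope and intercept are taken from the text; the function names and the example intensity are hypothetical. Given the reported R2 value of 0.7025, an inverted estimate should be read as approximate.

```python
# Calibration constants reported in the text: Y = -0.002050 * X + 0.7661,
# where X is methylation % and Y is normalized SERS intensity.
SLOPE = -0.002050
INTERCEPT = 0.7661


def predicted_intensity(methylation_percent):
    """Normalized SERS intensity expected for a given methylation percentage."""
    return SLOPE * methylation_percent + INTERCEPT


def estimated_methylation(normalized_intensity):
    """Methylation percentage obtained by inverting the calibration line."""
    return (normalized_intensity - INTERCEPT) / SLOPE


# Hypothetical measurement: a normalized intensity of 0.6636.
intensity_at_50 = predicted_intensity(50.0)
estimate = estimated_methylation(0.6636)
print(round(intensity_at_50, 4), round(estimate, 2))
```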
Studies have shown that DNA methylation patterns in the primary tumor contain critical information on the metastatic potential, and hence can be utilized to determine cancer progression. It can be observed from
It should be noted that colorectal and breast cancer cells are metastatic whereas lung cancer cells are non-metastatic. The content of metastatic and non-metastatic proteins shows variation in the content of the extracellular vesicles of each type of cancer and its CSC counterpart. This variation is cancer type specific. By in-depth analysis of these variations, detection of metastasis can be undertaken.
The 3D architecture arrangement of quantum hyperstructures comprises metallic Ni (cube) decorated with semiconductor NiO (spherical), which contributes to multiple SERS enhancement mechanisms. The primary enhancement arises from the electromagnetic enhancement due to the cube-shaped metallic Nickel's sharp corners.
In addition to engineering hotspots through morphological modifications, there is also a need for a secondary boosting mechanism that is uniform and reproducible, which is provided by the NiO on the surface of the metallic nickel. Here, the presence of semiconductor NiO enables effective charge transfer by facilitating adsorption between the methylated DNA and the NiO surface. The XPS analysis revealed the presence of —OH functional groups on the surface of the sensor, which interact with the methylated DNA, thereby enabling methylation specific adsorption. Further, the quantum size of the probe provides a high surface area for adsorption.
Modulating the relationship between SERS and photoluminescence is critically important for enhancing SERS signals. The quantum hyperstructures specifically trap the methylated DNA bases through molecular adsorption. The molecular adsorption enables charge transfer between DNA molecule and the quantum hyperstructures system. It can be observed from
Collectively, the quantum hyperstructures amplify the methylation-specific signatures through multiple mechanisms: i) plasmonic enhancement due to the morphological modifications of Ni; and ii) molecular adsorption because of the surface functional groups on the NiO. In combination with the effective quenching of excitons, these mechanisms enable a 1000-fold amplification of methylation-related signals.
A tumor comprises a highly heterogeneous population of cells. The intratumoral heterogeneity is contributed by the small subpopulation of cells known as cancer stem cells (CSCs). CSCs are shown to undergo asymmetric cell division, further increasing the heterogeneity of a tumor and leading to tumor metastasis. Further, CSCs have also been shown to possess high levels of global hypermethylation compared to non-CSCs. Recent studies have shown that epigenetic changes are critical events in cancer development and that epigenetic changes in cancer cells increase the predisposition to metastatic transformation and formation of CSCs. Additionally, DNA methylation patterns in CSCs revealed the dynamics of CSC expansion and the mechanism of tumor evolution in colorectal cancer. Hence, to design a methylation-based diagnostic test for cancer metastasis with high accuracy and specificity, it is essential to consider methylation patterns of both CSCs and cancer cells, since together they present a holistic representation of a tumor.
Aberrant DNA methylation in CSCs plays a primary role in cancer progression and cancer growth, and shapes the intratumor heterogeneity. Studies have shown that DNA in CSCs is significantly hypermethylated when compared to bulk tumor cells. The DNA of cancer cells and CSCs derived from preclinical models of breast cancer, lung cancer, and colorectal cancer were compared. Correlation analysis shows the clustering of CSC and cancer DNA in distinctly different clusters without overlap. The correlation analysis between the global methylation levels of CSCs and cancer shows a similar negative correlation across different cancers.
Further, the correlation heatmap (
Next, the DNA methylation markers of cancer and CSC DNA were compared with DNA isolated from tumor biopsy to establish the similarities between DNA methylation patterns in CSC and cancer cells in preclinical models and DNA from tissue biopsy. The DNA from tumor biopsy (tumor DNA) includes the DNA features representing intratumor heterogeneity. The DNA methylation markers in preclinical models can be extended for clinical application if there is a highly significant similarity between DNA from preclinical models and tumor biopsy DNA. Here, a Euclidean distance analysis was applied to showcase the similarity between in vitro DNA and tumor biopsy DNA (
Further, the distribution of tumor DNA in both classes is correlated to the presence of CSCs. A similar trend is observed in colorectal cancer, where in-class similarity was as high as 92.7%, whereas between-class similarity was 7.28%. However, in lung cancer, the CSCs form a separate class, and the tumor DNA and the cancer cell DNA fall under the same class with a 93.73% similarity. Further, the lung CSCs show only 6.27% similarity to tumor DNA. This low similarity could be attributed to the predominance of primary non-metastatic lung cancer.
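The text does not state how Euclidean distances were converted into the similarity percentages above. As one hypothetical reading, the Python sketch below assigns each tumor DNA sample to the nearest class centroid by Euclidean distance and reports the share of samples falling in each class. The function names and the two-feature vectors are assumptions for illustration, not the method or data of the study.

```python
import math


def mean_vector(rows):
    """Centroid (mean feature vector) of a list of samples."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def class_shares(tumor_rows, class_rows):
    """Percentage of tumor samples whose nearest class centroid is each class."""
    centroids = {name: mean_vector(rows) for name, rows in class_rows.items()}
    counts = dict.fromkeys(centroids, 0)
    for row in tumor_rows:
        nearest = min(centroids, key=lambda name: math.dist(row, centroids[name]))
        counts[nearest] += 1
    return {name: 100.0 * count / len(tumor_rows) for name, count in counts.items()}


# Hypothetical methylation-marker features for in vitro DNA classes.
class_rows = {
    "CSC DNA": [[1.0, 0.2], [0.9, 0.3]],
    "cancer cell DNA": [[0.2, 1.0], [0.3, 0.9]],
}
# Hypothetical tumor biopsy DNA samples.
tumor_rows = [[0.95, 0.25], [0.8, 0.3], [0.9, 0.1], [0.25, 0.95]]

shares = class_shares(tumor_rows, class_rows)
print(shares)
```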
Additionally, PCA analysis showcased in
Next, the possibility of profiling DNA methylation was investigated directly from plasma. Correlation analysis was performed on the methylation markers in plasma. It can be observed in
A t-SNE (t-distributed stochastic neighbor embedding) analysis was performed to investigate the similarity between the methylation markers from the subsets of samples. Like PCA, the t-SNE algorithm performs a dimensionality reduction to visualize the data in a 3-dimensional space. It can be observed from
DNA methylation is a critical epigenetic mechanism involved in tumor-specific gene regulation such as gene silencing and transcriptional activation. Additionally, when compared to ctDNA mutations and rearrangements, the methylation status of ctDNA has multiple advantages for application in cancer diagnosis. First, ctDNA methylation is detectable at early stages of carcinogenesis; second, the methylation pattern is highly conserved and helps in determining the tissue of origin of the malignancy.
Here, a global DNA methylation of circulating tumor DNA was applied for cancer diagnosis. It can be observed from PCA analysis in
Next, a random forest classifier was applied for cancer diagnosis using both in vitro cancer DNA and CSC DNA for training. The trained classifier was then applied to plasma samples from an independent cohort of cancer patients for validation. First, only in vitro cancer DNA and DNA from normal epithelial cells were employed for training the classifier. The validation with plasma samples using the cancer DNA trained classifier yielded an accuracy of 85.2% (
Studies have shown that global DNA methylation patterns are highly tissue-specific, and hence can serve as a marker for diagnosis of cancer metastasis. Based on the quantitative correlation between SERS methylation markers and methylation percentage of DNA and the high sensitivity of the sensor, the sensor was applied to distinguish cancer tissue of origin in preclinical models of metastatic and primary cancer. The fundamental diagnostic component of cancer metastasis is to determine the tissue of origin. Additionally, determining the cancer origin combined with site-specific targeting of cancer metastasis is essential to improve treatment outcomes. Because DNA methylation is highly tissue-specific, it can be applied to determine cancer tissue of origin and cancer localization. Further, the tissue-specific methylation patterns are conserved, even though there are cancer-related epigenetic alterations. Here, the tissue-specific methylation and cancer-related epigenetic alterations were combined to classify tumor tissue of origin.
The genomic DNA isolated from preclinical cancer models of metastatic colorectal cancer (COLO-205), metastatic breast cancer (MDA-MB-231) and non-metastatic lung cancer (H69-AR) was used to determine whether the DNA methylation markers can be applied to determine the metastatic status of cancer. The PCA analysis clearly distinguishes cancerous DNA from different tissues of origin (
A random forest algorithm was applied to resolve the tissue of origin of the cancer samples using the DNA methylation markers from in vitro cancer DNA, CSC DNA and tumor DNA. The trained model was then validated by applying it to plasma. It can be observed that the classification accuracy was 100% and the ROC curve generated determined the specificity and sensitivity to be 100% for preclinical models of cancer. However, when the CSC methylation markers were added, the classification accuracy was reduced to 83.9%. Nevertheless, the specificity of classification was maintained at approximately 90%. The specificity for lung cancer, colorectal cancer and breast cancer was 93%, 90% and 89%, respectively. The reduction in accuracy for the patient cohort is predominantly due to cancer progression, since the cfDNA also contains information on the tissue of cancer progression. The direct detection of tissue of origin from plasma yielded a classification accuracy of 84.4%. The specificity for lung cancer, colorectal cancer and breast cancer was 95.5%, 90% and 95%, respectively.
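Per-class specificity in a three-class problem such as the one above is usually computed one-vs-rest from the multiclass confusion matrix. The Python sketch below shows that calculation together with the overall accuracy. The confusion matrix is hypothetical, since the text reports only the resulting percentages, and the function name is an assumption.

```python
def one_vs_rest_metrics(confusion, labels):
    """Overall accuracy plus per-class sensitivity and specificity.

    confusion[i][j] is the number of samples of true class i predicted as class j.
    """
    total = sum(sum(row) for row in confusion)
    per_class = {}
    for k, label in enumerate(labels):
        tp = confusion[k][k]
        fn = sum(confusion[k]) - tp                  # true class k, predicted other
        fp = sum(row[k] for row in confusion) - tp   # other class, predicted k
        tn = total - tp - fn - fp
        per_class[label] = {
            "sensitivity": tp / (tp + fn),
            "specificity": tn / (tn + fp),
        }
    accuracy = sum(confusion[k][k] for k in range(len(labels))) / total
    return accuracy, per_class


# Hypothetical confusion matrix: 10 plasma samples per tissue of origin.
labels = ["lung", "colorectal", "breast"]
confusion = [
    [9, 1, 0],
    [0, 8, 2],
    [1, 0, 9],
]
accuracy, per_class = one_vs_rest_metrics(confusion, labels)
print(round(accuracy, 3), per_class)
```

Because each class is scored against the pooled remainder, specificity can stay high even when overall accuracy drops, which is consistent with the pattern of figures reported in the text.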
Determining the cancer tissue of origin is critical to decide the clinical course of action. Although studies on early detection of cancer's tissue of origin using a liquid biopsy-based approach have shown a high sensitivity, there exists a fundamental variation in the methylation changes between cancer tissue DNA and DNA in circulation. This variation could be attributed to the intrinsic differences between the tissue and plasma methylation. Hence, approaching cancer tissue of origin using a panel of methylated gene loci requires a large-scale patient cohort for validation of the diagnosis method. This study utilizes the global methylation patterns, which represent the fundamental epigenetic modifications that are specific to the tissue of origin and tissue of progression.
DNA methylation plays a predominant role in determining the progression status of cancer. The first step to cancer metastasis is progression of tumor cells to multiple sites such as nodes, soft tissue, bones (
Diagnosis of Nodal Progression and Clinical Metastasis with Methylation in the Cell Free DNA of Cancer Stem Cell
The methylation markers were applied to determine the stage of nodal metastasis. K-means clustering was performed to determine whether the different nodal grades can be classified using the methylation markers. The silhouette plot in
Hypermethylation is one of the main epigenetic modifications that drive cancer metastasis. Several studies have shown a correlation between metastasis and hypermethylation across different cancer types. Since the methylation levels in a tumor are positively correlated with the methylation levels in cfDNA, the methylation markers in plasma were utilized to directly classify a tumor's potential to metastasize. The PCA shown in
Molecular Level Detection with Superlattice Sensor
The SERS spectra of Crystal Violet (CV) molecules at different concentrations were analyzed to determine the limit of detection, and thus the sensitivity, of the superlattice sensor. Spectra were acquired while sequentially lowering the concentration of CV from millimolar to attomolar. For each measurement, 10 μL of CV solution was dropped on the superlattice surface and spectra were acquired using a 785 nm Raman laser. At millimolar concentration, all the Raman peaks characteristic of CV molecules (Crystal Violet peak assignment) were observed. At lower concentrations, it can be observed from
At attomolar concentration, a small shift was observed for the major Raman peaks, which indicates that the spectrum possibly originated from a single CV molecule. Since not more than a single CV molecule will be present at attomolar concentration, different orientations of the molecule with respect to the hotspots give rise to peak shifts and selective enhancement in the SERS spectra. This also demonstrates the ultrasensitive detection capability of the Ni-Au/Pd superlattice sensor, making it superior for the detection of trace-level biomarkers.
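The scale of the attomolar measurement can be checked with a short calculation of the expected number of molecules in the deposited drop. The sketch below assumes concentrations of exactly 1 mM, 1 fM, and 1 aM and the 10 μL drop volume stated above. The laser spot size is not given in this description, so the number of molecules under the beam is not computed.

```python
AVOGADRO = 6.02214076e23  # molecules per mole (exact SI value)


def expected_molecules(molar_concentration, volume_litres):
    """Expected number of molecules in a given volume at a given molarity."""
    return molar_concentration * volume_litres * AVOGADRO


drop_volume = 10e-6  # 10 uL drop, as stated in the text, expressed in litres

# Assumed example concentrations spanning the millimolar-to-attomolar range.
for name, conc in [("millimolar", 1e-3), ("femtomolar", 1e-15), ("attomolar", 1e-18)]:
    count = expected_molecules(conc, drop_volume)
    print(f"{name}: about {count:.3g} molecules in the drop")

atto_count = expected_molecules(1e-18, drop_volume)
```

At exactly 1 aM the whole 10 μL drop holds only about six molecules, so a laser spot that samples a small fraction of the dried drop would be expected to contain at most one, which is in line with the single-molecule interpretation given above.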
The optimized sensor with maximum SERS efficiency was employed to study the ultrasensitive detection capability of the superlattice in discovering EV signatures specific to glioblastoma. The limit-of-detection experiments were carried out using GBM EVs isolated from the A172 cell line. The total EV concentration observed in the serum of glioblastoma patients is of the order of 10^6 particles/mL. However, cancer-specific EVs and cancer stem cell-specific EVs are present at very low concentrations in peripheral circulation. Moreover, the concentration of tumor-derived EVs is low at the early stages of glioblastoma, which makes them difficult to detect and necessitates ultrasensitive detection for a better prognosis.
The sensor was tested for its ultrasensitive detection capability by acquiring SERS spectra at different EV concentrations. For this, the concentration of EVs isolated from A172 cell culture media was obtained by NTA, and the sample was serially diluted in Milli-Q water to prepare a set of known concentrations ranging from 10^4 EVs/5 μL to 1 EV/5 μL. To analyze the detection sensitivity, EVs were dropped on the sensor surface, and Raman spectral acquisition was carried out using a laser of 785 nm wavelength with a 10 s exposure time.
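The dilution series described above can be laid out programmatically. This is a minimal sketch that assumes tenfold steps and a 100 μL working volume per step; neither value is stated in this description, and the function name is hypothetical.

```python
def serial_dilution_plan(start_count, end_count, fold=10, step_volume_ul=100.0):
    """Plan a serial dilution from start_count down to end_count particles per aliquot.

    Returns the list of target counts, plus the volume of the previous dilution
    and the volume of diluent (both in uL) that are mixed at every step.
    The fold and step volume are assumptions for illustration.
    """
    counts = []
    count = start_count
    while count >= end_count:
        counts.append(count)
        count //= fold
    transfer_ul = step_volume_ul / fold          # taken from the previous tube
    diluent_ul = step_volume_ul - transfer_ul    # Milli-Q water added
    return counts, transfer_ul, diluent_ul


# Range from the text: 10^4 EVs per 5 uL aliquot down to 1 EV per 5 uL aliquot.
counts, transfer_ul, diluent_ul = serial_dilution_plan(10**4, 1)
for c in counts:
    print(f"{c} EVs per 5 uL aliquot")
print(f"each step: {transfer_ul} uL of previous dilution + {diluent_ul} uL water")
```

With tenfold steps the stated range is covered by five concentrations, so each spectrum can be associated with a known nominal EV count per aliquot.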
Interestingly, by virtue of its 3D layered structure and porous nature, the superlattice sensor can efficiently capture EVs on its surface. In addition, the quantum size of the Ni cubes provides a high surface-area-to-volume ratio for the interaction of EVs with the superlattice surface, facilitating better adsorption and binding of EVs. Moreover, the nano-ornamentation of the 3D structure with Au/Pd boosts the enhancement efficiency by creating hotspots between the sharp edges of the Ni quantum cubes and the spherical Au/Pd nanoparticles. The enhanced electric fields generated by localized surface plasmon resonance boost the EV signals, resulting in high detection sensitivity when the EVs are captured in these hotspots. This permits the quantification of EVs by the sensor at extremely low concentrations. The capture of EVs by the 3D layered Ni superlattice is also evidenced by HRSEM images (
The SERS-based label-free detection of EVs is an efficient approach, since the technique allows single-molecule sensitivity with enhanced signals and a good signal-to-noise ratio without using specific antibodies. With its high detection sensitivity for EVs, this Ni-based superlattice sensor offers better prospects for clinical applications.
The SERS spectra of GBM EVs were also acquired from 10 randomly selected positions on the Ni-Au/Pd superlattice sensor to ensure the consistency and reproducibility of the EV biomarker spectral signatures
Significant Variation in SERS Profiles of EVs Derived from GBM and GBM CSC Compared to EVs Derived from Non-Cancer Cells
The SERS spectra acquired on the Ni-Au/Pd superlattice sensor for all biological samples were baseline corrected, smoothed, and normalized using Spectragryph software. The resulting data was subjected to PCA in Eigenvector software. In the PCA, a total of 2367 variables (100 cm⁻¹ to 1800 cm⁻¹ and 2600 cm⁻¹ to 3200 cm⁻¹) were reduced to 10 principal components.
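The preprocessing chain can be illustrated with a simplified stand-in. The sketch below assumes a straight-line baseline, a five-point moving average, and unit-norm scaling; the specific algorithms and settings used in Spectragryph are not stated here, so these choices and all function names are assumptions. The synthetic spectrum is for illustration only, and the PCA step itself is not shown.

```python
import math


def select_regions(shifts, intensities, regions=((100, 1800), (2600, 3200))):
    """Keep only points whose Raman shift (cm^-1) falls inside the given regions."""
    return [(s, y) for s, y in zip(shifts, intensities)
            if any(lo <= s <= hi for lo, hi in regions)]


def subtract_linear_baseline(values):
    """Subtract the straight line joining the first and last points (assumed baseline)."""
    n = len(values)
    if n < 2:
        return list(values)
    slope = (values[-1] - values[0]) / (n - 1)
    return [v - (values[0] + slope * i) for i, v in enumerate(values)]


def moving_average(values, window=5):
    """Centered moving average; the window shrinks near the edges."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        chunk = values[max(0, i - half): i + half + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


def normalize_unit_norm(values):
    """Scale the spectrum so that its Euclidean norm is 1."""
    norm = math.sqrt(sum(v * v for v in values))
    return [v / norm for v in values] if norm else list(values)


def preprocess(shifts, intensities):
    """Region selection, baseline correction, smoothing, then normalization."""
    kept = select_regions(shifts, intensities)
    kept_shifts = [s for s, _ in kept]
    y = subtract_linear_baseline([v for _, v in kept])
    y = moving_average(y)
    return kept_shifts, normalize_unit_norm(y)


# Synthetic spectrum: sloping background plus one peak near 1000 cm^-1.
shifts = list(range(100, 3201, 10))
raw = [0.002 * s + 50.0 * math.exp(-((s - 1000) / 30.0) ** 2) for s in shifts]
kept_shifts, processed = preprocess(shifts, raw)
print(len(processed), "variables retained after region selection")
```

For simplicity the baseline is fitted across the two retained regions as if they were contiguous. A production pipeline would more likely correct each region separately or use a polynomial or asymmetric least squares baseline.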
A total of 10 spectra each from normal and GBM CSC EVs, acquired from randomly selected points on the sensor, were used for analysis. The graph plotted between the first three principal components is shown in
The PCA of GBM cancer EVs and normal EVs leads to a similar inference: spectral differences exist between these populations as well (
The EVs carry contents that are unique to their parent cells.
The SERS spectra of EVs derived from GBM cancer cells and from the GBM cancer stem cell-enriched population were characterized in detail by multivariate analysis to identify the peaks of maximum variance. Principal component analysis can find highly correlated variables and reduce the data to a set of eigenvectors (principal components) that are linearly uncorrelated. These components account for the highest variance in the SERS data of GBM cancer and CSC EVs. The PC plot for the first three principal components is shown in
Biochemical Similarity of EVs Derived from GBM CSC with Parent GBM Tumor by SERS Profiling
The differences in SERS spectra observed between normal and cancer-specific EVs showed that GBM cancer/CSC EVs can be employed as a biomarker to diagnose Glioblastoma Multiforme. The next challenge was to use this potential biomarker to develop a rapid blood test. It is known that in glioblastoma, tumor-derived EVs escape the blood-brain barrier and enter the peripheral circulation. The phospholipid bilayer of EVs protects the biomolecules they carry from degradation. The correlation between the SERS spectra of GBM tumor tissue and of EVs isolated from the serum of the same patient was analyzed to study the tumor-specific characteristics expressed in serum EVs.
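One common way to quantify the correlation between two spectra sampled at the same Raman shifts is the Pearson coefficient. The similarity measure actually used is not specified in this description, so the sketch below is an assumption, and the intensity values are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equally sampled spectra."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("spectra must have the same length (at least 2 points)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for a constant spectrum")
    return cov / (sd_x * sd_y)


# Hypothetical intensities at five matching Raman shifts (illustration only).
tissue_spectrum = [0.10, 0.40, 0.90, 0.30, 0.20]
serum_ev_spectrum = [2.0 * v + 1.0 for v in tissue_spectrum]   # same shape, rescaled
inverted_spectrum = [-v for v in tissue_spectrum]

r_same_shape = pearson(tissue_spectrum, serum_ev_spectrum)
r_inverted = pearson(tissue_spectrum, inverted_spectrum)
print(f"same shape: r={r_same_shape:.3f}, inverted: r={r_inverted:.3f}")
```

The coefficient is insensitive to overall intensity scaling and offset, which is useful when tissue and serum spectra differ in absolute signal level but share peak patterns.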
To analyze the major peaks that contribute to the similarity between GBM serum EVs and GBM tumor tissue, a heatmap was drawn for the SERS spectral data. The highest similarities were observed for the peaks at 1073 cm⁻¹, 1255 cm⁻¹, 1632 cm⁻¹, 1513 cm⁻¹, and 1376 cm⁻¹. These peaks correspond to fatty acids, lipids, Amide I protein components, cytosine of nucleic acids, and the ring breathing mode of nucleic acids, respectively. This showed that the EVs carried characteristics of GBM tumor cells in the form of different biomolecules. In addition, the results imply that the whole spectrum of biomolecules constituting EVs should be considered for a holistic analysis that can substantially improve diagnostic efficacy.
GBM Diagnosis with GBM CSC-EVs
The findings show that CSC-derived EVs can be considered a unique biomarker and, when combined with machine learning techniques, can serve as a potential platform for non-invasive GBM diagnosis. Only an efficient machine learning model can bring out the subtle differences in the SERS spectra that facilitate prediction of GBM. To validate the usefulness of EVs as a biomarker reliable for clinical applications, the serum samples of 20 patients with Glioblastoma Multiforme were subjected to PLSDA. Unlike PCA, which is unsupervised, PLSDA is a supervised machine learning model that can predict and classify the data into distinct groups based on the similarities and covariances in the data.
The SERS spectra of the serum samples were obtained and preprocessed before being loaded into the PLSDA model. The PLSDA model was executed in three steps: i) training using normal cell-derived EVs, GBM cancer cell-derived EVs, and GBM CSC EVs; ii) validation of the model using 20% of the training data; and iii) testing using the patient serum samples. In the training phase, the machine learning model was first trained using normal EVs and GBM cancer-derived EVs. Cross-validation of the model confirmed an accuracy of 100%.
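The validation step, in which 20% of the training data is held out, can be sketched as a per-class (stratified) split. Whether the split was stratified, and how many spectra each class contained, is not stated in this description; both are assumptions made for illustration, and the PLSDA model itself is not shown.

```python
import random
from collections import defaultdict


def stratified_holdout(labels, validation_fraction=0.2, seed=0):
    """Return (train_indices, validation_indices), holding out a fraction of each class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    train, validation = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        # Hold out at least one spectrum per class.
        n_val = max(1, round(len(indices) * validation_fraction))
        validation.extend(indices[:n_val])
        train.extend(indices[n_val:])
    return sorted(train), sorted(validation)


# Hypothetical class sizes: 10 spectra per EV type (illustration only).
labels = ["normal EV"] * 10 + ["GBM EV"] * 10 + ["GBM CSC EV"] * 10
train_idx, val_idx = stratified_holdout(labels)
print(f"{len(train_idx)} training spectra, {len(val_idx)} validation spectra")
```

The patient serum spectra are kept entirely outside this split, so they serve as an independent test set in step iii.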
However, to further improve the prediction accuracy, GBM CSC derived EVs were employed in the training phase to incorporate the intratumoral heterogeneity of GBM. The hierarchical clustering of the training data is shown as a dendrogram in
Interestingly, the model trained with GBM CSC derived EVs resulted in 100% sensitivity and 100% specificity for Glioblastoma Multiforme prediction (
This study introduces an ultrasensitive and non-invasive methodology for GBM detection directly from patient blood samples, using GSC-DNA as a reliable biomarker and 3D plasmonic metasensors with sub-single-molecule sensitivity. The GSCs account for the heterogeneity, tumour progression, and aggressiveness of the cancer and hence play a vital role in the detection mechanism. Trace levels of the GSC-associated ctDNA were detected by SERS of the samples on the 3D plasmonic sensors. The high sensitivity of the sensors was achieved by the following: size reduction of the carbon nanoparticles, which activates the organic plasmons; arrangement of the metasensor in a 3D cluster, which entraps the analyte and increases the surface area for adsorption; and incorporation of functional groups on the metasensors, which enhances the binding of the analytes to the sensors. The detection was achieved by using machine learning algorithms for data analysis. The SERS profiles of the in-vitro GBM DNA and GSC DNA were collected and used to train the machine learning algorithm to detect the differences between GBM and GSC DNA. These differences were utilized to detect the cancer. The tumor-derived DNA from GBM patient samples was used as the testing data to pinpoint the differences and group the samples as healthy or cancerous. SERS data obtained from 5 μL of a GBM patient's peripheral blood sample on the metasensor platform was used as the validation data for the test. The classification accuracy and specificity increased to 96.9% and 94.4%, respectively, when the GSC-DNA was incorporated along with the GBM DNA.
The SERS spectra of the DNA isolated from in-vitro glioblastoma cancer cells (A-172) and healthy fibroblast cells (WI-38) were recorded to study the differences between them. The PCA of the SERS spectra of the cell-derived DNA showed a clear distinction between the glioblastoma and healthy cells. The predominant peak assignments responsible for the variance that separates the healthy and cancer cells were obtained from the PC loading curves of all the principal components considered for the analysis. In this case, four principal components were identified and used for the PCA.
Dysregulation in the expression of proteins, or the expression of defective proteins, is a significant feature of cancer cells. The variance at the amino acid peak positions of the PC loading plots shows that the SERS spectra of the compared DNA differ significantly in their gene expression patterns [29, 30]. The variance observed at 756 cm⁻¹ in PC4 and at the 1075 cm⁻¹ peak in PC2 and PC3 corresponds to the out-of-plane ring mode and the in-plane ring breathing mode, respectively, suggestive of the oncoprotein c-MYC [29, 30], which is a further contributing factor in classifying the SERS spectra of the DNA isolated from healthy and cancer cells.
Glioblastoma is an aggressive form of cancer, and at higher stages it relapses and metastasizes to other locations. The presence of a rare population of cancer stem cells (CSCs) is the reason for the metastasis and aggressiveness of the cancer. It is therefore imperative to take the scarce CSC population into account in the diagnosis of glioblastoma. The difficulty with using CSCs is that they are few in number, and the CSC-associated cell-free DNA in patient blood or serum is present at a very low concentration, making it hard to detect. To account for tumour heterogeneity, the SERS spectra of the DNA isolated from in-vitro glioblastoma cells (A-172) and from the cancer stem cells derived from A-172 were used to identify unique spectral signatures that differentiate them into distinct populations based on the differences in their DNA and proteins.
The DNA of the glioblastoma cancer cells and of the CSCs was dropped on the nanodiamond sensors, and the SERS spectra were recorded and analyzed to understand the differences exhibited by the DNA signatures.
The concentration of CSC DNA in blood is directly related to the size and/or stage of the tumor. The GSC DNA was serially diluted to a range of concentrations together with an optimal concentration of healthy fibroblast cell-free DNA. This mimics plasma DNA, which contains both healthy and cancer DNA.
They all follow a linear regression pattern, suggesting that they are detectable down to the lowest concentrations.
Using the regression line equation, the actual concentration of the CSC DNA present in the given samples was calculated and is shown in the scatter plot in
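A linear relationship of this kind can serve as a calibration curve: a line is fitted to the signal measured at known concentrations and then inverted to estimate an unknown concentration. The sketch below assumes that the signal is linear in the base-10 logarithm of concentration. The axes of the regression are not specified in this description, and the data points are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def invert_calibration(signal, slope, intercept):
    """Solve signal = slope * x + intercept for x."""
    return (signal - intercept) / slope


# Hypothetical calibration points: log10(concentration) versus peak intensity.
log_concentration = [-3.0, -2.0, -1.0, 0.0, 1.0]
peak_intensity = [4.0, 6.0, 8.0, 10.0, 12.0]

slope, intercept = fit_line(log_concentration, peak_intensity)
estimated_log = invert_calibration(9.0, slope, intercept)   # unknown sample signal
estimated_concentration = 10 ** estimated_log
print(f"slope={slope:.2f}, intercept={intercept:.2f}, "
      f"estimated concentration={estimated_concentration:.3g} (arbitrary units)")
```

The estimate is only meaningful inside the calibrated range; extrapolating below the lowest standard would require separate validation.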
Correlation Between GBM DNA, GSC DNA, Patient Tumor Derived DNA and the Serum Samples Enabling the Direct Detection of GBM from Patient Serum Samples
SERS spectral data of the DNA isolated from the tumour biopsies of glioblastoma patients were used to establish its resemblance to the cell-free DNA present in the patient serum samples. The similarity between the glioblastoma cell-derived DNA and the patient tumour-derived DNA was also examined by comparing their SERS spectra. However, the glioblastoma cell DNA showed similarity at only a few data points. Glioblastoma is an aggressive type of cancer, and the tumours consist of a heterogeneous population of cancerous cells, healthy cells, and a rare population of cancer stem cells (CSCs). The aggressiveness of the cancer is attributed to the presence of CSCs in the tumour. Therefore, the presence of CSC DNA must be accounted for in the DNA isolated from the tumour [37, 38]. The lower similarity index between the cancer cell DNA and the tumour DNA could result from omitting the other cell populations. Therefore, DNA was isolated from the in-vitro glioblastoma CSCs, combined with the cancer cell DNA, and compared with the tumour DNA.
The established similarity of tumour-derived DNA to the CSC and cancer cell-derived DNA is extended to show their holistic similarity to the cell-free DNA in patient sera.
To establish the distinction between the serum of a healthy individual and that of a GBM patient, spectra of serum samples were obtained from 10 healthy individuals, the respective cfDNA was isolated, and its SERS spectra were obtained. The similarity analysis established that the serum properties are expressed in the cfDNA.
Glioblastoma is hard to detect in its early stages because the characteristic biomarkers of this aggressive form of brain cancer cannot cross the blood-brain barrier (BBB) [39]. At later or more complex stages, the BBB becomes leaky and allows biomarkers such as cell-free DNA and apoptotic cells to cross it and enter the bloodstream. The concentration of these biomarkers in the blood is insufficient for direct detection of the disease by currently available methods. The plasmonic metasensors used in this study show enhanced Raman efficiency, which makes them suitable candidates for detecting glioblastoma from patient blood serum [40, 41].
The DNA derived from the GBM and the GSC is used as the training data for the machine learning algorithm for GBM diagnosis. The Raman spectra of patient serum are the validation/test data to be diagnosed.
A glioblastoma tumour is a highly heterogeneous population of cells. Therefore, to account for this heterogeneity, the GSC DNA data was combined with the GBM data to form the training dataset for the algorithm.
This study designed a fully connected artificial neural network that can act as a location identifier for brain tumors. The multilayer perceptron (MLP) used here is a fully connected artificial neural network that works on a feed-forward mechanism. The model is composed of a three-layer dense network whose nodes, or neurons, act as processing elements and use linear activation and backpropagation for training. The input data was the whole SERS spectrum of each serum sample, comprising 2573 variables (Raman values), along with the corresponding brain tumor location. The output layer generates the classification based on the input data and the computations from the hidden layer. The predicted output is compared with the actual output, and when the error is high, the weights corresponding to the variables are recalculated to improve network performance. The training process continues until a good validation accuracy is obtained. Testing employs a new dataset whose actual values are unknown to the network. In this study, the output was the brain tumor location, classified into nine classes: cerebellar, left frontal, left temporal, left occipital, left parietal, right frontal, right temporal, right occipital, and right parietal.
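The feed-forward computation of such a network can be sketched in a few lines. The example below assumes hidden layers of 16 and 8 nodes, He-style random initial weights, ReLU on the hidden layers (the activation this description names for the model), and a mean squared error loss against a one-hot target. The hidden layer sizes, the initialization, and the random input spectrum are assumptions for illustration, and training by backpropagation is not shown.

```python
import random

LOCATIONS = [
    "cerebellar", "left frontal", "left temporal", "left occipital", "left parietal",
    "right frontal", "right temporal", "right occipital", "right parietal",
]


def relu(values):
    """Rectified linear activation: negative values become zero."""
    return [max(0.0, v) for v in values]


def dense(inputs, weights, biases):
    """Fully connected layer; weights[j][i] links input i to output node j."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def init_layer(n_in, n_out, rng):
    """Random initial weights (assumed He-style scaling) and zero biases."""
    scale = (2.0 / n_in) ** 0.5
    weights = [[rng.gauss(0.0, scale) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def forward(spectrum, layers):
    """Propagate one spectrum through all layers; ReLU on hidden layers only."""
    activations = spectrum
    for index, (weights, biases) in enumerate(layers):
        activations = dense(activations, weights, biases)
        if index < len(layers) - 1:
            activations = relu(activations)
    return activations


def mse(predicted, target):
    """Mean squared error between predicted scores and the target vector."""
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(target)


rng = random.Random(0)
sizes = [2573, 16, 8, len(LOCATIONS)]           # hidden sizes are assumptions
layers = [init_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]

spectrum = [rng.random() for _ in range(2573)]  # stand-in for one SERS spectrum
scores = forward(spectrum, layers)
predicted = LOCATIONS[max(range(len(scores)), key=scores.__getitem__)]

target = [1.0 if name == "left frontal" else 0.0 for name in LOCATIONS]
loss = mse(scores, target)
print(f"predicted location (untrained network): {predicted}, loss={loss:.4f}")
```

With untrained weights the prediction is arbitrary. During training, the loss computed in the last step is the quantity that backpropagation would reduce by adjusting the weights.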
The pre-processed data, after normalization, was augmented to generate a dataset of 3000 SERS spectra. Data augmentation helps reduce the risk of overfitting. With a small amount of data, the pattern learned by the model often fails to generalize to new data, and the model may rely on irrelevant features for classification. This is mitigated by constructing an augmented dataset through a number of random transformations. This data was given to the input layer and used to train the MLP ANN model. The learning rate was set to 0.01 to ensure that the model captures details of every single peak of the Raman spectra. In the MLP algorithm, the output of one layer acts as the input of the next, and all layers in the network are fully connected. However, parameters such as the weights are not shared between layers: the weights assigned in each layer are unique. These weights are assigned randomly and then adjusted by backpropagation of the estimated error. The weighted sum of the outputs of one layer is fed to the next layer through an activation function. The activation function employed here is rectified linear activation (ReLU), which maps the input to the output by setting negative values to zero at each step.
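Augmentation by random transformations can be sketched as follows. The specific transformations used are not listed in this description, so the three shown here (a small shift along the Raman axis, an intensity scaling, and additive Gaussian noise), their magnitudes, and the toy input spectra are all assumptions for illustration.

```python
import math
import random


def augment_spectrum(spectrum, rng, noise_sd=0.01, max_shift=2,
                     scale_range=(0.95, 1.05)):
    """Return a randomly perturbed copy of one spectrum (assumed transformations)."""
    n = len(spectrum)
    shift = rng.randint(-max_shift, max_shift)
    # Shift along the Raman axis, repeating the edge value to keep the length.
    shifted = [spectrum[min(max(i - shift, 0), n - 1)] for i in range(n)]
    scale = rng.uniform(*scale_range)
    return [scale * v + rng.gauss(0.0, noise_sd) for v in shifted]


def augment_dataset(spectra, labels, target_size, seed=0):
    """Grow the dataset to target_size by adding perturbed copies of random spectra."""
    rng = random.Random(seed)
    out_spectra, out_labels = list(spectra), list(labels)
    while len(out_spectra) < target_size:
        i = rng.randrange(len(spectra))
        out_spectra.append(augment_spectrum(spectra[i], rng))
        out_labels.append(labels[i])   # the label is unchanged by augmentation
    return out_spectra, out_labels


# Two toy 20-point spectra standing in for measured data (illustration only).
base_spectra = [[math.exp(-((i - c) / 2.0) ** 2) for i in range(20)] for c in (5, 14)]
base_labels = ["left frontal", "cerebellar"]

aug_spectra, aug_labels = augment_dataset(base_spectra, base_labels, target_size=3000)
print(len(aug_spectra), "spectra after augmentation")
```

A common design choice is to split the measured spectra into training and validation sets before augmenting, so that perturbed copies of a validation spectrum do not appear in the training set.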
After training, validation was done using 10% of the data, and the loss and accuracy curves were monitored over increasing epochs. The total number of epochs was set to 100, at which a maximum validation accuracy of 94% was obtained. The number of epochs was capped to avoid overfitting the model. The prediction results are given by the output layer. The Mean Squared Error (MSE) function was used to observe the loss. The MSE takes the difference between the actual and predicted values, squares it, and averages it over the whole dataset. The loss function continuously monitors the performance of the algorithm over increasing epochs and guides the fitting process. A low loss value obtained as shown in
100%
100%
100%
The SERS spectra of serum samples of brain tumor patients were acquired using the nanosensor, and the data was augmented to overcome the limitations posed by a small dataset. The augmented data, comprising the whole SERS spectrum with 2573 variables (Raman peaks), was given as input to the weighted co-expression network analysis. A sample dendrogram was constructed to detect outliers and find similarity in the dataset. The heatmap in the
Identifying Significant Raman Signatures Associated with Brain Tumor Location
The modules were then correlated with the tumor location trait. The correlation cut-off value was set to p=0.05, and the three major modules were considered. Among them, the blue module was identified as the most positively correlated with the location trait (
The Weighted Raman co-expression analysis revealed that proteins (glycogen, tryptophan), lipids (cholesterol ester) and nucleic acid peaks significantly correlate with the brain tumor location trait. Among these the major biomolecules identified were glycogen, cortisone, phosphatidylinositol, cholesterol ester, DNA and several RNA modes. A summary for each biomarker is given below and would contribute to further research and experiments that can validate our studies and lead to novel ways of diagnosis, prognosis and treatment of brain tumors.
While the applicant's teachings described herein are in conjunction with various embodiments for illustrative purposes, it is not intended that the applicant's teachings be limited to such embodiments as the embodiments described herein are intended to be examples. On the contrary, the applicant's teachings described and illustrated herein encompass various alternatives, modifications, and equivalents, without departing from the embodiments described herein, the general scope of which is defined in the appended claims.
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/CA2022/050232 | 2/17/2022 | WO |
Number | Date | Country | |
---|---|---|---|
20240132967 A1 | Apr 2024 | US |
Number | Date | Country | |
---|---|---|---|
63150566 | Feb 2021 | US |