The present disclosure generally relates to small molecule biomarkers comprising a panel of metabolite species that is effective for the early detection of breast cancer recurrence, including methods for identifying such panels of biomarkers within biological samples by using a process that combines gas chromatography-mass spectrometry and nuclear magnetic resonance spectrometry.
Breast cancer remains the leading cause of cancer death among women worldwide. It is the second leading cause of cancer death among women in the United States, with nearly 190,000 new cases and 40,000 deaths expected in the year 2010. Although breast cancer survival has improved over the past few decades owing to improved diagnostic screening methods, breast cancer often recurs anywhere from 2 to 15 years following initial treatment, and can occur either locally in the same or contralateral breast or as a distant recurrence (metastasis). Recent studies of nearly 3,000 breast cancer patients showed that the recurrence rates 5 and 10 years after completion of adjuvant treatment were 11 percent (“%”) and 20%, respectively. Numerous factors such as stage, grade and hormone receptor status have been shown to be associated with recurrence. Higher stage tumors often have a higher propensity to recur. For example, a recent study reports recurrence rates after 5 years of 7%, 11% and 13% for stage I, II and III tumor cases, respectively. In addition, conditions such as lymph node invasion and absence of estrogen receptors are factors in a higher relapse rate and a shorter disease-free survival. Studies have shown that early detection of locally recurrent breast cancers can improve survival rate significantly.
Common methods for routine surveillance of recurrent breast cancer include periodic mammographic examinations, self-examination or physician-performed physical examination and blood tests. The performance of such tests is poor, and extensive investigations for surveillance have not proven effective. Often, mammography misses small local recurrences or leads to false positives, resulting in low sensitivity and specificity, and unnecessary biopsies. In view of the unmet need for more sensitive and earlier detection methods, the last decade or so has witnessed the development of a number of new approaches for detecting recurrent breast cancer and monitoring disease progression using blood-based tumor markers or genetic profiles. The in vitro diagnostic (“IVD”) markers include carcinoembryonic antigen (“CEA”), cancer antigen (“CA”) 15-3, CA 27.29, tissue polypeptide antigen (“TPA”), and tissue polypeptide specific antigen (“TPS”). Such molecular markers are thought to be promising since the outcome of a diagnosis based on these markers is independent of the expertise and experience of the clinicians and potentially avoids sampling errors commonly associated with conventional pathological tests, such as histopathology. However, these markers currently lack the desired sensitivity and specificity, and often respond late to recurrence, underscoring the need for alternative approaches.
An improvement of up to nearly 50% in the relative survival of patients can be achieved by detecting the recurrence at a clinically asymptomatic phase, showing the need for a reliable test that is based on biomarkers that are indicative of secondary tumor cell proliferation. However, the performance of the commercially available non-invasive tests based on circulating tumor markers such as carcinoembryonic antigen and cancer antigens is too poor to be of significant value for improving early detection. This is because the levels of these markers are also elevated in numerous other malignant and non-malignant conditions unconnected with breast cancer. Considering such limitations, the American Society of Clinical Oncology (ASCO) guidelines recommend the use of these markers only for monitoring patients with metastatic disease during active therapy in conjunction with numerous other examinations and investigations.
Metabolite profiling (or metabolomics) can detect disease based on a panel of small molecules derived from the global or targeted analysis of metabolic profiles of samples such as blood and urine. Metabolite profiling uses high-resolution analytical methods such as nuclear magnetic resonance (NMR) spectroscopy and mass spectrometry (MS) for the quantitative analysis of hundreds of small molecules (less than ˜1,000 Da) present in biological samples. Owing to the complexity of the metabolic profile, multivariate statistical methods are extensively used for data analysis. The high sensitivity of metabolite profiles to even subtle stimuli can provide the means to detect the early onset of various biological perturbations in real time.
A monitoring test for recurrent breast cancer with a high degree of sensitivity and specificity is provided that detects the presence of a panel of a multiplicity of biomarkers that were identified using metabolite profiling methods. The test is capable of detecting breast cancer recurrence about a year earlier than currently available monitoring diagnostic tests. The panel of biomarkers is identified using a combination of nuclear magnetic resonance (NMR) and two-dimensional gas chromatography-mass spectrometry (GC×GC-MS) to produce the metabolite profiles of serum samples. The NMR and GC×GC-MS data are analyzed by multivariate statistical methods to compare identified metabolite signals between samples from patients with recurrence of breast cancer and those from patients having no evidence of disease.
In a preferred embodiment, a method is disclosed for detecting a panel of a multiplicity of predetermined metabolic biomarkers that are indicative of the recurrence of breast cancer in a subject, comprising obtaining a sample of a biofluid from the subject; analyzing the sample to determine the presence and the amount of each of the metabolic biomarkers in the panel; wherein the presence and the amount of each of the metabolic biomarkers in the panel as a whole are indicative of the recurrence of breast cancer in a subject. Typically the biofluid is blood, plasma, serum, sweat, saliva, sputum, or urine. Preferably the biofluid is serum.
In a preferred embodiment, the panel of a multiplicity of metabolic biomarkers consists of at least seven compounds selected from the group consisting of 3-hydroxybutyrate, acetoacetate, alanine, arginine, asparagine, choline, creatinine, glucose, glutamic acid, glutamine, glycine, formate, histidine, isobutyrate, isoleucine, lactate, lysine, methionine, N-acetylaspartate, proline, threonine, tyrosine, valine, 2-hydroxybutanoic acid, hexadecanoic acid, aspartic acid, 3-methyl-2-hydroxy-2-pentenoic acid, dodecanoic acid, 1,2,3-trihydroxypropane, beta-alanine, alanine, phenylalanine, 3-hydroxy-2-methyl-butanoic acid, 9,12-octadecadienoic acid, acetic acid, N-acetylglycine, glycine, nonanedioic acid, nonanoic acid, and pentadecanoic acid.
In another preferred embodiment, the panel consists of 3-hydroxybutyrate, acetoacetate, alanine, arginine, choline, creatinine, glutamic acid, glutamine, formate, histidine, isobutyrate, lactate, lysine, proline, threonine, tyrosine, valine, hexadecanoic acid, aspartic acid, dodecanoic acid, alanine, phenylalanine, 3-hydroxy-2-methyl-butanoic acid, 9,12-octadecadienoic acid, acetic acid, N-acetylglycine, nonanedioic acid, and pentadecanoic acid.
In a further preferred embodiment, the panel consists of 3-hydroxybutyrate, choline, glutamic acid, formate, histidine, lactate, proline, tyrosine, 3-hydroxy-2-methyl-butanoic acid, N-acetylglycine, and nonanedioic acid. In another preferred embodiment, the panel consists of choline, glutamic acid, formate, histidine, proline, 3-hydroxy-2-methyl-butanoic acid, N-acetylglycine, and nonanedioic acid. In yet another preferred embodiment, the panel consists of 3-hydroxybutyrate, choline, formate, histidine, lactate, proline, and tyrosine.
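The panel definitions above amount to a membership rule: a panel contains at least seven distinct compounds, each drawn from the recited group. The following is a minimal Python sketch of that rule, offered only as an illustration. The function name `is_valid_panel`, the lowercase normalization, and the hyphenated spellings used for the compound names are assumptions and are not part of the disclosure.

```python
# Candidate group transcribed from the recited list. Names are lowercased and
# spellings are normalized (an assumption); compounds recited twice appear once.
CANDIDATE_GROUP = frozenset({
    "3-hydroxybutyrate", "acetoacetate", "alanine", "arginine", "asparagine",
    "choline", "creatinine", "glucose", "glutamic acid", "glutamine",
    "glycine", "formate", "histidine", "isobutyrate", "isoleucine",
    "lactate", "lysine", "methionine", "n-acetylaspartate", "proline",
    "threonine", "tyrosine", "valine", "2-hydroxybutanoic acid",
    "hexadecanoic acid", "aspartic acid",
    "3-methyl-2-hydroxy-2-pentenoic acid", "dodecanoic acid",
    "1,2,3-trihydroxypropane", "beta-alanine", "phenylalanine",
    "3-hydroxy-2-methyl-butanoic acid", "9,12-octadecadienoic acid",
    "acetic acid", "n-acetylglycine", "nonanedioic acid", "nonanoic acid",
    "pentadecanoic acid",
})


def is_valid_panel(panel, minimum=7):
    """Return True if the panel holds at least `minimum` distinct compounds,
    every one of which belongs to the candidate group."""
    members = {name.strip().lower() for name in panel}
    return len(members) >= minimum and members <= CANDIDATE_GROUP


# The seven-compound panel recited in the text.
seven_marker_panel = [
    "3-hydroxybutyrate", "choline", "formate", "histidine",
    "lactate", "proline", "tyrosine",
]
```

Under these assumptions, `is_valid_panel(seven_marker_panel)` accepts the seven-compound panel, while a panel that is too small or that names a compound outside the group is rejected.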
In a preferred embodiment, the metabolic biomarkers in the panel are determined by obtaining samples of biofluid from subjects with known breast cancer status; measuring one or more metabolite species in the samples by subjecting the samples to nuclear magnetic resonance measurements; measuring one or more metabolite species in the samples by subjecting the samples to mass spectrometry measurements; analyzing the results of the nuclear magnetic resonance measurements and the results of the mass spectrometry measurements to produce spectra containing individual spectral peaks representative of the one or more metabolite species contained within the sample; subjecting the spectra to multivariate statistical analysis to identify one or more metabolite species contained within the sample; and determining which metabolic species are correlated with a given breast cancer status.
In another preferred embodiment, a method is disclosed for detecting secondary tumor cell proliferation in a mammalian subject comprising: obtaining a sample of a biofluid from the subject; analyzing the sample to determine the presence and the amount of each of the metabolic biomarkers in a panel of predetermined biomarkers; wherein the presence and the amount of each of the metabolic biomarkers in the panel as a whole are indicative of secondary tumor cell proliferation in a mammalian subject. Typically the biofluid is blood, plasma, serum, sweat, saliva, sputum, or urine. Preferably the biofluid is serum.
In a preferred embodiment, the panel of a multiplicity of metabolic biomarkers consists of at least seven compounds selected from the group consisting of 3-hydroxybutyrate, acetoacetate, alanine, arginine, asparagine, choline, creatinine, glucose, glutamic acid, glutamine, glycine, formate, histidine, isobutyrate, isoleucine, lactate, lysine, methionine, N-acetylaspartate, proline, threonine, tyrosine, valine, 2-hydroxybutanoic acid, hexadecanoic acid, aspartic acid, 3-methyl-2-hydroxy-2-pentenoic acid, dodecanoic acid, 1,2,3-trihydroxypropane, beta-alanine, alanine, phenylalanine, 3-hydroxy-2-methyl-butanoic acid, 9,12-octadecadienoic acid, acetic acid, N-acetylglycine, glycine, nonanedioic acid, nonanoic acid, and pentadecanoic acid. In another preferred embodiment, the panel consists of 3-hydroxybutyrate, acetoacetate, alanine, arginine, choline, creatinine, glutamic acid, glutamine, formate, histidine, isobutyrate, lactate, lysine, proline, threonine, tyrosine, valine, hexadecanoic acid, aspartic acid, dodecanoic acid, alanine, phenylalanine, 3-hydroxy-2-methyl-butanoic acid, 9,12-octadecadienoic acid, acetic acid, N-acetylglycine, nonanedioic acid, and pentadecanoic acid.
In a further preferred embodiment, the panel consists of 3-hydroxybutyrate, choline, glutamic acid, formate, histidine, lactate, proline, tyrosine, 3-hydroxy-2-methyl-butanoic acid, N-acetylglycine, and nonanedioic acid. In another preferred embodiment, the panel consists of choline, glutamic acid, formate, histidine, proline, 3-hydroxy-2-methyl-butanoic acid, N-acetylglycine, and nonanedioic acid. In yet another preferred embodiment, the panel consists of 3-hydroxybutyrate, choline, formate, histidine, lactate, proline, and tyrosine.
In a preferred embodiment, the metabolic biomarkers in the panel are determined by obtaining samples of biofluid from subjects with known secondary tumor cell proliferation; measuring one or more metabolite species in the samples by subjecting the samples to nuclear magnetic resonance measurements; measuring one or more metabolite species in the samples by subjecting the samples to mass spectrometry measurements; analyzing the results of the nuclear magnetic resonance measurements and the results of the mass spectrometry measurements to produce spectra containing individual spectral peaks representative of the one or more metabolite species contained within the sample; subjecting the spectra to multivariate statistical analysis to identify the one or more metabolite species contained within the sample; and determining which metabolic species are correlated with secondary tumor cell proliferation.
In another preferred embodiment, a method is disclosed for detecting recurrent breast cancer status within a biological sample, comprising: measuring one or more metabolite species within the sample by subjecting the sample to a combined nuclear magnetic resonance and mass spectrometry analysis, the analysis producing a spectrum containing individual spectral peaks representative of the one or more metabolite species contained within the sample; subjecting the individual spectral peaks to a statistical pattern recognition analysis to identify the one or more metabolite species contained within the sample; and correlating the measurement of the one or more metabolite species with a breast cancer status. Preferably, the one or more metabolite species are selected from the group consisting of 2-methyl,3-hydroxy butanoic acid; 3-hydroxybutyrate; choline; formate; histidine; glutamic acid; N-acetyl-glycine; nonanedioic acid; proline; threonine; tyrosine; and combinations thereof. Typically the sample comprises a biofluid, preferably serum. Typically the mass spectrometry analysis comprises a two-dimensional gas chromatography coupled mass spectrometry analysis.
In another preferred embodiment, the invention provides a panel of biomarkers for detecting breast cancer, comprising at least one metabolite species or parts thereof, selected from the group consisting of 2-methyl,3-hydroxy butanoic acid; 3-hydroxybutyrate; choline; formate; histidine; glutamic acid; N-acetyl-glycine; nonanedioic acid; proline; threonine; tyrosine; and combinations thereof.
The above-mentioned aspects of the present teachings and the manner of obtaining them will become more apparent and the teachings will be better understood by reference to the following description of the embodiments taken in conjunction with the accompanying drawings, in which corresponding reference characters indicate corresponding parts throughout the several views.
In one preferred embodiment, a monitoring test for recurrent breast cancer that was developed using metabolite profiling methods is disclosed. Using a combination of nuclear magnetic resonance (NMR) and two-dimensional gas chromatography-mass spectrometry (GC×GC-MS) methods, we analyzed the metabolite profiles of 257 retrospective serial serum samples from 56 previously diagnosed and surgically treated breast cancer patients. One hundred sixteen of the serial samples were from 20 patients with recurrent breast cancer, and 141 samples were from 36 patients with no clinical evidence of the disease during ˜6 years of sample collection. NMR and GC×GC-MS data were analyzed by multivariate statistical methods to compare identified metabolite signals between the recurrence samples and those with no evidence of disease, producing a set of 40 biomarkers (Table 2, below). A subset of eleven metabolite markers (seven from NMR and four from GC×GC-MS) was selected from an analysis of all patient samples by using logistic regression and 5-fold cross-validation. A partial least squares discriminant analysis model, built using these markers with leave-one-out cross-validation, provided a sensitivity of 86% and a specificity of 84% (area under the receiver operating characteristic curve=0.88). Strikingly, 55% of the patients could be correctly predicted to have recurrence more than a year (13 months on average) before the recurrence was clinically diagnosed, representing a large improvement over the current breast cancer-monitoring assay CA 27.29.
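The figures of merit quoted above (sensitivity, specificity, and area under the receiver operating characteristic curve) can be computed from class labels and model scores. The sketch below is one plain-Python way to do so, assuming binary labels (1 for recurrence, 0 for no evidence of disease) and a hypothetical decision threshold of 0.5. The labels and scores are invented toy values, not study data, and the function names are illustrative.

```python
def sensitivity_specificity(labels, predictions):
    """Return (sensitivity, specificity) for binary labels and predictions."""
    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)  # recurrence, flagged
    fn = sum(1 for y, p in pairs if y == 1 and p == 0)  # recurrence, missed
    tn = sum(1 for y, p in pairs if y == 0 and p == 0)  # disease-free, cleared
    fp = sum(1 for y, p in pairs if y == 0 and p == 1)  # disease-free, flagged
    return tp / (tp + fn), tn / (tn + fp)


def auroc(labels, scores):
    """Area under the ROC curve, computed as the fraction of
    (positive, negative) pairs in which the positive sample scores higher;
    ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy values for illustration only.
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.6, 0.2, 0.1]
toy_predictions = [1 if s >= 0.5 else 0 for s in toy_scores]

toy_sens, toy_spec = sensitivity_specificity(toy_labels, toy_predictions)
toy_auc = auroc(toy_labels, toy_scores)
```

The pairwise formulation of the area under the curve is chosen here because it needs no threshold sweep; it is equivalent to the rank-based Mann-Whitney statistic.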
The embodiments of the present disclosure described below are not intended to be exhaustive or to limit the disclosure to the precise forms disclosed in the following detailed description. Rather, the embodiments are chosen and described so that others skilled in the art may appreciate and understand the principles and practices of the present disclosure.
Unless defined otherwise, all technical and scientific terms used herein have the meaning commonly understood by a person skilled in the art to which this invention belongs.
As used herein, “metabolite” refers to any substance produced or used during all the physical and chemical processes within the body that create and use energy, such as: digesting food and nutrients, eliminating waste through urine and feces, breathing, circulating blood, and regulating temperature. The term “metabolic precursors” refers to compounds from which the metabolites are made. The term “metabolic products” refers to any substance that is part of a metabolic pathway (e.g. metabolite, metabolic precursor).
As used herein, “biological sample” refers to a sample obtained from a subject. In preferred embodiments, biological sample can be selected, without limitation, from the group of biological fluids (“biofluids”) consisting of blood, plasma, serum, sweat, saliva, including sputum, urine, and the like. As used herein, “serum” refers to the fluid portion of the blood obtained after removal of the fibrin clot and blood cells, distinguished from the plasma in circulating blood. As used herein, “plasma” refers to the fluid, non-cellular portion of the blood, as distinguished from the serum, which is obtained after coagulation.
As used herein, “subject” refers to any warm-blooded animal, particularly including a member of the class Mammalia such as, without limitation, humans and non-human primates such as chimpanzees and other apes and monkey species; farm animals such as cattle, sheep, pigs, goats and horses; domestic mammals such as dogs and cats; laboratory animals including rodents such as mice, rats and guinea pigs, and the like. The term does not denote a particular age or sex and, thus, includes adult and newborn subjects, whether male or female.
As used herein, “detecting” refers to methods which include identifying the presence or absence of substance(s) in the sample, quantifying the amount of substance(s) in the sample, and/or qualifying the type of substance. “Detecting” likewise refers to methods which include identifying the presence or absence of breast cancer tissue or breast cancer recurrence in a subject.
“Mass spectrometer” refers to a gas phase ion spectrometer that measures a parameter that can be translated into mass-to-charge ratios of gas phase ions. Mass spectrometers generally include an ion source and a mass analyzer. Examples of mass spectrometers are time-of-flight, magnetic sector, quadrupole filter, ion trap, ion cyclotron resonance, electrostatic sector analyzer and hybrids of these. “Mass spectrometry” refers to the use of a mass spectrometer to detect gas phase ions.
The terms “comprises,” “comprising,” and the like are intended to have the broad meaning ascribed to them in U.S. Patent Law and can mean “includes,” “including” and the like.
It is to be understood that this invention is not limited to the particular component parts of a device described or process steps of the methods described, as such devices and methods may vary. It is also to be understood that the terminology used herein is for purposes of describing particular embodiments only, and is not intended to be limiting. As used in the specification and the appended claims, the singular forms “a,” “an,” and “the” include plural referents unless the context clearly indicates otherwise.
The present disclosure provides a monitoring test based on a panel of biomarkers that have been selected as being effective in detecting the early recurrence of breast cancer. The test has a high degree of clinical sensitivity and clinical specificity and is capable of detecting breast cancer recurrence at a much earlier time point than current monitoring diagnostics. The test is based on biological sample classification methods that utilize a combination of nuclear magnetic resonance (“NMR”) and mass spectrometry (“MS”) techniques. More particularly, the present teachings take advantage of the combination of NMR and two-dimensional gas chromatography-mass spectrometry (“GC×GC-MS”) to identify small molecule biomarkers comprising a set of metabolite species found in patient serum samples. Panels of these identified biomarkers have been found to be effective in detecting recurrent breast cancer at an early stage by comparing identified metabolite signals between recurrence samples and no evidence of disease samples, providing an indication of recurrence more than a year earlier than presently available diagnostic tests or clinical diagnosis.
Metabolite profiling utilizes high-throughput analytical methods such as nuclear magnetic resonance spectroscopy and mass spectrometry for the quantitative analysis of hundreds of small molecules (less than ˜1000 Daltons) present in biological samples. Owing to the complexity of the metabolic profile, multivariate statistical methods are extensively used for data analysis. The high sensitivity of metabolite profiles to even subtle stimuli can provide the means to detect the early onset of various biological perturbations in real time.
In the present study, the metabolite profiling method was used to determine and select metabolites that are sensitive to recurrent breast cancer and are detected in serum samples. A combination of NMR and two-dimensional gas chromatography resolved MS (“2D GC-MS”) methods was utilized to build and validate a model for early breast cancer recurrence detection based on a set of 257 retrospective serial serum samples. The performance of the derived 11 metabolite biomarkers selected for the model compared very favorably with the performance of the currently used molecular marker, CA 27.29, indicating that metabolite profiling methods promise a sensitive test for follow-up surveillance of treated breast cancer patients. In particular, over 60% of the recurring patients could be identified more than 10 months prior to their detection by clinical diagnosis. The resulting test provides a sensitive and specific model for the early detection of recurrent breast cancer.
While this metabolite profile was discovered using a platform of NMR and MS methods, one of ordinary skill in the art will recognize that these identified biomarkers can be detected by alternative methods of suitable sensitivity, such as HPLC, immunoassays, enzymatic assays or clinical chemistry methods.
In one embodiment of the invention, samples may be collected from individuals over a longitudinal period of time. Obtaining numerous samples from an individual over a period of time can be used to verify results from earlier detections and/or to identify an alteration in marker pattern as a result of, for example, pathology.
In one embodiment of the invention, the samples are analyzed without additional preparation and/or separation procedures. In another embodiment of the invention, sample preparation and/or separation can involve, without limitation, any of the following procedures, depending on the type of sample collected and/or types of metabolic products searched: removal of high abundance polypeptides (e.g., albumin and transferrin); addition of preservatives and calibrants; desalting of samples; concentration of sample substances; protein digestions; and fraction collection. In yet another embodiment of the invention, sample preparation techniques concentrate information-rich metabolic products and deplete polypeptides or other substances that would carry little or no information such as those that are highly abundant or native to serum.
In another embodiment of the invention, sample preparation takes place in a manifold or preparation/separation device. Such a preparation/separation device may, for example, be a microfluidics device, such as a cassette. In yet another embodiment of the invention, the preparation/separation device interfaces directly or indirectly with a detection device. Such a preparation/separation device may, for example, be a fluidics device.
In another embodiment of the invention, the removal of undesired polypeptides (e.g., high abundance, uninformative, or undetectable polypeptides) can be achieved using high affinity reagents, high molecular weight filters, column purification, ultracentrifugation and/or electrodialysis. High affinity reagents include antibodies that selectively bind to high abundance polypeptides or reagents that have a specific pH, ionic value, or detergent strength. High molecular weight filters include membranes that separate molecules on the basis of size and molecular weight. Such filters may further employ reverse osmosis, nanofiltration, ultrafiltration and microfiltration.
Ultracentrifugation constitutes another method for removing undesired polypeptides. Ultracentrifugation is the centrifugation of a sample at about 60,000 rpm while monitoring with an optical system the sedimentation (or lack thereof) of particles. Finally, electrodialysis is an electromembrane process in which ions are transported through ion permeable membranes from one solution to another under the influence of a potential gradient. Since the membranes used in electrodialysis have the ability to selectively transport ions having positive or negative charge and reject ions of the opposite charge, electrodialysis is useful for concentration, removal, or separation of electrolytes.
In another embodiment of the invention, the manifold or microfluidics device performs electrodialysis to remove high molecular weight polypeptides or undesired polypeptides. Electrodialysis can be used first to allow only molecules under approximately 30 kD to pass through into a second chamber. A second membrane with a very small molecular weight cutoff (roughly 500 D) allows smaller molecules to exit the second chamber.
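The two-membrane arrangement described above sorts molecules into three fractions by molecular weight. The sketch below illustrates that sorting under simplifying assumptions: each membrane is treated as an ideal sharp cutoff, the first cutoff is taken as approximately 30 kD (an assumed reading of the value in the text), the second as roughly 500 D, and boundary values are assigned to the larger fraction. The function name and the example masses are hypothetical.

```python
def fractionate(masses_da, upper_cutoff_da=30_000.0, lower_cutoff_da=500.0):
    """Sort molecular masses (in daltons) into three fractions.

    blocked:  too large to pass the first membrane
    retained: enters the second chamber but cannot exit through the
              small-cutoff membrane
    exited:   small enough to leave the second chamber
    """
    fractions = {"blocked": [], "retained": [], "exited": []}
    for mass in masses_da:
        if mass >= upper_cutoff_da:
            fractions["blocked"].append(mass)
        elif mass >= lower_cutoff_da:
            fractions["retained"].append(mass)
        else:
            fractions["exited"].append(mass)
    return fractions


# Illustrative masses only (daltons).
example = fractionate([180.2, 524.0, 14_300.0, 66_500.0])
```

Real membranes have gradual rather than sharp cutoffs, and electrodialysis also depends on charge, so this is only a bookkeeping model of the size limits stated in the text.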
Upon preparation of the samples, metabolic products of interest may be separated in another embodiment of the invention. Separation can take place in the same location as the preparation or in another location. In one embodiment of the invention, separation occurs in the same microfluidics device where preparation occurs, but in a different location on the device. Samples can be removed from an initial manifold location to a microfluidics device using various means, including an electric field. In another embodiment of the invention, the samples are concentrated during their migration to the microfluidics device using reverse phase beads and an organic solvent elution such as 50% methanol. This elutes the molecules into a channel or a well on a separation device of a microfluidics device.
Chromatography constitutes another method for separating subsets of substances. Chromatography is based on the differential absorption and elution of different substances. Liquid chromatography (LC), for example, involves the use of a fluid carrier over a non-mobile phase. Conventional LC columns have an inner diameter of roughly 4.6 mm and a flow rate of roughly 1 ml/min. Micro-LC has an inner diameter of roughly 1.0 mm and a flow rate of roughly 40 μl/min. Capillary LC utilizes a capillary with an inner diameter of roughly 300 μm and a flow rate of approximately 5 μl/min. Nano-LC is available with an inner diameter of 50 μm-1 mm and flow rates of 200 nl/min. The sensitivity of nano-LC as compared to HPLC is approximately 3700 fold. Other types of chromatography suitable for additional embodiments of the invention include, without limitation, thin-layer chromatography (TLC), reverse-phase chromatography, high-performance liquid chromatography (HPLC), and gas chromatography (GC).
In another embodiment of the invention, the samples are separated using capillary electrophoresis separation. This will separate the molecules based on their electrophoretic mobility at a given pH (or hydrophobicity). In another embodiment of the invention, sample preparation and separation are combined using microfluidics technology. A microfluidic device is a device that can transport liquids including various reagents such as analytes and elutions between different locations using microchannel structures.
Suitable detection methods are those that have a sensitivity for the detection of an analyte in a biofluid sample of at least 50 μM. In certain embodiments, the sensitivity of the detection method is at least 1 μM. In other embodiments, the sensitivity of the detection method is at least 1 nM.
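The three sensitivity levels above can be read as successively stricter limits on a method's detection limit. The sketch below checks a candidate method against them, assuming that "a sensitivity of at least X" means a detection limit of X or lower, with all concentrations expressed in mol/L. The tier labels, the function name, and the example detection limits are hypothetical.

```python
# Sensitivity tiers from the text, in mol/L, ordered from loosest to strictest.
SENSITIVITY_TIERS_MOLAR = {
    "50 uM": 50e-6,
    "1 uM": 1e-6,
    "1 nM": 1e-9,
}


def tiers_met(detection_limit_molar):
    """Return the labels of every tier whose required sensitivity is met,
    i.e. whose limit is at or above the method's detection limit."""
    return [label for label, limit in SENSITIVITY_TIERS_MOLAR.items()
            if detection_limit_molar <= limit]


# A hypothetical method with a 200 nM detection limit.
example_tiers = tiers_met(2e-7)
```

A method with a 200 nM detection limit would satisfy the 50 μM and 1 μM levels but not the 1 nM level under this reading.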
In one embodiment of the invention, the sample may be delivered directly to the detection device without preparation and/or separation beforehand. In another embodiment of the invention, once prepared and/or separated, the metabolic products are delivered to a detection device, which detects them in a sample. In another embodiment of the invention, metabolic products in elutions or solutions are delivered to a detection device by electrospray ionization (ESI). In yet another embodiment of the invention, nanospray ionization (NSI) is used. Nanospray ionization is a miniaturized version of ESI and provides low detection limits using extremely limited volumes of sample fluid.
In another embodiment of the invention, separated metabolic products are directed down a channel that leads to an electrospray ionization emitter, which is built into a microfluidic device (an integrated ESI microfluidic device). Such integrated ESI microfluidic device may provide the detection device with samples at flow rates and complexity levels that are optimal for detection. Furthermore, a microfluidic device may be aligned with a detection device for optimal sample capture.
Suitable detection devices can be any device or experimental methodology that is able to detect metabolic product presence and/or level, including, without limitation, IR (infrared spectroscopy), NMR (nuclear magnetic resonance), including variations such as correlation spectroscopy (COSY), nuclear Overhauser effect spectroscopy (NOESY), and rotating frame nuclear Overhauser effect spectroscopy (ROESY), and Fourier Transform, 2-D PAGE technology, Western blot technology, tryptic mapping, in vitro biological assay, immunological analysis, LC-MS (liquid chromatography-mass spectrometry), LC-TOF-MS, LC-MS/MS, and MS (mass spectrometry).
For analysis relying on the application of NMR spectroscopy, the spectroscopy may be practiced as one-, two-, or multidimensional NMR spectroscopy or by other NMR spectroscopic examining techniques, among others also coupled with chromatographic methods (for example, as LC-NMR). In addition to the determination of the metabolic product in question, 1H-NMR spectroscopy offers the possibility of determining further metabolic products in the same investigative run. Combining the evaluation of a plurality of metabolic products in one investigative run can be employed for so-called “pattern recognition”. Typically, the strength of evaluations and conclusions that are based on a profile of selected metabolites, i.e., a panel of identified biomarkers, is improved compared to the isolated determination of the concentration of a single metabolite.
For immunological analysis, for example, the use of immunological reagents (e.g., antibodies), generally in conjunction with other chemical and/or immunological reagents, induces reactions or provides reaction products which then permit detection and measurement of the whole group, a subgroup or a subspecies of the metabolic product(s) of interest. Suitable immunological detection methods include those with high selectivity and high sensitivity (10-1000 pg, or 0.02-2 pmoles; see, e.g., Baldo, B. A., et al. 1991, A Specific, Sensitive and High-Capacity Immunoassay for PAF, Lipids 26(12): 1136-1139) that are capable of detecting 0.5-21 ng/ml of an analyte in a biofluid sample (Cooney, S. J., et al., Quantitation by Radioimmunoassay of PAF in Human Saliva, Lipids 26(12): 1140-1143).
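The sensitivity range quoted above is given both as a mass (10-1000 pg) and as an amount (0.02-2 pmoles), which together imply a molar mass for the analyte. The sketch below works through that arithmetic and then converts the quoted 0.5-21 ng/ml range to molar concentration. The molar mass of about 500 g/mol is inferred from the quoted figures rather than stated in the text, and the function names are illustrative.

```python
def implied_molar_mass(mass_pg, amount_pmol):
    """Molar mass in g/mol implied by a mass and an amount.

    Picograms per picomole equals grams per mole because both
    prefixes scale by the same factor of 1e-12."""
    return mass_pg / amount_pmol


def ng_per_ml_to_nanomolar(conc_ng_per_ml, molar_mass_g_per_mol):
    """Convert a mass concentration in ng/ml to nmol/L.

    1 ng/ml equals 1 ug/L; dividing by g/mol gives umol/L;
    multiplying by 1000 gives nmol/L."""
    return conc_ng_per_ml / molar_mass_g_per_mol * 1000.0


# Both ends of the quoted sensitivity range imply the same molar mass.
molar_mass_low = implied_molar_mass(10.0, 0.02)
molar_mass_high = implied_molar_mass(1000.0, 2.0)

# Quoted concentration range expressed in nmol/L at that molar mass.
low_nm = ng_per_ml_to_nanomolar(0.5, 500.0)
high_nm = ng_per_ml_to_nanomolar(21.0, 500.0)
```

Under the inferred molar mass, 0.5-21 ng/ml corresponds to roughly 1-42 nmol/L, which places such immunoassays near the nanomolar sensitivity level.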
In one embodiment of the invention, mass spectrometry is relied upon to detect metabolic products present in a given sample. In another embodiment of the invention, an ESI-MS detection device is used. Such an ESI-MS device may utilize a time-of-flight (TOF) mass spectrometry system. Quadrupole mass spectrometry, ion trap mass spectrometry, and Fourier transform ion cyclotron resonance (FTICR-MS) are likewise contemplated in additional embodiments of the invention.
In another embodiment of the invention, the detection device interfaces with a separation/preparation device or microfluidic device, which allows for quick assaying of many, if not all, of the metabolic products in a sample. A mass spectrometer may be utilized that will accept a continuous sample stream for analysis and provide high sensitivity throughout the detection process (e.g., an ESI-MS). In another embodiment of the invention, a mass spectrometer interfaces with one or more electrosprays, two or more electrosprays, three or more electrosprays or four or more electrosprays. Such electrosprays can originate from a single microfluidic device or from multiple microfluidic devices.
In another embodiment of the invention, the detection system utilized allows for the capture and measurement of most or all of the metabolic products introduced into the detection device. In another embodiment of the invention, the detection system allows for the detection of change in a defined combination (“profile,” “panel,” “ensemble,” or “composite”) of metabolic products.
In the Examples, a combination of NMR and 2D GC×GC-MS methods was used to analyze the metabolite profiles of 257 retrospective serial serum samples from 56 previously diagnosed and surgically treated breast cancer patients. Of these, 116 serial serum samples were from 20 patients with recurrent breast cancer and 141 serum samples were from 36 patients with no clinical evidence of the disease during the sample collection period. NMR and GC×GC-MS data were analyzed by multivariate statistical methods to compare identified metabolite signals between the recurrence and no evidence of disease samples. Eleven metabolite markers (7 from NMR and 4 from GC×GC-MS) were selected from an analysis of all patient samples by a logistic regression model using 5-fold cross validation. A PLS-DA model built using these markers with leave one out cross validation provided a sensitivity of 86% and a specificity of 84% (AUROC>0.85). Strikingly, over 60% of the patients could be correctly predicted to have recurrence 10 months (on average) before the recurrence was diagnosed clinically, representing a large improvement over the current breast cancer monitoring assay CA 27.29. To the best of our knowledge, this is the first study to develop and pre-validate a prediction model for early detection of recurrent breast cancer based on a metabolic profile. In particular, the combination of two advanced analytical methods, NMR and MS, provides a powerful approach for the early detection of recurrent breast cancer.
Two hundred fifty-seven serum samples (each ˜400 microliters (μL)) from 56 breast cancer patients were obtained from the M.D. Anderson Cancer Center (Houston, Tex.). These banked serum samples were collected between 1997 and 2003, with an average of 5 serial time-course samples per patient, from female volunteers (ages 40-75) who were breast cancer patients enrolled at M.D. Anderson Cancer Center. Follow-up investigations by oncologists at M.D. Anderson for breast cancer recurrence were based on a combination of factors including CA 27.29, CEA, and/or CA 125 IVD results, patient symptoms, initial breast cancer stage, hormone receptor and lymph node status. Of the 56 patients, breast cancer recurred in 20, either locally or in a distant organ, and the remaining 36 had no evidence of disease (NED) recurrence during the sampling period as well as 2 years afterward.
A total of 116 serum samples were obtained from recurrent breast cancer patients, comprising 67 samples collected earlier than 3 months before the recurrence was clinically diagnosed (Pre), 18 samples collected within ±3 months of recurrence (Within), and 31 samples collected later than 3 months after diagnosed recurrence (Post). The remaining 141 samples represented the cases in which the patient remained NED for at least 2 years beyond their sample collection period. Nearly all samples were evaluated for CA 27.29 values at the time of collection and therefore could be used for comparison. Study samples were maintained at −80° C. from collection until their transfer over dry ice to the evaluation laboratory at Purdue University, where they were again stored frozen at −80° C. until this study was conducted. Serum samples and accompanying clinical data were appropriately de-identified before transfer into this study. Table 1 summarizes the clinical parameters and demographic characteristics of the cancer patients.
After thawing, 200 microliter (“μL”) serum was mixed with 330 μL D2O and 5 μL sodium azide (12.3 nmol). Sample solutions were vortexed for 60 seconds (sec.) and centrifuged for 5 minutes (min.) at 8000 revolutions per minute (RPM). Thereafter, 530 μL aliquots were transferred into standard 5 millimeter (mm) NMR tubes for NMR measurements. An external capillary tube (a glass stem coaxial insert, OD 2 mm) containing 60 μL 0.012% 3-(trimethylsilyl) propionic-(2,2,3,3-d4) acid sodium salt (“TSP”) solution in D2O was used as a chemical shift frequency standard (δ=0.00 ppm) and for locking purposes. All NMR experiments were carried out at 25° C. on a Bruker DRX 500 Megahertz (“MHz”) spectrometer equipped with a cryogenic probe and triple-axis magnetic field gradients. Two 1H NMR spectra were measured for each sample, a standard 1D NOESY (Nuclear Overhauser Effect Spectroscopy) and CPMG (Carr-Purcell-Meiboom-Gill) pulse sequences coupled with water pre-saturation. For each spectrum, 32 transients were collected using 32 k data points and a spectral width of 6000 Hz. An exponential weighting function corresponding to 0.3 Hz line broadening was applied to the free induction decay (FID) before applying Fourier transformation. Each peak was integrated and then normalized using the value of the total NMR spectral intensity (total sum) excluding the water and urea peaks. After phasing and baseline correction using Bruker XWINNMR software version 3.5, the processed data were saved in ASCII format for further analysis.
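The two numerical processing steps described above, exponential weighting of the FID and total-sum normalization, can be sketched as follows. This is only an illustrative sketch and not the processing software used in the study: the function names are hypothetical, the dwell time is assumed to be the reciprocal of the spectral width, and the excluded window of 4.5 to 6.0 ppm for the water and urea signals is borrowed from the data analysis description given later in this disclosure.

```python
import math

def exponential_window(n_points, spectral_width_hz, line_broadening_hz):
    """Weights exp(-pi * LB * t) applied to the FID before Fourier transformation."""
    dwell_s = 1.0 / spectral_width_hz  # assumption: one complex point per dwell period
    return [math.exp(-math.pi * line_broadening_hz * k * dwell_s)
            for k in range(n_points)]

def total_sum_normalize(ppm, intensity, exclude=(4.5, 6.0)):
    """Divide each intensity by the total intensity outside the excluded ppm window."""
    total = sum(y for x, y in zip(ppm, intensity)
                if not (exclude[0] <= x <= exclude[1]))
    if total == 0:
        raise ValueError("total spectral intensity is zero")
    return [y / total for y in intensity]

# Toy values only; they are not data from the study.
weights = exponential_window(4, spectral_width_hz=6000.0, line_broadening_hz=0.3)
ppm = [1.0, 3.0, 5.0, 7.0]
intensity = [2.0, 6.0, 100.0, 2.0]   # the 5.0 ppm point stands in for the water peak
normalized = total_sum_normalize(ppm, intensity)
```

With this normalization the retained regions of every spectrum sum to one, so peak integrals from different samples can be compared on a common scale.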
Protein precipitation was performed for each sample by mixing 200 μL serum with 400 μL methanol in a 1.5 mL Eppendorf tube. The mixture was briefly vortexed, and then held at −20° C. for 30 min. The samples were centrifuged while still cold at 14,000 RPM for 10 min. The upper layer (supernatant) was transferred into another Eppendorf tube for further use. Chloroform (200 μL) was mixed with the protein pellet and centrifuged at 14,000 RPM for another 10 min. After centrifugation, the aliquot was transferred and combined with the methanol supernatant solution from the previous step. The resultant mixture was lyophilized for 5 hrs using a Speed Vac (Savant AES2010) to remove the solvents. Each dried sample was then dissolved in 50 μL of anhydrous pyridine and, after a brief vortexing, was sonicated for approximately 20 min. Twenty μL of this solution was mixed with 20 μL of the derivatizing reagent MTBSTFA (N-methyl-N-(tert-butyldimethylsilyl)trifluoroacetamide) (Regis, Morton Grove, Ill.). Addition of this derivatizing agent, which contains an active tert-butyldimethylsilyl group, derivatizes functional groups such as the hydroxyl, amine or carboxylic acid groups of the metabolites present in the biological sample. The samples were then incubated at 60° C. for 1 hr to effect the reaction. After derivatization, the solution contents were transferred to a glass GC (auto sampler) vial for the analysis.
Two dimensional GC×GC-MS analysis was performed using a Pegasus 4D system (LECO, St. Joseph, Mich.) consisting of an Agilent 6890 gas chromatograph (Agilent Technologies, Palo Alto, Calif.) coupled to a Pegasus time of flight mass spectrometer. The first dimension chromatographic separation was performed on a DB-5 capillary column (30 m×0.25 mm inner diameter, 0.25 μm film thickness). At the end of the first column the eluted samples were frozen by cryotrapping for a period of 4 s and then quickly heated and sent to the second dimension chromatographic column (DB-17, 1 m×0.1 mm inner diameter, 0.10 μm film thickness). The first column temperature ramp began at 50° C. with a hold time of 0.2 min, after which the temperature was increased to 300° C. at a rate of 10° C./min and held at this temperature for 5 min. The second column temperature ramp was 20° C. higher than the corresponding first column temperature ramp with the same rate and hold time. The second dimension separation time was set for 4 sec. High purity helium was used as a carrier gas at a flow rate of 1.0 mL/min. The temperatures for the inlet and transfer line were set at 280° C., and the ion source was set at 200° C. The detection and filament bias voltages were set to 1600 V and −70 V, respectively.
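As a worked check of the temperature program described above, the following sketch computes the oven temperatures and the length of the program. It is illustrative only: the function names are hypothetical, the program length is assumed to be the sum of the stated hold, ramp and final hold segments, and the count of modulation cycles assumes that the 4 sec second dimension separation repeats over the entire program.

```python
def first_column_temp_c(t_min, start_c=50.0, hold_min=0.2,
                        rate_c_per_min=10.0, final_c=300.0):
    """First column temperature at time t_min for the stated ramp."""
    if t_min <= hold_min:
        return start_c
    return min(final_c, start_c + rate_c_per_min * (t_min - hold_min))

def second_column_temp_c(t_min, offset_c=20.0):
    """Second column runs 20 degrees C above the first column program."""
    return first_column_temp_c(t_min) + offset_c

def program_length_min(start_c=50.0, hold_min=0.2, rate_c_per_min=10.0,
                       final_c=300.0, final_hold_min=5.0):
    """Initial hold + ramp time + final hold, in minutes."""
    return hold_min + (final_c - start_c) / rate_c_per_min + final_hold_min

total_min = program_length_min()        # 0.2 + 25 + 5 minutes
total_s = round(total_min * 60)         # whole seconds, avoids float floor issues
modulations = total_s // 4              # assumed: one modulation every 4 s
```

Under these assumptions the program lasts about 30.2 minutes, which corresponds to 453 second dimension separations per injection.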
Mass spectra ranging from 50 to 600 m/z were collected at a rate of 50 Hz. LECO ChromaTOF software (version 4.10) was used for automatic peak detection and mass spectrum deconvolution. The NIST MS database (NIST MS Search 2.0, NIST/EPA/NIH Mass Spectral Library; NIST 2002) was used for data processing and peak matching, and the mass spectra of all identified compounds were compared with the standard mass spectra in this database. Further, the identified biomarker candidates were confirmed from the mass spectra and retention times of authentic commercial samples purchased and run under identical experimental conditions.
The NMR spectrum from each sample was aligned with reference to the 3-(trimethylsilyl) propionic-(2,2,3,3-d4) acid sodium salt (“TSP”) signal at 0 ppm. Spectral regions within the range of 0.5 to 9.0 ppm were analyzed after excluding the region between 4.5 and 6.0 ppm that contained the residual water peak and urea signal. Twenty-two spectral regions, corresponding to biomarkers initially identified in a study on early breast cancer detection, were selected as biomarker candidates for further analysis. The statistical significance of each metabolite in the selected regions was determined by calculating the P-values using Student's t-test in the training set. To further enhance the pool of metabolites, 18 additional metabolites were identified for targeted MS analysis based on the highest difference in peak intensity between recurrence and NED samples (Table 2). A software program was developed in-house to extract these metabolite signals from the GC×GC-MS datasets. Based on the input value of m/z and a retention time range, the program integrates chromatography peaks for each metabolite after the metabolite's spectrum is matched to the characteristic experimental mass spectrum from the standard NIST library available in the LECO ChromaTOF software package (v1.61).
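The in-house extraction program is not reproduced in this disclosure. The following is a minimal sketch of the peak integration step only, assuming that the ion trace for the chosen m/z value has already been extracted and that a trapezoidal rule is used; the function name, the integration rule and the toy data are all assumptions made for illustration, and the spectral matching step is omitted.

```python
def integrate_peak(times_s, intensities, rt_start_s, rt_end_s):
    """Trapezoidal area of an ion trace between two retention times (inclusive)."""
    points = [(t, y) for t, y in zip(times_s, intensities)
              if rt_start_s <= t <= rt_end_s]
    area = 0.0
    # Sum the trapezoids formed by each pair of adjacent points in the window.
    for (t0, y0), (t1, y1) in zip(points, points[1:]):
        area += 0.5 * (y0 + y1) * (t1 - t0)
    return area

# Toy ion trace for a single m/z value; not data from the study.
times_s = [0.0, 1.0, 2.0, 3.0, 4.0]
intensities = [0.0, 0.0, 10.0, 0.0, 0.0]
area = integrate_peak(times_s, intensities, rt_start_s=1.0, rt_end_s=3.0)
```

Restricting the integration to a retention time range keeps neighboring peaks that share the same m/z value out of the integral.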
The complete set of biomarkers identified using the present method consists of 3-hydroxybutyrate, acetoacetate, alanine, arginine, asparagine, choline, creatinine, glucose, glutamic acid, glutamine, glycine, formate, histidine, isobutyrate, isoleucine, lactate, lysine, methionine, N-acetylaspartate, proline, threonine, tyrosine, valine, 2-hydroxy butanoic acid, hexadecanoic acid, aspartic acid, 3-methyl-2-hydroxy-2-pentenoic acid, dodecanoic acid, 1,2,3-trihydroxypropane, beta-alanine, alanine, phenylalanine, 3 hydroxy-2-methyl-butanoic acid, 9,12-octadecadienoic acid, acetic acid, N-acetylglycine, glycine, nonanedioic acid, nonanoic acid, and pentadecanoic acid (Table 2).
Further analysis was performed on a subset of the biomarkers, as illustrated in the box and whisker plots of
A further subset, or panel, of biomarkers was selected for the development of prediction models and validation of the models, consisting of the metabolites 3-hydroxybutyrate, choline, glutamic acid, formate, histidine, lactate, proline, tyrosine, 3 hydroxy-2-methyl-butanoic acid, N-acetylglycine and nonanedioic acid.
Alternatively, a subset, or panel, of eight biomarkers was selected, consisting of the metabolites choline, glutamic acid, formate, histidine, proline, 3 hydroxy-2-methyl-butanoic acid, N-acetylglycine, and nonanedioic acid.
In other embodiments, a subset, or panel, of seven biomarkers was selected, consisting of the metabolites 3-hydroxybutyrate, choline, formate, histidine, lactate, proline, and tyrosine.
In order to select the metabolites with the highest scores for developing the prediction model, samples from the NED, post and within recurrence groups were used. Pre-recurrence samples were omitted to avoid any ambiguity in determining the correct disease status prior to clinical diagnosis. Post and within recurrence vs. NED samples were divided into five cross validation (CV) groups. Multivariate analysis using a logistic regression model of the 22 NMR and 18 GC×GC-MS detected metabolite signals was applied to 4 CV groups, and the resulting model was used to predict the class membership of the 5th CV group. The output of the logistic regression procedure is a ranked set of markers. The best combination of NMR and GC markers, i.e., the combination that resulted in a model with the lowest misclassification error rate and the highest predictive power, was retained and used to build the final prediction model using all samples.
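The cross validation scheme described above (fit on four CV groups, predict the fifth) can be sketched as follows. This is a simplified illustration and not the procedure used in the study: the logistic regression is fitted by plain gradient descent, the learning rate, epoch count and random seed are arbitrary assumptions, the data are synthetic, and the marker ranking step is not shown.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_logistic(X, y, lr=0.1, epochs=500):
    """Fit weights and intercept by batch gradient descent on the log-loss."""
    w, b, n = [0.0] * len(X[0]), 0.0, len(X)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def predict(model, X):
    w, b = model
    return [1 if sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) >= 0.5 else 0
            for xi in X]

def cv_error_rate(X, y, k=5, seed=0):
    """Misclassification rate over k held-out groups."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]          # k disjoint CV groups
    errors = 0
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]  # the other k-1 groups
        model = fit_logistic([X[i] for i in train], [y[i] for i in train])
        preds = predict(model, [X[i] for i in held_out])
        errors += sum(p != y[i] for p, i in zip(preds, held_out))
    return errors / len(X)

# Synthetic, well separated single-marker data (0 = NED, 1 = recurrence).
X = [[-2.0 - 0.1 * i] for i in range(10)] + [[2.0 + 0.1 * i] for i in range(10)]
y = [0] * 10 + [1] * 10
err = cv_error_rate(X, y)
```

Repeating this loop for different candidate marker combinations and keeping the combination with the lowest error rate corresponds to the selection criterion stated above.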
Based on their performance, eleven metabolite markers (7 from NMR and 4 from GC×GC-MS) were selected for model building. NMR and MS data for these markers were imported into Matlab software (Mathworks, MA) installed with the PLS toolbox (Eigenvector Research, Inc, version 4.0) for PLS-DA modeling. Leave one out cross validation was chosen, and the number of latent variables (LV) was selected according to the root mean square error of the cross validation (RMSECV). The R statistical package (version 2.8.0) was used to generate the receiver operating characteristics (ROC) curves. The sensitivity, specificity and the area under the receiver operating characteristic curve (AUROC) of the model were calculated and compared.
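For reference, the AUROC of a set of model scores can be computed without plotting the ROC curve by using its rank-based definition: the fraction of recurrence/NED sample pairs in which the recurrence sample receives the higher score, with ties counted as one half. The sketch below is illustrative only; it is not the R code used in the study, and the scores and labels are toy values.

```python
def auroc(scores, labels):
    """Area under the ROC curve from pairwise comparisons (1 = recurrence, 0 = NED)."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0      # recurrence sample ranked above NED sample
            elif p == n:
                wins += 0.5      # tie counts as half
    return wins / (len(pos) * len(neg))

# Toy scores only; not results from the study.
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]
labels = [1, 1, 1, 0, 0, 0]
area = auroc(scores, labels)     # 8 of the 9 pairs are ordered correctly
```

An AUROC of 0.5 corresponds to a model with no discriminating power, and a value of 1.0 to perfect separation of the two classes.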
The performance of these markers was also assessed based on the time of sample collection, before or after the clinical diagnosis of the recurrence (post recurrence vs. NED, within recurrence vs. NED, and pre-recurrence vs. NED). The class membership of each sample was determined and compared to the patient's status. The ROC curve was generated and AUROC, sensitivity, and specificity were calculated. The scores from the model were scaled to yield a range of 0-100, and the cutoff value for recurrence status was determined by a judicious choice between sensitivity and specificity. The performance of the model with reference to the initial stage of the breast cancer, ER/PR status, and the site of recurrence was also assessed.
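The score scaling and cutoff selection described above can be sketched as follows. The disclosure does not state how the scores were scaled or how the cutoff was chosen, so this sketch assumes min-max scaling and, as a stand-in for the "judicious choice", picks the cutoff that maximizes Youden's J statistic (sensitivity + specificity − 1). The function names and toy scores are hypothetical.

```python
def scale_to_0_100(scores):
    """Min-max scale raw model scores onto the range 0-100 (assumed scaling)."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        raise ValueError("scores are constant")
    return [100.0 * (s - lo) / (hi - lo) for s in scores]

def sens_spec(scaled, labels, cutoff):
    """Sensitivity and specificity when scores >= cutoff are called recurrence."""
    n_pos = sum(1 for l in labels if l == 1)
    n_neg = len(labels) - n_pos
    tp = sum(1 for s, l in zip(scaled, labels) if l == 1 and s >= cutoff)
    tn = sum(1 for s, l in zip(scaled, labels) if l == 0 and s < cutoff)
    return tp / n_pos, tn / n_neg

def youden_cutoff(scaled, labels):
    """Cutoff among observed scores that maximizes sensitivity + specificity - 1."""
    return max(sorted(set(scaled)),
               key=lambda c: sum(sens_spec(scaled, labels, c)) - 1.0)

# Toy scores only (1 = recurrence, 0 = NED); not results from the study.
raw = [-1.0, 0.0, 0.5, 1.0]
labels = [0, 0, 1, 1]
scaled = scale_to_0_100(raw)
cutoff = youden_cutoff(scaled, labels)
sensitivity, specificity = sens_spec(scaled, labels, cutoff)
```

Raising the cutoff above the selected value trades sensitivity for specificity, and lowering it does the reverse, so a different operating point can be chosen when one kind of error is more costly than the other.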
Finally, the performance of the NMR and MS metabolite markers was also tested by splitting the samples randomly into two parts, training (141 samples) and testing (116 samples) sets and analyzed as illustrated in
NMR spectra of breast cancer serum samples obtained using the CPMG sequence were devoid of signals from macromolecules and clearly showed signals for a large number of small molecules including sugars, amino acids and carboxylic acids. A representative NMR spectrum from a post recurrence patient is shown in
Initial data analysis was focused on testing the performance of the 22 NMR and 18 MS metabolites and, from these data, selecting the markers with the highest rank to maximize diagnostic accuracy. Making use of a variable selection protocol and logistic regression analysis, a subset of 11 metabolites (7 identified by NMR and 4 identified by MS) was selected based on their highest ranking and predictive accuracy to form a test panel of biomarkers. Table 3, below, shows the list of 11 biomarkers and their P-values for Pre vs. NED, and Within and Post (=“Recurrence”) vs. NED comparisons using all samples. In general, the individual P-values of these markers for the Within and Post (=“Recurrence”) vs. NED comparisons were quite low, although there were four exceptions that were nevertheless highly ranked by logistic regression. In two of these four cases, the identified metabolites showed low P-values for either Within versus NED or Post versus NED, but not both.
Subsequent analysis was based on the 11 NMR/MS biomarkers listed in Table 3, above. The performance of the metabolite markers in classifying the recurrence of breast cancer was tested both individually and collectively. Box and whisker plots for the individual biomarkers are shown in
A comparison of the metabolite profiling results with the CA 27.29 data that had been obtained for the same samples is shown in Table 4, below, showing a large improvement in sensitivity that is provided by a preferred embodiment of the present invention over the currently available in vitro diagnostic (“IVD”) test, CA 27.29.
Subsequently, the predictive power of the model for early detection of breast cancer recurrence was evaluated. All samples from the recurrent breast cancer patients were grouped together with respect to the time of diagnosis (t=0) for each patient. Samples within 5 months of one another were grouped, and an average value in months was assigned to each group. The number of months and sign represent the average time at which the samples were collected before (i.e., negative time) or after (positive time) the clinical diagnosis. The percentage of patients for which the recurrence was correctly diagnosed was calculated using the model
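The time grouping described above can be sketched as follows. The exact grouping rule is not specified in this disclosure, so the sketch assumes that samples are sorted by time relative to diagnosis and that a new group is started whenever a sample lies more than 5 months after the first sample of the current group. The function name and the sample values are hypothetical.

```python
def group_by_time(samples, window_months=5.0):
    """Group (months_from_diagnosis, correctly_predicted) pairs into time windows.

    Returns one (mean_months, percent_correct) tuple per group.
    """
    groups = []
    for t, hit in sorted(samples):
        # Assumed rule: stay in the current group while within the window
        # measured from that group's earliest sample.
        if groups and t - groups[-1][0][0] <= window_months:
            groups[-1].append((t, hit))
        else:
            groups.append([(t, hit)])
    summary = []
    for g in groups:
        mean_t = sum(t for t, _ in g) / len(g)
        pct = 100.0 * sum(1 for _, h in g if h) / len(g)
        summary.append((mean_t, pct))
    return summary

# Toy samples only; negative times are before the clinical diagnosis.
samples = [(-12.0, False), (-10.0, True), (-8.0, True),
           (-1.0, True), (2.0, True), (9.0, True)]
summary = group_by_time(samples)
```

Each tuple in the result pairs the average collection time of a group with the percentage of its samples that the model called as recurrence, which is the quantity described above as a function of time.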
Increasing the threshold value to 54 led to an increase in specificity to ˜94%, and concomitantly, a decrease in sensitivity to 68%. The threshold value for 98% specificity was 65 and for 94% sensitivity, 41.
Separately, the model was also tested on the recurrent breast cancer patients based on the stage of the cancer at the initial diagnosis, the type of recurrence, estrogen ER,
Additional analysis based on the prediction model was derived from variable selection using a training sample set (
As shown in
This study illustrates an embodiment of a metabolomics based method for the early detection of breast cancer recurrence. The investigation makes use of a combination of analytical techniques, NMR and MS, and advanced statistics to identify a group of metabolites that are sensitive to the recurrence of breast cancer. We have shown that the new method distinguishes recurrence from no evidence of disease with significantly improved sensitivity and specificity. Using the predictive model, the recurrence in nearly 60% of the patients was detected as early as 10 to 18 months before the recurrence was diagnosed based on the conventional methods.
Although perturbation in the metabolite levels was detected for all 40 metabolites that were used in the initial analysis (Table 2, above), several groups of a small number of metabolites, chosen based on the highest ranking and different cut-off levels, provided improved models. In particular, the panel of 11 metabolites (7 from NMR and 4 from GC; Table 3, above) contributed significantly to distinguishing recurrence from NED. Further, the predictive model derived from these 11 metabolites performed significantly better in terms of both sensitivity and specificity when compared to those derived using individual metabolites or a group of metabolites derived from a single analytical method, NMR or MS. With regard to early detection of the recurrence (
Evaluation of other models with panels of fewer metabolites indicated that these embodiments could also provide useful results. The AUROC for an eight biomarker panel consisting of the metabolites choline, glutamic acid, formate, histidine, proline, 3 hydroxy-2-methyl-butanoic acid, N-acetylglycine, and nonanedioic acid (four metabolites detected by NMR and four metabolites detected by GC×GC-MS) was 0.86, whereas a seven biomarker panel consisting of the metabolites 3-hydroxybutyrate, choline, formate, histidine, lactate, proline, and tyrosine (using seven metabolites detected by NMR alone) had an AUROC of 0.80. These results demonstrate that individual biomarkers within a panel that is useful for detecting the recurrence of breast cancer may be deleted or substituted by other compounds of Table 2 and still retain utility for detecting the recurrence of breast cancer.
The embodiment of the panel of eleven selected biomarkers represents sharp changes in metabolic activity of several pathways associated with breast cancer, including amino acid metabolism (histidine, proline, tyrosine and threonine), phospholipid metabolism (choline) and fatty acid metabolism (nonanedioic acid). Numerous investigations of metabolic aspects of tumorigenesis have shown the association of a majority of these metabolites with breast cancer. As shown in
While an exemplary embodiment incorporating the principles of the present disclosure has been disclosed hereinabove, the present disclosure is not limited to the disclosed embodiments. Instead, this application is intended to cover any variations, uses, or adaptations of the disclosure using its general principles. Further, this application is intended to cover such departures from the present disclosure as come within known or customary practice in the art to which this disclosure pertains and which fall within the limits of the appended claims.
This application is a continuation of co-pending international patent application PCT/US2011/029681, filed on Mar. 23, 2011, and claims benefit of U.S. provisional patent application Ser. No. 61/316,679, filed on Mar. 23, 2010. The entire disclosures of both applications are incorporated herein by reference.
This invention was made with United States government support under R01 GM085291 from the National Institute of General Medical Sciences. The United States government has certain rights to this invention.
Number | Date | Country
---|---|---
61316679 | Mar 2010 | US

Relation | Number | Date | Country
---|---|---|---
Parent | PCT/US2011/029681 | Mar 2011 | US
Child | 13624042 | | US