Serum glycan composition and structure are well known to be altered in many different types of cancer.1-4 In fact, for over a decade now, global blood plasma/serum (P/S) glycomics has held out the promise of new, non-invasive cancer markers derived from a small volume of this easily accessible biofluid.5,6 Modern analytical methods for quantifying the relative abundance of different glycans in P/S vary widely7-9, ranging from multiplexed capillary gel electrophoresis with laser-induced fluorescence (a DNA sequencer-adapted method)10,11 to hydrophilic interaction liquid chromatography (HILIC)12 or porous graphitized carbon (PGC)13,14 chromatography interfaced with electrospray ionization-based mass spectrometers or as a means of prefractionation prior to analysis by MALDI-MS15—for which glycans are generally permethylated prior to analysis.9,16
Nearly all approaches employed in P/S glycomics focus on the analysis of intact glycans, most commonly N-linked glycans (generally to the exclusion of O-linked glycans and glycolipids). Quite commonly, accounts of such studies that are focused on cancer conclude by taking a wide-angle view of all intact glycans that were altered in cancer relative to a healthy or benign disease state and reporting unique glycan features such as core fucosylation, bisecting N-acetylglucosamine (GlcNAc), and α2-6 sialylation that were found increased or decreased in cancer.6 Often these features are then directly connected to the activity of specific glycosyltransferases.17 In 2013, Borges et al.18 developed a molecularly bottom-up approach to serum glycomics, which, following permethylation of an unfractionated P/S sample, employs the principles of glycan linkage analysis to break down all P/S glycans into monosaccharides in a way that maintains information about which hydroxyl groups of each monosaccharide were connected to other carbohydrate residues in the original glycan polymer18-20 (
Urothelial cell carcinoma (UCC), or bladder cancer, is one of the top ten causes of cancer deaths annually [1]. From a clinical perspective, there are two major forms of this cancer: 1) non-muscle-invasive bladder cancer (NMIBC; stages pTa/pT1/pTis) and 2) muscle-invasive bladder cancer (MIBC; stages pT2+). Early detection of bladder cancer is very important; patients with non-muscle-invasive tumors have a much higher 5-year survival rate: 88% for NMIBC patients relative to 41% for MIBC patients [2]. Yet regardless of the stage at which it is diagnosed, a high recurrence rate is one of the essential characteristics of this cancer [3]. Therefore, even if diagnosed at early stages and treated, former bladder cancer patients need to be monitored frequently. Currently, common methods for detecting bladder cancer and monitoring for its recurrence include cystoscopy (which is invasive and expensive [4]), urine cytology (which has low sensitivity for low-grade bladder cancer [5]), and computed tomography (CT) screening (which may not detect small tumors [6]). Accordingly, there has been a wide search for new biomarkers that are noninvasive, cost effective, and can outperform cytology [7-10].
At present, there are no clinically employed serum-based markers for monitoring patients after their treatment. Targeted glycomics, particularly when combined with other well-defined markers and risk stratification models, represents a promising source for a new generation of bladder cancer markers [11]. Some evidence toward this end, based on the detection of the sialyl Lewis a antigen [12, 13] and analysis of intact N-glycans [14, 15] in blood plasma/serum (P/S) from bladder cancer patients, has been obtained. Aberrant glycosylation is a universal feature of cancer [16], where it appears to enable tumor cells to avoid innate immune detection [17]. The changes in structure and abundance of glycans are often caused by dysregulated glycosyltransferase (GT) activity [16]. Thus, conceptually, a targeted glycan analysis technique that could provide one-to-one surrogate data for abnormal GT activity using routinely available clinical samples and that relied upon existing clinical technology could be quite valuable.
In 2013, one of the present inventors developed a molecularly bottom-up approach called glycan node analysis that, unlike other approaches used in P/S glycomics, focuses on the analysis of monosaccharide and linkage-specific glycan “nodes” instead of intact glycans [18-21]. It does this by employing the principles and processing chemistry of glycan methylation analysis (i.e., linkage analysis;
Lung cancer accounts for approximately 25% of all U.S. cancer deaths, making it the leading cause of U.S. cancer deaths.1 More than half of lung cancer patients are diagnosed at an advanced stage: about 33% and 40% of lung cancer patients are diagnosed at stage IIIB and IV, respectively,2 primarily due to a lack of early stage symptoms. The five-year survival rate of stage IV patients is only ~5%.1 Conversely, if lung cancer can be detected before it escapes the lungs, five-year survival rates usually exceed 50%.1 Therefore, to improve the outcomes of lung cancer patients, a major clinical priority is to detect lung cancer early. Recently, the National Lung Screening Trial (NLST) applied low dose chest computed tomography (LDCT) in older, high-risk individuals and achieved a 20% reduction in lung cancer mortality. Yet the positive screening rate in this study was 24.2%, of which 96.4% were false-positive results.3 The high false-positive rate may lead to additional clinical tests, emotional distress, and unnecessary treatments, as well as unnecessary time and costs. Thus, a reliable and highly specific noninvasive blood test could help to reduce the false-positive and overdiagnosis rate of CT scans.
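To make the scale of the false-positive problem concrete, the two NLST percentages quoted above can be combined arithmetically. The following is a minimal sketch only: the cohort of 10,000 screens and the function name `screen_breakdown` are hypothetical choices made for illustration and are not taken from the NLST report.

```python
# Percentages quoted in the text for low-dose CT screening in the NLST.
POSITIVE_SCREEN_RATE = 0.242     # fraction of screens reported as positive
FALSE_POSITIVE_FRACTION = 0.964  # fraction of positive screens that were false positives


def screen_breakdown(n_screens, positive_rate, false_fraction):
    """Split a number of screens into positive, false-positive and true-positive counts."""
    positives = n_screens * positive_rate
    false_pos = positives * false_fraction
    true_pos = positives - false_pos
    return positives, false_pos, true_pos


# Hypothetical cohort size, chosen only to make the proportions easy to read.
positives, false_pos, true_pos = screen_breakdown(
    10_000, POSITIVE_SCREEN_RATE, FALSE_POSITIVE_FRACTION
)
print(f"positive screens:      {positives:.0f}")
print(f"false-positive results: {false_pos:.0f}")
print(f"true-positive results:  {true_pos:.0f}")
```

Under these assumptions, roughly 2,420 of 10,000 screens would be positive, of which about 2,333 would be false positives and about 87 true positives, which is the gap a highly specific blood test would be intended to narrow.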
Biomarkers from easily accessible biofluids, such as blood plasma or serum (P/S), could potentially be used as a noninvasive and cost-effective way to improve lung cancer diagnosis and screening. Numerous P/S biomarkers for lung cancer have been extensively studied, including proteins (such as cytokeratin 19 fragments4,5 and carcinoembryonic antigen6,7), miRNAs (such as miR-348 and miR-1829,10), methyl-DNA (such as P1611 and BRMS112), and circulating tumor cells.13 However, biomarkers with improved sensitivity and specificity are still needed.
Aberrant glycosylation is a well-established hallmark of cancer and seems to facilitate the metastasis of various tumor cells.14 Thus, blood P/S glycomics represents a promising source for a new generation of cancer biomarkers. At present, almost all P/S glycomics studies focus on the analysis of intact glycans—primarily N-linked glycans, with O-linked and lipid-linked glycans usually excluded. Generally, a great many intact glycan structures need to be investigated in order to fully capture and quantify the cancer-specific behavior of one unique glycan feature, such as core fucosylation, α2-6 sialylation, or β1-4 branching.15 Glycan node analysis is a molecularly bottom-up approach to P/S glycomics developed by Borges et al. in 2013 that focuses on monosaccharides and linkage specific glycan “nodes” rather than the intact glycan structures.16-20 This approach captures all P/S glycans including N-, O-, and lipid-linked glycans and breaks them down into monosaccharides that maintain their original linkage information. In short, the method involves the application of glycan linkage (methylation) analysis to whole biofluids. Uniquely in this approach, linkage-related glycan features are captured and quantified as single analytical signals, rather than being spread across numerous intact glycans that bear the specific feature. For example, 6-linked galactose and 2,6-linked mannose, corresponding to α2-6 sialylation and β1-6 branching, respectively, are both captured as single chromatographic peak areas (
Interestingly, there are several important gender differences in lung cancer, including the facts that (1) after adjusting for the number of cigarettes smoked, women have a 3-fold greater risk of lung cancer than men,21-24 (2) never-smoker women are at significantly greater risk for lung cancer than men,25 and (3) women tend to have better survival rates than men.26,27
Citation of any reference in this section is not to be construed as an admission that such reference is prior art to the present disclosure.
The present disclosure provides a method of detecting altered glycan nodes in a sample from a patient having or being treated for cancer, suspected of having cancer or at risk of having cancer. The method comprises (a.) obtaining a plasma sample from the patient, wherein the plasma sample comprises glycans; (b.) permethylating the sample comprising glycans; (c.) hydrolyzing the product from step (b); (d.) reducing the product from step (c); (e.) acetylating the product from step (d); (f.) partially purifying the product from step (e); and (g.) analyzing the product of step (f) using a substance identifying technique to detect altered glycan nodes in the plasma sample.
The present disclosure also provides a method of detecting terminal fucosylation, α2-6 sialylation, β1-4 branching, β1-6 branching, and outer-arm fucosylation in a sample from a patient having or being treated for cancer, suspected of having cancer or at risk of having cancer. The method comprises (a.) obtaining a sample from the patient, wherein the sample comprises glycans; (b.) permethylating the sample comprising glycans; (c.) hydrolyzing the product from step (b); (d.) reducing the product from step (c); (e.) acetylating the product from step (d); (f.) purifying the product from step (e); and (g.) analyzing the product of step (f) using a substance identifying technique to detect terminal fucosylation, α2-6 sialylation, β1-4 branching, β1-6 branching, and outer-arm fucosylation in the sample.
The present disclosure also provides a method of detecting α2-6 sialylation, β1-4 branching, β1-6 branching, and outer-arm fucosylation in a sample from a patient having or being treated for bladder cancer, suspected of having bladder cancer or at risk of having bladder cancer. The method comprises: (a.) obtaining a sample from the patient, wherein the sample comprises glycans; (b.) permethylating the sample comprising glycans; (c.) hydrolyzing the product from step (b); (d.) reducing the product from step (c); (e.) acetylating the product from step (d); (f.) purifying the product from step (e); and (g.) analyzing the product of step (f) using a substance identifying technique to detect terminal fucosylation, α2-6 sialylation, β1-4 branching, β1-6 branching, and outer-arm fucosylation in the sample.
Certain illustrative embodiments include the following:
1. A method of detecting altered glycan nodes in a sample from a patient having or being treated for cancer, suspected of having cancer or at risk of having cancer, the method comprising:
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as those commonly understood by one of ordinary skill in the art to which this invention belongs. Although methods and materials similar or equivalent to those described herein can be used in the practice or testing of the present disclosure, suitable methods and materials are described below. The materials, methods and examples are illustrative only, and are not intended to be limiting. All publications, patents and other documents mentioned herein are incorporated by reference in their entirety.
Throughout this specification, the word “comprise” or variations such as “comprises” or “comprising” will be understood to imply the inclusion of a stated integer or groups of integers but not the exclusion of any other integer or group of integers.
The term “a” or “an” may mean more than one of an item.
The terms “and” and “or” may refer to either the conjunctive or disjunctive and mean “and/or”.
The term “about” means within plus or minus 10% of a stated value. For example, “about 100” would refer to any number between 90 and 110.
The term “patient” means an animal, preferably a mammal, and most preferably, a mouse, rat, other rodent, dog, cat, swine, cattle, sheep, horse, or primate, and even more preferably a human.
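The definition of "about" given above (within plus or minus 10% of a stated value) can be expressed as a simple range check. This is a minimal sketch for illustration only: the function names `about_range` and `is_about` are hypothetical, and treating the end points of the range as included is an assumption, since the definition does not say whether "between 90 and 110" is inclusive.

```python
def about_range(stated, tolerance=0.10):
    """Return the (low, high) bounds implied by 'about <stated>'."""
    delta = abs(stated) * tolerance
    return stated - delta, stated + delta


def is_about(value, stated, tolerance=0.10):
    """True if value lies within plus or minus 10% of the stated value.

    End points are treated as included (an assumption; see the lead-in).
    """
    low, high = about_range(stated, tolerance)
    return low <= value <= high


print(about_range(100))    # bounds for "about 100"
print(is_about(95, 100))   # inside the range
print(is_about(111, 100))  # outside the range
```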
The present disclosure provides a method of detecting altered glycan nodes in a sample from a patient having or being treated for cancer, suspected of having cancer or at risk of having cancer. The method comprises (a.) obtaining a plasma sample from the patient, wherein the plasma sample comprises glycans; (b.) permethylating the sample comprising glycans; (c.) hydrolyzing the product from step (b); (d.) reducing the product from step (c); (e.) acetylating the product from step (d); (f.) partially purifying the product from step (e); and (g.) analyzing the product of step (f) using a substance identifying technique to detect altered glycan nodes in the plasma sample.
In some embodiments, the altered glycan nodes are selected from the group consisting of terminal fucose; xylose (any linkage); terminal galactose; 2-linked mannose; 4-linked galactose; 4-linked mannose; 4-linked glucose; 3-linked mannose; 2-linked galactose; 3-linked galactose; 6-linked glucose; 6-linked mannose; 6-linked galactose; 3,4-linked galactose; 2,3-linked galactose; 2,4-linked mannose; 4,6-linked glucose; 2,6-linked mannose; 3,6-linked mannose; 3,4,6-linked mannose; terminal N-acetylglucosamine (GlcNAc); terminal N-acetylgalactosamine (GalNAc); 4-linked GlcNAc; 3-linked GlcNAc; 3-linked GalNAc; 6-linked GlcNAc; 3,4-linked GlcNAc; 4-linked GalNAc; 6-linked GalNAc; 4,6-linked GlcNAc; and 3,6-linked GalNAc.
In the methods described herein, the cancer is any type of cancer. In some embodiments, the cancer is selected from the group consisting of lung cancer, prostate cancer, ovarian cancer, pancreatic cancer and bladder cancer.
The samples used in the methods described herein may be obtained from a patient. The sample may be blood plasma, serum, sputum, seminal fluid, urine, saliva, skin, prostatic fluid, tissue, other biofluid optionally derived from tissue ex vivo, microvesicles/exosomes from both serum and urine, or combinations thereof. In some embodiments, the sample is derived from a diseased organ, tissue or secretion therefrom. Samples derived from a diseased organ or tissue include, but are not limited to, sputum, prostatic fluid or semen, lung tissue, breast tissue, liver tissue, colon tissue and prostate tissue. In some embodiments, the sample is plasma.
In some embodiments, step (b) includes an initial substep of mixing the sample comprising glycans with a labeled chemical substance. In one aspect of this embodiment, the labeled chemical substance is heavy-labeled D-glucose. In other aspects, the labeled chemical substance is N-acetyl-D-[UL-13C6]glucosamine. In other aspects, the labeled chemical substance is a combination of heavy-labeled D-glucose and N-acetyl-D-[UL-13C6]glucosamine.
In some embodiments, step (b) comprises liquid/liquid extraction. Liquid/liquid extraction, in some embodiments, comprises adding a solution of NaCl followed by a halogenated solvent. The halogenated solvent can be methylene chloride or chloroform. Adding the solution of NaCl prior to the halogenated solvent reduces the number of liquid/liquid extraction steps needed to partially purify the permethylated glycans.
In step (c), the permethylated glycans are hydrolyzed. The permethylated glycans can be hydrolyzed according to any method known in the art. In some embodiments, step (c) uses acid. In some embodiments, step (c) uses trifluoroacetic acid.
In step (d), the product of step (c) is reduced to partially methylated alditols. The reducing agent used in this step can be any known in the art. In some embodiments, step (d) uses a reducing agent selected from the group consisting of NaBH4, NaBD4 and a combination thereof.
In step (e), the product of step (d) is acetylated to form partially methylated alditol acetates. The acetylation step can be performed using any known acetylating reagent. In some embodiments, step (e) uses acetic anhydride.
In step (f), the product of step (e) is partially purified. As used herein, the term “partially purifying” refers to methods of at least partially removing the product from a mixture of other compounds. In some embodiments, partially purifying the product of step (e) comprises liquid/liquid extraction.
The substance identifying technique used in the methods described herein is selected from gas chromatography-mass spectrometry (GC-MS), liquid chromatography-mass spectrometry (LC-MS), gas chromatography coupled to tandem mass spectrometry (GC-MS/MS) and liquid chromatography with tandem mass spectrometry (LC-MS/MS). In some embodiments, the substance identifying technique of step (g) is GC-MS.
The present disclosure also provides a method of detecting terminal fucosylation, α2-6 sialylation, β1-4 branching, β1-6 branching, or outer-arm fucosylation in a sample from a patient having or being treated for cancer, suspected of having cancer or at risk of having cancer. The method comprises (a.) obtaining a sample from the patient, wherein the sample comprises glycans; (b.) permethylating the sample comprising glycans; (c.) hydrolyzing the product from step (b); (d.) reducing the product from step (c); (e.) acetylating the product from step (d); (f.) purifying the product from step (e); and (g.) analyzing the product of step (f) using a substance identifying technique to detect terminal fucosylation, α2-6 sialylation, β1-4 branching, β1-6 branching, and outer-arm fucosylation in the sample.
In some embodiments, step (b) includes an initial substep of mixing the sample comprising glycans with a labeled chemical substance. In one aspect of this embodiment, the labeled chemical substance is heavy-labeled D-glucose. In other aspects, the labeled chemical substance is N-acetyl-D-[UL-13C6]glucosamine. In other aspects, the labeled chemical substance is a combination of heavy-labeled D-glucose and N-acetyl-D-[UL-13C6]glucosamine.
In some embodiments, step (b) comprises liquid/liquid extraction. Liquid/liquid extraction, in some embodiments, comprises adding a solution of NaCl followed by a halogenated solvent. The halogenated solvent can be methylene chloride or chloroform. Adding the solution of NaCl prior to the halogenated solvent reduces the number of liquid/liquid extraction steps needed to partially purify the permethylated glycans.
In step (c), the permethylated glycans are hydrolyzed. The permethylated glycans can be hydrolyzed according to any method known in the art. In some embodiments, step (c) uses acid. In some embodiments, step (c) uses trifluoroacetic acid.
In step (d), the product of step (c) is reduced to partially methylated alditols. The reducing agent used in this step can be any known in the art. In some embodiments, step (d) uses a reducing agent selected from the group consisting of NaBH4, NaBD4 and a combination thereof.
In step (e), the product of step (d) is acetylated to form partially methylated alditol acetates. The acetylation step can be performed using any known acetylating reagent. In some embodiments, step (e) uses acetic anhydride.
In step (f), the product of step (e) is partially purified. As used herein, the term “partially purifying” refers to methods of at least partially removing the product from a mixture of other compounds. In some embodiments, partially purifying the product of step (e) comprises liquid/liquid extraction.
The present disclosure also provides a method of detecting α2-6 sialylation, β1-4 branching, β1-6 branching, and outer-arm fucosylation in a sample from a patient having or being treated for bladder cancer, suspected of having bladder cancer or at risk of having bladder cancer. The method comprises: (a.) obtaining a sample from the patient, wherein the sample comprises glycans; (b.) permethylating the sample comprising glycans; (c.) hydrolyzing the product from step (b); (d.) reducing the product from step (c); (e.) acetylating the product from step (d); (f.) purifying the product from step (e); and (g.) analyzing the product of step (f) using a substance identifying technique to detect terminal fucosylation, α2-6 sialylation, β1-4 branching, β1-6 branching, and outer-arm fucosylation in the sample.
In some embodiments, step (b) includes an initial substep of mixing the sample comprising glycans with a labeled chemical substance. In one aspect of this embodiment, the labeled chemical substance is heavy-labeled D-glucose. In other aspects, the labeled chemical substance is N-acetyl-D-[UL-13C6]glucosamine. In other aspects, the labeled chemical substance is a combination of heavy-labeled D-glucose and N-acetyl-D-[UL-13C6]glucosamine.
In some embodiments, step (b) comprises liquid/liquid extraction. Liquid/liquid extraction, in some embodiments, comprises adding a solution of NaCl followed by a halogenated solvent. The halogenated solvent can be methylene chloride or chloroform. Adding the solution of NaCl prior to the halogenated solvent reduces the number of liquid/liquid extraction steps needed to partially purify the permethylated glycans.
In step (c), the permethylated glycans are hydrolyzed. The permethylated glycans can be hydrolyzed according to any method known in the art. In some embodiments, step (c) uses acid. In some embodiments, step (c) uses trifluoroacetic acid.
In step (d), the product of step (c) is reduced to partially methylated alditols. The reducing agent used in this step can be any known in the art. In some embodiments, step (d) uses a reducing agent selected from the group consisting of NaBH4, NaBD4 and a combination thereof.
In step (e), the product of step (d) is acetylated to form partially methylated alditol acetates. The acetylation step can be performed using any known acetylating reagent. In some embodiments, step (e) uses acetic anhydride.
In step (f), the product of step (e) is partially purified. As used herein, the term “partially purifying” refers to methods of at least partially removing the product from a mixture of other compounds. In some embodiments, partially purifying the product of step (e) comprises liquid/liquid extraction.
The methods described herein can be used in diagnostic applications to predict progression, recurrence, and survival in cancer patients.
The methods described herein can be used in conjunction with methods of treatment. In these embodiments, the method further comprises administering a treatment to the patient comprising one or more therapeutic agents for treating the cancer. The amount and nature of the therapeutic agent can be varied depending on the diagnosis or predicted progression, recurrence or survival.
In order that this invention be more fully understood, the following examples are set forth.
These examples are for the purpose of illustration only and are not to be construed as limiting the scope of the invention in any way.
To date, results from pilot studies in which this methodology was applied to (mostly) advanced stages of lung18 and breast cancer20 have been reported. In order to gain a representative perspective on the potential utility of this approach to detecting a variety of different types of cancer at varying stages, we have now applied it to over 950 clinical P/S samples from 7 different case control studies across all stages of cancer in which the cancer cases were compared to related benign conditions and/or healthy controls. A study of plasma samples from 428 Stage I-IV lung cancer patients, age/gender/smoking-status matched controls, and certifiably healthy living kidney donors serves as the backbone for this report—in which plasma from a single donor served as a quality control specimen in every single batch of samples—facilitating comparisons to pancreatic (rapid autopsy), ovarian (Stage III), prostate (Stage II), and a large independent lung cancer (Stage I) case-control study. Based on the behavior of P/S glycans established to date, we hypothesized that the alteration of P/S glycans observed in cancer would be independent of the tissue in which the tumor originated yet exhibit stage dependence that varied little across cancers classified on the basis of tumor origin.
Heavy, stable-isotope-labeled D-glucose (U-13C6, 99%; 1,2,3,4,5,6,6-D7, 97%-98%) was purchased from Cambridge Isotope Laboratories. N-Acetyl-D-[UL-13C6]glucosamine was obtained from Omicron Biochemicals, Inc. Methanol was purchased from Honeywell Burdick and Jackson. Acetone was obtained from Avantor Performance Materials. Acetonitrile and dichloromethane were acquired from Fisher Scientific. Chloroform, sodium hydroxide beads (20-40 mesh), DMSO, iodomethane (99%, catalog no. 18507), trifluoroacetic acid (TFA), ammonium hydroxide, sodium borohydride, and acetic anhydride were obtained from Sigma-Aldrich. Pierce spin columns (0.9 mL volume) including plugs were purchased from ThermoFisher Scientific (Waltham, Mass., catalog no. 69705). GC-MS autosampler vials and Teflon-lined pierceable caps were acquired from ThermoFisher Scientific. GC consumables were purchased from Agilent; MS consumables were obtained from Waters.
A summary of the case-control sample sets employed in this study is provided in
Living Kidney Donors. EDTA plasma samples were obtained from certifiably healthy living kidney donors enrolled in the Multidisciplinary Biobank at Mayo Clinic Arizona under a Mayo Clinic Institutional Review Board (IRB)-approved protocol. Patients eligible for enrollment were those seen at Mayo Clinic Arizona who were ≥18 years old, able to provide informed consent, and undergoing evaluation as a potential living kidney donor. Detailed inclusion and exclusion criteria for these patients are provided in the Supporting Information for Ferdosi S. et al., Journal of Proteome Research 2018, 17(1):543-558. None of these patients smoked at the time of health screening and blood collection; 27% were former smokers, and 73% never smoked. Specimens were collected over a 2-year period from December 2013 to December 2015. Standard operating protocols and blood collections were performed as previously described.21 All specimens were stored at −80° C. prior to shipment to Arizona State University.
Large Lung Cancer Set. Sodium heparin plasma samples for the large lung cancer study were collected at the University of Texas MD Anderson Cancer Center under the supervision of Dr. Xifeng Wu. Heparin is itself a glycosaminoglycan, but the vast majority of its monomer units are carboxylated, sulfated, or both. As previously described,18 sulfated and carboxylated glycan monomers cannot be directly detected by the analytical methodology employed in this study. The PMAA from 4-linked GlcNAc could theoretically be produced by the heparin anticoagulant, but empirically it was found in matched collection studies (described in the Results of Example 1) that 4-linked GlcNAc from heparin plasma is not significantly different from that of EDTA plasma or serum. Specimens for lung cancer cases and controls from the University of Texas MD Anderson Cancer Center included in this example are part of an ongoing large lung cancer study that has been recruiting since 1995. This study has received approval from the University of Texas MD Anderson Cancer Center and Kelsey-Seybold institutional review boards. Venous blood was drawn from newly diagnosed and histologically confirmed lung cancer patients (prior to therapy) and age-, gender-, and ethnicity-matched controls at the MD Anderson Cancer Center hospital and the nearby Kelsey-Seybold Clinic, respectively. All blood was drawn and processed under the same SOP. Patients were not necessarily in a fasted state. Blood was centrifuged, then aliquoted and placed into a liquid nitrogen tank. After collection, samples were coded and de-identified prior to shipment to Arizona State University for analysis. A more-detailed profile of the clinical characteristics of the patients in this large lung cancer study is provided in Table S1 accompanying Ferdosi S. et al., Journal of Proteome Research 2018, 17(1):543-558.
Liver Fibrosis (Non-Cancerous). Serum samples from patients at all stages of liver fibrosis were collected at the Sunnybrook Health Sciences Centre, under the direction of Dr. Lei Fu and Dr. David E. C. Cole. This study was approved by the Research Ethics Board, Sunnybrook Health Sciences Centre, Toronto. Patients were recruited between 2007 and 2011. Written informed consent was obtained from each participant. All subjects with various chronic liver diseases were considered eligible if they were to undergo liver biopsy for the diagnosis of liver fibrosis as part of their routine care. Blood specimens were collected, and serum was separated from cells following standard clinical laboratory procedures. Serum aliquots were stored at −70° C. The specimens were coded and de-identified according to the study protocol.
Stage I Lung Adenocarcinoma. Serum samples from stage I lung adenocarcinoma patients and age-, gender-, and smoking-status-matched controls were collected under NYU IRB approval at the NYU Langone Medical Center by Dr. Harvey Pass. Arterial blood samples were collected from fasting patients undergoing surgery in the time frame from September 2006 to August 2013 to remove one or more lung nodules that were detected during a CT scan. Determination of whether nodules were benign or malignant was made following a pathological exam of the excised nodules. Serum was collected in standard glass serum tubes and allowed to sit upright for 30-60 min to allow clotting. Subsequently, tubes were centrifuged at 1200 g for 20 min at room temperature, then aliquoted and placed at −80° C. within 2-3 h of collection. No freeze-thaw cycles occurred prior to shipment to Arizona State University (Borges lab) for analysis.
Stage II Prostate Cancer. Serum samples from stage II prostate cancer patients were obtained from the Cooperative Human Tissue Network (CHTN), an NIH-sponsored biospecimen collection agency. The quality management system of the CHTN is described elsewhere.22 Age-matched control samples from nominally healthy male donors were obtained from ProMedDx (Norton, Mass.).
Stage III Serous Ovarian Cancer. Serum specimens from stage III serous ovarian cancer patients were collected at Brigham and Women's Hospital under IRB approval by Dr. Daniel Cramer. Sera were obtained at the time of presentation prior to surgery. Age, gender, and location matched control sera from women without a history of cancer (other than nonmelanoma skin cancer) were obtained from the general population under a standardized serum collection protocol. All serum samples were collected from 2001 to 2010 and were stored at −80° C. prior to analysis. These specimens have previously been described.23,24
Stage IV Lung Cancer. A set of serum samples from stage IV lung cancer patients and age- and gender-matched nominally healthy control donors that was completely separate from those provided by Dr. Xifeng Wu at the University of Texas MD Anderson Cancer Center was obtained from ProMedDx.
Rapid Autopsy Pancreatic Cancer. Serum specimens from rapid autopsy patients who had recently died from pancreatic cancer were collected by Dr. Michael Hollingsworth at the University of Nebraska Medical Center under IRB approval. These samples have previously been described.25 In brief, specimens were collected within 2-3 h of death. Control serum samples were from patients with benign pancreatic conditions and elevated CA19-9 levels. Samples were coded, de-identified, and kept at −80° C. prior to shipment to Arizona State University.
Additional Biospecimen Details. As described above, all blood samples were processed into P/S immediately following collection and stored at −70° C. or colder until analyzed. Following shipment in dry ice, vial headspace was vented prior to thawing to avoid CO2-mediated sample acidification.26 The molecular integrity of the sample set that showed the greatest differences between cases and controls (rapid autopsy pancreatic cancer sera) was examined using an assay based on ex vivo protein oxidation that was recently developed by the Borges group.27 The prostate cancer and stage I lung adenocarcinoma sets were spot-checked as well. No samples produced evidence for concern about specimen integrity.
In this example, multiple independent sets of samples were compared to each other. Each case-control set was analyzed blind and in random order. Within each batch, across all sets, a quality control (QC) EDTA plasma sample consisting of a 9 μL aliquot of the same bulk plasma sample was included to verify reproducibility across batches. Notably, the samples from the certifiably healthy living kidney donors were analyzed in batches separate from those in the large lung cancer set. To justify direct comparison of these two sets of samples, we verified that the average values measured for each glycan node in the two sets of QC sample results were not statistically significantly different. Moreover, if the average value of the QC sample was slightly higher or lower in the large lung cancer set relative to the living kidney donor set, a scaling factor based on this difference in QC samples was employed to adjust the living kidney donor data set. For each glycan node, this adjustment brought the living kidney donor data set distribution slightly closer to the control distributions observed in the large lung cancer set, meaning that it was a conservative adjustment. Furthermore, to validate the comparability of results in serum and multiple different types of plasma, the glycan “node” analysis procedure was applied to matched sets of P/S samples from 21 donors. This set consisted of four different types of plasma and a serum sample from each donor. The four types of plasma differed in the anticoagulant used: K2EDTA, K3EDTA, sodium EDTA, or 3.8% sodium citrate. In an additional study, six matched-collection aliquots of serum, K2EDTA plasma, and heparin plasma from a single donor were analyzed and compared to each other to verify the consistency of glycan nodes between the aforementioned types of samples.
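The QC-based adjustment described above can be illustrated with a minimal sketch. The text does not state the exact formula, so the sketch assumes a multiplicative factor equal to the ratio of mean QC values between the two groups of batches for a single glycan node. All function names and numbers are hypothetical.

```python
from statistics import mean

def qc_scaling_factor(qc_reference, qc_other):
    """Assumed form of the factor: mean QC value in the reference set
    divided by the mean QC value in the set to be adjusted."""
    return mean(qc_reference) / mean(qc_other)

def adjust(values, factor):
    """Apply the multiplicative factor to every measurement of one glycan node."""
    return [v * factor for v in values]

# Hypothetical normalized QC results for one glycan node
qc_lung_batches = [1.02, 0.98, 1.00]    # QC values from large lung cancer set batches
qc_donor_batches = [0.95, 0.97, 0.93]   # QC values from living kidney donor batches

factor = qc_scaling_factor(qc_lung_batches, qc_donor_batches)

# Hypothetical living kidney donor measurements for the same node
donor_values = [0.80, 0.90, 1.10]
adjusted = adjust(donor_values, factor)
```

Under this assumption, a QC mean that is lower in the donor batches yields a factor above 1, which shifts the donor values upward toward the lung cancer set controls.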
The global glycan methylation analysis procedure consisted of five main steps: permethylation, trifluoroacetic acid (TFA) hydrolysis, reduction of sugar aldehydes, acetylation of nascent hydroxyl groups, and final cleanup.18,19 Each step is described in detail below.
Permethylation, Nonreductive Release, and Purification of Glycans. A total of 9 μL of P/S was added to a 1.5 mL Eppendorf tube followed by 1 μL of a 10 mM solution of heavy-labeled D-glucose (U-13C6, 99%; 1,2,3,4,5,6,6-D7, 97%-98%) and N-acetyl-D-[UL-13C6]glucosamine, which served as internal standards for relative quantification. Then, 270 μL of dimethyl sulfoxide (DMSO) was added to the biological sample and mixed to dissolve it completely. Once the sample was fully dissolved, 105 μL of iodomethane was added to the mixture. This solution was then added to a plugged 1 mL spin column, which contained ~0.7 g of sodium hydroxide beads. The NaOH beads had been preconditioned with acetonitrile and rinsed with DMSO twice before the sample was added. Then, the NaOH column was stirred occasionally for 11 min. When finished, the spin columns were unplugged and spun for 15 s at 5000 rpm (2400 g) in a microcentrifuge to extract the glycan-containing solution. To wash off all the permethylated glycan, 300 μL of acetonitrile was added to the spin column, which was then centrifuged for 30 s at 10 000 rpm (9600 g). Then, samples from the first spin-through were placed in a silanized 13×100 mm glass test tube containing approximately 3.5 mL of 0.5 M NaCl solution in 0.2 M sodium phosphate buffer (pH 7.0) and mixed well. Next, the second spin-through was pooled with the rest of the sample, avoiding the white residue at the bottom of the spin column. After 1.2 mL of chloroform was added to the sample, the test tube was capped and shaken thoroughly. Liquid/liquid extraction was performed three times, saving the chloroform layer. The chloroform layer was then extracted with a silanized pipet, transferred to a silanized glass test tube, and dried under nitrogen at a heater-block temperature setting of 74° C.
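The two centrifugation settings above pair a rotor speed with a relative centrifugal force. As a quick consistency check, the sketch below uses the standard relation RCF = 1.118 × 10⁻⁵ × r(cm) × rpm² to back-calculate the rotor radius implied by each pair. The radius is not given in the text; it is derived here only to show that both pairs describe the same rotor.

```python
RCF_CONSTANT = 1.118e-5  # standard conversion constant for radius in cm and speed in rpm

def rcf(rpm, radius_cm):
    """Relative centrifugal force (x g) from rotor speed and radius."""
    return RCF_CONSTANT * radius_cm * rpm ** 2

def implied_radius_cm(rpm, g_force):
    """Rotor radius (cm) implied by a stated rpm / g-force pair."""
    return g_force / (RCF_CONSTANT * rpm ** 2)

# rpm / g pairs as stated in the protocol
r_first_spin = implied_radius_cm(5000, 2400)    # 15 s extraction spin
r_wash_spin = implied_radius_cm(10000, 9600)    # 30 s acetonitrile wash spin
```

Both pairs imply a radius of roughly 8.6 cm, as expected: doubling the speed quadruples the force.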
TFA Hydrolysis. A total of 325 μL of 2 M TFA was added to each sample. Samples were then capped and heated at 121° C. for 2 h. Afterward, samples were dried down under nitrogen at 74° C.
Reduction of Sugar Aldehydes. A total of 475 μL of a freshly prepared 10 mg/mL solution of sodium borohydride in 1 M ammonium hydroxide was added to each test tube. After the sample was allowed to react for 1 h at room temperature, 63 μL of methanol was added to each sample and then dried down at 74° C. under nitrogen. A solution of 9:1 (v/v) methanol/acetic acid was then prepared, and 125 μL was added to each test tube, which was again dried under nitrogen. Before moving forward, the samples were fully dried in a vacuum desiccator for at least 15-20 min.
Acetylation of Nascent Hydroxyl Groups. A total of 18 μL of water was added to each sample and mixed well to dissolve the entire sample residue. A total of 250 μL of acetic anhydride was then added to each sample. Next, the sample was sonicated in a water bath for 2 min, followed by an incubation for 10 min at 60° C. A total of 230 μL of concentrated TFA was then added to each test tube. The capped test tube was then incubated at 60° C. for 10 min.
Final Cleanup. Approximately 2 mL of methylene chloride was added to each test tube and mixed well. Then, 2 mL water was added to each sample and mixed well. Liquid/liquid extraction was performed twice, saving the organic layer. Next, the organic layer was transferred with a silanized glass pipet into a silanized autosampler vial. The organic layer was then evaporated under nitrogen, reconstituted in 120 μL of acetone and capped for injection onto GC-MS. A molecular overview of the global glycan methylation analysis procedure is shown in
Gas Chromatography-Mass Spectrometry. For sample analysis, an Agilent Model A7890 gas chromatograph (equipped with a CTC PAL autosampler) was used coupled to a Waters GCT (time-of-flight) mass spectrometer. A total of 1 μL of the sample was injected in split mode onto an Agilent split-mode liner that contained a small plug of silanized glass wool with the temperature set to 280° C. For all samples, one injection was made at split ratio of 20:1. A 30 m DB-5 ms GC column was used for chromatography. The oven temperature was initially held at 165° C. for 0.5 min. Then, the temperature increased 10° C./min up to 265° C., followed by an immediate increase of 30° C./min to 325° C., where it was kept constant for 3 min. The total run time was 15.5 min. The temperature of the transfer line was kept at 250° C. After the sample components were eluted from the GC column, they were subjected to electron ionization with an electron energy of 70 eV at a temperature of 250° C. The m/z range of analysis was 40-800 with a spectral acquisition rate of 10 Hz. Perfluorotributylamine was used for the daily tuning and calibration of the mass spectrometer.
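The stated total run time follows directly from the oven program. The short sketch below encodes the initial hold and the two temperature ramps as given in the text and sums the segment durations; the variable and function names are illustrative only.

```python
INITIAL_TEMP_C = 165.0
INITIAL_HOLD_MIN = 0.5

# Each ramp: (rate in °C/min, target temperature in °C, hold at target in min)
RAMPS = [
    (10.0, 265.0, 0.0),  # first ramp, no hold before the second ramp
    (30.0, 325.0, 3.0),  # second ramp, followed by a 3 min hold
]

def total_run_time(initial_temp, initial_hold, ramps):
    """Sum the initial hold, each ramp duration, and each hold time (minutes)."""
    time_min = initial_hold
    temp = initial_temp
    for rate, target, hold in ramps:
        time_min += (target - temp) / rate + hold
        temp = target
    return time_min

run_time = total_run_time(INITIAL_TEMP_C, INITIAL_HOLD_MIN, RAMPS)
print(run_time)  # 15.5
```

The segments contribute 0.5 + 10 + 2 + 3 min, which matches the 15.5 min run time reported above.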
Data Processing. Quantification was done by integrating the summed extracted ion chromatogram peak areas (details provided elsewhere) using QuanLynx software. The peaks were integrated automatically and verified manually. Then, all the information given by integration was exported to a spreadsheet for further analysis.
Statistical Analysis. All data (chromatographic peak areas) for each sample analyzed as part of this example are provided within a spreadsheet available as the Supporting Information accompanying Ferdosi S. et al., Journal of Proteome Research 2018, 17(1):543-558. The peak area for each glycan node was normalized in one of two possible ways. In the first approach, individual hexoses were normalized to heavy glucose, and individual HexNAcs were normalized to heavy N-acetyl glucosamine (heavy GlcNAc). (Notably, these two internal standards were omitted during analysis of the prostate cancer set of samples.) In the second approach, individual hexoses were normalized to the sum of all endogenous hexoses, and individual HexNAcs were normalized to the sum of all endogenous HexNAcs. The second normalization scheme provided modestly improved within-batch reproducibility but limited observation of potential simultaneous increases in all glycan nodes; see the “Average CVs” worksheet of the spreadsheet provided in the Supporting Information accompanying Ferdosi S. et al., Journal of Proteome Research 2018, 17(1):543-558 for details on the reproducibility of each normalization approach. Based on the QC sample analyzed in each batch, the average percent CV for the heavy glucose/heavy GlcNAc normalization approach for the top five performing glycan nodes described in the Results section was 17%; for normalization by the sum of endogenous hexoses or HexNAcs, this value was 10%.
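The two normalization approaches and the percent CV calculation can be sketched as follows. This is an illustration with hypothetical peak areas and QC values, not the spreadsheet calculation itself. The node labels are shorthand for hexose nodes, and the function names are assumptions.

```python
from statistics import mean, stdev

def normalize_to_internal_standard(areas, standard_area):
    """First approach: divide each peak area by the heavy-labeled standard's area."""
    return {node: area / standard_area for node, area in areas.items()}

def normalize_to_endogenous_sum(areas):
    """Second approach: divide each peak area by the sum of all endogenous areas."""
    total = sum(areas.values())
    return {node: area / total for node, area in areas.items()}

def percent_cv(values):
    """Percent coefficient of variation (sample standard deviation / mean)."""
    return 100.0 * stdev(values) / mean(values)

# Hypothetical hexose peak areas for one sample, plus the heavy glucose peak area
hexose_areas = {"6-Gal": 300.0, "2,4-Man": 100.0, "2,6-Man": 200.0, "4-Glc": 400.0}
heavy_glucose_area = 500.0

by_standard = normalize_to_internal_standard(hexose_areas, heavy_glucose_area)
by_sum = normalize_to_endogenous_sum(hexose_areas)

# If every endogenous hexose doubles, sum-normalized values do not change,
# whereas values normalized to the internal standard double.
doubled = {node: 2 * area for node, area in hexose_areas.items()}
by_sum_doubled = normalize_to_endogenous_sum(doubled)
by_standard_doubled = normalize_to_internal_standard(doubled, heavy_glucose_area)

# Hypothetical QC values for one node across three batches
qc_cv = percent_cv([0.9, 1.0, 1.1])
```

The doubled-area case shows why normalization to the endogenous sum limits observation of simultaneous increases in all glycan nodes.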
Data from each stage of each cohort were log-transformed, and outliers were removed with the ROUT method at Q=1% using GraphPad Prism 7. Data were then reverse-transformed by taking the anti-log of each value. Differences between patient cohorts and stages in the large lung cancer study were evaluated by means of the Kruskal-Wallis test followed by the Benjamini-Hochberg false discovery correction procedure using R version 3.3.3. This software was also used to generate the receiver operating characteristic (ROC) curves that were statistically compared to one another via DeLong's test using RStudio Version 1.0.143. GraphPad Prism 7 was used to plot the ROC curves shown in
Prior to initiating this study, matched collections of serum and several different types of plasma were acquired from healthy donors. Glycan nodes were analyzed in these samples to determine whether subtle differences in sample matrix (i.e., different anticoagulants and serum) impacted the analytical results. Only a few statistically significant differences between the P/S matrices were observed (Tables S2 and S3 accompanying Ferdosi S. et al., Journal of Proteome Research 2018, 17(1):543-558). Sodium citrate and sodium EDTA plasma samples were excluded from this study, which accounts for all of the pair-wise differences observed in Table S2; a few remaining differences (noted within the smaller sample set involving heparin plasma; Table S3), while statistically significant, were small and actually within the interassay precision range for the relevant markers.19 A summary of all sample sets analyzed as part of this example is provided in
The primary focus of this study was the large lung cancer set as it constituted the single largest set and covered all stages of cancer. In total, 19 glycan “nodes” were measured with relative abundances that were consistently greater than 1% of respective total hexoses or total N-acetylhexosamines (HexNAcs). As reported elsewhere, this threshold ensures quantitative precision between batches of samples.18,19 Relative to the age/gender/smoking-status matched controls, significant changes were observed in 4 out of 19, 2 out of 19, 17 out of 19 and 17 out of 19 nodes in plasma samples taken from stage I, II, III and IV patients, respectively (
Highly Altered Glycan Features: The five glycan nodes that were most elevated in the cancer cases relative to the at-risk controls were the following: 1) terminal fucose, which corresponds to essentially all fucose in blood plasma (non-terminal fucose is found only in Notch proteins18,28 which, at most, would contribute only an infinitesimal fraction of the fucose found in blood plasma and, if ever detected by the approach employed here, would be observed as 3-linked fucose); 2) 6-linked galactose, which corresponds specifically to α2-6 sialylation and almost completely to the activity of the ST6GalI glycosyltransferase enzyme18; 3) 2,4-linked mannose, which corresponds to β1-4 branching of N-linked glycans and almost completely to the activity of the GnT-IVa enzyme18; 4) 2,6-linked mannose, which corresponds to β1-6 branching of N-linked glycans and to the activity of the GnT-V enzyme18; and 5) 3,4-linked N-acetylglucosamine (GlcNAc), which predominantly corresponds to outer-arm fucosylation and the activity of the FucT-III, FucT-V, FucT-VI, and FucT-XI enzymes18. The univariate distributions of these five glycan nodes (normalized to heavy glucose or heavy GlcNAc added as an internal standard), along with receiver operating characteristic (ROC) curves that describe the potential clinical relevance of their distributions, are shown in
Stage and Health-Status Dependence:
In general, the five glycan nodes increased together as the stage of cancer advanced (
Orthogonality of Glycan Features: In order to evaluate the orthogonality of all 19 glycan nodes included in this study (
Comparison to Liver Fibrosis: The vast majority of glycoproteins found in blood P/S are derived from either liver glycoproteins or immunoglobulins (IgG molecules) secreted by the immune system.29,30 In terms of raw abundance, which is in the tens of mg/mL range, the relative contribution of P/S glycoproteins provided by the liver and by the immune system is approximately 50% each.30 Essentially all non-protein-targeting serum glycomics approaches, including the one employed in this study, detect changes in these abundant P/S glycans and not novel glycans secreted or sloughed off by cancer cells. This concept has been acknowledged elsewhere.31 Nevertheless, P/S glycans are well known to be altered in cancer.1-4,32 However, they are also known to be altered in inflammatory conditions in the absence of cancer.33-35 As an initial attempt to parse out the behavior of the five glycan nodes that were most elevated in the large lung cancer set, they were analyzed in a set of serum samples from liver fibrosis patients (
Prediction of Progression and All-Cause Mortality: The five glycan nodes that were most elevated in the large lung cancer set were evaluated for their ability to predict both progression and all-cause mortality in a Cox proportional hazards regression model. After adjusting for age, gender, smoking status and cancer stage, only 6-linked galactose, which corresponds to α2-6 sialylation, predicted both progression and all-cause mortality with p-values of <0.01 when the glycan nodes were modeled as continuous variables. All four other top-performing glycan nodes were able to predict survival (p<0.05), but only β1-4 branching and β1-6 branching were also able to predict progression (p<0.05). Because relative rather than absolute quantification was employed, glycan node units lack readily interpretable meaning. As such, measurements of α2-6 sialylation were broken into quartiles and the Cox proportional hazards analysis repeated. After adjusting for age, gender, smoking status and cancer stage, the top α2-6 sialylation quartile predicted progression with a hazard ratio of 2.45 relative to all other quartiles combined (lower bound at 95% CL=1.54; upper bound at 95% CL=3.90; p=1.5×10−4). Likewise, after the same adjustments, the top α2-6 sialylation quartile predicted all-cause mortality with a hazard ratio of 1.52 relative to all other quartiles combined (lower bound at 95% CL=1.02; upper bound at 95% CL=2.23; p=0.042). Progression and survival curves illustrate the differences in the rates of occurrence of these events for the top α2-6 sialylation quartile vs. all other quartiles (
The five glycan features that were most elevated relative to healthy individuals and at-risk controls were terminal (total) fucosylation, α2-6 sialylation, β1-4 branching, β1-6 branching and outer-arm fucosylation (
A second notable feature apparent in the large lung cancer set was the statistically significant difference between certifiably healthy living kidney donors and risk-matched controls for α2-6 sialylation, β1-4 branching and β1-6 branching—with controls always increased toward the direction of cancer (
A few studies have been published that are closely related to the one reported here, but in which intact glycans were analyzed.31,44,45 While not in conflict with any of these studies, our most prominent findings of increased terminal (total) fucosylation, α2-6 sialylation, β1-4 branching, β1-6 branching and outer-arm fucosylation in stage III-IV lung cancer are most closely aligned with the major changes reported by Vasseur et al.31 for intact glycans in lung cancer. They reported significant increases in fucosylated tri- and tetra-antennary structures, outer-arm fucosylated structures, and α2-6 sialylated structures. Moreover, they reported that all of these features were elevated in control-group former smokers relative to control group non-smokers. We found increases in terminal (total) fucosylation, β1-6 branching and outer-arm fucosylation in current smokers relative to never smokers, but only increases in terminal (total) fucosylation and outer-arm fucosylation in former smokers relative to never smokers (
Multivariate logistic regression models were not able to outperform individual glycan nodes (cf
But this is not to imply that liver glycoprotein and immunoglobulin glycan alterations are unimportant or lack a cancer-relevant pathological effect. Several cancer-upregulated glycoforms that cancer cells have in common with glycans that are induced on acute phase liver proteins and/or IgG molecules in the presence of cancer have been found to mediate specific immune-modulating effects—some of which overtly favor cancer progression:
Galectins are a family of lectins that bind β-galactoside sugars within glycans and are known to modulate a variety of immunological processes involved in cancer.39,54,55 Malignant T-cells in mycosis fungoides/Sézary syndrome have been found to resist galectin-1-mediated apoptosis both because they lack the CD7 receptors that carry the oligosaccharides recognized by galectin-1 and because they express sialylated core 1 O-glycans that promote galectin-1 resistance.56 Poly-N-acetyllactosamine-modified core 2 O-glycans bind to galectin-3, reducing the affinity of tumor major histocompatibility complex (MHC) class I-related chain A (MICA) for the activating NKG2D receptor on natural killer (NK) cells and thereby preventing NK-cell killing of core 2 O-glycan-expressing cancer cells.57-60 Similarly, modification of MUC1 by poly-N-acetyllactosamine and subsequent binding by galectin-3 interferes with TRAIL-mediated killing of DR4-expressing cancer cells by NK cells.60-62 But perhaps the best known example is the ability of excessive tumor cell surface sialylation to continually stimulate the inhibitory Siglec-7 receptor on NK cells, preventing their activation.60,63-65
In light of these discoveries, the fact that α2-6 sialylation of abundant plasma/serum proteins is associated with metastasis and poor prognosis66,67 and, in our study, was not only elevated in lung cancer but also predicted progression and all-cause mortality in the large lung cancer set may shed additional light on a means by which cancer potentially manipulates the immune system to groom the physiological landscape and carve out a metastatic niche: rather than directly interacting (cell-to-cell) with NK cells, tumor cells may simply be able to send out cytokine signals that are picked up by the liver and/or the immune system and that alter the way these nominally healthy tissues glycosylate their secreted proteins. This could, for example, facilitate a large-scale amplification of sialylated glycans that are able to continually activate Siglec-7 receptors on NK cells, preventing the NK cells from killing tumor cells and allowing the tumor cells to metastasize. The possibility that cancer cells may induce the abnormal glycosylation of the highly abundant liver glycoproteins and/or IgG molecules found in P/S as a shielding mechanism against innate immune detection during metastasis attempts has received very little attention, but may be worth investigating. Though speculative, this strategy could even potentially be deployed in cases where cancer cells deplete themselves of a glycan feature required for immune-cell recognition (such as fucosylation recognized by the TRAIL-mediated killing mechanism of NK cells68) but induce it on abundant P/S proteins, serving to “swamp out” the recognition mechanism of innate immune surveillance.
The ability of α2-6 sialylation to predict lung cancer progression and survival is not unique among P/S glycans. Indeed, all five top-performing glycan nodes in the present study were able to predict progression and/or survival, although to a more limited extent than α2-6 sialylation. The prognostic capacity of β1-4 and β1-6 branching, however, may at least in part be due to the fact that these glycan features simply create greater opportunity for sialylation. Beyond this study, others have found that the sialyl Lewis X epitope (which displays α2-3 sialylation rather than α2-6 sialylation) predicts progression and survival in both small cell69 and non-small cell lung cancer.70-72 Like the prognostic VeriStrat markers,73-75 which are serum amyloid A proteoforms76, elevated α2-6 sialylation in lung cancer may largely be due to an inflammatory response by the liver. But if, as described above, sialylation-based cloaking of tumor cells from the immune system plays an important role in the metastatic process, α2-6 sialylation may turn out to play a causative, mechanistic role in lung cancer progression.
A molecularly bottom-up approach to plasma/serum (P/S) glycomics based on glycan linkage analysis that captures unique glycan features such as α2-6 sialylation, β1-6 branching and core fucosylation as single analytical signals was employed to evaluate the behavior of P/S glycans in all stages of lung cancer and across various stages of prostate, ovarian and pancreatic cancers. Elevation of terminal (total) fucosylation, α2-6 sialylation, β1-4 branching, β1-6 branching and outer-arm fucosylation markers was most pronounced in lung cancer in a stage-dependent manner, but these changes were found to be independent of the tumor tissue-of-origin. Using a Cox proportional hazards regression model, the marker for α2-6 sialylation was found to predict both progression and all-cause mortality in lung cancer patients after adjusting for age, gender, smoking status and stage at which the sample was taken. Interestingly, certifiably healthy P/S donors had markedly lower levels of α2-6 sialylation, β1-4 branching and β1-6 branching relative to cancer risk-matched controls. While early detection is ideal, the information provided by this and related studies31,33-35,41,42,49-53 suggests that pre-cancerous inflammation may be responsible for the elevation of many of the glycan features observed in the at-risk controls relative to the certifiably healthy donors, implying that the goal of preventing such a pre-cancerous state may be as important as preventing the transition from an at-risk state to stage I cancer.
Our recent large lung cancer study provided important information about the diagnostic and prognostic value of P/S glycan nodes in lung cancer as well as other types of cancer.21 In particular, we observed strong stage-dependence, but tissue-of-tumor-origin independence of elevated P/S glycan features. Moreover, we found that glycan nodes corresponding to α2-6 sialylation, β1-4 branching and β1-6 branching were able to predict survival and progression.
The primary purposes of Example 2 were to evaluate the ability of unique glycan features, quantified via glycan node analysis, to 1) distinguish MIBC from NMIBC, 2) distinguish NMIBC patients from patients with a history of bladder cancer but currently exhibiting no clinical evidence of disease (NED), and 3) predict recurrence from a state of remission (i.e., the NED state). Based on our observations in lung cancer in Example 1, we anticipated findings of potential clinical interest under each objective. Moreover, elevated blood plasma protein glycosylation is known to be associated with inflammation in some non-cancerous clinical conditions [22-24]. Since C-reactive protein (CRP) is a well-studied marker of inflammation [25] as well as a prognostic marker for UCC [26-29], we also evaluated the quantitative relationship between CRP and the glycan nodes that were prognostically useful in NED patients.
Plasma Samples. EDTA plasma samples were obtained from MIBC (n=12), NMIBC (n=39) and NED patients (n=72), as well as certifiably healthy living kidney donors (n=30), all of whom were enrolled in the Multidisciplinary Biobank at Mayo Clinic Arizona under a Mayo Clinic Institutional Review Board (IRB)-approved protocol. Patients eligible for enrollment were those seen at Mayo Clinic Arizona who were >18 years old, able to provide informed consent, and undergoing evaluation as either a potential living kidney donor or for genitourinary diseases. Detailed inclusion and exclusion criteria for living kidney donors are provided in Supporting Information (S1 Appendix). None of the living kidney donors smoked at the time of health screening and blood collection; 27% were former smokers and 73% had never smoked. Living kidney donors and UCC patients were excluded if they declined to participate or if the banking of their biospecimens would compromise the availability of tissue for diagnosis and standard clinical care. All specimens were collected from June 2010 through February 2016. Standard operating protocols and blood collections were performed as previously described [30]. All specimens were stored at −80° C. prior to shipment to ASU and maintained at −80° C. at ASU prior to analysis. All specimens were analyzed blind and in random order. An aliquot of plasma from the same individual donor was analyzed in every batch as a quality control (QC) specimen to ensure batch-to-batch consistency.
This research was approved by Arizona State University's IRB and all clinical investigations were conducted according to the principles expressed in the Declaration of Helsinki.
Sample Preparation. Glycan node analysis was performed on the plasma samples as described previously.19 Briefly, it includes four main steps (
Gas Chromatography-Mass Spectrometry. As previously described,21 an Agilent Model A7890 gas chromatograph (equipped with a CTC PAL autosampler) coupled to a Waters GCT (time-of-flight) mass spectrometer was used to analyze the prepared samples. For all samples, one injection of 1 μL was made at a split ratio of 20:1 onto an Agilent split-mode liner containing a small plug of silanized glass wool with the temperature set to 280° C. A 30 m DB-5 ms GC column was used for chromatography. The oven temperature, initially kept at 165° C., was increased at a rate of 10° C./min up to 265° C. Immediately after that, the temperature was increased at a rate of 30° C./min to 325° C., then held constant for 3 min. The transfer line to the mass spectrometer was kept at 250° C. Following elution from the GC column, sample components were subjected to electron ionization (70 eV, 250° C.) and analyzed in the m/z range of 40-800 with a scan cycle time of 0.1 s. Daily calibration and tuning of the mass spectrometer were done using perfluorotributylamine.
The quantification method is described in detail elsewhere.18 Briefly, summed extracted ion chromatogram peaks were integrated automatically and checked manually using QuanLynx software. The collected data were then exported to a spreadsheet for detailed analysis.
Human C-Reactive Protein ELISA Assay. The Invitrogen™ Human C-Reactive Protein ELISA kit (Catalog Number KHA0031, ThermoFisher Scientific) was used, following the manufacturer's instructions, to measure the concentration of CRP in patient plasma samples. Final absorbance values were read at 450 nm on a Thermo Scientific Multiskan Go plate reader, and sample concentrations were calculated using SkanIt Software 3.2.
Statistical Analysis. Individual extracted-ion chromatographic peak areas for each glycan node were normalized using one of two possible approaches: 1) Individual hexose residues were normalized to heavy glucose and individual N-acetylhexosamine (HexNAc) residues were normalized to heavy N-acetyl glucosamine (heavy GlcNAc). 2) Individual hexose residues were normalized to the sum of all endogenous hexose residues. Likewise, each HexNAc residue was normalized to the sum of all endogenous HexNAcs. The average % CV calculated based on the analysis of the QC sample in each batch shows that the latter normalization method provides better inter-batch reproducibility (<10% for the four most elevated glycan nodes) but the former normalization method performs better in separating the patient groups while still keeping the average inter-batch % CV in an acceptable range (i.e., <18%). Unless otherwise noted, results described below are based on normalization with heavy glucose and heavy GlcNAc. All extracted-ion chromatographic peak areas for all samples, including their normalization to heavy glucose or heavy GlcNAc and normalization to the sum of all endogenous hexoses or HexNAcs, as well as % CV values for batch-to-batch QC samples, are included in Supporting Information accompanying Ferdosi S et al., PLoS One 2018, 13(7) (S1 File).
For both the glycan node data and the CRP ELISA data, outliers within each clinical group (Control, NED, NMIBC and MIBC) were removed after log10 transformation using the ROUT method at Q=1% by GraphPad Prism 7. After removing the outliers, the anti-log of each value was taken to reverse the transformation. To identify differences between clinical groups, the Kruskal-Wallis test was performed followed by the Benjamini-Hochberg false discovery correction procedure at a 5% false discovery rate using RStudio Version 1.0.143. Univariate distributions and ROC curves were plotted using GraphPad Prism 7. The ability of certain glycan nodes to predict bladder cancer recurrence was evaluated by performing Cox proportional hazards regression models using SAS 9.4. Correlations between CRP and glycan nodes were examined using Pearson correlation in GraphPad Prism 7.
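The multiple-comparison step named above can be illustrated in isolation. The following is a minimal pure-Python sketch of the Benjamini-Hochberg step-up procedure at a 5% false discovery rate, applied to hypothetical p-values. It does not reproduce the ROUT outlier removal or the Kruskal-Wallis test, and it is not the R code used in the study.

```python
def benjamini_hochberg(p_values, fdr=0.05):
    """Return one boolean per p-value: True where the null hypothesis is
    rejected under the Benjamini-Hochberg step-up procedure."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest
    order = sorted(range(m), key=lambda i: p_values[i])

    # Find the largest 1-based rank k with p_(k) <= (k / m) * fdr
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr:
            max_rank = rank

    # Reject every hypothesis ranked at or below that rank
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_rank:
            rejected[idx] = True
    return rejected

# Hypothetical p-values, e.g. one per glycan node for a single group comparison
p_vals = [0.001, 0.008, 0.039, 0.041, 0.20]
flags = benjamini_hochberg(p_vals)
print(flags)  # [True, True, False, False, False]
```

With five tests, the rank-wise thresholds are 0.01, 0.02, 0.03, 0.04 and 0.05, so only the two smallest p-values pass even though four of the five are below 0.05.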
Altered Glycan Features in UCC. The relative abundance of 19 glycan “nodes” was quantified in each of the control, NED, NMIBC, and MIBC patient samples. Each of these nodes contributed at least 1% of the sum total of all hexoses or all HexNAcs. Data normalized to heavy, stable isotope-labeled glucose and GlcNAc internal standards were first evaluated for statistically significant differences between all four patient groups. No differences were found between MIBC, NMIBC and NED patients (Table 1). However, relative to the certifiably healthy controls, statistically significant changes were found in more than half of the glycan nodes measured in NED, NMIBC, and MIBC patients (Table 1). Among these glycan nodes, the only one that was decreased in the current and former cancer patient samples was 4-linked glucose (i.e., 4-Glc, which is mostly derived from glycolipids). The same trend was previously observed in lung cancer patient samples.21 The rest of the altered nodes were increased in current and former UCC patients compared to the certifiably healthy controls.
a Individual hexose residues were normalized to heavy glucose and individual HexNAc residues were normalized to heavy GlcNAc.
b Significance was determined by the Kruskal-Wallis test followed by the Benjamini-Hochberg correction procedure at a 5% false discovery rate.
c “ns” stands for “not significant”. “i” and “d” stand for “increased” or “decreased” glycan levels in the cohort with clinically more advanced disease listed in the column header. “i” or “d” indicates p < 0.05, “ii” or “dd” indicates p < 0.01, “iii” or “ddd” indicates p < 0.001, and “iiii” or “dddd” indicates p < 0.0001.
There were four glycan nodes that were most elevated in the current and former UCC patients relative to the certifiably healthy controls, including 6-linked galactose, 2,4-linked mannose, 2,6-linked mannose, and 3,4-linked GlcNAc. These nodes correspond to α2-6 sialylation, β1-4 branching, β1-6 branching, and outer-arm fucosylation, respectively.18,21 The univariate distributions of these four glycan nodes in each of the four clinical groups are shown in
a Significance was determined by the Kruskal-Wallis test followed by the Benjamini-Hochberg correction procedure at a 5% false discovery rate.
b “ns” stands for “not significant”. “i” and “d” stand for “increased” or “decreased” glycan levels in the cohort with clinically more advanced disease listed in the column header. “i” or “d” indicates p < 0.05, “ii” or “dd” indicates p < 0.01, “iii” or “ddd” indicates p < 0.001, and “iiii” or “dddd” indicates p < 0.0001.
The average age of the certifiably healthy living kidney donors (controls) was 47, while the average age for the NED, NMIBC and MIBC patients was 74, 76, and 73, respectively. Yet after correcting for multiple comparisons, no statistically significant correlation of any glycan node with age could be found when pooling data from all cohorts and evaluating correlations for the age range in which there was overlap between the controls and the current and former UCC patients (i.e., ages 45-67; see
Prognostic Value of Glycan Nodes. Within the NED cohort there were numerous samples with high levels of specific glycan nodes that were well out of the range observed in the controls, and similar to those in the cancer patient samples, even though the NED patients were clinically free of disease (
CRP Correlation with Glycan Nodes. CRP was measured in order to correlate changes in patient glycan nodes with patient inflammation status. The average level of CRP in the certifiably healthy controls was 1.76 mg/L, whereas the NED, NMIBC, and MIBC samples had average CRP levels of 3.84, 3.21, and 3.08 mg/L, respectively, all of which are above the normal range of CRP (<3.0 mg/L) [28]. The levels of 6-linked galactose, which corresponds to α2-6 sialylation, positively correlated with CRP (r=0.34, p<0.001), as did the levels of 2,6-linked mannose, which corresponds to β1-6 branching (r=0.38, p<0.001) (
Out of 19 quantified glycan nodes, four of them, each corresponding to a unique glycan feature including α2-6 sialylation, β1-4 branching, β1-6 branching and outer-arm fucosylation, were most significantly elevated in UCC patients compared to certifiably healthy individuals (Table 1 and
In order to interpret the physiological significance of these findings, it must be understood that the glycans being measured are from high-concentration glycoproteins derived primarily from the liver (e.g., transferrin, alpha-2-macroglobulin, and haptoglobin) and the immune system (i.e., IgG antibody glycans) rather than being sloughed off or secreted by cancer cells themselves.31,32 These macro-level (mg/mL scale) changes in blood plasma glycan biochemistry are thought to be mediated, at least in part, by cytokines secreted from the tumor which are recognized by the liver and/or immune system as part of a systemic inflammatory response, altering the way that these two major glycoprotein-producing systems glycosylate their proteins.33-38
With this in mind, there are three possible causes for the increases in various glycan nodes observed in Table 1 and
Overall, the glycan node distributions observed here in UCC suggest that UCC makes modest, early-stage alterations to blood plasma glycans that, even at stages III-IV, do not reach the extreme levels observed in pancreatic, ovarian, lung and other types of cancer.21 To illustrate, lung cancer patient glycan node data from Example 1 are compared side-by-side with UCC patient glycan node data in
Though the reason(s) for glycan node elevation in nominally cancer-free individuals are not fully known, it has previously been shown that serum glycans can be elevated in inflammatory patient states in the absence of cancer.22-24 Moreover, chronic inflammation is known to be closely associated with the development of cancer.42-44 Together with the observations presented here, this suggests that the elevated plasma glycan levels observed in former UCC patients (currently in the NED state) that are prognostic of recurrence may be driven by or simply part of inflammatory processes. To assess this possibility, we measured CRP concentrations and found them to be significantly correlated with levels of both α2-6 sialylation and β1-6 branching (
This raises the question of whether there is a mechanistic connection between alterations in plasma glycans (associated with inflammation) and the development or progression of cancer. There is evidence for the concept that the biological landscape experiences “grooming” or premetastatic “niche” formation prior to cancer establishing residence within the body.45-49 And while glycans are not solely responsible for this process, evidence exists that they play important roles. As previously summarized21 and as others have explained in detail, cell-surface glycans that facilitate resistance to galectin-mediated apoptosis48,50-52 (including poly-N-acetyllactosamine modified core 2 O-glycans53-58) as well as sialylated glycans that stimulate the inhibitory Siglec-7 receptor on natural killer cells53-56 have important roles to play in helping cancer evade the body's natural immunity.
The results of Example 2 show that, relative to healthy individuals, there is a significant alteration of P/S glycan features that correlates with inflammation and is present at the onset of UCC—but that, unlike other types of cancer that we have observed to date21, does not change in a stage-dependent manner—even when UCC patients go into remission. Certifiably healthy individuals cannot be considered to be clinically relevant controls for the development of cancer diagnostics—but they do illustrate the striking changes in blood biochemistry that occur as cancer develops and takes hold in the human body. Thus taken together with Example 1, the findings of Example 2 suggest that if there are clinical applications for P/S glycan node measurements, they most likely lie in evaluating cancer patient relapse or progression risk—or in monitoring nominally healthy persons who exhibit behaviors such as smoking that put them at risk for the biochemical transition between a genuinely healthy state and one in which their blood chemistry (above and beyond mere behavior) reveals a truly high-risk state.
α2-6 sialylation, β1-4 branching, β1-6 branching, and outer-arm fucosylation were found to be significantly elevated in both current and former (in remission) UCC patients relative to certifiably healthy living kidney donors, with ROC curve c-statistics averaging approximately 0.8—yet this does not make them clinically relevant diagnostic biomarkers of UCC. Differences between patients with muscle invasive UCC, non-muscle invasive UCC and patients in remission were not statistically significant. For UCC patients in remission, α2-6 sialylation and β1-6 branching were prognostic indicators of recurrence and were correlated with CRP levels (r=0.34 & 0.38, resp.; p<0.001), a known prognostic marker in UCC. Results highlighted the pronounced difference between the serum glycan biochemistry of healthy individuals vs. any stage of UCC (including remission) and underscored the concept that for plasma glycans the transition between a healthy state and an at-risk state is much more pronounced than that between an at-risk state and early stage cancer.
Materials. Heavy, stable-isotope-labeled D-glucose (U-13C6, 99%; 1,2,3,4,5,6,6-D7, 97-98%) was obtained from Cambridge Isotope Laboratories (Tewksbury, Mass.). Acetone was acquired from Avantor Performance Materials (Center Valley, Pa.). Methanol was purchased from Honeywell Burdick & Jackson (Muskegon, Mich.). Acetonitrile and methylene chloride were obtained from Fisher Scientific (Fair Lawn, N.J.). Dimethyl sulfoxide (DMSO), iodomethane (99%, Cat. No. 18507), chloroform, trifluoroacetic acid (TFA), ammonium hydroxide, sodium borohydride, acetic anhydride, sodium acetate, and sodium hydroxide beads (20-40 mesh, Cat. No. 367176) were acquired from Sigma-Aldrich. Pierce spin columns (900 μL volume) were purchased from ThermoFisher Scientific (Waltham, Mass., Cat. No. 69705). GC-MS autosampler vials and Teflon-lined pierceable caps were obtained from Thermo-Fisher Scientific. GC consumables were acquired from Agilent (Santa Clara, Calif.); MS consumables were obtained from Waters (Milford, Mass.).
Plasma and Serum Samples. All specimens were collected in compliance with the Declaration of Helsinki principles. Once collected, they were coded and deidentified to protect patient identities.
Women Epidemiology Lung Cancer (WELCA) Set. EDTA plasma samples from stage I-IV lung cancer patients and age-matched controls were collected at 12 different collection centers in France.26 This study was approved by the Institutional Review Board of the French National Institute of Health and Medical Research and by the French Data Protection Authority (IRB-Inserm, no. 3888 and CNIL no. C13-52). As part of the WELCA Study, all-female lung cancer patients were recruited between September 2014 and December 2017, and age-matched all-female controls were recruited between June 2015 and December 2017. All women living in Paris and the Île-de-France area who were newly diagnosed with lung cancer were considered eligible cases. Age-matched controls were randomly sampled from women living in the same area without a history of lung cancer. All peripheral blood samples were drawn and processed following a written standardized protocol.26 Briefly, after transport to the laboratory at 4° C., blood samples collected in tubes containing EDTA additive were spun for 15 min at 3000 rpm and 4° C. in a standard centrifuge. Then the collected plasma samples were aliquoted and periodically transported on dry ice to the central repository for final storage at −80° C. No freeze-thaw cycles occurred prior to shipment to Arizona State University (Borges lab) for analysis. A detailed profile of the clinical characteristics of the patients in this WELCA study is given in Table S1 of Hu et al., J. Proteome Res. 2019 Nov. 1; 18(11):3985-3998.
Dual Gender Lung Cancer Set. Sodium heparin plasma samples from a lung cancer study consisting of patients and controls of both genders were collected by Dr. Xifeng Wu at the University of Texas MD Anderson Cancer Center. Even though it is a glycosaminoglycan itself, heparin possesses monomer units that are predominately carboxylated, sulfated or both, and thus cannot be directly detected by the analytical methodology used in this study. As reported previously, there are only negligible differences between glycan nodes measured in heparin plasma vs EDTA plasma or serum,19 and thus direct comparisons were made for these three types of biospecimens. Venous blood samples were collected from newly diagnosed and histologically confirmed lung cancer patients prior to therapy at the MD Anderson Cancer Center hospital. Blood samples of age-, gender-, smoking-, and ethnicity-matched controls were collected at the Kelsey-Seybold Clinic. All blood samples have been collected since 1995 and processed following the same SOP. These specimens have previously been described.19
Stage I-Only Lung Cancer Set (Also Dual Gender). Serum samples for dual gender stage I lung adenocarcinoma patients were collected together with age-, gender-, and smoking-status-matched controls, under NYU IRB approval at the NYU Langone Medical Center by Dr. Harvey Pass. Arterial blood samples were drawn from fasting patients undergoing surgery between September 2006 and August 2013 to remove one or more lung nodules that were detected during a CT scan. A pathological exam of the excised nodules was performed to determine whether nodules were benign or malignant. Serum was collected under a standardized procedure. These specimens have previously been described.19
Plasma Samples for the Stability Study. The samples employed for the ex vivo thawed-state stability study included EDTA plasma samples from three healthy male and two healthy female donors. These samples were aliquoted and stored at different temperatures over the course of a year, with their matched control aliquots stored continuously at −80° C. The mistreatment conditions included 10 days at −20° C., 90 days at −20° C., 360 days at −20° C., 2 days at 4° C., 90 days at 4° C., and 1 day at 25° C. At the end of the 360-day time point, glycan node analysis was performed on all the mistreated sample aliquots and their matched control aliquots.
Additional Biospecimen Details. A summary of the case-control sample sets discussed in this study is provided in Table S2 of Hu et al., J. Proteome Res. 2019 Nov. 1; 18(11):3985-3998. A 300 mL plasma sample from an individual donor was obtained from BioIVT, which served as a quality control sample to ensure batch-to-batch quantitative reproducibility. All specimens were stored at −80° C. prior to analysis.
The glycan node analysis procedure was adapted from Borges et al.16,17
Permethylation, Nonreductive Release, and Purification of Glycans. Nine microliters (9 μL) of blood plasma and 1 μL of a 5 mM solution of heavy-labeled D-glucose (U-13C6, 99%; 1,2,3,4,5,6,6-D7, 97-98%) and N-acetyl-D-[UL-13C6]-glucosamine were mixed in a 1.5 mL Eppendorf tube, followed by the addition of 270 μL of DMSO. About 0.7 g of sodium hydroxide beads were collected in a Pierce spin column (900 μL volume) and washed once with 350 μL of acetonitrile (ACN) followed by two rinses with 350 μL of DMSO. The plasma sample was mixed in with 270 μL of DMSO and 105 μL of iodomethane followed by immediate mixing. The whole mixture was then added to the preconditioned NaOH beads in the plugged microfuge spin column. After occasional gentle stirring of the sample solution in the NaOH column for 11 min, the microfuge spin column was unplugged and spun for 30 s at 5000 rpm (1000 g in a fixed-angle rotor). The collected sample solution was quickly transferred into 3.5 mL of 0.5 M NaCl solution in 0.2 M sodium phosphate buffer (pH 7) within a silanized 13×100 mm glass test tube. To maximize glycan recovery, the NaOH beads were then washed twice with 300 μL of ACN, with all spin-throughs immediately transferred into the same silanized glass test tube. To perform liquid/liquid (L/L) extraction, 1.2 mL of chloroform was added to each test tube, which was then capped and shaken well. After brief centrifugation to separate the layers, the aqueous layer (top) was discarded and then replaced by a fresh aliquot of 3.5 mL of 0.5 M NaCl solution in 0.2 M sodium phosphate buffer (pH 7). After three L/L extraction rounds, the chloroform layer was finally recovered and dried under a gentle stream of nitrogen in a heater block set to 74° C.
Hydrolysis, Reduction, and Acetylation. To perform TFA hydrolysis, each sample was mixed with 2 M TFA (325 μL) and incubated at 121° C. for 2 h, then dried under a gentle stream of nitrogen in a heater block set to 74° C. To reduce the sugar aldehydes, each sample was incubated at room temperature for 1 h after dissolution in 475 μL of freshly made 10 mg/mL sodium borohydride in 1 M ammonium hydroxide. To remove excess borate, 63 μL of methanol (MeOH) was added and dried under nitrogen, followed by the addition of 125 μL of 9:1 (v/v) MeOH:acetic acid. Samples were then dried under nitrogen and fully dried in a vacuum desiccator for 20 min. The last step was acetylation of nascent hydroxyl groups, in which 18 μL of deionized water was first added to each test tube to dissolve any precipitates. After adding 250 μL of acetic anhydride and sonicating in a water bath for 2 min, each sample was incubated for 10 min at 60° C., then mixed with 230 μL of concentrated TFA and incubated again at 60° C. for 10 min. To clean up the sample mixture, L/L extraction was performed twice after adding 1.8 mL of dichloromethane and 2 mL of deionized water to each test tube. With the aqueous layer (top layer) discarded in each round, the organic layer of each sample was then transferred to a silanized autosampler vial, dried under nitrogen, and reconstituted in 120 μL of acetone; the vial was then capped in preparation for injection onto the GC-MS.
Gas Chromatography-Mass Spectrometry. An Agilent Model A7890 gas chromatograph (equipped with a CTC PAL autosampler) coupled to a Waters GCT (time-of-flight) mass spectrometer was employed to analyze the prepared samples. For each sample, 1 μL of the 120 μL total volume was injected onto a hot (280° C.), silanized glass liner (Agilent Cat. No. 5183-4647) containing a small plug of silanized glass wool at a split ratio of 20:1. A 30-m DB-5ms GC column was used to separate the sample components, with helium as the carrier gas at a flow rate of 0.8 mL/min. The GC oven temperature was initially kept at 165° C. for 0.5 min, then increased to 265° C. at a rate of 10° C./min, followed by immediate ramping to 325° C. at a rate of 30° C./min, and finally held at 325° C. for 3 min. Sample components eluting from the GC column were subjected to electron ionization (70 eV, 250° C.). Positive-ion mode mass spectra from individual TOF pulses over an m/z range of 40-800 were summed every 0.1 s. Daily tuning and calibration of the mass spectrometer was performed with perfluorotributylamine to ensure reproducible relative abundances of EI ions and mass accuracy within 10 ppm.
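As a worked check of the oven program described above, the total GC run time per injection follows from the hold times and ramp rates. The short Python sketch below performs that arithmetic; the variable and function names are illustrative.

```python
# Oven program taken from the text: holds in minutes,
# ramps as (start_temp_C, end_temp_C, rate_C_per_min).
initial_hold_min = 0.5
ramps = [(165.0, 265.0, 10.0), (265.0, 325.0, 30.0)]
final_hold_min = 3.0


def total_run_time(initial_hold, ramp_segments, final_hold):
    """Sum the hold times and the time spent in each linear temperature ramp."""
    ramp_time = sum((end - start) / rate for start, end, rate in ramp_segments)
    return initial_hold + ramp_time + final_hold


run_time_min = total_run_time(initial_hold_min, ramps, final_hold_min)
# 0.5 min hold + 10 min first ramp + 2 min second ramp + 3 min hold = 15.5 min
```

This excludes oven cool-down and re-equilibration between injections, which the text does not specify.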
Data Processing. QuanLynx 4.1 software was employed to integrate the summed extracted-ion chromatogram (XIC) peak areas for all glycan nodes. The peak areas were automatically integrated and manually verified, then exported to a spreadsheet for further analysis.
Two possible normalization approaches were considered: (1) individual hexoses were normalized to heavy glucose, and individual N-acetylhexosamines (HexNAcs) were normalized to heavy N-acetyl glucosamine (GlcNAc); (2) individual hexoses were normalized to the sum of all endogenous hexoses, and individual HexNAcs were normalized to the sum of all endogenous HexNAcs. The second normalization approach tends to provide better interbatch reproducibility (<9% average CV for the six most elevated glycan nodes), but the first approach performs better in identifying the potential increases of all glycan nodes in the patient groups relative to the control group while maintaining a reasonable interbatch % CV (i.e., <21%). Thus, results reported below are based on normalization with heavy glucose and heavy GlcNAc, unless otherwise stated. The raw data of all XIC peak areas for all samples, together with the normalized data by the two normalization approaches and % CV values for batch-to-batch quality control (QC) samples are provided in a spreadsheet available as Supporting Information of Hu et al., J. Proteome Res. 2019 Nov. 1; 18(11):3985-3998.
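The interbatch % CV referred to above can be computed from the QC sample's normalized value for a given glycan node in each batch. The sketch below assumes the conventional definition (sample standard deviation divided by the mean, times 100) and uses hypothetical QC values; the text does not state which standard deviation estimator was used.

```python
import statistics


def percent_cv(values):
    """Coefficient of variation in percent: sample SD / mean * 100."""
    return statistics.stdev(values) / statistics.mean(values) * 100.0


def average_percent_cv(values_by_node):
    """Average the per-node % CV across a set of glycan nodes."""
    return statistics.mean(percent_cv(v) for v in values_by_node.values())


# Hypothetical normalized QC values for two nodes across three batches.
qc_by_node = {
    "6-Gal": [0.95, 1.00, 1.05],
    "2,4-Man": [1.80, 2.00, 2.20],
}
cv_6gal = percent_cv(qc_by_node["6-Gal"])
cv_avg = average_percent_cv(qc_by_node)
```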
For the glycan node data of each cohort, outliers were removed by log-transformation and the ROUT method at Q=1% using GraphPad Prism 7. Outlier-removed data were then reverse transformed by taking the antilog of each value. To identify differences between cohorts, the Kruskal-Wallis test followed by the Benjamini-Hochberg false discovery correction procedure was performed at a 5% false discovery rate using GraphPad Prism 7. RStudio Version 1.0.143 was used to compare different receiver operating characteristic (ROC) curves by DeLong's test or a bootstrap test. The ROC curves shown in figures were plotted with GraphPad Prism 7. Correlations of glycan nodes with age or smoking pack-years were assessed via Spearman's rank correlation in GraphPad Prism 7. Stage-by-stage multivariate modeling was performed using multivariate logistic regression in RStudio Version 1.0.143, with assessment carried out by leave-one-out cross-validation and model selection done using a best subsets procedure. The ability of specific glycan nodes to predict lung cancer survival was evaluated with a Cox proportional hazards regression model in SAS 9.4. GraphPad Prism 7 was used to generate survival curves and perform the associated log-rank Mantel-Cox tests.
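Spearman's rank correlation, used above for age and smoking pack-years, is the Pearson correlation computed on ranks. The following pure-Python sketch illustrates the calculation with tied values assigned their average rank; it is not the GraphPad Prism implementation, and the function names are illustrative.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))
```

Because only ranks enter the calculation, a monotonic but nonlinear relationship still yields a coefficient of 1 or -1.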
Cancer patient enrollment for the WELCA study took place at 12 different sites. In some cases, samples were permitted to sit overnight at 4° C. prior to final processing and storage at −80° C. In other cases, sample aliquots were temporarily stored at −20° C. prior to shipment a few weeks later to the central repository where they were kept long-term at −80° C. Accordingly, the stability of glycan nodes in EDTA plasma kept at room temperature, 4° C., and −20° C. for varying lengths of time was assessed.
Five EDTA plasma samples from separate healthy donors (three male and two female), were aliquoted and temporarily kept at −20° C. for 10, 90, and 360 days, 4° C. for 2 or 90 days, room temperature for 1 day, or kept continuously at −80° C. Samples kept temporarily at temperatures warmer than −80° C. were compared with their respective control aliquots kept continuously at −80° C. The glycan nodes that are typically present at >1% relative abundance within their respective hexose or HexNAc class were measured and normalized to heavy, stable isotope-labeled glucose and GlcNAc internal standards or, alternatively, normalized to the sum of endogenous hexoses or HexNAcs. No significant differences were observed in the data sets normalized to the sum of endogenous hexoses/HexNAcs. When normalized to heavy, stable isotope-labeled glucose and GlcNAc internal standards, the only significant difference observed was an increase in 6-linked galactose for samples stored at room temperature for 1 day (p=0.033; Table S3 of Hu et al., J. Proteome Res. 2019 Nov. 1; 18(11):3985-3998). Thus, under the mildly adverse conditions to which some of the specimens in this study may have been exposed (less than a day at 4° C. or up to a few weeks at −20° C.), glycan nodes were found to be stable.
Notably, a study of the impact of plasma vs serum matrices on glycan nodes was previously reported in this journal.19 Differences observed were modest and did not impact the biological results of either the previous study or this one.
Basic clinical characteristics and n-values of the WELCA sample set were described in the Materials and Methods section and Table S1 of Hu et al., J. Proteome Res. 2019 Nov. 1; 18(11):3985-3998. All 207 control and 208 stage I-IV patient samples were randomized and analyzed in 27 batches. Within each control and case sample, a total of 19 glycan “nodes” were measured. Each of these nodes contributed at least 1% of the total hexose or total N-acetylhexosamine (HexNAc) signal. Data from each of the 19 glycan nodes were normalized to heavy, isotope-labeled glucose and GlcNAc internal standards. Statistically significant differences were detected in each cancer stage relative to the control cohort: 10, 6, 18, and 19 out of 19 glycan nodes were increased in stage I, II, III, and IV, respectively (Table 1 of Hu et al., J. Proteome Res. 2019 Nov. 1; 18(11):3985-3998). Data for each glycan node normalized to the sum of endogenous hexoses or HexNAcs were analyzed analogously (Table S4 of Hu et al., J. Proteome Res. 2019 Nov. 1; 18(11):3985-3998). This revealed shifts in glycan compositions in stage I-IV patients vs controls. However, because quantitative changes in glycans tended to outpace glycan compositional changes (as previously observed19), this normalization procedure was not as sensitive in distinguishing age-matched controls from lung cancer patients at each stage.
Six glycan nodes were found to be significantly elevated at nearly every stage in lung cancer patients relative to the age-matched controls, and these included: 2-linked mannose (2-Man) and 4-linked N-acetylglucosamine (4-GlcNAc), both of which are associated with total glycosylation levels especially for N-glycans;14 6-linked galactose, corresponding to α2-6 sialylation;16 2,4-linked mannose, corresponding to β1-4 branching;16 2,6-linked mannose, corresponding to β1-6 branching;16 and 3,4-linked GlcNAc, which primarily corresponds to antennary fucosylation16 (FIG. 3 of Hu et al., J. Proteome Res. 2019 Nov. 1; 18(11):3985-3998). The latter four nodes were among the top five most elevated nodes in a previously reported lung cancer study.19 The receiver operating characteristic (ROC) curve c-statistics (areas under the curve, AUCs) for these six glycan nodes in stage I-IV patients vs controls ranged (with two exceptions) from 0.68 to 0.92 (FIG. 3 of Hu et al., J. Proteome Res. 2019 Nov. 1; 18(11):3985-3998).
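The ROC c-statistic (AUC) reported above equals the probability that a randomly chosen case has a higher marker value than a randomly chosen control, with ties counted as one half. A minimal Python sketch of that pairwise calculation follows; the input values are hypothetical and the function name is illustrative.

```python
def c_statistic(cases, controls):
    """ROC AUC via pairwise comparison of every case with every control.

    Each (case, control) pair scores 1 if the case value is higher,
    0.5 if the two values are tied, and 0 otherwise.
    """
    score = 0.0
    for case_value in cases:
        for control_value in controls:
            if case_value > control_value:
                score += 1.0
            elif case_value == control_value:
                score += 0.5
    return score / (len(cases) * len(controls))


# Hypothetical normalized glycan node values; NOT data from the study.
auc = c_statistic(cases=[3.0, 4.0, 5.0], controls=[1.0, 2.0, 3.0])
```

A value of 0.5 corresponds to no discrimination between groups and 1.0 to complete separation.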
For most of these six glycan nodes there were significant differences between stages (FIG. 3 of Hu et al., J. Proteome Res. 2019 Nov. 1; 18(11):3985-3998), but the most robust differences tended to be between stage IV and stage I-II patients. 2,4-Man, the glycan node indicative of β1-4 branching, was the best at differentiating stage IV vs all other stages of lung cancer. ROC curves showing the ability of β1-4 branching to distinguish between stage IV and all other stages are provided in Figure S1 of Hu et al., J. Proteome Res. 2019 Nov. 1; 18(11):3985-3998.
Relative to the age-matched controls, five of the six top performing glycan node markers in stage I patients, and four in stage II patients, were significantly increased (
No significant alteration of five out of the six top performing glycan node markers was observed when each individual glycan node was separately analyzed for differences among never-smokers, previous smokers and current smokers within the WELCA study control cohort. The only exception was 3,4-linked GlcNAc (corresponding to antennary fucosylation), which was slightly elevated in current smokers relative to previous smokers (Figure S2 of Hu et al., J. Proteome Res. 2019 Nov. 1; 18(11):3985-3998). Spearman's rank correlation analysis demonstrated no statistically significant correlation with smoking pack-years in the control cohort, both for all control patients and control patients with smoking history (smoking pack-year >0). Together, these data revealed that the top performing glycan node markers within the control cohort had negligible dependence on smoking status. (A parallel analysis within the cancer patient cohort was not conducted due to the confounding correlation between smoking and lung cancer.)
The average ages of the control and case cohorts were nearly identical (61.2 and 61.6, respectively; Table S1). After pooling all data from the cases (all stages) and controls, 3,4-linked GlcNAc, corresponding to antennary fucosylation, was found to be weakly correlated with age (correlation coefficient r=0.159, p=0.0016;
The effect of lung cancer histological subtypes on the six glycan nodes was evaluated in the stage IV non-small-cell lung cancer (NSCLC) subcohort (i.e., the largest single-stage subcohort available; Table S1 of Hu et al., J. Proteome Res. 2019 Nov. 1; 18(11):3985-3998). For each glycan node marker, ROC curves of the three histological subtypes of NSCLC (adenocarcinoma, squamous cell carcinoma, and large cell carcinoma) were compared pairwise by DeLong's test or a bootstrap test (Figure S3, Table S8 of Hu et al., J. Proteome Res. 2019 Nov. 1; 18(11):3985-3998). No statistically significant differences between histological subtypes of NSCLC were discovered for any glycan node marker.
These findings on glycan node independence from smoking status, age and histological type are consistent with previously reported findings from other lung cancer case/control studies.19
The WELCA study consisted entirely of women. Thus, to evaluate the role of gender in plasma glycan nodes, we turned to control patient data from the “large lung cancer” cohort of a previous study.19 This set of cancer-free patients consisted of plasma samples from 123 males and 76 females. Since it was not previously done, we looked for gender differences in all 19 glycan nodes evaluated in the WELCA study and found significant decreases in 3,4-linked GlcNAc (the node that corresponds to antennary fucosylation) as well as total fucose in females relative to males, regardless of whether data were normalized to heavy Glc/GlcNAc or to the sum of endogenous hexoses/HexNAcs (p<0.05 or lower after applying the Benjamini-Hochberg false discovery correction procedure). These observed decreases in antennary fucosylation agree with previously published findings on studies of women of approximately the same age.28,29 Moreover, in the WELCA study we found that the observed increase in antennary fucosylation with age (FIG. 5 of Hu et al., J. Proteome Res. 2019 Nov. 1; 18(11):3985-3998) agreed with that previously observed by Reiding et al.29
Notably, we also observed increases in 2,4-linked mannose, corresponding to β1-4 branching, in women compared to men (p<0.05 for both heavy Glc/GlcNAc and endogenous normalizations). These findings align with those of Knežević et al.28 and Reiding et al.,29 who found modest increases in triantennary and tetraantennary glycans in women relative to men, though for this glycan feature only the study of Knežević et al. revealed a statistically significant difference.28
The clinical performance characteristics of total glycosylation (i.e., total hexoses, total HexNAcs, and the sum of total hexoses and total HexNAcs) were evaluated and compared to individual glycan node markers on a stage-by-stage basis (Table S9 of Hu et al., J. Proteome Res. 2019 Nov. 1; 18(11):3985-3998). Results of ROC curve comparisons by paired DeLong's tests demonstrated that total glycosylation does not distinguish stage I-IV cases from controls any better than individual glycan node markers.
Additionally, multivariate logistic regression models were built and compared with the clinical performance characteristics of individual glycan nodes at each stage (Figure S4 of Hu et al., J. Proteome Res. 2019 Nov. 1; 18(11):3985-3998). Fully cross-validated multivariate logistic regression models were no better at detecting lung cancer than the top-performing individual glycan node at each respective stage. Again, these results were consistent with previous observations in lung cancer.19
To evaluate the ability of the six glycan nodes to predict all-cause mortality, glycan node data were broken into quartiles and analyzed by Cox proportional hazards regression, with adjustment for age, smoking status, and cancer stage (Table S10 of Hu et al., J. Proteome Res. 2019 Nov. 1; 18(11):3985-3998). First and foremost, for patients in all four stages, the top quartiles of all six glycan node markers predicted all-cause mortality with hazard ratios in the range of 2-3 and p<0.01, relative to all other quartiles combined. The different rates of death for the top quartile versus all other quartiles for each glycan node marker are illustrated by survival curves (FIG. 6 of Hu et al., J. Proteome Res. 2019 Nov. 1; 18(11):3985-3998).
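The grouping step that precedes the survival analysis, splitting patients into the top quartile versus all other quartiles of a glycan node, can be sketched as below. The Cox proportional hazards fit itself (performed in SAS 9.4) is not reproduced here. The quartile cut point uses Python's default quantile method, which may differ slightly from the SAS definition, and the input values are hypothetical.

```python
import statistics


def top_quartile_flags(values):
    """Flag each value that lies above the third-quartile cut point."""
    _q1, _q2, q3 = statistics.quantiles(values, n=4)
    return [value > q3 for value in values]


# Hypothetical normalized values of one glycan node for eight patients.
node_values = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
in_top_quartile = top_quartile_flags(node_values)
```

The resulting boolean indicator would then serve as the covariate of interest in a hazards model, alongside the adjustment variables named above (age, smoking status, and cancer stage).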
When focusing on stage III and IV patients, the top quartiles of all six glycan node markers predicted all-cause mortality with hazard ratios in the range of 2-3 and p<0.05 (Table S10 of Hu et al., J. Proteome Res. 2019 Nov. 1; 18(11):3985-3998) relative to all other quartiles combined (survival curves shown in Figure S5 of Hu et al., J. Proteome Res. 2019 Nov. 1; 18(11):3985-3998). Similar results were observed for stage IV patients only (Table S10 of Hu et al., J. Proteome Res. 2019 Nov. 1; 18(11):3985-3998). However, when stage III patients were analyzed alone, the hazard ratios of all six glycan nodes were not significantly different from 1 (p>0.05), indicating the relative risk of death was not detectably different between patients in the top quartile vs all other quartiles of each glycan node. 6-linked galactose (corresponding to α2-6 sialylation) and 2,4-linked mannose (corresponding to β1-4 branching) were significantly different between stages III and IV (
Overall, these results for glycan node-based prediction of mortality vary slightly, but are largely consistent with previously reported results on the ability of α2-6 sialylation and branched mannose residues to predict all-cause mortality in lung cancer.
Six out of 19 quantified glycan nodes, corresponding to total glycosylation levels (especially for N-glycans), α2-6 sialylation, β1-4 branching, β1-6 branching, and antennary fucosylation, were significantly elevated in the WELCA lung cancer patients relative to age-matched controls. These findings in the WELCA set are highly consistent with our previously reported lung cancer study on a dual gender lung cancer set,9 which also demonstrated the distinct increase of the latter four glycan features within stage III-IV cases compared to their respective control cohorts.
Our observations of the glycan node-based feature changes in lung cancer patients are closely aligned with the intact glycan changes reported in lung cancer by Vasseur and colleagues.30 Their intact glycan analysis results primarily revealed significant increases in antennary fucosylation, as well as fucosylated tri- and tetra-antennary N-glycans, findings that are in line with the increases observed here in β1-4 branching and β1-6 branching.
The six top-performing glycan node-based features in this study were not only able to distinguish lung cancer patients from age-matched controls, but were also able to predict all-cause mortality in the WELCA set, a finding that agrees well with the survival-predicting nodes in our previously reported study on the dual gender lung cancer set.19 Similar discoveries regarding the prognostic capacity of P/S glycans have also been reported by other groups. Hashimoto and colleagues31 suggested that specific glycoforms of serum α1-acid glycoprotein (AGP) seemed to predict progression and mortality of several carcinomas, including lung cancer. According to their follow-up studies, patients who had AGP glycoforms containing highly fucosylated and branched sugar chains tended to have a poor prognosis. Besides the glycan features discussed above, another good prognostic predictor of lung cancer is the sialyl Lewis X epitope (SLex),32 which contains α2-3 sialylation instead of α2-6 sialylation. Progression and survival in both non-small-cell33-35 and small-cell lung cancer36 can be predicted by SLex.
Most clinical trials require that enrolled patient life expectancy exceed three months such that a benefit from treatment can be observed, yet formal guidelines are generally not provided to facilitate this prediction.37 Glycan nodes representing α2-6 sialylation and β1-4 branching both performed well as prognostic indicators of survival within stage IV patients (Figure S6 of Hu et al., J. Proteome Res. 2019 Nov. 1; 18(11):3985-3998), and as such they may be able to provide some clinical utility toward this end.
Unlike the other two lung cancer sets that we reported on previously,19 some glycan node-based features were substantially altered in the WELCA lung cancer patients at stages I-II (FIG. 4 of Hu et al., J. Proteome Res. 2019 Nov. 1; 18(11):3985-3998). Even though a relatively low number of early stage samples were measured (n=16 and 13 for stage I and II, respectively), statistically significant elevations were detected in most of the six glycan node markers, alongside comparatively high ROC c-statistics. Outside of a statistical anomaly, there are two possible noncancer-related causes for this phenomenon. First, since the lung cancer patients and controls enrolled in the WELCA study are all female, a distinct gender dependence of glycan features may exist, especially in early stages. However, this possibility was not supported by the data: no significant difference was detected between men and women in stage I and II of the dual gender lung cancer set, or in the stage I-only lung cancer set, which was also dual gender (Table S6 of Hu et al., J. Proteome Res. 2019 Nov. 1; 18(11):3985-3998). The second possible explanation is that the nonsmoking-matched controls of the WELCA set may have lower relative abundances of all the glycan nodes of interest relative to the smoking-matched controls for other lung cancer sets. In the WELCA set most controls were never-smokers, but the cancer patients were mainly current-smokers (Table S1 of Hu et al., J. Proteome Res. 2019 Nov. 1; 18(11):3985-3998), suggesting that smoking history might contribute to increases in some glycan nodes. Taken together with the observation that the top performing glycan node markers within the control cohort had near-negligible dependence on smoking status (Figure S2 of Hu et al., J. Proteome Res. 2019 Nov. 1; 18(11):3985-3998), smoking appears to contribute to a slight, but mostly statistically insignificant, elevation of glycan nodes.
Smoking is undoubtedly bad for the liver,38 which secretes approximately half of all circulating glycoproteins.39,40 Nevertheless, these results, which indicate only a mild contribution of smoking to alterations in circulating glycan nodes, are in full agreement with results from a previous study of glycan nodes in lung cancer patients in which controls were smoking status-matched to the lung cancer patients, and in which only minor impacts of smoking on glycan nodes within the control population were observed.19
Many studies have reported important gender differences in lung cancer between men and women, in terms of histological type, tobacco exposure, and survival and treatment response.41,42 Here, by comparison with previously conducted studies,19 no obvious gender differences were detected with regard to P/S glycan features. Smoking is the primary risk factor for lung cancer. However, a large percentage of women with lung adenocarcinoma (between 20% and 30% in Western countries and nearly 80% in Asian countries) are nonsmokers.26 Hence, some female-specific risk factors for lung cancer must exist and may play vital roles in lung cancer development, progression and survival; these may include hormonal factors and occupational risk factors in female occupations, as suggested by Stücker et al.26
As represented by glycan nodes, blood plasma glycans were found to be stable under a variety of less-than-ideal sample storage conditions. The diagnostic and prognostic capacity of plasma glycan features in stage I-IV lung cancer, as represented by monosaccharide and linkage-specific glycan nodes, was validated in the WELCA case-control study. Significant elevation of α2-6 sialylation, β1-4 branching, β1-6 branching, antennary fucosylation, and total N-glycosylation level was observed in almost every stage of lung cancer relative to age-matched control groups. Early stage detection was stronger than previously observed,19 but this observation may have been related to the lack of smoking status-matching between cases and controls in the WELCA study. Nevertheless, alteration of glycan features in lung cancer was found to be almost completely independent of smoking status, age, and histological subtypes of lung cancer. The six most-elevated glycan features predicted all-cause mortality in lung cancer patients after adjusting for age, smoking status, and cancer stage. No gender-based differences were discovered in glycan features associated with lung cancer.
The following references are hereby incorporated by reference in their entireties:
B., Comprehensive native glycan profiling with isomer separation and quantitation for the discovery of cancer biomarkers. The Analyst 2011, 136, (18), 3663-71.
While particular materials, formulations, operational sequences, process parameters, and end products have been set forth to describe and exemplify this invention, they are not intended to be limiting. Rather, it should be noted by those ordinarily skilled in the art that the written disclosures are exemplary only and that various other alternatives, adaptations, and modifications may be made within the scope of the present invention. Accordingly, the present invention is not limited to the specific embodiments illustrated herein but is limited only by the following claims.
This invention was made with government support under R33 CA191110 awarded by the National Institutes of Health. The government has certain rights in the invention.
Number | Date | Country
---|---|---
62758026 | Nov 2018 | US