The present invention relates generally to diagnostic testing. More particularly, the present invention relates to a diagnostic test for detecting stage I lung cancer biomarkers.
Lung cancer is one of the most commonly occurring types of cancer, and it accounts for almost 25% of all cancer deaths. Treatment and long-term outcomes are dependent on the stage and type of lung cancer, as well as on the patient's health. While it is possible to diagnose lung cancer with medical imaging, it would be helpful to find additional diagnostic methods to allow for earlier diagnosis and treatment.
Regular screening of at-risk patients has previously shown a mortality benefit, and early detection remains a patient's best chance of survival. The National Lung Screening Trial (NLST) demonstrated a 20% relative decrease in lung cancer mortality with low-dose CT (LDCT) scans, with a sensitivity of 93.8%, specificity of 73.4%, and negative predictive value of 99.9%. Due to the results of this trial, the LDCT scan has become the gold standard for early lung cancer detection. Despite these efforts, CT screening has suffered from slow adoption, in part due to its 27% false positive rate, which has led to unnecessary procedures with associated morbidity and mortality. Only 15% of lung cancer patients are diagnosed at an early stage. If the cancer is detected at stage 1, five-year survival can exceed 90%; thus, additional early identification tests are needed.
It would therefore be advantageous to provide a new method for diagnosis of lung cancer, while it is in its earliest stage.
In accordance with an embodiment, the present invention provides a method of detecting stage one lung cancer in a subject including collecting a breath sample from the subject. The method also includes analyzing the breath sample to detect at least one of Acetoin, Dodecane, and p-Cymene. The method further includes initiating a follow-up plan for the subject, if the at least one of Acetoin, Dodecane, and p-Cymene are detected.
In accordance with an aspect of the present invention, the method includes collecting multiple breath samples from the subject. The method includes using a device for analysis of the VOCs in the breath. The method includes using the device more than once, in order to confirm results. Additionally, the method includes collecting the breath sample in a bag or other receptacle. The bag or other receptacle takes the form of a Tedlar® bag or other film bag. The method includes analyzing the breath samples within 24 hours of collection, and in some instances includes analyzing the breath samples within 2 hours of collection. The method includes using a gas chromatograph for analysis of the breath sample. The follow up plan further includes additional testing, treatment, preventative and/or lifestyle changes.
The accompanying drawings provide visual representations, which will be used to more fully describe the representative embodiments disclosed herein and can be used by those skilled in the art to better understand them and their inherent advantages. In these drawings, like reference numerals identify corresponding elements and:
The presently disclosed subject matter now will be described more fully hereinafter with reference to the accompanying Drawings, in which some, but not all embodiments of the inventions are shown. Like numbers refer to like elements throughout. The presently disclosed subject matter may be embodied in many different forms and should not be construed as limited to the embodiments set forth herein; rather, these embodiments are provided so that this disclosure will satisfy applicable legal requirements. Indeed, many modifications and other embodiments of the presently disclosed subject matter set forth herein will come to mind to one skilled in the art to which the presently disclosed subject matter pertains having the benefit of the teachings presented in the foregoing descriptions and the associated drawings. Therefore, it is to be understood that the presently disclosed subject matter is not to be limited to the specific embodiments disclosed and that modifications and other embodiments are intended to be included within the scope of the appended claims.
An embodiment in accordance with the present invention includes using VOCs in exhaled breath to diagnose stage 1 lung cancer (S1LC). Three potential biomarkers, Acetoin, Dodecane, and p-Cymene, have predictive power for S1LC. Acetoin and Dodecane are predictive with relation to their concentrations in the 1 L breath sample, and p-Cymene is predictive with relation to being above or below the limit of detection. The diagnostic of the present invention is capable of detecting S1LC non-invasively and potentially earlier than other methodologies. This diagnostic can then be paired with appropriate treatments to address S1LC before it grows larger or metastasizes.
The present invention is directed to concentrations of volatile organic compounds (VOCs) in the breath of biopsy-confirmed S1LC patients (cases) and lung-cancer-free individuals (controls). The present invention: (1) is specifically focused on S1LC, which is a very specific, early phase of lung cancer; (2) uses calibration curves for thirteen compounds identified as potential biomarkers to obtain their concentrations expressed in μg/L; and (3) uses a one-case-to-two-controls sampling design and additional protocol features to reduce the potential for bias, confounding, and measurement error.
Volatile organic compounds (VOCs) are carbon-containing compounds that may be produced by the body or may be environmental contaminants. These compounds can be measured in breath, and there has been extensive research into their use for lung cancer diagnostics. Breath analysis for lung cancer has been attempted previously, and each study had limitations. One of the first studies, in 1971, identified 250 different VOCs in human breath samples. As technology advanced, this study was repeated in 1999, and over 3,000 breath VOCs were identified. Multiple studies have since followed, and the search for a VOC signature re-emerged in the 2010s. Fu et al. studied the differences in VOCs between lung cancer patients and healthy controls, with the goal of identifying two or more VOCs as a “fingerprint” of lung cancer. They identified a signature of 4 VOCs and reported a sensitivity and specificity of 89.9% and 81.3%, respectively. A second study, released a year later, analyzed three groups of participants: patients with benign nodules, healthy controls, and lung cancer patients. This study reported a VOC signature with a sensitivity and specificity of 88.5% and 86.5%, respectively. Multiple other studies have been conducted, none of which identified a VOC signature. None of these studies provided a practical and reproducible VOC signature of cancer because: (1) the definitions of “signatures” are impractical, as they only provide a list of compounds; (2) the data underlying the results are not publicly available; (3) the code used to derive the signatures is not available; (4) VOCs are not expressed in units of concentration (e.g., μg/L); and (5) not enough practical details are provided about how to reconstruct the “signature” in a new study.
Moreover, many of these studies have been limited by small sample sizes, heterogeneous experimental conditions, lack of calibration of VOCs to the concentration scale, irreproducible analytic pipelines, and/or lack of a built-in validation component. Despite being extensive, this existing literature does not provide a practical solution to detecting early-stage lung cancer using breath VOCs.
The present invention: (1) identifies VOC signatures and quantifies their discrimination properties; (2) investigates whether simpler signatures (containing fewer VOCs) with large discrimination power exist, and quantifies and ranks their discrimination power; and (3) searches for new VOC signatures and quantifies their discrimination properties.
In order to implement the present invention, breath samples are obtained from the subject. The breath sample is analyzed for the VOCs that are indicative of S1LC. In some embodiments, it may be preferable for the subject to supply more than one sample for analysis. In other embodiments, it is possible that a device for analysis of the VOCs in the breath can be used. In such instances, the device can be used more than once, in order to confirm results. If the subject is to provide a sample for analysis, the breath sample can be provided in a bag or other receptacle known to or conceivable to one of skill in the art, such as a Tedlar® bag or other film bag. When possible, to avoid the effects of time on the samples, breath samples are analyzed within 24 hours of their collection, though, typically, the analysis is conducted within two hours of breath sample collection. In some embodiments, a gas chromatograph is used for analysis. If VOCs indicative of S1LC are found, additional steps can be taken by the diagnosis and treatment team. For instance, the diagnosis and treatment team might perform additional diagnostic testing, initiate treatment, recommend preventative and/or lifestyle changes, or take any other treatment step known to or conceivable to one of skill in the art.
The following examples and data are included herein by way of illustration of the invention. These examples are not meant to be considered limiting, and the invention is considered to include any implementation known to or conceivable to one of skill in the art.
A study used as a basis for the present invention has a matched one-case-to-two-controls design with continuous enrollment. The first 30 cases with matched and housemate control trios are used to conduct preliminary exploratory analyses, sample size projections, and exploratory analyses of new VOC signatures. Some trios may not contain one of the controls, and the resulting data are referred to as groups instead of trios to reflect this reality. Each participant provided two samples of breath air into Tedlar® bags or other film bags, one after the other without breathing in between. The resulting samples are referred to as the bag 1 and bag 2 samples, respectively. For each participant and each bag sample, all VOCs detectable by the lab equipment were described using two methods: (1) log-area under the curve peaks; and (2) concentration for a subset of 13 VOCs for which calibration curves could be obtained. These 13 VOCs are referred to as quantifiable VOCs and the rest as unquantifiable VOCs. In other applications, calibration curves can be available for a larger or smaller set of VOCs; therefore, the definition of quantifiable VOCs is specific to this study; a list of quantifiable VOCs is included herein. The quantifiable VOCs were chosen among the ones identified in the VOC breath analysis literature, though calibration substances were available only for some of the published compounds. The quantifiable VOC data for the first 30 groups are used as the training set for cancer prediction modeling. The remaining 58 groups are used for validation of findings.
A case is defined as a person with biopsy-confirmed S1LC. A control is defined as a person without lung cancer. Cases were identified from a population of lung cancer patients. For each case the plan was to identify 2 controls: (1) the first control (type 1: matched control) participant was identified from the population of patients who do not have a lung cancer diagnosis; (2) the second control (type 2: housemate control) participant was identified as another adult person from the household of the case patient who does not have a lung-cancer diagnosis. Collecting information from a type 2 control was not always possible either because another household adult was not available or, if one was available, they did not consent to participate in the study. The type 1 controls were identified via a medical records system based on covariate matching. The type 2 controls were identified, when possible, during the preliminary patient visit. Because some cases do not have a matched housemate control, some matched groups contain two study participants (case and type 1 control matched on covariates) and some contain three participants (case, type 1 control matched on covariates, and type 2 housemate control). These are referred to as groups instead of trios, as the matched groups do not always contain three study participants.
Matching was conducted to reduce the potential for confounding. Two types of controls are used for each case. All participants were asked to abstain from smoking, vaping, or drinking for at least 30 minutes before conducting the test. The type 1 control was identified via the medical records system and was matched to the lung cancer case patient using the following variables:
The type 2 control is identified, when possible, during the preliminary patient visit and is an adult person who lives in the same household as the lung cancer case patient. Some patients did not have another adult living in the same household or, if they did, that person did not consent to participate in the study; in these cases the type 2 control sample was not collected.
To avoid possible analytic batch effects, the sampling for cases and controls was conducted as close in time as possible. For the type 2 controls, the sampling was done during the same visit, whenever possible. To avoid potential effects of time between sample collection and analysis, the lab received and analyzed the breath samples within 24 hours of their collection, though, typically, the analysis was conducted within two hours of breath sample collection. In 11 samples the time between collection and analysis was between 24 and 32 hours, and in 3 samples the time between collection and analysis was between 6 and 11 days. The sampling and analysis of cases and controls was not conducted in separate groups to avoid temporal batch effects. The VOC analysis laboratory did not receive information about case status.
Every study participant was assigned a unique anonymous subject code (UASC). Subject identifying data (e.g., name) and the link to the UASC data were stored securely by the team. Subject-identifying data was not requested or shared with the laboratory team. Quality control was assured by direct collaboration between the teams. Each of the three teams identified a team member who was in charge of data quality control. All data were recorded in a database system, compliant with all Personal Health Information (PHI) regulations. Quality control was conducted in a multilayer approach and in close collaboration between the team members to: (1) correct typos and incorrect coding; (2) identify unusual observations and re-check them; (3) review each group to ensure that matching was conducted according to protocol; (4) review groups with fewer than 3 study participants; and (5) check for consistency of data entry formats.
Participants exhaled directly into two Tedlar® bags or other film bags. The volume of Bag 1 was 0.5 L (SKC Inc. Cat #232-01) and the volume of Bag 2 was 1 L (SKC Inc. Cat #232-02). Each Tedlar® bag or other film bag was flushed at least 3 times with ultra-high purity nitrogen (Part #NI UHP300, Airgas, US) before use to remove residual contaminants from the manufacturer. Participants were instructed to take a deep inhalation and exhale approximately 150-300 mL of breath into the 0.5 L bag (about half full). Immediately afterward, the participant inflated the 1 L bag (Bag 2) using the rest of the exhaled breath. All collected breath samples were delivered at room temperature to the research lab for VOC analysis within 2 hours after collection (whenever possible). The Tedlar® bags were measured by the lab within 24 hours of breath collection in 166 (74%) patients. All but 3 bags were read within 24 hours of lab acquisition. Only data collected from Bag 2 were used for the analyses.
Clean and humidified air was injected into a subset of bags to evaluate measurement background. Volatile organic compounds (VOCs) in the exhaled breath were analyzed using thermal desorption (TD) and gas chromatography-mass spectrometry (GC-MS). A multiple channel thermal desorption system (UNITY-xr™) with an auto-sampler (CIA Advantage-xr™ both from Markes International, Inc., UK) was used to sample 100 mL of exhaled breath from each of the Tedlar® bags or other film bags at a flow rate of 50 mL/min and flow path temperature of 150° C. Helium was used as the carrier gas at a constant pressure of 5 Pounds per Square Inch (PSI); the sample was directly injected from the TD unit into the gas chromatograph for analysis. Chromatographic analysis was performed using a Trace GC-Ultra gas chromatograph attached to an ISQ Mass Spectrometer (GC-MS, Thermo Scientific). VOC compounds were separated with a 30 meter column×0.25 millimeter internal diameter and 1.40 μm film thickness (Cat #19915, Rtx-VMS, Restek Corp, U.S). The oven temperature was set on a gradient to achieve optimal separation of the analytes at an initial temperature of 35° C. with 1 min hold; the temperature rate was increased by 5° C./min to reach 100° C. followed by a final temperature ramp of 50° C./min to 240° C.
Thirteen previously reported VOCs, representing different chemical groups, were selected for quantitative analysis; see Table 1 below for a complete description. For each selected chemical, a five-point calibration curve was generated by spiking reagent-grade standards into Tedlar® bags or other film bags in concentrations ranging from 0.390 μg/mL to 4000 μg/mL using methanol as solvent. Exactly 1 μL aliquot of each standard was injected into five different bags filled with 1 L of pure Nitrogen, diluting the concentration of the analyte by 1000×. Five calibration curves for each VOC were generated, and their average slope and intercept were used to quantify concentrations from participant samples. Ten blanks were prepared by inflating Tedlar® bags or other film bags with clean and humidified air. Clean and humidified air was injected into a subset (10%) of bags to evaluate measurement background.
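The spiking arithmetic described above can be illustrated with a short Python sketch. It assumes the 1 μL aliquot volatilizes completely into the 1 L of nitrogen and ignores the volume contributed by the methanol; the function and variable names are hypothetical and are not taken from the study.

```python
# Illustrative arithmetic only; names are hypothetical, not from the study.

def bag_concentration_ug_per_l(standard_ug_per_ml, aliquot_ul=1.0, bag_volume_l=1.0):
    """Gas-phase concentration after spiking an aliquot of liquid standard into a bag.

    Assumes the analyte volatilizes completely into the bag volume.
    """
    aliquot_ml = aliquot_ul / 1000.0            # 1 uL = 0.001 mL
    mass_ug = standard_ug_per_ml * aliquot_ml   # mass of analyte delivered
    return mass_ug / bag_volume_l               # ug of analyte per litre of gas

# Lowest and highest standards named in the text
low = bag_concentration_ug_per_l(0.390)
high = bag_concentration_ug_per_l(4000.0)
print(f"{low:.5f} ug/L to {high:.1f} ug/L")
```

Under these assumptions, a standard expressed in μg/mL yields a bag concentration in μg/L that is numerically 1000 times smaller, which matches the 1000× factor stated above.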
The lowest standard of each VOC was prepared at least five times and injected into the GC-MS. The limit of detection (LOD) for each chemical was calculated by multiplying the standard deviation of those low analytical standard replicates by 3 (LOD = 3 × StDev). All lab analysts were blinded to study participants' status and information. Standardized procedures were used for performing and documenting lab operations, including sample management (login, registration integrity, life cycle tracking), chain of custody, inventory, and storage management.
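The LOD rule stated above can be sketched in Python as follows. The text does not say whether the sample or the population standard deviation was used, nor in which units the replicates were recorded; the sketch assumes the sample standard deviation, and the replicate values are made up for illustration.

```python
import statistics

def limit_of_detection(replicates):
    """LOD = 3 x standard deviation of the low-standard replicates.

    Assumption: the sample (n - 1) standard deviation is used.
    """
    if len(replicates) < 2:
        raise ValueError("need at least two replicates")
    return 3.0 * statistics.stdev(replicates)

# Made-up replicate readings of the lowest standard, for illustration only
replicates = [0.010, 0.012, 0.011, 0.009, 0.013]
lod = limit_of_detection(replicates)
print(f"LOD = {lod:.5f}")
```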
Thirteen previously reported VOCs (Table 1), representing different chemical groups, were selected for quantitative analysis. The limit of detection (LOD) for each chemical was calculated according to the methods provided in the supplementary materials.
[Table 1: the thirteen quantifiable VOCs. Table contents missing or illegible when filed.]
There are 330 study participants who provided breath samples that were analyzed as part of the study. Not all these participants were included in the analysis because some of them were identified by the clinical team as potential cases, but were not confirmed to have S1LC after biopsy results. The controls for these study participants were also not included in the statistical analysis. Below is the inclusion and exclusion report in Table 2.
Since many VOCs were below the limit of detection (LOD) for a large percentage of observations, only four VOCs with less than 10% of data below the LOD were used in analyses. Each concentration was log10-transformed. Additional models were fit with each individual VOC being above/below the LOD as a predictor and S1LC as an outcome using univariate logistic regression analysis. A total cohort of 300 individuals was planned: 240 individuals with lung nodules suspicious for possible lung cancer, 30 long-term smokers, and 30 non-smokers. With an overall prevalence of disease of 56.67%, the total sample size of 300 yielded at least 90% power to estimate sensitivity with a 95% confidence interval of ±0.09 at an expected sensitivity of 0.90 and at least 90% power to estimate specificity with a 95% confidence interval of ±0.11 at an expected specificity of 0.90. Power analysis was conducted using R (Vienna, Austria). Following the study analytic protocol, the first 30 groups of matched cases and controls, determined by case enrollment time, were used for training and the last 58 groups were used for testing. A larger proportion of the data was selected for testing to illustrate the higher robustness of the predictions. Analyses were conducted by combining the two types of controls, whenever they were both available.
Each model was fit to the training data and then applied to: (i) the testing data; and (ii) the combined testing and training data. All analyses were performed in the R statistical software. To detect statistically significant differences between VOC breath concentrations in S1LC cases and controls, two-sample unpaired t-tests, which lose some power but ensure that results are generalizable to the population, were performed using the R function t.test( ). Classification tests using thresholds of the statistically significant VOCs were developed based on the 10th, 25th, and 50th percentiles of VOC concentrations in the training data of controls. Univariate and multivariate forward selection logistic regression models were fit using the glm( ) function in R. Forward selection was used to identify the combination of most predictive VOCs. Selection of VOCs was based on the improvement in the receiver operating characteristic area under the curve (AUC) in the training data, where at each stage the VOC with the highest AUC in the training data was incorporated into the model. For each selected model the AUC on the test data was computed. Observations in which an individual VOC was below the LOD were treated as missing and excluded from each candidate model containing that VOC.
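The percentile-threshold classification described above can be sketched in Python as follows. The direction of the rule (whether values below or above the threshold are called positive) is not stated in the text, so it is left as a parameter and defaults to an assumed "below threshold" rule; the percentile uses linear interpolation, and all data values and names are made up for illustration. The study itself used R.

```python
def percentile(values, p):
    """Percentile by linear interpolation between order statistics (p in [0, 100])."""
    xs = sorted(values)
    pos = (len(xs) - 1) * p / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (xs[hi] - xs[lo]) * (pos - lo)

def sens_spec(case_values, control_values, threshold, case_if_below=True):
    """Sensitivity and specificity of a single-threshold rule.

    case_if_below is an assumption: the rule direction is not given in the text.
    """
    if case_if_below:
        tp = sum(v < threshold for v in case_values)
        tn = sum(v >= threshold for v in control_values)
    else:
        tp = sum(v > threshold for v in case_values)
        tn = sum(v <= threshold for v in control_values)
    return tp / len(case_values), tn / len(control_values)

# Made-up log10 concentrations, for illustration only
train_controls = [0.8, 1.0, 1.2, 1.4, 1.6, 1.8, 2.0, 2.2, 2.4]
test_cases = [0.5, 0.9, 1.1, 1.5]
test_controls = [1.0, 1.3, 1.7, 2.1]

for p in (10, 25, 50):
    threshold = percentile(train_controls, p)       # threshold learned on training controls
    sens, spec = sens_spec(test_cases, test_controls, threshold)
    print(f"P{p}: threshold={threshold:.2f} sensitivity={sens:.2f} specificity={spec:.2f}")
```

Thresholds are computed from the training controls only and then applied unchanged to held-out data, which mirrors the training/testing separation described above.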
Breath samples were collected and analyzed for all study participants who were likely to have S1LC according to the biopsy protocol. However, the breath sample was taken before the biopsy was performed to mitigate the potential effects of sedation and the biopsy procedure on the breath VOCs. Among these potential S1LC cases only some had biopsy-confirmed S1LC. There are 157 potential cases in the data (study participants who were likely to have lung cancer before biopsy). Out of these, 65 potential cases (41.4%) did not have biopsy-confirmed S1LC and were excluded from the analysis. All excluded cases had a valid exclusion reason recorded in REDCap. Table 4 summarizes the exclusion criteria for these potential cases after biopsy results. The specific reasons for excluding potential cases labeled “Other” in Table 4 are provided in Table 5. The category “Other” was used in Table 4 because the reasons for non-inclusion listed in Table 5 are rare. Most potential cases were excluded because biopsy results were negative (the person did not have biopsy-confirmed S1LC, even though they were considered likely to have S1LC before the biopsy was conducted). For the purpose of this analysis, the matched and housemate control data associated with the cases that met the exclusion criteria have also been removed from the analysis. These data exist for some study participants, but were not included in the analysis.
After the exclusion of groups that did not have a biopsy-confirmed S1LC case, there were 231 study participants left (cases and controls). These data include a total of 92 cases with 51 control housemates and 88 matched controls. The number and type of controls are displayed for these 92 cases in Table 6.
From these data four patients with biopsy-confirmed S1LC were further excluded. Out of these four patients 2 did not have either matched or housemate control data. The other 2 cases had only housemate control but not matched control data. Data for these groups were excluded from the analysis. These exclusions were applied to avoid groups that are not balanced on covariates.
Data analysis is conducted only for groups of study participants that contained a patient with biopsy-confirmed S1LC. For this analysis, only the 88 groups with a case who had at least one available matched control were used. These data are referred to as “included groups”. Among the included groups, 39 groups had only one matched control and no housemate control, and 49 had both matched and housemate controls; see Table 6 for more details.
There were 330 study participants in total, including 157 potential cases (patients who were identified by the clinical team as potential cases before the biopsy). Out of the 157 potential cases, 65 (41.4%) were excluded from the analysis. Most potential cases were excluded from the study because biopsy results did not confirm the S1LC diagnosis; see Table 2. Matched control and housemate control data associated with the cases that were excluded were also removed from the analysis. The data used in this analysis comprise 225 participants, including 88 cases that each have at least one available matched control.
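The participant accounting reported above can be checked with a few lines of Python. The counts are taken from the text; the only assumption is that each of the two cases with a housemate control but no matched control contributed exactly one housemate control. Variable names are hypothetical.

```python
# Counts taken from the text above
potential_cases = 157
excluded_after_biopsy = 65
cases = potential_cases - excluded_after_biopsy                          # biopsy-confirmed cases
pct_excluded = round(100 * excluded_after_biopsy / potential_cases, 1)

matched_controls = 88
housemate_controls = 51
participants_before = cases + matched_controls + housemate_controls

# Four confirmed cases were further excluded
cases_without_any_control = 2
cases_with_housemate_only = 2
final_cases = cases - cases_without_any_control - cases_with_housemate_only
# Assumption: one housemate control per housemate-only case
final_housemates = housemate_controls - cases_with_housemate_only
final_total = final_cases + matched_controls + final_housemates

print(cases, pct_excluded, participants_before, final_cases, final_housemates, final_total)
```

The totals reproduce the 92 confirmed cases, 231 participants before the final exclusion, and the 225 analyzed participants (88 cases, 88 matched controls, and 49 housemate controls) reported in the text.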
According to the pre-specified analysis protocol, data were split into training (for biomarker discovery and model exploration) and testing (for validation of biomarkers and models). The first 30 groups and their controls were used for training and the remaining 58 groups were used for testing.
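A minimal Python sketch of this group-level split is shown below. It assumes that group identifiers are already ordered by case enrollment time and that every participant carries the identifier of the group to which they belong, so that a case and its controls always land in the same set. The identifiers and helper names are hypothetical.

```python
def split_by_enrollment(group_ids, n_train=30):
    """Split groups into training and testing; group_ids must be in enrollment order."""
    return group_ids[:n_train], group_ids[n_train:]

def assign_participants(participants, train_groups):
    """Route each (participant_id, group_id) pair to the set of its group."""
    train_set = set(train_groups)
    train = [p for p in participants if p[1] in train_set]
    test = [p for p in participants if p[1] not in train_set]
    return train, test

# Hypothetical identifiers for the 88 included groups, in enrollment order
group_ids = [f"G{i:02d}" for i in range(1, 89)]
train_groups, test_groups = split_by_enrollment(group_ids)

# Hypothetical participants: one case and one matched control per group
participants = []
for g in group_ids:
    participants.append((f"{g}-case", g))
    participants.append((f"{g}-matched", g))

train_p, test_p = assign_participants(participants, train_groups)
print(len(train_groups), len(test_groups), len(train_p), len(test_p))
```

Splitting by group rather than by individual keeps matched participants together, so that no group contributes to both model fitting and validation.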
The demographic and behavioral summaries for the study participants in the 88 analyzed groups (case and at least one available matched control) are presented in Table 5. Details are further provided by the three study participant types (case, matched control, housemate control). Table 6 provides the demographic and behavioral information separated by training and testing data sets. For each subject, two bags of exhaled breath were collected consecutively during one forceful exhalation process. Bag 1 (diluted) had a volume of 0.5 liters and was used to collect the first air exhaled (tidal volume), which is thought to represent the normal exhalation process. Bag 2 (alveolar) had a volume of 1.0 liter and was used to collect the expiratory reserve volume (the gas mixture coming from the dead space of the bronchial tree and the alveolar gas exchange space of the lungs). The air from each bag was injected into a gas chromatograph (GC-MS), which separated the different compounds in the exhaled air into a series of “peaks”. Each peak was associated with a distinct VOC.
To convert an original GC-MS peak area result (unitless) to a concentration value in the sample (mass of compound per volume of air), a calibration curve was constructed for each of the 13 quantifiable VOC compounds described in Section 4.7.3. A calibration curve was obtained by serially diluting a chemical standard to obtain at least five different and known concentrations, which are plotted along the x-axis. Standards at these known concentrations are injected into the GC-MS and the resulting peak areas are plotted along the y-axis. Each calibration curve was compound specific. This provided the mapping (calibration) of VOC peak areas to concentration measurements for Bags 1 and 2.
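The calibration mapping can be sketched in Python as follows, assuming a straight-line relationship between peak area (y) and concentration (x). Replicate curves are fit by ordinary least squares, their slopes and intercepts are averaged, and a measured peak area is converted by inverting the averaged line. The data values and function names are made up for illustration and are not the study's calibration data.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx

def average_calibration(curves):
    """Average the slope and intercept over replicate calibration curves."""
    fits = [fit_line(x, y) for x, y in curves]
    slope = sum(s for s, _ in fits) / len(fits)
    intercept = sum(b for _, b in fits) / len(fits)
    return slope, intercept

def peak_area_to_concentration(area, slope, intercept):
    """Invert the calibration line: concentration = (area - intercept) / slope."""
    return (area - intercept) / slope

# Made-up five-point curves (concentration in ug/L, peak area unitless)
conc = [0.00039, 0.004, 0.04, 0.4, 4.0]
curve_a = (conc, [1000.0 * c + 5.0 for c in conc])
curve_b = (conc, [1100.0 * c + 7.0 for c in conc])

slope, intercept = average_calibration([curve_a, curve_b])
estimate = peak_area_to_concentration(1056.0, slope, intercept)
print(f"slope={slope:.1f} intercept={intercept:.1f} concentration={estimate:.3f} ug/L")
```

In practice, an estimated concentration below the compound's limit of detection would be flagged rather than reported as a number; that step is omitted here.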
The first step is to compare the consistency of VOC quantification in the two bags. Note: bag comparison results are based on the analyzed data only, which included 225 study participants (88 cases, 88 matched controls, and 49 housemate controls). As both measures are highly right skewed, the log10 (peak area) and log10 (concentration) were used instead.
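One way to carry out such a bag comparison is sketched below in Python: both measurements are log10-transformed, participants lacking either measurement are dropped, and agreement is summarized by the Pearson correlation and by the slope of a straight-line fit of Bag 2 on Bag 1. The data values and function names are made up for illustration.

```python
import math

def log10_pairs(bag1, bag2):
    """Keep participants with both measurements present and positive; log10-transform."""
    return [
        (math.log10(a), math.log10(b))
        for a, b in zip(bag1, bag2)
        if a is not None and b is not None and a > 0 and b > 0
    ]

def pearson_and_slope(pairs):
    """Pearson correlation and least-squares slope of y (Bag 2) on x (Bag 1)."""
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    sxx = sum((x - mx) ** 2 for x, _ in pairs)
    syy = sum((y - my) ** 2 for _, y in pairs)
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    return sxy / math.sqrt(sxx * syy), sxy / sxx

# Made-up peak areas; None marks a missing Bag 1 measurement
bag1 = [10.0, 100.0, 1000.0, None, 10000.0]
bag2 = [20.0, 200.0, 2000.0, 500.0, 20000.0]

pairs = log10_pairs(bag1, bag2)
r, slope = pearson_and_slope(pairs)
print(f"n={len(pairs)} r={r:.3f} slope={slope:.3f}")
```

A slope near 1 on the log10 scale indicates that the two bags differ by a roughly constant multiplicative factor, while a high correlation alone only indicates that they rank participants similarly.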
Results indicate that for most compounds, the VOC peak areas measurements for the two bags are strongly correlated; see Table 7 and
The association between the measurements in the two bags was also quantified using a linear regression of Bag 2 (y, outcome) on Bag 1 (x, regressor) based on log10 peak areas and concentrations, respectively. Table 9 provides summaries of these regressions, where: (1) the columns labeled “Estimate” provide the point estimate for the slope of the regression; (2) the column labeled “p-value” is the p-value for testing the null hypothesis of no association between measurements in Bags 1 and 2; and (3) the columns labeled “lower CL” and “upper CL” are the lower and upper limits of the 95% confidence intervals, computed from the participants who had both bag measurements. Results provide strong evidence that the log peak area measurements in the two bags are statistically associated for all quantifiable compounds; 2-Butanone and Toluene have slope estimates greater than 0.9, and Ethylbenzene and p-Cymene have slope estimates greater than 0.8. Scatterplots of Bag 1 (x-axis) versus Bag 2 (y-axis) measurements are shown in
The fewer data points in
According to the study design, each study participant started exhaling into Bag 1 (diluted), and continued exhaling into Bag 2 (alveolar), which was assumed to collect deeper air from the lungs. Comparison of Bag 1 and Bag 2 peak area and concentration measurements indicates that there is strong correlation between the measurements in the two bags; see Table 7,
Quantifiable compounds were not detected for some study participants. The missing (below limit of detection) concentrations by VOC and collection bag are presented in Table 10. Here missing concentration values include both missing peak values, which did not produce a concentration value after calibration, and peak values which corresponded to a VOC concentration value that was considered below the limit of detection. There were 4 (Control-Housemate: N=1, Matched-control: N=3) study participants in the test data set with missing Bag 1 measurement. These study participants were removed from the Bag 1 versus 2 analysis, but were kept in the predictive modeling analysis.
Table 11 further lists the number of case and control study participants in the training and testing data with missing quantifiable peaks and concentrations, respectively. Results indicate that the individual VOC limit of detection and percent missingness depend on the compound type for both peaks and concentrations. There is also a bag effect for peak areas, with fewer missing peak areas in Bag 2 (with the exception of Ethylbenzene). For concentrations with lower percent missingness (Dodecane, Acetoin, 2-Pentanone, Heptanal) the percentage of missing observations was lower in Bag 2. For concentrations with higher percent missingness the difference between bags was less clear.
The quantifiable VOC peak area obtained from Bag 2 (alveolar) in the training data is examined.
The overall goal of the project is to identify individual VOCs or VOC combinations that discriminate S1LC patients from controls. The first step was to conduct forward selection based on logistic regression on the training data, regressing on the case/control status. The idea is to select the combination of variables with the highest predictive performance, as measured by the area under the receiver operating characteristic curve (AUC) in the training data set. The second step is to apply and evaluate these models on the test data set. A control is defined as a study participant in the “included data” subset who does not have cancer (either control housemate or matched control). For each compound, the missing observations were removed in all models that contained that compound.
Pairs of VOCs with high correlations between log peak area measurements may not improve the predictive performance of models using only one of the VOCs in the pair. This is due to the overlap in information between the two VOCs in the pair. Conversely, pairs of VOCs with low correlations are good candidates for jointly improving prediction. In this data set, many VOC pairs have highly correlated log peaks; see
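The two-step procedure can be sketched in plain Python as follows. This is a stand-in and not the study's code: the study fit models with R's glm( ), whereas this sketch fits the logistic regression by simple gradient ascent, computes a rank-based AUC, and assumes complete data (no values below the LOD). All names and data values are hypothetical.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_logistic(X, y, lr=0.1, steps=2000):
    """Gradient-ascent logistic regression; returns [intercept, coef_1, ...]."""
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(steps):
        grad = [0.0] * len(w)
        for row, yi in zip(X, y):
            err = yi - sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], row)))
            grad[0] += err
            for j, xj in enumerate(row):
                grad[j + 1] += err * xj
        w = [wj + lr * g / len(X) for wj, g in zip(w, grad)]
    return w

def scores(w, X):
    """Linear predictor for each row (monotone in the fitted probability)."""
    return [w[0] + sum(wj * xj for wj, xj in zip(w[1:], row)) for row in X]

def auc(s, y):
    """Rank-based AUC: P(case score > control score), ties counted as 1/2."""
    pos = [si for si, yi in zip(s, y) if yi == 1]
    neg = [si for si, yi in zip(s, y) if yi == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def design(data, cols):
    """Build a row-wise design matrix from a dict of column name -> values."""
    return [[data[c][i] for c in cols] for i in range(len(data[cols[0]]))]

def forward_select(train, y_train, test, y_test):
    """At each stage add the VOC giving the highest training AUC; report test AUC."""
    selected, remaining, path = [], list(train), []
    while remaining:
        best = None
        for name in remaining:
            cols = selected + [name]
            w = fit_logistic(design(train, cols), y_train)
            a = auc(scores(w, design(train, cols)), y_train)
            if best is None or a > best[1]:
                best = (name, a, w)
        selected.append(best[0])
        remaining.remove(best[0])
        test_auc = auc(scores(best[2], design(test, selected)), y_test)
        path.append((list(selected), best[1], test_auc))
    return path

# Made-up log-scale values: "voc_a" separates cases from controls, "voc_b" does not
y_train = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
train = {"voc_a": [2.0, 2.2, 2.5, 1.9, 1.0, 1.2, 0.8, 1.1, 1.3, 0.9],
         "voc_b": [0.5, 0.1, 0.4, 0.3, 0.2, 0.6, 0.3, 0.5, 0.1, 0.4]}
y_test = [1, 1, 0, 0]
test = {"voc_a": [2.1, 1.8, 1.0, 1.2], "voc_b": [0.3, 0.2, 0.4, 0.1]}

path = forward_select(train, y_train, test, y_test)
for cols, train_auc, test_auc in path:
    print(cols, round(train_auc, 3), round(test_auc, 3))
```

Because selection uses only the training AUC, the test AUC reported for each stage remains an out-of-sample estimate for that stage's model.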
The performance of each VOC (log peak area) in a univariate model is examined, that is, using each VOC as a single predictor of lung cancer. Table 12 ranks predictive performance of each compound. Based on the training AUC results, p-Cymene, Heptanal, Acetoin are the top 3 VOCs in terms of S1LC case prediction performance. Table 12 also shows that the top individual predictors ranked by test AUC are Acetoin (test AUC 0.648), p-Cymene (test AUC 0.612) and 2-Butanone (test AUC 0.61).
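The univariate ranking described above can be illustrated with a rank-based AUC. The following is a minimal sketch, not the code used in the study: the function names and the peak area values are hypothetical, and missing peaks are represented as None. For a single predictor, the fitted probability of a logistic regression is a monotone function of that predictor, so the model AUC equals the rank-based AUC of the predictor itself when the fitted coefficient is positive, and one minus that value when it is negative.

```python
import math

def auc(case_scores, control_scores):
    """Probability that a random case scores higher than a random control.
    Ties are counted as one half (the usual Mann-Whitney convention)."""
    wins = 0.0
    for c in case_scores:
        for k in control_scores:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))

def univariate_auc(case_peaks, control_peaks):
    """AUC of one VOC on the log10 peak area scale.
    Missing peaks (None) are dropped, as in the complete-case analysis."""
    cases = [math.log10(p) for p in case_peaks if p is not None]
    controls = [math.log10(p) for p in control_peaks if p is not None]
    return auc(cases, controls)

# Hypothetical peak areas, for illustration only (not study data).
case_peaks = [1200.0, 800.0, None, 1500.0]
control_peaks = [900.0, 700.0, 1000.0, None, 600.0]
print(round(univariate_auc(case_peaks, control_peaks), 3))
```

Because the log10 transformation is monotone, it does not change the rank-based AUC; it is kept here only to mirror the scale used in the analysis.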
Table 12 displays the results of the forward selection procedure, where each VOC is added to the predictive model based on the maximum AUC criterion in the training set. The model with maximum test AUC included p-Cymene and 2-Butanone (test AUC 0.669). The second best performing model included p-Cymene, 2-Butanone, Heptanal, and Acetoin (test AUC 0.620).
A major practical limitation of the VOC peak-based analysis is that multiple compounds are below the limit of detection; see Tables 10 and 11. For example, the top predictor based on log peak area used p-Cymene (64% missing concentrations in cases/training, 38% missing concentrations in controls/training, 70% missing concentrations in cases/test, and 47% missing concentrations in controls/test) and 2-Butanone (94% missing concentrations in training cases and controls and 98% missing concentrations in test cases and controls). This is a problem because even if the compounds may have discriminatory power, they are generally under the limit of detection of the GC-MS instrument used in the study. The implication is that concentration thresholds with discriminating properties cannot be provided for these compounds.
Therefore, in what follows, VOC concentrations with values above the limit of detection are used.
Correlations between individual VOC log concentrations (using pairwise complete observations) in the training data are presented in Table 14. Results are consistent with the correlation results for VOC log peak areas; see
Table 15 provides the S1LC case prediction performance of individual VOCs using univariate logistic regression based on log concentrations above the limit of detection. Acetoin and Heptanal have training AUCs greater than 0.6, while other compounds have AUCs close to 0.5. The AUC for Acetoin is 0.649 in the training data and 0.650 in the testing data. In contrast, the AUC for Heptanal is 0.610 in the training data, but falls to 0.511 in the test data. Dodecane has a consistent AUC across training and test data (0.574 in training and 0.541 in testing).
A forward selection approach was used to identify the combination of most predictive VOCs. Selection of VOCs and ranking of models were based on the maximum improvement in the AUC using training data. For each selected model the AUC on the test data was also computed. Missing observations are excluded when individual VOCs are below the detection limit in each candidate model. Table 16 displays the results of the procedure and provides both the training and test AUC as additional covariates are included into the model. The table is cumulative; for example, the row labeled 2-Pentanone indicates that 2-Pentanone was the third variable added to the model and the corresponding AUC refers to the model that includes Acetoin, Heptanal, and 2-Pentanone.
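The cumulative procedure described above can be sketched as a greedy loop. This is an illustrative sketch only: the scoring callback stands in for "fit a logistic regression on the training data and return its training AUC", and the compound names A, B, C and all score values are hypothetical rather than study results.

```python
def forward_selection(candidates, score):
    """Greedy forward selection.
    At each step, add the candidate that maximizes score(selected + [candidate]).
    Returns the cumulative path as a list of (added_variable, score) pairs."""
    selected = []
    remaining = list(candidates)
    path = []
    while remaining:
        best = max(remaining, key=lambda c: score(selected + [c]))
        selected.append(best)
        remaining.remove(best)
        path.append((best, score(selected)))
    return path

# Hypothetical training AUCs for every subset of three placeholder compounds.
TOY_TRAINING_AUC = {
    frozenset(["A"]): 0.65,
    frozenset(["B"]): 0.61,
    frozenset(["C"]): 0.57,
    frozenset(["A", "B"]): 0.67,
    frozenset(["A", "C"]): 0.66,
    frozenset(["B", "C"]): 0.62,
    frozenset(["A", "B", "C"]): 0.69,
}

def toy_score(variables):
    """Stand-in for fitting a model on the training data and computing its AUC."""
    return TOY_TRAINING_AUC[frozenset(variables)]

for name, train_auc in forward_selection(["A", "B", "C"], toy_score):
    print(name, train_auc)
```

Each row of the returned path is cumulative, in the same sense as the table: the score on the row for the second variable refers to the model containing the first and second variables. The test AUC for each step would be computed separately and plays no role in the selection.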
In the log concentration analysis, Acetoin is the strongest predictor with a training AUC of 0.649 and a test AUC of 0.65. Adding Heptanal increases the training AUC to 0.669 and decreases the test AUC to 0.559. Adding 2-Pentanone to the model slightly increases the training AUC (from 0.669 to 0.689), though the test AUC of 0.601 is still below the test AUC of 0.65 for Acetoin alone. This suggests that using a one-variable model based on Acetoin may be the best approach. One could also consider a two-variable model adding either Dodecane or 2-Pentanone. However, more complex models are not considered at this time given the results in Table 16 and the high correlations among the other log concentrations of quantifiable VOCs shown in Table 14.
Unpaired t-tests were conducted to compare the mean of the log concentration among cases and combined controls separately in the training and test data, as well as in the combined test and training data. Table 17 provides the results, indicating that the difference in log concentrations of Acetoin is: (1) not significant at the α=0.05 level in the training sample (p-value=0.091); (2) significant in the test sample (p-value=0.001); and (3) significant in the combined sample (p-value<0.001). This is likely due to the differences in sample sizes between the training and testing data sets. For all other VOCs and data sets, the differences were not statistically significant at the α=0.05 level.
Results based on VOC concentrations suggest that Acetoin: (1) has most concentrations above the limit of detection; (2) leads to the best predictive model in the test data; and (3) has a stable performance when transitioning from training to test data. Thus, the specific Acetoin concentration thresholds expressed in μg/L and their associated S1LC case prediction performance are explored. Because Acetoin concentrations were, on average, lower in S1LC patients compared to controls, the test follows the following rule:
Predict S1LC if the log10 Acetoin concentration is below threshold_train; otherwise, predict control.
The threshold, threshold_train, can be chosen in many different ways to balance sensitivity and specificity. Here, the following thresholds on the percentiles of Acetoin concentrations in the training data of controls are considered: (a) the 10th percentile (0.026 μg/L); (b) the 25th percentile (0.044 μg/L); and (c) the 50th percentile (0.098 μg/L). These thresholds are provided directly on the concentration scale. The corresponding thresholds, threshold_train, on the log10 concentration scale can be obtained by taking the log10 transformation of the thresholds on the concentration scale. These choices are made for illustration purposes only.
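The threshold rule and its performance measures can be sketched as follows. This is a minimal illustration under stated assumptions: the three threshold values are the ones quoted above, while the function names, the strict "below" comparison, and the example concentrations are hypothetical and are not study data.

```python
# Thresholds on the concentration scale (ug/L), as quoted in the text above.
THRESHOLDS_UG_PER_L = {"p10": 0.026, "p25": 0.044, "p50": 0.098}

def predict_s1lc(acetoin_ug_per_l, threshold_ug_per_l):
    """Predict an S1LC case when the Acetoin concentration is below the threshold."""
    return acetoin_ug_per_l < threshold_ug_per_l

def evaluate_rule(concentrations, is_case, threshold_ug_per_l):
    """Sensitivity, specificity and accuracy of the threshold rule."""
    tp = fn = tn = fp = 0
    for conc, case in zip(concentrations, is_case):
        predicted_case = predict_s1lc(conc, threshold_ug_per_l)
        if case and predicted_case:
            tp += 1
        elif case:
            fn += 1
        elif predicted_case:
            fp += 1
        else:
            tn += 1
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
    }

# Hypothetical Acetoin concentrations (ug/L): four cases followed by four controls.
example_conc = [0.020, 0.050, 0.090, 0.150, 0.030, 0.120, 0.200, 0.080]
example_is_case = [True, True, True, True, False, False, False, False]

for label, threshold in THRESHOLDS_UG_PER_L.items():
    print(label, evaluate_rule(example_conc, example_is_case, threshold))
```

Because log10 is monotone, applying the rule on the concentration scale is equivalent to applying it on the log10 scale with the log10-transformed threshold. Raising the threshold trades specificity for sensitivity, which is why several percentiles are examined.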
Table 18 further quantifies the results displayed in
Table 19 provides the estimated sensitivity (proportion of correctly identified S1LC cases), specificity (proportion of correctly identified controls), and accuracy (proportion of correctly classified cases and controls). The part of the table labeled “Test Data” corresponds exactly to
Focus has been on the prediction performance of concentrations when they are above the limit of detection, which was the main goal of the study. However, several VOCs have large proportions of observations that are below the limit of detection. Thus, there is a need to investigate whether being above/below the limit of detection predicts S1LC status. To conduct this analysis missing VOC concentrations were recoded as 0 and those present were recoded as 1. These recoded variables are referred to as presence/absence of individual VOCs.
Analyses were conducted using the presence/absence data of individual quantifiable VOCs as predictors and the S1LC case indicator as the outcome. Table 20 provides the training and test data AUC for the presence/absence data of each VOC. All models are univariate (using one presence/absence predictor). The test AUCs for all compounds, except p-Cymene, are close to 0.5. The AUC for p-Cymene is 0.633 in the training data and 0.580 in the test data. The limit of detection for p-Cymene (see Table 21) was 0.00011 μg/L. The model uses a decision rule of having a p-Cymene breath concentration below 0.00011 μg/L to predict S1LC cases.
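The presence/absence recoding and the p-Cymene decision rule can be sketched as below. The limit of detection is the value quoted above; the function names are hypothetical, missing values are represented as None, and treating a value exactly equal to the limit of detection as "present" is an assumption made for this sketch.

```python
P_CYMENE_LOD_UG_PER_L = 0.00011  # limit of detection quoted in the text

def presence_indicator(concentration, lod):
    """Recode a concentration as 1 (present) or 0 (absent).
    A missing value (None) or a value below the LOD is recoded as 0.
    Assumption: a value exactly at the LOD counts as present."""
    if concentration is None or concentration < lod:
        return 0
    return 1

def predict_case_from_absence(concentration, lod=P_CYMENE_LOD_UG_PER_L):
    """Predict an S1LC case when the compound is below its limit of detection."""
    return presence_indicator(concentration, lod) == 0

# Hypothetical p-Cymene concentrations (ug/L); None denotes a missing value.
example = [None, 0.00005, 0.00011, 0.0005]
print([presence_indicator(c, P_CYMENE_LOD_UG_PER_L) for c in example])
print([predict_case_from_absence(c) for c in example])
```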
Analysis of VOC concentration data indicated that Acetoin was the strongest predictor of S1LC cases in the test data set. Analysis of presence/absence concentration data indicated that p-Cymene being below the limit of detection was predictive of S1LC. Here, the investigation is focused on whether the combination of Acetoin and presence/absence of p-Cymene, given its specific LOD in our study, performs better than Acetoin alone.
Results indicate that the model with Acetoin alone has better prediction performance (training AUC=0.649; testing AUC=0.65) than the model with Acetoin and the indicator variable for presence/absence of p-Cymene (training AUC=0.606; testing AUC=0.504).
Table 21 provides the range of the distribution of detected concentrations for Bag 2 in all analyzed data (testing and training combined) and the corresponding limit of detection for every compound. All values are expressed in μg/L. For example, for 2-Pentanone the minimum observed concentration was 0.00133 μg/L and the maximum observed concentration was 0.22125 μg/L, with a limit of detection of 0.00130 μg/L and an upper bound for the concentration curve calibration of 0.10000 μg/L. It is worth noting that most limits of detection are in the nanograms (one thousandth of one microgram) per liter (ng/L) range. The highest limit of detection among the thirteen quantifiable compounds in this study is that of Toluene, at 0.01854 μg/L, or approximately 18 ng/L.
The maximum upper bound for concentration for each compound is related to the data available for calibrating the curves. A few observations were estimated to be above the upper bound and were based on extrapolation of the calibration curve. All analyses were based on data using these few extrapolated values. Two sensitivity analyses were conducted by: (1) removing all observations that were above the upper bound of concentrations; and (2) removing all observations that were more than 20% above the upper bound. Results were robust to these changes in the data, most likely because very few data points were affected by this problem.
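The two sensitivity analyses amount to filtering observations against the calibration upper bound before refitting. The following is a minimal sketch: the function name and the tolerance parameter are hypothetical, the upper bound and the minimum and maximum values are the 2-Pentanone figures quoted above, and the two intermediate values are invented for illustration.

```python
def filter_by_upper_bound(concentrations, upper_bound, tolerance=0.0):
    """Keep observations at or below upper_bound * (1 + tolerance).
    tolerance=0.0 removes every extrapolated value;
    tolerance=0.2 removes only values more than 20% above the upper bound."""
    limit = upper_bound * (1.0 + tolerance)
    return [c for c in concentrations if c <= limit]

UPPER_BOUND_UG_PER_L = 0.10000  # calibration upper bound quoted for 2-Pentanone

# 0.00133 and 0.22125 are the quoted minimum and maximum; 0.05 and 0.11 are hypothetical.
values = [0.00133, 0.05, 0.11, 0.22125]

print(filter_by_upper_bound(values, UPPER_BOUND_UG_PER_L))        # analysis (1)
print(filter_by_upper_bound(values, UPPER_BOUND_UG_PER_L, 0.2))   # analysis (2)
```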
Focus has been on thirteen quantifiable VOCs, which were identified from the literature as potential predictors of cancer and for which calibration (transformation from peak area to concentrations) was possible. These are referred to as quantifiable compounds, though the term is specific to this analysis and report, as the number and type of VOCs that are quantifiable can vary with the study. However, there is a large number of VOCs that were not calibrated in the data. More precisely, they have an associated peak area measurement, but do not have a corresponding concentration expressed in international units of measurement. These VOCs will be referred to as “unquantifiable” VOCs, though the list of VOCs that are not quantifiable can vary substantially from study to study.
In the study, Tentatively Identified Compounds (TICs) information was used for the unquantifiable VOC analysis. This information was obtained directly from a Chromeleon CDS system (Version 7.2.8 with NIST MS search V.2.0, Thermo Fisher Scientific). As mentioned in the EPA TIC (2006) document: “The [TICS] identification is not considered “absolute” or “confirmed” until a known standard for the suspect compound can be analyzed on the same instrument which made the tentative identification.” Due to various constraints this was not done for the unquantifiable VOCs in the study, though it was done for the 13 quantifiable VOCs.
Before the study started, it was not known which and how many additional VOCs would be identified, or in what proportion of the study participants each VOC would be present. Here an exploratory analysis of the unquantifiable VOCs identified in the study is provided. The statistical analysis mirrors the one conducted for the log peak area of quantifiable compounds and quantifies: (1) the association between presence/absence of each VOC and the S1LC case indicator; and (2) the association between the log10 peak area corresponding to each VOC and the S1LC case indicator. The same training/test data split used for the analysis of quantifiable VOCs was used for the unquantifiable VOCs. For prediction purposes only data based on Bag 2 (alveolar) was used, though some summary statistics are presented for Bag 1 (tidal) as well.
In the case of quantifiable compounds only one peak area was returned by the software and calibrated to concentrations. However, for some unquantifiable VOCs sometimes there are multiple peak areas that are associated with the same compound. In these cases the area of the maximum peak was used and the other peaks were discarded. Additional analysis could be conducted using the sum of the areas or a repeated measures analysis. Only VOC peaks that were identified as being “excellent” were retained based on the criterion that both the Similarity Index (SI) and the Reverse Search Index (RSI) are greater than or equal to 900.
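The peak handling described above can be sketched as follows. This is an illustration only: the record layout (dictionaries with compound, area, si and rsi keys), the function name, and the example values are hypothetical, and applying the quality filter before taking the maximum peak is an assumption, since the order of the two steps is not stated.

```python
def select_excellent_max_peaks(peaks, min_index=900):
    """Return {compound: maximum peak area} over peaks of "excellent" quality.
    A peak is retained only if both SI and RSI are >= min_index.
    When a compound has several retained peaks, the largest area is kept
    and the other peaks are discarded."""
    best = {}
    for peak in peaks:
        if peak["si"] >= min_index and peak["rsi"] >= min_index:
            name = peak["compound"]
            if name not in best or peak["area"] > best[name]:
                best[name] = peak["area"]
    return best

# Hypothetical peak records for one breath sample (not study data).
sample_peaks = [
    {"compound": "X", "area": 100.0, "si": 950, "rsi": 920},
    {"compound": "X", "area": 250.0, "si": 910, "rsi": 905},
    {"compound": "X", "area": 900.0, "si": 880, "rsi": 950},  # SI below 900
    {"compound": "Y", "area": 40.0, "si": 850, "rsi": 990},   # SI below 900
]
print(select_excellent_max_peaks(sample_peaks))
```

Passing min_index=800 would give the corresponding "good" quality selection. The alternative mentioned above, using the sum of the areas, would replace the maximum with an accumulation over the retained peaks.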
There were 167 total VOCs with identified peaks, out of which 60 were identified as “good” (both SI and RSI greater than or equal to 800), and 24 compounds with “excellent” data (both SI and RSI greater than or equal to 900). These numbers contain VOCs in either bag that were identified in at least one study participant in all included data (training and test combined). Results are summarized in Table 22.
However, the number of VOCs identified in the breath of at least one individual was different depending on the bag. For example, there were 129 total VOCs in Bag 1 compared to 144 in Bag 2, 48 VOCs of “good” quality in Bag 1 compared to 55 in Bag 2, and 23 VOCs of “excellent” quality in Bag 1 compared to 22 in Bag 2.
Recall that the training data consists of 30 groups with a total of 81 study participants, with 30 cases and 51 combined matched and housemate controls. The test data consists of 58 groups with a total of 144 study participants, with 58 cases and 86 combined matched and housemate controls.
Table 23 provides the results for Fisher's exact test of the null hypothesis of no association between the presence/absence indicator of individual VOCs and S1LC case status in the training data for Bag 2. The column labeled N present denotes the number of cases and controls that have the specific VOC present among the 81 study participants (ncases=30; ncontrols=51) in the Bag 2 training data. Results are shown for VOCs with a p-value for Fisher's exact test less than 0.5 (not 0.05) for exploratory reasons. VOCs are ranked from the smallest (stronger evidence against the null hypothesis) to the largest p-value. The columns labeled “Sensitivity” and “Specificity” provide the sensitivity and specificity of the test that predicts an S1LC case if the VOC is present in the training data.
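For a single VOC, the test and the two accompanying measures can be computed from a 2×2 table of presence/absence by case status. The following is a minimal sketch using only the standard library; the function names and the example counts are hypothetical, and the two-sided p-value uses the common convention of summing the probabilities of all tables that are no more likely than the observed one, which is assumed here rather than taken from the study.

```python
from math import comb

def fisher_exact_two_sided(cases_present, controls_present,
                           cases_absent, controls_absent):
    """Two-sided Fisher's exact test for a 2x2 presence/absence table."""
    row_present = cases_present + controls_present
    row_absent = cases_absent + controls_absent
    n_cases = cases_present + cases_absent
    total = row_present + row_absent
    denom = comb(total, n_cases)

    def prob(x):
        # Hypergeometric probability of x cases having the VOC present.
        return comb(row_present, x) * comb(row_absent, n_cases - x) / denom

    p_observed = prob(cases_present)
    low = max(0, n_cases - row_absent)
    high = min(row_present, n_cases)
    p_value = sum(prob(x) for x in range(low, high + 1)
                  if prob(x) <= p_observed * (1 + 1e-9))
    return min(1.0, p_value)

def sensitivity_specificity(cases_present, controls_present,
                            cases_absent, controls_absent):
    """Performance of the rule 'predict a case if the VOC is present'."""
    sensitivity = cases_present / (cases_present + cases_absent)
    specificity = controls_absent / (controls_present + controls_absent)
    return sensitivity, specificity

# Hypothetical counts: 3 of 4 cases and 1 of 4 controls have the VOC present.
print(round(fisher_exact_two_sided(3, 1, 1, 3), 4))
print(sensitivity_specificity(3, 1, 1, 3))
```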
Table 24 provides the AUC for the training and test data for the presence/absence data for the top eight unquantifiable compounds in the study. Ranking was based on the p-values of the Fisher's exact test for no association between presence/absence and S1LC case status in the training data. Current software implementations of AUC (as implemented in the function prediction in R package ROCR) are used, though this may be inappropriate for binary predictors. A better measure of AUC is estimating the AUC without adding in ties, which tends to provide lower values of AUC. However, this version of AUC is used to keep the AUC calculations consistent within this report.
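The effect of ties on the AUC of a binary predictor can be made explicit. In the sketch below, the function name and the example indicators are hypothetical, and reading "without adding in ties" as counting only strictly concordant case-control pairs is an interpretation assumed for illustration. With a 0/1 predictor, counting each tied pair as one half gives (sensitivity + specificity)/2, whereas counting only strictly concordant pairs gives sensitivity × specificity, which can never be larger.

```python
def binary_auc(case_indicators, control_indicators):
    """AUC of a 0/1 predictor computed two ways.
    Returns (auc_with_half_ties, auc_strict)."""
    sensitivity = sum(case_indicators) / len(case_indicators)
    specificity = 1.0 - sum(control_indicators) / len(control_indicators)
    # Tied case-control pairs contribute one half each.
    auc_with_half_ties = (sensitivity + specificity) / 2.0
    # Only pairs with case = 1 and control = 0 contribute.
    auc_strict = sensitivity * specificity
    return auc_with_half_ties, auc_strict

# Hypothetical presence indicators: VOC present in 2 of 5 cases and 1 of 5 controls.
cases = [1, 1, 0, 0, 0]
controls = [1, 0, 0, 0, 0]
with_ties, strict = binary_auc(cases, controls)
print(round(with_ties, 3), round(strict, 3))
```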
With the exception of Argon (Training AUC=0.607, Test AUC=0.509), there is good agreement between the training and test AUC. This may be due to the fact that for binary prediction there is no tuning parameter (decision threshold). Thus, the consistency of AUCs is a consequence of the stability of missing VOC proportions in the training and test data. The presence/absence of the VOCs listed in Table 24 could be potentially useful for building prediction models for S1LC cases. However, the definition of presence/absence depends substantially on the technology used and its VOC detection sensitivity. In the absence of information about limits of detection and calibration curves, this information cannot be directly generalized.
In this section the prediction performance of the log10 peak area of unquantifiable VOCs for S1LC case status is explored. Only VOC peaks that were identified as being of “excellent” quality (SI and RSI greater than or equal to 900) are used.
Table 25 displays the S1LC case prediction performance of the log10 peak area of unquantifiable VOCs based on t-tests and AUCs. VOCs are ranked from the smallest to the largest p-value for the t-test, and only VOCs with an AUC larger than 0.55 are shown. Also shown are the number of samples available for each compound, broken down by case status. Table 26 displays results similar to those in Table 25, but includes VOCs that had an AUC greater than 0.55 in either the training or test data sets. VOCs are ranked from the largest to the smallest AUC in the test data.
Phosphonic acid has a large training AUC (0.838), but this is based on a small number of study participants who had this particular VOC detected (9 cases and 11 controls). In the test data the AUC for Phosphonic acid is much smaller (0.538), based on a larger number of study participants who had this particular VOC detected (31 cases and 34 controls). Carbamic acid (training AUC=0.637, test AUC=0.595), Acetone (training AUC=0.572, test AUC=0.698), Carbon dioxide (training AUC=0.658, test AUC=0.512), and Cyclopropane (training AUC=0.571, test AUC=0.532) have been identified as possible targets for further investigation. All compounds in Table 26 could be of interest in future analyses.
Overall, a list of promising unquantifiable VOCs is identified, based both on presence/absence and on peak area. To evaluate the translational potential of these findings, additional studies would need to be conducted, including developing calibration curves to transform peak area values into concentrations and independent validation studies. Given the experience with quantifiable VOCs, results may or may not be reproducible depending on the limits of detection and the patterns of missingness induced by technological limitations.
Many VOCs in exhaled breath had low concentrations, in the range of 0.0001 to 17.4973 μg/L for Acetoin and 0.00011 to 0.22125 μg/L for all other VOCs. Each VOC had a different LOD, and the percentage of VOC measurements below the LOD was high for most VOCs in the combined, training, and test data. Among the thirteen quantifiable VOCs considered in this analysis, only four VOCs were below the LOD in fewer than 10% of all samples: 2-Pentanone (7.6%), Acetoin (3.1%), Heptanal (8.4%), and Dodecane (1.8%). The proportion of measurements below the LOD among cases and controls in testing and training data was similar for all VOCs except p-Cymene. For p-Cymene the percentage of measurements below the LOD was higher in S1LC cases. In the training data, 64% of the measurements were below LOD among cases and 38% among controls. In the test data 70% of the measurements were below LOD among cases and 54% among controls.
As several VOCs had large proportions of observations below the LOD, the predictive performance of being above or below the limit of detection (LOD) was investigated for every VOC. Univariate analyses of the prediction performance of S1LC cases using the predictors “above or below the LOD” indicated that p-Cymene had the highest predictive accuracy (training AUC=0.630; testing AUC=0.580; see Table 20). The limit of detection for p-Cymene was 0.00011 μg/L; thus, the model uses a decision rule of having a p-Cymene breath concentration below 0.00011 μg/L to predict S1LC cases. The test AUCs for the remaining 12 VOCs were close to 0.5, indicating that being above or below the LOD was not predictive of S1LC.
Table 17 presents the results of comparing the mean of the log10 concentration among cases and combined controls in the training, test, and combined test and training data using unpaired t-test. With the exception of Acetoin, the difference between cases and controls was not statistically significant for any of the group comparisons. For Acetoin the difference in the means was: (1) not significant in the training sample (p-value=0.091); (2) significant in the test sample (p-value=0.001); and (3) significant in the combined sample (p-value<0.001). These differences are likely due to the difference in sample size; for example, for Acetoin there are 28 cases and 49 controls in the training data, but there are 85 cases and 133 controls in the combined data.
Table 16 provides individual VOCs S1LC case prediction performance using univariate and multivariate forward selection logistic regression based on log10 concentrations above the LOD. In univariate models (one predictor at a time) Acetoin and Heptanal have training AUC greater than 0.6, while other compounds have AUCs close to 0.5. The AUC for Acetoin is 0.649 in the training data (N=77) and 0.650 in the test data (N=141), indicating that the predictive performance of Acetoin was preserved in the test data. In contrast, the AUC for Heptanal is 0.610 in the training data (N=68) and only 0.511 in the test data (N=138), indicating that Heptanal may not be a reliable predictor of S1LC cases. Dodecane has a consistent, low AUC for training (0.574) and test (0.541) data.
Cumulative AUCs for the multivariate forward selection logistic regression as additional VOCs are included into the model are provided in Table 16 for both the training and test data. Acetoin is the strongest predictor with a training AUC of 0.649 and a test AUC of 0.650. Adding Heptanal increases the training AUC to 0.669 and decreases the test AUC to 0.559. Adding 2-Pentanone to the model increases the training AUC (from 0.669 to 0.689) though the test AUC of 0.601 is lower than the test AUC of 0.65 for Acetoin alone. A two variable model adding either Dodecane or 2-Pentanone could also be considered. However, more complex models are not considered at this time given the low individual AUC values for these VOCs and the high correlations among the other log concentrations of VOC (Table S3 in the supplementary materials).
Results based on VOC concentrations suggest that Acetoin: (1) has most concentrations above the limit of detection; (2) leads to the best predictive model in the test data; and (3) has a stable performance when transitioning from training to test data. Thus, the specific Acetoin concentration thresholds expressed in μg/L and their associated S1LC case prediction performance are examined. Because Acetoin concentrations were, on average, lower in S1LC patients compared to controls, the test follows the following rule:
Predict S1LC if the Acetoin concentration is below the threshold; otherwise, predict control.
The threshold (from training data) can be chosen in many different ways to balance sensitivity and specificity. Here the following thresholds were considered, based on the percentiles of Acetoin concentrations in the training data of controls: (a) the 10th percentile (0.026 μg/L); (b) the 25th percentile (0.044 μg/L); and (c) the 50th percentile (0.098 μg/L).
This was the largest case-control VOC study to date with the inclusion of a healthy control and a housemate control to aid in the elimination of potential environmental confounders for VOCs that may indicate the presence of lung cancer. The control group (S1LC) cases was diverse in terms of covariates and analytic approach of combining type 1 and 2 cases ensures that study results are generalizable to the population. The novelty of the study consists of its focus on: (1) early lung cancer detection, specifically S1LC; (2) a practical, translatable and reproducible signature of breath VOCs for S1LC; (3) a design of experiment targeted at the elimination of potential confounders due to environment, technology, and breath analysis procedure; and (4) the definition of training and testing data sets before data were collected. The data present results that are contrary to the published literature, indicating that: (a) most VOCs published in the literature have a weak or nonexistent association with S1LC; (b) Acetoin, the only VOC that was associated with S1LC, has a much lower predictive performance than the performance of previously published VOC signatures, though none of these results specifically focused on S1LC; and (c) Acetoin concentrations were on average lower (not higher) in the breath of S1LC cases than in controls. Acetoin has an AUC of 0.65 with a sensitivity of 87.1% (specificity of 36.8%) when predicting that a person has S1LC if the Acetoin concentration is below 0.098 μg/L. This is a promising result that will need further investigation as this single VOC approaches the sensitivity of LDCT4.
Acetoin has not been closely studied in relation to lung cancer. A recent review article on VOCs did not describe it as a candidate VOC for the detection of lung cancer; it is typically used in food flavorings, as well as in e-cigarettes. As additional VOCs were added to the model, the test AUC dropped. This is in contrast to multiple other studies. Indeed, in a small study of seventy patients, a signature was identified without providing the specific VOCs, with a sensitivity of 81% and specificity of 91%. A prior study of 229 participants reported an AUC of 0.81, though the VOCs used were not disclosed. Another group studied 2-butanone, 3-hydroxy-2-butanone, 2-hydroxyacetaldehyde and 4-hydroxyhexanal in a large study with 405 participants, and was able to show a sensitivity and specificity of 93.6% and 85.6%, respectively. There are concerns about these studies, especially because: (1) the data are not available; (2) the methods used are only superficially described; (3) the analytic methods used can be over-fit; (4) VOC measurements are not expressed in concentration units, which implies that the measurement values may be indistinguishable from the experimental noise; and (5) there are many levels of data processing and cleaning that cannot be understood when data and code are not reproducible.
There are several limitations to this trial. First, the presence of dead space in the lung can dilute VOCs in the same breath. To combat this, a separate Tedlar® bag was used for the first 150-200 cc of exhalation, followed by the rest of the breath into a 1 L Tedlar® bag. Second, the effect of condensation on VOCs is unknown and, unfortunately, this effect was not controllable in the Tedlar® bags. Third, it is not possible to control for all environmental exposures, so there may be confounders present that were not considered; this includes the potential that participants did not abstain from smoking, vaping or drinking prior to breath collection. Fourth, although the protocol planned to analyze all breaths within a 24-hour period, this was not always the case. It is possible that these delays could have led to changes in the VOC concentrations in the Tedlar® bags. Fifth, S1LC was the focus, which may not be associated with substantial changes in breath VOCs. This leaves the possibility that changes may occur in more advanced stages of lung cancer. Sixth, the requirement to abstain from smoking, vaping, or drinking for at least 30 minutes prior to collecting exhaled breath was chosen as a reasonable compromise for the participants and the study feasibility; however, different interval lengths could affect the concentration of individual VOCs. Last, many of the demographic confounders were based on recall, such as a family history of cancer; selective memory may have played a part in answers when participants are being biopsied to assess whether they have cancer or not.
Lung cancer is the number one cause of cancer related deaths in the United States1. The 5-year survival of patients identified to have lung cancer drastically decreases with each advancing stage. In the most recent American Cancer Society statistics, the 5-year survival for localized, regional and distant was 61%, 35%, 6%, respectively. Given the drastic decrease in survival for every increasing stage, a minimally invasive, accurate diagnostic test is needed.
Although the present invention has been described in connection with preferred embodiments thereof, it will be appreciated by those skilled in the art that additions, deletions, modifications, and substitutions not specifically described may be made without departing from the spirit and scope of the invention as defined in the appended claims.
This application claims the benefit of U.S. Provisional Patent Application No. 63/154,997 filed on Mar. 1, 2021, which is incorporated by reference, herein, in its entirety.
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/US2022/018356 | 3/1/2022 | WO |
Number | Date | Country | |
---|---|---|---|
63154997 | Mar 2021 | US |