Described herein are methods that use blood biomarkers to identify increased risk of multiple independent infection episodes (MIIE) before clinical signs of infection appear.
The high value of tailored approaches in the care of patients is increasingly appreciated (Chaussabel, 2015; Parikh et al., 2016; Sweeney and Wong, 2016). A method that moves threat detection to before infection occurs could yield valuable time for early prophylactic and therapeutic interventions. Moreover, the ability to identify patients at high risk for repeated infections, or infection-related morbidity and mortality, might be considered an important measure for fairly allocating resources such as medication, personal protective equipment, or other high-value scarce interventions (Massachusetts, 2020; Medicine, 2013; Medicine, 2020).
Trauma is one of the leading causes of morbidity and mortality worldwide (Heron, 2018; Krug et al., 2000). Severe trauma acutely induces various immune-related responses: it can trigger a state of immunosuppression (Islam et al., 2016; Ward, 2005), prolonged inappropriate immune response (Heffernan et al., 2012; Huber-Lang et al., 2018), leukocytosis (Paladino et al., 2010), and the elevation of specific subpopulations of myeloid cells (Cuenca et al., 2011). Among trauma patients, infections and infection-related complications contribute to substantial mortality and morbidity and to prolonged hospital stays, significantly adding to health care costs (Cole et al., 2014; Dutton et al., 2010; Glance et al., 2011; Hashmi et al., 2014). Infections and infection-related outcomes vary across individuals, suggesting the importance of considering individual patients' underlying susceptibility and the degree of immunosuppression or inappropriate immune response experienced.
Severe trauma predisposes patients to multiple independent infection episodes (MIIE), leading to augmented morbidity and mortality. Described herein are methods to identify increased MIIE risk before clinical signs appear; these methods are fundamentally different from existing approaches, which detect infections only after they are established. Unbiased machine learning algorithms were applied to genome-wide transcriptome data from leukocytes of 128 adult blunt trauma patients (42 MIIE cases and 85 non-cases), collected ≤48 hours after injury and >3 days before any infection. 15-transcript and 26-transcript multi-biomarker panel models were constructed with the least absolute shrinkage and selection operator (LASSO) and Elastic Net, respectively; these models accurately predicted MIIE (AUROC [95% CI]: 0.90 [0.84-0.96] and 0.92 [0.86-0.96]) and significantly outperformed clinical models. Gene Ontology and network analyses found various pathways to be relevant. External validation found the model to be generalizable. This unique precision medicine approach can be applied to a wide range of patient populations and outcomes.
Thus, provided herein are methods for detecting or predicting risk of developing multiple independent infection episodes (MIIE) in a subject who has experienced blunt trauma. The methods include providing a sample comprising blood, serum, or plasma from a subject who has experienced blunt trauma; detecting one, two, three, four, five, six, seven, eight, nine, ten, 11, 12, 13, 14, or all of the transcripts in a 15 probe set biomarker panel comprising or consisting of hepatocyte growth factor (HGF); kelch repeat and BTB domain containing 7 (KBTBD7); adenosine A3 receptor (ADORA3); ADP-ribosylation factor-like GTPase 4A (ARL4A); epiplakin 1 (EPPK1); zinc finger protein 354A (ZNF354A); SH3 and PX domains 2B (SH3PXD2B); RNase A family 1 (RNASE1); BTB domain containing 19 (BTBD19); zeta chain of T cell receptor associated protein kinase 70 kDa (ZAP70); endoplasmic reticulum aminopeptidase 2 (ERAP2); CD96 molecule (CD96); membrane metallo-endopeptidase (MME); killer cell lectin-like receptor subfamily F, member 1 (KLRF1); and/or non-coding transcript, nuclear paraspeckle assembly transcript 1 (NEAT1).
The methods can include comparing the level of the transcript to a reference level, wherein the presence of: an increase in one or more of: hepatocyte growth factor (HGF); kelch repeat and BTB domain containing 7 (KBTBD7); adenosine A3 receptor (ADORA3); ADP-ribosylation factor-like GTPase 4A (ARL4A); epiplakin 1 (EPPK1); zinc finger protein 354A (ZNF354A); SH3 and PX domains 2B (SH3PXD2B); RNase A family 1 (RNASE1); or BTB domain containing 19 (BTBD19), or a decrease in one or more of: zeta chain of T cell receptor associated protein kinase 70 kDa (ZAP70); endoplasmic reticulum aminopeptidase 2 (ERAP2); CD96 molecule (CD96); membrane metallo-endopeptidase (MME); killer cell lectin-like receptor subfamily F, member 1 (KLRF1); and/or non-coding transcript, nuclear paraspeckle assembly transcript 1 (NEAT1), indicates that the subject has, or has an increased risk of developing (i.e., a risk above that of a reference cohort of subjects with blunt trauma), an infection, within the next 2, 3, 4, 5, 6, 7, 10, 12, 14, 20, 21, or 24 days. Alternatively or in addition, the methods can include calculating a score based on the transcript levels, and comparing the score to a threshold or reference score, wherein a score above the threshold or reference score indicates that the subject has, or has an increased risk of developing, MIIE. In some embodiments, all of the transcripts in the 15 probe set biomarker panel are used; in some embodiments, 14 are used, excluding SH3PXD2B.
In some embodiments, the methods include determining a level of all of ZNF354A; EPPK1; RNASE1; BTBD19; ADORA3; KBTBD7; SH3PXD2B; CD96; ARL4A; ZAP70; ERAP2; HGF; KLRF1; NEAT1; and MME.
In some embodiments, the methods also include determining a level of one or more of interleukin 1 receptor, type II (IL1R2); mannose receptor, C type 1 (MRC1); importin 11 (IPO11); dedicator of cytokinesis 4 (DOCK4); Kruppel-like factor 9 (KLF9); nebulette (NEBL); a different probe set for MME; ribosomal protein S6 kinase, 90 kDa, polypeptide 5 (RPS6KA5); killer cell lectin-like receptor, subfamily K, member 1 (KLRK1); granzyme K (GZMK); and/or ADP-ribosylation factor-like GTPase 4C (ARL4C).
In some embodiments, the methods include determining a level of ARL4C; KLRK1; SH3PXD2B; EPPK1; ZNF354A; RNASE1; MME; BTBD19; MRC1; ADORA3; NEBL; KBTBD7; IPO11; RPS6KA5; KLF9; DOCK4; IL1R2; CD96; ARL4A; ERAP2; GZMK; NEAT1; KLRF1; HGF; MME; and ZAP70.
In some embodiments, an increase in one or more of interleukin 1 receptor, type II (IL1R2); mannose receptor, C type 1 (MRC1); importin 11 (IPO11); dedicator of cytokinesis 4 (DOCK4); Kruppel-like factor 9 (KLF9); and/or nebulette (NEBL), or a decrease in one or more of MME; ribosomal protein S6 kinase, 90 kDa, polypeptide 5 (RPS6KA5); killer cell lectin-like receptor, subfamily K, member 1 (KLRK1); granzyme K (GZMK); and/or ADP-ribosylation factor-like GTPase 4C (ARL4C) indicates that the subject has, or has an increased risk of developing (i.e., a risk above that of a reference cohort of subjects with blunt trauma), an infection, e.g., within the next 2, 3, 4, 5, 6, 7, 10, 12, 14, 20, 21, or 24 days.
Also provided herein are methods comprising: providing a sample comprising blood, serum, or plasma from a subject who has experienced blunt trauma; detecting a level of transcripts consisting of one, two, three, four, five, six, seven, eight, nine, ten, 11, 12, 13, 14, or all of hepatocyte growth factor (HGF); kelch repeat and BTB domain containing 7 (KBTBD7); adenosine A3 receptor (ADORA3); ADP-ribosylation factor-like GTPase 4A (ARL4A); epiplakin 1 (EPPK1); zinc finger protein 354A (ZNF354A); SH3 and PX domains 2B (SH3PXD2B); RNase A family 1 (RNASE1); BTB domain containing 19 (BTBD19); zeta chain of T cell receptor associated protein kinase 70 kDa (ZAP70); endoplasmic reticulum aminopeptidase 2 (ERAP2); CD96 molecule (CD96); membrane metallo-endopeptidase (MME); killer cell lectin-like receptor subfamily F, member 1 (KLRF1); and non-coding transcript, nuclear paraspeckle assembly transcript 1 (NEAT1).
In some embodiments, the methods include determining a level of all of ZNF354A; EPPK1; RNASE1; BTBD19; ADORA3; KBTBD7; SH3PXD2B; CD96; ARL4A; ZAP70; ERAP2; HGF; KLRF1; NEAT1; and MME.
In some embodiments, the methods include determining a level of one or more transcripts selected from the group consisting of interleukin 1 receptor, type II (IL1R2); mannose receptor, C type 1 (MRC1); importin 11 (IPO11); dedicator of cytokinesis 4 (DOCK4); Kruppel-like factor 9 (KLF9); nebulette (NEBL); a different probe set for MME; ribosomal protein S6 kinase, 90 kDa, polypeptide 5 (RPS6KA5); killer cell lectin-like receptor, subfamily K, member 1 (KLRK1); granzyme K (GZMK); and ADP-ribosylation factor-like GTPase 4C (ARL4C).
In some embodiments, the methods include determining a level of ARL4C; KLRK1; SH3PXD2B; EPPK1; ZNF354A; RNASE1; MME; BTBD19; MRC1; ADORA3; NEBL; KBTBD7; IPO11; RPS6KA5; KLF9; DOCK4; IL1R2; CD96; ARL4A; ERAP2; GZMK; NEAT1; KLRF1; HGF; MME; and ZAP70.
In some embodiments, the methods include calculating a score based on the levels of the transcripts. In some embodiments, the score is calculated using an algorithm comprising summation or weighted summation of normalized levels of the biomarkers. In some embodiments, the score is calculated using principal components analysis (PCA), linear regression, support vector machine (SVM), decision trees, K-nearest neighbors (KNN), K-means, gradient boosting, or random forest methods. The methods can thus include comparing a score to a threshold or reference score, wherein a score above the threshold or reference score indicates that the subject has a high risk of MIIE.
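As a minimal sketch of the weighted-summation scoring described above, the following Python example normalizes each transcript level against a reference cohort (z-score) and sums the weighted values. The weights, the threshold, the four-transcript subset, and the function names (`zscore`, `miie_score`, `is_high_risk`) are hypothetical placeholders chosen for illustration; they are not coefficients or cutoffs from the models described herein.

```python
from statistics import mean, stdev

# Hypothetical weights for illustration only. Positive weights are given to
# transcripts described as increased in MIIE, negative weights to those
# described as decreased. No actual model coefficients are implied.
EXAMPLE_WEIGHTS = {"HGF": 0.8, "ADORA3": 0.5, "ZAP70": -0.6, "CD96": -0.4}


def zscore(value, reference_values):
    """Normalize one transcript level against a reference cohort."""
    return (value - mean(reference_values)) / stdev(reference_values)


def miie_score(levels, reference, weights=EXAMPLE_WEIGHTS):
    """Weighted sum of normalized transcript levels."""
    return sum(w * zscore(levels[g], reference[g]) for g, w in weights.items())


def is_high_risk(score, threshold=1.0):
    """A score above the (assumed) threshold flags increased MIIE risk."""
    return score > threshold


# Toy reference cohort: every transcript has mean 9.0 and standard deviation 1.0.
reference = {g: [8.0, 9.0, 10.0] for g in EXAMPLE_WEIGHTS}
subject = {"HGF": 11.0, "ADORA3": 10.0, "ZAP70": 7.0, "CD96": 8.0}

score = miie_score(subject, reference)
flag = is_high_risk(score)
```

In practice the weights and threshold would come from a fitted model and a chosen operating point rather than from fixed constants as assumed here.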
In some embodiments, determining a level comprises using PCR-based methods (e.g., RT-qPCR), RNA sequencing, next generation sequencing, digital gene expression analysis, or microarray analysis.
In some embodiments, the sample is obtained from the subject within 48 hours of the blunt trauma.
In some embodiments, the subject does not have clinical signs of an infection (e.g., fever, elevated white blood cell count, or other signs) when the sample is obtained.
In some embodiments, the infections include infection with a Gram-positive or Gram-negative bacterium, virus, or fungus, e.g., Streptococcus pneumoniae, Streptococcus viridans, Staphylococcus aureus, Enterococcus species, coagulase-negative staphylococci, Clostridium species, Enterobacter species, Acinetobacter species, Pseudomonas aeruginosa, Haemophilus influenzae, Bacteroides species, Klebsiella pneumoniae, Neisseria, Proteus, Serratia marcescens, Escherichia coli, Stenotrophomonas, and Candida species.
In some embodiments, the methods include treating the subject, e.g., with a broad-spectrum antibiotic; increasing the frequency or length of monitoring of the subject for infection; implementing prophylactic measures; and/or enhancing patient nutrition. Those who are identified to be at high risk could benefit from increased surveillance and additional preventative measures taken early. Additional interventions for the high-risk group can include increased surveillance for early mobilization and removal of lines/tubes, coating IV lines and urinary catheters with antimicrobials and/or antibiotics, and immunomodulatory nutrition therapies.
In some embodiments, the subject is a mammal, e.g., a human.
Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. Methods and materials are described herein for use in the present invention; other, suitable methods and materials known in the art can also be used. The materials, methods, and examples are illustrative only and not intended to be limiting. All publications, patent applications, patents, sequences, database entries, and other references mentioned herein are incorporated by reference in their entirety. In case of conflict, the present specification, including definitions, will control.
Other features and advantages of the invention will be apparent from the following detailed description and figures, and from the claims.
Methods to identify trauma patients with particularly increased risk of infection could be advantageous for ensuring timely and appropriate delivery of preventative measures (such as early immune-modulating nutrition, microbiome modulation, early mobilization, early removal of lines/tubes, and taking all transmission-based precautions), improving surveillance, and promoting antibiotic stewardship to limit the emergence of multidrug-resistant bacteria, reduce toxicity to patients, and decrease health care costs. Previous studies have evaluated the use of injury severity scores, such as the Acute Physiology and Chronic Health Evaluation (APACHE) II (Knaus et al., 1985), Injury Severity Score (ISS) (Baker et al., 1974), and New Injury Severity Score (NISS) (Osler et al., 2010), as predictors of infection, in addition to their intended use to predict mortality (Cheadle et al., 1989; Jamulitrat et al., 2002). Using genome-wide transcriptomic information from leukocytes provided at triage to assess the underlying susceptibility, well before the onset of infections, is expected to significantly improve the accuracy of identifying patients who are most at risk of multiple independent infection episodes. A recent study described that employing a combination of predictors could be more effective than using a single predictor with strong statistical significance, further suggesting that multi-biomarker panels could be highly effective (Lo et al., 2015). Previous studies have utilized transcriptome data in the trauma setting to find transcripts that correlate with poor outcome (Desai et al., 2011) or sepsis (Sweeney et al., 2015). The objective of this study is distinct, as we aim to develop a method to predict multiple infections prior to classic clinical signs of infection. Thus, our approach focuses on the prevention of infections, aiming to predict the outcome before its onset, using early blood samples.
In a previous study among burn trauma patients, a blood transcriptomic multi-biomarker panel was developed for predicting multiple independent infection episodes (MIIE) outcome during the course of recovery (Yan et al., 2015). The least absolute shrinkage and selection operator (LASSO) machine learning algorithm was employed to select probe sets that together (i.e., a multi-transcriptome panel) resulted in highly accurate prediction. This model performed significantly better than those based on injury severity assessments at triage and demographic information (Yan et al., 2015). Described herein is a new approach combining the use of two algorithms, LASSO and Elastic Net regression, to investigate MIIE outcome. LASSO and Elastic Net were used to reduce the complexity of regression models, in conjunction with cross-validation to select the optimal parameter for reducing the number of predictors. These techniques are highly beneficial in cases such as transcriptome data where the number of potential predictors is extremely large, to overcome the problem of overfitting. LASSO regression reduces highly correlated predictors and selects a minimal panel of predictors, compared to Elastic Net that includes some correlated predictors. Blunt trauma patients in the multi-center Inflammation and Host Response to Injury (“Glue Grant”) cohort were investigated. This cohort enrolled a high number of patients, generated genome-wide transcriptome data, and collected data longitudinally, allowing the building of a predictive model. The present methods employ unbiased analyses of genome-wide information for the identification of patients at increased risk for MIIE before clinical signs of infection. These methods are also advantageous for providing new insights into the molecular pathways that characterize the pathophysiology underlying hypersusceptibility to infections.
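The contrast between LASSO and Elastic Net on correlated predictors can be illustrated with a toy example. The sketch below is a plain-Python cyclic coordinate descent for a penalized linear regression; it is an assumption-laden illustration, not the pipeline used in the study. The function names (`soft_threshold`, `elastic_net_cd`), the four-sample dataset, and the fixed penalty values `lam` and `alpha` are all hypothetical, and the penalty strength is fixed here rather than selected by cross-validation.

```python
def soft_threshold(rho, t):
    """Shrink rho toward zero by t; values within [-t, t] become exactly 0."""
    if rho > t:
        return rho - t
    if rho < -t:
        return rho + t
    return 0.0


def elastic_net_cd(X, y, lam, alpha, n_sweeps=500):
    """Cyclic coordinate descent minimizing
    (1/2n)*||y - Xb||^2 + lam*(alpha*||b||_1 + (1 - alpha)/2*||b||_2^2).
    alpha = 1.0 gives LASSO; 0 < alpha < 1 gives Elastic Net."""
    n, p = len(X), len(X[0])
    b = [0.0] * p
    for _ in range(n_sweeps):
        for j in range(p):
            rho = 0.0
            z = 0.0
            for i in range(n):
                # Residual with predictor j left out of the current fit.
                partial = y[i] - sum(X[i][k] * b[k] for k in range(p) if k != j)
                rho += X[i][j] * partial
                z += X[i][j] ** 2
            rho /= n
            z /= n
            b[j] = soft_threshold(rho, lam * alpha) / (z + lam * (1.0 - alpha))
    return b


# Toy data: columns 0 and 1 are identical (perfectly correlated),
# column 2 is unrelated to the outcome.
X = [[-1.0, -1.0, -1.0],
     [1.0, 1.0, -1.0],
     [-1.0, -1.0, 1.0],
     [1.0, 1.0, 1.0]]
y = [-2.0, 2.0, -2.0, 2.0]

lasso_coefs = elastic_net_cd(X, y, lam=0.2, alpha=1.0)
enet_coefs = elastic_net_cd(X, y, lam=0.2, alpha=0.5)
```

On this toy data, the LASSO fit keeps only one of the two identical predictors (which one depends on the update order), whereas the Elastic Net fit spreads the weight equally across both, and both fits drop the unrelated third predictor.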
The present study shows that employing novel prognostic models based on early blood transcriptome profiling following severe trauma is an effective method for identifying patients who are at particularly high risk for MIIE and thus hypersusceptible to infections. That the transcriptome information provides much better prediction than injury severity information argues for the importance of considering each patient's underlying susceptibility and of elucidating relevant molecular mechanisms. The comprehensive dataset we used had genome-wide transcriptome and clinical data collected longitudinally from a large number of patients, providing the opportunity to assess early susceptibility. Notably, these results suggest that by measuring the biomarkers at admission, patients at increased risk for MIIE can be identified before any clinical signs of infection appear. The biomarker panel models used herein had particularly high specificity and NPV measures, while also exhibiting good sensitivity and PPV. Moreover, when applied to an external validation cohort, the models remained predictive. On the other hand, none of the injury severity scores often used at triage in the trauma setting were effective in predicting MIIE.
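For reference, the four performance measures named above follow directly from a confusion matrix. The sketch below assumes binary labels (1 = MIIE case, 0 = non-case); the function name `diagnostic_metrics` and the toy labels are hypothetical and do not reproduce any result from the study.

```python
def diagnostic_metrics(y_true, y_pred):
    """Sensitivity, specificity, PPV, and NPV from binary labels (1 = case)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)

    def ratio(num, den):
        # Undefined ratios (empty denominator) are reported as NaN.
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),  # cases correctly flagged
        "specificity": ratio(tn, tn + fp),  # non-cases correctly cleared
        "ppv": ratio(tp, tp + fp),          # flagged subjects who are cases
        "npv": ratio(tn, tn + fn),          # cleared subjects who are non-cases
    }


# Toy labels only: 3 cases and 5 non-cases.
truth = [1, 1, 1, 0, 0, 0, 0, 0]
predicted = [1, 1, 0, 0, 0, 0, 0, 1]
metrics = diagnostic_metrics(truth, predicted)
```

Because PPV and NPV depend on how common the outcome is in the cohort, values computed in one population do not transfer directly to a population with a different MIIE rate.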
Application of both LASSO and Elastic Net regression methods allowed the construction of a highly predictive model from a minimal set of predictors. The LASSO approach selects a stringent set of predictors with less redundancy, which is advantageous in the clinical setting, where a device requiring fewer measurements is more practical and easier to implement. The Elastic Net approach, which allows correlated predictors to be selected, found additional transcripts for a more comprehensive discovery of biological mechanisms. The probe sets selected consisted of transcripts with GO terms relevant to infections and signaling pertinent to oncogenesis and cancer progression.
In the present study, HGF was the transcript showing the highest upregulation in the blood of MIIE patients, in both the 15 probe set and 26 probe set panels. HGF and Met expression levels have been suggested as putative biomarkers for monitoring infections, as it is well established that deregulation of the HGF-Met signaling pathway promotes growth and invasion by various pathogens (Imamura and Matsumoto, 2017). Another upregulated transcript, ADORA3, has also been implicated in the clinical setting, and agonists have been developed and shown to induce anti-inflammatory effects by altering the Wnt and NF-κB pathways. As such, the agonists are being considered for treating cancers and inflammatory diseases such as rheumatoid arthritis and psoriasis (Fishman et al., 2012). NEAT1 is a non-coding RNA that has been shown to colocalize with MALAT1, a long non-coding RNA often associated with metastatic cancer, at many genomic sites to transcriptionally regulate target genes (West et al., 2014). CD96 is highly expressed in T and NK cells and is well established as a regulator of immune responses during infection and cancer (Georgiev et al., 2018).
Elastic Net selected two probe sets corresponding to MME, providing further support for its importance in MIIE outcome. Studies on its molecular mechanisms and on the clinical use of inhibitors of its protein product, neprilysin, have been conducted widely, including in Alzheimer's disease, heart failure, hypertension, and renal diseases (Riddell and Vader, 2017). Our study suggests that its potential role in immunity among patients warrants further investigation. KLRK1 and KLRF1, both killer cell lectin-like receptor subfamily members, were found by Elastic Net, providing evidence of their relevance in infections in the blunt trauma setting. These receptors are abundant on NK cells, and it is well established that they play crucial roles in innate immunity (Barten et al., 2001). These previous findings provide additional confidence in the relevance of our methodology. The pathway analysis found key signaling pathway components among the central nodes having extensive edges, including the major cytokines, TNF, TGFβ-1, CCLs, and CXCLs, as well as key signaling components, p38 MAPK, ERK1/2, SMAD3, and CTNNB1. These components represent the chief signaling pathways that regulate inflammation, mitogenic response, and tissue regeneration, which are also often dysregulated in cancer. Notably, the present results suggest that the TNF, TGFβ-1, and Wnt signaling pathways, which are known to also cross-talk with one another through downstream cascades, may be important central pathways that explain the interconnection between the prognostic biomarkers identified. These results suggest that these signaling pathways may represent new host immunomodulatory targets that warrant future mechanistic studies. Follow-up studies in model organisms and controlled studies would aid in establishing whether the genes identified in this study drive susceptibility, and in uncovering further mechanistic insights.
It is noteworthy that when the current biomarker panels in the blunt trauma setting were compared with those from our previous study among burn patients (Yan et al., 2015), none of the transcripts in the panels were shared. These differences may indicate that increased risk depends on the interaction of the type of trauma with each patient's underlying susceptibility to MIIE.
The present study describes methods towards the development of precision medicine tools and offers the possibility of analyses also for outcomes other than multiple infections. The failure of drug trials targeting sepsis (Marshall, 2014; Mitka, 2011) highlights the importance of further studies elucidating the underlying molecular mechanisms and components of heterogeneity in susceptibility to infection and infection-related morbidity within a population. Measuring the biomarker panels described herein to triage patients according to susceptibility to multiple infections can be used to strategically guide prophylactic patient management and help reduce the incidence of infections to limit sepsis (Boomer et al., 2011; Chaussabel, 2015; Parikh et al., 2016).
This study provides, for the first time, prediction models for hypersusceptibility to infections, which is highly relevant for critically injured trauma patients, using a machine learning approach. A general concern in prediction model building is that models may overfit to a specific dataset, making them less generalizable. However, using the multi-biomarker panel derived from the Glue Grant population to make predictions in the Cabrera et al. population yielded a relatively high AUROC, demonstrating the broader applicability of the biomarker panel model. These two populations had comparable injury severity; however, they were considerably different in their geographic locations and healthcare systems. Moreover, the gene expression levels were measured by two different transcriptome technologies (Illumina and Affymetrix), and the Cabrera et al. dataset was very small in sample size. Despite these differences, our multi-biomarker panels still conferred prediction, providing additional assurance in the validity of our results and evidence for the generalizability of our model. The present methods provide novel approaches for predicting outcomes from blood transcriptome information at admission.
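The AUROC used to summarize performance can be read as the probability that a randomly chosen case receives a higher model score than a randomly chosen non-case. The sketch below computes it by direct pairwise comparison, which is equivalent to the Mann-Whitney formulation; the function name `auroc` and the toy scores are hypothetical and unrelated to the study data.

```python
def auroc(case_scores, noncase_scores):
    """Fraction of (case, non-case) pairs ranked correctly; ties count as 0.5."""
    wins = 0.0
    for c in case_scores:
        for n in noncase_scores:
            if c > n:
                wins += 1.0
            elif c == n:
                wins += 0.5
    return wins / (len(case_scores) * len(noncase_scores))


# Toy model scores only.
cases = [0.9, 0.8, 0.4]
noncases = [0.7, 0.3, 0.2, 0.1]
area = auroc(cases, noncases)
```

An AUROC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation. The pairwise loop is quadratic in cohort size, which is acceptable for a sketch but would normally be replaced by a rank-based computation.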
Early MIIE identification, prior to any clinical sign of infection, could be an indispensable tool in other types of trauma and in a wide range of clinical settings. Uncovering biomarkers of increased susceptibility to infections may open new avenues for novel therapeutic targets, as well as contribute to standardizing populations in clinical trials. Although predictive algorithms cannot eliminate medical uncertainty, our analysis method is expected to be widely applicable to other susceptible populations, such as those with diabetes or cardiac disease, the frail elderly population, those treated with immunosuppressive medication, as well as others. The described methodology of multi-biomarker panel development has the potential to be applied to outcomes and clinical contexts other than MIIE and trauma, providing additional value.
Provided herein are methods for detecting or predicting risk of developing multiple independent infection episodes (MIIE), i.e., two or more infections, in a subject who has experienced blunt trauma, but who does not presently have an infection. In some embodiments, the subject is a mammal, e.g., a human. The methods can be used to determine that the subject has, or has an increased risk of developing (i.e., a risk above that of a reference cohort of subjects with blunt trauma), an infection, e.g., developing two or more infections before discharge, e.g., within the next 2, 3, 4, 5, 6, 7, 10, 12, 14, 20, 21, or 24 days or more, e.g., within the next 3-120 days.
The infections can be, e.g., an infection with a Gram-positive or Gram-negative bacterium, virus, or fungus, e.g., Streptococcus pneumoniae, Streptococcus viridans, Staphylococcus aureus, Enterococcus species, coagulase-negative staphylococci, Clostridium species, Enterobacter species, Acinetobacter species, Pseudomonas aeruginosa, Haemophilus influenzae, Bacteroides species, Klebsiella pneumoniae, Neisseria, Proteus, Serratia marcescens, Escherichia coli, Stenotrophomonas, and Candida species.
The methods include evaluating levels of biomarkers, e.g., the 15 probe set biomarker transcript panel or the 26 probe set biomarker transcript panel, in a sample. Preferably, the sample includes blood, serum, or plasma, e.g., comprising leukocytes, from a subject who has experienced blunt trauma, e.g., within the previous 72, 60, or 48 hours. The methods can include evaluating levels of subsets of the biomarkers, e.g., 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or all 15 transcripts in the 15 probe set biomarker panel.
In some embodiments, the methods include detecting one or more, e.g., 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or all 15 transcripts in a 15 probe set biomarker panel comprising or consisting of hepatocyte growth factor (HGF); kelch repeat and BTB domain containing 7 (KBTBD7); adenosine A3 receptor (ADORA3); ADP-ribosylation factor-like GTPase 4A (ARL4A); epiplakin 1 (EPPK1); zinc finger protein 354A (ZNF354A); SH3 and PX domains 2B (SH3PXD2B); RNase A family 1 (RNASE1); BTB domain containing 19 (BTBD19); zeta chain of T cell receptor associated protein kinase 70 kDa (ZAP70); endoplasmic reticulum aminopeptidase 2 (ERAP2); CD96 molecule (CD96); membrane metallo-endopeptidase (MME); killer cell lectin-like receptor subfamily F, member 1 (KLRF1); and/or non-coding transcript, nuclear paraspeckle assembly transcript 1 (NEAT1) using a method known in the art.
In some embodiments, the methods include determining a level of all of ZNF354A; EPPK1; RNASE1; BTBD19; ADORA3; KBTBD7; SH3PXD2B; CD96; ARL4A; ZAP70; ERAP2; HGF; KLRF1; NEAT1; and MME.
In some embodiments, the methods also include determining a level of one or more of interleukin 1 receptor, type II (IL1R2); mannose receptor, C type 1 (MRC1); importin 11 (IPO11); dedicator of cytokinesis 4 (DOCK4); Kruppel-like factor 9 (KLF9); nebulette (NEBL); a different probe set for MME; ribosomal protein S6 kinase, 90 kDa, polypeptide 5 (RPS6KA5); killer cell lectin-like receptor, subfamily K, member 1 (KLRK1); granzyme K (GZMK); and/or ADP-ribosylation factor-like GTPase 4C (ARL4C).
In some embodiments, the methods include determining a level of one or more of interleukin 1 receptor, type II (IL1R2); mannose receptor, C type 1 (MRC1); importin 11 (IPO11); dedicator of cytokinesis 4 (DOCK4); Kruppel-like factor 9 (KLF9); nebulette (NEBL); a different probe set for MME; ribosomal protein S6 kinase, 90 kDa, polypeptide 5 (RPS6KA5); killer cell lectin-like receptor, subfamily K, member 1 (KLRK1); granzyme K (GZMK); and/or ADP-ribosylation factor-like GTPase 4C (ARL4C).
In some embodiments, the methods include determining a level of ARL4C; KLRK1; SH3PXD2B; EPPK1; ZNF354A; RNASE1; MME; BTBD19; MRC1; ADORA3; NEBL; KBTBD7; IPO11; RPS6KA5; KLF9; DOCK4; IL1R2; CD96; ARL4A; ERAP2; GZMK; NEAT1; KLRF1; HGF; MME; and ZAP70.
In some embodiments, the methods can include isolating nucleic acids from the samples. Various methods are well known within the art for the identification and/or isolation and/or purification of a biological marker from a sample. An “isolated” or “purified” biological marker is substantially free of cellular material or other contaminants from the cell or tissue source from which the biological marker is derived, i.e., partially or completely altered or removed from the natural state through human intervention. For example, nucleic acids contained in the sample can be first isolated according to standard methods, for example using lytic enzymes, chemical solutions, or isolated by nucleic acid-binding resins following the manufacturer's instructions.
The presence and/or level of a nucleic acid can be evaluated using methods known in the art, e.g., using polymerase chain reaction (PCR), reverse transcriptase polymerase chain reaction (RT-PCR), quantitative or semi-quantitative real-time RT-PCR, digital PCR, i.e., BEAMing (Beads, Emulsion, Amplification, Magnetics; Diehl (2006) Nat Methods 3:551-559); RNAse protection assay; RNA sequencing (RNA-Seq); Northern blot; various types of nucleic acid sequencing (Sanger, pyrosequencing, Next Generation Sequencing (NGS)); fluorescent in-situ hybridization (FISH); or hybridization-based approaches such as microarrays/chips, e.g., performed by commercially available equipment following manufacturer's protocols, e.g., using the Affymetrix GeneChip technology (Affymetrix, Santa Clara, Calif.), Agilent (Agilent Technologies, Inc., Santa Clara, Calif.), or Illumina (Illumina, Inc., San Diego, Calif.) microarray technology; or digital multiplexed gene expression analysis using the NanoString nCounter system (Kulkarni et al., Curr Protoc Mol Biol. 2011 April; Chapter 25:Unit25B.10); or barcoding methods (e.g., Serial Analysis of Gene Expression (SAGE)) (see also Lehninger Biochemistry (Worth Publishers, Inc., current edition); Sambrook et al., Molecular Cloning: A Laboratory Manual (3rd Edition, 2001); Bernard (2002) Clin Chem 48(8): 1178-1185; Miranda (2010) Kidney International 78:191-199; Bianchi (2011) EMBO Mol Med 3:495-503; Taylor (2013) Front. Genet. 4:142; Yang (2014) PLOS One 9(11):e110641; Nordstrom (2000) Biotechnol. Appl. Biochem. 31(2):107-112; Ahmadian (2000) Anal Biochem 280:103-110). In some embodiments, high throughput methods, e.g., gene chips or microarrays as are known in the art (see, e.g., Ch. 12, Genomics, in Griffiths et al., Eds. Modern Genetic Analysis, 1999, W. H. Freeman and Company; Ekins and Chu, Trends in Biotechnology, 1999, 17:217-218; MacBeath and Schreiber, Science 2000, 289(5485):1760-1763; Hardiman, Microarrays Methods and Applications: Nuts & Bolts, DNA Press, 2003), can be used to detect the presence and/or level of the biomarkers. Measurement of the level of a biomarker can be direct or indirect. For example, the abundance levels of the biomarkers can be directly quantitated. Alternatively, the amount of a biomarker can be determined indirectly by measuring abundance levels of cDNA, amplified RNAs or DNAs, or by measuring quantities or activities of RNAs, or other molecules that are indicative of the expression level of the biomarker. In some embodiments, a technique suitable for the detection of alterations in the structure or sequence of nucleic acids, such as the presence of deletions, amplifications, or substitutions, can be used for the detection of biomarkers of this invention.
RT-PCR can be used to determine the expression profiles of biomarkers (U.S. Patent Application Publication No. 2005/0048542 A1). The first step in expression profiling by RT-PCR is the reverse transcription of the RNA template into cDNA, followed by its exponential amplification in a PCR reaction (Ausubel et al (1997) Current Protocols of Molecular Biology, John Wiley and Sons). To minimize errors and the effects of sample-to-sample variation, RT-PCR is usually performed using an internal standard, which is expressed at a constant level among tissues and is unaffected by the experimental treatment. Housekeeping genes, such as Beta-actin, GAPDH, and Beta-tubulin, can be used.
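One common way to apply an internal standard in quantitative RT-PCR is the comparative Ct (2^-ΔΔCt) calculation, sketched below. This is offered as an illustrative assumption, not as the normalization used in the study: it presumes roughly 100% amplification efficiency for target and housekeeping genes, averages the Ct values of the housekeeping genes, and uses hypothetical function names (`delta_ct`, `relative_expression`) and made-up Ct values.

```python
def delta_ct(ct_target, ct_housekeeping):
    """Target Ct minus the mean Ct of the housekeeping (internal standard) genes."""
    return ct_target - sum(ct_housekeeping) / len(ct_housekeeping)


def relative_expression(ct_target_sample, ct_hk_sample,
                        ct_target_reference, ct_hk_reference):
    """Fold change of the target in the sample versus a reference (2^-ddCt).

    Assumes about 100% amplification efficiency, i.e. the product doubles
    with every PCR cycle.
    """
    ddct = (delta_ct(ct_target_sample, ct_hk_sample)
            - delta_ct(ct_target_reference, ct_hk_reference))
    return 2.0 ** (-ddct)


# Made-up Ct values: the target crosses threshold two cycles earlier in the
# subject than in the reference, with identical housekeeping gene Ct values.
fold_change = relative_expression(
    ct_target_sample=24.0, ct_hk_sample=[18.0, 20.0],
    ct_target_reference=26.0, ct_hk_reference=[18.0, 20.0],
)
```

Here a lower Ct means more starting template, so a two-cycle difference after normalization corresponds to a four-fold higher transcript level in the subject sample.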
In some embodiments, blood transcriptome microarrays can be used. Arrays can be prepared by selecting probes that hybridize to a polynucleotide sequence of a target biomarker, and then immobilizing such probes to a solid support or surface. For example, the probes may comprise DNA sequences, RNA sequences, co-polymer sequences of DNA and RNA, DNA and/or RNA analogues, or combinations thereof. The probe sequences can be, e.g., synthesized either enzymatically in vivo, enzymatically in vitro (e.g. by PCR), or non-enzymatically in vitro.
In some embodiments, the probe sets used in the present methods comprise or consist of probes for the biomarkers listed herein, as well as housekeeping or other genes for normalization. In some embodiments, the methods include determining levels of transcripts in addition to the biomarkers described herein, but the methods include calculating a score based on only the biomarkers described herein.
In some embodiments, if the presence and/or level of a biomarker is comparable to the presence and/or level of the biomarker(s) in the disease reference, then the subject has an increased risk of developing MIIE.
In some embodiments, an increase in one, two, three, four, or more of, e.g., 14 or 15 of: hepatocyte growth factor (HGF); kelch repeat and BTB domain containing 7 (KBTBD7); adenosine A3 receptor (ADORA3); ADP-ribosylation factor-like GTPase 4A (ARL4A); epiplakin 1 (EPPK1); zinc finger protein 354A (ZNF354A); SH3 and PX domains 2B (SH3PXD2B); RNase A family 1 (RNASE1); and/or BTB domain containing 19 (BTBD19), or a decrease in one or more of: zeta chain of T cell receptor associated protein kinase 70 kDa (ZAP70); endoplasmic reticulum aminopeptidase 2 (ERAP2); CD96 molecule (CD96); membrane metallo-endopeptidase (MME); killer cell lectin-like receptor subfamily F, member 1 (KLRF1); and/or the non-coding transcript nuclear paraspeckle assembly transcript 1 (NEAT1), indicates that the subject has, or has an increased risk of developing (i.e., a risk above that of a reference cohort of subjects with blunt trauma), an infection, e.g., within the next 2, 3, 4, 5, 6, 7, 10, 12, 14, 20, 21, or 24 days. In some embodiments, SH3PXD2B is not evaluated.
In some embodiments, an increase in one or more of interleukin 1 receptor, type II (IL1R2); mannose receptor, C type 1 (MRC1); importin 11 (IPO11); dedicator of cytokinesis 4 (DOCK4); Kruppel-like factor 9 (KLF9); and/or nebulette (NEBL), or a decrease in MME, ribosomal protein S6 kinase, 90 kDa, polypeptide 5 (RPS6KA5); killer cell lectin-like receptor, subfamily K, member 1 (KLRK1); granzyme K (GZMK); and/or ADP-ribosylation factor-like GTPase 4C (ARL4C) indicates that the subject has, or has an increased risk of developing (i.e., a risk above that of a reference cohort of subjects with blunt trauma), an infection, e.g., within the next 2, 3, 4, 5, 6, 7, 10, 12, 14, 20, 21, or 24 days.
In some embodiments, the methods include calculating a score based on the presence and/or level of a biomarker as described herein, and if the score is comparable to a disease reference score (e.g., is above or below a threshold score), then the subject has an increased risk of developing MIIE. In some embodiments, the levels of the biomarkers are used to calculate a score. The score can be calculated, e.g., using an algorithm such as summation, or weighted summation, of the (normalized) levels of the biomarkers. Specific algorithms can be identified using known statistical methods including PCA, linear regression, SVM (support vector machine), decision tree, KNN (K-nearest neighbors), K-means, gradient boosting, or random forest methods.
Thus, in some embodiments, to assess whether a subject has an increased risk of developing MIIE, the method can include first log transforming the biomarker values and then assigning a predicted probability, e.g., using a logistic regression model, to produce a probability score. If a subject has a predicted probability score above a selected threshold, e.g., at least 50%, 60%, 70%, 80%, or 90%, the subject would be predicted to have an increased risk of developing MIIE. If the predicted probability score is below the selected threshold, e.g., 50%, the subject would be predicted to have a normal risk of developing MIIE.
For example, in some embodiments, an exemplary model uses a logistic regression analysis to calculate a probability score, wherein each variable (biomarker, X) gets a weight (B). In the exemplary equation below, the weights (B) are calculated for each marker, and there can be unique B values for each of the biomarkers.
The coefficients of the model provide a weighting of the predictors to produce the predicted response variable value. In some embodiments, e.g., for the 15 marker panel, the coefficients are:
Using these exemplary coefficients, algorithms for calculating patients' predicted probability of multiple infections based on levels of each of the biomarkers are:
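$$
\begin{aligned}
\operatorname{Logit}(\mathrm{MIIE}) ={} & 1.67158 + 0.52495\,\mathrm{BTBD19} + 0.57518\,\mathrm{RNASE1} - 0.6925\,\mathrm{MME} \\
& - 0.10097\,\mathrm{ARL4A} + 0.78963\,\mathrm{ZNF354A} - 0.06368\,\mathrm{CD96} - 0.43305\,\mathrm{HGF} \\
& - 0.20691\,\mathrm{ZAP70} - 0.48816\,\mathrm{KLRF1} + 0.47734\,\mathrm{ADORA3} - 0.40628\,\mathrm{ERAP2} \\
& + 0.35635\,\mathrm{KBTBD7} + 0.14436\,\mathrm{SH3PXD2B} + 0.66524\,\mathrm{EPPK1} - 0.63188\,\mathrm{NEAT1}
\end{aligned}
$$
And
$$
\begin{aligned}
\operatorname{Odds}(\mathrm{MIIE}) = \exp\bigl( & 1.67158 + 0.52495\,\mathrm{BTBD19} + 0.57518\,\mathrm{RNASE1} - 0.6925\,\mathrm{MME} \\
& - 0.10097\,\mathrm{ARL4A} + 0.78963\,\mathrm{ZNF354A} - 0.06368\,\mathrm{CD96} - 0.43305\,\mathrm{HGF} \\
& - 0.20691\,\mathrm{ZAP70} - 0.48816\,\mathrm{KLRF1} + 0.47734\,\mathrm{ADORA3} - 0.40628\,\mathrm{ERAP2} \\
& + 0.35635\,\mathrm{KBTBD7} + 0.14436\,\mathrm{SH3PXD2B} + 0.66524\,\mathrm{EPPK1} - 0.63188\,\mathrm{NEAT1} \bigr)
\end{aligned}
$$
And finally:
$$
\operatorname{Probability}(\mathrm{MIIE}) = \frac{\operatorname{Odds}(\mathrm{MIIE})}{1 + \operatorname{Odds}(\mathrm{MIIE})}
$$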
In some embodiments, additional variables are considered, e.g., presence of comorbidities including diabetes or cardiovascular disease, age, or injury severity scores, e.g., APACHEII, ISS, NISS.
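By way of a non-limiting illustration, the exemplary 15-marker logistic model can be sketched as follows. The intercept and weights are the exemplary coefficients given above; the function names, the dictionary-based input, and the default 50% threshold are assumptions made for illustration. The sketch further assumes that the supplied levels are already log-transformed and normalized on the same scale used to fit the coefficients, which is not fully specified here.

```python
import math

# Exemplary intercept and weights (B) for the 15-marker panel.
INTERCEPT = 1.67158
COEFFICIENTS = {
    "BTBD19": 0.52495, "RNASE1": 0.57518, "MME": -0.6925,
    "ARL4A": -0.10097, "ZNF354A": 0.78963, "CD96": -0.06368,
    "HGF": -0.43305, "ZAP70": -0.20691, "KLRF1": -0.48816,
    "ADORA3": 0.47734, "ERAP2": -0.40628, "KBTBD7": 0.35635,
    "SH3PXD2B": 0.14436, "EPPK1": 0.66524, "NEAT1": -0.63188,
}


def miie_probability(levels):
    """Predicted probability of MIIE from a dict of biomarker levels."""
    missing = sorted(set(COEFFICIENTS) - set(levels))
    if missing:
        raise KeyError(f"missing biomarker levels: {missing}")
    # Logit = intercept + weighted sum of the biomarker levels.
    logit = INTERCEPT + sum(b * levels[g] for g, b in COEFFICIENTS.items())
    odds = math.exp(logit)
    return odds / (1.0 + odds)


def has_increased_risk(levels, threshold=0.5):
    """True when the predicted probability exceeds the chosen threshold."""
    return miie_probability(levels) > threshold
```

A stricter threshold (e.g., 0.8 or 0.9) can be passed as `threshold` where fewer false positives are preferred over sensitivity.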
In some embodiments, the amount by which the level (or score) in the subject is less than the reference level (or score) is sufficient to distinguish a subject from a control subject, and optionally is statistically significantly less than the level (or score) in a control subject. In cases where the level (or score) of the biomarker(s) in a subject is equal to the reference level (or score) of the biomarker(s), “being equal” refers to being approximately equal (e.g., not statistically different).
Suitable reference values can be determined using methods known in the art, e.g., using standard clinical trial methodology and statistical analysis. The reference values can have any relevant form. In some cases, the reference comprises a predetermined value for a meaningful score or level of a biomarker, e.g., a control reference level that represents a normal level of a biomarker, e.g., a level in an unaffected subject or a subject who is not at risk of developing MIIE, and/or a disease reference that represents a level of the biomarkers associated with risk of developing MIIE.
The predetermined level or score can be a single cut-off (threshold) value, such as a median or mean, or a level or score that defines the boundaries of an upper or lower quartile, tertile, or other segment of a clinical trial population that is determined to be statistically different from the other segments. It can be a range of cut-off (or threshold) values, such as a confidence interval. It can be established based upon comparative groups, such as where association with risk of developing disease or presence of disease in one defined group is a fold higher, or lower, (e.g., approximately 2-fold, 4-fold, 8-fold, 16-fold or more) than the risk or presence of disease in another defined group. It can be a range, for example, where a population of subjects (e.g., control subjects) is divided equally (or unequally) into groups, such as a low-risk group, a medium-risk group and a high-risk group, or into quartiles, the lowest quartile being subjects with the lowest risk and the highest quartile being subjects with the highest risk, or into n-quantiles (i.e., n regularly spaced intervals) the lowest of the n-quantiles being subjects with the lowest risk and the highest of the n-quantiles being subjects with the highest risk.
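By way of a non-limiting illustration, deriving cut-off values from a reference population and assigning a subject to a quantile-based risk group can be sketched as follows. The function names, the choice of quartiles, and the example scores are hypothetical; in practice, cut-offs would be established from a suitable reference cohort.

```python
from bisect import bisect_right
from statistics import quantiles


def risk_group_cutoffs(reference_scores, n_groups=4):
    """Cut points splitting reference scores into n_groups quantile groups."""
    return quantiles(reference_scores, n=n_groups, method="inclusive")


def assign_group(score, cutoffs):
    """Index of the group for a score: 0 is lowest risk, len(cutoffs) highest."""
    # A score equal to a cut point is placed in the higher group.
    return bisect_right(cutoffs, score)


# Hypothetical reference scores from a control population.
reference = [1, 2, 3, 4, 5, 6, 7, 8, 9]
cutoffs = risk_group_cutoffs(reference)
print(cutoffs, assign_group(8, cutoffs))
```

Setting `n_groups=3` yields tertiles, and a single threshold such as the median corresponds to `n_groups=2`.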
In some embodiments, the predetermined level or score is a level or score determined in the same subject, e.g., at a different time point, e.g., an earlier time point.
Subjects associated with predetermined values are typically referred to as reference subjects. For example, in some embodiments, a control reference subject is one who has a blunt trauma and later experiences 0 or 1 infection events.
A disease reference subject is one who has a blunt trauma and who later experiences 2 or more infections (MIIE).
An increased risk is defined as a risk above the risk of subjects in the general population of subjects who have blunt trauma.
The predetermined value can depend upon the particular population of subjects (e.g., human subjects) selected. For example, an apparently healthy population will have a different ‘normal’ range of levels or scores than will a population of subjects which have, are likely to have, or are at greater risk to have, a disorder described herein. Accordingly, the predetermined values selected may take into account the category (e.g., sex, age, health, risk, presence of other diseases) in which a subject (e.g., human subject) falls. Appropriate ranges and categories can be selected with no more than routine experimentation by those of ordinary skill in the art.
In characterizing likelihood, or risk, numerous predetermined values can be established.
Methods of Treatment
In some embodiments, once it has been determined that a person has an increased risk of developing MIIE, then a treatment, e.g., as known in the art or as described herein, can be administered.
The present methods can be used to facilitate clinical decision-making by effectively discriminating between those who are likely to develop multiple infections and those who are not. In some embodiments, patients who are found not to be hypersusceptible to MIIE (i.e., to have a lower risk of developing MIIE) can continue to receive the currently established standard of care. In some embodiments, the methods include treating subjects identified as hypersusceptible to or at risk of developing MIIE, e.g., by administering a broad-spectrum antibiotic; increasing the frequency or length of monitoring of the subject for infection; implementing prophylactic measures; and/or enhancing patient nutrition. Those who are identified to be at high risk could benefit from increased surveillance and additional preventative measures taken early. Some antibiotics include vancomycin; linezolid; daptomycin; telavancin; doxycycline; minocycline; aminoglycosides; ampicillin; amoxicillin/clavulanic acid (Augmentin); azithromycin; carbapenems (e.g., imipenem, doripenem); piperacillin/tazobactam; quinolones (e.g., ciprofloxacin); tetracycline-class drugs; tigecycline; chloramphenicol; ticarcillin; colistin and tigecycline combination therapy; and trimethoprim/sulfamethoxazole (Bactrim); see, e.g., Pakyz, “Broad-spectrum antibiotics,” Infectious Disease Advisor, 2017 (available at infectiousdiseaseadvisor.com/home/decision-support-in-medicine/hospital-infection-control/broad-spectrum-antibiotics/). Additional interventions for the high risk group can include increased surveillance for early mobilization and removal of lines/tubes, coating IV lines and urine catheters with antimicrobials and/or antibiotics, immunomodulatory nutrition therapies (Aghaeepour et al., 2017; Lorenz et al., 2015), and microbiome alterations (Harris et al., 2017; Tosh and McDonald, 2012).
Such additional measures would incur unnecessary costs if implemented in all trauma patients, but could be cost-effective when used in this targeted set of patients. Efforts aimed at increased prevention have the potential to contribute to alleviating the current antibiotic resistance crisis, toxicity of antibiotics, and the imposed burden on healthcare costs. Accurate outcome prediction and risk stratification methodologies, such as the one described here, could be valuable amid crisis situations that result in severe hospital overload with critically ill patients and scarcity of medical resources. The ability to identify patients at low risk for specific morbidity and mortality can be used as an aid to informed prioritization of resource allocation to patients with better potential for recovery and survival (Massachusetts, 2020; Medicine, 2013; Medicine, 2020).
The invention is further described in the following examples, which do not limit the scope of the invention described in the claims.
Materials and Methods
The following materials and methods were used in the Examples herein.
Study design/patients
We performed a secondary analysis of patient clinical and genomic data from the Glue Grant, a prospective, longitudinal study that enrolled patients at seven US Level 1 trauma centers between 2003 and 2009. Permission for the use of de-identified data was obtained from the Massachusetts General Hospital Institutional Review Board. Among the 2,002 patients in the dataset, inclusion criteria identified 128 adult (age ≥16 years) patients who suffered blunt trauma (excluding penetrating injury only, or blunt with penetrating injury), had follow-up time of at least 10 days since blood collection, and had early (≤48 hours since trauma injury) blood microarray transcriptome data of high RNA quality (RNA quality ≥3 out of 4) and chips that were not outliers. At least three days from time of blood draw to first recorded infection (
Major Clinical Procedures Categories:
For assigning categories of major clinical procedures, sub-categories were combined as follows: “Laparotomy” includes entries of laparotomy NOS, and laparotomy with other procedure, with splenectomy, with repair/packing of the liver, with the repair of gastric, duodenal or small bowel perforation, with the repair of the large bowel or rectal perforation, with nephrectomy, with repair of major vascular injury, with drainage of intra-abdominal abscess, and second-look laparotomy or abdominal washout. “Orthopedic” includes soft tissue debridement/amputations, internal fixation of femur, open skeletal fixation exclusive of femur, percutaneous skeletal fixation, and spine fixation. “Vascular” includes peripheral vascular and angiographic embolization. “Craniotomy” and “Thoracotomy” were used as recorded in the original dataset.
Surgical Site Infections Categories:
For surgical site infection categories, “Superficial incisional infection” includes recordings of the shoulder girdle, lumbar spine (bony), lower extremity, and abdomen and pelvis (non-bony) indicated as superficial incisional, and “Deep incisional” combines upper extremity, lower extremity, pelvis (bony), and abdomen and pelvis (non-bony) recorded as deep incisional, as well as chest (pleural space), abdomen and pelvis (non-bony), upper extremity, and lower extremity recorded as organ/space.
Outcome Definition:
Patients were assigned to MIIE-cases (≥2 cumulative infection episodes) and non-cases (≤1 infection episode), following the same method as in our previous study (Yan et al., 2015), in which a decision tree considering the timing and type of infection and the isolated pathogen was used to tabulate total independent infection episodes (
Quantification and Statistical Analysis
Software Used
R version 3.4.4 was used for the statistical analyses, as described below, with the following packages and versions: GCRMA 2.50.0 (Wu and Irizarry, 2017), arrayQualityMetrics 3.34.0 (Kauffmann et al., 2009), EMA 1.4.5 (Servant et al., 2010), LIMMA 3.34.9 (Ritchie et al., 2015), Glmnet 2.0-16 (Friedman et al., 2010), pROC 1.13.0 (Robin et al., 2011), Caret 6.0-81 (Kuhn, 2008), epiR 0.9-99 (Stevenson et al., 2019).
Statistical Analyses
Baseline characteristics are reported as medians with interquartile range, means with standard deviation, or total numbers with proportions in percentages, as indicated in the legend (Table 1A). Medians between two groups (non-cases versus MIIE-cases) were compared using the Mann-Whitney U test, and means were analyzed by the unpaired t-test assuming equal variance. For comparing proportions, the Chi-square test was used for expected values 5 or greater, or Fisher's exact test was used for an expected value less than 5.
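By way of a non-limiting illustration, the rule for choosing between the Chi-square test and Fisher's exact test can be sketched by computing expected cell counts for a contingency table. The function names and the example counts are hypothetical, and the sketch only selects the test; it does not compute a p-value.

```python
def expected_counts(table):
    """Expected cell counts under independence for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    return [[r * c / total for c in col_totals] for r in row_totals]


def choose_proportion_test(table, min_expected=5):
    """Pick the test for comparing proportions between two groups."""
    smallest = min(min(row) for row in expected_counts(table))
    # Fisher's exact test when any expected count falls below the minimum.
    return "fisher_exact" if smallest < min_expected else "chi_square"


# Hypothetical tables: rows are groups, columns are event / no event.
print(choose_proportion_test([[2, 48], [1, 29]]))
print(choose_proportion_test([[40, 45], [25, 18]]))
```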
For processing microarray data files, first, the GCRMA package was employed to obtain normalized log2 expression values of probe sets. Then chips that were flagged as outliers by the arrayQualityMetrics package were excluded. Subsequently, internal control probe sets were removed, and the EMA package was used to filter out low abundance probe sets (below the threshold of 3.5 log2 expression value across all samples). These filtering steps reduced the number of probe sets from 54,675 to 25,567 for subsequent analyses. The LIMMA package was used to identify probe sets with at least 1.5-fold difference between the non-cases and MIIE-cases; this reduced the number of probe sets to 137. Functional annotation analyses were conducted using the Database for Annotation, Visualization, and Integrated Discovery (DAVID), version 6.8 (Huang da et al., 2009), using the 1.5-fold changed 137 probe sets as the target and the total 25,567 probe sets as the background set. The fold enrichment and the unadjusted and FDR-adjusted p-values were reported.
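By way of a non-limiting illustration, the two filtering steps described above can be sketched as follows. This is a simplified stand-in, not the EMA or LIMMA implementation: it assumes that a probe set is low abundance when it is below the threshold in every sample, and it takes the fold change as the difference in mean log2 expression between groups. The function names and example values are hypothetical.

```python
import math
from statistics import mean


def filter_low_abundance(expr, threshold=3.5):
    """Drop probe sets whose log2 expression is below threshold in all samples."""
    return {probe: v for probe, v in expr.items() if max(v) >= threshold}


def fold_change_hits(expr, is_case, min_fold=1.5):
    """Probe sets whose group means differ by at least min_fold (either direction)."""
    cutoff = math.log2(min_fold)  # 1.5-fold is about 0.585 on the log2 scale
    hits = {}
    for probe, values in expr.items():
        cases = [v for v, c in zip(values, is_case) if c]
        controls = [v for v, c in zip(values, is_case) if not c]
        log2_fc = mean(cases) - mean(controls)
        if abs(log2_fc) >= cutoff:
            hits[probe] = log2_fc
    return hits


# Hypothetical log2 expression values: two non-cases, then two cases.
expr = {
    "probe_a": [2.0, 3.0, 2.5, 3.1],
    "probe_b": [6.0, 6.2, 7.0, 7.2],
    "probe_c": [5.0, 5.1, 5.0, 5.2],
}
is_case = [False, False, True, True]
print(fold_change_hits(filter_low_abundance(expr), is_case))
```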
The Glmnet package was used to implement the least absolute shrinkage and selection operator (LASSO) regression and Elastic Net regression, to select probe sets that were predictive of MIIE, as previously conducted (Yan et al., 2015). The penalty weight, lambda (λ), was identified from the 10-fold cross-validation (CV) error, repeated 100 times. Probe sets were selected according to the value of λ that yielded minimum average binomial deviance plus one standard error on the test set (λ1se), rather than the minimum (λmin), to limit overfitting. For LASSO, the λ1se was found to be 0.068, and for the Elastic Net, it was found to be 0.108 (
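By way of a non-limiting illustration, the one-standard-error rule for choosing the penalty weight can be sketched as follows: the selected value is the largest lambda whose cross-validated error is within one standard error of the minimum. The function name and the example error curve are hypothetical, and the inputs are assumed to be errors already averaged over the cross-validation folds and repeats.

```python
def select_lambda(lambdas, cv_mean, cv_se):
    """Return (lambda_min, lambda_1se) from a cross-validation error curve."""
    # Position of the smallest mean cross-validated error.
    i_min = min(range(len(lambdas)), key=lambda i: cv_mean[i])
    limit = cv_mean[i_min] + cv_se[i_min]
    # Largest penalty (sparsest model) still within one SE of the minimum.
    within = [lam for lam, err in zip(lambdas, cv_mean) if err <= limit]
    return lambdas[i_min], max(within)


# Hypothetical binomial deviance curve over four candidate penalties.
lambdas = [0.2, 0.1, 0.05, 0.01]
cv_mean = [1.10, 0.95, 0.90, 0.93]
cv_se = [0.05, 0.06, 0.06, 0.07]
print(select_lambda(lambdas, cv_mean, cv_se))
```

Because a larger penalty shrinks more coefficients to zero, the one-standard-error choice yields a smaller panel than the minimum-error choice.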
The following logistic regression models were constructed for the binary MIIE outcome: (a) multivariate model with the 15 probe set LASSO predictors, (b) multivariate model with the 26 probe set Elastic Net predictors, (c) univariate models of various clinical scores (APACHEII, ISS, NISS), and (d) multivariate models with a combination of the 15 or 26 probe sets with each of the clinical scores. The predictors were identified with LASSO or Elastic Net. Then, a logistic regression model with those 15 (LASSO) or 26 (Elastic Net) predictors was fit by maximum likelihood. The area under the receiver operating characteristic curve (AUROC) and its bootstrap 95% confidence interval were estimated with the pROC package and used to compare the models. The comparison was repeated for 10-fold cross-validation sampling using the Caret package. Sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV), with confidence intervals, were calculated using the epiR package, using the optimal probability cut-off determined as the top-left corner of the ROC curve for each model.
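By way of a non-limiting illustration, the AUROC, the top-left cut-off, and the resulting sensitivity, specificity, PPV, and NPV can be sketched as follows. This is a simplified stand-in for the pROC and epiR calculations and omits the bootstrap confidence intervals; the function names and example data are hypothetical, and both outcome classes are assumed to be present.

```python
import math


def auroc(scores, labels):
    """Probability that a case scores higher than a non-case (ties count half)."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def metrics(scores, labels, cutoff):
    """Diagnostic metrics when scores at or above cutoff are called positive."""
    tp = sum(1 for s, y in zip(scores, labels) if y and s >= cutoff)
    fn = sum(1 for s, y in zip(scores, labels) if y and s < cutoff)
    fp = sum(1 for s, y in zip(scores, labels) if not y and s >= cutoff)
    tn = sum(1 for s, y in zip(scores, labels) if not y and s < cutoff)
    ratio = lambda a, b: a / b if b else float("nan")
    return {"sensitivity": ratio(tp, tp + fn), "specificity": ratio(tn, tn + fp),
            "ppv": ratio(tp, tp + fp), "npv": ratio(tn, tn + fn)}


def top_left_cutoff(scores, labels):
    """Cutoff whose ROC point is closest to (specificity=1, sensitivity=1)."""
    def distance(c):
        m = metrics(scores, labels, c)
        return math.hypot(1 - m["sensitivity"], 1 - m["specificity"])
    return min(sorted(set(scores)), key=distance)


# Hypothetical predicted probabilities and outcomes (1 = MIIE-case).
scores = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
labels = [0, 0, 1, 0, 0, 1, 1, 1]
best = top_left_cutoff(scores, labels)
print(auroc(scores, labels), best, metrics(scores, labels, best))
```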
For the volcano plot, the log2 fold change comparing MIIE-cases and non-cases was plotted on the x-axis and p-values on the y-axis for all 25,567 probe sets remaining after the filtering step. Data points of probe sets with at least 1.5-fold expression level difference (137 probe sets) are marked with black dots. Among these 137 probe sets, those corresponding to the 15 probe set panel selected with LASSO are marked with orange squares, and additional probe sets selected by Elastic Net for the 26 probe set panel are marked with blue triangles. Ingenuity Pathway Analysis (IPA; QIAGEN) was used to generate an interaction network of the 26 probe sets selected by Elastic Net (Kramer et al., 2014).
External Validation Dataset and Statistical Analyses
Blood transcriptome data for external validation were obtained from Cabrera et al., who conducted a secondary analysis of the Activation of Coagulation and Inflammation in Trauma (ACIT2) cohort (Cabrera et al., 2017). The ACIT2 study enrolled adult trauma patients at the Royal London Hospital, and the transcriptome study was conducted among patients enrolled between 2008 and 2012. The datasets with normalized Illumina microarray log2 expression levels, accompanying clinical information, and infection outcomes were downloaded from: github.com/C4TS/HyperacutePhase. For our analysis, we included 28 unique critically injured trauma patients with baseline ISS and infection outcome information available, after removing those with missing information. The Cabrera et al. dataset does not have the resolution of the infection outcome that our Glue Grant dataset does; therefore, we classified infection status as “yes” or “no,” as indicated in their dataset. This subpopulation's ISS range was 25-51 (median 31), and the age range was 17-69 (median 37.5), which were comparable to the Glue Grant population. Among them, 26 had blood transcriptome data available for the 24 hour time point, and 25 had data available for the 72 hour time point. Expression values of all the genes represented in the 15 probe set panel, except SH3PXD2B, were available in the Cabrera et al. dataset (“NormData” file). The data processing steps in Cabrera et al. entailed filtering out “nonexpressed” probe sets that did not pass the detection p-value threshold of 0.05 in at least three arrays. Because SH3PXD2B was absent after this filtering, we assumed its expression value to be 0 for our analysis. We applied the 15 probe set panel model to calculate the predicted probabilities for the subjects in the Cabrera et al. dataset and constructed the ROC curve.
Baseline characteristics of the 128 blunt trauma patients included in the study (
As expected, orthopedic procedures were the most frequent surgical interventions that patients received overall (76.6%), followed by laparotomy (47.7%), vascular procedures (23.4%), thoracotomy (6.3%) and craniotomy (1.6%) (Table 1A). Apart from the proportion of patients having undergone laparotomy, which was significantly higher among MIIE-cases compared to non-cases (41.2% vs. 60.5%, p=0.04), other procedures were similar between non-cases and MIIE-cases.
There were five total patients who did not survive, and the cause of death was different for each (Table 1C). Mortality was similar between non-cases (3.5%) and MIIE-cases (4.7%, p=1.00). Among survivors, MIIE-cases had significantly longer hospital stay than non-cases (discharge at day 19 [14-26.75] vs. 35 [27-47], p<0.0001). MIIE-cases also had a higher proportion of those experiencing non-infection complications compared to non-cases (47.1% vs. 81.4%, p=0.0002). Maximum Denver and Marshall scores were significantly higher for MIIE-cases compared to non-cases (Denver score 2 [0-3] vs. 3 [2-3.5], p<0.0001; Marshall score 4.7 [3.3-6.4] vs. 6.9 [5.8-8.1], p<0.0001). Among the sub-categories that together determined the total Marshall Score, maximum scores of the following were significantly higher for MIIE-cases compared to non-cases: cardio (2.4 [1.7-3.2] vs. 2.8 [2.8-4.0], p<0.0001), respiratory (2.4 [1.7-3.2] vs. 2.8 [2.1-3.1], p<0.0001), and hepatic score (0.0 [0.0-0.8] vs. 0.7 [0.0-1.5], p<0.001). The maximum central nervous system, renal and hematologic scores were not significantly different.
(i) Patient Case Outcomes and Timing of Infection Onset
Among the 128 patients in the study, there were 85 non-cases (42 with no infection and 43 with one infection episode) and 42 MIIE-cases (i.e., ≥2 infection episodes). The median [IQR] day for detection of the first infection episode, among those who experienced at least one infection episode (i.e., excluding the 42 patients who had no infection episode, for a total of 86 patients), was 8 [5-12] days (Table 2).
(ii) Incidence of Surgical Site Infections Versus Other Nosocomial Infections
Among all 128 patients, 36 (28.1%) experienced surgical site infections, compared to 80 (62.5%) who experienced other nosocomial infections (Table 2). Comparing specific subtypes of nosocomial infections, pneumonia (39.1% overall) was highest, followed by urinary tract infection (18.8%), blood infection (17.2%), pseudomembranous colitis (3.9%), catheter-related bloodstream infection (3.9%), empyema (2.3%), and other unspecified infections (5.5%).
(iii) Microorganism Detection
When comparing the incidence of various microorganisms among non-cases with one infection episode versus MIIE-cases, a relatively higher proportion was found for MIIE-cases, specifically for the Gram positives Staphylococcus aureus (18.2% vs. 39.5%), Enterococcus species (11.4% vs. 25.6%), coagulase negative staphylococci (2.3% vs. 16.3%), and Streptococcus pneumoniae and Streptococcus viridans (2.3% vs. 4.7% for each). The incidence of the Gram positive Clostridium species was the same for non-cases and MIIE-cases (both 7.0%) (Table 2). For Gram negative bacteria, the incidences of the following microorganisms were higher for MIIE-cases compared to non-cases: Enterobacter species (11.4% vs. 37.2%), Acinetobacter species (9.1% vs. 30.2%), Pseudomonas aeruginosa (11.4% vs. 16.3%), Haemophilus influenzae (4.5% vs. 14.0%), Bacteroides species (0% vs. 9.3%), Klebsiella pneumoniae (2.3% vs. 7.0%), Neisseria (0% vs. 7.0%), Proteus (2.3% vs. 4.7%), Serratia marcescens (2.3% vs. 4.7%), and Gram negative, not otherwise specified (NOS) (2.3% vs. 11.6%). The incidence was higher among non-cases than MIIE-cases for Escherichia coli (11.4% vs. 4.7%) and Stenotrophomonas (4.6% vs. 0%). Fungi incidences were higher among MIIE-cases compared to non-cases: Candida species (9.1% vs. 11.6%) and unspecified fungi (0% vs. 2.3%).
(iv) Timing of Microorganism Detection
The median time to the first day of detection of different microorganisms ranged widely (
[Figure: time to first detection, by microorganism: Staphylococcus aureus; Enterococcus species; coagulase negative staphylococci; Clostridium species; Streptococcus pneumoniae; Streptococcus viridans; Enterobacter species; Acinetobacter species; Pseudomonas aeruginosa; Haemophilus influenzae; Escherichia coli; Bacteroides species; Klebsiella pneumoniae; Neisseria; Proteus; Serratia marcescens; Stenotrophomonas; Candida species.]
We identified 137 probe sets showing at least 1.5-fold up- or down-regulated difference in expression levels between non-cases and MIIE-cases (
To identify a multi-biomarker panel that collectively predicts the outcome of MIIE, we analyzed the 137 differentially-regulated probe sets by employing a machine learning pipeline that we developed and used successfully in our previous study among burn patients (Yan et al., 2015). In the current study, we extended our analysis pipeline by utilizing a combination of LASSO and Elastic Net regression methods. LASSO regression, which reduces redundancy in predictor selection, allows for a narrow selection of a minimal biomarker panel, which is expected to be more practical. Elastic Net regression, which includes correlated predictors, allowed for a more comprehensive discovery of additional probe sets that are potentially biologically relevant. With LASSO, 15 probe sets were selected, mostly relevant to immune functions and signaling cascades for cellular proliferation and differentiation (Table 3;
With Elastic Net, a total of 26 probe sets were selected that included the 15 probe sets from LASSO and 11 additional ones (Table 3;
We assessed the molecular network connection among the 26 probe sets that were selected by Elastic Net (
The AUROC [95% CI] of the logistic regression model for predicting MIIE outcome with the 15 probe set biomarker panel developed with LASSO was 0.90 [0.84-0.96] (
The 15 and 26 probe set panel models had sensitivity [95% CI] of 0.74 [0.59-0.86] vs. 0.79 [0.64-0.90] respectively; specificity [95% CI] of 0.94 [0.87-0.98] vs. 0.94 [0.87-0.98]; positive predictive value (PPV) of 0.86 [0.71-0.95] vs. 0.87 [0.73-0.96]; and negative predictive value (NPV) of 0.88 [0.79-0.94] vs. 0.90 [0.82-0.95] (Table 7). For the various injury severity scores (APACHEII, ISS, NISS), the sensitivity, specificity, PPV, and NPV were generally lower compared to the multi-biomarker panel models (Table 7).
Moreover, we constructed multivariate logistic regression models combining the 15, or the 26 probe set panel with each of the clinical injury severity scores. The AUROC [95% CI] of the 15 probe set panel combined with APACHEII was 0.90 [0.84-0.96], with ISS it was 0.902 [0.84-0.96], and with NISS it was 0.902 [0.84-0.96] (
We performed external validation of our biomarker panel using a severe blunt trauma patient cohort from a previous publication by Cabrera et al. (Cabrera et al., 2017), which had post-admission blood transcriptome log2 expression values available. The microarray platform used in that study was different from the one used in the Glue Grant, and the measure of one of the genes in the 15 probe set panel was missing. Despite the differences in measurement technology and outcome resolution between the Glue Grant and the Cabrera et al. datasets, and the lack of measurement for one of the panel genes (as described in detail in the methods section), the multi-biomarker panel achieved a relatively high AUROC of 0.76 [0.57-0.96] for the 24 hour post-admission dataset and 0.81 [0.62-1.00] for 72 hours post-admission (
It is to be understood that while the invention has been described in conjunction with the detailed description thereof, the foregoing description is intended to illustrate and not limit the scope of the invention, which is defined by the scope of the appended claims. Other aspects, advantages, and modifications are within the scope of the following claims.
This application claims the benefit of U.S. Provisional Application Ser. No. 62/943,855, filed on Dec. 5, 2019; and 63/089,512, filed on Oct. 8, 2020. The entire contents of the foregoing are incorporated herein by reference.
This invention was made with Government support under Grant No. GM062119 awarded by the National Institute of General Medical Sciences of the National Institutes of Health. The Government has certain rights in the invention.
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/US2020/063583 | 12/7/2020 | WO |
Number | Date | Country | |
---|---|---|---|
63089512 | Oct 2020 | US | |
62943855 | Dec 2019 | US |