Severe COVID-19 pneumonia can be complicated by secondary bacterial or fungal infections, but their clinical distinction from isolated SARS-CoV-2 infection is challenging, especially given the more restricted use of invasive diagnostics in patients with COVID-19. We sought to comprehensively screen for secondary infections by DNA pathogens (bacterial, fungal or viral) with a non-invasive, culture-independent metagenomic approach (microbial cell-free DNA sequencing, mcfDNA-Seq), and also to examine the biologic impact of circulating mcfDNA on the host response in COVID-19.
Variability in host inflammatory response has emerged as a key predictor of outcome in critically ill patients. Elevated biomarkers of host innate immunity and inflammation upon admission to the Intensive Care Unit (ICU) have been consistently associated with worse outcomes in patients with severe pneumonia and acute respiratory distress syndrome (ARDS). Little is known about the specific stimuli and triggers of this inflammatory response, but recent research implicates variation in the lung microbiome in patients with acute respiratory failure. Low community diversity and high abundance of pathogenic bacteria in the respiratory tract possibly correlate with elevated inflammatory biomarkers and worse clinical outcomes. It is unclear whether this early systemic inflammatory response reflects local interactions between microbes and immune cells in the alveolar space or systemic activation of innate immunity from circulating pathogen-associated molecular patterns (PAMPs) that leak from the injured alveolar epithelium. Such distinction is important for understanding severe pneumonia pathogenesis and clarifying causal mechanisms for circulating PAMPs.
The advent of ultra-sensitive, plasma metagenomic sequencing for circulating microbial cell-free DNA (mcfDNA) offers the opportunity to study the impact of a PAMP (mcfDNA) on systemic host-responses in pneumonia.
In one aspect, a method of detecting a secondary infection in a subject with a first infection is provided, comprising: (a) preparing a plasma sample from blood obtained from the subject with the first infection, wherein the plasma sample comprises microbial cell-free nucleic acids (mcfNA) from at least two different microbes; (b) producing a sequencing library comprising mcfNA attached to adapters; (c) measuring an amount of total mcfNA in the plasma sample by performing next generation sequencing on the sequencing library comprising the mcfNA attached to adapters, wherein the total mcfNA comprises mcfNA from at least two different microbes; (d) comparing the amount of total mcfNA comprising mcfNA from at least two different microbes to a threshold amount of total mcfNA; and (e) detecting a secondary infection that is different from the first infection when the amount of total mcfNA comprising mcfNA from at least two different microbes exceeds the threshold amount of total mcfNA.
In another aspect, a method of detecting a secondary infection in a subject with a first infection is provided, comprising: (a) preparing a plasma sample from blood obtained from the subject with the first infection, wherein the plasma sample comprises microbial cell-free nucleic acids (mcfNA) from at least two different microbes; (b) measuring an amount of total mcfNA in the plasma sample by performing next generation sequencing, wherein the total mcfNA comprises mcfNA from at least two different microbes; (c) comparing the amount of total mcfNA comprising mcfNA from at least two different microbes to a threshold amount of total mcfNA; and (d) detecting a secondary infection that is different from the first infection when the amount of total mcfNA comprising mcfNA from at least two different microbes exceeds the threshold amount of total mcfNA.
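The comparing and detecting steps recited above can be illustrated with a short sketch. It assumes that per-microbe concentrations have already been derived from sequencing and are expressed in MPM; the function name, the dictionary layout, and the example values are hypothetical, and the 600 MPM threshold is only one of the values contemplated in this disclosure.

```python
from typing import Mapping


def detect_secondary_infection(mpm_by_microbe: Mapping[str, float],
                               threshold_mpm: float) -> bool:
    """Return True when total mcfNA from >= 2 microbes exceeds the threshold.

    mpm_by_microbe maps a microbe name to its measured concentration (MPM).
    The layout and names are illustrative assumptions, not a defined format.
    """
    # Keep only microbes that were actually detected (non-zero signal).
    detected = {name: mpm for name, mpm in mpm_by_microbe.items() if mpm > 0}
    if len(detected) < 2:
        # The recited total comprises mcfNA from at least two different microbes.
        return False
    total_mpm = sum(detected.values())
    return total_mpm > threshold_mpm


# Hypothetical example values, not measured data.
sample = {"S. aureus": 450.0, "P. aeruginosa": 300.0, "K. pneumoniae": 0.0}
result = detect_secondary_infection(sample, threshold_mpm=600.0)
```

In this sketch a single microbe above the threshold does not trigger detection, because the aggregate is defined over at least two different microbes; other embodiments may treat that case differently.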
In yet another aspect, a method of treating a secondary infection in a subject with a first infection is provided, the method comprising: (a) collecting a blood sample from the subject with the first infection; (b) detecting a secondary infection when an amount of total microbial cell-free nucleic acids (mcfNA) comprising mcfNA from at least two microbes in the blood sample exceeds a threshold amount of total mcfNA, wherein the amount of total mcfNA is calculated by next generation sequencing; and (c) administering a therapeutic drug to the subject with the first infection in order to treat the secondary infection. In some cases, the method further comprises (d) repeating (a), (b), and (c) until the amount of total mcfNA in the blood decreases to a value at or below the threshold amount of total mcfNA.
In yet another aspect, a method of treating a secondary infection in a subject with a first infection is provided, the method comprising: (a) collecting a blood sample from the subject with the first infection; and (b) detecting a secondary infection when an amount of total microbial cell-free nucleic acids (mcfNA) comprising mcfNA from at least two microbes in the blood sample exceeds a threshold amount of total mcfNA, wherein the amount of total mcfNA is calculated by next generation sequencing.
In any of the preceding methods, in some embodiments, the first infection is a COVID-19 infection. In any of the preceding methods, in some embodiments, the first infection is a viral lung infection. In any of the preceding methods, in some embodiments, the first infection is COVID-19 pneumonia. In any of the preceding methods, in some embodiments, the secondary infection is a bacterial or fungal infection. In any of the preceding methods, in some embodiments, the method further comprises determining a presence of at least one bacterium, fungus, or parasite in the subject. In any of the preceding methods, in some embodiments, the first and secondary infections are respiratory infections caused by different microbes. In any of the preceding methods, in some embodiments, the first and secondary infections are pneumonia caused by different microbes. In any of the preceding methods, in some embodiments, the at least two microbes are respiratory pathogens. In any of the preceding methods, in some embodiments, the at least two microbes are at least two microbes from the group consisting of S. aureus, P. aeruginosa and K. pneumoniae. In any of the preceding methods, in some embodiments, the at least two microbes are at least two microbes listed in Table 2. In any of the preceding methods, in some embodiments, the at least two microbes are at least two respiratory pathogens listed in Table 2. In any of the preceding methods, in some embodiments, the first infection is culture-positive pneumonia. In any of the preceding methods, in some embodiments, the first infection is culture-negative pneumonia. In any of the preceding methods, in some embodiments, the at least two microbes comprise Candida. In any of the preceding methods, in some embodiments, the amount of total mcfNA is an aggregated amount of each type of mcfNA in the sample. In any of the preceding methods, in some embodiments, the amount of total mcfNA is an aggregated amount of total bacterial mcfNA in the sample.
In any of the preceding methods, in some embodiments, the amount of total mcfNA is an aggregated amount of total mcfNA from respiratory pathogens in the sample. In any of the preceding methods, the threshold amount of total mcfNA is an amount of mcfNA measured in plasma of a healthy or uninfected subject. In any of the preceding methods, in some embodiments, the amount of total mcfNA is measured by metagenomic next generation sequencing. In any of the preceding methods, in some embodiments, the mcfNA is mcfDNA. In any of the preceding methods, in some embodiments, the plasma or blood sample is spiked with a known concentration of synthetic normalization controls. In any of the preceding methods, in some embodiments, the mcfNA is extracted from the plasma of the subject. In any of the preceding methods, in some embodiments, a DNA sequencing library is constructed from the extracted mcfNA, and sequence reads are produced from the sequencing library. In any of the preceding methods, in some embodiments, the measuring the amount of mcfNA in the sample comprises (a) aligning the sequence reads with a microorganism database, wherein the microorganism database comprises more than 10,000 genomic reference sequences; (b) retaining reliable reads comprising alignments with high percent identity and high query coverage; (c) assigning relative abundances to each taxon based on the number of reliable reads and their alignments; (d) computing statistical significance values for each estimate of taxon abundance; (e) using taxon abundance to determine mcfNA concentration; and/or (f) using abundance of spiked synthetic normalization controls to calculate the molecules per microliter (MPM) value of mcfNA in the sample. In any of the preceding methods, in some embodiments, the microorganism database comprises at least 100, 200, 500, 750, 1000, 2000, 5000, 9000, 10000, or 15000 genomic reference sequences.
In any of the preceding methods, in some embodiments, the method further comprises measuring levels of biomarkers of innate immunity or epithelial or endothelial injury in the plasma sample of the subject. In any of the preceding methods, in some embodiments, the biomarkers are selected from the group consisting of IL-6, IL-8, IL-10, RAGE, TNFR1, angiopoietin-2, procalcitonin, fractalkine, pentraxin-3, and ST2. In any of the preceding methods, in some embodiments, the biomarker is IL-8 or ST2. In any of the preceding methods, in some embodiments, the biomarker is procalcitonin or pentraxin-3. In any of the preceding methods, in some embodiments, the method further comprises comparing the amount of mcfNA in the patient with the biomarker levels using an algorithm to yield a test score. In any of the preceding methods, in some embodiments, the method further comprises administering a therapeutic drug to the patient based on the test score. In any of the preceding methods, in some embodiments, the therapeutic drug is optionally an antimicrobial drug, an antibiotic drug, or an antifungal drug. In any of the preceding methods, in some embodiments, the amount is measured in molecules per microliter of plasma (MPM). In any of the preceding methods, in some embodiments, the threshold amount of total mcfNA is greater than 400 MPM for all types of mcfNA in the sample. In any of the preceding methods, in some embodiments, the threshold amount of total mcfNA is greater than 600 MPM for total mcfNA in the sample when the total mcfNA is determined by aligning sequence reads to a genomic database comprising sequences from at least 100 different microbes. In any of the preceding methods, in some embodiments, the threshold amount of total mcfNA is greater than 4000 MPM for mcfNA from respiratory pathogens in the sample. 
In any of the preceding methods, the threshold amount of total mcfNA is greater than 4000 MPM when the total mcfNA is determined by aligning sequence reads to a genomic database comprising sequences from at least 100 different microbes. In any of the preceding methods, in some embodiments, the subject in (a) has received an empiric antibiotic. In any of the preceding methods, in some embodiments, the subject is not bacteremic. In any of the preceding methods, in some embodiments, the method further comprises adding synthetic nucleic acids to the plasma sample. In any of the preceding methods, in some embodiments, the method further comprises performing next generation sequencing of the synthetic nucleic acids. In any of the preceding methods, in some embodiments, the method further comprises attaching adapters to the cell-free nucleic acids in order to produce cell-free nucleic acids attached to the adapters. In any of the preceding methods, in some embodiments, the adapters are ligated to the cell-free nucleic acids. In any of the preceding methods, in some embodiments, the adapters are attached to the cell-free nucleic acids by a primer extension reaction. In any of the preceding methods, in some embodiments, the adapters comprise a sequence unique to the subject. In any of the preceding methods, in some embodiments, the method further comprises combining the cell-free nucleic acids attached to the adapters with cell-free nucleic acids obtained from a different subject. In any of the preceding methods, in some embodiments, the cell-free nucleic acids obtained from a different subject are attached to adapters that comprise a sequence unique to the different subject.
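The read-filtering and normalization steps recited earlier in this passage (retaining reliable reads, then using the spiked synthetic normalization controls to arrive at an MPM value) can be sketched as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: the `Alignment` record, the `SPIKE_IN` label, the identity and coverage cutoffs, and the simple proportional scaling model are all hypothetical choices made for the example.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, Iterable

SPIKE_IN = "synthetic_control"  # hypothetical label for normalization controls


@dataclass(frozen=True)
class Alignment:
    taxon: str               # reference taxon, or SPIKE_IN for a control read
    percent_identity: float  # 0-100
    query_coverage: float    # 0-100


def taxon_mpm(alignments: Iterable[Alignment],
              spike_in_mpm: float,
              min_identity: float = 95.0,   # assumed cutoff
              min_coverage: float = 90.0    # assumed cutoff
              ) -> Dict[str, float]:
    """Estimate MPM per taxon by scaling read counts to the spike-in count."""
    # Retain only reliable reads (high percent identity and query coverage).
    counts = Counter(
        a.taxon for a in alignments
        if a.percent_identity >= min_identity
        and a.query_coverage >= min_coverage
    )
    spike_reads = counts.pop(SPIKE_IN, 0)
    if spike_reads == 0:
        raise ValueError("no reliable spike-in reads; cannot normalize")
    # Assumed proportional model: taxon reads relative to spike-in reads,
    # multiplied by the known spike-in concentration.
    return {taxon: n / spike_reads * spike_in_mpm
            for taxon, n in counts.items()}


# Hypothetical reads: 4 control reads, 2 reliable and 1 unreliable microbial read.
example = (
    [Alignment(SPIKE_IN, 100.0, 100.0)] * 4
    + [Alignment("K. pneumoniae", 99.0, 98.0)] * 2
    + [Alignment("K. pneumoniae", 80.0, 98.0)]
)
estimates = taxon_mpm(example, spike_in_mpm=1000.0)
```

The sketch omits the statistical significance computation for each taxon estimate, which would sit between counting and reporting.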
In yet another aspect, a method of detecting an inflammatory response in a patient is provided, comprising: (a) preparing a plasma sample from blood obtained from the patient, wherein the plasma sample comprises microbial cell-free nucleic acids (mcfNA); (b) producing a sequencing library comprising mcfNA attached to adapters; (c) measuring an amount of total mcfNA in the plasma sample, wherein the total mcfNA comprises mcfNA from at least two different microbes; (d) comparing the amount of the total mcfNA to a threshold amount of mcfNA; and (e) detecting an inflammatory response when the amount of total mcfNA exceeds the threshold amount of total mcfNA.
In yet another aspect, a method of detecting an inflammatory response in a patient is provided, comprising: (a) preparing a plasma sample from blood obtained from the patient, wherein the plasma sample comprises microbial cell-free nucleic acids (mcfNA); (b) measuring an amount of total mcfNA in the plasma sample, wherein the total mcfNA comprises mcfNA from at least two different microbes; (c) comparing the amount of the total mcfNA to a threshold amount of mcfNA; and (d) detecting an inflammatory response when the amount of total mcfNA exceeds the threshold amount of total mcfNA.
In yet another aspect, a method of treating an inflammatory response in a patient is provided, comprising: (a) collecting a blood sample from the patient; (b) detecting an inflammatory response in the patient when an amount of total mcfNA in the blood sample comprises mcfNA from at least two different microbes and exceeds a threshold amount of total mcfNA; and (c) administering an anti-inflammatory drug to the patient to treat the inflammatory response.
In yet another aspect, a method of treating an inflammatory response in a patient is provided, comprising: (a) collecting a blood sample from the patient; and (b) detecting an inflammatory response in the patient when an amount of total mcfNA in the blood sample comprises mcfNA from at least two different microbes and exceeds a threshold amount of total mcfNA.
In any of the preceding methods, in some embodiments, the subject has pneumonia. In any of the preceding methods, in some embodiments, the pneumonia is culture-positive pneumonia. In any of the preceding methods, in some embodiments, the pneumonia is culture-negative pneumonia. In any of the preceding methods, in some embodiments, the mcfNA is mcfDNA. In any of the preceding methods, in some embodiments, the threshold amount of mcfNA is greater than 100,000 molecules per microliter of plasma (MPM). In any of the preceding methods, in some embodiments, the threshold amount of mcfNA is greater than 100,000 molecules per microliter of plasma (MPM) for mcfNA from known respiratory pathogens. In any of the preceding methods, in some embodiments, the method further comprises measuring levels of biomarkers of innate immunity or epithelial or endothelial injury in the plasma sample of the patient. In any of the preceding methods, in some embodiments, the biomarkers are selected from the group consisting of IL-6, IL-8, IL-10, RAGE, TNFR1, angiopoietin-2, procalcitonin, fractalkine, pentraxin-3, and ST2. In any of the preceding methods, in some embodiments, the biomarker is IL-8 or ST2. In any of the preceding methods, in some embodiments, the biomarker is procalcitonin or pentraxin-3. In any of the preceding methods, in some embodiments, the method further comprises comparing the amount of mcfNA in the subject with the biomarker levels using an algorithm to yield a test score. In any of the preceding methods, in some embodiments, the method further comprises administering a therapeutic drug to the subject based on the test score. In any of the preceding methods, in some embodiments, the subject is not bacteremic. In any of the preceding methods, in some embodiments, adapters are attached to the cell-free nucleic acids by ligation.
In any of the preceding methods, in some embodiments, adapters are attached to the cell-free nucleic acids by primer extension. In any of the preceding methods, in some embodiments, the inflammatory response is a hyper-inflammatory response.
In yet another aspect, a method of detecting a bacterial infection in a patient with a COVID-19 infection is provided, comprising: (a) preparing a plasma sample from blood obtained from the patient with the COVID-19 infection, wherein the plasma sample comprises microbial cell-free nucleic acids (mcfNA); (b) producing a sequencing library comprising the mcfNA attached to adapters; (c) conducting next generation sequencing on the sequencing library to produce sequence reads corresponding to the mcfNA; (d) aligning the sequence reads to sequences from a database comprising at least 1000 bacterial reference sequences; (e) determining an amount of mcfNA from at least one bacterium based on the aligning of the sequence reads; and (f) identifying a bacterial infection in the patient based on the amount of mcfNA from the at least one bacterium.
In yet another aspect, a method of detecting a bacterial infection in a patient with a COVID-19 infection is provided, comprising: (a) preparing a plasma sample from blood obtained from the patient with the COVID-19 infection, wherein the plasma sample comprises microbial cell-free nucleic acids (mcfNA); (b) conducting next generation sequencing to produce sequence reads corresponding to the mcfNA; (c) aligning the sequence reads to sequences from a database comprising at least 1000 bacterial reference sequences; (d) determining an amount of mcfNA from at least one bacterium based on the aligning of the sequence reads; and (e) identifying a bacterial infection in the patient based on the amount of mcfNA from the at least one bacterium.
In yet another aspect, a method of diagnosing and treating a bacterial infection in a patient with a COVID-19 infection is provided, comprising: (a) collecting a blood sample from the patient with the COVID-19 infection; (b) detecting the bacterial infection when an amount of bacterial mcfNA in the blood sample exceeds a threshold amount of mcfNA; and (c) administering a therapeutic drug to the patient to treat the bacterial infection.
In yet another aspect, a method of diagnosing and treating a bacterial infection in a patient with a COVID-19 infection is provided, comprising: (a) collecting a blood sample from the patient with the COVID-19 infection; and (b) detecting the bacterial infection when an amount of bacterial mcfNA in the blood sample exceeds a threshold amount of mcfNA.
In any of the preceding methods, in some embodiments, the patient has COVID-19 pneumonia. In any of the preceding methods, in some embodiments, the bacterial infection is a respiratory infection. In any of the preceding methods, in some embodiments, the mcfNA (e.g., mcfDNA) is bacterial mcfNA from S. aureus, P. aeruginosa or K. pneumoniae. In some embodiments, the mcfNA (e.g., mcfDNA) is derived from at least one pathogen listed in Table 2. In some embodiments, the mcfNA (e.g., mcfDNA) is derived from at least one respiratory pathogen listed in Table 2. In any of the preceding methods, in some embodiments, the patient has culture-positive pneumonia. In any of the preceding methods, in some embodiments, the patient has culture-negative pneumonia. In any of the preceding methods, in some embodiments, the threshold amount of mcfNA is the amount of mcfNA measured in plasma of a healthy or uninfected subject. In any of the preceding methods, in some embodiments, the amount of mcfNA is measured by metagenomic next generation sequencing. In any of the preceding methods, in some embodiments, the mcfNA is mcfDNA. In any of the preceding methods, in some embodiments, the plasma is spiked with a known concentration of synthetic normalization controls.
In yet another aspect, a nucleic acid sequencing system for detecting secondary infection in a subject with a first infection is provided comprising: (a) a next-generation sequencing device comprising a flow cell and a computer processor that outputs data comprising sequence reads collected from measurements conducted in the flow cell; and (b) a computing device that comprises quantitation of total microbial cell-free nucleic acids (mcfNA) logic that (i) detects mcfNA from at least two different microbes by aligning the sequence reads to microbial reference sequences; (ii) calculates total mcfNA as a function of molecules per microliter of plasma, wherein the total mcfNA is an aggregate value of mcfNA from the at least two different microbes; and (iii) comprises an event generator to generate an event indicative of a secondary infection when the total mcfNA exceeds a threshold value. In some embodiments, the quantitation of total microbial cell-free nucleic acids (mcfNA) logic comprises logic that excludes sequence reads from the analysis if they align to human reference sequences. In some embodiments, the quantitation of total microbial cell-free nucleic acids (mcfNA) logic comprises logic that excludes sequence reads from the analysis if they align to a synthetic nucleic acid reference. In some embodiments, the mcfNA is microbial cell-free DNA. In some embodiments, the threshold value is at least 600 MPM. In some embodiments, the threshold value is at least 4000 MPM.
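One way the quantitation logic and event generator of such a system might be organized is sketched below. The sketch assumes each sequence read has already been aligned and summarized as a pair of (reference kind, taxon); that encoding, the `SecondaryInfectionEvent` type, and the `mpm_per_read` conversion factor (a stand-in for calibration against normalization controls) are hypothetical and are not taken from this disclosure.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional, Tuple

# Assumed read summary: (reference_kind, taxon), where reference_kind is
# "human", "synthetic", or "microbe".
Read = Tuple[str, str]


@dataclass(frozen=True)
class SecondaryInfectionEvent:
    total_mpm: float
    microbes: Tuple[str, ...]


def generate_event(reads: Iterable[Read],
                   mpm_per_read: float,
                   threshold_mpm: float) -> Optional[SecondaryInfectionEvent]:
    """Emit an event when aggregate microbial signal exceeds the threshold."""
    counts: Dict[str, int] = {}
    for kind, taxon in reads:
        if kind != "microbe":
            continue  # exclude reads aligning to human or synthetic references
        counts[taxon] = counts.get(taxon, 0) + 1
    if len(counts) < 2:
        return None  # the aggregate is defined over at least two microbes
    total_mpm = sum(counts.values()) * mpm_per_read
    if total_mpm > threshold_mpm:
        return SecondaryInfectionEvent(total_mpm, tuple(sorted(counts)))
    return None


# Hypothetical reads, not real data.
reads = (
    [("human", "")] * 10
    + [("synthetic", "ctrl")] * 5
    + [("microbe", "S. aureus")] * 3
    + [("microbe", "P. aeruginosa")] * 2
)
event = generate_event(reads, mpm_per_read=1000.0, threshold_mpm=4000.0)
```

Returning `None` rather than raising keeps the "no event" outcome distinct from an error, which suits a pipeline that processes many samples in sequence.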
In yet another aspect, a method of detecting secondary infection in a subject exhibiting pneumonia is provided, said method comprising (a) obtaining a plasma sample from said subject, (b) evaluating the amount of microbial cell-free nucleic acids in said sample; (c) comparing said amount of microbial cell free nucleic acids to a threshold level; and (d) detecting a secondary infection if said amount of microbial cell free nucleic acids exceeds said threshold level. In some embodiments, said subject has COVID-19. In some embodiments, said secondary infection is bacterial or fungal. In some embodiments, the method further comprises determining the presence and quantity of at least one bacterium, fungus or parasite in said subject.
In yet another aspect, a method of identifying a secondary infection at a site of localization in a subject with a viral infection is provided, comprising (a) obtaining a plasma sample from said subject, (b) evaluating the amount of microbial cell-free nucleic acids in said sample; (c) comparing said amount of microbial cell-free nucleic acids to a threshold level; and (d) detecting an infection at a site of localization in said subject if said amount of microbial cell-free nucleic acids exceeds said threshold level. In some embodiments, said site of localization is the lungs.
In yet another aspect, a non-invasive method of detecting a respiratory infection in a subject exhibiting a pneumonia is provided, said method comprising (a) obtaining a plasma sample from said subject, (b) evaluating the amount of microbial cell-free nucleic acids in said sample; (c) comparing said amount of microbial cell-free nucleic acids to a threshold level; and (d) detecting a respiratory infection if said amount of microbial cell-free nucleic acids exceeds said threshold level. In some embodiments, said subject has COVID-19 and is at risk for pneumonia.
In yet another aspect, a method for treating a patient suspected of having a secondary infection is provided, the method comprising: determining whether the patient will benefit from anti-microbial therapy by: determining in a sample from the patient a microbial cell-free nucleic acid level value (amount) and determining in a sample from the patient the level of a set of biomarkers, wherein the set of biomarkers comprises biomarkers of innate immunity (e.g., IL-8 and ST2) and/or bacterial infections (e.g., procalcitonin and pentraxin-3); and comparing the microbial cell-free nucleic acid level value with the biomarker levels to yield a test score. In some embodiments, the method further comprises administering a treatment regimen comprising an anti-microbial therapy to the patient based on the test score.
In yet another aspect, a method for assessing the risk or prognosis of an inflammatory response in a subject with a disease is provided, the method comprising: performing at least one immunoassay on a blood sample from the subject to generate a first dataset comprising protein level data for at least two protein markers, wherein the at least two protein markers comprise at least two markers selected from fractalkine, interleukin(IL)-6, IL-8, pentraxin-3, procalcitonin, receptor for advanced glycation end products (RAGE), suppression of tumorigenicity (ST)-2, and tumour necrosis factor receptor (TNFR)-1 to provide a multi-biomarker inflammatory activity score (MBDA); performing at least one assay on a blood sample from the subject to determine the molecules per microliter (MPM) of microbial cell-free DNA (mcfDNA); and determining the risk/prognosis of an elevated inflammatory response based on the mcfDNA MPM and MBDA score. In some embodiments, the disease is pulmonary pneumonia. In some embodiments, the subject has ventilator-associated pneumonia. In any of the preceding methods, in some embodiments, the inflammatory response is a hyper-inflammatory response.
In yet another aspect, a method of obtaining an inflammatory progression (IP) risk score for a subject with pneumonia is provided, said method comprising: obtaining or having obtained a biological sample from said subject; determining a multi-biomarker inflammatory activity score (MBDA) for said subject; determining the molecules per microliter (MPM) of microbial cell-free DNA (mcfDNA); and obtaining an IP risk score from said subject's MBDA and MPM using an interpretation function. In some embodiments, the inflammatory response is a hyper-inflammatory response.
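The form of the interpretation function is not specified here. Purely as an illustration, one possible form is a logistic function of the MBDA score and the log-transformed MPM value. Everything in the sketch below is an assumption: the function name, the logistic form, the log transform, and especially the weights and intercept, which are placeholders rather than fitted or disclosed values.

```python
import math


def ip_risk_score(mbda_score: float,
                  mcfdna_mpm: float,
                  w_mbda: float = 0.05,     # placeholder weight
                  w_mpm: float = 1.0,       # placeholder weight
                  intercept: float = -5.0   # placeholder intercept
                  ) -> float:
    """Map an MBDA score and an mcfDNA MPM value to a score between 0 and 1."""
    if mcfdna_mpm < 0:
        raise ValueError("MPM must be non-negative")
    # log10(1 + MPM) compresses the wide dynamic range of MPM values and
    # is defined when MPM is zero.
    z = intercept + w_mbda * mbda_score + w_mpm * math.log10(1.0 + mcfdna_mpm)
    # Logistic link: higher MBDA or MPM gives a higher score.
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical inputs, not patient data.
low = ip_risk_score(mbda_score=20.0, mcfdna_mpm=100.0)
high = ip_risk_score(mbda_score=80.0, mcfdna_mpm=100000.0)
```

In practice the coefficients would have to be estimated from outcome data, and a different functional form may be more appropriate.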
In yet another aspect, a method of detecting a localized respiratory infection in a subject is provided, the method comprising: obtaining or providing a plasma sample from the subject, wherein the subject is not bacteremic and the plasma sample comprises cell-free nucleic acids; performing next generation sequencing or metagenomic sequencing on cell-free nucleic acids from the plasma sample and producing sequence reads; and aligning the sequence reads with sequences of respiratory pathogens in order to detect the presence and quantity of at least one respiratory pathogen, wherein the at least one respiratory pathogen is associated with the localized respiratory infection. In some embodiments, the cell-free nucleic acids are cell-free DNA. In some embodiments, the sequence reads aligned with the sequences of respiratory pathogens correspond to microbial cell-free DNA. In some embodiments, the respiratory infection is pneumonia. In some embodiments, the respiratory infection is bacterial pneumonia. In some embodiments, the at least one respiratory pathogen is at least one bacterium associated with a respiratory infection. In some embodiments, the respiratory infection is a bacterial respiratory infection. In some embodiments, the at least one respiratory pathogen is S. aureus, P. aeruginosa or K. pneumoniae. In some embodiments, the at least one respiratory pathogen is at least one respiratory pathogen listed in Table 2. In some embodiments, the method further comprises adding synthetic nucleic acids to the plasma sample. In some embodiments, the method further comprises performing next generation sequencing on the synthetic nucleic acids. In some embodiments, the synthetic nucleic acids are normalization controls. In some embodiments, the method further comprises attaching adapters to the cell-free nucleic acids in order to produce cell-free nucleic acids attached to the adapters. In some embodiments, the adapters are ligated to the cell-free nucleic acids. 
In some embodiments, the adapters are attached to the cell-free nucleic acids by a primer extension reaction. In some embodiments, the adapters comprise a sequence unique to the subject. In some embodiments, the method further comprises combining the cell-free nucleic acids attached to the adapters with cell-free nucleic acids obtained from a different subject. In some embodiments, the cell-free nucleic acids obtained from a different subject are attached to adapters that comprise a sequence unique to the different subject. In some embodiments, the method further comprises administering a treatment (e.g., antibiotic) to the subject to treat the respiratory infection. In some embodiments, the method further comprises administering an antibiotic to treat the at least one pathogen associated with the respiratory infection. In some cases, the subject is blood culture negative. In some embodiments, the subject is blood culture positive. In some embodiments, culture of secretions from the respiratory tract is positive. In some embodiments, culture of the respiratory tract secretions is negative. In some embodiments, the subject has bacterial pneumonia and a viral pneumonia. In some cases, the viral pneumonia is caused by SARS-CoV-2 virus. In some embodiments, the bacterial pneumonia is caused by S. aureus, P. aeruginosa or K. pneumoniae. In some embodiments, the bacterial pneumonia is caused by a respiratory pathogen listed in Table 2.
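Where adapters carry a sequence unique to each subject and libraries from different subjects are combined, the pooled reads must later be assigned back to their subject of origin. A minimal sketch of that assignment follows. It assumes, for simplicity, that the subject-specific sequence appears at the start of each read and is matched exactly, and that no barcode is a prefix of another; the function name and example sequences are hypothetical, and sequencing platforms often report index sequences separately from the insert read.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Mapping


def demultiplex(reads: Iterable[str],
                barcode_to_subject: Mapping[str, str]) -> Dict[str, List[str]]:
    """Assign pooled reads to subjects by their leading barcode sequence."""
    by_subject: Dict[str, List[str]] = defaultdict(list)
    for read in reads:
        for barcode, subject in barcode_to_subject.items():
            if read.startswith(barcode):
                # Strip the barcode so only the insert sequence is kept.
                by_subject[subject].append(read[len(barcode):])
                break
        else:
            # No barcode matched; keep the read for troubleshooting.
            by_subject["unassigned"].append(read)
    return dict(by_subject)


# Hypothetical barcodes and reads.
barcodes = {"ACGT": "subject_1", "TTAG": "subject_2"}
pooled = ["ACGTGGGCCC", "TTAGAAATTT", "ACGTTTTT", "CCCCAAAA"]
assigned = demultiplex(pooled, barcodes)
```

A production workflow would typically tolerate one or more mismatches in the barcode, which this exact-match sketch does not.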
All publications, patents, and patent applications mentioned in this specification are herein incorporated by reference in their entireties to the same extent as if each individual publication, patent, or patent application was specifically and individually indicated to be incorporated by reference.
The invention will now be described in detail by way of reference only using the following definitions and examples. All patents and publications, including all sequences disclosed within such patents and publications, referred to herein are expressly incorporated by reference in their entireties.
Provided herein are methods, devices, and systems for analyzing total microbial cell-free nucleic acids, particularly total microbial cell-free DNA (“total mcfDNA”), in order to detect or predict or otherwise evaluate a secondary infection in a subject, a hyperinflammatory response in a subject, or severity of infection in a subject. In some cases, the total microbial cell-free nucleic acids (e.g., total mcfDNA) are used to detect or predict or otherwise evaluate whether a patient (e.g., a patient with COVID-19) is likely to survive. Often, the subject is culture-negative for bacterial or viral pathogens that can cause the secondary infection or hyperinflammatory response at the time a sample is collected from the patient. The samples used in this disclosure are generally plasma samples or other samples that can be obtained relatively non-invasively. In some embodiments, the subject has pneumonia. In some cases, the subject has culture-positive pneumonia. In some cases, the subject has culture-negative pneumonia. In some cases, the subject has a COVID-19 infection. In some cases, the subject has COVID-19 pneumonia or severe COVID-19. In some cases, the threshold value for total microbial cell-free nucleic acids (e.g., mcfDNA) is an aggregate value for mcfNA (e.g., mcfDNA) from at least two different microbes. In some embodiments, the threshold value for total mcfNA (e.g., total mcfDNA) is 400 molecules per microliter of plasma (MPM), 600 MPM, 1000 MPM, 5000 MPM, 10000 MPM, or 100000 MPM. In some cases, the total mcfDNA reflects the total mcfDNA that derives from bacterial microbes. In some cases, the total mcfDNA reflects the total mcfDNA that derives from respiratory pathogens. In some embodiments, the respiratory pathogen is at least one respiratory pathogen listed in Table 2, in any combination. In some embodiments, the respiratory pathogen is a Streptococcus, Pseudomonas, or Klebsiella bacterium.
In some embodiments, the respiratory pathogen is from any genus listed in Table 2. In some cases, the respiratory pathogen is from the genus Actinomyces, Aspergillus, Bacteroides, Citrobacter, Cytomegalovirus, Enterobacter, Eschericihia, Enterococcus, Streptooccus, Pseudomonas, Klebsiella, and/or Haemophilus, In some cases, the respiratory pathogen is S. aureus, P. aeruginosa and/or K. pneumoniae, in any combination.
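The aggregate-threshold comparison described above can be illustrated with a minimal Python sketch. The function names and the per-microbe values are hypothetical and chosen only for illustration; the threshold values are the ones listed in the text, and the sketch assumes per-microbe mcfDNA concentrations in MPM are already available.

```python
from typing import Dict

# Threshold values listed in the text, in molecules per microliter of plasma (MPM).
LISTED_THRESHOLDS_MPM = (400, 600, 1000, 5000, 10000, 100000)


def total_mcfdna_mpm(per_microbe_mpm: Dict[str, float]) -> float:
    """Sum mcfDNA across microbes; the aggregate is taken over at least two microbes."""
    if len(per_microbe_mpm) < 2:
        raise ValueError("total mcfDNA is aggregated here over at least two different microbes")
    return sum(per_microbe_mpm.values())


def exceeds_threshold(per_microbe_mpm: Dict[str, float], threshold_mpm: float) -> bool:
    """Return True when the aggregate mcfDNA exceeds the chosen threshold."""
    return total_mcfdna_mpm(per_microbe_mpm) > threshold_mpm


# Hypothetical per-microbe concentrations, for illustration only.
sample = {"Klebsiella pneumoniae": 350.0, "Pseudomonas aeruginosa": 120.0}

print(total_mcfdna_mpm(sample))        # 470.0
print(exceeds_threshold(sample, 400))  # True
print(exceeds_threshold(sample, 600))  # False
```

Note that neither microbe alone exceeds 400 MPM in this hypothetical sample; only the aggregate does, which is the point of using a total value.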
In some cases, the method comprises detecting a secondary infection in a patient with COVID-19, wherein the method comprises detecting at least one microbe associated with the secondary infection by performing next generation sequencing (e.g., metagenomic next generation sequencing) on microbial cell-free nucleic acids (e.g., microbial cell-free DNA (mcfDNA)) obtained from a sample (e.g., plasma) obtained from the subject. In some cases, the secondary infection is a bacterial infection and the COVID-19 patient is culture negative for the bacterial infection. In some cases, the secondary infection is a bacterial infection that is caused by a respiratory microbe (e.g., a bacterium that causes a respiratory infection or pneumonia). In some cases, the secondary infection is a bacterial pneumonia infection.
The methods provided herein have multiple uses and advantages. For example, the methods provide a reliable means of detecting a secondary infection in a patient, particularly when the secondary infection is not detectable by culture. The methods can also help identify the causative agents of a secondary pneumonia in patients with COVID-19 pneumonia, particularly when clinical distinction between the secondary pneumonia and COVID-19 pneumonia is challenging, or even not possible. The methods provide the further advantage of detecting pathogens associated with secondary pneumonia even when the patient has been administered an antibiotic, which can, in some cases, limit the sensitivity of microbiologic studies. The non-invasive nature of the methods provided herein also has the advantage of avoiding subjecting a patient to the discomfort and risks associated with bronchoscopy, as well as limiting exposure of healthcare personnel to SARS-CoV-2 that is potentially aerosolized during a bronchoscopy procedure.
The headings provided herein are not limitations of the various aspects or embodiments of the invention which can be had by reference to the specification. Accordingly, the terms defined immediately below are more fully defined by reference to the specification.
All definitions described herein, whether specifically mentioned or not, should be construed to refer to definitions as used throughout the specification and attached claims.
In the present disclosure, wherever aspects are described herein with the language “comprising,” otherwise analogous aspects described in terms of “consisting of” and/or “consisting essentially of” are also provided.
Numeric ranges are inclusive of the numbers defining the range. The term “about” as used herein generally means plus or minus ten percent (10%) of a value, inclusive of the value, unless otherwise indicated by the context of the usage. For example, “about 100” refers to any number from 90 to 110, inclusive of 100.
Whenever the term “at least,” “greater than,” or “greater than or equal to” precedes the first numerical value in a series of two or more numerical values, the term “at least,” “greater than” or “greater than or equal to” applies to each of the numerical values in that series of numerical values. For example, greater than or equal to 1, 2, or 3 is equivalent to greater than or equal to 1, greater than or equal to 2, or greater than or equal to 3.
Whenever the term “no more than,” “less than,” “at most,” or “less than or equal to” precedes the first numerical value in a series of two or more numerical values, the term “no more than,” “at most,” “less than,” or “less than or equal to” applies to each of the numerical values in that series of numerical values. For example, less than or equal to 3, 2, or 1 is equivalent to less than or equal to 3, less than or equal to 2, or less than or equal to 1.
Unless otherwise indicated, nucleic acids are written left to right in 5′ to 3′ orientation; amino acid sequences are written left to right in amino to carboxy orientation, respectively.
The term “attach” and its grammatical equivalents may refer to connecting two molecules using any mode of attachment. For example, attaching may refer to connecting two molecules by chemical bonds or other method to generate a new molecule. Attaching an adapter to a nucleic acid may refer to forming a chemical bond between the adapter and the nucleic acid. In some cases, attaching is performed by ligation, e.g., using a ligase. For example, a nucleic acid adapter may be attached to a target nucleic acid by ligation, via forming a phosphodiester bond catalyzed by a ligase. In some embodiments, the attachment comprises attaching via performing a primer extension reaction, wherein the sequence to be attached is present in the primer.
As used herein, the term “or” is used to refer to a nonexclusive or, such as “A or B” includes “A but not B,” “B but not A,” and “A and B,” unless otherwise indicated.
As used herein, “a”, “an”, and “the” can include plural referents unless otherwise limited expressly or by context.
“Interpretation function,” as used herein, means the transformation of a set of observed data into a meaningful determination of particular interest; e.g., an interpretation function may be a predictive model that is created by utilizing one or more statistical algorithms to transform a dataset of observed biomarker data and/or MPM into a meaningful determination of disease activity or the disease state of a subject.
By a “multi-biomarker disease activity score”, “multi-biomarker disease activity index score”, “MBDA score” or simply “MBDA” is intended a score that provides a semi-quantitative measure of inflammatory disease activity or the state of inflammatory disease in a subject. The interpretation function, in some embodiments, can be created from predictive or multivariate modeling based on statistical algorithms. In some embodiments, input to the interpretation function can comprise the results of testing one or more, two or more, three or more, four or more, five or more, six or more, seven or more, eight or more, nine or more, ten or more, 11 or more, 15 or more, 20 or more, 50 or more, or 100 or more biomarkers alone or in combination with microbial cell-free DNA measurements, also described herein. In some embodiments, the MBDA score is an indirect measure of inflammatory disease activity. In some embodiments, the MBDA score is a quantitative measure of inflammatory disease activity.
In some embodiments, the interpretation function is based on a predictive model. Established statistical algorithms and methods, useful as models or useful in designing predictive models, can include but are not limited to: analysis of variance (ANOVA); Bayesian networks; boosting and Ada-boosting; bootstrap aggregating (or bagging) algorithms; decision tree classification techniques, such as Classification and Regression Trees (CART), boosted CART, Random Forest (RF), Recursive Partitioning Trees (RPART), and others; Curds and Whey (CW); Curds and Whey-Lasso; dimension reduction methods, such as principal component analysis (PCA) and factor rotation or factor analysis; discriminant analysis, including Linear Discriminant Analysis (LDA), Eigengene Linear Discriminant Analysis (ELDA), and quadratic discriminant analysis; Discriminant Function Analysis (DFA); genetic algorithms; Hidden Markov Models; kernel based machine algorithms such as kernel density estimation, kernel partial least squares algorithms, kernel matching pursuit algorithms, kernel Fisher's discriminant analysis algorithms, and kernel principal components analysis algorithms; linear regression and generalized linear models, including or utilizing Forward Linear Stepwise Regression, Lasso (or LASSO) shrinkage and selection method, and Elastic Net regularization and selection method; glmnet (Lasso and Elastic Net-regularized generalized linear model); Logistic Regression (LogReg); meta-learner algorithms; nearest neighbor methods for classification or regression, e.g., Kth-nearest neighbor (KNN); non-linear regression or classification algorithms; neural networks; partial least square; rules based classifiers; shrunken centroids (SC); sliced inverse regression; stepwise model selection by Akaike Information Criterion (StepAIC); super principal component (SPC) regression; and Support Vector Machines (SVM) and Recursive Support Vector Machines (RSVM), among others. Additionally, clustering algorithms as are known in the art can be useful in determining subject sub-groups.
Logistic Regression is the traditional predictive modeling method of choice for dichotomous response variables; e.g., treatment 1 versus treatment 2. It can be used to model both linear and non-linear aspects of the data variables and provides easily interpretable odds ratios.
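As an illustration of logistic regression on a dichotomous response, the following Python sketch fits a one-predictor model by batch gradient descent and reports the odds ratio as the exponential of the fitted slope. The data, learning rate, step count, and function names are assumptions made for this example and are not taken from the disclosure.

```python
import math
from typing import List, Tuple


def sigmoid(z: float) -> float:
    """Logistic function mapping a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs: List[float], ys: List[int],
                 lr: float = 0.1, steps: int = 5000) -> Tuple[float, float]:
    """Fit intercept b0 and slope b1 by batch gradient descent on the log-loss."""
    b0, b1 = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        g0 = g1 = 0.0
        for x, y in zip(xs, ys):
            err = sigmoid(b0 + b1 * x) - y  # gradient of the log-loss w.r.t. the linear predictor
            g0 += err
            g1 += err * x
        b0 -= lr * g0 / n
        b1 -= lr * g1 / n
    return b0, b1


# Hypothetical predictor values and dichotomous outcomes (0 = class 1, 1 = class 2).
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
ys = [0, 0, 1, 0, 1, 0, 1, 1]

b0, b1 = fit_logistic(xs, ys)
odds_ratio = math.exp(b1)  # multiplicative change in odds per unit increase in x
print(f"slope={b1:.3f}, odds ratio per unit={odds_ratio:.3f}")
print(f"P(class 2 | x=8)={sigmoid(b0 + b1 * 8.0):.3f}")
```

In practice a statistics package would normally be used for fitting; the hand-written loop here only keeps the example free of dependencies.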
Discriminant Function Analysis (DFA) uses a set of analytes as variables (roots) to discriminate between two or more naturally occurring groups. DFA is used to test analytes that are significantly different between groups. A forward stepwise DFA can be used to select a set of analytes that maximally discriminate among the groups studied. Specifically, at each step all variables can be reviewed to determine which will maximally discriminate among groups. This information is then included in a discriminative function, denoted a root, which is an equation consisting of linear combinations of analyte concentrations for the prediction of group membership. The discriminatory potential of the final equation can be observed as a line plot of the root values obtained for each group. This approach identifies groups of analytes whose changes in concentration levels can be used to delineate profiles, diagnose and assess therapeutic efficacy. The DFA model can also create an arbitrary score by which new subjects can be classified as either “healthy” or “diseased.” To facilitate the use of this score for the medical community the score can be rescaled so a value of 0 indicates a healthy individual and scores greater than 0 indicate increasing risk.
Classification and regression trees (CART) perform logical splits (if/then) of data to create a decision tree. All observations that fall in each node are classified according to the most common outcome in that node. CART results are easily interpretable—one follows a series of if/then tree branches until a classification results.
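A single if/then split of the kind a CART procedure builds can be sketched in Python as follows. This is a one-level tree (a "stump") on one variable, and it picks the split by counting misclassifications rather than by an impurity measure such as the Gini index that CART implementations typically use; the data and function names are hypothetical.

```python
from collections import Counter
from typing import List, Tuple


def majority(labels: List[str]) -> str:
    """Most common outcome in a node."""
    return Counter(labels).most_common(1)[0][0]


def best_split(xs: List[float], ys: List[str]) -> Tuple[float, str, str]:
    """Try midpoints between sorted unique values; keep the split with the fewest errors."""
    values = sorted(set(xs))
    best = None
    for lo, hi in zip(values, values[1:]):
        threshold = (lo + hi) / 2
        left = [y for x, y in zip(xs, ys) if x <= threshold]
        right = [y for x, y in zip(xs, ys) if x > threshold]
        left_label, right_label = majority(left), majority(right)
        errors = sum(y != left_label for y in left) + sum(y != right_label for y in right)
        if best is None or errors < best[0]:
            best = (errors, threshold, left_label, right_label)
    if best is None:
        raise ValueError("need at least two distinct predictor values to split")
    return best[1], best[2], best[3]


def classify(x: float, threshold: float, left_label: str, right_label: str) -> str:
    """Follow the if/then branch: each node predicts its most common outcome."""
    return left_label if x <= threshold else right_label


# Hypothetical training data.
xs = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
ys = ["A", "A", "A", "B", "B", "B"]

threshold, left_label, right_label = best_split(xs, ys)
print(threshold, left_label, right_label)                   # 6.5 A B
print(classify(4.0, threshold, left_label, right_label))    # A
```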
Support vector machines (SVM) classify objects into two or more classes. Examples of classes include sets of treatment alternatives, sets of diagnostic alternatives, or sets of prognostic alternatives. Each object is assigned to a class based on its similarity to (or distance from) objects in the training data set in which the correct class assignment of each object is known. The measure of similarity of a new object to the known objects is determined using support vectors, which define a region in a potentially high dimensional space (>R^6).
The process of bootstrap aggregating, or “bagging,” is computationally simple. In the first step, a given dataset is randomly resampled a specified number of times (e.g., thousands), effectively providing that number of new datasets, which are referred to as “bootstrapped resamples” of data, each of which can then be used to build a model. Then, in the example of classification models, the class of every new observation is predicted by each of the classification models created in the first step. The final class decision is based upon a “majority vote” of the classification models; i.e., a final classification call is determined by counting the number of times a new observation is classified into a given group and taking the majority classification (33%+ for a three-class system). In the example of logistic regression models, if a logistic regression is bagged 1000 times, there will be 1000 logistic models, and each will provide the probability of a sample belonging to class 1 or 2.
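The two bagging steps described above (resample, then vote) can be sketched in Python. The base model here is a deliberately simple nearest-class-mean classifier standing in for whatever classification model is actually used; the data, the number of resamples, the seed, and all function names are assumptions for illustration.

```python
import random
from collections import Counter
from typing import Dict, List, Tuple

Observation = Tuple[float, str]  # (predictor value, class label)


def bootstrap_resample(data: List[Observation], rng: random.Random) -> List[Observation]:
    """Draw len(data) observations with replacement."""
    return [rng.choice(data) for _ in data]


def train_nearest_mean(sample: List[Observation]) -> Dict[str, float]:
    """Toy base model: the mean predictor value of each class present in the resample."""
    sums: Dict[str, float] = {}
    counts: Counter = Counter()
    for x, label in sample:
        sums[label] = sums.get(label, 0.0) + x
        counts[label] += 1
    return {label: sums[label] / counts[label] for label in sums}


def predict(model: Dict[str, float], x: float) -> str:
    """Assign the class whose mean is closest to x."""
    return min(model, key=lambda label: abs(model[label] - x))


def bagged_predict(data: List[Observation], x: float,
                   n_models: int = 200, seed: int = 0) -> Tuple[str, Counter]:
    """Build one model per bootstrapped resample and take the majority vote."""
    rng = random.Random(seed)
    votes: Counter = Counter()
    for _ in range(n_models):
        model = train_nearest_mean(bootstrap_resample(data, rng))
        votes[predict(model, x)] += 1
    return votes.most_common(1)[0][0], votes


# Hypothetical two-class training data.
data = [(1.0, "class 1"), (2.0, "class 1"), (3.0, "class 1"),
        (8.0, "class 2"), (9.0, "class 2"), (10.0, "class 2")]

label, votes = bagged_predict(data, 2.0)
print(label)  # class 1
```

The vote counter is returned alongside the label so that the share of models voting for each class can be inspected as well.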
Curds and Whey (CW) using ordinary least squares (OLS) is another predictive modeling method. Breiman, 1997, J. Royal. Stat. Soc. B, 59:3-54. This method takes advantage of the correlations between response variables to improve predictive accuracy, compared with the usual procedure of performing an individual regression of each response variable on the common set of predictor variables X. In CW, Y=XB*S, where Y=(ykj) with k for the kth patient and j for jth response (j=1 for TJC, j=2 for SIC, etc.), B is obtained using OLS, and S is the shrinkage matrix computed from the canonical coordinate system. Another method is Curds and Whey and Lasso in combination (CW-Lasso). Instead of using OLS to obtain B, as in CW, here Lasso is used, and parameters are adjusted accordingly for the Lasso approach.
Many of these techniques are useful either combined with a biomarker selection technique (such as, for example, forward selection, backwards selection, or stepwise selection), or for complete enumeration of all potential panels of a given size, or genetic algorithms, or they can themselves include biomarker selection methodologies in their own techniques. These techniques can be coupled with information criteria, such as Akaike's Information Criterion (AIC), Bayes Information Criterion (BIC), or cross-validation, to quantify the tradeoff between the inclusion of additional biomarkers and model improvement, and to minimize overfit. The resulting predictive models can be validated in other studies, or cross-validated in the study they were originally trained in, using such techniques as, for example, Leave-One-Out (LOO) and 10-Fold cross-validation (10-Fold CV).
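The tradeoff between adding biomarkers and improving fit can be made concrete with the standard definitions AIC = 2k - 2 ln(L) and BIC = k ln(n) - 2 ln(L), where k is the number of fitted parameters, n the number of observations, and L the maximized likelihood. The Python sketch below compares two hypothetical models; the log-likelihoods and parameter counts are invented for illustration.

```python
import math


def aic(log_likelihood: float, k: int) -> float:
    """Akaike's Information Criterion: lower is better."""
    return 2 * k - 2 * log_likelihood


def bic(log_likelihood: float, k: int, n: int) -> float:
    """Bayes Information Criterion: penalizes parameters more heavily as n grows."""
    return k * math.log(n) - 2 * log_likelihood


# Hypothetical fits on n = 100 observations.
# Model A: 3 parameters; Model B: 6 parameters with only a slightly better fit.
n = 100
model_a = {"log_likelihood": -50.0, "k": 3}
model_b = {"log_likelihood": -49.0, "k": 6}

print(aic(model_a["log_likelihood"], model_a["k"]))  # 106.0
print(aic(model_b["log_likelihood"], model_b["k"]))  # 110.0
print(bic(model_a["log_likelihood"], model_a["k"], n) <
      bic(model_b["log_likelihood"], model_b["k"], n))  # True
```

Both criteria favor the smaller model here: three extra parameters improve the log-likelihood by only one unit, which does not offset the penalty.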
By “prognosis” is intended a prediction as to the likely outcome of a disease. Prognostic estimates are useful in, among other things, determining an appropriate therapeutic regimen for a subject.
A “multiplex assay” as used herein refers to an assay that simultaneously measures multiple analytes, e.g., multiple nucleic acid analytes, multiple DNA analytes, multiple cell-free DNA analytes, multiple protein analytes, in a single run or cycle of the assay.
A “predictive model,” which term may be used synonymously herein with “multivariate model” or simply a “model,” is a mathematical construct developed using a statistical algorithm or algorithms for classifying sets of data. The term “predicting” refers to generating a value for a datapoint without actually performing the clinical diagnostic procedures normally or otherwise required to produce that datapoint; “predicting” as used in this modeling context should not be understood solely to refer to the power of a model to predict a particular outcome. Predictive models can provide an interpretation function; e.g., a predictive model can be created by utilizing one or more statistical algorithms or methods to transform a dataset of observed data into a meaningful determination of a risk score or the disease state of a subject.
A “quantitative dataset” or “quantitative data” as used in the present teachings, refers to the data derived from, e.g., detection and composite measurements of expression of a plurality of biomarkers (i.e., two or more) in a subject sample. The quantitative dataset can be used to generate a score for the identification, monitoring and treatment of disease states, and in characterizing the biological condition of a subject. It is possible that different biomarkers will be detected depending on the disease state or physiological condition of interest.
“Biomarker,” “biomarkers,” “marker” or “markers” in the context of the present disclosure encompasses, without limitation, cytokines, chemokines, growth factors, proteins, peptides, nucleic acids, oligonucleotides, and metabolites, together with their related metabolites, mutations, isoforms, variants, polymorphisms, modifications, fragments, subunits, degradation products, elements, and other analytes or sample-derived measures. Biomarkers can also include mutated proteins, mutated nucleic acids, variations in copy numbers and/or transcript variants. Biomarkers also encompass non-blood borne factors and non-analyte physiological markers of health status, and/or other factors or markers not measured from samples (e.g., biological samples such as bodily fluids), such as clinical parameters and traditional factors for clinical assessments. Biomarkers can also include any indices that are calculated and/or created mathematically.
Biomarkers can also include combinations of any one or more of the foregoing measurements, including temporal trends and differences. In some embodiments, biomarkers are two or more of the following: fractalkine, interleukin-8, procalcitonin, pentraxin-3, suppression of tumorigenicity-2 (ST-2), and soluble tumor necrosis factor receptor-1 (TNFR-1). In some embodiments, biomarkers are one or more, two or more, three or more, four or more, five or more, or six of the following: fractalkine, interleukin-8, procalcitonin, pentraxin-3, suppression of tumorigenicity-2 (ST-2), and soluble tumor necrosis factor receptor-1 (TNFR-1).
Subjects
By “subject” is generally intended a mammal, particularly a human, such as a human patient. The term “mammal” includes but is not limited to a human, non-human primate, dog, cat, mouse, rat, cow, horse, pig, sheep, and camel. Mammals other than humans can be advantageously used as subjects that represent animal models of inflammation or secondary infection. A subject may be male, female, adult, immature, or young.
In some embodiments, the subject has a first infection, e.g., viral infection, COVID-19 infection, pneumonia, viral pneumonia, culture-positive infection, culture-negative infection, culture-positive pneumonia, culture-negative pneumonia. A subject may be one who has been previously diagnosed or identified as having an inflammatory disease. A subject can be one who has already undergone or is undergoing a therapeutic intervention for an inflammatory disease. A subject may also be one who has not been previously diagnosed as having an inflammatory disease; for example a subject may be one who exhibits one or more symptoms or risk factors for an inflammatory condition, or a subject who does not exhibit symptoms or risk factors for an inflammatory condition, or a subject who is asymptomatic for inflammatory disease. In some cases, the inflammatory condition is a hyper-inflammatory response.
Identifying the risk of inflammatory progression (IP) in a subject can allow for a prognosis of the disease and thus for the informed selection of, initiation of, adjustment of or increasing or decreasing various therapeutic regimens to delay, reduce or prevent that subject's progression to a more advanced disease state, e.g. a hyperinflammatory response. Subjects can be identified as having a particular risk of IP and so can be selected to begin or accelerate treatment to prevent or delay the further progression of inflammatory disease. In some cases, subjects can be identified as having a low or moderate risk of IP, and so can be selected to have their treatment decreased or discontinued. In other embodiments subjects may be identified by their IP risk scores as being at a particular risk for IP and can have therapy selected based on IP risk.
In some embodiments, the subject has, is suspected of having, or is at risk of having an infection by a bacterium, a fungus, a virus, a parasite, or any combination thereof. Such infection can be a secondary infection, such as an infection secondary to viral pneumonia, COVID-19 infection, viral infection, COVID-19 pneumonia, or other first infection. In some embodiments, an infection by a bacterium, a fungus, a virus, a parasite, or any combination thereof is a respiratory infection, e.g., pneumonia. In some embodiments, the infection is a fungal infection. In some embodiments, the infection is a bacterial infection. In some embodiments, a bacterial or fungal infection can comprise an infection by an organism selected from the group consisting of Bacillus spp., Clostridium spp., Corynebacterium jeikeium, Enterococcus spp., Lactobacillus spp., Rothia spp., Staphylococcus spp., Streptococcus spp., Citrobacter spp., Escherichia coli, Klebsiella spp., Pseudomonas spp., Stenotrophomonas maltophilia, and Candida spp. In some embodiments, the bacterial infection is a gram-negative bacterial infection. In some embodiments, the bacterial infection is a gram-positive bacterial infection. In some embodiments, the bacterial or fungal infection is susceptible to empirical antimicrobial therapy. In some embodiments, a subject is diagnosed with having an infection or with having a hyper-inflammatory response using methods disclosed herein. In some embodiments, a subject is diagnosed with having an increased risk of having severe disease or increased risk of death from the infection. For example, in some embodiments, the methods can detect that the subject has an increased risk of severe COVID-19, risk of a hyper-inflammatory response, and/or heightened risk of death from COVID-19.
In some cases, the subject has a localized infection. In some embodiments, the localized infection is a localized lung infection, e.g., pneumonia. In some cases, the subject is not bacteremic. In some cases, mcfDNA derived from a pathogen (e.g., respiratory pathogen) is detected in the subject, in the absence of bacteremia. In some cases, such mcfDNA is detected in plasma of a subject. For example, in some cases, the methods provided herein allow for detection in a plasma sample of a mcfDNA derived from a respiratory pathogen (e.g., bacterial pathogen associated with a respiratory infection) in a subject with a localized infection (e.g., pneumonia) and who does not have bacteremia.
Samples
A “sample” in the context of the present disclosure refers to any biological sample that is isolated from a subject. A sample can include, without limitation, a single cell or multiple cells, fragments of cells, an aliquot of body fluid, whole blood, platelets, serum, plasma, red blood cells, white blood cells or leucocytes, endothelial cells, tissue biopsies, synovial fluid, lymphatic fluid, ascites fluid, or interstitial or extracellular fluid. The term “sample” also encompasses the fluid in spaces between or external to the tissues that produce them, including synovial fluid, gingival crevicular fluid, bone marrow, cerebrospinal fluid (CSF), saliva, mucus, sputum, semen, sweat, urine or bodily fluids generally. “Blood sample” can refer to whole blood or any fraction thereof, including but not limited to blood cells, red blood cells, white blood cells, platelets, serum and plasma. Samples can be obtained from a subject by any means known in the art including, but not limited to, venipuncture, excretion, biopsy, needle aspirate, lavage, scraping, surgical incision or intervention or other methods known in the art.
In some embodiments, a sample is collected from a subject (e.g., a patient).
In some embodiments, a sample is a biological sample. In some embodiments, the biological sample is a whole blood sample. In some embodiments, the sample is a cell-free sample, such as a plasma sample or a cell-free plasma sample. In some embodiments, the sample is a sample of isolated or extracted nucleic acids (e.g., DNA, RNA, cell-free DNA). In some embodiments, the plasma sample is collected by collecting blood through venipuncture. In some embodiments, a specimen is mixed with an additive immediately after collection. In some cases, the additive is an anti-coagulant. In some cases, the additive prevents degradation of nucleic acids. In some cases, the additive is EDTA. In some embodiments, measures can be taken to avoid hemolysis or lipemia. In some embodiments, a sample is processed or unprocessed. In some embodiments, a sample is processed by extracting nucleic acids from a biological sample. In some embodiments, DNA is extracted from a sample. In some embodiments, nucleic acids are not extracted from the sample. In some embodiments, a sample comprises nucleic acids. In some embodiments, a sample consists essentially of nucleic acids.
In some cases, the methods provided herein comprise processing whole blood into a plasma sample. In some embodiments, such processing comprises centrifuging the whole blood in order to separate the plasma from blood cells. In some cases, the method further comprises subjecting the plasma to a second centrifugation, often at a higher speed in order to remove bacterial cells and cellular debris. In some cases, the second centrifugation is at a relative centrifugal force (rcf) of at least about 4,000 rcf, at least about 5,000 rcf, at least about 6,000 rcf, at least about 8,000 rcf, at least about 10,000 rcf, at least about 12,000 rcf, at least about 14,000 rcf, at least about 16,000 rcf, or at least about 20,000 rcf.
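Because centrifuges are often set in revolutions per minute rather than rcf, the following Python sketch converts between the two using the standard relation rcf = 1.118e-5 x r x rpm^2, with r the rotor radius in centimeters. The rotor radius used below is an assumption for illustration; the disclosure does not specify a rotor.

```python
import math

# Standard conversion constant for radius in centimeters and speed in rpm.
RCF_CONSTANT = 1.118e-5


def rcf_from_rpm(rpm: float, radius_cm: float) -> float:
    """Relative centrifugal force (x g) produced at a given speed and rotor radius."""
    return RCF_CONSTANT * radius_cm * rpm ** 2


def rpm_for_rcf(rcf: float, radius_cm: float) -> float:
    """Speed needed to reach a target relative centrifugal force."""
    return math.sqrt(rcf / (RCF_CONSTANT * radius_cm))


# Hypothetical rotor radius of 10 cm; target of 16,000 rcf from the listed values.
radius_cm = 10.0
target_rcf = 16000.0
rpm = rpm_for_rcf(target_rcf, radius_cm)
print(f"{rpm:.0f} rpm gives {rcf_from_rpm(rpm, radius_cm):.0f} rcf")
```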
At time of collection of a sample from the subject, the subject can be culture-negative for a microbe that is subsequently detected by a method provided herein. In some embodiments, at time of collection of a sample from the subject, the subject is culture-negative for a microbe that is subsequently detected by a method provided herein and the subject later becomes culture-positive for the microbe at a point in time following the collection of the sample. In some cases, at time of collection of the sample from the subject, the subject is culture-positive for a microbe that is subsequently detected by a method provided herein.
Often, a sample disclosed herein comprises a target nucleic acid (e.g., target DNA, target RNA). In some embodiments, a target nucleic acid is a cell-free nucleic acid or circulating cell-free nucleic acid. For example, the sample can comprise microbial cell-free nucleic acids (e.g., mcfDNA) that comprise a microbial target DNA (e.g., mcfDNA derived from a microbe, which can include pathogenic microbes). Exemplary microbes that can be detected by the methods provided herein include bacteria, fungi, parasites, and viruses. In some embodiments, a cell-free nucleic acid is a circulating cell-free nucleic acid. In some embodiments, a cell-free nucleic acid can comprise cell-free DNA.
In some embodiments, nucleic acids (e.g., cell-free nucleic acids, cell-free DNA, RNA, or other nucleic acid in any combination thereof) are extracted from a sample. In some embodiments, isolated nucleic acids (e.g., extracted DNA) can be used to prepare DNA libraries. In some embodiments, DNA libraries can be prepared by attaching adapters to nucleic acids. In some embodiments, adapters can be used for sequencing of nucleic acids. In some embodiments, nucleic acids can comprise DNA. In some embodiments, nucleic acids containing adapters can be sequenced to obtain sequence reads. In some embodiments, a sample (e.g., a plasma sample comprising mcfDNA) is mixed with adapters prior to extracting nucleic acids or DNA from the sample. In some embodiments, nucleic acids extracted from a sample (e.g., a plasma sample comprising mcfDNA) are attached to adapters following extraction. In some embodiments, sequence reads can be produced through high-throughput sequencing (HTS). In some embodiments, HTS can comprise next-generation sequencing (NGS). In some cases, the HTS is metagenomic sequencing or metagenomic next generation sequencing. In some embodiments, sequence reads can be aligned to sequences in a reference dataset. In some cases, the reference dataset has sequences from at least 2, 5, 7, 10, 50, 100, 500, 750, 800, 900, 1000, or 2000 different microbes (e.g., bacteria, viruses, parasites, fungi). In some embodiments, the sequences are derived from a combination of respiratory pathogens, particularly bacteria associated with respiratory infections. In some embodiments, a bacterial sequence is aligned to a reference dataset to obtain an aligned sequence read. In some embodiments, a fungal sequence is aligned to a reference dataset to obtain an aligned sequence read.
In some embodiments, bacterial sequences, fungal sequences, or a combination thereof can be quantified based on the aligned sequence reads obtained.
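The quantification step can be sketched in Python as a tally of aligned reads per microbe and per microbe type. The record layout, read identifiers, and function names are hypothetical, and the sketch starts from alignments that have already been produced; converting read counts into a concentration such as MPM is not shown because the conversion is not specified here.

```python
from collections import Counter
from typing import List, Tuple

# Each record: (read identifier, microbe the read aligned to, microbe type).
Alignment = Tuple[str, str, str]


def reads_per_microbe(alignments: List[Alignment]) -> Counter:
    """Count aligned sequence reads for each microbe in the reference dataset."""
    return Counter(microbe for _, microbe, _ in alignments)


def reads_per_type(alignments: List[Alignment]) -> Counter:
    """Count aligned sequence reads by microbe type (e.g., bacterial, fungal)."""
    return Counter(kind for _, _, kind in alignments)


# Hypothetical alignment results, for illustration only.
alignments = [
    ("read1", "Klebsiella pneumoniae", "bacterial"),
    ("read2", "Klebsiella pneumoniae", "bacterial"),
    ("read3", "Aspergillus fumigatus", "fungal"),
    ("read4", "Pseudomonas aeruginosa", "bacterial"),
]

print(reads_per_microbe(alignments)["Klebsiella pneumoniae"])  # 2
print(reads_per_type(alignments)["bacterial"])                 # 3
print(reads_per_type(alignments)["fungal"])                    # 1
```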
In the methods provided herein, nucleic acids can be isolated, extracted or purified. In some embodiments, nucleic acids can be extracted using a liquid extraction. In some embodiments, a liquid extraction can comprise a phenol-chloroform extraction. In some embodiments, a phenol-chloroform extraction can comprise use of Trizol™, DNAzol™, or any combination thereof. In some embodiments, nucleic acids can be extracted using centrifugation through selective filters in a column. In some embodiments, nucleic acids can be concentrated or precipitated by known methods, including, by way of example only, centrifugation. In some embodiments, nucleic acids can be bound to a selective membrane (e.g., silica) for the purposes of purification. In some embodiments, nucleic acids can be extracted using commercially available kits (e.g., QIAamp Circulating Nucleic Acid Kit™, Qiagen DNeasy kit™, QIAamp kit™, Qiagen Midi kit™, QIAprep spin kit™, or any combination thereof). Nucleic acids can also be enriched for fragments of a desired length, e.g., fragments which are less than 1000, 500, 400, 300, 200 or 100 base pairs in length. In some embodiments, enrichment based on size can be performed using, e.g., PEG-induced precipitation, an electrophoretic gel or chromatography material (Huber et al. (1993) Nucleic Acids Res. 21:1061-6), gel filtration chromatography, or TSKgel (Kato et al. (1984) J. Biochem, 95:83-86), which publications are hereby incorporated by reference in their entireties for all purposes.
In some embodiments, a nucleic acid sample is enriched for a target nucleic acid. In some embodiments, a target nucleic acid is a microbial cell-free nucleic acid.
In some embodiments, target (e.g., pathogen, microbial) nucleic acids are enriched relative to background (e.g., subject) nucleic acids in a sample, for example, by pull-down (e.g., preferentially pulling down target nucleic acids in a pull-down assay by hybridizing them to complementary oligonucleotides conjugated to a label such as a biotin tag and using, for example, avidin or streptavidin attached to a solid support), targeted PCR, or other methods. Examples of enrichment techniques include, but are not limited to: (a) self-hybridization techniques in which a major population in a sample of nucleic acids self-hybridizes more rapidly than a minor population in a sample; (b) depletion of nucleosome-associated DNA from free DNA; (c) removing and/or isolating DNA of specific length intervals; (d) exosome depletion or enrichment; and (e) strategic capture of regions of interest.
In some embodiments, an enriching step can comprise preferentially removing nucleic acids from a sample that are above about 120, about 150, about 200, or about 250 bases in length. In some embodiments, an enriching step comprises preferentially enriching nucleic acids from a sample that are between about 10 bases and about 60 bases in length, between about 10 bases and about 120 bases in length, between about 10 bases and about 150 bases in length, between about 10 bases and about 300 bases in length, between about 30 bases and about 60 bases in length, between about 30 bases and about 120 bases in length, between about 30 bases and about 150 bases in length, between about 30 bases and about 200 bases in length, or between about 30 bases and about 300 bases in length. In some embodiments, an enriching step comprises preferentially digesting nucleic acids derived from the host (e.g., subject). In some embodiments, an enriching step comprises preferentially replicating the non-host nucleic acids.
In some embodiments, a nucleic acid library is prepared. In some embodiments, a double-stranded DNA library, a single-stranded DNA library or an RNA library is prepared. A method of preparing a dsDNA library can comprise ligating an adapter sequence onto one or both ends of a dsDNA fragment. In some cases, the adapter sequence comprises a primer docking sequence. In some cases, the method further comprises hybridizing a primer to the primer docking sequence and initiating amplification or sequencing of the nucleic acid attached to the adapter. In some embodiments, the primer or the primer docking sequence comprises at least a portion of an adapter sequence that couples to a next-generation sequencing platform. In some embodiments, a method can further comprise extension of a hybridized primer to create a duplex, wherein a duplex comprises an original ssDNA fragment and an extended primer strand. In some embodiments, an extended primer strand can be separated from an original ssDNA fragment. In some embodiments, an extended primer strand can be collected, wherein an extended primer strand is a member of an ssDNA library.
In some cases, the library is prepared in an unbiased manner. For example, in some cases, the library is prepared without using a primer that specifically hybridizes to a microbial nucleic acid. For example, in some embodiments, the only amplification performed on the sample involves the use of a primer specific for a sequence of one or more adapters attached to nucleic acids within the sample. In some cases, whole genome amplification is used to prepare the library prior to attachment of the adapters. In some cases, whole genome amplification is not used to prepare the library. In some cases, one or more primers that specifically hybridize to a microbial nucleic acid (e.g., pathogen, viral, fungal, bacterial or parasite nucleic acid) are used to amplify the sample.
In some cases, multiple DNA libraries from different samples (e.g., samples from different patients or subjects) are combined and then subjected to a next generation sequencing assay. In some cases, the libraries are indexed prior to combining in order to track which library corresponds to which sample. Indexing can involve the inclusion of a specific code or bar code in an adapter, e.g., an adapter that is attached to the nucleic acids that are to be analyzed. In some cases, the samples comprise a negative control sample or a positive control sample, or both a negative control sample and a positive control sample.
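Indexing of pooled libraries can be illustrated with a short sketch. The example below assumes each read has already been paired with the index sequence read from its adapter; the function name `demultiplex`, the index sequences, and the sample names are hypothetical, and real demultiplexing typically also tolerates index mismatches, which is omitted here.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def demultiplex(reads: List[Tuple[str, str]],
                index_to_sample: Dict[str, str]) -> Dict[str, List[str]]:
    """Group (index, sequence) pairs by sample.

    Reads whose index is not in the sample sheet are collected under
    'undetermined'. Exact matching only (an illustrative simplification).
    """
    bins: Dict[str, List[str]] = defaultdict(list)
    for index, sequence in reads:
        bins[index_to_sample.get(index, "undetermined")].append(sequence)
    return dict(bins)


# Hypothetical sample sheet, including a negative control library.
index_to_sample = {
    "ACGT": "patient_1",
    "TGCA": "patient_2",
    "GGCC": "negative_control",
}
reads = [
    ("ACGT", "TTGACC"),
    ("TGCA", "GGATCC"),
    ("ACGT", "CCATGG"),
    ("NNNN", "AAAAAA"),  # index not recognized
]
by_sample = demultiplex(reads, index_to_sample)
print(by_sample["patient_1"])  # ['TTGACC', 'CCATGG']
```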
In some embodiments, a length of a nucleic acid can vary. In some embodiments, a nucleic acid or nucleic acid fragment (e.g., dsDNA fragment, RNA, or randomly sized cDNA) can be less than 1000 bp, less than 800 bp, less than 700 bp, less than 600 bp, less than 500 bp, less than 400 bp, less than 300 bp, less than 200 bp, or less than 100 bp. In some embodiments, a DNA fragment can be about 40 to about 100 bp, about 50 to about 125 bp, about 100 to about 200 bp, about 150 to about 400 bp, about 300 to about 500 bp, about 100 to about 500 bp, about 400 to about 700 bp, about 500 to about 800 bp, about 700 to about 900 bp, about 800 to about 1000 bp, or about 100 to about 1000 bp. In some embodiments, a nucleic acid or nucleic acid fragment (e.g., dsDNA fragment, RNA, or randomly sized cDNA) can be within a range from about 20 to about 200 bp, such as within a range from about 40 to about 100 bp.
In some embodiments, an end of a dsDNA fragment can be polished (e.g., blunt-ended) or be subject to end-repair to create a blunt end. In some embodiments, an end of a DNA fragment can be polished by treatment with a polymerase. In some embodiments, a polishing can involve removal of a 3′ overhang, a fill-in of a 5′ overhang, or a combination thereof. In some embodiments, a polymerase can be a proofreading polymerase (e.g., comprising 3′ to 5′ exonuclease activity). In some embodiments, a proofreading polymerase can be, e.g., a T4 DNA polymerase, Pol I Klenow fragment, or Pfu polymerase. In some embodiments, a polishing can comprise removal of damaged nucleotides (e.g., abasic sites), using any means known in the art.
In some embodiments, a ligation of an adapter to a 3′ end of a nucleic acid fragment can comprise formation of a bond between a 3′ OH group of the fragment and a 5′ phosphate of the adapter. Therefore, removal of 5′ phosphates from nucleic acid fragments can minimize aberrant ligation of two library members. Accordingly, in some embodiments, 5′ phosphates are removed from nucleic acid fragments. In some embodiments, 5′ phosphates are removed from at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, or greater than 95% of nucleic acid fragments in a sample. In some embodiments, substantially all phosphate groups are removed from nucleic acid fragments. In some embodiments, substantially all phosphates are removed from at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, or greater than 95% of nucleic acid fragments in a sample. Removal of phosphate groups from a nucleic acid sample can be by any means known in the art. Removal of phosphate groups can comprise treating the sample with heat-labile phosphatase. In some embodiments, phosphate groups are not removed from the nucleic acid sample. In some embodiments, ligation of an adapter to the 5′ end of the nucleic acid fragment is performed.
Exemplary Sample Processing and Analysis
What follows is an example of methods provided by this disclosure. In some cases, plasma is spiked with a known concentration of synthetic normalization molecule controls. In some cases, the plasma is then subjected to cell-free NA (cfNA) extraction (e.g., extraction of cell-free DNA). The extracted cfNA can be processed by end-repair, and adapters containing specific indexes can be ligated to the end-repaired cfDNA. The products of the ligation can be purified by beads. In some embodiments, the cfDNA ligated to adapters can be amplified with P5 and P7 primers, and the amplified, adapted cfDNA is purified.
Purified cfDNA attached to adapters derived from a plasma sample can be incorporated into a DNA sequencing library. Sequencing libraries from several plasma samples can be pooled with control samples, purified, and, in some embodiments, sequenced on Illumina sequencers using a 75-cycle single-end, dual index sequencing kit. Primary sequencing output can be demultiplexed followed by quality trimming of the reads. In some embodiments, the reads that pass quality filters are aligned against human and synthetic references and then excluded from the analysis, or otherwise set aside. Reads potentially representing human satellite DNA can also be filtered, e.g., via a k-mer-based method; then the remaining reads can be aligned with a microorganism reference database (e.g., a database with 20,963 assemblies of high-quality genomic references). In some embodiments, reads with alignments that exhibit high percent identity and/or high query coverage can be retained, except, e.g., for reads that are aligned with any mitochondrial or plasmid reference sequences. PCR duplicates can be removed based on their alignments. Relative abundances can be assigned to each taxon in a sample based on the sequencing reads and their alignments.
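The alignment filtering and duplicate removal steps can be sketched as follows. This is an illustration only: the `Alignment` record, the cutoffs of 95% identity and 90% query coverage, and the choice to treat reads with the same reference, start position, and strand as PCR duplicates are all assumptions, since the disclosure does not specify them. Exclusion of mitochondrial and plasmid references is not shown.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Alignment:
    read_id: str
    reference: str
    start: int
    strand: str
    percent_identity: float
    query_coverage: float


def filter_and_deduplicate(alignments: Iterable[Alignment],
                           min_identity: float = 95.0,
                           min_coverage: float = 90.0) -> List[Alignment]:
    """Drop low-quality alignments, then keep one read per alignment position.

    Thresholds and the duplicate key are hypothetical illustrative choices.
    """
    seen = set()
    kept = []
    for aln in alignments:
        if aln.percent_identity < min_identity or aln.query_coverage < min_coverage:
            continue  # fails the identity or coverage filter
        key = (aln.reference, aln.start, aln.strand)
        if key in seen:
            continue  # same position and strand: treated as a PCR duplicate
        seen.add(key)
        kept.append(aln)
    return kept


# Made-up alignments for illustration.
alignments = [
    Alignment("r1", "Klebsiella pneumoniae", 100, "+", 99.0, 100.0),
    Alignment("r2", "Klebsiella pneumoniae", 100, "+", 98.5, 100.0),  # duplicate of r1
    Alignment("r3", "Klebsiella pneumoniae", 250, "-", 80.0, 100.0),  # low identity
    Alignment("r4", "Pseudomonas aeruginosa", 100, "+", 97.0, 95.0),
]
kept = filter_and_deduplicate(alignments)
print([a.read_id for a in kept])  # ['r1', 'r4']
```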
For each combination of read and taxon, a read sequence probability can be defined that accounts for the divergence between the microorganism present in the sample and the reference assemblies in the database. A mixture model can be used to assign a likelihood to the complete collection of sequencing reads that includes the read sequence probabilities and the (unobserved) abundances of each taxon in the sample. In some cases, an expectation-maximization algorithm is applied to compute the maximum likelihood estimate of each taxon abundance. From these abundances, the number of reads arising from each taxon can be aggregated up the taxonomic tree. The estimated taxa abundances from the no template control (NTC) samples within the batch can be combined to parameterize a model of read abundance arising from the environment with variations driven by counting noise. Statistical significance values can then be computed for each estimate of taxon abundance in each patient sample. In some embodiments, taxa that exhibit a high significance level, and are one of the 1449 taxa within the reportable range, comprise the candidate calls. Final calls can be made after additional filtering is applied, which accounts for read location uniformity as well as cross-reactivity risk originating from higher abundance calls. The microorganism calls that pass these filters are reported along with abundances in MPM, as estimated using the ratio between the unique reads for the taxon and the number of observed unique reads of normalization molecules.
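The expectation-maximization step can be illustrated with a generic mixture-model sketch. It assumes the read sequence probabilities are supplied as a matrix with one row per read and one column per taxon; the function name `em_abundances`, the uniform starting point, and the convergence settings are illustrative assumptions and are not taken from the disclosure.

```python
from typing import List


def em_abundances(read_probs: List[List[float]],
                  n_iter: int = 200,
                  tol: float = 1e-9) -> List[float]:
    """Maximum likelihood taxon abundances for a simple mixture model.

    read_probs[r][t] is the (assumed given) probability of read r under
    taxon t. Returns abundances that sum to 1.
    """
    n_taxa = len(read_probs[0])
    pi = [1.0 / n_taxa] * n_taxa  # uniform initial abundances
    for _ in range(n_iter):
        counts = [0.0] * n_taxa
        for probs in read_probs:
            # E-step: posterior responsibility of each taxon for this read
            weights = [pi[t] * probs[t] for t in range(n_taxa)]
            total = sum(weights)
            if total == 0.0:
                continue  # read not explained by any taxon
            for t in range(n_taxa):
                counts[t] += weights[t] / total
        assigned = sum(counts)
        if assigned == 0.0:
            return pi
        # M-step: abundances are the normalized expected read counts
        new_pi = [c / assigned for c in counts]
        converged = max(abs(a - b) for a, b in zip(pi, new_pi)) < tol
        pi = new_pi
        if converged:
            break
    return pi


# Made-up probabilities: three reads match only taxon 0, one only taxon 1.
unambiguous = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
print(em_abundances(unambiguous))  # [0.75, 0.25]
```

With ambiguous reads, each read contributes fractionally to several taxa in the E-step, and the estimated fractional read counts are what would then be aggregated up the taxonomic tree.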
The plasma concentration of mcfDNA in each sample can then be quantified by using the measured relative abundance of the synthetic molecules initially spiked into the plasma.
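One way to read this normalization is as a simple proportion: the ratio of unique taxon reads to unique normalization-molecule reads, scaled by the known spike-in concentration. The sketch below makes that assumption explicit; the function name `estimate_mpm`, the linear scaling, and the numbers are illustrative and may differ from the calibration actually used.

```python
def estimate_mpm(taxon_unique_reads: int,
                 normalization_unique_reads: int,
                 spike_in_molecules_per_ul: float) -> float:
    """Estimate molecules per microliter (MPM) for one taxon.

    Assumes recovery of spiked normalization molecules mirrors recovery
    of microbial cfDNA, so read ratios scale linearly with concentration.
    """
    if normalization_unique_reads <= 0:
        raise ValueError("no normalization reads observed; cannot quantify")
    ratio = taxon_unique_reads / normalization_unique_reads
    return ratio * spike_in_molecules_per_ul


# Made-up values: 250 taxon reads, 1000 normalization reads,
# spike-in at 400 molecules per microliter of plasma.
print(estimate_mpm(250, 1000, 400.0))  # 100.0
```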
In some cases, testing with plasma mcfDNA-seq is performed on available samples collected between seven days before and four days after each bloodstream infection (BSI) episode, and two negative control samples are added for each BSI episode. In some cases, the samples are collected at least three days prior to a bloodstream infection or invasive fungal infection. The laboratory can be blinded to expected results until sequencing is completed and reported.
Analysis
Disclosed herein, in some embodiments, are methods of analyzing nucleic acids. Such analytical methods include sequencing the nucleic acids as well as bioinformatic analysis of the sequencing results (e.g., sequence reads).
In some embodiments, a sequencing is performed using a next generation sequencing assay. As used herein, the term “next generation” generally refers to any high-throughput sequencing approach including, but not limited to, one or more of the following: massively-parallel signature sequencing, pyrosequencing (e.g., using a Roche 454 Genome Analyzer™ sequencing device), Illumina™ (Solexa™) sequencing (e.g., using an Illumina NextSeq™ 500), sequencing by synthesis (Illumina™), ion semiconductor sequencing (Ion Torrent™), sequencing by ligation (e.g., SOLiD™ sequencing), single molecule real-time (SMRT) sequencing (e.g., Pacific Biosciences™), polony sequencing, DNA nanoball sequencing (Complete Genomics™), heliscope single molecule sequencing (Helicos Biosciences™), and nanopore sequencing (e.g., Oxford Nanopore™). In some embodiments, a sequencing assay can comprise nanopore sequencing. In some embodiments, a sequencing assay can include some form of Sanger sequencing. In some embodiments, a sequencing can involve shotgun sequencing; in some embodiments, a sequencing can include bridge amplification PCR. In some embodiments, a sequencing can be broad spectrum. In some embodiments, a sequencing can be targeted.
In some embodiments, a sequencing assay can comprise a Gilbert's sequencing method. In some embodiments, a Gilbert's sequencing method can comprise chemically modifying nucleic acids (e.g., DNA) and then cleaving them at specific bases. In some embodiments, a sequencing assay can comprise dideoxynucleotide chain termination or Sanger-sequencing.
In some embodiments, a sequencing-by-synthesis approach can be used in the methods provided herein. In some embodiments, fluorescently-labeled reversible-terminator nucleotides are introduced to clonally-amplified DNA templates immobilized on the surface of a glass flowcell. During each sequencing cycle, a single labeled deoxynucleoside triphosphate (dNTP) may be added to the nucleic acid chain. The labeled terminator nucleotide may be imaged when added in order to identify the base and may then be enzymatically cleaved to allow incorporation of the next nucleotide. Since all four reversible terminator-bound dNTPs (A, C, T, G) are generally present as single, separate molecules, natural competition may minimize incorporation bias.
In some embodiments, a method called Single-molecule real-time (SMRT) is used. In such approach, nucleic acids (e.g., DNA) are synthesized in zero-mode wave-guides (ZMWs), which are small well-like containers with capturing tools located at the bottom of the well. The sequencing is performed with use of unmodified polymerase (attached to the ZMW bottom) and fluorescently labelled nucleotides flowing freely in the solution. The fluorescent label is detached from the nucleotide upon its incorporation into the DNA strand, leaving an unmodified DNA strand. A detector such as a camera may then be used to detect the light emissions; and the data may be analyzed bioinformatically to obtain sequence information.
In some embodiments, a sequencing by ligation approach is used to sequence the nucleic acids in a sample. One example is the next generation sequencing method of SOLiD (Sequencing by Oligonucleotide Ligation and Detection) sequencing (Life Technologies). This next generation technology may generate hundreds of millions to billions of small sequence reads at one time. The sequencing method may comprise preparing a library of DNA fragments from the sample to be sequenced. In some embodiments, the library is used to prepare clonal bead populations in which only one species of fragment is present on the surface of each bead (e.g., magnetic bead). The fragments attached to the magnetic beads may have a universal P1 adapter sequence attached so that the starting sequence of every fragment is both known and identical. In some embodiments, the method may further involve PCR or emulsion PCR. For example, the emulsion PCR may involve the use of microreactors containing reagents for PCR. The resulting PCR products attached to the beads may then be covalently bound to a glass slide. A sequencing assay such as a SOLiD sequencing assay or other sequencing by ligation assay may include a step involving the use of primers. Primers may hybridize to the P1 adapter sequence or other sequence within the library template. The method may further involve introducing four fluorescently labelled di-base probes that compete for ligation to the sequencing primer. Specificity of the di-base probe may be achieved by interrogating every first and second base in each ligation reaction. Multiple cycles of ligation, detection and cleavage may be performed with the number of cycles determining the eventual read length. In some embodiments, following a series of ligation cycles, the extension product can be removed and the template can be reset with a primer complementary to the n−1 position for a second round of ligation cycles. 
Multiple rounds (e.g., 5 rounds) of primer reset may be completed for each sequence tag. Through the primer reset process, each base may be interrogated in two independent ligation reactions by two different primers. For example, a base at read position 5 can be assayed by primer number 2 in ligation cycle 2 and by primer number 3 in ligation cycle 1.
In some embodiments, a detection or quantification analysis of oligonucleotides can be accomplished by sequencing. In some embodiments, entire synthesized oligonucleotides can be detected via full sequencing of all oligonucleotides by e.g., Illumina HiSeq 2500™, including the sequencing methods described herein.
In some embodiments, a sequencing can be accomplished through classic Sanger sequencing methods which are well known in the art. Sequencing can also be accomplished using high-throughput systems some of which allow detection of a sequenced nucleotide immediately after or upon its incorporation into a growing strand, e.g., detection of sequence in real time or substantially real time. In some embodiments, high throughput sequencing generates at least 1,000, at least 5,000, at least 10,000, at least 20,000, at least 30,000, at least 40,000, at least 50,000, at least 100,000, or at least 500,000 sequence reads per hour. In some embodiments, each read is at least 50, at least 60, at least 70, at least 80, at least 90, at least 100, at least 120, or at least 150 bases per read. In some embodiments, each read is up to 2000, up to 1000, up to 900, up to 800, up to 700, up to 600, up to 500, up to 400, up to 300, up to 200, or up to 100 bases per read. Long read sequencing can include sequencing that provides a contiguous sequence read of longer than 500 bases, longer than 800 bases, longer than 1000 bases, longer than 1500 bases, longer than 2000 bases, longer than 3000 bases, or longer than 4500 bases per read.
In some embodiments, a high-throughput sequencing can involve the use of technology available by Illumina's Genome Analyzer IIX™, MiSeq personal sequencer™, or HiSeq™ systems, such as those using HiSeq 2500 ™, HiSeq 1500 ™, HiSeq 2000 ™, or HiSeq 1000 ™. These machines use reversible terminator-based sequencing by synthesis chemistry. These machines can sequence 200 billion or more reads in eight days. Smaller systems may be utilized for runs within 3, 2, or 1 days or less time. Short synthesis cycles may be used to minimize the time it takes to obtain sequencing results.
In some embodiments, a high-throughput sequencing involves the use of technology available by ABI Solid System. This genetic analysis platform can enable massively parallel sequencing of clonally-amplified DNA fragments linked to beads. The sequencing methodology is based on sequential ligation with dye-labeled oligonucleotides.
In some embodiments, a next-generation sequencing can comprise ion semiconductor sequencing (e.g., using technology from Life Technologies™ (Ion Torrent™)). Ion semiconductor sequencing can take advantage of the fact that when a nucleotide is incorporated into a strand of DNA, an ion can be released.
To perform ion semiconductor sequencing, a high density array of micromachined wells can be formed. Each well can hold a single DNA template. Beneath the well can be an ion sensitive layer, and beneath the ion sensitive layer can be an ion sensor. When a nucleotide is added to a DNA, an H+ ion can be released, which can be measured as a change in pH. The H+ ion can be converted to voltage and recorded by the semiconductor sensor. An array chip can be sequentially flooded with one nucleotide after another. In some embodiments, no scanning, light, or cameras are required. In some embodiments, an IONPROTON™ Sequencer is used to sequence nucleic acid. In some embodiments, an IONPGM™ Sequencer is used. The Ion Torrent Personal Genome Machine™ (PGM) can sequence 10 million reads in two hours.
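The sequential flooding of the chip with one nucleotide after another can be illustrated with an idealized, noise-free simulation for a single well. The sketch assumes the recorded signal in each flow is proportional to the number of nucleotides incorporated (so a run of identical bases gives a proportionally larger signal); the flow order `TACG` and the function names are illustrative assumptions, not a description of any instrument's software.

```python
from typing import List, Tuple


def simulate_flows(read: str,
                   flow_order: str = "TACG",
                   max_flows: int = 400) -> List[Tuple[str, int]]:
    """Return (nucleotide flowed, number incorporated) for each flow.

    `read` is the strand being synthesized in one well. Idealized model:
    every matching base in a run is incorporated in a single flow.
    """
    flows = []
    pos = 0
    i = 0
    while pos < len(read) and i < max_flows:
        base = flow_order[i % len(flow_order)]
        incorporated = 0
        while pos < len(read) and read[pos] == base:
            incorporated += 1  # one H+ released per incorporation
            pos += 1
        flows.append((base, incorporated))
        i += 1
    return flows


def call_bases(flows: List[Tuple[str, int]]) -> str:
    """Reconstruct the sequence from per-flow incorporation counts."""
    return "".join(base * count for base, count in flows)


flows = simulate_flows("TTACG")
print(flows)              # [('T', 2), ('A', 1), ('C', 1), ('G', 1)]
print(call_bases(flows))  # TTACG
```

A flow in which no nucleotide is incorporated produces a count of zero, which corresponds to no change in pH for that well.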
In some embodiments, a high-throughput sequencing involves the use of technology available by Helicos BioSciences Corporation™ (Cambridge, Massachusetts) such as the Single Molecule Sequencing by Synthesis (SMSS) method. SMSS can allow for sequencing the entire human genome in up to 24 hours. In some embodiments, SMSS may not require a pre-amplification step prior to hybridization. In some embodiments, SMSS may not require any amplification. In some embodiments, methods of using SMSS are described in part in US Publication Application No. 20060024711, which is herein incorporated by reference.
In some embodiments, a high-throughput sequencing involves the use of technology available by 454 Lifesciences, Inc.™ (Branford, Connecticut) such as the Pico Titer Plate™ device, which includes a fiber optic plate that transmits chemiluminescent signal generated by the sequencing reaction to be recorded by a charge-coupled device (CCD) camera in the instrument. This use of fiber optics can allow for the detection of a minimum of 20 million base pairs in 4.5 hours. In some embodiments, methods for using bead amplification followed by fiber optics detection are described in US Publication Application Nos. 20020012930; 20030058629; 20030100102; 20030148344; 20040248161; 20050079510; 20050124022; and 20060078909, each of which is herein incorporated by reference.
In some embodiments, high-throughput sequencing is performed using Clonal Single Molecule Array (Solexa, Inc.™) or sequencing-by-synthesis (SBS) utilizing reversible terminator chemistry.
In some embodiments, the next generation sequencing is nanopore sequencing. A nanopore can be a small hole, e.g., on the order of about one nanometer in diameter. Immersion of a nanopore in a conducting fluid and application of a potential across it can result in a slight electrical current due to conduction of ions through the nanopore. The amount of current which flows can be sensitive to the size of the nanopore. As a DNA molecule passes through a nanopore, each nucleotide on the DNA molecule can obstruct the nanopore to a different degree. Thus, the change in the current passing through the nanopore as the DNA molecule passes through the nanopore can represent a reading of the DNA sequence. The nanopore sequencing technology can be from Oxford Nanopore Technologies™; e.g., a GridION™ system. A single nanopore can be inserted in a polymer membrane across the top of a microwell. Each microwell can have an electrode for individual sensing. The microwells can be fabricated into an array chip, with 100,000 or more microwells (e.g., more than 200,000, 300,000, 400,000, 500,000, 600,000, 700,000, 800,000, 900,000, or 1,000,000) per chip. An instrument (or node) can be used to analyze the chip. Data can be analyzed in real-time. One or more instruments can be operated at a time. The nanopore can be a protein nanopore, e.g., the protein alpha-hemolysin, a heptameric protein pore. The nanopore can be a solid-state nanopore, e.g., a nanometer sized hole formed in a synthetic membrane (e.g., SiNx, or SiO2). The nanopore can be a hybrid pore (e.g., an integration of a protein pore into a solid-state membrane). The nanopore can be a nanopore with integrated sensors (e.g., tunneling electrode detectors, capacitive detectors, or graphene based nano-gap or edge state detectors (see e.g., Garaj et al. (2010) Nature vol. 467, doi: 10.1038/nature09379)). A nanopore can be functionalized for analyzing a specific type of molecule (e.g., DNA, RNA, or protein).
Nanopore sequencing can comprise “strand sequencing” in which intact DNA polymers can be passed through a protein nanopore with sequencing in real time as the DNA translocates the pore. An enzyme can separate strands of a double stranded DNA and feed a strand through a nanopore. The DNA can have a hairpin at one end, and the system can read both strands. In some embodiments, nanopore sequencing is “exonuclease sequencing” in which individual nucleotides can be cleaved from a DNA strand by a processive exonuclease, and the nucleotides can be passed through a protein nanopore. The nucleotides can transiently bind to a molecule in the pore (e.g., cyclodextrin). A characteristic disruption in current can be used to identify bases. Methods of using these technologies are described in part in Soni G V and Meller A. (2007) Clin Chem 53: 1996-2001, which is herein incorporated by reference.
In some embodiments, a nanopore sequencing technology from GENIA™ can be used. An engineered protein pore can be embedded in a lipid bilayer membrane. “Active Control” technology can be used to enable efficient nanopore-membrane assembly and control of DNA movement through the channel. In some embodiments, the nanopore sequencing technology is from NABsys™. Genomic DNA can be fragmented into strands of average length of about 100 kb. The 100 kb fragments can be made single stranded and subsequently hybridized with a 6-mer probe. The genomic fragments with probes can be driven through a nanopore, which can create a current-versus-time tracing. The current tracing can provide the positions of the probes on each genomic fragment. The genomic fragments can be lined up to create a probe map for the genome. The process can be done in parallel for a library of probes. A genome-length probe map for each probe can be generated. Errors can be fixed with a process termed “moving window Sequencing By Hybridization (mwSBH).” In some embodiments, the nanopore sequencing technology is from IBM™ or Roche™. An electron beam can be used to make a nanopore sized opening in a microchip. An electrical field can be used to pull or thread DNA through the nanopore. A DNA transistor device in the nanopore can comprise alternating nanometer sized layers of metal and dielectric. Discrete charges in the DNA backbone can get trapped by electrical fields inside the DNA nanopore. Turning off and on gate voltages can allow the DNA sequence to be read.
The next generation sequencing can comprise DNA nanoball sequencing (as performed, e.g., by Complete Genomics™; see e.g., Drmanac et al. (2010) Science 327: 78-81, which is incorporated herein by reference). DNA can be isolated, fragmented, and size selected. For example, DNA can be fragmented (e.g., by sonication) to a mean length of about 500 bp. Adapters (Ad1) can be attached to the ends of the fragments.
The adapters can be used to hybridize to anchors for sequencing reactions. DNA with adapters bound to each end can be PCR amplified. The adapter sequences can be modified so that complementary single strand ends bind to each other forming circular DNA. The DNA can be methylated to protect it from cleavage by a type IIS restriction enzyme used in a subsequent step. An adapter (e.g., the right adapter) can have a restriction recognition site, and the restriction recognition site can remain non-methylated. The non-methylated restriction recognition site in the adapter can be recognized by a restriction enzyme (e.g., AcuI), and the DNA can be cleaved by AcuI 13 bp to the right of the right adapter to form linear double stranded DNA. A second round of right and left adapters (Ad2) can be ligated onto either end of the linear DNA, and all DNA with both adapters bound can be amplified (e.g., by PCR). Ad2 sequences can be modified to allow them to bind each other and form circular DNA. The DNA can be methylated, but a restriction enzyme recognition site can remain non-methylated on the left Ad1 adapter. A restriction enzyme (e.g., AcuI) can be applied, and the DNA can be cleaved 13 bp to the left of the Ad1 to form a linear DNA fragment. A third round of right and left adapters (Ad3) can be ligated to the right and left flank of the linear DNA, and the resulting fragment can be PCR amplified. The adapters can be modified so that they can bind to each other and form circular DNA. A type III restriction enzyme (e.g., EcoP15) can be added; EcoP15 can cleave the DNA 26 bp to the left of Ad3 and 26 bp to the right of Ad2. This cleavage can remove a large segment of DNA and linearize the DNA once again. A fourth round of right and left adapters (Ad4) can be ligated to the DNA, the DNA can be amplified (e.g., by PCR), and modified so that they bind each other and form the completed circular DNA template.
Rolling circle replication (e.g., using Phi 29 DNA polymerase) can be used to amplify small fragments of DNA. The four adapter sequences can contain palindromic sequences that can hybridize and a single strand can fold onto itself to form a DNA nanoball (DNB™) which can be approximately 200-300 nanometers in diameter on average. A DNA nanoball can be attached (e.g., by adsorption) to a microarray (sequencing flowcell). The flow cell can be a silicon wafer coated with silicon dioxide, titanium and hexamethyldisilazane (HMDS) and a photoresistant material. Sequencing can be performed by unchained sequencing by ligating fluorescent probes to the DNA. The color of the fluorescence of an interrogated position can be visualized by a high resolution camera. The identity of nucleotide sequences between adapter sequences can be determined.
The methods provided herein may include use of a system that contains a nucleic acid sequencer (e.g., DNA sequencer, RNA sequencer) for generating DNA or RNA sequence information. The system may include a computer comprising software that performs bioinformatic analysis on the DNA or RNA sequence information. Bioinformatic analysis can include, without limitation, assembling sequence data, detecting and quantifying genetic variants in a sample, including germline variants and somatic cell variants (e.g., a genetic variation associated with cancer or pre-cancerous condition, a genetic variation associated with infection, or a combination thereof).
Sequencing data may be used to determine genetic sequence information, ploidy states, the identity of one or more genetic variants, as well as quantitative measures of the variants, including relative and absolute measures.
In some embodiments, a sequencing can involve sequencing of a genome. In some embodiments, a genome can be that of a pathogen as disclosed herein. In some embodiments, sequencing of a genome can involve whole genome sequencing or partial genome sequencing. In some embodiments, a sequencing can be unbiased and can involve sequencing all or substantially all (e.g., greater than 70%, 80%, 90%) of the nucleic acids in a sample. In some embodiments, a sequencing of a genome can be selective, e.g., directed to portions of a genome of interest. In some embodiments, sequencing of select genes, or portions of genes may suffice for a desired analysis. In some embodiments, polynucleotides mapping to specific loci in a genome can be isolated for sequencing by, for example, sequence capture or site-specific amplification.
In some embodiments, disclosed herein is a method comprising a process of analyzing, calculating, quantifying, or a combination thereof. In some embodiments, a method can be used to determine quantities of bacterial and fungal sequence reads. In some embodiments, metrics can be generated to determine quantities of bacterial sequences, fungal sequences or a combination thereof.
In some embodiments, the quantity for each organism identified in a method provided herein is expressed in Molecules Per Microliter of biological fluid (e.g., plasma) (MPM), the number of DNA sequencing reads from the reported organism present per microliter of plasma. In some cases, detection or prediction of infection (or of severity of infection or of hyper-inflammatory response or of mortality from COVID-19) occurs when the MPM is greater than a threshold value. In some cases, such threshold value of MPM is 10, 15, 20, 30, 50, 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 2000, 3000, 3500, 4000, 4500, 5000, 7000, 10000, 20000, 30000, or 40000. In some cases, the threshold value is 100 MPM. In some cases, total MPM (e.g., total MPM from respiratory pathogens) above 100 MPM is indicative of a secondary infection. In some cases, total MPM above 100 MPM is indicative of a hyperinflammatory response. In some cases, the threshold value is 400 MPM. In some cases, total MPM (e.g., total MPM from respiratory pathogens) above 400 MPM is indicative of a secondary infection. In some cases, total MPM above 400 MPM is indicative of a hyperinflammatory response. In some cases, the threshold value is 3000 MPM. In some cases, total MPM (e.g., total MPM from respiratory pathogens) above 3000 MPM is indicative of a secondary infection. In some cases, total MPM above 3000 MPM is indicative of a hyperinflammatory response. In some cases, the threshold value is 4000 MPM. In some cases, total MPM (e.g., total MPM from respiratory pathogens) above 4000 MPM is indicative of a secondary infection. In some cases, total MPM above 4000 MPM is indicative of a hyperinflammatory response. In some cases, such threshold value of MPM is at least (or greater than) 10, 15, 20, 30, 50, 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 2000, 3000, 3500, 4000, 4500, 5000, 7000, 10000, 20000, 30000, or 40000.
In some cases, the MPM threshold is determined for a particular organism. In some cases, the MPM threshold is a value that is an aggregate amount of mcfNA (e.g., mcfDNA) from more than one single organism (e.g., aggregate amount of mcfNA from bacteria, from respiratory pathogens, from respiratory bacteria, from bacteria and fungi, or from a specific set of pathogens). In some embodiments, the respiratory pathogen is at least one respiratory pathogen listed in Table 2, in any combination. In some embodiments, the respiratory pathogen is a Streptococcus, Pseudomonas, or Klebsiella bacterium. In some embodiments, the respiratory pathogen is from any genus listed in Table 2. In some cases, the respiratory pathogen is from the genus Actinomyces, Aspergillus, Bacteroides, Citrobacter, Cytomegalovirus, Enterobacter, Escherichia, Enterococcus, Streptococcus, Pseudomonas, Klebsiella, and/or Haemophilus. In some cases, the respiratory pathogen is S. aureus, P. aeruginosa, and/or K. pneumoniae, in any combination. In some cases, the MPM threshold for any of the preceding infections is “about” (as defined herein) any of the preceding values.
In some cases, the MPM threshold represents the MPM for an uninfected or healthy control. In some cases, the MPM threshold is a threshold indicative of disease severity or risk of mortality; for example, an MPM greater than 1000, 4000, 5000, 7000, or 10000 may indicate a high risk of non-survival from COVID-19.
Sequencing Systems
This disclosure also provides sequencing systems for nucleic acid or DNA sequencing. In some embodiments, the nucleic acid sequencing system is for detecting secondary infection in a subject with a first infection. In some embodiments, the system comprises a next-generation sequencing device comprising a flow cell and a computer processor that outputs data comprising sequence reads collected from measurements conducted in the flow cell. In some embodiments, the system comprises or further comprises a computing device that comprises quantitation of total microbial cell-free nucleic acids (mcfNA) logic that (i) detects mcfNA from at least two different microbes by aligning the sequence reads to microbial reference sequence reads; (ii) calculates total mcfNA as a function of molecules per microliter of plasma, wherein the total mcfNA is an aggregate value of mcfNA from the at least two different microbes; and (iii) comprises an event generator to generate an event indicative of a secondary infection when the total mcfNA exceeds a threshold value. In some cases, the genomic references include sequences from pathogens in Table 2.
In some cases, the threshold value is at least 50 MPM, 70 MPM, 100 MPM, 200 MPM, 500 MPM, 1000 MPM, 2000 MPM, 3000 MPM, 4000 MPM, 5000 MPM, 10000 MPM, 50000 MPM, or 100000 MPM. In some cases, the threshold value is “about” any of the preceding MPM values. In some cases, the threshold value is the value associated with MPM for microbial cell-free nucleic acids (e.g., mcfDNA) from a healthy or uninfected subject, or from a subject that has a hypo-inflammatory response.
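By way of non-limiting illustration, the quantitation logic and event generator described above can be sketched in Python as follows. The function name, the dictionary input, the structure of the returned event, and the default threshold of 100 MPM (one of the values recited herein) are assumptions made only for this sketch and are not a required implementation.

```python
from typing import Dict, Optional


def secondary_infection_event(mpm_by_microbe: Dict[str, float],
                              threshold_mpm: float = 100.0) -> Optional[dict]:
    """Return an event when aggregate mcfNA exceeds the threshold, else None.

    mpm_by_microbe maps a microbe name to its molecules per microliter (MPM).
    """
    # Keep only microbes that were actually detected (MPM above zero).
    detected = {name: mpm for name, mpm in mpm_by_microbe.items() if mpm > 0}

    # The described logic aggregates mcfNA from at least two different microbes.
    if len(detected) < 2:
        return None

    total_mpm = sum(detected.values())
    if total_mpm > threshold_mpm:
        # Hypothetical event payload; a real system could log or alert instead.
        return {
            "event": "secondary_infection_indicated",
            "total_mpm": total_mpm,
            "microbes": sorted(detected),
        }
    return None
```

In this sketch, a call such as `secondary_infection_event({"Escherichia coli": 150.0, "Klebsiella pneumoniae": 59.0})` returns an event because the aggregate value exceeds the assumed 100 MPM threshold, whereas a sample with a single detected microbe returns no event.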
Treatments
In some embodiments, the non-limiting methods provided herein can comprise administering a treatment to a subject. In some cases, the treatment treats a disease or disorder, such as by reducing symptoms or signs of the disease or disorder. In some cases, the disease or disorder is an infection (e.g., bacterial infection, fungal infection, respiratory infection, pneumonia, bacterial pneumonia, viral pneumonia). In some cases, the disease or disorder is inflammation. In some cases, the treating occurs prior to onset of an infection or inflammation and, in some embodiments, prior to onset of one or more symptoms of infection (e.g., fever, elevated heart rate, low blood pressure, hyperventilation). In some embodiments, the treatment is administered to a subject when the subject is blood culture negative for the organism that is the target of the treatment. In some embodiments, the infection is detected or predicted by a method provided herein when the subject is blood culture negative, but the treatment is administered when the subject is blood culture positive. In some embodiments, the infection is detected or predicted by a method provided herein when the subject is blood culture negative, and the treatment is administered when the subject is blood culture negative. In some embodiments, the infection is detected or predicted by a method provided herein when the subject is blood culture positive, and the treatment is administered when the subject is blood culture positive. In some cases, the treatment is provided when the subject has not had a blood culture, or when the blood culture is non-conclusive. In some embodiments, the treatment is a preemptive treatment that prevents an asymptomatic infection from progressing into a symptomatic infection. In some embodiments, the treatment is a prophylactic treatment that prevents the onset of infection. In some embodiments, the treatment treats or reduces symptoms of an infection.
Various non-limiting treatments provided herein can be administered to the subject. In some embodiments, the treatment is a broad-spectrum antimicrobial drug or an antimicrobial drug that targets a specific microbe or a specific class of microbes. In some embodiments, the treatment targets bacteria and/or fungi, particularly any of the microbial organisms identified herein (e.g., in the Examples section of this application). In some embodiments, the subject is treated with a combination of drugs (e.g., a combination of multiple antibiotics, multiple anti-fungal drugs, or both antibiotics and antifungal drugs). In some embodiments, the subject is treated with a combination of broad-spectrum antibiotics, a combination of broad- and narrow-spectrum antibiotics, a combination of narrow-spectrum antibiotics, a combination of broad-spectrum antifungals, a combination of broad- and narrow-spectrum antifungals, or a combination of narrow-spectrum antifungals. In some embodiments, the subject is treated with a broad-spectrum antibiotic, a narrow-spectrum antibiotic, a broad-spectrum antifungal, a narrow-spectrum antifungal, or any combination thereof.
In some embodiments, the treatment is an antimicrobial. In some embodiments, the antimicrobial comprises a beta-lactam, an aminoglycoside, a quinolone, an oxazolidinone, a sulfonamide, a macrolide, a tetracycline, an ansamycin, a streptogramin, or a lipopeptide, used singly or in any combination thereof as used herein and/or as recommended by a clinician. In some embodiments, the treatment is a broad-spectrum treatment. In some embodiments, the broad-spectrum treatment is a broad-spectrum antibiotic, a broad-spectrum anti-bacterial drug, a broad-spectrum antifungal, or any combination thereof. As used herein, the term “broad spectrum antibiotic” generally refers to a drug that acts on both gram-negative and gram-positive bacteria, that acts on multiple types of gram-negative bacteria, and/or that acts on multiple types of gram-positive bacteria. In some embodiments, the broad-spectrum treatment acts on multiple types of fungal infections. In some embodiments, the drug is a beta-lactam penicillin such as flucloxacillin, ampicillin, or amoxicillin. In some embodiments, the broad-spectrum drug is a beta-lactam such as a cephalosporin antibiotic (e.g., ceftriaxone, cefepime). The cephalosporin drug can be, in some embodiments, a first, second, third, or fourth generation cephalosporin drug. In some embodiments, the broad-spectrum antibiotic is a quinolone drug (e.g., levofloxacin), a carbapenem-type antibiotic (e.g., meropenem), or a metronidazole.
In some cases, the treatment is an antibiotic. In some embodiments, the treatment is a glycopeptidic antibiotic active against gram-positive bacteria. For example, in some embodiments, the treatment is vancomycin. In some embodiments, the treatment comprises one or more antibiotics listed in Table 5.
In some embodiments, the treatment is an anti-fungal drug. In some embodiments, the treatment is a broad-spectrum antifungal drug. In some embodiments, the antifungal drug is, for example, a cefepime, a clotrimazole, an econazole, a miconazole, a terbinafine, a fluconazole, a ketoconazole, a nystatin, an amphotericin B, or any other known antifungal drugs and/or a combination thereof.
In some embodiments, the treatment comprises various narrow-spectrum drugs, for example, a flucytosine. In some embodiments, the narrow-spectrum drug is an oxazolidinone, for example, a linezolid, a posizolid, a radezolid, a penicillin VK, or any combination thereof.
In some embodiments, the antimicrobial drug is a pill, a gel, a tablet, a coated tablet, or any combination thereof and can be administered to the subject orally. In some embodiments, the treatment using an anti-fungal can be administered to the subject topically. In some embodiments, a treatment can be administered in the form of a capsule, a tablet, a liquid, an injectable, a pessary or any combination thereof. In some embodiments, the antimicrobial drug is formulated as an infusion, and can be administered to the subject intravenously via a needle or catheter.
In some cases, the treatment is an anti-inflammatory drug. For example, in some cases, the treatment is a non-steroidal anti-inflammatory drug (NSAID). In some cases, the anti-inflammatory drug is a steroid. In some cases, the drug is a corticosteroid. In some cases, the drug is dexamethasone. In some cases, the drug is prednisone.
In some cases, the treatment is a treatment for COVID-19. In some cases, the treatment is remdesivir. In some cases, the drug is a monoclonal antibody. In some cases, a method provided herein may indicate that the subject has a risk of severe COVID-19 or a risk of not surviving COVID-19, and the subject may be administered a drug to treat or prevent the severe COVID-19, such as remdesivir or a monoclonal antibody.
The present invention is described in further detail in the following examples, which are not in any way intended to limit the scope of the invention as claimed. The attached Figures are meant to be considered as integral parts of the specification and description of the invention. All references cited are herein specifically incorporated by reference for all that is described therein. The following examples are offered to illustrate, but not to limit, the claimed invention.
In the experimental disclosure which follows, the following abbreviations apply: eq (equivalents); M (Molar); μM (micromolar); N (Normal); mol (moles); mmol (millimoles); μmol (micromoles); nmol (nanomoles); g (grams); mg (milligrams); kg (kilograms); μg (micrograms); L (liters); ml (milliliters); μl (microliters); cm (centimeters); mm (millimeters); μm (micrometers); nm (nanometers); ° C. (degrees Centigrade); h (hours); min (minutes); sec (seconds); msec (milliseconds).
This example illustrates plasma mcfDNA metagenomic sequencing. As previously described, plasma mcfDNA metagenomic sequencing can be performed according to Blauwkamp 2019.
Briefly, plasma is spiked with a known concentration of synthetic normalization molecule controls, followed by cell-free DNA extraction. The extracted cfDNA is end-repaired, and adapters containing specific indexes are ligated to the end-repaired cfDNA. The products of the ligation are purified by beads. The cfDNA attached to adapters is amplified with P5 and P7 primers, and the amplified cfDNA is purified.
Purified cfDNA derived from a plasma sample is incorporated into a DNA sequencing library. Sequencing libraries from several plasma samples can be pooled with control samples, purified, and sequenced on Illumina sequencers using a 75-cycle single-end, dual index sequencing kit. Primary sequencing output is demultiplexed, then the reads are quality trimmed, and reads that pass quality filters are aligned against human and synthetic references and set aside. Reads potentially representing human satellite DNA are also filtered via a k-mer-based method; then the remaining reads are aligned with a microorganism reference database, which consists of 20,963 assemblies of high-quality genomic references. Reads with alignments that exhibit both high percent identity and high query coverage are retained, except for reads that are aligned with any mitochondrial or plasmid reference sequences. PCR duplicates are removed based on their alignments. Relative abundances are assigned to each taxon in a sample based on the sequencing reads and their alignments.
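By way of non-limiting illustration, the retention of high-quality alignments, the exclusion of reads aligned with mitochondrial or plasmid references, and the alignment-based removal of PCR duplicates can be sketched in Python as follows. The `Alignment` record, its field names, the reference-type labels, and the numeric cutoffs for percent identity and query coverage are all hypothetical; the disclosure does not state specific cutoff values or a specific duplicate-detection key.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Alignment:
    read_id: str
    reference: str           # name of the reference sequence
    ref_type: str            # assumed labels: "chromosome", "plasmid", "mitochondrion"
    start: int               # leftmost aligned position on the reference
    strand: str              # "+" or "-"
    percent_identity: float  # 0 to 100
    query_coverage: float    # 0 to 100


def filter_alignments(alignments: List[Alignment],
                      min_identity: float = 95.0,   # hypothetical cutoff
                      min_coverage: float = 90.0    # hypothetical cutoff
                      ) -> List[Alignment]:
    """Keep high-quality alignments and drop excluded reads and duplicates."""
    # A read with any plasmid or mitochondrial alignment is excluded entirely.
    excluded_reads = {a.read_id for a in alignments
                      if a.ref_type in ("plasmid", "mitochondrion")}

    kept: List[Alignment] = []
    seen_positions = set()
    for aln in alignments:
        if aln.read_id in excluded_reads:
            continue
        if aln.percent_identity < min_identity or aln.query_coverage < min_coverage:
            continue
        # Assumed duplicate key: same reference, start position, and strand.
        key = (aln.reference, aln.start, aln.strand)
        if key in seen_positions:
            continue
        seen_positions.add(key)
        kept.append(aln)
    return kept
```

The two-pass structure reflects that exclusion is applied per read rather than per alignment, so a read that also aligns to a chromosomal reference is still set aside if it has a plasmid or mitochondrial alignment.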
For each combination of read and taxon, a read sequence probability is defined that accounts for the divergence between the microorganism present in the sample and the reference assemblies in the database. A mixture model is used to assign a likelihood to the complete collection of sequencing reads that included the read sequence probabilities and the (unobserved) abundances of each taxon in the sample. An expectation-maximization algorithm can be applied to compute the maximum likelihood estimate of each taxon abundance. From these abundances, the number of reads arising from each taxon is aggregated up the taxonomic tree. The estimated taxa abundances from the no template control (NTC) samples within the batch are combined to parameterize a model of read abundance arising from the environment with variations driven by counting noise. Statistical significance values are then computed for each estimate of taxon abundance in each patient sample. Taxa that exhibit a high significance level, and that are one of the 1449 taxa within the reportable range, comprise our candidate calls. Final calls are made after additional filtering is applied, which accounts for read location uniformity as well as cross-reactivity risk originating from higher abundance calls. The microorganism calls that pass these filters are reported along with abundances in MPM, as estimated using the ratio between the unique reads for the taxon and the number of observed unique reads of normalization molecules.
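By way of non-limiting illustration, an expectation-maximization estimate of taxon abundances under a simple mixture model can be sketched in Python as follows. This is a textbook EM update for mixture weights, not necessarily the exact model used in the pipeline described above; the function name, the input layout (`read_probs[r][t]` as the read sequence probability of read r under taxon t), and the convergence settings are assumptions.

```python
from typing import List


def em_taxon_abundance(read_probs: List[List[float]],
                       n_iter: int = 200,
                       tol: float = 1e-9) -> List[float]:
    """Estimate relative taxon abundances (mixture weights) by EM."""
    n_taxa = len(read_probs[0])
    abundance = [1.0 / n_taxa] * n_taxa  # start from a uniform guess

    for _ in range(n_iter):
        totals = [0.0] * n_taxa
        # E-step: fractionally assign each read to taxa.
        for probs in read_probs:
            weighted = [a * p for a, p in zip(abundance, probs)]
            norm = sum(weighted)
            if norm == 0.0:
                continue  # read not explained by any taxon; skipped in this sketch
            for t in range(n_taxa):
                totals[t] += weighted[t] / norm

        assigned = sum(totals)
        if assigned == 0.0:
            return abundance  # nothing could be assigned; keep current estimate

        # M-step: new abundance is the share of fractional assignments.
        new_abundance = [x / assigned for x in totals]
        change = max(abs(a - b) for a, b in zip(abundance, new_abundance))
        abundance = new_abundance
        if change < tol:
            break
    return abundance
```

Multiplying the returned abundances by the number of assigned reads gives expected per-taxon read counts, which could then be aggregated up a taxonomic tree as described above.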
The plasma concentration of mcfDNA in each sample is quantified by using the measured relative abundance of the synthetic molecules initially spiked in the plasma.
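By way of non-limiting illustration, the conversion from read counts to MPM can be sketched in Python as follows. The disclosure states that MPM is estimated from the ratio of unique taxon reads to unique normalization-molecule reads; scaling that ratio by the known spiked concentration, as well as the function and parameter names, are assumptions of this sketch.

```python
def estimate_mpm(taxon_unique_reads: int,
                 spike_in_unique_reads: int,
                 spike_in_molecules_per_ul: float) -> float:
    """Estimate molecules per microliter (MPM) for one taxon.

    spike_in_molecules_per_ul is the known concentration of synthetic
    normalization molecules added to the plasma (assumed input).
    """
    if spike_in_unique_reads <= 0:
        raise ValueError("no normalization-molecule reads observed")
    ratio = taxon_unique_reads / spike_in_unique_reads
    return ratio * spike_in_molecules_per_ul
```

For example, under these assumptions a taxon with one quarter as many unique reads as the normalization molecules, spiked at 400 molecules per microliter, would be reported at 100 MPM.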
Forty-two hospitalized patients with COVID-19 were prospectively enrolled and compared with a historical cohort of mechanically ventilated patients with culture-positive (n=27) vs. culture-negative pneumonia (n=40) or no clinical infection (n=18 controls). From plasma samples, mcfDNA-Seq was performed and ten host response biomarkers of innate immunity and epithelial/endothelial injury (IL-6, IL-8, IL-10, RAGE, TNFR1, Angiopoietin-2, Procalcitonin, Fractalkine, Pentraxin-3, ST2) were measured. Levels of mcfDNA were compared between clinical groups, and associations of mcfDNA and biomarker levels were examined with linear regression models.
McfDNA-Seq was successful in 33/42 (79%) baseline samples from patients with COVID-19, with nine samples failing QC requirements. McfDNA was detectable in 21/33 (64%) of COVID-19 samples, a proportion significantly lower than in culture-positive pneumonia (96%), higher than in uninfected controls (33%), and similar to that in culture-negative pneumonia (56%) (between-groups Fisher's exact p<0.001). A similar distribution was seen for mcfDNA levels, with mcfDNA load in COVID-19 being similarly distributed to non-COVID culture-negative pneumonia (
Plasma metagenomics in patients with COVID-19 revealed mcfDNA load of similar magnitude as in critically ill patients without COVID-19 with clinically suspected infection but negative microbiologic cultures. The significant associations of mcfDNA with host inflammation support the biological relevance of detectable circulating mcfDNA. Our preliminary results warrant further study of secondary infections in hospitalized patients with COVID-19 to define the clinical utility of non-invasive molecular diagnostics for antimicrobial treatment guidance.
Fifteen critically ill patients with COVID-19 (confirmed by nasopharyngeal qPCR for SARS-CoV-2) were enrolled in a prospective ICU cohort study. Plasma samples for conducting mcfDNA-Seq were analyzed according to the methods described in Blauwkamp, 2019, Nature Microbiol, 4:663-74 incorporated by reference herein. Detection of mcfDNA was evaluated in the context of clinical diagnoses and prescribed antimicrobial therapies by the treating physicians and examined for associations with clinical outcomes.
Of fifteen patients analyzed (median age 63, 53% females, 73% mechanically ventilated), six (40%) died within 30 days from enrollment. Samples were obtained at a median (interquartile range, IQR) of ten (4-12) days from COVID-19 symptom onset, and each sample contained a median of 837 (111-4638) total mcfDNA molecules per microliter (MPMs) and 2 (1-4) identified organisms. Of the total 92,791 MPMs reported across fifteen samples, 90% belonged to typical pathogenic bacteria (e.g., E. coli and K. pneumoniae), with the remaining MPMs aligned to commensal bacteria (5%, e.g., oral Streptococcus species), fungi (4%, Candida species) and DNA viruses (1%). Compared to survivors, non-survivors had higher total mcfDNA (p=0.04), higher pathogenic bacteria MPMs (p=0.02) and a trend for a higher number of identified organisms per sample (p=0.06). (
Respiratory pathogen MPMs (S. aureus, Ps. aeruginosa and K. pneumoniae) were detected in 3/4 subjects with low suspicion for secondary infection (Group B,
McfDNA-Seq in patients with COVID-19 indicates a higher incidence of probable secondary infections than previously recognized. The significant association between mcfDNA and 30-day mortality suggests that COVID-19 severity may be influenced by circulating bacterial fragments, either from secondary pneumonias or from possible translocation of colonizing microbiota along the disrupted alveolar/epithelial surface of lungs injured by COVID-19. Kitsios, 2019, Open Forum Infect Dis, 6:S138. Integration of mcfDNA detection with clinical data demonstrates opportunity for antibiotic stewardship in patients with suspected infection. On the other hand, the signal for undiagnosed and untreated secondary infections should serve as a call for vigilance and thorough diagnostic workup in patients with severe COVID-19.
A nested case-control study of mechanically ventilated patients with and without severe pneumonia from an ICU cohort was conducted. Community- or hospital-acquired pneumonia was defined per established criteria (Gong, 2005, Crit Care Med, 33:1191-98). Patients were classified as culture-positive when pathogenic microbial species were isolated from respiratory specimen or blood cultures vs. culture-negative when there was no growth in either culture, or only normal respiratory flora were reported in respiratory cultures. The radiologic severity index (RSI) was quantified on the first available chest radiograph post-intubation, and clinical pulmonary infection scores (CPIS) were calculated from available data. See Zilberberg, 2010, Clin Infect Dis, 51, S131-35; and Sheshadri, 2019, BMJ Open Respir Res, 6:e000471, herein incorporated by reference in their entirety. Uninfected controls were patients intubated for airway protection or for hypoxemia from decompensated congestive heart failure. Plasma mcfDNA metagenomics was conducted as disclosed in Example 1. Nine host-response biomarkers were measured, and patients were classified in a hyper- vs. hypo-inflammatory sub-phenotype. Metagenomic sequences were quantified as mcfDNA molecules per microliter (MPMs). Clinical variables were compared with biomarker and mcfDNA levels between the three clinical groups (culture-positive pneumonia, culture-negative pneumonia, and uninfected controls) with non-parametric tests and post-hoc adjustments for pairwise comparisons. Associations between biomarkers and mcfDNA concentration (MPMs) were examined with multivariate adjusted linear models following log transformation.
Clinical cohort and sample collection—A convenience sample of consecutive, adult patients intubated and mechanically ventilated was prospectively enrolled. Upon enrollment blood samples were collected for centrifugation, separation of plasma and quantification of host inflammation response biomarkers as well as mcfDNA metagenomic sequencing.
Plasma biomarker measurement—A custom Luminex multi-analyte panel (R&D Systems, Minnesota) was constructed to measure plasma levels of biomarkers with established prognostic utility in pneumonia and Acute Respiratory Distress Syndrome (ARDS), including fractalkine, interleukin (IL)-6, IL-8, pentraxin-3, procalcitonin, receptor for advanced glycation end products (RAGE), suppression of tumorigenicity (ST)-2, and tumor necrosis factor receptor (TNFR)-1.
Hyper- and hypo-inflammation sub-phenotype assignment—A 4-variable parsimonious model was used for classification of patients into a hyper- vs. hypo-inflammatory sub-phenotype of host-responses, previously defined by latent class analysis utilizing several clinical and biomarker variables. Drohan, 2020, Host-Response Subphenotypic Classification with A Parsimonious Model Offers Prognostic Information in Patients with Acute Respiratory Failure: A Prospective Cohort Study, doi:10.21203/rs.3.rs-57907/v1. The logit of the probability of hypo-inflammatory sub-phenotype classification was calculated as 0.8739604 − 8.798345e-05*(angiopoietin-2) − 6.049412e-04*(procalcitonin) − 4.048723e-04*(TNFR-1) + 2.883218e-01*(bicarbonate).
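By way of non-limiting illustration, the logit expression above can be evaluated and converted to a probability in Python as follows. The function name and argument names are assumptions; the measurement units must match those of the original model and are not restated here. The TNFR-1 coefficient is taken here as 4.048723e-04, on the assumption that it is of the same order of magnitude as the neighboring biomarker coefficients. No probability cutoff for sub-phenotype assignment is implied by this sketch.

```python
import math


def hypo_inflammatory_probability(angiopoietin_2: float,
                                  procalcitonin: float,
                                  tnfr_1: float,
                                  bicarbonate: float) -> float:
    """Probability of the hypo-inflammatory sub-phenotype from the 4-variable model."""
    logit = (0.8739604
             - 8.798345e-05 * angiopoietin_2
             - 6.049412e-04 * procalcitonin
             - 4.048723e-04 * tnfr_1      # exponent read as e-04 (assumption)
             + 2.883218e-01 * bicarbonate)

    # Numerically stable inverse logit (avoids overflow for large |logit|).
    if logit >= 0:
        return 1.0 / (1.0 + math.exp(-logit))
    e = math.exp(logit)
    return e / (1.0 + e)
```

Because the biomarker coefficients are negative and the bicarbonate coefficient is positive, higher angiopoietin-2, procalcitonin, or TNFR-1 lowers the estimated probability of the hypo-inflammatory sub-phenotype, while higher bicarbonate raises it.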
To determine an association of mcfDNA and biomarker with inflammation prognosis, twenty-seven culture-positive pneumonia patients, forty culture-negative pneumonia patients, and sixteen uninfected controls were examined. Data of Table 1 are presented as median with interquartile ranges for continuous variables and N with percentage for categorical variables. P-values for comparisons between the three clinical categories were obtained from the Kruskal-Wallis test for continuous variables and Fisher's exact test for categorical variables. P-values for the comparison between culture-positive vs. -negative pneumonia patients were adjusted for multiple testing with Benjamini-Hochberg correction post-hoc from three group comparisons. P-values for the comparison between patients with pneumonia (both culture-positive and negative) vs. controls were obtained from the Wilcoxon test for continuous variables and Fisher's exact test for categorical variables. Among the sixteen uninfected controls, twelve patients were intubated for airway protection without any evidence of respiratory infection, and the remaining four were intubated for cardiogenic pulmonary edema from decompensated congestive heart failure.
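By way of non-limiting illustration, the Benjamini-Hochberg adjustment mentioned above can be sketched in Python as follows. This is the standard step-up procedure; the function name is an assumption, and the p-values in the usage note are made-up inputs rather than study data.

```python
from typing import List, Sequence


def benjamini_hochberg(p_values: Sequence[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        candidate = p_values[idx] * m / rank
        running_min = min(running_min, candidate)
        adjusted[idx] = running_min
    return adjusted
```

For instance, `benjamini_hochberg([0.01, 0.04, 0.03])` scales each p-value by the number of tests divided by its rank and then caps each value at the adjusted value of the next larger p-value.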
Patients with pneumonia (culture-positive or negative) had fewer ventilator-free days, higher CPIS, RSI, and levels of inflammatory biomarkers compared to controls (Table 1, p<0.05). Culture-positive patients had higher circulating mcfDNA compared to other groups (post-hoc p<0.001,
For host response, only pentraxin-3 was significantly elevated in the culture-positive vs. culture-negative participants among patients with pneumonia (post hoc p=0.05, Table 1). Linear regression models were built comparing plasma biomarkers (outcomes) to plasma mcfDNA levels (predictor) in unadjusted as well as adjusted models for a priori selected potential confounders.
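By way of non-limiting illustration, an unadjusted linear model of a log-transformed biomarker against log-transformed mcfDNA can be sketched in Python as follows. The use of base-10 logarithms, the function name, and the requirement that all inputs be strictly positive are assumptions of this sketch; the handling of samples with undetectable mcfDNA and the adjusted (multivariate) models are not shown.

```python
import math
from typing import Sequence, Tuple


def log_log_regression(mcfdna_mpm: Sequence[float],
                       biomarker: Sequence[float]) -> Tuple[float, float]:
    """Least-squares fit of log10(biomarker) = intercept + slope * log10(MPM).

    Returns (slope, intercept). All inputs must be strictly positive.
    """
    if len(mcfdna_mpm) != len(biomarker) or len(mcfdna_mpm) < 2:
        raise ValueError("need two equal-length series with at least two points")

    x = [math.log10(v) for v in mcfdna_mpm]   # predictor: mcfDNA (MPM)
    y = [math.log10(v) for v in biomarker]    # outcome: plasma biomarker level
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n

    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0.0:
        raise ValueError("predictor has no variance")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept
```

On the log-log scale, the slope can be read as the proportional change in the biomarker associated with a proportional change in mcfDNA, which is one reason both quantities are transformed before fitting.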
Table 4 reports the results for each regression model of calculations of estimated regression coefficients, 95% confidence intervals, and p values for significance of mcfDNA vs. plasma inflammatory biomarkers. Analyses were done for total mcfDNA, as well as for mcfDNA corresponding to recognized respiratory pathogens. All mcfDNA MPMs and biomarker measurements were log transformed; regression models with p<0.05 are shown in bold. In univariate linear regression models of host-response biomarkers against mcfDNA in patients with pneumonia, significant associations were detected for fractalkine, interleukin-8, procalcitonin, pentraxin-3, suppression of tumorigenicity-2 (ST-2), and soluble tumor necrosis factor receptor-1 (TNFR-1) levels (all p<0.05,
The results revealed a novel link between circulating mcfDNA and systemic inflammation in patients with severe pneumonia, suggesting a biological microbe-host interaction in the systemic circulation. Circulating mcfDNA was associated with the intensified inflammatory host-responses, which have been reproducibly associated with worse clinical outcomes in severe pneumonia. Kitsios 2019. The discovery of a higher mcfDNA load in patients assigned to the hyperinflammatory sub-phenotype also linked microbiota and patient-level outcomes. McfDNA of respiratory pathogens was detected in 82% and 38% of culture-positive and -negative patients, respectively (Table 2). Of these, one or more previously identified pneumonia pathogens were found in 12/18 (67%) of critically ill patients with pneumonia.
Notably, the significant associations between mcfDNA and fractalkine, procalcitonin, pentraxin-3 and ST-2 were independent of our radiographic (RSI) and biomarker (RAGE) measurements of the degree of lung injury. Microbial DNA is an established pathogen-associated molecular pattern (PAMP) that can stimulate pattern recognition receptors (PRRs) in innate immune cells to activate downstream inflammatory signaling. See, e.g., Mogensen, 2009, Clin Microbiol Rev, 22:240-73.
It is understood that the examples and embodiments described herein are for illustrative purposes only and that various modifications or changes in light thereof will be suggested to persons skilled in the art and are to be included within the spirit and purview of this application and scope of the appended claims. All publications, patents, and patent applications cited herein are hereby incorporated by reference in their entirety for all purposes.
$= SOFA score calculation did not include the neurologic component, as all patients were intubated and receiving sedative medications, which impaired our ability to perform assessment of Glasgow Coma Scale in a consistent and reproducible manner.
Actinomyces viscosus
Aspergillus fumigatus
Aspergillus niger
Bacteroides fragilis
Citrobacter koseri
Enterobacter cloacae complex
Enterococcus avium
Enterococcus faecalis
Enterococcus faecium
Escherichia coli
Fusobacterium nucleatum
Haemophilus influenzae
Haemophilus parainfluenzae
Klebsiella michiganensis
Klebsiella pneumoniae
Klebsiella variicola
Porphyromonas gingivalis
Pseudomonas aeruginosa
Staphylococcus aureus
Staphylococcus aureus (MSSA)
Streptococcus anginosus
Streptococcus intermedius
Streptococcus mitis
Streptococcus parasanguinis (AKA Streptococcus parasanguinius)
Streptococcus pneumoniae
Aggregatibacter segnis
Atopobium vaginae
Bacteroides distasonis
Bacteroides merdae
Bacteroides ovatus
Bacteroides thetaiotaomicron
Bacteroides uniformis
Bacteroides vulgatus
Bifidobacterium longum
Campylobacter concisus
Campylobacter curvus
Candida albicans
Candida dubliniensis
Candida glabrata
Candida tropicalis
Clostridium butyricum
Clostridium innocuum
Corynebacterium striatum
Gardnerella vaginalis
Gemella haemolysans
Gemella morbillorum
Gemella sanguinis
Haemophilus haemolyticus
Haemophilus parahaemolyticus
Helicobacter pylori
Lactobacillus crispatus
Lactobacillus fermentum
Lactobacillus gasseri
Malassezia furfur
Megasphaera micronuciformis
Morococcus cerebrosus
Neisseria flavescens
Neisseria mucosa
Neisseria sicca
Prevotella melaninogenica
Prevotella oris
Rothia dentocariosa
Rothia mucilaginosa
Saccharomyces cerevisiae
Staphylococcus haemolyticus
Streptococcus agalactiae
Streptococcus dentisani
Streptococcus oralis
Streptococcus salivarius
Streptococcus thermophilus
Streptococcus tigurinus
Streptococcus vestibularis
Sutterella wadsworthensis
Torque teno virus
Veillonella dispar
Veillonella parvula
Escherichia coli, 150
Escherichia coli, 546
Streptococcus pneumoniae, 8537, Klebsiella pneumoniae,
Escherichia coli, 978
Escherichia coli, 114
parasanguinis, 4592, Streptococcus anginosus, 4362,
Escherichia coli, 3412, Streptococcus vestibularis, 3207,
Saccharomyces cerevisiae, 96
Streptococcus salivarius, 448, Streptococcus parasanguinis,
Streptococcus thermophilus, 10810, Neisseria mucosa,
Streptococcus mitis, 2775, Prevotella melaninogenica, 318,
Streptococcus intermedius, 944
aureus, Light Normal
haemolyticus, 800, Herpes simplex virus type 1 (HSV-1),
Staphylococcus aureus, 475
aureus
Streptococcus pneumoniae, 777, Aggregatibacter segnis,
pneumoniae, Heavy
Staphylococcus aureus
Staphylococcus aureus, 45941
Staphylococcus
aureus;
aureus, Light NRF
Escherichia coli, 6461, Bacteroides vulgatus, 751
Streptococcus agalactiae, 656, Haemophilus influenzae,
agalactiae), Light Normal
Staphylococcus aureus, 95, Actinomyces viscosus, 93,
Streptococcus parasanguinius (Sequencing passed
Staphylococcus aureus (MSSA), 1375, Streptococcus mitis,
salivarius, 83, Klebsiella variicola, 59
Streptococcus mitis, 1312, Candida tropicalis, 372,
Staphylococcus aureus, 305, Prevotella melaninogenica,
aureus (MSSA), 24360
aureus
Escherichia coli, 122894, Streptococcus agalactiae, 6735,
Staphylococcus aureus, 5151, Bacteroides thetaiotaomicron,
Streptococcus
constellatus;
Staphylococcus
aureus
aureus(MRSA), NRF
Escherichia coli, 6733, Bacteroides fragilis, 705,
aureus(MRSA), Light
agalactiae), Light NRF
Streptococcus mitis, 307
Staphylococcus
aureus
agalactiae), Light NRF
0.0012
0.0020
0.0119
0.0020
0.0008
0.0027
0.0102
0.0155
0.0036
0.0199
0.0218
0.0447
0.0151
0.0296
0.0028
0.0134
0.0097
0.0103
0.0279
J Clin Microbiol, 44: 160-65) that considered dosing duration,
While preferred embodiments of the present invention have been shown and described herein, such embodiments are provided by way of example only. Numerous variations, changes, and substitutions are possible within the scope of the invention. It should be understood that various alternatives to the embodiments of the invention described herein may be employed in practicing the invention. It is intended that the following claims define the scope of the invention and that methods and structures within the scope of these claims and their equivalents be covered thereby.
This application is a continuation of International Application No. PCT/US2021/064445, filed Dec. 20, 2021, which claims the benefit of U.S. Provisional Patent Application No. 63/128,552, filed Dec. 21, 2020, U.S. Provisional Patent Application No. 63/199,497, filed Jan. 3, 2021, and U.S. Provisional Patent Application No. 63/139,245, filed Jan. 19, 2021, which are herein incorporated by reference in their entireties.
Number | Date | Country | |
---|---|---|---|
20240132978 A1 | Apr 2024 | US |
Number | Date | Country | |
---|---|---|---|
63128552 | Dec 2020 | US | |
63199497 | Jan 2021 | US | |
63139245 | Jan 2021 | US |
Number | Date | Country | |
---|---|---|---|
Parent | PCT/US2021/064445 | Dec 2021 | WO |
Child | 18338128 | US |