The present disclosure relates to diagnostic systems and methods for depleting non-target (e.g., human, animal, and plant) nucleic acids from a sample to enrich target (e.g., microbial) nucleic acids by immobilized adsorption, and also relates to diagnostic systems and methods for identifying target microorganisms and/or resistance genes from the sequences of the enriched target nucleic acids.
Rapid and accurate recognition of pathogens and antimicrobial resistance is crucial for improving patient health. Currently, the “gold standard” method for clinical diagnostics is based on phenotypic analysis of microbial culture. However, this diagnostic process takes from at least 24 hours to several days to yield a preliminary answer from bacterial growth and tests in a clinical microbiology laboratory. It therefore cannot provide timely guidance in the initial stage of treatment for a patient with an infectious disease, such as bacteremia, sepsis, or pneumonia, which may deteriorate quickly and become life-threatening. Accordingly, a patient suffering from sepsis may receive ineffective or excessive antibiotic treatment, and such inappropriate use of antibiotics could lead to the emergence of multidrug-resistant pathogens.
The technology typically applied for rapid detection of pathogens is nucleic acid amplification technology (NAAT), which has been applied in, for example, the diagnosis of sepsis [e.g., Septifast (Roche Diagnostics, Mannheim, Germany)] and respiratory tract infection [e.g., FilmArray Respiratory Panel (Biofire Defense, Salt Lake City, USA)]. Nevertheless, NAAT is limited by primer design, such that the detection of different target pathogens and resistance genes can only be performed in different reactions. Taking the FilmArray Blood Culture Identification (BCID) Panel for example, only 33 specific target pathogens and 10 specific resistance genes can be detected thereby. Therefore, most pathogens and resistance genes fall outside the scope of such panels; in particular, rare pathogens and uncommon resistance genes are hardly identifiable, such that traditional microbiological culture cannot be completely replaced with NAAT. There is thus still an urgent need for a universal diagnostic technology that can rapidly identify as many pathogens (such as viruses, bacteria, and fungi) and resistance genes as possible.
Recently, next-generation DNA sequencing (NGS), including Illumina, PacBio, and Nanopore sequencing platforms, has widely been used to obtain DNA sequences for accurate identification of pathogens and resistance genes, and for other applications, such as genotyping. However, the application of NGS in the identification of pathogens and resistance genes is faced with a big challenge as clinical specimens or blood cultures usually contain a large amount of non-target (e.g., human, animal, and plant) nucleic acids. It means that only a very small amount of the sequences generated from NGS can be used in the identification of pathogens and resistance genes, which may lead to low sensitivity in detecting pathogens due to the low abundance of target DNA sequences. Also, filtering out host sequences from a large amount of raw data is time-consuming and highly dependent on computational capability.
Nowadays, several approaches have been developed for the depletion of non-target nucleic acids in specimens. MolYsis Basic 5 Kit (Molzym, Bremen, Germany) utilizes a nuclease to digest non-target nucleic acids, while the extracted nucleic acid fragments of bacteria are relatively short, and thus it would be difficult to generate long sequence reads. NEBNext Microbiome DNA Enrichment Kit (New England Biolabs, Inc., USA) utilizes a monoclonal antibody capable of specifically binding the methylated CpG island of the human genome; however, DNA methylation is unevenly distributed across the human genome, and this kit is not cost-effective for routine examination. QIAamp BiOstic Bacteremia DNA Kit (QIAGEN, Hilden, Germany) utilizes multiple centrifugation steps to separate host cells according to the difference in cell density. However, there is still an unmet need to provide a fast and cost-effective strategy for the identification of pathogens as well as features associated with antibiotic resistance in a clinical setting and general microbiological laboratories.
In view of the foregoing, the present disclosure provides a diagnostic system and a method for depleting non-target nucleic acids from specimens by immobilized adsorption, thereby enriching target nucleic acids therein. The diagnostic system and the method provided herein have a variety of applications, including, for example, the identification of bacterial species and resistance genes through the pretreatment of a biological sample obtained from a host.
In at least one embodiment of the present disclosure, a method for enriching a target (e.g., bacterial) nucleic acid in a sample is provided. The method comprises providing a sample including a target microorganism and a non-target cell that originate from different species; adding a non-ionic surfactant to the sample to lyse the non-target cell and release a non-target nucleic acid from the non-target cell; contacting the sample with a solid phase adsorbent to bind free nucleic acids (including the non-target nucleic acid) in the sample; and removing the solid phase adsorbent and the nucleic acids thereon, thereby enriching the target nucleic acid contained in the target microorganism in the sample.
In at least one embodiment of the present disclosure, a diagnostic system for identifying a target microorganism and/or a resistance gene in a sample is provided. The diagnostic system comprises a cell lysis unit configured to lyse a non-target cell in the sample, wherein the target microorganism and the non-target cell originate from different species; a target nucleic acid enrichment unit equipped with an immobilized adsorption device and configured to deplete a nucleic acid of the lysed non-target cell, thereby enriching a nucleic acid of the target microorganism in the sample; a sequencing unit configured to sequence the enriched nucleic acid of the target microorganism; and a sequence analysis unit connected to the sequencing unit and configured to receive sequencing data generated by the sequencing unit and to compare the sequencing data with a microbial genome database and/or a resistance gene database, thereby producing an identification result of the target microorganism and/or the resistance gene carried by the target microorganism.
In at least one embodiment of the present disclosure, the immobilized adsorption device comprises a solid phase adsorbent, and the cell lysis unit comprises a non-ionic surfactant. In some embodiments, the lysis of the non-target cell is performed in an alkaline environment. In some embodiments, the solid phase adsorbent used in the present disclosure does not contain an antibody. In some embodiments, the binding or removal of non-target nucleic acids or free nucleic acids in the sample by the immobilized adsorption device is not based on the principle of antibody-antigen interaction.
In at least one embodiment of the present disclosure, the diagnostic system further comprises a target microorganism amplification unit configured to amplify an amount of the target microorganism or a nucleic acid thereof. In some embodiments, the target microorganism amplification unit comprises a blood culture device.
In at least one embodiment of the present disclosure, the sequencing unit is at least one of a next-generation sequencing platform, a high-throughput sequencing platform, an Illumina sequencing platform, a Nanopore sequencing platform, a PacBio sequencing platform, and a Sanger sequencing platform.
In at least one embodiment of the present disclosure, in order to identify microorganisms and resistance genes and/or predict antimicrobial resistance (AMR) of the microorganisms, the sequencing data to be compared are subjected to the following procedures through the microorganism comparison software and/or the resistance gene interpretation software: obtaining an index of the indicated length sequence in the sequencing data to be compared; correcting and assembling the microbial genome and bacterial plasmid sequences; reading the corresponding sequence from the reference gene sequence according to the index; and determining whether the corresponding sequence and the sequencing data to be compared are the same or not, thereby producing an identification result.
In at least one embodiment of the present disclosure, the sequence analysis unit is further configured to analyze the resistance gene carried by the target microorganism, e.g., an antimicrobial resistance gene. In some embodiments, the sequence analysis unit is further configured to calculate at least one parameter selected from the number of effective sequences for alignment, coverage, coverage depth, relative abundance, and degree of dispersion, thereby producing the identification result of the target microorganism and/or the resistance gene carried by the target microorganism.
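As a minimal sketch of how some of the parameters named above could be computed, the following Python example derives the number of effective (aligned) sequences, the coverage, and the mean coverage depth from a list of alignment intervals on a reference genome. The function name `coverage_stats`, the half-open `(start, end)` interval representation, and the per-base counting approach are illustrative assumptions and are not taken from the disclosure; the degree of dispersion is not computed here.

```python
from typing import Dict, List, Tuple


def coverage_stats(alignments: List[Tuple[int, int]], genome_length: int) -> Dict[str, float]:
    """Summarize aligned reads against one reference genome.

    alignments: hypothetical half-open (start, end) intervals, 0-based,
                one per aligned read.
    genome_length: length of the reference genome in bases.
    """
    if genome_length <= 0:
        raise ValueError("genome_length must be positive")

    # Per-base depth; simple but memory-hungry, fine for a small illustration.
    depth = [0] * genome_length
    for start, end in alignments:
        for pos in range(max(start, 0), min(end, genome_length)):
            depth[pos] += 1

    covered_bases = sum(1 for d in depth if d > 0)
    return {
        "effective_reads": len(alignments),          # reads usable for alignment
        "coverage": covered_bases / genome_length,   # fraction of genome covered
        "mean_depth": sum(depth) / genome_length,    # average coverage depth
    }


if __name__ == "__main__":
    print(coverage_stats([(0, 50), (25, 75)], genome_length=100))
```

In this toy input, two 50-base reads overlap by 25 bases on a 100-base reference, so three quarters of the reference is covered at a mean depth of one.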
In at least one embodiment of the present disclosure, the sequence analysis unit generates sequencing data with at least 20 times the genome size of the target microorganism. In some embodiments, the sequencing data that are generated by the sequence analysis unit within, for example, 15 min throughput or have at least one time the genome size of the target microorganism are used to calculate the distribution of the microorganism greater than 1% of the total sequence reads, as the basis for the relative abundance of the target microorganism in the sample. In some embodiments, the sequence analysis unit is further configured to detect complete resistance genes, the subtypes thereof, and resistance-relevant mutations in the target microorganism within, for example, 6 hours, thereby predicting antimicrobial resistance of the target microorganism.
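The two thresholds in this paragraph can be illustrated with a short Python sketch: one helper expresses the sequencing yield as a multiple of the genome size and checks it against the 20-times criterion, and another keeps only the microorganisms whose share of the total sequence reads exceeds 1%. The function names, the per-taxon read-count dictionary, and the example numbers are hypothetical and serve only to make the arithmetic concrete.

```python
from typing import Dict


def depth_multiple(total_bases: int, genome_size: int) -> float:
    """Sequencing yield expressed as a multiple of the genome size."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return total_bases / genome_size


def meets_depth(total_bases: int, genome_size: int, min_multiple: float = 20.0) -> bool:
    """True when the yield reaches the required multiple (20x by default)."""
    return depth_multiple(total_bases, genome_size) >= min_multiple


def taxa_above_threshold(read_counts: Dict[str, int], min_fraction: float = 0.01) -> Dict[str, float]:
    """Relative abundance of taxa whose share of all reads exceeds min_fraction."""
    total = sum(read_counts.values())
    if total == 0:
        return {}
    fractions = {taxon: count / total for taxon, count in read_counts.items()}
    return {taxon: f for taxon, f in fractions.items() if f > min_fraction}


if __name__ == "__main__":
    # Hypothetical run: 100 Mb of sequence for a 5 Mb genome.
    print(depth_multiple(100_000_000, 5_000_000))
    print(taxa_above_threshold({"taxon_A": 980, "taxon_B": 15, "taxon_C": 5}))
```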
In at least one embodiment of the present disclosure, a method for using the diagnostic system is also provided. The method comprises providing a sample including a target microorganism and a non-target cell that originate from different species; lysing the non-target cell by the cell lysis unit; depleting free nucleic acids, especially the non-target nucleic acid released from the non-target cell, in the sample by the target nucleic acid enrichment unit, thereby enriching a target nucleic acid of the target microorganism in the sample; sequencing the enriched nucleic acid by the sequencing unit; and producing an identification result of the target microorganism and/or the resistance gene carried by the target microorganism by the sequence analysis unit.
In at least one embodiment of the present disclosure, the lysis of the non-target cell comprises adding the non-ionic surfactant to the sample by the cell lysis unit. In some embodiments, the depletion of the free nucleic acids comprises contacting the sample with the solid phase adsorbent by the target nucleic acid enrichment unit, and removing the solid phase adsorbent and the free nucleic acids thereon, thereby enriching the target nucleic acid in the sample.
In at least one embodiment of the present disclosure, a method for enriching a target nucleic acid in a sample is also provided. The method comprises providing a sample including a target microorganism and a non-target cell that originate from different species; lysing the non-target cell by a cell lysis unit of a diagnostic system to release a non-target nucleic acid from the non-target cell, and depleting the non-target nucleic acid by a target nucleic acid enrichment unit of the diagnostic system, thereby enriching the target nucleic acid of the target microorganism in the sample. In some embodiments, the target nucleic acid enrichment unit of the diagnostic system comprises an immobilized adsorption device containing a solid phase adsorbent. In some embodiments, the depletion of the non-target nucleic acid comprises contacting the sample with the solid phase adsorbent to bind the free nucleic acids, and removing the solid phase adsorbent, thereby enriching the target nucleic acids in the sample.
In at least one embodiment of the present disclosure, the method further comprises sequencing the enriched nucleic acid by a sequencing assay to generate sequencing data, and comparing the sequencing data with a microbial genome database and/or a resistance gene database, thereby producing an identification result of the target microorganism and/or the resistance gene carried by the target microorganism.
In at least one embodiment of the present disclosure, the solid phase adsorbent is selected from the group consisting of a silica magnetic bead, a silica bead, a column extraction membrane, an alkyl-bonded silica gel, a biochar, a cellulose, an anion exchange resin, and any combination thereof. The hydrogen bonding, hydrophobic interactions, and electrostatic interactions between the cationic portion of the adsorbent and the negatively charged phosphate groups of nucleic acids may be the driving force for the binding. In some embodiments, the solid phase adsorbent may be a silica magnetic bead or based on a silica magnetic bead. In some embodiments, the solid phase adsorbent may be controlled by salts and pH value; for example, the solid phase adsorbent may bind nucleic acids in an alkaline environment. In some embodiments, the surface of the silica magnetic bead may be further modified with a silane-modified polymer, including but not limited to tetramethoxysilane (TMOS), tetraethoxysilane (TEOS), and 3-aminopropyltriethoxysilane (APTES). In some embodiments, the solid phase adsorbent used in the present disclosure does not contain an antibody. In some embodiments, the method of the present disclosure does not include binding or removing non-target nucleic acids or free nucleic acids in the sample based on the principle of antibody-antigen interaction.
In at least one embodiment of the present disclosure, the non-ionic surfactant is selected from the group consisting of saponin, Tween, Triton, polyoxyethylene (10) oleyl ether (e.g., BrijO10), polyol, a polyoxyethylene-polyoxypropylene copolymer, polyoxyethylene ether, alkyl ethanolamide, glucoside, fatty alcohol, and any combination thereof. In some embodiments, the method further comprises incubating the non-ionic surfactant and the sample under an alkaline condition to separate the non-target nucleic acid from the non-target cell.
In at least one embodiment of the present disclosure, the target nucleic acid comprises at least one of a pathogenic nucleic acid, a microbial nucleic acid, a bacterial nucleic acid, a viral nucleic acid, a fungal nucleic acid, an algae nucleic acid, a protozoan nucleic acid, and a parasitic nucleic acid. In some embodiments, the target nucleic acid may be a bacterial nucleic acid. In some embodiments, the target nucleic acid may originate from a bacterium, e.g., an antibiotic-resistant bacterium. In some embodiments, the target nucleic acid may be a bacterial plasmid or a fragment thereof, e.g., a resistance gene.
In at least one embodiment of the present disclosure, the non-target cell is a eukaryotic host, such as an animal host. In some embodiments, the non-target nucleic acid originates from an animal host. In some embodiments, the animal host is a mammalian host. In some embodiments, the sample comprises a mammalian host nucleic acid and a nucleic acid originating from a pathogen in the mammalian host. In some embodiments, the sample is obtained from a human host and comprises a human host nucleic acid and a non-human nucleic acid.
In at least one embodiment of the present disclosure, the sample may be an environmental sample obtained from dust, soil, water, air, artificial water system, food, and the like. In some embodiments, the sample may be a biological sample obtained from a host suffering or suspected of suffering from an infectious disease. In some embodiments, the infectious disease includes, but is not limited to, bacteremia, sepsis, and pneumonia.
In at least one embodiment of the present disclosure, a method for identifying a target microorganism and/or a resistance gene in a biological sample is also provided. In some embodiments, the method of the present disclosure comprises providing the biological sample from a subject infected or suspected of being infected by the pathogen, adding a non-ionic surfactant to the biological sample, contacting the biological sample with a solid phase adsorbent to bind a non-target nucleic acid originating from the subject, removing the solid phase adsorbent, thereby enriching a nucleic acid of the pathogen in the biological sample, and sequencing the enriched nucleic acid of the pathogen by a sequencing assay.
In at least one embodiment of the present disclosure, the biological sample is selected from the group consisting of blood, serum, plasma, urine, sputum, saliva, cerebrospinal fluid, interstitial fluid, mucous, sweat, stool extract, fecal matter, synovial fluid, tears, semen, peritoneal fluid, nipple aspirates, milk, vaginal fluid, and any combination thereof.
In at least one embodiment of the present disclosure, depending on the amount of target nucleic acids in the biological sample, the method provided herein may further comprise preferentially amplifying the target microorganism, the pathogen, the target nucleic acid, and/or the nucleic acid of the pathogen in the biological sample before the addition of the non-ionic surfactant. For example, the biological sample is a blood sample that is obtained from a subject suffering from sepsis and has been preferentially subjected to blood culture. In some embodiments, the sample suitable to the method of the present disclosure may be a blood culture sample identified as positive by the continuous monitoring blood culture system (such as a blood sample identified as containing microorganisms by the Gram staining process). In some embodiments, the method provided herein further comprises removing a red blood cell from the blood sample.
In at least one embodiment of the present disclosure, the sequencing assay is selected from the group consisting of a next-generation sequencing assay, a high-throughput sequencing assay, an Illumina sequencing assay, a Nanopore sequencing assay, a PacBio sequencing assay, a Sanger sequencing assay, and any combination thereof. In some embodiments, the sequencing assay may be a Nanopore sequencing assay.
In at least one embodiment of the present disclosure, the target nucleic acid or the nucleic acid of the pathogen enriched by the method provided herein has at least 2,000 nucleotides (nt) in length. For example, the enriched target nucleic acid or the enriched nucleic acid of the pathogen to be sequenced has at least 2,000 nt, at least 2,500 nt, at least 3,000 nt, at least 3,500 nt, at least 4,000 nt, at least 4,500 nt, at least 5,000 nt, at least 5,500 nt, at least 6,000 nt, at least 6,500 nt, or at least 7,000 nt in length.
In at least one embodiment of the present disclosure, the method provided herein results in at least a 10-fold enrichment of the target nucleic acid or the nucleic acid of the pathogen originally comprised within the biological sample. For example, the method results in at least a 10-fold, at least a 100-fold, at least a 1,000-fold, at least a 10,000-fold, or at least a 100,000-fold enrichment of the target nucleic acid or the nucleic acid of the pathogen originally comprised within the biological sample. In some embodiments, with the enrichment method provided herein, the target nucleic acid or the nucleic acid of the pathogen accounts for more than 50%, e.g., more than 55%, more than 60%, more than 65%, more than 70%, more than 75%, more than 80%, more than 85%, more than 90%, more than 95%, and more than 99%, in the biological sample, based on the total amount of nucleic acids therein.
In at least one embodiment of the present disclosure, the method provided herein further comprises extracting the enriched nucleic acid of the pathogen from the biological sample prior to the sequencing. In some embodiments, the method provided herein further comprises identifying a resistance gene carried by the pathogen based on a sequencing result. In some embodiments, identifying the resistance gene is performed with sequencing data amounting to at least 20 times (such as at least 25 times, at least 30 times, at least 40 times, at least 50 times, at least 60 times, and at least 70 times) the genome size of the pathogen.
In at least one embodiment, the diagnostic system and the method of the present disclosure are effective in selectively depleting a non-target nucleic acid (e.g., a host nucleic acid) and providing high-quality pathogenic DNA that may be subjected to rapid sequencing, thereby generating long sequence reads for assembling the entire genome of the pathogen. Hence, the present disclosure is useful in eliminating the interference of non-target nucleic acids as well as accelerating and improving the bioinformatics analysis to effectively identify the species of pathogens and the resistance genes thereof.
For a full understanding of this disclosure, reference should be made to the following detailed descriptions, taken in connection with the accompanying drawings.
The description discloses some embodiments in such detail that a person skilled in the art can utilize the embodiments based on the disclosure. Not all steps or features of the embodiments are discussed in detail, as many of the steps or features will be obvious to a person skilled in the art based on this disclosure.
As used in this disclosure, the singular forms “a,” “an,” and “the” include plural referents unless the content clearly dictates otherwise. As used herein, the term “and” is intended to be inclusive unless otherwise indicated. As used herein, the term “or” is generally employed in its sense including “and/or” unless the context clearly dictates otherwise.
As used herein, the term “about” refers to a degree of deviation for a property, composition, amount, value, or parameter as identified, such as deviations based on experimental errors, measurement errors, approximation errors, calculation errors, standard deviations from a mean value, routine minor adjustments, and so forth.
As used herein, the terms “comprising,” “having,” “including,” and “containing” are to be construed as open-ended terms (i.e., meaning “including, but not limited to”) unless otherwise noted.
The present disclosure is directed to a method for enriching a target nucleic acid in a sample, e.g., a biological sample obtained from a host suffering or suspected of suffering from an infectious disease. In at least one embodiment, the sample comprises a non-target nucleic acid originating from the host and a target nucleic acid originating from a non-host source. In at least one embodiment, the method increases a ratio of the target nucleic acid relative to the non-target nucleic acid in the sample by at least 10-fold.
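The enrichment criterion above can be made concrete with a small Python sketch that computes the fold change in the target-to-non-target ratio and the target share of total nucleic acids. The function names and the example quantities are hypothetical; the inputs may be any consistent measure of nucleic acid amount (for example, mass or read counts), which is an assumption of this sketch rather than a statement of the disclosure.

```python
def fold_enrichment(target_before: float, non_target_before: float,
                    target_after: float, non_target_after: float) -> float:
    """Fold change of the target:non-target ratio after depletion.

    Computed as (target_after / non_target_after) divided by
    (target_before / non_target_before), rearranged to a single division.
    """
    denominator = non_target_after * target_before
    if denominator == 0:
        raise ValueError("target_before and non_target_after must be non-zero")
    return (target_after * non_target_before) / denominator


def target_fraction(target: float, non_target: float) -> float:
    """Share of target nucleic acid in the total nucleic acid amount."""
    total = target + non_target
    if total == 0:
        raise ValueError("total nucleic acid amount must be non-zero")
    return target / total


if __name__ == "__main__":
    # Hypothetical amounts in arbitrary units: the target is retained while
    # the non-target nucleic acid drops from 1000 to 10 units.
    print(fold_enrichment(10, 1000, 10, 10))
    print(target_fraction(10, 10))
```

With these illustrative amounts, the ratio rises from 0.01 to 1, so the sketch reports a 100-fold enrichment and a target share of one half.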
As used herein, the terms “patient,” “host” and “subject” are used interchangeably. The term “subject” means a human or an animal. Examples of the subject include, but are not limited to, human, monkey, mouse, rat, woodchuck, ferret, rabbit, hamster, cow, horse, pig, deer, dog, cat, fox, wolf, chicken, emu, ostrich, and fish. In some embodiments, the subject is a mammal, e.g., a primate such as a human.
As used herein, the term “biological sample” refers to a sample to be processed or analyzed by any of the methods described herein, and can be any type of sample obtained from a subject to be tested. The biological samples used herein include, but are not limited to: tissue samples (such as tissue sections and needle biopsies of a tissue); cell samples (e.g., cytological smears (such as Pap or blood smears) or samples of cells obtained by microdissection); samples of whole organisms (such as samples of yeasts or bacteria); or cell fractions, fragments or organelles (such as those obtained by lysing cells and separating the components thereof by centrifugation or otherwise). Other examples of biological samples include, but are not limited to, body fluid samples, such as blood, serum, plasma, urine, sputum, saliva, cerebrospinal fluid, interstitial fluid, mucous, sweat, stool extract, fecal matter, synovial fluid, tears, semen, peritoneal fluid, nipple aspirates, milk, vaginal fluid, or any combination thereof. In some embodiments, a blood sample can be whole blood or a fraction thereof, e.g., serum or plasma, heparinized or EDTA treated to avoid blood clotting.
The method of the present disclosure comprises adding a non-ionic surfactant, e.g., saponin, to a sample, e.g., a biological sample comprising a host nucleic acid and a non-host nucleic acid. In at least one embodiment, the host nucleic acid and the non-host nucleic acid are contained in a cell or a particle originating from the host and a non-host source, respectively. In at least one embodiment, the non-ionic surfactant selectively causes lysis of the host cell and the interior membrane thereof, releasing a host nucleic acid, such that the host nucleic acid can be partially or completely bound to a solid phase adsorbent. The nucleic acid within a non-host cell or particle (e.g., pathogen) is essentially left intact, and would not be significantly removed from the biological sample, such that such nucleic acid can be subsequently collected and analyzed by, e.g., sequencing. The non-host nucleic acid processed or analyzed by any of the methods described herein has an average length sufficiently long to be identifiable; that is, the sequence and/or biological origin thereof can thus be ascertained. In at least one embodiment, the non-host nucleic acid enriched by the methods described herein may have at least 2,000 nucleotides in length.
Referring to
In at least one embodiment of the present disclosure, the target nucleic acid enrichment unit 200 may be connected to the cell lysis unit 100 and configured to receive the sample where the cells therein have been lysed by the cell lysis unit 100. The target nucleic acid enrichment unit 200 may include an immobilized adsorption device 201 and a nucleic acid extraction device 202, wherein the immobilized adsorption device 201 includes a solid phase adsorbent, which is configured to bind and remove the non-target nucleic acid released from the lysed cells, thereby enriching the target nucleic acid contained in the sample. The enriched target nucleic acid may be subsequently extracted by the nucleic acid extraction device 202. For example, further referring to
Referring to
In at least one embodiment of the present disclosure, the sequence analysis unit 400 may be connected to the sequencing unit 300 and configured to receive the sequencing data generated by the sequencing unit 300, wherein the sequencing data include the barcode of subsequence with indicated length in the sequence to be compared (i.e., the nucleic acid sequence of the target microorganism). In at least one embodiment of the present disclosure, the sequence analysis unit 400 may include a microorganism identification module 401 and a resistance gene identification module 402. By the microorganism identification module 401, the sequencing data are compared with a microbial genome database, thereby producing the identification result of the target microorganism. Further, the resistance gene identification module 402 can be used to identify the resistance gene carried by the target microorganism.
In at least one embodiment of the present disclosure, for determining whether the sequencing data and the reference sequence of the microbial genome database are the same or not, the corresponding sequence to be compared can be read from the reference sequence according to the barcode of the sequencing data, and then the base pairs in the sequence to be compared are aligned to the reference sequence to determine whether the bases in the sequence to be compared and the reference sequence are the same or not. If the alignment result is the same, the index is used as the position information of the sequence to be compared. If the alignment result is different, it is determined that there is an inserted or deleted base pair in the sequence to be compared. In at least one embodiment, the microbial genome database suitable to the diagnostic system of the present disclosure includes, but is not limited to, Centrifuge and Kraken2, which are clinical pathogen databases used for comparison with bacteria, viruses, fungi, parasites, and the like.
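The index-based comparison described above can be sketched in Python as follows: subsequences of a fixed length are indexed by their positions in the reference, the leading subsequence of a read is looked up in that index, the corresponding reference window is read out, and the two sequences are compared base by base. This is only a simplified illustration under stated assumptions: the function names, the use of the first `k` bases of the read as the lookup key, and the mismatch list are hypothetical, and the sketch performs an ungapped comparison, so it reports differing positions without resolving them into insertions or deletions. It is not the algorithm used by Centrifuge or Kraken2.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple


def build_index(reference: str, k: int) -> Dict[str, List[int]]:
    """Map every length-k subsequence of the reference to its start positions."""
    index: Dict[str, List[int]] = defaultdict(list)
    for i in range(len(reference) - k + 1):
        index[reference[i:i + k]].append(i)
    return index


def compare_read(read: str, reference: str, index: Dict[str, List[int]],
                 k: int) -> Optional[Tuple[int, List[int]]]:
    """Locate a read via its first k bases and compare it to the reference.

    Returns (reference_position, mismatch_offsets) for the candidate position
    with the fewest differing bases, or None if the leading k-mer is absent.
    An empty mismatch list means the read is identical to the reference window.
    """
    best: Optional[Tuple[int, List[int]]] = None
    for pos in index.get(read[:k], []):
        window = reference[pos:pos + len(read)]
        if len(window) < len(read):
            continue  # read would run past the end of the reference
        mismatches = [i for i, (a, b) in enumerate(zip(read, window)) if a != b]
        if best is None or len(mismatches) < len(best[1]):
            best = (pos, mismatches)
    return best


if __name__ == "__main__":
    reference = "ACGTACGTTAGC"
    index = build_index(reference, k=4)
    print(compare_read("ACGTTAGC", reference, index, k=4))
    print(compare_read("ACGTTAGG", reference, index, k=4))
```

A production pipeline would rely on a gapped aligner or a dedicated classifier for this step; the sketch only shows how an index turns a sequence lookup into a position in the reference.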
In at least one embodiment of the present disclosure, the database for species identification includes a pathogen genome database and a pathogen literature database, whose original data sources may be a public database, such as the National Center for Biotechnology Information (NCBI). At present, the microbial genome database records the reference sequences of a total of 69,836 species, including a total of 5,527 species of bacteria and archaea, 1,677 species of viruses, 5,523 species of fungi, and 865 species of parasites, as well as 62,602 species of eukaryotes. In at least one embodiment of the present disclosure, the database for resistance gene identification may be the resistance gene database ResFinder 4.0 (Center for Genomic Epidemiology, DTU, Denmark). Currently, the resistance gene database includes reference sequences with a total of 2,690 resistance genes on plasmids and 266 resistance gene mutation sites on chromosomes, and further includes 57 drugs for predicting resistance of microorganisms.
Further referring to
The step of lysing cells (S1) comprises adding a non-ionic surfactant to a sample collected from the environment or a host, thereby lysing non-target cells in the sample.
The step of enriching target nucleic acids (S2) comprises binding nucleic acids of the non-target cells by a solid phase adsorbent, and extracting target nucleic acids in the sample after removing the solid phase adsorbent.
The step of encoding sequence (S3) comprises constructing a sequencing library with a library preparation kit, sequencing the target nucleic acids by a sequencer, and generating sequencing data by a base-calling program.
The step of analyzing sequence (S4) comprises comparing the sequencing data with a microbial genome database and/or a resistance gene database, thereby producing the identification result of the target microorganism and/or the resistance gene.
The materials and processes used in the present disclosure will be provided and described in detail below.
When a blood culture incubated in a system, for example, the BACTEC (BD), is flagged positive, 2 mL of the blood culture solution is taken and reacted with 1× red blood cell (RBC) lysis buffer at room temperature (RT) for 5 min to eliminate the RBC in the blood. Subsequently, the reacted solution is centrifuged at 3,000×g for 10 min to primarily clean the debris. The supernatant is discarded, and the pellet is resuspended with 250 μL of phosphate-buffered saline (PBS). Further, the non-ionic surfactant (e.g., saponin, Tween, Triton, polyoxyethylene (10) oleyl ether, polyols, polyoxyethylene-polyoxypropylene copolymers, polyoxyethylene ethers, alkyl ethanolamides, glucosides, and fatty alcohols) is added to the suspension. For example, 5% saponin is added to the suspension to reach a final concentration of 2.2%, and the mixture is then incubated at RT for 10 min. After centrifugation at 6,000×g for 5 min, the supernatant is discarded, and the pellet is resuspended with 200 μL of PBS. To the suspension, 100 μL of solid-phase reversible immobilization (SPRI) beads are added, followed by pipetting for 5 min. Further, after standing on a magnet rack, the supernatant is collected. The supernatant is then centrifuged at 3,000×g for 3 min, and the pellet is resuspended in 200 μL of PBS.
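The saponin addition in the preceding paragraph involves a simple dilution calculation, sketched below in Python. The sketch assumes that the final concentration is expressed relative to the combined volume of suspension and stock solution and that the suspension volume is the 250 μL of PBS used for resuspension; the disclosure does not state the volume of 5% saponin added, so the function name and the resulting figure are illustrative only.

```python
def stock_volume_for_final(sample_volume_ul: float, stock_pct: float,
                           final_pct: float) -> float:
    """Volume of stock to add so the mixture reaches final_pct.

    Solves stock_pct * V = final_pct * (sample_volume_ul + V) for V,
    assuming the sample itself contains none of the surfactant.
    """
    if not 0 < final_pct < stock_pct:
        raise ValueError("final_pct must be positive and below stock_pct")
    return sample_volume_ul * final_pct / (stock_pct - final_pct)


if __name__ == "__main__":
    # 250 uL suspension, 5% saponin stock, 2.2% target final concentration.
    volume = stock_volume_for_final(250.0, 5.0, 2.2)
    print(f"Add about {volume:.1f} uL of 5% saponin")
```

Under these assumptions, roughly 196 μL of the 5% stock would bring a 250 μL suspension to a final concentration of 2.2%.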
To extract bacterial DNA from the pretreated pellet for Nanopore sequencing, a commercially available kit is employed, generally following the protocols described in the Qiagen manual for QIAamp blood and tissue genomic DNA, except that the lysozyme and lysostaphin protocol is used to reduce processing steps and turnaround time.
After DNA has been extracted, shorter DNA fragments (less than about 300 bp in length) are depleted by SPRI beads. DNA concentration is measured with a Qubit Fluorometer by using the Qubit Broad Range double-stranded DNA (dsDNA) quantification kit, which has a quantitation range of 2 ng/μL to 1,000 ng/μL. DNA purity and contamination are assessed by using a NanoDrop spectrophotometer. The suggested sample purity is A260/A230>2.0 and A260/A280>1.8.
The DNA concentration of the extracted sample is adjusted to 80 ng/μL, and then 5 μL of the sample (400 ng) is mixed with 2.5 μL of water to a final volume of 7.5 μL. The Rapid Barcoding kit (SQK-RBK004, Oxford Nanopore) is thawed at room temperature for the subsequent experiment.
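The dilution arithmetic in this step can be checked with a short sketch. The function name and the 200 ng/μL stock concentration are hypothetical and serve only as an illustration; the 80 ng/μL target, the 5 μL aliquot, and the 7.5 μL final volume are taken from the description above.

```python
def dilute_to_target(stock_ng_per_ul, target_ng_per_ul, final_ul):
    """Return (stock_ul, water_ul) needed to reach the target concentration.

    Uses C1 * V1 = C2 * V2. Hypothetical helper, not part of any kit protocol.
    """
    if stock_ng_per_ul < target_ng_per_ul:
        raise ValueError("stock is more dilute than the target concentration")
    stock_ul = target_ng_per_ul * final_ul / stock_ng_per_ul
    return stock_ul, final_ul - stock_ul


# Assumed example: a 200 ng/uL extract adjusted to 80 ng/uL in 10 uL.
stock_ul, water_ul = dilute_to_target(200.0, 80.0, 10.0)

# DNA mass carried by the 5 uL aliquot of the adjusted sample.
mass_ng = 80.0 * 5.0

# Concentration after adding 2.5 uL of water (7.5 uL final volume).
conc_in_reaction = mass_ng / 7.5
```

The same helper can be reused for any measured Qubit concentration at or above the target.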
Further, 7.5 μL of the sample, 2.5 μL of each label barcode adapter 1 to 96, the sequencing adapters, and dynein are added into a 0.2 mL microcentrifuge tube. In the process of connecting the label barcode adapters, the same label barcode adapter cannot be reused within 96 consecutive samples.
The sample is placed in a PCR machine for a reaction of 30° C. for 1 min and 80° C. for 1 min, and further placed on an ice box to mix all labeled samples. Subsequently, DNA is purified by Agencourt AMPure XP magnetic beads. The magnetic beads shall be shaken well before use. Specifically, 60 μL of the magnetic beads are added to the reacted DNA solution, placed in a mixer, and inverted for 5 min. The microcentrifuge tube is stood on a magnet rack for 10 min. After the removal of the solution, the magnetic beads are washed with 70% alcohol twice. Afterward, the magnetic beads are dispersed with 25 μL of DNase-free water to dissolve DNA in water. The magnetic beads are then removed by the magnet rack to obtain a purified DNA library.
Sequencing is performed on MinION flow cells (R9.4.1 FLO-MIN106, Oxford Nanopore). The flow cells are placed in the MinION sequencer after returning to room temperature, and the Flow Cell Priming kit is used for the sequencing. Firstly, the flush buffer (FB) and the flush tether (FLT) are returned to room temperature, and 30 μL of FLT is added to FB to form a priming mixture. Subsequently, 800 μL of the priming mixture is loaded into the flow cells via the priming port and stood for 5 min. Further, another 200 μL of the priming mixture is loaded into the priming port.
In another microcentrifuge tube, 12 μL of the prepared DNA library is mixed with 37.5 μL of sequencing buffer (SQB) and 25.5 μL of loading beads to form a sequencing mixture with a total volume of 75 μL. The sequencing mixture is gently pipetted to avoid the introduction of any air bubbles, and then slowly dropped into a sample port. The reagent port and the sample port are closed for performing sequencing.
The data are collected by using the MinKNOW software v4.2.4. Base calling is performed using the Guppy command line tool with barcode de-multiplexing and FASTQ file output. Adaptor sequences are trimmed from the reads using Porechop v0.2.3, which is run with barcode de-multiplexing. Only reads for which Guppy and Porechop agree on the barcode bin are kept to reduce the risk of cross-barcode contamination. The MinKNOW platform generates the sequencing data, and all sequences per file are output using default settings. The first output file is produced approximately 2 hours after the start of the sequencing run, and output continues until 10 hours. For this work, each output file is processed separately to keep track of the time elapsed since the start of the sequencing.
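The barcode concordance rule described above can be illustrated with a minimal sketch. It assumes that the barcode assignments of both tools have already been parsed into dictionaries mapping read IDs to barcode labels, with None for unassigned reads; the parsing step, the function name, and the label format are hypothetical and do not reflect the actual output format of Guppy or Porechop.

```python
def keep_concordant_reads(guppy_bins, porechop_bins):
    """Keep only reads that both tools assign to the same barcode bin.

    Both arguments map read_id -> barcode label (None if unassigned).
    Reads missing from either mapping, or unassigned, are dropped.
    """
    return {
        read_id: barcode
        for read_id, barcode in guppy_bins.items()
        if barcode is not None and porechop_bins.get(read_id) == barcode
    }


# Illustrative assignments only.
guppy = {"r1": "barcode01", "r2": "barcode02", "r3": None, "r4": "barcode01"}
porechop = {"r1": "barcode01", "r2": "barcode03", "r3": None}
kept = keep_concordant_reads(guppy, porechop)
```

Here only read r1 is retained: r2 is assigned to different bins, r3 is unassigned, and r4 has no Porechop assignment.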
Raw sequencing reads (≥2,000 bp) are taxonomically classified by a classification program, such as Centrifuge 1.0.4 or Kraken 2, using default settings (minimum length of partial hits min_hitlen=22; at most k=5 distinct assignments for each read; no preferred/excluded taxa) and the reference gene sequences of bacteria, archaea, viruses, and humans.
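The read-length cutoff applied before classification can be sketched as follows. This is an illustrative example that assumes uncompressed FASTQ text with exactly four lines per record; the function names are hypothetical, and the classifier itself is not invoked here.

```python
from typing import Iterable, Iterator, List, Tuple

Record = Tuple[str, str, str]  # (read_id, sequence, quality)


def parse_fastq(lines: Iterable[str]) -> Iterator[Record]:
    """Yield records from 4-line FASTQ text (sequences are not line-wrapped)."""
    rows = [line.rstrip("\n") for line in lines if line.strip()]
    for i in range(0, len(rows) - 3, 4):
        header, seq, _plus, qual = rows[i:i + 4]
        # The read ID is the first token after the leading '@'.
        yield header[1:].split()[0], seq, qual


def filter_by_length(records: Iterable[Record], min_len: int = 2000) -> List[Record]:
    """Keep reads whose sequence length is at least min_len bases."""
    return [rec for rec in records if len(rec[1]) >= min_len]


# Two synthetic reads: one of 2,000 bases and one of 1,999 bases.
example = [
    "@r1 ch=1\n", "A" * 2000 + "\n", "+\n", "I" * 2000 + "\n",
    "@r2\n", "C" * 1999 + "\n", "+\n", "I" * 1999 + "\n",
]
kept = filter_by_length(parse_fastq(example))
```

Passing min_len=300 gives the cutoff used for the resistance gene search instead.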
Specifically, based on the barcode of a subsequence of the indicated length in the sequence to be compared, the corresponding sequence is read out from the reference gene sequences. The generated sequencing data are classified by the clinical pathogen database of Centrifuge 1.0.4 or Kraken 2, and sequences whose alignment length is greater than 80% of the full length of the reference sequence and whose mismatched bases in the alignment region are less than or equal to 10% are kept, so as to calculate the proportion of pathogen classification. The sample is identified as containing a pathogen if the proportion of pathogen classification is greater than 1% of the total sequence reads.
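The filtering and calling rule above can be expressed as a minimal sketch. It assumes one classification hit per read and a simple record with the fields shown; the record layout and function names are hypothetical and are not the output format of Centrifuge or Kraken 2. The thresholds (alignment length greater than 80% of the reference length, mismatches at most 10% of the aligned region, proportion greater than 1% of total reads) follow the description.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Hit:
    read_id: str
    taxon: str
    aln_len: int     # aligned length in bases
    ref_len: int     # full length of the reference sequence
    mismatches: int  # mismatched bases within the aligned region


def passes_filter(hit, min_aln_frac=0.80, max_mismatch_frac=0.10):
    """Alignment must span >80% of the reference with <=10% mismatches."""
    return (hit.aln_len > min_aln_frac * hit.ref_len
            and hit.mismatches <= max_mismatch_frac * hit.aln_len)


def call_pathogens(hits, total_reads, min_proportion=0.01):
    """Return {taxon: proportion} for taxa above 1% of all sequence reads."""
    counts = Counter(h.taxon for h in hits if passes_filter(h))
    return {
        taxon: n / total_reads
        for taxon, n in counts.items()
        if n / total_reads > min_proportion
    }


# Illustrative hits only; total_reads is the number of reads in the sample.
hits = [
    Hit("r1", "K. pneumoniae", 900, 1000, 50),
    Hit("r2", "K. pneumoniae", 850, 1000, 10),
    Hit("r3", "E. coli", 700, 1000, 0),     # alignment too short
    Hit("r4", "E. coli", 900, 1000, 200),   # too many mismatches
]
calls = call_pathogens(hits, total_reads=100)
```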
Once sequencing data have been collected, the next step is pre-processing and base calling, followed by metagenomic assembly. Various assemblers are appropriate for the assembly of long-read metagenomic data, including long-read assemblers such as Canu and Flye. In addition, long reads alone can be used for error correction by using Racon; Medaka, which uses neural networks to recognize and correct Nanopore homopolymer errors and generate a consensus sequence; and Homopolish, which removes systematic errors in Nanopore sequencing by homologous polishing. Raw sequencing reads (≥300 bp) and assembled contigs tagged as plasmids are searched against the ResFinder 4.0 databases using BLAST. Only hits with ≥90% similarity, an E-value ≤10⁻⁶, and ≥60% coverage of the database entry are kept.
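The hit-retention criteria for the resistance gene search can be sketched as below. The example assumes that BLAST hits have already been parsed into records carrying percent identity, E-value, subject coordinates, and subject length; the record layout, the gene names, and the interpretation of "similarity" as percent identity are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class BlastHit:
    query_id: str
    gene: str            # database entry (hypothetical names below)
    pct_identity: float  # percent identity of the alignment
    evalue: float
    s_start: int         # 1-based alignment start on the database entry
    s_end: int           # 1-based alignment end on the database entry
    subject_len: int     # full length of the database entry

    @property
    def subject_coverage(self) -> float:
        """Fraction of the database entry spanned by the alignment."""
        return (abs(self.s_end - self.s_start) + 1) / self.subject_len


def filter_hits(hits, min_identity=90.0, max_evalue=1e-6, min_coverage=0.60):
    """Keep hits with >=90% identity, E-value <=1e-6, and >=60% coverage."""
    return [
        h for h in hits
        if h.pct_identity >= min_identity
        and h.evalue <= max_evalue
        and h.subject_coverage >= min_coverage
    ]


# Illustrative hits only.
hits = [
    BlastHit("contig1", "geneA", 99.5, 1e-50, 1, 900, 900),
    BlastHit("contig2", "geneB", 85.0, 1e-30, 1, 800, 800),   # identity too low
    BlastHit("contig3", "geneC", 95.0, 1e-3, 1, 1000, 1000),  # E-value too high
    BlastHit("contig4", "geneD", 98.0, 1e-40, 1, 400, 900),   # coverage too low
]
kept_genes = [h.gene for h in filter_hits(hits)]
```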
The assembled sequences are compared with the resistance gene database. Based on the alignment to microbial genome and resistance genes, at least one parameter selected from the number of effective sequences for alignment (i.e., the number of sequences of the species and genes for alignment between genus/species and resistance genes), coverage (i.e., the percentage of the length of the detected microbial nucleic acid sequence to the length of the genome sequence of microorganisms and resistance genes), coverage depth (i.e., the average depth of each base that is measured in the genome), relative abundance (i.e., the proportion of the detected microorganisms to the same genus/species of microorganisms in the sample), and degree of dispersion can be calculated, thereby producing the identification result.
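Two of the parameters listed above, coverage and coverage depth, can be computed from alignment intervals as in the following sketch. It assumes 0-based, half-open (start, end) intervals on a single reference sequence and returns coverage as a fraction rather than a percentage; the function name and the interval representation are hypothetical.

```python
def coverage_and_depth(alignments, ref_len):
    """Return (coverage fraction, mean depth) over a reference of ref_len bases.

    alignments: iterable of (start, end) intervals, 0-based and half-open.
    """
    # Difference array: +1 where an alignment starts, -1 where it ends.
    delta = [0] * (ref_len + 1)
    for start, end in alignments:
        delta[start] += 1
        delta[end] -= 1

    covered_bases = 0
    total_depth = 0
    depth = 0
    for pos in range(ref_len):
        depth += delta[pos]
        total_depth += depth
        if depth > 0:
            covered_bases += 1
    return covered_bases / ref_len, total_depth / ref_len


# Two overlapping reads on a 10-base reference (illustrative values).
coverage, mean_depth = coverage_and_depth([(0, 5), (3, 8)], ref_len=10)
```

In this toy case, 8 of 10 bases are covered and the 10 aligned bases give a mean depth of 1.0.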
The following examples provide various non-limiting embodiments and properties of the present disclosure.
In this example, a human blood sample containing Klebsiella pneumoniae (K. pneumoniae) strain KPC160111 or Staphylococcus aureus (S. aureus) strain TUH25713455 was pretreated with the immobilized adsorption of human nucleic acids, and then subjected to quantitative polymerase chain reaction (qPCR) and Nanopore sequencing.
The results indicated that the bacterial nucleic acids were enriched in the sample with the pretreatment of immobilized adsorption. As shown in Table 1 below, in the pretreated sample, the human nucleic acids were depleted to 0.005 to 0.016 times those of the control sample, while the bacterial nucleic acids were increased to 2.34 to 5.78 times those of the control sample.
K. pneumoniae
K. pneumoniae
S. aureus
S. aureus
Further, the results of Nanopore sequencing indicated that the number of reads (i.e., No. of reads), the read length (including average read length, median read length, and N50), and the total bases obtained from the pretreated sample were all significantly higher than those from the control sample (Table 2).
K. pneumoniae
S. aureus
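The read statistics named for Table 2 (number of reads, total bases, average and median read length, and N50) can be derived from a list of read lengths, as in the following sketch. The function names are hypothetical and the example lengths are illustrative values, not data from the present disclosure.

```python
from statistics import mean, median


def n50(lengths):
    """Length L such that reads of length >= L hold at least half of all bases."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    raise ValueError("no read lengths given")


def read_stats(lengths):
    """Summarize a sequencing run from its read lengths."""
    return {
        "reads": len(lengths),
        "total_bases": sum(lengths),
        "mean_length": mean(lengths),
        "median_length": median(lengths),
        "n50": n50(lengths),
    }


# Illustrative read lengths only.
stats = read_stats([2, 3, 4, 5, 6, 7, 8, 9, 10])
```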
In terms of the proportion of bacterial nucleic acids after the Nanopore sequencing, Table 3 below shows that the proportion of non-target nucleic acids (i.e., human nucleic acids) was significantly decreased from 63.09% to 0.13% in the pretreated sample containing K. pneumoniae, and from 75.35% to 0.11% in the pretreated sample containing S. aureus; on the other hand, the proportion of bacterial nucleic acids was increased from 28.34% to 82.01% (K. pneumoniae) and from 20.72% to 81.14% (S. aureus).
K. pneumoniae
S. aureus
In this example, a human blood sample containing K. pneumoniae or S. aureus was pretreated with the immobilized adsorption of human nucleic acids or the commercially available kits (i.e., MolYsis Basic 5 Kit, NEBNext Microbiome DNA Enrichment Kit, and QIAamp BiOstic Bacteremia DNA Kit), and then subjected to qPCR, Nanopore sequencing, and identification of the bacterial species and resistance genes based on the sequencing data generated from the Nanopore sequencing.
In comparison with the commercially available kits, the sample pretreated with the immobilized adsorption provided herein had the longest read length (including average read length and mean read length) (Table 4 and Table 5 below).
Further, as shown in
In addition, as shown in
In this example, 36 human blood culture specimens provided by a hospital in Taiwan were pretreated with the immobilized adsorption of non-target nucleic acids, and then subjected to the identification of pathogens and the detection of resistance genes.
The results are shown in Table 6 below, in which the percentage represents the proportion of sequence reads. Among the 36 blood specimens, the pathogens identified by the method of the present disclosure were consistent with those identified by the conventional microbial culture in 33 cases; moreover, in the case that the sample contained more than one pathogen or the pathogens therein were different species of the same genus, the minor pathogens or species in the sample could also be identified by the method of the present disclosure. In the three cases whose identification results were inconsistent with those obtained from the microbial culture (Nos. 7, 14, and 24), the results of the method of the present disclosure might be closer to the actual infection.
No. | Microbial culture | Method of the present disclosure (proportion of sequence reads) |
---|---|---|
1 | Klebsiella pneumoniae | Klebsiella pneumoniae (77.7%) |
2 | Escherichia coli | Escherichia coli (75.6%) |
3 | Acinetobacter baumannii | Acinetobacter baumannii (62.9%) |
4 | Staphylococcus aureus | Staphylococcus aureus (53%)/Escherichia coli (25%) |
5 | Escherichia coli | Escherichia coli (65.6%) |
6 | Staphylococcus aureus | Staphylococcus aureus (56%) |
7 | Staphylococcus aureus | S. epidermidis (47%)/aureus (17%)/simulans |
8 | Proteus mirabilis | Proteus mirabilis (58%) |
9 | Escherichia coli | Escherichia coli (58%) |
10 | Escherichia coli | Escherichia coli (68%) |
11 | Escherichia coli | Escherichia coli (92.8%) |
12 | Escherichia coli | Escherichia coli (86%)/Enterococcus faecium (1.9%) |
13 | Pseudomonas | Pseudomonas BJP69 (55%)/putida (18.5%)/monteilii |
14 | Staphylococcus epidermidis | Staphylococcus capitis (47.8%)/hominis (25.5%)/aureus (1.41%) |
15 | Enterococcus faecium | Enterococcus faecium (97%) |
16 | Acinetobacter baumannii | Acinetobacter baumannii (58.0%) |
17 | Acinetobacter baumannii | Acinetobacter baumannii (68.8%) |
18 | Klebsiella pneumoniae | Klebsiella pneumoniae (76.7%)/variicola |
19 | Staphylococcus aureus | Staphylococcus aureus (94%) |
20 | Acinetobacter baumannii | Acinetobacter baumannii (61.3%)/Enterococcus faecium (5.0%) |
21 | Escherichia coli | Escherichia coli (90.4%) |
22 | Enterococcus faecium | Enterococcus faecium (96.5%) |
23 | Klebsiella aerogenes | Klebsiella aerogenes (93%) |
24 | Klebsiella pneumoniae | Klebsiella quasipneumoniae (60.9%)/pneumoniae (7.1%) |
25 | Klebsiella variicola | Klebsiella variicola (86.9%) |
26 | Klebsiella pneumoniae | Klebsiella pneumoniae (67.5%)/variicola (1.4%) |
27 | Klebsiella pneumoniae | Klebsiella pneumoniae (75.1%)/Escherichia coli (2.9%) |
28 | Klebsiella pneumoniae | Klebsiella pneumoniae (80.6%) |
29 | Klebsiella pneumoniae | Klebsiella pneumoniae (67.6%)/Escherichia coli (2.0%) |
30 | Klebsiella pneumoniae | Klebsiella pneumoniae (81.8%) |
31 | Klebsiella pneumoniae | Klebsiella pneumoniae (74.2%)/Klebsiella variicola |
32 | Klebsiella pneumoniae | Klebsiella pneumoniae (62.4%)/Escherichia coli (1.2%) |
33 | Klebsiella pneumoniae | Klebsiella pneumoniae (65.3%)/Escherichia coli (3.2%) |
34 | Klebsiella pneumoniae | Klebsiella pneumoniae (81.5%)/Escherichia coli (3.2%) |
35 | Candida glabrata | Candida glabrata (55.4%)/Escherichia coli (2.2%) |
36 | Candida albicans | Candida albicans (51.3%)/Escherichia coli (1.2%) |
The resistance genes detected by the method of the present disclosure could be attributed to the phenotypic resistance in the sample determined by conventional antimicrobial susceptibility testing (AST). The resistance genes in samples Nos. 4, 17, 19, 21, 22, and 29 identified by the method of the present disclosure were shown in
The performance of the present disclosure (TCDC protocol) in the identification of bacterial species in 44 clinical blood specimens was compared with conventional culture, matrix-assisted laser desorption/ionization time-of-flight (MALDI-TOF) mass spectrometry, and the BIOFIRE Blood Culture Identification (BCID2, FilmArray). As shown in Table 7 below, the method of the present disclosure performed well in the identification of bacterial species in the sample containing gram-positive, gram-negative, or multiple bacteria, and had 100% consistency with the results of conventional culture MALDI-TOF. This evaluation also indicated that the method of the present disclosure was superior to FilmArray BCID2 in the identification of bacterial species in the 44 clinical specimens.
Klebsiella pneumoniae
Klebsiella variicola
Klebsiella pneumoniae*
Citrobacter freundii
Serratia marcescens
Serratia rubidaea
Enterobacter cloacae
Moraxella osloensis
Escherichia coli
Pseudomonas aeruginosa
Acinetobacter baumannii
Acinetobacter guillouiae
Stenotrophomonas maltophilia
Staphylococcus aureus
Enterococcus faecium
Escherichia coli
Klebsiella pneumoniae
Enterococcus gallinarum
Klebsiella pneumoniae
Enterobacter cloacae
Enterococcus gallinarum
Candida albicans
Proteus mirabilis
Klebsiella pneumoniae
Staphylococcus epidermidis
Klebsiella aerogenes
Klebsiella aerogenes
Citrobacter cronae
Escherichia coli
As to the clinical specimens from intensive care units, the performance of the method of the present disclosure was also compared with conventional culture MALDI-TOF, FilmArray BCID2, and Nanopore sequencing of the 16S rRNA gene, and the results of pathogen identification are shown in Table 8 below. The method of the present disclosure had results concordant with the culture method in the specimens identified with one pathogen, except that in specimen ICU2-1, the method of the present disclosure further identified additional bacterial species. The FilmArray BCID2 panel failed to identify Moraxella osloensis in specimen ICU04 and Acinetobacter guillouiae in specimen ICU13. In specimens ICU2-1 and ICU38, the FilmArray BCID2 panel could not specifically identify the species of bacteria (e.g., Citrobacter freundii).
Citrobacter
freundii
Citrobacter
freundii
Enterobacterales
Citrobacter
murliniae
Citrobacter
freundii
Citrobacter
gillenii
Citrobacter
Citrobacter
freundii
portucalensis
Citrobacter
youngae
Citrobacter
braakii
Escherichia
coli
Moraxella
osloensis
Moraxella
Moraxella
osloensis
Escherichia
coli
osloensis
Serratia
rubidaea
Serratia
rubidaea
Enterobacterales
Serratia
rubidaea
Klebsiella
variicola
Klebsiella
Enterobacterales
Klebsiella
variicola
Klebsiella
pneumoniae
variicola
Klebsiella
Klebsiella
pneumoniae
pneumoniae
Acinetobacter
Acinetobacter
Acinetobacter
guillouiae
guillouiae
guillouiae
Serratia
rubidaea
Klebsiella
aerogenes
Klebsiella
Klebsiella
Klebsiella
aerogenes
aerogenes
aerogenes
Citrobacter
cronae
Citrobacter
cronae
Escherichia
coli
Citrobacter
cronae
Citrobacter
freundii
Citrobacter
freundii
Enterobacterales
Citrobacter
murliniae
Citrobacter
gillenii
Citrobacter
freundii
Citrobacter
braakii
The performance of the method of the present disclosure in the identification of the phenotypic resistance and resistance genes in specimens was compared with FilmArray BCID2. As shown in Table 9, the method of the present disclosure could identify nearly all the resistance genes that could correspond to the phenotypic resistance detected by clinical blood culture and antimicrobial susceptibility testing (AST). In comparison, the FilmArray BCID2 only detected a limited number of the resistance genes.
Klebsiella pneumoniae
Citrobacter freundii
Serratia marcescens
Enterobacter cloacae
Escherichia coli
Escherichia coli
Klebsiella pneumoniae
Enterococcus gallinarum
Staphylococcus aureus
Pseudomonas aeruginosa
Escherichia coli
Klebsiella pneumoniae
Enterobacter cloacae
Klebsiella variicola
Acinetobacter guillouiae
Serratia marcescens
Klebsiella pneumoniae
Escherichia coli
Klebsiella pneumoniae
Pseudomonas aeruginosa
Serratia marcescens
Escherichia coli
Streptococcus
Escherichia coli
Pseudomonas aeruginosa
Escherichia coli
Escherichia coli
Escherichia coli
Escherichia coli
Escherichia coli
Klebsiella pneumoniae
Klebsiella pneumoniae
Enterococcus faecium
Candida albicans
Enterococcus faecium
Klebsiella aerogenes
Proteus mirabilis
Klebsiella pneumoniae
Staphylococcus epidermidis
Escherichia coli
Klebsiella pneumoniae
Klebsiella pneumoniae
Acinetobacter baumannii
Citrobacter cronae
Acinetobacter baumannii
Stenotrophomonas maltophilia
Enterococcus faecium
From the above, these data reveal that the method of the present disclosure can be used for the rapid identification of bacterial species and can reach a sequence coverage depth of 20× within 2 to 4 hours of sequencing time, thereby enabling genome assembly, resistance gene detection, and antimicrobial susceptibility prediction. By employing immobilized adsorption, the system and method of the present disclosure can be used to obtain high-quality bacterial DNA by removing non-target nucleic acids from humans or other sources in blood culture specimens. The extracted high-quality bacterial DNA may be subjected to rapid sequencing using the Nanopore sequencing platform to generate long sequence reads, which may be further analyzed using the bioinformatics pipelines to identify the species of bacteria and resistance genes.
In comparison with conventional microbial culture followed by antimicrobial susceptibility testing, which requires a turnaround time of more than 3 days (
Hence, the present disclosure provides relevant information to timely select effective antimicrobials, thereby assisting in improving the cure rate of the diseases and curbing the emergence and spread of bacterial strains with resistance resulting from empirical use of non-effective antimicrobials.
It is obvious to a person skilled in the art that with the advancement of technology, the basic idea may be implemented in various ways. The embodiments are thus not limited to the examples described above; instead, they may vary within the scope of the claims.
The embodiments described hereinbefore may be used in any combination with each other. Several of the embodiments may be combined to form a further embodiment. A method disclosed herein may comprise at least one of the embodiments described hereinbefore. It will be understood that the benefits and advantages described above may relate to one embodiment or may relate to several embodiments. The embodiments are not limited to those that solve any or all of the stated problems or those that have any or all of the stated benefits and advantages.
Number | Date | Country | Kind |
---|---|---|---|
111121091 | Jun 2022 | TW | national |
Number | Date | Country | |
---|---|---|---|
63286548 | Dec 2021 | US |