COMPOSITIONS AND METHODS FOR ASSESSING MICROBIAL POPULATIONS

Information

  • Patent Application
  • Publication Number
    20220251669
  • Date Filed
    April 08, 2022
  • Date Published
    August 11, 2022
Abstract
The present disclosure provides compositions and methods, as well as combinations, kits, and systems that include the compositions and methods, for amplification, detection, characterization, assessment, profiling and/or measurement of nucleic acids in samples, particularly biological samples. Compositions and methods provided herein include combinations of microbial species target-specific nucleic acid primers for selective amplification and/or combinations of primers for amplification of nucleic acids from a large group of taxonomically related microorganisms. In one aspect, amplified nucleic acids obtained using the compositions and methods can be used in various processes including nucleic acid sequencing and used to detect the presence of microbial species and assess microbial populations in a variety of samples. In accordance with the teachings and principles, new methods, systems and non-transitory machine-readable storage media are provided to compress reference sequence databases used in mapping sequence reads for analysis and profiling of microbial populations.
Description
SEQUENCE LISTING

This application hereby incorporates by reference the material of the electronic Sequence Listing filed concurrently herewith in its entirety. The material in the electronic Sequence Listing is submitted as a text (.txt) file entitled LT01495_ST25.txt created Oct. 8, 2020, and has a file size of 408 kilobytes.


BACKGROUND

The diversity of the microbiota of a variety of environments has become an area of intensive research as the scientific and medical communities gain an increased understanding of the important role microbiota play in ecosystems and the health of individuals and populations. In one example, the microbiota of the gut (also referred to as the gut microbiome) is made up of trillions of bacteria, fungi and other microbes. One third of the gut microbiota in humans is common to most people while two thirds are specific to each person. A healthy human gut has a variety of commensal or mutualistic bacteria living in relative homeostasis. When a microbial imbalance or maladaptation occurs, changing the makeup and proportions of the normal flora of bacteria, the gut enters a state of dysbiosis. Dysbiosis typically causes inflammation of the intestinal cell wall, disrupting the mucus barrier, epithelial barrier, and immunosensitive cells that line the gastrointestinal tract. Imbalances in the gut microbiota are associated with diseases, chronic health conditions and response to immuno-oncology treatments. For example, imbalances in the gut microbiota have been associated with gut disorders such as irritable bowel syndrome (IBS), inflammatory bowel disease (IBD) and obesity, and autoimmune disorders such as celiac disease, lupus and rheumatoid arthritis (RA). Additionally, the composition of the gut microbiota may influence susceptibility to oncological conditions, such as cancer, and responsiveness to cancer therapies. Due to the involvement of gut microbiota in a wide range of disorders and diseases across animal species, including humans, animals and insects, characterization and study of the gut microbiome have emerged as key research focuses in advancing the understanding of health and disease and in the development of therapies for related conditions.


Several different techniques have been employed to attempt to identify microbes in various environmental and organismal samples. Initial techniques relied on microbial culture processes which are time-consuming and provide limited information due, in part, to varying growth conditions required to obtain different microbial cultures. Many more recent techniques that do not require culturing involve analysis of the genetic makeup of microbial cells contained in samples using nucleic acid analysis methods including, for example nucleic acid amplification (e.g., PCR) and/or sequencing. Typically, such methods involve amplification and analysis of microbial 16S rRNA gene segments. While analysis of 16S rRNA sequences has reduced the time and labor required in some other methods of evaluating microbial composition of samples, the comprehensiveness, accuracy, quality and depth of the information obtained through 16S rRNA gene sequence analysis-based methods can vary and be limited, for example, by the amplicons targeted and primers used in the methods. Therefore, there is a need for more sensitive and comprehensive methods for accurately characterizing the whole of a microbial population in a sample through identifying and distinguishing microbial species and levels thereof in samples containing multiple species. Such methods will factor significantly in many research areas including those directed to the causes, complications, and diagnosis of multifactorial disorders and diseases, and in advancing research into and understanding of the gut microbiota in health and disease.


BRIEF SUMMARY

Provided herein are compositions and methods, as well as combinations, kits, and systems that include the compositions and methods, for amplification, detection, characterization, assessment, profiling and/or measurement of nucleic acids. In some embodiments, compositions provided herein include a nucleic acid, for example, a single-stranded nucleic acid, that is used as a primer and/or probe. In some embodiments, compositions provided herein include a combination of a plurality of nucleic acids. In particular embodiments, the primers and/or probes are capable of binding to, hybridizing to, amplifying and/or detecting target nucleic acids of microorganisms (e.g., bacteria), such as may occur in a sample (e.g., biological sample), for example, a sample of contents of an alimentary canal of an animal. Such nucleic acids provided herein include primers and probes that specifically or selectively amplify, bind to, hybridize to and/or detect a pre-determined unique nucleic acid sequence of a microorganism's genome, and primers and probes that amplify, bind to, hybridize to and/or detect a nucleic acid sequence in one or more genes that is homologous across most, or substantially all, members of a taxonomic category (e.g., domain, kingdom, phylum, class, order, genus, species) of organisms, e.g., microorganisms, but that varies between different organisms. In certain embodiments, such nucleic acids contain one or more modifications that facilitate manipulation and/or multiplex amplification of nucleic acids. For example, such modifications include modifications that increase susceptibility of the nucleic acid to cleavage relative to the nucleic acid that does not include the modification. 
In some embodiments, the nucleic acids include one or more pairs of nucleic acids that are used as primers (e.g., primer pairs) for amplification of a target nucleic acid, such as, for example, a specific nucleic acid unique to a species of microorganism or one or more, or multiple, nucleic acids contained within a homologous gene (e.g., a 16S ribosomal RNA (rRNA) gene) common to multiple different microorganisms. For example, in certain embodiments, the nucleic acids include one or more primer pairs that separately amplify two or more regions, e.g., hypervariable regions, in a prokaryotic 16S rRNA gene. In some embodiments, the nucleic acids include a combination of a plurality of primer pairs. In certain embodiments, the combination of a plurality of primer pairs is designed to amplify nucleic acids in one, some, most or substantially all of the microorganisms, such as, e.g., bacteria, in a sample in a species-targeted and/or kingdom-encompassing manner. Also provided herein are compositions containing a mixture of nucleic acids, in which most, or substantially all, of the nucleic acids contain sequence of a portion of the genome of a microorganism, e.g., a bacterium. In some embodiments, the sequences of portions of the genome of microorganisms are less than or about 250 nucleotides in length. In some embodiments, the nucleic acids include nucleotides containing a uracil nucleobase. In some embodiments, the composition contains one or more, or a plurality, of primers, e.g., nucleic acids and/or primer pairs of any of the embodiments described herein. In some embodiments, the composition includes a DNA polymerase, a DNA ligase, and/or at least one uracil cleaving or modifying enzyme.


In some embodiments of methods of amplification provided herein, nucleic acids are subjected to amplification using nucleic acids described herein as amplification primers. In some embodiments, the nucleic acid amplification is a multiplex amplification. In some embodiments, the methods of amplification include a plurality of nucleic acid primers, e.g., primer pairs, that separately amplify two or more regions in one or more genes that is homologous across most, or substantially all, members of a taxonomic category (e.g., domain, kingdom, phylum, class, order, genus, species) of organisms. For example, in some embodiments, a plurality of nucleic acid primers includes primers, or primer pairs, that separately amplify one or more or a plurality of hypervariable regions in a prokaryotic 16S rRNA gene. In some embodiments, the methods of amplification include one or more, or a plurality of, nucleic acid primers, e.g., primer pairs, that amplify a specific nucleic acid unique to a species of organism, e.g., a microorganism such as a bacterium. In some embodiments, the methods of amplification include a plurality of nucleic acid primers, e.g., primer pairs, that include a combination of primers that separately amplify two or more regions in one or more genes that is homologous across most, or substantially all, members of a taxonomic category of organisms and one or more, or a plurality of, nucleic acid primers, e.g., primer pairs, that amplify a specific nucleic acid unique to a species of organism. In some embodiments, primers used in a method of amplification include nucleic acids containing or consisting of nucleic acids provided herein and/or nucleic acids that are capable of amplifying nucleic acids containing or consisting essentially of target sequences provided herein.


In some embodiments of methods of detecting and/or measuring nucleic acids provided herein, nucleic acids described herein are used as primers and/or probes. For example, in some methods of detecting and/or measuring nucleic acids, nucleic acids are subjected to nucleic acid amplification using nucleic acids described herein as amplification primers, and the presence or absence of one or more nucleic acid amplification products is detected. In some embodiments, the amplification is performed using a plurality of nucleic acid primer pairs and is conducted in a single multiplex amplification reaction mixture. In some embodiments, the amplification is performed according to methods of amplification provided herein using any one or more primers, or combination of primers or primer pairs described herein. In some embodiments, nucleic acids are contacted with probes containing nucleic acids described herein under hybridizing conditions and the presence or absence of the hybridized probe is detected. In some embodiments, the presence or absence of one or more nucleic acid amplification products is detected using one or more nucleic acids provided herein as a probe (e.g., a detectable or labeled probe). In some embodiments, the presence or absence of one or more nucleic acid amplification products is detected by obtaining nucleotide sequence information of one or more nucleic acid amplification products. In some embodiments, the levels (absolute or relative) of detected amplification products are measured and determined. In some embodiments, the levels (absolute or relative) of detected hybridized probes are measured and determined. In some embodiments, the nucleic acids being detected and/or measured are nucleic acids of microorganisms, e.g., bacteria. In some embodiments, the nucleic acids being detected and/or measured are nucleic acids in or from a sample, e.g., a sample of contents of the alimentary canal of an organism.


Also provided herein are compositions and methods, as well as combinations, kits, and systems that include the compositions and methods, for characterizing, assessing, profiling and/or measuring a population of microorganisms (e.g., bacteria), and/or components or constituents thereof, in a sample (e.g., biological sample), for example, a sample of contents of an alimentary canal of an animal. In some embodiments, a method for characterizing, assessing, profiling and/or measuring a population of microorganisms in a sample includes subjecting nucleic acids in or from the sample to nucleic acid amplification using a combination of nucleic acid primer pairs that specifically amplify a pre-determined unique nucleic acid sequence of a microorganism's genome, and/or primer pairs that amplify a nucleic acid sequence that occurs in a homologous gene or genome region common to multiple microorganisms but that varies between different microorganisms. In particular embodiments, the primer pairs that amplify a nucleic acid sequence that occurs in a homologous gene or genome region include one or more primer pairs that amplify nucleic acids comprising nucleotide sequences of one or more hypervariable regions of a prokaryotic 16S rRNA gene. In some embodiments, the amplification using a combination of nucleic acid primer pairs is conducted in a single multiplex amplification reaction mixture. 
In some embodiments, the method for characterizing, assessing, profiling and/or measuring a population of microorganisms includes obtaining sequence information from nucleic acid products of amplification using the combination of primer pairs and/or determining the levels (e.g., relative and/or absolute levels) of nucleic acid products of the amplification and using the sequence information and/or level determinations to identify genera of microorganisms in the sample and species of one or more microorganisms in the sample, and optionally relative and/or absolute levels thereof, to characterize, assess, profile and/or measure a population of microorganisms, and/or components or constituents thereof, in the sample.
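As a minimal, non-limiting sketch of how relative levels might be determined from sequence information, the following Python example tallies reads that have already been assigned to genera and converts the counts to fractions of all assigned reads. The function name `relative_abundance`, the input format, and the example genus labels are illustrative assumptions and are not taken from this disclosure.

```python
from collections import Counter


def relative_abundance(read_assignments):
    """Return {taxon: fraction of assigned reads} for a list of taxon labels.

    `read_assignments` holds one label per read; reads that could not be
    assigned are represented by None and are excluded from the total.
    """
    counts = Counter(label for label in read_assignments if label is not None)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {taxon: n / total for taxon, n in counts.items()}


# Hypothetical input: ten assigned reads and one unassigned read.
reads = ["Bacteroides"] * 5 + ["Escherichia"] * 3 + ["Lactobacillus"] * 2 + [None]
profile = relative_abundance(reads)
for taxon, fraction in sorted(profile.items(), key=lambda kv: -kv[1]):
    print(f"{taxon}: {fraction:.2f}")
```

Excluding unassigned reads from the denominator is one possible design choice; absolute levels would additionally require a calibration that this sketch does not attempt.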


Also provided herein are compositions and methods, as well as combinations, kits, and systems that include the compositions and methods, for diagnosis and/or treatment, reduction in symptoms of, or prevention of microorganism (e.g., bacteria) imbalances and/or dysbiosis in a subject as well as conditions, disorders and diseases associated therewith. For example, in some instances, the microorganism imbalance and/or dysbiosis is in the alimentary canal, or gastrointestinal tract, of the subject. In some embodiments, diagnosis and/or treatment, reduction in symptoms of, or prevention of microorganism imbalances and/or dysbiosis in a subject includes subjecting nucleic acids in or from one or more samples from a subject to nucleic acid amplification, obtaining sequence information of the nucleic acid amplification products, detecting the presence or absence of one or more genera of microorganisms in the sample, and detecting the presence or absence of a disproportionate level of one or more microorganisms in the sample, wherein the presence of a disproportionate level of one or more microorganisms is indicative of a microorganism imbalance and/or dysbiosis in the subject. In some embodiments of treating a subject having a microorganism imbalance and/or dysbiosis, a subject who has a disproportionate level of one or more microorganisms is treated to establish a balance of microorganisms or biosis in the subject. In some embodiments, the amplification is performed using a plurality of nucleic acid primer pairs. In some embodiments, detecting the presence or absence of one or more microorganisms in a sample includes identifying the genus of one or more microorganisms in the sample. In some embodiments, detecting the presence or absence of one or more microorganisms in a sample includes identifying the genus of one or more microorganisms in the sample and identifying one or more species of microorganism in the sample.
In some embodiments, amplification is performed using a combination of nucleic acid primer pairs that specifically amplify a pre-determined unique nucleic acid sequence of a microorganism's genome, and/or primer pairs that amplify a nucleic acid sequence that occurs in a homologous gene or genome region common to multiple microorganisms but that varies between different microorganisms. In particular embodiments, the primer pairs that amplify a nucleic acid sequence that occurs in a homologous gene or genome region include one or more primer pairs that amplify nucleic acids comprising sequences of one or more hypervariable regions of a prokaryotic 16S rRNA gene. In some embodiments, the amplification is conducted in a single multiplex amplification reaction mixture. In some embodiments, obtaining nucleotide sequence information of nucleic acid amplification products includes detecting a nucleotide sequence using nucleic acids provided herein as a probe (e.g., a detectable or labeled probe). In some embodiments, obtaining nucleotide sequence information of nucleic acid amplification products includes conducting sequencing of the amplification products. In some embodiments, detecting the presence or absence of a disproportionate level of one or more microorganisms in the sample includes determining the relative levels of one or more microorganisms in the sample. Treating a subject having a disproportionate level of one or more microorganisms, in some embodiments, includes administering one or more microorganisms to the subject and/or one or more compositions that reduce the levels of or eliminate certain microorganisms, e.g., an antibiotic-containing composition.


In accordance with the teachings and principles embodied in this application, new methods, systems and non-transitory machine-readable storage media are provided to compress reference sequence databases used in mapping sequence reads for analysis and profiling of microbial populations.





BRIEF DESCRIPTION OF THE DRAWINGS

The accompanying drawings, which are incorporated into and form a part of the specification, illustrate one or more exemplary embodiments and serve to explain the principles of various exemplary embodiments. The drawings are exemplary and explanatory only and are not to be construed as limiting or restrictive in any way.



FIG. 1 is an illustration depicting the structure of a prokaryotic 16S ribosomal RNA (rRNA) gene showing 9 hypervariable regions 101 (boxes labelled as V1-V9) that are interspersed between conserved regions (white unlabeled boxes) of the gene. Arrows above the gene depict forward and reverse primers that hybridize to sequences of 8 targeted conserved hypervariable segment regions 102 at the indicated positions to amplify the hypervariable region between the arrowheads.



FIG. 2 illustrates a workflow for use in analysis of nucleotide sequence information generated in methods provided herein.



FIG. 3 is a block diagram of an exemplary workflow for processing sequence read data obtained from sequencing of amplified nucleic acids generated in amplification of microbial nucleic acids using 16S rRNA gene primers.



FIG. 4 is a block diagram of an exemplary workflow for processing sequence read data obtained from sequencing of amplified nucleic acids generated in amplification of microbial nucleic acids using species-specific nucleic acid primers.



FIG. 5A and FIG. 5B are each a graphic representation of the results of analysis using Spearman's rho of a comparison of the data of sequencing of four replicate aliquots of DNA amplicon libraries generated from six bacterial samples using a pool of 16S rRNA gene primers for amplification (FIG. 5A) or using a pool of species specific primers for amplification (FIG. 5B).



FIG. 6A is a bar graph showing the results of an analysis of reads from sequencing of a DNA amplicon library generated from a mixed bacteria sample using a pool of 16S rRNA gene primers for nucleic acid amplification. The numbers of reads mapping to different bacterial genera are shown. FIG. 6B depicts analytics from the analysis.



FIG. 7 shows graphs of Spearman's rho analyses of the results of sequencing of four replicate aliquots of a library generated from a mixture of bacterial DNA (Sample no. 1 (MSA1002)) using a 16S primer pool for amplification.



FIG. 8A is a bar graph showing the results of an analysis of reads from sequencing of a DNA amplicon library generated from a mixed bacteria sample using a pool of species-specific gene primers for nucleic acid amplification. The numbers of reads mapping to different bacterial species are shown. FIG. 8B depicts analytics from the analysis.



FIG. 9 shows graphs of Spearman's rho analyses of the results of sequencing of four replicate aliquots of a library generated from a mixture of bacterial DNA (Sample no. 1 (MSA1002)) using a species-specific primer pool for amplification.



FIG. 10 is a block diagram depicting various embodiments of nucleic acid sequencing platforms, e.g., sequencing instrument 200 can include a fluidic delivery and control unit 202, a sample processing unit 204, a signal detection unit 206, and a data acquisition, analysis and control unit 208.
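FIGS. 5A, 5B, 7 and 9 above report Spearman's rho for comparisons between replicate sequencing runs. As a minimal sketch of that statistic, the following Python example computes rho as the Pearson correlation of ranks, with tied values given the mean of their ranks. The function names and the per-taxon read counts are hypothetical and are not data from this disclosure.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the ranks of x and y."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("rho is undefined when one input is constant")
    return cov / (var_x * var_y) ** 0.5


# Hypothetical read counts per taxon in two replicate sequencing runs.
replicate_1 = [1200, 850, 430, 90, 15]
replicate_2 = [1150, 900, 410, 120, 10]
print(spearman_rho(replicate_1, replicate_2))
```

Because rho depends only on rank order, two replicates that order the taxa identically give a value of 1 even when the raw read counts differ.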





DETAILED DESCRIPTION

The following description of various exemplary embodiments is exemplary and explanatory only and is not to be construed as limiting or restrictive in any way. Other embodiments, features, objects, and advantages of the present teachings will be apparent from the description and accompanying drawings, and from the claims. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as is commonly understood by one of ordinary skill in the art to which these inventions belong.


Provided herein are compositions and methods, as well as combinations, kits, and systems that include the compositions and methods, for amplification, detection, characterization, assessment, profiling and/or measurement of nucleic acids, such as nucleic acids of microorganisms, including microbes, e.g., bacteria. The compositions and methods provided herein enable highly sensitive, specific, accurate, reproducible detection and identification of one or more microorganisms in a sample containing a complex population of microorganisms and other biological materials (e.g., cells that are not microorganisms). The compositions and methods further provide for accurate determination of relative and/or absolute levels or abundance of different microorganisms in such a sample. These, and other, aspects of the compositions and methods provided herein make them ideally suited for use, for example, in a number of methods, including, but not limited to, accurate and comprehensive methods for assessing or characterizing a population of microorganisms in a sample (e.g., biological sample) or methods for diagnosing and/or treating, reducing symptoms of, or preventing microorganism imbalances and/or dysbiosis in a subject, including such methods described and provided herein. In some embodiments, the compositions and methods further enable multiplex, including highly multiplexed, amplification of nucleic acids of microorganisms in a single amplification reaction mixture and thereby provide for rapid and high-throughput, yet sensitive and readily discernable amplification of nucleic acids from large numbers of different microorganisms as may be found, for example, in numerous different samples, such as from food, water, soil, and animal, e.g., human, specimens such as biofluids (e.g., saliva, sputum, mucus, blood, urine, semen), tissues, skin, respiratory tract, genitourinary tract and the microbiota of an alimentary canal (e.g., gut) of an animal.
In some embodiments, methods provided herein include a multiplex next generation sequencing workflow for accurate, sensitive, high-throughput assessment, characterization or profiling of a population of microorganisms that is used, for example, in correlating the microorganism composition of a subject (e.g., the microbiota of the alimentary canal, gastrointestinal tract, digestive tract or portion thereof of a subject) with states of health and diseases or disorders.


Definitions

As used herein, the terms “comprises,” “comprising,” “includes,” “including,” “has,” “having” or any other variation thereof, are intended to cover a non-exclusive inclusion. For example, a process, method, article, or apparatus that comprises a list of features is not necessarily limited only to those features but may include other features not expressly listed or inherent to such process, method, article, or apparatus. Further, unless expressly stated to the contrary, “or” refers to an inclusive-or and not to an exclusive-or.


As used herein, “organism” refers to a life form or living thing. Examples of organisms include microorganisms, unicellular organisms, multicellular organisms, plants and animals. Examples of animals include insects, fish, birds and mammals, including humans and non-human mammals.


As used herein, a “subject” refers to an organism, frequently an animal, e.g., a human or non-human animal, such as a mammal, that is a focus of study, investigation, treatment and/or from which information and/or material (e.g., a sample or specimen) is sought and/or obtained. In some instances, a subject can be a patient.


As used herein, “microorganism,” used interchangeably with “microbe” herein, refers to an organism of microscopic or submicroscopic size. Examples of microorganisms include bacteria, archaea, protists and fungi. Many microorganisms are unicellular and capable of dividing and proliferating. Microorganisms include prokaryotes, e.g., bacteria, and non-prokaryotic, e.g., eukaryotic, organisms.


As used herein, “microbiota” refers to a collection, population or community of microbes inhabiting a particular biological niche or ecosystem. Environments in which microbiota are found include soil, water, hydrothermal vents, and hosts, e.g., animal hosts. For example, the human microbiota is made up of the array of microbes colonizing a human, such as on or within human tissues and biofluids. Within the human microbiota are several habitats such as the skin, oral mucosa, respiratory tract, conjunctiva, genitourinary tract and the alimentary canal or tract, or gastrointestinal tract, often referred to as the “gut” microbiota. The genetic component (e.g., genes and genomes) of all the microbial cells in the microbiota is referred to herein as the “microbiome.”


As used herein, “sensitivity” with respect to detection and/or identification of a microorganism, e.g., bacterium, in a sample is a performance measure of methods of detecting or identifying a microorganism, for example at the genus and/or species level, that is based on calculating the true positive rate, i.e., the proportion of actual positives that are correctly identified as such. For example, one method of determining sensitivity of a nucleic acid sequencing and analysis method of detection or identification of a microorganism is to perform the method on a known control sample of microorganisms and then determine the percentage of sequence reads that are correctly and unambiguously assigned to a particular genus or species in the sample. The greater the sensitivity of detection or identification, the fewer the number of failures to detect the actual presence of a particular genus or species in a sample.


As used herein, “specificity” with respect to detection and/or identification of a microorganism, e.g., bacterium, in a sample is a performance measure of methods of detecting or identifying a microorganism, for example at the genus and/or species level, that is based on calculating the true negative rate, i.e., the proportion of actual negatives that are correctly identified as such. For example, one method of determining specificity of a nucleic acid sequencing and analysis method of detection or identification of a microorganism is to perform the method on a known control sample of microorganisms that is known to not include particular microorganisms and then determine the percentage of sequence reads that are incorrectly assigned to a particular genus or species that is absent from the sample. The greater the specificity of detection or identification, the fewer the number of errors in identification of a particular genus or species in a sample.
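The true positive rate and true negative rate named in the two definitions above can be illustrated with a short Python sketch. As a simplifying assumption, it scores calls per taxon (present or absent) rather than per sequence read, and the function name, the panel of candidate taxa, and the example labels are hypothetical.

```python
def sensitivity_specificity(expected_present, detected, panel):
    """Compute (sensitivity, specificity) over a panel of candidate taxa.

    expected_present: taxa known to be in the control sample
    detected:         taxa reported by the method
    panel:            every taxon the method could report
    """
    expected_present, detected, panel = set(expected_present), set(detected), set(panel)
    tp = len(expected_present & detected)   # present and detected
    fn = len(expected_present - detected)   # present but missed
    expected_absent = panel - expected_present
    fp = len(expected_absent & detected)    # absent but reported
    tn = len(expected_absent - detected)    # absent and not reported
    # True positive rate and true negative rate; NaN if a class is empty.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Hypothetical control: taxa A-D are present, E-H are absent.
panel = list("ABCDEFGH")
present = ["A", "B", "C", "D"]
called = ["A", "B", "C", "E"]  # D is missed; E is a false call
sens, spec = sensitivity_specificity(present, called, panel)
print(f"sensitivity={sens:.2f} specificity={spec:.2f}")
```

In this example one of four present taxa is missed and one of four absent taxa is wrongly reported, so both rates are 0.75.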


As used herein, the term “nucleic acid” refers to natural nucleic acids, artificial nucleic acids, analogs thereof, or combinations thereof, including polynucleotides and oligonucleotides. As used herein, the terms “polynucleotide” and “oligonucleotide” are used interchangeably and mean single-stranded, double-stranded, or partially double-stranded polymers of nucleotides including, but not limited to, 2′-deoxyribonucleotides (DNA) and ribonucleotides (RNA) linked by internucleotide phosphodiester bond linkages, e.g., 3′-5′ and 2′-5′, inverted linkages, e.g., 3′-3′ and 5′-5′, branched structures, or analog nucleic acids. Examples of partially double-stranded nucleic acids include double-stranded molecules having a 5′ and/or 3′ single-stranded overhang. Polynucleotides have associated counter ions, such as H+, NH4+, trialkylammonium, Mg2+, Na+ and the like. An oligonucleotide can be composed entirely of deoxyribonucleotides, entirely of ribonucleotides, or chimeric mixtures thereof. Oligonucleotides can be comprised of nucleobase and sugar analogs. Polynucleotides typically range in size from a few monomeric units, e.g., 5-40, when they are more commonly referred to in the art as oligonucleotides, to several thousands of monomeric nucleotide units, when they are more commonly referred to in the art as polynucleotides; for purposes of this disclosure, however, both oligonucleotides and polynucleotides may be of any suitable length. Unless denoted otherwise, whenever an oligonucleotide sequence is represented, it will be understood that the nucleotides are in 5′ to 3′ order from left to right and that “A” denotes deoxyadenosine, “C” denotes deoxycytidine, “G” denotes deoxyguanosine, “T” denotes thymidine, and “U” denotes deoxyuridine. The letters A, C, G, and T may be used to refer to the bases themselves, to nucleosides, or to nucleotides comprising the bases, as is standard in the art.
Oligonucleotides are said to have “5′ ends” and “3′ ends” because mononucleotides are typically reacted to form oligonucleotides via attachment of the 5′ phosphate or equivalent group of one nucleotide to the 3′ hydroxyl or equivalent group of its neighboring nucleotide, optionally via a phosphodiester or other suitable linkage.
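One consequence of writing sequences in 5′ to 3′ order is that the complementary strand, written by the same convention, is the reverse complement of the original. The following Python sketch illustrates this for the letters A, C, G, T and U as defined above, treating U (deoxyuridine) as pairing with A. The function name and the example sequence are hypothetical, and the sequence is not one of the sequences of this disclosure.

```python
# Complement table for the letters defined above; U (deoxyuridine) pairs with A.
_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "U": "A"}


def reverse_complement(seq_5_to_3):
    """Return the complementary strand, also written 5' to 3'."""
    try:
        return "".join(_COMPLEMENT[base] for base in reversed(seq_5_to_3.upper()))
    except KeyError as err:
        raise ValueError(f"unsupported nucleotide letter: {err.args[0]}") from None


# Hypothetical sequence used only to illustrate the convention.
primer = "ACGUTTGC"
print(reverse_complement(primer))  # prints GCAAACGT
```

Note that the complement is emitted with T opposite A, so applying the function twice returns the original sequence only when it contains no U.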


As used herein, the term “nucleotide” and its variants comprises any compound, including without limitation any naturally occurring nucleotide or analog thereof, which is able to hybridize to another nucleotide and/or can bind to, or can be polymerized by, a polymerase. Typically, but not necessarily, selective binding of the nucleotide to the polymerase is followed by polymerization of the nucleotide into a nucleic acid strand by the polymerase; occasionally however the nucleotide may dissociate from the polymerase without becoming incorporated into the nucleic acid strand, an event referred to herein as a “non-productive” event. Such nucleotides include not only naturally occurring nucleotides but also any analogs, regardless of their structure, that can bind selectively to, or can be polymerized by, a polymerase. While naturally occurring nucleotides typically comprise base, sugar and phosphate moieties, the nucleotides of the present disclosure can include compounds lacking any one, some or all of such moieties. In some embodiments, the nucleotide can optionally include a chain of phosphorus atoms comprising three, four, five, six, seven, eight, nine, ten or more phosphorus atoms. In some embodiments, the phosphorus chain can be attached to any carbon of a sugar ring, such as the 5′ carbon. The phosphorus chain can be linked to the sugar with an intervening O or S. In one embodiment, one or more phosphorus atoms in the chain can be part of a phosphate group having P and O. In another embodiment, the phosphorus atoms in the chain can be linked together with intervening O, NH, S, methylene, substituted methylene, ethylene, substituted ethylene, CNH2, C(O), C(CH2), CH2CH2, or C(OH)CH2R (where R can be a 4-pyridine or 1-imidazole). In one embodiment, the phosphorus atoms in the chain can have side groups having O, BH3, or S. In the phosphorus chain, a phosphorus atom with a side group other than O can be a substituted phosphate group. 
In the phosphorus chain, a phosphorus atom with an intervening atom other than O can be a substituted phosphate group. Some examples of nucleotide analogs are described in Xu, U.S. Pat. No. 7,405,281. In some embodiments, the nucleotide comprises a label and is referred to herein as a “labeled nucleotide”; the label of the labeled nucleotide is referred to herein as a “nucleotide label”. In some embodiments, the label can be in the form of a fluorescent dye attached to the terminal phosphate group, i.e., the phosphate group most distal from the sugar. Some examples of nucleotides that can be used in the disclosed methods and compositions include, but are not limited to, ribonucleotides, deoxyribonucleotides, modified ribonucleotides, modified deoxyribonucleotides, ribonucleotide polyphosphates, deoxyribonucleotide polyphosphates, modified ribonucleotide polyphosphates, modified deoxyribonucleotide polyphosphates, peptide nucleotides, modified peptide nucleotides, metallonucleosides, phosphonate nucleosides, and modified phosphate-sugar backbone nucleotides, analogs, derivatives, or variants of the foregoing compounds, and the like. In some embodiments, the nucleotide can comprise non-oxygen moieties such as, for example, thio- or borano-moieties, in place of the oxygen moiety bridging the alpha phosphate and the sugar of the nucleotide, or the alpha and beta phosphates of the nucleotide, or the beta and gamma phosphates of the nucleotide, or between any other two phosphates of the nucleotide, or any combination thereof. “Nucleotide 5′-triphosphate” refers to a nucleotide with a triphosphate ester group at the 5′ position, and is sometimes denoted as “NTP”, or “dNTP” and “ddNTP” to particularly point out the structural features of the ribose sugar. The triphosphate ester group can include sulfur substitutions for the various oxygens, e.g., α-thio-nucleotide 5′-triphosphates. For a review of nucleic acid chemistry, see: Shabarova, Z. and Bogdanov, A. Advanced Organic Chemistry of Nucleic Acids, VCH, New York, 1994.


As used herein, the term “hybridization” is consistent with its use in the art, and refers to the process whereby two nucleic acid molecules undergo base pairing interactions. Two nucleic acid molecules are said to be hybridized when any portion of one nucleic acid molecule is base paired with any portion of the other nucleic acid molecule; it is not necessarily required that the two nucleic acid molecules be hybridized across their entire respective lengths and in some embodiments, at least one of the nucleic acid molecules can include portions that are not hybridized to the other nucleic acid molecule. “Hybridizing conditions” are conditions (e.g., temperature, ionic strength, etc.) suitable for hybridization of two nucleic acids containing sequences of nucleotides that are capable of undergoing base pairing interaction. The phrase “hybridizing under stringent conditions” and its variants refers to conditions under which hybridization of two nucleic acid sequences, e.g., a target-specific primer and a target sequence, occurs in the presence of high hybridization temperature and low ionic strength. In one exemplary embodiment, stringent hybridization conditions include an aqueous environment containing about 30 mM magnesium sulfate, about 300 mM Tris-sulfate at pH 8.9, and about 90 mM ammonium sulfate at about 60-68° C., or equivalents thereof. As used herein, the phrase “standard hybridization conditions” and its variants refers to conditions under which hybridization of two nucleic acids occurs in the presence of low hybridization temperature and high ionic strength. In one exemplary embodiment, standard hybridization conditions include an aqueous environment containing about 100 mM magnesium sulfate, about 500 mM Tris-sulfate at pH 8.9, and about 200 mM ammonium sulfate at about 50-55° C., or equivalents thereof.


The terms “identity” and “identical” and their variants, as used herein, when used in reference to two or more nucleic acid sequences, refer to similarity in sequence of the two or more sequences (e.g., nucleotide or polypeptide sequences). In the context of two or more homologous sequences, the percent identity, similarity or homology of the sequences or subsequences thereof indicates the percentage of all monomeric units (e.g., nucleotides or amino acids) that are the same (i.e., about 70% identity or more, about 75%, 80%, 85%, 90%, 95%, 98% or 99% identity). The percent identity can be over a specified region, when compared and aligned for maximum correspondence over a comparison window, or designated region as measured using the BLAST or BLAST 2.0 sequence comparison algorithms with default parameters described below, or by manual alignment and visual inspection. Sequences are said to be “substantially identical” when there is at least 85% identity at the amino acid level or at the nucleotide level. Preferably, the identity exists over a region that is at least about 25, 50, or 100 residues in length, or across the entire length of at least one compared sequence. Typical algorithms for determining percent sequence identity and sequence similarity are the BLAST and BLAST 2.0 algorithms, which are described in Altschul et al., Nuc. Acids Res. 25:3389-3402 (1997). Other methods include the algorithms of Smith & Waterman, Adv. Appl. Math. 2:482 (1981), and Needleman & Wunsch, J. Mol. Biol. 48:443 (1970), etc. Another indication that two nucleic acid sequences are substantially identical is that the two molecules or their complements hybridize to each other under stringent hybridization conditions.
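The percent identity calculation described above can be illustrated with a minimal sketch. It assumes the two sequences have already been aligned (for example by BLAST or a Needleman-Wunsch implementation) and are supplied as equal-length strings with `-` marking gaps. The function names, the gap character, and the choice to count every aligned column in the denominator are illustrative assumptions and are not taken from the disclosure.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percentage of aligned columns in which both sequences carry the same residue.

    Assumption: columns where both sequences have a gap are ignored, and a
    column with a gap in only one sequence counts as a non-identical column.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == gap and b == gap)
    ]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * matches / len(columns)


def substantially_identical(aligned_a: str, aligned_b: str,
                            threshold: float = 85.0) -> bool:
    """Apply the 'at least 85% identity' criterion stated in the text."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    # One mismatch in ten aligned positions
    print(percent_identity("ACGTACGTAC", "ACGTACGTAA"))
    print(substantially_identical("ACGTACGTAC", "ACGTACGTAA"))
```

Different tools use different denominators (alignment length, shorter sequence length, and so on), so values from this sketch may not match the percent identity reported by a given BLAST run.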


The terms “complementary” and “complement” and their variants, as used herein, refer to any two or more nucleic acid sequences (e.g., portions or entireties of template nucleic acid molecules, target sequences and/or primers) that can undergo cumulative base pairing at two or more individual corresponding positions in antiparallel orientation, as in a hybridized duplex. Such base pairing can proceed according to any set of established rules, for example according to Watson-Crick base pairing rules or according to some other base pairing paradigm. Optionally there can be “complete” or “total” complementarity between a first and second nucleic acid sequence where each nucleotide in the first nucleic acid sequence can undergo a stabilizing base pairing interaction with a nucleotide in the corresponding antiparallel position on the second nucleic acid sequence. “Partial” complementarity describes nucleic acid sequences in which at least 20%, but less than 100%, of the residues of one nucleic acid sequence are complementary to residues in the other nucleic acid sequence. In some embodiments, at least 50%, but less than 100%, of the residues of one nucleic acid sequence are complementary to residues in the other nucleic acid sequence. In some embodiments, at least 70%, 80%, 90%, 95% or 98%, but less than 100%, of the residues of one nucleic acid sequence are complementary to residues in the other nucleic acid sequence. Sequences are said to be “substantially complementary” when at least 85% of the residues of one nucleic acid sequence are complementary to residues in the other nucleic acid sequence. In some embodiments, two complementary or substantially complementary sequences are capable of hybridizing to each other under standard or stringent hybridization conditions. “Non-complementary” describes nucleic acid sequences in which less than 20% of the residues of one nucleic acid sequence are complementary to residues in the other nucleic acid sequence. 
Sequences are said to be “substantially non-complementary” when less than 15% of the residues of one nucleic acid sequence are complementary to residues in the other nucleic acid sequence. In some embodiments, two non-complementary or substantially non-complementary sequences cannot hybridize to each other under standard or stringent hybridization conditions. A “mismatch” is present at any position in which the two opposed nucleotides are not complementary. Complementary nucleotides include nucleotides that are efficiently incorporated by DNA polymerases opposite each other during DNA replication under physiological conditions. In a typical embodiment, complementary nucleotides can form base pairs with each other, such as the A-T/U and G-C base pairs formed through specific Watson-Crick type hydrogen bonding, or base pairs formed through some other type of base pairing paradigm, between the nucleobases of nucleotides and/or polynucleotides in positions antiparallel to each other. The complementarity of other artificial base pairs can be based on other types of hydrogen bonding and/or hydrophobicity of bases and/or shape complementarity between bases.
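The complementarity thresholds defined above can be sketched as follows. The example assumes two equal-length sequences, both written 5′ to 3′, that are opposed in antiparallel orientation with no gaps, and it models only Watson-Crick A-T/U and G-C pairing. The function names and the decision to return every label that applies (the categories in the text overlap) are illustrative assumptions.

```python
# Watson-Crick pairs (A-T/U and G-C); other pairing paradigms are not modeled
PAIRS = {("A", "T"), ("T", "A"), ("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def percent_complementary(first_5to3: str, second_5to3: str) -> float:
    """Percent of positions that base pair when the strands are antiparallel."""
    if not first_5to3 or len(first_5to3) != len(second_5to3):
        raise ValueError("sequences must be non-empty and of equal length")
    # Reverse the second strand so that position i faces position i
    second_3to5 = second_5to3.upper()[::-1]
    paired = sum((x, y) in PAIRS for x, y in zip(first_5to3.upper(), second_3to5))
    return 100.0 * paired / len(first_5to3)


def complementarity_labels(pct: float) -> set:
    """Return every label from the definitions above that applies to pct."""
    labels = set()
    if pct == 100.0:
        labels.add("complete")
    if 20.0 <= pct < 100.0:
        labels.add("partial")
    if pct >= 85.0:
        labels.add("substantially complementary")
    if pct < 20.0:
        labels.add("non-complementary")
    if pct < 15.0:
        labels.add("substantially non-complementary")
    return labels


if __name__ == "__main__":
    # The second sequence is the reverse complement of the first except for
    # its last base, which creates a single mismatch
    pct = percent_complementary("ACGTACGTAC", "GTACGTACGA")
    print(pct, sorted(complementarity_labels(pct)))
```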


As used herein, “sample” and its derivatives, is used in its broadest sense and includes any specimen, culture and the like that may include a composition of interest, such as a target. In some embodiments, the sample comprises cDNA, RNA, PNA, LNA, chimeric, hybrid, or multiplex-forms of nucleic acids. The sample can include any biological, clinical, surgical, agricultural, atmospheric or aquatic-based specimen containing one or more organisms and/or nucleic acids. One example of a biological or clinical sample is a sample of the contents of the alimentary canal of an animal. The alimentary canal is the continuous passageway, beginning at the mouth and ending at the anus, through which food and liquids are ingested, digested and absorbed and waste is processed and eliminated. The alimentary canal or tract is also referred to herein as the gastrointestinal tract and gut, and includes multiple organs. An example of a sample from the alimentary canal is a fecal sample. In some instances, at least some nucleic acids in a sample may be contained within a cell. In some instances, nucleic acids may be extracted from one or more cells in a sample. In some instances, the term “nucleic acid sample” can refer to a sample containing nucleic acids within a cell or organism or not within a cell or organism and/or nucleic acids extracted from the sample. The term also includes any isolated nucleic acid sample such as expressed RNA, fresh-frozen or formalin-fixed paraffin-embedded nucleic acid specimens.


As used herein, “homologous” or “homolog” and derivatives thereof, when used in reference to a portion of a genome or gene, refers to genomic segments or genes that display conserved sequences of substantial sequence similarity in multiple organisms, e.g., multiple organisms of a domain, kingdom, phylum, class, order, family, genus and/or species, but that also have differences in sequence. Examples of homologous genes include, but are not limited to, the 16S rRNA gene, 18S rRNA gene, 23S rRNA gene and ABC transporter genes.


As used herein, “unique” when used in reference to a nucleic acid sequence in an organism or group of organisms refers to a nucleotide sequence of a nucleic acid (e.g., a segment or portion of a genome) in an organism or group of organisms that is sufficiently different from sequences in the genomes of other organisms or other groups of organisms such that it can be used to selectively detect or identify the organism, or members of a group of organisms, and/or distinguish the organism, or members of a group of organisms, from some, most, the majority of or substantially all different organisms or organisms that are not in the group of organisms. Such unique sequences are also referred to herein as “signature sequences” or “signature regions” of nucleic acids of an organism or group of organisms. For example, a nucleic acid sequence of nucleotides may be unique to an individual organism, unique to members of a strain of a species of organism, unique to members of a species of organism, unique to members of a genus of organisms, unique to members of a family of organisms, unique to members of an order of organisms, unique to members of a class of organisms, unique to members of a phylum of organisms, unique to members of a kingdom of organisms and/or unique to members of a domain of organisms. Typically, the difference in a unique sequence is the identity and/or order of consecutive nucleotides or nucleobases in the sequence. In some embodiments, a unique sequence is unique to the organism in comparison to, or with respect to, some specified group of organisms (e.g., organisms in the same kingdom, phylum, class, order, family, genus, species) but may not be unique to the organism in comparison to the totality of all other organisms or all other organisms outside of the specified group. 
A unique nucleotide sequence can be any length, for example, between about 20 and 1000 nucleotides, 30 and 750 nucleotides, 40 and 500 nucleotides, 50 and 400 nucleotides, 50 and 350 nucleotides, 50 and 300 nucleotides, 50 and 250 nucleotides, 50 and 200 nucleotides, 50 and 150 nucleotides or 50 and 100 nucleotides. In some embodiments, a unique nucleotide sequence can be about 1000 nucleotides or less, about 750 nucleotides or less, about 500 nucleotides or less, about 400 nucleotides or less, about 350 nucleotides or less, about 300 nucleotides or less, about 250 nucleotides or less, about 200 nucleotides or less, about 150 nucleotides or less, about 100 nucleotides or less, or about 50 nucleotides or less in length. In some embodiments, a unique nucleotide sequence can be greater than about 25 nucleotides, greater than about 40 nucleotides, greater than about 50 nucleotides, greater than about 60 nucleotides, greater than about 70 nucleotides, greater than about 75 nucleotides, greater than about 90 nucleotides, greater than about 95 nucleotides, greater than about 100 nucleotides, greater than about 150 nucleotides, greater than about 175 nucleotides, greater than about 200 nucleotides, greater than about 250 nucleotides, greater than about 275 nucleotides, greater than about 300 nucleotides, greater than about 325 nucleotides, greater than about 350 nucleotides or greater than about 400 nucleotides in length. In some embodiments, the unique sequence is such that it can be used to selectively detect, identify and/or distinguish an organism, or members of a group of organisms, by binding to, hybridizing to and/or being amplified by specific nucleic acid probes and/or primers that specifically or selectively or uniquely bind to, hybridize to and/or amplify the unique sequence, particularly in the presence of nucleic acids of other organisms or organisms that are not members of the group of organisms. 
For example, in some embodiments, a unique, or signature, sequence of an organism (e.g., microorganism, such as bacterium), or group of organisms, is a sequence that has less than 60%, less than 65%, less than 70%, less than 75%, less than 80%, less than 81%, less than 82%, less than 83%, less than 84%, less than 85%, less than 86%, less than 87%, less than 88%, less than 89%, less than 90%, less than 91%, less than 92%, less than 93%, less than 94%, or less than 95% identity to a sequence of nucleotides in a different organism or specified group of organisms. In some embodiments, a unique sequence has less than 90% identity to a sequence of nucleotides in a different organism or specified group of organisms. In some embodiments, a unique, or signature, sequence of an organism (e.g., microorganism, such as bacterium), or group of organisms, has less than 25%, less than 20%, less than 19%, less than 18%, less than 17%, less than 16%, less than 15%, less than 14%, less than 13%, less than 12%, or less than 10% nucleotides that match nucleotides in a sequence of nucleotides of a similar length in a different organism or specified group of organisms. In some embodiments, a unique sequence has less than 17% nucleotides that match nucleotides in a sequence of nucleotides in a different organism or specified group of organisms. In some embodiments, a unique sequence has less than 90% identity to a sequence of nucleotides in a different organism or specified group of organisms and has less than 17% nucleotides that match nucleotides in a sequence of nucleotides in a different organism or specified group of organisms. 
In some embodiments, a unique sequence within a group of organisms (e.g., a species of bacteria) is at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98% or at least 99% identical among the majority of or substantially all members (e.g., strains of a species) of the group (e.g., a species). In some embodiments, a unique sequence within a group of organisms is at least 95%, at least 96%, or at least 97% identical among the majority of or substantially all members (e.g., strains of a species) of the group. In some embodiments, a specified identity of the unique sequence within a group of organisms is among at least or greater than 75%, at least or greater than 80%, at least or greater than 85%, at least or greater than 90%, or at least or greater than 95% of the members of the group. In some embodiments, the nucleotide sequence of a unique sequence within a group of organisms (e.g., a species of bacteria) has at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98% or at least 99% matching nucleotides among the members of the group. In some embodiments, the nucleotide sequence of the unique sequence within a group of organisms has at least 95% nucleotides matching among the members of the group. In some embodiments, a unique sequence within a group of organisms (e.g., a species of bacteria) is at least 95% identical and has at least 95% nucleotides matching among at least or greater than 90% of the members of the group.
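One way to picture the two-sided criterion for a signature sequence (dissimilar to sequences outside the group, conserved within the group) is the sketch below. It uses one combination of thresholds named in the text: less than 90% identity to any out-of-group sequence, and at least 95% identity among at least 90% of group members. As simplifying assumptions, all sequences are taken to be pre-aligned and of equal length, identity is computed position by position, and the separate "matching nucleotides" criterion is not modeled. The function and parameter names are hypothetical.

```python
from typing import Sequence


def identity(a: str, b: str) -> float:
    """Position-by-position percent identity of two equal-length sequences."""
    if not a or len(a) != len(b):
        raise ValueError("sequences must be non-empty and of equal length")
    return 100.0 * sum(x == y for x, y in zip(a.upper(), b.upper())) / len(a)


def is_signature(candidate: str,
                 in_group: Sequence[str],
                 out_group: Sequence[str],
                 max_outgroup_identity: float = 90.0,
                 min_ingroup_identity: float = 95.0,
                 min_member_fraction: float = 0.90) -> bool:
    """Check exclusivity against out-of-group sequences and conservation in-group."""
    # Exclusivity: strictly less than the threshold against every outsider
    exclusive = all(identity(candidate, s) < max_outgroup_identity for s in out_group)
    # Conservation: enough group members carry a near-identical sequence
    conserved = sum(identity(candidate, s) >= min_ingroup_identity for s in in_group)
    inclusive = bool(in_group) and conserved / len(in_group) >= min_member_fraction
    return exclusive and inclusive


if __name__ == "__main__":
    candidate = "ACGTACGTACGTACGTACGT"
    strain_variant = "ACGTACGTACGTACGTACGA"   # one difference in 20 positions
    unrelated = "TTTTTTTTTTTTTTTTTTTT"
    print(is_signature(candidate, [candidate, strain_variant], [unrelated]))
```

In practice the in-group and out-group comparisons would be made against reference genomes with an alignment tool rather than against pre-aligned strings.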


As used herein, “synthesizing” and its derivatives, refers to a reaction involving nucleotide polymerization by a polymerase, optionally in a template-dependent fashion. Polymerases synthesize an oligonucleotide via transfer of a nucleoside monophosphate from a nucleoside triphosphate (NTP), deoxynucleoside triphosphate (dNTP) or dideoxynucleoside triphosphate (ddNTP) to the 3′ hydroxyl of an extending oligonucleotide chain. For the purposes of this disclosure, synthesizing includes the serial extension of a hybridized adapter or a target-specific primer via transfer of a nucleoside monophosphate from a deoxynucleoside triphosphate.


As used herein, “polymerase” and its derivatives, refers to any enzyme that can catalyze the polymerization of nucleotides (including analogs thereof) into a nucleic acid strand. Typically but not necessarily, such nucleotide polymerization can occur in a template-dependent fashion. Such polymerases can include without limitation naturally occurring polymerases and any subunits and truncations thereof, mutant polymerases, variant polymerases, recombinant, fusion or otherwise engineered polymerases, chemically modified polymerases, synthetic molecules or assemblies, and any analogs, derivatives or fragments thereof that retain the ability to catalyze such polymerization. Optionally, the polymerase can be a mutant polymerase comprising one or more mutations involving the replacement of one or more amino acids with other amino acids, the insertion or deletion of one or more amino acids from the polymerase, or the linkage of parts of two or more polymerases. Typically, the polymerase comprises one or more active sites at which nucleotide binding and/or catalysis of nucleotide polymerization can occur. Some exemplary polymerases include without limitation DNA polymerases and RNA polymerases. The term “polymerase” and its variants, as used herein, also refers to fusion proteins comprising at least two portions linked to each other, where the first portion comprises a peptide that can catalyze the polymerization of nucleotides into a nucleic acid strand and is linked to a second portion that comprises a second polypeptide. In some embodiments, the second polypeptide can include a reporter enzyme or a processivity-enhancing domain. Optionally, the polymerase can possess 5′ exonuclease activity or terminal transferase activity. In some embodiments, the polymerase can be optionally reactivated, for example through the use of heat, chemicals or re-addition of new amounts of polymerase into a reaction mixture. 
In some embodiments, the polymerase can include a hot-start polymerase or an aptamer based polymerase that optionally can be reactivated.


As used herein, “amplify”, “amplifying” or “amplification reaction” and their derivatives, refer to any action or process whereby at least a portion of a nucleic acid molecule (referred to as a template nucleic acid molecule, which can contain a target sequence) is replicated or copied into at least one additional nucleic acid molecule. The additional nucleic acid molecule optionally includes sequence that is substantially identical or substantially complementary to at least some portion of the template nucleic acid molecule. The template nucleic acid molecule can be single-stranded or double-stranded and the additional nucleic acid molecule can independently be single-stranded or double-stranded. In some embodiments, amplification includes a template-dependent in vitro enzyme-catalyzed reaction for the production of at least one copy of at least some portion of the nucleic acid molecule or the production of at least one copy of a nucleic acid sequence that is complementary to at least some portion of the nucleic acid molecule. Amplification optionally includes linear or exponential replication of a nucleic acid molecule. In some embodiments, such amplification is performed using isothermal conditions; in other embodiments, such amplification can include thermocycling. In some embodiments, the amplification is a multiplex amplification that includes the simultaneous amplification of a plurality of target sequences in a single amplification reaction. At least some of the target sequences can be situated on the same nucleic acid molecule or on different target nucleic acid molecules included in the single amplification reaction. In some embodiments, “amplification” includes amplification of at least some portion of DNA- and RNA-based nucleic acids alone, or in combination. 
The amplification reaction can include single- or double-stranded nucleic acid substrates and can further include any processes of amplification techniques known to one of ordinary skill in the art. In some embodiments, the amplification reaction includes polymerase chain reaction (PCR).


As used herein, “amplification conditions” and its derivatives, refers to conditions suitable for amplifying one or more nucleic acid sequences. Such amplification can be linear or exponential. In some embodiments, the amplification conditions can include isothermal conditions or alternatively can include thermocycling conditions, or a combination of isothermal and thermocycling conditions. In some embodiments, the conditions suitable for amplifying one or more nucleic acid sequences include polymerase chain reaction (PCR) conditions. Typically, the amplification conditions refer to a reaction mixture that is sufficient to amplify nucleic acids such as one or more target sequences, or to amplify an amplified target sequence ligated to one or more adapters, e.g., an adapter-ligated amplified target sequence. Amplification conditions include a catalyst for amplification or for nucleic acid synthesis, for example a polymerase; a primer that possesses some degree of complementarity to the nucleic acid to be amplified; and nucleotides, such as deoxyribonucleotide triphosphates (dNTPs) to promote extension of the primer once hybridized to the nucleic acid. The amplification conditions can require hybridization or annealing of a primer to a nucleic acid, extension of the primer and a dissociation step, e.g., denaturing, in which the extended primer is separated from the nucleic acid sequence undergoing amplification. Typically, but not necessarily, amplification conditions can include thermocycling; in some embodiments, amplification conditions include a plurality of cycles where the amplification steps of annealing, extending and separating are repeated. Typically, the amplification conditions include cations such as Mg++ or Mn++ (e.g., MgCl2, etc.) and can also include various modifiers of ionic strength.


As defined herein, “multiplex amplification” refers to selective and non-random amplification of two or more target sequences within a sample using at least one specific primer. In some embodiments, multiplex amplification is performed such that some or all of the target sequences are amplified within a single reaction vessel. The “plexy” or “plex” of a given multiplex amplification refers to the number of different target-specific sequences that are amplified during that single multiplex amplification. In some embodiments, the plexy can be about 12-plex, 24-plex, 48-plex, 74-plex, 96-plex, 120-plex, 144-plex, 168-plex, 192-plex, 216-plex, 240-plex, 264-plex, 288-plex, 312-plex, 336-plex, 360-plex, 384-plex, or 398-plex.


As used herein, the term “polymerase chain reaction” (“PCR”) refers to the method of K. B. Mullis U.S. Pat. Nos. 4,683,195 and 4,683,202, hereby incorporated by reference, which describe a method for increasing the concentration of a segment of a polynucleotide of interest in a mixture of expressed RNA or cDNA without cloning or purification. This process for amplifying the polynucleotide of interest consists of introducing a large excess of two oligonucleotide primers to the DNA mixture containing the desired polynucleotide of interest, followed by a precise sequence of thermal cycling in the presence of a DNA polymerase. The two primers are complementary to their respective strands of the double stranded polynucleotide of interest. To effect amplification, the mixture is denatured and the primers then annealed to their complementary sequences within the polynucleotide of interest molecule. Following annealing, the primers are extended with a polymerase to form a new pair of complementary strands. The steps of denaturation, primer annealing and polymerase extension can be repeated many times (i.e., denaturation, annealing and extension constitute one “cycle”; there can be numerous “cycles”) to obtain a high concentration of an amplified segment of the desired polynucleotide of interest. The length of the amplified segment of the desired polynucleotide of interest (amplicon) is determined by the relative positions of the primers with respect to each other, and therefore, this length is a controllable parameter. By virtue of repeating the process, the method is referred to as the “polymerase chain reaction” (hereinafter “PCR”). Because the desired amplified segments of the polynucleotide of interest become the predominant nucleic acid sequences (in terms of concentration) in the mixture, they are said to be “PCR amplified”. As defined herein, target nucleic acid molecules within a sample including a plurality of target nucleic acid molecules are amplified via PCR. 
In a modification to the method discussed above, the target nucleic acid molecules can be PCR amplified using a plurality of different primer pairs, in some cases, one or more primer pairs per target nucleic acid molecule of interest, thereby forming a multiplex PCR reaction. Using multiplex PCR, it is possible to simultaneously amplify multiple nucleic acid molecules of interest from a sample to form amplified target sequences. It is also possible to detect the amplified target sequences by several different methodologies (e.g., quantitation with a bioanalyzer or qPCR, hybridization with a labeled probe; incorporation of biotinylated primers followed by avidin-enzyme conjugate detection; incorporation of 32P-labeled deoxynucleotide triphosphates, such as dCTP or dATP, into the amplified target sequence). Any oligonucleotide sequence can be amplified with the appropriate set of primers, thereby allowing for the amplification of target nucleic acid molecules from RNA, cDNA, formalin-fixed paraffin-embedded DNA, fine-needle biopsies and various other sources. In particular, the amplified target sequences created by the multiplex PCR process as disclosed herein, are themselves efficient substrates for subsequent PCR amplification or various downstream assays or manipulations.
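Two quantitative points in the PCR description above, that repeated cycles raise the concentration of the amplified segment and that amplicon length is fixed by the primer positions, can be illustrated with a short sketch. The growth formula is the commonly used idealized model in which each cycle multiplies the copy number by (1 + efficiency); it is not taken from the disclosure. The function names, the efficiency parameter, and the coordinate convention are assumptions made for illustration.

```python
def copies_after_cycles(initial_copies: float, cycles: int,
                        efficiency: float = 1.0) -> float:
    """Idealized copy number after a number of PCR cycles.

    Assumption: every cycle multiplies the copy number by (1 + efficiency),
    where efficiency = 1.0 means perfect doubling. Real reactions plateau.
    """
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    return initial_copies * (1.0 + efficiency) ** cycles


def amplicon_length(forward_primer_start: int, reverse_primer_end: int) -> int:
    """Amplicon length from primer positions on the template's top strand.

    Assumption: 0-based inclusive coordinates, where forward_primer_start is
    the 5' end of the forward primer and reverse_primer_end is the template
    position paired with the 5' end of the reverse primer.
    """
    if reverse_primer_end < forward_primer_start:
        raise ValueError("reverse primer must lie downstream of forward primer")
    return reverse_primer_end - forward_primer_start + 1


if __name__ == "__main__":
    print(copies_after_cycles(10, 30))        # perfect doubling for 30 cycles
    print(copies_after_cycles(10, 30, 0.9))   # 90% per-cycle efficiency
    print(amplicon_length(100, 249))
```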


As used herein, “reamplifying” or “reamplification” and their derivatives refer to any process whereby at least a portion of an amplified nucleic acid molecule is further amplified via any suitable amplification process (referred to in some embodiments as a “secondary” amplification or “reamplification”), thereby producing a reamplified nucleic acid molecule. The secondary amplification need not be identical to the original amplification process whereby the amplified nucleic acid molecule was produced; nor need the reamplified nucleic acid molecule be completely identical or completely complementary to the amplified nucleic acid molecule; all that is required is that the reamplified nucleic acid molecule include at least a portion of the amplified nucleic acid molecule or its complement. For example, the reamplification can involve the use of different amplification conditions and/or different primers, including different target-specific primers than the primary amplification.


The term “extension” and its variants, as used herein, when used in reference to a given primer, comprises any in vivo or in vitro enzymatic activity characteristic of a given polymerase that relates to polymerization of one or more nucleotides onto an end of an existing nucleic acid molecule. Typically but not necessarily such primer extension occurs in a template-dependent fashion; during template-dependent extension, the order and selection of bases is driven by established base pairing rules, which can include Watson-Crick type base pairing rules or alternatively (and especially in the case of extension reactions involving nucleotide analogs) by some other type of base pairing paradigm. In one non-limiting example, extension occurs via polymerization of nucleotides on the 3′OH end of the nucleic acid molecule by the polymerase.


The term “portion” and its variants, as used herein, when used in reference to a given nucleic acid molecule, for example a primer or a template nucleic acid molecule, comprises any number of contiguous nucleotides within the length of the nucleic acid molecule, including the partial or entire length of the nucleic acid molecule.


As used herein, “target sequence” or “target sequence of interest” and its derivatives, refers to any single or double-stranded nucleic acid sequence that can be bound to, hybridized to, amplified and/or synthesized according to the disclosure, including, for example, any nucleic acid sequence suspected to be, expected to be, or that could potentially be present in a sample. In some embodiments, the target sequence is present in double-stranded form and includes at least a portion of the particular nucleotide sequence to be bound, hybridized, amplified and/or synthesized, or its complement, prior to the addition of specific primers or appended adapters. In some embodiments, a target sequence is a part of a target. For example, a target nucleic acid sequence can be a sequence located in a target gene, a target genome and/or a target organism, e.g., bacteria, or a specific family, genus or species of a target organism, e.g., Ruminococcaceae family, Ruminococcus genus, and R. gnavus species. Target sequences can include the nucleic acids to which primers useful in an amplification or synthesis reaction can hybridize prior to extension by a polymerase. In some instances, a target sequence is a sequence adjacent to and contiguous with a sequence to which a primer used to amplify the target sequence hybridizes. In some embodiments, the term refers to a nucleic acid sequence whose sequence identity, ordering or location of nucleotides is determined by one or more of the methods of the disclosure.


As used herein, “amplified target sequence” and its derivatives, refers to a nucleic acid sequence produced by amplifying the target sequence using specific primers and the methods provided herein. The amplified target sequences may be either of the same sense (i.e., the positive strand produced in the second round and subsequent even-numbered rounds of amplification) or antisense (i.e., the negative strand produced during the first and subsequent odd-numbered rounds of amplification) with respect to the target sequences. In some embodiments, the amplified target sequences are typically less than 50% complementary to any portion of another amplified target sequence in the reaction. As used herein, “amplicon” refers to the total nucleic acid that results from an amplification using primers and methods such as provided herein. In some instances, an amplicon may be the same as a target sequence. In some instances, when a target nucleic acid sequence is defined as not including primer sequences, an amplicon includes an amplified target sequence as well as the primers used to amplify the target sequence located at each end of the amplified target sequence. In such cases, the target sequence can be referred to as the “insert” of the amplicon.
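By way of non-limiting illustration, the relationship between an amplicon and its “insert” can be sketched in a few lines of Python. The function names, the example sequences and the convention that the forward primer matches the 5′ end of the reported strand (while the reverse primer appears as its reverse complement at the 3′ end) are assumptions made for this sketch only and are not taken from the disclosure.

```python
# Illustrative sketch only: recover the "insert" of an amplicon by trimming
# the primer-derived sequence from each end. All names are hypothetical.

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def extract_insert(amplicon: str, forward_primer: str, reverse_primer: str) -> str:
    """Return the amplicon sequence lying between the two primer sites.

    Assumes the amplicon is written 5'->3' on the strand that begins with the
    forward primer and ends with the reverse complement of the reverse primer.
    """
    amplicon = amplicon.upper()
    fwd = forward_primer.upper()
    rev_rc = reverse_complement(reverse_primer)
    if len(fwd) + len(rev_rc) > len(amplicon):
        raise ValueError("primers are longer than the amplicon")
    if not amplicon.startswith(fwd) or not amplicon.endswith(rev_rc):
        raise ValueError("amplicon is not flanked by the given primer pair")
    return amplicon[len(fwd):len(amplicon) - len(rev_rc)]


# Made-up example sequences, not sequences from the Sequence Listing.
example_amplicon = "ACGT" + "GGGCCCAAA" + reverse_complement("TTGCA")
example_insert = extract_insert(example_amplicon, "ACGT", "TTGCA")
```

In this sketch an exact match to both primers is required; an actual analysis would typically tolerate mismatches and sequencing errors.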


As used herein, the terms “primer,” “probe,” and derivatives thereof refer to any polynucleotide that can hybridize to a target sequence of interest. In some embodiments, the primer can also serve to prime nucleic acid synthesis. Typically, the primer functions as a substrate onto which nucleotides can be polymerized by a polymerase; in some embodiments, however, the primer can become incorporated into the synthesized nucleic acid strand and provide a site to which another primer can hybridize to prime synthesis of a new strand that is complementary to the synthesized nucleic acid molecule. A primer or probe may be comprised of any combination of nucleotides or analogs thereof, which may be optionally linked to form a linear polymer of any suitable length. In some embodiments, the primer is a single-stranded oligonucleotide or polynucleotide. (For purposes of this disclosure, the terms “polynucleotide” and “oligonucleotide” are used interchangeably herein and do not necessarily indicate any difference in length between the two.) In some embodiments, the primer or probe is single-stranded but it can also be double-stranded. A primer or probe optionally occurs naturally, as in a purified restriction digest, or can be produced synthetically. In some embodiments, the primer acts as a point of initiation for amplification or synthesis when exposed to amplification or synthesis conditions; such amplification or synthesis can occur in a template-dependent fashion and optionally results in formation of a primer extension product that is complementary to at least a portion of the target sequence. Exemplary amplification or synthesis conditions can include contacting the primer with a polynucleotide template (e.g., a template including a target sequence), nucleotides and an inducing agent such as a polymerase at a suitable temperature and pH to induce polymerization of nucleotides onto an end of the target-specific primer.
If double-stranded, a primer or probe can optionally be treated to separate its strands before being used to prepare primer extension products. In some embodiments, the primer or probe is an oligodeoxyribonucleotide or an oligoribonucleotide. In some embodiments, the primer or probe can include one or more nucleotide analogs. The exact length and/or composition, including sequence, of a primer or probe can influence many properties, including melting temperature (Tm), GC content, formation of secondary structures, repeat nucleotide motifs, length of predicted primer extension products, extent of coverage across a nucleic acid molecule of interest, number of primers present in a single amplification or synthesis reaction, presence of nucleotide analogs or modified nucleotides within the primers, and the like. In some embodiments, a primer can be paired with a compatible primer within an amplification or synthesis reaction to form a primer pair made up of a forward primer and a reverse primer. In some embodiments, the forward primer of the primer pair includes a sequence that is substantially complementary to at least a portion of a strand of a nucleic acid molecule, and the reverse primer of the primer pair includes a sequence that is substantially identical to at least a portion of the strand. In some embodiments, the forward primer and the reverse primer are capable of hybridizing to opposite strands of a nucleic acid duplex. Optionally, the forward primer primes synthesis of a first nucleic acid strand, and the reverse primer primes synthesis of a second nucleic acid strand, wherein the first and second strands are substantially complementary to each other, or can hybridize to form a double-stranded nucleic acid molecule. In some embodiments, one end of an amplification or synthesis product is defined by the forward primer and the other end of the amplification or synthesis product is defined by the reverse primer.
In some embodiments, where the amplification or synthesis of lengthy primer extension products is required, such as amplifying an exon, coding region, or gene, several primer pairs can be created that span the desired length to enable sufficient amplification of the region. In some embodiments, a primer or probe can include one or more cleavable groups. Primers and probes can be of any length. In some embodiments, a probe may be about 200 or less nucleotides, 175 nucleotides or less, 150 or less nucleotides, 125 nucleotides or less, 100 or less nucleotides, 90 nucleotides or less, 80 or less nucleotides, 75 nucleotides or less, 70 or less nucleotides, 60 nucleotides or less, 55 or less nucleotides, 50 nucleotides or less, 40 or less nucleotides, 35 nucleotides or less, 30 or less nucleotides, 25 nucleotides or less, 20 or less nucleotides, 15 nucleotides or less, or 10 or less nucleotides in length. In some embodiments, primer lengths are in the range of about 10 to about 60 nucleotides, about 12 to about 50 nucleotides and about 15 to about 40 nucleotides in length. Typically, a primer is capable of hybridizing to a corresponding target sequence and undergoing primer extension when exposed to amplification conditions in the presence of dNTPs and a polymerase. In some instances, the particular nucleotide sequence or a portion of the primer is known at the outset of the amplification reaction or can be determined by one or more of the methods disclosed herein. In some embodiments, a primer includes one or more cleavable groups at one or more locations within the primer. In some embodiments, a mixture of primers can be degenerate primers. Degenerate primers are primers having similar sequences but that differ at one or more nucleotide positions such that one primer may have an A at the position, another may have a G at the same position, another may have a T at the same position and a fourth primer may have a C at the same position. 
Probes and/or primers may be labeled. Labels are frequently used in detecting a primer or probe that has bound to or hybridized to another nucleic acid, for example, for the purpose of detecting a particular sequence to which the primer or probe specifically binds. Compositions and methods for labeling nucleic acids for use as detectable probes are known in the art and include attaching a reporter or signal-generating moiety to the probe. Examples of detectable labels include, but are not limited to, fluorescent, luminescent, chemiluminescent, chromogenic, radioactive and colorimetric moieties. The labels can be directly detectable or can be part of a system for generating a detectable signal.
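As a non-limiting illustration of the degenerate primers described above, the following Python sketch enumerates the individual primer sequences represented by a single degenerate primer. The use of IUPAC ambiguity codes (for example, N for any of A, C, G or T) to write the degenerate positions, the function name and the example primer are assumptions of this sketch; the disclosure does not prescribe a particular notation.

```python
# Illustrative sketch only: expand a degenerate primer, written with IUPAC
# ambiguity codes (an assumed notation), into the primers it represents.
from itertools import product

IUPAC_CODES = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_degenerate_primer(primer: str) -> list[str]:
    """Return every concrete sequence encoded by a degenerate primer."""
    choices = [IUPAC_CODES[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*choices)]


# A made-up primer with one fully degenerate position yields four primers
# that differ only at that position.
example_mixture = expand_degenerate_primer("ACNT")
```

The number of primers in the mixture is the product of the number of alternatives at each position, so a primer with several degenerate positions can represent a large mixture.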


As used herein, “capable of” when used with reference to processes such as amplifying, binding to or hybridizing to, refers to the ability of a nucleic acid, e.g., a primer or primer pair, to interact with another nucleic acid (e.g., target nucleic acid, target sequence, template) in such a way as to perform, participate in performing and/or accomplishing the stated process. For example, a nucleic acid capable of binding to another nucleic acid or other molecule through intermolecular forces or bonds is able to form a stable attachment to the other nucleic acid or molecule. A nucleic acid capable of hybridizing to another nucleic acid is able to undergo base pairing interactions with the other nucleic acid. In some embodiments, the nucleic acid is capable of hybridizing under low or high stringency conditions. Nucleic acids capable of amplifying another nucleic acid are able to serve as primers in a polymerization reaction that results in extension of the nucleic acid and generation of a complement of a template nucleic acid strand which can be a copy of an opposing strand of the template nucleic acid strand. A nucleic acid is specifically or selectively capable of binding to, hybridizing to and/or amplifying if it is capable of binding to a certain target molecule, hybridizing to a certain target nucleic acid and/or amplifying a certain target nucleic acid without substantially binding to, hybridizing to and/or amplifying a molecule or nucleic acid that is not the target molecule or nucleic acid. In some instances, such binding, hybridizing and/or amplifying is referred to as “uniquely” binding, hybridizing and/or amplifying a target molecule or nucleic acid.


As used herein, the term “separately” when used in reference to amplifying a nucleic acid refers to a primer or primer pair that is used to amplify a particular defined region of a nucleic acid, e.g., a gene, without amplifying another region of the nucleic acid. For example, primer pairs that separately amplify different hypervariable regions of a 16S rRNA gene each amplify only a single hypervariable region to generate separate amplicons for each different region and do not generate amplicons that contain more than one hypervariable region.


As defined herein, a “cleavable group” refers to any moiety that once incorporated into a nucleic acid can be cleaved under appropriate conditions. For example, a cleavable group can be incorporated into a target-specific primer, an amplified sequence, an adapter or a nucleic acid molecule of the sample. In an exemplary embodiment, a target-specific primer can include a cleavable group that becomes incorporated into the amplified product and is subsequently cleaved after amplification, thereby removing a portion, or all, of the target-specific primer from the amplified product. The cleavable group can be cleaved or otherwise removed from a target-specific primer, an amplified sequence, an adapter or a nucleic acid molecule of the sample by any acceptable means. For example, a cleavable group can be removed from a target-specific primer, an amplified sequence, an adapter or a nucleic acid molecule of the sample by enzymatic, thermal, photo-oxidative or chemical treatment. In one aspect, a cleavable group can include a nucleobase that is not naturally occurring. For example, an oligodeoxyribonucleotide can include one or more RNA nucleobases, such as uracil that can be removed by a uracil glycosylase. In some embodiments, a cleavable group can include one or more modified nucleobases (such as 7-methylguanine, 8-oxo-guanine, xanthine, hypoxanthine, 5,6-dihydrouracil or 5-methylcytosine) or one or more modified nucleosides (e.g., 7-methylguanosine, 8-oxo-deoxyguanosine, xanthosine, inosine, dihydrouridine or 5-methylcytidine). The modified nucleobases or nucleosides can be removed from the nucleic acid by enzymatic, chemical or thermal means. In one embodiment, a cleavable group can include a moiety that can be removed from a primer after amplification (or synthesis) upon exposure to ultraviolet light (e.g., bromodeoxyuridine). In another embodiment, a cleavable group can include methylated cytosine.
Typically, methylated cytosine can be cleaved from a primer, for example, after induction of amplification (or synthesis), upon sodium bisulfite treatment. In some embodiments, a cleavable moiety can include a restriction site. For example, a primer or target sequence can include a nucleic acid sequence that is specific to one or more restriction enzymes, and following amplification (or synthesis), the primer or target sequence can be treated with the one or more restriction enzymes such that the cleavable group is removed. Typically, one or more cleavable groups can be included at one or more locations within a target-specific primer, an amplified sequence, an adapter or a nucleic acid molecule of the sample.


As used herein, “cleavage step” and its derivatives, refers to any process by which a cleavable group is cleaved or otherwise removed from a target-specific primer, an amplified sequence, an adapter or a nucleic acid molecule of the sample. In some embodiments, the cleavage step involves a chemical, thermal, photo-oxidative or digestive process.


In some embodiments, a primer is a single-stranded or double-stranded polynucleotide, typically an oligonucleotide, that includes at least one sequence that is at least 50% complementary, typically at least 75% complementary or at least 85% complementary, more typically at least 90% complementary, more typically at least 95% complementary, more typically at least 98% or at least 99% complementary, or 100% complementary or identical, to at least a portion of a nucleic acid molecule that includes a target sequence. In such instances, the primer and target sequence are described as “corresponding” to each other and, in some instances, the primer may be referred to as being “directed to” the target sequence. In some embodiments, a primer is capable of hybridizing to at least a portion of its corresponding target sequence (or to a complement of the target sequence); such hybridization can optionally be performed under standard hybridization conditions or under stringent hybridization conditions. In some embodiments, a primer is not capable of hybridizing to the target sequence, or to its complement, but is capable of hybridizing to a portion of a nucleic acid strand including the target sequence, or to its complement, e.g., sequence upstream or downstream or adjacent to the target sequence. 
In some embodiments, a primer includes at least one sequence that is at least 75% complementary, typically at least 85% complementary, more typically at least 90% complementary, more typically at least 95% complementary, more typically at least 98% complementary, or more typically at least 99% complementary, to at least a portion of the target sequence itself; in other embodiments, a primer includes at least one sequence that is at least 75% complementary, typically at least 85% complementary, more typically at least 90% complementary, more typically at least 95% complementary, more typically at least 98% complementary, or more typically at least 99% complementary, to at least a portion of the nucleic acid molecule other than the target sequence. In some embodiments, such primers are referred to as a “specific primer” or “selective primer” which is substantially non-complementary to target sequences other than the target sequence to which it corresponds or portion of a nucleic acid to which it corresponds that includes the target sequence; optionally, a specific primer, or selective primer, is substantially non-complementary to other nucleic acid molecules that may be present in a mixture of nucleic acids, e.g., in a sample. In some embodiments, nucleic acid molecules present in a sample that do not include or correspond to a target sequence (or to a complement of the target sequence) are referred to as “non-specific” sequences or “non-specific nucleic acids”. In some embodiments, a specific primer or selective primer is designed to include a nucleotide sequence that is substantially complementary to at least a portion of its corresponding target sequence. In some embodiments, a specific primer or selective primer is at least 95% complementary, or at least 99% complementary, 100% complementary or identical, across its entire length to at least a portion of a nucleic acid molecule that includes its corresponding target sequence.
In some embodiments, a specific primer or selective primer can be at least 90%, at least 95% complementary, at least 98% complementary or at least 99% complementary, 100% complementary or identical, across its entire length to at least a portion of its corresponding target sequence. In some embodiments, a forward specific primer and a reverse specific primer define a specific primer pair (or selective primer pair) that can be used to amplify the target sequence via template-dependent primer extension. Typically, each primer of a specific primer pair includes at least one sequence that is substantially complementary to at least a portion of a nucleic acid molecule including a corresponding target sequence but that is less than 50% complementary to at least one other target sequence in a mixture or sample. In some embodiments, amplification can be performed using multiple specific primer pairs in a single amplification reaction, wherein each primer pair includes a forward specific primer and a reverse specific primer, each including at least one sequence that is substantially complementary or substantially identical to a corresponding target sequence in the mixture or sample, and each specific primer pair having a different corresponding target sequence. In some embodiments, a specific primer can be substantially non-complementary at its 3′ end or its 5′ end to any other specific primer present in an amplification reaction. In some embodiments, a specific primer can include minimal cross hybridization to other specific primers in an amplification reaction. In some embodiments, specific primers include minimal cross-hybridization to non-specific sequences in an amplification reaction mixture. In some embodiments, specific primers include minimal self-complementarity. In some embodiments, specific primers can include one or more cleavable groups located at the 3′ end. 
In some embodiments, specific primers can include one or more cleavable groups located near or about a central nucleotide of the specific primer. In some embodiments, one or more specific primers includes only non-cleavable nucleotides at the 5′ end of the specific primer. In some embodiments, a specific primer includes minimal nucleotide sequence overlap at the 3′ end or the 5′ end of the primer as compared to one or more different specific primers, optionally in the same amplification reaction. In some embodiments, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more specific primers in a single reaction mixture include one or more of the above embodiments. In some embodiments, substantially all of a plurality of specific primers in a single reaction mixture includes one or more of the above embodiments.
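The percent complementarity thresholds recited above (e.g., at least 50%, 75%, 90% or 95% complementary) can be illustrated, in a non-limiting manner, with the following Python sketch. It assumes an ungapped, antiparallel comparison of a primer against a binding site of equal length using only Watson-Crick pairs; the disclosure does not specify how percent complementarity is calculated, and the function names and example sequences are hypothetical.

```python
# Illustrative sketch only: ungapped percent complementarity between a primer
# and an equal-length binding site, both written 5'->3'. The scoring scheme
# is an assumption of this sketch.

WATSON_CRICK = {"A": "T", "T": "A", "G": "C", "C": "G"}


def percent_complementarity(primer: str, site: str) -> float:
    """Percent of primer positions that base pair with the site.

    The strands pair antiparallel, so primer position i is compared with
    the site read from its 3' end.
    """
    if not primer or len(primer) != len(site):
        raise ValueError("primer and site must be non-empty and equal length")
    paired = sum(
        1
        for p, s in zip(primer.upper(), reversed(site.upper()))
        if WATSON_CRICK.get(p) == s
    )
    return 100.0 * paired / len(primer)


def meets_threshold(primer: str, site: str, minimum_percent: float) -> bool:
    """True if the primer is at least `minimum_percent` complementary."""
    return percent_complementarity(primer, site) >= minimum_percent


# Made-up sequences: a perfectly complementary site and a site with one
# mismatched position.
perfect = percent_complementarity("ACGTACGTAC", "GTACGTACGT")
one_mismatch = percent_complementarity("ACGTACGTAC", "GTACGTACGA")
```

A practical primer design workflow would usually also consider gaps, non-canonical pairing and the position of mismatches relative to the 3′ end, none of which this sketch models.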


As used herein, the terms “ligating”, “ligation” and their derivatives refer to the act or process for covalently linking two or more molecules together, for example, covalently linking two or more nucleic acid molecules to each other. In some embodiments, ligation includes joining nicks between adjacent nucleotides of nucleic acids. In some embodiments, ligation includes forming a covalent bond between an end of a first and an end of a second nucleic acid molecule. In some embodiments, for example embodiments wherein the nucleic acid molecules to be ligated include conventional nucleotide residues, the ligation can include forming a covalent bond between a 5′ phosphate group of one nucleic acid and a 3′ hydroxyl group of a second nucleic acid thereby forming a ligated nucleic acid molecule. In some embodiments, any means for joining nicks or bonding a 5′ phosphate to a 3′ hydroxyl between adjacent nucleotides can be employed. In an exemplary embodiment, an enzyme such as a ligase can be used. For the purposes of this disclosure, an amplified target sequence can be ligated to an adapter to generate an adapter-ligated amplified target sequence.


As used herein, “ligase” and its derivatives, refers to any agent capable of catalyzing the ligation of two substrate molecules. In some embodiments, the ligase includes an enzyme capable of catalyzing the joining of nicks between adjacent nucleotides of a nucleic acid. In some embodiments, the ligase includes an enzyme capable of catalyzing the formation of a covalent bond between a 5′ phosphate of one nucleic acid molecule and a 3′ hydroxyl of another nucleic acid molecule thereby forming a ligated nucleic acid molecule. Suitable ligases may include, but are not limited to, T4 DNA ligase, T4 RNA ligase, and E. coli DNA ligase.


As used herein, “ligation conditions” and its derivatives, refers to conditions suitable for ligating two molecules to each other. In some embodiments, the ligation conditions are suitable for sealing nicks or gaps between nucleic acids. As defined herein, a “nick” or “gap” refers to a nucleic acid molecule that lacks a directly bound 5′ phosphate of a mononucleotide pentose ring to a 3′ hydroxyl of a neighboring mononucleotide pentose ring within internal nucleotides of a nucleic acid sequence. As used herein, the term nick or gap is consistent with the use of the term in the art. Typically, a nick or gap can be ligated in the presence of an enzyme, such as ligase at an appropriate temperature and pH. In some embodiments, T4 DNA ligase can join a nick between nucleic acids at a temperature of about 70-72° C.


As used herein, “blunt-end ligation” and its derivatives, refers to ligation of two blunt-end double-stranded nucleic acid molecules to each other. A “blunt end” refers to an end of a double-stranded nucleic acid molecule wherein substantially all of the nucleotides in the end of one strand of the nucleic acid molecule are base paired with opposing nucleotides in the other strand of the same nucleic acid molecule. A nucleic acid molecule is not blunt ended if it has an end that includes a single-stranded portion greater than two nucleotides in length, referred to herein as an “overhang”. In some embodiments, the end of the nucleic acid molecule does not include any single-stranded portion, such that every nucleotide in one strand of the end is base paired with opposing nucleotides in the other strand of the same nucleic acid molecule. In some embodiments, the ends of the two blunt-ended nucleic acid molecules that become ligated to each other do not include any overlapping, shared or complementary sequence. Typically, blunt-end ligation excludes the use of additional oligonucleotide adapters to assist in the ligation of the double-stranded amplified target sequence to the double-stranded adapter, such as patch oligonucleotides as described in Mitra and Varley, US2010/0129874, published May 27, 2010. In some embodiments, blunt-end ligation includes a nick translation reaction to seal a nick created during the ligation process.


As used herein, the terms “adapter” or “adapter and its complements” and their derivatives, refer to any linear oligonucleotide which can be ligated to a nucleic acid molecule of the disclosure. Optionally, the adapter includes a nucleic acid sequence that is not substantially complementary to the 3′ end or the 5′ end of at least one target sequence within the sample. In some embodiments, the adapter is substantially non-complementary to the 3′ end or the 5′ end of any target sequence present in the sample. In some embodiments, the adapter includes any single-stranded or double-stranded linear oligonucleotide that is not substantially complementary to an amplified target sequence. In some embodiments, the adapter is substantially non-complementary to at least one, some or all of the nucleic acid molecules of the sample. In some embodiments, suitable adapter lengths are in the range of about 10-100 nucleotides, about 12-60 nucleotides or about 15-50 nucleotides in length. An adapter can include any combination of nucleotides and/or nucleic acids. In some aspects, the adapter can include one or more cleavable groups at one or more locations. In another aspect, the adapter can include a sequence that is substantially identical, or substantially complementary, to at least a portion of a primer, for example a universal primer. In some embodiments, the adapter can include a barcode or tag to assist with downstream cataloguing, identification or sequencing. In some embodiments, a single-stranded adapter can act as a substrate for amplification when ligated to an amplified target sequence, particularly in the presence of a polymerase and dNTPs under suitable temperature and pH.


As used herein, “DNA barcode” or “DNA tagging sequence” and its derivatives, refers to a unique short (6-14 nucleotide) nucleic acid sequence within an adapter that can act as a ‘key’ to distinguish or separate a plurality of amplified target sequences in a sample. For the purposes of this disclosure, a DNA barcode or DNA tagging sequence can be incorporated into the nucleotide sequence of an adapter.
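As a non-limiting illustration of how a DNA barcode can act as a “key” to separate amplified target sequences, the following Python sketch assigns sequence reads to samples according to a barcode found at the start of each read. The barcode sequences, sample names, function name and the assumption that the barcode occupies the 5′ end of the read are hypothetical and are not taken from the disclosure.

```python
# Illustrative sketch only: separate reads by an exact-match barcode at the
# 5' end of each read. Barcodes and sample names are made up.
from collections import defaultdict


def demultiplex(reads: list[str], barcodes: dict[str, str]) -> dict[str, list[str]]:
    """Group reads by sample and strip the barcode from each assigned read.

    `barcodes` maps a barcode sequence (6-14 nucleotides) to a sample name.
    Reads that match no barcode are collected under "unassigned".
    """
    for barcode in barcodes:
        if not 6 <= len(barcode) <= 14:
            raise ValueError(f"barcode {barcode!r} is not 6-14 nucleotides long")
    bins: dict[str, list[str]] = defaultdict(list)
    for read in reads:
        for barcode, sample in barcodes.items():
            if read.startswith(barcode):
                bins[sample].append(read[len(barcode):])
                break
        else:
            # No barcode matched this read.
            bins["unassigned"].append(read)
    return dict(bins)


example_bins = demultiplex(
    ["ACGTACGGGTTT", "TTGGCCAAAC", "GGGGGGGGGG"],
    {"ACGTAC": "sample_1", "TTGGCC": "sample_2"},
)
```

This sketch requires an exact barcode match and assumes no barcode is a prefix of another; error-tolerant matching is commonly used in practice.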


As used herein, “GC content” and its derivatives, refers to the cytosine and guanine content of a nucleic acid molecule. In some embodiments, the GC content of a specific primer (or adapter) is 85% or lower. In some embodiments, the GC content of a specific primer or adapter is between 15-85%.
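For illustration only, GC content expressed as a percentage of the total number of nucleotides can be computed as in the following Python sketch. The function names and example sequences are hypothetical; the 15-85% bounds are the range recited above.

```python
# Illustrative sketch only: percent GC content of a primer or adapter and a
# check against the 15-85% range recited above.


def gc_content(sequence: str) -> float:
    """Return the percentage of G and C bases in a nucleic acid sequence."""
    sequence = sequence.upper()
    if not sequence:
        raise ValueError("sequence must not be empty")
    gc_count = sequence.count("G") + sequence.count("C")
    return 100.0 * gc_count / len(sequence)


def within_gc_range(sequence: str, low: float = 15.0, high: float = 85.0) -> bool:
    """True if the GC content lies between `low` and `high` percent, inclusive."""
    return low <= gc_content(sequence) <= high


# Made-up sequences: a balanced primer and an all-G sequence.
balanced = gc_content("ACGT")
all_g_ok = within_gc_range("GGGGGGGGGG")
```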


Compositions


Compositions provided herein include compositions containing one or more nucleic acids, including, for example, but not limited to, double-stranded, partially double-stranded, single-stranded, modified and unmodified nucleic acids. In some embodiments, the nucleic acid is single-stranded, e.g., a single-stranded oligonucleotide that can be used as a primer and/or probe. In some embodiments, a composition provided herein contains two nucleic acids, e.g., a nucleic acid primer pair, that are capable of amplifying a particular nucleic acid in a nucleic acid amplification process or reaction. Compositions containing or consisting of a plurality of nucleic acids, e.g., primers and/or probes, including, for example, a plurality of primer pairs, are also provided herein. In some embodiments, a nucleic acid and/or nucleic acid pair (e.g., primer pair) in a composition provided herein is capable of binding to, hybridizing to and/or amplifying a nucleic acid contained within the genome of one or more microorganisms, such as, for example, bacteria or archaea. In some embodiments, a nucleic acid or nucleic acids (e.g., primer pair) in a composition provided herein is/are capable of binding to, hybridizing to and/or amplifying, or specifically binding to, hybridizing to and/or amplifying, a nucleic acid (e.g., a nucleic acid from a microorganism, such as a bacterium) that contains a nucleotide sequence set forth in SEQ ID NOS: 1605-1979 in Table 17, or SEQ ID NOS: 1605-1826 in Table 17, or SEQ ID NOS: 1605-1820 in Table 17A, or SEQ ID NOS: 1827-1979 in Table 17C, or SEQ ID NOS: 1827-1976 in Table 17C, or a substantially identical or similar sequence of any of these sequences. 
In some embodiments, a nucleic acid or nucleic acids (e.g., primer pair) in a composition provided herein is capable of amplifying, or specifically amplifying, a nucleic acid, such as a nucleic acid from a microorganism, e.g., bacteria, that contains a nucleotide sequence set forth in SEQ ID NOS: 1605-1979 in Table 17, or SEQ ID NOS: 1605-1826 in Table 17, or SEQ ID NOS: 1605-1820 in Table 17A, or SEQ ID NOS: 1827-1979 in Table 17C, or SEQ ID NOS: 1827-1976 in Table 17C to generate an amplicon sequence that is less than about 500, less than about 475, less than about 450, less than about 400, less than about 375, less than about 350, less than about 300, less than about 275, less than about 250, less than about 200, less than about 175, less than about 150, or less than about 100 nucleotides in length, or an amplicon sequence that consists essentially of a sequence selected from SEQ ID NOS: 1605-1979 in Table 17, or SEQ ID NOS: 1605-1826 in Table 17, or SEQ ID NOS: 1605-1820 in Table 17A, or SEQ ID NOS: 1827-1979 in Table 17C, or SEQ ID NOS: 1827-1976 in Table 17C, or a substantially identical or similar sequence, and optionally containing nucleic acid primer sequences at the 5′ and 3′ ends of the sequence. In some embodiments, a composition contains a plurality of nucleic acids that are capable of binding to, hybridizing to and/or amplifying, or specifically binding to, hybridizing to and/or amplifying, a plurality of nucleic acids each of which contains a nucleotide sequence selected from SEQ ID NOS: 1605-1979 in Table 17, or SEQ ID NOS: 1605-1826 in Table 17, or SEQ ID NOS: 1605-1820 in Table 17A, or SEQ ID NOS: 1827-1979 in Table 17C, or SEQ ID NOS: 1827-1976 in Table 17C.
In some embodiments, a composition contains a plurality of nucleic acids (e.g., primer pairs) that are capable of amplifying, or specifically amplifying, a plurality of nucleic acids (such as nucleic acids from a microorganism, e.g., bacteria) each of which contains a nucleotide sequence selected from SEQ ID NOS: 1605-1979 in Table 17, or SEQ ID NOS: 1605-1826 in Table 17, or SEQ ID NOS: 1605-1820 in Table 17A, or SEQ ID NOS: 1827-1979 in Table 17C, or SEQ ID NOS: 1827-1976 in Table 17C, or a substantially identical or similar sequence, to generate amplicon sequences that are less than about 500, less than about 475, less than about 450, less than about 400, less than about 375, less than about 350, less than about 300, less than about 275, less than about 250, less than about 200, less than about 175, less than about 150, or less than about 100 nucleotides in length, or amplicon sequences that consist essentially of a sequence selected from SEQ ID NOS: 1605-1979 in Table 17, or SEQ ID NOS: 1605-1826 in Table 17, or SEQ ID NOS: 1605-1820 in Table 17A, or SEQ ID NOS: 1827-1979 in Table 17C, or SEQ ID NOS: 1827-1976 in Table 17C and optionally containing the nucleic acid primer sequences at the 5′ and 3′ ends of the sequence. In some embodiments, a composition provided herein contains a plurality of nucleic acids each of which comprises a nucleotide sequence selected from SEQ ID NOS: 1605-1979 in Table 17, or SEQ ID NOS: 1605-1826 in Table 17, or SEQ ID NOS: 1605-1820 in Table 17A, or SEQ ID NOS: 1827-1979 in Table 17C, or SEQ ID NOS: 1827-1976 in Table 17C, or a substantially identical or similar sequence.
In some embodiments, the composition contains a plurality of nucleic acids each of which contains, or consists essentially of, a nucleotide sequence selected from SEQ ID NOS: 1605-1826 in Table 17, or SEQ ID NOS: 1605-1820 in Table 17A, or a substantially identical or similar sequence, and optionally containing nucleic acid primer sequences at the 5′ and 3′ ends of the sequence, and is less than about 500, less than about 475, less than about 450, less than about 400, less than about 375, less than about 350, less than about 300, less than about 275, less than about 250, less than about 200, less than about 175, less than about 150, or less than about 100 nucleotides in length.


In some embodiments, a nucleic acid in a composition provided herein includes or consists essentially of a nucleotide sequence in Table 15 or Table 16, or a nucleotide sequence in Table 15 or Table 16 in which one or more thymine bases is substituted with a uracil base. In some embodiments, a nucleic acid provided herein includes or consists essentially of a nucleotide sequence selected from SEQ ID NOS: 11-16, 23 and 24 of Table 15, SEQ ID NOS: 35-40, 47 and 48 of Table 15, SEQ ID NOS: 49-480 of Table 16A, SEQ ID NOS: 49-452 and 457-472 of Table 16A, SEQ ID NOS: 521-820 of Table 16C, SEQ ID NOS: 827-1258 of Table 16D, SEQ ID NOS: 827-1230 and 1235-1250 of Table 16D, or SEQ ID NOS: 1299-1598 of Table 16F or a substantially identical or similar sequence. In some embodiments, a composition contains or consists essentially of a plurality of nucleic acids each of which contains or consists essentially of a sequence selected from the sequences in Table 15, SEQ ID NOS: 1-24 of Table 15, SEQ ID NOS: 11-16, 23 and 24 of Table 15, SEQ ID NOS: 25-48 of Table 15, SEQ ID NOS: 35-40, 47 and 48 of Table 15, the sequences in Table 16, SEQ ID NOS: 49-520 of Table 16, SEQ ID NOS: 49-452, 457-472 and 481-520 of Table 16, SEQ ID NOS: 49-492 of Table 16, SEQ ID NOS: 49-452, 457-472 and 481-492 of Table 16, SEQ ID NOS: 49-480 of Table 16A, SEQ ID NOS: 49-452 and 457-472 of Table 16A, SEQ ID NOS: 521-826 of Table 16C, SEQ ID NOS: 521-820 of Table 16C, SEQ ID NOS: 827-1298 of Table 16, SEQ ID NOS: 827-1230, 1235-1250 and 1259-1298 of Table 16, SEQ ID NOS: 827-1270 of Table 16, SEQ ID NOS: 827-1230, 1235-1250 and 1259-1270 of Table 16, SEQ ID NOS: 827-1258 of Table 16D, SEQ ID NOS: 827-1230 and 1235-1250 of Table 16D, SEQ ID NOS: 1299-1604 of Table 16F, SEQ ID NOS: 1299-1598 of Table 16F, substantially identical or similar sequences and/or any of the aforementioned nucleotide sequences in which one or more thymine bases is substituted with a uracil base.
In some embodiments, nucleic acids in a composition provided herein include one or more pairs of nucleic acids (e.g., primer pairs). Primer pairs include pairs of (i.e., 2) nucleic acids (polynucleotides) which can be used to amplify nucleic acids. Examples of primer pairs are shown in Tables 15 and 16 as “Primer 1” and “Primer 2” in each row of the tables that are capable of amplifying a nucleic acid sequence contained in the corresponding region (hypervariable region) of a prokaryotic (e.g., bacterial) 16S rRNA gene (Table 15) or contained in the corresponding species of microorganism (Table 16). In some embodiments, nucleic acids in a composition provided herein include, or consist essentially of, one or more pairs of nucleic acids that contain or consist essentially of the nucleotide sequences of one or more pairs of nucleotide sequences in Table 15 or Table 16, one or more pairs of nucleotide sequences selected from the pairs of sequences set forth in SEQ ID NOS: 1-24 of Table 15, SEQ ID NOS: 25-48 of Table 15, SEQ ID NOS: 11-16, 23 and 24 of Table 15, SEQ ID NOS: 35-40, 47 and 48 of Table 15, SEQ ID NOS: 49-520 of Table 16, SEQ ID NOS: 49-452, 457-472 and 481-520 of Table 16, SEQ ID NOS: 49-492 of Table 16, SEQ ID NOS: 49-452, 457-472 and 481-492 of Table 16, SEQ ID NOS: 49-480 of Table 16A, SEQ ID NOS: 49-452 and 457-472 of Table 16A, SEQ ID NOS: 521-826 of Table 16C, SEQ ID NOS: 521-820 of Table 16C, SEQ ID NOS: 827-1298 of Table 16, SEQ ID NOS: 827-1230, 1235-1250 and 1259-1298 of Table 16, SEQ ID NOS: 827-1270 of Table 16, SEQ ID NOS: 827-1230, 1235-1250 and 1259-1270 of Table 16, SEQ ID NOS: 827-1258 of Table 16D, SEQ ID NOS: 827-1230 and 1235-1250 of Table 16D, SEQ ID NOS: 1299-1604 of Table 16F, or SEQ ID NOS: 1299-1598 of Table 16F, substantially identical or similar sequences and/or any of the aforementioned nucleotide sequences of primer pairs in which one or more thymine bases is substituted with a uracil base.
In some embodiments, a composition contains, or consists essentially of, a plurality of pairs of nucleic acids (e.g., primer pairs) that contain or consist essentially of the nucleotide sequences of two or more pairs of nucleotide sequences in Table 15 or Table 16, two or more pairs of nucleotide sequences selected from the pairs of sequences set forth in SEQ ID NOS: 1-24 of Table 15, SEQ ID NOS: 25-48 of Table 15, SEQ ID NOS: 11-16, 23 and 24 of Table 15, SEQ ID NOS: 35-40, 47 and 48 of Table 15, SEQ ID NOS: 49-520 of Table 16, SEQ ID NOS: 49-452, 457-472 and 481-520 of Table 16, SEQ ID NOS: 49-492 of Table 16, SEQ ID NOS: 49-452, 457-472 and 481-492 of Table 16, SEQ ID NOS: 49-480 of Table 16A, SEQ ID NOS: 49-452 and 457-472 of Table 16A, SEQ ID NOS: 521-826 of Table 16C, SEQ ID NOS: 521-820 of Table 16C, SEQ ID NOS: 827-1298 of Table 16, SEQ ID NOS: 827-1230, 1235-1250 and 1259-1298 of Table 16, SEQ ID NOS: 827-1270 of Table 16, SEQ ID NOS: 827-1230, 1235-1250 and 1259-1270 of Table 16, SEQ ID NOS: 827-1258 of Table 16D, SEQ ID NOS: 827-1230 and 1235-1250 of Table 16D, SEQ ID NOS: 1299-1604 of Table 16F, or SEQ ID NOS: 1299-1598 of Table 16F, or substantially identical or similar sequences, and/or any of the aforementioned nucleotide sequences of primer pairs in which one or more thymine bases is substituted with a uracil base.
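The thymine-to-uracil substitution described above can be illustrated with a minimal Python sketch. The function name `thymine_to_uracil`, its `positions` parameter and the example primer are hypothetical and chosen for illustration only; the primer is not a sequence from Table 15 or Table 16.

```python
def thymine_to_uracil(seq, positions=None):
    """Return seq with thymine (T) replaced by uracil (U).

    positions: optional iterable of 0-based indices. Only T bases at those
    indices are replaced. If None, every T is replaced.
    """
    seq = seq.upper()
    if positions is None:
        return seq.replace("T", "U")
    wanted = set(positions)
    return "".join(
        "U" if base == "T" and i in wanted else base
        for i, base in enumerate(seq)
    )


# Hypothetical primer sequence (assumption, not from the Sequence Listing).
primer = "ACGTTGCAT"
all_substituted = thymine_to_uracil(primer)       # every thymine replaced
one_substituted = thymine_to_uracil(primer, [3])  # only the thymine at index 3
print(all_substituted, one_substituted)
```

Passing explicit positions mirrors the "one or more thymine bases" wording, since a variant need not replace every thymine.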


In some embodiments, a nucleic acid or nucleic acid pair, and optionally degenerate sequences thereof, binds to, hybridizes and/or amplifies a specific nucleic acid sequence unique to a particular microorganism (e.g., a species of bacteria). Such a nucleic acid or nucleic acid pair, and optionally degenerate sequences thereof, is referred to herein as “microorganism-specific” or “species-specific” and amplifies nucleic acids in a microorganism-specific or species-specific manner to produce a single amplification product having a unique sequence among microorganisms (e.g., bacteria) or a group of microorganisms in the presence of nucleic acids from the microorganism in an amplification reaction. Nonlimiting examples of sequences of such nucleic acids and nucleic acid primer pairs are provided in Table 16.
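As a rough illustration of what "species-specific" amplification means computationally, the following Python sketch checks whether a primer pair yields a single product in exactly one of several genomes. It assumes exact primer matching on one strand, which is a simplification of real hybridization; the genome strings, primers and function names are all hypothetical.

```python
def reverse_complement(seq):
    """Reverse complement of a DNA string containing only A, C, G, T."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[b] for b in reversed(seq.upper()))


def amplicons(template, forward, reverse):
    """Exact-match in-silico amplification on the given strand.

    For each forward-primer site, returns the product that ends at the
    nearest downstream site matching the reverse primer's reverse complement.
    """
    template = template.upper()
    forward = forward.upper()
    rc = reverse_complement(reverse)
    products = []
    start = template.find(forward)
    while start != -1:
        end = template.find(rc, start + len(forward))
        if end != -1:
            products.append(template[start:end + len(rc)])
        start = template.find(forward, start + 1)
    return products


def is_specific(genomes, forward, reverse, target):
    """True if only `target` is amplified and it gives exactly one product."""
    hits = [name for name, seq in genomes.items()
            if amplicons(seq, forward, reverse)]
    return hits == [target] and len(amplicons(genomes[target], forward, reverse)) == 1


# Toy genomes (assumptions). species_b differs from species_a by one base
# inside the forward primer site, as might occur within a genus.
genomes = {
    "species_a": "TT" + "GACCGTAAGG" + "CTAGCTA" + "GGATCCAACG" + "T",
    "species_b": "TT" + "GACCGTTAGG" + "CTAGCTA" + "GGATCCAACG" + "T",
    "species_c": "ACACACACGTGTGTGTACACAC",
}
fwd = "GACCGTAAGG"
rev = "CGTTGGATCC"  # reverse complement of GGATCCAACG

print(is_specific(genomes, fwd, rev, "species_a"))
```

A practical design workflow would also allow mismatches and check both strands; this sketch only captures the uniqueness criterion.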


In some embodiments, a nucleic acid pair, and optionally degenerate sequences thereof, is capable of amplifying a sequence in a homologous gene or genomic region common to multiple, most, a majority, substantially all, or all microorganisms in a taxonomic group, but that varies between different microorganisms. Taxonomic groups include kingdom, domain, phylum, class, order, family and species. In one embodiment, the taxonomic group is the Bacteria kingdom. Such a nucleic acid pair, or primer pair, and optionally degenerate sequences thereof, that is capable of amplifying a sequence in a homologous gene or genomic region common to multiple, most, a majority, substantially all, or all microorganisms in a kingdom, but that varies between different microorganisms in the kingdom, is referred to herein as “kingdom-encompassing” and amplifies nucleic acid in microorganisms in the kingdom in a kingdom-encompassing manner to produce multiple amplification products having different nucleotide sequences of different microorganisms (e.g., different bacteria) in the kingdom in the presence of nucleic acids from microorganisms (e.g., bacteria) in an amplification reaction. Conserved sequences of nucleic acids can be found in the genomes of different organisms or microbes. Such sequences can be identical or share substantial similarity in the different genomes (see, e.g., Isenbarger et al. (2008) Orig Life Evol Biosph doi:10.1007/s11084-008-9148-z). In many instances, conserved sequences are located in essential genes, e.g., housekeeping genes, that encode elements required across a category or group of organisms or microbes for carrying out basic biochemical functions of survival. 
However, through evolution and adaptation of organisms and microbes to diverse conditions, even homologous genes have diverged and contain sequences that vary between different organisms and microbes and that may be so divergent as to be unique to specific organisms or microbes such that they can be used to identify an individual organism or microbe or a related group (e.g., species) of organisms or microbes. Homologous genes include, for example, some essential genes required for basic functioning and survival of microorganisms. In some embodiments, the homologous gene is a 16S rRNA gene, 18S rRNA gene or a 23S rRNA gene common to multiple different organisms, or microorganisms (e.g., multiple different bacteria). For example, in certain embodiments, the nucleic acids include one or more primer pairs that separately amplify two or more regions, e.g., hypervariable regions, in a prokaryotic, e.g., bacterial, 16S rRNA gene. Nonlimiting examples of such nucleic acid primer pairs are provided in Table 15.


Variable region analysis has been used for taxonomic classification of prokaryotes, for example, in methods using nucleic acid primers that hybridize to conserved sequences flanking a variable region. Homologous genes that contain multiple variable regions interspersed between conserved regions are particularly useful in such methods because they provide multiple sequences that can be analyzed to more accurately and definitively identify individual constituents of a population of targeted elements. One example of such a gene is the prokaryotic 16S rRNA gene encoding ribosomal RNAs which are the main structural and catalytic components of ribosomes. The 16S ribosomal RNA (rRNA) gene of bacteria and archaea is about 1500-1700 base pairs long and includes 9 hypervariable regions of varying conservation, which are commonly referred to as V1-V9 (FIG. 1), that are interspersed between conservative or conserved regions (see, e.g., Wang and Qian (2009) PLoS ONE 4:e7401 and Kim et al. (2011) J Microbiol Meth 84:81-87). Exemplary 16S rRNA gene sequences are known and include those contained in the Greengenes database (http://greengenes.lbl.gov), SILVA database (www.arb-silva.de) and GRD-Genomic-Based 16S Ribosomal RNA Database (https://metasystems.riken.jp/grd/). Sequences of the hypervariable regions of 16S rRNA genes which differ in different microorganisms can be used to identify microorganisms in a sample. Instead of specifically amplifying a hypervariable region of every possible microorganism that could be present in a sample by using many oligonucleotide primers, each specific to the hypervariable region of each organism, it is possible to utilize the conserved, highly similar or identical sequences flanking the hypervariable regions as primer-binding sequences to which one, or a small number of, primer pair(s) will bind and amplify a hypervariable region in substantially all of the microorganisms, e.g., bacteria, in a sample.
This allows specific nucleic acids that can be used to identify a microorganism to be amplified from substantially all the microorganisms which can then be sequenced for efficient profiling of the population. However, the results of such methods tend to be inconsistent, and often incomplete in determining most or all microorganisms present in a sample, particularly samples containing multiple different microorganisms. Furthermore, such methods typically do not reliably or accurately discriminate between species of microorganisms, if they are able to distinguish species at all. Most such methods utilize primers intended for amplification of one or a few, and less than all, hypervariable regions of the 16S rRNA gene. If more than a limited number of hypervariable regions are targeted for amplification in these methods, the method typically requires multiple separate amplification reactions for different primers due to overlap of primer sequences, which introduces inefficiencies in resource use and time into the methods. Also, such methods often include primer pairs designed to amplify two or more hypervariable regions (e.g., V2-V3 or V3-V4) as a single amplicon which results in longer amplicons for sequencing.
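The general idea of using conserved flanking sequences to recover a variable region, and then using that region to identify an organism, can be sketched in Python as follows. The flank sequences, the toy gene strings and the names `variable_region` and `identify` are hypothetical; exact string matching stands in for primer hybridization, and the short toy regions are not real 16S rRNA sequences.

```python
import re

# Hypothetical conserved sequences flanking one variable region (assumptions).
LEFT_FLANK = "ACTCCTACGG"
RIGHT_FLANK = "GCAGCAGTAG"


def variable_region(gene):
    """Return the sequence between the two conserved flanks, or None."""
    match = re.search(LEFT_FLANK + r"([ACGT]+?)" + RIGHT_FLANK, gene.upper())
    return match.group(1) if match else None


def make_gene(variable):
    """Build a toy gene: shared flanks around an organism-specific region."""
    return "GG" + LEFT_FLANK + variable + RIGHT_FLANK + "CC"


# Toy reference set: the same flanks, a different variable region per organism.
reference_genes = {
    "organism_1": make_gene("AAATTTCC"),
    "organism_2": make_gene("GGCATCGA"),
    "organism_3": make_gene("TTGGAACA"),
}
signature_to_name = {variable_region(g): n for n, g in reference_genes.items()}


def identify(read):
    """Classify a read by the variable region found between the flanks."""
    return signature_to_name.get(variable_region(read), "unclassified")


sample = [make_gene("GGCATCGA"), make_gene("AAATTTCC"), make_gene("CCCCCCCC")]
calls = [identify(read) for read in sample]
print(calls)
```

One flank pair recovers a region from every toy organism, while the recovered sequences differ, which is the property that makes such regions useful for profiling a mixed sample.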


Kingdom-Encompassing Nucleic Acids


Nucleic acid primer pairs are provided herein that separately amplify nucleic acids comprising sequences located in multiple hypervariable regions of the prokaryotic 16S rRNA gene. In some embodiments, there is little (e.g., less than or equal to 7 nucleotides) to no overlap of the nucleotide sequences of any two of the 16S rRNA gene primers that separately amplify nucleic acids comprising sequences located in multiple hypervariable regions. In some aspects, the primer pairs amplify 16S rRNA gene sequences less than or equal to about 200 nucleotides in length, for example, between about 125 and 200 nucleotides in length. In some embodiments, compositions provided herein contain a plurality of nucleic acid primer pairs that includes at least 2, at least 3, at least 4, at least 5, at least 6, or at least 7 separate primer pairs, and optionally degenerate variants thereof, which separately amplify nucleic acids comprising sequences of a different one of 2, 3, 4, 5, 6, or 7 different hypervariable regions, respectively, in a prokaryotic 16S rRNA gene in a nucleic acid amplification reaction. In some embodiments, compositions provided herein contain a plurality of nucleic acid primer pairs that includes at least 8 separate primer pairs, and optionally degenerate variants thereof, which separately amplify a nucleic acid comprising a sequence of one of 8 different hypervariable regions in a prokaryotic 16S rRNA gene in a nucleic acid amplification reaction. In some embodiments, a composition includes a combination of primer pairs, wherein the primer pairs in the combination of primer pairs separately amplify nucleic acids comprising sequences located in 3 or more hypervariable regions of a prokaryotic 16S rRNA gene and wherein one of the 3 or more regions is a V5 region.
Degenerate primer variants, containing, for example, different nucleotides at 1 or 2 positions in the primer sequences, are included in some compositions to ensure amplification of 16S rRNA genes containing minor variations in conserved regions. Non-limiting examples of nucleotide sequences of primer pairs that separately amplify 8 hypervariable regions (V2, V3, V4, V5, V6, V7, V8 and V9) of the prokaryotic 16S rRNA gene are listed in Table 15. In some embodiments, compositions provided herein contain or consist essentially of at least 1, at least 2, at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, at least 10, at least 11, at least 12, at least 13, at least 14, at least 15, at least 16, at least 17, at least 18, at least 19, at least 20, at least 21, at least 22, at least 23, at least 24 nucleic acids, or primer pairs, in which the nucleic acids or primer pairs contain or consist essentially of sequences selected from those in Table 15 or from SEQ ID NOS: 1-24 of Table 15, SEQ ID NOS: 25-48 of Table 15, SEQ ID NOS: 11-16, 23 and 24 of Table 15, or SEQ ID NOS: 35-40, 47 and 48 of Table 15, or substantially identical or similar sequences. In some embodiments, compositions provided herein contain or consist essentially of nucleic acids, or primer pairs, in which the nucleic acids or primer pairs separately contain, or consist essentially of, all sequences of SEQ ID NOS: 1-24 of Table 15 and/or all sequences of SEQ ID NOS: 25-48 of Table 15. 
In some of the embodiments of compositions provided herein containing a plurality of nucleic acid primer pairs that separately amplify nucleic acids comprising sequences located in multiple hypervariable regions of a prokaryotic 16S rRNA gene, the plurality of primer pairs provide at least 85%, or at least 90%, or at least 92%, or at least 95% or at least 98%, or at least 99% or 100% coverage of different bacterial 16S rRNA gene sequences in a given database containing bacterial 16S rRNA gene sequences. In some embodiments of compositions provided herein containing a plurality of nucleic acid primer pairs that separately amplify nucleic acids comprising sequences located in multiple hypervariable regions of a prokaryotic 16S rRNA gene, the plurality of primer pairs are capable of amplifying all or substantially all microbial (e.g., bacterial) nucleic acids in a sample containing a mixture of microorganisms (e.g. bacteria) of at least 10, at least 15, at least 20, at least 25, at least 30, at least 35, at least 40, at least 45, at least 50, at least 55, at least 60, at least 65, at least 70, at least 75, at least 80, at least 85, at least 90, at least 95, or at least 100 or more different genera.
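To illustrate how degenerate primer variants can raise database coverage, the Python sketch below expands a primer written with IUPAC ambiguity codes into its concrete variants and computes the fraction of sequences in a toy database that contain a binding site for at least one variant. Defining coverage by exact matching of a single primer is an assumption made for brevity; the disclosure does not specify how coverage is calculated, and the primer and database entries are hypothetical.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_degenerate(primer):
    """List every concrete sequence encoded by a degenerate primer."""
    return ["".join(p) for p in product(*(IUPAC[b] for b in primer.upper()))]


def coverage(primer_variants, database):
    """Fraction of database sequences containing at least one variant."""
    hits = sum(any(v in seq for v in primer_variants) for seq in database)
    return hits / len(database)


# Toy database (assumption): entries differ at one position in the primer site.
database = ["TTACGCTGAA", "TTACGTTGAA", "TTACGATGAA", "GGACGTTGCC"]

single = coverage(["ACGCTG"], database)                       # one fixed primer
degenerate = coverage(expand_degenerate("ACGYTG"), database)  # Y = C or T
print(single, degenerate)
```

In this toy case a single degenerate position raises coverage from one of four sequences to three of four, which reflects the stated purpose of tolerating minor variations in conserved regions.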


Species/Microorganism-Specific Nucleic Acids


Nucleic acids and nucleic acid pairs (e.g., primer pairs) are provided herein that bind to, hybridize to and/or amplify a specific nucleic acid sequence unique to a particular microorganism (e.g., a species, subspecies or strain of bacteria). Such microorganism-specific (e.g., bacteria-specific or species-specific) nucleic acids can be used, for example, as specific, selective probes and/or primers to greatly increase the depth and exactness of the detection and identification of microorganisms in a sample and significantly enhance characterization, assessment, measuring and/or profiling of a population or community of microorganisms as well as the components or constituents thereof. Such information is required to gain a complete understanding of the biodiversity of a community of microorganisms, e.g., microbiota of the alimentary tract of an animal. Exemplary nucleic acid sequences provided in Table 16 bind to, hybridize to and/or amplify a specific nucleic acid sequence unique to species in more than 40 different genera of microorganism (bacteria), or at least 43 different genera of microorganism, and unique to more than 70, or at least 73, or at least 74, or at least 75, different species of microorganism (bacteria).


Microorganism-specific nucleic acids provide many advantages, for example, in completely and accurately assessing, characterizing, measuring and/or profiling the composition of a population of microorganisms, e.g., microbiota, and determining relationships of individual microorganisms, as well as relating and/or correlating a community of microorganisms, and a state (e.g., health, degree of balance, susceptibility to certain conditions, responsiveness to treatment) of a subject and/or environment. The microbiota of a human, i.e., microorganisms, including bacteria, associated with different areas of a human subject, contains more than 10 times more microorganism cells than human cells. The microbiota includes commensal microorganisms, in addition to occurrences of pathogenic microorganisms. While the significance of identifying pathogenic microbes within an animal is relatively clear, profiling the complex composition of all types of microorganisms in the microbiota is also of great significance in understanding health of an animal and potential therapeutic interventions in disorders and disease. For example, microorganisms residing in the alimentary tract of animals, often referred to as the “gut microbiome,” contribute to animal metabolism, and evidence supports roles of the gut microbiome in inflammatory bowel diseases, autoimmune disorders, cardiometabolic disorders, cancer and neuropsychiatric disorders and diseases.


Compositions and methods provided herein, including microorganism-specific and kingdom-encompassing nucleic acids, and use of them in sample analysis, enable not only a comprehensive survey of the entirety and relative levels of genera of microorganisms (e.g., bacteria), but also detailed identification of species of microorganisms that can be tailored to focus on one or more particular microorganisms of interest that may be significant in certain states of health and disease or imbalance. For example, provided herein are nucleic acids and/or nucleic acid primer pairs that bind to, hybridize to and/or amplify a specific nucleic acid sequence unique to a particular microorganism (e.g., a species, subspecies or strain of bacteria), i.e., microorganism-specific nucleic acids. In some embodiments, the nucleic acids are capable of specifically binding to and/or hybridizing to a target nucleic acid sequence contained within the genome of the microorganism in a mixture comprising nucleic acids of multiple different microorganisms, for example, in a mixture comprising nucleic acids of the genome of a different microorganism that is in the same genus of the microorganism containing the target nucleic acid sequence. In some embodiments, the nucleic acids that specifically bind to and/or hybridize to a nucleic acid sequence contained within the genome of a microorganism do not bind to or hybridize to a nucleic acid contained within any other genus of microorganism or within any other species of microorganism. In some embodiments, a primer pair specifically amplifies a specific target nucleic acid sequence unique to a particular microorganism in an amplification reaction mixture comprising nucleic acids of the genomes of multiple different microorganisms, and in particular embodiments, in an amplification reaction mixture comprising nucleic acid of the genome of a different microorganism that is in the same genus of the microorganism containing the target nucleic acid sequence.
In some embodiments, the primer pair does not amplify a nucleic acid sequence contained within any other genus of microorganism or within any other species of organism. In some embodiments, combinations of nucleic acids include microorganism-specific nucleic acids and/or primer pairs that specifically bind to, hybridize to and/or amplify a nucleic acid sequence contained in the genome of one or more microorganisms (e.g., bacteria) implicated in one or more conditions, disorders and/or diseases. In particular embodiments of compositions provided herein, the composition includes a nucleic acid and/or a primer pair that specifically binds to, hybridizes to and/or amplifies a target nucleic acid sequence contained within a genome of a microorganism selected from the microorganisms in Table 1. In particular embodiments, the target nucleic acid sequence contained in the genome of a microorganism selected from the microorganisms in Table 1 is unique to the microorganism. In some embodiments, the composition includes, or consists essentially of, a plurality of nucleic acids and/or primer pairs that include at least one nucleic acid that specifically binds to and/or hybridizes to a target nucleic acid for each of the microorganisms in Table 1 and/or at least one primer pair that specifically amplifies a genomic target nucleic acid for each of the microorganisms in Table 1. In some embodiments, the composition includes, or consists essentially of, a plurality of nucleic acids and/or primer pairs that include at least one nucleic acid that specifically binds to and/or hybridizes to a target nucleic acid for each of the microorganisms in Table 1, except for, or excluding, Actinomyces viscosus and/or Blautia coccoides, or except for, or excluding, Actinomyces viscosus, Blautia coccoides and/or Helicobacter salomonis. 
In some embodiments, the composition includes, or consists essentially of, a plurality of nucleic acids and/or primer pairs that include at least one primer pair that specifically amplifies a genomic target nucleic acid for each of the microorganisms in Table 1 except for, or excluding, Actinomyces viscosus and/or Blautia coccoides, or except for, or excluding, Actinomyces viscosus, Blautia coccoides and/or Helicobacter salomonis. In some embodiments, the plurality of primer pairs includes, or consists essentially of, different primer pairs that specifically and separately amplify different genomic target nucleic acids contained within, or within at least, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 70, 71, 72, 73, 74, 75 or more of the microorganisms in Table 1. In some embodiments, the plurality of nucleic acid primer pairs includes, or consists essentially of, a set of nucleic acid primer pairs in which each different nucleic acid primer pair specifically amplifies a different unique nucleic acid sequence contained in a different one of each of the genomes of the group of microorganisms in Table 1 or the group of microorganisms in Table 1 except for, or excluding, Actinomyces viscosus and/or Blautia coccoides, or except for, or excluding, Actinomyces viscosus, Blautia coccoides and/or Helicobacter salomonis. In particular embodiments, the target nucleic acid sequences contained in the genome of the different microorganisms are unique to each of the microorganisms.









TABLE 1
Microorganisms

GENUS                    SPECIES
Actinomyces              viscosus
Akkermansia              muciniphila
Anaerococcus             vaginalis
Atopobium                parvulum
Bacteroides              fragilis, nordii, thetaiotaomicron, vulgatus
Barnesiella              intestinihominis
Bifidobacterium          adolescentis, animalis, bifidum, longum
Blautia                  coccoides, obeum
Borreliella              burgdorferi
Campylobacter            concisus, curvus, gracilis, hominis, jejuni, rectus
Chlamydia                pneumoniae, trachomatis
Citrobacter              rodentium
Cloacibacillus           porcorum
Clostridioides           difficile
Collinsella              aerofaciens, stercoris
Cutibacterium            acnes
Desulfovibrio            alaskensis
Dorea                    formicigenerans
Enterococcus             faecium, faecalis, gallinarum, hirae
Escherichia              coli
Eubacterium              limosum, rectale
Faecalibacterium         prausnitzii
Fusobacterium            nucleatum
Gardnerella              vaginalis
Gemmiger                 formicilis
Helicobacter             bilis, bizzozeronii, hepaticus, pylori, salomonis
Holdemania               filiformis
Klebsiella               pneumoniae
Lactobacillus            acidophilus, delbrueckii, johnsonii, murinus, reuteri, rhamnosus
Lactococcus              lactis
Mycoplasma               fermentans, penetrans
Parabacteroides          distasonis, merdae
Parvimonas               micra
Peptostreptococcus       anaerobius, stomatis
Phascolarctobacterium    faecium
Porphyromonas            gingivalis
Prevotella               copri, histicola
Proteus                  mirabilis
Roseburia                intestinalis
Ruminococcus             bromii, gnavus
Slackia                  exigua
Streptococcus            gallolyticus, infantarius
Veillonella              parvula






In some embodiments, nucleic acids and/or nucleic acid primer pairs provided herein bind to, hybridize to and/or amplify, or specifically bind to, hybridize to and/or amplify, a nucleic acid (such as a nucleic acid from a microorganism, e.g., bacteria) that contains, or consists essentially of, a nucleotide sequence selected from among SEQ ID NOS: 1605-1979 in Table 17, or SEQ ID NOS: 1605-1826 in Table 17, or SEQ ID NOS: 1605-1806, 1809-1816 and 1821-1826 in Table 17, or SEQ ID NOS: 1605-1820 in Table 17A, or SEQ ID NOS: 1605-1806 and 1809-1816 in Table 17A, or SEQ ID NOS: 1827-1979 in Table 17C, or SEQ ID NOS: 1827-1976 in Table 17C and/or substantially identical or similar nucleotide sequences. In some embodiments, nucleic acid primer pairs provided herein are capable of amplifying, or specifically amplifying, a nucleic acid (such as a nucleic acid from a microorganism, e.g., bacteria) containing, or consisting essentially of, a nucleotide sequence selected from among SEQ ID NOS: 1605-1979 in Table 17, or SEQ ID NOS: 1605-1826 in Table 17, or SEQ ID NOS: 1605-1806, 1809-1816 and 1821-1826 in Table 17, or SEQ ID NOS: 1605-1820 in Table 17A, or SEQ ID NOS: 1605-1806 and 1809-1816 in Table 17A, or SEQ ID NOS: 1827-1979 in Table 17C, or SEQ ID NOS: 1827-1976 in Table 17C to generate an amplicon sequence that is less than about 500, less than about 475, less than about 450, less than about 400, less than about 375, less than about 350, less than about 300, less than about 275, less than about 250, less than about 200, less than about 175, less than about 150, or less than about 100 nucleotides in length, or an amplicon that consists essentially of a nucleotide sequence selected from among SEQ ID NOS: 1605-1979 in Table 17, or SEQ ID NOS: 1605-1826 in Table 17, or SEQ ID NOS: 1605-1806, 1809-1816 and 1821-1826 in Table 17, or SEQ ID NOS: 1605-1820 in Table 17A, or SEQ ID NOS: 1605-1806 and 1809-1816 in Table 17A, or SEQ ID NOS: 1827-1979 in Table 17C, or SEQ ID NOS: 1827-1976 in Table 17C, and optionally containing primer sequences attached at the 5′ and 3′ ends.
In some embodiments, compositions provided herein contain a combination of a plurality of microorganism-specific nucleic acids and/or primer pairs in which within the plurality of nucleic acids and/or primer pairs, there are different nucleic acids and/or primer pairs that bind to, hybridize to and/or amplify (or specifically bind to, hybridize to and/or amplify) at least 2, at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, at least 10, at least 15, at least 20, at least 25, at least 30, at least 35, at least 40, at least 45, at least 50, at least 55, at least 60, at least 65, at least 70, at least 75, at least 80, at least 85, at least 90, at least 95, at least 100, at least 125, at least 150, at least 175, at least 200, at least 225, at least 250, at least 275, or at least 300 or more different nucleic acids containing or consisting essentially of a different one of the sequences of SEQ ID NOS: 1605-1979 in Table 17, or SEQ ID NOS: 1605-1826 in Table 17, or SEQ ID NOS: 1605-1806, 1809-1816 and 1821-1826 in Table 17, or SEQ ID NOS: 1605-1820 in Table 17A, or SEQ ID NOS: 1605-1806 and 1809-1816 in Table 17A, or SEQ ID NOS: 1827-1979 in Table 17C, or SEQ ID NOS: 1827-1976 in Table 17C, or a substantially identical or similar sequence, and optionally containing primers attached at the 3′ and 5′ ends.
In some embodiments, such different nucleic acids and/or primer pairs amplify different nucleic acids containing a different one of the sequences of SEQ ID NOS: 1605-1979 in Table 17, or SEQ ID NOS: 1605-1826 in Table 17, or SEQ ID NOS: 1605-1806, 1809-1816 and 1821-1826 in Table 17, or SEQ ID NOS: 1605-1820 in Table 17A, or SEQ ID NOS: 1605-1806 and 1809-1816 in Table 17A, or SEQ ID NOS: 1827-1979 in Table 17C, or SEQ ID NOS: 1827-1976 in Table 17C to generate amplicon sequences that are less than about 500, less than about 475, less than about 450, less than about 400, less than about 375, less than about 350, less than about 300, less than about 275, less than about 250, less than about 200, less than about 175, less than about 150, or less than about 100 nucleotides in length, or amplicon sequences that consist essentially of a nucleotide sequence selected from SEQ ID NOS: 1605-1979 in Table 17, or SEQ ID NOS: 1605-1826 in Table 17, or SEQ ID NOS: 1605-1806, 1809-1816 and 1821-1826 in Table 17, or SEQ ID NOS: 1605-1820 in Table 17A, or SEQ ID NOS: 1605-1806 and 1809-1816 in Table 17A, or SEQ ID NOS: 1827-1979 in Table 17C, or SEQ ID NOS: 1827-1976 in Table 17C and optionally primer sequences attached at the 5′ and 3′ end of the sequence. 
In some embodiments, a combination of microorganism-specific nucleic acids or nucleic acid primer pairs includes or consists essentially of two or more nucleic acids or primer pairs containing, or consisting essentially of, a nucleotide sequence or pair of sequences (for primer pairs) selected from Table 16, or SEQ ID NOS: 49-520 of Table 16, or SEQ ID NOS: 49-452, 457-472 and 481-520 of Table 16, or SEQ ID NOS: 49-492 of Table 16, or SEQ ID NOS: 49-452, 457-472 and 481-492 of Table 16, or SEQ ID NOS: 49-480 of Table 16A, or SEQ ID NOS: 49-452 and 457-472 of Table 16A, or SEQ ID NOS: 521-826 of Table 16C, or SEQ ID NOS: 521-820 of Table 16C, or SEQ ID NOS: 827-1298 of Table 16, or SEQ ID NOS: 827-1230, 1235-1250 and 1259-1298 of Table 16, or SEQ ID NOS: 827-1270, or SEQ ID NOS: 827-1230, 1235-1250 and 1259-1270 of Table 16, or SEQ ID NOS: 827-1258 of Table 16D, or SEQ ID NOS: 827-1230 and 1235-1250 of Table 16D, or SEQ ID NOS: 1299-1604 of Table 16F, or SEQ ID NOS: 1299-1598 of Table 16F, or a sequence or sequences substantially identical or similar thereto, or any of the aforementioned nucleotide sequences of nucleic acids or primer pairs in which one or more thymine bases is substituted with a uracil base. 
In some embodiments, a combination of microorganism-specific nucleic acids or nucleic acid primer pairs includes or consists essentially of a plurality of nucleic acids or primer pairs wherein there is at least one nucleic acid or primer pair separately containing, or consisting essentially of, each nucleotide sequence or pair of sequences (for primer pairs) of SEQ ID NOS: 49-520 of Table 16, or SEQ ID NOS: 49-452, 457-472 and 481-520 of Table 16, or SEQ ID NOS: 49-492 of Table 16, or SEQ ID NOS: 49-452, 457-472 and 481-492 of Table 16, or SEQ ID NOS: 49-480 of Table 16A, or SEQ ID NOS: 49-452 and 457-472 of Table 16A, or SEQ ID NOS: 521-826 of Table 16C, or SEQ ID NOS: 521-820 of Table 16C, or SEQ ID NOS: 827-1298 of Table 16, or SEQ ID NOS: 827-1230, 1235-1250 and 1259-1298 of Table 16, or SEQ ID NOS: 827-1270, or SEQ ID NOS: 827-1230, 1235-1250 and 1259-1270 of Table 16, or SEQ ID NOS: 827-1258 of Table 16D, or SEQ ID NOS: 827-1230 and 1235-1250 of Table 16D, or SEQ ID NOS: 1299-1604 of Table 16F, or SEQ ID NOS: 1299-1598 of Table 16F, or substantially identical or similar sequences, and/or any of the aforementioned nucleotide sequences of nucleic acids or primer pairs in which one or more thymine bases is substituted with a uracil base.
In some embodiments, compositions provided herein contain or consist essentially of at least 1, at least 2, at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, at least 10, at least 15, at least 20, at least 25, at least 30, at least 35, at least 40, at least 45, at least 50, at least 55, at least 60, at least 65, at least 70, at least 75, at least 80, at least 85, at least 90, at least 95, at least 100, at least 125, at least 150, at least 175, at least 200, at least 225, at least 250, at least 275, at least 300 or more, or all of the nucleic acids, or all of the primer pairs, containing or consisting essentially of sequences selected from Table 16 or from SEQ ID NOS: 49-520 of Table 16, or SEQ ID NOS: 49-452, 457-472 and 481-520 of Table 16, or SEQ ID NOS: 49-492 of Table 16, or SEQ ID NOS: 49-452, 457-472 and 481-492 of Table 16, or SEQ ID NOS: 49-480 of Table 16A, or SEQ ID NOS: 49-452 and 457-472 of Table 16A, or SEQ ID NOS: 521-826 of Table 16C, or SEQ ID NOS: 521-820 of Table 16C, or SEQ ID NOS: 827-1298 of Table 16, or SEQ ID NOS: 827-1230, 1235-1250 and 1259-1298 of Table 16, or SEQ ID NOS: 827-1270, or SEQ ID NOS: 827-1230, 1235-1250 and 1259-1270 of Table 16, or SEQ ID NOS: 827-1258 of Table 16D, or SEQ ID NOS: 827-1230 and 1235-1250 of Table 16D, or SEQ ID NOS: 1299-1604 of Table 16F, or SEQ ID NOS: 1299-1598 of Table 16F, or a sequence or sequences substantially identical or similar thereto, or any of the aforementioned nucleotide sequences of nucleic acids or primer pairs in which one or more thymine bases is substituted with a uracil base.
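Because the embodiments above are defined by long listings of SEQ ID NO ranges, it can help to expand a listing into an explicit set of identifiers when comparing alternatives. The Python sketch below does this for the numeric part of a listing. The function name `parse_seq_ids` is hypothetical, and the function assumes its input contains only the identifier listing, since any other digits in the text (for example a table number) would also be picked up.

```python
import re


def parse_seq_ids(listing):
    """Expand a listing such as '49-452, 457-472 and 481-520' into a set.

    The input should contain only SEQ ID numbers and ranges; every integer
    found in the string is treated as an identifier or a range bound.
    """
    ids = set()
    for low, high in re.findall(r"(\d+)(?:\s*-\s*(\d+))?", listing):
        start = int(low)
        end = int(high) if high else start
        ids.update(range(start, end + 1))
    return ids


full_16a = parse_seq_ids("49-480")
reduced_16a = parse_seq_ids("49-452 and 457-472")

# Identifiers present in the full listing but absent from the reduced one.
excluded = sorted(full_16a - reduced_16a)
print(reduced_16a <= full_16a, excluded)
```

Applied to the two Table 16A listings recited above, the reduced listing is a subset of the full one, and the difference shows which SEQ ID NOs the narrower alternative leaves out.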


In some embodiments, a combination of nucleic acids and/or nucleic acid primer pairs includes two or more nucleic acids and/or nucleic acid primer pairs that bind to, hybridize to and/or amplify, or specifically bind to, hybridize to and/or amplify, a unique nucleic acid sequence contained in the genome of one or more of Akkermansia muciniphila, Bacteroides vulgatus, Bifidobacterium adolescentis, Campylobacter concisus, Campylobacter jejuni, Clostridioides difficile, Escherichia coli, Eubacterium rectale, Helicobacter bilis, Helicobacter hepaticus, Lactobacillus delbrueckii, Parabacteroides distasonis, Ruminococcus bromii, Streptococcus gallolyticus, and Streptococcus infantarius (referred to herein as “Group A” microorganisms; see Table 2A), which are species implicated as having a role in multiple conditions, diseases and/or disorders, including, for example oncological conditions including, for example, response to immuno-oncology treatment and cancer, gastrointestinal disorders, including, for example, irritable bowel syndrome, inflammatory bowel disease and coeliac disease, and autoimmune diseases, including, for example, lupus and rheumatoid arthritis. In some embodiments, the combination of nucleic acids and/or nucleic acid primer pairs includes a set of nucleic acid primer pairs in which each different nucleic acid primer pair specifically amplifies a different unique nucleic acid sequence contained in a different one of each of the genomes of the different microorganisms in Group A. 
In some embodiments, the combination of nucleic acids and/or nucleic acid primer pairs includes or consists essentially of nucleic acids and/or nucleic acid primer pairs that bind to, hybridize to and/or amplify, or specifically bind to, hybridize to and/or amplify, a nucleic acid (such as a nucleic acid from a microorganism, e.g., bacteria) containing a sequence selected from SEQ ID NOS: 1607-1609, 1619, 1620, 1635-1637, 1643-1645, 1663-1670, 1679-1684, 1699-1701, 1728-1730, 1752, 1753, 1801, 1802, 1809, 1810, 1827, 1828, 1831-1833, 1844, 1845, 1852-1856, 1864, 1876-1885, 1889-1891, 1899, 1900, 1932, 1933, 1968 and 1972 of Table 17 or a sequence that is substantially identical or similar to any of the aforementioned sequences. In some embodiments, the combination of nucleic acids and/or nucleic acid primer pairs includes or consists essentially of nucleic acids and/or nucleic acid primer pairs that bind to, hybridize to and/or amplify, or specifically bind to, hybridize to and/or amplify, a nucleic acid (such as a nucleic acid from a microorganism, e.g., bacteria) containing a sequence selected from SEQ ID NOS: 1607-1609, 1619, 1620, 1635-1637, 1643-1645, 1663-1670, 1679-1684, 1699-1701, 1728-1730, 1752, 1753, 1801, 1802, 1809 and 1810 of Table 17 or a sequence that is substantially identical or similar to any of the aforementioned sequences. 
In some embodiments, the combination of nucleic acids and/or nucleic acid primer pairs includes or consists essentially of primers and/or primer pairs capable of amplifying, or specifically amplifying, a nucleic acid (such as a nucleic acid from a microorganism, e.g., bacteria) containing a sequence selected from SEQ ID NOS: 1607-1609, 1619, 1620, 1635-1637, 1643-1645, 1663-1670, 1679-1684, 1699-1701, 1728-1730, 1752, 1753, 1801, 1802, 1809, 1810, 1827, 1828, 1831-1833, 1844, 1845, 1852-1856, 1864, 1876-1885, 1889-1891, 1899, 1900, 1932, 1933, 1968 and 1972 of Table 17 (or a sequence that is substantially identical or similar to any of the aforementioned sequences) to generate an amplicon sequence that is less than about 500, less than about 475, less than about 450, less than about 400, less than about 375, less than about 350, less than about 300, less than about 275, less than about 250, less than about 200, less than about 175, less than about 150, or less than about 100 nucleotides in length, or an amplicon sequence consisting essentially of a nucleotide sequence selected from SEQ ID NOS: 1607-1609, 1619, 1620, 1635-1637, 1643-1645, 1663-1670, 1679-1684, 1699-1701, 1728-1730, 1752, 1753, 1801, 1802, 1809, 1810, 1827, 1828, 1831-1833, 1844, 1845, 1852-1856, 1864, 1876-1885, 1889-1891, 1899, 1900, 1932, 1933, 1968 and 1972 of Table 17, or a sequence that is substantially identical or similar to any of the aforementioned sequences, and optionally containing the nucleic acid primer sequences at the 5′ and 3′ ends of the sequence. 
In some embodiments, the combination of nucleic acids and/or nucleic acid primer pairs includes or consists essentially of primers and/or primer pairs capable of amplifying, or specifically amplifying, a nucleic acid (such as a nucleic acid from a microorganism, e.g., bacteria) containing a sequence selected from SEQ ID NOS: 1607-1609, 1619, 1620, 1635-1637, 1643-1645, 1663-1670, 1679-1684, 1699-1701, 1728-1730, 1752, 1753, 1801, 1802, 1809, and 1810 of Table 17 (or a sequence that is substantially identical or similar to any of the aforementioned sequences) to generate an amplicon sequence that is less than about 500, less than about 475, less than about 450, less than about 400, less than about 375, less than about 350, less than about 300, less than about 275, less than about 250, less than about 200, less than about 175, less than about 150, or less than about 100 nucleotides in length, or an amplicon sequence consisting essentially of a nucleotide sequence selected from SEQ ID NOS: 1607-1609, 1619, 1620, 1635-1637, 1643-1645, 1663-1670, 1679-1684, 1699-1701, 1728-1730, 1752, 1753, 1801, 1802, 1809 and 1810 of Table 17, or a sequence that is substantially identical or similar to any of the aforementioned sequences, and optionally containing the nucleic acid primer sequences at the 5′ and 3′ ends of the sequence. 
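The amplicon length limits recited above can be illustrated with a simple in silico check. The sketch below is an assumption-laden illustration only: it uses exact string matching for primer binding (real primers tolerate mismatches), the sequences are invented toy sequences rather than entries from the Sequence Listing, and the helper names `find_amplicon` and `passes_length_limit` are hypothetical.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def find_amplicon(template, forward, reverse):
    """Return the amplicon, primer sequences included, or None if a primer site is absent.

    Assumes exact matches: the forward primer matches the given strand and the
    reverse primer matches the opposite strand downstream of the forward site.
    """
    template = template.upper()
    start = template.find(forward.upper())
    if start < 0:
        return None
    reverse_site = reverse_complement(reverse)
    site = template.find(reverse_site, start + len(forward))
    if site < 0:
        return None
    return template[start:site + len(reverse_site)]


def passes_length_limit(amplicon, limit):
    """True if an amplicon was found and is shorter than `limit` nucleotides."""
    return amplicon is not None and len(amplicon) < limit


# Toy template and primers (illustrative only).
template = "TTGACCA" + "ATGCGTACGT" + "GATTACAGATTACA" + "CCGGAATTCG" + "AAGT"
forward_primer = "ATGCGTACGT"
reverse_primer = "CGAATTCCGG"

amplicon = find_amplicon(template, forward_primer, reverse_primer)
print(amplicon, passes_length_limit(amplicon, 100))
```

Because the returned amplicon includes both primer sites, its ends correspond to the option of an amplicon containing the nucleic acid primer sequences at the 5′ and 3′ ends.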
In some embodiments, the combination of nucleic acids and/or nucleic acid primer pairs includes or consists essentially of nucleic acids and/or nucleic acid primer pairs containing, or consisting essentially of, a nucleotide sequence or sequences (in the case of primer pairs) selected from SEQ ID NOS: 53-58, 77-80, 109-114, 125-130, 165-180, 197-208, 237-242, 295-300, 343-346, 441-444, 457-460, 493-498, 511-520, 521-524, 529-534, 555-558, 571-580, 595, 596, 619-638, 645-650, 665-668, 731-734, 803, 804, 811, 812 and/or SEQ ID NOS: 831-836, 855-858, 887-892, 903-908, 943-958, 975-986, 1015-1020, 1073-1078, 1121-1124, 1219-1222, 1235-1238, 1271-1276, 1289-1298, 1299-1302, 1307-1312, 1333-1336, 1349-1358, 1373, 1374, 1397-1416, 1423-1428, 1443-1446, 1509-1512, 1581, 1582, 1589, and 1590 in Table 16, or substantially identical or similar sequences, or any of the aforementioned nucleotide sequences of nucleic acids or primer pairs in which one or more thymine bases is substituted with a uracil base. In some embodiments, the combination of nucleic acids and/or nucleic acid primer pairs includes or consists essentially of nucleic acids and/or nucleic acid primer pairs containing, or consisting essentially of, a nucleotide sequence or sequences (in the case of primer pairs) selected from SEQ ID NOS: 53-58, 77-80, 109-114, 125-130, 165-180, 197-208, 237-242, 295-300, 343-346, 441-444, 457-460, 493-498 and 511-520 in Table 16 and/or SEQ ID NOS: 831-836, 855-858, 887-892, 903-908, 943-958, 975-986, 1015-1020, 1073-1078, 1121-1124, 1219-1222, 1235-1238, 1271-1276, 1289-1298 in Table 16, or substantially identical or similar sequences, or any of the aforementioned nucleotide sequences of nucleic acids or primer pairs in which one or more thymine bases is substituted with a uracil base. 
In some embodiments, the combination of nucleic acids and/or nucleic acid primer pairs includes or consists essentially of nucleic acids and/or nucleic acid primer pairs containing, or consisting essentially of, a nucleotide sequence or sequences (in the case of primer pairs) selected from SEQ ID NOS: 53-58, 77-80, 109-114, 125-130, 165-180, 197-208, 237-242, 295-300, 343-346, 441-444, 457-460 in Table 16A and/or SEQ ID NOS: 831-836, 855-858, 887-892, 903-908, 943-958, 975-986, 1015-1020, 1073-1078, 1121-1124, 1219-1222, 1235-1238 in Table 16D, or substantially identical or similar sequences, or any of the aforementioned nucleotide sequences of nucleic acids or primer pairs in which one or more thymine bases is substituted with a uracil base. In some embodiments, the combination includes, or consists essentially of, different nucleic acids or primers separately containing, or consisting essentially of, each of the different sequences of SEQ ID NOS: 53-58, 77-80, 109-114, 125-130, 165-180, 197-208, 237-242, 295-300, 343-346, 441-444, 457-460, 493-498 and 511-520 in Table 16 and/or SEQ ID NOS: 831-836, 855-858, 887-892, 903-908, 943-958, 975-986, 1015-1020, 1073-1078, 1121-1124, 1219-1222, 1235-1238, 1271-1276, 1289-1298 in Table 16, or substantially identical or similar sequences, or any of the aforementioned nucleotide sequences of nucleic acids or primer pairs in which one or more thymine bases is substituted with a uracil base. 
In some embodiments, the combination includes, or consists essentially of, different nucleic acids or primers separately containing, or consisting essentially of, each of the different sequences of SEQ ID NOS: 53-58, 77-80, 109-114, 125-130, 165-180, 197-208, 237-242, 295-300, 343-346, 441-444, 457-460 in Table 16A and/or SEQ ID NOS: 831-836, 855-858, 887-892, 903-908, 943-958, 975-986, 1015-1020, 1073-1078, 1121-1124, 1219-1222, 1235-1238 in Table 16D, or substantially identical or similar sequences, or any of the aforementioned nucleotide sequences of nucleic acids or primer pairs in which one or more thymine bases is substituted with a uracil base.
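The variants in which one or more thymine bases is substituted with a uracil base can be sketched as a simple string operation. The Python example below is illustrative only: the primer shown is a made-up toy sequence, the function name `substitute_uracil` is hypothetical, and positions are assumed to be 0-based.

```python
def substitute_uracil(primer, positions=None):
    """Replace thymine (T) with uracil (U) in a primer sequence.

    positions: 0-based indices of the thymines to replace; if None, every
    thymine is replaced. Raises ValueError if a listed position is not a T.
    """
    bases = list(primer.upper())
    if positions is None:
        positions = [i for i, base in enumerate(bases) if base == "T"]
    for i in positions:
        if bases[i] != "T":
            raise ValueError(f"position {i} is {bases[i]}, not T")
        bases[i] = "U"
    return "".join(bases)


toy_primer = "ACGTTAGCT"  # invented sequence, not from the Sequence Listing
all_substituted = substitute_uracil(toy_primer)
one_substituted = substitute_uracil(toy_primer, positions=[4])
print(all_substituted, one_substituted)
```

Passing explicit positions models the "one or more" language, since any subset of the thymines in a primer may be substituted.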









TABLE 2

MICROORGANISM GROUPS: Exemplary Combinations Of Nucleic Acids, Primers And Primer Pairs That Bind To, Hybridize To and/or Amplify A Unique Nucleic Acid Sequence Contained In The Genomes Of One Or More Of The Microorganisms in the Group for Groups A (Table 2A), B (Table 2B), C (Table 2C), D (Table 2D) and E (Table 2E)

Column I. Combination includes or consists essentially of nucleic acids and/or primer pairs that bind to, hybridize to and/or amplify, or specifically bind to, hybridize to and/or amplify, a nucleic acid containing a sequence selected from SEQ ID NOS: (SEE FIRST COLUMN OF TABLES 2A-2E)

Column II. Combination includes or consists essentially of primers and/or primer pairs capable of amplifying or specifically amplifying a nucleic acid containing a sequence selected from (a) SEQ ID NOS: to generate an amplicon sequence <about 500, <about 475, <about 450, <about 400, <about 375, <about 350, <about 300, <about 275, <about 250, <about 200, <about 175, <about 150, <about 100 nucleotides in length, or an amplicon sequence consisting essentially of a nucleotide sequence selected from (b) SEQ ID NOS: (SEE SECOND COLUMN OF TABLES 2A-2E)

Column III. Combination includes or consists essentially of nucleic acids and/or nucleic acid primer pairs containing, or consisting essentially of, a nucleotide sequence or sequences (in the case of primer pairs) selected from SEQ ID NOS: (SEE THIRD COLUMN OF TABLES 2A-2E)


GROUP A MICROORGANISMS

Akkermansia muciniphila, Bacteroides vulgatus, Bifidobacterium adolescentis, Campylobacter concisus, Campylobacter jejuni, Clostridioides difficile, Escherichia coli, Eubacterium rectale, Helicobacter bilis, Helicobacter hepaticus, Lactobacillus delbrueckii, Parabacteroides distasonis, Ruminococcus bromii, Streptococcus gallolyticus and Streptococcus infantarius

Column I:
SEQ ID NOS: 1607-1609, 1619, 1620, 1635-1637, 1643-1645, 1663-1670, 1679-1684, 1699-1701, 1728-1730, 1752, 1753, 1801, 1802, 1809, 1810, 1827, 1828, 1831-1833, 1844, 1845, 1852-1856, 1864, 1876-1885, 1889-1891, 1899, 1900, 1932, 1933, 1968 and 1972 of Table 17, or a substantially identical or similar sequence
OR
SEQ ID NOS: 1607-1609, 1619, 1620, 1635-1637, 1643-1645, 1663-1670, 1679-1684, 1699-1701, 1728-1730, 1752, 1753, 1801, 1802, 1809 and 1810 of Table 17, or a substantially identical or similar sequence

Column II:
(a) SEQ ID NOS: 1607-1609, 1619, 1620, 1635-1637, 1643-1645, 1663-1670, 1679-1684, 1699-1701, 1728-1730, 1752, 1753, 1801, 1802, 1809, 1810, 1827, 1828, 1831-1833, 1844, 1845, 1852-1856, 1864, 1876-1885, 1889-1891, 1899, 1900, 1932, 1933, 1968 and 1972 of Table 17, or a substantially identical or similar sequence,
(b) SEQ ID NOS: 1607-1609, 1619, 1620, 1635-1637, 1643-1645, 1663-1670, 1679-1684, 1699-1701, 1728-1730, 1752, 1753, 1801, 1802, 1809, 1810, 1827, 1828, 1831-1833, 1844, 1845, 1852-1856, 1864, 1876-1885, 1889-1891, 1899, 1900, 1932, 1933, 1968 and 1972 of Table 17, or a substantially identical or similar sequence, or a sequence that is substantially identical or similar to any of the aforementioned sequences, and optionally containing the nucleic acid primer sequences at the 5′ and 3′ ends of the sequence
OR
(a) SEQ ID NOS: 1607-1609, 1619, 1620, 1635-1637, 1643-1645, 1663-1670, 1679-1684, 1699-1701, 1728-1730, 1752, 1753, 1801, 1802, 1809, and 1810 of Table 17, or a substantially identical or similar sequence,
(b) SEQ ID NOS: 1607-1609, 1619, 1620, 1635-1637, 1643-1645, 1663-1670, 1679-1684, 1699-1701, 1728-1730, 1752, 1753, 1801, 1802, 1809 and 1810 of Table 17, or a substantially identical or similar sequence, or a sequence that is substantially identical or similar to any of the aforementioned sequences, and optionally containing the nucleic acid primer sequences at the 5′ and 3′ ends of the sequence

Column III:
SEQ ID NOS: 53-58, 77-80, 109-114, 125-130, 165-180, 197-208, 237-242, 295-300, 343-346, 441-444, 457-460, 493-498, 511-520, 521-524, 529-534, 555-558, 571-580, 595, 596, 619-638, 645-650, 665-668, 731-734, 803, 804, 811, 812 and/or SEQ ID NOS: 831-836, 855-858, 887-892, 903-908, 943-958, 975-986, 1015-1020, 1073-1078, 1121-1124, 1219-1222, 1235-1238, 1271-1276, 1289-1298, 1299-1302, 1307-1312, 1333-1336, 1349-1358, 1373, 1374, 1397-1416, 1423-1428, 1443-1446, 1509-1512, 1581, 1582, 1589, and 1590 in Table 16,
OR
SEQ ID NOS: 53-58, 77-80, 109-114, 125-130, 165-180, 197-208, 237-242, 295-300, 343-346, 441-444, 457-460, 493-498, 511-520 in Table 16 and/or SEQ ID NOS: 831-836, 855-858, 887-892, 903-908, 943-958, 975-986, 1015-1020, 1073-1078, 1121-1124, 1219-1222, 1235-1238, 1271-1276, 1289-1298 in Table 16,
OR
SEQ ID NOS: 53-58, 77-80, 109-114, 125-130, 165-180, 197-208, 237-242, 295-300, 343-346, 441-444, 457-460 in Table 16A and/or SEQ ID NOS: 831-836, 855-858, 887-892, 903-908, 943-958, 975-986, 1015-1020, 1073-1078, 1121-1124, 1219-1222, 1235-1238 in Table 16D, or substantially identical or similar sequences of any of the above, or any of the aforementioned nucleotide sequences of nucleic acids or primer pairs in which one or more thymine bases is substituted with a uracil base










GROUP B MICROORGANISMS

Akkermansia muciniphila, Anaerococcus vaginalis, Atopobium parvulum, Bacteroides nordii, Bacteroides thetaiotaomicron, Bacteroides vulgatus, Bifidobacterium adolescentis, Bifidobacterium longum, Collinsella aerofaciens, Collinsella stercoris, Desulfovibrio alaskensis, Dorea formicigenerans, Enterococcus faecium, Eubacterium rectale, Faecalibacterium prausnitzii, Gardnerella vaginalis, Gemmiger formicilis, Holdemania filiformis, Klebsiella pneumoniae, Parabacteroides distasonis, Parabacteroides merdae, Phascolarctobacterium faecium, Prevotella histicola, Roseburia intestinalis, Ruminococcus bromii, Slackia exigua, Streptococcus infantarius, and Veillonella parvula

Column I:
SEQ ID NOS: 1605, 1606, 1643-1645, 1648-1650, 1659-1667, 1682-1684, 1689-1694, 1702-1704, 1718-1723, 1728-1730, 1735-1742, 1748-1751, 1754-1766, 1780-1783, 1791, 1792, 1801, 1802, 1809-1816, 1821-1826, 1829, 1830, 1864, 1869-1871, 1874-1882, 1890-1896, 1901-1903, 1910-1915, 1920-1922, 1930-1931, 1934-1939, 1954, 1955, 1961-1964, 1968, 1972-1974 and 1977-1979 of Table 17, or a substantially identical or similar sequence
OR
SEQ ID NOS: 1605, 1606, 1643-1645, 1648-1650, 1659-1667, 1682-1684, 1689-1694, 1702-1704, 1718-1723, 1728-1730, 1735-1742, 1748-1751, 1754-1766, 1780-1783, 1791, 1792, 1801, 1802, 1809-1816 and 1821-1826 of Table 17, or a substantially identical or similar sequence
OR
SEQ ID NOS: 1605, 1606, 1643-1645, 1648-1650, 1659-1667, 1682-1684, 1689-1694, 1702-1704, 1718-1723, 1728-1730, 1735-1742, 1748-1751, 1754-1766, 1780-1783, 1791, 1792, 1801, 1802 and 1809-1816 of Table 17A, or a substantially identical or similar sequence

Column II:
(a) SEQ ID NOS: 1605, 1606, 1643-1645, 1648-1650, 1659-1667, 1682-1684, 1689-1694, 1702-1704, 1718-1723, 1728-1730, 1735-1742, 1748-1751, 1754-1766, 1780-1783, 1791, 1792, 1801, 1802, 1809-1816, 1821-1826, 1829, 1830, 1864, 1869-1871, 1874-1882, 1890-1896, 1901-1903, 1910-1915, 1920-1922, 1930-1931, 1934-1939, 1954, 1955, 1961-1964, 1968, 1972-1974 and 1977-1979 of Table 17, or a substantially identical or similar sequence,
(b) SEQ ID NOS: 1605, 1606, 1643-1645, 1648-1650, 1659-1667, 1682-1684, 1689-1694, 1702-1704, 1718-1723, 1728-1730, 1735-1742, 1748-1751, 1754-1766, 1780-1783, 1791, 1792, 1801, 1802, 1809-1816, 1821-1826, 1829, 1830, 1864, 1869-1871, 1874-1882, 1890-1896, 1901-1903, 1910-1915, 1920-1922, 1930-1931, 1934-1939, 1954, 1955, 1961-1964, 1968, 1972-1974 and 1977-1979 of Table 17, or a substantially identical or similar sequence, or a sequence that is substantially identical or similar to any of the aforementioned sequences, and optionally containing the nucleic acid primer sequences at the 5′ and 3′ ends of the sequence
OR
(a) SEQ ID NOS: 1605, 1606, 1643-1645, 1648-1650, 1659-1667, 1682-1684, 1689-1694, 1702-1704, 1718-1723, 1728-1730, 1735-1742, 1748-1751, 1754-1766, 1780-1783, 1791, 1792, 1801, 1802, 1809-1816 and 1821-1826 of Table 17, or a substantially identical or similar sequence,
(b) SEQ ID NOS: 1605, 1606, 1643-1645, 1648-1650, 1659-1667, 1682-1684, 1689-1694, 1702-1704, 1718-1723, 1728-1730, 1735-1742, 1748-1751, 1754-1766, 1780-1783, 1791, 1792, 1801, 1802, 1809-1816 and 1821-1826 of Table 17, or a substantially identical or similar sequence, or a sequence that is substantially identical or similar to any of the aforementioned sequences, and optionally containing the nucleic acid primer sequences at the 5′ and 3′ ends of the sequence
OR
(a) SEQ ID NOS: 1605, 1606, 1643-1645, 1648-1650, 1659-1667, 1682-1684, 1689-1694, 1702-1704, 1718-1723, 1728-1730, 1735-1742, 1748-1751, 1754-1766, 1780-1783, 1791, 1792, 1801, 1802 and 1809-1816 of Table 17A, or a substantially identical or similar sequence,
(b) SEQ ID NOS: 1605, 1606, 1643-1645, 1648-1650, 1659-1667, 1682-1684, 1689-1694, 1702-1704, 1718-1723, 1728-1730, 1735-1742, 1748-1751, 1754-1766, 1780-1783, 1791, 1792, 1801, 1802 and 1809-1816 of Table 17A, or a substantially identical or similar sequence, or a sequence that is substantially identical or similar to any of the aforementioned sequences, and optionally containing the nucleic acid primer sequences at the 5′ and 3′ ends of the sequence

Column III:
SEQ ID NOS: 49-52, 125-130, 135-140, 157-174, 203-208, 217-228, 243-248, 275-286, 295-300, 309-324, 335-342, 347-372, 399-406, 421-424, 441-444, 457-472, 481-492, 525-528, 595, 596, 605-610, 615-632, 647-660, 669-674, 687-698, 707-712, 727-730, 735-746, 775-778, 789-796, 803, 804, 811-816, 821-826 and/or SEQ ID NOS: 827-830, 903-908, 913-918, 935-952, 981-986, 995-1006, 1021-1026, 1053-1064, 1073-1078, 1087-1102, 1113-1120, 1125-1150, 1177-1184, 1199-1202, 1219-1222, 1235-1250, 1259-1270, 1303-1306, 1373, 1374, 1383-1388, 1393-1410, 1425-1438, 1447-1452, 1465-1476, 1485-1490, 1505-1508, 1513-1524, 1553-1556, 1567-1574, 1581, 1582, 1589-1594, 1599-1604 in Table 16,
OR
SEQ ID NOS: 49-52, 125-130, 135-140, 157-174, 203-208, 217-228, 243-248, 275-286, 295-300, 309-324, 335-342, 347-372, 399-406, 421-424, 441-444, 457-472, 481-492 and/or SEQ ID NOS: 827-830, 903-908, 913-918, 935-952, 981-986, 995-1006, 1021-1026, 1053-1064, 1073-1078, 1087-1102, 1113-1120, 1125-1150, 1177-1184, 1199-1202, 1219-1222, 1235-1250, 1259-1270 in Table 16,
OR
SEQ ID NOS: 49-52, 125-130, 135-140, 157-174, 203-208, 217-228, 243-248, 275-286, 295-300, 309-324, 335-342, 347-372, 399-406, 421-424, 441-444, 457-472 in Table 16A and/or SEQ ID NOS: 827-830, 903-908, 913-918, 935-952, 981-986, 995-1006, 1021-1026, 1053-1064, 1073-1078, 1087-1102, 1113-1120, 1125-1150, 1177-1184, 1199-1202, 1219-1222, 1235-1250 in Table 16D, or substantially identical or similar sequences of any of the above, or any of the aforementioned nucleotide sequences of nucleic acids or primer pairs in which one or more thymine bases is substituted with a uracil base










GROUP C MICROORGANISMS

Bacteroides fragilis, Campylobacter jejuni, Cutibacterium acnes, Escherichia coli, Fusobacterium nucleatum, Helicobacter bilis, Helicobacter bizzozeronii, Helicobacter hepaticus, Helicobacter pylori, Helicobacter salomonis, Peptostreptococcus stomatis, and Streptococcus gallolyticus

Column I:
SEQ ID NOS: 1616, 1619, 1620, 1625-1628, 1635-1640, 1699, 1700, 1705-1708, 1752, 1753, 1784-1786, 1817-1820, 1827, 1828, 1840, 1841, 1844, 1845, 1852-1859, 1899, 1900, 1904, 1905, 1932, 1933, 1956-1958, 1975, 1976 of Table 17, or a substantially identical or similar sequence
OR
SEQ ID NOS: 1616, 1619, 1620, 1625-1628, 1635-1640, 1699, 1700, 1705-1708, 1752, 1753, 1784-1786, 1817-1820 of Table 17, or a substantially identical or similar sequence

Column II:
(a) SEQ ID NOS: 1616, 1619, 1620, 1625-1628, 1635-1640, 1699, 1700, 1705-1708, 1752, 1753, 1784-1786, 1817-1820, 1827, 1828, 1840, 1841, 1844, 1845, 1852-1859, 1899, 1900, 1904, 1905, 1932, 1933, 1956-1958, 1975, 1976 of Table 17, or a substantially identical or similar sequence,
(b) SEQ ID NOS: 1616, 1619, 1620, 1625-1628, 1635-1640, 1699, 1700, 1705-1708, 1752, 1753, 1784-1786, 1817-1820, 1827, 1828, 1840, 1841, 1844, 1845, 1852-1859, 1899, 1900, 1904, 1905, 1932, 1933, 1956-1958, 1975, 1976 of Table 17, or a substantially identical or similar sequence, or a sequence that is substantially identical or similar to any of the aforementioned sequences, and optionally containing the nucleic acid primer sequences at the 5′ and 3′ ends of the sequence
OR
(a) SEQ ID NOS: 1616, 1619, 1620, 1625-1628, 1635-1640, 1699, 1700, 1705-1708, 1752, 1753, 1784-1786, 1817-1820 of Table 17, or a substantially identical or similar sequence,
(b) SEQ ID NOS: 1616, 1619, 1620, 1625-1628, 1635-1640, 1699, 1700, 1705-1708, 1752, 1753, 1784-1786, 1817-1820 of Table 17, or a substantially identical or similar sequence, or a sequence that is substantially identical or similar to any of the aforementioned sequences, and optionally containing the nucleic acid primer sequences at the 5′ and 3′ ends of the sequence

Column III:
SEQ ID NOS: 71, 72, 77-80, 89-96, 109-120, 237-242, 249-256, 343-346, 407-412, 473-480, 493-496, 511-520, 521-524, 547-550, 555-558, 561-568, 571-586, 665-668, 675-678, 731-734, 779-784, 817-820 and/or SEQ ID NOS: 849, 850, 855-858, 867-874, 887-898, 1012-1020, 1025-1034, 1121-1124, 1185-1190, 1251-1258, 1271-1276, 1289-1298, 1299-1302, 1325-1328, 1333-1336, 1339-1346, 1349-1364, 1443-1446, 1453-1456, 1509-1512, 1557-1562, 1595-1598 in Table 16,
OR
SEQ ID NOS: 71, 72, 77-80, 89-96, 109-120, 237-242, 249-256, 343-346, 407-412, 473-480, 493-496, 511-520 and/or SEQ ID NOS: 849, 850, 855-858, 867-874, 887-898, 1012-1020, 1025-1034, 1121-1124, 1185-1190, 1251-1258, 1271-1276, 1289-1298 in Table 16,
OR
SEQ ID NOS: 71, 72, 77-80, 89-96, 109-120, 237-242, 249-256, 343-346, 407-412, 473-480 and/or SEQ ID NOS: 849, 850, 855-858, 867-874, 887-898, 1012-1020, 1025-1034, 1121-1124, 1185-1190, 1251-1258 or substantially identical or similar sequences of any of the above, or any of the aforementioned nucleotide sequences of nucleic acids or primer pairs in which one or more thymine bases is substituted with a uracil base










GROUP D MICROORGANISMS

Akkermansia muciniphila, Bifidobacterium bifidum, Bifidobacterium longum, Blautia coccoides, Campylobacter concisus, Campylobacter curvus, Campylobacter jejuni, Campylobacter rectus, Clostridioides difficile, Escherichia coli, Eubacterium rectale, Fusobacterium nucleatum, Helicobacter bilis, Helicobacter hepaticus, Helicobacter pylori, Klebsiella pneumoniae, Lactobacillus delbrueckii, Parabacteroides distasonis, Proteus mirabilis, Ruminococcus bromii and Ruminococcus gnavus

Column I:
SEQ ID NOS in Table 17, or Table 17A and Table 17B, which correspond to Group D microorganisms, or a substantially identical or similar sequence

Column II:
(a) SEQ ID NOS in Table 17, or Table 17A and Table 17B, which correspond to Group D microorganisms, or a substantially identical or similar sequence,
(b) SEQ ID NOS in Table 17, or Table 17A and Table 17B, which correspond to Group D microorganisms, or a substantially identical or similar sequence, or a sequence that is substantially identical or similar to any of the aforementioned sequences, and optionally containing the nucleic acid primer sequences at the 5′ and 3′ ends of the sequence

Column III:
SEQ ID NOS corresponding to Group D microorganisms in Table 16, SEQ ID NOS: 49-520 of Table 16, SEQ ID NOS: 49-492 of Table 16, SEQ ID NOS: 49-480 of Table 16A, SEQ ID NOS: 521-826 of Table 16C, SEQ ID NOS: 521-820 of Table 16C, SEQ ID NOS: 827-1298 of Table 16, SEQ ID NOS: 827-1258 of Table 16D, SEQ ID NOS: 1299-1604 of Table 16F, or SEQ ID NOS: 1299-1598 of Table 16F, or substantially identical or similar sequences of any of the above, or any of the aforementioned nucleotide sequences of nucleic acids or primer pairs in which one or more thymine bases is substituted with a uracil base










GROUP E MICROORGANISMS

Akkermansia muciniphila, Bacteroides fragilis, Bacteroides vulgatus, Bifidobacterium adolescentis, Campylobacter concisus, Campylobacter jejuni, Citrobacter rodentium, Clostridioides difficile, Enterococcus gallinarum, Escherichia coli, Helicobacter bilis, Lactobacillus delbrueckii, Lactobacillus murinus, Lactobacillus reuteri, Lactobacillus rhamnosus, Lactococcus lactis, and Prevotella copri

Column I:
SEQ ID NOS in Table 17, or Table 17A and Table 17B, which correspond to Group E microorganisms, or a substantially identical or similar sequence

Column II:
(a) SEQ ID NOS in Table 17, or Table 17A and Table 17B, which correspond to Group E microorganisms, or a substantially identical or similar sequence,
(b) SEQ ID NOS in Table 17, or Table 17A and Table 17B, which correspond to Group E microorganisms, or a substantially identical or similar sequence, or a sequence that is substantially identical or similar to any of the aforementioned sequences, and optionally containing the nucleic acid primer sequences at the 5′ and 3′ ends of the sequence

Column III:
SEQ ID NOS corresponding to Group E microorganisms in Table 16, SEQ ID NOS: 49-520 of Table 16, SEQ ID NOS: 49-492 of Table 16, SEQ ID NOS: 49-480 of Table 16A, SEQ ID NOS: 521-826 of Table 16C, SEQ ID NOS: 521-820 of Table 16C, SEQ ID NOS: 827-1298 of Table 16, SEQ ID NOS: 827-1258 of Table 16D, SEQ ID NOS: 1299-1604 of Table 16F, or SEQ ID NOS: 1299-1598 of Table 16F, or substantially identical or similar sequences of any of the above, or any of the aforementioned nucleotide sequences of nucleic acids or primer pairs in which one or more thymine bases is substituted with a uracil base









In some embodiments, a combination of nucleic acids and/or nucleic acid primer pairs includes two or more nucleic acids and/or nucleic acid primer pairs that specifically bind to, hybridize to and/or amplify a unique nucleic acid sequence contained in the genome of one or more of Akkermansia muciniphila, Anaerococcus vaginalis, Atopobium parvulum, Bacteroides nordii, Bacteroides thetaiotaomicron, Bacteroides vulgatus, Bifidobacterium adolescentis, Bifidobacterium longum, Collinsella aerofaciens, Collinsella stercoris, Desulfovibrio alaskensis, Dorea formicigenerans, Enterococcus faecium, Eubacterium rectale, Faecalibacterium prausnitzii, Gardnerella vaginalis, Gemmiger formicilis, Holdemania filiformis, Klebsiella pneumoniae, Parabacteroides distasonis, Parabacteroides merdae, Phascolarctobacterium faecium, Prevotella histicola, Roseburia intestinalis, Ruminococcus bromii, Slackia exigua, Streptococcus infantarius, and Veillonella parvula (referred to herein as “Group B” microorganisms; see Table 2B), which are species implicated as having a role in response to immuno-oncology treatment. In some embodiments, the combination of nucleic acids and/or nucleic acid primer pairs includes a set of nucleic acid primer pairs in which each different nucleic acid primer pair specifically amplifies a different unique nucleic acid sequence contained in a different one of each of the genomes of the different microorganisms in Group B. 
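Because Group A and Group B are defined as species lists, the species common to both can be found with a set intersection. The Python sketch below is illustrative only; the species names are those recited for Groups A and B in this description, while the variable names are hypothetical.

```python
GROUP_A = {
    "Akkermansia muciniphila", "Bacteroides vulgatus", "Bifidobacterium adolescentis",
    "Campylobacter concisus", "Campylobacter jejuni", "Clostridioides difficile",
    "Escherichia coli", "Eubacterium rectale", "Helicobacter bilis",
    "Helicobacter hepaticus", "Lactobacillus delbrueckii", "Parabacteroides distasonis",
    "Ruminococcus bromii", "Streptococcus gallolyticus", "Streptococcus infantarius",
}

GROUP_B = {
    "Akkermansia muciniphila", "Anaerococcus vaginalis", "Atopobium parvulum",
    "Bacteroides nordii", "Bacteroides thetaiotaomicron", "Bacteroides vulgatus",
    "Bifidobacterium adolescentis", "Bifidobacterium longum", "Collinsella aerofaciens",
    "Collinsella stercoris", "Desulfovibrio alaskensis", "Dorea formicigenerans",
    "Enterococcus faecium", "Eubacterium rectale", "Faecalibacterium prausnitzii",
    "Gardnerella vaginalis", "Gemmiger formicilis", "Holdemania filiformis",
    "Klebsiella pneumoniae", "Parabacteroides distasonis", "Parabacteroides merdae",
    "Phascolarctobacterium faecium", "Prevotella histicola", "Roseburia intestinalis",
    "Ruminococcus bromii", "Slackia exigua", "Streptococcus infantarius",
    "Veillonella parvula",
}

# Species recited in both groups, and species recited only in Group A.
shared_species = sorted(GROUP_A & GROUP_B)
group_a_only = sorted(GROUP_A - GROUP_B)

for name in shared_species:
    print(name)
```

A panel covering both groups would need only one primer set for each shared species, which is one reason such a comparison may be useful when assembling a combination.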
In some embodiments, the combination of nucleic acids and/or nucleic acid primer pairs includes or consists essentially of nucleic acids and/or nucleic acid primer pairs that bind to, hybridize to and/or amplify, or specifically bind to, hybridize to and/or amplify, a nucleic acid (such as a nucleic acid from a microorganism, e.g., bacteria) containing a sequence selected from SEQ ID NOS: 1605, 1606, 1643-1645, 1648-1650, 1659-1667, 1682-1684, 1689-1694, 1702-1704, 1718-1723, 1728-1730, 1735-1742, 1748-1751, 1754-1766, 1780-1783, 1791, 1792, 1801, 1802, 1809-1816, 1821-1826, 1829, 1830, 1864, 1869-1871, 1874-1882, 1890-1896, 1901-1903, 1910-1915, 1920-1922, 1930-1931, 1934-1939, 1954, 1955, 1961-1964, 1968, 1972-1974 and 1977-1979 of Table 17, and/or a substantially identical or similar sequence. In some embodiments, the combination of nucleic acids and/or nucleic acid primer pairs includes or consists essentially of nucleic acids and/or nucleic acid primer pairs that bind to, hybridize to and/or amplify, or specifically bind to, hybridize to and/or amplify, a nucleic acid (such as a nucleic acid from a microorganism, e.g., bacteria) containing a sequence selected from SEQ ID NOS: 1605, 1606, 1643-1645, 1648-1650, 1659-1667, 1682-1684, 1689-1694, 1702-1704, 1718-1723, 1728-1730, 1735-1742, 1748-1751, 1754-1766, 1780-1783, 1791, 1792, 1801, 1802, 1809-1816 and 1821-1826, of Table 17, and/or a substantially identical or similar sequence. 
In some embodiments, the combination of nucleic acids and/or nucleic acid primer pairs includes or consists essentially of nucleic acids and/or nucleic acid primer pairs that bind to, hybridize to and/or amplify, or specifically bind to, hybridize to and/or amplify, a nucleic acid (such as a nucleic acid from a microorganism, e.g., bacteria) containing a sequence selected from SEQ ID NOS: 1605, 1606, 1643-1645, 1648-1650, 1659-1667, 1682-1684, 1689-1694, 1702-1704, 1718-1723, 1728-1730, 1735-1742, 1748-1751, 1754-1766, 1780-1783, 1791, 1792, 1801, 1802 and 1809-1816 of Table 17A, and/or a substantially identical or similar sequence. In some embodiments, the combination of nucleic acids and/or nucleic acid primer pairs includes or consists essentially of primers and/or primer pairs capable of amplifying, or specifically amplifying, a nucleic acid (such as a nucleic acid from a microorganism, e.g., bacteria) containing a sequence selected from SEQ ID NOS: 1605, 1606, 1643-1645, 1648-1650, 1659-1667, 1682-1684, 1689-1694, 1702-1704, 1718-1723, 1728-1730, 1735-1742, 1748-1751, 1754-1766, 1780-1783, 1791, 1792, 1801, 1802, 1809-1816, 1821-1826, 1829, 1830, 1864, 1869-1871, 1874-1882, 1890-1896, 1901-1903, 1910-1915, 1920-1922, 1930-1931, 1934-1939, 1954, 1955, 1961-1964, 1968, 1972-1974 and 1977-1979 of Table 17 (or a sequence that is substantially identical or similar to any of the aforementioned sequences) to generate an amplicon sequence that is less than about 500, less than about 475, less than about 450, less than about 400, less than about 375, less than about 350, less than about 300, less than about 275, less than about 250, less than about 200, less than about 175, less than about 150, or less than about 100 nucleotides in length, or an amplicon sequence consisting essentially of a nucleotide sequence selected from SEQ ID NOS: 1605, 1606, 1643-1645, 1648-1650, 1659-1667, 1682-1684, 1689-1694, 1702-1704, 1718-1723, 1728-1730, 1735-1742, 1748-1751, 1754-1766, 1780-1783, 1791, 1792, 1801, 1802, 1809-1816, 1821-1826, 1829, 1830, 1864, 1869-1871, 1874-1882, 1890-1896, 1901-1903, 1910-1915, 1920-1922, 1930-1931, 1934-1939, 1954, 1955, 1961-1964, 1968, 1972-1974 and 1977-1979 of Table 17, or a sequence that is substantially identical or similar to any of the aforementioned sequences, and optionally containing the nucleic acid primer sequences at the 5′ and 3′ ends of the sequence. 
In some embodiments, the combination of nucleic acids and/or nucleic acid primer pairs includes or consists essentially of primers and/or primer pairs capable of amplifying, or specifically amplifying, a nucleic acid (such as a nucleic acid from a microorganism, e.g., bacteria) containing a sequence selected from SEQ ID NOS: 1605, 1606, 1643-1645, 1648-1650, 1659-1667, 1682-1684, 1689-1694, 1702-1704, 1718-1723, 1728-1730, 1735-1742, 1748-1751, 1754-1766, 1780-1783, 1791, 1792, 1801, 1802, 1809-1816 and 1821-1826 of Table 17 (or a sequence that is substantially identical or similar to any of the aforementioned sequences) to generate an amplicon sequence that is less than about 500, less than about 475, less than about 450, less than about 400, less than about 375, less than about 350, less than about 300, less than about 275, less than about 250, less than about 200, less than about 175, less than about 150, or less than about 100 nucleotides in length, or an amplicon sequence consisting essentially of a nucleotide sequence selected from SEQ ID NOS: 1605, 1606, 1643-1645, 1648-1650, 1659-1667, 1682-1684, 1689-1694, 1702-1704, 1718-1723, 1728-1730, 1735-1742, 1748-1751, 1754-1766, 1780-1783, 1791, 1792, 1801, 1802, 1809-1816 and 1821-1826 of Table 17, or a sequence that is substantially identical or similar to any of the aforementioned sequences, and optionally containing the nucleic acid primer sequences at the 5′ and 3′ ends of the sequence. 
In some embodiments, the combination of nucleic acids and/or nucleic acid primer pairs includes or consists essentially of primers and/or primer pairs capable of amplifying, or specifically amplifying, a nucleic acid (such as a nucleic acid from a microorganism, e.g., bacteria) containing a sequence selected from SEQ ID NOS: 1605, 1606, 1643-1645, 1648-1650, 1659-1667, 1682-1684, 1689-1694, 1702-1704, 1718-1723, 1728-1730, 1735-1742, 1748-1751, 1754-1766, 1780-1783, 1791, 1792, 1801, 1802 and 1809-1816 of Table 17A (or a sequence that is substantially identical or similar to any of the aforementioned sequences) to generate an amplicon sequence that is less than about 500, less than about 475, less than about 450, less than about 400, less than about 375, less than about 350, less than about 300, less than about 275, less than about 250, less than about 200, less than about 175, less than about 150, or less than about 100 nucleotides in length, or an amplicon sequence consisting essentially of a nucleotide sequence selected from SEQ ID NOS: 1605, 1606, 1643-1645, 1648-1650, 1659-1667, 1682-1684, 1689-1694, 1702-1704, 1718-1723, 1728-1730, 1735-1742, 1748-1751, 1754-1766, 1780-1783, 1791, 1792, 1801, 1802 and 1809-1816 of Table 17A, or a sequence that is substantially identical or similar to any of the aforementioned sequences, and optionally containing the nucleic acid primer sequences at the 5′ and 3′ ends of the sequence. 
In some embodiments, the combination of nucleic acids and/or nucleic acid primer pairs includes or consists essentially of nucleic acids and/or nucleic acid primer pairs containing, or consisting essentially of, a nucleotide sequence or sequences (in the case of primer pairs) selected from SEQ ID NOS: 49-52, 125-130, 135-140, 157-174, 203-208, 217-228, 243-248, 275-286, 295-300, 309-324, 335-342, 347-372, 399-406, 421-424, 441-444, 457-472, 481-492, 525-528, 595, 596, 605-610, 615-632, 647-660, 669-674, 687-698, 707-712, 727-730, 735-746, 775-778, 789-796, 803, 804, 811-816, 821-826 and/or SEQ ID NOS: 827-830, 903-908, 913-918, 935-952, 981-986, 995-1006, 1021-1026, 1053-1064, 1073-1078, 1087-1102, 1113-1120, 1125-1150, 1177-1184, 1199-1202, 1219-1222, 1235-1250, 1259-1270, 1303-1306, 1373, 1374, 1383-1388, 1393-1410, 1425-1438, 1447-1452, 1465-1476, 1485-1490, 1505-1508, 1513-1524, 1553-1556, 1567-1574, 1581, 1582, 1589-1594, 1599-1604 in Table 16, or substantially identical or similar sequences, or any of the aforementioned nucleotide sequences of nucleic acids or primer pairs in which one or more thymine bases is substituted with a uracil base. 
In some embodiments, the combination of nucleic acids and/or nucleic acid primer pairs includes or consists essentially of nucleic acids and/or nucleic acid primer pairs containing, or consisting essentially of, a nucleotide sequence or sequences (in the case of primer pairs) selected from SEQ ID NOS: 49-52, 125-130, 135-140, 157-174, 203-208, 217-228, 243-248, 275-286, 295-300, 309-324, 335-342, 347-372, 399-406, 421-424, 441-444, 457-472, 481-492 and/or SEQ ID NOS: 827-830, 903-908, 913-918, 935-952, 981-986, 995-1006, 1021-1026, 1053-1064, 1073-1078, 1087-1102, 1113-1120, 1125-1150, 1177-1184, 1199-1202, 1219-1222, 1235-1250, 1259-1270 in Table 16, or substantially identical or similar sequences, or any of the aforementioned nucleotide sequences of nucleic acids or primer pairs in which one or more thymine bases is substituted with a uracil base. In some embodiments, the combination of nucleic acids and/or nucleic acid primer pairs includes or consists essentially of nucleic acids and/or nucleic acid primer pairs containing, or consisting essentially of, a nucleotide sequence or sequences (in the case of primer pairs) selected from SEQ ID NOS: 49-52, 125-130, 135-140, 157-174, 203-208, 217-228, 243-248, 275-286, 295-300, 309-324, 335-342, 347-372, 399-406, 421-424, 441-444, 457-472 in Table 16A and/or SEQ ID NOS: 827-830, 903-908, 913-918, 935-952, 981-986, 995-1006, 1021-1026, 1053-1064, 1073-1078, 1087-1102, 1113-1120, 1125-1150, 1177-1184, 1199-1202, 1219-1222, 1235-1250 in Table 16D, or substantially identical or similar sequences, or any of the aforementioned nucleotide sequences of nucleic acids or primer pairs in which one or more thymine bases is substituted with a uracil base. 
In some embodiments, the combination includes, or consists essentially of, different nucleic acids or primers separately containing, or consisting essentially of, each of the different sequences of SEQ ID NOS: 49-52, 125-130, 135-140, 157-174, 203-208, 217-228, 243-248, 275-286, 295-300, 309-324, 335-342, 347-372, 399-406, 421-424, 441-444, 457-472, 481-492 and/or SEQ ID NOS: 827-830, 903-908, 913-918, 935-952, 981-986, 995-1006, 1021-1026, 1053-1064, 1073-1078, 1087-1102, 1113-1120, 1125-1150, 1177-1184, 1199-1202, 1219-1222, 1235-1250, 1259-1270 in Table 16, or substantially identical or similar sequences, or any of the aforementioned nucleotide sequences of nucleic acids or primer pairs in which one or more thymine bases is substituted with a uracil base. In some embodiments, the combination includes, or consists essentially of, different nucleic acids or primers separately containing, or consisting essentially of, each of the different sequences of SEQ ID NOS: 49-52, 125-130, 135-140, 157-174, 203-208, 217-228, 243-248, 275-286, 295-300, 309-324, 335-342, 347-372, 399-406, 421-424, 441-444, 457-472 in Table 16A and/or SEQ ID NOS: 827-830, 903-908, 913-918, 935-952, 981-986, 995-1006, 1021-1026, 1053-1064, 1073-1078, 1087-1102, 1113-1120, 1125-1150, 1177-1184, 1199-1202, 1219-1222, 1235-1250 in Table 16D, or substantially identical or similar sequences, or any of the aforementioned nucleotide sequences of nucleic acids or primer pairs in which one or more thymine bases is substituted with a uracil base.


In some embodiments, a combination of nucleic acids and/or nucleic acid primer pairs includes two or more nucleic acids and/or nucleic acid primer pairs that specifically bind to, hybridize to and/or amplify a unique nucleic acid sequence contained in the genome of one or more of Bacteroides fragilis, Campylobacter jejuni, Cutibacterium acnes, Escherichia coli, Fusobacterium nucleatum, Helicobacter bilis, Helicobacter bizzozeronii, Helicobacter hepaticus, Helicobacter pylori, Helicobacter salomonis, Peptostreptococcus stomatis, and Streptococcus gallolyticus (referred to herein as “Group C” microorganisms; see Table 2C), which are species implicated as having a role in cancer. In some embodiments, a combination of nucleic acids and/or nucleic acid primer pairs includes two or more nucleic acids and/or nucleic acid primer pairs that specifically bind to, hybridize to and/or amplify a unique nucleic acid sequence contained in the genome of one or more of Bacteroides fragilis, Campylobacter jejuni, Cutibacterium acnes, Escherichia coli, Fusobacterium nucleatum, Helicobacter bilis, Helicobacter bizzozeronii, Helicobacter hepaticus, Helicobacter pylori, Peptostreptococcus stomatis, and Streptococcus gallolyticus (referred to herein as “Subgroup 1” of the Group C microorganisms). In some embodiments, the combination of nucleic acids and/or nucleic acid primer pairs includes a set of nucleic acid primer pairs in which each different nucleic acid primer pair specifically amplifies a different unique nucleic acid sequence contained in a different one of each of the genomes of the different microorganisms in Group C or in Group C excluding Helicobacter salomonis (i.e., Subgroup 1 of Group C). 
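The relationship between Group C and Subgroup 1 described above can be illustrated with a minimal Python sketch. The species names are copied from the paragraph; the variable names, the `uncovered_species` helper and the primer-pair identifiers in the usage note are hypothetical and are not part of the disclosure.

```python
# Group C species as listed in the text (see Table 2C).
GROUP_C = {
    "Bacteroides fragilis",
    "Campylobacter jejuni",
    "Cutibacterium acnes",
    "Escherichia coli",
    "Fusobacterium nucleatum",
    "Helicobacter bilis",
    "Helicobacter bizzozeronii",
    "Helicobacter hepaticus",
    "Helicobacter pylori",
    "Helicobacter salomonis",
    "Peptostreptococcus stomatis",
    "Streptococcus gallolyticus",
}

# Subgroup 1 is Group C excluding Helicobacter salomonis.
SUBGROUP_1 = GROUP_C - {"Helicobacter salomonis"}


def uncovered_species(targets, panel):
    """Return the target species that no primer pair in the panel addresses.

    `panel` maps a hypothetical primer-pair identifier to the species whose
    genome contains the sequence that pair is intended to amplify.
    """
    return sorted(set(targets) - set(panel.values()))
```

For example, with a hypothetical two-pair panel `{"pair_a": "Escherichia coli", "pair_b": "Helicobacter pylori"}`, `uncovered_species(SUBGROUP_1, panel)` lists the nine remaining Subgroup 1 species for which the panel has no primer pair.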
In some embodiments, the combination of nucleic acids and/or nucleic acid primer pairs includes or consists essentially of nucleic acids and/or nucleic acid primer pairs that bind to, hybridize to and/or amplify, or specifically bind to, hybridize to and/or amplify, a nucleic acid (such as a nucleic acid from a microorganism, e.g., bacteria) containing a sequence selected from SEQ ID NOS: 1616, 1619, 1620, 1625-1628, 1635-1640, 1699, 1700, 1705-1708, 1752, 1753, 1784-1786, 1817-1820, 1827, 1828, 1840, 1841, 1844, 1845, 1852-1859, 1899, 1900, 1904, 1905, 1932, 1933, 1956-1958, 1975, 1976 of Table 17, and/or a substantially identical or similar sequence, or a nucleic acid containing a sequence selected from SEQ ID NOS: 1616, 1619, 1620, 1625-1628, 1635-1640, 1699, 1700, 1705-1708, 1752, 1753, 1784-1786, 1827, 1828, 1840, 1841, 1844, 1845, 1852-1859, 1899, 1900, 1904, 1905, 1932, 1933, 1956, 1957, 1958 of Table 17, and/or a substantially identical or similar sequence. In some embodiments, the combination of nucleic acids and/or nucleic acid primer pairs includes or consists essentially of nucleic acids and/or nucleic acid primer pairs that bind to, hybridize to and/or amplify, or specifically bind to, hybridize to and/or amplify, a nucleic acid (such as a nucleic acid from a microorganism, e.g., bacteria) containing a sequence selected from SEQ ID NOS: 1616, 1619, 1620, 1625-1628, 1635-1640, 1699, 1700, 1705-1708, 1752, 1753, 1784-1786, 1817-1820 of Table 17A, and/or a substantially identical or similar sequence, or a nucleic acid containing a sequence selected from SEQ ID NOS: 1616, 1619, 1620, 1625-1628, 1635-1640, 1699, 1700, 1705-1708, 1752, 1753, 1784-1786 of Table 17A, and/or a substantially identical or similar sequence. 
In some embodiments, the combination of nucleic acids and/or nucleic acid primer pairs includes or consists essentially of primers and/or primer pairs capable of amplifying, or specifically amplifying, a nucleic acid (such as a nucleic acid from a microorganism, e.g., bacteria) containing a sequence selected from SEQ ID NOS: 1616, 1619, 1620, 1625-1628, 1635-1640, 1699, 1700, 1705-1708, 1752, 1753, 1784-1786, 1817-1820, 1827, 1828, 1840, 1841, 1844, 1845, 1852-1859, 1899, 1900, 1904, 1905, 1932, 1933, 1956-1958, 1975, 1976 of Table 17 (or a sequence that is substantially identical or similar to any of the aforementioned sequences) to generate an amplicon sequence that is less than about 500, less than about 475, less than about 450, less than about 400, less than about 375, less than about 350, less than about 300, less than about 275, less than about 250, less than about 200, less than about 175, less than about 150, or less than about 100 nucleotides in length, or an amplicon sequence consisting essentially of a nucleotide sequence selected from SEQ ID NOS: 1616, 1619, 1620, 1625-1628, 1635-1640, 1699, 1700, 1705-1708, 1752, 1753, 1784-1786, 1817-1820, 1827, 1828, 1840, 1841, 1844, 1845, 1852-1859, 1899, 1900, 1904, 1905, 1932, 1933, 1956-1958, 1975, 1976 of Table 17, or a sequence that is substantially identical or similar to any of the aforementioned sequences, and optionally containing the nucleic acid primer sequences at the 5′ and 3′ ends of the sequence. 
In some embodiments, the combination of nucleic acids and/or nucleic acid primer pairs includes or consists essentially of primers and/or primer pairs capable of amplifying, or specifically amplifying, a nucleic acid (such as a nucleic acid from a microorganism, e.g., bacteria) containing a sequence selected from SEQ ID NOS: 1616, 1619, 1620, 1625-1628, 1635-1640, 1699, 1700, 1705-1708, 1752, 1753, 1784-1786, 1817-1820 of Table 17A (or a sequence that is substantially identical or similar to any of the aforementioned sequences) to generate an amplicon sequence that is less than about 500, less than about 475, less than about 450, less than about 400, less than about 375, less than about 350, less than about 300, less than about 275, less than about 250, less than about 200, less than about 175, less than about 150, or less than about 100 nucleotides in length, or an amplicon sequence consisting essentially of a nucleotide sequence selected from SEQ ID NOS: 1616, 1619, 1620, 1625-1628, 1635-1640, 1699, 1700, 1705-1708, 1752, 1753, 1784-1786, 1817-1820 of Table 17A, or a sequence that is substantially identical or similar to any of the aforementioned sequences, and optionally containing the nucleic acid primer sequences at the 5′ and 3′ ends of the sequence. 
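The amplicon length limits recited above can be checked with a short Python sketch. The limit values are taken from the text; the function names are hypothetical, and treating "less than about N" as a strict comparison `length < N` is a simplifying assumption, since the text does not quantify "about" here.

```python
# Upper limits, in nucleotides, recited in the text.
AMPLICON_LIMITS = (500, 475, 450, 400, 375, 350, 300, 275, 250, 200, 175, 150, 100)


def satisfied_limits(amplicon_length):
    """Return every recited limit that the amplicon length falls under.

    Assumption: "less than about N" is modeled as strictly less than N.
    """
    if amplicon_length <= 0:
        raise ValueError("amplicon length must be a positive number of nucleotides")
    return [limit for limit in AMPLICON_LIMITS if amplicon_length < limit]


def tightest_limit(amplicon_length):
    """Return the smallest recited limit the amplicon satisfies, or None."""
    limits = satisfied_limits(amplicon_length)
    return min(limits) if limits else None
```

Under these assumptions, a 180-nucleotide amplicon satisfies every limit from 500 down to 200, so `tightest_limit(180)` returns 200, while a 500-nucleotide amplicon satisfies none of the recited limits.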
In some embodiments, the combination of nucleic acids and/or nucleic acid primer pairs includes or consists essentially of nucleic acids and/or nucleic acid primer pairs containing, or consisting essentially of, a nucleotide sequence or sequences (in the case of primer pairs) selected from SEQ ID NOs: 71, 72, 77-80, 89-96, 109-120, 237-242, 249-256, 343-346, 407-412, 473-480, 493-496, 511-520, 521-524, 547-550, 555-558, 561-568, 571-586, 665-668, 675-678, 731-734, 779-784, 817-820 and/or SEQ ID NOS: 849, 850, 855-858, 867-874, 887-898, 1012-1020, 1025-1034, 1121-1124, 1185-1190, 1251-1258, 1271-1276, 1289-1298, 1299-1302, 1325-1328, 1333-1336, 1339-1346, 1349-1364, 1443-1446, 1453-1456, 1509-1512, 1557-1562, 1595-1598 in Table 16, or substantially identical or similar sequences, or any of the aforementioned nucleotide sequences of nucleic acids or primer pairs in which one or more thymine bases is substituted with a uracil base. In some embodiments, the combination of nucleic acids and/or nucleic acid primer pairs includes or consists essentially of nucleic acids and/or nucleic acid primer pairs containing, or consisting essentially of, a nucleotide sequence or sequences (in the case of primer pairs) selected from SEQ ID NOs: 71, 72, 77-80, 89-96, 109-120, 237-242, 249-256, 343-346, 407-412, 473-480, 493-496, 511-520 and/or SEQ ID NOS: 849, 850, 855-858, 867-874, 887-898, 1012-1020, 1025-1034, 1121-1124, 1185-1190, 1251-1258, 1271-1276, 1289-1298 in Table 16, or substantially identical or similar sequences, or any of the aforementioned nucleotide sequences of nucleic acids or primer pairs in which one or more thymine bases is substituted with a uracil base. 
In some embodiments, the combination of nucleic acids and/or nucleic acid primer pairs includes or consists essentially of nucleic acids and/or nucleic acid primer pairs containing, or consisting essentially of, a nucleotide sequence or sequences (in the case of primer pairs) selected from SEQ ID NOs: 71, 72, 77-80, 89-96, 109-120, 237-242, 249-256, 343-346, 407-412, 473-480 and/or SEQ ID NOS: 849, 850, 855-858, 867-874, 887-898, 1012-1020, 1025-1034, 1121-1124, 1185-1190, 1251-1258 in Table 16, or substantially identical or similar sequences, or any of the aforementioned nucleotide sequences of nucleic acids or primer pairs in which one or more thymine bases is substituted with a uracil base. In some embodiments, the combination includes, or consists essentially of, different nucleic acids or primers separately containing, or consisting essentially of, each of the different sequences of SEQ ID NOS: 71, 72, 77-80, 89-96, 109-120, 237-242, 249-256, 343-346, 407-412, 473-480, 493-496, 511-520, 521-524, 547-550, 555-558, 561-568, 571-586, 665-668, 675-678, 731-734, 779-784, 817-820 and/or SEQ ID NOS: 849, 850, 855-858, 867-874, 887-898, 1012-1020, 1025-1034, 1121-1124, 1185-1190, 1251-1258, 1271-1276, 1289-1298, 1299-1302, 1325-1328, 1333-1336, 1339-1346, 1349-1364, 1443-1446, 1453-1456, 1509-1512, 1557-1562, 1595-1598 in Table 16, or substantially identical or similar sequences, or any of the aforementioned nucleotide sequences of nucleic acids or primer pairs in which one or more thymine bases is substituted with a uracil base. 
In some embodiments, the combination includes, or consists essentially of, different nucleic acids or primers separately containing, or consisting essentially of, each of the different sequences of SEQ ID NOS: 71, 72, 77-80, 89-96, 109-120, 237-242, 249-256, 343-346, 407-412, 493-496, 511-520, 521-524, 547-550, 555-558, 561-568, 571-586, 665-668, 675-678, 731-734, 779-784, and/or SEQ ID NOS: 849, 850, 855-858, 867-874, 887-898, 1012-1020, 1025-1034, 1121-1124, 1185-1190, 1271-1276, 1289-1298, 1299-1302, 1325-1328, 1333-1336, 1339-1346, 1349-1364, 1443-1446, 1453-1456, 1509-1512, 1557-1562, in Table 16, or substantially identical or similar sequences, or any of the aforementioned nucleotide sequences of nucleic acids or primer pairs in which one or more thymine bases is substituted with a uracil base. In some embodiments, the combination includes, or consists essentially of, different nucleic acids or primers separately containing, or consisting essentially of, each of the different sequences of SEQ ID NOS: 71, 72, 77-80, 89-96, 109-120, 237-242, 249-256, 343-346, 407-412, 473-480, 493-496, 511-520 and/or SEQ ID NOS: 849, 850, 855-858, 867-874, 887-898, 1012-1020, 1025-1034, 1121-1124, 1185-1190, 1251-1258, 1271-1276, 1289-1298 in Table 16, or substantially identical or similar sequences, or any of the aforementioned nucleotide sequences of nucleic acids or primer pairs in which one or more thymine bases is substituted with a uracil base. 
In some embodiments, the combination includes, or consists essentially of, different nucleic acids or primers separately containing, or consisting essentially of, each of the different sequences of SEQ ID NOS: 71, 72, 77-80, 89-96, 109-120, 237-242, 249-256, 343-346, 407-412, 493-496, 511-520 and/or SEQ ID NOS: 849, 850, 855-858, 867-874, 887-898, 1012-1020, 1025-1034, 1121-1124, 1185-1190, 1271-1276, 1289-1298 in Table 16, or substantially identical or similar sequences, or any of the aforementioned nucleotide sequences of nucleic acids or primer pairs in which one or more thymine bases is substituted with a uracil base. In some embodiments, the combination includes, or consists essentially of, different nucleic acids or primers separately containing, or consisting essentially of, each of the different sequences of SEQ ID NOS: 71, 72, 77-80, 89-96, 109-120, 237-242, 249-256, 343-346, 407-412, 473-480 and/or SEQ ID NOS: 849, 850, 855-858, 867-874, 887-898, 1012-1020, 1025-1034, 1121-1124, 1185-1190, 1251-1258 in Table 16, or substantially identical or similar sequences, or any of the aforementioned nucleotide sequences of nucleic acids or primer pairs in which one or more thymine bases is substituted with a uracil base. In some embodiments, the combination includes, or consists essentially of, different nucleic acids or primers separately containing, or consisting essentially of, each of the different sequences of SEQ ID NOS: 71, 72, 77-80, 89-96, 109-120, 237-242, 249-256, 343-346, 407-412 and/or SEQ ID NOS: 849, 850, 855-858, 867-874, 887-898, 1012-1020, 1025-1034, 1121-1124, 1185-1190 in Table 16, or substantially identical or similar sequences, or any of the aforementioned nucleotide sequences of nucleic acids or primer pairs in which one or more thymine bases is substituted with a uracil base.
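The thymine-to-uracil variants mentioned above can be sketched in Python as follows. The function name and the example sequence in the usage note are hypothetical; the sequence is not taken from the Sequence Listing, and positions are 0-based indices chosen by the caller.

```python
def substitute_uracil(sequence, positions):
    """Return a copy of `sequence` with the thymines at `positions` replaced by uracil.

    `positions` are 0-based indices. A ValueError is raised if a position
    does not hold a thymine, so no other base is altered by mistake.
    """
    bases = list(sequence.upper())
    for pos in positions:
        if bases[pos] != "T":
            raise ValueError(f"position {pos} holds {bases[pos]!r}, not 'T'")
        bases[pos] = "U"
    return "".join(bases)
```

For the illustrative sequence `"ACGTTGCAT"`, `substitute_uracil("ACGTTGCAT", [3, 8])` gives `"ACGUTGCAU"`, that is, a primer variant in which two of the three thymine bases are replaced and the third is left unchanged.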


In some embodiments, a combination of nucleic acids and/or nucleic acid primer pairs includes two or more nucleic acids and/or nucleic acid primer pairs that specifically bind to, hybridize to and/or amplify a unique nucleic acid sequence contained in the genome of one or more of Akkermansia muciniphila, Bifidobacterium bifidum, Bifidobacterium longum, Blautia coccoides, Campylobacter concisus, Campylobacter curvus, Campylobacter jejuni, Campylobacter rectus, Clostridioides difficile, Escherichia coli, Eubacterium rectale, Fusobacterium nucleatum, Helicobacter bilis, Helicobacter hepaticus, Helicobacter pylori, Klebsiella pneumoniae, Lactobacillus delbrueckii, Parabacteroides distasonis, Proteus mirabilis, Ruminococcus bromii and Ruminococcus gnavus (referred to herein as “Group D” microorganisms; see Table 2D), which are species implicated as having a role in gastrointestinal disorders, including, for example, irritable bowel syndrome, inflammatory bowel disease and celiac disease. In some embodiments, the combination of nucleic acids and/or nucleic acid primer pairs includes a set of nucleic acid primer pairs in which each different nucleic acid primer pair specifically amplifies a different unique nucleic acid sequence contained in a different one of each of the genomes of the different microorganisms in Group D. In some embodiments, the combination of nucleic acids and/or nucleic acid primer pairs includes or consists essentially of nucleic acids and/or nucleic acid primer pairs that bind to, hybridize to and/or amplify, or specifically bind to, hybridize to and/or amplify, a nucleic acid (such as a nucleic acid from a microorganism, e.g., bacteria) containing a sequence selected from the sequences in Table 17, or Table 17 excluding SEQ ID NOS: 1807, 1808 and 1971, or Table 17A and Table 17B, or Table 17B and Table 17A that excludes SEQ ID NOS: 1807 and 1808, and/or a substantially identical or similar sequence, which correspond to a Group D microorganism. 
In some embodiments, the combination of nucleic acids and/or nucleic acid primer pairs includes or consists essentially of primers and/or primer pairs capable of amplifying, or specifically amplifying, a nucleic acid (such as a nucleic acid from a microorganism, e.g., bacteria) containing a sequence selected from the sequences in Table 17, or Table 17 excluding SEQ ID NOS: 1807, 1808 and 1971, or Table 17A and Table 17B, or Table 17B and Table 17A that excludes SEQ ID NOS: 1807 and 1808, and/or a substantially identical or similar sequence, which correspond to a Group D microorganism (or a sequence that is substantially identical or similar to any of the aforementioned sequences) to generate an amplicon sequence that is less than about 500, less than about 475, less than about 450, less than about 400, less than about 375, less than about 350, less than about 300, less than about 275, less than about 250, less than about 200, less than about 175, less than about 150, or less than about 100 nucleotides in length, or an amplicon sequence consisting essentially of a nucleotide sequence selected from the sequences in Table 17, or Table 17 excluding SEQ ID NOS: 1807, 1808 and 1971, or Table 17A and Table 17B, or Table 17B and Table 17A that excludes SEQ ID NOS: 1807 and 1808, and/or a substantially identical or similar sequence, which correspond to a Group D microorganism, or a sequence that is substantially identical or similar to any of the aforementioned sequences, and optionally containing the nucleic acid primer sequences at the 5′ and 3′ ends of the sequence. 
In some embodiments, the combination of nucleic acids and/or nucleic acid primer pairs includes or consists essentially of primers and/or primer pairs capable of amplifying, or specifically amplifying, a nucleic acid (such as a nucleic acid from a microorganism, e.g., bacteria) containing a sequence selected from sequences in Table 17, or Table 17 excluding SEQ ID NOS: 1807, 1808 and 1971, or Table 17A and Table 17B, or Table 17B and Table 17A that excludes SEQ ID NOS: 1807 and 1808 (or a sequence that is substantially identical or similar to any of the aforementioned sequences), which correspond to a Group D microorganism, to generate an amplicon sequence that is less than about 500, less than about 475, less than about 450, less than about 400, less than about 375, less than about 350, less than about 300, less than about 275, less than about 250, less than about 200, less than about 175, less than about 150, or less than about 100 nucleotides in length, or an amplicon sequence consisting essentially of a nucleotide sequence selected from sequences in Table 17, or Table 17A and Table 17B, which correspond to a Group D microorganism, or a sequence that is substantially identical or similar to any of the aforementioned sequences, and optionally containing the nucleic acid primer sequences at the 5′ and 3′ ends of the sequence. 
In some embodiments, the combination of nucleic acids and/or nucleic acid primer pairs includes or consists essentially of nucleic acids and/or nucleic acid primer pairs containing, or consisting essentially of, a nucleotide sequence or sequences (in the case of primer pairs) selected from sequences corresponding to Group D microorganisms in Table 16, or Table 16 excluding SEQ ID NOS: 453-456, 809, 810, 1231-1234 and 1587-1588, or SEQ ID NOS: 49-520 of Table 16, or SEQ ID NOS: 49-452 and 457-520 of Table 16, or SEQ ID NOS: 49-492 of Table 16, or SEQ ID NOS: 49-452 and 457-492 of Table 16, or SEQ ID NOS: 49-480 of Table 16A, or SEQ ID NOS: 49-452 and 457-480 of Table 16A, or SEQ ID NOS: 521-826 of Table 16C, or SEQ ID NOS: 521-820 of Table 16C, or SEQ ID NOS: 827-1298 of Table 16, or SEQ ID NOS: 827-1230 and 1235-1298 of Table 16, or SEQ ID NOS: 827-1258 of Table 16D, or SEQ ID NOS: 827-1230 and 1235-1258 of Table 16D, or SEQ ID NOS: 1299-1604 of Table 16F, or SEQ ID NOS: 1299-1598 of Table 16F, or substantially identical or similar sequences, or any of the aforementioned nucleotide sequences of nucleic acids or primer pairs in which one or more thymine bases is substituted with a uracil base. 
In some embodiments, the combination includes, or consists essentially of, different nucleic acids or primers separately containing, or consisting essentially of, each of the different sequences corresponding to Group D microorganisms in Table 16, or Table 16 excluding SEQ ID NOS: 453-456, 809, 810, 1231-1234 and 1587-1588, or SEQ ID NOS: 49-520 of Table 16, or SEQ ID NOS: 49-452 and 457-520 of Table 16, or SEQ ID NOS: 49-492 of Table 16, or SEQ ID NOS: 49-452 and 457-492 of Table 16, or SEQ ID NOS: 49-480 of Table 16A, or SEQ ID NOS: 49-452 and 457-480 of Table 16A, or SEQ ID NOS: 521-826 of Table 16C, or SEQ ID NOS: 521-820 of Table 16C, or SEQ ID NOS: 827-1298 of Table 16, or SEQ ID NOS: 827-1230 and 1235-1298 of Table 16, or SEQ ID NOS: 827-1258 of Table 16D, or SEQ ID NOS: 827-1230 and 1235-1258 of Table 16D, or SEQ ID NOS: 1299-1604 of Table 16F, or SEQ ID NOS: 1299-1598 of Table 16F, or substantially identical or similar sequences, or any of the aforementioned nucleotide sequences of nucleic acids or primer pairs in which one or more thymine bases is substituted with a uracil base.
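The SEQ ID NO ranges recited above are written in two equivalent ways: as a full range with exclusions, and as the same range split around the excluded identifiers. A minimal Python sketch can confirm that the two forms agree. The `expand_seq_ids` helper is hypothetical and assumes the list syntax used in the text (comma-separated entries, hyphenated inclusive ranges, and an optional "and" before the last entry).

```python
def expand_seq_ids(spec):
    """Expand a list such as "49-452 and 457-520" into a set of integers.

    Assumed syntax: entries separated by commas or " and "; a hyphenated
    entry "a-b" denotes the inclusive range a..b.
    """
    ids = set()
    for part in spec.replace(" and ", ",").split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            start, end = (int(value) for value in part.split("-"))
            ids.update(range(start, end + 1))
        else:
            ids.add(int(part))
    return ids


# "SEQ ID NOS: 49-520" with 453-456 excluded matches "49-452 and 457-520".
forward = expand_seq_ids("49-520") - expand_seq_ids("453-456")
# "SEQ ID NOS: 827-1298" with 1231-1234 excluded matches "827-1230 and 1235-1298".
reverse = expand_seq_ids("827-1298") - expand_seq_ids("1231-1234")
```

Here `forward == expand_seq_ids("49-452 and 457-520")` and `reverse == expand_seq_ids("827-1230 and 1235-1298")` both hold, which is consistent with the excluded identifiers 453-456 and 1231-1234 recited in the text.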


In some embodiments, a combination of nucleic acids and/or nucleic acid primer pairs includes two or more nucleic acids and/or nucleic acid primer pairs that specifically bind to, hybridize to and/or amplify a unique nucleic acid sequence contained in the genome of one or more of Akkermansia muciniphila, Bacteroides fragilis, Bacteroides vulgatus, Bifidobacterium adolescentis, Campylobacter concisus, Campylobacter jejuni, Citrobacter rodentium, Clostridioides difficile, Enterococcus gallinarum, Escherichia coli, Helicobacter bilis, Lactobacillus delbrueckii, Lactobacillus murinus, Lactobacillus reuteri, Lactobacillus rhamnosus, Lactococcus lactis, and Prevotella copri (referred to herein as “Group E” microorganisms; see Table 2E), which are species implicated as having a role in autoimmune disorders, including, for example, lupus and rheumatoid arthritis. In some embodiments, the combination of nucleic acids and/or nucleic acid primer pairs includes a set of nucleic acid primer pairs in which each different nucleic acid primer pair specifically amplifies a different unique nucleic acid sequence contained in a different one of each of the genomes of the different microorganisms in Group E. In some embodiments, the combination of nucleic acids and/or nucleic acid primer pairs includes or consists essentially of nucleic acids and/or nucleic acid primer pairs that bind to, hybridize to and/or amplify, or specifically bind to, hybridize to and/or amplify, a nucleic acid (such as a nucleic acid from a microorganism, e.g., bacteria) containing a sequence selected from the sequences in Table 17, or Table 17A and Table 17B, and/or a substantially identical or similar sequence, which correspond to a Group E microorganism. 
In some embodiments, the combination of nucleic acids and/or nucleic acid primer pairs includes or consists essentially of primers and/or primer pairs capable of amplifying, or specifically amplifying, a nucleic acid (such as a nucleic acid from a microorganism, e.g., bacteria) containing a sequence selected from the sequences in Table 17, or Table 17A and Table 17B, and/or a substantially identical or similar sequence, which correspond to a Group E microorganism (or a sequence that is substantially identical or similar to any of the aforementioned sequences) to generate an amplicon sequence that is less than about 500, less than about 475, less than about 450, less than about 400, less than about 375, less than about 350, less than about 300, less than about 275, less than about 250, less than about 200, less than about 175, less than about 150, or less than about 100 nucleotides in length, or an amplicon sequence consisting essentially of a nucleotide sequence selected from the sequences in Table 17, or Table 17A and Table 17B, and/or a substantially identical or similar sequence, which correspond to a Group E microorganism, or a sequence that is substantially identical or similar to any of the aforementioned sequences, and optionally containing the nucleic acid primer sequences at the 5′ and 3′ ends of the sequence. 
In some embodiments, the combination of nucleic acids and/or nucleic acid primer pairs includes or consists essentially of primers and/or primer pairs capable of amplifying, or specifically amplifying, a nucleic acid (such as a nucleic acid from a microorganism, e.g., bacteria) containing a sequence selected from sequences in Table 17, or Table 17A and Table 17B, (or a sequence that is substantially identical or similar to any of the aforementioned sequences) which correspond to a Group E microorganism to generate an amplicon sequence that is less than about 500, less than about 475, less than about 450, less than about 400, less than about 375, less than about 350, less than about 300, less than about 275, less than about 250, less than about 200, less than about 175, less than about 150, or less than about 100 nucleotides in length, or an amplicon sequence consisting essentially of a nucleotide sequence selected from sequences in Table 17, or Table 17A and Table 17B, which correspond to a Group E microorganism or a sequence that is substantially identical or similar to any of the aforementioned sequences, and optionally containing the nucleic acid primer sequences at the 5′ and 3′ ends of the sequence. 
In some embodiments, the combination of nucleic acids and/or nucleic acid primer pairs includes or consists essentially of nucleic acids and/or nucleic acid primer pairs containing, or consisting essentially of, a nucleotide sequence or sequences (in the case of primer pairs) selected from sequences corresponding to Group E microorganisms in Table 16, SEQ ID NOS: 49-520 of Table 16, SEQ ID NOS: 49-492 of Table 16, SEQ ID NOS: 49-480 of Table 16A, SEQ ID NOS: 521-826 of Table 16C, SEQ ID NOS: 521-820 of Table 16C, SEQ ID NOS: 827-1298 of Table 16, SEQ ID NOS: 827-1258 of Table 16D, SEQ ID NOS: 1299-1604 of Table 16F, or SEQ ID NOS: 1299-1598 of Table 16F, or substantially identical or similar sequences, or any of the aforementioned nucleotide sequences of nucleic acids or primer pairs in which one or more thymine bases is substituted with a uracil base. 
In some embodiments, the combination includes, or consists essentially of, different nucleic acids or primers separately containing, or consisting essentially of, each of the different sequences corresponding to Group E microorganisms in Table 16, SEQ ID NOS: 49-520 of Table 16, SEQ ID NOS: 49-492 of Table 16, SEQ ID NOS: 49-480 of Table 16A, SEQ ID NOS: 521-826 of Table 16C, SEQ ID NOS: 521-820 of Table 16C, SEQ ID NOS: 827-1298 of Table 16, SEQ ID NOS: 827-1258 of Table 16D, SEQ ID NOS: 1299-1604 of Table 16F, or SEQ ID NOS: 1299-1598 of Table 16F, or substantially identical or similar sequences, or any of the aforementioned nucleotide sequences of nucleic acids or primer pairs in which one or more thymine bases is substituted with a uracil base.
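The thymine-to-uracil substitution recited above can be illustrated in silico. The following is a minimal sketch only: the function names `thymine_positions` and `substitute_uracil` are hypothetical, and the example primer is an arbitrary placeholder that is not taken from Table 16 or from the Sequence Listing.

```python
from typing import Iterable, List


def thymine_positions(primer: str) -> List[int]:
    """Return the 0-based positions of thymine bases in a primer sequence."""
    return [i for i, base in enumerate(primer.upper()) if base == "T"]


def substitute_uracil(primer: str, positions: Iterable[int]) -> str:
    """Return the primer with thymine replaced by uracil at the given positions.

    Raises ValueError if a requested position does not hold a thymine,
    so that an unintended base is never silently altered.
    """
    bases = list(primer.upper())
    for pos in positions:
        if bases[pos] != "T":
            raise ValueError(f"position {pos} holds {bases[pos]}, not T")
        bases[pos] = "U"
    return "".join(bases)


# Placeholder sequence for illustration only (an assumption, not a disclosed primer).
example_primer = "ACGTTGCATGCA"
modified = substitute_uracil(example_primer, [3])
```

Here one of the three thymine positions in the placeholder sequence is substituted; passing the full output of `thymine_positions` would substitute every thymine.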


Nucleic Acid Combinations


To accurately assess, profile and characterize a population of microorganisms, as is necessary to establish meaningful correlations between an animal's or environment's microbiome and state of health or homeostasis and imbalance or disease, and then to assess and characterize a microbiome sample to detect and/or diagnose an imbalance, susceptibility, disorder and/or disease, it is essential to be able to perform comprehensive, specific and proportional evaluation of the constituent microorganisms in a microbiome population. Accurate analysis of a microbiome population relies on comprehensively detecting and identifying all microorganisms, e.g., bacteria, present in a population, at least at the genus level, and detecting and identifying some, for example microorganisms of particular significance in health and disease, most, the majority of, or substantially all of the species of microorganisms present in the population to achieve a sufficient depth of constituent microorganisms of the population. Provided herein are compositions and methods, as well as combinations, kits, and systems that include the compositions and methods, for accurate, comprehensive, informative, sensitive, specific, rapid, high-throughput and cost-effective assessment, profiling or characterization of a mixture or population of microorganisms, e.g., bacteria. In some embodiments, the mixture or population of microorganisms is in a sample (e.g., biological sample), for example, a sample of contents of an alimentary tract of an organism, such as an animal. 
In some embodiments, compositions provided herein for such assessment, profiling or characterization of a mixture or population of microorganisms, e.g., bacteria, include a combination of (1) one or more kingdom-encompassing nucleic acid primer pairs capable of amplifying a sequence in a homologous gene or genomic region common to multiple, most, a majority, substantially all, or all microorganisms in a kingdom (e.g., bacteria), but that varies between different microorganisms in the kingdom, and/or (2) microorganism-specific nucleic acids and/or nucleic acid primer pairs that are capable of amplifying, or specifically or selectively amplifying, a specific nucleic acid sequence unique to a particular microorganism (e.g., a species, subspecies or strain of microorganism, such as bacteria). Numerous embodiments of kingdom-encompassing nucleic acid primer pairs and microorganism-specific nucleic acid primer pairs that can be used in combinations of nucleic acids are provided herein.


For example, in some embodiments, the kingdom-encompassing nucleic acids in the combination of nucleic acids include one or more primer pairs that separately amplify two or more regions, e.g., hypervariable regions, in a prokaryotic, e.g., bacterial, 16S rRNA gene. In some embodiments, there is little (e.g., less than or equal to 7 nucleotides, or 6 nucleotides, or 5 nucleotides, or 4 nucleotides, or 3 nucleotides, or 2 nucleotides, or 1 nucleotide) to no overlap of the nucleotide sequences of any two of the 16S rRNA gene primers that separately amplify nucleic acids comprising sequences located in multiple hypervariable regions. In some aspects, kingdom-encompassing nucleic acid primer pairs amplify 16S rRNA gene sequences less than or equal to about 200 nucleotides in length, for example, between about 125 and 200 nucleotides in length. In some embodiments, the kingdom-encompassing nucleic acids in the combination of nucleic acids include a plurality of nucleic acid primer pairs that includes at least 2, at least 3, at least 4, at least 5, at least 6, at least 7, at least 8 or at least 9 separate primer pairs, and optionally degenerate variants thereof, which separately amplify nucleic acids containing sequences located in 2, 3, 4, 5, 6, 7, 8 or 9 different hypervariable regions, respectively, in a prokaryotic 16S rRNA gene in a nucleic acid amplification reaction. In some embodiments, the kingdom-encompassing nucleic acids in the combination of nucleic acids include at least 8 separate primer pairs, and optionally degenerate variants thereof, which separately amplify nucleic acids containing sequences located in 8 different hypervariable regions in a prokaryotic 16S rRNA gene in a nucleic acid amplification reaction. 
In some embodiments, the kingdom-encompassing nucleic acids in the combination of nucleic acids include a plurality of primer pairs that separately amplify nucleic acids containing sequences located in 3 or more hypervariable regions of a prokaryotic 16S rRNA gene, wherein one of the 3 or more regions is a V5 region. Degenerate primer variants, containing, for example, different nucleotides at 1 or 2 positions in the primer sequences, are included in some compositions to ensure amplification of 16S rRNA genes containing minor variations in conserved regions. Nonlimiting examples of nucleotide sequences of primer pairs that separately amplify 8 hypervariable regions (V2, V3, V4, V5, V6, V7, V8 and V9) of the prokaryotic 16S rRNA gene are listed in Table 15. In some embodiments, the kingdom-encompassing nucleic acids in a combination of nucleic acids include at least 1, at least 2, at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, at least 10, at least 15, at least 20, at least 24, at least 30, at least 35, at least 40, at least 45 or more, or all of the primers, or of the primer pairs, having or consisting essentially of the sequences listed in Table 15 or SEQ ID NOS: 1-24 in Table 15 and/or SEQ ID NOS: 25-48 in Table 15 or SEQ ID NOS: 11-16, 23 and 24 in Table 15 and/or SEQ ID NOS: 35-40, 47 and 48 in Table 15. In some embodiments, the kingdom-encompassing nucleic acids in the combination of nucleic acids include one or more primer pairs that provide at least 85%, or at least 90%, or at least 92%, or at least 95%, or at least 98%, or at least 99%, or 100% coverage of different bacterial 16S rRNA gene sequences in a given database containing bacterial 16S rRNA gene sequences (e.g., the GreenGenes bacterial 16S rRNA gene sequence database (www.greengenes.lbl.gov) or the SILVA database (www.arb-silva.de)). 
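The database coverage figures and degenerate primer variants described above can be illustrated with a simple in silico calculation. The sketch below is an illustration under stated assumptions, not the method used to obtain any figure in this disclosure: coverage is assumed to mean the fraction of database sequences in which at least one primer pair finds both of its binding sites, degenerate positions are written with standard IUPAC ambiguity codes, mismatches are not tolerated, and the helper names (`amplicon`, `coverage`) and toy sequences are hypothetical.

```python
import re

# Standard IUPAC nucleotide ambiguity codes expressed as regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}
# Complement table that also handles ambiguity codes (e.g., R <-> Y, K <-> M).
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def reverse_complement(seq):
    """Reverse complement of a (possibly degenerate) DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def primer_regex(primer):
    """Translate a degenerate primer into a regular expression."""
    return "".join(IUPAC[base] for base in primer.upper())


def amplicon(template, forward, reverse):
    """Return the region spanned by the primer pair (primers included), or None."""
    pattern = re.compile(
        primer_regex(forward) + "[ACGT]*?" + primer_regex(reverse_complement(reverse))
    )
    match = pattern.search(template.upper())
    return match.group(0) if match else None


def coverage(database, primer_pairs):
    """Fraction of database sequences amplified by at least one primer pair."""
    if not database:
        return 0.0
    hits = sum(
        1 for seq in database
        if any(amplicon(seq, fwd, rev) for fwd, rev in primer_pairs)
    )
    return hits / len(database)


# Toy sequences and primers for illustration only (assumptions, not from Table 15).
toy_database = ["TTACGTACGGGGGCATGGACC", "ACGTGCAAACATGGA", "GGGGGGGGGG"]
toy_pairs = [("ACGTRC", "TCCATG")]  # R matches A or G
toy_coverage = coverage(toy_database, toy_pairs)
```

In this toy case the single degenerate forward primer matches two of the three sequences, which is why a degenerate variant can raise coverage relative to a non-degenerate primer.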
In some embodiments, each of one or more microorganism-specific nucleic acid primer pairs contained in a combination of primer pairs is capable of amplifying, or specifically amplifying, a specific nucleic acid (e.g., a nucleic acid sequence from a microorganism such as a bacterium) containing, or consisting essentially of, a nucleotide sequence selected from among SEQ ID NOS: 1605-1979 of Table 17, or SEQ ID NOS: 1605-1806, 1809-1816, 1821-1970, 1972-1974 and 1977-1979 of Table 17, or SEQ ID NOS: 1605-1826 in Table 17, or SEQ ID NOS: 1605-1806, 1809-1816 and 1821-1826 in Table 17, or SEQ ID NOS: 1605-1820 in Table 17A, or SEQ ID NOS: 1605-1806 and 1809-1816 in Table 17A, or SEQ ID NOS: 1827-1979 in Table 17C, or SEQ ID NOS: 1827-1976 in Table 17C. In some embodiments, each of one or more microorganism-specific nucleic acid primer pairs contained in a combination of primer pairs is capable of amplifying, or specifically amplifying, a specific nucleic acid sequence containing a nucleotide sequence selected from SEQ ID NOS: 1605-1979 in Table 17, or SEQ ID NOS: 1605-1806, 1809-1816, 1821-1970, 1972-1974 and 1977-1979 of Table 17, or SEQ ID NOS: 1605-1826 in Table 17, or SEQ ID NOS: 1605-1806, 1809-1816 and 1821-1826 in Table 17, or SEQ ID NOS: 1605-1820 in Table 17A, or SEQ ID NOS: 1605-1806 and 1809-1816 in Table 17A, or SEQ ID NOS: 1827-1979 in Table 17C, or SEQ ID NOS: 1827-1976 in Table 17C, or a substantially identical or similar sequence, to generate amplicon sequences that are less than about 500, less than about 475, less than about 450, less than about 400, less than about 375, less than about 350, less than about 300, less than about 275, less than about 250, less than about 200, less than about 175, less than about 150, or less than about 100 nucleotides in length, or amplicon sequences that consist essentially of a sequence selected from SEQ ID NOS: 1605-1979 in Table 17, or SEQ ID NOS: 1605-1806, 1809-1816, 1821-1970, 1972-1974 and 1977-1979 of Table 17, or SEQ ID NOS: 1605-1826 in Table 17, or SEQ ID NOS: 1605-1806, 1809-1816 and 1821-1826 in Table 17, or SEQ ID NOS: 1605-1820 in Table 17A, or SEQ ID NOS: 1605-1806 and 1809-1816 in Table 17A, or SEQ ID NOS: 1827-1979 in Table 17C, or SEQ ID NOS: 1827-1976 in Table 17C, or a substantially identical or similar sequence, and optionally containing the nucleic acid primer sequences at the 5′ and 3′ ends of the sequence. 
In some embodiments, the collection of microorganism-specific nucleic acid primer pairs in a combination is capable of amplifying, or specifically amplifying, in a multiplex reaction at least 2, at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, at least 10, at least 15, at least 20, at least 25, at least 30, at least 35, at least 40, at least 45, at least 50, at least 55, at least 60, at least 65, at least 70, at least 75, at least 80, at least 85, at least 90, at least 95, at least 100, at least 125, at least 150, at least 175, at least 200, at least 225, or at least 230 or more different nucleic acids containing a different one of SEQ ID NOS: 1605-1979 in Table 17, or SEQ ID NOS: 1605-1806, 1809-1816, 1821-1970, 1972-1974 and 1977-1979 of Table 17, or SEQ ID NOS: 1605-1826 in Table 17, or SEQ ID NOS: 1605-1806, 1809-1816 and 1821-1826 in Table 17, or SEQ ID NOS: 1605-1820 in Table 17A, or SEQ ID NOS: 1605-1806 and 1809-1816 in Table 17A, or SEQ ID NOS: 1827-1979 in Table 17C, or SEQ ID NOS: 1827-1976 in Table 17C. 
In some such embodiments, the microorganism-specific nucleic acid primer pairs in the combination can amplify the different nucleic acids containing a different one of SEQ ID NOS: 1605-1979 in Table 17, or SEQ ID NOS: 1605-1806, 1809-1816, 1821-1970 and 1972-1974 of Table 17, or SEQ ID NOS: 1605-1826 in Table 17, or SEQ ID NOS: 1605-1806, 1809-1816 and 1821-1826 in Table 17, or SEQ ID NOS: 1605-1820 in Table 17A, or SEQ ID NOS: 1605-1806 and 1809-1816 in Table 17A, or SEQ ID NOS: 1827-1979 in Table 17C, or SEQ ID NOS: 1827-1976 in Table 17C, or a substantially identical or similar sequence, to generate an amplicon sequence that is less than about 500, less than about 475, less than about 450, less than about 400, less than about 375, less than about 350, less than about 300, less than about 275, less than about 250, less than about 200, less than about 175, less than about 150, or less than about 100 nucleotides in length, or that consists essentially of a nucleotide sequence selected from among SEQ ID NOS: 1605-1979 in Table 17, or SEQ ID NOS: 1605-1806, 1809-1816, 1821-1970, 1972-1974 and 1977-1979 of Table 17, or SEQ ID NOS: 1605-1826 in Table 17, or SEQ ID NOS: 1605-1806, 1809-1816 and 1821-1826 in Table 17, or SEQ ID NOS: 1605-1820 in Table 17A, or SEQ ID NOS: 1605-1806 and 1809-1816 in Table 17A, or SEQ ID NOS: 1827-1979 in Table 17C, or SEQ ID NOS: 1827-1976 in Table 17C, or substantially identical or similar sequence, and optionally containing the nucleic acid primer sequences at the 5′ and 3′ ends of the sequence. 
In some embodiments, microorganism-specific nucleic acid primer pairs in the combination include one or more primer pairs having or consisting essentially of a nucleotide sequence or pair of sequences (for primer pairs) selected from Table 16, or SEQ ID NOS: 49-520 of Table 16, or SEQ ID NOS: 49-452, 457-472 and 481-520 of Table 16, or SEQ ID NOS: 49-492 of Table 16, or SEQ ID NOS: 49-452, 457-472 and 481-492 of Table 16, or SEQ ID NOS: 49-480 of Table 16A, or SEQ ID NOS: 49-452 and 457-472 of Table 16A, or SEQ ID NOS: 521-826 of Table 16C, or SEQ ID NOS: 521-820 of Table 16C, or SEQ ID NOS: 827-1298 of Table 16, or SEQ ID NOS: 827-1230, 1235-1250 and 1259-1298 of Table 16, or SEQ ID NOS: 827-1270 of Table 16, or SEQ ID NOS: 827-1230, 1235-1250 and 1259-1270 of Table 16, or SEQ ID NOS: 827-1258 of Table 16D, or SEQ ID NOS: 827-1230 and 1235-1250 of Table 16D, or SEQ ID NOS: 1299-1604 of Table 16F, or SEQ ID NOS: 1299-1598 of Table 16F, or a sequence or sequences substantially identical or similar thereto, or any of the aforementioned nucleotide sequences of nucleic acids or primer pairs in which one or more thymine bases is substituted with a uracil base. 
In some embodiments, microorganism-specific nucleic acid primer pairs in the combination include at least 2, at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, at least 10, at least 15, at least 20, at least 25, at least 30, at least 35, at least 40, at least 45, at least 50, at least 55, at least 60, at least 65, at least 70, at least 75, at least 80, at least 85, at least 90, at least 95, at least 100, at least 125, at least 150, at least 175, at least 200, at least 225, or at least 230, or all of the nucleic acid primer pairs having or consisting essentially of sequences selected from Table 16, or SEQ ID NOS: 49-520 of Table 16, or SEQ ID NOS: 49-452, 457-472 and 481-520 of Table 16, or SEQ ID NOS: 49-492 of Table 16, or SEQ ID NOS: 49-452, 457-472 and 481-492 of Table 16, or SEQ ID NOS: 49-480 of Table 16A, or SEQ ID NOS: 49-452 and 457-472 of Table 16A, or SEQ ID NOS: 521-826 of Table 16C, or SEQ ID NOS: 521-820 of Table 16C, or SEQ ID NOS: 827-1298 of Table 16, or SEQ ID NOS: 827-1230, 1235-1250 and 1259-1298 of Table 16, or SEQ ID NOS: 827-1270 of Table 16, or SEQ ID NOS: 827-1230, 1235-1250 and 1259-1270 of Table 16, or SEQ ID NOS: 827-1258 of Table 16D, or SEQ ID NOS: 827-1230 and 1235-1250 of Table 16D, or SEQ ID NOS: 1299-1604 of Table 16F, or SEQ ID NOS: 1299-1598 of Table 16F, or a sequence or sequences substantially identical or similar thereto, or any of the aforementioned nucleotide sequences of nucleic acids or primer pairs in which one or more thymine bases is substituted with a uracil base. In some embodiments, the combinations include one or more microorganism-specific nucleic acid primer pairs that amplify a specific nucleic acid sequence unique to one or more microorganisms (e.g., bacteria) implicated in one or more conditions, disorders and/or diseases.
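The two properties recited above for microorganism-specific primer pairs, namely amplification of a sequence unique to one microorganism and generation of an amplicon below a stated length, can be checked together in a simple in silico screen. The following is a minimal sketch assuming exact, mismatch-free primer binding; the function names, the toy genomes and the toy primers are all hypothetical and are not drawn from Table 16 or Table 17.

```python
def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_amplicon(genome, forward, reverse, max_len=500):
    """Return the first predicted amplicon shorter than max_len, or None.

    The amplicon runs from the forward primer site through the site that is
    complementary to the reverse primer, primers included.
    """
    genome = genome.upper()
    forward = forward.upper()
    reverse_site = reverse_complement(reverse)
    start = genome.find(forward)
    while start != -1:
        end = genome.find(reverse_site, start + len(forward))
        if end != -1:
            product = genome[start:end + len(reverse_site)]
            if len(product) < max_len:
                return product
        start = genome.find(forward, start + 1)
    return None


def amplified_genomes(genomes, forward, reverse, max_len=500):
    """Names of genomes from which the primer pair yields an amplicon."""
    return [
        name for name, seq in genomes.items()
        if find_amplicon(seq, forward, reverse, max_len) is not None
    ]


def is_specific(genomes, target, forward, reverse, max_len=500):
    """True if the primer pair amplifies the target genome and no other."""
    return amplified_genomes(genomes, forward, reverse, max_len) == [target]


# Toy genomes for illustration only (assumptions, not real genomic sequence).
toy_genomes = {
    "species_A": "GGGACGTACTTTTCATGGAGG",
    "species_B": "GGGACGTACTTTTGGGGGG",
}
specific = is_specific(toy_genomes, "species_A", "ACGTAC", "TCCATG")
```

Lowering `max_len` models the progressively shorter amplicon limits listed above; a pair that passes at 500 nucleotides may fail a stricter limit.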


In any of the embodiments described herein for compositions that include one or more, or a plurality of, or combinations of nucleic acids, primers or nucleic acid primer pairs, one or more of the nucleic acids, or one or more primers or primer pairs may include a modification. In some embodiments, a modification is one that facilitates nucleic acid manipulation, amplification, ligation and/or sequencing of amplification products and/or reduction or elimination of primer dimers. In particular embodiments, a modification is one that facilitates multiplex nucleic acid amplification, ligation and/or sequencing of products of multiplex amplification. In some embodiments, at least one primer of a primer pair or both primers of a primer pair contains a modification relative to the nucleic acid sequence to be amplified that increases the susceptibility of the primer to cleavage. For example, in some embodiments, one or more nucleic acids, or primers, or both primers of a primer pair has at least one cleavable group located at either a) the 3′ end or the 5′ end, and/or b) at about the central nucleotide position of the nucleic acid or primer, and wherein the nucleic acids, primers or primer pairs can be substantially non-complementary to other nucleic acids, primers or primer pairs in the composition. In some embodiments, the composition comprises at least 50, 100, 150, 200, 250, 300, 350, 398, or more primer pairs. In some embodiments, the primer pairs comprise about 15 nucleotides to about 40 nucleotides in length. In some embodiments, at least one nucleotide of one or more primers is replaced with a cleavable group. In some embodiments the cleavable group can be a uridine nucleotide. In some embodiments, the template, one or more primers and/or amplification product includes nucleotides or nucleobases that can be recognized by specific enzymes. In some embodiments, the nucleotides or nucleobases can be bound by specific enzymes. 
Optionally, the specific enzymes can also cleave the template, one or more primers and/or amplification product at one or more sites. In some embodiments, such cleavage can occur at specific nucleotides within the template, one or more primers and/or amplification product. For example, the template, one or more primers and/or amplification product can include one or more nucleotides or nucleobases including uracil, which can be recognized and/or cleaved by enzymes such as uracil DNA glycosylase (UDG, also referred to as UNG) or formamidopyrimidine DNA glycosylase (Fpg). The template, one or more primers and/or amplification product can include one or more nucleotides or nucleobases including RNA-specific bases, which can be recognized and/or cleaved by enzymes such as RNAseH. In some embodiments, the template, one or more primers and/or amplification product can include one or more abasic sites, which can be recognized and/or cleaved using various proofreading polymerases or apyrase treatments. In some embodiments, the template, one or more primers and/or amplification product can include 7,8-dihydro-8-oxoguanine (8-oxoG) nucleobases, which can be recognized or cleaved by enzymes such as Fpg. In some embodiments, one or more amplified target sequences can be partially digested by a FuPa reagent. In some embodiments, the primer includes a sufficient number of modified nucleotides to allow functionally complete degradation of the primer by the cleavage treatment, but not so many as to interfere with the primer's specificity or functionality prior to such cleavage treatment, for example in the amplification reaction. In some embodiments, the primer includes at least one modified nucleotide, but no greater than 75% of nucleotides of the primer are modified. For example, the primers can include uracil-containing nucleobases that can be selectively cleaved using UNG/UDG (optionally with heat and/or alkali). 
In some embodiments, the primers can include uracil-containing nucleotides that can be selectively cleaved using UNG and Fpg. In some embodiments, the cleavage treatment includes exposure to oxidizing conditions for selective cleavage of dithiols, treatment with RNAseH for selective cleavage of modified nucleotides including RNA-specific moieties (e.g., ribose sugars, etc.), and the like. This cleavage treatment can effectively fragment the original amplification primers and non-specific amplification products into small nucleic acid fragments that include relatively few nucleotides each. Such fragments are typically incapable of promoting further amplification at elevated temperatures. Such fragments can also be removed relatively easily from the reaction pool through the various post-amplification cleanup procedures known in the art (e.g., spin columns, NaEtOH precipitation, etc.).
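The effect of the cleavage treatment on a uracil-containing primer can be sketched in silico. This is a deliberately simplified model and an assumption on our part: each uracil is treated as a site where the base is removed and the strand is broken, which in practice corresponds to glycosylase treatment combined with a strand-scission step such as heat and/or alkali or Fpg, as noted above. The function names and the example primer are hypothetical.

```python
def cleave_at_uracil(primer):
    """Return the fragments left after cleavage at every uracil position.

    Simplified model: the uracil nucleotide is lost and the strand breaks there.
    """
    return [fragment for fragment in primer.upper().split("U") if fragment]


def modified_fraction(primer):
    """Fraction of primer nucleotides that are uracil (the modified nucleotide)."""
    primer = primer.upper()
    return primer.count("U") / len(primer)


def meets_modification_limits(primer, max_fraction=0.75):
    """True if the primer has at least one uracil and at most max_fraction uracils."""
    return "U" in primer.upper() and modified_fraction(primer) <= max_fraction


# Placeholder primer for illustration only (an assumption, not a disclosed sequence).
example_primer = "ACGUACGGUCA"
fragments = cleave_at_uracil(example_primer)
longest_fragment = max(len(fragment) for fragment in fragments)
```

The length of the longest surviving fragment gives a rough indication of whether the cleaved primer could still anneal at elevated temperature; the shorter the fragments, the less likely they are to support further amplification.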


In some embodiments, a composition provided herein includes a sample containing a plurality of microorganisms, or nucleic acids from such a sample that contains a plurality of microorganisms, and one or more nucleic acids, primers and/or primer pairs of any embodiments of the compositions described herein and, optionally, a polymerase, e.g., a DNA polymerase. In some embodiments, the sample is a biological sample, such as, for example, an environmental sample or a sample from an animal subject, e.g., a human. Samples include, but are not limited to, biological fluid samples, blood samples, skin samples, mucus samples, saliva samples, sputum samples, samples from a subject's oral or nasal cavity, respiratory tract samples, vaginal samples, alimentary tract samples and fecal samples. In some embodiments, the sample is from the alimentary tract of an animal, such as, for example, a fecal or stool sample. In particular embodiments, the composition includes one or more kingdom-encompassing nucleic acid primer pairs capable of amplifying a sequence in a homologous gene or genomic region common to multiple, most, a majority, substantially all, or all microorganisms in a kingdom (e.g., bacteria), but that varies between different microorganisms in the kingdom, and/or one or more microorganism-specific nucleic acid primer pairs that amplify a specific nucleic acid sequence unique to a particular microorganism (e.g., a species, subspecies or strain of microorganism, such as bacteria). Numerous embodiments of kingdom-encompassing nucleic acid primer pairs and microorganism-specific nucleic acid primer pairs that can be used in combinations of nucleic acids are provided herein. 
For example, in some embodiments, the kingdom-encompassing nucleic acids in the combination of nucleic acids include one or more primer pairs that separately amplify two or more, three or more, four or more, five or more, 6 or more, 7 or more, or 8 or more regions, e.g., hypervariable regions, in a prokaryotic, e.g., bacterial, 16S rRNA gene. In some embodiments, at least one of the one or more microorganism-specific nucleic acid primer pairs is capable of amplifying, or specifically amplifying, a specific nucleic acid sequence containing a nucleotide sequence selected from among SEQ ID NOS: 1605-1979 in Table 17, or SEQ ID NOS: 1605-1806, 1809-1816, 1821-1970, 1972-1974 and 1977-1979 of Table 17, or SEQ ID NOS: 1605-1826 in Table 17, or SEQ ID NOS: 1605-1806, 1809-1816 and 1821-1826 in Table 17, or SEQ ID NOS: 1605-1820 in Table 17A, or SEQ ID NOS: 1605-1806 and 1809-1816 in Table 17A, or SEQ ID NOS: 1827-1979 in Table 17C, or SEQ ID NOS: 1827-1976 in Table 17C, or a substantially identical or similar sequence, to generate amplicon sequences that are less than about 500, less than about 475, less than about 450, less than about 400, less than about 375, less than about 350, less than about 300, less than about 275, less than about 250, less than about 200, less than about 175, less than about 150, or less than about 100 nucleotides in length, or amplicon sequences that consist essentially of a sequence selected from SEQ ID NOS: 1605-1979 in Table 17, or SEQ ID NOS: 1605-1806, 1809-1816, 1821-1970, 1972-1974 and 1977-1979 of Table 17, or SEQ ID NOS: 1605-1826 in Table 17, or SEQ ID NOS: 1605-1806, 1809-1816 and 1821-1826 in Table 17, or SEQ ID NOS: 1605-1820 in Table 17A, or SEQ ID NOS: 1605-1806 and 1809-1816 in Table 17A, or SEQ ID NOS: 1827-1979 in Table 17C, or SEQ ID NOS: 1827-1976 in Table 17C, or substantially identical or similar sequence, and optionally containing the nucleic acid primer sequences at the 5′ and 3′ ends of the sequence. 
In some embodiments, the one or more microorganism-specific nucleic acid primer pairs is a plurality of such primer pairs wherein each of the primer pairs is capable of amplifying, or specifically amplifying, a specific nucleic acid sequence containing a nucleotide sequence selected from among SEQ ID NOS: 1605-1979 in Table 17, or SEQ ID NOS: 1605-1806, 1809-1816, 1821-1970, 1972-1974 and 1977-1979 of Table 17, or SEQ ID NOS: 1605-1826 in Table 17, or SEQ ID NOS: 1605-1806, 1809-1816 and 1821-1826 in Table 17, or SEQ ID NOS: 1605-1820 in Table 17A, or SEQ ID NOS: 1605-1806 and 1809-1816 in Table 17A, or SEQ ID NOS: 1827-1979 in Table 17C, or SEQ ID NOS: 1827-1976 in Table 17C. 
In some embodiments, the one or more microorganism-specific nucleic acid primer pairs is a plurality of such primer pairs wherein each of the primer pairs is capable of amplifying, or specifically amplifying, a specific nucleic acid sequence containing a nucleotide sequence selected from among SEQ ID NOS: 1605-1979 in Table 17, or SEQ ID NOS: 1605-1806, 1809-1816, 1821-1970, 1972-1974 and 1977-1979 of Table 17, or SEQ ID NOS: 1605-1826 in Table 17, or SEQ ID NOS: 1605-1806, 1809-1816 and 1821-1826 in Table 17, or SEQ ID NOS: 1605-1820 in Table 17A, or SEQ ID NOS: 1605-1806 and 1809-1816 in Table 17A, or SEQ ID NOS: 1827-1979 in Table 17C, or SEQ ID NOS: 1827-1976 in Table 17C, or a substantially identical or similar sequence, to generate an amplicon sequence that is less than about 500, less than about 475, less than about 450, less than about 400, less than about 375, less than about 350, less than about 300, less than about 275, less than about 250, less than about 200, less than about 175, less than about 150, or less than about 100 nucleotides in length, or that consists essentially of a nucleotide sequence selected from SEQ ID NOS: 1605-1979 in Table 17, or SEQ ID NOS: 1605-1806, 1809-1816, 1821-1970, 1972-1974 and 1977-1979 of Table 17, or SEQ ID NOS: 1605-1826 in Table 17, or SEQ ID NOS: 1605-1806, 1809-1816 and 1821-1826 in Table 17, or SEQ ID NOS: 1605-1820 in Table 17A, or SEQ ID NOS: 1605-1806 and 1809-1816 in Table 17A, or SEQ ID NOS: 1827-1979 in Table 17C, or SEQ ID NOS: 1827-1976 in Table 17C, or substantially identical or similar sequence, and optionally containing the nucleic acid primer sequences at the 5′ and 3′ ends of the sequence.


Also provided herein are compositions containing a mixture of nucleic acids, in which most, or substantially all of the nucleic acids contain a sequence of a portion of the genome of a microorganism, e.g., a bacterium. In some embodiments, the mixture of nucleic acids includes nucleic acids containing sequences of a portion of at least 2, at least 5, at least 10, at least 20, at least 25, at least 30, at least 35, at least 40, at least 50, at least 75, at least 100, at least 150, at least 200, at least 250, at least 300, at least 350, at least 400, or at least 500 or more different microorganisms, e.g., different species of microorganisms such as bacteria. In some embodiments, the sequences of portions of the genome of microorganisms are each less than about 1000 nucleotides, less than about 900 nucleotides, less than about 800 nucleotides, less than about 700 nucleotides, less than about 600 nucleotides, less than about 500 nucleotides, less than about 450 nucleotides, less than about 400 nucleotides, less than about 350 nucleotides, less than about 300 nucleotides, less than about 250 nucleotides, or less than about 200 nucleotides in length. In some embodiments, the sequences of portions of the genome of microorganisms are each less than or about 250 nucleotides in length. In some embodiments, the nucleic acids include double-stranded, partially double-stranded and/or single-stranded nucleic acids. In some embodiments, the nucleic acids include amplicons generated in a nucleic acid amplification reaction of nucleic acids from one or more, or a plurality of microorganisms, such as a plurality of different microorganisms, e.g., bacteria. In some embodiments, the nucleic acids include nucleotides containing a uracil nucleobase. In some embodiments, the nucleic acids contain 5′ and/or 3′ overhangs. 
In some embodiments, the composition contains one or more, or a plurality, of primers, e.g., nucleic acids and/or primer pairs of any of the embodiments described herein. In some embodiments, the composition includes a DNA polymerase, a DNA ligase, and/or at least one uracil cleaving or modifying enzyme. In some embodiments, the nucleic acids include any one or more of the following:


(1) one or more nucleic acids containing, or consisting essentially of, a nucleotide sequence of a hypervariable region of a prokaryotic 16S rRNA gene, e.g., a V1, V2, V3, V4, V5, V6, V7, V8 and/or V9 region,


(2) a plurality of nucleic acids containing, or consisting essentially of, a nucleotide sequence of a hypervariable region of a prokaryotic 16S rRNA gene, e.g., a V1, V2, V3, V4, V5, V6, V7, V8 and/or V9 region,


(3) one or more or a plurality of nucleic acids containing, or consisting essentially of, a nucleotide sequence of a hypervariable region of a prokaryotic 16S rRNA gene, e.g., a V1, V2, V3, V4, V5, V6, V7, V8 and/or V9 region, wherein the sequence has the sequence from only one hypervariable region,


(4) one or more nucleic acids containing, or consisting essentially of, a nucleotide sequence of a hypervariable region of a prokaryotic 16S rRNA gene, e.g., a V1, V2, V3, V4, V5, V6, V7, V8 and/or V9 region, wherein the sequence includes one or more sequences selected from among sequences listed in Table 15 or SEQ ID NOS: 1-24 in Table 15 and/or SEQ ID NOS: 25-48 in Table 15 or SEQ ID NOS: 11-16, 23 and 24 in Table 15 and/or SEQ ID NOS: 35-40, 47 and 48 in Table 15, and/or


(5) one or more single-stranded nucleic acids containing, or consisting essentially of, a nucleotide sequence selected from among SEQ ID NOS: 1605-1979 in Table 17, or SEQ ID NOS: 1605-1806, 1809-1816, 1821-1970, 1972-1974 and 1977-1979 of Table 17, or SEQ ID NOS: 1605-1826 in Table 17, or SEQ ID NOS: 1605-1806, 1809-1816 and 1821-1826 in Table 17, or SEQ ID NOS: 1605-1820 in Table 17A, or SEQ ID NOS: 1605-1806 and 1809-1816 in Table 17A, or SEQ ID NOS: 1827-1979 in Table 17C, or SEQ ID NOS: 1827-1976 in Table 17C, or a substantially identical or similar sequence, optionally containing one or more primer sequences at the 3′ and/or 5′ end (e.g., sequences selected from Table 16, or SEQ ID NOS: 49-520 of Table 16, or SEQ ID NOS: 49-452, 457-472 and 481-520 of Table 16, or SEQ ID NOS: 49-492 of Table 16, or SEQ ID NOS: 49-452, 457-472 and 481-492 of Table 16, or SEQ ID NOS: 49-480 of Table 16A, or SEQ ID NOS: 49-452 and 457-472 of Table 16A, or SEQ ID NOS: 521-826 of Table 16C, or SEQ ID NOS: 521-820 of Table 16C, or SEQ ID NOS: 827-1298 of Table 16, or SEQ ID NOS: 827-1230, 1235-1250 and 1259-1298 of Table 16, or SEQ ID NOS: 827-1270 of Table 16, or SEQ ID NOS: 827-1230, 1235-1250 and 1259-1270 of Table 16, or SEQ ID NOS: 827-1258 of Table 16D, or SEQ ID NOS: 827-1230 and 1235-1250 of Table 16D, or SEQ ID NOS: 1299-1604 of Table 16F, or SEQ ID NOS: 1299-1598 of Table 16F), or the complement thereof, and/or one or more double-stranded or partially double-stranded nucleic acids containing, or consisting essentially of, a nucleotide sequence selected from among SEQ ID NOS: 1605-1979 in Table 17, or SEQ ID NOS: 1605-1806, 1809-1816, 1821-1970, 1972-1974 and 1977-1979 of Table 17, or SEQ ID NOS: 1605-1826 in Table 17, or SEQ ID NOS: 1605-1806, 1809-1816 and 1821-1826 in Table 17, or SEQ ID NOS: 1605-1820 in Table 17A, or SEQ ID NOS: 1605-1806 and 1809-1816 in Table 17A, or SEQ ID NOS: 1827-1979 in Table 17C, or SEQ ID NOS: 1827-1976 in Table 17C, or a substantially identical or similar sequence, optionally containing one or more primer sequences at the 3′ and/or 5′ end (e.g., sequences selected from Table 16, or SEQ ID NOS: 49-520 of Table 16, or SEQ ID NOS: 49-452, 457-472 and 481-520 of Table 16, or SEQ ID NOS: 49-492 of Table 16, or SEQ ID NOS: 49-452, 457-472 and 481-492 of Table 16, or SEQ ID NOS: 49-480 of Table 16A, or SEQ ID NOS: 49-452 and 457-472 of Table 16A, or SEQ ID NOS: 521-826 of Table 16C, or SEQ ID NOS: 521-820 of Table 16C, or SEQ ID NOS: 827-1298 of Table 16, or SEQ ID NOS: 827-1230, 1235-1250 and 1259-1298 of Table 16, or SEQ ID NOS: 827-1270 of Table 16, or SEQ ID NOS: 827-1230, 1235-1250 and 1259-1270 of Table 16, or SEQ ID NOS: 827-1258 of Table 16D, or SEQ ID NOS: 827-1230 and 1235-1250 of Table 16D, or SEQ ID NOS: 1299-1604 of Table 16F, or SEQ ID NOS: 1299-1598 of Table 16F), and a complementary nucleotide sequence hybridized thereto.
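The alternative SEQ ID NO groupings recited in item (5) are nested subsets of one another. As an illustrative aid only, the following minimal Python sketch expands two of the recited range expressions and lists the identifiers that the narrower grouping omits. The function name `expand_seq_ids` is hypothetical and is not part of the disclosure; the sketch works only with the identifier numbers and says nothing about the sequences themselves.

```python
import re


def expand_seq_ids(spec):
    """Expand a range expression such as '11-16, 23 and 24' into a set of integers.

    Hypothetical helper for illustration; it only parses numbers and hyphenated ranges.
    """
    ids = set()
    for start, end in re.findall(r"(\d+)(?:-(\d+))?", spec):
        low = int(start)
        high = int(end) if end else low
        ids.update(range(low, high + 1))
    return ids


# Two of the groupings recited for Table 17.
full = expand_seq_ids("1605-1979")
subset = expand_seq_ids("1605-1806, 1809-1816, 1821-1970, 1972-1974 and 1977-1979")

# Identifiers present in the broader grouping but absent from the narrower one.
excluded = sorted(full - subset)
print(len(full), len(subset), excluded)
```

Under these assumptions the broader grouping covers 375 identifiers and the narrower one 366, the difference being SEQ ID NOS: 1807, 1808, 1817-1820, 1971, 1975 and 1976.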


In some embodiments, the nucleic acids include any combination of nucleic acids of (1), (2), (3) or (4) above with nucleic acids of (5) above. In some embodiments of the compositions containing a mixture of nucleic acids, in which most, or substantially all of the nucleic acids contain sequence of a portion of the genome of a microorganism, e.g., a bacterium, provided herein, the composition is or contains one or more libraries of microorganism, e.g., bacteria, nucleic acids. In some embodiments, the mixture of nucleic acids is generated by amplifying nucleic acids in or from a sample containing microorganisms (e.g., bacteria) using primers and/or primer pairs provided herein. For example, a mixture of nucleic acids can be generated by amplifying nucleic acids using (1) one or more kingdom-encompassing nucleic acid primer pairs capable of amplifying a sequence in a homologous gene or genomic region common to multiple, most, a majority, substantially all, or all microorganisms in a kingdom (e.g., bacteria), but that varies between different microorganisms in the kingdom, and/or (2) one or more microorganism-specific nucleic acids and/or nucleic acid primer pairs that are capable of amplifying, or specifically or selectively amplifying, a specific nucleic acid sequence unique to a particular microorganism (e.g., a species, subspecies or strain of microorganism, such as bacteria). Numerous embodiments of kingdom-encompassing nucleic acid primer pairs and microorganism-specific nucleic acid primer pairs that can be used in generating combinations of nucleic acids are provided herein. In some embodiments, the mixture is generated by amplifying microorganism nucleic acids using kingdom-encompassing nucleic acid primer pairs and microorganism-specific nucleic acid primers and/or nucleic acid primer pairs in a single reaction mixture. 
In some embodiments, the mixture is generated by separately amplifying microorganism nucleic acids, e.g., from a single sample, using kingdom-encompassing nucleic acid primer pairs in one amplification reaction and microorganism-specific nucleic acid primers and/or nucleic acid primer pairs in a separate amplification reaction and then combining the products of both amplification reactions. In some embodiments, the mixture of nucleic acids comprises or consists essentially of portions of a prokaryotic 16S rRNA gene, such as nucleotide sequences of a hypervariable region of a prokaryotic (e.g., bacteria) 16S rRNA gene (e.g., a V1, V2, V3, V4, V5, V6, V7, V8 and/or V9 region), from one or more, or a plurality of, microorganisms and portions of a microorganism (e.g., bacteria) genome from one or more, or a plurality of, microorganisms that are not contained within a prokaryotic 16S rRNA gene.
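The difference between the two primer classes described above can be illustrated with a toy in-silico amplification. The following is a minimal Python sketch, assuming exact primer matching on a single strand; all sequences, names (`BROAD_FWD`, `SPECIFIC_FWD`, `genome_a`, and so on) and the function `in_silico_amplicon` are hypothetical and are not taken from the tables of this disclosure. It shows a kingdom-encompassing-style pair yielding a different amplicon from each of two toy genomes, a microorganism-specific-style pair yielding a product from only one of them, and the products of two separate reactions being combined into one mixture.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[base] for base in reversed(seq))


def in_silico_amplicon(template, fwd, rev):
    """Return the amplicon (primer sites included) or None if either primer fails to match.

    Simplification: exact matches only, top strand only, first match wins.
    """
    start = template.find(fwd)
    if start == -1:
        return None
    rev_site = reverse_complement(rev)
    end = template.find(rev_site, start + len(fwd))
    if end == -1:
        return None
    return template[start:end + len(rev_site)]


# Hypothetical primers: one pair binds conserved flanks, the other a unique locus.
BROAD_FWD, BROAD_REV = "ACGTACGTAC", "ATCCGGATCC"
SPECIFIC_FWD, SPECIFIC_REV = "GATTACAGAT", "GGTTAACCGG"

# Toy genomes: shared conserved flanks around a variable region;
# only genome_a carries the unique locus.
genome_a = ("AAAC" + BROAD_FWD + "TTTTGGGG" + reverse_complement(BROAD_REV)
            + "TTAT" + SPECIFIC_FWD + "TGCATGCA" + reverse_complement(SPECIFIC_REV) + "GG")
genome_b = "CCTA" + BROAD_FWD + "CCCCAAAA" + reverse_complement(BROAD_REV) + "ATAT"
sample = [genome_a, genome_b]

# Reaction 1: broad-range pair; reaction 2: specific pair; then pool the products.
reaction_1 = [in_silico_amplicon(g, BROAD_FWD, BROAD_REV) for g in sample]
reaction_2 = [in_silico_amplicon(g, SPECIFIC_FWD, SPECIFIC_REV) for g in sample]
mixture = [amp for amp in reaction_1 + reaction_2 if amp is not None]
print(len(mixture), "amplicons in the pooled mixture")
```

A real amplification tolerates mismatches and depends on reaction conditions, so this sketch only conveys the logic of combining broad-range and specific products, not actual primer behavior.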


Methods for Amplification of Nucleic Acids


Methods provided herein include methods for amplification and/or detection of nucleic acids. In particular embodiments, the nucleic acids being amplified and/or detected are from microorganisms, including, for example, bacteria and archaea. As described further herein, methods for amplifying and/or detecting nucleic acids from microorganisms provided herein represent significant improvements over previous methods including, but not limited to, improvements in microorganism nucleic acid amplification and/or detection coverage, sensitivity, efficiency, scale, cost-effectiveness and/or application to or use in other methods. In some embodiments nucleic acids are subjected to nucleic acid hybridization and/or amplification, for example, using any of the nucleic acids provided herein as probes and/or amplification primers. In some embodiments, the presence or absence of one or more hybridization and/or nucleic acid amplification products is detected. In some embodiments, the nucleic acid amplification is a multiplex amplification. In some embodiments, the amplification is performed using a plurality of nucleic acid primer pairs and is conducted in a single multiplex amplification reaction mixture. In some embodiments, the presence or absence of one or more nucleic acids and/or amplification products is detected using one or more nucleic acids provided herein as a probe (e.g., a detectable or labeled probe). In some embodiments, the presence or absence of one or more nucleic acid amplification products is detected by obtaining nucleotide sequence information of one or more nucleic acid amplification products.


Methods for Amplification of Nucleic Acids of Selected Microorganisms


In some embodiments, a method provided herein for amplifying a target nucleic acid of one or more microorganisms includes (a) obtaining nucleic acids of one or more microorganisms selected from the microorganisms listed in Table 1 (or Table 1, except for, or excluding, Actinomyces viscosus and/or Blautia coccoides, or Table 1, except for, or excluding, Actinomyces viscosus, Blautia coccoides and/or Helicobacter salomonis) and (b) subjecting the nucleic acids to nucleic acid amplification using at least one primer pair that is capable of specifically amplifying a target nucleic acid sequence contained within a genome of a microorganism selected from the microorganisms of Table 1 (or Table 1, except for, or excluding, Actinomyces viscosus and/or Blautia coccoides, or Table 1, except for, or excluding, Actinomyces viscosus, Blautia coccoides and/or Helicobacter salomonis), thereby producing amplified copies of the target nucleic acid. In some embodiments, the target nucleic acid is unique to the microorganism. In some embodiments, the target nucleic acid is not contained within a prokaryotic 16S rRNA gene. In some embodiments, the nucleic acids subjected to amplification include nucleic acids from a plurality of different microorganisms listed in Table 1. In some such embodiments, amplified copies of target nucleic acids of a plurality of different microorganisms in Table 1 (or Table 1, except for, or excluding, Actinomyces viscosus and/or Blautia coccoides, or Table 1, except for, or excluding, Actinomyces viscosus, Blautia coccoides and/or Helicobacter salomonis) are produced, for example in a multiplex nucleic acid amplification. 
In some embodiments, the nucleic acids subjected to amplification include a mixture of nucleic acids of one or more, or a plurality of, microorganisms selected from among the microorganisms listed in Table 1 (or Table 1, except for, or excluding, Actinomyces viscosus and/or Blautia coccoides, or Table 1, except for, or excluding, Actinomyces viscosus, Blautia coccoides and/or Helicobacter salomonis) and one or more microorganisms, e.g., bacteria, not listed in Table 1. In some embodiments, nucleic acids of one or more microorganisms selected from the microorganisms listed in Table 1 are obtained from a biological sample, such as, for example, a sample of contents of the alimentary canal of an animal. In some embodiments, the sample is a fecal sample. In some embodiments, at least one, or one or more, target nucleic acid sequence(s) comprises or consists essentially of a nucleotide sequence selected from the nucleotide sequences of SEQ ID NOS: 1605-1979 in Table 17, or SEQ ID NOS: 1605-1806, 1809-1816, 1821-1970, 1972-1974 and 1977-1979 of Table 17, or SEQ ID NOS: 1605-1826 in Table 17, or SEQ ID NOS: 1605-1806, 1809-1816 and 1821-1826 in Table 17, or SEQ ID NOS: 1605-1820 in Table 17A, or SEQ ID NOS: 1605-1806 and 1809-1816 in Table 17A, or SEQ ID NOS: 1827-1979 in Table 17C, or SEQ ID NOS: 1827-1976 in Table 17C, or a substantially identical or similar sequence. 
In some embodiments, at least one, or one or more, product(s) of the nucleic acid amplification comprises, or consists essentially of, a nucleotide sequence selected from SEQ ID NOS: 1605-1979 in Table 17, or SEQ ID NOS: 1605-1806, 1809-1816, 1821-1970, 1972-1974 and 1977-1979 of Table 17, or SEQ ID NOS: 1605-1826 in Table 17, or SEQ ID NOS: 1605-1806, 1809-1816 and 1821-1826 in Table 17, or SEQ ID NOS: 1605-1820 in Table 17A, or SEQ ID NOS: 1605-1806 and 1809-1816 in Table 17A, or SEQ ID NOS: 1827-1979 in Table 17C, or SEQ ID NOS: 1827-1976 in Table 17C, or a substantially identical or similar sequence, or the complement thereof, and optionally having one or more primer sequences at the 5′ and/or 3′ end(s) of the sequence, such as any of the primer sequences provided herein. In some embodiments, the at least one primer pair does not detectably amplify a nucleic acid sequence contained within any genus other than the genus of the microorganism containing the target nucleic acid sequence. In some embodiments, the at least one primer pair does not detectably amplify a nucleic acid sequence contained within any species other than the species of the microorganism containing the target nucleic acid sequence. 
In some embodiments, at least one primer of the primer pair, or at least one primer pair, contains, or consists essentially of, the sequence or sequences of a primer or primer pair in Table 16, or SEQ ID NOS: 49-520 of Table 16, or SEQ ID NOS: 49-452, 457-472 and 481-520 of Table 16, or SEQ ID NOS: 49-492 of Table 16, or SEQ ID NOS: 49-452, 457-472 and 481-492 of Table 16, or SEQ ID NOS: 49-480 of Table 16A, or SEQ ID NOS: 49-452 and 457-472 of Table 16A, or SEQ ID NOS: 521-826 of Table 16C, or SEQ ID NOS: 521-820 of Table 16C, or SEQ ID NOS: 827-1298 of Table 16, or SEQ ID NOS: 827-1230, 1235-1250 and 1259-1298 of Table 16, or SEQ ID NOS: 827-1270 of Table 16, or SEQ ID NOS: 827-1230, 1235-1250 and 1259-1270 of Table 16, or SEQ ID NOS: 827-1258 of Table 16D, or SEQ ID NOS: 827-1230 and 1235-1250 of Table 16D, or SEQ ID NOS: 1299-1604 of Table 16F, or SEQ ID NOS: 1299-1598 of Table 16F, or a substantially identical or similar sequence(s), or any of the aforementioned nucleotide sequences of nucleic acids or primer pairs in which one or more thymine bases is substituted with a uracil base. 
In some embodiments, the nucleic acids are subjected to nucleic acid amplification using a plurality of primers or primer pairs, each containing, or consisting essentially of, a sequence or sequences of a primer pair in Table 16, or SEQ ID NOS: 49-520 of Table 16, or SEQ ID NOS: 49-452, 457-472 and 481-520 of Table 16, or SEQ ID NOS: 49-492 of Table 16, or SEQ ID NOS: 49-452, 457-472 and 481-492 of Table 16, or SEQ ID NOS: 49-480 of Table 16A, or SEQ ID NOS: 49-452 and 457-472 of Table 16A, or SEQ ID NOS: 521-826 of Table 16C, or SEQ ID NOS: 521-820 of Table 16C, or SEQ ID NOS: 827-1298 of Table 16, or SEQ ID NOS: 827-1230, 1235-1250 and 1259-1298 of Table 16, or SEQ ID NOS: 827-1270 of Table 16, or SEQ ID NOS: 827-1230, 1235-1250 and 1259-1270 of Table 16, or SEQ ID NOS: 827-1258 of Table 16D, or SEQ ID NOS: 827-1230 and 1235-1250 of Table 16D, or SEQ ID NOS: 1299-1604 of Table 16F, or SEQ ID NOS: 1299-1598 of Table 16F, or a substantially identical or similar sequence(s), or any of the aforementioned nucleotide sequences of nucleic acids or primer pairs in which one or more thymine bases is substituted with a uracil base. In some of the embodiments in which the nucleic acids are subjected to nucleic acid amplification using more than one, or a plurality of primers or primer pairs, the amplification is a multiplex amplification conducted in a single reaction mixture. In some embodiments, at least one primer or one primer pair includes a modification that facilitates nucleic acid manipulation, amplification, ligation and/or sequencing of amplification products and/or reduction or elimination of primer dimers. In particular embodiments, a modification is one that facilitates multiplex nucleic acid amplification, ligation and/or sequencing of products of multiplex amplification.
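The primer variants described above, in which one or more thymine bases is substituted with a uracil base, can be illustrated with a minimal Python sketch. The primer sequence, the chosen positions and the function name `substitute_uracil` are hypothetical and are not taken from Table 16; which thymines are substituted in practice is not specified here and is treated as an input.

```python
def substitute_uracil(primer, positions):
    """Return a copy of `primer` with the thymines at the given 0-based positions replaced by uracil.

    Raises ValueError if a requested position does not hold a thymine, so that
    a mistaken index is reported instead of silently altering another base.
    """
    bases = list(primer.upper())
    for pos in positions:
        if bases[pos] != "T":
            raise ValueError(f"position {pos} holds {bases[pos]!r}, not 'T'")
        bases[pos] = "U"
    return "".join(bases)


# Hypothetical primer; thymines at 0-based positions 2 and 9 are substituted.
original = "GATTACAGATTC"
modified = substitute_uracil(original, [2, 9])
print(original, "->", modified)
```

Such uracil-containing primers are consistent with the compositions described earlier that include at least one uracil cleaving or modifying enzyme, although the sketch itself models only the base substitution.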


Methods for Multiplex Amplification of Multiple Regions of a Gene


In some embodiments, a multiplex amplification method provided herein is for amplifying multiple regions of a gene of one or more microorganisms, e.g., bacteria. In one embodiment, the method includes (a) obtaining nucleic acids of one or more microorganisms comprising a 16S rRNA gene and (b) subjecting the nucleic acids to nucleic acid amplification using a combination of primer pairs that includes at least two primer pairs that separately amplify nucleic acids containing sequences of different hypervariable regions of a prokaryotic 16S rRNA gene thereby producing amplified copies of the nucleic acid sequences containing sequences of different hypervariable regions of the 16S rRNA gene of one or more microorganisms. In some embodiments, the microorganism(s) is/are bacteria. In some embodiments, the prokaryotic 16S rRNA gene is a bacterial gene. In some embodiments, the nucleic acids subjected to amplification include nucleic acids from a plurality of different microorganisms. In some embodiments, nucleic acids of one or more microorganisms comprising a 16S rRNA gene are obtained from a biological sample, such as, for example, a sample of contents of the alimentary canal of an animal. In some embodiments, the sample is a fecal sample. In some embodiments, the primers of the combination of primer pairs are directed to, or bind to, or hybridize to nucleic acid sequences contained in conserved regions of a 16S rRNA gene. In some embodiments, each primer of the combination of primer pairs contains less than 10, less than 9, less than 8, less than 7, less than 6, less than 5, less than 4, less than 3, or less than 2 contiguous nucleotides of sequence identical to a sequence of contiguous nucleotides of another primer in the combination of primer pairs. 
In some embodiments, the nucleic acid sequences being amplified are less than about 300 bp, less than about 250 bp, less than about 200 bp, less than about 175 bp, less than about 150 bp, or less than about 125 bp in length. In some embodiments, the combination of primer pairs separately amplify nucleic acids containing sequences of 3 or more, 4 or more, 5 or more, 6 or more, 7 or more, 8 or more or 9 different hypervariable regions of a prokaryotic 16S rRNA gene thereby producing amplified copies of the nucleic acids containing sequences of the 3 or more, 4 or more, 5 or more, 6 or more, 7 or more, 8 or more or 9 different hypervariable regions of the 16S rRNA gene of one or more microorganisms, wherein the amplified copies of different hypervariable regions are separate amplicons. In some embodiments, the combination of primer pairs separately amplify 8 different nucleic acids separately containing sequences of 8 different hypervariable regions of a prokaryotic 16S rRNA gene. In some embodiments, the 8 different hypervariable regions are V2-V9. In some embodiments, the combination of primer pairs separately amplify at least 3 different nucleic acids each of which separately contains a sequence of a different hypervariable region of a prokaryotic 16S rRNA gene wherein one of the 3 or more regions is a V5 region thereby producing amplified copies of the nucleic acids separately containing sequences of 3 or more hypervariable regions of the 16S rRNA gene of one or more microorganisms. In some embodiments, the combination of primer pairs includes degenerate sequences of one or more primers in one or more primer pairs. 
For example, in some embodiments, for at least one of the hypervariable regions amplified by the combination of primer pairs, at least two different primer pairs in the combination of primer pairs separately amplify nucleic acid sequence within the same hypervariable region for 2 or more species of the same prokaryotic genus, or for 2 or more strains of the same prokaryotic species, having differences in nucleic acid sequences at the same hypervariable region. In some such instances, at least two different primer pairs in the combination of primer pairs separately amplify nucleic acid sequence within the V2 hypervariable region for 2 or more species of the same prokaryotic genus, or 2 or more strains of the same prokaryotic species, having differences in nucleic acid sequences at the V2 hypervariable region, and/or at least two different primer pairs in the combination of primer pairs separately amplify nucleic acid sequence within the V8 hypervariable region for 2 or more species of the same prokaryotic genus, or 2 or more strains of the same prokaryotic species, having differences in nucleic acid sequences at the V8 hypervariable region. In some embodiments, the combination of primer pairs that amplifies nucleic acids containing sequences of hypervariable regions of a prokaryotic 16S rRNA gene comprises primers and/or primer pairs containing, or consisting essentially of, a sequence or sequences of a primer or primer pair in Table 15, or SEQ ID NOS: 1-24 in Table 15 and/or SEQ ID NOS: 25-48 in Table 15, or SEQ ID NOS: 11-16, 23 and 24 in Table 15 and/or SEQ ID NOS: 35-40, 47 and 48 in Table 15, or substantially identical or similar sequences, and optionally wherein one or more thymine bases is substituted with a uracil base. In some embodiments, the amplification is a multiplex amplification conducted in a single reaction mixture. 
In some embodiments, at least one primer or primer pair in the combination includes a modification that facilitates nucleic acid manipulation, amplification, ligation and/or sequencing of amplification products and/or reduction or elimination of primer dimers. In particular embodiments, a modification is one that facilitates multiplex nucleic acid amplification, ligation and/or sequencing of products of multiplex amplification.
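The limit on contiguous identical nucleotides shared between primers in a combination, described above, amounts to bounding the longest common substring of every pair of primers. The following minimal Python sketch checks that condition for a toy panel. The primer sequences and the names `longest_shared_run` and `panel_satisfies_limit` are hypothetical; the sketch compares primers in the same orientation only and does not assess complementarity between primers.

```python
from itertools import combinations


def longest_shared_run(a, b):
    """Length of the longest contiguous run of nucleotides identical in `a` and `b`.

    Standard dynamic programming for the longest common substring,
    keeping only the previous row of the table.
    """
    best = 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                curr[j] = prev[j - 1] + 1
                best = max(best, curr[j])
        prev = curr
    return best


def panel_satisfies_limit(primers, limit):
    """True if every pair of primers shares fewer than `limit` contiguous identical nucleotides."""
    return all(longest_shared_run(p, q) < limit for p, q in combinations(primers, 2))


# Hypothetical panel: the first two primers share the 5-nucleotide run "TTGCA".
panel = ["ACGTTGCA", "TTGCAGGA", "CCCCCCCC"]
print(longest_shared_run(panel[0], panel[1]))   # 5
print(panel_satisfies_limit(panel, 6))          # True: all shared runs are shorter than 6
print(panel_satisfies_limit(panel, 5))          # False: one shared run is exactly 5
```

The same check applies for any of the recited thresholds (less than 10 down to less than 2) by changing `limit`.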


Methods for Amplification of Multiple Regions of a Genome of a Microorganism


In some embodiments, an amplification method is provided for amplifying multiple regions of the genome of one or more microorganisms. In some embodiments, the method includes (a) obtaining nucleic acids of one or more microorganisms comprising a 16S rRNA gene and (b) subjecting the nucleic acids to nucleic acid amplification using a combination of primer pairs comprising (i) one or more primer pairs that amplifies a nucleic acid containing a sequence of a hypervariable region of a prokaryotic 16S rRNA gene (referred to as the “16S rRNA gene primers and primer pairs”), and (ii) one or more primer pairs that amplify a target nucleic acid sequence contained within the genome of a microorganism that is not contained within a hypervariable region of a prokaryotic 16S rRNA gene, wherein different primer pairs amplify different target nucleic acid sequences contained within the genome of different microorganisms (referred to as the “non-16S rRNA gene primers and primer pairs”), thereby generating amplified copies of at least two different regions of the genome of one or more microorganisms. In some embodiments, the microorganism(s) is/are bacteria. In some embodiments, the prokaryotic 16S rRNA gene is a bacterial gene and/or the prokaryotic microorganism is a bacterium. In some embodiments, the one or more primer pairs that amplifies a nucleic acid containing a sequence of a hypervariable region of a prokaryotic 16S rRNA gene separately amplify nucleic acid sequences of different hypervariable regions. In some embodiments, the primers of the one or more primer pairs of (i) are directed to, or bind to, or hybridize to nucleic acid sequences contained in conserved regions of a prokaryotic 16S rRNA gene. In some embodiments, the amplification is a multiplex amplification conducted in a single reaction mixture. 
In some embodiments, an amplification method for amplifying multiple regions of the genome of one or more microorganisms includes (a) obtaining nucleic acids of one or more microorganisms comprising a 16S rRNA gene and (b) subjecting the nucleic acids to two or more separate nucleic acid amplification reactions using a first set of primer pairs for one nucleic acid amplification reaction and a second set of primer pairs for the other nucleic acid amplification reaction, wherein (i) the first set of primer pairs comprises one or more primer pairs that amplifies a nucleic acid containing a sequence of a hypervariable region of a prokaryotic 16S rRNA gene (referred to as the “16S rRNA gene primers and primer pairs”), and (ii) the second set of primer pairs comprises one or more primer pairs that amplify a target nucleic acid sequence contained within the genome of a microorganism that is not contained within a hypervariable region of a prokaryotic 16S rRNA gene, wherein different primer pairs amplify different target nucleic acid sequences contained within the genome of different microorganisms (referred to as the “non-16S rRNA gene primers and primer pairs”), thereby generating amplified copies of at least two different regions of the genome of one or more microorganisms. In some embodiments, the microorganism(s) is/are bacteria. In some embodiments, the prokaryotic 16S rRNA gene is a bacterial gene and/or the prokaryotic microorganism is a bacterium. In some embodiments, the one or more primer pairs that amplifies a nucleic acid containing a sequence of a hypervariable region of a prokaryotic 16S rRNA gene separately amplify nucleic acid sequences of different hypervariable regions. In some embodiments, the primers of the one or more primer pairs of (i) are directed to, or bind to, or hybridize to nucleic acid sequences contained in conserved regions of a prokaryotic 16S rRNA gene. 
In some embodiments, the amplification is a multiplex amplification conducted in a single reaction mixture.


In some embodiments of the amplification methods for amplifying multiple regions of the genome of one or more microorganisms, the target nucleic acid sequence contained within a genome of a prokaryotic microorganism, e.g., bacteria, is unique to the microorganism. In some embodiments, the one or more 16S rRNA gene primer pairs amplify a nucleic acid sequence in a plurality of microorganisms, e.g., bacteria, from different genera. In some embodiments, a mixture of nucleic acids of at least two different microorganisms, e.g., bacteria, is obtained and subjected to nucleic acid amplification, and the genome of only one of the microorganisms contains a target sequence specifically amplified by the non-16S rRNA gene primer pair. In some such embodiments, the generated amplified copies contain copies of a target nucleic acid sequence amplified by a non-16S rRNA gene primer pair from the nucleic acid of the genome of one microorganism but do not contain copies of a target nucleic acid sequence amplified by a non-16S rRNA gene primer pair from the nucleic acid of the genome of any other microorganism that was subjected to nucleic acid amplification. Also in some such embodiments, the generated amplified copies contain copies of a nucleic acid sequence of a hypervariable region amplified by a 16S rRNA gene primer pair from the nucleic acids of the genome of a plurality of microorganisms. In some embodiments, the nucleic acids subjected to nucleic acid amplification include nucleic acids from a plurality of different microorganisms. In some embodiments, nucleic acids of one or more microorganisms, e.g., bacteria, comprising a 16S rRNA gene are obtained from a biological sample, such as, for example, a sample of contents of the alimentary tract of an animal. In some embodiments, the sample is a fecal sample. 
In some embodiments, each primer of the one or more 16S rRNA gene primer pairs contains less than 10, less than 9, less than 8, less than 7, less than 6, less than 5, less than 4, less than 3, or less than 2 contiguous nucleotides of sequence identical to a sequence of contiguous nucleotides of another primer in the combination of primer pairs. In some embodiments, the nucleic acid sequences being amplified by the one or more 16S rRNA gene primer pairs are less than about 300 bp, less than about 250 bp, less than about 200 bp, less than about 175 bp, less than about 150 bp, or less than about 125 bp in length. In some embodiments, the 16S rRNA gene primer pairs separately amplify nucleic acids separately containing a different one of 3 or more, 4 or more, 5 or more, 6 or more, 7 or more, 8 or more or 9 different hypervariable regions of a prokaryotic 16S rRNA gene thereby producing amplified copies of the nucleic acids separately containing sequences of one of 3 or more, 4 or more, 5 or more, 6 or more, 7 or more, 8 or more or 9 different hypervariable regions of the 16S rRNA gene of one or more microorganisms, wherein the amplified copies of different hypervariable regions are separate amplicons. In some embodiments, the 16S rRNA gene primer pairs separately amplify nucleic acids separately containing sequences of 8 different hypervariable regions of a prokaryotic 16S rRNA gene. In some embodiments, the 8 different hypervariable regions are V2-V9. In some embodiments, the 16S rRNA gene primer pairs separately amplify nucleic acids separately containing sequences of 3 or more different hypervariable regions of a prokaryotic 16S rRNA gene wherein one of the 3 or more regions is a V5 region thereby producing amplified copies of the nucleic acids separately containing sequences of 3 or more different hypervariable regions of the 16S rRNA gene of one or more microorganisms. 
In some embodiments, the combination of primer pairs includes degenerate sequences of one or more primers in one or more primer pairs. In some embodiments, the 16S rRNA gene primer pair(s) comprise primers and/or primer pairs containing, or consisting essentially of, a sequence or sequences of a primer or primer pair in Table 15, or SEQ ID NOS: 1-24 in Table 15 and/or SEQ ID NOS: 25-48 in Table 15, or SEQ ID NOS: 11-16, 23 and 24 in Table 15 and/or SEQ ID NOS: 35-40, 47 and 48 in Table 15, or substantially identical or similar sequences, and optionally wherein one or more thymine bases is substituted with a uracil base. In some embodiments, the at least one non-16S rRNA gene primer pair specifically amplifies a target nucleic acid sequence contained within a genome of a microorganism selected from the microorganisms of Table 1, or Table 1, except for, or excluding, Actinomyces viscosus and/or Blautia coccoides, or Table 1, except for, or excluding, Actinomyces viscosus, Blautia coccoides and/or Helicobacter salomonis. In some embodiments, the target nucleic acid is unique to the microorganism. In some embodiments, the nucleic acids subjected to amplification include nucleic acids from a plurality of different microorganisms listed in Table 1. In some such embodiments, amplified copies of target nucleic acids of a plurality of different microorganisms in Table 1, or Table 1, except for, or excluding, Actinomyces viscosus and/or Blautia coccoides, or Table 1, except for, or excluding, Actinomyces viscosus, Blautia coccoides and/or Helicobacter salomonis, are produced. 
In some embodiments, the nucleic acids subjected to amplification include a mixture of nucleic acids of one or more, or a plurality of, microorganisms selected from among the microorganisms listed in Table 1, or Table 1, except for, or excluding, Actinomyces viscosus and/or Blautia coccoides, or Table 1, except for, or excluding, Actinomyces viscosus, Blautia coccoides and/or Helicobacter salomonis, and one or more microorganisms, e.g., bacteria, not listed in Table 1. In some embodiments, at least one, or one or more, target nucleic acid sequence(s) comprises or consists essentially of a nucleotide sequence selected from the nucleotide sequences of SEQ ID NOS: 1605-1979 in Table 17, or SEQ ID NOS: 1605-1806, 1809-1816, 1821-1970, 1972-1974 and 1977-1979 of Table 17, or SEQ ID NOS: 1605-1826 in Table 17, or SEQ ID NOS: 1605-1806, 1809-1816 and 1821-1826 in Table 17, or SEQ ID NOS: 1605-1820 in Table 17A, or SEQ ID NOS: 1605-1806 and 1809-1816 in Table 17A, or SEQ ID NOS: 1827-1979 in Table 17C, or SEQ ID NOS: 1827-1976 in Table 17C, or a substantially identical or similar sequence. In some embodiments, at least one, or one or more, product(s) of the nucleic acid amplification comprises, or consists essentially of, a nucleotide sequence selected from SEQ ID NOS: 1605-1979 in Table 17, or SEQ ID NOS: 1605-1806, 1809-1816, 1821-1970, 1972-1974 and 1977-1979 of Table 17, or SEQ ID NOS: 1605-1826 in Table 17, or SEQ ID NOS: 1605-1806, 1809-1816 and 1821-1826 in Table 17, or SEQ ID NOS: 1605-1820 in Table 17A, or SEQ ID NOS: 1605-1806 and 1809-1816 in Table 17A, or SEQ ID NOS: 1827-1979 in Table 17C, or SEQ ID NOS: 1827-1976 in Table 17C, or a substantially identical or similar sequence, or the complement thereof, and optionally having one or more primer sequences at the 5′ and/or 3′ end(s) of the sequence, such as any of the primer sequences provided herein. 
In some embodiments, at least one, or one or more, product(s) of the nucleic acid amplification is less than about 500, less than about 475, less than about 450, less than about 400, less than about 375, less than about 350, less than about 300, less than about 275, less than about 250, less than about 200, less than about 175, less than about 150, or less than about 100 nucleotides in length. In some embodiments, the at least one non-16S rRNA gene primer pair does not detectably amplify a nucleic acid sequence contained within any genus other than the genus of the microorganism containing the target nucleic acid sequence. In some embodiments, the at least one non-16S rRNA gene primer pair does not detectably amplify a nucleic acid sequence contained within any species other than the species of the microorganism containing the target nucleic acid sequence. In some embodiments, at least one primer of the non-16S rRNA gene primer pair, or at least one non-16S rRNA gene primer pair, contains, or consists essentially of, the sequence or sequences of a primer or primer pair in Table 16, or SEQ ID NOS: 49-520 of Table 16, or SEQ ID NOS: 49-452, 457-472 and 481-520 of Table 16, or SEQ ID NOS: 49-492 of Table 16, or SEQ ID NOS: 49-452, 457-472 and 481-492 of Table 16, or SEQ ID NOS: 49-480 of Table 16A, or SEQ ID NOS: 49-452 and 457-472 of Table 16A, or SEQ ID NOS: 521-826 of Table 16C, or SEQ ID NOS: 521-820 of Table 16C, or SEQ ID NOS: 827-1298 of Table 16, or SEQ ID NOS: 827-1230, 1235-1250 and 1259-1298 of Table 16, or SEQ ID NOS: 827-1270 of Table 16, or SEQ ID NOS: 827-1230, 1235-1250 and 1259-1270 of Table 16, or SEQ ID NOS: 827-1258 of Table 16D, or SEQ ID NOS: 827-1230 and 1235-1250 of Table 16D, or SEQ ID NOS: 1299-1604 of Table 16F, or SEQ ID NOS: 1299-1598 of Table 16F, or a substantially identical or similar sequence(s), or any of the aforementioned nucleotide sequences of nucleic acids or primer pairs in which one or more thymine bases is substituted with a 
uracil base. In some embodiments, the nucleic acids are subjected to nucleic acid amplification using a plurality of non-16S rRNA gene primers or primer pairs, each containing, or consisting essentially of, a sequence or sequences of a primer pair in Table 16, or SEQ ID NOS: 49-520 of Table 16, or SEQ ID NOS: 49-452, 457-472 and 481-520 of Table 16, or SEQ ID NOS: 49-492 of Table 16, or SEQ ID NOS: 49-452, 457-472 and 481-492 of Table 16, or SEQ ID NOS: 49-480 of Table 16A, or SEQ ID NOS: 49-452 and 457-472 of Table 16A, or SEQ ID NOS: 521-826 of Table 16C, or SEQ ID NOS: 521-820 of Table 16C, or SEQ ID NOS: 827-1298 of Table 16, or SEQ ID NOS: 827-1230, 1235-1250 and 1259-1298 of Table 16, or SEQ ID NOS: 827-1270 of Table 16, or SEQ ID NOS: 827-1230, 1235-1250 and 1259-1270 of Table 16, or SEQ ID NOS: 827-1258 of Table 16D, or SEQ ID NOS: 827-1230 and 1235-1250 of Table 16D, or SEQ ID NOS: 1299-1604 of Table 16F, or SEQ ID NOS: 1299-1598 of Table 16F, or a substantially identical or similar sequence(s), or any of the aforementioned nucleotide sequences of nucleic acids or primer pairs in which one or more thymine bases is substituted with a uracil base. In some embodiments, at least one primer or one primer pair in the combination of primer pairs includes a modification that facilitates nucleic acid manipulation, amplification, ligation and/or sequencing of amplification products and/or reduction or elimination of primer dimers. In particular embodiments, a modification is one that facilitates multiplex nucleic acid amplification, ligation and/or sequencing of products of multiplex amplification.


Procedures/Techniques for Use in Methods for Amplification of Nucleic Acids


Methods for obtaining nucleic acids, for example, from a sample are described herein and/or known to those of skill in the art. Samples containing microorganisms can come from a variety of sources, including, for example, environmental sources, e.g., water, soil, and organismal sources, e.g., animals, including, without limitation, insects, domestic animals (e.g., cattle, sheep, pigs, horses, dogs, cats, etc.), mammals (e.g., humans). Common animal samples include, without limitation, saliva, biopsies, tumors, scrapings, swabs, blood, mucus, urine, plasma, semen, hair, laser capture micro-dissections, surgical resections, feces and other clinical or laboratory obtained samples. Fecal samples are commonly used as sources of microorganisms from an animal's alimentary tract or gut. Kits, protocols and instruments for use in extracting nucleic acids from animal samples are available from commercial public sources and include, for example, the MagMAX™ Microbiome Ultra Nucleic Acid Isolation Kit (Thermo Fisher Scientific; catalog no. A42357 (with plate) or A42358 (with tubes)) which can be used with the Thermo Scientific™ Kingfisher™ Flex Magnetic Particle Processor with 96 deep well heads (Thermo Fisher Scientific; catalog no. 5400630). The amount of nucleic acid material required for successful multiplex amplification reactions, as can be conducted in embodiments of the methods provided herein, can be about 1 ng. In some embodiments, the amount of nucleic acid material can be about 10 ng to about 50 ng, about 10 ng to about 100 ng, or about 1 ng to about 200 ng of nucleic acid material. Higher amounts of input material can be used; however, one aspect of the disclosure is to selectively amplify a plurality of target sequences from a low (ng) amount of starting material.


Amplification methods provided herein typically include preparation of an amplification reaction mixture containing reagents for conducting the reaction and subjecting the mixture to conditions to achieve repeated cycles of primer annealing to a template nucleic acid, primer extension and dissociation of the extended primer and template strands (e.g., denaturation). Various techniques for use in amplifying nucleic acids can be employed in the amplification methods, for example, polymerase chain reaction (PCR)-based techniques, helicase-dependent amplification (HDA), loop-mediated isothermal amplification (LAMP) and strand displacement amplification. In some embodiments, the method comprises hybridizing one or more primers of a primer pair to a target template sequence, extending a first primer of the primer pair, denaturing the extended first primer product from the population of nucleic acid molecules, hybridizing to the extended first primer product the second primer of the primer pair, extending the second primer to form a double stranded product, and, in some embodiments, digesting the target-specific primer pair away from the double stranded product to generate a plurality of amplified target sequences. In some embodiments, the digesting includes partial digesting of one or more of the target-specific primers from the amplified target sequence. 
In some embodiments, the method of performing multiplex PCR amplification includes contacting a plurality of primer pairs having a forward and reverse primer, with a population of template nucleic acid sequences, e.g., in or from a sample, to form a plurality of template/primer duplexes; adding a DNA polymerase and a mixture of dNTPs to the plurality of template/primer duplexes for sufficient time and at sufficient temperature to extend either (or both) the forward or reverse primer in each target-specific primer pair via template-dependent synthesis thereby generating a plurality of extended primer product/template duplexes; denaturing the extended primer product/template duplexes; annealing to the extended primer product the complementary primer from the target-specific primer pair; and extending the annealed primer in the presence of a DNA polymerase and dNTPs to form a plurality of target-specific double-stranded nucleic acid molecules. In some embodiments, the steps of the amplification PCR method can be performed in any order. In some instances, the methods disclosed herein can be further optimized to remove one or more steps and still obtain sufficient amplified target sequences to be used in a variety of downstream processes. For example, the number of purification or clean-up steps can be modified to include more or fewer steps than disclosed herein, provided the amplified target sequences are generated in sufficient yield. In some embodiments, the multiplex PCR comprises hybridizing one or more target-specific primer pairs to a nucleic acid molecule; extending the primers of the target-specific primer pairs via template-dependent synthesis in the presence of a DNA polymerase and dNTPs; and repeating the hybridization and extension steps for sufficient time and at sufficient temperature, thereby generating a plurality of amplified target sequences. In some embodiments, the steps of the multiplex amplification reaction method can be performed in any order. 
The multiplex PCR amplification reactions disclosed herein can include a plurality of “cycles” typically performed on a thermocycler. Each cycle includes at least one annealing step and at least one extension step. In one embodiment, a multiplex PCR amplification reaction is performed wherein target-specific primer pairs are hybridized to a target sequence; the hybridized primers are extended generating an extended primer product/nucleic acid duplex; the extended primer product/nucleic acid duplex is denatured allowing the complementary primer to hybridize to the extended primer product, wherein the complementary primer is extended to generate a plurality of amplified target sequences. In one embodiment, the methods disclosed herein have about 5 to about 18 cycles per preamplification reaction. The annealing temperature and/or annealing duration per cycle can be identical; can include incremental increases or decreases, or a combination of both. The extension temperature and/or extension duration per cycle can be identical; can include incremental increases or decreases, or a combination of both. For example, the annealing temperature or extension temperature can remain constant per cycle. In some embodiments, the annealing temperature can remain constant each cycle and the extension duration can incrementally increase per cycle. In some embodiments, increases or decreases in duration can occur in 15 second, 30 second, 1 minute, 2 minute or 4 minute increments. In some embodiments, increases or decreases in temperature can occur in 0.5, 1, 2, 3, or 4° C. increments. In some embodiments, the amplification reaction can be conducted using hot-start PCR techniques. These techniques include the use of a heating step (>60° C.) before polymerization begins to reduce the formation of undesired PCR products. 
Other hot-start techniques rely on the reversible inactivation or physical separation of one or more critical reagents of the reaction; for example, the magnesium or DNA polymerase can be sequestered in a wax bead, which melts as the reaction is heated during the denaturation step, releasing the reagent only at higher temperatures. The DNA polymerase can also be kept in an inactive state by binding to an aptamer or an antibody. This binding is disrupted at higher temperatures, releasing the functional DNA polymerase that can proceed with the PCR unhindered.
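
The per-cycle variation described above (for example, a constant annealing temperature with an extension duration that increases by a fixed increment each cycle) can be illustrated with a short sketch. The names Cycle and build_schedule, and the temperature and time values used, are hypothetical and chosen only for illustration; they are not parameters taken from the disclosure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cycle:
    number: int
    anneal_temp_c: float
    extend_seconds: int


def build_schedule(cycles, anneal_temp_c, extend_start_s,
                   extend_step_s=0, anneal_step_c=0.0):
    """Return one Cycle per PCR cycle, applying linear per-cycle increments.

    A step of zero keeps that parameter constant across all cycles.
    """
    if cycles < 1:
        raise ValueError("cycles must be at least 1")
    return [
        Cycle(
            number=i + 1,
            anneal_temp_c=anneal_temp_c + i * anneal_step_c,
            extend_seconds=extend_start_s + i * extend_step_s,
        )
        for i in range(cycles)
    ]


# Illustrative values only: constant annealing temperature, extension time
# growing by 15 seconds per cycle.
schedule = build_schedule(cycles=5, anneal_temp_c=60.0,
                          extend_start_s=240, extend_step_s=15)
```

A negative step models an incremental decrease, so the same helper covers both directions of change mentioned above.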


In some embodiments, the amplified target sequences can be ligated to one or more adapters. In some embodiments, adapters can include one or more nucleic acid barcodes or tagging sequences. In some embodiments, amplified target sequences once ligated to an adapter can undergo a nick translation reaction and/or further amplification to generate a library of adapter-ligated amplified target sequences. In one embodiment, the amplification method involves performing multiplex PCR on a nucleic acid sample using a plurality of primers having a cleavable group. In some embodiments, a multiplex PCR amplification reaction is conducted using a plurality of primers provided herein that have a cleavable group, and includes a DNA polymerase, an adapter, dATP, dCTP, dGTP and dTTP. In some embodiments, the cleavable group can be a uracil nucleotide. In some embodiments, forward and reverse primer pairs contain a uracil nucleotide as the one or more cleavable groups. In one embodiment, a primer pair can include a uracil nucleotide in each of the forward and reverse primers of each primer pair. In one embodiment, a forward or reverse primer contains one, two, three or more uracil nucleotides. In some embodiments, methods involve amplifying at least 10, 50, 100, 150, 200, 250, 300, 350, 398 or more, target sequences from a population of nucleic acids having a plurality of target sequences using target-specific forward and reverse primer pairs containing at least two uracil nucleotides. The reaction can also include one or more antibodies and/or nucleic acid barcodes. In some embodiments, the methods include processes for reducing the formation of amplification artifacts in a multiplex PCR. In some embodiments, primer-dimers or non-specific amplification products are obtained in lower number or yield as compared to standard multiplex PCR of the prior art. 
In some embodiments, the reduction in amplification artifacts is, in part, governed by the use of specific primer pairs in the multiplex PCR reaction. In one embodiment, the number of specific primer pairs in the multiplex PCR reaction can be greater than 50, 100, 150, 200, 250, 300 or more. In some embodiments, multiplex PCR is performed using primers that contain a cleavable group. In one embodiment, primers containing a cleavable group can include one or more cleavable moieties per primer of each primer pair. In some embodiments, a primer containing a cleavable group includes a nucleotide neither normally present in a sample nor native to the population of nucleic acids undergoing multiplex PCR. For example, a primer can include one or more non-native nucleic acid molecules such as, but not limited to, thymine dimers, 8-oxo-2′-deoxyguanosine, inosine, deoxyuridine, bromodeoxyuridine, apurinic nucleotides, and the like.


In some embodiments, the disclosed methods can optionally include destroying one or more primer-containing amplification artifacts, e.g., primer-dimers, dimer-dimers or superamplicons. In some embodiments, the destroying can optionally include treating the primer and/or amplification product so as to cleave specific cleavable groups present in the primer and/or amplification product. In some embodiments, the treating can include partial or complete digestion of one or more target-specific primers. In one embodiment, the treating can include removing at least 40% of the target specific primer from the amplification product. The cleavable treatment can include enzymatic, acid, alkali, thermal, photo or chemical activity. The cleavable treatment can result in the cleavage or other destruction of the linkages between one or more nucleotides of the primer, or between one or more nucleotides of the amplification product. The primer and/or the amplification product can optionally include one or more modified nucleotides or nucleobases. In some embodiments, the cleavage can selectively occur at these sites, or adjacent to the modified nucleotides or nucleobases. In some embodiments, the primer includes a sufficient number of modified nucleotides to allow functionally complete degradation of the primer by the cleavage treatment, but not so many as to interfere with the primer's specificity or functionality prior to such cleavage treatment, for example in the amplification reaction. In some embodiments, the primer includes at least one modified nucleotide, but no greater than 75% of nucleotides of the primer are modified. In some embodiments, the cleavage or treatment of the amplified target sequence can result in the formation of a phosphorylated amplified target sequence. In some embodiments, the amplified target sequence is phosphorylated at the 5′ terminus.
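
The limits on modified nucleotides described above (at least one modified nucleotide, but no greater than 75% of the nucleotides of the primer modified) can be checked with a minimal sketch. It assumes, for illustration only, that the modified nucleotide is written as U in the primer sequence, consistent with the uracil cleavable group mentioned elsewhere herein; the function name and the example sequences are hypothetical and are not sequences from the Sequence Listing.

```python
def within_modification_limits(primer, modified="U",
                               min_count=1, max_fraction=0.75):
    """Return True if the primer has enough, but not too many, modified bases.

    `modified` lists the one-letter symbols treated as modified nucleotides.
    """
    seq = primer.upper()
    if not seq:
        raise ValueError("primer sequence is empty")
    count = sum(base in modified for base in seq)
    return count >= min_count and count / len(seq) <= max_fraction


# Hypothetical primers: two uracils in 15 bases, and no uracil at all.
has_two_uracils = within_modification_limits("ACGUACGGTCAUGCA")
has_no_uracil = within_modification_limits("ACGTACGG")
```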


In some embodiments, primers can be designed de novo using algorithms that generate oligonucleotide sequences according to specified design criteria. For example, the primers may be selected according to any one or more of criteria specified herein. In some embodiments, one or more of the primers are selected or designed to satisfy any one or more of the following criteria: (1) inclusion of two or more modified nucleotides within the primer sequence, at least one of which is included near or at the termini of the primer and at least one of which is included at, or about the center nucleotide position of the primer sequence; (2) primer length of about 15 to about 40 bases in length; (3) Tm of from about 60° C. to about 70° C.; (4) low cross-reactivity with non-target sequences present in the target genome or sample of interest; (5) for each primer in a given reaction, the sequence of at least the first four nucleotides (going from 3′ to 5′ direction) are not complementary to any sequence within any other primer present in the same reaction; and (6) no amplicon includes any consecutive stretch of at least 5 nucleotides that is complementary to any sequence within any other amplicon. In some embodiments, the primers include one or more primer pairs designed to amplify target sequences from the sample that are about 100 base pairs to about 500 base pairs in length. In some embodiments, the primers include a plurality of primer pairs designed to amplify target sequences, where the amplified target sequences are predicted to vary in length from each other by no more than 50%, typically no more than 25%, even more typically by no more than 10%, or 5%. 
For example, if one primer pair is selected or predicted to amplify a product that is 100 nucleotides in length, then other primer pairs are selected or predicted to amplify products that are between 50-150 nucleotides in length, typically between 75-125 nucleotides in length, even more typically between 90-110 nucleotides, or 95-105 nucleotides, or 99-101 nucleotides in length. In some embodiments, at least one primer pair in the amplification reaction is not designed de novo according to any predetermined selection criteria. For example, at least one primer pair can be an oligonucleotide sequence selected or generated at random, or previously selected or generated for other applications. In one exemplary embodiment, the amplification reaction can include at least one primer pair selected from the TaqMan® probe reagents (Roche Molecular Systems). The TaqMan® reagents include labeled probes and can be useful, inter alia, for measuring the amount of target sequence present in the sample, optionally in real time. Some examples of TaqMan technology are disclosed in U.S. Pat. Nos. 5,210,015, 5,487,972, 5,804,375, 6,214,979, 7,141,377 and 7,445,900, hereby incorporated by reference in their entireties. In some embodiments, at least one primer within the amplification reaction can be labeled, for example with an optically detectable label, to facilitate a particular application of interest. For example, labeling may facilitate quantification of target template and/or amplification product, isolation of the target template and/or amplification product, and the like. In some embodiments, the primers do not contain a carbon-spacer or terminal linker. In some embodiments, the primers or amplified target sequences do not contain an enzymatic, magnetic, optical or fluorescent label.
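
Two of the design considerations above lend themselves to a brief sketch: criterion (5), under which the four nucleotides at the 3′ end of a primer are not complementary to any sequence within another primer in the same reaction, and the amplicon length window computed around a reference product length. The function names and the primer sequences below are hypothetical and serve only as an illustration; the sketch treats complementarity as an exact antiparallel match, which is one possible reading of the criterion.

```python
# Hypothetical helpers; names and example sequences are illustrative only.

_COMPLEMENT = str.maketrans("ACGTU", "TGCAA")


def reverse_complement(seq):
    """Reverse complement of a DNA-like sequence (U is read as T)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def three_prime_clashes(primer, other_primers, tail=4):
    """Return the other primers containing a site complementary to the 3' tail.

    Sequences are written 5' to 3', so the 3' tail is the last `tail` bases
    and the site it could anneal to reads as the tail's reverse complement.
    """
    site = reverse_complement(primer[-tail:])
    return [p for p in other_primers if site in p.upper().replace("U", "T")]


def length_window(reference_len, tolerance_pct):
    """Allowed amplicon lengths within +/- tolerance_pct of reference_len."""
    delta = reference_len * tolerance_pct / 100
    return (reference_len - delta, reference_len + delta)


pool = ["CCATCGTGGA", "GGGCCCAAATTT"]
clashes = three_prime_clashes("GATTACAGGCTTACGA", pool)
window = length_window(100, 25)
```

For a 100-nucleotide reference product, length_window(100, 50) and length_window(100, 25) reproduce the 50-150 and 75-125 nucleotide ranges given in the example above.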


In some embodiments, primers are synthesized that are complementary to, and can hybridize with, discrete segments of a nucleic acid template strand, including: a primer that can hybridize to the 5′ region of the template, which encompasses a sequence that is complementary to either the forward or reverse amplification primer. In some embodiments, the forward primers, reverse primers, or both, share no common nucleic acid sequence, such that they hybridize to distinct nucleic acid sequences. For example, target-specific forward and reverse primers can be prepared that do not compete with other primer pairs within the primer pool to amplify the same nucleic acid sequence. In this example, primer pairs that do not compete with other primer pairs in the primer pool assist in the reduction of non-specific or spurious amplification products. In some embodiments, the forward and reverse primers of each primer pair are unique, in that the nucleotide sequence for each primer is non-complementary and non-identical to the other primer in the primer pair. In some embodiments, the primer pair can differ by at least 10%, at least 20%, at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 75%, at least 80%, at least 85%, or at least 90% nucleotide identity. In some embodiments, the forward and reverse primers in each primer pair are non-complementary or non-identical to other primer pairs in the primer pool or multiplex reaction. For example, the primer pairs within a primer pool or multiplex reaction can differ by at least 5%, at least 10%, at least 20%, at least 30%, at least 40%, at least 50%, at least 60%, or at least 70% nucleotide identity to other primer pairs within the primer pool or multiplex reaction. Primers are designed to minimize the formation of primer-dimers, dimer-dimers or other non-specific amplification products. Typically, primers are optimized to reduce GC bias and low melting temperatures (Tm) during the amplification reaction. 
In some embodiments, the primers are designed to possess a Tm of about 55° C. to about 72° C. In some embodiments, the primers of a primer pool can possess a Tm of about 59° C. to about 70° C., 60° C. to about 68° C., or 60° C. to about 65° C. In some embodiments, the primer pool can possess a Tm that does not deviate by more than 5° C.
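
The Tm constraints above can be expressed as a simple pool-level check. The sketch below assumes that Tm values have already been estimated by whatever method the user prefers (no Tm formula is specified here), and it interprets "does not deviate by more than 5° C." as the difference between the highest and lowest Tm in the pool, which is an assumption. The function name and example values are hypothetical.

```python
def tm_pool_check(tms_c, low_c=55.0, high_c=72.0, max_spread_c=5.0):
    """Check caller-supplied primer Tm values against a range and a spread limit.

    Returns True only if every Tm lies in [low_c, high_c] and the difference
    between the highest and lowest Tm does not exceed max_spread_c.
    """
    if not tms_c:
        raise ValueError("at least one Tm value is required")
    in_range = all(low_c <= tm <= high_c for tm in tms_c)
    spread = max(tms_c) - min(tms_c)
    return in_range and spread <= max_spread_c


# Illustrative Tm values in degrees Celsius.
ok_pool = tm_pool_check([60.5, 62.0, 64.8])
wide_pool = tm_pool_check([56.0, 63.0, 70.0])
```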


In some embodiments, the primer pairs used to produce an amplicon library can result in the amplification of target-specific nucleic acid molecules possessing one or more of the following metrics: greater than 97% target coverage at 20× if normalized to 100× average coverage depth; greater than 97% of bases with greater than 0.2× mean; greater than 90% of bases without strand bias; greater than 95% of all reads on target; greater than 99% of bases with greater than 0.01× mean; and greater than 99.5% per base accuracy.
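
The uniformity metrics above that are expressed relative to mean coverage (for example, the percentage of bases with greater than 0.2× mean) can be computed from per-base read depths as sketched below. The function name and the depth values are hypothetical; how per-base depths are obtained from sequencing data is outside the scope of the sketch.

```python
from statistics import mean


def fraction_above_mean_multiple(depths, multiple):
    """Fraction of bases whose depth exceeds `multiple` times the mean depth."""
    if not depths:
        raise ValueError("depths must not be empty")
    threshold = multiple * mean(depths)
    return sum(d > threshold for d in depths) / len(depths)


# Illustrative per-base depths for ten target bases (mean depth is 90).
depths = [100, 120, 80, 10, 90, 100, 110, 95, 105, 90]
above_0_2x = fraction_above_mean_multiple(depths, 0.2)
above_0_01x = fraction_above_mean_multiple(depths, 0.01)
```

In this toy data set one base falls below 0.2× of the mean depth, so the pool would not meet a 97% threshold for that metric even though every base exceeds 0.01× of the mean.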


In some embodiments, the primers can be provided as a set of primer pairs in a single amplification vessel. In some embodiments, the primers can be provided in one or more aliquots of primer pairs that can be pooled prior to performing the multiplex PCR reaction in a single amplification vessel or reaction chamber. In one embodiment, the primers can be provided as a pool of forward primers and a separate pool of reverse primers. In another embodiment, primer pairs can be pooled into subsets such as non-overlapping primer pairs. In some embodiments, the pool of primer pairs can be provided in a single reaction chamber or microwell, for example on a PCR plate to perform multiplex PCR using a thermocycler. In some embodiments, the forward and reverse primer pairs can be substantially complementary to the target sequences. In some embodiments, the primer pairs do not contain a common extension (tail) at the 3′ or 5′ end of the primer. In another embodiment, the primers do not contain a Tag or universal sequence. In some embodiments, the primer pairs are designed to eliminate or reduce interactions that promote the formation of non-specific amplification.


Methods for Detecting and/or Measuring the Presence or Absence of Microorganisms in a Sample


Also provided herein are nucleic acid-based methods of detecting and/or measuring the presence or absence of a microorganism in a sample. In some embodiments of the detection and/or measurement methods, nucleic acids in or from a sample are subjected to nucleic acid hybridization and/or amplification, for example, using any of the nucleic acids provided herein as probes and/or amplification primers. In some embodiments, the presence or absence of one or more hybridization and/or nucleic acid amplification products is detected, thereby detecting the presence or absence of a microorganism.


In some embodiments, a method provided herein for detecting, determining the presence or absence of, and/or measuring one or more microorganisms in a sample includes (a) subjecting nucleic acids in or from the sample to nucleic acid amplification using one or more primer pairs that specifically amplifies a target nucleic acid sequence contained within a genome of a microorganism selected from the microorganisms listed in Table 1, or Table 1, except for, or excluding, Actinomyces viscosus and/or Blautia coccoides, or Table 1, except for, or excluding, Actinomyces viscosus, Blautia coccoides and/or Helicobacter salomonis, and (b) detecting one or more amplification products (or the presence or absence thereof), thereby detecting and/or measuring one or more microorganisms selected from the microorganisms listed in Table 1, or Table 1, except for, or excluding, Actinomyces viscosus and/or Blautia coccoides, or Table 1, except for, or excluding, Actinomyces viscosus, Blautia coccoides and/or Helicobacter salomonis, or determining the presence or absence of one or more microorganisms selected from the microorganisms listed in Table 1, or Table 1, except for, or excluding, Actinomyces viscosus and/or Blautia coccoides, or Table 1, except for, or excluding, Actinomyces viscosus, Blautia coccoides and/or Helicobacter salomonis. In some embodiments, the target nucleic acid is unique to the microorganism. In some embodiments, the target nucleic acid is not contained within a prokaryotic 16S rRNA gene. Any of the embodiments provided herein for subjecting nucleic acids to amplification using one or more primer pairs that specifically amplifies a target nucleic acid sequence contained within a genome of a microorganism listed in Table 1 can be used in any embodiment of this method for detecting the presence or absence of a microorganism in a sample. 
In some embodiments, the at least one primer pair does not detectably amplify a nucleic acid sequence contained within any genus other than the genus of the microorganism. In some embodiments, the at least one primer pair does not detectably amplify a nucleic acid sequence contained within any species other than the species of the microorganism. In some embodiments, the nucleic acids in or from the sample include nucleic acids from a plurality of different microorganisms listed in Table 1, or Table 1, except for, or excluding, Actinomyces viscosus and/or Blautia coccoides, or Table 1, except for, or excluding, Actinomyces viscosus, Blautia coccoides and/or Helicobacter salomonis, and/or a plurality of different microorganisms, e.g., bacteria, not listed in Table 1. In some embodiments, the sample is a biological sample, such as, for example, a sample of contents of the alimentary canal of an animal. In some embodiments, the sample is a fecal sample. In some embodiments, the target nucleic acid sequence comprises or consists essentially of a nucleotide sequence selected from SEQ ID NOS: 1605-1979 in Table 17, or SEQ ID NOS: 1605-1806, 1809-1816, 1821-1970, 1972-1974 and 1977-1979 of Table 17, or SEQ ID NOS: 1605-1826 in Table 17, or SEQ ID NOS: 1605-1806, 1809-1816 and 1821-1826 in Table 17, or SEQ ID NOS: 1605-1820 in Table 17A, or SEQ ID NOS: 1605-1806 and 1809-1816 in Table 17A, or SEQ ID NOS: 1827-1979 in Table 17C, or SEQ ID NOS: 1827-1976 in Table 17C, or a substantially identical or similar sequence. 
In some embodiments, detecting the presence or absence of one or more amplification products comprises detecting the presence or absence of one or more nucleotide sequences selected from SEQ ID NOS: 1605-1979 in Table 17, or SEQ ID NOS: 1605-1806, 1809-1816, 1821-1970, 1972-1974 and 1977-1979 of Table 17, or SEQ ID NOS: 1605-1826 in Table 17, or SEQ ID NOS: 1605-1806, 1809-1816 and 1821-1826 in Table 17, or SEQ ID NOS: 1605-1820 in Table 17A, or SEQ ID NOS: 1605-1806 and 1809-1816 in Table 17A, or SEQ ID NOS: 1827-1979 in Table 17C, or SEQ ID NOS: 1827-1976 in Table 17C, or a substantially identical or similar sequence, or the complement thereof. In some embodiments, the at least one primer pair does not detectably amplify a nucleic acid sequence contained within any genus other than the genus of the microorganism containing the target nucleic acid sequence. In some embodiments, the at least one primer pair does not detectably amplify a nucleic acid sequence contained within any species other than the species of the microorganism containing the target nucleic acid sequence. 
In some embodiments, at least one primer of the primer pair, or at least one primer pair, contains, or consists essentially of, a sequence, or sequences of a primer or primer pair, in Table 16, or SEQ ID NOS: 49-520 of Table 16, or SEQ ID NOS: 49-452, 457-472 and 481-520 of Table 16, or SEQ ID NOS: 49-492 of Table 16, or SEQ ID NOS: 49-452, 457-472 and 481-492 of Table 16, or SEQ ID NOS: 49-480 of Table 16A, or SEQ ID NOS: 49-452 and 457-472 of Table 16A, or SEQ ID NOS: 521-826 of Table 16C, or SEQ ID NOS: 521-820 of Table 16C, or SEQ ID NOS: 827-1298 of Table 16, or SEQ ID NOS: 827-1230, 1235-1250 and 1259-1298 of Table 16, or SEQ ID NOS: 827-1270 of Table 16, or SEQ ID NOS: 827-1230, 1235-1250 and 1259-1270 of Table 16, or SEQ ID NOS: 827-1258 of Table 16D, or SEQ ID NOS: 827-1230 and 1235-1250 of Table 16D, or SEQ ID NOS: 1299-1604 of Table 16F, or SEQ ID NOS: 1299-1598 of Table 16F, or a substantially identical or similar sequence(s), or any of the aforementioned nucleotide sequences of nucleic acids or primer pairs in which one or more thymine bases is substituted with a uracil base. 
In some embodiments, the nucleic acids are subjected to nucleic acid amplification using a plurality of primers or primer pairs, each containing, or consisting essentially of, a sequence or sequences selected from the sequences of primers in Table 16, or SEQ ID NOS: 49-520 of Table 16, or SEQ ID NOS: 49-452, 457-472 and 481-520 of Table 16, or SEQ ID NOS: 49-492 of Table 16, or SEQ ID NOS: 49-452, 457-472 and 481-492 of Table 16, or SEQ ID NOS: 49-480 of Table 16A, or SEQ ID NOS: 49-452 and 457-472 of Table 16A, or SEQ ID NOS: 521-826 of Table 16C, or SEQ ID NOS: 521-820 of Table 16C, or SEQ ID NOS: 827-1298 of Table 16, or SEQ ID NOS: 827-1230, 1235-1250 and 1259-1298 of Table 16, or SEQ ID NOS: 827-1270 of Table 16, or SEQ ID NOS: 827-1230, 1235-1250 and 1259-1270 of Table 16, or SEQ ID NOS: 827-1258 of Table 16D, or SEQ ID NOS: 827-1230 and 1235-1250 of Table 16D, or SEQ ID NOS: 1299-1604 of Table 16F, or SEQ ID NOS: 1299-1598 of Table 16F, or a substantially identical or similar sequence(s), or any of the aforementioned nucleotide sequences of nucleic acids or primer pairs in which one or more thymine bases is substituted with a uracil base. In some of the embodiments in which the nucleic acids are subjected to nucleic acid amplification using more than one, or a plurality of primers or primer pairs, the amplification is a multiplex amplification conducted in a single reaction mixture. In some embodiments, at least one primer or one primer pair includes a modification that facilitates nucleic acid manipulation, amplification, ligation and/or sequencing of amplification products and/or reduction or elimination of primer dimers. In particular embodiments, a modification is one that facilitates multiplex nucleic acid amplification, ligation and/or sequencing of products of multiplex amplification.


Methods for Detecting and/or Measuring a Microorganism Group


In some embodiments of the methods for detecting, determining the presence or absence of, and/or measuring one or more microorganisms in a sample, the method is designed to focus on detecting and/or measuring a certain group of microorganisms. In such embodiments, the one or more primer pairs used in the method is a combination of primers and/or primer pairs that includes a selected group or sub-group of microorganism-specific nucleic acids that enable a directed survey of the sample for identification of species of microorganisms that may be significant, for example, in certain states of health and disease or microbiota imbalance, e.g., dysbiosis. In some embodiments, combinations of nucleic acids include microorganism-specific nucleic acids, and/or primer pairs, that specifically amplify a nucleic acid sequence contained in the genome of one or more microorganisms (e.g., bacteria) implicated in one or more conditions, disorders and/or diseases (referred to herein as a “condition-attendant group” of microorganisms). In particular embodiments, the combination of nucleic acid primers and/or primer pairs includes a nucleic acid and/or a primer pair that specifically amplifies a target nucleic acid sequence contained within a genome of a microorganism selected from the microorganisms in Table 1, or Table 1, except for, or excluding, Actinomyces viscosus and/or Blautia coccoides, or Table 1, except for, or excluding, Actinomyces viscosus, Blautia coccoides and/or Helicobacter salomonis. In some embodiments, the combination includes a plurality of nucleic acids and/or primer pairs that include at least one nucleic acid primer pair that specifically amplifies a target nucleic acid in each of the microorganisms in Table 1, or Table 1, except for, or excluding, Actinomyces viscosus and/or Blautia coccoides, or Table 1, except for, or excluding, Actinomyces viscosus, Blautia coccoides and/or Helicobacter salomonis. 
In some embodiments, the plurality of primer pairs includes primer pairs that specifically amplify genomic target nucleic acids contained within at least 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60 or 70 of the microorganisms in Table 1, or Table 1, except for, or excluding, Actinomyces viscosus and/or Blautia coccoides, or Table 1, except for, or excluding, Actinomyces viscosus, Blautia coccoides and/or Helicobacter salomonis. In particular embodiments, the target nucleic acid sequences contained in the genome of the different microorganisms are unique to each of the microorganisms. In some embodiments, a combination of nucleic acids and/or nucleic acid primer pairs includes two or more nucleic acids and/or nucleic acid primer pairs that specifically amplify a unique nucleic acid sequence contained in the genome of one or more of the Group A microorganisms (see Table 2A), which are species implicated as having a role in multiple conditions, diseases and/or disorders, including, for example oncological conditions including, for example, response to immune-oncology treatment and cancer, gastrointestinal disorders, including, for example, irritable bowel syndrome, inflammatory bowel disease and coeliac disease, and autoimmune diseases, including, for example, lupus and rheumatoid arthritis. In some embodiments, the combination of nucleic acids and/or nucleic acid primer pairs includes a set of nucleic acid primer pairs in which each different nucleic acid primer pair specifically amplifies a different unique nucleic acid sequence contained in a different one of each of the genomes of the different microorganisms in Group A. In some embodiments, the combination of nucleic acids and/or nucleic acid primer pairs includes nucleic acids and/or nucleic acid primer pairs that specifically bind to, hybridize to and/or amplify sequences as set forth in Table 2A for exemplary nucleic acids, primers and primer pairs for genomes of Group A microorganisms. 
In some embodiments, the combination of nucleic acids and/or nucleic acid primer pairs includes nucleic acids and/or nucleic acid primer pairs having a nucleotide sequence or sequences as set forth in Table 2A for exemplary nucleic acids, primers and primer pairs for genomes of Group A microorganisms.


In some embodiments, a combination of nucleic acids and/or nucleic acid primer pairs for use in a method of detecting and/or measuring a certain group of microorganisms includes two or more nucleic acids and/or nucleic acid primer pairs that specifically amplify a unique nucleic acid sequence contained in the genome of one or more of the Group B microorganisms (see Table 2B), which are species implicated as having a role in response to immuno-oncology treatment. In some embodiments, the combination of nucleic acids and/or nucleic acid primer pairs includes a set of nucleic acid primer pairs in which each different nucleic acid primer pair specifically amplifies a different unique nucleic acid sequence contained in a different one of each of the genomes of the different microorganisms in Group B. In some embodiments, the combination of nucleic acids and/or nucleic acid primer pairs includes nucleic acids and/or nucleic acid primer pairs that specifically bind to, hybridize to and/or amplify sequences as set forth in Table 2B for exemplary nucleic acids, primers and primer pairs for genomes of Group B microorganisms. In some embodiments, the combination of nucleic acids and/or nucleic acid primer pairs includes nucleic acids and/or nucleic acid primer pairs having a nucleotide sequence or sequences as set forth in Table 2B for exemplary nucleic acids, primers and primer pairs for genomes of Group B microorganisms.


In some embodiments, a combination of nucleic acids and/or nucleic acid primer pairs for use in a method of detecting and/or measuring a certain group of microorganisms includes two or more nucleic acids and/or nucleic acid primer pairs that specifically amplify a unique nucleic acid sequence contained in the genome of one or more of the Group C microorganisms (see Table 2C), or the Group C microorganisms excluding Helicobacter salomonis (Subgroup 1 of the Group C microorganisms) which are species implicated as having a role in cancer. In some embodiments, the combination of nucleic acids and/or nucleic acid primer pairs includes a set of nucleic acid primer pairs in which each different nucleic acid primer pair specifically amplifies a different unique nucleic acid sequence contained in a different one of each of the genomes of the different microorganisms in Group C, or the Group C microorganisms excluding Helicobacter salomonis (Subgroup 1 of the Group C microorganisms). In some embodiments, the combination of nucleic acids and/or nucleic acid primer pairs includes nucleic acids and/or nucleic acid primer pairs that specifically bind to, hybridize to and/or amplify sequences as set forth in Table 2C for exemplary nucleic acids, primers and primer pairs for genomes of Group C microorganisms. 
In some embodiments, the combination of nucleic acids and/or nucleic acid primer pairs includes nucleic acids and/or nucleic acid primer pairs that specifically bind to, hybridize to and/or amplify sequences selected from SEQ ID NOS: 1616, 1619, 1620, 1625-1628, 1635-1640, 1699, 1700, 1705-1708, 1752, 1753, 1784-1786, 1817-1820, 1827, 1828, 1840, 1841, 1844, 1845, 1852-1859, 1899, 1900, 1904, 1905, 1932, 1933, 1956-1958, 1975, 1976 of Table 17, and/or a substantially identical or similar sequence, or sequences selected from SEQ ID NOS: 1616, 1619, 1620, 1625-1628, 1635-1640, 1699, 1700, 1705-1708, 1752, 1753, 1784-1786, 1827, 1828, 1840, 1841, 1844, 1845, 1852-1859, 1899, 1900, 1904, 1905, 1932, 1933, 1956, 1957, 1958 of Table 17, and/or a substantially identical or similar sequence. In some embodiments, the combination of nucleic acids and/or nucleic acid primer pairs includes nucleic acids and/or nucleic acid primer pairs having a nucleotide sequence or sequences as set forth in Table 2C for exemplary nucleic acids, primers and primer pairs for genomes of Group C microorganisms. In some embodiments, the combination of nucleic acids and/or nucleic acid primer pairs includes nucleic acids and/or nucleic acid primer pairs having a nucleotide sequence or sequences selected from SEQ ID NOS: 71, 72, 77-80, 89-96, 109-120, 237-242, 249-256, 343-346, 407-412, 493-496, 511-520, 521-524, 547-550, 555-558, 561-568, 571-586, 665-668, 675-678, 731-734, 779-784, and/or SEQ ID NOS: 849, 850, 855-858, 867-874, 887-898, 1012-1020, 1025-1034, 1121-1124, 1185-1190, 1271-1276, 1289-1298, 1299-1302, 1325-1328, 1333-1336, 1339-1346, 1349-1364, 1443-1446, 1453-1456, 1509-1512, 1557-1562, in Table 16, or substantially identical or similar sequences, or any of the aforementioned nucleotide sequences of nucleic acids or primer pairs in which one or more thymine bases is substituted with a uracil base. 
In some embodiments, the combination of nucleic acids and/or nucleic acid primer pairs includes nucleic acids and/or nucleic acid primer pairs having a nucleotide sequence or sequences selected from SEQ ID NOS: 71, 72, 77-80, 89-96, 109-120, 237-242, 249-256, 343-346, 407-412, 493-496, 511-520 and/or SEQ ID NOS: 849, 850, 855-858, 867-874, 887-898, 1012-1020, 1025-1034, 1121-1124, 1185-1190, 1271-1276, 1289-1298 in Table 16, or substantially identical or similar sequences, or any of the aforementioned nucleotide sequences of nucleic acids or primer pairs in which one or more thymine bases is substituted with a uracil base. In some embodiments, the combination of nucleic acids and/or nucleic acid primer pairs includes nucleic acids and/or nucleic acid primer pairs having a nucleotide sequence or sequences selected from SEQ ID NOS: 71, 72, 77-80, 89-96, 109-120, 237-242, 249-256, 343-346, 407-412 and/or SEQ ID NOS: 849, 850, 855-858, 867-874, 887-898, 1012-1020, 1025-1034, 1121-1124, 1185-1190 in Table 16, or substantially identical or similar sequences, or any of the aforementioned nucleotide sequences of nucleic acids or primer pairs in which one or more thymine bases is substituted with a uracil base.


In some embodiments, a combination of nucleic acids and/or nucleic acid primer pairs for use in a method of detecting and/or measuring a certain group of microorganisms includes two or more nucleic acids and/or nucleic acid primer pairs that specifically amplify a unique nucleic acid sequence contained in the genome of one or more of the Group D microorganisms (see Table 2D), which are species implicated as having a role in gastrointestinal disorders, including, for example, irritable bowel syndrome, inflammatory bowel disease and coeliac disease. In some embodiments, the combination of nucleic acids and/or nucleic acid primer pairs includes a set of nucleic acid primer pairs in which each different nucleic acid primer pair specifically amplifies a different unique nucleic acid sequence contained in a different one of each of the genomes of the different microorganisms in Group D. In some embodiments, the combination of nucleic acids and/or nucleic acid primer pairs includes nucleic acids and/or nucleic acid primer pairs that specifically bind to, hybridize to and/or amplify sequences as set forth in Table 2D for exemplary nucleic acids, primers and primer pairs for genomes of Group D microorganisms. In some embodiments, the combination of nucleic acids and/or nucleic acid primer pairs includes nucleic acids and/or nucleic acid primer pairs having a nucleotide sequence or sequences as set forth in Table 2D for exemplary nucleic acids, primers and primer pairs for genomes of Group D microorganisms.


In some embodiments, a combination of nucleic acids and/or nucleic acid primer pairs for use in a method of detecting and/or measuring a certain group of microorganisms includes two or more nucleic acids and/or nucleic acid primer pairs that specifically amplify a unique nucleic acid sequence contained in the genome of one or more of the Group E microorganisms (see Table 2E), which are species implicated as having a role in autoimmune disorders, including, for example, lupus and rheumatoid arthritis. In some embodiments, the combination of nucleic acids and/or nucleic acid primer pairs includes a set of nucleic acid primer pairs in which each different nucleic acid primer pair specifically amplifies a different unique nucleic acid sequence contained in a different one of each of the genomes of the different microorganisms in Group E. In some embodiments, the combination of nucleic acids and/or nucleic acid primer pairs includes nucleic acids and/or nucleic acid primer pairs that specifically bind to, hybridize to and/or amplify sequences as set forth in Table 2E for exemplary nucleic acids, primers and primer pairs for genomes of Group E microorganisms. In some embodiments, the combination of nucleic acids and/or nucleic acid primer pairs includes nucleic acids and/or nucleic acid primer pairs having a nucleotide sequence or sequences as set forth in Table 2E for exemplary nucleic acids, primers and primer pairs for genomes of Group E microorganisms.


In some embodiments of detecting, determining the presence or absence of, and/or measuring as provided herein, the one or more nucleic acid amplification products are detected via a labeled probe having a nucleic acid sequence of compositions provided herein that specifically identifies a particular nucleic acid product. For example, such methods can employ a nucleic acid microarray of oligonucleotides attached to a substrate, e.g., a chip, to capture amplification products; the microarray is then contacted with labeled (e.g., fluorescently labeled) specific probes under conditions suitable for hybridization, and the probes can be detected upon binding to a complementary product, thereby detecting the presence of a microorganism in the sample. Methods for detection of labels are known in the art and include, for example, optical methods such as scanning using confocal laser microscopy or a CCD camera. Such methods also allow for quantitation of hybridization to assess abundance of the labeled nucleic acid product. In some embodiments, the presence or absence of one or more nucleic acid amplification products is detected by obtaining nucleotide sequence information of one or more nucleic acid amplification products. Methods for sequencing nucleic acids are described herein and/or known in the art. The sequence of an amplification product can also be used to identify the microorganism at various levels of specificity, e.g., kingdom, phylum, class, order, family, genus and/or species. If a microorganism of Table 1 or 2 for which presence or absence is being determined is present in the sample, the sequence of at least one amplification product will be the target sequence specifically amplified by the one or more primer pairs from the genome of the microorganism and, thus, the presence of the amplification product is detected. 
If the microorganism is absent from the sample, no amplification product will be produced that contains the target sequence specifically amplified by the one or more primer pairs, and, thus, the absence of the amplification product is detected. In some embodiments, detecting the presence or absence of a nucleic acid amplification product includes comparing the sequence of the one or more nucleic acid amplification products to nucleic acid sequences of the genome of one or more of the microorganisms in Table 1 or 2. Genome sequences for microorganisms in Table 1 or 2 are available in public databases (e.g., NCBI public database; www.ncbi.nlm.nih.gov/genome/microbes/). In some embodiments, comparing the sequence of a nucleic acid amplification product to reference genome sequences includes conducting computer-assisted alignment of the sequence and mapping it to a reference genome. Exemplary nucleotide sequence analysis workflows for mapping sequence reads of amplification products are provided herein. In particular embodiments, detecting the presence or absence of a nucleic acid amplification product includes detecting the presence or absence of an amplification product containing a sequence in SEQ ID NOS: 1605-1979 in Table 17, or SEQ ID NOS: 1605-1806, 1809-1816, 1821-1970, 1972-1974 and 1977-1979 of Table 17, or SEQ ID NOS: 1605-1826 in Table 17, or SEQ ID NOS: 1605-1806, 1809-1816 and 1821-1826 in Table 17, or SEQ ID NOS: 1605-1820 in Table 17A, or SEQ ID NOS: 1605-1806 and 1809-1816 in Table 17A, or SEQ ID NOS: 1827-1979 in Table 17C, or SEQ ID NOS: 1827-1976 in Table 17C, or a substantially identical or similar sequence.
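The presence/absence logic described above can be sketched as a comparison of sequenced amplification products against per-microorganism target sequences. The following is a minimal illustration only: it substitutes exact substring matching (on either strand) for the computer-assisted alignment and mapping referred to above, and the organism labels, target strings and reads are hypothetical placeholders rather than sequences of Table 17.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    complement = str.maketrans("ACGT", "TGCA")
    return seq.upper().translate(complement)[::-1]


def call_presence(reads, targets):
    """Map each organism to True if any read contains its target on either strand.

    Exact matching is an assumption made for brevity; a real workflow would
    align reads to reference sequences and tolerate mismatches.
    """
    reads = [r.upper() for r in reads]
    calls = {}
    for organism, target in targets.items():
        forward = target.upper()
        reverse = reverse_complement(forward)
        calls[organism] = any(forward in r or reverse in r for r in reads)
    return calls


if __name__ == "__main__":
    # Hypothetical target sequences, for illustration only.
    targets = {"organism_A": "ATGCCGTA", "organism_B": "GGGTTTAC"}
    reads = ["TTATGCCGTAGG", "CCCCCCCC"]
    print(call_presence(reads, targets))
```

In this sketch a microorganism is called present when at least one read carries its target sequence and absent when no read does, mirroring the two cases described in the preceding paragraph.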


Methods for Increasing the Accuracy, Specificity and/or Sensitivity of Detecting and/or Measuring Microorganisms in a Sample


In some embodiments, a method provided herein for detecting, determining the presence or absence of, and/or measuring one or more microorganisms in a sample includes (a) subjecting nucleic acids in or from the sample to nucleic acid amplification using one or more, or a combination of, primer pairs capable of separately amplifying nucleic acids separately containing sequences of one or more different hypervariable regions of a prokaryotic 16S rRNA gene and (b) detecting one or more amplification products, thereby detecting, or determining the presence or absence of, one or more microorganisms in the sample. In some embodiments, the microorganism(s) is/are bacteria. In some embodiments, the presence or absence of one or more microorganisms is detected at the level of the genus of the microorganisms. In some embodiments, the presence or absence of one or more microorganisms is detected at the level of the species of the microorganism. In some embodiments, the prokaryotic 16S rRNA gene is a bacterial gene. In some embodiments, the one or more, or combination of, primer pairs separately amplify nucleic acids containing sequences of different variable regions. In some embodiments, the primers of the primer pairs are directed to, or bind to, or hybridize to nucleic acid sequences contained in conserved regions of a prokaryotic 16S rRNA gene. 
In some embodiments, the one or more, or combination of, primer pairs amplify, or separately amplify, nucleic acids separately containing sequences of 2 or more, 3 or more, 4 or more, 5 or more, 6 or more, 7 or more, 8 or more, or 9 different hypervariable regions of a prokaryotic 16S rRNA gene thereby producing amplified copies of the nucleic acids separately containing sequences of the 2 or more, 3 or more, 4 or more, 5 or more, 6 or more, 7 or more, 8 or more or 9 different hypervariable regions of the 16S rRNA gene of one or more microorganisms, wherein, in some embodiments, the amplified copies of different hypervariable regions are separate amplicons. In some embodiments, the one or more, or combination of, primer pairs separately amplify nucleic acids separately containing sequences of 8 different hypervariable regions of a prokaryotic 16S rRNA gene. In some embodiments, the 8 different hypervariable regions are V2-V9. In some embodiments, the one or more, or combination of, primer pairs separately amplify nucleic acids separately containing sequences of 3 or more different hypervariable regions of a prokaryotic 16S rRNA gene wherein one of the 3 or more different regions is a V5 region thereby producing amplified copies of the nucleic acids separately containing sequences of 3 or more different hypervariable regions of the 16S rRNA gene of one or more microorganisms. In some embodiments, the primer pairs include degenerate sequences of one or more primers in one or more primer pairs. In some embodiments, the one or more, or combination of, primer pairs is selected from any of the primer pairs described herein that separately amplify nucleic acids containing sequences of one or more hypervariable regions of a prokaryotic 16S rRNA gene. 
For example, in some embodiments, the one or more, or a combination of, primer pairs that are capable of separately amplifying nucleic acids containing sequences of one or more hypervariable regions of a prokaryotic 16S rRNA gene comprises one or more primer pairs containing sequences selected from Table 15, or SEQ ID NOS: 1-24 in Table 15 and/or SEQ ID NOS: 25-48 in Table 15, or SEQ ID NOS: 11-16, 23 and 24 in Table 15 and/or SEQ ID NOS: 35-40, 47 and 48 in Table 15, or substantially identical or similar sequences, and optionally wherein one or more thymine bases is substituted with a uracil base. In some embodiments, the one or more, or a combination of, primer pairs that separately amplify nucleic acids containing sequences of one or more hypervariable regions of a prokaryotic 16S rRNA gene comprises one or more primer pairs containing sequences selected from SEQ ID NOS: 25-48 in Table 15 and/or SEQ ID NOS: 35-40, 47 and 48 in Table 15, or substantially identical or similar sequences in which one or more thymine bases is substituted with a uracil base. In some embodiments, the presence or absence of one or more nucleic acid amplification products is detected by obtaining nucleotide sequence information of one or more nucleic acid amplification products. If only one sequence is obtained for each of one or more hypervariable regions amplified, the sequence information is indicative of the presence of only one microorganism in the sample. If 2 or more different sequences are obtained for each of one or more hypervariable regions amplified, the sequence information is indicative of the presence of 2 or more different microorganisms in the sample. 
Additionally, the combined results of the amplifications using 16S rRNA gene primers that separately amplify multiple different hypervariable regions yield increased accuracy in the detection and/or measurement of one or more microorganisms in a sample by reducing false negatives or false positives that may occur in basing a determination solely on the results of amplification using 16S rRNA primers that amplify one or only a few (e.g., less than 8 or less than 5) hypervariable regions or that amplify combined regions as a single amplicon, e.g., V2-V3, or V3-V4, etc., if the amplification primers are not directed to highly conserved sequences flanking the multiple regions. For example, due to possible sequence variations in 16S rRNA gene conserved regions in some species or strains of microorganisms (e.g., bacteria), primers designed to amplify a particular hypervariable region of all bacteria may fail to amplify some bacterial nucleic acids present in a sample, thereby yielding a false negative result. This failure may be compounded if the primers are designed to amplify combined regions as a single amplicon and are not directed to highly conserved sequences flanking the multiple regions, because not only is one region potentially not amplified, but two or more regions may not be amplified. However, amplification of the same nucleic acids using 16S rRNA gene primers designed to separately amplify multiple hypervariable regions in such a microorganism would be likely to amplify at least one hypervariable sequence in the microorganism and enable detection of the presence of the microorganism in the sample. Separately amplifying multiple hypervariable regions increases the coverage of the sequence, provides more useful information, and increases the accuracy and resolution of detection. 
Also, when 16S rRNA gene primers that separately amplify multiple (e.g., 3, 4, 5, 6, 7, 8 or 9) hypervariable regions are used, the number (and sequences) of amplification products using the different hypervariable region primers provides a basis on which to filter out and eliminate sequences of hypervariable regions of a particular microorganism that are amplified in only one (or less than 3, or less than 4, or less than 5, or less than 6, or another threshold amount) 16S rRNA gene hypervariable region amplification(s) from consideration as unreliable since 16S rRNA gene nucleic acids from a bacterial microorganism that is truly present in a sample should be amplified in the majority of amplifications using primers that separately amplify multiple hypervariable regions. The sequence of an amplification product can also be used to identify the microorganism at various levels of specificity, e.g., kingdom, phylum, class, order, family, genus and/or species. In some embodiments, detecting, determining the presence or absence of, and/or measuring a nucleic acid amplification product includes comparing the sequence of the one or more nucleic acid amplification products to nucleic acid sequences of prokaryotic, e.g., bacterial, 16S rRNA genes of one or more of the microorganisms. Sequences of prokaryotic 16S rRNA genes are available in public databases (see, e.g., the GreenGenes bacterial 16S rRNA gene sequence; e.g., www.greengenes.lbl.gov). In some embodiments, comparing the sequence of a nucleic acid amplification product to reference genome sequences includes conducting computer-assisted alignment of the sequence and mapping it to a reference genome. Exemplary nucleotide sequence analysis workflows for mapping sequence reads of amplification products are provided herein. In some embodiments, relative and/or absolute levels of one or more microorganisms are determined or measured. 
For example, in some embodiments, the level of abundance of one or more nucleic acid amplification products and/or sequence reads can be measured to provide relative and/or absolute levels of one or more microorganisms. Techniques for quantifying nucleic acids (e.g., amplification products) and/or sequence reads are known in the art and/or provided herein.
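The filtering and quantification described above can be sketched as follows. This is a minimal illustration under stated assumptions: read counts are assumed to be already assigned to a taxon and a hypervariable region, the threshold `min_regions` is a free parameter standing in for the "threshold amount" mentioned above, and summing reads across regions is only one simple way to estimate relative levels. The taxon labels and counts are hypothetical.

```python
def filter_by_region_support(region_reads, min_regions):
    """Keep taxa observed (read count > 0) in at least `min_regions` regions.

    `region_reads` maps taxon -> {region name -> read count}.
    """
    kept = {}
    for taxon, counts in region_reads.items():
        supported = sum(1 for n in counts.values() if n > 0)
        if supported >= min_regions:
            kept[taxon] = counts
    return kept


def relative_abundance(region_reads):
    """Return each taxon's share of all reads, summed over its regions."""
    totals = {taxon: sum(counts.values()) for taxon, counts in region_reads.items()}
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {taxon: 0.0 for taxon in totals}
    return {taxon: n / grand_total for taxon, n in totals.items()}


if __name__ == "__main__":
    # Hypothetical read counts per hypervariable region, for illustration only.
    region_reads = {
        "genus_X": {"V2": 120, "V3": 95, "V4": 110, "V5": 0,
                    "V6": 80, "V7": 60, "V8": 75, "V9": 0},
        "genus_Y": {"V4": 12},  # seen in a single region only
    }
    reliable = filter_by_region_support(region_reads, min_regions=3)
    print(sorted(reliable))
    print(relative_abundance(reliable))
```

Here a taxon supported by only one hypervariable region is removed before relative levels are computed, which reflects the rationale that a microorganism truly present in the sample should be amplified in most of the separate hypervariable region amplifications.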


In some embodiments, a method provided herein for detecting, determining the presence or absence of, and/or measuring one or more microorganisms in a sample includes (a) subjecting nucleic acids in or from the sample to nucleic acid amplification using a combination of primer pairs comprising (i) one or more primer pairs capable of amplifying nucleic acids containing sequences of one or more hypervariable regions of a prokaryotic 16S rRNA gene (referred to as the “16S rRNA gene primers and primer pairs”), and (ii) one or more primer pairs capable of amplifying a target nucleic acid sequence contained within the genome of a microorganism that is not contained within a hypervariable region of a prokaryotic 16S rRNA gene, wherein different primer pairs amplify different target nucleic acid sequences contained within the genome of different microorganisms (referred to as the “non-16S rRNA gene primers and primer pairs”); and (b) detecting one or more amplification products, thereby detecting or determining the presence or absence of a microorganism. In some embodiments, the microorganism(s) is/are bacteria. In some embodiments, the prokaryotic 16S rRNA gene is a bacterial gene and/or the prokaryotic microorganism is a bacterium. In some embodiments, the one or more primer pairs that amplify a nucleic acid containing a sequence of a hypervariable region of a prokaryotic 16S rRNA gene separately amplify nucleic acid sequences of different hypervariable regions. In some embodiments, the primers of the one or more primer pairs of (i) are directed to, or bind to, or hybridize to nucleic acid sequences contained in conserved regions of a prokaryotic 16S rRNA gene. In some embodiments, the amplification is a multiplex amplification conducted in a single reaction mixture. 
In some embodiments a method provided herein for detecting one or more microorganisms in a sample includes (a) subjecting nucleic acids in or from a sample to two or more separate nucleic acid amplification reactions using a first set of primer pairs for one nucleic acid amplification reaction and a second set of primer pairs for the other nucleic acid amplification reaction, wherein: (i) the first set of primer pairs comprises one or more primer pairs capable of amplifying a nucleic acid containing a sequence of a hypervariable region of a prokaryotic 16S rRNA gene (referred to as the “16S rRNA gene primers and primer pairs”), and (ii) the second set of primer pairs comprises one or more primer pairs capable of amplifying, or specifically amplifying, a target nucleic acid sequence contained within the genome of a microorganism that is not contained within a hypervariable region of a prokaryotic 16S rRNA gene, wherein different primer pairs amplify different target nucleic acid sequences contained within the genome of different microorganisms (referred to as the “non-16S rRNA gene primers and primer pairs”); and (b) detecting one or more amplification products, thereby detecting or determining the presence or absence of a microorganism. In some embodiments, the microorganism(s) is/are bacteria. In some embodiments, the prokaryotic 16S rRNA gene is a bacterial gene and/or the prokaryotic microorganism is a bacterium. In some embodiments, the one or more primer pairs that amplifies a nucleic acid containing a sequence of a hypervariable region of a prokaryotic 16S rRNA gene separately amplify nucleic acids containing sequences of different hypervariable regions. In some embodiments, the primers of the one or more primer pairs of (i) are directed to, or bind to, or hybridize to nucleic acid sequences contained in conserved regions of a prokaryotic 16S rRNA gene. In some embodiments, the amplification is a multiplex amplification conducted in a single reaction mixture.


In any embodiments of methods for detecting and/or measuring the presence or absence of one or more microorganisms in a sample wherein the method includes subjecting nucleic acids in or from the sample to nucleic acid amplification using a combination of, or first and second sets of, 16S rRNA gene primer pairs and non-16S rRNA gene primer pairs, respectively, the nucleic acid amplification can be performed according to any of the embodiments provided herein for such amplification. For example, in some embodiments the target nucleic acid sequence contained within a genome of a prokaryotic microorganism, e.g., bacteria, is unique to the microorganism. In some embodiments, the one or more 16S rRNA gene primer pairs amplify a nucleic acid sequence in a plurality of microorganisms, e.g., bacteria, from different genera. In some embodiments, the sample includes nucleic acids from a plurality of different microorganisms. In some embodiments, the sample is a biological sample, such as, for example, a sample of contents of the alimentary tract of an animal. In some embodiments, the sample is a fecal sample. In some embodiments, each primer of the one or more 16S rRNA gene primer pairs contains less than 10, less than 9, less than 8, less than 7, less than 6, less than 5, less than 4, less than 3, or less than 2 contiguous nucleotides of sequence identical to a sequence of contiguous nucleotides of another primer in the combination of primer pairs. In some embodiments, the nucleic acid sequences being amplified by the one or more 16S rRNA gene primer pairs are less than about 300 bp, less than about 250 bp, less than about 200 bp, less than about 175 bp, less than about 150 bp, or less than about 125 bp in length. 
In some embodiments, the 16S rRNA gene primer pairs separately amplify nucleic acids separately containing sequences of 3 or more, 4 or more, 5 or more, 6 or more, 7 or more, 8 or more or 9 different hypervariable regions of a prokaryotic 16S rRNA gene thereby producing amplified copies of the nucleic acids containing sequences of the 3 or more, 4 or more, 5 or more, 6 or more, 7 or more, 8 or more or 9 different hypervariable regions of the 16S rRNA gene of one or more microorganisms, wherein the amplified copies of different hypervariable regions are separate amplicons. In some embodiments, the 16S rRNA gene primer pairs separately amplify nucleic acids separately containing sequences of 8 different hypervariable regions of a prokaryotic 16S rRNA gene. In some embodiments, the 8 different hypervariable regions are V2-V9. In some embodiments, the 16S rRNA gene primer pairs separately amplify nucleic acids separately containing sequences of 3 or more hypervariable regions of a prokaryotic 16S rRNA gene wherein one of the 3 or more regions is a V5 region thereby producing amplified copies of the nucleic acids containing sequences of the 3 or more hypervariable regions of the 16S rRNA gene of one or more microorganisms. In some embodiments, the combination of primer pairs includes degenerate sequences of one or more primers in one or more primer pairs. In some embodiments, the 16S rRNA gene primer pair(s) comprise primers and/or primer pairs containing, or consisting essentially of, a sequence or sequences of a primer or primer pair in Table 15, or SEQ ID NOS: 1-24 in Table 15 and/or SEQ ID NOS: 25-48 in Table 15, or SEQ ID NOS: 11-16, 23 and 24 in Table 15 and/or SEQ ID NOS: 35-40, 47 and 48 in Table 15, or substantially identical or similar sequences, and optionally wherein one or more thymine bases is substituted with a uracil base. 
In some embodiments, the one or more non-16S rRNA gene primer pairs specifically amplifies a target nucleic acid sequence contained within a genome of a microorganism selected from the microorganisms of Table 1, or Table 1, except for, or excluding, Actinomyces viscosus and/or Blautia coccoides, or Table 1, except for, or excluding, Actinomyces viscosus, Blautia coccoides and/or Helicobacter salomonis. In some embodiments, the target nucleic acid is unique to the microorganism. In some embodiments, the sample includes nucleic acids from a plurality of different microorganisms listed in Table 1, or Table 1, except for, or excluding, Actinomyces viscosus and/or Blautia coccoides, or Table 1, except for, or excluding, Actinomyces viscosus, Blautia coccoides and/or Helicobacter salomonis. In some such embodiments, amplified copies of a plurality of different microorganisms in Table 1 are produced. In some embodiments, the sample includes a mixture of nucleic acids of one or more, or a plurality of, microorganisms selected from among the microorganisms listed in Table 1, or Table 1, except for, or excluding, Actinomyces viscosus and/or Blautia coccoides, or Table 1, except for, or excluding, Actinomyces viscosus, Blautia coccoides and/or Helicobacter salomonis, and one or more microorganisms, e.g., bacteria, not listed in Table 1. In some embodiments, at least one, or one or more, target nucleic acid sequence(s) comprises or consists essentially of a nucleotide sequence selected from the nucleotide sequences of SEQ ID NOS: 1605-1979 in Table 17, or SEQ ID NOS: 1605-1806, 1809-1816, 1821-1970, 1972-1974 and 1977-1979 of Table 17, or SEQ ID NOS: 1605-1826 in Table 17, or SEQ ID NOS: 1605-1806, 1809-1816 and 1821-1826 in Table 17, or SEQ ID NOS: 1605-1820 in Table 17A, or SEQ ID NOS: 1605-1806 and 1809-1816 in Table 17A, or SEQ ID NOS: 1827-1979 in Table 17C, or SEQ ID NOS: 1827-1976 in Table 17C, or a substantially identical or similar sequence. 
In some embodiments, at least one, or one or more, product(s) of the nucleic acid amplification comprises, or consists essentially of, a nucleotide sequence selected from SEQ ID NOS: 1605-1979 in Table 17, or SEQ ID NOS: 1605-1806, 1809-1816, 1821-1970, 1972-1974 and 1977-1979 of Table 17, or SEQ ID NOS: 1605-1826 in Table 17, or SEQ ID NOS: 1605-1806, 1809-1816 and 1821-1826 in Table 17, or SEQ ID NOS: 1605-1820 in Table 17A, or SEQ ID NOS: 1605-1806 and 1809-1816 in Table 17A, or SEQ ID NOS: 1827-1979 in Table 17C, or SEQ ID NOS: 1827-1976 in Table 17C, or a substantially identical or similar sequence, or the complement thereof, and optionally having one or more primer sequences at the 5′ and/or 3′ end(s) of the sequence, such as any of the primer sequences provided herein. In some embodiments, at least one, or one or more, product(s) of the nucleic acid amplification is less than about 500, less than about 475, less than about 450, less than about 400, less than about 375, less than about 350, less than about 300, less than about 275, less than about 250, less than about 200, less than about 175, less than about 150, or less than about 100 nucleotides in length. In some embodiments, the at least one non-16S rRNA gene primer pair does not detectably amplify a nucleic acid sequence contained within any genus other than the genus of the microorganism containing the target nucleic acid sequence. In some embodiments, the at least one non-16S rRNA gene primer pair does not detectably amplify a nucleic acid sequence contained within any species other than the species of the microorganism containing the target nucleic acid sequence. 
In some embodiments, at least one primer of the non-16S rRNA gene primer pair, or at least one non-16S rRNA gene primer pair, contains, or consists essentially of, the sequence or sequences of a primer or primer pair in Table 16, or SEQ ID NOS: 49-520 of Table 16, or SEQ ID NOS: 49-452, 457-472 and 481-520 of Table 16, or SEQ ID NOS: 49-492 of Table 16, or SEQ ID NOS: 49-452, 457-472 and 481-492 of Table 16, or SEQ ID NOS: 49-480 of Table 16A, or SEQ ID NOS: 49-452 and 457-472 of Table 16A, or SEQ ID NOS: 521-826 of Table 16C, or SEQ ID NOS: 521-820 of Table 16C, or SEQ ID NOS: 827-1298 of Table 16, or SEQ ID NOS: 827-1230, 1235-1250 and 1259-1298 of Table 16, or SEQ ID NOS: 827-1270 of Table 16, or SEQ ID NOS: 827-1230, 1235-1250 and 1259-1270 of Table 16, or SEQ ID NOS: 827-1258 of Table 16D, or SEQ ID NOS: 827-1230 and 1235-1250 of Table 16D, or SEQ ID NOS: 1299-1604 of Table 16F, or SEQ ID NOS: 1299-1598 of Table 16F, or a substantially identical or similar sequence(s), or any of the aforementioned nucleotide sequences of nucleic acids or primer pairs in which one or more thymine bases is substituted with a uracil base. 
In some embodiments, the nucleic acids are subjected to nucleic acid amplification using a plurality of non-16S rRNA gene primers or primer pairs, each containing, or consisting essentially of, a sequence or sequences of a primer pair in Table 16, or SEQ ID NOS: 49-520 of Table 16, or SEQ ID NOS: 49-452, 457-472 and 481-520 of Table 16, or SEQ ID NOS: 49-492 of Table 16, or SEQ ID NOS: 49-452, 457-472 and 481-492 of Table 16, or SEQ ID NOS: 49-480 of Table 16A, or SEQ ID NOS: 49-452 and 457-472 of Table 16A, or SEQ ID NOS: 521-826 of Table 16C, or SEQ ID NOS: 521-820 of Table 16C, or SEQ ID NOS: 827-1298 of Table 16, or SEQ ID NOS: 827-1230, 1235-1250 and 1259-1298 of Table 16, or SEQ ID NOS: 827-1270 of Table 16, or SEQ ID NOS: 827-1230, 1235-1250 and 1259-1270 of Table 16, or SEQ ID NOS: 827-1258 of Table 16D, or SEQ ID NOS: 827-1230 and 1235-1250 of Table 16D, or SEQ ID NOS: 1299-1604 of Table 16F, or SEQ ID NOS: 1299-1598 of Table 16F, or a substantially identical or similar sequence(s), or any of the aforementioned nucleotide sequences of nucleic acids or primer pairs in which one or more thymine bases is substituted with a uracil base. In some embodiments, at least one primer or one primer pair in the combination of primer pairs includes a modification that facilitates nucleic acid manipulation, amplification, ligation and/or sequencing of amplification products and/or reduction or elimination of primer dimers. In particular embodiments, a modification is one that facilitates multiplex nucleic acid amplification, ligation and/or sequencing of products of multiplex amplification.


In some embodiments of methods for detecting, determining the presence or absence of, and/or measuring one or more microorganisms in a sample wherein the method includes subjecting nucleic acids in or from the sample to nucleic acid amplification using a combination of, or first and second sets of, 16S rRNA gene primer pairs and non-16S rRNA gene primer pairs, the one or more nucleic acid amplification products is/are detected by obtaining nucleotide sequence information of one or more nucleic acid amplification products. The detecting, determining the presence or absence of and/or measuring of one or more microorganisms in a sample may be based on nucleotide sequence information of products from one, some, a minority, most, the majority, or substantially all, or less than the majority, or less than substantially all of the amplifications performed using each of the different primer pairs in a combination of primer pairs employed in the method. In some embodiments, detecting and/or measuring only one, or more, particular microorganism(s) in a sample is without regard to the presence or absence of any other, or some other, or most other, microorganisms (or nucleic acids from other sources) in the sample. Thus, in some embodiments of the methods of detecting, determining presence or absence of, and/or measuring a microorganism, a determination of presence or absence may be made based on nucleotide sequence information of products from only some of the amplifications performed with different primer pairs, or of products from all of a smaller or limited number of amplifications. In some instances, even without a determination of the identities of some or all of the microorganisms, the number of different sequences in the products of amplification of sample nucleic acids with the combination of different primers (e.g., 16S rRNA gene primers (and/or the number of and particular hypervariable regions amplified by the primers) vs. non-16S rRNA gene primers (and/or the number of and particular target nucleic acids amplified by the primers)) provides useful information not only regarding the presence or absence of a microorganism of interest but also whether other microorganisms are present and relative abundance of a microorganism. For example, in some instances, no amplification products are detected from amplification of sample nucleic acids using one or more non-16S rRNA gene primer pairs. A lack of a particular product from the amplification indicates that one or more microorganisms containing a target nucleic acid that is amplified by the one or more non-16S rRNA gene primer pairs is absent from the sample or present in an amount below the limit of detection. In this case, if amplification products are detected from amplification of sample nucleic acids using one or more 16S rRNA gene primers, this indicates the presence of other microorganisms in the sample. Furthermore, if 2 or more amplification products are detected from amplification of a particular hypervariable region using the one or more 16S rRNA gene primers in this case, and the sequences of the 2 or more amplification products are different, this indicates the presence of a plurality of microorganisms in the sample that are not a microorganism containing a target nucleic acid that is amplified by the one or more non-16S rRNA gene primer pairs. In another example, if two or more different non-16S rRNA gene primer pairs that amplify target nucleic acids in different microorganisms are used in the amplification, and the amplification products contain copies of only a single target nucleic acid sequence, this is indicative of the presence of one microorganism and the absence of the other microorganism from the sample. 
Also, if in this example two or more amplification products are detected from amplification of a particular hypervariable region using the one or more 16S rRNA gene primers, and the sequences of the two or more amplification products are different, this indicates the presence of a plurality of microorganisms in the sample only one of which is the microorganism that contains target nucleic acid sequence amplified by a pair of the non-16S primers. Additionally, the combined results of the amplifications using the 16S rRNA gene primers (and particularly primers that separately amplify multiple different hypervariable regions) and the non-16S rRNA gene primers yield increased accuracy and specificity in the detection and/or measurement of one or more microorganisms in a sample by reducing false negatives that may occur in basing a determination solely on the results of amplification using 16S rRNA primers and/or reducing or eliminating the number of false positives that may occur in basing a determination solely on the results of amplification using a non-16S rRNA gene primer pair that amplifies a species-specific nucleic acid. For example, due to possible sequence variations in 16S rRNA gene conserved regions in some species or strains of microorganisms (e.g., bacteria), primers designed to amplify a 16S rRNA gene hypervariable region of all bacteria may fail to amplify some bacterial nucleic acids present in a sample, thereby yielding a false negative result. However, the results of amplification of the same nucleic acids using non-16S rRNA gene primers that amplify a specific target nucleic acid in such a microorganism would enable detection of the presence of the microorganism in the sample. In another example, a false positive result can occur in amplification using a non-16S rRNA gene primer pair that amplifies a target nucleic acid sequence that is not unique to the genome of the microorganism intended to be detected by amplification using the primer pair. 
However, sequence information obtained from products of amplification of the same nucleic acids using one, or typically multiple, 16S rRNA gene primer pairs would reveal the absence of any 16S rRNA gene hypervariable sequences for the intended microorganism, thereby indicating the absence of the microorganism from the sample. In some embodiments, detecting the presence or absence of a nucleic acid amplification product includes comparing the sequence of the one or more nucleic acid amplification products to reference nucleic acid sequences of the genomes of microorganisms, e.g., bacteria, and/or of prokaryotic, e.g., bacterial, 16S rRNA genes. In some embodiments, comparing the sequence of a nucleic acid amplification product to reference genome sequences includes conducting computer-assisted alignment of the sequence and mapping it to a reference genome. Exemplary nucleotide sequence analysis workflows for mapping sequence reads of amplification products are provided herein. In some embodiments, relative and/or absolute levels of one or more microorganisms are determined or measured. For example, in some embodiments, the level of abundance of one or more nucleic acid amplification products and/or sequence reads can be measured to provide relative and/or absolute levels of one or more microorganisms. Techniques for quantifying nucleic acids (e.g., amplification products) and/or sequence reads are known in the art and/or provided herein.
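The interpretive reasoning in the preceding paragraphs, in which 16S rRNA gene results and non-16S rRNA gene results are read together to limit false negatives and false positives, can be illustrated with a minimal Python sketch. The names Evidence, call_presence and suggests_multiple_organisms, the result labels, and the target_is_unique flag are hypothetical and are introduced here for illustration only; the sketch is one possible reading of the examples above, not a definitive implementation of the disclosed methods.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Evidence:
    """Sequencing evidence for one microorganism of interest (hypothetical structure)."""
    non16s_product: bool    # reads map to the organism's non-16S target sequence
    s16_match: bool         # 16S hypervariable-region reads are consistent with the organism
    target_is_unique: bool  # assumption: whether the non-16S target is unique to the organism


def call_presence(ev: Evidence) -> str:
    """Combine non-16S and 16S evidence into a single call for one organism."""
    if ev.non16s_product and ev.s16_match:
        # Both lines of evidence agree.
        return "present"
    if ev.non16s_product:
        # No supporting 16S sequence. With a unique target this is treated as a
        # 16S false negative; with a non-unique target it is treated as a
        # non-16S false positive.
        return "present" if ev.target_is_unique else "absent"
    if ev.s16_match:
        # Case not addressed in the text above; left unresolved in this sketch.
        return "unresolved"
    return "absent_or_below_detection_limit"


def suggests_multiple_organisms(region_sequences: list[str]) -> bool:
    """True if one hypervariable region yielded two or more different sequences."""
    return len(set(region_sequences)) >= 2
```

In this sketch, a non-16S product without a supporting 16S sequence is resolved by the assumed uniqueness of the target, which mirrors the two contrasting examples given above; in practice such a call would likely also depend on read depth and detection limits.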


Methods for Assessing, Characterizing, Profiling and/or Measuring a Population of Microorganisms


Also provided herein are methods for characterizing, profiling, assessing and/or measuring a population of microorganisms, e.g., bacteria, in a sample. In some embodiments of the methods, nucleic acids in or from a sample are subjected to nucleic acid hybridization, annealing and/or amplification, for example, using any of the nucleic acids provided herein as probes and/or amplification primers.


In some embodiments, a method for characterizing, profiling, assessing and/or measuring a population of microorganisms, e.g., bacteria, and/or the composition or components thereof, in a sample provided herein includes (a) subjecting nucleic acids in or from a sample to nucleic acid amplification using a combination of primer pairs comprising (i) one or more primer pairs capable of amplifying nucleic acids containing sequences of one or more hypervariable regions of a prokaryotic 16S rRNA gene (referred to as the “16S rRNA gene primers or primer pairs”) and (ii) one or more primer pairs capable of amplifying a target nucleic acid sequence contained within the genome of a microorganism that is not contained within a hypervariable region of a prokaryotic 16S rRNA gene, wherein different primer pairs amplify different target nucleic acid sequences contained within the genome of different microorganisms (referred to as the “non-16S rRNA gene primers or primer pairs”); (b) obtaining sequence information from nucleic acid products amplified by the combination of primer pairs of (i) and (ii); and (c) identifying genera of microorganisms in the sample and species of one or more of the microorganisms in the sample, thereby characterizing a population of microorganisms in the sample. In some embodiments, the method further includes determining levels, e.g., relative and/or absolute levels, of nucleic acid products amplified by one or more primer pairs of (i), i.e., the 16S rRNA gene primer pairs, and/or (ii), i.e., the non-16S rRNA gene primer pairs, or sequence reads thereof. 
In some embodiments, a method for characterizing, profiling, assessing and/or measuring a population of microorganisms in a sample provided herein includes (a) subjecting nucleic acids in or from the sample to two or more separate nucleic acid amplification reactions using a first set of primer pairs for one nucleic acid amplification reaction and a second set of primer pairs for the other nucleic acid amplification reaction, wherein (i) the first set of primer pairs comprises one or more primer pairs that amplify a nucleic acid containing a sequence of one or more hypervariable regions of a prokaryotic 16S rRNA gene (referred to as the “16S rRNA gene primers or primer pairs”) and (ii) the second set of primer pairs comprises one or more primer pairs that amplify a target nucleic acid sequence contained within the genome of a microorganism that is not contained within a hypervariable region of a prokaryotic 16S rRNA gene, wherein different primer pairs amplify different target nucleic acid sequences contained within the genome of different microorganisms (referred to as the “non-16S rRNA gene primers or primer pairs”); (b) obtaining sequence information from nucleic acid products amplified by primer pairs of (i) and (ii); and (c) identifying genera of microorganisms in the sample and species of one or more of the microorganisms in the sample, thereby characterizing a population of microorganisms in the sample. In some embodiments, the method further includes determining levels, e.g., relative and/or absolute levels, of nucleic acid products amplified by one or more primer pairs of (i) and/or (ii) or sequence reads thereof.
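By way of illustration only, determining relative levels from sequence reads, as mentioned above, can be sketched in Python as a simple proportion of mapped reads. The function name relative_levels and the read counts in the usage note are hypothetical. The sketch assumes reads have already been mapped and counted per taxon or per target, and it deliberately ignores factors such as 16S rRNA gene copy number and amplification bias, which a real analysis might need to address.

```python
def relative_levels(read_counts: dict[str, int]) -> dict[str, float]:
    """Convert per-taxon (or per-target) read counts into fractions of the total."""
    total = sum(read_counts.values())
    if total <= 0:
        # Nothing to normalize against; refuse rather than divide by zero.
        raise ValueError("no reads to normalize")
    return {name: count / total for name, count in read_counts.items()}
```

For example, hypothetical counts of 30 reads for one genus and 10 reads for another would give relative levels of 0.75 and 0.25. Absolute levels would require additional information, such as a quantified standard, which this sketch does not model.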


In any embodiments of methods provided herein for characterizing, profiling, assessing and/or measuring a population of microorganisms and/or the composition or components thereof, in a sample, the methods can include any embodiments of the methods for detecting and/or measuring the presence or absence of one or more microorganisms in a sample as described herein. For example, in some embodiments, the microorganism(s) is/are bacteria. In some embodiments, the prokaryotic 16S rRNA gene is a bacterial gene and/or the prokaryotic microorganism is a bacterium. In some embodiments the target nucleic acid sequence contained within a genome of a prokaryotic microorganism, e.g., bacteria, is unique to the microorganism. In some embodiments, the one or more 16S rRNA gene primer pairs amplify a nucleic acid sequence in a plurality of microorganisms, e.g., bacteria, from different genera. In some embodiments, the sample is a biological sample, such as, for example, a sample of contents of the alimentary tract of an animal. In some embodiments, the sample is a fecal sample.


In any embodiments of methods for characterizing, profiling, assessing and/or measuring a population of microorganisms and/or the composition or components thereof, in a sample, a nucleic acid amplification can be performed according to any of the embodiments provided herein for such amplification. For example, in some embodiments, the one or more primer pairs that amplify a nucleic acid containing a sequence of a hypervariable region of a prokaryotic 16S rRNA gene separately amplify nucleic acids containing sequences of different hypervariable regions. In some embodiments, the primers of the one or more 16S rRNA gene primer pairs are directed to, or bind to, or hybridize to nucleic acid sequences contained in conserved regions of a prokaryotic 16S rRNA gene. In some embodiments, the one or more 16S rRNA gene primer pairs and/or non-16S rRNA gene primer pairs comprise a plurality of primer pairs. For example, the one or more 16S rRNA gene primer pairs can comprise a plurality of primer pairs that amplify nucleic acids containing sequences of multiple hypervariable regions of a prokaryotic 16S rRNA gene and/or the one or more non-16S rRNA gene primer pairs can comprise a plurality of primer pairs that amplify target nucleic acid sequences contained in the genomes of a plurality of microorganisms. In some embodiments, the amplification, or two or more separate nucleic acid amplification reactions, is/are multiplex amplification conducted in a single reaction mixture. In some embodiments, each primer of the one or more 16S rRNA gene primer pairs contains less than 10, less than 9, less than 8, less than 7, less than 6, less than 5, less than 4, less than 3, or less than 2 contiguous nucleotides of sequence identical to a sequence of contiguous nucleotides of another primer in the combination of primer pairs. 
In some embodiments, the nucleic acid sequences being amplified by the one or more 16S rRNA gene primer pairs are less than about 300 bp, less than about 250 bp, less than about 200 bp, less than about 175 bp, less than about 150 bp, or less than about 125 bp in length. In some embodiments, the 16S rRNA gene primer pairs separately amplify nucleic acids separately containing sequences of 3 or more, 4 or more, 5 or more, 6 or more, 7 or more, 8 or more or 9 different hypervariable regions of a prokaryotic 16S rRNA gene thereby producing amplified copies of the nucleic acids containing sequences of the 3 or more, 4 or more, 5 or more, 6 or more, 7 or more, 8 or more or 9 different hypervariable regions of the 16S rRNA gene of one or more microorganisms, wherein the amplified copies of different hypervariable regions are separate amplicons. In some embodiments, the 16S rRNA gene primer pairs separately amplify nucleic acids containing sequences of 8 different hypervariable regions of a prokaryotic 16S rRNA gene. In some embodiments, the 8 different hypervariable regions are V2-V9. In some embodiments, the 16S rRNA gene primer pairs separately amplify nucleic acids separately containing sequences of 3 or more different hypervariable regions of a prokaryotic 16S rRNA gene wherein one of the 3 or more regions is a V5 region thereby producing amplified copies of the nucleic acids containing sequences of the 3 or more different hypervariable regions of the 16S rRNA gene of one or more microorganisms. In some embodiments, the combination of primer pairs includes degenerate sequences of one or more primers in one or more primer pairs. 
In some embodiments, the 16S rRNA gene primer pair(s) comprise primers and/or primer pairs containing, or consisting essentially of, a sequence or sequences of a primer or primer pair in Table 15, or SEQ ID NOS: 1-24 in Table 15 and/or SEQ ID NOS: 25-48 in Table 15, or SEQ ID NOS: 11-16, 23 and 24 in Table 15 and/or SEQ ID NOS: 35-40, 47 and 48 in Table 15, or substantially identical or similar sequences, and optionally wherein one or more thymine bases is substituted with a uracil base.
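The limit described above on contiguous nucleotides shared between primers in a combination can be checked computationally. The following Python sketch is illustrative only: the function names longest_shared_run and primers_exceeding_limit are hypothetical, and the primer sequences in the usage note are made-up strings, not sequences from Table 15. The sketch compares primers in the orientation given and does not consider reverse complements, which is an assumption about how the identity criterion is applied.

```python
from itertools import combinations


def longest_shared_run(a: str, b: str) -> int:
    """Length of the longest run of contiguous nucleotides identical in a and b."""
    best = 0
    prev = [0] * (len(b) + 1)
    for ca in a:
        curr = [0] * (len(b) + 1)
        for j, cb in enumerate(b, start=1):
            if ca == cb:
                # Extend the run that ended at the previous position of both primers.
                curr[j] = prev[j - 1] + 1
                best = max(best, curr[j])
        prev = curr
    return best


def primers_exceeding_limit(primers: dict[str, str], limit: int) -> list[tuple[str, str, int]]:
    """Return primer pairs that share `limit` or more contiguous identical nucleotides.

    A combination satisfies "less than `limit` contiguous nucleotides" when the
    returned list is empty.
    """
    flagged = []
    for (name1, seq1), (name2, seq2) in combinations(primers.items(), 2):
        run = longest_shared_run(seq1.upper(), seq2.upper())
        if run >= limit:
            flagged.append((name1, name2, run))
    return flagged
```

As a usage example, for hypothetical primers "ACGTACGT" and "TTGTACAA" the longest shared run is "GTAC" (4 nucleotides), so the pair would be flagged at a limit of 4 but not at a limit of 5.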


In some embodiments of methods for characterizing, profiling, assessing and/or measuring a population of microorganisms, and/or the composition or components thereof, in a sample, the one or more non-16S rRNA gene primer pairs specifically amplifies a target nucleic acid sequence contained within a genome of a microorganism selected from the microorganisms of Table 1, or Table 1, except for, or excluding, Actinomyces viscosus and/or Blautia coccoides, or Table 1, except for, or excluding, Actinomyces viscosus, Blautia coccoides and/or Helicobacter salomonis. In some embodiments, the target nucleic acid is unique to the microorganism. In some embodiments, the sample includes nucleic acids from a plurality of different microorganisms listed in Table 1. In some such embodiments, amplified copies of a plurality of different microorganisms in Table 1, or Table 1, except for, or excluding, Actinomyces viscosus and/or Blautia coccoides, or Table 1, except for, or excluding, Actinomyces viscosus, Blautia coccoides and/or Helicobacter salomonis, are produced. In some embodiments, the sample includes a mixture of nucleic acids of one or more, or a plurality of, microorganisms selected from among the microorganisms listed in Table 1, or Table 1, except for, or excluding, Actinomyces viscosus and/or Blautia coccoides, or Table 1, except for, or excluding, Actinomyces viscosus, Blautia coccoides and/or Helicobacter salomonis, and one or more microorganisms, e.g., bacteria, not listed in Table 1. 
In some embodiments, at least one, or one or more, target nucleic acid sequence(s) comprises or consists essentially of a nucleotide sequence selected from the nucleotide sequences of SEQ ID NOS: 1605-1979 in Table 17, or SEQ ID NOS: 1605-1806, 1809-1816, 1821-1970, 1972-1974 and 1977-1979 of Table 17, or SEQ ID NOS: 1605-1826 in Table 17, or SEQ ID NOS: 1605-1806, 1809-1816 and 1821-1826 in Table 17, or SEQ ID NOS: 1605-1820 in Table 17A, or SEQ ID NOS: 1605-1806 and 1809-1816 in Table 17A, or SEQ ID NOS: 1827-1979 in Table 17C, or SEQ ID NOS: 1827-1976 in Table 17C, or a substantially identical or similar sequence, or the complement thereof. In some embodiments, at least one, or one or more, product(s) of the nucleic acid amplification comprises, or consists essentially of, a nucleotide sequence selected from SEQ ID NOS: 1605-1979 in Table 17, or SEQ ID NOS: 1605-1806, 1809-1816, 1821-1970, 1972-1974 and 1977-1979 of Table 17, or SEQ ID NOS: 1605-1826 in Table 17, or SEQ ID NOS: 1605-1806, 1809-1816 and 1821-1826 in Table 17, or SEQ ID NOS: 1605-1820 in Table 17A, or SEQ ID NOS: 1605-1806 and 1809-1816 in Table 17A, or SEQ ID NOS: 1827-1979 in Table 17C, or SEQ ID NOS: 1827-1976 in Table 17C, or a substantially identical or similar sequence, or the complement thereof, and optionally having one or more primer sequences at the 5′ and/or 3′ end(s) of the sequence, such as any of the primer sequences provided herein, and is less than about 500, less than about 475, less than about 450, less than about 400, less than about 375, less than about 350, less than about 300, less than about 275, less than about 250, less than about 200, less than about 175, less than about 150, or less than about 100 nucleotides in length. In some embodiments, the at least one non-16S rRNA gene primer pair does not detectably amplify a nucleic acid sequence contained within any genus other than the genus of the microorganism containing the target nucleic acid sequence. 
In some embodiments, the at least one non-16S rRNA gene primer pair does not detectably amplify a nucleic acid sequence contained within any species other than the species of the microorganism containing the target nucleic acid sequence. In some embodiments, at least one primer of the non-16S rRNA gene primer pair, or at least one non-16S rRNA gene primer pair, contains, or consists essentially of, the sequence or sequences of a primer or primer pair in Table 16, or SEQ ID NOS: 49-520 of Table 16, or SEQ ID NOS: 49-452, 457-472 and 481-520 of Table 16, or SEQ ID NOS: 49-492 of Table 16, or SEQ ID NOS: 49-452, 457-472 and 481-492 of Table 16, or SEQ ID NOS: 49-480 of Table 16A, or SEQ ID NOS: 49-452 and 457-472 of Table 16A, or SEQ ID NOS: 521-826 of Table 16C, or SEQ ID NOS: 521-820 of Table 16C, or SEQ ID NOS: 827-1298 of Table 16, or SEQ ID NOS: 827-1230, 1235-1250 and 1259-1298 of Table 16, or SEQ ID NOS: 827-1270 of Table 16, or SEQ ID NOS: 827-1230, 1235-1250 and 1259-1270 of Table 16, or SEQ ID NOS: 827-1258 of Table 16D, or SEQ ID NOS: 827-1230 and 1235-1250 of Table 16D, or SEQ ID NOS: 1299-1604 of Table 16F, or SEQ ID NOS: 1299-1598 of Table 16F, or a substantially identical or similar sequence(s), or any of the aforementioned nucleotide sequences of nucleic acids or primer pairs in which one or more thymine bases is substituted with a uracil base. 
In some embodiments, the nucleic acids are subjected to nucleic acid amplification using a plurality of non-16S rRNA gene primers or primer pairs, each containing, or consisting essentially of, a sequence or sequences of a primer pair in Table 16, or SEQ ID NOS: 49-520 of Table 16, or SEQ ID NOS: 49-452, 457-472 and 481-520 of Table 16, or SEQ ID NOS: 49-492 of Table 16, or SEQ ID NOS: 49-452, 457-472 and 481-492 of Table 16, or SEQ ID NOS: 49-480 of Table 16A, or SEQ ID NOS: 49-452 and 457-472 of Table 16A, or SEQ ID NOS: 521-826 of Table 16C, or SEQ ID NOS: 521-820 of Table 16C, or SEQ ID NOS: 827-1298 of Table 16, or SEQ ID NOS: 827-1230, 1235-1250 and 1259-1298 of Table 16, or SEQ ID NOS: 827-1270 of Table 16, or SEQ ID NOS: 827-1230, 1235-1250 and 1259-1270 of Table 16, or SEQ ID NOS: 827-1258 of Table 16D, or SEQ ID NOS: 827-1230 and 1235-1250 of Table 16D, or SEQ ID NOS: 1299-1604 of Table 16F, or SEQ ID NOS: 1299-1598 of Table 16F, or a substantially identical or similar sequence(s), or any of the aforementioned nucleotide sequences of nucleic acids or primer pairs in which one or more thymine bases is substituted with a uracil base. In some embodiments, at least one primer or one primer pair in the combination of primer pairs includes a modification that facilitates nucleic acid manipulation, amplification, ligation and/or sequencing of amplification products and/or reduction or elimination of primer dimers. In particular embodiments, a modification is one that facilitates multiplex nucleic acid amplification, ligation and/or sequencing of products of multiplex amplification.


In some embodiments of the methods, the characterizing, profiling, assessing and/or measuring of a population of microorganisms (e.g., bacteria) and/or the composition or components thereof, is designed to focus on the make-up of the population of microorganisms particularly with respect to certain groups of microorganisms and the proportionate presence of the group and/or group members in the population. In such embodiments, the combination of primers and/or primer pairs includes a selected group or sub-group of microorganism-specific nucleic acids and includes kingdom-encompassing nucleic acids (e.g., 16S rRNA gene primers and primer pairs), and enables not only a comprehensive survey of the entirety and relative levels of genera of microorganisms (e.g., bacteria), but also detailed identification of species of microorganisms that can be tailored, for example, to focus on one or more particular microorganisms of interest that may be significant in certain states of health and disease or microbiota imbalance, e.g., dysbiosis. In some embodiments, combinations of nucleic acids include microorganism-specific nucleic acids, and/or primer pairs, that specifically amplify a nucleic acid sequence contained in the genome of one or more microorganisms (e.g., bacteria) implicated in one or more conditions, disorders and/or diseases. In particular embodiments, the combination of nucleic acid primers and/or primer pairs includes a nucleic acid and/or a primer pair that specifically amplifies a target nucleic acid sequence contained within a genome of a microorganism selected from the microorganisms in Table 1, or Table 1, except for, or excluding, Actinomyces viscosus and/or Blautia coccoides, or Table 1, except for, or excluding, Actinomyces viscosus, Blautia coccoides and/or Helicobacter salomonis. 
In some embodiments, the combination includes a plurality of nucleic acids and/or primer pairs that include at least one nucleic acid primer pair that specifically amplifies a target nucleic acid in each of the microorganisms in Table 1, or Table 1, except for, or excluding, Actinomyces viscosus and/or Blautia coccoides, or Table 1, except for, or excluding, Actinomyces viscosus, Blautia coccoides and/or Helicobacter salomonis. In some embodiments, the plurality of primer pairs includes primer pairs that specifically amplify genomic target nucleic acids contained within at least 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60 or 70 of the microorganisms in Table 1, or Table 1, except for, or excluding, Actinomyces viscosus and/or Blautia coccoides, or Table 1, except for, or excluding, Actinomyces viscosus, Blautia coccoides and/or Helicobacter salomonis. In particular embodiments, the target nucleic acid sequences contained in the genome of the different microorganisms are unique to each of the microorganisms. In some embodiments, a combination of nucleic acids and/or nucleic acid primer pairs includes two or more nucleic acids and/or nucleic acid primer pairs that specifically amplify a unique nucleic acid sequence contained in the genome of one or more of the Group A microorganisms (see Table 2A), which are species implicated as having a role in multiple conditions, diseases and/or disorders, including, for example, oncological conditions, such as response to immuno-oncology treatment and cancer; gastrointestinal disorders, such as irritable bowel syndrome, inflammatory bowel disease and celiac disease; and autoimmune diseases, such as lupus and rheumatoid arthritis. 
In some embodiments, the combination of nucleic acids and/or nucleic acid primer pairs includes a set of nucleic acid primer pairs in which each different nucleic acid primer pair specifically amplifies a different unique nucleic acid sequence contained in a different one of each of the genomes of the different microorganisms in Group A. In some embodiments, the combination of nucleic acids and/or nucleic acid primer pairs includes nucleic acids and/or nucleic acid primer pairs that specifically bind to, hybridize to and/or amplify sequences as set forth in Table 2A for exemplary nucleic acids, primers and primer pairs for genomes of Group A microorganisms. In some embodiments, the combination of nucleic acids and/or nucleic acid primer pairs includes nucleic acids and/or nucleic acid primer pairs having a nucleotide sequence or sequences as set forth in Table 2A for exemplary nucleic acids, primers and primer pairs for genomes of Group A microorganisms.


In some embodiments, a combination of nucleic acids and/or nucleic acid primer pairs includes two or more nucleic acids and/or nucleic acid primer pairs that specifically amplify a unique nucleic acid sequence contained in the genome of one or more of the Group B microorganisms (see Table 2B), which are species implicated as having a role in response to immuno-oncology treatment. In some embodiments, the combination of nucleic acids and/or nucleic acid primer pairs includes a set of nucleic acid primer pairs in which each different nucleic acid primer pair specifically amplifies a different unique nucleic acid sequence contained in a different one of each of the genomes of the different microorganisms in Group B. In some embodiments, the combination of nucleic acids and/or nucleic acid primer pairs includes nucleic acids and/or nucleic acid primer pairs that specifically bind to, hybridize to and/or amplify sequences as set forth in Table 2B for exemplary nucleic acids, primers and primer pairs for genomes of Group B microorganisms. In some embodiments, the combination of nucleic acids and/or nucleic acid primer pairs includes nucleic acids and/or nucleic acid primer pairs having a nucleotide sequence or sequences as set forth in Table 2B for exemplary nucleic acids, primers and primer pairs for genomes of Group B microorganisms.


In some embodiments, a combination of nucleic acids and/or nucleic acid primer pairs includes two or more nucleic acids and/or nucleic acid primer pairs that specifically amplify a unique nucleic acid sequence contained in the genome of one or more of the Group C microorganisms (see Table 2C), or the Group C microorganisms excluding Helicobacter salomonis (Subgroup 1 of the Group C microorganisms), which are species implicated as having a role in cancer. In some embodiments, the combination of nucleic acids and/or nucleic acid primer pairs includes a set of nucleic acid primer pairs in which each different nucleic acid primer pair specifically amplifies a different unique nucleic acid sequence contained in a different one of each of the genomes of the different microorganisms in Group C, or the Group C microorganisms excluding Helicobacter salomonis (Subgroup 1 of the Group C microorganisms). In some embodiments, the combination of nucleic acids and/or nucleic acid primer pairs includes nucleic acids and/or nucleic acid primer pairs that specifically bind to, hybridize to and/or amplify sequences as set forth in Table 2C for exemplary nucleic acids, primers and primer pairs for genomes of Group C microorganisms. 
In some embodiments, the combination of nucleic acids and/or nucleic acid primer pairs includes nucleic acids and/or nucleic acid primer pairs that specifically bind to, hybridize to and/or amplify sequences selected from SEQ ID NOS: 1616, 1619, 1620, 1625-1628, 1635-1640, 1699, 1700, 1705-1708, 1752, 1753, 1784-1786, 1817-1820, 1827, 1828, 1840, 1841, 1844, 1845, 1852-1859, 1899, 1900, 1904, 1905, 1932, 1933, 1956-1958, 1975, 1976 of Table 17, and/or a substantially identical or similar sequence, or sequences selected from SEQ ID NOS: 1616, 1619, 1620, 1625-1628, 1635-1640, 1699, 1700, 1705-1708, 1752, 1753, 1784-1786, 1827, 1828, 1840, 1841, 1844, 1845, 1852-1859, 1899, 1900, 1904, 1905, 1932, 1933, 1956, 1957, 1958 of Table 17, and/or a substantially identical or similar sequence. In some embodiments, the combination of nucleic acids and/or nucleic acid primer pairs includes nucleic acids and/or nucleic acid primer pairs having a nucleotide sequence or sequences as set forth in Table 2C for exemplary nucleic acids, primers and primer pairs for genomes of Group C microorganisms. In some embodiments, the combination of nucleic acids and/or nucleic acid primer pairs includes nucleic acids and/or nucleic acid primer pairs having a nucleotide sequence or sequences selected from SEQ ID NOS: 71, 72, 77-80, 89-96, 109-120, 237-242, 249-256, 343-346, 407-412, 493-496, 511-520, 521-524, 547-550, 555-558, 561-568, 571-586, 665-668, 675-678, 731-734, 779-784, and/or SEQ ID NOS: 849, 850, 855-858, 867-874, 887-898, 1012-1020, 1025-1034, 1121-1124, 1185-1190, 1271-1276, 1289-1298, 1299-1302, 1325-1328, 1333-1336, 1339-1346, 1349-1364, 1443-1446, 1453-1456, 1509-1512, 1557-1562, in Table 16, or substantially identical or similar sequences, or any of the aforementioned nucleotide sequences of nucleic acids or primer pairs in which one or more thymine bases is substituted with a uracil base. 
In some embodiments, the combination of nucleic acids and/or nucleic acid primer pairs includes nucleic acids and/or nucleic acid primer pairs having a nucleotide sequence or sequences selected from SEQ ID NOS: 71, 72, 77-80, 89-96, 109-120, 237-242, 249-256, 343-346, 407-412, 493-496, 511-520 and/or SEQ ID NOS: 849, 850, 855-858, 867-874, 887-898, 1012-1020, 1025-1034, 1121-1124, 1185-1190, 1271-1276, 1289-1298 in Table 16, or substantially identical or similar sequences, or any of the aforementioned nucleotide sequences of nucleic acids or primer pairs in which one or more thymine bases is substituted with a uracil base. In some embodiments, the combination of nucleic acids and/or nucleic acid primer pairs includes nucleic acids and/or nucleic acid primer pairs having a nucleotide sequence or sequences selected from SEQ ID NOS: 71, 72, 77-80, 89-96, 109-120, 237-242, 249-256, 343-346, 407-412 and/or SEQ ID NOS: 849, 850, 855-858, 867-874, 887-898, 1012-1020, 1025-1034, 1121-1124, 1185-1190 in Table 16, or substantially identical or similar sequences, or any of the aforementioned nucleotide sequences of nucleic acids or primer pairs in which one or more thymine bases is substituted with a uracil base.


In some embodiments, a combination of nucleic acids and/or nucleic acid primer pairs includes two or more nucleic acids and/or nucleic acid primer pairs that specifically amplify a unique nucleic acid sequence contained in the genome of one or more of the Group D microorganisms (see Table 2D), which are species implicated as having a role in gastrointestinal disorders, including, for example, irritable bowel syndrome, inflammatory bowel disease and coeliac disease. In some embodiments, the combination of nucleic acids and/or nucleic acid primer pairs includes a set of nucleic acid primer pairs in which each different nucleic acid primer pair specifically amplifies a different unique nucleic acid sequence contained in a different one of each of the genomes of the different microorganisms in Group D. In some embodiments, the combination of nucleic acids and/or nucleic acid primer pairs includes nucleic acids and/or nucleic acid primer pairs that specifically bind to, hybridize to and/or amplify sequences as set forth in Table 2D for exemplary nucleic acids, primers and primer pairs for genomes of Group D microorganisms. In some embodiments, the combination of nucleic acids and/or nucleic acid primer pairs includes nucleic acids and/or nucleic acid primer pairs having a nucleotide sequence or sequences as set forth in Table 2D for exemplary nucleic acids, primers and primer pairs for genomes of Group D microorganisms.


In some embodiments, a combination of nucleic acids and/or nucleic acid primer pairs includes two or more nucleic acids and/or nucleic acid primer pairs that specifically amplify a unique nucleic acid sequence contained in the genome of one or more of the Group E microorganisms (see Table 2E), which are species implicated as having a role in autoimmune disorders, including, for example, lupus and rheumatoid arthritis. In some embodiments, the combination of nucleic acids and/or nucleic acid primer pairs includes a set of nucleic acid primer pairs in which each different nucleic acid primer pair specifically amplifies a different unique nucleic acid sequence contained in a different one of each of the genomes of the different microorganisms in Group E. In some embodiments, the combination of nucleic acids and/or nucleic acid primer pairs includes nucleic acids and/or nucleic acid primer pairs that specifically bind to, hybridize to and/or amplify sequences as set forth in Table 2E for exemplary nucleic acids, primers and primer pairs for genomes of Group E microorganisms. In some embodiments, the combination of nucleic acids and/or nucleic acid primer pairs includes nucleic acids and/or nucleic acid primer pairs having a nucleotide sequence or sequences as set forth in Table 2E for exemplary nucleic acids, primers and primer pairs for genomes of Group E microorganisms.


Nucleotide Sequence Information


In some embodiments of the methods, the characterizing, profiling, assessing and/or measuring of a population of microorganisms and/or the composition or components thereof, involves utilization of nucleotide sequence information of amplification products. The nucleotide sequences provide information used in assessing or measuring the diversity (e.g., number of different microorganisms, such as number of different genera and/or species of microorganisms), the levels or abundance of different microorganisms, the proportionate amounts of different microorganisms and/or the identities (e.g., genus, species, etc.) of the microorganisms that make up a microorganism population in a sample. Typically, the characterizing, profiling, assessing and/or measuring of a population of microorganisms and/or the composition or components thereof, is based on nucleotide sequence information of products from the majority of, or substantially all, the amplifications performed using each of the different primer pairs in a combination of primer pairs employed in the method. One reason for this is that the microorganism population, as defined by the different microorganisms, and/or relative abundances thereof, is being elucidated in the method. In contrast, a method of amplifying, detecting and/or measuring only one, or more, particular microorganism(s) in a sample can be without regard to the presence or absence of any other, or some other, microorganisms (or nucleic acids from other sources) in the sample, and thus a determination may be made based on nucleotide sequence information of products from only some of the amplifications performed with different primer pairs, or of products from all of a smaller or limited number of amplifications.


In some embodiments, the methods for characterizing, profiling, assessing and/or measuring a population of microorganisms and/or the composition or components thereof, in a sample, include obtaining sequence information from nucleic acid products amplified by the combination of primer pairs employed in the method. Examples of sequence information include, but are not limited to, the identities of the nucleotides and the order thereof in a contiguous polynucleotide sequence (the nucleotide sequence determination) of an amplification product (which can include, for example, barcode sequence, e.g., corresponding to amplicon library source, primers used, etc.), alignment and/or mapping of nucleotide sequence to a reference sequence (and identity of the genus or species of the reference sequence), the number of sequence reads that map to a reference sequence and/or portions thereof (e.g., hypervariable regions of a 16S rRNA gene), the number of sequence reads that map uniquely to a reference sequence, and the number of regions (e.g., target sequences) of a reference sequence to which sequence reads map. In some embodiments, sequence information provides the identities (e.g., genus, species) of microorganisms and/or the number of different microorganisms in a population which is indicative of the diversity of the population. In some embodiments, sequence information provides a measure of the levels (e.g., relative and/or absolute) of microorganisms in a population (abundance and proportionate contribution or presence in a population).


Techniques for sequencing nucleic acids, e.g., nucleic acids of a prepared library of nucleic acids, such as amplicon libraries that can be generated using compositions and methods described herein, are provided herein and/or known in the art and include, for example, next generation sequencing-by-synthesis and Sanger sequencing methods. Any such methods can include use of microarrays for massively parallel sequencing of nucleic acids, e.g., on a substrate, such as a chip. In some embodiments, nucleic acids, e.g., an amplicon library, can be sequenced using an Ion Torrent Sequencer (Life Technologies), e.g., the Ion Torrent PGM 318™, Ion Torrent S5 520™, Ion Torrent S5 530™, or Ion Torrent S5 XL™ system. Other sequencing systems include, but are not limited to, systems employing solid-phase PCR involving bridge amplification of nucleic acids using oligonucleotide adapters (e.g., Illumina MiSeq, NextSeq or HiSeq platforms). In some embodiments, nucleic acid templates to be sequenced can be prepared from a population of nucleic acid molecules using amplification methods provided herein. In some embodiments, a sequencer can be coupled to a server that applies parameters or software to determine the sequence of the amplified nucleic acid molecules.


In some embodiments, an amplicon library prepared using primers provided herein can be used in downstream enrichment applications. For example, following amplification of sample nucleic acids using nucleic acid compositions, e.g., primers or primer pairs, provided herein, a secondary and/or tertiary amplification process including, but not limited to, a library amplification step and/or a clonal amplification step, including, for example, isothermal nucleic acid amplification, emulsion PCR or bridge PCR, can be performed. In some embodiments, the amplicon library can be used in an enrichment application and a sequencing application. For example, an amplicon library can be sequenced using any suitable DNA sequencing platform. In some embodiments, amplification and templating of amplified amplicons can be performed according to the Ion PGM™ Template IA 500 Kit user guide (see, e.g., Thermo Fisher Scientific Catalog no. A24622 and Publication no. MAN0009347), Ion 540™ Kit-Chef user guide (see, e.g., Thermo Fisher Scientific Catalog no. A30011 and Publication no. MAN0010851), or Ion 550™ Kit-Chef user guide (see, e.g., Thermo Fisher Scientific Catalog no. A34541 and Publication no. MAN0017275). In some embodiments, at least one of the amplified target sequences to be clonally amplified can be attached to a support or particle (see, for example, U.S. Patent Application Publication No. US2019/0194719). The support can be comprised of any suitable material and have any suitable shape, including, for example, planar, spheroid or particulate. In some embodiments, the support is a scaffolded polymer particle as described in U.S. Published App. No. 20100304982. In some embodiments, the amplicon library can be prepared, enriched and sequenced in less than 24 hours. In some embodiments, the amplicon library can be prepared, enriched and sequenced in approximately 9 hours.
In some embodiments, an amplicon library can be a paired or combined library, e.g., a library that contains amplicons generated from amplification of sample nucleic acids using 16S rRNA gene primers and amplicons generated from amplification of sample nucleic acids using species (e.g., bacterial species)-specific primers. In some embodiments, a library and/or template preparation to be sequenced can be prepared for sequencing automatically with an automated system, e.g., the Ion Chef™ system (Thermo Fisher Scientific, Inc.). Amplification products generated by the methods disclosed herein can be ligated to an adapter that may be used downstream as a platform for clonal amplification. The adapter can function as a template strand for subsequent amplification using a second set of primers and therefore allows universal amplification of the adapter-ligated amplification product. In some embodiments, adapters ligated to amplicons include one or more barcodes. In one embodiment, one barcode can be ligated to amplicons generated in amplification of sample nucleic acids using 16S rRNA gene primers and a different barcode can be ligated to amplicons generated in amplification of sample nucleic acids using species (e.g., bacterial species)-specific primers. The ability to incorporate barcodes enhances sample throughput and allows for analysis of multiple samples or sources of material concurrently. In one example, amplified nucleic acid molecules prepared using compositions and methods provided herein can be ligated to Ion Torrent™ Sequencing Adapters (A and P1 adapters, sold as a component of the Ion Fragment Library Kit, Life Technologies, Part No. 4466464) or Ion Torrent™ DNA Barcodes (Life Technologies, Part No. 4468654). In some embodiments, a barcode or key can be incorporated into each of the amplification products to assist with data analysis and for example, cataloging.


In some embodiments, an amplicon library produced by the teachings of the present disclosure is sufficient in yield to be used in a variety of downstream applications including the Ion Xpress™ Template Kit using an Ion Torrent™ PGM system (e.g., PCR-mediated addition of the nucleic acid fragment library onto Ion Sphere™ Particles) (Life Technologies, Part No. 4467389). For example, instructions to prepare a template library from the amplicon library can be found in the Ion Xpress Template Kit User Guide (Thermo Fisher Scientific). Instructions for loading the subsequent template library onto the Ion Torrent™ Chip for nucleic acid sequencing are described in the Ion Sequencing User Guide (Thermo Fisher Scientific). In some embodiments, the amplicon library produced by the teachings of the present disclosure can be used in paired-end sequencing (e.g., paired-end sequencing on the Ion Torrent™ PGM system (Thermo Fisher Scientific)). It will be apparent to one of ordinary skill in the art that numerous other techniques, platforms or methods for clonal amplification, such as wildfire PCR and bridge amplification, can be used in conjunction with the amplification products of the present disclosure. It is also envisaged that one of ordinary skill in the art, upon further refinement or optimization of the conditions provided herein, can proceed directly to nucleic acid sequencing (for example using the Ion Torrent PGM™ or Proton™ sequencers, Life Technologies) without performing a clonal amplification step. Sequence data processing and analysis programs to obtain the sequence of nucleotides of nucleic acids, e.g., amplicons, are available and include, for example, the Torrent Suite™ Software product for use with Ion sequencers (Thermo Fisher Scientific, Inc.).


In some embodiments, relative and/or absolute levels of one or more microorganisms are determined or measured. For example, in some embodiments, the level of abundance of one or more nucleic acid amplification products and/or sequence reads can be measured to provide relative and/or absolute levels of one or more microorganisms. Techniques for quantifying nucleic acids (e.g., amplification products) and/or sequence reads are known in the art and/or provided herein.
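By way of illustration only, one simple way to express relative levels from sequence read counts is to normalize the count assigned to each microorganism by the total count. The following is a minimal sketch under that assumption; the function name and the example species labels are hypothetical and are not taken from the disclosure, and the sketch does not address absolute quantification.

```python
def relative_abundance(read_counts):
    """Convert per-microorganism read counts into fractions of the total.

    read_counts: dict mapping a microorganism label to its read count.
    Returns a dict with the same keys and values summing to 1.0
    (or all zeros if no reads were counted).
    """
    total = sum(read_counts.values())
    if total == 0:
        return {name: 0.0 for name in read_counts}
    return {name: count / total for name, count in read_counts.items()}


# Hypothetical read counts for three microorganisms in one sample.
example_counts = {"species_1": 50, "species_2": 25, "species_3": 25}
example_fractions = relative_abundance(example_counts)
```

Read counts may be affected by factors such as amplification efficiency and gene copy number, so fractions computed this way are only a rough proxy for proportions of microorganisms.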


In some embodiments, utilizing sequence information, for example, in detecting microorganisms and/or identifying genera and/or species of sample microorganisms, in the methods includes comparing the sequence of the one or more nucleic acid amplification products to each other and/or to reference nucleic acid sequences of the genomes of microorganisms, e.g., bacteria, and/or of prokaryotic, e.g., bacterial, genes, such as 16S rRNA genes. In some embodiments, comparing the sequence of a nucleic acid amplification product includes computer-assisted mapping of the sequence to a reference genome. In some embodiments, comparing the sequence of a nucleic acid amplification product includes computer-assisted alignment of the sequence to other sequences. There are several software products that can be used in conducting the computational processing involved in aligning and mapping nucleic acid sequences. For example, some products utilize a Burrows-Wheeler Transform (BWT; see, e.g., Li and Durbin (2009) Bioinformatics 25:1754-1760) algorithm in mapping sequence reads to sequences in a reference database. One implementation of BWT is provided by the Burrows-Wheeler Aligner (see, e.g., https://sourceforge.net/projects/bio-bwa/files/). Some products utilize hashing algorithms (e.g., SSAHA; sanger.ac.uk/science/tools/ssaha; see, e.g., Ning et al. (2001) Genome Res 11(10):1725-1729) and/or dynamic programming algorithms (e.g., Needleman-Wunsch or Smith-Waterman) implemented, for example, in software tools available through the European Bioinformatics Institute (see, e.g., ebi.ac.uk/services/all). Another tool available for aligning nucleotide sequences is the Basic Local Alignment Search Tool (BLAST) available through the National Center for Biotechnology Information (NCBI) (see, e.g., https://blast.ncbi.nlm.nih.gov/Blast.cgi).
This program can be used to search sequence databases (e.g., microbial genome databases) for similar sequences using metrics of read identity, alignment length and other parameters, and the sequence reads mapping to a particular genus or species can be calculated. An example of a program providing several options for mapping/alignment of nucleotide sequences is the Torrent Mapping Alignment Program (TMAP) module (see, e.g., Torrent Suite™ Software User Guide; Thermo Fisher Scientific Publication number MAN0017972) for use with the Torrent Suite™ Software product that is optimized for sequence data generated using Ion Torrent™ sequencer systems.


In one embodiment, described herein are nucleotide sequence analysis workflows for aligning and/or mapping sequence reads of amplification products. In some embodiments, these methods can be used to compress reference sequence databases used in mapping sequence reads for analysis and profiling of microbial populations. There can be over 100,000 nucleotide sequences in a database of genome and gene (e.g., 16S rRNA gene) sequences from numerous microorganisms (e.g., bacteria). In methods provided herein in which 16S rRNA gene hypervariable regions are amplified, the more of the nine hypervariable regions that are amplified and sequenced, the greater the number of alignments that must be performed. Thus, an analysis of multiple amplified nucleic acid regions of multiple microorganisms in a sample can require extensive processor memory and time and potentially introduce errors and uncertainties into the analysis. Methods of performing such analyses that reduce computational requirements, reduce memory requirements and improve the quality of characterization of nucleic acids in a sample are described herein. In some embodiments, an unaligned BAM file including sequence read information may be provided to a processor for analyzing the sequence reads corresponding to marker regions. Reads obtained from sequencing of library DNA templates may be analyzed to identify, and determine the levels of, microbial constituents of the samples. Analysis may be conducted using a workflow incorporating Ion Torrent Suite™ Software (Thermo Fisher Scientific) with a run plan template designed to facilitate microbial DNA sequence read analysis, and an AmpliSeq microbiome analysis software plugin which generates counts for amplicons targeted in the assay. Reference genome files used for alignment in read mapping aspects of the analysis may be included in the plugin. Compressed 16S reference sequences may be derived from the GreenGenes or other 16S rRNA gene sequence database.
The compressed 16S reference sequences comprise a plurality of the hypervariable regions of the 16S rRNA gene. Primers, such as those described herein, targeting multiple of the 9 hypervariable regions for amplification, e.g., V2-V9, yield a set of hypervariable segment sequences through amplification of microbial nucleic acids. The set of hypervariable segments may be generated by applying an in silico PCR simulation using the primer pairs to the full-length 16S rRNA gene sequences contained in the database to extract expected target segments for variable regions, e.g., V2-V9. The in silico PCR simulation may use available computational tools for calculating theoretical PCR results using a given set of primers and a target DNA sequence input by the user. One such tool is Primer-BLAST (https://www.ncbi.nlm.nih.gov/tools/primer-blast/index.cgi?LINK_LOC=BlastHome), described in Ye J, Coulouris G, Zaretskaya I, Cutcutache I, Rozen S, Madden T L. (2012) Primer-BLAST: a tool to design target-specific primers for polymerase chain reaction. BMC Bioinformatics. 13:134. The set of hypervariable segments derived from the in silico simulation provides a compressed reference containing only hypervariable region sequences of those full-length 16S rRNA sequences in the complete database that would be expected to be amplified by the primers. For example, the number of base pairs of the reference may be reduced from 1500 bp of the full-length sequence to 8 hypervariable segments having a total of 1299 bp. The GreenGenes database contains about 150,000 16S rRNA gene sequences.
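By way of illustration only, the extraction of expected target segments by in silico PCR may be sketched as follows. This minimal sketch assumes exact, ungapped primer matches and retains the primer sequences in the extracted segment; tools such as Primer-BLAST use more permissive matching. The function names, the toy template and the toy primers are hypothetical and are not part of the disclosure.

```python
def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq))


def in_silico_amplicon(template, forward, reverse):
    """Return the expected amplicon (primers included), or None.

    Assumption: both primers must match the template exactly. The reverse
    primer is given 5'->3' and binds the opposite strand, so its reverse
    complement is searched for downstream of the forward primer site.
    """
    start = template.find(forward)
    if start < 0:
        return None
    reverse_site = reverse_complement(reverse)
    end = template.find(reverse_site, start + len(forward))
    if end < 0:
        return None
    return template[start:end + len(reverse_site)]


def compress_reference(full_length, primer_pairs):
    """Build a compressed reference of expected segments only.

    full_length: dict of sequence id -> full-length gene sequence.
    primer_pairs: dict of region label -> (forward primer, reverse primer).
    Returns dict of (sequence id, region label) -> expected segment.
    """
    compressed = {}
    for seq_id, template in full_length.items():
        for region, (forward, reverse) in primer_pairs.items():
            segment = in_silico_amplicon(template, forward, reverse)
            if segment is not None:
                compressed[(seq_id, region)] = segment
    return compressed


# Toy data (not real 16S sequences or primers).
toy_reference = {"seq1": "GGACGTACGTTTTTGGGCCCAAATT", "seq2": "GGGGGGGGGG"}
toy_primers = {"V3": ("ACGTACG", "ATTTGG")}
toy_compressed = compress_reference(toy_reference, toy_primers)
```

Only sequences and regions for which both primer sites are found contribute an entry, which is what makes the resulting reference smaller than the full-length database.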



FIG. 2 illustrates a workflow for use in analysis of sequence information generated in methods provided herein. For example, the workflow can be used as a method for processing the sequence reads to assess microbial composition of a sample. The barcode/sample name parser separates the sequence reads into a set corresponding to the amplicons generated in amplification using 16S rRNA gene primers and a set corresponding to the amplicons generated in amplification using non-16S rRNA gene primers (e.g., targeted species-specific primers). Quality trimming of reads may be performed, and short reads may be removed, by the base caller. For example, sequence reads having lengths less than 60 bases or greater than 350 bases may be removed. The base calls may be made by analyzing any suitable signal characteristics (e.g., signal amplitude or intensity). The structure and/or design of a sensor array, signal processing and base calling for use with the present teachings may include one or more features described in U.S. Pat. Appl. Publ. No. 2013/0090860, published Apr. 11, 2013, incorporated by reference herein in its entirety. The sequence reads generated through amplification of sample nucleic acids using 16S rRNA gene primers and the sequence reads generated using species-specific primers may be analyzed independently. In some embodiments, more than one primer pair may target a given hypervariable region. The sequence of each primer of a primer pair for amplifying each region is directed to a “conserved” sequence on each side of the particular V region so that the primer pair will theoretically amplify that variable region in the genome of every bacterium. However, some of the “conserved” regions on either side of each variable region, particularly the conserved regions on either side of V2 and V8, are not sufficiently conserved to permit amplification of the V2 and V8 regions of all bacteria.
Thus, for amplifying V2 and V8, 3 primer pairs (instead of 1 primer pair) may be used for each region, with the sequences of the primer pairs being almost identical but differing by one or two nucleotides (referred to as “degenerate primers”). Using those 3 primer pairs is a means to amplify the V2 and V8 regions for all bacteria even though the conserved regions on either side of V2 and V8 may be slightly different for different bacteria.
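By way of illustration only, the use of several nearly identical primer variants to cover slightly different flanking sequences may be sketched as follows. The sketch assumes exact matching of a primer variant to the template; the variant names and the short sequences are made up for illustration and are not primers of the disclosure.

```python
def hamming(a, b):
    """Count positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def covering_variants(template, variants):
    """Return the names of primer variants found exactly in the template."""
    return [name for name, primer in variants.items() if primer in template]


# Three hypothetical variants differing from one another by 1 or 2 bases.
variants = {"a": "ACGTTGCA", "b": "ACGTCGCA", "c": "ACATCGCA"}

# Two hypothetical templates whose flanking sequences differ slightly.
template_1 = "TTACGTTGCAGG"  # matched only by variant "a"
template_2 = "TTACATCGCAGG"  # matched only by variant "c"
```

With a single primer, one of the two templates above would not be amplified; with the set of variants, each template is matched by at least one primer.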


The hypervariable segments generated by the in silico simulation for the species and strains in the 16S rRNA gene sequences database are further processed to identify expected patterns, or signatures, in the hypervariable segments characteristic of the species and strain. An expected signature is generated for each species and strain based on the presence (=1) or absence (=0) of each of the targeted hypervariable segments in the amplification results of the in silico simulation. Matrix A gives an example of expected signatures when there is one primer pair per hypervariable region V2-V9 for two species A and B.


MATRIX A

SPECIES  SEQ. #  V2  V3  V4  V5  V6  V7  V8  V9
A        1       0   1   1   1   0   0   1   0
A        2       0   0   0   1   0   0   1   0
A        3       0   1   1   1   0   0   1   0
B        4       0   1   1   1   0   0   1   0


Matrix B gives an example of expected signatures when more than one primer pair targets two of the hypervariable regions, V2 and V8, for species C and D. In this example, three primer pairs target hypervariable region V2 and three primer pairs target hypervariable region V8.


MATRIX B

SPECIES  SEQ. #  V2  V2  V2  V3  V4  V5  V6  V7  V8  V8  V8  V9
C        1       0   1   1   1   0   0   0   0   0   1   1   1
C        2       0   1   1   1   0   0   0   0   0   1   1   1
C        3       0   0   1   1   1   0   0   0   0   0   1   1
D        4       0   1   1   1   0   0   0   0   0   1   1   1


For example, for the GreenGenes database, 150,000 expected signatures may be determined, one for each gene sequence in the database. The expected signatures, such as the examples shown in Matrix A and Matrix B, may be used for combining counts of aligned reads for 16S, as described with respect to FIG. 3.
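By way of illustration only, generation of an expected signature from in silico amplification results may be sketched as follows. The sketch assumes that the simulation output is available as a set of primer-pair labels that produced a segment for a given reference sequence; the function name and the column labels (e.g., "V2_1") are hypothetical.

```python
def expected_signature(amplified, columns):
    """Return a tuple of 1/0 flags, one per targeted segment column.

    amplified: set of column labels for which the in silico simulation
               produced a segment for this reference sequence.
    columns:   ordered list of all targeted segment columns.
    """
    return tuple(1 if column in amplified else 0 for column in columns)


# Column order of Matrix B: three primer pairs each for V2 and V8.
columns = ["V2_1", "V2_2", "V2_3", "V3", "V4", "V5", "V6", "V7",
           "V8_1", "V8_2", "V8_3", "V9"]

# Segments assumed to amplify for species C, seq. #1 (first row of Matrix B).
species_c_seq1 = {"V2_2", "V2_3", "V3", "V8_2", "V8_3", "V9"}
signature_c1 = expected_signature(species_c_seq1, columns)
```

Returning a tuple makes the signature hashable, so that reference sequences with identical signatures can later be grouped by using the signature as a dictionary key.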



FIG. 2 is a block diagram of a method for processing the sequence reads to determine microbial composition. The sequence reads are obtained from sequencing of the amplicons generated using the 16S primer pool and a species primer pool to amplify nucleic acids extracted from a sample. The barcode/sample name parser 202 separates the sequence reads into a set of sequence reads corresponding to the 16S amplicons, or 16S sequence reads, and a set of sequence reads corresponding to targeted species amplicons, or targeted species sequence reads. Quality trimming of reads may be performed, and short reads may be removed, by the base caller. For example, sequence reads having lengths less than 60 bases or greater than 350 bases may be removed. The base calls may be made by analyzing any suitable signal characteristics (e.g., signal amplitude or intensity). The structure and/or design of a sensor array, signal processing and base calling for use with the present teachings may include one or more features described in U.S. Pat. Appl. Publ. No. 2013/0090860, published Apr. 11, 2013, incorporated by reference herein in its entirety. The 16S sequence reads may be analyzed by a 16S processing pipeline 204 and the targeted species sequence reads may be analyzed independently by a targeted species processing pipeline 206. The 16S processing pipeline 204 may provide information on the species/genus/family detected in the sample for a report 208. The targeted species processing pipeline 206 may provide information on the species detected in the sample for a report 210.
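By way of illustration only, the separation of sequence reads by barcode and the removal of reads by length may be sketched as follows. The sketch assumes that each read is available as a (barcode, sequence) pair and that one barcode was used for the 16S pool and another for the species pool; the barcode names and the function name are hypothetical, and the length limits of 60 and 350 bases follow the example given above.

```python
def split_and_filter(reads, barcode_to_pool, min_len=60, max_len=350):
    """Separate reads into 16S and targeted-species sets by barcode.

    reads: iterable of (barcode, sequence) pairs.
    barcode_to_pool: dict mapping a barcode to "16S" or "species".
    Reads shorter than min_len or longer than max_len are dropped, as are
    reads whose barcode is not recognized.
    """
    pools = {"16S": [], "species": []}
    for barcode, sequence in reads:
        pool = barcode_to_pool.get(barcode)
        if pool is None:
            continue
        if min_len <= len(sequence) <= max_len:
            pools[pool].append(sequence)
    return pools


# Hypothetical barcode assignment and reads.
barcode_to_pool = {"BC01": "16S", "BC02": "species"}
reads = [
    ("BC01", "A" * 100),  # kept in the 16S set
    ("BC02", "C" * 59),   # too short, removed
    ("BC02", "G" * 350),  # kept in the species set
    ("BC01", "T" * 351),  # too long, removed
    ("BC99", "A" * 100),  # unknown barcode, removed
]
pools = split_and_filter(reads, barcode_to_pool)
```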



FIG. 3 is a block diagram of the 16S read data processing pipeline, in accordance with an embodiment. In step 302, the 16S sequence reads identified by the barcode/sample name parser 202 are received in an unaligned BAM file. The 16S sequence reads are subjected to two mapping steps 304 and 312. In a first mapping step 304, the 16S sequence reads are aligned to the reference hypervariable segments of the compressed 16S reference set, with multi-mapping and end-to-end mapping enabled. The mapping steps 304 and 312 determine aligned sequence reads and associated mapping quality parameters. Methods for aligning sequence reads for use with the present teachings may include one or more features described in U.S. Pat. Appl. Publ. No. 2012/0197623, published Aug. 2, 2012, incorporated by reference herein in its entirety. The mapped reads are filtered based on alignment quality. For example, a minimum local alignment score may be set to 35. With multi-mapping enabled, one read may align to more than one of the reference hypervariable segments. The alignments having equal best scores may be included in observed read counts for subsequent steps.
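By way of illustration only, the filtering of multi-mapped alignments by a minimum score and the retention of alignments having equal best scores may be sketched as follows. The sketch assumes that alignment results have already been reduced to (read identifier, reference segment identifier, score) records; the identifiers and the function name are hypothetical.

```python
def best_alignments(alignments, min_score=35):
    """Keep, for each read, every alignment tied for the best score.

    alignments: iterable of (read_id, reference_id, score) records.
    Alignments scoring below min_score are discarded first. Returns a dict
    mapping each remaining read_id to a sorted list of reference ids.
    """
    hits_by_read = {}
    for read_id, reference_id, score in alignments:
        if score >= min_score:
            hits_by_read.setdefault(read_id, []).append((reference_id, score))

    kept = {}
    for read_id, hits in hits_by_read.items():
        top_score = max(score for _, score in hits)
        kept[read_id] = sorted(ref for ref, score in hits if score == top_score)
    return kept


# Hypothetical alignment records.
alignments = [
    ("read1", "C_seq1_V3", 80),
    ("read1", "C_seq2_V3", 80),  # ties with the record above, both kept
    ("read1", "D_seq4_V3", 60),  # lower score, dropped
    ("read2", "C_seq1_V9", 30),  # below the minimum score, dropped
    ("read3", "C_seq2_V9", 35),  # exactly the minimum score, kept
]
kept = best_alignments(alignments)
```

Each reference identifier retained for a read would then contribute one count to the corresponding cell of the observed read count matrix.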


In step 306, a matrix of observed read counts for each of the targeted hypervariable regions for each species and strain is formed from the aligned reads information in the aligned BAM file and operations to reduce the read count matrix are applied. For example, the matrix may have dimensions of 150,000×(number of targeted hypervariable regions per gene). Matrix C gives an example of a portion of a matrix of observed read counts, or read count matrix, for targeted hypervariable regions corresponding to the example of Matrix B.


MATRIX C

SPECIES  SEQ. #  V2  V2  V2   V3    V4  V5  V6  V7  V8  V8   V8    V9
C        1       5   10  200  1000  0   0   0   1   2   100  2000  400
C        2       15  20  100  1000  5   10  0   0   0   200  5000  500


In step 306, a first reduction of the read count matrix reduces the number of rows. The read counts corresponding to the hypervariable regions for each row may be added to form a row sum, and a threshold TRS is applied to the row sum. For example, the row sum threshold TRS may be set to 500 so that rows with fewer than 500 reads are eliminated. The row sum threshold TRS may be set in a range between 100 and 1000. The same row sum threshold TRS is applied to the sum for each row. Since each row corresponds to a species and strain, the eliminated rows correspond to species and strains that are eliminated from further consideration.
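By way of illustration only, the first reduction may be sketched as follows, using the two rows of Matrix C together with one additional low-count row that is hypothetical. The function name and the dictionary layout are assumptions made for the sketch.

```python
def filter_rows_by_sum(read_counts, t_rs=500):
    """Drop rows whose total read count is below the row sum threshold.

    read_counts: dict mapping (species, seq_number) to a list of read
                 counts, one per targeted hypervariable segment column.
    """
    return {key: counts for key, counts in read_counts.items()
            if sum(counts) >= t_rs}


read_counts = {
    ("C", 1): [5, 10, 200, 1000, 0, 0, 0, 1, 2, 100, 2000, 400],   # sum 3718
    ("C", 2): [15, 20, 100, 1000, 5, 10, 0, 0, 0, 200, 5000, 500],  # sum 6850
    ("E", 5): [0, 0, 3, 40, 0, 0, 0, 0, 0, 2, 50, 5],  # hypothetical, sum 100
}
retained = filter_rows_by_sum(read_counts)
```

Both rows of Matrix C exceed the example threshold of 500 and are retained, while the hypothetical row with 100 reads is eliminated.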


In step 306, a second reduction of the read count matrix combines the read counts of the rows based on the expected signatures determined from the in silico simulations, such as the expected signatures given in the examples of Matrix A or Matrix B. For species and strains (rows) having identical expected signatures, the read counts per hypervariable region (column) of the same species are added to give a column sum of read counts corresponding to a single species. For example, in Matrix B the expected signatures for species C, seq. #1 and seq. #2 are identical. Matrix C gives the read count matrix for row sums>TRS for species C, seq. #1 and seq. #2. The read counts per hypervariable region (column) for species C, seq. #1 and seq. #2 may be added because they correspond to identical expected signatures within the same species C in Matrix B. Note that species D also has the same expected signature as seq. #1 and seq. #2 of species C, but read counts for species D will not be added to read counts for species C because they belong to different species. The column sums for the combined read counts may be added to form a combined sum. A threshold TCOMB is applied to the combined sum. If the combined sum is greater than TCOMB, the corresponding rows of the read count matrix are retained; otherwise, the corresponding rows of the read count matrix are eliminated, thus reducing the size of the read count matrix. The threshold TCOMB for the combined sum may be set to 10,000, for example, and is configurable by the user.
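By way of illustration only, the second reduction may be sketched as follows, using the read counts of Matrix C and the expected signatures of Matrix B. The read counts shown for species D are hypothetical and are included only to show that rows of different species are not combined. The function name and data layout are assumptions made for the sketch.

```python
from collections import defaultdict


def combine_by_signature(read_counts, expected, t_comb=10_000):
    """Combine rows of the same species that share an expected signature.

    read_counts: dict of (species, seq_number) -> list of read counts.
    expected:    dict of (species, seq_number) -> expected signature tuple.
    Returns dict of (species, signature) -> {"rows": [...], "column_sums": [...]}
    for groups whose combined sum is greater than t_comb.
    """
    groups = defaultdict(list)
    for (species, seq_number), counts in read_counts.items():
        signature = expected[(species, seq_number)]
        groups[(species, signature)].append(((species, seq_number), counts))

    combined = {}
    for group_key, members in groups.items():
        # Add the counts column by column across all rows in the group.
        column_sums = [sum(column) for column in zip(*(c for _, c in members))]
        if sum(column_sums) > t_comb:
            combined[group_key] = {
                "rows": [row_key for row_key, _ in members],
                "column_sums": column_sums,
            }
    return combined


shared_signature = (0, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1)
expected = {("C", 1): shared_signature, ("C", 2): shared_signature,
            ("D", 4): shared_signature}
read_counts = {
    ("C", 1): [5, 10, 200, 1000, 0, 0, 0, 1, 2, 100, 2000, 400],
    ("C", 2): [15, 20, 100, 1000, 5, 10, 0, 0, 0, 200, 5000, 500],
    ("D", 4): [0, 5, 50, 200, 0, 0, 0, 0, 0, 20, 300, 40],  # hypothetical
}
combined = combine_by_signature(read_counts, expected)
```

For species C the combined sum is 10,568, which is greater than the example threshold of 10,000, so both rows are retained; the hypothetical species D row forms its own group with a combined sum of 615 and is eliminated.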


A signature threshold TS is applied to each of the column sums to give binary values by assigning a “1” if the column sum≥TS and assigning “0” if the column sum<TS. The resulting array of binary values provides the observed signature. For example, the column sum threshold TS can be set to 10.
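By way of non-limiting illustration, the second reduction and the derivation of the observed signature may be sketched as follows, using the read counts of species C, seq. #1 and seq. #2 from Matrix D. The function name and data layout are assumptions made for this sketch; rows are grouped only when they share both the species and the expected signature.

```python
from collections import defaultdict


def combine_and_sign(rows, expected, t_comb=10_000, t_s=10):
    """Combine rows with identical expected signatures within a species.

    rows:     {(species, seq): [read count per column]}
    expected: {(species, seq): tuple of 0/1 expected signature}
    Returns {(species, expected signature): (column_sums, combined_sum,
    observed_signature)} for groups whose combined sum exceeds t_comb.
    """
    grouped = defaultdict(list)
    for (species, seq), counts in rows.items():
        grouped[(species, expected[(species, seq)])].append(counts)

    retained = {}
    for key, members in grouped.items():
        column_sums = [sum(column) for column in zip(*members)]
        combined_sum = sum(column_sums)
        if combined_sum > t_comb:
            # Binary observed signature: 1 if column sum >= t_s, else 0.
            observed = tuple(1 if c >= t_s else 0 for c in column_sums)
            retained[key] = (column_sums, combined_sum, observed)
    return retained


EXPECTED_C = (0, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1)
matrix_d_rows = {
    ("C", 1): [5, 10, 200, 1000, 0, 0, 0, 1, 2, 100, 2000, 400],
    ("C", 2): [15, 20, 100, 1000, 5, 10, 0, 0, 0, 200, 5000, 500],
}
expected_signatures = {("C", 1): EXPECTED_C, ("C", 2): EXPECTED_C}

groups = combine_and_sign(matrix_d_rows, expected_signatures)
column_sums, combined_sum, observed = groups[("C", EXPECTED_C)]
```

With TCOMB set to 10,000 and TS set to 10, the combined sum of 10,568 is retained and the observed signature agrees with the corresponding row of Matrix D.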


Exemplary results of the reduction operations and observed signature are given in Matrix D, corresponding to the examples of Matrix B and Matrix C. In this example, seq. #1 and seq. #2 of species C are retained because the combined sum is greater than the combined threshold TCOMB of 10,000.


MATRIX D

SPECIES             SEQ. #      V2    V2    V2    V3    V4    V5    V6    V7    V8    V8    V8    V9  COMB. SUM
C                   1            5    10   200  1000     0     0     0     1     2   100  2000   400
C                   2           15    20   100  1000     5    10     0     0     0   200  5000   500
COLUMN SUM                      20    30   300  2000     5    10     0     1     2   300  7000   900     10,568
OBSERVED SIGNATURE               1     1     1     1     0     1     0     0     0     1     1     1
EXPECTED SIGNATURE               0     1     1     1     0     0     0     0     0     1     1     1


In step 308, a first reduced set of full-length 16S rRNA gene sequences is generated as follows. The binary values of the observed signature and corresponding expected signature (e.g. from Matrix B, species C, seq. #1 and seq. #2) are compared and the ratio of matching categories, or matching binary values, to total categories is determined. If the ratio meets a minimum threshold TR, the full-length 16S rRNA gene sequences corresponding to the expected signatures in the given species are selected for a first reduced set of full-length reference sequences. Otherwise, the full-length 16S rRNA gene sequences corresponding to the expected signature in the given species are not included in the first reduced set of full-length reference sequences. The ratio threshold TR may be set to 0.75, and is configurable by the user. In the example of Matrix D, the binary values of the observed and expected signatures agree for 10 categories out of 12 total categories, a ratio of approximately 0.83, which is greater than the threshold TR of 0.75 (9 of the 12 categories corresponds to a ratio of 0.75). The full-length 16S rRNA gene sequences corresponding to seq. #1 and seq. #2 of species C are selected for a first reduced set of full-length reference sequences.
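By way of non-limiting illustration, the comparison of the observed and expected signatures of Matrix D may be sketched as follows. The function name is an assumption made for this sketch, and "meets" the threshold is read here as greater than or equal to TR.

```python
def signature_match_ratio(observed, expected):
    """Return the fraction of categories whose binary values agree."""
    if len(observed) != len(expected):
        raise ValueError("signatures must cover the same categories")
    matches = sum(o == e for o, e in zip(observed, expected))
    return matches / len(expected)


observed_signature = (1, 1, 1, 1, 0, 1, 0, 0, 0, 1, 1, 1)
expected_signature = (0, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1)

T_R = 0.75
ratio = signature_match_ratio(observed_signature, expected_signature)
selected = ratio >= T_R  # True: the reference sequences are kept
```

The two signatures disagree only in the first V2 category and in the V5 category, so 10 of the 12 categories match.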


In step 310, the first reduced set of full-length 16S rRNA gene sequences may be further reduced by a reassignment of unannotated species strains based on a sequence similarity metric to form a second reduced set of full-length reference sequences. The second reduced set of full-length reference sequences is used for the second mapping step. The gene sequence for an unannotated species in the first reduced set is compared to each annotated sequence in the first reduced set having the same genus. The Levenshtein distance is calculated between the unannotated sequence and each of the annotated sequences having the same genus. The Levenshtein distance is a count of differences between two sequences, including substitutions, insertions and deletions. If the Levenshtein distance between an unannotated sequence and an annotated sequence in the first reduced set is less than a threshold TLEV then the annotated sequence is identified as a candidate annotated sequence. For example, the threshold TLEV may be set to 80. The threshold TLEV of 80 counts corresponds to 5% of a 16S gene length of 1600 bp. In some situations, there may be a single candidate annotated sequence. For a single candidate annotated sequence, the unannotated sequence is reannotated with the annotation of the candidate annotated sequence. In some situations, there may be multiple candidate annotated sequences associated with a given unannotated sequence. When there are multiple candidate annotated sequences, the candidate annotated sequence with the lowest Levenshtein distance is selected. The given unannotated sequence is reannotated with the annotation corresponding to the selected candidate annotated sequence. When more than one candidate annotated sequence associated with the given unannotated sequence have equal Levenshtein distances, the given unannotated sequence is included as is in the second reduced set of full-length reference sequences. 
The reannotated sequences are represented by the annotated sequences to which they were matched and the unannotated versions are removed to form the second reduced set of full-length reference sequences. The size of the first reduced set of full-length sequences is reduced by the number of previously unannotated sequences that were removed, to produce the second reduced set of full-length reference sequences.
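By way of non-limiting illustration, the reannotation of an unannotated sequence may be sketched as follows. The function names and the short example sequences are assumptions made for this sketch. A tie is interpreted here as two or more candidates sharing the lowest distance, in which case no reannotation is made.

```python
def levenshtein(a, b):
    """Edit distance counting substitutions, insertions and deletions."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,                       # deletion
                current[j - 1] + 1,                    # insertion
                previous[j - 1] + (char_a != char_b),  # substitution or match
            ))
        previous = current
    return previous[-1]


def pick_annotation(unannotated_seq, annotated, t_lev=80):
    """Choose a species annotation for an unannotated sequence.

    annotated: list of (species_label, sequence) pairs from the same genus.
    Returns the label of the unique closest candidate with distance below
    t_lev, or None if there is no candidate or the closest candidates tie.
    """
    scored = [(levenshtein(unannotated_seq, seq), label)
              for label, seq in annotated]
    candidates = [(dist, label) for dist, label in scored if dist < t_lev]
    if not candidates:
        return None
    best = min(dist for dist, _ in candidates)
    winners = [label for dist, label in candidates if dist == best]
    return winners[0] if len(winners) == 1 else None


# Hypothetical toy sequences; real 16S sequences are about 1600 bp long.
same_genus = [("sp1", "ACGTACGA"), ("sp2", "ACGAACGA")]
chosen = pick_annotation("ACGTACGT", same_genus, t_lev=3)
tied = pick_annotation("ACGTACGT",
                       [("sp1", "ACGTACGA"), ("sp2", "ACGTACGC")], t_lev=3)
```

The dynamic programming routine keeps only two rows of the distance table, which keeps memory use modest for full-length 16S sequences.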


The second reduced set of full-length reference sequences may have substantially fewer full-length 16S rRNA sequences than the original number in the database. For example, the 150,000 16S rRNA sequences in the GreenGenes database may be reduced to a few thousand full-length sequences. An advantage of the smaller size of the second reduced set of full-length reference sequences is a smaller memory requirement. Another advantage is a faster search time for the second mapping step because there are fewer full-length reference sequences to match with the sequence reads. Another advantage is that the reannotated sequences allow more species level resolution because the reannotated sequences are associated with a species level rather than a genus level for unannotated sequences. In the second mapping step, sequence reads will be associated with reannotated reference sequences that indicate a species. This results in more sequence reads mapped to a given species for improved read depth at the species level. Furthermore, the second reduced set of full-length reference sequences are more likely to match the sequence reads in the second mapping step, since they are determined based on the observed read counts resulting from the first mapping step.


In a second mapping step, 312, the sequence reads are mapped to the second reduced set of full-length 16S rRNA gene sequences. In step 314, one best hit per sequence read is counted (i.e., multi-mapping is disabled). In a first normalizing step, 316, the read count for each 16S reference sequence obtained after the second mapping step, 312, is normalized by dividing the read count by the number of 1's in the expected signature to form first normalized counts. For the example of Matrix D, the expected signature has six 1's and corresponds to two 16S reference sequences for species C. The read counts for each of the 16S reference sequences corresponding to the same expected signature for species C are divided by six to give the corresponding first normalized counts. In a second normalizing step, 318, the first normalized counts are divided by an average copy number of the 16S gene for the species to form second normalized counts. The copy numbers for the species may be obtained from a 16S copy number database, such as rrnDB (https://rrndb.umms.med.umich.edu/; Stoddard S. F., Smith B. J., Hein R., Roller B. R. K. and Schmidt T. M. (2015) rrnDB: improved tools for interpreting rRNA gene abundance in bacteria and archaea and a new foundation for future development. Nucleic Acids Research 2014; doi: 10.1093/nar/gku1201). For a given species, the copy numbers for the 16S gene given in the database records may be averaged to form the average copy number used for the second normalizing step 318. The second normalizing step 318 may be optional.
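By way of non-limiting illustration, the two normalizing steps may be sketched as follows. The function name, the read count of 6000 and the copy numbers are hypothetical values chosen for this sketch; only the expected signature is taken from Matrix D.

```python
from statistics import mean


def normalize_count(read_count, expected_signature, avg_copy_number=None):
    """Apply the first and, optionally, the second normalizing step.

    First step: divide by the number of 1's in the expected signature.
    Second step (optional): divide by the average 16S copy number.
    """
    ones = sum(expected_signature)
    if ones == 0:
        raise ValueError("expected signature contains no 1's")
    first_normalized = read_count / ones
    if avg_copy_number is None:
        return first_normalized
    return first_normalized / avg_copy_number


expected_signature = (0, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1)  # six 1's

# Hypothetical copy numbers from database records for one species.
copy_numbers = [3, 4, 5]
average_copies = mean(copy_numbers)  # 4

first = normalize_count(6000, expected_signature)
second = normalize_count(6000, expected_signature, average_copies)
```

Passing no copy number returns the first normalized count unchanged, reflecting that the second normalizing step may be optional.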


In step 320, the second normalized counts are aggregated, or added, for the species level, genus level and family level. The percentage of aggregated counts to the total number of mapped reads is calculated and thresholds may be applied for species detection, genus detection and family detection to give relative abundances if the threshold criteria are met. The species, genus and/or family may be reported as present if the percentage value is greater than the respective threshold. For example, the thresholds applied may be set to the values shown in Table E below. The thresholds can be set by the user. The threshold for detection may also be referred to as a noise threshold. In step 322, the relative abundance at the species/genus/family level may be reported to the user.


TABLE E
Example Thresholds

AGGREGATED COUNTS/TOTAL MAPPED READS (%)   THRESHOLD
SPECIES                                    0.1%
GENUS                                      0.5%
FAMILY                                     1.0%


The above methods may provide greater than 90% sensitivity at the genus level and greater than 85% positive predictive value (PPV) at the genus level. The threshold may be used to call a respective species, genus or family as present in a sample when its percentage of aggregated counts to mapped reads is above the threshold. This threshold may be adjusted by the user to optimize for sensitivity and specificity according to the user's application.
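By way of non-limiting illustration, the aggregation and thresholding of step 320 may be sketched as follows, using the example thresholds of 0.1%, 0.5% and 1.0%. The function name, the taxonomy labels, the normalized counts and the total number of mapped reads are hypothetical values chosen for this sketch.

```python
from collections import defaultdict

# Example detection (noise) thresholds in percent.
THRESHOLDS = {"species": 0.1, "genus": 0.5, "family": 1.0}


def relative_abundance(normalized_counts, taxonomy, total_mapped_reads,
                       thresholds=THRESHOLDS):
    """Aggregate counts per taxonomic level and keep taxa above threshold.

    normalized_counts: {reference_id: second normalized count}
    taxonomy: {reference_id: {"species": ..., "genus": ..., "family": ...}}
    Returns {level: {taxon: percent of total mapped reads}}.
    """
    report = {}
    for level, threshold in thresholds.items():
        totals = defaultdict(float)
        for ref_id, count in normalized_counts.items():
            totals[taxonomy[ref_id][level]] += count
        percents = {name: 100.0 * total / total_mapped_reads
                    for name, total in totals.items()}
        report[level] = {name: pct for name, pct in percents.items()
                         if pct > threshold}
    return report


taxonomy = {
    "ref1": {"species": "Genus1 sp_a", "genus": "Genus1", "family": "Family1"},
    "ref2": {"species": "Genus1 sp_b", "genus": "Genus1", "family": "Family1"},
    "ref3": {"species": "Genus2 sp_c", "genus": "Genus2", "family": "Family1"},
}
normalized_counts = {"ref1": 50.0, "ref2": 600.0, "ref3": 400.0}
report = relative_abundance(normalized_counts, taxonomy,
                            total_mapped_reads=100_000)
```

In this hypothetical sample, a genus can be called even though one of its species falls below the species threshold, because the counts are aggregated separately at each level.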


Referring to FIG. 2, the reads from the amplicons generated using the species primer pool are separately analyzed by the targeted species analysis pipeline 206. Prior to mapping step 404 in FIG. 4, the microbial genome database, e.g. the NCBI public database, may be pre-processed to provide segmented reference sequences corresponding to expected amplicons. In this pre-processing, the microbial genome sequences in the database may be subjected to an in silico PCR simulation conducted using primers of the species primer pool to generate expected amplicon (primers+inserts) sequences from the whole genomes of all microbial strains in the database. The in silico PCR results identify genomes in the database that contain sequences that will be amplified by the primers in the species primer pool. The in silico simulation provides segmented reference sequences corresponding to the expected amplicons for the targeted species and possibly off-target species. Any genomes that do not contain sequence that would be amplified by the species primers were eliminated from the database. Any genomes that contain sequence that would be amplified using the species primers but would not be expected to contain such sequence were evaluated to determine the average nucleotide identity (ANI) between the genome and a genome that was expected to be amplified by the primers to assess possible misclassification and reannotation of the genome, and retainment of the genome in the database. For possible misclassified genomes, histograms of identities with known strains were created. A genome was reclassified only if it had greater than 95% identity to the known genome to which it was being reclassified. The segmented reference sequences may be generated once for a given set of primers and applied in multiple experiments. The segmented reference sequences may be used instead of the full-length reference genomes for the species in the mapping step. 
A compressed reference database for the targeted species includes the segmented reference sequences for each strain, including any reannotated strains. For example, a segmented reference sequence for a strain may include 1 to 8 segments and each segment may include 6 to 300 bases, giving a total number of at most a few thousand bases. For example, a full-length reference genome for a strain of E. coli may include 4.6 to 5.3 Mbases.
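By way of non-limiting illustration, the generation of an expected amplicon (primers plus insert) from a genome sequence may be sketched as follows. This is a deliberately simplified stand-in for an in silico PCR simulation: it assumes exact primer matches, searches only the given strand, reports only the first hit, and uses a hypothetical maximum amplicon length. The function names and toy sequences are assumptions made for this sketch.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def in_silico_amplicon(genome, forward, reverse, max_len=300):
    """Return the first expected amplicon, or None if nothing amplifies.

    The forward primer is matched as written; the reverse primer is matched
    through its reverse complement downstream of the forward primer.
    """
    start = genome.find(forward)
    if start == -1:
        return None
    reverse_site = reverse_complement(reverse)
    end = genome.find(reverse_site, start + len(forward))
    if end == -1:
        return None
    amplicon = genome[start:end + len(reverse_site)]
    return amplicon if len(amplicon) <= max_len else None


# Hypothetical toy genome and primers.
toy_genome = "TTTTACGTACGGGGCCCCTTGACAAAAA"
segment = in_silico_amplicon(toy_genome, "ACGTAC", "TGTCAA")
no_product = in_silico_amplicon(toy_genome, "CCCCCCC", "TGTCAA")
```

A genome for which no primer pair yields a segment would be a candidate for elimination from the compressed reference, in the manner described above.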



FIG. 4 is a block diagram of the targeted species processing pipeline, in accordance with an embodiment. In step 402, the targeted species sequence reads identified by the barcode/sample name parser 202 are received in an unaligned BAM file. Following pre-processing of the reference database, in the mapping step 404 the targeted species sequence reads are mapped to the segmented reference sequences for the species and strains in the compressed targeted species reference, with end-to-end mapping enabled. The mapped reads are filtered based on alignment quality. For example, a minimum local alignment score may be set to 25. In step 406, those reads that uniquely mapped to a single species (either uniquely to one segmented reference sequence for a single strain of a species or to multiple segmented reference sequences for multiple strains of the same species) are included in a read count. Matrix E gives examples of reads W, X, Y and Z mapping to strains of species A, B and C.


MATRIX E

SPECIES   SEQ. #   READ W   READ X   READ Y   READ Z
A         1        MAPPED   -        -        -
A         2        MAPPED   MAPPED   -        -
B         1        -        MAPPED   -        -
B         2        -        -        -        MAPPED
C         1        -        -        MAPPED   -
C         2        -        -        MAPPED   -


In the example of Matrix E, reads W, Y and Z would be included in the read counts and read X would not be included in the read counts. Reads W and Y each mapped to segmented reference sequences corresponding to two strains of the same species, A and C, respectively. Read Z mapped to a segmented reference sequence for a single strain of species B. Read X mapped to a strain of species A and also to a strain of species B, so it will not be counted.
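By way of non-limiting illustration, the selection of reads that map to a single species may be sketched as follows, using the mappings of Matrix E. The function name and the representation of each read as a set of (species, seq. #) hits are assumptions made for this sketch.

```python
from collections import Counter


def count_species_specific_reads(mappings):
    """Count reads whose hits all belong to one species.

    mappings: {read_id: set of (species, seq_number) hits}
    A read hitting several strains of one species is counted once for that
    species; a read hitting more than one species is not counted.
    """
    counts = Counter()
    for hits in mappings.values():
        species_hit = {species for species, _ in hits}
        if len(species_hit) == 1:
            counts[species_hit.pop()] += 1
    return counts


matrix_e = {
    "W": {("A", 1), ("A", 2)},  # two strains of species A, counted
    "X": {("A", 2), ("B", 1)},  # species A and B, not counted
    "Y": {("C", 1), ("C", 2)},  # two strains of species C, counted
    "Z": {("B", 2)},            # single strain of species B, counted
}
read_counts = count_species_specific_reads(matrix_e)
```

Collapsing the hits to a set of species names makes the strain-level multiplicity irrelevant, which matches the treatment of reads W and Y.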


In step 408, the number of sequence reads mapped to segmented reference sequences of a species, including strains of the species, is calculated to form an aggregate read count per species. In step 410, the aggregate read count per species is normalized by the number of amplifying amplicons for the species. An amplifying amplicon is an amplicon having at least a minimum number of mapped reads. For example, the minimum number of mapped reads can be set to 10. Dividing the aggregate read count per species by the number of total amplifying amplicons for the species gives a normalized read count per species. The normalized read counts across all species are added to form a total of normalized read counts. The normalized read count per species is divided by the total of normalized read counts to form a ratio of normalized read counts. The ratio of normalized read counts for a species may be compared to a threshold Ttarget to decide whether a species was present in the sample. For example, the threshold Ttarget may be set to 0.1% and a species may be determined as present if the ratio of normalized read counts is greater than 0.1%. The threshold Ttarget may be set by the user. In step 412, the results of the normalized read counts for each species and the detected species may be reported to the user.
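By way of non-limiting illustration, steps 408 and 410 may be sketched as follows. The function name, the species labels and the read counts are hypothetical. Skipping a species that has no amplifying amplicon is an assumption of this sketch, made to avoid division by zero.

```python
def detect_species(amplicon_reads, min_amplicon_reads=10, t_target=0.001):
    """Normalize per-species read counts and call species above t_target.

    amplicon_reads: {species: {amplicon_id: mapped read count}}
    Returns (ratios, detected), where ratios maps each species to its share
    of the total normalized read count and detected is the set of species
    whose ratio is greater than t_target (0.001 corresponds to 0.1%).
    """
    normalized = {}
    for species, per_amplicon in amplicon_reads.items():
        amplifying = sum(1 for n in per_amplicon.values()
                         if n >= min_amplicon_reads)
        if amplifying == 0:
            continue  # assumption: no amplifying amplicon, species skipped
        aggregate = sum(per_amplicon.values())
        normalized[species] = aggregate / amplifying

    total = sum(normalized.values())
    ratios = {sp: value / total for sp, value in normalized.items()}
    detected = {sp for sp, ratio in ratios.items() if ratio > t_target}
    return ratios, detected


example_reads = {
    "S1": {"a1": 40000, "a2": 60000},  # 100000 reads over 2 amplicons
    "S2": {"b1": 12, "b2": 3},         # 15 reads over 1 amplifying amplicon
    "S3": {"c1": 900},                 # 900 reads over 1 amplicon
    "S4": {"d1": 2},                   # no amplifying amplicon
}
ratios, detected = detect_species(example_reads)
```

In this hypothetical sample, species S2 has an amplifying amplicon but its ratio of roughly 0.03% stays below the 0.1% threshold, so it is not called as present.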


Methods for Detecting, Diagnosing, Preventing and/or Treating Microorganism Imbalances


Also provided herein are methods for detection, diagnosis, prevention and/or treatment, reduction in symptoms, and/or prevention of microorganism (e.g., bacteria) imbalances and/or dysbiosis in a subject as well as of conditions, disorders and diseases associated therewith. In some embodiments, the subject is an animal, for example, an insect or a mammal, such as a domestic or agricultural mammal or a human. In some instances, the microorganism imbalance and/or dysbiosis is in the alimentary canal, or gastrointestinal tract, of the subject (the microbiota of which is often referred to as the “gut microbiota”). The gut microbiota includes a diverse population of bacteria having a symbiotic relationship with a subject. The majority of bacteria in the gut microbiota are from seven phyla: Firmicutes, Bacteroidetes, Proteobacteria, Fusobacteria, Verrucomicrobia, Cyanobacteria and Actinobacteria, with greater than 90% of the bacteria in the human gut microbiota being from the Firmicutes and Bacteroidetes phyla. The composition of the gut microbiota is associated with and/or plays a role in a number of conditions, as well as the state of health or disease of animal subjects, and contributes to the development of disorders including, for example, irritable bowel syndrome (IBS), inflammatory bowel disease (IBD) and obesity, and autoimmune disorders such as celiac disease, lupus and rheumatoid arthritis (RA). Additionally, the composition of the gut microbiome may influence susceptibility to oncological conditions, such as cancer, and responsiveness to cancer therapies. For example, the composition of the gut microbiome has been implicated as a biomarker for cancer immunotherapies, including, for example, immune checkpoint inhibitors. 
Methods of detecting, diagnosing, preventing and/or treating a condition relating to microorganism imbalances and/or dysbiosis provided herein can include detecting, measuring, characterizing, profiling, assessing and/or monitoring the bacterial composition of the microbiota of a subject. In some embodiments, methods of detecting, diagnosing, preventing and/or treating a condition provided herein are based on detecting, measuring, characterizing, profiling, assessing and/or monitoring the bacterial composition of the microbiota of the alimentary tract which has been associated with conditions, diseases and disorders affecting animal health and responsiveness to therapies.


In some embodiments, a method provided herein of detecting and/or diagnosing an imbalance of microorganisms or dysbiosis in a subject includes amplifying nucleic acids in or from a sample from the subject, obtaining sequence information of the nucleic acid amplification products, and, optionally determining the levels of nucleic acid amplification products, determining the microorganism composition of the sample by identifying genera of microorganisms in the sample, and optionally the relative levels thereof, and species of one or more of the microorganisms in the sample, comparing the microorganism composition of the sample to a reference microorganism composition, and detecting an imbalance of microorganisms in the subject if the level of one or more microorganisms in the sample differ from the level of the microorganism(s) in the reference microorganism composition, one or more microorganisms in the reference composition is not present in the sample, and/or one or more microorganisms present in the sample is not present in the reference microorganism composition. In some embodiments of the method, the sample from the subject is a sample from the alimentary canal of the subject, e.g., a fecal sample. In some embodiments, a reference microorganism composition comprises a bacterial population (representative types and relative levels of bacteria) characteristic of the microbiota of a normobiotic subject. A normobiotic subject is healthy and does not have a microorganism (e.g., bacteria) imbalance and thus is in a state of normobiosis, as opposed to dysbiosis. Typically, in a normobiotic state, microorganisms with a potential health benefit predominate in number over potentially harmful microorganisms in the microbiota. 
An imbalance of one or more microorganisms in the microbiota of a subject can also be relative to the composition of microorganisms in a reference microbiota of the same subject when not in a state of imbalance or dysbiosis or when in a healthy state free of a disorder, disease, condition and/or symptoms of an unhealthy state associated with an imbalance or dysbiosis. An imbalance of one or more microorganisms in a subject's microbiota can be relative to the average levels of the microorganism(s) typically present in any subject who is not in a state of imbalance or dysbiosis or who is in a healthy state free of a disorder, disease, condition and/or symptoms of an unhealthy state associated with an imbalance or dysbiosis. Significant deviations in the types and relative levels of constituent bacteria in a subject's microbiota from those of a bacterial population of a normobiotic subject are indicative of dysbiosis. Furthermore, the composition of different types of bacteria, and levels thereof, in the microbiota of a subject having an imbalance of microorganisms (i.e., the microbiota profile of the subject) can be indicative of susceptibility to or the occurrence of a particular condition, disorder or disease. Thus, a comparison of the bacterial constituents, and relative levels thereof, in the microbiota of a subject having an imbalance of microorganisms to microbiota profiles characteristic of certain disorders and diseases can be a consideration in diagnosing a related disorder or disease of the subject. For example, bacteria that may contribute to dysbiosis in gut microbiota in irritable bowel syndrome (IBS) include Firmicutes, Proteobacteria (Shigella and Escherichia), Actinobacteria and Ruminococcus gnavus, whereas bacteria that may contribute to dysbiosis in gut microbiota in inflammatory bowel disease (IBD) include Proteobacteria (Shigella and Escherichia), Firmicutes (specifically F. prausnitzii) and Bacteroidetes (Bacteroides and Prevotella) (see, e.g., Casen et al. (2015) Aliment Pharmacol Ther 42:71-83). 
Typically, a reduction in the diversity of the gut microbiota occurs in IBD which includes an expansion of pro-inflammatory bacteria (e.g., Enterobacteriaceae and Fusobacteriaceae) and a reduction in phyla with anti-inflammatory properties (e.g., Firmicutes). In another example, Desulfococcus, Enterobacter, Prevotella and Veillonella may be increased in gut microbiota in primary hepatocellular carcinoma compared to healthy controls (see, e.g., Ni et al. (2019) Front Microbiol Volume 10 Article 1458). In some embodiments of the methods provided herein of detecting and/or diagnosing an imbalance of microorganisms or dysbiosis in a subject, an imbalance of microorganisms in the subject is detected if the level (relative and/or absolute) of one or more microorganisms differs from the level of one or more microorganisms in the reference microorganism composition. In some embodiments, an imbalance of microorganisms in the subject is detected if one or more microorganisms in the reference composition is not present in the sample, and/or one or more microorganisms present in the sample is not present in the reference microorganism composition. In some embodiments, the relative level of one or more microorganisms in a sample from a subject is determined by counting the number of sequence reads for nucleic acid products amplified from nucleic acids in the sample and normalizing the sequence read counts as described herein.


Also provided herein are methods of treating an imbalance of microorganisms or dysbiosis in a subject. In some embodiments of treating a subject having a microorganism imbalance or dysbiosis, a subject who has a disproportionate level of one or more microorganisms is treated to establish a balance of microorganisms or biosis in the subject. In some embodiments, a method provided herein of treating a subject having an imbalance of microorganisms or dysbiosis includes amplifying nucleic acids in or from a sample from the subject, obtaining sequence information of the nucleic acid amplification products, and, optionally determining the levels of nucleic acid amplification products, determining the microorganism composition of the sample by identifying genera of microorganisms in the sample, and optionally the relative levels thereof, and species of one or more of the microorganisms in the sample, detecting an imbalance of microorganisms in the subject and treating the subject to establish a balance of microorganisms or biosis (or normobiosis) in the subject. In some embodiments of the method, the sample from the subject is a sample from the alimentary canal of the subject, e.g., a fecal sample. In some embodiments, detecting an imbalance of microorganisms includes comparing the microorganism composition of the sample to a reference microorganism composition, and detecting an imbalance of microorganisms in the subject if the level of one or more microorganisms in the sample differ from the level of the microorganism(s) in the reference microorganism composition, one or more microorganisms in the reference composition is not present in the sample, and/or one or more microorganisms present in the sample is not present in the reference microorganism composition. 
In some embodiments, treating the subject to establish a balance of microorganisms in the subject includes, but is not limited to, administering to the subject microorganisms, e.g., bacteria, that are under-represented in or absent from the microbiota, creating conditions unfavorable to the survival or growth of a microorganism over-represented in the microbiota and/or creating conditions favorable to the survival or growth of a microorganism under-represented in or absent from the microbiota. For example, in the case of a microorganism imbalance or dysbiosis of microbiota of the alimentary tract, administering bacteria to a subject may include ingestion of probiotics, which are live microorganisms that are typically delivered in food. Another technique for administering bacteria to a subject is fecal microbial transplantation, typically through transcolonoscopic infusion, of a population of bacteria from a healthy donor to supplant the imbalanced microbiota of the subject (see, e.g., van Nood et al. (2014) Curr Opin Gastroenterol 30(1):34-39). Creating conditions favorable to the survival and/or growth of a microorganism in a subject's microbiota includes, for example, administration of prebiotics. Prebiotics are compositions, typically delivered as food ingredients or supplements, that selectively stimulate growth and/or activities of one or a select group of microorganisms and include, for example, inulin-type fructans (ITF) and galactooligosaccharides (GOS). Prebiotics such as ITF and GOS have been shown to have growth-promoting effects on Bifidobacteria and Lactobacilli. Creating conditions unfavorable to the survival and/or growth of a microorganism in a subject's microbiota includes, for example, administration of antibiotics, e.g., rifaximin, and altering diet to eliminate or reduce intake of compositions that are favorable to growth of certain microorganisms, such as sulfates, animal proteins and refined sugars. 
Any of these and other possible interventions useful for establishing biosis or gut microorganism homeostasis (see, e.g., Bull and Plummer (2015) Integrative Medicine 14(1):25-33) can be used in treating a subject in methods described herein.


In some embodiments of the methods provided herein for detection, diagnosis, prevention and/or treatment, reduction in symptoms, and/or prevention of microorganism (e.g., bacteria) imbalances and/or dysbiosis in a subject as well as of conditions, disorders and diseases associated therewith, the step of amplifying nucleic acids in or from a sample from the subject includes (a) subjecting nucleic acids in or from a sample from the subject to nucleic acid amplification using a combination of primer pairs comprising (i) one or more primer pairs capable of amplifying nucleic acid sequences of one or more hypervariable regions of a prokaryotic 16S rRNA gene (referred to as the “16S rRNA gene primers or primer pairs”) and (ii) one or more primer pairs capable of amplifying a target nucleic acid sequence contained within the genome of a microorganism that is not contained within a hypervariable region of a prokaryotic 16S rRNA gene, wherein different primer pairs amplify different target nucleic acid sequences contained within the genome of different microorganisms (referred to as the “non-16S rRNA gene primers or primer pairs”). In some embodiments, obtaining sequence information from amplified nucleic acid products comprises obtaining sequence information from nucleic acid products amplified by the combination of primer pairs of (i) and (ii), and optionally determining the levels of nucleic acid products amplified by the one or more primer pairs of (i). In some embodiments, the method includes determining levels, e.g. relative and/or absolute levels, of nucleic acid products amplified by one or more primer pairs of (i), i.e., the 16S rRNA gene primer pairs, and/or (ii), i.e., the non-16S rRNA gene primer pairs, or sequence reads thereof. 
In some embodiments, the step of amplifying nucleic acids in or from a sample from the subject includes (a) subjecting the nucleic acids to two or more separate nucleic acid amplification reactions using a first set of primer pairs for one nucleic acid amplification reaction and a second set of primer pairs for the other nucleic acid amplification reaction, wherein (i) the first set of primer pairs comprises one or more primer pairs that amplifies a nucleic acid sequence of one or more hypervariable regions of a prokaryotic 16S rRNA gene (referred to as the “16S rRNA gene primers or primer pairs”) and (ii) the second set of primer pairs comprises one or more primer pairs that amplify a target nucleic acid sequence contained within the genome of a microorganism that is not contained within a hypervariable region of a prokaryotic 16S rRNA gene, wherein different primer pairs amplify different target nucleic acid sequences contained within the genome of different microorganisms (referred to as the “non-16S rRNA gene primers or primer pairs”), and obtaining sequence information comprises obtaining sequence information from nucleic acid products amplified by primer pairs of (i) and (ii). In some embodiments, the method includes determining levels, e.g. relative and/or absolute levels, of nucleic acid products amplified by one or more primer pairs of (i) and/or (ii) or sequence reads thereof.


In any embodiments of the methods provided herein for detection, diagnosis, prevention and/or treatment, reduction in symptoms, and/or prevention of microorganism (e.g., bacteria) imbalances and/or dysbiosis in a subject as well as of conditions, disorders and diseases associated therewith, the methods can include any embodiments of the methods for detecting and/or measuring the presence or absence of one or more microorganisms in a sample as described herein. For example, in some embodiments, the microorganism(s) is/are bacteria. In some embodiments, the prokaryotic 16S rRNA gene is a bacterial gene and/or the microorganism is a bacterium. In some embodiments the target nucleic acid sequence contained within a genome of a microorganism, e.g., bacteria, is unique to the microorganism. In some embodiments, the one or more 16S rRNA gene primer pairs amplify a nucleic acid sequence in a plurality of microorganisms, e.g., bacteria, from different genera. In some embodiments, the sample is a sample of contents of the alimentary tract of an animal. In some embodiments, the sample is a fecal sample.


In any embodiments of the methods provided herein for detection, diagnosis, prevention and/or treatment, reduction in symptoms, and/or prevention of microorganism (e.g., bacteria) imbalances and/or dysbiosis in a subject as well as of conditions, disorders and diseases associated therewith, a nucleic acid amplification can be performed according to any of the embodiments provided herein for such amplification. For example, in some embodiments, the one or more primer pairs that amplify a nucleic acid containing a sequence of a hypervariable region of a prokaryotic 16S rRNA gene separately amplify nucleic acids containing sequences of different hypervariable regions. In some embodiments, the primers of the one or more 16S rRNA gene primer pairs are directed to, or bind to, or hybridize to nucleic acid sequences contained in conserved regions of a prokaryotic 16S rRNA gene. In some embodiments, the one or more 16S rRNA gene primer pairs and/or non-16S rRNA gene primer pairs comprise a plurality of primer pairs. For example, the one or more 16S rRNA gene primer pairs can comprise a plurality of primer pairs that amplify nucleic acid sequences of multiple hypervariable regions of a prokaryotic 16S rRNA gene and/or the one or more non-16S rRNA gene primer pairs can comprise a plurality of primer pairs that amplify different target nucleic acid sequences contained in the genomes of a plurality of different microorganisms. In some embodiments, the amplification, or two or more separate nucleic acid amplification reactions, is/are multiplex amplification conducted in a single reaction mixture. In some embodiments, each primer of the one or more 16S rRNA gene primer pairs contains less than 10, less than 9, less than 8, less than 7, less than 6, less than 5, less than 4, less than 3, or less than 2 contiguous nucleotides of sequence identical to a sequence of contiguous nucleotides of another primer in the combination of primer pairs. 
In some embodiments, the nucleic acid sequences being amplified by the one or more 16S rRNA gene primer pairs are less than about 300 bp, less than about 250 bp, less than about 200 bp, less than about 175 bp, less than about 150 bp, or less than about 125 bp in length. In some embodiments, the 16S rRNA gene primer pairs separately amplify nucleic acids separately containing sequences of 3 or more, 4 or more, 5 or more, 6 or more, 7 or more, 8 or more or 9 different hypervariable regions of a prokaryotic 16S rRNA gene thereby producing amplified copies of the nucleic acids containing sequences of the 3 or more, 4 or more, 5 or more, 6 or more, 7 or more, 8 or more or 9 different hypervariable regions of the 16S rRNA gene of one or more microorganisms, wherein the amplified copies of different hypervariable regions are separate amplicons. In some embodiments, the 16S rRNA gene primer pairs separately amplify nucleic acids containing sequences of 8 different hypervariable regions of a prokaryotic 16S rRNA gene. In some embodiments, the 8 different hypervariable regions are V2-V9. In some embodiments, the 16S rRNA gene primer pairs separately amplify nucleic acids containing sequences of 3 or more different hypervariable regions of a prokaryotic 16S rRNA gene wherein one of the 3 or more regions is a V5 region thereby producing amplified copies of the nucleic acids containing sequences of the 3 or more hypervariable regions of the 16S rRNA gene of one or more microorganisms. In some embodiments, the combination of primer pairs includes degenerate sequences of one or more primers in one or more primer pairs. 
In some embodiments, the 16S rRNA gene primer pair(s) comprise primers and/or primer pairs containing, or consisting essentially of, a sequence or sequences of a primer or primer pair in Table 15, or SEQ ID NOS: 1-24 in Table 15 and/or SEQ ID NOS: 25-48 in Table 15, or SEQ ID NOS: 11-16, 23 and 24 in Table 15 and/or SEQ ID NOS: 35-40, 47 and 48 in Table 15, or substantially identical or similar sequences, and optionally wherein one or more thymine bases is substituted with a uracil base.
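
By way of non-limiting illustration, the constraint that each primer shares fewer than a stated number of identical contiguous nucleotides with any other primer in the combination can be checked computationally. The following Python sketch is an assumption offered for illustration only and is not the method of the disclosure: the function names `longest_shared_run` and `pairs_exceeding` are hypothetical, the comparison is made on the primer strings as written (reverse complements are not considered), and degenerate bases are treated as literal characters.

```python
from __future__ import annotations

from itertools import combinations


def longest_shared_run(a: str, b: str) -> int:
    """Length of the longest contiguous substring common to a and b."""
    best = 0
    prev = [0] * (len(b) + 1)
    for ca in a:
        curr = [0] * (len(b) + 1)
        for j, cb in enumerate(b, start=1):
            if ca == cb:
                # Extend the run that ended at the previous position of both strings.
                curr[j] = prev[j - 1] + 1
                best = max(best, curr[j])
        prev = curr
    return best


def pairs_exceeding(primers: dict[str, str], max_run: int) -> list[tuple[str, str, int]]:
    """Return (name1, name2, run) for primers sharing max_run or more
    identical contiguous nucleotides, i.e. pairs that violate a
    'less than max_run' requirement."""
    flagged = []
    for (n1, s1), (n2, s2) in combinations(primers.items(), 2):
        run = longest_shared_run(s1.upper(), s2.upper())
        if run >= max_run:
            flagged.append((n1, n2, run))
    return flagged


if __name__ == "__main__":
    # Hypothetical sequences for illustration; these are not taken from Table 15.
    demo = {"p1": "AAAACCCC", "p2": "GGCCCCTT", "p3": "TTTTTTTT"}
    print(pairs_exceeding(demo, max_run=4))
```

In such a sketch, an empty result for a chosen `max_run` (e.g., 10) would indicate that every primer in the combination contains less than that number of contiguous nucleotides identical to any other primer.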


In some embodiments of the methods provided herein for detection, diagnosis, prevention and/or treatment, reduction in symptoms, and/or prevention of microorganism (e.g., bacteria) imbalances and/or dysbiosis in a subject as well as of conditions, disorders and diseases associated therewith, the one or more non-16S rRNA gene primer pairs specifically amplify a target nucleic acid sequence contained within a genome of a microorganism selected from the microorganisms of Table 1, or Table 1, except for, or excluding, Actinomyces viscosus and/or Blautia coccoides, or Table 1, except for, or excluding, Actinomyces viscosus, Blautia coccoides and/or Helicobacter salomonis. In some embodiments, the target nucleic acid is unique to the microorganism. In some such embodiments, amplified copies of nucleic acids from a plurality of different microorganisms in Table 1, or Table 1, except for, or excluding, Actinomyces viscosus and/or Blautia coccoides, or Table 1, except for, or excluding, Actinomyces viscosus, Blautia coccoides and/or Helicobacter salomonis, are produced. In some embodiments, at least one, or one or more, target nucleic acid sequence(s) comprises or consists essentially of a nucleotide sequence selected from the nucleotide sequences of SEQ ID NOS: 1605-1979 in Table 17, or SEQ ID NOS: 1605-1806, 1809-1816, 1821-1970, 1972-1974 and 1977-1979 of Table 17, or SEQ ID NOS: 1605-1826 in Table 17, or SEQ ID NOS: 1605-1806, 1809-1816 and 1821-1826 in Table 17, or SEQ ID NOS: 1605-1820 in Table 17A, or SEQ ID NOS: 1605-1806 and 1809-1816 in Table 17A, or SEQ ID NOS: 1827-1979 in Table 17C, or SEQ ID NOS: 1827-1976 in Table 17C, or a substantially identical or similar sequence, or the complement thereof. 
In some embodiments, at least one, or one or more, target nucleic acid sequences comprises a nucleotide sequence selected from the sequences of SEQ ID NOS: 1605-1979 in Table 17, or SEQ ID NOS: 1605-1806, 1809-1816, 1821-1970, 1972-1974 and 1977-1979 of Table 17, or SEQ ID NOS: 1605-1826 in Table 17, or SEQ ID NOS: 1605-1806, 1809-1816 and 1821-1826 in Table 17, or SEQ ID NOS: 1605-1820 in Table 17A, or SEQ ID NOS: 1605-1806 and 1809-1816 in Table 17A, or SEQ ID NOS: 1827-1979 in Table 17C, or SEQ ID NOS: 1827-1976 in Table 17C, or a substantially identical or similar sequence, or the complement thereof and is less than about 500, less than about 475, less than about 450, less than about 400, less than about 375, less than about 350, less than about 300, less than about 275, less than about 250, less than about 200, less than about 175, less than about 150, or less than about 100 nucleotides in length. In some embodiments, at least one, or one or more, product(s) of the nucleic acid amplification comprises, or consists essentially of, a nucleotide sequence selected from SEQ ID NOS: 1605-1979 in Table 17, or SEQ ID NOS: 1605-1806, 1809-1816, 1821-1970, 1972-1974 and 1977-1979 of Table 17, or SEQ ID NOS: 1605-1826 in Table 17, or SEQ ID NOS: 1605-1806, 1809-1816 and 1821-1826 in Table 17, or SEQ ID NOS: 1605-1820 in Table 17A, or SEQ ID NOS: 1605-1806 and 1809-1816 in Table 17A, or SEQ ID NOS: 1827-1979 in Table 17C, or SEQ ID NOS: 1827-1976 in Table 17C, or a substantially identical or similar sequence, or the complement thereof, and optionally having one or more primer sequences at the 5′ and/or 3′ end(s) of the sequence, such as any of the primer sequences provided herein, and is less than about 500, less than about 475, less than about 450, less than about 400, less than about 375, less than about 350, less than about 300, less than about 275, less than about 250, less than about 200, less than about 175, less than about 150, or less than about 100 nucleotides in 
length. In some embodiments, the at least one non-16S rRNA gene primer pair does not detectably amplify a nucleic acid sequence contained within any genus other than the genus of the microorganism containing the target nucleic acid sequence. In some embodiments, the at least one non-16S rRNA gene primer pair does not detectably amplify a nucleic acid sequence contained within any species other than the species of the microorganism containing the target nucleic acid sequence. In some embodiments, at least one primer of the non-16S rRNA gene primer pair, or at least one non-16S rRNA gene primer pair, contains, or consists essentially of, the sequence or sequences of a primer or primer pair in Table 16, or SEQ ID NOS: 49-520 of Table 16, or SEQ ID NOS: 49-452, 457-472 and 481-520 of Table 16, or SEQ ID NOS: 49-492 of Table 16, or SEQ ID NOS: 49-452, 457-472 and 481-492 of Table 16, or SEQ ID NOS: 49-480 of Table 16A, or SEQ ID NOS: 49-452 and 457-472 of Table 16A, or SEQ ID NOS: 521-826 of Table 16C, or SEQ ID NOS: 521-820 of Table 16C, or SEQ ID NOS: 827-1298 of Table 16, or SEQ ID NOS: 827-1230, 1235-1250 and 1259-1298 of Table 16, or SEQ ID NOS: 827-1270 of Table 16, or SEQ ID NOS: 827-1230, 1235-1250 and 1259-1270 of Table 16, or SEQ ID NOS: 827-1258 of Table 16D, or SEQ ID NOS: 827-1230 and 1235-1250 of Table 16D, or SEQ ID NOS: 1299-1604 of Table 16F, or SEQ ID NOS: 1299-1598 of Table 16F, or a substantially identical or similar sequence(s), or any of the aforementioned nucleotide sequences of nucleic acids or primer pairs in which one or more thymine bases is substituted with a uracil base. 
In some embodiments, the nucleic acids are subjected to nucleic acid amplification using a plurality of non-16S rRNA gene primers or primer pairs, each containing, or consisting essentially of, a sequence or sequences of a primer pair in Table 16, or SEQ ID NOS: 49-520 of Table 16, or SEQ ID NOS: 49-452, 457-472 and 481-520 of Table 16, or SEQ ID NOS: 49-492 of Table 16, or SEQ ID NOS: 49-452, 457-472 and 481-492 of Table 16, or SEQ ID NOS: 49-480 of Table 16A, or SEQ ID NOS: 49-452 and 457-472 of Table 16A, or SEQ ID NOS: 521-826 of Table 16C, or SEQ ID NOS: 521-820 of Table 16C, or SEQ ID NOS: 827-1298 of Table 16, or SEQ ID NOS: 827-1230, 1235-1250 and 1259-1298 of Table 16, or SEQ ID NOS: 827-1270 of Table 16, or SEQ ID NOS: 827-1230, 1235-1250 and 1259-1270 of Table 16, or SEQ ID NOS: 827-1258 of Table 16D, or SEQ ID NOS: 827-1230 and 1235-1250 of Table 16D, or SEQ ID NOS: 1299-1604 of Table 16F, or SEQ ID NOS: 1299-1598 of Table 16F, or a substantially identical or similar sequence(s), or any of the aforementioned nucleotide sequences of nucleic acids or primer pairs in which one or more thymine bases is substituted with a uracil base. In some embodiments, at least one primer or one primer pair in the combination of primer pairs includes a modification that facilitates nucleic acid manipulation, amplification, ligation and/or sequencing of amplification products and/or reduction or elimination of primer dimers. In particular embodiments, a modification is one that facilitates multiplex nucleic acid amplification, ligation and/or sequencing of products of multiplex amplification.


In some embodiments of the methods provided herein for detection, diagnosis, prevention and/or treatment, reduction in symptoms, and/or prevention of microorganism (e.g., bacteria) imbalances and/or dysbiosis in a subject as well as of conditions, disorders and diseases associated therewith, the method is designed to focus on the make-up or composition of the population of microorganisms in the sample particularly with respect to certain groups of microorganisms, and, in some embodiments, the proportionate presence of the group and/or group members in the population. In such embodiments, the combination of primers and/or primer pairs includes a selected group or sub-group of microorganism-specific nucleic acids and includes kingdom-encompassing nucleic acids (e.g., 16S rRNA gene primers and primer pairs), and enables a focus on one or more particular microorganisms of interest that may be significant in certain states of health and disease or microbiota imbalance. In some embodiments, combinations of nucleic acids include microorganism-specific nucleic acids, and/or primer pairs, that specifically amplify a nucleic acid sequence contained in the genome of one or more microorganisms (e.g., bacteria) implicated in one or more conditions, disorders and/or diseases. In particular embodiments, the combination of nucleic acid primers and/or primer pairs includes a nucleic acid and/or a primer pair that specifically amplifies a target nucleic acid sequence contained within a genome of a microorganism selected from the microorganisms in Table 1. In some embodiments, the combination includes a plurality of nucleic acids and/or primer pairs that include at least one nucleic acid primer pair that specifically amplifies a target nucleic acid in each of the microorganisms in Table 1, or Table 1, except for, or excluding, Actinomyces viscosus and/or Blautia coccoides, or Table 1, except for, or excluding, Actinomyces viscosus, Blautia coccoides and/or Helicobacter salomonis. 
In some embodiments, the plurality of primer pairs includes primer pairs that specifically amplify genomic target nucleic acids contained within at least 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60 or 70 of the microorganisms in Table 1, or Table 1, except for, or excluding, Actinomyces viscosus and/or Blautia coccoides, or Table 1, except for, or excluding, Actinomyces viscosus, Blautia coccoides and/or Helicobacter salomonis. In particular embodiments, the target nucleic acid sequences contained in the genome of the different microorganisms are unique to each of the microorganisms. In some embodiments, a combination of nucleic acids and/or nucleic acid primer pairs includes two or more nucleic acids and/or nucleic acid primer pairs that specifically amplify a unique nucleic acid sequence contained in the genome of one or more of the Group A microorganisms, the Group B microorganisms, the Group C microorganisms, the Group C microorganisms excluding Helicobacter salomonis (Subgroup 1 of Group C), the Group D microorganisms or the Group E microorganisms (see Table 2). In some embodiments, the combination of nucleic acids and/or nucleic acid primer pairs includes a set of nucleic acid primer pairs in which each different nucleic acid primer pair specifically amplifies a different unique nucleic acid sequence contained in a different one of each of the genomes of the different microorganisms in Group A, Group B, Group C, Subgroup 1 of Group C (the Group C microorganisms excluding Helicobacter salomonis), Group D or Group E.


In some embodiments of the methods provided herein for detection, diagnosis, prevention and/or treatment, reduction in symptoms, and/or prevention of microorganism (e.g., bacteria) imbalances and/or dysbiosis in a subject as well as of conditions, disorders and diseases associated therewith, the method involves utilization of nucleotide sequence information of amplification products. Such embodiments include obtaining sequence information from nucleic acid products amplified by the combination of primer pairs employed in the method. Examples of sequence information are provided herein and include, but are not limited to, the identities of the nucleotides and the order thereof in a contiguous polynucleotide sequence (the nucleotide sequence determination) of an amplification product (which can include, for example, barcode sequence, e.g., corresponding to amplicon library source, primers used, etc.), alignment and/or mapping of nucleotide sequence to a reference sequence (and identity of the genus or species of the reference sequence), the number of sequence reads that map to a reference sequence and/or portions thereof (e.g., hypervariable regions of a 16S rRNA gene), the number of sequence reads that map uniquely to a reference sequence, and the number of regions (e.g., target sequences) of a reference sequence to which sequence reads map. Exemplary methods of sequencing nucleic acids and of nucleotide sequence analysis workflows for aligning and/or mapping sequence reads of amplification products are provided herein. In some embodiments, sequence information provides the identities (e.g., genus, species) of microorganisms and/or the number of different microorganisms in a population which is used to determine the microorganism composition of the sample. 
In some embodiments, as described herein, sequence information provides a measure of the levels (e.g., relative and/or absolute) of microorganisms in a population (abundance and proportionate contribution or presence in a population), which can also be used in determining the microorganism composition of the sample.
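
By way of non-limiting illustration, several of the types of sequence information described above (the number of sequence reads that map to a reference sequence, the number that map uniquely, and the number of regions of a reference sequence to which reads map) can be tallied from alignment results. The following Python sketch is an assumption offered for illustration only: the input structure, a mapping from a read identifier to a list of (reference, region) alignments, and the function name `summarize_mappings` are hypothetical and do not correspond to the output of any particular aligner or to the analysis workflows provided herein.

```python
from __future__ import annotations

from collections import defaultdict


def summarize_mappings(read_hits: dict[str, list[tuple[str, str]]]) -> dict[str, dict[str, int]]:
    """Summarize alignments per reference sequence.

    read_hits maps a read identifier to its (reference, region) alignments.
    A read counts as uniquely mapped when all of its alignments are to a
    single reference.
    """
    reads = defaultdict(int)
    unique_reads = defaultdict(int)
    regions = defaultdict(set)

    for hits in read_hits.values():
        refs = {ref for ref, _ in hits}
        for ref in refs:
            reads[ref] += 1
            if len(refs) == 1:
                unique_reads[ref] += 1
        for ref, region in hits:
            regions[ref].add(region)

    return {
        ref: {
            "reads": reads[ref],
            "unique_reads": unique_reads[ref],
            "regions_covered": len(regions[ref]),
        }
        for ref in reads
    }


if __name__ == "__main__":
    # Hypothetical reads and reference names for illustration only.
    demo = {
        "r1": [("refA", "V2")],
        "r2": [("refA", "V3"), ("refB", "V3")],
        "r3": [("refA", "V2")],
    }
    for ref, stats in summarize_mappings(demo).items():
        print(ref, stats)
```

A summary of this kind keeps the uniquely mapped read count separate from the total mapped read count, so that reads aligning to more than one reference can be handled differently in downstream determination of microorganism identity and levels.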


Also provided herein are methods for treating a subject with an immunotherapy. The composition of the gut microbiome has been implicated as a biomarker for cancer immunotherapies, including, for example, immune checkpoint inhibitors and CpG-oligonucleotide (CpG-ODN) immunotherapy. CpG-oligonucleotides are short single-stranded DNA molecules containing unmethylated cytosine-guanine motifs that serve as vaccine adjuvants in promoting antigen-specific immune responses, such as tumor antigen-specific cytotoxic T lymphocyte activation and accumulation. Immune checkpoint inhibitors are cancer therapeutics that target and inhibit checkpoint pathways of immune cells (e.g., T cells) involved in immunosuppression and particularly suppression of antitumor immune responses. Examples of checkpoint pathway proteins include, but are not limited to, PD-1, PD-L1 and CTLA-4. Checkpoint inhibitors include therapeutics that bind to these proteins, such as monoclonal antibodies directed to the proteins, and disrupt or prevent interaction of the proteins with other proteins. The composition of the gut microbiome has been shown to correlate with response to immune checkpoint inhibitors (see, e.g., Gong et al. (2019) Clin Trans Med 8:9; https://doi.org/10.1186/s40169-019-0225-x) and CpG-oligonucleotide (CpG-ODN) immunotherapy (see, e.g., Iida et al. (2013) Science 342(6161):967-970) and thus is a potential predictor of response to such immunotherapies. 
Particular species associated with checkpoint inhibitor response include, for example, Alistipes indistinctus, Anaerococcus vaginalis, Akkermansia muciniphila, Atopobium parvulum, Bacteroides caccae, Bacteroides fragilis, Bacteroides nordii, Bacteroides thetaiotaomicron, Bacteroides vulgatus, Bifidobacterium adolescentis, Bifidobacterium breve, Bifidobacterium longum, Blautia obeum, Burkholderia cepacia, Cloacibacillus porcorum, Collinsella aerofaciens, Collinsella stercoris, Desulfovibrio alaskensis, Dorea formicigenerans, Enterococcus faecium, Enterococcus hirae, Eubacterium spp., Faecalibacterium prausnitzii, Gardnerella vaginalis, Gemmiger formicilis, Holdemania filiformis, Klebsiella pneumoniae, Lactobacillus spp., Parabacteroides merdae, Parabacteroides distasonis, Phascolarctobacterium faecium, Prevotella histicola, Roseburia intestinalis, Ruminococcus bromii, Slackia exigua, Streptococcus infantarius, Streptococcus parasanguinis and Veillonella parvula. For example, specific microbes that have been positively correlated with response to checkpoint inhibition by inhibitors of CTLA-4 include Bacteroides spp. and Burkholderia spp. In another example, specific microbes that have been positively correlated with response to checkpoint inhibition by inhibitors of interaction of PD-L1 and PD-1 include Bifidobacterium spp., Faecalibacterium spp., and the Ruminococcaceae family (particularly for inhibitors targeting PD-L1), and Akkermansia muciniphila, Alistipes indistinctus and Enterococcus hirae (particularly for inhibitors targeting PD-1). Microbes that have been negatively associated with response to checkpoint inhibition by inhibitors of PD-1 and/or CTLA-4 include the Bacteroidales order (including Bacteroides spp., e.g., Bacteroides thetaiotaomicron), Escherichia coli, Anaerotruncus colihominis and Roseburia intestinalis.


In some embodiments, methods for treating a subject with an immunotherapy provided herein include amplifying nucleic acids in or from a sample from the subject, obtaining sequence information of the nucleic acid amplification products, identifying genera of microorganisms in the sample and species of one or more of the microorganisms in the sample, and treating the subject with an immunotherapy or a composition that increases or decreases levels of one or more microorganisms in the sample and an immunotherapy. In some embodiments, the subject is treated with an immune checkpoint inhibition-based immunotherapy if the sample includes one or more microorganisms positively associated with response to immune checkpoint inhibition-based immunotherapy and/or excludes or has sufficiently low levels of one or more microorganisms negatively associated with response to immune checkpoint inhibition-based immunotherapy. A sufficiently low level of a microorganism negatively associated with response to immune checkpoint inhibition-based immunotherapy is a level that does not substantially or significantly interfere with or reduce a response to the immune checkpoint inhibition-based immunotherapy. In some embodiments, the subject is treated with a composition that increases levels of one or more microorganisms positively associated with response to immune checkpoint inhibition-based immunotherapy if the sample lacks one or more such microorganisms or sufficient levels thereof and/or a composition that eliminates or reduces levels of one or more microorganisms negatively associated with response to immune checkpoint inhibition-based immunotherapy if the sample contains one or more such microorganisms or prohibitively high levels thereof and is subsequently or simultaneously treated with an immune checkpoint inhibition-based immunotherapy. 
A less than sufficient level of a microorganism positively associated with response to immune checkpoint inhibition-based immunotherapy is a level that is insufficient to provide for a response to the immunotherapy. A prohibitively high level of a microorganism negatively associated with response to immune checkpoint inhibition-based immunotherapy is a level that substantially or significantly interferes with or reduces a response to the immune checkpoint inhibition-based immunotherapy. In some embodiments, the immune checkpoint inhibition-based immunotherapy is a composition that disrupts or prevents interaction of a checkpoint inhibitor pathway protein, including, for example, but not limited to, PD-1, PD-L1 and/or CTLA-4. In some embodiments, an immune checkpoint inhibition-based immunotherapy includes an antibody, such as a monoclonal antibody, directed to a checkpoint inhibitor pathway protein. In some embodiments of the method, the sample from the subject is a sample from the alimentary canal of the subject, e.g., a fecal sample. In some embodiments, the method further comprises determining the relative level of one or more microorganisms in a sample from a subject by counting the number of sequence reads for nucleic acid products amplified from nucleic acids in the sample and normalizing the sequence read counts as described herein.
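
By way of non-limiting illustration, relative levels can be derived from sequence read counts and compared against thresholds for microorganisms positively or negatively associated with response. The following Python sketch is an assumption offered for illustration only: it normalizes by dividing each count by the total number of counted reads, which may differ from the normalization described elsewhere herein, and the function names `relative_levels` and `flag_sample`, the read counts, and the threshold values `min_positive` and `max_negative` are hypothetical placeholders that are not taken from the disclosure.

```python
from __future__ import annotations


def relative_levels(read_counts: dict[str, int]) -> dict[str, float]:
    """Convert per-microorganism read counts to fractions of all counted reads."""
    total = sum(read_counts.values())
    if total <= 0:
        raise ValueError("no reads to normalize")
    return {taxon: count / total for taxon, count in read_counts.items()}


def flag_sample(
    levels: dict[str, float],
    positive: set[str],
    negative: set[str],
    min_positive: float,
    max_negative: float,
) -> dict[str, list[str]]:
    """List positively associated microorganisms at or above min_positive and
    negatively associated microorganisms above max_negative.
    Both thresholds are placeholders chosen by the caller."""
    return {
        "positive_at_sufficient_level": sorted(
            t for t in positive if levels.get(t, 0.0) >= min_positive
        ),
        "negative_at_prohibitive_level": sorted(
            t for t in negative if levels.get(t, 0.0) > max_negative
        ),
    }


if __name__ == "__main__":
    # Hypothetical read counts for illustration only.
    counts = {"Akkermansia muciniphila": 120, "Escherichia coli": 30, "other": 850}
    levels = relative_levels(counts)
    print(
        flag_sample(
            levels,
            positive={"Akkermansia muciniphila"},
            negative={"Escherichia coli"},
            min_positive=0.01,  # placeholder threshold
            max_negative=0.05,  # placeholder threshold
        )
    )
```

The sketch only reports which microorganisms of interest fall on either side of the chosen thresholds; what constitutes a sufficient or prohibitively high level is defined functionally above and is not fixed by this example.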


In some embodiments of the methods provided herein for treating a subject with an immunotherapy, the step of amplifying nucleic acids in or from a sample from the subject includes (a) subjecting nucleic acids in or from a sample from the subject to nucleic acid amplification using a combination of primer pairs comprising (i) one or more primer pairs capable of amplifying nucleic acids containing sequences of one or more hypervariable regions of a prokaryotic 16S rRNA gene (referred to as the “16S rRNA gene primers or primer pairs”) and (ii) one or more primer pairs capable of amplifying a target nucleic acid sequence contained within the genome of a microorganism that is not contained within a hypervariable region of a prokaryotic 16S rRNA gene, wherein the microorganism is one that is positively or negatively associated with response to immune checkpoint inhibition-based immunotherapy (referred to as the “non-16S rRNA gene primers or primer pairs”). In some embodiments, obtaining sequence information from amplified nucleic acid products in the method comprises obtaining sequence information from nucleic acid products amplified by the combination of primer pairs of (i) and (ii), and optionally determining the levels of nucleic acid products amplified by the one or more primer pairs of (i). In some embodiments, the method includes determining levels, e.g., relative and/or absolute levels, of nucleic acid products amplified by one or more primer pairs of (i), i.e., the 16S rRNA gene primer pairs, and/or (ii), i.e., the non-16S rRNA gene primer pairs, or sequence reads thereof. 
In some embodiments, the step of amplifying nucleic acids in or from a sample from the subject includes (a) subjecting the nucleic acids to two or more separate nucleic acid amplification reactions using a first set of primer pairs for one nucleic acid amplification reaction and a second set of primer pairs for the other nucleic acid amplification reaction, wherein (i) the first set of primer pairs comprises one or more primer pairs that amplify a nucleic acid containing a sequence of one or more hypervariable regions of a prokaryotic 16S rRNA gene (referred to as the “16S rRNA gene primers or primer pairs”) and (ii) the second set of primer pairs comprises one or more primer pairs that amplify a target nucleic acid sequence contained within the genome of a microorganism that is not contained within a hypervariable region of a prokaryotic 16S rRNA gene, wherein the microorganism is one that is positively or negatively associated with response to immune checkpoint inhibition-based immunotherapy (referred to as the “non-16S rRNA gene primers or primer pairs”), and obtaining sequence information comprises obtaining sequence information from nucleic acid products amplified by primer pairs of (i) and (ii). In some embodiments, the method includes determining levels, e.g., relative and/or absolute levels, of nucleic acid products amplified by one or more primer pairs of (i) and/or (ii) or sequence reads thereof.


In any embodiments of the methods provided herein for treating a subject with an immunotherapy, the methods can include any embodiments of the methods for detecting and/or measuring the presence or absence of one or more microorganisms in a sample as described herein. For example, in some embodiments, the microorganism(s) is/are bacteria. In some embodiments, the prokaryotic 16S rRNA gene is a bacterial gene and/or the microorganism is a bacterium. In some embodiments the target nucleic acid sequence contained within a genome of a microorganism, e.g., bacteria, is unique to the microorganism. In some embodiments, the one or more 16S rRNA gene primer pairs amplify a nucleic acid sequence in a plurality of microorganisms, e.g., bacteria, from different genera. In some embodiments, the sample is a sample of contents of the alimentary tract of an animal. In some embodiments, the sample is a fecal sample.


In any embodiments of the methods provided herein for treating a subject with an immunotherapy, a nucleic acid amplification can be performed according to any of the embodiments provided herein for such amplification. For example, in some embodiments, the one or more primer pairs that amplify a nucleic acid containing a sequence of a hypervariable region of a prokaryotic 16S rRNA gene separately amplify nucleic acids containing sequences of different hypervariable regions. In some embodiments, the primers of the one or more 16S rRNA gene primer pairs are directed to, or bind to, or hybridize to nucleic acid sequences contained in conserved regions of a prokaryotic 16S rRNA gene. In some embodiments, the one or more 16S rRNA gene primer pairs and/or non-16S rRNA gene primer pairs comprise a plurality of primer pairs. For example, the one or more 16S rRNA gene primer pairs can comprise a plurality of primer pairs that amplify nucleic acid sequences of multiple hypervariable regions of a prokaryotic 16S rRNA gene and/or the one or more non-16S rRNA gene primer pairs can comprise a plurality of primer pairs that amplify different target nucleic acid sequences contained in the genomes of a plurality of different microorganisms. In some embodiments, the amplification, or two or more separate nucleic acid amplification reactions, is/are multiplex amplification conducted in a single reaction mixture. In some embodiments, each primer of the one or more 16S rRNA gene primer pairs contains less than 10, less than 9, less than 8, less than 7, less than 6, less than 5, less than 4, less than 3, or less than 2 contiguous nucleotides of sequence identical to a sequence of contiguous nucleotides of another primer in the combination of primer pairs. 
In some embodiments, the nucleic acid sequences being amplified by the one or more 16S rRNA gene primer pairs are less than about 300 bp, less than about 250 bp, less than about 200 bp, less than about 175 bp, less than about 150 bp, or less than about 125 bp in length. In some embodiments, the 16S rRNA gene primer pairs separately amplify nucleic acids separately containing sequences of 3 or more, 4 or more, 5 or more, 6 or more, 7 or more, 8 or more or 9 different hypervariable regions of a prokaryotic 16S rRNA gene thereby producing amplified copies of the nucleic acids containing sequences of the 3 or more, 4 or more, 5 or more, 6 or more, 7 or more, 8 or more or 9 different hypervariable regions of the 16S rRNA gene of one or more microorganisms, wherein the amplified copies of different hypervariable regions are separate amplicons. In some embodiments, the 16S rRNA gene primer pairs separately amplify nucleic acids separately containing sequences of 8 different hypervariable regions of a prokaryotic 16S rRNA gene. In some embodiments, the 8 different hypervariable regions are V2-V9. In some embodiments, the 16S rRNA gene primer pairs separately amplify nucleic acids containing sequences of 3 or more different hypervariable regions of a prokaryotic 16S rRNA gene wherein one of the 3 or more regions is a V5 region thereby producing amplified copies of the nucleic acids containing sequences of the 3 or more hypervariable regions of the 16S rRNA gene of one or more microorganisms. In some embodiments, the combination of primer pairs includes degenerate sequences of one or more primers in one or more primer pairs. 
In some embodiments, the 16S rRNA gene primer pair(s) comprise primers and/or primer pairs containing, or consisting essentially of, a sequence or sequences of a primer or primer pair in Table 15, or SEQ ID NOS: 1-24 in Table 15 and/or SEQ ID NOS: 25-48 in Table 15, or SEQ ID NOS: 11-16, 23 and 24 in Table 15 and/or SEQ ID NOS: 35-40, 47 and 48 in Table 15, or substantially identical or similar sequences, and optionally wherein one or more thymine bases is substituted with a uracil base.


In some embodiments of the methods provided herein for treating a subject with an immunotherapy, the one or more non-16S rRNA gene primer pairs specifically amplify a target nucleic acid sequence contained within a genome of a microorganism selected from genera and/or species of the microorganisms of Table 1, or Table 1, except for, or excluding, Actinomyces viscosus and/or Blautia coccoides, or Table 1, except for, or excluding, Actinomyces viscosus, Blautia coccoides and/or Helicobacter salomonis. In some embodiments, the target nucleic acid is unique to the microorganism. In some such embodiments, amplified copies of nucleic acids from a plurality of different microorganisms in Table 1, or Table 1, except for, or excluding, Actinomyces viscosus and/or Blautia coccoides, or Table 1, except for, or excluding, Actinomyces viscosus, Blautia coccoides and/or Helicobacter salomonis, are produced. In some embodiments, at least one, or one or more, target nucleic acid sequence(s) comprises or consists essentially of a nucleotide sequence selected from the nucleotide sequences in Table 17, or Table 17A and 17B, corresponding to the particular microorganism species associated with checkpoint inhibitor response, or the complement thereof. In some embodiments, at least one, or one or more, target nucleic acid sequences comprises a nucleotide sequence selected from the sequences in Table 17, or Table 17A and 17B, corresponding to the particular microorganism species associated with checkpoint inhibitor response, or the complement thereof, and is less than about 500, less than about 475, less than about 450, less than about 400, less than about 375, less than about 350, less than about 300, less than about 275, less than about 250, less than about 200, less than about 175, less than about 150, or less than about 100 nucleotides in length. 
In some embodiments, the at least one non-16S rRNA gene primer pair does not detectably amplify a nucleic acid sequence contained within any genus other than the genus of the microorganism containing the target nucleic acid sequence. In some embodiments, the at least one non-16S rRNA gene primer pair does not detectably amplify a nucleic acid sequence contained within any species other than the species of the microorganism containing the target nucleic acid sequence. In some embodiments, at least one primer of the non-16S rRNA gene primer pair, or at least one non-16S rRNA gene primer pair, contains, or consists essentially of, the sequence or sequences of a primer or primer pair corresponding to the particular microorganism species associated with checkpoint inhibitor response in Table 16, or SEQ ID NOS: 49-520 of Table 16, or SEQ ID NOS: 49-452, 457-472 and 481-520 of Table 16, or SEQ ID NOS: 49-492 of Table 16, or SEQ ID NOS: 49-452, 457-472 and 481-492 of Table 16, or SEQ ID NOS: 49-480 of Table 16A, or SEQ ID NOS: 49-452 and 457-472 of Table 16A, or SEQ ID NOS: 521-826 of Table 16C, or SEQ ID NOS: 521-820 of Table 16C, or SEQ ID NOS: 827-1298 of Table 16, or SEQ ID NOS: 827-1230, 1235-1250 and 1259-1298 of Table 16, or SEQ ID NOS: 827-1270 of Table 16, or SEQ ID NOS: 827-1230, 1235-1250 and 1259-1270 of Table 16, or SEQ ID NOS: 827-1258 of Table 16D, or SEQ ID NOS: 827-1230 and 1235-1250 of Table 16D, or SEQ ID NOS: 1299-1604 of Table 16F, or SEQ ID NOS: 1299-1598 of Table 16F, or a substantially identical or similar sequence(s), or any of the aforementioned nucleotide sequences of nucleic acids or primer pairs in which one or more thymine bases is substituted with a uracil base. 
In some embodiments, the nucleic acids are subjected to nucleic acid amplification using a plurality of non-16S rRNA gene primers or primer pairs, each containing, or consisting essentially of, a sequence or sequences of a primer pair corresponding to the particular microorganism species associated with checkpoint inhibitor response in Table 16, or SEQ ID NOS: 49-520 of Table 16, or SEQ ID NOS: 49-452, 457-472 and 481-520 of Table 16, or SEQ ID NOS: 49-492 of Table 16, or SEQ ID NOS: 49-452, 457-472 and 481-492 of Table 16, or SEQ ID NOS: 49-480 of Table 16A, or SEQ ID NOS: 49-452 and 457-472 of Table 16A, or SEQ ID NOS: 521-826 of Table 16C, or SEQ ID NOS: 521-820 of Table 16C, or SEQ ID NOS: 827-1298 of Table 16, or SEQ ID NOS: 827-1230, 1235-1250 and 1259-1298 of Table 16, or SEQ ID NOS: 827-1270 of Table 16, or SEQ ID NOS: 827-1230, 1235-1250 and 1259-1270 of Table 16, or SEQ ID NOS: 827-1258 of Table 16D, or SEQ ID NOS: 827-1230 and 1235-1250 of Table 16D, or SEQ ID NOS: 1299-1604 of Table 16F, or SEQ ID NOS: 1299-1598 of Table 16F, in which one or more thymine bases is substituted with a uracil base. In some embodiments, at least one primer or one primer pair in the combination of primer pairs includes a modification that facilitates nucleic acid manipulation, amplification, ligation and/or sequencing of amplification products and/or reduction or elimination of primer dimers. In particular embodiments, a modification is one that facilitates multiplex nucleic acid amplification, ligation and/or sequencing of products of multiplex amplification.


In some embodiments of the methods provided herein for treating a subject with an immunotherapy, the method is designed to focus on the make-up or composition of the population of microorganisms in the sample particularly with respect to certain groups of microorganisms, and, in some embodiments, the proportionate presence of the group and/or group members in the population. In such embodiments, the combination of primers and/or primer pairs includes a selected group or sub-group of microorganism-specific nucleic acids and includes kingdom-encompassing nucleic acids (e.g., 16S rRNA gene primers and primer pairs), and enables a focus on one or more particular microorganisms of interest that may be particularly significant in response to immunotherapies. In some embodiments, combinations of nucleic acids include microorganism-specific nucleic acids, and/or primer pairs, that specifically amplify a nucleic acid sequence contained in the genome of one or more microorganisms (e.g., bacteria) implicated in one or more conditions, disorders and/or diseases. In particular embodiments, the target nucleic acid sequences contained in the genome of the different microorganisms are unique to each of the microorganisms. In some embodiments, a combination of nucleic acids and/or nucleic acid primer pairs includes two or more nucleic acids and/or nucleic acid primer pairs that specifically amplify a unique nucleic acid sequence contained in the genome of one or more of the Group A microorganisms and/or the Group B microorganisms (see Table 2B). In some embodiments, the combination of nucleic acids and/or nucleic acid primer pairs includes a set of nucleic acid primer pairs in which each different nucleic acid primer pair specifically amplifies a different unique nucleic acid sequence contained in a different one of each of the genomes of the different microorganisms in Group A or Group B.


In some embodiments of the methods provided herein for treating a subject with an immunotherapy, the method involves utilization of nucleotide sequence information of amplification products. Such embodiments include obtaining sequence information from nucleic acid products amplified by the combination of primer pairs employed in the method. Examples of sequence information are provided herein and include, but are not limited to, the identities of the nucleotides and the order thereof in a contiguous polynucleotide sequence (the nucleotide sequence determination) of an amplification product (which can include, for example, barcode sequence, e.g., corresponding to amplicon library source, primers used, etc.), alignment and/or mapping of nucleotide sequence to a reference sequence (and identity of the genus or species of the reference sequence), the number of sequence reads that map to a reference sequence and/or portions thereof (e.g., hypervariable regions of a 16S rRNA gene), the number of sequence reads that map uniquely to a reference sequence, and the number of regions (e.g., target sequences) of a reference sequence to which sequence reads map. Exemplary methods of sequencing nucleic acids and of nucleotide sequence analysis workflows for aligning and/or mapping sequence reads of amplification products are provided herein. In some embodiments, sequence information provides the identities (e.g., genus, species) of microorganisms and/or the number of different microorganisms in a population which is used in determining the treatment of the subject. In some embodiments, as described herein, sequence information provides a measure of the levels (e.g., relative and/or absolute) of microorganisms in a population (abundance and proportionate contribution or presence in a population), which can also be used in determining the treatment of the subject.
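By way of a non-limiting illustration, the relative level of each microorganism can be estimated from the number of sequence reads that map to its reference sequence. The following is a minimal sketch only; the function name, taxon labels and read counts are hypothetical, and the sketch assumes that mapped-read counts are already available and applies no correction for gene copy number or amplification bias.

```python
def relative_abundance(read_counts):
    """Convert mapped-read counts per taxon into fractions of all mapped reads."""
    total = sum(read_counts.values())
    if total == 0:
        # No mapped reads: report zero for every taxon rather than dividing by zero.
        return {taxon: 0.0 for taxon in read_counts}
    return {taxon: n / total for taxon, n in read_counts.items()}


# Hypothetical counts of reads mapped to each reference (illustrative only).
counts = {"taxon_a": 600, "taxon_b": 300, "taxon_c": 100}
fractions = relative_abundance(counts)
for taxon, fraction in fractions.items():
    print(f"{taxon}: {fraction:.1%}")
```

Absolute levels would additionally require a known reference quantity, which this sketch does not model.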


Kits


In some embodiments, a kit is provided for performing nucleic acid amplification comprising any one or more nucleic acid primers and/or primer pairs provided herein that comprise or consist essentially of a sequence or sequences selected from the sequences in Table 15 and Table 16. In some embodiments, the primers are less than about 100 nucleotides, less than about 90 nucleotides, less than about 80 nucleotides, less than about 70 nucleotides, less than about 60 nucleotides, less than about 50 nucleotides, less than about 45 nucleotides, less than about 44 nucleotides, less than about 43 nucleotides, less than about 42 nucleotides, less than about 41 nucleotides, less than about 40 nucleotides, less than about 38 nucleotides, less than about 35 nucleotides, or less than about 30 nucleotides in length. In some embodiments, the one or more nucleic acid primers and/or primer pairs that comprise or consist essentially of a sequence selected from the sequences in Table 15 are selected from SEQ ID NOS: 1-24 in Table 15 and/or SEQ ID NOS: 25-48 in Table 15, or SEQ ID NOS: 11-16, 23 and 24 in Table 15 and/or SEQ ID NOS: 35-40, 47 and 48 in Table 15, or substantially identical or similar sequences, and optionally wherein one or more thymine bases is substituted with a uracil base. 
In some embodiments, the one or more nucleic acid primers and/or primer pairs that comprise or consist essentially of a sequence selected from the sequences in Table 16 are selected from SEQ ID NOS: 49-520 of Table 16, or SEQ ID NOS: 49-452, 457-472 and 481-520 of Table 16, or SEQ ID NOS: 49-492 of Table 16, or SEQ ID NOS: 49-452, 457-472 and 481-492 of Table 16, or SEQ ID NOS: 49-480 of Table 16A, or SEQ ID NOS: 49-452 and 457-472 of Table 16A, or SEQ ID NOS: 521-826 of Table 16C, or SEQ ID NOS: 521-820 of Table 16C, or SEQ ID NOS: 827-1298 of Table 16, or SEQ ID NOS: 827-1230, 1235-1250 and 1259-1298 of Table 16, or SEQ ID NOS: 827-1270 of Table 16, or SEQ ID NOS: 827-1230, 1235-1250 and 1259-1270 of Table 16, or SEQ ID NOS: 827-1258 of Table 16D, or SEQ ID NOS: 827-1230 and 1235-1250 of Table 16D, or SEQ ID NOS: 1299-1604 of Table 16F, or SEQ ID NOS: 1299-1598 of Table 16F, or a substantially identical or similar sequence(s), or any of the aforementioned nucleotide sequences of nucleic acids or primer pairs in which one or more thymine bases is substituted with a uracil base. In some embodiments, the kit contains one or more nucleic acid primer pairs comprising or consisting essentially of sequences of primer pairs selected from the sequences in Table 15 and Table 16. In some embodiments, the one or more nucleic acid primer pairs that comprise or consist essentially of sequences of primer pairs selected from the sequences in Table 15 are selected from SEQ ID NOS: 1-24 in Table 15 and/or SEQ ID NOS: 25-48 in Table 15, or SEQ ID NOS: 11-16, 23 and 24 in Table 15 and/or SEQ ID NOS: 35-40, 47 and 48 in Table 15, or substantially identical or similar sequences, and optionally wherein one or more thymine bases is substituted with a uracil base. 
In some embodiments, the one or more nucleic acid primer pairs that comprise or consist essentially of sequences of primer pairs selected from the sequences in Table 16 are selected from SEQ ID NOS: 49-520 of Table 16, or SEQ ID NOS: 49-452, 457-472 and 481-520 of Table 16, or SEQ ID NOS: 49-492 of Table 16, or SEQ ID NOS: 49-452, 457-472 and 481-492 of Table 16, or SEQ ID NOS: 49-480 of Table 16A, or SEQ ID NOS: 49-452 and 457-472 of Table 16A, or SEQ ID NOS: 521-826 of Table 16C, or SEQ ID NOS: 521-820 of Table 16C, or SEQ ID NOS: 827-1298 of Table 16, or SEQ ID NOS: 827-1230, 1235-1250 and 1259-1298 of Table 16, or SEQ ID NOS: 827-1270 of Table 16, or SEQ ID NOS: 827-1230, 1235-1250 and 1259-1270 of Table 16, or SEQ ID NOS: 827-1258 of Table 16D, or SEQ ID NOS: 827-1230 and 1235-1250 of Table 16D, or SEQ ID NOS: 1299-1604 of Table 16F, or SEQ ID NOS: 1299-1598 of Table 16F, or a substantially identical or similar sequence(s), or any of the aforementioned nucleotide sequences of nucleic acids or primer pairs in which one or more thymine bases is substituted with a uracil base. In some embodiments, the kit is for performing multiplex nucleic acid amplification and comprises a plurality of primers and/or primer pairs comprising or consisting essentially of sequences selected from the sequences in Table 15 and Table 16. In some embodiments, the plurality of primers and/or primer pairs comprising or consisting essentially of sequences selected from Table 15 are selected from SEQ ID NOS: 1-24 in Table 15 and/or SEQ ID NOS: 25-48 in Table 15, or SEQ ID NOS: 11-16, 23 and 24 in Table 15 and/or SEQ ID NOS: 35-40, 47 and 48 in Table 15, or substantially identical or similar sequences, and optionally wherein one or more thymine bases is substituted with a uracil base. 
In some embodiments, the plurality of primers and/or primer pairs comprising or consisting essentially of sequences selected from Table 16 are selected from SEQ ID NOS: 49-520 of Table 16, or SEQ ID NOS: 49-452, 457-472 and 481-520 of Table 16, or SEQ ID NOS: 49-492 of Table 16, or SEQ ID NOS: 49-452, 457-472 and 481-492 of Table 16, or SEQ ID NOS: 49-480 of Table 16A, or SEQ ID NOS: 49-452 and 457-472 of Table 16A, or SEQ ID NOS: 521-826 of Table 16C, or SEQ ID NOS: 521-820 of Table 16C, or SEQ ID NOS: 827-1298 of Table 16, or SEQ ID NOS: 827-1230, 1235-1250 and 1259-1298 of Table 16, or SEQ ID NOS: 827-1270 of Table 16, or SEQ ID NOS: 827-1230, 1235-1250 and 1259-1270 of Table 16, or SEQ ID NOS: 827-1258 of Table 16D, or SEQ ID NOS: 827-1230 and 1235-1250 of Table 16D, or SEQ ID NOS: 1299-1604 of Table 16F, or SEQ ID NOS: 1299-1598 of Table 16F, or a substantially identical or similar sequence(s), or any of the aforementioned nucleotide sequences of nucleic acids or primer pairs in which one or more thymine bases is substituted with a uracil base. In some embodiments, the kit comprises a composition containing a mixture of primers and/or primer pairs comprising or consisting essentially of sequences selected from the sequences in Table 15 and a separate composition containing a mixture of primers and/or primer pairs comprising or consisting essentially of sequences selected from the sequences in Table 16. In some embodiments, the composition containing a mixture of primers and/or primer pairs comprising or consisting essentially of sequences selected from the sequences in Table 15 are selected from SEQ ID NOS: 1-24 in Table 15 and/or SEQ ID NOS: 25-48 in Table 15, or SEQ ID NOS: 11-16, 23 and 24 in Table 15 and/or SEQ ID NOS: 35-40, 47 and 48 in Table 15, or substantially identical or similar sequences, and optionally wherein one or more thymine bases is substituted with a uracil base. 
In some embodiments, the composition containing a mixture of primers and/or primer pairs comprising or consisting essentially of sequences selected from the sequences in Table 16 are selected from SEQ ID NOS: 49-520 of Table 16, or SEQ ID NOS: 49-452, 457-472 and 481-520 of Table 16, or SEQ ID NOS: 49-492 of Table 16, or SEQ ID NOS: 49-452, 457-472 and 481-492 of Table 16, or SEQ ID NOS: 49-480 of Table 16A, or SEQ ID NOS: 49-452 and 457-472 of Table 16A, or SEQ ID NOS: 521-826 of Table 16C, or SEQ ID NOS: 521-820 of Table 16C, or SEQ ID NOS: 827-1298 of Table 16, or SEQ ID NOS: 827-1230, 1235-1250 and 1259-1298 of Table 16, or SEQ ID NOS: 827-1270 of Table 16, or SEQ ID NOS: 827-1230, 1235-1250 and 1259-1270 of Table 16, or SEQ ID NOS: 827-1258 of Table 16D, or SEQ ID NOS: 827-1230 and 1235-1250 of Table 16D, or SEQ ID NOS: 1299-1604 of Table 16F, or SEQ ID NOS: 1299-1598 of Table 16F, or a substantially identical or similar sequence(s), or any of the aforementioned nucleotide sequences of nucleic acids or primer pairs in which one or more thymine bases is substituted with a uracil base. In any of these embodiments, the kit further includes one or more of a DNA polymerase, an adapter, dATP, dCTP, dGTP and dTTP. The kit can further include one or more antibodies, nucleic acid barcodes, purification solutions or columns.


In some embodiments, a kit is provided for detecting or measuring one or more microorganisms, or for assessing, profiling, or characterizing a mixture or population of microorganisms, e.g., bacteria and includes (1) one or more kingdom-encompassing nucleic acid primer pairs capable of amplifying a sequence in a homologous gene or genomic region common to multiple, most, a majority, substantially all, or all microorganisms in a kingdom (e.g., bacteria), but that varies between different microorganisms in the kingdom, and (2) microorganism-specific nucleic acids and/or nucleic acid primer pairs that amplify a specific nucleic acid sequence unique to a particular microorganism (e.g., a species, subspecies or strain of microorganism, such as bacteria). In some embodiments, the kingdom-encompassing nucleic acid primer pairs and microorganism-specific nucleic acid primer pairs comprise or consist essentially of sequences selected from the sequences in Table 15 and Table 16, respectively. In some embodiments, the kingdom-encompassing nucleic acid primer pairs comprise or consist essentially of sequences selected from SEQ ID NOS: 1-24 in Table 15 and/or SEQ ID NOS: 25-48 in Table 15, or SEQ ID NOS: 11-16, 23 and 24 in Table 15 and/or SEQ ID NOS: 35-40, 47 and 48 in Table 15, or substantially identical or similar sequences, and optionally wherein one or more thymine bases is substituted with a uracil base. 
In some embodiments, the microorganism-specific nucleic acid primer pairs comprise or consist essentially of sequences selected from SEQ ID NOS: 49-520 of Table 16, or SEQ ID NOS: 49-452, 457-472 and 481-520 of Table 16, or SEQ ID NOS: 49-492 of Table 16, or SEQ ID NOS: 49-452, 457-472 and 481-492 of Table 16, or SEQ ID NOS: 49-480 of Table 16A, or SEQ ID NOS: 49-452 and 457-472 of Table 16A, or SEQ ID NOS: 521-826 of Table 16C, or SEQ ID NOS: 521-820 of Table 16C, or SEQ ID NOS: 827-1298 of Table 16, or SEQ ID NOS: 827-1230, 1235-1250 and 1259-1298 of Table 16, or SEQ ID NOS: 827-1270 of Table 16, or SEQ ID NOS: 827-1230, 1235-1250 and 1259-1270 of Table 16, or SEQ ID NOS: 827-1258 of Table 16D, or SEQ ID NOS: 827-1230 and 1235-1250 of Table 16D, or SEQ ID NOS: 1299-1604 of Table 16F, or SEQ ID NOS: 1299-1598 of Table 16F, or a substantially identical or similar sequence(s), or any of the aforementioned nucleotide sequences of nucleic acids or primer pairs in which one or more thymine bases is substituted with a uracil base. In some embodiments, the kit comprises a composition containing one or more kingdom-encompassing nucleic acid primer pairs and a separate composition containing one or more species-specific primer pairs. In some embodiments, the microorganism-specific nucleic acid primer pairs are primers that specifically amplify a sequence comprising or consisting essentially of one or more sequences in Table 17. 
In some embodiments, the microorganism-specific nucleic acid primer pairs are primers that specifically amplify a sequence comprising or consisting essentially of SEQ ID NOS: 1605-1979 in Table 17, or SEQ ID NOS: 1605-1806, 1809-1816, 1821-1970, 1972-1974 and 1977-1979 of Table 17, or SEQ ID NOS: 1605-1826 in Table 17, or SEQ ID NOS: 1605-1806, 1809-1816 and 1821-1826 in Table 17, or SEQ ID NOS: 1605-1820 in Table 17A, or SEQ ID NOS: 1605-1806 and 1809-1816 in Table 17A, or SEQ ID NOS: 1827-1979 in Table 17C, or SEQ ID NOS: 1827-1976 in Table 17C, or a substantially identical or similar sequence, or the complement thereof. In some embodiments, the kit further includes one or more of a DNA polymerase, an adapter, dATP, dCTP, dGTP and dTTP. The kit can further include one or more antibodies, nucleic acid barcodes, purification solutions or columns. In some embodiments, one or more of the primers in a primer pair have a cleavable group. In some embodiments, the cleavable group can be a uracil nucleotide. In some embodiments in which the one or more of the primers in a primer pair have a cleavable group, the kit can further include at least one cleaving reagent. In one embodiment, the cleavable group can be 8-oxo-deoxyguanosine, deoxyuridine or bromodeoxyuridine. In some embodiments, the at least one cleaving reagent includes RNaseH, uracil DNA glycosylase, Fpg or alkali. In one embodiment, the cleaving reagent can be uracil DNA glycosylase. In some embodiments, a kit is provided for amplifying multiple microbial sequences from a population of nucleic acid molecules in a single reaction. In some embodiments, the kit is provided to perform multiplex nucleic acid amplification in a single reaction chamber or vessel. In some embodiments, the kit includes at least one DNA polymerase, which can be a thermostable DNA polymerase. 
In some embodiments, the one or more DNA polymerases are present in a 3-fold excess as compared to a single amplification reaction. In some embodiments, each primer pair is present at a final concentration of about 25 nM to about 50 nM. In one embodiment, each primer pair can be present at a final concentration that is 50% lower than in conventional single-plex PCR reactions. In some embodiments, the kit provides amplification of at least 100, 150, 200, 250, 300, 350, 398, or more, microbial sequences from a population of nucleic acid molecules in a single reaction chamber. In particular embodiments, a provided kit of the invention is a test kit. In some embodiments, the kit further comprises one or more adapters, barcodes, and/or antibodies.


Methods for Compressing Reference Databases and Analyzing Sequence Data for Profiling Microbial Populations


In some embodiments, the methods described herein may be used to compress reference nucleic acid sequences and analyze sequence reads generated through sequencing of nucleic acids of portions of the genomes of living organisms, microbes, parasites or infectious agents. In some embodiments, the organisms are related, for example, as belonging to a common taxonomic group (e.g., kingdom, phylum, class, order, family, genus and/or species). In some embodiments, the organisms are microorganisms, such as, for example, prokaryotes including bacteria and archaea. In some embodiments, the organisms are eukaryotes, including, for example, animals (e.g., mammals, insects), plants, fungi and algae. In some embodiments, the nucleic acid is from a microbe, e.g., bacteria, archaebacteria, or a virus, which is also an infectious agent. In some embodiments, the nucleic acid is obtained using oligonucleotides, such as primers, to amplify portions of the nucleic acids of an organism, microbe, parasite and/or infectious agent. In some embodiments, the oligonucleotides are contained in a collection, referred to as a gene panel, designed to profile the composition of a sample or environment, such as, for example, a microbial environment. The microbes to be profiled may include bacteria, fungi or viruses. The genes targeted by the panel may include homologous genes. A homologous gene is one that displays conserved sequences in multiple organisms or microbes, but that can also have differences in sequences. Examples of homologous genes include, but are not limited to, the 16S rRNA gene, 18S rRNA gene, 23S rRNA gene and ABC transporter genes. For example, the 16S rRNA gene contains hypervariable segments that can vary in sequence in different organisms or microbes but that are separated by conserved segments of similar or nearly identical sequence in different organisms or microbes. 
Various embodiments of the method described herein may be used to profile microbiomes in the following applications: gut microbiome, skin microbiome, oral microbiome, respiratory tract microbiome, sepsis, infectious disease, women's health, viral type, fungal sample, metagenomics (e.g., analysis of soil and water samples), and food pathogens.


In some embodiments, the disclosure provides for amplification of multiple target-specific sequences from a population of target nucleic acid molecules. In some embodiments, the method comprises hybridizing one or more target-specific primer pairs to the target sequence, extending a first primer of the primer pair, denaturing the extended first primer product from the population of nucleic acid molecules, hybridizing to the extended first primer product the second primer of the primer pair, extending the second primer to form a double stranded product, and digesting the target-specific primer pair away from the double stranded product to generate a plurality of amplified target sequences. In some embodiments, the digesting includes partial digesting of one or more of the target-specific primers from the amplified target sequence. In some embodiments, the amplified target sequences can be ligated to one or more adapters. In some embodiments, adapters can include one or more DNA barcodes or tagging sequences. In some embodiments, amplified target sequences once ligated to an adapter can undergo a nick translation reaction and/or further amplification to generate a library of adapter-ligated amplified target sequences.


In some embodiments, the methods of the disclosure include selectively amplifying target sequences in a sample containing a plurality of nucleic acid molecules and ligating the amplified target sequences to at least one adapter and/or barcode. Adapters and barcodes for use in molecular biology library preparation techniques are well known to those of skill in the art. The definitions of adapters and barcodes as used herein are consistent with the terms used in the art. For example, the use of barcodes allows for the detection and analysis of multiple samples, sources, tissues or populations of nucleic acid molecules per multiplex reaction. A barcoded and amplified target sequence contains a unique nucleic acid sequence, typically a short 6-15 nucleotide sequence, that identifies and distinguishes one amplified nucleic acid molecule from another amplified nucleic acid molecule, even when both nucleic acid molecules minus the barcode contain the same nucleic acid sequence. The use of adapters allows for the amplification of each amplified nucleic acid molecule in a uniform manner and helps reduce strand bias. Adapters can include universal adapters or proprietary adapters, both of which can be used downstream to perform one or more distinct functions. For example, amplified target sequences prepared by the methods disclosed herein can be ligated to an adapter that may be used downstream as a platform for clonal amplification. The adapter can function as a template strand for subsequent amplification using a second set of primers and therefore allows universal amplification of the adapter-ligated amplified target sequence. In some embodiments, selective amplification of target nucleic acids to generate a pool of amplicons can further comprise ligating one or more barcodes and/or adapters to an amplified target sequence. The ability to incorporate barcodes enhances sample throughput and allows for analysis of multiple samples or sources of material concurrently.
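As a non-limiting illustration of how barcodes may allow reads from multiple samples to be analyzed concurrently, the following minimal sketch assigns each read to a sample according to a leading barcode and removes the barcode from the read. The barcode sequences, sample names and read sequences are hypothetical, and the sketch assumes fixed-length barcodes at the 5' end of each read with exact matching only.

```python
from collections import defaultdict

# Hypothetical barcode-to-sample table (6-nucleotide barcodes, illustrative only).
BARCODES = {"ACGTAC": "sample_1", "TGCATG": "sample_2"}


def demultiplex(reads, barcodes=BARCODES):
    """Group reads by their leading barcode and strip the barcode from each read."""
    barcode_len = len(next(iter(barcodes)))  # assumes all barcodes share one length
    by_sample = defaultdict(list)
    for read in reads:
        sample = barcodes.get(read[:barcode_len], "unassigned")
        # Keep unassigned reads intact so they can be inspected later.
        insert = read[barcode_len:] if sample != "unassigned" else read
        by_sample[sample].append(insert)
    return dict(by_sample)


reads = ["ACGTACGGATTAGATACCC", "TGCATGGGATTAGATACCC", "NNNNNNGGATTAGATACCC"]
result = demultiplex(reads)
```

In this sketch the first two reads carry the same insert sequence but are distinguished by their barcodes, consistent with the description above.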


Conserved sequences of nucleic acids can be found in the genomes of different organisms or microbes. Such sequences can be identical or share substantial similarity in the different genomes (see, e.g., Isenbarger et al. (2008) Orig Life Evol Biosph doi:10.1007/s11084-008-9148-z). In many instances, conserved sequences are located in essential genes, e.g., housekeeping genes, that encode elements required across a category or group of organisms or microbes for carrying out basic biochemical functions of survival. Such genes are referred to herein as “homologous” genes. However, through evolution and adaptation of organisms and microbes to diverse conditions, even homologous genes diverged and contain sequences that vary between different organisms and microbes and that may be so divergent as to be unique to specific organisms or microbes such that they can be used to identify an individual organism or microbe or a related group (e.g., species) of organisms or microbes. These features of homologous genes can be exploited in characterizing or profiling the nucleic acid composition of samples, such as, for example, biological or environmental samples. For example, in profiling the microbiota of a sample, the goal is not only to determine the presence of microorganisms in the sample, but to generate a comprehensive characterization of the total microorganism population, including the identities of the constituent microorganisms, e.g., genera, species, and relative levels of different microorganisms. Analysis of homologous genes containing sequences conserved (e.g., conserved regions) across substantially all of the targeted elements of a population being profiled in a sample (e.g., all bacteria) as well as sequences that vary (e.g., variable regions) and provide information specific to individuals or subgroups within the total population provides a method for efficiently profiling a population in a sample. 
Homologous genes that contain multiple variable regions interspersed between conserved regions are particularly useful in such methods because they provide multiple sequences that can be analyzed to more accurately and definitively identify individual constituents of a population of targeted elements. One example of such a gene is the prokaryotic 16S rRNA gene encoding ribosomal RNAs which are the main structural and catalytic components of ribosomes. The 16S rRNA gene is about 1500 nucleotides in length and contains nine hypervariable regions (V1-V9) interspersed between and flanked by conserved sequences of conserved regions (FIG. 1). Sequences of the hypervariable regions of 16S rRNA genes which differ in different microorganisms can be used to identify microorganisms in a sample. One method of obtaining the nucleic acids of the hypervariable regions of microorganisms in a sample in order to sequence the regions is to generate multiple copies of the regions through nucleic acid amplification (e.g., polymerase chain reaction or PCR) of all the nucleic acids extracted from a sample. Amplification can be accomplished by contacting the nucleic acids with oligonucleotides (i.e., primers) that hybridize to sequences on each end of a hypervariable region to be amplified (referred to as the template) and synthesizing a complement sequence of each strand of the template through nucleotide polymerization extension of the primers. Instead of specifically amplifying a hypervariable region of every possible microorganism that could be present in a sample by using many oligonucleotide primers, each specific to the hypervariable region of each organism, it is possible to utilize the conserved, highly similar or identical sequences flanking the hypervariable regions as primer-binding sequences to which one, or a small number of, primer pair(s) will bind and amplify a hypervariable region in substantially all of the microorganisms, e.g., bacteria, in a sample. 
This allows specific nucleic acids that can be used to identify a microorganism to be amplified from substantially all the microorganisms which can then be sequenced for efficient profiling of the population.


The sequences of amplified hypervariable region nucleic acids of all microorganisms present in a sample can be compared to reference sequences of particular microorganisms, or microbes, through computer-assisted sequence alignment and mapped to a gene of a known microorganism to identify the sample microorganism. Databases of 16S rRNA gene sequences from numerous microorganisms (e.g., bacteria) are publicly available (see, e.g., www.greengenes.lbl.gov and www.arb-silva.de). There can be over 100,000 sequences in a database. Furthermore, the more of the nine hypervariable regions that are amplified and sequenced, the greater the number of alignments that have to be performed. Thus, an analysis of multiple amplified nucleic acid regions of multiple microorganisms in a sample can require extensive processor memory and time and potentially introduce errors and uncertainties into the analysis. Methods of performing such analyses that reduce computational requirements, reduce memory requirements and improve the quality of characterization of nucleic acids in a sample are provided herein.


Also provided herein are methods for facilitating and improving the efficiency of mapping sample nucleic acids to non-conserved genome regions unique to a particular microorganism or microbe that enable accurate identification of the nucleic acids in a sample that may contain a mixture of nucleic acids from a plurality of different organisms and/or microbes.


In some embodiments, an unaligned BAM file including sequence read information may be provided to a processor for analyzing the sequence reads corresponding to marker regions. Reads obtained from sequencing of library DNA templates may be analyzed to identify, and determine the levels of, microbial constituents of the samples. Analysis may be conducted using a workflow incorporating Ion Torrent Suite™ Software (Thermo Fisher Scientific) with a run plan template designed to facilitate microbial DNA sequence read analysis, and an AmpliSeq microbiome analysis software plugin which generates counts for amplicons targeted in the assay. Reference genome files used for alignment in read mapping aspects of the analysis may be included in the plugin. Reference sequences derived from the GreenGenes bacterial 16S rRNA gene sequence public database (see, e.g., www.greengenes.lbl.gov) may be used for mapping of reads obtained from sequencing of amplicons generated using the 16S primer pool. In some embodiments, other databases containing 16S rRNA gene sequence information may be used, such as the Ribosomal Database Project (RDP; https://rdp.cme.msu.edu/), GRD (https://metasystems.riken.jp/grd/), SILVA (https://www.arb-silva.de/), and EzBioCloud (https://help.ezbiocloud.net/ezbiocloud-16s-database/). Reference microbial genome sequences available in an NCBI public database (see www.ncbi.nlm.nih.gov/genome/microbes/) may be used for mapping reads obtained from sequencing of amplicons generated using the species primer pool. Compressed 16S reference sequences may be derived from the GreenGenes or other 16S rRNA gene sequence database. The compressed 16S reference sequences comprise a plurality of the hypervariable regions of the 16S rRNA gene. FIG. 1 illustrates an example of a 16S rRNA gene having hypervariable regions. A full length 16S rRNA gene can include 1000 to 1700 bp. In this example, the full length 16S rRNA gene 101 includes 1500 bp and 9 hypervariable regions, V1-V9. 
A primer pool design targeting 8 of the 9 hypervariable regions for amplification, V2-V9, results in the set of hypervariable segments 102. The set of hypervariable segments 102 may be generated by applying an in silico PCR simulation using the primer pairs targeting V2-V9 to the full length 16S sequences contained in the database to extract expected target segments for V2-V9. The in silico PCR simulation may use available computational tools for calculating theoretical PCR results using a given set of primers and a target DNA sequence input by the user. One such tool is Primer-BLAST (https://www.ncbi.nlm.nih.gov/tools/primer-blast/index.cgi?LINK_LOC=BlastHome), described in Ye J, Coulouris G, Zaretskaya I, Cutcutache I, Rozen S, Madden T L. (2012) Primer-BLAST: a tool to design target-specific primers for polymerase chain reaction. BMC Bioinformatics. 13:134, and in NCBI Primer-BLAST An online tool for designing target-specific PCR primer pairs (with internal probes), NCBI Handout Series Primer-BLAST Last Update Sep. 8, 2016 (https://ftp.ncbi.nih.gov/pub/factsheets/HowTo_PrimerBLAST.pdf). In some embodiments, a proprietary simulation tool for in silico PCR simulation may be used to determine the set of hypervariable segments 102.
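By way of non-limiting illustration, the extraction of expected target segments by in silico PCR may be sketched as follows. The sketch assumes exact primer matches only, whereas tools such as Primer-BLAST tolerate mismatches and evaluate specificity. The function names, the toy reference sequence, and the toy primers are hypothetical and are not the assay primers or database sequences.

```python
# Minimal in silico PCR sketch. Assumption: primers must match exactly;
# real simulation tools allow mismatches and check specificity.

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def in_silico_pcr(template: str, forward: str, reverse: str):
    """Return the predicted amplicon (primer sites included), or None.

    The forward primer must occur in the template as given. The reverse
    primer anneals to the opposite strand, so its reverse complement must
    occur downstream of the forward primer site.
    """
    start = template.find(forward)
    if start == -1:
        return None
    reverse_site = reverse_complement(reverse)
    end = template.find(reverse_site, start + len(forward))
    if end == -1:
        return None
    return template[start:end + len(reverse_site)]


def extract_segments(references: dict, primer_pairs: dict) -> dict:
    """Build a compressed reference keyed by (sequence id, region name)."""
    compressed = {}
    for ref_id, seq in references.items():
        for region, (fwd, rev) in primer_pairs.items():
            amplicon = in_silico_pcr(seq, fwd, rev)
            if amplicon is not None:
                compressed[(ref_id, region)] = amplicon
    return compressed


# Hypothetical toy data, for illustration only.
toy_reference = {"ref1": "GGGACGTACGTTTTTCCCCCAAAAAGGATTCGGG"}
toy_primers = {"Vx": ("ACGTACG", "GAATCC")}
segments = extract_segments(toy_reference, toy_primers)
print(segments)
```

Sequences for which no primer pair yields an amplicon contribute nothing to the compressed reference, which is consistent with retaining only those sequences that would be expected to be amplified by the primers.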


For the example of FIG. 1, the set of hypervariable segments 102 derived from the in silico simulation provides a compressed reference containing the hypervariable region sequences of those full-length 16S rRNA sequences in the complete database that would be expected to be amplified by the primers. For this example, the number of base pairs is reduced from 1500 bp of the full length sequence 101 to 8 hypervariable segments 102 having a total of 1299 bp. The GreenGenes database contains about 150,000 16S rRNA gene sequences.


In some embodiments, more than one primer pair may target a given hypervariable region. The sequence of each primer of a primer pair for amplifying each region is directed to “conserved” sequence on each side of the particular V region so that the primer pair will theoretically amplify that variable region in the genome of every bacterium. However, some of the “conserved” regions on either side of each variable region, particularly the conserved regions on either side of V2 and V8, are not sufficiently conserved to permit amplification of the V2 and V8 regions of all bacteria. Thus, for amplifying V2 and V8, 3 primer pairs (instead of 1 primer pair) may be used, with the sequences of the primer pairs being almost identical but differing at one or two nucleotides (referred to as “degenerate primers”). Using those 3 primer pairs is a means to amplify the V2 and V8 regions for all bacteria even though the conserved regions on either side of V2 and V8 may be slightly different for different bacteria.
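As a non-limiting illustration of how a set of nearly identical primers may be represented, the following sketch expands a primer written with IUPAC ambiguity codes into its concrete variants and reports the largest number of positions at which any two variants differ. Representing the primer set with IUPAC codes is an assumption made for the example; the primer shown is hypothetical and is not one of the assay primers.

```python
from itertools import combinations, product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_degenerate(primer: str) -> list:
    """List every concrete sequence encoded by a degenerate primer."""
    pools = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*pools)]


def max_pairwise_differences(variants: list) -> int:
    """Largest number of mismatched positions between any two variants."""
    return max(
        (sum(a != b for a, b in zip(x, y)) for x, y in combinations(variants, 2)),
        default=0,
    )


# Hypothetical degenerate primer with two ambiguous positions (Y and M).
variants = expand_degenerate("ACGYTM")
print(variants)
print(max_pairwise_differences(variants))
```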


A BAM file format structure is described in “Sequence Alignment/Map Format Specification,” Sep. 12, 2014 (https://github.com/samtools/hts-specs). As described herein, a “BAM file” refers to a file compatible with the BAM format. As described herein, an unaligned BAM file refers to a BAM file that does not contain aligned sequence read information and mapping quality parameters and an aligned BAM file refers to a BAM file that contains aligned sequence read information and mapping quality parameters.
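For illustration only, the distinction between unaligned and aligned records may be sketched on the text (SAM) representation of BAM records, in which the second field is a bitwise FLAG and bit 0x4 marks an unmapped segment. The helper names and the example records below are hypothetical; a BAM file itself is binary and would first be converted to text, or read with a BAM library.

```python
UNMAPPED_FLAG = 0x4  # SAM FLAG bit meaning "segment unmapped"


def is_unaligned_record(sam_line: str) -> bool:
    """True if the record's FLAG field has the unmapped bit set."""
    fields = sam_line.rstrip("\n").split("\t")
    return bool(int(fields[1]) & UNMAPPED_FLAG)


def looks_unaligned(sam_lines) -> bool:
    """True if there is at least one record and every record is unmapped.

    Header lines (starting with '@') are skipped.
    """
    records = [ln for ln in sam_lines if ln.strip() and not ln.startswith("@")]
    return bool(records) and all(is_unaligned_record(ln) for ln in records)


# Hypothetical records: one unmapped read and one mapped read.
unaligned_example = [
    "@HD\tVN:1.6\tSO:unsorted",
    "read1\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",
]
aligned_example = [
    "@SQ\tSN:ref1\tLN:1500",
    "read2\t0\tref1\t101\t60\t4M\t*\t0\t0\tACGT\tIIII",
]
print(looks_unaligned(unaligned_example), looks_unaligned(aligned_example))
```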


Nucleic acid sequence data can be generated using various techniques, platforms or technologies, including, but not limited to: capillary electrophoresis, microarrays, ligation-based systems, polymerase-based systems, hybridization-based systems, direct or indirect nucleotide identification systems, pyrosequencing, ion- or pH-based detection systems, electronic signature-based systems, etc.


Various embodiments of nucleic acid sequencing platforms, such as a nucleic acid sequencer, can include components as displayed in the block diagram of FIG. 10. According to various embodiments, sequencing instrument 200 can include a fluidic delivery and control unit 202, a sample processing unit 204, a signal detection unit 206, and a data acquisition, analysis and control unit 208. Various embodiments of instrumentation, reagents, libraries and methods used for next generation sequencing are described in U.S. Patent Application Publication No. 2009/0127589 and No. 2009/0026082. Various embodiments of instrument 200 can provide for automated sequencing that can be used to gather sequence information from a plurality of sequences in parallel, such as substantially simultaneously.


In various embodiments, the fluidics delivery and control unit 202 can include a reagent delivery system. The reagent delivery system can include a reagent reservoir for the storage of various reagents. The reagents can include RNA-based primers, forward/reverse DNA primers, oligonucleotide mixtures for ligation sequencing, nucleotide mixtures for sequencing-by-synthesis, optional ECC oligonucleotide mixtures, buffers, wash reagents, blocking reagent, stripping reagents, and the like. Additionally, the reagent delivery system can include a pipetting system or a continuous flow system which connects the sample processing unit with the reagent reservoir.


In various embodiments, the sample processing unit 204 can include a sample chamber, such as a flow cell, a substrate, a micro-array, a multi-well tray, or the like. The sample processing unit 204 can include multiple lanes, multiple channels, multiple wells, or other means of processing multiple sample sets substantially simultaneously. Additionally, the sample processing unit can include multiple sample chambers to enable processing of multiple runs simultaneously. In particular embodiments, the system can perform signal detection on one sample chamber while substantially simultaneously processing another sample chamber. Additionally, the sample processing unit can include an automation system for moving or manipulating the sample chamber.


In various embodiments, the signal detection unit 206 can include an imaging or detection sensor. For example, the imaging or detection sensor can include a CCD, a CMOS, an ion or chemical sensor, such as an ion sensitive layer overlying a CMOS or FET, a current or voltage detector, or the like. The signal detection unit 206 can include an excitation system to cause a probe, such as a fluorescent dye, to emit a signal. The excitation system can include an illumination source, such as an arc lamp, a laser, a light emitting diode (LED), or the like. In particular embodiments, the signal detection unit 206 can include optics for the transmission of light from an illumination source to the sample or from the sample to the imaging or detection sensor. Alternatively, the signal detection unit 206 may provide for electronic or non-photon based methods for detection and consequently not include an illumination source. In various embodiments, electronic-based signal detection may occur when a detectable signal or species is produced during a sequencing reaction. For example, a signal can be produced by the interaction of a released byproduct or moiety, such as a released ion (e.g., a hydrogen ion), with an ion or chemical sensitive layer. In other embodiments a detectable signal may arise as a result of an enzymatic cascade such as used in pyrosequencing (see, for example, U.S. Patent Application Publication No. 2009/0325145) where pyrophosphate is generated through base incorporation by a polymerase which further reacts with ATP sulfurylase to generate ATP in the presence of adenosine 5′ phosphosulfate wherein the ATP generated may be consumed in a luciferase mediated reaction to generate a chemiluminescent signal. In another example, changes in an electrical current can be detected as a nucleic acid passes through a nanopore without the need for an illumination source.


In various embodiments, a data acquisition analysis and control unit 208 can monitor various system parameters. The system parameters can include temperature of various portions of instrument 200, such as sample processing unit or reagent reservoirs, volumes of various reagents, the status of various system subcomponents, such as a manipulator, a stepper motor, a pump, or the like, or any combination thereof.


It will be appreciated by one skilled in the art that various embodiments of instrument 200 can be used to practice a variety of sequencing methods including ligation-based methods, sequencing by synthesis, single molecule methods, nanopore sequencing, and other sequencing techniques.


In various embodiments, the sequencing instrument 200 can determine the sequence of a nucleic acid, such as a polynucleotide or an oligonucleotide. The nucleic acid can include DNA or RNA, and can be single stranded, such as ssDNA and RNA, or double stranded, such as dsDNA or a RNA/cDNA pair. In various embodiments, the nucleic acid can include or be derived from a fragment library, a mate pair library, a ChIP fragment, or the like. In particular embodiments, the sequencing instrument 200 can obtain the sequence information from a single nucleic acid molecule or from a group of substantially identical nucleic acid molecules.


In various embodiments, sequencing instrument 200 can output nucleic acid sequencing read data in a variety of different output data file types/formats, including, but not limited to: *.fasta, *.csfasta, *seq.txt, *qseq.txt, *.fastq, *.sff, *prb.txt, *.sms, *srs and/or *.qv.


According to various exemplary embodiments, one or more features of any one or more of the above-discussed teachings and/or exemplary embodiments may be performed or implemented using appropriately configured and/or programmed hardware and/or software elements. Determining whether an embodiment is implemented using hardware and/or software elements may be based on any number of factors, such as desired computational rate, power levels, heat tolerances, processing cycle budget, input data rates, output data rates, memory resources, data bus speeds, etc., and other design or performance constraints.


Examples of hardware elements may include processors, microprocessors, input(s) and/or output(s) (I/O) device(s) (or peripherals) that are communicatively coupled via a local interface circuit, circuit elements (e.g., transistors, resistors, capacitors, inductors, and so forth), integrated circuits, application specific integrated circuits (ASIC), programmable logic devices (PLD), digital signal processors (DSP), field programmable gate array (FPGA), logic gates, registers, semiconductor device, chips, microchips, chip sets, and so forth. The local interface may include, for example, one or more buses or other wired or wireless connections, controllers, buffers (caches), drivers, repeaters and receivers, etc., to allow appropriate communications between hardware components. A processor is a hardware device for executing software, particularly software stored in memory. The processor can be any custom made or commercially available processor, a central processing unit (CPU), an auxiliary processor among several processors associated with the computer, a semiconductor based microprocessor (e.g., in the form of a microchip or chip set), a macroprocessor, or generally any device for executing software instructions. A processor can also represent a distributed processing architecture. The I/O devices can include input devices, for example, a keyboard, a mouse, a scanner, a microphone, a touch screen, an interface for various medical devices and/or laboratory instruments, a bar code reader, a stylus, a laser reader, a radio-frequency device reader, etc. Furthermore, the I/O devices also can include output devices, for example, a printer, a bar code printer, a display, etc. Finally, the I/O devices further can include devices that communicate as both inputs and outputs, for example, a modulator/demodulator (modem; for accessing another device, system, or network), a radio frequency (RF) or other transceiver, a telephonic interface, a bridge, a router, etc.


Examples of software may include software components, programs, applications, computer programs, application programs, system programs, machine programs, operating system software, middleware, firmware, software modules, routines, subroutines, functions, methods, procedures, software interfaces, application program interfaces (API), instruction sets, computing code, computer code, code segments, computer code segments, words, values, symbols, or any combination thereof. A software in memory may include one or more separate programs, which may include ordered listings of executable instructions for implementing logical functions. The software in memory may include a system for identifying data streams in accordance with the present teachings and any suitable custom made or commercially available operating system (O/S), which may control the execution of other computer programs such as the system, and provides scheduling, input-output control, file and data management, memory management, communication control, etc.


According to various exemplary embodiments, one or more features of any one or more of the above-discussed teachings and/or exemplary embodiments may be performed or implemented using appropriately configured and/or programmed non-transitory machine-readable medium or article that may store an instruction or a set of instructions that, if executed by a machine, may cause the machine to perform a method and/or operations in accordance with the exemplary embodiments. Such a machine may include, for example, any suitable processing platform, computing platform, computing device, processing device, computing system, processing system, computer, processor, scientific or laboratory instrument, etc., and may be implemented using any suitable combination of hardware and/or software. The machine-readable medium or article may include, for example, any suitable type of memory unit, memory device, memory article, memory medium, storage device, storage article, storage medium and/or storage unit, for example, memory, removable or non-removable media, erasable or non-erasable media, writeable or re-writeable media, digital or analog media, hard disk, floppy disk, read-only memory compact disc (CD-ROM), recordable compact disc (CD-R), rewriteable compact disc (CD-RW), optical disk, magnetic media, magneto-optical media, removable memory cards or disks, various types of Digital Versatile Disc (DVD), a tape, a cassette, etc., including any medium suitable for use in a computer. Memory can include any one or a combination of volatile memory elements (e.g., random access memory (RAM, such as DRAM, SRAM, SDRAM, etc.)) and nonvolatile memory elements (e.g., ROM, EPROM, EEROM, Flash memory, hard drive, tape, CDROM, etc.). Moreover, memory can incorporate electronic, magnetic, optical, and/or other types of storage media. Memory can have a distributed architecture where various components are situated remote from one another, but are still accessed by the processor. 
The instructions may include any suitable type of code, such as source code, compiled code, interpreted code, executable code, static code, dynamic code, encrypted code, etc., implemented using any suitable high-level, low-level, object-oriented, visual, compiled and/or interpreted programming language.


According to various exemplary embodiments, one or more features of any one or more of the above-discussed teachings and/or exemplary embodiments may be performed or implemented at least partly using a distributed, clustered, remote, or cloud computing resource.


According to various exemplary embodiments, one or more features of any one or more of the above-discussed teachings and/or exemplary embodiments may be performed or implemented using a source program, executable program (object code), script, or any other entity comprising a set of instructions to be performed. When a source program, the program can be translated via a compiler, assembler, interpreter, etc., which may or may not be included within the memory, so as to operate properly in connection with the O/S. The instructions may be written using (a) an object oriented programming language, which has classes of data and methods, or (b) a procedural programming language, which has routines, subroutines, and/or functions, which may include, for example, C, C++, R, Pascal, Basic, Fortran, Cobol, Perl, Java, and Ada.


According to various exemplary embodiments, one or more of the above-discussed exemplary embodiments may include transmitting, displaying, storing, printing or outputting to a user interface device, a computer readable storage medium, a local computer system or a remote computer system, information related to any information, signal, data, and/or intermediate or final results that may have been generated, accessed, or used by such exemplary embodiments. Such transmitted, displayed, stored, printed or outputted information can take the form of searchable and/or filterable lists of runs and reports, pictures, tables, charts, graphs, spreadsheets, correlations, sequences, and combinations thereof, for example.


EXAMPLES
Example 1—Assay Materials and Methods

Nucleic acid sequencing-based assays to identify and characterize the microbial composition of samples were conducted using DNA amplicon libraries generated from sample nucleic acids using two separate primer pools. One library was prepared using a primer pool for targeted amplification of microbial 16S rRNA DNA (the “16S primer pool”) and the other library was prepared using a primer pool for targeted amplification of unique DNA sequences of different microbial species (the “species primer pool”). Primers used in the 16S primer pools included the primer pairs listed as SEQ ID NOS: 1-27 in Table 15 (see Example 5), and primers used in the species primer pools included the primer pairs listed as SEQ ID NOS: 49-520 in Table 16 (see Example 5) designed to amplify species-specific target sequences including sequences listed as SEQ ID NOS: 1605-1826 in Table 17 (see Example 5). After libraries were generated, templates were prepared through amplification of library amplicons, e.g., using an Ion Chef™ or Ion OneTouch™ 2 System, and the templates were sequenced using next generation sequencing technology, e.g., an Ion S5™ or an Ion PGM™ System.


Sample Processing and Nucleic Acid Extraction

When the sample was a biological specimen, such as a human fecal stool specimen, total nucleic acids (including DNA and RNA) were extracted from the specimen (e.g., 50-250 mg) using the MagMAX™ Microbiome Ultra Nucleic Acid Isolation Kit (Thermo Fisher Scientific; catalog no. A42357 (with plate) or A42358 (with tubes)) and the Thermo Scientific™ Kingfisher™ Flex Magnetic Particle Processor with 96 deep well heads (Thermo Fisher Scientific; catalog no. 5400630) for automated lysate particle processing after an initial bead-beating lysis step using MagMAX™ Microbiome kit reagents. Typical elution volume of extracted nucleic acids was 200 μl, and typical recovery volume was 180 μl. The extraction was conducted according to manufacturer's instructions except that RNaseA was added to the Wash-I Plate 1 and Wash-I-Plate 2 prior to purification to remove RNA from the extracted nucleic acids and obtain isolated DNA for use in the assays. Bacterial samples obtained from ATCC consisted of DNA that had already been extracted.


Extracted DNA was quantitated using the Qubit™ dsDNA BR Assay Kit (Thermo Fisher Scientific; catalog no. Q32853) or Qubit™ dsDNA HS Assay Kit (Thermo Fisher Scientific; catalog no. Q32851) according to manufacturer's instructions. DNA concentrations of at least 1.67 ng/μl, and more preferably greater than 10 ng/μl, are preferred.


DNA Amplicon Library Preparation

Two DNA amplicon libraries (a 16S rRNA gene segment DNA library and a targeted bacterial species DNA library) were prepared from each sample for each assay. The libraries were generated using reagents in the Ion AmpliSeq™ Library Kit Plus (Thermo Fisher Scientific; catalog no. A35907; see Table 3) for highly multiplexed PCR amplification of hundreds of target sequences.









TABLE 3
Ion AmpliSeq™ Library Kit Plus Components for Library Preparation

| Component | Cap Color | Quantity | Volume | Storage |
|---|---|---|---|---|
| 5X Ion AmpliSeq™ HiFi Mix | Red | 1 tube | 480 μL | −30° C. to −10° C. |
| FuPa Reagent | Brown | 1 tube | 192 μL | −30° C. to −10° C. |
| Switch Solution | Yellow | 1 tube | 384 μL | −30° C. to −10° C. |
| DNA Ligase | Blue | 1 tube | 192 μL | −30° C. to −10° C. |
| 25X Library Amp Primers | Pink | 1 tube | 192 μL | −30° C. to −10° C. |
| 1X Library Amp Mix | Black | 1 tube | 4 × 1.2 mL | −30° C. to −10° C. |
| Low TE | White | 1 tube | 2 × 60 mL | 15° C. to 30° C. |


The 25× library amp primers and 1× library amp mix provided in the kit were not used. Instead, the following primers were used as shown in Table 3A.









TABLE 3A
Ion AmpliSeq™ Microbiome Health Research Kit

| Component | Pool # | Concentration | Volume | Storage |
|---|---|---|---|---|
| 16S rRNA Gene Primer Pool | 1 | 5X | 260 μL | −30° C. to −10° C. |
| Target Species Primer Pool | 2 | 5X | 260 μL | −30° C. to −10° C. |


The concentration of each of the individual primers in each primer pool is 1,000 nM (at the 5× concentration). The concentration of the individual primers in the amplification reaction is 200 nM (1× reaction concentration). Prior to conducting the amplification protocol, the Low TE bottle was removed from the library kit in cold storage, defrosted and stored in an ambient location. Ethanol (70%), used in library purification, was also prepared. To begin the amplification protocol, the HiFi Mix from the library kit was thawed and vortexed for 5 sec to mix. Primer pools (16S primer pool and species primer pool) were removed from cold storage, warmed to room temperature, vortexed for 5 sec to mix and quick spun to draw the liquid to the bottom of the tube. Next, for each separate primer pool reaction, the primer pool, HiFi Mix, sample DNA to be amplified and nuclease free water (e.g., Invitrogen™ Nuclease Free Water, not DEPC treated; Thermo Fisher Scientific; catalog no. AM9937) were combined in a 96-well reaction plate as set out in Tables 4 and 5 (two separate amplification reactions were conducted for each sample: one for each of the two primer pools).
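The stated concentrations can be checked against the reaction volumes of Tables 4 and 5 with a short calculation, sketched here for illustration. The function and variable names are hypothetical; the numerical inputs are those given in the text and tables.

```python
def final_concentration(stock_conc: float, stock_ul: float, total_ul: float) -> float:
    """Concentration after diluting stock_ul of stock into total_ul."""
    return stock_conc * stock_ul / total_ul


# Volumes (in microliters) from Tables 4 and 5.
HIFI_MIX_UL = 4.0
PRIMER_POOL_UL = 4.0
DNA_UL = 6.0
WATER_UL = 6.0
total_ul = HIFI_MIX_UL + PRIMER_POOL_UL + DNA_UL + WATER_UL

# 1,000 nM per primer in the 5X pool gives the 1X reaction concentration.
primer_final_nm = final_concentration(1000.0, PRIMER_POOL_UL, total_ul)

# DNA mass added per reaction (ng) = concentration (ng/uL) x volume (uL).
dna_16s_ng = 0.167 * DNA_UL      # Primer Pool 1 (16S) reaction
dna_species_ng = 1.67 * DNA_UL   # Primer Pool 2 (species) reaction

print(total_ul, primer_final_nm, round(dna_16s_ng, 3), round(dna_species_ng, 2))
```

Under these inputs, each primer is present at 200 nM in the 20 μL reaction, and the DNA inputs are approximately 1 ng for the 16S primer pool reaction and approximately 10 ng for the species primer pool reaction.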









TABLE 4
Amplification Reaction Mix Setup for Primer Pool 1 (16S Primer Pool)

| Order of Addition | Component | Concentration | Volume (μL) |
|---|---|---|---|
| 1 | Ion AmpliSeq™ HiFi Mix (red cap) | 5X | 4 |
| 2 | Ion AmpliSeq™ Primer Pool 1 - 16S | 5X | 4 |
| 3 | DNA Sample | 0.167 ng/μL | 6 |
| 4 | Nuclease Free Water | n/a | 6 |
| | Total | | 20 |


TABLE 5
Amplification Reaction Mix Setup for Primer Pool 2 (Species Primer Pool)

| Order of Addition | Component | Concentration | Volume (μL) |
|---|---|---|---|
| 1 | Ion AmpliSeq™ HiFi Mix (red cap) | 5X | 4 |
| 2 | Ion AmpliSeq™ Primer Pool 2 - Species | 5X | 4 |
| 3 | DNA Sample | 1.67 ng/μL | 6 |
| 4 | Nuclease Free Water | n/a | 6 |
| | Total | | 20 |


The 96-well reaction plate was sealed with a MicroAmp™ Clear Adhesive film (Thermo Fisher Scientific; catalog no. 4306311) and vortexed for 5 sec to thoroughly mix the components. The plate was quick spun to draw the liquid to the bottom of the plate and a MicroAmp™ Compression Pad was placed on the plate which was then placed in a thermal cycler. The amplification reaction was performed using the cycling parameters listed in Table 6.









TABLE 6
Amplification, Digestion and Ligation Reaction Thermal Cycling Parameters

| Stage | Step | Temperature | Time | Cycles |
|---|---|---|---|---|
| AMPLIFICATION REACTION PARAMETERS | | | | |
| 1: Hold | 1 | 99° C. | 2 min | n/a |
| 2: Cycling | 1 | 99° C. | 15 sec | 20 |
| 2: Cycling | 2 | 60° C. | 4 min | 20 |
| 3: Hold | 1 | 10° C. | | n/a |
| DIGESTION REACTION PARAMETERS | | | | |
| 1: Hold | 1 | 50° C. | 10 min | n/a |
| 2: Hold | 1 | 55° C. | 10 min | n/a |
| 3: Hold | 1 | 60° C. | 20 min | n/a |
| 4: Hold | 1 | 10° C. | ∞ (1 hour max) | n/a |
| LIGATION REACTION PARAMETERS | | | | |
| 1: Hold | 1 | 22° C. | 30 min | n/a |
| 2: Hold | 1 | 68° C. | 5 min | n/a |
| 3: Hold | 1 | 72° C. | 5 min | n/a |
| 4: Hold | 1 | 10° C. | | n/a |


To trim the ends of the amplicons, the primer ends were partially digested with FuPa reagent from the library kit. The plate was removed from the thermal cycler and quick spun to draw the liquid to the bottom of the plate and then unsealed. FuPa reagent (2 μl) was added to 20 μl of amplified DNA sample. The plate was sealed with MicroAmp™ Clear Adhesive film, vortexed for 5 sec to thoroughly mix the components and quick spun. A MicroAmp™ Compression Pad was placed on the plate which was then placed in a thermal cycler. The digestion reaction was performed using the cycling parameters listed in Table 6.


The trimmed amplicons from each reaction were then ligated with IONCode™ Barcode Adapters 1-384 (Thermo Fisher Scientific; catalog no. A29751). Different adapter pairs were ligated to separate amplicon libraries. Each adapter pair contains a barcode adapter and an ION P1 adapter in order to enable unique identification of different libraries and sequencing in the Ion GeneStudio™ S5 sequencing system. In preparing the ligation reaction, the Switch Solution from the library kit and the Barcode Adapters were separately warmed to room temperature, vortexed for 5 sec and quick spun. The library plate from the digestion reaction was removed from the thermal cycler, quick spun and the seal film was removed from the plate. The components of the ligation reaction were then added to the wells of the plate as shown in Table 7.









TABLE 7
Ligation Reaction Mix Setup

| Order of Addition | Component | Concentration | Volume |
|---|---|---|---|
| n/a | Digested DNA Sample | n/a | 22 μL |
| 1 | Switch Solution (yellow cap) | n/a | 4 μL |
| 2 | Barcode Adapters | n/a | 2 μL |
| 3 | DNA Ligase (blue cap) | n/a | 2 μL |
| | Total | | 30 μL |


The reaction plate was sealed with a MicroAmp™ Clear Adhesive film, vortexed for 5 sec to thoroughly mix the components, quick spun and a MicroAmp™ Compression Pad was placed on the plate which was then placed in a thermal cycler. The ligation reaction was performed using the thermal parameters listed in Table 6.
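For planning purposes, the programmed duration of each of the three thermal cycler programs of Table 6 can be estimated as sketched below. The sketch is illustrative only: it ignores instrument ramping time and the final 10° C. holds, and the `Stage` structure and names are hypothetical rather than any instrument's programming interface.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    """One stage: (temperature in deg C, seconds) steps repeated `cycles` times."""
    steps: tuple
    cycles: int = 1


def program_seconds(stages) -> int:
    """Programmed time in seconds, excluding ramping and final 10 deg C holds."""
    return sum(
        stage.cycles * sum(seconds for _temp, seconds in stage.steps)
        for stage in stages
    )


# Stages transcribed from Table 6 (final 10 deg C holds omitted).
AMPLIFICATION = (
    Stage(steps=((99, 120),)),
    Stage(steps=((99, 15), (60, 240)), cycles=20),
)
DIGESTION = (Stage(((50, 600),)), Stage(((55, 600),)), Stage(((60, 1200),)))
LIGATION = (Stage(((22, 1800),)), Stage(((68, 300),)), Stage(((72, 300),)))

for name, program in (
    ("amplification", AMPLIFICATION),
    ("digestion", DIGESTION),
    ("ligation", LIGATION),
):
    print(f"{name}: {program_seconds(program) / 60:.0f} min")
```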


The libraries were purified using the Agencourt AMPure XP Reagent (Fisher Scientific; catalog no. NC9959336). First, 70% ethanol was prepared by combining 100% ethanol with nuclease-free water (300 μl per library plus dead volume is required: 210 μl 100% ethanol+90 μl nuclease-free water). The Agencourt AMPure XP reagent was allowed to warm to room temperature and vortexed for 30 sec immediately prior to use. The library plate was removed from the thermal cycler, quick spun and the seal film was removed from the plate. Agencourt AMPure XP Reagent (45 μl) was added to each library and the contents of each library were pipetted up and down 5 times to mix the components. The mix was incubated for 5 min at room temperature and then the plate was placed in a DynaMag-96 Side Magnet plate holder and incubated for 2 min at room temperature or until the mix cleared. A first wash step was performed by removing supernatant without disturbing the pellet, adding 150 μl of freshly prepared 70% ethanol and moving the plate from side-to-side 3 times in the DynaMag plate holder to wash the beads. The wash step was repeated once and after the wash, all the supernatant was removed and the plate was air dried for 5 min and then removed from the DynaMag plate holder. To elute the library DNA from the AMPure XP beads, 50 μl of Low TE from the library kit was added to each library and the plate was sealed with MicroAmp™ Clear Adhesive film, vortexed for 5 sec, quick spun and incubated for 2 min at room temperature. The plate was placed in a DynaMag-96 Side Magnet plate holder and incubated for 2 min at room temperature or until the mix cleared. The seal was then removed from the plate and the supernatant containing the eluted library was transferred to a new 96-well plate or tubes, without disturbing the pellet. The plates, or tubes, were labeled with sample information and stored at +4° C. short term or −20° C. long term.
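The ethanol preparation and bead ratio described above can be expressed as a small calculation, sketched here for illustration. The function name and the dead-volume parameter are hypothetical; the text specifies only that dead volume should be allowed for, not how much.

```python
def ethanol_mix(n_libraries: int, per_library_ul: float = 300.0,
                dead_volume_fraction: float = 0.0, percent: int = 70):
    """Return (uL of 100% ethanol, uL of nuclease-free water).

    per_library_ul covers the two 150 uL washes per library.
    dead_volume_fraction is an assumed allowance chosen by the user.
    """
    total = n_libraries * per_library_ul * (1 + dead_volume_fraction)
    ethanol = total * percent / 100
    return ethanol, total - ethanol


# One library with no dead volume reproduces the 210 uL + 90 uL recipe.
ethanol_ul, water_ul = ethanol_mix(1)

# Bead-to-sample ratio: 45 uL of reagent added to the 30 uL ligation reaction.
bead_ratio = 45.0 / 30.0

print(ethanol_ul, water_ul, bead_ratio)
```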


Each eluted library was quantitated using real time PCR and reagents provided in the Ion Library TaqMan™ Quantification Kit (Thermo Fisher Scientific; catalog no. 4468802) alongside PCR reactions of a serially diluted control library, used to create a standard curve, and a no template control. Two replicate quantitation reactions were performed for each eluted library, control library and no template control, and the mean value was used to calculate the final library concentration. Up to forty-four eluted libraries could be quantitated on a single reaction plate. The eluted library was pre-diluted (e.g., 1:500) in Low TE prior to measuring the concentration. An example of a plate layout used is shown in Table 8.









TABLE 8
Library Quantitation Template Plate Layout for 44 Libraries, 3 Control Libraries and 1 No-Template Control

| | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| A | S01-r1 | S01-r2 | S09-r1 | S09-r2 | S17-r1 | S17-r2 | S25-r1 | S25-r2 | S33-r1 | S33-r2 | S41-r1 | S41-r2 |
| B | S02-r1 | S02-r2 | S10-r1 | S10-r2 | S18-r1 | S18-r2 | S26-r1 | S26-r2 | S34-r1 | S34-r2 | S42-r1 | S42-r2 |
| C | S03-r1 | S03-r2 | S11-r1 | S11-r2 | S19-r1 | S19-r2 | S27-r1 | S27-r2 | S35-r1 | S35-r2 | S43-r1 | S43-r2 |
| D | S04-r1 | S04-r2 | S12-r1 | S12-r2 | S20-r1 | S20-r2 | S28-r1 | S28-r2 | S36-r1 | S36-r2 | S44-r1 | S44-r2 |
| E | S05-r1 | S05-r2 | S13-r1 | S13-r2 | S21-r1 | S21-r2 | S29-r1 | S29-r2 | S37-r1 | S37-r2 | STD01-r1 | STD01-r2 |
| F | S06-r1 | S06-r2 | S14-r1 | S14-r2 | S22-r1 | S22-r2 | S30-r1 | S30-r2 | S38-r1 | S38-r2 | STD02-r1 | STD02-r2 |
| G | S07-r1 | S07-r2 | S15-r1 | S15-r2 | S23-r1 | S23-r2 | S31-r1 | S31-r2 | S39-r1 | S39-r2 | STD03-r1 | STD03-r2 |
| H | S08-r1 | S08-r2 | S16-r1 | S16-r2 | S24-r1 | S24-r2 | S32-r1 | S32-r2 | S40-r1 | S40-r2 | NTC-r1 | NTC-r2 |
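The layout of Table 8 follows a regular pattern: each entry occupies two adjacent columns (replicates r1 and r2), and entries fill rows A through H before moving to the next column pair. As an illustrative sketch only, such a layout could be generated programmatically as follows; the function name and the well-naming scheme (row letter followed by column number) are assumptions made for the example.

```python
ROWS = "ABCDEFGH"


def build_layout(n_libraries: int = 44) -> dict:
    """Map well names (e.g. 'A1') to labels, duplicates side by side.

    Libraries are followed by three standards and one no-template control,
    filling each pair of columns from row A to row H.
    """
    names = [f"S{i:02d}" for i in range(1, n_libraries + 1)]
    names += ["STD01", "STD02", "STD03", "NTC"]
    if len(names) > 48:
        raise ValueError("a 96-well plate holds at most 48 duplicate pairs")
    layout = {}
    for idx, name in enumerate(names):
        row = ROWS[idx % 8]
        first_col = 2 * (idx // 8) + 1
        layout[f"{row}{first_col}"] = f"{name}-r1"
        layout[f"{row}{first_col + 1}"] = f"{name}-r2"
    return layout


layout = build_layout()
print(layout["A1"], layout["A4"], layout["E11"], layout["H12"])
```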









The TaqMan PCR Master Mix, TaqMan Quantitation Assay and control library tubes from the Quantification Kit were warmed to room temperature, vortexed for 5 sec to mix and quick spun. Three 1:10 serial dilutions from the stock concentration (68 pM) of the control library were prepared for use in generating a standard curve as follows. Low TE (45 μl) was added to each of 4 microcentrifuge tubes labeled as STD01, STD02, STD03 and NTC. Stock concentration control library (5 μl) was added to the tube labeled STD01, which was then capped, vortexed for 5 sec and quick spun. Five microliters of diluted library from tube STD01 were added to tube STD02, which was then capped, vortexed for 5 sec and quick spun, followed by addition of 5 μl of diluted library from tube STD02 to tube STD03, which was also capped, vortexed for 5 sec and quick spun. This generated 3 control library serial dilutions for use as standards, with the following concentrations: 6.8 pM (STD01), 0.68 pM (STD02) and 0.068 pM (STD03). The no-template control (NTC) tube did not contain a library. Amplification reactions were then prepared. Each library was run in duplicate reactions, the results of which were averaged. Each reaction included aliquots from the Master Mix, Assay and library (eluted sample, STD or NTC) tubes as listed in Table 9.
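The concentrations of the standards follow directly from the transfer and diluent volumes, as the following illustrative sketch shows. The function name is hypothetical; the inputs (68 pM stock, 5 μl transferred into 45 μl of Low TE, three steps) are those stated above.

```python
def serial_dilution(stock: float, transfer_ul: float, diluent_ul: float,
                    steps: int) -> list:
    """Concentrations obtained by repeatedly transferring into fresh diluent."""
    factor = transfer_ul / (transfer_ul + diluent_ul)  # 5 / 50 = 1:10
    concentrations = []
    current = stock
    for _ in range(steps):
        current *= factor
        concentrations.append(current)
    return concentrations


standards_pm = serial_dilution(stock=68.0, transfer_ul=5.0, diluent_ul=45.0, steps=3)
for label, conc in zip(("STD01", "STD02", "STD03"), standards_pm):
    print(f"{label}: {conc:.3g} pM")
```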









TABLE 9
Library Quantitation Reaction Mix Setup

| Order of Addition | Component | Concentration | Volume (μL) |
|---|---|---|---|
| 1 | TaqMan qPCR Master Mix | 2X | 10 |
| 2 | TaqMan Quantitation Assay | 20X | 1 |
| 3 | Sample (Diluted Library, STD Library, NTC) | n/a | 9 |
| | Total | | 20 |


The reaction plate was sealed with the adhesive film, vortexed for 5 sec, quick spun and placed in a real time PCR instrument. The library quantitation reactions were performed using the thermal cycling parameters and plate run setup settings listed in Tables 10 and 11.









TABLE 10
Library Quantitation Reaction Thermal Cycling Parameters

| Stage | Step | Temperature | Time | Cycles |
|---|---|---|---|---|
| 1: Hold | 1 | 50° C. | 2 min | n/a |
| 2: Hold | 1 | 95° C. | 20 sec | n/a |
| 3: Cycling | 1 | 95° C. | 1 sec | 40 |
| 3: Cycling | 2 | 60° C. | 20 sec | 40 |


TABLE 11
Library Quantitation Plate Run Setup Settings

1. Generate definitions:
   Target Assay Name: Define as LibQuant
   Reporter Dye: Define as FAM
   Quencher: Define as NFQ-MGB
   Passive Reference Dye: Define as ROX
   Sample Names: Define for each eluted library, standard library and the NTC
2. Assign the target assay and samples to the appropriate plate locations
3. Assign tasks:
   For each eluted library dilution: Assign the task as U (unknown)
   For each control library dilution: Assign the task as S (standard) and enter appropriate concentration
   For the NTC: Assign the task as N
4. Enter reaction volume of 20 μl
5. Set analysis settings:
   Threshold: Set Threshold to 0.2
   Automatic baseline: Check the box next to Automatic Baseline
   Note: do not use default settings or automatic threshold
6. Export results
   Note: when the quantitation run is complete, export the results which contain the mean calculated quantity of each of the eluted library dilutions (pM units). Refer to the generic Ion Library TaqMan Quantitation Kit user guide publication MAN0015802.
7. Calculate final library concentration
   Calculate the stock concentration of the eluted library: multiply the mean concentration of the eluted library dilution exported from the quantitation run by the dilution factor used (e.g., 500)









Each library was normalized to a concentration of 50 pM. The results of the library quantitation were used to calculate the dilution factor required to reach 50 pM concentration. If the library concentration was at or below 50 pM, then no dilution was required. To normalize the library, the following formula was used to calculate the dilution factor required to normalize the library to 50 pM: dilution factor=(eluted library concentration pM)/(50 pM). Eluted libraries were warmed to room temperature, vortexed for 5 sec and quick spun. The eluted library was combined with diluent (Low TE buffer) in a microcentrifuge tube or plate using the calculated dilution factor. A minimum of 5 μl of eluted library was used to create the normalized library. The diluted library was vortexed for 5 sec, quick spun and stored at +4° C. short term or −20° C. long term.
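For illustration, the stock concentration calculation of Table 11 and the 50 pM normalization formula can be sketched as follows. This is a minimal sketch and not part of any kit software; the function names are hypothetical, and the diluent volume assumes that a dilution factor F means a final volume of F times the library volume.

```python
TARGET_PM = 50.0  # normalization target stated in the text


def stock_concentration(mean_diluted_pm, assay_dilution):
    """Stock concentration (pM) = mean quantity of the diluted library
    from the quantitation run x the dilution factor used (e.g., 500)."""
    return mean_diluted_pm * assay_dilution


def normalization_dilution_factor(stock_pm, target_pm=TARGET_PM):
    """Dilution factor = stock concentration / target concentration.
    Libraries at or below the target are not diluted (factor 1)."""
    if stock_pm <= target_pm:
        return 1.0
    return stock_pm / target_pm


def diluent_volume(library_ul, factor):
    """Volume of Low TE (ul) to add to `library_ul` of eluted library
    (assumption: final volume = factor x library volume)."""
    return library_ul * (factor - 1.0)


# Hypothetical reading: a 1:500 dilution measured at 0.5 pM
stock = stock_concentration(0.5, 500)          # 250 pM
factor = normalization_dilution_factor(stock)  # 5-fold dilution
print(stock, factor, diluent_volume(5.0, factor))
```

With these hypothetical values, 5 μl of eluted library would be combined with 20 μl of Low TE to reach 50 pM.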


For conducting sequencing of the libraries, all the normalized libraries created from the 16S and species primer pools were combined at equimolar concentrations to formulate a pool that was 50 pM concentration. The library pool was created by combining equal volumes of each normalized library. A minimum of 5 μl of each eluted library was typically used to create the library pool. If an eluted library concentration was at or below 50 pM, an equal volume of the eluted library was added into the pool. Libraries with a concentration of less than 50 pM may not generate sufficient usable reads for analysis. The combined library pool was vortexed for 5 sec, quick spun and stored at +4° C. short term or −20° C. long term.


Preparing the Library for Sequencing

An aliquot of the final library was used in templating of the library amplicons onto bead supports (e.g., Ion Sphere Particles) using an Ion Chef™ instrument and Ion 540™ Kit-Chef and Ion 540™ Chip Kit according to the manufacturer's instructions.


Example 2—Sequencing of Library Template DNAs

Semiconductor chips containing the library DNA-templated beads were loaded into an Ion S5 Sequencer (Thermo Fisher Scientific) and sequencing of the DNA templates was conducted according to manufacturer's instructions.


Example 3—Data Analysis

Reads obtained from sequencing of library DNA templates were analyzed to identify, and determine the levels of, microbial constituents of the samples. Analysis was conducted using a workflow incorporating Ion Torrent Suite™ Software (Thermo Fisher Scientific) with a run plan template designed to facilitate microbial DNA sequence read analysis. The analysis program, which includes computational methods described herein, is referred to as an AmpliSeq microbiome analysis software plugin which generates counts for amplicons targeted in the assay. Reference sequences derived from the GreenGenes bacterial 16S rRNA gene sequence public database (see, e.g., www.greengenes.lbl.gov) were used for mapping of reads obtained from sequencing of amplicons generated using the 16S primer pool. Reference microbial genome sequences available in an NCBI public database (see www.ncbi.nlm.nih.gov/genome/microbes/) were used for mapping reads obtained from sequencing of amplicons generated using the species primer pool.


As shown in FIG. 2, which is a block diagram of a method for processing the sequence reads to determine microbial composition, the reads from the two amplicon libraries (16S primer pool library and species primer pool library) were separately analyzed. The barcode/sample name parser separates the sequence reads into a set corresponding to reads of amplicons generated using the 16S primer pool (the 16S amplicons) and a set corresponding to reads of amplicons generated using the species primer pool (the species amplicons). In this process, the unaligned sequence reads from BAM files are first trimmed (quality and length trimming performed using BaseCaller software) and then filtered to remove short (e.g., <60 bp) reads likely to have not originated from the amplified DNA product. The BAM file reads are then mapped to reference sequences.
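The separation of reads by primer pool and the removal of short reads described above can be sketched as follows. This is an illustrative sketch only: the representation of reads as (pool label, sequence) pairs and the function name are assumptions, and the actual workflow operates on BAM files with trimming performed by BaseCaller software.

```python
from collections import defaultdict

MIN_READ_LENGTH = 60  # example cutoff from the text (reads < 60 bp are removed)


def split_and_filter(reads, min_length=MIN_READ_LENGTH):
    """Group (pool_label, sequence) pairs by pool and drop short reads.

    `reads` is a hypothetical in-memory stand-in for parsed, trimmed reads.
    """
    by_pool = defaultdict(list)
    for pool, sequence in reads:
        if len(sequence) >= min_length:
            by_pool[pool].append(sequence)
    return dict(by_pool)


example_reads = [
    ("16S", "A" * 80),
    ("species", "C" * 59),  # shorter than the cutoff, removed
    ("species", "G" * 60),
]
filtered = split_and_filter(example_reads)
print({pool: len(seqs) for pool, seqs in filtered.items()})
```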



FIG. 3 is a block diagram of the amplicon processing pipeline used in analysis of the amplicon library generated using the 16S primer pool. The 16S amplicon reads were subjected to two alignment/mapping steps. In the first alignment step, the reads were aligned and mapped, with multi-mapping and end-to-end mapping enabled, to segments of bacterial 16S rRNA reference gene sequences obtained by in silico PCR of a set of full-length bacterial 16S rRNA reference genes (e.g., the GreenGenes database) to generate amplicon sequences expected to be amplified using the 16S primers (i.e., expected hypervariable region amplicons). For each microorganism identified by in silico PCR as containing sequences of expected hypervariable region amplicons, an expected signature pattern was generated based on which of the 16S primer pairs would be expected to amplify a sequence in the microorganism and which would not. For example, the amplicons for each of 8 hypervariable regions that could be amplified by the 16S primer pairs per microorganism 16S rRNA reference gene sequence were assigned a binary notation based on whether or not an amplicon was expected to be amplified for the microorganism, yielding an expected signature pattern of ones (indicating an amplicon was expected to be generated) and zeros (indicating an amplicon was not expected to be generated). Thus, for purposes of illustration, a particular species of microorganism could have a signature pattern of expected hypervariable region amplicons of: V2 (1), V2 (0), V2 (1), V3 (1), V4 (0), V5 (1), V6 (1), V7 (0), V8 (1), V8 (0), V8 (1), V9 (0). 
In this illustration, the 16S primer pool used in the in silico PCR of a set of full-length bacterial 16S rRNA reference genes would have included 3 versions of primers (i.e., degenerate primers) for amplifying the V2 region, 3 versions of primers (i.e., degenerate primers) for amplifying the V8 region, and only one primer pair each for amplifying each of the other hypervariable regions (i.e., V3, V4, V5, V6, V7 and V9). In a first mapping step, the reads are aligned to the reference hypervariable segments of the 16S reference set, with multi-mapping and end-to-end mapping enabled. The mapping steps determine aligned sequence reads and associated mapping quality parameters. The mapped reads are filtered based on alignment quality. After the first alignment step, the sequence reads were separated and assigned to a hypervariable region according to the different expected hypervariable region amplicons produced by the separate 16S primer pairs based on the alignments to generate a matrix of observed read counts for each of the targeted hypervariable regions for each species and strain. The number of sequence reads assigned to each expected hypervariable region for each microorganism were counted to obtain a total sequence read count for each microorganism and a series of computational steps as described herein were performed applying read thresholds (including read count thresholds per hypervariable region, as well as total read counts per 16S rRNA gene sequence) to reduce the number of reference sequences that would be used in a second alignment and mapping. The decision as to whether to include a 16S rRNA gene reference sequence in the second alignment step was thus determined by using read count thresholds per hypervariable region, as well as total read counts per 16S rRNA gene sequence. 
Only those 16S rRNA gene reference sequences that satisfied the criteria of at least a threshold number of total read counts per sequence, at least a threshold number of read counts per hypervariable region, and an observed pattern of reads having at least a threshold level of similarity to the expected signature pattern were retained among the reference sequences used in the second mapping step. This group of microorganisms was used as a processed, reduced, filtered, high-confidence group of microorganism full-length 16S rRNA reference gene sequences (compared to the original complete set of 16S rRNA reference gene sequences contained in the GreenGenes database) to which all of the sequence reads were aligned in a second alignment step. In processing the original database of 16S rRNA gene sequences in this method, the number of reference sequences used in the second, and final, alignment step was typically reduced at least 50-fold, 75-fold or more, depending on the number of microorganisms in a sample. Furthermore, the quality of the reference sequences used in the second alignment step was greatly improved relative to the original database reference sequences as unannotated and incorrectly classified reference sequences were identified and reannotated and corrected (based on a sequence similarity metric, Levenshtein distance) during the processing of the database. Following the second alignment step, the total number of sequence reads aligning to each reference 16S rRNA gene sequence of the filtered group of microorganism sequences was determined as a sequence read count for each of the reference sequences. Only sequence reads aligning to the expected amplicon sequences were included in the read count (i.e., reads aligning to unexpected sequences in a 16S rRNA gene based on the primers used in the amplification were excluded from the count). 
Each read count was normalized by dividing it by the number of expected hypervariable region amplicons for the microorganism (e.g., the number of “ones” in the expected signature pattern). In a second normalizing step, the first normalized counts were divided by an average copy number of the 16S gene for the species to form second normalized counts. The copy numbers for the species may be obtained from a 16S copy number database, such as rrnDB (https://rrndb.umms.med.umich.edu/; Stoddard S. F., Smith B. J., Hein R., Roller B. R. K. and Schmidt T. M. (2015) rrnDB: improved tools for interpreting rRNA gene abundance in bacteria and archaea and a new foundation for future development. Nucleic Acids Research 2014; doi: 10.1093/nar/gku1201). For a given species, the copy numbers for the 16S gene given in the database records may be averaged to form the average copy number used for the second normalizing step. The second normalized counts for each reference sequence were then aggregated by summing them per species, genus and family. The percentage of aggregated counts to the total number of mapped reads was calculated and thresholds were applied for species detection, genus detection and family detection to give relative abundances if the threshold criteria are met. The species, genus and/or family may be reported as present if the percentage value is greater than the respective threshold. The threshold for detection is also referred to as a noise threshold.
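For purposes of illustration only, the comparison of an observed read pattern to an expected signature pattern and the two normalizing steps may be sketched as follows. The similarity measure used here (fraction of primer-pair positions whose presence or absence of reads matches the expected pattern), the `min_reads` cutoff, the function names and the example values are all assumptions; the description does not define the similarity measure.

```python
# Expected signature pattern from the illustration in the text
# (V2, V2, V2, V3, V4, V5, V6, V7, V8, V8, V8, V9)
EXPECTED = [1, 0, 1, 1, 0, 1, 1, 0, 1, 0, 1, 0]


def signature_similarity(expected, observed_counts, min_reads=10):
    """Fraction of primer-pair positions where observed presence/absence
    of reads agrees with the expected 0/1 pattern (assumed measure)."""
    observed = [1 if count >= min_reads else 0 for count in observed_counts]
    matches = sum(1 for e, o in zip(expected, observed) if e == o)
    return matches / len(expected)


def normalize_read_count(read_count, expected, copy_numbers):
    """First normalization: divide by the number of expected amplicons
    (the ones in the pattern). Second normalization: divide by the
    average 16S copy number for the species."""
    first = read_count / sum(expected)
    average_copy_number = sum(copy_numbers) / len(copy_numbers)
    return first / average_copy_number


# Hypothetical per-primer-pair read counts for one reference sequence
counts = [120, 0, 95, 80, 0, 60, 70, 0, 40, 0, 55, 3]
print(signature_similarity(EXPECTED, counts))
# Hypothetical total of 700 reads and copy-number records of 4 and 6
print(normalize_read_count(700, EXPECTED, [4, 6]))
```

In this sketch, 700 reads over 7 expected amplicons give a first normalized count of 100, which an average copy number of 5 reduces to 20.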


Based on the aggregated read counts, a determination was made as to whether a species was present in the sample (a species aggregated read count had to meet a threshold of being greater than 0.1% of the total normalized read count), a genus was present in the sample (a genus aggregated read count had to meet a threshold of being greater than 0.5% of the total normalized read count) and a family was present in the sample (a family aggregated read count had to meet a threshold of being greater than 1% of the total normalized read count). Relative abundance was reported as species-specific, genus-specific and family-specific normalized read count each divided by the total normalized read counts.
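The aggregation and detection thresholds described above can be sketched as follows. This is a minimal sketch under stated assumptions: the data structures (a mapping of reference sequence identifiers to normalized counts, and a taxonomy lookup), the function name and the example counts are hypothetical; only the 0.1%, 0.5% and 1% thresholds are taken from the text.

```python
from collections import defaultdict

# Detection thresholds in percent of total normalized reads (from the text)
THRESHOLDS = {"species": 0.1, "genus": 0.5, "family": 1.0}


def call_taxa(normalized_counts, taxonomy):
    """Aggregate normalized counts per taxonomic level and report the
    relative abundance (percent) of taxa exceeding the level threshold.

    normalized_counts: {reference_id: normalized read count}
    taxonomy: {reference_id: {"species": ..., "genus": ..., "family": ...}}
    """
    total = sum(normalized_counts.values())
    results = {}
    for level, threshold in THRESHOLDS.items():
        aggregated = defaultdict(float)
        for ref_id, count in normalized_counts.items():
            aggregated[taxonomy[ref_id][level]] += count
        results[level] = {}
        for name, count in aggregated.items():
            percent = 100.0 * count / total if total else 0.0
            if percent > threshold:
                results[level][name] = percent
    return results


# Hypothetical example with three reference sequences
counts = {"ref1": 6000, "ref2": 3992, "ref3": 8}
taxonomy = {
    "ref1": {"species": "Bifidobacterium adolescentis", "genus": "Bifidobacterium", "family": "Bifidobacteriaceae"},
    "ref2": {"species": "Escherichia coli", "genus": "Escherichia", "family": "Enterobacteriaceae"},
    "ref3": {"species": "Bifidobacterium longum", "genus": "Bifidobacterium", "family": "Bifidobacteriaceae"},
}
calls = call_taxa(counts, taxonomy)
print(sorted(calls["species"]))
```

Here the third reference accounts for 0.08% of the total, so its species falls below the 0.1% species threshold, although its reads still contribute to the genus and family aggregates.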


The reads from the amplicons generated using the species primer pool were separately analyzed (FIG. 4) using microbial genome sequences as a reference which was not limited to a subset of genes as was the reference used for mapping of reads from the amplicons generated using the 16S primer pool. Prior to mapping, the microbial genome database was pre-processed to provide for more efficient and accurate alignment of amplicon reads to the reference sequences. In this pre-processing, the microbial genome sequences in the database were subjected to an in silico PCR analysis conducted using primers of the species primer pool to generate expected amplicon (primers+inserts) sequences from the whole genomes of all microbial strains in the database. The in silico PCR results identify genomes in the database that contain sequences that will be amplified by the primers in the species primer pool. Any genomes that do not contain sequence that would be amplified by the species primers were eliminated from the database. Any genomes that contain sequence that would be amplified using the species primers but would not be expected to contain such sequence were evaluated to determine the average nucleotide identity (ANI) between the genome and a genome that was expected to be amplified by the primers to assess possible misclassification and reannotation of the genome, and retainment of the genome in the database. A genome was reclassified only if it had greater than 95% identity to the genome to which it was being reclassified. Following pre-processing of the reference database, the sequence reads were aligned and subjected to alignment quality filtering. Only those reads that uniquely mapped to a single species (either uniquely to one reference sequence or to multiple reference sequences of the same species) were included in a read count. 
The number of reads mapping to a species was calculated and an aggregate read count per species was determined, normalized by dividing the aggregate number for a species by the number of total amplicons for the species (i.e., the total number of amplicons for the species for which there was a minimum threshold number (e.g., greater than 10) of aligning sequence reads) and reported at the species level. Based on the aggregated read counts, a determination was made as to whether a species was present in the sample (a species aggregated read count had to meet a threshold of being greater than 0.10 of the total normalized read count). Relative abundance was reported as species-specific normalized read count divided by the total normalized read counts.
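The species-level normalization described above may be sketched as follows. This sketch assumes that the aggregate count sums reads over all amplicons of the species and that the divisor is the number of amplicons with more than 10 aligning reads; the text leaves open whether reads from amplicons below the minimum are included in the aggregate. The function names and example values are hypothetical.

```python
def species_normalized_count(amplicon_reads, min_reads=10):
    """Aggregate reads for one species divided by the number of its
    amplicons having more than `min_reads` aligning reads.

    amplicon_reads: {amplicon_id: uniquely mapped read count}
    """
    covered = [n for n in amplicon_reads.values() if n > min_reads]
    if not covered:
        return 0.0
    return sum(amplicon_reads.values()) / len(covered)


def relative_abundance(normalized_by_species):
    """Each species' normalized count divided by the total normalized count."""
    total = sum(normalized_by_species.values())
    if total == 0:
        return {species: 0.0 for species in normalized_by_species}
    return {species: count / total for species, count in normalized_by_species.items()}


# Hypothetical per-amplicon read counts for two species
species_a = species_normalized_count({"amp1": 120, "amp2": 80, "amp3": 4})
species_b = species_normalized_count({"amp4": 34})
print(relative_abundance({"species A": species_a, "species B": species_b}))
```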


Example 4—Assay Results

Sample mixtures containing DNA from microbial species as shown in Table 12 were prepared. The samples are mixtures of known microbial DNA, at different limits of detection. Sample nos. 1-15 contain microbial DNAs at or above 50 LOD. Sample no. 16 was used as a negative control.









TABLE 12

Microbial DNA Sample Mixtures

Sample # | Sample Type | Sample ID | Sample Composition: Genus/Species (ATCC Accession No.)
1 | Microbial DNA mixture | MSA1002 | 20 Species @ 5% each (18 Genus): Acinetobacter baumannii (17978); Actinomyces odontolyticus (17982); Bacillus cereus (10987); Bacteroides vulgatus (8482); Bifidobacterium adolescentis (15703); Clostridium beijerinckii (35702); Cutibacterium acnes (11828); Deinococcus radiodurans (BAA-816); Enterococcus faecalis (47077); Escherichia coli (700926); Helicobacter pylori (700392); Lactobacillus gasseri (33323); Neisseria meningitidis (BAA-335); Porphyromonas gingivalis (33277); Pseudomonas aeruginosa (9027); Rhodobacter sphaeroides (17029); Staphylococcus aureus (BAA-1556); Staphylococcus epidermidis (12228); Streptococcus agalactiae (BAA-611); Streptococcus mutans (700610)
2 | Microbial DNA mixture | MSA1006 | 12 Species @ 8.3% each (11 Genus): Bacteroides fragilis (25285); Bacteroides vulgatus (8482); Bifidobacterium adolescentis (15703); Clostridioides difficile (9689); Enterobacter cloacae (13047); Enterococcus faecalis (700802); Escherichia coli (700926); Fusobacterium nucleatum subsp. nucleatum (25586); Helicobacter pylori (700392); Lactobacillus plantarum (BAA-793); Salmonella enterica subsp. enterica (9150); Yersinia enterocolitica (27729)
3 | Microbial DNA mixture | MIX05 | 20 Species @ 5% each (13 genus): Actinomyces viscosus (27045); Atopobium parvulum (33793); Bacteroides fragilis (25285D-5); Bacteroides vulgatus (8482D-5); Bifidobacterium adolescentis (15703D-5); Bifidobacterium longum (15697D-5); Campylobacter concisus (BAA-1457D-5); Campylobacter curvus (BAA-1459D-5); Campylobacter jejuni (700819D-5); Campylobacter rectus (33238D-5); Bacteroides thetaiotaomicron (Bacillus thetaiotaomicron) (29148D-5); Citrobacter rodentium (51638); Collinsella aerofaciens (25986); Escherichia coli (10798D-5); Gardnerella vaginalis (14019D-5); Helicobacter pylori (43504D-5); Klebsiella pneumoniae (700721D-5); Parabacteroides distasonis (8503D-5); Parabacteroides merdae (43184); Porphyromonas gingivalis (BAA-308D-5)
4 | Microbial DNA mixture | MIX06 | 20 Species @ 5% each (13 genus): Akkermansia muciniphila (BAA-835D-5); Anaerococcus vaginalis (51170); Borreliella burgdorferi (35210D-5); Desulfovibrio alaskensis (14563); Dorea formicigenerans (27755); Enterococcus faecium (BAA-472D-5); Enterococcus gallinarum (49573); Enterococcus hirae (10541D-5); Faecalibacterium prausnitzii (27766); Fusobacterium nucleatum (25586D-5); Helicobacter bilis (51631); Lactobacillus acidophilus (4357D-5); Lactobacillus delbrueckii (9649D-5); Lactobacillus murinus (35020); Lactobacillus reuteri (23272D-5); Lactobacillus rhamnosus (21052D-5); Peptostreptococcus anaerobius (49031D-5); Streptococcus gallolyticus (9809D-5); Streptococcus infantarius (BAA-102); Veillonella parvula (17745D-5)
5 | Microbial DNA mixture | MIX07 | 40 Species @ 2.5% each (25 genus): Actinomyces viscosus (27045); Akkermansia muciniphila (BAA-835D-5); Anaerococcus vaginalis (51170); Atopobium parvulum (33793); Bacteroides fragilis (25285D-5); Bacteroides thetaiotaomicron (29148D-5); Bacteroides vulgatus (8482D-5); Bifidobacterium adolescentis (15703D-5); Bifidobacterium longum (15697D-5); Borreliella burgdorferi (35210D-5); Campylobacter concisus (BAA-1457D-5); Campylobacter curvus (BAA-1459D-5); Campylobacter jejuni (700819D-5); Campylobacter rectus (33238D-5); Citrobacter rodentium (51638); Collinsella aerofaciens (25986); Desulfovibrio alaskensis (14563); Dorea formicigenerans (27755); Enterococcus faecium (BAA-472D-5); Enterococcus gallinarum (49573); Enterococcus hirae (10541D-5); Escherichia coli (10798D-5); Faecalibacterium prausnitzii (27766); Fusobacterium nucleatum (25586D-5); Gardnerella vaginalis (14019D-5); Helicobacter bilis (51631); Helicobacter pylori (43504D-5); Klebsiella pneumoniae (700721D-5); Lactobacillus acidophilus (4357D-5); Lactobacillus delbrueckii (9649D-5); Lactobacillus murinus (35020); Lactobacillus reuteri (23272D-5); Lactobacillus rhamnosus (21052D-5); Parabacteroides distasonis (8503D-5); Parabacteroides merdae (43184); Peptostreptococcus anaerobius (49031D-5); Porphyromonas gingivalis (BAA-308D-5); Streptococcus gallolyticus (9809D-5); Streptococcus infantarius (BAA-102); Veillonella parvula (17745D-5)
6 | Microbial DNA mixture | MIX09 | 22 Species @ 4.6% each (16 genus): Bifidobacterium animalis (27536); Bifidobacterium bifidum (29521); Blautia/Ruminococcus gnavus (29149); Campylobacter gracilis (33236D-5); Campylobacter hominis (BAA-381D-5); Chlamydia pneumoniae (VR-1360D-5); Chlamydia trachomatis (VR-885D-5); Clostridioides difficile (9689D-5); Enterobacter cloacae (13047D-5); Enterococcus faecalis (47077D-5); Eubacterium rectale (33656); Helicobacter bizzozeronii (700031); Helicobacter hepaticus (51448); Holdemania filiformis (51649); Lactobacillus johnsonii (33200); Lactococcus lactis (19435D-5); Mycoplasma fermentans (19989D-5); Mycoplasma penetrans (55252); Parvimonas micra (33270); Proteus mirabilis (29906); Pseudomonas aeruginosa (47085D-5); Ruminococcus bromii (27255)
7 | Microbial DNA mixture | MIX11 | 20 Species @ 5% each: Akkermansia muciniphila, Anaerococcus vaginalis, Atopobium parvulum, Bacteroides fragilis, Bifidobacterium animalis, Borreliella burgdorferi, Campylobacter concisus, Citrobacter rodentium, Clostridioides difficile, Desulfovibrio alaskensis, Dorea formicigenerans, Enterococcus faecium, Eubacterium rectale, Faecalibacterium prausnitzii, Fusobacterium nucleatum, Helicobacter bizzozeronii, Holdemania filiformis, Lactobacillus acidophilus, Mycoplasma penetrans, Parabacteroides merdae
8 | Microbial DNA mixture | MIX12 | 20 Species @ 5% each: Akkermansia muciniphila, Anaerococcus vaginalis, Atopobium parvulum, Bacteroides thetaiotaomicron, Bifidobacterium bifidum, Borreliella burgdorferi, Campylobacter curvus, Citrobacter rodentium, Desulfovibrio alaskensis, Dorea formicigenerans, Enterococcus gallinarum, Eubacterium rectale, Helicobacter hepaticus, Lactobacillus delbrueckii, Parvimonas micra, Peptostreptococcus anaerobius, Proteus mirabilis, Ruminococcus bromii, Streptococcus infantarius, Veillonella parvula
9 | Microbial DNA mixture | MIX13 | 20 Species @ 5% each: Bifidobacterium longum, Borreliella burgdorferi, Campylobacter hominis, Desulfovibrio alaskensis, Dorea formicigenerans, Enterococcus gallinarum, Eubacterium rectale, Faecalibacterium prausnitzii, Fusobacterium nucleatum, Helicobacter pylori, Holdemania filiformis, Lactobacillus johnsonii, Mycoplasma penetrans, Parabacteroides merdae, Parvimonas micra, Peptostreptococcus anaerobius, Proteus mirabilis, Ruminococcus bromii, Streptococcus infantarius, Veillonella parvula
10 | Microbial DNA mixture | MIX14 | 20 Species @ 5% each: Akkermansia muciniphila, Anaerococcus vaginalis, Atopobium parvulum, Bacteroides fragilis, Bifidobacterium animalis, Campylobacter jejuni, Desulfovibrio alaskensis, Dorea formicigenerans, Enterococcus faecium, Faecalibacterium prausnitzii, Fusobacterium nucleatum, Helicobacter pylori, Holdemania filiformis, Lactobacillus murinus, Mycoplasma penetrans, Parabacteroides merdae, Parvimonas micra, Peptostreptococcus anaerobius, Proteus mirabilis, Ruminococcus bromii
11 | Microbial DNA mixture | MIX15 | 20 Species @ 5% each: Akkermansia muciniphila, Anaerococcus vaginalis, Atopobium parvulum, Bacteroides thetaiotaomicron, Bifidobacterium bifidum, Borreliella burgdorferi, Campylobacter rectus, Citrobacter rodentium, Enterococcus gallinarum, Fusobacterium nucleatum, Helicobacter hepaticus, Lactobacillus reuteri, Mycoplasma penetrans, Parabacteroides merdae, Parvimonas micra, Peptostreptococcus anaerobius, Proteus mirabilis, Ruminococcus bromii, Streptococcus infantarius, Veillonella parvula
12 | Microbial DNA mixture | MIX16 | 20 Species @ 5% each: Bacteroides fragilis, Bacteroides thetaiotaomicron, Bifidobacterium animalis, Bifidobacterium bifidum, Bifidobacterium longum, Campylobacter concisus, Campylobacter curvus, Campylobacter hominis, Campylobacter jejuni, Campylobacter rectus, Enterococcus faecium, Enterococcus gallinarum, Helicobacter bizzozeronii, Helicobacter hepaticus, Helicobacter pylori, Lactobacillus acidophilus, Lactobacillus delbrueckii, Lactobacillus johnsonii, Lactobacillus murinus, Lactobacillus reuteri
13 | Microbial DNA mixture | MIX17 | 20 Species @ 5% each: Enterococcus faecium, Enterococcus gallinarum, Fusobacterium nucleatum, Helicobacter bizzozeronii, Helicobacter hepaticus, Helicobacter pylori, Holdemania filiformis, Lactobacillus acidophilus, Lactobacillus delbrueckii, Lactobacillus johnsonii, Lactobacillus murinus, Lactobacillus reuteri, Mycoplasma penetrans, Parabacteroides merdae, Parvimonas micra, Peptostreptococcus anaerobius, Proteus mirabilis, Ruminococcus bromii, Streptococcus infantarius, Veillonella parvula
14 | Microbial DNA mixture | MIX18 | 20 Species @ 5% each: Akkermansia muciniphila, Anaerococcus vaginalis, Atopobium parvulum, Bacteroides fragilis, Bacteroides thetaiotaomicron, Bifidobacterium animalis, Bifidobacterium bifidum, Bifidobacterium longum, Borreliella burgdorferi, Campylobacter concisus, Campylobacter curvus, Campylobacter hominis, Campylobacter jejuni, Campylobacter rectus, Citrobacter rodentium, Clostridioides difficile, Desulfovibrio alaskensis, Dorea formicigenerans, Eubacterium rectale, Faecalibacterium prausnitzii
15 | Microbial DNA mixture | MIX19 | 40 Species @ 2.5% each: Akkermansia muciniphila, Anaerococcus vaginalis, Atopobium parvulum, Bacteroides fragilis, Bacteroides thetaiotaomicron, Bifidobacterium animalis, Bifidobacterium bifidum, Bifidobacterium longum, Borreliella burgdorferi, Campylobacter concisus, Campylobacter curvus, Campylobacter hominis, Campylobacter jejuni, Campylobacter rectus, Citrobacter rodentium, Clostridioides difficile, Desulfovibrio alaskensis, Dorea formicigenerans, Enterococcus faecium, Enterococcus gallinarum, Eubacterium rectale, Faecalibacterium prausnitzii, Fusobacterium nucleatum, Helicobacter bizzozeronii, Helicobacter hepaticus, Helicobacter pylori, Holdemania filiformis, Lactobacillus acidophilus, Lactobacillus delbrueckii, Lactobacillus johnsonii, Lactobacillus murinus, Lactobacillus reuteri, Mycoplasma penetrans, Parabacteroides merdae, Parvimonas micra, Peptostreptococcus anaerobius, Proteus mirabilis, Ruminococcus bromii, Streptococcus infantarius, Veillonella parvula
16 | No Template Control | Water | n/a

Two libraries were prepared for each sample as described in Example 1: one library generated using a 16S primer pool containing 12 primer pairs (SEQ ID NOs: 1-24; see Table 15) and 1 library using a species primer pool containing 236 primer pairs (SEQ ID NOs: 49-520; see Table 16). Four replicate aliquots of each library were included on a semiconductor sequencing chip and sequenced as described in Example 2. The ability to replicate bacteria detection results for a sample was evaluated using Spearman's RHO, which is a non-parametric test used to measure the strength of association between two variables (rank order correlation) where r=1 means a perfect positive correlation. This test is based on detection of a monotonic trend between two variables (i.e., replicates), as opposed to just a linear trend. This analysis is better suited for read counts as it makes no assumptions of a normal distribution. FIG. 5A is a plot of Spearman's RHO for replicate sequencing of libraries generated using a 16S Primer Pool. FIG. 5B is a plot of Spearman's RHO for replicate sequencing of libraries generated using a species primer pool. As shown in FIGS. 5A and 5B, which depict the comparison of the results of sequencing of four replicate aliquots of libraries generated from six of the samples, the assay was very reproducible across samples.
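For illustration, Spearman's rank correlation between two replicates can be computed by ranking the read counts in each replicate (tied values receive the average of their ranks) and then taking the Pearson correlation of the ranks. The sketch below uses only the Python standard library; the function names and replicate read counts are hypothetical, and the sketch assumes each replicate contains at least two distinct values.

```python
def average_ranks(values):
    """Return 1-based ranks, assigning tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2.0 + 1.0  # mean of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Pearson correlation of the ranks of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    spread_x = sum((a - mean_x) ** 2 for a in rx) ** 0.5
    spread_y = sum((b - mean_y) ** 2 for b in ry) ** 0.5
    return covariance / (spread_x * spread_y)


# Hypothetical per-taxon read counts from two replicate aliquots
replicate_1 = [1500, 320, 45, 9800, 0]
replicate_2 = [1420, 400, 30, 10100, 2]
print(round(spearman_rho(replicate_1, replicate_2), 6))
```

Because only the rank order matters, the two hypothetical replicates above correlate perfectly even though their raw counts differ, which is the monotonic (rather than linear) behavior noted in the text.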


An example of results of analysis of 16S sequence reads from sequencing of a library generated from Sample no. 1 using the pool of 16S primers is shown in FIGS. 6A and 6B (Propionibacterium shown in FIG. 6A is the scientific name for Cutibacterium). FIG. 6A shows that all 18 of the 18 bacterial genera present in Sample no. 1 (MSA1002) were detected in the sequencing assay. FIG. 6A shows results where the second mapping step 312 was applied using the first reduced set of full-length 16S rRNA gene sequences (without reannotation) and without the first and second normalizing steps 316 (refer to block diagram in FIG. 3). The dashed line 602 in FIG. 6A is at 1.5% of the total number of mapped reads on the y-axis which represents the threshold in this analysis for the number of mapped reads that were considered as “noise” or background. This threshold, which can be varied for any given analysis, is the number of mapped reads out of the total number of mapped reads that can be considered as being attributable to non-specific sequences, such as, for example, primer dimers, erroneous amplification products and truncated reads. The mapped reads for each genus or species in excess of the noise threshold are those reads considered as specific, relevant and different from the reads at or below the threshold number of reads. FIG. 6B gives a table of the noise threshold, sensitivity and PPV for the example of Sample no. 1 (MSA1002) library generated using a 16S primer pool. The assay was highly reproducible as shown in an analysis of a comparison of the results of sequencing of four replicate aliquots of a library generated from Sample no. 1 (MSA1002) using a 16S primer pool (see FIG. 7).


An example of results of analysis of targeted species sequence reads from sequencing of a library generated from Sample no. 1 using the species primer pool is shown in FIGS. 8A and 8B. Sample no. 1 (MSA1002) contains 20 different bacterial species, 7 of which were targeted by the library preparation amplification (as described in Example 1) using a pool of species primers. As shown in FIG. 8A, all 7 of the bacterial species (Bacteroides vulgatus, Escherichia coli, Porphyromonas gingivalis, Cutibacterium acnes, Helicobacter pylori, Enterococcus faecalis and Bifidobacterium adolescentis) that were targeted by species primers in generating the library from Sample no. 1 were detected and correctly identified, whereas other, non-targeted species were not detected at a noise threshold of 1% of mapped reads. The dashed line 802 in FIG. 8A is at 1.0% of the total number of mapped reads on the y-axis which represents the threshold above which the species may be detected as present in the sample. This threshold can be set by the user for any given analysis. The resolution of the assay using the species primer pool to generate library amplicons was greater than that of the assay using the 16S primer pool. For example, the only Bifidobacterium species detected in a library generated from Sample no. 1 (MSA1002) using the species primer pool was B. adolescentis, which is the only Bifidobacterium species contained in Sample no. 1. However, reads of sequences from a library generated from Sample no. 1 using the 16S primer pool mapped to four additional Bifidobacterium species, as well as to B. adolescentis, which did have the greatest number of read counts of the 5 Bifidobacterium species to which reads mapped. FIG. 8B gives a table of the noise threshold, sensitivity and PPV for the example of Sample no. 1 (MSA1002) library generated using a species primer pool. 
The assay was highly reproducible as shown in an analysis of a comparison of the results of sequencing of four replicate aliquots of a library generated from Sample no. 1 (MSA1002) using a species primer pool (see FIG. 9).


Performance metrics evaluated for the analysis conducted of the sample sequencing results included calculation of precision (or positive predictive value; PPV) and sensitivity (or recall) and generation of precision-recall (PR) curves. PR evaluation is a useful measure of success of prediction, particularly for unequal class distributions. Precision is a measure of result relevancy whereas recall is a measure of the quantity of relevant results returned in an analysis. High precision correlates with a low false positive rate and high recall correlates with a low false negative rate. Precision is calculated as the number of results identified as positive in a test that are positive (i.e., true positives) divided by the total number of results identified as positive in the test (i.e., true positives+false positives). Recall is calculated as the number of results identified as positive in a test that are positive (i.e., true positives) divided by the number of true positives plus the number of false negatives. A PR curve is a plot of precision vs. recall for different thresholds. A high area under a PR curve (AUC) reflects high recall and high precision and many correctly identified results. Performance metrics determined for the analysis of sequence reads obtained from sequencing of one chip are shown in Tables 13 and 14. Table 13 shows examples of results for 16S sequence reads, where the noise threshold was set for genus level detection. Table 14 shows examples of results for targeted species sequence reads, where the noise threshold was set for species level detection.
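The precision and recall definitions given above can be written out as a short sketch. It assumes that the taxa known to be present in a sample and the read fractions per taxon are available; all names and values are hypothetical, and the handling of an empty denominator (returning 1.0) is a convention chosen for this illustration, not something stated in the disclosure. The second function evaluates precision and recall at several noise thresholds, which yields the points from which a PR curve can be plotted.

```python
from typing import Dict, Iterable, List, Tuple


def precision_recall(detected: Iterable[str], expected: Iterable[str]) -> Tuple[float, float]:
    """Precision = TP / (TP + FP); recall = TP / (TP + FN)."""
    detected_set, expected_set = set(detected), set(expected)
    tp = len(detected_set & expected_set)   # called and truly present
    fp = len(detected_set - expected_set)   # called but not present
    fn = len(expected_set - detected_set)   # present but not called
    # Assumed convention: an empty denominator yields 1.0.
    precision = tp / (tp + fp) if (tp + fp) else 1.0
    recall = tp / (tp + fn) if (tp + fn) else 1.0
    return precision, recall


def pr_points(
    fractions: Dict[str, float],
    expected: Iterable[str],
    thresholds: Iterable[float],
) -> List[Tuple[float, float, float]]:
    """Return (threshold, precision, recall) for each noise threshold."""
    expected_set = set(expected)
    points = []
    for t in thresholds:
        # A taxon is called present when its read fraction exceeds the threshold.
        called = [taxon for taxon, f in fractions.items() if f > t]
        p, r = precision_recall(called, expected_set)
        points.append((t, p, r))
    return points


# Hypothetical fractions of mapped reads per taxon, for illustration only.
fractions = {"A": 0.50, "B": 0.30, "C": 0.004, "D": 0.008}
truth = {"A", "B", "C"}
points = pr_points(fractions, truth, thresholds=[0.001, 0.005, 0.01])
for t, p, r in points:
    print(f"threshold={t:.3f} precision={p:.2f} recall={r:.2f}")
```

Raising the threshold in this toy example removes the false positive "D" but also loses the low-abundance true positive "C", which is the trade-off a PR curve summarizes.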









TABLE 13
Performance Metrics for Sequencing of Sample DNA Amplicons Generated Using 16S Primers

Sample ID    Area Under PR Curve    Precision, Recall (Noise Threshold) a
MSA1002      1.00                   1.00, 1.00 (0.5%)
MSA1006      0.97                   0.91, 1.00 (0.5%) b
MIX05        0.98                   0.92, 1.00 (0.5%)
MIX06        1.00                   0.92, 1.00 (0.5%)
MIX07        0.99                   1.00, 0.96 (0.5%)
MIX09        0.96                   0.85, 1.00 (0.5%) b
MIX11        0.95                   0.95, 0.95 (0.5%) b
MIX12        0.95                   0.95, 0.95 (0.5%) b
MIX13        1.00                   1.00, 1.00 (0.5%)
MIX14        1.00                   1.00, 1.00 (0.5%)
MIX15        0.95                   0.97, 0.90 (0.5%) b
MIX16        1.00                   1.00, 1.00 (0.5%)
MIX17        1.00                   1.00, 1.00 (0.5%)
MIX18        0.95                   0.92, 0.95 (0.5%) b
MIX19        0.95                   0.92, 0.95 (0.5%) b

a Noise threshold as a percentage of mapped reads.
b Enterobacteriaceae family is poorly resolved by 16S rRNA gene hypervariable region analysis (see, e.g., Chakravorty et al. (2007) J Microbiol Methods 69(2):330-339).














TABLE 14
Performance Metrics for Sequencing of Sample DNA Amplicons Generated Using Species Primers

Sample ID        Area Under PR Curve    Precision, Recall
MSA1002          1.00                   1.00, 1.00 (0.1%)
MSA1006          1.00                   1.00, 1.00 (0.1%)
MIX05            1.00                   1.00, 1.00 (0.1%)
MIX06            1.00                   1.00, 1.00 (0.1%)
MIX07            1.00                   1.00, 1.00 (0.1%)
MIX09            1.00                   1.00, 1.00 (0.1%)
MIX11 - MIX19    1.00                   1.00, 1.00 (0.1%)










Example 5—Primer and Amplicon Sequences

This example provides primer sequences that can be included in pools used to amplify microbial 16S rRNA (Table 15) and microbial species-specific DNA sequences (Table 16) in assays to identify microbes and/or characterize microbial populations in samples. Table 17 provides microbial sequences, some or all of which can be targeted by primers in a species primer pool used in such assays.









TABLE 15
16S rRNA GENE PRIMER SEQUENCES

HYPERVARIABLE REGION    SEQ ID NO:    PRIMER 1                    SEQ ID NO:    PRIMER 2
V2                      1             GGCGGACGGGUGAGUAA           2             AGTCUGGACCGTGTCUCA
V2                      3             GGCGCACGGGUGAGUAA           4             AGTCUGGACCGTGTCUCA
V2                      5             GGCGAACGGGUGAGUAA           6             AGTCUGGACCGTGTCUCA
V3                      7             ACUCCUACGGGAGGCAGCAG        8             ACGGAGTUAGCCGGTGCUT
V4                      9             CAGCAGCCGCGGUAAUAC          10            CGCATTUCACCGCUACAC
V5                      11            GGGAGCAAACAGGAUTAGAUACCC    12            CCCCCGTCAAUTCATTTGAGTUT
V6                      13            ATGTGGUTTAATTCGAUGCAACGC    14            TUCACAACACGAGCUGACGAC
V7                      15            TGGGUTAAGUCCCGCAACG         16            AAGGGCCAUGATGACTUGACG
V8                      17            GGGCUACACACGCGCUAC          18            CCCGGGAACGUATUCACC
V8                      19            GGGCUACACACGUGCAAC          20            CCCGGGAACGUATUCACC
V8                      21            GGGCUACACACGTGCUAC          22            CCCGGGAACGUATUCACC
V9                      23            TTCCCGGGCCUTGUACAC          24            CUTGTTACGACTUCACCCCAGT
V2                      25            GGCGGACGGGTGAGTAA           26            AGTCTGGACCGTGTCTCA
V2                      27            GGCGCACGGGTGAGTAA           28            AGTCTGGACCGTGTCTCA
V2                      29            GGCGAACGGGTGAGTAA           30            AGTCTGGACCGTGTCTCA
V3                      31            ACTCCTACGGGAGGCAGCAG        32            ACGGAGTTAGCCGGTGCTT
V4                      33            CAGCAGCCGCGGTAATAC          34            CGCATTTCACCGCTACAC
V5                      35            GGGAGCAAACAGGATTAGATACCC    36            CCCCCGTCAATTCATTTGAGTTT
V6                      37            ATGTGGTTTAATTCGATGCAACGC    38            TTCACAACACGAGCTGACGAC
V7                      39            TGGGTTAAGTCCCGCAACG         40            AAGGGCCATGATGACTTGACG
V8                      41            GGGCTACACACGCGCTAC          42            CCCGGGAACGTATTCACC
V8                      43            GGGCTACACACGTGCAAC          44            CCCGGGAACGTATTCACC
V8                      45            GGGCTACACACGTGCTAC          46            CCCGGGAACGTATTCACC
V9                      47            TTCCCGGGCCTTGTACAC          48            CTTGTTACGACTTCACCCCAGT
















TABLE 16
SPECIES PRIMER AND PROBE SEQUENCES

GENUS AND SPECIES    PRIMER 1    SEQ ID NO:    PRIMER 2    SEQ ID NO:










TABLE 16A (PRIMERS/PROBES SEQ ID NOS: 49-480)












Bifidobacterium longum

ACCAAGGUTCUAGCCGGT
49
GGCTTGGUGGCAGTAAGUG
50






Bifidobacterium longum

ACCAUCTGGATUGCCGCA
51
AGTGAAACAACAGUATTGA
52





UGCCG







Clostridioides difficile

ACATTTGCTGAAUCTTTTGC
53
TCAAGATAAAGGACAUCAA
54


QCD-66c26
TCTTTTUACT

GTGTUAGGT







Clostridioides difficile

CATCTACTGAAGCUGCTTCA
55
TTTGCTCTTTGAUATTTTT
56


QCD-66c26
AATUAGT

GCCAUACAGAT







Clostridioides difficile

ATCTTGAATAGUAACTTTTA
57
GATTCTGCTAAACUAATCG
58


QCD-66c26
AACTTUGCCCT

AAGAGGTUAGA







Lactococcus lactis subsp.

CAGCGAATAAUAATTCCCCT
59
GGATGACTTTCUATCGGCA
60


lactis I11403
UGACAG

CTUCA







Lactococcus lactis subsp.

GCAACAGCACUTCGUAACGA
61
GGAGAACCAAAUTCAACAC
62


lactis I11403
T

GAGTUT







Chlamydia pneumoniae TW-

AATTCACAGCTUGAGGAAAA
63
TGGCAACAUCTGTUCAGGA
64


183
GGUGT

C







Chlamydia pneumoniae TW-

TGCGTTGCUCGCTCUCT
65
TGCACTCTTUCAGAAAGAA
66


183


GGTCUT







Chlamydia pneumoniae TW-

ACGAAGAAGCUGUGGAGAAG
67
CCUTGAGACUACCAGGGAG
68


183
T

C







Chlamydia pneumoniae TW-

AAAAGTAAACAAUAAGAAAG
69
CGCGCAACAUAGACUCCC
70


183
AGGTTCAATAUGC









Fusobacterium nucleatum

AATTGTTCCTCAUCAACTAT
71
GTAGCGAGGAGGAUTATAG
72


subsp. nucleatum ATCC
TTTAATTCCTUG

UGAAAGA



25586










Porphyromonas gingivalis

GTGGCTTTCTTAUGTGCATG
73
TATTCGTAATTAGAGUAGG
74


W83
GATTUG

AGGAGAAGCTTUT







Porphyromonas gingivalis

TGTGGCACAUGACAGTCGTU
75
CATAAGGUCTTTGCGCUGG
76


W83
G

T







Helicobacter hepaticus

GTGGCAATUACTTGCGTATT
77
CCTGCUCAACCCCTATCUG
78


ATCC 51449
UGG

G







Helicobacter hepaticus

AGACAAAGTAUCAACATTGC
79
CGAAAGCGGGAAUGCUCCA
80


ATCC 51449
TCAUACCT

A







Lactobacillus johnsonii

AAATGAATGGGUAGAAGCTG
81
TTAAGATAACTAGGUCGCC
82


NCC 533
GUGT

GACUAC







Lactobacillus johnsonii

TTCAGCTTCAUTAGAAGACC
83
CGTCAATTUGGACTTTACT
84


NCC 533
UCGG

GATUGGA







Lactobacillus johnsonii

TCACCATCAAGUAGAACTGT
85
CCAGAAGAAUTGCTUCCCC
86


NCC 533
ATTTTGUGT

AT







Lactobacillus johnsonii

ACAATATTGGTCTUTTATTT
87
AGCTTATATUGAGGATTGT
88


NCC 533
TTAGCAACTUGT

GGCUACAC







Cutibacterium acnes

TCGGTGUCATTGGGAUCGAC
89
CUGGGCGACGACGCTUT
90


KPA171202










Cutibacterium acnes

GUGCCGTCATUGACCAGCAT
91
CGGAGGGCUAUCGCGGA
92


KPA171202










Helicobacter pylori 26695

GTGCCUAAAAGCACAAGCAA
93
AGGGAGTTTAAAAAUGAAA
94



TUG

CGCTTUCAA







Helicobacter pylori 26695

AAAGGTGAGAGGAUTTAGGA
95
CTAGAGAGATAGCACCUAC
96



CTTTTTACUAAA

TATAACAGATTUC







Borreliella burgdorferi

AGAGAAACCAGUTGGCCTTT
97
AACAAATCCUCGATTTATT
98


B31
UGG

TCAUGGCAG







Borreliella burgdorferi

AATGGATTTATTTTGAUTCC
99
ATTGCCAATATTCAAUCTT
100


B31
GAATATGCTTUT

CTAAATTCAUCAAT







Borreliella burgdorferi

TTGGCAATGTGAUCTTTATT
101
AGAAATGAGATAGCUTTTA
102


B31
GCAATTTAAUT

ATAATCACUGCA







Chlamydia trachomatis

GCTGCAGGGAUTATTCTTTC
103
AGGGCTCUATCTATCAGAA
104


D/UW-3/CX
UCCA

UCGGAA







Chlamydia trachomatis

AGAGCCCUTCTCGAATAUGG
105
AAATCGGGUGCACCTTCTG
106


D/UW-3/CX
GA

UAA







Chlamydia trachomatis

AGCAAAAGCUTGCATATUGG
107
ACCTCTATAGGUGTCCGTT
108


D/UW-3/CX
CA

ATTTTGAUG







Campylobacter jejuni

GCGTTCTCCAUCTTTTATAG
109
TTATTTTAGTGGGTUCTGC
110


subsp. jejuni
CAGAAAUACG

AATGACAAGAUA







Campylobacter jejuni

AACAATTCTTTUAGCCTAAC
111
GCGAAAGTTACUTAGGTGG
112


subsp. jejuni
AGUGCCA

TCTUGC







Campylobacter jejuni

GTTATGAAGCTTATUAATGG
113
CCTCAAATTGATCUTCTGC
114


subsp. jejuni
TAGTGGTGAUGA

TGAAGTATUA







Bacteroides fragilis

TUGGCGGAUACAGCCCT
115
ATCCAGACUCTCCTGATTG
116


YCH46


UCCA







Bacteroides fragilis

GATCTGCCAUAGAATCTCGU
117
CGGCUGAAGAAGAGUGGGA
118


YCH46
CG

A







Bacteroides fragilis

TCCGGGCAGCGAGUCUG
119
GGCAGAUCGATUGCAGGGT
120


YCH46










Lactobacillus reuteri JCM

AAAAACGGAGGAGACUAATT
121
TGCTTTTGCTTCUTGTAAT
122


1112
AATAUGGCAA

TACGAATUAACT







Lactobacillus reuteri JCM

CCGGTUGACCGTATACUACG
123
CACAATCGTTTTUAGCTAG
124


1112
CT

AATCACTGUT







Bifidobacterium

GGAACAGCCGUCTGAUCAC
125
AAAAACACTCATUGTTTTC
126



adolescentis ATCC 15703



ATCGTTTTUCA







Bifidobacterium

CCAAAGACTUCGAGTAGGGC
127
GATTGTTCATAUGGGCTCT
128



adolescentis ATCC 15703

TUG

CCTAUCC







Bifidobacterium

CGCCGAATGAUGTTCGAAAT
129
CCGACAATCUCAAGAAAAC
130



adolescentis ATCC 15703

AUGGT

GCUGAT







Lactobacillus rhamnosus

ACGGGTCTUAGCATTGGCUT
131
GCACGCGUCAATUAAGCCC
132


GG










Lactobacillus rhamnosus

TCAATGGTUAAGTTGGCCGU
133
ACGATCACUCAAAATGGUG
134


GG
AG

CG







Bacteroides

CCAAAGCATUGGCATATGCA
135
AAGCCCAATCGUCATCTTT
136



thetaiotaomicron VPI-5482

GAUA

GTAGUT







Bacteroides

ACTAATAATAAGGGAUTTTC
137
AACTTTTTAGTAUCCTTAG
138



thetaiotaomicron VPI-5482

TGAATTTGGUGAT

CGAAGTUGAC







Bacteroides

TGCTCAAAGUGAGAACTTTT
139
TCTGTTTGTGAAUAACTAC
140



thetaiotaomicron VPI-5482

CAAATCGUAA

CGTUAGGAC







Mycoplasma penetrans HF-2

AGCATTACTACAAAAAGAAU
141
ATTTAGGGTGUAACAAAGA
142



CAAGCAATAAUAA

TGAAAAACATUAAT







Mycoplasma penetrans HF-2

GCACCTGCTUTTATAACATC
143
ACAGAAGAAAATAUGTCTG
144



ATTUCCA

CTACAAAUAGAT







Mycoplasma penetrans HF-2

GTAATCCUACTTTCATCATA
145
GGTGCAACAUGAAATCAAG
146



UGAAGAAGAACT

GUGA







Mycoplasma penetrans HF-2

GAAATTGCUACAGAGATAGU
147
GTAATGCTTTUAAAAATCA
148



CCCACC

TTCTAAUGACCCA







Lactobacillus acidophilus

ACTGGCAATTCAUCAGAAAA
149
CCGTAGTTUTTCCTTGCUG
150


NCFM
TACATCUAC

ACC







Lactobacillus acidophilus

GGACAGCUACCCTTGTUGCA
151
AAAGCACGAUTAATAGTTA
152


NCFM


AATUACCAAAAACA







Lactobacillus acidophilus

CGCTTCAACTGAUCATGTAG
153
CAGCATGACTGUTATCAGT
154


NCFM
AAAAAGUG

GTTTGUT







Lactobacillus acidophilus

GGTGTTAAGGUGAATTGGAC
155
CCTGTGCCCAAUTCATTAT
156


NCFM
UCAAAC

TAGTATUCAT







Desulfovibrio alaskensis

AAACCTTUGCCGGGCGUC
157
CGCAUCAGGCUCCCGCA
158


G20










Desulfovibrio alaskensis

GCGGAUAUCACGGACGC
159
GGCTGCGGUTGTGGUCG
160


G20










Desulfovibrio alaskensis

AGGUACCGGCCTGCUGCAT
161
TUCGCUGCCCGAAGCCG
162


G20










Desulfovibrio alaskensis

AGCAGAAAGACAGGCAUGAU
163
AGCACCUACTGCAUCGCC
164


G20
G









Bacteroides vulgatus ATCC

AUGCAGCCACAACCAAUCG
165
TTCGGCCACAUTCCATCCU
166


8482


AA







Bacteroides vulgatus ATCC

TUGCUGACCAAAACCACCAC
167
TTTTTATGGAAUGTTTTTC
168


8482


TGUCGGG







Bacteroides vulgatus ATCC

GTTCCTATTCCUATCTCTTC
169
CCGCCUTTGATAGAUCCGC
170


8482
CGGUGG

T







Parabacteroides

AGUCCCAACGCCATTGUGC
171
CAAGGAUGTTTAUGAACGG
172



distasonis ATCC 8503



CAAAACA







Parabacteroides

GAATATGAGCCAUGAGATAC
173
AGAAAGACATGCUACCGGA
174



distasonis ATCC 8503

GUACGC

TTCTAUG







Lactobacillus delbrueckii

GAAGCTGGAUTTGCCGACCU
175
GCGGGCACAAAACUCTUCA
176


subsp. bulgaricus ATCC
A





BAA-365










Lactobacillus delbrueckii

ACTCAGGCGACUCAGTCTUG
177
GGCGGUTCTGGUCAAGC
178


subsp. bulgaricus ATCC






BAA-365










Lactobacillus delbrueckii

TTCUGACGCCTAUGGGACA
179
GGTUGCGGACCTGCAUC
180


subsp. bulgaricus ATCC






BAA-365










Campylobacter curvus

CCCACGAAUGCGAUCACG
181
CAGCAAGGCCGAUGAGAUA
182


525.92


AG







Campylobacter curvus

GTGACATCUGAGGTAGATGA
183
ACUCGGCACAGAUACAAGC
184


525.92
TAUGGC

A







Campylobacter curvus

AGCCAGAUCTCCACGCUC
185
TAGGGCATATCGAUAAAAG
186


525.92


CTGTAAUAAAAA







Campylobacter curvus

ATGCCCUAAAAAUCGCAAGC
187
GAUATGGCUGCAAACGCGA
188


525.92
T









Campylobacter hominis

GCCGGAGTAUCAAGATTTAA
189
AGATTGTTTTATTUATTTG
190


ATCC BAA-381
ACCAUAAG

CAAAGAGAUGACG







Campylobacter hominis

CTTTGCAAAAUTTTGCATAT
191
GATTGATGUGGCTATTAAA
192


ATCC BAA-381
UCACCGA

AGTAUCGGC







Campylobacter hominis

GCTGACGCUCTCAUAAACGG
193
TTGCAAAGAATTUTGCGCC
194


ATCC BAA-381
A

ATTAUT







Campylobacter hominis

AGGTTTAAAGTATUTTCTAC
195
ACUCCGGCAGAAAGGGAUT
196


ATCC BAA-381
AAAAACTUCAACA









Campylobacter concisus

CATCGATAAGCUCATCATCA
197
TAAATTTATCTCAUAGTCT
198


13826
UGCCAA

GAGATAUCGACCT







Campylobacter concisus

ATAAUACGAGCAGCACACCU
199
AAATGAACCGGAUCAAAGC
200


13826
ACCG

UCCC







Campylobacter concisus

AGAGGAGTCUTTTAAAAAGA
201
TTGCGUCAGTGATCUCAGA
202


13826
CUGAAGAAGAT

AACAT







Akkermansia muciniphila

GGCAUTCTGAGGUACCGGAA
203
TTTTCGCCTCUCACATTGG
204


ATCC BAA-835


AAATTAUT







Akkermansia muciniphila

TGGGCAUGAUCGGAGAAAGA
205
TTGCCAUGGTATTCCTUGG
206


ATCC BAA-835
AG

CG







Akkermansia muciniphila

CCAATTGAACUACTGACCTG
207
CACCGUGGGTGCTGGUCG
208


ATCC BAA-835
TUGGAG









Bifidobacterium animalis

CGCAGTACAUGGATCACCTG
209
CGTATGCGAUGCGTUCGC
210


subsp. lactis AD011
TUC









Bifidobacterium animalis

CGCAUACGUGCAGCGGT
211
GGACAGGUGCCCGGUGG
212


subsp. lactis AD011










Bifidobacterium animalis

CTGTTCUGCTGGTTCUGCGA
213
GCCGTAGUAACAGCCUCGA
214


subsp. lactis AD011










Bifidobacterium animalis

ACTACGGCAUCATCGTTGUC
215
GTUACGCGCAUCGAGCC
216


subsp. lactis AD011
T









Atopobium parvulum DSM

GCAGCCAGCCCUTCTUG
217
GGCAGAAGAUTTGATGCUC
218


20469


CAT







Atopobium parvulum DSM

ACAGCCGCTUGATTATATTT
219
AGAGGTATTCCAAAUGCAG
220


20469
AAACUGCC

CTTATUG







Atopobium parvulum DSM

ACGATACCAGTAAUACTTAT
221
TGGCUGCTUGGAAACGAG
222


20469
TAAACTCAUCAAA









Veillonella parvula DSM

GCTGGTATTGGUATGATTCC
223
AAACCAAACCGUTGCCCCA
224


2008
AGAUGG

UA







Veillonella parvula DSM

TCGACTGATATAUCAAGAGA
225
CATCAGCCAUGTGUACAAA
226


2008
AAGAAAGTGUA

ACCT







Veillonella parvula DSM

AGAAACGGCUATACCAATTC
227
CTTCGTTCGTAAUAGATGG
228


2008
AUGAAGAG

CTCTACAAUAAG







Citrobacter rodentium

GCGGAAUGGCGTTUACAGT
229
TTTAGCTTATCAAUAGCAC
230


ICC168


AATTTUAGAAAACA







Citrobacter rodentium

GCCACCCAGCCAUGAUG
231
GCGCGGUGGAGGTGTCUA
232


ICC168










Citrobacter rodentium

ACTATGAATAAAAUTTATTT
233
TGGGUGGCGGAGCAUCA
234


ICC168
CTCUCAAGACCCG









Citrobacter rodentium

CTGGAUACGCAGACCGAUGT
235
CATTCCGCUGTTTCATCUG
236


ICC168


CA







Streptococcus

ATGTTGTTCAAGGUGACGGT
237
CAAGGTTTCAAGGAACAUT
238



gallolyticus UCN34

ACUG

GAAGTGAUAA







Streptococcus

CAAAACAGGAGAUAAGATTT
239
AAACAGTUCAGCACGTTCC
240



gallolyticus UCN34

TTGUCACAGGA

UGA







Streptococcus

CGGTGACACCUAAAGAACTG
241
TGACGATATCCTUTTTATT
242



gallolyticus UCN34

ATGATATUCT

CAAGTCTCUAAGG







Enterococcus faecium

TAATGAAATCCAAAUATTCT
243
AACGAGCUAGCGAUCGCA
244


TX0133a04
CTTTCTTTAUGGC









Enterococcus faecium

TCCUGCAAUCACCGGCA
245
TCACGCCGAUGAAUGAAGA
246


TX0133a04


G







Enterococcus faecium

ATTCTACCCATGUCTCTGGG
247
AGAAAAACCAAAAGCAACU
248


TX0133a04
ATTTUGA

GGUACG







Peptostreptococcus

GGAUTCATGGAUAGGAGAAA
249
TGCCGCCUACCTACCAGTA
250



stomatis DSM 17678

GGCT

UG







Peptostreptococcus

GTATCCTAGATATGUCATTT
251
AGAGATTGATGACCUGACT
252



stomatis DSM 17678

AGGTCTTCUACA

ATAGAGUCT







Peptostreptococcus

TTGAACTTGAAUCGACCCTA
253
ATGAATCCAAAUAGGGATT
254



stomatis DSM 17678

UGCA

CTGACTAUGT







Peptostreptococcus

ATCTCTATAUCAAAGCTCCU
255
AGGTTTAGGAAGGAAUTTA
256



stomatis DSM 17678

GGACACA

CAACTGAAAAUA







Mycoplasma fermentans JER

TCCTTGCGACUTTTGCAAAT
257
AAAGATCTTGATTAUGAAA
258



AATATUGA

TTCAAGAGCAAUT







Mycoplasma fermentans JER

TTTTTCAGCTUGCAAACGCT
259
TGAATTGCCTATTUATACA
260



TTATTAAAUT

CGCAATAAATUT







Mycoplasma fermentans JER

TCGGTTAATTTACUGAATGC
261
AATACAAATAATCTAUCGC
262



AAAAAGUAAAAA

TTTTTGGGUGT







Mycoplasma fermentans JER

TTTTACATTCTGTTUACCAG
263
GCCTTCTTCAAAUTCTTTA
264



GATCAATUACA

TAGCTTTTUGC







Eubacterium limosum

TTUCGCGGTGUAGAGCCG
265
CUGCAGAGCCGGCCCUC
266


KIST612










Eubacterium limosum

GCUGAGCCGGTCAAUGC
267
AGTGUGGCACCAAUGAACC
268


KIST612










Eubacterium limosum

GTTCCGGUAAAAGCAGGUGT
269
ACCCGCUGGTCAATTTCUC
270


KIST612


T







Eubacterium limosum

CACCTTACATGUAAAAATTC
271
CCGGAACCCCAUCCCUGT
272


KIST612
TTGCGATTUC









Blautia obeum ATCC 29174

CTTCTGCAUCCCGAACCUCC
273
TATUTCGTTGGCAAUAGAA
274





GAGCCA







Parabacteroides merdae

CACTTTTAUACTGTACCUCG
275
GGGCGUAGTCGGUGAGT
276


ATCC 43184
ACCACA









Parabacteroides merdae

CGACCCUGACACTTTTTGCA
277
TCATGATGAGAACUTGGAG
278


ATCC 43184
UT

AUAAAGCCT







Parabacteroides merdae

CTACGCCCACUTTAAACTGU
279
CAGGGTCGATAUCGATATC
280


ATCC 43184
GG

GATAAUGT







Parabacteroides merdae

TCCTUGCAGGCATUCAGGT
281
ACTGACTATAAATUGATAT
282


ATCC 43184


TGTGTGAUGACAG







Faecalibacterium

AAGCCGAAAUCTGAAUGACC
283
TCGAAGAAGCACUGCATCA
284



prausnitzii M21/2

GA

TGUC







Faecalibacterium

GTGCAGGCGAUCTACAACAT
285
AATAAUTATCAGTTGCUCG
286



prausnitzii M21/2

UC

CAGCCT







Parvimonas micra ATCC

CTAAAGCTTUGTCTATCTTA
287
GGTAACUCAGACGAGTTCT
288


33270
UCAACAGCT

CGUG







Parvimonas micra ATCC

AGATGGATTGTTUATCCAGT
289
GGAACTACACTUTCTTTTA
290


33270
TTTCTGUG

ATGCTTTUAAAGAT







Parvimonas micra ATCC

GCGAATAAATATUCTACTGA
291
TCTTGTUGCCTTCAGTUCC
292


33270
CGCTUCAT

AACT







Parvimonas micra ATCC

CCATTGTTGAGUCGTCAGCT
293
AGCTTUAGCAAGAGCTAUA
294


33270
TCATTUAT

AACCAAGT







Streptococcus infantarius

GCTGAGACAAUTCTTTTTCG
295
GCCAGAAGCGACAGUAGCT
296


subsp. infantarius ATCC
AACUCA

UA



BAA-102










Streptococcus infantarius

TGATATCATCAACAUTAAAC
297
ACCAAGCTTTTAUAAGAGA
298


subsp. infantarius ATCC
ATCTCATAGUCC

GTTGCUCT



BAA-102









Streptococcus infantarius
AGCTTGGTAATUCAGACAAA
299
GTCTCAGCAUGATTATTTC
300


subsp. infantarius ATCC
TCAATUCG

CATUCACG



BAA-102










Bifidobacterium bifidum

CGUCGCCAAGCCTUCGA
301
TGGTTCUGGTCGACCUGT
302


NCIMB 41171










Bifidobacterium bifidum

GACCUCGCTUACCCGGAA
303
ACCTCCUGAATCTTAUCCG
304


NCIMB 41171


CGA







Bifidobacterium bifidum

CACGGUGGCCGCTTTAAUG
305
TGGCGACGGUACTUGGC
306


NCIMB 41171










Bifidobacterium bifidum

CATCAGCGUCAAATCAGUCA
307
GGUACGCTGTUCGCCGT
308


NCIMB 41171
ACCG









Collinsella stercoris DSM

AGGAGTAGACAUCCATGAAU
309
TTCGCGUCATGGCATAUGC
310


13279
CCG

T







Collinsella stercoris DSM

GGAACTGGAUGTATCGCGAU
311
GUCGCCAAAUGGGCGAT
312


13279
GA









Collinsella stercoris DSM

TGUAAAACCGGCGAGGUGG
313
CGCTCAAAUGTCCUCGCT
314


13279










Collinsella stercoris DSM

TTUGAGCGCACAAGUAGGGT
315
CCAGTUCCCAGTCCAUGCA
316


13279










Roseburia intestinalis

CCGGTTUCCCTGGTUCG
317
CTGAATTUACGCGTGAGGU
318


L1-82


GA







Roseburia intestinalis

CGATCACTCCAAAUCCGGAG
319
AACCGGGUGGCAGCCGUA
320


L1-82
CAUA









Roseburia intestinalis

CGGCACCUTTCUGGCAC
321
GACTGUGGCTTGCUGCA
322


L1-82










Roseburia intestinalis

CTGCCCGGUATTTCGCAUT
323
ACGGGCACAGAUTATCGUG
324


L1-82


T







Enterococcus gallinarum

TTTGGAGCAATGAUTATCGG
325
CTCCAATTAAGCCUGCAGA
326


EG2
TCCATUAA

AAAATUACG







Enterococcus gallinarum

ATTACGGUACCTGGAAAUGA
327
GATAGCACGACCGAUCAAA
328


EG2
AGGCT

TAAAAATACTAUT







Enterococcus gallinarum

ATGGTTGGTAUGGCAGTTAT
329
TTGATAATGCCUTGTAAGA
330


EG2
UGGC

AUGCCC







Prevotella copri DSM

CCACACCAUTTTTGCCCTTU
331
CGGCTUCACCCAGTUCG
332


18205
CAC









Prevotella copri DSM

TGAAGCCGGAUGGCTUGA
333
TCTTCAAATTTTAAUTCTT
334


18205


AGATGTTGAUCCAC







Holdemania filiformis DSM

CGUCCCAGCUGACGCAA
335
TCGGTAUGGGATTATCCGU
336


12042


CCT







Holdemania filiformis DSM

CTTTAAAATCAGAUCCAGAT
337
TGAAGAAAATUCCGCCGCU
338


12042
TTTCATGTUCCA

GA







Holdemania filiformis DSM

GCCATAGACCGCUCTGACTU
339
CGCAGCUCAGACCATTCAT
340


12042
CC

UGG







Holdemania filiformis DSM

TTGGAAGACGUCATCCTCGA
341
TCAAUGCAACCCTTUCCCA
342


12042
TATAAUGA

G







Helicobacter bilis ATCC

AGAGTGAGACAAUTACGCTA
343
TTGATATTTCATTTUCAAG
344


43879
CCTUG

GTGTTTAAAGUGAG







Helicobacter bilis ATCC

AGATTCTAAAGAAGUGCTAG
345
TTGATGACATTTUGAGAGA
346


43879
ATTTAAGUGCG

ATGTCTTGCAUA







Slackia exigua ATCC

GGAAUGTGCGUCGAACGG
347
CCAGCUGCGGTTGCGAUT
348


700122










Slackia exigua ATCC

CCGTACCGGAUTCCAGCGUA
349
GTCTGGAATGUAGAACTAT
350


700122
T

GCGATGATAUAT







Slackia exigua ATCC

CTCTUGGCGCGAAUGGAC
351
TGGGCGGCUATCTGGAUG
352


700122










Anaerococcus vaginalis

AAGGACTTAUGCCTCAATTA
353
TCTACCGCAGAUAAAACTC
354


ATCC 51170
ATUCAACC

CCACUA







Anaerococcus vaginalis

ACCTATAGTCAUATCAACTG
355
AAGTCCTUGCATCCACTTU
356


ATCC 51170
GAATUGCG

GG







Anaerococcus vaginalis

TACTGGAGATGTAUTAGTGG
357
TTGCATAATAATTTGUAAG
358


ATCC 51170
GAGAAGUT

GTTTTTCATCCUC







Collinsella aerofaciens

CGUTCCAUCCCACCCCT
359
GCATCCAGAAUGCTTTTCT
360


ATCC 25986

UACCG








Collinsella aerofaciens

TCCCCAAUCTTCCGTAUAGC
361
ATCAGCGAAAUGCCGTUCA
362


ATCC 25986
G

AA







Collinsella aerofaciens

AAAGACCGCCGUTGCGGTTT
363
AUGGAACGGCCCAUGCA
364


ATCC 25986
UA









Dorea formicigenerans

ATGCATCUGTTTCCUGGCCA
365
TTTTGCAATCUGAATGTGA
366


ATCC 27755
T

TCUGGG







Dorea formicigenerans

AAACAGATCACGUCCAAGGT
367
GGGCCGAUGCAUGGAGA
368


ATCC 27755
CAUC









Dorea formicigenerans

AUCGGCCCAGTAUCCGAT
369
ATCCGGGUTGATUAGGAGG
370


ATCC 27755


AAGA







Dorea formicigenerans

TTGCAAAATAACATUTGTAA
371
AAGAGGGCAGAGUAUGCCG
372


ATCC 27755
TCCCAATTUCC









Ruminococcus gnavus ATCC

ATGCCCTGGAUTATCCCAAU
373
TTCAATGCCTCAUAATGCA
374


29149
GAA

TCTGAUC







Ruminococcus gnavus ATCC

TCAACAGCTUGAGTAGTCTC
375
TTCTGCAGUAACTGCAGGG
376


29149
GUC

UAC







Ruminococcus gnavus ATCC

ACGGAATGTTTUCCGCAATC
377
CAGGGCAUAAGAGGCAUAA
378


29149
GUT

GC







Ruminococcus gnavus ATCC

TGCAGCAUCACCTGCUGA
379
GCTGTUGAAGGGCUCGG
380


29149










Campylobacter rectus

GCTTATTACGCACAAUAGCG
381
AATAGTTTTGUAATAACAA
382


RM3267
AATUAAAACA

GAUGCAACCAG







Campylobacter rectus

AACCGAAGAAGGAGAGUTAA
383
CGGTAGTGGUGGTGTTATC
384


RM3267
AGACUT

GTUAAAT







Campylobacter rectus

CAGGTTGAGGGCCAUCTAAA
385
ATTGACAAAATCAUAGTTA
386


RM3267
TAATUCA

AAAACTCCTTUGAA







Campylobacter gracilis

TTTACTACCAUCGCGCCGAT
387
ATCGCCGCGUTTUGCGT
388


RM3268
ATUT









Campylobacter gracilis

AAACGGCUCATCTGCGUCA
389
GUTGCACCGUAAAAGAGAG
390


RM3268


GACT







Peptostreptococcus

AACCTAGCCATACUAGTATA
391
GAGTTGGUATCAGGAGAUG
392



anaerobius 653-L

GTCCCUT

AAGAAGC







Peptostreptococcus

CTGCAAACACAUCAAAAUAA
393
TGCCAAAAATAAGAUACAC
394



anaerobius 653-L

AAGGCAG

CTTCCTAUAAGA







Peptostreptococcus

ACCAACTCTAUATCGGCAAA
395
ACCTGAGGGUGACGACTUG
396



anaerobius 653-L

ATTUGT









Peptostreptococcus

TGTCCCTCAACCUAATTTTT
397
GTTTGCAGAUAGGTGTUCA
398



anaerobius 653-L

GGCUT

AGCA







Prevotella histicola

GUTTGGCUCAGGAAGAGAAA
399
GATACUACCATCGCUAGAA
400


F0411
CCT

ACACAGAA







Prevotella histicola

GCAAAGGCAGAGGUGGACAT
401
TCAAACGAACAGCCUGTUC
402


F0411
UAC

C







Prevotella histicola

TCGTTUGACGAATAACAUGC
403
AGAGCCTATCAUAGAAGAC
404


F0411
CG

ATCAAUAGC







Prevotella histicola

AGCACCTACCUTCTGGATGA
405
AGCAGCACAGGUCCTGUT
406


F0411
UC









Helicobacter bizzozeronii

GGATAGCATGGUGCATGTTA
407
GTUCCACAAGAGAGAUGGG
408


CIII-1
CAGAUAT

CA







Helicobacter bizzozeronii

TTTGGGCAGUAACCTCUAGG
409
TGCCCUAGAAGCCATTTAU
410


CIII-1
G

GACAAA







Helicobacter bizzozeronii

ACTGATAUGCACGCCATAGA
411
CCAAAGCATUTTAACCGAA
412


CIII-1
UCAC

AAUGGT







Enterococcus hirae ATCC

GGCGUTGAUACCCCAGC
413
TTGTCAGTCTATAUTGTGA
414


9790


GATGTTTCUCAAA







Enterococcus hirae ATCC

TGGTCCAACAGCUGTTTCTA
415
ATGAAGCAAAAGAAAAAUT
416


9790
CUT

ATUAGCACAACAA







Enterococcus hirae ATCC

TTTTTGAGGCUAACTTTGCC
417
TCAACGCCUTCTGGTATUC
418


9790
ATTUCT

CC







Enterococcus hirae ATCC

AGATTCGGACCAAGUTTAAC
419
ACCTTTAGGGAAGUACGGT
420


9790
TCTUCAA

ATUGAA







Bacteroides nordii

ACCAAGACTGCUGACAGCAT
421
TGCAGGCACGUATATUGGC
422


CL02T12C05
AUG









Bacteroides nordii

TGCCUGCATTGTGAUGGAG
423
AGACGACGUGTCCAACTAU
424


CL02T12C05


CAG







Barnesiella

CGAAGCAATTCAAUAAAACA
425
AGTTGCGUATTATCCAGTU
426



intestinihominis YIT

CGAAAGUG

GCGA



11860










Barnesiella

CGATGAATACUAAGCTCATA
427
TTGCTTCGAAGUAAGCGAT
428



intestinihominis YIT

CTCTUCGG

ATATTGTTTUT



11860










Barnesiella

AAAAUTGCGACCUCCCGAAA
429
AATTTTCTCACGGAUACTC
430



intestinihominis YIT

AAT

ACATTAATTUCGT



11860










Barnesiella

ACCGATAATUACACCAAACA
431
CGTCGAUCAACAGTGCGUT
432



intestinihominis YIT

ACAUGG





11860










Lactobacillus murinus

CCGATCACAUAAGCCACACC
433
GTGAGTCAAATAUCATTGA
434


ASF361
UAAC

TGTGAUCGT







Lactobacillus murinus

TCATCUGGAGCGACGUGA
435
GAACUGACCAACAAAGATC
436


ASF361


AAUGGA







Lactobacillus murinus

ATCCGTGCCUTAAGTAGTTU
437
CAAGGAAGGTAUAAATGAT
438


ASF361
GCT

ACACATTAUCCCA







Lactobacillus murinus

CCTTGATGCUTGGCTTGATG
439
GGCAAAATAAGCUCCTAAA
440


ASF361
UT

ACAUCG







Eubacterium rectale

TCCTACCGUAAAGCTCTGTG
441
TTTATTAGGTTTGAUTTTT
442


CAG:36
TUAC

CAGACCUGCCT







Eubacterium rectale

AGGTATTTTCTCTAUCCTCT
443
GCAGGCACTUTTAATATTC
444


CAG:36
TCCCTTUAAAACC

AATGTUCCG







Cloacibacillus porcorum

GAAAGGGUCAACATUGCCGT
445
GCGAUCGCCGTCGUGAC
446






Cloacibacillus porcorum

GAAGGTGCCGAUCGAGAAGU
447
ACCCTUTCAGGATUGGCAC
448



G

A







Cloacibacillus porcorum

ATAACCGGCGCGGUCUT
449
ACCUCCGUGACAGAGGGA
450






Cloacibacillus porcorum

CGATCATCACGUTTGAGGCT
451
TATGAATCTUAGCGCACGC
452



TUG

AAUC







Blautia coccoides

GACTCAGATTTUCAACCCCT
453
TGCTTTATACGCAUAAAAA
454



GTCUG

TAAGCTTAATUCA







Blautia coccoides

ATACUCCAGGGCACTUGCCG
455
TTTACCCTTGGGCAUTACC
456





GTATATACUA







Ruminococcus bromii

AUTAAGGTTGTUGAAGAAAG
457
AATACCGCCUCACTTACTA
458



CAGAAGAA

UAGCC







Ruminococcus bromii

TTGTCGGGACUTCTTGATTA
459
TCGGTAUCGCAGCTGAATT
460



UGCA

TAUAGT







Phascolarctobacterium

CUGACAGGGACAGAAAGUAA
461
TGCAACGGCUTTGTACUCA
462



faecium DSM 14760

CG

CT







Phascolarctobacterium

CCGATCGUTCCGCTTUCA
463
GTAACTAUCAGCGGCGGUA
464



faecium DSM 14760



CT







Phascolarctobacterium

ACATCGATGTTTTUGATGGC
465
ACGAUCGGCGGCGAUAT
466



faecium DSM 14760

TTTAATATUGC









Gemmiger formicilis

GAAAAGCCATTTTAUATTCT
467
CTGAAAAAGATTGGUGACA
468



CCTGTTCTTTUT

TCACAGAUAT







Gemmiger formicilis

GCACUGCGCCAGATAGGUA
469
GGCUCGGTTUCCGCGAT
470






Gemmiger formicilis

TGTAAGACCUGCGCGTTGUG
471
GCGATAGCCUGACCCAGUT
472






Helicobacter salomonis

GCAAAACCCUCTCTTGCTUG
473
TGGTGGCCUTGATAAGAGT
474



T

TUGA







Helicobacter salomonis

GCTGCTCTTCCUGTCAGGTA
475
GGTTTUGCAACAAGGGCTU
476



TTTUAG

C







Helicobacter salomonis

GCAGGGCUGGCGAUCAA
477
ATGGGTTTUAAACGCTTGA
478





AAAAUGC







Helicobacter salomonis

GCCAAGGCCUCTCTTCUCA
479
AGCCCUGCCCCTAATUGG
480










TABLE 16B (PRIMERS/PROBES SEQ ID NOS: 481-520)

GENUS AND SPECIES        PRIMER 1                             SEQ ID NO:    PRIMER 2                            SEQ ID NO:
Gardnerella vaginalis    AGCAGGCCUTTTCUCAGGA                  481           GGAGCAACUTGTTAGCAGAUGG              482
Gardnerella vaginalis    AGTUGCAGGTTTUGCGAGT                  483           TGCCAAAAAGCCUTGAGAATATTCUG          484
Gardnerella vaginalis    TTGACGAATCUATTTAAACCTUACCGC          485           GGCCTGCUACTAATTCACTTATUGC           486
Klebsiella pneumoniae    ACGGTGGUCGCTGTACUG                   487           GCAGGGUGCUGACCGAG                   488
Klebsiella pneumoniae    TCAGCGCGCAGAGAAUACUG                 489           ACCACCGUAACCGGCUC                   490
Klebsiella pneumoniae    CACCCUGCGGGCTGUCT                    491           GGATTACGCAUCGGAUCGGG                492
Escherichia coli         TGCAATCUTGTGAGUGGCAGA                493           CUCGACCACCACGAAUCGC                 494
Escherichia coli         CAATCTTCGGCGUTTTGCTGUAT              495           GTTGAAGAUGACATGAGCGTUGAC            496
Escherichia coli         CGCCAGCGAAGGCUATUT                   497           GTGGTGGAUGTTCCTCTGGUG               498
Enterococcus faecalis    CGTAGCCAAAACUAATCCGGATUG             499           CATTTGGACTUAAGAGGTATTGCGATUT        500
Enterococcus faecalis    GCATTAAGAGCAAAUCACTGGGAAUT           501           ACGATTAUTTTAAAAGCGTUAGAAGAAGCC      502
Enterococcus faecalis    GTTGTTGTAAAUGCCATGGGTUCC             503           TCTGAAGTACUAGTTGCAGTGATUCAAC        504
Proteus mirabilis        CTTAAAGAAAGTCAUAATCCTCACCTUCCC       505           GCGTTTGGGUTTATGAGCTUGAAA            506
Proteus mirabilis        ATAAAGAAGCATATGGUGAAAAATAAAACTCUG    507           GGCATTUGCGCCCATACUG                 508
Proteus mirabilis        GTCGAGUACGACTUGCGAGAA                509           GAGTCACCTATAUAAGCATCACTCTAUAAGAT    510
Escherichia coli         TCGTCGGUTCTGGCCUACT                  511           GAGAAGCGACGACAUGATTAACUCT           512
Escherichia coli         CGCCACGGCAAUGGTTUC                   513           GCGCAAACGUGGTTAATGGUA               514
Escherichia coli         CGUTATGUCGGGCGAACCA                  515           GCAATCAUGGAAAACATCAACGUCAT          516
Escherichia coli         CTTCACCGCCAUTTCCGUAAC                517           CCACACCGUTAGCAGCAAUCA               518
Escherichia coli         GACCCAUCCGGCTGAUACC                  519           CCGTGCUCGGCAATTTUACAT               520











TABLE 16C (PRIMERS/PROBES SEQ ID NOS: 521-826)












Escherichia coli

CGCATUGGTGAGCUGGC
521
GACAGCAACUCGCGGAUC
522






Escherichia coli

TCCGUATCGATCCUGAACAC
523
GCCACUCGCCCCTTGUT
524



CA









Bifidobacterium longum

GUACCCAACGGGCCGTUT
525
CAGAUGGUGCCCAGACG
526






Bifidobacterium longum

GCCUCGCGCGAGGGAUT
527
ACCUTGGUCGAGGCCGCT
528






Clostridioides difficile

ACAAGAAAGGAGCGAUAACT
529
CTCATCAATATTUAAAGCT
530


QCD-66c26
TTGGUT

CTTTGTUCAGCT







Clostridioides difficile

TGGATATTAAAAGUAAAACT
531
ATCATGTTATCCCUCCCAA
532


QCD-66c26
AGCTGATGUGG

TTTGTUCT







Clostridioides difficile

TTGATGAGATAUCAACGGAA
533
AATTTCTCACCCUAGTAAA
534


QCD-66c26
ATAACTAGTAUG

TACTGTTTCUC







Lactococcus lactis subsp.

GCTCCTTGAGUATAACCATT
535
TCAGAUGAAACAAAAGCGG
536



lactis I11403

GGUC

CTTUC







Lactococcus lactis subsp.

GAAATTCACTTCAUCAATTA
537
GCTGTTGCAACUGCTTTGU
538



lactis I11403

TACCAUAAACCAT

CA







Lactococcus lactis subsp.

TCTAAGGCAAUTGCTTTTAT
539
TTTCTGAAATTCUCTTTTA
540



lactis I11403

CATUGGG

TGTCATTTUAGGAC







Lactococcus lactis subsp.

GAATTCTACAACCAUCTTCA
541
ATTCGCTGATUTTTCAGGT
542



lactis I11403

CCACTUCA

ATTUGCT







Chlamydia pneumoniae TW-

CGCCUATGTUCAGGCAGC
543
CGTCACUCGATUCCCCGT
544


183










Chlamydia pneumoniae TW-

GAGUGACGGAATCTTTUACC
545
GAGCCTCUGGGTTCTGCUG
546


183
CC









Fusobacterium nucleatum

CAAAACCAATTAAAGAGUTA
547
AAATGTGGATTAAUTTGAC
548


subsp. nucleatum ATCC
GAGCAACATAUG

TGTAAAGUGCAT



25586










Fusobacterium nucleatum

TCCTATCAAGAATACUCATT
549
TGGTTTTGTAATUCTTCTC
550


subsp. nucleatum ATCC
GGACATTGAUT

AATACACUGAT



25586










Porphyromonas gingivalis

CAACCTTUAGCCTCGCCAUA
551
ACCTTTGAAAAUACAACAG
552


W83
GAA

AGGTGAUAAA







Porphyromonas gingivalis

GAGUCGACAACCGTCUGC
553
CGAGTATCTGCUGAAATGA
554


W83


GTGATAUAAC







Helicobacter hepaticus

AGCUAAGGCGCCTUGCAA
555
GTCTTCACCUTTAGAATCC
556


ATCC 51449


AUCGCT







Helicobacter hepaticus

TGCAAAGCTAAGCAAUTTAG
557
GCAATGTTGAUACTTTGTC
558


ATCC 51449
TCAAGCTTTUA

TUCACCT







Lactobacillus johnsonii

TTCAAAAACATATCUTCTAG
559
ATGGCCACCAUAATTTTGC
560


NCC 533
ATCTTCTUGGT

TTTUAAAGG







Cutibacterium acnes

GCCCUCCGCATCGCUGT
561
ACGTAUACCAGGCUCAAGG
562


KPA171202


CT







Cutibacterium acnes

TCGCCCAGGUGCTCUCC
563
GGGUTGGUGGAACGCGA
564


KPA171202










Cutibacterium acnes

GCCGCAGCCGAACUGGUT
565
ACCGCGAACUCGGGUGG
566


KPA171202










Cutibacterium acnes

TUCGCGGUCGACACCAA
567
GAGTTGAGGUGCTGAUCAA
568


KPA171202


CG







Chlamydia trachomatis

GTGAGCGAAUCAAGAAAGTU
569
AGAAGCTAGTGUATACACT
570


D/UW-3/CX
CGT

GCTTGUC







Campylobacter jejuni

TATAGTTGGCGUGGAGCAAA
571
ATTTTAATATTTUCCCCAG
572


subsp. jejuni NCTC 11168
AATUGA

TATCTTTAGUGCA



ATCC 700819










Campylobacter jejuni

AAGGTCTAAATTTUGTCCAT
573
TGCTTTCUCAAAAAGGATC
574


subsp. jejuni NCTC 11168
CTAGCAUG

UCAAGGT



ATCC 700819










Campylobacter jejuni

AAAACGAAAACGAAAAAGAU
575
TCTCTCAUAAAAACGCATA
576


subsp. jejuni NCTC 11168
GAAGGTUT

CCACUAAGT



ATCC 700819










Campylobacter jejuni

AGAAAGCAUCAAAACCAATA
577
TGCTATAAAAGAUGGAGAA
578


subsp. jejuni NCTC 11168
AAGGAUCAG

CGCTAUAGT



ATCC 700819










Campylobacter jejuni

AATAAGGTTTTGAUTGCAAA
579
CCACTTTAGACAUAGGTGG
580


subsp. jejuni NCTC 11168
ATTCTTUAGGAA

UGGT



ATCC 700819










Bacteroides fragilis

AGCCGCAAAUGAAUACGGC
581
CCATGGAGCUGGTTGGTUG
582


YCH46










Bacteroides fragilis

GGTATAAATGGAUCGTACGT
583
TCCGCCAACAAAACCUATG
584


YCH46
TUCGA

UCT







Bacteroides fragilis

GGCAACCACUTCCGGAATUT
585
TTGCGGCUGGAUGAGGT
586


YCH46










Lactobacillus reuteri JCM

GGGATGGACAAUTATTTTAT
587
CUGGCCAGUAACGGCGA
588


1112
GGATTCUGA









Lactobacillus reuteri JCM

AGTATTTTGGCUCACCAAGC
589
TGCCATTGAUCCACCTCAC
590


1112
AUCA

UT







Lactobacillus reuteri JCM

ATTATGATTATTGGUGGAGG
591
AATUACGCCAACGUACCCA
592


1112
ATGGTATACUGT

CC







Lactobacillus reuteri JCM

CTGGCCAGUATTTUGGCGGT
593
ACAGTTGAGGCUGAGAGAA
594


1112


AACTTUG







Bifidobacterium

GCCAAUCCCCGTCAUAGC
595
GATCCGGCGGCUGATATUC
596



adolescentis ATCC 15703











Lactobacillus rhamnosus

CCGGTTTTUGCGCGCTUC
597
GCGAUGGCAGAAGCGUT
598


GG










Lactobacillus rhamnosus

AAACCTTGAUGATTGCTTTU
599
AGACCCGUAAUGCCGCCT
600


GG
GGCAA









Lactobacillus rhamnosus

GCCAUCGCTCTUGGCGT
601
ACTTTGGTTTGAAUCAAGA
602


GG


CTTGAUCAC







Lactobacillus rhamnosus

GTGATCGUCATGTGCGAUCC
603
AAGGATGGAUCAACCGTTA
604


GG


TCCTTAAUAAA







Bacteroides

GCTGGTUCAGGTATTACAAC
605
GCGATTACGATTUGAAAAG
606



thetaiotaomicron VPI-5482

UGC

TTCTCACTUT







Bacteroides

TTATTGCAGGGUATGGUAGC
607
CTCCACCUAGTCCCTGUCC
608



thetaiotaomicron VPI-5482

CAG

G







Bacteroides

AAATCACCCUGGUGGAGCGA
609
CUACTGCCTTCTUCCGGGA
610



thetaiotaomicron VPI-5482



AAAT







Lactobacillus acidophilus

TAATGACGAGAUGCGTTUGG
611
CTGAATTAAATTUAGTGCA
612


NCFM
ACAG

TTTTCUAGCAAAGC







Lactobacillus acidophilus

CGGTTTAAACGAUGCTACTC
613
TTTCAGCAUACCAAAGTGG
614


NCFM
UCGA

ATATTUCCAT







Desulfovibrio alaskensis

GCAGCGAAAGUCCGUCG
615
ATATCCGCGCUGCGCUG
616


G20










Desulfovibrio alaskensis

CTGATGCGUGCCGUGCC
617
AGAACAGCGCAUGCGCUC
618


G20










Bacteroides vulgatus ATCC

CAAACCACTTGUTCAACTTC
619
GTCAGCAAUGTAACCGUCA
620


8482
CCUG

GG







Bacteroides vulgatus ATCC

AGGCAUGAGCAUGAAACGC
621
GGTGGAACUGACCGUAGGC
622


8482










Bacteroides vulgatus ATCC

TTUGGCCACAGCAUGGGA
623
AAAATGCTTCTUGTTCCAG
624


8482


TTCAUCC







Parabacteroides

CCCGGUCGTGTTTAUGGG
625
TCTGCGCAUTCATGUCCG
626



distasonis ATCC 8503











Parabacteroides

CCGGAGGGAGUGGAGUT
627
CCCTUACCCGTATCTTUCA
628



distasonis ATCC 8503



CGG







Parabacteroides

GAGGAAAAGGCGGAGUTTAT
629
CCCTCCGGCAUCATCAATU
630



distasonis ATCC 8503

AGATCUG

G







Parabacteroides

TGCGCAGATUCAGGATATTT
631
CCAGATACGCUTTATTATA
632



distasonis ATCC 8503

GUGC

ATAATTCUCGCC







Lactobacillus delbrueckii

CCGCAACCUGGTCTUAAAGA
633
TTGGCGCUTCAGCCAGUAT
634


subsp. bulgaricus ATCC
G





BAA-365










Lactobacillus delbrueckii

GUGCCCGCUGAAACGGA
635
CCGGUCAAGTUCCGGGCA
636


subsp. bulgaricus ATCC






BAA-365










Lactobacillus delbrueckii

CCTTUAAGAGCAGCCGGGAU
637
GCCUGAGTCAAUCCCGACC
638


subsp. bulgaricus ATCC
T





BAA-365










Campylobacter curvus

GCGACGGAUGACGUCCT
639
GGATGTUGCTAGCUAGCGG
640


525.92










Campylobacter hominis

CGCGCGAAAUGCUGAGA
641
GGUGAAGCGAUGGCGAA
642


ATCC BAA-381










Campylobacter hominis

GCTTCACCGAAUCCGUCG
643
TTTAGCATCTATUGAAAAT
644


ATCC BAA-381


AGTAGGATTUCACC







Campylobacter concisus

GGTTTGAUGGCAAAAATTTG
645
AGTTAATTGTGGCUTTAGC
646


13826
TGUGGT

TAGGATAAAUT







Akkermansia muciniphila

TGTAAGCGGCGUTGTATTTG
647
CCAATACCAGUCCAGUGCA
648


ATCC BAA-835
UCC

GC







Akkermansia muciniphila

CATTATCAACGGUTTTCAGC
649
AAGAAAGUAAACCTTACTA
650


ATCC BAA-835
GTGUAG

UCACGGC







Atopobium parvulum DSM

TAGGTCCTATAUTCCCCAGA
651
AGTAGTTTTGTCUACTCTT
652


20469
CUCAAA

GGAGTAGUG







Atopobium parvulum DSM

TAUTGATTCAAGTTTTGUGA
653
AGGACCTAUACTTGCAATT
654


20469
AGAGAGAAAAAC

UAAACGAC







Veillonella parvula DSM

GGAACUCACGAACUGACCAA
655
CAGATTTAATAAAAGCAUC
656


2008
AGA

CCCATTTTUAGCC







Veillonella parvula DSM

GCGCCCATUCCAACTAATAC
657
CCAATGTACAGGAUAACTC
658


2008
ATTATCTUC

TGTATUACACG







Veillonella parvula DSM

AAAAGAAGCGGAUAGTTGAG
659
CACTTGTGGACUGTAGAAT
660


2008
TTAAUCAGC

AUGGCA







Citrobacter rodentium

GCTGGAUGGCGGTAUCACT
661
GAGCAUCAATCCATGUCGG
662


ICC168


AT







Citrobacter rodentium

TGAUGCUCCGCCACCCA
663
AGGTGTCUACGGCACUCAC
664


ICC168










Streptococcus

AACTTGAAAAAGCAAAAGAU
665
AGCTTATACUAACGATAAT
666



gallolyticus UCN34

ACAAGAGTTAAUG

AAAAATUAACCCGA







Streptococcus

AGAAAGCCCAACGGUATAAA
667
CAATCGCTGTCTCUTACTT
668



gallolyticus UCN34

CATUACAA

CATTTATTTTAUGA







Enterococcus faecium

CGGCGUGAUCAGCGCCA
669
GGTTGCTGUGCCTCTTATG
670


TX0133a04


UGG







Enterococcus faecium

AGCTTTATACAAAAGCAUAT
671
CTTTAACGAACGUGTTCGC
672


TX0133a04
CTGCTCCUT

UAAAAA







Enterococcus faecium

AGCTCGUTCTCATUCAGCAG
673
ACUCAGGAAGCTTUGGCAG
674


TX0133a04
A

A







Peptostreptococcus

CCTCCATATACCAACUTAAA
675
CCCAAGAATAUTTTGCCAA
676



stomatis DSM 17678

TACTAAACAUGT

GGUCA







Peptostreptococcus

TCTTGGGCUATACCCAUAGA
677
GAGTCGATAAUAAAGAGGC
678



stomatis DSM 17678

CCT

TTTTAAGUGAT







Mycoplasma fermentans JER

AGCCACTTTTUGTTCGTCTT
679
ATTCAGCATATTUACCACT
680



AGUACT

TGCAATGUT







Mycoplasma fermentans JER

TGGATGGATTUTATGATGCT
681
AAGTGGCTTTUTAGTTCCT
682



TAUCCACA

TCUGC







Eubacterium limosum

CACAAGGGUCGCCGCGUC
683
CCGCGAAAUACGGCGAACU
684


KIST612


G







Eubacterium limosum

AAACATAAACGAUGGAAAAC
685
TAAGAAATUAACGGAAGGA
686


KIST612
AGATTAUGGAAAA

GAUGAAACAC







Parabacteroides merdae

TCAATGUACCGGUGGGCAA
687
ACAGACAGCCUAATTAACG
688


ATCC 43184


TAGUCC







Parabacteroides merdae

TUCGGAACCAUCCGGCA
689
CCAUGCAGAAAAACCGATU
690


ATCC 43184


CCG







Faecalibacterium

GCCATTGCGCAUCGTCAAAA
691
CATUGCAGGCAAGGAAUGA
692



prausnitzii M21/2

AUA

AGAG







Faecalibacterium

TAAGCCGAAAUCTGAAUGAC
693
CAGCTGAUGGATGAGATGA
694



prausnitzii M21/2

CGA

UCGAA







Faecalibacterium

CCTUAACGGCAACCACGAUG
695
GCGCTUCCAGCAUGCCA
696



prausnitzii M21/2











Faecalibacterium

CCGGUATGGGAAUAGGAAAA
697
AUGCCCCCGCGCAAAAUC
698



prausnitzii M21/2

AGC









Parvimonas micra ATCC

CCAGCACUCCGACTATAGAT
699
AATACAAAAGUATTCTGAU
700


33270
TUAGT

GACGGAGAG







Parvimonas micra ATCC

AGCTACTGAUCCCCAAAGTA
701
CGGACAACGAAGUCCGTTG
702


33270
AAATTCUC

TUT







Bifidobacterium bifidum

AGCGUACCGGAAGCUCG
703
GUCGGCAAUGCCGGCAC
704


NCIMB 41171










Bifidobacterium bifidum

GGCCGAATAUGTTTCGCGGA
705
GCGAGGUCAAGATAUACGG
706


NCIMB 41171
UA

C







Collinsella stercoris DSM

CGCCACCCCACCUCAAUT
707
ACCCCCGUTGCGCACAUT
708


13279










Roseburia intestinalis

GTCAGATTCUCTGCATAATT
709
AGCGTCAAUCAGGAUGAGG
710


L1-82
TTUCCG

T







Roseburia intestinalis

TTGACGCUGTCATCCUGCT
711
AAAAATCGAGGAUCTGCUG
712


L1-82


CG







Enterococcus gallinarum

GAGAAGATAAGUACCTAAAT
713
CAATGAAAACUGGATCACC
714


EG2
CUGAAAGAAACGC

CTTCUGAT







Enterococcus gallinarum

AAGGATGTGUCCAACATGAA
715
TCAAGAAATACUGTCTTTC
716


EG2
UCAGGA

TTCUGACCG







Enterococcus gallinarum

GAAGCCAATCCUGGTCCTGG
717
AAATGGGAAATAAACUTCA
718


EG2
TTUA

TGAATACCTCCUAT







Prevotella copri DSM

AGTCATTAUGAAGGAGACCA
719
GGGCGGAUAGATTUCCGGC
720


18205
ATTUCGAC

A







Prevotella copri DSM

AUCCGCCCAGCTUAGCC
721
CTGATTTTGUAGAGAATCC
722


18205


CTCGTUGA







Prevotella copri DSM

AACCUTGCACAUCGAAGAGG
723
ATTGCTTTATCGTUACTGA
724


18205


AATTCATAATCTUC







Prevotella copri DSM

AGACAATTCTUGGCAAACAA
725
GCAAGGTUCCTACGAAAUC
726


18205
TTCUGG

AAGC







Holdemania filiformis DSM

ACCCUCCTUACCCCACCA
727
AACAGGCGAAGGAAAAAUG
728


12042


TACTUAC







Holdemania filiformis DSM

AGCGUCGACGCTCTAUCCA
729
AGGAGGGUGAACGTTTUGG
730


12042


T







Helicobacter bilis ATCC

AAACAGAAGAACAAAUTCAA
731
TTTGCATGGUATTCTAGCU
732


43879
ATGCGUAACAA

CAGC







Helicobacter bilis ATCC

GGATATGTATAATCUCAATC
733
CGACAUGATATGCACUCCC
734


43879
CACAAGATAUCAC

AGAGA







Anaerococcus vaginalis

TTATGCAACGAAUATCCTAA
735
AAACTCCATUAAGCATAGG
736


ATCC 51170
ATACAAUGGAT

TAATGAUGAGA







Anaerococcus vaginalis

AGUCTAAATTCTAAATCUAG
737
ATTGCCGAUTTTCAGGAUA
738


ATCC 51170
GGCAACGG

AGCCA







Anaerococcus vaginalis

TCGGCAATATCAUTTTGATT
739
AATAGTTGCGGAUTATATA
740


ATCC 51170
TCCTUCA

ATCAACAAUCCAA







Collinsella aerofaciens

GAGCAGUCGGGTGTCUCC
741
GGGAAAUCAGCCCTTGAGA
742


ATCC 25986


UC







Dorea formicigenerans

AACCGGGAAUGACTAATCAA
743
CCGATTCAUCAAAGCAUAC
744


ATCC 27755
GUGT

CCC






Dorea formicigenerans
ACAAAAGAAATUATTGGAAC
745
TCCCGGUTCUACCGAAACC
746


ATCC 27755
CATUGGCA









Ruminococcus gnavus ATCC

CCGGAGUTTGATACCAUGGG
747
ATACCTGCUGCCCGGTUGA
748


29149
ACA









Ruminococcus gnavus ATCC

GCCGCTTUTACTGGCATUGT
749
TCTCUTTTTCCTGUCCCGC
750


29149


A







Campylobacter rectus

TTGTTTGCGTCUTATACTCG
751
CGATAATTTCTUAAAATTT
752


RM3267
TGUCT

AGATGTCUGACACA







Campylobacter rectus

AGTCATTTTGCUTGACTGTA
753
GCAAACAAACGGAUTTACG
754


RM3267
TTTTUGGT

AAGCUA







Campylobacter rectus

AAAAACAAACAAAUTTGAGA
755
TTTTGTTTTACTTTAUCGT
756


RM3267
GCAUAGAGGA

CCATATCGACUT







Actinomyces viscosus C505

GUACAGGCUCCCGGCGT
757
GCTTGCUGCAGCCCUCG
758






Actinomyces viscosus C505

CTCCACCGUCGGGTUGT
759
TTCCAACAUGTTGGCUCGC
760






Actinomyces viscosus C505

TGTTGGAAUGCCCGCTTAUC
761
CUTACGCCGUGGCCGGT
762



A









Campylobacter gracilis

CGCGCCUCGTCGATCUT
763
CGCCGUGCTTTTUGACGA
764


RM3268










Campylobacter gracilis

GCACGGCGUCAATGCUT
765
TGCCAACGGCUTTATATAT
766


RM3268


TTCTACAUC







Campylobacter gracilis

TCCTAGATTUGCGATCAGCG
767
AGCCGTTUTACGCGCUG
768


RM3268
UAAG









Campylobacter gracilis

GCGGCGAUTGAGCGAAAUT
769
TGGGCGAGAGUTTATCGUG
770


RM3268


C







Peptostreptococcus

CCATAACGGUCTTACTGCTC
771
AGTTACAGGTAGUCCCATC
772



anaerobius 653-L

TUGAA

TCTAUACAG







Peptostreptococcus

AGACTTGCAUGTTCTCCTGA
773
GATCGTAAACGUAACCACA
774



anaerobius 653-L

UGA

TGGUC







Prevotella histicola

TTGATAATGTGUTTACCAAC
775
CAGACGGTCUCAGTATTGT
776


F0411
AUCACCAC

TCUGAT







Prevotella histicola

ACCGCAACCCUTGUGAGGT
777
AGGUGCUAACGGCGAGA
778


F0411










Helicobacter bizzozeronii

GATCCAAAGUGATGGGTCCA
779
TGCCCAAAAUCTCCAAAAG
780


CIII-1
UAGAG

ATUGT







Helicobacter bizzozeronii

TGCTTUGGCTTCUCCCACT
781
CGATTTTATGGATUGCTTA
782


CIII-1


AAAAGGGTUAAGA







Helicobacter bizzozeronii

TGTGGAACAAAUGAGTATTC
783
ATAAACATCGGUCGCACGA
784


CIII-1
UAGCCAA

TUAGT







Enterococcus hirae ATCC

AAAACTTATGATUGACAATC
785
TGGCTGATGUTTGGTCTGU
786


9790
GAGGCAUT

ACA







Enterococcus hirae ATCC

AAACGGAAGAAGGAGUCTAT
787
ATCCTACACGACUAATCAT
788


9790
CAUGA

TAGAGAAAGUT







Bacteroides nordii

CGUAGAGCCTUCCCGGT
789
GGGCACCGAUGAGAAAAGU
790


CL02T12C05


T







Bacteroides nordii

TGATCACUCCGGCUACAAAG
791
TCCGGATAUAGATACTATU
792


CL02T12C05
GT

GCACCG







Bacteroides nordii

AAACGCCUTAAATTGATUCA
793
GTGGUGAAAGTTTCTGUGC
794


CL02T12C05
AGCGA

CC







Bacteroides nordii

TCAAGTTTCCTUCTAAAAGT
795
GGCGTTTTCUGGTGTTTAT
796


CL02T12C05
AGCUCGT

GTUCT







Barnesiella

AATCTTTGATUGGAAGGTTA
797
ACGCAAGAUTTTCATTCTU
798



intestinihominis YIT

GAAGTAUAAAAGG

GAAAGAGGAG



11860










Lactobacillus murinus

CTTTGCGACCACACUTAGCU
799
ATTCATAAGCGGUCGTGAC
800


ASF361
C

TTTTAACUT







Lactobacillus murinus

TGACTCACCUTCATATUCAA
801
ACGTTTUGAGCGATACGGU
802


ASF361
AGCC

CC







Eubacterium rectale

AATTACTCCTCTCTUCTTTT
803
TACCTTATTATGAUATCGT
804


CAG:36
AACCTTTGATCUG

CATCAAAUCGCC







Cloacibacillus porcorum

TCTCTTGATGUACTTGTTAA
805
AGAGCACTAUTCGACGCUA
806



TAAUGCCG

CC







Cloacibacillus porcorum

AGUGCTCTUAGCGGACGC
807
TTCTAATAGACGUTCACGT
808





GATATUGGT







Blautia coccoides

CTTCCATCCUCAGGTATACU
809
TGCTCTGTAAAUGGAAAAT
810



CCAG

AGTCCAUCAAAT







Ruminococcus bromii

GCGGUATTTAUGAAGAACAG
811
CCCGACAAAATUTCTTCAA
812



CGT

GAGTAUCC







Phascolarctobacterium

CCGTUGCAAAGGCTTUACAC
813
CGGCCCAGUAACCAGAAGU
814



faecium DSM 14760



A







Phascolarctobacterium

GGUTCTGGTTTTUCGAAAGC
815
CCTGUCAGCAATAGTUCAG
816



faecium DSM 14760

GAG

CACT







Helicobacter salomonis

CCUCCACAAATTUGAGGGCT
817
ACAAGGACUATATGAAGTA
818





TAUGCAAGCG







Helicobacter salomonis

TCACTAATCTTTUACTTGCC
819
TGUGGAGGCGTUGGCAT
820



ATCTCUCC









Gardnerella vaginalis

TTGCCGCTAUAGGAGCAGUA
821
ATTCTGCTTTAAUTGAACG
822



A

CAAUCG







Gardnerella vaginalis

AGCAGCAGUCGTGTTUGG
823
AGCGGCAACAACUGAGAUG
824





A







Gardnerella vaginalis

TTTTGGCAACUTGGGCUAGG
825
ACCCAAGUGACATUGCGCT
826










TABLE 16D (PRIMERS/PROBES SEQ ID NOS: 827-1258)












Bifidobacterium longum

ACCAAGGTTCTAGCCGGT
827
GGCTTGGTGGCAGTAAGTG
828






Bifidobacterium longum

ACCATCTGGATTGCCGCA
829
AGTGAAACAACAGTATTGA
830





TGCCG







Clostridioides difficile

ACATTTGCTGAATCTTTTGC
831
TCAAGATAAAGGACATCAA
832


QCD-66c26
TCTTTTTACT

GTGTTAGGT







Clostridioides difficile

CATCTACTGAAGCTGCTTCA
833
TTTGCTCTTTGATATTTTT
834


QCD-66c26
AATTAGT

GCCATACAGAT







Clostridioides difficile

ATCTTGAATAGTAACTTTTA
835
GATTCTGCTAAACTAATCG
836


QCD-66c26
AACTTTGCCCT

AAGAGGTTAGA







Lactococcus lactis subsp.

CAGCGAATAATAATTCCCCT
837
GGATGACTTTCTATCGGCA
838


lactis Il1403
TGACAG

CTTCA







Lactococcus lactis subsp.

GCAACAGCACTTCGTAACGA
839
GGAGAACCAAATTCAACAC
840


lactis Il1403
T

GAGTTT







Chlamydia pneumoniae TW-

AATTCACAGCTTGAGGAAAA
841
TGGCAACATCTGTTCAGGA
842


183
GGTGT

C







Chlamydia pneumoniae TW-

TGCGTTGCTCGCTCTCT
843
TGCACTCTTTCAGAAAGAA
844


183


GGTCTT







Chlamydia pneumoniae TW-

ACGAAGAAGCTGTGGAGAAG
845
CCTTGAGACTACCAGGGAG
846


183
T

C







Chlamydia pneumoniae TW-

AAAAGTAAACAATAAGAAAG
847
CGCGCAACATAGACTCCC
848


183
AGGTTCAATATGC









Fusobacterium nucleatum

AATTGTTCCTCATCAACTAT
849
GTAGCGAGGAGGATTATAG
850


subsp. nucleatum ATCC
TTTAATTCCTTG

TGAAAGA



25586










Porphyromonas gingivalis

GTGGCTTTCTTATGTGCATG
851
TATTCGTAATTAGAGTAGG
852


W83
GATTTG

AGGAGAAGCTTTT







Porphyromonas gingivalis

TGTGGCACATGACAGTCGTT
853
CATAAGGTCTTTGCGCTGG
854


W83
G

T







Helicobacter hepaticus

GTGGCAATTACTTGCGTATT
855
CCTGCTCAACCCCTATCTG
856


ATCC 51449
TGG

G







Helicobacter hepaticus

AGACAAAGTATCAACATTGC
857
CGAAAGCGGGAATGCTCCA
858


ATCC 51449
TCATACCT

A







Lactobacillus johnsonii

AAATGAATGGGTAGAAGCTG
859
TTAAGATAACTAGGTCGCC
860


NCC 533
GTGT

GACTAC







Lactobacillus johnsonii

TTCAGCTTCATTAGAAGACC
861
CGTCAATTTGGACTTTACT
862


NCC 533
TCGG

GATTGGA







Lactobacillus johnsonii

TCACCATCAAGTAGAACTGT
863
CCAGAAGAATTGCTTCCCC
864


NCC 533
ATTTTGTGT

AT







Lactobacillus johnsonii

ACAATATTGGTCTTTTATTT
865
AGCTTATATTGAGGATTGT
866


NCC 533
TTAGCAACTTGT

GGCTACAC







Cutibacterium acnes

TCGGTGTCATTGGGATCGAC
867
CTGGGCGACGACGCTTT
868


KPA171202










Cutibacterium acnes

GTGCCGTCATTGACCAGCAT
869
CGGAGGGCTATCGCGGA
870


KPA171202










Helicobacter pylori 26695

GTGCCTAAAAGCACAAGCAA
871
AGGGAGTTTAAAAATGAAA
872



TTG

CGCTTTCAA







Helicobacter pylori 26695

AAAGGTGAGAGGATTTAGGA
873
CTAGAGAGATAGCACCTAC
874



CTTTTTACTAAA

TATAACAGATTTC







Borreliella burgdorferi

AGAGAAACCAGTTGGCCTTT
875
AACAAATCCTCGATTTATT
876


B31
TGG

TCATGGCAG







Borreliella burgdorferi

AATGGATTTATTTTGATTCC
877
ATTGCCAATATTCAATCTT
878


B31
GAATATGCTTTT

CTAAATTCATCAAT







Borreliella burgdorferi

TTGGCAATGTGATCTTTATT
879
AGAAATGAGATAGCTTTTA
880


B31
GCAATTTAATT

ATAATCACTGCA







Chlamydia trachomatis

GCTGCAGGGATTATTCTTTC
881
AGGGCTCTATCTATCAGAA
882


D/UW-3/CX
TCCA

TCGGAA







Chlamydia trachomatis

AGAGCCCTTCTCGAATATGG
883
AAATCGGGTGCACCTTCTG
884


D/UW-3/CX
GA

TAA







Chlamydia trachomatis

AGCAAAAGCTTGCATATTGG
885
ACCTCTATAGGTGTCCGTT
886


D/UW-3/CX
CA

ATTTTGATG







Campylobacter jejuni

GCGTTCTCCATCTTTTATAG
887
TTATTTTAGTGGGTTCTGC
888


subsp. jejuni
CAGAAATACG

AATGACAAGATA







Campylobacter jejuni

AACAATTCTTTTAGCCTAAC
889
GCGAAAGTTACTTAGGTGG
890


subsp. jejuni
AGTGCCA

TCTTGC







Campylobacter jejuni

GTTATGAAGCTTATTAATGG
891
CCTCAAATTGATCTTCTGC
892


subsp. jejuni
TAGTGGTGATGA

TGAAGTATTA







Bacteroides fragilis

TTGGCGGATACAGCCCT
893
ATCCAGACTCTCCTGATTG
894


YCH46


TCCA







Bacteroides fragilis

GATCTGCCATAGAATCTCGT
895
CGGCTGAAGAAGAGTGGGA
896


YCH46
CG

A







Bacteroides fragilis

TCCGGGCAGCGAGTCTG
897
GGCAGATCGATTGCAGGGT
898


YCH46










Lactobacillus reuteri JCM

AAAAACGGAGGAGACTAATT
899
TGCTTTTGCTTCTTGTAAT
900


1112
AATATGGCAA

TACGAATTAACT







Lactobacillus reuteri JCM

CCGGTTGACCGTATACTACG
901
CACAATCGTTTTTAGCTAG
902


1112
CT

AATCACTGTT







Bifidobacterium

GGAACAGCCGTCTGATCAC
903
AAAAACACTCATTGTTTTC
904



adolescentis ATCC 15703



ATCGTTTTTCA







Bifidobacterium

CCAAAGACTTCGAGTAGGGC
905
GATTGTTCATATGGGCTCT
906



adolescentis ATCC 15703

TTG

CCTATCC







Bifidobacterium

CGCCGAATGATGTTCGAAAT
907
CCGACAATCTCAAGAAAAC
908



adolescentis ATCC 15703

ATGGT

GCTGAT







Lactobacillus rhamnosus

ACGGGTCTTAGCATTGGCTT
909
GCACGCGTCAATTAAGCCC
910


GG










Lactobacillus rhamnosus

TCAATGGTTAAGTTGGCCGT
911
ACGATCACTCAAAATGGTG
912


GG
AG

CG







Bacteroides

CCAAAGCATTGGCATATGCA
913
AAGCCCAATCGTCATCTTT
914



thetaiotaomicron VPI-5482

GATA

GTAGTT







Bacteroides

ACTAATAATAAGGGATTTTC
915
AACTTTTTAGTATCCTTAG
916



thetaiotaomicron VPI-5482

TGAATTTGGTGAT

CGAAGTTGAC







Bacteroides

TGCTCAAAGTGAGAACTTTT
917
TCTGTTTGTGAATAACTAC
918



thetaiotaomicron VPI-5482

CAAATCGTAA

CGTTAGGAC







Mycoplasma penetrans HF-2

AGCATTACTACAAAAAGAAT
919
ATTTAGGGTGTAACAAAGA
920



CAAGCAATAATAA

TGAAAAACATTAAT







Mycoplasma penetrans HF-2

GCACCTGCTTTTATAACATC
921
ACAGAAGAAAATATGTCTG
922



ATTTCCA

CTACAAATAGAT







Mycoplasma penetrans HF-2

GTAATCCTACTTTCATCATA
923
GGTGCAACATGAAATCAAG
924



TGAAGAAGAACT

GTGA







Mycoplasma penetrans HF-2

GAAATTGCTACAGAGATAGT
925
GTAATGCTTTTAAAAATCA
926



CCCACC

TTCTAATGACCCA







Lactobacillus acidophilus

ACTGGCAATTCATCAGAAAA
927
CCGTAGTTTTTCCTTGCTG
928


NCFM
TACATCTAC

ACC







Lactobacillus acidophilus

GGACAGCTACCCTTGTTGCA
929
AAAGCACGATTAATAGTTA
930


NCFM


AATTACCAAAAACA







Lactobacillus acidophilus

CGCTTCAACTGATCATGTAG
931
CAGCATGACTGTTATCAGT
932


NCFM
AAAAAGTG

GTTTGTT







Lactobacillus acidophilus

GGTGTTAAGGTGAATTGGAC
933
CCTGTGCCCAATTCATTAT
934


NCFM
TCAAAC

TAGTATTCAT






Desulfovibrio alaskensis
AAACCTTTGCCGGGCGTC
935
CGCATCAGGCTCCCGCA
936


G20










Desulfovibrio alaskensis

GCGGATATCACGGACGC
937
GGCTGCGGTTGTGGTCG
938


G20










Desulfovibrio alaskensis

AGGTACCGGCCTGCTGCAT
939
TTCGCTGCCCGAAGCCG
940


G20










Desulfovibrio alaskensis

AGCAGAAAGACAGGCATGAT
941
AGCACCTACTGCATCGCC
942


G20
G









Bacteroides vulgatus ATCC

ATGCAGCCACAACCAATCG
943
TTCGGCCACATTCCATCCT
944


8482


AA







Bacteroides vulgatus ATCC

TTGCTGACCAAAACCACCAC
945
TTTTTATGGAATGTTTTTC
946


8482


TGTCGGG







Bacteroides vulgatus ATCC

GTTCCTATTCCTATCTCTTC
947
CCGCCTTTGATAGATCCGC
948


8482
CGGTGG

T







Parabacteroides

AGTCCCAACGCCATTGTGC
949
CAAGGATGTTTATGAACGG
950



distasonis ATCC 8503



CAAAACA







Parabacteroides

GAATATGAGCCATGAGATAC
951
AGAAAGACATGCTACCGGA
952



distasonis ATCC 8503

GTACGC

TTCTATG







Lactobacillus delbrueckii

GAAGCTGGATTTGCCGACCT
953
GCGGGCACAAAACTCTTCA
954


subsp. bulgaricus ATCC
A





BAA-365










Lactobacillus delbrueckii

ACTCAGGCGACTCAGTCTTG
955
GGCGGTTCTGGTCAAGC
956


subsp. bulgaricus ATCC






BAA-365










Lactobacillus delbrueckii

TTCTGACGCCTATGGGACA
957
GGTTGCGGACCTGCATC
958


subsp. bulgaricus ATCC






BAA-365










Campylobacter curvus

CCCACGAATGCGATCACG
959
CAGCAAGGCCGATGAGATA
960


525.92


AG







Campylobacter curvus

GTGACATCTGAGGTAGATGA
961
ACTCGGCACAGATACAAGC
962


525.92
TATGGC

A







Campylobacter curvus

AGCCAGATCTCCACGCTC
963
TAGGGCATATCGATAAAAG
964


525.92


CTGTAATAAAAA







Campylobacter curvus

ATGCCCTAAAAATCGCAAGC
965
GATATGGCTGCAAACGCGA
966


525.92
T









Campylobacter hominis

GCCGGAGTATCAAGATTTAA
967
AGATTGTTTTATTTATTTG
968


ATCC BAA-381
ACCATAAG

CAAAGAGATGACG







Campylobacter hominis

CTTTGCAAAATTTTGCATAT
969
GATTGATGTGGCTATTAAA
970


ATCC BAA-381
TCACCGA

AGTATCGGC







Campylobacter hominis

GCTGACGCTCTCATAAACGG
971
TTGCAAAGAATTTTGCGCC
972


ATCC BAA-381
A

ATTATT







Campylobacter hominis

AGGTTTAAAGTATTTTCTAC
973
ACTCCGGCAGAAAGGGATT
974


ATCC BAA-381
AAAAACTTCAACA









Campylobacter concisus

CATCGATAAGCTCATCATCA
975
TAAATTTATCTCATAGTCT
976


13826
TGCCAA

GAGATATCGACCT







Campylobacter concisus

ATAATACGAGCAGCACACCT
977
AAATGAACCGGATCAAAGC
978


13826
ACCG

TCCC







Campylobacter concisus

AGAGGAGTCTTTTAAAAAGA
979
TTGCGTCAGTGATCTCAGA
980


13826
CTGAAGAAGAT

AACAT







Akkermansia muciniphila

GGCATTCTGAGGTACCGGAA
981
TTTTCGCCTCTCACATTGG
982


ATCC BAA-835


AAATTATT







Akkermansia muciniphila

TGGGCATGATCGGAGAAAGA
983
TTGCCATGGTATTCCTTGG
984


ATCC BAA-835
AG

CG






Akkermansia muciniphila
CCAATTGAACTACTGACCTG
985
CACCGTGGGTGCTGGTCG
986


ATCC BAA-835
TTGGAG









Bifidobacterium animalis

CGCAGTACATGGATCACCTG
987
CGTATGCGATGCGTTCGC
988


subsp. lactis AD011
TTC









Bifidobacterium animalis

CGCATACGTGCAGCGGT
989
GGACAGGTGCCCGGTGG
990


subsp. lactis AD011










Bifidobacterium animalis

CTGTTCTGCTGGTTCTGCGA
991
GCCGTAGTAACAGCCTCGA
992


subsp. lactis AD011










Bifidobacterium animalis

ACTACGGCATCATCGTTGTC
993
GTTACGCGCATCGAGCC
994


subsp. lactis AD011
T









Atopobium parvulum DSM

GCAGCCAGCCCTTCTTG
995
GGCAGAAGATTTGATGCTC
996


20469


CAT







Atopobium parvulum DSM

ACAGCCGCTTGATTATATTT
997
AGAGGTATTCCAAATGCAG
998


20469
AAACTGCC

CTTATTG







Atopobium parvulum DSM

ACGATACCAGTAATACTTAT
999
TGGCTGCTTGGAAACGAG
1000


20469
TAAACTCATCAAA









Veillonella parvula DSM

GCTGGTATTGGTATGATTCC
1001
AAACCAAACCGTTGCCCCA
1002


2008
AGATGG

TA







Veillonella parvula DSM

TCGACTGATATATCAAGAGA
1003
CATCAGCCATGTGTACAAA
1004


2008
AAGAAAGTGTA

ACCT







Veillonella parvula DSM

AGAAACGGCTATACCAATTC
1005
CTTCGTTCGTAATAGATGG
1006


2008
ATGAAGAG

CTCTACAATAAG







Citrobacter rodentium

GCGGAATGGCGTTTACAGT
1007
TTTAGCTTATCAATAGCAC
1008


ICC168


AATTTTAGAAAACA







Citrobacter rodentium

GCCACCCAGCCATGATG
1009
GCGCGGTGGAGGTGTCTA
1010


ICC168










Citrobacter rodentium

ACTATGAATAAAATTTATTT
1011
TGGGTGGCGGAGCATCA
1012


ICC168
CTCTCAAGACCCG









Citrobacter rodentium

CTGGATACGCAGACCGATGT
1013
CATTCCGCTGTTTCATCTG
1014


ICC168


CA







Streptococcus

ATGTTGTTCAAGGTGACGGT
1015
CAAGGTTTCAAGGAACATT
1016



gallolyticus UCN34

ACTG

GAAGTGATAA







Streptococcus

CAAAACAGGAGATAAGATTT
1017
AAACAGTTCAGCACGTTCC
1018



gallolyticus UCN34

TTGTCACAGGA

TGA







Streptococcus

CGGTGACACCTAAAGAACTG
1019
TGACGATATCCTTTTTATT
1020



gallolyticus UCN34

ATGATATTCT

CAAGTCTCTAAGG







Enterococcus faecium

TAATGAAATCCAAATATTCT
1021
AACGAGCTAGCGATCGCA
1022


TX0133a04
CTTTCTTTATGGC









Enterococcus faecium

TCCTGCAATCACCGGCA
1023
TCACGCCGATGAATGAAGA
1024


TX0133a04


G







Enterococcus faecium

ATTCTACCCATGTCTCTGGG
1025
AGAAAAACCAAAAGCAACT
1026


TX0133a04
ATTTTGA

GGTACG







Peptostreptococcus

GGATTCATGGATAGGAGAAA
1027
TGCCGCCTACCTACCAGTA
1028



stomatis DSM 17678

GGCT

TG







Peptostreptococcus

GTATCCTAGATATGTCATTT
1029
AGAGATTGATGACCTGACT
1030



stomatis DSM 17678

AGGTCTTCTACA

ATAGAGTCT







Peptostreptococcus

TTGAACTTGAATCGACCCTA
1031
ATGAATCCAAATAGGGATT
1032



stomatis DSM 17678

TGCA

CTGACTATGT







Peptostreptococcus

ATCTCTATATCAAAGCTCCT
1033
AGGTTTAGGAAGGAATTTA
1034



stomatis DSM 17678

GGACACA

CAACTGAAAATA







Mycoplasma fermentans JER

TCCTTGCGACTTTTGCAAAT
1035
AAAGATCTTGATTATGAAA
1036



AATATTGA

TTCAAGAGCAATT






Mycoplasma fermentans JER
TTTTTCAGCTTGCAAACGCT
1037
TGAATTGCCTATTTATACA
1038



TTATTAAATT

CGCAATAAATTT






Mycoplasma fermentans JER
TCGGTTAATTTACTGAATGC
1039
AATACAAATAATCTATCGC
1040



AAAAAGTAAAAA

TTTTTGGGTGT






Mycoplasma fermentans JER
TTTTACATTCTGTTTACCAG
1041
GCCTTCTTCAAATTCTTTA
1042



GATCAATTACA

TAGCTTTTTGC







Eubacterium limosum

TTTCGCGGTGTAGAGCCG
1043
CTGCAGAGCCGGCCCTC
1044


KIST612










Eubacterium limosum

GCTGAGCCGGTCAATGC
1045
AGTGTGGCACCAATGAACC
1046


KIST612










Eubacterium limosum

GTTCCGGTAAAAGCAGGTGT
1047
ACCCGCTGGTCAATTTCTC
1048


KIST612


T







Eubacterium limosum

CACCTTACATGTAAAAATTC
1049
CCGGAACCCCATCCCTGT
1050


KIST612
TTGCGATTTC









Blautia obeum ATCC 29174

CTTCTGCATCCCGAACCTCC
1051
TATTTCGTTGGCAATAGAA
1052





GAGCCA







Parabacteroides merdae

CACTTTTATACTGTACCTCG
1053
GGGCGTAGTCGGTGAGT
1054


ATCC 43184
ACCACA









Parabacteroides merdae

CGACCCTGACACTTTTTGCA
1055
TCATGATGAGAACTTGGAG
1056


ATCC 43184
TT

ATAAAGCCT







Parabacteroides merdae

CTACGCCCACTTTAAACTGT
1057
CAGGGTCGATATCGATATC
1058


ATCC 43184
GG

GATAATGT







Parabacteroides merdae

TCCTTGCAGGCATTCAGGT
1059
ACTGACTATAAATTGATAT
1060


ATCC 43184
TGTGTGATGACAG









Faecalibacterium

AAGCCGAAATCTGAATGACC
1061
TCGAAGAAGCACTGCATCA
1062



prausnitzii M21/2

GA

TGTC







Faecalibacterium

GTGCAGGCGATCTACAACAT
1063
AATAATTATCAGTTGCTCG
1064



prausnitzii M21/2

TC

CAGCCT







Parvimonas micra ATCC

CTAAAGCTTTGTCTATCTTA
1065
GGTAACTCAGACGAGTTCT
1066


33270
TCAACAGCT

CGTG







Parvimonas micra ATCC

AGATGGATTGTTTATCCAGT
1067
GGAACTACACTTTCTTTTA
1068


33270
TTTCTGTG

ATGCTTTTAAAGAT







Parvimonas micra ATCC

GCGAATAAATATTCTACTGA
1069
TCTTGTTGCCTTCAGTTCC
1070


33270
CGCTTCAT

AACT







Parvimonas micra ATCC

CCATTGTTGAGTCGTCAGCT
1071
AGCTTTAGCAAGAGCTATA
1072


33270
TCATTTAT

AACCAAGT







Streptococcus infantarius

GCTGAGACAATTCTTTTTCG
1073
GCCAGAAGCGACAGTAGCT
1074


subsp. infantarius ATCC
AACTCA

TA



BAA-102










Streptococcus infantarius

TGATATCATCAACATTAAAC
1075
ACCAAGCTTTTATAAGAGA
1076


subsp. infantarius ATCC
ATCTCATAGTCC

GTTGCTCT



BAA-102










Streptococcus infantarius

AGCTTGGTAATTCAGACAAA
1077
GTCTCAGCATGATTATTTC
1078


subsp. infantarius ATCC
TCAATTCG

CATTCACG



BAA-102










Bifidobacterium bifidum

CGTCGCCAAGCCTTCGA
1079
TGGTTCTGGTCGACCTGT
1080


NCIMB 41171










Bifidobacterium bifidum

GACCTCGCTTACCCGGAA
1081
ACCTCCTGAATCTTATCCG
1082


NCIMB 41171


CGA







Bifidobacterium bifidum

CACGGTGGCCGCTTTAATG
1083
TGGCGACGGTACTTGGC
1084


NCIMB 41171










Bifidobacterium bifidum

CATCAGCGTCAAATCAGTCA
1085
GGTACGCTGTTCGCCGT
1086


NCIMB 41171
ACCG









Collinsella stercoris DSM

AGGAGTAGACATCCATGAAT
1087
TTCGCGTCATGGCATATGC
1088


13279
CCG

T







Collinsella stercoris DSM

GGAACTGGATGTATCGCGAT
1089
GTCGCCAAATGGGCGAT
1090


13279
GA









Collinsella stercoris DSM

TGTAAAACCGGCGAGGTGG
1091
CGCTCAAATGTCCTCGCT
1092


13279










Collinsella stercoris DSM

TTTGAGCGCACAAGTAGGGT
1093
CCAGTTCCCAGTCCATGCA
1094


13279










Roseburia intestinalis

CCGGTTTCCCTGGTTCG
1095
CTGAATTTACGCGTGAGGT
1096


L1-82


GA







Roseburia intestinalis

CGATCACTCCAAATCCGGAG
1097
AACCGGGTGGCAGCCGTA
1098


L1-82
CATA









Roseburia intestinalis

CGGCACCTTTCTGGCAC
1099
GACTGTGGCTTGCTGCA
1100


L1-82










Roseburia intestinalis

CTGCCCGGTATTTCGCATT
1101
ACGGGCACAGATTATCGTG
1102


L1-82


T







Enterococcus gallinarum

TTTGGAGCAATGATTATCGG
1103
CTCCAATTAAGCCTGCAGA
1104


EG2
TCCATTAA

AAAATTACG







Enterococcus gallinarum

ATTACGGTACCTGGAAATGA
1105
GATAGCACGACCGATCAAA
1106


EG2
AGGCT

TAAAAATACTATT







Enterococcus gallinarum

ATGGTTGGTATGGCAGTTAT
1107
TTGATAATGCCTTGTAAGA
1108


EG2
TGGC

ATGCCC







Prevotella copri DSM

CCACACCATTTTTGCCCTTT
1109
CGGCTTCACCCAGTTCG
1110


18205
CAC









Prevotella copri DSM

TGAAGCCGGATGGCTTGA
1111
TCTTCAAATTTTAATTCTT
1112


18205


AGATGTTGATCCAC







Holdemania filiformis DSM

CGTCCCAGCTGACGCAA
1113
TCGGTATGGGATTATCCGT
1114


12042


CCT







Holdemania filiformis DSM

CTTTAAAATCAGATCCAGAT
1115
TGAAGAAAATTCCGCCGCT
1116


12042
TTTCATGTTCCA

GA







Holdemania filiformis DSM

GCCATAGACCGCTCTGACTT
1117
CGCAGCTCAGACCATTCAT
1118


12042
CC

TGG







Holdemania filiformis DSM

TTGGAAGACGTCATCCTCGA
1119
TCAATGCAACCCTTTCCCA
1120


12042
TATAATGA

G







Helicobacter bilis ATCC

AGAGTGAGACAATTACGCTA
1121
TTGATATTTCATTTTCAAG
1122


43879
CCTTG

GTGTTTAAAGTGAG







Helicobacter bilis ATCC

AGATTCTAAAGAAGTGCTAG
1123
TTGATGACATTTTGAGAGA
1124


43879
ATTTAAGTGCG

ATGTCTTGCATA







Slackia exigua ATCC

GGAATGTGCGTCGAACGG
1125
CCAGCTGCGGTTGCGATT
1126


700122










Slackia exigua ATCC

CCGTACCGGATTCCAGCGTA
1127
GTCTGGAATGTAGAACTAT
1128


700122
T

GCGATGATATAT







Slackia exigua ATCC

CTCTTGGCGCGAATGGAC
1129
TGGGCGGCTATCTGGATG
1130


700122










Anaerococcus vaginalis

AAGGACTTATGCCTCAATTA
1131
TCTACCGCAGATAAAACTC
1132


ATCC 51170
ATTCAACC

CCACTA







Anaerococcus vaginalis

ACCTATAGTCATATCAACTG
1133
AAGTCCTTGCATCCACTTT
1134


ATCC 51170
GAATTGCG

GG







Anaerococcus vaginalis

TACTGGAGATGTATTAGTGG
1135
TTGCATAATAATTTGTAAG
1136


ATCC 51170
GAGAAGTT

GTTTTTCATCCTC







Collinsella aerofaciens

CGTTCCATCCCACCCCT
1137
GCATCCAGAATGCTTTTCT
1138


ATCC 25986


TACCG







Collinsella aerofaciens

TCCCCAATCTTCCGTATAGC
1139
ATCAGCGAAATGCCGTTCA
1140


ATCC 25986
G

AA







Collinsella aerofaciens

AAAGACCGCCGTTGCGGTTT
1141
ATGGAACGGCCCATGCA
1142


ATCC 25986
TA









Dorea formicigenerans

ATGCATCTGTTTCCTGGCCA
1143
TTTTGCAATCTGAATGTGA
1144


ATCC 27755
T

TCTGGG







Dorea formicigenerans

AAACAGATCACGTCCAAGGT
1145
GGGCCGATGCATGGAGA
1146


ATCC 27755
CATC









Dorea formicigenerans

ATCGGCCCAGTATCCGAT
1147
ATCCGGGTTGATTAGGAGG
1148


ATCC 27755


AAGA







Dorea formicigenerans

TTGCAAAATAACATTTGTAA
1149
AAGAGGGCAGAGTATGCCG
1150


ATCC 27755
TCCCAATTTCC









Ruminococcus gnavus ATCC

ATGCCCTGGATTATCCCAAT
1151
TTCAATGCCTCATAATGCA
1152


29149
GAA

TCTGATC







Ruminococcus gnavus ATCC

TCAACAGCTTGAGTAGTCTC
1153
TTCTGCAGTAACTGCAGGG
1154


29149
GTC

TAC







Ruminococcus gnavus ATCC

ACGGAATGTTTTCCGCAATC
1155
CAGGGCATAAGAGGCATAA
1156


29149
GTT

GC







Ruminococcus gnavus ATCC

TGCAGCATCACCTGCTGA
1157
GCTGTTGAAGGGCTCGG
1158


29149










Campylobacter rectus

GCTTATTACGCACAATAGCG
1159
AATAGTTTTGTAATAACAA
1160


RM3267
AATTAAAACA

GATGCAACCAG







Campylobacter rectus

AACCGAAGAAGGAGAGTTAA
1161
CGGTAGTGGTGGTGTTATC
1162


RM3267
AGACTT

GTTAAAT







Campylobacter rectus

CAGGTTGAGGGCCATCTAAA
1163
ATTGACAAAATCATAGTTA
1164


RM3267
TAATTCA

AAAACTCCTTTGAA







Campylobacter gracilis

TTTACTACCATCGCGCCGAT
1165
ATCGCCGCGTTTTGCGT
1166


RM3268
ATTT









Campylobacter gracilis

AAACGGCTCATCTGCGTCA
1167
GTTGCACCGTAAAAGAGAG
1168


RM3268


GACT







Peptostreptococcus

AACCTAGCCATACTAGTATA
1169
GAGTTGGTATCAGGAGATG
1170



anaerobius 653-L

GTCCCTT

AAGAAGC







Peptostreptococcus

CTGCAAACACATCAAAATAA
1171
TGCCAAAAATAAGATACAC
1172



anaerobius 653-L

AAGGCAG

CTTCCTATAAGA







Peptostreptococcus

ACCAACTCTATATCGGCAAA
1173
ACCTGAGGGTGACGACTTG
1174



anaerobius 653-L

ATTTGT









Peptostreptococcus

TGTCCCTCAACCTAATTTTT
1175
GTTTGCAGATAGGTGTTCA
1176



anaerobius 653-L

GGCTT

AGCA







Prevotella histicola

GTTTGGCTCAGGAAGAGAAA
1177
GATACTACCATCGCTAGAA
1178


F0411
CCT

ACACAGAA







Prevotella histicola

GCAAAGGCAGAGGTGGACAT
1179
TCAAACGAACAGCCTGTTC
1180


F0411
TAC

C







Prevotella histicola

TCGTTTGACGAATAACATGC
1181
AGAGCCTATCATAGAAGAC
1182


F0411
CG

ATCAATAGC







Prevotella histicola

AGCACCTACCTTCTGGATGA
1183
AGCAGCACAGGTCCTGTT
1184


F0411
TC









Helicobacter bizzozeronii

GGATAGCATGGTGCATGTTA
1185
GTTCCACAAGAGAGATGGG
1186


CIII-1
CAGATAT

CA







Helicobacter bizzozeronii

TTTGGGCAGTAACCTCTAGG
1187
TGCCCTAGAAGCCATTTAT
1188


CIII-1
G

GACAAA







Helicobacter bizzozeronii

ACTGATATGCACGCCATAGA
1189
CCAAAGCATTTTAACCGAA
1190


CIII-1
TCAC

AATGGT







Enterococcus hirae ATCC

GGCGTTGATACCCCAGC
1191
TTGTCAGTCTATATTGTGA
1192


9790


GATGTTTCTCAAA







Enterococcus hirae ATCC

TGGTCCAACAGCTGTTTCTA
1193
ATGAAGCAAAAGAAAAATT
1194


9790
CTT

ATTAGCACAACAA







Enterococcus hirae ATCC

TTTTTGAGGCTAACTTTGCC
1195
TCAACGCCTTCTGGTATTC
1196


9790
ATTTCT

CC







Enterococcus hirae ATCC

AGATTCGGACCAAGTTTAAC
1197
ACCTTTAGGGAAGTACGGT
1198


9790
TCTTCAA

ATTGAA







Bacteroides nordii

ACCAAGACTGCTGACAGCAT
1199
TGCAGGCACGTATATTGGC
1200


CL02T12C05
ATG









Bacteroides nordii

TGCCTGCATTGTGATGGAG
1201
AGACGACGTGTCCAACTAT
1202


CL02T12C05


CAG







Barnesiella

CGAAGCAATTCAATAAAACA
1203
AGTTGCGTATTATCCAGTT
1204



intestinihominis YIT

CGAAAGTG

GCGA



11860










Barnesiella

CGATGAATACTAAGCTCATA
1205
TTGCTTCGAAGTAAGCGAT
1206



intestinihominis YIT

CTCTTCGG

ATATTGTTTTT



11860










Barnesiella

AAAATTGCGACCTCCCGAAA
1207
AATTTTCTCACGGATACTC
1208



intestinihominis YIT

AAT

ACATTAATTTCGT



11860










Barnesiella

ACCGATAATTACACCAAACA
1209
CGTCGATCAACAGTGCGTT
1210



intestinihominis YIT

ACATGG





11860










Lactobacillus murinus

CCGATCACATAAGCCACACC
1211
GTGAGTCAAATATCATTGA
1212


ASF361
TAAC

TGTGATCGT







Lactobacillus murinus

TCATCTGGAGCGACGTGA
1213
GAACTGACCAACAAAGATC
1214


ASF361


AATGGA







Lactobacillus murinus

ATCCGTGCCTTAAGTAGTTT
1215
CAAGGAAGGTATAAATGAT
1216


ASF361
GCT

ACACATTATCCCA







Lactobacillus murinus

CCTTGATGCTTGGCTTGATG
1217
GGCAAAATAAGCTCCTAAA
1218


ASF361
TT

ACATCG







Eubacterium rectale

TCCTACCGTAAAGCTCTGTG
1219
TTTATTAGGTTTGATTTTT
1220


CAG:36
TTAC

CAGACCTGCCT







Eubacterium rectale

AGGTATTTTCTCTATCCTCT
1221
GCAGGCACTTTTAATATTC
1222


CAG:36
TCCCTTTAAAACC

AATGTTCCG







Cloacibacillus porcorum

GAAAGGGTCAACATTGCCGT
1223
GCGATCGCCGTCGTGAC
1224






Cloacibacillus porcorum

GAAGGTGCCGATCGAGAAGT
1225
ACCCTTTCAGGATTGGCAC
1226



G

A







Cloacibacillus porcorum

ATAACCGGCGCGGTCTT
1227
ACCTCCGTGACAGAGGGA
1228






Cloacibacillus porcorum

CGATCATCACGTTTGAGGCT
1229
TATGAATCTTAGCGCACGC
1230



TTG

AATC







Blautia coccoides

GACTCAGATTTTCAACCCCT
1231
TGCTTTATACGCATAAAAA
1232



GTCTG

TAAGCTTAATTCA







Blautia coccoides

ATACTCCAGGGCACTTGCCG
1233
TTTACCCTTGGGCATTACC
1234





GTATATACTA







Ruminococcus bromii

ATTAAGGTTGTTGAAGAAAG
1235
AATACCGCCTCACTTACTA
1236



CAGAAGAA

TAGCC







Ruminococcus bromii

TTGTCGGGACTTCTTGATTA
1237
TCGGTATCGCAGCTGAATT
1238



TGCA

TATAGT







Phascolarctobacterium

CTGACAGGGACAGAAAGTAA
1239
TGCAACGGCTTTGTACTCA
1240



faecium DSM 14760

CG

CT







Phascolarctobacterium

CCGATCGTTCCGCTTTCA
1241
GTAACTATCAGCGGCGGTA
1242



faecium DSM 14760



CT







Phascolarctobacterium

ACATCGATGTTTTTGATGGC
1243
ACGATCGGCGGCGATAT
1244



faecium DSM 14760

TTTAATATTGC









Gemmiger formicilis

GAAAAGCCATTTTATATTCT
1245
CTGAAAAAGATTGGTGACA
1246



CCTGTTCTTTTT

TCACAGATAT







Gemmiger formicilis

GCACTGCGCCAGATAGGTA
1247
GGCTCGGTTTCCGCGAT
1248






Gemmiger formicilis

TGTAAGACCTGCGCGTTGTG
1249
GCGATAGCCTGACCCAGTT
1250






Helicobacter salomonis

GCAAAACCCTCTCTTGCTTG
1251
TGGTGGCCTTGATAAGAGT
1252



T

TTGA







Helicobacter salomonis

GCTGCTCTTCCTGTCAGGTA
1253
GGTTTTGCAACAAGGGCTT
1254



TTTTAG

C







Helicobacter salomonis

GCAGGGCTGGCGATCAA
1255
ATGGGTTTTAAACGCTTGA
1256





AAAATGC







Helicobacter salomonis

GCCAAGGCCTCTCTTCTCA
1257
AGCCCTGCCCCTAATTGG
1258










TABLE 16E (PRIMERS/PROBES SEQ ID NOS: 1259-1298)












Gardnerella vaginalis

AGCAGGCCTTTTCTCAGGA
1259
GGAGCAACTTGTTAGCAGA
1260





TGG







Gardnerella vaginalis

TTGCAGGTTTTGCGAGT
1261
TGCCAAAAAGCCTTGAGAA
1262





TATTCTG







Gardnerella vaginalis

TTGACGAATCTATTTAAACC
1263
GGCCTGCTACTAATTCACT
1264



TTACCGC

TATTGC







Klebsiella pneumoniae

ACGGTGGTCGCTGTACTG
1265
GCAGGGTGCTGACCGAG
1266






Klebsiella pneumoniae

TCAGCGCGCAGAGAATACTG
1267
ACCACCGTAACCGGCTC
1268






Klebsiella pneumoniae

CACCCTGCGGGCTGTCT
1269
GGATTACGCATCGGATCGG
1270





G







Escherichia coli

TGCAATCTTGTGAGTGGCAG
1271
CTCGACCACCACGAATCGC
1272



A









Escherichia coli

CAATCTTCGGCGTTTTGCTG
1273
GTTGAAGATGACATGAGCG
1274



TAT

TTGAC







Escherichia coli

CGCCAGCGAAGGCTATTT
1275
GTGGTGGATGTTCCTCTGG
1276





TG







Enterococcus faecalis

CGTAGCCAAAACTAATCCGG
1277
CATTTGGACTTAAGAGGTA
1278



ATTG

TTGCGATTT







Enterococcus faecalis

GCATTAAGAGCAAATCACTG
1279
ACGATTATTTTAAAAGCGT
1280



GGAATT

TAGAAGAAGCC







Enterococcus faecalis

GTTGTTGTAAATGCCATGGG
1281
TCTGAAGTACTAGTTGCAG
1282



TTCC

TGATTCAAC







Proteus mirabilis

CTTAAAGAAAGTCATAATCC
1283
GCGTTTGGGTTTATGAGCT
1284



TCACCTTCCC

TGAAA







Proteus mirabilis

ATAAAGAAGCATATGGTGAA
1285
GGCATTTGCGCCCATACTG
1286



AAATAAAACTCTG









Proteus mirabilis

GTCGAGTACGACTTGCGAGA
1287
GAGTCACCTATATAAGCAT
1288



A

CACTCTATAAGAT







Escherichia coli

TCGTCGGTTCTGGCCTACT
1289
GAGAAGCGACGACATGATT
1290





AACTCT







Escherichia coli

CGCCACGGCAATGGTTTC
1291
GCGCAAACGTGGTTAATGG
1292





TA







Escherichia coli

CGTTATGTCGGGCGAACCA
1293
GCAATCATGGAAAACATCA
1294





ACGTCAT







Escherichia coli

CTTCACCGCCATTTCCGTAA
1295
CCACACCGTTAGCAGCAAT
1296



C

CA







Escherichia coli

GACCCATCCGGCTGATACC
1297
CCGTGCTCGGCAATTTTAC
1298





AT











TABLE 16F (PRIMERS/PROBES SEQ ID NOS: 1299-1604)












Escherichia coli

CGCATTGGTGAGCTGGC
1299
GACAGCAACTCGCGGATC
1300






Escherichia coli

TCCGTATCGATCCTGAACAC
1301
GCCACTCGCCCCTTGTT
1302



CA









Bifidobacterium longum

GTACCCAACGGGCCGTTT
1303
CAGATGGTGCCCAGACG
1304






Bifidobacterium longum

GCCTCGCGCGAGGGATT
1305
ACCTTGGTCGAGGCCGCT
1306






Clostridioides difficile

ACAAGAAAGGAGCGATAACT
1307
CTCATCAATATTTAAAGCT
1308


QCD-66c26
TTGGTT

CTTTGTTCAGCT







Clostridioides difficile

TGGATATTAAAAGTAAAACT
1309
ATCATGTTATCCCTCCCAA
1310


QCD-66c26
AGCTGATGTGG

TTTGTTCT







Clostridioides difficile

TTGATGAGATATCAACGGAA
1311
AATTTCTCACCCTAGTAAA
1312


QCD-66c26
ATAACTAGTATG

TACTGTTTCTC







Lactococcus lactis subsp.

GCTCCTTGAGTATAACCATT
1313
TCAGATGAAACAAAAGCGG
1314



lactis I11403

GGTC

CTTTC







Lactococcus lactis subsp.

GAAATTCACTTCATCAATTA
1315
GCTGTTGCAACTGCTTTGT
1316



lactis I11403

TACCATAAACCAT

CA







Lactococcus lactis subsp.

TCTAAGGCAATTGCTTTTAT
1317
TTTCTGAAATTCTCTTTTA
1318



lactis I11403

CATTGGG

TGTCATTTTAGGAC







Lactococcus lactis subsp.

GAATTCTACAACCATCTTCA
1319
ATTCGCTGATTTTTCAGGT
1320



lactis I11403

CCACTTCA

ATTTGCT







Chlamydia pneumoniae TW-

CGCCTATGTTCAGGCAGC
1321
CGTCACTCGATTCCCCGT
1322


183










Chlamydia pneumoniae TW-

GAGTGACGGAATCTTTTACC
1323
GAGCCTCTGGGTTCTGCTG
1324


183
CC









Fusobacterium nucleatum

CAAAACCAATTAAAGAGTTA
1325
AAATGTGGATTAATTTGAC
1326


subsp. nucleatum ATCC
GAGCAACATATG

TGTAAAGTGCAT



25586










Fusobacterium nucleatum

TCCTATCAAGAATACTCATT
1327
TGGTTTTGTAATTCTTCTC
1328


subsp. nucleatum ATCC
GGACATTGATT

AATACACTGAT



25586










Porphyromonas gingivalis

CAACCTTTAGCCTCGCCATA
1329
ACCTTTGAAAATACAACAG
1330


W83
GAA

AGGTGATAAA







Porphyromonas gingivalis

GAGTCGACAACCGTCTGC
1331
CGAGTATCTGCTGAAATGA
1332


W83


GTGATATAAC







Helicobacter hepaticus

AGCTAAGGCGCCTTGCAA
1333
GTCTTCACCTTTAGAATCC
1334


ATCC 51449


ATCGCT







Helicobacter hepaticus

TGCAAAGCTAAGCAATTTAG
1335
GCAATGTTGATACTTTGTC
1336


ATCC 51449
TCAAGCTTTTA

TTCACCT







Lactobacillus johnsonii

TTCAAAAACATATCTTCTAG
1337
ATGGCCACCATAATTTTGC
1338


NCC 533
ATCTTCTTGGT

TTTTAAAGG







Cutibacterium acnes

GCCCTCCGCATCGCTGT
1339
ACGTATACCAGGCTCAAGG
1340


KPA171202


CT







Cutibacterium acnes

TCGCCCAGGTGCTCTCC
1341
GGGTTGGTGGAACGCGA
1342


KPA171202










Cutibacterium acnes

GCCGCAGCCGAACTGGTT
1343
ACCGCGAACTCGGGTGG
1344


KPA171202










Cutibacterium acnes

TTCGCGGTCGACACCAA
1345
GAGTTGAGGTGCTGATCAA
1346


KPA171202


CG







Chlamydia trachomatis

GTGAGCGAATCAAGAAAGTT
1347
AGAAGCTAGTGTATACACT
1348


D/UW-3/CX
CGT

GCTTGTC







Campylobacter jejuni

TATAGTTGGCGTGGAGCAAA
1349
ATTTTAATATTTTCCCCAG
1350


subsp. jejuni NCTC 11168
AATTGA

TATCTTTAGTGCA



ATCC 700819










Campylobacter jejuni

AAGGTCTAAATTTTGTCCAT
1351
TGCTTTCTCAAAAAGGATC
1352


subsp. jejuni NCTC 11168
CTAGCATG

TCAAGGT



ATCC 700819










Campylobacter jejuni

AAAACGAAAACGAAAAAGAT
1353
TCTCTCATAAAAACGCATA
1354


subsp. jejuni NCTC 11168
GAAGGTTT

CCACTAAGT



ATCC 700819










Campylobacter jejuni

AGAAAGCATCAAAACCAATA
1355
TGCTATAAAAGATGGAGAA
1356


subsp. jejuni NCTC 11168
AAGGATCAG

CGCTATAGT



ATCC 700819










Campylobacter jejuni

AATAAGGTTTTGATTGCAAA
1357
CCACTTTAGACATAGGTGG
1358


subsp. jejuni NCTC 11168
ATTCTTTAGGAA

TGGT



ATCC 700819










Bacteroides fragilis

AGCCGCAAATGAATACGGC
1359
CCATGGAGCTGGTTGGTTG
1360


YCH46










Bacteroides fragilis

GGTATAAATGGATCGTACGT
1361
TCCGCCAACAAAACCTATG
1362


YCH46
TTCGA

TCT







Bacteroides fragilis

GGCAACCACTTCCGGAATTT
1363
TTGCGGCTGGATGAGGT
1364


YCH46










Lactobacillus reuteri JCM

GGGATGGACAATTATTTTAT
1365
CTGGCCAGTAACGGCGA
1366


1112
GGATTCTGA









Lactobacillus reuteri JCM

AGTATTTTGGCTCACCAAGC
1367
TGCCATTGATCCACCTCAC
1368


1112
ATCA

TT







Lactobacillus reuteri JCM

ATTATGATTATTGGTGGAGG
1369
AATTACGCCAACGTACCCA
1370


1112
ATGGTATACTGT

CC







Lactobacillus reuteri JCM

CTGGCCAGTATTTTGGCGGT
1371
ACAGTTGAGGCTGAGAGAA
1372


1112


AACTTTG







Bifidobacterium

GCCAATCCCCGTCATAGC
1373
GATCCGGCGGCTGATATTC
1374



adolescentis ATCC 15703











Lactobacillus rhamnosus

CCGGTTTTTGCGCGCTTC
1375
GCGATGGCAGAAGCGTT
1376


GG










Lactobacillus rhamnosus

AAACCTTGATGATTGCTTTT
1377
AGACCCGTAATGCCGCCT
1378


GG
GGCAA









Lactobacillus rhamnosus

GCCATCGCTCTTGGCGT
1379
ACTTTGGTTTGAATCAAGA
1380


GG


CTTGATCAC







Lactobacillus rhamnosus

GTGATCGTCATGTGCGATCC
1381
AAGGATGGATCAACCGTTA
1382


GG


TCCTTAATAAA







Bacteroides

GCTGGTTCAGGTATTACAAC
1383
GCGATTACGATTTGAAAAG
1384



thetaiotaomicron VPI-5482

TGC

TTCTCACTTT







Bacteroides

TTATTGCAGGGTATGGTAGC
1385
CTCCACCTAGTCCCTGTCC
1386



thetaiotaomicron VPI-5482

CAG

G







Bacteroides

AAATCACCCTGGTGGAGCGA
1387
CTACTGCCTTCTTCCGGGA
1388



thetaiotaomicron VPI-5482



AAAT







Lactobacillus acidophilus

TAATGACGAGATGCGTTTGG
1389
CTGAATTAAATTTAGTGCA
1390


NCFM
ACAG

TTTTCTAGCAAAGC







Lactobacillus acidophilus

CGGTTTAAACGATGCTACTC
1391
TTTCAGCATACCAAAGTGG
1392


NCFM
TCGA

ATATTTCCAT







Desulfovibrio alaskensis

GCAGCGAAAGTCCGTCG
1393
ATATCCGCGCTGCGCTG
1394


G20










Desulfovibrio alaskensis

CTGATGCGTGCCGTGCC
1395
AGAACAGCGCATGCGCTC
1396


G20










Bacteroides vulgatus ATCC

CAAACCACTTGTTCAACTTC
1397
GTCAGCAATGTAACCGTCA
1398


8482
CCTG

GG







Bacteroides vulgatus ATCC

AGGCATGAGCATGAAACGC
1399
GGTGGAACTGACCGTAGGC
1400


8482










Bacteroides vulgatus ATCC

TTTGGCCACAGCATGGGA
1401
AAAATGCTTCTTGTTCCAG
1402


8482


TTCATCC







Parabacteroides

CCCGGTCGTGTTTATGGG
1403
TCTGCGCATTCATGTCCG
1404



distasonis ATCC 8503











Parabacteroides

CCGGAGGGAGTGGAGTT
1405
CCCTTACCCGTATCTTTCA
1406



distasonis ATCC 8503



CGG







Parabacteroides

GAGGAAAAGGCGGAGTTTAT
1407
CCCTCCGGCATCATCAATT
1408



distasonis ATCC 8503

AGATCTG

G







Parabacteroides

TGCGCAGATTCAGGATATTT
1409
CCAGATACGCTTTATTATA
1410



distasonis ATCC 8503

GTGC

ATAATTCTCGCC







Lactobacillus delbrueckii

CCGCAACCTGGTCTTAAAGA
1411
TTGGCGCTTCAGCCAGTAT
1412


subsp. bulgaricus ATCC
G





BAA-365










Lactobacillus delbrueckii

GTGCCCGCTGAAACGGA
1413
CCGGTCAAGTTCCGGGCA
1414


subsp. bulgaricus ATCC






BAA-365










Lactobacillus delbrueckii

CCTTTAAGAGCAGCCGGGAT
1415
GCCTGAGTCAATCCCGACC
1416


subsp. bulgaricus ATCC
T





BAA-365










Campylobacter curvus

GCGACGGATGACGTCCT
1417
GGATGTTGCTAGCTAGCGG
1418


525.92










Campylobacter hominis

CGCGCGAAATGCTGAGA
1419
GGTGAAGCGATGGCGAA
1420


ATCC BAA-381










Campylobacter hominis

GCTTCACCGAATCCGTCG
1421
TTTAGCATCTATTGAAAAT
1422


ATCC BAA-381


AGTAGGATTTCACC







Campylobacter concisus

GGTTTGATGGCAAAAATTTG
1423
AGTTAATTGTGGCTTTAGC
1424


13826
TGTGGT

TAGGATAAATT







Akkermansia muciniphila

TGTAAGCGGCGTTGTATTTG
1425
CCAATACCAGTCCAGTGCA
1426


ATCC BAA-835
TCC

GC







Akkermansia muciniphila

CATTATCAACGGTTTTCAGC
1427
AAGAAAGTAAACCTTACTA
1428


ATCC BAA-835
GTGTAG

TCACGGC







Atopobium parvulum DSM

TAGGTCCTATATTCCCCAGA
1429
AGTAGTTTTGTCTACTCTT
1430


20469
CTCAAA

GGAGTAGTG







Atopobium parvulum DSM

TATTGATTCAAGTTTTGTGA
1431
AGGACCTATACTTGCAATT
1432


20469
AGAGAGAAAAAC

TAAACGAC







Veillonella parvula DSM

GGAACTCACGAACTGACCAA
1433
CAGATTTAATAAAAGCATC
1434


2008
AGA

CCCATTTTTAGCC







Veillonella parvula DSM

GCGCCCATTCCAACTAATAC
1435
CCAATGTACAGGATAACTC
1436


2008
ATTATCTTC

TGTATTACACG







Veillonella parvula DSM

AAAAGAAGCGGATAGTTGAG
1437
CACTTGTGGACTGTAGAAT
1438


2008
TTAATCAGC

ATGGCA







Citrobacter rodentium

GCTGGATGGCGGTATCACT
1439
GAGCATCAATCCATGTCGG
1440


ICC168


AT







Citrobacter rodentium

TGATGCTCCGCCACCCA
1441
AGGTGTCTACGGCACTCAC
1442


ICC168










Streptococcus

AACTTGAAAAAGCAAAAGAT
1443
AGCTTATACTAACGATAAT
1444



gallolyticus UCN34

ACAAGAGTTAATG

AAAAATTAACCCGA







Streptococcus

AGAAAGCCCAACGGTATAAA
1445
CAATCGCTGTCTCTTACTT
1446



gallolyticus UCN34

CATTACAA

CATTTATTTTATGA







Enterococcus faecium

CGGCGTGATCAGCGCCA
1447
GGTTGCTGTGCCTCTTATG
1448


TX0133a04


TGG







Enterococcus faecium

AGCTTTATACAAAAGCATAT
1449
CTTTAACGAACGTGTTCGC
1450


TX0133a04
CTGCTCCTT

TAAAAA







Enterococcus faecium

AGCTCGTTCTCATTCAGCAG
1451
ACTCAGGAAGCTTTGGCAG
1452


TX0133a04
A

A







Peptostreptococcus

CCTCCATATACCAACTTAAA
1453
CCCAAGAATATTTTGCCAA
1454



stomatis DSM 17678

TACTAAACATGT

GGTCA







Peptostreptococcus

TCTTGGGCTATACCCATAGA
1455
GAGTCGATAATAAAGAGGC
1456



stomatis DSM 17678

CCT

TTTTAAGTGAT







Mycoplasma fermentans JER

AGCCACTTTTTGTTCGTCTT
1457
ATTCAGCATATTTACCACT
1458



AGTACT

TGCAATGTT







Mycoplasma fermentans JER

TGGATGGATTTTATGATGCT
1459
AAGTGGCTTTTTAGTTCCT
1460



TATCCACA

TCTGC







Eubacterium limosum

CACAAGGGTCGCCGCGTC
1461
CCGCGAAATACGGCGAACT
1462


KIST612


G







Eubacterium limosum

AAACATAAACGATGGAAAAC
1463
TAAGAAATTAACGGAAGGA
1464


KIST612
AGATTATGGAAAA

GATGAAACAC







Parabacteroides merdae

TCAATGTACCGGTGGGCAA
1465
ACAGACAGCCTAATTAACG
1466


ATCC 43184


TAGTCC







Parabacteroides merdae

TTCGGAACCATCCGGCA
1467
CCATGCAGAAAAACCGATT
1468


ATCC 43184


CCG







Faecalibacterium

GCCATTGCGCATCGTCAAAA
1469
CATTGCAGGCAAGGAATGA
1470



prausnitzii M21/2

ATA

AGAG







Faecalibacterium

TAAGCCGAAATCTGAATGAC
1471
CAGCTGATGGATGAGATGA
1472



prausnitzii M21/2

CGA

TCGAA







Faecalibacterium

CCTTAACGGCAACCACGATG
1473
GCGCTTCCAGCATGCCA
1474



prausnitzii M21/2











Faecalibacterium

CCGGTATGGGAATAGGAAAA
1475
ATGCCCCCGCGCAAAATC
1476


prausnitzii M21/2
AGC









Parvimonas micra ATCC

CCAGCACTCCGACTATAGAT
1477
AATACAAAAGTATTCTGAT
1478


33270
TTAGT

GACGGAGAG







Parvimonas micra ATCC

AGCTACTGATCCCCAAAGTA
1479
CGGACAACGAAGTCCGTTG
1480


33270
AAATTCTC

TTT







Bifidobacterium bifidum

AGCGTACCGGAAGCTCG
1481
GTCGGCAATGCCGGCAC
1482


NCIMB 41171










Bifidobacterium bifidum

GGCCGAATATGTTTCGCGGA
1483
GCGAGGTCAAGATATACGG
1484


NCIMB 41171
TA

C







Collinsella stercoris DSM

CGCCACCCCACCTCAATT
1485
ACCCCCGTTGCGCACATT
1486


13279










Roseburia intestinalis

GTCAGATTCTCTGCATAATT
1487
AGCGTCAATCAGGATGAGG
1488


L1-82
TTTCCG

T







Roseburia intestinalis

TTGACGCTGTCATCCTGCT
1489
AAAAATCGAGGATCTGCTG
1490


L1-82


CG







Enterococcus gallinarum

GAGAAGATAAGTACCTAAAT
1491
CAATGAAAACTGGATCACC
1492


EG2
CTGAAAGAAACGC

CTTCTGAT







Enterococcus gallinarum

AAGGATGTGTCCAACATGAA
1493
TCAAGAAATACTGTCTTTC
1494


EG2
TCAGGA

TTCTGACCG







Enterococcus gallinarum

GAAGCCAATCCTGGTCCTGG
1495
AAATGGGAAATAAACTTCA
1496


EG2
TTTA

TGAATACCTCCTAT







Prevotella copri DSM

AGTCATTATGAAGGAGACCA
1497
GGGCGGATAGATTTCCGGC
1498


18205
ATTTCGAC

A







Prevotella copri DSM

ATCCGCCCAGCTTAGCC
1499
CTGATTTTGTAGAGAATCC
1500


18205


CTCGTTGA







Prevotella copri DSM

AACCTTGCACATCGAAGAGG
1501
ATTGCTTTATCGTTACTGA
1502


18205


AATTCATAATCTTC







Prevotella copri DSM

AGACAATTCTTGGCAAACAA
1503
GCAAGGTTCCTACGAAATC
1504


18205
TTCTGG

AAGC







Holdemania filiformis DSM

ACCCTCCTTACCCCACCA
1505
AACAGGCGAAGGAAAAATG
1506


12042


TACTTAC







Holdemania filiformis DSM

AGCGTCGACGCTCTATCCA
1507
AGGAGGGTGAACGTTTTGG
1508


12042


T







Helicobacter bilis ATCC

AAACAGAAGAACAAATTCAA
1509
TTTGCATGGTATTCTAGCT
1510


43879
ATGCGTAACAA

CAGC







Helicobacter bilis ATCC

GGATATGTATAATCTCAATC
1511
CGACATGATATGCACTCCC
1512


43879
CACAAGATATCAC

AGAGA







Anaerococcus vaginalis

TTATGCAACGAATATCCTAA
1513
AAACTCCATTAAGCATAGG
1514


ATCC 51170
ATACAATGGAT

TAATGATGAGA







Anaerococcus vaginalis

AGTCTAAATTCTAAATCTAG
1515
ATTGCCGATTTTCAGGATA
1516


ATCC 51170
GGCAACGG

AGCCA







Anaerococcus vaginalis

TCGGCAATATCATTTTGATT
1517
AATAGTTGCGGATTATATA
1518


ATCC 51170
TCCTTCA

ATCAACAATCCAA







Collinsella aerofaciens

GAGCAGTCGGGTGTCTCC
1519
GGGAAATCAGCCCTTGAGA
1520


ATCC 25986


TC







Dorea formicigenerans

AACCGGGAATGACTAATCAA
1521
CCGATTCATCAAAGCATAC
1522


ATCC 27755
GTGT

CCC







Dorea formicigenerans

ACAAAAGAAATTATTGGAAC
1523
TCCCGGTTCTACCGAAACC
1524


ATCC 27755
CATTGGCA









Ruminococcus gnavus ATCC

CCGGAGTTTGATACCATGGG
1525
ATACCTGCTGCCCGGTTGA
1526


29149
ACA









Ruminococcus gnavus ATCC

GCCGCTTTTACTGGCATTGT
1527
TCTCTTTTTCCTGTCCCGC
1528


29149


A







Campylobacter rectus

TTGTTTGCGTCTTATACTCG
1529
CGATAATTTCTTAAAATTT
1530


RM3267
TGTCT

AGATGTCTGACACA







Campylobacter rectus

AGTCATTTTGCTTGACTGTA
1531
GCAAACAAACGGATTTACG
1532


RM3267
TTTTTGGT

AAGCTA







Campylobacter rectus

AAAAACAAACAAATTTGAGA
1533
TTTTGTTTTACTTTATCGT
1534


RM3267
GCATAGAGGA

CCATATCGACTT







Actinomyces viscosus C505

GTACAGGCTCCCGGCGT
1535
GCTTGCTGCAGCCCTCG
1536






Actinomyces viscosus C505

CTCCACCGTCGGGTTGT
1537
TTCCAACATGTTGGCTCGC
1538






Actinomyces viscosus C505

TGTTGGAATGCCCGCTTATC
1539
CTTACGCCGTGGCCGGT
1540



A









Campylobacter gracilis

CGCGCCTCGTCGATCTT
1541
CGCCGTGCTTTTTGACGA
1542


RM3268










Campylobacter gracilis

GCACGGCGTCAATGCTT
1543
TGCCAACGGCTTTATATAT
1544


RM3268


TTCTACATC







Campylobacter gracilis

TCCTAGATTTGCGATCAGCG
1545
AGCCGTTTTACGCGCTG
1546


RM3268
TAAG









Campylobacter gracilis

GCGGCGATTGAGCGAAATT
1547
TGGGCGAGAGTTTATCGTG
1548


RM3268


C







Peptostreptococcus

CCATAACGGTCTTACTGCTC
1549
AGTTACAGGTAGTCCCATC
1550



anaerobius 653-L

TTGAA

TCTATACAG







Peptostreptococcus

AGACTTGCATGTTCTCCTGA
1551
GATCGTAAACGTAACCACA
1552



anaerobius 653-L

TGA

TGGTC







Prevotella histicola

TTGATAATGTGTTTACCAAC
1553
CAGACGGTCTCAGTATTGT
1554


F0411
ATCACCAC

TCTGAT







Prevotella histicola

ACCGCAACCCTTGTGAGGT
1555
AGGTGCTAACGGCGAGA
1556


F0411










Helicobacter bizzozeronii

GATCCAAAGTGATGGGTCCA
1557
TGCCCAAAATCTCCAAAAG
1558


CIII-1
TAGAG

ATTGT







Helicobacter bizzozeronii

TGCTTTGGCTTCTCCCACT
1559
CGATTTTATGGATTGCTTA
1560


CIII-1


AAAAGGGTTAAGA






Helicobacter bizzozeronii
TGTGGAACAAATGAGTATTC
1561
ATAAACATCGGTCGCACGA
1562


CIII-1
TAGCCAA

TTAGT







Enterococcus hirae ATCC

AAAACTTATGATTGACAATC
1563
TGGCTGATGTTTGGTCTGT
1564


9790
GAGGCATT

ACA







Enterococcus hirae ATCC

AAACGGAAGAAGGAGTCTAT
1565
ATCCTACACGACTAATCAT
1566


9790
CATGA

TAGAGAAAGTT







Bacteroides nordii

CGTAGAGCCTTCCCGGT
1567
GGGCACCGATGAGAAAAGT
1568


CL02T12C05


T







Bacteroides nordii

TGATCACTCCGGCTACAAAG
1569
TCCGGATATAGATACTATT
1570


CL02T12C05
GT

GCACCG







Bacteroides nordii

AAACGCCTTAAATTGATTCA
1571
GTGGTGAAAGTTTCTGTGC
1572


CL02T12C05
AGCGA

CC







Bacteroides nordii

TCAAGTTTCCTTCTAAAAGT
1573
GGCGTTTTCTGGTGTTTAT
1574


CL02T12C05
AGCTCGT

GTTCT







Barnesiella

AATCTTTGATTGGAAGGTTA
1575
ACGCAAGATTTTCATTCTT
1576



intestinihominis YIT

GAAGTATAAAAGG

GAAAGAGGAG



11860










Lactobacillus murinus

CTTTGCGACCACACTTAGCT
1577
ATTCATAAGCGGTCGTGAC
1578


ASF361
C

TTTTAACTT







Lactobacillus murinus

TGACTCACCTTCATATTCAA
1579
ACGTTTTGAGCGATACGGT
1580


ASF361
AGCC

CC







Eubacterium rectale

AATTACTCCTCTCTTCTTTT
1581
TACCTTATTATGATATCGT
1582


CAG:36
AACCTTTGATCTG

CATCAAATCGCC







Cloacibacillus porcorum

TCTCTTGATGTACTTGTTAA
1583
AGAGCACTATTCGACGCTA
1584



TAATGCCG

CC







Cloacibacillus porcorum

AGTGCTCTTAGCGGACGC
1585
TTCTAATAGACGTTCACGT
1586





GATATTGGT







Blautia coccoides

CTTCCATCCTCAGGTATACT
1587
TGCTCTGTAAATGGAAAAT
1588



CCAG

AGTCCATCAAAT







Ruminococcus bromii

GCGGTATTTATGAAGAACAG
1589
CCCGACAAAATTTCTTCAA
1590



CGT

GAGTATCC







Phascolarctobacterium

CCGTTGCAAAGGCTTTACAC
1591
CGGCCCAGTAACCAGAAGT
1592



faecium DSM 14760



A







Phascolarctobacterium

GGTTCTGGTTTTTCGAAAGC
1593
CCTGTCAGCAATAGTTCAG
1594



faecium DSM 14760

GAG

CACT







Helicobacter salomonis

CCTCCACAAATTTGAGGGCT
1595
ACAAGGACTATATGAAGTA
1596





TATGCAAGCG







Helicobacter salomonis

TCACTAATCTTTTACTTGCC
1597
TGTGGAGGCGTTGGCAT
1598



ATCTCTCC









Gardnerella vaginalis

TTGCCGCTATAGGAGCAGTA
1599
ATTCTGCTTTAATTGAACG
1600



A

CAATCG







Gardnerella vaginalis

AGCAGCAGTCGTGTTTGG
1601
AGCGGCAACAACTGAGATG
1602





A







Gardnerella vaginalis

TTTTGGCAACTTGGGCTAGG
1603
ACCCAAGTGACATTGCGCT
1604
















TABLE 17







SPECIES NUCLEIC ACID SEQUENCES










GENUS AND SPECIES
SEQ ID NO:
NUCLEOTIDE SEQUENCE










TABLE 17A-SEQ ID NOS: 1605-1820










Bifidobacterium longum

1605
GATATCAGGGATAGGCCGGAGGCCTCGTAATGTGTCTTCGGATTGTTCAT




ATCGGGCATATAGACATGTCGTAAGCGCTGATGGCATTGACGAGATCCAT




GATCGGAAGTCACGATGGTTATGCAGTCATCATGCAAATGCCATTGATGT




TCGAATCCATAGG






Bifidobacterium longum

1606
GTATGCATAGTGATGGGGCTGGTGGTCATCATTCTCGGATTCAGAGGACG




GCGTACCGGTGGTCTGATTCCTCTCGGACTGGTTGCCGGTGGATGCGCGC




TCTGCATGACCATCGTTTCAGGCACGTATGGCGTGTACTACCGTGATCTT




GGTGCCAG






Clostridioides

1607
TCAACTGTATTTGTAGATTTTATAGTTGCTGTTAACTTGTTATCAGAATC



difficile QCD-66c26


TTCTGCTAAAGTTGCATTGTAAGCATTAACTGCATCTACTATTTCTTGAG




CAGTTTTGTAGTTTTCAATTTTATAGTCAACTACTTTTCC






Clostridioides

1608
AAGAAAGTTATGTGGGAGATGAAATTATGAGTTTAGATTTTAGTTTTTTA



difficile QCD-66c26


AGTAGATTTGGGACTTCTTTTTTGGAAGGAACAGGTGTGACAGTATCAAT




TTCTCTTGTGGCATTATGCTTTGGATTTATAATAGGTATAATT






Clostridioides

1609
TCTGTTAAAGAGTTTATTTTATTTTGTAATTGAGTTGCATTAGTTAAGTT



difficile QCD-66c26


TACATTGTCTACTTTTATATCGTAAACTCTATCATTTACATTTGCTCCTG




CTTTTGTATCTTCAAACTTAACTTCTAATGCTTT






Lactococcus lactis

1610
CTTTTTCTATTTCTTCTTCTTGTTCAACAAATAGTCCCACCAACTTCTCA


subsp. lactis I11403

CTTAAAATAAGTGAGTTAATACTTTCTAGCTCCAAAAGTTCTTTAAGAAA




TTTATCATAGATTAGCTCAAAATACTCTAATACGACTTCTTCCTTATTTT






Lactococcus lactis

1611
GCTGAACTCGTTATCGCTAACGTCATTGACCTTCGTGCCTTCCAATCTAT


subsp. lactis I11403

CTCAGCCTATGATTCAGTTGTTGCTGATGATACACACAAAGGTGCTGAAA




ACCTCATTAATGACTTTGCTGACGAAGCAAAAAAAGCTGGCGTTAAAAAA




GTCA






Chlamydia pneumoniae

1612
GGGAATTTCAATACCCACAGCAGCTTTCCCCGGAATCGGAGCAATAATCC


TW-183

GTATGCTCGAAGCTTGGAGTTTTAAAGCTATATCATTTTCTAAAGATTTG




ATTTTCTGAACCTTAACTCCAGAATGAGGTAACACTTCAAAAGCTGCTAA




TGTCG






Chlamydia pneumoniae

1613
CTCAACTTGTAGGCGAAGAAGACGCCCAGTCCCAAAAGGAAATCGACTTT


TW-183

CTCTCGCAGTGTGACAAGCTCTCTTGGCGTGCGTTCCTCAAAAATAGCTA




CGAGATCATCCCAACATTTAAAGAGATGG






Chlamydia pneumoniae

1614
ATCTTAAGTTGCGACAGCGAGCCCCTTTGAGACTTGCCTCAAAGTTGTTC


TW-183

CGCTTTTTAGATGTTCCCTCGATTCGATTTAGTAGCTAAGCTATCGGGAA




GATTCTCCTGCAACACTCCTAGGAGATGGTGTATAAGAA






Chlamydia pneumoniae

1615
GAAGTTTAGGTTGAAGGTTTTAGAGTCAGATTTAGAAGGGATTCTAGCTC


TW-183

AGACTGAGAGTGCTGAGAGTCTGTTAACTCAAGAAGAACTTCCGATTCTT




GCAACTCGGGGAGCCTTAGAGAAAGCTGTTTTCAAA






Fusobacterium

1616
TTCAAGCATTTCATTAATAAGTTGTTCTATGTTTTCAGCTGGAAAATCAT



nucleatum subsp.


CACCTAATTCTTCATTAATTTCTTCATAAGTTATAATCCCTTCTTCTACT



nucleatum ATCC 25586


GCTTTTTTTATTAAAGCTCTAGCTTTTTCATTTTTTATTAGC






Porphyromonas

1617
GAGCGCTAATGCTCAGACAATGGCTCCAAATTACTTCCATGCCGATCCGC



gingivalis W83


AGCAATTCAAACACAGGATTGTAAAAGAA






Porphyromonas

1618
CAGGTGGCACGCATCGCGTGGGAACTATGTTGCCAAGTGGCATAGTAAGG



gingivalis W83


CCCCATTACATGGCAATTGATACAAACTCGTGGATCGTCACTTAAGTAGC




TATAAGCTTTGGACATATATACAAGGTATGCGCCTAATCCAACGAATACA




CCAACTAGT






Helicobacter hepaticus

1619
CACAAGACTCAATAGCTCTGCAAAGGTATCTTGTGGGATAAAGAGCATAC


ATCC 51449

TTGCTCCACTAAAAATCGCTTTTTTAAGCTCATCTTTATTTTGCACCCCA




ATAAATGGCTCAAAGCTAAGGCGCCTTGCAAAGCTAAGCAATTTAGTCAA




GCTTTTA






Helicobacter hepaticus

1620
TCTGACAAATCAAGCATATAGGCTTCAAAATGTCCAAGCGTTTCCATTTG


ATCC 51449

CTCCTCAAGTGCATAGGCTCTATGAATATAAGAGAGAGAATCTGCACAAA




AAGTAGCAGATTCTATAAATAATCTTTGGGCAATTT






Lactobacillus

1621
AATTTTCTTCATTGTGATAATCAACTCTATTATTGGTATTATCCAAGAAA



johnsonii NCC 533


AGAAAGCCCAAGCTTCTCTTGCCGCTCTAAAAACAATGAGTGCTCCAACT




GCAACAGTAATTAGAAATGGATCTGAAAAAATTGTTTCAGCTAGTGAATT




A



Lactobacillus

1622
TTCTAAAATATAATCTATACTATCTCTAAAAAATATAGAAATCAAGAGAG



johnsonii NCC 533


GATAAAGATCTATGTATTTAGATGATTTCTTAATTTTAATTTCTGATCTC




CATCCTAATTTTCAATTTTTTTATCAAAATAAGAAAAATGGCGTAATTGA





Lactobacillus
1623
GATTAGTTTAGTGATTTGACTATCTGTAAGATTGCGCAGCAGTTGACCAG



johnsonii NCC 533


CGATTGCAATTGTTTGATCTTCGATTTCAAAATTATCTTGAATAGGCAAA




ATAGTTGTTGCAAAGCTAAAACTGGCTAAAAGACTAGCCCAGTTTTTTTC






Lactobacillus

1624
TTTACGCTACTAGCATCTTTTAAAAAGTAACCTGCTCTAATTCTAGTTAT



johnsonii NCC 533


TCCACTTAAAACTTCAACTAAAGGAAAACGGGGAAAAGCCGGAAAATGAT




ATAAGTCTTCTGCTTCGTGTTCTATCTCATCCTTAACATTA






Cutibacterium acnes

1625
GAACGGACAAGCTCGAAGTATCAAAGCGGTTGGATTCGTCGGATGGGGCT


KPA171202

CGGCGGAGAAACCGAAAGCCGTGAGGAACACGCGGATTTTCGTCATATGA




TCGGCCAGGCTGAATTTGTCACTGGGGGAGTGCTCCGTGTGATCCGGGAC




GTTGGCCATCGCT






Cutibacterium acnes

1626
CGTCACCGACCTGGCAGCGATGTCGTCAGCAACCTTGCGGCCAGATCGTA


KPA171202

CCTCACGATCGGCCGCCGCGAGGTCAACCGGCCTGCCCGTAATGCAGCGA




CGCCGCTTCCAAGAGAGGATGCGCAGCACCGATTCGTCGAGACGCCGCTC




GCTAACCCGACCG






Helicobacter pylori

1627
GATTAAAATAAGCGGGAGTCTAAAGACCTTAAATTGCTCATAGATTTCAG


26695

AATTTAAATTGACTTCTGGCTGAATTTCATCGTCTTTTTTGATTTTAAAA




AATTTCAACTTTTCAAACAAAACCAATTCCTAGATCAAAATAAATTCTT






Helicobacter pylori

1628
AAGTTCCTAAAATTGATTTTTGTTTCAATTTATTCTCATTCAATCGCTAT


26695

ATTTAATCAAAAAGAAAGCAATTTTATAGTAGAATGTAGCATTTAGAACT




CAAGTAGAGAAAATGTAGAAGGAAGGAATACATGAA






Borreliella

1629
CAGTTGTAGTTCTAAGTGATAGAAATCTAAATTCAAATCCAAATCTTAAG



burgdorferi B31


TCTTATTTATAAAGAAATAAAAGCTAAGACAGGAGCTGCCTTAAGTAAGG




CTAGATCATGTATTGGTAAATACTGTTCTTTTAAATAGTTTAGAAAG






Borreliella

1630
TGATCATATCGGAAAAACTTTCTCTCTAGTATAAATAAGTCAGTAGTTTT



burgdorferi B31


TAGATTAAAAATAAGATTTTCAGCAATGCATAAATAATTAGAATATTTTT




TCTTATAAGCTTTGATGAAGATATTGTATTAAATT






Borreliella

1631
CATTAATGGTTATTTTCCATAAGGAATATAAAAGTATTATTTTGTGAGAC



burgdorferi B31


AGGGCATAAGCTCATCATTTGTGTCTATGGTTAGTAACTAGTACTCGGGG




GGGGGGGATAATTAACTAAATATA






Chlamydia trachomatis

1632
GTCATTTTTCACCTTCCACTGGAAGCCTCTACTCTATTGTCTATAATAGT


D/UW-3/CX

ATCGGGTATTGCTTTTATTATTTTATCTATAGGACGCCTATAGTCTAACC




TTTGGGAGAAGTATTTGCTTCGTTATCTAATTTCTTGTCGTGATCTCGTT




G






Chlamydia trachomatis

1633
AAAAGCGGTTTCTATAGTTTCATCAAAAGGTGTTTTCGTAAGCTGGATTT


D/UW-3/CX

TTTTAGTTAACAAAGAAGGTTTTGCTTGTGTCAATACAGCAAATTTAAAT




CTTTTAGAAAACCCTGAAGGGAGAATCTCGCTTTGAATAAAGTTCACGAG




ACCTT






Chlamydia trachomatis

1634
TGAAGAAGAGGCTTATTGAGTTGACACGTTAAACACCACTTTGCAGTGAA


D/UW-3/CX

ATTTACAAAAACTGGAATCCCTTTTTCGCGTAAATCAGCTAGCTTTTCGG




GAGAAAAAGATTGCCAATCAGAGCTATGTGCAGGAGGGACGTTCT






Campylobacter jejuni

1635
CAAAGAAGTCTTAGAACTTTCTATAAGTGAATTTTGAGTATGCTCTCTAA


subsp. jejuni

TAGGTAAAATACCTTCAAAACCAGCA






Campylobacter jejuni

1636
ATATCAAGAGATATGCAAGAAATAATTCTATTATTTTTTATTAAACACAA


subsp. jejuni

TTCACTAGAACCACCACCTATGTCTAAAGTGGTTCCATCCTTAAAAGGAC




TTAATAAATTTAGAGCT






Campylobacter jejuni

1637
AAATACGCAAAATCAAAGTGTATTGCCAAGTGAACCTATAGCAACTCAAG


subsp. jejuni

ACAATAACAATGATACTTCTTTTGAAAGTATGCCAATTACAGA






Bacteroides fragilis

1638
TGGTTTTCTTCCCTGCCTTTTCTTAATATCAATAGTATGCTAAATTTTAA


YCH46

AAATCTGTTTCTGGTAAGTGTTGCTCTGTGGTCGGCAGTAGGAATGGTTC




GTGCCCAGGAGTTCGATCCGAAGCAAAGCTACGAGATCCATACCCAGAAC




GGACTTGTCC






Bacteroides fragilis

1639
GCATCCCCTTCCACAAAGCCCTCCTGCAACAGATACTCAATTCCGACACC


YCH46

TACTCCGCACAAACCGTCACCATAAGTCACCGGAAGTTCCAAAGAACAAT




TCTCCATCACCTCATCCAGCAACACTTCTGCTTTCTC






Bacteroides fragilis

1640
CTATCCAGTCAATCAGGTAAATTAAATGCTCTTTCATTCGTAGCACCTTG


YCH46

ATGTCTTCCTCACCTTTCCTCCGGGACAACCGGTAATACAGATAATAAGC




GAGACCACAAATACCTTTCCCGATGCCTAAAGTTCCGATCGCCCTACTAT




TAATCGTGTTAAAT






Lactobacillus reuteri

1641
AGAGCAAGAAAAAAGACATTGTTATTTCTAACAGCGCCGATTTCGATCAA


JCM 1112

CAAGAATATGACACCGCAGTTGGTA






Lactobacillus reuteri

1642
ATGTTCTCCCATTAAAATGATTTTCGCATGGCTAGTACCAATCCCTTGTT


JCM 1112

GTTTCACCATCGTAACCTACTTTCTACATAAAAATTCAAACTTAATCATA




GCATAA






Bifidobacterium

1643
AAAAACGATGGAAAGCGGTGAGACAACCAGAAGCATTTGTCTCACCGCTT



adolescentis ATCC


TTTCTATCGGATTTTAGGTATGCGCTACTTGACTTCGACTAACCGTAAAG


15703

GGTATGCGAGAATGCTGTTTCATCGTTTTTCAGAAAAAGAATGGAACTTT




C






Bifidobacterium

1644
CGTGTTCCGGGCTGATTCGGAACGATGAGCATTCCGCAGAGTGCTGTGAT



adolescentis ATCC


TTGACTTCAAGCGGCTTCAAGGTTGTATGGTGCTTTCTGAACCGATGTTC


15703

AGAATGAT






Bifidobacterium

1645
GGGGGCCGGTGACATCCGAGTGATGCCATCGGCCCCCACCATATCCGGAA



adolescentis ATCC


GAATCCCGGAAGAATCCCGGAAGAATCATAGGCCCGCATCCGCCAGCGAA


15703

ATGTAGCCGGGTTCA






Lactobacillus

1646
ATTGATGATGCCTTTTGTCGGTGTTGATCGCCAAGGAATGGAGAATCTTT



rhamnosus GG


ACACCATGTTACCGGTCCGCCGTACTACCATTGTCGCCGGCCATTATCTG




TTCGGTTTGATGACAGT






Lactobacillus

1647
CATCCTGTGCAAAACTCAGTTGACCGTCATTTGCTACATTAATTGCATTG



rhamnosus GG


ACACTCTTTGTCCCTGTACCCGTTACATTAAGATCCAAAGTGCTGCCACT




TGCAACGAAGAAGCCTCCGGTACCTTTAATATTGATAAAGCCATAGCCAA




GTTTAGG






Bacteroides

1648
CCGATAGTAGCCGCTGGTTTGTATCTGATATAAAGTCGGAAGTACGGCCG



thetaiotaomicron VPI-


ACATACGCATTGGCAAGCCTCGTTCAATTAGCT


5482








Bacteroides

1649
GCCGGACAGGGACTAGGTGGAGGTAATGCTGGTTCAGGTATTACAACTGC



thetaiotaomicron VPI-


ACAATCGTTGGGA


5482








Bacteroides

1650
TCGCCATGATTTTCGCGTTGATTTCCGTTTGGAGTGGAGACCTGACACAT



thetaiotaomicron VPI-


TGACTACGATTATTTTCC


5482








Mycoplasma penetrans

1651
TACAGAACTGAATAATAATTATTATTCACAAAATAAAGACTTACTATAGC


HF-2

AGTTACCGGAATCAAGAATAAGAATGAAATAACAAATCAAAACACATAGG




ATTTTTTTAACTGTTGGAAATATAGCTTTGTTAC





Mycoplasma penetrans
1652
ACATAAAGTTCATTGTAATCTTTTGGGGTATTTGTTGAATACATTAATTT


HF-2

AGAAGGATTTTGAACATATGCATCAAACTTAGTTTGATCAAATACATTGT




TAGCAATTAATTTATTTAATAATCCAATTGCTAATAAACCAT






Mycoplasma penetrans

1653
ATTACTTCCATTATTAGTTGTTAAACTAACACCCATCGCTCTTAATGTGT


HF-2

CACTAGAGAATGCCTTTTGAATTCTTGTGTCATTTTCTAAAGAAGATCCA




TTTGATGTTGATGAATTAGGTAAAGTTTTTACTTCATATTTATTC






Mycoplasma penetrans

1654
AACTAATGTATTAAATAGAATAGTTACAAATTGAATAAAGGATATTACTT


HF-2

GTACTTTTGATCAGAAAATTTTATTTCTAGATACTGATTGAGTAATTAAA




TCAAAATCAATTGAATTTCTAATATTGTTTGTGGATATTATA






Lactobacillus

1655
GAATGTAACCACTCAAAGCCCTGTAGATAATAGTACTAATAATGAT



acidophilus NCFM


GTTAATGTAAATAATTCTAATTTAGCTGATACACAAGCAGAATTAA




TTGATTCAAATACACAGTTTTATGAAAGTTCGCCTTTAATTGATCA




AATT






Lactobacillus

1656
GCGAGTCCAAATGCAGATAATTACACTACTGTTAATAACTATAATG



acidophilus NCFM


ATCTTCAAAGAGCTGTTAGCAATTATAGTGTAAGCGGAGTAAATAT




CGATGGTGATATTTA






Lactobacillus

1657
ATTGTTGAACAAAATCAATCATCAAGTGAAGGTGCTCAACAAGATA



acidophilus NCFM


TTAATGCAGCAAATGATGTATCTGCACAAAATGATCAAAAAAGTGT




TAATAAAATAAATGATGAAATTATAAAAAATGAAAATGTAGACGCT




GATATTAA






Lactobacillus

1658
TAAAGGATATAACATTCAATCAAGTACTGTTAATGTCGATGATAAT



acidophilus NCFM


GCTTCACTAACAATTAATCGCTCATCTGTTGGCGATGGTATCCATT




TGTTAAGTAATGGTATTGTTAATGTTGGTAATTATAGCCAATTAAC




TATTAAT






Desulfovibrio

1659
CCGAGCGCTGCATGTACTCAGACGCGGCATGATGCAGGGCACCGGT



alaskensis G20


CAGCGTTGCTGCGTGATGCGGCAGGCTGCGGCGCGGTGCTTTCCAC




AGCGCCGAAACCGGCGGATGCGGGTTGCGGGCGGAGCGGGTTGTCA




TGGGCTGTTCTCCG






Desulfovibrio

1660
CCGCACGGCTTAACTGTTTCCAGCGTATCATGGTCAGCCTGTCATG



alaskensis G20


GTCGTAGTTGCGCGCGGGCATGTGCCAGCGGCCTGAGGTGTCTATG




AACATGACGCGGCAGTCGCCTATCTGCGCCCATTGTGCGCTGTTTT




CTGTCAGACGCAGAGCTGCCGCACTGGC






Desulfovibrio

1661
GGTCAGCGCCACGGCGGCCACGTCGTCATGGCATTTGAAGCGCGGA



alaskensis G20


AACATGCGGCAGTCGGGGTCCTGACTCTGCAGCGCGTGTATATGCC




GGTGCAGTCCCTCAAGGCCTCTGGTCAGGTAGGCGCGGGCCAGATC




GCCGAACGATGTTCCGTACTCCGGTG






Desulfovibrio

1662
GTGTAAAAAATTGTAAGCATGAGAGTGTCCGGTTGATGAATGGCAG



alaskensis G20


CCGGACGGCATGACCGCCCGGCAAAGAAGGACGATCAGCATACCCC




CTCTTGGCAGGGCTTTCAATACGCCGGAGTATGTAAAATGGAACTG




TCAGGAATCGTTGTGTTTTGCAT






Bacteroides vulgatus

1663
CAGCAACGAAGGCAATTCATGGAGAAGTGTCCTTACCTCTATTCTC


ATCC 8482

GGACGCGTATTCTATTCCTACCAAAATAAATACTTATTCACCGCCA




CTATCCGCCGGGACGGTTCCTCCAAATTCGGTAAGAACAATCGATA




TGGTTACTTCCCCTCTTTTTCA






Bacteroides vulgatus

1664
AACCATCTTGCCGCCAAAACCAACAACAGCGACTGGTGGTATTATT


ATCC 8482

ATGAAATTCCAATGATAAGGAAGACAAGAACATGGATGAACTCTCA




GACCG






Bacteroides vulgatus

1665
TGCCAATGACCCTATACGCAATGCGGGCAAGATACGTAACAATGGC


ATCC 8482

TTTGAATTCAATTTAGGATGGATGGACCAACCCAATCCGGATATTT




CGTATGGCATCAACTTAATTGGGTCTTTCAATAAAAACAAAGTAAT




AGCCATGGGAAGTGAA






Parabacteroides

1666
CTGCCGGAGCAGTTTTAATACGTTTTTCATTTGAGATTCCGTAGCGCCTT



distasonis ATCC 8503


TGTCCCAAGCGAATTTCGCTTCTATATGGGCTTTTGTCAATTCTTCCTCT




ACACGCATGCGTATGGCGTAGACTT






Parabacteroides

1667
CTCTGAATGCGATCGTAGGCTTCTCCCAATTGCTCAATTCGGATATGCCT



distasonis ATCC 8503


TTGGAACCGGAGGAAAAGGCGGAGTTTATAGATCTGATTACCAAAAACAG




TGACCTGTTGCTTAAGCTGATCAACGATATCTTGGATCTATCACG






Lactobacillus

1668
CCTGGCCTTTGCCAGAAGCCTTTTGCTGGATCCCCTGGCTAACTACGCCT



delbrueckii


TGCGCCTGGCGGTGTGCGAGGACCTGGTCAAGCTGGGGGTTAAAGAAGAG


subsp. bulgaricus ATCC

ATGCAGGTTATGATCCTGGGGGACT


BAA-365








Lactobacillus

1669
ATTAAGCCGCTTTTGACCATGAACAATGACCAAATTCAAGTCCTGCGGGC



delbrueckii


AGAAGCTGGCAAGATAGCGGACAAATTGCAGCTGGTCGGCTTTTTAAGCG


subsp. bulgaricus ATCC

TCCACTTCGCCATCAGCCACCGGGGTACGGAAATGGTCTACAAGCTCTTG


BAA-365

GCCGTTAAGCCAC






Lactobacillus

1670
GCCGTCTTGCTCTTGAAGATCATCGACAAGCTGGCTTCTTTGCCCCAGGC



delbrueckii


AACTTTGAATGTTCTGGGCTCTTTGGCCAGTGGCCTTATCCGGGACACGG


subsp. bulgaricus ATCC

GGGACGTGATCAAGGTGATTGCTGACCAGTCCCGGCAAAGCAGGCGCAAA


BAA-365

CTGCCCAAAGACAA





Campylobacter curvus
1671
AAATAGGGGATGTAGTCCCGAAAATACGGGGCGAAACGCCTCAAAATATC


525.92

TCTTAGTTTCAGCTCTTTTTTACTCATTCGTATCCCAAATTTTAAATTTT




CTCCCACATCGTGCCTTGGGCGGTATCCATTATGGCGATATCCATCGCAT




TGAGTTTTTTC






Campylobacter curvus

1672
TAAGCTCATAGTCACTGATGTGAGGAAGAAGAAATTTTAAGGATTGCTCC


525.92

TTACTCAAATGCTGATCGCATCCTCCATATTCTCGCAAGCAAATTGCAGG




AAAAATGCGAACAGCGTTTGAGTAGTGTAACGAAGCAAAATTCCCTTAAA




ATTT






Campylobacter curvus

1673
TTTAACTTCTAAAATATATGATCTTTCACTCATAAAATTTAGCCAAGTTA


525.92

TATATATCAATTGATAAAATCAACTTTA






Campylobacter curvus

1674
AAAGAGTAATATTTCATTATAATTTTTAAATAAATTAAAGATTACTTTAA


525.92

TATTATCGAGTTACAATAACGCCCAAATAATACGTAAATTTATAGTAAAG




GAGCTTTTATGAGATCCATAACCAACAAAATAGCACTCATGCTATTGATT




GCGTTGTTTA






Campylobacter hominis

1675
TAGACAACATATGATCAAAAGTTTCTTCTTCAAGTCTATTTCCGATTAAT


ATCC BAA-381

TCACTAAATTTATCGTGTATATACTTGAGATTGTATCTTATTTGTCTTGA




TTGTAAATATTCACCATTTTGATAAACAACCAAAAATG






Campylobacter hominis

1676
GTTTTTTAGCAATTTTTCCATCGTCTGTTTCATCCGTTTTAGTTAAAATT


ATCC BAA-381

AGAATATGTGGCGGTCTTTTTGCGATTTTTAAAAATTCTTCATAATCTCT




TGTATCATCATGAATGCTTGCTAAAAACAACACCAAATCAGCATC






Campylobacter hominis

1677
TTCTAAAATAAAATCTCTGTATATATCGCGTAAATTTGTGTTACTGATGA


ATCC BAA-381

TTTGCGGATCAAAGAAATACTCATGCTCCGGCAATAATTTAGCAATCTCT




CTTAAAATTTCAGCACGATAAACTTTTTTTTTAATATTTACAGG






Campylobacter hominis

1678
CCAGGAATTTCGCCAACATCAGTTCCAAAATATATATTTTGAATCTTGAT


ATCC BAA-381

GTTATAAATTTTATTCAAAGATGTAATTATGTCATTAAAATAATAAAATA




TCTCATCAAAAATTTGGATTATAT






Campylobacter concisus

1679
CTTTTATAAGTCCTTTTAAGAGCGAATTTTCAGCTCCGCCAGAGCTTCGC


13826

CTAAAGTGGATAAGAGAAATTTGGGGCGGCCTAGAGA






Campylobacter concisus

1680
ATACGCCCAAAGCCGTTTATTGCTACTTTAACTGACATCTTAGCTCCTTT


13826

TGATATAATTACGCCTAATTCTACAAAAAAGAATTTTAAAACAAATATAA




ACAAGGCACTTTAATAGATGAAGTTAGCACTTTTTGGC






Campylobacter concisus

1681
AGTGACGCAAAGATCGTCAGTATCAATGGTGATGAGGTTTTAATCGACGT


13826

TGGCAAGAAGTCAGAAGGCATTTTAA






Akkermansia

1682
TCCTAACGAACCCCAAGTCAAACCGGACCCCCGCGGCGGTTTTTCAGGAA



muciniphila ATCC BAA-


GCCGCCGCTGAGAGACCGCACAATTCCGGGTGAAGCCGCTTTACACACTT


835

GCCAATAGTGGGAAGCGTGCTA






Akkermansia

1683
TGGGGAGAGTAAAACTAGATTGCCAACTGGATGAGATAGTTGACCACGCT



muciniphila ATCC BAA-


GTGAAGAAAGATGTCGAACGATGCAACAAACG


835








Akkermansia

1684
TCTCTTCTTTCGTGGGAAATGAGGGGGCCTGCGGGGAGGCCCCCTCATTA



muciniphila ATCC BAA-


AACCTGATGTAGATTCCTCTACAAGTTCCTGAGGAACTTAGTCAAGGATT


835

TCGCTGATA






Bifidobacterium

1685
CTGTGCGAAGTCGCGCCGGGCGGCAACGGCGAACCGGTCGTCGGCGATGA



animalis subsp. lactis


CGAAAGCATGCGCGTCGGGTGGTTCGCGCTCGACGATCTGCCCGAACCGC


AD011

TCAGCGACAGCACAC






Bifidobacterium

1686
ATCTCAGGCACCGTGCGGAAGGAAGCGCACGGGCACAGTTCGCGTTCGAC



animalis subsp. lactis


GGGGTCATCGAGCATCTGTAGGCCGCATACAGCGGCATATGAATCAATGG


AD011

ATGCTGCCGAACT






Bifidobacterium

1687
GGACTGGGAGCCGCTGTTGCTGTCTCCATTGCCGGAGTTGCCGTTGCTCC



animalis subsp. lactis


CGCTGCTGCCGTTGCTATTGCTCGGTGAGGGGCTGGGGCTTGGGCTTTCC


AD011

TTGGGCTCAGTGGTGAGTGTCACCTCGGTACCTGGATTGACCTGAGACCC




TGCGCTCGGAT






Bifidobacterium

1688
TTGCTATTGGCTGCCACCTTCCACTTGAGGTTGAGAGCCGAATCGGTGAG



animalis subsp. lactis


TGCGGCCTTGGCTTCCCCGAGTGTACGGCCTACAACGTTCGGCACTGCTA


AD011

CCTTGCCGTTCGACACCCAGATGGTCACGGAGGCGCCTCGCTCCACCGAT




GTGCCCTCGTTC






Atopobium parvulum DSM

1689
GTAGGCCTATATGAAGCCTTAATCTCAGAATCTGCTGCATCAACGCATTC


20469

TAGTAACTCATCTTGTACACAGGCATCAAACACTTCTTGTAAAGTTAATG




CACCAAATACCTGCTCAGACTCATCCAAAAGCGCAGGCGCAATTAGCTCT




GTGAGGCGTGC






Atopobium parvulum DSM

1690
AGGAATTCATAATGAATCACTAGCTTAAAGACAGCTTGTGTTCAGATGCT


20469

TCTTGTTTGCCCACAAGTTGTGTAACTGTTTCTTTAATTTCTAATGCAAG




ATCAATAAAATCTTGAGCAGTAACACATCTCACTGACTTGCCACGC






Atopobium parvulum DSM

1691
AGATGTAATATCATCAAAGCTAGGTCTTTTTTCTGCCATTCTTCAATCCT


20469

TTCTTTATTATTCATTTTATTTTGTAACATCAATTAAATACTAACTGCAT




CAAATAAATTTTTCTAACATTATCTTAACTCCCAAAAACGGCCATAAAG






Veillonella parvula

1692
AATAGATATTGGTCCACATCGCGTAAAAGCAGGGCTAGATATCATTTTGT


DSM 2008

CAGGTGCTATTGGAGATCACTCCATTGCCGT






Veillonella parvula

1693
CGTCTCAGTGAGAATATGTGGGGAACTCACGAACTGACCAAAGAAGCTAT


DSM 2008

GGAACGCTCTTTGCGTGCTCTAAA






Veillonella parvula

1694
GTACAGTCAGTATGTAATATATTAGGGTATGACCCTTTATATTTAGCGAA


DSM 2008

TGAAGGGAAAGTGGTT






Citrobacter rodentium

1695
TTTCATCATTTGTCATACTTAAGCATATTTTTTATCAATCATTATAACAA


ICC168

AAATTGTACAGAGCAGAGATGAAATATATCTTGTATCTCTATGATTTTAA




TGTATTTATAACGCGTATGAATTATTTTA






Citrobacter rodentium

1696
AATTGATTGCACAAGCGGAAGCCGAAAAACAACGGTTGATTGATGAGACC


ICC168

AACGTCTGGATAAACGGGCAGCAATGGCCGTCTAAATTAGCGCTGGGCCG




CCTCTCTGAGGATGAAAAAGCGCAGTTTAACGAATGGCTGGACTATCTGG




ACGCGGTGAGTGCCG






Citrobacter rodentium

1697
GTGGGTTTTTATATCGAAGGTGTGTCTGCGGTTCCCTCCAATGCTATTGA


ICC168

AGTTAGCGCGGATATTTATAATGAGTTTGCCGGAGTGGCGTGGCCTGATG




GGAAAGTACTAGGTGCTGATGATTCAGGATATCCGACATGGAT






Citrobacter rodentium

1698
CAACCCCATCCGCAAAAAACAGCGCGCCCGAGGGAAGTAAATGCGTCAGT


ICC168

GACTTTAGCTAATTGTGCTGAAATTTACCCGTAAT






Streptococcus

1699
CCAAAAGTGTGTTATTGGAAGAAAGCGTTGAAAACTTTGATGCTGTTGCT



gallolyticus UCN34


ACCTTGACAGGAGTTGACGAAGAAAATA






Streptococcus

1700
AATCGTGTGGAAATGATTCTCTTCCACAACTTTGTCAAAACCAAAATCAT



gallolyticus UCN34


TAAAAATCTGATGATTATCGGCGCAGGACGAATCGCATACTATCTCCTCA




ACATCTTAAAACATACAAGAATCAATCTTAAAGTGATTGAAAACAA






Streptococcus

1701
CTTTCAATATCTTTTAATTGATGATAAATGTGATATTCCTCATCAGCGGT



gallolyticus UCN34


TACGACCTTGACCATATGAATTTTTT






Enterococcus faecium

1702
CTTTCTGGTTTCTTTAATAAGCGGACCTTTCGTACTCCAGAAATTTTTTC


TX0133a04

TAGCTGTTGTTTCACTTGATTGAGCTGTTCAAAATCAGTACTGTCTATGT




CAATCAATATGTAGACATGCTCATTCTTATGGTTACGCGATATATC






Enterococcus faecium

1703
CACGGCCAGCCACTTTCTTTACTGTCAGTTGAAGGATCTGCTCCTCCTCA


TX0133a04

GATTGTGTGAGCGTAGAAGATTCTCCTGTTGTACCTAAAACTAATAATCC




GTCTGTTTGTTCATCTAAATGGAATTGGATCAATTTTTCCAAACCAGCAT




AATCTACTGATCC






Enterococcus faecium

1704
ATGTATTCACTCGGCTGTCATATTCGTGATTGTTATCAGCAATATTGAAC


TX0133a04

CAGAATTTATACATGTCGTTTTTGTATTCCCAATAAGTAGTTTTCGTCAC




TTTTTCAGCATCGCCGTCTGTCTTTTCTACTTTATTTGTTTCGTTATC






Peptostreptococcus

1705
ATTAGTATACCAAATAGCTCAACTATAGCTGATATAAATATAAGTCTGCC



stomatis DSM 17678


CATATTTATAGGTACACTTCCCATAGTCTCCTTGGTTCTGGCATTTACTG




CAACATAGTGGAGGATATTTTCACTACCCTTTTTCTCCATATAGGAGTAT




AGC






Peptostreptococcus

1706
TTGGTATCCCTCTTCTCAGATGTGTAGCCTCTTAGGTAGTTTGAATTATA



stomatis DSM 17678


CTTGACCACATTATCTGTATCAAAAGGCATTATGGAATTTATAATGTTGT




TGGTCTTGCGCCTGTTAGATTGGTCTAGCTTGTCTGAGCT






Peptostreptococcus

1707
TAGTCCTAACCTTCTGGTCGCTCGTCTCCATATTTTTTATATTGTATTGG



stomatis DSM 17678


GTTTCTCGCTCATAGGTATGTCTTGCATTTGAATTTCTATACCTCAAATA




CTTGATTATAAAGAATATAAATCCTAATAGGTAGAATACCCAA






Peptostreptococcus

1708
TTATATACATCTACATCATATACTACCTTCTTGTTGTCGTCAGACCCAAT



stomatis DSM 17678


TGTATATTCACATATAGTCTCTTCACCAACGCCCCTTAGGTCCACATGGG




CCTTCATGTCAACTATCATATAAGGTAAGTATACACCACTTA






Mycoplasma fermentans

1709
CTGTTTAAAATAAAAATAGGATTATTTGTATTATCTGCAATTACATAATT


JER

AACAAGATTTTTAATCTTGCCTGAAGTTAAATTGATTTTATCTTCGCAAT




GAGCAAAAATAAATTTTCTAATT






Mycoplasma fermentans

1710
CTTTTTCTAATCATTCTAATTCTTTGTTTTTTTCATTAACTTCATCAATA


JER

ATTTTTTGTTTTTCTTCGTTAGTTAATTTCTTTAGTTTTAATCT






Mycoplasma fermentans

1711
AAGAGATGAAAAAATATTTGGTTTTTTAAATGAAAATGAAACTAAAAAAC


JER

TAGTCAATAAATTGCACAAATATAACAAAAAATATTTACTAAAATCAATT




AAATACTTTAGAATTGGTAAAATTGTAGAAAGAAAAAA






Mycoplasma fermentans

1712
GCTAATGGATTAGAAATTTTAGATGGTAAAATTATGAATGTAGAAAGTGA


JER

TGGAATGCTTTGCTCAGCAGAATCTTTAGGTTTAGAA






Eubacterium limosum

1713
TTTCAGCACAGTGGTCAGCAGCATGTAGGCCGCCACGACAGCCAGCAGAA


KIST612

AGCCAAAGTAGCGCGGCGGGAGGATGGTCAGCCCCAGCACGCTGAACAGC




GGGGTAAAGGAAAGCCCGGTGAACAGCAGGATACCGGCAACGGTGATGAG




CATGACCGGGGCA






Eubacterium limosum

1714
GTATTTTGCTTTTTTCACGTTCGGACATTCTCTGCCAGACGTTGAGCCAG


KIST612

CCAGGGGTCAGCCCGGCGAAAAAGGCGGCTCCGGCCGCCAGAAAGATCAG




CCCGAAAATAATGCAGTATA






Eubacterium limosum

1715
ATTCTGCGATGATCTCGATGCGGCCGCCGGGCAGGTCGTTGCTGTCCAGG


KIST612

CTTACCAGGTACATTTTCCGGTTCCGCACCGCGTTGATGAGGTAGCTGCG




CAGGCGGCCCTCATACCGCTCGTCGCACACAACGGATACGGTGTAGCGGT




TGTCCGTCTC






Eubacterium limosum

1716
TTGCGATTAAAAAAGCCGGACATAAGCCCGGCTACTGAACCTGCTCCCAC


KIST612

TTGGCGGAAACCTTGTCCTCATTTTTCAGGCAGCTCGTGGTGATCTGCTC




AATG






Blautia obeum ATCC

1717
TTAATAAATACATCTTTTGTCATAACTTTGTCTTCTCCTTTTGTCCAGCC


29174

GGATTACTACTCATGTTTACTTTTACACTTATTAGATGCATCGGAAAGGA




AATCGGTTCTCATAAAAGTATTTTTTTCCGT






Parabacteroides merdae

1718
TCATCCAAACGGCGTTCCGTCTGCAACTCCGTCGGTATATCTAAAATATA


ATCC 43184

AGACAACCAATAATTCACCTGAGCAGATAAAGTATAAGAGGAATCATCTT




TAATCAATTCCGGCATTAAAGGATTAGATTTTTCTTTTTCAAATGAACCT




ACTATAT






Parabacteroides merdae

1719
ATAATACTTATTTCTAATCGTATCGAATGTCAGATTCACAATAGTATCCA


ATCC 43184

GTAAAATTTGTCCATTTTTTCCTATCCTGCGAACAGGAAGTAATACCGAC




TGTATCAAACTTGACTTACCTGCCGAGTTAACACCCGTTACAATTGTAAA






Parabacteroides merdae

1720
CGAAATTTTGGAATGCAACTCCGCCCCGATTCTATTGGCAGACAAATAGT


ATCC 43184

ATAAATTTTGCTCTATGTCGATCGGCATGCTGGCATCCTCTTCCGAAGGA




AGAAAAACCAAAGAGAGTTCATTTGTTCCGTGATAGGAAATATGC






Parabacteroides merdae

1721
ATTGATCGTACAGGCATATTTTTTTCGCATCCTGGCATAGTGCTCTTATG


ATCC 43184

TGCCTGATCGCCTTGTCTCTATGGGCATTTCTATAAAAACAACCGGTGAT




ATGGCTACTGATCTGATCGGACATGATATTGACATACGGATAATCCGTT






Faecalibacterium

1722
GCCCACGGGGGTGTATTTGATGGCGTTATCCACCAGATTGACAACGACCT



prausnitzii M21/2


GCATGATCAGCCGTGCATCCACATTGACCAGGAGGATCTCGTCCCCATAC




TGTGTTGTGATGGTGTGTTCGCAGCTTTTCCGGTT






Faecalibacterium

1723
CCGCCGCGCCACGCCGATCGTCAGGCTAGCGCTTTGGTAGCTTCATGAGT



prausnitzii M21/2


TCCTGCCGGTCCGCAAAGCTTAGGTCCGTTCAGGTGCTGTGTCTAAAGGA




CGCCCCGACCCCACACGACACAAGAGTGGGCGGAACGCGCAGCACTC






Parvimonas micra ATCC

1724
GATTTCTTTTCTTCAGTTGTAAGAGTGTTGTCTGAATTAACTTTATCCTT


33270

AATACTATTTGCATAAGCAGTTAATTCTTTTTTTGCAGCTTCTTTTGCTT




TTAAAGCTTTTTGATCATCATTATCTGATGCTTTTAATACATTTGTTG






Parvimonas micra ATCC

1725
CGATTTGTATTGGTATATTTTTGTTTTTTAAAAATACTATGAATTTAAAT


33270

TTCCTACCTTGGAGAAATGCATATTTGATTGTTACATTCTATGTTTCAGC




TGTTTGTACATTAATGGCATTTATAGCTATTCCTAAAAC






Parvimonas micra ATCC

1726
ATAAGTTTATAAATGGAATTTTGGGAACTATCGTTAGAGCAAAAAAATAA


33270

AATATATTTTTAGGAGGTAGTTATAATGTTACTTAATACTTTATTGTTAG




TTGTGTTCGTTGGTATTGTTTTTTCAGGCATAGCTGTGTCAACTTTTTT





Parvimonas micra ATCC
1727
AATTTTAAATTGCTTCTTTAACTCCTCATCAATAACTTTTTTAGATATAT


33270

CCTTAATTTTATTTTTAACAACTGAAGCCTCACCAATCTTTTGTTTCATC




TCTCTTTTTATTAAATCAACATACTTATGAGTATGTAAGATATTA






Streptococcus

1728
TTATGGCGTTCTAACAATATACGAAGTATCTCTTTATAATTAGCATTAGG



infantarius subsp.


ATACTTATCAAGATATGCCAACTGAGTTTCTAAAAGTTTATCAAAACGTT



infantarius ATCC BAA-


TATCAAGTGCTTCTTTACTCTTAGGGCGAGAATCTGAACCTTGGTTAACA


102

CGA






Streptococcus

1729
AATGTAAATGATTTTGAGCGTAATATTGAAATATCATATACTAGTTGAGG



infantarius subsp.


ATCGTTATCTTTAAACATTACGTAATTACCAATGATGTCACCATTAGTAT



infantarius ATCC BAA-


AACTTGGTAAAATTTCGGTGTTAGATTCATTATTAACT


102








Streptococcus

1730
TAAACTAGTAGCTTCTTCTGGAACACTAAAAAAGACATTGTCTTTGTATT



infantarius subsp.


GTAGTTGAAATTCCAGTGCTTTTTCTTGAGAAAATCCTTCTTCTGGTGTT



infantarius ATCC BAA-


GCATAATAAATTGTTACTTTTTGAGAACTTACT


102








Bifidobacterium

1731
GTATTTCCTGCTCGTCGTTCTTGCTGACCAGTCCTTGGGCGCCGGCCTGC



bifidum NCIMB 41171


GCCACGTCGAATCCGAACGTCTCCTTGGCGAACGAGGTCATGGCCAGAAG




CTTCACCATGCTGTCGCGTCGGCGGATACGACGGCATACCATGACGCCCG




ACATGGATTCCATCG






Bifidobacterium

1732
TACAAGATCTGCATTGAGTTCGACGGAGGGCATCACGCCGGTCAATGGCT



bifidum NCIMB 41171


GGAAGATGCACGCCGGCGGCAGGCAATCGAAGACGAGAAGTGGCGGTATA




TCCAAGTCACCAAGCTTGACCTCGGTGATGAATGGAGTGAGGAAGCTTTG




GCCAGACGGA






Bifidobacterium

1733
CCGAGCCGTTTCGCTATCGATAAGTCAGTCAGCCCACGCGACATAAGATC



bifidum NCIMB 41171


GAGCACCTGCACTTCGCGGTTGGAGACCAGGTCAGCCCGTTTCGGCGTCT




CATGCGCGAGCTGGTCGAATGAGGCACGGGCGTCAACGAACCCGAAATCA




CCGTAAACGCCTCC






Bifidobacterium

1734
CTTCCCCTTTCCTGAATATGGATAACCGTAGTACCCATATGGTTAGTGGT



bifidum NCIMB 41171


CATATTGTAGGCGCATAATGTGGACAGCCGACGCTAGGCTAGAGTTAGTG




GAATGGCGGTCGCGAGGCTGCCGCGGATTGCGATTTGTGGAGTGTGATGA




CGATGGGCG






Collinsella stercoris

1735
CCGACCAACGCCAAGATGATGCCGAGCTCGATCGACTCGGAAACCTGCTT


DSM 13279

TGCGCGTTGCATGCACCCTCCCTCTCGAATATGCCGTCAGTGCAACTCCC




TAC






Collinsella stercoris

1736
CATGCACAATCGAGCGAGGTGCGATCGATGCCATCTCGGTGCCGTAGTCC


DSM 13279

CGATGATGAGCCGGAATTTATAATCAAGGCGTTGAGATGGGAGGTTGCTT




GTATGATTTGGTTCACATCCGACACGCACTTCGGCCATGCCAACGTGCTG




CATTTCACCG






Collinsella stercoris

1737
TCAAAATACCCTACTTGTGCGCCGAGCCACGTGAGGGAGATCGGCAAAAC


DSM 13279

CGCAGGATTTGTTGGCGTGGATGCGCTCGGGTGCAAGGTCGGTGGGCTCC




AGGCATCATTTGCAACGGCAAAACCGCAGGATTGCTGCGCACGAATGAAT




ATACATGACGGCA






Collinsella stercoris

1738
ATTTTGACCACTATCGCAAAAACGTATCAACCGTCGAGGGCGAGGGCCTC


DSM 13279

TGCGCCCGCTGAGATGCGAGTCTGAAAGTCGGGTTCTGCGCGAAGTGATC




GTCAACTTGACACATATGCCGAGCACGACGGCGGGTTTATCATGGGCGCC




C






Roseburia intestinalis

1739
AAACGGCAAAAAAAGTCCCCACCAGTAATACGATTGGAAATATATATGAA


L1-82

ATAATTCCAAATAAACCAAAAAAGAAGCGGCTGACTGTACCGCCGATCGT




ACCGCCAAATCCAAAGTTACTGATAAACAGCAAAAGTGAAACAGCAACCA




CGATCCATAAAA






Roseburia intestinalis

1740
ACATATGGGAAAGTCCGCCTCCGATCAGCCCGCCTCCACTTTTTGCATCA


L1-82

AAACTGTACCGATAAGCATCCATTGCACTTTTTGCATCAGTTCCATCGGA




AATGAGTTCAAAAAACATACATAAAAATGCCACAAACAAAATGACCGCCA




CTAATTT






Roseburia intestinalis

1741
CGTTTTTTGTCCATGACGCTATGGTAAACCCAATACTCACAAACGTCAAC


L1-82

ATTCCTATTGTCTTTTAAGAAACATTTTAGATTTTGGTTATATTTCCACA




TATGGATATCTTCTTATCTATTGATATCTGATTTAAGCAGGACGATAAGC




CGGGTTATGTCTAAAC






Roseburia intestinalis

1742
TAAATATCTCTGGTTCTCTTCGTAACGCTTTGACACATTCCGTGAGAGTA


L1-82

TTCTGAAATAGGAATAAATTAACAATGCGAGGCCAAGCCAGTATAAAATA




TTTACCCGGAAAAATGCAGATAAAAATGTGGATACAAGTAACAGAACCAT




CG






Enterococcus

1743
TGGGTGGCATGCTTCAATTTTTTGATCATATCTTTTCTCGAAAAGTAAAG



gallinarum EG2


ACTGGCTATGAGATGCTCTTT






Enterococcus

1744
GGGTCGTCTTGAAGGTGGCTGAGGTCCCGATTGGCTTCGGCAAACAAGTG



gallinarum EG2


CAAGGTGTGGTGAAAAACTTCTTCCCGAAAGGATTGCGGTTCACTCTGGA




ATAAACGA






Enterococcus

1745
GGTAGTGCAGGATTCATCGCATTGGCGATAACGATTGCTTTTTATAATTT



gallinarum EG2


TGGAAATCCATTAGCAGGGGAAGCCATCTATAAAGTTTGGACTCAAGAAT




CATTTCCAACAGAATCTC






Prevotella copri DSM

1746
CAATAGACAATTCTTGGCAAACAATTCTGGCAGTCTACTCCTTCCTGATA


18205

TTTGCCCATTCCTTCGCAACCCTGGGAGGAGCCTTCTACCGTAAGTTTCC




TGTGCTTCTGACAGCATGTACGGGATTGGCACTTTGTCTGATTTTGGGTT




ATATTATCAA






Prevotella copri DSM

1747
TTTCGTAGGAACCTTGCACATCGAAGAGGGTTCTTCTGCTCAATATTGTG


18205

CTGTATTCACCTCGTCTGCAGTATTTCTTGCTTTGGCTGCCTTCAACTAT




TGGGCATCCTACAAGCTCTTCACCCGTATGCAGGTTATCTGCAACAA






Holdemania filiformis

1748
CAGACGACCTTGCGGACTCTGTCTGGAAACACGCTGTCCGCGTGCCATCA


DSM 12042

GAAGCTTTTTCACTCGATAATAATCCAGCGTTAAACGA






Holdemania filiformis

1749
ATAATTCCGGTGCCAGAACAATCGCGGGATTAAGAAATCCGATCGCCATC


DSM 12042

GGCGTAGGTGCTGACCGGCTGGCAA






Holdemania filiformis

1750
TTCGACTGCCCTTTTAATAATCGCTGATCACAGGCCATCTCGATATCATT


DSM 12042

TTCCGCCTGATTGTTCATCAGCCAGACGAACGGATTAAA






Holdemania filiformis

1751
TAGAGGGTCTTTTCAGTGAGCTTGATTTCAGGATGGCTTTGTAGAATGGC


DSM 12042

ATAGACCGACTGGCCCTGCGCTAACAGAGGCTTGATTAGAAGACCGAGCT




GCTTGAT






Helicobacter bilis

1752
CCTTTTAGAGCAAGATGAAGCAGGGCTATTATATGAGTATCTGGGCTTTA


ATCC 43879

TTAAAAGCCGTGATGACA






Helicobacter bilis

1753
TTATGGCAAGAATACTTTGCAAAAACGCATACGCATGTATCACAAGAAGC


ATCC 43879

AATCATTAGTAGTGGTAAAAA






Slackia exigua ATCC

1754
GGATATGCATCGGCGGTGTAGAAACGGCGTGCATGCGACATCGCACTTAG


700122

ACGGGGACATGCGAAGCGCATCGCAGCGTCTGCGTCTAGACAACGGGTGC




GCGGCGACGCAGCGTCCATGCCTGGATAGTAGGCGCGATGTACCGTACCT




CGAATGTTGTCTGC






Slackia exigua ATCC

1755
CGAGCGCGCCGTCGCGATACATCAGCAGCGCTGCCGTTTTAGAGAGGCGC


700122

TCTTCTTCGTGGGCAAGCTTCACGTCGTCCATTATGCCCAATCTGTTTTC




GAGGGTCATCCAACCACCTGCCTGCCGGTACCGAATTCGG






Slackia exigua ATCC

1756
GCCGAGCTTCTGATAGATGTGCTTGATATGGGTCTTGACAGTGTTGTAGG


700122

AAAGCACAAGCTCCTGTTCAATGAACTTTCCGTCGCGCCCGCGCGCAAGC




TGGCCCAGAATCTCGAGCTCGCGCGTCGTGAGCCCATATGCTCGAGCAAG




CACTTCGCACCTCT






Anaerococcus vaginalis

1757
TTTAATAGTAAGTTTGATTATAGCTATTTTGATGCAATATATAATTGGAG


ATCC 51170

TTCCTATAATTTGGTTGACAGAAAGTGTAAATTCACTTCTTAAAAGTTTA




AGCTCTAAGCCAGAATATTCTATGATTTTTGGAGTTG






Anaerococcus vaginalis

1758
TATTCTATAGCGGACAAAGCTGGAATAGCACCAGGAATTATTTTGGGTGT


ATCC 51170

TTTATGTAAGACAAATGGTTATGGATTTTTAGGTGGTATAGTAGTAGGAT




TTTTAGCTGGATATTTAACAAAAATAGTTTTAAGTAATTTAAAACTT






Anaerococcus vaginalis

1759
TCTGAAGTTAATGATGATGATAATGACAATAGGTTTTATGAACAAGTAGG


ATCC 51170

AAGATTTCCTGAAATAATA






Collinsella

1760
GCAGCAACTGCCTCACGGGCTCATTGAAAACCGCATAGCACCCCAACGTC



aerofaciens ATCC 25986


AGCACATCACCCACGGCAAGCACATCACTCACGACAACCAACACATCAAC




AAACAGCACGAACATCAACCGATTGGAGAAGCTATGACAAACCACCGCGC




TGAAGACGG






Collinsella

1761
TGGATGAGATTTACAAGATTAAGACGAGCGGAAAATACCATTCCTTCGTC



aerofaciens ATCC 25986


CCTGTTTATCAAGATCGCGGAAATCATCGTTATGCTATTTCCCGATCGGC




GCCAAAACAGGACTTTTCTATCCTTCTATGCGAGGACAGTAAGTCTGGTT




TCCAGTTC






Collinsella

1762
AGTAGATCGAGCATATGAAGTTTGAGTTTTAGCGTGTAAGAGAAGGAAGA



aerofaciens ATCC 25986


TCGGCAAAGTACAACCCCAAACGCAATGACAACTTAATGAGGTAACTGCC




ACATGTGAGGCACCCAGGGCATTGCTGTCTCGATTTTGAGCACGATGCCT




TCCGCCT






Dorea formicigenerans

1763
TCTCAATCTTTCTTCCTTTTACAATTAATGAAAATGTCCCCGTCATTACA


ATCC 27755

TTAATTCCTGCAAATACACCATCTACATCGGTTACAAGGATACCATGACA




CATTTTTTTCTTCAAGTCCAAAATTGAGGATTCTGTTTTCTCA






Dorea formicigenerans

1764
TGTTCCGAACCAATGTGCTGCACTTGGTCCCTGATTGATTGCACCCGAAT


ATCC 27755

CAATTTTCTCATATCCATACCCACTGATGGCCGGTCCGAATATAATGAAA




AATATAAGAACAAGAAGGATAATACCTGCCAGAAGCGCTACTTTATTTTC




TTTAAATCT






Dorea formicigenerans

1765
TCTCGGACGAGCAGGAACCTCTTCATCTTTCTCAAGTCCGATGATTTCAA


ATCC 27755

ACATTTCATCTGGGATATTTTCTGTCATATCAAATGTATTCTGGTTTTCC




ATCATGTA






Dorea formicigenerans

1766
ACATCAGATTTTGGCGATTGTTTAAAACCATTTCCTGTAGGTTCGATTTT


ATCC 27755

TAATTTTTTCGCAAGTTCTTTATTTACAAGTTGTGTTTTAAATGTTCCTT




CTTTTATCAGGTATTTTTCTTTCACAGGTATTCCTTCATCATCAATTCTA






Ruminococcus gnavus

1767
TTGAAATTTGTAGCAGTAGCGGAACCAAGAGCGGATCGACGGGAAGAGTT


ATCC 29149

TGCAAAGCTTCATGATATTGCACCTCAAAACGCTGTGGAATCGGACATGG




AGCTTTTGAACCGTCCGAAAATGGCGGACTGTGTTCTGATCTGTACACAG






Ruminococcus gnavus

1768
TTAATCATATGTTTCTTTCCTTCCCATTCTCTTTTTTTCTTTTGCAGAAG


ATCC 29149

TGATCACGTGCCGGATTGCCATGATTGGATGATAAAATAACATTCTCGGT




CCGGCAAACCGCATCACATTTCGAATTTCTTCTCTCATCTGCGGCTTGTA




ACAAT






Ruminococcus gnavus

1769
GTAATTCTGCTTCCGCCGGTGATCTTGTTCTGCTGTTTCAGCAAGTACTT


ATCC 29149

TATTGAAGCACTGTCTGGTGGGGCAGTAAAAGGCTAAAAGGAGAGAAAAA




ACATGAAACAAGTAACAGCCATTCTTTTGGGAGCCGGACAGAGAGGGGCA




GAG






Ruminococcus gnavus

1770
TCCAATGTTTTTTCAAACAGCTGCTGGATCATATTTCCTGCCGCAAACAC


ATCC 29149

TGCTGCGATCTGAAACAGCAGTGCGATCCACTGCCAGAAAATATTAGCTC




CGATATATTTTC






Campylobacter rectus

1771
TTTTATGCTTTTTATTCCAGCAACCTTCATTAAGCTCTACTAATCCTATT


RM3267

ATGTTTTTAAAACTTAAAATTTGATAGTTCTAAATTTATTAAAGTGCTTA




TTAATCGTATCGCAATTGATACTCGAAGGCTTATTTGTT






Campylobacter rectus

1772
TAAAAGAACATAACATAAAATCCATAGATTTACGTTATAAAGAAACACAG


RM3267

ATAGATAACAACGGCAATCTAATCAAACAAACCTCTACCGTTAC






Campylobacter rectus

1773
AATTTATCTTTTTGACAATCACAAAATTTAAATTTGACAATTAGCGGCCT


RM3267

TGTTGTTAGATTTGAGGATAAATTTTGGCTCAAGATAAGTTAAAAACT






Campylobacter gracilis

1774
CGCCAAAATTTAGCGATGTCAAGTCCGATTACACGAGAATAGTAGATGTT


RM3268

CATCACGCAGATATTTAGCAAGGTGATCGCAAAAGCCGTACCGCACGCAG




CGCCCACGCCACCGTAAGCCTTAGCTAGCGGGATAGAAATAGCGATATTT




ACGAGCGTG






Campylobacter gracilis

1775
AATTTTATCACGCAGACTTCATTCTCGCGGCAGAAAAAATTTGCCAGCAC


RM3268

GCTTAGCACGCGCTCGGCACCGCCCTTGCCTAAAGCAGAAATTACCATAG




CTATTTTCATTTTATGAGCCTAAACTTATTTTTAAAAATTTCATCG






Peptostreptococcus

1776
AACAGATTATTCCAATATGTTTTAGCCTTTTCATAATGTTTGTTATCTGG



anaerobius 653-L


CAAAACTACACCTCCTCTACTTAGTTTTTGTTTAGATCAATAAATTCTGC




TATCTTGTTGACAGACCTAGCTTGCTGTTTTTCAGGACTAATAATA






Peptostreptococcus

1777
TATTTATCAATTTATCATTTAAACTATTTAGGTGGCCTTCTGAAGATATC



anaerobius 653-L


ATATAGTAGTAGTATTCCATGGAATCACTTGTCTGTCTCATAATATCTCT




TGCAAGGTCTGTAAACCTCATGTCCCTACTAGATGGATTGAC






Peptostreptococcus

1778
GCTATTTCCACCCTCTATAGCATAGTATTTAGACTTGCTCGGTAGGTTTT



anaerobius 653-L


TCTTGGCTTCCTTAATTCTAGTAAAATCAAGAAGTCCGTCATTTGTACCC




CATATAGACAATACCGGCATATTTACTAGGTT






Peptostreptococcus

1779
GGCTATTATTTGATTTATAATGTGCTTAAATAGGTATATTATACTCCTAG



anaerobius 653-L


CAGTCACCTTGTCATAATAAGATGGCTTGTACATAAGCTTTATAGACATT




CCCTTTGTAAGACTGACCTTCATATATAAATCAAGGTCTTTCTCTTTTAT




CG






Prevotella histicola

1780
ATAAAATATGAAACAAAAAACTTTTCCTTACCAGACAAGATGCCATTGTA


F0411

CCCTGGAGGTGATGGAGCATTAAGAGCTTTCTTATCTTTGAACTTACATT




ATCCTGAAAAGGCACAAGCTTTTGGTGTAGAAGGTAGAAGTCTCATGAAG






Prevotella histicola

1781
GGTTGTCTTCAATGTTGAAATGGATGGAACAGTGACGGGAGCTCGTGTTG


F0411

CAGATGTGAAGAATGCCCGTGGAACCAGCAAGCGTTTTATGAAGATGGAA




CCAGCAAAACAACAACGGATTCTTAAAGAATGCGTTGATTACTACAA






Prevotella histicola

1782
AAATGGACACCAGCTCAGGAGAACGGACGCCCCGTTCAGAGCAGGACTTC


F0411

TCTGACAATTGCCTTTCGGGCCTGACCATTAAAGAAACTAACTCATGATA




ATACGAAGAGAGCATATCGATAAGAAGACAAGCTATAG






Prevotella histicola

1783
GTTGTCACGTCAACAGAAACAGGATCAGACTCTGCTCCGTCAGCATATAC


F0411

GGCAGTTACGCTGTAGTTATGCTTTGTTCCGTCAACGGTAACATCACCAC




TCTCCAGCTTGATGCTGTTAGATAGATATTCACCATCACGATAGATGTTG




TATGA






Helicobacter

1784
TGTCAAATTCTTGCAAGAATCGATGAAAGTCGTCTATGATCACCACTTGG



bizzozeronii CIII-1


TCAAGGATTTCATTGAAAAGGTTTGGATTCTCAATACCTACCAAATCTCG




CCCAAGATGTTGGAAATTCTCCAAGATGAGCTCATGCTAGACATTGTGCA




AA






Helicobacter

1785
TGCTAATTTCTTTGGTGTTGCCCAACATCGATGCCGCCACCTCGTGCGCC



bizzozeronii CIII-1


ATGCGGGCGATCTGATTGACCTTATCAGCGATGTTCTCAATGCTTTGGTG




CTGATCTTTGGAGTAAGACACGACATTTTTGGCATCTCTTAAGGAATGGG




TGGC






Helicobacter

1786
ACCTCGCACATCTAACTCCATAAAAGTGGGGGCGAGCAAGGACACCCTAC



bizzozeronii CIII-1


CTACCAAACCGACAATCATGCTAAAACTTGCTGATGTGCTGGCGGCTACT




GACCCTTACCAGCTCTCCATTCTCTTTATCTAGGGCAAAATCCGTCATTT




C






Enterococcus hirae

1787
GCCAGTTAGGAAAGTGATTTTTGAACTAGTGGCTATTGTTGCTACCGCTT


ATCC 9790

CTTCAACGGTTACTTTTTTCATTTCCTTTCACCTCTCTCTTATTATACAT




CTATCTTTCGTTTATTGATTGATCGTACTTTTTATTCAAACGATTTCATT




T






Enterococcus hirae

1788
TGACAGAGTATAAACTCAAATCTTTTGTTGTCTCTTTCTTTACGAGAATC


ATCC 9790

AAACGAAAGGATGTATCAATCTGACAAGTTACTCGACTGTCGGCAAAAGG




ACCATCTCCGTCTTCAATATCAAGTAATAACTGTTCATCTTCGTT






Enterococcus hirae

1789
AAGTGAATAATATTTGGTCTTGCTGTTGGATGATAGATTTTTTTCACAAA


ATCC 9790

TGAATAAAATTTTTGTGGCTCATTGTTCATTGCTTCATGACTTAGAAGGT




ATTCGGGTTGTTCCAATCCATGATAGATTCCTTTTAACGATCGATAGTC






Enterococcus hirae

1790
ATCCTAATTTTATCAATGCTTGACTCGCTTCTGGAACAAATCCTCTTCCC


ATCC 9790

CAATAGTTCTTATTCAGTGCATATCCGATCTCTGCGTTATCTTCATTTTC




TTGTACTCTTAAATCAATTGTCCCGATGAATTGTTGATTTTCTTTTAA






Bacteroides nordii

1791
CACGGATATAGTATGTATTTCCCGGCTTCAAGCCACCGATTGTTCCATAA


CL02T12C05

ATCGTTGTATCGGGCACTAATGAAATCTCTTTCAGATTGGCAATGGTAGG




TTCTTGTACCTCCGAACTATAACAGAATCCTTTCTCTGTCACCAAAGTAC




TATTG





Bacteroides nordii
1792
ACATTGGTAGAGTCTTGCGGAGCACAGGAAGAAACAACGGGAAGATCAGT


CL02T12C05

CGTCTTTGTTGTTACATAAGTTGTTGTACCATACCCACGCCCGTTGGAAT




TGATAGCATACGCACGCACTGCGTAGAGCCTTCCCGGTTTCAAGTCATTG




ATACGTAC






Barnesiella

1793
TTACAATATGATATGGAAGGAACTTATGTATTACCACGTTTTTCCCTAAC



intestinihominis YIT


CTTATTTCAACATAACCATAACAACTTGAAGAATTCCTAAAACTACTGTT


11860

CCAAGAACAAATTTTTTCTATCCATTCTGTTTTA






Barnesiella

1794
CTTTCCCTTCGACAATTCTCCATAAATTCAATCGTGAAGAATTAGGAGAA



intestinihominis YIT


TCAATATGATCTAATTTTTCTCCATTCCAATAATTCTTAATACTTATATC


11860

ATTTAAAATAAATTCTTTTAACTCTAAATTTTCTATAATAAG






Barnesiella

1795
CGACACCTTGATTCATTCCACGCGTGAGGAACGAGATGACAAATACAGCA



intestinihominis YIT


ACGATAGCTAATATGCATATATAACCCGTCTTGCGCAAGCCC


11860








Barnesiella

1796
CGAATGCGAAACTGCGAACCGATTCAGCTCCCAACAAGAGAATACACAAC



intestinihominis YIT


AACACGATAATAGTACTTTGCCCCGTGTTGATCGTACGAGGTAACGTACT


11860

GTTC






Lactobacillus murinus

1797
ATCAAAGCAAAAAATATCACGATAAACACACTGTACCAAAGGGTGACTCT


ASF361

ATAAACGATACTCCCCTTATGCTTCTTTGATAACATAACCTAAACCTCGT




TTTGTCTTGATCAAAGCTGGATCTTTTGTCTTTTTACGGATATTTTTG






Lactobacillus murinus

1798
TCCAAGCCTTCGTTTTGATCGGTGGGGCTTTGCTCATCATCTTTATCGGG


ASF361

GTCTTCTCGATCGATGGTGGCTTTGCGACTGTTGCCCATACCGCCGCTAG




CAACCACAAGATCCTCTCCAGTGCTGACTTTAAGATCAATGATCTAGCAG




CTTTTGT






Lactobacillus murinus

1799
ATTCATTATTTATCATATCGGACGGGTCGCGGTGGTAACTTACCTACCTG


ASF361

TATTAGCGGTCGCTTCAGTGACTGATATCGATCCTTTACTCGTTGCAGCC




TGTGT






Lactobacillus murinus

1800
ATTGTTCTTTGTTCTTGCTCTTAGCTTAGGTAAATGGTTCGTTCCTAAGT


ASF361

TTTTAGCTGTTTTTAGTCGGTTAAATGCAAGTGAAAATGAAACGACCGCA




GCTTTGGTCCTTTGTTTTGGTTTTGCTTTTTTAGCAGTCAGTCTGGGGAT




GAG






Eubacterium rectale

1801
TGTACAGTGAAGCAGGTTCTTGATAAACTCTACAATCACTCCTGCTAACG


CAG:36

GTCCGTATGCAAATGCTCCGATAAGTGC






Eubacterium rectale

1802
GGTGACAGTCTCTTGTATATAAGCACTGTGATAAGTGTTGATATAAGTCC


CAG:36

CTTTACGATGGTAAA






Cloacibacillus

1803
AAGCGGCGGCCGCGTGGCGGCGCTGACGGCGGAGGAACCTTCGGCGGCCA



porcorum


GCGTCATTGACGCCGCCGGACTCTGCGTCTCTCCGGGTTTTATCGACACA




CATATGCACGATGAAGAGGCCGAGGACGGTGACACGGTCGAACAGGCGCT




TTTGCGGCAGGGG






Cloacibacillus

1804
GTGGAAATACTTCGTTCCCTTTTTCCTCCTCCTCTTCGCCGTGCAGATGG



porcorum


CCTTCATATTTGTCGCGGTAATGATCAATTATCATTAAAGATTACAGTAA




TGGAAAAATTTGAACTTCTTATCCGGGGAGGAGAGGTCATCCTCCCCGGA






Cloacibacillus

1805
CGCGTACGGTTACGCCTTCACACCGACCATGATATCGGAGGCTTTGATTA



porcorum


TCGCGCAGACCTCTTTACCCTCTTTAAGTCCGAGGCGTCCTGTGCTGGCC




TTGGTGATGATGGAGGCGATCTTCTCGCCGCCGGCGAGGGTCACGAGGAT




CTCGCAGTTTACCGC






Cloacibacillus

1806
ATCACCGCATAGGCCTCGGCGCCCTCTTTGAGGCCAAGGTTTTCGCAGCT



porcorum


GGTCTCGGTGATGATGGAGGTGAGCATCGTACCGTCCGCGAGGCGAAGCG




ATACTTCGTCGTTGACGGCTCCCTTTTTAACTGTGGCAACAGTAGCTTTT




AGCT






Blautia coccoides

1807
AAAACTTTTTATAAAAGAACAAGTCCATGTAAGCAGGGACTTATACTCTC




TTACATGGACTTGTGAAA






Blautia coccoides

1808
CACTACAGGTGCCATAGCCTGCACTGTATATACTGCAATCCTTTCCTGTT




CTGGCCGCCGGAGGATTTATGTTGGTTA






Ruminococcus bromii

1809
GCGGTGATCCTGCTGTATTGATTGCATCAAGCGATAAGGCGAAAACTGTA




CTTGGCTGGAAACCTGAGTATGATGATCTGGGAACAATTATTAAAACAGC




GTGGAAATGGCATTCAACACATCCCAAC






Ruminococcus bromii

1810
TACGAAAAAGGAATCATTCCCGAAAATACAGTAACATATCGCGACTTATT




TGATGTGAAACTTATGGCTTCACTTGTTAAGCGACCGTCAGAAGTAATAA




GAGAATTTTGGCAGAATTATAGCTGTTCTCCTAAGCTTGCAACCGATAGC




T






Phascolarctobacterium

1811
ACAGGGATTAAAAATAAATGTGCGATTTTCAACAAAGCTTTTAACAAAGC



faecium DSM 14760


CGCTGTGTGTTAAAACGCTTGATACAGGCTGGGAATGGAATTTGTCAGGG




AAAGTAGTACATGGTAATCAACAAAGGTTATACAT






Phascolarctobacterium

1812
ATAAGTATGCTGTTACTGTTGGCGTTGCCCATACCAGTATAGCCGCCGTA



faecium DSM 14760


GATATCGCTGCTGACAGTGCTGCCGTTAATGATCTTTACTGTGTTGCTGC




TGGCTGAACCTGAACCAAAGGTTCCTGTTGAGGAATACCCTCCCATGACC




TGGGATGCTAT






Phascolarctobacterium

1813
CGCCCGTATTTTCAAAACGAAGCGTATTGCCTGTTTTCGCATTGCCACTG



faecium DSM 14760


CTGGAGTCGCCGCCATAGATGGTTCTGCCAGACAGATCTACTGCGCCTCT




GATCGTGATGCTGTTGTTATTGGCGGTTCCATAACCAGTATGGCCGCCGT




AG






Gemmiger formicilis

1814
ATTTGTTCCTCGGCAGGAACCTGTTTGAACGTTCATTAAGTAAAGAGAAT




AGGAAACACTTTTTGTAAGCCA






Gemmiger formicilis

1815
CGAAAGTTTCGGGCGGTGGTTTCGAGCGTGGAGACAATCTCAGCGTAGGC




GATGTTCCGCTCATTCGTAGGGGCTTCGTACTGGAAAATCACGAAGAAGC




GACGGCTAACAGC






Gemmiger formicilis

1816
GGGCAGGGTATCCGGGTCGTCATAGTAGACAACACGGTACCGCTTGCTGA




TGAACCGCAGATGCCGATGTCGGTTGGCTTCGAGATTGCCCTGCGGTGTA




GGCGTAACGATACGGCGGCGCTTCATAAAGCGGAACCAGTGTGCCACA






Helicobacter salomonis

1817
TTTGATTTTCATTATTATTGCCATCATTGGTGTGCGTATGATGAGCACAG




AGGGCGGGTTTGGGGATCGTTTTCTCTCAACTAGTACTAAAAATGTTAGC




TATCACGAGCTTAAACAGTTGATCGAAAACAAAGAAGTGGACAATGTAAG




CATTGG






Helicobacter salomonis

1818
ACACCCAAGAGAATGAAGTATATACAATTTATGCAAAAGATTTTACTCTT




CTGCGCAGAGACCAAGAAAATGAATGGGTGTGCTTGACTCTGTGCAGGCA




TTTTAACTCTTAAGGATATCTATGGACAATCGATCGCAAAATCCTAACCC




CAA






Helicobacter salomonis

1819
ATCCAGCGAACAAGCCCAATGCACGCGCATGCAAACCAAGTACTTTGCTA




CACTTGACAACCACTACAGCACCCTACAACATGCCTATACAATATTGTTG




CAAGACATTGTCGCTGCTTGCCACACCCGCGCTAGCAAACAGGCCGTATT




GCAGAGC






Helicobacter salomonis

1820
AGACATCCAGGGCACACCTTTAGCTAATGGTGTAGAAATCCAGCGTAGCG




ATCTGCTCGCAGAAATGCAAGACTCTCAAGAAACCCAAGCCCCGCTCCCC




CCTCCAACCCAGCAATCTTTCCATGCCCTCTTGAATAATTGTGCTAGGAA




CGATCTTTTTAA










TABLE 17B-SEQ ID NOS: 1821-1826










Gardnerella vaginalis

1821
GCACTAGCTGAGCATGTAGGAATTTTCAGGACAACAGTTGCTGCATGCTT




AGCAAGAATGATTGCGGGATTATGTATTATCGCAATTGTATCGGTATCCT




TCTCAACC






Gardnerella vaginalis

1822
AGTAAATATTTATACATTCCAATTCTATTTACAGAAACTATCGCATTAGC




AGGAATAATAATTTTTATTTATTTTATTTACAAGCATGCAGTTAGAACCG




GAATTTATACTTCAGACGTCGCAGATGAAAATC






Gardnerella vaginalis

1823
GGATATCTGCGTTTAGCAGCTAGTGATTTTCTTTCTAACTATGCAACTAC




ACTAGCCTCAACTACTATTCAATTATGGCTATTGAACTCACTTTTTATCG




GTTTTCAGCAGTCAAGCCATTATTTATCGCTTTACCTCACCCTCACT






Klebsiella pneumoniae

1824
TACCAGAATATTAATACCATTATGGGCCAGGGTGTCTGCGATGTCTCCCA




GTTCCCCAGGACGTTCCTGAGGTAACTTTCGTATCAGAGGACGGCATACA




TTGCTGACCGTAAAGCCGGC






Klebsiella pneumoniae

1825
ACCCTTGACGAGTCAGAGAGGGCTGAGGCCACCATCGCCATAGCGGTTTC




CAGCGCCTCCTCTGGCTCACTGTCTTGCTGCGGTTTCAGGAGATTCATTC




AGAAAGTATGGCCCATTTTTTGGTCACTTCCGCTGCCCGGGCGTCATCAT




CGGTAAGGAGAAT






Klebsiella pneumoniae

1826
CACCGTCTTCCACCAGAAAATGCGCATGTCCGGCATCAGGCGTGGTGAAT




ACGCCGCCGCCTTCGAGCCCCACGCCGTTGTTACCCAGCGCCATACCCAT




TGCGCCAAGCGAGCCCGGCGAGTTGCTGAGAATCACGTGAATGTCATACA




TCTGACTGTGCTC










TABLE 17C-SEQ ID NOS: 1827-1979










Escherichia coli

1827
AAAAATGGCGGAAGTAACACCCGACACGATTCGTTATTACGAAAAGCAGC




AGATGATGGAGCATGAAGTGCGTACCGAAGGTGGGTTTCGCCTGTATACC




GAAAGCGATCTCCAGCGATTGAAATTTATCCGCCATGCCAGACAACTAGG




TTTCAGTCTGGAGTC






Escherichia coli

1828
TACCTGTCAGGAGTCAAAAGGCATTGTGCAGGAAAGATTGCAGGAAGTCG




AAGCACGGATAGCCGAGTTGCAGAGTATGCAGCGTTCCTTGCAACGCCTT




AACGATGCCTGTTGTGGGACCGCCCATAGCAGTGTTTATTGCTCGATTCT




TGAAGCTCTTG






Bifidobacterium longum

1829
GCCCCCGACCGTATGAATGCGACGGCCTATGCCCCCACCGAGTCCAAGCG




CCGCAAGCCGGCGGGCCCAATCGTGGTCTTGAGCATACTGGGCCTTACCT




TCCTCTCCTTTGCCGGTCTGATGGGACTGATTTGGGTAAACGACATGGGC




ATTATCGGCATTATG






Bifidobacterium longum

1830
CTCCACCATTCTCGCTCCTTTCGTTTGCATAGCGCAGCAGGCGGAATACT




CGGGCATAATCGATTCTCCATTGAACGCAGTCGGCGGCTATCTCTTCCTT




GGTTATTCCCTTGGCAAGGGCCGAGTCAAAGATCGGTAACGCAAACCGAA




AATCAAGGTCTAT






Clostridioides

1831
GAGAAGGATAATATTTCTTCCATGAAATTTTTGAGGAATAATAATGTAAA



difficile QCD-66c26


CGATAGATTAATATATACGAATGACATTGTAGAAGTAGTGATAGATTCTG




CAAAGCAAGAAGTGTTAAATAAAAAAATACTTAATGGTACTAA






Clostridioides

1832
CTGAACTTGATACGTTATTAAGTGATAAAGAATACGAGGCTGGATTAGAA



difficile QCD-66c26


TAATTAAAGATTAAGATATTATTATAATTGGGGAAAGTATAGTAAAATTT




GAGTGCACGCAAATTTAGCTATACTTTCCTTTTATCATATAT






Clostridioides

1833
ATAATAAGTATAATGAAAGAGAAACTTTTGAAAGAATACATTCTAAAATG



difficile QCD-66c26


CTATGGGCGTACATATCCTGAAGATAAAAATAATATATACATATATTCAC




TGAACATATTTGCTAAAAAAGAGATTTTTATGA






Lactococcus lactis

1834
CCAGCTGATAAAAGTTGGAGCCATTTTAAATTTTTATATTTTGCTAATAA


subsp. lactis Il1403

ATCAGGCTTTGGCAGACCAACTAATACTTCTGTTTCCAGAAAATCTGATT




CTGTCAGCTCTTCTTCTTTTTTAAATTGAAACTGGTAAGCTTTATTTTTC




TT






Lactococcus lactis

1835
TTTTTAACAGTTTTCTTTTTACAAATTGACAAAATAAGAGTAAAATTAAA


subsp. lactis Il1403

ATATAGATAAAAACTTATGAAGGAGTCGCCTATGTTAGATAACTATAAGA




AAATCCTTGTTGGTATTGATGGTTCGGTAGAAGCTACTAAAGCTTT






Lactococcus lactis

1836
ACTCCTTTGTATAAAATGAATTTTAAAGACTACTTAAAGAGAATAAAGTC


subsp. lactis Il1403

AAGCCTTGTTTTCTTTATCTTCTTTCTTTGATTATAGCACTTTGCCTTAA




TTTTATAAAAAATAGAAGTTTGACAATTCTGAAATTATCA






Lactococcus lactis

1837
TCATAAGACTCTTGACAGCCATTTTTCACCCAATTTCCTAAAACATAAAA


subsp. lactis Il1403

TAAACCAGCAGCCATAAAATCGTTCCAAAATTTTTGCTTTGTCCCTAGAT




AATCAGCCCAAGTCGTTGTTTCATTATAAAAAATAGTCATCTTTTG






Chlamydia pneumoniae

1838
CCTCGATTCATCCATGCATACTCAGGATATCTATGTTCTAGGGCTCTCTC


TW-183

CGACTGTCTATATTAGAGGGAACTATCACGTACAGCACTACCGTGTTCGA




GGATTTTGGCCCTCTTGCCTGGATTCTCTAGCGGCCTGTGCGGAAAATAC




ATCAGTACTTCCCT






Chlamydia pneumoniae

1839
TCTCTATTCAGCCACACATTTGATAACGCGATACGGTATGGTGAGAGATG


TW-183

CCTGTTGGTTTGTTCTGAGGGCATGGGAATGCTTCCAGAAACGCAACAAC




AAACATCTCCTTTAACTTCACTAGAAGGGGGACATGAGGTAGCTCTAGTT




CTCAATCCC






Fusobacterium

1840
AATAATTTTAATAATGATTTATCAAAAATAAATTTAAAAGGTGATGTAAG



nucleatum subsp.


TATTGAAATTTTAAGTTTACAAAACCATTTTAATGAAATGATAGATAAGA



nucleatum ATCC 25586


TTAAATATTTAAGAGAATATGAAATTA






Fusobacterium

1841
GAAATTTCATATATGCAAGAAATTGACAGCTTAAAAAATCATTTTTTTGA



nucleatum subsp.


AATGATAGTTATAAGTTGTTTGGCTTCTCTTTTAATTACAGTTTTAATAA



nucleatum ATCC 25586


AT






Porphyromonas

1842
TTAATCGATAAAAATCCACGCCTTTAGCACTGTCCTGTAGGTGCCTTTTT



gingivalis W83


CCGTATCGGTATCTCAATGGACATCCCTAATGCTTTTGGGCGTTTTTTTG




TCTTTTAGGCT






Porphyromonas

1843
AAAGGATCTATTCGGTGGATGAGTCTTCGGGAGAGATCGAACACGAAAGA



gingivalis W83


CGATTCTTTTTCAATGAAGGCGGATATATGATTCGTGAGGAGGAATACGA




TGGAACCGTTCAGATACCTGTCAGAAAATGGGAATTTGTCCGCGATGACA




AGG






Helicobacter hepaticus

1844
AGCTAAGCAATTTAGTCAAGCTTTTACCAGATAGGGGTTGAGCAGGGATA


ATCC 51449

AGAATCGTATCTGCGCCAAAAAGAGCAGATTCTAAAATCTGATATTCTTC




TAAAAAAATATCGCTATGGATAATCGGAAGTGTGCTATGACGGCGCAAAA




G






Helicobacter hepaticus

1845
CCAGATAGGGGTTGAGCAGGGATAAGAATCGTATCTGCGCCAAAAAGAGC


ATCC 51449

AGATTCTAAAATCTGATATTCTTCTAAAAAAATATCGCTATGGATAATCG




GAAGTGTGCTATGACGGCGCAAAAGAGCGATGGATTCTAA






Lactobacillus

1846
AATCTTCAATTACTGTAGGTATATCATTTATAAACTCAATGACATTATCA



johnsonii NCC 533


CCAAATTCATCAAAAAGTTTTTCATTTATGTTATCAATTATAACGGCAGC




CATTAAATGATGCTCTTTAAGATATTTGTTTAGGTCTTTTC






Cutibacterium acnes

1847
GACGACGGCATCAGCGTCATGGGGGTTCAGCAAAATGTCGGATCCGGCTT


KPA171202

CGAAAGCCTGCACCGCAACACGCCCCCGGATCGGTTTCATAAAGGCGTTG




AACTCGGCACTCGCATGCCCCTCATCGGCCGCCGTGGGGTCATTGCCCTC




GGCGTCGGCAGC






Cutibacterium acnes

1848
CAGGCTTCTCGCGGCGGCGACGGGATCGTCGCTGGTGTCGCGAATCCACC


KPA171202

GGATTGTCGACCGGCTCAGGTGCCGTAACAGATCGGCCGGTGAATTTGGT




CCGGACAAGGTCCGGCTCGTGTATCCCCAGTATGGACGGCCCCGGCCTGC




TGCTGGGAGTTTC






Cutibacterium acnes

1849
GACTGCGAGGTCTACTTGCTCGTGGTGAGCTGCCAGGACAGTGGTGGTGG


KPA171202

CGCTGGCGCAGAACACGTCGGGAGTGCCTTCTCCCACGTCCAAGCGGCTC




GACAAATCGAGGGTGACGAAGTGATGATCGCTGGTGTGCATGGGGCGGTG




CGTGCTCAGGCC






Cutibacterium acnes

1850
GGACATCGCTGAGCTGGTAGACCGCTCCGCGGGCGGGGTGGGCGGTGGGG


KPA171202

TCACCGACGCCGACCAGACCGCCTCCGGCTGCGACAAATCTGCGCACTGC




CGAGGTGACACGCTCGTCGGCCCATTCGGATCCACCACTGAACGCAGTGC




CGGCGGCACCGA






Chlamydia trachomatis

1851
GAATTATTAAAAACCTATCGAGAAGGAGTTTTTTCAGCTTGGCTCTTACT


D/UW-3/CX

CACCTATGGGAATCGGCAGACACCTTATAATTTTCTTGTTTATTACGAGC




TATTCTCAGCTCTTCCAGACACTCTTAAACTCGAGTTAGAAAGACTGCCT




C






Campylobacter jejuni

1852
AAATATTTTAAATTTTCAAAATGATTTTAAAAATGTTAAACTTGTAAAAC


subsp. jejuni NCTC

TTGAACAAAACTATCGTTCAGTAGGGACTATTTTACAAGCAGCAAATAAT


11168 ATCC 700819

CTCATATCTCACAATGAGCAACGACTTGGAAAAACTTTAATC






Campylobacter jejuni

1853
CTTAGAAAATGTAAGTTTTCACTTGAACTTAAAGTTAAATTTTCATCAAT


subsp. jejuni NCTC

TTTTACAAAAGGTGCTTTAACATTATCATTCATAAAAGCAAAGTTAACCC


11168 ATCC 700819

CTTCTATATTCTTAACTTCACCTTTTTCAAATTTAACATCAAC






Campylobacter jejuni

1854
ATATATCGCTCAAGAAGTGAAAAAATTGCTAAATTCTGGAGTAGAAGCTA


subsp. jejuni NCTC

AAGAGATCGCCATTTTATTTCGAGTTAATGCACTATCAAGAGCAATAGAA


11168 ATCC 700819

GAAGCATTTATGAAGAAACAAATTTCTTATAAACT






Campylobacter jejuni

1855
AATTATTATTTTCTTTATAAGTATAATGAGCATTTAAGATTAAATCTTTA


subsp. jejuni NCTC

TATTTTAATACAGCTTGATCATCACCTAAATTAAGTTTTAGTTTAAAAGA


11168 ATCC 700819

ATTAGCAAAAGGCAAATTTCCTATATAACGATCATTAACCGC






Campylobacter jejuni

1856
TTTGTTCTAAAATTGGTTTTATAAATTCTTCTAAAGAATCCATTTTTCCA


subsp. jejuni NCTC

GTATCATAGAACAATTCTTTTAGCCTAACAGTGCCAATATCAAGAGATAT


11168 ATCC 700819

GCAAGAAATAATTCTATTATTTTTTATTAAACACAATTCACTAGA






Bacteroides fragilis

1857
TCTCACTCAAATTCCGCTCAAAACTGTTCGACAAAGCATTCTCATACTTA


YCH46

CCATCCCAATCATAAATGAAAAGATCCAACTGTAATCCACCGAACTGCAT




CGGCTCGGCCGGAGTCTTTTTATCCGGAACATACCGGCTGTGCCGATCTC




TCAACCGCGCCT






Bacteroides fragilis

1858
ACCGGTAAAAGCCAAACAGGTACGTCTTCGTATCCTCGACGGATTTGCCT


YCH46

GTCCTGCTATCCATACATTCGGAGTGTACAAACAATCGGCTTTATTTCAG




TAAAAACAATAGGTAGTTGGGTTGAATATAGGTTTGGGCTGTATCACACA




GGC






Bacteroides fragilis

1859
GCCTTTCCTCTTCCGGGGGAAACGTTCTATAATCTCCATAACAACGGGTC


YCH46

AGATAAGCATCGTATCCTGACGGAACCGGAAACTCCACTCCTTCAAATCG




TGCAGTATCCAGATATTCC






Lactobacillus reuteri

1860
TAATCTTTAATATTGGATTTTATTGGCCAGCTTATCCTATTGCCTTATGG


JCM 1112

ATGATTCTAGCTCTGTTATTAATTGCTTTACAAATTATTCACAATCATGA




ATTCATTTA






Lactobacillus reuteri

1861
TTAATTGGTAACTTGGATAACTTGATTGTAAAATAAGACGTTTTAATTGT


JCM 1112

ATTGCTGCTAACTTGTGGTATTATAGATACTAGTTAAATGTAAAAATAGG




TGGAGGTCGCATTCCGTTAGGTTGCGACCTTTCATTTCGTTGCTGTTCGC




TTACT






Lactobacillus reuteri

1862
TAACCACTTTACAAATTTAACTAGTAATAGTTCAAAAGTTGTTAAACCAA


JCM 1112

AATATGTTGAAAAGAAAAATGTAAAGCTTGTTGCACTAGGTGATTCCCTT




ACTCACGGTCAAGGGGATGAAACTAATAAT






Lactobacillus reuteri

1863
ATTCGGCATATTATGCTTTAACTATATATATTGGTAGCTTAATTATTTCA


JCM 1112

CCATGGTTACCGTTGATATAAAAATATTAAGTGAATTGAGGCTGGGAGAA




AA






Bifidobacterium

1864
ATCAAGCACCACATGCGCACCCCGCGCACACTCGACCGCAGCCTGCAGCC



adolescentis ATCC


GCTCACCAGCCTCACCCGCGGAAAAACCAGACGTGCACCCAG


15703








Lactobacillus

1865
CACTGGGGTCATCACCTCAACTATTGGTATTTGCGTAAAAAGATTGTAAC



rhamnosus GG


GACCCAATCTTTTGCGCAAGCGCCATCATTTTTGTGATAGTCTTAAATCA




TCATGAAACCTTTCTGAAAGGAAGTCTCACAATTGAAAATGATCAATCTC




GGGCGCTCCGGCTTA






Lactobacillus

1866
GGGGAGGCGTGCACATGCGTGATTTGGTCAAACTTGTTGGCTTGGACCTG



rhamnosus GG


GCTGTATTTCATACTAATAGTAAGAAGTATTTTTGGTTTGGTACTTTGAT




CAGTATCGTTTTGTTGCTTCTCCCTTTCTTGAACGGTGACTTAGCAACCC




CGCTTGC






Lactobacillus

1867
CATGCGCCTACCGAATCTTGATCGTCCGCGCGCAACGGAATTGCTCGATG



rhamnosus GG


CTGCTTACGATGTCGGAATTAACTTCTTTGATAACGCCGATATTTACAGT




AATGGTAAGGCCGAACAGTTATTCAGTGACGCACTGAAAAAAGCGAGCTT




CACCC






Lactobacillus

1868
TTCGTTGAGGTTGACAGTTCCGGTACCGTCAATTCTTACCGGGATGTTTC



rhamnosus GG


CAGTTGCACCGGTAATGGATACCGTTGCATTCTTATCAACGGTTAGATTA




CCACCGTCTTCAATATAGATTGGTGCAAAGCCATCTTTCACAAGTGCATT






Bacteroides

1869
ACAATCGTTGGGAGTCAACTTCGCTAAGGATACTAAAAAGTTGCAGATCG



thetaiotaomicron VPI-


GGGGGAATGTACAATATGGACATTCTGATAATGACGCTCGTCGCAAAACC


5482

TCTTCAGAAACATTTTTGGGGGAAACATCTTCTTTTGCTC






Bacteroides

1870
GATCGGTATGAAGGGGGAGTAATGATCAGCCGCTTTAAAGATGATGCGAG



thetaiotaomicron VPI-


TCTTTCAATTATAGGTTCTGCAAATAATACTAATAATAAGGGATTTTCTG


5482

AATTTGGTGATGC






Bacteroides

1871
AGTATCTGATGCGGTAAAAGCAATTCCCCTATATCGCATGGCTGAGAAAG



thetaiotaomicron VPI-


GATTAAGAGAGGATGGGTATCCGATGGGACCG


5482








Lactobacillus

1872
GGGTTCAATTCCCCTCATCTCCATAGATAAAAATAGAACCGCTCTAGTAA



acidophilus NCFM


GTTGTCTAGAGCGGTTTTTTTGAT






Lactobacillus

1873
TGGTATTACTTTTGGAAATGGTGAATCTCCTTTAAAAGAAACAAAGAATA



acidophilus NCFM


TAAATTTAAAAAATTCAATTTTCAA






Desulfovibrio

1874
GTGAACAGCAGAATATGGGCTATGCCGTCCAGCGACATTTCTCCGTACTG



alaskensis G20


CAGAAACCTGCTCATTTCCGGCTCACCGTTCAGAACGCCGAAGGTGCGGT




TCATGCTGCGGCGTACCGATACAATCTGGTCGTG






Desulfovibrio

1875
GCTGATTCCAGTATGCGCGCCGATGTGCGGGCGCCGTCCAGATTGATGCC



alaskensis G20


GGTTTTTTGCGGACGCCGCGCCAGCGCGGCCTGCATCAGCTCTGCCAGCT




TGTGCGGTTCCAGATCGTCGGGCTGCAGGATGCGCAGGGCGCCCAGACGT




TCGAGCCGCAGG






Bacteroides vulgatus

1876
TCCGTCTGATTCTAGCCTTGCACAACTATGATATCACTATACATGAACAA


ATCC 8482

GACAAACAGACAGCATTGCAACAAATAAATGAAGTGTGCAATGACTTTCA




GACCATGAGAAAACAATTGGAAGAGACCTATTCACAGACCCGTTTTATGG




AACAA






Bacteroides vulgatus

1877
TTCCGCCTCCGCACGGGTAAGGGGCAGCGTATTGTCGGGTCGGGTGAAAG


ATCC 8482

CCACCAGGATTTTCACCGAATGTCCGCTGGCGCCCATAAAGGCGCACCGC




GTCTGCGGCAACTTCCATGCCTCCCGTTTCACCTGCTCCACTTCCGCCCT




GCCCGCCAAGGG






Bacteroides vulgatus

1878
TGATGGTTCCCCACACTCTGAAACGGTGTGGCGTGGATACATCGCACAGG


ATCC 8482

GAGAATATGGCTGGAACCCAACAGCACGGACAGTCGAAGCCTTCAAAGCC




GCACATGCCCAACGAGAATTTGGCTTTCATCCAAATGACAACCATATGGC




TTTTCT






Parabacteroides

1879
GGTTAGTTATCCCAATATCCCGATCATCCGGAAATATCCGAATATGACGG



distasonis ATCC 8503


GCTTTTATGATAAACCGGATTATAAGAGGAATATAGAATTGATACGGCGG




CTTGTTGGTAATTGCATTGTCATAAGAGTCTCGGACGATACCTTTCAGGA




TAATATGATGCTGG






Parabacteroides

1880
AAAGATAAGGGTTCCCCATGAACCTCTTATCATCCGTTCGGATCATTTCC



distasonis ATCC 8503


GGTTGACGCAGGTTTGTACCAATTTCATTAATAATGCGGTGAAGTTTACC




GCTAAGGGATATATCGAGATCGGATACGAACTTAGCGCAGATGGAAAATC




GATCCTTATTT






Parabacteroides

1881
ATTACCAAAAACAGTGACCTGTTGCTTAAGCTGATCAACGATATCTTGGA



distasonis ATCC 8503


TCTATCACGCATAGAATCCGGTAGCATGTCTTTCTCTTATGAGAACCTCG




ATCTGAGTAAACTGATGGGAGATATCTTCCATACGCAT






Parabacteroides

1882
CGTGAACAATATATATTCCTTGGATCGGGTACGTTTATCCGGAAAGAACG



distasonis ATCC 8503


GGATTTCGATATCGGATATACCTAAGATAAAGCCGGATACGATGTATATA




AGCACACTCAGTACGAAATCGGCGAATGCCCTGATTAAGGGATTT






Lactobacillus

1883
GCAGGAGATTCCGGCGGTTGAACATGGGGAAAATGGGCATTTTCCAGTTA



delbrueckii subsp.


AAGACACGGACAAACGCAAGCTTTTTAAAGGCAAGATCAATTACCGGCAA



bulgaricus ATCC BAA-


GCCCAAGTGGACTTAATCGACCGCTTGCACGACTTTGTCGAAGATGCCGG


365

GCAAAGGGTC






Lactobacillus

1884
TTTGCTCAAGCAGGATCCGGTTTACCGGGAATTTATCACTTCTCTGGCCA



delbrueckii subsp.


GCCGCTACCAAAACCGCCCAAGCGAACTGCCCTTGATCTTGGCTGAAGGA



bulgaricus ATCC BAA-


AATTTCGCTTTTGGCCAGCTTTATCCCTGCCAGGGAGATTACGTGACTAA


365

TCCCGATGCTTT






Lactobacillus

1885
TAGCCCAGTACCTGCGCCAGGAGAGCCAGGACCGGGACTTCCGCCCGGCC



delbrueckii subsp.


AGCTACCGGATGACGGAAGACTTGTCTAATTGTGAAGAGGTCATCTTGGA



bulgaricus ATCC BAA-


CCTGCTCCGTGATGAAGACGGCAACCTCTGTTTTGTCGGAGCGACTGGGT


365

CTTTGGAACC






Campylobacter curvus

1886
GACCGATGTAGGCGGTATAGTAGGCTTGTAAAAATGTGCCTAAATTCTTA


525.92

GCGACATAGACTATCACGATGCCAAGGGGCAGCAGATAAAGCCATGTTTT




TTCTTTTTCGATGAAAATTTCATTTAGCACGGGCTTTACCATATATGCAG




TCGCCGCCGTGCCT






Campylobacter hominis

1887
GTTCCTGTAATTACAACAGTTTTTTTTGTGAAAACATTTTGTGAAATTTC


ATCC BAA-381

AGTTTTTTGGGCGCTTAAATGCAAAAAAGAAAGTAAATTTTCAATTTTTT




CGCGATTTATCTCACAAAAACCGACAAAACTT






Campylobacter hominis

1888
ATTTTTAAAATTTCATCAAAACTTGCATTCAGCCAATTTTCGCCAAAAAT


ATCC BAA-381

TTCAGCGATTTTTCTGGCGGCCACTTCGCCTATATGTTCGATTCCAAGTG




CTGTAATAAACCTATAAAGT






Campylobacter concisus

1889
ATAAGTGCGGGTGCTAGCACACCTGACTGGATCATACAAAAAGTCGTTGA


13826

CAGAATCAAAAAAGTATAA






Akkermansia

1890
AGGGCTTCCTTGTACTTGCCTTCGCGGTACAGGTTGCGGCCTTCCGCAAG



muciniphila ATCC BAA-


GAGCTGCATGGCTTCCTGGGTTTGCGCTTCGCGGCGGGCCATGGCTGTGC


835

GC






Akkermansia

1891
ATTTGTTTGGCGTCTCCCGGAATCCGGAAGTCGGAACCTTCCTCTACTTC



muciniphila ATCC BAA-


AAAGCTGGTATCCCGTTCGGGGCCG


835








Atopobium parvulum DSM

1892
CTTTTCATACGAGAAAATATTATCAACTGATTGCTCCCCTATATATTCCG


20469

CAGCTTTATGTTTTAAAAAATCAAGATTTCGCTCTTTAACCGCTTCGGGT




CTAAACCAACTGCATATGAATACAGAGGTTATAAG






Atopobium parvulum DSM

1893
TGAATATAAAGCGGCATCTTGCTCGATTGTTCCAATGATTTCTCCATTAA


20469

CACTTACAACCCACAAAATACCACCGTCTAATAATTCATTGTTTTTCTCT




ACAATTGATAAAATTGGACCAGATATATACAC






Veillonella parvula

1894
AGCTATGGAACGCTCTTTGCGTGCTCTAAAAGGTTTTGTACACATGGCTG


DSM 2008

ATGCAATGGAGGCAAATACGATTAAAGCTGTTGCTACCGCAGCAGTACGT




TT






Veillonella parvula

1895
AATGGTACTAGCGTGTACTATTGCACCATGGCCAATTGTTACGTAATCTC


DSM 2008

CAAGAATGCAAGCCCTGTCGTCATCAA






Veillonella parvula

1896
TATCCGCTTTTAATGTATATGTAATTATACGATTTTGGCGTTCTTATCGT


DSM 2008

ATTCCTCGCCACCCGCATCAGGAAGCA






Citrobacter rodentium

1897
ATGAATAAAATTTATTTCTCTCAAGACCCGGTGGGTTTTTATATCGAAGG


ICC168

TGTGTCTGCGGTTCCCTCCAATGCTATTGAAGTTAGCGCGGATATTTATA




ATGAGTTTGCCGGAGTGGCGTGGCCTGATGGGAAAGTACTAGGTGCTGAT




GATTCAGGAT






Citrobacter rodentium

1898
GCCATGATGAATTGATTGCACAAGCGGAAGCCGAAAAACAACGGTTGATT


ICC168

GATGAGACCAACGTCTGGATAAACGGGCAGCAATGGCCGTCTAAATTAGC




GCTGGGCCGCCTCTCTGAGGATGAAAAAGCGCAGTTTAACGAATGGCTGG




ACTATCTGGACGCG






Streptococcus

1899
CGATATTGATGATACTTCTTTTTAAAGTTGCCATTTTGATTTCCTCCTTC



gallolyticus UCN34


TAGTGATTAATAG






Streptococcus

1900
GGTTCATAATAGGCACAAAGCGGCTGCCACTTGTAAAGTCAGAAATAAAC



gallolyticus UCN34


GATAATGGATTAAAAGAATAGCCTTGGATACCGATATTTTTCAGCAAAAT




CACATAAATCAAAA






Enterococcus faecium

1901
CGATCGAACCTTTTATCATCTTCATTCCTCCAATATTCTTGTCCATTCAT


TX0133a04

GAACTGCTGGGCCAGCCATTTGTATCGCATCCTCACCAGAGATCATCAAG




TCTCCTTGTTCCAAGTGAACAACTACCTGATCTGCTGAAATTTTCCCAGT




TTCCTTAGCAA






Enterococcus faecium

1902
GGGAATCAGCACTGGAAACAATTTTTTCTGTTTTTGTTTCCGTTACTTTT


TX0133a04

TCCATTCCAAAGTGATTGGACAAATTAGCGACGATCAAACTTAGTGAAGC




GACGAAGATTAGTCCGAAAATCAAAGATAAAAAGGTTTGCCATGTT






Enterococcus faecium

1903
GAAAATATCTCTGCTAATATATTGGTTTTCTTTTGATAAAATATAGTGAA


TX0133a04

TCGAAACGGCGCTTGGAAAAACATTCGTGCGGATGGATAATTGACCGCTT




CTCTAACCGTACCGAACAAAAGATAATCTTTTAAGGCTTCAACTGCCAAA




CGTCCGCTG






Peptostreptococcus

1904
TTTAGTAGTTAACCTAAACCTGGTCATCTACCATCTGTACGTACTCTAAC



stomatis DSM 17678


ACATCAAATCCTAGTCTTTTATAAAGGTCTATAGCCCTCTTATTAGTTCC




ACAAACTTCTAACCTATACCTTACTGCTTCAGGAAAGTTTTCCT






Peptostreptococcus

1905
GTATTCTTCACTTATATAAAGGTCCTCTAGCTGTATGGTGATTCCTGCCA



stomatis DSM 17678


CCTCTGAAGCATAATAGCTGGTTACTATACCAAACCCAACCAAATTTCCA




TCATACCTTGCTTCATATCCCCTTATAGAATGCTCTCTTGATAAGAT






Mycoplasma fermentans

1906
TTTGTTGAACCTTCAAAAAGAAATAACTTCACAAAAAATGTTTTTTCTTA


JER

CATTACAAGAGATAGATTTGATTTTGGTGAAGCTTTTAATAATAACTATG




ATTTCTTTTCAACTATTTTTGAATACCTAATTTCAGATTAC






Mycoplasma fermentans

1907
AGATGTTGCATTCAAATATGAAGATACAATTGATTATTTAGTTGGATACA


JER

TTAATGAAGATGATTTTTATCAAAAATTTGACGATACATTAATTAGAATT




TCAGATTATAAACAAAATGAAGCTTTTAATGTTGATACT






Eubacterium limosum

1908
ACCGTCGAGGAGGTCACGCAGGCCCTGGCGCAGCTCCTCGGCGTAGGCAG


KIST612

ACTGGACCGGCAGGGTCTCCAGGGCCTGGTACAAAGACTGGGCAAGGGCG




GAATCCATCATGATAAAGCTTGCGTTTTTTTTCATCTGGGTTCCTCCTTT




CTAGAT






Eubacterium limosum

1909
AAGGCAGAGGTATCAACACCCGGAGGGCAGTCTGCCGCCTGGGGATACGG


KIST612

CGGGCACGCCGCCTGCCCGCAGGCCTGCTTTTCTGATCGTTTTGTCGGTT




CCGCTGTTACTGCAACTGTCCAAG






Parabacteroides merdae

1910
AGATGCAGAGGGGAGAAAAGAATTGGAAGAATAATGTTTTAAAGGCATTC


ATCC 43184

TCGTCGTCGTTTACAGCGATGTTCCAAAACAACTTGTCGATTGTATGTGG




CAGTGTCTCCTTCATGTTTTCGTTTGATGACAACAAATATTTACTAAGTT




CTCATA






Parabacteroides merdae

1911
GTGTAATTTTAGAAACAAGGCCATAAGCGCTTGACAATTCCACCTGTTCG


ATCC 43184

ATCGGCTGATTTTCCAATCTGTAGATACGCTGGTAAAAGTAGATACTGCT




AATTAAGGCTGGAATGAGCAATATGGCTGCCGTGTAACGAAAATAGCGAC




CGATCTTTTGT






Faecalibacterium

1912
TCGGTGCAGATCTTGTTCCGGGTCTCTGTATCCAGCGTTTCTCCGTTGGA



prausnitzii M21/2


AAGCAGATTGCTGGCATTGCCGGAAATGGAGGTCAGCGGGGTGCGCAGGT




CATGGGAGATCGTGCGCAGCAGGTTTGCCCGCAG






Faecalibacterium

1913
GCCCACGGGGGTGTATTTGATGGCGTTATCCACCAGATTGACAACGACCT



prausnitzii M21/2


GCATGATCAGCCGTGCATCCACATTGACCAGGAGGATCTCGTCCCCATAC




TGTGTTGTGATGGTGTGTTCGCAGCTTTTCCGGTTGACATGATGCAGTGC




TTC






Faecalibacterium

1914
GTCTTATCCATCTTGTCGGACACGACAACGCCGACGCGGGTCTTTCTCAG



prausnitzii M21/2


ATTACGTTCTTCCATCGTTTAACCTCCTTACTGCTTCTCAGCCAGAACGG




TCATCACGCGGGCGATGTCCTTCTTGACTGCTTCGATGCGGCCGGGGTTC




TCCAGCTGGTTGA






Faecalibacterium

1915
AGAATCCTCTTTTTCAATGCCCAAAAGCGTGGTAGAATAGGGAAAAGATT



prausnitzii M21/2


CAGTATCTGAAAAACGGAGCGCGCAACAATGAAGATTCTGGTCAGCGCCT




GCCTGCTGGGCGAAAACTGCAAATACAGCGGCGGAAACAATTACAATCAG




GCGGTCTGT






Parvimonas micra ATCC

1916
ATATGCCCTAACAAGTCCTCCCGCTCCTAGTAAGATTCCACCGAAATATC


33270

TAGTGGAAATCACCAAGCAATTAGTTAAGTCCTTTTTTCTTAAAACTTCA




AGCATTGGAATTCCGGCAGTTCCTGAAGG






Parvimonas micra ATCC

1917
ATTAATTTTTCTCTATCATTTATAAAATTTTTCATAAATTCAATCTTTTT


33270

ATTTTCCATAATTATCCCCAAAATATTATTATAAAATCGCATTGATTATA




TCATATATATTAGTTAGAATCAATTTTTGAATCTTTTTTAATTTCATAAA






Bifidobacterium

1918
ATTGGCCGCGTACTTGCGGTTCAGGCGGTGGCATACGTGGTGCTATTCGG



bifidum NCIMB 41171


CGTGACGGGCGTGCTGGAGTTCACGTTCGGGAACCAGGGCGGTGTGGCGC




AAGGGGTTGCGATCGCGCTCGCCGTGGTGCTGCTCGTGCTATTCCAGGTT




GGGTGGCCGTTCC






Bifidobacterium

1919
TGCCAATTGCCGGCGTGCGTTGCGTGTGATTCAGGAGGGGACGGATTCGT



bifidum NCIMB 41171


CGATGGAAACGCGCACCCGACTCGTTCCCCTCAAATACGGGATAGACGCG




CCCTGCGTCAATTACAAGATTGGAGTGAAGCGTCTGAACCGG






Collinsella stercoris

1920
TAGGGACTGTCCCTAAATTAAAGTTCTAAATTGAGGTTGACAGCCCTAAT


DSM 13279

CCTCTTCGATGCCTAAAAACACCTGTGTGCGAGGGTAGTAAAGCACCGTC




GACCCAGCCTCGGCGCTGGTGAGCTCGGAAAGGACTCGCGGCCAATCGCC




GCGCTTGACCGA






Roseburia intestinalis

1921
GATCTCCCCCTCCACATCAAGAAGCACACCGCGGGACGCATTCTGCCTTA


L1-82

ACTGTACCAGATAATCCGCATACTCCATGACGAATCTGCGGAAATGTGAA




AGGCTTCCTCGGATCATATGTACCTGAAACGCATTTTCGCAGATATACTG




TAAC






Roseburia intestinalis

1922
CCATCGCAAGGGAAACCAGACGGTACAGGAGATAATTCATGTTCATCTGT


L1-82

ACCAATTCCGGGTTCATTCCCGCCTCATTCATCTCCTCATAGATCGCTGC




AACATTATCCTCGATCCTTTTCTTGTCATTCTCTTCCACACTC






Enterococcus

1923
TGTATCAAGAGTGGTCAGGTCGACAGGTTCAACCTTCTTTACTGCATGGT



gallinarum EG2


GACTTTTGGCGCGGCAATGTGCTGTTTG






Enterococcus

1924
ATTGCTTGCCCAGTTTGGGTTATCCGGAACATTAACGCCGATCAGCGGGG



gallinarum EG2


GTGACGTGAATCAGACGTTCCGGTTG






Enterococcus

1925
GGCGTCTTGTTTGCATGTATGAGGTATGGTTCTAAAAAAGCAAGAAATGG



gallinarum EG2


GGCGACTGCTGCTGCATTTGTGCAAGCA






Prevotella copri DSM

1926
TATAATCATACTATAGGCATTAGAGCATCGACGACTTTCGGTATAGGCAA


18205

TTGGATGAACGGAAGTGTTTCGGCAACAGGAATCTACAGACATGACAAGA




GTAACGATTTCTTTGACTTACCCTTCAATCGCAAACATATCTCTGCCATT




CT






Prevotella copri DSM

1927
GAAGCCATCACATCCATTTCATACTTAACCCATTCTATCAGTCGAAAGCC


18205

ATTCAAGGACTCTATGACATCAAATCAGTCTTTCTCTTGATTGCTATGTT




GAGATGGGCTTCCGATAATGATAAATGGAGCATTGTGGTTAAAGGCAGCA




ACATCT






Prevotella copri DSM

1928
GTTCTTCTGCTCAATATTGTGCTGTATTCACCTCGTCTGCAGTATTTCTT


18205

GCTTTGGCTGCCTTCAACTATTGGGCATCCTACAAGCTCTTCACCCGTAT




GCAGGTTATCTGCAACAAGTGGATCAACATCTAAGAATTAAAATTT






Prevotella copri DSM

1929
CAGTCTACTCCTTCCTGATATTTGCCCATTCCTTCGCAACCCTGGGAGGA


18205

GCCTTCTACCGTAAGTTTCCTGTGCTTCTGACAGCATGTACGGGATTGGC




ACTTTGTCTGATTTTGGGTTATATTATCAACGAACTGGGTGAAGCCGGAT




G






Holdemania filiformis

1930
CGGCAACAGGAAACATATTGCATCTCACTCAGCAAAAAAACACGCCTGCA


DSM 12042

TATCAACCTCAAATCCCTGAATCCTTTTGCTTTACTGGGATTACCTAGAC




AGCAGGCAAACCGTTTCCACATGGCACGAAGCTTCAGATGAATGTCAGCG




CATCTG






Holdemania filiformis

1931
TTTTTTCTTTTTCCAACCCTATACGATTTTATAGCTCATTTTCTAAAAAC


DSM 12042

CCTACAAGATTTTATATTACCGAAAAACTTTAAATTAATAAGGTGTCTTT




CAATGCACTTATTGCTTAAGACCTATCCCCATTTTCCTGACTAATGATAT




CCTGACTTTTC






Helicobacter bilis

1932
TTTGATTGACTTCATCTTTGGTGAAAATTTGATTTATGATTTACTAACAG


ATCC 43879

ATTCTGCTGAAAATCTTATTAAC






Helicobacter bilis

1933
ACATGATTTTCAAGCAGTCCCGCAGGATACTGACTTTTATGCTGAGATTG


ATCC 43879

ATGAAAAATTTATCGCCCCATTAC






Anaerococcus vaginalis

1934
AAGAGAATAAATAGAAAATGGAATTAATAATATAGGAGTTTAAAATGTTA


ATCC 51170

ATTAATATGAAAGAAATGTTGAAAGTTGCTAACGAGAATAATTTTGCTGT




ACCAGCATTTAATA






Anaerococcus vaginalis

1935
GCAGAGTATTTCCTGTGGCGGCTATTAAGGTTATGTTGCTGTCTAATAAT


ATCC 51170

TTACTTACTAATTTTTTTATATTATTTATATTATCTTCTTTGTTAATAGT




GTTTATTAAACTTTTTTTTCTATTTAAAAAAAATTCTAT






Anaerococcus vaginalis

1936
CCAATATCTCTTAAAAGTAAAATTTTCATTTGGTAAAAACCGTCAAAACC


ATCC 51170

AGAATGTTGACACATTCTAACAACAGTTGCATCGCTTACTTTTGTTTCGT




TTGCTATTTCTTTTACACTCATAAGAGTTACTTTGT






Collinsella

1937
TTGCAATGCGATTGTCCCTGTGTCGCCCTGACCGATTATGATTGGCGAAA



aerofaciens ATCC 25986


CCAGCTTTCGTCCGTTCATGACTCGATTGTATTTGTCGATGAGGGCTTAA




AAGAGATCCATTCTGATGAGTTTGCCCATCATGTGCTGTATTCCTCGAAT




TATTTCGTGCT






Dorea formicigenerans

1938
TGGAAAATCCGAAAAGTAAAATTAAAGGTAGTAATACGGAAAATACGGAA


ATCC 27755

ACAAAAGAAGGGGCTGTTCGCTTTGATATTATTTTTTATGTTCGAATGAA




AGATGGAATTTCTCAGATTATTGTGAATATAGAAGCTCAAAAGAATAGTT




CGCC






Dorea formicigenerans

1939
TGAAAGCGCAGTATGACGAGAATGCCAAAAAGCTTTTAAGTAACAAAATT


ATCC 27755

TTTCTGGCACATATTTTAAAAGGAACAGTAACGGAATTTAAAGATGCGAA




TCCAAGAGATATCATTTCTTTGATTGAAGGAGAACCATATGTATCTAC






Ruminococcus gnavus

1940
GATACTGTTTCGGGACAGTAACCAGCGTATGATCGAAGGGAAGCAGGTAC


ATCC 29149

TGATCCTGACAGATTCTGTGACGACCGGTGCTTTGCTTGCAAAAGCCGTG




GAAGCAGTGCTGTATTACGGTGGACGTGTATGTGGTATCTGTGCAGTATT




CAGTGCGG






Ruminococcus gnavus

1941
TATGGAACTCACTGGCATATAATGGTGCCAGATGGATTGCATCCGGGCGA


ATCC 29149

TATCATTACAATATTGAGACTGTGTTGGATGAGAGGATCCCTTTCATACC




CTGGACATTGGTAATCTATTTTGGATGTTATCTGTTTTGGGGAATCAATT




ACATTTTGAT






Campylobacter rectus

1942
ATATGCGGATTTTAAATTTTGTGTATTTCTCATAATATATCCTTGAGAAA


RM3267

GAGGTTAAAATATAAAAACATAATAATAAATTAATGTTTAATGTAAGCTT




AAGGGATTAGTAAAATTTAATATGAAATATAAATTCTTATAT






Campylobacter rectus

1943
CAAATTTATCTTTTTTTGAGTGCCGGTAAAATAATAATATTTTTGAGCGG


RM3267

TGGTGCCGTTTTCTATGTTTGAGTTTATCGTAGCGAAATCGGCCGCAAAA




TTTGAAACGGCGAGTAAAATAGCCGTTGCCGCAACCGAACT






Campylobacter rectus

1944
TATCAAAAACGTAAACGGTATCGGAGAAAAGACATTTGAAAATTTAAAAG


RM3267

GCGATATCTCGATAAGCGGCGAAAACGTGATGCCTGCAAGCAGTAAAGTG




TCTAAAAAAGTAAAAGAAGCTA






Actinomyces viscosus

1945
CACCCCCGGCTTCGACGACGCCTAAGTCTTCCGAGGTGTACATCCAGTGC


C505

TGGACATCGAGGTCAGGAGGACCGAGAGGAGCGTCGACACCAGCTGCTGT




CCACGCCAGAGCGTTCGCCGAACGGCGGACGTCGGCCCCGATCCAGGCAA




GCTCGTCAAGCGCCGC






Actinomyces viscosus

1946
GTGTTCTACCGAACGGCCAGAACCACATCACTGGAAGGGACAGGATGAGG


C505

GCAAGAAACACCGTCACATTGAGAATAAAATTCAAAGAAGAGGTAAACGC




CTTCAGAACCCGAACAATCGATCTCAACACATTCATCTCTTTTCAAGTGG




ATGACGATGCCACC






Actinomyces viscosus

1947
TTCAGTTCCGCGTTCACTTCTGAAATATTCATCGGTTGTATTCGTTTCTG


C505

GGATTCGTGGGGTTGGACTTTTGACATGCAGTCGCACCGAGTATCAGACT




CCCGGGAAGGTCCCGTCCCCGAGGAATTCTCTGCGCCTCGGCGCATATGG




ACCGATGCCCGC






Campylobacter gracilis

1948
AAGCTCAGTGCTTTCAAAAAACCTTAAATTTTTGCGCGGAATTTCGGCTA


RM3268

AAATTTTATCTTTAAAGCCCTCATAATCGTTAGTTAAAACGATCTGATTT




ATGATTTTCAAAGCACCACT






Campylobacter gracilis

1949
TTGTCTAAAATTTTAAAAAGCTGAATCAGTTTGAGATCTAAGCTCGGATC


RM3268

GCTCTCGCTCATGTAAAAGCTATTTTGTAGCCTCTCATCGATGAGCCACA




GATAGCTATCCTCGTGGCTTTTGGCGATTTTGGCTAACGGATGGTTGCCG




TTGCC






Campylobacter gracilis

1950
GACATTAGTATTATCTAAAAACGAGATTACGATGTCAAATTTACGACTTT


RM3268

TTAGAAGCCTGTGCAAGGCTAAAATTTTATCATAGCGCTTTTTTATATTT




CCAAAAACTCCTAAATCGCCTACGCCCAAATCCAAGCTAATAAGCTCGAT




CTTAGGATC






Campylobacter gracilis

1951
TGACTTTGTTTTTTGCCTGAAGCACAGCTAGGCCTAAATTTTGAATCAAC


RM3268

GGCACGCTTACTGGAAGCACCAAAATAAGCGCGATTGCGTATGAGATCTC




GTAATTCGCGCCCGCCCATA






Peptostreptococcus

1952
GAGGCCCACAATATGTATCACGGTGAGAAGGTTGCCTTTGGAACTTGTGT



anaerobius 653-L


ACAGCTTATATTAGAAGATGCTCCTTTGGAAGAAATAGAAGAAATATATA




ATTT






Peptostreptococcus

1953
CACTATACACAATATGCCATTCGAAGTTAACGAAAAGAAGGTATACAGTG



anaerobius 653-L


CAATTATCGCTGCTGATAATATGGGAAGGAAGTATTTAGGCAAATAATCG




AGAATTAGGAGGTAATAAAATGTCAAATGTAGAGTCAACAAAATACAGGT




GTA






Prevotella histicola

1954
TTGTGCTGGCCATCATCACTACGCGTGTCAGGCATTAGCACAACATGTGT


F0411

TGATTATCAGCTCCAATCCAACTAAACATAGTCTGATTTATAATAACTTA




CTACAATAACTATAAAAGATGAATTATTAT






Prevotella histicola

1955
TCCGGCTCCGTTTTTATCAATCCTGTTGTATGGATCGCGTTCTGTTCATT


F0411

ATCTTATGATGGTCTTTTTGCCGTTGATGACATAGACGCCCTTGCCCAAT




GACTTCAGACTCTTCGCATCTTTGAGAACCTGTTTGCCGTCGAGTGTGTA




GACATCATATAGAA






Helicobacter

1956
TCTGGGGTCTCTATGCCCCAACATGTTGAGGTTAGGTCCTTGAATCACCA



bizzozeronii CIII-1


AAATTTTCATGCCATTCTCCTTACCAAGTGAAAGATTAAAGAGGTCATTA




TAGCATAAAACTCCCGTTTAAAGCCCAAAGGCTTAGAGTGTGAAATTATG




G






Helicobacter

1957
GTAAGTTTTGATACAACCAATGGTGTTAAAAGCCGTCCAAGGACTATCAA



bizzozeronii CIII-1


TATGCAACTTAGCCTGTTTCATACCCGACTCAATTTGCACCAAATTATCT




AAAAACTTATGGTTGAAGTCTTGGATTTTTAAAATATAAAGGTGCAAAA






Helicobacter

1958
AAAAGAATATTACCATGAAGTTCAGTTATAGCAAACCCACGCCCAAGCAC



bizzozeronii CIII-1


ATGGTGGATTTACTCACCAAAGTTTGGGTTTTTTACATGTGTTTGTCCTT




GTGTCTGATTTGGGGACTAGCCTATTTTTTACGCCACTACACCAAAGCC






Enterococcus hirae

1959
ACCAAAGAAACAATTCCTTTTGCGGCAATCGTCATAGTGGATTCCGACGA


ATCC 9790

CATTTTAGATTCGAAAGCCTATCTTGAAAACTATGCCAGCTTCGGTGGGT




ATTTTGATTTTTCGCTGAGCGATGAAATGATTTATGGCTTC






Enterococcus hirae

1960
ATATTATTTAGGATGGATCGAAAGCCAAAAAGTAGAAGCTGATTTAACAA


ATCC 9790

ATGAGGATAAGCAAAAGTAGACGAAGAAGGAGCGGGAAAGGATAGTTATC




TCGCTCTTCTTCGTCTATTTGGTTAGGATTTATCAGGGTTAA






Bacteroides nordii

1961
TTCAAGTCATTGATACGTACCTGATAGTTGGACACGTCGTCTATATTCTT


CL02T12C05

TGTGTTGTCCTTCTCCGTAGGCACATAATTCGCATCTGTCGTCTCTTTCC




AGCAAAAGCCGGAAAGGAACATTTCACCTCCTCCGTCATCCAGCACAGAA




GTAGATACAATAA






Bacteroides nordii

1962
TTTTAAGAAAGCATTCTTGGGCCAGATAGAACTACCCAACACAAAACAGG


CL02T12C05

ACTGGAAAAAGAAATATCCTCCAATAACCAACATAAAATCTTTCAACGTA




CGGGTCAATGTATAATAGGGAACGTTTCCTCCGACACAATGTGACAACGG




AAC






Bacteroides nordii

1963
TAATTTAATATCTTGTGATACCATTATCAACAAAATGCAGATAAACACAG


CL02T12C05

ATTAATGCATATTAAAACCATTGATTCCTTGTACTTCCCACACTGGGAAG




TTCTCCAGGCGGTGTTTACGTTGGTTCCCCACAGATTGGCACCGAGTTTC




ACAG






Bacteroides nordii

1964
TTAGGGAGGATATAATAAATAGTTAATAGCGTTTCAATATAGAGTTTTAT


CL02T12C05

ACAATATTACTTCTTGATTTTCAGAATTTCTGTGAATTGTTTTCAGTGTT




TTCTTTATAGTATCACAAAT






Barnesiella

1965
ACAATTCTCACTGATTTTTACTGTACCGAGGTAATTCCCATCAAAAAAAA



intestinihominis YIT


AGACACCCCATGACAAAGATACCACATATCTAAACAAAAAAATGCGACAG


11860

ATTTTTCTGTCGCATTCGTAGCCCATAGGAGAATCGAA






Lactobacillus murinus

1966
GATCACTTTCATGTCGGTGCCTGAGCGGGCCTTTCTCTCAGACTGGTCAT


ASF361

ATGCGATCGGATCATTAGTTATCATCCCGATCATCCCGATCTTAAATCAT




TATTATGTCCCTTTCTTTCGC






Lactobacillus murinus

1967
CCACACATGCTCGCGGATCTGGTCTCTTGTTAAGACTTGTCCTCGATTGC


ASF361

GACATAAATATTCCAACACTTCGTATTCTTTAGCTGTCAGCTCGATCAAA




GTCGCTTCCTTTTTCACCTGCTTTTTAGCAATATTTATCGTTATATCACC




TATTT






Eubacterium rectale

1968
TATTTTTTCTCAACCTCTGTGAGGAACTTGTCATCAAGCTCCGCCTTTAT


CAG:36

AAGTACGTTCTCATCA






Cloacibacillus

1969
GTGCTATGTTCAGAGGAGATGTTGTAGATACTCCCGATGATGAATGGCTT



porcorum


AAACTTTTTGATATTAATGTTCACGCGGTCTTTTATCTTTCTAGGGAGGC




CATACGGCTTATGCGGGAACATAAAATAGCGGGGAACATTGTACA






Cloacibacillus

1970
AAGGGGCATATCGCCTACGCAACTACAAAAGGAGCAGTAGTGCAAATGAC



porcorum


ACGTTGCATGGCTTTAGACTGTGCCTCAGATGGAATACGCATAAACGCTG




TATGCCCTGGTGCAACTGATACTGCGATGCCAATGTCAAAGCATAGTGC






Blautia coccoides

1971
GGCACTTGCCGCACTACAGGTGCCATAGCCTGCACTGTATATACTGCAAT




CCTTTCCTGTTCTGGCCGCCGGAGGATTTATGTTGGTTATAGTATATACG




GTAATGCCCAAGGGTAAAATGCTGTATTCTGCT






Ruminococcus bromii

1972
TTACGGCTTAATATTAAGACTTGTTTCATACGGAATCGATACAGGGCTTA




TTGAAAAAGAGGATATAATCTACACGCAAAATAGGCTTTTATCGTTGTTC




CATATGGATGAACCCGATGACAGTTGTACTTGCCTTACAGCAGATAACGA






Phascolarctobacterium

1973
CTAATAAGAAAGATAAGAAGTATAAAAAGGAAAAGAAGGTTATTAAAAAC



faecium DSM 14760


CCATTTTTCTAAATTTATAATAAAGGAAGACATACGATGTATATCAAACG




CCATTTGGAGACGACAATTGAAAAACTGAGCGGTTGTTGTAAGG






Phascolarctobacterium

1974
AGGAAATGGTTTATCTGCGCAATATTGCCAGGGATAAGTTTGGCATGAAA



faecium DSM 14760


TTTTTGTAGTTTTACATGAAAATTTTGAATTTTATTTTCACTGTTGGTAT




GTAGTACAACAGATTGCTTTTAAGATAGCTGCTAGTTGATT






Helicobacter salomonis

1975
TCCAAGTATTTGCCAAAGAACCCTTTGTGGGCATCAAAAGCGCGCACAAA




CTCCAGAGTTGGATGAAAAATTCTCTCCAAGTGTGCGATCATCACTGGAG




GCATGTTGACTAGGGCAAGTTTGGCGAGATTTTCTGCACTCAAGAG






Helicobacter salomonis

1976
TTGCATGTCTTTTTGAATACTGGAAATGGCGCGTTGCTGGTCATTTGTGA




GAACAAAGGGTAGGGCCTGTAGAAATTCTCTAAGGAGCTGGGCATTAGCA




TGCCCACCAAACTTGCTTTGAAAATTCCACTTTTTGGCATTCAAGGAGAG




CATAT






Gardnerella vaginalis

1977
GCTATCCGCTATGGGATTACGCTTATTCCCACATGACCATTCCATCTCTA




CATGCAGAGTCAGGAATAATTGTCATAGCTGCTCTTATCCACTTTATACT




GGTGCTTATCTGCATATTCGATAAG






Gardnerella vaginalis

1978
TTTCCTTTTTGCTTTCTTCCAAGGAGTAGTAAATCCACCATTCCAAACAG




TATTTATTCATGCAGTAGATGATAATCTCGTTGGACAGGTAATGTCTGTA




T






Gardnerella vaginalis

1979
GGAATGGCAACATTTGTAAAAAACCAAACGTACAGGACTTTAGCTCCATT




TGCAATACTCGAAGCTATCGCATCTACTGGCATAAGCTTTGCTGGAGTCC




TCTTATTCTCAATGGTTCTACATAGTGCAGAAGGATATGGCTGGTATTTA









Exemplary Embodiments

This section provides exemplary embodiments of compositions and methods described herein:


A1. A composition comprising one or more primers or primer pairs capable of specifically amplifying a target nucleic acid sequence contained within a genome of a microorganism selected from among:


(a) Actinomyces viscosus, Akkermansia muciniphila, Anaerococcus vaginalis, Atopobium parvulum, Bacteroides fragilis, Bacteroides nordii, Bacteroides thetaiotaomicron, Bacteroides vulgatus, Barnesiella intestinihominis, Bifidobacterium adolescentis, Bifidobacterium animalis, Bifidobacterium bifidum, Bifidobacterium longum, Blautia coccoides, Blautia obeum, Borreliella burgdorferi, Campylobacter concisus, Campylobacter curvus, Campylobacter gracilis, Campylobacter hominis, Campylobacter jejuni, Campylobacter rectus, Chlamydia pneumoniae, Chlamydia trachomatis, Citrobacter rodentium, Cloacibacillus porcorum, Clostridioides difficile, Collinsella aerofaciens, Collinsella stercoris, Cutibacterium acnes, Desulfovibrio alaskensis, Dorea formicigenerans, Enterococcus faecalis, Enterococcus faecium, Enterococcus gallinarum, Enterococcus hirae, Escherichia coli, Eubacterium limosum, Eubacterium rectale, Faecalibacterium prausnitzii, Fusobacterium nucleatum, Gardnerella vaginalis, Gemmiger formicilis, Helicobacter bilis, Helicobacter bizzozeronii, Helicobacter hepaticus, Helicobacter pylori, Helicobacter salomonis, Holdemania filiformis, Klebsiella pneumoniae, Lactobacillus acidophilus, Lactobacillus delbrueckii, Lactobacillus johnsonii, Lactobacillus murinus, Lactobacillus reuteri, Lactobacillus rhamnosus, Lactococcus lactis, Mycoplasma fermentans, Mycoplasma penetrans, Parabacteroides distasonis, Parabacteroides merdae, Parvimonas micra, Peptostreptococcus anaerobius, Peptostreptococcus stomatis, Phascolarctobacterium faecium, Porphyromonas gingivalis, Prevotella copri, Prevotella histicola, Proteus mirabilis, Roseburia intestinalis, Ruminococcus bromii, Ruminococcus gnavus, Slackia exigua, Streptococcus gallolyticus, Streptococcus infantarius, Veillonella parvula, or


(b) the microorganisms of (a) excluding Actinomyces viscosus, or


(c) Akkermansia muciniphila, Anaerococcus vaginalis, Atopobium parvulum, Bacteroides fragilis, Bacteroides nordii, Bacteroides thetaiotaomicron, Bacteroides vulgatus, Barnesiella intestinihominis, Bifidobacterium adolescentis, Bifidobacterium animalis, Bifidobacterium bifidum, Bifidobacterium longum, Blautia coccoides, Blautia obeum, Borreliella burgdorferi, Campylobacter concisus, Campylobacter curvus, Campylobacter gracilis, Campylobacter hominis, Campylobacter jejuni, Campylobacter rectus, Chlamydia pneumoniae, Chlamydia trachomatis, Citrobacter rodentium, Cloacibacillus porcorum, Clostridioides difficile, Collinsella aerofaciens, Collinsella stercoris, Cutibacterium acnes, Desulfovibrio alaskensis, Dorea formicigenerans, Enterococcus faecalis, Enterococcus faecium, Enterococcus gallinarum, Enterococcus hirae, Escherichia coli, Eubacterium limosum, Eubacterium rectale, Faecalibacterium prausnitzii, Fusobacterium nucleatum, Gardnerella vaginalis, Gemmiger formicilis, Helicobacter bilis, Helicobacter bizzozeronii, Helicobacter hepaticus, Helicobacter pylori, Helicobacter salomonis, Holdemania filiformis, Klebsiella pneumoniae, Lactobacillus acidophilus, Lactobacillus delbrueckii, Lactobacillus johnsonii, Lactobacillus murinus, Lactobacillus reuteri, Lactobacillus rhamnosus, Lactococcus lactis, Mycoplasma fermentans, Mycoplasma penetrans, Parabacteroides distasonis, Parabacteroides merdae, Parvimonas micra, Peptostreptococcus anaerobius, Peptostreptococcus stomatis, Phascolarctobacterium faecium, Porphyromonas gingivalis, Prevotella copri, Prevotella histicola, Proteus mirabilis, Roseburia intestinalis, Ruminococcus bromii, Ruminococcus gnavus, Slackia exigua, Streptococcus gallolyticus, Streptococcus infantarius, Veillonella parvula, or


(d) the microorganisms of (c) excluding Blautia coccoides and/or Helicobacter salomonis.

A2. A composition comprising one or more nucleic acids that specifically bind to and/or hybridize to a target nucleic acid sequence contained within the genome of a microorganism selected from among the microorganisms listed in embodiment A1.


A3. The composition of embodiment A2, wherein the one or more nucleic acids specifically bind to and/or hybridize to a target nucleic acid sequence contained within the genome of the microorganism in a mixture comprising nucleic acid of the genomes of multiple different microorganisms.


A4. The composition of embodiment A3, wherein the mixture comprises nucleic acids of the genome of a different microorganism that is in the same genus of the microorganism containing the target nucleic acid sequence.


A5. The composition of embodiment A2, wherein the one or more nucleic acids that specifically bind to and/or hybridize to a target nucleic acid sequence contained within the genome of a microorganism do not bind to and/or hybridize to a nucleic acid contained within any other genus of microorganism.


A6. The composition of embodiment A2, wherein the one or more nucleic acids that specifically bind to and/or hybridize to a nucleic acid sequence contained within the genome of a microorganism do not bind to and/or hybridize to a nucleic acid contained within any other species of microorganism.


A7. The composition of embodiment A2, wherein the one or more nucleic acids comprises or consists essentially of a nucleotide sequence selected from among the sequences in Table 16, or SEQ ID NOS: 49-520 of Table 16, or SEQ ID NOS: 49-452, 457-472 and 481-520 of Table 16, or SEQ ID NOS: 49-492 of Table 16, or SEQ ID NOS: 49-452, 457-472 and 481-492 of Table 16, or SEQ ID NOS: 49-480 of Table 16A, or SEQ ID NOS: 49-452 and 457-472 of Table 16A, or SEQ ID NOS: 521-826 of Table 16C, or SEQ ID NOS: 521-820 of Table 16C, or SEQ ID NOS: 827-1298 of Table 16, or SEQ ID NOS: 827-1230, 1235-1250 and 1259-1298 of Table 16, or SEQ ID NOS: 827-1270 of Table 16, or SEQ ID NOS: 827-1230, 1235-1250 and 1259-1270 of Table 16, or SEQ ID NOS: 827-1258 of Table 16D, or SEQ ID NOS: 827-1230 and 1235-1250 of Table 16D, or SEQ ID NOS: 1299-1604 of Table 16F, or SEQ ID NOS: 1299-1598 of Table 16F, or a sequence substantially identical or similar thereto, or any of the aforementioned nucleotide sequences in which one or more thymine bases is substituted with a uracil base.


A8. The composition of embodiment A1 or A2, wherein the target nucleic acid sequence contained within the genome of a microorganism comprises or consists essentially of a nucleotide sequence selected from among the nucleotide sequences in SEQ ID NOS: 1605-1979 in Table 17, or SEQ ID NOS: 1605-1806, 1809-1816, 1821-1970, 1972-1974 and 1977-1979 of Table 17, or SEQ ID NOS: 1605-1826 in Table 17, or SEQ ID NOS: 1605-1806, 1809-1816 and 1821-1826 in Table 17, or SEQ ID NOS: 1605-1820 in Table 17A, or SEQ ID NOS: 1605-1806 and 1809-1816 in Table 17A, or SEQ ID NOS: 1827-1979 in Table 17C, or SEQ ID NOS: 1827-1976 in Table 17C, or the complement thereof.
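The SEQ ID NO subsets recited in embodiment A8 can be expanded into explicit sets, for example to check which identifiers of the full Table 17 range are left out of a narrower recited subset. The following is a minimal illustrative sketch only and is not part of the disclosed methods; the function name `expand_seq_id_ranges` is hypothetical, and the range strings are copied from embodiment A8.

```python
import re


def expand_seq_id_ranges(text):
    """Expand a range expression such as '1605-1806, 1809-1816 and 1821-1826'
    into a set of integer SEQ ID NOs. Single numbers are also accepted."""
    ids = set()
    # Each match is (start, end); end is '' when a single number is given.
    for start, end in re.findall(r"(\d+)(?:\s*-\s*(\d+))?", text):
        low = int(start)
        high = int(end) if end else low
        if high < low:
            raise ValueError(f"descending range: {start}-{end}")
        ids.update(range(low, high + 1))
    return ids


# Range strings taken from embodiment A8 (Table 17).
full = expand_seq_id_ranges("1605-1979")
subset = expand_seq_id_ranges(
    "1605-1806, 1809-1816, 1821-1970, 1972-1974 and 1977-1979"
)

# Identifiers in the full range that the narrower subset omits.
excluded = sorted(full - subset)
print(len(full), len(subset), excluded)
```

Under these assumptions, the narrower subset omits nine identifiers of the full range: 1807, 1808, 1817-1820, 1971, 1975 and 1976.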


A9. The composition of embodiment A1, wherein the primer pair specifically amplifies the target nucleic acid in an amplification reaction mixture comprising nucleic acids of the genomes of multiple different microorganisms.


A10. The composition of embodiment A9, wherein the amplification reaction mixture comprises nucleic acid of the genome of a different microorganism that is in the same genus of the microorganism containing the target nucleic acid sequence.


A11. The composition of embodiment A1, wherein the primer pair does not amplify a nucleic acid sequence contained within any other genus of microorganism.


A12. The composition of embodiment A1, wherein the primer pair does not amplify a nucleic acid sequence contained within any other species of microorganism.


A13. The composition of embodiment A1, wherein the primer or primer pair comprises, or consists essentially of, a sequence or sequences selected from the sequences in Table 16, or SEQ ID NOS: 49-520 of Table 16, or SEQ ID NOS: 49-452, 457-472 and 481-520 of Table 16, or SEQ ID NOS: 49-492 of Table 16, or SEQ ID NOS: 49-452, 457-472 and 481-492 of Table 16, or SEQ ID NOS: 49-480 of Table 16A, or SEQ ID NOS: 49-452 and 457-472 of Table 16A, or SEQ ID NOS: 521-826 of Table 16C, or SEQ ID NOS: 521-820 of Table 16C, or SEQ ID NOS: 827-1298 of Table 16, or SEQ ID NOS: 827-1230, 1235-1250 and 1259-1298 of Table 16, or SEQ ID NOS: 827-1270 of Table 16, or SEQ ID NOS: 827-1230, 1235-1250 and 1259-1270 of Table 16, or SEQ ID NOS: 827-1258 of Table 16D, or SEQ ID NOS: 827-1230 and 1235-1250 of Table 16D, or SEQ ID NOS: 1299-1604 of Table 16F, or SEQ ID NOS: 1299-1598 of Table 16F, or a sequence substantially identical or similar thereto, or any of the aforementioned nucleotide sequences in which one or more thymine bases is substituted with a uracil base.


A14. The composition of any of embodiments A1 and A8-A13, wherein at least one primer of the primer pair or both primers of the primer pair contains a modification relative to the target nucleic acid sequence that increases the susceptibility of the primer to cleavage.


A15. The composition of any of embodiments A1 and A8-A13, wherein at least one primer of the primer pair contains one or more or two or more uracil nucleobases or wherein both primers of the primer pair contain one or more or two or more uracil nucleobases.
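The thymine-to-uracil substitution recited in embodiments A7, A13 and A15 can be illustrated as a simple string operation on a primer sequence. The following is a minimal sketch only; the function name `substitute_thymine` and the 12-nucleotide example sequence are hypothetical and are not taken from Table 16.

```python
def substitute_thymine(primer, positions):
    """Return a copy of `primer` in which the thymine at each 0-based
    position in `positions` is replaced with uracil."""
    bases = list(primer.upper())
    for pos in positions:
        if bases[pos] != "T":
            raise ValueError(f"position {pos} is {bases[pos]!r}, not thymine")
        bases[pos] = "U"
    return "".join(bases)


# Hypothetical primer sequence, for illustration only.
example = "ACGTTGCATGCA"
modified = substitute_thymine(example, [3, 8])

# A primer with two substituted positions contains "two or more" uracils.
uracil_count = modified.count("U")
print(modified, uracil_count)
```

The position check is a design choice for the sketch: it rejects a request to substitute a base that is not thymine rather than silently altering the sequence.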


A16. A composition comprising a plurality of nucleic acids and/or nucleic acid primers or primer pairs of any of embodiments A1-A15.


A17. The composition of embodiment A16, wherein the plurality of nucleic acids, primers and/or primer pairs comprises at least one nucleic acid that specifically binds to and/or hybridizes to and/or at least one primer pair that specifically amplifies a genomic target nucleic acid for each of the microorganisms listed in embodiment A1.


A18. The composition of embodiment A16, wherein the plurality of primer pairs comprises at least one primer pair that specifically and separately amplifies a genomic target nucleic acid for each of at least 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, or 70 of the microorganisms of embodiment A1.


A19. The composition of any of embodiments A16-A18, wherein the plurality of primer pairs comprises at least one primer pair that amplifies a target nucleic acid sequence comprising a nucleotide sequence selected from Table 17, or SEQ ID NOS: 1605-1979 in Table 17, or SEQ ID NOS: 1605-1806, 1809-1816, 1821-1970, 1972-1974 and 1977-1979 of Table 17, or SEQ ID NOS: 1605-1826 in Table 17, or SEQ ID NOS: 1605-1806, 1809-1816 and 1821-1826 in Table 17, or SEQ ID NOS: 1605-1820 in Table 17A, or SEQ ID NOS: 1605-1806 and 1809-1816 in Table 17A, or SEQ ID NOS: 1827-1979 in Table 17C, or SEQ ID NOS: 1827-1976 in Table 17C, or substantially identical or similar sequence, or the complement thereof to generate an amplicon sequence that is less than about 500, less than about 475, less than about 450, less than about 400, less than about 375, less than about 350, less than about 300, less than about 275, less than about 250, less than about 200, less than about 175, less than about 150, or less than about 100 nucleotides in length, or that consists essentially of a nucleotide sequence selected from Table 17, or SEQ ID NOS: 1605-1979 in Table 17, or SEQ ID NOS: 1605-1806, 1809-1816, 1821-1970, 1972-1974 and 1977-1979 of Table 17, or SEQ ID NOS: 1605-1826 in Table 17, or SEQ ID NOS: 1605-1806, 1809-1816 and 1821-1826 in Table 17, or SEQ ID NOS: 1605-1820 in Table 17A, or SEQ ID NOS: 1605-1806 and 1809-1816 in Table 17A, or SEQ ID NOS: 1827-1979 in Table 17C, or SEQ ID NOS: 1827-1976 in Table 17C, or substantially identical or similar sequence, and optionally containing one or more of the nucleic acid primer sequences at the 5′ and/or 3′ ends of the sequence.


A20. The composition of any of embodiments A16-A19, wherein each primer pair of the plurality of primer pairs specifically amplifies a different target nucleic acid sequence.


A21. The composition of embodiment A16, wherein the sequences of the plurality of primer pairs comprise or consist essentially of sequences in Table 16, or SEQ ID NOS: 49-520 of Table 16, or SEQ ID NOS: 49-452, 457-472 and 481-520 of Table 16, or SEQ ID NOS: 49-492 of Table 16, or SEQ ID NOS: 49-452, 457-472 and 481-492 of Table 16, or SEQ ID NOS: 49-480 of Table 16A, or SEQ ID NOS: 49-452 and 457-472 of Table 16A, or SEQ ID NOS: 521-826 of Table 16C, or SEQ ID NOS: 521-820 of Table 16C, or SEQ ID NOS: 827-1298 of Table 16, or SEQ ID NOS: 827-1230, 1235-1250 and 1259-1298 of Table 16, or SEQ ID NOS: 827-1270 of Table 16, or SEQ ID NOS: 827-1230, 1235-1250 and 1259-1270 of Table 16, or SEQ ID NOS: 827-1258 of Table 16D, or SEQ ID NOS: 827-1230 and 1235-1250 of Table 16D, or SEQ ID NOS: 1299-1604 of Table 16F, or SEQ ID NOS: 1299-1598 of Table 16F, or a sequence substantially identical or similar thereto, or any of the aforementioned nucleotide sequences in which one or more thymine bases is substituted with a uracil base.


A22. The composition of any of embodiments A1 and A8-A21, further comprising one or more primers or primer pairs that separately amplify a nucleic acid comprising a nucleotide sequence contained within a hypervariable region of a prokaryotic 16S rRNA gene.


A23. The composition of embodiment A22, wherein the one or more primers or primer pairs that amplify a nucleic acid comprising a nucleotide sequence contained within a hypervariable region of a prokaryotic 16S rRNA gene separately amplify a nucleic acid comprising a nucleotide sequence contained within a hypervariable region of a prokaryotic 16S rRNA gene.


A24. The composition of any of embodiments A1-A23, further comprising nucleic acids of a sample from the alimentary canal of an organism.


A25. The composition of embodiment A24, wherein the sample is a fecal sample.


A26. The composition of any of embodiments A1-A25, further comprising a polymerase.


A27. A composition comprising one or more primer pairs capable of amplifying a target nucleic acid sequence comprising a nucleotide sequence selected from Table 17, or SEQ ID NOS: 1605-1979 in Table 17, or SEQ ID NOS: 1605-1806, 1809-1816, 1821-1970, 1972-1974 and 1977-1979 of Table 17, or SEQ ID NOS: 1605-1826 in Table 17, or SEQ ID NOS: 1605-1806, 1809-1816 and 1821-1826 in Table 17, or SEQ ID NOS: 1605-1820 in Table 17A, or SEQ ID NOS: 1605-1806 and 1809-1816 in Table 17A, or SEQ ID NOS: 1827-1979 in Table 17C, or SEQ ID NOS: 1827-1976 in Table 17C, or substantially identical or similar sequence, or the complement thereof.


A28. The composition of embodiment A27, wherein the primer pair specifically amplifies the target nucleic acid sequence.


A29. The composition of embodiment A27 or embodiment A28, wherein at least one primer of the primer pair or both primers of the primer pair contains a modification relative to the target nucleic acid sequence that increases the susceptibility of the primer to cleavage.


A30. The composition of embodiment A27 or embodiment A28, wherein at least one primer of the primer pair contains one or more or two or more uracil nucleobases or wherein both primers of the primer pair contain one or more or two or more uracil nucleobases.


A31. A composition comprising a plurality of primer pairs of any of embodiments A27-A30.


A32. The composition of embodiment A31, wherein each primer pair of the plurality of primer pairs amplifies a different target nucleic acid sequence.


A33. A composition comprising a combination of primer pairs, wherein each of at least three primer pairs in the combination of primer pairs is capable of separately amplifying a nucleic acid comprising a sequence of a different hypervariable region of a prokaryotic 16S rRNA gene and wherein one of the hypervariable regions is a V5 region.


A34. A composition comprising a combination of primer pairs, wherein each of at least eight different primer pairs in the combination of primer pairs is capable of separately amplifying a nucleic acid comprising a sequence of a different hypervariable region of a prokaryotic 16S rRNA gene.


A35. The composition of embodiment A33, wherein each of at least 4, at least 5, at least 6, at least 7, or at least 8 or more different primer pairs in the combination of primer pairs is capable of separately amplifying a nucleic acid comprising a sequence of a different hypervariable region of a prokaryotic 16S rRNA gene.


A36. The composition of any of embodiments A33-A35, wherein the nucleic acids comprising a sequence of a hypervariable region are less than about 200 bp, or less than about 175 bp, or less than about 150 bp, or less than about 125 bp in length.


A37. The composition of any of embodiments A33-A36, wherein each primer of the combination of primer pairs contains less than 7 contiguous nucleotides of sequence identical to a sequence of contiguous nucleotides of another primer in the combination of primer pairs.
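The constraint recited in embodiment A37 can be checked computationally: two primers share a run of 7 or more identical contiguous nucleotides exactly when they have at least one 7-mer in common. The following is a minimal sketch under stated assumptions; the function names and the three primer sequences are hypothetical (not taken from Table 15), and only same-strand identity is compared, without considering reverse complements.

```python
from itertools import combinations


def kmers(seq, k):
    """Return the set of all contiguous k-nucleotide substrings of seq."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def shared_run_pairs(primers, k=7):
    """Return name pairs of primers that share an identical run of k or
    more contiguous nucleotides. `primers` maps primer name to sequence."""
    kmer_sets = {name: kmers(seq, k) for name, seq in primers.items()}
    return [
        (a, b)
        for a, b in combinations(sorted(primers), 2)
        if kmer_sets[a] & kmer_sets[b]
    ]


# Hypothetical primer sequences, for illustration only.
primers = {
    "p1": "ACGTACGGATTCAGCTAGGC",
    "p2": "TTGACCGGATTCAGAAGTCC",  # shares the run CGGATTCAG with p1
    "p3": "GGCATCTTAAGGCCTTAACG",
}

violations = shared_run_pairs(primers, k=7)
print(violations)
```

A combination satisfying the constraint, as read in this sketch, would yield an empty list; here the hypothetical primers p1 and p2 are flagged because they share a 9-nucleotide run.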


A38. The composition of any of embodiments A33-A37, wherein the nucleic acids comprising a sequence of a hypervariable region also contain sequence of a conserved region of a prokaryotic 16S rRNA gene.


A39. The composition of embodiment A38, wherein for at least one of the hypervariable region sequences amplified by the combination of primer pairs, at least two different primer pairs in the combination of primer pairs separately amplify nucleic acids containing a sequence of the same hypervariable region for 2 or more species of a prokaryotic genus having differences in nucleic acid sequences at the same conserved region.


A40. The composition of embodiment A39, wherein the at least one hypervariable region is the V2 region and/or the V8 region.


A41. The composition of embodiment A33, wherein the combination of primer pairs comprises primer pairs containing sequences selected from SEQ ID NOS:1-48 in Table 15.


A42. The composition of embodiment A34, wherein the combination of primer pairs comprises primer pairs containing sequences selected from SEQ ID NOS: 1-48 in Table 15.


A43. The composition of embodiment A33 or embodiment A34, wherein the combination of primer pairs comprises primer pairs containing sequences selected from SEQ ID NOS: 1-24 of Table 15 and/or SEQ ID NOS: 25-48 of Table 15.


A44. The composition of embodiment A33 or embodiment A34, wherein the combination of primer pairs comprises primer pairs containing sequences selected from SEQ ID NOS: 25-48 of Table 15 in which one or more thymine bases is substituted with a uracil base.


A45. A composition comprising two or more primers, wherein the primers comprise or consist essentially of sequences selected from SEQ ID NOs. 11-16, 23 and 24.


A46. A composition comprising two or more primers, wherein the primers comprise or consist essentially of sequences selected from SEQ ID NOs. 35-40, 47 and 48.


A47. The composition of any of embodiments A33-A46, wherein at least one primer or both primers of at least one primer pair contains a modification relative to the nucleic acid sequence being amplified wherein the modification increases the susceptibility of the primer to cleavage.


A48. The composition of any of embodiments A33-A46, wherein at least one primer or both primers of at least one primer pair contains one or more or two or more uracil nucleobases.


A49. A composition comprising nucleic acids, wherein the nucleic acids comprise one or more single-stranded nucleic acids consisting essentially of a nucleotide sequence selected from Table 17, or SEQ ID NOS: 1605-1979 in Table 17, or SEQ ID NOS: 1605-1806, 1809-1816, 1821-1970, 1972-1974 and 1977-1979 of Table 17, or SEQ ID NOS: 1605-1826 in Table 17, or SEQ ID NOS: 1605-1806, 1809-1816 and 1821-1826 in Table 17, or SEQ ID NOS: 1605-1820 in Table 17A, or SEQ ID NOS: 1605-1806 and 1809-1816 in Table 17A, or SEQ ID NOS: 1827-1979 in Table 17C, or SEQ ID NOS: 1827-1976 in Table 17C, or a substantially identical or similar sequence, or the complement thereof, and/or one or more double-stranded nucleic acids consisting essentially of a nucleotide sequence selected from Table 17, or SEQ ID NOS: 1605-1826 in Table 17, or SEQ ID NOS: 1605-1820 in Table 17A, or SEQ ID NOS: 1827-1979 in Table 17C, or SEQ ID NOS: 1827-1976 in Table 17C, or a substantially identical or similar sequence, or the complement thereof, and a complementary nucleotide sequence hybridized thereto.


A50. A composition comprising nucleic acids, wherein the nucleic acids comprise one or more single-stranded nucleic acids consisting essentially of:


(a) a nucleotide sequence selected from Table 17, or SEQ ID NOS: 1605-1979 in Table 17, or SEQ ID NOS: 1605-1806, 1809-1816, 1821-1970, 1972-1974 and 1977-1979 of Table 17, or SEQ ID NOS: 1605-1826 in Table 17, or SEQ ID NOS: 1605-1806, 1809-1816 and 1821-1826 in Table 17, or SEQ ID NOS: 1605-1820 in Table 17A, or SEQ ID NOS: 1605-1806 and 1809-1816 in Table 17A, or SEQ ID NOS: 1827-1979 in Table 17C, or SEQ ID NOS: 1827-1976 in Table 17C, or a substantially identical or similar sequence, or the complement thereof, and


(b) one or more sequences at the 5′ and/or 3′ end of the nucleic acid wherein the one or more sequences is selected from Table 16, or SEQ ID NOS: 49-520 of Table 16, or SEQ ID NOS: 49-452, 457-472 and 481-520 of Table 16, or SEQ ID NOS: 49-492 of Table 16, or SEQ ID NOS: 49-452, 457-472 and 481-492 of Table 16, or SEQ ID NOS: 49-480 of Table 16A, or SEQ ID NOS: 49-452 and 457-472 of Table 16A, or SEQ ID NOS: 521-826 of Table 16C, or SEQ ID NOS: 521-820 of Table 16C, or SEQ ID NOS: 827-1298 of Table 16, or SEQ ID NOS: 827-1230, 1235-1250 and 1259-1298 of Table 16, or SEQ ID NOS: 827-1270 of Table 16, or SEQ ID NOS: 827-1230, 1235-1250 and 1259-1270 of Table 16, or SEQ ID NOS: 827-1258 of Table 16D, or SEQ ID NOS: 827-1230 and 1235-1250 of Table 16D, or SEQ ID NOS: 1299-1604 of Table 16F, or SEQ ID NOS: 1299-1598 of Table 16F, or a sequence substantially identical or similar thereto, or the complement thereof.


A51. A composition comprising nucleic acids, wherein the nucleic acids comprise one or more double-stranded nucleic acids consisting essentially of:


(a) a nucleotide sequence selected from among Table 17, or SEQ ID NOS: 1605-1979 in Table 17, or SEQ ID NOS: 1605-1806, 1809-1816, 1821-1970, 1972-1974 and 1977-1979 of Table 17, or SEQ ID NOS: 1605-1826 in Table 17, or SEQ ID NOS: 1605-1806, 1809-1816 and 1821-1826 in Table 17, or SEQ ID NOS: 1605-1820 in Table 17A, or SEQ ID NOS: 1605-1806 and 1809-1816 in Table 17A, or SEQ ID NOS: 1827-1979 in Table 17C, or SEQ ID NOS: 1827-1976 in Table 17C, or a substantially identical or similar sequence, and a complementary nucleotide sequence hybridized thereto, and


(b) one or more sequences at the 5′ and/or 3′ end of the nucleic acid wherein the one or more sequences is selected from Table 16, or SEQ ID NOS: 49-520 of Table 16, or SEQ ID NOS: 49-452, 457-472 and 481-520 of Table 16, or SEQ ID NOS: 49-492 of Table 16, or SEQ ID NOS: 49-452, 457-472 and 481-492 of Table 16, or SEQ ID NOS: 49-480 of Table 16A, or SEQ ID NOS: 49-452 and 457-472 of Table 16A, or SEQ ID NOS: 521-826 of Table 16C, or SEQ ID NOS: 521-820 of Table 16C, or SEQ ID NOS: 827-1298 of Table 16, or SEQ ID NOS: 827-1230, 1235-1250 and 1259-1298 of Table 16, or SEQ ID NOS: 827-1270 of Table 16, or SEQ ID NOS: 827-1230, 1235-1250 and 1259-1270 of Table 16, or SEQ ID NOS: 827-1258 of Table 16D, or SEQ ID NOS: 827-1230 and 1235-1250 of Table 16D, or SEQ ID NOS: 1299-1604 of Table 16F, or SEQ ID NOS: 1299-1598 of Table 16F, or a sequence substantially identical or similar thereto, and a complementary nucleotide sequence hybridized thereto.


A52. A composition comprising:


(a) nucleic acids in or from a sample, wherein the sample comprises a plurality of microorganisms, and


(b) a composition comprising nucleic acids, one or more primers, and/or primer pairs of any of embodiments A1-A48.


A53. The composition of embodiment A52, further comprising a polymerase.


A54. The composition of embodiment A52 or A53, wherein the sample is from the contents of an alimentary tract of an animal.


A55. The composition of embodiment A54, wherein the sample is a fecal sample.


A56. The composition of embodiment A54 or embodiment A55, wherein the animal is a mammal.


A57. The composition of embodiment A56, wherein the animal is a human.


A58. A composition comprising:


(a) at least 76 nucleic acid primer pairs, wherein the combination of nucleic acid primer pairs is capable of specifically amplifying a different target nucleic acid sequence contained within a genome of any microorganism selected from among Actinomyces viscosus, Akkermansia muciniphila, Anaerococcus vaginalis, Atopobium parvulum, Bacteroides fragilis, Bacteroides nordii, Bacteroides thetaiotaomicron, Bacteroides vulgatus, Barnesiella intestinihominis, Bifidobacterium adolescentis, Bifidobacterium animalis, Bifidobacterium bifidum, Bifidobacterium longum, Blautia coccoides, Blautia obeum, Borreliella burgdorferi, Campylobacter concisus, Campylobacter curvus, Campylobacter gracilis, Campylobacter hominis, Campylobacter jejuni, Campylobacter rectus, Chlamydia pneumoniae, Chlamydia trachomatis, Citrobacter rodentium, Cloacibacillus porcorum, Clostridioides difficile, Collinsella aerofaciens, Collinsella stercoris, Cutibacterium acnes, Desulfovibrio alaskensis, Dorea formicigenerans, Enterococcus faecalis, Enterococcus faecium, Enterococcus gallinarum, Enterococcus hirae, Escherichia coli, Eubacterium limosum, Eubacterium rectale, Faecalibacterium prausnitzii, Fusobacterium nucleatum, Gardnerella vaginalis, Gemmiger formicilis, Helicobacter bilis, Helicobacter bizzozeronii, Helicobacter hepaticus, Helicobacter pylori, Helicobacter salomonis, Holdemania filiformis, Klebsiella pneumoniae, Lactobacillus acidophilus, Lactobacillus delbrueckii, Lactobacillus johnsonii, Lactobacillus murinus, Lactobacillus reuteri, Lactobacillus rhamnosus, Lactococcus lactis, Mycoplasma fermentans, Mycoplasma penetrans, Parabacteroides distasonis, Parabacteroides merdae, Parvimonas micra, Peptostreptococcus anaerobius, Peptostreptococcus stomatis, Phascolarctobacterium faecium, Porphyromonas gingivalis, Prevotella copri, Prevotella histicola, Proteus mirabilis, Roseburia intestinalis, Ruminococcus bromii, Ruminococcus gnavus, Slackia exigua, Streptococcus gallolyticus, Streptococcus infantarius, Veillonella parvula;

(b) at least 75 nucleic acid primer pairs, wherein the combination of nucleic acid primer pairs is capable of specifically amplifying a different target nucleic acid sequence contained within a genome of any microorganism selected from among Akkermansia muciniphila, Anaerococcus vaginalis, Atopobium parvulum, Bacteroides fragilis, Bacteroides nordii, Bacteroides thetaiotaomicron, Bacteroides vulgatus, Barnesiella intestinihominis, Bifidobacterium adolescentis, Bifidobacterium animalis, Bifidobacterium bifidum, Bifidobacterium longum, Blautia coccoides, Blautia obeum, Borreliella burgdorferi, Campylobacter concisus, Campylobacter curvus, Campylobacter gracilis, Campylobacter hominis, Campylobacter jejuni, Campylobacter rectus, Chlamydia pneumoniae, Chlamydia trachomatis, Citrobacter rodentium, Cloacibacillus porcorum, Clostridioides difficile, Collinsella aerofaciens, Collinsella stercoris, Cutibacterium acnes, Desulfovibrio alaskensis, Dorea formicigenerans, Enterococcus faecalis, Enterococcus faecium, Enterococcus gallinarum, Enterococcus hirae, Escherichia coli, Eubacterium limosum, Eubacterium rectale, Faecalibacterium prausnitzii, Fusobacterium nucleatum, Gardnerella vaginalis, Gemmiger formicilis, Helicobacter bilis, Helicobacter bizzozeronii, Helicobacter hepaticus, Helicobacter pylori, Helicobacter salomonis, Holdemania filiformis, Klebsiella pneumoniae, Lactobacillus acidophilus, Lactobacillus delbrueckii, Lactobacillus johnsonii, Lactobacillus murinus, Lactobacillus reuteri, Lactobacillus rhamnosus, Lactococcus lactis, Mycoplasma fermentans, Mycoplasma penetrans, Parabacteroides distasonis, Parabacteroides merdae, Parvimonas micra, Peptostreptococcus anaerobius, Peptostreptococcus stomatis, Phascolarctobacterium faecium, Porphyromonas gingivalis, Prevotella copri, Prevotella histicola, Proteus mirabilis, Roseburia intestinalis, Ruminococcus bromii, Ruminococcus gnavus, Slackia exigua, Streptococcus gallolyticus, Streptococcus infantarius, Veillonella parvula;

(c) at least 74 nucleic acid primer pairs, wherein the combination of nucleic acid primer pairs is capable of specifically amplifying a different target nucleic acid sequence contained within a genome of any microorganism selected from among Akkermansia muciniphila, Anaerococcus vaginalis, Atopobium parvulum, Bacteroides fragilis, Bacteroides nordii, Bacteroides thetaiotaomicron, Bacteroides vulgatus, Barnesiella intestinihominis, Bifidobacterium adolescentis, Bifidobacterium animalis, Bifidobacterium bifidum, Bifidobacterium longum, Blautia obeum, Borreliella burgdorferi, Campylobacter concisus, Campylobacter curvus, Campylobacter gracilis, Campylobacter hominis, Campylobacter jejuni, Campylobacter rectus, Chlamydia pneumoniae, Chlamydia trachomatis, Citrobacter rodentium, Cloacibacillus porcorum, Clostridioides difficile, Collinsella aerofaciens, Collinsella stercoris, Cutibacterium acnes, Desulfovibrio alaskensis, Dorea formicigenerans, Enterococcus faecalis, Enterococcus faecium, Enterococcus gallinarum, Enterococcus hirae, Escherichia coli, Eubacterium limosum, Eubacterium rectale, Faecalibacterium prausnitzii, Fusobacterium nucleatum, Gardnerella vaginalis, Gemmiger formicilis, Helicobacter bilis, Helicobacter bizzozeronii, Helicobacter hepaticus, Helicobacter pylori, Helicobacter salomonis, Holdemania filiformis, Klebsiella pneumoniae, Lactobacillus acidophilus, Lactobacillus delbrueckii, Lactobacillus johnsonii, Lactobacillus murinus, Lactobacillus reuteri, Lactobacillus rhamnosus, Lactococcus lactis, Mycoplasma fermentans, Mycoplasma penetrans, Parabacteroides distasonis, Parabacteroides merdae, Parvimonas micra, Peptostreptococcus anaerobius, Peptostreptococcus stomatis, Phascolarctobacterium faecium, Porphyromonas gingivalis, Prevotella copri, Prevotella histicola, Proteus mirabilis, Roseburia intestinalis, Ruminococcus bromii, Ruminococcus gnavus, Slackia exigua, Streptococcus gallolyticus, Streptococcus infantarius, Veillonella parvula;

(d) at least 74 nucleic acid primer pairs, wherein the combination of nucleic acid primer pairs is capable of specifically amplifying a different target nucleic acid sequence contained within a genome of any microorganism selected from among Akkermansia muciniphila, Anaerococcus vaginalis, Atopobium parvulum, Bacteroides fragilis, Bacteroides nordii, Bacteroides thetaiotaomicron, Bacteroides vulgatus, Barnesiella intestinihominis, Bifidobacterium adolescentis, Bifidobacterium animalis, Bifidobacterium bifidum, Bifidobacterium longum, Blautia coccoides, Blautia obeum, Borreliella burgdorferi, Campylobacter concisus, Campylobacter curvus, Campylobacter gracilis, Campylobacter hominis, Campylobacter jejuni, Campylobacter rectus, Chlamydia pneumoniae, Chlamydia trachomatis, Citrobacter rodentium, Cloacibacillus porcorum, Clostridioides difficile, Collinsella aerofaciens, Collinsella stercoris, Cutibacterium acnes, Desulfovibrio alaskensis, Dorea formicigenerans, Enterococcus faecalis, Enterococcus faecium, Enterococcus gallinarum, Enterococcus hirae, Escherichia coli, Eubacterium limosum, Eubacterium rectale, Faecalibacterium prausnitzii, Fusobacterium nucleatum, Gardnerella vaginalis, Gemmiger formicilis, Helicobacter bilis, Helicobacter bizzozeronii, Helicobacter hepaticus, Helicobacter pylori, Holdemania filiformis, Klebsiella pneumoniae, Lactobacillus acidophilus, Lactobacillus delbrueckii, Lactobacillus johnsonii, Lactobacillus murinus, Lactobacillus reuteri, Lactobacillus rhamnosus, Lactococcus lactis, Mycoplasma fermentans, Mycoplasma penetrans, Parabacteroides distasonis, Parabacteroides merdae, Parvimonas micra, Peptostreptococcus anaerobius, Peptostreptococcus stomatis, Phascolarctobacterium faecium, Porphyromonas gingivalis, Prevotella copri, Prevotella histicola, Proteus mirabilis, Roseburia intestinalis, Ruminococcus bromii, Ruminococcus gnavus, Slackia exigua, Streptococcus gallolyticus, Streptococcus infantarius, Veillonella parvula; or


(e) at least 73 nucleic acid primer pairs, wherein the combination of nucleic acid primer pairs is capable of specifically amplifying a different target nucleic acid sequence contained within a genome of any microorganism selected from among Akkermansia muciniphila, Anaerococcus vaginalis, Atopobium parvulum, Bacteroides fragilis, Bacteroides nordii, Bacteroides thetaiotaomicron, Bacteroides vulgatus, Barnesiella intestinihominis, Bifidobacterium adolescentis, Bifidobacterium animalis, Bifidobacterium bifidum, Bifidobacterium longum, Blautia obeum, Borreliella burgdorferi, Campylobacter concisus, Campylobacter curvus, Campylobacter gracilis, Campylobacter hominis, Campylobacter jejuni, Campylobacter rectus, Chlamydia pneumoniae, Chlamydia trachomatis, Citrobacter rodentium, Cloacibacillus porcorum, Clostridioides difficile, Collinsella aerofaciens, Collinsella stercoris, Cutibacterium acnes, Desulfovibrio alaskensis, Dorea formicigenerans, Enterococcus faecalis, Enterococcus faecium, Enterococcus gallinarum, Enterococcus hirae, Escherichia coli, Eubacterium limosum, Eubacterium rectale, Faecalibacterium prausnitzii, Fusobacterium nucleatum, Gardnerella vaginalis, Gemmiger formicilis, Helicobacter bilis, Helicobacter bizzozeronii, Helicobacter hepaticus, Helicobacter pylori, Holdemania filiformis, Klebsiella pneumoniae, Lactobacillus acidophilus, Lactobacillus delbrueckii, Lactobacillus johnsonii, Lactobacillus murinus, Lactobacillus reuteri, Lactobacillus rhamnosus, Lactococcus lactis, Mycoplasma fermentans, Mycoplasma penetrans, Parabacteroides distasonis, Parabacteroides merdae, Parvimonas micra, Peptostreptococcus anaerobius, Peptostreptococcus stomatis, Phascolarctobacterium faecium, Porphyromonas gingivalis, Prevotella copri, Prevotella histicola, Proteus mirabilis, Roseburia intestinalis, Ruminococcus bromii, Ruminococcus gnavus, Slackia exigua, Streptococcus gallolyticus, Streptococcus infantarius, Veillonella parvula.

A59. The composition of embodiment A58, wherein the combination of nucleic acid primer pairs is capable of specifically amplifying a different target nucleic acid sequence in the genome of all the microorganisms simultaneously in a multiplex nucleic acid amplification reaction.


A60. The composition of embodiment A58 or embodiment A59, wherein one or both primers of the nucleic acid primer pairs contains a modification relative to the target nucleic acid sequence that increases the susceptibility of the primer to cleavage.


A61. The composition of any of embodiments A58-A60, wherein the different target nucleic acid sequences of the genomes of the different microorganisms include sequences selected from the nucleotide sequences in Table 17, or SEQ ID NOS: 1605-1979 in Table 17, or SEQ ID NOS: 1605-1806, 1809-1816, 1821-1970, 1972-1974 and 1977-1979 of Table 17, or SEQ ID NOS: 1605-1826 in Table 17, or SEQ ID NOS: 1605-1806, 1809-1816 and 1821-1826 in Table 17, or SEQ ID NOS: 1605-1820 in Table 17A, or SEQ ID NOS: 1605-1806 and 1809-1816 in Table 17A, or SEQ ID NOS: 1827-1979 in Table 17C, or SEQ ID NOS: 1827-1976 in Table 17C, or the complement thereof.


A62. The composition of embodiment A58, wherein the sequences of the nucleic acid primer pairs comprise sequences selected from the sequences of primer pairs in Table 16 or SEQ ID NOS: 49-1604.


A63. The composition of embodiment A62, wherein the sequences of the nucleic acid primer pairs consist essentially of sequences selected from the sequences of primer pairs in Table 16 or SEQ ID NOS: 49-1604.


A64. The composition of embodiment A58, wherein the sequences of the nucleic acid primer pairs comprise sequences of primer pairs in Tables 16A and 16B or SEQ ID NOS: 49-520.


A65. The composition of embodiment A58, wherein the sequences of the nucleic acid primer pairs consist essentially of sequences of primer pairs in Tables 16A and 16B or SEQ ID NOS: 49-520.


A66. The composition of embodiment A58, wherein the sequences of the nucleic acid primer pairs comprise sequences of primer pairs in Tables 16D and 16E or SEQ ID NOS: 827-1298.


A67. The composition of embodiment A58, wherein the sequences of the nucleic acid primer pairs consist essentially of sequences of primer pairs in Tables 16D and 16E or SEQ ID NOS: 827-1298.


A68. A composition comprising a plurality of primer pairs, wherein each of at least three primer pairs in the plurality of primer pairs is capable of separately amplifying a separate nucleic acid comprising a sequence of one different hypervariable region of a prokaryotic 16S rRNA gene and wherein one of the hypervariable regions is a V5 region.


A69. A composition comprising a plurality of primer pairs, wherein each of at least eight different primer pairs in the plurality of primer pairs is capable of separately amplifying a nucleic acid comprising a sequence of one different hypervariable region of a prokaryotic 16S rRNA gene.


A70. A kit comprising any of the compositions of embodiments A1-A69.


A71. The kit of embodiment A70, further comprising one or more polymerases.


A72. The kit of embodiment A70 or embodiment A71, further comprising one or more oligonucleotide adapters.


A73. The kit of any of embodiments A70-A72, further comprising one or more ligases.


B1. A method for detecting, or determining the presence or absence of, a microorganism in a sample, comprising:


(a) subjecting nucleic acids in or from a sample to nucleic acid amplification using a combination of primer pairs comprising:


(i) one or more primer pairs capable of amplifying a nucleic acid sequence of a hypervariable region of a prokaryotic 16S rRNA gene and


(ii) one or more primer pairs capable of amplifying a target nucleic acid sequence contained within the genome of a target microorganism that is not contained within a hypervariable region of a prokaryotic 16S rRNA gene, wherein different primer pairs amplify different target nucleic acid sequences contained within the genome of different microorganisms; and


(b) detecting one or more amplification products, thereby detecting the target microorganism if it is present in the sample.


B2. A method for detecting, or determining the presence or absence of, a microorganism in a sample, comprising:


(a) subjecting nucleic acids in or from a sample to two separate nucleic acid amplification reactions using a first set of primer pairs for one nucleic acid amplification reaction and a second set of primer pairs for the other nucleic acid amplification reaction, wherein:


(i) the first set of primer pairs comprises one or more primer pairs capable of amplifying a nucleic acid sequence of a hypervariable region of a prokaryotic 16S rRNA gene, and


(ii) the second set of primer pairs comprises one or more primer pairs capable of amplifying a target nucleic acid sequence contained within the genome of a target microorganism that is not contained within a hypervariable region of a prokaryotic 16S rRNA gene, wherein different primer pairs amplify different target nucleic acid sequences contained within the genome of different microorganisms; and


(b) detecting one or more amplification products, thereby detecting the target microorganism if it is present in the sample.


B3. The method of embodiment B1 or embodiment B2, wherein detecting one or more products of amplification using one or more primer pairs of (i) and not detecting a product of amplification using one or more primer pairs of (ii) is indicative of the absence of the target microorganism and the presence of one or more microorganisms different from the target microorganism.
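
The interpretation described in embodiment B3 can be sketched as a small decision function. This is only an illustrative sketch: the function name `interpret_amplification` and its two boolean inputs are hypothetical and do not appear in the disclosure. Embodiment B3 does not address the case in which no product is detected at all, so that case is simply reported as such here.

```python
def interpret_amplification(detected_16s: bool, detected_target: bool) -> str:
    """Hypothetical helper (not from the disclosure).

    detected_16s:    a product was detected for the 16S rRNA hypervariable-region
                     primer pairs of part (i).
    detected_target: a product was detected for the target-specific
                     primer pairs of part (ii).
    """
    if detected_target:
        # A product from the species-specific primers indicates the target.
        return "target microorganism detected"
    if detected_16s:
        # The B3 case: 16S product without a target-specific product.
        return "target absent; other microorganism(s) present"
    # Not covered by embodiment B3; reported without further interpretation.
    return "no amplification product detected"


print(interpret_amplification(detected_16s=True, detected_target=False))
```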


B4. The method of embodiment B3, wherein the one or more microorganisms different from the target microorganism is/are a species that is different from the target microorganism.


B5. A method for detecting, or determining the presence or absence of, one or more microorganisms in a sample, comprising:


(a) subjecting nucleic acids in or from the sample to nucleic acid amplification using a plurality of primer pairs wherein each of at least three primer pairs in the plurality of primer pairs is capable of separately amplifying a separate nucleic acid sequence of one different hypervariable region of a prokaryotic 16S rRNA gene and wherein one of the hypervariable regions is a V5 region; and


(b) detecting one or more amplification products, thereby detecting the one or more microorganisms in the sample if the one or more microorganisms is present in the sample.


B6. A method for detecting, or determining the presence or absence of, one or more microorganisms in a sample, comprising:


(a) subjecting nucleic acids in or from the sample to nucleic acid amplification using a plurality of primer pairs wherein each of at least 8 primer pairs in the plurality of primer pairs is capable of separately amplifying a separate nucleic acid comprising a sequence of one different hypervariable region of a prokaryotic 16S rRNA gene; and


(b) detecting one or more amplification products, thereby detecting one or more microorganisms in the sample if the one or more microorganisms is present in the sample.


B7. A method for detecting, or determining the presence or absence of, a microorganism in a sample, comprising:


(a) subjecting nucleic acids in or from the sample to nucleic acid amplification using one or more primer pairs, wherein at least one of the one or more primer pairs is capable of specifically amplifying a target nucleic acid sequence contained within a genome of a microorganism selected from among the microorganisms of embodiment A1; and


(b) detecting one or more amplification products, thereby detecting one or more microorganisms selected from among the microorganisms of embodiment A1 if the one or more of the microorganisms is present in the sample.


B8. The method of embodiment B1 or embodiment B2, wherein the one or more primer pairs capable of amplifying a target nucleic acid sequence contained within a genome of the microorganism that is not contained within a hypervariable region of a prokaryotic 16S rRNA gene specifically amplifies the target nucleic acid sequence contained within the genome of the microorganism.


B9. The method of embodiment B1, embodiment B2 or embodiment B8, wherein the one or more primer pairs capable of amplifying a nucleic acid sequence of a hypervariable region of a prokaryotic 16S rRNA gene comprises at least 2 or more primer pairs, each of which separately amplifies a nucleic acid sequence of a different hypervariable region of a prokaryotic 16S rRNA gene, at least 3 or more primer pairs, each of which separately amplifies a nucleic acid sequence of a different hypervariable region of a prokaryotic 16S rRNA gene, at least 4 or more primer pairs, each of which separately amplifies a nucleic acid sequence of a different hypervariable region of a prokaryotic 16S rRNA gene, at least 5 or more primer pairs, each of which separately amplifies a nucleic acid sequence of a different hypervariable region of a prokaryotic 16S rRNA gene, at least 6 or more primer pairs, each of which separately amplifies a nucleic acid sequence of a different hypervariable region of a prokaryotic 16S rRNA gene, at least 7 or more primer pairs, each of which separately amplifies a nucleic acid sequence of a different hypervariable region of a prokaryotic 16S rRNA gene, or at least 8 or more primer pairs, each of which separately amplifies a nucleic acid sequence of a different hypervariable region of a prokaryotic 16S rRNA gene.


B10. The method of embodiment B1, embodiment B2 or embodiment B8, wherein the one or more primer pairs capable of amplifying a nucleic acid sequence of a hypervariable region of a prokaryotic 16S rRNA gene comprises at least 3 or more primer pairs, each of which separately amplifies a nucleic acid sequence of a different hypervariable region of a prokaryotic 16S rRNA gene and wherein one of the 3 or more regions is a V5 region.


B11. The method of embodiment B1, B2, B8 or B9, wherein the one or more primer pairs capable of amplifying a target nucleic acid sequence that is not contained within a hypervariable region of a prokaryotic 16S rRNA gene does not amplify a nucleic acid sequence contained within any other genus of microorganism.


B12. The method of embodiment B1, B2, B8 or B9, wherein the one or more primer pairs capable of amplifying a target nucleic acid sequence that is not contained within a hypervariable region of a prokaryotic 16S rRNA gene does not amplify a nucleic acid sequence contained within any other species of microorganism.


B13. The method of embodiment B1, B2, B8 or B9, wherein the one or more primer pairs capable of amplifying a target nucleic acid sequence that is not contained within a hypervariable region of a prokaryotic 16S rRNA gene is selected from the primer pairs in Table 16 or SEQ ID NOS: 49-1604.


B14. The method of embodiment B1, B2, B8, B9, B10, B11, B12 or B13, wherein the microorganism is in a genus selected from among the genera listed in embodiment A1.


B15. The method of embodiment B1, B2, B8, B9, B10, B11, B12 or B13, wherein the microorganism is selected from among the species listed in embodiment A1.


B16. The method of embodiment B1, B2, B8 or B9, wherein the target nucleic acid sequence contained within the genome of the microorganism comprises, or consists essentially of, a nucleotide sequence selected from among the nucleotide sequences in Table 17, or SEQ ID NOS: 1605-1979 in Table 17, or SEQ ID NOS: 1605-1806, 1809-1816, 1821-1970, 1972-1974 and 1977-1979 of Table 17, or SEQ ID NOS: 1605-1826 in Table 17, or SEQ ID NOS: 1605-1806, 1809-1816 and 1821-1826 in Table 17, or SEQ ID NOS: 1605-1820 in Table 17A, or SEQ ID NOS: 1605-1806 and 1809-1816 in Table 17A, or SEQ ID NOS: 1827-1979 in Table 17C, or SEQ ID NOS: 1827-1976 in Table 17C, a substantially identical or similar sequence, or the complement thereof.


B17. The method of embodiment B1, B2, B8 or B9, wherein a product of the nucleic acid amplification comprises a nucleotide sequence selected from among the nucleotide sequences in Table 17, or SEQ ID NOS: 1605-1979 in Table 17, or SEQ ID NOS: 1605-1806, 1809-1816, 1821-1970, 1972-1974 and 1977-1979 of Table 17, or SEQ ID NOS: 1605-1826 in Table 17, or SEQ ID NOS: 1605-1806, 1809-1816 and 1821-1826 in Table 17, or SEQ ID NOS: 1605-1820 in Table 17A, or SEQ ID NOS: 1605-1806 and 1809-1816 in Table 17A, or SEQ ID NOS: 1827-1979 in Table 17C, or SEQ ID NOS: 1827-1976 in Table 17C, a substantially identical or similar sequence, or the complement thereof.


B18. The method of any of embodiments B1-B17, wherein the one or more microorganisms are bacteria.


B19. The method of any of embodiments B1-B18, wherein the prokaryotic 16S rRNA gene is a bacterial gene.


B20. The method of embodiment B5, wherein each of at least 4 primer pairs in the plurality of primer pairs is capable of separately amplifying a separate nucleic acid sequence of one different hypervariable region of a prokaryotic 16S rRNA gene, wherein each of at least 5 primer pairs in the plurality of primer pairs is capable of separately amplifying a separate nucleic acid sequence of one different hypervariable region of a prokaryotic 16S rRNA gene, wherein each of at least 6 primer pairs in the plurality of primer pairs is capable of separately amplifying a separate nucleic acid sequence of one different hypervariable region of a prokaryotic 16S rRNA gene, wherein each of at least 7 primer pairs in the plurality of primer pairs is capable of separately amplifying a separate nucleic acid sequence of one different hypervariable region of a prokaryotic 16S rRNA gene, wherein each of at least 8 primer pairs in the plurality of primer pairs is capable of separately amplifying a separate nucleic acid sequence of one different hypervariable region of a prokaryotic 16S rRNA gene, or wherein each of at least 9 primer pairs in the plurality of primer pairs is capable of separately amplifying a separate nucleic acid sequence of one different hypervariable region of a prokaryotic 16S rRNA gene.


B21. The method of embodiment B6, wherein the plurality of primer pairs comprises a combination of primer pairs which is capable of separately amplifying separate nucleic acid sequences of 8 different hypervariable regions of a prokaryotic 16S rRNA gene.


B22. The method of embodiment B20 or embodiment B21, wherein the nucleic acid sequences are less than about 200 bp, or less than about 175 bp, or less than about 150 bp, or less than about 125 bp in length.


B23. The method of any of embodiments B20-B22, wherein each primer of the plurality of primer pairs contains less than 7 contiguous nucleotides of sequence identical to a sequence of contiguous nucleotides of another primer in the combination of primer pairs.
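
The constraint in embodiment B23 (no primer shares a run of 7 or more identical contiguous nucleotides with another primer) can be checked with a simple k-mer comparison. The following is a minimal sketch under stated assumptions: the function names `shares_kmer` and `violating_pairs` are hypothetical, the example sequences are made-up placeholders rather than sequences from Table 15 or Table 16, and only same-strand identity is examined (complementarity between primers is not considered).

```python
from itertools import combinations


def shares_kmer(a: str, b: str, k: int = 7) -> bool:
    """Return True if sequences a and b share an identical run of k contiguous bases."""
    kmers = {a[i:i + k] for i in range(len(a) - k + 1)}
    return any(b[i:i + k] in kmers for i in range(len(b) - k + 1))


def violating_pairs(primers: dict, k: int = 7) -> list:
    """List the name pairs of primers that share k or more identical contiguous bases."""
    return [
        (name1, name2)
        for (name1, seq1), (name2, seq2) in combinations(primers.items(), 2)
        if shares_kmer(seq1.upper(), seq2.upper(), k)
    ]


# Placeholder sequences for illustration only (not from the disclosure).
example = {
    "p1": "ACGTACGTAAGG",
    "p2": "TTACGTACGCCA",  # shares the 7-mer ACGTACG with p1
    "p3": "GGGCCCTTTAAA",
}
print(violating_pairs(example))
```

Two primers share a run of at least 7 identical bases exactly when they share at least one identical 7-mer, so comparing 7-mer sets is sufficient for this check.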


B24. The method of any of embodiments B20-B23, wherein for at least one of the hypervariable regions amplified by the plurality of primer pairs, at least two different primer pairs in the plurality of primer pairs separately amplify nucleic acid sequence within the same hypervariable region for 2 or more species of a prokaryotic genus having differences in nucleic acid sequences at the same hypervariable region.


B25. The method of any of embodiments B20-B23, wherein for at least one of the hypervariable regions amplified by the plurality of primer pairs, at least two different primer pairs in the plurality of primer pairs separately amplify nucleic acid sequence within the same hypervariable region for 2 or more strains of a prokaryotic species having differences in nucleic acid sequences at the same hypervariable region.


B26. The method of embodiment B24 or embodiment B25, wherein the at least one hypervariable region is the V2 region and/or the V8 region.


B27. The method of any of embodiments B20-B26, wherein the one or more microorganisms are bacteria.


B28. The method of any of embodiments B20-B27, wherein the prokaryotic 16S rRNA gene is a bacterial gene.


B29. The method of embodiment B7, wherein the at least one primer pair does not detectably amplify a nucleic acid sequence contained within any genus other than the genus of the microorganism.


B30. The method of embodiment B7, wherein the at least one primer pair does not detectably amplify a nucleic acid sequence contained within any species other than the species of the microorganism.


B31. The method of embodiment B7, wherein at least one primer pair of the one or more primer pairs, or at least one primer of the one or more primer pairs, comprises, or consists essentially of, a sequence of a primer or sequences of a primer pair selected from sequences in Table 16, or SEQ ID NOS: 49-520 of Table 16, or SEQ ID NOS: 49-452, 457-472 and 481-520 of Table 16, or SEQ ID NOS: 49-492 of Table 16, or SEQ ID NOS: 49-452, 457-472 and 481-492 of Table 16, or SEQ ID NOS: 49-480 of Table 16A, or SEQ ID NOS: 49-452 and 457-472 of Table 16A, or SEQ ID NOS: 521-826 of Table 16C, or SEQ ID NOS: 521-820 of Table 16C, or SEQ ID NOS: 827-1298 of Table 16, or SEQ ID NOS: 827-1230, 1235-1250 and 1259-1298 of Table 16, or SEQ ID NOS: 827-1270 of Table 16, or SEQ ID NOS: 827-1230, 1235-1250 and 1259-1270 of Table 16, or SEQ ID NOS: 827-1258 of Table 16D, or SEQ ID NOS: 827-1230 and 1235-1250 of Table 16D, or SEQ ID NOS: 1299-1604 of Table 16F, or SEQ ID NOS: 1299-1598 of Table 16F, or substantially identical or similar sequences, or any of the aforementioned nucleotide sequences in which one or more thymine bases is substituted with a uracil base.
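
The thymine-to-uracil substitution mentioned in embodiment B31 can be illustrated with a short helper. This is a sketch only: the function name `substitute_uracil`, the zero-based position convention, and the example sequence are assumptions made for illustration and are not taken from Table 16.

```python
def substitute_uracil(primer: str, positions) -> str:
    """Replace the thymine at each zero-based position with uracil.

    Raises ValueError if a requested position does not hold a thymine.
    """
    bases = list(primer.upper())
    for pos in positions:
        if bases[pos] != "T":
            raise ValueError(f"position {pos} is {bases[pos]!r}, not 'T'")
        bases[pos] = "U"
    return "".join(bases)


# Placeholder sequence for illustration only (not from the disclosure).
print(substitute_uracil("ACGTTGCA", [3]))
```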


B32. The method of embodiment B7, wherein at least one primer pair of the one or more primer pairs, or at least one primer of the one or more primer pairs, comprises, or consists essentially of, a sequence of a primer or sequences of a primer pair selected from sequences in Tables 16D, 16E and 16F or SEQ ID NOS: 827-1604 of Table 16, or SEQ ID NOS: 827-1298 of Table 16, or SEQ ID NOS: 827-1230, 1235-1250 and 1259-1298 of Table 16, or SEQ ID NOS: 827-1270 of Table 16, or SEQ ID NOS: 827-1230, 1235-1250 and 1259-1270 of Table 16, or SEQ ID NOS: 827-1258 of Table 16D, or SEQ ID NOS: 827-1230 and 1235-1250 of Table 16D, or SEQ ID NOS: 1299-1604 of Table 16F, or SEQ ID NOS: 1299-1598 of Table 16F, or substantially identical or similar sequences, in which one or more thymine bases is substituted with a uracil base.


B33. The method of embodiment B7, wherein the nucleic acids are subjected to nucleic acid amplification using a plurality of primers or primer pairs, each containing, or consisting essentially of, a sequence or sequences selected from sequences in Table 16, or SEQ ID NOS: 49-520 of Table 16, or SEQ ID NOS: 49-452, 457-472 and 481-520 of Table 16, or SEQ ID NOS: 49-492 of Table 16, or SEQ ID NOS: 49-452, 457-472 and 481-492 of Table 16, or SEQ ID NOS: 49-480 of Table 16A, or SEQ ID NOS: 49-452 and 457-472 of Table 16A, or SEQ ID NOS: 521-826 of Table 16C, or SEQ ID NOS: 521-820 of Table 16C, or SEQ ID NOS: 827-1298 of Table 16, or SEQ ID NOS: 827-1230, 1235-1250 and 1259-1298 of Table 16, or SEQ ID NOS: 827-1270 of Table 16, or SEQ ID NOS: 827-1230, 1235-1250 and 1259-1270 of Table 16, or SEQ ID NOS: 827-1258 of Table 16D, or SEQ ID NOS: 827-1230 and 1235-1250 of Table 16D, or SEQ ID NOS: 1299-1604 of Table 16F, or SEQ ID NOS: 1299-1598 of Table 16F, or substantially identical or similar sequences, or any of the aforementioned nucleotide sequences in which one or more thymine bases is substituted with a uracil base.


B34. The method of embodiment B7, wherein the nucleic acids are subjected to nucleic acid amplification using a plurality of primers or primer pairs, each containing, or consisting essentially of, a sequence or sequences selected from sequences in Tables 16D, 16E and 16F or SEQ ID NOS: 827-1604 of Table 16, or SEQ ID NOS: 827-1298 of Table 16, or SEQ ID NOS: 827-1230, 1235-1250 and 1259-1298 of Table 16, or SEQ ID NOS: 827-1270 of Table 16, or SEQ ID NOS: 827-1230, 1235-1250 and 1259-1270 of Table 16, or SEQ ID NOS: 827-1258 of Table 16D, or SEQ ID NOS: 827-1230 and 1235-1250 of Table 16D, or SEQ ID NOS: 1299-1604 of Table 16F, or SEQ ID NOS: 1299-1598 of Table 16F, or substantially identical or similar sequences, in which one or more thymine bases is substituted with a uracil base.


B35. The method of embodiment B7, wherein the target nucleic acid sequence comprises a nucleotide sequence selected from among the nucleotide sequences in Table 17, or SEQ ID NOS: 1605-1979 in Table 17, or SEQ ID NOS: 1605-1806, 1809-1816, 1821-1970, 1972-1974 and 1977-1979 of Table 17, or SEQ ID NOS: 1605-1826 in Table 17, or SEQ ID NOS: 1605-1806, 1809-1816 and 1821-1826 in Table 17, or SEQ ID NOS: 1605-1820 in Table 17A, or SEQ ID NOS: 1605-1806 and 1809-1816 in Table 17A, or SEQ ID NOS: 1827-1979 in Table 17C, or SEQ ID NOS: 1827-1976 in Table 17C, a substantially identical or similar sequence, or the complement thereof.


B36. The method of embodiment B7, wherein detecting one or more amplification products comprises detecting one or more nucleotide sequences selected from sequences in Table 17, or SEQ ID NOS: 1605-1979 in Table 17, or SEQ ID NOS: 1605-1806, 1809-1816, 1821-1970, 1972-1974 and 1977-1979 of Table 17, or SEQ ID NOS: 1605-1826 in Table 17, or SEQ ID NOS: 1605-1806, 1809-1816 and 1821-1826 in Table 17, or SEQ ID NOS: 1605-1820 in Table 17A, or SEQ ID NOS: 1605-1806 and 1809-1816 in Table 17A, or SEQ ID NOS: 1827-1979 in Table 17C, or SEQ ID NOS: 1827-1976 in Table 17C, a substantially identical or similar sequence, or the complement thereof.


B37. The method of embodiment B7, wherein detecting one or more amplification products comprises detecting one or more nucleotide sequences selected from sequences in Table 17, or SEQ ID NOS: 1605-1979 in Table 17, or SEQ ID NOS: 1605-1806, 1809-1816, 1821-1970, 1972-1974 and 1977-1979 of Table 17, or SEQ ID NOS: 1605-1826 in Table 17, or SEQ ID NOS: 1605-1806, 1809-1816 and 1821-1826 in Table 17, or SEQ ID NOS: 1605-1820 in Table 17A, or SEQ ID NOS: 1605-1806 and 1809-1816 in Table 17A, or SEQ ID NOS: 1827-1979 in Table 17C, or SEQ ID NOS: 1827-1976 in Table 17C, a substantially identical or similar sequence, or the complement thereof, and optionally having one or more primer sequences at the 5′ and/or 3′ end(s) of the sequence.


B38. The method of embodiment B7, wherein the target nucleic acid is not contained within a prokaryotic 16S rRNA gene.


B39. The method of embodiment B1, embodiment B5, embodiment B6, or embodiments B8-B28, wherein a primer pair or a plurality of primer pairs capable of amplifying a nucleic acid sequence of a hypervariable region of a prokaryotic 16S rRNA gene comprises one or more primer pairs selected from the sequences of primer pairs in Table 15, or substantially identical or similar sequences, or any of the aforementioned nucleotide sequences in which one or more thymine bases is substituted with a uracil base.


B40. The method of embodiment B1, B2, B5, B6, or B8-B21, wherein a primer pair or a plurality of primer pairs capable of amplifying a nucleic acid sequence of one or more hypervariable regions of a prokaryotic 16S rRNA gene comprises one or more primer pairs containing, or consisting essentially of, sequences selected from Table 15, or SEQ ID NOS: 1-24 of Table 15 and/or SEQ ID NOS: 25-48 of Table 15, or substantially identical or similar sequences.


B41. The method of embodiment B1, embodiment B2, embodiment B5, embodiment B6 or embodiments B8-B21, wherein a primer pair or a plurality of primer pairs capable of amplifying a nucleic acid sequence of one or more hypervariable regions of a prokaryotic 16S rRNA gene comprises one or more primer pairs containing, or consisting essentially of, sequences selected from SEQ ID NOS: 25-48 of Table 15, or substantially identical or similar sequences, in which one or more thymine bases is substituted with a uracil base.


B42. The method of any of embodiments B1-B30 or B31-B36, wherein one or both primers of one or more primer pairs contains a modification relative to a nucleic acid sequence amplified by the primer pair wherein the modification reduces the binding of the primer to other primers.


B43. The method of any of embodiments B1-B30 or B31-B36, wherein one or both primers of one or more primer pairs contains a modification relative to a nucleic acid sequence amplified by the primer pair wherein the modification increases the susceptibility of the primer to cleavage.


B44. The method of any of embodiments B1-B43, wherein one or both primers of one or more primer pairs contains one or more or two or more uracil nucleobases.


B45. The method of any of embodiments B1-B44, wherein the sample comprises nucleic acids of the genomes of multiple different microorganisms.


B46. The method of any of embodiments B1-B44, wherein the sample comprises nucleic acid of the genome of a different microorganism that is in the same genus of the microorganism being detected.


B47. The method of any of embodiments B1-B44, wherein two or more microorganisms in the sample are detected.


B48. The method of any of embodiments B1-B44, wherein the one or more primer pairs, or plurality of primer pairs, or combination of primer pairs comprises at least 2, 5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 120, 130, 140, 150, 160, 170, 180, 190, 200, 210, 220, 230 or more primer pairs.


B49. The method of embodiment B48, wherein the nucleic acid amplification is conducted in a single amplification reaction mixture.


B50. The method of any of embodiments B1-B49, wherein detecting comprises contacting an amplification reaction mixture after nucleic acid amplification with one or more detectable probes that specifically interacts with a product of amplification of nucleic acids comprising a sequence that is amplified by the one or more primer pairs or the plurality of primer pairs or the combination of primer pairs.


B51. The method of any of embodiments B1-B49, wherein detecting comprises performing nucleic acid sequencing of an amplification reaction mixture after nucleic acid amplification.


B52. The method of embodiment B51, further comprising aligning a nucleotide sequence obtained from sequencing with a reference sequence.


B53. The method of any of embodiments B1-B52, further comprising determining the abundance of one or more microorganisms present in the sample.


B54. The method of any of embodiments B1-B52, wherein detecting comprises identifying the genus and species of one or more microorganisms present in the sample.


B55. The method of any of embodiments B1-B52, wherein detecting comprises identifying the genus and species of two or more microorganisms present in the sample.


B56. The method of any of embodiments B1-B55, wherein the sample is a biological sample.


B57. The method of embodiment B56, wherein the sample is a fecal sample.


B58. A method for amplifying a target nucleic acid of one or more microorganisms, comprising:


(a) obtaining nucleic acids of one or more microorganisms selected from among the microorganisms of embodiment A1; and


(b) subjecting the nucleic acids to nucleic acid amplification using at least one primer pair that specifically amplifies a target nucleic acid sequence contained within a genome of a microorganism selected from among the microorganisms of embodiment A1 thereby producing amplified copies of the target nucleic acid.


B59. The method of embodiment B58, wherein the at least one primer pair does not detectably amplify a nucleic acid sequence contained within any genus other than the genus of the microorganism containing the target nucleic acid sequence.


B60. The method of embodiment B58, wherein the at least one primer pair does not detectably amplify a nucleic acid sequence contained within any species other than the species of the microorganism containing the target nucleic acid sequence.


B61. The method of embodiment B58, wherein the at least one primer pair is selected from the sequences of primer pairs in Table 16, or SEQ ID NOS: 49-520 of Table 16, or SEQ ID NOS: 49-452, 457-472 and 481-520 of Table 16, or SEQ ID NOS: 49-492 of Table 16, or SEQ ID NOS: 49-452, 457-472 and 481-492 of Table 16, or SEQ ID NOS: 49-480 of Table 16A, or SEQ ID NOS: 49-452 and 457-472 of Table 16A, or SEQ ID NOS: 521-826 of Table 16C, or SEQ ID NOS: 521-820 of Table 16C, or SEQ ID NOS: 827-1298 of Table 16, or SEQ ID NOS: 827-1230, 1235-1250 and 1259-1298 of Table 16, or SEQ ID NOS: 827-1270 of Table 16, or SEQ ID NOS: 827-1230, 1235-1250 and 1259-1270 of Table 16, or SEQ ID NOS: 827-1258 of Table 16D, or SEQ ID NOS: 827-1230 and 1235-1250 of Table 16D, or SEQ ID NOS: 1299-1604 of Table 16F, or SEQ ID NOS: 1299-1598 of Table 16F, or substantially identical or similar sequences, or any of the aforementioned nucleotide sequences in which one or more thymine bases is substituted with a uracil base.


B62. The method of embodiment B58, wherein the target nucleic acid sequence comprises a nucleotide sequence selected from among the nucleotide sequences in Table 17, or SEQ ID NOS: 1605-1979 in Table 17, or SEQ ID NOS: 1605-1806, 1809-1816, 1821-1970, 1972-1974 and 1977-1979 of Table 17, or SEQ ID NOS: 1605-1826 in Table 17, or SEQ ID NOS: 1605-1806, 1809-1816 and 1821-1826 in Table 17, or SEQ ID NOS: 1605-1820 in Table 17A, or SEQ ID NOS: 1605-1806 and 1809-1816 in Table 17A, or SEQ ID NOS: 1827-1979 in Table 17C, or SEQ ID NOS: 1827-1976 in Table 17C, a substantially identical or similar sequence, or the complement thereof.


B63. The method of embodiment B58, wherein a product of the nucleic acid amplification comprises a nucleotide sequence selected from Table 17, or SEQ ID NOS: 1605-1979 in Table 17, or SEQ ID NOS: 1605-1806, 1809-1816, 1821-1970, 1972-1974 and 1977-1979 of Table 17, or SEQ ID NOS: 1605-1826 in Table 17, or SEQ ID NOS: 1605-1806, 1809-1816 and 1821-1826 in Table 17, or SEQ ID NOS: 1605-1820 in Table 17A, or SEQ ID NOS: 1605-1806 and 1809-1816 in Table 17A, or SEQ ID NOS: 1827-1979 in Table 17C, or SEQ ID NOS: 1827-1976 in Table 17C, a substantially identical or similar sequence, or the complement thereof, and optionally having one or more primer sequences at the 5′ and/or 3′ end(s) of the sequence.


B64. The method of any of embodiments B58-B63, wherein the method comprises obtaining nucleic acids for two or more or a plurality of microorganisms and subjecting the nucleic acids to multiplex nucleic acid amplification using at least two or more primer pairs.


B65. A method for amplifying multiple regions of a gene of one or more microorganisms, comprising:


(a) obtaining nucleic acids of one or more microorganisms comprising a 16S rRNA gene;


(b) subjecting the nucleic acids to nucleic acid amplification using a plurality of primer pairs wherein each of at least 3 primer pairs in the plurality of primer pairs is capable of separately amplifying a separate nucleic acid sequence of one different hypervariable region of a prokaryotic 16S rRNA gene and wherein one of the hypervariable regions is a V5 region thereby producing separate amplified copies of separate nucleic acid sequences of at least 3 different hypervariable regions of the 16S rRNA gene of one or more microorganisms.


B66. A method for amplifying multiple regions of a gene of one or more microorganisms, comprising:


(a) obtaining nucleic acids of one or more microorganisms comprising a 16S rRNA gene;


(b) subjecting the nucleic acids to nucleic acid amplification using a plurality of primer pairs wherein each of at least 8 primer pairs in the plurality of primer pairs is capable of separately amplifying a separate nucleic acid sequence of one different hypervariable region of a prokaryotic 16S rRNA gene thereby producing separate amplified copies of separate nucleic acid sequences of at least 8 different hypervariable regions of the 16S rRNA gene of one or more microorganisms.


B67. The method of embodiment B65 or embodiment B66, wherein the amplified copies of separate nucleic acid sequences are less than about 200 bp, or less than about 175 bp, or less than about 150 bp, or less than about 125 bp in length.


B68. The method of any of embodiments B65-B67, wherein each primer of the plurality of primer pairs contains less than 7 contiguous nucleotides of sequence identical to a sequence of contiguous nucleotides of another primer in the plurality of primer pairs.
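
The sequence-overlap limit of embodiment B68 can be screened for computationally. The following is a minimal sketch, assuming primers are supplied as plain 5′-to-3′ strings and that only same-strand identity is compared (reverse complements are not considered). The function names and the example primer strings in the usage note are hypothetical and are not taken from Table 15.

```python
from itertools import combinations


def longest_shared_run(a: str, b: str) -> int:
    """Length of the longest contiguous substring common to a and b."""
    best = 0
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        cur = [0] * (len(b) + 1)
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                # Extend the run that ended at the previous character of each string.
                cur[j] = prev[j - 1] + 1
                best = max(best, cur[j])
        prev = cur
    return best


def primers_violating_limit(primers: dict, limit: int = 7) -> list:
    """Return name pairs of primers sharing `limit` or more contiguous identical nucleotides."""
    return [
        (n1, n2)
        for (n1, s1), (n2, s2) in combinations(primers.items(), 2)
        if longest_shared_run(s1.upper(), s2.upper()) >= limit
    ]
```

For example, calling `primers_violating_limit({"p1": "ACGTACGTTT", "p2": "GGACGTACGA", "p3": "TTTTCCCC"})` flags only the pair `("p1", "p2")`, whose members share the 7-nucleotide run `ACGTACG`.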


B69. The method of any of embodiments B65-B68, wherein for at least one of the nucleic acid sequences of different hypervariable regions amplified by the plurality of primer pairs, at least two different primer pairs in the plurality of primer pairs separately amplify nucleic acid sequence of the same hypervariable region or regions for 2 or more species of the same prokaryotic genus having differences in nucleic acid sequences at the same hypervariable region.


B70. The method of any of embodiments B65-B68, wherein for at least one of the nucleic acid sequences of different hypervariable regions amplified by the plurality of primer pairs, at least two different primer pairs in the combination of primer pairs separately amplify nucleic acid sequence of the same hypervariable region or regions for 2 or more strains of a prokaryotic species having differences in nucleic acid sequences at the same hypervariable region.


B71. The method of embodiment B69 or embodiment B70, wherein the same hypervariable region or regions is the V2 region and/or the V8 region.


B72. The method of embodiment B65 or embodiment B66, wherein the plurality of primer pairs capable of separately amplifying a separate nucleic acid sequence of different hypervariable regions of a prokaryotic 16S rRNA gene comprises primer pairs containing, or consisting essentially of, sequences selected from Table 15 or substantially identical or similar sequences.


B73. The method of embodiment B65 or embodiment B66, wherein the plurality of primer pairs capable of separately amplifying a separate nucleic acid sequence of different hypervariable regions of a prokaryotic 16S rRNA gene comprises primer pairs containing, or consisting essentially of, sequences selected from SEQ ID NOS: 1-24 of Table 15 and/or SEQ ID NOS: 25-48 of Table 15 or substantially identical or similar sequences.


B74. The method of embodiment B65 or embodiment B66, wherein the plurality of primer pairs capable of separately amplifying a separate nucleic acid sequence of different hypervariable regions of a prokaryotic 16S rRNA gene comprises primer pairs containing, or consisting essentially of, sequences selected from SEQ ID NOS: 25-48 of Table 15 or substantially identical or similar sequences in which one or more thymine bases is substituted with a uracil base.
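
The thymine-to-uracil substitution recited in embodiment B74 can be illustrated as a simple string operation. The following is a minimal sketch, assuming the primer is a plain DNA string and that the positions to substitute (0-based) are chosen by the user. The function name and the example sequence in the usage note are hypothetical and are not drawn from Table 15.

```python
def substitute_uracil(primer: str, positions) -> str:
    """Replace the thymine at each given 0-based position with uracil."""
    bases = list(primer.upper())
    for pos in positions:
        if bases[pos] != "T":
            # Refuse to substitute a base that is not thymine.
            raise ValueError(f"position {pos} is {bases[pos]!r}, not thymine")
        bases[pos] = "U"
    return "".join(bases)
```

For example, `substitute_uracil("ACGTTAGCT", [3, 8])` returns `"ACGUTAGCU"`, leaving the thymine at position 4 unchanged.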


B75. The method of any of embodiments B65-B74, wherein the method comprises obtaining nucleic acids of two or more or a plurality of microorganisms and subjecting the nucleic acids to multiplex nucleic acid amplification using the plurality of primer pairs.


B76. The method of any of embodiments B65-B75, wherein the one or more microorganisms are bacteria.


B77. A method for amplifying genome regions of one or more microorganisms, comprising:


(a) obtaining nucleic acids of one or more microorganisms comprising a 16S rRNA gene,


(b) subjecting the nucleic acids to nucleic acid amplification using a combination of primer pairs comprising:


(i) one or more primer pairs that separately amplifies a nucleic acid sequence of a hypervariable region of a prokaryotic 16S rRNA gene, and


(ii) one or more primer pairs that amplify a target nucleic acid sequence contained within the genome of a microorganism that is not contained within a hypervariable region of a prokaryotic 16S rRNA gene, wherein different primer pairs amplify different target nucleic acid sequences contained within the genome of different microorganisms; and


(c) generating amplified copies of at least two different regions of the genome of one or more microorganisms.


B78. A method for amplifying genome regions of one or more microorganisms, comprising:


(a) obtaining nucleic acids of one or more microorganisms comprising a 16S rRNA gene,


(b) subjecting the nucleic acids to two separate nucleic acid amplification reactions using a first set of primer pairs for one nucleic acid amplification reaction and a second set of primer pairs for the other nucleic acid amplification reaction, wherein:


(i) the first set of primer pairs comprises one or more primer pairs that separately amplifies a nucleic acid sequence of a hypervariable region of a prokaryotic 16S rRNA gene, and


(ii) the second set of primer pairs comprises one or more primer pairs that amplify a target nucleic acid sequence contained within the genome of a microorganism that is not contained within a hypervariable region of a prokaryotic 16S rRNA gene, wherein different primer pairs amplify different target nucleic acid sequences contained within the genome of different microorganisms; and


(c) generating amplified copies of at least two different regions of the genome of one or more microorganisms.


B79. The method of embodiment B77 or embodiment B78, wherein the one or more primer pairs of (i) amplify a nucleic acid sequence in a plurality of microorganisms from different genera.


B80. The method of embodiment B79, wherein a mixture of nucleic acids of at least two different microorganisms is obtained and subjected to nucleic acid amplification and the genome of only one of the microorganisms contains a target sequence specifically amplified by a primer pair of (ii).


B81. The method of embodiment B80, wherein the generated amplified copies contain copies of a target nucleic acid sequence amplified by a primer pair of (ii) from the nucleic acid of the genome of one microorganism but do not contain copies of a target nucleic acid sequence amplified by a primer pair of (ii) from the nucleic acid of the genome of any other microorganism that was subjected to nucleic acid amplification.


B82. The method of embodiment B81, wherein the generated amplified copies contain copies of a nucleic acid sequence contained within a hypervariable region amplified by a primer pair of (i) from the nucleic acids of the genome of a plurality of microorganisms.


B83. The method of any of embodiments B77-B82, wherein the one or more microorganisms are bacteria.


B84. The method of any of embodiments B77-B83, wherein the prokaryotic 16S rRNA gene is a bacterial gene.


B85. The method of any of embodiments B58-B84, wherein one or both primers of one or more primer pairs contains a modification relative to a nucleic acid sequence amplified by the primer pair wherein the modification reduces the binding of the primer to other primers.


B86. The method of any of embodiments B58-B84, wherein one or both primers of one or more primer pairs contains a modification relative to a nucleic acid sequence amplified by the primer pair wherein the modification increases the susceptibility of the primer to cleavage.


B87. The method of any of embodiments B58-B84, wherein one or both primers of one or more primer pairs contains one or more or two or more uracil nucleobases.


B88. A method for characterizing a population of microorganisms in a sample, comprising:


(a) subjecting nucleic acids in or from the sample to nucleic acid amplification using a combination of primer pairs comprising:


(i) one or more primer pairs capable of amplifying a nucleic acid sequence of one or more hypervariable regions of a prokaryotic 16S rRNA gene and


(ii) one or more primer pairs capable of amplifying a target nucleic acid sequence contained within the genome of a microorganism that is not contained within a hypervariable region of a prokaryotic 16S rRNA gene, wherein different primer pairs amplify different target nucleic acid sequences contained within the genome of different microorganisms;


(b) obtaining sequence information from nucleic acid products amplified by the combination of primer pairs of (i) and (ii) and determining levels of nucleic acid products amplified by the one or more primer pairs of (i); and


(c) identifying genera of microorganisms in the sample and species of one or more of the microorganisms in the sample, thereby characterizing a population of microorganisms in the sample.


B89. A method for characterizing a population of microorganisms in a sample, comprising:


(a) subjecting nucleic acids in or from the sample to two separate nucleic acid amplification reactions using a first set of primer pairs for one nucleic acid amplification reaction and a second set of primer pairs for the other nucleic acid amplification reaction, wherein:


(i) the first set of primer pairs comprises one or more primer pairs capable of amplifying a nucleic acid sequence of one or more hypervariable regions of a prokaryotic 16S rRNA gene, and


(ii) the second set of primer pairs comprises one or more primer pairs capable of amplifying a target nucleic acid sequence contained within the genome of a microorganism that is not contained within a hypervariable region of a prokaryotic 16S rRNA gene, wherein different primer pairs amplify different target nucleic acid sequences contained within the genome of different microorganisms;


(b) obtaining sequence information from nucleic acid products amplified by primer pairs of (i) and (ii) and determining levels of nucleic acid products amplified by the one or more primer pairs of (i); and


(c) identifying genera of microorganisms in the sample and species of one or more of the microorganisms in the sample, thereby characterizing a population of microorganisms in the sample.


B90. The method of embodiment B88 or embodiment B89, wherein the one or more primer pairs of (ii) comprises a plurality of primer pairs that amplify target nucleic acid sequences contained in the genomes of a plurality of microorganisms that are not contained within a hypervariable region of a prokaryotic 16S rRNA gene.


B91. The method of embodiment B88, B89 or embodiment B90, wherein at least one of the one or more primer pairs of (ii) specifically amplifies the target nucleic acid sequence contained within the genome of the microorganism.


B92. The method of any of embodiments B88-B91, wherein the one or more primer pairs of (ii) amplify a target nucleic acid sequence contained within the genome of a microorganism selected from the microorganisms of embodiment A1.


B93. The method of any of embodiments B88-B90, wherein the one or more primer pairs of (i) comprises at least 2 or more primer pairs, each of which separately amplifies a nucleic acid sequence of a different hypervariable region of a prokaryotic 16S rRNA gene, at least 3 or more primer pairs, each of which separately amplifies a nucleic acid sequence of a different hypervariable region of a prokaryotic 16S rRNA gene, at least 4 or more primer pairs, each of which separately amplifies a nucleic acid sequence of a different hypervariable region of a prokaryotic 16S rRNA gene, at least 5 or more primer pairs, each of which separately amplifies a nucleic acid sequence of a different hypervariable region of a prokaryotic 16S rRNA gene, at least 6 or more primer pairs, each of which separately amplifies a nucleic acid sequence of a different hypervariable region of a prokaryotic 16S rRNA gene, at least 7 or more primer pairs, each of which separately amplifies a nucleic acid sequence of a different hypervariable region of a prokaryotic 16S rRNA gene, or at least 8 or more primer pairs, each of which separately amplifies a nucleic acid sequence of a different hypervariable region of a prokaryotic 16S rRNA gene.


B94. The method of any of embodiments B88-B92, wherein the one or more primer pairs of (i) comprises at least 3 or more primer pairs, each of which separately amplifies a nucleic acid sequence of a different hypervariable region of a prokaryotic 16S rRNA gene and wherein one of the 3 or more regions is a V5 region.


B95. The method of any of embodiments B88-B94, wherein the one or more primer pairs of (ii) does not amplify a nucleic acid sequence contained within any other genus of microorganism.


B96. The method of any of embodiments B88-B94, wherein at least one of the one or more primer pairs of (ii) does not amplify a nucleic acid sequence contained within any other species of microorganism.


B97. The method of any of embodiments B88-B94, wherein at least one of the one or more primer pairs of (ii) amplify a target nucleic acid sequence contained within the genome of a microorganism in a genus selected from among the genera listed in embodiment A1.


B98. The method of embodiment B97, wherein the at least one primer pair specifically amplifies a target nucleic acid sequence contained only within the genome of a microorganism in a genus selected from among the genera listed in embodiment A1.


B99. The method of embodiment B97, wherein the at least one primer pair specifically amplifies a target nucleic acid sequence contained only within the genome of a microorganism selected from among the microorganisms listed in embodiment A1.


B100. The method of any of embodiments B88-B99, wherein at least one primer of the one or more primer pairs, or at least one of the one or more primer pairs, of (ii) comprises, or consists essentially of, the sequence or sequences of a primer or primer pair in Table 16, or SEQ ID NOS: 49-520 of Table 16, or SEQ ID NOS: 49-452, 457-472 and 481-520 of Table 16, or SEQ ID NOS: 49-492 of Table 16, or SEQ ID NOS: 49-452, 457-472 and 481-492 of Table 16, or SEQ ID NOS: 49-480 of Table 16A, or SEQ ID NOS: 49-452 and 457-472 of Table 16A, or SEQ ID NOS: 521-826 of Table 16C, or SEQ ID NOS: 521-820 of Table 16C, or SEQ ID NOS: 827-1298 of Table 16, or SEQ ID NOS: 827-1230, 1235-1250 and 1259-1298 of Table 16, or SEQ ID NOS: 827-1270 of Table 16, or SEQ ID NOS: 827-1230, 1235-1250 and 1259-1270 of Table 16, or SEQ ID NOS: 827-1258 of Table 16D, or SEQ ID NOS: 827-1230 and 1235-1250 of Table 16D, or SEQ ID NOS: 1299-1604 of Table 16F, or SEQ ID NOS: 1299-1598 of Table 16F, or substantially identical or similar sequences, or any of the aforementioned nucleotide sequences in which one or more thymine bases is substituted with a uracil base.


B101. The method of any of embodiments B88-B99, wherein at least one primer of the one or more primer pairs, or at least one of the one or more primer pairs, of (ii) comprises, or consists essentially of, the sequence or sequences of a primer or primer pair in SEQ ID NOS: 827-1298 of Table 16, or SEQ ID NOS: 827-1230, 1235-1250 and 1259-1298 of Table 16, or SEQ ID NOS: 827-1270 of Table 16, or SEQ ID NOS: 827-1230, 1235-1250 and 1259-1270 of Table 16, or SEQ ID NOS: 827-1258 of Table 16D, or SEQ ID NOS: 827-1230 and 1235-1250 of Table 16D, or SEQ ID NOS: 1299-1604 of Table 16F, or SEQ ID NOS: 1299-1598 of Table 16F, or substantially identical or similar sequences in which one or more thymine bases is substituted with a uracil base.


B102. The method of any of embodiments B88-B101, wherein the target nucleic acid sequence contained within the genome of the microorganism comprises, or consists essentially of, a nucleotide sequence selected from Table 17, or SEQ ID NOS: 1605-1979 in Table 17, or SEQ ID NOS: 1605-1806, 1809-1816, 1821-1970, 1972-1974 and 1977-1979 of Table 17, or SEQ ID NOS: 1605-1826 in Table 17, or SEQ ID NOS: 1605-1806, 1809-1816 and 1821-1826 in Table 17, or SEQ ID NOS: 1605-1820 in Table 17A, or SEQ ID NOS: 1605-1806 and 1809-1816 in Table 17A, or SEQ ID NOS: 1827-1979 in Table 17C, or SEQ ID NOS: 1827-1976 in Table 17C, a substantially identical or similar sequence, or the complement thereof.


B103. The method of any of embodiments B88-B102, wherein a product of the nucleic acid amplification comprises a nucleotide sequence selected from among Table 17, or SEQ ID NOS: 1605-1979 in Table 17, or SEQ ID NOS: 1605-1806, 1809-1816, 1821-1970, 1972-1974 and 1977-1979 of Table 17, or SEQ ID NOS: 1605-1826 in Table 17, or SEQ ID NOS: 1605-1806, 1809-1816 and 1821-1826 in Table 17, or SEQ ID NOS: 1605-1820 in Table 17A, or SEQ ID NOS: 1605-1806 and 1809-1816 in Table 17A, or SEQ ID NOS: 1827-1979 in Table 17C, or SEQ ID NOS: 1827-1976 in Table 17C, a substantially identical or similar sequence, or the complement thereof, and optionally having one or more primer sequences at the 5′ and/or 3′ end(s) of the sequence.


B104. The method of any of embodiments B88-B103, wherein the one or more primer pairs of (i) comprise primers or primer pairs comprising, or consisting essentially of, a sequence or sequences in SEQ ID NOS: 1-24 of Table 15 and/or SEQ ID NOS: 25-48 of Table 15, or substantially identical or similar sequences.


B105. The method of any of embodiments B88-B103, wherein the one or more primer pairs of (i) comprise primers or primer pairs comprising, or consisting essentially of, a sequence or sequences in SEQ ID NOS: 25-48 of Table 15, or substantially identical or similar sequences, in which one or more thymine bases is substituted with a uracil base.


B106. The method of any of embodiments B88-B105, wherein each primer in the primer pairs of (i) contains less than 10, less than 9, less than 8, less than 7, less than 6, less than 5, less than 4, less than 3, or less than 2 contiguous nucleotides of sequence identical to a sequence of contiguous nucleotides of another primer in the primer pairs.


B107. The method of any of embodiments B88-B104, wherein the one or more primer pairs of (i) amplify a nucleic acid sequence in a plurality of microorganisms from different genera.


B108. The method of any of embodiments B88-B107, wherein the primers of the one or more primer pairs of (i) selectively hybridize to nucleic acid sequences contained in conserved regions of a prokaryotic 16S rRNA gene.


B109. The method of any of embodiments B88-B108, wherein for at least one of the hypervariable regions amplified by the primer pairs of (i), at least two different primer pairs in the primer pairs of (i) separately amplify nucleic acid sequence within the same hypervariable region for 2 or more species of a prokaryotic genus having differences in nucleic acid sequences at the same hypervariable region.


B110. The method of any of embodiments B88-B108, wherein for at least one of the hypervariable regions amplified by the primer pairs of (i), at least two different primer pairs in the primer pairs of (i) separately amplify nucleic acid sequence within the same hypervariable region for 2 or more strains of a prokaryotic species having differences in nucleic acid sequences at the same hypervariable region.


B111. The method of embodiment B109 or embodiment B110, wherein the at least one hypervariable region is the V2 region and/or the V8 region.


B112. The method of any of embodiments B88-B111, wherein obtaining sequence information from nucleic acid products amplified by the combination of primer pairs comprises subjecting the amplified nucleic acid products to nucleic acid sequencing and obtaining sequence reads and wherein determining levels of amplified nucleic acid products comprises counting the sequence reads.


B113. The method of embodiment B112, wherein counting the sequence reads comprises determining a total number of sequence reads mapping to a sequence in the genome of a microorganism containing sequence amplified by the combination of primer pairs and normalizing the total number of mapped sequence reads by dividing the total number of mapped sequence reads by the number of amplicon sequences that would be expected to be amplified in the genome of the microorganism by the combination of primer pairs to obtain a normalized number of sequence reads, and optionally dividing the normalized number of sequence reads for a microorganism mapping to a sequence in the genome of a microorganism by the total number of normalized reads obtained for sequencing of all nucleic acids in a sample to obtain a relative fractional abundance.
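
The read-count normalization of embodiment B113 can be written out as a short calculation. The following is a minimal sketch, assuming per-microorganism totals of mapped reads and expected amplicon counts are already available as dictionaries, and assuming the relative fractional abundance is taken over the sum of the normalized values of all microorganisms supplied. The function name and the numbers in the usage note are hypothetical.

```python
def relative_abundance(mapped_reads: dict, expected_amplicons: dict):
    """Return (normalized reads, relative fractional abundance) per microorganism."""
    # Normalize: total mapped reads divided by the number of expected amplicons.
    normalized = {
        org: mapped_reads[org] / expected_amplicons[org]
        for org in mapped_reads
        if expected_amplicons.get(org, 0) > 0
    }
    total = sum(normalized.values())
    if total == 0:
        return normalized, {org: 0.0 for org in normalized}
    # Fraction of the summed normalized reads attributable to each microorganism.
    fractions = {org: value / total for org, value in normalized.items()}
    return normalized, fractions
```

For example, with 800 reads mapped to a microorganism expected to yield 8 amplicons and 300 reads mapped to one expected to yield 6, the normalized values are 100 and 50, giving relative fractional abundances of about 0.67 and 0.33.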


B114. The method of embodiment B112, wherein the nucleic acid products amplified by the combination of primer pairs of (i) contain a first common barcode sequence and the nucleic acid products amplified by the combination of primer pairs of (ii) contain a second common barcode sequence that is different from the first common barcode sequence.


B115. The method of embodiment B112, wherein identifying genera of microorganisms in the sample comprises (1) aligning sequence reads of nucleic acid products amplified by the combination of primer pairs of (i) to a collection of full-length nucleotide sequences of reference prokaryotic 16S rRNA genes of a filtered group of microorganisms that are selected from and less than the total number of full-length nucleotide sequences in a prokaryotic 16S rRNA gene reference database that has not been filtered, (2) assigning sequence reads to genera of microorganisms based on alignments of the reads to full-length nucleotide sequences of prokaryotic 16S rRNA genes of the filtered group of microorganisms; and (3) identifying with at least 90% sensitivity, or at least 91% sensitivity, or at least 92% sensitivity, or at least 93% sensitivity, or at least 94% sensitivity, or at least 95% sensitivity, or 100% sensitivity, the genera of the microorganisms in the sample based on the assigning of sequence reads to genera of microorganisms.


B116. The method of embodiment B115, wherein the collection of nucleotide sequences of prokaryotic 16S rRNA genes of the filtered group of microorganisms is obtained by:

    • (A) predetermining the sequences of hypervariable region sequence-containing amplicons expected to be generated by nucleic acid amplification of the sequences in a prokaryotic 16S rRNA gene reference database using the one or more primer pairs of (i) and identifying microorganisms containing one or more of the hypervariable region sequence-containing amplicons expected to be produced by each of the separate primer pairs,
    • (B) generating a signature pattern of hypervariable region sequence-containing amplicons expected for each microorganism containing the sequences of expected hypervariable region amplicons wherein the signature pattern is based on which of each of the primer pairs of (i) would be expected to amplify a sequence in the microorganism and which of the primer pairs of (i) would not be expected to amplify a sequence in the microorganism,
    • (C) aligning the sequence reads of nucleic acid products amplified by the combination of primer pairs of (i) with the sequences of expected hypervariable region sequence-containing amplicons and separating and assigning the sequence reads according to the different expected hypervariable region sequence-containing amplicons produced by the separate primer pairs based on the alignments,
    • (D) determining the number of sequence reads that align with each of the expected hypervariable region sequence-containing amplicons for each microorganism and selecting a first group of microorganisms for which a minimum threshold number of sequence reads align,
    • (E) determining, for each microorganism in the first group of microorganisms, an observed pattern of actual hypervariable region amplicons for which sequence reads were obtained and comparing the observed pattern of actual hypervariable region amplicons to the signature pattern of hypervariable region amplicons expected for the microorganism; and
    • (F) selecting for inclusion in the collection of nucleotide sequences of prokaryotic 16S rRNA genes of the filtered group of microorganisms only the sequences of those microorganisms having a signature pattern of hypervariable region amplicons for which there is an observed pattern of actual amplicons that meets a minimum similarity threshold.
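
Steps (D) through (F) of embodiment B116 can be sketched in code. The following is a minimal illustration under stated assumptions: the signature pattern of each microorganism is represented as the set of primer-pair identifiers expected to amplify it, the read threshold of step (D) is applied to the total reads across a microorganism's expected amplicons, and similarity is scored as the fraction of expected amplicons actually observed. The embodiment does not fix a particular similarity measure or threshold values, so the function name, the defaults, and the scoring rule are all hypothetical.

```python
def filter_references(expected: dict, read_counts: dict,
                      min_reads: int = 10, min_similarity: float = 0.75) -> list:
    """Select microorganisms whose observed amplicon pattern matches the expected one.

    expected:    {microorganism: set of primer-pair ids expected to amplify it}
    read_counts: {microorganism: {primer-pair id: number of aligned reads}}
    """
    kept = []
    for org, signature in expected.items():
        counts = read_counts.get(org, {})
        # Step (D): require a minimum number of aligned reads (assumed: total).
        if not signature or sum(counts.values()) < min_reads:
            continue
        # Step (E): observed pattern = amplicons that received at least one read.
        observed = {pair for pair, n in counts.items() if n > 0}
        similarity = len(observed & signature) / len(signature)
        # Step (F): keep only microorganisms meeting the similarity threshold.
        if similarity >= min_similarity:
            kept.append(org)
    return kept
```

Under these assumptions, a microorganism expected to yield four amplicons for which three received reads scores 0.75 and is retained, while one expected to yield two amplicons with reads for only one scores 0.5 and is excluded.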


      B117. The method of embodiment B116, further comprising, after aligning the sequence reads of nucleic acid products amplified by the combination of primer pairs of (i) to the collection of full-length nucleotide sequences of prokaryotic 16S rRNA genes of a filtered group of microorganisms, determining the number of sequence reads aligning to each reference prokaryotic 16S rRNA gene sequence and normalizing each sequence read number by dividing it by the number of expected hypervariable region amplicons for the microorganism.


      B118. The method of any of embodiments B88-B117, wherein identifying species of microorganisms in the sample comprises aligning sequence reads from the nucleic acid products amplified by the primer pairs of (ii) to the nucleotide sequences of a plurality of microorganism reference genomes and identifying the species of the reference genomes to which the sequence reads most closely align thereby identifying species of microorganisms in the sample.


      B119. The method of embodiment B118, wherein species of microorganisms in the sample are identified with at least 95% sensitivity, or at least 96% sensitivity, or at least 97% sensitivity, or at least 98% sensitivity, or at least 99% sensitivity, or 100% sensitivity.


      B120. The method of embodiment B118 or embodiment B119, wherein the plurality of microorganism reference genomes comprises reference genomes selected by identifying microorganism genomes that contain sequence amplifiable using primer pairs of (ii) and that would be expected to contain sequence amplifiable using primer pairs of (ii).


      B121. The method of any of embodiments B118-B120, further comprising identifying the sequence reads of products amplified by the primer pairs of (ii) that align with only one reference genome or to multiple reference genomes wherein the multiple reference genomes are genomes of the same species of microorganism.


      B122. The method of embodiment B121, further comprising:


      (A) determining the total number of sequence reads that align with only one reference genome or to multiple reference genomes wherein the multiple reference genomes are genomes of the same species of microorganism,


      (B) selecting species for which the number of aligning sequence reads is equal to or greater than a threshold value, and for those species, normalizing the number of aligning sequence reads by dividing the total number of aligning sequence reads for the species by the number of amplicons within the species genome to which sequence reads aligned; and


      (C) selecting only species for which the normalized number of aligning sequence reads is greater than a minimum threshold percentage of the sum of the normalized number of aligning sequence reads for all species.
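
Steps (A) through (C) of embodiment B122 amount to a two-stage filter. The following is a minimal sketch, assuming the counts of reads aligning uniquely to each species and the number of amplicons hit per species have already been tallied, and assuming the sum in step (C) runs over the species that passed step (B). The function name and the default thresholds are hypothetical; the embodiment does not specify threshold values.

```python
def call_species(unique_reads: dict, amplicons_hit: dict,
                 min_reads: int = 100, min_fraction: float = 0.01) -> dict:
    """Return {species: normalized read count} for species passing both thresholds."""
    # Step (B): keep species with enough uniquely aligning reads, then
    # normalize by the number of amplicons to which reads aligned.
    normalized = {
        sp: unique_reads[sp] / amplicons_hit[sp]
        for sp in unique_reads
        if unique_reads[sp] >= min_reads and amplicons_hit.get(sp, 0) > 0
    }
    total = sum(normalized.values())
    # Step (C): keep species above a minimum share of the summed normalized reads.
    return {
        sp: value
        for sp, value in normalized.items()
        if total > 0 and value > min_fraction * total
    }
```

For example, with `min_fraction=0.05`, a species with 9000 reads over 3 amplicons (normalized 3000) is retained, a species with 120 reads over 4 amplicons (normalized 30) is dropped at step (C), and a species with only 50 reads is dropped at step (B).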


      B123. The method of any of embodiments B88-B122, wherein the population of microorganisms is a population of bacteria.


      B124. The method of any of embodiments B88-B122, wherein the prokaryotic 16S rRNA gene is a bacterial gene.


      B125. A method of detecting an imbalance of microorganisms in a subject comprising:


      (a) subjecting nucleic acids in or from a sample from the subject to nucleic acid amplification using a combination of primer pairs comprising:


      (i) one or more primer pairs capable of amplifying a nucleic acid sequence of one or more hypervariable regions of a prokaryotic 16S rRNA gene and


      (ii) one or more primer pairs capable of amplifying a target nucleic acid sequence contained within the genome of a microorganism that is not contained within a hypervariable region of a prokaryotic 16S rRNA gene, wherein different primer pairs amplify different target nucleic acid sequences contained within the genome of different microorganisms;


      (b) obtaining sequence information from nucleic acid products amplified by the combination of primer pairs of (i) and (ii) and, optionally determining the levels of nucleic acid products amplified by the one or more primer pairs of (i);


      (c) determining the microorganism composition of the sample by identifying genera of microorganisms in the sample, and optionally the relative levels thereof, and species of one or more of the microorganisms in the sample;


      (d) comparing the microorganism composition of the sample to a reference microorganism composition; and


      (e) detecting an imbalance of microorganisms in the subject if the level of one or more microorganisms in the sample differs from the level of the microorganism(s) in the reference microorganism composition, one or more microorganisms in the reference composition is not present in the sample and/or one or more microorganisms present in the sample is not present in the reference microorganism composition.
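
The comparison in steps (d) and (e) of embodiment B125 can be sketched as a set and level comparison. The following is a minimal illustration, assuming both compositions are given as dictionaries mapping a microorganism name to a relative level, and assuming a level "differs" when the absolute difference exceeds a user-chosen tolerance. The embodiment does not define such a tolerance, so the function name, the tolerance parameter, and its default are hypothetical.

```python
def detect_imbalance(sample: dict, reference: dict, tolerance: float = 0.1) -> dict:
    """Compare a sample composition with a reference composition."""
    # Microorganisms in the reference that are absent from the sample.
    missing = sorted(set(reference) - set(sample))
    # Microorganisms in the sample that are absent from the reference.
    unexpected = sorted(set(sample) - set(reference))
    # Shared microorganisms whose level differs by more than the tolerance.
    shifted = sorted(
        org for org in set(sample) & set(reference)
        if abs(sample[org] - reference[org]) > tolerance
    )
    return {
        "missing": missing,
        "unexpected": unexpected,
        "shifted": shifted,
        "imbalance": bool(missing or unexpected or shifted),
    }
```

Any one of the three conditions is sufficient to report an imbalance, mirroring the "and/or" wording of step (e).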


      B126. A method of detecting an imbalance of microorganisms in a subject comprising:


      (a) subjecting nucleic acids in or from a sample from the subject to two separate nucleic acid amplification reactions using a first set of primer pairs for one nucleic acid amplification reaction and a second set of primer pairs for the other nucleic acid amplification reaction, wherein:


      (i) the first set of primer pairs comprises one or more primer pairs capable of amplifying a nucleic acid sequence of one or more hypervariable regions of a prokaryotic 16S rRNA gene, and


      (ii) the second set of primer pairs comprises one or more primer pairs capable of amplifying a target nucleic acid sequence contained within the genome of a microorganism that is not contained within a hypervariable region of a prokaryotic 16S rRNA gene, wherein different primer pairs amplify different target nucleic acid sequences contained within the genome of different microorganisms;


      (b) obtaining sequence information from nucleic acid products amplified by primer pairs of (i) and (ii) and, optionally determining the levels of nucleic acid products amplified by the one or more primer pairs of (i);


      (c) determining the microorganism composition of the sample by identifying genera of microorganisms in the sample, and optionally the relative levels thereof, and species of one or more of the microorganisms in the sample;


      (d) comparing the microorganism composition of the sample to a reference microorganism composition; and


      (e) detecting an imbalance of microorganisms in the subject if the level of one or more microorganisms in the sample differs from the level of the microorganism(s) in the reference microorganism composition, one or more microorganisms in the reference composition is not present in the sample and/or one or more microorganisms present in the sample is not present in the reference microorganism composition.


      B127. A method of treating a subject having an imbalance of microorganisms comprising:


      (a) subjecting nucleic acids in or from a sample from the subject to nucleic acid amplification using a combination of primer pairs comprising:


      (i) one or more primer pairs capable of amplifying a nucleic acid sequence of one or more hypervariable regions of a prokaryotic 16S rRNA gene and


      (ii) one or more primer pairs capable of amplifying a target nucleic acid sequence contained within the genome of a microorganism that is not contained within a hypervariable region of a prokaryotic 16S rRNA gene, wherein different primer pairs amplify different target nucleic acid sequences contained within the genome of different microorganisms;


      (b) obtaining sequence information from nucleic acid products amplified by the combination of primer pairs of (i) and (ii) and, optionally determining the levels of nucleic acid products amplified by the one or more primer pairs of (i);


      (c) determining the microorganism composition of the sample by identifying genera of microorganisms in the sample, and optionally the relative levels thereof, and species of one or more of the microorganisms in the sample;


      (d) detecting an imbalance of microorganisms in the subject; and


      (e) treating the subject to establish a balance of microorganisms in the subject.


      B128. A method of treating a subject having an imbalance of microorganisms comprising:


      (a) subjecting nucleic acids in or from a sample from the subject to two separate nucleic acid amplification reactions using a first set of primer pairs for one nucleic acid amplification reaction and a second set of primer pairs for the other nucleic acid amplification reaction, wherein:


      (i) the first set of primer pairs comprises one or more primer pairs capable of amplifying a nucleic acid sequence of one or more hypervariable regions of a prokaryotic 16S rRNA gene, and


      (ii) the second set of primer pairs comprises one or more primer pairs capable of amplifying a target nucleic acid sequence contained within the genome of a microorganism that is not contained within a hypervariable region of a prokaryotic 16S rRNA gene, wherein different primer pairs amplify different target nucleic acid sequences contained within the genome of different microorganisms;


      (b) obtaining sequence information from nucleic acid products amplified by primer pairs of (i) and (ii) and, optionally determining the levels of nucleic acid products amplified by the one or more primer pairs of (i);


      (c) determining the microorganism composition of the sample by identifying genera of microorganisms in the sample, and optionally the relative levels thereof, and species of one or more of the microorganisms in the sample;


      (d) detecting an imbalance of microorganisms in the subject; and


      (e) treating the subject to establish a balance of microorganisms in the subject.


      B129. A method for treating a subject with an immunotherapy, comprising:


      (a) subjecting nucleic acids in or from a sample from the subject to nucleic acid amplification using a combination of primer pairs comprising:


      (i) one or more primer pairs capable of amplifying a nucleic acid sequence of one or more hypervariable regions of a prokaryotic 16S rRNA gene and


      (ii) one or more primer pairs capable of amplifying a target nucleic acid sequence contained within the genome of a microorganism that is not contained within a hypervariable region of a prokaryotic 16S rRNA gene, wherein the microorganism is one that is positively or negatively associated with response to immune checkpoint inhibition-based immunotherapy;


      (b) obtaining sequence information from nucleic acid products amplified by the combination of primer pairs of (i) and (ii) and, optionally determining the levels of nucleic acid products amplified by the one or more primer pairs of (i);


      (c) identifying genera of microorganisms in the sample and species of one or more of the microorganisms in the sample; and


      (d) treating the subject with:


      (1) an immune checkpoint inhibition-based immunotherapy if the sample includes one or more microorganisms positively associated with response to immune checkpoint inhibition-based immunotherapy and/or excludes or has sufficiently low levels of one or more microorganisms negatively associated with response to immune checkpoint inhibition-based immunotherapy or


      (2) a composition that increases levels of one or more microorganisms positively associated with response to immune checkpoint inhibition-based immunotherapy if the sample lacks one or more microorganisms or sufficient levels thereof that is positively associated with response to immune checkpoint inhibition-based immunotherapy, and/or a composition that eliminates or reduces levels of one or more microorganisms negatively associated with response to immune checkpoint inhibition-based immunotherapy if the sample contains one or more microorganisms or prohibitively high levels thereof that is negatively associated with response to immune checkpoint inhibition-based immunotherapy; and treating the subject with an immune checkpoint inhibition-based immunotherapy.


      B130. A method for treating a subject with an immunotherapy, comprising:


      (a) subjecting nucleic acids in or from a sample from the subject to two separate nucleic acid amplification reactions using a first set of primer pairs for one nucleic acid amplification reaction and a second set of primer pairs for the other nucleic acid amplification reaction, wherein:


      (i) the first set of primer pairs comprises one or more primer pairs capable of amplifying a nucleic acid sequence of one or more hypervariable regions of a prokaryotic 16S rRNA gene, and


      (ii) the second set of primer pairs comprises one or more primer pairs that amplify a target nucleic acid sequence contained within the genome of a microorganism that is not contained within a hypervariable region of a prokaryotic 16S rRNA gene, wherein the microorganism is one that is positively or negatively associated with response to immune checkpoint inhibition-based immunotherapy;


      (b) obtaining sequence information from nucleic acid products amplified by primer pairs of (i) and (ii) and, optionally determining the levels of nucleic acid products amplified by the one or more primer pairs of (i);


      (c) identifying genera of microorganisms in the sample and species of one or more of the microorganisms in the sample; and


      (d) treating the subject with:


      (1) an immune checkpoint inhibition-based immunotherapy if the sample includes one or more microorganisms positively associated with response to immune checkpoint inhibition-based immunotherapy and/or excludes or has sufficiently low levels of one or more microorganisms negatively associated with response to immune checkpoint inhibition-based immunotherapy or


      (2) a composition that increases levels of one or more microorganisms positively associated with response to immune checkpoint inhibition-based immunotherapy if the sample lacks one or more microorganisms or sufficient levels thereof that is positively associated with response to immune checkpoint inhibition-based immunotherapy, and/or a composition that eliminates or reduces levels of one or more microorganisms negatively associated with response to immune checkpoint inhibition-based immunotherapy if the sample contains one or more microorganisms or prohibitively high levels thereof that is negatively associated with response to immune checkpoint inhibition-based immunotherapy; and treating the subject with an immune checkpoint inhibition-based immunotherapy.


      C1. A kit comprising any of the compositions of embodiments A1-A45.


      C2. The kit of embodiment C1, further comprising one or more polymerases.


      C3. The kit of embodiment C1 or embodiment C2, further comprising one or more oligonucleotide adapters.


      C4. The kit of any of embodiments C1-C3, further comprising one or more ligases.


      D1. A method, comprising:


      (a) receiving a plurality of nucleic acid sequence reads, wherein the sequence reads include a plurality of 16S sequence reads;


      (b) first mapping the plurality of 16S sequence reads to a plurality of compressed 16S reference sequences, wherein each compressed 16S reference sequence includes a set of hypervariable segments for a corresponding strain of a species;


      (c) generating a read count matrix containing read counts of 16S sequence reads mapped to each hypervariable segment in the set of hypervariable segments, wherein rows of the read count matrix correspond to strains of species and columns correspond to the hypervariable segments;


      (d) reducing the read count matrix by applying thresholding to the read counts to form a reduced read count matrix;


      (e) compressing a database of full-length 16S reference sequences to form a reduced set of full-length 16S reference sequences based on the reduced read count matrix, the reduced set of full-length 16S reference sequences stored in a memory;


      (f) second mapping the plurality of 16S sequence reads to the reduced set of full-length 16S reference sequences;


      (g) counting the 16S sequence reads that mapped to each full-length reference in the reduced set of full-length 16S reference sequences to form a second set of read counts;


      (h) normalizing the read counts in the second set of read counts to form normalized counts;


      (i) aggregating the normalized counts for a given level to form aggregated counts, wherein the given level is a species level, a genus level or a family level; and


      (j) applying a threshold to the aggregated counts to detect a presence of a microbe at the given level in a sample.
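As an illustration of step (c) of embodiment D1, the read count matrix can be sketched as follows. This is a minimal sketch, not the disclosed implementation: the function name `build_read_count_matrix`, the strain labels, the segment labels, and the representation of first-mapping results as (strain, segment) pairs are all assumptions made for the example.

```python
def build_read_count_matrix(mapped_reads, strains, segments):
    """Count reads per (strain, hypervariable segment).

    mapped_reads: iterable of (strain, segment) pairs, one per mapped read
                  (a hypothetical representation of the first-mapping output).
    Returns a dict: strain -> list of counts, one per segment (matrix rows).
    """
    column = {segment: j for j, segment in enumerate(segments)}
    matrix = {strain: [0] * len(segments) for strain in strains}
    for strain, segment in mapped_reads:
        matrix[strain][column[segment]] += 1
    return matrix


# Hypothetical input: four reads mapped to two strains over three segments.
mapped = [("s1", "V3"), ("s1", "V3"), ("s1", "V4"), ("s2", "V4")]
matrix = build_read_count_matrix(mapped, ["s1", "s2"], ["V2", "V3", "V4"])
```

Rows correspond to strains and columns to hypervariable segments, matching the layout recited in step (c).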


      D2. The method of embodiment D1, wherein the reducing the read count matrix further comprises eliminating rows of the read count matrix when a sum of read counts within the row is less than a row sum threshold to form a first reduced read count matrix.
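The row elimination of embodiment D2 can be sketched as below. The function name `drop_low_rows`, the example counts, and the threshold value of 10 are illustrative assumptions; no particular threshold value is implied.

```python
def drop_low_rows(matrix, row_sum_threshold):
    """Eliminate rows whose summed read count is less than the threshold."""
    return {
        strain: counts
        for strain, counts in matrix.items()
        if sum(counts) >= row_sum_threshold
    }


# Hypothetical read count matrix: strain -> counts per hypervariable segment.
counts = {
    "strain_A": [120, 95, 0, 88],
    "strain_B": [2, 0, 1, 0],
    "strain_C": [40, 0, 35, 0],
}
first_reduced = drop_low_rows(counts, row_sum_threshold=10)
```

Here strain_B (row sum 3) is eliminated and the remaining rows form the first reduced read count matrix.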


      D3. The method of embodiment D2, wherein the reducing the read count matrix further comprises:


      (a) adding the read counts of the rows of the first reduced read count matrix that correspond to identical expected signatures for a corresponding species to form column sums; and


      (b) adding the column sums to form a combined sum, wherein an expected signature comprises binary values corresponding to the hypervariable segments in the set of hypervariable segments expected to be present (=1) or absent (=0) in the strain.


      D4. The method of embodiment D3, wherein the reducing the read count matrix further comprises eliminating the rows of the first reduced read count matrix when the combined sum is less than a combined sum threshold to form a second reduced read count matrix.
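Embodiments D3 and D4 can be sketched together as follows. This is a sketch under stated assumptions: the row records (dicts with `strain`, `species`, `expected`, and `counts` keys), the function names, and the threshold value are hypothetical and chosen only for illustration.

```python
def group_column_sums(rows):
    """Add read counts column by column over rows that share the same
    species and the same expected signature."""
    sums = {}
    for row in rows:
        key = (row["species"], row["expected"])
        previous = sums.get(key, [0] * len(row["counts"]))
        sums[key] = [p + c for p, c in zip(previous, row["counts"])]
    return sums


def second_reduction(rows, combined_sum_threshold):
    """Eliminate rows whose group's combined sum is less than the threshold."""
    sums = group_column_sums(rows)
    return [
        row for row in rows
        if sum(sums[(row["species"], row["expected"])]) >= combined_sum_threshold
    ]


# Hypothetical first reduced read count matrix with expected signatures.
rows = [
    {"strain": "s1", "species": "sp1", "expected": (1, 1, 0), "counts": [10, 5, 0]},
    {"strain": "s2", "species": "sp1", "expected": (1, 1, 0), "counts": [20, 5, 1]},
    {"strain": "s3", "species": "sp2", "expected": (1, 0, 1), "counts": [3, 0, 2]},
]
column_sums = group_column_sums(rows)
second_reduced = second_reduction(rows, combined_sum_threshold=20)
```

In this example the two sp1 strains share one expected signature, so their counts are pooled into column sums [30, 10, 1] with a combined sum of 41, while the sp2 row has a combined sum of 5 and is eliminated.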


      D5. The method of embodiment D3, wherein the reducing the read count matrix further comprises applying a signature threshold to the column sums to assign binary values to form an observed signature for each row of the second reduced read count matrix, the observed signature and expected signature each having a total number of categories.


      D6. The method of embodiment D5, wherein the compressing further comprises determining a ratio of the categories that have matching binary values in the observed signature and the expected signature to the total number of categories.


      D7. The method of embodiment D6, wherein the compressing further comprises selecting a corresponding full-length 16S reference sequence from the database of full-length 16S reference sequences stored in memory for a first reduced set of full-length 16S reference sequences when the ratio is greater than a ratio threshold.
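The signature comparison of embodiments D5 through D7 can be sketched as below. Two points are assumptions made for the example rather than statements of the embodiments: a column sum at or above the signature threshold is assigned the value 1, and the candidate records and threshold values are hypothetical.

```python
def observed_signature(column_sums, signature_threshold):
    # Assumption: a column sum at or above the threshold counts as present (1).
    return tuple(1 if s >= signature_threshold else 0 for s in column_sums)


def match_ratio(observed, expected):
    """Ratio of categories with matching binary values to all categories."""
    matches = sum(1 for o, e in zip(observed, expected) if o == e)
    return matches / len(expected)


def select_references(candidates, signature_threshold, ratio_threshold):
    """Select reference IDs whose match ratio is greater than the ratio threshold."""
    selected = []
    for ref_id, (column_sums, expected) in candidates.items():
        observed = observed_signature(column_sums, signature_threshold)
        if match_ratio(observed, expected) > ratio_threshold:
            selected.append(ref_id)
    return selected


# Hypothetical candidates: reference ID -> (column sums, expected signature).
candidates = {
    "ref_1": ([30, 10, 1], (1, 1, 0)),
    "ref_2": ([3, 0, 2], (1, 0, 1)),
}
first_reduced_set = select_references(
    candidates, signature_threshold=5, ratio_threshold=0.8
)
```

For ref_1 the observed signature (1, 1, 0) matches the expected signature in all three categories, so it is selected; for ref_2 only one of three categories matches, so it is not.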


      D8. The method of embodiment D7, wherein the second mapping step uses the first reduced set of full-length 16S reference sequences as the reduced set of full-length 16S reference sequences.


      D9. The method of embodiment D7, wherein the compressing further comprises reassigning unannotated strains to annotated strains in the first reduced set of full-length 16S reference sequences based on a sequence similarity metric to form a second reduced set of full-length 16S reference sequences.


      D10. The method of embodiment D9, wherein the second mapping step uses the second reduced set of full-length 16S reference sequences as the reduced set of full-length 16S reference sequences.
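One possible reading of the reassignment in embodiment D9 is sketched below. The embodiments do not name a particular sequence similarity metric; this sketch swaps in the ratio computed by `difflib.SequenceMatcher` from the Python standard library purely as a stand-in. The function name, the minimum similarity cutoff, and the short example sequences are likewise assumptions.

```python
from difflib import SequenceMatcher


def reassign_unannotated(unannotated, annotated, min_similarity):
    """Map each unannotated reference to its most similar annotated reference.

    unannotated, annotated: dicts of reference ID -> sequence string.
    References with no annotated match at or above min_similarity are
    left out of the returned mapping.
    """
    mapping = {}
    for u_id, u_seq in unannotated.items():
        best_id, best_score = None, 0.0
        for a_id, a_seq in annotated.items():
            score = SequenceMatcher(None, u_seq, a_seq).ratio()
            if score > best_score:
                best_id, best_score = a_id, score
        if best_id is not None and best_score >= min_similarity:
            mapping[u_id] = best_id
    return mapping


# Hypothetical, very short sequences used only to show the mechanics.
annotated = {"ref1": "ACGTACGT", "ref2": "TTTTGGGG"}
unannotated = {"u1": "ACGTACGA"}
reassigned = reassign_unannotated(unannotated, annotated, min_similarity=0.8)
```

In practice an alignment-based identity score would likely be preferred over a generic string ratio; the stand-in is used here only to keep the sketch self-contained.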


      D11. The method of embodiment D3, wherein the normalizing step further comprises dividing the read count in the second set of read counts by a number of 1's in the expected signature to form the normalized read count.


      D12. The method of embodiment D11, wherein the normalizing step further comprises dividing the normalized count by an average copy number of a corresponding 16S gene.
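The two-stage normalization of embodiments D11 and D12 can be sketched as follows. The function name and the example values (600 reads, a four-category signature, an average copy number of 4) are illustrative assumptions.

```python
def normalize_count(read_count, expected_signature, avg_copy_number):
    """Divide by the number of 1's in the expected signature (D11),
    then by the average 16S gene copy number (D12)."""
    n_expected_segments = sum(expected_signature)
    per_segment = read_count / n_expected_segments
    return per_segment / avg_copy_number


# Hypothetical reference: 600 mapped reads, 3 expected segments, copy number 4.
normalized = normalize_count(600, (1, 1, 0, 1), avg_copy_number=4.0)
```

Dividing by the number of expected segments corrects for references that can attract reads from more amplicons, and dividing by copy number corrects for organisms that carry more copies of the 16S gene.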


      D13. The method of embodiment D1, wherein the step of applying a threshold further comprises applying the threshold to a ratio of the aggregated counts to a total number of mapped 16S sequence reads.
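Aggregation at a taxonomic level followed by the ratio threshold of embodiment D13 can be sketched as below. The taxonomy lookup table, the function name, the example counts, and the use of "greater than or equal to" for the comparison are assumptions made for illustration.

```python
from collections import defaultdict


def detect_at_level(normalized_counts, taxonomy, level, total_mapped, threshold):
    """Aggregate normalized counts by taxon at the given level and report taxa
    whose share of the total mapped 16S reads reaches the threshold."""
    aggregated = defaultdict(float)
    for ref_id, count in normalized_counts.items():
        aggregated[taxonomy[ref_id][level]] += count
    # Assumption: a ratio at or above the threshold counts as detected.
    return {
        taxon for taxon, count in aggregated.items()
        if count / total_mapped >= threshold
    }


# Hypothetical normalized counts and taxonomy annotations.
normalized_counts = {"r1": 50.0, "r2": 30.0, "r3": 1.0}
taxonomy = {
    "r1": {"genus": "Bacteroides"},
    "r2": {"genus": "Bacteroides"},
    "r3": {"genus": "Escherichia"},
}
detected = detect_at_level(
    normalized_counts, taxonomy, "genus", total_mapped=1000, threshold=0.01
)
```

Here the two Bacteroides references pool to a ratio of 0.08 and are reported, while Escherichia at 0.001 falls below the threshold.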


      D14. The method of embodiment D1, wherein the plurality of nucleic acid sequence reads further include a plurality of targeted species sequence reads.


      D15. The method of embodiment D14, further comprising mapping the targeted species sequence reads to segmented reference sequences to form targeted species mapped reads, wherein each segmented reference sequence comprises segments corresponding to expected amplicons for a strain of the targeted species.


      D16. The method of embodiment D15, further comprising aggregating counts of the targeted species mapped reads to form aggregated read counts per species.


      D17. The method of embodiment D16, further comprising normalizing the aggregated read counts per species by dividing by a total number of amplifying amplicons to form a normalized read count per species.


      D18. The method of embodiment D17, further comprising adding the normalized read counts per species across the species to form a total of normalized read counts.


      D19. The method of embodiment D18, further comprising dividing each normalized read count per species by a total of normalized read counts per species to form a ratio per species.


      D20. The method of embodiment D19, further comprising applying a second threshold to the ratio per species to detect a presence of the targeted species in the sample.
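The targeted species calculation of embodiments D16 through D20 can be sketched as follows. The sketch assumes that "a total number of amplifying amplicons" is a per-species count of amplicons; the function names, species labels, counts, and threshold value are hypothetical.

```python
def species_ratios(aggregated_counts, amplicon_counts):
    """Normalize aggregated read counts per species by amplicon number (D17),
    total them (D18), and form a ratio per species (D19)."""
    normalized = {
        species: aggregated_counts[species] / amplicon_counts[species]
        for species in aggregated_counts
    }
    total = sum(normalized.values())
    if total == 0:
        return {species: 0.0 for species in normalized}
    return {species: value / total for species, value in normalized.items()}


def detect_species(ratios, second_threshold):
    # Assumption: a ratio at or above the second threshold counts as detected (D20).
    return {species for species, ratio in ratios.items() if ratio >= second_threshold}


# Hypothetical aggregated read counts and amplicon numbers per species.
aggregated = {"species_A": 300, "species_B": 100, "species_C": 0}
amplicons = {"species_A": 3, "species_B": 1, "species_C": 2}
ratios = species_ratios(aggregated, amplicons)
present = detect_species(ratios, second_threshold=0.05)
```

Although species_A has three times the raw reads of species_B, it also has three times the amplicons, so both end up with a ratio of 0.5 after normalization.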


      D21. The method of embodiment D15, further comprising generating the segmented reference sequences by applying an in silico PCR based on primers of a species primer pool.


      D22. The method of embodiment D1, further comprising generating the compressed 16S reference sequences by applying an in silico PCR based on primers of a 16S primer pool.
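The in silico PCR referred to in embodiments D21 and D22 can be illustrated with a deliberately simplified sketch. It assumes exact primer matches on a single strand with no mismatch tolerance, which a practical tool would relax; the function names, the `max_len` parameter, and the short sequences are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def in_silico_pcr(template, forward, reverse, max_len=1000):
    """Return the predicted amplicon (primers included) or None.

    Simplification: exact matches only; the reverse primer is located by
    searching for its reverse complement downstream of the forward primer.
    """
    start = template.find(forward)
    if start < 0:
        return None
    reverse_site = reverse_complement(reverse)
    end = template.find(reverse_site, start + len(forward))
    if end < 0:
        return None
    amplicon = template[start:end + len(reverse_site)]
    return amplicon if len(amplicon) <= max_len else None


# Hypothetical template and primer pair.
template = "TTTTACGTACGGGGCCCCTTGACAAAAA"
amplicon = in_silico_pcr(template, forward="ACGTAC", reverse="TGTCAA")
```

Applying such a function to each full-length reference with each primer pair of a pool would yield the set of expected segments per strain, from which compressed or segmented reference sequences could be assembled.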


      D23. The method of embodiment D1, wherein the plurality of 16S sequence reads correspond to amplicons produced by amplifying a nucleic acid sample in the presence of one or more primer pairs targeting one or more hypervariable regions of a prokaryotic 16S rRNA gene.


      D24. The method of embodiment D14, wherein the plurality of targeted species sequence reads correspond to amplicons produced by amplifying a target nucleic acid sequence contained within a genome of a microorganism that is outside a hypervariable region of a prokaryotic 16S rRNA gene, wherein different primer pairs amplify different target nucleic acid sequences contained within the genome of different microorganisms in the nucleic acid sample.


      E1. A method, comprising:


      (a) receiving a plurality of nucleic acid sequence reads at a processor, wherein the sequence reads include a plurality of 16S sequence reads;


      (b) first mapping the plurality of 16S sequence reads to a plurality of compressed 16S reference sequences, wherein each compressed 16S reference sequence includes a set of hypervariable segments for a corresponding strain of a species;


      (c) counting the 16S sequence reads mapped to each hypervariable segment in the set of hypervariable segments to form a first set of read counts;


      (d) compressing a database of full-length 16S reference sequences to form a reduced set of full-length 16S reference sequences based on the first set of read counts of the 16S sequence reads mapped to the compressed 16S reference sequences, the reduced set of full-length 16S reference sequences stored in a memory;


      (e) second mapping the plurality of 16S sequence reads to the reduced set of full-length 16S reference sequences;


      (f) counting the 16S sequence reads that mapped to each full-length reference sequence in the reduced set of full-length 16S reference sequences to form a second set of read counts; and


      (g) detecting a presence of a microbe at a species level, a genus level or a family level in a sample based on the second set of read counts.


      E2. The method of embodiment E1, wherein the plurality of nucleic acid sequence reads further include a plurality of targeted species sequence reads.


      E3. The method of embodiment E2, further comprising mapping the targeted species sequence reads to segmented reference sequences to form targeted species mapped reads, wherein each segmented reference sequence comprises segments corresponding to expected amplicons for a strain of the targeted species.


      E4. The method of embodiment E3, further comprising aggregating counts of the targeted species mapped reads to form aggregated read counts per species.


      E5. The method of embodiment E4, further comprising detecting a presence of the targeted species in the sample based on the aggregated read counts per species.


      E6. The method of embodiment E3, further comprising generating the segmented reference sequences by applying an in silico PCR based on primers of a species primer pool.


      E7. The method of embodiment E1, further comprising generating the compressed 16S reference sequences by applying an in silico PCR based on primers of a 16S primer pool.


      E8. The method of embodiment E1, wherein the plurality of 16S sequence reads correspond to amplicons produced by amplifying a nucleic acid sample in the presence of one or more primer pairs targeting one or more hypervariable regions of a prokaryotic 16S rRNA gene.


      E9. The method of embodiment E2, wherein the plurality of targeted species sequence reads correspond to amplicons produced by amplifying a target nucleic acid sequence contained within a genome of a microorganism that is outside a hypervariable region of a prokaryotic 16S rRNA gene, wherein different primer pairs amplify different target nucleic acid sequences contained within the genome of different microorganisms in the nucleic acid sample.


      F1. A system, comprising:


      (a) a machine-readable memory; and


      (b) a processor configured to execute machine-readable instructions, which, when executed by the processor, cause the system to perform a method, comprising:


      (i) receiving a plurality of nucleic acid sequence reads at the processor, wherein the sequence reads include a plurality of 16S sequence reads;


      (ii) first mapping the plurality of 16S sequence reads to a plurality of compressed 16S reference sequences, wherein each compressed 16S reference sequence includes a set of hypervariable segments for a corresponding strain of a species;


      (iii) generating a read count matrix containing read counts of 16S sequence reads mapped to each hypervariable segment in the set of hypervariable segments, wherein rows of the read count matrix correspond to strains of species and columns correspond to the hypervariable segments;


      (iv) reducing the read count matrix by applying thresholding to the read counts to form a reduced read count matrix;


      (v) compressing a database of full-length 16S reference sequences to form a reduced set of full-length 16S reference sequences based on the reduced read count matrix, the reduced set of full-length 16S reference sequences stored in the memory;


      (vi) second mapping the plurality of 16S sequence reads to the reduced set of full-length 16S reference sequences;


      (vii) counting the 16S sequence reads that mapped to each full-length reference in the reduced set of full-length 16S reference sequences to form a second set of read counts;


      (viii) normalizing the read counts in the second set of read counts to form normalized counts;


      (ix) aggregating the normalized counts for a given level to form aggregated counts, wherein the given level is a species level, a genus level or a family level; and


      (x) applying a threshold to the aggregated counts to detect a presence of a microbe at the given level in a sample.


      F2. The system of embodiment F1, wherein the reducing the read count matrix further comprises eliminating rows of the read count matrix when a sum of read counts within the row is less than a row sum threshold to form a first reduced read count matrix.


      F3. The system of embodiment F2, wherein the reducing the read count matrix further comprises:


      (a) adding the read counts of the rows of the first reduced read count matrix that correspond to identical expected signatures for a corresponding species to form column sums; and


      (b) adding the column sums to form a combined sum, wherein an expected signature comprises binary values corresponding to the hypervariable segments in the set of hypervariable segments expected to be present (=1) or absent (=0) in the strain.


      F4. The system of embodiment F3, wherein the reducing the read count matrix further comprises eliminating the rows of the first reduced read count matrix when the combined sum is less than a combined sum threshold to form a second reduced read count matrix.


      F5. The system of embodiment F3, wherein the reducing the read count matrix further comprises applying a signature threshold to the column sums to assign binary values to form an observed signature for each row of the second reduced read count matrix, the observed signature and expected signature each having a total number of categories.


      F6. The system of embodiment F5, wherein the compressing further comprises determining a ratio of the categories that have matching binary values in the observed signature and the expected signature to the total number of categories.


      F7. The system of embodiment F6, wherein the compressing further comprises selecting a corresponding full-length 16S reference sequence from the database of full-length 16S reference sequences stored in memory for a first reduced set of full-length 16S reference sequences when the ratio is greater than a ratio threshold.


      F8. The system of embodiment F7, wherein the second mapping step uses the first reduced set of full-length 16S reference sequences as the reduced set of full-length 16S reference sequences.


      F9. The system of embodiment F7, wherein the compressing further comprises reassigning unannotated strains to annotated strains in the first reduced set of full-length 16S reference sequences based on a sequence similarity metric to form a second reduced set of full-length 16S reference sequences.


      F10. The system of embodiment F9, wherein the second mapping step uses the second reduced set of full-length 16S reference sequences as the reduced set of full-length 16S reference sequences.


      F11. The system of embodiment F3, wherein the normalizing step further comprises dividing the read count in the second set of read counts by a number of 1's in the expected signature to form the normalized read count.


      F12. The system of embodiment F11, wherein the normalizing step further comprises dividing the normalized count by an average copy number of a corresponding 16S gene.


      F13. The system of embodiment F1, wherein the step of applying a threshold further comprises applying the threshold to a ratio of the aggregated counts to a total number of mapped 16S sequence reads.


      F14. The system of embodiment F1, wherein the plurality of nucleic acid sequence reads further include a plurality of targeted species sequence reads.


      F15. The system of embodiment F14, further comprising mapping the targeted species sequence reads to segmented reference sequences to form targeted species mapped reads, wherein each segmented reference sequence comprises segments corresponding to expected amplicons for a strain of the targeted species.


      F16. The system of embodiment F15, further comprising aggregating counts of the targeted species mapped reads to form aggregated read counts per species.


      F17. The system of embodiment F16, further comprising normalizing the aggregated read counts per species by dividing by a total number of amplifying amplicons to form a normalized read count per species.


      F18. The system of embodiment F17, further comprising adding the normalized read counts per species across the species to form a total of normalized read counts.


      F19. The system of embodiment F18, further comprising dividing each normalized read count per species by a total of normalized read counts per species to form a ratio per species.


      F20. The system of embodiment F19, further comprising applying a second threshold to the ratio per species to detect a presence of the targeted species in the sample.


      F21. The system of embodiment F15, further comprising generating the segmented reference sequences by applying an in silico PCR based on primers of a species primer pool.


      F22. The system of embodiment F1, further comprising generating the compressed 16S reference sequences by applying an in silico PCR based on primers of a 16S primer pool.


      F23. The system of embodiment F1, wherein the plurality of 16S sequence reads correspond to amplicons produced by amplifying a nucleic acid sample in the presence of one or more primer pairs targeting one or more hypervariable regions of a prokaryotic 16S rRNA gene.


      F24. The system of embodiment F14, wherein the plurality of targeted species sequence reads correspond to amplicons produced by amplifying a target nucleic acid sequence contained within a genome of a microorganism that is outside a hypervariable region of a prokaryotic 16S rRNA gene, wherein different primer pairs amplify different target nucleic acid sequences contained within the genome of different microorganisms in the nucleic acid sample.


      G1. A system, comprising:


      (a) a machine-readable memory; and


      (b) a processor configured to execute machine-readable instructions, which, when executed by the processor, cause the system to perform a method, comprising:


      (i) receiving a plurality of nucleic acid sequence reads at the processor, wherein the sequence reads include a plurality of 16S sequence reads;


      (ii) first mapping the plurality of 16S sequence reads to a plurality of compressed 16S reference sequences, wherein each compressed 16S reference sequence includes a set of hypervariable segments for a corresponding strain of a species;


      (iii) counting the 16S sequence reads mapped to each hypervariable segment in the set of hypervariable segments to form a first set of read counts;


      (iv) compressing a database of full-length 16S reference sequences to form a reduced set of full-length 16S reference sequences based on the first set of read counts of the 16S sequence reads mapped to the compressed 16S reference sequences, the reduced set of full-length 16S reference sequences stored in the memory;


      (v) second mapping the plurality of 16S sequence reads to the reduced set of full-length 16S reference sequences;


      (vi) counting the 16S sequence reads that mapped to each full-length reference sequence in the reduced set of full-length 16S reference sequences to form a second set of read counts; and


      (vii) detecting a presence of a microbe at a species level, a genus level or a family level in a sample based on the second set of read counts.


      G2. The system of embodiment G1, wherein the plurality of nucleic acid sequence reads further include a plurality of targeted species sequence reads.


      G3. The system of embodiment G2, further comprising mapping the targeted species sequence reads to segmented reference sequences to form targeted species mapped reads, wherein each segmented reference sequence comprises segments corresponding to expected amplicons for a strain of the targeted species.


      G4. The system of embodiment G3, further comprising aggregating counts of the targeted species mapped reads to form aggregated read counts per species.


      G5. The system of embodiment G4, further comprising detecting a presence of the targeted species in the sample based on the aggregated read counts per species.


      G6. The system of embodiment G3, further comprising generating the segmented reference sequences by applying an in silico PCR based on primers of a species primer pool.


      G7. The system of embodiment G1, further comprising generating the compressed 16S reference sequences by applying an in silico PCR based on primers of a 16S primer pool.


      G8. The system of embodiment G1, wherein the plurality of 16S sequence reads correspond to amplicons produced by amplifying a nucleic acid sample in the presence of one or more primer pairs targeting one or more hypervariable regions of a prokaryotic 16S rRNA gene.


      G9. The system of embodiment G2, wherein the plurality of targeted species sequence reads correspond to amplicons produced by amplifying a target nucleic acid sequence contained within a genome of a microorganism that is outside a hypervariable region of a prokaryotic 16S rRNA gene, wherein different primer pairs amplify different target nucleic acid sequences contained within the genome of different microorganisms in the nucleic acid sample.


      H1. A non-transitory machine-readable storage medium comprising instructions which, when executed by a processor, cause the processor to perform a method, comprising:


      (a) receiving a plurality of nucleic acid sequence reads at the processor, wherein the sequence reads include a plurality of 16S sequence reads;


      (b) first mapping the plurality of 16S sequence reads to a plurality of compressed 16S reference sequences, wherein each compressed 16S reference sequence includes a set of hypervariable segments for a corresponding strain of a species;


      (c) generating a read count matrix containing read counts of 16S sequence reads mapped to each hypervariable segment in the set of hypervariable segments, wherein rows of the read count matrix correspond to strains of species and columns correspond to the hypervariable segments;


      (d) reducing the read count matrix by applying thresholding to the read counts to form a reduced read count matrix;


      (e) compressing a database of full-length 16S reference sequences to form a reduced set of full-length 16S reference sequences based on the reduced read count matrix, the reduced set of full-length 16S reference sequences stored in a memory;


      (f) second mapping the plurality of 16S sequence reads to the reduced set of full-length 16S reference sequences;


      (g) counting the 16S sequence reads that mapped to each full-length reference in the reduced set of full-length 16S reference sequences to form a second set of read counts;


      (h) normalizing the read counts in the second set of read counts to form normalized counts;


      (i) aggregating the normalized counts for a given level to form aggregated counts, wherein the given level is a species level, a genus level or a family level; and


      (j) applying a threshold to the aggregated counts to detect a presence of a microbe at the given level in a sample.


      H2. The non-transitory machine-readable storage medium of embodiment H1, further comprising instructions which cause the processor to perform the method, wherein the reducing the read count matrix further comprises eliminating rows of the read count matrix when a sum of read counts within the row is less than a row sum threshold to form a first reduced read count matrix.


      H3. The non-transitory machine-readable storage medium of embodiment H2, further comprising instructions which cause the processor to perform the method, wherein the reducing the read count matrix further comprises:


      (a) adding the read counts of the rows of the first reduced read count matrix that correspond to identical expected signatures for a corresponding species to form column sums; and


      (b) adding the column sums to form a combined sum, wherein an expected signature comprises binary values corresponding to the hypervariable segments in the set of hypervariable segments expected to be present (=1) or absent (=0) in the strain.


      H4. The non-transitory machine-readable storage medium of embodiment H3, further comprising instructions which cause the processor to perform the method, wherein the reducing the read count matrix further comprises eliminating the rows of the first reduced read count matrix when the combined sum is less than a combined sum threshold to form a second reduced read count matrix.


      H5. The non-transitory machine-readable storage medium of embodiment H3, further comprising instructions which cause the processor to perform the method, wherein the reducing the read count matrix further comprises applying a signature threshold to the column sums to assign binary values to form an observed signature for each row of the second reduced read count matrix, the observed signature and expected signature each having a total number of categories.


      H6. The non-transitory machine-readable storage medium of embodiment H5, further comprising instructions which cause the processor to perform the method, wherein the compressing further comprises determining a ratio of the categories that have matching binary values in the observed signature and the expected signature to the total number of categories.


      H7. The non-transitory machine-readable storage medium of embodiment H6, further comprising instructions which cause the processor to perform the method, wherein the compressing further comprises selecting a corresponding full-length 16S reference sequence from the database of full-length 16S reference sequences stored in memory for a first reduced set of full-length 16S reference sequences when the ratio is greater than a ratio threshold.
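Embodiments H5 through H7 can be sketched as a signature comparison. The function names, the example values and the choice to set an observed bit when the column sum is at least the signature threshold are assumptions; the strict "greater than" comparison against the ratio threshold follows the wording of H7.

```python
def observed_signature(column_sums, signature_threshold):
    """Binarize column sums: 1 if the segment has enough reads, else 0 (assumed direction)."""
    return [1 if total >= signature_threshold else 0 for total in column_sums]


def match_ratio(observed, expected):
    """Fraction of categories whose binary value agrees between the two signatures."""
    if len(observed) != len(expected):
        raise ValueError("signatures must have the same number of categories")
    matches = sum(1 for o, e in zip(observed, expected) if o == e)
    return matches / len(expected)


def select_references(candidates, signature_threshold, ratio_threshold):
    """Return reference ids whose match ratio is greater than ratio_threshold."""
    selected = []
    for ref_id, (column_sums, expected) in candidates.items():
        observed = observed_signature(column_sums, signature_threshold)
        if match_ratio(observed, expected) > ratio_threshold:
            selected.append(ref_id)
    return selected


# Hypothetical candidates: reference id -> (column sums, expected signature).
candidates = {
    "ref_1": ([60, 45, 0, 30], [1, 1, 0, 1]),
    "ref_2": ([60, 0, 0, 2], [1, 1, 1, 1]),
}
first_reduced_set = select_references(candidates, signature_threshold=5, ratio_threshold=0.7)
```

For ref_1 the observed signature equals the expected one, giving a ratio of 1.0; for ref_2 only one of four categories matches (0.25), so it is not selected.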


      H8. The non-transitory machine-readable storage medium of embodiment H7, further comprising instructions which cause the processor to perform the method, wherein the second mapping step uses the first reduced set of full-length 16S reference sequences as the reduced set of full-length 16S reference sequences.


      H9. The non-transitory machine-readable storage medium of embodiment H7, further comprising instructions which cause the processor to perform the method, wherein the compressing further comprises reassigning unannotated strains to annotated strains in the first reduced set of full-length 16S reference sequences based on a sequence similarity metric to form a second reduced set of full-length 16S reference sequences.


      H10. The non-transitory machine-readable storage medium of embodiment H9, further comprising instructions which cause the processor to perform the method, wherein the second mapping step uses the second reduced set of full-length 16S reference sequences as the reduced set of full-length 16S reference sequences.


      H11. The non-transitory machine-readable storage medium of embodiment H3, further comprising instructions which cause the processor to perform the method, wherein the normalizing step further comprises dividing the read count in the second set of read counts by a number of 1's in the expected signature to form the normalized read count.


      H12. The non-transitory machine-readable storage medium of embodiment H11, further comprising instructions which cause the processor to perform the method, wherein the normalizing step further comprises dividing the normalized count by an average copy number of a corresponding 16S gene.
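The two normalization steps of embodiments H11 and H12 amount to two divisions, sketched here under stated assumptions. The function name, the example read count and the copy number are hypothetical; treating the copy-number division as optional reflects that H12 depends on H11.

```python
def normalize_count(read_count, expected_signature, avg_copy_number=None):
    """Divide a read count by the number of 1's in the expected signature,
    then optionally by the average 16S gene copy number."""
    n_expected = sum(expected_signature)
    if n_expected == 0:
        raise ValueError("expected signature has no segments marked present")
    normalized = read_count / n_expected
    if avg_copy_number is not None:
        normalized /= avg_copy_number
    return normalized


# Illustrative values: 600 reads, six segments expected, four 16S copies on average.
signature = [1, 1, 1, 0, 1, 1, 1]
per_segment = normalize_count(600, signature)
per_copy = normalize_count(600, signature, avg_copy_number=4)
```

Dividing by the number of expected segments keeps strains that amplify across more hypervariable regions from being overcounted relative to strains that amplify across fewer.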


      H13. The non-transitory machine-readable storage medium of embodiment H1, further comprising instructions which cause the processor to perform the method, wherein the step of applying a threshold further comprises applying the threshold to a ratio of the aggregated counts to a total number of mapped 16S sequence reads.
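The relative threshold of embodiment H13 can be sketched as a comparison on the fraction of mapped reads. The taxon labels, counts and threshold value are illustrative, and the "at least" comparison is an assumption.

```python
def detect_by_fraction(aggregated, total_mapped_reads, fraction_threshold):
    """Call a taxon present when aggregated / total_mapped_reads reaches the threshold."""
    if total_mapped_reads <= 0:
        raise ValueError("total_mapped_reads must be positive")
    return sorted(
        taxon
        for taxon, count in aggregated.items()
        if count / total_mapped_reads >= fraction_threshold
    )


# Illustrative aggregated counts out of 10,000 mapped 16S reads.
aggregated = {"taxon_A": 650, "taxon_B": 30, "taxon_C": 5}
present = detect_by_fraction(aggregated, total_mapped_reads=10000, fraction_threshold=0.001)
```

Expressing the threshold as a fraction makes the call less sensitive to sequencing depth than an absolute count would be.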


      H14. The non-transitory machine-readable storage medium of embodiment H1, further comprising instructions which cause the processor to perform the method, wherein the plurality of nucleic acid sequence reads further include a plurality of targeted species sequence reads.


      H15. The non-transitory machine-readable storage medium of embodiment H14, further comprising instructions which cause the processor to perform the method, further comprising mapping the targeted species sequence reads to segmented reference sequences to form targeted species mapped reads, wherein each segmented reference sequence comprises segments corresponding to expected amplicons for a strain of the targeted species.


      H16. The non-transitory machine-readable storage medium of embodiment H15, further comprising instructions which cause the processor to perform the method, further comprising aggregating counts of the targeted species mapped reads to form aggregated read counts per species.


      H17. The non-transitory machine-readable storage medium of embodiment H16, further comprising instructions which cause the processor to perform the method, further comprising normalizing the aggregated read counts per species by dividing by a total number of amplifying amplicons to form a normalized read count per species.


      H18. The non-transitory machine-readable storage medium of embodiment H17, further comprising instructions which cause the processor to perform the method, further comprising adding the normalized read counts per species across the species to form a total of normalized read counts.


      H19. The non-transitory machine-readable storage medium of embodiment H18, further comprising instructions which cause the processor to perform the method, further comprising dividing each normalized read count per species by a total of normalized read counts per species to form a ratio per species.


      H20. The non-transitory machine-readable storage medium of embodiment H19, further comprising instructions which cause the processor to perform the method, further comprising applying a second threshold to the ratio per species to detect a presence of the targeted species in the sample.
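The targeted-species chain of embodiments H16 through H20 can be sketched end to end. The reference identifiers, species labels and counts are hypothetical, and the "total number of amplifying amplicons" is taken here to be a per-species value supplied by the caller, which is an interpretation rather than a confirmed detail.

```python
from collections import defaultdict


def targeted_species_ratios(mapped_read_counts, species_of, amplifying_amplicons):
    """Aggregate reads per species, normalize by amplicon count, and convert to ratios."""
    per_species = defaultdict(int)
    for ref_id, n_reads in mapped_read_counts.items():
        per_species[species_of[ref_id]] += n_reads
    normalized = {
        species: total / amplifying_amplicons[species]
        for species, total in per_species.items()
    }
    grand_total = sum(normalized.values())
    if grand_total == 0:
        return {species: 0.0 for species in normalized}
    return {species: value / grand_total for species, value in normalized.items()}


def detect_targeted(ratios, second_threshold):
    """Return species whose ratio is at least the second threshold (assumed direction)."""
    return sorted(species for species, ratio in ratios.items() if ratio >= second_threshold)


# Illustrative mapped read counts per segmented reference sequence.
counts = {"A_strain1": 300, "A_strain2": 100, "B_strain1": 300, "C_strain1": 0}
species_of = {"A_strain1": "sp_A", "A_strain2": "sp_A", "B_strain1": "sp_B", "C_strain1": "sp_C"}
amplicons = {"sp_A": 4, "sp_B": 1, "sp_C": 2}

ratios = targeted_species_ratios(counts, species_of, amplicons)
detected = detect_targeted(ratios, second_threshold=0.05)
```

In this example sp_A has more raw reads than sp_B (400 versus 300) but a lower ratio (0.25 versus 0.75) once each is divided by its amplicon count.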


      H21. The non-transitory machine-readable storage medium of embodiment H15, further comprising instructions which cause the processor to perform the method, further comprising generating the segmented reference sequences by applying an in silico PCR based on primers of a species primer pool.


      H22. The non-transitory machine-readable storage medium of embodiment H1, further comprising instructions which cause the processor to perform the method, further comprising generating the compressed 16S reference sequences by applying an in silico PCR based on primers of a 16S primer pool.
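The in silico PCR referred to in embodiments H21 and H22 can be illustrated with a deliberately simplified sketch. It assumes exact primer matches on a single strand, takes the first occurrence of each primer site, and keeps the primer sites in the extracted segment; a practical implementation would typically tolerate mismatches and search both strands. The template, the primers and the derivation of an expected signature from the result are all hypothetical.

```python
def reverse_complement(seq):
    """Return the reverse complement of an uppercase A/C/G/T sequence."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq))


def in_silico_pcr(template, primer_pairs):
    """For each (forward, reverse) primer pair, return the predicted amplicon
    or None when either primer site is absent from the template."""
    segments = []
    for forward, reverse in primer_pairs:
        start = template.find(forward)
        reverse_site = reverse_complement(reverse)
        end = template.find(reverse_site, start + len(forward)) if start >= 0 else -1
        if start >= 0 and end >= 0:
            segments.append(template[start:end + len(reverse_site)])
        else:
            segments.append(None)
    return segments


# Toy reference sequence and primer pool (not sequences from the disclosure).
template = "TTTTACGTACGGGGCCCCTTGACAAAAA"
primer_pool = [("ACGTAC", "TGTCAA"), ("GATTACA", "TGTCAA")]

segments = in_silico_pcr(template, primer_pool)
expected_signature = [1 if segment is not None else 0 for segment in segments]
```

The list of non-empty segments plays the role of a compressed or segmented reference for one strain, and the 1/0 pattern shows how an expected signature could be derived from the same pass.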


      H23. The non-transitory machine-readable storage medium of embodiment H1, further comprising instructions which cause the processor to perform the method, wherein the plurality of 16S sequence reads correspond to amplicons produced by amplifying a nucleic acid sample in the presence of one or more primer pairs targeting one or more hypervariable regions of a prokaryotic 16S rRNA gene.


      H24. The non-transitory machine-readable storage medium of embodiment H14, further comprising instructions which cause the processor to perform the method, wherein the plurality of targeted species sequence reads correspond to amplicons produced by amplifying a target nucleic acid sequence contained within a genome of a microorganism that is outside a hypervariable region of a prokaryotic 16S rRNA gene, wherein different primer pairs amplify different target nucleic acid sequences contained within the genome of different microorganisms in the nucleic acid sample.


      J1. A non-transitory machine-readable storage medium comprising instructions which, when executed by a processor, cause the processor to perform a method, comprising:


      (a) receiving a plurality of nucleic acid sequence reads at the processor, wherein the sequence reads include a plurality of 16S sequence reads;


      (b) first mapping the plurality of 16S sequence reads to a plurality of compressed 16S reference sequences, wherein each compressed 16S reference sequence includes a set of hypervariable segments for a corresponding strain of a species;


      (c) counting the 16S sequence reads mapped to each hypervariable segment in the set of hypervariable segments to form a first set of read counts;


      (d) compressing a database of full-length 16S reference sequences to form a reduced set of full-length 16S reference sequences based on the first set of read counts of the 16S sequence reads mapped to the compressed 16S reference sequences, the reduced set of full-length 16S reference sequences stored in a memory;


      (e) second mapping the plurality of 16S sequence reads to the reduced set of full-length 16S reference sequences;


      (f) counting the 16S sequence reads that mapped to each full-length reference sequence in the reduced set of full-length 16S reference sequences to form a second set of read counts; and


      (g) detecting a presence of a microbe at a species level, a genus level or a family level in a sample based on the second set of read counts.
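The two-pass flow of steps (a) through (g) can be sketched with a toy mapper. Exact substring matching stands in for a real read aligner, and a minimum first-pass read count stands in for the compression criterion; both are simplifying assumptions, as are all sequences and names.

```python
def two_pass_profile(reads, compressed_refs, full_length_db, min_first_pass_reads=1):
    """Map reads to hypervariable segments, shrink the full-length database
    to strains with first-pass support, then map again to the reduced set."""
    # First mapping: count reads per hypervariable segment of each strain.
    first_counts = {sid: [0] * len(segs) for sid, segs in compressed_refs.items()}
    for read in reads:
        for sid, segs in compressed_refs.items():
            for i, segment in enumerate(segs):
                if read in segment:  # toy "mapping": exact substring match
                    first_counts[sid][i] += 1

    # Compression: keep only full-length references with enough first-pass reads.
    reduced_db = {
        sid: full_length_db[sid]
        for sid, counts in first_counts.items()
        if sum(counts) >= min_first_pass_reads
    }

    # Second mapping: count reads per full-length reference in the reduced set.
    second_counts = {sid: 0 for sid in reduced_db}
    for read in reads:
        for sid, sequence in reduced_db.items():
            if read in sequence:
                second_counts[sid] += 1
    return first_counts, second_counts


# Toy data: two strains, two hypervariable segments each, three reads.
full_length_db = {
    "strain_1": "AAAACCCCGGGGTTTTACGT",
    "strain_2": "TTTTGGGGCCCCAAAATGCA",
}
compressed_refs = {
    "strain_1": ["AAAACCCC", "TTTTACGT"],
    "strain_2": ["TTTTGGGG", "AAAATGCA"],
}
reads = ["AAACCC", "TTACG", "AAACCC"]

first_counts, second_counts = two_pass_profile(reads, compressed_refs, full_length_db)
```

Because strain_2 receives no first-pass reads, its full-length reference is never searched in the second pass, which is the saving the compression step is meant to provide.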


      J2. The non-transitory machine-readable storage medium of embodiment J1, further comprising instructions which cause the processor to perform the method, wherein the plurality of nucleic acid sequence reads further include a plurality of targeted species sequence reads.


      J3. The non-transitory machine-readable storage medium of embodiment J2, further comprising instructions which cause the processor to perform the method, further comprising mapping the targeted species sequence reads to segmented reference sequences to form targeted species mapped reads, wherein each segmented reference sequence comprises segments corresponding to expected amplicons for a strain of the targeted species.


      J4. The non-transitory machine-readable storage medium of embodiment J3, further comprising instructions which cause the processor to perform the method, further comprising aggregating counts of the targeted species mapped reads to form aggregated read counts per species.


      J5. The non-transitory machine-readable storage medium of embodiment J4, further comprising instructions which cause the processor to perform the method, further comprising detecting a presence of the targeted species in the sample based on the aggregated read counts per species.


      J6. The non-transitory machine-readable storage medium of embodiment J3, further comprising instructions which cause the processor to perform the method, further comprising generating the segmented reference sequences by applying an in silico PCR based on primers of a species primer pool.


      J7. The non-transitory machine-readable storage medium of embodiment J1, further comprising instructions which cause the processor to perform the method, further comprising generating the compressed 16S reference sequences by applying an in silico PCR based on primers of a 16S primer pool.


      J8. The non-transitory machine-readable storage medium of embodiment J1, further comprising instructions which cause the processor to perform the method, wherein the plurality of 16S sequence reads correspond to amplicons produced by amplifying a nucleic acid sample in the presence of one or more primer pairs targeting one or more hypervariable regions of a prokaryotic 16S rRNA gene.


      J9. The non-transitory machine-readable storage medium of embodiment J2, further comprising instructions which cause the processor to perform the method, wherein the plurality of targeted species sequence reads correspond to amplicons produced by amplifying a target nucleic acid sequence contained within a genome of a microorganism that is outside a hypervariable region of a prokaryotic 16S rRNA gene, wherein different primer pairs amplify different target nucleic acid sequences contained within the genome of different microorganisms in the nucleic acid sample.


The disclosure and contents of any patents, patent applications, publications, GENBANK (and other database) sequences, websites and other published materials cited herein are hereby incorporated by reference in their entirety. Citation of any patents, patent applications, publications, GENBANK (and other database) sequences, websites and other published materials is not an admission that any of the foregoing is pertinent prior art, nor does it constitute any admission as to the contents or date of publication.

Claims
  • 1-53. (canceled)
  • 54. A method, comprising: receiving a plurality of nucleic acid sequence reads, wherein the sequence reads include a plurality of 16S sequence reads; first mapping the plurality of 16S sequence reads to a plurality of compressed 16S reference sequences, wherein each compressed 16S reference sequence includes a set of hypervariable segments for a corresponding strain of a species; generating a read count matrix containing read counts of 16S sequence reads mapped to each hypervariable segment in the set of hypervariable segments, wherein rows of the read count matrix correspond to strains of species and columns correspond to the hypervariable segments; reducing the read count matrix by applying thresholding to the read counts to form a reduced read count matrix; compressing a database of full-length 16S reference sequences to form a reduced set of full-length 16S reference sequences based on the reduced read count matrix, the reduced set of full-length 16S reference sequences stored in a memory; second mapping the plurality of 16S sequence reads to the reduced set of full-length 16S reference sequences; counting the 16S sequence reads that mapped to each full-length reference in the reduced set of full-length 16S reference sequences to form a second set of read counts; normalizing the read counts in the second set of read counts to form normalized counts; aggregating the normalized counts for a given level to form aggregated counts, wherein the given level is a species level, a genus level or a family level; and applying a threshold to the aggregated counts to detect a presence of a microbe at the given level in a sample.
  • 55. The method of claim 54, wherein the reducing the read count matrix further comprises eliminating rows of the read count matrix when a sum of read counts within the row is less than a row sum threshold to form a first reduced read count matrix.
  • 56. The method of claim 55, wherein the reducing the read count matrix further comprises: adding the read counts of the rows of the first reduced read count matrix that correspond to identical expected signatures for a corresponding species to form column sums; and adding the column sums to form a combined sum, wherein an expected signature comprises binary values corresponding to the hypervariable segments in the set of hypervariable segments expected to be present (=1) or absent (=0) in the strain.
  • 57. The method of claim 56, wherein the reducing the read count matrix further comprises eliminating the rows of the first reduced read count matrix when the combined sum is less than a combined sum threshold to form a second reduced read count matrix.
  • 58. The method of claim 56, wherein the reducing the read count matrix further comprises applying a signature threshold to the column sums to assign binary values to form an observed signature for each row of the second reduced read count matrix, the observed signature and expected signature each having a total number of categories.
  • 59. The method of claim 58, wherein the compressing further comprises determining a ratio of the categories that have matching binary values in the observed signature and the expected signature to the total number of categories.
  • 60. The method of claim 59, wherein the compressing further comprises selecting a corresponding full-length 16S reference sequence from the database of full-length 16S reference sequences stored in memory for a first reduced set of full-length 16S reference sequences when the ratio is greater than a ratio threshold.
  • 61-66. (canceled)
  • 67. The method of claim 54, wherein the plurality of nucleic acid sequence reads further include a plurality of targeted species sequence reads.
  • 68. The method of claim 67, further comprising mapping the targeted species sequence reads to segmented reference sequences to form targeted species mapped reads, wherein each segmented reference sequence comprises segments corresponding to expected amplicons for a strain of the targeted species.
  • 69-75. (canceled)
  • 76. The method of claim 54, wherein the plurality of 16S sequence reads correspond to amplicons produced by amplifying a nucleic acid sample in the presence of one or more primer pairs targeting one or more hypervariable regions of a prokaryotic 16S rRNA gene.
  • 77. The method of claim 67, wherein the plurality of targeted species sequence reads correspond to amplicons produced by amplifying a target nucleic acid sequence contained within a genome of a microorganism that is outside a hypervariable region of a prokaryotic 16S rRNA gene, wherein different primer pairs amplify different target nucleic acid sequences contained within the genome of different microorganisms in the nucleic acid sample.
  • 78. A method, comprising: receiving a plurality of nucleic acid sequence reads at a processor, wherein the sequence reads include a plurality of 16S sequence reads; first mapping the plurality of 16S sequence reads to a plurality of compressed 16S reference sequences, wherein each compressed 16S reference sequence includes a set of hypervariable segments for a corresponding strain of a species; counting the 16S sequence reads mapped to each hypervariable segment in the set of hypervariable segments to form a first set of read counts; compressing a database of full-length 16S reference sequences to form a reduced set of full-length 16S reference sequences based on the first set of read counts of the 16S sequence reads mapped to the compressed 16S reference sequences, the reduced set of full-length 16S reference sequences stored in a memory; second mapping the plurality of 16S sequence reads to the reduced set of full-length 16S reference sequences; counting the 16S sequence reads that mapped to each full-length reference sequence in the reduced set of full-length 16S reference sequences to form a second set of read counts; and detecting a presence of a microbe at a species level, a genus level or a family level in a sample based on the second set of read counts.
  • 79. The method of claim 78, wherein the plurality of nucleic acid sequence reads further include a plurality of targeted species sequence reads.
  • 80. The method of claim 79, further comprising mapping the targeted species sequence reads to segmented reference sequences to form targeted species mapped reads, wherein each segmented reference sequence comprises segments corresponding to expected amplicons for a strain of the targeted species.
  • 81. The method of claim 80, further comprising aggregating counts of the targeted species mapped reads to form aggregated read counts per species.
  • 82. The method of claim 81, further comprising detecting a presence of the targeted species in the sample based on the aggregated read counts per species.
  • 83. The method of claim 80, further comprising generating the segmented reference sequences by applying an in silico PCR based on primers of a species primer pool.
  • 84. The method of claim 78, further comprising generating the compressed 16S reference sequences by applying an in silico PCR based on primers of a 16S primer pool.
  • 85. The method of claim 78, wherein the plurality of 16S sequence reads correspond to amplicons produced by amplifying a nucleic acid sample in the presence of one or more primer pairs targeting one or more hypervariable regions of a prokaryotic 16S rRNA gene.
  • 86. The method of claim 79, wherein the plurality of targeted species sequence reads correspond to amplicons produced by amplifying a target nucleic acid sequence contained within a genome of a microorganism that is outside a hypervariable region of a prokaryotic 16S rRNA gene, wherein different primer pairs amplify different target nucleic acid sequences contained within the genome of different microorganisms in the nucleic acid sample.
  • 87-152. (canceled)
CROSS-REFERENCE TO RELATED APPLICATIONS

This application claims priority to and the benefit of U.S. Provisional App. Nos. 62/914,366, filed Oct. 11, 2019, and 62/914,368, filed Oct. 11, 2019, and 62/944,877, filed Dec. 6, 2019, each of which is incorporated herein by reference in its entirety.

Provisional Applications (3)
Number Date Country
62944877 Dec 2019 US
62914368 Oct 2019 US
62914366 Oct 2019 US
Continuations (1)
Relation  Number             Date      Country
Parent    PCT/US2020/070643  Oct 2020  US
Child     17658493                     US