Composition For Use In Identification Of Bacteria

Information

  • Patent Application
  • Publication Number
    20120171692
  • Date Filed
    October 30, 2007
  • Date Published
    July 05, 2012
Abstract
The present invention provides oligonucleotide primers and compositions and kits containing the same for rapid identification of bacteria by amplification of a segment of bacterial nucleic acid followed by molecular mass analysis.
Description
FIELD OF THE INVENTION

The present invention relates generally to the field of genetic identification of bacteria and provides nucleic acid compositions and kits useful for this purpose when combined with molecular mass analysis.


BACKGROUND OF THE INVENTION

A problem in determining the cause of a natural infectious outbreak or a bioterrorist attack is the sheer variety of organisms that can cause human disease. There are over 1400 organisms infectious to humans; many of these have the potential to emerge suddenly in a natural epidemic or to be used in a malicious attack by bioterrorists (Taylor et al. Philos. Trans. R. Soc. London B. Biol. Sci., 2001, 356, 983-989). This number does not include numerous strain variants, bioengineered versions, or pathogens that infect plants or animals.


Much of the new technology being developed for detection of biological weapons incorporates a polymerase chain reaction (PCR) step based upon the use of highly specific primers and probes designed to selectively detect certain pathogenic organisms. Although this approach is appropriate for the most obvious bioterrorist organisms, like smallpox and anthrax, experience has shown that it is very difficult to predict which of hundreds of possible pathogenic organisms might be employed in a terrorist attack. Likewise, naturally emerging human diseases that have had devastating consequences for public health have come from unexpected families of bacteria, viruses, fungi, or protozoa. Plants and animals also have their natural burden of infectious disease agents and there are equally important biosafety and security concerns for agriculture.


A major conundrum in public health protection, biodefense, and agricultural safety and security is that these disciplines need to be able to rapidly identify and characterize infectious agents, while there is no existing technology with the breadth of function to meet this need. Currently used methods for identification of bacteria rely upon culturing the bacterium to effect isolation from other organisms and to obtain sufficient quantities of nucleic acid followed by sequencing of the nucleic acid, both of which are time- and labor-intensive processes.


Mass spectrometry provides detailed information about the molecules being analyzed, including high mass accuracy. It is also a process that can be easily automated. DNA chips with specific probes can only determine the presence or absence of specifically anticipated organisms. Because there are hundreds of thousands of species of benign bacteria, some very similar in sequence to threat organisms, even arrays with 10,000 probes lack the breadth needed to identify a particular organism.


There is a need for a method for identification of bioagents which is both specific and rapid, and in which no culture or nucleic acid sequencing is required. Disclosed in U.S. patent application Ser. Nos. 09/798,007, 09/891,793, 10/405,756, 10/418,514, 10/660,997, 10/660,122, 10/660,996, 10/728,486, 10/754,415 and 10/829,826, each of which is commonly owned and incorporated herein by reference in its entirety, are methods for identification of bioagents (any organism, cell, or virus, living or dead, or a nucleic acid derived from such an organism, cell or virus) in an unbiased manner by molecular mass and base composition analysis of “bioagent identifying amplicons” which are obtained by amplification of segments of essential and conserved genes which are involved in, for example, translation, replication, recombination and repair, transcription, nucleotide metabolism, amino acid metabolism, lipid metabolism, energy generation, uptake, secretion and the like. Examples of the products of these genes include, but are not limited to, ribosomal RNAs, ribosomal proteins, DNA and RNA polymerases, elongation factors, tRNA synthetases, protein chain initiation factors, heat shock protein groEL, phosphoglycerate kinase, NADH dehydrogenase, DNA ligases, DNA gyrases and DNA topoisomerases, metabolic enzymes, and the like.


To obtain bioagent identifying amplicons, primers are selected to hybridize to conserved sequence regions which bracket variable sequence regions to yield a segment of nucleic acid which can be amplified and which is amenable to methods of molecular mass analysis. The variable sequence regions provide the variability of molecular mass which is used for bioagent identification. Upon amplification by PCR or other amplification methods with the specifically chosen primers, an amplification product that represents a bioagent identifying amplicon is obtained. The molecular mass of the amplification product, obtained by mass spectrometry for example, provides the means to uniquely identify the bioagent without a requirement for prior knowledge of the possible identity of the bioagent. The molecular mass of the amplification product or the corresponding base composition (which can be calculated from the molecular mass of the amplification product) is compared with a database of molecular masses or base compositions and a match indicates the identity of the bioagent. Furthermore, the method can be applied to rapid parallel analyses (for example, in a multi-well plate format) the results of which can be employed in a triangulation identification strategy which is amenable to rapid throughput and does not require nucleic acid sequencing of the amplified target sequence for bioagent identification.
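The database comparison and triangulation strategy described above can be illustrated with a minimal Python sketch. It assumes a simple in-memory database keyed by primer pair, exact matching of base compositions, and triangulation implemented as the intersection of the candidate sets from each primer pair. The organism names, primer pair labels, base counts, and function names are all hypothetical and are not taken from the disclosure.

```python
# Hypothetical database: primer pair -> {organism: (A, G, C, T) base counts}.
# All names and counts are invented for illustration only.
DATABASE = {
    "pair_1": {
        "organism_A": (25, 30, 28, 20),
        "organism_B": (25, 30, 28, 20),
        "organism_C": (26, 29, 28, 20),
    },
    "pair_2": {
        "organism_A": (18, 22, 21, 19),
        "organism_B": (19, 21, 21, 19),
        "organism_C": (19, 21, 21, 19),
    },
}


def candidates(primer_pair, observed):
    """Organisms whose stored base composition equals the observed one."""
    entries = DATABASE[primer_pair]
    return {name for name, comp in entries.items() if comp == observed}


def triangulate(observations):
    """Intersect the candidate sets obtained from several primer pairs."""
    result = None
    for primer_pair, observed in observations.items():
        hits = candidates(primer_pair, observed)
        result = hits if result is None else result & hits
    return result or set()


# Neither amplicon alone is unique here, but the two together are.
identified = triangulate({
    "pair_1": (25, 30, 28, 20),
    "pair_2": (19, 21, 21, 19),
})
print(identified)
```

In this toy example each primer pair alone leaves two candidate organisms, and only the combination narrows the result to one, which is the point of analyzing several amplicons in parallel.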


The result of determination of a previously unknown base composition of a previously unknown bioagent (for example, a newly evolved and heretofore unobserved bacterium or virus) has downstream utility by providing new bioagent indexing information with which to populate base composition databases. The process of subsequent bioagent identification analyses is thus greatly improved as more base composition data for bioagent identifying amplicons becomes available.


The present invention provides oligonucleotide primers and compositions and kits containing the oligonucleotide primers, which define bacterial bioagent identifying amplicons and, upon amplification, produce corresponding amplification products whose molecular masses provide the means to identify bacteria, for example, at and below the species taxonomic level.


SUMMARY OF THE INVENTION

The present invention provides primers and compositions comprising pairs of primers, and kits containing the same for use in identification of bacteria. The primers are designed to produce bacterial bioagent identifying amplicons of DNA encoding genes essential to life such as, for example, 16S and 23S rRNA, DNA-directed RNA polymerase subunits (rpoB and rpoC), valyl-tRNA synthetase (valS), elongation factor EF-Tu (TufB), ribosomal protein L2 (rplB), protein chain initiation factor (infB), and spore protein (sspE). The invention further provides drill-down primers, compositions comprising pairs of primers and kits containing the same, which are designed to provide sub-species characterization of bacteria.


In particular, the present invention provides an oligonucleotide primer 16 to 35 nucleobases in length comprising 80% to 100% sequence identity with SEQ ID NO: 26, or a composition comprising the same; an oligonucleotide primer 20 to 27 nucleobases in length comprising at least a 20 nucleobase portion of SEQ ID NO: 388, or a composition comprising the same; a composition comprising both primers; and a composition comprising a first oligonucleotide primer 15 to 35 nucleobases in length comprising between 70% to 100% sequence identity of SEQ ID NO: 26, and a second oligonucleotide primer 16 to 35 nucleobases in length comprising between 70% to 100% sequence identity of SEQ ID NO: 388.


The present invention also provides an oligonucleotide primer 22 to 35 nucleobases in length comprising SEQ ID NO: 29, or a composition comprising the same; an oligonucleotide primer 18 to 35 nucleobases in length comprising SEQ ID NO: 391, or a composition comprising the same; a composition comprising both primers; and a composition comprising a first oligonucleotide primer 16 to 35 nucleobases in length comprising between 70% to 100% sequence identity of SEQ ID NO: 29, and a second oligonucleotide primer 13 to 35 nucleobases in length comprising between 70% to 100% sequence identity of SEQ ID NO: 391.


The present invention also provides an oligonucleotide primer 22 to 26 nucleobases in length comprising SEQ ID NO: 37, or a composition comprising the same; an oligonucleotide primer 20 to 30 nucleobases in length comprising SEQ ID NO: 362, or a composition comprising the same; a composition comprising both primers; and a composition comprising a first oligonucleotide primer 16 to 35 nucleobases in length comprising between 70% to 100% sequence identity of SEQ ID NO: 37, and a second oligonucleotide primer 14 to 35 nucleobases in length comprising between 70% to 100% sequence identity of SEQ ID NO: 362.


The present invention also provides an oligonucleotide primer 13 to 35 nucleobases in length comprising 70% to 100% sequence identity with SEQ ID NO: 48, or a composition comprising the same; an oligonucleotide primer 19 to 35 nucleobases in length comprising SEQ ID NO: 404, or a composition comprising the same; a composition comprising both primers; and a composition comprising a first oligonucleotide primer 13 to 35 nucleobases in length comprising between 70% to 100% sequence identity of SEQ ID NO: 48, and a second oligonucleotide primer 14 to 35 nucleobases in length comprising between 70% to 100% sequence identity of SEQ ID NO: 404.


The present invention also provides an oligonucleotide primer 21 to 35 nucleobases in length comprising 70% to 100% sequence identity with SEQ ID NO: 160, or a composition comprising the same; an oligonucleotide primer 21 to 35 nucleobases in length comprising at least a 16 nucleobase portion of SEQ ID NO: 515, or a composition comprising the same; a composition comprising both primers; and a composition comprising a first oligonucleotide primer 21 to 35 nucleobases in length comprising between 70% to 100% sequence identity of SEQ ID NO: 160, and a second oligonucleotide primer 21 to 35 nucleobases in length comprising between 70% to 100% sequence identity of SEQ ID NO: 515.


The present invention also provides an oligonucleotide primer 17 to 35 nucleobases in length comprising 70% to 100% sequence identity with SEQ ID NO: 261, or a composition comprising the same; an oligonucleotide primer 18 to 35 nucleobases in length comprising at least a 16 nucleobase portion of SEQ ID NO: 624, or a composition comprising the same; a composition comprising both primers; and a composition comprising a first oligonucleotide primer 17 to 35 nucleobases in length comprising between 70% to 100% sequence identity of SEQ ID NO: 261, and a second oligonucleotide primer 18 to 35 nucleobases in length comprising between 70% to 100% sequence identity of SEQ ID NO: 624.


The present invention also provides an oligonucleotide primer 21 to 35 nucleobases in length comprising 70% to 100% sequence identity with SEQ ID NO: 231, or a composition comprising the same; an oligonucleotide primer 17 to 35 nucleobases in length comprising 70% to 100% sequence identity with SEQ ID NO: 591, or a composition comprising the same; a composition comprising both primers; and a composition comprising a first oligonucleotide primer 21 to 35 nucleobases in length comprising between 70% to 100% sequence identity of SEQ ID NO: 231, and a second oligonucleotide primer 17 to 35 nucleobases in length comprising between 70% to 100% sequence identity of SEQ ID NO: 591.


The present invention also provides an oligonucleotide primer 14 to 35 nucleobases in length comprising 70% to 100% sequence identity with SEQ ID NO: 349, or a composition comprising the same; an oligonucleotide primer 17 to 35 nucleobases in length comprising 70% to 100% sequence identity with SEQ ID NO: 711, or a composition comprising the same; a composition comprising both primers; and a composition comprising a first oligonucleotide primer 14 to 35 nucleobases in length comprising between 70% to 100% sequence identity of SEQ ID NO: 349, and a second oligonucleotide primer 17 to 35 nucleobases in length comprising between 70% to 100% sequence identity of SEQ ID NO: 711.


The present invention also provides an oligonucleotide primer 16 to 35 nucleobases in length comprising 70% to 100% sequence identity with SEQ ID NO: 240, or a composition comprising the same; an oligonucleotide primer 15 to 35 nucleobases in length comprising 70% to 100% sequence identity with SEQ ID NO: 596, or a composition comprising the same; a composition comprising both primers; and a composition comprising a first oligonucleotide primer 16 to 35 nucleobases in length comprising between 70% to 100% sequence identity of SEQ ID NO: 240, and a second oligonucleotide primer 15 to 35 nucleobases in length comprising between 70% to 100% sequence identity of SEQ ID NO: 596.


The present invention also provides an oligonucleotide primer 16 to 35 nucleobases in length comprising 70% to 100% sequence identity with SEQ ID NO: 58, or a composition comprising the same; an oligonucleotide primer 21 to 35 nucleobases in length comprising at least a 16 nucleobase portion of SEQ ID NO: 414, or a composition comprising the same; a composition comprising both primers; and a composition comprising a first oligonucleotide primer 16 to 35 nucleobases in length comprising between 70% to 100% sequence identity of SEQ ID NO: 58, and a second oligonucleotide primer 15 to 35 nucleobases in length comprising between 70% to 100% sequence identity of SEQ ID NO: 414.


The present invention also provides an oligonucleotide primer 16 to 35 nucleobases in length comprising at least a 16 nucleobase portion of SEQ ID NO: 6, or a composition comprising the same; an oligonucleotide primer 16 to 35 nucleobases in length comprising at least a 16 nucleobase portion of SEQ ID NO: 369, or a composition comprising the same; a composition comprising both primers; and a composition comprising a first oligonucleotide primer 16 to 35 nucleobases in length comprising between 70% to 100% sequence identity of SEQ ID NO: 6, and a second oligonucleotide primer 15 to 35 nucleobases in length comprising between 70% to 100% sequence identity of SEQ ID NO: 369.


The present invention also provides an oligonucleotide primer 16 to 35 nucleobases in length comprising 70% to 100% sequence identity with SEQ ID NO: 246, or a composition comprising the same; an oligonucleotide primer 19 to 35 nucleobases in length comprising 70% to 100% sequence identity with SEQ ID NO: 602, or a composition comprising the same; a composition comprising both primers; and a composition comprising a first oligonucleotide primer 16 to 35 nucleobases in length comprising between 70% to 100% sequence identity of SEQ ID NO: 246, and a second oligonucleotide primer 19 to 35 nucleobases in length comprising between 70% to 100% sequence identity of SEQ ID NO: 602.


The present invention also provides an oligonucleotide primer 21 to 35 nucleobases in length comprising 70% to 100% sequence identity with SEQ ID NO: 256, or a composition comprising the same; an oligonucleotide primer 14 to 35 nucleobases in length comprising 70% to 100% sequence identity with SEQ ID NO: 620, or a composition comprising the same; a composition comprising both primers; and a composition comprising a first oligonucleotide primer 21 to 35 nucleobases in length comprising between 70% to 100% sequence identity of SEQ ID NO: 256, and a second oligonucleotide primer 14 to 35 nucleobases in length comprising between 70% to 100% sequence identity of SEQ ID NO: 620.


The present invention also provides an oligonucleotide primer 16 to 35 nucleobases in length comprising 70% to 100% sequence identity with SEQ ID NO: 344, or a composition comprising the same; an oligonucleotide primer 18 to 35 nucleobases in length comprising 70% to 100% sequence identity with SEQ ID NO: 700, or a composition comprising the same; a composition comprising both primers; and a composition comprising a first oligonucleotide primer 16 to 35 nucleobases in length comprising between 70% to 100% sequence identity of SEQ ID NO: 344, and a second oligonucleotide primer 18 to 35 nucleobases in length comprising between 70% to 100% sequence identity of SEQ ID NO: 700.


The present invention also provides an oligonucleotide primer 16 to 35 nucleobases in length comprising 70% to 100% sequence identity with SEQ ID NO: 235, or a composition comprising the same; an oligonucleotide primer 16 to 35 nucleobases in length comprising 70% to 100% sequence identity with SEQ ID NO: 587, or a composition comprising the same; a composition comprising both primers; and a composition comprising a first oligonucleotide primer 16 to 35 nucleobases in length comprising between 70% to 100% sequence identity of SEQ ID NO: 235, and a second oligonucleotide primer 16 to 35 nucleobases in length comprising between 70% to 100% sequence identity of SEQ ID NO: 587.


The present invention also provides an oligonucleotide primer 16 to 35 nucleobases in length comprising 70% to 100% sequence identity with SEQ ID NO: 322, or a composition comprising the same; an oligonucleotide primer 19 to 35 nucleobases in length comprising 70% to 100% sequence identity with SEQ ID NO: 686, or a composition comprising the same; a composition comprising both primers; and a composition comprising a first oligonucleotide primer 16 to 35 nucleobases in length comprising between 70% to 100% sequence identity of SEQ ID NO: 322, and a second oligonucleotide primer 19 to 35 nucleobases in length comprising between 70% to 100% sequence identity of SEQ ID NO: 686.


The present invention also provides an oligonucleotide primer 21 to 35 nucleobases in length comprising 70% to 100% sequence identity with SEQ ID NO: 97, or a composition comprising the same; an oligonucleotide primer 20 to 35 nucleobases in length comprising 70% to 100% sequence identity with SEQ ID NO: 451, or a composition comprising the same; a composition comprising both primers; and a composition comprising a first oligonucleotide primer 21 to 35 nucleobases in length comprising between 70% to 100% sequence identity of SEQ ID NO: 97, and a second oligonucleotide primer 20 to 35 nucleobases in length comprising between 70% to 100% sequence identity of SEQ ID NO: 451.


The present invention also provides an oligonucleotide primer 19 to 35 nucleobases in length comprising 70% to 100% sequence identity with SEQ ID NO: 127, or a composition comprising the same; an oligonucleotide primer 14 to 35 nucleobases in length comprising 70% to 100% sequence identity with SEQ ID NO: 482, or a composition comprising the same; a composition comprising both primers; and a composition comprising a first oligonucleotide primer 19 to 35 nucleobases in length comprising between 70% to 100% sequence identity of SEQ ID NO: 127, and a second oligonucleotide primer 14 to 35 nucleobases in length comprising between 70% to 100% sequence identity of SEQ ID NO: 482.


The present invention also provides an oligonucleotide primer 19 to 35 nucleobases in length comprising 70% to 100% sequence identity with SEQ ID NO: 174, or a composition comprising the same; an oligonucleotide primer 21 to 35 nucleobases in length comprising 70% to 100% sequence identity with SEQ ID NO: 530, or a composition comprising the same; a composition comprising both primers; and a composition comprising a first oligonucleotide primer 19 to 35 nucleobases in length comprising between 70% to 100% sequence identity of SEQ ID NO: 174, and a second oligonucleotide primer 21 to 35 nucleobases in length comprising between 70% to 100% sequence identity of SEQ ID NO: 530.


The present invention also provides an oligonucleotide primer 21 to 35 nucleobases in length comprising 70% to 100% sequence identity with SEQ ID NO: 310, or a composition comprising the same; an oligonucleotide primer 19 to 35 nucleobases in length comprising 70% to 100% sequence identity with SEQ ID NO: 668, or a composition comprising the same; a composition comprising both primers; and a composition comprising a first oligonucleotide primer 21 to 35 nucleobases in length comprising between 70% to 100% sequence identity of SEQ ID NO: 310, and a second oligonucleotide primer 19 to 35 nucleobases in length comprising between 70% to 100% sequence identity of SEQ ID NO: 668.


The present invention also provides an oligonucleotide primer 21 to 35 nucleobases in length comprising 70% to 100% sequence identity with SEQ ID NO: 313, or a composition comprising the same; an oligonucleotide primer 21 to 35 nucleobases in length comprising 70% to 100% sequence identity with SEQ ID NO: 670, or a composition comprising the same; a composition comprising both primers; and a composition comprising a first oligonucleotide primer 21 to 35 nucleobases in length comprising between 70% to 100% sequence identity of SEQ ID NO: 313, and a second oligonucleotide primer 21 to 35 nucleobases in length comprising between 70% to 100% sequence identity of SEQ ID NO: 670.


The present invention also provides an oligonucleotide primer 17 to 35 nucleobases in length comprising 70% to 100% sequence identity with SEQ ID NO: 277, or a composition comprising the same; an oligonucleotide primer 21 to 35 nucleobases in length comprising 70% to 100% sequence identity with SEQ ID NO: 632, or a composition comprising the same; a composition comprising both primers; and a composition comprising a first oligonucleotide primer 17 to 35 nucleobases in length comprising between 70% to 100% sequence identity of SEQ ID NO: 277, and a second oligonucleotide primer 21 to 35 nucleobases in length comprising between 70% to 100% sequence identity of SEQ ID NO: 632.


The present invention also provides an oligonucleotide primer 21 to 35 nucleobases in length comprising 70% to 100% sequence identity with SEQ ID NO: 285, or a composition comprising the same; an oligonucleotide primer 19 to 35 nucleobases in length comprising 70% to 100% sequence identity with SEQ ID NO: 640, or a composition comprising the same; a composition comprising both primers; and a composition comprising a first oligonucleotide primer 21 to 35 nucleobases in length comprising between 70% to 100% sequence identity of SEQ ID NO: 285, and a second oligonucleotide primer 19 to 35 nucleobases in length comprising between 70% to 100% sequence identity of SEQ ID NO: 640.


The present invention also provides an oligonucleotide primer 21 to 35 nucleobases in length comprising 70% to 100% sequence identity with SEQ ID NO: 301, or a composition comprising the same; an oligonucleotide primer 21 to 35 nucleobases in length comprising 70% to 100% sequence identity with SEQ ID NO: 656, or a composition comprising the same; a composition comprising both primers; and a composition comprising a first oligonucleotide primer 21 to 35 nucleobases in length comprising between 70% to 100% sequence identity of SEQ ID NO: 301, and a second oligonucleotide primer 21 to 35 nucleobases in length comprising between 70% to 100% sequence identity of SEQ ID NO: 656.


The present invention also provides an oligonucleotide primer 18 to 35 nucleobases in length comprising 70% to 100% sequence identity with SEQ ID NO: 308, or a composition comprising the same; an oligonucleotide primer 18 to 35 nucleobases in length comprising 70% to 100% sequence identity with SEQ ID NO: 663, or a composition comprising the same; a composition comprising both primers; and a composition comprising a first oligonucleotide primer 18 to 35 nucleobases in length comprising between 70% to 100% sequence identity of SEQ ID NO: 308, and a second oligonucleotide primer 18 to 35 nucleobases in length comprising between 70% to 100% sequence identity of SEQ ID NO: 663.


The present invention also provides compositions, such as those described herein, wherein either or both of the first and second oligonucleotide primers comprise at least one modified nucleobase, a non-templated T residue on the 5′-end, at least one non-template tag, or at least one molecular mass modifying tag, or any combination thereof.


The present invention also provides kits comprising any of the compositions described herein. The kits can comprise at least one calibration polynucleotide, or at least one ion exchange resin linked to magnetic beads, or both.


The present invention also provides methods for identification of an unknown bacterium. Nucleic acid from the bacterium is amplified using any of the compositions described herein to obtain an amplification product. The molecular mass of the amplification product is determined. Optionally, the base composition of the amplification product is determined from the molecular mass. The base composition or molecular mass is compared with a plurality of base compositions or molecular masses of known bacterial bioagent identifying amplicons, wherein a match between the base composition or molecular mass and a member of the plurality of base compositions or molecular masses identifies the unknown bacterium. The molecular mass can be measured by mass spectrometry. In addition, the presence or absence of a particular clade, genus, species, or sub-species of a bioagent can be determined by the methods described herein.


The present invention also provides methods for determination of the quantity of an unknown bacterium in a sample. The sample is contacted with any of the compositions described herein and a known quantity of a calibration polynucleotide comprising a calibration sequence. Concurrently, nucleic acid from the bacterium in the sample is amplified with any of the compositions described herein and nucleic acid from the calibration polynucleotide in the sample is amplified with any of the compositions described herein to obtain a first amplification product comprising a bacterial bioagent identifying amplicon and a second amplification product comprising a calibration amplicon. The molecular mass and abundance of the bacterial bioagent identifying amplicon and of the calibration amplicon are determined. The bacterial bioagent identifying amplicon is distinguished from the calibration amplicon based on molecular mass, wherein comparison of bacterial bioagent identifying amplicon abundance and calibration amplicon abundance indicates the quantity of bacterium in the sample. The method can also comprise determining the base composition of the bacterial bioagent identifying amplicon.
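As an illustration of the quantitation step, the following Python sketch picks out the bioagent and calibration peaks by their expected masses and scales the known calibrant quantity by the abundance ratio. The simple linear ratio assumes that both templates amplify with equal efficiency, which is an assumption of this sketch rather than a statement from the disclosure. The function names, masses, and peak values are hypothetical.

```python
def pick_peak(peaks, expected_mass, tol):
    """Return the abundance of the peak closest to expected_mass.

    peaks is a list of (mass, abundance) tuples; returns None when no
    peak lies within tol of the expected mass.
    """
    near = [(abs(mass - expected_mass), abundance)
            for mass, abundance in peaks
            if abs(mass - expected_mass) <= tol]
    return min(near)[1] if near else None


def estimate_quantity(bioagent_abundance, calibrant_abundance, calibrant_copies):
    """Scale the known calibrant quantity by the abundance ratio.

    Assumes (for illustration) equal amplification efficiency for the
    bioagent template and the calibration polynucleotide.
    """
    if calibrant_abundance <= 0:
        raise ValueError("calibrant abundance must be positive")
    return calibrant_copies * bioagent_abundance / calibrant_abundance


# Hypothetical spectrum: (mass in Da, peak height).
spectrum = [(30000.0, 300.0), (31500.0, 100.0), (40000.0, 5.0)]
bioagent = pick_peak(spectrum, expected_mass=30000.2, tol=1.0)
calibrant = pick_peak(spectrum, expected_mass=31500.1, tol=1.0)
quantity = estimate_quantity(bioagent, calibrant, calibrant_copies=500)
print(quantity)
```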





BRIEF DESCRIPTION OF THE DRAWINGS


FIG. 1 is a representative pseudo-four dimensional plot of base compositions of bioagent identifying amplicons of enterobacteria obtained with a primer pair targeting the rpoB gene (primer pair no. 14; SEQ ID NOs: 37:362). The quantities of nucleobases A, G and C are represented on the three axes of the plot while the quantity of nucleobase T is represented by the diameter of the spheres. Base composition probability clouds surrounding the spheres are also shown.



FIG. 2 is a representative diagram illustrating the primer selection process.



FIG. 3 lists common pathogenic bacteria and primer pair coverage. The primer pair number in the upper right hand corner of each polygon indicates that the primer pair can produce a bioagent identifying amplicon for all species within that polygon.



FIG. 4 is a representative 3D diagram of base composition (axes A, G and C) of bioagent identifying amplicons obtained with primer pair number 14 (a precursor of primer pair number 348 which targets 16S rRNA). The diagram indicates that the experimentally determined base compositions of the clinical samples (labeled NHRC samples) closely match the base compositions expected for Streptococcus pyogenes and are distinct from the expected base compositions of other organisms.



FIG. 5 is a representative mass spectrum of amplification products representing bioagent identifying amplicons of Streptococcus pyogenes, Neisseria meningitidis, and Haemophilus influenzae obtained from amplification of nucleic acid from a clinical sample with primer pair number 349 which targets 23S rRNA. Experimentally determined molecular masses and base compositions for the sense strand of each amplification product are shown.



FIG. 6 is a representative mass spectrum of amplification products representing a bioagent identifying amplicon of Streptococcus pyogenes, and a calibration amplicon obtained from amplification of nucleic acid from a clinical sample with primer pair number 356 which targets rplB. The experimentally determined molecular mass and base composition for the sense strand of the Streptococcus pyogenes amplification product is shown.



FIG. 7 is a representative process diagram for identification and determination of the quantity of a bioagent in a sample.



FIG. 8 is a representative mass spectrum of an amplified nucleic acid mixture which contained the Ames strain of Bacillus anthracis, a known quantity of combination calibration polynucleotide (SEQ ID NO: 741), and primer pair number 350 which targets the capC gene on the virulence plasmid pXO2 of Bacillus anthracis. Calibration amplicons produced in the amplification reaction are visible in the mass spectrum as indicated and abundance data (peak height) are used to calculate the quantity of the Ames strain of Bacillus anthracis.





DESCRIPTION OF EMBODIMENTS

The present invention provides oligonucleotide primers which hybridize to conserved regions of nucleic acid of genes encoding, for example, proteins or RNAs necessary for life which include, but are not limited to: 16S and 23S rRNAs, RNA polymerase subunits, tRNA synthetases, elongation factors, ribosomal proteins, protein chain initiation factors, cell division proteins, chaperonin groEL, chaperonin dnaK, phosphoglycerate kinase, NADH dehydrogenase, DNA ligases, metabolic enzymes and DNA topoisomerases. These primers provide the functionality of producing, for example, bacterial bioagent identifying amplicons for general identification of bacteria at the species level, for example, when contacted with bacterial nucleic acid under amplification conditions.


Referring to FIG. 2, primers are designed as follows: for each group of organisms, candidate target sequences are identified (200) from which nucleotide alignments are created (210) and analyzed (220). Primers are designed by selecting appropriate priming regions (230) which allows the selection of candidate primer pairs (240). The primer pairs are subjected to in silico analysis by electronic PCR (ePCR) (300) wherein bioagent identifying amplicons are obtained from sequence databases such as, for example, GenBank or other sequence collections (310), and checked for specificity in silico (320). Bioagent identifying amplicons obtained from GenBank sequences (310) can also be analyzed by a probability model which predicts the capability of a particular amplicon to identify unknown bioagents such that the base compositions of amplicons with favorable probability scores are stored in a base composition database (325). Alternatively, base compositions of the bioagent identifying amplicons obtained from the primers and GenBank sequences can be directly entered into the base composition database (330). Candidate primer pairs (240) are validated by in vitro amplification by a method such as, for example, PCR analysis (400) of nucleic acid from a collection of organisms (410). Amplification products that are obtained are optionally analyzed to confirm the sensitivity, specificity and reproducibility of the primers used to obtain the amplification products (420).
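The in silico ePCR step (300) and the derivation of a base composition for the database (330) can be sketched in Python as follows. This is a simplified illustration that assumes exact primer matching on a single strand; practical ePCR tools generally tolerate mismatches and search both strands. The function names and the toy sequences are hypothetical.

```python
from collections import Counter

# Translation table for complementing unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def electronic_pcr(template, forward, reverse):
    """Predict the amplicon (primers included) on the given strand.

    Exact matching only; returns None when either primer site is absent.
    """
    start = template.find(forward)
    if start == -1:
        return None
    # The reverse primer anneals to the opposite strand, so its site on
    # this strand is the reverse complement of the primer sequence.
    site = reverse_complement(reverse)
    end = template.find(site, start + len(forward))
    if end == -1:
        return None
    return template[start:end + len(site)]


def base_composition(seq):
    """Count A, C, G and T in a sequence."""
    counts = Counter(seq)
    return {base: counts.get(base, 0) for base in "ACGT"}


# Toy template: flank + forward site + variable region + reverse site + flank.
template = "TTT" + "ACGTAC" + "GGGTTTCCC" + "GATTACA" + "AAA"
forward_primer = "ACGTAC"
reverse_primer = reverse_complement("GATTACA")

amplicon = electronic_pcr(template, forward_primer, reverse_primer)
composition = base_composition(amplicon)
print(amplicon, composition)
```

Running the same function over each sequence in a collection would yield one predicted base composition per organism, which is the kind of entry a base composition database would hold.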


Synthesis of primers is well known and routine in the art. The primers may be conveniently and routinely made through the well-known technique of solid phase synthesis. Equipment for such synthesis is sold by several vendors including, for example, Applied Biosystems (Foster City, Calif.). Any other means for such synthesis known in the art may additionally or alternatively be employed.


The primers can be employed as compositions for use in, for example, methods for identification of bacterial bioagents as follows. In some embodiments, a primer pair composition is contacted with nucleic acid of an unknown bacterial bioagent. The nucleic acid is amplified by a nucleic acid amplification technique, such as PCR for example, to obtain an amplification product that represents a bioagent identifying amplicon. The molecular mass of one strand or each strand of the double-stranded amplification product is determined by a molecular mass measurement technique such as, for example, mass spectrometry wherein the two strands of the double-stranded amplification product are separated during the ionization process. In some embodiments, the mass spectrometry is electrospray Fourier transform ion cyclotron resonance mass spectrometry (ESI-FTICR-MS) or electrospray time of flight mass spectrometry (ESI-TOF-MS). A list of possible base compositions can be generated for the molecular mass value obtained for each strand and the choice of the correct base composition from the list is facilitated by matching the base composition of one strand with a complementary base composition of the other strand. The molecular mass or base composition thus determined is compared with a database of molecular masses or base compositions of analogous bioagent identifying amplicons for known bacterial bioagents. A match between the molecular mass or base composition of the amplification product from the unknown bacterial bioagent and the molecular mass or base composition of an analogous bioagent identifying amplicon for a known bacterial bioagent indicates the identity of the unknown bioagent.
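
The strand-matching step described above can be sketched as follows. This is a minimal illustration that assumes each candidate base composition is written as a tuple (A, G, C, T) and that the two strands are exact complements of one another (for example, no non-templated residues have been added); the function name is hypothetical.

```python
def complementary_pairs(forward_candidates, reverse_candidates):
    """Keep only candidate compositions of one strand whose complement
    appears among the candidates of the other strand.

    Each composition is an (A, G, C, T) tuple of residue counts.
    """
    reverse_set = set(reverse_candidates)
    pairs = []
    for a, g, c, t in forward_candidates:
        # A pairs with T and G pairs with C, so the complementary strand
        # has A and T counts swapped and G and C counts swapped.
        complement = (t, c, g, a)
        if complement in reverse_set:
            pairs.append(((a, g, c, t), complement))
    return pairs
```

A candidate composition for one strand that has no complementary counterpart in the list for the other strand can be discarded, which shortens the list of base compositions to be compared with the database.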


In some embodiments, the primer pair used is one of the primer pairs of Table 1. In some embodiments, the method is repeated using a different primer pair to resolve possible ambiguities in the identification process or to improve the confidence level for the identification assignment.


In some embodiments, a bioagent identifying amplicon may be produced using only a single primer (either the forward or reverse primer of any given primer pair), provided an appropriate amplification method is chosen, such as, for example, low stringency single primer PCR (LSSP-PCR). Adaptation of this amplification method in order to produce bioagent identifying amplicons can be accomplished by one with ordinary skill in the art without undue experimentation.


In some embodiments, the oligonucleotide primers are “broad range survey primers” which hybridize to conserved regions of nucleic acid encoding RNA, such as ribosomal RNA (rRNA), of all, or at least 70%, at least 80%, at least 85%, at least 90%, or at least 95% of known bacteria and produce bacterial bioagent identifying amplicons. As used herein, the term “broad range survey primers” refers to primers that bind to nucleic acid encoding rRNAs of all, or at least 70%, at least 80%, at least 85%, at least 90%, or at least 95% known species of bacteria. In some embodiments, the rRNAs to which the primers hybridize are 16S and 23S rRNAs. In some embodiments, the broad range survey primer pairs comprise oligonucleotides ranging in length from 13 to 35 nucleobases, each of which have from 70% to 100% sequence identity with primer pair numbers 3, 10, 11, 14, 16, and 17 which consecutively correspond to SEQ ID NOs: 6:369, 26:388, 29:391, 37:362, 48:404, and 58:414.


In some cases, the molecular mass or base composition of a bacterial bioagent identifying amplicon defined by a broad range survey primer pair does not provide enough resolution to unambiguously identify a bacterial bioagent at the species level. These cases benefit from further analysis of one or more bacterial bioagent identifying amplicons generated from at least one additional broad range survey primer pair or from at least one additional “division-wide” primer pair (vide infra). The employment of more than one bioagent identifying amplicon for identification of a bioagent is herein referred to as “triangulation identification” (vide infra).


In other embodiments, the oligonucleotide primers are “division-wide” primers which hybridize to nucleic acid encoding genes of broad divisions of bacteria such as, for example, members of the Bacillus/Clostridia group or members of the α-, β-, γ-, and ε-proteobacteria. In some embodiments, a division of bacteria comprises any grouping of bacterial genera with more than one genus represented. For example, the β-proteobacteria group comprises members of the following genera: Eikenella, Neisseria, Achromobacter, Bordetella, Burkholderia, and Ralstonia. Species members of these genera can be identified using bacterial bioagent identifying amplicons generated with primer pair 293 (SEQ ID NOs: 344:700) which produces a bacterial bioagent identifying amplicon from the tufB gene of β-proteobacteria. Examples of genes to which division-wide primers may hybridize include, but are not limited to: RNA polymerase subunits such as rpoB and rpoC, tRNA synthetases such as valyl-tRNA synthetase (valS) and aspartyl-tRNA synthetase (aspS), elongation factors such as elongation factor EF-Tu (tufB), ribosomal proteins such as ribosomal protein L2 (rplB), protein chain initiation factors such as protein chain initiation factor infB, chaperonins such as groL and dnaK, and cell division proteins such as peptidase ftsH (hflB). In some embodiments, the division-wide primer pairs comprise oligonucleotides ranging in length from 13 to 35 nucleobases, each of which have from 70% to 100% sequence identity with primer pair numbers 34, 52, 66, 67, 71, 72, 289, 290 and 293 which consecutively correspond to SEQ ID NOs: 160:515, 261:624, 231:591, 235:587, 349:711, 240:596, 246:602, 256:620, 344:700.


In other embodiments, the oligonucleotide primers are designed to enable the identification of bacteria at the clade group level, which is a monophyletic taxon referring to a group of organisms which includes the most recent common ancestor of all of its members and all of the descendants of that most recent common ancestor. The Bacillus cereus clade is an example of a bacterial clade group. In some embodiments, the clade group primer pairs comprise oligonucleotides ranging in length from 13 to 35 nucleobases, each of which have from 70% to 100% sequence identity with primer pair number 58 which corresponds to SEQ ID NOs: 322:686.


In other embodiments, the oligonucleotide primers are “drill-down” primers which enable the identification of species or “sub-species characteristics.” Sub-species characteristics are herein defined as genetic characteristics that provide the means to distinguish two members of the same bacterial species. For example, Escherichia coli O157:H7 and Escherichia coli K12 are two well known members of the species Escherichia coli. Escherichia coli O157:H7, however, is highly toxic due to its Shiga toxin gene which is an example of a sub-species characteristic. Examples of sub-species characteristics may also include, but are not limited to: variations in genes such as single nucleotide polymorphisms (SNPs) and variable number tandem repeats (VNTRs). Examples of genes indicating sub-species characteristics include, but are not limited to, housekeeping genes, toxin genes, pathogenicity markers, antibiotic resistance genes and virulence factors. Drill-down primers provide the functionality of producing bacterial bioagent identifying amplicons for drill-down analyses such as strain typing when contacted with bacterial nucleic acid under amplification conditions. Identification of such sub-species characteristics is often critical for determining proper clinical treatment of bacterial infections. Examples of pairs of drill-down primers include, but are not limited to, a trio of primer pairs for identification of strains of Bacillus anthracis. Primer pair 24 (SEQ ID NOs: 97:451) targets the capC gene of virulence plasmid pX02, primer pair 30 (SEQ ID NOs: 127:482) targets the cyA gene of virulence plasmid pX02, and primer pair 37 (SEQ ID NOs: 174:530) targets the lef gene of virulence plasmid pX02. Additional examples of drill-down primers include, but are not limited to, six primer pairs that are used for determining the strain type of group A Streptococcus. Primer pair 80 (SEQ ID NOs: 310:668) targets the gki gene, primer pair 81 (SEQ ID NOs: 313:670) targets the gtr gene, primer pair 86 (SEQ ID NOs: 227:632) targets the murI gene, primer pair 90 (SEQ ID NOs: 285:640) targets the mutS gene, primer pair 96 (SEQ ID NOs: 301:656) targets the xpt gene, and primer pair 98 (SEQ ID NOs: 308:663) targets the yqiL gene.


In some embodiments, the primers used for amplification hybridize to and amplify genomic DNA, DNA of bacterial plasmids, or DNA of DNA viruses.


In some embodiments, the primers used for amplification hybridize directly to ribosomal RNA or messenger RNA (mRNA) and act as reverse transcription primers for obtaining DNA from direct amplification of bacterial RNA or rRNA. Methods of amplifying RNA using reverse transcriptase are well known to those with ordinary skill in the art and can be routinely established without undue experimentation.


One with ordinary skill in the art of design of amplification primers will recognize that a given primer need not hybridize with 100% complementarity in order to effectively prime the synthesis of a complementary nucleic acid strand in an amplification reaction. Moreover, a primer may hybridize over one or more segments such that intervening or adjacent segments are not involved in the hybridization event (e.g., a loop structure or a hairpin structure). The primers of the present invention may comprise at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95% or at least 99% sequence identity with any of the primers listed in Table 1. Thus, in some embodiments of the present invention, an extent of variation of 70% to 100%, or any range therewithin, of the sequence identity is possible relative to the specific primer sequences disclosed herein. Determination of sequence identity is described in the following example: a primer 20 nucleobases in length which is otherwise identical to another 20 nucleobase primer but having two non-identical residues has 18 of 20 identical residues (18/20=0.9 or 90% sequence identity). In another example, a primer 15 nucleobases in length having all residues identical to a 15 nucleobase segment of a primer 20 nucleobases in length would have 15/20=0.75 or 75% sequence identity with the 20 nucleobase primer.
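
The two worked examples above can be reproduced with a short sketch. It assumes an ungapped comparison in which the candidate primer is aligned at a given offset along the reference primer and the number of identical residues is divided by the length of the reference primer, as in both examples; the function name and the sequences used are illustrative only.

```python
def percent_identity(candidate, reference, offset=0):
    """Percent sequence identity of candidate relative to reference.

    The candidate is aligned without gaps starting at position `offset`
    of the reference; identical residues are counted and divided by the
    length of the reference primer.
    """
    matches = sum(
        1
        for i, base in enumerate(candidate)
        if 0 <= offset + i < len(reference) and reference[offset + i] == base
    )
    return 100.0 * matches / len(reference)


# Hypothetical 20 nucleobase reference primer (not a sequence from Table 1).
reference = "ACGTACGTACGTACGTACGT"

# Same length, two non-identical residues: 18/20 identical residues.
two_changes = reference[:3] + "A" + reference[4:10] + "C" + reference[11:]
identity_full_length = percent_identity(two_changes, reference)

# A 15 nucleobase primer identical to a 15 nucleobase segment: 15/20.
segment = reference[2:17]
identity_segment = percent_identity(segment, reference, offset=2)
```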


Percent homology, sequence identity or complementarity, can be determined by, for example, the Gap program (Wisconsin Sequence Analysis Package, Version 8 for Unix, Genetics Computer Group, University Research Park, Madison Wis.), using default settings, which uses the algorithm of Smith and Waterman (Adv. Appl. Math., 1981, 2, 482-489). In some embodiments, homology, sequence identity, or complementarity of primers with respect to the conserved priming regions of bacterial nucleic acid, is at least 70%, at least 80%, at least 90%, at least 92%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or is 100%.


In some embodiments, the primers described herein comprise at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 92%, at least 94%, at least 95%, at least 96%, at least 98%, or at least 99%, or 100% (or any range therewithin) sequence identity with the primer sequences specifically disclosed herein. Thus, for example, a primer may have between 70% and 100%, between 75% and 100%, between 80% and 100%, and between 95% and 100% sequence identity with SEQ ID NO: 26. Likewise, a primer may have similar sequence identity with any other primer whose nucleotide sequence is disclosed herein.


One with ordinary skill is able to calculate percent sequence identity or percent sequence homology and able to determine, without undue experimentation, the effects of variation of primer sequence identity on the function of the primer in its role in priming synthesis of a complementary strand of nucleic acid for production of an amplification product of a corresponding bioagent identifying amplicon.


In some embodiments of the present invention, the oligonucleotide primers are between 13 and 35 nucleobases in length (13 to 35 linked nucleotide residues). These embodiments comprise oligonucleotide primers 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34 or 35 nucleobases in length, or any range therewithin.


In some embodiments, any given primer comprises a modification comprising the addition of a non-templated T residue to the 5′ end of the primer (i.e., the added T residue does not necessarily hybridize to the nucleic acid being amplified). The addition of a non-templated T residue has an effect of minimizing the addition of non-templated A residues as a result of the non-specific enzyme activity of Taq polymerase (Magnuson et al. Biotechniques, 1996, 21, 700-709), an occurrence which may lead to ambiguous results arising from molecular mass analysis.


In some embodiments of the present invention, primers may contain one or more universal bases. Because any variation (due to codon wobble in the 3rd position) in the conserved regions among species is likely to occur in the third position of a DNA triplet, oligonucleotide primers can be designed such that the nucleotide corresponding to this position is a base which can bind to more than one nucleotide, referred to herein as a “universal nucleobase.” For example, under this “wobble” pairing, inosine (I) binds to U, C or A; guanine (G) binds to U or C, and uridine (U) binds to U or C. Other examples of universal nucleobases include nitroindoles such as 5-nitroindole or 3-nitropyrrole (Loakes et al., Nucleosides and Nucleotides, 1995, 14, 1001-1003), the degenerate nucleotides dP or dK (Hill et al.), an acyclic nucleoside analog containing 5-nitroindazole (Van Aerschot et al., Nucleosides and Nucleotides, 1995, 14, 1053-1056) or the purine analog 1-(2-deoxy-β-D-ribofuranosyl)-imidazole-4-carboxamide (Sala et al., Nucl. Acids Res., 1996, 24, 3302-3306).


In some embodiments, to compensate for the somewhat weaker binding by the “wobble” base, the oligonucleotide primers are designed such that the first and second positions of each triplet are occupied by nucleotide analogs which bind with greater affinity than the unmodified nucleotide. Examples of these analogs include, but are not limited to, 2,6-diaminopurine which binds to thymine, 5-propynyluracil which binds to adenine and 5-propynylcytosine and phenoxazines, including G-clamp, which binds to G. Propynylated pyrimidines are described in U.S. Pat. Nos. 5,645,985, 5,830,653 and 5,484,908, each of which is commonly owned and incorporated herein by reference in its entirety. Propynylated primers are described in U.S. Ser. No. 10/294,203 which is also commonly owned and incorporated herein by reference in entirety. Phenoxazines are described in U.S. Pat. Nos. 5,502,177, 5,763,588, and 6,005,096, each of which is incorporated herein by reference in its entirety. G-clamps are described in U.S. Pat. Nos. 6,007,992 and 6,028,183, each of which is incorporated herein by reference in its entirety.


In some embodiments, non-template primer tags are used to increase the melting temperature (Tm) of a primer-template duplex in order to improve amplification efficiency. A non-template tag is at least three consecutive A or T nucleotide residues on a primer which are not complementary to the template. In any given non-template tag, A can be replaced by C or G and T can also be replaced by C or G. Although Watson-Crick hybridization is not expected to occur for a non-template tag relative to the template, the extra hydrogen bond in a G-C pair relative to an A-T pair confers increased stability of the primer-template duplex and improves amplification efficiency for subsequent cycles of amplification when the primers hybridize to strands synthesized in previous cycles.


In other embodiments, propynylated tags may be used in a manner similar to that of the non-template tag, wherein two or more 5-propynylcytidine or 5-propynyluridine residues replace template matching residues on a primer. In other embodiments, a primer contains a modified internucleoside linkage such as a phosphorothioate linkage, for example.


In some embodiments, the primers contain mass-modifying tags. Reducing the total number of possible base compositions of a nucleic acid of specific molecular weight provides a means of avoiding a persistent source of ambiguity in determination of base composition of amplification products. Addition of mass-modifying tags to certain nucleobases of a given primer will result in simplification of de novo determination of base composition of a given bioagent identifying amplicon (vide infra) from its molecular mass.


In some embodiments of the present invention, the mass modified nucleobase comprises one or more of the following, for example: 7-deaza-2′-deoxyadenosine-5′-triphosphate, 5-iodo-2′-deoxyuridine-5′-triphosphate, 5-bromo-2′-deoxyuridine-5′-triphosphate, 5-bromo-2′-deoxycytidine-5′-triphosphate, 5-iodo-2′-deoxycytidine-5′-triphosphate, 5-hydroxy-2′-deoxyuridine-5′-triphosphate, 4-thiothymidine-5′-triphosphate, 5-aza-2′-deoxyuridine-5′-triphosphate, 5-fluoro-2′-deoxyuridine-5′-triphosphate, O6-methyl-2′-deoxyguanosine-5′-triphosphate, N2-methyl-2′-deoxyguanosine-5′-triphosphate, 8-oxo-2′-deoxyguanosine-5′-triphosphate or thiothymidine-5′-triphosphate. In some embodiments, the mass-modified nucleobase comprises 15N or 13C or both 15N and 13C.


In some embodiments of the present invention, at least one bacterial nucleic acid segment is amplified in the process of identifying the bioagent. Thus, the nucleic acid segments that can be amplified by the primers disclosed herein and that provide enough variability to distinguish each individual bioagent and whose molecular masses are amenable to molecular mass determination are herein described as “bioagent identifying amplicons.” The term “amplicon” as used herein, refers to a segment of a polynucleotide which is amplified in an amplification reaction. In some embodiments of the present invention, bioagent identifying amplicons comprise from about 45 to about 200 nucleobases (i.e. from about 45 to about 200 linked nucleosides), from about 60 to about 150 nucleobases, from about 75 to about 125 nucleobases. One of ordinary skill in the art will appreciate that the invention embodies compounds of 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 170, 171, 172, 173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183, 184, 185, 186, 187, 188, 189, 190, 191, 192, 193, 194, 195, 196, 197, 198, 199, and 200 nucleobases in length, or any range therewithin. It is the combination of the portions of the bioagent nucleic acid segment to which the primers hybridize (hybridization sites) and the variable region between the primer hybridization sites that comprises the bioagent identifying amplicon. 
Since genetic data provide the underlying basis for identification of bioagents by the methods of the present invention, it is prudent to select segments of nucleic acids which ideally provide enough variability to distinguish each individual bioagent and whose molecular mass is amenable to molecular mass determination.


In some embodiments, bioagent identifying amplicons amenable to molecular mass determination which are produced by the primers described herein are either of a length, size or mass compatible with the particular mode of molecular mass determination or compatible with a means of providing a predictable fragmentation pattern in order to obtain predictable fragments of a length compatible with the particular mode of molecular mass determination. Such means of providing a predictable fragmentation pattern of an amplification product include, but are not limited to, cleavage with restriction enzymes or cleavage primers, for example. Methods of using restriction enzymes and cleavage primers are well known to those with ordinary skill in the art.


In some embodiments, amplification products corresponding to bacterial bioagent identifying amplicons are obtained using the polymerase chain reaction (PCR) which is a routine method to those with ordinary skill in the molecular biology arts. Other amplification methods may be used such as ligase chain reaction (LCR), low-stringency single primer PCR, and multiple strand displacement amplification (MDA) which are also well known to those with ordinary skill.


In the context of this invention, a “bioagent” is any organism, cell, or virus, living or dead, or a nucleic acid derived from such an organism, cell or virus. Examples of bioagents include, but are not limited to, cells (including but not limited to human clinical samples, bacterial cells and other pathogens), viruses, fungi, protists, parasites, and pathogenicity markers (including but not limited to: pathogenicity islands, antibiotic resistance genes, virulence factors, toxin genes and other bioregulating compounds). Samples may be alive or dead or in a vegetative state (for example, vegetative bacteria or spores) and may be encapsulated or bioengineered. In the context of this invention, a “pathogen” is a bioagent which causes a disease or disorder.


In the context of this invention, the term “unknown bioagent” may mean either: (i) a bioagent whose existence is known (such as the well known bacterial species Staphylococcus aureus for example) but which is not known to be in a sample to be analyzed, or (ii) a bioagent whose existence is not known (for example, the SARS coronavirus was unknown prior to April 2003). For example, if the method for identification of coronaviruses disclosed in commonly owned U.S. patent Ser. No. 10/829,826 (incorporated herein by reference in its entirety) was to be employed prior to April 2003 to identify the SARS coronavirus in a clinical sample, both meanings of “unknown” bioagent are applicable since the SARS coronavirus was unknown to science prior to April, 2003 and since it was not known what bioagent (in this case a coronavirus) was present in the sample. On the other hand, if the method of U.S. patent Ser. No. 10/829,826 was to be employed subsequent to April 2003 to identify the SARS coronavirus in a clinical sample, only the first meaning (i) of “unknown” bioagent would apply since the SARS coronavirus became known to science subsequent to April 2003 and since it was not known what bioagent was present in the sample.


The employment of more than one bioagent identifying amplicon for identification of a bioagent is herein referred to as “triangulation identification.” Triangulation identification is pursued by analyzing a plurality of bioagent identifying amplicons selected within multiple core genes. This process is used to reduce false negative and false positive signals, and enable reconstruction of the origin of hybrid or otherwise engineered bioagents. For example, identification of the three part toxin genes typical of B. anthracis (Bowen et al., J. Appl. Microbiol., 1999, 87, 270-278) in the absence of the expected signatures from the B. anthracis genome would suggest a genetic engineering event.
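
One simple way to picture triangulation identification is as the intersection of the candidate identifications returned for each bioagent identifying amplicon. The sketch below is a simplification under that assumption; the function name and the organism and primer pair labels in the usage example are placeholders, not data from this disclosure.

```python
def triangulate(candidates_by_primer_pair):
    """Return the organisms consistent with every primer pair.

    `candidates_by_primer_pair` maps a primer pair label to the
    collection of organisms whose database entry matches the amplicon
    measured with that primer pair.
    """
    candidate_sets = [set(c) for c in candidates_by_primer_pair.values()]
    if not candidate_sets:
        return set()
    return set.intersection(*candidate_sets)


# Placeholder labels for illustration only.
example = {
    "primer pair X": {"organism 1", "organism 2", "organism 3"},
    "primer pair Y": {"organism 2", "organism 3"},
    "primer pair Z": {"organism 2", "organism 4"},
}
consistent = triangulate(example)
```

An empty intersection, or toxin gene signatures with no matching genomic signature, is the kind of result that would prompt the closer examination for an engineering event described above.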


In some embodiments, the triangulation identification process can be pursued by characterization of bioagent identifying amplicons in a massively parallel fashion using the polymerase chain reaction (PCR), such as multiplex PCR where multiple primers are employed in the same amplification reaction mixture, or PCR in multi-well plate format wherein a different and unique pair of primers is used in multiple wells containing otherwise identical reaction mixtures. Such multiplex and multi-well PCR methods are well known to those with ordinary skill in the arts of rapid throughput amplification of nucleic acids.


In some embodiments, the molecular mass of a particular bioagent identifying amplicon is determined by mass spectrometry. Mass spectrometry has several advantages, not the least of which is high bandwidth characterized by the ability to separate (and isolate) many molecular peaks across a broad range of mass to charge ratio (m/z). Thus, mass spectrometry is intrinsically a parallel detection scheme without the need for radioactive or fluorescent labels, since every amplification product is identified by its molecular mass. The current state of the art in mass spectrometry is such that less than femtomole quantities of material can be readily analyzed to afford information about the molecular contents of the sample. An accurate assessment of the molecular mass of the material can be quickly obtained, irrespective of whether the molecular weight of the sample is several hundred, or in excess of one hundred thousand atomic mass units (amu) or Daltons.


In some embodiments, intact molecular ions are generated from amplification products using one of a variety of ionization techniques to convert the sample to gas phase. These ionization methods include, but are not limited to, electrospray ionization (ES), matrix-assisted laser desorption ionization (MALDI) and fast atom bombardment (FAB). Upon ionization, several peaks are observed from one sample due to the formation of ions with different charges. Averaging the multiple readings of molecular mass obtained from a single mass spectrum affords an estimate of molecular mass of the bioagent identifying amplicon. Electrospray ionization mass spectrometry (ESI-MS) is particularly useful for very high molecular weight polymers such as proteins and nucleic acids having molecular weights greater than 10 kDa, since it yields a distribution of multiply-charged molecules of the sample without causing a significant amount of fragmentation.
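
The averaging over charge states mentioned above can be sketched as follows. The sketch assumes that the charge state of each peak has already been assigned and, by default, that the ions are formed in negative-ion mode by loss of z protons, so that the neutral mass is M = z × (m/z + mass of a proton); the function names are illustrative and other adducts are ignored.

```python
PROTON_MASS = 1.007276  # Da


def neutral_mass(mz, charge, negative_mode=True):
    """Neutral molecular mass from one multiply-charged peak.

    Negative mode assumes an [M - zH]z- ion; positive mode assumes
    an [M + zH]z+ ion.
    """
    if negative_mode:
        return charge * (mz + PROTON_MASS)
    return charge * (mz - PROTON_MASS)


def average_neutral_mass(peaks, negative_mode=True):
    """Average the neutral masses implied by a list of (m/z, charge) peaks."""
    masses = [neutral_mass(mz, z, negative_mode) for mz, z in peaks]
    return sum(masses) / len(masses)
```

Each charge state gives an independent reading of the same molecular mass, so averaging the readings reduces the effect of random error in any single peak.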


The mass detectors used in the methods of the present invention include, but are not limited to, Fourier transform ion cyclotron resonance mass spectrometry (FT-ICR-MS), time of flight (TOF), ion trap, quadrupole, magnetic sector, Q-TOF, and triple quadrupole.


In some embodiments, conversion of molecular mass data to a base composition is useful for certain analyses. As used herein, a “base composition” is the exact number of each nucleobase (A, T, C and G). For example, amplification of nucleic acid of Neisseria meningitidis with a primer pair produces an amplification product from nucleic acid of 23S rRNA that has a molecular mass (sense strand) of 28480.75124, from which a base composition of A25 G27 C22 T18 is assigned from a list of possible base compositions calculated from the molecular mass using standard known molecular masses of each of the four nucleobases.
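
The calculation of a list of possible base compositions from a molecular mass can be sketched as a brute-force enumeration. The sketch assumes monoisotopic masses of the four 2′-deoxynucleotide residues, an unmodified linear strand with 5′-hydroxyl and 3′-hydroxyl termini, and a mass tolerance expressed in parts per million; the function names and the default tolerance are illustrative choices rather than parameters taken from this disclosure. Under these assumptions, the composition A25 G27 C22 T18 reproduces the molecular mass of the example above to within 0.01 Da.

```python
# Monoisotopic masses (Da) of 2'-deoxynucleotide residues within a strand.
RESIDUE_MASS = {"A": 313.05761, "G": 329.05252, "C": 289.04637, "T": 304.04604}
# Correction for 5'-OH and 3'-OH termini: add H2O, remove HPO3.
END_GROUP_MASS = -61.95577


def strand_mass(a, g, c, t):
    """Mass of a single strand with the given residue counts."""
    return (a * RESIDUE_MASS["A"] + g * RESIDUE_MASS["G"]
            + c * RESIDUE_MASS["C"] + t * RESIDUE_MASS["T"]
            + END_GROUP_MASS)


def candidate_compositions(measured_mass, ppm=10.0):
    """List all (A, G, C, T) compositions within `ppm` of the measured mass."""
    tol = measured_mass * ppm / 1e6
    target = measured_mass - END_GROUP_MASS
    m_a, m_g, m_c, m_t = (RESIDUE_MASS[b] for b in "AGCT")
    results = []
    for a in range(int((target + tol) // m_a) + 1):
        rest_a = target - a * m_a
        for g in range(int((rest_a + tol) // m_g) + 1):
            rest_g = rest_a - g * m_g
            for c in range(int((rest_g + tol) // m_c) + 1):
                rest_c = rest_g - c * m_c
                # Solve for the T count that best explains the remainder.
                t = round(rest_c / m_t)
                if t >= 0 and abs(rest_c - t * m_t) <= tol:
                    results.append((a, g, c, t))
    return results
```

The list returned for a single strand generally contains more than one composition; requiring that the composition of one strand be complementary to that of the other strand narrows the choice.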


In some embodiments, assignment of base compositions to experimentally determined molecular masses is accomplished using “base composition probability clouds.” Base compositions, like sequences, vary slightly from isolate to isolate within species. It is possible to manage this diversity by building “base composition probability clouds” around the composition constraints for each species. This permits identification of organisms in a fashion similar to sequence analysis. A “pseudo four-dimensional plot” (FIG. 1) can be used to visualize the concept of base composition probability clouds. Optimal primer design requires optimal choice of bioagent identifying amplicons and maximizes the separation between the base composition signatures of individual bioagents. Areas where clouds overlap indicate regions that may result in a misclassification, a problem which is overcome by a triangulation identification process using bioagent identifying amplicons not affected by overlap of base composition probability clouds.
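
As a rough illustration only, a base composition probability cloud can be approximated by the set of base compositions already observed for a species, with a measured composition assigned to any species that has an observed composition within a small distance of it. The distance measure (the sum of absolute differences in the A, G, C and T counts), the threshold, and the function name are assumptions made for this sketch and do not represent the probability model itself.

```python
def nearest_species(measured, clouds, max_distance=2):
    """Return species whose cloud lies within max_distance of `measured`.

    `measured` is an (A, G, C, T) tuple; `clouds` maps a species label to
    a list of (A, G, C, T) compositions observed for that species.
    Species are returned nearest first.
    """
    def distance(x, y):
        return sum(abs(p - q) for p, q in zip(x, y))

    scores = {
        species: min(distance(measured, comp) for comp in comps)
        for species, comps in clouds.items()
        if comps
    }
    within = sorted((d, s) for s, d in scores.items() if d <= max_distance)
    return [s for d, s in within]
```

A single base substitution changes two counts by one each, giving a distance of 2 under this measure. When more than one species is returned, the clouds overlap for that amplicon and a further bioagent identifying amplicon would be needed to resolve the assignment.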


In some embodiments, base composition probability clouds provide the means for screening potential primer pairs in order to avoid potential misclassifications of base compositions. In other embodiments, base composition probability clouds provide the means for predicting the identity of a bioagent whose assigned base composition was not previously observed and/or indexed in a bioagent identifying amplicon base composition database due to evolutionary transitions in its nucleic acid sequence. Thus, in contrast to probe-based techniques, mass spectrometry determination of base composition does not require prior knowledge of the composition or sequence in order to make the measurement.


The present invention provides bioagent classifying information similar to DNA sequencing and phylogenetic analysis at a level sufficient to identify a given bioagent. Furthermore, the process of determination of a previously unknown base composition for a given bioagent (for example, in a case where sequence information is unavailable) has downstream utility by providing additional bioagent indexing information with which to populate base composition databases. The process of future bioagent identification is thus greatly improved as more BCS indexes become available in base composition databases.


In one embodiment, a sample comprising an unknown bioagent is contacted with a pair of primers which provide the means for amplification of nucleic acid from the bioagent, and a known quantity of a polynucleotide that comprises a calibration sequence. The nucleic acids of the bioagent and of the calibration sequence are amplified and the rate of amplification is reasonably assumed to be similar for the nucleic acid of the bioagent and of the calibration sequence. The amplification reaction then produces two amplification products: a bioagent identifying amplicon and a calibration amplicon. The bioagent identifying amplicon and the calibration amplicon should be distinguishable by molecular mass while being amplified at essentially the same rate. Effecting differential molecular masses can be accomplished by choosing as a calibration sequence, a representative bioagent identifying amplicon (from a specific species of bioagent) and performing, for example, a 2 to 8 nucleobase deletion or insertion within the variable region between the two priming sites. The amplified sample containing the bioagent identifying amplicon and the calibration amplicon is then subjected to molecular mass analysis by mass spectrometry, for example. The resulting molecular mass analysis of the nucleic acid of the bioagent and of the calibration sequence provides molecular mass data and abundance data for the nucleic acid of the bioagent and of the calibration sequence. The molecular mass data obtained for the nucleic acid of the bioagent enables identification of the unknown bioagent and the abundance data enables calculation of the quantity of the bioagent, based on the knowledge of the quantity of calibration polynucleotide contacted with the sample.
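
The quantity calculation can be sketched as a simple ratio. The sketch assumes, as stated above, that the bioagent nucleic acid and the calibration sequence amplify at essentially the same rate, so that the ratio of the measured abundances equals the ratio of the starting quantities; the function name and the units are illustrative.

```python
def estimate_bioagent_quantity(bioagent_abundance, calibrant_abundance,
                               calibrant_quantity):
    """Estimate the starting quantity of bioagent nucleic acid.

    Abundances are the measured signal abundances of the bioagent
    identifying amplicon and of the calibration amplicon; the result is
    in the same units as `calibrant_quantity` (for example, copies).
    """
    if calibrant_abundance <= 0:
        # No calibration amplicon was measured, so no estimate is possible.
        raise ValueError("calibration amplicon was not detected")
    return calibrant_quantity * bioagent_abundance / calibrant_abundance
```

For instance, if 100 copies of calibration polynucleotide were added and the bioagent identifying amplicon shows three times the abundance of the calibration amplicon, the estimate is 300 copies of bioagent nucleic acid.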


In some embodiments, the identity and quantity of a particular bioagent is determined using the process illustrated in FIG. 7. For instance, to a sample containing nucleic acid of an unknown bioagent are added primers (500) and a known quantity of a calibration polynucleotide (505). The total nucleic acid in the sample is subjected to an amplification reaction (510) to obtain amplification products. The molecular masses of amplification products are determined (515) from which are obtained molecular mass and abundance data. The molecular mass of the bioagent identifying amplicon (520) provides the means for its identification (525) and the molecular mass of the calibration amplicon obtained from the calibration polynucleotide (530) provides the means for its identification (535). The abundance data of the bioagent identifying amplicon is recorded (540) and the abundance data for the calibration amplicon is recorded (545), both of which are used in a calculation (550) which determines the quantity of unknown bioagent in the sample.


In some embodiments, construction of a standard curve, in which the amount of calibration polynucleotide spiked into the sample is varied, provides additional resolution and improved confidence for the determination of the quantity of bioagent in the sample. The use of standard curves for analytical determination of molecular quantities is well known to one with ordinary skill and can be performed without undue experimentation.
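One possible way to use such a standard curve is sketched below, for illustration only. The sketch assumes that the ratio of calibration amplicon abundance to bioagent amplicon abundance varies linearly with the amount of calibration polynucleotide added, and that the bioagent quantity can be read off at the point where the ratio equals one. This model, the function names, and the data points are hypothetical assumptions and are not prescribed by the disclosure.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def equivalence_point(slope, intercept):
    """Calibrant amount at which the fitted abundance ratio equals 1."""
    return (1.0 - intercept) / slope


# Hypothetical series: calibrant copies spiked into aliquots of one sample,
# and the measured calibrant/bioagent abundance ratio for each aliquot.
calibrant_copies = [50.0, 100.0, 200.0, 400.0]
abundance_ratio = [0.25, 0.5, 1.0, 2.0]

slope, intercept = fit_line(calibrant_copies, abundance_ratio)
estimate = equivalence_point(slope, intercept)
print(round(estimate, 3))
```

Fitting several points rather than relying on a single spike level averages out measurement noise in the individual abundance readings.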


In some embodiments, multiplex amplification is performed where multiple bioagent identifying amplicons are amplified with multiple primer pairs which also amplify the corresponding standard calibration sequences. In this or other embodiments, the standard calibration sequences are optionally included within a single vector which functions as the calibration polynucleotide. Multiplex amplification methods are well known to those with ordinary skill and can be performed without undue experimentation.


In some embodiments, the calibrant polynucleotide is used as an internal positive control to confirm that amplification conditions and subsequent analysis steps are successful in producing a measurable amplicon. Even in the absence of copies of the genome of a bioagent, the calibration polynucleotide should give rise to a calibration amplicon. Failure to produce a measurable calibration amplicon indicates a failure of amplification or of a subsequent analysis step such as amplicon purification or molecular mass determination. Reaching a conclusion that such failures have occurred is, in itself, a useful event.
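The internal positive control logic described above may be summarized, purely as an illustrative sketch, by the following decision function. The function name and the returned labels are hypothetical.

```python
def interpret_run(calibration_amplicon_detected, bioagent_amplicon_detected):
    """Classify one amplification/analysis run using the calibrant as a control."""
    if not calibration_amplicon_detected:
        # The calibrant should always yield an amplicon; its absence points to
        # a failure of amplification, purification, or mass determination.
        return "run failure"
    if bioagent_amplicon_detected:
        return "bioagent detected"
    # The control worked, so the absence of a bioagent amplicon is meaningful.
    return "no bioagent detected"


print(interpret_run(True, False))  # no bioagent detected
```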


In some embodiments, the calibration sequence is inserted into a vector which then itself functions as the calibration polynucleotide. In some embodiments, more than one calibration sequence is inserted into the vector that functions as the calibration polynucleotide. Such a calibration polynucleotide is herein termed a “combination calibration polynucleotide.” The process of inserting polynucleotides into vectors is routine to those skilled in the art and can be accomplished without undue experimentation. Thus, it should be recognized that the calibration method should not be limited to the embodiments described herein. The calibration method can be applied for determination of the quantity of any bioagent identifying amplicon when an appropriate standard calibrant polynucleotide sequence is designed and used. The process of choosing an appropriate vector for insertion of a calibrant is also a routine operation that can be accomplished by one with ordinary skill without undue experimentation.


The present invention also provides kits for carrying out, for example, the methods described herein. In some embodiments, the kit may comprise a sufficient quantity of one or more primer pairs to perform an amplification reaction on a target polynucleotide from a bioagent to form a bioagent identifying amplicon. In some embodiments, the kit may comprise from one to fifty primer pairs, from one to twenty primer pairs, from one to ten primer pairs, or from two to five primer pairs. In some embodiments, the kit may comprise one or more primer pairs recited in Table 1.


In some embodiments, the kit may comprise one or more broad range survey primer(s), division wide primer(s), clade group primer(s) or drill-down primer(s), or any combination thereof. A kit may be designed so as to comprise particular primer pairs for identification of a particular bioagent. For example, a broad range survey primer kit may be used initially to identify an unknown bioagent as a member of the Bacillus/Clostridia group. As another example, a division-wide kit may be used to distinguish Bacillus anthracis, Bacillus cereus and Bacillus thuringiensis from each other. A clade group primer kit may be used, for example, to identify an unknown bacterium as a member of the Bacillus cereus clade group. A drill-down kit may be used, for example, to identify genetically engineered Bacillus anthracis. In some embodiments, any of these kits may be combined to comprise a combination of broad range survey primers and division-wide primers, clade group primers or drill-down primers, or any combination thereof, for identification of an unknown bacterial bioagent.


In some embodiments, the kit may contain standardized calibration polynucleotides for use as internal amplification calibrants. Internal calibrants are described in commonly owned U.S. Patent Application Ser. No. 60/545,425 which is incorporated herein by reference in its entirety.


In some embodiments, the kit may also comprise a sufficient quantity of reverse transcriptase (if an RNA virus is to be identified for example), a DNA polymerase, suitable nucleoside triphosphates (including any of those described above), a DNA ligase, and/or reaction buffer, or any combination thereof, for the amplification processes described above. A kit may further include instructions pertinent for the particular embodiment of the kit, such instructions describing the primer pairs and amplification conditions for operation of the method. A kit may also comprise amplification reaction containers such as microcentrifuge tubes and the like. A kit may also comprise reagents or other materials for isolating bioagent nucleic acid or bioagent identifying amplicons from amplification, including, for example, detergents, solvents, or ion exchange resins which may be linked to magnetic beads. A kit may also comprise a table of measured or calculated molecular masses and/or base compositions of bioagents using the primer pairs of the kit.


In order that the invention disclosed herein may be more efficiently understood, examples are provided below. It should be understood that these examples are for illustrative purposes only and are not to be construed as limiting the invention in any manner. Throughout these examples, molecular cloning reactions, and other standard recombinant DNA techniques, were carried out according to methods described in Maniatis et al., Molecular Cloning—A Laboratory Manual, 2nd ed., Cold Spring Harbor Press (1989), using commercially available reagents, except where otherwise noted.


EXAMPLES
Example 1
Selection of Primers That Define Bioagent Identifying Amplicons

For design of primers that define bacterial bioagent identifying amplicons, relevant sequences from, for example, GenBank are obtained, aligned and scanned for regions where pairs of PCR primers would amplify products of about 45 to about 200 nucleotides in length and distinguish species from each other by their molecular masses or base compositions. A typical process shown in FIG. 2 is employed.
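As a minimal sketch of the screening criteria just described, the following functions compute the base composition of a candidate amplicon and check whether its length falls in the stated range. Treating "about 45 to about 200 nucleotides" as hard bounds is a simplifying assumption, and the function names are hypothetical.

```python
from collections import Counter


def base_composition(amplicon):
    """Return the counts of A, C, G and T in an amplicon sequence."""
    counts = Counter(amplicon.upper())
    return {base: counts.get(base, 0) for base in "ACGT"}


def is_candidate_length(amplicon, min_len=45, max_len=200):
    """Check the amplicon length against the (assumed hard) 45-200 nt window."""
    return min_len <= len(amplicon) <= max_len


# Two amplicons of equal length can still be told apart by base composition.
print(base_composition("AACGT"))  # {'A': 2, 'C': 1, 'G': 1, 'T': 1}
print(base_composition("ACCGT"))  # {'A': 1, 'C': 2, 'G': 1, 'T': 1}
```

A primer pair is informative for identification when the amplicons it produces from different species yield different base compositions, and hence different molecular masses.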


A database of expected base compositions for each primer region is generated using an in silico PCR search algorithm, such as ePCR. An existing RNA structure search algorithm (Macke et al., Nuc. Acids Res., 2001, 29, 4724-4735, which is incorporated herein by reference in its entirety) has been modified to include PCR parameters such as hybridization conditions, mismatches, and thermodynamic calculations (SantaLucia, Proc. Natl. Acad. Sci. U.S.A., 1998, 95, 1460-1465, which is incorporated herein by reference in its entirety). This also provides information on primer specificity of the selected primer pairs.


Table 1 represents a collection of primers (sorted by forward primer name) designed to identify bacteria using the methods herein described. The forward or reverse primer name indicates the gene region of the bacterial genome to which the primer hybridizes relative to a reference sequence. For example, the forward primer name 16S_EC_1077_1106_F indicates that the primer hybridizes to residues 1077-1106 of the gene encoding 16S ribosomal RNA in an E. coli reference sequence represented by a sequence extraction of coordinates 4033120..4034661 from GenBank gi number 16127994 (as indicated in Table 2). As an additional example, the forward primer name BONTA_X52066_450_473_F indicates that the primer hybridizes to residues 450-473 of the gene encoding Clostridium botulinum neurotoxin type A (BoNT/A) represented by GenBank Accession No. X52066 (primer pair name codes appearing in Table 1 are defined in Table 2). In Table 1, Ua=5-propynyluracil; Ca=5-propynylcytosine; *=phosphorothioate linkage. The primer pair number is an in-house database index number.
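The primer naming convention described above can be decoded programmatically. The following is a sketch only: it assumes names of the simple form GENE_REFERENCE_START_END_F or GENE_REFERENCE_START_END_R, optionally with extra tokens (such as TMOD) before the final direction letter, and it returns None for names that do not follow this pattern (for example, names with negative coordinates or with a letter fused to a coordinate). The regular expression and the field names are hypothetical and are not part of the disclosure.

```python
import re

# Assumed pattern: gene code, reference code, start, end, optional modifiers, F/R.
PRIMER_NAME = re.compile(
    r"(?P<gene>[^_]+)_(?P<reference>[^_]+)_(?P<start>\d+)_(?P<end>\d+)"
    r"(?:_(?P<modifiers>.+))?_(?P<direction>[FR])"
)


def parse_primer_name(name):
    """Split a primer name into its parts, or return None if it does not fit."""
    match = PRIMER_NAME.fullmatch(name)
    if match is None:
        return None
    return {
        "gene": match["gene"],
        "reference": match["reference"],
        "start": int(match["start"]),
        "end": int(match["end"]),
        "modifiers": match["modifiers"],  # None when no extra tokens are present
        "direction": "forward" if match["direction"] == "F" else "reverse",
    }


print(parse_primer_name("16S_EC_1077_1106_F"))
print(parse_primer_name("BONTA_X52066_450_473_F"))
```

For the first name, the parsed coordinates are 1077 and 1106 on the 16S gene code with reference code EC, matching the example given in the text.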









TABLE 1







Primer Pairs for Identification of Bacterial Bioagents
















Primer pair number | For. primer name | Forward sequence | For. SEQ ID NO: | Rev. primer name | Reverse sequence | Rev. SEQ ID NO:
















1
16S_EC_1077_
GTGAGATGTTGGGTTAA
1
16S_EC_1175_
GACGTCATCCCCACCTTCC
368



1106_F
GTCCCGTAACGAG

1195_R
TC






266
16S_EC_1082_
ATGTTGGGTTAAGTCCC
2
16S_EC_1177_
TGACGTCATGGCCACCTTC
372



1100_F
GC

1196_10G_
C







11G_R







265
16S_EC_1082_
ATGTTGGGTTAAGTCCC
2
16S_EC_1177_
TGACGTCATGCCCACCTTC
373



1100_F
GC

1196_10G_R
C






230
16S_EC_1082_
ATGTTGGGTTAAGTCCC
2
16S_EC_1177_
TGACGTCATCCCCACCTTC
374



1100_F
GC

1196_R
C






263
16S_EC_1082_
ATGTTGGGTTAAGTCCC
2
16S_EC_1525_
AAGGAGGTGATCCAGCC
382



1100_F
GC

1541_R







2
16S_EC_1082_
ATGTTGGGTTAAGTCCC
3
16S_EC_1175_
TTGACGTCATCCCCACCTT
371



1106_F
GCAACGAG

1197_R
CCTC






278
16S_EC_1090_
TTAAGTCCCGCAACGAG
4
16S_EC_1175_
TGACGTCATCCCCACCTTC
369



1111_2_F
CGCAA

1196_R
CTC






361
16S_EC_1090_
TTTAAGTCCCGCAACGA
5
16S_EC_1175_
TTGACGTCATCCCCACCTT
370



1111_2_
GCGCAA

1196_TMOD_R
CCTC




TMOD_F










3
16S_EC_1090_
TTAAGTCCCGCAACGAT
6
16S_EC_1175_
TGACGTCATCCCCACCTTC
369



1111_F
CGCAA

1196_R
CTC






256
16S_EC_1092_
TAGTCCCGCAACGAGCG
7
16S_EC_1174_
GACGTCATCCCCACCTTCC
367



1109_F
C

1195_R
TCC






159
16S_EC_1100_
CAACGAGCGCAACCCTT
8
16S_EC_1174_
TCCCCACCTTCCTCC
366



1116_F


1188_R







247
16S_EC_1195_
CAAGTCATCATGGCCCT
9
16S_EC_1525_
AAGGAGGTGATCCAGCC
382



1213_F
TA

1541_R







4
16S_EC_1222_
GCTACACACGTGCTACA
10
16S_EC_1303_
CGAGTTGCAGACTGCGATC
376



1241_F
ATG

1323_R
CG






232
16S_EC_1303_
CGGATTGGAGTCTGCAA
11
16S_EC_1389_
GACGGGCGGTGTGTACAAG
378



1323_F
CTCG

1407_R







5
16S_EC_1332_
AAGTCGGAATCGCTAGT
12
16S_EC_1389_
GACGGGCGGTGTGTACAAG
378



1353_F
AATCG

1407_R







252
16S_EC_1367_
TACGGTGAATACGTTCC
13
16S_EC_1485_
ACCTTGTTACGACTTCACC
379



1387_F
CGGG

1506_R
CCA






250
16S_EC_1387_
GCCTTGTACACACCTCC
14
16S_EC_1494_
CACGGCTACCTTGTTACGA
381



1407_F
CGTC

1513_R
C






231
16S_EC_1389_
CTTGTACACACCGCCCG
15
16S_EC_1525_
AAGGAGGTGATCCAGCC
382



1407_F
TC

1541_R







251
16S_EC_1390_
TTGTACACACCGCCCGT
16
16S_EC_1486_
CCTTGTTACGACTTCACCC
380



1411_F
CATAC

1505_R
C






6
16S_EC_30_
TGAACGCTGGTGGCATG
17
16S_EC_105_
TACGCATTACTCACCCGTC
361



54_F
CTTAACAC

126_R
CGC






243
16S_EC_314_
CACTGGAACTGAGACAC
18
16S_EC_556_
CTTTACGCCCAGTAATTCC
385



332_F
GG

575_R
G






7
16S_EC_38_
GTGGCATGCCTAATACA
19
16S_EC_101_
TTACTCACCCGTCCGCCGC
357



64_F
TGCAAGTCG

120_R
T






279
16S_EC_405_
TGAGTGATGAAGGCCTT
20
16S_EC_507_
CGGCTGCTGGCACGAAGTT
384



432_F
AGGGTTGTAAA

527_R
AG






8
16S_EC_49_
TAACACATGCAAGTCGA
21
16S_EC_104_
TTACTCACCCGTCCGCC
359



68_F
ACG

120_R







275
16S_EC_49_
TAACACATGCAAGTCGA
21
16S_EC_1061_
ACGACACGAGCTGACGAC
364



68_F
ACG

1078_R







274
16S_EC_49_
TAACACATGCAAGTCGA
21
16S_EC_880_
CGTACTCCCCAGGCG
390



68_F
ACG

894_R







244
16S_EC_518_
CCAGCAGCCGCGGTAAT
22
16S_EC_774_
GTATCTAATCCTGTTTGCT
387



536_F
AC

795_R
CCC






226
16S_EC_556_
CGGAATTACTGGGCGTA
23
16S_EC_683_
CGCATTTCACCGCTACAC
386



575_F
AAG

700_R







264
16S_EC_556_
CGGAATTACTGGGCGTA
23
16S_EC_774_
GTATCTAATCCTGTTTGCT
387



575_F
AAG

795_R
CCC






273
16S_EC_683_
GTGTAGCGGTGAAATGC
24
16S_EC_1303_
CGAGTTGCAGACTGCGATC
377



700_F
G

1323_R
CG






9
16S_EC_683_
GTGTAGCGGTGAAATGC
24
16S_EC_774_
GTATCTAATCCTGTTTGCT
387



700_F
G

795_R
CCC






158
16S_EC_683_
GTGTAGCGGTGAAATGC
24
16S_EC_880_
CGTACTCCCCAGGCG
390



700_F
G

894_R







245
16S_EC_683_
GTGTAGCGGTGAAATGC
24
16S_EC_967_
GGTAAGGTTCTTCGCGTTG
396



700_F
G

985_R







294
16S_EC_7_33_
GAGAGTTTGATCCTGGC
25
16S_EC_101_
TGTTACTCACCCGTCTGCC
358



3_F
TCAGAACGAA

122_R
ACT






10
16S_EC_713_
AGAACACCGATGGCGAA
26
16S_EC_789_
CGTGGACTACCAGGGTATC
388



732_F
GGC

809_R
TA






346
16S_EC_713_
TAGAACACCGATGGCGA
27
16S_EC_789_
TCGTGGACTACCAGGGTAT
389



732_TMOD_F
AGGC

809_TMOD_R
CTA






228
16S_EC_774_
GGGAGCAAACAGGATTA
28
16S_EC_880_
CGTACTCCCCAGGCG
390



795_F
GATAC

894_R







11
16S_EC_785_
GGATTAGAGACCCTGGT
29
16S_EC_880_
GGCCGTACTCCCCAGGCG
391



806_F
AGTCC

897_R







347
16S_EC_785_
TGGATTAGAGACCCTGG
30
16S_EC_880_
TGGCCGTACTCCCCAGGCG
392



806_TMOD_F
TAGTCC

897_TMOD_R







12
16S_EC_785_
GGATTAGATACCCTGGT
31
16S_EC_880_
GGCCGTACTCCCCAGGCG
391



810_F
AGTCCACGC

897_2_R







13
16S_EC_789_
TAGATACCCTGGTAGTC
32
16S_EC_880_
CGTACTCCCCAGGCG
390



810_F
CACGC

894_R







255
16S_EC_789_
TAGATACCCTGGTAGTC
32
16S_EC_882_
GCGACCGTACTCCCCAGG
393



810_F
CACGC

899_R







254
16S_EC_791_
GATACCCTGGTAGTCCA
33
16S_EC_886_
GCCTTGCGACCGTACTCCC
394



812_F
CACCG

904_R







248
16S_EC_8_27_
AGAGTTTGATCATGGCT
34
16S_EC_1525_
AAGGAGGTGATCCAGCC
382



F
CAG

1541_R







242
16S_EC_8_27_
AGAGTTTGATCATGGCT
34
16S_EC_342_
ACTGCTGCCTCCCGTAG
383



7_F
CAG

358_R







253
16S_EC_804_
ACCACGCCGTAAACGAT
35
16S_EC_909_
CCCCCGTCAATTCCTTTGA
395



822_F
GA

929_R
GT






246
16S_EC_937_
AAGCGGTGGAGCATGTG
36
16S_EC_1220_
ATTGTAGCACGTGTGTAGC
375



954_F
G

1240_R
CC






14
16S_EC_960_
TTCGATGCAACGCGAAG
37
16S_EC_1054_
ACGAGCTGACGACAGCCAT
362



981_F
AACCT

1073_R
G






348
16S_EC_960_
TTTCGATGCAACGCGAA
38
16S_EC_1054_
TACGAGCTGACGACAGCCA
363



981_TMOD_F
GAACT

1073_TMOD_R
TG






119
16S_EC_969_
ACGCGAAGAACCTTA
39
16S_EC_1061_
ACGACACGAGUaCaGACGAC
364



985_1P_F
UaC

1078_2P_R







15
16S_EC_969_
ACGCGAAGAACCTTACC
39
16S_EC_1061_
ACGACACGAGCTGACGAC
364



985_F


1078_R







272
16S_EC_969_
ACGCGAAGAACCTTACC
40
16S_EC_1389_
GACGGGCGGTGTGTACAAG
378



985_F


1407_R







344
16S_EC_971_
GCGAAGAACCTTACCAG
41
16S_EC_1043_
ACAACCATGCACCACCTGT
360



990_F
GTC

1062_R
C






120
16S_EC_972_
CGAAGAAUaUaTTACC
42
16S_EC_1064_
ACACGAGUaCaGAC
365



985_2P_F


1075_2P_R







121
16S_EC_972_
CGAAGAACCTTACC
42
16S_EC_1064_
ACACGAGCTGAC
365



985_F


1075_R







1073
23S_BRM_1110_
TGCGCGGAAGATGTAAC
43
23S_BRM_1176_
TCGCAGGCTTACAGAACGC
397



1129_F
GGG

1201_R
TCTCCTA






1074
23S_BRM_515_
TGCATACAAACAGTCGG
44
23S_BRM_616_
TCGGACTCGCTTTCGCTAC
398



536_F
AGCCT

635_R
G






241
23S_BS_
AAACTAGATAACAGTAG
45
23S_BS_5_21_
GTGCGCCCTTTCTAACTT
399



−68_−44_F
ACATCAC

R







235
23S_EC_1602_
TACCCCAAACCGACACA
46
23S_EC_1686_
CCTTCTCCCGAAGTTACG
402



1620_F
GG

1703_R







236
23S_EC_1685_
CCGTAACTTCGGGAGAA
47
23S_EC_1828_
CACCGGGCAGGCGTC
403



1703_F
GG

1842_R







16
23S_EC_1826_
CTGACACCTGCCCGGTG
48
23S_EC_1906_
GACCGTTATAGTTACGGCC
404



1843_F
C

1924_R







349
23S_EC_1826_
TCTGACACCTGCCCGGT
49
23S_EC_1906_
TGACCGTTATAGTTACGGC
405



1843_TMOD_F
GC

1924_TMOD_R
C






237
23S_EC_1827_
GACGCCTGCCCGGTGC
50
23S_EC_1929_
CCGACAAGGAATTTCGCTA
407



1843_F


1949_R
CC






249
23S_EC_1831_
ACCTGCCCAGTGCTGGA
51
23S_EC_1919_
TCGCTACCTTAGGACCGT
406



1849_F
AG

1936_R







234
23S_EC_187_
GGGAACTGAAACATCTA
52
23S_EC_242_
TTCGCTCGCCGCTAC
408



207_F
AGTA

256_R







233
23S_EC_23_
GGTGGATGCCTTGGC
53
23S_EC_115_
GGGTTTCCCCATTCGG
401



37_F


130_R







238
23S_EC_2434_
AAGGTACTCCGGGGATA
54
23S_EC_2490_
AGCCGACATCGAGGTGCCA
409



2456_F
ACAGGC

2511_R
AAC






257
23S_EC_2586_
TAGAACGTCGCGAGACA
55
23S_EC_2658_
AGTCCATCCCGGTCCTCTC
411



2607_F
GTTCG

2677_R
G






239
23S_EC_2599_
GACAGTTCGGTCCCTAT
56
23S_EC_2653_
CCGGTCCTCTCGTACTA
410



2616_F
C

2669_R







18
23S_EC_2645_
CTGTCCCTAGTACGAGA
57
23S_EC_2751_
GTTTCATGCTTAGATGCTT
417



2669_2_F
GGACCGG

2767_R
TCAGC






17
23S_EC_2645_
TCTGTCCCTAGTACGAG
58
23S_EC_2744_
TGCTTAGATGCTTTCAGC
414



2669_F
AGGACCGG

2761_R







118
23S_EC_2646_
CTGTTCTTAGTACGAGA
59
23S_EC_2745_
TTCGTGCTTAGATGCTTTC
415



2667_F
GGACC

2765_R
AG






360
23S_EC_2646_
TCTGTTCTTAGTACGAG
60
23S_EC_2745_
TTTCGTGCTTAGATGCTTT
416



2667_TMOD_F
AGGACC

2765_TMOD_R
CAG






147
23S_EC_2652_
CTAGTACGAGAGGACCG
61
23S_EC_2741_
ACTTAGATGCTTTCAGCGG
413



2669_F
G

2760_R
T






240
23S_EC_2653_
TAGTACGAGAGGACCGG
62
23S_EC_2737_
TTAGATGCTTTCAGCACTT
412



2669_F


2758_R
ATC






20
23S_EC_493_
GGGGAGTGAAAGAGATC
63
23S_EC_551_
ACAAAAGGCACGCCATCAC
418



518_2_F
CTGAAACCG

571_2_R
CC






19
23S_EC_493_
GGGGAGTGAAAGAGATC
63
23S_EC_551_
ACAAAAGGTACGCCGTCAC
419



518_F
CTGAAACCG

571_R
CC






21
23S_EC_971_
CGAGAGGGAAACAACCC
64
23S_EC_1059_
TGGCTGCTTCTAAGCCAAC
400



992_F
AGACC

1077_R







1158
AB_MLST-11-
TCGTGCCCGCAATTTGC
65
AB_MLST-11-
TAATGCCGGGTAGTGCAAT
420



OIF007_1202_
ATAAAGC

OIF007_1266_
CCATTCTTCTAG




1225_F


1296_R







1159
AB_MLST-11-
TCGTGCCCGCAATTTGC
65
AB_MLST-11-
TGACCTGCGGTCGAGCG
421



OIF007_1202_
ATAAAGC

OIF007_1299_





1225_F


1316_R







1160
AB_MLST-11-
TTGTAGCACAGCAAGGC
66
AB_MLST-11-
TGCCATCCATAATCACGCC
422



OIF007_1234_
AAATTTCCTGAAAC

OIF007_1335_
ATACTGACG




1264_F


1362_R







1161
AB_MLST-11-
TAGGTTTACGTCAGTAT
67
AB_MLST-11-
TGCCAGTTTCCACATTTCA
423



OIF007_1327_
GGCGTGATTATGG

OIF007_1422_
CGTTCGTG




1356_F


1448_R







1162
AB_MLST-11-
TCGTGATTATGGATGGC
68
AB_MLST-11-
TCGCTTGAGTGTAGTCATG
424



OIF007_1345_
AACGTGAA

OIF007_1470_
ATTGCG




1369_F


1494_R







1163
AB_MLST-11-
TTATGGATGGCAACGTG
69
AB_MLST-11-
TCGCTTGAGTGTAGTCATG
424



OIF007_1351_
AAACGCGT

OIF007_1470_
ATTGCG




1375_F


1494_R







1164
AB_MLST-11-
TCTTTGCCATTGAAGAT
70
AB_MLST-11-
TCGCTTGAGTGTAGTCATG
424



OIF007_1387_
GACTTAAGC

OIF007_1470_
ATTGCG




1412_F


1494_R







1165
AB_MLST-11-
TACTAGCGGTAAGCTTA
71
AB_MLST-11-
TGAGTCGGGTTCACTTTAC
425



OIF007_1542_
AACAAGATTGC

OIF007_1656_
CTGGCA




1569_F


1680_R







1166
AB_MLST-11-
TTGCCAATGATATTCGT
72
AB_MLST-11-
TGAGTCGGGTTCACTTTAC
425



OIF007_1566_
TGGTTAGCAAG

OIF007_1656_
CTGGCA




1593_F


1680_R







1167
AB_MLST-11-
TCGGCGAAATCCGTATT
73
AB_MLST-11-
TACCGGAAGCACCAGCGAC
427



OIF007_1611_
CCTGAAAATGA

OIF007_1731_
ATTAATAG




1638_F


1757_R







1168
AB_MLST-11-
TACCACTATTAATGTCG
74
AB_MLST-11-
TGCAACTGAATAGATTGCA
428



OIF007_1726_
CTGGTGCTTC

OIF007_1790_
GTAAGTTATAAGC




1752_F


1821_R







1169
AB_MLST-11-
TTATAACTTACTGCAAT
75
AB_MLST-11-
TGAATTATGCAAGAAGTGA
429



OIF007_1792_
CTATTCAGTTGCTTGGT

OIF007_1876_
TCAATTTTCTCACGA




1826_F
G

1909_R







1170
AB_MLST-11-
TTATAACTTACTGCAAT
75
AB_MLST-11-
TGCCGTAACTAACATAAGA
430



OIF007_1792_
CTATTCAGTTGCTTGGT

OIF007_1895_
GAATTATGCAAGAA




1826_F
G

1927_R







1152
AB_MLST-11-
TATTGTTTCAAATGTAC
76
AB_MLST-11-
TCACAGGTTCTACTTCATC
432



OIF007_185_
AAGGTGAAGTGCG

OIF007_291_
AATAATTTCCATTGC




214_F


324_R







1171 
AB_MLST-11-
TGGTTATGTACCAAATA
77
AB_MLST-11-
TGACGGCATCGATACCACC
431



OIF007_1970_
CTTTGTCTGAAGATGG

OIF007_2097_
GTC




2002_F


2118_R







1154
AB_MLST-11-
TGAAGTGCGTGATGATA
78
AB_MLST-11-
TCCGCCAAAAACTCCCCTT
433



OIF007_206_
TCGATGCACTTGATGTA

OIF007_318_
TTCACAGG




239_F


344_R







1153
AB_MLST-11-
TGGAACGTTATCAGGTG
79
AB_MLST-11-
TTGCAATCGACATATCCAT
434



OIF007_260_
CCCCAAAAATTCG

OIF007_364_
TTCACCATGCC




289_F


393_R







1155
AB_MLST-11-
TCGGTTTAGTAAAAGAA
80
AB_MLST-11-
TTCTGCTTGAGGAATAGTG
435



OIF007_522_
CGTATTGCTCAACC

OIF007_587_
CGTGG




552_F


610_R







1156
AB_MLST-11-
TCAACCTGACTGCGTGA
81
AB_MLST-11-
TACGTTCTACGATTTCTTC
436



OIF007_547_
ATGGTTGT

OIF007_656_
ATCAGGTACATC




571_F


686_R







1157
AB_MLST-11-
TCAAGCAGAAGCTTTGG
82
AB_MLST-11-
TACAACGTGATAAACACGA
437



OIF007_601_
AAGAAGAAGG

OIF007_710_
CCAGAAGC




627_F


736_R







1151
AB_MLST-11-
TGAGATTGCTGAACATT
83
AB_MLST-11-
TTGTACATTTGAAACAATA
426



OIF007_62_
TAATGCTGATTGA

OIF007_169_
TGCATGACATGTGAAT




91_F


203_R







1100
ASD_FRT_1_
TTGCTTAAAGTTGGTTT
84
ASD_FRT_86_
TGAGATGTCGAAAAAAACG
439



29_F
TATTGGTTGGCG

116_R
TTGGCAAAATAC






1101
ASD_FRT_43_
TCAGTTTTAATGTCTCG
85
ASD_FRT_129_
TCCATATTGTTGCATAAAA
438



76_F
TATGATCGAATCAAAAG

156_R
CCTGTTGGC






291
ASPS_EC_405_
GCACAACCTGCGGCTGC
86
ASPS_EC_521_
ACGGCACGAGGTAGTCGC
440



422_F
G

538_R







485
BONTA_X52066_
TCTAGTAATAATAGGAC
87
BONTA_X52066_
TAACCATTTCGCGTAAGAT
441



450_473_F
CCTCAGC

517_539_R
TCAA






486
BONTA_X52066_
T*Ua*CaAGTAATAATAG
87
BONTA_X52066_
TAACCA*Ca*Ca*Ca*UaGC
441



450_473P_F
GA*Ua*Ua*Ua*Ca*UaAGC

517_539P_R
GTAAGA*Ca*Ca*UaAA






481
BONTA_X52066_
TATGGCTCTACTCAA
88
BONTA_X52066_
TGTTACTGCTGGAT
443



538_552_F


647_660_R







482
BONTA_X52066_
TA*CaGGC*Ca*Ua*CaA
88
BONTA_X52066_
TG*Ca*CaA*Ua*CaG*Ua*Ca
443



538_552P_F
*Ua*Ca*UaAA

647_660P_R
GGAT






487
BONTA_X52066_
TGAGTCACTTGAAGTTG
89
BONTA_X52066_
TCATGTGCTAATGTTACTG
442



591_620_F
ATACAAATCCTCT

644_671_R
CTGGATCTG






483
BONTA_X52066_
GAATAGCAATTAATCCA
90
BONTA_X52066_
TTACTTCTAACCCACTC
444



701_720_F
AAT

759_775_R







484
BONTA_X52066_
GAA*CaAG*UaAA*Ca*Ca
90
BONTA_X52066_
TTA*Ua*Ca*Ca*Ua*CaAA*
444



701_720P_F
AA*Ca*Ua*UaAAAT

759_775P_R
Ua*Ua*UaA*Ua*CaC






774
CAF1_AF053947_
TCAGTTCCGTTATCGCC
91
CAF1_AF053947_
TGCGGGCTGGTTCAACAAG
445



33407_33430_F
ATTGCAT

33494_33514_R
AG






776
CAF1_AF053947_
TGGAACTATTGCAACTG
92
CAF1_AF053947_
TGATGCGGGCTGGTTCAAC
446



33435_33457_F
CTAATG

33499_33517_R







775
CAF1_AF053947_
TCACTCTTACATATAAG
93
CAF1_AF053947_
TCCTGTTTTATAGCCGCCA
447



33515_33541_F
GAAGGCGCTC

33595_33621_R
AGAGTAAG






777
CAF1_AF053947_
TCAGGATGGAAATAACC
94
CAF1_AF053947_
TCAAGGTTCTCACCGTTTA
448



33687_33716_F
ACCAATTCACTAC

33755_33782_R
CCTTAGGAG






22
CAPC_BA_104_
GTTATTTAGCACTCGTT
95
CAPC_BA_180_
TGAATCTTGAAACACCATA
449



131_F
TTTAATCAGCC

205_R
CGTAACG






23
CAPC_BA_114_
ACTCGTTTTTAATCAGC
96
CAPC_BA_185_
TGAATCTTGAAACACCATA
450



133_F
CCG

205_R
CG






24
CAPC_BA_274_
GATTATTGTTATCCTGT
97
CAPC_BA_349_
GTAACCCTTGTCTTTGAAT
451



303_F
TATGCCATTTGAG

376_R
TGTATTTGC






350
CAPC_BA_274_
TGATTATTGTTATCCTG
98
CAPC_BA_349_
TGTAACCCTTGTCTTTGAA
452



303_TMOD_F
TTATGCCATTTGAG

376_TMOD_R
TTGTATTTGC






25
CAPC_BA_276_
TTATTGTTATCCTGTTA
99
CAPC_BA_358_
GGTAACCCTTGTCTTTGAA
453



296_F
TGCC

377_R
T






26
CAPC_BA_281_
GTTATCCTGTTATGCCA
100
CAPC_BA_361_
TGGTAACCCTTGTCTTTG
454



301_F
TTTG

378_R







27
CAPC_BA_315_
CCGTGGTATTGGAGTTA
101
CAPC_BA_361_
TGGTAACCCTTGTCTTTG
454



334_F
TTG

378_R







1053
CJST_CJ_1080_
TTGAGGGTATGCACCGT
102
CJST_CJ_1166_
TCCCCTCATGTTTAAATGA
456



1110_F
CTTTTTGATTCTTT

1198_R
TCAGGATAAAAAGC






1063
CJST_CJ_1268_
AGTTATAAACACGGCTT
103
CJST_CJ_1349_
TCGGTTTAAGCTCTACATG
457



1299_F
TCCTATGGCTTATCC

1379_R
ATCGTAAGGATA






1050
CJST_CJ_1290_
TGGCTTATCCAAATTTA
104
CJST_CJ_1406_
TTTGCTCATGATCTGCATG
458



1320_F
GATCGTGGTTTTAC

1433_R
AAGCATAAA






1058
CJST_CJ_1643_
TTATCGTTTGTGGAGCT
105
CJST_CJ_1724_
TGCAATGTGTGCTATGTCA
459



1670_F
AGTGCTTATGC

1752_R
GCAAAAAGAT






1045
CJST_CJ_1668_
TGCTCGAGTGATTGACT
106
CJST_CJ_1774_
TGAGCGTGTGGAAAAGGAC
460



1700_F
TTGCTAAATTTAGAGA

1799_R
TTGGATG






1064
CJST_CJ_1680_
TGATTTTGCTAAATTTA
107
CJST_CJ_1795_
TATGTGTAGTTGAGCTTAC
461



1713_F
GAGAAATTGCGGATGAA

1822_R
TACATGAGC






1056
CJST_CJ_1880_
TCCCAATTAATTCTGCC
108
CJST_CJ_1981_
TGGTTCTTACTTGCTTTGC
462



1910_F
ATTTTTCCAGGTAT

2011_R
ATAAACTTTCCA






1054
CJST_CJ_2060_
TCCCGGACTTAATATCA
109
CJST_CJ_2148_
TCGATCCGCATCACCATCA
463



2090_F
ATGAAAATTGTGGA

2174_R
AAAGCAAA






1059
CJST_CJ_2165_
TGCGGATCGTTTGGTGG
110
CJST_CJ_2247_
TCCACACTGGATTGTAATT
464



2194_F
TTGTAGATGAAAA

2278_R
TACCTTGTTCTTT






1046
CJST_CJ_2171_
TCGTTTGGTGGTGGTAG
111
CJST_CJ_2283_
TCTCTTTCAAAGCACCATT
465



2197_F
ATGAAAAAGG

2313_R
GCTCATTATAGT






1057
CJST_CJ_2185_
TAGATGAAAAGGGCGAA
112
CJST_CJ_2283_
TGAATTCTTTCAAAGCACC
466



2212_F
GTGGCTAATGG

2316_R
ATTGCTCATTATAGT






1049
CJST_CJ_2636_
TGCCTAGAAGATCTTAA
113
CJST_CJ_2753_
TTGCTGCCATAGCAAAGCC
467



2668_F
AAATTTCCGCCAACTT

2777_R
TACAGC






1062
CJST_CJ_2678_
TCCCCAGGACACCCTGA
114
CJST_CJ_2760_
TGTGCTTTTTTTGCTGCCA
468



2703_F
AATTTCAAC

2787_R
TAGCAAAGC






1065
CJST_CJ_2857_
TGGCATTTCTTATGAAG
115
CJST_CJ_2965_
TGCTTCAAAACGCATTTTT
469



2887_F
CTTGTTCTTTAGCA

2998_R
ACATTTTCGTTAAAG






1055
CJST_CJ_2869_
TGAAGCTTGTTCTTTAG
116
CJST_CJ_2979_
TCCTCCTTGTGCCTCAAAA
470



2895_F
CAGGACTTCA

3007_R
CGCATTTTTA






1051
CJST_CJ_3267_
TTTGATTTTACGCCGTC
117
CJST_CJ_3356_
TCAAAGAACCCGCACCTAA
471



3293_F
CTCCAGGTCG

3385_R
TTCATCATTTA






1061
CJST_CJ_360_
TCCTGTTATCCCTGAAG
118
CJST_CJ_443_
TACAACTGGTTCAAAAACA
473



393_F
TAGTTAATCAAGTTTGT

477_R
TTAAGCTGTAATTGTC






1048
CJST_CJ_360_
TCCTGTTATCCCTGAAG
119
CJST_CJ_442_
TCAACTGGTTCAAAAACAT
472



394_F
TAGTTAATCAAGTTTGT

476_R
TAAGTTGTAATTGTCC





T









1052
CJST_CJ_5_
TAGGCGAAGATATACAA
120
CJST_CJ_104_
TCCCTTATTTTTCTTTCTA
455



39_F
AGAGTATTAGAAGCTAG

137_R
CTACCTTCGGATAAT





A









1047
CJST_CJ_584_
TCCAGGACAAATGTATG
121
CJST_CJ_663_
TTCATTTTCTGGTCCAAAG
474



616_F
AAAAATGTCCAAGAAG

692_R
TAAGCAGTATC






1060
CJST_CJ_599_
TGAAAAATGTCCAAGAA
122
CJST_CJ_711_
TCCCGAACAATGAGTTGTA
475



632_F
GCATAGCAAAAAAAGCA

743_R
TCAACTATTTTTAC






1096
CTXA_VBC_117_
TCTTATGCCAAGAGGAC
123
CTXA_VBC_194_
TGCCTAACAAATCCCGTCT
476



142_F
AGAGTGAGT

218_R
GAGTTC






1097
CTXA_VBC_351_
TGTATTAGGGGCATACA
124
CTXA_VBC_441_
TGTCATCAAGCACCCCAAA
477



377_F
GTCCTCATCC

466_R
ATGAACT






28
CYA_BA_1055_
GAAAGAGTTCGGATTGG
125
CYA_BA_1112_
TGTTGACCATGCTTCTTAG
479



1072_F
G

1130_R







277
CYA_BA_1349_
ACAACGAAGTACAATAC
126
CYA_BA_1426_
CTTCTACATTTTTAGCCAT
480



1370_F
AAGAC

1447_R
CAC






30
CYA_BA_1353_
CGAAGTACAATACAAGA
127
CYA_BA_1448_
TGTTAACGGCTTCAAGACC
482



1379_F
CAAAAGAAGG

1467_R
C






351
CYA_BA_1359_
TCGAAGTACAATACAAG
128
CYA_BA_1448_
TTGTTAACGGCTTCAAGAC
483



1379_TMOD_F
ACAAAAGAAGG

1467_TMOD_R
CC






31
CYA_BA_1359_
ACAATACAAGACAAAAG
129
CYA_BA_1447_
CGGCTTCAAGACCCC
481



1379_F
AAGG

1461_R







32
CYA_BA_914_
CAGGTTTAGTACCAGAA
130
CYA_BA_999_
ACCACTTTTAATAAGGTTT
484



937_F
CATGCAG

1026_R
GTAGCTAAC






33
CYA_BA_916_
GGTTTAGTACCAGAACA
131
CYA_BA_1003_
CCACTTTTAATAAGGTTTG
478



935_F
TGC

1025_R
TAGC






115
DNAK_EC_428_
CGGCGTACTTCAACGAC
132
DNAK_EC_503_
CGCGGTCGGCTCGTTGATG
485



449_F
AGCCA

522_R
A






1102
GALE_FRT_168_
TTATCAGCTAGACCTTT
133
GALE_FRT_241_
TCACCTACAGCTTTAAAGC
486



199_F
TAGGTAAAGCTAAGC

269_R
CAGCAAAATG






1104
GALE_FRT_308_
TCCAAGGTACACTAAAC
134
GALE_FRT_390_
TCTTCTGTAAAGGGTGGTT
487



339_F
TTACTTGAGCTAATG

422_R
TATTATTCATCCCA






1103
GALE_FRT_834_
TCAAAAAGCCCTAGGTA
135
GALE_FRT_901_
TAGCCTTGGCAACATCAGC
488



865_F
AAGAGATTCCATATC

925_R
AAAACT






1092
GLTA_RKP_1023_
TCCGTTCTTACAAATAG
136
GLTA_RKP_1129_
TTGGCGACGGTATACCCAT
489



1055_F
CAATAGAACTTGAAGC

1156_R
AGCTTTATA






1093
GLTA_RKP_1043_
TGGAGCTTGAAGCTATC
137
GLTA_RKP_1138_
TGAACATTTGCGACGGTAT
490



1072_2_F
GCTCTTAAAGATG

1162_R
ACCCAT






1094
GLTA_RKP_1043_
TGGAACTTGAAGCTCTC
138
GLTA_RKP_1138_
TGTGAACATTTGCGACGGT
492



1072_3_F
GCTCTTAAAGATG

1164_R
ATACCCAT






1090
GLTA_RKP_1043_
TGGGACTTGAAGCTATC
139
GLTA_RKP_1138_
TGAACATTTGCGACGGTAT
491



1072_F
GCTCTTAAAGATG

1162_R
ACCCAT






1091
GLTA_RKP_400_
TCTTCTCATCCTATGGC
140
GLTA_RKP_499_
TGGTGGGTATCTTAGCAAT
493



428_F
TATTATGCTTGC

529_R
CATTCTAATAGC






1095
GLTA_RKP_400_
TCTTCTCATCCTATGGC
140
GLTA_RKP_505_
TGCGATGGTAGGTATCTTA
494



428_F
TATTATGCTTGC

534_R
GCAATCATTCT






224
GROL_EC_219_
GGTGAAAGAAGTTGCCT
141
GROL_EC_328_
TTCAGGTCCATCGGGTTCA
496



242_F
CTAAAGC

350_R
TGCC






280
GROL_EC_496_
ATGGACAAGGTTGGCAA
142
GROL_EC_577_
TAGCCGCGGTCGAATTGCA
498



518_F
GGAAGG

596_R
T






281
GROL_EC_511_
AAGGAAGGCGTGATCAC
143
GROL_EC_571_
CCGCGGTCGAATTGCATGC
497



536_F
CGTTGAAGA

593_R
CTTC






220
GROL_EC_941_
TGGAAGATCTGGGTCAG
144
GROL_EC_1039_
CAATCTGCTGACGGATCTG
495



959_F
GC

1060_R
AGC






924
GYRA_AF100557_
TCTGCCCGTGTCGTTGG
145
GYRA_AF100557_
TCGAACCGAAGTTACCCTG
499



4_23_F
TGA

119_142_R
ACCAT






925
GYRA_AF100557_
TCCATTGTTCGTATGGC
146
GYRA_AF100557_
TGCCAGCTTAGTCATACGG
500



70_94_F
TCAAGACT

178_201_R
ACTTC






926
GYRB_AB008700_
TCAGGTGGCTTACACGG
147
GYRB_AB008700_
TATTGCGGATCACCATGAT
501



19_40_F
CGTAG

111_140_R
GATATTCTTGC






927
GYRB_AB008700_
TCTTTCTTGAATGCTGG
148
GYRB_AB008700_
TCGTTGAGATGGTTTTTAC
502



265_292_F
TGTACGTATCG

369_395_R
CTTCGTTG






928
GYRB_AB008700_
TCAACGAAGGTAAAAAC
149
GYRB_AB008700_
TTTGTGAAACAGCGAACAT
503



368_394_F
CATCTCAACG

466_494_R
TTTCTTGGTA






929
GYRB_AB008700_
TGTTCGCTGTTTCACAA
150
GYRB_AB008700_
TCACGCGCATCATCACCAG
504



477_504_F
ACAACATTCCA

611_632_R
TCA






949
GYRB_AB008700_
TACTTACTTGAGAATCC
151
GYRB_AB008700_
TCCTGCAATATCTAATGCA
505



760_787_F
ACAAGCTGCAA

862_888_2_R
CTCTTACG






930
GYRB_AB008700_
TACTTACTTGAGAATCC
151
GYRB_AB008700_
ACCTGCAATATCTAATGCA
506



760_787_F
ACAAGCTGCAA

862_888_R
CTCTTACG






222
HFLB_EC_1082_
TGGCGAACCTGGTGAAC
152
HFLB_EC_1144_
CTTTCGCTTTCTCGAACTC
507



1102_F
GAAGC

1168_R
AACCAT






1128
HUPB_CJ_113_
TAGTTGCTCAAACAGCT
153
HUPB_CJ_157_
TCCCTAATAGTAGAAATAA
509



134_F
GGGCT

188_R
CTGCATCAGTAGC






1130
HUPB_CJ_76_
TCCCGGAGCTTTTATGA
154
HUPB_CJ_114_
TAGCCCAGCTGTTTGAGCA
508



102_F
CTAAAGCAGAT

135_R
ACT






1129
HUPB_CJ_76_
TCCCGGAGCTTTTATGA
154
HUPB_CJ_157_
TCCCTAATAGTAGAAATAA
510



102_F
CTAAAGCAGAT

188_R
CTGCATCAGTAGC






1079
ICD_CXB_176_
TCGCCGTGGAAAAATCC
155
ICD_CXB_224_
TAGCCTTTTCTCCGGCGTA
512



198_F
TACGCT

247_R
GATCT






1078
ICD_CXB_92_
TTCCTGACCGACCCATT
156
ICD_CXB_172_
TAGGATTTTTCCACGGCGG
510



120_F
ATTCCCTTTATC

194_R
CATC






1077
ICD_CXB_93_
TCCTGACCGACCCATTA
157
ICD_CXB_172_
TAGGATTTTTCCACGGCGG
511



120_F
TTCCCTTTATC

194_R
CATC






221
INFB_EC_1103_
GTCGTGAAAACGAGCTG
158
INFB_EC_1174_
CATGATGGTCACAACCGG
513



1124_F
GAAGA

1191_R







964
INFB_EC_1347_
TGCGTTTACCGCAATGC
159
INFB_EC_1414_
TCGGCATCACGCCGTCGTC
514



1367_F
GTGC

1432_R







34
INFB_EC_1365_
TGCTCGTGGTGCACAAG
160
INFB_EC_1439_
TGCTGCTTTCGCATGGTTA
515



1393_F
TAACGGATATTA

1467_R
ATTGCTTCAA






352
INFB_EC_1365_
TTGCTCGTGGTGCACAA
161
INFB_EC_1439_
TTGCTGCTTTCGCATGGTT
516



1393_TMOD_F
GTAACGGATATTA

1467_TMOD_R
AATTGCTTCAA






223
INFB_EC_1969_
CGTCAGGGTAAATTCCG
162
INFB_EC_2038_
AACTTCGCCTTCGGTCATG
517



1994_F
TGAAGTTAA

2058_R
TT






781
INV_U22457_
TGGTAACAGAGCCTTAT
163
INV_U22457_
TTGCGTTGCAGATTATCTT
518



1558_1581_F
AGGCGCA

1619_1643_R
TAACCAA






778
INV_U22457_
TGGCTCCTTGGTATGAC
164
INV_U22457_
TGTTAAGTGTGTTGCGGCT
519



515_539_F
TCTGCTTC

571_598_R
GTCTTTATT






779
INV_U22457_
TGCTGAGGCCTGGACCG
165
INV_U22457_
TCACGCGACGAGTGCCATC
520



699_724_F
ATCATTTAC

753_776_R
CATTG






780
INV_U22457_
TTATTTACCTGCACTCC
166
INV_U22457_
TGACCCAAAGCTGAAAGCT
521



834_858_F
CACAACTG

942_966_R
TTACTG






1106
IPAH_SGF_113_
TCCTTGACCGCCTTTCC
167
IPAH_SGF_172_
TTTTCCAGCCATGCAGCGA
522



134_F
GATAC

191_R
C






1105
IPAH_SGF_258_
TGAGGACCGTGTCGCGC
168
IPAH_SGF_301_
TCCTTCTGATGCCTGATGG
523



277_F
TCA

327_R
ACCAGGAG






1107
IPAH_SGF_462_
TCAGACCATGCTCGCAG
169
IPAH_SGF_522_
TGTCACTCCCGACACGCCA
524



486_F
AGAAACTT

540_R







1080
IS1111A_
TCAGTATGTATCCACCG
170
IS1111A_
TAAACGTCCGATACCAATG
525



NC002971_
TAGCCAGTC

NC002971_
GTTCGCTC




6866_6891_F


6928_6954_R







1081
IS1111A_
TGGGTGACATTCATCAA
171
IS1111A_
TCAACAACACCTCCTTATT
526



NC002971_
TTTCATCGTTC

NC002971_
CCCACTC




7456_7483_F


7529_7554_R







| 35 | LEF_BA_1033_1052_F | TCAAGAAGAAAAAGAGC | 172 | LEF_BA_1119_1135_R | GAATATCAATTTGTAGC | 527 |
| 36 | LEF_BA_1036_1066_F | CAAGAAGAAAAAGAGCTTCTAAAAAGAATAC | 173 | LEF_BA_1119_1149_R | AGATAAAGAATCACGAATATCAATTTGTAGC | 528 |
| 37 | LEF_BA_756_781_F | AGCTTTTGCATATTATATCGAGCCAC | 174 | LEF_BA_843_872_R | TCTTCCAAGGATAGATTTATTTCTTGTTCG | 530 |
| 353 | LEF_BA_756_781_TMOD_F | TAGCTTTTGCATATTATATCGAGCCAC | 175 | LEF_BA_843_872_TMOD_R | TTCTTCCAAGGATAGATTTATTTCTTGTTCG | 531 |
| 38 | LEF_BA_758_778_F | CTTTTGCATATTATATCGAGC | 176 | LEF_BA_843_865_R | AGGATAGATTTATTTCTTGTTCG | 529 |
| 39 | LEF_BA_795_813_F | TTTACAGCTTTATGCACCG | 177 | LEF_BA_883_900_R | TCTTGACAGCATCCGTTG | 532 |
| 40 | LEF_BA_883_899_F | CAACGGATGCTGGCAAG | 178 | LEF_BA_939_958_R | CAGATAAAGAATCGCTCCAG | 533 |
| 782 | LL_NC003143_2366996_2367019_F | TGTAGCCGCTAAGCACTACCATCC | 179 | LL_NC003143_2367073_2367097_R | TCTCATCCCGATATTACCGCCATGA | 534 |
| 783 | LL_NC003143_2367172_2367194_F | TGGACGGCATCACGATTCTCTAC | 180 | LL_NC003143_2367249_2367271_R | TGGCAACAGCTCAACACCTTTGG | 535 |
| 878 | MECA_Y14051_3645_3670_F | TGAAGTAGAAATGACTGAACGTCCGA | 181 | MECA_Y14051_3690_3719_R | TGATCCTGAATGTTTATATCTTTAACGCCT | 536 |
| 877 | MECA_Y14051_3774_3802_F | TAAAACAAACTACGGTAACATTGATCGCA | 182 | MECA_Y14051_3828_3854_R | TCCCAATCTAACTTCCACATACCATCT | 537 |
| 879 | MECA_Y14051_4507_4530_F | TCAGGTACTGCTATCCACCCTCAA | 183 | MECA_Y14051_4555_4581_R | TGGATAGACGTCATATGAAGGTGTGCT | 538 |
| 880 | MECA_Y14051_4510_4530_F | TGTACTGCTATCCACCCTCAA | 184 | MECA_Y14051_4586_4610_R | TATTCTTCGTTACTCATGCCATACA | 539 |
| 882 | MECA_Y14051_4520_4530P_F | TUaUaAUaUaUaCaUaAA | 185 | MECA_Y14051_4590_4600P_R | CaAUaCaUaACaGUaUaA | 540 |
| 883 | MECA_Y14051_4520_4530P_F | TUaUaAUaUaUaCaUaAA | 185 | MECA_Y14051_4600_4610P_R | CaACaCaUaCaCaUaGCaT | 541 |
| 881 | MECA_Y14051_4669_4698_F | TCACCAGGTTCAACTCAAAAAATATTAACA | 186 | MECA_Y14051_4765_4793_R | TAACCACCCCAAGATTTATCTTTTTGCCA | 542 |
| 876 | MECIA_Y14051_3315_3341_F | TTACACATATCGTGAGCAATGAACTGA | 187 | MECIA_Y14051_3367_3393_R | TGTGATATGGAGGTGTAGAAGGTGTTA | 543 |
| 914 | OMPA_AY485227_272_301_F | TTACTCCATTATTGCTTGGTTACACTTTCC | 188 | OMPA_AY485227_364_388_R | GAGCTGCGCCAACGAATAAATCGTC | 544 |
| 916 | OMPA_AY485227_311_335_F | TACACAACAATGGCGGTAAAGATGG | 189 | OMPA_AY485227_424_453_R | TACGTCGCCTTTAACTTGGTTATATTCAGC | 545 |
| 915 | OMPA_AY485227_379_401_F | TGCGCAGCTCTTGGTATCGAGTT | 190 | OMPA_AY485227_492_519_R | TGCCGTAACATAGAAGTTACCGTTGATT | 546 |
| 917 | OMPA_AY485227_415_441_F | TGCCTCGAAGCTGAATATAACCAAGTT | 191 | OMPA_AY485227_514_546_R | TCGGGCGTAGTTTTTAGTAATTAAATCAGAAGT | 547 |
| 918 | OMPA_AY485227_494_520_F | TCAACGGTAACTTCTATGTTACTTCTG | 192 | OMPA_AY485227_569_596_R | TCGTCGTATTTATAGTGACCAGCACCTA | 548 |
| 919 | OMPA_AY485227_551_577_F | TCAAGCCGTACGTATTATTAGGTGCTG | 193 | OMPA_AY485227_658_680_R | TTTAAGCGCCAGAAAGCACCAAC | 550 |
| 920 | OMPA_AY485227_555_581_F | TCCGTACGTATTATTAGGTGCTGGTCA | 194 | OMPA_AY485227_635_662_R | TCAACACCAGCGTTACCTAAAGTACCTT | 549 |
| 921 | OMPA_AY485227_556_583_F | TCGTACGTATTATTAGGTGCTGGTCACT | 195 | OMPA_AY485227_659_683_R | TCGTTTAAGCGCCAGAAAGCACCAA | 551 |
| 922 | OMPA_AY485227_657_679_F | TGTTGGTGCTTTCTGGCGCTTAA | 196 | OMPA_AY485227_739_765_R | TAAGCCAGCAAGAGCTGTATAGTTCCA | 552 |
| 923 | OMPA_AY485227_660_683_F | TGGTGCTTTCTGGCGCTTAAACGA | 197 | OMPA_AY485227_786_807_R | TACAGGAGCAGCAGGCTTCAAG | 553 |
| 1088 | OMPB_RKP_1192_1221_F | TCTACTGATTTTGGTAATCTTGCAGCACAG | 198 | OMPB_RKP_1288_1315_R | TAGCAGCAAAAGTTATCACACCTGCAGT | 554 |
| 1089 | OMPB_RKP_3417_3440_F | TGCAAGTGGTACTTCAACATGGGG | 199 | OMPB_RKP_3520_3550_R | TGGTTGTAGTTCCTGTAGTTGTTGCATTAAC | 555 |
| 1087 | OMPB_RKP_860_890_F | TTACAGGAAGTTTAGGTGGTAATCTAAAAGG | 200 | OMPB_RKP_972_996_R | TCCTGCAGCTCTACCTGCTCCATTA | 556 |
| 41 | PAG_BA_122_142_F | CAGAATCAAGTTCCCAGGGG | 201 | PAG_BA_190_209_R | CCTGTAGTAGAAGAGGTAAC | 558 |
| 42 | PAG_BA_123_145_F | AGAATCAAGTTCCCAGGGGTTAC | 203 | PAG_BA_187_210_R | CCCTGTAGTAGAAGAGGTAACCAC | 557 |
| 43 | PAG_BA_269_287_F | AATCTGCTATTTGGTCAGG | 203 | PAG_BA_326_344_R | TGATTATCAGCGGAAGTAG | 559 |
| 44 | PAG_BA_655_675_F | GAAGGATATACGGTTGATGTC | 204 | PAG_BA_755_772_R | CCGTGCTCCATTTTTCAG | 560 |
| 45 | PAG_BA_753_772_F | TCCTGAAAAATGGAGCACGG | 205 | PAG_BA_849_868_R | TCGGATAAGCTGCCACAAGG | 561 |
| 46 | PAG_BA_763_781_F | TGGAGCACGGCTTCTGATC | 206 | PAG_BA_849_868_R | TCGGATAAGCTGCCACAAGG | 562 |
| 912 | PARC_X95819_123_147_F | GGCTCAGCCATTTAGTTACCGCTAT | 207 | PARC_X95819_232_260_R | TCGCTCAGCAATAATTCACTATAAGCCGA | 566 |
| 913 | PARC_X95819_43_63_F | TCAGCGCGTACAGTGGGTGAT | 208 | PARC_X95819_143_170_R | TTCCCCTGACCTTCGATTAAAGGATAGC | 563 |
| 911 | PARC_X95819_87_110_F | TGGTGACTCGGCATGTTATGAAGC | 209 | PARC_X95819_192_219_R | GGTATAACGCATCGCAGCAAAAGATTTA | 564 |
| 910 | PARC_X95819_87_110_F | TGGTGACTCGGCATGTTATGAAGC | 209 | PARC_X95819_201_222_R | TTCGGTATAACGCATCGCAGCA | 565 |
| 773 | PLA_AF053945_7186_7211_F | TTATACCGGAAACTTCCCGAAAGGAG | 210 | PLA_AF053945_7257_7280_R | TAATGCGATACTGGCCTGCAAGTC | 567 |
| 770 | PLA_AF053945_7377_7402_F | TGACATCCGGCTCACGTTATTATGGT | 211 | PLA_AF053945_7434_7462_R | TGTAAATTCCGCAAAGACTTTGGCATTAG | 568 |
| 771 | PLA_AF053945_7382_7404_F | TCCGGCTCACGTTATTATGGTAC | 212 | PLA_AF053945_7482_7502_R | TGGTCTGAGTACCTCCTTTGC | 569 |
| 772 | PLA_AF053945_7481_7503_F | TGCAAAGGAGGTACTCAGACCAT | 213 | PLA_AF053945_7539_7562_R | TATTGGAAATACCGGCAGCATCTC | 570 |
| 909 | RECA_AF251469_169_190_F | TGACATGCTTGTCCGTTCAGGC | 214 | RECA_AF251469_277_300_R | TGGCTCATAAGACGCGCTTGTAGA | 572 |
| 908 | RECA_AF251469_43_68_F | TGGTACATGTGCCTTCATTGATGCTG | 215 | RECA_AF251469_140_163_R | TTCAAGTGCTTGCTCACCATTGTC | 571 |
| 1072 | RNASEP_BDP_574_592_F | TGGCACGGCCATCTCCGTG | 216 | RNASEP_BDP_616_635_R | TCGTTTCACCCTGTCATGCCG | 573 |
| 1070 | RNASEP_BKM_580_599_F | TGCGGGTAGGGAGCTTGAGC | 217 | RNASEP_BKM_665_686_R | TCCGATAAGCCGGATTCTGTGC | 574 |
| 1071 | RNASEP_BKM_616_637_F | TCCTAGAGGAATGGCTGCCACG | 218 | RNASEP_BKM_665_687_R | TGCCGATAAGCCGGATTCTGTGC | 575 |
| 1112 | RNASEP_BRM_325_347_F | TACCCCAGGGAAAGTGCCACAGA | 219 | RNASEP_BRM_402_428_R | TCTCTTACCCCACCCTTTCACCCTTAC | 576 |
| 1172 | RNASEP_BRM_461_488_F | TAAACCCCATCGGGAGCAAGACCGAATA | 220 | RNASEP_BRM_542_561_2_R | TGCCTCGTGCAACCCACCCG | 577 |
| 1111 | RNASEP_BRM_461_488_F | TAAACCCCATCGGGAGCAAGACCGAATA | 220 | RNASEP_BRM_542_561_R | TGCCTCGCGCAACCTACCCG | 578 |
| 258 | RNASEP_BS_43_61_F | GAGGAAAGTCCATGCTCGC | 221 | RNASEP_BS_363_384_R | GTAAGCCATGTTTTGTTCCATC | 579 |
| 259 | RNASEP_BS_43_61_F | GAGGAAAGTCCATGCTCGC | 221 | RNASEP_BS_363_384_R | GTAAGCCATGTTTTGTTCCATC | 578 |
| 258 | RNASEP_BS_43_61_F | GAGGAAAGTCCATGCTCGC | 221 | RNASEP_EC_345_362_R | ATAAGCCGGGTTCTGTCG | 581 |
| 258 | RNASEP_BS_43_61_F | GAGGAAAGTCCATGCTCGC | 221 | RNASEP_SA_358_379_R | ATAAGCCATGTTCTGTTCCATC | 584 |
| 1076 | RNASEP_CLB_459_487_F | TAAGGATAGTGCAACAGAGATATACCGCC | 222 | RNASEP_CLB_498_522_R | TTTACCTCGCCTTTCCACCCTTACC | 579 |
| 1075 | RNASEP_CLB_459_487_F | TAAGGATAGTGCAACAGAGATATACCGCC | 222 | RNASEP_CLB_498_526_R | TGCTCTTACCTCACCGTTCCACCCTTACC | 580 |
| 258 | RNASEP_EC_61_77_F | GAGGAAAGTCCGGGCTC | 223 | RNASEP_BS_363_384_R | GTAAGCCATGTTTTGTTCCATC | 578 |
| 258 | RNASEP_EC_61_77_F | GAGGAAAGTCCGGGCTC | 223 | RNASEP_EC_345_362_R | ATAAGCCGGGTTCTGTCG | 581 |
| 260 | RNASEP_EC_61_77_F | GAGGAAAGTCCGGGCTC | 223 | RNASEP_EC_345_362_R | ATAAGCCGGGTTCTGTCG | 581 |
| 258 | RNASEP_EC_61_77_F | GAGGAAAGTCCGGGCTC | 223 | RNASEP_SA_358_379_R | ATAAGCCATGTTCTGTTCCATC | 584 |
| 1085 | RNASEP_RKP_264_287_F | TCTAAATGGTCGTGCAGTTGCGTG | 224 | RNASEP_RKP_295_321_R | TCTATAGAGTCCGGACTTTCCTCGTGA | 582 |
| 1082 | RNASEP_RKP_419_448_F | TGGTAAGAGCGCACCGGTAAGTTGGTAACA | 225 | RNASEP_RKP_542_565_R | TCAAGCGATCTACCCGCATTACAA | 583 |
| 1083 | RNASEP_RKP_422_443_F | TAAGAGCGCACCGGTAAGTTGG | 226 | RNASEP_RKP_542_565_R | TCAAGCGATCTACCCGCATTACAA | 583 |
| 1086 | RNASEP_RKP_426_448_F | TGCATACCGGTAAGTTGGCAACA | 227 | RNASEP_RKP_542_565_R | TCAAGCGATCTACCCGCATTACAA | 583 |
| 1084 | RNASEP_RKP_466_491_F | TCCACCAAGAGCAAGATCAAATAGGC | 228 | RNASEP_RKP_542_565_R | TCAAGCGATCTACCCGCATTACAA | 583 |
| 258 | RNASEP_SA_31_49_F | GAGGAAAGTCCATGCTCAC | 229 | RNASEP_BS_363_384_R | GTAAGCCATGTTTTGTTCCATC | 578 |
| 258 | RNASEP_SA_31_49_F | GAGGAAAGTCCATGCTCAC | 229 | RNASEP_EC_345_362_R | ATAAGCCGGGTTCTGTCG | 581 |
| 258 | RNASEP_SA_31_49_F | GAGGAAAGTCCATGCTCAC | 229 | RNASEP_SA_358_379_R | ATAAGCCATGTTCTGTTCCATC | 584 |
| 262 | RNASEP_SA_31_49_F | GAGGAAAGTCCATGCTCAC | 229 | RNASEP_SA_358_379_R | ATAAGCCATGTTCTGTTCCATC | 584 |
| 1098 | RNASEP_VBC_331_349_F | TCCGCGGAGTTGACTGGGT | 230 | RNASEP_VBC_388_414_R | TGACTTTCCTCCCCCTTATCAGTCTCC | 585 |
| 66 | RPLB_EC_650_679_F | GACCTACAGTAAGAGGTTCTGTAATGAACC | 231 | RPLB_EC_739_762_R | TCCAAGTGCTGGTTTACCCCATGG | 591 |
| 356 | RPLB_EC_650_679_TMOD_F | TGACCTACAGTAAGAGGTTCTGTAATGAACC | 232 | RPLB_EC_739_762_TMOD_R | TTCCAAGTGCTGGTTTACCCCATGG | 592 |
| 73 | RPLB_EC_669_698_F | TGTAATGAACCCTAATGACCATCCACACGG | 233 | RPLB_EC_735_761_R | CCAAGTGCTGGTTTACCCCATGGAGTA | 586 |
| 74 | RPLB_EC_671_700_F | TAATGAACCCTAATGACCATCCACACGGTG | 234 | RPLB_EC_737_762_R | TCCAAGTGCTGGTTTACCCCATGGAG | 590 |
| 67 | RPLB_EC_688_710_F | CATCCACACGGTGGTGGTGAAGG | 235 | RPLB_EC_736_757_R | GTGCTGGTTTACCCCATGGAGT | 587 |
| 70 | RPLB_EC_688_710_F | CATCCACACGGTGGTGGTGAAGG | 235 | RPLB_EC_743_771_R | TGTTTTGTATCCAAGTGCTGGTTTACCCC | 593 |
| 357 | RPLB_EC_688_710_TMOD_F | TCATCCACACGGTGGTGGTGAAGG | 236 | RPLB_EC_736_757_TMOD_R | TGTGCTGGTTTACCCCATGGAGT | 588 |
| 449 | RPLB_EC_690_710_F | TCCACACGGTGGTGGTGAAGG | 237 | RPLB_EC_737_758_R | TGTGCTGGTTTACCCCATGGAG | 589 |
| 113 | RPOB_EC_1336_1353_F | GACCACCTCGGCAACCGT | 238 | RPOB_EC_1438_1455_R | TTCGCTCTCGGCCTGGCC | 594 |
| 963 | RPOB_EC_1527_1549_F | TCAGCTGTCGCAGTTCATGGACC | 239 | RPOB_EC_1630_1649_R | TCGTCGCGGACTTCGAAGCC | 595 |
| 72 | RPOB_EC_1845_1866_F | TATCGCTCAGGCGAACTCCAAC | 240 | RPOB_EC_1909_1929_R | GCTGGATTCGCCTTTGCTACG | 596 |
| 359 | RPOB_EC_1845_1866_TMOD_F | TTATCGCTCAGGCGAACTCCAAC | 241 | RPOB_EC_1909_1929_TMOD_R | TGCTGGATTCGCCTTTGCTACG | 597 |
| 962 | RPOB_EC_2005_2027_F | TCGTTCCTGGAACACGATGACGC | 242 | RPOB_EC_2041_2064_R | TTGACGTTGCATGTTCGAGCCCAT | 598 |
| 69 | RPOB_EC_3762_3790_F | TCAACAACCTCTTGGAGGTAAAGCTCAGT | 243 | RPOB_EC_3836_3865_R | TTTCTTGAAGAGTATGAGCTGCTCCGTAAG | 600 |
| 111 | RPOB_EC_3775_3803_F | CTTGGAGGTAAGTCTCATTTTGGTGGGCA | 244 | RPOB_EC_3829_3858_R | CGTATAAGCTGCACCATAAGCTTGTAATGC | 599 |
| 940 | RPOB_EC_3798_3821_F | TGGGCAGCGTTTCGGCGAAATGGA | 245 | RPOB_EC_3862_3889_2_R | TGTCCGACTTGACGGTTAGCATTTCCTG | 604 |
| 939 | RPOB_EC_3798_3821_F | TGGGCAGCGTTTCGGCGAAATGGA | 245 | RPOB_EC_3862_3889_R | TGTCCGACTTGACGGTCAGCATTTCCTG | 605 |
| 289 | RPOB_EC_3799_3821_F | GGGCAGCGTTTCGGCGAAATGGA | 246 | RPOB_EC_3862_3888_R | GTCCGACTTGACGGTCAACATTTCCTG | 602 |
| 362 | RPOB_EC_3799_3821_TMOD_F | TGGGCAGCGTTTCGGCGAAATGGA | 245 | RPOB_EC_3862_3888_TMOD_R | TGTCCGACTTGACGGTCAACATTTCCTG | 603 |
| 288 | RPOB_EC_3802_3821_F | CAGCGTTTCGGCGAAATGGA | 247 | RPOB_EC_3862_3885_R | CGACTTGACGGTTAACATTTCCTG | 601 |
| 48 | RPOC_EC_1018_1045_2_F | CAAAACTTATTAGGTAAGCGTGTTGACT | 248 | RPOC_EC_1095_1124_2_R | TCAAGCGCCATCTCTTTCGGTAATCCACAT | 610 |
| 47 | RPOC_EC_1018_1045_F | CAAAACTTATTAGGTAAGCGTGTTGACT | 248 | RPOC_EC_1095_1124_R | TCAAGCGCCATTTCTTTTGGTAAACCACAT | 611 |
| 68 | RPOC_EC_1036_1060_F | CGTGTTGACTATTCGGGGCGTTCAG | 249 | RPOC_EC_1097_1126_R | ATTCAAGAGCCATTTCTTTTGGTAAACCAC | 612 |
| 49 | RPOC_EC_114_140_F | TAAGAAGCCGGAAACCATCAACTACCG | 250 | RPOC_EC_213_232_R | GGCGCTTGTACTTACCGCAC | 617 |
| 227 | RPOC_EC_1256_1277_F | ACCCAGTGCTGCTGAACCGTGC | 251 | RPOC_EC_1295_1315_R | GTTCAAATGCCTGGATACCCA | 613 |
| 292 | RPOC_EC_1374_1393_F | CGCCGACTTCGACGGTGACC | 252 | RPOC_EC_1437_1455_R | GAGCATCAGCGTGCGTGCT | 614 |
| 364 | RPOC_EC_1374_1393_TMOD_F | TCGCCGACTTCGACGGTGACC | 253 | RPOC_EC_1437_1455_TMOD_R | TGAGCATCAGCGTGCGTGCT | 615 |
| 229 | RPOC_EC_1584_1604_F | TGGCCCGAAAGAAGCTGAGCG | 254 | RPOC_EC_1623_1643_R | ACGCGGGCATGCAGAGATGCC | 616 |
| 978 | RPOC_EC_2145_2175_F | TCAGGAGTCGTTCAACTCGATCTACATGATG | 255 | RPOC_EC_2228_2247_R | TTACGCCATCAGGCCACGCA | 622 |
| 290 | RPOC_EC_2146_2174_F | CAGGAGTCGTTCAACTCGATCTACATGAT | 256 | RPOC_EC_2227_2245_R | ACGCCATCAGGCCACGCAT | 620 |
| 363 | RPOC_EC_2146_2174_TMOD_F | TCAGGAGTCGTTCAACTCGATCTACATGAT | 257 | RPOC_EC_2227_2245_TMOD_R | TACGCCATCAGGCCACGCAT | 621 |
| 51 | RPOC_EC_2178_2196_2_F | TGATTCCGGTGCCCGTGGT | 258 | RPOC_EC_2225_2246_2_R | TTGGCCATCAGACCACGCATAC | 618 |
| 50 | RPOC_EC_2178_2196_F | TGATTCTGGTGCCCGTGGT | 259 | RPOC_EC_2225_2246_R | TTGGCCATCAGGCCACGCATAC | 619 |
| 53 | RPOC_EC_2218_2241_2_F | CTTGCTGGTATGCGTGGTCTGATG | 260 | RPOC_EC_2313_2337_2_R | CGCACCATGCGTAGAGATGAAGTAC | 623 |
| 52 | RPOC_EC_2218_2241_F | CTGGCAGGTATGCGTGGTCTGATG | 261 | RPOC_EC_2313_2337_R | CGCACCGTGGGTTGAGATGAAGTAC | 624 |
| 354 | RPOC_EC_2218_2241_TMOD_F | TCTGGCAGGTATGCGTGGTCTGATG | 262 | RPOC_EC_2313_2337_TMOD_R | TCGCACCGTGGGTTGAGATGAAGTAC | 625 |
| 958 | RPOC_EC_2223_2243_F | TGGTATGCGTGGTCTGATGGC | 263 | RPOC_EC_2329_2352_R | TGCTAGACCTTTACGTGCACCGTG | 626 |
| 960 | RPOC_EC_2334_2357_F | TGCTCGTAAGGGTCTGGCGGATAC | 264 | RPOC_EC_2380_2403_R | TACTAGACGACGGGTCAGGTAACC | 627 |
| 55 | RPOC_EC_808_833_2_F | CGTCGTGTAATTAACCGTAACAACCG | 265 | RPOC_EC_865_891_R | ACGTTTTTCGTTTTGAACGATAATGCT | 629 |
| 54 | RPOC_EC_808_833_F | CGTCGGGTGATTAACCGTAACAACCG | 266 | RPOC_EC_865_889_R | GTTTTTCGTTGCGTACGATGATGTC | 628 |
| 961 | RPOC_EC_917_938_F | TATTGGACAACGGTCGTCGCGG | 267 | RPOC_EC_1009_1034_R | TTACCGAGCAGGTTCTGACGGAAACG | 607 |
| 959 | RPOC_EC_918_938_F | TCTGGATAACGGTCGTCGCGG | 268 | RPOC_EC_1009_1031_R | TCCAGCAGGTTCTGACGGAAACG | 606 |
| 57 | RPOC_EC_993_1019_2_F | CAAAGGTAAGCAAGGACGTTTCCGTCA | 269 | RPOC_EC_1036_1059_2_R | CGAACGGCCAGAGTAGTCAACACG | 608 |
| 56 | RPOC_EC_993_1019_F | CAAAGGTAAGCAAGGTCGTTTCCGTCA | 270 | RPOC_EC_1036_1059_R | CGAACGGCCTGAGTAGTCAACACG | 609 |
| 75 | SP101_SPET11_1_29_F | AACCTTAATTGGAAAGAAACCCAAGAAGT | 271 | SP101_SPET11_92_116_R | CCTACCCAACGTTCACCAAGGGCAG | 676 |
| 446 | SP101_SPET11_1_29_TMOD_F | TAACCTTAATTGGAAAGAAACCCAAGAAGT | 272 | SP101_SPET11_92_116_TMOD_R | TCCTACCCAACGTTCACCAAGGGCAG | 677 |
| 85 | SP101_SPET11_1154_1179_F | CAATACCGCAACAGCGGTGGCTTGGG | 273 | SP101_SPET11_1251_1277_R | GACCCCAACCTGGCCTTTTGTCGTTGA | 630 |
| 424 | SP101_SPET11_1154_1179_TMOD_F | TCAATACCGCAACAGCGGTGGCTTGGG | 274 | SP101_SPET11_1251_1277_TMOD_R | TGACCCCAACCTGGCCTTTTGTCGTTGA | 631 |
| 76 | SP101_SPET11_118_147_F | GCTGGTGAAAATAACCCAGATGTCGTCTTC | 275 | SP101_SPET11_213_238_R | TGTGGCCGATTTCACCACCTGCTCCT | 644 |
| 425 | SP101_SPET11_118_147_TMOD_F | TGCTGGTGAAAATAACCCAGATGTCGTCTTC | 276 | SP101_SPET11_213_238_TMOD_R | TTGTGGCCGATTTCACCACCTGCTCCT | 645 |
| 86 | SP101_SPET11_1314_1336_F | CGCAAAAAAATCCAGCTATTAGC | 277 | SP101_SPET11_1403_1431_R | AAACTATTTTTTTAGCTATACTCGAACAC | 632 |
| 426 | SP101_SPET11_1314_1336_TMOD_F | TCGCAAAAAAATCCAGCTATTAGC | 278 | SP101_SPET11_1403_1431_TMOD_R | TAAACTATTTTTTTAGCTATACTCGAACAC | 633 |
| 87 | SP101_SPET11_1408_1437_F | CGAGTATAGCTAAAAAAATAGTTTATGACA | 279 | SP101_SPET11_1486_1515_R | GGATAATTGGTCGTAACAAGGGATAGTGAG | 634 |
| 427 | SP101_SPET11_1408_1437_TMOD_F | TCGAGTATAGCTAAAAAAATAGTTTATGACA | 280 | SP101_SPET11_1486_1515_TMOD_R | TGGATAATTGGTCGTAACAAGGGATAGTGAG | 635 |
| 88 | SP101_SPET11_1688_1716_F | CCTATATTAATCGTTTACAGAAACTGGCT | 281 | SP101_SPET11_1783_1808_R | ATATGATTATCATTGAACTGCGGCCG | 636 |
| 428 | SP101_SPET11_1688_1716_TMOD_F | TCCTATATTAATCGTTTACAGAAACTGGCT | 282 | SP101_SPET11_1783_1808_TMOD_R | TATATGATTATCATTGAACTGCGGCCG | 637 |
| 89 | SP101_SPET11_1711_1733_F | CTGGCTAAAACTTTGGCAACGGT | 283 | SP101_SPET11_1808_1835_R | GCGTGACGACCTTCTTGAATTGTAATCA | 638 |
| 429 | SP101_SPET11_1711_1733_TMOD_F | TCTGGCTAAAACTTTGGCAACGGT | 284 | SP101_SPET11_1808_1835_TMOD_R | TGCGTGACGACCTTCTTGAATTGTAATCA | 639 |
| 90 | SP101_SPET11_1807_1835_F | ATGATTACAATTCAAGAAGGTCGTCACGC | 285 | SP101_SPET11_1901_1927_R | TTGGACCTGTAATCAGCTGAATACTGG | 640 |
| 430 | SP101_SPET11_1807_1835_TMOD_F | TATGATTACAATTCAAGAAGGTCGTCACGC | 286 | SP101_SPET11_1901_1927_TMOD_R | TTTGGACCTGTAATCAGCTGAATACTGG | 641 |
| 91 | SP101_SPET11_1967_1991_F | TAACGGTTATCATGGCCCAGATGGG | 287 | SP101_SPET11_2062_2083_R | ATTGCCCAGAAATCAAATCATC | 642 |
| 431 | SP101_SPET11_1967_1991_TMOD_F | TTAACGGTTATCATGGCCCAGATGGG | 288 | SP101_SPET11_2062_2083_TMOD_R | TATTGCCCAGAAATCAAATCATC | 643 |
| 77 | SP101_SPET11_216_243_F | AGCAGGTGGTGAAATCGGCCACATGATT | 289 | SP101_SPET11_308_333_R | TGCCACTTTGACAACTCCTGTTGCTG | 654 |
| 432 | SP101_SPET11_216_243_TMOD_F | TAGCAGGTGGTGAAATCGGCCACATGATT | 290 | SP101_SPET11_308_333_TMOD_R | TTGCCACTTTGACAACTCCTGTTGCTG | 655 |
| 92 | SP101_SPET11_2260_2283_F | CAGAGACCGTTTTATCCTATCAGC | 291 | SP101_SPET11_2375_2397_R | TCTGGGTGACCTGGTGTTTTAGA | 646 |
| 433 | SP101_SPET11_2260_2283_TMOD_F | TCAGAGACCGTTTTATCCTATCAGC | 292 | SP101_SPET11_2375_2397_TMOD_R | TTCTGGGTGACCTGGTGTTTTAGA | 647 |
| 93 | SP101_SPET11_2375_2399_F | TCTAAAACACCAGGTCACCCAGAAG | 293 | SP101_SPET11_2470_2497_R | AGCTGCTAGATGAGCTTCTGCCATGGCC | 648 |
| 434 | SP101_SPET11_2375_2399_TMOD_F | TTCTAAAACACCAGGTCACCCAGAAG | 294 | SP101_SPET11_2470_2497_TMOD_R | TAGCTGCTAGATGAGCTTCTGCCATGGCC | 649 |
| 94 | SP101_SPET11_2468_2487_F | ATGGCCATGGCAGAAGCTCA | 295 | SP101_SPET11_2543_2570_R | CCATAAGGTCACCGTCACCATTCAAAGC | 650 |
| 435 | SP101_SPET11_2468_2487_TMOD_F | TATGGCCATGGCAGAAGCTCA | 296 | SP101_SPET11_2543_2570_TMOD_R | TCCATAAGGTCACCGTCACCATTCAAAGC | 651 |
| 78 | SP101_SPET11_266_295_F | CTTGTACTTGTGGCTCACACGGCTGTTTGG | 297 | SP101_SPET11_355_380_R | GCTGCTTTGATGGCTGAATCCCCTTC | 661 |
| 436 | SP101_SPET11_266_295_TMOD_F | TCTTGTACTTGTGGCTCACACGGCTGTTTGG | 298 | SP101_SPET11_355_380_TMOD_R | TGCTGCTTTGATGGCTGAATCCCCTTC | 662 |
| 95 | SP101_SPET11_2961_2984_F | ACCATGACAGAAGGCATTTTGACA | 299 | SP101_SPET11_3023_3045_R | GGAATTTACCAGCGATAGACACC | 652 |
| 437 | SP101_SPET11_2961_2984_TMOD_F | TACCATGACAGAAGGCATTTTGACA | 300 | SP101_SPET11_3023_3045_TMOD_R | TGGAATTTACCAGCGATAGACACC | 653 |
| 96 | SP101_SPET11_3075_3103_F | GATGACTTTTTAGCTAATGGTCAGGCAGC | 301 | SP101_SPET11_3168_3196_R | AATCGACGACCATCTTGGAAAGATTTCTC | 656 |
| 438 | SP101_SPET11_3075_3103_TMOD_F | TGATGACTTTTTAGCTAATGGTCAGGCAGC | 302 | SP101_SPET11_3168_3196_TMOD_R | TAATCGACGACCATCTTGGAAAGATTTCTC | 657 |
| 448 | SP101_SPET11_3085_3104_F | TAGCTAATGGTCAGGCAGCC | 303 | SP101_SPET11_3170_3194_R | TCGACGACCATCTTGGAAAGATTTC | 658 |
| 79 | SP101_SPET11_322_344_F | GTCAAAGTGGCACGTTTACTGGC | 304 | SP101_SPET11_423_441_R | ATCCCCTGCTTCTGCTGCC | 665 |
| 439 | SP101_SPET11_322_344_TMOD_F | TGTCAAAGTGGCACGTTTACTGGC | 305 | SP101_SPET11_423_441_TMOD_R | TATCCCCTGCTTCTGCTGCC | 666 |
| 97 | SP101_SPET11_3386_3403_F | AGCGTAAAGGTGAACCTT | 306 | SP101_SPET11_3480_3506_R | CCAGCAGTTACTGTCCCCTCATCTTTG | 659 |
| 440 | SP101_SPET11_3386_3403_TMOD_F | TAGCGTAAAGGTGAACCTT | 307 | SP101_SPET11_3480_3506_TMOD_R | TCCAGCAGTTACTGTCCCCTCATCTTTG | 660 |
| 98 | SP101_SPET11_3511_3535_F | GCTTCAGGAATCAATGATGGAGCAG | 308 | SP101_SPET11_3605_3629_R | GGGTCTACACCTGCACTTGCATAAC | 663 |
| 441 | SP101_SPET11_3511_3535_TMOD_F | TGCTTCAGGAATCAATGATGGAGCAG | 309 | SP101_SPET11_3605_3629_TMOD_R | TGGGTCTACACCTGCACTTGCATAAC | 664 |
| 80 | SP101_SPET11_358_387_F | GGGGATTCAGCCATCAAAGCAGCTATTGAC | 310 | SP101_SPET11_448_473_R | CCAACCTTTTCCACAACAGAATCAGC | 668 |
| 442 | SP101_SPET11_358_387_TMOD_F | TGGGGATTCAGCCATCAAAGCAGCTATTGAC | 311 | SP101_SPET11_448_473_TMOD_R | TCCAACCTTTTCCACAACAGAATCAGC | 669 |
| 447 | SP101_SPET11_364_385_F | TCAGCCATCAAAGCAGCTATTG | 312 | SP101_SPET11_448_471_R | TACCTTTTCCACAACAGAATCAGC | 667 |
| 81 | SP101_SPET11_600_629_F | CCTTACTTCGAACTATGAATCTTTTGGAAG | 313 | SP101_SPET11_686_714_R | CCCATTTTTTCACGCATGCTGAAAATATC | 670 |
| 443 | SP101_SPET11_600_629_TMOD_F | TCCTTACTTCGAACTATGAATCTTTTGGAAG | 314 | SP101_SPET11_686_714_TMOD_R | TCCCATTTTTTCACGCATGCTGAAAATATC | 671 |
| 82 | SP101_SPET11_658_684_F | GGGGATTGATATCACCGATAAGAAGAA | 315 | SP101_SPET11_756_784_R | GATTGGCGATAAAGTGATATTTTCTAAAA | 672 |
| 444 | SP101_SPET11_658_684_TMOD_F | TGGGGATTGATATCACCGATAAGAAGAA | 316 | SP101_SPET11_756_784_TMOD_R | TGATTGGCGATAAAGTGATATTTTCTAAAA | 673 |
| 83 | SP101_SPET11_776_801_F | TCGCCAATCAAAACTAAGGGAATGGC | 317 | SP101_SPET11_871_896_R | GCCCACCAGAAAGACTAGCAGGATAA | 674 |
| 445 | SP101_SPET11_776_801_TMOD_F | TTCGCCAATCAAAACTAAGGGAATGGC | 318 | SP101_SPET11_871_896_TMOD_R | TGCCCACCAGAAAGACTAGCAGGATAA | 675 |
| 84 | SP101_SPET11_893_921_F | GGGCAACAGCAGCGGATTGCGATTGCGCG | 319 | SP101_SPET11_988_1012_R | CATGACAGCCAAGACCTCACCCACC | 678 |
| 423 | SP101_SPET11_893_921_TMOD_F | TGGGCAACAGCAGCGGATTGCGATTGCGCG | 320 | SP101_SPET11_988_1012_TMOD_R | TCATGACAGCCAAGACCTCACCCACC | 679 |
| 706 | SSPE_BA_114_137_F | TCAAGCAAACGCACAATCAGAAGC | 321 | SSPE_BA_196_222_R | TTGCACGTCTGTTTCAGTTGCAAATTC | 683 |
| 612 | SSPE_BA_114_137P_F | TCAAGCAAACGCACAACaUaAGAAGC | 321 | SSPE_BA_196_222P_R | TTGCACGTUaCaGTTTCAGTTGCAAATTC | 684 |
| 58 | SSPE_BA_115_137_F | CAAGCAAACGCACAATCAGAAGC | 322 | SSPE_BA_197_222_R | TGCACGTCTGTTTCAGTTGCAAATTC | 686 |
| 355 | SSPE_BA_115_137_TMOD_F | TCAAGCAAACGCACAATCAGAAGC | 321 | SSPE_BA_197_222_TMOD_R | TTGCACGTCTGTTTCAGTTGCAAATTC | 687 |
| 215 | SSPE_BA_121_137_F | AACGCACAATCAGAAGC | 323 | SSPE_BA_197_216_R | TCTGTTTCAGTTGCAAATTC | 685 |
| 699 | SSPE_BA_123_153_F | TGCACAATCAGAAGCTAAGAAAGCGCAAGCT | 324 | SSPE_BA_202_231_R | TTTCACAGCATGCACGTCTGTTTCAGTTGC | 688 |
| 704 | SSPE_BA_146_168_F | TGCAAGCTTCTGGTGCTAGCATT | 325 | SSPE_BA_242_267_R | TTGTGATTGTTTTGCAGCTGATTGTG | 689 |
| 702 | SSPE_BA_150_168_F | TGCTTCTGGTGCTAGCATT | 326 | SSPE_BA_243_264_R | TGATTGTTTTGCAGCTGATTGT | 691 |
| 610 | SSPE_BA_150_168P_F | TGCTTCTGGCaGUaCaAGUaATT | 326 | SSPE_BA_243_264P_R | TGATTGTTTTGUaAGUaTGACaCaGT | 691 |
| 700 | SSPE_BA_156_168_F | TGGTGCTAGCATT | 327 | SSPE_BA_243_255_R | TGCAGCTGATTGT | 690 |
| 608 | SSPE_BA_156_168P_F | TGGCaGUaCaAGUaATT | 327 | SSPE_BA_243_255P_R | TGUaAGUaTGACaCaGT | 690 |
| 705 | SSPE_BA_63_89_F | TGCTAGTTATGGTACAGAGTTTGCGAC | 328 | SSPE_BA_163_191_R | TCATAACTAGCATTTGTGCTTTGAATGCT | 682 |
| 703 | SSPE_BA_72_89_F | TGGTACAGAGTTTGCGAC | 329 | SSPE_BA_163_182_R | TCATTTGTGCTTTGAATGCT | 681 |
| 611 | SSPE_BA_72_89P_F | TGGTAUaAGAGCaCaCaGUaGAC | 329 | SSPE_BA_163_182P_R | TCATTTGTGCCaCaCaGAACaGUaT | 681 |
| 701 | SSPE_BA_75_89_F | TACAGAGTTTGCGAC | 330 | SSPE_BA_163_177_R | TGTGCTTTGAATGCT | 680 |
| 609 | SSPE_BA_75_89P_F | TAUaAGAGCaCaCaCGUaGAC | 330 | SSPE_BA_163_177P_R | TGTGCCaCaCaGAACaGUaT | 680 |
| 1099 | TOXR_VBC_135_158_F | TCGATTAGGCAGCAACGAAAGCCG | 331 | TOXR_VBC_221_246_R | TTCAAAACCTTGCTCTCGCCAAACAA | 692 |
| 905 | TRPE_AY094355_1064_1086_F | TCGACCTTTGGCAGGAACTAGAC | 332 | TRPE_AY094355_1171_1196_R | TACATCGTTTCGCCCAAGATCAATCA | 693 |
| 904 | TRPE_AY094355_1278_1303_F | TCAAATGTACAAGGTGAAGTGCGTGA | 333 | TRPE_AY094355_1392_1418_R | TCCTCTTTTCACAGGCTCTACTTCATC | 694 |
| 903 | TRPE_AY094355_1445_1471_F | TGGATGGCATGGTGAAATGGATATGTC | 334 | TRPE_AY094355_1551_1580_R | TATTTGGGTTTCATTCCACTCAGATTCTGG | 695 |
| 902 | TRPE_AY094355_1467_1491_F | ATGTCGATTGCAATCCGTACTTGTG | 335 | TRPE_AY094355_1569_1592_R | TGCGCGAGCTTTTATTTGGGTTTC | 696 |
| 906 | TRPE_AY094355_666_688_F | GTGCATGCGGATACAGAGCAGAG | 336 | TRPE_AY094355_769_791_R | TTCAAAATGCGGAGGCGTATGTG | 697 |
| 907 | TRPE_AY094355_757_776_F | TGCAAGCGCGACCACATACG | 337 | TRPE_AY094355_864_883_R | TGCCCAGGTACAACCTGCAT | 698 |
| 114 | TUFB_EC_225_251_F | GCACTATGCACACGTAGATTGTCCTGG | 338 | TUFB_EC_284_309_R | TATAGCACCATCCATCTGAGCGGCAC | 706 |
| 60 | TUFB_EC_239_259_2_F | TTGACTGCCCAGGTCACGCTG | 339 | TUFB_EC_283_303_2_R | GCCGTCCATTTGAGCAGCACC | 704 |
| 59 | TUFB_EC_239_259_F | TAGACTGCCCAGGACACGCTG | 340 | TUFB_EC_283_303_R | GCCGTCCATCTGAGCAGCACC | 705 |
| 942 | TUFB_EC_251_278_F | TGCACGCCGACTATGTTAAGAACATGAT | 341 | TUFB_EC_337_360_R | TATGTGCTCACGAGTTTGCGGCAT | 707 |
| 941 | TUFB_EC_275_299_F | TGATCACTGGTGCTGCTCAGATGGA | 342 | TUFB_EC_337_362_R | TGGATGTGCTCACGAGTCTGTGGCAT | 708 |
| 117 | TUFB_EC_757_774_F | AAGACGACCTGCACGGGC | 343 | TUFB_EC_849_867_R | GCGCTCCACGTCTTCACGC | 709 |
| 293 | TUFB_EC_957_979_F | CCACACGCCGTTCTTCAACAACT | 344 | TUFB_EC_1034_1058_R | GGCATCACCATTTCCTTGTCCTTCG | 700 |
| 367 | TUFB_EC_957_979_TMOD_F | TCCACACGCCGTTCTTCAACAACT | 345 | TUFB_EC_1034_1058_TMOD_R | TGGCATCACCATTTCCTTGTCCTTCG | 701 |
| 62 | TUFB_EC_976_1000_2_F | AACTACCGTCCTCAGTTCTACTTCC | 346 | TUFB_EC_1045_1068_2_R | GTTGTCACCAGGCATTACCATTTC | 702 |
| 61 | TUFB_EC_976_1000_F | AACTACCGTCCGCAGTTCTACTTCC | 347 | TUFB_EC_1045_1068_R | GTTGTCGCCAGGCATAACCATTTC | 703 |
| 63 | TUFB_EC_985_1012_F | CCACAGTTCTACTTCCGTACTACTGACG | 348 | TUFB_EC_1033_1062_R | TCCAGGCATTACCATTTCTACTCCTTCTGG | 699 |
| 225 | VALS_EC_1105_1124_F | CGTGGCGGCGTGGTTATCGA | 349 | VALS_EC_1195_1214_R | ACGAACTGGATGTCGCCGTT | 710 |
| 71 | VALS_EC_1105_1124_F | CGTGGCGGCGTGGTTATCGA | 349 | VALS_EC_1195_1218_R | CGGTACGAACTGGATGTCGCCGTT | 711 |
| 358 | VALS_EC_1105_1124_TMOD_F | TCGTGGCGGCGTGGTTATCGA | 350 | VALS_EC_1195_1218_TMOD_R | TCGGTACGAACTGGATGTCGCCGTT | 712 |
| 965 | VALS_EC_1128_1151_F | TATGCTGACCGACCAGTGGTACGT | 351 | VALS_EC_1231_1257_R | TTCGCGCATCCAGGAGAAGTACATGTT | 713 |
| 112 | VALS_EC_1833_1850_F | CGACGCGCTGCGCTTCAC | 352 | VALS_EC_1920_1943_R | GCGTTCCACAGCTTGTTGCAGAAG | 714 |
| 116 | VALS_EC_1920_1943_F | CTTCTGCAACAAGCTGTGGAACGC | 353 | VALS_EC_1948_1970_R | TCGCAGTTCATCAGCACGAAGCG | 715 |
| 295 | VALS_EC_610_649_F | ACCGAGCAAGGAGACCAGC | 354 | VALS_EC_705_727_R | TATAACGCACATCGTCAGGGTGA | 716 |
| 931 | WAAA_Z96925_2_29_F | TCTTGCTCTTTCGTGAGTTCAGTAAATG | 355 | WAAA_Z96925_115_138_R | CAAGCGGTTTGCCTCAAATAGTCA | 717 |
| 932 | WAAA_Z96925_286_311_F | TCGATCTGGTTTCATGCTGTTTCAGT | 356 | WAAA_Z96925_394_412_R | TGGCACGAGCCTGACCTGT | 718 |

Primer pair name codes and reference sequences are shown in Table 2. The primer name code typically represents the gene to which the given primer pair is targeted. The primer pair name includes coordinates with respect to a reference sequence defined by an extraction of a section of sequence or defined by a GenBank gi number, or the corresponding complementary sequence of the extraction, or the entire GenBank gi number as indicated by the label “no extraction.” Where “no extraction” is indicated for a reference sequence, the coordinates of a primer pair named to the reference sequence are with respect to the GenBank gi listing. Gene abbreviations are shown in bold type in the “Gene Name” column.
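The naming convention described above can be illustrated with a short parsing sketch. This is only an illustration under stated assumptions: the regular expression is inferred from the primer names listed in this section (for example `RPOC_EC_1018_1045_2_F`, `MECA_Y14051_4520_4530P_F`, `SP101_SPET11_1_29_TMOD_F`), and the `PrimerName` type and `parse_primer_name` function are hypothetical helpers, not part of the disclosure. The sketch does not interpret what the `TMOD`, `P`, or `_2` suffixes mean chemically; it only separates them.

```python
import re
from typing import NamedTuple, Optional


class PrimerName(NamedTuple):
    """Fields of a primer name (hypothetical helper type)."""
    code: str                # primer name code, e.g. "RPOC_EC"
    start: int               # first reference coordinate
    end: int                 # last reference coordinate
    variant: Optional[int]   # e.g. the "_2" in "..._1045_2_F", if present
    modifier: Optional[str]  # "TMOD", "P", or None
    direction: str           # "F" (forward) or "R" (reverse)

    @property
    def span(self) -> int:
        """Number of reference positions covered by the coordinates."""
        return self.end - self.start + 1


# Pattern inferred from the names in this section (an assumption).
_NAME_RE = re.compile(
    r"^(?P<code>.+?)_(?P<start>\d+)_(?P<end>\d+)"
    r"(?P<prop>P)?(?:_(?P<variant>\d+))?(?:_(?P<tmod>TMOD))?_(?P<dir>[FR])$"
)


def parse_primer_name(name: str) -> PrimerName:
    """Split a primer name into code, coordinates, suffixes and direction."""
    m = _NAME_RE.match(name)
    if m is None:
        raise ValueError(f"unrecognized primer name: {name!r}")
    modifier = "TMOD" if m["tmod"] else ("P" if m["prop"] else None)
    variant = int(m["variant"]) if m["variant"] else None
    return PrimerName(m["code"], int(m["start"]), int(m["end"]),
                      variant, modifier, m["dir"])


if __name__ == "__main__":
    for n in ("INFB_EC_1969_1994_F", "RPOC_EC_1018_1045_2_F",
              "MECA_Y14051_4520_4530P_F", "SP101_SPET11_1_29_TMOD_F"):
        print(n, "->", parse_primer_name(n))
```

The `code` field returned by the parser is the key that would be looked up in the "Primer name code" column of Table 2 to find the reference sequence against which the coordinates are numbered.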









TABLE 2

Primer Name Codes and Reference Sequences

| Primer name code | Gene Name | Organism | Reference GenBank gi number | Extracted gene coordinates of gi number | Extraction or entire gene SEQ ID NO: |
|---|---|---|---|---|---|
| 16S_EC | 16S rRNA (16S ribosomal RNA gene) | Escherichia coli | 16127994 | 4033120..4034661 | 719 |
| 23S_EC | 23S rRNA (23S ribosomal RNA gene) | Escherichia coli | 16127994 | 4166220..4169123 | 720 |
| CAPC_BA | capC (capsule biosynthesis gene) | Bacillus anthracis | 6470151 | complement (55628..56074) | 721 |
| CYA_BA | cya (cyclic AMP gene) | Bacillus anthracis | 4894216 | complement (154288..156626) | 722 |
| DNAK_EC | dnaK (chaperone dnaK gene) | Escherichia coli | 16127994 | 12163..14079 | 723 |
| GROL_EC | groL (chaperonin groL) | Escherichia coli | 16127994 | 4368603..4370249 | 724 |
| HFLB_EC | hflB (cell division protein peptidase ftsH) | Escherichia coli | 16127994 | complement (3322645..3324576) | 725 |
| INFB_EC | infB (protein chain initiation factor infB gene) | Escherichia coli | 16127994 | complement (3310983..3313655) | 726 |
| LEF_BA | lef (lethal factor) | Bacillus anthracis | 21392688 | complement (149357..151786) | 727 |
| PAG_BA | pag (protective antigen) | Bacillus anthracis | 21392688 | 143779..146073 | 728 |
| RPLB_EC | rplB (50S ribosomal protein L2) | Escherichia coli | 16127994 | 3449001..3448180 | 729 |
| RPOB_EC | rpoB (DNA-directed RNA polymerase beta chain) | Escherichia coli | 16127994 | complement (4178823..4182851) | 730 |
| RPOC_EC | rpoC (DNA-directed RNA polymerase beta′ chain) | Escherichia coli | 16127994 | 4182928..4187151 | 731 |
| SP101_SPET11 | Concatenation comprising: | Artificial Sequence* - partial gene sequences of Streptococcus pyogenes | 15674250 | | 732 |
| | gki (glucose kinase) | | | complement (1258294..1258791) | |
| | gtr (glutamine transporter protein) | | | complement (1236751..1237200) | |
| | murI (glutamate racemase) | | | 312732..313169 | |
| | mutS (DNA mismatch repair protein) | | | complement (1787602..1788007) | |
| | xpt (xanthine phosphoribosyl transferase) | | | 930977..931425 | |
| | yqiL (acetyl-CoA-acetyl transferase) | | | 129471..129903 | |
| | tkt (transketolase) | | | 1391844..1391386 | |
| SSPE_BA | sspE (small acid-soluble spore protein) | Bacillus anthracis | 30253828 | 226496..226783 | 733 |
| TUFB_EC | tufB (Elongation factor Tu) | Escherichia coli | 16127994 | 4173523..4174707 | 734 |
| VALS_EC | valS (Valyl-tRNA synthetase) | Escherichia coli | 16127994 | complement (4481405..4478550) | 735 |
| ASPS_EC | aspS (Aspartyl-tRNA synthetase) | Escherichia coli | 16127994 | complement (1946777..1948546) | 736 |
| CAF1_AF053947 | caf1 (capsular protein caf1) | Yersinia pestis | 2996286 | No extraction - GenBank coordinates used | |
| INV_U22457 | inv (invasin) | Yersinia pestis | 1256565 | 74..3772 | 737 |
| LL_NC003143 | Y. pestis specific chromosomal genes - difference region | Yersinia pestis | 16120353 | No extraction - GenBank coordinates used | |
| BONTA_X52066 | BoNT/A (neurotoxin type A) | Clostridium botulinum | 40381 | 77..3967 | 738 |
| MECA_Y14051 | mecA methicillin resistance gene | Staphylococcus aureus | 2791983 | No extraction - GenBank coordinates used | 739 |
| TRPE_AY094355 | trpE (anthranilate synthase (large component)) | Acinetobacter baumannii | 20853695 | No extraction - GenBank coordinates used | 740 |
| RECA_AF251469 | recA (recombinase A) | Acinetobacter baumannii | 9965210 | No extraction - GenBank coordinates used | 741 |
| GYRA_AF100557 | gyrA (DNA gyrase subunit A) | Acinetobacter baumannii | 4240540 | No extraction - GenBank coordinates used | 742 |
| GYRB_AB008700 | gyrB (DNA gyrase subunit B) | Acinetobacter baumannii | 4514436 | No extraction - GenBank coordinates used | 743 |
| WAAA_Z96925 | waaA (3-deoxy-D-manno-octulosonic-acid transferase) | Acinetobacter baumannii | 2765828 | No extraction - GenBank coordinates used | 744 |
| CJST_CJ | Concatenation comprising: | Artificial Sequence* - partial gene sequences of Campylobacter jejuni | 15791399 | | 745 |
| | tkt (transketolase) | | | 1569415..1569873 | |
| | glyA (serine hydroxymethyltransferase) | | | 367573..368079 | |
| | gltA (citrate synthase) | | | complement (1604529..1604930) | |
| | aspA (aspartate ammonia lyase) | | | 96692..97168 | |
| | glnA (glutamine synthase) | | | complement (657609..658085) | |
| | pgm (phosphoglycerate mutase) | | | 327773..328270 | |
| | uncA (ATP synthetase alpha chain) | | | 112163..112651 | |
| RNASEP_BDP | RNase P (ribonuclease P) | Bordetella pertussis | 33591275 | complement (3226720..3227933) | 746 |
| RNASEP_BKM | RNase P (ribonuclease P) | Burkholderia mallei | 53723370 | complement (2527296..2528220) | 747 |
| RNASEP_BS | RNase P (ribonuclease P) | Bacillus subtilis | 16077068 | complement (2330250..2330962) | 748 |
| RNASEP_CLB | RNase P (ribonuclease P) | Clostridium perfringens | 18308982 | complement (2291757..2292584) | 749 |
| RNASEP_EC | RNase P (ribonuclease P) | Escherichia coli | 16127994 | complement (3267457..3268233) | 750 |
| RNASEP_RKP | RNase P (ribonuclease P) | Rickettsia prowazekii | 15603881 | complement (605276..606109) | 751 |
| RNASEP_SA | RNase P (ribonuclease P) | Staphylococcus aureus | 15922990 | complement (1559869..1560651) | 752 |
| RNASEP_VBC | RNase P (ribonuclease P) | Vibrio cholerae | 15640032 | complement (2580367..2581452) | 753 |
| ICD_CXB | icd (isocitrate dehydrogenase) | Coxiella burnetii | 29732244 | complement (1143867..1144235) | 754 |
| IS1111A | multi-locus IS1111A insertion element | Coxiella burnetii | 29732244 | No extraction | |
| OMPA_AY485227 | ompA (outer membrane protein A) | Rickettsia prowazekii | 40287451 | No extraction | 755 |
| OMPB_RKP | ompB (outer membrane protein B) | Rickettsia prowazekii | 15603881 | complement (881264..886195) | 756 |
| GLTA_RKP | gltA (citrate synthase) | Rickettsia prowazekii | 15603881 | complement (1062547..1063857) | 757 |
| TOXR_VBC | toxR (transcription regulator toxR) | Vibrio cholerae | 15640032 | complement (1047143..1048024) | 758 |
| ASD_FRT | asd (Aspartate semialdehyde dehydrogenase) | Francisella tularensis | 56707187 | complement (438608..439702) | 759 |
| GALE_FRT | galE (UDP-glucose 4-epimerase) | Francisella tularensis | 56707187 | 809039..810058 | 760 |
| IPAH_SGF | ipaH (invasion plasmid antigen) | Shigella flexneri | 30061571 | 2210775..2211614 | 761 |
| HUPB_CJ | hupB (DNA-binding protein Hu-beta) | Campylobacter jejuni | 15791399 | complement (849317..849819) | 762 |
| AB_MLST | Concatenation comprising: | Artificial Sequence* - partial gene sequences of Acinetobacter baumannii | | Sequenced in-house | 763 |
| | trpE (anthranilate synthase component I) | | | | |
| | adk (adenylate kinase) | | | | |
| | mutY (adenine glycosylase) | | | | |
| | fumC (fumarate hydratase) | | | | |
| | efp (elongation factor p) | | | | |
| | ppa (pyrophosphate phospho-hydratase) | | | | |

*Note: These artificial reference sequences represent concatenations of partial gene extractions from the indicated reference gi number. Partial sequences were used to create the concatenated sequence because complete gene sequences were not necessary for primer design. The stretches of arbitrary residues “N”s were added for the convenience of separation of the partial gene extractions (100N for SP101_SPET11 (SEQ ID NO: 732); 50N for CJST_CJ (SEQ ID NO: 745); and 40N for AB_MLST (SEQ ID NO: 763)).
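The construction described in the note above can be sketched in a few lines of Python. This is only an illustration: the fragment sequences and the function name `concatenate_with_spacers` are hypothetical stand-ins, and only the spacer length (40 N residues, as stated for AB_MLST) comes from the text.

```python
def concatenate_with_spacers(partial_sequences, spacer_length):
    """Join partial gene extractions, separating them with a run of 'N' residues."""
    spacer = "N" * spacer_length
    return spacer.join(partial_sequences)


# Hypothetical, very short stand-ins for partial gene extractions.
fragments = ["ATGGCT", "TTGACC", "GGATCA"]

# 40N spacers, as stated in the note for AB_MLST (SEQ ID NO: 763).
reference = concatenate_with_spacers(fragments, spacer_length=40)
print(len(reference), reference.count("N"))
```

Because the spacers contain only N, primers designed against one partial gene cannot accidentally span the junction into the next one.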






Example 2
DNA Isolation and Amplification

Genomic materials from culture samples or swabs were prepared using the DNeasy® 96 Tissue Kit (Qiagen, Valencia, Calif.). All PCR reactions were assembled in 50 μL volumes in the 96-well microtiter plate format using a Packard MPII liquid handling robotic platform and MJ Dyad® thermocyclers (MJ Research, Waltham, Mass.). The PCR reaction consisted of 4 units of Amplitaq Gold®, 1× buffer II (Applied Biosystems, Foster City, Calif.), 1.5 mM MgCl2, 0.4 M betaine, 800 μM dNTP mix, and 250 nM of each primer.


The following PCR conditions were used to amplify the sequences used for mass spectrometry analysis: 95 °C for 10 minutes followed by 8 cycles of 95 °C for 30 seconds, 48 °C for 30 seconds, and 72 °C for 30 seconds, with the 48 °C annealing temperature increased 0.9 °C after each cycle. The PCR was then continued for 37 additional cycles of 95 °C for 15 seconds, 56 °C for 20 seconds, and 72 °C for 20 seconds.
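The cycling conditions above can be written out as an explicit step list. The sketch below is illustrative only: the function name and the tuple layout are assumptions, and it reads "increased 0.9 °C after each cycle" as cycle 1 annealing at 48.0 °C and cycle 8 at 54.3 °C.

```python
def thermocycling_program():
    """Return the program as a list of (step name, temperature in deg C, hold time in s)."""
    steps = [("initial denaturation", 95.0, 600)]
    # Stage 1: 8 cycles, annealing temperature raised by 0.9 deg C after each cycle.
    for cycle in range(8):
        anneal = round(48.0 + 0.9 * cycle, 1)
        steps += [("denature", 95.0, 30), ("anneal", anneal, 30), ("extend", 72.0, 30)]
    # Stage 2: 37 further cycles at a fixed 56 deg C annealing temperature.
    for _ in range(37):
        steps += [("denature", 95.0, 15), ("anneal", 56.0, 20), ("extend", 72.0, 20)]
    return steps


program = thermocycling_program()
stage1_annealing = [temp for name, temp, _ in program if name == "anneal"][:8]
hold_time_s = sum(seconds for _, _, seconds in program)
print(stage1_annealing)
print(hold_time_s)
```

The summed hold time excludes ramping between temperatures, so the actual run time on a thermocycler would be longer.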


Example 3
Solution Capture Purification of PCR Products for Mass Spectrometry with Ion Exchange Resin-Magnetic Beads

For solution capture of nucleic acids with ion exchange resin linked to magnetic beads, 25 μL of a 2.5 mg/mL suspension of BioClon amine terminated superparamagnetic beads were added to 25 to 50 μL of a PCR reaction containing approximately 10 μM of a typical PCR amplification product. The above suspension was mixed for approximately 5 minutes by vortexing or pipetting, after which the liquid was removed using a magnetic separator. The beads containing bound PCR amplification product were then washed 3× with 50 mM ammonium bicarbonate/50% MeOH or 100 mM ammonium bicarbonate/50% MeOH, followed by three more washes with 50% MeOH. The bound PCR amplicon was eluted with 25 mM piperidine, 25 mM imidazole, 35% MeOH, plus peptide calibration standards.


Example 4
Mass Spectrometry and Base Composition Analysis

The ESI-FTICR mass spectrometer is based on a Bruker Daltonics (Billerica, Mass.) Apex II 70e electrospray ionization Fourier transform ion cyclotron resonance mass spectrometer that employs an actively shielded 7 Tesla superconducting magnet. The active shielding constrains the majority of the fringing magnetic field from the superconducting magnet to a relatively small volume. Thus, components that might be adversely affected by stray magnetic fields, such as CRT monitors, robotic components, and other electronics, can operate in close proximity to the FTICR spectrometer. All aspects of pulse sequence control and data acquisition were performed on a 600 MHz Pentium II data station running Bruker's Xmass software under the Windows NT 4.0 operating system. Sample aliquots, typically 15 μl, were extracted directly from 96-well microtiter plates using a CTC HTS PAL autosampler (LEAP Technologies, Carrboro, N.C.) triggered by the FTICR data station. Samples were injected directly into a 10 μl sample loop integrated with a fluidics handling system that supplies the 100 μl/hr flow rate to the ESI source. Ions were formed via electrospray ionization in a modified Analytica (Branford, Conn.) source employing an off-axis, grounded electrospray probe positioned approximately 1.5 cm from the metalized terminus of a glass desolvation capillary. The atmospheric pressure end of the glass capillary was biased at 6000 V relative to the ESI needle during data acquisition. A counter-current flow of dry N2 was employed to assist in the desolvation process. Ions were accumulated in an external ion reservoir comprised of an rf-only hexapole, a skimmer cone, and an auxiliary gate electrode, prior to injection into the trapped ion cell where they were mass analyzed. Ionization duty cycles >99% were achieved by simultaneously accumulating ions in the external ion reservoir during ion detection. Each detection event consisted of 1M data points digitized over 2.3 s. To improve the signal-to-noise ratio (S/N), 32 scans were co-added for a total data acquisition time of 74 s.
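The acquisition-time figure follows directly from the scan count and scan duration. The short sketch below reproduces that arithmetic; the square-root estimate of the S/N gain is a textbook assumption (uncorrelated noise between scans) and is not a figure stated in the text.

```python
import math

scan_duration_s = 2.3   # one detection event, from the text
n_scans = 32            # co-added scans, from the text

total_time_s = n_scans * scan_duration_s
# Assumption: noise is uncorrelated between scans, so S/N grows as sqrt(N).
expected_snr_gain = math.sqrt(n_scans)

print(f"total acquisition time: {total_time_s:.1f} s")
print(f"expected S/N gain: {expected_snr_gain:.2f}x")
```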


The ESI-TOF mass spectrometer is based on a Bruker Daltonics MicroTOF™. Ions from the ESI source undergo orthogonal ion extraction and are focused in a reflectron prior to detection. The TOF and FTICR are equipped with the same automated sample handling and fluidics described above. Ions are formed in the standard MicroTOF™ ESI source that is equipped with the same off-axis sprayer and glass capillary as the FTICR ESI source. Consequently, source conditions were the same as those described above. External ion accumulation was also employed to improve ionization duty cycle during data acquisition. Each detection event on the TOF was comprised of 75,000 data points digitized over 75 μs.


The sample delivery scheme allows sample aliquots to be rapidly injected into the electrospray source at high flow rate and subsequently be electrosprayed at a much lower flow rate for improved ESI sensitivity. Prior to injecting a sample, a bolus of buffer was injected at a high flow rate to rinse the transfer line and spray needle to avoid sample contamination/carryover. Following the rinse step, the autosampler injected the next sample and the flow rate was switched to low flow. Following a brief equilibration delay, data acquisition commenced. As spectra were co-added, the autosampler continued rinsing the syringe and picking up buffer to rinse the injector and sample transfer line. In general, two syringe rinses and one injector rinse were required to minimize sample carryover. During a routine screening protocol a new sample mixture was injected every 106 seconds. More recently a fast wash station for the syringe needle has been implemented which, when combined with shorter acquisition times, facilitates the acquisition of mass spectra at a rate of just under one spectrum/minute.


Raw mass spectra were post-calibrated with an internal mass standard and deconvoluted to monoisotopic molecular masses. Unambiguous base compositions were derived from the exact mass measurements of the complementary single-stranded oligonucleotides. Quantitative results are obtained by comparing the peak heights with an internal PCR calibration standard present in every PCR well at 500 molecules per well for the ribosomal DNA-targeted primers and 100 molecules per well for the protein-encoding gene targets. Calibration methods are commonly owned and disclosed in U.S. Provisional Patent Application Ser. No. 60/545,425.
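As a rough illustration of quantitation against an internal calibrant, the sketch below scales the known calibrant copy number by the ratio of peak heights. This simple proportional model is an assumption made for illustration (it presumes the analyte and calibrant amplify and ionize equally well); the calibration method actually used is the one disclosed in the referenced application, and the function name and peak heights here are hypothetical.

```python
def estimate_input_copies(analyte_peak_height, calibrant_peak_height, calibrant_copies):
    """Estimate starting copies of analyte from the peak-height ratio to the calibrant."""
    if calibrant_peak_height <= 0:
        raise ValueError("calibrant peak height must be positive")
    return calibrant_copies * analyte_peak_height / calibrant_peak_height


# Hypothetical peak heights; 100 calibrant molecules per well, as stated in the
# text for protein-encoding gene targets.
estimate = estimate_input_copies(3.0e5, 1.0e5, calibrant_copies=100)
print(estimate)
```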


Example 5
De Novo Determination of Base Composition of Amplification Products using Molecular Mass Modified Deoxynucleotide Triphosphates

Because the molecular masses of the four natural nucleobases have a relatively narrow molecular mass range (A=313.058, G=329.052, C=289.046, T=304.046—See Table 3), a persistent source of ambiguity in assignment of base composition can occur as follows: two nucleic acid strands having different base composition may have a difference of about 1 Da when the base composition difference between the two strands is G⇄A (−15.994) combined with C⇄T (+15.000). For example, one 99-mer nucleic acid strand having a base composition of A27G30C21T21 has a theoretical molecular mass of 30779.058 while another 99-mer nucleic acid strand having a base composition of A26G31C22T20 has a theoretical molecular mass of 30780.052. A 1 Da difference in molecular mass may be within the experimental error of a molecular mass measurement and thus, the relatively narrow molecular mass range of the four natural nucleobases imposes an uncertainty factor.
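The two theoretical masses quoted above can be checked by summing the per-residue masses for each base composition, which is the arithmetic the paragraph uses. The following sketch does only that; the function and variable names are illustrative.

```python
# Per-residue masses quoted in the text (see Table 3).
NUCLEOBASE_MASS = {"A": 313.058, "G": 329.052, "C": 289.046, "T": 304.046}


def composition_mass(composition, masses=NUCLEOBASE_MASS):
    """Sum the per-residue masses for a base composition given as {base: count}."""
    return sum(masses[base] * count for base, count in composition.items())


strand_1 = {"A": 27, "G": 30, "C": 21, "T": 21}   # A27G30C21T21
strand_2 = {"A": 26, "G": 31, "C": 22, "T": 20}   # A26G31C22T20

mass_1 = composition_mass(strand_1)
mass_2 = composition_mass(strand_2)
print(f"{mass_1:.3f} {mass_2:.3f} {mass_2 - mass_1:.3f}")
# prints: 30779.058 30780.052 0.994
```

The 0.994 difference is the sum of one A→G change (+15.994) and one T→C change (−15.000), which is why two different 99-mers can sit within about 1 Da of each other.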


The present invention provides a means for removing this theoretical 1 Da uncertainty factor through amplification of a nucleic acid with one mass-tagged nucleobase and three natural nucleobases. The term “nucleobase” as used herein is synonymous with other terms in use in the art including “nucleotide,” “deoxynucleotide,” “nucleotide residue,” “deoxynucleotide residue,” “nucleotide triphosphate (NTP),” or “deoxynucleotide triphosphate (dNTP).”


Addition of significant mass to one of the 4 nucleobases (dNTPs) in an amplification reaction, or in the primers themselves, will result in a significant difference in mass of the resulting amplification product (significantly greater than 1 Da), which resolves the ambiguity arising from the G⇄A combined with C⇄T event (Table 3). Thus, the same G⇄A (−15.994) event combined with a 5-Iodo-C⇄T (−110.900) event would result in a molecular mass difference of 126.894. If the molecular mass of the base composition A27G30 5-Iodo-C21T21 (33422.958) is compared with that of A26G31 5-Iodo-C22T20 (33549.852), the theoretical molecular mass difference is +126.894. The experimental error of a molecular mass measurement is not significant with regard to this molecular mass difference. Furthermore, the only base composition consistent with a measured molecular mass of the 99-mer nucleic acid is A27G30 5-Iodo-C21T21. In contrast, the analogous amplification without the mass tag has 18 possible base compositions.
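The effect of the mass tag can be explored by brute force: enumerate every 99-mer base composition and keep those whose theoretical mass falls within a tolerance of the measured mass. The sketch below is an illustration under stated assumptions: the ±1.0 Da tolerance is hypothetical, so the candidate counts it prints depend on that choice and are not claimed to reproduce the figures given in the text.

```python
NATURAL = {"A": 313.058, "G": 329.052, "C": 289.046, "T": 304.046}
TAGGED = {**NATURAL, "C": 414.946}   # 5-Iodo-C takes the place of C


def mass(comp, masses):
    """Theoretical mass of a composition given as (A, G, C, T) counts."""
    a, g, c, t = comp
    return a * masses["A"] + g * masses["G"] + c * masses["C"] + t * masses["T"]


def candidates(measured, length, masses, tolerance):
    """All (A, G, C, T) compositions of the given length within tolerance of measured."""
    hits = []
    for a in range(length + 1):
        for g in range(length - a + 1):
            for c in range(length - a - g + 1):
                comp = (a, g, c, length - a - g - c)
                if abs(mass(comp, masses) - measured) <= tolerance:
                    hits.append(comp)
    return hits


TRUE_COMP = (27, 30, 21, 21)   # A27 G30 C21 T21, from the text
NEIGHBOR = (26, 31, 22, 20)    # A26 G31 C22 T20, from the text
TOLERANCE_DA = 1.0             # hypothetical measurement tolerance

natural_hits = candidates(mass(TRUE_COMP, NATURAL), 99, NATURAL, TOLERANCE_DA)
tagged_hits = candidates(mass(TRUE_COMP, TAGGED), 99, TAGGED, TOLERANCE_DA)

print(len(natural_hits), len(tagged_hits))
print(NEIGHBOR in natural_hits, NEIGHBOR in tagged_hits)   # prints: True False
```

With natural nucleobases the neighboring composition lies 0.994 Da away and survives the filter; with 5-Iodo-C it lies 126.894 Da away and is excluded.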









TABLE 3

Molecular Masses of Natural Nucleobases and the Mass-Modified Nucleobase 5-Iodo-C and Molecular Mass Differences Resulting from Transitions

Nucleobase    Molecular Mass    Transition        Δ Molecular Mass
A             313.058           A-->T             −9.012
A             313.058           A-->C             −24.012
A             313.058           A-->5-Iodo-C      101.888
A             313.058           A-->G             15.994
T             304.046           T-->A             9.012
T             304.046           T-->C             −15.000
T             304.046           T-->5-Iodo-C      110.900
T             304.046           T-->G             25.006
C             289.046           C-->A             24.012
C             289.046           C-->T             15.000
C             289.046           C-->G             40.006
5-Iodo-C      414.946           5-Iodo-C-->A      −101.888
5-Iodo-C      414.946           5-Iodo-C-->T      −110.900
5-Iodo-C      414.946           5-Iodo-C-->G      −85.894
G             329.052           G-->A             −15.994
G             329.052           G-->T             −25.006
G             329.052           G-->C             −40.006
G             329.052           G-->5-Iodo-C      85.894









Example 6
Data Processing

Mass spectra of bioagent identifying amplicons are analyzed independently using a maximum-likelihood processor, such as is widely used in radar signal processing. This processor, referred to as GenX, first makes maximum likelihood estimates of the input to the mass spectrometer for each primer by running matched filters for each base composition aggregate on the input data. This includes the GenX response to a calibrant for each primer.


The algorithm emphasizes performance predictions culminating in probability-of-detection versus probability-of-false-alarm plots for conditions involving complex backgrounds of naturally occurring organisms and environmental contaminants. Matched filters consist of a priori expectations of signal values given the set of primers used for each of the bioagents. A genomic sequence database is used to define the mass base count matched filters. The database contains the sequences of known bacterial bioagents and includes threat organisms as well as benign background organisms. The latter is used to estimate and subtract the spectral signature produced by the background organisms. A maximum likelihood detection of known background organisms is implemented using matched filters and a running-sum estimate of the noise covariance. Background signal strengths are estimated and used along with the matched filters to form signatures which are then subtracted. The maximum likelihood process is applied to this “cleaned up” data in a similar manner employing matched filters for the organisms and a running-sum estimate of the noise-covariance for the cleaned up data.
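The two-stage idea described above (estimate and subtract the background signature, then estimate the agent amplitude on the cleaned data) can be illustrated with a generic textbook matched filter. The sketch below is not the GenX processor: it assumes white noise, so the noise covariance is the identity rather than a running-sum estimate, and the templates and spectrum are toy values invented for the example.

```python
def matched_filter_amplitude(spectrum, template):
    """Maximum likelihood amplitude of `template` in `spectrum` under white Gaussian noise."""
    numerator = sum(x * s for x, s in zip(spectrum, template))
    denominator = sum(s * s for s in template)
    return numerator / denominator


def subtract_signature(spectrum, template):
    """Estimate the template amplitude, then remove that signature from the spectrum."""
    amplitude = matched_filter_amplitude(spectrum, template)
    cleaned = [x - amplitude * s for x, s in zip(spectrum, template)]
    return amplitude, cleaned


# Toy, non-overlapping templates (hypothetical expected peak shapes).
background_template = [0, 1, 2, 1, 0, 0, 0, 0]
agent_template = [0, 0, 0, 0, 0, 1, 2, 1]

# Toy noiseless spectrum: 3.0 x background plus 0.5 x agent.
spectrum = [3.0 * b + 0.5 * a for b, a in zip(background_template, agent_template)]

# Stage 1: estimate and subtract the background signature.
background_amp, cleaned = subtract_signature(spectrum, background_template)
# Stage 2: estimate the agent amplitude on the cleaned-up data.
agent_amp = matched_filter_amplitude(cleaned, agent_template)
print(background_amp, agent_amp)
```

With overlapping templates or colored noise, the dot products would need to be weighted by the inverse noise covariance, which is the role the running-sum covariance estimate plays in the description.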


The amplitudes of all base compositions of bioagent identifying amplicons for each primer are calibrated and a final maximum likelihood amplitude estimate per organism is made based upon the multiple single primer estimates. Models of all system noise are factored into this two-stage maximum likelihood calculation. The processor reports the number of molecules of each base composition contained in the spectra. The quantity of amplification product corresponding to the appropriate primer set is reported as well as the quantities of primers remaining upon completion of the amplification reaction.


Example 7
Use of Broad Range Survey and Division Wide Primer Pairs for Identification of Bacteria in an Epidemic Surveillance Investigation

This investigation employed a set of 16 primer pairs which is herein designated the “surveillance primer set” and comprises broad range survey primer pairs, division wide primer pairs and a single Bacillus clade primer pair. The surveillance primer set is shown in Table 4 and consists of primer pairs originally listed in Table 1. This surveillance set comprises primers with T modifications (note TMOD designation in primer names) which constitutes a functional improvement with regard to prevention of non-templated adenylation (vide supra) relative to originally selected primers which are displayed below in the same row. Primer pair 449 (non-T modified) has been modified twice. Its predecessors are primer pairs 70 and 357, displayed below in the same row. Primer pair 360 has also been modified twice and its predecessors are primer pairs 17 and 118.









TABLE 4

Bacterial Primer Pairs of the Surveillance Primer Set

Primer    Forward Primer Name          Forward Primer   Reverse Primer Name         Reverse Primer   Target Gene
Pair No.                               (SEQ ID NO:)                                 (SEQ ID NO:)
346       16S_EC_713_732_TMOD_F        27               16S_EC_789_809_TMOD_R       389              16S rRNA
10        16S_EC_713_732_F             26               16S_EC_789_809              388              16S rRNA
347       16S_EC_785_806_TMOD_F        30               16S_EC_880_897_TMOD_R       392              16S rRNA
11        16S_EC_785_806_F             29               16S_EC_880_897_R            391              16S rRNA
348       16S_EC_960_981_TMOD_F        38               16S_EC_1054_1073_TMOD_R     363              16S rRNA
14        16S_EC_960_981_F             37               16S_EC_1054_1073_R          362              16S rRNA
349       23S_EC_1826_1843_TMOD_F      49               23S_EC_1906_1924_TMOD_R     405              23S rRNA
16        23S_EC_1826_1843_F           48               23S_EC_1906_1924_R          404              23S rRNA
352       INFB_EC_1365_1393_TMOD_F     161              INFB_EC_1439_1467_TMOD_R    516              infB
34        INFB_EC_1365_1393_F          160              INFB_EC_1439_1467_R         515              infB
354       RPOC_EC_2218_2241_TMOD_F     262              RPOC_EC_2313_2337_TMOD_R    625              rpoC
52        RPOC_EC_2218_2241_F          261              RPOC_EC_2313_2337_R         624              rpoC
355       SSPE_BA_115_137_TMOD_F       321              SSPE_BA_197_222_TMOD_R      687              sspE
58        SSPE_BA_115_137_F            322              SSPE_BA_197_222_R           686              sspE
356       RPLB_EC_650_679_TMOD_F       232              RPLB_EC_739_762_TMOD_R      592              rplB
66        RPLB_EC_650_679_F            231              RPLB_EC_739_762_R           591              rplB
358       VALS_EC_1105_1124_TMOD_F     350              VALS_EC_1195_1218_TMOD_R    712              valS
71        VALS_EC_1105_1124_F          349              VALS_EC_1195_1218_R         711              valS
359       RPOB_EC_1845_1866_TMOD_F     241              RPOB_EC_1909_1929_TMOD_R    597              rpoB
72        RPOB_EC_1845_1866_F          240              RPOB_EC_1909_1929_R         596              rpoB
360       23S_EC_2646_2667_TMOD_F      60               23S_EC_2745_2765_TMOD_R     416              23S rRNA
118       23S_EC_2646_2667_F           59               23S_EC_2745_2765_R          415              23S rRNA
17        23S_EC_2645_2669_F           58               23S_EC_2744_2761_R          414              23S rRNA
361       16S_EC_1090_1111_2_TMOD_F    5                16S_EC_1175_1196_TMOD_R     370              16S rRNA
3         16S_EC_1090_1111_2_F         6                16S_EC_1175_1196_R          369              16S rRNA
362       RPOB_EC_3799_3821_TMOD_F     245              RPOB_EC_3862_3888_TMOD_R    603              rpoB
289       RPOB_EC_3799_3821_F          246              RPOB_EC_3862_3888_R         602              rpoB
363       RPOC_EC_2146_2174_TMOD_F     257              RPOC_EC_2227_2245_TMOD_R    621              rpoC
290       RPOC_EC_2146_2174_F          256              RPOC_EC_2227_2245_R         620              rpoC
367       TUFB_EC_957_979_TMOD_F       345              TUFB_EC_1034_1058_TMOD_R    701              tufB
293       TUFB_EC_957_979_F            344              TUFB_EC_1034_1058_R         700              tufB
449       RPLB_EC_690_710_F            237              RPLB_EC_737_758_R           589              rplB
357       RPLB_EC_688_710_TMOD_F       236              RPLB_EC_736_757_TMOD_R      588              rplB
67        RPLB_EC_688_710_F            235              RPLB_EC_736_757_R           587              rplB









The 16 primer pairs of the surveillance set are used to produce bioagent identifying amplicons whose base compositions are sufficiently different amongst all known bacteria at the species level to identify, at a reasonable confidence level, any given bacterium at the species level. As shown in Tables 6A-E, common respiratory bacterial pathogens can be distinguished by the base compositions of bioagent identifying amplicons obtained using the 16 primer pairs of the surveillance set. In some cases, triangulation identification improves the confidence level for species assignment. For example, nucleic acid from Streptococcus pyogenes can be amplified by nine of the sixteen surveillance primer pairs and Streptococcus pneumoniae can be amplified by ten of the sixteen surveillance primer pairs. The base compositions of the bioagent identifying amplicons are identical for only one of the analogous bioagent identifying amplicons and differ in all of the remaining analogous bioagent identifying amplicons by up to four bases per bioagent identifying amplicon. The resolving power of the surveillance set was confirmed by determination of base compositions for 120 isolates of respiratory pathogens representing 70 different bacterial species and the results indicated that natural variations (usually only one or two base substitutions per bioagent identifying amplicon) amongst multiple isolates of the same species did not prevent correct identification of major pathogenic organisms at the species level.
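The Streptococcus comparison above can be checked against the base compositions listed in Tables 6A-6C. The sketch below uses the values given there for Streptococcus pyogenes MGAS8232 and Streptococcus pneumoniae 670, restricted to the eight primer pairs in those three tables for which both species have an entry; the helper name and the choice of strains are illustrative. For amplicons of equal length, the minimum number of base substitutions separating two compositions is the sum of the positive count differences.

```python
# [A, G, C, T] counts transcribed from Tables 6A-6C.
S_PYOGENES = {   # strain MGAS8232
    346: (26, 32, 23, 18), 347: (24, 37, 30, 25), 348: (25, 31, 29, 31),
    349: (28, 31, 23, 19), 360: (33, 37, 24, 28), 356: (38, 31, 29, 23),
    449: (23, 21, 19, 12), 354: (24, 32, 30, 36),
}
S_PNEUMONIAE = {  # strain 670
    346: (26, 32, 23, 18), 347: (25, 35, 28, 28), 348: (25, 32, 29, 30),
    349: (28, 31, 22, 20), 360: (34, 36, 24, 28), 356: (37, 30, 29, 25),
    449: (22, 20, 19, 14), 354: (25, 33, 29, 35),
}


def min_substitutions(comp_a, comp_b):
    """Minimum base substitutions between two equal-length base compositions."""
    return sum(max(x - y, 0) for x, y in zip(comp_a, comp_b))


differences = {
    pair: min_substitutions(S_PYOGENES[pair], S_PNEUMONIAE[pair])
    for pair in sorted(S_PYOGENES)
}
identical_pairs = [pair for pair, diff in differences.items() if diff == 0]

print(differences)
print(identical_pairs, max(differences.values()))   # prints: [346] 4
```

For these strains only primer pair 346 gives identical compositions, and the largest difference is four bases (primer pair 347), in line with the statement in the text.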



Bacillus anthracis is a well known biological warfare agent which has emerged in domestic terrorism in recent years. Since it was envisioned to produce bioagent identifying amplicons for identification of Bacillus anthracis, additional drill-down analysis primers were designed to target genes present on virulence plasmids of Bacillus anthracis so that additional confidence could be reached in positive identification of this pathogenic organism. Three drill-down analysis primers were designed and are listed in Tables 1 and 5. In Table 5 the drill-down set comprises primers with T modifications (note TMOD designation in primer names) which constitutes a functional improvement with regard to prevention of non-templated adenylation (vide supra) relative to originally selected primers which are displayed below in the same row.









TABLE 5

Drill-Down Primer Pairs for Confirmation of Identification of Bacillus anthracis

Primer    Forward Primer Name          Forward Primer   Reverse Primer Name         Reverse Primer   Target Gene
Pair No.                               (SEQ ID NO:)                                 (SEQ ID NO:)
350       CAPC_BA_274_303_TMOD_F       98               CAPC_BA_349_376_TMOD_R      452              capC
24        CAPC_BA_274_303_F            97               CAPC_BA_349_376_R           451              capC
351       CYA_BA_1353_1379_TMOD_F      128              CYA_BA_1448_1467_TMOD_R     483              cyA
30        CYA_BA_1353_1379_F           127              CYA_BA_1448_1467_R          482              cyA
353       LEF_BA_756_781_TMOD_F        175              LEF_BA_843_872_TMOD_R       531              lef
37        LEF_BA_756_781_F             174              LEF_BA_843_872_R            530              lef









Phylogenetic coverage of bacterial space of the sixteen surveillance primers of Table 4 and the three Bacillus anthracis drill-down primers of Table 5 is shown in FIG. 3 which lists common pathogenic bacteria. FIG. 3 is not meant to be comprehensive in illustrating all species identified by the primers. Only pathogenic bacteria are listed as representative examples of the bacterial species that can be identified by the primers and methods of the present invention. Nucleic acid of groups of bacteria enclosed within the polygons of FIG. 3 can be amplified to obtain bioagent identifying amplicons using the primer pair numbers listed in the upper right hand corner of each polygon. Primer coverage for polygons within polygons is additive. As an illustrative example, bioagent identifying amplicons can be obtained for Chlamydia trachomatis by amplification with, for example, primer pairs 346-349, 360 and 361, but not with any of the remaining primers of the surveillance primer set. On the other hand, bioagent identifying amplicons can be obtained from nucleic acid originating from Bacillus anthracis (located within 5 successive polygons) using, for example, any of the following primer pairs: 346-349, 360, 361 (base polygon), 356, 449 (second polygon), 352 (third polygon), 355 (fourth polygon), 350, 351 and 353 (fifth polygon). Multiple coverage of a given organism with multiple primers provides for increased confidence level in identification of the organism as a result of enabling broad triangulation identification.
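The additive rule for nested polygons can be illustrated as a simple accumulation over the chain of enclosing polygons. The sketch below models only the chain named in the text for Bacillus anthracis; FIG. 3 itself is not reproduced here, and the list structure and function name are assumptions made for the example.

```python
# Primer pairs attached to each successive polygon, outermost first,
# as listed in the text for the Bacillus anthracis example.
POLYGON_CHAIN = [
    ("base polygon", [346, 347, 348, 349, 360, 361]),
    ("second polygon", [356, 449]),
    ("third polygon", [352]),
    ("fourth polygon", [355]),
    ("fifth polygon", [350, 351, 353]),
]


def additive_coverage(depth):
    """Primer pairs usable for an organism located `depth` polygons deep in the chain."""
    pairs = []
    for _, polygon_pairs in POLYGON_CHAIN[:depth]:
        pairs.extend(polygon_pairs)
    return sorted(pairs)


chlamydia_pairs = additive_coverage(1)   # organism in the base polygon only
anthracis_pairs = additive_coverage(5)   # organism within five successive polygons
print(chlamydia_pairs)
print(anthracis_pairs)
```

An organism deeper in the nesting inherits every primer pair of the polygons that enclose it, which is what makes broad triangulation possible for Bacillus anthracis.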


In Tables 6A-E, base compositions of respiratory pathogens for primer target regions are shown. Two entries in a cell represent variation in ribosomal DNA operons. The most predominant base composition is shown first and the minor one (frequently a single operon) is indicated by an asterisk (*). Entries with NO DATA mean that the primer would not be expected to prime this species due to mismatches between the primer and target region, as determined by theoretical PCR.









TABLE 6A

Base Compositions of Common Respiratory Pathogens for Bioagent Identifying Amplicons Corresponding to Primer Pair Nos: 346, 347 and 348

Organism                     Strain                         Primer 346       Primer 347       Primer 348
                                                            [A G C T]        [A G C T]        [A G C T]
Klebsiella pneumoniae        MGH78578                       [29 32 25 13]    [23 38 28 26]    [26 32 28 30]
                                                            [29 31 25 13]*   [23 37 28 26]*   [26 31 28 30]*
Yersinia pestis              CO-92 Biovar Orientalis        [29 32 25 13]    [22 39 28 26]    [29 30 28 29]
                                                                                              [30 30 27 29]*
Yersinia pestis              KIM5 P12 (Biovar Mediaevalis)  [29 32 25 13]    [22 39 28 26]    [29 30 28 29]
Yersinia pestis              91001                          [29 32 25 13]    [22 39 28 26]    [29 30 28 29]
                                                                                              [30 30 27 29]*
Haemophilus influenzae       KW20                           [28 31 23 17]    [24 37 25 27]    [29 30 28 29]
Pseudomonas aeruginosa       PAO1                           [30 31 23 15]    [26 36 29 24]    [26 32 29 29]
                                                                             [27 36 29 23]*
Pseudomonas fluorescens      Pf0-1                          [30 31 23 15]    [26 35 29 25]    [28 31 28 29]
Pseudomonas putida           KT2440                         [30 31 23 15]    [28 33 27 27]    [27 32 29 28]
Legionella pneumophila       Philadelphia-1                 [30 30 24 15]    [33 33 23 27]    [29 28 28 31]
Francisella tularensis       schu 4                         [32 29 22 16]    [28 38 26 26]    [25 32 28 31]
Bordetella pertussis         Tohama I                       [30 29 24 16]    [23 37 30 24]    [30 32 30 26]
Burkholderia cepacia         J2315                          [29 29 27 14]    [27 32 26 29]    [27 36 31 24]
                                                                                              [20 42 35 19]*
Burkholderia pseudomallei    K96243                         [29 29 27 14]    [27 32 26 29]    [27 36 31 24]
Neisseria gonorrhoeae        FA 1090, ATCC 700825           [29 28 24 18]    [27 34 26 28]    [24 36 29 27]
Neisseria meningitidis       MC58 (serogroup B)             [29 28 26 16]    [27 34 27 27]    [25 35 30 26]
Neisseria meningitidis       serogroup C, FAM18             [29 28 26 16]    [27 34 27 27]    [25 35 30 26]
Neisseria meningitidis       Z2491 (serogroup A)            [29 28 26 16]    [27 34 27 27]    [25 35 30 26]
Chlamydophila pneumoniae     TW-183                         [31 27 22 19]    NO DATA          [32 27 27 29]
Chlamydophila pneumoniae     AR39                           [31 27 22 19]    NO DATA          [32 27 27 29]
Chlamydophila pneumoniae     CWL029                         [31 27 22 19]    NO DATA          [32 27 27 29]
Chlamydophila pneumoniae     J138                           [31 27 22 19]    NO DATA          [32 27 27 29]
Corynebacterium diphtheriae  NCTC13129                      [29 34 21 15]    [22 38 31 25]    [22 33 25 34]
Mycobacterium avium          k10                            [27 36 21 15]    [22 37 30 28]    [21 36 27 30]
Mycobacterium avium          104                            [27 36 21 15]    [22 37 30 28]    [21 36 27 30]
Mycobacterium tuberculosis   CSU#93                         [27 36 21 15]    [22 37 30 28]    [21 36 27 30]
Mycobacterium tuberculosis   CDC 1551                       [27 36 21 15]    [22 37 30 28]    [21 36 27 30]
Mycobacterium tuberculosis   H37Rv (lab strain)             [27 36 21 15]    [22 37 30 28]    [21 36 27 30]
Mycoplasma pneumoniae        M129                           [31 29 19 20]    NO DATA          NO DATA
Staphylococcus aureus        MRSA252                        [27 30 21 21]    [25 35 30 26]    [30 29 30 29]
                                                                                              [29 31 30 29]*
Staphylococcus aureus        MSSA476                        [27 30 21 21]    [25 35 30 26]    [30 29 30 29]
                                                                                              [30 29 29 30]*
Staphylococcus aureus        COL                            [27 30 21 21]    [25 35 30 26]    [30 29 30 29]
                                                                                              [30 29 29 30]*
Staphylococcus aureus        Mu50                           [27 30 21 21]    [25 35 30 26]    [30 29 30 29]
                                                                                              [30 29 29 30]*
Staphylococcus aureus        MW2                            [27 30 21 21]    [25 35 30 26]    [30 29 30 29]
                                                                                              [30 29 29 30]*
Staphylococcus aureus        N315                           [27 30 21 21]    [25 35 30 26]    [30 29 30 29]
                                                                                              [30 29 29 30]*
Staphylococcus aureus        NCTC 8325                      [27 30 21 21]    [25 35 30 26]    [30 29 30 29]
                                                                             [25 35 31 26]*   [30 29 29 30]
Streptococcus agalactiae     NEM316                         [26 32 23 18]    [24 36 31 25]    [25 32 29 30]
                                                                             [24 36 30 26]*
Streptococcus equi           NC_002955                      [26 32 23 18]    [23 37 31 25]    [29 30 25 32]
Streptococcus pyogenes       MGAS8232                       [26 32 23 18]    [24 37 30 25]    [25 31 29 31]
Streptococcus pyogenes       MGAS315                        [26 32 23 18]    [24 37 30 25]    [25 31 29 31]
Streptococcus pyogenes       SSI-1                          [26 32 23 18]    [24 37 30 25]    [25 31 29 31]
Streptococcus pyogenes       MGAS10394                      [26 32 23 18]    [24 37 30 25]    [25 31 29 31]
Streptococcus pyogenes       Manfredo (M5)                  [26 32 23 18]    [24 37 30 25]    [25 31 29 31]
Streptococcus pyogenes       SF370 (M1)                     [26 32 23 18]    [24 37 30 25]    [25 31 29 31]
Streptococcus pneumoniae     670                            [26 32 23 18]    [25 35 28 28]    [25 32 29 30]
Streptococcus pneumoniae     R6                             [26 32 23 18]    [25 35 28 28]    [25 32 29 30]
Streptococcus pneumoniae     TIGR4                          [26 32 23 18]    [25 35 28 28]    [25 32 30 29]
Streptococcus gordonii       NCTC7868                       [25 33 23 18]    [24 36 31 25]    [25 31 29 31]
Streptococcus mitis          NCTC 12261                     [26 32 23 18]    [25 35 30 26]    [25 32 29 30]
                                                                                              [24 31 35 29]*
Streptococcus mutans         UA159                          [24 32 24 19]    [25 37 30 24]    [28 31 26 31]

















TABLE 6B

Base Compositions of Common Respiratory Pathogens for Bioagent Identifying Amplicons Corresponding to Primer Pair Nos: 349, 360, and 356

Organism                     Strain                         Primer 349       Primer 360       Primer 356
                                                            [A G C T]        [A G C T]        [A G C T]
Klebsiella pneumoniae        MGH78578                       [25 31 25 22]    [33 37 25 27]    NO DATA
Yersinia pestis              CO-92 Biovar Orientalis        [25 31 27 20]    [34 35 25 28]    NO DATA
                                                            [25 32 26 20]*
Yersinia pestis              KIM5 P12 (Biovar Mediaevalis)  [25 31 27 20]    [34 35 25 28]    NO DATA
                                                            [25 32 26 20]*
Yersinia pestis              91001                          [25 31 27 20]    [34 35 25 28]    NO DATA
Haemophilus influenzae       KW20                           [28 28 25 20]    [32 38 25 27]    NO DATA
Pseudomonas aeruginosa       PAO1                           [24 31 26 20]    [31 36 27 27]    NO DATA
                                                                             [31 36 27 28]*
Pseudomonas fluorescens      Pf0-1                          NO DATA          [30 37 27 28]    NO DATA
                                                                             [30 37 27 28]
Pseudomonas putida           KT2440                         [24 31 26 20]    [30 37 27 28]    NO DATA
Legionella pneumophila       Philadelphia-1                 [23 30 25 23]    [30 39 29 24]    NO DATA
Francisella tularensis       schu 4                         [26 31 25 19]    [32 36 27 27]    NO DATA
Bordetella pertussis         Tohama I                       [21 29 24 18]    [33 36 26 27]    NO DATA
Burkholderia cepacia         J2315                          [23 27 22 20]    [31 37 28 26]    NO DATA
Burkholderia pseudomallei    K96243                         [23 27 22 20]    [31 37 28 26]    NO DATA
Neisseria gonorrhoeae        FA 1090, ATCC 700825           [24 27 24 17]    [34 37 25 26]    NO DATA
Neisseria meningitidis       MC58 (serogroup B)             [25 27 22 18]    [34 37 25 26]    NO DATA
Neisseria meningitidis       serogroup C, FAM18             [25 26 23 18]    [34 37 25 26]    NO DATA
Neisseria meningitidis       Z2491 (serogroup A)            [25 26 23 18]    [34 37 25 26]    NO DATA
Chlamydophila pneumoniae     TW-183                         [30 28 27 18]    NO DATA          NO DATA
Chlamydophila pneumoniae     AR39                           [30 28 27 18]    NO DATA          NO DATA
Chlamydophila pneumoniae     CWL029                         [30 28 27 18]    NO DATA          NO DATA
Chlamydophila pneumoniae     J138                           [30 28 27 18]    NO DATA          NO DATA
Corynebacterium diphtheriae  NCTC13129                      NO DATA          [29 40 28 25]    NO DATA
Mycobacterium avium          k10                            NO DATA          [33 35 32 22]    NO DATA
Mycobacterium avium          104                            NO DATA          [33 35 32 22]    NO DATA
Mycobacterium tuberculosis   CSU#93                         NO DATA          [30 36 34 22]    NO DATA
Mycobacterium tuberculosis   CDC 1551                       NO DATA          [30 36 34 22]    NO DATA
Mycobacterium tuberculosis   H37Rv (lab strain)             NO DATA          [30 36 34 22]    NO DATA
Mycoplasma pneumoniae        M129                           [28 30 24 19]    [34 31 29 28]    NO DATA
Staphylococcus aureus        MRSA252                        [26 30 25 20]    [31 38 24 29]    [33 30 31 27]
Staphylococcus aureus        MSSA476                        [26 30 25 20]    [31 38 24 29]    [33 30 31 27]
Staphylococcus aureus        COL                            [26 30 25 20]    [31 38 24 29]    [33 30 31 27]
Staphylococcus aureus        Mu50                           [26 30 25 20]    [31 38 24 29]    [33 30 31 27]
Staphylococcus aureus        MW2                            [26 30 25 20]    [31 38 24 29]    [33 30 31 27]
Staphylococcus aureus        N315                           [26 30 25 20]    [31 38 24 29]    [33 30 31 27]
Staphylococcus aureus        NCTC 8325                      [26 30 25 20]    [31 38 24 29]    [33 30 31 27]
Streptococcus agalactiae     NEM316                         [28 31 22 20]    [33 37 24 28]    [37 30 28 26]
Streptococcus equi           NC_002955                      [28 31 23 19]    [33 38 24 27]    [37 31 28 25]
Streptococcus pyogenes       MGAS8232                       [28 31 23 19]    [33 37 24 28]    [38 31 29 23]
Streptococcus pyogenes       MGAS315                        [28 31 23 19]    [33 37 24 28]    [38 31 29 23]
Streptococcus pyogenes       SSI-1                          [28 31 23 19]    [33 37 24 28]    [38 31 29 23]
Streptococcus pyogenes       MGAS10394                      [28 31 23 19]    [33 37 24 28]    [38 31 29 23]
Streptococcus pyogenes       Manfredo (M5)                  [28 31 23 19]    [33 37 24 28]    [38 31 29 23]
Streptococcus pyogenes       SF370 (M1)                     [28 31 23 19]    [33 37 24 28]    [38 31 29 23]
                                                            [28 31 22 20]*
Streptococcus pneumoniae     670                            [28 31 22 20]    [34 36 24 28]    [37 30 29 25]
Streptococcus pneumoniae     R6                             [28 31 22 20]    [34 36 24 28]    [37 30 29 25]
Streptococcus pneumoniae     TIGR4                          [28 31 22 20]    [34 36 24 28]    [37 30 29 25]
Streptococcus gordonii       NCTC7868                       [28 32 23 20]    [34 36 24 28]    [36 31 29 25]
Streptococcus mitis          NCTC 12261                     [28 31 22 20]    [34 36 24 28]    [37 30 29 25]
                                                            [29 30 22 20]*
Streptococcus mutans         UA159                          [26 32 23 22]    [34 37 24 27]    NO DATA

















TABLE 6C

Base Compositions of Common Respiratory Pathogens for Bioagent Identifying Amplicons Corresponding to Primer Pair Nos: 449, 354, and 352

Organism                     Strain                         Primer 449       Primer 354       Primer 352
                                                            [A G C T]        [A G C T]        [A G C T]
Klebsiella pneumoniae        MGH78578                       NO DATA          [27 33 36 26]    NO DATA
Yersinia pestis              CO-92 Biovar Orientalis        NO DATA          [29 31 33 29]    [32 28 20 25]
Yersinia pestis              KIM5 P12 (Biovar Mediaevalis)  NO DATA          [29 31 33 29]    [32 28 20 25]
Yersinia pestis              91001                          NO DATA          [29 31 33 29]    NO DATA
Haemophilus influenzae       KW20                           NO DATA          [30 29 31 32]    NO DATA
Pseudomonas aeruginosa       PAO1                           NO DATA          [26 33 39 24]    NO DATA
Pseudomonas fluorescens      Pf0-1                          NO DATA          [26 33 34 29]    NO DATA
Pseudomonas putida           KT2440                         NO DATA          [25 34 36 27]    NO DATA
Legionella pneumophila       Philadelphia-1                 NO DATA          NO DATA          NO DATA
Francisella tularensis       schu 4                         NO DATA          [33 32 25 32]    NO DATA
Bordetella pertussis         Tohama I                       NO DATA          [26 33 39 24]    NO DATA
Burkholderia cepacia         J2315                          NO DATA          [25 37 33 27]    NO DATA
Burkholderia pseudomallei    K96243                         NO DATA          [25 37 34 26]    NO DATA
Neisseria gonorrhoeae        FA 1090, ATCC 700825           [17 23 22 10]    [29 31 32 30]    NO DATA
Neisseria meningitidis       MC58 (serogroup B)             NO DATA          [29 30 32 31]    NO DATA
Neisseria meningitidis       serogroup C, FAM18             NO DATA          [29 30 32 31]    NO DATA
Neisseria meningitidis       Z2491 (serogroup A)            NO DATA          [29 30 32 31]    NO DATA
Chlamydophila pneumoniae     TW-183                         NO DATA          NO DATA          NO DATA
Chlamydophila pneumoniae     AR39                           NO DATA          NO DATA          NO DATA
Chlamydophila pneumoniae     CWL029                         NO DATA          NO DATA          NO DATA
Chlamydophila pneumoniae     J138                           NO DATA          NO DATA          NO DATA
Corynebacterium diphtheriae  NCTC13129                      NO DATA          NO DATA          NO DATA
Mycobacterium avium          k10                            NO DATA          NO DATA          NO DATA
Mycobacterium avium          104                            NO DATA          NO DATA          NO DATA
Mycobacterium tuberculosis   CSU#93                         NO DATA          NO DATA          NO DATA
Mycobacterium tuberculosis   CDC 1551                       NO DATA          NO DATA          NO DATA
Mycobacterium tuberculosis   H37Rv (lab strain)             NO DATA          NO DATA          NO DATA
Mycoplasma pneumoniae        M129                           NO DATA          NO DATA          NO DATA
Staphylococcus aureus        MRSA252                        [17 20 21 17]    [30 27 30 35]    [36 24 19 26]
Staphylococcus aureus        MSSA476                        [17 20 21 17]    [30 27 30 35]    [36 24 19 26]
Staphylococcus aureus        COL                            [17 20 21 17]    [30 27 30 35]    [35 24 19 27]
Staphylococcus aureus        Mu50                           [17 20 21 17]    [30 27 30 35]    [36 24 19 26]
Staphylococcus aureus        MW2                            [17 20 21 17]    [30 27 30 35]    [36 24 19 26]
Staphylococcus aureus        N315                           [17 20 21 17]    [30 27 30 35]    [36 24 19 26]
Staphylococcus aureus        NCTC 8325                      [17 20 21 17]    [30 27 30 35]    [35 24 19 27]
Streptococcus agalactiae     NEM316                         [22 20 19 14]    [26 31 27 38]    [29 26 22 28]
Streptococcus equi           NC_002955                      [22 21 19 13]    NO DATA          NO DATA
Streptococcus pyogenes       MGAS8232                       [23 21 19 12]    [24 32 30 36]    NO DATA
Streptococcus pyogenes       MGAS315                        [23 21 19 12]    [24 32 30 36]    NO DATA
Streptococcus pyogenes       SSI-1                          [23 21 19 12]    [24 32 30 36]    NO DATA
Streptococcus pyogenes       MGAS10394                      [23 21 19 12]    [24 32 30 36]    NO DATA
Streptococcus pyogenes       Manfredo (M5)                  [23 21 19 12]    [24 32 30 36]    NO DATA
Streptococcus pyogenes       SF370 (M1)                     [23 21 19 12]    [24 32 30 36]    NO DATA
Streptococcus pneumoniae     670                            [22 20 19 14]    [25 33 29 35]    [30 29 21 25]
Streptococcus pneumoniae     R6                             [22 20 19 14]    [25 33 29 35]    [30 29 21 25]
Streptococcus pneumoniae     TIGR4                          [22 20 19 14]    [25 33 29 35]    [30 29 21 25]
Streptococcus gordonii       NCTC7868                       [21 21 19 14]    NO DATA          [29 26 22 28]




Streptococcus

NCTC 12261
[22 20 19 14]
[26 30 32 34]
NO DATA



mitis




Streptococcus

UA159
NO DATA
NO DATA
NO DATA



mutans

















TABLE 6D







Base Compositions of Common Respiratory Pathogens for Bioagent


Identifying Amplicons Corresponding to Primer Pair Nos: 355, 358, and 359













Primer 355
Primer 358
Primer 359


Organism
Strain
[A G C T]
[A G C T]
[A G C T]






Klebsiella

MGH78578
NO DATA
[24 39 33 20]
[25 21 24 17]



pneumoniae




Yersinia pestis

CO-92 Biovar
NO DATA
[26 34 35 21]
[23 23 19 22]



Orientalis



Yersinia pestis

KIM5 P12 (Biovar
NO DATA
[26 34 35 21]
[23 23 19 22]



Mediaevalis)



Yersinia pestis

91001
NO DATA
[26 34 35 21]
[23 23 19 22]



Haemophilus

KW20
NO DATA
NO DATA
NO DATA



influenzae




Pseudomonas

PAO1
NO DATA
NO DATA
NO DATA



aeruginosa




Pseudomonas

Pf0-1
NO DATA
NO DATA
NO DATA



fluorescens




Pseudomonas

KT2440
NO DATA
[21 37 37 21]
NO DATA



putida




Legionella

Philadelphia-1
NO DATA
NO DATA
NO DATA



pneumophila




Francisella

schu 4
NO DATA
NO DATA
NO DATA



tularensis




Bordetella

Tohama I
NO DATA
NO DATA
NO DATA



pertussis




Burkholderia

J2315
NO DATA
NO DATA
NO DATA



cepacia




Burkholderia

K96243
NO DATA
NO DATA
NO DATA



pseudomallei




Neisseria

FA 1090, ATCC 700825
NO DATA
NO DATA
NO DATA



gonorrhoeae




Neisseria

MC58 (serogroup B)
NO DATA
NO DATA
NO DATA



meningitidis




Neisseria

serogroup C, FAM18
NO DATA
NO DATA
NO DATA



meningitidis




Neisseria

Z2491 (serogroup A)
NO DATA
NO DATA
NO DATA



meningitidis




Chlamydophila

TW-183
NO DATA
NO DATA
NO DATA



pneumoniae




Chlamydophila

AR39
NO DATA
NO DATA
NO DATA



pneumoniae




Chlamydophila

CWL029
NO DATA
NO DATA
NO DATA



pneumoniae




Chlamydophila

J138
NO DATA
NO DATA
NO DATA



pneumoniae




Corynebacterium

NCTC13129
NO DATA
NO DATA
NO DATA



diphtheriae




Mycobacterium

k10
NO DATA
NO DATA
NO DATA



avium




Mycobacterium

104
NO DATA
NO DATA
NO DATA



avium




Mycobacterium

CSU#93
NO DATA
NO DATA
NO DATA



tuberculosis




Mycobacterium

CDC 1551
NO DATA
NO DATA
NO DATA



tuberculosis




Mycobacterium

H37Rv (lab strain)
NO DATA
NO DATA
NO DATA



tuberculosis




Mycoplasma

M129
NO DATA
NO DATA
NO DATA



pneumoniae




Staphylococcus

MRSA252
NO DATA
NO DATA
NO DATA



aureus




Staphylococcus

MSSA476
NO DATA
NO DATA
NO DATA



aureus




Staphylococcus

COL
NO DATA
NO DATA
NO DATA



aureus




Staphylococcus

Mu50
NO DATA
NO DATA
NO DATA



aureus




Staphylococcus

MW2
NO DATA
NO DATA
NO DATA



aureus




Staphylococcus

N315
NO DATA
NO DATA
NO DATA



aureus




Staphylococcus

NCTC 8325
NO DATA
NO DATA
NO DATA



aureus




Streptococcus

NEM316
NO DATA
NO DATA
NO DATA



agalactiae




Streptococcus

NC_002955
NO DATA
NO DATA
NO DATA



equi




Streptococcus

MGAS8232
NO DATA
NO DATA
NO DATA



pyogenes




Streptococcus

MGAS315
NO DATA
NO DATA
NO DATA



pyogenes




Streptococcus

SSI-1
NO DATA
NO DATA
NO DATA



pyogenes




Streptococcus

MGAS10394
NO DATA
NO DATA
NO DATA



pyogenes




Streptococcus

Manfredo (M5)
NO DATA
NO DATA
NO DATA



pyogenes




Streptococcus

SF370 (M1)
NO DATA
NO DATA
NO DATA



pyogenes




Streptococcus

670
NO DATA
NO DATA
NO DATA



pneumoniae




Streptococcus

R6
NO DATA
NO DATA
NO DATA



pneumoniae




Streptococcus

TIGR4
NO DATA
NO DATA
NO DATA



pneumoniae




Streptococcus

NCTC7868
NO DATA
NO DATA
NO DATA



gordonii




Streptococcus

NCTC 12261
NO DATA
NO DATA
NO DATA



mitis




Streptococcus

UA159
NO DATA
NO DATA
NO DATA



mutans

















TABLE 6E







Base Compositions of Common Respiratory Pathogens for Bioagent


Identifying Amplicons Corresponding to Primer Pair Nos: 362, 363, and 367













Primer 362
Primer 363
Primer 367


Organism
Strain
[A G C T]
[A G C T]
[A G C T]






Klebsiella

MGH78578
[21 33 22 16]
[16 34 26 26]
NO DATA



pneumoniae




Yersinia pestis

CO-92 Biovar
[20 34 18 20]
NO DATA
NO DATA



Orientalis



Yersinia pestis

KIM5 P12 (Biovar
[20 34 18 20]
NO DATA
NO DATA



Mediaevalis)



Yersinia pestis

91001
[20 34 18 20]
NO DATA
NO DATA



Haemophilus

KW20
NO DATA
NO DATA
NO DATA



influenzae




Pseudomonas

PAO1
[19 35 21 17]
[16 36 28 22]
NO DATA



aeruginosa




Pseudomonas

Pf0-1
NO DATA
[18 35 26 23]
NO DATA



fluorescens




Pseudomonas

KT2440
NO DATA
[16 35 28 23]
NO DATA



putida




Legionella

Philadelphia-1
NO DATA
NO DATA
NO DATA



pneumophila




Francisella

schu 4
NO DATA
NO DATA
NO DATA



tularensis




Bordetella

Tohama I
[20 31 24 17]
[15 34 32 21]
[26 25 34 19]



pertussis




Burkholderia

J2315
[20 33 21 18]
[15 36 26 25]
[25 27 32 20]



cepacia




Burkholderia

K96243
[19 34 19 20]
[15 37 28 22]
[25 27 32 20]



pseudomallei




Neisseria

FA 1090, ATCC 700825
NO DATA
NO DATA
NO DATA



gonorrhoeae




Neisseria

MC58 (serogroup B)
NO DATA
NO DATA
NO DATA



meningitidis




Neisseria

serogroup C, FAM18
NO DATA
NO DATA
NO DATA



meningitidis




Neisseria

Z2491 (serogroup A)
NO DATA
NO DATA
NO DATA



meningitidis




Chlamydophila

TW-183
NO DATA
NO DATA
NO DATA



pneumoniae




Chlamydophila

AR39
NO DATA
NO DATA
NO DATA



pneumoniae




Chlamydophila

CWL029
NO DATA
NO DATA
NO DATA



pneumoniae




Chlamydophila

J138
NO DATA
NO DATA
NO DATA



pneumoniae




Corynebacterium

NCTC13129
NO DATA
NO DATA
NO DATA



diphtheriae




Mycobacterium

k10
[19 34 23 16]
NO DATA
[24 26 35 19]



avium




Mycobacterium

104
[19 34 23 16]
NO DATA
[24 26 35 19]



avium




Mycobacterium

CSU#93
[19 31 25 17]
NO DATA
[25 25 34 20]



tuberculosis




Mycobacterium

CDC 1551
[19 31 24 18]
NO DATA
[25 25 34 20]



tuberculosis




Mycobacterium

H37Rv (lab strain)
[19 31 24 18]
NO DATA
[25 25 34 20]



tuberculosis




Mycoplasma

M129
NO DATA
NO DATA
NO DATA



pneumoniae




Staphylococcus

MRSA252
NO DATA
NO DATA
NO DATA



aureus




Staphylococcus

MSSA476
NO DATA
NO DATA
NO DATA



aureus




Staphylococcus

COL
NO DATA
NO DATA
NO DATA



aureus




Staphylococcus

Mu50
NO DATA
NO DATA
NO DATA



aureus




Staphylococcus

MW2
NO DATA
NO DATA
NO DATA



aureus




Staphylococcus

N315
NO DATA
NO DATA
NO DATA



aureus




Staphylococcus

NCTC 8325
NO DATA
NO DATA
NO DATA



aureus




Streptococcus

NEM316
NO DATA
NO DATA
NO DATA



agalactiae




Streptococcus

NC_002955
NO DATA
NO DATA
NO DATA



equi




Streptococcus

MGAS8232
NO DATA
NO DATA
NO DATA



pyogenes




Streptococcus

MGAS315
NO DATA
NO DATA
NO DATA



pyogenes




Streptococcus

SSI-1
NO DATA
NO DATA
NO DATA



pyogenes




Streptococcus

MGAS10394
NO DATA
NO DATA
NO DATA



pyogenes




Streptococcus

Manfredo (M5)
NO DATA
NO DATA
NO DATA



pyogenes




Streptococcus

SF370 (M1)
NO DATA
NO DATA
NO DATA



pyogenes




Streptococcus

670
NO DATA
NO DATA
NO DATA



pneumoniae




Streptococcus

R6
[20 30 19 23]
NO DATA
NO DATA



pneumoniae




Streptococcus

TIGR4
[20 30 19 23]
NO DATA
NO DATA



pneumoniae




Streptococcus

NCTC7868
NO DATA
NO DATA
NO DATA



gordonii




Streptococcus

NCTC 12261
NO DATA
NO DATA
NO DATA



mitis




Streptococcus

UA159
NO DATA
NO DATA
NO DATA



mutans










Four sets of throat samples from military recruits at different military facilities taken at different time points were analyzed using the primers of the present invention. The first set was collected at a military training center from Nov. 1 to Dec. 20, 2002 during one of the most severe outbreaks of pneumonia associated with group A Streptococcus in the United States since 1968. During this outbreak, fifty-one throat swabs were taken from both healthy and hospitalized recruits and plated on blood agar for selection of putative group A Streptococcus colonies. A second set of 15 original patient specimens was taken during the height of this group A Streptococcus-associated respiratory disease outbreak. The third set consisted of historical samples, including twenty-seven isolates of group A Streptococcus, from disease outbreaks at this and other military training facilities during previous years. The fourth set of samples was collected from five geographically separated military facilities in the continental U.S. in the winter immediately following the severe November/December 2002 outbreak.


Pure colonies isolated from group A Streptococcus-selective media from all four collection periods were analyzed with the surveillance primer set. All samples showed base compositions that precisely matched the four completely sequenced strains of Streptococcus pyogenes. Shown in FIG. 4 is a 3D diagram of base composition (axes A, G and C) of bioagent identifying amplicons obtained with primer pair number 14 (a precursor of primer pair number 348 which targets 16S rRNA). The diagram indicates that the experimentally determined base compositions of the clinical samples closely match the base compositions expected for Streptococcus pyogenes and are distinct from the expected base compositions of other organisms.


In addition to the identification of Streptococcus pyogenes, other potentially pathogenic organisms were identified concurrently. Mass spectral analysis of a sample whose nucleic acid was amplified by primer pair number 349 (SEQ ID NOs: 49 and 405) exhibited signals of bioagent identifying amplicons with molecular masses that were found to correspond to analogous base compositions of bioagent identifying amplicons of Streptococcus pyogenes (A27 G32 C24 T18), Neisseria meningitidis (A25 G27 C22 T18), and Haemophilus influenzae (A28 G28 C25 T20) (see FIG. 5 and Table 6B). These organisms were present in a ratio of 4:5:20 as determined by comparison of peak heights with peak height of an internal PCR calibration standard as described in commonly owned U.S. Patent Application Ser. No. 60/545,425 which is incorporated herein by reference in its entirety.
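The relationship between base composition and molecular mass can be illustrated with a minimal sketch. It runs in the forward direction only (composition to mass), uses the three base compositions reported above, and treats each amplicon as a single strand. The residue masses and the end correction are approximate textbook average values and are assumptions of this sketch, not figures from this document; an actual analysis would work from measured masses of both strands.

```python
import re

# Approximate average masses (Da) of DNA nucleotide residues within a strand.
# Standard textbook values, assumed here; they do not come from this document.
RESIDUE_MASS = {"A": 313.21, "G": 329.21, "C": 289.18, "T": 304.20}
# Commonly used correction for a strand without a 5' phosphate (assumption).
END_CORRECTION = -61.96


def parse_composition(text):
    """Turn 'A27 G32 C24 T18' into {'A': 27, 'G': 32, 'C': 24, 'T': 18}."""
    counts = {base: int(n) for base, n in re.findall(r"([AGCT])(\d+)", text)}
    if set(counts) != set("AGCT"):
        raise ValueError(f"incomplete base composition: {text!r}")
    return counts


def strand_mass(counts):
    """Approximate average mass of one strand with the given base counts."""
    return sum(RESIDUE_MASS[b] * n for b, n in counts.items()) + END_CORRECTION


# Base compositions reported in the text for primer pair number 349.
REPORTED = {
    "Streptococcus pyogenes": "A27 G32 C24 T18",
    "Neisseria meningitidis": "A25 G27 C22 T18",
    "Haemophilus influenzae": "A28 G28 C25 T20",
}

for organism, text in REPORTED.items():
    counts = parse_composition(text)
    print(organism, sum(counts.values()), "nt", round(strand_mass(counts), 1), "Da")
```

Under these assumptions, the Streptococcus pyogenes and Haemophilus influenzae compositions have the same total length yet differ in mass, which is why amplicons of equal length can still be told apart.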


Since certain division-wide primers that target housekeeping genes are designed to provide coverage of specific divisions of bacteria to increase the confidence level for identification of bacterial species, they are not expected to yield bioagent identifying amplicons for organisms outside of the specific divisions. For example, primer pair number 356 (SEQ ID NOs: 232:592) primarily amplifies the nucleic acid of members of the classes Bacilli and Clostridia and is not expected to amplify proteobacteria such as Neisseria meningitidis and Haemophilus influenzae. As expected, analysis of the mass spectrum of amplification products obtained with primer pair number 356 does not indicate the presence of Neisseria meningitidis and Haemophilus influenzae but does indicate the presence of Streptococcus pyogenes (FIGS. 3 and 6, Table 6B). Thus, these primers or types of primers can confirm the absence of particular bioagents from a sample.


The 15 throat swabs from military recruits were found to contain a relatively small set of microbes in high abundance. The most common were Haemophilus influenzae, Neisseria meningitidis, and Streptococcus pyogenes. Staphylococcus epidermidis, Moraxella catarrhalis, Corynebacterium pseudodiphtheriticum, and Staphylococcus aureus were present in fewer samples. An equal number of samples from healthy volunteers from three different geographic locations were analyzed identically. Results indicated that the healthy volunteers have bacterial flora dominated by multiple, commensal non-beta-hemolytic Streptococcal species, including the viridans group streptococci (S. parasanguinis, S. vestibularis, S. mitis, S. oralis and S. pneumoniae; data not shown), and none of the organisms found in the military recruits were found in the healthy controls at concentrations detectable by mass spectrometry. Thus, the military recruits in the midst of a respiratory disease outbreak had a dramatically different microbial population than that experienced by the general population in the absence of epidemic disease.


Example 8
Drill-Down Analysis for Determination of emm-Type of Streptococcus pyogenes in Epidemic Surveillance

As a continuation of the epidemic surveillance investigation of Example 7, determination of sub-species characteristics (genotyping) of Streptococcus pyogenes was carried out based on a strategy that generates strain-specific signatures according to the rationale of Multi-Locus Sequence Typing (MLST). In classic MLST analysis, internal fragments of several housekeeping genes are amplified and sequenced (Enright et al. Infection and Immunity, 2001, 69, 2416-2427). In the present investigation, bioagent identifying amplicons from housekeeping genes were produced using drill-down primers and analyzed by mass spectrometry. Since mass spectral analysis results in molecular mass, from which base composition can be determined, the challenge was to determine whether base composition alone could resolve the emm classification of strains of Streptococcus pyogenes.


An alignment was constructed of concatenated alleles of seven MLST housekeeping genes (glucose kinase (gki), glutamine transporter protein (gtr), glutamate racemase (murI), DNA mismatch repair protein (mutS), xanthine phosphoribosyl transferase (xpt), and acetyl-CoA acetyl transferase (yqiL)) from each of the 212 previously emm-typed strains of Streptococcus pyogenes. From this alignment, the number and location of primer pairs that would maximize strain identification via base composition were determined. As a result, 6 primer pairs were chosen as standard drill-down primers for determination of emm-type of Streptococcus pyogenes. These six primer pairs are displayed in Table 7. This drill-down set comprises primers with T modifications (note TMOD designation in primer names), which constitute a functional improvement with regard to prevention of non-templated adenylation (vide supra) relative to the originally selected primers, which are displayed below each modified pair in the table.
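How the number and location of primer pairs were optimized is not spelled out here. One plausible way to approach such a selection is a greedy search that repeatedly adds the candidate primer pair separating the most strain pairs by base composition. The sketch below illustrates that idea only; the strain names, primer pair names, and base compositions are hypothetical placeholders, and the greedy strategy itself is an assumption rather than the procedure actually used.

```python
from itertools import combinations


def distinguishable_pairs(strains, signatures, chosen):
    """Count strain pairs that differ at one or more of the chosen primer pairs."""
    count = 0
    for s1, s2 in combinations(strains, 2):
        if any(signatures[p][s1] != signatures[p][s2] for p in chosen):
            count += 1
    return count


def greedy_select(strains, signatures, max_pairs):
    """Greedily pick primer pairs until no candidate separates more strains."""
    chosen = []
    best_score = 0
    while len(chosen) < max_pairs:
        candidates = [p for p in signatures if p not in chosen]
        if not candidates:
            break
        scored = [
            (distinguishable_pairs(strains, signatures, chosen + [p]), p)
            for p in candidates
        ]
        score, pick = max(scored, key=lambda item: item[0])
        if score <= best_score:
            break  # no remaining primer pair separates any additional strains
        chosen.append(pick)
        best_score = score
    return chosen, best_score


# Hypothetical data: base compositions (A, G, C, T) per primer pair per strain.
strains = ["s1", "s2", "s3", "s4"]
signatures = {
    "pp_a": {"s1": (39, 25, 20, 34), "s2": (40, 24, 20, 34),
             "s3": (39, 25, 20, 34), "s4": (40, 24, 20, 34)},
    "pp_b": {"s1": (32, 35, 17, 32), "s2": (31, 35, 17, 33),
             "s3": (30, 36, 17, 33), "s4": (31, 35, 17, 33)},
    "pp_c": {"s1": (38, 27, 23, 33), "s2": (38, 27, 23, 33),
             "s3": (38, 27, 23, 33), "s4": (38, 26, 24, 33)},
}

chosen, score = greedy_select(strains, signatures, max_pairs=6)
print(chosen, score)
```

A greedy search is not guaranteed to find the smallest possible set, but it is simple and makes the trade-off between the number of primer pairs and strain resolution easy to inspect.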









TABLE 7

Group A Streptococcus Drill-Down Primer Pairs

Primer Pair No. | Forward Primer Name | Forward Primer (SEQ ID NO:) | Reverse Primer Name | Reverse Primer (SEQ ID NO:) | Target Gene
442 | SP101_SPET11_358_387_TMOD_F | 311 | SP101_SPET11_448_473_TMOD_R | 669 | gki
80 | SP101_SPET11_358_387_F | 310 | SP101_SPET11_448_473_R | 668 | gki
443 | SP101_SPET11_600_629_TMOD_F | 314 | SP101_SPET11_686_714_TMOD_R | 671 | gtr
81 | SP101_SPET11_600_629_F | 313 | SP101_SPET11_686_714_R | 670 | gtr
426 | SP101_SPET11_1314_1336_TMOD_F | 278 | SP101_SPET11_1403_1431_TMOD_R | 633 | murI
86 | SP101_SPET11_1314_1336_F | 277 | SP101_SPET11_1403_1431_R | 632 | murI
430 | SP101_SPET11_1807_1835_TMOD_F | 286 | SP101_SPET11_1901_1927_TMOD_R | 641 | mutS
90 | SP101_SPET11_1807_1835_F | 285 | SP101_SPET11_1901_1927_R | 640 | mutS
438 | SP101_SPET11_3075_3103_TMOD_F | 302 | SP101_SPET11_3168_3196_TMOD_R | 657 | xpt
96 | SP101_SPET11_3075_3103_F | 301 | SP101_SPET11_3168_3196_R | 656 | xpt
441 | SP101_SPET11_3511_3535_TMOD_F | 309 | SP101_SPET11_3605_3629_TMOD_R | 664 | yqiL
98 | SP101_SPET11_3511_3535_F | 308 | SP101_SPET11_3605_3629_R | 663 | yqiL









The primers of Table 7 were used to produce bioagent identifying amplicons from nucleic acid present in the clinical samples. The bioagent identifying amplicons were subsequently analyzed by mass spectrometry, and base compositions corresponding to the molecular masses were calculated.


Of the 51 samples taken during the peak of the November/December 2002 epidemic (Tables 8A-C, rows 1-3), all except three samples were found to represent emm3, a Group A Streptococcus genotype previously associated with high respiratory virulence. The three outliers were from samples obtained from healthy individuals and probably represent non-epidemic strains. Archived samples (Tables 8A-C, rows 5-13) from historical collections showed a greater heterogeneity of base compositions and emm types, as would be expected from different epidemics occurring at different places and dates. The results of the mass spectrometry analysis and emm gene sequencing were found to be concordant for the epidemic and historical samples.









TABLE 8A







Base Composition Analysis of Bioagent Identifying Amplicons of Group A Streptococcus


samples from Six Military Installations Obtained with Primer Pair Nos. 426 and 430














emm-type by



murI
mutS


# of
Mass
emm-Gene
Location

(Primer Pair
(Primer Pair


Instances
Spectrometry
Sequencing
(sample)
Year
No. 426)
No. 430)





48 
3
3
MCRD San
2002
A39 G25 C20 T34
A38 G27 C23 T33


2
6
6
Diego

A40 G24 C20 T34
A38 G27 C23 T33


1
28 
28 
(Cultured)

A39 G25 C20 T34
A38 G27 C23 T33


15 
3
ND


A39 G25 C20 T34
A38 G27 C23 T33


6
3
3
NHRC San
2003
A39 G25 C20 T34
A38 G27 C23 T33


3
 5, 58
5
Diego-

A40 G24 C20 T34
A38 G27 C23 T33


6
6
6
Archive

A40 G24 C20 T34
A38 G27 C23 T33


1
11 
11 
(Cultured)

A39 G25 C20 T34
A38 G27 C23 T33


3
12 
12 


A40 G24 C20 T34
A38 G26 C24 T33


1
22 
22 


A39 G25 C20 T34
A38 G27 C23 T33


3
25, 75
75 


A39 G25 C20 T34
A38 G27 C23 T33


4
44/61, 82, 9
44/61


A40 G24 C20 T34
A38 G26 C24 T33


2
53, 91
91 


A39 G25 C20 T34
A38 G27 C23 T33


1
2
2
Ft.
2003
A39 G25 C20 T34
A38 G27 C24 T32


2
3
3
Leonard

A39 G25 C20 T34
A38 G27 C23 T33


1
4
4
Wood

A39 G25 C20 T34
A38 G27 C23 T33


1
6
6
(Cultured)

A40 G24 C20 T34
A38 G27 C23 T33


11 
25 or 75
75 


A39 G25 C20 T34
A38 G27 C23 T33


1
25, 75, 33,
75 


A39 G25 C20 T34
A38 G27 C23 T33



34, 4, 52, 84


1
44/61 or 82
44/61


A40 G24 C20 T34
A38 G26 C24 T33



or 9


2
 5 or 58
5


A40 G24 C20 T34
A38 G27 C23 T33


3
1
1
Ft. Sill
2003
A40 G24 C20 T34
A38 G27 C23 T33


2
3
3
(Cultured)

A39 G25 C20 T34
A38 G27 C23 T33


1
4
4


A39 G25 C20 T34
A38 G27 C23 T33


1
28 
28 


A39 G25 C20 T34
A38 G27 C23 T33


1
3
3
Ft.
2003
A39 G25 C20 T34
A38 G27 C23 T33


1
4
4
Benning

A39 G25 C20 T34
A38 G27 C23 T33


3
6
6
(Cultured)

A40 G24 C20 T34
A38 G27 C23 T33


1
11 
11 


A39 G25 C20 T34
A38 G27 C23 T33


1
13 
 94**


A40 G24 C20 T34
A38 G27 C23 T33


1
44/61 or 82
82 


A40 G24 C20 T34
A38 G26 C24 T33



or 9


1
 5 or 58
58 


A40 G24 C20 T34
A38 G27 C23 T33


1
78 or 89
89 


A39 G25 C20 T34
A38 G27 C23 T33


2
 5 or 58
ND
Lackland
2003
A40 G24 C20 T34
A38 G27 C23 T33


1
2

AFB

A39 G25 C20 T34
A38 G27 C24 T32


1
81 or 90

(Throat

A40 G24 C20 T34
A38 G27 C23 T33


1
78 

Swabs)

A38 G26 C20 T34
A38 G27 C23 T33


  3***
No detection



No detection
No detection


7
3
ND
MCRD San
2002
A39 G25 C20 T34
A38 G27 C23 T33


1
3
ND
Diego

No detection
A38 G27 C23 T33


1
3
ND
(Throat

No detection
No detection


1
3
ND
Swabs)

No detection
No detection


2
3
ND


No detection
A38 G27 C23 T33


3
No detection
ND


No detection
No detection
















TABLE 8B







Base Composition Analysis of Bioagent Identifying Amplicons of Group A Streptococcus


samples from Six Military Installations Obtained with Primer Pair Nos. 438 and 441














emm-type by



xpt
yqiL


# of
Mass
emm-Gene
Location

(Primer Pair
(Primer Pair


Instances
Spectrometry
Sequencing
(sample)
Year
No. 438)
No. 441)





48 
3
3
MCRD San
2002
A30 G36 C20 T36
A40 G29 C19 T31


2
6
6
Diego

A30 G36 C20 T36
A40 G29 C19 T31


1
28 
28 
(Cultured)

A30 G36 C20 T36
A41 G28 C18 T32


15 
3
ND


A30 G36 C20 T36
A40 G29 C19 T31


6
3
3
NHRC San
2003
A30 G36 C20 T36
A40 G29 C19 T31


3
 5, 58
5
Diego-

A30 G36 C20 T36
A40 G29 C19 T31


6
6
6
Archive

A30 G36 C20 T36
A40 G29 C19 T31


1
11 
11 
(Cultured)

A30 G36 C20 T36
A40 G29 C19 T31


3
12 
12 


A30 G36 C19 T37
A40 G29 C19 T31


1
22 
22 


A30 G36 C20 T36
A40 G29 C19 T31


3
25, 75
75 


A30 G36 C20 T36
A40 G29 C19 T31


4
44/61, 82, 9
44/61


A30 G36 C20 T36
A41 G28 C19 T31


2
53, 91
91 


A30 G36 C19 T37
A40 G29 C19 T31


1
2
2
Ft.
2003
A30 G36 C20 T36
A40 G29 C19 T31


2
3
3
Leonard

A30 G36 C20 T36
A40 G29 C19 T31


1
4
4
Wood

A30 G36 C19 T37
A41 G28 C19 T31


1
6
6
(Cultured)

A30 G36 C20 T36
A40 G29 C19 T31


11 
25 or 75
75 


A30 G36 C20 T36
A40 G29 C19 T31


1
25, 75, 33,
75 


A30 G36 C19 T37
A40 G29 C19 T31



34, 4, 52, 84


1
44/61 or 82
44/61


A30 G36 C20 T36
A41 G28 C19 T31



or 9


2
 5 or 58
5


A30 G36 C20 T36
A40 G29 C19 T31


3
1
1
Ft. Sill
2003
A30 G36 C19 T37
A40 G29 C19 T31


2
3
3
(Cultured)

A30 G36 C20 T36
A40 G29 C19 T31


1
4
4


A30 G36 C19 T37
A41 G28 C19 T31


1
28 
28 


A30 G36 C20 T36
A41 G28 C18 T32


1
3
3
Ft.
2003
A30 G36 C20 T36
A40 G29 C19 T31


1
4
4
Benning

A30 G36 C19 T37
A41 G28 C19 T31


3
6
6
(Cultured)

A30 G36 C20 T36
A40 G29 C19 T31


1
11 
11 


A30 G36 C20 T36
A40 G29 C19 T31


1
13 
 94**


A30 G36 C20 T36
A41 G28 C19 T31


1
44/61 or 82
82 


A30 G36 C20 T36
A41 G28 C19 T31



or 9


1
 5 or 58
58 


A30 G36 C20 T36
A40 G29 C19 T31


1
78 or 89
89 


A30 G36 C20 T36
A41 G28 C19 T31


2
 5 or 58
ND
Lackland
2003
A30 G36 C20 T36
A40 G29 C19 T31


1
2

AFB

A30 G36 C20 T36
A40 G29 C19 T31


1
81 or 90

(Throat

A30 G36 C20 T36
A40 G29 C19 T31


1
78 

Swabs)

A30 G36 C20 T36
A41 G28 C19 T31


  3***
No detection



No detection
No detection


7
3
ND
MCRD San
2002
A30 G36 C20 T36
A40 G29 C19 T31


1
3
ND
Diego

A30 G36 C20 T36
A40 G29 C19 T31


1
3
ND
(Throat

A30 G36 C20 T36
No detection


1
3
ND
Swabs)

No detection
A40 G29 C19 T31


2
3
ND


A30 G36 C20 T36
A40 G29 C19 T31


3
No detection
ND


No detection
No detection
















TABLE 8C







Base Composition Analysis of Bioagent Identifying Amplicons of Group A Streptococcus


samples from Six Military Installations Obtained with Primer Pair Nos. 442 and 443














emm-type by



gki
gtr


# of
Mass
emm-Gene
Location

(Primer Pair
(Primer Pair


Instances
Spectrometry
Sequencing
(sample)
Year
No. 442)
No. 443)





48 
3
3
MCRD San
2002
A32 G35 C17 T32
A39 G28 C16 T32


2
6
6
Diego

A31 G35 C17 T33
A39 G28 C15 T33


1
28 
28 
(Cultured)

A30 G36 C17 T33
A39 G28 C16 T32


15 
3
ND


A32 G35 C17 T32
A39 G28 C16 T32


6
3
3
NHRC San
2003
A32 G35 C17 T32
A39 G28 C16 T32


3
 5, 58
5
Diego-

A30 G36 C20 T30
A39 G28 C15 T33


6
6
6
Archive

A31 G35 C17 T33
A39 G28 C15 T33


1
11 
11 
(Cultured)

A30 G36 C20 T30
A39 G28 C16 T32


3
12 
12 


A31 G35 C17 T33
A39 G28 C15 T33


1
22 
22 


A31 G35 C17 T33
A38 G29 C15 T33


3
25, 75
75 


A30 G36 C17 T33
A39 G28 C15 T33


4
44/61, 82, 9
44/61


A30 G36 C18 T32
A39 G28 C15 T33


2
53, 91
91 


A32 G35 C17 T32
A39 G28 C16 T32


1
2
2
Ft.
2003
A30 G36 C17 T33
A39 G28 C15 T33


2
3
3
Leonard

A32 G35 C17 T32
A39 G28 C16 T32


1
4
4
Wood

A31 G35 C17 T33
A39 G28 C15 T33


1
6
6
(Cultured)

A31 G35 C17 T33
A39 G28 C15 T33


11 
25 or 75
75 


A30 G36 C17 T33
A39 G28 C15 T33


1
25, 75, 33,
75 


A30 G36 C17 T33
A39 G28 C15 T33



34, 4, 52, 84


1
44/61 or 82
44/61


A30 G36 C18 T32
A39 G28 C15 T33



or 9


2
 5 or 58
5


A30 G36 C20 T30
A39 G28 C15 T33


3
1
1
Ft. Sill
2003
A30 G36 C18 T32
A39 G28 C15 T33


2
3
3
(Cultured)

A32 G35 C17 T32
A39 G28 C16 T32


1
4
4


A31 G35 C17 T33
A39 G28 C15 T33


1
28 
28 


A30 G36 C17 T33
A39 G28 C16 T32


1
3
3
Ft.
2003
A32 G35 C17 T32
A39 G28 C16 T32


1
4
4
Benning

A31 G35 C17 T33
A39 G28 C15 T33


3
6
6
(Cultured)

A31 G35 C17 T33
A39 G28 C15 T33


1
11 
11 


A30 G36 C20 T30
A39 G28 C16 T32


1
13 
 94**


A30 G36 C19 T31
A39 G28 C15 T33


1
44/61 or 82
82 


A30 G36 C18 T32
A39 G28 C15 T33



or 9


1
 5 or 58
58 


A30 G36 C20 T30
A39 G28 C15 T33


1
78 or 89
89 


A30 G36 C18 T32
A39 G28 C15 T33


2
 5 or 58
ND
Lackland
2003
A30 G36 C20 T30
A39 G28 C15 T33


1
2

AFB

A30 G36 C17 T33
A39 G28 C15 T33


1
81 or 90

(Throat

A30 G36 C17 T33
A39 G28 C15 T33


1
78 

Swabs)

A30 G36 C18 T32
A39 G28 C15 T33


  3***
No detection



No detection
No detection


7
3
ND
MCRD San
2002
A32 G35 C17 T32
A39 G28 C16 T32


1
3
ND
Diego

No detection
No detection


1
3
ND
(Throat

A32 G35 C17 T32
A39 G28 C16 T32


1
3
ND
Swabs)

A32 G35 C17 T32
No detection


2
3
ND


A32 G35 C17 T32
No detection


3
No detection
ND


No detection
No detection
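The way a combined six-locus base composition signature maps to one or more emm types in Tables 8A-C can be sketched as a simple lookup. The reference entries below are copied from rows of Tables 8A-C for emm types 3, 5, 6, 28 and 58 only; the function and variable names are hypothetical, and a real reference set would cover all typed strains.

```python
from collections import defaultdict

# Order of loci in each signature (primer pairs 426, 430, 438, 441, 442, 443).
LOCI = ("murI", "mutS", "xpt", "yqiL", "gki", "gtr")

# Subset of signatures taken from Tables 8A-C; not a complete reference set.
REFERENCE = {
    "3": ("A39 G25 C20 T34", "A38 G27 C23 T33", "A30 G36 C20 T36",
          "A40 G29 C19 T31", "A32 G35 C17 T32", "A39 G28 C16 T32"),
    "6": ("A40 G24 C20 T34", "A38 G27 C23 T33", "A30 G36 C20 T36",
          "A40 G29 C19 T31", "A31 G35 C17 T33", "A39 G28 C15 T33"),
    "5": ("A40 G24 C20 T34", "A38 G27 C23 T33", "A30 G36 C20 T36",
          "A40 G29 C19 T31", "A30 G36 C20 T30", "A39 G28 C15 T33"),
    "58": ("A40 G24 C20 T34", "A38 G27 C23 T33", "A30 G36 C20 T36",
           "A40 G29 C19 T31", "A30 G36 C20 T30", "A39 G28 C15 T33"),
    "28": ("A39 G25 C20 T34", "A38 G27 C23 T33", "A30 G36 C20 T36",
           "A41 G28 C18 T32", "A30 G36 C17 T33", "A39 G28 C16 T32"),
}

# Invert the reference so that each signature lists every emm type sharing it.
BY_SIGNATURE = defaultdict(list)
for emm_type, signature in REFERENCE.items():
    BY_SIGNATURE[signature].append(emm_type)


def call_emm(observed):
    """Return the emm types consistent with observed compositions keyed by locus."""
    signature = tuple(observed[locus] for locus in LOCI)
    return sorted(BY_SIGNATURE.get(signature, []))


sample = dict(zip(LOCI, REFERENCE["5"]))
print(call_emm(sample))  # emm 5 and 58 share a signature, so both are returned
```

Returning a list rather than a single type mirrors table entries such as "5 or 58", where base composition narrows the call to a small group that emm gene sequencing then resolves.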









Example 9
Design of Calibrant Polynucleotides Based on Bioagent Identifying Amplicons for Identification of Species of Bacteria (Bacterial Bioagent Identifying Amplicons)

This example describes the design of 19 calibrant polynucleotides based on bacterial bioagent identifying amplicons corresponding to the primers of the broad surveillance set (Table 4) and the Bacillus anthracis drill-down set (Table 5).


Calibration sequences were designed to simulate bacterial bioagent identifying amplicons produced by the T modified primer pairs shown in Table 4 (primer names have the designation “TMOD”). Each calibration sequence was chosen as a representative member of the section of bacterial genome from a specific bacterial species which would be amplified by a given primer pair. The model bacterial species upon which the calibration sequences are based are shown in Table 9. For example, the calibration sequence chosen to correspond to an amplicon produced by primer pair no. 361 is SEQ ID NO: 764. In Table 9, the forward (_F) or reverse (_R) primer name indicates the coordinates of an extraction representing a gene of a standard reference bacterial genome to which the primer hybridizes, e.g., the forward primer name 16S_EC_713_732_TMOD_F indicates that the forward primer hybridizes to residues 713-732 of the gene encoding 16S ribosomal RNA in an E. coli reference sequence (in this case, the reference sequence is an extraction consisting of residues 4033120-4034661 of the genomic sequence of E. coli K12 (GenBank gi number 16127994)). Additional gene coordinate reference information is shown in Table 10. The designation “TMOD” in the primer names indicates that the 5′ end of the primer has been modified with a non-matched template T residue which prevents the PCR polymerase from adding non-templated adenosine residues to the 5′ end of the amplification product, an occurrence which may result in miscalculation of base composition from molecular mass data (vide supra).
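The primer naming convention described above can be unpacked mechanically. The following sketch assumes the pattern gene, reference code, start residue, end residue, an optional numeric variant, an optional TMOD tag, and a direction letter, which matches the names listed in Tables 7 and 9; the class and function names are hypothetical.

```python
import re
from typing import NamedTuple, Optional


class PrimerName(NamedTuple):
    gene: str
    reference: str
    start: int
    end: int
    variant: Optional[str]
    t_modified: bool
    direction: str


# Assumed pattern: GENE_REF_START_END[_VARIANT][_TMOD]_F|R
_PATTERN = re.compile(
    r"(?P<gene>[A-Za-z0-9]+)_(?P<reference>[A-Za-z0-9]+)_"
    r"(?P<start>\d+)_(?P<end>\d+)"
    r"(?:_(?P<variant>\d+))?"
    r"(?P<tmod>_TMOD)?"
    r"_(?P<direction>[FR])"
)


def parse_primer_name(name):
    """Split a primer name into its gene, coordinates and modification flags."""
    match = _PATTERN.fullmatch(name)
    if match is None:
        raise ValueError(f"unrecognized primer name: {name!r}")
    return PrimerName(
        gene=match["gene"],
        reference=match["reference"],
        start=int(match["start"]),
        end=int(match["end"]),
        variant=match["variant"],
        t_modified=match["tmod"] is not None,
        direction="forward" if match["direction"] == "F" else "reverse",
    )


print(parse_primer_name("16S_EC_713_732_TMOD_F"))
print(parse_primer_name("RPLB_EC_737_758_R"))
```

Names that do not fit the assumed pattern raise an error rather than being guessed at, so unusual primer names surface early.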


The 19 calibration sequences described in Tables 9 and 10 were combined into a single calibration polynucleotide sequence (SEQ ID NO: 783, herein designated a “combination calibration polynucleotide”), which was then cloned into a pCR®-Blunt vector (Invitrogen, Carlsbad, Calif.). This combination calibration polynucleotide can be used in conjunction with the primers of Table 9 as an internal standard to produce calibration amplicons for use in determination of the quantity of any bacterial bioagent. Thus, for example, when the combination calibration polynucleotide vector is present in an amplification reaction mixture, a calibration amplicon based on primer pair 346 (16S rRNA) will be produced in an amplification reaction with primer pair 346 and a calibration amplicon based on primer pair 363 (rpoC) will be produced with primer pair 363. Coordinates of each of the 19 calibration sequences within the combination calibration polynucleotide (SEQ ID NO: 783) are indicated in Table 10.









TABLE 9

Bacterial Primer Pairs for Production of Bacterial Bioagent Identifying Amplicons and Corresponding Representative Calibration Sequences

Primer Pair No. | Forward Primer Name | Forward Primer (SEQ ID NO:) | Reverse Primer Name | Reverse Primer (SEQ ID NO:) | Calibration Sequence Model Species | Calibration Sequence (SEQ ID NO:)
361 | 16S_EC_1090_1111_2_TMOD_F | 5 | 16S_EC_1175_1196_TMOD_R | 370 | Bacillus anthracis | 764
346 | 16S_EC_713_732_TMOD_F | 27 | 16S_EC_789_809_TMOD_R | 389 | Bacillus anthracis | 765
347 | 16S_EC_785_806_TMOD_F | 30 | 16S_EC_880_897_TMOD_R | 392 | Bacillus anthracis | 766
348 | 16S_EC_960_981_TMOD_F | 38 | 16S_EC_1054_1073_TMOD_R | 363 | Bacillus anthracis | 767
349 | 23S_EC_1826_1843_TMOD_F | 49 | 23S_EC_1906_1924_TMOD_R | 405 | Bacillus anthracis | 768
360 | 23S_EC_2646_2667_TMOD_F | 60 | 23S_EC_2745_2765_TMOD_R | 416 | Bacillus anthracis | 769
350 | CAPC_BA_274_303_TMOD_F | 98 | CAPC_BA_349_376_TMOD_R | 452 | Bacillus anthracis | 770
351 | CYA_BA_1353_1379_TMOD_F | 128 | CYA_BA_1448_1467_TMOD_R | 483 | Bacillus anthracis | 771
352 | INFB_EC_1365_1393_TMOD_F | 161 | INFB_EC_1439_1467_TMOD_R | 516 | Bacillus anthracis | 772
353 | LEF_BA_756_781_TMOD_F | 175 | LEF_BA_843_872_TMOD_R | 531 | Bacillus anthracis | 773
356 | RPLB_EC_650_679_TMOD_F | 232 | RPLB_EC_739_762_TMOD_R | 592 | Clostridium botulinum | 774
449 | RPLB_EC_690_710_F | 237 | RPLB_EC_737_758_R | 589 | Clostridium botulinum | 775
359 | RPOB_EC_1845_1866_TMOD_F | 241 | RPOB_EC_1909_1929_TMOD_R | 597 | Yersinia pestis | 776
362 | RPOB_EC_3799_3821_TMOD_F | 245 | RPOB_EC_3862_3888_TMOD_R | 603 | Burkholderia mallei | 777
363 | RPOC_EC_2146_2174_TMOD_F | 257 | RPOC_EC_2227_2245_TMOD_R | 621 | Burkholderia mallei | 778
354 | RPOC_EC_2218_2241_TMOD_F | 262 | RPOC_EC_2313_2337_TMOD_R | 625 | Bacillus anthracis | 779
355 | SSPE_BA_115_137_TMOD_F | 321 | SSPE_BA_197_222_TMOD_R | 687 | Bacillus anthracis | 780
367 | TUFB_EC_957_979_TMOD_F | 345 | TUFB_EC_1034_1058_TMOD_R | 701 | Burkholderia mallei | 781
358 | VALS_EC_1105_1124_TMOD_F | 350 | VALS_EC_1195_1218_TMOD_R | 712 | Yersinia pestis | 782

















TABLE 10

Primer Pair Gene Coordinate References and Calibration Polynucleotide Sequence Coordinates within the Combination Calibration Polynucleotide

Bacterial Gene and Species | Gene Extraction Coordinates of Genomic or Plasmid Sequence | Reference GenBank GI No. of Genomic (G) or Plasmid (P) Sequence | Primer Pair No. | Coordinates of Calibration Sequence in Combination Calibration Polynucleotide (SEQ ID NO: 783)
16S E. coli | 4033120 . . . 4034661 | 16127994 (G) | 346 | 16 . . . 109
16S E. coli | 4033120 . . . 4034661 | 16127994 (G) | 347 | 83 . . . 190
16S E. coli | 4033120 . . . 4034661 | 16127994 (G) | 348 | 246 . . . 353
16S E. coli | 4033120 . . . 4034661 | 16127994 (G) | 361 | 368 . . . 469
23S E. coli | 4166220 . . . 4169123 | 16127994 (G) | 349 | 743 . . . 837
23S E. coli | 4166220 . . . 4169123 | 16127994 (G) | 360 | 865 . . . 981
rpoB E. coli | 4178823 . . . 4182851 (complement strand) | 16127994 (G) | 359 | 1591 . . . 1672
rpoB E. coli | 4178823 . . . 4182851 (complement strand) | 16127994 (G) | 362 | 2081 . . . 2167
rpoC E. coli | 4182928 . . . 4187151 | 16127994 (G) | 354 | 1810 . . . 1926
rpoC E. coli | 4182928 . . . 4187151 | 16127994 (G) | 363 | 2183 . . . 2279
infB E. coli | 3313655 . . . 3310983 (complement strand) | 16127994 (G) | 352 | 1692 . . . 1791
tufB E. coli | 4173523 . . . 4174707 | 16127994 (G) | 367 | 2400 . . . 2498
rplB E. coli | 3449001 . . . 3448180 | 16127994 (G) | 356 | 1945 . . . 2060
rplB E. coli | 3449001 . . . 3448180 | 16127994 (G) | 449 | 1986 . . . 2055
valS E. coli | 4481405 . . . 4478550 (complement strand) | 16127994 (G) | 358 | 1462 . . . 1572
capC B. anthracis | 56074 . . . 55628 (complement strand) | 6470151 (P) | 350 | 2517 . . . 2616
cya B. anthracis | 156626 . . . 154288 (complement strand) | 4894216 (P) | 351 | 1338 . . . 1449
lef B. anthracis | 127442 . . . 129921 | 4894216 (P) | 353 | 1121 . . . 1234
sspE B. anthracis | 226496 . . . 226783 | 30253828 (G) | 355 | 1007 . . . 1104










Example 10
Use of a Calibration Polynucleotide for Determining the Quantity of Bacillus anthracis in a Sample Containing a Mixture of Microbes

The process described in this example is shown in FIG. 7. The capC gene is a gene involved in capsule synthesis which resides on the pX02 plasmid of Bacillus anthracis. Primer pair number 350 (see Tables 9 and 10) was designed to identify Bacillus anthracis via production of a bacterial bioagent identifying amplicon. Known quantities of the combination calibration polynucleotide vector described in Example 3 were added to amplification mixtures containing bacterial bioagent nucleic acid from a mixture of microbes which included the Ames strain of Bacillus anthracis. Upon amplification of the bacterial bioagent nucleic acid and the combination calibration polynucleotide vector with primer pair no. 350, bacterial bioagent identifying amplicons and calibration amplicons were obtained and characterized by mass spectrometry. A mass spectrum measured for the amplification reaction is shown in FIG. 8. The molecular masses of the bioagent identifying amplicons provided the means for identification of the bioagent from which they were obtained (Ames strain of Bacillus anthracis) and the molecular masses of the calibration amplicons provided the means for their identification as well. The relationship between the abundance (peak height) of the calibration amplicon signals and the bacterial bioagent identifying amplicon signals provides the means of calculation of the copies of the pX02 plasmid of the Ames strain of Bacillus anthracis. Methods of calculating quantities of molecules based on internal calibration procedures are well known to those of ordinary skill in the art.


Averaging the results of 10 repetitions of the experiment described above enabled a calculation indicating that the quantity of the Ames strain of Bacillus anthracis present in the sample corresponds to approximately 10 copies of the pX02 plasmid.
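The internal calibration arithmetic referred to in this example can be sketched as follows. This is a minimal illustration only, not the procedure actually used: it assumes that the calibration amplicon and the bioagent identifying amplicon amplify and are detected with equal efficiency, so that the ratio of peak heights equals the ratio of starting copies. The function name, the spiked calibrant quantity, and the peak heights are hypothetical.

```python
from statistics import mean


def estimate_copies(bioagent_peak, calibrant_peak, calibrant_copies):
    """Estimate starting copies of the target from one mass spectrum.

    Assumption (not from the source): signal is proportional to starting
    copy number with the same constant for both amplicons.
    """
    if calibrant_peak <= 0:
        raise ValueError("calibrant peak height must be positive")
    return calibrant_copies * bioagent_peak / calibrant_peak


# Hypothetical replicate measurements: (bioagent peak, calibrant peak),
# each reaction spiked with 100 copies of the calibration polynucleotide.
replicates = [(9.5, 100.0), (10.5, 100.0), (10.0, 100.0)]
estimates = [estimate_copies(b, c, 100) for b, c in replicates]
average_copies = mean(estimates)
print(f"average estimate: {average_copies:.1f} copies")
```

Averaging over replicates, as in the ten repetitions described above, reduces the influence of run-to-run variation in peak height on the final estimate.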


Example 11
Drill-Down Genotyping of Campylobacter Species

A series of drill-down primers were designed as described in Example 1 with the objective of identification of different strains of Campylobacter jejuni. The primers are listed in Table 11 with the designation “CJST_CJ.” Housekeeping genes to which the primers hybridize and produce bioagent identifying amplicons include: tkt (transketolase), glyA (serine hydroxymethyltransferase), gltA (citrate synthase), aspA (aspartate ammonia lyase), glnA (glutamine synthase), pgm (phosphoglycerate mutase), and uncA (ATP synthetase alpha chain).









TABLE 11

Campylobacter Drill-down Primer Pairs

| Primer Pair No. | Forward Primer Name | Forward Primer (SEQ ID NO:) | Reverse Primer Name | Reverse Primer (SEQ ID NO:) | Target Gene |
|---|---|---|---|---|---|
| 1053 | CJST_CJ_1080_1110_F | 102 | CJST_CJ_1166_1198_R | 456 | gltA |
| 1064 | CJST_CJ_1680_1713_F | 107 | CJST_CJ_1795_1822_R | 461 | glyA |
| 1054 | CJST_CJ_2060_2090_F | 109 | CJST_CJ_2148_2174_R | 463 | pgm |
| 1049 | CJST_CJ_2636_2668_F | 113 | CJST_CJ_2753_2777_R | 467 | tkt |
| 1048 | CJST_CJ_360_394_F | 119 | CJST_CJ_442_476_R | 472 | aspA |
| 1047 | CJST_CJ_584_616_F | 121 | CJST_CJ_663_692_R | 474 | glnA |









The primers were used to amplify nucleic acid from 50 food product samples provided by the USDA, 25 of which contained Campylobacter jejuni and 25 of which contained Campylobacter coli. Primers used in this study were developed primarily for the discrimination of Campylobacter jejuni clonal complexes and for distinguishing Campylobacter jejuni from Campylobacter coli. Finer discrimination between Campylobacter coli types is also possible by using specific primers targeted to loci where closely-related Campylobacter coli isolates demonstrate polymorphisms between strains. The conclusions of the comparison of base composition analysis with sequence analysis are shown in Tables 12A-C.
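As a consistency check on how the reported base compositions relate to the primer coordinates, the sketch below compares the length spanned by each primer pair of Table 11 with the sum of the base counts reported for isolate group J-1 (strain RM3673) in Tables 12A-C. It assumes that the coordinates embedded in the primer names are inclusive and that the amplicon has no insertions or deletions relative to the reference; the helper names are hypothetical.

```python
import re


def parse_base_composition(text):
    """Turn a string such as 'A30 G25 C16 T46' into {'A': 30, ...}."""
    counts = {base: int(n) for base, n in re.findall(r"([AGCT])(\d+)", text)}
    if set(counts) != set("AGCT"):
        raise ValueError(f"incomplete base composition: {text!r}")
    return counts


def amplicon_length(forward_start, reverse_end):
    """Length spanned by a primer pair, assuming inclusive coordinates."""
    return reverse_end - forward_start + 1


# (forward start, reverse end) read from the primer names in Table 11,
# paired with the J-1 (strain RM3673) compositions from Tables 12A-C.
pairs = {
    1048: ((360, 476), "A30 G25 C16 T46"),    # aspA
    1047: ((584, 692), "A47 G21 C16 T25"),    # glnA
    1053: ((1080, 1198), "A24 G25 C23 T47"),  # gltA
    1064: ((1680, 1822), "A40 G29 C29 T45"),  # glyA
    1054: ((2060, 2174), "A26 G33 C18 T38"),  # pgm
    1049: ((2636, 2777), "A41 G28 C35 T38"),  # tkt
}

for number, ((start, end), composition) in pairs.items():
    total = sum(parse_base_composition(composition).values())
    print(number, amplicon_length(start, end), total)
```

For these six rows the two numbers agree (for example, 117 for primer pair 1048), which suggests that each reported composition covers the full amplicon, primers included.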









TABLE 12A







Results of Base Composition Analysis of 50 Campylobacter Samples with Drill-down MLST Primer Pair Nos: 1048 and 1047

















MLST type or
MLST Type or

Base Composition of
Base Composition of





Clonal Complex by
Clonal Complex

Bioagent Identifying
Bioagent Identifying




Isolate
Base Composition
by Sequence

Amplicon Obtained with
Amplicon Obtained with


Group
Species
origin
analysis
analysis
Strain
Primer Pair No: 1048 (aspA)
Primer Pair No: 1047 (glnA)





J-1

C.

Goose
ST 690/
ST 991
RM3673
A30 G25 C16 T46
A47 G21 C16 T25




jejuni


692/707/991


J-2

C.

Human
Complex
ST 356,
RM4192
A30 G25 C16 T46
A48 G21 C17 T23




jejuni


206/48/353
complex 353


J-3

C.

Human
Complex
ST 436
RM4194
A30 G25 C15 T47
A48 G21 C18 T22




jejuni


354/179


J-4

C.

Human
Complex 257
ST 257,
RM4197
A30 G25 C16 T46
A48 G21 C18 T22




jejuni



complex 257


J-5

C.

Human
Complex 52
ST 52,
RM4277
A30 G25 C16 T46
A48 G21 C17 T23




jejuni



complex 52


J-6

C.

Human
Complex 443
ST 51,
RM4275
A30 G25 C15 T47
A48 G21 C17 T23




jejuni



complex 443
RM4279
A30 G25 C15 T47
A48 G21 C17 T23


J-7

C.

Human
Complex 42
ST 604,
RM1864
A30 G25 C15 T47
A48 G21 C18 T22




jejuni



complex 42


J-8

C.

Human
Complex
ST 362,
RM3193
A30 G25 C15 T47
A48 G21 C18 T22




jejuni


42/49/362
complex 362


J-9

C.

Human
Complex
ST 147,
RM3203
A30 G25 C15 T47
A47 G21 C18 T23




jejuni


45/283
Complex 45




C.

Human
Consistent
ST 828
RM4183
A31 G27 C20 T39
A48 G21 C16 T24




jejuni


with 74


C-1

C. coli


closely
ST 832
RM1169
A31 G27 C20 T39
A48 G21 C16 T24





related
ST 1056
RM1857
A31 G27 C20 T39
A48 G21 C16 T24




Poultry
sequence
ST 889
RM1166
A31 G27 C20 T39
A48 G21 C16 T24





types (none
ST 829
RM1182
A31 G27 C20 T39
A48 G21 C16 T24





belong to a
ST 1050
RM1518
A31 G27 C20 T39
A48 G21 C16 T24





clonal
ST 1051
RM1521
A31 G27 C20 T39
A48 G21 C16 T24





complex)
ST 1053
RM1523
A31 G27 C20 T39
A48 G21 C16 T24






ST 1055
RM1527
A31 G27 C20 T39
A48 G21 C16 T24






ST 1017
RM1529
A31 G27 C20 T39
A48 G21 C16 T24






ST 860
RM1840
A31 G27 C20 T39
A48 G21 C16 T24






ST 1063
RM2219
A31 G27 C20 T39
A48 G21 C16 T24






ST 1066
RM2241
A31 G27 C20 T39
A48 G21 C16 T24






ST 1067
RM2243
A31 G27 C20 T39
A48 G21 C16 T24






ST 1068
RM2439
A31 G27 C20 T39
A48 G21 C16 T24




Swine

ST 1016
RM3230
A31 G27 C20 T39
A48 G21 C16 T24






ST 1069
RM3231
A31 G27 C20 T39
A48 G21 C16 T24






ST 1061
RM1904
A31 G27 C20 T39
A48 G21 C16 T24




Unknown

ST 825
RM1534
A31 G27 C20 T39
A48 G21 C16 T24






ST 901
RM1505
A31 G27 C20 T39
A48 G21 C16 T24


C-2

C. coli

Human
ST 895
ST 895
RM1532
A31 G27 C19 T40
A48 G21 C16 T24


C-3

C. coli

Poultry
Consistent
ST 1064
RM2223
A31 G27 C20 T39
A48 G21 C16 T24





with 63
ST 1082
RM1178
A31 G27 C20 T39
A48 G21 C16 T24





closely
ST 1054
RM1525
A31 G27 C20 T39
A48 G21 C16 T24





related
ST 1049
RM1517
A31 G27 C20 T39
A48 G21 C16 T24




Marmoset
sequence
ST 891
RM1531
A31 G27 C20 T39
A48 G21 C16 T24





types (none





belong to a





clonal





complex)
















TABLE 12B







Results of Base Composition Analysis of 50 Campylobacter Samples with Drill-down MLST Primer Pair Nos: 1053 and 1064

















MLST type or
MLST Type or

Base Composition of
Base Composition of





Clonal Complex by
Clonal Complex

Bioagent Identifying
Bioagent Identifying




Isolate
Base Composition
by Sequence

Amplicon Obtained with
Amplicon Obtained with


Group
Species
origin
analysis
analysis
Strain
Primer Pair No: 1053 (gltA)
Primer Pair No: 1064 (glyA)





J-1

C.

Goose
ST 690/
ST 991
RM3673
A24 G25 C23 T47
A40 G29 C29 T45




jejuni


692/707/991


J-2

C.

Human
Complex
ST 356,
RM4192
A24 G25 C23 T47
A40 G29 C29 T45




jejuni


206/48/353
complex 353


J-3

C.

Human
Complex
ST 436
RM4194
A24 G25 C23 T47
A40 G29 C29 T45




jejuni


354/179


J-4

C.

Human
Complex 257
ST 257,
RM4197
A24 G25 C23 T47
A40 G29 C29 T45




jejuni



complex 257


J-5

C.

Human
Complex 52
ST 52,
RM4277
A24 G25 C23 T47
A39 G30 C26 T48




jejuni



complex 52


J-6

C.

Human
Complex 443
ST 51,
RM4275
A24 G25 C23 T47
A39 G30 C28 T46




jejuni



complex 443
RM4279
A24 G25 C23 T47
A39 G30 C28 T46


J-7

C.

Human
Complex 42
ST 604,
RM1864
A24 G25 C23 T47
A39 G30 C26 T48




jejuni



complex 42


J-8

C.

Human
Complex
ST 362,
RM3193
A24 G25 C23 T47
A38 G31 C28 T46




jejuni


42/49/362
complex 362


J-9

C.

Human
Complex
ST 147,
RM3203
A24 G25 C23 T47
A38 G31 C28 T46




jejuni


45/283
Complex 45




C.

Human
Consistent
ST 828
RM4183
A23 G24 C26 T46
A39 G30 C27 T47




jejuni


with 74


C-1

C. coli


closely
ST 832
RM1169
A23 G24 C26 T46
A39 G30 C27 T47





related
ST 1056
RM1857
A23 G24 C26 T46
A39 G30 C27 T47




Poultry
sequence
ST 889
RM1166
A23 G24 C26 T46
A39 G30 C27 T47





types (none
ST 829
RM1182
A23 G24 C26 T46
A39 G30 C27 T47





belong to a
ST 1050
RM1518
A23 G24 C26 T46
A39 G30 C27 T47





clonal
ST 1051
RM1521
A23 G24 C26 T46
A39 G30 C27 T47





complex)
ST 1053
RM1523
A23 G24 C26 T46
A39 G30 C27 T47






ST 1055
RM1527
A23 G24 C26 T46
A39 G30 C27 T47






ST 1017
RM1529
A23 G24 C26 T46
A39 G30 C27 T47






ST 860
RM1840
A23 G24 C26 T46
A39 G30 C27 T47






ST 1063
RM2219
A23 G24 C26 T46
A39 G30 C27 T47






ST 1066
RM2241
A23 G24 C26 T46
A39 G30 C27 T47






ST 1067
RM2243
A23 G24 C26 T46
A39 G30 C27 T47






ST 1068
RM2439
A23 G24 C26 T46
A39 G30 C27 T47




Swine

ST 1016
RM3230
A23 G24 C26 T46
A39 G30 C27 T47






ST 1069
RM3231
A23 G24 C26 T46
NO DATA






ST 1061
RM1904
A23 G24 C26 T46
A39 G30 C27 T47




Unknown

ST 825
RM1534
A23 G24 C26 T46
A39 G30 C27 T47






ST 901
RM1505
A23 G24 C26 T46
A39 G30 C27 T47


C-2

C. coli

Human
ST 895
ST 895
RM1532
A23 G24 C26 T46
A39 G30 C27 T47


C-3

C. coli

Poultry
Consistent
ST 1064
RM2223
A23 G24 C26 T46
A39 G30 C27 T47





with 63
ST 1082
RM1178
A23 G24 C26 T46
A39 G30 C27 T47





closely
ST 1054
RM1525
A23 G24 C25 T47
A39 G30 C27 T47





related
ST 1049
RM1517
A23 G24 C26 T46
A39 G30 C27 T47




Marmoset
sequence
ST 891
RM1531
A23 G24 C26 T46
A39 G30 C27 T47





types (none





belong to a





clonal





complex)
















TABLE 12C







Results of Base Composition Analysis of 50 Campylobacter Samples with Drill-down MLST Primer Pair Nos: 1054 and 1049

















MLST type or
MLST Type or

Base Composition of
Base Composition of





Clonal Complex by
Clonal Complex

Bioagent Identifying
Bioagent Identifying




Isolate
Base Composition
by Sequence

Amplicon Obtained with
Amplicon Obtained with


Group
Species
origin
analysis
analysis
Strain
Primer Pair No: 1054 (pgm)
Primer Pair No: 1049 (tkt)





J-1

C.

Goose
ST 690/
ST 991
RM3673
A26 G33 C18 T38
A41 G28 C35 T38




jejuni


692/707/991


J-2

C.

Human
Complex
ST 356,
RM4192
A26 G33 C19 T37
A41 G28 C36 T37




jejuni


206/48/353
complex 353


J-3

C.

Human
Complex
ST 436
RM4194
A27 G32 C19 T37
A42 G28 C36 T36




jejuni


354/179


J-4

C.

Human
Complex 257
ST 257,
RM4197
A27 G32 C19 T37
A41 G29 C35 T37




jejuni



complex 257


J-5

C.

Human
Complex 52
ST 52,
RM4277
A26 G33 C18 T38
A41 G28 C36 T37




jejuni



complex 52


J-6

C.

Human
Complex 443
ST 51,
RM4275
A27 G31 C19 T38
A41 G28 C36 T37




jejuni



complex 443
RM4279
A27 G31 C19 T38
A41 G28 C36 T37


J-7

C.

Human
Complex 42
ST 604,
RM1864
A27 G32 C19 T37
A42 G28 C35 T37




jejuni



complex 42


J-8

C.

Human
Complex
ST 362,
RM3193
A26 G33 C19 T37
A42 G28 C35 T37




jejuni


42/49/362
complex 362


J-9

C.

Human
Complex
ST 147,
RM3203
A28 G31 C19 T37
A43 G28 C36 T35




jejuni


45/283
Complex 45




C.

Human
Consistent
ST 828
RM4183
A27 G30 C19 T39
A46 G28 C32 T36




jejuni


with 74


C-1

C. coli


closely
ST 832
RM1169
A27 G30 C19 T39
A46 G28 C32 T36





related
ST 1056
RM1857
A27 G30 C19 T39
A46 G28 C32 T36




Poultry
sequence
ST 889
RM1166
A27 G30 C19 T39
A46 G28 C32 T36





types (none
ST 829
RM1182
A27 G30 C19 T39
A46 G28 C32 T36





belong to a
ST 1050
RM1518
A27 G30 C19 T39
A46 G28 C32 T36





clonal
ST 1051
RM1521
A27 G30 C19 T39
A46 G28 C32 T36





complex)
ST 1053
RM1523
A27 G30 C19 T39
A46 G28 C32 T36






ST 1055
RM1527
A27 G30 C19 T39
A46 G28 C32 T36






ST 1017
RM1529
A27 G30 C19 T39
A46 G28 C32 T36






ST 860
RM1840
A27 G30 C19 T39
A46 G28 C32 T36






ST 1063
RM2219
A27 G30 C19 T39
A46 G28 C32 T36






ST 1066
RM2241
A27 G30 C19 T39
A46 G28 C32 T36






ST 1067
RM2243
A27 G30 C19 T39
A46 G28 C32 T36






ST 1068
RM2439
A27 G30 C19 T39
A46 G28 C32 T36




Swine

ST 1016
RM3230
A27 G30 C19 T39
A46 G28 C32 T36






ST 1069
RM3231
A27 G30 C19 T39
A46 G28 C32 T36






ST 1061
RM1904
A27 G30 C19 T39
A46 G28 C32 T36




Unknown

ST 825
RM1534
A27 G30 C19 T39
A46 G28 C32 T36






ST 901
RM1505
A27 G30 C19 T39
A46 G28 C32 T36


C-2

C. coli

Human
ST 895
ST 895
RM1532
A27 G30 C19 T39
A45 G29 C32 T36


C-3

C. coli

Poultry
Consistent
ST 1064
RM2223
A27 G30 C19 T39
A45 G29 C32 T36





with 63
ST 1082
RM1178
A27 G30 C19 T39
A45 G29 C32 T36





closely
ST 1054
RM1525
A27 G30 C19 T39
A45 G29 C32 T36





related
ST 1049
RM1517
A27 G30 C19 T39
A45 G29 C32 T36




Marmoset
sequence
ST 891
RM1531
A27 G30 C19 T39
A45 G29 C32 T36





types (none





belong to a





clonal





complex)









The base composition analysis method was successful in identification of 12 different strain groups. Campylobacter jejuni and Campylobacter coli are generally differentiated by all loci. Ten clearly differentiated Campylobacter jejuni isolates and 2 major Campylobacter coli groups were identified even though the primers were designed for strain typing of Campylobacter jejuni. One isolate (RM4183) which was designated as Campylobacter jejuni was found to group with Campylobacter coli and also appears to actually be Campylobacter coli by full MLST sequencing.
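The grouping of isolates by base composition can be illustrated with a short sketch. It takes the six-locus compositions of a handful of strains from Tables 12A-C and places strains in the same group only when their compositions agree at every locus. Exact matching is an assumption made here for illustration; the source does not state how the strain groups were derived, and the function name is hypothetical.

```python
from collections import defaultdict

LOCI = ("aspA", "glnA", "gltA", "glyA", "pgm", "tkt")

# Base compositions copied from Tables 12A-C, listed in LOCI order.
isolates = {
    "RM3673": ("A30 G25 C16 T46", "A47 G21 C16 T25", "A24 G25 C23 T47",
               "A40 G29 C29 T45", "A26 G33 C18 T38", "A41 G28 C35 T38"),
    "RM4192": ("A30 G25 C16 T46", "A48 G21 C17 T23", "A24 G25 C23 T47",
               "A40 G29 C29 T45", "A26 G33 C19 T37", "A41 G28 C36 T37"),
    "RM4183": ("A31 G27 C20 T39", "A48 G21 C16 T24", "A23 G24 C26 T46",
               "A39 G30 C27 T47", "A27 G30 C19 T39", "A46 G28 C32 T36"),
    "RM1169": ("A31 G27 C20 T39", "A48 G21 C16 T24", "A23 G24 C26 T46",
               "A39 G30 C27 T47", "A27 G30 C19 T39", "A46 G28 C32 T36"),
    "RM1532": ("A31 G27 C19 T40", "A48 G21 C16 T24", "A23 G24 C26 T46",
               "A39 G30 C27 T47", "A27 G30 C19 T39", "A45 G29 C32 T36"),
    "RM2223": ("A31 G27 C20 T39", "A48 G21 C16 T24", "A23 G24 C26 T46",
               "A39 G30 C27 T47", "A27 G30 C19 T39", "A45 G29 C32 T36"),
}


def group_by_signature(table):
    """Group strains whose base compositions agree at every locus."""
    groups = defaultdict(list)
    for strain, signature in table.items():
        groups[signature].append(strain)
    return list(groups.values())


groups = group_by_signature(isolates)
for members in groups:
    print(members)
```

In this subset, RM4183 (submitted as Campylobacter jejuni) falls into the same group as the Campylobacter coli strain RM1169, mirroring the observation above, while RM1532 and RM2223 are separated from that group by single loci.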


Example 12
Identification of Acinetobacter baumannii Using Broad Range Survey and Division-Wide Primers in Epidemiological Surveillance

To test the capability of the broad range survey and division-wide primer sets of Table 4 in identification of Acinetobacter species, 183 clinical samples were obtained from individuals participating in, or in contact with individuals participating in Operation Iraqi Freedom (including US service personnel, US civilian patients at the Walter Reed Army Institute of Research (WRAIR), medical staff, Iraqi civilians and enemy prisoners). In addition, 34 environmental samples were obtained from hospitals in Iraq, Kuwait, Germany, the United States and the USNS Comfort, a hospital ship.


Upon amplification of nucleic acid obtained from the clinical samples, primer pairs 346-349, 360, 361, 354, 362 and 363 (Table 4) all produced bacterial bioagent amplicons which identified Acinetobacter baumannii in 215 of 217 samples. The organism Klebsiella pneumoniae was identified in the remaining two samples. In addition, 14 different strain types (containing single nucleotide polymorphisms relative to a reference strain of Acinetobacter baumannii) were identified and assigned arbitrary numbers from 1 to 14. Strain type 1 was found in 134 of the sample isolates and strains 3 and 7 were found in 46 and 9 of the isolates respectively.


The epidemiology of strain type 7 of Acinetobacter baumannii was investigated. Strain 7 was found in 4 patients and 5 environmental samples (from field hospitals in Iraq and Kuwait). The index patient infected with strain 7 was a pre-war patient who had a traumatic amputation in March of 2003 and was treated at a Kuwaiti hospital. The patient was subsequently transferred to a hospital in Germany and then to WRAIR. Two other patients from Kuwait infected with strain 7 were found to be non-infectious and were not further monitored. The fourth patient was diagnosed with a strain 7 infection in September of 2003 at WRAIR. Since the fourth patient was not involved in Operation Iraqi Freedom, it was inferred that the fourth patient was the subject of a nosocomial infection acquired at WRAIR as a result of the spread of strain 7 from the index patient.


The epidemiology of strain type 3 of Acinetobacter baumannii was also investigated. Strain type 3 was found in 46 samples, all of which were from patients (US service members, Iraqi civilians and enemy prisoners) who were treated on the USNS Comfort hospital ship and subsequently returned to Iraq or Kuwait. The occurrence of strain type 3 in a single locale may provide evidence that at least some of the infections at that locale were a result of nosocomial infection.


This example thus illustrates an embodiment of the present invention wherein the methods of analysis of bacterial bioagent identifying amplicons provide the means for epidemiological surveillance.


Example 13
Selection and Use of MLST Acinetobacter baumannii Drill-Down Primers

To combine the power of high-throughput mass spectrometric analysis of bioagent identifying amplicons with the sub-species characteristic resolving power provided by multi-locus sequence typing (MLST) such as the MLST methods of the MLST Databases at the Max-Planck Institute for Infectious Biology (web.mpiib-berlin.mpg.de/mlst/dbs/Mcatarrhalis/documents/primersCatarrhalis_html), an additional 21 primer pairs were selected based on analysis of housekeeping genes of the genus Acinetobacter. Genes to which the drill-down MLST analogue primers hybridize for production of bacterial bioagent identifying amplicons include anthranilate synthase component I (trpE), adenylate kinase (adk), adenine glycosylase (mutY), fumarate hydratase (fumC), and pyrophosphate phospho-hydratase (ppa). These 21 primer pairs are indicated with reference to sequence listings in Table 13. Primer pair numbers 1151-1154 hybridize to and amplify segments of trpE. Primer pair numbers 1155-1157 hybridize to and amplify segments of adk. Primer pair numbers 1158-1164 hybridize to and amplify segments of mutY. Primer pair numbers 1165-1170 hybridize to and amplify segments of fumC. Primer pair number 1171 hybridizes to and amplifies a segment of ppa. The primer names given in Table 13 indicate the coordinates at which the primers hybridize to a reference sequence which comprises a concatenation of the genes TrpE, efp (elongation factor p), adk, mutT, fumC, and ppa. For example, the forward primer of primer pair 1151 is named AB_MLST-11-OIF007_62_91_F because it hybridizes to the Acinetobacter MLST primer reference sequence of strain type 11 in sample 007 of Operation Iraqi Freedom (OIF) at positions 62 to 91.
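The naming convention just described can be unpacked mechanically. The sketch below splits a primer name into its reference label, start and end coordinates, and direction, and derives the span of an amplicon from a forward and reverse name. The regular expression and helper names are hypothetical, coordinates are assumed to be inclusive, and an optional `_TMOD` marker (as seen in other primer names in this document) is accepted and ignored.

```python
import re
from typing import NamedTuple


class PrimerName(NamedTuple):
    reference: str
    start: int
    end: int
    direction: str  # "F" (forward) or "R" (reverse)


# <reference>_<start>_<end>[_TMOD]_<F|R>
_PATTERN = re.compile(
    r"^(?P<ref>.+?)_(?P<start>\d+)_(?P<end>\d+)(?:_TMOD)?_(?P<dir>[FR])$"
)


def parse_primer_name(name):
    """Split a primer name into reference, coordinates, and direction."""
    match = _PATTERN.match(name)
    if match is None:
        raise ValueError(f"unrecognized primer name: {name!r}")
    return PrimerName(
        match["ref"], int(match["start"]), int(match["end"]), match["dir"]
    )


def amplicon_span(forward_name, reverse_name):
    """Inclusive length from the forward start to the reverse end."""
    fwd = parse_primer_name(forward_name)
    rev = parse_primer_name(reverse_name)
    if fwd.direction != "F" or rev.direction != "R":
        raise ValueError("expected a forward primer and a reverse primer")
    if fwd.reference != rev.reference:
        raise ValueError("primers refer to different reference sequences")
    return rev.end - fwd.start + 1


forward = parse_primer_name("AB_MLST-11-OIF007_62_91_F")
print(forward, "primer length:", forward.end - forward.start + 1)
print(amplicon_span("AB_MLST-11-OIF007_62_91_F", "AB_MLST-11-OIF007_169_203_R"))
```

Under these assumptions, primer pair 1151 spans positions 62 to 203 of the concatenated reference, a length of 142.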









TABLE 13

MLST Drill-Down Primers for Identification of Sub-species characteristics (Strain Type) of Members of the Bacterial Genus Acinetobacter

| Primer Pair No. | Forward Primer Name | Forward Primer (SEQ ID NO:) | Reverse Primer Name | Reverse Primer (SEQ ID NO:) |
|---|---|---|---|---|
| 1151 | AB_MLST-11-OIF007_62_91_F | 83 | AB_MLST-11-OIF007_169_203_R | 426 |
| 1152 | AB_MLST-11-OIF007_185_214_F | 76 | AB_MLST-11-OIF007_291_324_R | 432 |
| 1153 | AB_MLST-11-OIF007_260_289_F | 79 | AB_MLST-11-OIF007_364_393_R | 434 |
| 1154 | AB_MLST-11-OIF007_206_239_F | 78 | AB_MLST-11-OIF007_318_344_R | 433 |
| 1155 | AB_MLST-11-OIF007_522_552_F | 80 | AB_MLST-11-OIF007_587_610_R | 435 |
| 1156 | AB_MLST-11-OIF007_547_571_F | 81 | AB_MLST-11-OIF007_656_686_R | 436 |
| 1157 | AB_MLST-11-OIF007_601_627_F | 82 | AB_MLST-11-OIF007_710_736_R | 437 |
| 1158 | AB_MLST-11-OIF007_1202_1225_F | 65 | AB_MLST-11-OIF007_1266_1296_R | 420 |
| 1159 | AB_MLST-11-OIF007_1202_1225_F | 65 | AB_MLST-11-OIF007_1299_1316_R | 421 |
| 1160 | AB_MLST-11-OIF007_1234_1264_F | 66 | AB_MLST-11-OIF007_1335_1362_R | 422 |
| 1161 | AB_MLST-11-OIF007_1327_1356_F | 67 | AB_MLST-11-OIF007_1422_1448_R | 423 |
| 1162 | AB_MLST-11-OIF007_1345_1369_F | 68 | AB_MLST-11-OIF007_1470_1494_R | 424 |
| 1163 | AB_MLST-11-OIF007_1351_1375_F | 69 | AB_MLST-11-OIF007_1470_1494_R | 424 |
| 1164 | AB_MLST-11-OIF007_1387_1412_F | 70 | AB_MLST-11-OIF007_1470_1494_R | 424 |
| 1165 | AB_MLST-11-OIF007_1542_1569_F | 71 | AB_MLST-11-OIF007_1656_1680_R | 425 |
| 1166 | AB_MLST-11-OIF007_1566_1593_F | 72 | AB_MLST-11-OIF007_1656_1680_R | 425 |
| 1167 | AB_MLST-11-OIF007_1611_1638_F | 73 | AB_MLST-11-OIF007_1731_1757_R | 427 |
| 1168 | AB_MLST-11-OIF007_1726_1752_F | 74 | AB_MLST-11-OIF007_1790_1821_R | 428 |
| 1169 | AB_MLST-11-OIF007_1792_1826_F | 75 | AB_MLST-11-OIF007_1876_1909_R | 429 |
| 1170 | AB_MLST-11-OIF007_1792_1826_F | 75 | AB_MLST-11-OIF007_1895_1927_R | 430 |
| 1171 | AB_MLST-11-OIF007_1970_2002_F | 77 | AB_MLST-11-OIF007_2097_2118_R | 431 |








Analysis of bioagent identifying amplicons obtained using the primers of Table 13 for over 200 samples from Operation Iraqi Freedom resulted in the identification of 50 distinct strain type clusters. The largest cluster, designated strain type 11 (ST11), includes 42 sample isolates, all of which were obtained from US service personnel and Iraqi civilians treated at the 28th Combat Support Hospital in Baghdad. Several of these individuals were also treated on the hospital ship USNS Comfort. These observations are indicative of significant epidemiological correlation/linkage.


All of the sample isolates were tested against a broad panel of antibiotics to characterize their antibiotic resistance profiles. As an example of a representative result from antibiotic susceptibility testing, ST11 was found to consist of four different clusters of isolates, each with a varying degree of sensitivity/resistance to the various antibiotics tested, which included penicillins, extended spectrum penicillins, cephalosporins, carbapenem, protein synthesis inhibitors, nucleic acid synthesis inhibitors, anti-metabolites, and anti-cell membrane antibiotics. Thus, the genotyping power of bacterial bioagent identifying amplicons, particularly drill-down bacterial bioagent identifying amplicons, has the potential to increase the understanding of the transmission of infections in combat casualties, to identify the source of infection in the environment, to track hospital transmission of nosocomial infections, and to rapidly characterize drug-resistance profiles which enable development of effective infection control measures on a time-scale previously not achievable.


Various modifications of the invention, in addition to those described herein, will be apparent to those skilled in the art from the foregoing description. Such modifications are also intended to fall within the scope of the appended claims. Each reference (including, but not limited to, journal articles, U.S. and non-U.S. patents, patent application publications, international patent application publications, gene bank accession numbers, internet web sites, and the like) cited in the present application is incorporated herein by reference in its entirety.

Claims
  • 1. An oligonucleotide primer 21 to 35 nucleobases in length comprising no more than six sequence mismatches if aligned with SEQ ID NO: 97.
  • 2. An oligonucleotide primer 20 to 35 nucleobases in length comprising no more than six sequence mismatches if aligned with SEQ ID NO: 451.
  • 3. A composition comprising the primer of claim 1.
  • 4. The composition of claim 3 further comprising an oligonucleotide primer 20 to 35 nucleobases in length comprising 70% to 100% sequence identity with SEQ ID NO: 451.
  • 5. The composition of claim 4 wherein either or both of said first and second oligonucleotide primers comprises at least one modified nucleobase.
  • 6. The composition of claim 4 wherein either or both of said first and second oligonucleotide primers comprises a non-templated T residue on the 5′-end.
  • 7. The composition of claim 4 wherein either or both of said first and second oligonucleotide primers comprises at least one non-template tag.
  • 8. The composition of claim 4 wherein either or both of said first and second oligonucleotide primers comprises at least one molecular mass modifying tag.
  • 9. A kit comprising the composition of claim 4.
  • 10. The kit of claim 9 further comprising at least one calibration polynucleotide.
  • 11. The kit of claim 9 further comprising at least one ion exchange resin linked to magnetic beads.
  • 12. A method for identification of an unknown bacterium comprising: amplifying nucleic acid from said bacterium using the composition of claim 4 to obtain an amplification product; determining the molecular mass of said amplification product; optionally determining the base composition of said amplification product from said molecular mass; and comparing said molecular mass or base composition of said amplification product with a plurality of molecular masses or base compositions of known bacterial bioagent identifying amplicons, wherein a match between said molecular mass or base composition of said amplification product and the molecular mass or base composition of a member of said plurality of molecular masses or base compositions identifies said unknown bacterium.
  • 13. The method of claim 12 wherein said molecular mass is determined by mass spectrometry.
  • 14. A method of determining the presence or absence of a Bacillus species in a sample comprising: amplifying nucleic acid from said sample using the composition of claim 4 to obtain an amplification product; determining the molecular mass of said amplification product; optionally determining the base composition of said amplification product from said molecular mass; and comparing said molecular mass or base composition of said amplification product with the known molecular masses or base compositions of one or more known Bacillus species bioagent identifying amplicons, wherein a match between said molecular mass or base composition of said amplification product and the molecular mass or base composition of one or more known Bacillus species bioagent identifying amplicons indicates the presence of said Bacillus species in said sample.
  • 15. The method of claim 14 wherein said molecular mass is determined by mass spectrometry.
  • 16. The method of claim 14 wherein said Bacillus species is Bacillus anthracis.
  • 17. A method for determination of the quantity of an unknown bacterium in a sample comprising: contacting said sample with the composition of claim 4 and a known quantity of a calibration polynucleotide comprising a calibration sequence; concurrently amplifying nucleic acid from said bacterium in said sample with the composition of claim 4 and amplifying nucleic acid from said calibration polynucleotide in said sample with the composition of claim 4 to obtain a first amplification product comprising a bacterial bioagent identifying amplicon and a second amplification product comprising a calibration amplicon; determining the molecular mass and abundance for said bacterial bioagent identifying amplicon and said calibration amplicon; and distinguishing said bacterial bioagent identifying amplicon from said calibration amplicon based on molecular mass, wherein comparison of bacterial bioagent identifying amplicon abundance and calibration amplicon abundance indicates the quantity of bacterium in said sample.
  • 18. The method of claim 17 further comprising determining the base composition of said bacterial bioagent identifying amplicon.
  • 19. A composition comprising the primer of claim 2.
  • 20. The composition of claim 19 further comprising an oligonucleotide primer 20 to 35 nucleobases in length comprising 70% to 100% sequence identity with SEQ ID NO: 97.
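
The claims above bound primer similarity in two ways: by a count of mismatches and by percent sequence identity. The following sketch shows how the two measures relate for primers of equal length. It assumes a simple ungapped, position-by-position comparison, which is an illustrative choice and not an alignment method specified in the claims; the sequences are hypothetical placeholders and are not SEQ ID NO: 97 or SEQ ID NO: 451.

```python
def count_mismatches(primer, reference):
    """Count differing positions in an ungapped, equal-length comparison."""
    if len(primer) != len(reference):
        raise ValueError("this sketch only compares sequences of equal length")
    return sum(1 for a, b in zip(primer.upper(), reference.upper()) if a != b)


def percent_identity(primer, reference):
    """Percentage of positions at which the two sequences agree."""
    matches = len(primer) - count_mismatches(primer, reference)
    return 100.0 * matches / len(primer)


# Hypothetical 20-mers used only to exercise the functions.
reference = "ACGTACGTACGTACGTACGT"
candidate = "ACGTACGAACGTACGTTCGT"

print(count_mismatches(candidate, reference))
print(percent_identity(candidate, reference))
```

Under this simplified comparison, six mismatches in a 20-nucleobase primer correspond to 70% identity, and the same six mismatches in a 35-nucleobase primer correspond to roughly 83% identity.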
CROSS-REFERENCE TO RELATED APPLICATIONS

The present application is a continuation of U.S. application Ser. No. 11/060,135, filed Feb. 17, 2005, which application is: 1) a continuation-in-part of U.S. application Ser. No. 10/728,486, filed Dec. 5, 2003, which claims the benefit of priority to U.S. Provisional Application Ser. No. 60/501,926, filed Sep. 11, 2003, and 2) claims the benefit of priority to: U.S. Provisional Application Ser. No. 60/545,425 filed Feb. 18, 2004, U.S. Provisional Application Ser. No. 60/559,754, filed Apr. 5, 2004, U.S. Provisional Application Ser. No. 60/632,862, filed Dec. 3, 2004, U.S. Provisional Application Ser. No. 60/639,068, filed Dec. 22, 2004, and U.S. Provisional Application Ser. No. 60/648,188, filed Jan. 28, 2005, each of which is incorporated herein by reference in its entirety.

STATEMENT OF GOVERNMENT SUPPORT

This invention was made with United States Government support under DARPA/SPO contract BAA00-09. The United States Government has certain rights in the invention.

Provisional Applications (6)
Number Date Country
60648188 Jan 2005 US
60639068 Dec 2004 US
60632862 Dec 2004 US
60559754 Apr 2004 US
60545425 Feb 2004 US
60501926 Sep 2003 US
Continuations (1)
Number Date Country
Parent 11060135 Feb 2005 US
Child 11930040 US
Continuation in Parts (1)
Number Date Country
Parent 10728486 Dec 2003 US
Child 11060135 US